Important: This documentation is about an older version. It's relevant only to the release noted; many of the features and functions have been updated or replaced. Please view the current version.
regex stage
The regex stage is a parsing stage that parses a log line using a regular
expression. Named capture groups in the regex support adding data into the
extracted map.
Schema
regex:
  # The RE2 regular expression. Each capture group must be named.
  expression: <string>

  # Name from extracted data to parse. If empty, uses the log message.
  [source: <string>]
expression needs to be a Go RE2 regex string. Every capture group (re) will be
set into the extracted map, so every capture group must be named: (?P<name>re).
The name of the capture group will be used as the key in the extracted map.
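The naming rule can be checked before an expression is put into a pipeline. The following is a minimal sketch that uses Go's standard regexp package; the helper unnamedGroups is a hypothetical name introduced here for illustration and is not part of Promtail.

```go
package main

import (
	"fmt"
	"regexp"
)

// unnamedGroups returns the 1-based indexes of the capture groups in expr
// that have no name. Hypothetical helper for illustration; not Promtail code.
func unnamedGroups(expr string) ([]int, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	var unnamed []int
	// Index 0 of SubexpNames is the whole match and is always "", so skip it.
	for i, name := range re.SubexpNames() {
		if i > 0 && name == "" {
			unnamed = append(unnamed, i)
		}
	}
	return unnamed, nil
}

func main() {
	exprs := []string{
		`(?P<year>\d+)-(?P<month>\d+)`, // every group is named
		`(?P<year>\d+)-(\d+)`,          // the second group is unnamed
	}
	for _, expr := range exprs {
		unnamed, err := unnamedGroups(expr)
		fmt.Println(expr, unnamed, err)
	}
}
```

Non-capturing groups such as (?:re) do not appear in SubexpNames, so this check does not flag them.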
Because of how YAML treats backslashes in double-quoted strings, all backslashes in a regular expression must be escaped when using double quotes. For example, all of these are valid:
expression: \w*
expression: '\w*'
expression: "\\w*"
But these are not:
expression: \\w* (only escape backslashes when using double quotes)
expression: '\\w*' (only escape backslashes when using double quotes)
expression: "\w*" (backslash must be escaped)
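To see why an extra backslash matters, it can help to look at what the regex engine receives once YAML has parsed the value. The sketch below uses only Go's standard regexp package and Go raw string literals, which keep backslashes exactly as written. It assumes that an unquoted or single-quoted YAML value reaches the engine unchanged. The helper firstMatch is a hypothetical name used only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch returns the leftmost match of pattern in input, or "" if the
// pattern does not match. Hypothetical helper for illustration.
func firstMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	// One backslash: \w is the "word character" class.
	fmt.Printf("%q\n", firstMatch(`\w*`, "abc"))

	// Two backslashes: a literal backslash followed by zero or more "w".
	// The input "abc" contains no backslash, so nothing matches.
	fmt.Printf("%q\n", firstMatch(`\\w*`, "abc"))

	// The same pattern does match text that really contains a backslash.
	fmt.Printf("%q\n", firstMatch(`\\w*`, `\ww`))
}
```

With single quotes or no quotes, a doubled backslash therefore changes the meaning of the expression rather than escaping anything.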
If you run Promtail with the --config.expand-env=true flag, the configuration
will run through envsubst, which will replace double backslashes with single
backslashes. Because of this, when using expand-env=true you need to use two
backslashes for each single backslash. For example:
expression: '\w*' must be expression: '\\w*'
expression: "\\w*" must be expression: "\\\\w*"
Example
Without source
Given the pipeline:
- regex:
    expression: "^(?s)(?P<time>\\S+?) (?P<stream>stdout|stderr) (?P<flags>\\S+?) (?P<content>.*)$"
And the log line:
2019-01-01T01:00:00.000000001Z stderr P i'm a log message!
The following key-value pairs would be added to the extracted map:
time: 2019-01-01T01:00:00.000000001Z
stream: stderr
flags: P
content: i'm a log message!
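The extraction in this example can be reproduced outside Promtail with Go's standard regexp package, which implements the same RE2 syntax. This is a minimal sketch rather than Promtail's implementation; the names lineRE and extract are hypothetical and introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineRE is the expression from the pipeline above. A raw string literal is
// used, so each backslash is written once.
var lineRE = regexp.MustCompile(
	`^(?s)(?P<time>\S+?) (?P<stream>stdout|stderr) (?P<flags>\S+?) (?P<content>.*)$`)

// extract returns the named capture groups of re matched against line, keyed
// by group name. It returns nil when the line does not match.
// Hypothetical helper for illustration; not Promtail code.
func extract(re *regexp.Regexp, line string) map[string]string {
	match := re.FindStringSubmatch(line)
	if match == nil {
		return nil
	}
	out := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			out[name] = match[i]
		}
	}
	return out
}

func main() {
	line := "2019-01-01T01:00:00.000000001Z stderr P i'm a log message!"
	for key, value := range extract(lineRE, line) {
		fmt.Printf("%s: %s\n", key, value)
	}
}
```

Go map iteration order is not fixed, so the four pairs may print in any order.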
With source
Given the pipeline:
- json:
    expressions:
      time:
- regex:
    expression: "^(?P<year>\\d+)"
    source: "time"
And the log line:
{"time":"2019-01-01T01:00:00.000000001Z"}
The first stage would add the following key-value pairs into the extracted
map:
time: 2019-01-01T01:00:00.000000001Z
The regex stage would then parse the value for time in the extracted map
and append the following key-value pairs back into the extracted map:
year: 2019
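The two-stage flow can be sketched in plain Go to show how source selects the input for the regex. This is an illustration only, assuming the json stage copies the top-level time field into the extracted map; the names yearRE and runPipeline are hypothetical and do not come from Promtail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// yearRE is the expression from the regex stage above.
var yearRE = regexp.MustCompile(`^(?P<year>\d+)`)

// runPipeline mimics the json stage followed by the regex stage with
// source "time". Hypothetical sketch for illustration; not Promtail code.
func runPipeline(line string) (map[string]string, error) {
	extracted := make(map[string]string)

	// Stage 1 (json): copy the "time" field of the log line into the map.
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(line), &doc); err != nil {
		return nil, err
	}
	if value, ok := doc["time"].(string); ok {
		extracted["time"] = value
	}

	// Stage 2 (regex): parse extracted["time"] instead of the log line.
	source, ok := extracted["time"]
	if !ok {
		return extracted, nil
	}
	if match := yearRE.FindStringSubmatch(source); match != nil {
		for i, name := range yearRE.SubexpNames() {
			if i > 0 && name != "" {
				extracted[name] = match[i]
			}
		}
	}
	return extracted, nil
}

func main() {
	extracted, err := runPipeline(`{"time":"2019-01-01T01:00:00.000000001Z"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("time:", extracted["time"])
	fmt.Println("year:", extracted["year"])
}
```

The existing time entry stays in the map; the regex stage only adds year alongside it.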