This is a Fluentd plugin to parse strings in log messages and re-emit them. This parser also supports multiline format.
ParserOutput accepts the same 'format' and 'time_format' options as 'in_tail':
<match raw.apache.common.*>
@type parser
remove_prefix raw
format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
time_format %d/%b/%Y:%H:%M:%S %z
key_name message
</match>
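To see what this 'format' and 'time_format' pair extracts, the following is a minimal Ruby sketch that applies the same regexp and time format to one log line. The helper `parse_common` and the sample line are hypothetical, and the sketch assumes the captured 'time' field is consumed as the event time; it is an illustration, not the plugin's actual code path.

```ruby
require 'time'

# Regexp and time format copied from the configuration above.
APACHE_COMMON = /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
TIME_FORMAT = '%d/%b/%Y:%H:%M:%S %z'

# Hypothetical helper: returns [time, record], or nil when the pattern does not match.
def parse_common(line)
  m = APACHE_COMMON.match(line)
  return nil unless m
  record = m.names.zip(m.captures).to_h
  # Assumption: the 'time' capture becomes the event time and leaves the record.
  time = Time.strptime(record.delete('time'), TIME_FORMAT)
  [time, record]
end

# Made-up sample line in Apache common log format.
sample = '192.168.0.1 - alice [28/Feb/2013:12:00:00 +0900] "GET /index.html HTTP/1.1" 200 777'
time, record = parse_common(sample)
p time
p record
```

A line that does not match returns nil here, which corresponds to the 'pattern not match' case mentioned elsewhere in this document.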
You can also use a predefined format such as 'apache' or 'syslog':
<match raw.apache.combined.*>
@type parser
remove_prefix raw
format apache
key_name message
</match>
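Both examples above set 'remove_prefix raw', so events are re-emitted under a tag without the leading 'raw.'. A rough Ruby sketch of that tag rewrite follows; `strip_prefix` is a hypothetical helper and the exact matching rules of the plugin are an assumption.

```ruby
# Hypothetical helper: drop "<prefix>." from the start of a tag, if present.
def strip_prefix(tag, prefix)
  head = "#{prefix}."
  tag.start_with?(head) ? tag[head.length..-1] : tag
end

puts strip_prefix('raw.apache.common.web1', 'raw')
puts strip_prefix('other.apache.web1', 'raw')
```

Only a whole leading tag part is removed in this sketch, so a tag such as 'rawdata.x' is left unchanged.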
To parse multiline logs:
<match raw.java.*>
@type parser
remove_prefix raw
format multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
key_name message
</match>
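The interplay of 'format_firstline' and 'format1' can be sketched in Ruby as follows: a line matching the first-line pattern starts a new event, any other line is appended to the current one, and the joined text is then matched against 'format1'. The helpers `group_events` and `parse_event` and the sample lines are hypothetical; how the plugin actually buffers lines is not shown here.

```ruby
# Patterns copied from the configuration above.
FIRSTLINE = /\d{4}-\d{1,2}-\d{1,2}/
FORMAT1 = /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/

# Hypothetical helper: start a new group at every line matching FIRSTLINE.
def group_events(lines)
  lines.each_with_object([]) do |line, groups|
    if groups.empty? || FIRSTLINE.match?(line)
      groups << [line]
    else
      groups.last << line
    end
  end
end

# Hypothetical helper: concatenate one group and apply FORMAT1 to it.
def parse_event(group)
  m = FORMAT1.match(group.join)
  m && m.names.zip(m.captures).to_h
end

# Made-up sample: one single-line event, then one event with a stack trace.
lines = [
  '2013-3-03 14:27:33 [main] INFO Main - Start',
  '2013-3-03 14:27:33 [main] ERROR Main - Exception',
  'javax.management.RuntimeErrorException: null',
  '    at Main.main(Main.java:16)'
]

events = group_events(lines).map { |group| parse_event(group) }
events.each { |event| p event }
```

Note that the first-line pattern is unanchored, so a continuation line that happens to contain a date would start a new event in this sketch.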
fluent-plugin-multiline-parser uses Fluentd's parser plugins (and your own custom parser plugins).
See the documentation for more details: http://docs.fluentd.org/articles/parser-plugin-overview
To keep the original attribute-data pairs in the re-emitted message, specify 'reserve_data':
<match raw.apache.*>
@type parser
tag apache
format apache
key_name message
reserve_data yes
</match>
To suppress 'pattern not match' log messages, specify 'suppress_parse_error_log true' in the configuration. The default value is false.
<match in.hogelog>
@type parser
tag hogelog
format /^col1=(?<col1>.+) col2=(?<col2>.+)$/
key_name message
suppress_parse_error_log true
</match>
To store parsed values under keys with a given prefix, use the inject_key_prefix option:
<match raw.sales.*>
@type parser
tag sales
format json
key_name sales
reserve_data yes
inject_key_prefix sales.
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"sales":"{\"user\":1,\"num\":2}","sales.user":1, "sales.num":2}
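The effect of 'reserve_data' together with 'inject_key_prefix' can be illustrated with a small Ruby sketch. The helper `reemit` and its keyword arguments are hypothetical names that mirror the configuration options; this is a sketch of the record transformation only, not the plugin's implementation.

```ruby
require 'json'

# Hypothetical helper mirroring the options used in the configuration above.
def reemit(record, key_name:, reserve_data: false, inject_key_prefix: nil)
  parsed = JSON.parse(record[key_name])
  # Prepend the prefix to every parsed key when one is given.
  parsed = parsed.map { |k, v| ["#{inject_key_prefix}#{k}", v] }.to_h if inject_key_prefix
  # With reserve_data the original pairs are kept next to the parsed ones.
  reserve_data ? record.merge(parsed) : parsed
end

input = { 'sales' => '{"user":1,"num":2}' }
out = reemit(input, key_name: 'sales', reserve_data: true, inject_key_prefix: 'sales.')
puts JSON.generate(out)
```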
To store parsed values as a hash in a single field, use the hash_value_field option:
<match raw.sales.*>
@type parser
tag sales
format json
key_name sales
hash_value_field parsed
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"parsed":{"user":1, "num":2}}
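Other options (e.g. reserve_data, inject_key_prefix) can be combined with hash_value_field. For example, with 'reserve_data yes' and 'inject_key_prefix sales.' added to the configuration above: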
# output data: {"sales":"{\"user\":1,\"num\":2}", "parsed":{"sales.user":1, "sales.num":2}}
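As a sketch of how 'hash_value_field' nests the parsed values, and how it combines with the other two options, consider the Ruby below. `reemit_nested` is a hypothetical helper whose keyword arguments mirror the option names; the order in which the plugin applies these options is an assumption.

```ruby
require 'json'

# Hypothetical helper: place the parsed hash under a single field.
def reemit_nested(record, key_name:, hash_value_field:, reserve_data: false, inject_key_prefix: nil)
  parsed = JSON.parse(record[key_name])
  parsed = parsed.map { |k, v| ["#{inject_key_prefix}#{k}", v] }.to_h if inject_key_prefix
  base = reserve_data ? record.dup : {}
  base.merge(hash_value_field => parsed)
end

input = { 'sales' => '{"user":1,"num":2}' }

nested = reemit_nested(input, key_name: 'sales', hash_value_field: 'parsed')
combined = reemit_nested(input, key_name: 'sales', hash_value_field: 'parsed',
                         reserve_data: true, inject_key_prefix: 'sales.')
puts JSON.generate(nested)
puts JSON.generate(combined)
```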
To skip time parsing and keep a field such as 'time' in the record as-is, specify time_parse no:
<match raw.sales.*>
@type parser
tag sales
format json
key_name sales
hash_value_field parsed
time_parse no
</match>
# input string of 'sales': {"user":1,"num":2,"time":"2013-10-31 12:48:33"}
# output data: {"parsed":{"user":1, "num":2,"time":"2013-10-31 12:48:33"}}
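The difference 'time_parse' makes can be sketched in Ruby as below. The helper `parse_sales` is hypothetical, and the sketch assumes that with time parsing enabled the 'time' field is consumed as the event time and removed from the record, while with 'time_parse no' it stays in the record untouched.

```ruby
require 'json'
require 'time'

# Hypothetical helper: returns [event_time, record].
# event_time is nil when the field was not parsed as a time.
def parse_sales(raw, time_parse:)
  record = JSON.parse(raw)
  event_time = nil
  if time_parse && record.key?('time')
    # Assumption: the parsed field is taken out of the record.
    event_time = Time.parse(record.delete('time'))
  end
  [event_time, record]
end

raw = '{"user":1,"num":2,"time":"2013-10-31 12:48:33"}'

kept_time, kept_record = parse_sales(raw, time_parse: false)
parsed_time, parsed_record = parse_sales(raw, time_parse: true)
p kept_record
p parsed_record
```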
To keep newline separators between the concatenated lines of a multiline message, specify keep_newlines yes:
<match raw.java.*>
@type parser
remove_prefix raw
format multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
key_name message
keep_newlines yes
</match>
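What 'keep_newlines' changes in the captured 'message' field can be sketched in Ruby as follows. The helper `joined_message` and the sample lines are hypothetical. The sketch also assumes the multiline pattern is compiled with the multiline flag (so '.' matches a newline), as Fluentd's multiline parser is understood to do.

```ruby
# format1 from the configuration above, with the /m flag (assumption).
FORMAT1 = /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/m

# Hypothetical helper: join buffered lines, then return the 'message' capture.
def joined_message(lines, keep_newlines)
  text = lines.join(keep_newlines ? "\n" : '')
  m = FORMAT1.match(text)
  m && m[:message]
end

# Made-up lines belonging to one multiline event.
buffered = [
  '2013-3-03 14:27:33 [main] ERROR Main - Exception',
  'java.lang.RuntimeException: boom',
  '    at Main.main(Main.java:16)'
]

flat = joined_message(buffered, false)
kept = joined_message(buffered, true)
p flat
p kept
```

With newlines kept, the stack trace remains one line per frame inside 'message', which is usually easier to read downstream.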
This is the filter version of ParserOutput.
Note that the filter version of the parser plugin cannot modify tags.
ParserFilter accepts the same 'format' and 'time_format' options as 'in_tail':
<filter raw.apache.common.*>
@type parser
format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
time_format %d/%b/%Y:%H:%M:%S %z
key_name message
</filter>
You can also use a predefined format such as 'apache' or 'syslog':
<filter raw.apache.combined.*>
@type parser
format apache
key_name message
</filter>
fluent-plugin-multiline-parser uses Fluentd's parser plugins (and your own custom parser plugins).
See the documentation for more details: http://docs.fluentd.org/articles/parser-plugin-overview
To parse multiline logs:
<filter raw.java.*>
@type parser
format multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
key_name message
</filter>
To keep the original attribute-data pairs in the re-emitted message, specify 'reserve_data':
<filter raw.apache.*>
@type parser
format apache
key_name message
reserve_data yes
</filter>
To suppress 'pattern not match' log messages, specify 'suppress_parse_error_log true' in the configuration. The default value is false.
<filter in.hogelog>
@type parser
format /^col1=(?<col1>.+) col2=(?<col2>.+)$/
key_name message
suppress_parse_error_log true
</filter>
To store parsed values under keys with a given prefix, use the inject_key_prefix option:
<filter raw.sales.*>
@type parser
format json
key_name sales
reserve_data yes
inject_key_prefix sales.
</filter>
# input string of 'sales': {"user":1,"num":2}
# output data: {"sales":"{\"user\":1,\"num\":2}","sales.user":1, "sales.num":2}
To store parsed values as a hash in a single field, use the hash_value_field option:
<filter raw.sales.*>
@type parser
format json
key_name sales
hash_value_field parsed
</filter>
# input string of 'sales': {"user":1,"num":2}
# output data: {"parsed":{"user":1, "num":2}}
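Other options (e.g. reserve_data, inject_key_prefix) can be combined with hash_value_field. For example, with 'reserve_data yes' and 'inject_key_prefix sales.' added to the configuration above: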
# output data: {"sales":"{\"user\":1,\"num\":2}", "parsed":{"sales.user":1, "sales.num":2}}
To skip time parsing and keep a field such as 'time' in the record as-is, specify time_parse no:
<filter raw.sales.*>
@type parser
format json
key_name sales
hash_value_field parsed
time_parse no
</filter>
# input string of 'sales': {"user":1,"num":2,"time":"2013-10-31 12:48:33"}
# output data: {"parsed":{"user":1, "num":2,"time":"2013-10-31 12:48:33"}}
- consider what to do next
- patches welcome!
- Copyright: Copyright (c) 2016 Jerry Zhou
- License: Apache License, Version 2.0