diff --git a/README.md b/README.md
index 283c5fa6af..7820022447 100644
--- a/README.md
+++ b/README.md
@@ -141,8 +141,9 @@ A variety of default configuration files are provided:
 
 - [OpenTelemetry Collector](https://github.com/signalfx/splunk-otel-collector/tree/main/cmd/otelcol/config/collector)
   see `full_config_linux.yaml` for a commented configuration with links to full
-  documentation. `agent_config.yaml` is the recommended starting
-  configuration for most environments.
+  documentation. The `logs_config_linux.yaml` is a good starting point for using
+  the collector for collecting application logs on Linux environments.
+  `agent_config.yaml` is the recommended starting configuration for most environments.
 - [Fluentd](https://github.com/signalfx/splunk-otel-collector/tree/main/internal/buildscripts/packaging/fpm/etc/otel/collector/fluentd)
   applicable to Helm or installer script installations only. See the `*.conf`
   files as well as the `conf.d` directory. Common sources including filelog,
diff --git a/cmd/otelcol/config/collector/logs_config_linux.yaml b/cmd/otelcol/config/collector/logs_config_linux.yaml
new file mode 100644
index 0000000000..12bf00335d
--- /dev/null
+++ b/cmd/otelcol/config/collector/logs_config_linux.yaml
@@ -0,0 +1,740 @@
+# Example configuration file for logs collection.
+
+# If the collector is installed without the Linux/Windows installer script, the following
+# environment variables are required to be manually defined or configured below:
+# - SPLUNK_ACCESS_TOKEN: The Splunk access token to authenticate requests
+# - SPLUNK_HEC_TOKEN: The Splunk HEC authentication token
+# - SPLUNK_HEC_URL: The Splunk HEC endpoint URL, e.g. https://ingest.us0.signalfx.com/v1/log
+# - SPLUNK_INGEST_URL: The Splunk ingest URL, e.g. https://ingest.us0.signalfx.com
+# - SPLUNK_LISTEN_INTERFACE: The network interface the agent receivers listen on.
+# - SPLUNK_BALLAST_SIZE_MIB: The size of the ballast, which should be 1/3 to 1/2 of the memory allocated
+# - SPLUNK_MEMORY_LIMIT_MIB: 90% of memory allocated
+
+receivers:
+  # Receivers for tailing and parsing log files coming from various services.
+  #
+  # The 'regex_parser' operator parses a log entry body. Using named capturing
+  # regex groups -- (?P<name>pattern) syntax -- the operator assigns one or more
+  # attributes using values that were captured by corresponding capture groups
+  # (see https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/regex_parser.md).
+  #
+  # The 'timestamp' operator sets the timestamp on a log entry by parsing a
+  # textual date-and-time value from one of the attributes
+  # (see https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/time_parser.md).
+  #
+  # Similarly, the 'severity' operator sets the severity on a log entry by
+  # parsing a value from one of the attributes
+  # (see https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/severity_parser.md).
+  #
+  # The 'retain' operator allows us to select which attributes we want to keep.
+  # Attributes not listed will be removed from the log entry.
+  # If the 'retain' operator is not used, then all attributes are kept.
+  #
+  # For a filelog receiver to be enabled it must be listed in the 'service' block
+  # at the end of this file.
+
+  # Apache access log.
+  # Configured using the LogFormat directive.
+  # By default, date and time are in the 'day/Month/year:hour:minute:second zone' format.
+  # See https://httpd.apache.org/docs/current/mod/mod_log_config.html#logformat
+  filelog/apache-access:
+    include: [ "/var/log/apache*/access.log", "/var/log/apache*/access_log", "/var/log/httpd/access.log", "/var/log/httpd/access_log" ]
+    operators:
+      - type: regex_parser
+        regex: '^(?P.+) (?P.+) (?P.+) \[(?P