This repository has been archived by the owner on Dec 14, 2024. It is now read-only.

[Bug] Cortex Data > Splunk HEC event line breaks missing #324

Open
0xC0FFEEEE opened this issue Feb 28, 2024 · 3 comments

@0xC0FFEEEE

0xC0FFEEEE commented Feb 28, 2024

Describe the bug

Cortex Data > Splunk HEC event line breaks missing

Expected behavior

JSON events from Cortex Data Lake are split into individual events at line breaks and extracted correctly when using the pan:firewall_cloud sourcetype.

Current behavior

JSON events are not line broken, which prevents logs from being parsed correctly when using the pan:firewall_cloud sourcetype.

Possible solution

Preferably, fix the Cortex Data Lake side so that it sends individual JSON events with proper line breaking, as intended.

Or, less preferably, update LINE_BREAKER to break out individual JSON events.

Steps to reproduce

  1. Configure Splunk HEC
  2. Configure Log Forwarding in Cortex Data Lake using Splunk/Stacked JSON option
  3. Observe that events are not line broken.

Screenshots

image

Context

This bug effectively breaks all functionality of the Palo Alto add-on when using Cortex Data Lake and Splunk HEC collectors.

@0xC0FFEEEE 0xC0FFEEEE added the bug label Feb 28, 2024
@paulmnguyen
Contributor

Hey @0xC0FFEEEE,

Was this working before and suddenly changed?

@paulmnguyen paulmnguyen self-assigned this Feb 28, 2024
@0xC0FFEEEE
Author

0xC0FFEEEE commented Feb 28, 2024

I believe this has always been the case. Our Obs team has been trying to get to the bottom of this since the integration was originally set up, and has an ongoing case raised with Splunk on this issue.

My fix for this is a bit hacky: I've cloned the sourcetype and set the following:
LINE_BREAKER: \}\}()
TRUNCATE: 999999
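
As a rough illustration of what this LINE_BREAKER does, the Python sketch below splits a stacked-JSON payload after every `}}`. The sample payload and the helper name `split_stacked` are made up for this example, and it assumes every forwarded event is an envelope whose last member is a nested object, so that each event ends in `}}`. An event that contains `}}` anywhere else would be split in the wrong place, which is part of why the fix is hacky.

```python
import json
import re

# Hypothetical payload: two envelopes concatenated with no newline between them.
PAYLOAD = (
    '{"host":"fw1","event":{"action":"allow"}}'
    '{"host":"fw2","event":{"action":"deny"}}'
)


def split_stacked(payload):
    # Split at the zero-width position right after each '}}'. This mimics a
    # LINE_BREAKER whose capture group is empty: nothing is discarded and the
    # '}}' stays with the preceding event.
    parts = re.split(r"(?<=\}\})", payload)
    return [part for part in parts if part.strip()]


# Each chunk should now be a complete JSON document on its own.
events = [json.loads(chunk) for chunk in split_stacked(PAYLOAD)]
print([event["host"] for event in events])
```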

For reference, the version installed on our Splunk Cloud environment is currently v8.1.0.

@0xC0FFEEEE
Author

I've identified another issue that affects field extraction and normalization, similar to #325. This could be due to the events being sent to the incorrect HTTP endpoint (there are several to choose from in the Splunk dev docs covering HTTP inputs).

What should happen is that the top-level key/value pairs in the JSON sent to Splunk are interpreted as internal fields (e.g. host, source, time), and the nested JSON under the event key is interpreted as the actual event and displayed accordingly.

What is actually happening is the following: the Palo event data stays under the top-level event key, so fields are extracted as event.<field_name> and none of the Palo knowledge objects are applied.
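
The difference between the two behaviours can be sketched in Python. The sample envelope, its field names (`action`, `src_ip`) and both helper names are hypothetical, and this is not how Splunk is implemented internally. It only shows how field names end up prefixed with `event.` when the whole envelope is treated as the event, compared with unwrapping the envelope first.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted field names."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            fields.update(flatten(value, prefix=f"{name}."))
        else:
            fields[name] = value
    return fields


def unwrap_envelope(envelope):
    """Separate the top-level metadata keys from the nested event."""
    metadata = {k: v for k, v in envelope.items() if k != "event"}
    return metadata, envelope["event"]


# Hypothetical envelope, loosely modelled on the example in this thread.
envelope = {
    "host": "fw1",
    "source": "cdl",
    "event": {"action": "allow", "src_ip": "10.0.0.1"},
}

# Whole envelope treated as the event: fields come out as event.<field_name>.
print(sorted(flatten(envelope)))

# Envelope unwrapped first: host/source are metadata, event fields are unprefixed.
metadata, event = unwrap_envelope(envelope)
print(metadata, sorted(flatten(event)))
```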

image

Example of a correctly interpreted event sent to the https://http-inputs.splunkcloud.com/services/collector/event endpoint. Note that the JSON displayed is the nested JSON from the top-level event key. It was sent to the HTTP input endpoint as follows:

{
  "event": {"Accept":"*/*","Accept-Encoding":"gzip,deflate,br","Accept-Language":"en-GB,en; q=0.5", <snip>},
  "source": "mysource",
  "host": "myhost"
}

And is interpreted and displayed like this:
image
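
For anyone wanting to repeat this test, a minimal Python sketch of such a request follows. The token value and the helper name `build_hec_request` are placeholders, and the event body is a shortened stand-in for the snipped payload above. HEC authenticates with an `Authorization: Splunk <token>` header. The request is only constructed here; the line that would actually send it is left commented out.

```python
import json
import urllib.request

# Endpoint as quoted in this thread.
HEC_URL = "https://http-inputs.splunkcloud.com/services/collector/event"


def build_hec_request(token, event, source, host):
    """Build a POST request carrying one HEC envelope with a nested event."""
    body = json.dumps({"event": event, "source": source, "host": host})
    return urllib.request.Request(
        HEC_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_hec_request(
    token="00000000-0000-0000-0000-000000000000",  # placeholder token
    event={"Accept": "*/*", "Accept-Encoding": "gzip,deflate,br"},
    source="mysource",
    host="myhost",
)
# urllib.request.urlopen(request)  # uncomment to send the event
```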
