Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: clean invalid UTF8 from trace field values when generating propagation header #232

Merged
merged 7 commits into from
Oct 11, 2023

Conversation

robbkidd
Copy link
Member

@robbkidd robbkidd commented Oct 2, 2023

Which problem is this PR solving?

Short description of the changes

Tests to Demonstrate the Problem

  • Tests added to cover invalid UTF8 characters in trace field values for classic and E&S modes of Honeycomb propagation

MarshalTraceContext

  • Replace invalid UTF8 characters when serializing in Honeycomb-flavored propagators

Mixes in the data cleaning logic already present in Libhoney::Cleaner to prepare trace data for JSON serialization.

  • Refactor the encoding logic for serialization in an extracted method to avoid having the two implementations diverge.

UnmarshalTraceContext

  • Decode trace context with the decoder that matches the encoder.

Testing revealed that decoding trace context would fail to produce a string parseable as JSON if some UTF8 characters were included in the data, even if those characters are valid. Because we encode with urlsafe, we need to decode with urlsafe.

Linting

  • Disable the Style/AccessModifierDeclarations Rubocop rule.

Sometimes we group methods under "private", sometimes we inline a symbol reference to a method with "module_method". We remain flexible. This rule does not serve us and we do not serve it.

@robbkidd robbkidd added type: bug Something isn't working status: oncall Flagged for awareness from Honeycomb Telemetry Oncall labels Oct 2, 2023
@robbkidd robbkidd self-assigned this Oct 2, 2023
Trace context is urlsafe_encoded, so the unmarshal/parse should use the
matching urlsafe_decode. Otherwise, there are cases where the encoded
UTF8 munges characters rendering the decoded output unparsable as JSON.
Reuse the cleaning logic already in Libhoney.
@robbkidd robbkidd marked this pull request as ready for review October 2, 2023 21:41
@robbkidd robbkidd requested a review from a team October 2, 2023 21:41
Copy link
Contributor

@ajvondrak ajvondrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @robbkidd! Saves me the PR-slinging. 😄 And looks to have the shape I would expect (though admittedly I'm not scouring this PR for edge cases).

- Style/TrailingCommaInArguments:

> Put a comma after the last parameter of a multiline method call.

OK.

- Style/AccessModifierDeclarations:

> module_function should not be inlined in method definitions.

Turned off this rule. Sometimes we group, sometimes we inline a symbol
reference to a method. We remain flexible.
@JamieDanielson JamieDanielson added the version: bump patch A PR with release-worthy changes and is backwards-compatible. label Oct 11, 2023
@JamieDanielson JamieDanielson merged commit 1653665 into main Oct 11, 2023
5 checks passed
@JamieDanielson JamieDanielson deleted the robb.fix-invalid-utf8-in-trace-fields branch October 11, 2023 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: oncall Flagged for awareness from Honeycomb Telemetry Oncall type: bug Something isn't working version: bump patch A PR with release-worthy changes and is backwards-compatible.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trace-level fields don't get "cleaned" for header serialization
5 participants