Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the ability to toggle dropping CLP processed field from the original generic record. #14534

Merged
merged 2 commits into from
Nov 27, 2024

Conversation

jackluo923
Copy link
Contributor

@jackluo923 jackluo923 commented Nov 25, 2024

Allow the ability to toggle dropping CLP processed field from the original generic record inside CLPLogRecordExtractor. One can toggle the removing the processed field via the following config:

stream.kafka.decoder.prop.removeProcessedFields": "true"

One can replicate the original behavior inside schemaConformingTransformerV2Config

"schemaConformingTransformerV2Config": {
        ...
        "fieldPathsToSkipStorage": [
          "message"
        ],

@codecov-commenter
Copy link

codecov-commenter commented Nov 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 63.99%. Comparing base (59551e4) to head (318fc26).
Report is 1390 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14534      +/-   ##
============================================
+ Coverage     61.75%   63.99%   +2.23%     
- Complexity      207     1570    +1363     
============================================
  Files          2436     2675     +239     
  Lines        133233   146975   +13742     
  Branches      20636    22559    +1923     
============================================
+ Hits          82274    94050   +11776     
- Misses        44911    45970    +1059     
- Partials       6048     6955     +907     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.89% <ø> (+2.18%) ⬆️
java-21 63.88% <ø> (+2.25%) ⬆️
skip-bytebuffers-false 63.98% <ø> (+2.23%) ⬆️
skip-bytebuffers-true 63.77% <ø> (+36.04%) ⬆️
temurin 63.99% <ø> (+2.23%) ⬆️
unittests 63.98% <ø> (+2.23%) ⬆️
unittests1 55.63% <ø> (+8.74%) ⬆️
unittests2 34.61% <ø> (+6.88%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


if (!_config.getRemoveProcessedFields()) {
to.putValue(key, value);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this feature used for double ingestion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. if removeProcessedField is true, we won't add the original key and value to the output.

@@ -54,6 +55,7 @@ public class CLPLogRecordExtractorConfig implements RecordExtractorConfig {
private final Set<String> _fieldsForClpEncoding = new HashSet<>();
private String _unencodableFieldSuffix = null;
private String _unencodableFieldError = null;
private boolean _removeProcessedFields = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems the default value for removeProcessedFields is set to false. what's the current behavior, do we already by default keep the processedField e.g. message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behavior is changed. Previously, field is removed by default, now it's retained by default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we set removeProcessedFields=True by default to keep existing behavior. Thus, we can keep the current behavior without change tableConfig to existing tables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add more comments for how to maintain the existing behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Added instructions on how to maintain the existing behavior via config.

@deemoliu deemoliu added backward-incompat Referenced by PRs that introduce or fix backward compat issues feature request labels Nov 26, 2024
@deemoliu deemoliu merged commit da897dd into apache:master Nov 27, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backward-incompat Referenced by PRs that introduce or fix backward compat issues feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants