Makes the Ion 1.1 text writer write symbol tokens inline by default, instead of using symbol identifiers. #1012

Merged

@tgregg tgregg commented Dec 10, 2024

Description of changes:
I noticed this when implementing macro-aware transcoding. When transcoding from binary, where symbol IDs are generally available, those symbol IDs were being transcoded to text using symbol identifier syntax, e.g., $1. This was due to an incorrect default in the Ion 1.1 text writer builder.

After this change, transcoded text Ion 1.1 goes from looking like this:

$ion_1_1
(:$ion::set_symbols (:: "foo" "bar"))
$1 $2
(:$ion::add_symbols (:: "baz"))
$1 $3
(:$ion::set_symbols (:: "abc" "def"))
$1 $2

To this:

$ion_1_1
(:$ion::set_symbols (:: "foo" "bar"))
foo bar
(:$ion::add_symbols (:: "baz"))
foo baz
(:$ion::set_symbols (:: "abc" "def"))
abc def

These two streams are equivalent in the Ion data model, but the latter is easier for humans to read. Anyone who wants to see the symbol identifiers, for example to learn more about how the symbols would be encoded in binary, can turn them back on by configuring the writer with SymbolInliningStrategy.NEVER_INLINE, as demonstrated by the added unit test.
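To illustrate why the two renderings carry the same data, here is a minimal, self-contained sketch in plain Java. It is a toy model, not the ion-java writer: `SymbolInliningSketch`, `ToyStrategy`, and every method on them are hypothetical names invented for this example. It assumes user symbol IDs start at `$1` and that `set_symbols` replaces the table while `add_symbols` appends to it, as the example streams above suggest. It does not handle symbols whose text would need quoting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are ion-java classes.
class SymbolInliningSketch {
    // Hypothetical stand-in for the writer's symbol inlining choice.
    enum ToyStrategy { INLINE_TEXT, NEVER_INLINE }

    private final List<String> symbols = new ArrayList<>(); // $1 lives at index 0
    private final StringBuilder out = new StringBuilder("$ion_1_1");
    private final ToyStrategy strategy;

    SymbolInliningSketch(ToyStrategy strategy) {
        this.strategy = strategy;
    }

    // Models (:$ion::set_symbols ...): replaces the current table.
    SymbolInliningSketch setSymbols(String... texts) {
        symbols.clear();
        return declare("set_symbols", texts);
    }

    // Models (:$ion::add_symbols ...): appends to the current table.
    SymbolInliningSketch addSymbols(String... texts) {
        return declare("add_symbols", texts);
    }

    private SymbolInliningSketch declare(String macro, String... texts) {
        out.append(" (:$ion::").append(macro).append(" (::");
        for (String text : texts) {
            symbols.add(text);
            out.append(" \"").append(text).append('"');
        }
        out.append("))");
        return this;
    }

    // Writes one symbol value, given the ID a binary reader would supply.
    SymbolInliningSketch writeSymbol(int sid) {
        String text = symbols.get(sid - 1);
        out.append(' ').append(strategy == ToyStrategy.NEVER_INLINE ? "$" + sid : text);
        return this;
    }

    String result() {
        return out.toString();
    }

    // Replays the same sequence of values under the given strategy.
    static String transcode(ToyStrategy strategy) {
        return new SymbolInliningSketch(strategy)
            .setSymbols("foo", "bar").writeSymbol(1).writeSymbol(2)
            .addSymbols("baz").writeSymbol(1).writeSymbol(3)
            .setSymbols("abc", "def").writeSymbol(1).writeSymbol(2)
            .result();
    }

    public static void main(String[] args) {
        System.out.println(transcode(ToyStrategy.NEVER_INLINE));
        System.out.println(transcode(ToyStrategy.INLINE_TEXT));
    }
}
```

Both calls replay the same sequence of symbol IDs against the same evolving table; only the rendering of each symbol value differs, which is the sense in which the streams are equivalent.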

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@@ -22,26 +22,6 @@ data class ManagedWriterOptions_1_1(
    val lengthPrefixStrategy: LengthPrefixStrategy,
    val eExpressionIdentifierStrategy: EExpressionIdentifierStrategy,
) : SymbolInliningStrategy by symbolInliningStrategy, LengthPrefixStrategy by lengthPrefixStrategy {
    companion object {
The defaults removed from this companion object were unused. The defaults are now defined in the writer builders.
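As a rough illustration of that arrangement, the sketch below keeps the default on the builder and leaves the options object as a plain record of whatever the builder decided. It is a toy model in plain Java: `ToyTextWriterBuilder`, `ToyOptions`, `ToyStrategy`, and `withSymbolInlining` are hypothetical names and do not reflect the actual ion-java builder API.

```java
// Toy model only: these are not the ion-java builder or options types.
class ToyTextWriterBuilder {
    // Hypothetical stand-in for the writer's symbol inlining choice.
    enum ToyStrategy { INLINE_TEXT, NEVER_INLINE }

    // The default is owned by the builder, not by a constant on the options class.
    private ToyStrategy symbolInlining = ToyStrategy.INLINE_TEXT;

    // Lets a caller opt back in to symbol identifiers.
    ToyTextWriterBuilder withSymbolInlining(ToyStrategy strategy) {
        this.symbolInlining = strategy;
        return this;
    }

    // The options object only records the builder's final settings.
    ToyOptions build() {
        return new ToyOptions(symbolInlining);
    }

    static final class ToyOptions {
        final ToyStrategy symbolInlining;

        ToyOptions(ToyStrategy symbolInlining) {
            this.symbolInlining = symbolInlining;
        }
    }
}
```

With this shape, a text builder and a binary builder can each pick a default that suits their format without sharing a constant that may be wrong for one of them.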

@tgregg tgregg force-pushed the ion-11-encoding-by-value-transcode-text branch from db7014a to b9df158 Compare December 12, 2024 19:11
@tgregg tgregg changed the base branch from ion-11-encoding-by-value-transcode-text to ion-11-encoding December 12, 2024 19:12
@tgregg tgregg force-pushed the ion-11-encoding-by-value-transcode-text-inline branch from 06d391e to ea8a1e6 Compare December 12, 2024 19:22
@tgregg tgregg merged commit 8f98013 into ion-11-encoding Dec 12, 2024
16 checks passed
@tgregg tgregg deleted the ion-11-encoding-by-value-transcode-text-inline branch December 12, 2024 19:29