Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System macros in tdl #996

Merged

Conversation

jobarr-amzn
Copy link
Contributor

Issue #, if available: Fixes #993

Description of changes:
Prior to this, the EncodingDirectiveReader implementations could only allow reference in TDL to macros defined in the encoding directive being read. Now, they can understand references to macros belonging to the currently-operative encoding context.

  • Adds a DEFAULT EncodingContext
  • Adds a MacroTable abstraction
  • Adds a SystemMacro MacroTable

The approach I've taken is to have immutable and mutable macro tables (presently only the SystemMacro macro table is immutable) and immutable and immutable encoding contexts. The default (static, shared) encoding context is immutable.

When you want to append to it you create a new, mutable EncodingContext with a mutable macro table. The mutable macro table takes an antecedent table which it will read through to if it does not find the MacroRef it's looking for in its mutable space, allowing unqualified reference to non-shadowed macros from the inherited context without requiring copying.

I'd be fine to change that to a copy if it seems over-complicated, has the wrong semantics, or is prone to error in some way I'm not seeing- just let me know :)

Without this patch, the provided tests fails. With it, the test passes.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@jobarr-amzn jobarr-amzn requested a review from tgregg November 15, 2024 05:40
Copy link
Contributor

@tgregg tgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thanks for adding this.

Looks like the build failed due to ktlint, which you can resolve with ./gradlew ktlintFormat.

src/main/java/com/amazon/ion/impl/macro/MacroTable.kt Outdated Show resolved Hide resolved
src/main/java/com/amazon/ion/impl/macro/SystemMacro.kt Outdated Show resolved Hide resolved
// SPDX-License-Identifier: Apache-2.0
package com.amazon.ion.impl.macro

class MutableMacroTable(private val antecedent: MacroTable) : MacroTable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a chain of antecedents, we can never garbage collect the old macro tables until the entire chain is dropped. Is this what we want? The alternative is to create a mutable flat copy from the immutable macro table.

  • A chain of antecedents
    • has extra memory overhead for each link in the chain (because each link has it's own auxiliary data on top of the actual macros)
    • when new macros shadow old macros, will retain (i.e. in memory) macro definitions that can no longer be reached
    • for antecedent macros that are shadowed, will require multiple unsuccessful lookups before finding the macro definition
  • A flat macro table
    • Has O(n) cost to create a mutable copy
    • May require more memory than the antecedent model if we regularly need access to multiple older versions of a macro table

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice at present there's only ever a single antecedent, the system macro table or the empty.

You're right though that this could be more of a problem when we have general encoding module support and that the interface seen here allows it to be a problem for an implementer now.

In the general case a flat mutable table does make it easier to add and shadow values, but it makes it more difficult to hand a consistent view of some values off to a consumer without transcribing the values entirely (I'm thinking of a case like e.g. Trino where we may want to hand off a tuple of <bytes, interpretive context>, not the present state of affairs here).

Definitely something to chew on. I'd be happy to go with the consensus on this one, whatever it is. @tgregg any thoughts?

As an aside- if you've got a better name for anything here please let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hesitate to proclaim that one alternative is better than the other before we've done some performance analysis, so I'm happy to go with the proposed modeling for now and swap it out if we identify that a flat table is typically better. I, too, suspect that both have benefits for some use cases and drawbacks for others.

Copy link

codecov bot commented Nov 19, 2024

Codecov Report

Attention: Patch coverage is 81.48148% with 10 lines in your changes missing coverage. Please review.

Please upload report for BASE (ion-11-encoding@7cd94d4). Learn more about missing BASE report.

Files with missing lines Patch % Lines
...va/com/amazon/ion/impl/bin/IonManagedWriter_1_1.kt 60.00% 0 Missing and 2 partials ⚠️
...in/java/com/amazon/ion/impl/macro/MacroCompiler.kt 0.00% 0 Missing and 2 partials ⚠️
.../main/java/com/amazon/ion/impl/macro/MacroTable.kt 33.33% 2 Missing ⚠️
...mazon/ion/impl/IonReaderContinuableCoreBinary.java 91.66% 1 Missing ⚠️
...java/com/amazon/ion/impl/IonReaderTextSystemX.java 92.30% 0 Missing and 1 partial ⚠️
...ava/com/amazon/ion/impl/macro/MutableMacroTable.kt 75.00% 0 Missing and 1 partial ⚠️
...main/java/com/amazon/ion/impl/macro/SystemMacro.kt 50.00% 1 Missing ⚠️
Additional details and impacted files
@@                Coverage Diff                 @@
##             ion-11-encoding     #996   +/-   ##
==================================================
  Coverage                   ?   70.60%           
  Complexity                 ?     7097           
==================================================
  Files                      ?      203           
  Lines                      ?    28242           
  Branches                   ?     5037           
==================================================
  Hits                       ?    19939           
  Misses                     ?     6717           
  Partials                   ?     1586           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@jobarr-amzn jobarr-amzn merged commit e9237a2 into amazon-ion:ion-11-encoding Nov 19, 2024
20 of 35 checks passed
@jobarr-amzn jobarr-amzn deleted the system-macros-in-tdl branch November 19, 2024 01:17
tgregg pushed a commit that referenced this pull request Dec 13, 2024
* Test TDL reference to pre-existing macros

* Support reference of pre-existing macros in TDL

- Adds a DEFAULT Encoding context
- Adds a MacroTable abstraction

* PR suggestions

* Comment asking for more modeling attention in MutableMacroTable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants