Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes various implementation gaps and bugs related to skipping and filling binary macro invocations. #1022

Merged
merged 1 commit into from
Jan 6, 2025

Conversation

tgregg
Copy link
Contributor

@tgregg tgregg commented Dec 24, 2024

Description of changes:
This PR ensures that binary macro invocations can be filled and skipped, regardless of whether they are delimited or length-prefixed, and regardless of the types of arguments they contain. The bulk of the additions in this PR are added test cases to exercise the various combinations.

I've also included some TODOs to be left for future PRs. We generally need more coverage of early-step-out cases when evaluating macros, and we need to add more testing of the continuable reader when input runs out at various locations within a macro invocation.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Comment on lines -1917 to -1919
if (event == Event.START_CONTAINER && valueMarker.endIndex == DELIMITED_MARKER) {
seekPastDelimitedContainer_1_1();
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This functionality is now covered by slowNextToken()

* @param signature the signature of the macro invocation for which to read a presence bitmap.
* @return the presence bitmap.
*/
protected PresenceBitmap loadPresenceBitmap(List<Macro.Parameter> signature) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved from IonReaderContinuableCoreBinary, and corrected to perform readFrom and validate even if the AEB byte size is 0. This ensures that the in-memory PresenceBitmap is set properly for ExactlyOne arguments.

Comment on lines +2048 to +2049
// TODO step out of the macro evaluator only if it's in a container. Otherwise, finish the evaluation early
// and step out of the parent.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@popematt popematt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to add more testing of the continuable reader when input runs out at various locations within a macro invocation.

This should be easily covered by running the conformance tests with an input source that always needs refilling, right?

@tgregg
Copy link
Contributor Author

tgregg commented Jan 6, 2025

We need to add more testing of the continuable reader when input runs out at various locations within a macro invocation.

This should be easily covered by running the conformance tests with an input source that always needs refilling, right?

Yes, it would be sufficient to feed all the binary conformance test data byte-by-byte to the continuable binary reader.

@tgregg tgregg merged commit 4eb1ce2 into ion-11-encoding Jan 6, 2025
17 checks passed
@tgregg tgregg deleted the ion-11-encoding-fix-binary-macro-skip branch January 6, 2025 19:49
@@ -842,6 +854,10 @@ private long refill(long minimumNumberOfBytesRequired) {
* @return true if not enough data was available in the stream; otherwise, false.
*/
private boolean slowSeek(long numberOfBytes) {
if (refillableState.pinOffset > -1) {
// If data is being pinned, fill the buffer instead of potentially skipping directly from the source.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If data is being pinned, fill the buffer instead of potentially skipping directly from the source.
// Pinned bytes must be preserved in the buffer until un-pinned.
// If data is being pinned, fill the buffer instead of potentially skipping directly from the source.

Copying 'pinning' concept explanation from pinOffset comment. YMMV but I would have found that helpful.

Comment on lines +1935 to +1973
/**
* Skips past the current macro parameter.
* @param parameter the parameter to skip.
*/
private void uncheckedSkipMacroParameter(Macro.Parameter parameter) {
switch (parameter.getType()) {
case Tagged:
uncheckedNextToken();
break;
case Int8:
case Uint8:
peekIndex += 1;
break;
case Int16:
case Uint16:
case Float16:
peekIndex += 2;
break;
case Int32:
case Uint32:
case Float32:
peekIndex += 4;
break;
case Int64:
case Uint64:
case Float64:
peekIndex += 8;
break;
case FlexUint:
uncheckedReadFlexUInt_1_1();
break;
case FlexInt:
uncheckedReadFlexInt_1_1();
break;
case FlexSym:
uncheckedReadFlexSym_1_1(valueMarker);
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java switch expressions - in JDK 14 😭

Nothing to do here....

Comment on lines +1995 to +2051
private boolean slowSkipMacroParameter(Macro.Parameter parameter) {
switch (parameter.getType()) {
case Tagged:
slowNextToken();
if (event == Event.NEEDS_DATA) {
return true;
}
break;
case Int8:
case Uint8:
if (!fillAt(peekIndex, 1)) {
return true;
}
peekIndex += 1;
break;
case Int16:
case Uint16:
case Float16:
if (!fillAt(peekIndex, 2)) {
return true;
}
peekIndex += 2;
break;
case Int32:
case Uint32:
case Float32:
if (!fillAt(peekIndex, 4)) {
return true;
}
peekIndex += 4;
break;
case Int64:
case Uint64:
case Float64:
if (!fillAt(peekIndex, 8)) {
return true;
}
peekIndex += 8;
break;
case FlexUint:
if (slowReadFlexUInt_1_1() < 0) {
return true;
}
break;
case FlexInt:
if (slowReadFlexInt_1_1(valueMarker)) {
return true;
}
break;
case FlexSym:
if (FlexSymType.INCOMPLETE == slowSkipFlexSym_1_1(valueMarker)) {
return true;
}
break;
}
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private boolean slowSkipMacroParameter(Macro.Parameter parameter) {
switch (parameter.getType()) {
case Tagged:
slowNextToken();
if (event == Event.NEEDS_DATA) {
return true;
}
break;
case Int8:
case Uint8:
if (!fillAt(peekIndex, 1)) {
return true;
}
peekIndex += 1;
break;
case Int16:
case Uint16:
case Float16:
if (!fillAt(peekIndex, 2)) {
return true;
}
peekIndex += 2;
break;
case Int32:
case Uint32:
case Float32:
if (!fillAt(peekIndex, 4)) {
return true;
}
peekIndex += 4;
break;
case Int64:
case Uint64:
case Float64:
if (!fillAt(peekIndex, 8)) {
return true;
}
peekIndex += 8;
break;
case FlexUint:
if (slowReadFlexUInt_1_1() < 0) {
return true;
}
break;
case FlexInt:
if (slowReadFlexInt_1_1(valueMarker)) {
return true;
}
break;
case FlexSym:
if (FlexSymType.INCOMPLETE == slowSkipFlexSym_1_1(valueMarker)) {
return true;
}
break;
}
return false;
}
private boolean slowSkipMacroParameter(Macro.Parameter parameter) {
switch (parameter.getType()) {
case Int8:
case Uint8:
return fillAndUpdatePeekIndex(1);
case Int16:
case Uint16:
case Float16:
return fillAndUpdatePeekIndex(2);
case Int32:
case Uint32:
case Float32:
return fillAndUpdatePeekIndex(4);
case Int64:
case Uint64:
case Float64:
return fillAndUpdatePeekIndex(8);
case FlexUint:
return slowReadFlexUInt_1_1() < 0;
case FlexInt:
return slowReadFlexInt_1_1(valueMarker);
case FlexSym:
return FlexSymType.INCOMPLETE == slowSkipFlexSym_1_1(valueMarker);
case Tagged:
slowNextToken();
return event == Event.NEEDS_DATA;
default:
return false;
}
}
private boolean fillAndUpdatePeekIndex(int numberOfBytes) {
if (!fillAt(peekIndex, numberOfBytes)) {
return true;
}
peekIndex += numberOfBytes;
return false;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If slowNextToken returned an Event then all of these cases could be simple return expressions.

Comment on lines +2060 to +2062
if (macro == null) {
throw new IonException(String.format("Cannot skip over unknown macro with ID %d.", macroInvocationId));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me think we should have a helper:

void require(boolean check, String format, Object... parameters) {
    if (!check) throw new IonException(String.format(format, parameters));
}

Then this would be expressed as:

  require(macro != null, "Cannot skip over unknown macro with ID %d.", macroInvocationId);

Looks like we have ~300 constructions of IonException this way... what do you think of a helper like this?

Comment on lines +2318 to 2327
if (
// The end of the current value hasn't been reached.
peekIndex < valueMarker.endIndex ||
// The end of the parent container hasn't been reached.
parent.endIndex > peekIndex ||
// A container was just encountered, but not yet stepped in.
checkpointLocation == CheckpointLocation.AFTER_CONTAINER_HEADER
) {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the comments, thanks :D

Comment on lines +2934 to +2939
/**
* Seeks to the end of the value on which the cursor is positioned (if any), or to the end of the e-expression
* (if known). Note: it is up to the caller to ensure that the e-expression is at its end when this method is
* called.
* @return true if not enough data was available in the stream to seek to the target location; otherwise, false.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: it is up to the caller to ensure that the e-expression is at its end when this method is called.

What does this mean? This seeks to the end, so what does it mean to be "at its end when this method is called"?

EDIT: Oh, this is explained below:

// The e-expression is not length-prefixed and its end has not previously been calculated. Just seek
// to the end of the current child value. The caller ensures that this is the last value in the expression.

That seems like a difficult sort of condition to find yourself in, or rather I'm likely still misunderstanding something. We havea. non-length-prefixed expression with as-of-yet unknown end point, and we're supposed to call slowSeekToEndOfEExpression() only when we're positioned on the last value of the expression? How can the caller enure that this is the last value in the expression?

Comment on lines +1593 to +1595
@ParameterizedTest(name = "{0},{1}")
@MethodSource("allCombinations")
public void taglessExpressionGroupInDelimitedStruct(InputType inputType, StreamType streamType) throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test took a fair bit of reading to digest. It looks like it's testing three different scenarios for the same test data, and those with cardinality 4 for InputType/StreamType.

Not a problem for this PR but we need to find a better way to express these patterns. It may be that JUnit 5 nested tests or maybe something with Kotest would help us reduce the maintenance overhead for tests, because the complexity of grokking these is high.

jobarr-amzn added a commit that referenced this pull request Jan 7, 2025
jobarr-amzn added a commit that referenced this pull request Jan 7, 2025
jobarr-amzn added a commit that referenced this pull request Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants