Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handles the case where the binary cursor's InputStream provides fewer bytes than requested before reaching EOF. #623

Merged
merged 3 commits into from
Nov 3, 2023

Conversation

tgregg
Copy link
Contributor

@tgregg tgregg commented Oct 24, 2023

Description of changes:

InputStream implementations are allowed to return fewer bytes than requested, and doing so does not necessarily indicate that the end of the stream has been reached. GZIPInputStream has this behavior in certain cases.

Before this fix, the binary reader would throw an IonException for unexpected EOF if this happened in the middle of a scalar value. After the fix, the reader will continue requesting bytes from the stream until EOF is definitely reached or all requested bytes have been received.

The test shouldNotFailWhenAnInputStreamProvidesFewerBytesThanRequestedWithoutReachingEof is the one that reproduced the problem, and now succeeds. I initially tried reproducing it using GZIPInputStream, starting with shouldNotFailWhenGZIPBoundaryIsEncounteredInStringValue, but gave up when it became clear that the early-return behavior occurred in a native method that I couldn't step-through debug to root cause. I left the test because it's still a good thing to verify.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Comment on lines 670 to 673
if (numberOfBytesFilled < 0) {
return;
return numberOfBytesFilled;
}
limit += numberOfBytesFilled;
return numberOfBytesFilled;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a minute to understand what this was doing. Can we rearrange this a bit?

        if (numberOfBytesFilled > 0) {
            limit += numberOfBytesFilled;
        }
        return numberOfBytesFilled;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

int numberOfBytesToReturn;
if (available > 1) {
// Return fewer bytes than requested and fewer than are available, avoiding EOF.
numberOfBytesToReturn = Math.min(available -1, len - 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
numberOfBytesToReturn = Math.min(available -1, len - 1);
numberOfBytesToReturn = Math.min(available - 1, len - 1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@tgregg tgregg force-pushed the handle-throttled-inputstream branch from 7117003 to cf6ef90 Compare October 24, 2023 22:12
@@ -649,8 +655,9 @@ private void shiftIndicesLeft(int shiftAmount) {
* Fills the buffer with up to the requested number of additional bytes. It is the caller's responsibility to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attempts to fill....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*/
private void refill(long numberOfBytesToFill) {
private int refill(long numberOfBytesToFill) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is a private method and only used in the fillAt method, for now. I would put the loop here lest we add some other method that calls this and have the same problem. Or being crystal clear in the javadoc that it is the caller's responsbility to retry, though then we'd have two callers doing looping that could be here (unless I'm missing something).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, moved the loop into refill. Also added a test to ensure skipping over a value works when the InputStream has this behavior.

@tgregg tgregg force-pushed the handle-throttled-inputstream branch from cf6ef90 to 72c1962 Compare October 27, 2023 17:42
@tgregg tgregg force-pushed the handle-throttled-inputstream branch from 72c1962 to 884ae15 Compare November 2, 2023 23:48
Base automatically changed from span-seek-sets-ion-version to master November 3, 2023 00:15
@tgregg tgregg merged commit 61d76e0 into master Nov 3, 2023
16 of 31 checks passed
@tgregg tgregg deleted the handle-throttled-inputstream branch November 3, 2023 00:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants