Restores the previous binary reader implementation's local symbol table handling behavior for several edge cases. #619
Codecov Report: All modified and coverable lines are covered by tests ✅
... and 3 files with indirect coverage changes
if (symbolTableLastTransferred.isLocalTable()) {
    // This method is called when transferring the reader's symbol table to either a writer or an IonDatagram.
    // Those cases require a mutable copy of the reader's symbol table.
    return ((_Private_LocalSymbolTable) symbolTableLastTransferred).makeCopy();
Just curious: `makeCopy()` is a `synchronized` method. Is this in a hot path in the code? Are there any performance concerns with this? If there are and it's feasible, is it worth making a second, unsynchronized copy method?
This class only deals with `_Private_LocalSymbolTable` implementations provided by `IonReaderContinuableApplicationBinary` (i.e., `LocalSymbolTableSnapshot`), which does not mark this method `synchronized`. It is `synchronized` in `LocalSymbolTable`; I did not change that.
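The distinction in this reply can be illustrated with stand-in types. The sketch below is hypothetical and is not ion-java code: `CopyableSymbolTable` stands in for `_Private_LocalSymbolTable`, `MutableTable` for `LocalSymbolTable`, and `SnapshotTable` for `LocalSymbolTableSnapshot`. It assumes the snapshot is immutable, which is why its `makeCopy()` would not need a lock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for _Private_LocalSymbolTable; not the ion-java interface.
interface CopyableSymbolTable {
    List<String> symbols();
    MutableTable makeCopy();
}

// Stand-in for a mutable table: copying must not race with concurrent additions,
// so every method that touches the list holds the instance lock.
final class MutableTable implements CopyableSymbolTable {
    private final List<String> symbols;

    MutableTable(List<String> initial) {
        this.symbols = new ArrayList<>(initial);
    }

    synchronized void add(String symbol) {
        symbols.add(symbol);
    }

    @Override
    public synchronized List<String> symbols() {
        return new ArrayList<>(symbols);
    }

    @Override
    public synchronized MutableTable makeCopy() {
        return new MutableTable(symbols);
    }
}

// Stand-in for an immutable snapshot (an assumption of this sketch):
// its state never changes after construction, so no lock is taken.
final class SnapshotTable implements CopyableSymbolTable {
    private final List<String> symbols;

    SnapshotTable(List<String> symbols) {
        this.symbols = Collections.unmodifiableList(new ArrayList<>(symbols));
    }

    @Override
    public List<String> symbols() {
        return symbols;
    }

    @Override
    public MutableTable makeCopy() {
        return new MutableTable(symbols);
    }
}

final class SnapshotCopyDemo {
    // Copies a snapshot, mutates the copy, and returns {snapshot size, copy size}.
    static int[] copyThenMutate() {
        SnapshotTable snapshot = new SnapshotTable(Arrays.asList("foo", "bar"));
        MutableTable copy = snapshot.makeCopy();
        copy.add("baz");
        return new int[] { snapshot.symbols().size(), copy.symbols().size() };
    }

    public static void main(String[] args) {
        int[] sizes = copyThenMutate();
        System.out.println("snapshot=" + sizes[0] + ", copy=" + sizes[1]);
    }
}
```

The mutable copy can grow without affecting the snapshot it came from, which matches the "mutable copy of the reader's symbol table" requirement stated in the diff comment.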
Since some of the regression detector checks failed, I did a more in-depth run locally and did not observe any impact. Merging.
@Test
public void modifySymbolsAfterLoadThenSerialize()
    throws Exception
{
    IonDatagram dg = system().getLoader().load(TestUtils.ensureBinary(system(), "{foo: bar::baz}".getBytes(StandardCharsets.UTF_8)));
    IonStruct struct = (IonStruct) dg.get(0);
    IonSymbol baz = (IonSymbol) struct.get("foo");
    baz.setTypeAnnotations("bar", "abc");
    byte[] serialized = dg.getBytes();
    try (IonReader reader = IonReaderBuilder.standard().build(serialized)) {
        assertEquals(IonType.STRUCT, reader.next());
        reader.stepIn();
        assertEquals(IonType.SYMBOL, reader.next());
        assertEquals("foo", reader.getFieldName());
        String[] annotations = reader.getTypeAnnotations();
        assertEquals(2, annotations.length);
        assertEquals("bar", annotations[0]);
        assertEquals("abc", annotations[1]);
    }
}
}
This test is named after its setup, so I have to wade through it to understand the assertions before I can work backwards to the property the test is trying to demonstrate. If possible, please name tests so that a failure tells you which behavioral expectation has been violated, not the context in which it was violated.
Imperfect car analogy: if I want to know why someone failed their license test, I want the reason "whenTakingTest_shouldNotExceedTheSpeedLimit", not the reason "runningLateToAnAppointment".
@Test
public void whenSymbolTableIsModifiedAfterLoad_thatChangeSucceedsAndWillBePersisted()
    throws Exception
{
    IonDatagram dg = system().getLoader().load(TestUtils.ensureBinary("old_annotation::foo"));
    IonSymbol foo = (IonSymbol) dg.get(0);
    // Changing annotations requires mutation of the symbol table associated with the IonValue above, adds a symbol
    foo.setTypeAnnotations("new_annotation");
    byte[] serialized = dg.getBytes();
    try (IonReader reader = IonReaderBuilder.standard().build(serialized)) {
        MatcherAssert.assertThat(reader.next(), Matchers.is(IonType.SYMBOL));
        assertEquals("foo", reader.stringValue());
        MatcherAssert.assertThat(reader.getTypeAnnotations(), arrayContaining("new_annotation"));
    }
}
This test assumes some additions to TestUtils:
public static byte[] ensureBinary(String ionData) {
    return ensureBinary(ionData.getBytes(StandardCharsets.UTF_8));
}

public static byte[] ensureBinary(byte[] ionData)
{
    if (IonStreamUtils.isIonBinary(ionData)) return ionData;
    return IonSystemBuilder.standard().build()
        .getLoader()
        .load(ionData)
        .getBytes();
}
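The early return in the helper above depends on recognizing data that is already binary. As a rough illustration only, the sketch below assumes that such a check amounts to looking for the four-byte Ion 1.0 binary version marker (`E0 01 00 EA`) at the start of the buffer. `BinaryMarkerDemo` and `startsWithBinaryMarker` are hypothetical names, and this is not the implementation of `IonStreamUtils.isIonBinary`.

```java
import java.nio.charset.StandardCharsets;

final class BinaryMarkerDemo {
    // Ion 1.0 binary version marker.
    private static final byte[] BINARY_VERSION_MARKER = {
        (byte) 0xE0, 0x01, 0x00, (byte) 0xEA
    };

    // Returns true when the buffer begins with the binary version marker.
    static boolean startsWithBinaryMarker(byte[] data) {
        if (data == null || data.length < BINARY_VERSION_MARKER.length) {
            return false;
        }
        for (int i = 0; i < BINARY_VERSION_MARKER.length; i++) {
            if (data[i] != BINARY_VERSION_MARKER[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] text = "foo".getBytes(StandardCharsets.UTF_8);
        byte[] binary = { (byte) 0xE0, 0x01, 0x00, (byte) 0xEA, 0x71, 0x0A };
        System.out.println("text input is binary:   " + startsWithBinaryMarker(text));
        System.out.println("binary input is binary: " + startsWithBinaryMarker(binary));
    }
}
```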
@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void unknownSymbolInFieldName(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0xD3, 0x8A, 0x21, 0x01);
    assertSequence(next(IonType.STRUCT), STEP_IN, next(IonType.INT));
    assertThrows(UnknownSymbolException.class, reader::getFieldNameSymbol);
    assertThrows(UnknownSymbolException.class, reader::getFieldName);
    reader.close();
}

@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void unknownSymbolInAnnotation(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0xE4, 0x81, 0x8A, 0x21, 0x01);
    assertSequence(next(IonType.INT));
    assertThrows(UnknownSymbolException.class, reader::getTypeAnnotationSymbols);
    assertThrows(UnknownSymbolException.class, reader::getTypeAnnotations);
    reader.close();
}

@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void unknownSymbolInValue(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0x71, 0x0A);
    assertSequence(next(IonType.SYMBOL));
    assertThrows(UnknownSymbolException.class, reader::symbolValue);
    assertThrows(UnknownSymbolException.class, reader::stringValue);
    reader.close();
}
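The byte literals in these tests are Ion 1.0 binary encodings. A type descriptor byte carries a type code in its high nibble and a length in its low nibble, so `0x71` is a symbol of length 1, `0xD3` is a struct of length 3, and `0xE4` is an annotation wrapper of length 4. Assuming the stream declares no local symbols, only the nine system symbols are defined, which puts symbol ID 10 (`0x0A`, or `0x8A` as a single-byte VarUInt) out of range. The helper names below are hypothetical and exist only for this sketch.

```java
final class TypeDescriptorDemo {
    // Highest symbol ID in the Ion 1.0 system symbol table.
    static final int SYSTEM_MAX_ID = 9;

    // High nibble of a type descriptor byte: the type code.
    static int typeCode(int descriptor) {
        return (descriptor >> 4) & 0x0F;
    }

    // Low nibble of a type descriptor byte: the length (or a length marker).
    static int lengthNibble(int descriptor) {
        return descriptor & 0x0F;
    }

    // Decodes a VarUInt that fits in one byte: the high bit is the end flag,
    // and the low seven bits are the value. Longer encodings are out of scope here.
    static int singleByteVarUInt(int b) {
        if ((b & 0x80) == 0) {
            throw new IllegalArgumentException("multi-byte VarUInt not handled in this sketch");
        }
        return b & 0x7F;
    }

    // With no local symbols declared, any ID above the system maximum is undefined.
    static boolean isOutOfRange(int symbolId) {
        return symbolId > SYSTEM_MAX_ID;
    }

    public static void main(String[] args) {
        System.out.println("0x71 type code: " + typeCode(0x71) + ", length: " + lengthNibble(0x71));
        System.out.println("0xD3 type code: " + typeCode(0xD3) + ", length: " + lengthNibble(0xD3));
        int fieldSid = singleByteVarUInt(0x8A);
        System.out.println("field name SID: " + fieldSid + ", out of range: " + isOutOfRange(fieldSid));
    }
}
```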
These tests are likewise named after their circumstance rather than behavior, though in this case the behavior and setup are simple enough that it matters less. Your mileage may vary, feel free to take or discard this suggestion:
@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void inFieldName_accessingUnknownSymbol_throwsUnknownSymbolException(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0xD3, 0x8A, 0x21, 0x01);
    assertSequence(next(IonType.STRUCT), STEP_IN, next(IonType.INT));
    assertThrows(UnknownSymbolException.class, reader::getFieldNameSymbol);
    assertThrows(UnknownSymbolException.class, reader::getFieldName);
    reader.close();
}

@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void inAnnotation_accessingUnknownSymbol_throwsUnknownSymbolException(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0xE4, 0x81, 0x8A, 0x21, 0x01);
    assertSequence(next(IonType.INT));
    assertThrows(UnknownSymbolException.class, reader::getTypeAnnotationSymbols);
    assertThrows(UnknownSymbolException.class, reader::getTypeAnnotations);
    reader.close();
}

@ParameterizedTest(name = "constructFromBytes={0}")
@ValueSource(booleans = {true, false})
public void asValue_accessingUnknownSymbol_throwsUnknownSymbolException(boolean constructFromBytes) throws Exception {
    reader = readerFor(constructFromBytes, 0x71, 0x0A);
    assertSequence(next(IonType.SYMBOL));
    assertThrows(UnknownSymbolException.class, reader::symbolValue);
    assertThrows(UnknownSymbolException.class, reader::stringValue);
    reader.close();
}
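The tests above assert the specific `UnknownSymbolException` rather than its parent `IonException`. A minimal sketch of why that matters follows, using hypothetical stand-in classes (`BaseReadException` for `IonException`, `UnknownSymbolReadException` for `UnknownSymbolException`) so that it runs without ion-java: a check against the base type passes for both behaviors, while a check against the subclass catches a regression to the base type.

```java
// Hypothetical stand-in for IonException.
class BaseReadException extends RuntimeException {
    BaseReadException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for UnknownSymbolException.
class UnknownSymbolReadException extends BaseReadException {
    UnknownSymbolReadException(int sid) {
        super("unknown symbol id " + sid);
    }
}

final class ExceptionSpecificityDemo {
    // Behavior to preserve: the specific subclass is thrown.
    static void readSymbolSpecific(int sid) {
        throw new UnknownSymbolReadException(sid);
    }

    // Regressed behavior: only the base type is thrown.
    static void readSymbolGeneric(int sid) {
        throw new BaseReadException("bad symbol id " + sid);
    }

    // Returns true when the action throws an instance of the expected type.
    static boolean throwsInstanceOf(Class<? extends Throwable> expected, Runnable action) {
        try {
            action.run();
            return false;
        } catch (Throwable t) {
            return expected.isInstance(t);
        }
    }

    public static void main(String[] args) {
        System.out.println("loose check, regressed reader:  "
            + throwsInstanceOf(BaseReadException.class, () -> readSymbolGeneric(10)));
        System.out.println("strict check, regressed reader: "
            + throwsInstanceOf(UnknownSymbolReadException.class, () -> readSymbolGeneric(10)));
        System.out.println("strict check, restored reader:  "
            + throwsInstanceOf(UnknownSymbolReadException.class, () -> readSymbolSpecific(10)));
    }
}
```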
    throws Exception
{
    SymbolTable lst;
    try (IonReader reader = IonReaderBuilder.standard().build(TestUtils.ensureBinary(IonSystemBuilder.standard().build(), "foo".getBytes(StandardCharsets.UTF_8)))) {
This struck me as odd, specifically the call to `ensureBinary` with the construction of a throw-away IonSystem. Why does `ensureBinary` take an IonSystem as a parameter?
It wasn't clear to me from a quick look that there are any uses where we should care about which IonSystem is used, but I'm sure you're more familiar with the implications here than I am so I'll make that a question rather than a statement. Are there any?
EDIT: In any case this test doesn't need to build a system, we can add a helper in TestUtils which does that as in the example above.
try (IonReader reader = IonReaderBuilder.standard().build(TestUtils.ensureBinary("foo"))) {
IonWriter writer = b.build(out);
assertTrue(symbolTableEquals(lst, writer.getSymbolTable()));

writer = b.build(out);
assertTrue(symbolTableEquals(lst, writer.getSymbolTable()));
What's going on here? What do repeated writer builds do for us?
Description of changes:
Restores previous behavior for three cases, which map to the three modified unit test classes in the diff.

The previous implementation threw `UnknownSymbolException` (a subclass of `IonException`) when an out-of-range symbol was encountered. The new implementation threw a base `IonException`. This case was previously only tested by the ion-tests harness, which asserted failure with any `IonException`. The added tests assert that an `UnknownSymbolException` is thrown.

Previously, local symbol tables were instances of the `LocalSymbolTable` class, so the writer builder could safely cast the provided local `SymbolTable` to `LocalSymbolTable`. The new reader implementation added its own `SymbolTable` implementation (`LocalSymbolTableSnapshot`) for local symbol tables to achieve better performance in common cases. This implementation would fail the cast to `LocalSymbolTable` when provided to the builder. The solution was to extract the new `_Private_LocalSymbolTable` interface to be implemented by both `LocalSymbolTable` and `LocalSymbolTableSnapshot`, and make the builder cast to that instead. Existing tests for the builder's `setInitialSymbolTable` method all constructed a symbol table manually; the new test provides a symbol table retrieved from the binary reader.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.