Adds support for writing binary Ion 1.1 timestamps #618

Merged
merged 4 commits into from
Nov 8, 2023
Changes from 1 commit
218 changes: 216 additions & 2 deletions src/com/amazon/ion/impl/bin/IonEncoder_1_1.java
@@ -1,13 +1,12 @@
package com.amazon.ion.impl.bin;

import com.amazon.ion.Decimal;
import com.amazon.ion.IonType;
import com.amazon.ion.Timestamp;
import com.amazon.ion.impl.bin.utf8.Utf8StringEncoder;
Comment on lines -3 to -6
Contributor Author

These were unused imports.


import java.math.BigDecimal;
import java.math.BigInteger;

import static com.amazon.ion.impl.bin.Ion_1_1_Constants.*;
import static java.lang.Double.doubleToRawLongBits;
import static java.lang.Float.floatToIntBits;

@@ -158,4 +157,219 @@ public static int writeFloat(WriteBuffer buffer, final double value) {
return 9;
}
}

/**
* Writes a Timestamp to the given WriteBuffer using the Ion 1.1 encoding for Ion Timestamps.
* @return the number of bytes written
*/
public static int writeTimestampValue(WriteBuffer buffer, Timestamp value) {
if (value == null) {
return writeNullValue(buffer, IonType.TIMESTAMP);
}
// Timestamps may be encoded using the short form if they meet certain conditions.
// Condition 1: The year is between 1970 and 2097.
if (value.getYear() < 1970 || value.getYear() > 2097) {
return writeLongFormTimestampValue(buffer, value);
}

// If the precision is year, month, or day, we can skip the remaining checks.
if (!value.getPrecision().includes(Timestamp.Precision.MINUTE)) {
return writeShortFormTimestampValue(buffer, value);
}

// Condition 2: The fractional seconds are a common precision.
int secondsScale = value.getDecimalSecond().scale();
Contributor

getDecimalSecond() creates a new BigDecimal on each call. Let's use getZFractionalSecond() instead. It's deprecated, but if we ever get around to removing it we'll replace it with some internal accessor. Note: this also requires a null check.

if (secondsScale != 0 && secondsScale != 3 && secondsScale != 6 && secondsScale != 9) {
Contributor

Note: in my initial prototype I used secondsScale % 3 != 0 || secondsScale > 9. Comparing these alternatives, I think it's pretty much a wash. In the common case (that the scale matches one of the common precisions) my suggestion always requires two comparisons, but it also has the modulo. Yours varies depending on the actual precision. One comparison if the scale is 0, two if it's 3, and so on. Leaving this comment in case you feel like forming a strong opinion on which one is preferable.

Contributor Author

I don't have a preference, and I suspect it will not make a significant difference. I'll create an issue to revisit this later.
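To make the comparison in this thread concrete, here is a minimal sketch of both predicates side by side. The class and method names are hypothetical and not part of the PR. For non-negative scales the two forms agree; they differ for negative multiples of 3, because `-3 % 3` is `0` in Java. Whether a `Timestamp` can ever report a negative scale is not established on this page, so treat that case as an open assumption.

```java
final class CommonScaleCheck {
    // Explicit comparisons, equivalent to the condition in this revision of the PR.
    static boolean isCommonScaleExplicit(int scale) {
        return scale == 0 || scale == 3 || scale == 6 || scale == 9;
    }

    // Modulo form from the reviewer's prototype, negated so that both
    // methods answer the same question: "is this a common scale?"
    static boolean isCommonScaleModulo(int scale) {
        return !(scale % 3 != 0 || scale > 9);
    }

    public static void main(String[] args) {
        for (int scale = -3; scale <= 12; scale++) {
            System.out.println(scale
                + ": explicit=" + isCommonScaleExplicit(scale)
                + ", modulo=" + isCommonScaleModulo(scale));
        }
    }
}
```

If negative scales are possible, the modulo form would need an additional `scale < 0` check to match the explicit form.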

return writeLongFormTimestampValue(buffer, value);
}
// Condition 3: The local offset is either UTC, unknown, or falls between -14:00 and +14:00 and is divisible by 15 minutes.
Integer offset = value.getLocalOffset();
if (offset != null && (offset < -14 * 60 || offset > 14 * 60 || offset % 15 != 0)) {
return writeLongFormTimestampValue(buffer, value);
}
return writeShortFormTimestampValue(buffer, value);
}
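The three conditions in `writeTimestampValue` can be exercised without the ion-java classes. The sketch below restates them over plain values; `ShortFormEligibility`, `canUseShortForm`, and the `hasTime` flag (standing in for "precision includes MINUTE") are hypothetical names introduced only for illustration.

```java
final class ShortFormEligibility {
    /**
     * @param hasTime       false for year, month, or day precision
     * @param secondsScale  number of fractional-second digits
     * @param offsetMinutes local offset in minutes, or null if unknown
     */
    static boolean canUseShortForm(int year, boolean hasTime, int secondsScale, Integer offsetMinutes) {
        // Condition 1: the year is between 1970 and 2097.
        if (year < 1970 || year > 2097) {
            return false;
        }
        // Date-only precisions skip the remaining checks.
        if (!hasTime) {
            return true;
        }
        // Condition 2: the fractional seconds use a common precision.
        if (secondsScale != 0 && secondsScale != 3 && secondsScale != 6 && secondsScale != 9) {
            return false;
        }
        // Condition 3: the offset is unknown, or within +/-14:00 and a multiple of 15 minutes.
        return offsetMinutes == null
            || (offsetMinutes >= -14 * 60 && offsetMinutes <= 14 * 60 && offsetMinutes % 15 == 0);
    }
}
```

For example, a timestamp with offset +05:30 (330 minutes) qualifies for the short form, while one with offset +00:20 or with four fractional digits falls back to the long form.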

/**
* Writes a short-form timestamp.
* Value cannot be null.
* If calling from outside this class, use writeTimestampValue instead.
*/
private static int writeShortFormTimestampValue(WriteBuffer buffer, Timestamp value) {
long bits = (value.getYear() - 1970L);
if (value.getPrecision() == Timestamp.Precision.YEAR) {
buffer.writeByte(OpCodes.TIMESTAMP_YEAR_PRECISION);
buffer.writeFixedIntOrUInt(bits, 1);
return 2;
}

bits |= ((long) value.getMonth()) << S_TIMESTAMP_MONTH_BIT_OFFSET;
if (value.getPrecision() == Timestamp.Precision.MONTH) {
buffer.writeByte(OpCodes.TIMESTAMP_MONTH_PRECISION);
buffer.writeFixedIntOrUInt(bits, 2);
return 3;
}

bits |= ((long) value.getDay()) << S_TIMESTAMP_DAY_BIT_OFFSET;
if (value.getPrecision() == Timestamp.Precision.DAY) {
buffer.writeByte(OpCodes.TIMESTAMP_DAY_PRECISION);
buffer.writeFixedIntOrUInt(bits, 2);
return 3;
}

bits |= ((long) value.getHour()) << S_TIMESTAMP_HOUR_BIT_OFFSET;
bits |= ((long) value.getMinute()) << S_TIMESTAMP_MINUTE_BIT_OFFSET;
if (value.getLocalOffset() == null || value.getLocalOffset() == 0) {
if (value.getLocalOffset() != null) {
bits |= S_U_TIMESTAMP_UTC_FLAG;
}

if (value.getPrecision() == Timestamp.Precision.MINUTE) {
buffer.writeByte(OpCodes.TIMESTAMP_MINUTE_PRECISION);
buffer.writeFixedIntOrUInt(bits, 4);
return 5;
}

bits |= ((long) value.getSecond()) << S_U_TIMESTAMP_SECOND_BIT_OFFSET;

int secondsScale = value.getDecimalSecond().scale();
Contributor

getZFractionalSecond() here too, and below

if (secondsScale != 0) {
long fractionalSeconds = value.getDecimalSecond().remainder(BigDecimal.ONE).movePointRight(secondsScale).longValue();
Contributor

I see value.unscaledValue().longValue() in my prototype. Maybe you can get that to work. Remember that any BigDecimal method that returns BigDecimal allocates a new one, so we want to minimize that.
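A small sketch of the difference the reviewer is pointing at, using only `java.math.BigDecimal`. The class and method names are hypothetical. `viaRemainder` mirrors the expression in the diff, which creates intermediate `BigDecimal` objects in `remainder` and `movePointRight`. `viaUnscaledValue` assumes the caller already holds only the fractional part (for example `0.123`), so the digits can be read directly from the unscaled value.

```java
import java.math.BigDecimal;

final class FractionDigitsSketch {
    // Mirrors the diff: strip the whole seconds, then shift the fraction into an integer.
    static long viaRemainder(BigDecimal decimalSecond) {
        int scale = decimalSecond.scale();
        return decimalSecond.remainder(BigDecimal.ONE).movePointRight(scale).longValue();
    }

    // Assumes the input is already the fractional part only, e.g. 0.123.
    static long viaUnscaledValue(BigDecimal fractionOnly) {
        return fractionOnly.unscaledValue().longValue();
    }

    public static void main(String[] args) {
        System.out.println(viaRemainder(new BigDecimal("45.123")));
        System.out.println(viaUnscaledValue(new BigDecimal("0.123")));
    }
}
```

Both calls yield the same digits for the same fraction; the second form avoids the intermediate `BigDecimal` results.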

bits |= fractionalSeconds << S_U_TIMESTAMP_FRACTION_BIT_OFFSET;
}
switch (secondsScale) {
case 0:
buffer.writeByte(OpCodes.TIMESTAMP_SECOND_PRECISION);
buffer.writeFixedIntOrUInt(bits, 5);
return 6;
case 3:
buffer.writeByte(OpCodes.TIMESTAMP_MILLIS_PRECISION);
buffer.writeFixedIntOrUInt(bits, 6);
return 7;
case 6:
buffer.writeByte(OpCodes.TIMESTAMP_MICROS_PRECISION);
buffer.writeFixedIntOrUInt(bits, 7);
return 8;
case 9:
buffer.writeByte(OpCodes.TIMESTAMP_NANOS_PRECISION);
buffer.writeFixedIntOrUInt(bits, 8);
return 9;
default:
throw new IllegalStateException("This is unreachable!");
}
} else {
long localOffset = value.getLocalOffset().longValue() / 15;
bits |= (localOffset & LEAST_SIGNIFICANT_7_BITS) << S_O_TIMESTAMP_OFFSET_BIT_OFFSET;

if (value.getPrecision() == Timestamp.Precision.MINUTE) {
buffer.writeByte(OpCodes.TIMESTAMP_MINUTE_PRECISION_WITH_OFFSET);
buffer.writeFixedIntOrUInt(bits, 5);
return 6;
}

bits |= ((long) value.getSecond()) << S_O_TIMESTAMP_SECOND_BIT_OFFSET;

// The fractional seconds bits will be put into a separate long because we need nine bytes total
// if there are nanoseconds (which is too much for one long) and the boundary between the seconds
// and fractional seconds subfields conveniently aligns with a byte boundary.
long fractionBits = 0;
int secondsScale = value.getDecimalSecond().scale();
if (secondsScale != 0) {
fractionBits = value.getDecimalSecond().remainder(BigDecimal.ONE).movePointRight(secondsScale).longValue();
}
switch (secondsScale) {
case 0:
buffer.writeByte(OpCodes.TIMESTAMP_SECOND_PRECISION_WITH_OFFSET);
buffer.writeFixedIntOrUInt(bits, 5);
return 6;
case 3:
buffer.writeByte(OpCodes.TIMESTAMP_MILLIS_PRECISION_WITH_OFFSET);
buffer.writeFixedIntOrUInt(bits, 5);
buffer.writeFixedIntOrUInt(fractionBits, 2);
return 8;
case 6:
buffer.writeByte(OpCodes.TIMESTAMP_MICROS_PRECISION_WITH_OFFSET);
buffer.writeFixedIntOrUInt(bits, 5);
buffer.writeFixedIntOrUInt(fractionBits, 3);
return 9;
case 9:
buffer.writeByte(OpCodes.TIMESTAMP_NANOS_PRECISION_WITH_OFFSET);
buffer.writeFixedIntOrUInt(bits, 5);
buffer.writeFixedIntOrUInt(fractionBits, 4);
return 10;
default:
throw new IllegalStateException("This is unreachable!");
}
}
}
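As a worked example of the packing in `writeShortFormTimestampValue`, the sketch below reproduces the day-precision bit layout and the 7-bit offset field over plain values. The constants are copied from `Ion_1_1_Constants` in this PR; the class, the method names, and the `littleEndian` helper are hypothetical, and the helper only imitates the byte order that `_writeFixedIntOrUInt` appears to use (least significant byte first).

```java
final class ShortFormPackingSketch {
    static final int MONTH_BIT_OFFSET = 7;   // S_TIMESTAMP_MONTH_BIT_OFFSET
    static final int DAY_BIT_OFFSET = 11;    // S_TIMESTAMP_DAY_BIT_OFFSET
    static final long LEAST_SIGNIFICANT_7_BITS = 0b01111111L;

    // Packs year (biased by 1970), month, and day as in the day-precision branch.
    static long packDate(int year, int month, int day) {
        long bits = year - 1970L;
        bits |= ((long) month) << MONTH_BIT_OFFSET;
        bits |= ((long) day) << DAY_BIT_OFFSET;
        return bits;
    }

    // Emits the low numBytes bytes of bits, least significant byte first.
    static byte[] littleEndian(long bits, int numBytes) {
        byte[] out = new byte[numBytes];
        for (int i = 0; i < numBytes; i++) {
            out[i] = (byte) (bits >> (8 * i));
        }
        return out;
    }

    // Offset in 15-minute units, truncated to 7 bits as in the with-offset branch.
    static long offsetField(int offsetMinutes) {
        return (offsetMinutes / 15) & LEAST_SIGNIFICANT_7_BITS;
    }

    public static void main(String[] args) {
        long bits = packDate(2023, 11, 8);
        System.out.printf("bits=0x%X%n", bits);
        for (byte b : littleEndian(bits, 2)) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
        System.out.println(offsetField(-300));
    }
}
```

For 2023-11-08 the packed value is `53 | (11 << 7) | (8 << 11) = 0x45B5`, so the two payload bytes after the `0x72` opcode would be `B5 45`. An offset of -05:00 becomes `-20` quarter-hours, whose low seven bits are `108`.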

/**
* Writes a long-form timestamp.
* Value may not be null.
* Only visible for testing. If calling from outside this class, use writeTimestampValue instead.
*/
static int writeLongFormTimestampValue(WriteBuffer buffer, Timestamp value) {
buffer.writeByte(OpCodes.VARIABLE_LENGTH_TIMESTAMP);

long bits = value.getYear();
if (value.getPrecision() == Timestamp.Precision.YEAR) {
buffer.writeFlexUInt(2);
buffer.writeFixedIntOrUInt(bits, 2);
return 4; // OpCode + FlexUInt + 2 bytes data
}

bits |= ((long) value.getMonth()) << L_TIMESTAMP_MONTH_BIT_OFFSET;
if (value.getPrecision() == Timestamp.Precision.MONTH) {
buffer.writeFlexUInt(3);
buffer.writeFixedIntOrUInt(bits, 3);
return 5; // OpCode + FlexUInt + 3 bytes data
}

bits |= ((long) value.getDay()) << L_TIMESTAMP_DAY_BIT_OFFSET;
if (value.getPrecision() == Timestamp.Precision.DAY) {
buffer.writeFlexUInt(3);
buffer.writeFixedIntOrUInt(bits, 3);
return 5; // OpCode + FlexUInt + 3 bytes data
}

bits |= ((long) value.getHour()) << L_TIMESTAMP_HOUR_BIT_OFFSET;
bits |= ((long) value.getMinute()) << L_TIMESTAMP_MINUTE_BIT_OFFSET;
long localOffsetValue = L_TIMESTAMP_UNKNOWN_OFFSET_VALUE;
if (value.getLocalOffset() != null) {
localOffsetValue = value.getLocalOffset() + (24 * 60);
}
bits |= localOffsetValue << L_TIMESTAMP_OFFSET_BIT_OFFSET;

if (value.getPrecision() == Timestamp.Precision.MINUTE) {
buffer.writeFlexUInt(6);
buffer.writeFixedIntOrUInt(bits, 6);
return 8; // OpCode + FlexUInt + 6 bytes data
}


bits |= ((long) value.getSecond()) << L_TIMESTAMP_SECOND_BIT_OFFSET;
int secondsScale = value.getDecimalSecond().scale();
if (secondsScale == 0) {
buffer.writeFlexUInt(7);
buffer.writeFixedIntOrUInt(bits, 7);
return 9; // OpCode + FlexUInt + 7 bytes data
}

BigDecimal fractionalSeconds = value.getDecimalSecond().remainder(BigDecimal.ONE);
BigInteger coefficient = fractionalSeconds.unscaledValue();
long exponent = fractionalSeconds.scale();
int numCoefficientBytes = WriteBuffer.flexUIntLength(coefficient);
int numExponentBytes = WriteBuffer.fixedUIntLength(exponent);
// Years-seconds data (7 bytes) + fraction coefficient + fraction exponent
int dataLength = 7 + numCoefficientBytes + numExponentBytes;

buffer.writeFlexUInt(dataLength);
buffer.writeFixedIntOrUInt(bits, 7);
buffer.writeFlexUInt(coefficient);
buffer.writeFixedUInt(exponent);

// OpCode + FlexUInt length + dataLength
return 1 + WriteBuffer.flexUIntLength(dataLength) + dataLength;
}
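The long form stores the offset differently from the short form: it is biased by 24 hours so that it is non-negative, and an all-ones value marks an unknown offset. A minimal sketch of that field, using the constant from this PR; the class and method names are hypothetical.

```java
final class LongFormOffsetSketch {
    static final int UNKNOWN_OFFSET_VALUE = 0b111111111111; // L_TIMESTAMP_UNKNOWN_OFFSET_VALUE

    // Returns the 12-bit offset field: minutes biased by 24 * 60, or all ones if unknown.
    static long offsetField(Integer offsetMinutes) {
        if (offsetMinutes == null) {
            return UNKNOWN_OFFSET_VALUE;
        }
        return offsetMinutes + (24 * 60);
    }

    public static void main(String[] args) {
        System.out.println(offsetField(-300)); // -05:00
        System.out.println(offsetField(0));    // UTC
        System.out.println(offsetField(null)); // unknown offset
    }
}
```

Assuming offsets stay strictly within one day of UTC, the largest biased value is `1439 + 1440 = 2879`, which fits in 12 bits and cannot collide with the unknown-offset marker `4095`.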

}
36 changes: 36 additions & 0 deletions src/com/amazon/ion/impl/bin/Ion_1_1_Constants.java
@@ -0,0 +1,36 @@
package com.amazon.ion.impl.bin;

/**
* Contains constants (other than OpCodes) which are generally applicable to both reading and writing binary Ion 1.1.
*/
public class Ion_1_1_Constants {
private Ion_1_1_Constants() {}

//////// Timestamp Field Constants ////////

// S_TIMESTAMP_* is applicable to all short-form timestamps
static final int S_TIMESTAMP_MONTH_BIT_OFFSET = 7;
static final int S_TIMESTAMP_DAY_BIT_OFFSET = 11;
static final int S_TIMESTAMP_HOUR_BIT_OFFSET = 16;
static final int S_TIMESTAMP_MINUTE_BIT_OFFSET = 21;
// S_U_TIMESTAMP_* is applicable to all short-form timestamps with a `U` bit
static final int S_U_TIMESTAMP_UTC_FLAG = 1 << 27;
static final int S_U_TIMESTAMP_SECOND_BIT_OFFSET = 28;
static final int S_U_TIMESTAMP_FRACTION_BIT_OFFSET = 34;
// S_O_TIMESTAMP_* is applicable to all short-form timestamps with `o` (offset) bits
static final int S_O_TIMESTAMP_OFFSET_BIT_OFFSET = 27;
static final int S_O_TIMESTAMP_SECOND_BIT_OFFSET = 34;

// L_TIMESTAMP_* is applicable to all long-form timestamps
static final int L_TIMESTAMP_MONTH_BIT_OFFSET = 14;
static final int L_TIMESTAMP_DAY_BIT_OFFSET = 18;
static final int L_TIMESTAMP_HOUR_BIT_OFFSET = 23;
static final int L_TIMESTAMP_MINUTE_BIT_OFFSET = 28;
static final int L_TIMESTAMP_OFFSET_BIT_OFFSET = 34;
static final int L_TIMESTAMP_SECOND_BIT_OFFSET = 44;
Contributor

This looks off. The offset is 12 bits, but here the second bit is only 10 after the offset bits.

Contributor Author

Good catch.

static final int L_TIMESTAMP_UNKNOWN_OFFSET_VALUE = 0b111111111111;

//////// Bit masks ////////

static final long LEAST_SIGNIFICANT_7_BITS = 0b01111111L;
}
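The review comment on `L_TIMESTAMP_SECOND_BIT_OFFSET` can be checked by deriving each bit offset as the running sum of the preceding field widths. In the sketch below the widths are inferred, not quoted from a spec: each is the difference between consecutive `L_TIMESTAMP_*` offsets in this file, and the 12-bit offset width comes from the reviewer's comment. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class LongFormLayoutSketch {
    // Field widths in bits, least significant first:
    // year, month, day, hour, minute, offset.
    static final int[] WIDTHS = {14, 4, 5, 5, 6, 12};

    // Returns the bit offset of the field that follows each listed field.
    static int[] offsetsOfFollowingFields(int[] widths) {
        int[] offsets = new int[widths.length];
        int position = 0;
        for (int i = 0; i < widths.length; i++) {
            position += widths[i];
            offsets[i] = position;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Order: month, day, hour, minute, offset, second.
        System.out.println(Arrays.toString(offsetsOfFollowingFields(WIDTHS)));
    }
}
```

Under these assumptions the derived offsets are 14, 18, 23, 28, 34, and 46. The first five match the constants above, and the last supports the reviewer's point that the seconds field should begin at bit 46 rather than 44.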
15 changes: 15 additions & 0 deletions src/com/amazon/ion/impl/bin/OpCodes.java
@@ -20,9 +20,24 @@ private OpCodes() {}
// 0x61-0x6E are additional lengths of decimals.
public static final byte NEGATIVE_ZERO_DECIMAL = 0x6F;

public static final byte TIMESTAMP_YEAR_PRECISION = 0x70;
public static final byte TIMESTAMP_MONTH_PRECISION = 0x71;
public static final byte TIMESTAMP_DAY_PRECISION = 0x72;
public static final byte TIMESTAMP_MINUTE_PRECISION = 0x73;
public static final byte TIMESTAMP_SECOND_PRECISION = 0x74;
public static final byte TIMESTAMP_MILLIS_PRECISION = 0x75;
public static final byte TIMESTAMP_MICROS_PRECISION = 0x76;
public static final byte TIMESTAMP_NANOS_PRECISION = 0x77;
public static final byte TIMESTAMP_MINUTE_PRECISION_WITH_OFFSET = 0x78;
public static final byte TIMESTAMP_SECOND_PRECISION_WITH_OFFSET = 0x79;
public static final byte TIMESTAMP_MILLIS_PRECISION_WITH_OFFSET = 0x7A;
public static final byte TIMESTAMP_MICROS_PRECISION_WITH_OFFSET = 0x7B;
public static final byte TIMESTAMP_NANOS_PRECISION_WITH_OFFSET = 0x7C;
// 0x7D-0x7F Reserved

public static final byte NULL_UNTYPED = (byte) 0xEA;
public static final byte NULL_TYPED = (byte) 0xEB;

public static final byte VARIABLE_LENGTH_INTEGER = (byte) 0xF5;
public static final byte VARIABLE_LENGTH_TIMESTAMP = (byte) 0xF7;
}
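Because the second, millisecond, microsecond, and nanosecond opcodes are consecutive, the opcode for a given fractional scale can also be computed arithmetically. This is only an illustration of how the table above is laid out; the PR itself uses a `switch`, and the class and method names here are hypothetical.

```java
final class TimestampOpCodeSketch {
    static final int SECOND_PRECISION = 0x74;             // TIMESTAMP_SECOND_PRECISION
    static final int SECOND_PRECISION_WITH_OFFSET = 0x79; // TIMESTAMP_SECOND_PRECISION_WITH_OFFSET

    // Maps a fractional-second scale of 0, 3, 6, or 9 to the matching short-form opcode.
    static byte opCodeForScale(int secondsScale, boolean withOffset) {
        if (secondsScale != 0 && secondsScale != 3 && secondsScale != 6 && secondsScale != 9) {
            throw new IllegalArgumentException("Not a short-form scale: " + secondsScale);
        }
        int base = withOffset ? SECOND_PRECISION_WITH_OFFSET : SECOND_PRECISION;
        return (byte) (base + secondsScale / 3);
    }

    public static void main(String[] args) {
        System.out.printf("0x%02X%n", opCodeForScale(6, false));
        System.out.printf("0x%02X%n", opCodeForScale(3, true));
    }
}
```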
35 changes: 29 additions & 6 deletions src/com/amazon/ion/impl/bin/WriteBuffer.java
@@ -1437,7 +1437,7 @@ public static int fixedIntLength(final long value) {
*/
public int writeFixedInt(final long value) {
int numBytes = fixedIntLength(value);
- return writeFixedIntOrUInt(value, numBytes);
+ return _writeFixedIntOrUInt(value, numBytes);
}

/** Get the length of FixedUInt for the provided value. */
@@ -1453,17 +1453,40 @@ public static int fixedUIntLength(final long value) {
*/
public int writeFixedUInt(final long value) {
if (value < 0) {
- throw new IllegalArgumentException("Attempted to write a FlexUInt for " + value);
+ throw new IllegalArgumentException("Attempted to write a FixedUInt for " + value);
Contributor Author

Just fixing a typo that I noticed while working on Timestamps.

}
int numBytes = fixedUIntLength(value);
- return writeFixedIntOrUInt(value, numBytes);
+ return _writeFixedIntOrUInt(value, numBytes);
}

/**
- * Because the fixed int and fixed uint encodings are so similar, we can use this method to write either one as long
- * as we provide the correct number of bytes needed to encode the value.
* Writes the bytes of a {@code long} as a {@code FixedInt} or {@code FixedUInt} using {@code numBytes} bytes.
* <p>
* {@code numBytes} should be an integer from 1 to 8 inclusive. If {@code numBytes} is out of bounds, that is a
* programmer error and will result in an IllegalArgumentException.
* <p>
* Because the {@code FixedInt} and {@code FixedUInt} encodings are so similar, we can use this method to write
* either one as long as we provide the correct number of bytes needed to encode the value.
* <p>
* Most of the time, you should not use this method. Instead, use {@link WriteBuffer#writeFixedInt} or
* {@link WriteBuffer#writeFixedUInt}, which calculate the minimum number of required bytes to represent the value.
* <p>
* You <i>should</i> use this method when the spec requires a {@code FixedInt} or {@code FixedUInt} of a specific
* size when it's possible that the value could fit in a smaller FixedInt or FixedUInt than the size required in
* the spec.
*/
public int writeFixedIntOrUInt(final long value, final int numBytes) {
if (0 > numBytes || numBytes > 8) {
throw new IllegalArgumentException("numBytes is out of bounds; was " + numBytes);
}
return _writeFixedIntOrUInt(value, numBytes);
}
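The Javadoc above distinguishes between writing the minimal number of bytes and writing a size fixed by the spec. The sketch below shows that difference against a plain byte array rather than a `WriteBuffer`; the class and method names are hypothetical. Note one deliberate difference: the sketch enforces the documented range of 1 to 8, whereas the check in the diff (`0 > numBytes`) would also let 0 through.

```java
import java.util.Arrays;

final class FixedWidthWriteSketch {
    // Writes the low numBytes bytes of value, least significant byte first.
    static byte[] writeFixed(long value, int numBytes) {
        if (numBytes < 1 || numBytes > 8) {
            throw new IllegalArgumentException("numBytes is out of bounds; was " + numBytes);
        }
        byte[] out = new byte[numBytes];
        for (int i = 0; i < numBytes; i++) {
            out[i] = (byte) (value >> (8 * i));
        }
        return out;
    }

    public static void main(String[] args) {
        // 53 fits in one byte, but a month-precision short-form timestamp needs two.
        System.out.println(Arrays.toString(writeFixed(53, 1)));
        System.out.println(Arrays.toString(writeFixed(53, 2)));
    }
}
```

This is why the timestamp writers call `writeFixedIntOrUInt` with an explicit size: a small field value must still occupy the full width that the opcode implies.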

/**
* Because the {@code FixedInt} and {@code FixedUInt} encodings are so similar, we can use this method to write
* either one as long as we provide the correct number of bytes needed to encode the value.
*/
- private int writeFixedIntOrUInt(final long value, final int numBytes) {
+ private int _writeFixedIntOrUInt(final long value, final int numBytes) {
writeByte((byte) value);
if (numBytes > 1) {
writeByte((byte) (value >> 8));