Releases: apache/datasketches-hive
2.0.0
This is largely a service release and should not change the API for querying via Hive, although there are changes to internal APIs that may impact anyone deriving from internal classes.
This release updates the datasketches-java library to 6.0 (from 1.X):
- Supports reading compressed theta sketches
- Moves to the current DataSketches model for inclusive/exclusive quantile family queries
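As a rough illustration of what the inclusive/exclusive distinction means for a rank query, the sketch below computes exact ranks over a small array using only the Java standard library. It does not call datasketches-java or any Hive UDF; the class and method names (`RankCriteriaDemo`, `rank`) are hypothetical and exist only for this example. It assumes the usual definitions: an inclusive rank counts values less than or equal to the query value, an exclusive rank counts only values strictly less than it.

```java
public class RankCriteriaDemo {

    // Exact normalized rank of x within values.
    // inclusive = true  -> fraction of values <= x
    // inclusive = false -> fraction of values <  x
    // (Hypothetical helper for illustration; not a DataSketches API.)
    static double rank(double[] values, double x, boolean inclusive) {
        int count = 0;
        for (double v : values) {
            if (v < x || (inclusive && v == x)) {
                count++;
            }
        }
        return (double) count / values.length;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 2, 3, 4};
        System.out.println(rank(data, 2, true));   // prints 0.6
        System.out.println(rank(data, 2, false));  // prints 0.2
    }
}
```

The two criteria differ only when the query value is present in the data, which is why results for repeated values can shift when the criterion changes.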
datasketches-hive-1.2.0
This is a maintenance release that makes this Apache Hive component work with datasketches-java-3.1.0 and datasketches-memory-2.0.0.
Apache Release 1.1.0-incubating
- Fixes a critical bug
- Updates the datasketches-java dependency to 1.3.0-incubating
- Minor licensing fixes
- Minor code cleanup
Apache Release 1.0.0-incubating
This is the initial Apache release for this component.
- The Java package structure has been changed to org.apache.datasketches
- The file license headers have been updated with the Apache license header
- The LICENSE, NOTICE, and DISCLAIMER-WIP files have been added and/or updated.
There are no other significant code changes from the prior version.
sketches-hive-0.13.0
- Based on sketches-core-0.13.0
- CPC sketch UDFs
- KLL sketch UDFs
- additional quantiles sketch UDFs: toString, getN, getCDF
- additional HLL sketch UDFs: SketchToString, getEstimateAndErrorBounds
sketches-hive-0.11.0
Compatibility with sketches-core-0.11.0
sketches-hive-0.10.5: new core, HLL late init fix, char and varchar
- based on sketches-core-0.10.3
- support HLL sketch late init from Hive
- support char and varchar types as HLL and Theta sketch input
sketches-hive-0.10.4: use sketches-core-0.10.2
This is a maintenance release that moves to the latest sketches-core-0.10.2.
Sketches core 0.10.1, new Tuple sketch UDFs, performance improvement
- This is based on sketches-core-0.10.1 and memory-0.10.3
- New Tuple sketch UDFs: ArrayOfDoublesSketchesTTestUDF, ArrayOfDoublesSketchToMeansUDF, ArrayOfDoublesSketchToVariancesUDF, ArrayOfDoublesSketchToEstimateAndErrorBoundsUDF, ArrayOfDoublesSketchToNumberOfRetainedEntriesUDF, ArrayOfDoublesSketchToQuantilesSketchUDF
- Performance improvement: wrap() is used instead of heapify() in HLL UDFs
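The general idea behind preferring wrap() over heapify() can be sketched with a loose analogy in plain Java: wrapping reads a serialized image in place, whereas heapifying copies it into a new on-heap structure first. The example below uses `java.nio.ByteBuffer` and `Arrays.copyOf` as stand-ins; it does not use the HllSketch API, and the names `WrapVsCopyDemo`, `wrapView`, and `heapCopy` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class WrapVsCopyDemo {

    // Analogy for wrap(): a read-only view over the existing bytes, no copy made.
    static ByteBuffer wrapView(byte[] serialized) {
        return ByteBuffer.wrap(serialized).asReadOnlyBuffer();
    }

    // Analogy for heapify(): allocate new storage and copy the bytes into it.
    static byte[] heapCopy(byte[] serialized) {
        return Arrays.copyOf(serialized, serialized.length);
    }

    public static void main(String[] args) {
        byte[] image = {1, 2, 3, 4};          // stand-in for a serialized sketch
        ByteBuffer view = wrapView(image);
        byte[] copy = heapCopy(image);

        image[0] = 9;                         // mutate the backing bytes

        System.out.println(view.get(0));      // prints 9: the view shares the bytes
        System.out.println(copy[0]);          // prints 1: the copy is independent
    }
}
```

Skipping the copy saves an allocation and a pass over the data per call, which adds up when a UDF is invoked once per row.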
HllSketch performance improvement for strings
- HLL DataToSketchUDAF: input strings are now converted to char[] before being passed to HllSketch. This is substantially faster than passing strings because it avoids the UTF-8 conversion. Warning: strings are now effectively hashed with a different hash function, so unions of sketches produced by this version with sketches produced by previous versions will show no overlap and will therefore give incorrect results. We recommend upgrading to this version and, if any sketches built from string inputs have been stored, recomputing them from the raw data.
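To see why the change behaves like a different hash function, it may help to compare the bytes a hash would consume in each case. The sketch below uses only the Java standard library and does not call HllSketch or reproduce its actual hashing; the class and method names are hypothetical, and the little-endian layout of the char[] bytes is an assumption made for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringInputBytesDemo {

    // Bytes seen by a hash when the string is first encoded as UTF-8.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes seen by a hash when the string is passed as char[] (UTF-16 code units).
    // Little-endian order is an assumption for this illustration.
    static byte[] charArrayBytes(String s) {
        char[] chars = s.toCharArray();
        ByteBuffer buf = ByteBuffer.allocate(chars.length * 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (char c : chars) {
            buf.putChar(c);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        String key = "abc";
        byte[] asUtf8 = utf8Bytes(key);
        byte[] asChars = charArrayBytes(key);

        System.out.println(asUtf8.length);                  // prints 3
        System.out.println(asChars.length);                 // prints 6
        System.out.println(Arrays.equals(asUtf8, asChars)); // prints false
    }
}
```

Because the same logical string yields different byte sequences, any hash over those bytes will generally differ too, so the same key lands in unrelated positions in sketches built before and after the change.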