Change Log

All notable changes to Fili will be documented here. Changes are accumulated as new paragraphs at the top of the current major version. Each change has a link to the pull request that makes the change and to the issue that triggered the pull request if there was one.

Current

Fixed:

Added:

Changed:

Removed:

Fixed:

Deprecated:

Known Issues:

Contract changes:

Extensive deprecation cleanup

0.12 Highlights

Extensibility improvements

Added extensions for query transformation before and after running queries. Added additional binding points and some additional interfaces to extend.

Protocols Metrics

We extended protocol metrics to underpin most of the makers.

Infrastructure

Motivated by changes in resource availability and billing, we migrated off Jenkins onto Screwdriver.
With bintray shutting down we migrated off bintray onto maven central. With this migration we realigned our group id from com.yahoo.fili to com.yahoo.bard to better reflect the codebase.

Sample applications

We split out sample applications into different modules

Performance improvements

We fixed a number of performance issues, including caching weight check queries, making timeouts global across query splits and using non blocking calls to memcached.

General Extensibility

We added a number of methods and classes to make it easier to add extensions, such as Chaining ResultSetMappers so post aggregation functions in fili can be multistep. Support for a full protocol aware metric grammar in antler.
Support for druid virtual columns in queries. And more options for binding suppliers.

v0.12.109 - 2021/03/13

Fixed:

Fix: Added name property back to FilteredAggreation ]
- So that filteredAggregation can be sorted based on aggregator name during canonicalization of json object.
Fix: Fili-sql druid all time grain query support
Fix: Install python to support build tagging on new screwdriver image
[Fix: Change Log Level of Presto Response] (#1130)
Fix: Presto Limit and the SD build
Fix: Bug: dataSourceMetadataService in TestBinderFactory hides instance in AbstractBinderFactory
Fix: Another batch of security alerts
Fix: fili-presto inFilter RexNode construction
- Fix inFilter RexNode construction
Fix: fili-presto INSTANCE time grain
- When time grain is chosen to be INSTANCE for the presto query, the timestamp column won't be a group key as we want the total aggregation across all time ranges
Fix: Weight Check Query Caching
- Caching the weight check query instead of the druid query in CacheWeightCheckRequestHandler.
Fix: Test failed due to static side effects revealed during Screwdriver migration
Fix: Make build work with JAVA_TOOL_OPTIONS
- Fixed maven build failure when JAVA_TOOL_OPTIONS was set by reconfiguring ant output
Fix: Add null check for cache response to CacheV2RequestHandler
- Added null check to response from cache read that delegates to next handler.
Fix: Bad serialization of AllGranularity
- Change to rollup formatter broke serialization of all timegrain.
Fix: Fix missing VARCHAR cast for SelectorFilter
- The PrestoFilterEvaluator.java missed the evaluator for the SelectorFilter (equals) case. Fix by adding the evaluator and corresponding VARCHAR cast.
Fix: Add metric type to metric detail metadata queries
Fix: Futures generated by AsycHttpCalls are leaking
- Return non null response to indicate that future is done.
Fix: Antlr having syntax extended to more correctly support dynamic metrics
- String escaping corrected in Having Grammar
Fix: Add withDimension in TemplateDruidQuery
- Modify a TDQ to add a dependent dimension.
- Added getAllGroupingDimensions in DataApiRequest to get request,metric,filter dimensions and pass it along to query builder.
Fix: Antlr Sort Parser didn't work. Wasn't tested. Patched and Tested
- Moved binding out of Sorts.g4 grammar and into SortDirection.
- Duplicated OrderByGeneratorSpec tests and added protocol metric tests.
Fix: Metric Generator must be bound before loader.load() generates metrics
- AbstractBinderFactory changed so that MetricBinding happens after metric generation not before.
Fix bug where metric binding hook has set to private so it couldn't be overriden
Adds missing argument to format string in a rarely used error path in RequestLog
Fix bug where some Aggregation model types were incorrectly reporting precision
- CardinalityAggregation, LongMaxAggregation, and LongMinAggregation were incorrectly reporting their precision as floating point by not overriding the Aggregation#isFloatingPoint method to return "false".
- ArithmeticPostAggregation was reporting precision based on operands, but druid always coerces result of arithmetic post agg to double.
  - Fixed in PR #1023
[Fix_OR_logic_for_presto_support](#999 and #1002)
- We use to split the filter clause by ANDand then cast each field to varchar before comparison. Add split or OR as well to support Presto better
- Add support to use of () in the filter clause
Fix contains filter behavior
- When there is a Contains filter applied to a non-cached dimension, it will be translated into a SearchFilter instead of SelectorFilter.
Fix Presto query on filtering
- When translating from sql query to Presto query, there is no type information available for table columns. To make filtering WHERE clauses works in Presto, cast coulmns to varchar before comparing
MetricUnionAvailability properly defensively copies availability map
- MetricUnionAvailability previously did not create a defensive copy of the availabilitiesToMetricNames parameter. This has been fixed.
Upgrades to netty 4.1.42.45.Final to address CVE-2019-20444 and CVE-2019-20445
[Fixing sort for protocol metrics] (#1047)
- Created an antlr grammar for Sorts
- Fixed integration issues with Havings
- Injected dynamic sort building into BardConfigResources and DataApiServlet
- Changed TestBinderFactory to support protocol metric tests.
- Created LegacyGenerator as a bridge interface from the existing constructor based api request impls and the factory based value object usage.
Update Presto Sketch Regex
- Update presto sketch regex statement to account for underscores ('_') is metric names.

Added:

Add support to rewrite api request with desired metrics
- Added support to rewrite MetricsApiRequest with desired set of LogicalMetric.
Add virtual column merge support in template druid query
- Added virtual column merge support in TemplateDruidQuery.
Add virtual column support in query building contexts
- Added virtual column support in query building contexts.
Fix Fili-sql class loading issue in Java 11
- Set the thread's context class loader to avoid class loading issues in downstream Janino dynamic class loading logic.
Add finalizing field access to have no name
- Added a condition in Postaggregation to allow no name for type Finalizing Field Access.
Added virtual columns and first last any aggregation support
- Added support for virtual columns and aggregation support for first/last and any aggregations
Added sql-presto daily table support
- Added support for specifying hourly and daily timestamp format and choosing upon physical table time grain
Added sql-presto topN support
- Add topN support for presto queries
Added more logging for cache set and get
- Added additional cache set and get logging info on both success/failures.
- It will help us tracking query logs and get more insights about any cache issues.
Add filtered dimensions to combined dimensions
- Included filtered dimensions in combined dimensions for DataRequest.
Added more logging in BardQueryInfo for cache set and get failures
- Add more cache logging in BardQueryInfo which will hold map of BardCacheInfo cache set and get failures.
- Added more counters in BardQueryInfo for factCacheSetFailures and factCacheSetTimeoutFailures.
Add logging for cache miss ,hit and potential hits
- Utilize BardCacheInfo to add logging for cache hit and miss.
- Re-sequence in `CacheV2ResponseProcessor' to send response back to the user first and then set cache.
Add canonicalization of ArrayNode in cache key
- Added parsing and canonicalization of ArrayNodes in cache key generation.
Add logging for cache put failures
- Added BardCacheInfo logging for cache put failures.
Add withLogicalMetricInfo to MetricInstance class
- Supporting method added to MetricInstance to replace LogicalMetricInfo.
Add ablity to chain ResultSetMappers
- Added ChainingResultSetMapper to provide ablity to delegate to all ResultSetMappers in its list.
  - It provides functionality to retain the existing calculations and also add the new ResultSetMapper at query time.
  - It implements RenameableResultSetMapper in order to rename mappers with the provided new name.
  - Added @NotNull constraint on nestedQuery parameter for one of the TDQ constructor
Add ability to cahce weight checks
- Add CacheWeightCheckRequestHandler to support caching weight checks.
  - CacheWeightCheckRequestHandler delegates to WeightCheckRequestHandler to determine whether a request should be processed based on estimated query cost.
  - It also checks the cache for a matching request, else writes it to the cache using CacheService utility.
Add logic to rename aggregation to avoid name collision
- Add renameIfConflicting logic for aggregations in BaseProtocolMetricMaker.
  - Subclasses of BaseProtocolMetricMaker can implement getRenamedMetricNameWithPrefix method to have unique rename prefix for corresponding maker.
  - Default implementation will add Prefix __renamed_ whenever there is aggregation name collision.
Add ability in fili-sql to translate FilteredAggregation into SQL
- Translate a Druid query with n FilteredAggregation into SQL using (n + 1) subquery unions.
  - See PR description for details
Add ability to convert TimeSeriesQuery to GroupByQuery
- Add withDimensions to TimeSeriesQuery with dimensions parameter.
  - Allows for creating a GroupByQuery from a TimeSeriesQuery.
Add the ability for the metrics grammar to handle quoted metric values
- Metric values can be quoted with single quotes ('), and can contain any value.
  - Literal single quotes must be escaped with a backslash (\)
Add TemplateDruidQueryUtils class, which contains static utility methods for interacting with TemplateDruidQuery
- Add repointToNewMetricField method, which recursively checks a given field for references to a MetricField instance and replaces it with a different MetricField instance.
- Relies on the new WithPostAggregations and WithMetricField interfaces to find children on MetricField instances
Add interface RenamableResultSetMapper to support pointing ResultSetMappers at different columns at query time
- All ResultSetMapper implementations that are tied to a specific column in the ResultSet must implement this interface
- This re-pointing functionality is required to support metric renaming
- Currently only supports re-pointing at a single column. If multiple column re-point is required this interface must be expanded
Add withLogicalMetricInfo to LogicalMetric interface to support metric renaming
- Supporting method added to TemplateDruidQuery to rename a target MetricField
- Default implementation written on existing LogicalMetric implementations (such as LogicalMetricImpl)
  - Client subclasses of these implementations MUST override the withLogicalMetricInfo method, as the default implementation will NOT return an instance of the client subclass!
Add rename capability to TemplateDruidQuery
- Functionality added to support LogicalMetric renaming and aliasing
- Add ability to rename the output name of MetricFields on the TemplateDruidQuery
- Add method to check if TemplateDruidQuery contains MetricField with a given output name
Add __granularity to parameters map for ApiMetric by extracting it from the API query
- Add __granularity to parameter map of ApiMetric in ProtocolLogicalMetricGenerator
Add parameters for output logical metric info to core classes in the ProtocolMetric API
- Add outputMetadata parameter to the ProtocolMetric and MetricTransformer interfaces
  - Allows for a consistent way to name and track the result metric transformation, instead of deferring the renaming responsibility to the implementations.
Add COUNT(*) support in fili-sql
- When there is a count metric that uses countMaker, it will be translated into a COUNT(*) in SQL query.
Add Presto support
- Fili now can connect to Presto servers.
- Fixed Druid metadata deserialization issue with the newer druid-api-0.12.1.
- Version bump for jackson, jackson-databind, async-http-client to address package vulnerability.
- Add suppression for other packages with vulnerability but without fixed version.
Default implementations of new Generator interface
- All default implementations are based on the equivalent method from ApiRequestImpl. The logic backing them is a direct copy and paste from theApiRequestImpl implementation.
- The logic is implemented in public static methods for dependent code to use. DO NOT WRITE NEW CODE REFERENCING THOSE METHODS unless you have a good reason to do so. Prefer creating instances of the generator.
- Generators based on DataApiRequestImpl are not yet implemented.
Add Sketch Metrics support for Preso
- Fili can now translate requests that include sketch metrics to the correct presto sql statements.

Changed:

Migrating from bintray to maven central
- Bintray EOL, migrating publishing of artifacts to Maven Central.
Support for embedded java client usage: made uri builder an optional contract
Added support for druid timeouts to be cumulatively linked to request start time
- Modify RequestLog to support fetching without modification on Timers.
- Build TimeRemainingFunction to pull a time delta using the RequestLog start of request.
- Add injection constructors for WebServiceSelectorRequestHandler to inject TimeRemainingFunction and support counting down druid timeouts.
Moving to open source screwdriver for builds
- Travis-ci.com was no longer funded, migrated builds onto screwdriver.
Logging of AllGranularity to use friendlier serialization
- Switched to using getName
Better supporrt for different protocols on a single parameter name
- Added 'acceptsParameter' contract to ProtocolSupport and 'withReplaceProtocols' to support replacement of contract names.
- Shifted validation error in ProtocolChain to check for parameter missing instead of protocol missing.
- Stripped out consumed protocol core parameters to avoid retriggering core parameters on divergent protocols.
Memcached not blocking on set
- Added TimeoutConfigureBinaryConnectionFactory to make Memached timeout configuration driven
- Made set() on MemDataCache not block on get on the produced future.
- Marked clients referring to the existing get backed signature for later removal.
Additional refactor for better extensibility
- TableFullViewFormatter needed some more generality to make it easier to extend.
Refactored metadata generation into testable classes
- Created metadata formatter classes to hold code to generate metadata endpoint responses.
- Created MetadataObject to encapsulate Map<String, Object> patterns.
Added extension capabilities to the default parsing of formats
- Added static map for response formats to DefaultResponseFormatGenerator
- Added unit test for DefaultResponseFormatGenerator
Missing resource elements on Endpoints return 404, not 400
- Empty dictionaries will not longer return 400 errors on metadata. (Potentially this is more property a 5** error, but regardless 400 is wrong)
- Missing resources will return as 404 errors on metadata endpoints.
Updated dependency code
- Removed expired suppressions
- Suppressed transitive hibernate validator dependency with OWASP issue
Changing parameter value escaping from single quote to paranetheses
Refactored FilterBinders to allow for injection of filter parsing strategies
- Filter parsing strategies are created by implementing the ApiFilterParser interface
- This interface can be bound to a specific instance in the client's AbstractBinderFactory extension, which will then be injected into FilterBinders instances.
Error handling on sort of protocol metrics could be better
- Added available metrics to help with debugging mismatch between sort metric parameters and selected metrics.
WithFields interface refactored to WithPostAggregations and WithMetricFields
- WithPostAggregations interface is almost a direct rename from WithFields
  - Indicates a MetricField that can (but does not have to) depend on many PostAggregations
  - Type parameter removed
- WithMetricField indicates a MetricField that depends on exactly 1 MetricField
Add time grain to error message for metric missing from table errors
Made TimeAverageMetricTransformer use a delegate for non-match error
- Delegate to a default error handler to support easy chaining.
Added parameter for ApiRequestLogicalMetricBinder to the core binding constructor on DataApiRequestImpl
- BardConfigResources has a method for providing an instance of ApiRequestLogicalMetricBind
- Existing delegating constructors that take BardConfigResources have their public contracts unchanged
- DefaultLogicalMetricBinder is the default implementation of this interface.
- BardConfigResources#getMetricBinder is defaulted to return a new instance of this class.
Extracted DataSourceConstraint into an interface
- DataSourceConstraint is now an interface.
- Migration path documented in the linked issue.
- All public methods on the original base class are defined on the interface
- The original base class implementation has been renamed to BaseDataSourceConstraint.
- The class hierarchy has otherwise been maintained
- The method withDimensionFilter has been added to the DataSourceConstraint interface.
- This method creates a new view of the constraint that is filtered with a provided predicate.
- This method is meant to be the dimension version of the already existing withMetricIntersection method.
Methods in ApiRequestImpl for constructing ApiRequest resources have been moved to relevant generators
- No methods have been removed from ApiRequestImpl, but the implementation code has been moved to the relevant default generator implementation for that resource, and the existing methods now defer to public static methods on the generators.
Refactored sample applications into distinct submodules
- Split luthier into a library package and a sample application
- Nested all sample applications
- Resolved dependecy issues around where properties files were sourced
- Rationalized dependencies for sample applications
Uses addFactories rather than withFactories in Luthier setup.
Change the access specifier of LoadTask.setFuture to public

Removed:

Removed redis key value store support
- Redis KeyValueStore implementation and support removed.

Fixed:

-Fix: Update vulnerabilities

Update OWASP vulnerabilities and temporarily suppress database updates.

Fix vulnerability in printing stack trace instead of using logger
- Printing stack trace directly to standard.err is not safe. Using logger instead.
Fixing missing Jackson injectable
- Bumping the druid-api exposed a missing requirement now on the DataSegment contract.
- Added handlers to the ObjectMapperSuite json mapper.
Version bump jackson to resolve security vulernability

Deprecated:

Known Issues:

Contract changes:

v0.11.79 - 2019/09/17

0.11 Highlights

Configuration - Luthier

We added an external configuration system that resolves dependencies using Lua (with Json interopability via tools). Luthier provides concise and scriptable means to configure tables, metrics, dimensions, etc. See (https://github.com/yahoo/fili/tree/master/luthier)

Configuration and Extensibility

Partial data devolved into distinct features.

Added custom response handling.

The CURRENT macro can now be configured to resolve relative to a timezone other than UTC.

Code style

We added SONARQ code quality checking and addressed discovered issues.

We removed inappropriate use of Optionals from constructors and parameter types on interfaces.

Security patches

We added OWASP vulnerability checking and addressed identified issues. (https://www.owasp.org/index.php/OWASP_Dependency_Check) Enhancements included jackson version upgrades to address injection vulnerabilities.

DataApiRequest

We deprecated many methods related to DataApiRequest being used to build of druid model objects and to carry factory objects to other parts of the application. These factories are now being injected by Dependecy Injection (HK2).

The corresponding methods have been deprecated and will be removed very soon.

A Pojo DataApiRequest has been built as well as a Generator interface and a Builder. These will form the basis of a complete replacement of the existing DataApiRequestImpl in the next version, using a Factory+Builder pattern to produce an immutable value object.

Extensibility

Injectable custom response handling.

Added:

Added protected method to allow injection of dimension config loading in Generic Application
- Made dimension config loading into a protected feature of the GenericMetricLoader
Added groundwork classes for POJO DataApiRequest build path
- Generator contract, builder, and POJO object have all been added, along with some silent contract changes.
- Immutable implementations of LinkedHashSet, LinkedHashMap, and ApiFilters have been added.
Adds FlagFromTagDimension
- FlagFromTagDimension is a virtual dimension that exposes a flag based interface to API users, but is actually based on the presence or absence of a tag value in an underlying multivalued dimension.
- This implementation is based on two underlying physical columns: a filtering column which can be efficiently filtered against using the default druid filter serialization, and a grouping dimension containing a comma separated string of tag values, which is parsed to determined the presence of the desired tag value and then converted to the appropriate truth value.
- The filtering behavior is supported through the new FilterOptimizable interface and associated request mapper.
Add FilterOptimizable interface
- Adds the FilterOptimizable interface, which indicates that the implementing object has the ability to optimize a Collection of ApiFilter objects.
- Adds FilterOptimizingRequestMapper which will check if any of the filtered on dimensions can optimize their filters and performs the optimizations.
Add ImmutableSearchProvider interface and MapSearchProvider
- Adds the ImmutableSearchProvider interface, which is a marker interface indicating that the SearchProvider implementation is immutable.
- Adds MapSearchProvider which is an implementation of ImmutableSearchProvider based on a constant map.
Add support to DataCache for key-specific expirations
- Adds a new method boolean set(String key, T value, int expiration) that allows customers to to set the expiration date for a key when it is being added to the cache.
- The default implementation delegates to boolean set(String key, T value) (so throwing away the expiration), so this won't affect any customers who have their own DataCache.
- The memcache-backed implementation implements the new set, and the old set delegates to it, passing in the configured EXPIRATION constant.
Add config parameter to control lookback on druid dimension loader
- Add config parameter: bard__druid_dim_loader_lookback_period to control window of time used in loading.
Add ApiFilters to LogicalTable
- Add ApiFilters to LogicalTable class. These filters function as a view on the underlying physical tables by restricting access to only a subset of the data present on the logical table.
- These filters are merged with ApiFilters from the api request during druid query building and on the TablesApiRequestImpl for requests to the tables servlet.
- Small Patch: ApiFilters contract was breaking downstream application tests, so switched to supporting Optional
- Second small patch: Fix null pointer errors with TablesApiRequestImpl
Make current macro align on the end of network day
- Added BardFeatureFlag.CURRENT_TIME_ZONE_ADJUSTMENT which determines if adjustment based on timezone is needed.
- Added BardFeatureFlag.ADJUSTED_TIME_ZONE which tells to what timezone the macro has to be adjusted.
- If CURRENT_TIME_ZONE_ADJUSTMENT flag is enabled, macro is aligned on end of UTC day.
Create a TagExtractionFunctionFactory to transform comma list values into a Boolean dimension
- Create an extraction function to transform a comma list of values into a boolean dimension value.
Add Partial Data Feature Flags to separate query planning and data protection
- BardFeatureFlag.PARTIAL_DATA_PROTECTION activates removal of time buckets based on availability
- BardFeatureFlag.PARTIAL_DATA_QUERY_OPTIMIZATION activates the use of PartialData when query planning.
- BardFeatureFlag.PARTIAL_DATA still activates both capabilities.
- If any of these flags are active partial data answers are included in responses.
Add system config to disable requiring metrics in Api queries
- Added the system config require_metrics_in_query which toggles whether or not metrics should be required in queries
  - this setting is turned ON by default
- This property is controlled through the feature flag BardFeatureFlag.REQUIRE_METRICS_QUERY
Add more BoundFilterBuilding validation and hooks
- Added minimum and maximum arguments to FilterOperation
- Added validation on number of arguments to the bound filter builder
- Added hook for normalizing BoundFilterBuilder arguments
Force update of cardinality to SearchIndexes
- SearchProvider now has method int getDimensionCardinality(boolean refresh), where refresh indicates the cardinality count should be refreshed before being returned.
  - default implementation just defers to existing method int getDimensionCardinality()
  - LuceneSearchProvider overrides the default and refreshes the cardinality count if refresh is true
Added aliases to api filter operations
- Filter ops now have aliases that match the relevant ops and aliases for havings.
Added filename parameter to api query
- If the filename parameter is present in the request the response is assumed to be downloaded with the provided filename. The download format depends on the format provided to the format parameter.
- Filename parameter is currently only available to data queries.
Ability to add Dimension objects to DimensionSpecs as a nonserialized config object
- DimensionSpec and relevant subclasses have had a constructor added that takes a Dimension and a getter for the Dimension
Added expected start and end dates to PhysicalTableDefiniton
- New constructors on PhysicalTableDefinition and ConcretePhysicalTableDefinition that take expected start and end date
- New public getters on PhysicalTableDefinition for expected start and end date
Added expected start and end dates to availability
- Add methods for getting expected start and end dates given a datasource constraint to the Availability interface.
  - start and end dates are optional, with an empty optional indicating no expected start or end date.
  - the new methods default to returning an empty optional.
- The start and end dates are not concrete. If an availability has intervals outside of the expected range those intervals are NOT suppressed.
- BaseCompositeAvailability reports its expected start and end dates as the earliest start date and latest end date of its composed availabilities.
  - no expected start or end date supercedes any configured start or end date, so if ANY of the composed availabilities has no start or end date, and empty optional is reported.
- Add a constructor to StrictAvailability that takes start and end dates, which allow for direct configuration of expected start and end dates.
Fili can now route to one of several Druid webservices based on custom routing logic
- This allows customers to put Fili in front of multiple Druid clusters, and then use custom logic to decide which cluster to query for each request.
- We introduce a new interface DruidWebServiceSelector that wraps the routing logic, and pass an instance to the AsyncWebServiceRequestHandler for it to use.
Add Druid Bound filter support to Fili
- Added the DruidBoundFilter class to support the Bound Filter supported by Druid.
Add static Factory build methods for BoundFilter
- Added static factory methods for building lowerBound, upperBound, strictLowerBound and strictUpperBound Bound filters.
Add insertion order aware method for Stream Utils
- Added orderedSetMerge that merges 2 sets in the order provided.
Add DimensionRow transformation support with ResultSetMapper
- Added helper constructor to DimensionRow
- Created MemoizingDimensionMappingResultSetMapper to support field transform use case
Added LogicalTable name metdata interface and BaseTableLoader methods to accept it
- LogicalTable accepts LogicalTableName as a constructor parameter
- BaseTableLoader.loadLogicalTablesWithGranularities accepts LogicalTableNames to pass to new LogicalTable constructor
- Changed default retention for LogicalTable to null rather that P1Y

Changed:

Removed references to yahoo internal authorization system
- Renamed bouncer code to 'status code' and references to 'Bouncer' in class name and fields.
Deprecated optionals on constructors for DataApiRequest implementations and related objects
- Some objects related to DataApiRequest were taking optionals as construction parameters or storing optionals internally. This behvaior has been changed to more closely align with accept guidelines for using optionals
OptionalInt has been removed from some interfaces relating to DataApiRequest and replaced with Optional
- While OptionalInt is considered to be preferable to Optional<Integer> for performance reasons, it does not have the same rich set of utility methods as Optional. The relevant code was not in a performance critical location so OptionalInt was replaced with Optional<Integer> for improved usability.
Improved user provided filename handling to truncate extra user provided file extensions
- If the user provided filename ends with a file extension that matches the file extension provided by the response format type, that file extension is removed.
- Some other small refactors where do on the ResponseUtils class
Upgrade to Jackson 2.9.9
- Addresses https://nvd.nist.gov/vuln/detail/CVE-2019-12086, a new vulnerability in jackson databind.
Made Filter Construction more flexible
- Changed FilterBinder.INSTANCE from final to static with accessors
- Refactored FilterBinders to support chain-of-responsibility FilterFactory
Create a TagExtractionFunctionFactory to transform comma list values into a Boolean dimension
- Create an extraction function to transform a comma list of values into a boolean dimension value.
Fix security alerts & Dependency version bump
- Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information.
- This PR upgrades com.puppycrawl.tools to the safe-version.
RoleDimensionApiFilterRequestMapper builds api filters with a defined, consistent ordering
- The resulting set of ApiFilters is backed by a linked hash set, which is ordered by the names of the dimension, dimension field, and filter operation.
- The values in each constructed ApiFilter are sorted.
Better expose lucene analyzer LuceneSearchProvider
- Continue to make LuceneSearchProvider internals protected over private so extending classes have an easier time extending behavior.
- removed protected getter and setter on LuceneSearchProvider.analyzer field in favor of just making the field protected
LuceneSearchProvider will throw an exception if a thread spends too much time waiting on acquiring a lock
- Currently, if LuceneSearchProvider tries to acquire a lock it will wait forever until the lock is released. If the lock is erroneously never released the requesting thread will hang forever.
- The new behavior is to timeout after some amount of time, throw an error, and fail the query.
Strict Availability no longer returns no availability when queried with constraint with no columns
- Currently, StrictAvailability.getAvailableIntervals(Constraint) returns an empty interval list when called with a constraint with an empty column list. This behavior is now changed to defer the call to StrictAvailability.getAvailableIntervals()
- This behavior change is only relevant to StrictAvailability, all other default availability implementations are composite availabilities and defer this call to their underlying availabilities.
Change log level for several servlet
- SlicesServlet, DimensionsServlet, MetricsServlet, TablesServlet, FeatureFlagsServlet all has debug level log for the entire query response. Change log level to trace to avoid log spamming.
Better exposed dimension analyzer fields in LuceneSearchProvider
- Changed LuceneSearchProvider to using an analyzer field instead of a final, statically create StandardAnalyzer
- some previously private fields and methods are now either protected or public.
Better exposed static method on DimensionsServlet to subclasses
- Changed DimensionsServlet.getDescriptionKey to protected
ResponseFormatType now contains information relevant to generating response headers associated with response format
- ResponseFormatType interface exposes getCharset(), getFileExtension(), and getContentType() methods which provide information used to build response headers
ApiRequest interfaces exposes getDownloadFilename method
- ApiRequestImpl and other classes relating to the construction of DataApiRequest implementations have had new constructors added to handle the filename.
- Constructors that don't handle filename are now deprecated.
Added config property for SSL cipher suites
Updated asynch http dependency to resolve security issues
- Moved netty to current 4.1.31.Final
- Moved asynch-http-client to current 2.6.0
Truncate csv response file path length * Set a max size to file name for a downloaded csv report. ~~- max length is 218 characters, which is Microsoft Excel's max file length~~
The algorithm for PartionAvailability is changed to consider using expected start and end date
- Currently PartitonAvailability reports its availability as the intersection of all participating sub availabilities
- PartitionAvailability now takes the union of available intervals from all participating availabilities and subtracts from it the union of missing intervals from all participating availabilities.
  - The current algorithm is equivalent to the new algorithm if ALL participating availabilities have NO expected start AND end dates
Configurable limit to csv filename length
- Created configuration parameter 'download_file_max_name_length' to truncate filename lengths
  - Default truncation is no truncation (0)
  - truncation happens before applying file extension
Generifying FilterBuilder exceptions
- Make FilterBuilder exceptions more general and use them for non search provider exceptons.
Removed deprecations in maker classes
- Add LogicalMetricInfo conversion method on ApiMetricField class
- Moved all tests and internal uses onto LMI based construction
- Removed calls to deprecated Sketch utility methods
- Removed unused RowNumMapper non-LMI implementation
- Undeprecated needed sketch utility method
Eliminate String based metric creation
- Add LogicalMetricInfo conversion method on ApiMetricField class
- Moved all tests and internal uses onto LMI based construction
ApiFilter allows preserving the insertion order of filter values
- Constructor and withValues() method's argument changed from Set to Collection.
- Added getValueList() to return the original ordered filter values.
RoleDimensionApiFilterRequestMapper preserves the order of insertion of ApiFilter values
- Changed unionMergeFilterValues() to be order cognizant for ApiFilter values.
Cleanup DataApiRequestImpl and Builders
- Moved to ordered bind/validate semantics.
- Created interface for druid having building and moved existing builder
- Moved DruidQueryBuilder off of apiRequest.getDruidHavings to use apiRequest.getHavings().isEmpty()
- Moved generators into the binders subpackage of web.apirequest
Refactored DruidHavingBuilder
- Moved DruidHavingBuilder to new package
- Made static methods into instance methods, with a default instance
- Created a factory interface
- Injected DruidHavingBuilder into QueryBuilder
Refactored DruidFilterBuilders
- Moved DruidFilterBuilder and clients to new package
Removed deprecations in maker classes
- Add LogicalMetricInfo conversion method on ApiMetricField class
- Moved all tests and internal uses onto LMI based construction
- Removed calls to deprecated Sketch utility methods
- Removed unused RowNumMapper non-LMI implementation
- Undeprecated needed sketch utility method
Eliminate String based metric creation
- Add LogicalMetricInfo conversion method on ApiMetricField class
- Moved all tests and internal uses onto LMI based construction
Update Redison dependencies

Deprecated:

Removed:

Removed specialized PhysicalDataSourceConstraint methods from Availability
Disabled TableUtilsSpec test that only tested testcode

Fixed:

CVSS vulernability resolution
- Upversion druid-api to 0.12 to stop bringing in vulnerable versions of apache and avro
- Upversion jetty for emedded examples to resolve transitive vulneratibilities
- Upversion calcite to resolve protobuf vulnerabilities
Disabled erroring javadoc reports
- Disabled link following during javadoc processing that was creating many false positive warnings.
Hardened the regex generated by TagExtractionFunctionFactory to fail partial matches
- Problem: Previously, the regex for the FlagFromTag dimension did not use start and end line characters. This means that regex engines can partial match on incorrect strings (eg: desired flag value "1" can match "2,3,12").
- Druid uses a matching strategy that avoids this, but itis changed to always fail, even with engines that will match on partial matches (which is common).
SystemConfigException now extends RuntimeException instead of Error
- Problem: Previously, if an unexpected behaviour happens in the Class build time in fili-system-config module's SystemConfig.java, an Error will be raised and bubbles up in mvn build, where we suspect only Exceptions are logged.
- Behaviour: For example, if a fili module is missing appropriate moduleConfig.properties in its src/main/resources, in runtime, we will get java.util.NoSuchElementexception followed by NoClassDefFound instead of the SystemConfigException being correctly logged.
- Fix: Now the SystemConfigException will no longer extend Error. It extends RuntimeException instead. We also log the message in-place in SystemConfig.java when any Exception is caught.
Reorder applying table api filters in druid filter building
- table api filters where not being used in query planning, moving the merge above query planning fixes this.
Unstuck druid dimension loader in time
- Unstuck lookback so that the window slides forward rather than stopping at static load time.
Reverted addition of Verizon Media Group to copyright
Handle null lastLoadDate in DruidDimensionLoader
- Protected DruidDimensionsLoader from null pointer exceptions on no LastRunDate
Filtered partial time comparison to requested intervals in PartialTimeComprator
Fixed many compile warnings and other issues
- Many minor syntax and structual issues resolved.
Fixed FilteredAggregation nesting behavior
- Currently, FilteredAggregation effectively makes a second copy of its wrapped aggregation, and the inner copy will be wrapped with the filter and the outer query won't
- This behavior is changed to call nest on the wrapped query, then wrap produced inner query with the filter. The filtered inner query is returned as the inner query of nest call, and the raw outer query is returned as the outer query of the nest call
Filter Code now intersects security constraints instead of unioning with requests
- Switched to ensure security and request filters don't merge but instead intersect
Bump Jackson version to patch vulnerability
- Bumped dependency version to 2.9.8
Bump Jackson version again to patch vulnerability
- Bumped dependency version to 2.9.5
- Made serialization order more specific in several classes
- Fixed bad format in error message
- Moved tests off of serialization of SimplifiedIntervalList. That's turning out to be hard to solve.
Bump lucene version to patch vulnerability
- Bumped dependency version to 7.5.0
- Added error in case of greater than maxint hits from Lucene
Bump spring code to patch vulnerability
- Bumped dependency version to [5.1.2,)
- Throw validation error if excessive documents are returned from Lucene now that it supports up to long hitcounts
Updated copyright style to include Verizon Media Group
Fixed incorrect key for physicalTableDictionary lookup
Log exception messages in DataApiExceptionHandler
- DataApiExceptionHandler clearly intended to log error messages but improperly used the sl4j log syntax.
- Added original URI logging to exception handling cases.
Fixed corrupted getDefaultDimensionFields after show
- map merge incorrectly modified source set

Known Issues:

Contract changes:

Allow Optional nesting in Aggregation
- Changed the return type of nest method in Aggreagtion class.
- nest method in Aggregation now returns a Pair of Optional<Aggregation>.
ResponseUtils is now responsible for generating response headers
- ResponseUtils now generates the Content-Type header and the Content-Disposition header if relevant.
- ResponseUtils handles all format types instead of just building the Content-Disposition header value for CSV responses
Response is assumed to be rendered in the browser unless a filename is provided
- ResponseUtils takes a list of formats that default to download. This list can be changed or overridden
- By default CSV is added to the default to download list. This is to maintain backwards compatibility
ApiRequest.getApiDruidFilters() must now not be null
DataApiRequest property renames
- DataApiRequest getHaving -> getQueryHaving
- DataApiRequest getDruidFilter -> getQueryFilter
- Deprecate old paths
Additional healthcheck logging on healthchck failure on data request
- Added user, request url, and timestamp to healthcheck error message on data request.

v0.10.48 - 2018/10/04

0.10 Highlights

Extensibility improvements

A number of classes which were either not extensible or very difficult to extend have been restructured.

Response building, extraction functions, Web logging and Exception handling in DataServlet are now injectable.

Extensive deprecation cleanup

A lot of tech debt has been paid down in the form of removing deprecated code and moving code off deprecated methods. Old sketch support has been retired in favor of the official community open source version: (https://datasketches.github.io/docs/Theta/ThetaSketchFramework.html)

Some deprecations have been removed because the migration path off of those deprecated methods seemed less useful than simply supporting the older (simpler) contract.

General cleanup

Some packages have been rationalized together or apart. Some concrete classes became interfaces and vice-versa.

Externalizing filter building code

Generating ApiFilters and Druid Filters has been externalized to make it easier to reimplement with non-Regular Expression solutions.

Supporting Druid in-filters

A general performance improvement by implementing the in filter in druid (as opposed to long chains of single value select filters)

Added:

-- MAJOR FEATURE: Protocol Metrics] * Split LogicalMetric into an interface and an implementation: LogicalMetricImpl * Added Protocol, ProtocolSupport and MetricTransformer to support metric self-trasnformation. * Added DefaultSystemMetricProtocols and ProtocolDictionary to support protocol configuration. * Updated ArithmeticMaker and AggregationAverageMaker to produce protocol metrics.

Sorting JSON objects before caching
- Added canonicalize method in Utils class which sorts the JSON objects of druid query before hashing so that hash values are consistent.
- canonicalize takes a boolean parameter preserveContext which determines if context has to be omitted.
- Deprecated omitField method in Utils class.
Enforce role based security for incoming API requests
- Added RoleBasedTableValidatorRequestMapper class which checks if a user's role satisfies the predicates defined for a logical table.
HttpResponseMaker header building made extendable
- Added buildAndAddResponseHeaders method in HttpResponseMaker which handles building and adding headers to a response builder. This logic was moved from createResponseBuilder.
- Made createResponseBuilder method protected to open up the class to be more extendable.
Add factory for build ApiFilter objects
- ApiFilter changed into a simple value object.
- ApiFilter constructor using filter clause from API request moved to factory as static build method.
- ApiFilter union method moved to factory.
Add interface to FilterOperation for easy extension
- Changed existing version of FilterOperation to DefaulFilterOperation and made FilterOperation into an interface.
- Changed code that depended on the enum to be dependent on the new interfaces instead.
Wrapping DruidInFilterBuilder as default filter builder under a feature flag
- Added DEFAULT_IN_FILTER feature flag.
- If DEFAULT_IN_FILTER feature flag is enabled, then DruidInFilterBuilder will be used as the default druid query builder.
- If DEFAULT_IN_FILTER feature flag is disabled, then DruidOrFilterBuilder will be used as the default druid query builder.
Enable Druid In-filter in Fili
- DefaultDruidFilterBuilder is renamed to DruidOrFilterBuilder
- Implement DruidInFilterBuilder which turns list of selector filter generated by DruidOrFilterBuilder into a single Druid in-filter. The in-filter resolves the timeout issue. The in-filter could substantially shorten druid query, making the query more sharable and readable.
- Fili now uses the DruidInFilterBuilder as the "Default" Druid query filter builder, in stead of the old DruidOrFilterBuilder, because the new default filter is terser and packs the query payload tighter, and in proportion to the number of filters being applied.
Move off BaseCompositePhysicalTable inheritance usage
- Added builder methods to MetricUnionAvailability and PartitionAvailability to save on needing to add additional table classes.
An injection point for customizing the WebLoggingFilter to use during tests
- Extend JerseyTestBinder and override getLoggingFilter.
An injection point for customizing Exception handling
- Customers can provide their own logic for handling top level Exceptions in the DataServlet by implementing DataExceptionHandler, and any other servlet by implementing MetadataExceptionHandler.
Add support for "case_sensitive" attribute in FragmentSearchQuerySpec
- Enable FragmentSearchQuerySpec to accept an argument for case_sensitive so that API users can configure this attribute for JSON serialization through Fili.
Add specs for InsensitiveContainsSearchQuerySpec & RegexSearchQuerySpec
- RegexSearchQuerySpec and InsensitiveContainsSearchQuerySpec have no dedicated test specs. This PR adds tests for them.
Implement ContainsSearchQuerySpec
- Adds serialization for ContainsSearchQuerySpec so that Fili API users can use that through Fili.
Add storageStrategy as a field of the DimensionConfig class
- Adds getStorageStrategy as a field of the dimensionConfig class.
- Passes the storage strategy to the KeyValueStoreDimension Constructor
Add more tests to RegisteredLookupMetadataLoadTask
- Adds tests to make sure the load tasks can update status correctly.
Add tests to Utils
- Add missing tests to Utils class.
Enable Druid Bound-filter in Fili
- Added BoundFilter class to support Druid Bound-filter

Changed:

Made In Filter use sorted Set to support reliable equality
- In filter tested in unstable ways. Since order doesn't matter, switched to a canonically ordered set.
Make ConjunctionDruidFilterBuilder's protected helper methods instance methods
Widen access privileges to suppress IDE warnings
Corrected generality on with methods
- Changed DataApiRequest methods to not refer to the implementation classes.
Cleaned up ApiRequest contracts
- DataApiRequest default implementation of getting dimension filters
- Aync after nullable (to indicate unconfigured as opposed to explicitly zero)
- DefaultPagination removed from ApiRequest interface (ApiRequest should not be a service provider)
- Moved filter generation into an explicit class, defaulted implementations in subclasses using extensible methods.
Corrected generality on with methods
- Changed DataApiRequest methods to not refer to the implementation classes.
- Added getGeneratorFilter (to replace 'generate') to DataApiRequest (yes this contradicts the item above, but this is a transitional change.)
Extract logic for getting pagination of dimension rows
- Extract the logic in DimensionsServlet to get pagination of dimension rows into a protected function.
Removed deprecations
- Removed Pagination deprecation
- Removed DataSourceConstraint deprecation
Bumping query id inside withIntervals of LookBackQuery
- Returning a new LookBackQuery with doFork set to true which bumps query id inside withIntervals method.
- Renamed every occurrence of doFork to incrementQueryId.
- Removed withDoFork from LookBackQuery class.
Change to ordered data structures for ApiRequestImpls
- Change Set to LinkedHashSet in most ApiRequestImpl getters
- Change Set to List in ApiRequests
Moved ResponseFormatType from Singleton enum to interface with enum impl
- Refactored ResponseFormatType to allow expansion via additional interface implementors
- Replaced equality based matching with 'accepts' model
Restructured Report Building out of ApiRequest
- Pushed UriInfo used to produce PaginationMapper into RequestContext
- Renamed 'getPage' to 'paginate' and refactored off ApiRequest and into PageLinkBuilder
- Moved pagination mostly out of individual servlet classes and into EndpointServlet
- Initialized per request Response.ResponseBuilder in EndpointServlet rather than ApiRequestImpl constructor
- Simplified injected classes that took both UrlInfo and ContainerRequestContext to get the one from inside the other.
- Pushed entire container request context to content disposition building code (prelude to #709 )
Change BaseCompositePhysicalTable into a concrete class
- Currently BaseCompositePhysicalTable is an abstract class, though it has all of the functionality for a simple composite physical table. Changed to a concrete class to allow for simple composite table behavior with requiring an extension.
- Two simple implementations of BaseCompositePhysicalTable, PartitionCompositeTable and MetricUnionCompositeTable are now deprecated. Instead, the availabilities for these tables should be created directly and passed in to BaseCompositePhysicalTable.
Change availability behavior on BasePhysicalTable
- Currently BasePhysicalTable overrides getAvailableIntervals(constraint) and getAllAvailableIntervals(), and defers this behavior to its availability. This PR changes BasePhysicalTable to also override getAvailableIntervals() and defer to its availability.
Making Custom Sketch operations possible in PostAggreations
- Added PostAggregation and Aggregation instance check in asSketchEstimate(MetricField field) method of the class ThetaSketchFieldConverter.
- FieldAccessorPostAggregation is called only for Aggregation and not for PostAggregation.
Let DimensionApiRequestMapper throw RequestValidationException instead of BadApiRequestException
- DimensionApiRequestMapper.apply() is made to obey the interface contract by throwing RequestValidationException instead of BadApiRequestException
Inject customizable extraction functions to RegisteredLookupDimension
- Instead of injecting registered lookup names, we inject registered lookup extraction functions for lookup dimension so that downstream projects can configure all fields of registered lookup extraction functions.
Abort request when too many Druid filters are generated
- In order to avoid Druid queries with too much filters on high-cardinality dimension, Fili sets a upper limit on the number of filters and aborts requests if the limit is exceeded.
Class re-organization
- Put Granularity interfaces and its implementations in the same package
- Put *ApiRequest interfaces and their implementations in the same package
Avoid casting to generate SimplifiedIntervalList
- Some downstream projects generated partial intervals as ArrayList, which cannot be cased to SimplifiedIntervalList in places like getVolatileIntervalsWithDefault. The result is a casting exception which crashes downstream applications. Casting is replaced with a explicit SimplifiedIntervalList object creation.

Deprecated:

Undeprecated pagination by collection
- Since we seem to be in no hurry to switch to heavier reliance on streams. (also renamed paginate and moved to PageLinkBuilder)
Deprecate PartitionCompositeTable and MetricUnionCompositeTable
- Two simple implementations of BaseCompositePhysicalTable (PartitionCompositeTable and MetricUnionCompositeTable) are now deprecated. Instead, the availabilities for these tables should be created directly and passed in to BaseCompositePhysicalTable.

Fixed:

ApiFilter constructor from filter query string was removed when it should have been deprecated
- Constructor using filter query string is restored and calls the ApiFilterGenerator.build() to construct itself from the string.
Fix equality on Filtered Metrics
- ThetaSketchIntersection had a bug where it wasn't collecting dependent metrics from InFilters
Fix equality on Filtered Metrics
- Filtered metrics should take filter values into account for equals and hashcode
Fix name change in test logical metrics that breaks downstream tests
- Change test logical metric generation to use LogicalMetricInfo constructor which takes both long name and description.
Fix GroovyTestUtils json parsing
- Properly handles json parsing failures and non-JSON expected strings.
Fix generate intervals logic when availability is empty
- Logic to generate intervals when CURRENT_MACRO_USES_LATEST flag is turned on has a bug. The code throws NoSuchElementException when the table has no availabilities. This PR fixes the bug by checking if the availability of the underlying table is empty.
Correct Druid coordinator URL in Wikipedia example
- Config value for Druid coordinator URL is mis-typed.
Upgrade codenarc to recognize unused imports in Groovy
- There are number of unused imports sitting in tests. The cause is an out-dated codenarc version. This PR upgrades the version and removes those unused imports with the new version.
Adding withDoFork to LookBackQuery
- Added withDoFork to LookBackQuery class.
- Fixes failure of Lookback query to correctly do context forking under query split
Removed dependency on ApiFilter in ThetaSketchMetricsHelder
- Instead injecting dependency on creation (with defaulting constructor)
Removed deprecated code references
- Renamed keys from BardLoggingFilter properties off deprecated refence class (this was an artifact from a bad rename)
Removed older deprecated code
- Removed constructos and getters with clean replacements
- Stripped the remaining UI/NonUI code
- Cleaned up old schema classes and methods
- Removed orphaned metadata response data factory
- Removed pre-theta sketch code
- Removed deprecated min/max aggregations
- Removed loader code for metrics that don't include dimension dictionary
- Removed KeyValueStoreDimension

Known Issues:

Improved deletion on non-existing path
- If path does not exist, do not run deletion on that path

v0.9.137 - 2018/04/13

0.9 Highlights

Fili Security Added!

Release security module for fili data security filters. Created ChainingRequestMapper, and a set of mappers for gatekeeping on security roles and whitelisting dimension filters.

Added by @michael-mclawhorn in #405

DataApiRequestFactory layer

Downstream projects now have more flexibility to construct DataApiRequest by using injectableFactory. An additional constructor for DataApiRequestImpl unpacks the config resources bundle to make it easier to override dictionaries.

Added by @michael-mclawhorn in #603

Make Field Accessor PostAggregation able to reference post aggregations in adddition to aggregations

Druid allows (but does not protect against ordering) post aggregation trees referencing columns that are also post aggregation trees. This makes it possible to send such a query by using a field accessor to reference another query expression. Using this capability may have some risk.

Added by @michael-mclawhorn in #543

Etag Cache

In the more recent versions of druid that are released after February 23rd, 2017. Druid added support for HTTP Etag. By including a If-None-Match header along with a druid query, druid will compute a hash as the etag in a way such that each unique response has a corresponding unique etag, the etag will be included in the header along with the response. In addition, if a query to druid includes the If-None-Match with a etag of the query, druid will check if the etag matches the response of the query, if yes, druid will return a HTTP Status 304 Content Not Modified response to indicate that the response is unchanged and matches the etag received from druid query request header. Otherwise druid will execute the query and respond normally with a new etag attached to the response header.

This new feature is designed by @garyluoex . For more info, visit @garyluoex 's design at #255

More robust Lucene Search Provider and Key Value Store

Lucene Search Provider can re-open in a bug-free way and close more cleanly

Added by @garyluoex in #551 and #521

Extraction Function on selector filter

Update Fili to accommodate the deprecated ExtractionFilter in druid, use selector filter with extraction function instead. Added extraction function on dimensional filter, defaults to extraction function on dimension if it exists.

Added by @garyluoex in #617

More controllable RequestLog

Exposes the LogInfo objects stored in the RequestLog, via RequestLog::retrieveAll making it easier for customers to implement their own scheme for logging the RequestLog

Added by @archolewa in #574

Druid lookup metadata load status check

Fili now supports checking Druid lookup status as one of it's health check. It will be very easy to identify any failed lookups.

Added by @QubitPi in #620

Add ability to use custom rate limiting schemes

While backward compatibility is guaranteed, Fili now allows users to rate limit(with a a new rate limiter) based on different criteria other than the default criteria.

Added by @efronbs in #591

Support Time Format Extraction Function in Fili

Druid TimeFormatExtractionFunction is added to Fili. API users could interact with Druid using TimeFormatExtractionFunction through Fili.

Added by @QubitPi in #611

Dimension load strategy indicator

In order to allow clients to be notified if a dimension's values are browsable and searchable, a storage strategy metadata is added to dimension. A browsable and searchable dimension is denoted by LOADED, whereas the opposite is denoted by NONE. This will be very useful for UI backed by Fili on sending dimension-related queries.

Added by @michael-mclawhorn, @garyluoex and @QubitPi in #575, #589, #558, #578

Query Split Logging

Include metrics in logging to allow for better evaluation of the impact of caching for split queries. There used to be only a binary flag (BardQueryInfo.cached) that is inconsistently set for split queries. Now 3 new metrics are added

Number of split queries satisfied by cache
Number of split queries actually sent to the fact store. (not satisfied by cache)
Number of weight-checked queries

Added by @QubitPi in #537

Configurable Metric Long Name

Logical metric has more config-richness to not just configure metric name, but also metric long name, description, etc. MetricInstance is now created by accepting a LogicalMetricInfo which contains all these fields in addition to metric name.

Added by @QubitPi in #492

Search provider can hot-swap index and key value store can hot-swap store location

LuceneSearchProvider is able to hot swap index by replacing Lucene index by moving the old index directory to a different location, moving new indexes to a new directory with the same old name, and deleting the old index directory in file system. KeyValueStore is also made to support hot-swapping key value store location

Added by @QubitPi in #522

Uptime Status Metric

A metric showing how long Fili has been running is available.

Added by @mpardesh in #518

Consolidate UI & Non-UI broker configurations

ui_druid_broke and non_ui_druid_broker are not used separately anymore. Instead, a single druid_broker replaces the two. For backwards compatibility, Fili checks if druid_broker is set. If not, Fili uses non_ui_druid_broker and then ui_druid_broker

Added by @mpardesh in #489 Amended by @gab-umich in #933

Credits

Thanks to everyone who contributed to this release!

@michael-mclawhorn Michael Mclawhorn @garyluoex Gary Luo @archolewa Andrew Cholewa @QubitPi Jiaqi Liu @asifmansoora Asif Mansoor Amanullah @efronbs Ben Efron @deepakb91 Deepak Babu @tarrantzhang Tarrant Zhang @kevinhinterlong Kevin Hinterlong @mpardesh Monica Pardeshi @colemanProjects Neelan Coleman @onlinecco @dejan2609 Dejan Stojadinović

Added:

Added with method for timesort to DataApiRequest
Latest Time Macro
- Added logicalTableAvailability to TableUtils which returns the union of intervals for the logical table.
- Added now parameter to generateIntervals for which time macros will be relatively calculated.
- Added CURRENT_MACRO_USES_LATEST flag which when turned on uses the first unavailable availability to generate the intervals.
Annotate Functional Interface
- Add @FunctionalInterface annotation to all functional interfaces.
Implement LookupLoadTask
- Add capability for Fili to check load statuses of Druid lookups.
Extraction Function on selector filter
- Added extraction function on dimensional filter, defaults to extraction function on dimension if it exists.
Implement TimeFormatExtractionFunction
- Enable TimeFormatExtractionFunction in Fili so that API users could interact with Druid using TimeFormatExtractionFunction through Fili.
Complete DateTimeUtils tests
- Add tests for all un-tested methods in DateTimeUtils.
Enable checkstyle to detect incorrect package header
- Fili was able to pass the build with wrong package headers in some source files. This needs to be fixed, and it's fixed in this PR by adding PackageDeclaration checkstyle rule.
- In addition, checkstyle version has been bumped to the latest one(Nov, 2017), which is now able to detect more styling errors.
Add loaded strategy onto tables full view endpoint
- Add dimension storage strategy to table full view endpoint
Add getter for LogicalMetricInfo in MetricInstance
- There are 3 instance variables inside MetricInstance class, two of which have getters. The one without getter, LogicalMetricInfo, should have one, as well, so that subclass can access it without creating a duplicate LogicalMetricInfo inside their own.
Backwards compatible constructor for KeyValueStoreDimension around storage strategy
- Provide a backwards compatible constructor for existing implementations that don't provide storage strategies.
Have Table Endpoint Filter Using QueryPlanningConstraint
- Enable tables endpoint to fiilter availabilities based on availability-constraint
Implement dimension metadata to indicate storage strategy
- In order to allow clients to be notified if a dimension's values are browsable and searchable, a storage strategy metadata is added to dimension.
Refactor ApiRequest
- Add inteface layer to each type of API request class. The types of API request under the refactor are
  - TablesApiRequest
  - DimensionApiRequest
  - SlicesApiRequest
  - MetricsApiRequest
  - JobsApiRequest
Implement Query Split Logging
- Include metrics in logging to allow for better evaluation of the impact of caching for split queries.
  - Currently there is only a binary flag (BardQueryInfo.cached) that is inconsistently set for split queries
  - Three new metrics are added
    1. Number of split queries satisfied by cache
    2. Number of split queries actually sent to the fact store. (not satisfied by cache)
    3. Number of weight-checked queries
Documentation that Dimension::getFieldByName should throw an IllegalArgumentException if there is no field with the passed in name
Evaluate format type from both URI and Accept header
- Add a new functional interface ResponseFormatResolver to coalesce Accept header format type and URI format type.
- Implement a concrete implementation of ResponseFormatResolver in AbstractBindingFactory.
Add Constructor and wither for TableApiRequest
- Making the TablesApiRequest similar to other ApiRequest classses so added an all argument constructor and withers. The all argument constructor is made private since its used only by the withers.
Add Code Narc to validate Groovy style
- Checkstyle is great, but it doesn't process Groovy. Code Narc is Checkstyle for Groovy, so we should totally use it.
Allow Webservice to Configure Metric Long Name
- Logical metric needs more config-richness to not just configure metric name, but also metric long name, description, etc. MetricInstance is now created by accepting a LogicalMetricInfo which contains all these fields in addition to metric name.
Enable search provider to hot-swap index and key value store to hot-swap store location
- Add new default method to SearchProvider interface in order to support hot-swapping index.
- Implement the hot-swapping method of the SearchProvider interface in LuceneSearchProvider - replace Lucene index by moving the old index directory to a different location, moving new indexes to a new directory with the same old name, and deleting the old index directory in file system.
- Add new default method to KeyValueStore interface in order to support hot-swapping key value store location.
Translate doc, built-in-makers.md, to Chinese
- Part of Fili translation in order to increase popularity of Fili in Chinese tech industries.
Add Uptime Status Metric
- Add a metric to show how long Fili has been running
Add druid_broker config parameter to replace ui_druid_broker and non_ui_druid_broker
Have Tables Endpoint Support (but not use) Additional Query Parameters
- Make the availability consider the TablesApiRequest by passing it into the getLogicalTableFullView method
- Move auxiliary methods from DataApiRequest to ApiRequest in order to make them sharable between DataApiRequest and TableApiRequest.
Fili-security module
- Added security module for fili data security filters
- Created ChainingRequestMapper, and a set of mappers for gatekeeping on security roles and whitelisting dimension filters.
Add Table-wide Availability
- Add availableIntervals field to tables endpoint by union the availability for the logical table without taking the TablesApiRequest into account.
Implement EtagCacheRequestHandler
- Add EtagCacheRequestHandler that checks the cache for a matching eTag
- Add EtagCacheRequestHandler to DruidWorkflow
- Make MemTupleDataCache take parametrized meta data type
Implement EtagCacheResponseProcessor
- Add EtagCacheResponseProcessor that caches the results if appropriate after completing a query according to etag value.
Add dimension dictionary to metric loader
- Added a two argument version of loadMetricDictionary default method in MetricLoader interface that allows dimension dependent metrics by providing a dimension dictionary given by ConfigurationLoader

Changed:

Avoid casting to generate SimplifiedIntervalList
- Some downstream projects generated partial intervals as ArrayList, which cannot be cased to SimplifiedIntervalList in places like getPartialIntervalsWithDefault. The result is a casting exception which crashes downstream applications. Casting is replaced with a explicit SimplifiedIntervalList object creation.
ResponseProcessor is now injectable.
- To add a custom ResponseProcessor, implement ResponseProcessorFactory, override AbstractBinderFactory::buildResponseProcessorFactory to return your custom ResponseProcessorFactory.class.
Add config to ignore partial/volatile intervals and cache everything in cache V2
- In cache V2, user should be able to decide whether partial data or volatile data should be cached or not. This PR adds a config that allows the user to do this.
Lift required override on deprecated method in MetricLoader
- Add default implementation to deprecated loadMetricDictionary in MetricLoader so that downstream projects are able to implement the new version without worrying about the deprecated version.
Added DataApiRequestFactory layer
- Replaced static construction of DataApiRequest with an injectableFactory
- Create an additional constructor for DataApiRequestImpl which unpacks the config resources bundle to make it easier to override dictionaries.
Refactored HttpResponseMaker to allow for custom ResponseData implementations
- Currently ResponseData is being directly created in when building a response in the HttpResponseMaker. This creation has been extracted to a factory method, which subclasses of HttpResponseMaker can override.
- Changed relevant methods fields from private to protected.
Move makeRequest to JTB
- Move makeRequest from test to JerseyTestBinder
- Some tests uses variable name jerseyTestBinder; some uses jtb. They are all renamed to the former for naming conformance
- Re-indent testing strings for better code formatting
Moved availabilities to metrics construction to MetricUnionCompositeTableDefinition
- Currently, the availability to metrics construction is taking place even before the availability is loaded. Hence, moving the construction to MetricUnionCompositeTableDefinition so that availability is loaded first.
Better programmatic generation of metadata json in tests
- Rework metadata tests to be more generated from strings and more pluggable to support heavier and more expressive testing. This allows for more consistency, as well as make it easier to test more cases.
Ability to use custom rate limiting schemes
- Allows users to rate limit based on different criteria that the default criteria.
- Existing rate limiting code is now located in DefaultRateLimiter.
- Create a new rate limiter by:
  - implementing the RateLimiter interface
  - overriding the buildRateLimiter method in concrete implementation of AbstractBinderFactory to return custom RateLimiter implementation
  - Default token that uses a callback mechanism is available
    - CallbackRateLimitRequestToken takes an implementation of the callback interface RateLimitCleanupOnRequestComplete. When the request is completed the token calls the cleanup method of the callback to handle releasing any resources associate with the inflight request that this token belongs to.
Expose RequestLog LogInfo objects
- Exposes the LogInfo objects stored in the RequestLog, via RequestLog::retrieveAll making it easier for customers to implement their own scheme for logging the RequestLog.
Display corrected case on StorageStrategy serialization
- The default serialization of enum is name() which is final and thus cannot be overridden. An API method is added to return the API name of a storage strategy.
[Made StorageStrategy lower case]
Make shareable methods accessiable to all types of API requests
- As non-data endpoints are behaving more like data endpoints, some methods deserve to be shared among all types of API requests. Methods for
  - parsing and generating LogicalMetrics
  - parsing and generating LogicalTable
  - computing the union of constrained availabilities of constrained logical table
  are made available.
Substitute preflight method wildcard character with explicit allowed methods
- Modify ResponseCorsFilter Allowed Methods header to explicitly list allowed methods. Some browsers do not support a wildcard header value.
[Make Field Accessor PostAggregation able to reference post aggregations in adddition to aggregations]
- Druid allows (but does not protect against ordering) post aggregation trees referencing columns that are also post aggregation trees. This makes it possible to send such a query by using a field accessor to reference another query expression. Using this capability may have some risk.
Include ETags on responses
- Modify FullResponse JSON Objects to contain a flag showing whether a response is new or fetched from cache.
Fix wrong default druid url and broken getInnerMostQuery
- Comment out the wrong default druid broker url in module config that break old url config compatibility, add check for validate url in DruidClientConfigHelper
- Fix broken getInnermostQuery method in DruidQuery
Rename filter variables and methods in DataApiRequest
- The method names getFilter and getFilters can be confusing, as well as the filters variable
Decoupled from static dimension lookup building
- Instead of ModelUtils, create an interface for ExtractionFunctionDimension and rebase LookupDimension and RegisteredLookupDimension on that interface.
- LookupDimensionToDimensionSpec now uses only the Extraction interface to decide how to serialize dimensions.
DruidDimensionLoader is now a more generic DimensionValueLoadTask
- The DimensionValueLoadTask takes in a collection of DimensionValueLoaders to allow for non-Druid dimensions to be loaded.
DruidQuery::getInnerQuery and Datasource::getQuery return Optional
- Returning Optional is more correct for their usage and should protect against unexpected null values.
Use all available segment metadata in fili-generic-example
- The fili-generic-example now uses all segment metadata given by Druid instead of just the first one and also provides it to the metadata service.
Refactor Response class and implement new serialization logics
- Define interface ResponseWriter and its default implementation
- Refactor Response class, splitting into ResponseData and three implementations of ResponseWriter
- Define interface ResponseWriterSelector and its default implementation.
- Hook up the new serialization logic with HttpResponseMaker to replace the old one
LuceneSearchProvide needs to handle nulls
- Lucene search provider cannot handle null load values. Treat all null values as empty string.
Make AvroDimensionRowParser.parseAvroFileDimensionRows support consumer model
- In order to do deferred/buffered file reading, create a call back style method.
Make HttpResponseMaker injectable and change functions signature related to custom response creation
- Make HttpResponseMaker injectable. DataServlet and JobsServlet takes HttpResponseMaker as input parameter now
- Add ApiRequest to BuildResponse, HttpReponseChannel and createResponseBuilder to enable passing information needed by customizable serialization
- Remove duplicate parameter such as UriInfo that can be derived from ApiRequest
Change id field in DefaultDimensionField to lower case for Navi compatibility.
- Navi's default setting only recongizes lower case 'id' key name.
Fix a bug where table loader uses nested compute if absent
- Nesting computeIfAbsent on maps can cause a lot of issues in the map internals that causes weird behavior, nesting structure is now removed
Convert null avro record value to empty string
- Make AvroDimensionRowParser convert null record value into empty string to avoid NPE
FailedFuture is replaced by CompletedFuture
- CompletedFuture allows values to be returned when calling .get on a future instead of just throwing an exception

Deprecated:

Extraction Function on selector filter
- Deprecated ExtractionFilter since it is deprecated in druid, use selector filter with extraction function instead
Rename filter variables and methods in DataApiRequest
- Deprecated getFilters in favor of getApiFilters and getFilter in favor of getDruidFilter
Deprecate ui_druid_broker and non_ui_druid_broker and added druid_broker
Add dimension dictionary to metric loader
- Deprecated single argument version of loadMetricDictionary in MetricLoader, favor additional dimension dictionary argument loadMetricDictionary instead

Fixed:

Correct exception message & add missing tests
- Clarified exception message thrown by StreamUtils.throwingMerger
Fix lookup metadata loader by pulling the RegisteredLookupDimension
- Lookup Metadata Health Check always return true when some Druid registered lookup are absolutely failing to be loaded. Instead of checking load status of RegisteredLookupDimension, RegisteredLookupMetadataLoadTask is checking the status of LookupDimension. This PR corrects this behavior.
Fix 'descriptionription' mis-naming in dimension field
- This is caused by a "desc" -> "description" string replacement. A string handling method has been added to detect "desc" and transform it to "description". If it already comes with "description", no string transformation is made
Fix caching condition
- We want to cache partial or volatile data when cache_partial_data is set to true. This is condition is currently reversed. This PR shall fix it
Add Missing perPage Param
- Pagination links on the first pages are missing perPage param. This PR fixes this problem.
Having clause was nesting inward on nested queries resulting in rows that didn't exist being referenced
None show clause was not being respected
- Changed ResponseData and JsonApiResponseWriter to suppress columns that don't have associated dimension fields.
- Updated tests to reflect none being hidden.
Scoped metric dictionaries and the having clause now work together by default
- Add a new ApiHavingGenerator that builds a temporary metric dictionary from the set of requested metrics(not from globally scoped metric dictionary), and then using those to resolve the having clause.
- Add a table generating functions in BaseTableLoader that effectively allow the customer to provide a different metric dictionary at lower scope(not from the globally scoped metric dictionary) for use when building each table.
Debug BardQueryInfo to show query split counting
- Query counter in BardQueryInfo does not show up in logging because the counter used to be static and JSON serializer does not serialize static fields.
- This externalizes the state via a getter for serialization.
Fix intermittent class scanner error on DataSourceConstraint equal
- Class Scanner Spec was injecting an improper dependent field due to type erasure. Made field type explicit.
Fix tests with wrong time offset calculation
- Time-checking based tests setup time offset in a wrong way. timeZoneId.getOffset is fixed to take the right argument.
Handle Number Format errors from empty or missing cardinality value
Fix lucene search provider replace method
- Reopen the search index
Fix ConstantMaker make method with LogicalMetricInfo class
- The ConstantMaker make method needs to be rewritten with the LogicalMetricInfo class.
Slices endpoint returns druid name instead of api name
- The slices endpoint now gives the druid name instead of the api name for dimensions.
Prevent NPE in test due to null instance variables in DataApiRequest
- A particular order of loading ClassScannerSpec classes results in NullPointerException and fails tests, because some instance variables from testing DataApiRequest are null. This patch assigns non-null values to those variables.
- The testing constructor DataApiRequestImpl() is now deprecated and will be removed entirely.
Fix Lucene Cardinality in New KeyValueStores
- Fix lucene to put correct cardinality value to new key value store that does not contain the cardinality key
Log stack trace at error on unexpected DimensionServlet failures
- DimensionServlet was using debug to log unexpected exceptions and not printing the stack trace
Fix datasource name physical table name mismatch in VolatileDataRequestHandler
- Fix fetching from physicaltableDictionary using datasource name. Now use proper physical table name instead.
Fix performance bug around feature flag
- BardFeatureFlag, when used in a tight loop, is very expensive. Underlying map configuration copies the config map on each access.
- Switching to lazy value evaluation
- Added reset contract so changes to feature flags can be directly reverted rather than going through the SystemConfig directly
Fix deploy branch issue where substrings of whitelisted branches could be released
Fix availability testing utils to be compatible with composite tables
- Fix availability testing utils populatePhysicalTableCacheIntervals to assign a TestAvailability that will serialize correctly instead of always StrictAvailability
- Fix internal representation of VolatileIntervalsFunction in DefaultingVolatileIntervalsService from Map<PhysicalTable, VolatileIntervalsFunction> to Map<String, VolatileIntervalsFunction>
Fix metric and dimension names for wikipedia-example
- The metrics and dimensions configured in the fili-wikipedia-example were different from those in Druid and as a result the queries sent to Druid were incorrect

Known Issues:

Removed:

Removed withHaving that was on DataApiRequest as a Bug
- withHaving was incorrectly implemented in DataApiRequest. It should not have been added at all and had no behavioral impact except to rebuild the object without changes.
Remove testing constructor of *ApiRequestImpl
- It is a better practice to separate testing code with implementation. All testing constructors of the following API requests are removed:
  - ApiRequestImpl
  - DataApiRequestImpl
  - DimensionsApiRequestImpl
  - MetricsApiRequestImpl
  - SlicesApiRequestImpl
  - TablesApiRequestImpl
- Meanwhile, construction of testing API request is delegated to testing class, e.g. TestingDataApiRequestImpl
Reverted the druid name change in slices endpoint instead added to factName
- Reverting the PR-419(#419) so that the name still points to apiName and added factName which points to druidName. name was not valid for cases when it is a Lookup dimension because it was pointing to the base dimension name , so reverted that change and added druidName which is the actual druid fact name and name being the apiName
Remove custom immutable collections in favor of Guava
- Utils.makeImmutable(...) was misleading and uneeded so it has been removed. Use Guava's immutable collections.
Remove dependency on org.apache.httpcomponents
- This library was only used in fili-wikipedia-example and has been replaced with AsyncHttpClient.
Remove dependency on org.json
- Replace uses of org.json with the jackson equivalent
Remove NO_INTERVALS from SimplifiedIntervalList
- This shared instance was vulnerable to being changed globally. All calls to this have been replaced with the empty constructor

v0.8.69 - 2017/06/06

The main changes in this version are changes to the Table and Schema structure, including a major refactoring of Physical Table. The concept of Availability was split off from Physical Table, allowing Fili to better reason about availability of columns in Data Sources in ways that it couldn't easily do before, like in the case of Unions. As part of this refactor, Fili also gains 1st-class support for queries using the Union data source.

_{Full description of changes to Tables, Schemas, Physical Tables, Availability, PartialDataHandler, etc. tbd}

This was a long and winding journey this cycle, so the changelog is not nearly as tight as we'd like (hopefully we'll come back and consolidate it for this release), but all of the changes are in there. Along the way, we also addressed a number of other small concerns. Here are some of the highlights beyond the main changes around Physical Tables:

Fixes:

Unicode characters are now properly sent back to Druid
Druid client now follows redirects

New Capabilities & Enhancements:

Can sort on dateTime
Can use Druid query response for final verification of response partiality
Class Scanner Spec can discover dependencies, making its dynamic equality testing easier to use
There's an example application that shows how to slurp configuration from an existing Druid instance
Druid queries return a Future instead of void, allowing for blocking requests if needed (though use sparingly!)
Support for extensions defining new Druid query types

Performance upgrades:

Lazy DruidFilters
Assorted log level reductions
Lucene "total results" 50% speedup

Deprecations:

DataSource::getDataSources no longer makes sense, since UnionDataSource only supports 1 table now
BaseTableLoader::loadPhysicalTable. Use loadPhysicalTablesWithDependency instead
LogicalMetricColumn isn't really a needed concept

Removals:

PartialDataHandler::findMissingRequestTimeGrainIntervals
permissive_column_availability_enabled feature flag, since the new Availability infrastructure now handles this
Lots of things on PhysicalTable, since that system was majorly overhauled
SegmentMetadataLoader, which had been deprecated for a while and relies on no longer supported Druid features

Added:

Implement DruidPartialDataRequestHandler
- Implement DruidPartialDataRequestHandler that injects druid_uncovered_interval_limit into Druid query context
- Append DruidPartialDataResponseProcessor to the current next ResponseProcessor chain
- Add DruidPartialDataRequestHandler to DruidWorkflow between AsyncDruidRequestHandler and CacheV2RequestHandler and invoke the DruidPartialDataRequestHandler if druid_uncovered_interval_limit is greater than 0
Prepare for etag Cache
- Deprecate Cache V1 and V2 and log warning wherever they are used in codebase
- Add config param query_response_caching_strategy that allows any one of the TTL cache, local signature cache, or etag cache to be used as caching strategy
- Add 'CacheMode' that represent the caching strategy
- Add 'DefaultCacheMode' that represents all available caching strategies
- Make AsyncDruidWebServiceImpl::sendRequest not blow up when getting a 304 status response if etag cache is on
- Add etag header to response JSON if etag cache is set to be used
- Add FeatureFlag::isSet to expose whether feature flags have been explicitly configured
Implement DruidPartialDataResponseProcessor
- Add FullResponseProcessor interface that extends ResponseProcessor
- Add response status code to JSON response
- Add DruidPartialDataResponseProcessor that checks for any missing data that's not being found
Add DataSourceName concept, removing responsibility from TableName
- TableName was serving double-duty, and it was causing problems and confusion. Splitting the concepts fixes it.
Add a BaseMetadataAvailability as a parallel to BaseCompositeAvailability
- Concrete and PermissiveAvailability both extend this new base Availability
Constrained Table Support for Table Serialization
- Add ConstrainedTable which closes over a physical table and an availability, caching all availability merges.
- Add PartialDataHandler method to use ConstrainedTable
Testing: ClassScannerSpec now supports 'discoverable' depenencies
- Creating supplyDependencies method on a class's spec allows definitions of dependencies for dynamic equality testing
Moved UnionDataSource to support only single tables
- DataSource now supports getDataSource() operation
Prepare For Partial Data V2
- Add new query context for druid's uncovered interval feature
- Add a configurable property named "druid_uncovered_interval_limit"
- Add new response error messages as needed by Partial Data V2
Merge Druid Response Header into Druid Response Body Json Node in AsyncDruidWebServiceImplV2
- Add configuration to AsyncDruidWebServiceImpl so that we can opt-in configuration of JSON response content.
- AsyncDruidWebServiceImpl takes a strategy for building the JSON from the entire response.
Add MetricUnionCompositeTableDefinition
Add partition availability and table
Add constructor to specify DruidDimensionLoader dimensions directly
Add IntervalUtils::getTimeGrain to determine the grain given an Interval
Add example for slurping in config from a Druid instance
Add Permissive Concrete Physical Table Definition
- Added PermissiveConcretePhysicalTableDefinition for defining a PermissiveConcretePhysicalTable
Fix to use physical name instead of logical name to retrieve available interval
- Added PhysicalDataSourceConstraint class to capture physical names of columns for retrieving available intervals
BaseCompositePhysicalTable
- BaseCompositePhysicalTable provides common operations, such as validating coarsest ZonedTimeGrain, for composite tables.
Add Reciprocal satisfies() relationship complementing satisfiedBy() on Granularity
Add a timer around DataApiRequestMappers
MetricUnionAvailability and MetricUnionCompositeTable
- Added MetricUnionAvailability which puts metric columns of different availabilities together and MetricUnionCompositeTable which puts metric columns of different tables together in a single table.
Method for finding coarsest ZonedTimeGrain
- Added utility method for returning coarsest ZonedTimeGrain from a collection of ZonedTimeGrains. This is useful to construct composite tables that requires the coarsest ZonedTimeGrain among a set of tables.
Should also setConnectTimeout when using setReadTimeout
- Setting connectTimeout on DefaultAsyncHttpClientConfig when building AsyncDruidWebServiceImpl
CompositePhysicalTable Core Components Refactor
- Added ConcretePhysicalTable and ConcreteAvailability to model table in druid datasource and its availability in the new table availability structure
- Added class variable for DataSourceMetadataService and ConfigurationLoader into AbstractBinderFactory for application to access
- Added loadPhysicalTablesWithDependency into BaseTableLoader to load physical tables with dependencies
PermissiveAvailability and PermissiveConcretePhysicalTable
- Added PermissiveConcretePhysicalTable and PermissiveAvailability to model table in druid datasource and its availability in the new table availability structure. PermissiveAvailability differs from ConcreteAvailability in the way of returning available intervals: ConcreteAvailability returns the available intervals constraint by DataSourceConstraint and provides them in intersection. PermissiveAvailability, however, returns them without constraint from DataSourceConstraint and provides them in union. PermissiveConcretePhysicalTable is different from ConcretePhysicalTable in that the former is backed by PermissiveAvailability while the latter is backed by ConcreteAvailability.
Refactor DatasourceMetaDataService to fit composite table needs
- DataSourceMetadataService also stores interval data from segment data as intervals by column name map and provides method getAvailableIntervalsByTable to retrieve it
QueryPlanningConstraint and DataSourceConstraint
- Added QueryPlanningConstraint to replace current interface of Matchers and Resolvers arguments during query planning
- Added DataSourceConstraint to allow implementation of PartitionedFactTable's availability in the near future
Major refactor for availability and schemas and tables
- ImmutableAvailability - provides immutable, typed replacement for maps of column availabilities
- New Table Implementations:
  - BasePhysicalTable core implementation
  - ConcretePhysicalTable creates an ImmutableAvailability
- Schema implementations
  - BaseSchema has Columns, Granularity
  - PhysicalTableSchema has base plus ZonedTimeGrain, name mappings
  - LogicalTableSchema base with builder from table group
  - ResultSetSchema base with transforming with-ers
- ApiName, TableName: Added static factory from String to Name
- ErrorMessageFormat for errors during ResultSetMapper cycle
Added default base class for all dimension types
- Added base classes DefaultKeyValueStoreDimensionConfig, DefaultLookupDimensionConfig and DefaultRegisteredLookupDimensionConfig to create default dimensions.
dateTime based sort feature for the final ResultSet added
- Now we support dateTime column based sort in ASC or DESC order.
- Added DateTimeSortMapper to sort the time buckets and DateTimeSortRequestHandler to inject to the workflow
dateTime specified as sortable field in sorting clause
- added dateTimeSort as class parameter in DataApiRequest. So it can be tracked down to decide the resultSet sorting direction.
Detect unset userPrincipal in Preface log block
- Logs a warning if no userPrincipal is set on the request (ie. we don't know who the user is), and sets the user field in the Preface log block to NO_USER_PRINCIPAL.
Support timeouts for lucene search provider

Changed:

Prepare for etag Cache
- Made isOn dynamic on BardFeatureFlag
Rename Concrete to Strict for the respective PhysicalTable and Availability
- The main difference is in the availability reduction, so make the class name match that.
Make PermissiveConcretePhysicalTable and ConcretePhysicalTable extend from a common base
- Makes the structure match that for composite tables, so they can be reasoned about together.
Make PermissiveConcretePhysicalTable (now just PermissivePhysicalTable) a sibling, instead of extend from, ConcretePhysicalTable
- The main difference is in the accepted availabilities, so make the class structure match that.
Make MetricUnionAvailability take a set of Availability instead of PhysicalTable
- Since it was just unwrapping anyways, simplifying the dependency and pushing the unwrap up-stream makes sense
Add DataSourceName concept, removing responsibility from TableName
- Impacts:
  - DataSource & children
  - DataSourceMetadataService & DataSourceMetadataLoader
  - SegmentIntervalsHashIdGenerator
  - PhysicalTable & children
  - Availability & children
  - ErrorMessageFormat
  - SlicesApiRequest
Force ConcretePhysicalTable only take a ConcreteAvailability
- Only a ConcreteAvailability makes sense, so let the types enforce it
Clarify name of built-in static TableName comparator
- Change to AS_NAME_COMPARATOR
Constrained Table Support for Table Serialization
- Switched PartialDataRequestHandler to use the table from the query rather than the PhysicalTableDictionary
- DruidQueryBuilder uses constrained tables to dynamically pick between Union and Table DataSource implementations
- PartialDataHandler has multiple different entry points now depending on pre or post constraint conditions
- getAvailability moved to a ConfigTable interface and all configured Tables to that interface
- DataSource implementations bind to ConstrainedTable and only ConstrainedTable is used after table selection
- PhysicalTable.getAllAvailableIntervals explicitly rather than implicitly uses SimplifiedIntervalList
- Bound and default versions of getAvailableIntervals and getAllAvailableIntervals added to PhysicalTable interface
- Package-private optimize tests in DruidQueryBuilder moved to protected
- Immutable NoVolatileIntervalsFunction class made final
Moved UnionDataSource to support only single tables
- UnionDataSource now accepts only single tables instead of sets of tables.
- DataSource now supports getDataSource() operation
- IntervalUtils.collectBucketedIntervalsNotInIntervalList moved to PartialDataHandler
Druid filters are now lazy
- The Druid filter is built when requested, NOT at DatApiRequest construction. This will make it easier to write performant DataApiRequest mappers.
Reduce log level of failure to store a result in the asynchronous job store
- Customers who aren't using the asynchronous infrastructure shouldn't be seeing spurious warnings about a failure to execute one step (which is a no-op for them) in a complex system they aren't using. Until we can revisit how we log report asynchronous errors, we reduce the log level to DEBUG to reduce noise.
Clean up BaseDataSourceComponentSpec
- Drop a log from error to trace when a response comes back as an error
- Make JSON validation helpers return boolean instead of def
Make BasePhysicalTable take a more extension-friendly set of PhysicalTables
- Take <? extends PhysicalTable> instead of just PhysicalTable
Update availabilities for PartitionAvailability
- Created BaseCompositeAvailability for common features
- Refactored DataSourceMetadataService methods to use SimplifiedIntervaList to standardize intersections
Queries to Druid Web Service now return a Future
- Queries now return a Future<Response> in addition to having method callbacks.
Refactor Physical Table Definition and Update Table Loader
- PhysicalTableDefinition is now an abstract class, construct using ConcretePhysicalTableDefinition instead
- PhysicalTableDefinition now requires a build methods to be implemented that builds a physical table
- BaseTableLoader now constructs physical tables by calling PhysicalTableDefinition::build in buildPhysicalTablesWithDependency
- BaseTableLoader::buildDimensionSpanningTableGroup now uses loadPhysicalTablesWithDependency instead of deprecated loadPhysicalTables
- BaseTableLoader::buildDimensionSpanningTableGroup now does not take druid metrics as arguments, instead PhysicalTableDefinition does
Fix to use physical name instead of logical name to retrieve available interval
- ConcreteAvailability::getAllAvailableIntervals no longer filters out un-configured columns, instead Physicaltable::getAllAvailableIntervals does
- Availability::getAvailableIntervals now takes PhysicalDataSourceConstraint instead of DataSourceConstraint
- Availability no longer takes a set of columns on the table, only table needs to know
- Availability::getAllAvailableIntervals now returns a map of column physical name string to interval list instead of column to interval list
- TestDataSourceMetadataService now takes map from string to list of intervals instead of column to list of intervals for constructor
Reduced number of queries sent by LuceneSearchProvider by 50% in the common case
- Before, we were using IndexSearcher::count to get the total number of documents, which spawned an entire second query (so two Lucene queries rather than one when requesting the first page of results). We now pull that information from the results of the query directly.
Allow GranularityComparator to return static instance
- Implementation of PR #193 suggests an possible improvement on GranularityComparator: Put the static instance on the GranularityComparator class itself, so that everywhere in the system that wants it could just call GranularityComparator.getInstance()
Make TemplateDruidQuery::getMetricField get the first field instead of any field
- Previously, order was by luck, now it's by the contract of findFirst
Clean up config loading and add more logs and checks
- Use correct logger in ConfigurationGraph (was ConfigResourceLoader)
- Add error / logging messages for module dependency indicator
- Tweak loading resources debug log to read better
- Tweak module found log to read better
- Convert from Resource::getFilename to Resource::getDescription when reporting errors in the configuration graph. getDescription is more informative, usually holding the whole file path, rather than just the terminal segment / file name
Restore non-default query support in TestDruidWebservice
Base TableDataSource serialization on ConcretePhysicalTable fact name
CompositePhsyicalTable Core Components Refactor
- TableLoader now takes an additional constructor argument (DataSourceMetadataService) for creating tables
- PartialDataHandler::findMissingRequestTimeGrainIntervals now takes DataSourceConstraint
- Renamed buildTableGroup method to buildDimensionSpanningTableGroup
Restored flexibility about columns for query from DruidResponseParser
- Immutable schemas prevented custom query types from changing ResultSetSchema columns.
- Columns are now sourced from DruidResponseParser and default implemented on DruidAggregationQuery
Refactor DatasourceMetaDataService to fit composite table needs
- BasePhysicalTable now stores table name as the TableName instead of String
- SegmentInfo now stores dimension and metrics from segment data for constructing column to available interval map
QueryPlanningConstraint and DataSourceConstraint
- QueryPlanningConstraint replaces current interface of Matchers and Resolvers DataApiRequest and TemplateDruidQuery arguments during query planning
- Modified findMissingTimeGrainIntervals method in PartialDataHandler to take a set of columns instead of DataApiRequest and DruidAggregationQuery
Major refactor for availability and schemas and tables
- Schema and Table became interfaces
  - Table has-a Schema
  - PhysicalTable extends Table, interface only supports read-only operations
- Schema constructed as immutable, Columns no longer bind to Schema
  - Removed addNew*Column method
- Schema implementations now: BaseSchema, PhysicalTableSchema, LogicalTableSchema, ResultSetSchema
- DimensionLoader uses ConcretePhysicalTable
- PhysicalTableDefinition made some fields private, accepts iterables, returns immutable dimensions
- ResultSet constructor parameter order swapped
- ResultSetMapper now depends on ResultSetSchema
- TableDataSource constructor arg narrows: PhysicalTable -> ConcreteTable
- DataApiRequest constructor arg narrows: Table -> LogicalTable
- DruidQueryBuilder now polymorphic on building data sources models from new physical tables
- ApiFilter schema validation moved to DataApiRequest
- Guava version bumped to 21.0
Request IDs now support underscores.
Added support for extensions defining new Query types
- TestDruidWebService assumes unknown query types behave like GroupBy, TimeSeries, and TopN
- ResultSetResponseProcessor delegates to DruidResponseProcessor to build expected query schema, allowing subclasses to override and extend the schema behavior
Add dimension fields to fullView table format
Make HealthCheckFilter reject message nicer
- The previous message of reject <url> wasn't helpful, useful, nor very nice to users, and the message logged was not very useful either. The message has been made nicer (Service is unhealthy. At least 1 healthcheck is failing), and the log has been made better as well.
RequestLog timings support the try-with-resources block
- A block of code can now be timed by wrapping the timed block in a try-with-resources block that starts the timer. Note: This won't work when performing timings across threads, or across contexts. Those need to be started and stopped manually.
Clean up logging and responses in DimensionCacheLoaderServlet
- Switched a number of error-level logs to debug level to line up with logging guidance when request failures were result of client error
- Reduced some info-level logs down to debug
- Converted to 404 when error was cause by not finding a path element metadata generated by query we run to get the actual results.
Update LogBack version 1.1.7 -> 1.2.3
- In web-applications, logback-classic will automatically install a listener which will stop the logging context and release resources when your web-app is reloaded.
- Logback-classic now searches for the file logback-test.xml, then logback.groovy, and then logback.xml. In previous versions logback.groovy was looked up first which was non-sensical in presence of logback-test.xml
- AsyncAppender no longer drops events when the current thread has its interrupt flag set.
- Critical parts of the code now use COWArrayList, a custom developed allocation-free lock-free thread-safe implementation of the List interface. It is optimized for cases where iterations over the list vastly outnumber modifications on the list. It is based on CopyOnWriteArrayList but allows allocation-free iterations over the list.
Update Metrics version 3.1.2 -> 3.2.2
- Added support for disabling reporting of metric attributes.
- Support for setting a custom initial delay for reporters.
- Support for custom details in a result of a health check.
- Support for asynchronous health checks
- Added a listener for health checks.
- Health checks are reported as unhealthy on exceptions.
- Added support for Jetty 9.3 and higher.
- Shutdown health check registry
- Add support for the default shared health check registry name
Update hk2 version 2.5.0-b05 -> 2.5.0-b36
- Add class-proxy in the case of a generic factory using a registered contract
Update SLF4J version 1.7.21 -> 1.7.25
- When running under Java 9, log4j version 1.2.x is unable to correctly parse the "java.version" system property. Assuming an incorrect Java version, it proceeded to disable its MDC functionality. The slf4j-log4j12 module shipping in this release fixes the issue by tweaking MDC internals by reflection, allowing log4j to run under Java 9.
- The slf4j-simple module now uses the latest reference to System.out or System.err.
- In slf4j-simple module, added a configuration option to enable/disable caching of the System.out/err target.
Update Lucene version 5.3.0 -> 6.5.0
- Added IndexSearcher::getQueryCache and getQueryCachingPolicy
- org.apache.lucene.search.Filter is now deprecated. You should use Query objects instead of Filters, and the BooleanClause.Occur.FILTER clause in order to let Lucene know that a Query should be used for filtering but not scoring.
- MatchAllDocsQuery now has a dedicated BulkScorer for better performance when used as a top-level query.
- Added a IndexWriter::getFieldNames method (experimental) to return all field names as visible from the IndexWriter. This would be useful for IndexWriter::updateDocValues calls, to prevent calling with non-existent docValues fields

Deprecated:

Revert deprecation of getAvailbleInterval with PhysicalDatasourceConstraint
- The method is needed in order for availability to function correctly, there is a deeper dive and planning required to actually deprecate it in favor of simpler less confusing design.
Remove PhysicalTable::getTableName to use getName instead
- Having more than 1 method for the same concept (ie. what's the name of this physical table) was confusing and not very useful.
Remove PhysicalTableDictionary dependency from SegmentIntervalHashIdGenerator
- Constructors taking the dictionary have been deprecated, since it is not used any more
Add DataSourceName concept, removing responsibility from TableName
- Impacts:
  - DataSourceMetadataService & DataSourceMetadataLoader
  - ConcretePhysicalTable
Deprecate old static TableName comparator
- Changed to AS_NAME_COMPARATOR since it's more descriptive
Constrained Table Support for Table Serialization
- Deprecated static empty instance of SimplifiedIntervalList.NO_INTERVALS
  - It looks like an immutable singleton, but it's mutable and therefore unsafe. Just make new instances of SimplifiedIntervalList instead.
- PartialDataRequestHandler constructor using PhysicalTableDictionary
Moved UnionDataSource to support only single tables
- DataSource::getDataSources no longer makes sense, since UnionDataSource only supports 1 table now
Support for Lucene 5 indexes
- Added lucene-backward-codecs.jar as a dependency to restore support for indexes built on earlier instances.
- Support for indexes will only remain while the current Lucene generation supports them. All Fili users should rebuild indexes on Lucene 6 to avoid later pain.
Refactor Physical Table Definition and Update Table Loader
- Deprecated BaseTableLoader::loadPhysicalTable. Use loadPhysicalTablesWithDependency instead.
CompositePhysicalTable Core Components Refactor
- Deprecated BasePhysicalTable::setAvailability to discourage using it for testing
RequestLog::stopMostRecentTimer has been deprecated
- This method is a part of the infrastructure to support the recently deprecated RequestLog::switchTiming.
LogicalMetricColumn doesn't need a 2-arg constructor
- It's only used in one place, and there's no real need for it because the other constructor does the same thing
DimensionColumn's 2-arg constructor is only used by a deprecated class
- When that deprecated class (LogicalDimensionColumn) goes away, this constructor will go away as well

Fixed:

Fix druid partial data and partition table incompatibility
- Datasource names returned by partition table now contains only datasources that are actually used in the query
- Fix the problem where uncovered intervals is given by druid for partition table that fili filtered out
Fix the generic example for loading multiple tables
- Loading multiple tables caused it to hang and eventually time out.
- Also fixed issue causing all tables to show the same set of dimensions.
Support for Lucene 5 indexes restored
- Added lucene-backward-codecs.jar as a dependency to restore support for indexes built on earlier instances.
Specify the character encoding to support unicode characters
- Default character set used by the back end was mangling Unicode characters.
Correct empty-string behavior for druid header supplier class config
- Empty string would have tried to build a custom supplier. Now it doesn't.
Default the AsyncDruidWebServiceImpl to follow redirects
- It defaulted to not following redirects, and now it doesn't, and will follow redirects appropriately
Reenable custom query types in TestDruidWebService
Fixed SegmentMetadataLoader Unconfigured Dimension Bug
- Immutable availability was failing when attempting to bind segment dimension columns not configured in the dimension dictionary.
- Fix to filter irrelevant column names.
Major refactor for availability and schemas and tables
- Ordering of fields on serialization could be inconsistent if intermediate stages used HashSet or HashMap.
- Several constructors switched to accept Iterable and return LinkedHashSet to emphasize importance of ordering/prevent HashSet intermediates which disrupt ordering.
Fix Lookup Dimension Serialization
- Fix a bug where lookup dimension is serialized as dimension spec in both outer and inner query
Correct error message logged when no table schema match is found
Setting readTimeout on DefaultAsyncHttpClientConfig when building AsyncDruidWebServiceImpl

Removed:

Refactor Physical Table Definition and Update Table Loader
- Removed deprecated PhysicalTableDefinition constructor that takes a ZonlessTimeGrain. Use ZonedTimeGrain instead
- Removed BaseTableLoader::buildPhysicalTable. Table building logic has been moved to PhysicalTableDefinition
Move UnionDataSource to support only single tables
- DataSource no longer accepts Set<Table> in a constructor
CompositePhsyicalTable Core Components Refactor
- Removed deprecated method PartialDataHandler::findMissingRequestTimeGrainIntervals
- Removed permissive_column_availability_enabled feature flag support and corresponding functionality in PartialDataHandler. Permissive availability is instead handled via table configuration, and continued usage of the configuration field generates a warning when Fili starts.
- Removed getIntersectSubintervalsForColumns and getUnionSubintervalsForColumns from PartialDataHandler. Availability now handles these responsibilities.
- Removed getIntervalsByColumnName, resetColumns and hasLogicalMapping methods in PhysicalTable. These methods were either part of the availability infrastructure, which changed completely, or the responsibilities have moved to PhysicalTableSchema (in the case of hasLogicalMapping).
- Removed PartialDataHandler::getAvailability. Availability (on the PhysicalTables) has taken it's place.
- Removed SegmentMetadataLoader because the endpoint this relied on had been deprecated in Druid. Use the DataSourceMetadataLoader instead.
  - Removed SegmentMetadataLoaderHealthCheck as well.
Major refactor for availability and schemas and tables
- Removed ZonedSchema (all methods moved to child class ResultSetSchema)
- PhysicalTable no longer supports mutable availability
  - Removed addColumn, removeColumn, getWorkingIntervals, and commit
  - Other mutators no longer exist, availability is immutable
  - Removed getAvailableIntervals. Availability::getAvailableIntervals replaces it.
- Removed DruidResponseParser::buildSchema. That logic has moved to the ResultSetSchema constructor.
- Removed redundant buildLogicalTable methods from BaseTableLoader

v0.7.37 - 2017/04/04

This patch is to back-port a fix for getting Druid to handle international / UTF character sets correctly. It is included in the v0.8.x stable releases.

Fixed:

Specify the character encoding to support unicode characters
- Default character set used by the back end was mangling Unicode characters.

v0.7.36 - 2017/01/30

This release is a mix of fixes, upgrades, and interface clean-up. The general themes for the changes are around metric configuration, logging and timing, and adding support for tagging dimension fields. Here are some of the highlights, but take a look in the lower sections for more details.

Fixes:

Deadlock in LuceneSearchProvider
CORS support when using the RoleBasedAuthFilter

New Capabilities & Enhancements:

Dimension field tagging
Controls around max size of Druid response to cache
Logging and timing enhancements

Deprecations / Removals:

RequestLog::switchTiming is deprecated due to it's difficulty to use correctly
Metric configuration has a number of deprecations as part of the effort to make configuration easier and less complex

Changes:

There was a major overhaul of Fili's dependencies to upgrade their versions

Added:

Dimension Field Tagging and Dynamic Dimension Field Serilization
- Added a new module fili-navi for components added to support for Navi
- Added TaggedDimensionField and related components in fili-navi
Ability to prevent caching of Druid responses larger than the maximum size supported by the cache
- Supported for both Cache v1 and V2
- Controlled with bard__druid_max_response_length_to_cache setting
- Default value is MAX_LONG, so no cache prevention will happen by default
Log a warning if SegmentMetadataLoader tries to load empty segment metadata
- While not an error condition (eg. configuration migration), it's unusual, and likely shouldn't stay this way long
More descriptive log message when no physical table found due to schema mismatch
- Previous log message was user-facing only, and not as helpful as it could have been
Logs more finegrained timings of the request processing workflow
Added RegisteredLookupDimension and RegisteredLookupExtractionFunction
- This enables supporting Druid's most recent evolution of the Query Time Lookup feature
FilteredAggregationMaker
- This version is rudimentary. See issue 120 for future plans.
Support for Druid's LinearShardSpec metadata type
Added MetricField accessor to the interface of LogicalMetric
- Previously accessing the metric field involved using three method calls
Ability for ClassScanner to instantiate arrays
- This allows for more robust testing of classes that make use of arrays in their constructor parameters
Module load test code
- Code to automatically test module is correctly configured.
Comprehensive tests for MetricMaker field coercion methods

Changed:

The druid query posting timer has been removed
- There wasn't really a good way of stopping timing only the posting itself. Since the timer is probably not that useful, it has been removed.
Dimension Field Tagging and Dynamic Dimension Field Serilization
- Changed fili-core dimension endpoint DimensionField serialization strategy from hard coded static attributes to dynamic serialization based on jackson serializer
MetricMaker cleanup and simplification
- Simplified raw aggregation makers
- ConstantMaker now throws an IllegalArgumentException wrapping the raw NumberFormatException on a bad argument
- FilteredAggregation no longer requires a metric name to be passed in. (Aggregation field name is used)
- FilteredAggregationMaker now accepts a metric to the 'make' method instead of binding at construction time.
- ArithmeticAggregationMaker default now uses NoOpResultSetMapper instead of rounding mapper. (breaking change)
- FilteredAggregationMaker, SketchSetOperationMaker members are now private
Used Metric Field accessor to simplify maker code
- Using metric field accessor simplifies and enables streaminess in maker code
Fili's name for a PhysicalTable is decoupled from the name of the associated table in Druid
No match found due to schema mismatch now a 500 Internal Server Error response instead of a 400 Bad Request response
- This should never be a user fault, since that check is much earlier
Make SegmentMetadata::equals null-safe
- It was not properly checking for null before and could have exploded
Default DimensionColumn name to use apiName instead of physicalName
- Change DimensionColumn.java to use dimension api name instead of physical name as its name
- Modified files dependent on DimensionColumn.java and corresponding tests according to the above change
NoOpResultSetMapper now runs in constant time and space.
Remove restriction for single physical dimension to multiple lookup dimensions
- Change physical dimension name to logical dimension name mapping into Map<String, Set<String>> instead of Map<String, String> in PhysicalTable.java
SegmentMetadataLoader include provided request headers
- SegmentMetadataLoader sends requests with the provided request headers in AsyncDruidWebservice now
- Refactored AsyncDruidWebserviceSpec test and added test for checking getJsonData includes request headers too
Include physical table name in warning log message for logicalToPhysical mapping
- Without this name, it's hard to know what table seems to be misconfigured.
ResponseValidationException uses Response.StatusType rather than Response.Status
- Response.StatusType is the interface that Response.Status implements.
- This will have no impact on current code in Fili that uses ResponseValidationException, and it allows customers to inject http codes not included in Response.Status.
Removed "provided" modifier for SLF4J and Logback dependencies in the Wikipedia example
Updated dependencies

Unless otherwise noted, all dependency upgrades are for general stability and performance improvement. The called- out changes are only those that are likely of interest to Fili. Any dependency upgrade for which a changelog could not be found has not been linked to one, otherwise all other upgrades include a link to the relevant changelog.

WARNING: There is a known dependency conflict between apache commons configuration 1.6 and 1.10. If after upgrading to the latest Fili, your tests begin to fail with NoClassDefFoundExceptions, it is likely that you are explicitly depending on the apache commons configuration 1.6. Removing that dependency or upgrading it to 1.10 should fix the issue.
- Gmaven plugin 1.4 -> 1.5
- Guava 16.0.1 -> 20.0
- Jedis 2.7.2 -> 2.9.0:
  - Geo command support and binary mode support
  - ZADD support
  - Ipv6 and SSL support
  - Other assorted feature and Redis support upgrades
- Redisson 2.2.13 -> 3.1.0:
  - Support for binary stream in and out of Reddison
  - Lots of features for distributed data structure capabilities
  - Can make fire-and-forget style calls in ack-response-only modes
  - Many fixes and improvements for PubSub features
  - Support for command timeouts
  - Fixed bug where connections did not always close when RedisClient shut down
  - Breaking API changes:
    - Moved config classes to own package
    - Moved core classes to api package
    - Moved to Redisson's RFuture instead of netty's Future
- JodaTime 2.8.2 -> 2.9.6:
  - Faster TZ parsing
  - Added Interval.parseWithOffset
  - GMT fix for JDK 8u60
  - Fixed Interval overflow bug
  - TZ data update from 2015g to 2016i
- AsyncHttpClient 2.0.2 -> 2.0.24:
  - Custom header separator fix
  - No longer double-wrapping CompletableFuture exceptions
- Apache HttpClient 4.5 -> 4.5.2:
  - Supports handling a redirect response to a POST request
  - Fixed deflate zlib header issue
- RxJava 1.1.5 -> 1.2.2:
  - Deprecate TestObserver in favor of TestSubscriber
- Spymemcached 2.12.0 -> 2.12.1
- org.json 20141113 -> 20160810
- Maven release plugin 2.5 -> 2.5.3:
  - Fixes release:prepare not committing pom.xml if not in the git root
  - Fixes version update not updating inter-module dependencies
  - Fixes version update failing when project is not a SNAPSHOT
- Maven antrun plugin 1.7 -> 1.8
- Maven compiler plugin 3.3 -> 3.6.0:
  - Fix for compiler fail in Eclipse
- Maven surefire plugin 2.17 -> 2.19.1:
  - Correct indentation for Groovy's power asserts
- Maven javadoc plugin 2.10.3 -> 2.10.4
- Maven site plugin 3.5 -> 3.6
- SLF4J 1.7.12 -> 1.7.21:
  - Fixed to MDC adapter, leaking information to non-child threads
  - Better handling of ill-formatted strings
  - Cleaned up multi-thread consistency for LoggerFactory-based logger initializations
  - Closed a multi-threaded gap where early logs may be lost if they happened while SLF4J was initializing in a multi-threaded application
- Logback 1.1.3 -> :
  - Child threads no longer inherit MDC values
  - AsyncAppender can be configured to never block
  - Fixed issue with variable substitution when the value ends in a colon
- Apache Commons Lang 3.4 -> 3.5
- Apache Commons Configuration 1.6 -> 1.10:
  - Tightened getList's behavior if the list values are non-strings
  - MapConfiguration can be set to not trim values by default
  - CompositeConfiguration can now handle non-BaseConfiguration core configurations
  - addConfiguration() overload added to allow correcting inconsistent configuration compositing
- Apache Avro 1.8.0 -> 1.8.1
- Spring Core 4.0.5 -> 4.3.4
- CGLib 3.2.0 -> 3.2.4:
  - Optimizations and regression fixes
- Objenesis 2.2 -> 2.4
- Jersey 2.22 -> 2.24:
  - https://jersey.java.net/release-notes/2.24.html
  - https://jersey.java.net/release-notes/2.23.html
  - @BeanParam linking support fix
  - Declarative linking with Maps fixed
  - Async write ordering deadlock fix
- HK2 2.4.0-b31 -> 2.5.0-b05:
  - Necessitated by Jersey upgrade
- JavaX Annotation API 1.2 -> 1.3

Deprecated:

Deprecated DefaultingDictionary usage in DefaultingVolatileIntervalsService
RequestLog::switchTiming has been deprecated
- RequestLog::switchTiming is very context-dependent, and therefore brittle. In particular, adding any additional timers inside code called by a timed block may result in the original timer not stopping properly. All usages of switchTiming should be replaced with explicit calls to RequestLog::startTiming and RequestLog::stopTiming.
Dimension Field Tagging and Dynamic Dimension Field Serilization
- Deprecated DimensionsServlet::getDimensionFieldListSummaryView and DimensionsServlet::getDimensionFieldSummaryView since there is no need for it anymore due to the change in serialization of DimensionField
Default DimensionColumn name to use apiName instead of physicalName
- Deprecated TableUtils::getColumnNames(DataApiRequest, DruidAggregationQuery, PhysicalTable) returning dimension physical name, in favor of TableUtils::getColumnNames(DataApiRequest, DruidAggregationQuery) returning dimension api name
- Deprecated DimensionColumn::DimensionColumn addNewDimensionColumn(Schema, Dimension, PhysicalTable) in favor of DimensionColumn::DimensionColumn addNewDimensionColumn(Schema, Dimension) which uses api name instead of physical name as column identifier for columns
- Deprecated LogicalDimensionColumn in favor of DimensionColumn since DimensionColumn stores api name instead of physical name now, so LogicalDimensionColumn is no longer needed
Moved to static implementations for numeric and sketch coercion helper methods
- MetricMaker.getSketchField(String fieldName) rather use MetricMaker.getSketchField(MetricField field)
- MetricMaker.getNumericField(String fieldName) rather use MetricMaker.getNumericField(MetricField field)
MetricMaker cleanup and simplification
- AggregationAverageMaker deprecated conversion method required by deprecated sketch library
Metric configuration deprecations
- Deprecated superfluous constructor of FilteredAggregator with superfluous argument
- Deprecated MetricMaker utility method in favor of using new field accessor on Metric
Deprecated MetricMaker.getDependentQuery lookup method in favor of simpler direct access

Fixed:

Fixes a potential deadlock
- There is a chance the LuceneSearchProvider will deadlock if one thread is attempting to read a dimension for the first time while another is attempting to load it:
  - Thread A is pushing in new dimension data. It invokes refreshIndex, and acquires the write lock.
  - Thread B is reading dimension data. It invokes getResultsPage, and then initializeIndexSearcher, then reopenIndexSearcher. It hits the write lock (acquired by Thread A) and blocks.
  - At the end of its computation of refreshIndex, Thread A attempts to invoke reopenIndexSearcher. However, reopenIndexSearcher is synchronized, and Thread B is already invoking it.
  - To fix the resulting deadlock, reopenIndexSearcher is no longer synchronized. Since threads need to acquire a write lock before doing anything else anyway, the method is still effectively synchronized.
Fix and refactor role based filter to allow CORS
- Fix RoleBasedAuthFilter to bypass OPTIONS request for CORS
- Discovered a bug where user_roles is declared but unset still reads as a list with empty string (included a temporary fix by commenting the variable declaration)
- Refactored RoleBasedAuthFilter and RoleBasedAuthFilterSpec for better testing
Added missing coverage for ThetaSketchEstimate unwrapping in MetricMaker.getSketchField
DataSource::getNames now returns Fili identifiers, not fact store identifiers
Made a few injection points not useless
- Template types don't get the same subclass goodness that method invocation and dependencies get, so this method did not allow returning a subclass of DruidQueryBuilder or of DruidResponseParser.
Made now required constructor for ArithmeticMaker with rounding public

Removed:

Removed invalid constructor from SketchRoundUpMappepr

v0.6.29 - 2016/11/16

This release is focused on general stability, with a number of bugs fixed, and also adds a few small new capabilities and enhancements. Here are some of the highlights, but take a look in the lower sections for more details.

Fixes:

Dimension keys are now properly case-sensitive (
- Because this is a breaking change, the fix has been wrapped in a feature flag. For now, this defaults to the existing broken behavior, but this will change in a future version, and eventually the fix will be permanent.
all-grain queries are no longer split
Closed a race condition in the LuceneSearchProvider where readers would get an error if an update was in progress
Correctly interpreting List-type configs from the Environment tier as a true List
Stopped recording synchronous requests in the ApiJobStore, which is only intended to hold async requests

New Capabilities & Enhancements:

Customizable logging format
X-Request-Id header support, letting clients set a request ID that will be included in the Druid query
Support for Druid's In filter
Native support for building DimensionRows from AVRO files
Ability to set headers on Druid requests, letting Fili talk to a secure Druid
Better error messaging when things go wrong
Better ability to use custom Druid query types

Added:

[Added Dimension Value implementation for PartitionTableDefinition]
- Added DimensionIdFilter implementation of DataSourceFilter
- Created DimensionListPartitionTableDefinition
Added 'hasAnyRows' to SearchProvider interface
- Has Any Rows allows implementations to optimize queries which only need to identify existence of matches
Customize Logging Format in RequestLog
Can Populate Dimension Rows from an AVRO file
- Added AvroDimensionRowParser that parses an AVRO data file into DimensionRows after validating the AVRO schema.
- Added a functional Interface DimensionFieldMapper that maps field name.
Support for Druid's In Filter
- The in-filter only works with Druid versions 0.9.0 and up.
Support for X-Request-ID
Documentation for topN
Adding slice availability to slices endpoint
- Slice availability can be used to debug availability issues on Physical tables
Ability to set headers for requests to Druid * The AsyncDruidWebServiceImpl now accepts a Supplier<Map<String, String>> argument which specifies the headers to add to the Druid data requests. This feature is made configurable through SystemConfig in the AbstractBinderFactory.

Changed:

Error messages generated during response processing include the request id.
DimensionStoreKeyUtils now supports case sensitive row and column keys
- Wrapped this config in a feature flag case_sensitive_keys_enabled which is set to false by default for backwards compatibility. This flag will be set to true in future versions.
The getGrainMap method in StandardGranularityParser class is renamed to getDefaultGrainMap and is made public static.
- Created new class GranularityDictionary and bind getGranularityDictionary to it
PhysicalTable now uses getAvailableIntervals internally rather than directly referencing its intervals
CSV attachment name for multi-interval request now contain '__' instead of ','
- This change is made to allow running multi-api request with csv format using chrome browser.
Improves error messages when querying Druid goes wrong
- The ResponseException now includes a message that prints the ResponseException's internal state (i.e. the druid query and response code) using the error messages ErrorMessageFormat::FAILED_TO_SEND_QUERY_TO_DRUID and ErrorMessageFormat::ERROR_FROM_DRUID
- The druid query and status code, reason and response body are now logged at the error level in the failure and error callbacks in AsyncDruidWebServiceImpl
Fili now supports custom Druid query types
- QueryType has been turned into an interface, backed by an enum DefaultQueryType.
  - The default implementations of DruidResponseParser DruidQueryBuilder, WeightEvaluationQuery and TestDruidWebService only support DefaultQueryType.
- DruidResponseParser is now injectable by overriding AbstractBinderFactory::buildDruidResponseParser method.
- DruidQueryBuilder is now injectable by overriding AbstractBinderFactory::buildDruidQueryBuilder method.
Updated commons-collections4 dependency from 4.0 to 4.1 to address a security vulnerability in the library.
- For details see: https://commons.apache.org/proper/commons-collections/security-reports.html#Apache_Commons_Collections_Security_Vulnerabilities
- It should be noted that Fili does not make use of any the serialization/deserialization capabilities of any classes in the functor package, so the security vulnerability does not affect Fili.
Clean up build plugins
- Move some plugin configs up to pluginManagement
- Make fili-core publish test javadocs
- Default source plugin to target jar-no-fork instead of jar
- Default javadoc plugin to target javadoc-no-fork as well as jar
- Move some versions up to pluginManagement
- Remove overly (and un-usedly) specified options in surfire plugin configs
- Make all projects pull in the source plugin
Corrected bug with Fili sub-module dependency specification
- Dependency versions are now set via a fixed property at deploy time, rather than relying on project.version
Cleaned up dependencies in pom files
- Moved version management of dependencies up to the parent Pom's dependency management section
- Cleaned up the parent Pom's dependency section to only be those dependencies that truly every sub-project should depend on.
- Cleaned up sub-project Pom dependency sections to handle and better use the dependencies the parent Pom provides

Deprecated:

DimensionStoreKeyUtils now supports case sensitive row and column keys
- Case insensitive row and column keys will be deprecated going forward.
- Because this is a breaking change, the fix has been wrapped in a feature flag. For now, this defaults to the existing broken behavior, but this will change in a future version, and eventually the fix will be permanent.
  - The feature flag for this is bard__case_sensitive_keys_enabled
All constructors of ResponseException that do not take an ObjectWriter
- An ObjectWriter is required in order to ensure that the exception correctly serializes its associated Druid query

Fixed:

Environment comma separated list variables are now correctly pulled in as a list
- Before it was pulled in as a single sting containing commas, now environment variables are pulled in the same way as the properties files
- Added test to test comma separated list environment variables when FILI_TEST_LIST environment variable exists
Druid queries are now serialized correctly when logging ResponseExceptions
Disable Query split for "all" grain
- Before, if we requested "all" grain with multiple intervals, the SplitQueryRequestHandler would incorrectly split the query and we would get multiple buckets in the output. Now, the query split is disabled for "all" grain and we correctly get only one bucket in the response.
Fixed typo emit -> omit in Utils.java omitField()
Adds read locking to all attempts to read the Lucene index
- Before, if Fili attempted to read from the Lucene indices (i.e. processing a query with filters) while loading dimension indices, the request would fail and we would get a LuceneIndexReaderAlreadyClosedException. Now, the read locks should ensure that the query processing will wait until indexing completes (and vice versa).
Fixes a bug where job metadata was being stored in the ApiJobStore even when the results came back synchronously
- The workflow that updates the job's metadata with success was running even when the query was synchronous. That update also caused the ticket to be stored in the ApiJobStore.
- The delay operator didn't stop the "update" workflow from executing because it viewed an Observable::onCompleted call as a message for the purpose of the delay. Since the two observables that that the metadata update gated on are empty when the query is synchronous, the "update metadata" workflow was being triggered every time.
- The delay operator was replaced by zipWith as a gating mechanism.
#45, removing sorting from weight check queries
JsonSlurper can now handle sorting lists with mixed-type entries
- even if the list starts with a string, number, or boolean
Broken segment metadata with Druid v0.9.1
- Made NumberedShardSpec ignore unexpected properties during deserialization
- Added tests to DataSourceMetadataLoaderSpec to test the v.0.9.1 optional field shardSpec.partitionDimensions on segment info JSON.

v0.1.x - 2016/09/23

This release focuses on stabilization, especially of the Query Time Lookup (QTL) capabilities, and the Async API and Jobs resource. Here are the highlights of what's in this release:

A bugfix for the DruidDimensionLoader
A new default DimensionLoader
A bunch more tests and test upgrades
Filtering and pagination on the Jobs resource
A userId field for default Job resource representations
Package cleanup for the jobs-related classes

Added:

always keyword for the asyncAfter parameter now guarantees that a query will be asynchronous
A test implementation of the AsynchronousWorkflowsBuilder: TestAsynchronousWorkflowsBuilder
- Identical to the DefaultAsynchronousWorkflowsBuilder, except that it includes hooks to allow outside forces (i.e. Specifications) to add additional subscribers to each workflow.
Functional tests for Asynchronous queries
[Enrich jobs endpoint with filtering functionality] (#26)
- Jobs endpoint now supports filters
[Enrich the ApiJobStore interface] (#23)
- ApiJobStore interface now supports filtering JobRows in the store
- Added support for filtering JobRows in HashJobStore
- Added JobRowFilter to hold filter information
QueryTimeLookup Functionality Testing
- Added two tests LookupDimensionFilteringDataServletSpec and LookupDimensionGroupingDataServletSpec to test QTL functionality
Lookup Dimension Serializer
- Created LookupDimensionToDimensionSpec serializer for LookupDimension
- Created corresponding tests for LookupDimensionToDimensionSpec in LookupDimensionToDimensionSpecSpec

Deprecated:

Allow configurable headers for Druid data requests
- Deprecated AsyncDruidWebServiceImpl(DruidServiceConfig, ObjectMapper) and AsyncDruidWebServiceImpl(DruidServiceConfig, AsyncHttpClient, ObjectMapper) because we added new construstructors that take a Supplier argument for Druid data request headers.
QueryTimeLookup Functionality Testing
- Deprecated KeyValueDimensionLoader, in favor of TypeAwareDimensionLoader

Changed:

Removed physicalName lookup for metrics in TableUtils::getColumnNames to remove spurious warnings
- Metrics are not mapped like dimensions are. Dimensions are aliased per physical table and metrics are aliazed per logical table.
- Logical metric is mapped with one or many physical metrics. Same look up logic for dimension and metrics doesn't make sense.

Jobs:

HashPreResponseStore moved to test root directory.
- The HashPreResponseStore is really intended only for testing, and does not have capabilities (i.e. TTL) that are needed for production.
The TestBinderFactory now uses the TestAsynchronousWorkflowsBuilder
- This allows the asynchronous functional tests to add countdown latches to the workflows where necessary, allowing for thread-safe tests.
Removed JobsApiRequest::handleBroadcastChannelNotification
- That logic does not really belong in the JobsApiRequest (which is responsible for modeling a response, not processing it), and has been consolidated into the JobsServlet.
ISSUE-17 Added pagination parameters to PreResponse
- Updated JobsServlet::handlePreResponseWithError to update ResultSet object with pagination parameters
Enrich jobs endpoint with filtering functionality
- The default job payload generated by DefaultJobPayloadBuilder now has a userId
Removed timing component in JobsApiRequestSpec
- Rather than setting an async timeout, and then sleeping, JobsApiRequestSpec::handleBroadcastChannelNotification returns an empty Observable if a timeout occurs before the notification is received now verifies that the Observable returned terminates without sending any messages.
Reorganizes asynchronous package structure
- The jobs package is renamed to async and split into the following subpackages:
  - broadcastchannels - Everything dealing with broadcast channels
  - jobs - Everything related to jobs, broken into subpackages
    - jobrows - Everything related to the content of the job metadata
    - payloads - Everything related to building the version of the job metadata to send to the user
    - stores - Everything related to the databases for job data
  - preresponses - Everything related to PreResponses, broken into subpackages
    - stores - Everything related to the the databases for PreResponse data
  - workflows - Everything related to the asynchronous workflow

Query Time Lookup (QTL)

QueryTimeLookup Functionality Testing
- AbstractBinderFactory now uses TypeAwareDimensionLoader instead of KeyValueStoreDimensionLoader
Fix Dimension Serialization Problem with Nested Queries
- Modified DimensionToDefaultDimensionSpec serializer to serialize Dimension to apiName if it's not in the inner-most query
- Added Util::hasInnerQuery helper in serializer package to determine if query is the inner most query or not
- Added tests for DimensionToDefaultDimensionSpec

General:

Preserve collection order of dimensions, dimension fields and metrics
- DataApiRequest::generateDimensions now returns a LinkedHashSet
- DataApiRequest::generateDimensionFields now returns a LinkedHashMap<Dimension, LinkedHashSet<DimensionField>>
- DataApiRequest::withPerDimensionFields now takes a LinkedHashSet as its second argument.
- DataApiRequest::getDimensionFields now returns a LinkedHashMap<Dimension, LinkedHashSet<DimensionField>>>
- Response::Response now takes a LinkedHashSet and LinkedHashMap<Dimension, LinkedHashSet<DimensionField>>> as its second and third arguments.
- ResponseContext::dimensionToDimensionFieldMap now takes a LinkedHashMap<Dimension, LinkedHashSet<DimensionField>>>
- ResponseContext::getDimensionToDimensionFieldMap now returns a LinkedHashMap<Dimension, LinkedHashSet<DimensionField>>>
TestDruidWebService::jsonResponse is now a Producer<String> Producer
QueryTimeLookup Functionality Testing
- Modified some testing resources (PETS table and corresponding dimensions) to allow better testing on LookupDimensions
Memoize generated values during recursive class-scan class construction

Fixed:

Fixing the case when the security context is not complete
- Check for nulls in the DefaultJobRowBuilder.userIdExtractor function.
DruidDimensionsLoader doesn't set the dimension's lastUpdated date
- DruidDimensionsLoader now properly sets the lastUpdated field after it finished processing the Druid response

Files

CHANGELOG_0_x.md

Latest commit

History

CHANGELOG_0_x.md

File metadata and controls

Change Log

Current

Fixed:

Added:

Changed:

Removed:

Fixed:

Deprecated:

Known Issues:

Contract changes:

Extensive deprecation cleanup

0.12 Highlights

Extensibility improvements

Protocols Metrics

Infrastructure

Sample applications

Performance improvements

General Extensibility

v0.12.109 - 2021/03/13

Fixed:

Added:

Changed:

Removed:

Fixed:

Deprecated:

Known Issues:

Contract changes:

v0.11.79 - 2019/09/17

0.11 Highlights

Configuration - Luthier

Configuration and Extensibility

Code style

Security patches

DataApiRequest

Extensibility

Added:

Changed:

Deprecated:

Removed:

Fixed:

Known Issues:

Contract changes:

v0.10.48 - 2018/10/04

0.10 Highlights

Extensibility improvements

Extensive deprecation cleanup

General cleanup

Externalizing filter building code

Supporting Druid in-filters

Added:

Changed:

Deprecated:

Fixed:

Known Issues:

v0.9.137 - 2018/04/13

0.9 Highlights

Fili Security Added!

DataApiRequestFactory layer

Make Field Accessor PostAggregation able to reference post aggregations in adddition to aggregations

Etag Cache

More robust Lucene Search Provider and Key Value Store

Extraction Function on selector filter

More controllable RequestLog

Druid lookup metadata load status check

Add ability to use custom rate limiting schemes

Support Time Format Extraction Function in Fili

Dimension load strategy indicator

Query Split Logging

Configurable Metric Long Name

Search provider can hot-swap index and key value store can hot-swap store location

Uptime Status Metric

Consolidate UI & Non-UI broker configurations

Credits

Added: