This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
Query on rollup index for average aggegation metric is giving incorrect results #440
Labels
bug
Something isn't working
Describe the bug
Same Aggregation query is being fired on source index and rollup index for aggregation metric values comparision, Results are not matching. Average aggregation query on rollup index giving incorrect results.
Rollup Job Configuration
curl -XPUT "localhost:9200/_opendistro/_rollup/jobs/rollup-test?pretty" -H "Content-Type:application/json" -d '{ "rollup": { "enabled": true, "schedule": { "cron": { "expression": "*/1 * * * *", "timezone":"UTC" } }, "description": "Test rollup job", "source_index": "jaeger-span-2021.04.17-000103", "target_index": "rollup-test", "page_size": 5000, "delay": 300, "continuous": false, "dimensions": [ { "date_histogram": { "source_field": "startTimeMillis", "fixed_interval": "1h", "timezone": "UTC" } }, { "terms": { "source_field": "process.serviceName" } }, { "terms": { "source_field": "process.tag.application@version" } }, { "terms": { "source_field": "operationName" } }, { "terms": { "source_field": "exception.type" } }, { "terms": { "source_field": "exception.message" } } ], "metrics": [ { "source_field": "duration", "metrics": [ { "avg": {} }, { "max": {} }, { "min": {} }, { "sum": {} }, { "value_count": {} } ] } ] } } '
Query on Rollup Index
Request
curl -X GET "localhost:9200/rollup-test/_search?pretty&size=0" -H 'Content-Type: application/json' -d' { "query": { "bool": { "must": [ { "terms": { "process.serviceName": [ "service-xxxxxx" ] } } ] } }, "aggregations": { "timeline": { "date_histogram": { "field": "startTimeMillis", "fixed_interval": "1h" }, "aggs": { "service": { "terms": { "field": "process.serviceName" }, "aggs": { "avg_duration": { "avg": { "field": "duration" } }, "max_duration": { "max": { "field": "duration" } }, "min_duration": { "min": { "field": "duration" } }, "count": { "value_count": { "field": "duration" } }, "sum": { "sum": { "field": "duration" } } } } } } } }'
Response
rollup-index-response.txt
Query on Source Index
Request
curl -X GET "localhost:9200/jaeger-span-2021.04.17-000103/_search?pretty&size=0" -H 'Content-Type: application/json' -d' { "query": { "bool": { "must": [ { "terms": { "process.serviceName": [ "service-xxxxxx" ] } } ] } }, "aggregations": { "timeline": { "date_histogram": { "field": "startTimeMillis", "fixed_interval": "1h" }, "aggs": { "service": { "terms": { "field": "process.serviceName" }, "aggs": { "avg_duration": { "avg": { "field": "duration" } }, "max_duration": { "max": { "field": "duration" } }, "min_duration": { "min": { "field": "duration" } }, "count": { "value_count": { "field": "duration" } }, "sum": { "sum": { "field": "duration" } } } } } } } }'
Response
source-index-response.txt
Setup Details
All other metrics SUM, VALUE_COUNT, MIN, MAX are giving correct results and matching with aggregation metrics of source index. Only Average is giving incorrect results.
Consider following example taken from response of Rollup index query:
{ "key_as_string" : "2021-04-17T02:00:00.000Z", "key" : 1618624800000, "doc_count" : 562, "service" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "service-xxxxxx", "doc_count" : 562, "avg_duration" : { "value" : 754.1463076048377 }, "count" : { "value" : 2847569 }, "min_duration" : { "value" : 37.0 }, "sum" : { "value" : 1.5818190941E10 }, "max_duration" : { "value" : 2.07551568E8 } } ] } }
Here the expected avg_duration: 1.5818190941E10 / 2847569 = 5,554.9807365511 but the actual value resulted in response is avg_duration = 754.1463076048377
Can anyone explain the reason behind this discrepancy?
The text was updated successfully, but these errors were encountered: