When scaling down a cluster, we're noticing that some data becomes unavailable for query. It appears in the form of partial results when querying for older data (from before the scale down).
We're also noticing that database_tick_index_num_docs remains flat on each surviving node through the scale-down, and then jumps up once the node is restarted. The effect of summing across all nodes in the cluster is that the metric drops (when the old node is removed) and only recovers to its prior level once the remaining nodes restart.
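For reference, this is roughly how we're watching the metric: a minimal sketch using the Prometheus Go client, assuming the per-node metrics are scraped into a Prometheus-compatible endpoint (the address here is a placeholder).

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Assumed Prometheus address; replace with your own.
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
	if err != nil {
		log.Fatalf("error creating Prometheus client: %v", err)
	}
	promAPI := v1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Per-node view: stays flat on the surviving nodes through the scale-down.
	perNode, _, err := promAPI.Query(ctx, "database_tick_index_num_docs", time.Now())
	if err != nil {
		log.Fatalf("per-node query failed: %v", err)
	}
	fmt.Println("per node:", perNode)

	// Cluster view: drops when the old node is removed and only recovers
	// after the remaining nodes are restarted.
	total, _, err := promAPI.Query(ctx, "sum(database_tick_index_num_docs)", time.Now())
	if err != nil {
		log.Fatalf("summed query failed: %v", err)
	}
	fmt.Println("cluster total:", total)
}
```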
General Issues
What service is experiencing the issue? (M3Coordinator, M3DB, M3Aggregator, etc)
m3db
What is the configuration of the service? Please include any YAML files, as well as namespace / placement configuration (with any sensitive information anonymized if necessary).
can provide if required, but I think this might be general
RF=3
How are you using the service? For example, are you performing read/writes to the service via Prometheus, or are you using a custom script?
issue relates to reads happening via remote read
Is there a reliable way to reproduce the behavior? If so, please provide detailed instructions.
It appears to be consistent when removing nodes from placements (a rough sketch of the removal call is below). It's more obvious when the clusters are small, since a larger fraction of the index is affected.
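For concreteness, the scale-down step is just removing a node from the placement via the coordinator API. A rough sketch of what we're doing, assuming the coordinator's placement endpoint on port 7201 (the host and instance ID are placeholders, and the exact path may differ by version):

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Assumed coordinator address and instance ID; both are placeholders.
	coordinator := "http://localhost:7201"
	instanceID := "m3db-node-2"

	// Remove the node from the m3db placement; its shards are then
	// redistributed across the remaining nodes.
	url := fmt.Sprintf("%s/api/v1/services/m3db/placement/%s", coordinator, instanceID)
	req, err := http.NewRequest(http.MethodDelete, url, nil)
	if err != nil {
		log.Fatalf("building request: %v", err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatalf("placement delete failed: %v", err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status: %s\nbody: %s\n", resp.Status, body)
}
```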
Hey @BertHartm - there have been some fixes and tests added to cover this category of bugs. As far as we know there are no outstanding bugs in this space, so perhaps I can take our tests and run them against the version you're running.
Is this 1.3 or 1.5? The exact SHA would be helpful as we investigate this. Thanks for reporting!