Race condition between processing and scraping #144

Open
leklund opened this issue Jul 6, 2023 · 0 comments

leklund (Member) commented Jul 6, 2023

There is a race condition when a scrape happens while the per-datacenter metrics are being incremented. When results from the real-time stats API are processed, the code iterates over the response and increments the metrics one datacenter at a time. If a scrape happens during that processing loop, the reported metrics won't include every datacenter, because the response from the real-time API hasn't finished processing yet. As a result, that scrape is not reporting all the data from the last second of real-time data. I was able to reproduce this easily by adding an artificial delay in the processing loop, which forces the scrape to happen in the middle of the loop. This can produce odd-looking graphs when running queries like:

sum(rate(fastly_rt_requests_total[1m])) by (service_id) - sum(rate(fastly_rt_tls_total[1m])) by (service_id)

This line should be flat:

[Screenshot: Screen Shot 2023-07-06 at 3 36 29 PM]

A potential solution is to add locking so that every scrape is guaranteed to see the full set of data from any given API response. This has performance implications, especially when running against many services.
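To make the locking idea concrete, here is a minimal sketch in Python rather than the project's own code. The class `DatacenterCounters`, the function `run_demo`, and the datacenter names are all hypothetical and exist only for illustration. The processing loop and the scrape share one lock, so a scrape can only run before or after a whole response has been applied, never in the middle.

```python
import threading
import time


class DatacenterCounters:
    """Hypothetical per-datacenter counters, updated and read under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._requests = {}  # datacenter name -> cumulative request count

    def process_response(self, response, delay=0.0):
        # Hold the lock for the whole loop so a scrape can never observe a
        # response that has been applied to only some datacenters.
        with self._lock:
            for datacenter, requests in response.items():
                current = self._requests.get(datacenter, 0)
                self._requests[datacenter] = current + requests
                time.sleep(delay)  # artificial delay to widen the race window

    def scrape(self):
        # Copy the counters under the same lock: a consistent snapshot.
        with self._lock:
            return dict(self._requests)


def run_demo(rounds=20):
    counters = DatacenterCounters()
    # Assumed response shape: one request per datacenter in every response.
    response = {"AMS": 1, "LHR": 1, "SJC": 1}
    snapshots = []
    done = threading.Event()

    def scraper():
        # Scrape repeatedly while responses are being processed.
        while not done.is_set():
            snapshots.append(counters.scrape())
            time.sleep(0.0005)

    thread = threading.Thread(target=scraper)
    thread.start()
    for _ in range(rounds):
        counters.process_response(response, delay=0.001)
    done.set()
    thread.join()
    snapshots.append(counters.scrape())  # final scrape after all processing
    return snapshots
```

Because every response in this sketch adds the same amount to each datacenter, a consistent snapshot always shows equal counts across datacenters. The cost is the one noted above: a scrape blocks for as long as a response is being processed, and that wait grows with the number of services sharing the lock.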

Thanks to @mrnetops for reporting.
