
PVC usage for the Thanos store-gateway is increasing at a rate of about 1 GB per day #7837

Open
yangtian9999 opened this issue Oct 17, 2024 · 10 comments

Comments

@yangtian9999

Thanos, Prometheus and Golang version used: bitnami/thanos:0.36.1-debian-12-r0

Object Storage Provider: aws s3

What happened: We have observed that the PVC usage for the Thanos store-gateway is increasing at a rate of about 1 GB per day.

What you expected to happen: The Thanos store-gateway should do housekeeping by itself every day.

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components: No error log.

Anything else we need to know: Please note that the Thanos store-gateway pod only runs from 8am to 8pm every day.

@yangtian9999
Author

[screenshot attached]


dosubot bot commented Oct 17, 2024

Hey @yangtian9999 👋, I'm here to help you with any bugs, questions, or contributions you have. Let's get to work!

I found a similar open issue regarding the increasing disk usage for the Thanos store-gateway. The issue suggests configuring time-based partitioning to shard the store-gateways, which helps in managing disk usage effectively. You can find more details in the open issue.
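For reference, time-based partitioning is usually configured with the store-gateway's `--min-time` and `--max-time` flags. A sketch (flag values are illustrative; verify the flags against your Thanos version):

```shell
# Shard 1: serves only blocks with data newer than 8 weeks
thanos store --objstore.config-file=bucket.yml --min-time=-8w

# Shard 2: serves only blocks older than 8 weeks
thanos store --objstore.config-file=bucket.yml --max-time=-8w
```

Each shard then only downloads and caches data for the blocks in its time range, which bounds its local disk usage.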

Additionally, the increase in PVC usage might be due to the lack of configured retention policies. To manage this and perform automatic housekeeping, you can configure retention policies using the `--retention.resolution-raw`, `--retention.resolution-5m`, and `--retention.resolution-1h` flags. These flags let you set retention periods for the different resolutions of data, ensuring that old data is deleted after compaction and downsampling. For example:

```shell
thanos compact \
  --data-dir /tmp/thanos-compact \
  --objstore.config-file=bucket.yml \
  --retention.resolution-raw=30d \
  --retention.resolution-5m=90d \
  --retention.resolution-1h=365d
```

This configuration will retain raw data for 30 days, 5-minute resolution data for 90 days, and 1-hour resolution data for 365 days. Adjust these values based on your storage and data retention requirements [1].


@yangtian9999
Author

Yes, we do have the retention settings for the Thanos compactor:

```yaml
retentionResolutionRaw: 7d
retentionResolution5m: 28d
retentionResolution1h: 180d
```

@yeya24
Contributor

yeya24 commented Oct 17, 2024

Is your Thanos setup new, or has it been running for a while?
Based on your retention policy, it is expected to see space usage increase for up to 180 days, since that is when your data eventually gets deleted.

@yangtian9999
Author

Hi @yeya24, yes, the new cluster was bootstrapped 2 weeks ago.
Does this mean that I have to set 200GB for the thanos-store-gateway PVC to meet the 180d setting? Thanks.

@GiedriusS
Member

Is `thanos_compactor_iterations_total` more than 0?

@yangtian9999
Author

No value is returned for `thanos_compactor_iterations_total`.

@yeya24
Contributor

yeya24 commented Oct 17, 2024

Maybe you can double-check https://thanos.io/tip/operating/compactor-backlog.md/.
If the compactor can run properly, it should greatly reduce space usage.
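As a starting point when checking the compactor, a couple of metrics exposed by the Thanos compact component are worth querying (a sketch; metric names should be verified against your Thanos version):

```promql
# Should be greater than 0 and increasing when compaction iterations complete
thanos_compact_iterations_total

# Should be 0; a value of 1 means compaction has halted due to an error
thanos_compact_halted
```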

@yangtian9999
Author

@yeya24,
The thanos-compactor PVC has been healthy for a few days since we increased the PVC size,
but store-gateway usage still grows daily.

[screenshots attached]

Please let me know if you need any more info. Thanks so much.

@yeya24
Contributor

yeya24 commented Oct 18, 2024

PVC status doesn't matter here. Please make sure your compactor is up and running, and that it compacts your data on time.
If it is working normally and you still see PVC usage increase every day, that is expected given your retention settings.
