[DRAFT] CNDB-12408 Track memory usage via sensors #1508

Draft: wants to merge 3 commits into base: main

@aymkhalil aymkhalil commented Jan 16, 2025

What is the issue

Track memory usage (allocations and snapshots) in order to predict and prevent out-of-memory crashes: https://github.com/riptano/cndb/issues/12408

What does this PR fix and why was it fixed

VERY draft PR to get a sense of all the work areas required, and to deploy in CNDB with metrics enabled so we can iterate. In the meantime, I'm asking for high-level feedback (don't chase low-level details for now) on questions like:

  • The whole idea of taking allocation rate + snapshot for prediction.
  • Should we create specialized sensor APIs for node-level metrics? Today the context, which is unique per keyspace/table/table_id, is a central design decision.
  • Should we publish max memory snapshots as gauges? Or only emit a final boolean OOM prediction to the autoscaler? Either way, all the info is available programmatically. Also, arguably, on-heap and off-heap have well-defined maxes set by Java settings. For unsafe, we need to think about how to cap it (should it compete against the on-heap/off-heap maxes or against what is allocated?)
  • Is there a better API to use for snapshotting on-heap/off-heap/physical memory?
  • Sensor rate calculation
  • ...
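
To make the first and fourth questions concrete, here is a minimal sketch of the general idea, not the code in this PR. It takes snapshots with the standard `java.lang.management` beans and projects time to exhaustion linearly. `MemoryPressureSketch`, `Snapshot`, and every method name are hypothetical. It also uses the net growth between two snapshots as a stand-in for a true allocation rate (which would count allocated bytes regardless of GC), and it does not cover physical or unsafe memory.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper for illustration; not part of this PR or of the sensor API. */
final class MemoryPressureSketch {

    /** Point-in-time reading of used and max bytes for one memory area. */
    static final class Snapshot {
        final long usedBytes;
        final long maxBytes;      // -1 when the JVM reports no defined maximum
        final long takenAtNanos;

        Snapshot(long usedBytes, long maxBytes, long takenAtNanos) {
            this.usedBytes = usedBytes;
            this.maxBytes = maxBytes;
            this.takenAtNanos = takenAtNanos;
        }
    }

    /** On-heap snapshot from the standard MemoryMXBean. */
    static Snapshot heapSnapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new Snapshot(heap.getUsed(), heap.getMax(), System.nanoTime());
    }

    /** Sum of bytes used by the NIO buffer pools (for example "direct" and "mapped"). */
    static long bufferPoolBytesUsed() {
        long total = 0;
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            long used = pool.getMemoryUsed(); // -1 if the pool cannot report usage
            if (used > 0) {
                total += used;
            }
        }
        return total;
    }

    /** Net growth in bytes per second between two snapshots; negative after a GC. */
    static double growthRate(Snapshot earlier, Snapshot later) {
        double seconds = (later.takenAtNanos - earlier.takenAtNanos) / 1e9;
        if (seconds <= 0) {
            throw new IllegalArgumentException("snapshots must be ordered in time");
        }
        return (later.usedBytes - earlier.usedBytes) / seconds;
    }

    /** Linear projection of seconds until used reaches max; infinite if not growing or uncapped. */
    static double secondsUntilExhaustion(Snapshot latest, double bytesPerSecond) {
        if (latest.maxBytes < 0 || bytesPerSecond <= 0) {
            return Double.POSITIVE_INFINITY;
        }
        long headroom = Math.max(0, latest.maxBytes - latest.usedBytes);
        return headroom / bytesPerSecond;
    }

    /** Boolean prediction: is exhaustion projected within the given horizon? */
    static boolean predictsExhaustion(Snapshot earlier, Snapshot later, double horizonSeconds) {
        return secondsUntilExhaustion(later, growthRate(earlier, later)) <= horizonSeconds;
    }

    public static void main(String[] args) {
        Snapshot heap = heapSnapshot();
        System.out.println("heap used=" + heap.usedBytes + " max=" + heap.maxBytes);
        System.out.println("buffer pools used=" + bufferPoolBytesUsed());
    }
}
```

Both shapes raised above are possible here: the raw snapshots could be published as gauges, or only the result of `predictsExhaustion` could be emitted. A straight-line projection over two heap samples is noisy around GC cycles, so a real sensor would likely need smoothing over a longer window.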

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits

@aymkhalil aymkhalil requested review from sbtourist and jkni January 16, 2025 01:19

jkni commented Jan 16, 2025

Not a review per se, but catching up after today's discussions, and this makes sense to me as a first build to start testing.
