-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
All benchmark-worker heap size to flex with the size of the machine #78
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a JVM tuning expert but those sound sensible.
Only one I would be concerned about would be GC threads one which are being set to 32. Would this be a problem on smaller machines?
I think I took that from upstream. We could probably take that one out and let the JVM choose the number of threads, what do you think? |
Yeah lets maybe take those out. |
@tmgstevens coming back to this one: I don't see this long list of tuning options upstream, over there it is just a quite simple Maybe this is tuning for a big box? I think host-size specific tuning might be better off in the deployment & launch scripts for specific scenarios, as we have in the Redpanda folder, since things like very large heap sizes (as we had hardcoded in the past) and thread counts are sort of tied to specific benchmark deployments with machines of a certain size? The idea is that you override the HEAP_OPTS when you launch the worker. I do like the the idea of the letting the heap "flex" with the machine size, because we also currently run into problems with 8G heap on small machines were we get an OOME. What if we just put in the "flex" part (which is swapping out Xms/Xms for MaxRAMPercentage and friends) for now? |
Ping @tmgstevens . |
@travisdowns it comes from here: https://github.com/openmessaging/benchmark/blob/master/driver-kafka/deploy/ssd-deployment/deploy.yaml#L229 The other salient reference being https://github.com/confluentinc/benchmark/blob/10x-performance-blog/driver-kafka/deploy/confluent-deployment/deploy.yaml#L335 But now I also see https://github.com/redpanda-data/openmessaging-benchmark/blob/main/driver-redpanda/deploy/deploy.yaml#L389 as well. So the purpose of this change was specifically for when Ansible isn't being used, which would be the Helm chart install. Otherwise, the stuff in the Ansible is going to take precedence, although tbh we should change that so it's not hard coded to 50GB. |
Thanks @tmgstevens . I didn't know the ansible actually just modiied the start script in-situ with a regex. Among other things, this means we'd better be careful that this Anyway, back to the change: Here's a concrete suggestion: what if we just make these variables properly overridable by env vars (same names) passed in by the caller? Then the helm chart (and, arguably ansible) can just set those env vars to whatever they want depending on the scenario they are setting up. Because it doesn't seem like 1 default to rule them all is going to work anyway since we presumably have different scenarios we might set up for. |
No description provided.