cuda_plot Feature Request: a -2 for every -g #30
Comments
I never envisioned someone doing partial RAM plotting on a multi-socket system... I agree this would help in that case, but why use hybrid mode when you can easily get more RAM? Partial RAM plotting is really just a band-aid for people without server-grade hardware.
I would probably never use such a feature, and I agree that it's pretty niche and might not be worth your time. It's just a thought I had while benchmarking different configurations. There are a few boards out there with limited memory flexibility: no support for LRDIMMs, only four DIMMs per socket, maybe one processor with a bad memory channel, etc. In that case you might find yourself trying to do e.g. k33 with 256 GB RAM and two weaker GPUs. You decide that you need a robust -2 to keep up with the GPUs, which is fine because you have a bunch of bifurcated expansion slots free. But now your inter-socket communication is causing the system to thrash...
The cost of a
I'm surprised you even bothered to implement partial mode in the first place.
You'd be surprised how many people want a 64G mode. I guess it makes sense for small farmers who don't want to invest in more hardware. You'd still plot quite a bit faster than on a regular desktop CPU.
In partial RAM mode, with multiple GPUs, it might help performance to be able to set a -2 for each GPU (placed by the operator in the same NUMA node as the GPU).
How much it might help would depend on how much cross-traffic there is between GPUs and the temporary files they're reading and writing. If each GPU works mostly independently, reading and writing its own temporary files, then this feature might help a lot. If the inputs and outputs of each GPU are being split and merged over and over again throughout the various phases and tables, it might not help at all.
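In the meantime, one way to approximate node-local -2 placement might be to run one independently pinned instance per NUMA node, each with its own -2 directory. The sketch below builds such command lines; it is not cuda_plot's actual behavior, and the GPU-to-node mapping, paths, and binary name are illustrative assumptions (only the -g and -2 flags come from the discussion above).

```python
# Sketch: emulate a "per-GPU -2" by launching one numactl-pinned instance
# per NUMA node, pairing each GPU with a -2 directory on node-local storage.
# GPU->node mapping, mount points, and binary name are assumptions.

GPU_NUMA_NODE = {0: 0, 1: 1}  # assumed topology: GPU i attached to node i
TMP2_BY_NODE = {0: "/mnt/nvme0/tmp2/", 1: "/mnt/nvme1/tmp2/"}

def build_cmd(gpu: int) -> list[str]:
    node = GPU_NUMA_NODE[gpu]
    return [
        "numactl",
        f"--cpunodebind={node}",   # pin CPU threads to the GPU's node
        f"--membind={node}",       # allocate RAM from the same node
        "./cuda_plot_k33",         # hypothetical binary name
        "-g", str(gpu),            # GPU device index (existing flag)
        "-2", TMP2_BY_NODE[node],  # partial-RAM tmp dir, local to the node
    ]

for gpu in GPU_NUMA_NODE:
    print(" ".join(build_cmd(gpu)))
```

Note that this changes the semantics versus the request: each instance plots its own plots instead of multiple GPUs cooperating on one, so it only helps when the GPUs would work mostly independently anyway.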