Question related to GPU device attributes #31

Open
starry91 opened this issue Apr 18, 2023 · 3 comments

Comments

@starry91

Hi, I am looking for a programmatic way to get the Streaming Multiprocessor (SM) count for T4 and A100 cards (with MIG enabled or disabled). Is there an API in go-dcgm or go-nvml that I can use?

@nikkon-dev
Collaborator

To get the number of SMs, use the CUDA driver API call cuDeviceGetAttribute with the CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT attribute.

Neither DCGM nor NVML exposes this value. The closest information you can get from NVML is the number of slices in the enabled MIG profile, but that only gives you the number of GPCs. You would then have to multiply it by the number of SMs per GPC, which depends on the GPU architecture (so it is not very helpful).
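
For illustration, here is a minimal sketch of the cuDeviceGetAttribute call described above, written in Python with ctypes so that it does not need the CUDA toolkit headers. The helper names `load_cuda_driver` and `sm_count` are hypothetical, and the library file names tried are assumptions about a typical driver install. The sketch returns None instead of failing when no driver or device is present. How MIG instances are enumerated by the driver is not covered here.

```python
import ctypes

# Value of CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT in the CUdevice_attribute enum (cuda.h).
CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT = 16
CUDA_SUCCESS = 0


def load_cuda_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded.

    The file names below are assumptions about common installs (Linux, Windows).
    """
    for name in ("libcuda.so.1", "libcuda.so", "nvcuda.dll"):
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None


def sm_count(ordinal=0):
    """Return the SM count of the CUDA device at `ordinal`, or None if unavailable."""
    cuda = load_cuda_driver()
    if cuda is None:
        return None
    # The driver API must be initialized before any other call.
    if cuda.cuInit(0) != CUDA_SUCCESS:
        return None
    device = ctypes.c_int()  # CUdevice is a plain int handle
    if cuda.cuDeviceGet(ctypes.byref(device), ordinal) != CUDA_SUCCESS:
        return None
    count = ctypes.c_int()
    rc = cuda.cuDeviceGetAttribute(
        ctypes.byref(count), CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, device
    )
    if rc != CUDA_SUCCESS:
        return None
    return count.value


if __name__ == "__main__":
    print("SM count for device 0:", sm_count(0))
```

A Go program could make the same three driver calls through cgo; the sequence (cuInit, cuDeviceGet, cuDeviceGetAttribute) would be the same.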

@starry91
Author

@nikkon-dev Is there a plan to add this API to DCGM or NVML anytime soon? The API should support both T4 and A100 cards. To add more context: DCGM exposes the SM Activity field, which takes into account the number of SMs in the GPU or MIG instance, but there is no way to get the actual number of SMs in the device. As a result, we cannot interpret the SM Activity value as fully as we would like.
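
To make the request concrete, here is a small sketch of the kind of interpretation an SM count would allow. It assumes that SM Activity is reported as a fraction between 0 and 1 averaged over all SMs of the device; that reading is an assumption, not something confirmed in this thread. The function name `busy_sm_equivalents` is hypothetical.

```python
def busy_sm_equivalents(sm_activity, sm_count):
    """Convert an averaged SM Activity fraction into an equivalent number of busy SMs.

    Assumes sm_activity is a 0..1 fraction averaged over all sm_count SMs.
    """
    if not 0.0 <= sm_activity <= 1.0:
        raise ValueError("sm_activity must be between 0 and 1")
    if sm_count <= 0:
        raise ValueError("sm_count must be positive")
    return sm_activity * sm_count


if __name__ == "__main__":
    # Example: 25% activity on a device with 40 SMs (the SM count of a T4).
    print(busy_sm_equivalents(0.25, 40))
```

Without the SM count, the same 0.25 reading cannot be compared between devices or MIG instances of different sizes.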

@starry91
Author

@nikkon-dev Any update on this?
