fix(nvidia): use NVML + lspci to detect NVIDIA GPUs (without running nvidia-smi) #127
Signed-off-by: Gyuho Lee <[email protected]>
if !smiInstalled {
	return false, nil
}
log.Logger.Info("nvidia-smi installed")
Should this be logged at debug level?
Done
if err != nil {
	return false, err
}
log.Logger.Infow("detected nvidia gpu", "product", productName)
productName can also be a network card. Do we check the type of the device?
It's calling device.GetName, so I changed this log message to say "GPU device name".
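For the lspci side of the detection, one possible way to address the device-type concern is to look at the PCI class that lspci prints before the vendor name. The following is only a sketch under stated assumptions: the helper name isNVIDIAGPULine, the list of GPU class names, and the sample lines are all hypothetical and not taken from this PR, and it assumes the default lspci line shape of slot, class, colon, then vendor and device.

```go
package main

import (
	"fmt"
	"strings"
)

// gpuClasses lists PCI class names that lspci commonly prints for GPUs.
// This list is an assumption made for the sketch, not part of the PR.
var gpuClasses = []string{
	"VGA compatible controller",
	"3D controller",
	"Display controller",
}

// isNVIDIAGPULine (hypothetical helper) reports whether one lspci-style line
// names an NVIDIA device whose class is one of gpuClasses.
// Assumed line shape: "<slot> <class>: <vendor and device>".
func isNVIDIAGPULine(line string) bool {
	slotAndClass, rest, ok := strings.Cut(line, ": ")
	if !ok {
		return false
	}
	_, class, ok := strings.Cut(slotAndClass, " ")
	if !ok {
		return false
	}
	if !strings.Contains(strings.ToLower(rest), "nvidia") {
		return false
	}
	for _, c := range gpuClasses {
		if strings.EqualFold(strings.TrimSpace(class), c) {
			return true
		}
	}
	return false
}

func main() {
	// Made-up sample lines, for illustration only.
	lines := []string{
		"3b:00.0 3D controller: NVIDIA Corporation Device 20b0 (rev a1)",
		"5e:00.0 Ethernet controller: NVIDIA Corporation Device abcd",
	}
	for _, l := range lines {
		fmt.Printf("%v\t%s\n", isNVIDIAGPULine(l), l)
	}
}
```

With a check like this, an NVIDIA-branded network or audio device would not count as a GPU, at the cost of depending on the exact class strings that lspci prints.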
// now that nvidia-smi installed,
// check the NVIDIA GPU presence via PCI bus
pciDevices, err := ListPCIs(ctx)
Is this only useful for printing?
Renamed to ListNVIDIAPCIs. It only lists the devices with NVIDIA in their name, so it doubles as a check for whether the host has NVIDIA devices (we check len(results) > 0).
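The name-based filtering described in this reply could look roughly like the sketch below. It is not the PR's ListNVIDIAPCIs: the helper name filterNVIDIALines and the sample output are hypothetical, and the sketch works on an already captured lspci-style string rather than running lspci.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// filterNVIDIALines (hypothetical helper) returns the non-empty lines of
// lspci-style output that mention "nvidia", ignoring case.
func filterNVIDIALines(lspciOutput string) []string {
	var results []string
	scanner := bufio.NewScanner(strings.NewReader(lspciOutput))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		if strings.Contains(strings.ToLower(line), "nvidia") {
			results = append(results, line)
		}
	}
	return results
}

func main() {
	// Made-up sample output, for illustration only.
	sample := `00:1f.0 ISA bridge: Intel Corporation Device 1234
3b:00.0 3D controller: NVIDIA Corporation Device 20b0 (rev a1)
5e:00.0 Ethernet controller: Intel Corporation Device 5678`

	results := filterNVIDIALines(sample)
	// The host is treated as having NVIDIA devices when at least one line matches.
	fmt.Println("NVIDIA devices found:", len(results) > 0)
}
```

Returning the matching lines rather than a bare boolean keeps them available for logging while still supporting the len(results) > 0 check.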