
GPU support #11

Closed
ksatzke opened this issue May 8, 2020 · 7 comments
Assignees
Labels
feature_request (New feature request), in progress (This issue is already being fixed)

Comments

@ksatzke
Collaborator

ksatzke commented May 8, 2020

[Environment]: first on bare metal with NVIDIA GPUs
[Known affected releases]: master (includes all releases)

Allow KNIX functions to use available GPU resources.

@ksatzke ksatzke added the feature_request (New feature request) and in progress (This issue is already being fixed) labels May 8, 2020
@ksatzke ksatzke self-assigned this May 8, 2020
@iakkus
Member

iakkus commented May 26, 2020

How are the GPUs multiplexed to various applications? For example, if there are two applications/workflows that want to use a GPU and there is only one GPU on the host, can they share the same GPU?

Or do you keep track of available GPUs and exclusively assign them to applications?
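The second option can be sketched as simple bookkeeping: track free GPU indices and hand each one to at most one workflow at a time. This is a hypothetical illustration, not code from the KNIX repository; the class and method names are invented for the example.

```python
import threading

class GpuAllocator:
    """Hypothetical sketch: assign each host GPU to at most one workflow."""

    def __init__(self, num_gpus):
        self._lock = threading.Lock()
        self._free = set(range(num_gpus))   # GPU indices not yet assigned
        self._assigned = {}                 # workflow id -> GPU index

    def acquire(self, workflow_id):
        """Exclusively assign a free GPU, or return None if all are busy."""
        with self._lock:
            if workflow_id in self._assigned:
                return self._assigned[workflow_id]
            if not self._free:
                return None
            gpu = self._free.pop()
            self._assigned[workflow_id] = gpu
            return gpu

    def release(self, workflow_id):
        """Return the workflow's GPU to the free pool."""
        with self._lock:
            gpu = self._assigned.pop(workflow_id, None)
            if gpu is not None:
                self._free.add(gpu)

# With a single GPU, a second workflow cannot acquire it until the
# first workflow releases it.
alloc = GpuAllocator(num_gpus=1)
print(alloc.acquire("wf-a"))   # 0
print(alloc.acquire("wf-b"))   # None (GPU busy)
alloc.release("wf-a")
print(alloc.acquire("wf-b"))   # 0
```

Under this scheme a workflow either gets a whole GPU or waits, which sidesteps the sharing question at the cost of utilization.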

@ksatzke
Collaborator Author

ksatzke commented May 26, 2020 via email

@iakkus
Member

iakkus commented May 26, 2020

I see, thanks!

For more information, the following is from the wiki of the nvidia-docker repo.

Can I share a GPU between multiple containers?
Yes. This is no different than sharing a GPU between multiple processes outside of containers.
Scheduling and compute preemption vary from one GPU architecture to another (e.g. CTA-level, instruction-level).

Can I limit the GPU resources (e.g. bandwidth, memory, CUDA cores) taken by a container?
No. Your only option is to set the GPU clocks at a lower frequency before starting the container.

Can I enforce exclusive access for a GPU?
This is not currently supported but you can enforce it:

At the container orchestration layer (Kubernetes, Swarm, Mesos, Slurm…) since this is tied to resource allocation.
At the driver level by setting the compute mode of the GPU.
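For the orchestration-layer option, a pod spec fragment along these lines requests a whole GPU through the NVIDIA device plugin; since `nvidia.com/gpu` is not a shareable resource, the scheduler grants the container exclusive access. The image name is a placeholder, not an actual KNIX artifact.

```yaml
# Hypothetical pod spec fragment: requesting one nvidia.com/gpu gives the
# container exclusive use of that GPU at the orchestration layer.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-function
spec:
  containers:
    - name: worker
      image: knix/gpu-worker:latest   # placeholder image name
      resources:
        limits:
          nvidia.com/gpu: 1
```

The driver-level alternative is setting the GPU's compute mode to exclusive (e.g. with `nvidia-smi -c EXCLUSIVE_PROCESS`), which rejects concurrent contexts from other processes.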

@ksatzke
Collaborator Author

ksatzke commented May 26, 2020 via email

@iakkus
Member

iakkus commented May 26, 2020

I'd think that GPU support would first be available only in a bare metal environment until that issue is resolved for knative.

@iakkus
Member

iakkus commented May 28, 2020

Implementation started on branch feature/GPU_support.

@iakkus
Member

iakkus commented May 16, 2021

This issue is subsumed by #87.

@iakkus iakkus closed this as completed May 16, 2021
2 participants