
GPU support #11

Closed
ksatzke opened this issue May 8, 2020 · 7 comments
Assignees
Labels
feature_request (New feature request), in progress (This issue is already being fixed)

Comments

@ksatzke
Collaborator

ksatzke commented May 8, 2020

[Environment]: first on bare metal with NVIDIA GPUs
[Known affected releases]: master (includes all releases)

Allow KNIX functions to use available GPU resources.

@ksatzke ksatzke added the feature_request (New feature request) and in progress (This issue is already being fixed) labels May 8, 2020
@ksatzke ksatzke self-assigned this May 8, 2020
@iakkus
Member

iakkus commented May 26, 2020

How are the GPUs multiplexed to various applications? For example, if there are two applications/workflows that want to use a GPU and there is only one GPU on the host, can they share the same GPU?

Or do you keep track of available GPUs and exclusively assign them to applications?
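The second option can be sketched as simple bookkeeping: track free GPU indices and hand each one to at most one workflow at a time. This is a hypothetical illustration, not code from the KNIX repository; the class and method names are invented for the example.

```python
import threading

class GpuAllocator:
    """Hypothetical sketch: assign each host GPU to at most one workflow."""

    def __init__(self, num_gpus):
        self._lock = threading.Lock()
        self._free = set(range(num_gpus))   # GPU indices not yet assigned
        self._assigned = {}                 # workflow id -> GPU index

    def acquire(self, workflow_id):
        """Exclusively assign a free GPU, or return None if all are busy."""
        with self._lock:
            if workflow_id in self._assigned:
                return self._assigned[workflow_id]
            if not self._free:
                return None
            gpu = self._free.pop()
            self._assigned[workflow_id] = gpu
            return gpu

    def release(self, workflow_id):
        """Return the workflow's GPU to the free pool."""
        with self._lock:
            gpu = self._assigned.pop(workflow_id, None)
            if gpu is not None:
                self._free.add(gpu)

# With a single GPU, a second workflow cannot acquire it until the
# first workflow releases it.
alloc = GpuAllocator(num_gpus=1)
print(alloc.acquire("wf-a"))   # 0
print(alloc.acquire("wf-b"))   # None (GPU busy)
alloc.release("wf-a")
print(alloc.acquire("wf-b"))   # 0
```

Under this scheme a workflow either gets a whole GPU or waits, which sidesteps the sharing question at the cost of utilization.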

@ksatzke
Collaborator Author

ksatzke commented May 26, 2020 via email

@iakkus
Member

iakkus commented May 26, 2020

I see, thanks!

For more information, the following is from the wiki of the nvidia-docker repo.

Can I share a GPU between multiple containers?
Yes. This is no different than sharing a GPU between multiple processes outside of containers.
Scheduling and compute preemption vary from one GPU architecture to another (e.g. CTA-level, instruction-level).

Can I limit the GPU resources (e.g. bandwidth, memory, CUDA cores) taken by a container?
No. Your only option is to set the GPU clocks at a lower frequency before starting the container.

Can I enforce exclusive access for a GPU?
This is not currently supported but you can enforce it:

At the container orchestration layer (Kubernetes, Swarm, Mesos, Slurm…) since this is tied to resource allocation.
At the driver level by setting the compute mode of the GPU.
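For the orchestration-layer option, a pod spec fragment along these lines requests a whole GPU through the NVIDIA device plugin; since `nvidia.com/gpu` is not a shareable resource, the scheduler grants the container exclusive access. The image name is a placeholder, not an actual KNIX artifact.

```yaml
# Hypothetical pod spec fragment: requesting one nvidia.com/gpu gives the
# container exclusive use of that GPU at the orchestration layer.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-function
spec:
  containers:
    - name: worker
      image: knix/gpu-worker:latest   # placeholder image name
      resources:
        limits:
          nvidia.com/gpu: 1
```

The driver-level alternative is setting the GPU's compute mode to exclusive (e.g. with `nvidia-smi -c EXCLUSIVE_PROCESS`), which rejects concurrent contexts from other processes.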

@ksatzke
Collaborator Author

ksatzke commented May 26, 2020 via email

@iakkus
Member

iakkus commented May 26, 2020

I'd think that GPU support would first be available only in a bare metal environment until that issue is resolved for knative.

@iakkus
Member

iakkus commented May 28, 2020

Implementation started on branch feature/GPU_support.

@iakkus
Member

iakkus commented May 16, 2021

This issue is subsumed by #87.

@iakkus iakkus closed this as completed May 16, 2021
2 participants