Feature/gpu support extended #87
base: develop
Conversation
…nding values.yml capability definition example
…GPU related tests
Sandbox/Dockerfile_gpu (Outdated)
# Install dlib for CUDA
RUN git clone https://github.com/davisking/dlib.git
RUN mkdir -p /dlib/build

RUN cmake -H/dlib -B/dlib/build -DDLIB_USE_CUDA=1 -DUSE_AVX_INSTRUCTIONS=1
RUN cmake --build /dlib/build

RUN cd /dlib; python3 /dlib/setup.py install

# Install the face recognition package and tensorflow
RUN pip3 install face_recognition
RUN pip3 install tensorflow==2.1.0
I am not sure why we need to install all these custom libraries for GPU usage.
If the workflows need these libraries, then the workflows should specify them in the function requirements.
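For illustration only, a per-function pip requirements list along these lines could carry the same dependencies instead of baking them into the sandbox image (a sketch, assuming KNIX installs per-function pip requirements; whether dlib builds with CUDA this way depends on the base image providing the CUDA toolkit and cmake):

# Hypothetical per-function requirements list (same packages as the Dockerfile above)
dlib
face_recognition
tensorflow==2.1.0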
mfn_sdk/mfn_sdk/mfnclient.py (Outdated)
@@ -449,7 +449,7 @@ def _get_state_names_and_resource(self, desired_state_type, wf_dict):
         return state_list

-    def add_workflow(self,name,filename=None):
+    def add_workflow(self,name,filename=None, gpu_usage="None"):
Should read: gpu_usage=None
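To make the concern concrete, here is a small hypothetical demo (the function names are illustrative, not mfn_sdk code): the string "None" is truthy, so a default of "None" makes a plain truthiness check treat "no GPU requested" as a GPU request, while the None literal behaves as expected.

# Hypothetical demo, not mfn_sdk code: why gpu_usage="None" is a risky default.
def wants_gpu_with_string_default(gpu_usage="None"):
    # "None" is a non-empty string, so this returns True even for the default.
    return bool(gpu_usage)

def wants_gpu_with_none_default(gpu_usage=None):
    # The None literal makes "no GPU requested" explicit and easy to test.
    return gpu_usage is not None

assert wants_gpu_with_string_default() is True   # surprising
assert wants_gpu_with_none_default() is False    # expected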
deploy/ansible/Makefile (Outdated)
@@ -21,7 +21,7 @@ NAMES := $(YAML:%.yaml=%)
 .PHONY: $(NAMES)
 default: prepare_packages install

-install: init_once riak elasticsearch fluentbit datalayer sandbox management nginx
+install: init_once installnvidiadocker riak elasticsearch fluentbit datalayer frontend sandbox management nginx
I think the 'frontend' component does not exist anymore.
What happens if the host does not have any Nvidia GPUs? Will 'installnvidiadocker' still succeed?
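One way to address the second question would be to guard the nvidia-docker installation behind a GPU check. A rough sketch of such a check, written in Python here for consistency with the other examples (in practice it would more likely be a condition in the Ansible role), assuming nvidia-smi is the detection mechanism:

# Hypothetical GPU-detection helper an install step could consult before
# attempting to set up nvidia-docker on a host.
import shutil
import subprocess

def host_has_nvidia_gpu():
    """Return True if nvidia-smi is available and reports at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(["nvidia-smi", "--list-gpus"],
                                capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and bool(result.stdout.strip())

if __name__ == "__main__":
    print("install nvidia-docker" if host_has_nvidia_gpu() else "skip nvidia-docker install")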
@@ -107,6 +118,7 @@ image_java: \

 push: image image_java
 	$(call push_image,microfn/sandbox)
+	$(call push_image,microfn/sandbox_gpu)
Should this be microfn/sandbox_java_gpu?
The dependencies of the push target also need to be updated so the GPU image is built before it is pushed.
            gpu_hosts[hostname] = hostip

    # instruct hosts to start the sandbox and deploy workflow
    if runtime=="Java" or sandbox_image_name == "microfn/sandbox": # can use any host
I thought we had the "microfn/sandbox_java_gpu" image?
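For what it's worth, a small hypothetical sketch of how the host-selection check could cover a Java GPU image as well (the image names beyond microfn/sandbox and the helper are illustrative, not taken from the PR):

# Hypothetical sketch, not PR code: restrict GPU sandbox images to GPU hosts and
# let every other image run anywhere.
GPU_IMAGES = {"microfn/sandbox_gpu", "microfn/sandbox_java_gpu"}

def eligible_hosts(sandbox_image_name, all_hosts, gpu_hosts):
    """Return the hosts that may run the given sandbox image."""
    if sandbox_image_name in GPU_IMAGES:
        return gpu_hosts   # GPU images need hosts with NVIDIA GPUs
    return all_hosts       # e.g. microfn/sandbox can use any host

# Example: a Java GPU workflow would be limited to the hosts collected in gpu_hosts.
hosts = eligible_hosts("microfn/sandbox_java_gpu",
                       all_hosts={"host1": "10.0.0.1", "host2": "10.0.0.2"},
                       gpu_hosts={"host2": "10.0.0.2"})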
…x-microfunctions/knix into feature/GPU_support_extended
This reverts commit 7a1b157.
This PR adds the capability to execute Python KNIX functions in sandboxes that use NVIDIA GPU resources, for both Ansible and Helm deployments of KNIX. GPU nodes are detected and configured automatically. The Kubernetes configuration required for deployments with GPU nodes is described in README_GPU_Installation.md.
Subsumes #11 and fixes #79.
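As a quick orientation for reviewers, a minimal client-side usage sketch of the new parameter; the no-argument MfnClient() setup and the value passed to gpu_usage are assumptions here, only the gpu_usage keyword itself comes from the diff above.

# Minimal sketch, not code from the PR: requesting GPU resources when creating a workflow.
from mfn_sdk import MfnClient

client = MfnClient()                                   # assumed: connection settings come from env/config
wf = client.add_workflow("gpu_workflow", gpu_usage=1)  # gpu_usage parameter added by this PR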