[Issue]: Rocm-smi does not work when building container from the dockerfile hip-libraries-rocm-ubuntu.Dockerfile #171

Open
umechand-amd opened this issue Sep 27, 2024 · 4 comments

Comments

@umechand-amd

Problem Description

rocm-smi doesn't work inside the container built from the Dockerfile hip-libraries-rocm-ubuntu.Dockerfile:
developer@gpuperf-lab-78:/workspaces$ rocm-smi
bash: rocm-smi: command not found
OS:
NAME="Ubuntu"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
CPU:
model name : AMD EPYC 9554 64-Core Processor
GPU:

Operating System

Ubuntu 22.04.4 LTS

CPU

AMD EPYC 9554 64-Core Processor

GPU

AMD Instinct MI300X, AMD Instinct MI300A

ROCm Version

ROCm 6.2.0

ROCm Component

No response

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@dgaliffiAMD
Collaborator

Hi @umechand-amd ,
I believe the docker files are limited to include only the components required to run the examples. This is to reduce the overall size of the image.
Can you include details on which example is failing in your setup?
Thank you.

@umechand-amd
Author

I don't even get to the build step because the container cannot find any GPUs.

@dgaliffiAMD
Collaborator

Thanks for the additional details. We're trying to reproduce in-house.
You should be able to build without GPUs. That's what we do in our GitHub workflows. But I understand that you won't be able to run the examples.
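
For readers hitting the same "cannot find GPUs" symptom, a quick check of the device nodes inside the container may help narrow it down. This is a minimal sketch and not something confirmed in this thread: it assumes the usual ROCm device nodes /dev/kfd and /dev/dri, and the helper name missing_gpu_nodes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each expected device node that is absent.
# With no arguments it falls back to /dev/kfd and /dev/dri, which is an
# assumption about a typical ROCm setup, not something stated in the issue.
missing_gpu_nodes() {
    [ "$#" -eq 0 ] && set -- /dev/kfd /dev/dri
    for node in "$@"; do
        [ -e "$node" ] || echo "$node"
    done
}

missing=$(missing_gpu_nodes)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the entries print on one line.
    echo "Not visible in this container:" $missing
    echo "Consider restarting with: docker run --device=/dev/kfd --device=/dev/dri <image>"
else
    echo "GPU device nodes are present"
fi
```

If the nodes are missing, the container was probably started without GPU passthrough, which is separate from rocm-smi being absent from the image.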

@Beanavil
Collaborator

Beanavil commented Oct 1, 2024

It seems that rocm-smi-lib is missing from the install step of the Dockerfile. As an immediate fix, you can install it from within the container:

sudo apt-get install rocm-smi-lib

or add it to your local dockerfile:

# Install the HIP compiler and libraries from the ROCm repositories
RUN export DEBIAN_FRONTEND=noninteractive; \
    mkdir -p /etc/apt/keyrings \
    && wget -q -O - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor > /etc/apt/keyrings/rocm.gpg \
    && echo "deb [arch=amd64, signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/$ROCM_VERSION_APT/ jammy main" > /etc/apt/sources.list.d/rocm.list \
    && printf 'Package: *\nPin: origin "repo.radeon.com"\nPin-Priority: 9001\n' > /etc/apt/preferences.d/radeon.pref \
    && apt-get update -qq \
    && apt-get install --no-install-recommends -y \
-       hip-base hipify-clang rocm-core hipcc \
+       hip-base hipify-clang rocm-core hipcc rocm-smi-lib \
        hip-dev rocm-hip-runtime-dev rocm-llvm-dev \
        rocrand-dev hiprand-dev \
        rocprim-dev hipcub-dev \
        rocblas-dev hipblas-dev \
        rocsolver-dev hipsolver-dev \
        rocfft-dev hipfft-dev \
        rocsparse-dev \
        rocthrust-dev \
    && rm -rf /var/lib/apt/lists/*

At least that works on my end
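
To confirm the fix took effect, something like the following could be run inside the rebuilt container. It is only a sketch under assumptions: the helper find_tool is hypothetical, and /opt/rocm/bin is assumed as the fallback location because the issue template uses that prefix for rocminfo.

```shell
#!/bin/sh
# Hypothetical helper: print the path of a tool, looking on PATH first
# and then in a fallback directory (assumed here to be /opt/rocm/bin).
# Returns non-zero when the tool is found in neither place.
find_tool() {
    tool=$1
    fallback_dir=${2:-/opt/rocm/bin}
    if command -v "$tool" >/dev/null 2>&1; then
        command -v "$tool"
    elif [ -x "$fallback_dir/$tool" ]; then
        echo "$fallback_dir/$tool"
    else
        return 1
    fi
}

if path=$(find_tool rocm-smi); then
    echo "rocm-smi found at $path"
else
    echo "rocm-smi still missing; check that rocm-smi-lib was installed"
fi
```

If the binary is found only through the fallback directory, the package is installed but that directory is not on PATH.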
