[Issue]: rocprofiler-sdk compilation from source #8

Open
maartenarnst opened this issue Jul 13, 2024 · 2 comments

@maartenarnst

Problem Description

With this issue, I wanted to share an experience with compiling the new rocprofiler-sdk from source.

I compiled rocprofiler-sdk in the docker image rocm/dev-ubuntu-22.04:6.1.2. The build succeeded after resolving the following issues that I encountered:

  • The documentation in the README.md indicates that rocprofv3 is "shipped with the ROCm stack". It appears that this may not be the case yet for the current publicly available versions of the ROCm stack. E.g., in the docker image rocm/dev-ubuntu-22.04:6.1.2, there does not appear to be a rocprofv3 executable in /opt/rocm/bin.

  • I followed the Build and Installation instructions from the README.md file. In these instructions, the link in the git clone command appears to contain an error. I reported this in [Documentation]: erroneous git clone command #7.

  • I used the source code from the amd-mainline branch. The build exited with an error. The cause seems to be that the file rocprofiler-sdk/source/lib/rocprofiler-sdk/counters/tests/hsa_tables.cpp sets a member variable hsa_amd_queue_get_info_fn of the struct AmdExtTable. However, this member does not seem to exist in that struct in the latest public version of ROCm, and may only be added in a future release. Guarding this line with the preprocessor conditional HSA_AMD_EXT_API_TABLE_STEP_VERSION >= 0x02, as in the amd-staging branch, allowed the build to proceed. I do not know if this fix is sufficient; e.g., there are other occurrences of hsa_amd_queue_get_info_fn in the code, but they did not seem to cause a build issue.

  • I initially used the libhsa-amd-aqlprofile64.so that came with the docker image. The build exited with linking errors due to missing symbols related to the aqlprofile. I installed the latest libhsa-amd-aqlprofile64.so by installing the .deb package from the link indicated in the warning in the README.md. I did so by using the command dpkg -i .... This deleted the existing aqlprofile libraries from the /opt/rocm/lib directory, and new libraries were installed in a new directory /opt/rocm-6.2.0-14213. I made symlinks in /opt/rocm/lib to the new libraries in the new directory. This allowed the build to proceed and complete. I do not know if this is a good way of installing the aqlprofile, nor whether replacing the aqlprofile libraries shipped with the rocm image with these new versions may break anything in the existing rocm stack.
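For illustration, the kind of version guard described in the hsa_tables.cpp item above can be sketched as follows. This is a minimal, self-contained mock and not the actual rocprofiler-sdk or HSA code: MockAmdExtTable, mock_queue_get_info, and populate_table are hypothetical names, the function-pointer signature is a placeholder, and the macro is given a fallback value of 0x01 here only to mimic headers that predate the new member.

```cpp
// Assumption: the real macro comes from the HSA headers. The fallback below
// only mimics an older ROCm release in which the new member does not exist.
#ifndef HSA_AMD_EXT_API_TABLE_STEP_VERSION
#    define HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x01
#endif

// Hypothetical, heavily simplified stand-in for the real AmdExtTable struct.
struct MockAmdExtTable
{
#if HSA_AMD_EXT_API_TABLE_STEP_VERSION >= 0x02
    // Only present when the headers are new enough to declare it.
    // The signature is a placeholder, not the real HSA one.
    int (*hsa_amd_queue_get_info_fn)() = nullptr;
#endif
    int placeholder = 0;  // keeps the mock non-empty in both configurations
};

// Hypothetical implementation that the table entry would point to.
int mock_queue_get_info() { return 0; }

// Sets the new table entry only if the struct actually declares it.
// Returns true when the entry was set, false when it was skipped.
bool populate_table(MockAmdExtTable& table)
{
#if HSA_AMD_EXT_API_TABLE_STEP_VERSION >= 0x02
    table.hsa_amd_queue_get_info_fn = &mock_queue_get_info;
    return true;
#else
    (void) table;  // nothing to set against older headers
    return false;
#endif
}
```

With the fallback value, populate_table skips the assignment and returns false; compiling with -DHSA_AMD_EXT_API_TABLE_STEP_VERSION=0x02 enables both the member and the assignment. Every other use of the member would need the same guard, which is why a single guarded line may not be a complete fix.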

This experience may lead to two thoughts:

  • Unless I made mistakes, this experience suggests that the source code in the amd-mainline branch of rocprofiler-sdk cannot be compiled against the publicly available versions of the ROCm stack, because it appears to depend on features from future releases. This raises the question of whether the repository could or should have a branch that compiles against an existing public version of the ROCm stack. This question may be related to the distinction between the "mainline" and "staging" branches.

  • It would be helpful to provide not only a "warning" about the "aqlprofile", but also guidance on how users/developers should install the aqlprofile library appropriately, and on the impact that installing an updated version may have. E.g., does it affect only rocprofiler, or does it risk breaking anything else in the stack?

The work on rocprofiler-sdk seems really great! I'm looking forward to exploring more.

Operating System

Ubuntu 22.04

CPU

AMD Ryzen9 5950X

GPU

AMD Radeon Pro VII

ROCm Version

ROCm 6.1.0

ROCm Component

rocprofiler

Steps to Reproduce

No response

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

@jrmadsen
Contributor

rocprofiler-sdk is targeted to be publicly available as a beta release in ROCm 6.2. Building it against 6.1 is unfortunately a bit pointless since 6.1 doesn’t have some necessary changes in HIP to allow tracing the HIP runtime.

@maartenarnst
Author

Hi @jrmadsen. OK, thanks a lot for the clarification. We'll wait for ROCm 6.2 before exploring the new rocprofiler more.
