Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Switch to use new Python bindings from cuda-python/cuda-bindings #111

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

leofang
Copy link
Member

@leofang leofang commented Oct 29, 2024

Note: Currently to try this PR it requires building cuda-python/cuda-bindings from source:
https://github.com/NVIDIA/cuda-python/tree/main/cuda_bindings

  • Drop the old, CPython API-based nvJitLink bindings, in favor of the new one from CUDA Python
    • Support dynamic linking, no more rebuilding bindings for each CUDA minor release
    • Support Windows
    • pynvjitlink becomes a pure Python package
  • All tests in pynvjitlink/tests/test_pynvjitlink.py passed except one (see comment below)
    • I need someone to help take a look

Copy link

copy-pr-bot bot commented Oct 29, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Comment on lines -77 to +84
with pytest.raises(RuntimeError):
with pytest.raises(_nvjitlinklib.nvJitLinkError):
# FIXME: For some reason this API would leak the error log to stderr
_nvjitlinklib.complete(handle)
error_log = _nvjitlinklib.get_error_log(handle)
_nvjitlinklib.destroy(handle)
assert (
"Undefined reference to '_Z5undefff' "
"in 'undefined_extern.cubin'" in error_log
)
), f"{error_log=}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the failing test:

$ pytest test_pynvjitlink.py  --maxfail=3
============================================================= test session starts =============================================================
platform linux -- Python 3.12.5, pytest-8.3.3, pluggy-1.5.0
rootdir: /local/home/leof/dev/pynvjitlink
configfile: pyproject.toml
collected 20 items                                                                                                                            

test_pynvjitlink.py ............F.......                                                                                                [100%]

================================================================== FAILURES ===================================================================
_____________________________________________________________ test_get_error_log ______________________________________________________________

undefined_extern_cubin = ('undefined_extern.cubin', b'\x7fELF\x02\x01\x013\x07\x00\x00\x00\x00\x00\x00\x00\x01\x00\xbe\x00~\x00\x00\x00\x00\x00...0\x00\x00\x00\x00\x00\x03\x00\x00\x00\x07\x00\x00\x18\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
gpu_arch_flag = '-arch=sm_89'

    def test_get_error_log(undefined_extern_cubin, gpu_arch_flag):
        handle = _nvjitlinklib.create(gpu_arch_flag)
        filename, data = undefined_extern_cubin
        input_type = InputType.CUBIN.value
        _nvjitlinklib.add_data(handle, input_type, data, filename)
        with pytest.raises(_nvjitlinklib.nvJitLinkError):
            # FIXME: For some reason this API would leak the error log to stderr
            _nvjitlinklib.complete(handle)
        error_log = _nvjitlinklib.get_error_log(handle)
        _nvjitlinklib.destroy(handle)
>       assert (
            "Undefined reference to '_Z5undefff' "
            "in 'undefined_extern.cubin'" in error_log
        ), f"{error_log=}"
E       AssertionError: error_log='ERROR 9: finish\n\x00'
E       assert "Undefined reference to '_Z5undefff' in 'undefined_extern.cubin'" in 'ERROR 9: finish\n\x00'

test_pynvjitlink.py:81: AssertionError
------------------------------------------------------------ Captured stderr call -------------------------------------------------------------
error   : Undefined reference to '_Z5undefff' in '�����'
=========================================================== short test summary info ===========================================================
FAILED test_pynvjitlink.py::test_get_error_log - AssertionError: error_log='ERROR 9: finish\n\x00'
======================================================== 1 failed, 19 passed in 0.57s =========================================================

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't capture the exception raised by _nvjitlinklib.complete(handle), I do see the expected error log (no weird unicode mess) showing up in stderr, which is weird/unexpected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a regression in the version of nvJitLink used. Is that on your local system? What is the nvJitLink version?

@leofang leofang mentioned this pull request Oct 29, 2024
Comment on lines 5 to 9
project(
pynvjitlink
VERSION ${SKBUILD_PROJECT_VERSION}
LANGUAGES CXX CUDA
LANGUAGES CXX
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this PR turns pynvjitlink into a pure Python package (I think!), I don't really know what to put here, given that I've never used CMake to package a pure Python project... 😆

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it makes sense to involve CMake for a pure Python package. Probably only a pyproject.toml would be necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we would need to rearchitect the build systems and CI scripts to account for having only a single build of a pure Python package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I had to set the env var NUMBA_CUDA_USE_NVIDIA_BINDING=1 to run the tests, otherwise I hit weird type errors when Numba tried to load the cubin etc using the Driver APIs (through ctypes).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's occurring, I think this PR is likely to be changing the API of pynvjitlink such that the type of a returned cubin is different. I haven't been able to guess exactly what might be happening from the code, so will have to try it locally (or if you have the type error handy, feel free to share it).

@gmarkall
Copy link
Contributor

/ok to test

@bdice
Copy link
Contributor

bdice commented Nov 1, 2024

@gmarkall @leofang If you'd like a hand with the repackaging to make pynvjitlink a pure Python package, let me know. I can help with some pieces of this.

Copy link

@warsawnv warsawnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple more drive-by observations. Feel free to ignore.

from cuda.bindings.nvjitlink import nvJitLinkError


def nvjitlink_version():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of an odd looking module. Is it provided just for backward compatibility? Also, why is this one method not just called version()?

Copy link
Member Author

@leofang leofang Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of an odd looking module. Is it provided just for backward compatibility?

Yup! The whole purpose of this module is to drop-in replace the old one (that was implemented using CPython C APIs). This method replaces the following one:

static PyObject *nvjitlink_version(PyObject *self, PyObject *Py_UNUSED(args)) {



def add_file(*args, **kwargs):
raise NotImplementedError("seems unused by pynvjitlink")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If unimplemented, should this shim be omitted? Is it better to get an AttributeError if someone tries to access this or a NotImplementedError when they do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while. Looks like I did a faithful translation 😂

static PyObject *add_file(PyObject *self, PyObject *args) {
set_exception(PyExc_NotImplementedError, "Unimplemented", NVJITLINK_SUCCESS);
return nullptr;
}

@gmarkall
Copy link
Contributor

If test binaries need to be built for Windows, I added a script to numba-cuda that builds them - it might be useful to copy across to this PR when Windows support is added / tested: https://github.com/NVIDIA/numba-cuda/blob/main/numba_cuda/numba/cuda/tests/test_binary_generation/build.bat

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants