Proposal for CUDA upgrade #5300
amadeuszsz started this conversation in Design
Introduction
Autoware currently supports only strictly pinned versions of the CUDA-related libraries. These versions have become relatively old and block future development. One example is the ongoing development of 3D LiDAR semantic segmentation for Autoware, which requires TensorRT 10.0+.
Proposal
We currently support:
CUDA=12.3
CUDNN=8.9.5.29-1+cuda12.2
TensorRT=8.6.1.6-1+cuda12.0
The version upgrade could follow one of two approaches.
Option 1 - JetPack support
This week (at the time of writing this proposal) JetPack 6.1 was released, bringing upgrades to the CUDA-related libraries. Fortunately, its library versions meet our requirements, and from the OSS community's perspective it would be nice for Autoware to align with the JetPack release.
In addition, during the last update there was a valuable comment about choosing specific library versions, which reflects the experience of edge device users.
This option gives us:
CUDA=12.6
CUDNN=9.3.0.75-1+cuda12.6
TensorRT=10.3.0.26-1+cuda12.5
Option 2 - latest versions
Simply go to the latest releases:
CUDA=12.6
CUDNN=9.4.0.58-1+cuda12.6
TensorRT=10.5.0.18-1+cuda12.6
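To make the comparison concrete, here is a minimal sketch that parses the version pins listed above and checks the TensorRT pin of each set against the two minimum versions this proposal mentions: 10.0+ for the 3D LiDAR semantic segmentation work and 10.4.0 for Ubuntu 24.04 support. The helper names (`parse_pin`, `check_option`) and the dictionary layout are illustrative assumptions, not part of Autoware.

```python
import re

# Version pins copied from this proposal; the dict layout is an assumption.
OPTIONS = {
    "current": {
        "CUDA": "12.3",
        "CUDNN": "8.9.5.29-1+cuda12.2",
        "TensorRT": "8.6.1.6-1+cuda12.0",
    },
    "option 1 (JetPack 6.1)": {
        "CUDA": "12.6",
        "CUDNN": "9.3.0.75-1+cuda12.6",
        "TensorRT": "10.3.0.26-1+cuda12.5",
    },
    "option 2 (latest)": {
        "CUDA": "12.6",
        "CUDNN": "9.4.0.58-1+cuda12.6",
        "TensorRT": "10.5.0.18-1+cuda12.6",
    },
}

# Minimum TensorRT versions stated in the proposal text.
REQUIREMENTS = {
    "3D LiDAR semantic segmentation": (10, 0),
    "Ubuntu 24.04 support": (10, 4, 0),
}


def parse_pin(pin):
    """Return the leading dotted version of a pin as a tuple of ints.

    Everything after the dotted number (package revision, CUDA suffix)
    is ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", pin)
    if match is None:
        raise ValueError(f"not a version pin: {pin!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def check_option(pins):
    """Map each requirement name to whether the TensorRT pin satisfies it."""
    tensorrt = parse_pin(pins["TensorRT"])
    return {name: tensorrt >= minimum for name, minimum in REQUIREMENTS.items()}


if __name__ == "__main__":
    for name, pins in OPTIONS.items():
        print(name, check_option(pins))
```

Under these pins, both options satisfy the 10.0+ requirement, while only the option 2 TensorRT version reaches 10.4.0.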
Summary
Going from the JetPack-aligned versions (option 1) to the latest versions (option 2) brings extra features, fixes for old bugs, and newly introduced bugs, as described in the changelogs:
CUDA - not applicable (same version in both options)
CUDNN - 9.3.0 -> 9.4.0
TensorRT - 10.3.0 -> 10.4.0 and 10.4.0 -> 10.5.0
CUDNN for option 1 (9.3.0) can potentially cause issues, as described in its changelog ("Some graphs containing the norm forward operation in inference mode fail to serialize."), so it should be validated first. Secondly, TensorRT support for Ubuntu 24.04 was introduced in 10.4.0, and we might need to align with the roadmap for the ROS 2 Jazzy Autoware update (@mitsudome-r (?)).
What's next?
There is no ideal solution, but what we can do is follow these steps:
In TensorRT 10.0+ the old API is removed. Related packages:
tensorrt_common
autoware_lidar_apollo_instance_segmentation
autoware_lidar_centerpoint
autoware_shape_estimation
autoware_tensorrt_classifier
autoware_tensorrt_yolox
There might be a conflict between the new Autoware dependencies and edge device users on JetPack up to 6.0. I suggest refactoring all packages to use a tensorrt_common API call for model inference, where we add macros that choose the API call based on the installed TensorRT version. However, the new API was introduced in TensorRT 8.5.2, which is already in JetPack 5.1.2, so I assume it is fair to support only the new API.
This way, we will know whether we can support edge devices and whether any of the packages have issues with the upgraded libraries (the updated amd64.env and arm64.env are compatible). Secondly, we will be prepared for the future ROS 2 Jazzy upgrade.
Please feel free to share your thoughts. This upgrade may affect contributors' work, so we would like to find the best solution for all of us.
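The version-based API selection discussed in the steps above can be sketched as follows. This only illustrates the decision logic in Python; in the C++ packages the same decision would presumably be made at compile time. The two thresholds come from this proposal (new API introduced in TensorRT 8.5.2, old API removed in 10.0+), while the function names and the "old"/"new" labels are hypothetical.

```python
# Thresholds taken from the proposal text.
NEW_API_INTRODUCED = (8, 5, 2)  # "New API was introduced in TensorRT 8.5.2"
OLD_API_REMOVED = (10, 0, 0)    # "10.0+ old API is removed"


def available_apis(trt_version):
    """Return the API generations offered by a (major, minor, patch) version."""
    apis = set()
    if trt_version < OLD_API_REMOVED:
        apis.add("old")
    if trt_version >= NEW_API_INTRODUCED:
        apis.add("new")
    return apis


def choose_api(trt_version):
    """Prefer the new API whenever the installed version provides it."""
    return "new" if "new" in available_apis(trt_version) else "old"


if __name__ == "__main__":
    # (8, 4, 0) is a hypothetical pre-8.5.2 install, used only for contrast.
    for version in [(8, 4, 0), (8, 5, 2), (8, 6, 1), (10, 3, 0)]:
        print(version, sorted(available_apis(version)), choose_api(version))
```

With these thresholds every version from 8.5.2 upward can use the new API, which is the reasoning behind supporting only the new API for JetPack 5.1.2 and later.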