Cuda 11 and 12 compatibility #132

Draft: wants to merge 21 commits into base: master
4 changes: 3 additions & 1 deletion CMakeLists.txt
@@ -1,4 +1,6 @@
CMAKE_MINIMUM_REQUIRED(VERSION 2.8)
CMAKE_MINIMUM_REQUIRED(VERSION 3.2)

project(IcWorkspace)

SET(ICMAKER_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/icmaker)

27 changes: 20 additions & 7 deletions README.md
@@ -13,12 +13,12 @@ If you have questions or experience any problems with the software, please post
at https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/issues

## Install dependencies:
This software is tested under 64 Bit Ubuntu Linux 14.04, 16.04 and 18.04.
This software was tested under 64 Bit Ubuntu Linux 16.04, 18.04 and 20.04.
Find detailed installation and linking instructions in our Doxygen.

**Core:**

- CUDA 7.5, 8.0, 9.x or 10.0
- CUDA 9.x, 10.x or 11.x
- PCL
- OpenNI
- Boost
@@ -51,17 +51,30 @@ Doxygen files can be generated by

make doc

## Compiling without C++11
C++11 is enabled by default. To compile without C++11 mode comment out this at the top of packages/gpu_voxels/CMakeLists.txt:
SET(CMAKE_CXX_STANDARD 11)
## Compiling with C++11 instead of C++14
C++14 is enabled by default. To compile with C++11 instead change these lines in packages/gpu_voxels/CMakeLists.txt:

SET(CMAKE_CXX_STANDARD 14)
SET(ICMAKER_CUDA_CXX_STANDARD "--std=c++14")

This is important if you are still using ROS Indigo and need to compile without C++11 support for compatibility reasons. There is a separate indigo branch that only differs in this line to disable C++11 during compilation.

## Known issues
- If the ROS dependency was found, but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Eigen 3 issues: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
- If the ROS dependency was found but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Octrees are currently bugged on modern GPU architectures of compute capability 7.2 and newer, leading to excessive/infinite runtime of octree algorithms.
+ as a stopgap measure using octree functions on these GPUs now throws an exception until a fix has been found
+ to run all tests except for the octree ones run `./bin/test_gpu_voxels_core --run_test=\!octree_selftest:\!octree_collisions`
- Eigen 3 issues on Ubuntu 18.04: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
+ on Ubuntu 18.04 with CUDA 10.0: "math_functions.hpp not found"
+ Eigen 3.3.4 and 3.3.5 with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x"
+ see http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
+ follow install_eigen_for_18.04.sh to add a suitable Eigen to CMAKE_PREFIX_PATH or do the following to install a new Eigen system-wide
```bash
git clone https://gitlab.com/libeigen/eigen.git
sudo cp -r signature_of_eigen3_matrix_library /usr/include/eigen3
sudo cp -r unsupported/ /usr/include/eigen3
sudo cp -r Eigen/ /usr/include/eigen3
```
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer ("PTX JIT compilation failed"). Easy fix: use Cuda 10 with a current 410 or newer driver version. Cuda 10 is also available for Ubuntu 14.04 and 16.04.
- GLM: There is a known bug in GLM on Ubuntu 16.04 that has to be patched to allow usage of the visualizer. Apply the patch described at https://github.com/g-truc/glm/issues/530 to /usr/include/glm/detail/func_common.inl
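
The "Compiling with C++11 instead of C++14" section above asks for two lines to be changed by hand. As a rough sketch only, the same change could be expressed as a shell filter. The function name `to_cxx11` is hypothetical and not part of this PR; it assumes the two settings appear exactly as quoted in the README.

```bash
# Hypothetical helper (not part of this PR): rewrites the two C++14 settings
# quoted in the README to C++11. Reads CMake text on stdin and writes the
# modified text to stdout; the input file itself is not touched.
to_cxx11() {
  sed -e 's/SET(CMAKE_CXX_STANDARD 14)/SET(CMAKE_CXX_STANDARD 11)/' \
      -e 's/"--std=c++14"/"--std=c++11"/'
}
```

A possible invocation would be `to_cxx11 < packages/gpu_voxels/CMakeLists.txt > CMakeLists.cxx11.txt`, followed by a manual review of the result before replacing the original file.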

4 changes: 4 additions & 0 deletions icmaker/IcMaker.cmake
@@ -91,6 +91,10 @@ IF(POLICY CMP0015)
CMAKE_POLICY(SET CMP0015 NEW)
ENDIF()

cmake_policy(SET CMP0017 NEW) # prefer system CMake modules
cmake_policy(SET CMP0072 OLD) # use old FindOpenGL behavior
cmake_policy(SET CMP0074 OLD) # use env variables for find_package <PkgName>_ROOT discovery

# ----------------------------------------------------------------------------
# Current version number:
# ----------------------------------------------------------------------------
11 changes: 11 additions & 0 deletions install_eigen_for_18.04.sh
@@ -0,0 +1,11 @@
#!/bin/bash

DEST=3rd-party/eigen-git-mirror
VERSION="3.3.7"

echo "Cloning Eigen3 tag $VERSION to $DEST ($(realpath $DEST))"
mkdir -p $DEST
git clone https://github.com/eigenteam/eigen-git-mirror.git -b $VERSION $DEST

echo "Add the Eigen3 root folder to CMAKE_PREFIX_PATH as follows:"
echo "export CMAKE_PREFIX_PATH=$(realpath ${DEST}):\$CMAKE_PREFIX_PATH"
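
The script above prints an `export` line that always appends `:$CMAKE_PREFIX_PATH`, which leaves a trailing colon when the variable was previously unset. A minimal sketch of a helper that avoids the empty entry is shown here; the name `prepend_path` is an assumption for illustration and does not exist in the PR.

```bash
# Hypothetical helper (not part of install_eigen_for_18.04.sh): prepend a
# directory to a colon-separated search path. When the existing path is
# empty, only the new directory is printed, so no trailing colon appears.
prepend_path() {
  dir=$1
  current=$2
  if [ -n "$current" ]; then
    printf '%s:%s\n' "$dir" "$current"
  else
    printf '%s\n' "$dir"
  fi
}
```

It could be used as `export CMAKE_PREFIX_PATH="$(prepend_path "$(realpath 3rd-party/eigen-git-mirror)" "$CMAKE_PREFIX_PATH")"`.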
80 changes: 64 additions & 16 deletions packages/gpu_voxels/CMakeLists.txt
@@ -1,17 +1,43 @@
# this is for emacs file handling -*- mode: cmake; indent-tabs-mode: nil -*-

SET(CMAKE_CXX_STANDARD 11) # comment out to deactivate C++11 when using ROS indigo to avoid incompatibilities
# --- Global options ---

cmake_policy(SET CMP0074 OLD) # use env variables for find_package <PkgName>_ROOT discovery

## support Jetson mobile GPUs
SET(JETSON_SUPPORT OFF)

## allow compilation without PCL for gvl_ompl_planner
SET(PCL_SUPPORT ON)

## disable compilation of GPU-Voxels examples
SET(BUILD_EXAMPLES ON)

# fix for cuda 9.1 thrust cub addArgument runtime segfault
SET(CUDA_USE_STATIC_CUDA_RUNTIME OFF)

# required in order to query CUDA_VERSION_STRING
FIND_PACKAGE(CUDA)

## set C++ standard
# Note: C++11 is incompatible with ROS Indigo / Ubuntu 16.04
SET(CMAKE_CXX_STANDARD 14)
SET(ICMAKER_CUDA_CXX_STANDARD "--std=c++14")

# suppress erroneous warnings, see https://github.com/ORNL-CEES/DataTransferKit/pull/404
# Warning: incompatible with CUDA < 9.0 !
MESSAGE(STATUS "------------- GPU Voxels found CUDA version ${CUDA_VERSION_STRING} ------------")
IF(CUDA_VERSION_STRING VERSION_LESS "9")
SET(SUPPRESS_NVCC_DEFAULTED_WARNINGS "")
ELSE()
SET(SUPPRESS_NVCC_DEFAULTED_WARNINGS "-Xcudafe --diag_suppress=esa_on_defaulted_function_ignored")
ENDIF()

# --- To be used by other modules ---
SET(GPU_VOXELS_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/src" "${CMAKE_CURRENT_BINARY_DIR}/src" CACHE INTERNAL "")
SET(GPU_VOXELS_IDL_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/src" CACHE INTERNAL "")
SET(GPU_VOXELS_IDE_FOLDER "gpu_voxels")

# --- Global options ---

SET(CUDA_USE_STATIC_CUDA_RUNTIME OFF) # fix for cuda 9.1 thrust cub addArgument runtime segfault
FIND_PACKAGE(CUDA)

FIND_PACKAGE(Boost COMPONENTS system thread filesystem date_time unit_test_framework chrono)

# LibRT is needed for Boost Interprocess on POSIX systems
@@ -37,7 +63,9 @@ FIND_PACKAGE(kdl_parser)
FIND_PACKAGE(Octomap)

# Dependencies for PCL interfaces
FIND_PACKAGE(PCL)
IF (PCL_SUPPORT)
FIND_PACKAGE(PCL)
ENDIF (PCL_SUPPORT)

FIND_PACKAGE(pcl_ros)

@@ -48,6 +76,14 @@ FIND_PACKAGE(OpenNi)
# ICL Package management
ICMAKER_REGISTER_PACKAGE(gpu_voxels)

# Check if pcl/io/openni_grabber.h exists
FIND_FILE(PCL_OPENNI_GRABBER_FILE pcl/io/openni_grabber.h ${PCL_INCLUDE_DIRS})
IF(${PCL_OPENNI_GRABBER_FILE} STREQUAL "PCL_OPENNI_GRABBER_FILE-NOTFOUND")
SET(PCL_OPENNI_GRABBER_FILE_FOUND OFF)
ELSE()
SET(PCL_OPENNI_GRABBER_FILE_FOUND ON)
ENDIF()

MESSAGE(STATUS "--------------------------------------------------------------------------")
MESSAGE(STATUS "------------------------ GPU Voxels configuration ------------------------")
MESSAGE(STATUS " ")
@@ -106,11 +142,11 @@ ENDIF(urdfdom_FOUND AND orocos_kdl_FOUND AND kdl_parser_FOUND)
# MESSAGE(STATUS "[WARNING] Building GPU-Voxels without Kinect support. OpenNI2 not found.")
#ENDIF(OPENNI2_FOUND)

IF(OPENNI_FOUND)
MESSAGE(STATUS "[OK] Building GPU-Voxels with Kinect support. OpenNI was found.")
IF(OPENNI_FOUND AND PCL_OPENNI_GRABBER_FILE_FOUND)
MESSAGE(STATUS "[OK] Building GPU-Voxels with Kinect support. OpenNI and ${PCL_OPENNI_GRABBER_FILE} were found.")
ELSE(OPENNI_FOUND)
MESSAGE(STATUS "[WARNING] Building GPU-Voxels without Kinect support. OpenNI not found.")
ENDIF(OPENNI_FOUND)
MESSAGE(STATUS "[WARNING] Building GPU-Voxels without Kinect support. OpenNI or pcl/io/openni_grabber.h not found.")
ENDIF(OPENNI_FOUND AND PCL_OPENNI_GRABBER_FILE_FOUND)

IF(Octomap_FOUND)
MESSAGE(STATUS "[OK] Building GPU-Voxels with Octomap support. Octomap found.")
@@ -135,8 +171,10 @@ MESSAGE(STATUS "----------------------------------------------------------------

# Change these lines to increase performance if your GPU's compute capability is higher
# GPU-Voxels requires GPU CUDA capabilities >= 2.0
#SET(ICMAKER_CUDA_COMPUTE_VERSION 20)
SET(ICMAKER_CUDA_COMPUTE_VERSION 35)
# examples: GTX980 is 5.2, Jetson TX1 is 5.3
# Cuda 11 will print warnings for deprecated architectures like 3.5 and 3.7
#SET(ICMAKER_CUDA_COMPUTE_VERSION 35)
SET(ICMAKER_CUDA_COMPUTE_VERSION 52)

SET(ICMAKER_CUDA_ARCH compute_${ICMAKER_CUDA_COMPUTE_VERSION})
SET(ICMAKER_CUDA_CODE compute_${ICMAKER_CUDA_COMPUTE_VERSION})
@@ -153,9 +191,17 @@ SET(ICMAKER_CUDA_CODE compute_${ICMAKER_CUDA_COMPUTE_VERSION})

SET(ICMAKER_CUDA_PTXAS_VERBOSE "") # "--resource-usage") #nvcc outputs register and memory usage data for each kernel
SET(ICMAKER_CUDA_WALL "-Xcompiler=-Wall")
SET(ICMAKER_CUDA_MAXREGS "--maxrregcount=63") # set to 31 to compile on JetsonTX1/TX2/Nano (compute capability 5.3/6.2); N*blocksize must be smaller than max registers per block
SET(ICMAKER_CUDA_DEBUG "-lineinfo") # "-lineinfo") #"-g -G") # -G disables optimization. use -lineinfo for profiling

# might be replaced by setting per-kernel launch bounds: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#launch-bounds
# TODO: test again if using 32 (Jetson) or 64 (non-Jetson) instead of 31/63 causes problems
IF(JETSON_SUPPORT)
# set to 31 to compile on JetsonTX1/TX2/Nano (compute capability 5.3/6.2); N*blocksize must be smaller than max registers per block
SET(ICMAKER_CUDA_MAXREGS "--maxrregcount=31")
ELSE(JETSON_SUPPORT)
SET(ICMAKER_CUDA_MAXREGS "--maxrregcount=63") # there are 64K registers on current Geforce GPUs, fitting 1K threads with 64 registers
ENDIF(JETSON_SUPPORT)

# Enable position independent code
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC")

@@ -166,7 +212,7 @@ ELSE()
SET(ICMAKER_CUDA_THRUST_DEBUG "" )
ENDIF()

SET(ICMAKER_CUDA_CPPDEFINES "${ICMAKER_CUDA_DEBUG} ${ICMAKER_CUDA_MAXREGS} ${ICMAKER_CUDA_PTXAS_VERBOSE} -gencode=arch=${ICMAKER_CUDA_ARCH},code=[${ICMAKER_CUDA_CODE}] ${ICMAKER_CUDA_THRUST_DEBUG} ${ICMAKER_CUDA_WALL}")
SET(ICMAKER_CUDA_CPPDEFINES "${ICMAKER_CUDA_DEBUG} ${ICMAKER_CUDA_MAXREGS} ${ICMAKER_CUDA_PTXAS_VERBOSE} -gencode=arch=${ICMAKER_CUDA_ARCH},code=[${ICMAKER_CUDA_CODE}] ${ICMAKER_CUDA_THRUST_DEBUG} ${ICMAKER_CUDA_CXX_STANDARD} ${ICMAKER_CUDA_WALL} ${SUPPRESS_NVCC_DEFAULTED_WARNINGS}")

MESSAGE("ICMAKER_CUDA_CPPDEFINES: ${ICMAKER_CUDA_CPPDEFINES}")

@@ -199,7 +245,9 @@ ICMAKER_CONFIGURE_PACKAGE()

###############################################################################
# Build examples
ADD_SUBDIRECTORY(src/examples)
IF(BUILD_EXAMPLES)
ADD_SUBDIRECTORY(src/examples)
ENDIF()

IF(BUILD_TESTS)
ENDIF(BUILD_TESTS)
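
The CUDA version gate in this file (empty suppression flag below CUDA 9, the `-Xcudafe` flag otherwise) can be sanity-checked outside CMake. The following is only a sketch of that logic in shell, assuming a `sort` that supports `-V` (as in GNU coreutils); the function names `version_less` and `suppress_flags_for_cuda` are hypothetical, and the comparison is an approximation of CMake's `VERSION_LESS`, not a reimplementation of it.

```bash
# Succeeds if version $1 sorts strictly before version $2.
# Assumption: `sort -V` (version sort) is available.
version_less() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Mirrors the IF(CUDA_VERSION_STRING VERSION_LESS "9") branch above:
# prints nothing useful for CUDA < 9, the suppression flag otherwise.
suppress_flags_for_cuda() {
  if version_less "$1" "9"; then
    printf '%s\n' ""
  else
    printf '%s\n' "-Xcudafe --diag_suppress=esa_on_defaulted_function_ignored"
  fi
}
```

Calling `suppress_flags_for_cuda 11.0` shows which value `SUPPRESS_NVCC_DEFAULTED_WARNINGS` would be expected to take for that toolkit version.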
31 changes: 20 additions & 11 deletions packages/gpu_voxels/README.md
@@ -13,12 +13,12 @@ If you have questions or experience any problems with the software, please post
at https://github.com/fzi-forschungszentrum-informatik/gpu-voxels/issues

## Install dependencies:
This software is tested under 64 Bit Ubuntu Linux 14.04, 16.04 and 18.04.
This software was tested under 64 Bit Ubuntu Linux 16.04, 18.04 and 20.04.
Find detailed installation and linking instructions in our Doxygen.

**Core:**

- CUDA 7.5, 8.0, 9.x or 10.0
- CUDA 9.x, 10.x or 11.x
- PCL
- OpenNI
- Boost
@@ -51,21 +51,30 @@ Doxygen files can be generated by

make doc

## Compiling without C++11
C++11 is enabled by default. To compile without C++11 mode comment out this at the top of packages/gpu_voxels/CMakeLists.txt:
SET(CMAKE_CXX_STANDARD 11)
This is important if you are still using ROS Indigo and need to compile without C++11 support for compatibility reasons. There is a separate indigo branch that only differs in this line to disable C++11 during compilation.
## Compiling with C++11 instead of C++14
C++14 is enabled by default. To compile with C++11 instead change these lines in packages/gpu_voxels/CMakeLists.txt:

SET(CMAKE_CXX_STANDARD 14)
SET(ICMAKER_CUDA_CXX_STANDARD "--std=c++14")

## Enabling C++11
C++11 is enabled by default. To compile without C++11 mode comment out this at the top of packages/gpu_voxels/CMakeLists.txt:
SET(CMAKE_CXX_STANDARD 11)
This is important if you are still using ROS Indigo and need to compile without C++11 support for compatibility reasons. There is a separate indigo branch that only differs in this line to disable C++11 during compilation.

## Known issues
- If the ROS dependency was found, but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Eigen 3 issues: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
- If the ROS dependency was found but the GPU-Voxels URDF features are still unavailable, run `source /opt/ros/YOUR_ROS_DISTRO/setup.bash` before running cmake.
- Octrees are currently bugged on modern GPU architectures of compute capability 7.2 and newer, leading to excessive/infinite runtime of octree algorithms.
+ as a stopgap measure using octree functions on these GPUs now throws an exception until a fix has been found
+ to run all tests except for the octree ones run `./bin/test_gpu_voxels_core --run_test=\!octree_selftest:\!octree_collisions`
- Eigen 3 issues on Ubuntu 18.04: can be fixed by cloning a more current unstable Eigen version and placing it in CMAKE_PREFIX_PATH
+ on Ubuntu 18.04 with CUDA 10.0: "math_functions.hpp not found"
+ Eigen 3.3.4 and 3.3.5 with CUDA 9.0, 9.1, 9.2: Error: class "Eigen::half" has no member "x"
+ see http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
+ follow install_eigen_for_18.04.sh to add a suitable Eigen to CMAKE_PREFIX_PATH or do the following to install a new Eigen system-wide
```bash
git clone https://gitlab.com/libeigen/eigen.git
sudo cp -r signature_of_eigen3_matrix_library /usr/include/eigen3
sudo cp -r unsupported/ /usr/include/eigen3
sudo cp -r Eigen/ /usr/include/eigen3
```
- Cuda 8.0: Code compiled with Cuda 8.0 works fine with older GPU drivers such as 375.66, but there are runtime errors with driver 384.111 and newer ("PTX JIT compilation failed"). Easy fix: use Cuda 10 with a current 410 or newer driver version. Cuda 10 is also available for Ubuntu 14.04 and 16.04.
- GLM: There is a known bug in GLM on Ubuntu 16.04 that has to be patched to allow usage of the visualizer. Apply the patch described at https://github.com/g-truc/glm/issues/530 to /usr/include/glm/detail/func_common.inl
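
The octree workaround in the known issues relies on a Boost.Test `--run_test` filter in which each excluded suite is prefixed with `!` and entries are joined with `:`. As a small sketch, such a filter could be assembled from a list of suite names; the helper name `skip_filter` is an assumption and not part of the repository.

```bash
# Hypothetical helper: build a Boost.Test --run_test argument that excludes
# every suite name passed as an argument, e.g. "!a:!b".
skip_filter() {
  bang='!'   # kept in a variable to sidestep interactive history expansion
  filter=""
  for suite in "$@"; do
    if [ -n "$filter" ]; then
      filter="$filter:"
    fi
    filter="$filter$bang$suite"
  done
  printf '%s\n' "--run_test=$filter"
}
```

Under that assumption, `./bin/test_gpu_voxels_core "$(skip_filter octree_selftest octree_collisions)"` passes the same argument as the escaped command quoted in the list above.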
