The NVIDIA Performance Libraries (NVPL) are a collection of high-performance mathematical libraries optimized for the NVIDIA Grace Armv9.0-A Neoverse-V2 architecture.
These CPU-only libraries have no dependencies on CUDA or CTK and are drop-in replacements for standard C and Fortran mathematical APIs, allowing HPC applications to achieve maximum performance on the Grace platform.
The provided sample codes show how to call and link to NVPL libraries in Fortran, C, and C++ applications and libraries. Most examples use CMake but are easily modified for use in custom build environments.
- NVPL Downloads
- Latest release: NVPL-24.7
Samples are compatible with the latest NVPL release. Compatibility with older releases is not guaranteed.
- NVPL BLAS Samples
- NVPL FFT Samples
- NVPL LAPACK Samples
- NVPL RAND Samples
- NVPL ScaLAPACK Samples
- NVPL Sparse Samples
- NVPL Tensor Samples
- Architecture: aarch64-linux
- Platform: Arm SBSA
- CPUs Supported
  - NVIDIA Grace (Armv9.0-A Neoverse-V2)
  - AWS Graviton 3/3e (Armv8.4-A Neoverse-V1)
  - AWS Graviton 2 (Armv8.2-A Neoverse-N1)
  - Ampere Altra (Armv8.2-A Neoverse-N1)
  - Any CPU with Armv8.1-A or later microarchitecture
- OS (Linux)
  - Ubuntu: 20.04, 22.04, 24.04
  - RHEL: RHEL8, RHEL9
  - Fedora: 37, 38, 39, 40
  - SLES: SLES15
  - OpenSUSE/leap: 15.5
  - AmazonLinux: 2, 2023
  - Generally any Linux OS with support for aarch64
- GCC-8 - GCC-14+
- Clang-14 - Clang-18+
- Clang for NVIDIA Grace: 16.x, 17.x, 18.x
- NVIDIA HPC Compilers: 23.9 - 24.5
- C: All libraries
- C++: All libraries via C interfaces
- Fortran: Selected libraries
  - GFortran ABI
  - NVPL BLAS, LAPACK, and ScaLAPACK provide `lp64` and `ilp64` integer ABIs
  - See the individual library samples documentation for further details

All libraries support the following OpenMP runtime libraries. See the individual library documentation for details and API extensions supporting nested parallelism.
- GCC: libgomp.so
- Clang: libomp.so
- NVHPC: libnvomp.so
NVPL provides standard BLACS interfaces for the following MPI distributions. See the NVPL ScaLAPACK Samples Documentation for details.
- MPICH: Runtime support for `>=mpich-4.0 && <mpich-4.2`
- OpenMPI-3.x
- OpenMPI-4.x
- OpenMPI-5.x
- NVIDIA HPC-X: Use the `openmpi4` BLACS interface
NVPL provides CMake Package Config files for each component library.
If NVPL was installed via the OS package manager under the `/usr` directory, the NVPL packages will already be on the default `CMAKE_PREFIX_PATH`. The `nvpl_ROOT` environment variable can be used to override the default search path and force finding nvpl under a specific prefix.
The `find_package()` command is used to find nvpl and any component libraries:

    find_package(nvpl)

Each NVPL component library found will print a brief status message with important locations.
- Variable `nvpl_FOUND` will be true if nvpl is successfully found.
- Variable `nvpl_VERSION` will contain the found version.
- Pass the `REQUIRED` keyword to raise an error if the `nvpl` package is not found.
- Regardless of the `COMPONENTS` keyword, all available nvpl component libraries installed in the same prefix will be found.
- To raise an error if a particular component is not found, use `REQUIRED COMPONENTS ...`
- Set `QUIET` to avoid printing status messages, or reporting an error if nvpl is not found.
- `find_package(nvpl)` can safely be called multiple times from different locations in a project.
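
The keywords and variables described above can be combined as in the following minimal sketch of a `CMakeLists.txt`. The project name `nvpl_example`, the minimum CMake version, and the choice of the `blas` and `lapack` components are assumptions made for illustration; the component names themselves come from the components table in this document.

```cmake
# Assumed minimum version; check the samples for the version they actually require.
cmake_minimum_required(VERSION 3.19)
project(nvpl_example LANGUAGES C)  # hypothetical project name

# REQUIRED COMPONENTS stops configuration with an error if nvpl itself,
# or either of the listed components, cannot be found.
find_package(nvpl REQUIRED COMPONENTS blas lapack)

# nvpl_FOUND and nvpl_VERSION are set by the nvpl package config files.
if(nvpl_FOUND)
  message(STATUS "Using NVPL version ${nvpl_VERSION}")
endif()
```

If NVPL lives outside the default search path, setting `nvpl_ROOT` in the environment before running `cmake` should direct this same `find_package()` call to that prefix.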
The NVPL component libraries provide Imported Interface Targets under the common `nvpl::` namespace. To add all the necessary flags to compile and link against NVPL libraries, use the `target_link_libraries()` command:

    target_link_libraries(my_target PUBLIC nvpl::<lib>_<opts>)

Here `<lib>` is the lowercase shorthand for the library and `<opts>` are defined by the library.
NVPL component and target names use an all-lowercase naming scheme. See the individual library documentation for details on available options.

Component | Targets | Options / Notes |
---|---|---|
blas | `nvpl::blas_<int>_<thr>` | `<int>`: `lp64`, `ilp64`; `<thr>`: `seq`, `omp` |
fft | `nvpl::fftw` | FFTW API interface |
lapack | `nvpl::lapack_<int>_<thr>` | `<int>`: `lp64`, `ilp64`; `<thr>`: `seq`, `omp` |
rand | `nvpl::rand`; `nvpl::rand_mt` | Single-threaded; multi-threaded (OpenMP) |
scalapack | `nvpl::blacs_<int>_<mpi>`; `nvpl::scalapack_<int>` | `<int>`: `lp64`, `ilp64`; `<mpi>`: `mpich`, `openmpi3`, `openmpi4`, `openmpi5` |
sparse | `nvpl::sparse` | |
tensor | `nvpl::tensor` | |
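
As one concrete reading of the table, the sketch below links an executable against NVPL BLAS. The target name `nvpl::blas_lp64_omp` is obtained by substituting `lp64` for `<int>` and `omp` for `<thr>` in the `blas` row; the source file `main.c` is a hypothetical placeholder.

```cmake
# Assumes find_package(nvpl REQUIRED COMPONENTS blas) has already succeeded
# earlier in the same project.
add_executable(my_target main.c)  # main.c is a hypothetical source file

# The imported target carries the include paths and link flags for
# NVPL BLAS with the lp64 integer ABI and OpenMP threading.
target_link_libraries(my_target PUBLIC nvpl::blas_lp64_omp)
```

Switching to the sequential or `ilp64` variant would then only require changing the suffixes of the target name, for example `nvpl::blas_ilp64_seq`.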
Each nvpl component library also exports the following variables:
- `nvpl_<comp>_VERSION` - Version of component library
- `nvpl_<comp>_INCLUDE_DIR` - Full path to component headers directory
- `nvpl_<comp>_LIBRARY_DIR` - Full path to component libraries directory
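
These variables can be inspected at configure time, for instance when debugging which installation was picked up. The sketch below assumes that `<comp>` is replaced by the lowercase component name (here `blas`), in line with the all-lowercase naming noted earlier.

```cmake
find_package(nvpl REQUIRED COMPONENTS blas)

# Assumption: <comp> expands to the lowercase component name "blas".
message(STATUS "NVPL BLAS version:     ${nvpl_blas_VERSION}")
message(STATUS "NVPL BLAS include dir: ${nvpl_blas_INCLUDE_DIR}")
message(STATUS "NVPL BLAS library dir: ${nvpl_blas_LIBRARY_DIR}")
```
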
These sample codes are provided under the NVIDIA Software license for NVPL SDK.