Author: Damodar Rajbhandari (2023-Jan-01 - Last Update: 2023-Feb-21)
# Shifted-inverse power method using cuSolver
make
./maxeigenvalue mtxs/L11.mtx
# Power method using cuSparse and thrust
make mainpower
./maxeigenvaluepower mtxs/L11.mtx
# Compile Shifted-inverse power method, power method, and Spectra library
make all
- nvcc compiler supporting c++14 to compile the codebase
- cuSparse (in CUDA Toolkit) for sparse matrix operations
- cuSolver (in CUDA Toolkit) to find the largest eigenvalue
- thrust (in CUDA Toolkit) for parallel data structures and algorithms
- NVIDIA Nsight for vscode extension (Optional) for debugging and profiling purposes
- Code was run on an NVIDIA GeForce RTX 2080 Ti with CUDA version 11.7. Check yours using the `nvidia-smi -q` command.
- Used Spectra version 1.0.1 on top of Eigen3 version 3.4.0
- Code was run on an Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz with 125GB RAM on Arch GNU/Linux x86-64 with Linux kernel 5.15.41-1-lts. Check yours using these commands: `lscpu` to get CPU details, `free -g -h -t` to get RAM details, and `cat /etc/os-release` to get OS details.
make mainspectra
./maxeigenvaluespectra mtxs/dL22.mtx
- Please install hyperfine, a command-line benchmarking tool.
  - It can be installed via `conda` from the `conda-forge` channel: `conda install -c conda-forge hyperfine`
Here are the results:
- Using the power method on GPU (unoptimized code, without a convergence tolerance; see `computeMaxEigenvaluePowerMethod`): `hyperfine './maxeigenvaluepower mtxs/dL22.mtx'`
  - Results:
    - Benchmark 1: `./maxeigenvaluepower mtxs/dL22.mtx`
    - Time (mean ± σ): 14.282 s ± 0.043 s [User: 12.608 s, System: 1.569 s]
    - Range (min … max): 14.241 s … 14.373 s (10 runs)
- Using the power method on GPU (optimized code, with a convergence tolerance; see `computeMaxEigenvaluePowerMethodOptimized`): `hyperfine './maxeigenvaluepower mtxs/dL22.mtx'`
  - Results:
    - Benchmark 1: `./maxeigenvaluepower mtxs/dL22.mtx`
    - Time (mean ± σ): 13.782 s ± 0.038 s [User: 12.112 s, System: 1.569 s]
    - Range (min … max): 13.726 s … 13.873 s (10 runs)
- Using the Spectra library on CPU: `hyperfine './maxeigenvaluespectra mtxs/dL22.mtx'`
  - Results:
    - Benchmark 1: `./maxeigenvaluespectra mtxs/dL22.mtx`
    - Time (mean ± σ): 2.485 s ± 0.012 s [User: 2.478 s, System: 0.007 s]
    - Range (min … max): 2.466 s … 2.506 s (10 runs)
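
To put the mean times in proportion, the ratios below divide each GPU power-method mean by the Spectra mean. They use only the numbers reported above. hyperfine times the whole command, so these are whole-process wall-clock ratios (including any work the program does outside the eigenvalue iteration), not kernel-only comparisons.

$$
\frac{14.282\ \text{s}}{2.485\ \text{s}} \approx 5.7,
\qquad
\frac{13.782\ \text{s}}{2.485\ \text{s}} \approx 5.5
$$

On this machine and this matrix, the Spectra run therefore finishes roughly 5.5 to 5.7 times sooner than either power-method run.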
- `maxeigenvalue`, which is based on `cusolverSpScsreigvs`, does not work for larger matrices, for example `mtxs/dL22.mtx`.
- `maxeigenvaluepower` does not work for `mtxs/dL00.mtx` or `mtxs/L00.mtx` if you use `computeMaxEigenvaluePowerMethod`, but it works fine for the other matrices in the `mtxs/` directory. This is because every element of the initial vector `x_i` is set to 1.0. For some matrices, this initial vector can be orthogonal to the dominant eigenvector, in which case the iteration never picks up that direction. Ideally, the initial vector should be random, with unit norm and mostly non-zero entries (because $Ax = 0$ if $x$ is $0$); this reduces the chance that it is orthogonal to the eigenvector. This is done in the `computeMaxEigenvaluePowerMethodOptimized` function.
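
The effect of the starting vector can be reproduced on the CPU with a tiny dense matrix. The following is a minimal sketch, not the code in this repository: the names `powerMethod`, `randomUnitVector`, `exampleMatrix`, and `PowerResult` are hypothetical, it uses plain C++ instead of cuSparse and thrust, and the 3x3 matrix (the Laplacian of a 3-node path graph, with eigenvalues 0, 1, and 3) is an assumed example chosen because the all-ones vector lies in its null space. It is not claimed to resemble `dL00.mtx` or `L00.mtx`.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // small dense matrix, for illustration only

struct PowerResult {
    double lambda;   // Rayleigh-quotient estimate of the dominant eigenvalue
    bool converged;  // false if the iterate collapsed to zero or maxIter was hit
};

double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// y = A * x
Vec matVec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Assumed example: Laplacian of a 3-node path graph (eigenvalues 0, 1, 3).
Mat exampleMatrix() {
    return Mat{Vec{1.0, -1.0, 0.0}, Vec{-1.0, 2.0, -1.0}, Vec{0.0, -1.0, 1.0}};
}

// Random vector with entries in [-1, 1), scaled to unit norm.
Vec randomUnitVector(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    Vec x(n);
    for (double& v : x) v = dist(gen);
    const double norm = std::sqrt(dot(x, x));
    for (double& v : x) v /= norm;
    return x;
}

// Power iteration with a tolerance on successive eigenvalue estimates.
PowerResult powerMethod(const Mat& A, Vec x, double tol = 1e-12, int maxIter = 1000) {
    const double norm0 = std::sqrt(dot(x, x));
    for (double& v : x) v /= norm0;  // start from a unit vector
    double lambda = 0.0;
    for (int k = 0; k < maxIter; ++k) {
        const Vec y = matVec(A, x);
        const double lambdaNew = dot(x, y);  // Rayleigh quotient, since ||x|| = 1
        const double ynorm = std::sqrt(dot(y, y));
        if (ynorm == 0.0) return {0.0, false};  // A*x = 0: nothing left to iterate on
        for (std::size_t i = 0; i < x.size(); ++i) x[i] = y[i] / ynorm;
        if (k > 0 && std::fabs(lambdaNew - lambda) < tol) return {lambdaNew, true};
        lambda = lambdaNew;
    }
    return {lambda, false};
}
```

Calling `powerMethod(exampleMatrix(), Vec(3, 1.0))` stops immediately with `converged == false`, because the all-ones vector is mapped to zero. Calling `powerMethod(exampleMatrix(), randomUnitVector(3, 42))` should converge to an estimate close to 3. The seed `42` is arbitrary.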