v4.0.0

SuperScientificSoftwareLaboratory · Jul 23, 2024 · fe8024f
1 parent 7e6d2fd
Showing 57 changed files with 5,719 additions and 3,314 deletions.
diff --git a/Makefile b/Makefile
@@ -1,16 +1,18 @@
 all:
 	(cd src; make)
 	(cd lib; make)
-	(cd test; make)
+	(cd examples; make)
 src:
 	(cd src; make)
 
 lib:
 	(cd lib; make)
 
-test:
-	(cd test; make)
+examples:
+	(cd examples; make)
 clean:
 	(cd src; make clean)
 	(cd lib; make clean)
-	(cd test; make clean)
+	(cd examples; make clean)
+
+update:clean all
diff --git a/README.md b/README.md
@@ -11,8 +11,8 @@ PanguLU is an open source software package for solving a linear system *Ax = b*
 ```
 PanguLU/README      instructions on installation
 PanguLU/src         C and CUDA source code, to be compiled into libpangulu.a and libpangulu.so
-PanguLU/test        testing code
-PanguLU/icnlude     contains headers archieve libpangulu.a and libpangulu.so
+PanguLU/examples    example code
+PanguLU/include     contains headers archieve libpangulu.a and libpangulu.so
 PanguLU/lib         contains library archieve libpangulu.a and libpangulu.so
 PanguLU/Makefile    top-level Makefile that does installation and testing
 PanguLU/make.inc    compiler, compiler flags included in all Makefiles
@@ -23,17 +23,28 @@ we use the method is to use make automatic build system.
 installation method:
 You will need install make.
 Frist, in order to use MPI, you need to install mpich (recommended version: OpenMPI-4.1.2).
-Second, in order to use NVCC, you need to install CUDA (recommended version: CUDA-12.1).
+Second, if GPUs are used, NVCC is required. To use NVCC, you need to install CUDA (recommended version: CUDA-12.2).
 Third, Specify the installation path to be used in make.inc.
 Fianlly, use make for automatic installation.
 > **make**
 
 ## Compilation options
-One type of compilation are provided.
+Three compilation options are provided.
 
 
-1.You need to open the GPU run option in pangulu_common.h:
-> **#define   GPU_OPEN**
+1 If you want to disable GPU:
+
+1.1 Remove **-DGPU_OPEN** of variable PANGULU_FLAGS in **make.inc**;
+
+1.2 Remove **GPU_CUDA** in file **build_list.csv**.
+
+2 If you want to solve complex matrices:
+
+2.1 Append **-DCALCULATE_TYPE_R64** after variable PANGULU_FLAGS in **make.inc**;
+
+2.2 Use driver routine **driver_cr64.cpp** in directory **examples**.
+
+Note: Solving complex matrices on GPU is not supported in this version.
 
 ## Preprocess methods
 Now offering two types of preprocessing options.
@@ -52,52 +63,42 @@ You will need the following two actions:
 
 note: METIS needs to be 64-bit.
 
-## Calculation Type
-PanguLU currently offer two types of accuracy.
-
-### Double
-If you want use double in calculation, You need to change the calculation type in pangulu_common.h:
->**#define calculate_type double**
-
-Then you need to open the MPI_double option in pangulu_common.h:
->**#define MPI_VAL_TYPE MPI_DOUBLE**
-
-
-### Float
-If you want use float in calculation, You need to change the calculation type in pangulu_common.h:
->**#define calculate_type float**
-
-Then you need to open the MPI_float option in pangulu_common.h:
->**#define MPI_VAL_TYPE MPI_FLOAT**
-
-
 ## Execution of PanguLU
-PanguLU is to complete the operation of solving *Ax = b*, and the test file is placed in the test folder. The test is to first perform the LU numeric decomposition of the matrix test.mtx, and use *Ly = b* to complete the lower triangular solution and *Ux = y* to complete the upper triangular solution test method.
+PanguLU solves the linear system *Ax = b*, and the test files are placed in the **examples** folder. The driver_r64.cpp file first performs the numeric LU factorisation of the matrix test.mtx, then solves *Ly = b* (lower triangular solve) and *Ux = y* (upper triangular solve).
 ### run command
 
-> **mpirun -np process_number ./PanguLU -NB NB_number -F Smatrix_name**
+> **mpirun -np process_count ./pangulu_driver.elf -NB NB_number -F Smatrix_name**
  
-process_number : this process number 
-NB_number : the number of processes required is equal to the product of P and Q; 
-Smatrix_name : the Matrix name in csr format.(This matrix needs to be decomposed symbolically)
+process_count : number of MPI processes used to launch PanguLU;
+
+NB_number : block size of each non-zero block;
+
+Smatrix_name : name of the matrix file in mtx format. (This matrix needs to be decomposed symbolically)
 
 You can also use the run.sh, for example:
 
 > **bash run Smatrix_name NB_number process_number**
 
 ### test sample
 
-> **mpirun -np 6 ./PanguLU -NB 2 -F test.mtx**
+> **mpirun -np 6 ./pangulu_driver.elf -NB 2 -F test.mtx**
 
 or use the run.sh:
 > **bash run.sh test.mtx 2 6**
 
 
-In this example,six processes are used to test, the  NB_number is 2 ,P_number is 2,Q_number is 3, matrix name is test.mtx
+In this example, six processes are used, NB_number is 2, and the matrix name is test.mtx.
 
 
 ## Release versions
 
+#### <p align='left'>Version 4.0.0 (Jul. 24, 2024) </p>
+
+* Optimized user interfaces of solver routines;
+* Optimized performance of numeric factorisation phase on CPU platform;
+* Added support for solving complex matrices;
+* Optimized preprocessing performance;
+
 #### <p align='left'>Version 3.5.0 (Aug. 06, 2023) </p>
 
 * Updated the pre-processing phase with OpenMP.
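
Reviewer note: the README hunk above documents the run command `mpirun -np process_count ./pangulu_driver.elf -NB NB_number -F Smatrix_name`. The following is a minimal sketch of assembling that command line programmatically. The helper name `build_run_command` and its argument checks are hypothetical and not part of PanguLU; only the flags shown in the README are used.

```python
import shlex


def build_run_command(matrix_path, block_size, process_count,
                      driver="./pangulu_driver.elf"):
    """Return the argv list for the run command documented in the README.

    Hypothetical helper: the flag names (-np, -NB, -F) come from the README,
    everything else is an illustrative assumption.
    """
    if process_count < 1:
        raise ValueError("process_count must be at least 1")
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    return [
        "mpirun", "-np", str(process_count),
        driver,
        "-NB", str(block_size),
        "-F", matrix_path,
    ]


if __name__ == "__main__":
    # Mirrors the README test sample: six processes, NB_number 2, test.mtx.
    print(shlex.join(build_run_command("test.mtx", 2, 6)))
```

Returning a list rather than a single string lets the result be handed to `subprocess.run` without shell quoting concerns.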

diff --git a/build_helper.py b/build_helper.py
@@ -0,0 +1,79 @@
+#!/usr/bin/python3
+import csv
+import os
+import sys
+import subprocess
+
+def generate_platform_names(build_list_path, platform_list_path):
+    build_name_list = []
+    with open(build_list_path, "r") as f:
+        build_reader = csv.reader(f)
+        for build_item in build_reader:
+            if len(build_item) < 1:
+                continue
+            build_name_list.append(build_item[0])
+
+    platform_list = []
+    with open(platform_list_path, "r") as f:
+        platform_reader = csv.reader(f)
+        for platform_item in platform_reader:
+            platform_list.append(platform_item)
+
+    build_name_list_ret = []
+    for name in build_name_list:
+        for platform in platform_list:
+            if len(platform) < 2:
+                continue
+            if platform[1] == name:
+                build_name_list_ret.append(platform)
+                break
+    return build_name_list_ret
+
+
+def generate_platform_paths(build_platform_names, platform_list_path):
+    platform_paths = []
+    for platform in build_platform_names:
+        platform_id = platform[0]
+        assert(len(platform_id) == 7)
+        platform_id_l1 = platform_id[0:2]
+        platform_id_l2 = platform_id[2:4]
+        platform_id_l3 = platform_id[4:7]
+        dir_l1 = None
+        dir_l2 = None
+        dir_l3 = None
+        dirs_l1 = [file for file in os.listdir(os.path.dirname(platform_list_path))]
+        for current_dir_l1 in dirs_l1:
+            if current_dir_l1[:2] == platform_id_l1:
+                dir_l1 = current_dir_l1
+                break
+        dirs_l2 = [file for file in os.listdir(os.path.join(os.path.dirname(platform_list_path), dir_l1))]
+        for current_dir_l2 in dirs_l2:
+            if current_dir_l2[:2] == platform_id_l2:
+                dir_l2 = current_dir_l2
+                break
+        dirs_l3 = [file for file in os.listdir(os.path.join(os.path.dirname(platform_list_path), dir_l1, dir_l2))]
+        for current_dir_l3 in dirs_l3:
+            if current_dir_l3[:3] == platform_id_l3:
+                dir_l3 = current_dir_l3
+                break
+        platform_paths.append([platform_id, f"platforms/{dir_l1}/{dir_l2}/{dir_l3}"])
+    return platform_paths
+
+
+def compile_platform_code(build_list_path, platform_list_path):
+    build_platform_names = generate_platform_names(build_list_path, platform_list_path)
+    build_platform_paths = generate_platform_paths(build_platform_names, platform_list_path)
+    for build_platform_path in build_platform_paths:
+        command = f"make -C src/{build_platform_path[1]}"
+        print(command)
+        return_code = subprocess.call(command.split())
+        if return_code != 0:
+            sys.exit(return_code)
+
+
+if __name__ == "__main__":
+    if len(sys.argv) > 1 and sys.argv[1] == "compile_platform_code":
+        compile_platform_code("build_list.csv", "src/platforms/platform_list.csv")
+    else:
+        print("[BUILD_HELPER_ERROR] Unknown or missing command.")
+        sys.exit(1)
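
Reviewer note: `generate_platform_paths` above splits a 7-character platform ID into 2 + 2 + 3 character prefixes and matches each against a directory level. If a level has no matching directory, `dir_l1` (or `dir_l2`) stays `None` and the next `os.path.join` raises a `TypeError`. The sketch below shows the same prefix lookup with an explicit error instead. The function names and the directory names in the demo are hypothetical, and entries are sorted for determinism, which the original does not do.

```python
import os
import tempfile


def find_by_prefix(parent, prefix):
    """Return the first (sorted) entry of `parent` starting with `prefix`, or None."""
    for name in sorted(os.listdir(parent)):
        if name.startswith(prefix):
            return name
    return None


def resolve_platform_path(platforms_root, platform_id):
    """Map a 7-character platform ID to 'platforms/<l1>/<l2>/<l3>'.

    Raises ValueError for a malformed ID and FileNotFoundError when a
    directory level has no entry with the expected prefix.
    """
    if len(platform_id) != 7:
        raise ValueError(f"platform ID must have 7 characters: {platform_id!r}")
    prefixes = (platform_id[0:2], platform_id[2:4], platform_id[4:7])
    current = platforms_root
    parts = []
    for prefix in prefixes:
        match = find_by_prefix(current, prefix)
        if match is None:
            raise FileNotFoundError(
                f"no entry starting with {prefix!r} in {current}")
        parts.append(match)
        current = os.path.join(current, match)
    return "/".join(["platforms"] + parts)


if __name__ == "__main__":
    # Demo on a throwaway tree; the directory names are made up.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "02_GPU", "01_CUDA", "000_CUDA"))
        print(resolve_platform_path(root, "0201000"))
```

Failing early with the offending prefix and directory in the message makes a mismatch between `platform_list.csv` and the directory tree easier to diagnose than a `TypeError` from `os.path.join`.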
diff --git a/build_list.csv b/build_list.csv
@@ -0,0 +1 @@
+GPU_CUDA
diff --git a/examples/Makefile b/examples/Makefile
@@ -0,0 +1,12 @@
+LINK_METIS = /path/to/libparmetis.a /path/to/libmetis.a /path/to/libGKlib.a
+OPENBLAS_LIB = /path/to/libopenblas.a
+LINK_CUDA = -L/usr/local/cuda-12.2/lib64 -lcudart -lcusparse
+LINK_PANGULU = ../lib/libpangulu.a # Directly passing the static library as a compiler input makes the dynamic library loader search the directory of the static library.
+
+all: pangulu_driver.elf
+
+pangulu_driver.elf:driver_r64.cpp
+	mpic++ -O3 $< -I../include $(LINK_PANGULU) $(LINK_CUDA) $(LINK_METIS) $(OPENBLAS_LIB) -fopenmp -o $@
+
+clean:
+	rm -f *.elf *.tsv