Bench_Grid_HiRep is a tool for building and launching Lattice Gauge Theory codes. It aims at extracting system information, assisting with system management, and testing/benchmarking different scenarios on various HPC clusters as well as local machines.
The code is modular and aims to
- build,
- launch, and
- collect and analyse results
from a remote setting across several machines simultaneously.
The code is currently driven in bash and works under both
- Windows 10, via Cygwin/X or WSL (Debian or Ubuntu); other emulation layers should also work,
- Linux
environments.
Some parts of the code are or will be written in C/C++/CUDA-C and are still under development. The driving part is written in bash and is portable across operating systems.
A Python interface will be developed at a later stage. The code may also be driven from the bash/PowerShell command line with [options]. The purpose of the Python environment is to provide a proper data-analysis tool for the collected data.
Requirements:
- Linux: GNU toolchain
- Windows: the built-in Visual Studio toolchain (clang)
- Make, CMake, and the CUDA toolkit (12.x preferably)
- Windows: Visual Studio 2019 Community edition
- Python (an install sketch follows below):
  - psycopg2 (PostgreSQL access)
  - psycopg[binary]
  - pyodbc (SQL access)
  - selenium (automated testing of websites)
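These can be installed in the usual way; as a minimal sketch, assuming a Python 3 virtual environment:
>$ python3 -m venv venv
>$ source venv/bin/activate
>$ pip install psycopg2 "psycopg[binary]" pyodbc selenium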
Right now the code is still under development and only some of the classes have been implemented. The main program is just a simple driver that will be rearranged later, once most of the primary classes have been developed.
Depending on the purpose, bash, Python, or C/C++/CUDA-C is used:
- For building, run in the project root:
>$ sh dispatcher_Grid_hiRep.sh
- For data analysis:
>$ python Bench_Grid_HiRep.py [--grid=Grid, --hirep=HiRep]
- For the C/C++ driver with GPU acceleration, built via Ninja (a fuller configure/build sketch follows below):
>$ cmake
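As a minimal sketch, assuming an out-of-source build directory named build, that Ninja and the CUDA toolkit are on the path, and an illustrative CUDA architecture value (adjust per GPU):
>$ cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_CUDA_ARCHITECTURES=80
>$ cmake --build build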
Either Python or bash may be used for complete automation at some point; this is still under development.
The code is available on GitHub via a private share.
- https:
https://github.com/fbonnet08/Bench_Grid_HiRep.git
- ssh:
git@github.com:fbonnet08/Bench_Grid_HiRep.git
- github cli:
gh repo clone fbonnet08/Bench_Grid_HiRep
The CUDA toolkit is used for handling some of the computationally expensive tasks and may be removed later if it turns out not to be needed; for the moment it stays. Some basic kernels and test code have been developed and inserted, and may be removed later as I see fit.
The code parses system information, such as details of the network infrastructure. This information is stored in data structures that can be passed around in the code for extraction and further use.
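Purely as an illustration of the kind of probing involved (the report file name and the exact commands are assumptions, not the tool's actual parsing logic):

    # sketch: gather basic system and network information into a plain-text report
    report="system_info_$(hostname).txt"
    {
      echo "== CPU ==";     lscpu
      echo "== Memory ==";  free -h
      echo "== Network =="; ip -brief addr
      echo "== GPUs ==";    nvidia-smi -L 2>/dev/null || echo "no NVIDIA GPU detected"
    } > "$report"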
SOMBRERO:
- Small, strong scaling, nodes=1, 2, 3, 4, 6, 8, 12
- Large, strong scaling; small/large, weak scaling, nodes=1, 2, 3, 4, 6, 8, 12, 16, 24, 32
- In all cases, fill the entire node (a launch sketch follows after this list)
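A minimal sketch of how the strong-scaling scan could be submitted with SLURM; the cores-per-node value and the SOMBRERO invocation below are hypothetical placeholders to adapt per machine and case:

    # sketch: one strong-scaling job per node count, filling each node with MPI ranks
    sombrero_cmd="./sombrero.sh -s small"     # hypothetical placeholder: actual SOMBRERO invocation
    cores_per_node=128                        # assumption: set to the target machine's core count
    for nodes in 1 2 3 4 6 8 12; do
      sbatch --nodes="$nodes" --ntasks-per-node="$cores_per_node" \
             --job-name="sombrero_small_${nodes}n" \
             --wrap "srun $sombrero_cmd"
    done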
BKeeper CPU:
- --grid 24.24.24.32, 1 node, vary ntasks per node, set threads per task to fill the node (see the sketch after this list)
- --grid 24.24.24.32, 1 node, fix optimal ntasks per node and threads per task, vary --mpi argument.
(From this we can identify which pattern of MPI filling optimises performance.
This is typically 1.1.1.2, 1.1.1.4, until we run out of factors,
and then 1.1.2.N, etc., but this should be verified.)
- --grid 24.24.24.32, use a sensible --mpi and ntasks-per-node, nodes=1, 2, 3, 4, 6, 8, 12, 16
- --grid 64.64.64.96, use a sensible --mpi and ntasks-per-node, nodes=1, 2, 3, 4, 6, 8, 12, 16, 24, 32
- --grid 24.24.24.{number of MPI ranks} --mpi 1.1.2.{number of MPI ranks/2}, nodes=1, 2, 3, 4, 6, 8, 12, 16, 24, 32
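A minimal sketch of the single-node tasks/threads sweep from the first item above; the binary path, the cores-per-node value, and tying --mpi to ntasks are assumptions (the --mpi decomposition must divide the grid extents):

    # sketch: on one node, vary ntasks per node and set OpenMP threads to fill the node
    cores_per_node=128                             # assumption
    for ntasks in 1 2 4 8 16 32; do                # chosen so that 1.1.1.N divides 24.24.24.32
      threads=$(( cores_per_node / ntasks ))
      sbatch --nodes=1 --ntasks-per-node="$ntasks" --cpus-per-task="$threads" \
             --wrap "export OMP_NUM_THREADS=$threads; srun ./BKeeper --grid 24.24.24.32 --mpi 1.1.1.$ntasks"
    done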
BKeeper GPU:
Note that when running BKeeper in parallel on GPUs, a wrapper script is needed to place processes on the correct device.
This will vary from machine to machine; an example for Tursa is attached, and a generic sketch is given after the list below.
- --grid 24.24.24.32, use a sensible --mpi from above, one task per GPU, nodes=1, 2, 3, 4, 6, 8, 12, 16
- --grid 64.64.64.96, use a sensible --mpi from above, one task per GPU, nodes=1, 2, 3, 4, 6, 8, 12, 16, 24, 32
- --grid 24.24.24.{number of MPI ranks} --mpi 1.1.2.{number of MPI ranks/2}, nodes=1, 2, 3, 4, 6, 8, 12, 16, 24, 32
- --grid 48.48.48.64 --mpi 1.1.1.4, scan GPU clock frequency
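The Tursa example is not reproduced here; as a generic sketch only, a minimal binding wrapper could map each rank's node-local ID to a device (the GPUs-per-node value and the environment variables consulted are assumptions that depend on the launcher):

    #!/usr/bin/env bash
    # sketch: bind each MPI rank to a single GPU based on its node-local rank
    local_rank=${SLURM_LOCALID:-${OMPI_COMM_WORLD_LOCAL_RANK:-0}}
    gpus_per_node=4                                # assumption: adjust to the node layout
    export CUDA_VISIBLE_DEVICES=$(( local_rank % gpus_per_node ))
    exec "$@"                                      # run the actual benchmark command

Such a wrapper would then be invoked as, for example, srun ./gpu_wrapper.sh ./BKeeper --grid 24.24.24.32 --mpi ... with one task per GPU.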
Grid GPU:
tests/sp2n/Test_hmc_Sp_WF_2_Fund_3_2AS.cc, using a thermalised starting configuration
- --grid 32.32.32.64, use a sensible --mpi, one task per GPU, nodes=1, 2, 4, 8, 16, 32 (a launch sketch follows below)
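A minimal launch sketch for one point of the scan, reusing the wrapper sketch above; the node and GPU counts, the --mpi decomposition, and the --accelerator-threads value are assumptions, the test is assumed to build to an executable of the same name, and the handling of the thermalised starting configuration is not shown:

    # sketch: 16 nodes, 4 GPUs per node, one MPI task per GPU (64 ranks in total)
    srun --nodes=16 --ntasks-per-node=4 --gpus-per-node=4 \
         ./gpu_wrapper.sh ./Test_hmc_Sp_WF_2_Fund_3_2AS \
         --grid 32.32.32.64 --mpi 2.2.4.4 --accelerator-threads 8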
HiRep LLR HMC CPU:
- Weak scaling, 1 rank per replica, number of replicas = total number of CPU cores (varies with platform),
nodes=1, 2, 3, 4 (a weak-scaling sketch follows after this list)
- Strong scaling, number of CPU cores per node = number of replicas;
total number of CPU cores = number of replicas * number of domains per replica, nodes=1, 2, 3, 4, 6, 8
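As a hedged sketch of the weak-scaling bookkeeping only: the executable name llr_hmc, the per-run input-file naming, and the cores-per-node value below are assumptions; in practice the replica count and decomposition are configured in the HiRep input file rather than on the command line:

    # sketch: weak scaling, one MPI rank per replica, replicas = total CPU cores
    cores_per_node=128                              # assumption: varies with platform
    for nodes in 1 2 3 4; do
      replicas=$(( nodes * cores_per_node ))
      sbatch --nodes="$nodes" --ntasks-per-node="$cores_per_node" \
             --wrap "srun ./llr_hmc -i input_llr_${replicas}_replicas"   # hypothetical input naming
    done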
On EuroHPC machines we absolutely need to get the latter two tests done so that we can prepare a convincing application. On other machines we only need SOMBRERO and BKeeper.
Right now the code is not documented, but it will be once it matures a little.