Use EESSI module in EESSI-install-software.sh #790

Open
wants to merge 14 commits into
base: 2023.06-software.eessi.io

casparvl
Collaborator

No description provided.

eessi-bot bot commented Oct 16, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

eessi-bot bot commented Oct 16, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software, eessi.io-2023.06-compat

@casparvl casparvl changed the title Use module in eessi install Use EESSI module in EESSI-install-software.sh Oct 16, 2024
@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot
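
The expansion the bot reports above (`repo:` to `repository:`, `arch:` to `architecture:`) can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function name `expand_bot_command` and the alias table are assumptions based solely on the two short forms that appear in this thread.

```python
# Hypothetical sketch: expand short argument names in a bot command.
# The alias table only covers the two short forms seen in this thread.
ALIASES = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(command: str) -> str:
    """Return the command with known short argument names expanded."""
    action, *args = command.split()
    expanded = []
    for arg in args:
        key, sep, value = arg.partition(":")
        # Only rewrite 'key:value' arguments whose key is a known alias.
        if sep and key in ALIASES:
            key = ALIASES[key]
        expanded.append(f"{key}{sep}{value}")
    return " ".join([action, *expanded])


print(expand_bot_command("build repo:eessi.io-2023.06-software arch:zen2"))
# build repository:eessi.io-2023.06-software architecture:zen2
```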

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 16, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/23588

| date | job status | comment |
| --- | --- | --- |
| Oct 16 08:50:59 UTC 2024 | submitted | job id 23588 awaits release by job manager |
| Oct 16 08:51:23 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 16 09:54:36 UTC 2024 | running | job 23588 is running |
| Oct 16 10:00:42 UTC 2024 | finished | 😢 FAILURE (click triangle for details) |

Details
✅ job output file slurm-23588.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.

| Oct 16 10:00:42 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 433.066 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 433.174 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 5.93 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.76 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 10.85 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.43 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.34 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.33 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7728.22 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7814.82 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-23588.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
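
The build checks in the Details block above amount to searching the job output for a handful of messages. The sketch below illustrates that idea under stated assumptions: the patterns are copied from the bot's report, but plain substring matching and the function names `check_build_output` and `build_succeeded` are hypothetical, and the real bot may decide differently.

```python
# Hypothetical sketch of the build checks shown in the Details blocks.
# Patterns are copied from the bot's report; plain substring matching
# is an assumption, the real bot may match differently.
MUST_BE_ABSENT = ["ERROR:", "FAILED:", "required modules missing:"]
MUST_BE_PRESENT = ["No missing installations", ".tar.gz created!"]


def check_build_output(text: str) -> dict:
    """Map each pattern to True if its check passes for this output."""
    results = {}
    for pattern in MUST_BE_ABSENT:
        results[pattern] = pattern not in text
    for pattern in MUST_BE_PRESENT:
        results[pattern] = pattern in text
    return results


def build_succeeded(text: str) -> bool:
    """The build counts as successful only if every check passes."""
    return all(check_build_output(text).values())


# Made-up job output, shaped like the failing runs in this thread.
sample = "ERROR: something went wrong\n/tmp/example.tar.gz created!\n"
for pattern, ok in check_build_output(sample).items():
    print("PASS" if ok else "FAIL", pattern)
print("overall:", "SUCCESS" if build_succeeded(sample) else "FAILURE")
```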

@ocaisa
Member

ocaisa commented Oct 16, 2024

@casparvl The EESSI module had a few issues; I've added quite a few changes and a lot of additional CI in #781.

(It's not that it didn't work, but unloading it did not actually give back the original environment!)
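
One way to check the kind of problem described here (an environment that is not restored after unloading a module) is to compare a snapshot taken before loading with one taken after unloading. The sketch below only shows the comparison step; `env_diff` and the sample snapshots are hypothetical and are not taken from #781.

```python
# Hypothetical sketch: compare two environment snapshots (for example,
# one taken before loading a module and one taken after unloading it).
def env_diff(before: dict, after: dict) -> dict:
    """Return variables that were added, removed, or changed."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Made-up snapshots for illustration; real ones could come from
# dict(os.environ) or from parsing the output of `env`.
before = {"PATH": "/usr/bin", "HOME": "/home/user"}
after = {"PATH": "/opt/example/bin:/usr/bin", "HOME": "/home/user", "EXAMPLE_VAR": "1"}

diff = env_diff(before, after)
print(diff)
# The environment is restored only if all three groups are empty.
restored = not any(diff.values())
print("environment restored:", restored)
```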

@casparvl
Collaborator Author

casparvl commented Oct 16, 2024

Ghehe, is that a polite way of saying you would like a review? ;-)

I don't think it will outright affect the use in EESSI-install-software.sh in this PR, since we don't unload it here. But I'll put it on my radar - I want to review the cuDNN one first though...

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 16, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/23589

| date | job status | comment |
| --- | --- | --- |
| Oct 16 09:28:55 UTC 2024 | submitted | job id 23589 awaits release by job manager |
| Oct 16 09:29:22 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 16 09:30:24 UTC 2024 | running | job 23589 is running |
| Oct 16 09:36:31 UTC 2024 | finished | 😢 FAILURE (click triangle for details) |

Details
✅ job output file slurm-23589.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.

| Oct 16 09:36:31 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 436.353 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 435.814 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.82 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.56 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.92 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.72 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.33 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.3 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7769.14 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7687.88 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-23589.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
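
To compare the ReFrame numbers between bot runs, the summary lines can be parsed into records. The parser below is a rough sketch: the regular expressions are derived only from the lines quoted in this thread, and `parse_summary` is a hypothetical helper, not part of ReFrame or the bot.

```python
import re

# Hypothetical parser for the ReFrame summary lines quoted in this thread.
# The regular expressions are derived from those lines only.
TEST_LINE = re.compile(
    r"^\[\s*OK\s*\]\s*\(\s*(\d+)/(\d+)\)\s+(\S+)\s.*%module_name=(\S+)"
)
PERF_LINE = re.compile(r"^P:\s*(\w+):\s*([\d.]+)\s*(\S+)")


def parse_summary(text):
    """Return one dict per passed test with its performance value."""
    results = []
    current = None
    for line in text.splitlines():
        test = TEST_LINE.match(line)
        if test:
            current = {"name": test.group(3), "module": test.group(4)}
            continue
        perf = PERF_LINE.match(line)
        if perf and current is not None:
            current.update(
                metric=perf.group(1),
                value=float(perf.group(2)),
                unit=perf.group(3),
            )
            results.append(current)
            current = None
    return results


# Two test/performance line pairs copied from the job 23589 summary above.
sample = """\
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 436.353 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.33 us (r:0, l:None, u:None)
"""

for record in parse_summary(sample):
    print(record["module"], record["metric"], record["value"], record["unit"])
```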

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 16, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/23602

| date | job status | comment |
| --- | --- | --- |
| Oct 16 14:32:08 UTC 2024 | submitted | job id 23602 awaits release by job manager |
| Oct 16 14:32:22 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 16 14:33:24 UTC 2024 | running | job 23602 is running |
| Oct 16 14:39:33 UTC 2024 | finished | 😢 FAILURE (click triangle for details) |

Details
✅ job output file slurm-23602.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.

| Oct 16 14:39:33 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 441.162 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 436.124 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.7 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.51 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 11.18 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.4 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.34 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.31 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7887.22 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7916.31 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-23602.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

…tware-layer into use_module_in_eessi_install
@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 21, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

eessi-bot bot commented Oct 21, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 21, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/24404

| date | job status | comment |
| --- | --- | --- |
| Oct 21 13:21:28 UTC 2024 | submitted | job id 24404 awaits release by job manager |
| Oct 21 13:22:23 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 21 14:53:34 UTC 2024 | running | job 24404 is running |
| Oct 21 15:01:45 UTC 2024 | finished | 😁 SUCCESS (click triangle for details) |

Details
✅ job output file slurm-24404.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1729522502.tar.gz
size: 0 MiB (3659 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh

| Oct 21 15:01:45 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 440.612 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 448.965 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.77 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.19 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 9.06 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.31 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.31 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.31 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7818.35 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7880.45 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-24404.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Collaborator Author

Hmm...

size: 0 MiB (3659 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh

That's not the artefacts list I expected :D Yes, this script should be reinstalled. But it should at the very least also install the patchelf I listed in the EasyStack file. Looking at the logs:

== Running parse hook for patchelf-0.18.0-GCCcore-13.2.0.eb...
== Running parse hook for GCCcore-13.2.0.eb...
== patchelf/0.18.0-GCCcore-13.2.0 is already installed (module found), skipping
== No easyconfigs left to be built.
== Build succeeded for 0 out of 0

Ah, that explains it! Someone already installed my favorite dummy package that I like to try ;-)
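
An empty tarball like this can be spotted early by scanning the EasyBuild output for "already installed" messages. The helper below is a small sketch: `skipped_modules` is a hypothetical name, and the regular expression is derived only from the log line quoted above, so other EasyBuild versions may word the message differently.

```python
import re

# Hypothetical helper: list modules that EasyBuild reported as already
# installed. The regular expression is derived from the log line quoted
# in this thread; other EasyBuild versions may word it differently.
SKIPPED = re.compile(r"^== (\S+) is already installed \(module found\), skipping$")


def skipped_modules(log_text: str) -> list:
    """Return module names from 'already installed ... skipping' lines."""
    found = []
    for line in log_text.splitlines():
        match = SKIPPED.match(line.strip())
        if match:
            found.append(match.group(1))
    return found


# Log excerpt copied from the comment above.
log = """\
== Running parse hook for patchelf-0.18.0-GCCcore-13.2.0.eb...
== Running parse hook for GCCcore-13.2.0.eb...
== patchelf/0.18.0-GCCcore-13.2.0 is already installed (module found), skipping
== No easyconfigs left to be built.
== Build succeeded for 0 out of 0
"""
print(skipped_modules(log))  # ['patchelf/0.18.0-GCCcore-13.2.0']
```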

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl
    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 22, 2024

# Note we need to do this after running install_cuda_and_libraries, since that does installations in the EESSI_SITE_INSTALL 
error: patch failed: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.4-2023a.yml:9
error: easystacks/software.eessi.io/2023.06/eessi-2023.06-eb-4.9.4-2023a.yml: patch does not apply

Unable to download or merge changes between the source branch and the destination branch.
Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts.

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 22, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/24701

| date | job status | comment |
| --- | --- | --- |
| Oct 22 06:18:34 UTC 2024 | submitted | job id 24701 awaits release by job manager |
| Oct 22 06:18:59 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 22 06:26:01 UTC 2024 | running | job 24701 is running |
| Oct 22 06:37:17 UTC 2024 | finished | 😁 SUCCESS (click triangle for details) |

Details
✅ job output file slurm-24701.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1729578614.tar.gz
size: 7 MiB (8113119 bytes)
entries: 508
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
util-linux/2.39-GCCcore-13.2.0.lua
software under 2023.06/software/linux/x86_64/amd/zen2/software
util-linux/2.39-GCCcore-13.2.0
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh

| Oct 22 06:37:17 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 434.84 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 439.021 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.97 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 6.02 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 9.18 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 10.83 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.34 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.31 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7442.93 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7301.12 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-24701.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
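
The artefact summary above groups the tarball entries into modules, software and other. A similar grouping can be reproduced locally with Python's `tarfile` module. In the sketch below the path prefix is taken from this thread, while the grouping rules, the helper names `classify` and `list_tarball`, and the tiny in-memory tarball are assumptions made for illustration.

```python
import io
import tarfile

# Hypothetical sketch: group tarball entries the way the bot's artefact
# summary does (modules, software, other). The prefix is taken from this
# thread; the grouping rules are an assumption.
PREFIX = "2023.06/software/linux/x86_64/amd/zen2"


def classify(names):
    """Sort entry names into modules, software and other."""
    groups = {"modules": [], "software": [], "other": []}
    for name in names:
        if name.startswith(PREFIX + "/modules/all/"):
            groups["modules"].append(name[len(PREFIX) + len("/modules/all/"):])
        elif name.startswith(PREFIX + "/software/"):
            groups["software"].append(name[len(PREFIX) + len("/software/"):])
        else:
            groups["other"].append(name)
    return groups


def list_tarball(fileobj):
    """Return the names of regular files in a gzip-compressed tarball."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return [m.name for m in tar.getmembers() if m.isfile()]


# Build a tiny in-memory tarball with made-up contents to exercise the code.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    for name in [
        PREFIX + "/modules/all/util-linux/2.39-GCCcore-13.2.0.lua",
        PREFIX + "/software/util-linux/2.39-GCCcore-13.2.0/bin/example",
        "2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh",
    ]:
        info = tarfile.TarInfo(name)
        info.size = 0  # empty placeholder files
        tar.addfile(info)
buffer.seek(0)

print(classify(list_tarball(buffer)))
```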

@casparvl
Collaborator Author

bot: build repo:eessi.io-2023.06-software arch:zen2

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented Oct 22, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:zen2
  • handling command build repository:eessi.io-2023.06-software architecture:zen2 resulted in:

    • no jobs were submitted

eessi-bot bot commented Oct 22, 2024

New job on instance eessi-bot-mc-aws for CPU micro-architecture x86_64-amd-zen2 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.10/pr_790/24702

| date | job status | comment |
| --- | --- | --- |
| Oct 22 07:09:51 UTC 2024 | submitted | job id 24702 awaits release by job manager |
| Oct 22 07:10:23 UTC 2024 | released | job awaits launch by Slurm scheduler |
| Oct 22 07:11:25 UTC 2024 | running | job 24702 is running |
| Oct 22 07:17:35 UTC 2024 | finished | 😁 SUCCESS (click triangle for details) |

Details
✅ job output file slurm-24702.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1729581072.tar.gz
size: 0 MiB (3659 bytes)
entries: 1
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
2023.06/scripts/gpu_support/nvidia/install_cuda_and_libraries.sh

| Oct 22 07:17:35 UTC 2024 | test result | 😁 SUCCESS (click triangle for details) |

ReFrame Summary
[ OK ] ( 1/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/29Aug2024-foss-2023b-kokkos /aeb2d9df @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 441.041 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 2/10) EESSI_LAMMPS_lj %scale=1_node %device_type=cpu %module_name=LAMMPS/2Aug2023_update2-foss-2023a-kokkos /04ff9ece @BotBuildTests:x86-64-amd-zen2-node+default
P: perf: 444.564 timesteps/s (r:0, l:None, u:None)
[ OK ] ( 3/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /31ac6ab9 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.65 us (r:0, l:None, u:None)
[ OK ] ( 4/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_allreduce %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /f3be40a2 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 4.8 us (r:0, l:None, u:None)
[ OK ] ( 5/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /10e66fba @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.86 us (r:0, l:None, u:None)
[ OK ] ( 6/10) EESSI_OSU_Micro_Benchmarks_coll %benchmark_info=mpi.collective.osu_alltoall %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /5be57ae7 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 8.29 us (r:0, l:None, u:None)
[ OK ] ( 7/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /c8c9aff5 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.35 us (r:0, l:None, u:None)
[ OK ] ( 8/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_latency %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /9795e491 @BotBuildTests:x86-64-amd-zen2-node+default
P: latency: 0.3 us (r:0, l:None, u:None)
[ OK ] ( 9/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.2-gompi-2023b %device_type=cpu /48da21c5 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7915.54 MB/s (r:0, l:None, u:None)
[ OK ] (10/10) EESSI_OSU_Micro_Benchmarks_pt2pt %benchmark_info=mpi.pt2pt.osu_bw %scale=1_node %module_name=OSU-Micro-Benchmarks/7.1-1-gompi-2023a %device_type=cpu /1b8c1ca2 @BotBuildTests:x86-64-amd-zen2-node+default
P: bandwidth: 7875.25 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-24702.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@casparvl
Collaborator Author

#790 proves that the adaptations to the script work. I removed util-linux, since that was only there to prove the functionality. #790 is to actually deploy the slightly modified install_cuda_and_libraries.sh.

Member

@ocaisa ocaisa left a comment

Looks good, but I think I see an issue in a particular scenario for dev.eessi.io.

EESSI-install-software.sh (outdated)

# If in dev.eessi.io, allow building on top of softw
if [[ "${EESSI_CVMFS_REPO}" == /cvmfs/dev.eessi.io ]]; then
module use /cvmfs/software.eessi.io/versions/$EESSI_VERSION/software/${EESSI_OS_TYPE}/${EESSI_SOFTWARE_SUBDIR_OVERRIDE}/modules/all
Member

Would be good to make a comment that explains the expected precedence in MODULEPATH:

/path/to/dev.eessi.io/modules:/path/to/accel/modules:/path/to/software.eessi.io/modules
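
To illustrate that precedence, here is a minimal sketch. It assumes that `module use` prepends a directory to MODULEPATH (its default behaviour), so the directory added last is searched first. The `prepend_modulepath` helper and the `DEMO_MODULEPATH` variable are hypothetical stand-ins so that the sketch runs without a module tool, and the paths are the placeholders from the comment above, not real locations.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `module use <dir>`: prepend <dir> to a
# colon-separated search path, so later additions take precedence.
prepend_modulepath() {
    local dir="$1"
    if [[ -z "${DEMO_MODULEPATH:-}" ]]; then
        DEMO_MODULEPATH="${dir}"
    else
        DEMO_MODULEPATH="${dir}:${DEMO_MODULEPATH}"
    fi
}

DEMO_MODULEPATH=""
# Add the lowest-precedence directory first and the highest-precedence one last.
prepend_modulepath /path/to/software.eessi.io/modules
prepend_modulepath /path/to/accel/modules
prepend_modulepath /path/to/dev.eessi.io/modules

echo "${DEMO_MODULEPATH}"
```

With this ordering of calls, the dev.eessi.io directory ends up first in the search path, followed by the accelerator directory and then software.eessi.io.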

Member

I'm not sure you have caught the case that would define this, though:

  • I want to install in dev.eessi.io
  • I want to install for an accelerator (and require accelerator dependencies)

Member

I wonder whether this is what we want to support:

/path/to/dev.eessi.io/accel/modules:/path/to/dev.eessi.io/modules:/path/to/software.eessi.io/accel/modules:/path/to/software.eessi.io/modules
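
A rough sketch of how that four-level ordering could be composed for the different scenarios. The `build_modulepath` function and its `dev`/`prod` and `accel`/`cpu` arguments are hypothetical and are not part of EESSI-install-software.sh; the paths are the placeholders used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a MODULEPATH-style string for a scenario.
#   $1: "dev" (install in dev.eessi.io) or "prod" (software.eessi.io only)
#   $2: "accel" (accelerator build) or "cpu"
build_modulepath() {
    local target="$1" device="$2"
    local -a dirs=()

    # Highest precedence first: dev.eessi.io entries, accelerator before CPU.
    if [[ "${target}" == "dev" ]]; then
        if [[ "${device}" == "accel" ]]; then
            dirs+=("/path/to/dev.eessi.io/accel/modules")
        fi
        dirs+=("/path/to/dev.eessi.io/modules")
    fi

    # Then the software.eessi.io entries, again accelerator before CPU.
    if [[ "${device}" == "accel" ]]; then
        dirs+=("/path/to/software.eessi.io/accel/modules")
    fi
    dirs+=("/path/to/software.eessi.io/modules")

    # Join the array with ':' by setting IFS locally.
    local IFS=":"
    echo "${dirs[*]}"
}

build_modulepath dev accel
build_modulepath prod cpu
```

The first call prints the four-entry path listed above; the second prints only the software.eessi.io modules directory.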

Collaborator Author

While my PR moved this section of code, I'm pretty sure it was already there, wasn't it? I'm on my phone currently, so I can't easily check.

Member

This one is on me: these changes come from #771. See also @boegel's comment about handling accel cases in #771 (comment).

In a meeting we talked about fixing this in a separate PR, but if it's better to address this in this PR I can give it a go when I'm back at my desk this week.

Collaborator Author

I'd be in favour of doing that in a separate PR, if @ocaisa agrees. This PR is currently mostly a reshuffle: some parts that can only be done after EESSI is initialized were moved down to after the EESSI module is loaded. While the PR replaces sourcing the various init scripts with loading the module, it should not modify behaviour in any way. I think it's clearer if we do a big fix like that in a separate PR.

@casparvl casparvl marked this pull request as ready for review October 22, 2024 21:30