
Run Phoenix jobs using available CPU cores #266

Merged: 1 commit merged into master on Dec 20, 2023
Conversation

@sbryngelson (Member) commented Dec 17, 2023

I'm attempting to improve runtime on the Phoenix runners. Requesting a node without specifying the cpu-small partition can hang for a while.

I'm also noticing a problem: ./mfc.sh test -a always rebuilds HDF5/Silo even when they were already built during ./mfc.sh build. This slows things down (especially on Debug runs). I think the trigger can be seen in the following output:

atl1-1-03-002-15-1: p-sbryngelson3-0/MFC-2 $ time ./mfc.sh test -j 12 -b mpirun -a
mfc: OK > (venv) Entered the Python virtual environment.

      .=++*:          -+*+=.          [email protected] [Linux]
     :+   -*-        ==   =* .        -------------------------------------------------------
   :*+      ==      ++    .+-         --jobs 12
  :*##-.....:*+   .#%+++=--+=:::.     --mpi
  -=-++-======#=--**+++==+*++=::-:.   --no-gpu
 .:++=----------====+*= ==..:%.....   --no-debug
  .:-=++++===--==+=-+=   +.  :=
  +#=::::::::=%=. -+:    =+   *:      -----------------------------------------------------------
 .*=-=*=..    :=+*+:      -...--      $ ./mfc.sh [build, run, test, clean, count, packer] --help

Generating syscheck/include/case.fpp.
  INFO: Custom case.fpp file is up to date.

$ cmake --build /storage/coda1/p-sbryngelson3/0/sbryngelson3/MFC-2/build/no-debug_no-gpu_mpi/syscheck --target syscheck
-j 12 --config Release

[100%] Built target syscheck

$ cmake --install /storage/coda1/p-sbryngelson3/0/sbryngelson3/MFC-2/build/no-debug_no-gpu_mpi/syscheck

-- Install configuration: "Release"
-- Installing: /storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/build/install/no-debug_no-gpu_mpi/bin/syscheck
-- Set runtime path of "/storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/build/install/no-debug_no-gpu_mpi/bin/syscheck" to ""
Generating pre_process/include/case.fpp.
  INFO: Custom case.fpp file is up to date.

$ cmake --build /storage/coda1/p-sbryngelson3/0/sbryngelson3/MFC-2/build/no-debug_no-gpu_mpi/pre_process --target
pre_process -j 12 --config Release

-- GLOB mismatch!
-- Enabled IPO / LTO
-- Configuring done
-- Generating done
-- Build files have been written to: /storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/build/no-debug_no-gpu_mpi/pre_process
[  8%] Preprocessing (Fypp) m_variables_conversion.fpp
[  8%] Preprocessing (Fypp) m_constants.fpp
[ 11%] Preprocessing (Fypp) m_check_patches.fpp
[ 11%] Preprocessing (Fypp) m_data_output.fpp
[ 14%] Preprocessing (Fypp) m_derived_types.fpp
[ 17%] Preprocessing (Fypp) m_global_parameters.fpp
[ 26%] Preprocessing (Fypp) m_helper.fpp
[ 26%] Preprocessing (Fypp) m_initial_condition.fpp
[ 26%] Preprocessing (Fypp) m_model.fpp
[ 29%] Preprocessing (Fypp) m_mpi_common.fpp
[ 32%] Preprocessing (Fypp) m_mpi_proxy.fpp
[ 35%] Preprocessing (Fypp) m_patches.fpp
[ 38%] Preprocessing (Fypp) m_start_up.fpp
Scanning dependencies of target pre_process
[ 41%] Building Fortran object CMakeFiles/pre_process.dir/src/pre_process/autogen/m_constants.fpp.f90.o
nvfortran-Warning-CUDA_HOME has been deprecated. Please, use NVHPC_CUDA_HOME instead.
/storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/src/pre_process/autogen/m_constants.fpp.f90:
[ 44%] Building Fortran object CMakeFiles/pre_process.dir/src/pre_process/autogen/m_derived_types.fpp.f90.o
nvfortran-Warning-CUDA_HOME has been deprecated. Please, use NVHPC_CUDA_HOME instead.
/storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/src/pre_process/autogen/m_derived_types.fpp.f90:
[ 47%] Building Fortran object CMakeFiles/pre_process.dir/src/pre_process/autogen/m_global_parameters.fpp.f90.o
nvfortran-Warning-CUDA_HOME has been deprecated. Please, use NVHPC_CUDA_HOME instead.
/storage/home/hcoda1/6/sbryngelson3/p-sbryngelson3-0/MFC-2/src/pre_process/autogen/m_global_parameters.fpp.f90:
[ 52%] Building Fortran object CMakeFiles/pre_process.dir/src/pre_process/autogen/m_mpi_common.fpp.f90.o
[ 52%] Building Fortran object CMakeFiles/pre_process.dir/src/pre_process/autogen/m_helper.fpp.f90.o
nvfortran-Warning-CUDA_HOME has been deprecated. Please, use NVHPC_CUDA_HOME instead.

Notice the GLOB mismatch that occurs for every target, forcing a re-configure and a rebuild of its dependencies...
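
For reference, a quick way to confirm whether these rebuilds are driven by CMake's glob verification rather than by changed sources might be the following (a sketch; the target and directory are taken from the log above, and the CONFIGURE_DEPENDS explanation is an assumption about how MFC's CMakeLists collects sources):

```sh
# Sketch: run the same no-op build twice in a row. If the second, unchanged
# invocation still prints "-- GLOB mismatch!" and re-runs the configure step,
# the rebuild is triggered by CMake's glob verification
# (file(GLOB ... CONFIGURE_DEPENDS)) and not by modified sources.
BUILD_DIR=build/no-debug_no-gpu_mpi/pre_process   # relative to the MFC checkout
cmake --build "$BUILD_DIR" --target pre_process -j 12 --config Release
cmake --build "$BUILD_DIR" --target pre_process -j 12 --config Release 2>&1 \
  | grep -i "glob mismatch" && echo "glob verification forced a re-configure"
```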

I also see that the build step on Phoenix takes about 30 minutes (at least for the CPU build), but only about 10 minutes if one grabs a CPU node and builds the code there. This might motivate building MFC in CI on a Phoenix compute node so we can use -j 12.

Update for @henryleberre: this should actually be -j 24 for both the build and the tests; the compute nodes are dual-socket 12-core Intel Golds.

Update 2: Using ./mfc.sh test -j 24 -b mpirun -a dispatches all 24 jobs to a single core on a 24-core node. Core 0 is saturated at 100% utilization (per htop) while the others are idle. Is this an easy fix?
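
A way to quantify this without watching htop might be to list each test process with the CPU it last ran on (a sketch; the binary names are just examples of what MFC launches):

```sh
# Sketch: PSR is the processor each thread last ran on. If every MFC test
# process reports the same PSR (and a tiny %CPU), they are all stuck on one core.
ps -eLo pid,psr,pcpu,comm | grep -E 'pre_process|simulation|post_process'
```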

@henryleberre (Member) left a comment

In my latest PR, we will build and test on a compute node by not issuing a build command, only a test command. This also works around the problem where you end up building multiple times.

I would have to look into this post_process issue, but I recall it only being a problem for debug builds with HDF5.

@sbryngelson (Member Author):

> In my latest PR, we will build and test on a compute node by not issuing a build command, only a test command. This also works around the problem where you end up building multiple times.
>
> I would have to look into this post_process issue, but I recall it only being a problem for debug builds with HDF5.

Thanks! Also (per above), using ./mfc.sh test -j 24 -b mpirun -a dispatches all 24 jobs to a single core on a 24-core node. Core 0 is saturated at 100% utilization (per htop) while the others are idle. Is this an easy fix?

@sbryngelson (Member Author):

> In my latest PR, we will build and test on a compute node by not issuing a build command, only a test command. This also works around the problem where you end up building multiple times.
>
> I would have to look into this post_process issue, but I recall it only being a problem for debug builds with HDF5.

Can we specify the submit partition and other sbatch options? Something like the sketch below is what I have in mind.
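
An illustrative submission header (a sketch only: the partition and node shape come from this thread, while the account and walltime are placeholders):

```sh
#!/bin/bash
#SBATCH --partition=cpu-small        # avoids the queueing hang mentioned above
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24         # dual-socket 12-core Intel Gold nodes
#SBATCH --time=02:00:00              # placeholder walltime
#SBATCH --account=<charge-account>   # placeholder

./mfc.sh test -j 24 -b mpirun -a
```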

@henryleberre (Member):

> In my latest PR, we will build and test on a compute node by not issuing a build command, only a test command. This also works around the problem where you end up building multiple times.
>
> I would have to look into this post_process issue, but I recall it only being a problem for debug builds with HDF5.
>
> Thanks! Also (per above), using ./mfc.sh test -j 24 -b mpirun -a dispatches all 24 jobs to a single core on a 24-core node. Core 0 is saturated at 100% utilization (per htop) while the others are idle. Is this an easy fix?

That's interesting. Could you try prepending numactl --all to the mfc.sh invocation?

@henryleberre (Member):

One issue with the embers queue is that a job can get killed when another job with a higher priority is submitted. Do you know whether PACE is configured to relaunch preempted jobs, or whether there is a way around it?
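
(The standard Slurm knob for this would be requeue-on-preemption, assuming PACE honors it; a sketch:)

```sh
#SBATCH --requeue    # ask Slurm to put the job back in the queue if preempted
# or, per submission:
sbatch --requeue run-phoenix-release-cpu.sh
```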

@sbryngelson (Member Author):

> One issue with the embers queue is that a job can get killed when another job with a higher priority is submitted. Do you know whether PACE is configured to relaunch preempted jobs, or whether there is a way around it?

True, but I don't think this has ever happened to us, so I'm not worried about it for now.

@sbryngelson (Member Author) commented Dec 17, 2023

> In my latest PR, we will build and test on a compute node by not issuing a build command, only a test command. This also works around the problem where you end up building multiple times.
>
> I would have to look into this post_process issue, but I recall it only being a problem for debug builds with HDF5.
>
> Thanks! Also (per above), using ./mfc.sh test -j 24 -b mpirun -a dispatches all 24 jobs to a single core on a 24-core node. Core 0 is saturated at 100% utilization (per htop) while the others are idle. Is this an easy fix?
>
> That's interesting. Could you try prepending numactl --all to the mfc.sh invocation?

No luck with numactl --all ./mfc.sh test -j 24 -b mpirun -a: each process still sits at roughly 100/24 ≈ 4% CPU, as in the screenshots below.

[Two screenshots (2023-12-17, 19:02 and 19:03) showing per-process CPU utilization of roughly 100/24 ≈ 4%]

@sbryngelson (Member Author) commented Dec 18, 2023

@henryleberre I noticed that all the running binaries from ./mfc.sh test -j 24 have the same CPU affinity (CPU 1):

atl1-1-01-006-14-1: 6/sbryngelson3 $ taskset -cp 232235
pid 232235's current affinity list: 1
atl1-1-01-006-14-1: 6/sbryngelson3 $ taskset -cp 232224
pid 232224's current affinity list: 1
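
As a quick experiment, one could try widening the mask of a running process in place (a sketch, using the PID from above):

```sh
taskset -cp 0-23 232235   # widen PID 232235's affinity to all 24 cores
taskset -cp 232235        # re-check: should now report 0-23
```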

@sbryngelson changed the title from "Update run-phoenix-release-cpu.sh" to "Run Phoenix jobs using available CPU cores" on Dec 18, 2023
@sbryngelson (Member Author) commented Dec 18, 2023

Requesting advice from @henryleberre on the CPU affinity/subprocess issue, since the slowness of the Phoenix CPU runner was the real reason for this PR.

If this will be fixed in PR #257, then I can just merge this PR.

Update: Just to double-check, ./mfc.sh test -j 8 does run on 8 cores on my MacBook, so I suspect this has something to do with the invocation on the Slurm-allocated compute node. I tried invoking srun ./mfc.sh test -j X, but that does not do the right thing either.
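
(Perhaps relaxing the binding at the srun level would behave differently, though this is untested; --cpu-bind=none is a standard srun flag:)

```sh
srun --cpu-bind=none ./mfc.sh test -j 24 -b mpirun -a
```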

@henryleberre (Member):

I discovered that adding --bind-to none to the mpirun invocation fixes it on Phoenix. I'm adding this to #257.
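
For concreteness, the change amounts to something like the following (a sketch of the invocation only, not the exact #257 diff; the executable path and arguments are illustrative):

```sh
# Open MPI's default binding (bind-to core at small rank counts) can pin every
# concurrently launched test onto the same cores; disabling binding lets the
# OS scheduler spread the tests across the node.
mpirun --bind-to none -np 2 ./build/install/no-debug_no-gpu_mpi/bin/simulation <input>
```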

@sbryngelson merged commit 0131901 into master on Dec 20, 2023
15 checks passed
@sbryngelson deleted the sbryngelson-patch-1 branch on December 20, 2023 at 08:43
JRChreim pushed a commit to JRChreim/MFC-JRChreim that referenced this pull request Dec 21, 2023