Inaccurate memory requested value #24

prehensilecode · 2022-11-07T22:30:01Z

Environment:

RHEL 8.1
Slurm 21.08.8
Python 3.9.14
seff-array (latest master branch as of 2022-11-07 17:00 EST)

Example: a job array of more than 400 tasks. The job request was:

#SBATCH --cpus-per-task=20
#SBATCH --mem=8GB

seff reports the following for two of the array tasks:

Job ID: 4160740
Array Job ID: 4160739_0
Cluster: foocluster
User/Group: juser/juser
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 20
CPU Utilized: 00:11:28
CPU Efficiency: 26.46% of 00:43:20 core-walltime
Job Wall-clock time: 00:02:10
Memory Utilized: 5.58 GB
Memory Efficiency: 69.77% of 8.00 GB

Job ID: 4160954
Array Job ID: 4160739_207
Cluster: foocluster
User/Group: juser/juser
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 20
CPU Utilized: 00:25:51
CPU Efficiency: 36.93% of 01:10:00 core-walltime
Job Wall-clock time: 00:03:30
Memory Utilized: 5.46 GB
Memory Efficiency: 68.28% of 8.00 GB

The latest seff-array reports an inaccurate requested memory value (409.6MB instead of 8GB) and, as a result, an inaccurate memory efficiency (1228.85% of 409.6MB):

Job ID: 4160739
Cluster: foocluster
User/Group: juser/juser
Cores: 20
Average CPU Utilized: 01:04.22
CPU Efficiency: 36.21% of 02:57.38 core-walltime
Job Wall-clock time: 02:57.38
Average Memory Utilized: 5.03GB
Memory Efficiency: 1228.85% of 409.6MB
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Job States
COMPLETED: 414
FAILED: 1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
============================================================================== Max Memory Usage ==============================================================================
# NumSamples = 415; Min = 1.07MB; Max = 7.77GB
# Mean = 5.03GB; SD = 1.73GB; Median 5.64GB
# each ∎ represents a count of 2
  963.0KB -    1.07MB [   1]:
   1.07MB -  778.05MB [  32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
 778.05MB -    1.56GB [   2]: ∎
   1.56GB -    2.33GB [   8]: ∎∎∎∎
   2.33GB -    3.11GB [   7]: ∎∎∎
   3.11GB -    3.89GB [  15]: ∎∎∎∎∎∎∎
   3.89GB -    4.66GB [   8]: ∎∎∎∎
   4.66GB -    5.44GB [  57]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
   5.44GB -    6.22GB [ 265]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
   6.22GB -    6.99GB [  13]: ∎∎∎∎∎∎
   6.99GB -    7.77GB [   7]: ∎∎∎
The requested memory was 409.6MB.
…

prehensilecode · 2022-11-07T22:57:58Z

Looks like I misunderstood the value shown as "requested memory". seff-array shows the requested memory per CPU.

However, the reported MaxRSS is for the whole task, not a per-CPU value. So shouldn't the memory efficiency be:

100 * rss_sum / req_mem / data_len

out of

mb_to_str(req_mem)

Or, if you want the per-CPU value:

100 * (rss_sum / req_cpus) / (req_mem / req_cpus) / data_len
== 100 * rss_sum / req_mem / data_len  # since the "req_cpus" divide out

out of

mb_to_str(req_mem / req_cpus)
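
To make the arithmetic concrete, here is a minimal sketch using the numbers from the report above. It is not seff-array's actual code: the function `memory_efficiency` and the variable names are hypothetical, and it assumes 1 GB = 1024 MB for both the request and the mean MaxRSS, so the percentages will not match the tool's output to the digit.

```python
def memory_efficiency(rss_sum_mb: float, req_mem_mb: float, n_tasks: int) -> float:
    """Mean MaxRSS across tasks as a percentage of the memory request."""
    return 100 * rss_sum_mb / req_mem_mb / n_tasks


# Values taken from the job request and the seff-array report above.
req_mem_mb = 8 * 1024        # --mem=8GB is a per-task (per-node) request
req_cpus = 20                # --cpus-per-task=20
n_tasks = 415                # NumSamples in the histogram
mean_rss_mb = 5.03 * 1024    # "Average Memory Utilized: 5.03GB" (unit convention assumed)
rss_sum_mb = mean_rss_mb * n_tasks

# The per-CPU request is where the 409.6MB figure comes from.
per_cpu_request_mb = req_mem_mb / req_cpus

# Mismatch: per-task MaxRSS divided by the per-CPU request.
mismatched = memory_efficiency(rss_sum_mb, per_cpu_request_mb, n_tasks)

# Consistent per-task and per-CPU versions give the same percentage.
per_task = memory_efficiency(rss_sum_mb, req_mem_mb, n_tasks)
per_cpu = memory_efficiency(rss_sum_mb / req_cpus, per_cpu_request_mb, n_tasks)

print(f"per-CPU request:       {per_cpu_request_mb:.1f} MB")
print(f"mismatched efficiency: {mismatched:.2f}%")
print(f"per-task efficiency:   {per_task:.2f}%")
print(f"per-CPU efficiency:    {per_cpu:.2f}%")
```

The mismatched figure is exactly `req_cpus` times too large. Dividing the reported 1228.85% by 20 gives about 61.4%, which is in line with the 68% to 70% that seff shows for the individual tasks.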
