-
Notifications
You must be signed in to change notification settings - Fork 0
/
test.txt
167 lines (126 loc) · 6.02 KB
/
test.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
-----------------------------
To run the minitest tests in spack-configs
-----------------------------
So get a compute node from a login node:
At LLNL:
On rzansel and lassen, <cmd-to-run> = "lalloc 1"
On rzmanta and ray, <cmd-to-run> = "lalloc 1"
On rzhasgpu and quartz, <cmd-to-run> = "salloc -N 1 -ppdebug"
At LANL:
On kodiak, <cmd-to-run> = "salloc -N 1 --qos=interactive"
At Sandia:
On eclipse and stria, <cmd-to-run> = "salloc -N1"
On vortex, <cmd-to-run> = "lalloc 1"
At ANL:
At NERSC:
To run a <cmd> from a compute node from lalloc:
lrun -1 -g 1 -c 1 <cmd>
-----------------------------
To run the hpctest test suite
-----------------------------
Be sure you have run "spack compiler find" before starting.
The suite is contains 6 tests, one of which does not test profiling.
Of the five remaining tests, four require MPI.
1. Clone the test suite repository
git clone https://github.com/scottkwarren/hpctest ~/gits/hpctest
cd ~/gits/hpctest
2. Clean up the for hpctest's use of its private spack version, which is incompatible with
the regular spack.
/bin/rm -rf ~/.spack
Find the gcc that you want to use, module load it, and run hpctest spack compiler find.
3. Set up and build the tests.
If you will be running on a system with a particular MPI to use, create
a packages.yaml file in you ~/.spack directory specifying that openmpi.
It should match the entry used in installing the hpctoolkit.
3. Set up the bits to test:
If you have cloned the hpctest repo, its config.yaml will not point to any hpctoolkit
That implies that the tests will use the hpctoolkit on your path.
module avail hpctoolkit
module load hpctoolkit@xxxx
xxxx represents the bits to test
Or you could edit config.yaml to point to an installation
profile:
hpctoolkit:
path: whatever
4. If you are running on a login node, you should run the tests on the compute nodes.
hpctest has support for batch submission. Add specifications for the particular
batch manager to the config.yaml file:
config:
batch:
manager: Slurm
params:
account: xxxxxx
partition: batch
time: "hh:mm:ss" -- time controls each test, not the whole suite
-or-
config:
batch:
manager: Summit
params:
project: xxxxxx
time: "hh:mm:ss" -- time controls each test, not the whole suite
prelude: module unload darshan-runtime
5. Build the full suite:
<cmd-to-run> hpctest build all
On machines without compute nodes, or if you have configured the suite to use
a batch manager, no <cmd-to-run> is needed; on login-nodes for a test suite that
is not configured for a batch manager, <cmd-to-run> is needed to run on the compute nodes.
At LLNL:
On rzansel and lassen, <cmd-to-run> = "lalloc 1"
On rzmanta and ray, <cmd-to-run> = "lalloc 1"
On rzhasgpu and quartz, <cmd-to-run> = "salloc -N 1 -ppdebug"
At LANL:
On kodiak, <cmd-to-run> = "salloc -N 1 --qos=interactive"
At Sandia:
On eclipse and stria, <cmd-to-run> = "salloc -N1"
On vortex, <cmd-to-run> = "lalloc 1"
At ANL:
At NERSC:
6. Running the test suite
To run the full suite:
<cmd-to-run> hpctest run all
To run a single test, e.g.,
<cmd-to-run> hpctest run app/amgmk
(That's the only passing test that does not require MPI)
See 5, above, for the description of <cmd-to-run>
6. Understanding the test results:
Each invocation of hpctest creates a directory in .../hpctest/work with a name of the form
study-<date>--<time>
where <date> is e.g., "2019-11-25", and <time> is e.g., "15-02-59"
in this case, referring to the run started at 3:02:59 PM on November 25, 2019.
Under that directory is a set of subdirectories, one for each test.
with names like, e.g.,:
app--AMG2006:%gcc:-e.REALTIME@10000
each referring to a single test. During the run, hpctest will say:
running test app/AMG2006
At the end of the test run, result from that test is labelled:
APP / AMG2006 with %GCC and -e REALTIME@10000
Note the differing capitalization and punctuation in the various usages.
Each of those individual test subdirectories has a link for the executable, pointing into its
build subdirectory, a build subdirectory which is populated with everything built by
spack for that test. Each test subdirectory has an OUT subdirectory.
The OUT subdirectory has an OUT.yaml file, summarizing the execution of the test.
It also has various other files and two directories, numbered starting 01-<name>, in
the order with which they were generated during the test. One of the directories
is "04-hpctoolkit-<app>-measurements" (perhaps with a different number in some runs).
(The <app> here is lower-case, despite the test being upper-case.)
It contains the raw data recorded in the run: "<app>-*.hpcrun" and "<app>-*.hpctrace"
files for each process and thread, and "<app>-*.log" files for each process.
The OUT subdirectory also has a "11-hpctoolkit-<app>-database" (again, perhaps with
a different number in some runs). It contains the experiment.xml file, and the individual
trace files, copied from the measurements directory. It also contains a src subdirectory
with what appears to be a copy of the raw source, cloned into a subdirectory tree
mimicing the full path of the subtree of hpctest containing the source.
To look for stack unwind failures, cd to the study-* directory of interest,
and run:
grep errant */OUT/OUT.yaml
The hpcviewer can be brought up with the name of the "11-hpctoolkit-<app>-database" directory
in the OUT directory for a test.
7. Cleaning up
To remove previous runs, run "hpctest clean" It will leave the pre-built executables, but remove all
recorded data.
To remove all the prebuilt executables, run "hpctest clean --all"
============================================================================================
There is also a test directory with some unit tests. You can get it with
git clone https://github.com/HPCToolkit/hpctoolkit-tests
Each subdirectory is a unit test, but they are in varying states.