Merging output coffea files

File merging

To plot results, the output coffea files must first be merged into a single accumulator. Instructions for submitting jobs with jexec and producing the coffea files are documented elsewhere in this wiki.

The merging is done via the jmerge executable, as follows:

jmerge $indir -o ./path/to/output/directory -j4

In the above command, $indir is the path to the submission directory that holds the individual .coffea files, -o specifies the output directory where the merged accumulator is saved, and -j specifies the number of parallel jobs used for the merging.
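If the merge needs to be launched from a Python script rather than typed in a shell, the command can be assembled and run with the standard library. The following is a minimal sketch, not part of jmerge itself: the helper names build_jmerge_command and run_jmerge are hypothetical, and only the two options described above (-o and -j) are used. It assumes jmerge is available on the PATH.

```python
import subprocess


def build_jmerge_command(indir, outdir, jobs=4):
    """Return the jmerge argument list for the given directories.

    Hypothetical helper: it only uses the options described above (-o, -j).
    """
    # -jN is written without a space, matching the command shown above.
    return ["jmerge", str(indir), "-o", str(outdir), f"-j{int(jobs)}"]


def run_jmerge(indir, outdir, jobs=4):
    """Run jmerge; raises subprocess.CalledProcessError on a non-zero exit."""
    cmd = build_jmerge_command(indir, outdir, jobs)
    subprocess.run(cmd, check=True)
    return cmd
```

Calling run_jmerge(indir, "./path/to/output/directory", jobs=4) would then be equivalent to the shell command above, with indir holding the submission directory path.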

Access in code

Once jmerge has finished, the merged accumulator can be opened in code with the klepto library, as follows:

from klepto.archives import dir_archive

acc = dir_archive("/path/to/merged/files") # Same as the -o argument to jmerge

Each histogram inside the accumulator can then be accessed by first loading a copy of it into memory via acc.load(), as follows:

# Suppose we want to access the MET histogram, which was saved under the name "met"
distribution = "met"

acc.load(distribution)
histo = acc[distribution]

Note that without the preceding acc.load() call, a direct attempt to access acc[distribution] raises a KeyError.
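Because the load-then-index pattern is easy to forget, it can be wrapped in a small helper. This is only a sketch: load_histogram is a hypothetical name, not part of klepto or this package, and it assumes that a name missing from the merged directory still surfaces as a KeyError after the load call.

```python
def load_histogram(acc, distribution):
    """Load one histogram from the merged accumulator and return it.

    `acc` is expected to behave like the dir_archive shown above:
    acc.load(name) copies the entry into memory, then acc[name] works.
    """
    # Copy the requested entry into memory before indexing.
    acc.load(distribution)
    try:
        return acc[distribution]
    except KeyError:
        # Assumption: a missing entry is reported as KeyError on access.
        raise KeyError(
            f"'{distribution}' is not available in the merged accumulator; "
            "check the histogram name and the merged directory path"
        ) from None


# Usage with the archive opened as shown above:
#   from klepto.archives import dir_archive
#   acc = dir_archive("/path/to/merged/files")
#   histo = load_histogram(acc, "met")
```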
