
Automatically copy acquisition and trigger monitor data to midway #80

ershockley opened this issue Feb 19, 2017 · 6 comments

@ershockley
People need acquisition and trigger monitor data for analyses, even if the raw data isn't present. Up to this point Boris and I have been copying them manually, but that's obviously not the best solution. We need to add a method in cax that copies acquisition_monitor_data.pickles and trigger_monitor_data.zip from the raw data to the corresponding directories in /project/lgrandi/xenon1t.
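A minimal sketch of what such a cax helper could look like (the function name, destination layout, and the idea of one subdirectory per run are assumptions for illustration; a real implementation would hook into cax's task system):

```python
import shutil
from pathlib import Path

# The two small metadata files the issue asks to preserve.
MONITOR_FILES = ("acquisition_monitor_data.pickles", "trigger_monitor_data.zip")

def copy_monitor_data(run_dir, dest_root="/project/lgrandi/xenon1t"):
    """Copy monitor files for one run into the analysis area.

    Hypothetical helper: copies each monitor file from the raw run
    directory to dest_root/<run name>/ if not already present.
    Returns the list of file names actually copied (idempotent).
    """
    src = Path(run_dir)
    dest = Path(dest_root) / src.name
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in MONITOR_FILES:
        source_file = src / name
        target = dest / name
        if source_file.exists() and not target.exists():
            shutil.copy2(source_file, target)  # preserves timestamps
            copied.append(name)
    return copied
```

Because the copy is skipped when the target already exists, the helper can safely be rerun after raw data is purged without clobbering the preserved metadata.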

@XeBoris
Contributor

XeBoris commented Feb 20, 2017

Just so I fully understand: do these files, acquisition_monitor_data.pickles and trigger_monitor_data.zip, appear on xe1t-datamanager every time a run finishes?
That is what is usually used for rucio uploads, and it is what I have seen in the past. Therefore, once a raw data set is copied (by ruciax or cax) to another location, these data should always end up on midway as well. So why are they missing?

@ershockley
Author

Sorry, I wasn't clear. Yes, those files are in (most of) the raw data on datamanager; the problem is that people want them after the raw data has been purged from midway. They're fairly small, so we were told it's worth trying to keep them on midway.

@lgrandi

lgrandi commented Mar 26, 2017

From Evan's and Patrick's emails:
########################
On Mar 22, 2017, at 10:19 AM, Evan Shockley [email protected] wrote:

Hi Dan,

After speaking with Patrick I implemented yesterday what he described below. The metadata files should now be placed in their respective directories in /project/lgrandi/xenon1t automatically at time of processing (if they don’t exist already). The script can be found at ~tunnell/verify_stash_transfers.sh

I tested for a few runs and it seemed to work, but let me know if you find newly processed runs with these files missing.

Evan
########################
On Mar 21, 2017, at 10:30 AM, Patrick de Perio [email protected] wrote:

Dear Dan,

Agree it's better to keep a single data stream out of DAQ. And agree with cleanest solution; there's already a "KEY: TNamed pax_metadata" inside the pax processed ROOT file, though I'm also not an expert on the metadata format and how pax gets this in.

However, hold off on the trigger update; Evan and I just thought about it some more following your suggestion and came up with a possible new solution using the OSG processing post-script:

1) (Current) Transfer processed file to Midway,
2) (Current) Run massive-cax on Midway via ssh to verify and run hax,
3) (Proposed) Call rucio-download on Midway via ssh to download metadata.

This script (originally from Boris) is working stably for grabbing individual .zip's, so in principle it can be modified to download metadata instead and be called in step 3. We'll also need a cronjob running somewhere that automatically renews the grid certificate (Boris is working on this).

Patrick
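The proposed step 3 (triggering a metadata download on Midway over ssh) could be sketched roughly as follows. The host name, remote script path, and the `ssh_cmd` override are hypothetical placeholders; the real call would invoke Boris's rucio download script:

```python
import subprocess

def download_metadata_via_ssh(run_name,
                              midway_host="midway.rcc.uchicago.edu",
                              script="~/download_metadata.sh",
                              ssh_cmd="ssh"):
    """Step 3 of the proposed post-script flow (hypothetical sketch).

    Runs `script run_name` on the Midway login node via ssh and reports
    whether the remote command exited cleanly. `ssh_cmd` is injectable
    so the wrapper can be exercised without a real ssh connection.
    """
    cmd = [ssh_cmd, midway_host, f"{script} {run_name}"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0
```

Checking the ssh exit code here is exactly the weak point discussed later in the thread: if the remote download fails silently, the post-script has no further record of the gap, so any real version would also want to log failures somewhere queryable.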

@lgrandi

lgrandi commented Mar 26, 2017

Can we close this issue @ershockley @coderdj? Or is there still a gap between the newly processed data (can someone confirm that the script works for new data) and the SR0 runs?

@pdeperio
Contributor

@ershockley can you post a link to your solution (and/or commit the script) here before closing?

@ershockley
Author

Here is the solution I came up with, though it's not ideal because errors are not caught easily.

After a run is processed on OSG and the output file is copied to midway, we need cax on midway to verify that the transfer succeeded. To do this we submit jobs on midway from the OSG login node via ssh; see https://github.com/XENON1T/cax/blob/OSG_dev2/osg_scripts/hadd_and_upload.sh#L75. We also use this midway script to download the metadata: https://gist.github.com/ershockley/8b77f3209b845f19064eecaf681c2843.

It mostly works; however, there are times when we don't catch failed downloads, so we can end up with gaps of runs that have no metadata. It's pretty easy to recover the missing runs once someone tells me they aren't there, but it could still be improved.
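A hedged sketch of how such gaps could be detected automatically instead of waiting for a user report (the directory layout of one subdirectory per run and the function name are assumptions, not the actual cax code):

```python
from pathlib import Path

# The two monitor files every run directory is expected to contain.
MONITOR_FILES = ("acquisition_monitor_data.pickles", "trigger_monitor_data.zip")

def find_runs_missing_metadata(root="/project/lgrandi/xenon1t"):
    """Scan the analysis area and report runs lacking monitor files.

    Returns a dict mapping run name -> list of absent file names,
    so a recovery job (or a cron check) can re-download just those.
    """
    missing = {}
    for run_dir in sorted(Path(root).iterdir()):
        if not run_dir.is_dir():
            continue
        absent = [name for name in MONITOR_FILES
                  if not (run_dir / name).exists()]
        if absent:
            missing[run_dir.name] = absent
    return missing
```

Run periodically, this would surface failed downloads without anyone having to notice a hole in their analysis first.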

If people are fine with this mostly-working solution, we can close.
