Documentation and package maintenance #481

Merged
merged 6 commits into from
May 13, 2021
22 changes: 14 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,16 +1,22 @@
# straxen
Streaming analysis for XENON(nT)

[![Build Status](https://travis-ci.org/XENONnT/straxen.svg?branch=master)](https://travis-ci.org/XENONnT/straxen)
[![Test package](https://github.com/XENONnT/straxen/workflows/Test%20package/badge.svg?branch=master)](https://github.com/XENONnT/straxen/actions?query=branch%3Amaster)
[![PyPI version shields.io](https://img.shields.io/pypi/v/straxen.svg)](https://pypi.python.org/pypi/straxen/)
[![Readthedocs Badge](https://readthedocs.org/projects/straxen/badge/?version=latest)](https://straxen.readthedocs.io/en/latest/?badge=latest)
[![Test package](https://github.com/XENONnT/straxen/actions/workflows/pytest.yml/badge.svg?branch=master)](https://github.com/XENONnT/straxen/actions/workflows/pytest.yml)
[![CodeFactor](https://www.codefactor.io/repository/github/xenonnt/straxen/badge)](https://www.codefactor.io/repository/github/xenonnt/straxen)
[![Coverage Status](https://coveralls.io/repos/github/XENONnT/straxen/badge.svg)](https://coveralls.io/github/XENONnT/straxen)
![Update context collection](https://github.com/XENONnT/straxen/workflows/Update%20context%20collection/badge.svg)
[![Join the chat at https://gitter.im/AxFoundation/strax](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/AxFoundation/strax?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![PyPI version shields.io](https://img.shields.io/pypi/v/straxen.svg)](https://pypi.python.org/pypi/straxen/)
[![Readthedocs Badge](https://readthedocs.org/projects/straxen/badge/?version=latest)](https://straxen.readthedocs.io/en/latest/?badge=latest)

[Straxen](https://straxen.readthedocs.io) is the analysis framework for XENONnT, built on top of the generic [strax framework](https://github.com/AxFoundation/strax). Currently it is configured for analyzing XENONnT and XENON1T data.

For installation instructions and usage information, please see the [straxen documentation](https://straxen.readthedocs.io/en/latest/setup.html).

Straxen is the analysis framework for XENONnT, built on top of the generic [strax framework](https://github.com/AxFoundation/strax). Currently it is configured for analyzing XENON1T data.

For installation instructions and usage information, please see the [straxen documentation](https://straxen.readthedocs.io).
## Further status
[![Python Versions](https://img.shields.io/pypi/pyversions/straxen.svg)](https://pypi.python.org/pypi/straxen)
[![PyPI downloads](https://img.shields.io/pypi/dm/straxen.svg)](https://pypistats.org/packages/straxen)
[![Build Status](https://travis-ci.org/XENONnT/straxen.svg?branch=master)](https://travis-ci.org/XENONnT/straxen)

[![Update context collection](https://github.com/XENONnT/straxen/workflows/Update%20context%20collection/badge.svg)](https://github.com/XENONnT/straxen/actions/workflows/contexts.yml)
[![Python style](https://github.com/XENONnT/straxen/actions/workflows/code_style.yml/badge.svg)](https://github.com/XENONnT/straxen/actions/workflows/code_style.yml)
[![Coveralls](https://github.com/XENONnT/straxen/actions/workflows/coveralls.yml/badge.svg?branch=master)](https://github.com/XENONnT/straxen/actions/workflows/coveralls.yml)
2 changes: 1 addition & 1 deletion bin/straxer
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ def parse_args():
help="ID of the run to process; usually the run name.")
parser.add_argument(
'--context',
default='xenon1t_dali',
default='xenonnt_online',
help="Name of straxen context to use")
parser.add_argument(
'--target',
Expand Down
24 changes: 14 additions & 10 deletions docs/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,19 @@
Before you submit this PR: make sure to put all operations-related information in a wiki-note, a PR should be about code and is publicly accessible
_Before you submit this PR: make sure to put all operations-related information in a wiki-note, a PR should be about code and is publicly accessible_

**What is the problem / what does the code in this PR do**
## What does the code in this PR do / what does it improve?

**Can you briefly describe how it works?**
## Can you briefly describe how it works?

**Can you give a minimal working example (or illustrate with a figure)?**
## Can you give a minimal working example (or illustrate with a figure)?

Please include the following if applicable:
- Update the docstring(s)
- Update the documentation
- Tests to check the (new) code is working as desired.
- Does it solve one of the open issues on github?
_Please include the following if applicable:_
- [ ] _Update the docstring(s)_
- [ ] _Update the documentation_
- [ ] _Tests to check the (new) code is working as desired._
- [ ] _Does it solve one of the open issues on github?_

Please make sure that all automated tests have passed before asking for a review (you can save the PR as a draft otherwise).
### _Notes on testing_
- _Until the automated tests pass, please mark the PR as a draft._
- _On the XENONnT fork we test with database access, on private forks there is no database access for security considerations._

All _italic_ comments can be removed from this template.
71 changes: 71 additions & 0 deletions docs/source/bootstrax.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
Bootstrax: XENONnT online processing manager
=============================================
The ``bootstrax`` script watches for new runs to appear from the DAQ, then starts a
strax process to process them. If a run fails, it will retry it with
exponential backoff, each time waiting a little longer before retrying.
After 10 failures, ``bootstrax`` stops trying to reprocess a run.
Additionally, with every new attempt it tries to process fewer plugins.
After a certain number of tries, it only reprocesses the raw-records.
Therefore, a run that fails at first may still be processed successfully later. For example, if

You can run more than one ``bootstrax`` instance, but only one per machine.
If you start a second one on the same machine, it will try to kill the
first one.


Philosophy
----------------
Bootstrax has a crash-only / recovery first philosophy. Any error in
the core code causes a crash; there is no nice exit or mandatory
cleanup. Bootstrax focuses on recovery after restarts: before starting
work, we look for and fix any mess left by crashes.

This ensures that hangs and hard crashes do not require expert tinkering
to repair databases. Plus, you can just stop the program with ctrl-c
(or, in principle, pulling the machine's power plug) at any time.

Errors during run processing are assumed to be retry-able. We track the
number of failures per run to decide how long to wait until we retry;
only if a user marks a run as 'abandoned' (using an external system,
e.g. the website) do we stop retrying.
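
The retry timing described above can be sketched as follows. This is a minimal illustration and not the actual ``bootstrax`` code: the limit of 10 failures is taken from the introduction of this page, while ``BASE_WAIT_SECONDS``, ``BACKOFF_FACTOR`` and the function names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

MAX_FAILURES = 10        # from the text: stop reprocessing after 10 failures
BASE_WAIT_SECONDS = 60   # hypothetical wait after the first failure
BACKOFF_FACTOR = 2       # hypothetical growth factor per failure


def retry_wait_seconds(n_failures):
    """Seconds to wait before the next attempt, or None to stop retrying."""
    if n_failures >= MAX_FAILURES:
        return None
    # Exponential backoff: the wait grows by a constant factor per failure
    return BASE_WAIT_SECONDS * BACKOFF_FACTOR ** n_failures


def next_retry_time(n_failures, now=None):
    """Time after which a retry may start, or None if no retry is planned."""
    now = now or datetime.now(timezone.utc)
    wait = retry_wait_seconds(n_failures)
    if wait is None:
        return None
    return now + timedelta(seconds=wait)
```

With these assumed numbers, a run that has failed three times would wait 480 seconds before the next attempt.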


Mongo documents
----------------
Bootstrax records its status in a document in the '``bootstrax``' collection
in the runs db. These documents contain:

- **host**: ``socket.getfqdn()``
- **time**: last time this ``bootstrax`` showed life signs
- **state**: one of the following:

  - **busy**: doing something
  - **idle**: NOT doing something; available for processing new runs
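
As an illustration, a status document with these three fields could be built as sketched below. The function name ``make_status_doc`` is hypothetical, and how ``bootstrax`` actually writes the document to the database is not shown here.

```python
import socket
from datetime import datetime, timezone

ALLOWED_STATES = ('busy', 'idle')  # the two states listed above


def make_status_doc(state):
    """Build a status document with the fields described above."""
    if state not in ALLOWED_STATES:
        raise ValueError(f'Unknown state {state!r}')
    return {
        'host': socket.getfqdn(),
        'time': datetime.now(timezone.utc),  # last sign of life
        'state': state,
    }


print(make_status_doc('idle'))
```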

Additionally, ``bootstrax`` tracks information with each run in the
'``bootstrax``' field of the run doc. We could also put this elsewhere, but
it seemed convenient. This field contains the following subfields:

- **state**: one of the following:

  - **considering**: a ``bootstrax`` is deciding what to do with it
  - **busy**: a strax process is working on it
  - **failed**: something is wrong, but we will retry after some amount of time
  - **abandoned**: ``bootstrax`` will ignore this run

- **reason**: reason for the last failure, if there ever was one (otherwise this field
  does not exist). Thus, it is quite possible for this field to exist (and
  show an exception) when the state is ``'done'``: that just means it failed
  at least once but succeeded later. Tracking failure history is primarily
  the DAQ log's responsibility; this message is only provided for convenience.
- **n_failures**: number of failures on this run, if there ever was one
  (otherwise this field does not exist).
- **next_retry**: time after which ``bootstrax`` might retry processing this run.
  Like 'reason', this will refer to the last failure.
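
The sketch below shows how these subfields could be read to decide whether a failed run is due for another attempt. It only uses the field names listed above; the function ``should_retry``, the ``name`` field and the example values are assumptions for illustration, not the ``bootstrax`` implementation.

```python
from datetime import datetime, timedelta, timezone


def should_retry(run_doc, now=None):
    """Return True if a failed run is due for another processing attempt."""
    now = now or datetime.now(timezone.utc)
    info = run_doc.get('bootstrax', {})
    if info.get('state') != 'failed':
        return False
    # 'next_retry' only exists if the run failed at least once
    next_retry = info.get('next_retry')
    return next_retry is not None and next_retry <= now


now = datetime(2021, 5, 7, 12, 0, tzinfo=timezone.utc)
example_run = {
    'name': '012100',  # hypothetical run
    'bootstrax': {
        'state': 'failed',
        'reason': 'RuntimeError: example failure',
        'n_failures': 2,
        'next_retry': now - timedelta(minutes=1),
    },
}
print(should_retry(example_run, now=now))  # prints True
```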

Finally, ``bootstrax`` writes the load on the eventbuilder machine(s)
on which it is running to the capped collection 'eb_monitor' in the
DAQ database. This collection contains information on
what ``bootstrax`` is thinking of at the moment.

- **disk_used**: used part of the disk to which this ``bootstrax`` instance
  is writing (in percent).
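
A value like **disk_used** can be computed with the Python standard library, as in the sketch below. The function name and the choice of ``shutil.disk_usage`` are assumptions for illustration; the way ``bootstrax`` actually measures this may differ.

```python
import shutil


def disk_used_percent(path='.'):
    """Used fraction of the disk that holds `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


print(f'disk_used: {disk_used_percent():.1f} %')
```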

*Last updated 2021-05-07. Joran Angevaare*
1 change: 1 addition & 0 deletions docs/source/figures/online_monitor.svg
16 changes: 16 additions & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,14 @@ Straxen is the analysis framework for XENONnT, built on top of the generic `stra
tutorials/Open_data.ipynb
tutorials/mini_analyses


.. toctree::
:maxdepth: 2
:caption: XENONnT Online monitor

online_monitor


.. toctree::
:maxdepth: 2
:caption: Configuration storage
Expand All @@ -39,6 +47,14 @@ Straxen is the analysis framework for XENONnT, built on top of the generic `stra

reference/datastructure_1T

.. toctree::
:maxdepth: 2
:caption: scripts

scripts
bootstrax


.. toctree::
:maxdepth: 1
:caption: Reference
Expand Down
123 changes: 123 additions & 0 deletions docs/source/online_monitor.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
XENONnT online monitor
======================
Using strax, it is possible to live-process data while acquiring it.
This allows for fast monitoring. To facilitate this, straxen has an
online monitor frontend, which allows a portion of the data to be
shipped off to the Mongo database while the data is being collected at the DAQ.
This means that analysers get fast feedback on what is going on inside the
TPC.


Loading data via the online monitor
-----------------------------------
In order to load this data in straxen, one can use the following setup
and start developing live-displays!


.. code-block:: python

import straxen
st = straxen.contexts.xenonnt_online(_add_online_monitor_frontend=True)

# Allow unfinished runs to be loaded, even before the DAQ has finished processing this run!
st.set_context_config({'allow_incomplete': True})
st.get_df(latest_run_id, 'event_basics')

This command adds the online-monitor frontend to the context. If the user
now requests data, strax will fetch it via this frontend
if it is not available in any of the other storage frontends. Usually the data
is available within ~30 seconds after a pulse was detected by a PMT.


Machinery
---------
Using the strax online monitor frontend, each chunk of data being processed
on the DAQ can be shipped out via the Mongo database, as illustrated in the
schematic below. For data that is stored in the
online-monitor collection of the database, each chunk of data is stored twice.
The data that is written to the DAQ local storage is transferred by
`admix <https://github.com/XENONnT/admix>`_ to the shared analysis cluster
(`dali`). This transfer can only start once a run has finished, and
the transfer itself takes time. To make data access almost instantaneous, this data is
also stored online.


.. image:: figures/online_monitor.svg

The user will retrieve the data from the mongo database just as if the
data were stored locally. It takes slightly longer to store the data than if
it was stored on disk because each chunk is saved online individually.
However, with a decent internet connection, loading one run of any data
should only take ~10 s.


How long and what data is stored online?
----------------------------------------
The online storage cannot hold data for extended periods of time, and, since
data is shipped to the analysis sites, there is no need to keep it around
forever.
As such, data will be available up to 7 days after writing the data to the
database. After that, the online data will be deleted automatically.
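
The 7-day window can be expressed as a simple check, sketched below. The retention period is taken from the text above; the function name and the idea of checking this on the client side are assumptions for illustration, since the deletion itself happens automatically in the database.

```python
from datetime import datetime, timedelta, timezone

ONLINE_RETENTION = timedelta(days=7)  # retention period from the text above


def is_still_online(written_at, now=None):
    """Return True if data written at `written_at` should still be online."""
    now = now or datetime.now(timezone.utc)
    return now - written_at < ONLINE_RETENTION


written = datetime(2021, 5, 1, tzinfo=timezone.utc)
print(is_still_online(written, now=datetime(2021, 5, 7, tzinfo=timezone.utc)))
```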

Depending on the current settings, selected datatypes are stored in the database.
At the time of writing, these were:

- ``online_peak_monitor``
- ``event_basics``
- ``veto_regions``

For the most up-to-date information, one can check the registration in the
``straxen.contexts.xenonnt_online`` context:
`here <https://github.com/XENONnT/straxen/blob/master/straxen/contexts.py#L160-L165>`_.


Caching the results of the online monitor
-----------------------------------------
For some applications, it is worth keeping a local copy of the data from the
online monitor. If one is interested in multiple runs, this is usually a good option.

To this end, one can use the context function ``copy_to_frontend``. Setting
``rechunk=True`` combines the many small files (one per chunk) into
a few bigger files, which makes the data much faster to load next time.


.. code-block:: python

import straxen
st = straxen.contexts.xenonnt_online(_add_online_monitor_frontend=True)
st.copy_to_frontend(latest_run_id, 'event_basics', rechunk=True)

One can now check where this run is stored:

.. code-block:: python

for storage_frontend in st.storage:
is_stored = st._is_stored_in_sf(latest_run_id, 'event_basics', storage_frontend)
print(f'{storage_frontend.__class__.__name__} has a copy: {is_stored}')

which prints

.. code-block:: rst

RunDB has a copy: False
DataDirectory has a copy: False
DataDirectory has a copy: False
DataDirectory has a copy: True
OnlineMonitor has a copy: True

You can also ``print(st.storage)`` to see which directories these refer to.
The ``DataDirectory``-storage frontends that do not have a copy are read-only
folders that the user cannot write to.

For more information on this, check out the
`strax documentation on copying data <https://strax.readthedocs.io/en/latest/advanced/recompression.html>`_.
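
To illustrate what rechunking means, the toy example below merges many small chunks into a few larger ones. It is a conceptual sketch on plain Python lists only: the function ``rechunk`` and its ``target_size`` parameter are hypothetical, and strax's own rechunking works on its own chunk objects rather than on lists.

```python
def rechunk(chunks, target_size):
    """Merge consecutive small chunks into larger ones.

    Each merged chunk holds at least `target_size` items, except possibly
    the last one.
    """
    merged, current = [], []
    for chunk in chunks:
        current.extend(chunk)
        if len(current) >= target_size:
            merged.append(current)
            current = []
    if current:
        merged.append(current)
    return merged


small_chunks = [[1, 2], [3], [4, 5, 6], [7], [8, 9]]
big_chunks = rechunk(small_chunks, target_size=4)
print(len(small_chunks), '->', len(big_chunks))  # prints 5 -> 2
```

Fewer, larger files mean fewer separate reads when the data is loaded again, which is why the rechunked copy loads faster.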


Pre-configured monitoring tools
-------------------------------
For XENONnT we have the private monitor called `olmo <https://github.com/XENONnT/olmo>`_,
which is only visible to XENONnT members.


*Last updated 2021-05-07. Joran Angevaare*

77 changes: 77 additions & 0 deletions docs/source/scripts.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
Straxen scripts
===================
Straxen comes with
`several scripts <https://github.com/XENONnT/straxen/tree/master/bin>`_
that allow common uses of straxen. Some of these scripts are designed
to run on the DAQ whereas others are for common use cases. Each of the
scripts will be briefly discussed below:

straxer
-------
``straxer`` is the most useful straxen script for regular users. It allows data to be
generated in a script format, which is especially useful for reprocessing data
in batch jobs.

For example, a user can reprocess the data of run ``012100`` up to
``event_info_double`` using the following command:

.. code-block:: bash

straxer 012100 --target event_info_double

For more information on the options, please refer to the help:

.. code-block:: bash

straxer --help
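
The sketch below mimics how such a command line could be parsed. It is based only on the ``--context`` and ``--target`` options and the help texts visible in the ``bin/straxer`` hunk of this pull request; the positional ``run_id`` argument, the default target and the help text of ``--target`` are assumptions, and the real script has more options.

```python
import argparse


def parse_args(argv=None):
    """Parse a reduced, illustrative subset of straxer-like arguments."""
    parser = argparse.ArgumentParser(description='Process a single run')
    parser.add_argument(
        'run_id',  # assumed to be positional; kept as a string
        help="ID of the run to process; usually the run name.")
    parser.add_argument(
        '--context',
        default='xenonnt_online',
        help="Name of straxen context to use")
    parser.add_argument(
        '--target',
        default='event_info',  # assumed default, for illustration only
        help="Target data type to produce")
    return parser.parse_args(argv)


args = parse_args(['012100', '--target', 'event_info_double'])
print(args.run_id, args.context, args.target)
```

Keeping the run ID as a string preserves the leading zero of ``012100``.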


ajax [DAQ-only]
----------------
The DAQ-cleaning script. Data is stored on the DAQ such that other tools
like `admix <https://github.com/XENONnT/admix>`_ may ship the data to
distributed storage. A portion of the high level data is stored on the DAQ
for diagnostic purposes for longer periods of time. ``ajax`` removes this
data if needed.
The ``ajax`` script looks for data on the eventbuilders
that can be deleted for at least one of the following reasons:

- A run has been "abandoned", meaning that there is no further use
  for this data. For example, if a board failed during a run, there is no point in
  keeping a run with only part of its data on the DAQ.
- The live-data (intermediate DAQ format, even more raw than raw-records) has
  been successfully processed, so this intermediate datakind can be removed from
  the DAQ.
- A run has been abandoned but there is still live-data on the DAQ-buffer.
- Data is "unregistered" (not in the runs database);
  this only occurs if DAQ-experts perform tests on the DAQ.
- Since bootstrax runs on multiple hosts, some of the data may appear to be
  stored more than once, since a given bootstrax instance could crash during its processing.
  The data of unsuccessful processings should be removed by ``ajax``.
- Finally, ``ajax`` also checks whether all the entries that are in the database are still on the host.
  This sanity check catches any potential issues in the data handling by admix.
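
The reasons listed above can be thought of as a set of independent checks per run, as in the following sketch. All dictionary keys and the function name are hypothetical and chosen for this example only; they do not correspond to the actual ``ajax`` options or to fields in the runs database.

```python
def deletion_reasons(run):
    """Collect the reasons why data of a run could be deleted."""
    reasons = []
    if not run.get('registered', True):
        reasons.append('unregistered')
    if run.get('abandoned', False):
        reasons.append('abandoned')
    if run.get('live_data_processed', False):
        reasons.append('live-data processed')
    if run.get('processing_failed_on_this_host', False):
        reasons.append('unsuccessful processing on this host')
    return reasons


example = {'abandoned': True, 'live_data_processed': True}
print(deletion_reasons(example))
```

Data would only be a candidate for removal if at least one reason is returned.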


bootstrax [DAQ-only]
--------------------
The main DAQ processing script. It is discussed separately and is only used for XENONnT.


fake_daq
------------------
Script that allows mimicking DAQ-processing by opening raw-records data.


microstrax
------------------
Mini strax interface that allows strax-data to be retrieved using HTTP requests
on a given port. At the time of writing, this is used on the DAQ as a pulse viewer.


refresh_raw_records
-------------------
Updates raw-records from old strax versions. This data is of a different
format and needs to be refreshed before it can be opened with more recent
versions of strax.

*Last updated 2021-05-07. Joran Angevaare*