reviews
* better definition of vector algebra
* don't use only/de-facto - mention PyROOT, fix language
* expand on the backends
* jagged -> ragged + a definition for ragged
Saransh-cpp committed Sep 2, 2024
1 parent c5ea94f commit 6af7186
Showing 2 changed files with 77 additions and 29 deletions.
40 changes: 40 additions & 0 deletions paper/paper.bib
@@ -161,3 +161,43 @@ @software{pylhe
doi = {10.5281/zenodo.1217031},
url = {https://github.com/scikit-hep/pylhe},
}

+@software{root:2020,
+author = {Rene Brun and
+Fons Rademakers and
+Philippe Canal and
+Axel Naumann and
+Olivier Couet and
+Lorenzo Moneta and
+Vassil Vassilev and
+Sergey Linev and
+Danilo Piparo and
+Gerardo GANIS and
+Bertrand Bellenot and
+Enrico Guiraud and
+Guilherme Amadio and
+wverkerke and
+Pere Mato and
+TimurP and
+Matevž Tadel and
+wlav and
+Enric Tejedor and
+Jakob Blomer and
+Andrei Gheata and
+Stephan Hageboeck and
+Stefan Roiser and
+marsupial and
+Stefan Wunsch and
+Oksana Shadura and
+Anirudha Bose and
+CristinaCristescu and
+Xavier Valls and
+Raphael Isemann},
+title = {root-project/root: v6.18/02},
+month = jun,
+year = 2020,
+publisher = {Zenodo},
+version = {v6-18-02},
+doi = {10.5281/zenodo.3895860},
+url = {https://doi.org/10.5281/zenodo.3895860}
+}
66 changes: 37 additions & 29 deletions paper/paper.md
@@ -28,12 +28,16 @@ bibliography: paper.bib

# Summary

-Vector algebra is a crucial component of data analysis pipelines in high energy
-physics, enabling physicists to transform raw data into meaningful results that
-can be visualized. Given that high energy physics data is not uniform, the
-vector algebra frameworks or libraries are expected to work readily on
-non-uniform or jagged data, allowing users to perform operations on an entire
-jagged array in minimum passes. Furthermore, optimizing memory usage and
+Mathematical manipulation of vectors is a crucial component of data analysis
+pipelines in high energy physics, enabling physicists to transform raw data
+into meaningful results that can be visualized. More specifically, high energy
+physicists work with 2D and 3D Euclidean vectors, and 4D Lorentz vectors, which
+represent physical quantities such as position, momentum, and force.
+Given that high energy physics data is not uniform, vector manipulation
+frameworks or libraries are expected to work readily on non-uniform or ragged
+data, that is, data with variable-sized rows (or a nested data structure with
+variable-sized entries); thus, a library is expected to perform operations on an
+entire ragged structure in as few passes as possible. Furthermore, optimizing memory usage and
processing time has become essential with the increasing computational demands
at the LHC. Vector is a Python library for creating and manipulating 2D, 3D,
and Lorentz vectors, especially arrays of vectors, to solve common physics
@@ -45,36 +49,40 @@ high energy physics experiments.

# Statement of need

-Vector is currently the only Lorentz vector library providing a Pythonic
-interface but a C++ (through Awkward Array [@Pivarski:2018]) computational
-backend. Vector integrates seamlessly with the existing high energy physics
+Vector is one of the few Lorentz vector libraries that pair a Pythonic interface
+with a compiled computational backend (through Awkward Array [@Pivarski:2018]).
+Vector integrates seamlessly with the existing high energy physics
ecosystem and the broader scientific Python ecosystem, including libraries like
Dask [@rocklin:2015] and Numba [@lam:2015]. The library implements a variety of
backends for several purposes. Although vector was written with high energy
physics in mind, it is a general-purpose library that can be used for any
-scientific or engineering application. The library houses 3+2 numerical
-backends for experimental physicists and 1 symbolic backend for theoretical
-physicists. These backends include a pure Python object backend for simple
-computations, a SymPy [@Meurer:2017] backend for symbolic computations, a
-NumPy backend for computations on regular data, an Awkward backend for
-computations on jagged data, and implementations of the Object and the Awkward
-backend in Numba for just-in-time compilable operations. Support for JAX and
-Dask is also provided through the Awkward backend, which enable vector
-functionalities to support automatic differentiation and parallel computing.
+scientific or engineering application. The library houses a diverse set of
+backends: three numerical backends for experimental physicists and one symbolic
+backend for theoretical physicists. These backends include:
+
+- a pure Python object (builtin) backend for scalar computations,
+- a NumPy backend for computations on regular collection-type data,
+- a SymPy [@Meurer:2017] backend for symbolic computations, and
+- an Awkward backend for computations on ragged collection-type data.
+
+There also exist implementations of the Object and the Awkward backends in Numba
+for just-in-time compilable operations. Further, support for JAX and Dask is
+provided through the Awkward backend, which enables vector functionalities to
+support automatic differentiation and parallel computing.
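
The distinction between scalar and ragged collection-type inputs can be illustrated with a minimal sketch in plain Python. It deliberately does not use Vector's API: the `Momentum4D` class and the `pt_per_event` helper below are hypothetical stand-ins written only for this illustration, and the explicit Python loops are exactly the kind of element-by-element work that a compiled array backend is meant to avoid.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Momentum4D:
    """Illustrative four-momentum (px, py, pz, E); a stand-in, not Vector's class."""

    px: float
    py: float
    pz: float
    E: float

    def __add__(self, other: "Momentum4D") -> "Momentum4D":
        # Four-vectors add component by component.
        return Momentum4D(
            self.px + other.px,
            self.py + other.py,
            self.pz + other.pz,
            self.E + other.E,
        )

    @property
    def pt(self) -> float:
        # Transverse momentum: length of the (px, py) projection.
        return math.hypot(self.px, self.py)

    @property
    def mass(self) -> float:
        # Invariant mass with the (+, -, -, -) metric: m^2 = E^2 - |p|^2.
        m2 = self.E**2 - (self.px**2 + self.py**2 + self.pz**2)
        return math.sqrt(max(m2, 0.0))  # clamp small negative rounding errors


def pt_per_event(events):
    """Compute pt for every particle, preserving the ragged row structure."""
    return [[particle.pt for particle in event] for event in events]


# Scalar case: a single vector object.
muon = Momentum4D(px=3.0, py=4.0, pz=0.0, E=5.0)

# Ragged case: each event (row) holds a different number of particles.
events = [
    [Momentum4D(3.0, 4.0, 0.0, 5.0), Momentum4D(-3.0, -4.0, 0.0, 5.0)],
    [],
    [Momentum4D(0.0, 6.0, 8.0, 10.0)],
]

ragged_pt = pt_per_event(events)    # rows of length 2, 0, and 1
pair = events[0][0] + events[0][1]  # combined system of the first event
```

The result `ragged_pt` has the same variable-length rows as `events`, which is the property a ragged backend has to preserve while applying one operation to the whole structure.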

## Impact

-Vector has become the de facto library for vector algebra in Python based high
-energy physics data analysis pipelines. The library has been installed over
-2 million times and 314 GitHub repositories use it as a dependency at the time
-of writing this paper. Along with being utilized directly in analysis pipelines
-at LHC and other experiments [@Kling:2023; @Held:2024; @Qu:2022], the library
-is also used as a dependency in user-facing frameworks, such as, Coffea,
-MadMiner [@Brehmer:2020], FastJet [@aryan:2023], Spyral [@spyral-utils:2024],
-Weaver [@weaver-core:2024], and pylhe [@pylhe]. The library is also used in
-multiple teaching materials for graduate courses and workshops. Finally, given
-the generic nature of the library, it is also often used in non high energy
-physics use cases.
+Alongside PyROOT's TLorentzVector [@root:2020], vector has become a popular
+choice for vector manipulations in Python-based high energy physics data
+analysis pipelines. The library has been installed over 2 million times and 314
+GitHub repositories use it as a dependency at the time of writing this paper.
+Along with being utilized directly in analysis pipelines at the LHC and other
+experiments [@Kling:2023; @Held:2024; @Qu:2022], the library is used as a
+dependency in user-facing frameworks, such as Coffea, MadMiner [@Brehmer:2020],
+FastJet [@aryan:2023], Spyral [@spyral-utils:2024], Weaver [@weaver-core:2024],
+and pylhe [@pylhe]. The library is also used in multiple teaching materials for
+graduate courses and workshops. Finally, given the generic nature of the library,
+it is often used in use cases outside high energy physics.

# Acknowledgements
