Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: reduce python overhead for awkward backend #554

Merged
merged 2 commits into from
Jan 15, 2025

Conversation

pfackeldey
Copy link
Collaborator

Reduce python overhead of awkward-array backend by "grouping" multiple operations into a single broadcasting traversal with ak.transform.

Description

This PR improves the performance of the awkward-array backend by fusing the application of multiple operations (that would each trigger broadcasting every time) into a single broadcast. (This is conceptually similar to fusing GPU kernels to avoid multiple kernel launches, where a kernel launch would be equivalent to a broadcasting with awkward-array - of course, fusing GPU kernels is very much different otherwise...).

This removes the python overhead significantly, e.g.:

import numpy as np
import awkward as ak

x1 = y1 = z1 = x2 = y2 = z2 = ak.Array([[1, 2, 3], [], None, [4, 5]])

lib = np
angle = 0.3

# an example function that has a lot of operations
from vector._compute.spatial.rotate_axis import cartesian

cartesian(lib, angle, x1, y1, z1, x2, y2, z2)
#(<Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>,
# <Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>,
# <Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>)

%timeit cartesian(lib, angle, x1, y1, z1, x2, y2, z2)
# 17.7 ms ± 33.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# the following happens now automatically in vector (but it's also exposed to be used somewhere else, e.g. coffea?)
import vector

vector.awkward_transform(cartesian)(lib, angle, x1, y1, z1, x2, y2, z2)
#(<Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>,
# <Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>,
# <Array [[1, 2, 3], [], None, [4, 5]] type='4 * option[var * float64]'>)

%timeit vector.awkward_transform(cartesian)(lib, angle, x1, y1, z1, x2, y2, z2)
# 682 μs ± 4.66 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each) 

With this PR vector.awkward_transform is used internally in vector to speed up all vector._compute methods with the awkward backend (through vector's dispatch mechanism).

Note: vector.awkward_transform currently makes some assumptions about the internal vector._compute methods, e.g. their signature and return signature. This could be written in a general way, but where to stop the generality? At some point input and output arguments would need to be flattened and unflattened similar to JAX PyTrees... and that would be a bit out-of-scope for this decorator. Thus, it is currently exposed in vector's public API, but only intended for expert usage.

Oh, and if someone has a better name for this function/decorator, I'm happy to change it.

Checklist

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't any other open Pull Requests for the required change?
  • Does your submission pass pre-commit? ($ pre-commit run --all-files or $ nox -s lint)
  • Does your submission pass tests? ($ pytest or $ nox -s tests)
  • Does the documentation build with your changes? ($ cd docs; make clean; make html or $ nox -s docs)
  • Does your submission pass the doctests? ($ pytest --doctest-plus src/vector/ or $ nox -s doctests)

@pfackeldey
Copy link
Collaborator Author

In context of the trijet mass calculation of the AGC (ref: scikit-hep/awkward#3359), this brings the runtime further down to (from ~25ms):

# 100k events (cpu backend)
In [1]: %timeit calculate_trijet_mass(events)
19.1 ms ± 203 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# typetracer
In [2]: %timeit calculate_trijet_mass(tt)
18.2 ms ± 49.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Copy link
Collaborator

@ianna ianna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pfackeldey - Thanks - this reduces broadcasting overhead as discussed during the meeting! Looks good to me!

@pfackeldey
Copy link
Collaborator Author

pfackeldey commented Jan 14, 2025

Hi @ianna,
thanks for your review!
Do you have some opinions about the name of this function/decorator? And/or how "publicly" exposed it should be?
I was wondering also, if we should add a "Tipp" to the awkward-array documentation that if one wants to apply a batch of e.g. ufuncs, it might be beneficial to do this in a single layout traversal with ak.transform. Is this somewhere already, if not, do you think it's worth to add somewhere?

@ianna
Copy link
Collaborator

ianna commented Jan 15, 2025

Hi @ianna, thanks for your review! Do you have some opinions about the name of this function/decorator? And/or how "publicly" exposed it should be? I was wondering also, if we should add a "Tipp" to the awkward-array documentation that if one wants to apply a batch of e.g. ufuncs, it might be beneficial to do this in a single layout traversal with ak.transform. Is this somewhere already, if not, do you think it's worth to add somewhere?

It's a good idea to add a performance hint in the documentation. Perhaps, even a new section with "good practices" or "performance guide"?

Do you mean something like:

@broadcast_once(axis=0)
def my_func(arr) 
...

@pfackeldey
Copy link
Collaborator Author

Hi @ianna, thanks for your review! Do you have some opinions about the name of this function/decorator? And/or how "publicly" exposed it should be? I was wondering also, if we should add a "Tipp" to the awkward-array documentation that if one wants to apply a batch of e.g. ufuncs, it might be beneficial to do this in a single layout traversal with ak.transform. Is this somewhere already, if not, do you think it's worth to add somewhere?

It's a good idea to add a performance hint in the documentation. Perhaps, even a new section with "good practices" or "performance guide"?

Do you mean something like:

@broadcast_once(axis=0)
def my_func(arr) 
...

ak.transform is already doing the layout traversal (e.g. in context of broadcasting) only once. So we don't need to add a new function. It would just be a hint in the ak.transform documentation. I don't think we can really specify an axis, because all inputs to a transform need to be broadcasted together (numpy's broadcasting also doesn't support specifying an axis).

Ok, I'd go ahead and think about how to add a tipp/hint in awkward's documentation for ak.transform 👍.

@ianna
Copy link
Collaborator

ianna commented Jan 15, 2025

Hi @ianna, thanks for your review! Do you have some opinions about the name of this function/decorator? And/or how "publicly" exposed it should be? I was wondering also, if we should add a "Tipp" to the awkward-array documentation that if one wants to apply a batch of e.g. ufuncs, it might be beneficial to do this in a single layout traversal with ak.transform. Is this somewhere already, if not, do you think it's worth to add somewhere?

It's a good idea to add a performance hint in the documentation. Perhaps, even a new section with "good practices" or "performance guide"?
Do you mean something like:

@broadcast_once(axis=0)
def my_func(arr) 
...

ak.transform is already doing the layout traversal (e.g. in context of broadcasting) only once. So we don't need to add a new function. It would just be a hint in the ak.transform documentation. I don't think we can really specify an axis, because all inputs to a transform need to be broadcasted together (numpy's broadcasting also doesn't support specifying an axis).

Ok, I'd go ahead and think about how to add a tipp/hint in awkward's documentation for ak.transform 👍.

if you are done with the PR, I can merge it :-)

@ianna ianna merged commit 78cd02f into main Jan 15, 2025
18 checks passed
@ianna ianna deleted the pfackeldey/reduce_python_overhead branch January 15, 2025 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants