Work on performance issues in summary statistics due to using ArrayBase::sum #35
Labels: Enhancement (new feature or request); On Hold (issues we can't act upon due to external constraints, e.g. a missing Rust feature or other libraries)
The summary statistics methods use `ArrayBase::sum` (directly or indirectly) in anticipation of pairwise summation (rust-ndarray/ndarray#577), which provides improved accuracy over naive summation using `fold`. However, to do this, some of the methods have unnecessary allocations or other performance issues.

For example,
`harmonic_mean` is implemented by mapping the elements with `self.map` and then calling `.mean()` on the result. It's implemented this way to take advantage of `.mean()` (which is implemented in terms of `.sum()`), but this approach requires a temporary allocation for the result of `self.map`.

`summary_statistics::means::moments`
has a similar issue. It's implemented on top of `a.map` to take advantage of `.sum()`. However, this implementation requires a temporary allocation for the result of `a.map`. Additionally, it would probably be faster to make the loop over `k` the innermost loop to improve the locality of reference.

We should be able to resolve these issues with a lazy version of
`map` combined with a pairwise summation method on that lazy `map`. Something like jturner314/nditer would work once it's stable.

[Edit: This issue also appears in the entropy methods.]
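To make the lazy-`map` idea concrete, here is a minimal sketch using only the standard library, with a plain `&[f64]` slice standing in for an ndarray array. The names `pairwise_sum_n`, `pairwise_sum`, and `NAIVE_THRESHOLD` are hypothetical and are not part of ndarray or ndarray-stats. The mapped values are consumed straight from the iterator, so no temporary buffer is allocated.

```rust
/// Block size below which the sketch falls back to a sequential sum.
/// (Hypothetical tuning constant, not taken from ndarray.)
const NAIVE_THRESHOLD: usize = 8;

/// Pairwise (cascade) summation of the next `len` items of `iter`.
/// Items are pulled lazily, so mapped values are never stored.
fn pairwise_sum_n<I>(iter: &mut I, len: usize) -> f64
where
    I: Iterator<Item = f64>,
{
    if len <= NAIVE_THRESHOLD {
        // Small block: plain left-to-right sum.
        iter.by_ref().take(len).sum()
    } else {
        // Split the block in two, sum each half, then combine.
        let half = len / 2;
        let left = pairwise_sum_n(&mut *iter, half);
        let right = pairwise_sum_n(&mut *iter, len - half);
        left + right
    }
}

/// Pairwise summation over any iterator that knows its length.
fn pairwise_sum<I>(iter: I) -> f64
where
    I: ExactSizeIterator<Item = f64>,
{
    let len = iter.len();
    let mut iter = iter;
    pairwise_sum_n(&mut iter, len)
}

/// Harmonic mean without a temporary allocation: the reciprocals are
/// produced lazily by `map` and fed directly into the pairwise sum.
/// Returns `None` for empty input.
fn harmonic_mean(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let n = values.len() as f64;
    let sum_recip = pairwise_sum(values.iter().map(|x| x.recip()));
    Some(n / sum_recip)
}
```

For instance, `harmonic_mean(&[1.0, 2.0, 4.0])` sums `1 + 0.5 + 0.25` lazily and divides 3 by the result. A real implementation would need to handle multi-dimensional, possibly non-contiguous arrays, which is where something like a lazy ndarray iterator would come in.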
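The loop-order suggestion for `moments` could look roughly like the sketch below. It is an illustration under assumptions, not the crate's code: `central_moments` is a hypothetical helper over a non-empty `&[f64]` slice with a precomputed mean, and it uses plain running accumulation rather than pairwise summation, so it shows only the locality benefit of making the loop over `k` innermost.

```rust
/// Returns the central moments of orders `0..=order`, i.e.
/// `sum((x - mean)^k) / n` for each `k`.
/// Assumes `values` is non-empty (hypothetical helper for illustration).
fn central_moments(values: &[f64], mean: f64, order: usize) -> Vec<f64> {
    let n = values.len() as f64;
    // One accumulator per order k; no per-order temporary array.
    let mut sums = vec![0.0_f64; order + 1];
    for &x in values {
        let d = x - mean;
        let mut power = 1.0; // d^0
        // Innermost loop over k: each element is read once and its
        // successive powers are accumulated while it is still in a register.
        for s in sums.iter_mut() {
            *s += power;
            power *= d;
        }
    }
    sums.iter().map(|s| s / n).collect()
}
```

Each input element is visited exactly once here, whereas mapping the whole array once per `k` walks the data `order + 1` times and allocates a temporary each time. Combining this loop order with pairwise summation would require one partial sum per order at every level of the recursion.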