@pfackeldey
Collaborator

@pfackeldey pfackeldey commented Sep 16, 2025

This PR adds axis=None reducer specializations. Should fix #1250.

A reducer may now implement a specialization of itself that is used in the axis=None case (and in the axis=0 or axis=-1 case for 1D rectangular arrays). With this PR, the specialization uses the underlying nplike implementation of the reducer, i.e. np.sum, whose summation algorithm compensates for floating-point rounding error. For this interface to work, I extended the nplike interface to also implement sum.
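
To make the idea concrete, here is a minimal sketch of what such a specialization could look like. It is not the code in this PR: `SumReducerSketch`, `apply`, `apply_axis_none`, and `reduce_sketch` are hypothetical names, and the per-list loop only stands in for the general reduction path.

```python
import numpy as np


class SumReducerSketch:
    """Hypothetical reducer; the class and method names are illustrative only."""

    def apply(self, data, offsets):
        # General path: reduce every sublist delimited by consecutive offsets.
        return np.array(
            [data[start:stop].sum() for start, stop in zip(offsets[:-1], offsets[1:])],
            dtype=data.dtype,
        )

    def apply_axis_none(self, data):
        # Specialization: a single call into the array library's own reduction.
        return np.sum(data)


def reduce_sketch(reducer, data, offsets, axis):
    # axis=None collapses everything, so the flat buffer can be reduced directly
    # (this sketch assumes the offsets cover the whole buffer).
    if axis is None:
        return reducer.apply_axis_none(data)
    return reducer.apply(data, offsets)


data = np.array([1.0, 2.0, 3.0, 4.0, 5.0], dtype=np.float32)
offsets = np.array([0, 2, 2, 5])

total = reduce_sketch(SumReducerSketch(), data, offsets, axis=None)
per_list = reduce_sketch(SumReducerSketch(), data, offsets, axis=1)
print(total, per_list)
```

The point of the split is that the axis=None path never has to go through the per-list machinery, so it can hand the whole buffer to whatever reduction the array library provides.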

@ikrommyd and @valsdav could you give this a try?

@ianna what do you think about these specializations? Unfortunately, this does not get rid of intermediate arrays; that would be a much more invasive change to the codebase. However, I noticed that ak.sum(..., axis=None) is now quite a bit faster than before (due to the NumPy kernel).

(I might need some help regarding the CUDA issue here @ianna :D)

@codecov

codecov bot commented Sep 16, 2025

Codecov Report

❌ Patch coverage is 78.12500% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.66%. Comparing base (b749e49) to head (3352c53).
⚠️ Report is 426 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/awkward/_nplikes/cupy.py | 16.66% | 5 Missing ⚠️ |
| src/awkward/_nplikes/typetracer.py | 50.00% | 1 Missing ⚠️ |
| src/awkward/_reducers.py | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| src/awkward/_do.py | 84.14% <100.00%> (+0.71%) ⬆️ |
| src/awkward/_nplikes/array_module.py | 95.25% <100.00%> (+8.87%) ⬆️ |
| src/awkward/_nplikes/numpy_like.py | 100.00% <ø> (+24.70%) ⬆️ |
| src/awkward/_nplikes/typetracer.py | 77.11% <50.00%> (+2.25%) ⬆️ |
| src/awkward/_reducers.py | 98.11% <92.85%> (+0.79%) ⬆️ |
| src/awkward/_nplikes/cupy.py | 38.38% <16.66%> (+0.60%) ⬆️ |

... and 191 files with indirect coverage changes

@ikrommyd
Collaborator

I like this

In [9]: np.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=0)
Out[9]: np.float32(108436264.0)

In [10]: np.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=-1)
Out[10]: np.float32(108436264.0)

In [11]: np.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=None)
Out[11]: np.float32(108436264.0)

In [12]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=0)
Out[12]: np.float32(108436264.0)

In [13]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=-1)
Out[13]: np.float32(108436264.0)

In [14]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=None)
Out[14]: np.float32(108436264.0)

@github-actions

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3653

@pfackeldey
Collaborator Author

Closing this as @ianna will take this over.

@pfackeldey pfackeldey closed this Sep 16, 2025
@ianna ianna reopened this Sep 16, 2025
Collaborator

@ianna ianna left a comment

@pfackeldey Thanks for jumping in on this one! I was planning to tackle it, but you’ve already done a great job. Everything looks good from my side—let’s go ahead and merge it. Appreciate the teamwork! 👍

@pfackeldey
Collaborator Author

OK 👍 Let me have another close look tomorrow. The last thing I'm not sure about is the axis overwriting for flat 1D array types; I'd appreciate your input here.

@ikrommyd
Collaborator

@pfackeldey @ianna there is still the issue that no error is raised for a wrong axis value, which I commented on above. It should be addressed before merging.

@pfackeldey
Collaborator Author

I think I fixed it. Could you have another look, @ianna and @ikrommyd?

@ikrommyd
Collaborator

> I think I fixed it. Could you have another look @ianna and @ikrommyd ?

It now seems to use NumPy for the summation and to raise a proper error for bad user-passed axes.

In [17]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=0)
Out[17]: np.float32(108436264.0)

In [18]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=-1)
Out[18]: np.float32(108436264.0)

In [19]: ak.sum(uproot.open("../../Hgg/testing/mc.root:Events")["genWeight"].array().to_numpy(), axis=None)
Out[19]: np.float32(108436264.0)

ValueError: axis=1 exceeds the depth of the nested list structure (which is 1)

This error occurred while calling

    ak.sum(
        numpy.ndarray([ 209.77744  209.77744  209.77744 ...
        axis = 1
    )
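
For reference, the behaviour checked above (axis=0, axis=-1 and axis=None all take the flat path, anything else errors) can be sketched roughly as follows. `resolve_flat_axis` is a hypothetical helper written for illustration, not a function in awkward; only the error message is copied from the output above.

```python
def resolve_flat_axis(axis, depth=1):
    """Hypothetical helper: validate a user-passed axis for an array of the given depth."""
    if axis is None:
        # axis=None always means "reduce over everything".
        return None
    if not -depth <= axis < depth:
        # Message text taken from the ValueError shown above.
        raise ValueError(
            f"axis={axis} exceeds the depth of the nested list structure (which is {depth})"
        )
    # Map negative axes onto their non-negative equivalent (0 for a 1D array).
    return axis % depth


for candidate in (None, 0, -1):
    print(candidate, "->", resolve_flat_axis(candidate))
```

With depth=1, both 0 and -1 resolve to axis 0, which for a flat array reduces the same elements as axis=None, so the same specialization can serve all three cases while axis=1 still raises.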

Collaborator

@ikrommyd ikrommyd left a comment

This addresses the issue for analyzers so it looks good to me!

Tests may be good to add, but @ianna should decide that.
It's possible to construct a float32 array for which awkward on main currently sums less accurately than NumPy. This PR fixes that.

In [17]: array = np.random.normal(loc=10000, scale=1000, size=10000).astype(np.float32)

In [18]: np.sum(array), ak.sum(array)
Out[18]: (np.float32(99882600.0), np.float32(99882670.0))
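
A self-contained way to see this kind of discrepancy without awkward is to compare a strictly left-to-right float32 accumulation with np.sum, which uses pairwise summation for contiguous data. This is only a sketch: the sequential sum below (taken from np.cumsum) is an assumed stand-in for a simple accumulation kernel, not awkward's actual kernel, and the function names are made up for the example.

```python
import numpy as np


def naive_sum_float32(values):
    # cumsum accumulates left to right in float32; its last entry is the
    # plain sequential sum, with no pairwise grouping of terms.
    return np.cumsum(values, dtype=np.float32)[-1]


def compare_sums(size=10_000, seed=0):
    rng = np.random.default_rng(seed)
    values = rng.normal(loc=10_000, scale=1_000, size=size).astype(np.float32)
    return {
        "naive": naive_sum_float32(values),
        "numpy": np.sum(values),
        # Accumulating in float64 serves as the reference value.
        "reference": np.sum(values, dtype=np.float64),
    }


results = compare_sums()
for name, value in results.items():
    error = abs(float(value) - float(results["reference"]))
    print(f"{name:>9}: {value}  (abs error {error})")
```

The exact numbers depend on the seed, so none are quoted here; the comparison against the float64 reference is what matters.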

Collaborator

@ianna ianna left a comment

@pfackeldey - Great! Thanks!

@ianna ianna added the pr-next-release Required for the next release label Sep 18, 2025
@ianna ianna merged commit 0415602 into scikit-hep:main Sep 18, 2025
46 checks passed

Development

Successfully merging this pull request may close these issues.

Use improved summation routine in sum kernels

3 participants