sfc-gh-joshi
Contributor

What do these changes do?

Implements __array_function__ to prevent errors when a backend implements the function as an extension method. See linked issue for details.

  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves BUG: Overriding __array_function__ with extensions causes AttributeError with NEP18 dispatch #7616
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

The result of applying the function to this dataset. By default, it will return
a NumPy array.
"""
return self._query_compiler.do_array_function_implementation(
Contributor

Out of curiosity, what would happen here if you just returned the NotImplemented sentinel? Would that be a way to avoid calling __array__()?

Contributor Author

TypeError: no implementation found for 'numpy.where' on types that implement __array_function__: [<class 'modin.pandas.dataframe.DataFrame'>]

I think numpy only explicitly calls __array__ if no __array_function__ implementation is available.
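For illustration, here is a minimal sketch of the three cases discussed in this thread, using hypothetical stand-in classes rather than Modin's DataFrame. The class names and the `where_or_error` helper are assumptions made up for this example. The fallback shown in `ConvertsThenCalls` is only one simple way to "convert and call NumPy", not necessarily what the query compiler does.

```python
import numpy as np


class OnlyArray:
    """Hypothetical container that only supports conversion via __array__."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this when it needs a plain ndarray.
        return np.array(self._values, dtype=dtype)


class DeclinesDispatch(OnlyArray):
    """Defines __array_function__ but always declines (returns NotImplemented)."""

    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented


class ConvertsThenCalls(OnlyArray):
    """Defines __array_function__ and falls back to plain NumPy arrays."""

    def __array_function__(self, func, types, args, kwargs):
        # Only top-level positional arguments are converted in this sketch.
        converted = tuple(
            np.asarray(a) if isinstance(a, OnlyArray) else a for a in args
        )
        return func(*converted, **kwargs)


def where_or_error(obj):
    """Return the indices of nonzero entries, or the TypeError message."""
    try:
        return np.where(obj)[0].tolist()
    except TypeError as exc:
        return str(exc)


print(where_or_error(OnlyArray([0, 5, 7])))          # converted via __array__
print(where_or_error(DeclinesDispatch([0, 5, 7])))   # "no implementation found ..."
print(where_or_error(ConvertsThenCalls([0, 5, 7])))  # handled by the override
```

In this sketch, once `__array_function__` exists on the type, returning `NotImplemented` does not make NumPy fall back to `__array__`; the call fails with the `TypeError` quoted above.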

@sfc-gh-joshi sfc-gh-joshi merged commit 69f2751 into modin-project:main Jun 30, 2025
41 checks passed
@sfc-gh-joshi sfc-gh-joshi deleted the joshi/array_function branch June 30, 2025 20:44
sfc-gh-joshi added a commit to snowflakedb/snowpark-python that referenced this pull request Jun 30, 2025
…nd native Series constructor switching bugs (#3498)

SNOW-2157873 occurs because upstream modin does not implement __array_function__, instead converting to numpy ndarrays via __array__ when a numpy function is called on it. The presence of the extension wrapper for __array_function__ introduced by Snowpark pandas confuses numpy dispatch, causing unexpected AttributeErrors. This is fixed upstream with modin-project/modin#7617, and will presumably become available in the next modin release. On the Snowpark side, this PR adds relevant tests, and adds a version-guarded flag to remove the extension function and push it down to the query compiler.

SNOW-2173644 occurs in specific circumstances when determining switching conditions for the DataFrame constructor. Series objects are treated as dict-like, but Series.values is a property rather than a function. We thus skip over native_pd.Series objects in the dict-like check in move_to_me_cost.
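
The dict-like pitfall described above can be sketched with a stand-in class. Everything here is hypothetical: `SeriesLike`, `looks_dict_like`, and `dict_like_values` are names invented for this example and are not the Snowpark or Modin code paths. The sketch only shows why an object with `keys()` but a `values` property breaks code that calls `obj.values()`, and how skipping that type avoids the error.

```python
class SeriesLike:
    """Hypothetical stand-in for a Series: dict-like, but `values` is a property."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return self._data.keys()

    def __getitem__(self, key):
        return self._data[key]

    def __contains__(self, key):
        return key in self._data

    @property
    def values(self):
        # A property, so `obj.values()` tries to call the returned list.
        return list(self._data.values())


def looks_dict_like(obj):
    """Duck-typed check, assumed here to resemble a typical dict-like test."""
    return all(hasattr(obj, name) for name in ("keys", "__getitem__", "__contains__"))


def dict_like_values(obj, skip_types=()):
    """Return the values of a dict-like object, or None if it is skipped."""
    if isinstance(obj, skip_types) or not looks_dict_like(obj):
        return None
    return list(obj.values())


print(dict_like_values({"a": 1, "b": 2}))  # plain dict works

try:
    dict_like_values(SeriesLike({"a": 1}))
except TypeError as exc:
    print("unguarded call failed:", exc)

# Skipping the Series-like type up front avoids the failing call.
print(dict_like_values(SeriesLike({"a": 1}), skip_types=(SeriesLike,)))
```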