Let view include dtype change and pass that to __array_finalize__ #31271

Draft
mhvk wants to merge 4 commits into numpy:main from mhvk:view-include-dtype-change

Contributor

@mhvk mhvk commented Apr 19, 2026

This PR, an alternative to #31234, aims to have PyArray_View() produce a view with both a possibly new subtype and a new dtype in a single step. In principle, I feel that is where we should be aiming, but I found that for MaskedArray I had to make more changes than in #31324, because its __array_finalize__ relies on the dtype not being correct yet.
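
For reference, the ordering that code like MaskedArray depends on can be observed with a small subclass. This is only a sketch: `Recorder` and its `seen` attribute are hypothetical names for illustration, and which dtype gets recorded depends on the NumPy version (the old dtype with the two-step behaviour as I understand it, the new dtype with the one-step view proposed here).

```python
import numpy as np


class Recorder(np.ndarray):
    """Hypothetical subclass recording the dtype seen during finalization."""

    seen = None

    def __array_finalize__(self, obj):
        # Record the dtype the new array has at the moment it is finalized.
        self.seen = self.dtype


base = np.arange(4, dtype=np.int64)
# Ask for a new dtype and a new subtype in one call.
v = base.view(np.int32, Recorder)

print("dtype seen in __array_finalize__:", v.seen)
print("dtype of the returned view:      ", v.dtype)
```

If the two printed dtypes differ, `__array_finalize__` ran before the dtype change was applied, which is the behaviour an existing subclass may silently rely on.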

Given that, there may be other code that relies on the current behaviour of view, so if we go with this at all, I would need to add code similar to #31324 to check whether someone has a dtype override (I think _set_dtype is not present in any published version, so one would perhaps not have to check for that).

Also, when thinking about this more, I realized that in this PR the shape and strides can be incorrect when __array_finalize__ is called. This is not difficult to solve, but I thought I would get some opinions on whether this is the right direction at all before doing that. Though opening as draft for now.
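
To illustrate why the shape and strides matter: a view to a dtype with a different itemsize rescales the last axis, so an `__array_finalize__` that runs after the dtype has been swapped but before that adjustment would see stale values. A minimal sketch of the adjustment itself, using only existing NumPy behaviour (the array names are arbitrary):

```python
import numpy as np

a = np.zeros((2, 4), dtype=np.int64)
# Halving the itemsize doubles the length of the last axis.
b = a.view(np.int32)

print(a.shape, a.strides)  # (2, 4) (32, 8)
print(b.shape, b.strides)  # (2, 8) (32, 4)
```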

Note that now that I've made this, I think there is very little reason not to just merge #31234, as it seems clear that what I have here can be done as a follow-up as well.

No AI was used.

Member

seberg commented Apr 19, 2026

The intermediate view may not be a big deal, but yeah, it's definitely nice to skip it!

> but I thought I would get some opinions on whether this is the right direction at all before doing that. Though opening as draft for now

I like this in principle, but I think there are two subtleties, and unless we shrug them off (probably OK, but...), I am actually not sure there is a clean solution:

  • I think PyArray_View() diverges from .view() here, and I am not sure that it still works as well as it should with this change (without fixing __array_finalize__ there can be regressions, e.g. MaskedArray itself could have one in theory).
  • It would be nice to keep .dtype = working, with a DeprecationWarning, so as not to break existing implementations.

Now, I can see calling obj.view() from PyArray_View() if it exists, while arr.view() would of course not do that and would (try to) call .dtype = (that is, arr.view() would keep the current implementation).
In that scenario, I think we get:

  • Defining view() can be used to avoid the .dtype = setting (but it may be a bit inconvenient).
  • If you delete .view(), you can push all logic into __array_finalize__, which may make a lot of sense. (But MaskedArray is interesting as it adds a kwarg, so we can't delete it.)
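
A rough sketch of the first option, written against existing NumPy only. `Tracked` and `source_dtype` are hypothetical names, and as far as I know the C-level PyArray_View() does not go through a Python-level override like this today, which is the divergence mentioned above.

```python
import numpy as np


class Tracked(np.ndarray):
    """Hypothetical subclass remembering the dtype it was last viewed from."""

    source_dtype = None

    def __array_finalize__(self, obj):
        # Only propagate existing state; no dtype-specific logic lives here.
        self.source_dtype = getattr(obj, "source_dtype", None)

    def view(self, *args, **kwargs):
        # Let ndarray build the view, then fix up subclass state once the
        # dtype and subtype of the result are final.
        out = super().view(*args, **kwargs)
        if isinstance(out, Tracked):
            out.source_dtype = self.dtype
        return out


t = np.arange(4, dtype=np.int64).view(Tracked)
v = t.view(np.int32)
print(v.source_dtype, v.dtype, v.shape)
```

The second option would move the fix-up into `__array_finalize__`, which only works if the dtype (and shape) are already correct when it is called.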

So, in my approach _set_dtype sticks around indefinitely, while in this one .view() does, though it may not be used.

With _set_dtype we could eventually deprecate it and force a transition to __array_finalize__ only, but it's a two-step thing. (I am not actually sure we can if we promise to call .view(), though.)

But too much text, not sure what it gives us. To me, using _set_dtype is probably still the easier choice. But if we either disregard the above problems a bit, or are sure we like PyArray_View() calling subclass.view() as the final state, then I think this is better.
