Add `IntoPyObject` conversion trait #4060

Icxolu · 2024-04-08T16:55:38Z

This implements a proof of concept of a new fallible conversion trait.

Design

IntoPyObject

This introduces a new trait IntoPyObject to perform the conversion of implemented type into Python. Currently the trait is generic so that the same type can have multiple conversions to different Python types. In this version I used it to provide an additional typed conversion for the chrono types when the unlimited API is enabled. There might be more situations where this could be useful. I changed the generic to an associated type as outlined in #4060 (comment)

The Error is specified as an associated type, since every conversion should have a single applicable error (or Infallible if it can't fail).

The return type is specified to Bound<T> to guarantee that we convert to a Python type.

pub trait IntoPyObject<'py>: Sized {
    type Target;
    type Error;

    fn into_pyobject(self, py: Python<'py>) -> Result<Bound<'py, Self::Target>, Self::Error>;
}

IntoPyObjectExt

Additionally to IntoPyObject this also adds IntoPyObjectExt. The sole purpose of this trait is to allow specifying the desired output type using the turbo fish syntax:

obj.into_pyobject::<PyDateTime, _>(py)

~~This makes it much nicer to work with types that implement multiple conversions and need type hinting. This would probably be the API most users would interact with.~~ This is only beneficial in the case of a generic trait.

Macros

To use the new conversion on return values of #[pyfunction] and friends without breaking current code, a variant of ~~autoref~~ deref specialization is implemented. It prefers the implementation of IntoPyObject over IntoPy<PyObject>. (~~Similar to the current approach the conversion to PyAny needs to be available.~~) This means there should be no changes required by users. The macros will pick up the fallible IntoPyObject implementation automatically, if available. For the migration we could then just deprecate IntoPy (and ToPyObject) and advise to implement IntoPyObject instead, removing the old conversions in a future release. Additionally there is another layer of specialization for the () return type in #[pymethods] and #[pyfunctions]. This allows converting them into PyNone as part of the macro code, because that's the Python equivalent of a function returning "nothing", while still using PyTuple as the IntoPyObject::Target.

Open Questions

Do we need IntoPyObject be generic, or would an associated type also work? Maybe implementing it on some more types will give us more insight whether the generic is needed.
From my initial implementation I have the feeling that the generic version introduces a lot of complexity while not really providing a big benefit.
What to do with APIs that are generic over IntoPy and ToPyObject? I assume we need to introduce new variants that are generic over IntoPyObject, similar to what we did with the _bound constructors.

There are probably some edge cases that don't work correctly, but I thought I'll gather some feedback first whether this goes into the right direction before I chase those down.

TODOs

rename and seal AnyBound, possible candidates:
- BoundObject
- ~~BoundType~~
- ~~IntoPyObjectOutput~~
- ~~BoundMethods~~
lift type restrictions of Either impl

Followups

add implementation to references of container types like &HashMap, &BTreeSet or &Vec
optimize implementation for PyRef/PyRefMut, possibly by storing a Borrowed in them
work out the migration of IntoPy<PyObject> and ToPyObject
- rework implementations that currently rely on APIs with the old bounds
- we can only provide a blanket impl for one of them...
work out PyBytes specialization ([RFC] standardize PyBytes <-> Vec<u8> or &[u8] or Cow<[u8]> #4182)
add BoundObject to the prelude?

Xref #4041

codspeed-hq · 2024-04-08T20:39:20Z

CodSpeed Performance Report

Merging #4060 will not alter performance

_{Comparing Icxolu:into-pyobject (3be2433) with main (3bf2f1f)}

🎉 Hooray! `codspeed-rust` just leveled up to 2.6.0!

A heads-up, this is a breaking change and it might affect your current performance baseline a bit. But here's the exciting part - it's packed with new, cool features and promises improved result stability 🥳!
Curious about what's new? Visit our releases page to delve into all the awesome details about this new version.

Summary

✅ 68 untouched benchmarks

Icxolu · 2024-04-15T21:04:56Z

I've played around with this some more, and added more conversions for experimentation. (I will remove most of the impls again from here, once we reached some consensus, since many of them warrant their own PRs)

I have the feeling that an associated type is the more appropriate choice:

In pretty much all cases there is one clear Python type to turn a Rust type into.
If there would be a case for multiple different conversions, I think having wrapper types for each conversion would a much clearer option.
The API is more ergonomic
- We don't need a separate implementation for PyAny, which is nearly all cases just forwards to a different typed impl.
- Stricter type safety - because there is only one target type, choosing most appropriate one (instead of falling back to PyAny for the macros) involves less boilerplate.
- There is no need for an additional extension trait.

For these points I switched the generic for an associated type.

The only type that's a bit special is (), which should have Target = PyTuple, but still turn into None as a return type of #[pyfunction] and #[pymethods]. Since that is all inside macro code, I added another specialization layer to handle that.

davidhewitt · 2024-05-17T11:41:58Z

I just wanted to comment that although I haven't managed to actually reply or give thought in detail here, I'm really excited for this and want to circle back to it ASAP. I think at this point it might be that this is too big of a change for landing in 0.22, but I'd be keen to see it land in 0.23 if we can make it happen!

Icxolu · 2024-05-17T18:17:47Z

I'm really excited for this and want to circle back to it ASAP.

I'm excited too! I hope this can both enable new use cases, that we previously just didn't support, while also reducing the complexity to implement Python conversions. Having a single trait in each direction is hopefully making it much clearer how everything is wired together.

I think at this point it might be that this is too big of a change for landing in 0.22, but I'd be keen to see it land in 0.23 if we can make it happen!

Yes, I agree. Given how close we are to releasing 0.22 I think it makes sense to target 0.23 here 🤞I think there is also some more work following this, I included a lot of new impls in here for demontration purposes, but in general I think it would make sense to land them seperately, so we can discuss them individually and make reviewing easier. So no need to rush 😄

Icxolu · 2024-07-06T21:58:41Z

I just rebased this again, now that 0.22 is shipped. It seems like that the deref specialization used here prevents the nicer error message that we introduced in #4220 for the missing conversion trait case (or I haven't found the correct place yet).

We should also think again about whether and how we want to integrate the ideas from #4182 here.

davidhewitt

I just rebased this again, now that 0.22 is shipped. It seems like that the deref specialization used here prevents the nicer error message that we introduced in #4220 for the missing conversion trait case (or I haven't found the correct place yet).

We should also think again about whether and how we want to integrate the ideas from here.

Awesome! I am heavily in favour of moving forward with this trait for 0.23, and I think this PR is probably the correct thing to land as a first step more or less in this form. We can then iterate towards the 0.23 release.

For this PR:

The conflict with Improve the span and message for return types of pymethod/functions #4220 - do you want me to pull this branch and see if I can figure it out? I also wonder if we should remove the GIL refs deprecations from the macros to simplify the problem.
I'll seek to give a full review at earliest availability.

Follow-ups:

I think we will need to solve the migration story for how to re-implement ToPyObject and IntoPy<PyObject> (plus other forms) in terms of this new API, and figure out how we expect users to need to adjust code.
Ideas from [RFC] standardize PyBytes <-> Vec<u8> or &[u8] or Cow<[u8]> #4182 - yes I think we should do these. It might be worth first solving the above bullet so that ToPyObject and IntoPy behave consistently with whatever we add to this trait.
I wonder if we should also adjust FromPyObject to have the type Error associated type, for consistency (and possible optimisations).

src/conversion.rs

Icxolu · 2024-07-07T13:54:46Z

The conflict with Improve the span and message for return types of pymethod/functions #4220 - do you want me to pull this branch and see if I can figure it out? I also wonder if we should remove the GIL refs deprecations from the macros to simplify the problem.

I will give this another try later today or tomorrow, but since there is now more that one trait involved, and we are intentionally vague about which one the compiler should use, I feel like there is no clear choice for the compiler to say "This one is not implemented" instead we now have a set of traits from which any implementation would be fine. Therefore the error message is more generic in that case (or at least that would be my explanation). I can also prepare a PR removing the gil-ref deprecations from the macros to see if that helps.

I'll seek to give a full review at earliest availability.

Awesome, much appreciated! Are you fine with me leaving all the trait implementations in for the review, or should I remove them here we land them separately?

I think we will need to solve the migration story for how to re-implement ToPyObject and IntoPy<PyObject> (plus other forms) in terms of this new API, and figure out how we expect users to need to adjust code.

I will probably look at that next 👍

davidhewitt · 2024-07-09T16:24:03Z

Are you fine with me leaving all the trait implementations in for the review, or should I remove them here we land them separately?

I think all implementations are fine to be in this PR 👍

davidhewitt

Overall this looks like it's shaping up nicely and I'm optimistic that we've got a good and flexible enough formulation here. The trait definition is now a fair bit more complex than ToPyObject or IntoPy, but I think that can be handled by good documentation and examples.

Reading through more and more of the implementations I find myself tempted to scrutinise each one, so I'm beginning to wonder how we can land this in chunks without making this PR get stuck in review hell. I think answering that question will probably give us an incremental pathway that also works for users, so I'll keep musing on that point.

I haven't yet looked too hard at the specialization in the macros, I will try to study that soon.

src/conversion.rs

src/conversions/either.rs

davidhewitt · 2024-07-10T15:38:39Z

src/conversions/hashbrown.rs

+            dict.set_item(
+                k.into_pyobject(py)?.into_bound(),
+                v.into_pyobject(py)?.into_bound(),
+            )?;


I think here it would be nice to just have dict.set_item(k, v) and ideally the fact that K and V implement IntoPyObject would be enough.

Yes, I guess this ties into how do the migration from IntoPy/ToPyObject. Maybe this is something to solve in a followup once we have put some thoughts into how that can and should work? This implementation is kind of workaround for that missing piece.

Agreed, I think a followup makes sense.

davidhewitt · 2024-07-10T15:39:46Z

src/conversions/hashbrown.rs

@@ -51,6 +54,29 @@ where
    }
 }

+impl<'py, K, V, H> IntoPyObject<'py> for hashbrown::HashMap<K, V, H>


I guess there's a question here, should we implement this for &HashMap too? (Subject to &K and &V implementing IntoPyObject.)

Similar for a lot of other containers.

Yes, I think it would be desirable to provide these, since there is no real reason why one should not be able to continue using such a container after the conversion to Python.

I vaguely remember that I tried to add them at some point but ran into lots of HRTBs, I just tried again and it worked like I would have expected so either the implementation evolved and made it simpler, or I just did it wrong last time 🤷.

Since this is already quite a lot I suggest to land them separately.

davidhewitt · 2024-07-10T16:08:01Z

src/conversions/std/array.rs

+where
+    T: IntoPyObject<'py>,
+{
+    type Target = PyList;


I guess there's a relevant discussion point here if [u8; N] should be treated as a byte slice and become PyBytes (in which case this here might be PyAny, unless we can find a way to do a type-level specialization on this associated type).

davidhewitt · 2024-07-10T16:09:52Z

src/pycell.rs

+impl<'py, T: PyClass> IntoPyObject<'py> for PyRef<'py, T> {
+    type Target = T;
+    type Output = Bound<'py, T>;
+    type Error = Infallible;
+
+    fn into_pyobject(self, _py: Python<'py>) -> Result<Self::Output, Self::Error> {
+        Ok(self.inner.clone())
+    }
+}


I wonder if somehow we can manually decrement the borrow count and then just return self.inner to avoid the clone here (same for the PyRefMut case).

Yeah, these are suboptimal... I haven't figured out a way to move the inner because of the Drop impl. But with the complete remove of the gil-refs now, maybe we can also now store a Borrowed in PyRef/PyRefMut, which should solve that issue I think.

Reflecting on Borrowed in PyRef / PyRefMut, it creates the downside we already saw with types like Vec<&'a str> where they can't easily be extracted. (Because a PyRef<'a, 'py, T> containing a borrowed won't be able to be stored by iterating a list, for example.)

But I think on balance that might actually be a good thing, because it helps prevent locking objects to read/write for longer than desired. (The migration solution would be encouraging users to extract Vec<Bound<'py, T>> and then .borrow() for just the needed span of code.)

davidhewitt · 2024-07-12T11:30:35Z

I've been thinking hard about the migration options, in particular what we want users to have to do when upgrading.

I think if we finish the change to FromPyObject to be replaced by FromPyObjectBound then already in 0.23 users will be making changes to trait implementations. So we could present 0.23 as an adjustment to traits for performance and flexibility and document the migration work needed in both the from-python and to-python directions.

I have been thinking hard about a blanket implementation of IntoPyObject from IntoPy<Py<PyAny>> or the reverse. Both seem to have downsides:

impl<T: IntoPy<Py<PyAny>>>> IntoPyObject for T - this would mean we could change all PyO3 APIs to use IntoPyObject and migrate trait implementations incrementally. Users migrate from IntoPy<Py<PyAny>> to IntoPyObject implementations. I think the biggest pain users will face is that .into_py() calls will immediately break when replacing IntoPy<Py<PyAny>> with IntoPyObject implementations (and we wouldn't be allowed both because of the blanked).
impl<T: IntoPyObject> IntoPy<Py<PyAny>> for T - this would be potentially breaking in many ways, if we have small behavioural differences between IntoPy<Py<PyAny>> and IntoPyObject for e.g. the bytes cases. I also think this doesn't actually create a seamless migration unless we continue to use IntoPy<Py<PyAny>> as the generic bound on PyO3 APIs for now, which defeats the point of doing the hard work to use the new trait.

The third option is to adopt no blanket implementation and just switch all PyO3 APIs to use IntoPyObject. The proc-macros could potentially prefer IntoPyObject but fall back to IntoPy where necessary. (I also wondered whether we could have some sort of temporary trait to make the same thing work in generic code, but I can't see a way to actually make that work without some kind of negative reasoning.)

I think on balance my currently opinion is that we should take this third option and add neither blanket. This means that code again gets duplicated, similar to the Bound migration. At least then new infrastructure can be built cleanly without compatibility hacks, and no existing code will be broken (aside from places where types need to implement the new traits.)

We can at least reduce the amount of duplication by having a (public?) helper macro to create an IntoPy<Py<PyAny>> implementation from an IntoPyObject implementation.

Icxolu · 2024-07-12T16:28:53Z

I have been thinking hard about a blanket implementation of IntoPyObject from IntoPy<Py<PyAny>> or the reverse. Both seem to have downsides:

I agree, we want to change the APIs to use the new IntoPyObject otherwise we kind of turn in circles. So I think only this blanked impl<T: IntoPy<Py<PyAny>>>> IntoPyObject for T would make sense. But there are also a fair number of APIs the use ToPyObject as their bound and since we can't provide both blankets it's probably better to not provide a blanket.

At least then new infrastructure can be built cleanly without compatibility hacks, and no existing code will be broken (aside from places where types need to implement the new traits.)

Yeah I think this is important so that we have a solid foundation for the future.

The proc-macros could potentially prefer IntoPyObject but fall back to IntoPy where necessary. (I also wondered whether we could have some sort of temporary trait to make the same thing work in generic code, but I can't see a way to actually make that work without some kind of negative reasoning.)

For the proc-macros this is what I implemented here already (or did you mean something different?) As for generic code, I will give this a thought, but my initial feeling says the same as you.

We can at least reduce the amount of duplication by having a (public?) helper macro to create an IntoPy<Py<PyAny>> implementation from an IntoPyObject implementation.

Maybe yes, will explore that.

davidhewitt

Implementations all look good. I think we should proceed to merge this so that we can then iterate in smaller PRs.

I will briefly have a go now at the missing_intopy error regression...

davidhewitt · 2024-08-02T13:38:03Z

src/conversions/smallvec.rs

+    A::Item: IntoPyObject<'py>,
+    PyErr: From<<A::Item as IntoPyObject<'py>>::Error>,
+{
+    type Target = PyList;


Possible question if this will become PyAny for SmallVec<u8> creating bytes.

davidhewitt · 2024-08-02T15:33:26Z

@Icxolu I pushed a commit which adjusted the autoderef specialization mechanism to make it a bit easier for me to reason about and managed to add fallback cases which means that the diagnostic actually emits hints that IntoPyObject are not satisfied (rather than IntoPy<PyObject>) 🎉

It's a little janky and this branch might need fixups / rebase, but otherwise please feel free to merge this (I'm out of time for today).

allow error type to be convertible into `PyErr` instead of requiring `PyErr`

Icxolu · 2024-08-03T13:54:59Z

Thanks for the review! Your autoderef specialization mechanism is much easier to read than what I originally implemented, I really like that 🚀 . Also keeping the more descriptive error message is really nice 👍 . I rebased this again, fixed the clippy warnings and added a missing implementation for #[classattr]. I'm excited to see this land 🎉 and will continue to work on the points that we left for followups here.

davidhewitt mentioned this pull request May 24, 2024

[RFC] standardize PyBytes <-> Vec<u8> or &[u8] or Cow<[u8]> #4182

Closed

Icxolu force-pushed the into-pyobject branch from 3423b33 to 1020cd8 Compare May 24, 2024 19:06

Icxolu force-pushed the into-pyobject branch from 1020cd8 to 78eb08c Compare July 6, 2024 21:50

davidhewitt reviewed Jul 7, 2024

View reviewed changes

src/conversion.rs Outdated Show resolved Hide resolved

Icxolu force-pushed the into-pyobject branch 3 times, most recently from 45968b9 to 4b52b58 Compare July 9, 2024 22:04

davidhewitt reviewed Jul 10, 2024

View reviewed changes

Icxolu force-pushed the into-pyobject branch from aade6ed to 8d7e0c0 Compare July 12, 2024 16:11

Icxolu force-pushed the into-pyobject branch from 8d7e0c0 to ef32a84 Compare July 14, 2024 20:03

Icxolu force-pushed the into-pyobject branch from ef32a84 to f194a99 Compare July 28, 2024 14:21

davidhewitt approved these changes Aug 2, 2024

View reviewed changes

Icxolu added 8 commits August 3, 2024 12:04

add trait

86effa6

add chrono impls

51ec459

impl autoref specialization to prefer IntoPyObject over IntoPy

b3f43c7

relax map_into_ptr trait bounds

5907e74

allow error type to be convertible into `PyErr` instead of requiring `PyErr`

fix clippy

f146dd9

switch to deref-specialization and specialize () convert PyNone

e04aec0

more conversions

9feafac

assoc type

0419891

Icxolu and others added 10 commits August 3, 2024 12:05

even more conversions

943b0db

macro impls

059c234

fix clippy

d78afa5

add Output type allowing optimizing impls which borrow from the input

97467f4

loose type restrictions on Either impl

d5a0604

move, seal and rename AnyBound to BoundObject

6700d02

add newsfragments

3fffc73

fix diagnostic on IntoPyObject unimplemented

f47d370

fix clippy

3c402ec

specialize #[classattr]

3be2433

Icxolu force-pushed the into-pyobject branch from 67f3c75 to 3be2433 Compare August 3, 2024 13:48

Icxolu marked this pull request as ready for review August 3, 2024 13:49

Icxolu enabled auto-merge August 3, 2024 13:55

Icxolu added this pull request to the merge queue Aug 3, 2024

Merged via the queue into PyO3:main with commit 0a185cd Aug 3, 2024
39 of 42 checks passed

Icxolu deleted the into-pyobject branch August 3, 2024 14:45

Icxolu mentioned this pull request Aug 4, 2024

specialize collections holding bytes to turn into PyBytes rather than PyList #4417

Closed

davidhewitt mentioned this pull request Oct 4, 2024

Fallible alternative to IntoPy? #1813

Closed

davidhewitt changed the title ~~Experiment: Fallible conversion trait~~ Add IntoPyObject conversion trait Oct 4, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `IntoPyObject` conversion trait #4060

Add `IntoPyObject` conversion trait #4060

Icxolu commented Apr 8, 2024 •

edited

Loading

codspeed-hq bot commented Apr 8, 2024 •

edited

Loading

Icxolu commented Apr 15, 2024

davidhewitt commented May 17, 2024

Icxolu commented May 17, 2024

Icxolu commented Jul 6, 2024

davidhewitt left a comment

Icxolu commented Jul 7, 2024

davidhewitt commented Jul 9, 2024

davidhewitt left a comment

davidhewitt Jul 10, 2024

Icxolu Jul 10, 2024

davidhewitt Jul 12, 2024

davidhewitt Jul 10, 2024

Icxolu Jul 10, 2024

davidhewitt Jul 10, 2024

davidhewitt Jul 10, 2024

Icxolu Jul 10, 2024

davidhewitt Jul 12, 2024

davidhewitt commented Jul 12, 2024

Icxolu commented Jul 12, 2024

davidhewitt left a comment

davidhewitt Aug 2, 2024

davidhewitt commented Aug 2, 2024

Icxolu commented Aug 3, 2024

Add IntoPyObject conversion trait #4060

Add IntoPyObject conversion trait #4060

Conversation

Icxolu commented Apr 8, 2024 • edited Loading

Design

IntoPyObject

IntoPyObjectExt

Macros

Open Questions

TODOs

Followups

codspeed-hq bot commented Apr 8, 2024 • edited Loading

Merging #4060 will not alter performance

🎉 Hooray! codspeed-rust just leveled up to 2.6.0!

Summary

Icxolu commented Apr 15, 2024

davidhewitt commented May 17, 2024

Icxolu commented May 17, 2024

Icxolu commented Jul 6, 2024

davidhewitt left a comment

Choose a reason for hiding this comment

Icxolu commented Jul 7, 2024

davidhewitt commented Jul 9, 2024

davidhewitt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidhewitt commented Jul 12, 2024

Icxolu commented Jul 12, 2024

davidhewitt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidhewitt commented Aug 2, 2024

Icxolu commented Aug 3, 2024

Add `IntoPyObject` conversion trait #4060

Add `IntoPyObject` conversion trait #4060

Icxolu commented Apr 8, 2024 •

edited

Loading

codspeed-hq bot commented Apr 8, 2024 •

edited

Loading

🎉 Hooray! `codspeed-rust` just leveled up to 2.6.0!