Last active
October 25, 2023 13:08
Dreaming of arbitrary self types for PyO3
//! The following is a simplified form of a possible PyO3 API which shows
//! cases where arbitrary self types would help resolve papercuts.

use std::ops::Deref;
use std::ptr::NonNull;

// ----------------------------------------------------------------------------------
//
// Case 1 - PyO3's object hierarchy. We have a smart pointer type Py<T> and want to
// use it as a receiver for Python method calls.
//
//

/// Python's C API is wrapped by the `pyo3-ffi` crate, also exported as the
/// `pyo3::ffi` submodule.
mod ffi {
    extern "C" {
        /// A Python object. For this model we don't care about its contents, so we
        /// just use the unstable "extern type" syntax to name it.
        pub type PyObject;
    }
}

/// A smart pointer to a Python object, which is reference counted. A good enough
/// description is that it is approximately an `Arc<T>` where the memory is
/// stored on the Python heap and reference counting is synchronized by the
/// Python GIL (Global Interpreter Lock).
///
/// Here in this model we ignore the existence of the Python GIL as it is just a
/// distraction. In PyO3's real API we have a lifetime `'py` on several types to
/// model this.
///
/// The `PhantomData<T>` field only marks the otherwise unused type parameter.
struct Py<T>(NonNull<ffi::PyObject>, std::marker::PhantomData<T>);

// -- Some zero-sized types to describe Python's object hierarchy. --

/// Any Python object.
struct PyAny(());

/// A concrete subtype, a Python list.
struct PyList(());

// -- Implementations of methods on these types --
// In practice these methods return results; we'll ignore that here.

impl PyAny {
    /// Get an attribute on this object. In Python syntax this is `self.name`.
    ///
    /// Receiver is &Py<PyAny> - arbitrary self type!
    fn getattr(self: &Py<PyAny>, name: &str) -> Py<PyAny> { /* ... */ }
}

impl PyList {
    /// Get an element from this list. In Python syntax this is `self[idx]`.
    ///
    /// Receiver is &Py<PyList> - arbitrary self type!
    fn get_item(self: &Py<PyList>, idx: usize) -> Py<PyAny> { /* ... */ }
}

// In addition, we want to call `getattr` with a `Py<PyList>`, because this is
// a valid operation too. The cleanest way to do this is with `Deref`:
impl Deref for Py<PyList> {
    type Target = Py<PyAny>;
    fn deref(&self) -> &Py<PyAny> { /* ... */ }
}

// ... but if arbitrary self types are tied to `Deref`, we instead have to write
// the following (which replaces the impl above; the two cannot coexist):
impl Deref for Py<PyList> {
    type Target = PyList;
    fn deref(&self) -> &PyList { /* ... */ }
}

// We could find other ways to give `Py<PyList>` a `getattr` method without
// `Deref`, e.g. by moving all of the `PyAny` methods onto a trait and implementing
// it for `Py<PyAny>`, `Py<PyList>` and so on. This leads to a lot of repetition:
// N trait implementations for N concrete types PyAny, PyList, etc.
// Also, the `&PyList` reference on its own is useless, so `Deref<Target = PyList>`
// is a little weird.

// ----------------------------------------------------------------------------------
//
// Case 2 - PyO3's "refcell" container synchronized by the GIL. This has a close
// cousin in `std::cell::RefCell`.
//
//

/// PyO3 has a `#[pyclass]` macro which generates a Python type for a Rust
/// struct.
/// - `Foo` continues to be the plain old Rust struct.
/// - `Py<Foo>` is a smart pointer to a Python object which contains a `Foo`.
#[pyclass]
struct Foo { /* ... */ }

/// To implement methods on the Python type PyO3 has a `#[pymethods]` macro.
///
/// Users can use `&self` and `&mut self` receivers. To make this possible,
/// `Py<Foo>` behaves like `RefCell<Foo>` but uses the Python GIL for synchronization.
/// `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>` are the guards to `Py<Foo>`.
#[pymethods]
impl Foo {
    /// Receive by `&self`, read only the Rust data. Possible today.
    fn a(&self) { /* ... */ }

    /// Receive by `&mut self`, read or write only the Rust data. Possible today.
    fn b(&mut self) { /* ... */ }

    /// Receive by `Py<Foo>`. `Py<Foo>` implements `Deref<Target = Py<PyAny>>`
    /// so that all Python operations are accessible.
    ///
    /// This is an arbitrary self type.
    ///
    /// Current users of PyO3 have to use `slf: Py<Foo>` which is awkward
    /// and also loses method call syntax.
    fn c(self: Py<Foo>) { /* ... */ }

    /// Receive by `PyRef<'_, Foo>`. `PyRef<'_, Foo>` is a pointer to the Python
    /// data. It implements `Deref<Target = Foo>` to give read access to the Rust
    /// data.
    ///
    /// This is an arbitrary self type.
    ///
    /// Same workarounds for current users of PyO3 apply.
    fn d(self: PyRef<'_, Foo>) { /* ... */ }

    /// Receive by `PyRefMut<'_, Foo>`. `PyRefMut<'_, Foo>` is a pointer to the Python
    /// data. It implements `Deref<Target = Foo>` and `DerefMut` to give read and
    /// write access to the Rust data.
    ///
    /// This is an arbitrary self type.
    ///
    /// Same workarounds for current users of PyO3 apply.
    fn e(self: PyRefMut<'_, Foo>) { /* ... */ }
}

// Note that in the above, `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>` both implement
// `Deref<Target = Foo>` so would fit fine with deref-based arbitrary self types.
//
// However `Py<Foo>` cannot implement `Deref<Target = Foo>`, just as `RefCell<T>`
// cannot implement `Deref<Target = T>`.
//
// For `Py<Foo>` to be able to implement `Deref`, we must give up its refcell-like
// feature. This removes `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>`, and it also
// removes the ability to have `&mut self` as a receiver. Mutable access needs the
// runtime refcell protection because Python code is incompatible with the borrow
// checker.
//
// There is a possible argument that removing `&mut self` and the refcell feature is
// a good thing, but the feature is also _extremely_ ergonomic for users. We could
// have a long conversation about whether PyO3 made the wrong API choice here. There
// is `#[pyclass(frozen)]` which opts in to this restriction, so by flipping the
// default and then removing the option we could evolve PyO3's API over time if we
// think deref-based arbitrary self types are the correct formulation of arbitrary
// self types.
//
// If you feel like a long distraction, we can discuss how Python might
// be removing the GIL, and how that means that PyO3 might be forced to change
// anyway.
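
The `RefCell` comparison in Case 2 can be tried directly in stable Rust. The sketch below uses only `std::cell::RefCell`. The `Foo` fields and the `borrow_demo` helper are made up for illustration and are not part of PyO3. It shows that access to the inner value always goes through a guard (`Ref` / `RefMut`, the cousins of `PyRef` / `PyRefMut`), and that a conflicting mutable borrow is refused at runtime rather than at compile time.

```rust
use std::cell::RefCell;

/// Hypothetical plain Rust struct standing in for a `#[pyclass]` type.
struct Foo {
    count: u32,
}

impl Foo {
    fn get(&self) -> u32 {
        self.count
    }

    fn bump(&mut self) {
        self.count += 1;
    }
}

/// Returns the counter after one `bump`, plus whether a mutable borrow was
/// refused while a shared guard was still alive.
fn borrow_demo() -> (u32, bool) {
    let cell = RefCell::new(Foo { count: 0 });

    // `RefCell<Foo>` does not deref to `Foo`; a guard has to be taken first.
    // The temporary `RefMut` derefs to `&mut Foo` and is released at the `;`.
    cell.borrow_mut().bump();

    // A shared guard, analogous to `PyRef<'_, Foo>`.
    let shared = cell.borrow();

    // While `shared` is alive, a mutable borrow fails the runtime check.
    let refused = cell.try_borrow_mut().is_err();

    (shared.get(), refused)
}

fn main() {
    let (count, refused) = borrow_demo();
    println!("count = {count}, mutable borrow refused = {refused}");
}
```

If `RefCell<Foo>` implemented `Deref<Target = Foo>` directly, there would be no guard whose lifetime the runtime check could track. The same constraint is described above for `Py<Foo>`.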
Thanks @madsmtm and @davidhewitt for all the discussion.

I think I agree with your last comment David - I think you can make everything work with just a `Receiver` impl (without `Deref`) and then some traits.

That said, `Receiver` for `Deref` will be seen as Not The Rust Way as soon as I raise the PR. It is certainly unusual. I am worried we will sink a ton of time discussing this without really clear arguments on either side. If you can think of a way to avoid this, I'm all ears!

[…] `Receiver` without `Deref`. I think the RFC is insufficiently clear about this, and I'll work on it.

[…] `Deref` because their type is a smart pointer containing something (has-a relationships). You want(ed) to use `Deref` for a completely different purpose, to express is-a relationships, along the lines of coercion. In this case, people might validly want their `Deref` resolution and their `Receiver` resolution to point in different directions. I think we need to be more explicit in the RFC that we're choosing not to be compatible with such use-cases, and they should be achieved using traits (or some future `Coercion` trait) instead.

So this has been a most useful discussion, thanks!
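
As a rough illustration of expressing the is-a relationship with traits rather than `Deref`, here is a minimal sketch in stable Rust. It does not use the unstable `Receiver` trait at all, so the methods take `&self` on the `Py<T>` wrapper. `PyObjectType`, `PyAnyMethods`, `PyListMethods`, `new_list` and the `repr` field are hypothetical names invented for this example and should not be read as PyO3's actual API.

```rust
use std::marker::PhantomData;

/// Stand-in for the gist's `Py<T>`. A real one would hold a pointer to a
/// Python object; the `repr` string is a made-up payload for the demo.
struct Py<T> {
    repr: String,
    _ty: PhantomData<T>,
}

struct PyAny;
struct PyList;

/// Hypothetical marker trait: "this type is a Python object".
trait PyObjectType {}
impl PyObjectType for PyAny {}
impl PyObjectType for PyList {}

/// Methods every Python object supports, written once as a blanket impl.
trait PyAnyMethods {
    fn getattr(&self, name: &str) -> Py<PyAny>;
}

impl<T: PyObjectType> PyAnyMethods for Py<T> {
    fn getattr(&self, name: &str) -> Py<PyAny> {
        Py { repr: format!("{}.{}", self.repr, name), _ty: PhantomData }
    }
}

/// Methods only a list supports.
trait PyListMethods {
    fn get_item(&self, idx: usize) -> Py<PyAny>;
}

impl PyListMethods for Py<PyList> {
    fn get_item(&self, idx: usize) -> Py<PyAny> {
        Py { repr: format!("{}[{}]", self.repr, idx), _ty: PhantomData }
    }
}

/// Hypothetical constructor for the demo.
fn new_list(name: &str) -> Py<PyList> {
    Py { repr: name.to_string(), _ty: PhantomData }
}

fn main() {
    let list = new_list("xs");
    let item = list.get_item(0); // list-specific method
    let attr = list.getattr("append"); // "is-a PyAny" method, no Deref involved
    println!("{} / {}", item.repr, attr.repr);
}
```

The blanket impl keeps the method bodies in one place, but each concrete type still needs its own marker impl, and callers must have the traits in scope. This is the repetition and ergonomic cost that the gist weighs against a `Deref`-based design.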