//! The following is a simplified form of a possible PyO3 API which shows
//! cases where arbitrary self types would help resolve papercuts.

// ----------------------------------------------------------------------------------
//
// Case 1 - PyO3's object hierarchy. We have a smart pointer type Py<T> and want to
// use it as a receiver for Python method calls.
//
//

/// Python's C API is wrapped by the `pyo3-ffi` crate, also exported as the
/// `pyo3::ffi` submodule.
mod ffi {
    extern {
        /// A Python object. For this model we don't care about its contents, so we
        /// just use unstable "extern type" syntax to name it.
        type PyObject;
    }
}
/// A smart pointer to a Python object, which is reference counted. A good enough
/// description is that it is approximately an `Arc<T>` where the memory is
/// stored on the Python heap and reference counting is synchronized by the
/// Python GIL (Global Interpreter Lock).
///
/// Here in this model we ignore the existence of the Python GIL as it is just a
/// distraction. In PyO3's real API we have a lifetime `'py` on several types to
/// model this.
struct Py<T>(NonNull<ffi::PyObject>);

// -- Some zero-sized types to describe Python's object hierarchy. --

/// Any Python object.
struct PyAny(());

/// A concrete subtype, a Python list.
struct PyList(());

// -- Implementations of methods on these types --
// In practice these methods return results, we'll ignore that here.
impl PyAny {
    /// Get an attribute on this object. In Python syntax this is `self.name`.
    ///
    /// Receiver is &Py<PyAny> - arbitrary self type!
    fn getattr(self: &Py<PyAny>, name: &str) -> Py<PyAny> { /* ... */ }
}

impl PyList {
    /// Get an element from this list. In Python syntax this is `self[idx]`.
    ///
    /// Receiver is &Py<PyList> - arbitrary self type!
    fn get_item(self: &Py<PyList>, idx: usize) -> Py<PyAny> { /* ... */ }
}

// In addition, we want to call `getattr` with a `Py<PyList>`, because this is
// a valid operation too. The cleanest way to do this is with `Deref`:
impl Deref for Py<PyList> {
    type Target = Py<PyAny>;
    fn deref(&self) -> &Py<PyAny> { /* ... */ }
}

// ... but if arbitrary self types is tied to `Deref`, we instead have to have:
impl Deref for Py<PyList> {
    type Target = PyList;
    fn deref(&self) -> &PyList { /* ... */ }
}
// We could find other ways to make Py<PyList> have a getattr method without
// `Deref`, e.g. by moving all of `PyAny` methods onto a trait and implementing
// it for `Py<PyAny>`, `Py<PyList>` and so on. This leads to a lot of repetition;
// N trait implementations for N concrete types PyAny, PyList, etc.
// Also the `&PyList` reference on its own is useless, so `Deref<Target = PyList>`
// is a little weird.

// ----------------------------------------------------------------------------------
//
// Case 2 - PyO3's "refcell" container synchronized by the GIL. This has a close
// cousin in `std::cell::RefCell`.
//
//
/// PyO3 has a `#[pyclass]` macro which generates a Python type for a Rust
/// struct.
/// - `Foo` continues to be the plain old Rust struct.
/// - `Py<Foo>` is a smart pointer to a Python object which contains a `Foo`.
#[pyclass]
struct Foo { /* ... */ }

/// To implement methods on the Python type PyO3 has a `#[pymethods]` macro.
///
/// Users can use `&self` and `&mut self` receivers. To make this possible,
/// `Py<Foo>` is like `RefCell<Foo>` but uses the Python GIL for synchronization.
/// `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>` are the guards to `Py<Foo>`.
impl Foo {
    /// Receive by `&self`, read only the Rust data. Possible today.
    fn a(&self) { /* ... */ }

    /// Receive by `&mut self`, read or write only the Rust data. Possible today.
    fn b(&mut self) { /* ... */ }

    /// Receive by `Py<Foo>`. `Py<Foo>` implements `Deref<Target = Py<PyAny>>`
    /// so that all Python operations are accessible.
    ///
    /// This is an arbitrary self type.
    ///
    /// Current users of PyO3 have to use `slf: Py<Foo>` which is awkward
    /// and also loses method call syntax.
    fn c(self: Py<Foo>) { /* ... */ }

    /// Receive by `PyRef<'_, Foo>`. `PyRef<'_, Foo>` is a pointer to the Python
    /// data. It implements `Deref<Target = Foo>` to give read access to the Rust
    /// data.
    ///
    /// This is an arbitrary self type.
    ///
    /// Same workarounds for current users of PyO3 apply.
    fn d(self: PyRef<'_, Foo>) { /* ... */ }

    /// Receive by `PyRefMut<'_, Foo>`. `PyRefMut<'_, Foo>` is a pointer to the Python
    /// data. It implements `DerefMut<Target = Foo>` to give read and write access to
    /// the Rust data.
    ///
    /// This is an arbitrary self type.
    ///
    /// Same workarounds for current users of PyO3 apply.
    fn e(self: PyRefMut<'_, Foo>) { /* ... */ }
}
// Note that in the above, `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>` both implement
// `Deref<Target = Foo>` so would fit fine with deref-based arbitrary self types.
//
// However `Py<Foo>` cannot implement `Deref<Target = Foo>`, just like how `RefCell<T>`
// cannot implement `Deref<Target = T>`.
//
// To make `Py<Foo>` be able to implement `Deref`, we must give up its refcell-like
// feature. This removes `PyRef<'_, Foo>` and `PyRefMut<'_, Foo>`, and it also
// removes the ability to have `&mut self` as a receiver. The mutable access
// needs the runtime refcell protection due to Python code being incompatible with
// the borrow checker.
//
// There is a possible argument that removing `&mut self` and the refcell feature
// is a good thing, but that feature is also _extremely_ ergonomic for users. We
// could have a long conversation about whether PyO3 made the wrong API choice here.
// There is `#[pyclass(frozen)]` which opts in to this restriction, so by flipping
// the default and then removing the option we could evolve PyO3's API over time if
// we think deref-based arbitrary self types is the correct formulation of arbitrary
// self types.
//
// If you feel like a long distraction, we can discuss how Python might
// be removing the GIL, and how that means that PyO3 might be forced to change
// anyway.
I see your point that the blanket impl makes a lot of libraries just work. Strictly speaking they wouldn't have to bump MSRV; they can add a build script to do feature detection and conditionally implement `Receiver`. But that's still a bit of work across multiple points in the ecosystem.

Also true that I can still implement `Receiver` if I don't implement `Deref`; I keep overlooking this because I keep wanting to have the `Deref` impl 😅. Having reflected on this I think I can make PyO3 work without either `CoerceUnsized` or `Deref`. I can have a trait `PyAnyMethods` and a blanket `impl<T> PyAnyMethods for Py<T>`.

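For readers following along, the trait-plus-blanket-impl idea might look roughly like the sketch below. It is a stable-Rust toy model, not PyO3's real API: `Py<T>` here only stores a type name, `getattr` and `get_item` return descriptive strings instead of Python objects, and the `new` constructor and `type_name` field are made up for illustration.

```rust
use std::marker::PhantomData;

// Zero-sized markers for the Python type hierarchy, as in the gist.
struct PyAny;
struct PyList;

// Toy stand-in for the smart pointer: the "object" is just a type name here.
struct Py<T> {
    type_name: &'static str,
    _marker: PhantomData<T>,
}

impl<T> Py<T> {
    // Hypothetical constructor, only for this sketch.
    fn new(type_name: &'static str) -> Self {
        Py { type_name, _marker: PhantomData }
    }
}

// Methods that every Python object supports live on a trait...
trait PyAnyMethods {
    /// Models `self.name`; returns a description rather than a real object.
    fn getattr(&self, name: &str) -> String;
}

// ...and a single blanket impl covers Py<PyAny>, Py<PyList>, and so on.
impl<T> PyAnyMethods for Py<T> {
    fn getattr(&self, name: &str) -> String {
        format!("{}.{}", self.type_name, name)
    }
}

// List-only methods stay inherent on the concrete pointer type.
impl Py<PyList> {
    /// Models `self[idx]`.
    fn get_item(&self, idx: usize) -> String {
        format!("{}[{}]", self.type_name, idx)
    }
}

fn main() {
    let list: Py<PyList> = Py::new("list");
    let any: Py<PyAny> = Py::new("object");
    println!("{}", list.getattr("append"));
    println!("{}", any.getattr("__class__"));
    println!("{}", list.get_item(0));
}
```

Because the impl is generic over `T`, this avoids the N-impls-for-N-types repetition mentioned in the gist, and neither `Deref` nor `Receiver` is needed for `getattr` to resolve on `Py<PyList>`.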
So, maybe your existing RFC draft is already fine for what PyO3 needs, and you shouldn't zap the blanket impl based on what I've said here? Certainly it's been great to discuss all these cases and I'm hopeful to see the RFC accepted!
Thanks @madsmtm and @davidhewitt for all the discussion.
I think I agree with your last comment David - I think you can make everything work with just a `Receiver` impl (without `Deref`) and then some traits.

That said,

- I am pretty sure that the blanket impl of `Receiver` for `Deref` will be seen as Not The Rust Way as soon as I raise the PR. It is certainly unusual. I am worried we will sink a ton of time discussing this without really clear arguments on either side. If you can think of a way to avoid this, I'm all ears!
- It's not your fault that you overlook the possibility of `Receiver` without `Deref`. I think the RFC is insufficiently clear about this, and I'll work on it.
- Your PyO3 example made me realize an assumption underlying our blanket impl: we're assuming that folks are using `Deref` because their type is a smart pointer containing something (has-a relationships). You want(ed) to use `Deref` for a completely different purpose, to express is-a relationships, along the lines of coercion. In this case, people might validly want their `Deref` resolution and their `Receiver` resolution to point in different directions. I think we need to be more explicit in the RFC that we're choosing not to be compatible with such use-cases, and they should be achieved using traits (or some future `Coercion` trait) instead.

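To make the has-a versus is-a distinction concrete, here is a small stable-Rust sketch of both uses of `Deref`. Everything in it is a toy: `Py<T>` wraps a plain `usize` rather than a Python pointer, and the `id` method, the `Guard` type and the `value` field are invented for illustration. The `#[repr(transparent)]` attribute is an assumption of this sketch that makes the pointer cast sound.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

struct PyAny;
struct PyList;

// `repr(transparent)` gives every `Py<T>` the same layout as its `usize`
// field, which is what makes the cast in `deref` below sound.
#[repr(transparent)]
struct Py<T>(usize, PhantomData<T>);

impl Py<PyAny> {
    // Hypothetical method that only exists on the "base" pointer type.
    fn id(&self) -> usize {
        self.0
    }
}

// "is-a": a list pointer can be viewed as a pointer to any object.
// The target is another pointer type, not something the pointer contains.
impl Deref for Py<PyList> {
    type Target = Py<PyAny>;
    fn deref(&self) -> &Py<PyAny> {
        // SAFETY: both types are `repr(transparent)` wrappers around `usize`.
        unsafe { &*(self as *const Py<PyList> as *const Py<PyAny>) }
    }
}

// "has-a": a guard that contains a `Foo` and derefs to that contained value.
struct Foo {
    value: i32,
}
struct Guard(Foo);

impl Deref for Guard {
    type Target = Foo;
    fn deref(&self) -> &Foo {
        &self.0
    }
}

fn main() {
    let list: Py<PyList> = Py(7, PhantomData);
    // Auto-deref walks Py<PyList> -> Py<PyAny> and finds `id` there.
    println!("{}", list.id());

    let guard = Guard(Foo { value: 3 });
    // Auto-deref walks Guard -> Foo and finds the field.
    println!("{}", guard.value);
}
```

In the is-a case the `Deref` target is `Py<PyAny>`, while the natural receiver target for methods written on `PyList` would be `PyList` itself, which is the mismatch in resolution direction described above.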
So this has been a most useful discussion, thanks!
I think my point here would be that PyO3 would then implement `Receiver<Target = T> for Py<T>` instead, and not have a `Deref` implementation at all (which is possible under the original RFC where `Receiver` has a blanket impl for `T: Deref`).

I agree that the naming of that Rust feature does not really reflect what we want; in reality we want a more generic `Coercion` trait of some kind.

Ideally, if we could do breaking changes to Rust, I think the prettiest design would've been `trait Receiver { type Target; }` and `trait Deref: Receiver { fn deref(&self) -> &Self::Target }`, but that ship has sailed, so while I agree that Rust should favour explicit implementations, I think there is a lot of value in keeping the blanket impl.

Just imagine how many libraries out there are using `Deref`/`DerefMut`, which would now have to bump their MSRV and be updated to also have a `Receiver` implementation that exactly matches their `Deref` implementation, just to be as nicely usable as `Box<T>`.
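The supertrait shape described above can be tried out on stable Rust with local stand-in traits. The names `MyReceiver`, `MyDeref`, `Owner`, `Handle`, `read` and `target_size` below are all hypothetical; they are not the standard library's `Receiver` or `Deref`, and the sketch says nothing about how method resolution itself would work.

```rust
use std::cell::RefCell;

// Hypothetical local traits mirroring the "prettiest design" from the comment.
trait MyReceiver {
    type Target;
}

trait MyDeref: MyReceiver {
    fn my_deref(&self) -> &Self::Target;
}

// A box-like pointer: names a target and can also hand out a reference to it.
struct Owner<T>(T);

impl<T> MyReceiver for Owner<T> {
    type Target = T;
}

impl<T> MyDeref for Owner<T> {
    fn my_deref(&self) -> &T {
        &self.0
    }
}

// A cell-like handle: names a target, but cannot give out a plain `&T`,
// so it implements only the supertrait.
struct Handle<T>(RefCell<T>);

impl<T> MyReceiver for Handle<T> {
    type Target = T;
}

// Generic code asks only for the capability it needs.
fn read<P: MyDeref>(p: &P) -> &P::Target {
    p.my_deref()
}

fn target_size<P: MyReceiver>(_p: &P) -> usize {
    std::mem::size_of::<P::Target>()
}

fn main() {
    let owner = Owner(5u32);
    let handle = Handle(RefCell::new(5u32));

    println!("{}", read(&owner));
    println!("{} {}", target_size(&owner), target_size(&handle));
    // `read(&handle)` would not compile: `Handle` does not implement `MyDeref`.
}
```

With this layering every `MyDeref` type is automatically a `MyReceiver` with the same target, so no blanket impl is needed, while a type like `Handle` can still opt in to the receiver role alone.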