@chowder
Last active April 29, 2022 12:55
Supporting slicing for PythonExtensions in Python 3

PythonExtension

Below we have a custom 'list' PythonExtension implemented in C++:

struct DummyList : public Py::PythonExtension<DummyList>
{
    std::vector<Py::Object> items;
 
    Py::Object append(const Py::Tuple& args, const Py::Dict& keywords)
    {
        try
        {
 
            Py::Object other = static_cast<Py::Object>(args[0]);
            items.push_back(other);
            return Py::None();
        }
        catch (...)
        {
            // Raise a proper Python exception rather than throwing a raw C string
            throw Py::RuntimeError("Unknown exception in DummyList::append()");
        }
    }
 
    Py_ssize_t sequence_length()
    {
        return static_cast<Py_ssize_t>(items.size());
    }
 
    Py::Object sequence_item(Py_ssize_t i)
    {
        if (i < 0 || i >= sequence_length())
            throw Py::IndexError("list index out of range");
        return items[i];
    }
 
    Py::Object sequence_slice(Py_ssize_t i, Py_ssize_t j)
    {
        DummyList* slice = new DummyList;
        for (Py_ssize_t x = i; (x < j) && (static_cast<size_t>(x) < items.size()); x++)
            slice->items.push_back(items[x]);
        return Py::asObject(slice);
    }
 
    static void init_type()
    {
        behaviors().name("DummyList");
        behaviors().doc("Dummy list type");
        behaviors().supportSequenceType();
 
        add_keyword_method("append", &DummyList::append);
    }
};

When we call behaviors().supportSequenceType() in DummyList::init_type, PyCXX sets up the necessary hooks between the Python runtime and the sequence_length, sequence_item, and sequence_slice methods.

see: Sequence Object Structures for more information

Removal of sq_slice

Python 3 removed the sq_slice slot from the sequence protocol; the corresponding field in PySequenceMethods is now an unused placeholder named was_sq_slice.

As a result, sequence_slice is not bound to anything when the above code is compiled against Python 3, and attempting to slice a DummyList object at runtime raises the following error:

>>> a[:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer, not 'slice'

We can see that the PyCXX source supports this claim:

PythonType &PythonType::supportSequenceType( int methods_to_support ) {
#if !defined( Py_LIMITED_API )
    if(sequence_table)
    {
        return *this;
    }
    sequence_table = new PySequenceMethods;
    memset( sequence_table, 0, sizeof( PySequenceMethods ) );   // ensure new fields are 0
    table->tp_as_sequence = sequence_table;
#endif
    FILL_SEQUENCE_SLOT(length)
    FILL_SEQUENCE_SLOT(concat)
    FILL_SEQUENCE_SLOT(repeat)
    FILL_SEQUENCE_SLOT(item)
    FILL_SEQUENCE_SLOT(ass_item)
    FILL_SEQUENCE_SLOT(inplace_concat)
    FILL_SEQUENCE_SLOT(inplace_repeat)
    FILL_SEQUENCE_SLOT(contains)
    return *this;
}

Here, the FILL_SEQUENCE_SLOT(X) macro binds the sq_X slot to the sequence_X method in DummyList. There is no FILL_SEQUENCE_SLOT(slice) call, so sequence_slice is never hooked up.

Solution: Hacking the mp_subscript method

In theory, by extending DummyList to implement the 'mapping' interface, we would be able to intercept all subscript/index calls.

In other words, evaluating dummyList[1:5] is equivalent to indexing into a map with the key object slice(1, 5, None). Depending on the key's type (a slice or an integer index), we can emulate the slicing interface accordingly.
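To make that dispatch concrete, here is a minimal sketch in plain C++ with no Python or PyCXX dependency. The names SubscriptKey, index_key, slice_key and ToyList are hypothetical and exist only for this illustration; they model the idea that a single entry point receives either an integer or a slice and forwards to the existing item or slice routine.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the key object handed to mp_subscript:
// either a single index or a (start, stop) slice with step 1.
struct SubscriptKey
{
    bool is_slice;
    long start; // holds the index itself when is_slice is false
    long stop;  // unused when is_slice is false
};

SubscriptKey index_key(long i)
{
    SubscriptKey k = {false, i, 0};
    return k;
}

SubscriptKey slice_key(long start, long stop)
{
    SubscriptKey k = {true, start, stop};
    return k;
}

// Hypothetical toy container; it only mirrors the shape of DummyList.
struct ToyList
{
    std::vector<std::string> items;

    std::string item(long i) const
    {
        if (i < 0 || static_cast<std::size_t>(i) >= items.size())
            throw std::out_of_range("list index out of range");
        return items[static_cast<std::size_t>(i)];
    }

    std::vector<std::string> slice(long i, long j) const
    {
        // Bounds are assumed to be normalised (non-negative) by the caller
        assert(i >= 0 && j >= 0);
        std::vector<std::string> out;
        for (long x = i; x < j && static_cast<std::size_t>(x) < items.size(); ++x)
            out.push_back(items[static_cast<std::size_t>(x)]);
        return out;
    }

    // Single entry point: inspect the key type, then reuse item() or slice().
    // An index yields a one-element vector so both branches share a return type.
    std::vector<std::string> subscript(const SubscriptKey& key) const
    {
        if (key.is_slice)
            return slice(key.start, key.stop);
        return std::vector<std::string>(1, item(key.start));
    }
};
```

In the real extension the key is a Python object and the type test is done with the C API, but the control flow has the same shape.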

Implementation

subscript_from_slice

This is a method that re-uses the existing sequence_slice and sequence_item methods to implement the subscript function:

(heavily based on: http://renesd.blogspot.com/2009/07/python3-c-api-simple-slicing-sqslice.html)

template <typename T>
static Py::Object subscript_from_slice(T* self, const Py::Object& item)
{
#if PY_MAJOR_VERSION == 3
    if (PyIndex_Check(item.ptr()))
    {
        Py_ssize_t i = PyNumber_AsSsize_t(item.ptr(), PyExc_IndexError);
#else
    if (PyInt_Check(item.ptr()))
    {
        Py_ssize_t i = PyInt_AsSsize_t(item.ptr());
#endif
        if (i == -1 && PyErr_Occurred())
        {
            // The Python error is already set; let PyCXX propagate it
            throw Py::Exception();
        }
        if (i < 0)
        {
            // Negative indices count from the end of the sequence
            i += self->sequence_length();
        }
        // Reuse `sq_item` (or `sequence_item`) here
        return self->sequence_item(i);
    }
    if (PySlice_Check(item.ptr()))
    {
        Py_ssize_t len = self->sequence_length();
        Py_ssize_t start, stop, step, sliceLength;
        if (PySlice_GetIndicesEx(item.ptr(), len, &start, &stop, &step, &sliceLength) < 0)
        {
            // The Python error is already set; let PyCXX propagate it
            throw Py::Exception();
        }
        if (sliceLength <= 0)
        {
            // Note: an empty slice yields a plain Python list, not a T
            return Py::asObject(PyList_New(0));
        }
        else if (step == 1)
        {
            // Reuse the sq_slice (or sequence_slice) method here
            return self->sequence_slice(start, stop);
        }
        else
        {
            throw Py::TypeError("slice steps not supported yet");
        }
    }
    else
    {
        PyErr_Format(PyExc_TypeError, "Indices must be integers, not %.200s", Py_TYPE(item.ptr())->tp_name);
        throw Py::Exception();
    }
}
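The index arithmetic that this function relies on can be sketched without the Python headers. The helper names normalize_index and clamp_slice below are hypothetical, and the sketch covers only step 1, which is the one case the function supports. It follows Python's usual rules: a negative value is offset by the length, and slice bounds are then clamped to the range 0 to len.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: map a possibly negative index onto [0, len).
// Returns -1 when the index is still out of range after adjustment.
long normalize_index(long i, long len)
{
    assert(len >= 0);
    if (i < 0)
        i += len;
    return (i < 0 || i >= len) ? -1 : i;
}

// Hypothetical helper: clamp one slice bound to [0, len].
long clamp_bound(long v, long len)
{
    if (v < 0)
    {
        v += len;
        if (v < 0)
            v = 0;
    }
    else if (v > len)
    {
        v = len;
    }
    return v;
}

// Hypothetical helper: clamp (start, stop) for a step of 1.
// The result may have start >= stop, which denotes an empty slice.
std::pair<long, long> clamp_slice(long start, long stop, long len)
{
    assert(len >= 0);
    return std::make_pair(clamp_bound(start, len), clamp_bound(stop, len));
}
```

With three items, a request such as [0:10] is clamped to (0, 3), which is why slicing past the end returns the whole list instead of raising an error.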

mapping_subscript

Next, implement a mapping_subscript function on our PythonExtension. With the above utility code, this becomes trivial:

Py::Object DummyList::mapping_subscript(const Py::Object& item)
{
    return subscript_from_slice(this, item);
}

supportMappingType

Finally, in the PythonExtension's initialisation function (init_type), we call behaviors().supportMappingType() to make PyCXX bind the mapping_subscript function to the mp_subscript slot.

Result

Slicing works.

Python 3.7.3 (default, Sep  8 2020, 11:28:48)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import example
>>> d = example.get_dummy()
>>> d.append("alice")
>>> d.append("bob")
>>> d.append("charlie")
>>> list(d[:])
['alice', 'bob', 'charlie']
>>> list(d[1:])
['bob', 'charlie']
>>> list(d[:2])
['alice', 'bob']
>>> list(d[:10])
['alice', 'bob', 'charlie']
>>> d[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> d[0]
'alice'