Usually implemented when we want objects to support and interact with fundamental elements of the language, as:
- Collections
- Attribute access
- Iteration (including async for)
- Operator overloading
- invoking functions and methods
- representation and string formatting
- async programing using await
- creation and destruction of objects
- context management using
with
andasync with
Simple deck of cards:
import collections
Card = collections.namedtuple('Card', ['rank', 'suit'])
class FrenchDeck:
ranks = [str(n) for n in range(2, 11)] + list('JQKA')
suits = 'spadeds diamonds clubs hearts'.split()
def __init__(self):
self._cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks]
def __len__(self):
return len(self._cards)
def __getitem__(self, position):
return self._cards[position]
We use namedtuple to build a simple class representing individual cards. We use it because they're just a group of attributes, without methods (as a DB register).
As we use __getitem__
and __len__
we can reuse python stuff, as slicing, random, reversed and everything.
Iteration is many times implicit. If a collection do not have a __contains__
method, the in
operator realizes a sequential search. In the case above, the operator works because FrenchDeck
is iterable.
Example: vector implementation (very simplistic)
import math
class Vector:
def __init__(self, x=0, y=0):
self.x = x
self.y = y
def __repr__(self):
return f'Vector({self.x!r}, {self.y!r})'
def __abs__(self):
return math.hypot(self.x, self.y)
def __bool__(self):
return bool(abs(self))
def __add__(self, other):
x = self.x + other.x
y = self.y + other.y
return Vector(x, y)
def __mul__(self, scalar):
return Vector(self.x * scalar, self.y * scalar)
Note that the "correct" behavior for a method like __add__
is return a new vector, (create a new object) and not touch in its operands. Note too that in this example we break the commutative property of scale multiplication (we will fix with __rmul__
later.
The special method __repr__
is called by repr
to get an object representation as a string. Without a personalized __repr__
, python console would show an instance as Vector object at 0x1231
.
repr
is also called with %r
positional marker and in '{my_obj!r}'
On the other hand, __str__
is called by the str()
method and used implicitly by print
. If no __str__
is implemented, __repr__
is called.
Class instances defined by the user are considered to be true, unless __bool__
or __len__
is implemented. Basically, bool(x) calls bool dunder and if not implemented, calls len dunder and return false if len is 0.
It documents interfaces of essencial collection types in the language.
List of dunder methods excluding operators: https://pythonfluente.com/#special_names_tbl Dunder methods for operators: https://pythonfluente.com/#special_operators_tbl
(we can use matrix1 @ matrix2
to perform matmul.
the default library offers a good selection of sequence types, implemented in C:
Container sequences
They can store different types of itens, including nested containers and objects of all kind. Some examples are: list tuple collections.deque
.
Plane sequences
Store items of some simple type, but not other collections or object references. Ex: str bytes array.array
A container sequence keeps references for objects that it contains, that can be of any type, and a plane sequence stores the value of its content in its own memory space, not as distinct python objects.
Note: All python object in memory has a metadata header. For example,
float
has a field of value and two metadata fields:ob_refcnt
: reference counting,ob_type
pointer to obj type,ob_fval
double of C keeping the float val.
We can also group sequence as mutable or immutable.
>>> symbols = '$¢£¥€¤'
>>> tuple(ord(symbol) for symbol in symbols) (1)
(36, 162, 163, 165, 8364, 164)
>>> import array
>>> array.array('I', (ord(symbol) for symbol in symbols)) (2)
array('I', [36, 162, 163, 165, 8364, 164])