@francois-durand
Created September 25, 2020 13:09
Collections in Python
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"<font size=\"+4\"><b>Collections in Python</b></font>\n",
"\n",
"François Durand (Nokia Bell Labs France)\n",
"\n",
"[email protected]\n",
"\n",
"Python Workshop - Lincs, 17 & 24 September 2020"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Overview"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Abstract Class `Collection`"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T09:21:31.250954Z",
"start_time": "2020-09-14T09:21:31.245929Z"
}
},
"source": [
"A `Collection` is something that can have elements inside. It implements:\n",
"* `__len__` (it is a `Sized`),\n",
"* `__contains__` (it is a `Container`),\n",
"* `__iter__` (it is an `Iterable`).\n",
"\n",
"Examples: `set`, `list`, `dict`."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"For example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.307454Z",
"start_time": "2020-09-23T08:47:48.304498Z"
}
},
"outputs": [],
"source": [
"my_list = ['a', 'b', 'c', 'd']"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.335376Z",
"start_time": "2020-09-23T08:47:48.318423Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(my_list) # Uses __len__"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.350342Z",
"start_time": "2020-09-23T08:47:48.336373Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'b' in my_list # Uses __contains__"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.368290Z",
"start_time": "2020-09-23T08:47:48.351337Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'e' not in my_list # Uses __contains__"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.385747Z",
"start_time": "2020-09-23T08:47:48.370284Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"b\n",
"c\n",
"d\n"
]
}
],
"source": [
"for element in my_list: # Uses __iter__\n",
" print(element)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Main Subclasses of `Collection`"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-18T09:43:33.573301Z",
"start_time": "2020-09-18T09:43:33.147514Z"
}
},
"source": [
"Inclusion diagram:\n",
"\n",
"`Collection`:\n",
"* `Set`\n",
"* `Sequence`\n",
"* `Mapping`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Abstract Class `Set`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Collection`:\n",
"* **`Set`**:\n",
" * **`set`**\n",
" * **`frozenset`**\n",
"* `Sequence`\n",
"* `Mapping` "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A `Set` gives access to comparison and inclusion tests (`__le__`, `__lt__`, `__eq__`, `__ne__`, `__gt__`, `__ge__`), a disjointness test (`isdisjoint`) and set-theoretic operations (`__and__`, `__or__`, `__sub__`, `__xor__`). For example:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.395720Z",
"start_time": "2020-09-23T08:47:48.386744Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.412675Z",
"start_time": "2020-09-23T08:47:48.397715Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set <= {'a', 'b', 'c', 'd', 'e', 'f'} # Inclusion test: uses __le__"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.424643Z",
"start_time": "2020-09-23T08:47:48.414671Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.isdisjoint({'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.438901Z",
"start_time": "2020-09-23T08:47:48.426678Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'b', 'd'}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set & {'b', 'd', 'f'} # Intersection: uses __and__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will say more about `set` and `frozenset` later."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Abstract Class `Sequence`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Collection`:\n",
"* `Set`\n",
"* **`Sequence`:**\n",
" * **`tuple`**\n",
" * **Classes defined by `namedtuple`**\n",
" * **`list`**\n",
" * **`deque`**\n",
" * **`str`**\n",
" * **`bytes`**\n",
"* `Mapping` "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A `Sequence` implements `__getitem__` with integer indices. As a consequence, you have access to indexing, slicing and other features such as `index` and `count`. For example:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.450846Z",
"start_time": "2020-09-23T08:47:48.440914Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd', 'b')"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.464807Z",
"start_time": "2020-09-23T08:47:48.452840Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'a'"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[0] # Indexing"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.476777Z",
"start_time": "2020-09-23T08:47:48.466802Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('b', 'c')"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[1:3] # Slicing"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will say more about `tuple`, `namedtuple`, `list` and `deque` later, but we will not cover `str` and `bytes` in this talk."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Abstract Class `Mapping`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Collection`:\n",
"* `Set`\n",
"* `Sequence`\n",
"* **`Mapping`:**\n",
" * **`dict`**\n",
" * **`OrderedDict`**\n",
" * **`defaultdict`**\n",
" * **`Counter`**\n",
" * **`ChainMap`**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A `Mapping` implements `__getitem__`, like a `Sequence`, but keys are not necessarily integers. As a consequence, you have access to `get`, `keys`, `values` and `items`."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"For example:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.488745Z",
"start_time": "2020-09-23T08:47:48.478771Z"
}
},
"outputs": [],
"source": [
"my_dict = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.503705Z",
"start_time": "2020-09-23T08:47:48.490740Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"a\n"
]
}
],
"source": [
"print(my_dict['zero'])\n",
"print(my_dict.get('zero'))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.517668Z",
"start_time": "2020-09-23T08:47:48.504702Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zero\n",
"one\n",
"two\n",
"three\n"
]
}
],
"source": [
"for k in my_dict.keys():\n",
" print(k)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.529640Z",
"start_time": "2020-09-23T08:47:48.519667Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"b\n",
"c\n",
"d\n"
]
}
],
"source": [
"for v in my_dict.values():\n",
" print(v)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.543598Z",
"start_time": "2020-09-23T08:47:48.531631Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zero: a\n",
"one: b\n",
"two: c\n",
"three: d\n"
]
}
],
"source": [
"for k, v in my_dict.items():\n",
" print('{}: {}'.format(k, v))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will say more about `dict` and its main subclasses later."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## References"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T11:44:43.032769Z",
"start_time": "2020-09-14T11:44:43.025789Z"
},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"* https://docs.python.org/3/tutorial/datastructures.html\n",
"* https://docs.python.org/3/library/stdtypes.html\n",
"* https://docs.python.org/3/library/collections.html\n",
"* https://docs.python.org/3/library/collections.abc.html"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Roadmap of the Talk"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Overview \n",
"2. `Set`: `set`, `frozenset`.\n",
"3. `Sequence`: `tuple`, `namedtuple`, `list`, `deque`.\n",
"4. `Mapping`: `dict`, `OrderedDict`, `defaultdict`, `Counter`, `ChainMap`."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# `Set`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Collection`:\n",
"* **`Set`**:\n",
" * **`set`**\n",
" * **`frozenset`**\n",
"* `Sequence`\n",
"* `Mapping` "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `set`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `set` is **mutable** but **not hashable** (we'll come back to it)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `set`"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.557560Z",
"start_time": "2020-09-23T08:47:48.546598Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd'}"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {'a', 'b', 'c', 'd'}\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.572519Z",
"start_time": "2020-09-23T08:47:48.559557Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd'}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = set(['a', 'b', 'c', 'd']) # Any iterable can be used\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.587480Z",
"start_time": "2020-09-23T08:47:48.575527Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {x**2 for x in range(10)} # \"Set comprehension\"\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.606430Z",
"start_time": "2020-09-23T08:47:48.588477Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{1, 4, 16, 25, 49, 64}"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {x**2 for x in range(10) if x % 3 != 0} # Set comprehension with an \"if\" clause\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.625379Z",
"start_time": "2020-09-23T08:47:48.610419Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'dict'>\n"
]
}
],
"source": [
"be_careful_this_is_not_an_empty_set = {}\n",
"print(type(be_careful_this_is_not_an_empty_set))"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.644328Z",
"start_time": "2020-09-23T08:47:48.628372Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"set()\n"
]
}
],
"source": [
"this_is_an_empty_set = set()\n",
"print(this_is_an_empty_set)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Collection`"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.657720Z",
"start_time": "2020-09-23T08:47:48.647320Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.669688Z",
"start_time": "2020-09-23T08:47:48.659716Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(my_set)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.683165Z",
"start_time": "2020-09-23T08:47:48.670686Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The set is not empty\n"
]
}
],
"source": [
"if my_set: # True if the set is not empty, i.e. len(my_set) != 0\n",
" print('The set is not empty')\n",
"else:\n",
" print('The set is empty')"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.693004Z",
"start_time": "2020-09-23T08:47:48.684162Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'a' in my_set"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.706957Z",
"start_time": "2020-09-23T08:47:48.694991Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'e' not in my_set"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.722915Z",
"start_time": "2020-09-23T08:47:48.709949Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b is in my_set\n",
"c is in my_set\n",
"a is in my_set\n",
"d is in my_set\n"
]
}
],
"source": [
"for x in my_set: # Do not rely on the order\n",
" print('{} is in my_set'.format(x))"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.733886Z",
"start_time": "2020-09-23T08:47:48.723912Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"i = 0, x = b\n",
"i = 1, x = c\n",
"i = 2, x = a\n",
"i = 3, x = d\n"
]
}
],
"source": [
"for i, x in enumerate(my_set): # This syntax is more common with Sequences (list, tuple...)\n",
" print('i = {}, x = {}'.format(i, x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Set`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Set-Theoretic Tests"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.744857Z",
"start_time": "2020-09-23T08:47:48.734882Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.756825Z",
"start_time": "2020-09-23T08:47:48.747849Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.isdisjoint({'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.766797Z",
"start_time": "2020-09-23T08:47:48.758819Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.issubset({'a', 'b', 'c', 'd', 'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.777769Z",
"start_time": "2020-09-23T08:47:48.767795Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set <= {'a', 'b', 'c', 'd', 'e', 'f'} # Synonym of issubset"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.788739Z",
"start_time": "2020-09-23T08:47:48.779764Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set < {'a', 'b', 'c', 'd', 'e', 'f'} # Strict subset"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.801704Z",
"start_time": "2020-09-23T08:47:48.790735Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.issuperset({'a', 'b'})"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.815668Z",
"start_time": "2020-09-23T08:47:48.802701Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set >= {'a', 'b'} # Synonym of issuperset"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.827636Z",
"start_time": "2020-09-23T08:47:48.817663Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set > {'a', 'b'} # Strict superset"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Set-Theoretic Operations"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.837608Z",
"start_time": "2020-09-23T08:47:48.831625Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.851571Z",
"start_time": "2020-09-23T08:47:48.839603Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd', 'e', 'f'}"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.union({'b', 'd', 'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.862542Z",
"start_time": "2020-09-23T08:47:48.852568Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd', 'e', 'f'}"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set | {'b', 'd', 'e', 'f'} # Synonym of union: elements that are in my_set OR in {'b', 'd', 'e', 'f'}"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.874510Z",
"start_time": "2020-09-23T08:47:48.863540Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'b', 'd'}"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.intersection({'b', 'd', 'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.886477Z",
"start_time": "2020-09-23T08:47:48.876505Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'b', 'd'}"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set & {'b', 'd', 'e', 'f'} # Synonym of intersection: elements that are in my_set AND in {'b', 'd', 'e', 'f'}"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.895462Z",
"start_time": "2020-09-23T08:47:48.888472Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.912412Z",
"start_time": "2020-09-23T08:47:48.897448Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c'}"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.difference({'b', 'd', 'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.924380Z",
"start_time": "2020-09-23T08:47:48.915400Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c'}"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set - {'b', 'd', 'e', 'f'} # Synonym of difference"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.939274Z",
"start_time": "2020-09-23T08:47:48.927368Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c', 'e', 'f'}"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.symmetric_difference({'b', 'd', 'e', 'f'})"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.953237Z",
"start_time": "2020-09-23T08:47:48.946259Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c', 'e', 'f'}"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set ^ {'b', 'd', 'e', 'f'} # Synonym of symmetric_difference"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `set` "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `set` is **mutable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Add Elements"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.963211Z",
"start_time": "2020-09-23T08:47:48.955231Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.978171Z",
"start_time": "2020-09-23T08:47:48.966202Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd', 'e'}"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.add('e')\n",
"my_set"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Remove Elements"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:48.987152Z",
"start_time": "2020-09-23T08:47:48.980165Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.003103Z",
"start_time": "2020-09-23T08:47:48.989140Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c', 'd'}"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.remove('b')\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.013077Z",
"start_time": "2020-09-23T08:47:49.004100Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'c', 'd'}"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.discard('a')\n",
"my_set"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`remove` and `discard` both remove the given element from the set. Their behavior differs only if the element is not in the set: in that case, `remove` raises a `KeyError`, whereas `discard` does nothing."
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.025045Z",
"start_time": "2020-09-23T08:47:49.014074Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"KeyError: 'z'\n"
]
}
],
"source": [
"try:\n",
" my_set.remove('z')\n",
"except KeyError as e:\n",
" print('KeyError:', e)"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.035018Z",
"start_time": "2020-09-23T08:47:49.026043Z"
}
},
"outputs": [],
"source": [
"my_set.discard('z')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Remove an arbitrary element and return it:"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.045988Z",
"start_time": "2020-09-23T08:47:49.036016Z"
}
},
"outputs": [],
"source": [
"my_set = {'a', 'b', 'c', 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.057960Z",
"start_time": "2020-09-23T08:47:49.047983Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b\n",
"{'c', 'a', 'd'}\n"
]
}
],
"source": [
"x = my_set.pop()\n",
"print(x)\n",
"print(my_set)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Remove all elements:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.068926Z",
"start_time": "2020-09-23T08:47:49.059951Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"set()"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set.clear()\n",
"my_set"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T12:12:48.159260Z",
"start_time": "2020-09-14T12:12:48.156233Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Set-Theoretic Updates"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.082889Z",
"start_time": "2020-09-23T08:47:49.069924Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'b', 'c', 'd', 'e', 'f'}"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {'a', 'b', 'c', 'd'}\n",
"my_set.update({'b', 'd', 'e', 'f'}) # In-place version of `union`\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.093860Z",
"start_time": "2020-09-23T08:47:49.083886Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'b', 'd'}"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {'a', 'b', 'c', 'd'}\n",
"my_set.intersection_update({'b', 'd', 'e', 'f'}) # In-place version of `intersection`\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.104831Z",
"start_time": "2020-09-23T08:47:49.095855Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c'}"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {'a', 'b', 'c', 'd'}\n",
"my_set.difference_update({'b', 'd', 'e', 'f'}) # In-place version of `difference`\n",
"my_set"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.116799Z",
"start_time": "2020-09-23T08:47:49.105828Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a', 'c', 'e', 'f'}"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_set = {'a', 'b', 'c', 'd'}\n",
"my_set.symmetric_difference_update({'b', 'd', 'e', 'f'}) # In-place version of `symmetric_difference`\n",
"my_set"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `frozenset`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"`frozenset` is **hashable** but **not mutable** (we'll come back to it)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `frozenset`"
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.127770Z",
"start_time": "2020-09-23T08:47:49.117797Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"frozenset({'a', 'b', 'c', 'd'})"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_frozen_set = frozenset({'a', 'b', 'c', 'd'}) # Any iterable can be used\n",
"my_frozen_set"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Collection` and `Set`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From `Collection`: `len`, `in` / `not in`, iteration.\n",
"\n",
"From `Set`:\n",
"* Set-theoretic tests: `isdisjoint`, `issubset`, `<=`...\n",
"* Set-theoretic operations: `union`, `intersection`..."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `frozenset`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### A `frozenset` Is Not Mutable"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.137742Z",
"start_time": "2020-09-23T08:47:49.128767Z"
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"AttributeError: 'frozenset' object has no attribute 'add'\n"
]
}
],
"source": [
"try:\n",
" my_frozen_set = frozenset({'a', 'b', 'c', 'd'})\n",
" my_frozen_set.add('e')\n",
"except AttributeError as e:\n",
" print('AttributeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### A `frozenset` Is Hashable"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.149711Z",
"start_time": "2020-09-23T08:47:49.138740Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"-5449832526543167815"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_frozen_set = frozenset({'a', 'b', 'c', 'd'})\n",
"hash(my_frozen_set)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"This leads to 2 main use cases..."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Set of Sets"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.161679Z",
"start_time": "2020-09-23T08:47:49.152703Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{frozenset({'a', 'b'}), frozenset({'c', 'd'})}"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pairs = {\n",
" frozenset({'a', 'b'}),\n",
" frozenset({'c', 'd'})\n",
"}\n",
"pairs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This would not work with `set`:"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.171652Z",
"start_time": "2020-09-23T08:47:49.163674Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TypeError: unhashable type: 'set'\n"
]
}
],
"source": [
"try:\n",
" pairs = {\n",
" {'a', 'b'},\n",
" {'c', 'd'}\n",
" }\n",
" pairs\n",
"except TypeError as e:\n",
" print('TypeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Dictionary \"Set to Something\""
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.183620Z",
"start_time": "2020-09-23T08:47:49.173647Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{frozenset({'a', 'b'}): 42, frozenset({'c', 'd'}): 51}"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_pair_score = {\n",
" frozenset({'a', 'b'}): 42,\n",
" frozenset({'c', 'd'}): 51\n",
"}\n",
"d_pair_score"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This would not work with `set`:"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.193593Z",
"start_time": "2020-09-23T08:47:49.184623Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TypeError: unhashable type: 'set'\n"
]
}
],
"source": [
"try:\n",
" d_pair_score = {\n",
" {'a', 'b'}: 42,\n",
" {'c', 'd'}: 51\n",
" }\n",
" d_pair_score\n",
"except TypeError as e:\n",
" print('TypeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Use `set` or `frozenset`?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Rule of thumb:**\n",
"* Generally, use `set`.\n",
"* If your set needs to be hashable (typically for a set of sets, or a dictionary \"set to something\"), use `frozenset`."
]
},
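{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A minimal sketch of the dictionary use case, assuming the pairs are unordered. The helper `pair_score` is hypothetical and only serves this illustration. Two frozensets with the same elements are equal and have the same hash, so the lookup does not depend on the order in which the elements are given:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"d_pair_score = {\n",
"    frozenset({'a', 'b'}): 42,\n",
"    frozenset({'c', 'd'}): 51\n",
"}\n",
"\n",
"def pair_score(x, y):\n",
"    # Hypothetical helper: score of the unordered pair {x, y}\n",
"    return d_pair_score[frozenset({x, y})]\n",
"\n",
"print(pair_score('a', 'b'))\n",
"print(pair_score('b', 'a'))  # Same key: the order of the elements does not matter"
]
},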
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Sequence"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-18T12:44:33.989506Z",
"start_time": "2020-09-18T12:44:33.980409Z"
}
},
"source": [
"`Collection`:\n",
"* `Set`\n",
"* **`Sequence`:**\n",
" * **`tuple`**\n",
" * **Classes defined by `namedtuple`**\n",
" * **`list`**\n",
" * **`deque`**\n",
" * **`str`** (not in this talk)\n",
" * **`bytes`** (not in this talk)\n",
"* `Mapping` "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `tuple`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `tuple` is **hashable** but **not mutable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `tuple`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2 elements or more:"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.203567Z",
"start_time": "2020-09-23T08:47:49.194591Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.215534Z",
"start_time": "2020-09-23T08:47:49.205562Z"
}
},
"outputs": [],
"source": [
"my_tuple = 'a', 'b', 'c', 'd' # This syntax is called \"packing\" (similarly to \"unpacking\", cf. below)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1 element:"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.225508Z",
"start_time": "2020-09-23T08:47:49.216532Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', ) # I personally prefer this one because it does look like a tuple..."
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.237477Z",
"start_time": "2020-09-23T08:47:49.226505Z"
}
},
"outputs": [],
"source": [
"my_tuple = 'a', # ... but this one is also possible"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"0 elements:"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.248446Z",
"start_time": "2020-09-23T08:47:49.239472Z"
}
},
"outputs": [],
"source": [
"my_tuple = ()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Sequence`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Reminder: since `Sequence` is a subclass of `Collection`, we have access to `len`, `in` / `not in` and iteration."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Unpack"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.258420Z",
"start_time": "2020-09-23T08:47:49.249444Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.272384Z",
"start_time": "2020-09-23T08:47:49.260415Z"
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"b\n",
"c\n",
"d\n"
]
}
],
"source": [
"alpha, beta, gamma, delta = my_tuple\n",
"print(alpha)\n",
"print(beta)\n",
"print(gamma)\n",
"print(delta)"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.286346Z",
"start_time": "2020-09-23T08:47:49.274379Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'abcd'"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def my_function(x, y, z, t):\n",
" return x + y + z + t\n",
"my_function(*my_tuple) # Note the \"*\" operator"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is equivalent to:"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.298314Z",
"start_time": "2020-09-23T08:47:49.288340Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'abcd'"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_function('a', 'b', 'c', 'd')"
]
},
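{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Unpacking also accepts one starred name, which collects all the elements that are not assigned to the other names. A short sketch (the variable names are arbitrary):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')\n",
"first, *middle, last = my_tuple  # The starred name receives a list (possibly empty)\n",
"print(first)\n",
"print(middle)\n",
"print(last)"
]
},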
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Access Elements: Indexing and Slicing"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.308287Z",
"start_time": "2020-09-23T08:47:49.300308Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.321252Z",
"start_time": "2020-09-23T08:47:49.311279Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'a'"
]
},
"execution_count": 80,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[0]"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.333097Z",
"start_time": "2020-09-23T08:47:49.322249Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'d'"
]
},
"execution_count": 81,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[-1]"
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.344191Z",
"start_time": "2020-09-23T08:47:49.335215Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('b', 'c')"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[1:3] # Start included, end excluded"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.354164Z",
"start_time": "2020-09-23T08:47:49.346186Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('b', 'c', 'd')"
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[1:]"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.366133Z",
"start_time": "2020-09-23T08:47:49.355161Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('a', 'c')"
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[::2]"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.377103Z",
"start_time": "2020-09-23T08:47:49.368127Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('d', 'c', 'b', 'a')"
]
},
"execution_count": 85,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple[::-1]  # To iterate in reverse order, prefer `reversed` (cf. below)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Iterate on the Reversed Sequence"
]
},
{
"cell_type": "code",
"execution_count": 86,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.387076Z",
"start_time": "2020-09-23T08:47:49.379098Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.401039Z",
"start_time": "2020-09-23T08:47:49.388073Z"
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d\n",
"c\n",
"b\n",
"a\n"
]
}
],
"source": [
"for x in reversed(my_tuple):\n",
" print(x)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Find and Count Elements"
]
},
{
"cell_type": "code",
"execution_count": 88,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.412009Z",
"start_time": "2020-09-23T08:47:49.404031Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd', 'b')"
]
},
{
"cell_type": "code",
"execution_count": 89,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.422987Z",
"start_time": "2020-09-23T08:47:49.414004Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 89,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple.index('b') # First occurrence"
]
},
{
"cell_type": "code",
"execution_count": 90,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.434948Z",
"start_time": "2020-09-23T08:47:49.423977Z"
},
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 90,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple.count('b') # Number of occurrences"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Concatenate"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.444921Z",
"start_time": "2020-09-23T08:47:49.435945Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.456889Z",
"start_time": "2020-09-23T08:47:49.446917Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('a', 'b', 'c', 'd', 'e', 'f')"
]
},
"execution_count": 92,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple + ('e', 'f')"
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.467860Z",
"start_time": "2020-09-23T08:47:49.458884Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd')"
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple * 3"
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.479828Z",
"start_time": "2020-09-23T08:47:49.468858Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"('a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd')"
]
},
"execution_count": 94,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"3 * my_tuple"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Min / Max"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.488808Z",
"start_time": "2020-09-23T08:47:49.481822Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.501769Z",
"start_time": "2020-09-23T08:47:49.489802Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'a'"
]
},
"execution_count": 96,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"min(my_tuple)"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.513737Z",
"start_time": "2020-09-23T08:47:49.502766Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'d'"
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(my_tuple)"
]
},
{
"cell_type": "code",
"execution_count": 98,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.523710Z",
"start_time": "2020-09-23T08:47:49.514734Z"
}
},
"outputs": [],
"source": [
"my_tuple = (-5, 1, -3, 2, 4)"
]
},
{
"cell_type": "code",
"execution_count": 99,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.535678Z",
"start_time": "2020-09-23T08:47:49.524708Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 99,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"min(my_tuple, key=abs)"
]
},
{
"cell_type": "code",
"execution_count": 100,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.546650Z",
"start_time": "2020-09-23T08:47:49.537673Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 100,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"min(my_tuple, key=lambda x: abs(x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Argmax"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I don't understand why there is no argmax function in standard Python!"
]
},
{
"cell_type": "code",
"execution_count": 101,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.557620Z",
"start_time": "2020-09-23T08:47:49.548644Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'a', 'c')"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.570586Z",
"start_time": "2020-09-23T08:47:49.558618Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 102,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(range(len(my_tuple)), key=lambda i: my_tuple[i]) # The most readable?"
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.582553Z",
"start_time": "2020-09-23T08:47:49.571583Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 103,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(range(len(my_tuple)), key=my_tuple.__getitem__) # This one looks good too."
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.593524Z",
"start_time": "2020-09-23T08:47:49.583550Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(enumerate(my_tuple), key=lambda x: x[1])[0]"
]
},
{
"cell_type": "code",
"execution_count": 105,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.604495Z",
"start_time": "2020-09-23T08:47:49.595519Z"
},
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 105,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from operator import itemgetter\n",
"max(enumerate(my_tuple), key=itemgetter(1))[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"N.B.: in `numpy`, there exists an `argmax` method."
]
},
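{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"If you need it often, one of the idioms above can be wrapped in a small function. This is only a sketch: the name `argmax` is our own choice (it is not part of the standard library), ties are resolved in favor of the first maximal element, and an empty sequence raises a `ValueError`, like `max` itself."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def argmax(sequence):\n",
"    # Hypothetical helper: index of the first maximal element of the sequence\n",
"    return max(range(len(sequence)), key=sequence.__getitem__)\n",
"\n",
"argmax(('a', 'b', 'c', 'a', 'c'))"
]
},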
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Compare Tuples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The comparison is based on lexicographical order:"
]
},
{
"cell_type": "code",
"execution_count": 106,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.615466Z",
"start_time": "2020-09-23T08:47:49.606490Z"
}
},
"outputs": [],
"source": [
"my_tuple = ('a', 'b', 'c', 'd')"
]
},
{
"cell_type": "code",
"execution_count": 107,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.627434Z",
"start_time": "2020-09-23T08:47:49.616463Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 107,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple < ('a', 'b', 'd', 'e')"
]
},
{
"cell_type": "code",
"execution_count": 108,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.636413Z",
"start_time": "2020-09-23T08:47:49.629430Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 108,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_tuple < ('a', 'b', 'c', 'd', 'e')"
]
},
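{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A common application of this lexicographical order is sorting with several criteria: the `key` function returns a tuple, and the criteria are applied from left to right. A minimal sketch with made-up data (the names `scores` and `ranking` are only for illustration):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scores = [('bob', 51), ('alice', 42), ('carol', 51)]\n",
"# Sort by decreasing score, then by name: the keys are compared as tuples\n",
"ranking = sorted(scores, key=lambda item: (-item[1], item[0]))\n",
"ranking"
]
},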
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `tuple`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `tuple` is **immutable** and **hashable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### A `tuple` is Immutable"
]
},
{
"cell_type": "code",
"execution_count": 109,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.648378Z",
"start_time": "2020-09-23T08:47:49.638404Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TypeError: 'tuple' object does not support item assignment\n"
]
}
],
"source": [
"try: \n",
" my_tuple = ('a', 'b', 'c', 'd')\n",
" my_tuple[0] = 'alpha'\n",
"except TypeError as e:\n",
" print('TypeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### A `tuple` is Hashable"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compared to a `list`, the main additional feature of a `tuple` is that it is **hashable** (provided that all its elements are hashable)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, tuples can be used in sets:"
]
},
{
"cell_type": "code",
"execution_count": 110,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.658351Z",
"start_time": "2020-09-23T08:47:49.649375Z"
}
},
"outputs": [],
"source": [
"my_set = {('a', 'b'), ('c', 'd')}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or as keys in a dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 111,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.668324Z",
"start_time": "2020-09-23T08:47:49.659349Z"
}
},
"outputs": [],
"source": [
"d_pair_score = {\n",
" ('a', 'b'): 42, \n",
" ('c', 'd'): 51\n",
"}"
]
},
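{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A caveat worth illustrating: the hash of a tuple is computed from the hashes of its elements, so a tuple that contains an unhashable object (a `list` in this sketch) cannot be put in a set or used as a dictionary key:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
"    nested_set = {('a', ['b', 'c'])}  # This tuple contains a list, which is not hashable\n",
"except TypeError as e:\n",
"    print('TypeError:', e)"
]
},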
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Usage for Function Outputs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Traditionally, tuples are often used as function outputs, especially if they are heterogeneous:"
]
},
{
"cell_type": "code",
"execution_count": 112,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.678299Z",
"start_time": "2020-09-23T08:47:49.671317Z"
}
},
"outputs": [],
"source": [
"def test_exceed_threshold(x, threshold):\n",
" test = x > threshold\n",
" delta = x - threshold\n",
" return test, delta"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is frequent to use unpacking to access the results of such a function:"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.689268Z",
"start_time": "2020-09-23T08:47:49.680293Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"9\n"
]
}
],
"source": [
"test, delta = test_exceed_threshold(51, 42)\n",
"print(test)\n",
"print(delta)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Note that this is a convention rather than a syntactic feature specific to `tuple`. Indeed, the following works:"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.697355Z",
"start_time": "2020-09-23T08:47:49.690265Z"
}
},
"outputs": [],
"source": [
"def test_exceed_threshold(x, threshold):\n",
" test = x > threshold\n",
" delta = x - threshold\n",
" return [test, delta] # Note the list here instead of a tuple"
]
},
{
"cell_type": "code",
"execution_count": 115,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.708313Z",
"start_time": "2020-09-23T08:47:49.699338Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"9\n"
]
}
],
"source": [
"test, delta = test_exceed_threshold(51, 42)\n",
"print(test)\n",
"print(delta)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This alternate version is just less Pythonic. Hence your reader (possibly yourself later) may waste time wondering why you did that."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"When the outputs are homogeneous, both versions are reasonably natural:"
]
},
{
"cell_type": "code",
"execution_count": 116,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.718287Z",
"start_time": "2020-09-23T08:47:49.709311Z"
}
},
"outputs": [],
"source": [
"from math import cos, sin, pi"
]
},
{
"cell_type": "code",
"execution_count": 117,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.728260Z",
"start_time": "2020-09-23T08:47:49.719284Z"
}
},
"outputs": [],
"source": [
"def polar_to_cartesian(r, theta):\n",
" return r * cos(theta), r * sin(theta)"
]
},
{
"cell_type": "code",
"execution_count": 118,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.738233Z",
"start_time": "2020-09-23T08:47:49.730255Z"
}
},
"outputs": [],
"source": [
"def polar_to_cartesian(r, theta):\n",
" return [r * cos(theta), r * sin(theta)] # Note the list here instead of a tuple"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `namedtuple`"
]
},
{
"cell_type": "code",
"execution_count": 119,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.748206Z",
"start_time": "2020-09-23T08:47:49.742223Z"
}
},
"outputs": [],
"source": [
"from collections import namedtuple"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-18T13:22:42.341936Z",
"start_time": "2020-09-18T13:22:42.337908Z"
}
},
"source": [
"`namedtuple` gives a way to create tuples whose fields have names."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`collections.namedtuple` is not a class itself, but a function that return a custom subclass of `tuple`:"
]
},
{
"cell_type": "code",
"execution_count": 120,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.761172Z",
"start_time": "2020-09-23T08:47:49.750201Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 120,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Point = namedtuple('Point', ['x', 'y'])\n",
"issubclass(Point, tuple)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can define a `Point`:"
]
},
{
"cell_type": "code",
"execution_count": 121,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.771145Z",
"start_time": "2020-09-23T08:47:49.762170Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Point(x=11, y=22)"
]
},
"execution_count": 121,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p = Point(11, 22)\n",
"p"
]
},
{
"cell_type": "code",
"execution_count": 122,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.783114Z",
"start_time": "2020-09-23T08:47:49.772142Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Point(x=11, y=22)"
]
},
"execution_count": 122,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p = Point(x=11, y=22)\n",
"p"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"... and access its elements:"
]
},
{
"cell_type": "code",
"execution_count": 123,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.805570Z",
"start_time": "2020-09-23T08:47:49.785110Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"11"
]
},
"execution_count": 123,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p[0]"
]
},
{
"cell_type": "code",
"execution_count": 124,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.817536Z",
"start_time": "2020-09-23T08:47:49.807562Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"11"
]
},
"execution_count": 124,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"p.x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some other methods are provided (`_as_dict`, etc): cf. the documentation."
]
},
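{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"For instance, here is a quick sketch of three of these helpers, `_fields`, `_asdict` and `_replace`, applied to the same `Point` class as above. Their names start with an underscore only to avoid conflicts with field names; they are part of the documented interface."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from collections import namedtuple\n",
"\n",
"Point = namedtuple('Point', ['x', 'y'])\n",
"p = Point(11, 22)\n",
"print(Point._fields)     # Names of the fields\n",
"print(p._asdict())       # Conversion to a dictionary\n",
"print(p._replace(x=33))  # New named tuple with one field changed (p itself is unchanged)"
]
},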
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Usage for Function Outputs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It could be interesting to use named tuples instead of tuples as function outputs. Consider the following function:"
]
},
{
"cell_type": "code",
"execution_count": 125,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.827509Z",
"start_time": "2020-09-23T08:47:49.820529Z"
}
},
"outputs": [],
"source": [
"def my_function(n):\n",
" return n * 3, n * 4"
]
},
{
"cell_type": "code",
"execution_count": 126,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.842467Z",
"start_time": "2020-09-23T08:47:49.831500Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"126"
]
},
"execution_count": 126,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_function(42)[0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"You can rewrite it as:"
]
},
{
"cell_type": "code",
"execution_count": 127,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.870393Z",
"start_time": "2020-09-23T08:47:49.844463Z"
}
},
"outputs": [],
"source": [
"MyFunctionOutput = namedtuple('MyFunctionOutput', ['triple', 'quadruple'])\n",
"def my_function(n):\n",
" return MyFunctionOutput(n * 3, n * 4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is backward-compatible:"
]
},
{
"cell_type": "code",
"execution_count": 128,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.880373Z",
"start_time": "2020-09-23T08:47:49.871391Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"126"
]
},
"execution_count": 128,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_function(42)[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But now you can also write:"
]
},
{
"cell_type": "code",
"execution_count": 129,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.892336Z",
"start_time": "2020-09-23T08:47:49.883359Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"126"
]
},
"execution_count": 129,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_function(42).triple"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... which is less error-prone and easier to maintain."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"For example, if you later decide to add another output and/or change the order of the outputs, you can do:"
]
},
{
"cell_type": "code",
"execution_count": 130,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.911822Z",
"start_time": "2020-09-23T08:47:49.902854Z"
}
},
"outputs": [],
"source": [
"MyFunctionOutput = namedtuple('MyFunctionOutput', ['double', 'triple', 'quadruple'])\n",
"def my_function(n):\n",
" return MyFunctionOutput(n * 2, n * 3, n * 4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And the named-tuple-style usage still works:"
]
},
{
"cell_type": "code",
"execution_count": 131,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.930783Z",
"start_time": "2020-09-23T08:47:49.915812Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"126"
]
},
"execution_count": 131,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_function(42).triple"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For the moment, I don't do it myself (but I sometimes use dictionaries instead). What do you think of it?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remark: instead, you can also use a data class..."
]
},
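{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"A possible sketch of the data class variant, using hypothetical names (`MyFunctionResult`, `my_function_with_dataclass`) so as not to overwrite the definitions above. With `frozen=True`, the instances are immutable like a named tuple. Note that, unlike a named tuple, a data class instance is not a sequence: access by index (`[0]`) and unpacking do not work, so this variant is not backward-compatible with code written for plain tuples."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dataclasses import dataclass\n",
"\n",
"@dataclass(frozen=True)\n",
"class MyFunctionResult:\n",
"    double: int\n",
"    triple: int\n",
"    quadruple: int\n",
"\n",
"def my_function_with_dataclass(n):\n",
"    return MyFunctionResult(n * 2, n * 3, n * 4)\n",
"\n",
"my_function_with_dataclass(42).triple"
]
},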
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `list`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `list` is **mutable** but **not hashable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `list`"
]
},
{
"cell_type": "code",
"execution_count": 132,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.948731Z",
"start_time": "2020-09-23T08:47:49.933774Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 132,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = ['a', 'b', 'c', 'd']\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 133,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:49.985629Z",
"start_time": "2020-09-23T08:47:49.953718Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]"
]
},
"execution_count": 133,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = [x**2 for x in range(10)] # \"List comprehension\"\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 134,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.026259Z",
"start_time": "2020-09-23T08:47:49.988616Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 4, 16, 25, 49, 64]"
]
},
"execution_count": 134,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = [x**2 for x in range(10) if x % 3 != 0] # List comprehension with an \"if\" clause\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 135,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.053188Z",
"start_time": "2020-09-23T08:47:50.032239Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 3, 5, 7]"
]
},
"execution_count": 135,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"raw_data = [0, 1, 4, 9, 16]\n",
"differences = [end - start for start, end in zip(raw_data[:-1], raw_data[1:])]\n",
"differences"
]
},
{
"cell_type": "code",
"execution_count": 136,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.075125Z",
"start_time": "2020-09-23T08:47:50.055178Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 136,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = 'a,b,c,d'.split(',')\n",
"my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Sequence`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From `Collection`: `len`, `in` / `not in`, iteration, `enumerate`...\n",
"\n",
"From `Sequence`:\n",
"* Unpack,\n",
"* Indexing and slicing,\n",
"* Find (`index`) and count (`count`) an element,\n",
"* Concatenate,\n",
"* Min / Max,\n",
"* Compare lists (lexicographical order),\n",
"* Iterate in reverse order."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Examples:"
]
},
{
"cell_type": "code",
"execution_count": 137,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.091082Z",
"start_time": "2020-09-23T08:47:50.078118Z"
}
},
"outputs": [],
"source": [
"my_list = ['a', 'b', 'c', 'd']"
]
},
{
"cell_type": "code",
"execution_count": 138,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.108487Z",
"start_time": "2020-09-23T08:47:50.094082Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The list is not empty\n"
]
}
],
"source": [
"if my_list:\n",
"    print('The list is not empty')\n",
"else:\n",
"    print('The list is empty')"
]
},
{
"cell_type": "code",
"execution_count": 139,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.133420Z",
"start_time": "2020-09-23T08:47:50.112476Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"i = 0, x = a\n",
"i = 1, x = b\n",
"i = 2, x = c\n",
"i = 3, x = d\n"
]
}
],
"source": [
"for i, x in enumerate(my_list):\n",
"    print('i = {}, x = {}'.format(i, x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `list`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A list is **mutable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Modify Elements by Indexing and Slicing"
]
},
{
"cell_type": "code",
"execution_count": 140,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.155358Z",
"start_time": "2020-09-23T08:47:50.140399Z"
}
},
"outputs": [],
"source": [
"my_list = ['a', 'b', 'c', 'd']"
]
},
{
"cell_type": "code",
"execution_count": 141,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.184282Z",
"start_time": "2020-09-23T08:47:50.157353Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['alpha', 'b', 'c', 'd']"
]
},
"execution_count": 141,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list[0] = 'alpha'\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 142,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.196267Z",
"start_time": "2020-09-23T08:47:50.188270Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['alpha', 'beta', 'gamma']"
]
},
"execution_count": 142,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list[1:] = ['beta', 'gamma'] # The number of elements does not need to match\n",
"my_list"
]
},
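{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the replacement does not need to have the same length as the slice, slice assignment can also insert or delete elements. A minimal sketch (the variable name `letters` is only illustrative):\n",
"\n",
"```python\n",
"letters = ['a', 'b', 'c', 'd']\n",
"letters[1:1] = ['x', 'y']   # Empty slice: insert without replacing anything\n",
"print(letters)              # ['a', 'x', 'y', 'b', 'c', 'd']\n",
"letters[1:3] = []           # Empty replacement: delete the slice\n",
"print(letters)              # ['a', 'b', 'c', 'd']\n",
"letters[::2] = ['A', 'C']   # Extended slice (with a step): the lengths must match\n",
"print(letters)              # ['A', 'b', 'C', 'd']\n",
"```\n",
"\n",
"Only slices with a step require the replacement to have exactly as many elements as the slice."
]
},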
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Add Elements"
]
},
{
"cell_type": "code",
"execution_count": 143,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.213212Z",
"start_time": "2020-09-23T08:47:50.202237Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 143,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = ['a', 'b', 'c', 'd']\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 144,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.228164Z",
"start_time": "2020-09-23T08:47:50.217193Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e']"
]
},
"execution_count": 144,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.append('e')\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 145,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.247122Z",
"start_time": "2020-09-23T08:47:50.238138Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e', 'f', 'g']"
]
},
"execution_count": 145,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.extend(['f', 'g'])\n",
"# Compared to concatenation:\n",
"# * This is an in-place operation,\n",
"# * Any iterable can be used, e.g. ('f', 'g').\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 146,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.266068Z",
"start_time": "2020-09-23T08:47:50.254104Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['g', 'a', 'b', 'c', 'd', 'e', 'f', 'g']"
]
},
"execution_count": 146,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.insert(0, 'g')\n",
"my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Remove Elements"
]
},
{
"cell_type": "code",
"execution_count": 147,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.285017Z",
"start_time": "2020-09-23T08:47:50.274043Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['g', 'a', 'b', 'c', 'd', 'e', 'f', 'g']"
]
},
"execution_count": 147,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = ['g', 'a', 'b', 'c', 'd', 'e', 'f', 'g']\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 148,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.307954Z",
"start_time": "2020-09-23T08:47:50.292991Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e', 'f', 'g']"
]
},
"execution_count": 148,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.remove('g')  # Removes only the first matching element\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 149,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.321913Z",
"start_time": "2020-09-23T08:47:50.314933Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'd', 'e', 'f', 'g']"
]
},
"execution_count": 149,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"del my_list[2]\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 150,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.334954Z",
"start_time": "2020-09-23T08:47:50.323909Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"my_list = ['a', 'b', 'c', 'd']"
]
},
{
"cell_type": "code",
"execution_count": 151,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.349915Z",
"start_time": "2020-09-23T08:47:50.337948Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d\n",
"['a', 'b', 'c']\n"
]
}
],
"source": [
"x = my_list.pop()\n",
"print(x)\n",
"print(my_list)"
]
},
{
"cell_type": "code",
"execution_count": 152,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.365873Z",
"start_time": "2020-09-23T08:47:50.352928Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"['b', 'c']\n"
]
}
],
"source": [
"x = my_list.pop(0)\n",
"print(x)\n",
"print(my_list)"
]
},
{
"cell_type": "code",
"execution_count": 153,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.378836Z",
"start_time": "2020-09-23T08:47:50.368864Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[]\n"
]
}
],
"source": [
"my_list.clear()\n",
"print(my_list)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Sort"
]
},
{
"cell_type": "code",
"execution_count": 154,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.387818Z",
"start_time": "2020-09-23T08:47:50.381832Z"
}
},
"outputs": [],
"source": [
"my_list = ['d', 'b', 'a', 'c']"
]
},
{
"cell_type": "code",
"execution_count": 155,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.402773Z",
"start_time": "2020-09-23T08:47:50.391803Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 155,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(my_list) # Works for any iterable, but always returns a list."
]
},
{
"cell_type": "code",
"execution_count": 156,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.417732Z",
"start_time": "2020-09-23T08:47:50.404767Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['d', 'b', 'a', 'c']"
]
},
"execution_count": 156,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list # Note that the original list is not modified."
]
},
{
"cell_type": "code",
"execution_count": 157,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.429710Z",
"start_time": "2020-09-23T08:47:50.419726Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 157,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list.sort() # Works specifically for a list: this is an in-place operation.\n",
"my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Sort a list, based on a criterion (\"key\"):"
]
},
{
"cell_type": "code",
"execution_count": 158,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.440671Z",
"start_time": "2020-09-23T08:47:50.435684Z"
}
},
"outputs": [],
"source": [
"my_list = [-5, 1, -3, 2, 4]"
]
},
{
"cell_type": "code",
"execution_count": 159,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.455632Z",
"start_time": "2020-09-23T08:47:50.446660Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, -3, 4, -5]"
]
},
"execution_count": 159,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(my_list, key=abs)"
]
},
{
"cell_type": "code",
"execution_count": 160,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.466602Z",
"start_time": "2020-09-23T08:47:50.457627Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, -3, 4, -5]"
]
},
"execution_count": 160,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(my_list, key=lambda x: abs(x))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Sort a list of objects based on the value of an attribute:"
]
},
{
"cell_type": "code",
"execution_count": 161,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.478571Z",
"start_time": "2020-09-23T08:47:50.468596Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[Person(Alice, 42), Person(Bob, 27), Person(Cate, 33)]"
]
},
"execution_count": 161,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"class Person:\n",
"    def __init__(self, name, age):\n",
"        self.name = name\n",
"        self.age = age\n",
"    def __repr__(self):\n",
"        return 'Person({}, {})'.format(self.name, self.age)\n",
"\n",
"my_list = [Person('Alice', 42), Person('Bob', 27), Person('Cate', 33)]\n",
"my_list"
]
},
{
"cell_type": "code",
"execution_count": 162,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.489541Z",
"start_time": "2020-09-23T08:47:50.481561Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[Person(Bob, 27), Person(Cate, 33), Person(Alice, 42)]"
]
},
"execution_count": 162,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(my_list, key=lambda person: person.age)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This case is so common that the `operator` module provides a dedicated function, `attrgetter`:"
]
},
{
"cell_type": "code",
"execution_count": 163,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.501574Z",
"start_time": "2020-09-23T08:47:50.491544Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[Person(Bob, 27), Person(Cate, 33), Person(Alice, 42)]"
]
},
"execution_count": 163,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from operator import attrgetter\n",
"sorted(my_list, key=attrgetter('age'))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Similarly, there is a function `itemgetter`:"
]
},
{
"cell_type": "code",
"execution_count": 164,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.511482Z",
"start_time": "2020-09-23T08:47:50.503503Z"
}
},
"outputs": [],
"source": [
"from operator import itemgetter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It works when each item is a `Sequence` (or any other object that supports indexing), for example a `list`:"
]
},
{
"cell_type": "code",
"execution_count": 165,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.522452Z",
"start_time": "2020-09-23T08:47:50.514474Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[['Bob', 27], ['Cate', 33], ['Alice', 42]]"
]
},
"execution_count": 165,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = [['Alice', 42], ['Bob', 27], ['Cate', 33]]\n",
"sorted(my_list, key=itemgetter(1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... or a `tuple`:"
]
},
{
"cell_type": "code",
"execution_count": 166,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.535428Z",
"start_time": "2020-09-23T08:47:50.525477Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[('Bob', 27), ('Cate', 33), ('Alice', 42)]"
]
},
"execution_count": 166,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = [('Alice', 42), ('Bob', 27), ('Cate', 33)]\n",
"sorted(my_list, key=itemgetter(1))"
]
},
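{
"cell_type": "markdown",
"metadata": {},
"source": [
"`itemgetter` also accepts several indices, in which case it returns a tuple; this gives a simple way to sort on several criteria. A small sketch with made-up data (the names `records`, `by_age_then_name` and `oldest_first` are illustrative):\n",
"\n",
"```python\n",
"from operator import itemgetter\n",
"\n",
"records = [('Alice', 42), ('Bob', 27), ('Cate', 33), ('Dan', 27)]\n",
"\n",
"# Sort by age, then by name to break ties.\n",
"by_age_then_name = sorted(records, key=itemgetter(1, 0))\n",
"print(by_age_then_name)\n",
"# [('Bob', 27), ('Dan', 27), ('Cate', 33), ('Alice', 42)]\n",
"\n",
"# Oldest first. The sort is stable, so ties keep their original order.\n",
"oldest_first = sorted(records, key=itemgetter(1), reverse=True)\n",
"print(oldest_first)\n",
"# [('Alice', 42), ('Cate', 33), ('Bob', 27), ('Dan', 27)]\n",
"```\n",
"\n",
"`attrgetter` accepts several attribute names in the same way."
]
},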
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Reverse"
]
},
{
"cell_type": "code",
"execution_count": 167,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.545400Z",
"start_time": "2020-09-23T08:47:50.537412Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['d', 'c', 'b', 'a']"
]
},
"execution_count": 167,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_list = ['a', 'b', 'c', 'd']\n",
"my_list.reverse() # This is an in-place operation\n",
"my_list"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Comparison between `list` and `tuple`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*From https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences:*\n",
"\n",
"\"Though tuples may seem similar to lists, they are often used in different situations and for different purposes.\n",
"\n",
"* Tuples are **immutable**, and usually contain a **heterogeneous** sequence of elements that are accessed via **unpacking** or **indexing** (or even by attribute in the case of namedtuples).\n",
"* Lists are **mutable**, and their elements are usually **homogeneous** and are accessed by **iterating** over the list.\""
]
},
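{
"cell_type": "markdown",
"metadata": {},
"source": [
"One practical consequence of immutability: a tuple whose elements are hashable is itself hashable, so it can serve as a dictionary key or a set element, whereas a list cannot. A small sketch with made-up data (`d_point_label` is an illustrative name):\n",
"\n",
"```python\n",
"# A tuple of hashable elements can be used as a dictionary key.\n",
"d_point_label = {(0, 0): 'origin', (1, 0): 'unit x'}\n",
"print(d_point_label[(0, 0)])    # origin\n",
"\n",
"# A list cannot: it is mutable, hence not hashable.\n",
"try:\n",
"    d_point_label[[0, 1]] = 'unit y'\n",
"except TypeError as e:\n",
"    print('TypeError:', e)\n",
"```"
]
},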
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Use a List as a Stack (LIFO)"
]
},
{
"cell_type": "code",
"execution_count": 168,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.554367Z",
"start_time": "2020-09-23T08:47:50.548383Z"
}
},
"outputs": [],
"source": [
"stack = ['a', 'b', 'c', 'd']"
]
},
{
"cell_type": "code",
"execution_count": 169,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.566336Z",
"start_time": "2020-09-23T08:47:50.555364Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd', 'e']"
]
},
"execution_count": 169,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"stack.append('e')\n",
"stack"
]
},
{
"cell_type": "code",
"execution_count": 170,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.576950Z",
"start_time": "2020-09-23T08:47:50.568958Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"e\n"
]
},
{
"data": {
"text/plain": [
"['a', 'b', 'c', 'd']"
]
},
"execution_count": 170,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = stack.pop()\n",
"print(x)\n",
"stack"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Use a List as a Queue (FIFO)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is possible, but inefficient: adding or removing an element at the beginning of a list takes O(n) time, because all the following elements must be shifted. Use a `deque` instead (see below)."
]
},
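{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, here is a minimal sketch of a FIFO queue based on `collections.deque` (the variable names are illustrative):\n",
"\n",
"```python\n",
"from collections import deque\n",
"\n",
"queue = deque(['a', 'b', 'c'])\n",
"queue.append('d')           # Enqueue on the right\n",
"first = queue.popleft()     # Dequeue on the left, in O(1)\n",
"print(first)                # a\n",
"print(queue)                # deque(['b', 'c', 'd'])\n",
"```"
]
},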
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `deque` (Double-Ended Queue)"
]
},
{
"cell_type": "code",
"execution_count": 171,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.584915Z",
"start_time": "2020-09-23T08:47:50.578936Z"
}
},
"outputs": [],
"source": [
"from collections import deque"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pronounced \"deck\"."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Deques implement **stacks** and **queues** in an optimized way."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"A `deque` is **mutable** but **not hashable**. It is similar to a `list`, but neither is a subclass of the other."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Advantages of `deque` (from https://docs.python.org/3/library/collections.html#collections.deque):\n",
"* Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction.\n",
"* Though list objects support similar operations, they are optimized for fast fixed-length operations and incur O(n) memory movement costs for `pop(0)` and `insert(0, v)` operations which change both the size and position of the underlying data representation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Drawbacks of `deque` (from https://docs.python.org/3/library/collections.html#collections.deque):\n",
"\n",
"* Indexed access is O(1) at both ends but slows to O(n) in the middle.\n",
"* For fast random access, use a list instead, since indexing a list is O(1)."
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T14:58:46.425988Z",
"start_time": "2020-09-14T14:58:46.421965Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a deque"
]
},
{
"cell_type": "code",
"execution_count": 172,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.593902Z",
"start_time": "2020-09-23T08:47:50.585912Z"
}
},
"outputs": [],
"source": [
"my_deque = deque(['a', 'b', 'c', 'd'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You may specify a maximum length:"
]
},
{
"cell_type": "code",
"execution_count": 173,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.603378Z",
"start_time": "2020-09-23T08:47:50.596883Z"
}
},
"outputs": [],
"source": [
"my_deque = deque(['a', 'b', 'c', 'd'], maxlen=5)"
]
},
{
"cell_type": "code",
"execution_count": 174,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.612364Z",
"start_time": "2020-09-23T08:47:50.604375Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"deque(['a', 'b', 'c', 'd', 'e'], maxlen=5)\n"
]
}
],
"source": [
"my_deque.append('e')\n",
"print(my_deque)"
]
},
{
"cell_type": "code",
"execution_count": 175,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.623325Z",
"start_time": "2020-09-23T08:47:50.614349Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"deque(['b', 'c', 'd', 'e', 'f'], maxlen=5)\n"
]
}
],
"source": [
"my_deque.append('f')\n",
"print(my_deque)"
]
},
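{
"cell_type": "markdown",
"metadata": {},
"source": [
"A bounded deque is handy for keeping only the most recent items, for instance to compute a moving average. A sketch, where `moving_average` is a hypothetical helper written for this example, not a library function:\n",
"\n",
"```python\n",
"from collections import deque\n",
"\n",
"def moving_average(values, n):\n",
"    # Yield the average of the last n values, once n values are available.\n",
"    window = deque(maxlen=n)\n",
"    for v in values:\n",
"        window.append(v)    # When the deque is full, the oldest value is discarded\n",
"        if len(window) == n:\n",
"            yield sum(window) / n\n",
"\n",
"print(list(moving_average([1, 2, 3, 4, 5], n=3)))    # [2.0, 3.0, 4.0]\n",
"```"
]
},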
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Sequence`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"From `Collection`: `len`, `in` / `not in`, iteration, `enumerate`...\n",
"\n",
"From `Sequence`:\n",
"* Unpack,\n",
"* Indexing (but not slicing, which raises a `TypeError` for a `deque`),\n",
"* Find (`index`) and count (`count`) an element,\n",
"* Concatenate,\n",
"* Min / Max,\n",
"* Compare deques (lexicographical order),\n",
"* Iterate in reverse order."
]
},
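{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that, unlike a list, a deque can be indexed with an integer but not with a slice. One possible workaround, sketched here, is `itertools.islice` (the name `dq` is illustrative):\n",
"\n",
"```python\n",
"from collections import deque\n",
"from itertools import islice\n",
"\n",
"dq = deque(['a', 'b', 'c', 'd', 'e'])\n",
"print(dq[1])                        # b\n",
"\n",
"try:\n",
"    dq[1:3]\n",
"except TypeError:\n",
"    print('A deque cannot be sliced')\n",
"\n",
"sliced = list(islice(dq, 1, 3))     # Iterates from the left end: O(n) in general\n",
"print(sliced)                       # ['b', 'c']\n",
"```"
]
},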
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `deque`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `deque` is optimized to add and remove elements at both ends."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Add Elements"
]
},
{
"cell_type": "code",
"execution_count": 176,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.633298Z",
"start_time": "2020-09-23T08:47:50.625320Z"
}
},
"outputs": [],
"source": [
"my_deque = deque(['a', 'b', 'c', 'd'])"
]
},
{
"cell_type": "code",
"execution_count": 177,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.645266Z",
"start_time": "2020-09-23T08:47:50.634296Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['a', 'b', 'c', 'd', 'e'])"
]
},
"execution_count": 177,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.append('e')\n",
"my_deque"
]
},
{
"cell_type": "code",
"execution_count": 178,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.654242Z",
"start_time": "2020-09-23T08:47:50.647261Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['z', 'a', 'b', 'c', 'd', 'e'])"
]
},
"execution_count": 178,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.appendleft('z')\n",
"my_deque"
]
},
{
"cell_type": "code",
"execution_count": 179,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.665213Z",
"start_time": "2020-09-23T08:47:50.655239Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['z', 'a', 'b', 'c', 'd', 'e', 'f', 'g'])"
]
},
"execution_count": 179,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.extend(['f', 'g'])\n",
"my_deque"
]
},
{
"cell_type": "code",
"execution_count": 180,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.675188Z",
"start_time": "2020-09-23T08:47:50.667208Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['x', 'y', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g'])"
]
},
"execution_count": 180,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.extendleft(['y', 'x'])  # Note the \"inversion\": 'y' is added on the left first, then 'x'.\n",
"my_deque"
]
},
{
"cell_type": "code",
"execution_count": 181,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.685159Z",
"start_time": "2020-09-23T08:47:50.677181Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['x', 'y', 'inserted', 'z', 'a', 'b', 'c', 'd', 'e', 'f', 'g'])"
]
},
"execution_count": 181,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.insert(2, 'inserted')\n",
"my_deque"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Remove Elements"
]
},
{
"cell_type": "code",
"execution_count": 182,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.693848Z",
"start_time": "2020-09-23T08:47:50.686157Z"
}
},
"outputs": [],
"source": [
"my_deque = deque(['a', 'b', 'c', 'd'])"
]
},
{
"cell_type": "code",
"execution_count": 183,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.704619Z",
"start_time": "2020-09-23T08:47:50.697128Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"d\n",
"deque(['a', 'b', 'c'])\n"
]
}
],
"source": [
"x = my_deque.pop()\n",
"print(x)\n",
"print(my_deque)"
]
},
{
"cell_type": "code",
"execution_count": 184,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.714592Z",
"start_time": "2020-09-23T08:47:50.706615Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"deque(['b', 'c'])\n"
]
}
],
"source": [
"x = my_deque.popleft()\n",
"print(x)\n",
"print(my_deque)"
]
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.724566Z",
"start_time": "2020-09-23T08:47:50.715590Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['c'])"
]
},
"execution_count": 185,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.remove('b')\n",
"my_deque"
]
},
{
"cell_type": "code",
"execution_count": 186,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.734568Z",
"start_time": "2020-09-23T08:47:50.727559Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque([])"
]
},
"execution_count": 186,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.clear()\n",
"my_deque"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Rotate"
]
},
{
"cell_type": "code",
"execution_count": 187,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.742545Z",
"start_time": "2020-09-23T08:47:50.735562Z"
}
},
"outputs": [],
"source": [
"my_deque = deque(['a', 'b', 'c', 'd'])"
]
},
{
"cell_type": "code",
"execution_count": 188,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.754512Z",
"start_time": "2020-09-23T08:47:50.744539Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"deque(['d', 'a', 'b', 'c'])"
]
},
"execution_count": 188,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"my_deque.rotate(1)\n",
"my_deque"
]
},
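{
"cell_type": "markdown",
"metadata": {},
"source": [
"A negative argument rotates to the left. The sketch below also shows that rotating one step to the right amounts to moving the last element to the front (the name `d` is illustrative):\n",
"\n",
"```python\n",
"from collections import deque\n",
"\n",
"d = deque(['a', 'b', 'c', 'd'])\n",
"d.rotate(-1)                # Negative argument: rotate to the left\n",
"print(d)                    # deque(['b', 'c', 'd', 'a'])\n",
"\n",
"# Rotating one step to the right is equivalent to:\n",
"d.appendleft(d.pop())\n",
"print(d)                    # deque(['a', 'b', 'c', 'd'])\n",
"```"
]
},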
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# `Mapping`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Collection`:\n",
"* `Set`\n",
"* `Sequence`\n",
"* **`Mapping`:**\n",
" * **`dict`**\n",
" * **`OrderedDict`**\n",
" * **`defaultdict`**\n",
" * **`Counter`**\n",
" * **`ChainMap`**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `dict`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `dict` is **mutable** and **not hashable**."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `dict`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Several ways to define the same dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.766479Z",
"start_time": "2020-09-23T08:47:50.757505Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 189,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 190,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.776453Z",
"start_time": "2020-09-23T08:47:50.767477Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 190,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter = dict([\n",
"    ('zero', 'a'),\n",
"    ('one', 'b'),\n",
"    ('two', 'c'),\n",
"    ('three', 'd'),\n",
"])\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 191,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.787423Z",
"start_time": "2020-09-23T08:47:50.778448Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter = dict(zip(\n",
"    ['zero', 'one', 'two', 'three'],\n",
"    ['a', 'b', 'c', 'd'],\n",
"))\n",
"d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"When the keys are valid Python identifiers, you may also use the following syntax:"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.797397Z",
"start_time": "2020-09-23T08:47:50.788421Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter = dict(zero='a', one='b', two='c', three='d')\n",
"d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Dict comprehension:"
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.809366Z",
"start_time": "2020-09-23T08:47:50.799393Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}"
]
},
"execution_count": 193,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_square = {x: x**2 for x in range(10)}\n",
"d_number_square"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use `fromkeys`:"
]
},
{
"cell_type": "code",
"execution_count": 194,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.819338Z",
"start_time": "2020-09-23T08:47:50.811360Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'Alice': 0, 'Bob': 0, 'Cate': 0}"
]
},
"execution_count": 194,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_player_score = dict.fromkeys(['Alice', 'Bob', 'Cate'], 0)\n",
"d_player_score"
]
},
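{
"cell_type": "markdown",
"metadata": {},
"source": [
"A word of caution about `fromkeys`: the same value object is used for every key, which is harmless for an immutable value such as `0` but surprising for a mutable one. A small sketch (the names `d_shared` and `d_separate` are illustrative):\n",
"\n",
"```python\n",
"# Pitfall: all the keys share the same list object.\n",
"d_shared = dict.fromkeys(['Alice', 'Bob'], [])\n",
"d_shared['Alice'].append(10)\n",
"print(d_shared)     # {'Alice': [10], 'Bob': [10]}\n",
"\n",
"# A dict comprehension creates a new list for each key.\n",
"d_separate = {name: [] for name in ['Alice', 'Bob']}\n",
"d_separate['Alice'].append(10)\n",
"print(d_separate)   # {'Alice': [10], 'Bob': []}\n",
"```"
]
},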
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T15:38:20.194015Z",
"start_time": "2020-09-14T15:38:20.190066Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Syntax Inherited from `Collection`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`len` works as expected:"
]
},
{
"cell_type": "code",
"execution_count": 195,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.829314Z",
"start_time": "2020-09-23T08:47:50.824327Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 196,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.840283Z",
"start_time": "2020-09-23T08:47:50.832303Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 196,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(d_number_letter)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"From the point of view of `Collection`, you can see a dict as a collection of keys that are just labelled with values."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As such, `in` / `not in` works on keys:"
]
},
{
"cell_type": "code",
"execution_count": 197,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.849259Z",
"start_time": "2020-09-23T08:47:50.843275Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 198,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.861227Z",
"start_time": "2020-09-23T08:47:50.851253Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 198,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'zero' in d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 199,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.873194Z",
"start_time": "2020-09-23T08:47:50.863221Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 199,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"'a' in d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Similarly, iteration works on keys:"
]
},
{
"cell_type": "code",
"execution_count": 200,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.883167Z",
"start_time": "2020-09-23T08:47:50.876189Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 201,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.893141Z",
"start_time": "2020-09-23T08:47:50.884165Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zero\n",
"one\n",
"two\n",
"three\n"
]
}
],
"source": [
"for k in d_number_letter:\n",
" print(k)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However:\n",
"* Most of the time, you want not only the keys but also the corresponding values, hence you iterate over `d_number_letter.items()`,\n",
"* In the rarer cases where you want to iterate over the keys and are not interested in the values, I personally find it more readable to iterate over `d_number_letter.keys()`.\n",
"\n",
"Cf. below."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Views and Iteration"
]
},
{
"cell_type": "code",
"execution_count": 202,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.904111Z",
"start_time": "2020-09-23T08:47:50.895136Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 203,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.916080Z",
"start_time": "2020-09-23T08:47:50.905109Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dict_keys(['zero', 'one', 'two', 'three'])"
]
},
"execution_count": 203,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.keys()"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.924063Z",
"start_time": "2020-09-23T08:47:50.918074Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dict_values(['a', 'b', 'c', 'd'])"
]
},
"execution_count": 204,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.values()"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.936027Z",
"start_time": "2020-09-23T08:47:50.926053Z"
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"zero: a\n",
"one: b\n",
"two: c\n",
"three: d\n"
]
}
],
"source": [
"for number, letter in d_number_letter.items():\n",
" print('{}: {}'.format(number, letter))"
]
},
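{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch to complement the cells above (the names `d`, `keys` and `other` are only illustrative): views are dynamic, i.e. they reflect later changes to the dict, and a keys view supports set-like operations.\n",
"\n",
"```python\n",
"d = {'zero': 'a', 'one': 'b'}\n",
"keys = d.keys()  # A view on the keys, not a copy\n",
"d['two'] = 'c'  # Modify the dict after creating the view\n",
"keys_after = list(keys)  # The view now also contains 'two'\n",
"other = {'one': 'x', 'ten': 'y'}\n",
"common = d.keys() & other.keys()  # Set-like intersection of the keys\n",
"```\n",
"\n",
"Here `keys_after` should be `['zero', 'one', 'two']` and `common` should be `{'one'}`."
]
},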
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Unpack"
]
},
{
"cell_type": "code",
"execution_count": 206,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.944535Z",
"start_time": "2020-09-23T08:47:50.937573Z"
}
},
"outputs": [],
"source": [
"def f(zero, one, two, three):\n",
" return zero + one + two + three"
]
},
{
"cell_type": "code",
"execution_count": 207,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.954508Z",
"start_time": "2020-09-23T08:47:50.946530Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'abcd'"
]
},
"execution_count": 207,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n",
"f(**d_number_letter) # Note the \"**\" operator"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is equivalent to:"
]
},
{
"cell_type": "code",
"execution_count": 208,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.965478Z",
"start_time": "2020-09-23T08:47:50.956504Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'abcd'"
]
},
"execution_count": 208,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(zero='a', one='b', two='c', three='d')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Get Values"
]
},
{
"cell_type": "code",
"execution_count": 209,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.973046Z",
"start_time": "2020-09-23T08:47:50.968067Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 210,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.984015Z",
"start_time": "2020-09-23T08:47:50.975039Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'a'"
]
},
"execution_count": 210,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter['zero']"
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:50.992994Z",
"start_time": "2020-09-23T08:47:50.985013Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'a'"
]
},
"execution_count": 211,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.get('zero')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Indexing (with brackets) and `get` behave differently when the key is missing:"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.001967Z",
"start_time": "2020-09-23T08:47:50.994987Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.011942Z",
"start_time": "2020-09-23T08:47:51.003962Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"KeyError: 'ten'\n"
]
}
],
"source": [
"try:\n",
" d_number_letter['ten']\n",
"except KeyError as e:\n",
" print('KeyError:', e)"
]
},
{
"cell_type": "code",
"execution_count": 214,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.020916Z",
"start_time": "2020-09-23T08:47:51.012938Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n"
]
}
],
"source": [
"print(d_number_letter.get('ten'))"
]
},
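{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small complement, `get` also accepts an optional second argument that is returned instead of `None` when the key is missing. A minimal sketch, where the fallback `'?'` and the names `letter` and `still_missing` are arbitrary choices for illustration:\n",
"\n",
"```python\n",
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n",
"letter = d_number_letter.get('ten', '?')  # '?' is an arbitrary fallback\n",
"still_missing = 'ten' not in d_number_letter  # get never inserts the key\n",
"```\n",
"\n",
"Here `letter` should be `'?'` and the dictionary should be left unchanged."
]
},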
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The method `get` is especially useful when you pass it as a `key` function, typically to sort the keys by values or to find the argmax / argmin:"
]
},
{
"cell_type": "code",
"execution_count": 215,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.031888Z",
"start_time": "2020-09-23T08:47:51.021914Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 216,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.044853Z",
"start_time": "2020-09-23T08:47:51.033882Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['zero', 'one', 'two', 'three']"
]
},
"execution_count": 216,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(d_number_letter.keys(), key=d_number_letter.get)"
]
},
{
"cell_type": "code",
"execution_count": 217,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.053829Z",
"start_time": "2020-09-23T08:47:51.046848Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"'three'"
]
},
"execution_count": 217,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"max(d_number_letter.keys(), key=d_number_letter.get)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The method `setdefault` is a bit special, in that it modifies the dictionary if the key is not present:"
]
},
{
"cell_type": "code",
"execution_count": 218,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.062809Z",
"start_time": "2020-09-23T08:47:51.054826Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 219,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.074773Z",
"start_time": "2020-09-23T08:47:51.064799Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a\n",
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n"
]
}
],
"source": [
"x = d_number_letter.setdefault('zero', 'default_value')\n",
"print(x)\n",
"print(d_number_letter)"
]
},
{
"cell_type": "code",
"execution_count": 220,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.083748Z",
"start_time": "2020-09-23T08:47:51.076767Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"default_value\n",
"{'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd', 'four': 'default_value'}\n"
]
}
],
"source": [
"x = d_number_letter.setdefault('four', 'default_value')\n",
"print(x)\n",
"print(d_number_letter)"
]
},
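{
"cell_type": "markdown",
"metadata": {},
"source": [
"One common use of `setdefault` is to group values by key with a regular dict. The following is only a sketch with made-up data (`events` and `d_person_activities` are illustrative names); `defaultdict` (cf. below) is often a more readable alternative.\n",
"\n",
"```python\n",
"events = [('Alice', 'work'), ('Bob', 'sleep'), ('Alice', 'play')]\n",
"d_person_activities = {}\n",
"for person, activity in events:\n",
"    # Create the empty list the first time we see `person`, then append to it\n",
"    d_person_activities.setdefault(person, []).append(activity)\n",
"```\n",
"\n",
"After the loop, `d_person_activities` should map `'Alice'` to `['work', 'play']` and `'Bob'` to `['sleep']`."
]
},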
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Add / Update Values"
]
},
{
"cell_type": "code",
"execution_count": 221,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.090733Z",
"start_time": "2020-09-23T08:47:51.084746Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 222,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.103696Z",
"start_time": "2020-09-23T08:47:51.093724Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'alpha', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 222,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter['zero'] = 'alpha' # Modify an existing value\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 223,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.113669Z",
"start_time": "2020-09-23T08:47:51.104693Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'alpha', 'one': 'b', 'two': 'c', 'three': 'd', 'four': 'e'}"
]
},
"execution_count": 223,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter['four'] = 'e' # Add a value\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 224,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.123642Z",
"start_time": "2020-09-23T08:47:51.114665Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'zero': 'alpha',\n",
" 'one': 'beta',\n",
" 'two': 'c',\n",
" 'three': 'd',\n",
" 'four': 'e',\n",
" 'five': 'f'}"
]
},
"execution_count": 224,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.update({'one': 'beta', 'five': 'f'})\n",
"d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Remove Elements"
]
},
{
"cell_type": "code",
"execution_count": 225,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.133616Z",
"start_time": "2020-09-23T08:47:51.128629Z"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 226,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.144586Z",
"start_time": "2020-09-23T08:47:51.134612Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
"execution_count": 226,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"del d_number_letter['zero']\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 227,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.153562Z",
"start_time": "2020-09-23T08:47:51.146580Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b\n",
"{'two': 'c', 'three': 'd'}\n"
]
}
],
"source": [
"x = d_number_letter.pop('one')\n",
"print(x)\n",
"print(d_number_letter)"
]
},
{
"cell_type": "code",
"execution_count": 228,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.162249Z",
"start_time": "2020-09-23T08:47:51.154566Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"default_value\n",
"{'two': 'c', 'three': 'd'}\n"
]
}
],
"source": [
"x = d_number_letter.pop('ten', 'default_value')\n",
"print(x)\n",
"print(d_number_letter)"
]
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.173211Z",
"start_time": "2020-09-23T08:47:51.164236Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}"
]
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.183185Z",
"start_time": "2020-09-23T08:47:51.175208Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"three d\n",
"{'zero': 'a', 'one': 'b', 'two': 'c'}\n"
]
}
],
"source": [
"number, letter = d_number_letter.popitem() # Remove and return a (key, value) pair.\n",
"print(number, letter)\n",
"print(d_number_letter)"
]
},
{
"cell_type": "code",
"execution_count": 231,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.194156Z",
"start_time": "2020-09-23T08:47:51.189169Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{}"
]
},
"execution_count": 231,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.clear()\n",
"d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Remark: `pop` proves very handy when dealing with functions that use `kwargs`. Consider the following function:"
]
},
{
"cell_type": "code",
"execution_count": 232,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.203131Z",
"start_time": "2020-09-23T08:47:51.196151Z"
}
},
"outputs": [],
"source": [
"def my_function(a='default_value_a', b='default_value_b'):\n",
" return a, b"
]
},
{
"cell_type": "code",
"execution_count": 233,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.232057Z",
"start_time": "2020-09-23T08:47:51.204130Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TypeError: my_function() got an unexpected keyword argument 'c'\n"
]
}
],
"source": [
"try:\n",
" my_function(c=12)\n",
"except TypeError as e:\n",
" print('TypeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If for some reason you want or need to work with `kwargs`, you can emulate the same behavior with:"
]
},
{
"cell_type": "code",
"execution_count": 234,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.240038Z",
"start_time": "2020-09-23T08:47:51.233052Z"
}
},
"outputs": [],
"source": [
"def my_function(**kwargs):\n",
" a = kwargs.pop('a', 'default_value_a')\n",
" b = kwargs.pop('b', 'default_value_b')\n",
" if kwargs:\n",
" k, _ = kwargs.popitem()\n",
" raise TypeError(\"my_function() got an unexpected keyword argument '{}'\".format(k))\n",
" return a, b"
]
},
{
"cell_type": "code",
"execution_count": 235,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.252001Z",
"start_time": "2020-09-23T08:47:51.243026Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"TypeError: my_function() got an unexpected keyword argument 'c'\n"
]
}
],
"source": [
"try:\n",
" my_function(c=12)\n",
"except TypeError as e:\n",
" print('TypeError:', e)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Tip: Merge Dictionaries"
]
},
{
"cell_type": "code",
"execution_count": 236,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.260978Z",
"start_time": "2020-09-23T08:47:51.253996Z"
}
},
"outputs": [],
"source": [
"d0 = {'a': 0, 'b': 0, 'c': 0}\n",
"d1 = {'a': 1, 'd': 1}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A first option:"
]
},
{
"cell_type": "code",
"execution_count": 237,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.273943Z",
"start_time": "2020-09-23T08:47:51.262971Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 1, 'b': 0, 'c': 0, 'd': 1}"
]
},
"execution_count": 237,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d = d0.copy()\n",
"d.update(d1)\n",
"d"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A more elegant solution:"
]
},
{
"cell_type": "code",
"execution_count": 238,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.285910Z",
"start_time": "2020-09-23T08:47:51.276934Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{'a': 1, 'b': 0, 'c': 0, 'd': 1}"
]
},
"execution_count": 238,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d = {**d0, **d1}\n",
"d"
]
},
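{
"cell_type": "markdown",
"metadata": {},
"source": [
"With both options, the order of the operands matters: for a common key, the value of the last dictionary wins. A short sketch reusing `d0` and `d1` (the names `d01` and `d10` are only illustrative):\n",
"\n",
"```python\n",
"d0 = {'a': 0, 'b': 0, 'c': 0}\n",
"d1 = {'a': 1, 'd': 1}\n",
"d01 = {**d0, **d1}  # The value of d1 wins for the common key 'a'\n",
"d10 = {**d1, **d0}  # The value of d0 wins for the common key 'a'\n",
"```\n",
"\n",
"Here `d01['a']` should be `1` and `d10['a']` should be `0`, while `d0` and `d1` themselves are left untouched."
]
},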
{
"cell_type": "markdown",
"metadata": {},
"source": [
"N.B.: in a situation where you are tempted to merge dictionaries, consider using `ChainMap` (cf. below)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `OrderedDict`"
]
},
{
"cell_type": "code",
"execution_count": 239,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.294894Z",
"start_time": "2020-09-23T08:47:51.286907Z"
}
},
"outputs": [],
"source": [
"from collections import OrderedDict"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`OrderedDict` is a subclass of `dict` that puts more emphasis on the order of the elements."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### A Bit of History about `dict`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Initially in Python, dicts did not preserve insertion order.\n",
"* Since Python 3.6, in the CPython implementation, they do, but this is considered an \"implementation detail\".\n",
"* Since Python 3.7, it is a guaranteed language feature.\n",
"\n",
"So, in the following, `popitem` always removes the \"key, value\" pairs in the same order, which is LIFO (last in, first out):"
]
},
{
"cell_type": "code",
"execution_count": 240,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.306858Z",
"start_time": "2020-09-23T08:47:51.297893Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('three', 'd')\n",
"('two', 'c')\n",
"('one', 'b')\n",
"('zero', 'a')\n"
]
}
],
"source": [
"d_number_letter = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n",
"while d_number_letter:\n",
" print(d_number_letter.popitem())"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"However, two dicts are considered equal even if their items are not in the same order:"
]
},
{
"cell_type": "code",
"execution_count": 241,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.321814Z",
"start_time": "2020-09-23T08:47:51.309846Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 241,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter_1 = {'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'}\n",
"d_number_letter_2 = {'three': 'd', 'one': 'b', 'zero': 'a', 'two': 'c'}\n",
"d_number_letter_1 == d_number_letter_2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Moreover, dicts do not have methods to change the order, except by removing and re-inserting a \"key, value\" pair."
]
},
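{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this remove-and-re-insert idiom (the names `d`, `order_after_update` and `order_after_reinsert` are only illustrative). Note that simply assigning a new value to an existing key does not move it:\n",
"\n",
"```python\n",
"d = {'zero': 'a', 'one': 'b', 'two': 'c'}\n",
"d['one'] = 'beta'  # Updating an existing key keeps its position\n",
"order_after_update = list(d)\n",
"d['zero'] = d.pop('zero')  # Remove the key, then re-insert it at the end\n",
"order_after_reinsert = list(d)\n",
"```\n",
"\n",
"Here `order_after_update` should be `['zero', 'one', 'two']` and `order_after_reinsert` should be `['one', 'two', 'zero']`."
]
},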
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Specificities of `OrderedDict`"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Now that dicts preserve order, `OrderedDict` has lost some of its appeal.\n",
"\n",
"But it keeps an important distinctive feature: equality tests are order-sensitive. Example:"
]
},
{
"cell_type": "code",
"execution_count": 242,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.334780Z",
"start_time": "2020-09-23T08:47:51.323809Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 242,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter_1 = OrderedDict({'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'})\n",
"d_number_letter_2 = OrderedDict({'three': 'd', 'one': 'b', 'zero': 'a', 'two': 'c'})\n",
"d_number_letter_1 == d_number_letter_2"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Moreover, `popitem` provides an optional argument `last` to control LIFO or FIFO order:"
]
},
{
"cell_type": "code",
"execution_count": 243,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.344827Z",
"start_time": "2020-09-23T08:47:51.336774Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"('zero', 'a')\n",
"('one', 'b')\n",
"('two', 'c')\n",
"('three', 'd')\n"
]
}
],
"source": [
"d_number_letter = OrderedDict({'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'})\n",
"while d_number_letter:\n",
" print(d_number_letter.popitem(last=False)) # We choose FIFO here"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"There is also a method `move_to_end`, with an optional argument `last`:"
]
},
{
"cell_type": "code",
"execution_count": 244,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.354797Z",
"start_time": "2020-09-23T08:47:51.348814Z"
}
},
"outputs": [],
"source": [
"d_number_letter = OrderedDict({'zero': 'a', 'one': 'b', 'two': 'c', 'three': 'd'})"
]
},
{
"cell_type": "code",
"execution_count": 245,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.366767Z",
"start_time": "2020-09-23T08:47:51.355794Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([('one', 'b'), ('two', 'c'), ('three', 'd'), ('zero', 'a')])"
]
},
"execution_count": 245,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.move_to_end('zero')\n",
"d_number_letter"
]
},
{
"cell_type": "code",
"execution_count": 246,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.377737Z",
"start_time": "2020-09-23T08:47:51.368760Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"OrderedDict([('three', 'd'), ('one', 'b'), ('two', 'c'), ('zero', 'a')])"
]
},
"execution_count": 246,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_number_letter.move_to_end('three', last=False)\n",
"d_number_letter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `defaultdict`"
]
},
{
"cell_type": "code",
"execution_count": 247,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.386711Z",
"start_time": "2020-09-23T08:47:51.379733Z"
}
},
"outputs": [],
"source": [
"from collections import defaultdict"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`defaultdict` is a subclass of `dict` that calls a factory function to generate a value when a key is missing."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Naive Example"
]
},
{
"cell_type": "code",
"execution_count": 248,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.398698Z",
"start_time": "2020-09-23T08:47:51.387709Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(<function __main__.constant_0000()>,\n",
" {'Alice': '1234', 'Bob': '6666'})"
]
},
"execution_count": 248,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def constant_0000():\n",
" return '0000'\n",
"d_user_pin = defaultdict(constant_0000) # Equivalent to defaultdict(lambda: '0000')\n",
"d_user_pin['Alice'] = '1234'\n",
"d_user_pin['Bob'] = '6666'\n",
"d_user_pin"
]
},
{
"cell_type": "code",
"execution_count": 249,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.407656Z",
"start_time": "2020-09-23T08:47:51.400675Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0000\n",
"defaultdict(<function constant_0000 at 0x00000222BFC2B950>, {'Alice': '1234', 'Bob': '6666', 'Cate': '0000'})\n"
]
}
],
"source": [
"print(d_user_pin['Cate'])\n",
"print(d_user_pin)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Caution: `get` has the same behavior as with a regular dictionary. When the key is missing, it returns `None` and does not modify the dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 250,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.420621Z",
"start_time": "2020-09-23T08:47:51.408654Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"None\n",
"defaultdict(<function constant_0000 at 0x00000222BFC2B950>, {'Alice': '1234', 'Bob': '6666', 'Cate': '0000'})\n"
]
}
],
"source": [
"print(d_user_pin.get('Dave'))\n",
"print(d_user_pin)"
]
},
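{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same holds for membership tests: only indexing with brackets triggers the factory. A minimal sketch, with illustrative names (`has_dave`, `pin`, `n_users`) and an explicit fallback passed to `get`:\n",
"\n",
"```python\n",
"from collections import defaultdict\n",
"\n",
"d_user_pin = defaultdict(lambda: '0000')\n",
"d_user_pin['Alice'] = '1234'\n",
"has_dave = 'Dave' in d_user_pin  # Membership test: nothing is inserted\n",
"pin = d_user_pin.get('Dave', '0000')  # Explicit fallback, no insertion either\n",
"n_users = len(d_user_pin)  # Only 'Alice' is stored\n",
"```\n",
"\n",
"Here `has_dave` should be `False`, `pin` should be `'0000'` and `n_users` should be `1`."
]
},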
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example with `set`"
]
},
{
"cell_type": "code",
"execution_count": 251,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.428602Z",
"start_time": "2020-09-23T08:47:51.421618Z"
}
},
"outputs": [],
"source": [
"events = [('Alice', 'work'), ('Bob', 'sleep'), ('Alice', 'play'), ('Alice', 'work'), ('Bob', 'work')] "
]
},
{
"cell_type": "code",
"execution_count": 252,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.439575Z",
"start_time": "2020-09-23T08:47:51.431592Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(set, {'Alice': {'play', 'work'}, 'Bob': {'sleep', 'work'}})"
]
},
"execution_count": 252,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_person_activities = defaultdict(set)\n",
"for person, activity in events:\n",
" d_person_activities[person].add(activity)\n",
"d_person_activities"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example with `list`"
]
},
{
"cell_type": "code",
"execution_count": 253,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.448547Z",
"start_time": "2020-09-23T08:47:51.441566Z"
}
},
"outputs": [],
"source": [
"events = [('Alice', 'work'), ('Bob', 'sleep'), ('Alice', 'play'), ('Alice', 'work'), ('Bob', 'work')]"
]
},
{
"cell_type": "code",
"execution_count": 254,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.458521Z",
"start_time": "2020-09-23T08:47:51.449544Z"
},
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(list,\n",
" {'Alice': ['work', 'play', 'work'], 'Bob': ['sleep', 'work']})"
]
},
"execution_count": 254,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_person_activities = defaultdict(list) # Equivalent to defaultdict(lambda: [])\n",
"for person, activity in events:\n",
" d_person_activities[person].append(activity)\n",
"d_person_activities"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example with Nested Dictionaries"
]
},
{
"cell_type": "code",
"execution_count": 255,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.469490Z",
"start_time": "2020-09-23T08:47:51.459518Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(dict, {'Alice': {'age': 32, 'pin': '1234'}, 'Bob': {'age': 27}})"
]
},
"execution_count": 255,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_person_feature_value = defaultdict(dict)\n",
"d_person_feature_value['Alice']['age'] = 32\n",
"d_person_feature_value['Alice']['pin'] = '1234'\n",
"d_person_feature_value['Bob']['age'] = 27\n",
"d_person_feature_value"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example with `int`"
]
},
{
"cell_type": "code",
"execution_count": 256,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.480461Z",
"start_time": "2020-09-23T08:47:51.470488Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"defaultdict(int, {'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 256,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = 'mississipi'\n",
"d_letter_count = defaultdict(int)\n",
"for letter in s:\n",
" d_letter_count[letter] += 1\n",
"d_letter_count"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remark: for this last application, consider using the class `Counter` (cf. below)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `Counter`"
]
},
{
"cell_type": "code",
"execution_count": 257,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.488440Z",
"start_time": "2020-09-23T08:47:51.481459Z"
}
},
"outputs": [],
"source": [
"from collections import Counter"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Counter` is a subclass of `dict`, specialized in counting elements.\n",
"\n",
"Elements are stored as dictionary keys and their counts are stored as dictionary values."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Create a `Counter`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Usually, you initialize a `Counter` by counting the elements of an iterable:"
]
},
{
"cell_type": "code",
"execution_count": 258,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.499410Z",
"start_time": "2020-09-23T08:47:51.489438Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 258,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"d_letter_count"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other ways of initializing a dictionary still work:"
]
},
{
"cell_type": "code",
"execution_count": 259,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.509384Z",
"start_time": "2020-09-23T08:47:51.500408Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 259,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 260,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.519357Z",
"start_time": "2020-09-23T08:47:51.511379Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 260,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter(m=1, i=4, s=4, p=1)\n",
"d_letter_count"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Missing Elements"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `Counter` essentially works like a `dict`, except for missing elements, whose count is 0 instead of raising a `KeyError`:"
]
},
{
"cell_type": "code",
"execution_count": 261,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.531325Z",
"start_time": "2020-09-23T08:47:51.522348Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Counter({'i': 4, 's': 4, 'm': 1, 'p': 1})\n",
"0\n"
]
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"print(d_letter_count)\n",
"print(d_letter_count['a'])"
]
},
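{
"cell_type": "markdown",
"metadata": {},
"source": [
"Unlike a `defaultdict`, looking up a missing element in a `Counter` does not insert it. A minimal sketch (the names `count_a` and `has_a` are only illustrative):\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"d_letter_count = Counter('mississipi')\n",
"count_a = d_letter_count['a']  # Count of a missing element, no KeyError\n",
"has_a = 'a' in d_letter_count  # The lookup above did not insert 'a'\n",
"```\n",
"\n",
"Here `count_a` should be `0` and `has_a` should be `False`."
]
},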
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Add Counts"
]
},
{
"cell_type": "code",
"execution_count": 262,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.541298Z",
"start_time": "2020-09-23T08:47:51.533319Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 262,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 263,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.551272Z",
"start_time": "2020-09-23T08:47:51.542295Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1, 'a': 1, 'b': 1, 'c': 1})"
]
},
"execution_count": 263,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count.update('abc') # Update with an iterable\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 264,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.563460Z",
"start_time": "2020-09-23T08:47:51.552269Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 2, 'i': 5, 's': 4, 'p': 1, 'a': 1, 'b': 1, 'c': 1})"
]
},
"execution_count": 264,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count.update({'m': 1, 'i': 1}) # Update with a mapping\n",
"d_letter_count"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-19T12:43:45.060918Z",
"start_time": "2020-09-19T12:43:45.055933Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Subtract Counts"
]
},
{
"cell_type": "code",
"execution_count": 265,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.572435Z",
"start_time": "2020-09-23T08:47:51.564457Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 265,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 266,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.582409Z",
"start_time": "2020-09-23T08:47:51.573433Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 0, 'i': 3, 's': 2, 'p': 1})"
]
},
"execution_count": 266,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count.subtract('miss') # Subtract with an iterable\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 267,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.594377Z",
"start_time": "2020-09-23T08:47:51.583405Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': -1, 'i': 1, 's': 2, 'p': 1})"
]
},
"execution_count": 267,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count.subtract({'m': 1, 'i': 2}) # Subtract with a mapping\n",
"d_letter_count"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remark: a count can be negative, as in the last example above."
]
},
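{
"cell_type": "markdown",
"metadata": {},
"source": [
"If zero and negative counts are not wanted, one possible way to drop them is the unary `+` operator, which builds a new `Counter` keeping only the strictly positive counts. The sketch below uses illustrative names (`counts`, `positive`) and subtracts a hypothetical mapping:\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"counts = Counter('mississipi')\n",
"counts.subtract({'m': 2, 'i': 4})\n",
"print(counts['m'], counts['i'])  # -1 0\n",
"\n",
"positive = +counts  # new Counter keeping only counts > 0\n",
"print(sorted(positive.items()))  # [('p', 1), ('s', 4)]\n",
"```"
]
},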
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Iterate Over Elements"
]
},
{
"cell_type": "code",
"execution_count": 268,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.603353Z",
"start_time": "2020-09-23T08:47:51.595374Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 268,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 269,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.612329Z",
"start_time": "2020-09-23T08:47:51.604350Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"m\n",
"i\n",
"i\n",
"i\n",
"i\n",
"s\n",
"s\n",
"s\n",
"s\n",
"p\n"
]
}
],
"source": [
"for letter in d_letter_count.elements():\n",
"    print(letter)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remark: this ignores elements whose count is zero or negative."
]
},
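{
"cell_type": "markdown",
"metadata": {},
"source": [
"A small check of the remark above, on a hypothetical counter that contains a zero and a negative count: `elements()` skips every key whose count is less than one.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"counts = Counter(a=2, b=0, c=-1)\n",
"print(list(counts.elements()))  # ['a', 'a']\n",
"```"
]
},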
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Most Common Elements"
]
},
{
"cell_type": "code",
"execution_count": 270,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.621363Z",
"start_time": "2020-09-23T08:47:51.613326Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'m': 1, 'i': 4, 's': 4, 'p': 1})"
]
},
"execution_count": 270,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count = Counter('mississipi')\n",
"d_letter_count"
]
},
{
"cell_type": "code",
"execution_count": 271,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.632275Z",
"start_time": "2020-09-23T08:47:51.623300Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[('i', 4), ('s', 4), ('m', 1)]"
]
},
"execution_count": 271,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d_letter_count.most_common(3)"
]
},
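{
"cell_type": "markdown",
"metadata": {},
"source": [
"Two related idioms, sketched here on the same word: calling `most_common()` without an argument returns all the elements, and slicing that list from the end gives the least common ones. With Python 3.7 or later, elements with equal counts should appear in the order in which they were first encountered.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"counts = Counter('mississipi')\n",
"print(counts.most_common())          # [('i', 4), ('s', 4), ('m', 1), ('p', 1)]\n",
"print(counts.most_common()[:-3:-1])  # [('p', 1), ('m', 1)]\n",
"```"
]
},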
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Use Counters to Represent Multisets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following operations are intended for Counters whose values are all positive, i.e. Counters that represent multisets. Their results never contain zero or negative counts."
]
},
{
"cell_type": "code",
"execution_count": 272,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.640254Z",
"start_time": "2020-09-23T08:47:51.633272Z"
}
},
"outputs": [],
"source": [
"c = Counter(a=3, b=1)\n",
"d = Counter(a=1, b=2)"
]
},
{
"cell_type": "code",
"execution_count": 273,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.650227Z",
"start_time": "2020-09-23T08:47:51.641252Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 4, 'b': 3})"
]
},
"execution_count": 273,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c + d # add two counters together: c[x] + d[x]"
]
},
{
"cell_type": "code",
"execution_count": 274,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.659204Z",
"start_time": "2020-09-23T08:47:51.651224Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 2})"
]
},
"execution_count": 274,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c - d # subtract (keeping only positive counts)"
]
},
{
"cell_type": "code",
"execution_count": 275,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.668805Z",
"start_time": "2020-09-23T08:47:51.660200Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 1, 'b': 1})"
]
},
"execution_count": 275,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c & d # intersection: min(c[x], d[x])"
]
},
{
"cell_type": "code",
"execution_count": 276,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.680147Z",
"start_time": "2020-09-23T08:47:51.670175Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"Counter({'a': 3, 'b': 2})"
]
},
"execution_count": 276,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"c | d # union: max(c[x], d[x])"
]
},
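{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the contrast with the `subtract` method explicit, here is a short sketch (the name `e` is only illustrative): the `-` operator returns a new `Counter` without zero or negative counts, whereas `subtract` works in place and keeps them.\n",
"\n",
"```python\n",
"from collections import Counter\n",
"\n",
"c = Counter(a=3, b=1)\n",
"d = Counter(a=1, b=2)\n",
"\n",
"print(c - d)  # Counter({'a': 2})\n",
"\n",
"e = Counter(c)  # copy, so that c is left unchanged\n",
"e.subtract(d)   # in place, keeps zero and negative counts\n",
"print(sorted(e.items()))  # [('a', 2), ('b', -1)]\n",
"```"
]
},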
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## `ChainMap`"
]
},
{
"cell_type": "code",
"execution_count": 277,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.688126Z",
"start_time": "2020-09-23T08:47:51.681144Z"
}
},
"outputs": [],
"source": [
"from collections import ChainMap"
]
},
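{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ChainMap` is a `MutableMapping` (it is not a subclass of `dict`). It groups several mappings into a single view, with an order of priority: lookups search the mappings from first to last, while writes and deletions only affect the first mapping."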
]
},
{
"cell_type": "code",
"execution_count": 278,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.700094Z",
"start_time": "2020-09-23T08:47:51.689124Z"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ChainMap({'a': 2, 'd': 2}, {'a': 1, 'c': 1}, {'a': 0, 'b': 0})"
]
},
"execution_count": 278,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"builtins = {'a': 0, 'b': 0}\n",
"global_variables = {'a': 1, 'c': 1}\n",
"local_variables = {'a': 2, 'd': 2}\n",
"variables = ChainMap(local_variables, global_variables, builtins)\n",
"variables"
]
},
{
"cell_type": "code",
"execution_count": 279,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.709070Z",
"start_time": "2020-09-23T08:47:51.703087Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a: 2\n",
"b: 0\n",
"c: 1\n",
"d: 2\n"
]
}
],
"source": [
"for key in ['a', 'b', 'c', 'd']:\n",
"    print('{}: {}'.format(key, variables[key]))"
]
},
{
"cell_type": "code",
"execution_count": 280,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.720040Z",
"start_time": "2020-09-23T08:47:51.710067Z"
},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ChainMap({'a': 3, 'd': 2}, {'a': 1, 'c': 1}, {'a': 0, 'b': 0})"
]
},
"execution_count": 280,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"variables['a'] = 3\n",
"variables"
]
},
{
"cell_type": "code",
"execution_count": 281,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.730014Z",
"start_time": "2020-09-23T08:47:51.721038Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"ChainMap({'a': 3, 'd': 2}, {'a': 42, 'c': 1}, {'a': 0, 'b': 0})"
]
},
"execution_count": 281,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"global_variables['a'] = 42\n",
"variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See the documentation for more details about the other methods of `ChainMap`."
]
},
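{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a brief sketch of some of these other features (the name `inner` is only illustrative): `new_child` adds a new first mapping, `parents` gives a view that skips the first mapping, and `maps` exposes the underlying list of mappings. Writes only touch the first mapping.\n",
"\n",
"```python\n",
"from collections import ChainMap\n",
"\n",
"builtins = {'a': 0, 'b': 0}\n",
"global_variables = {'a': 1, 'c': 1}\n",
"variables = ChainMap(global_variables, builtins)\n",
"\n",
"inner = variables.new_child({'a': 2})  # push a new first mapping\n",
"print(inner['a'])          # 2\n",
"print(inner.parents['a'])  # 1\n",
"inner['b'] = 5             # writes only touch the first mapping\n",
"print(inner.maps[0])       # {'a': 2, 'b': 5}\n",
"print(builtins['b'])       # 0\n",
"```"
]
},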
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# P.S.: Pitfalls of Mutable Objects"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-14T07:47:50.666686Z",
"start_time": "2020-09-14T07:47:50.662695Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Copy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You probably do **not** want to do this (assignment does not copy the list, it only gives a second name to the same object):"
]
},
{
"cell_type": "code",
"execution_count": 282,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.739987Z",
"start_time": "2020-09-23T08:47:51.731011Z"
},
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['a', 'b', 'c', 'd', 'e']\n",
"['a', 'b', 'c', 'd', 'e']\n"
]
}
],
"source": [
"my_list = ['a', 'b', 'c', 'd']\n",
"copied_list = my_list\n",
"my_list.append('e')\n",
"print(my_list)\n",
"print(copied_list)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... but rather this:"
]
},
{
"cell_type": "code",
"execution_count": 283,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T09:42:26.400725Z",
"start_time": "2020-09-23T09:42:26.396769Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"['a', 'b', 'c', 'd', 'e']\n",
"['a', 'b', 'c', 'd']\n"
]
}
],
"source": [
"my_list = ['a', 'b', 'c', 'd']\n",
"# copied_list = my_list.copy() # First option: make an actual copy\n",
"# copied_list = list(my_list) # Second option, same result\n",
"# copied_list = [x for x in my_list] # Third option, same result\n",
"copied_list = my_list[:] # Fourth option, same result (the one used here)\n",
"my_list.append('e')\n",
"print(my_list)\n",
"print(copied_list)"
]
},
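{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that all the options above make a *shallow* copy: the new list is distinct, but its elements are the same objects. When the elements are themselves mutable, `copy.deepcopy` from the standard library is one way to duplicate them as well. A small sketch with a hypothetical nested list:\n",
"\n",
"```python\n",
"import copy\n",
"\n",
"nested = [['a', 'b'], ['c']]\n",
"shallow = nested.copy()       # new outer list, same inner lists\n",
"deep = copy.deepcopy(nested)  # inner lists are copied too\n",
"\n",
"nested[0].append('x')\n",
"print(shallow[0])  # ['a', 'b', 'x']\n",
"print(deep[0])     # ['a', 'b']\n",
"```"
]
},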
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Use a Mutable as Default Argument"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a classic mistake in Python:"
]
},
{
"cell_type": "code",
"execution_count": 284,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.756942Z",
"start_time": "2020-09-23T08:47:51.749960Z"
}
},
"outputs": [],
"source": [
"def append_42(lst=[]):  # NEVER do this!\n",
"    lst.append(42)\n",
"    return lst"
]
},
{
"cell_type": "code",
"execution_count": 285,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.767564Z",
"start_time": "2020-09-23T08:47:51.757939Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 42]"
]
},
"execution_count": 285,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42([1, 2, 3])"
]
},
{
"cell_type": "code",
"execution_count": 286,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.777526Z",
"start_time": "2020-09-23T08:47:51.768550Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[42]"
]
},
"execution_count": 286,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42()"
]
},
{
"cell_type": "code",
"execution_count": 287,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.788497Z",
"start_time": "2020-09-23T08:47:51.779521Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[42, 42]"
]
},
"execution_count": 287,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42()"
]
},
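{
"cell_type": "markdown",
"metadata": {},
"source": [
"The reason is that a default value is evaluated only once, when the function is defined, so every call without an argument reuses the same list. A minimal sketch that makes this visible (the names `first` and `second` are only illustrative):\n",
"\n",
"```python\n",
"def append_42(lst=[]):  # the pitfall, reproduced on purpose\n",
"    lst.append(42)\n",
"    return lst\n",
"\n",
"first = append_42()\n",
"second = append_42()\n",
"print(first is second)         # True: both calls returned the same list\n",
"print(append_42.__defaults__)  # ([42, 42],)\n",
"```"
]
},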
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Do this instead:"
]
},
{
"cell_type": "code",
"execution_count": 288,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.797473Z",
"start_time": "2020-09-23T08:47:51.790492Z"
}
},
"outputs": [],
"source": [
"def append_42(lst=None):\n",
"    if lst is None:\n",
"        lst = []\n",
"    lst.append(42)\n",
"    return lst"
]
},
{
"cell_type": "code",
"execution_count": 289,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.807446Z",
"start_time": "2020-09-23T08:47:51.800466Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 42]"
]
},
"execution_count": 289,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42([1, 2, 3])"
]
},
{
"cell_type": "code",
"execution_count": 290,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.817420Z",
"start_time": "2020-09-23T08:47:51.809441Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[42]"
]
},
"execution_count": 290,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42()"
]
},
{
"cell_type": "code",
"execution_count": 291,
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-23T08:47:51.825398Z",
"start_time": "2020-09-23T08:47:51.818417Z"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[42]"
]
},
"execution_count": 291,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"append_42()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-15T22:33:00.599142Z",
"start_time": "2020-09-15T22:33:00.593185Z"
},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"We have covered the following kinds of `Collection`:\n",
"* `Set`: `set`, `frozenset`,\n",
"* `Sequence`: `tuple`, `namedtuple`, `list`, `deque` (and `str`, `bytes`),\n",
"* `Mapping`: `dict`, `OrderedDict`, `defaultdict`, `Counter`, `ChainMap`."
]
},
{
"cell_type": "markdown",
"metadata": {
"ExecuteTime": {
"end_time": "2020-09-15T22:33:10.482606Z",
"start_time": "2020-09-15T22:33:10.476619Z"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"<font size=\"+4\"><b>That's all folks!</b></font>\n",
"\n",
"Thanks for your attention!"
]
}
],
"metadata": {
"celltoolbar": "Diaporama",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "348.469px"
},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}