@mhermans
Created February 6, 2013 00:11
Notebook of Alexis Metaireau Python Tricks presentation
{
"metadata": {
"name": "Python Tricks"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Notebook of Python tricks, as presented by Alexis Metaireau"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Alexis Metaireau](https://twitter.com/ametaireau) presented \"[Astonishing Python Tricks](https://fosdem.org/2013/schedule/event/astonish_python_tricks/)\" ([slides](http://alexis.notmyidea.org/pythontricks/)) at [FOSDEM](https://fosdem.org/2013/) this weekend. To get these standard library tricks into my fingers, I tried them out in an IPython notebook. You can download the notebook and play around with the examples interactively."
]
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Strings"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Whitespace partitioning 1: `partition()` searches for a separator (a space in this case) and returns a 3-tuple: the part before the separator, the separator itself, and the part after it."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hackers gonna hack'.partition(' ')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 12,
"text": [
"('hackers', ' ', 'gonna hack')"
]
}
],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Whitespace partitioning 2: `split(' ', 1)` splits on the first space only and returns the pieces as a list; drop the second argument to split on every space."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hackers gonna hack'.split(' ', 1)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 13,
"text": [
"['hackers', 'gonna hack']"
]
}
],
"prompt_number": 13
},
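{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch (not from the slides) of how the two differ when the separator is missing: `partition()` always returns a 3-tuple, padding with empty strings, while `split()` returns a one-element list. That makes the result of `partition()` safe to unpack directly."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"head, sep, tail = 'hackers'.partition(' ')  # no space found: sep and tail are empty strings\n",
"pieces = 'hackers'.split(' ', 1)            # no space found: a one-element list\n",
"(head, sep, tail), pieces                   # (('hackers', '', ''), ['hackers'])"
],
"language": "python",
"metadata": {},
"outputs": []
},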
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Test if string starts (`startswith()`) or ends (`endswith()`) with a certain string."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hackers gonna hack'.startswith('hackers')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 16,
"text": [
"True"
]
}
],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hackers gonna hack'.endswith('hack')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 15,
"text": [
"True"
]
}
],
"prompt_number": 15
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Matching can be against a tuple of possible strings instead of a single string."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"'hackers gonna hack'.startswith(('hackers', 'haters'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 22,
"text": [
"True"
]
}
],
"prompt_number": 22
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Strings & memory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We fetch the same text twice, and `id()` shows that it is stored twice, in different memory locations."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import urllib2\n",
"c1 = urllib2.urlopen('http://blog.notmyidea.org').read()\n",
"c2 = urllib2.urlopen('http://blog.notmyidea.org').read()\n",
"print id(c1) \n",
"print id(c2) "
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"38643104\n",
"38611888\n"
]
}
],
"prompt_number": 29
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `intern()` function returns a canonical instance of a string. This is [useful](http://stackoverflow.com/questions/1136826/what-does-python-intern-do-and-when-should-it-be-used) for saving memory when you have many equal string instances; in addition, canonicalized strings can be compared by identity instead of equality, which is faster.\n",
"\n",
"Requesting the `id()` for the canonical representations of the strings shows that they are identical."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"id(intern(c1)) == id(intern(c2))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 31,
"text": [
"True"
]
}
],
"prompt_number": 31
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Working with sets"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import string\n",
"vowels = set('aeiouy')\n",
"all_letters = set(string.ascii_lowercase)\n",
"vowels"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 39,
"text": [
"set(['a', 'e', 'i', 'o', 'u', 'y'])"
]
}
],
"prompt_number": 39
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"consonants = all_letters - vowels\n",
"consonants"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 40,
"text": [
"set(['c',\n",
" 'b',\n",
" 'd',\n",
" 'g',\n",
" 'f',\n",
" 'h',\n",
" 'k',\n",
" 'j',\n",
" 'm',\n",
" 'l',\n",
" 'n',\n",
" 'q',\n",
" 'p',\n",
" 's',\n",
" 'r',\n",
" 't',\n",
" 'w',\n",
" 'v',\n",
" 'x',\n",
" 'z'])"
]
}
],
"prompt_number": 40
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Enumerate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`enumerate()` over an iterable provides both the values and a counter to go with them. This saves you from initializing and incrementing your own counter."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"list(enumerate('this'))"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 41,
"text": [
"[(0, 't'), (1, 'h'), (2, 'i'), (3, 's')]"
]
}
],
"prompt_number": 41
},
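{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch (not from the slides) contrasting a hand-maintained counter with `enumerate()`. The optional `start` argument, available since Python 2.6, sets the first counter value; the variable names are illustrative."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"words = ['hackers', 'gonna', 'hack']\n",
"\n",
"# manual counter: initialize, use, increment\n",
"manual = []\n",
"i = 1\n",
"for word in words:\n",
"    manual.append((i, word))\n",
"    i += 1\n",
"\n",
"# enumerate does the bookkeeping; start=1 begins counting at 1\n",
"manual == list(enumerate(words, start=1))  # True"
],
"language": "python",
"metadata": {},
"outputs": []
},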
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Dictionary comprehension (Since 2.7)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Dictionary comprehension follows the pattern of the more widely used list comprehension. This can be used for e.g. dictionary construction:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"{string.ascii_lowercase[v]: k for k,v in enumerate(range(10))}"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 42,
"text": [
"{'a': 0,\n",
" 'b': 1,\n",
" 'c': 2,\n",
" 'd': 3,\n",
" 'e': 4,\n",
" 'f': 5,\n",
" 'g': 6,\n",
" 'h': 7,\n",
" 'i': 8,\n",
" 'j': 9}"
]
}
],
"prompt_number": 42
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As with list comprehensions, a condition at the end of the comprehension filters the items that end up in the dict."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"{string.ascii_lowercase[v]: k for k,v in enumerate(range(10)) if v%2 == 0} # only even numbers"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 50,
"text": [
"{'a': 0, 'c': 2, 'e': 4, 'g': 6, 'i': 8}"
]
}
],
"prompt_number": 50
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Slices"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(10)[::2] # iterate from start (0) to end (9) in steps of 2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 51,
"text": [
"[0, 2, 4, 6, 8]"
]
}
],
"prompt_number": 51
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(10)[1::2] # iterate from 1 to end (9) in steps of 2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 53,
"text": [
"[1, 3, 5, 7, 9]"
]
}
],
"prompt_number": 53
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"range(10)[::-1] # iterate over all items, and in steps of 1, but backwards (\"steps of -1\")"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 55,
"text": [
"[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]"
]
}
],
"prompt_number": 55
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"even = slice(None, None, 2) # construct a slice-object in advance, equal to \"[::2]\"\n",
"range(10)[even]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 57,
"text": [
"[0, 2, 4, 6, 8]"
]
}
],
"prompt_number": 57
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Named tuples"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Named tuples provide simple, lightweight objects with predefined fields that can be used instead of instances of a custom class. When you have e.g. a lot of small, static objects, named tuples are more memory efficient than comparable class instances."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from collections import namedtuple\n",
"Beer = namedtuple('Beer', ('name', 'type', 'level'))\n",
"karmeliet = Beer('Tripel Karmeliet', 'high fermentation', 0.084)\n",
"orval = Beer('Orval', 'high fermentation', 0.062)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 65
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"karmeliet.level > orval.level # Karmeliet has a higher alcohol percentage compared to Orval?"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 70,
"text": [
"True"
]
}
],
"prompt_number": 70
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"[beer.name for beer in [orval, karmeliet]] # get beer names"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 74,
"text": [
"['Orval', 'Tripel Karmeliet']"
]
}
],
"prompt_number": 74
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Counters"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Counter` class makes it easy to count elements in lists, tuples, and other iterables of hashable items."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from collections import Counter\n",
"sandwich = ('spam', 'bacon', 'spam', 'spam', 'egg', 'bacon', 'spam')\n",
"Counter(sandwich) # summarize in a Counter-object"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 77,
"text": [
"Counter({'spam': 4, 'bacon': 2, 'egg': 1})"
]
}
],
"prompt_number": 77
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"Counter(sandwich).most_common(2) # get 2 most common elements (and their frequency)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 79,
"text": [
"[('spam', 4), ('bacon', 2)]"
]
}
],
"prompt_number": 79
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Default dicts"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `defaultdict` is a subclass of the regular `dict` that takes as its first argument a function, which is called whenever a non-existing key is accessed. This solves the problem of non-existing keys when e.g. building a dict of lists:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"regular_dict = {}\n",
"regular_dict['sandwich'].append('spam')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "KeyError",
"evalue": "'sandwich'",
"output_type": "pyerr",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-86-633fbd5220c5>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[0mregular_dict\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m{\u001b[0m\u001b[1;33m}\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mregular_dict\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'sandwich'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'spam'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;31mKeyError\u001b[0m: 'sandwich'"
]
}
],
"prompt_number": 86
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We initialize `defaultdict` with `list` as an argument. This means that when a non-existing key is accessed, a new list object is created, stored under that key, and returned (instead of raising a `KeyError`)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from collections import defaultdict\n",
"listdict = defaultdict(list)\n",
"listdict['cheese']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 107,
"text": [
"[]"
]
}
],
"prompt_number": 107
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This allows us to append/extend lists in a dictionary, without first initializing all the possible keys."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"listdict['sandwich'].append('spam')\n",
"listdict['sandwich']"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 108,
"text": [
"['spam']"
]
}
],
"prompt_number": 108
},
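{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch (not from the slides) of the dict-of-lists use case: grouping words by their first letter. The word list and the name `by_letter` are made up for illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from collections import defaultdict\n",
"\n",
"by_letter = defaultdict(list)\n",
"for word in ('spam', 'bacon', 'sausage', 'egg', 'beans'):\n",
"    by_letter[word[0]].append(word)  # a missing key starts out as an empty list\n",
"\n",
"sorted(by_letter.items())  # [('b', ['bacon', 'beans']), ('e', ['egg']), ('s', ['spam', 'sausage'])]"
],
"language": "python",
"metadata": {},
"outputs": []
},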
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Iterators"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can construct an iterator object using the syntax `iter(callable, sentinel)`. This iterator calls the callable repeatedly and yields each return value, stopping as soon as the sentinel value is returned. I have not been able to find an applied example; the examples in the slides and online seem better served by built-in iterator support (i.e. file-like objects)."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# example adapted from the slides; StringIO stands in for a real file (my assumption)\n",
"from functools import partial\n",
"from StringIO import StringIO\n",
"fileobj, BLOCKSIZE = StringIO('hackers gonna hack'), 8\n",
"for data in iter(partial(fileobj.read, BLOCKSIZE), ''):\n",
"    print repr(data)  # do something with each block"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"'hackers '\n",
"'gonna ha'\n",
"'ck'\n"
]
}
],
"prompt_number": 1
},
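{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible applied use, as a sketch rather than something from the slides: draining a queue of work items until a stop marker is reached. The deque contents and the `'STOP'` marker are made-up illustration values."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from collections import deque\n",
"\n",
"tasks = deque(['fetch', 'parse', 'store', 'STOP', 'never reached'])\n",
"\n",
"# tasks.popleft is called repeatedly; iteration ends when it returns 'STOP'\n",
"done = list(iter(tasks.popleft, 'STOP'))\n",
"done, list(tasks)  # (['fetch', 'parse', 'store'], ['never reached'])"
],
"language": "python",
"metadata": {},
"outputs": []
},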
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Functools partial (since 2.5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [`partial` function](http://docs.python.org/2/library/functools.html#functools.partial) returns a `partial` object that behaves like the callable passed in as its first argument, but with some arguments already set. This results in callable objects that follow the same logic, but require fewer arguments."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import functools\n",
"def add(a, b):\n",
"    return a + b\n",
"add2 = functools.partial(add, 2) # partial function that always adds 2\n",
"add2(4)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 118,
"text": [
"6"
]
}
],
"prompt_number": 118
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Functools ordering"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Functools [contains](http://docs.python.org/2/library/functools.html#functools.total_ordering) a `@total_ordering` decorator, which provides the full set of comparison operators for a class if you define the `__eq__()` method and one of the `__lt__()`, `__le__()`, `__gt__()`, or `__ge__()` methods."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class Presenter:\n",
"    def __init__(self, name):\n",
"        self.name = name\n",
"    def __eq__(self, other):\n",
"        return self.name == other.name\n",
"    def __lt__(self, other):\n",
"        return self.name < other.name\n",
" \n",
"am = Presenter('Alexis Metaireau')\n",
"kr = Presenter('Kenneth Reitz')\n",
"am.__eq__(am), am.__gt__(kr)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "AttributeError",
"evalue": "Presenter instance has no attribute '__gt__'",
"output_type": "pyerr",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[1;31mAttributeError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-139-d5485c12012a>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[0mam\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mPresenter\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'Alexis Metaireau'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 10\u001b[0m \u001b[0mkr\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mPresenter\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'Kenneth Reitz'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 11\u001b[1;33m \u001b[0mam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m__eq__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mam\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m__gt__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkr\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;31mAttributeError\u001b[0m: Presenter instance has no attribute '__gt__'"
]
}
],
"prompt_number": 139
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from functools import total_ordering\n",
"\n",
"@total_ordering\n",
"class Presenter:\n",
"    def __init__(self, name):\n",
"        self.name = name\n",
"    def __eq__(self, other):\n",
"        return self.name == other.name\n",
"    def __lt__(self, other):\n",
"        return self.name < other.name\n",
" \n",
"am = Presenter('Alexis Metaireau')\n",
"kr = Presenter('Kenneth Reitz')\n",
"am.__eq__(am), am.__gt__(kr)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 140,
"text": [
"(True, False)"
]
}
],
"prompt_number": 140
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Contextlib (or how to make good use of \"with\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Contextlib provides a [contextmanager](http://docs.python.org/2/library/contextlib.html#contextlib.contextmanager) decorator that makes it easier to define a factory function for `with` statements."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from contextlib import contextmanager\n",
"from datetime import datetime\n",
"\n",
"@contextmanager\n",
"def timeit(message):\n",
"    t1 = datetime.now()\n",
"    yield\n",
"    print \"%s in %s\" % (message, datetime.now() - t1)\n",
"\n",
"with timeit(\"zipped ranges together\"):\n",
"    zip(range(100000), range(10000)[::-1])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"zipped ranges together in 0:00:00.004413\n"
]
}
],
"prompt_number": 145
},
{
"cell_type": "heading",
"level": 2,
"metadata": {},
"source": [
"Decorators"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example of decorators was included in the presentation, but skipped due to time constraints. I'm not entirely sure what the example is illustrating..."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class classproperty(object):\n",
"    def __init__(self, getter):\n",
"        self.getter = getter\n",
"    def __get__(self, instance, owner):\n",
"        return self.getter(owner)\n",
"\n",
"class MyClass(object):\n",
"    @classproperty\n",
"    def tagname(cls):\n",
"        return \"data-%s\" % cls.__name__\n",
" \n",
"c = MyClass()\n",
"c.tagname"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "pyout",
"prompt_number": 150,
"text": [
"'data-MyClass'"
]
}
],
"prompt_number": 150
},
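{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to read this example (an interpretation, not something stated in the slides): `classproperty` is a descriptor, and because `__get__` passes the owner class to the getter, the attribute can be read from the class itself and picks up the name of each subclass. The sketch below repeats the definitions so it runs on its own; `SubClass` is a hypothetical name added for illustration."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"class classproperty(object):\n",
"    \"\"\"Descriptor that calls the getter with the owner class.\"\"\"\n",
"    def __init__(self, getter):\n",
"        self.getter = getter\n",
"    def __get__(self, instance, owner):\n",
"        return self.getter(owner)\n",
"\n",
"class MyClass(object):\n",
"    @classproperty\n",
"    def tagname(cls):\n",
"        return \"data-%s\" % cls.__name__\n",
"\n",
"class SubClass(MyClass):  # hypothetical subclass for illustration\n",
"    pass\n",
"\n",
"MyClass.tagname, SubClass.tagname  # no instance needed: ('data-MyClass', 'data-SubClass')"
],
"language": "python",
"metadata": {},
"outputs": []
},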
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I skipped a few examples that are Python 3 specific, e.g. [tuple unpacking](http://alexis.notmyidea.org/pythontricks/?full#iterators-py3k) and [function annotation](http://alexis.notmyidea.org/pythontricks/?full#functions-annotation)."
]
}
],
"metadata": {}
}
]
}