Skip to content

Instantly share code, notes, and snippets.

@mpkocher
Created February 27, 2019 04:34
Show Gist options
  • Save mpkocher/88f01a6237df44a8a01b51fb58cbb544 to your computer and use it in GitHub Desktop.
Save mpkocher/88f01a6237df44a8a01b51fb58cbb544 to your computer and use it in GitHub Desktop.
Functional Programming Techniques in Python: Part 4
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Functional Programming Techniques in Python: Part 4\n",
"\n",
"This is the last section of Function Programming Techniques (FPT) in Python. \n",
"\n",
"In this section we'll cover suggestions for adoption and migration to FPT as well as cover a few 'gotchas' when using closures. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import datetime\n",
"import operator as op"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Today is 2019-02-26 20:31:43.291057\n"
]
}
],
"source": [
"print(\"Today is {}\".format(datetime.datetime.now()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# General Suggestions for Migrating to FPT\n",
"\n",
"- Don't start forcing a FPT heavy design on existing repos or packages. Specifically for cases where you might be changing the public API of core libs of your application or stack. For example, migrating to heavily use [toolz](https://toolz.readthedocs.io/en/latest/streaming-analytics.html) might create too many impedence mismatches (style wise) in your software stack.\n",
"- Hold brown bag sessions during lunch (or similar) to help your team/organization understand or aren't fluent with the FPT patterns approach.\n",
"- Expanding your knowledge to FPT will help you understand and extend other libs, such as `click`. Having a solid understanding of FPT will provide you with one more tool in help you express yourself in code.\n",
"- Find specific examples in your application or code base where you're having composiblity issues with OO patterns and investigate using a FPT design. Be sure to discuss composibilty improvements as well as (hopefully) reduced cognative overhead of the code base.\n",
"- In FPT you're often going to have a lot of small functions, **consistent naming conventions** are extremely important. Perhaps even more so than in OO design. ([Obligatory naming is hard discussion](https://hilton.org.uk/blog/why-naming-things-is-hard))\n",
"- If you're using Python 3. Try to consistently leverage type declarations when passing around functions as arguments to other functions. \n",
"\n",
"*As with any pattern, it's essential to judiciously use the a pattern and not treat the pattern as a hammer to every problem you run across.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's a few misc comments or items before we close out the FPT series."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A Few Misc Comments"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scope Gotchas\n",
"\n",
"In Python 2.x, I found Python's sense of scope to be a bit odd at times. Python 3 added `nonlocal` via PEP-3104. The detailed discussion is here and is an interesting read.\n",
"\n",
"https://www.python.org/dev/peps/pep-3104/\n",
"\n",
"To demonstrate potential scope *gotchas*, here an example in Python 3 using a simple caching mechanism, slightly inspired from Python's [lru_cache](https://docs.python.org/3/library/functools.html#functools.lru_cache)\n",
"\n",
"### Example #1"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def compute(f=op.add):\n",
" \n",
" cache = {}\n",
"\n",
" def f2(m, n):\n",
" key = (m, n)\n",
" if key in cache:\n",
" value = cache[key]\n",
" print(\"Loading from cache {}={}\".format(key, value))\n",
" return value\n",
" else:\n",
" v = f(m, n)\n",
" cache[key] = v\n",
" return v\n",
" return f2"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"f = compute()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(1, 2)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loading from cache (1, 2)=3\n",
"Loading from cache (1, 2)=3\n"
]
}
],
"source": [
"for _ in range(2):\n",
" f(1, 2)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(1, 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example #2\n",
"\n",
"Let's modify the previous example by only keeping track of how many times the function is called using the same pattern.\n",
"\n",
"This will not work as expected and will raise an `UnboundLocalError`. "
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"def compute(f=op.add):\n",
" \n",
" num_times_called = 0\n",
" \n",
" def get_num_times():\n",
" \"\"\"Return the cache size\"\"\"\n",
" return num_times_called\n",
"\n",
" def f2(m, n):\n",
" # In scoping rules in Python num_times_called is never defined\n",
" num_times_called += 1\n",
" return f(m, n)\n",
"\n",
"\n",
" f2.get_num_times = get_num_times\n",
" return f2"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"f = compute()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"ename": "UnboundLocalError",
"evalue": "local variable 'num_times_called' referenced before assignment",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mUnboundLocalError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-10-a13c71fe0dfa>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<ipython-input-8-64eb60550c47>\u001b[0m in \u001b[0;36mf2\u001b[0;34m(m, n)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mf2\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mm\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0;31m# In scoping rules in Python num_times_called is never defined\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mnum_times_called\u001b[0m \u001b[0;34m+=\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 12\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mm\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 13\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mUnboundLocalError\u001b[0m: local variable 'num_times_called' referenced before assignment"
]
}
],
"source": [
"f(2, 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can solve this in Python 3 by using `nonlocal`. \n",
"\n",
"https://docs.python.org/3/reference/simple_stmts.html#nonlocal"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"def compute(f=op.add):\n",
" \n",
" num_times_called = 0\n",
" \n",
" def get_num_times():\n",
" \"\"\"Return the cache size\"\"\"\n",
" return num_times_called\n",
"\n",
" def f2(m, n):\n",
" nonlocal num_times_called\n",
" \n",
" num_times_called += 1\n",
" return f(m, n)\n",
"\n",
" f2.get_num_times = get_num_times\n",
" return f2"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f = compute()\n",
"f(1, 2)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(2, 3)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f.get_num_times()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A workaround/hack for Python 2 is to use a container object, such as a list."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"calling func=3\n",
"total times called=1\n",
"calling func=3\n",
"total times called=2\n",
"calling func=3\n",
"total times called=3\n",
"calling func=3\n",
"total times called=4\n",
"calling func=3\n",
"total times called=5\n"
]
}
],
"source": [
"def two_seven(n):\n",
" # DONT do this in Python 3. Use nonlocal.\n",
" total = [0]\n",
" \n",
" def f(m):\n",
" total[0] += 1\n",
" return n + m\n",
" \n",
" def get_total():\n",
" return total[0]\n",
" \n",
" # see comments below about use of this pattern\n",
" f.get_total = get_total\n",
" return f\n",
"\n",
"def two_seven_example():\n",
" f27 = two_seven(2)\n",
" def fx():\n",
" return f27(1)\n",
" for _ in range(5):\n",
" for name, func in ((\"calling func\", fx), (\"total times called\", f27.get_total)):\n",
" print(\"{}={}\".format(name, func()))\n",
"\n",
"two_seven_example()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Abusing Mutation within a Closure\n",
"\n",
"It's easy to get carried away with a flat 'no' design approach. This really is not a great idea. It's very difficult to reason and debug. "
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"def bad_example(m):\n",
" \n",
" alpha = 1\n",
" beta = 2\n",
" gamma = 3.0\n",
" \n",
" def to_gamma(n):\n",
" nonlocal gamma \n",
" gamma = (alpha + 1) * (beta + 2) + (n + m)\n",
" \n",
" def to_alpha(n):\n",
" nonlocal alpha \n",
" alpha = beta * (n + m)\n",
" \n",
" def to_beta(n):\n",
" nonlocal beta \n",
" beta = alpha * 3.14 * m\n",
" \n",
" def f(n):\n",
" to_gamma(n)\n",
" to_alpha(n)\n",
" to_beta(n)\n",
" return \"alpha={:.2f}, beta={:.2f}, gamma={:.2f}\".format(alpha, beta, gamma)\n",
" \n",
" return f"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"f = bad_example(7)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'alpha=18.00, beta=395.64, gamma=17.00'"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(2)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'alpha=3956.40, beta=86961.67, gamma=7565.16'"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"f(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is analogous to a hyper mutable class with too many mutable instance variables. This is marginally different than using global variables and this should be avoided if possible. It's often better to split up the problem into smaller pieces and potentially migrate to an OO pattern.\n",
"\n",
"A general guideline when designing with closures is it's often useful if the **closed over value is immutable or is explicitly a mutable variable used as a cache.**\n",
"\n",
"(Also, the closure based approach is arguable worse because it's not easy to test the inner functions)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Python Decorators\n",
"\n",
"This is perhaps an omission from Part 1 of FPT. However, there are several terrific resources for this specific topic.\n",
"\n",
"https://realpython.com/primer-on-python-decorators/\n",
"\n",
"If you're not using Python decorators (specifically in designing your core libs or frameworks), you should probably invest a few cycles in understanding this core feature of Python. Decorators are also complementary to many of the FPT listed in Part 1.\n",
"\n",
"Here's a small hello-world timer example. Please see [realpython.com](https://realpython.com/primer-on-python-decorators/) for more examples.\n",
"\n",
"Here's real world examples from [production code used by Pacific Biosciences](https://github.com/mpkocher/pbcommand/blob/master/pbcommand/pb_io/tool_contract_io.py#L195)."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"def custom_timer(f):\n",
" def g(*args, **kw):\n",
" started_at = datetime.datetime.now()\n",
" result = f(*args, **kw)\n",
" completed_at = datetime.datetime.now()\n",
" dt = completed_at - started_at \n",
" msg = \"Ran func {} at {} in {:.2f} sec with args={} kw={}\".format(f.__name__, started_at, dt.total_seconds(), args, kw)\n",
" print(msg)\n",
" return result\n",
" return g\n",
"\n",
"@custom_timer \n",
"def adder(a, b):\n",
" time.sleep(0.25)\n",
" return a + b\n",
"\n",
"def run_example():\n",
" nums = zip(range(1, 5), range(7, 11))\n",
" for a, b in nums:\n",
" out = adder(a, b)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ran func adder at 2019-02-26 20:32:08.086756 in 0.25 sec with args=(1, 7) kw={}\n",
"Ran func adder at 2019-02-26 20:32:08.337394 in 0.25 sec with args=(2, 8) kw={}\n",
"Ran func adder at 2019-02-26 20:32:08.591756 in 0.25 sec with args=(3, 9) kw={}\n",
"Ran func adder at 2019-02-26 20:32:08.846650 in 0.25 sec with args=(4, 10) kw={}\n"
]
}
],
"source": [
"run_example()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Summary and Final Comments\n",
"\n",
"- Make sure your team is fluent with leveraging\n",
"- Judicously apply FPT patterns when desiging your program or application\n",
"- Leverage `nonlocal` to avoid mutable scope issues\n",
"- Avoid too much mutation in inner functions. \n",
"- Leverage decorators\n",
"\n",
"If you're a data scientist, backend dev or even an OO-wizard, I hope you've picked up some tricks to better express your ideas in code using the **Functional Progamming Techniques (FPT) in Python series**. \n",
"\n",
"Best to you and your Python-ing!"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Today is 2019-02-26 20:32:10.684615\n"
]
}
],
"source": [
"print(\"Today is {}\".format(datetime.datetime.now()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment