@ChimeraCoder
Created November 16, 2013 21:33
Notebook from github.com/ChimeraCoder/intro-to-numpy-and-scipy. Details and license are in the main repository. Linked here for http://nbviewer.ipython.org/
{
"metadata": {
"name": ""
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why NumPy?\n",
"--------------------\n",
"\n",
"\n",
"#### Python can be slow:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import division, print_function\n",
"import itertools"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"lst1 = range(1000000)\n",
"lst2 = lst1[::-1] #Reverse the list"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit [l1 + l2 for l1, l2 in itertools.izip(lst1, lst2)]"
],
"language": "python",
"metadata": {},
"outputs": []
},
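{
"cell_type": "code",
"collapsed": false,
"input": [
"#Approximate cycles per addition: (measured seconds) * (clock rate in Hz) / (number of additions)\n",
"#Assumes a %timeit result of about 109 ms and a 2.4 GHz CPU\n",
"((109e-3)*(2.4e9))/1000000"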
],
"language": "python",
"metadata": {},
"outputs": []
},
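{
"cell_type": "markdown",
"metadata": {},
"source": [
"The estimate above can be reproduced without `%timeit` by using the standard-library `timeit` module. This is a minimal sketch: the 2.4 GHz clock rate is the same assumption as in the previous cell, `cycles_per_element` is a hypothetical helper rather than a library function, and measured times will differ from machine to machine."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import timeit\n",
"\n",
"CLOCK_HZ = 2.4e9   # assumed clock rate, as in the estimate above\n",
"N = 1000000        # number of element-wise additions\n",
"\n",
"def cycles_per_element(seconds, n=N, clock_hz=CLOCK_HZ):\n",
"    # Convert the wall-clock time for n additions into approximate cycles per addition\n",
"    return seconds * clock_hz / n\n",
"\n",
"xs = list(range(N))\n",
"ys = xs[::-1]\n",
"# Best of three single runs of the pure-Python element-wise addition\n",
"best = min(timeit.repeat(lambda: [x + y for x, y in zip(xs, ys)], number=1, repeat=3))\n",
"print(cycles_per_element(best))\n",
"print(cycles_per_element(109e-3))   # roughly 262 cycles for the 109 ms figure"
],
"language": "python",
"metadata": {},
"outputs": []
},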
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why is Python so slow?\n",
"------------------------"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from IPython.core.display import Image\n",
"Image(filename=\"python_memory_model.png\")"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Can we do any better?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"arr1 = np.array(lst1)\n",
"arr2 = np.array(lst2)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit arr1+arr2"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#Same estimate for NumPy, assuming a %timeit result of about 1.97 ms and a 2.4 GHz CPU\n",
"((1.97e-3)*(2.4e9))/1000000"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"Image(filename=\"numpy_memory_model.png\")\n",
"\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### But there are some tradeoffs."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.array([(2<<30)-1], dtype=np.int32) #2**31 - 1, the largest value an int32 can hold"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a+1"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Uh-oh."
]
},
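{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result wraps around because `int32` has a fixed width of 32 bits. A minimal sketch, with illustrative variable names, of inspecting the limits with `np.iinfo` and avoiding the overflow by widening the dtype before adding:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"import numpy as np\n",
"\n",
"info = np.iinfo(np.int32)\n",
"print(info.min, info.max)   # -2147483648 2147483647\n",
"\n",
"big = np.array([info.max], dtype=np.int32)\n",
"# Widen to 64 bits first so that the addition has room for the result\n",
"wide = big.astype(np.int64) + 1\n",
"print(wide)                 # [2147483648]"
],
"language": "python",
"metadata": {},
"outputs": []
},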
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.array([-.5])"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a/0"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a**.5"
],
"language": "python",
"metadata": {},
"outputs": []
},
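{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default NumPy reports these floating-point problems as warnings and returns `inf` or `nan`. A sketch, assuming the default error settings and using illustrative names, of asking NumPy to raise an exception instead via the `np.errstate` context manager:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"import numpy as np\n",
"\n",
"vals = np.array([-.5])\n",
"\n",
"# Division by zero and the square root of a negative number, as in the cells above\n",
"for op in (lambda v: v / 0, lambda v: v ** .5):\n",
"    try:\n",
"        with np.errstate(divide='raise', invalid='raise'):\n",
"            op(vals)\n",
"    except FloatingPointError as exc:\n",
"        print('caught:', exc)"
],
"language": "python",
"metadata": {},
"outputs": []
},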
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Performance is not a free lunch either"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"temperatures_f = np.array([i for i in xrange(32,212)])"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"temperatures_c = (temperatures_f -32)*5/9.0"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Each of the three arithmetic operations above (subtraction, multiplication, division) creates a temporary array"
]
},
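{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to avoid those temporaries is to reuse a single buffer through the `out=` argument of NumPy's ufuncs. A minimal sketch; `temps_f` and `temps_c` are illustrative names, and the input is created as a float array so that the division can be written back into the same buffer:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"temps_f = np.arange(32, 212, dtype=float)\n",
"\n",
"temps_c = np.subtract(temps_f, 32)      # allocates the only output buffer\n",
"np.multiply(temps_c, 5, out=temps_c)    # reuses that buffer\n",
"np.divide(temps_c, 9.0, out=temps_c)    # reuses that buffer\n",
"\n",
"# Same values as the one-line expression, which allocates temporaries\n",
"assert np.allclose(temps_c, (temps_f - 32) * 5 / 9.0)"
],
"language": "python",
"metadata": {},
"outputs": []
},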
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NumPy offers a number of convenient ways to create arrays"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.arange(0, 20, 2, dtype=None)\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"np.empty((4,5), dtype=float, order=None)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"np.zeros((4,5), dtype=float, order=None)\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"np.ones((4,5), dtype=float, order=None)\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": true,
"input": [
"np.asarray([[i for i in xrange(20)], [j for j in xrange(10)]], dtype=None)\n",
"#These would fail because \"collection\" and \"iterable\" are not defined\n",
"#np.array(collection, dtype=None, copy=True, order=None)\n",
"#np.fromiter(iterable, dtype, count=-1)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.arange(12)\n",
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = a.reshape(3,4)\n",
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"(a*10).reshape(2,6)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a * [2,4,6,8] #The 4-vector will be broadcast"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a * [2,3,4] #This raises an error: a 3-vector cannot be broadcast against shape (3,4) because the trailing dimensions (4 and 3) do not match"
],
"language": "python",
"metadata": {},
"outputs": []
},
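{
"cell_type": "markdown",
"metadata": {},
"source": [
"Broadcasting compares shapes starting from the trailing axis, so a length-3 vector does line up with the rows of a `(3, 4)` array once it is given a trailing axis of length 1. A sketch with illustrative names:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"grid = np.arange(12).reshape(3, 4)\n",
"row_scale = np.array([2, 3, 4])\n",
"\n",
"# Shape (3, 1) broadcasts against (3, 4): one factor per row\n",
"scaled = grid * row_scale[:, np.newaxis]\n",
"print(scaled.shape)   # (3, 4)\n",
"print(scaled[2])      # [32 36 40 44]"
],
"language": "python",
"metadata": {},
"outputs": []
},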
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### We can get views of the data by slicing (basic indexing)\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.arange(12)\n",
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = a[::2]\n",
"b"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b[2] = -1\n",
"b"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b.flags['OWNDATA']"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b.base is a"
],
"language": "python",
"metadata": {},
"outputs": []
},
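{
"cell_type": "markdown",
"metadata": {},
"source": [
"Besides checking `OWNDATA` and `base`, `np.shares_memory` reports directly whether two arrays overlap in memory. A sketch with illustrative names, which also shows that an explicit `.copy()` breaks the link to the original:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"source = np.arange(12)\n",
"view = source[::2]             # basic slicing returns a view\n",
"copied = source[::2].copy()    # an explicit copy has its own buffer\n",
"\n",
"print(np.shares_memory(source, view))     # True\n",
"print(np.shares_memory(source, copied))   # False\n",
"\n",
"copied[0] = 99\n",
"print(source[0])                          # 0, the original is untouched"
],
"language": "python",
"metadata": {},
"outputs": []
},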
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### We can index by a list of ints, and get an array of those items"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.arange(10)*10\n",
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = a[[4,3,-2]]\n",
"b"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b.flags[\"OWNDATA\"] #Note that this gives us a copy of the data"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### We can index by an array of boolean values as well"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = a.reshape((5,2))"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[(a%3)==0].shape"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = ((a%3)==0)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a[b]"
],
"language": "python",
"metadata": {},
"outputs": []
},
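{
"cell_type": "markdown",
"metadata": {},
"source": [
"A boolean mask selects the matching elements as a flat, one-dimensional array, and the same mask can be used on the left-hand side of an assignment. A sketch with illustrative names that rebuilds the array used above:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import numpy as np\n",
"\n",
"vals = (np.arange(10) * 10).reshape((5, 2))\n",
"mask = (vals % 3) == 0      # boolean array with the same shape as vals\n",
"\n",
"print(vals[mask])           # [ 0 30 60 90]\n",
"\n",
"# Assigning through the mask modifies vals in place\n",
"vals[mask] = -1\n",
"print(vals)"
],
"language": "python",
"metadata": {},
"outputs": []
},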
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Let's dive a bit deeper into the memory layout"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a = np.arange(3000000)\n",
"a.shape = (5,3,200000)\n",
"a"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b = a.swapaxes(1,2) #Swap the last two axes\n",
"print(b.shape)\n",
"print(a.shape)\n",
"print(b.flags[\"OWNDATA\"])"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### When changing the shape, it helps to remember how arrays are laid out in memory"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.shape = (5,600000) "
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.shape = (5,3,200000) #Reset the shape of a, before we reshape a different way\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.shape = (1000000,3)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"print(b.shape) \n",
"b.shape = (1000000,3)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### If we take a look at the flags of the arrays, we can see why this error occurred"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a.flags"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"b.flags"
],
"language": "python",
"metadata": {},
"outputs": []
},
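{
"cell_type": "markdown",
"metadata": {},
"source": [
"The flags show that `b` is no longer C-contiguous after `swapaxes`, so its axes cannot be merged just by rewriting the shape. A sketch on a much smaller array, with illustrative names: assigning to `.shape` never copies and therefore fails, whereas `reshape` falls back to copying the data."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"import numpy as np\n",
"\n",
"small = np.arange(30).reshape(5, 3, 2)\n",
"swapped = small.swapaxes(1, 2)           # a view with permuted strides\n",
"\n",
"print(small.strides, swapped.strides)\n",
"print(swapped.flags['C_CONTIGUOUS'])     # False\n",
"\n",
"try:\n",
"    swapped.shape = (10, 3)              # in-place reshape, no copy allowed\n",
"except AttributeError as exc:\n",
"    print('cannot reshape in place:', exc)\n",
"\n",
"reshaped = swapped.reshape(10, 3)        # reshape copies when it has to\n",
"print(np.shares_memory(reshaped, small)) # False"
],
"language": "python",
"metadata": {},
"outputs": []
},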
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### We can also do some fun graphing with matplotlib"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import random\n",
"%pylab inline --no-import-all"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pylab.ion()\n",
"pylab.figure()\n",
"pylab.plot([random.gauss(10, 3) for i in xrange(30)], 'g')\n",
"pylab.ioff()"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### SymPy lets us do symbolic manipulation"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import symbols, limit, log, integrate, Integral, sqrt"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"x, y = symbols('x y')\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"limit(x*log(x), x, 0)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"limit(x*log(x), x, 20)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"integrate(x/(x**2+2*x+1), x)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import latex, init_printing"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"integrate(x/(x**2+2*x+1), x)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### That works, but it's a bit ugly. Can we do any better?"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"init_printing()"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"integrate(x/(x**2+2*x+1), x)"
],
"language": "python",
"metadata": {},
"outputs": []
},
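{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to gain confidence in a symbolic antiderivative is to differentiate it and compare the result with the integrand. A minimal sketch with illustrative names:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import symbols, integrate, diff, simplify\n",
"\n",
"x = symbols('x')\n",
"integrand = x / (x**2 + 2*x + 1)\n",
"antiderivative = integrate(integrand, x)\n",
"\n",
"# Differentiating the antiderivative should give back the integrand\n",
"print(simplify(diff(antiderivative, x) - integrand))   # 0"
],
"language": "python",
"metadata": {},
"outputs": []
},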
{
"cell_type": "code",
"collapsed": false,
"input": [
"Integral(sqrt(1/x), x)"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### We can solve equations symbolically"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import solve, Eq\n",
"solve(Eq(x**2, 1), x)"
],
"language": "python",
"metadata": {},
"outputs": []
},
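{
"cell_type": "markdown",
"metadata": {},
"source": [
"The roots returned by `solve` can be checked by substituting them back into the equation. A minimal sketch with illustrative names:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import symbols, Eq, solve\n",
"\n",
"x = symbols('x')\n",
"equation = Eq(x**2, 1)\n",
"roots = solve(equation, x)\n",
"print(roots)                                     # [-1, 1]\n",
"\n",
"# Substituting each root back should satisfy the equation\n",
"print(all(equation.subs(x, r) for r in roots))   # True"
],
"language": "python",
"metadata": {},
"outputs": []
},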
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### DiffEqs? No sweat!"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import dsolve, Function, sin\n",
"f, g = symbols('f g', cls=Function)\n",
"diffeq = Eq(f(x).diff(x, x) - 2*f(x).diff(x) + f(x), sin(x))\n",
"diffeq"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dsolve(diffeq, f(x))"
],
"language": "python",
"metadata": {},
"outputs": []
},
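{
"cell_type": "markdown",
"metadata": {},
"source": [
"SymPy also provides `checkodesol`, which substitutes a candidate solution back into the differential equation. A sketch that rebuilds the equation above under illustrative names and verifies the result of `dsolve`:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import Function, Eq, dsolve, sin, symbols, checkodesol\n",
"\n",
"x = symbols('x')\n",
"f = Function('f')\n",
"ode = Eq(f(x).diff(x, x) - 2*f(x).diff(x) + f(x), sin(x))\n",
"solution = dsolve(ode, f(x))\n",
"\n",
"# checkodesol returns (True, 0) when the solution satisfies the equation\n",
"print(checkodesol(ode, solution))"
],
"language": "python",
"metadata": {},
"outputs": []
},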
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sympy import Matrix\n",
"M = Matrix(( [1, 2, 1], [6, -1, 0], [-1, -2, -1] ))\n",
"M.eigenvals()\n"
],
"language": "python",
"metadata": {},
"outputs": []
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"M.eigenvects()"
],
"language": "python",
"metadata": {},
"outputs": []
},
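{
"cell_type": "markdown",
"metadata": {},
"source": [
"`eigenvects` returns a list of `(eigenvalue, multiplicity, eigenvectors)` tuples. A minimal sketch, using the illustrative name `A` for the same matrix, that checks the defining property of each eigenvector:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from __future__ import print_function\n",
"from sympy import Matrix\n",
"\n",
"A = Matrix(([1, 2, 1], [6, -1, 0], [-1, -2, -1]))\n",
"for eigenvalue, multiplicity, vectors in A.eigenvects():\n",
"    for v in vectors:\n",
"        # Each eigenvector should satisfy A*v == eigenvalue*v\n",
"        assert A * v == eigenvalue * v\n",
"    print(eigenvalue, multiplicity)"
],
"language": "python",
"metadata": {},
"outputs": []
},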
{
"cell_type": "code",
"collapsed": true,
"input": [
"M = Matrix(( [1, 2, 3], [3, 6, 2], [2, 0, 1] ))\n",
"M.eigenvals()"
],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}