Last active
December 16, 2015 21:10
An IPython Notebook as an Interactive Parallel Computing Tutorial
{ | |
"metadata": { | |
"name": "Interactive IPython Parallel Computing Tutorial" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Interactive IPython Parallel Computing Tutorial\n",
"============================\n",
"Introduction\n",
"-----------\n",
"This tutorial shows some basic examples of how to use the powerful parallel computing functionality of [IPython](http://ipython.org/).\n",
"\n",
"The tutorial itself is written as an IPython Notebook; what you are reading is a static HTML representation of it. To execute this notebook on your computer, download it and drag and drop the file onto the Notebook Dashboard. You can find a \"Download Notebook\" link in the upper right part of this page.\n",
"\n",
"Other interesting resources:\n",
" \n",
"- [Running Code in the IPython Notebook](http://nbviewer.ipython.org/urls/github.com/ipython/ipython/raw/master/examples/notebooks/Part%25201%2520-%2520Running%2520Code.ipynb)\n",
"- [Official IPython Documentation](http://ipython.org/documentation.html)\n",
"\n",
"\n",
"**DISCLAIMER**: Part of this tutorial is shamelessly copied from the [Official IPython Parallel Computing Documentation](http://ipython.org/ipython-doc/stable/parallel/index.html).\n",
"\n",
"Installation Requirements\n",
"-----------\n",
"To run this tutorial you have to install a recent version of [IPython](http://ipython.org/). Some commands also require Numpy.\n",
"\n",
"However, installing a complete scientific Python environment is recommended.\n",
"\n",
"On Windows, my favorite scientific Python distribution is [WinPython](http://code.google.com/p/winpython/): it has 64-bit support, includes a wonderful IDE called [Spyder](http://code.google.com/p/spyderlib/), and the installation folder can be moved anywhere.\n",
"\n", | |
"After installing WinPython, to launch the IPython Notebook click on **WinPython Command Prompt** and type:\n", | |
"\n", | |
" ipython notebook --pylab inline\n", | |
"\n", | |
"At this point a web browser should automagically open showing the **IPython Notebook Dashboard**." | |
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Starting the cluster\n",
"------------------\n",
"\n",
"For the purposes of this tutorial you can start a cluster on your local machine. Just go to the \"Notebook Dashboard\" tab in your browser, click on the *Cluster* tab and specify the number of \"parallel python sessions\" (called **ipengines**) to start. A good choice is the number of cores of your machine. After clicking **Start** your local cluster should be running.\n",
"\n",
"**NOTE:** To set up a more complex cluster you can follow the official IPython documentation [here](http://ipython.org/ipython-doc/stable/parallel/parallel_process.html). \n",
"\n",
"Once the cluster is started (whether locally or on remote machines), the rest of this tutorial can be followed and re-executed."
] | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Starting a parallel session\n",
"--------------------\n",
"\n",
"Once the cluster is started we can open a new IPython notebook and run these commands to connect to the running engines:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from IPython.parallel import Client" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 1 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"rc = Client()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 2 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `rc` variable is a `Client` object connected to all the running engines. Its `.ids` attribute lists the ID associated with each engine. If the list is empty, no engine is running. In our case:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"rc.ids" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 3, | |
"text": [ | |
"[0, 1]" | |
] | |
} | |
], | |
"prompt_number": 3 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use the engines we first have to select them. The selection is done through Python indexing or slicing. For example, to select all the running engines just do:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview = rc[:]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 4 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now `dview` contains a DirectView object that can be used to send/receive code and data back and forth between our session and the engines."
] | |
}, | |
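{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the indexing and slicing mentioned above (assuming the `rc` client from the previous cells and at least two running engines), a single engine or a subset of engines can be selected in the same way. The names `e0` and `evens` are only illustrative:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"e0 = rc[0]         # DirectView on engine 0 only\n",
"evens = rc[::2]    # DirectView on every second engine\n",
"print(e0.targets)\n",
"print(evens.targets)"
],
"language": "python",
"metadata": {},
"outputs": []
},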
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Execute code on the engines\n",
"---------------------------\n",
"Our engines are basically multiple IPython processes running in parallel. To run a command on all our engines we can use the **`%px`** magic command. Let's see some examples:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px print 'ciao'" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:0] ciao\n", | |
"[stdout:1] ciao\n" | |
] | |
} | |
], | |
"prompt_number": 5 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px import os\n", | |
"%px print os.getpid()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:0] 3375\n", | |
"[stdout:1] 3376\n" | |
] | |
} | |
], | |
"prompt_number": 6 | |
}, | |
{
"cell_type": "code",
"collapsed": false,
"input": [
"%px from numpy.random import rand\n",
"%px a = rand(5)\n", | |
"%px print a" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:0] [ 0.15690218 0.56782216 0.92297292 0.19870273 0.39490221]\n", | |
"[stdout:1] [ 0.06350091 0.94723982 0.21775028 0.2323376 0.19959411]\n" | |
] | |
} | |
], | |
"prompt_number": 7 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**NOTE:** Under the hood the **%px** command uses the method [`dview.execute()`](http://ipython.org/ipython-doc/stable/api/generated/IPython.parallel.client.view.html?highlight=view.execute#IPython.parallel.client.view.DirectView.execute) to run the command. This method returns an [AsyncResult object](http://ipython.org/ipython-doc/stable/parallel/asyncresult.html) that is used to see the output. The magic **%px** conveniently shows the output right away."
] | |
}, | |
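{
"cell_type": "markdown",
"metadata": {},
"source": [
"The call that `%px` wraps can also be made directly. The following sketch assumes the `dview` object created above; the engine-side variable name `pid` is only illustrative:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"ar = dview.execute('import os; pid = os.getpid()')  # returns an AsyncResult\n",
"ar.get()               # block until every engine has finished\n",
"print(dview['pid'])    # one process ID per engine"
],
"language": "python",
"metadata": {},
"outputs": []
},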
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Transferring data: Push/Pull\n", | |
"-----------------------------\n", | |
"\n", | |
"With the last command we created a variable **a** on each engine. To transfer it to our local session, we **pull** it using the **`dview`** object:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview['a']" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 8, | |
"text": [ | |
"[array([ 0.15690218, 0.56782216, 0.92297292, 0.19870273, 0.39490221]),\n", | |
" array([ 0.06350091, 0.94723982, 0.21775028, 0.2323376 , 0.19959411])]" | |
] | |
} | |
], | |
"prompt_number": 8 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Basically, the Python dictionary syntax is used on the `dview` object to **pull** data from the engines. We see that the command returns a list with one element per engine, each being the requested object (in this case a Numpy array).\n",
"\n", | |
"Similarly, in order to **push** data to the remote engines we can use the dictionary assignment syntax:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview['b'] = 3" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 9 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"**NOTE:** The same functionality can be obtained through the methods `dview.push()` and `dview.pull()`. See [here](http://ipython.org/ipython-doc/stable/parallel/parallel_multiengine.html#moving-python-objects-around) for more details." | |
] | |
}, | |
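{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of the equivalent method calls, assuming the `dview` object from above. The variable names `c` and `d` are illustrative, and `block=True` is passed so that each call waits for completion:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dview.push(dict(c=1.5, d='hello'), block=True)   # send several names at once\n",
"print(dview.pull('c', block=True))               # one value per engine\n",
"print(dview.pull(('c', 'd'), block=True))        # several names per engine"
],
"language": "python",
"metadata": {},
"outputs": []
},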
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Push/Pull Numpy arrays\n",
"\n",
"\n",
"When moving Numpy arrays we must be aware that the data at the destination is always read-only. To modify the array we must make a copy.\n",
"\n",
"See [Details of Parallel Computing with IPython](http://ipython.org/ipython-doc/stable/parallel/parallel_details.html) for more information."
] | |
}, | |
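{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of the copy step, assuming the array `a` created on the engines above is still defined. Whether the `writeable` flag is actually `False` may depend on the IPython version and on how the array was transferred, so the flag is printed rather than assumed. The names `a0` and `a0_copy` are illustrative:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"a0 = dview['a'][0]            # array pulled from the first engine\n",
"print(a0.flags.writeable)     # may be False for a transferred array\n",
"a0_copy = a0.copy()           # the copy owns its data and is writeable\n",
"a0_copy[0] = -1.0"
],
"language": "python",
"metadata": {},
"outputs": []
},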
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parallel map: map_sync(), map_async()\n",
"------------\n",
"As a first example we use the parallel map, which returns a list.\n",
"This example applies the scatter/gather method: it splits an array/list, sends the fragments to the engines (all apply the same function but to different data), and finally recollects (gathers) the results in a single list."
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"parallel_result = dview.map_sync(lambda x: x**10, arange(32))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 10 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"serial_result = map(lambda x:x**10, arange(32))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 11 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"(parallel_result == serial_result)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 12, | |
"text": [ | |
"True" | |
] | |
} | |
], | |
"prompt_number": 12 | |
}, | |
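{
"cell_type": "markdown",
"metadata": {},
"source": [
"The asynchronous variant returns immediately with an AsyncResult instead of a list. A minimal sketch, assuming the `dview` object from above (the names `amr` and `async_result` are illustrative):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"amr = dview.map_async(lambda x: x**10, list(range(32)))\n",
"print(amr.ready())          # may be False while the engines are still working\n",
"async_result = amr.get()    # block until the result is available\n",
"print(async_result == [x**10 for x in range(32)])"
],
"language": "python",
"metadata": {},
"outputs": []
},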
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Execute functions on the engines: apply*\n",
"-------------------------------\n",
"This calls the same function on all the engines:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview.block=True\n", | |
"dview['a'] = 5\n", | |
"dview['b'] = 10\n", | |
"dview.apply(lambda x: a+b+x, 27)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 13, | |
"text": [ | |
"[42, 42]" | |
] | |
} | |
], | |
"prompt_number": 13 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"ar = dview.apply_async(lambda x: a+b+x, 33)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 14 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"ar.get()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 15, | |
"text": [ | |
"[48, 48]" | |
] | |
} | |
], | |
"prompt_number": 15 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can send different function calls to different engines using the `targets` attribute:"
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview.targets = 0\n", | |
"ar0 = dview.apply_async(lambda x: a+b+x, 27)\n", | |
"dview.targets = 1\n", | |
"ar1 = dview.apply_async(lambda x: a+b+x, 33)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 16 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"print ar0.get(), ar1.get()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"42 48\n" | |
] | |
} | |
], | |
"prompt_number": 17 | |
}, | |
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, one can create different DirectViews by indexing or slicing `rc` and apply a different function to each of them."
] | |
}, | |
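{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of this alternative, assuming the `rc` client from above, at least two engines, and the variables `a` and `b` already pushed to them (the names `view0` and `view1` are illustrative):"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"view0 = rc[0]    # DirectView on engine 0\n",
"view1 = rc[1]    # DirectView on engine 1\n",
"ar0 = view0.apply_async(lambda x: a + b + x, 27)\n",
"ar1 = view1.apply_async(lambda x: a - b + x, 33)\n",
"print(ar0.get())\n",
"print(ar1.get())"
],
"language": "python",
"metadata": {},
"outputs": []
},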
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Scatter/Gather\n", | |
"-------------" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview.targets = [0,1]\n", | |
"dview.scatter('a',arange(16))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 18 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview.gather('a')" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 19, | |
"text": [ | |
"array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])" | |
] | |
} | |
], | |
"prompt_number": 19 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dview['a']" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "pyout", | |
"prompt_number": 20, | |
"text": [ | |
"[array([0, 1, 2, 3, 4, 5, 6, 7]), array([ 8, 9, 10, 11, 12, 13, 14, 15])]" | |
] | |
} | |
], | |
"prompt_number": 20 | |
}, | |
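{
"cell_type": "markdown",
"metadata": {},
"source": [
"Scatter and gather are typically combined with a computation on each engine. A minimal sketch, assuming the `dview` object from above; the engine-side names `x` and `y` are illustrative:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"dview.scatter('x', list(range(16)), block=True)       # split the list across the engines\n",
"dview.execute('y = [i**2 for i in x]', block=True)    # each engine squares its own chunk\n",
"print(dview.gather('y', block=True))                  # reassemble the chunks in engine order"
],
"language": "python",
"metadata": {},
"outputs": []
},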
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 20 | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |