Last active
August 29, 2015 14:10
A Mini Guide to IPython Parallel
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:2bde77c6d6afc841cf657b15dbdf133979fb23d8960105f8567e98d222b235a7" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "heading", | |
"level": 1, | |
"metadata": {}, | |
"source": [ | |
"A Mini Guide to IPython Parallel" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"IPython Parallel can run on a single machine to make full use of the processor's cores. If that's the case, the only thing you need is `ipcluster`. Here, however, we want to use 4 cores on each of 2 different machines. Those machines will be called the *engines*, and the one that coordinates them, the *controller*.\n", | |
"\n", | |
"1. In the controller, we first create a profile named, let's say `cluster`\n", | |
" ```\n", | |
" $ ipython profile create --parallel --profile=cluster\n", | |
" ```\n", | |
" \n", | |
"2. The step is to create or edit a file named `ipcontroller_config.py`, where `c.HubFactory.ip` is the host to listen to, in our case all the interfaces, and `c.HubFactory.location` the external IP of the controller.\n", | |
" ```python\n", | |
" # ipcontroller_config.py\n", | |
" c.HubFactory.ip = '*'\n", | |
" c.HubFactory.location = 'controller.host'\n", | |
" ```\n", | |
" \n", | |
"3. Then, in each engine, we copy the file `~/.ipython/profile_cluster/security/ipcontroller-engine.json`.\n", | |
" ```\n", | |
" $ scp [email protected]:/home/username/.ipython/profile_cluster/security/ipcontroller-engine.json ./\n", | |
" ```\n", | |
" \n", | |
"4. In each engine, we just run `ipengine` with the file we just downloaded from the controller.\n", | |
" ```\n", | |
" $ ipengine --file=./ipcontroller-engine.json\n", | |
" ```\n", | |
" \n", | |
 So far, the easiest way to take advantage of all the cores of a single engine machine is to run `ipengine` once per core.
"5. Finally, on the controller, we launch a password-protected IPython Notebook using the profile.\n", | |
" ```\n", | |
 $ ipython notebook --profile=cluster --no-browser --pprint --NotebookApp.password=`python -c \"from IPython.lib import passwd; print(passwd())\"`\n", | |
" \n", | |
" ```\n", | |
 We can add `--ip=0.0.0.0` for the Notebook to listen on all the interfaces.\n", | |
" \n", | |
"Once everything is working, we just need an `IPython.parallel.Client` instance in our Notebook." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from IPython.parallel import Client" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 1 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Listing the `ids` of the connected engines." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"clients = Client()\n", | |
"clients.ids" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 2, | |
"text": [ | |
"[3, 4, 5, 6, 7, 8, 9, 10]" | |
] | |
} | |
], | |
"prompt_number": 2 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To execute functions in parallel, we get a `View` and then call `map_sync` or `map_async`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"view = Client()[:]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 3 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"view.map_sync(lambda x: 5**x, range(10))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 4, | |
"text": [ | |
"[1, 5, 25, 125, 625, 3125, 15625, 78125, 390625, 1953125]" | |
] | |
} | |
], | |
"prompt_number": 4 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can execute commands on all the engines using the line magic `%px` (or the cell magic `%%px`)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px import socket; print(socket.gethostname())" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:3] engine1-desktop\n", | |
"[stdout:4] engine1-desktop\n", | |
"[stdout:5] engine1-desktop\n", | |
"[stdout:6] engine1-desktop\n", | |
"[stdout:7] engine2-desktop\n", | |
"[stdout:8] engine2-desktop\n", | |
"[stdout:9] engine2-desktop\n", | |
"[stdout:10] engine2-desktop\n" | |
] | |
} | |
], | |
"prompt_number": 5 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Having tested that everything works as expected, we can now run our actual code." | |
] | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
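Step 4 of the guide suggests running `ipengine` once per core on each engine machine. The following is a minimal sketch of how that could be scripted, not part of the original notebook. The helper names `engine_commands` and `start_engines` are hypothetical; the only command line used is the `ipengine --file=...` invocation shown in the guide, and the default of one engine per core is an assumption taken from the guide's advice.

```python
import multiprocessing
import subprocess


def engine_commands(connection_file, n_engines=None):
    """Build one `ipengine` command line per engine to start.

    Hypothetical helper: it only assembles argument lists and does
    not launch anything.
    """
    if n_engines is None:
        # Assumption: one engine per core, as the guide suggests.
        n_engines = multiprocessing.cpu_count()
    return [["ipengine", "--file=" + connection_file] for _ in range(n_engines)]


def start_engines(connection_file, n_engines=None):
    """Launch the engines in the background and return the process handles.

    Requires `ipengine` to be on the PATH of the engine machine.
    """
    return [subprocess.Popen(cmd)
            for cmd in engine_commands(connection_file, n_engines)]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be checked first.
    for cmd in engine_commands("./ipcontroller-engine.json"):
        print(" ".join(cmd))
```

Calling `start_engines("./ipcontroller-engine.json")` on an engine machine would then start the processes; keeping the returned handles makes it possible to call `terminate()` on each one later.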