Created
August 10, 2016 15:23
Introduction to multiprocessing in the IPython notebook
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:833c1966dc11a64a8f20850c6197f3713debc6b8c8ef56596dabed5304ec7d3e" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"IPython/Jupyter provides extensive support for multiprocessing, using a variety of multiprocessing platforms and APIs.\n", | |
"\n", | |
"In addition to supporting complex architectures such as MPI and large-scale HPC clusters, Jupyter also provides a simple implementation suitable for small projects or prototyping using Jupyter notebooks. This notebook demonstrates that simple interface.\n", | |
"\n", | |
"To enable simple multiprocessing, you can run the `ipcluster` command at the same time as the notebook server, and tell it to start with a certain number of processes (this is a global limit on the number of tasks that can run simultaneously). For instance, this starts a cluster with 4 processes:\n", | |
"\n", | |
"```\n", | |
"ipcluster start -n 4\n", | |
"```\n", | |
"\n", | |
"Once the cluster is running, you can construct a `Client` object with no options, and it will connect to the default cluster.\n", | |
"\n", | |
"IPython's parallel API calls these processes \"engines\" to distinguish them from operating system processes, because there are multiple implementations of engines, not all of which are based on operating system processes." | |
] | |
}, | |
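{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# In IPython 4 and later this package is distributed separately as ipyparallel:\n", | |
"#     from ipyparallel import Client\n", | |
"from IPython.parallel import Client\n", | |
"\n", | |
"pc = Client()  # connect to the default cluster started by ipcluster" | |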
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 1 | |
}, | |
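{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A quick way to confirm that the client found the engines is to inspect its `ids` attribute, which lists the integer ID of each registered engine. This is a minimal sketch that assumes the cluster started with `ipcluster` has finished starting up; the number of IDs should match the value passed to `-n`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# IDs of the engines currently registered with the controller\n", | |
"print(pc.ids)\n", | |
"\n", | |
"# Number of engines available for work\n", | |
"print(len(pc.ids))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [] | |
}, | |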
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A typical use case is load balancing, where a sequence of tasks (potentially longer than the number of engines in the cluster) is assigned to engines on an availability basis. That way you can process a long sequence of items faster: each item is handed to the next available engine, which keeps all the engines busy until the items run out." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"lbv = pc.load_balanced_view()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 2 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A simple way to run a function in a `load_balanced_view` is to decorate the function with the view's `parallel` decorator, and then call `map` on a list of arguments. This will call the function once for each argument, on whatever process is currently available, making the results available as an iterator. In this example, a function is run over a series of numbers, returning the square of the number. The results are put in a list." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"args = range(10)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 3 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"@lbv.parallel()\n", | |
"def square_it(n):\n", | |
"    return n * n\n", | |
"\n", | |
"list(square_it.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 4, | |
"text": [ | |
"[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]" | |
] | |
} | |
], | |
"prompt_number": 4 | |
}, | |
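{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The decorator is a convenience; the view also exposes `map_sync` directly, which takes an ordinary undecorated function and blocks until all results are back. A minimal sketch, reusing `lbv` and `args` from above (`cube_it` is a name chosen here purely for illustration):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def cube_it(n):\n", | |
"    return n ** 3\n", | |
"\n", | |
"# map_sync distributes the calls across the engines and waits for all of them to finish\n", | |
"lbv.map_sync(cube_it, args)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [] | |
}, | |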
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A function decorated like this behaves quite differently from a function you would normally define in a notebook, because it runs in a different Python process from the notebook and has none of the notebook's state (variables, imports). This doesn't work:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"CONSTANT = 7\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_constant(n):\n", | |
"    return CONSTANT * n\n", | |
"\n", | |
"list(by_constant.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"ename": "RemoteError", | |
"evalue": "NameError(global name 'CONSTANT' is not defined)", | |
"output_type": "pyerr", | |
"traceback": [ | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<string>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m<ipython-input-5-29605cdae368>\u001b[0m in \u001b[0;36mby_constant\u001b[0;34m(n)\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m: global name 'CONSTANT' is not defined" | |
] | |
} | |
], | |
"prompt_number": 5 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Variables can be set on the individual engines using a `DirectView`'s `dict` interface, after which they are available in each engine:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"direct_view = pc[:]  # direct view over all engines\n", | |
"direct_view['CONSTANT'] = 5\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_constant(n):\n", | |
"    return CONSTANT * n\n", | |
"\n", | |
"list(by_constant.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 6, | |
"text": [ | |
"[0, 5, 10, 15, 20, 25, 30, 35, 40, 45]" | |
] | |
} | |
], | |
"prompt_number": 6 | |
}, | |
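{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The `dict` interface is shorthand for the direct view's `push` and `pull` methods, which can also move several names at once. A sketch under the same setup (`OFFSET` is a hypothetical second variable, added here only for illustration):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# Send two names to every engine; block=True waits until the transfer is done\n", | |
"direct_view.push(dict(CONSTANT=5, OFFSET=1), block=True)\n", | |
"\n", | |
"# Fetch one name back; the result holds one value per engine\n", | |
"direct_view.pull('CONSTANT', block=True)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [] | |
}, | |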
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Imports are also not visible inside parallel functions, if you use the obvious approach:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import numpy as np\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_random(n):\n", | |
"    return np.random.random() * n\n", | |
"\n", | |
"list(by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"ename": "RemoteError", | |
"evalue": "NameError(global name 'np' is not defined)", | |
"output_type": "pyerr", | |
"traceback": [ | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<string>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m<ipython-input-7-929fadd70007>\u001b[0m in \u001b[0;36mby_random\u001b[0;34m(n)\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m: global name 'np' is not defined" | |
] | |
} | |
], | |
"prompt_number": 7 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"However, there's nothing preventing you from importing modules inside your parallel function:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"@lbv.parallel()\n", | |
"def by_random(n):\n", | |
"    from numpy.random import random\n", | |
"    return random() * n\n", | |
"\n", | |
"list(by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 8, | |
"text": [ | |
"[0.0,\n", | |
" 0.167368008870593,\n", | |
" 1.2411598743381458,\n", | |
" 1.7675180201599598,\n", | |
" 1.5641618164917213,\n", | |
" 3.187434517128824,\n", | |
" 4.696644251196928,\n", | |
" 5.776716959789537,\n", | |
" 7.168258584886458,\n", | |
" 7.568338223605384]" | |
] | |
} | |
], | |
"prompt_number": 8 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To manage this a little more cleanly, there is a facility provided by IPython called `sync_imports` that alleviates the need to import inside each parallel function:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"with direct_view.sync_imports():\n", | |
"    from numpy.random import random\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_random(n):\n", | |
"    return random() * n\n", | |
"\n", | |
"list(by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"importing random from numpy.random on engine(s)\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 9, | |
"text": [ | |
"[0.0,\n", | |
" 0.8573835129828095,\n", | |
" 1.2177870049996347,\n", | |
" 1.4226322656236978,\n", | |
" 1.1289807912539036,\n", | |
" 4.850557193909525,\n", | |
" 1.6582739273772558,\n", | |
" 1.1274669801940815,\n", | |
" 6.9331105679123075,\n", | |
" 5.49846427013942]" | |
] | |
} | |
], | |
"prompt_number": 9 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"However, `sync_imports` does not carry over `import ... as` aliases: the module is imported on the engines under its own name, so the alias is undefined there:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"with direct_view.sync_imports():\n", | |
"    import numpy as np\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_random(n):\n", | |
"    return np.random.random() * n\n", | |
"\n", | |
"list(by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"importing numpy on engine(s)\n" | |
] | |
}, | |
{ | |
"ename": "RemoteError", | |
"evalue": "NameError(global name 'np' is not defined)", | |
"output_type": "pyerr", | |
"traceback": [ | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<string>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m<ipython-input-10-27f3dd1978d1>\u001b[0m in \u001b[0;36mby_random\u001b[0;34m(n)\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m: global name 'np' is not defined" | |
] | |
} | |
], | |
"prompt_number": 10 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"You can, however, give an imported module a shorter name inside the function using plain assignment:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"with direct_view.sync_imports():\n", | |
"    import numpy\n", | |
"\n", | |
"@lbv.parallel()\n", | |
"def by_random(n):\n", | |
"    np = numpy\n", | |
"    return np.random.random() * n\n", | |
"\n", | |
"list(by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"importing numpy on engine(s)\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 11, | |
"text": [ | |
"[0.0,\n", | |
" 0.7557438027200684,\n", | |
" 0.9585518741576387,\n", | |
" 0.5353510246225531,\n", | |
" 3.3578900123228412,\n", | |
" 3.9121982488904368,\n", | |
" 0.65422416491363,\n", | |
" 2.759055729179653,\n", | |
" 6.639056480586465,\n", | |
" 8.06777721160626]" | |
] | |
} | |
], | |
"prompt_number": 11 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If you want to import your own modules, they will need to be in the `PYTHONPATH` when the `ipcluster` command is run, or you will need to manipulate `sys.path` before importing modules in your function or in `sync_imports`. The latter is not recommended because it will stop working when you move the code to a different directory." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"When iterating over the results of `map` called on a load-balanced function, results come back in the same order as the iterable being mapped over. In many cases this can be slower than allowing results to come back out of order. If you don't care about order, you can speed up your code using `ordered=False` on the `parallel` decorator:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"with direct_view.sync_imports():\n", | |
"    import numpy\n", | |
"    import time\n", | |
"\n", | |
"@lbv.parallel(ordered=False)\n", | |
"def unordered_by_random(n):\n", | |
"    np = numpy\n", | |
"    # Sleep 0, 10, or 20 ms depending on n, so that tasks finish out of order\n", | |
"    time.sleep((n % 3.) / 100.)\n", | |
"    return n\n", | |
"\n", | |
"list(unordered_by_random.map(args))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"importing numpy on engine(s)\n", | |
"importing time on engine(s)\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 12, | |
"text": [ | |
"[0, 3, 1, 6, 9, 4, 2, 7, 5, 8]" | |
] | |
} | |
], | |
"prompt_number": 12 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"All of the techniques described above work outside the IPython Notebook, but inside a Notebook there are magics that make it even easier to parallelize simple operations. For example, the `%px` line magic runs the statement that follows on the engines (by default, once on each engine). The same considerations about access to variables and imports apply when using this capability:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"with direct_view.sync_imports():\n", | |
"    import numpy\n", | |
"\n", | |
"%px numpy.random.random()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"importing numpy on engine(s)\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[0:1]: \u001b[0m0.07115533851434919" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[1:1]: \u001b[0m0.34365542068224153" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[2:1]: \u001b[0m0.3789227038209141" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[3:1]: \u001b[0m0.42153883642779055" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[4:1]: \u001b[0m0.26371907716587106" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[5:1]: \u001b[0m0.028787998300220385" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[6:1]: \u001b[0m0.21379245391397483" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[7:1]: \u001b[0m0.7923412125091567" | |
] | |
} | |
], | |
"prompt_number": 13 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The `%%px` cell magic parallelizes an entire cell." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%px\n", | |
"\n", | |
"import numpy as np\n", | |
"\n", | |
"np.random.random()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[0:2]: \u001b[0m0.06626255282969074" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[1:2]: \u001b[0m0.2974173113219971" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[2:2]: \u001b[0m0.6160651692284698" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[3:2]: \u001b[0m0.03866279859384625" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[4:2]: \u001b[0m0.8489438951855887" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[5:2]: \u001b[0m0.3644690741009735" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[6:2]: \u001b[0m0.44580574231394954" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[7:2]: \u001b[0m0.4017524889048948" | |
] | |
} | |
], | |
"prompt_number": 14 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"There is no way to collect results from such an execution using `map`, but there are other means which will be described below." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"IPython engines also support magics, so you can use magics in a parallelized line or cell:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%px\n", | |
"\n", | |
"def fibonacci(n):\n", | |
"    if n < 2:\n", | |
"        return 1\n", | |
"    else:\n", | |
"        return fibonacci(n-2) + fibonacci(n-1)\n", | |
"\n", | |
"%timeit fibonacci(30)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:0] 1 loops, best of 3: 548 ms per loop\n", | |
"[stdout:1] 1 loops, best of 3: 535 ms per loop\n", | |
"[stdout:2] 1 loops, best of 3: 475 ms per loop\n", | |
"[stdout:3] 1 loops, best of 3: 508 ms per loop\n", | |
"[stdout:4] 1 loops, best of 3: 451 ms per loop\n", | |
"[stdout:5] 1 loops, best of 3: 493 ms per loop\n", | |
"[stdout:6] 1 loops, best of 3: 450 ms per loop\n", | |
"[stdout:7] 1 loops, best of 3: 561 ms per loop\n" | |
] | |
} | |
], | |
"prompt_number": 15 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If a magic is not pre-loaded, you will need to load it into all the engines, or it will not be available to them. This example depends on R and on the R magic extension provided by `rpy2`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px %load_ext rpy2.ipython" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 18 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%px \n", | |
"%%R\n", | |
"a <- array(seq(24), dim=c(6,4));\n", | |
"mean(a);" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:0]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 0 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:1]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 1 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:2]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 2 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:3]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 3 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:4]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 4 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:5]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 5 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:6]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 6 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"[output:7]" | |
] | |
}, | |
{ | |
"metadata": { | |
"engine": 7 | |
}, | |
"output_type": "display_data", | |
"text": [ | |
"[1] 12.5\n" | |
] | |
} | |
], | |
"prompt_number": 19 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Variables and other names such as imports that are made available to the engines persist, even after the engines complete a job. This requires some care to manage. If an engine changes one of those values, it does not change for other engines:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"direct_view['my_var'] = 20\n", | |
"\n", | |
"%px print my_var\n", | |
"print 'randomizing my_var'\n", | |
"%px import random; my_var = random.random()\n", | |
"%px print my_var" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[stdout:0] 20\n", | |
"[stdout:1] 20\n", | |
"[stdout:2] 20\n", | |
"[stdout:3] 20\n", | |
"[stdout:4] 20\n", | |
"[stdout:5] 20\n", | |
"[stdout:6] 20\n", | |
"[stdout:7] 20\n", | |
"randomizing my_var\n", | |
"[stdout:0] 0.0672676577096\n", | |
"[stdout:1] 0.593290055845\n", | |
"[stdout:2] 0.940976210716\n", | |
"[stdout:3] 0.666114074861\n", | |
"[stdout:4] 0.281673993964\n", | |
"[stdout:5] 0.750778326328\n", | |
"[stdout:6] 0.00123476896935\n", | |
"[stdout:7] 0.895196120864\n" | |
] | |
} | |
], | |
"prompt_number": 20 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"However, the value of a variable on every engine can be read through the direct view, which returns one value per engine. This is a convenient way of retrieving results from the engines after they have done their work." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"direct_view['my_var']" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 21, | |
"text": [ | |
"[0.06726765770962795,\n", | |
" 0.5932900558447385,\n", | |
" 0.9409762107163354,\n", | |
" 0.6661140748607993,\n", | |
" 0.2816739939637196,\n", | |
" 0.7507783263282061,\n", | |
" 0.001234768969350819,\n", | |
" 0.8951961208642395]" | |
] | |
} | |
], | |
"prompt_number": 21 | |
}, | |
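{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"When each engine should work on its own slice of the input, the direct view's `scatter` method can partition a sequence across the engines, and the per-engine results can then be read back through the `dict` interface shown above. A minimal sketch (the names `chunk` and `total` are chosen here purely for illustration):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# Split the sequence into one contiguous piece per engine, stored there as `chunk`\n", | |
"direct_view.scatter('chunk', list(range(16)), block=True)\n", | |
"\n", | |
"# Each engine sums only its own piece\n", | |
"direct_view.execute('total = sum(chunk)', block=True)\n", | |
"\n", | |
"# One partial sum per engine; adding them gives the sum of the whole sequence\n", | |
"sum(direct_view['total'])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [] | |
}, | |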
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"To ensure that your parallelized code starts \"clean\", without any variables or imports defined for it, you can call `clear` on the direct view:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"direct_view.clear()\n", | |
"\n", | |
"%px print my_var" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"ename": "CompositeError", | |
"evalue": "one or more exceptions from call to method: execute\n[0:execute]: NameError: name 'my_var' is not defined\n[1:execute]: NameError: name 'my_var' is not defined\n[2:execute]: NameError: name 'my_var' is not defined\n[3:execute]: NameError: name 'my_var' is not defined\n.... 4 more exceptions ...", | |
"output_type": "pyerr", | |
"traceback": [ | |
"[0:execute]: ", | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<ipython-input-11-8c8012704917>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0mmy_var\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m", | |
"\u001b[0m\u001b[0;31mNameError\u001b[0m: name 'my_var' is not defined", | |
"", | |
"[1:execute]: ", | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<ipython-input-11-8c8012704917>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0mmy_var\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m", | |
"\u001b[0m\u001b[0;31mNameError\u001b[0m: name 'my_var' is not defined", | |
"", | |
"[2:execute]: ", | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<ipython-input-11-8c8012704917>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0mmy_var\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m", | |
"\u001b[0m\u001b[0;31mNameError\u001b[0m: name 'my_var' is not defined", | |
"", | |
"[3:execute]: ", | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\u001b[0;32m<ipython-input-11-8c8012704917>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m", | |
"\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0mmy_var\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m", | |
"\u001b[0m\u001b[0;31mNameError\u001b[0m: name 'my_var' is not defined", | |
"", | |
"... 4 more exceptions ..." | |
] | |
} | |
], | |
"prompt_number": 22 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The `%px` and `%%px` magics can also be used to import modules into engines, and they support `import ... as`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"direct_view.clear()\n", | |
"\n", | |
"%px import numpy as np\n", | |
"\n", | |
"%px np.random.random()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[0:13]: \u001b[0m0.32817339574765314" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[1:13]: \u001b[0m0.2969531330516032" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[2:13]: \u001b[0m0.7961021553865206" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[3:13]: \u001b[0m0.21141354222531006" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[4:13]: \u001b[0m0.5107222443099002" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[5:13]: \u001b[0m0.8213017821298888" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[6:13]: \u001b[0m0.31726720666842334" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[7:13]: \u001b[0m0.13085758833719752" | |
] | |
} | |
], | |
"prompt_number": 23 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Cython also works in parallel engines, and the C translation, compilation, and loading are done automatically once the Cython magic has been loaded on each engine:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px %reload_ext cythonmagic" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 24 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%%px\n", | |
"%%cython\n", | |
"\n", | |
"def foo(int i):\n", | |
"    return i + 1" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 26 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%px foo(3)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[0:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[1:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[2:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[3:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[4:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[5:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[6:17]: \u001b[0m4" | |
] | |
}, | |
{ | |
"metadata": {}, | |
"output_type": "display_data", | |
"text": [ | |
"\u001b[0;31mOut[7:17]: \u001b[0m4" | |
] | |
} | |
], | |
"prompt_number": 27 | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |