Created November 6, 2013 11:02
{ | |
"metadata": { | |
"name": "" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "heading", | |
"level": 1, | |
"metadata": {}, | |
"source": [ | |
"IPython Parallel with globals in remote namespace" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from IPython.parallel import Client" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 1 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Make sure that a cluster with several kernels (engines) is already running under the specified profile, for example one started with `ipcluster start`." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cl = Client(profile=\"default\")" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 2 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dv = cl.direct_view()" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 3 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We want to test one function calling another on the remote kernels, with both functions relying on global variables." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def outer_function(a):\n", | |
"    # Both names must exist in the global namespace of the kernel.\n", | |
"    global lookup\n", | |
"    global backup\n", | |
"    # Start from fresh counts for every ID.\n", | |
"    backup = defaultdict(int)\n", | |
"    lookup[a][\"value\"] = inner_function(lookup[a][\"seq\"])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 4 | |
}, | |
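{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The inner function counts each symbol in a sequence and normalizes each count by the length of the sequence. The returned list follows the iteration order of the `backup` dictionary, so it does not record which value belongs to which symbol." | |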
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def inner_function(seq):\n", | |
"    global backup\n", | |
"    for sym in seq:\n", | |
"        backup[sym] += 1\n", | |
"    total = float(len(seq))\n", | |
"    return [backup[key] / total for key in backup]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 5 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"A few distinct IDs to use as dictionary keys." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"seq_ids = list(\"ABCDEF\")" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 6 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import random" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 7 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Store a random sequence of 1000 symbols under each ID." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"lookup = dict()\n", | |
"for key in seq_ids:\n", | |
"    lookup[key] = dict()\n", | |
"    lookup[key][\"seq\"] = [random.choice(\"ATGC\") for i in range(1000)]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 8 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"from collections import defaultdict" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 9 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Here we push the required variables into the global namespace of every remote kernel. Instead of pushing the `defaultdict` class, we could also import it on the kernels::\n", | |
"\n", | |
"    dv.execute(\"from collections import defaultdict\", block=True)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"dv.push({\"lookup\": lookup, \"backup\": dict(), \"inner_function\": inner_function,\n", | |
"         \"defaultdict\": defaultdict}, block=True)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 10, | |
"text": [ | |
"[None, None]" | |
] | |
} | |
], | |
"prompt_number": 10 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now we map the outer function over the IDs. If the outer function took more arguments, we could pass one additional sequence of equal length per argument in the same call." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"results = dv.map(outer_function, seq_ids, block=True)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 11 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"The results are all `None` because the outer function has no return value; it stores everything in the `lookup` dictionary on the remote kernel." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"results" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 12, | |
"text": [ | |
"[None, None, None, None, None, None]" | |
] | |
} | |
], | |
"prompt_number": 12 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We can retrieve the global dictionaries using `pull`, which returns a list with one dictionary per remote kernel." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"all_dicts = dv.pull(\"lookup\", block=True)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 13 | |
}, | |
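{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We merge the remote dictionaries into a local copy. Each kernel only computed `value` for the IDs it was handed, so all of the pulled dictionaries are needed." | |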
] | |
}, | |
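{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"# Copy the inner dictionaries too, so that merging does not modify lookup.\n", | |
"result_dict = dict((key, dict(lookup[key])) for key in seq_ids)\n", | |
"for d in all_dicts:\n", | |
"    for key in seq_ids:\n", | |
"        result_dict[key].update(d[key])" | |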
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 14 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"These are the normalized values for each ID." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"for key in seq_ids:\n", | |
"    print(result_dict[key][\"value\"])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[0.248, 0.256, 0.241, 0.255]\n", | |
"[0.271, 0.254, 0.255, 0.22]\n", | |
"[0.25, 0.244, 0.255, 0.251]\n", | |
"[0.267, 0.256, 0.259, 0.218]\n", | |
"[0.255, 0.237, 0.255, 0.253]\n", | |
"[0.236, 0.269, 0.235, 0.26]\n" | |
] | |
} | |
], | |
"prompt_number": 15 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"If we did this in serial, rather than in parallel, would we get the same result?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"backup = dict()\n", | |
"for key in seq_ids:\n", | |
"    outer_function(key)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 16 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"for key in seq_ids:\n", | |
"    print(lookup[key][\"value\"])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"[0.248, 0.256, 0.241, 0.255]\n", | |
"[0.271, 0.254, 0.255, 0.22]\n", | |
"[0.25, 0.244, 0.255, 0.251]\n", | |
"[0.267, 0.256, 0.259, 0.218]\n", | |
"[0.255, 0.237, 0.255, 0.253]\n", | |
"[0.236, 0.269, 0.235, 0.26]\n" | |
] | |
} | |
], | |
"prompt_number": 17 | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
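The push, map, pull and merge round trip in the notebook can be tried without a running cluster. The following is only a rough sketch in Python 3, not the notebook's own code: each "engine" is simulated by a plain dictionary acting as its namespace, and the names `run_engine`, `engines` and `n_engines` are hypothetical. Two deliberate deviations, both assumptions made for illustration: the counts are passed to `inner_function` explicitly instead of through a global, and the frequencies are returned as a dictionary keyed by symbol so that each value stays attached to its symbol. The IDs are split into contiguous chunks, which is assumed here to mirror how a direct view distributes its work.

```python
import random
from collections import defaultdict


def inner_function(seq, counts):
    # Count every symbol, then normalize each count by the sequence length.
    for sym in seq:
        counts[sym] += 1
    total = float(len(seq))
    return dict((sym, counts[sym] / total) for sym in sorted(counts))


def run_engine(namespace, ids):
    # Stand-in for one remote kernel: it only touches its own namespace.
    for key in ids:
        entry = namespace["lookup"][key]
        entry["value"] = inner_function(entry["seq"], defaultdict(int))


random.seed(0)  # fixed seed so repeated runs use the same sequences
seq_ids = list("ABCDEF")
lookup = dict((key, {"seq": [random.choice("ATGC") for _ in range(1000)]})
              for key in seq_ids)

# "push": every simulated engine receives its own copy of lookup.
n_engines = 2
engines = [{"lookup": dict((k, dict(v)) for k, v in lookup.items())}
           for _ in range(n_engines)]

# "map": hand each engine one contiguous chunk of the IDs.
chunk = len(seq_ids) // n_engines
for i, namespace in enumerate(engines):
    run_engine(namespace, seq_ids[i * chunk:(i + 1) * chunk])

# "pull" and merge: each engine only holds values for its own IDs.
result_dict = dict((key, dict(lookup[key])) for key in seq_ids)
for namespace in engines:
    for key in seq_ids:
        result_dict[key].update(namespace["lookup"][key])

for key in seq_ids:
    print(key, result_dict[key]["value"])
```

After the merge every ID has a `value` entry, even though each simulated engine filled in only half of them, and the local `lookup` is left unchanged because the inner dictionaries were copied before merging.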