Created
January 1, 2021 04:44
-
-
Save lordpretzel/55bfee9a0655ea8d12b0705411ba68be to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"# Hashtables\n", | |
"\n", | |
"## Agenda\n", | |
"\n", | |
"- Discussion: pros/cons of array-backed and linked structures\n", | |
"- Comparison to `set` and `dict`\n", | |
"- The *map* ADT\n", | |
"- Direct lookups via *Hashing*\n", | |
"- Hashtables\n", | |
" - Collisions and the \"Birthday problem\"\n", | |
"- Runtime analysis & Discussion" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Discussion: pros/cons of array-backed and linked structures" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Between the array-backed and linked list we have:\n", | |
"\n", | |
"1. $O(1)$ indexing (array-backed)\n", | |
"2. $O(1)$ appending (array-backed & linked)\n", | |
"3. $O(1)$ insertion/deletion without indexing (linked)\n", | |
"4. $O(N)$ linear search (unsorted)\n", | |
"5. $O(\\log N)$ binary search, when sorted (only array-backed lists)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Comparison to `set` and `dict`" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"The `set` and `dict` types don't support positional access (i.e., by index), but do support lookup/search. How fast do they fare compared to lists?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 95, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"import timeit\n", | |
"\n", | |
"def lin_search(lst, x):\n", | |
" return x in lst\n", | |
" \n", | |
"def bin_search(lst, x):\n", | |
" # assumes lst is sorted\n", | |
" low = 0\n", | |
" hi = len(lst)-1\n", | |
" while low <= hi:\n", | |
" mid = (low + hi) // 2\n", | |
" if x < lst[mid]:\n", | |
" hi = mid - 1\n", | |
" elif x < lst[mid]:\n", | |
" low = mid + 1\n", | |
" else:\n", | |
" return True\n", | |
" else:\n", | |
" return False\n", | |
" \n", | |
"def set_search(st, x):\n", | |
" return x in st\n", | |
" \n", | |
" \n", | |
"def dict_search(dct, x):\n", | |
" return x in dct" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 96, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"%matplotlib inline\n", | |
"import matplotlib.pyplot as plt\n", | |
"import numpy as np\n", | |
"import random\n", | |
"\n", | |
"ns = np.linspace(100, 10_000, 50, dtype=int)\n", | |
"\n", | |
"ts_linsearch = [timeit.timeit('lin_search(lst, lst[-1])',\n", | |
" setup='lst = list(range({})); random.shuffle(lst)'.format(n),\n", | |
" globals=globals(),\n", | |
" number=100)\n", | |
" for n in ns]\n", | |
"\n", | |
"ts_binsearch = [timeit.timeit('bin_search(lst, 0)',\n", | |
" setup='lst = list(range({}))'.format(n),\n", | |
" globals=globals(),\n", | |
" number=100)\n", | |
" for n in ns]\n", | |
"\n", | |
"\n", | |
"ts_setsearch = [timeit.timeit(#'set_search(st, 0)',\n", | |
" 'set_search(st, {})'.format(random.randrange(n)),\n", | |
" setup='lst = list(range({})); random.shuffle(lst);'\n", | |
" 'st = set(lst)'.format(n),\n", | |
" globals=globals(),\n", | |
" number=100)\n", | |
" for n in ns]\n", | |
"\n", | |
"ts_dctsearch = [timeit.timeit(#'dict_search(dct, 0)',\n", | |
" 'dict_search(dct, {})'.format(random.randrange(n)),\n", | |
" setup='lst = list(range({})); random.shuffle(lst);'\n", | |
" 'dct = {{x:x for x in lst}}'.format(n),\n", | |
" globals=globals(),\n", | |
" number=100)\n", | |
" for n in ns]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 103, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYEAAAD4CAYAAAAKA1qZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAcgUlEQVR4nO3dfYxd9X3n8ffX43jwkK0Dw8QQP8yMFUM7VNkSj6jZVt2qQ4vxRrjpIsXITdyG1Goh202zEWsHrTZBIhtWKImjkqReSETDBJuYbDNCON7lQepW2xrGTcuT480EbGMvhsEkzoZn29/94/wGX1/fc+89M+fe8/R5SaO593cefud3z8z53vM7vwdzd0REpJrmZX0AIiKSHQUBEZEKUxAQEakwBQERkQpTEBARqbD5WR9AEhdccIEPDQ1lfRgiIoWxd+/el919IG55oYLA0NAQk5OTWR+GiEhhmNnBZstVHSQiUmEKAiIiFaYgICJSYQoCIiIVpiAgIlJhCgIiInk1Pg5DQzBvXvR7fDz1LArVRFREpDLGx2HTJnjttej9wYPRe4ANG1LLRncCIiJ5dPPNpwPAjNdei9JTpCAgIpJHhw4lS58lBQERkTxavjxZ+iwpCIiI5NGtt0Jf35lpfX1ReooUBERE8mjDBti2DQYHwSz6vW1bqg+Foc0gYGZrzGy/mU2Z2eYGy3vNbEdYvsfMhkJ6v5k9ama/MLO/rNtmlZk9Gbb5qplZKiUSESmLDRvgwAE4dSr6nXIAgDaCgJn1AHcAVwMjwHVmNlK32vXAT939/cCXgdtC+hvAfwI+02DXXwf+BFgZftbMpgAiIoXXhf4Acdq5E7gcmHL3Z939LWA7sK5unXXA3eH1TmDMzMzdX3X3vyMKBu8ws4uAX3L3f3B3B/4a+P05lENE5LQML6qJzfQHOHgQ3E/3B+jSMbcTBJYAz9e8PxzSGq7j7ieA40B/i30ebrFPEZHkMr6oJtal/gBxcv9g2Mw2mdmkmU1OT09nfTgikncZX1QT61J/gDjtBIEjwLKa90tDWsN1zGw+sAg41mKfS1vsEwB33+buo+4+OjAQO0OaiEgk44tqYl3qDxCnnSDwOLDSzIbNbAGwHpioW2cC2BheXws8Eur6G3L3F4Cfm9nq0CroY8D3Ex+9iEi9jC+qiXWpP0CclkEg1PF/EtgN7APuc/enzewWM7smrHYX0G9mU8CngXeakZrZAeBLwB+Z2eGalkU3AHcCU8BPgF3pFElEKi3ji2piXeoPEMeafGHPndHRUddE8yLS0vh49Azg0KHoDuDWW7t2Uc0bM9vr7qNxy3P/YFhEJLEudLKKVaTmqSgIiIikp1Xz1BwGCE0qIyKSllbNU7swSUxSuhMQEUlLs+apOe2/oCAgIpKWZs1Tc9p/QUFARCQtzZqn5rT/goKAiEhamrX5z2n/BQUBERFIr+VOXPPUjDuFxVEQEBGZzcijswkaWfZfiKEgICKStOVO0YarbkJBQEQ6I4cdo2IlbbmT0+aes6EgICLpK9o35aQtd3La3HM2FAREJH1F+6actOVOTpt7zoaCgIikr2jflJu13GlUrZXT5p6zoSAgIukr4jflRi134qq1IJfNPWdDQUBE0tetb8qzeficZJtm1Vo5bO45GxpFVETSN3NB7OTELjPf0pOMypl0m6JVa82CZhYTkWIaGoou4vUGB6Nv5mlsM5s8ckYzi4lIOc3mW3rSbUr0ADiOgoCIFNNsHj4n3San4/2kSUFARIppNt/SZ7NNSR4Ax1EQEJFims239Ap8s09KD4ZFREpMD4ZFRCSWgoCI5F+RRiQtGHUWE5F8m02nMGmb7gREJN+KNiJpwSgIiEi+VWDohiwpCIhIvhVxRNICURAQkXyrwNANWWorCJjZGjPbb2ZTZra5wfJeM9sRlu8xs6GaZVtC+n4zu6om/S/M7Gkze8rM7jWzc1IpkYiUizp4dVTLIGBmPcAdwNXACHCdmY3UrXY98FN3fz/wZeC2sO0IsB64FFgDfM3MesxsCfDnwKi7/yrQE9YTETlbyYduyFI7dwKXA1Pu/qy7vwVsB9bVrbMOuDu83gmMmZmF9O3u/qa7PwdMhf1B1Dx1oZnNB/qA/zu3ooiISFLtBIElwPM17w+HtIbruPsJ4DjQH7etux8BbgcOAS8Ax939fzTK3Mw2mdmkmU1OT0+3cbgiItKuTB4Mm9l5RHcJw8D7gHPN7A8brevu29x91N1HBwYGunmYIiKl104QOAIsq3m/NKQ1XCdU7ywCjjXZ9krgOXefdve3ge8B/2o2BRCRLunG0A0aHqLr2gkCjwMrzWzYzBYQPcCdqFtnAtgYXl8LPOLR8KQTwPrQemgYWAk8RlQNtNrM+sKzgzFg39yLIyIdMTN0w8GD4H566IY0L9LdyEPO0tZQ0ma2FvgKUSueb7r7rWZ2CzDp7hOheee3gcuAV4D17v5s2PZm4OPACeBT7r4rpH8e+EhI/yHwCXd/s9lxaChpkYx0Y67dEsznm0ethpLWfAIi0tq8edG383pmUbPNpMbHo7F/Dh2Kev7eeit89KPp5iGA5hMQkTSkOXRDXLXP+eenl4e0TUFARFpLc+iGuFFBZ/aZRh7SNgUBEWktzaEb4kb/fOUVDQ+RAQUBkSLJsgllWkM3NKta0vAQXacgIFIUZWlCqVFBc0VBQKQoyjLDlkYFzRUFAZE8alTtk9cZtmZTRaVqn9zQRPMieRM3sfr558OxY2evn2UTSk0CX3i6ExDJUqNv0UVqQlmWKqoKUxAQyUrcg95GQydAPptQ5rWKStqm6iCRrMR9i+7pgZMnz15/pgllnqpZli9vHLTUy7cwdCcgkpW4b8snT+av2ieOmnsWnoKASFbivi3PVPPkqdonjpp7Fp6CgEhWmn2LzrIJZdImn2ruWWgKAiJZyfpbdKOLfVl6JUvbNJ+ASBXVt++H6C5k4cLGfRE0sUthaT4Bkazlcd7cuJZJjQIAqMlniamJqEgn5bVHbdKLupp8lpbuBEQ6Ka89auMu6v39avJZMQoCIp2U1x61cS2Ttm5Vk8+KUXWQSCfltUftzEW9frL3mXRd9CtDdwIinZTnHrVq3y8oCIh0VtZ9AURaUHWQSKflbdA3kRq6ExApszz2UZBc0Z2ASFnltY+C5IruBETKKq99FCRXFAREyiqvfRQkVxQEROIUqT690bHG9UXIuo+C5EpbQcDM1pjZfjObMrPNDZb3mtmOsHyPmQ3VLNsS0veb2VU16e8xs51m9iMz22dmV6RSIpE0FGlI5bhjXbs2v30UJDdaBgEz6wHuAK4GRoDrzGykbrXrgZ+6+/uBLwO3hW1HgPXApcAa4GthfwBbgR+4+y8D/xLYN/fiiKSkSPXpccf64IPqoyAttXMncDkw5e7PuvtbwHZgXd0664C7w+udwJiZWUjf7u5vuvtzwBRwuZktAn4LuAvA3d9y95/NuTQiaZlNfXpW1UfNjlW9gqWFdoLAEuD5mveHQ1rDddz9BHAc6G+y7TAwDXzLzH5oZnea2bmNMjezTWY2aWaT09PTbRyuSAqS1qdnWX2kun+Zg6weDM8HPgh83d0vA14FznrWAODu29x91N1HBwYGunmMUmVJx/zJsvooz+MTSe61EwSOAMtq3i8NaQ3XMbP5wCLgWJNtDwOH3X1PSN9JFBRE8iHpmD9ZNsfU+EQyB+0EgceBlWY2bGYLiB70TtStMwFsDK+vBR7xaPLiCWB9aD00DKwEHnP3o8DzZnZJ2GYMeGaOZRFJV5L69KyrZFT3L7PUMgiEOv5PAruJWvDc5+5Pm9ktZnZNWO0uoN/MpoBPE6p23P1p4D6iC/wPgBvd/WTY5t8B42b2BPBrwBdSK5VIt6lKRgrKoi/sxTA6OuqTk5NZH4ZIY+Pj8ZO0iGTEzPa6+2jccg0gJ5IWDRktBaRhI0REKkxBQESkwhQEREQqTEFARKTCFARERCpMQUBEpMIUBEREKkxBQESkwhQEREQqTEFAJKkizT0s0oKGjRBJYmbymJm5A2YmjwENGSGFpDsBkSSKNPewSBsUBESSyHLyGJEOUBAQSSLryWNEUqYgIJKEJo+RklEQEElC8/lKyah1kEhSmjxGSkR3AiIiFaYgIOWijlwiiag6SMpDHblEEtOdgJTHbDpy6c5BKk53AlIeSTty6c5BRHcC0kWd/tadtCOXhoAQURCQLpn51n3wILif/tY9Pt48OCQJHM06cjXaj4aAEMHcPetjaNvo6KhPTk5mfRgyG0ND0YW/Xn8/vP76md/I+/qiDlhwZnVN7bK46prx8eib/KFD0R3ATE/eRvtZuBCOHTt7H4ODcOBAktKJ5JaZ7XX30djlCgLSFfPmRXcA7RocjH43ChxJL9KzCUB6JiAl0SoIqDpIuiPpAGuHDqVXXRO3/iuvaAgIqTwFAemOuPr6/v7G6y9fnt6Inc32s2FDdFdx6lT0WwFAKqatIGBma8xsv5lNmdnmBst7zWxHWL7HzIZqlm0J6fvN7Kq67XrM7Idm9sCcSyL5Fjfw2tat8Q9z0xqxUyN/isRz96Y/QA/wE2AFsAD4Z2Ckbp0bgG+E1+uBHeH1SFi/FxgO++mp2e7TwHeAB1odh7uzatUqlxK65x73wUF3s+j3Pfe0tyytPERKDJj0JtfVdu4ELgem3P1Zd38L2A6sq1tnHXB3eL0TGDMzC+nb3f1Nd38OmAr7w8yWAv8GuLPtiCX5kla7/2ZVMmlV16jaR6ShdoLAEuD5mveHQ1rDddz9BHAc6G+x7VeAm4BTzTI3s01mNmlmk9PT020crnRFs3b/3cpfwz2IzFkmD4bN7EPAS+6+t9W67r7N3UfdfXRgYKALRydtadbbttMX6KwDkEiJtBMEjgDLat4vDWkN1zGz+cAi4FiTbX8DuMbMDhBVL/2Omd0zi+OXrMQ1u5y5IHfyAq3hHkRS004QeBxYaWbDZraA6MHvRN06E8DG8Ppa4JHwQGICWB9aDw0DK4HH3H2Luy9196Gwv0fc/Q9TKI90QqNv9nHNLnt6On+B1nAPIqlpGQRCHf8ngd3APuA+d3/azG4xs2vCancB/WY2RdTiZ3PY9mngPuAZ4AfAje5+Mv1iSMfEVb2sXdu42eXJmNOb5gU6rf4DIqJhI6SFuCEXBgejdvb14/TcfHM6Qz00Uz8ENGi4B5EYrYaN0HwC0lyzqpe4CdcbXaDT7Jg1k2d9AFIAEElMw0ZIc0mrXuJ6Bqd9gVa7f5FUKAhIc7MZckEXaJHCUBCQSFzb/m59sxeRTOiZgLSeazeu7l9ECk93AqLOVyIVpiAg6nwlUmEKAmWVZPwedb4SqSwFgTJKOsCaJl0RqSwFgTJKOsKnWgCJVJaGjSijefOiO4BG+vo03IJIhbQaNkJ3AmWU5QifIlIoCgJlFFfH340RPkWkUBQEyiiujn9wsPH6agUkUlnqMVxWWY3wKSKFojuBKlErIBGpoyBQdEknddcInyJSQ9VBRdZq4DcRkRZ0J1BkGvhNROZIQaDINPCbiMyRgkCRaeA3EZkjBYEi08BvIjJHCgJFoKkfRaRD1Doo7zT1o4h0kO4E8k4tgESkgxQE8k4tgESkgxQEsqCpH0UkJxQEuk1TP4pIjigIdJumfhSRHGlrekkzWwNsBXqAO939i3XLe4G/BlYBx4CPuPuBsGwLcD1wEvhzd99tZsvC+osBB7a5+9ZWx1GK6SU19aOIdNGcp5c0sx7gDuBqYAS4zsxG6la7Hvipu78f+DJwW9h2BFgPXAqsAb4W9ncC+A/uPgKsBm5ssM9y0tSPIpIj7VQHXQ5Mufuz7v4WsB1YV7fOOuDu8HonMGZmFtK3u/ub7v4cMAVc7u4vuPs/Arj7/wP2AUvmXpwC0NSPIpIj7QSBJcDzNe8Pc/YF+5113P0EcBzob2dbMxsCLgP2NMrczDaZ2aSZTU5PT7dxuDmnqR9FJEcy7TFsZu8G7gc+5e4/b7SOu28DtkH0TKCLh9c5mvpRRHKinTuBI8CymvdLQ1rDdcxsPrCI6AFx7LZm9i6iADDu7t+bzcGXiloBiUgG2gkCjwMrzWzYzBYQPeidqFtnAtgYXl8LPOJRs6MJYL2Z9ZrZMLASeCw8L7gL2OfuX0qjIKWgqR9FpMtaVge5+wkz+ySwm6iJ6Dfd/WkzuwWYdPcJogv6t81sCniFKFAQ1rsPeIaoRdCN7n7SzH4T+CjwpJn9U8jqs+7+YMrlExGRJtrqJ5AXpegnICLSRXPuJyBzkGSMIBGRDGg+gU5pNQ+AiEgO6E6gUzQPgIgUgIJAp2geABEpAAWBNDSq+9c8ACJSAAoCcxU3P8DatZoHQERyT0FgruLq/h98UD2ARST31E9gruLmBzCLev6KiGRI/QQ6TXX/IlJgCgJzpTmARaTAFATaFdf7V6N/ikiBqcdwO1r1/o2bH0BEJOd0J9AO9f4VkZJSEGiHev+KSEkpCLRDLYBEpKQUBNqhFkAiUlIKAvUatQJSCyARKSm1DqrVTisgEZES0Z1ALbUCEpGKURCopVZAIlIxCgK11ApIRCqmukGg0QNgtQISkYqpZhCImwgG1ApIRCqlmvMJDA1FF/56g4Nw4MDc9y8ikhOt5hOoZhPRkjwAvvD2C3nx1RfPSl987mKOfuZoYfLotKKVoWjHK8VWzSCwfHnjO4GCPQBudKFolp7XPOKkdTHMsgzNxJUvzouvvoh93s5Kz2NwUCArjvJXB42PR+38Dx2KLvIzD3lrO4VB9AB4FvX/Wf6xN7ogNDOPeZzi7Ckv49IXn7s48YUyaR7dyDupxecuBhoHiSzLl1SzvKGz5Wsmruxp5Z1lubuRd9JrS6vqoHIHgfoewAB9fVy45V28ePL4WauX/SIpIuXg/7n963al5xi+8ImPYTe9ht0wgg08Gf2+6bXTAeClEbjjyeg3nL7Ytpl+xkW4bllcetI80sx71undyCMmfeyJMe69fScPf+5R7r19J2NPjJWr3N3IQ3mXM++0uHvLH2ANsB+YAjY3WN4L7AjL9wBDNcu2hPT9wFXt7rPRz6pVqzwJPoePXXO132t/6w/zqN9rf+tj11ydano38sgy7yzLN/YHY75r/i5/lEff+dk1f5eP/cFYKcpd9r8d5d25vBNdB2Gy2XW1ZXWQmfUA/wf4XeAw8Dhwnbs/U7PODcAH3P1PzWw98GF3/4iZjQD3ApcD7wMeAi4OmzXdZyNJq4Ou/LdX8pn//lnO8dM3PG/YKXat+j5X71035/TbP/wFgI7m0SzvRb2L+JPtNxY6j2Z5f+LhT3Dh8Qup97OFP+OcN36p0Oc1678d5V3cvG//8Bd46P6HaNecnwmY2RXA59z9qvB+C4C7/5eadXaHdf7ezOYDR4EBYHPtujPrhc2a7rORpEFge98DXPj6u89KP4Ezn7MfqiZNP7rwFwAdzaNZ3v19/bzr2JuFzqNZ3u99/VzmNVjmONbhvKGz5/Xowl8wz+bx3tf6zlp2EqenwTZx6XksX1z6S32vccpPZfY/A9n9v6aV99GFv2D9ax86Kz1OGv0ElgDP17w/DPx63DrufsLMjgP9If0f6rZdEl632udMATYBmwCWJ2zC+d4GHzjQ8B9pNulx+08zj6Z5v372xbloeTTL+9i8XgZONcq/C59tjG7k0SjwNUvPY/ni0i9oEPS6lXfm/68p5d1sX7OR+wfD7r7N3UfdfXRgYCDRti+F6FvvJI3vfpKmv9z3Kif7exsuO5VSHrPJ22NajiZNP9nf2/E8muV96uMreKPuT/QN5vFmb+PvLkUqd9afrfIubt5x+5mtdoLAEWBZzfulIa3hOqE6aBFwrMm27exzzr77+9/iDTuzeeUbdoqHrtjN2z1nFv3tnnks/7MlidL/9bZRPrB1RcNlyxLuK828l/7p+1JJ/8DWFR3Po1neH/lvi/lfv34JL9LLKeBFevm71Zdw2V0rC1/urD9b5V3cvD+wdQVpaicIPA6sNLNhM1sArAcm6taZADaG19cCj4Sn0hPAejPrNbNhYCXwWJv7nLP7v3M/v3LnpbzcE11EXu7p5VfuvJQv/u8vcvG2S85Iv3jbJVz8tYsTpS/esJjFGxansq885p11+QA++/Biblp+BVfab/MfB69gy0PlKHfWn63yLm7eM/8bqWmnaSawlqg1z0+Am0PaLcA14fU5wHeJmns+Bqyo2fbmsN1+4Opm+2z1k7SJ6IynnnK/9NLodyfSu5FHlnmXvXz6bJV3EfNuF3NtIponqY0iKiJSEZXuMSwiIs0pCIiIVJiCgIhIhSkIiIhUWKEeDJvZNNBgNphYFwAvd+hw8qqKZYZqlruKZYZqlnsuZR5099ietoUKAkmZ2WSzp+JlVMUyQzXLXcUyQzXL3ckyqzpIRKTCFARERCqs7EFgW9YHkIEqlhmqWe4qlhmqWe6OlbnUzwRERKS5st8JiIhIEwoCIiIVVsogYGZrzGy/mU2Z2easj2cuzGyZmT1qZs+Y2dNm9u9D+vlm9j/N7Mfh93kh3czsq6HsT5jZB2v2tTGs/2Mz2xiXZ56YWY+Z/dDMHgjvh81sTyjfjjAUOWG48h0hfY+ZDdXsY0tI329mV2VUlLaY2XvMbKeZ/cjM9pnZFVU412b2F+Hv+ykzu9fMzinjuTazb5rZS2b2VE1aaufXzFaZ2ZNhm6+aWczUNDWaDTFaxB+gh2h46hXAAuCfgZGsj2sO5bkI+GB4/S+Iht8eAf4rsDmkbwZu89NDdO8imoNxNbAnpJ8PPBt+nxden5d1+doo/6eB7wAPhPf3AevD628AfxZe3wB8I7xeD+wIr0fC30AvMBz+NnqyLleT8t4NfCK8XgC8p+znmmjK2eeAhTXn+I/KeK6B3wI+CDxVk5ba+SUayn912GYXNcP3xx5T1h9KBz7kK4DdNe+3AFuyPq4Uy/d94HeJ5me4KKRdBOwPr/8KuK5m/f1h+XXAX9Wkn7FeHn+IZpx7GPgd4IHwh/0yML/+XAO7gSvC6/lhPas//7Xr5e2HaEa+5wgNNurPYVnPNafnKD8/nLsHgKvKeq6BobogkMr5Dct+VJN+xnpxP2WsDnpn0vugdnL7Qgu3vZcBe4DF7v5CWHQUmJluKK78RfxcvgLcBMzMEdoP/MzdT4T3tWV4p3xh+fGwfpHKPQxMA98KVWB3mtm5lPxcu/sR4HbgEPAC0bnbS7nPda20zu+S8Lo+vakyBoFSMrN3A/cDn3L3n9cu8yjsl6qtr5l9CHjJ3fdmfSxdNJ+oquDr7n4Z8CpR9cA7SnquzwPWEQXB9wHnAmsyPaiMZHF+yxgEujKJfTeZ2buIAsC4u38vJL9oZheF5RcBL4X0uPIX7XP5DeAaMzsAbCeqEtoKvMfM5od1asvwTvnC8kXAMYpV7sPAYXffE97vJAoKZT/XVwLPufu0u78NfI/o/Jf5XNdK6/weCa/r05sqYxDoyiT23RKe7t8F7HP3L9UsmgBmWgVsJHpWMJP+sdCyYDVwPNxq7gZ+z8zOC9+8fi+k5ZK7b3H3pe4+RHQOH3H3DcCjwLVhtfpyz3we14b1PaSvDy1KhoGVRA/PcsfdjwLPm9klIWkMeIaSn2uiaqDVZtYX/t5nyl3ac10nlfMblv3czFaHz/FjNfuKl/VDkg49eEk8iX1ef4DfJLo9fAL4p/CzlqgO9GHgx8BDwPlhfQPuCGV/Ehit2dfHganw88dZly3BZ/DbnG4dtILoH3sK+C7QG9LPCe+nwvIVNdvfHD6P/bTRWiLjsv4aMBnO998Qtf4o/bkGPg/8CHgK+DZRC5/SnWvgXqLnHm8T3fldn+b5BUbDZ/gT4C+pa2TQ6EfDRoiIVFgZq4NERKRNCgIiIhWmICAiUmEKAiIiFaYgICJSYQoCIiIVpiAgIlJh/x8qSOIIl/rtoAAAAABJRU5ErkJggg==\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"plt.plot(ns, ts_linsearch, 'or')\n", | |
"plt.plot(ns, ts_binsearch, 'sg')\n", | |
"plt.plot(ns, ts_setsearch, 'db')\n", | |
"plt.plot(ns, ts_dctsearch, 'om');" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"It looks like sets and dictionaries support lookup in constant time! How?!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## The `map` ADT" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"We will focus next on the \"*map*\" abstract data type (aka \"associative array\" or \"dictionary\"), which is used to associate keys (which must be unique) with values. Python's `dict` type is an implementation of the map ADT. \n", | |
"\n", | |
"Given an implementation of a map, it is trivial to implement a *set* on top of it (how?).\n", | |
"\n", | |
"Here's a simple map API:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 109, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class MapDS:\n", | |
" def __init__(self):\n", | |
" self.data = {}\n", | |
" \n", | |
" def __setitem__(self, key, value):\n", | |
" self.data[key] = value\n", | |
"\n", | |
"\n", | |
"\n", | |
" def __getitem__(self, key):\n", | |
" return self.data[key]\n", | |
" \n", | |
" def __contains__(self, key):\n", | |
" return self.data.contains(key)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 105, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"m = MapDS()\n", | |
"m['batman'] = 'bruce wayne'\n", | |
"m['superman'] = 'clark kent'\n", | |
"m['spiderman'] = 'peter parker'" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 106, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'bruce wayne'" | |
] | |
}, | |
"execution_count": 106, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"m['batman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 107, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"m['batman'] = 'tony stark'" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 108, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'tony stark'" | |
] | |
}, | |
"execution_count": 108, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"m['batman']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"How do we make the leap from linear runtime complexity to constant?!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Direct lookups via *Hashing*" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Hashes (a.k.a. hash codes or hash values) are simply numerical values computed for objects." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"-954384285558724197" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"hash('hello')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"5465486877337622348" | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"hash('batman')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"8014717029909393586" | |
] | |
}, | |
"execution_count": 12, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"hash('batmen') " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 100, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[2429629120202328647,\n", | |
" 8372779892654583019,\n", | |
" -8906997482930836953,\n", | |
" 853381216711768263,\n", | |
" 2429629120202328647,\n", | |
" 7225739097362972930]" | |
] | |
}, | |
"execution_count": 100, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"[hash(s) for s in ['different', 'objects', 'have', 'very', 'different', 'hashes']]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 99, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[47, 19, 47, 63, 47, 30]" | |
] | |
}, | |
"execution_count": 99, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"[hash(s)%100 for s in ['different', 'objects', 'have', 'very', 'different', 'hashes']]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Hashtables" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 51, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class Hashtable:\n", | |
" def __init__(self, n_buckets):\n", | |
" self.buckets = [[]] * n_buckets\n", | |
" \n", | |
" def __setitem__(self, key, val):\n", | |
" h = hash(key)\n", | |
" bucket = self.buckets[h % len(self.buckets)]\n", | |
" for k in bucket:\n", | |
" if(k[0] == key):\n", | |
" k[1] = val\n", | |
" bucket.append([key,val])\n", | |
" \n", | |
" def __getitem__(self, key):\n", | |
" h = hash(key)\n", | |
" for k in self.buckets[h % len(self.buckets)]:\n", | |
" if(k[0] == key):\n", | |
" return k[1]\n", | |
" raise Exception(f\"key {key} not in hashtable\")\n", | |
"\n", | |
" def __contains__(self, key):\n", | |
" try:\n", | |
" _ = self[key]\n", | |
"\n", | |
" return True\n", | |
" except:\n", | |
" return False" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 52, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht = Hashtable(100)\n", | |
"ht['spiderman'] = 'peter parker'\n", | |
"ht['batman'] = 'bruce wayne'\n", | |
"ht['superman'] = 'clark kent'" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 53, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'peter parker'" | |
] | |
}, | |
"execution_count": 53, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ht['spiderman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 54, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'bruce wayne'" | |
] | |
}, | |
"execution_count": 54, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ht['batman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 56, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'clark kent'" | |
] | |
}, | |
"execution_count": 56, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ht['superman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 58, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'bob'" | |
] | |
}, | |
"execution_count": 58, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"ht['superman'] = 'bob'\n", | |
"ht['superman']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## On Collisions" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### The \"Birthday Problem\"" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Problem statement: Given $N$ people at a party, how likely is it that at least two people will have the same birthday?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"def birthday_p(n_people):\n", | |
" p_inv = 1\n", | |
" for n in range(365, 365-n_people, -1):\n", | |
" p_inv *= n / 365\n", | |
" return 1 - p_inv" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 60, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.008204165884781345" | |
] | |
}, | |
"execution_count": 60, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"birthday_p(3)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 61, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.008204165884781456" | |
] | |
}, | |
"execution_count": 61, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"1-364/365*363/365" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 64, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAgmklEQVR4nO3deXTU9b3/8ec7OwkQlgQEQtgRUFnD4m5xA63QWhdsa9FaqLdaW7WtO1Xb3621duEerbe0LnVD0bpQRbEq1lsXIOyENSCBREgCBELIPvP5/TGDjSmYASb5zvJ6nDNn5rskeZ3MzCvffOa7mHMOERGJfgleBxARkfBQoYuIxAgVuohIjFChi4jECBW6iEiMSPLqB2dlZbm+fft69eNFRKLSsmXLdjvnsg+3zLNC79u3L/n5+V79eBGRqGRmRUdapiEXEZEYoUIXEYkRKnQRkRihQhcRiREqdBGRGNFioZvZ42ZWZmZrj7DczOx/zKzQzFab2ejwxxQRkZaEsoX+JDDpS5ZPBgYFbzOBR48/loiIHK0W90N3zn1gZn2/ZJWpwFMucB7eT8ysk5n1cM7tDFdIEYkPPr+jween3uenodFPgy8w3eDzB5c5fH5Ho98fvHef3/uDj30u+NgFpp0Df7PHfgeOwD3Bab/793ICs3G44P0XpwPL/73s0PoQWOeL000EZ547tDsjencK++8vHAcW9QJ2NJkuDs77j0I3s5kEtuLJzc0Nw48Wkbbm8zsqaxqorG3gQG3j5/dVtY0crG/kYJ2Pg3WBxzX1PqrrfdQ0+Kip91Hb4KO20Udtg5+6Rh91DYHyPnTv88f+9RnMoFvHtIgt9JA55+YAcwDy8vJi/5kTiQLOOfYerGdXZS3lB+ooP1DH7qp6dlfVsfdgPRXV9VQcrGdvdT37qgPl3ZIEg4yUJNqlJJKekkhaciLtUhJJS0qkQ1oSacmBealJCaQmJZBy6JaYSHKSkZKYQHLwlpRoJCda4HGCkZgQuE9KNBITjKSEBBITIMEC04kJ9vnjQ/cGgXsL3mMkGGCBrzMC9wlmYIHSNcCCywLTga+n2fSh9Qh8u8+X02R+WwlHoZcAvZtM5wTniUgE8Pkdn+2roWhPNdv3VlOyr5qSihpK9tWwc38tZZV11Pv8//F17ZIT6do+hS4ZKXROT6FfVgad0lPIbJf8+a1DWhId0g7dJ5GRmkT71CRSkxLavMwkPIU+H7jRzJ4HxgP7NX4u0vYO1DawqbSKLWVVbCmvorCsik93H2RHRTUNvn//Q5yYYJzQMY1enduR16cz3TPTOKFj4NatYypZ7QO3jFTPTvUkx6jFZ8zM5gLnAFlmVgz8HEgGcM79L7AAuAgoBKqBa1srrIgElFbWsmrHPtaW7Gf9rgNs2FXJjr01ny9PSUqgf1YGQ3p04MKTT6Bv13Ryu2SQ2zWd7h1SSUrUISixKJS9XK5qYbkDbghbIhH5grpGH2uK95NfVMGyogpWF++jtLIOCIxV989uz4icTkwbm8uJ3TswqHt7cjqnk5igIY94o/+pRCJMfaOfFdsr+HDLHj7esptVxfupbwyMcffLyuC0AVmc0iuTEb0zGdYjk3YpiR4nlkihQheJADv2VvPehjIWbSxj8da91DT4SDA4pVcm15zWlzF9OjOmT2ey2qd6HVUimApdxAPOOVYX7+fNtbt4d30pm8uqgMAW+BV5OZw2MIsJ/buS2S7Z46QSTVToIm3kUIm/sWYnC9bspLiihqQEY3z/Lkwbl8vEId3ol5XhdUyJYip0kVb22b4aXllRwsvLi9lSfpDkROOMgVncdO4gLhjWnU7pKV5HlBihQhdpBY0+P++sL+PZxUX8q3A3zsHYvp353pn9uejkHmSmayhFwk+FLhJGZZW1PLdkO88v2cGuylp6ZKbxw4mD+MboXvTpquEUaV0qdJEw2FR6gD9/sJVXV5bQ4HOcNTib+6eexMQh3XQQj7QZFbrIcVi6bS+PLCrk/Y3lpCUncNW4XL57ej/66sNN8YAKXeQYLN22lz+8s4kPC/eQ1T6FW88fzLcn9KFzhj7gFO+o0EWOwortFTz09sZgkady98VD+db4PjpaUyKCCl0kBEV7DvLgwo28sXonWe1TVOQSkVToIl9if3UDs9/dzNOfbCMpIYGbzh3EzLP6016nlpUIpFelyGH4/Y6XlhfzwJsb2FddzxV5vbn5/MF075jmdTSRI1KhizRT8Nl+Zr1WwLKiCsb06cz9U8dxUs9Mr2OJtEiFLhJU2+Dj9+9s4s8fbKVzegq/uWw43xidQ4LOKy5RQoUuAiwr2stPX1rN1vKDTBvbmzsmD9Xh+RJ1VOgS12obfPxm4UYe//BTema245nrxnPGoCyvY4kcExW6xK0Nuyq5ae4KNpVWcfWEPtw2eYj2XpGoplevxB3nHE9+tI1fvbmBjmnJPPXdcZw1ONvrWCLHTYUucaXiYD23vriK9zaUMXFINx68bLgu6yYxQ4UucWN18T7+65nllB+o495LhjH9tL6YaQ8WiR0qdIl5zjnmLtnBvfMLyO6QyovXn8qI3p28jiUSdip0iWl1jT7ufmUtLy4r5sxBWcyeNoouOiOixCgVusSsPVV1fP/pZeQXVfDDiQP58XmDSdRBQhLDVOgSkzbuOsB1f11K+YE6Hv7mKL46vKfXkURanQpdYs6iDWX8cO4K0lMSmfd9jZdL/FChS0x5Yel27nxlLUNO6MBfpufRI7Od15FE2owKXWKCc45HFhXy0NubOGtwNo9+azQZOupT4oxe8RL1fH7HfX8v4KmPi/j6qF78+hvDSUlK8DqWSJtToUtUq2/0c/MLK3ljzU5mntWf2ycN0eluJW6FtBljZpPMbKOZFZrZ7YdZnmtmi8xshZmtNrOLwh9V5ItqG3z84NllvLFmJ3deNIQ7LxqqMpe41mKhm1ki8AgwGRgGXGVmw5qtdjcwzzk3CpgG/DHcQUWaqqn3MeOpfN5ZX8YvvnYyM88a4HUkEc+FsoU+Dih0zm11ztUDzwNTm63jgI7Bx5nAZ+GLKPJFB+saufbJJfyrcDcPfmM4V0/o43UkkYgQyhh6L2BHk+liYHyzde4F3jazHwIZwHmH+0ZmNhOYCZCbm3u0WUU4WNfINU8sYfn2ffzhypFMHdnL60giESNcuwJcBTzpnMsBLgKeNrP/+N7OuTnOuTznXF52ts4/LUentsHH9/6az7KiCmZPU5mLNBdKoZcAvZtM5wTnNXUdMA/AOfcxkAboOl4SNrUNPmY+vYxPPt3Db68YoUP5RQ4jlEJfCgwys35mlkLgQ8/5zdbZDpwLYGZDCRR6eTiDSvyqb/Rz43PL+WBTOb++dDhfH5XjdSSRiNRioTvnGoEbgYXAegJ7sxSY2f1mNiW42q3ADDNbBcwFrnHOudYKLfHD53fcPG/l53uzXDG2d8tfJBKnQjqwyDm3AFjQbN6sJo/XAaeHN5rEO+cc984v4I3VO7lj8hDtzSLSAh0fLRFr9rubefqTIr5/Vn++f7b2MxdpiQpdItLTnxTxh3c2c9mYHG6fPMTrOCJRQYUuEefNNTuZ9dpazh3SjQcuPUUXchYJkQpdIsry7RX8+IWVjOrdiYe/OZqkRL1ERUKld4tEjO17qpnx13y6d0zjz9/Jo11KoteRRKKKCl0iwv7qBq59cgmNfscT146la/tUryOJRB0VuniuvtHP95/JZ/veauZcPYYB2e29jiQSlXSBC/GUc457Xl3LJ1v38vsrRzC+f1evI4lELW2hi6f++tE2XsjfwY1fGahD+kWOkwpdPPNh4W5+8cZ6zh/WnVvOH+x1HJGop0IXTxTtOcgPnl3OgOwMfn/lSF06TiQMVOjS5qrqGpnxVD5m8Ofv5NE+VR/liISD3knSppxz/PTFVWwpP8hT3x1Hn64ZXkcSiRnaQpc29Zf/+5Q31+7itkkncvpAXQNFJJxU6NJmFm/dwwNvbWDyyScw48z+XscRiTkqdGkTpZW13PDcCvp0TefBy4brhFsirUBj6NLqGnx+bnh2OdX1jTw3Yzwd0pK9jiQSk1To0uoeWriR/KIK/ueqUQzu3sHrOCIxS0Mu0qre21DKnz7Yyrcn5DJlRE+v44jENBW6tJqd+2u4dd4qhvboyN0XD/M6jkjMU6FLq2j0+blp7grqG/088s1RpCXr3OYirU1j6NIqfv/OJpZuq2D2tJH01+lwRdqEttAl7D4s3M0f39/ClXm9mTqyl9dxROKGCl3CquJgPbfMW0n/rAzunXKS13FE4oqGXCRsnHPc9rfV7D1Yz2PTx+qaoCJtTFvoEjZzl+zg7XWl/OzCIZzcK9PrOCJxR4UuYVFYVsX9rxdwxsAsrjujn9dxROKSCl2OW32jnx89v4J2yYn89ooRuliFiEc0hi7Hbfa7myj4rJI/XT2G7h3TvI4jEre0hS7HZVlRBY++v4XLx+Rw4UkneB1HJK6FVOhmNsnMNppZoZndfoR1rjCzdWZWYGbPhTemRKLq+kZunbeSHpntmHWJDu0X8VqLQy5mlgg8ApwPFANLzWy+c25dk3UGAXcApzvnKsysW2sFlsjx3wvWU7S3mrkzJuiUuCIRIJQt9HFAoXNuq3OuHngemNpsnRnAI865CgDnXFl4Y0qkeX9jGc98sp3vndGPCf27eh1HRAit0HsBO5pMFwfnNTUYGGxmH5rZJ2Y26XDfyMxmmlm+meWXl5cfW2Lx3P7qBm7722oGd2/PrRec6HUcEQkK14eiScAg4BzgKuDPZtap+UrOuTnOuTznXF52dnaYfrS0tfteL2B3VT2/vXykzqIoEkFCKfQSoHeT6ZzgvKaKgfnOuQbn3KfAJgIFLzHmnXWlvLy8hB+cM4BTcnQ0qEgkCaXQlwKDzKyfmaUA04D5zdZ5lcDWOWaWRWAIZmv4Ykok2F/dwJ2vrGHICR344UT9vRaJNC0WunOuEbgRWAisB+Y55wrM7H4zmxJcbSGwx8zWAYuAnzrn9rRWaPHGfX8vYM/Beh66fAQpSTqEQSTShHSkqHNuAbCg2bxZTR474JbgTWLQP9aV8vKKEm6aOFAn3hKJUNrMkhY1HWq5UUMtIhFL53KRFv2/BevYe7CeJ64Zq6EWkQimd6d8qX9t3s28/GJmnNlfQy0iEU6FLkd0sK6R219eTf+sDH58noZaRCKdhlzkiB56eyPFFTXM+/6pOoBIJApoC10Oa1nRXp78aBvfObUP4/p18TqOiIRAhS7/oa7Rx21/W0PPzHb8bNIQr+OISIg05CL/4dH3t1BYVsUT14ylfapeIiLRQlvo8gWFZQf446ItXDKiJ18ZotPai0QTFbp8zu933PHyGtqlJDLrq7oCkUi0UaHL5+Yu3c7SbRXcdfFQsjukeh1HRI6SCl0AKK2s5YEFGzi1f1cuH5PjdRwROQYqdAECZ1Ks8/n570tPwcy8jiMix0CFLry7vpQFa3Zx08SB9MvK8DqOiBwjFXqcq65vZNZrBQzq1p6ZZw3wOo6IHAftZBzn/vDOZkr21fDi9afqTIoiUU7v4Di27rNKHvvXp1w1rjdj++rwfpFop0KPUz6/485X1tA5PZnbdHi/SExQocep5xYXsXLHPu6+eBid0lO8jiMiYaBCj0NlB2p58K2NnD6wK1NH9vQ6joiEiQo9Dv3y9fXU+fz88mva51wklqjQ48z/bS5n/qrP+ME5A7TPuUiMUaHHkdoGH/e8upZ+WRlcf7b2OReJNdoPPY48+v4Wtu2p5pnrxuuSciIxSFvocWJreRWPvr+FqSN7csagLK/jiEgrUKHHAeccs14rIDU5gbsuHup1HBFpJSr0OPD31Tv5V+FufnrhiXTrkOZ1HBFpJSr0GFdZ28AvXl/HKb0y+db4Pl7HEZFWpA9FY9zv3t7E7qo6HpueR2KC9jkXiWXaQo9ha0v289TH2/j2+D4Mz+nkdRwRaWUq9Bjl9zvuenUtXTJS+MmFJ3odR0TaQEiFbmaTzGyjmRWa2e1fst43zMyZWV74IsqxmLt0O6t27OPOi4aS2S7Z6zgi0gZaLHQzSwQeASYDw4CrzGzYYdbrAPwIWBzukHJ0dlfV8eBbG5nQvwtfH9XL6zgi0kZC2UIfBxQ657Y65+qB54Gph1nvF8Cvgdow5pNj8MCbGzhY18gvv3ayTr4lEkdCKfRewI4m08XBeZ8zs9FAb+fcG1/2jcxsppnlm1l+eXn5UYeVli35dC8vLStmxln9Gditg9dxRKQNHfeHomaWAPwOuLWldZ1zc5xzec65vOzs7OP90dJMg8/P3a+uoVendvxw4kCv44hIGwul0EuA3k2mc4LzDukAnAy8b2bbgAnAfH0w2vYe/9enbCqt4t4pJ5GeokMMROJNKIW+FBhkZv3MLAWYBsw/tNA5t985l+Wc6+uc6wt8AkxxzuW3SmI5rM/21fCHdzZz3tBunD+su9dxRMQDLRa6c64RuBFYCKwH5jnnCszsfjOb0toBJTT3/b0Ah+Pnl5zkdRQR8UhI/5c75xYAC5rNm3WEdc85/lhyNN7bUMrCglJ+euGJ9O6S7nUcEfGIjhSNcjX1Pma9VsDAbu2ZcWZ/r+OIiIf0yVmUe3jRZoorapg7YwIpSfr7LBLP1ABRrLDsAHM+2Mqlo3px6oCuXscREY+p0KOUc457Xi2gXXIid+oqRCKCCj1qvbqyhI+37uFnk4aQ1T7V6zgiEgFU6FFof3UDv3x9PaNyO/HNcblexxGRCKEPRaPQA29tYF9NA09/7RQSdBUiEQnSFnqUWVa0l7lLtnPtaX0Z1rOj13FEJIKo0KNIg8/PXa+spUdmGjefP9jrOCISYTTkEkWe+PBTNuw6wJ+uHkNGqp46EfkibaFHieKKan7/j8DJty7QybdE5DBU6FHAOces1woAuHfKSboKkYgclgo9CixYs4v3NpRx6wWDyemsk2+JyOGp0CPc/poG7v17ASf36sg1p/X1Oo6IRDB9shbhHnxrA3uq6nh8+liSEvX3V0SOTA0RwZYV7eXZxdu55rR+nJKT6XUcEYlwKvQIVd/o546X19AzM41bL9A+5yLSMg25RKj//ecWNpVW8dj0PO1zLiIh0RZ6BCosO8DD7xVyyYienDtU+5yLSGhU6BHG73fc9rc1pKcm8vNLhnkdR0SiiAo9wjyzuIhlRRXcc/EwnedcRI6KCj2ClOyr4ddvbuDMQVlcOrqX13FEJMqo0COEc467X1mD38F/f/0UHd4vIkdNhR4hXl5ewqKN5fzkwhPp3UWH94vI0VOhR4Cyylru+3sBeX066/B+ETlmKnSPOee485W11DX6efCy4STqknIicoxU6B6bv+oz3llfyk8uOJH+2e29jiMiUUyF7qGyA7X8fH4Bo3I78d0z+nkdR0SinArdI8457nl1LdX1Pn5z2QgNtYjIcVOhe+Tl5SUsLCjl1vMHM7CbhlpE5PiFVOhmNsnMNppZoZndfpjlt5jZOjNbbWbvmlmf8EeNHSX7arh3fgHj+nbhe2f29zqOiMSIFgvdzBKBR4DJwDDgKjNrfpKRFUCec2448BLwYLiDxgq/3/HTF1fhd46HLtdQi4iETyhb6OOAQufcVudcPfA8MLXpCs65Rc656uDkJ0BOeGPGjr9+vI2Ptuzhnq8OI7erDiASkfAJpdB7ATuaTBcH5x3JdcCbh1tgZjPNLN/M8svLy0NPGSMKyw7wwJsbOHdIN64c29vrOCISY8L6oaiZfRvIA35zuOXOuTnOuTznXF52dnY4f3TEq2/08+MXVpKeksivvqFztYhI+IVyKZwSoOnmZE5w3heY2XnAXcDZzrm68MSLHQ+9vZG1JZXMuXoM3TqkeR1HRGJQKFvoS4FBZtbPzFKAacD8piuY2SjgT8AU51xZ+GNGtw82lTPng618e0IuF5x0gtdxRCRGtVjozrlG4EZgIbAemOecKzCz+81sSnC13wDtgRfNbKWZzT/Ct4s7u6vquGXeKgZ1a89dF+kKRCLSekK6+rBzbgGwoNm8WU0enxfmXDHBOcfPXlpNZW0DT183jnYpiV5HEpEYpiNFW9ETH27jvQ1l3Dl5CEN7dPQ6jojEOBV6K1mxvYJfvbme84Z2Y7rOcS4ibUCF3goqDtZzw7PL6d4xjd9ePlK7KIpImwhpDF1C5/c7bp63kt1V9bz0X6eSmZ7sdSQRiRPaQg+zR/+5hfc3lnPPJcMYntPJ6zgiEkdU6GH0UeFufvv2RqaM6Mm3x+d6HUdE4owKPUx27K3mhueWMyC7Pb+6VIf2i0jbU6GHwcG6RmY8lY/fwV+m55GRqo8mRKTtqdCPk9/vuGXeSjaVHuCRb46mT9cMryOJSJxSoR+n2e9uZmFBKXddPIwzBmV5HUdE4pgK/Ti8vvozZr+7mcvG5PDd0/t6HUdE4pwK/Rgt3rqHW15Yxdi+nfnl107Wh6Ai4jkV+jHYVHqAGU/lk9s1nT9/J4+0ZJ10S0S8p0I/SqWVtVzz+BJSkxN58tqxdEpP8TqSiAigQj8qlbUNXPPEUvbXNPDktWPJ6ayLPItI5NAO0yGqqmvkmseXUFh2gMemj+WknpleRxIR+QIVegiq6xv57pNLWVW8n0e+OZqzBsfXBa5FJDpoyKUFtQ0+ZjyVT/62vcyeNpJJJ+uaoCISmbSF/iVqG3xc/8wyPtqyh99dMYKvDu/pdSQRkSNSoR9BZW0D3/trPku37eWBS0/h66NyvI4kIvKlVOiHUX6gjumPL2Fz2QFmTxvFlBHaMheRyKdCb2b7nmqufnwxZZV1PDZ9rD4AFZGooUJvYlnRXq5/ZjkNPj/PzRjPqNzOXkcSEQmZ9nIJenZxEdPmfEJ6SiIvXX+qylxEok7cb6HXNfq4d34Bc5fs4OzB2fzPtFG6sLOIRKW4LvSt5VXcPG8Vq3bs4wfnDODWC04kMUFnTRSR6BSXhe73O574aBsPvrWBtOREHv3WaCaf0sPrWCIixyXuCn37nmp+8tIqlny6l4lDuvGrS0+he8c0r2OJiBy3uCn0/dUNPPJ+IU9+tI3UxAR+c9lwLhuTowtTiEjMiPlCr23w8fTHRTy8qJDK2gYuHZXDTy4cTI/Mdl5HExEJq5gt9M/21TB3yXbmLtnB7qo6zh6cze2ThzC0R0evo4mItIqQCt3MJgGzgUTgL865B5otTwWeAsYAe4ArnXPbwhu1ZftrGviocDcvryjh3fWlOGDiid247ox+nDYwq63jiIi0qRYL3cwSgUeA84FiYKmZzXfOrWuy2nVAhXNuoJlNA34NXNkagQ+pb/RTXFFN0Z5q1pTs55+bylm5Yx8+v6NrRgrXnz2Aq8bl0ruLriokIvEhlC30cUChc24rgJk9D0wFmhb6VODe4OOXgIfNzJxzLoxZAXh+yXYeXlTIZ/tq8Ae/uxkM75XJf509gLMGZzMqtxPJiToIVkTiSyiF3gvY0WS6GBh/pHWcc41mth/oCuxuupKZzQRmAuTm5h5T4Kz2qYzp05lLR+fQp0s6fbPSGZDdXhdrFpG416Yfijrn5gBzAPLy8o5p6/28Yd05b1j3sOYSEYkFoYxLlAC9m0znBOcddh0zSwIyCXw4KiIibSSUQl8KDDKzfmaWAkwD5jdbZz4wPfj4MuC91hg/FxGRI2txyCU4Jn4jsJDAbouPO+cKzOx+IN85Nx94DHjazAqBvQRKX0RE2lBIY+jOuQXAgmbzZjV5XAtcHt5oIiJyNLRvn4hIjFChi4jECBW6iEiMUKGLiMQI82rvQjMrB4pCXD2LZkedRhBlOzbKdmwiORtEdr5YydbHOZd9uAWeFfrRMLN851ye1zkOR9mOjbIdm0jOBpGdLx6yachFRCRGqNBFRGJEtBT6HK8DfAllOzbKdmwiORtEdr6YzxYVY+giItKyaNlCFxGRFqjQRURiRMQXuplNMrONZlZoZrd7nOVxMyszs7VN5nUxs3+Y2ebgfWePsvU2s0Vmts7MCszsR5GSz8zSzGyJma0KZrsvOL+fmS0OPrcvBE/P7AkzSzSzFWb2eiRlM7NtZrbGzFaaWX5wnufPaTBHJzN7ycw2mNl6Mzs1ErKZ2YnB39ehW6WZ/TgSsgXz3Rx8H6w1s7nB90dYXm8RXehNLlA9GRgGXGVmwzyM9CQwqdm824F3nXODgHeD015oBG51zg0DJgA3BH9XkZCvDpjonBsBjAQmmdkEAhcT/71zbiBQQeBi4175EbC+yXQkZfuKc25kk/2UI+E5BZgNvOWcGwKMIPD78zybc25j8Pc1EhgDVAOvREI2M+sF3ATkOedOJnBK8mmE6/XmnIvYG3AqsLDJ9B3AHR5n6gusbTK9EegRfNwD2Oj17y2Y5TXg/EjLB6QDywlcl3Y3kHS457qNM+UQeINPBF4HLIKybQOyms3z/DklcFWyTwnuWBFJ2ZrluQD4MFKy8e/rL3chcPry14ELw/V6i+gtdA5/gepeHmU5ku7OuZ3Bx7sAzy94amZ9gVHAYiIkX3BIYyVQBvwD2ALsc841Blfx8rn9A/AzwB+c7krkZHPA22a2LHiRdYiM57QfUA48ERyq+ouZZURItqamAXODjz3P5pwrAR4CtgM7gf3AMsL0eov0Qo8qLvDn1dP9QM2sPfA34MfOucqmy7zM55zzucC/wDnAOGCIFzmaM7OvAmXOuWVeZzmCM5xzowkMO95gZmc1Xejhc5oEjAYedc6NAg7SbAjD6/dDcBx6CvBi82VeZQuO208l8AexJ5DBfw7jHrNIL/RQLlDttVIz6wEQvC/zKoiZJRMo82edcy9HWj4A59w+YBGBfys7BS8qDt49t6cDU8xsG/A8gWGX2RGS7dAWHc65MgLjwOOIjOe0GCh2zi0OTr9EoOAjIdshk4HlzrnS4HQkZDsP+NQ5V+6cawBeJvAaDMvrLdILPZQLVHut6QWypxMYu25zZmYEru263jn3uyaLPM9nZtlm1in4uB2Bsf31BIr9Mi+zOefucM7lOOf6Enh9veec+1YkZDOzDDPrcOgxgfHgtUTAc+qc2wXsMLMTg7POBdZFQrYmruLfwy0QGdm2AxPMLD34nj30ewvP683LDyxC/BDhImATgTHXuzzOMpfAuFcDgS2U6wiMt74LbAbeAbp4lO0MAv9CrgZWBm8XRUI+YDiwIphtLTArOL8/sAQoJPBvcarHz+85wOuRki2YYVXwVnDo9R8Jz2kwx0ggP/i8vgp0jqBsGcAeILPJvEjJdh+wIfheeBpIDdfrTYf+i4jEiEgfchERkRCp0EVEYoQKXUQkRqjQRURihApdRCRGqNBFRGKECl1EJEb8f858/h5PVo2vAAAAAElFTkSuQmCC\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"n_people = range(1, 80)\n", | |
"plt.plot(n_people, [birthday_p(n) for n in n_people]);" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### General collision statistics" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Repeat the birthday problem, but with a given number of values and \"buckets\" that are allotted to hold them. How likely is it that two or more values will map to the same bucket?" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 65, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"def collision_p(n_values, n_buckets):\n", | |
" p_inv = 1\n", | |
" for n in range(n_buckets, n_buckets-n_values, -1):\n", | |
" p_inv *= n / n_buckets\n", | |
" return 1 - p_inv" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 66, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.5072972343239857" | |
] | |
}, | |
"execution_count": 66, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"collision_p(23, 365) # same as birthday problem, for 23 people" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 67, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.37184349044470544" | |
] | |
}, | |
"execution_count": 67, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"collision_p(10, 100)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 68, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"0.9940410733677595" | |
] | |
}, | |
"execution_count": 68, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"collision_p(100, 1000)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 69, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD4CAYAAADiry33AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfDUlEQVR4nO3deXgc9Z3n8fdX3a2rdR+WZUu2bJAvMASjOBAgkAATwyRmNpud4BzkYCCZTGZz8OwueTI7k7A7u88kk3kyJORggE3IJByTZIOXkOUKWWICBpnD2DLCQsa25EuXJevuln77R5dE20i2LLdUUvXn9Tz9qOpXv1Z9yyV/uvpX1dXmnENERIIrw+8CRERkZinoRUQCTkEvIhJwCnoRkYBT0IuIBFzYrxWXlZW5mpoav1YvIjIvbdu2rd05V346z/Et6Gtqaqivr/dr9SIi85KZ7T3d52joRkQk4BT0IiIBp6AXEQk4Bb2ISMAp6EVEAu6UQW9m95jZETPbMclyM7PbzazJzLab2brUlykiItM1lSP6HwMbTrL8GqDWe9wM/ODMyxIRkVQ5ZdA7554GOk/S5TrgXpfwHFBkZpWpKvBEb7T18u3HGhmMjczUKkREAiUVY/SLgf1J8y1e29uY2c1mVm9m9W1tbdNa2RMNh/nu75r409v/wLa9J3v9ERERmOWTsc65O51zdc65uvLy0/oE77jPXn4WP/nMegZjo3z4h8/yPx7Zhb48RURkcqkI+lagOmm+ymubMZevKOfRL7+Hf/eOxdz5dDONh4/N5OpEROa1VAT9ZuAG7+qbi4Bu59zBFPzek8rLCvPZy88C4LWDCnoRkclM5fLK+4BngZVm1mJmN5rZ58zsc16XR4BmoAn4F+DzM1btCZaXR8kMZbDrYM9srVJEZN455d0rnXObTrHcAX+VsopOQySUwdkL8th1SEf0IiKTmfefjF1dWaAjehGRkwhA0OfTdmyI9t4hv0sREZmTAhD0BYBOyIqITGbeB/2qhfkAvHZIwzciIhOZ90FfmpfFgvwsGjROLyIyoXkf9ACrKgs0dCMiMolABP3qhfk0HeklNjLqdykiInNOMIK+soDhkVGa2/r8LkVEZM4JRNCvqtQJWRGRyQQi6M8qzyMSMp2QFRGZQCCCPnErhHydkBURmUAggh4Sn5DVrRBERN4uOEG/sIAjx4bo0K0QRESOE5igX+F9QrbpSK/PlYiIzC2BCfqlJbkA7O3s97kSEZG5JTBBv7g4h1CGsa9DQS8ikiwwQR8JZbCoKFtH9CIiJwhM0AMsLYmyr0OfjhURSRasoC/N5U0N3YiIHCdwQd89EKO7P+Z3KSIic0aggn5JSRSAvZ0avhERGROooF9a6l1iqeEbEZFxgQr6Jd619Pt05Y2IyLhABX00K0xZXhZ7deWNiMi4QAU9JIZvNHQjIvKW4AV9Sa6GbkREkgQv6EujHOoZZDA24ncpIiJzQgCDPhfnoKVLR/UiIhDAoF+iSyxFRI4TuKAfv12xgl5EBAhg0JdEM8nLCusSSxERT+CC3sxYUpKr2xWLiHgCF/SQOCGrLyAREUmYUtCb2QYzazSzJjO7dYLlS8zsKTN7ycy2m9m1qS916paU5rK/q5+RUednGSIic8Ipg97MQsAdwDXAGmCTma05odvfAA865y4Arge+n+pCT8fSkiixEcfB7gE/yxARmROmckS/HmhyzjU754aB+4HrTujjgAJvuhA4kLoST1+NLrEUERk3laBfDOxPmm/x2pJ9Hfi4mbUAjwB/PdEvMrObzazezOrb2tqmUe7ULCtP3Je+uV1X3oiIpOpk7Cbgx865KuBa4Kdm9rbf7Zy70zlX55yrKy8vT9Gq325hQTa5mSGa23pnbB0iIvPFVIK+FahOmq/y2pLdCDwI4Jx7FsgGylJR4HSYGcvKojS36YheRGQqQf8CUGtmy8wsk8TJ1s0n9NkHXAlgZqtJBP3Mjc1MwfLyPJrbdUQvInLKoHfOxYEvAI8Cu0hcXbPTzG4zs41et1uAm8zsFeA+4FPOOV+vbVxeFqWla0B3sRSRtBeeSifn3CMkTrImt/1t0nQDcElqSzszy8ujOJe48mblwny/yxER8U0gPxkLcFZ5HoBOyIpI2gts0C8rS1xi+YaCXkTSXGCDPpoVZmFBtq68EZG0F9igh8Q4/Rv60JSIpLnAB31zWy8+XwAkIuKrYAd9WR7HBuO09w77XYqIiG+CHfRj97zRCVkRSWOBDvrxSyw1Ti8iaSzQQb+oKIfMcIaO6EUkrQU66EMZxnLd3ExE0lyggx68K280dCMiaSz4QV+Wx77Ofobjo36XIiLii+AHfXmUkVHHvk59raCIpKc0CPrElTe6542IpKvAB33tgjzMoPHQMb9LERHxReCDPpoVZmlJLrsO9vhdioiILwIf9ACrKwtoUNCLSJpKm6Df29FP71Dc71JERGZd2gQ9QOMhHdWLSPpJk6BPfGdsw0GdkBWR9JMWQb+4KIeC7LBOyIpIWkqLoDczVlUWKOhFJC2lRdADrKksoPHQMUZH9W1TIpJe0iro+4dH2KtbIYhImkmboB+78kbDNyKSbtIm6Gsr8ghlmIJeRNJO2gR9diTE8rKogl5E0k7aBD0khm926Vp6EUkzaRf0rUcH6O6P+V2KiMisSbOgT3xCdpduhSAiaSStgn7NosSVNztau32uRERk9qRV0C/Iz2ZxUQ4v7TvqdykiIrNmSkFvZhvMrNHMmszs1kn6/LmZNZjZTjP7eWrLTJ0Llxbz4r4uv8sQEZk1pwx6MwsBdwDXAGuATWa25oQ+tcBXgUucc+cAX0p9qamxbkkRB7sHOXB0wO9SRERmxVSO6NcDTc65ZufcMHA/cN0JfW4C7nDOdQE4546ktszUuXBpCQDb9uqoXkTSw1SCfjGwP2m+xWtLtgJYYWbPmNlzZrZhol9kZjebWb2Z1be1tU2v4jO0qjKf7EiGhm9EJG2k6mRsGKgFrgA2Af9iZkUndnLO3emcq3PO1ZWXl6do1acnEsrg/KoiXtQRvYikiakEfStQnTRf5bUlawE2O+dizrk9wOskgn9OWre0mJ0HehiMjfhdiojIjJtK0L8A1JrZMjPLBK4HNp/Q59ckjuYxszISQznNqSsztS5cUkx81LG9RdfTi0jwnTLonXNx4AvAo8Au4EHn3E4zu83MNnrdHgU6zKwBeAr4T865jpkq+kytW1oM6ISsiKSH8FQ6OeceAR45oe1vk6Yd8BXvMeeVRDNZVhbVCVkRSQtp9cnYZOuWFPPi3i4Sr1EiIsGVvkG/tIiOvmH2duirBUUk2NI26C/UOL2IpIm0DfraBfkU5kR4rnnOnjMWEUmJtA36UIZxydmlbGlq1zi9iARa2gY9wGW15RzsHuSNtl6/SxERmTFpHfSXnl0GwNOvt/tciYjIzEnroK8uyWV5WZQ/7PbnBmsiIrMhrYMe4LLaMp5r7mQorvveiEgwKehryxmIjegySxEJrLQP+ovOKiWcYfxht8bpRSSY0j7o87LCrFtarHF6EQmstA96gMvOLmNHaw8dvUN+lyIiknIKeuCyFYlvu9rSpOEbEQkeBT2wdnEhxbkRnnptzn6nuYjItCnoSdwO4eo1FTy564gusxSRwFHQe65ZW8mxoTjPaPhGRAJGQe+55KwyCrLD/Gb7Ib9LERFJKQW9JzOcwdVrFvJ4wyGG46N+lyMikjIK+iTXrl1Iz2CcP76h4RsRCQ4FfZJLa8vIywrz21c1fCMiwaGgT5IVDnHV6gU82nCI2IiGb0QkGBT0J7hmbSVH+2Nsbe70uxQRkZRQ0J/g8hXlRDNDPLz9gN+liIikhIL+BNmRENesreTh7QfpH477XY6IyBlT0E/g+ndW0zsU5zfbD/pdiojIGVPQT+DCpcWcVR7lgRf2+12KiMgZU9BPwMz4yDurqd/bRdORY36XIyJyRhT0k/jQuirCGaajehGZ9xT0kyjLy+LqNRX88sVW3RJBROY1Bf1JfOSd1XT2DfPErsN+lyIiMm0K+pO4rLacRYXZ/GzrXr9LERGZNgX9SYQyjBveXcMzTR3saO32uxwRkWmZUtCb2QYzazSzJjO79ST9/r2ZOTOrS12J/vrou5aQlxXmzqeb/S5FRGRaThn0ZhYC7gCuAdYAm8xszQT98oEvAltTXaSfCrIjbFpfzW9ePcj+zn6/yxEROW1TOaJfDzQ555qdc8PA/cB1E/T7b8A/AIMprG9O+PQlyzDg7i17/C5FROS0TSXoFwPJF5O3eG3jzGwdUO2c+83JfpGZ3Wxm9WZW39bWdtrF+mVRUQ4bz1/EAy/sp6tv2O9yREROyxmfjDWzDOCfgFtO1dc5d6dzrs45V1deXn6mq55VN1++nIHYCP/6nK7AEZH5ZSpB3wpUJ81XeW1j8oFzgd+b2ZvARcDmIJ2QBVi1sIArVpZzzzN76BmM+V2OiMiUTSXoXwBqzWyZmWUC1wObxxY657qdc2XOuRrnXA3wHLDROVc/IxX76JarV9LVH+MuXYEjIvPIKYPeORcHvgA8CuwCHnTO7TSz28xs40wXOJesrSrkT8+r5K4te2g7NuR3OSIiUzKlMXrn3CPOuRXOubOcc3/vtf2tc27zBH2vCOLR/Jhbrl7BUHyU7/1ut9+liIhMiT4Ze5qWl+fxkXdW8/Pn97GvQ9fVi8jcp6Cfhi9eWUuGGd9+vNHvUkRETklBPw0VBdn8xWXLeOjlAzy/p9PvckRETkpBP01feG8ti4ty+Jtfv0psRPerF5G5S0E/TTmZIb6x8RxeP9zLPbo1gojMYQr6M3DVmgquWl3Bd57YTevRAb/LERGZkIL+DH19Y+JGnl/fvBPnnM/ViIi8nYL+DFUV5/Llq2t5vOEw//ul1lM/QURklinoU+DGS5ezvqaEv3top+5ZLyJzjoI+BUIZxrf//HwccMuDrzAyqiEcEZk7FPQpUl2Syzc2nsPzb3bqawdFZE5R0KfQh9Yt5tq1C/n2Y41s26sPUonI3KCgTyEz439+6DwWF+fwuX99kSM9gftWRRGZhxT0KVaYE+HOT9TROxjn8z97keG4PjUrIv5S0M+AlQvz+dZ/OI/6vV389980+F2OiKS5sN8FBNUHzlvE9pZu7ny6mWVlUT59yTK/SxKRNKWgn0H/ZcMq3mzv47aHG6goyObatZV+lyQiaUhDNzMolGHcvukC1i0p5ksPvMzW5g6/SxKRNKSgn2HZkRB33VBHVXEON91bz47Wbr9LEpE0o6CfBcXRTH7y6fXkZYX5+N1b2XlAYS8is0dBP0uqS3K5/+aLyYmE+NhdW2k40ON3SSKSJhT0s2hJaS7333yRF/bP8WqLjuxFZOYp6GfZ0tIo9910EbmZYa6/81m27G73uyQRCTgFvQ9qyqL86vPvprokl0//+Hk2v3LA75JEJMAU9D6pKMjmgc9ezAVLivmP973E93/fpG+oEpEZoaD3UWFOhHs/s54Pnr+Ib/7fRr54/8sMDI/4XZaIBIw+Geuz7EiI269/B6sr8/nWo400t/fyo0/Usbgox+/SRCQgdEQ/B5gZn7/ibO66oY432/u59p//wOMNh/0uS0QCQkE/h1y5uoKH//pSlpTkctO99dz2fxoYimsoR0TOjIJ+jqkpi/KLv7yYT727hnue2cN133tGH64SkTOioJ+DssIhvr7xHO7+ZB0dfcNcd8cWvvvkbuIj+hITETl9Cvo57MrVFTz2pfew4dxKvv3462z83jO8sv+o32WJyDwzpaA3sw1m1mhmTWZ26wTLv2JmDWa23cyeNLOlqS81PRVHM/nupgv4wcfW0dE3xJ99/xn+7qEd9AzG/C5NROaJUwa9mYWAO4BrgDXAJjNbc0K3l4A659x5wC+Ab6a60HR3zdpKnvjK5Xzy4hrufW4v7/vH33P/8/sYGdWHrETk5KZyRL8eaHLONTvnhoH7geuSOzjnnnLO9XuzzwFVqS1TAPKzI3x94zls/qtLqSmNcuuvXuUD393CM026X46ITG4qQb8Y2J803+K1TeZG4LcTLTCzm82s3szq29rapl6lHGdtVSH/9rmL+d5HL6BnIMbH7trKx+/aqvF7EZlQSk/GmtnHgTrgWxMtd87d6Zyrc87VlZeXp3LVacfM+MB5i3jylsv5rx9YQ8PBHq674xluurdetz8WkeNM5RYIrUB10nyV13YcM7sK+BpwuXNuKDXlyalkR0LceOkyPvLOau76QzN3b9nD4w2Hee/Kcr7wvlouXFrsd4ki4jM71R0TzSwMvA5cSSLgXwA+6pzbmdTnAhInYTc453ZPZcV1dXWuvr5+unXLJHoGY9z7xze5a8sejvbHqFtazE3vWc5VqysIZZjf5YnIGTKzbc65utN6zlRujWtm1wLfAULAPc65vzez24B659xmM3sCWAsc9J6yzzm38WS/U0E/s/qG4jxYv5+7t+yhpWuAmtJcPnFxDR++sIrCnIjf5YnINM1Y0M8EBf3siI+M8tsdh/jxH99k294uciIh/uyCRWxav4S1iwsx01G+yHyioJeT2tHazU+f3ctDr7QyGBtlTWUB16+v5oPnLaI4mul3eSIyBQp6mZKewRgPvXyA+5/fx84DPURCxvtWLeBD66q4YmU5WeGQ3yWKyCQU9HLadh7o5lcvtvLQy6209w6Tnx1mwzkL+eD5i7j4rFIiId0OSWQuUdDLtMVHRtnS1M7mVw7w2M7D9A7FKcyJcNXqCq45dyGX1paRHdGRvojfphP0+ipBASAcyuCKlQu4YuUCBmMj/L/X23h0xyEeazjEL19sIScS4rLaMq5aU8EVK8tZkJ/td8kiMkUKenmb7EiI95+zkPefs5Dh+CjPNnfw5K7DPNFwmMe8rzg8d3EB7125gPesKOcd1UUa4hGZwzR0I1PmnKPhYA+/b2zj941H2La3i1EHeVlhLlpeyiVnl/Lus8pYUZGnyzZFZojG6GVWdffHeLa5nad3t7Nldzv7OhM3MC2NZvKu5SWsrylh/bJSVi3MJ0OfyhVJCY3Ry6wqzI2w4dxKNpxbCUBLVz/PvtHBs290sHVPJ4+8egiA/OwwFywppm5p4nFedRF5WfrTE5ktOqKXGdPS1c/zezp54c0utu3t5PXDvQCYwYoF+byjuojzq4s4r6qQlQvzNc4vMgUaupE5rbs/xsstR3lpXxcv7jvK9pajHO1PfCViZjiD1QvzOXdxIecuLuScRQWsqMjXJZ0iJ1DQy7zinGNfZz+vtHTzastRdrT2sONAN8cG4wCEMozlZVFWVxawqjKfVQvzWbmwgEWF2TrZK2lLY/Qyr5gZS0ujLC2NsvH8RQCMjjr2d/XTcKCHhoM9NBzoYdveLja/cmD8eflZYWor8lhRkc/ZC/KorcjnrPIoiwpzdNJXZAI6opd5oXsgRuOhY7x+OPFoPHSMpiO9dPQNj/fJjmSwvCyP5eVRlpfnsbwsyrKyKDVlUd2aWQJDR/QSWIU5EdYvK2H9spLj2jt6h2g60ssbbX280dZL05Fetrd088irBxlNOoYpzo1QUxalpjTKkpJclpbmsqQk8SjPz9JQkASagl7mtdK8LErzsnjX8tLj2gdjI+zr7GdPex9vtvfxZkcfezsSVwH9+uVWkt/IZoUzqCrOobokl6riHKqKc1lclMPi4hyqinIoy8vSkJDMawp6CaTsSIgVFfmsqMh/27Kh+AitXQPs6+xnX2c/+zv72d85wP6ufl7e/9aVQGMiIaOyMIfKwmwWFSV+VhblUFmQzcLCxKMkN1MvBjJnKegl7WSFQ4kx/PK8CZcfG4zRenSAA0cHaO0aoOXoAAeODnLw6ADP7+nkUM8gI6PHn9uKhIwF+YnQryjIYkF+NhUF2SzIz2KBN1+en0VxbkTDRDLrFPQiJ8jPjrBqYYRVCwsmXD4y6ujoHeJg9yAHuwc41D3IoZ4hDvcMcrhnkMZDx3j69XZ6h+Jve24kZJTlZVGWl0V5fhZleZnj82X5WZRFM73hqEyKczP1he6SEgp6kdMUyjAWFGSzoCCb86uLJu3XPxzniPcC0NY7xJGeIY4cG6K9d4i2Y0Mc6h5kR2s3HX3Db3uHAIlPEBfnZlIS9R65mZTkeT+jmRRHIxTnZo73KcqNkJcV1jsGeRsFvcgMyc0MU1MWpqYsetJ+o6OO7oEY7b1DtPcO0947RGffMB29Q3T0DSem+4Zpauul681huvqHmeB1AUi8YyjMyaQ4N0JRboSi3EyKchLThTneIzfzrWnvUZAdJqxbUASWgl7EZxkZRnE0k+JoJrUVp+4/9sLQ1Z8I/c6+xPTR/mG6+mMc7R/maH+ibX9nPzsGYhztjzEQGznp741mhhKhnxOhIDtCfnbYmw6TnzSfnzzvTedlhcnNDOndxByloBeZZ5JfGE7HUHyE7oEY3f2xxM+kR89AnJ7BsekYPYMxDnQP0nj4GMcG4xwbjE36LmK8Lkt8N8FY8EezQuRlR8j3pqNZYW868cgbnw4RzUyazgoTzQzr/EQKKehF0kRWOMSC/NC0vgbSOUff8AjHBmPjwd8zGOfYYJxeb/7YYJzeofj48r7hON0DMVq6+ukbitM3NELfcJypfhg/O5JBNDNMrvdCkJsZItf7Gc0Kk5MZIjcSIjczRM748kSfnMwMciJvtWVH3lqWFc5Iu0thFfQickpmRp53FF5ZOP3fMzrq6I+N0DeUeFEYfwEYitM3nJjuH04sGxgeOa6tf3iE3qE47b1D9A+/1dY/fPIhqYnkRELkZIbIiYTIjmQkTSceOZG3lmVnhsgOj7VnjPfJjmSQFRlbdnz7WP+58qKioBeRWZOR8dYLxhROR0zJ6KhjMJ4I/AEv+PuH4+PTA7Gx9jgDsVFvPs7g+PQIg7FEv96hOG3HhsbnB4ZHGIyPMhwfnXZ9maEMssKJF4WscAbZkQy+dNUKPujdyG82KOhFZF7LyDBvSGfm4mxk1DEUH2EwNspgbMR7jDIYH2FweCTx01s2FB9lKJZ4gRjvN9YeT/wsyp3dm+wp6EVETiE0/mLidyXTowtnRUQCTkEvIhJwCnoRkYBT0IuIBJyCXkQk4BT0IiIBp6AXEQk4Bb2ISMCZm+odhlK9YrM2YO80n14GtKewnPkgHbcZ0nO703GbIT23ezrbvNQ5V346T/At6M+EmdU75+r8rmM2peM2Q3pudzpuM6Tnds/WNmvoRkQk4BT0IiIBN1+D/k6/C/BBOm4zpOd2p+M2Q3pu96xs87wcoxcRkambr0f0IiIyRQp6EZGAm1dBb2YbzKzRzJrM7Fa/65kOM6s2s6fMrMHMdprZF732EjN73Mx2ez+LvXYzs9u9bd5uZuuSftcnvf67zeyTSe0Xmtmr3nNuNzP/v7QSMLOQmb1kZg9788vMbKtX5wNmlum1Z3nzTd7ymqTf8VWvvdHM3p/UPif/NsysyMx+YWavmdkuM7s46PvazL7s/W3vMLP7zCw7iPvazO4xsyNmtiOpbcb37WTrOCnn3Lx4ACHgDWA5kAm8Aqzxu65pbEclsM6bzgdeB9YA3wRu9dpvBf7Bm74W+C1gwEXAVq+9BGj2fhZ708Xesue9vuY99xq/t9ur6yvAz4GHvfkHgeu96R8Cf+lNfx74oTd9PfCAN73G2+9ZwDLv7yE0l/82gJ8Af+FNZwJFQd7XwGJgD5CTtI8/FcR9DbwHWAfsSGqb8X072TpOWqvf/xFO4x/1YuDRpPmvAl/1u64UbNdDwNVAI1DptVUCjd70j4BNSf0bveWbgB8ltf/Ia6sEXktqP66fj9tZBTwJvA942PvjbQfCJ+5f4FHgYm867PWzE/f5WL+5+rcBFHqhZye0B3Zfkwj6/V5whb19/f6g7mughuODfsb37WTrONljPg3djP0BjWnx2uYt723qBcBWoMI5d9BbdAio8KYn2+6TtbdM0O637wD/GRj15kuBo865uDefXOf4tnnLu73+p/tv4bdlQBvwv7whq7vMLEqA97VzrhX4R2AfcJDEvttG8Pf1mNnYt5OtY1LzKegDxczygF8CX3LO9SQvc4mX6sBc92pmHwCOOOe2+V3LLAuTeGv/A+fcBUAfibfa4wK4r4uB60i8yC0CosAGX4vyyWzs26muYz4FfStQnTRf5bXNO2YWIRHyP3PO/cprPmxmld7ySuCI1z7Zdp+svWqCdj9dAmw0szeB+0kM3/wzUGRmYa9Pcp3j2+YtLwQ6OP1/C7+1AC3Oua3e/C9IBH+Q9/VVwB7nXJtzLgb8isT+D/q+HjMb+3aydUxqPgX9C0Ctd/Y+k8SJm80+13TavDPndwO7nHP/lLRoMzB2xv2TJMbux9pv8M7aXwR0e2/bHgX+xMyKvaOoPyExdnkQ6DGzi7x13ZD0u3zhnPuqc67KOVdDYr/9zjn3MeAp4MNetxO3eezf4sNef+e1X+9dqbEMqCVxwmpO/m045w4B+81spdd0JdBAgPc1iSGbi8ws16tpbJsDva+TzMa+nWwdk/PrJMY0T3xcS+IqlTeAr/ldzzS34VISb7W2Ay97j2tJjEs+CewGngBKvP4G3OFt86tAXdLv+gzQ5D0+ndReB+zwnvM9TjgZ6PP2X8FbV90sJ/Gftwn4NyDLa8/25pu85cuTnv81b7saSbrCZK7+bQDvAOq9/f1rEldWBHpfA98AXvPq+imJK2cCt6+B+0ich4iRePd242zs28nWcbKHboEgIhJw82noRkREpkFBLyIScAp6EZGAU9CLiAScgl5EJOAU9CIiAaegFxEJuP8Pgjw1kV2ejZwAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"# keeping number of values fixed at 100, but vary number of buckets: visualize probability of collision\n", | |
"n_buckets = range(100, 100001, 1000)\n", | |
"plt.plot(n_buckets, [collision_p(100, nb) for nb in n_buckets]);" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 70, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"def avg_num_collisions(n, b):\n", | |
" \"\"\"Returns the expected number of collisions for n values uniformly distributed\n", | |
" over a hashtable of b buckets. Based on (fairly) elementary probability theory.\n", | |
" (Pay attention in MATH 474!)\"\"\"\n", | |
" return n - b + b * (1 - 1/b)**n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 71, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"1.011442040700615" | |
] | |
}, | |
"execution_count": 71, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"avg_num_collisions(28, 365)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 72, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"367.6954247709637" | |
] | |
}, | |
"execution_count": 72, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"avg_num_collisions(1000, 1000)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 73, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"48.32893558556316" | |
] | |
}, | |
"execution_count": 73, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"avg_num_collisions(1000, 10000)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Dealing with Collisions" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"To deal with collisions in a hashtable, we simply create a \"chain\" of key/value pairs for each bucket where collisions occur. The chain needs to be a data structure that supports quick insertion — natural choice: the linked list!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 74, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class Hashtable:\n", | |
" class Node:\n", | |
" def __init__(self, key, val, next=None):\n", | |
" self.key = key\n", | |
" self.val = val\n", | |
" self.next = next\n", | |
" \n", | |
" def __init__(self, n_buckets=1000):\n", | |
" self.buckets = [None] * n_buckets\n", | |
" \n", | |
" def __setitem__(self, key, val):\n", | |
" bidx = hash(key) % len(self.buckets)\n", | |
" \n", | |
" def __getitem__(self, key):\n", | |
" bidx = hash(key) % len(self.buckets)\n", | |
" \n", | |
" def __contains__(self, key):\n", | |
" try:\n", | |
" _ = self[key]\n", | |
" return True\n", | |
" except:\n", | |
" return False" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 75, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht = Hashtable(1)\n", | |
"ht['batman'] = 'bruce wayne'\n", | |
"ht['superman'] = 'clark kent'\n", | |
"ht['spiderman'] = 'peter parker'" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 76, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht['batman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 77, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht['superman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 78, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht['spiderman']" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 79, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"def ht_search(ht, x):\n", | |
" return x in ht\n", | |
"\n", | |
"def init_ht(size):\n", | |
" ht = Hashtable(size)\n", | |
" for x in range(size):\n", | |
" ht[x] = x\n", | |
" return ht\n", | |
"\n", | |
"ns = np.linspace(100, 10_000, 50, dtype=int)\n", | |
"ts_htsearch = [timeit.timeit('ht_search(ht, 0)',\n", | |
" #'ht_search(ht, {})'.format(random.randrange(n)),\n", | |
" setup='ht = init_ht({})'.format(n),\n", | |
" globals=globals(),\n", | |
" number=100)\n", | |
" for n in ns]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 80, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY0AAAD4CAYAAAAQP7oXAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy86wFpkAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfK0lEQVR4nO3dfawd9X3n8ffX17Hh0l0eri0gNr7XKM5WtrrLw5UL2lUahU0xqIp3VaSY3iZOAnKXB2XbaNXFstpsyFpaulUTWJ56BUQEbmNTFjV3EQRtSlZNUwJcLw/lIU4uBhtTQxxIHW0c4dr+7h/zu/G5xzNzfjN3ztOcz0s6uuf8Zub3cGbufM/Mb34z5u6IiIjEWNTtCoiISP9Q0BARkWgKGiIiEk1BQ0REoiloiIhItMXdrkA7LVu2zMfGxrpdDRGRvrJr166fuPvytGm1DhpjY2PMzMx0uxoiIn3FzPZmTdPpKRERiaagISIi0RQ0REQkmoKGiIhEU9AQEZFoChrS2tQUjI3BokXJ36mpbtdIpHO0/c9T60tupQJTU7BlCxw+nHzeuzf5DDAx0b16iXSCtv+T6EhD8m3bduIfZs7hw0m6SCv9/itd2/9JdKQh+fbtK5YuMqcOv9K1/Z9ERxqSb9WqYukic+rwK13b/0kUNCTf9u0wPDw/bXg4SRfJU4df6dr+T6KgIfkmJmByEkZHwSz5OznZP6cXpHvq8Ctd2/9JrM7PCB8fH3fdsFCkS5r7NCD5lT7gO91+YGa73H08bZqONESkPfQrvZZ09ZSItM/EhIJEzehIQ0REoiloiIhINAUNERGJpqAhIiLRFDRERCSagoaIiERT0BARkWgKGiIiEk1BQ0REoiloiIhINAUNERGJpqAhIiLRFDRERCSagoaIiESLChpmtsHMdpvZrJndlDJ9qZntDNOfNrOxhmlbQ/puM7u8VZ5mNhXSXzKz+8zsAyH9o2Z2yMyeD68/XlDLRUR6xdQUjI3BokXJ36mpbtcoU8ugYWZDwB3AFcBa4GozW9s02zXAT939Q8BXgFvCsmuBTcA6YANwp5kNtchzCvhV4NeAU4FrG8r5rrtfEF43l2mwiEhPmXvC4d694J783bKlZwNHzJHGemDW3fe4+xFgB7CxaZ6NwP3h/cPAZWZmIX2Hu7/v7q8DsyG/zDzd/TEPgGeAlQtroohID9u2bf4jcSH5vG1bd+rTQkzQWAG82fB5f0hLncfdjwKHgJGcZVvmGU5LfQr4VkPypWb2gpk9bmbr0iprZlvMbMbMZg4ePBjRPBGRLtq3r1h6l/VyR/idwN+4+3fD5/8LjLr7vwL+B/BXaQu5+6S7j7v7+PLlyztTUxGRslatKpbeZTFB4y3gvIbPK0Na6jxmthg4HXg3Z9ncPM3si8By4Atzae7+M3f/f+H9Y8AHzGxZRP1FRHrX9u0wPDw/bXg4Se9BMUHjWWCNma02syUkHdvTTfNMA5vD+6uAJ0OfxDSwKVxdtRpYQ9JPkZmnmV0LXA5c7e7H5wows3NCPwlmtj7U/d0yjRYR6RkTEzA5CaOjYJb8nZxM0nvQ4lYzuPtRM7sReAIYAu5z95fN7GZgxt2ngXuBB8xsFniPJAgQ5nsIeAU4Ctzg7scA0vIMRd4N7AWeCjHikXCl1FXAdWZ2FPgFsCkEJhGR/jYx0bNBopnVeb87Pj7uMzMz3a6GiEhfMbNd7j6eNq2XO8JFRKTHKGiIiEg0BQ0REYmmoCEiItEUNEREJJqChoiIRFPQEBGRaAoaIiISTUFDRESiKWiIiEg0BQ0REYmmoCEiItEUNEREJJqChohIr5qagrExWLQo+Ts11e0atX6ehoiIdMHUFGzZAocPJ5/37k0+Q1efvaEjDRGRXrRt24mAMefw4SS9ixQ0RER60b59xdI7REFDRKQXrVpVLL1DFDRERHrR9u0wPDw/bXg4Se8iBQ0RkV40MQGTkzA6CmbJ38nJrnaCg66eEhHpXRMTXQ8SzXSkISIi0RQ0REQkmoKGiIhEU9AQEZFoChoiIhJNQUNERKJFBQ0z22Bmu81s1sxuSpm+1Mx2hulPm9lYw7StIX23mV3eKk8zmwrpL5nZfWb2gZBuZnZbmP9FM7toQS0XEZHCWgYNMxsC7gCuANYCV5vZ2qbZrgF+6u4fAr4C3BKWXQtsAtYBG4A7zWyoRZ5TwK8CvwacClwb0q8A1oTXFuCuMg0WEZHyYo401gOz7r7H3Y8AO4CNTfNsBO4P7x8GLjMzC+k73P19d38dmA35Zebp7o95ADwDrGwo4+th0veBM8zs3JLtFhGREmKCxgrgzYbP+0Na6jzufhQ4BIzkLNsyz3Ba6lPAtwrUAzPbYmYzZjZz8ODBiOaJiEisXu4IvxP4G3f/bpGF3H3S3cfdfXz58uVtqpqIyGCKuffUW8B5DZ9XhrS0efab2WLgdODdFstm5mlmXwSWA79XsB4iItJGMUcazwJrzGy1mS0h6diebppnGtgc3l8FPBn6JKaBTeHqqtUkndjP5OVpZtcClwNXu/vxpjI+Ha6iugQ45O4HSrRZRERKanmk4e5HzexG4AlgCLjP3V82s5uBGXefBu4FHjCzWeA9kiBAmO8h4BXgKHCDux8DSMszFHk3sBd4KulL5xF3vxl4DLiSpDP9MPDZKr4AERGJZ8kBQT2Nj4/7zMxMt6shItJXzGyXu4+nTevljnAREekxChoiIhJNQUNERKIpaIiISDQFDekNU1MwNgaLFiV/p6a6XSMRSREzuE+kvaamYMsWOHw4+bx3b/IZYGKie/USkZPoSEO6b9u2EwFjzuHDSbqI9BQFjbrqxOmeqsrYt69Yuoh0jU5P1VEnTvdUWcaqVcnyaeki0lN0pFFHnTjdU2UZ27fD8PD8tOHhJF1EeoqCRh114nRPlWVMTMDkJIyOglnyd3JSneAiPUinp+qoE6d7qi5jYkJBQqQP6EijjjpxukenlEQGkoJGHXXidI9OKYkMJN0aXURE5tGt0UVEOqnGt8VR0BARKSMrMMyNYdq7F9xPjGGamqo2mOSV386A5e61fV188cUuIgU9+KD76Ki7WfL3wQe7XaPOKNLuBx90Hx52T8JC8hoePpFHY/rca2Qke5kydU3L67rrKimD5FHeqftV9WmIyAnNI/0huSqu7hc5FG332Fj6Jeejo8lYpSL71dFReOONYvXNKn9oCI4dW3AZeX0aChoickLezrDojq2fFG33okXpgcEsewxTFjM4fjx+/rzyKypDHeEiEmdQbx5ZtN1Zg1hXrcoewzQyUiyvPFnLDA1VV0YGBQ0ROSFvZ1hnRdudN7g1awzTrbdWNyA2q/wtW9o/6Dars6MOL3WEixSU18FbZ2XaXeaCgSovMsjKq4IyUEe4iESbmkruVrxv34nTLXXuBJ8zqO1OoT4NEYk3MZF0/h4/nvyd23HWZcBaVjuy2i3zKGiIZCm6k6xysFWvPXmx6gFrRb+rMt9t2rS8dkicrPNWdXipT0NKyzvHnXbOuOxgqyJ5ZZWdlU9jW4qUkabKAWtFv6ui6XntGxlJb8foaPltpYbI6dOI2vkCG4DdwCxwU8r0pcDOMP1pYKxh2taQvhu4vFWewI0hzYFlDekfBQ4Bz4fXH7eqt4KGREnbqRbdSWbtjIaGsndSRXdsWWV3Yudplj5/1itvJ5z13WZ9V0XTR0ezy8h6mVW6SfW7BQUNYAh4DTgfWAK8AKxtmud64O7wfhOwM7xfG+ZfCqwO+Qzl5QlcCIwBb6QEjUdb1bfxpaAhLWXtVIvscMq85gJUFXl1YudZ5U64aAAq891WGeQGUF7QiOnTWA/Muvsedz8C7AA2Ns2zEbg/vH8YuMzMLKTvcPf33f31cASxPi9Pd3/O3d+IqJfIwmU96zxrkFRReYOtqhowl3bbCEjyL1pG0XEJZQasFR2YVjR91arsMkZG9PCwBYoJGiuANxs+7w9pqfO4+1GS00gjOcvG5JnmUjN7wcweN7N1aTOY2RYzmzGzmYMHD0ZkKQMta6d67FixnWTWzihvsFXRHVtW2Z3YeVY5YK3owLSi6du3Z5dx6616eNhCZR2CzL2Aq4B7Gj5/Cri9aZ6XgJUNn18DlgG3A7/bkH5vyC8mzzeYf3rqnwO/Et5fCfyoVd11ekpayjrt0ti30a5O6k50tpepb1FVDnKrKr1svcTd809PxQSNS4EnGj5vBbY2zfMEcGl4vxj4CWDN887NF5nnvKCRUq/c6a6gITE6NRK4qry085QOWGjQWAzsIenInuu0Xtc0zw3M7wh/KLxfx/yO8D0kneAxeTYfaZzDibvyrgf2zX3OeiloDKBu39pBpAbygkbLPg1P+ihuDEcJr4aA8LKZ3Wxmnwiz3QuMmNks8AXgprDsy8BDwCvAt4Ab3P1YVp4AZvZ5M9sPrAReNLN7QhlXAS+Z2QvAbcCm0DjpNd16OlnZAWgaCSwSLyua1OGlI42CqvjFXeU59KLn8IuOb9ARhUgqdMNCaamqJ7ZlPcxmZAR+8Yti+Rd9OllRdX+wkEhJenKftFbVE9uKPlEsL/+ieRVV5olpIgNAd7mV1qp6YlvRh/Xk5V90EFjR8Q11f7CQSBsoaEiiqie2VTlyuOggsKyBW1U+MU1k0GV1dtThpY7wAqp8YlsVd1XNyysvvWg+InIS1BEuUdr95DI9GU2kL6gjXESkQuf86Tm88/N3Tko/+7Szefs/vd2FGlVLHeEiIhVKCxh56XWioCEiItEUNHpJ0ecdt1qmaBkiIi0s7nYFJGgekT1336Q5adO+9z24//70ZdI6mPPKUIe0iERQR3ivyBuRDcVup5E1yrqqUd8iA86+ZJnT/Iv9v0/N6wjXkUavKDMiO+8xn1WVISInOfu0szOvnqo7BY1esWpV+lHA3IjpIkcaeaO788oQkSh1uKy2LHWE94qsW2bkPe847xnJRcsQEYmgoNErJiayH3ifNe3OO7OXKVqGiEgEdYSLiMg8GhEuIiKVUNBop6oG5ImI9AhdPdUuWQPpig7IExHpIerTaJeiz7fWADsR6RHq0+iGrAFzRQfkiYj0EAWNdin6fGsNsBORPqA+jXbZvn1+nwYkA+k2b57fpzGXrgF2IrJAnXg4lI402qWqAXkiIpE68XAodYRLKXV/3KVIP6rq7rvqCJfKDfLjLkUGWVTQMLMNZrbbzGbN7KaU6UvNbGeY/rSZjTVM2xrSd5vZ5a3yNLMbQ5qb2bKGdDOz28K0F83sotKtLkuD9URkwLXsCDezIeAO4OPAfuBZM5t291caZrsG+Km7f8jMNgG3AJ80s7XAJmAd8EHg22b24bBMVp7fAx4F/k9TVa4A1oTXrwN3hb+docF6IiJRRxrrgVl33+PuR4AdwMameTYC94f3DwOXmZmF9B3u/r67vw7Mhvwy83T359z9jZR6bAS+7onvA2eY2blFGrsg27bNv+IJks+Tk+np27Z1rGoiIpD9EKgqHw4Vc8ntCuDNhs/7OfkX/i/ncfejZnYIGAnp329adkV43yrPmHqsAA40zmRmW4AtAKuqHPugwXoi0uM6cRFK7TrC3X3S3cfdfXz58uXVZazBevN04heNiPSemCONt4DzGj6vDGlp8+w3s8XA6cC7LZZtlWeZerSPBuvNo8tqRQZTzJHGs8AaM1ttZktIOranm+aZBjaH91cBT3oyAGQa2BSurlpN0on9TGSezaaBT4erqC4BDrn7gRbLVEeD9URE4gb3mdmVwFeBIeA+d99uZjcDM+4+bWanAA8AFwLvAZvcfU9YdhvwOeAo8Pvu/nhWniH988AfAucAPwYec/drQ8f67cAG4DDwWXfPHbmnwX0iIsXlDe7TiHAREZlHI8JFRKQSChoiIhJNt0YXQDcgFJE4OtIQQDcgFJE4ChoiIhJNQUNERKIpaDTTbc5FRDKpI7xR1u3PQSO8RUTQkcZ8Wbc/H4DbnOsGhCISQ0cajbJuZz4AtznXZbUiEkNHGo2ybmde89uci4jEUtBotH17clvzRgNwm3MRkVgKGo2ybn+uTnAREUB9GiebmFCQEBHJoCMNERGJpqAhIiLRFDRERCSagoaIiERTR7jUhp4JItJ+OtKQ2tAzQUTaT0FDRESiKWiIiEg0BQ0REYmmjnCRGit6cYAuJpBWFDQkV1U7kbx8IL2zumgZZ592dm4Z7VTme+rEDrroxQF56fYlOyldwWTwKGhIrqI7l6wdYdH8W01Lk7dzbvcOr8yVW3W42quf6irVUNCIlLUzXMQijnO8bel5v8SrLKPdO8+y0nb2RduRt3Ouar3myQpYZZbpp1/1OtVVT1FBw8w2ALcCQ8A97v7fmqYvBb4OXAy8C3zS3d8I07YC1wDHgM+7+xN5eZrZamAHMALsAj7l7kfM7DPAfwfeCsXe7u73lGt2uryNPGvHk7UDqSo9byfciTJ6UZXtqGq9VlVuq2V6MZhk1akOR1JyspZBw8yGgDuAjwP7gWfNbNrdX2mY7Rrgp+7+ITPbBNwCfNLM1gKbgHXAB4Fvm9mHwzJZed4CfMXdd5jZ3SHvu8IyO939xgW2OZM2culHecEkT9oyVenE/4yOZLoj5khjPTDr7nsAzGwHsBFoDBobgf8S3j8M3G5mFtJ3uPv7wOtmNhvyIy1PM3sV+BjwO2Ge+0O+c0FDRCK98/N3cn/xd1ORI6Yq+8myKADFiwkaK4A3Gz7vB349ax53P2pmh0hOL60Avt+07IrwPi3PEeAf3f1oyvwAv21mHwF+CPyBuzfmAYCZbQG2AKzSs70XrN1XJOX12dRdJ3boWTu8vKMM/6KflFZ0x11G1hFTGUWDgM4yxOunjvD/BXzD3d83s98jOQr5WPNM7j4JTAKMj4+fvPVLIUV/ZeUFmaJ5VbUDyatTlTuFtJ1tXhvK7NC7pZ/qCsUvHZZ4MUHjLeC8hs8rOdEZ3TzPfjNbDJxO0iGet2xa+rvAGWa2OBxt/HJ+d3+3Yf57gD+JqHtlsnYwdbl6qipVHsoX/c6z2pFXp6qunsoqu8yRWq+eUiqqLu2Q+WKCxrPAmnBV01skHdu/0zTPNLAZeAq4CnjS3d3MpoG/MLM/I+kIXwM8A1hanmGZ74Q8doQ8vwlgZue6+4FQ3ieAV0u2OVOVv5Jl4Trxnbe7jDL5Fx0M2E1l/mf67Zd+L16x1k0tg0boo7gReILk8tj73P1lM7sZmHH3aeBe4IHQ0f0eSRAgzPcQSaf5UeAGdz8GkJZnKPI/AzvM7L8Cz4W8AT5vZp8I+bwHfGbBrW8yqBuB9IcyI8uzVNVX1Yv/M504wum14N1J5l7f0/7j4+M+MzPT7WqISIOqjpi6fTST1ocF9bgSy8x2uft42rR+6ggXkRroxL24yhxtFL2QIUtVt97pxB0cylDQEBlgBw7Apk2wcyecc05361LljrDKU3lVKhKE8u580M2jGQUNkQH25S/D3/5t8veOO7pdm/brl9NDrXRzXIkewiQyoA4cgK99DY4fT/6+XY/9aaXyLqUeVDrSEBlQX/5yEjAAjh0bnKONIupyZFIlHWn0mAMH4Dd+Q7/6pL3mjjKOHEk+Hzmio42q1P0oREcaPWbQzjFLdzQeZczR0UY1qroMOO/qqW6OE1HQ6CHN55j/6I+6f0WL1NNTT504yphz5Aj83d91pz6DoMo7TnTzKjAFjR6ic8zSKc891+0aDJ5OXFLcCerT6BE6xywy2PqlP1NBo0fknWOuWr9snCKDpLE/s5cpaPSITp5j7peNU2RQ9NOYGQWNHvHcc+B+8qvqc8/9tHGKDIq0/sxepaAxYPpp4xQZBP3Wn6mgMUD6beMUGQSd7M+sgoLGAOm3jVNkEPTbmBmN0xgg/bZxigyCfhszo6AxQPpt4xSR3qPTUyIiEk1BQ3pG0UGHGqQ4WDqxvntxGyxTRjvrpaBRkayVVOXKK1pGJza2KuuUNegwa5mi8+eVXVU7qvzOqyy7m+u1qryKru9OlJE3ULaqbaRMGW0dwOvutX1dfPHFXsY//IP7Rz7ifuBA/LTrrnNftMj9+uvj0vPKyFK0jKz0MmVU1e68/E85JRnSeOqp88tJW6bo/HnpVbajzPpud9mtpqXpRNlF0sus73aXkTd/mXanKVNGq2ViADOesV/t+o69na+yQaPoRpi1kspu6GmKllFmw6nqn6bs97FkSTJtyZIT5WQtU3T+vLKrakeZ9d2JsotuC50ou2h60fXdiTKy5i/7XaUpU0beMrEUNAoosxFmraQyG3qWomWU2XCq+qdZyPcx95orJ22ZovO3+j6qakeZ9d3usstsC50ou0h6mfXd7jLy5i/7XTUrU0arZWIpaBRQdCPMWknPP19uQ09TtIy8souWUfSfpkydGvOfey1Z4r55c/oymzcXmz+v7KraUWZ9d6LsojuRKtdrVXkVXd9lyi5aRtb8ZfcJrfZFsWXk1asIBY1Ief9gRVfSunXFN/QsWRtPVhlZ6XkbTlU77jJ1uuCC+elzr5GR9GWWLSs2f17ZVX23ZdZ3u8vOa1/WtlDltlZVXkXXd5myi5aRNf8FFxQvO2tdZP1f5JWRV68iFDQi5f2DFV1JzTuKmA09S9bGk1VGVnrehlPVjrsTdcpapuj3dMEF1X23ZdZ3u8vOa19V32EnvtuidS1Tdpkyim63Zf4HipZRJq80Cw4awAZgNzAL3JQyfSmwM0x/GhhrmLY1pO8GLm+VJ7A65DEb8lzSqoysV9GgUWYjLLqS2r2yq9Zv9e01+v6kH+UFDUumZzOzIeCHwMeB/cCzwNXu/krDPNcD/9Ld/4OZbQL+vbt/0szWAt8A1gMfBL4NfDgslpqnmT0EPOLuO8zsbuAFd78rq4y8uo+Pj/vMzExu+0REZD4z2+Xu42nTYgb3rQdm3X2Pux8BdgAbm+bZCNwf3j8MXGZmFtJ3uPv77v46yVHC+qw8wzIfC3kQ8vx3LcoQEZEOiQkaK4A3Gz7vD2mp87j7UeAQMJKzbFb6CPCPIY/msrLKmMfMtpjZjJnNHDx4MKJ5IiISq3a3EXH3SXcfd/fx5cuXd7s6IiK1EhM03gLOa/i8MqSlzmNmi4HTgXdzls1Kfxc4I+TRXFZWGSIi0iExQeNZYI2ZrTazJcAmYLppnmlgc3h/FfBk6IGfBjaZ2VIzWw2sAZ7JyjMs852QByHPb7YoQ0REOqTl1VMAZnYl8FVgCLjP3beb2c0kl2VNm9kpwAPAhcB7wCZ33xOW3QZ8DjgK/L67P56VZ0g/n6Rj/CzgOeB33f39vDJy6n0Q2Bv/dbAM+EmB+etiENs9iG2GwWz3ILYZFtbuUXdPPb8fFTQGhZnNZF1mVmeD2O5BbDMMZrsHsc3QvnbXriNcRETaR0FDRESiKWjMN9ntCnTJILZ7ENsMg9nuQWwztKnd6tMQEZFoOtIQEZFoChoiIhJNQSMwsw1mttvMZs3spm7XZyHM7Dwz+46ZvWJmL5vZfwzpZ5nZ/zazH4W/Z4Z0M7PbQttfNLOLGvLaHOb/kZltziqzV5jZkJk9Z2aPhs+rzezp0LadYTApYcDpzpD+tJmNNeSxNaTvNrPLu9SUaGZ2hpk9bGY/MLNXzezSuq9rM/uDsG2/ZGbfMLNT6riuzew+M/uxmb3UkFbZujWzi83s78Myt5lF3AQ2657pg/QiGWD4GnA+sAR4AVjb7XotoD3nAheF9/+M5Db0a4E/ITy7BLgJuCW8vxJ4HDDgEuDpkH4WsCf8PTO8P7Pb7WvR9i8AfwE8Gj4/RDIQFOBu4Lrw/nrg7vB+E7AzvF8b1v9Skme7vAYMdbtdLdp8P3BteL8EOKPO65rk5qWvA6c2rOPP1HFdAx8BLgJeakirbN2S3KHjkrDM48AVLevU7S+lF17ApcATDZ+3Alu7Xa8K2/dNkmeX7AbODWnnArvD+z8neZ7J3Py7w/SrgT9vSJ83X6+9SO5V9tckt9d/NPwj/ARY3LyegSeAS8P7xWE+a173jfP14ovkHmyvEy5qaV6HdVzXnLjj9Vlh3T0KXF7XdQ2MNQWNStZtmPaDhvR582W9dHoqEXP7974UDsUvJHna4dnufiBMehs4O7wvegv7XvVV4A+B4+FzmVvt91ubVwMHga+F03L3mNlp1Hhdu/tbwJ8C+4ADJOtuF/Vf13OqWrcrwvvm9FwKGjVmZr8C/E+Se379rHGaJz8tanO9tZn9FvBjd9/V7bp02GKS0xd3ufuFwM9JTln8Ug3X9ZkkD2VbTfJE0NNIHh89cLqxbhU0EjG3f+8rZvYBkoAx5e6PhOR3zOzcMP1c4Mchvegt7HvRvwY+YWZvkNzw8mPArRS/1X4/tRmSX4f73f3p8PlhkiBS53X9b4HX3f2gu/8T8AjJ+q/7up5T1bp9K7xvTs+loJGIuf173whXQNwLvOruf9YwqfH28s23nf90uPriEuBQOPx9AvhNMzsz/Lr7zZDWc9x9q7uvdPcxkvX3pLtPUPxW+1m38+9J7v428KaZ/YuQdBnwCjVe1ySnpS4xs+Gwrc+1udbrukEl6zZM+5mZXRK+x0835JWt2508vfIiufLghyRXUGzrdn0W2JZ/Q3LI+iLwfHhdSXIe96+BHwHfBs4K8xtwR2j73wPjDXl9juTZ7rPAZ7vdtsj2f5QTV0+dT7IjmAX+Elga0k8Jn2fD9PMblt8WvovdRFxN0u0XcAEwE9b3X5FcIVPrdQ18CfgB8BLJIxOW1nFdA98g6bf5J5KjymuqXLfAePgOXwNup+mCirSXbiMiIiLRdHpKRESiKWiIiEg0BQ0REYmmoCEiItEUNEREJJqChoiIRFPQEBGRaP8fnTewvaZ6W+IAAAAASUVORK5CYII=\n", | |
"text/plain": [ | |
"<Figure size 432x288 with 1 Axes>" | |
] | |
}, | |
"metadata": { | |
"needs_background": "light" | |
}, | |
"output_type": "display_data" | |
} | |
], | |
"source": [ | |
"plt.plot(ns, ts_binsearch, 'ro')\n", | |
"plt.plot(ns, ts_htsearch, 'gs')\n", | |
"plt.plot(ns, ts_dctsearch, 'b^');" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Loose ends" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### Iteration" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 81, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class Hashtable(Hashtable):\n", | |
" def __iter__(self):\n", | |
" pass" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 82, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht = Hashtable(100)\n", | |
"ht['batman'] = 'bruce wayne'\n", | |
"ht['superman'] = 'clark kent'\n", | |
"ht['spiderman'] = 'peter parker'" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 83, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"ename": "TypeError", | |
"evalue": "iter() returned non-iterator of type 'NoneType'", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[0;31m\u001b[0m", | |
"\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", | |
"\u001b[0;32m<ipython-input-83-43c83c094cda>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mk\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mht\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |
"\u001b[0;31mTypeError\u001b[0m: iter() returned non-iterator of type 'NoneType'" | |
] | |
} | |
], | |
"source": [ | |
"for k in ht:\n", | |
" print(k)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### Key ordering" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 84, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"ht = Hashtable()\n", | |
"d = {}\n", | |
"for x in 'banana apple cat dog elephant'.split():\n", | |
" d[x[0]] = x\n", | |
" ht[x[0]] = x" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 85, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"b => banana\n", | |
"a => apple\n", | |
"c => cat\n", | |
"d => dog\n", | |
"e => elephant\n" | |
] | |
} | |
], | |
"source": [ | |
"for k in d:\n", | |
" print(k, '=>', d[k])" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 86, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"ename": "TypeError", | |
"evalue": "iter() returned non-iterator of type 'NoneType'", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[0;31m\u001b[0m", | |
"\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", | |
"\u001b[0;32m<ipython-input-86-74efa9b88228>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mk\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mht\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'=>'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mht\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mk\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", | |
"\u001b[0;31mTypeError\u001b[0m: iter() returned non-iterator of type 'NoneType'" | |
] | |
} | |
], | |
"source": [ | |
"for k in ht:\n", | |
" print(k, '=>', ht[k])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### Load factor & Rehashing\n", | |
"\n", | |
"It is clear that the ratio of the number of keys to the number of buckets (known as the **load factor**) can have a significant effect on the performance of a hashtable.\n", | |
"\n", | |
"A fixed number of buckets doesn't make sense, as it might be wasteful for a small number of keys, and also scale poorly to a relatively large number of keys. And it also doesn't make sense to have the user of the hashtable manually specify the number of buckets (which is a low-level implementation detail). \n", | |
"\n", | |
"Instead: a practical hashtable implementation would start with a relatively small number of buckets, and if/when the load factor increases beyond some threshold (typically 1), it *dynamically increases the number of buckets* (typically to twice the previous number). This requires that all existing keys be *rehashed* to new buckets (why?)." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"### Uniform hashing\n", | |
"\n", | |
"Ultimately, the performance of a hashtable also heavily depends on hashcodes being *uniformly distributed* --- i.e., where, statistically, each bucket has roughly the same number of keys hashing to it. Designing hash functions that do this is an algorithmic problem that's outside the scope of this class!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Runtime analysis & Discussion" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"For a hashtable with $N$ key/value entries, we have the following *worst-case runtime complexity*:\n", | |
"\n", | |
"- Insertion: $O(N)$\n", | |
"- Lookup: $O(N)$\n", | |
"- Deletion: $O(N)$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Assuming uniform hashing and rehashing behavior described above, it is also possible to prove that hashtables have $O(1)$ *amortized runtime complexity* for the above operations. Proving this is also beyond the scope of this class (but is demonstrated by empirical data)." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Vocabulary list\n", | |
"\n", | |
"- hashtable\n", | |
"- hashing and hashes\n", | |
"- collision\n", | |
"- hash buckets & chains\n", | |
"- birthday problem\n", | |
"- load factor\n", | |
"- rehashing" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"---" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"## Addendum: On *Hashability*" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"Remember: *a given object must always hash to the same value*. This is required so that we can always map the object to the same hash bucket.\n", | |
"\n", | |
"Hashcodes for collections of objects are usually computed from the hashcodes of its contents, e.g., the hash of a tuple is a function of the hashes of the objects in said tuple:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 87, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"4246727162495154915" | |
] | |
}, | |
"execution_count": 87, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"hash(('two', 'strings'))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"This is useful. It allows us to use a tuple, for instance, as a key for a hashtable.\n", | |
"\n", | |
"However, if the collection of objects is *mutable* — i.e., we can alter its contents — this means that we can potentially change its hashcode.`\n", | |
"\n", | |
"If we were to use such a collection as a key in a hashtable, and alter the collection after it's been assigned to a particular bucket, this leads to a serious problem: the collection may now be in the wrong bucket (as it was assigned to a bucket based on its original hashcode)!\n", | |
"\n", | |
"For this reason, only immutable types are, by default, hashable in Python. So while we can use integers, strings, and tuples as keys in dictionaries, lists (which are mutable) cannot be used. Indeed, Python marks built-in mutable types as \"unhashable\", e.g.," | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 88, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"ename": "TypeError", | |
"evalue": "unhashable type: 'list'", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[0;31m\u001b[0m", | |
"\u001b[0;31mTypeError\u001b[0mTraceback (most recent call last)", | |
"\u001b[0;32m<ipython-input-88-84d65be9aa35>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[0;31mTypeError\u001b[0m: unhashable type: 'list'" | |
] | |
} | |
], | |
"source": [ | |
"hash([1, 2, 3])" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"That said, Python does support hashing on instances of custom classes (which are mutable). This is because the default hash function implementation does not rely on the contents of instances of custom classes. E.g.," | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 89, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class Student:\n", | |
" def __init__(self, fname, lname):\n", | |
" self.fname = fname\n", | |
" self.lname = lname" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 90, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"298582137" | |
] | |
}, | |
"execution_count": 90, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"s = Student('John', 'Doe')\n", | |
"hash(s)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 91, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"298582137" | |
] | |
}, | |
"execution_count": 91, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"s.fname = 'Jane'\n", | |
"hash(s) # same as before mutation" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"We can change the default behavior by providing our own hash function in `__hash__`, e.g.," | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 92, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [ | |
"class Student:\n", | |
" def __init__(self, fname, lname):\n", | |
" self.fname = fname\n", | |
" self.lname = lname\n", | |
" \n", | |
" def __hash__(self):\n", | |
" return hash(self.fname) + hash(self.lname)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 93, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"7828797879385466672" | |
] | |
}, | |
"execution_count": 93, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"s = Student('John', 'Doe')\n", | |
"hash(s)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 94, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"-7042091445038950747" | |
] | |
}, | |
"execution_count": 94, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"s.fname = 'Jane'\n", | |
"hash(s)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"collapsed": false | |
}, | |
"source": [ | |
"But be careful: instances of this class are no longer suitable for use as keys in hashtables (or dictionaries), if you intend to mutate them after using them as keys!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"argv": [ | |
"python", | |
"-m", | |
"ipykernel_launcher", | |
"-f", | |
"{connection_file}" | |
], | |
"display_name": "Python 3", | |
"env": null, | |
"interrupt_mode": "signal", | |
"language": "python", | |
"metadata": null, | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.7.7" | |
}, | |
"name": "hashtable-demo.ipynb" | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment