Created
December 21, 2023 07:56
-
-
Save Zerui18/7a82d2f2452681994e6683f1f297b8cc to your computer and use it in GitHub Desktop.
Proof that Binary Search
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Validity of Iterative Binary Search (2 flavours)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Conditions of Validity\n", | |
"\n", | |
"We breakdownn the proof into 2 steps which jointly establishes its validity: termination & invariant maintenance.\n", | |
"Termination: the function will always terminate in finite time.\n", | |
"Invariant Maintenance: the target, if in array, is always bounded by `low` and `high`" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### 1. Convergence\n", | |
"\n", | |
"To prove that the function will ALWAYS converge to an answer in FINITE time, we show that it (specifically the `while` loop) has a termination condition (duh) and each iteration **certainly** brings us closer to it." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### 2. Invariant Maintenance\n", | |
"\n", | |
"The invariant for binary search is that if the target is in the array (valid inputs for `arr` and `value`), it must always be within the current range bounded by `low, high` - from initialization time to every iteration." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Analysis of Variant 1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Convergence" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def binary_search(arr, value):\n", | |
" low = 0\n", | |
" high = len(arr) - 1\n", | |
"\n", | |
" # note the following condition: low <= high\n", | |
" while low <= high:\n", | |
"\n", | |
" # now we analyse the following statement to derive\n", | |
" # the relationship between mid and low, high:\n", | |
" mid = low + (high-low)//2\n", | |
" # FOR LOW:\n", | |
" ## (high-low)//2 >= 0\n", | |
" ## THUS mid >= low\n", | |
"\n", | |
" # FOR HIGH:\n", | |
" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n", | |
" ## THUS mid <= high\n", | |
"\n", | |
" # CONCLUSION 1: low <= mid <= high\n", | |
"\n", | |
" # lastly we analyse how low, high change:\n", | |
" if arr[mid] == value:\n", | |
" return mid\n", | |
" elif arr[mid] < value:\n", | |
" low = mid + 1 # low' > mid >= low, thus low' > low\n", | |
" else:\n", | |
" high = mid - 1 # high' < mid <= high, thus high' < high\n", | |
" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n", | |
" return -1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Conclusion\n", | |
"\n", | |
"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low > high` in a finite number of steps." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Invariance Maintenance\n", | |
"\n", | |
"We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# the 1st branch doesn't update low or high\n", | |
"if arr[mid] == value:\n", | |
" return mid\n", | |
"# we eliminate the left half of the array\n", | |
"# since mid is not the value, we can safely exclude it\n", | |
"elif arr[mid] < value:\n", | |
" low = mid + 1\n", | |
"# we eliminate the right half of the array\n", | |
"# since mid is not the value, we can safely exclude it\n", | |
"else:\n", | |
" high = mid - 1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Conclusion\n", | |
"\n", | |
"If `value` is in `arr`: it is always bounded in `[low, high]`.\n", | |
"If `value` is not in `arr`: its insertion position is always bounded in `[low, high]`. Until `low > high` when the loop exits and `-1` is returned." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Conclusion for Variant 1\n", | |
"\n", | |
"Because it satisfies both Convergence and Invariant Maintenance, it is valid!" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Analysis of Variant 1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Convergence" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def binary_search(arr, value):\n", | |
" low = 0\n", | |
" high = len(arr)\n", | |
"\n", | |
" # note the following condition: low < high\n", | |
" # different from the previous version\n", | |
" # this excludes low == high\n", | |
" while low < high:\n", | |
"\n", | |
" # the next statement is the same as before:\n", | |
" # now we analyse the following statement to derive\n", | |
" # the relationship between mid and low, high:\n", | |
" mid = low + (high-low)//2\n", | |
" # FOR LOW:\n", | |
" ## (high-low)//2 >= 0\n", | |
" ## THUS mid >= low\n", | |
"\n", | |
" # FOR HIGH:\n", | |
" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n", | |
" ## THUS mid <= high\n", | |
"\n", | |
" # IMPORTANT:\n", | |
" # BUT (mid == high) IFF (low == high) [otherwise it will always be `pulled down` by `low`]\n", | |
" # which is not possible since low < high by the loop condition\n", | |
" # THUS mid < high\n", | |
"\n", | |
" # CONCLUSION 1: low <= mid < high\n", | |
" # note that the above is different from the previous version\n", | |
"\n", | |
" # lastly we analyse how low, high change:\n", | |
" if arr[mid] == value:\n", | |
" return mid\n", | |
" elif arr[mid] < value:\n", | |
" low = mid + 1 # low' > mid >= low, thus low' > low [same as previous version]\n", | |
" else:\n", | |
" high = mid # high' = mid < high, thus high' < high [diff path but same conclusion]\n", | |
" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n", | |
" # thus we still have the same conclusion as before\n", | |
" return -1" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Conclusion\n", | |
"\n", | |
"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low == high` in a finite number of steps." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Invariance Maintenance\n", | |
"\n", | |
"We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"if arr[mid] == value:\n", | |
" return mid\n", | |
"elif arr[mid] < value:\n", | |
" low = mid + 1\n", | |
"# we eliminate the right half of the array\n", | |
"# since `mid` is not the value, we can safely exclude it\n", | |
"else:\n", | |
" high = mid\n", | |
"# note that the above is different from the previous version\n", | |
"# because `high` is excluded from the search range\n", | |
"# we can set it to `mid` instead of `mid - 1`\n", | |
"\n", | |
"# PS: why is `high` excluded?\n", | |
"# PS: because in the above analysis, we have shown that `mid < high`\n", | |
"# PS: and `mid` is like a pointer that accesses the elements to be searched\n", | |
"# PS: so if `mid < high`, `high` will never be accessed, thus it's excluded from the search range" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"#### Conclusion\n", | |
"\n", | |
"If `value` is in `arr`: it is always bounded in `[low, high)`.\n", | |
"If `value` is not in `arr`: its insertion position is always bounded in `[low, high)`. Until `low == high` when the loop exits and `-1` is returned." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Conclusion for Variant 2\n", | |
"\n", | |
"Because it satisfies both Convergence and Invariant Maintenance, it is valid!" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.9.6" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment