Skip to content

Instantly share code, notes, and snippets.

@Zerui18
Created December 21, 2023 07:56
Show Gist options
  • Save Zerui18/7a82d2f2452681994e6683f1f297b8cc to your computer and use it in GitHub Desktop.
Save Zerui18/7a82d2f2452681994e6683f1f297b8cc to your computer and use it in GitHub Desktop.
Proof that Binary Search
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Validity of Iterative Binary Search (2 flavours)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conditions of Validity\n",
"\n",
"We breakdownn the proof into 2 steps which jointly establishes its validity: termination & invariant maintenance.\n",
"Termination: the function will always terminate in finite time.\n",
"Invariant Maintenance: the target, if in array, is always bounded by `low` and `high`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1. Convergence\n",
"\n",
"To prove that the function will ALWAYS converge to an answer in FINITE time, we show that it (specifically the `while` loop) has a termination condition (duh) and each iteration **certainly** brings us closer to it."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Invariant Maintenance\n",
"\n",
"The invariant for binary search is that if the target is in the array (valid inputs for `arr` and `value`), it must always be within the current range bounded by `low, high` - from initialization time to every iteration."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analysis of Variant 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Convergence"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def binary_search(arr, value):\n",
" low = 0\n",
" high = len(arr) - 1\n",
"\n",
" # note the following condition: low <= high\n",
" while low <= high:\n",
"\n",
" # now we analyse the following statement to derive\n",
" # the relationship between mid and low, high:\n",
" mid = low + (high-low)//2\n",
" # FOR LOW:\n",
" ## (high-low)//2 >= 0\n",
" ## THUS mid >= low\n",
"\n",
" # FOR HIGH:\n",
" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
" ## THUS mid <= high\n",
"\n",
" # CONCLUSION 1: low <= mid <= high\n",
"\n",
" # lastly we analyse how low, high change:\n",
" if arr[mid] == value:\n",
" return mid\n",
" elif arr[mid] < value:\n",
" low = mid + 1 # low' > mid >= low, thus low' > low\n",
" else:\n",
" high = mid - 1 # high' < mid <= high, thus high' < high\n",
" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
" return -1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Conclusion\n",
"\n",
"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low > high` in a finite number of steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Invariance Maintenance\n",
"\n",
"We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# the 1st branch doesn't update low or high\n",
"if arr[mid] == value:\n",
" return mid\n",
"# we eliminate the left half of the array\n",
"# since mid is not the value, we can safely exclude it\n",
"elif arr[mid] < value:\n",
" low = mid + 1\n",
"# we eliminate the right half of the array\n",
"# since mid is not the value, we can safely exclude it\n",
"else:\n",
" high = mid - 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Conclusion\n",
"\n",
"If `value` is in `arr`: it is always bounded in `[low, high]`.\n",
"If `value` is not in `arr`: its insertion position is always bounded in `[low, high]`. Until `low > high` when the loop exits and `-1` is returned."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conclusion for Variant 1\n",
"\n",
"Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analysis of Variant 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Convergence"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def binary_search(arr, value):\n",
" low = 0\n",
" high = len(arr)\n",
"\n",
" # note the following condition: low < high\n",
" # different from the previous version\n",
" # this excludes low == high\n",
" while low < high:\n",
"\n",
" # the next statement is the same as before:\n",
" # now we analyse the following statement to derive\n",
" # the relationship between mid and low, high:\n",
" mid = low + (high-low)//2\n",
" # FOR LOW:\n",
" ## (high-low)//2 >= 0\n",
" ## THUS mid >= low\n",
"\n",
" # FOR HIGH:\n",
" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
" ## THUS mid <= high\n",
"\n",
" # IMPORTANT:\n",
" # BUT (mid == high) IFF (low == high) [otherwise it will always be `pulled down` by `low`]\n",
" # which is not possible since low < high by the loop condition\n",
" # THUS mid < high\n",
"\n",
" # CONCLUSION 1: low <= mid < high\n",
" # note that the above is different from the previous version\n",
"\n",
" # lastly we analyse how low, high change:\n",
" if arr[mid] == value:\n",
" return mid\n",
" elif arr[mid] < value:\n",
" low = mid + 1 # low' > mid >= low, thus low' > low [same as previous version]\n",
" else:\n",
" high = mid # high' = mid < high, thus high' < high [diff path but same conclusion]\n",
" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
" # thus we still have the same conclusion as before\n",
" return -1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Conclusion\n",
"\n",
"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low == high` in a finite number of steps."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Invariance Maintenance\n",
"\n",
"We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if arr[mid] == value:\n",
" return mid\n",
"elif arr[mid] < value:\n",
" low = mid + 1\n",
"# we eliminate the right half of the array\n",
"# since `mid` is not the value, we can safely exclude it\n",
"else:\n",
" high = mid\n",
"# note that the above is different from the previous version\n",
"# because `high` is excluded from the search range\n",
"# we can set it to `mid` instead of `mid - 1`\n",
"\n",
"# PS: why is `high` excluded?\n",
"# PS: because in the above analysis, we have shown that `mid < high`\n",
"# PS: and `mid` is like a pointer that accesses the elements to be searched\n",
"# PS: so if `mid < high`, `high` will never be accessed, thus it's excluded from the search range"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Conclusion\n",
"\n",
"If `value` is in `arr`: it is always bounded in `[low, high)`.\n",
"If `value` is not in `arr`: its insertion position is always bounded in `[low, high)`. Until `low == high` when the loop exits and `-1` is returned."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conclusion for Variant 2\n",
"\n",
"Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment