Zerui18 · December 21, 2023 07:56
diff --git a/binary_search_proof.ipynb b/binary_search_proof.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Validity of Iterative Binary Search (2 flavours)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conditions of Validity\n",
    "\n",
    "We breakdownn the proof into 2 steps which jointly establishes its validity: termination & invariant maintenance.\n",
    "Termination: the function will always terminate in finite time.\n",
    "Invariant Maintenance: the target, if in array, is always bounded by `low` and `high`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Convergence\n",
    "\n",
    "To prove that the function will ALWAYS converge to an answer in FINITE time, we show that it (specifically the `while` loop) has a termination condition (duh) and each iteration **certainly** brings us closer to it."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Invariant Maintenance\n",
    "\n",
    "The invariant for binary search is that if the target is in the array (valid inputs for `arr` and `value`), it must always be within the current range bounded by `low, high` - from initialization time to every iteration."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analysis of Variant 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convergence"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def binary_search(arr, value):\n",
    "    low = 0\n",
    "    high = len(arr) - 1\n",
    "\n",
    "    # note the following condition: low <= high\n",
    "    while low <= high:\n",
    "\n",
    "        # now we analyse the following statement to derive\n",
    "        # the relationship between mid and low, high:\n",
    "        mid = low + (high-low)//2\n",
    "        # FOR LOW:\n",
    "        ## (high-low)//2 >= 0\n",
    "        ## THUS mid >= low\n",
    "\n",
    "        # FOR HIGH:\n",
    "        ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
    "        ## THUS mid <= high\n",
    "\n",
    "        # CONCLUSION 1: low <= mid <= high\n",
    "\n",
    "        # lastly we analyse how low, high change:\n",
    "        if arr[mid] == value:\n",
    "            return mid\n",
    "        elif arr[mid] < value:\n",
    "            low = mid + 1 # low' > mid >= low, thus low' > low\n",
    "        else:\n",
    "            high = mid - 1 # high' < mid <= high, thus high' < high\n",
    "        # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
    "    return -1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Conclusion\n",
    "\n",
    "Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low > high` in a finite number of steps."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Invariance Maintenance\n",
    "\n",
    "We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# the 1st branch doesn't update low or high\n",
    "if arr[mid] == value:\n",
    "    return mid\n",
    "# we eliminate the left half of the array\n",
    "# since mid is not the value, we can safely exclude it\n",
    "elif arr[mid] < value:\n",
    "    low = mid + 1\n",
    "# we eliminate the right half of the array\n",
    "# since mid is not the value, we can safely exclude it\n",
    "else:\n",
    "    high = mid - 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Conclusion\n",
    "\n",
    "If `value` is in `arr`: it is always bounded in `[low, high]`.\n",
    "If `value` is not in `arr`: its insertion position is always bounded in `[low, high]`. Until `low > high` when the loop exits and `-1` is returned."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Conclusion for Variant 1\n",
    "\n",
    "Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analysis of Variant 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Convergence"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def binary_search(arr, value):\n",
    "    low = 0\n",
    "    high = len(arr)\n",
    "\n",
    "    # note the following condition: low < high\n",
    "    # different from the previous version\n",
    "    # this excludes low == high\n",
    "    while low < high:\n",
    "\n",
    "        # the next statement is the same as before:\n",
    "        # now we analyse the following statement to derive\n",
    "        # the relationship between mid and low, high:\n",
    "        mid = low + (high-low)//2\n",
    "        # FOR LOW:\n",
    "        ## (high-low)//2 >= 0\n",
    "        ## THUS mid >= low\n",
    "\n",
    "        # FOR HIGH:\n",
    "        ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
    "        ## THUS mid <= high\n",
    "\n",
    "        # IMPORTANT:\n",
    "        # BUT (mid == high) IFF (low == high) [otherwise it will always be `pulled down` by `low`]\n",
    "        # which is not possible since low < high by the loop condition\n",
    "        # THUS mid < high\n",
    "\n",
    "        # CONCLUSION 1: low <= mid < high\n",
    "        # note that the above is different from the previous version\n",
    "\n",
    "        # lastly we analyse how low, high change:\n",
    "        if arr[mid] == value:\n",
    "            return mid\n",
    "        elif arr[mid] < value:\n",
    "            low = mid + 1 # low' > mid >= low, thus low' > low [same as previous version]\n",
    "        else:\n",
    "            high = mid # high' = mid < high, thus high' < high [diff path but same conclusion]\n",
    "        # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
    "        # thus we still have the same conclusion as before\n",
    "    return -1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Conclusion\n",
    "\n",
    "Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low == high` in a finite number of steps."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Invariance Maintenance\n",
    "\n",
    "We zoom into the following lines, keeping in mind the assumption that `value` exists in a **sorted** `arr`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if arr[mid] == value:\n",
    "    return mid\n",
    "elif arr[mid] < value:\n",
    "    low = mid + 1\n",
    "# we eliminate the right half of the array\n",
    "# since `mid` is not the value, we can safely exclude it\n",
    "else:\n",
    "    high = mid\n",
    "# note that the above is different from the previous version\n",
    "# because `high` is excluded from the search range\n",
    "# we can set it to `mid` instead of `mid - 1`\n",
    "\n",
    "# PS: why is `high` excluded?\n",
    "# PS: because in the above analysis, we have shown that `mid < high`\n",
    "# PS: and `mid` is like a pointer that accesses the elements to be searched\n",
    "# PS: so if `mid < high`, `high` will never be accessed, thus it's excluded from the search range"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Conclusion\n",
    "\n",
    "If `value` is in `arr`: it is always bounded in `[low, high)`.\n",
    "If `value` is not in `arr`: its insertion position is always bounded in `[low, high)`. Until `low == high` when the loop exits and `-1` is returned."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Conclusion for Variant 2\n",
    "\n",
    "Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
 }
	{
	"cells": [
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"# Validity of Iterative Binary Search (2 flavours)"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Conditions of Validity\n",
	"\n",
	"We breakdownn the proof into 2 steps which jointly establishes its validity: termination & invariant maintenance.\n",
	"Termination: the function will always terminate in finite time.\n",
	"Invariant Maintenance: the target, if in array, is always bounded by `low` and `high`"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### 1. Convergence\n",
	"\n",
	"To prove that the function will ALWAYS converge to an answer in FINITE time, we show that it (specifically the `while` loop) has a termination condition (duh) and each iteration certainly brings us closer to it."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### 2. Invariant Maintenance\n",
	"\n",
	"The invariant for binary search is that if the target is in the array (valid inputs for `arr` and `value`), it must always be within the current range bounded by `low, high` - from initialization time to every iteration."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Analysis of Variant 1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Convergence"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"def binary_search(arr, value):\n",
	" low = 0\n",
	" high = len(arr) - 1\n",
	"\n",
	" # note the following condition: low <= high\n",
	" while low <= high:\n",
	"\n",
	" # now we analyse the following statement to derive\n",
	" # the relationship between mid and low, high:\n",
	" mid = low + (high-low)//2\n",
	" # FOR LOW:\n",
	" ## (high-low)//2 >= 0\n",
	" ## THUS mid >= low\n",
	"\n",
	" # FOR HIGH:\n",
	" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
	" ## THUS mid <= high\n",
	"\n",
	" # CONCLUSION 1: low <= mid <= high\n",
	"\n",
	" # lastly we analyse how low, high change:\n",
	" if arr[mid] == value:\n",
	" return mid\n",
	" elif arr[mid] < value:\n",
	" low = mid + 1 # low' > mid >= low, thus low' > low\n",
	" else:\n",
	" high = mid - 1 # high' < mid <= high, thus high' < high\n",
	" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
	" return -1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"#### Conclusion\n",
	"\n",
	"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low > high` in a finite number of steps."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Invariance Maintenance\n",
	"\n",
	"We zoom into the following lines, keeping in mind the assumption that `value` exists in a sorted `arr`:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"# the 1st branch doesn't update low or high\n",
	"if arr[mid] == value:\n",
	" return mid\n",
	"# we eliminate the left half of the array\n",
	"# since mid is not the value, we can safely exclude it\n",
	"elif arr[mid] < value:\n",
	" low = mid + 1\n",
	"# we eliminate the right half of the array\n",
	"# since mid is not the value, we can safely exclude it\n",
	"else:\n",
	" high = mid - 1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"#### Conclusion\n",
	"\n",
	"If `value` is in `arr`: it is always bounded in `[low, high]`.\n",
	"If `value` is not in `arr`: its insertion position is always bounded in `[low, high]`. Until `low > high` when the loop exits and `-1` is returned."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Conclusion for Variant 1\n",
	"\n",
	"Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"## Analysis of Variant 1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Convergence"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"def binary_search(arr, value):\n",
	" low = 0\n",
	" high = len(arr)\n",
	"\n",
	" # note the following condition: low < high\n",
	" # different from the previous version\n",
	" # this excludes low == high\n",
	" while low < high:\n",
	"\n",
	" # the next statement is the same as before:\n",
	" # now we analyse the following statement to derive\n",
	" # the relationship between mid and low, high:\n",
	" mid = low + (high-low)//2\n",
	" # FOR LOW:\n",
	" ## (high-low)//2 >= 0\n",
	" ## THUS mid >= low\n",
	"\n",
	" # FOR HIGH:\n",
	" ## mid = low + floor((high-low)/2) <= low + (high-low)/2 = high - (high-low)/2 <= high\n",
	" ## THUS mid <= high\n",
	"\n",
	" # IMPORTANT:\n",
	" # BUT (mid == high) IFF (low == high) [otherwise it will always be `pulled down` by `low`]\n",
	" # which is not possible since low < high by the loop condition\n",
	" # THUS mid < high\n",
	"\n",
	" # CONCLUSION 1: low <= mid < high\n",
	" # note that the above is different from the previous version\n",
	"\n",
	" # lastly we analyse how low, high change:\n",
	" if arr[mid] == value:\n",
	" return mid\n",
	" elif arr[mid] < value:\n",
	" low = mid + 1 # low' > mid >= low, thus low' > low [same as previous version]\n",
	" else:\n",
	" high = mid # high' = mid < high, thus high' < high [diff path but same conclusion]\n",
	" # CONCLUSION 2: either low is increased OR high is decreased with EACH ITER\n",
	" # thus we still have the same conclusion as before\n",
	" return -1"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"#### Conclusion\n",
	"\n",
	"Each iteration we certainly narrows the interval between low and high, thus we will eventually terminate at `low == high` in a finite number of steps."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Invariance Maintenance\n",
	"\n",
	"We zoom into the following lines, keeping in mind the assumption that `value` exists in a sorted `arr`:"
	]
	},
	{
	"cell_type": "code",
	"execution_count": null,
	"metadata": {},
	"outputs": [],
	"source": [
	"if arr[mid] == value:\n",
	" return mid\n",
	"elif arr[mid] < value:\n",
	" low = mid + 1\n",
	"# we eliminate the right half of the array\n",
	"# since `mid` is not the value, we can safely exclude it\n",
	"else:\n",
	" high = mid\n",
	"# note that the above is different from the previous version\n",
	"# because `high` is excluded from the search range\n",
	"# we can set it to `mid` instead of `mid - 1`\n",
	"\n",
	"# PS: why is `high` excluded?\n",
	"# PS: because in the above analysis, we have shown that `mid < high`\n",
	"# PS: and `mid` is like a pointer that accesses the elements to be searched\n",
	"# PS: so if `mid < high`, `high` will never be accessed, thus it's excluded from the search range"
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"#### Conclusion\n",
	"\n",
	"If `value` is in `arr`: it is always bounded in `[low, high)`.\n",
	"If `value` is not in `arr`: its insertion position is always bounded in `[low, high)`. Until `low == high` when the loop exits and `-1` is returned."
	]
	},
	{
	"cell_type": "markdown",
	"metadata": {},
	"source": [
	"### Conclusion for Variant 2\n",
	"\n",
	"Because it satisfies both Convergence and Invariant Maintenance, it is valid!"
	]
	}
	],
	"metadata": {
	"kernelspec": {
	"display_name": "Python 3",
	"language": "python",
	"name": "python3"
	},
	"language_info": {
	"codemirror_mode": {
	"name": "ipython",
	"version": 3
	},
	"file_extension": ".py",
	"mimetype": "text/x-python",
	"name": "python",
	"nbconvert_exporter": "python",
	"pygments_lexer": "ipython3",
	"version": "3.9.6"
	}
	},
	"nbformat": 4,
	"nbformat_minor": 2
	}