khoidd/cv-practice/jpeg-compression.ipynb
{
"cells": [
{
"metadata": {
"ExecuteTime": {
"start_time": "2020-12-10T16:26:03.254657Z",
"end_time": "2020-12-10T16:26:03.313238Z"
},
"scrolled": false,
"trusted": true
},
"cell_type": "code",
"source": "# Init env and image\nimport numpy as np\nimport heapq as hp\nimport scipy.fftpack as ft\nimport math, typing as T, functools as F\n\nIMG_SIZE = (9, 9)\nnp.set_printoptions(threshold=IMG_SIZE[0]*IMG_SIZE[1])\n\nimg = np.random.randint(0, 256, IMG_SIZE)\nprint(img)",
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": "[[165 74 2 134 216 244 252 82 159]\n [198 233 253 168 127 3 243 20 170]\n [214 221 249 195 122 57 108 34 70]\n [ 49 13 86 204 223 233 113 3 219]\n [159 126 177 192 177 89 209 125 164]\n [199 49 82 207 184 145 210 125 119]\n [ 77 94 250 98 140 175 42 15 84]\n [190 12 3 100 80 176 255 35 159]\n [ 89 91 54 11 192 18 145 112 52]]\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## DCT Encoder\n- Pseudo code:\n\n```\n1. Divide the image into 8x8 subimages;\n For each subimage do:\n2. Shift the gray-levels in the range [-128, 127]\n3. Apply DCT (64 coefficients will be obtained: 1 DC coefficient F(0,0), 63 AC coefficients F(u,v)).\n``` \n\n- DCT type 2:\n$$y_k = a(k) \\sum_{n=0}^{N-1} f(x) cos(\\frac{\\pi k (2n + 1)}{2N})$$\n\n\\begin{gather*} \na(u == 0) = \\sqrt(\\frac{1}{N}) \\\\\na(u != 0) = \\sqrt(\\frac{2}{N})\n\\end{gather*}" | |
},
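{
"metadata": {},
"cell_type": "markdown",
"source": "A minimal added sketch: evaluate the orthonormal DCT-II formula above literally and check it against `scipy.fftpack.dct` with `norm='ortho'` on one shifted row of the image. The helper name `dct2_manual` is illustrative only."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Sketch: the orthonormal DCT-II formula above, evaluated directly.\n# `dct2_manual` is an illustrative name; it should agree with scipy's\n# ft.dct(..., norm='ortho') on any 1-D signal.\ndef dct2_manual(f: np.ndarray) -> np.ndarray:\n    N = len(f)\n    y = np.zeros(N)\n    for k in range(N):\n        a = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)\n        y[k] = a * sum(f[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * N)) for n in range(N))\n    return y\n\nrow = img[0].astype(float) - 128  # one shifted row as a 1-D test signal\nprint(np.allclose(dct2_manual(row), ft.dct(row, norm='ortho')))  # expected: True",
"execution_count": null,
"outputs": []
},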
{
"metadata": {
"ExecuteTime": {
"start_time": "2020-12-10T16:26:03.314643Z",
"end_time": "2020-12-10T16:26:03.322364Z"
},
"trusted": true,
"code_folding": [
0,
5,
15
]
},
"cell_type": "code",
"source": "def shift_img(img: np.ndarray, restore=False) -> np.ndarray:\n if restore:\n return img + 128\n return img - 128\n\ndef cal_dct_blk(img: np.ndarray, size_blk:int) -> np.ndarray:\n z = np.zeros(img.shape)\n num_blk = img.shape[0] // size_blk\n for i in range(num_blk):\n for j in range(num_blk):\n submat = img[(i * size_blk):((i + 1) * size_blk), (j * size_blk):((j + 1) * size_blk)]\n submat_dct = ft.dct(ft.dct(submat, axis=0, norm='ortho'), axis=1, norm='ortho')\n z[(i * size_blk):((i + 1) * size_blk), (j * size_blk):((j + 1) * size_blk)] = submat_dct\n return np.array(z, dtype=np.int32)\n\ndef cal_idct_blk(img: np.ndarray, size_blk:int) -> np.ndarray:\n z = np.zeros(img.shape)\n num_blk = img.shape[0] // size_blk\n for i in range(num_blk):\n for j in range(num_blk):\n submat = img[(i * size_blk):((i + 1) * size_blk), (j * size_blk):((j + 1) * size_blk)]\n submat_dct = ft.idct(ft.idct(submat, axis=0, norm='ortho'), axis=1, norm='ortho')\n z[(i * size_blk):((i + 1) * size_blk), (j * size_blk):((j + 1) * size_blk)] = submat_dct\n return np.array(z, dtype=np.int32)\n\n# test\nimg_dct = cal_dct_blk(shift_img(img), 3)\nimg_idct = cal_idct_blk(img_dct, 3)\nprint(\"original image: \\n\", img)\nprint(\"dct trans image: \\n\", img_dct)\nprint(\"diff: \", shift_img(img_idct - img, restore=True).sum())", | |
"execution_count": 2, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": "original image: \n [[165 74 2 134 216 244 252 82 159]\n [198 233 253 168 127 3 243 20 170]\n [214 221 249 195 122 57 108 34 70]\n [ 49 13 86 204 223 233 113 3 219]\n [159 126 177 192 177 89 209 125 164]\n [199 49 82 207 184 145 210 125 119]\n [ 77 94 250 98 140 175 42 15 84]\n [190 12 3 100 80 176 255 35 159]\n [ 89 91 54 11 192 18 145 112 52]]\ndct trans image: \n [[ 152 29 5 38 78 -30 -4 83 172]\n [-180 99 0 89 -124 -17 114 27 39]\n [-104 68 11 87 -87 19 -37 -4 -64]\n [ -70 25 88 167 55 -23 45 12 124]\n [ -74 -77 -21 50 -45 2 -48 -98 71]\n [-105 33 20 65 -49 20 -48 -30 26]\n [ -97 20 63 -54 -65 -57 -84 60 97]\n [ 76 -104 51 78 -35 100 -68 -67 35]\n [ 57 -147 -39 -18 19 -99 -105 -40 -103]]\ndiff: 3\n", | |
"name": "stdout" | |
} | |
] | |
}, | |
{ | |
"metadata": {}, | |
"cell_type": "markdown", | |
"source": "## Quatizer\n\n- Pseudo code:\n\n```\n4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute a lot).\n```\n\n- Formula:\n\n $$ C_q(u, v) = Round(\\frac{C(u, v)}{Q(u, v)})$$" | |
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2020-12-10T16:26:03.323675Z",
"end_time": "2020-12-10T16:26:03.326729Z"
},
"trusted": true
},
"cell_type": "code",
"source": "def quantizize(img: np.ndarray, q=10) -> np.ndarray:\n mat_q = q * np.ones(img.shape)\n return np.array(img_dct // mat_q, dtype=np.int8)\n\nimg_q = quantizize(img)\nprint(img_q)", | |
"execution_count": 3, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": "[[ 15 2 0 3 7 -3 -1 8 17]\n [-18 9 0 8 -13 -2 11 2 3]\n [-11 6 1 8 -9 1 -4 -1 -7]\n [ -7 2 8 16 5 -3 4 1 12]\n [ -8 -8 -3 5 -5 0 -5 -10 7]\n [-11 3 2 6 -5 2 -5 -3 2]\n [-10 2 6 -6 -7 -6 -9 6 9]\n [ 7 -11 5 7 -4 10 -7 -7 3]\n [ 5 -15 -4 -2 1 -10 -11 -4 -11]]\n", | |
"name": "stdout" | |
} | |
] | |
}, | |
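{
"metadata": {},
"cell_type": "markdown",
"source": "An added sketch of the matching decoder step: multiply the quantized coefficients back by Q to approximate the DCT block before the inverse DCT. The helper name `dequantize` is illustrative only and simply mirrors `quantize` above."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Sketch: inverse of step 4. `dequantize` is an illustrative name.\n# With floor division in quantize(), the per-coefficient error stays below q.\ndef dequantize(img: np.ndarray, q=10) -> np.ndarray:\n    return img.astype(np.int32) * q\n\nimg_deq = dequantize(img_q)\nprint('max abs quantization error:', np.abs(img_dct - img_deq).max())",
"execution_count": null,
"outputs": []
},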
{
"metadata": {},
"cell_type": "markdown",
"source": "## Entropy decoder\n\n```\n5. Order the coefficients using zig-zag ordering\n- Place non-zero coefficients first\n- Create long runs of zeros (i.e., good for run-length encoding)\n\n6. Encode coefficients.\nDC coefficients are encoded using predictive encoding\nAll coefficients are converted to a binary sequence:\n6.1 Form intermediate symbol sequence\n6.2 Apply Huffman (or arithmetic) coding (i.e., entropy coding)\n```" | |
},
{
"metadata": {
"ExecuteTime": {
"start_time": "2020-12-10T16:26:03.327902Z",
"end_time": "2020-12-10T16:26:03.331648Z"
},
"trusted": true,
"code_folding": [
0
]
},
"cell_type": "code",
"source": "def zigzag(img: np.ndarray) -> np.ndarray:\n list_diag = [np.diagonal(img[::-1,:], k)[::(2*(k % 2)-1)] for k in range(1-img.shape[0], img.shape[0])]\n return np.concatenate(list_diag)\n\nimg_zigzag = zigzag(img_q)\n\nhist_zigzag, bins_zigzag = np.histogram(\n img_zigzag, \n bins=[x for x in range(img_zigzag.min(), img_zigzag.max() + 1)]\n)\nprint(img_zigzag)", | |
"execution_count": 4, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": "[ 15 -18 2 0 9 -11 -7 6 0 3 7 8 1 2 -8 -11 -8 8\n 8 -13 -3 -1 -2 -9 16 -3 3 -10 7 2 2 5 5 1 11 8\n 17 2 -4 -3 -5 6 6 -11 5 -15 5 -6 -5 0 4 -1 3 -7\n 1 -5 2 -7 7 -4 -2 -4 -6 -5 -10 12 7 -3 -9 10 1 -10\n -7 6 2 9 -7 -11 -4 3 -11]\n", | |
"name": "stdout" | |
} | |
] | |
}, | |
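{
"metadata": {},
"cell_type": "markdown",
"source": "An added sketch of the run-length idea mentioned in step 5: collapse runs of zeros in the zig-zag sequence into (0, run_length) pairs. Real JPEG forms (run, size)/amplitude symbols instead; `run_length_encode` is an illustrative name."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Sketch: simplified run-length encoding of zero runs in the zig-zag sequence.\n# `run_length_encode` is an illustrative name, not part of the original pipeline.\ndef run_length_encode(seq: np.ndarray) -> T.List[T.Tuple[int, int]]:\n    pairs = []\n    zeros = 0\n    for v in seq:\n        if v == 0:\n            zeros += 1\n        else:\n            if zeros:\n                pairs.append((0, zeros))\n                zeros = 0\n            pairs.append((int(v), 1))\n    if zeros:\n        pairs.append((0, zeros))\n    return pairs\n\nprint(run_length_encode(img_zigzag)[:10])",
"execution_count": null,
"outputs": []
},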
{
"metadata": {
"ExecuteTime": {
"start_time": "2020-12-10T16:26:03.332805Z",
"end_time": "2020-12-10T16:26:03.340156Z"
},
"trusted": true,
"code_folding": [
0,
22
]
},
"cell_type": "code",
"source": "class HuffmanNode:\n def __init__(self, value=None, freq=0, left=None, right=None):\n self.value = value\n self.freq = freq\n self.left = left\n self.right = right\n \n def __lt__(self, other: 'HuffmanNode'):\n return self.freq < other.freq\n \n def __gt__(self, other: 'HuffmanNode'):\n return self.freq > other.freq\n \n def __le__(self, other: 'HuffmanNode'):\n return self.freq <= other.freq\n \n def __ge__(self, other: 'HuffmanNode'):\n return self.freq >= other.freq\n \n def __str__(self):\n return 'HuffmanNode(value=%s, freq=%s)'%(self.value, self.freq)\n\nclass HuffmanTree:\n def __init__(self, ls_node: T.List[HuffmanNode]):\n self.total: int = F.reduce(lambda cum, node: cum + node.freq, ls_node, 0)\n self.dict = {node.value: '' for node in ls_node}\n \n prefix: str = ''\n queue_huff: T.List[HuffmanNode] = ls_node\n while len(queue_huff) > 1:\n node_small = hp.heappop(queue_huff)\n node_big = hp.heappop(queue_huff)\n node_new = HuffmanNode(\n freq=(node_small.freq + node_big.freq),\n left=node_small, right=node_big\n )\n hp.heappush(queue_huff, node_new)\n\n self._travel(node_new.left, '0')\n self._travel(node_new.right, '1')\n\n \n def __str__(self):\n res = \"HuffmanTree(total=%s)\"%(self.total)\n return res+ str(json.dumps(self.dict, indent=4))\n \n def _travel(self, node:HuffmanNode, bit: str) -> None:\n \n if node == None:\n return\n if node.value != None:\n self.dict[node.value] = bit + self.dict[node.value]\n else:\n self._travel(node.left, bit)\n self._travel(node.right, bit)\n\nif True and \"test\":\n decoder_entropy = HuffmanTree([\n HuffmanNode(int(bins_zigzag[idx]), hist_zigzag[idx]) for idx in range(len(hist_zigzag))\n ])\n print(decoder_entropy)", | |
"execution_count": 5, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"text": "HuffmanTree(total=81){\n \"-18\": \"11101110\",\n \"-17\": \"11101000111\",\n \"-16\": \"11101111\",\n \"-15\": \"1110110\",\n \"-14\": \"1110100010\",\n \"-13\": \"11101001\",\n \"-12\": \"11101000110\",\n \"-11\": \"1011\",\n \"-10\": \"10100\",\n \"-9\": \"00011\",\n \"-8\": \"110010\",\n \"-7\": \"1001\",\n \"-6\": \"01000\",\n \"-5\": \"0110\",\n \"-4\": \"00111\",\n \"-3\": \"0101\",\n \"-2\": \"00010\",\n \"-1\": \"10101\",\n \"0\": \"10001\",\n \"1\": \"0000\",\n \"2\": \"1101\",\n \"3\": \"11110\",\n \"4\": \"110011\",\n \"5\": \"0010\",\n \"6\": \"11100\",\n \"7\": \"0111\",\n \"8\": \"11111\",\n \"9\": \"10000\",\n \"10\": \"1110101\",\n \"11\": \"110001\",\n \"12\": \"1100001\",\n \"13\": \"00110\",\n \"14\": \"111010000\",\n \"15\": \"1100000\",\n \"16\": \"01001\"\n}\n", | |
"name": "stdout" | |
} | |
] | |
} | |
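,
{
"metadata": {},
"cell_type": "markdown",
"source": "An added sketch of step 6.2: map every zig-zag coefficient to its Huffman code word from the table above and compare the total bit count with a plain 8-bit-per-coefficient encoding. `bitstring` is just a local name used for illustration."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Sketch: entropy-code the zig-zag sequence with the Huffman table built above.\n# `bitstring` is an illustrative local name; DC prediction and run-length\n# symbols (steps 5-6.1) are left out to keep the sketch short.\nbitstring = ''.join(decoder_entropy.dict[int(v)] for v in img_zigzag)\nprint('encoded bits:', len(bitstring))\nprint('fixed 8-bit size:', 8 * img_zigzag.size)\nprint('compression ratio: %.2f' % (8 * img_zigzag.size / len(bitstring)))",
"execution_count": null,
"outputs": []
}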
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.8.5",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
},
"latex_envs": {
"eqNumInitial": 1,
"eqLabelWithNumbers": true,
"current_citInitial": 1,
"cite_by": "apalike",
"bibliofile": "biblio.bib",
"LaTeX_envs_menu_present": true,
"labels_anchors": false,
"latex_user_defs": false,
"user_envs_cfg": false,
"report_style_numbering": false,
"autoclose": false,
"autocomplete": true,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
}
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"base_numbering": 1,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "165px"
},
"toc_section_display": true,
"toc_window_display": true
},
"varInspector": {
"window_display": false,
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"library": "var_list.py",
"delete_cmd_prefix": "del ",
"delete_cmd_postfix": "",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"library": "var_list.r",
"delete_cmd_prefix": "rm(",
"delete_cmd_postfix": ") ",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
]
},
"gist": {
"id": "",
"data": {
"description": "khoidd/cv-practice/jpeg-compression.ipynb",
"public": true
}
}
},
"nbformat": 4,
"nbformat_minor": 4
} |