Created
February 11, 2017 08:11
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Motivation\n", | |
"I created this notebook to point out common mistakes among students who have done the __\"Build Your First Neural Network\"__ project.\n", | |
"\n", | |
"__Errors in this notebook are intentional. Do not fix them.__" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Linear Algebra refresher\n", | |
"Before we dive into deep neural networks, let's review some linear algebra and some NumPy gotchas.\n", | |
"\n", | |
"__This tutorial focuses on NumPy programming, not math.__ \n", | |
"__For a mathematical refresher, see [this great tutorial](https://www.youtube.com/watch?v=sYlOjyPyX3g&t=419s)__" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"from numpy import array, dot" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[14]\n", | |
"[[1 4 9]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# First\n", | |
"# dot(A, B) != A * B\n", | |
"A = array([[1,2,3]])\n", | |
"B = array([1,2,3])\n", | |
"dot_C = dot(A, B)\n", | |
"element_wise_c = A * B\n", | |
"print(dot_C)\n", | |
"print(element_wise_c)\n", | |
"\n", | |
"# dot is the dot product\n", | |
"# * is the element-wise product" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-8-b76d9186a652>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 4\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;31m# this line will fail, because it make no sense to dot product two matrices with shape (1, 3) and (1, 3)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 6\u001b[1;33m \u001b[0mdot_C\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# But there is a gotcha\n", | |
"A = array([[1,2,3]])\n", | |
"B = array([[1,2,3]])\n", | |
"\n", | |
"# This line will fail, because it makes no sense to take the dot product of two matrices with shapes (1, 3) and (1, 3)\n", | |
"dot_C = dot(A, B)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 35, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[14]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# However\n", | |
"A = array([[1,2,3]])\n", | |
"B = array([[1,2,3]])\n", | |
"\n", | |
"# This gives the same value as B = [1,2,3], but with shape (1, 1) instead of (1,). This is the mathematically correct way.\n", | |
"dot_C = dot(A, B.T)\n", | |
"print(dot_C)\n", | |
"\n", | |
"# dot([[1,2,3]], [1,2,3]) works because numpy's dot treats a 1-D second argument as a vector\n", | |
"# and sums over the last axis of both arguments. Mathematically speaking, it makes less sense." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 27, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[1 4]\n", | |
" [2 5]\n", | |
" [3 6]]\n", | |
"[[ 7 0]\n", | |
" [ 8 1]\n", | |
" [ 9 2]\n", | |
" [10 3]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Now, let's consider bigger matrices\n", | |
"A = array([\n", | |
" [1,2,3],\n", | |
" [4,5,6]\n", | |
" ])\n", | |
"B = array([\n", | |
" [7,8,9,10],\n", | |
" [0,1,2,3]\n", | |
" ])\n", | |
"\n", | |
"\n", | |
"# First, let's see what a transposed matrix (.T) looks like\n", | |
"print(A.T)\n", | |
"print(B.T)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"X:\n", | |
"[[ 7 12 17 22]\n", | |
" [14 21 28 35]\n", | |
" [21 30 39 48]]\n", | |
"\n", | |
"Y:\n", | |
"[[ 7 14 21]\n", | |
" [12 21 30]\n", | |
" [17 28 39]\n", | |
" [22 35 48]]\n" | |
] | |
} | |
], | |
"source": [ | |
"X = dot(A.T, B)\n", | |
"Y = dot(B.T, A)\n", | |
"\n", | |
"print('X:')\n", | |
"print(X)\n", | |
"print()\n", | |
"print('Y:')\n", | |
"print(Y)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 25, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ True True True]\n", | |
" [ True True True]\n", | |
" [ True True True]\n", | |
" [ True True True]]\n", | |
"[[ True True True True]\n", | |
" [ True True True True]\n", | |
" [ True True True True]]\n", | |
"[[ True True True]\n", | |
" [ True True True]\n", | |
" [ True True True]\n", | |
" [ True True True]]\n", | |
"[[ True True True True]\n", | |
" [ True True True True]\n", | |
" [ True True True True]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# You see, X and Y are just transpose of each other\n", | |
"print(X.T == Y)\n", | |
"print(X == Y.T)\n", | |
"print(Y == X.T)\n", | |
"print(Y.T == X)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 33, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ 7 12 17 22]\n", | |
" [14 21 28 35]\n", | |
" [21 30 39 48]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Also consider\n", | |
"A = array([\n", | |
" [1, 4],\n", | |
" [2, 5],\n", | |
" [3, 6],\n", | |
" ])\n", | |
"B = array([\n", | |
" [7,8,9,10],\n", | |
" [0,1,2,3]\n", | |
" ])\n", | |
"\n", | |
"X = dot(A, B)\n", | |
"print(X)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 34, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-34-aba7d292d99b>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# But this will fail\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 2\u001b[1;33m \u001b[0mY\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 3\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mY\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", | |
"\u001b[1;31mValueError\u001b[0m: shapes (2,4) and (3,2) not aligned: 4 (dim 1) != 3 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# But this will fail, because the shapes don't match\n", | |
"Y = dot(B, A)\n", | |
"print(Y)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 37, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[11 16 21]\n", | |
" [19 26 33]\n", | |
" [27 36 45]]\n", | |
"[[ 50 122]\n", | |
" [ 14 32]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Therefore, dot(A, B) != dot(B, A)\n", | |
"# also consider\n", | |
"A = array([\n", | |
" [1, 4],\n", | |
" [2, 5],\n", | |
" [3, 6],\n", | |
" ])\n", | |
"B = array([\n", | |
" [7, 8, 9],\n", | |
" [1, 2, 3]\n", | |
" ])\n", | |
"X = dot(A, B)\n", | |
"Y = dot(B, A)\n", | |
"print(X)\n", | |
"print(Y)\n", | |
"\n", | |
"# As you can see, \n", | |
"# just because 2 operations with the same arguments but different order get computed without error,\n", | |
"# it doesn't mean the semantics are the same.\n", | |
"\n", | |
"# But, for *, A * B == B * A because it's just element-wise!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 42, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[70 80 90]\n", | |
" [10 20 30]]\n", | |
"[[70 80 90]\n", | |
" [10 20 30]]\n", | |
"[[70 80 90]\n", | |
" [10 20 30]]\n", | |
"[[70 80 90]\n", | |
" [10 20 30]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Now, let's consider a scalar or a 1-unit (single-element) vector/matrix\n", | |
"\n", | |
"# scalar\n", | |
"A = 10\n", | |
"B = array([\n", | |
" [7, 8, 9],\n", | |
" [1, 2, 3]\n", | |
" ])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 46, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[70 80 90]\n", | |
" [10 20 30]]\n", | |
"[[70 80 90]\n", | |
" [10 20 30]]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,) and (2,3) not aligned: 1 (dim 0) != 2 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-46-00b31614be64>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 9\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,) and (2,3) not aligned: 1 (dim 0) != 2 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# 1 unit vector\n", | |
"A = array([10])\n", | |
"B = array([\n", | |
" [7, 8, 9],\n", | |
" [1, 2, 3]\n", | |
" ])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 47, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[70 80 90]\n", | |
" [10 20 30]]\n", | |
"[[70 80 90]\n", | |
" [10 20 30]]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,1) and (2,3) not aligned: 1 (dim 1) != 2 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-47-61d8df2d1573>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 8\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 9\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,1) and (2,3) not aligned: 1 (dim 1) != 2 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# 1 unit matrix\n", | |
"A = array([[10]])\n", | |
"B = array([\n", | |
" [7, 8, 9],\n", | |
" [1, 2, 3]\n", | |
" ])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 56, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[70 80 90]]\n", | |
"[[70 80 90]]\n", | |
"[[70 80 90]]\n", | |
"[[70 80 90]] \n", | |
"\n", | |
"1 unit vector\n", | |
"[[70 80 90]]\n", | |
"[[70 80 90]]\n", | |
"[70 80 90]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,3) and (1,) not aligned: 3 (dim 1) != 1 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-56-9e972ee5436b>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 15\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 16\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 17\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,) not aligned: 3 (dim 1) != 1 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# But, what if B has only one row?\n", | |
"# scalar\n", | |
"A = array(10)\n", | |
"B = array([[7, 8, 9]])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A), '\\n')\n", | |
"\n", | |
"# 1-unit vector\n", | |
"print('1 unit vector')\n", | |
"A = array([10])\n", | |
"B = array([[7, 8, 9]])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 55, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"1 unit matrix\n", | |
"[[70 80 90]]\n", | |
"[[70 80 90]]\n", | |
"[[70 80 90]]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-55-6183a2d96180>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 8\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# 1-unit matrix\n", | |
"print('1 unit matrix')\n", | |
"A = array([[10]])\n", | |
"B = array([[7, 8, 9]])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"collapsed": true | |
}, | |
"outputs": [], | |
"source": [ | |
"# Have you noticed that [10] dot [[7,8,9]] != [[10]] dot [[7,8,9]]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 59, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[70 80 90]\n", | |
"[70 80 90]\n", | |
"[70 80 90]\n", | |
"[70 80 90] \n", | |
"\n", | |
"[70 80 90]\n", | |
"[70 80 90]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (1,) and (3,) not aligned: 1 (dim 0) != 3 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-59-714314867e51>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 11\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 12\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m \u001b[1;33m*\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 13\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 14\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mB\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mA\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", | |
"\u001b[1;31mValueError\u001b[0m: shapes (1,) and (3,) not aligned: 1 (dim 0) != 3 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# Now, what if B is also a vector?\n", | |
"A = array(10)\n", | |
"B = array([7, 8, 9])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A), '\\n')\n", | |
"\n", | |
"A = array([10])\n", | |
"B = array([7, 8, 9])\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(B, A))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 60, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"And I will stop here. That's enough numpy matrix multiplication\n" | |
] | |
} | |
], | |
"source": [ | |
"# The above is another gotcha!!!\n", | |
"# The matrix dot product only makes sense when both operands are matrices,\n", | |
"# that is, have shape (x, y), even if x or y is 1!\n", | |
"\n", | |
"print(\"And I will stop here. That's enough numpy matrix multiplication\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Now, let's dive into deep neural networks\n", | |
"__Just simple fully connected neural nets. No convolution, no recurrence.__ \n", | |
"For example, suppose layers A and B are adjacent and B receives its input from A.\n", | |
"\n", | |
"A has 10 nodes and B has 15 nodes.\n", | |
"\n", | |
"Then there are 10 × 15 = 150 connections between A and B. Each connection is an edge if you think of the network as a computational graph.\n", | |
"\n", | |
"Therefore, B receives 150 values (scalars) from A. Each node in B receives 10 values from A. Each node in A outputs 15 values to B.\n", | |
"\n", | |
"Therefore, whether it is a forward pass or a backward pass, 150 values are computed between A and B in each pass.\n", | |
"\n", | |
"The forward pass computes the activations. The backward pass computes the gradients and the weight updates.\n", | |
"\n", | |
"These 150 edges carry 150 weights, 1 weight each." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 65, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]\n", | |
" [ 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30]\n", | |
" [ 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45]\n", | |
" [ 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60]\n", | |
" [ 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75]\n", | |
" [ 6 12 18 24 30 36 42 48 54 60 66 72 78 84 90]\n", | |
" [ 7 14 21 28 35 42 49 56 63 70 77 84 91 98 105]\n", | |
" [ 8 16 24 32 40 48 56 64 72 80 88 96 104 112 120]\n", | |
" [ 9 18 27 36 45 54 63 72 81 90 99 108 117 126 135]\n", | |
" [ 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150]]\n", | |
"[[ 1 2 3 4 5 6 7 8 9 10]\n", | |
" [ 2 4 6 8 10 12 14 16 18 20]\n", | |
" [ 3 6 9 12 15 18 21 24 27 30]\n", | |
" [ 4 8 12 16 20 24 28 32 36 40]\n", | |
" [ 5 10 15 20 25 30 35 40 45 50]\n", | |
" [ 6 12 18 24 30 36 42 48 54 60]\n", | |
" [ 7 14 21 28 35 42 49 56 63 70]\n", | |
" [ 8 16 24 32 40 48 56 64 72 80]\n", | |
" [ 9 18 27 36 45 54 63 72 81 90]\n", | |
" [ 10 20 30 40 50 60 70 80 90 100]\n", | |
" [ 11 22 33 44 55 66 77 88 99 110]\n", | |
" [ 12 24 36 48 60 72 84 96 108 120]\n", | |
" [ 13 26 39 52 65 78 91 104 117 130]\n", | |
" [ 14 28 42 56 70 84 98 112 126 140]\n", | |
" [ 15 30 45 60 75 90 105 120 135 150]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# Therefore, assuming\n", | |
"A = array([[1,2,3,4,5,6,7,8,9,10]])\n", | |
"B = array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]])\n", | |
"# it only makes semantic sense to write\n", | |
"whatever_been_computed = dot(A.T, B)\n", | |
"print(whatever_been_computed)\n", | |
"# or\n", | |
"whatever_been_computed = dot(B.T, A)\n", | |
"print(whatever_been_computed)\n", | |
"\n", | |
"# regardless of whether you are computing the weights, gradients, errors, or any other intermediate values between 2 layers." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 67, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[ 1 2 3 4 5 6 7 8 9 10]\n" | |
] | |
}, | |
{ | |
"ename": "ValueError", | |
"evalue": "shapes (10,) and (15,) not aligned: 10 (dim 0) != 15 (dim 0)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", | |
"\u001b[1;32m<ipython-input-67-c890709b7c01>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mT\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 7\u001b[0m \u001b[1;31m# and this will fall and it hurts\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 8\u001b[1;33m \u001b[0mdot\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mA\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mT\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mB\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", | |
"\u001b[1;31mValueError\u001b[0m: shapes (10,) and (15,) not aligned: 10 (dim 0) != 15 (dim 0)" | |
] | |
} | |
], | |
"source": [ | |
"# You can't store A and B in plain vectors\n", | |
"A = array([1,2,3,4,5,6,7,8,9,10])\n", | |
"B = array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15])\n", | |
"\n", | |
"# Because A.T has no effect on a 1-D array\n", | |
"print(A.T)\n", | |
"# and this will fail, and it hurts\n", | |
"dot(A.T, B)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 75, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]]\n", | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]]\n", | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]]\n", | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]]\n", | |
"[[ 1]\n", | |
" [ 2]\n", | |
" [ 3]\n", | |
" [ 4]\n", | |
" [ 5]\n", | |
" [ 6]\n", | |
" [ 7]\n", | |
" [ 8]\n", | |
" [ 9]\n", | |
" [10]\n", | |
" [11]\n", | |
" [12]\n", | |
" [13]\n", | |
" [14]\n", | |
" [15]]\n", | |
"[[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# But, if one of your layers has only 1 node\n", | |
"A = array([[1]])\n", | |
"B = array([[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]])\n", | |
"\n", | |
"print(A * B)\n", | |
"print(B * A)\n", | |
"print(dot(A, B))\n", | |
"print(dot(A.T, B))\n", | |
"print(dot(B.T, A)) # oh, and this one's transpose is just\n", | |
"print(dot(B.T, A).T) # which == all of the first 4 equations" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"__The example above is the reason some students start trying crazy combinations of `*`, `dot` and `.T`.__\n", | |
"\n", | |
"Eventually, they will probably get the shapes to match. But does that mean they understand how it actually works?\n", | |
"\n", | |
"Also, even if the shapes match, the values could be wrong." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 94, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"[[ 7 16 13]\n", | |
" [14 26 23]\n", | |
" [21 36 33]]\n", | |
"[[ 49 56 63]\n", | |
" [112 148 154]\n", | |
" [189 228 249]]\n" | |
] | |
} | |
], | |
"source": [ | |
"# For example\n", | |
"A = array([\n", | |
" [1,2,3],\n", | |
" [4,5,6]\n", | |
" ])\n", | |
"B = array([\n", | |
" [7,8,9],\n", | |
" [0,2,1]\n", | |
" ])\n", | |
"\n", | |
"print(dot(A.T, B))\n", | |
"print(dot((A * B).T, B))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### End\n", | |
"I hope this helps you clear your mind and understand exactly what happens when you implement neural networks with NumPy as your matrix manipulation tool." | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.5.2" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |
@aluxian Sorry that I didn't respond to you in time.
This notebook is meant to show some incorrect ways to use NumPy and some NumPy gotchas. Therefore, I intentionally included errors.
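For anyone skimming, the gotchas can be condensed into a minimal sketch that runs without errors. The variable names (`v`, `row`, `a`, `w`, `b_in`) are illustrative and do not come from the notebook, and storing the 150 weights as a (10, 15) matrix is only one possible convention, assumed here for the example.

```python
import numpy as np

# Same numbers, different shapes (names here are illustrative only).
v = np.array([1, 2, 3])        # 1-D vector, shape (3,)
row = np.array([[1, 2, 3]])    # 2-D row matrix, shape (1, 3)

# dot with a 1-D second argument sums over the last axis of both inputs.
d_vec = np.dot(row, v)         # shape (1,)
# dot of two 2-D arrays is a matrix product, so the inner dimensions must match.
d_mat = np.dot(row, row.T)     # (1, 3) x (3, 1) -> shape (1, 1)
# * is element-wise and relies on broadcasting instead.
elem = row * v                 # shape (1, 3)

# Transposing a 1-D array changes nothing, which is why plain vectors
# cannot be turned into column vectors with .T.
same_shape = v.T.shape == v.shape

# Hypothetical fully connected layers: 10 nodes feeding 15 nodes.
# Assumption: one weight per edge, stored as a (10, 15) matrix.
a = np.ones((1, 10))           # activations of the 10-node layer
w = np.ones((10, 15))          # 150 weights
b_in = np.dot(a, w)            # input to the 15-node layer, shape (1, 15)

print(d_vec.shape, d_mat.shape, elem.shape, same_shape, w.size, b_in.shape)
```

Keeping every layer as a 2-D array, even when one dimension is 1, means `.T` and `dot` behave the way the math suggests, and a shape mismatch shows up as an error instead of a silently broadcast result.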
Why?