Created
March 20, 2019 22:47
Linear Algebra in PRML 3.1.
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"### Linear Algebra in PRML 3.1.\n", | |
"\n", | |
"PRMLの行列式の計算で若干こんがらがったので整理しておく。\n", | |
"\n", | |
"本題に入る前にまず、PRMLにおける基本的な表記を整理しておく。(PRML p138)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 33, | |
"metadata": { | |
"collapsed": false, | |
"scrolled": false | |
}, | |
"outputs": [ | |
], | |
"source": [ | |
"%%latex\n", | |
"あるD次元上の一つの観測点を $${\\bf x}$$ で表し、以下のようなcolumn vectorとして定義する。\n", | |
"<br><br>\n", | |
"$${\\bf x} = (x_1,x_2,...,x_D)^T$$\n", | |
"<br><br>\n", | |
"例えばN個の観測点を得たとして、そのうちn番目の観測点を $${\\bf x}_n$$と表記する。\n", | |
"<br><br>\n", | |
"また、この観測点の集合を $${\\bf X}$$ で表し、以下の集合で定義する。\n", | |
"<br><br>\n", | |
"$${\\bf X} = \\{{\\bf x}_1,{\\bf x}_2,...,{\\bf x}_N\\}$$\n", | |
"<br><br>\n", | |
"\n", | |
"\n", | |
"\n", | |
"次に、ある観測点に対する観測値を $$t$$で表す。N個の観測値を得たとして、n番目の観測値は $$t_n$$と表記される。\n", | |
"<br>\n", | |
"N個の観測値の集合を $${\\bf t}$$ で表し、以下のようなcolumn vectorとして定義する。\n", | |
"<br><br>\n", | |
"$${\\bf t} = (t_1,t_2,...,t_N)^T$$\n", | |
"<br><br>\n", | |
"\n", | |
"観測値についての観測点の線形回帰を以下のように表す。\n", | |
"<br><br>\n", | |
"$$y({\\bf x},{\\bf w}) = w_0 + w_1x_1 + ... + w_Dx_D$$\n", | |
"<br><br>\n", | |
"ここで、$${\\bf w}$$は観測点の各変数と切片に対する重み(=回帰において推定すべきパラメータ)を表し、以下のcolumn vectorとして定義される。\n", | |
"<br><br>\n", | |
"$${\\bf w}=(w_0,w_1,...,w_D)^T$$\n", | |
"<br><br>\n", | |
"ここまで、変数の数を次元を表す$$D$$を用いて表記してきたが、ここから先は、モデルにおいて推定すべきパラメータ数を表す$$M$$を使う。\n", | |
"<br><br>\n", | |
"$$M=$$モデルにおいて推定するパラメータ数\n", | |
"<br><br>\n", | |
"つまり、切片の重み$$w_0$$もパラメータ数にカウントされるため、$${\\bf w}$$は以下のように書かれる。\n", | |
"<br><br>\n", | |
"$${\\bf w}=(w_0,w_1,...,w_{M-1})^T$$\n", | |
"<br><br>\n", | |
"ここで、線形回帰で前提となる線形関係は$$t$$と$${\\bf w}$$の間に成り立っていればよく、\n", | |
"$${\\bf x}$$は$$t$$と非線形の関係でもよいことを考えると、$$x_j$$部分は$${\\bf x}$$に何らかの変換(非線形も可)を行ったものとして一般化できる。\n", | |
"この「$${\\bf x}$$に対する何らかの変換」を$$\\phi_j({\\bf x})$$で表す。\n", | |
"例として、以下のような変換などが考えられる。<br><br>\n", | |
"$$\\phi_j({\\bf x}) = exp(-\\frac{(x_j-\\mu_j)^2}{2s^2}) $$\n", | |
"<br><br>\n", | |
"何の変換も行わない最もシンプルな線形回帰であれば、 $$\\phi_j({\\bf x})=x_j$$ であると考えればよい。\n", | |
"<br><br>\n", | |
"一般化した式を以下のように表す。<br><br>\n", | |
"$$y({\\bf x},{\\bf w})=w_0 + \\sum_{j=1}^{M-1}w_j\\phi_j({\\bf x})\n", | |
"= \\sum_{j=0}^{M-1}w_j\\phi_j({\\bf x}) = {\\bf w}^T\\phi({\\bf x})$$<br>\n", | |
"ただし、$$\\phi_0({\\bf x})=1$$<br>\n", | |
"<br><br>\n", | |
"わかりやすさのため展開すると、<br><br>\n", | |
"$${\\bf w}^T\\phi({\\bf x})=(w_0,w_1,...,w_{M-1})\n", | |
" \\left(\n", | |
" \\begin{array}{c}\n", | |
" \\phi_0({\\bf x}) \\\\\n", | |
" \\phi_1({\\bf x}) \\\\\n", | |
" ... \\\\\n", | |
" \\phi_{M-1}({\\bf x}) \\\\\n", | |
" \\end{array}\n", | |
" \\right)\n", | |
"= w_0\\phi_0({\\bf x})+w_1\\phi_1({\\bf x})+...+w_{M-1}\\phi_{M-1}({\\bf x})\n", | |
"$$\n", | |
"<br><br>\n", | |
"である。" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"さて、ここからが本題である。PRML p142の(3.14)式を例に、行列式の展開について整理する。" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 39, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
], | |
"source": [ | |
"%%latex\n", | |
"まず、(3.14)式は以下である。\n", | |
"<br><br>\n", | |
"$$0=\\sum_{n=1}^N t_n\\phi({\\bf x}_n)^T - {\\bf w}^T (\\sum_{n-1}^N \\phi({\\bf x}_n)\\phi({\\bf x}_n)^T)$$\n", | |
"<br><br>\n", | |
"この式を解くと、解は以下のようになる。\n", | |
"<br><br>\n", | |
"$${\\bf w} = (\\Phi^T\\Phi)^{-1}\\Phi^T{\\bf t}$$\n", | |
"<br><br>\n", | |
"ここで、$$\\Phi$$はdesign matrix(計画行列)であり以下のように定義される。\n", | |
"<br><br>\n", | |
"$$\n", | |
" \\Phi = \\left(\n", | |
" \\begin{array}{ccc}\n", | |
" \\phi_0(x_1) & \\phi_1(x_1) & ... & \\phi_{M-1}(x_1) \\\\\n", | |
" \\phi_0(x_2) & \\phi_1(x_2) & ... & \\phi_{M-1}(x_2) \\\\\n", | |
" ... & ... & ... & ... \\\\\n", | |
" \\phi_0(x_N) & \\phi_1(x_N) & ... & \\phi_{M-1}(x_N) \\\\\n", | |
" \\end{array}\n", | |
" \\right)\n", | |
"$$\n", | |
"<br>\n", | |
"$$\\Phi \\, is \\, an \\, N \\times M \\, matrix$$ \n", | |
"<br><br>\n", | |
"ここで少し詰まったので、ものすごく噛み砕いて以下に書き残しておく。" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 79, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
], | |
"source": [ | |
"%%latex\n", | |
"まず第一項。<br><br>\n", | |
"$$\\sum_{n=1}^N t_n\\phi({\\bf x}_n)^T$$<br>\n", | |
"$$=t_1(\\phi_0(x_1),\\phi_1(x_1),...,\\phi_{M-1}(x_1))$$<br>\n", | |
"$$+t_2(\\phi_0(x_2),\\phi_1(x_2),...,\\phi_{M-1}(x_2))$$<br>\n", | |
"$$...$$<br>\n", | |
"$$+t_N(\\phi_0(x_N),\\phi_1(x_N),...,\\phi_{M-1}(x_N))$$<br>\n", | |
"$$=(t_1\\phi_0(x_1)+t_2\\phi_0(x_2)+...+t_N\\phi_0(x_N), ...\n", | |
" , t_1\\phi_{M-1}(x_1)+t_2\\phi_{M-1}(x_2)+...+t_N\\phi_{M-1}(x_N))$$<br>\n", | |
"$$= (t_1,t_2,...,t_N)\n", | |
" \\left(\n", | |
" \\begin{array}{cccc}\n",
" \\phi_0(x_1) & \\phi_1(x_1) & ... & \\phi_{M-1}(x_1) \\\\\n", | |
" \\phi_0(x_2) & \\phi_1(x_2) & ... & \\phi_{M-1}(x_2) \\\\\n", | |
" ... & ... & ... & ... \\\\\n", | |
" \\phi_0(x_N) & \\phi_1(x_N) & ... & \\phi_{M-1}(x_N) \\\\\n", | |
" \\end{array}\n", | |
" \\right)$$<br>\n", | |
"$$={\\bf t}^T\\Phi$$" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 66, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
], | |
"source": [ | |
"%%latex\n", | |
"次に第二項の括弧内。\n", | |
"<br><br>\n", | |
"$$\\sum_{n-1}^N \\phi({\\bf x}_n)\\phi({\\bf x}_n)^T$$<br>\n", | |
"$$=\n", | |
"\\left(\n", | |
" \\begin{array}{c}\n", | |
" \\phi_0(x_1)\\\\\n", | |
" \\phi_1(x_1)\\\\\n", | |
" ...\\\\\n", | |
" \\phi_{M-1}(x_1)\\\\\n", | |
" \\end{array}\n", | |
"\\right)\n", | |
"(\\phi_0(x_1), \\phi_1(x_1), ..., \\phi_{M-1}(x_1))\n", | |
"+\n", | |
"...\n", | |
"+\n", | |
"\\left(\n", | |
" \\begin{array}{c}\n", | |
" \\phi_0(x_N)\\\\\n", | |
" \\phi_1(x_N)\\\\\n", | |
" ...\\\\\n", | |
" \\phi_{M-1}(x_N)\\\\\n", | |
" \\end{array}\n", | |
"\\right)\n", | |
"(\\phi_0(x_N), \\phi_1(x_N), ..., \\phi_{M-1}(x_N))\n", | |
"$$<br>\n", | |
"$$=\n", | |
"\\left(\n", | |
" \\begin{array}{cccc}\n",
" \\phi_0(x_1)\\phi_0(x_1) & \\phi_0(x_1)\\phi_1(x_1) & ... & \\phi_0(x_1)\\phi_{M-1}(x_1) \\\\\n", | |
" \\phi_1(x_1)\\phi_0(x_1) & \\phi_1(x_1)\\phi_1(x_1) & ... & \\phi_1(x_1)\\phi_{M-1}(x_1) \\\\\n", | |
" ... & ... & ... & ... \\\\\n", | |
" \\phi_{M-1}(x_1)\\phi_0(x_1) & \\phi_{M-1}(x_1)\\phi_1(x_1) & ... & \\phi_{M-1}(x_1)\\phi_{M-1}(x_1) \\\\\n", | |
" \\end{array}\n", | |
"\\right) + ... \n", | |
"$$<br>\n", | |
"$$+\n", | |
"\\left(\n", | |
" \\begin{array}{cccc}\n",
" \\phi_0(x_N)\\phi_0(x_N) & \\phi_0(x_N)\\phi_1(x_N) & ... & \\phi_0(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" \\phi_1(x_N)\\phi_0(x_N) & \\phi_1(x_N)\\phi_1(x_N) & ... & \\phi_1(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" ... & ... & ... & ... \\\\\n", | |
" \\phi_{M-1}(x_N)\\phi_0(x_N) & \\phi_{M-1}(x_N)\\phi_1(x_N) & ... & \\phi_{M-1}(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" \\end{array}\n", | |
"\\right)\n", | |
"$$<br>\n", | |
"$$=\n", | |
"\\left(\n", | |
" \\begin{array}{ccc}\n", | |
" \\phi_0(x_1)\\phi_0(x_1) + ... + \\phi_0(x_N)\\phi_0(x_N) & ... & \\phi_0(x_1)\\phi_{M-1}(x_1)+...+\\phi_0(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" \\phi_1(x_1)\\phi_0(x_1) + ... + \\phi_1(x_N)\\phi_0(x_N) & ... & \\phi_1(x_1)\\phi_{M-1}(x_1)+...+\\phi_1(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" ... & ... & ... \\\\\n", | |
" \\phi_{M-1}(x_1)\\phi_0(x_1)+...+\\phi_{M-1}(x_N)\\phi_0(x_N) & ... & \\phi_{M-1}(x_1)\\phi_{M-1}(x_1)+...+\\phi_{M-1}(x_N)\\phi_{M-1}(x_N) \\\\\n", | |
" \\end{array}\n", | |
"\\right)\n", | |
"$$\n", | |
"$$=\\Phi^T\\Phi$$\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 84, | |
"metadata": { | |
"collapsed": false | |
}, | |
"outputs": [ | |
], | |
"source": [ | |
"%%latex\n", | |
"ここまでを整理すると以下である。\n", | |
"<br><br>\n", | |
"$$0=\\sum_{n=1}^N t_n\\phi({\\bf x}_n)^T - {\\bf w}^T (\\sum_{n-1}^N \\phi({\\bf x}_n)\\phi({\\bf x}_n)^T)$$\n", | |
"<br><br>\n", | |
"$$0={\\bf t}^T\\Phi - {\\bf w}^T\\Phi^T\\Phi$$\n", | |
"<br><br>\n", | |
"$${\\bf w}^T\\Phi^T\\Phi={\\bf t}^T\\Phi$$\n", | |
"<br><br>\n", | |
"ここで、以下の行列の性質を使う。\n", | |
"<br><br>\n", | |
"$$(A^T)^T = A$$<br>\n", | |
"$$(A_1^TA_2^T...A_n^T)=A_n^T...A_2^TA_1^T$$<br>\n", | |
"$$A^{-1}A = AA^{-1} = I$$\n", | |
"<br><br>\n", | |
"元の式に戻る。最初の性質を使って左辺右辺ともに以下のように書き換える。\n", | |
"<br><br>\n", | |
"$${\\bf w}^T\\Phi^T(\\Phi^T)^T={\\bf t}^T(\\Phi^T)^T$$\n", | |
"<br><br>\n", | |
"二番目の性質を使って左辺右辺ともに以下のように書き換える。\n", | |
"<br><br>\n", | |
"$$(\\Phi^T\\Phi{\\bf w})^T=(\\Phi^T{\\bf t})^T$$\n", | |
"<br><br>\n", | |
"すなわち\n", | |
"<br><br>\n", | |
"$$\\Phi^T\\Phi{\\bf w}=\\Phi^T{\\bf t}$$\n", | |
"<br><br>\n", | |
"最後に、三番目の性質を使って以下のように書き換える。\n", | |
"<br><br>\n", | |
"$$(\\Phi^T\\Phi)^{-1}\\Phi^T\\Phi{\\bf w}=(\\Phi^T\\Phi)^{-1}\\Phi^T{\\bf t}$$<br>\n", | |
"$${\\bf w}=(\\Phi^T\\Phi)^{-1}\\Phi^T{\\bf t}$$" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"ということで、PRML p142の(3.14)式についての行列計算ができた。\n", | |
" \n", | |
"参考\n", | |
"- [転置行列の定理](https://www.iwanttobeacat.com/entry/2018/01/05/220317)\n", | |
"- [転置行列の基本的な4つの性質と証明](https://mathtrain.jp/transpose)\n", | |
"- [正則行列と逆行列の諸性質](https://risalc.info/src/inverse-matrix.html)" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.5.2" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 0 | |
} |