Last active
February 28, 2020 05:43
-
-
Save caryan/6cb5be4bf7e0306dc236c3688e548d68 to your computer and use it in GitHub Desktop.
Two approaches to Savitsky Golay filtering in Julia
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Introduction\n", | |
"\n", | |
"Savitsky-Golay filtering smooths noisy data by fitting polynomials to a moving window of data points. The key insight is that we can solve for the polynomial coefficients as linear combinations of the data keeping the data undefined. The smoothing then becomes a convolution filter to calculate a weighted sum of the data in the window. \n", | |
"\n", | |
"There are two approaches to obtaining the filter coefficients. Using conventional polynomials $p_0 + p_1x + p_2x^2 + p_3x^3 + \\dots$ then we can solve for the coefficients with a matrix (pseudo)-inversion. Alternatively we can use a orthogonal polynomial basis (the Gram polynomials) in which case we have a closed form for the coefficients.\n", | |
"\n", | |
"Here we compare the two methods for fun. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Conventional Savitsky Golay\n", | |
"\n", | |
"We can cast the polynomial fitting problem in matrix form: $\\mathbf{A}\\cdot\\mathbf{p} = \\mathbf{y}$ for a column vector of the polynomial coefficients, $\\mathbf{p}$ and the data in the window to fit $\\mathbf{y}$. The design matrix $\\mathbf{A}$ is a Vandermonde matrix\n", | |
"\n", | |
"\\begin{equation*}\n", | |
"\\begin{bmatrix}\n", | |
"\\vdots & \\vdots & \\vdots & \\vdots &\\\\\n", | |
"1 & -2 & 4 & -8 & \\dots \\\\\n", | |
"1 & -1 & 1 & -1 & \\dots \\\\\n", | |
"1 & 0 & 0 & 0 & \\dots \\\\\n", | |
"1 & 1 & 1 & 1 & \\dots \\\\\n", | |
"1 & 2 & 4 & 8 & \\dots \\\\\n", | |
"\\vdots & \\vdots & \\vdots & \\vdots &\\\\\n", | |
"\\end{bmatrix}\n", | |
"\\begin{bmatrix}\n", | |
"p_0 \\\\\n", | |
"p_1 \\\\\n", | |
"p_2 \\\\\n", | |
"p_3 \\\\\n", | |
"\\vdots\n", | |
"\\end{bmatrix}\n", | |
"=\n", | |
"\\begin{bmatrix}\n", | |
"\\vdots \\\\\n", | |
"y_{-2} \\\\\n", | |
"y_{-1} \\\\\n", | |
"y_0 \\\\\n", | |
"y_1 \\\\\n", | |
"y_2 \\\\\n", | |
"\\vdots\n", | |
"\\end{bmatrix}\n", | |
"\\end{equation*}\n", | |
"\n", | |
"For fitting a $m^{th}$ order polynomial to a data window $w$ points wide we have a $\\mathbf{A}$ is a $w \\times (m+1)$ matrix and we are looking for the pseudo-inverse matrix $\\mathbf{C}$ ($(m+1) \\times w$) such that $\\mathbf{CA}$ is a $(m+1) \\times (m+1)$ identity matrix and if we left multiply both sides by $\\mathbf{C}$ then we have the coefficients as a product of a matrix $\\mathbf{C}$ and the data.\n", | |
"\n", | |
"\\begin{equation*}\n", | |
"\\begin{bmatrix}\n", | |
"p_0 \\\\\n", | |
"p_1 \\\\\n", | |
"p_2 \\\\\n", | |
"p_3 \\\\\n", | |
"\\vdots\n", | |
"\\end{bmatrix}\n", | |
"=\n", | |
"\\mathbf{C}\n", | |
"\\begin{bmatrix}\n", | |
"\\vdots \\\\\n", | |
"y_{-2} \\\\\n", | |
"y_{-1} \\\\\n", | |
"y_0 \\\\\n", | |
"y_1 \\\\\n", | |
"y_2 \\\\\n", | |
"\\vdots\n", | |
"\\end{bmatrix}\n", | |
"\\end{equation*}\n", | |
"\n", | |
"We are guaranteed the pseudo-inverse exists because $w > (m+1)$ for the fit to unique and a rectangular $w \\times m$ Vandermonde matrix with all $x_i$ unique (which we have because we have equally spaced points) has maximum rank. " | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Then each row of $\\mathbf{C}$ gives the coefficients for a particular polynomial coefficient. However, for Savitsky-Golay we have a moving window and we only need to evalutate the window at the middle point where $x=0$ and so all the $p_i, i>0$ don't matter and we only need the first row. Hence we only need a single row of the matrix. Similar reasoning to needingly only a single row if we are taking the $n$'th derivative. We can take advantage of that by least-squares solving for only a single row of $\\mathbf{C}$. We're looking to solve $\\mathbf{CA} = \\mathbf{I}$ and taking the transpose to put it in the conventional $\\mathbf{AX} = \\mathbf{B}$ we have $\\mathbf{A}^\\mathrm{T}\\mathbf{C}^\\mathrm{T} = \\mathbf{I}$ or $\\mathbf{C}^\\mathrm{T} = \\mathbf{A}^\\mathrm{T} \\backslash \\mathbf{I}$. We can then solve for say only the first column of $\\mathbf{C}^\\mathrm{T}$ with\n", | |
"\n", | |
"\\begin{equation*}\n", | |
"\\vec{C}_0 = \\mathbf{A}^\\mathrm{T} \\backslash\n", | |
"\\begin{bmatrix}\n", | |
"1 \\\\\n", | |
"0 \\\\\n", | |
"0 \\\\\n", | |
"0 \\\\\n", | |
"\\vdots\n", | |
"\\end{bmatrix}\n", | |
"\\end{equation*}" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Numerical Example\n", | |
"\n", | |
"Take the usual example of fitting a quadratic to window size of 5." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 1, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Julia Version 1.3.1\n", | |
"Commit 2d5741174c (2019-12-30 21:36 UTC)\n", | |
"Platform Info:\n", | |
" OS: Linux (x86_64-linux-gnu)\n", | |
" CPU: Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz\n", | |
" WORD_SIZE: 64\n", | |
" LIBM: libopenlibm\n", | |
" LLVM: libLLVM-6.0.1 (ORCJIT, skylake)\n" | |
] | |
} | |
], | |
"source": [ | |
"versioninfo()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"5×3 Array{Float64,2}:\n", | |
" 1.0 -2.0 4.0\n", | |
" 1.0 -1.0 1.0\n", | |
" 1.0 0.0 0.0\n", | |
" 1.0 1.0 1.0\n", | |
" 1.0 2.0 4.0" | |
] | |
}, | |
"execution_count": 2, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# create the design matrix A\n", | |
"function design_matrix(order, window_length)\n", | |
" half_length = (window_length-1) / 2\n", | |
"\n", | |
" A = Matrix{Float64}(undef, window_length, order+1)\n", | |
"\n", | |
" for i in 1:window_length, j in 1:order+1\n", | |
" A[i,j] = (i-1-half_length)^(j-1)\n", | |
" end\n", | |
" A\n", | |
"end\n", | |
"A = design_matrix(2,5)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Now compare the full pseudo-inverse to least-squares solving. " | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"using LinearAlgebra: pinv, transpose, lu, I" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"3×5 Array{Float64,2}:\n", | |
" -3.0 12.0 17.0 12.0 -3.0\n", | |
" -7.0 -3.5 2.93447e-15 3.5 7.0\n", | |
" 5.0 -2.5 -5.0 -2.5 5.0" | |
] | |
}, | |
"execution_count": 4, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# from the pseudo-inverse the rows give us the convolution coefficients\n", | |
"35 * pinv(A)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 5, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"5×3 Array{Float64,2}:\n", | |
" -3.0 -7.0 5.0\n", | |
" 12.0 -3.5 -2.5\n", | |
" 17.0 1.88488e-15 -5.0\n", | |
" 12.0 3.5 -2.5\n", | |
" -3.0 7.0 5.0" | |
] | |
}, | |
"execution_count": 5, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# from the least squares approach the columns give us the convolution coefficients\n", | |
"Id = Matrix(1.0I, size(A,2), size(A,2))\n", | |
"35 * (transpose(A) \\ Id)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"There were actually many errors in the original Savitsky-Golay [1]. This was pointed out in a later comment [2] which reproduced all the tables with the rather harsh comment:\n", | |
"\n", | |
"```\n", | |
"The corrected values are presented in Table I. Also, Tables 11, 111, IV, V, and VI are corrected versions of Tables V, VII, IX, X, and XI given by Savitzky and Golay (I). They either contain more numerous errors or are systematically false.\n", | |
"```\n", | |
"\n", | |
"[1]: Savitzky, A., & Golay, M. J. E. (1964). Smoothing and Differentiation of Data by Simplified Least Squares Procedures. [Analytical Chemistry, 36(8), 1627–1639](https://doi.org/10.1021/ac60214a047).\n", | |
"[2]: Steinier, J., Termonia, Y., & Deltour, J. (1972). Comments on Smoothing and differentiation of data by simplified least square procedure. [Analytical Chemistry, 44(11), 1906–1909](https://doi.org/10.1021/ac60319a045)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"-12 -253\n", | |
"-11 -138\n", | |
"-10 -33\n", | |
" -9 62\n", | |
" -8 147\n", | |
" -7 222\n", | |
" -6 287\n", | |
" -5 342\n", | |
" -4 387\n", | |
" -3 422\n", | |
" -2 447\n", | |
" -1 462\n", | |
" 0 467\n", | |
" 1 462\n", | |
" 2 447\n", | |
" 3 422\n", | |
" 4 387\n", | |
" 5 342\n", | |
" 6 287\n", | |
" 7 222\n", | |
" 8 147\n", | |
" 9 62\n", | |
" 10 -33\n", | |
" 11 -138\n", | |
" 12 -253\n" | |
] | |
} | |
], | |
"source": [ | |
"# confirm the error in Table I of the original Savitsky-Golay paper: \n", | |
"# Savitzky, A., & Golay, M. J. E. (1964). Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Analytical Chemistry, 36(8), 1627–1639. https://doi.org/10.1021/ac60214a047\n", | |
"using Printf\n", | |
"A = design_matrix(2, 25)\n", | |
"Id = Matrix(1.0I, size(A,2), size(A,2))\n", | |
"coeffs = round.(Int, 5175 * (transpose(A) \\ Id[:,1]))\n", | |
"for (pt, coeff) = zip(-12:12, coeffs)\n", | |
" @printf(\"%3d %3d\\n\", pt, coeff) \n", | |
"end\n", | |
"# note the values at ±5 are 342 not 322 as in the paper" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Creating the design matrix.\n", | |
" 2.380 μs (1 allocation: 1.59 KiB)\n", | |
"\n", | |
"Full matrix of coefficients from `pinv`\n", | |
" 12.943 μs (27 allocations: 20.20 KiB)\n", | |
"\n", | |
"Full matrix of coefficients from least squares\n", | |
" 12.118 μs (73 allocations: 90.61 KiB)\n", | |
"\n", | |
"Single row by slicing pinv\n", | |
" 13.730 μs (28 allocations: 20.53 KiB)\n", | |
"\n", | |
"Single row by least squares\n", | |
" 10.608 μs (77 allocations: 85.75 KiB)\n" | |
] | |
} | |
], | |
"source": [ | |
"# and the relative performance.\n", | |
"using BenchmarkTools\n", | |
"\n", | |
"println(\"Creating the design matrix.\")\n", | |
"@btime design_matrix(5,31)\n", | |
"\n", | |
"A = design_matrix(5,31)\n", | |
"println(\"\\nFull matrix of coefficients from `pinv`\")\n", | |
"@btime pinv($A)\n", | |
"\n", | |
"println(\"\\nFull matrix of coefficients from least squares\")\n", | |
"AT = transpose(A)\n", | |
"Id = Matrix(1.0I, 6, 6)\n", | |
"@btime $AT \\ $Id\n", | |
"\n", | |
"println(\"\\nSingle row by slicing pinv\")\n", | |
"@btime pinv($A)[1,:]\n", | |
"\n", | |
"println(\"\\nSingle row by least squares\")\n", | |
"unit_vector = Id[:,1]\n", | |
"@btime $AT \\ $unit_vector\n", | |
";" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"# Gram Polynomials\n", | |
"\n", | |
"There is nothing special about the conventional polynomials $p_0 + p_1x + p_2x^2 + p_3x^3 + \\dots$ and so there is no reason not to use another basis of polynomials for the fitting and smoothing. Indeed if we use an orthogonal set (the [Gram or discrete Chebyshev polynomials](https://en.wikipedia.org/wiki/Discrete_Chebyshev_polynomials)) then closed form solutions for convolution weights can be found. Following Gorry, P. A. (1990). General Least-Squares Smoothing and Differentiation by the Convolution (Savitzky-Golay) Method. [Analytical Chemistry, 62(6), 570–573](https://doi.org/10.1021/ac00205a007) we can write down a recursive form for the convolution weight matrix." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"conv_weight" | |
] | |
}, | |
"execution_count": 8, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"\"\"\"\n", | |
"Generalized factorial function a⁽ᵇ⁾ = (a)(a - 1) ... (a - b + 1) and a⁽⁰⁾ = 1.\n", | |
"\"\"\"\n", | |
"gen_fact(a::Int, b::Int) = reduce(*, a:-1:(a-b+1); init=1)\n", | |
"\n", | |
"\"\"\"\n", | |
"Recursive calculation of the order `k` Gram polynomial over `2m+1` points evaluted at `i`. If `s>0` then return the `s`'th derivative. Eq. 11 of the paper.\n", | |
"\"\"\"\n", | |
"# the Gram Polynomial \n", | |
"function gram_poly(i::Int, m::Int, k::Int, s::Int)\n", | |
" if k > 0\n", | |
" denom = k*(2*m - k + 1)\n", | |
" return (4*k-2)/denom * (i*gram_poly(i, m, k-1,s) + s*gram_poly(i,m,k-1,s-1)) - ((k-1)*(2*m + k)/denom) * gram_poly(i,m,k-2,s)\n", | |
" elseif k == 0 && s == 0\n", | |
" return 1.0\n", | |
" else\n", | |
" return 0.0\n", | |
" end\n", | |
"end\n", | |
"\n", | |
"\"\"\"\n", | |
"Evaluate the convolution weight for the `i`'th data point for the `t`'th least square point for the `n`'th order Gram polynomial evaluated over `2m+1` points or `s`'th derivative of. Eq. 8 of the paper.\n", | |
"\"\"\"\n", | |
"function conv_weight(i::Int, t::Int, m::Int, n::Int, s::Int)\n", | |
" sum( (2*k+1)*gen_fact(2*m,k)/gen_fact(2*m+k+1, k+1) * gram_poly(i,m,k,0)*gram_poly(t,m,k,s) for k = 0:n )\n", | |
"end" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"5×5 Array{Float64,2}:\n", | |
" 31.0 9.0 -3.0 -5.0 3.0\n", | |
" 9.0 13.0 12.0 6.0 -5.0\n", | |
" -3.0 12.0 17.0 12.0 -3.0\n", | |
" -5.0 6.0 12.0 13.0 9.0\n", | |
" 3.0 -5.0 -3.0 9.0 31.0" | |
] | |
}, | |
"execution_count": 9, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# check the classic quadratic fit to five points from Table I\n", | |
"# the middle row reproduces the conventional Savitsky-Golay coefficients\n", | |
"m = 2\n", | |
"hcat((35 * [[conv_weight(i,t,m,2,0) for i = -m:m] for t = -m:m])...)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"7×7 Array{Float64,2}:\n", | |
" -257.0 -122.0 -29.0 22.0 31.0 -2.0 -77.0\n", | |
" 122.0 17.0 -46.0 -67.0 -46.0 17.0 122.0\n", | |
" 185.0 62.0 -19.0 -58.0 -55.0 -10.0 77.0\n", | |
" 72.0 48.0 24.0 0.0 -24.0 -48.0 -72.0\n", | |
" -77.0 10.0 55.0 58.0 19.0 -62.0 -185.0\n", | |
" -122.0 -17.0 46.0 67.0 46.0 -17.0 -122.0\n", | |
" 77.0 2.0 -31.0 -22.0 29.0 122.0 257.0" | |
] | |
}, | |
"execution_count": 10, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"# check non-symmetric result and derivative from Table II: cubic fit to 7 pts, 1st derivative\n", | |
"m = 3\n", | |
"hcat((252 * [[conv_weight(i,t,m,3,1) for i = -m:m] for t = -m:m])...)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
" 1.890 ms (5798 allocations: 145.64 KiB)\n" | |
] | |
} | |
], | |
"source": [ | |
"# timing\n", | |
"m = 15\n", | |
"@btime [[conv_weight(i,t,m,5,0) for i = -$m:$m] for t = -$m:$m];" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Julia 1.3.1", | |
"language": "julia", | |
"name": "julia-1.3" | |
}, | |
"language_info": { | |
"file_extension": ".jl", | |
"mimetype": "application/julia", | |
"name": "julia", | |
"version": "1.3.1" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 4 | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment