Source: SSC 389: AR(1) + noise
- Xt is hidden state at time t
- Yt is observed state at time t
- Dt is information known at time t, i.e. {Y1, ..., Yt}
- T is length of X, Y, D
Hidden: Xt = c + bX[t-1] + e
Observed: Yt = Xt + v
- where v ≈ N(0, s), e ≈ N(0, w), and v,e independent
- and X is stationary
Goal: describe a gibbs sampler for viewing the posterior of all unknowns given only the observed Y: P(X, c, b, w, s | DT)
- Draw (c, b, w, s) from P(c, b, w, s | X, DT)
- Draw X from P(X | c, b, w, s, DT)
[...]?
(Let A:={c,b,w,s})
P(X | A, DT) = product of P(XT | A, DT) and P(Xt | X[t+1:T], A, DT) for all t<T
2a. P(XT | A, DT) ≈ N(mt, Ct) 2b. P(Xt | X[t+1:T], A, DT) = P(Xt | X[t+1], A, DT) = N(ht, Bt)
So to draw from this distribution, first we draw XT from P(XT | A, DT), then we should be able to sample X[T-i] from P(X[T-i] | X[T-i+1], A, D[T-i]) for each i<T, giving us a draw from the joint distribution of P(X | A, DT).
First we want to find the posterior at last time point T: P(XT | DT) ≈ N(mt, Ct). What is (mt, Ct)?
Posterior at t-1: [X[t-1] | D[t-1]] ≈ N(m[t-1], C[t-1])
- m0, C0 are known (?).
Prior at t-1: [Xt | D[t-1]] ≈ N(at, Rt)
- at := a + b*m(t−1)
- Rt := b^2*Ct + w^2
Predictive at t-1: [Yt | D[t-1]] ≈ N(ft, Qt)
- ft := at
- Qt := Rt + s^2
Posterior at t: [Xt | Dt] ≈ N(mt, Ct)
- mt := at + At*et
- Ct := Rt − Rt*At
- At = Rt/Qt
- et = yt − ft
Thus to calculate (mt, Ct) we need to calculate the forward probabilities by beginning at t=0 and going up until we get to t=T. Then we draw XT from N(mt, Ct).
At this point we already know XT, so we now want to draw from P(X[T-1] | XT, A, D[T-1]) = N(h[t-1], B[t-1]), where:
- h[t-i] := m(t-i) + bC[t-i]/Y[t-i+1](X[t-i+1] − a[t-i])
- B[t-i] := C(t-i) − b^2*C[t-i]^2/Y[t-i+1]
We just calculated all these things in the previous step. So we simply need to start at i=T-1 and go until i=1.