Last active
May 18, 2020 14:15
-
-
Save danstowell/151a3b68ceb9c5f4a51f to your computer and use it in GitHub Desktop.
Implementations of non-negative matrix factorisation (NMF) in Stan
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
N <- 15 | |
M <- 12 | |
K <- 3 | |
Wconc <- 100 | |
Hconc <- 5 | |
Winit <- structure(c(.056,.111,.056,.111,.056,.111,.056,.111,.056,.111,.056,.111,.111,.056,.111,.056,.111,.056,.111,.056,.111,.056,.111,.056,.063,.063,.125,.063,.063,.125,.063,.063,.125,.063,.063,.125 | |
), .Dim=c(12,3)) | |
X <- structure(c(.032,.032,.091,.032,.157,.264,.157,.139,.486,.257,.709,.934,.036,.036,.08,.036,.255,.192,.255,.1,.638,.171,1.228,.62,.039,.039,.226,.039,.119,.521,.119,.128,.809,.227,.475,1.537,.047,.047,.212,.047,.113,.589,.113,.242,.717,.458,.407,2.002,.02,.02,.058,.02,.07,.189,.07,.107,.251,.203,.293,.705,.033,.033,.062,.033,.128,.268,.128,.208,.325,.403,.549,1.142,.039,.039,.245,.039,.144,.516,.144,.081,.923,.128,.609,1.39,.05,.05,.189,.05,.326,.383,.326,.091,1.077,.136,1.55,1.024,.039,.039,.245,.039,.106,.548,.106,.112,.847,.194,.407,1.565,.023,.023,.134,.023,.072,.301,.072,.065,.486,.112,.291,.862,.04,.04,.145,.04,.205,.352,.205,.131,.726,.233,.938,1.107,.028,.028,.173,.028,.076,.397,.076,.093,.593,.164,.287,1.163,.028,.028,.058,.028,.14,.199,.14,.136,.36,.257,.636,.781,.033,.033,.081,.033,.108,.309,.108,.207,.347,.401,.441,1.245,.036,.036,.066,.036,.142,.285,.142,.221,.358,.428,.615,1.211 | |
), .Dim=c(12,15)) | |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
data { | |
int<lower=1> N; // num frames | |
int<lower=1> M; // num bins | |
int<lower=1> K; // num factors | |
real<lower=0> Wconc; // concentration parameter on template priors - a large number means relative certainty, stick close to the prior | |
real<lower=0> Hconc; // concentration parameter on activations - a large number encourages sparsity | |
matrix<lower=0>[M,K] Winit; // prior spectral templates -- these should each be different (or else it'd be unidentifiable). avoid zeroes too. | |
matrix<lower=0>[M,N] X; // spectrogram | |
} | |
parameters { | |
simplex[M] W[K]; // inferred spectral templates | |
matrix<lower=0>[K,N] H; // activations | |
} | |
transformed parameters { | |
vector<lower=0>[M*N] Xnorm; | |
vector<lower=0>[M] Winitnorm[K]; | |
Xnorm <- to_vector(X); // spectrogram, flattened, but in this case NOT normalised (since not PLCA) | |
for (k in 1:K) { | |
Winitnorm[k] <- col(Winit, k) * Wconc / sum(col(Winit, k)); // Winit, normalised and pre-multiplied by concn | |
} | |
} | |
model { | |
vector[M*N] Xest; | |
matrix[K,M] Wmat; // W, but indexed backwards and transposed later, since it seems that's the only way to assign the simplices into the matrix columns | |
for (k in 1:K) { | |
Wmat[k] <- W[k]'; | |
} | |
Xest <- to_vector(Wmat' * H); | |
for (k in 1:K) { | |
W[k] ~ dirichlet(Winitnorm[k]); // our supplied spectral templates are used as centres for the priors on the spectral templates. each tpl has its own dirichlet. | |
} | |
to_vector(H) ~ exponential(Hconc); // a prior on the activations | |
Xnorm ~ normal(0, 0.5 * Xest); // IS-NMF -- the factor of 1/2 here comes from X being assumed a power-spectrogram where the underlying random variables are complex unit-norm | |
} | |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
data { | |
int<lower=1> N; // num frames | |
int<lower=1> M; // num bins | |
int<lower=1> K; // num factors | |
real<lower=0> Wconc; // concentration parameter on template priors - a large number means relative certainty, stick close to the prior | |
real<lower=0> Hconc; // concentration parameter on activations - a large number encourages sparsity | |
matrix<lower=0>[M,K] Winit; // prior spectral templates -- these should each be different (or else it'd be unidentifiable). avoid zeroes too. | |
matrix<lower=0>[M,N] X; // spectrogram | |
} | |
parameters { | |
simplex[M] W[K]; // inferred spectral templates | |
matrix<lower=0>[K,N] H; // activations | |
real<lower=0> Xconc; // overall concentration of spectrogram, which means how closely it sticks to the dotproduct, i.e. relates to noise level | |
} | |
transformed parameters { | |
vector<lower=0>[M*N] Xnorm; | |
vector<lower=0>[M] Winitnorm[K]; | |
Xnorm <- to_vector(X / sum(X)); // spectrogram, normalised and flattened | |
for (k in 1:K) { | |
Winitnorm[k] <- col(Winit, k) * Wconc / sum(col(Winit, k)); // Winit, normalised and pre-multiplied by concn | |
} | |
} | |
model { | |
vector[M*N] Xest; | |
matrix[K,M] Wmat; // W, but indexed backwards and transposed later, since it seems that's the only way to assign the simplices into the matrix columns | |
for (k in 1:K) { | |
Wmat[k] <- W[k]'; | |
} | |
Xest <- to_vector(Wmat' * H); | |
for (k in 1:K) { | |
W[k] ~ dirichlet(Winitnorm[k]); // our supplied spectral templates are used as centres for the priors on the spectral templates. each tpl has its own dirichlet. | |
} | |
to_vector(H) ~ exponential(Hconc); // a prior on the activations | |
Xconc ~ lognormal(100, 10); // a high value of Xconc means low SNR (a more semantic parametrisation would be nice...) | |
Xnorm ~ dirichlet(Xest * Xconc); // our input data is modelled as a multinomial sampled from this overall mega-dirichlet | |
} |
Hi. I'm afraid I don't know - this code is old and I'm not active in the same topics right now.
The transformed parameters: well, the W are dirichlet, as you say. Rather than passing the Dirichlet prior in directly as parameters, I allow the user to pass in a set of spectral templates and a concentration parameter, then we transform these into "Winitnorm" as a probabilistically meaningful Dirichlet prior.
It may also be useful to pre-transform some things (e.g. "Xnorm"), and it's more efficient to do it in this preprocessing block than inside the main model that Stan optimises.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi!
I am trying to implement NMF in rstan and this code helps a lot. I just have a few questions, it would be great if you could help.
So, I understand you're trying to factorize X =WH where columns of W are dirichlet and H is exponential.
What is 'transformed parameters' block for here?