Created
May 25, 2017 00:36
-
-
Save slwu89/b849f1fbe60449708fe5aaada239f4be to your computer and use it in GitHub Desktop.
how to do MLE of multinomial distribution for categorical data
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
############################################################### | |
# | |
# MLE example with categorical data | |
# | |
############################################################### | |
# make up some 'data'; c(0.1,0.2,0.3,0.4) are the 'real' parameters | |
data = rmultinom(n = 40,size = 4,prob = c(0.1,0.2,0.3,0.4)) | |
# the log-likelihood of the first sample | |
dmultinom(x = data[,1],size = 4,prob = c(0.1,0.2,0.3,0.4),log = TRUE) | |
# pretend we don't know the real parameters and want to estimate with MLE | |
# first, define the function to calculate the log-likelihood of the data given a vector of parameters, 'x' | |
loglike = function(x, data){ | |
p1 = x[1]; p2 = x[2]; p3 = x[3]; p4 = x[4] | |
-sum(apply(X = data,MARGIN = 2,FUN = dmultinom,size = 4, prob = c(p1,p2,p3,p4), log = TRUE)) | |
} | |
# use optimization to find the MLE parameters | |
# initial guess is 0.25 for all 4 categories (we assume we know nothing) | |
# method 1: L-BFGS-B optimziation for MLE | |
mle1 = optim(par = c(0.25,0.25,0.25,0.25),fn = loglike,data = data,method = "L-BFGS-B", | |
lower = c(.Machine$double.eps,.Machine$double.eps,.Machine$double.eps,.Machine$double.eps),upper = c(1,1,1,1)) | |
# MLE values: it is kind of close to c(0.1,0.2,0.3,0.4) | |
mle1$par / sum(mle1$par) | |
# method 2: Nelder-Mead with constraints for MLE | |
# set constraints so probabilities between 0 and 1 | |
bounds <- matrix(c( | |
0,1, | |
0,1, | |
0,1, | |
0,1 | |
), ncol =2 , byrow=TRUE) | |
colnames(bounds) <- c("lower", "upper") | |
# Convert the constraints to the ui and ci matrices | |
n <- nrow(bounds) | |
ui <- rbind( diag(n), -diag(n) ) | |
ci <- c( bounds[,1], - bounds[,2] ) | |
mle2 = constrOptim(theta = c(0.25,0.25,0.25,0.25),f = loglike,grad = NULL,ui = ui,ci = ci,data = data) | |
# MLE values: it is kind of close to c(0.1,0.2,0.3,0.4) | |
mle2$par |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Thanks for sharing your code. I'd thought I'd point out that there is a problem with optimizing multinomial model. There are only 3 independent parameters but the optimization procedure above is on 4 parameters and so the model is not identifiable and different parameter values will give the same likelihood value, e.g.
I was looking for a way to optimize a multinomial model using optim but I can't think of a sensible way because the constraints on the parameters are not fixed (the only approach I can think of is EM).