@eric-pedersen
Last active February 16, 2021 18:57
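# Create a fixed 0/1 (Bernoulli-type) data set with a known fraction of ones
n = 1000
frac = 0.9
# Note: n*(1-frac) evaluates to slightly below 100 in floating point, and rep()
# truncates non-integer counts, so x has 999 values (900 ones, 99 zeros).
# Wrap the counts in round() to get exactly n values.
x = rep(c(1,0), times = c(n*frac, n*(1-frac)))
# Fit an intercept-only Gaussian model and an intercept-only binomial model to the same data
gauss_mod = glm(x~1, family = gaussian)
binom_mod = glm(x~1, family = binomial)
# Compare the AIC of the two models (lower is nominally "better")
AIC(gauss_mod, binom_mod)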
@eric-pedersen

Overall result: an AIC comparison would reject the binomial model, even though it is the actual model for the data:

model      df  AIC
gauss_mod   2  425
binom_mod   1  648
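
As a cross-check, the AIC values can be rebuilt by hand from the definition AIC = 2k - 2LL. The sketch below rebuilds `x` and both models exactly as in the gist. The names `p_hat`, `sigma_hat`, `ll_gauss`, `ll_binom`, `aic_gauss` and `aic_binom` are introduced here for illustration only. It assumes the Gaussian variance is estimated by maximum likelihood (dividing by the number of observations), which is what `AIC()` uses for a Gaussian `glm`.

```r
# Rebuild the data and models exactly as in the gist
n = 1000
frac = 0.9
x = rep(c(1,0), times = c(n*frac, n*(1-frac)))
gauss_mod = glm(x~1, family = gaussian)
binom_mod = glm(x~1, family = binomial)

# Log-likelihoods computed by hand at the maximum-likelihood estimates
p_hat = mean(x)
sigma_hat = sqrt(mean((x - p_hat)^2))  # ML estimate: divides by length(x), not length(x) - 1
ll_gauss = sum(dnorm(x, mean = p_hat, sd = sigma_hat, log = TRUE))
ll_binom = sum(dbinom(x, size = 1, prob = p_hat, log = TRUE))

# AIC = 2k - 2LL, with k = 2 for the Gaussian (mean and variance)
# and k = 1 for the binomial (probability only)
aic_gauss = 2*2 - 2*ll_gauss
aic_binom = 2*1 - 2*ll_binom

# Each pair should agree up to numerical tolerance
c(by_hand = aic_gauss, from_glm = AIC(gauss_mod))
c(by_hand = aic_binom, from_glm = AIC(binom_mod))
```

The Gaussian log-likelihood sums log densities and the binomial one sums log probabilities, so the two totals are not on the same scale.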

@eric-pedersen

As for why that's true: the likelihood for a continuous distribution is based on a probability density function, which can take any value from 0 to infinity, meaning the log-likelihood (LL) can range from -infinity to +infinity. Discrete distributions use probability mass functions, which have a maximum value of 1, so their LL can never exceed 0. The two likelihoods are therefore on different scales, and their AIC values are not directly comparable.
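
A minimal illustration of that difference, using only base R's `dnorm` and `dbinom`. The standard deviation of 0.1 is an arbitrary choice for the example, not a value taken from the models above.

```r
# A narrow Gaussian has a density well above 1 near its mean,
# so its log-density there is positive
dens_peak = dnorm(0, mean = 0, sd = 0.1)   # about 3.99
log_dens_peak = log(dens_peak)             # greater than 0

# A probability mass function can never exceed 1,
# so every log-probability is at most 0
mass_max = max(dbinom(0:1, size = 1, prob = 0.9))   # 0.9
log_mass_max = log(mass_max)                        # less than 0

c(dens_peak, log_dens_peak, mass_max, log_mass_max)
```

Shrinking `sd` further makes the peak density, and hence the attainable log-likelihood, as large as you like, whereas the mass function stays capped at 1.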
