@benmarwick
Created April 20, 2013 04:01
Fit a topic model to one corpus, then calculate the posterior probability of those topics for each document in a new corpus. Uses R on Windows 7.
library(topicmodels)
data(AssociatedPress)
# Split the document-term matrix by rows (documents);
# the trailing comma keeps all term columns
train <- AssociatedPress[1:100, ]
test <- AssociatedPress[101:150, ]
# Fit a 5-topic LDA model to the training documents
train.lda <- LDA(train, k = 5)
# Determine the posterior probabilities of the topics
# for each document and of the terms for each topic,
# applying the fitted model to the unseen test documents.
test.lda <- posterior(train.lda, test)
# Here is the matrix with topics as cols,
# documents as rows and cell values as
# posterior probabilities
test.lda$topics
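
A common next step is to reduce the documents-by-topics matrix to the single most probable topic per document. The sketch below uses only base R and a small made-up matrix, `topics`, as a stand-in for the posterior matrix printed above. The document names and probabilities are hypothetical, chosen only so that the example runs without fitting a model.

```r
# Hypothetical stand-in for the documents-by-topics posterior matrix;
# the values are illustrative, not output from a fitted model.
topics <- matrix(
  c(0.70, 0.20, 0.10,
    0.05, 0.15, 0.80,
    0.30, 0.60, 0.10),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("1", "2", "3"))
)

# Each row is a probability distribution over topics, so it should sum to 1
stopifnot(all(abs(rowSums(topics) - 1) < 1e-8))

# Column index of the largest probability in each row
best_topic <- setNames(max.col(topics, ties.method = "first"),
                       rownames(topics))

# Probability attached to that most likely topic
best_prob <- topics[cbind(seq_len(nrow(topics)), best_topic)]

# One row per document: most likely topic and its probability
summary_df <- data.frame(
  document    = rownames(topics),
  topic       = colnames(topics)[best_topic],
  probability = best_prob
)
print(summary_df)
```

With real output, replacing the made-up matrix with the posterior topics matrix should give the same kind of summary, assuming its rows are documents and its columns are topics as described in the comments above.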