benmarwick/5424699, created April 20, 2013 04:01
Fit a topic model to one corpus, then calculate the probability of those topics in a new corpus. Uses R on Windows 7.
library(topicmodels)
data(AssociatedPress)
# Rows of the document-term matrix are documents:
# fit on the first 100, hold out the next 50
train <- AssociatedPress[1:100, ]
test <- AssociatedPress[101:150, ]
# Fit an LDA model with 5 topics to the training documents
train.lda <- LDA(train, k = 5)
# Determine the posterior probabilities of the topics
# for each document and of the terms for each topic,
# applying the fitted topic model to the held-out documents.
test.lda <- posterior(train.lda, test)
# Here is the matrix with topics as cols,
# documents as rows and cell values as
# posterior probabilities
test.lda$topics
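The documents-by-topics matrix printed at the end of the script can be hard to scan by eye. A minimal sketch of one way to summarise it follows, assuming only base R. The helper name `summarise_topics` and the toy matrix are hypothetical and not part of topicmodels; the toy values are made up so that the example runs without fitting a model.

```r
# Hypothetical helper (not part of topicmodels): for a matrix with documents
# as rows and topics as columns, report the most probable topic per document.
summarise_topics <- function(topic_probs) {
  stopifnot(is.matrix(topic_probs), ncol(topic_probs) >= 1)
  rows <- seq_len(nrow(topic_probs))
  # Column index of the largest value in each row; ties go to the first column
  top <- max.col(topic_probs, ties.method = "first")
  data.frame(
    document = rows,
    top_topic = top,
    probability = topic_probs[cbind(rows, top)]
  )
}

# Toy stand-in for the posterior matrix: 3 documents, 2 topics (made-up values)
toy <- matrix(c(0.9, 0.1,
                0.2, 0.8,
                0.5, 0.5),
              ncol = 2, byrow = TRUE)
result <- summarise_topics(toy)
print(result)
```

With the script's objects in the workspace, passing the posterior topics matrix to `summarise_topics()` in place of `toy` would give one row per held-out document. Ties are resolved toward the lower-numbered topic here, which is an arbitrary choice for the sketch.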