Last active
March 31, 2020 04:21
Generating text with a Markov chain in R.
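library(markovchain)
# Read the source text and drop empty lines.
text <- readLines('text.txt')
text <- text[nchar(text) > 0]
# Pad punctuation with spaces so that each mark becomes its own term.
text <- gsub('.', ' .', text, fixed = TRUE)
text <- gsub(',', ' ,', text, fixed = TRUE)
text <- gsub('!', ' !', text, fixed = TRUE)
text <- gsub('(', '( ', text, fixed = TRUE)
text <- gsub(')', ' )', text, fixed = TRUE)
# Split every line into terms on single spaces.
terms <- unlist(strsplit(text, ' '))
# Fit a Markov chain to the term sequence and plot the estimated chain.
fit <- markovchainFit(data = terms)
plot(fit$estimate)
# Generate a 50-term sequence and join it into one string.
paste(markovchainSequence(n = 50, markovchain = fit$estimate), collapse = ' ')
# Unused experiment with a higher-order fit, left commented out.
#s <- createSequenceMatrix(terms, sanitize=FALSE)
#fit2 <- fitHigherOrder(s)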
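As a rough illustration of what a first-order fit on terms boils down to, the sketch below counts term-to-term transitions in base R and normalizes each row into probabilities. It is not the markovchain package's implementation; the toy terms vector is made-up data standing in for the contents of text.txt, and leaving rows without outgoing transitions as zeros is an assumption of this sketch.

```r
# Toy token vector standing in for `terms` (hypothetical data, not text.txt).
terms <- c("the", "cat", "sat", ".", "the", "dog", "sat", ".")

# Pair every term with the term that follows it.
from <- head(terms, -1)
to <- tail(terms, -1)

# Count transitions over a fixed, sorted set of states.
states <- sort(unique(terms))
counts <- table(factor(from, levels = states), factor(to, levels = states))

# Normalize each row so that it sums to 1. A row with no outgoing
# transitions would stay all zeros (an assumption of this sketch).
row_totals <- rowSums(counts)
transition <- counts / ifelse(row_totals == 0, 1, row_totals)

print(round(transition, 2))
```

Each row of transition is the empirical distribution of the next term given the current one, which is the quantity a sequence generator samples from at every step.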
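You can start the sequence from a particular term by using code like this for your final step. t0 represents "term zero", the initial term, and it has to be one of the states of the fitted chain. Here update_fit is my own fitted object; substitute the name of yours (fit in the gist).
paste(markovchainSequence(n = 100, markovchain = update_fit$estimate, t0 = "start", include.t0 = TRUE), collapse = ' ')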
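To show the idea behind starting from a chosen term without relying on the package, here is a minimal base R sketch. The function generate_from and the toy terms vector are hypothetical names and data introduced for illustration; the sketch samples each next term from the successors observed after the current term, which mirrors a first-order chain but is not how markovchainSequence is implemented.

```r
set.seed(1)  # only to make the sketch repeatable

# Toy token vector (hypothetical data, not text.txt).
terms <- c("the", "cat", "sat", ".", "the", "dog", "sat", ".")

# Generate up to n terms after the initial term t0.
generate_from <- function(terms, t0, n) {
  # Map every term to the vector of terms observed right after it.
  successors <- split(tail(terms, -1), head(terms, -1))
  stopifnot(t0 %in% names(successors))

  out <- character(n + 1)
  out[1] <- t0
  for (i in seq_len(n)) {
    candidates <- successors[[out[i]]]
    # Stop early if the current term was never followed by anything.
    if (is.null(candidates)) {
      out <- out[seq_len(i)]
      break
    }
    # Sampling from the raw successor list weights by observed frequency.
    out[i + 1] <- candidates[sample.int(length(candidates), 1)]
  }
  out
}

result <- generate_from(terms, t0 = "the", n = 10)
paste(result, collapse = " ")
```

The first element of result is always the chosen term, which corresponds to passing include.t0 = TRUE in the call above.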
Does anyone know how to change the depth so that it iterates over bi- or tri-grams rather than stepping one term at a time? markovify in Python seems to be able to do this, but I can't figure out how to do it with markovchain in R.
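One possible way to get this effect, sketched in base R rather than with markovchain, is to treat the last two (or more) terms as the state and sample the next term from whatever followed that exact combination in the text. The function generate_ngram, its order parameter, and the toy terms vector are all hypothetical and only illustrate the idea; this is not a feature of the markovchain package.

```r
set.seed(42)  # only to make the sketch repeatable

# Toy token vector (hypothetical data, not text.txt).
terms <- c("the", "cat", "sat", ".", "the", "dog", "sat", ".",
           "the", "cat", "ran", ".")

# Generate text where the state is the previous `order` terms.
generate_ngram <- function(terms, order = 2, n = 20) {
  stopifnot(length(terms) > order)
  n_keys <- length(terms) - order

  # Key i is terms[i .. i + order - 1] joined by a space;
  # its successor is the term that comes right after that window.
  keys <- vapply(
    seq_len(n_keys),
    function(i) paste(terms[i:(i + order - 1)], collapse = " "),
    character(1)
  )
  successors <- split(terms[(order + 1):length(terms)], keys)

  # Start from a random window that is known to have a successor.
  start <- sample.int(n_keys, 1)
  out <- terms[start:(start + order - 1)]

  for (step in seq_len(n)) {
    key <- paste(tail(out, order), collapse = " ")
    candidates <- successors[[key]]
    if (is.null(candidates)) break  # unseen state: stop early
    out <- c(out, candidates[sample.int(length(candidates), 1)])
  }
  out
}

bigram_text <- generate_ngram(terms, order = 2, n = 15)
paste(bigram_text, collapse = " ")
```

Setting order = 3 would condition on tri-grams instead. Larger orders copy longer runs from the source text verbatim, so small corpora tend to reproduce their input.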
Hello,
Thanks for the contribution, which is precise and effective. Can you please help me understand how to make the generated sequence start from a particular term?
Can we give a term as input and start our sequence from there onwards?