Last active December 19, 2015 21:29
Rhomboid/6020250
Markov text generator (C++11)
| #include <string> | |
| #include <vector> | |
| #include <iostream> | |
| #include <iterator> | |
| #include <unordered_map> | |
| #include <algorithm> | |
| #include <stdexcept> | |
| #include <cassert> | |
| #include <cstdlib> | |
| #include <ctime> | |
| using namespace std; | |
| class markov { | |
| public: | |
| markov(istream &input, size_t order) { | |
| vector<char> data { istreambuf_iterator<char>(input), istreambuf_iterator<char>() }; | |
| // replace(data.begin(), data.end(), '\n', ' '); | |
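| if(data.size() <= order) // checked even when NDEBUG disables assert | |
| throw invalid_argument("input must be longer than the order"); | |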
| for(auto a = data.begin(), b = a + order; b != data.end(); ++a, ++b) | |
| prob[string(a, b)] += *b; | |
| } | |
| string generate(size_t nchars = 2000) { | |
| auto seed = max_element(prob.begin(), prob.end(), | |
| [](const decltype(prob)::value_type &a, const decltype(prob)::value_type &b) { | |
| return a.second.size() < b.second.size(); | |
| })->first; | |
| auto output = seed; | |
| try { | |
| while(output.size() < nchars) { | |
| auto & choices = prob.at(seed); | |
| auto choice = choices[rand() % choices.size()]; | |
| output += choice; | |
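| seed += choice; // append first, then drop the oldest character, | |
| seed.erase(0, 1); // so the seed keeps its length even when order is 0 | |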
| } | |
| } catch(const out_of_range &e) { /* seed not found, just stop generating */ } | |
| return output; | |
| } | |
| private: | |
| unordered_map<string, string> prob; | |
| }; | |
| int main(int argc, char **argv) | |
| { | |
| if(argc != 2) { | |
| cout << "usage: " << argv[0] << " order <input.txt\n"; | |
| return 1; | |
| } | |
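| auto order = atoi(argv[1]); | |
| if(order < 0 || order > 20) { // validate user input without relying on assert | |
| cerr << "order must be between 0 and 20\n"; | |
| return 1; | |
| } | |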
| srand(time(0)); | |
| markov m {cin, (unsigned)order}; | |
| cout << m.generate() << "\n"; | |
| } |
| $ g++ -Wall -Wextra -pedantic -std=c++11 -O2 markov.cpp -o markov | |
| $ ./markov 12 <text_corpora/twain_sawyer.txt | |
| (Samuel Langhorne Clemens) | |
| T O M S A W Y E R | |
| CHAPTER XV | |
| A FEW minutes later the widow's. I reckon there ain't no mistake 'bout | |
| where I'LL go to. I been so wicked." | |
| "Dad fetch it! This comes of playing with his grand secret | |
| without Huck, you know!" | |
| "Oh, that will be nice. They're so lovely, all spotted up." | |
| "Yes, that's so. I didn't think of that. Oh, I know what is to become of a boy that will act like that. It makes | |
| me feel so bad to think you could let me go to Sereny Harper and Ben Rogers in--because of course | |
| there's got to be a Gang, or else there wouldn't be alive two days if that got found out. | |
| YOU know that?" | |
| "Yes, that's so," said Huck, nearly fainting. | |
| They talked it all over, and as they sped by some outlying cottages that lay | |
| near the village, and so he presently found | |
| himself leaning to the impressive as it was fascinating. | |
| Now a witness was called who testified that he began to doze, in spite of her, Tom knew | |
| where the wind lay, now. So he forestalled what might be. Got bricks in it?--or old metal?" | |
| "Old metal," said Tom. "You don't have any other name. Kings don't have any fun, anyway, all by himself that Huck had remained pirates. However, it seemed well | |
| worth while to chance it, | |
| so he fell to groaning with considerable, and | |
| then added something to | |
| himself, and then brought them back presently. I wish we could get hold of you I'll--" | |
| She did not feel that it was time to wake up; this sort of life might be | |
| romantic enough, in his blighted condition, but it was of no use. He | |
| talked hopefully to Becky; but an age of anxious waiting passed and no | |
| sounds came again. | |
| The children groped their way to McDougal's cave, and the ferryboat was about at hand. The clanging bell had | |
| been calling for half an hour. However, this time he thought he saw, that | |
| Becky Thatcher, because pitfalls were somewhat common, and had about worn | |
| himself out of temptation ro |