Created
April 21, 2023 23:46
how to generate rhyming couplets with pronouncing and markovify
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"id": "5ac0842d", | |
"metadata": {}, | |
"source": [ | |
"## making random rhyming couplets from a corpus with pronouncing\n", | |
"\n", | |
"We're going to do this by training a Markov chain on the lines from the corpus... in reverse!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 31, | |
"id": "a1fb1861", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import markovify\n", | |
"import pronouncing\n", | |
"import random" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "6d460b8b", | |
"metadata": {}, | |
"source": [ | |
"Download [sonnets.txt](https://raw.githubusercontent.com/aparrish/plaintext-example-files/master/sonnets.txt) to the same directory as this notebook.\n", | |
"\n", | |
"Read in the lines as a list:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"id": "55fc0b26", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"lines = open(\"sonnets.txt\").read().split(\"\\n\")" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "73055298", | |
"metadata": {}, | |
"source": [ | |
"Then strip punctuation and whitespace from both ends of each line. (This is important for our task in this notebook, since we're training on words, and we want, e.g., `thee,` to be the same as `thee:` at the end of a line.)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"id": "fd095169", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"lines = [item.strip(\".!?-:;, \") for item in lines]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "0f7758b9", | |
"metadata": {}, | |
"source": [ | |
"What it looks like:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 11, | |
"id": "9000ade1", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['',\n", | |
" 'From fairest creatures we desire increase',\n", | |
" \"That thereby beauty's rose might never die\",\n", | |
" 'But as the riper should by time decease',\n", | |
" 'His tender heir might bear his memory',\n", | |
" 'But thou contracted to thine own bright eyes',\n", | |
" \"Feed'st thy light's flame with self-substantial fuel\",\n", | |
" 'Making a famine where abundance lies',\n", | |
" 'Thy self thy foe, to thy sweet self too cruel',\n", | |
" \"Thou that art now the world's fresh ornament\",\n", | |
" 'And only herald to the gaudy spring',\n", | |
" 'Within thine own bud buriest thy content',\n", | |
" \"And tender churl mak'st waste in niggarding\",\n", | |
" 'Pity the world, or else this glutton be',\n", | |
" \"To eat the world's due, by the grave and thee\",\n", | |
" '',\n", | |
" 'When forty winters shall besiege thy brow',\n", | |
" \"And dig deep trenches in thy beauty's field\",\n", | |
" \"Thy youth's proud livery so gazed on now\",\n", | |
" \"Will be a tatter'd weed of small worth held\",\n", | |
" 'Then being asked, where all thy beauty lies',\n", | |
" 'Where all the treasure of thy lusty days',\n", | |
" 'To say, within thine own deep sunken eyes',\n", | |
" 'Were an all-eating shame, and thriftless praise']" | |
] | |
}, | |
"execution_count": 11, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"lines[:24]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "d0fef2fe", | |
"metadata": {}, | |
"source": [ | |
"### reversing things in python\n", | |
"\n", | |
"Use `list(reversed(...))`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"id": "e6d27f0e", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"[5, 4, 3, 2, 1]" | |
] | |
}, | |
"execution_count": 14, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"list(reversed([1, 2, 3, 4, 5]))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "7709f5dc", | |
"metadata": {}, | |
"source": [ | |
"To reverse the words in a string, split it first:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 17, | |
"id": "45be62e1", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['test.', 'a', 'is', 'This']" | |
] | |
}, | |
"execution_count": 17, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"list(reversed(\"This is a test.\".split()))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "92bf6e70", | |
"metadata": {}, | |
"source": [ | |
"And then join the whole thing back together:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 18, | |
"id": "2af7feb8", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'test. a is This'" | |
] | |
}, | |
"execution_count": 18, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"\" \".join(list(reversed(\"This is a test.\".split())))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "a2dda8d2", | |
"metadata": {}, | |
"source": [ | |
"Now, we'll create a copy of the original sonnets, but with each line reversed, word-wise:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 12, | |
"id": "224c15f3", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"reversed_by_word = [\" \".join(list(reversed(item.split(\" \")))) for item in lines]" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 13, | |
"id": "760f03fe", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['',\n", | |
" 'increase desire we creatures fairest From',\n", | |
" \"die never might rose beauty's thereby That\",\n", | |
" 'decease time by should riper the as But',\n", | |
" 'memory his bear might heir tender His',\n", | |
" 'eyes bright own thine to contracted thou But',\n", | |
" \"fuel self-substantial with flame light's thy Feed'st\",\n", | |
" 'lies abundance where famine a Making',\n", | |
" 'cruel too self sweet thy to foe, thy self Thy',\n", | |
" \"ornament fresh world's the now art that Thou\",\n", | |
" 'spring gaudy the to herald only And',\n", | |
" 'content thy buriest bud own thine Within',\n", | |
" \"niggarding in waste mak'st churl tender And\",\n", | |
" 'be glutton this else or world, the Pity',\n", | |
" \"thee and grave the by due, world's the eat To\",\n", | |
" '',\n", | |
" 'brow thy besiege shall winters forty When',\n", | |
" \"field beauty's thy in trenches deep dig And\",\n", | |
" \"now on gazed so livery proud youth's Thy\",\n", | |
" \"held worth small of weed tatter'd a be Will\",\n", | |
" 'lies beauty thy all where asked, being Then',\n", | |
" 'days lusty thy of treasure the all Where',\n", | |
" 'eyes sunken deep own thine within say, To',\n", | |
" 'praise thriftless and shame, all-eating an Were']" | |
] | |
}, | |
"execution_count": 13, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"reversed_by_word[:24]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "911811fa", | |
"metadata": {}, | |
"source": [ | |
"## generating lines that end with a particular word\n", | |
"\n", | |
"Now we'll train the Markov chain on this corpus. Markovify expects one big string, so we'll join the list of reversed lines with newlines:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 19, | |
"id": "b3d750df", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"sonnets_reversed_by_word = \"\\n\".join(reversed_by_word)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "4df88830", | |
"metadata": {}, | |
"source": [ | |
"Make the model with state size 1:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 20, | |
"id": "63b48c2d", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"reversed_model = markovify.NewlineText(sonnets_reversed_by_word, state_size=1)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "90f26d27", | |
"metadata": {}, | |
"source": [ | |
"Now we can generate a line of poetry!" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 23, | |
"id": "ae31abe8", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'thence almost self my is me Pity'" | |
] | |
}, | |
"execution_count": 23, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"reversed_model.make_sentence()" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "76a7ce52", | |
"metadata": {}, | |
"source": [ | |
"The words are reversed, so we need to reverse the line word-wise once again:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 26, | |
"id": "eb5fce3a", | |
"metadata": { | |
"scrolled": true | |
}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'Under the dear love and thoughts so proud as thou thy sweet'" | |
] | |
}, | |
"execution_count": 26, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"\" \".join(list(reversed(reversed_model.make_sentence().split(\" \"))))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "939c9a3f", | |
"metadata": {}, | |
"source": [ | |
"With the `init_state` parameter, we can tell the Markov generator to begin with a particular word:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 105, | |
"id": "bd8aca3e", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'thee and grown, waning by laid purpose my all curls, sable And'" | |
] | |
}, | |
"execution_count": 105, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"reversed_model.make_sentence(init_state=('thee',))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "12f7f5ea", | |
"metadata": {}, | |
"source": [ | |
"Of course, I'm actually intending here to generate a line that *ends* with \"thee.\" So let's reverse it:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 110, | |
"id": "b978ef7c", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'Give warning to me to have sworn thee'" | |
] | |
}, | |
"execution_count": 110, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"\" \".join(list(reversed(reversed_model.make_sentence(init_state=('thee',)).split())))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "02ffe9d4", | |
"metadata": {}, | |
"source": [ | |
"Of course, this doesn't work with words that aren't in the corpus:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 111, | |
"id": "6f1f28b4", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"ename": "KeyError", | |
"evalue": "('Allison',)", | |
"output_type": "error", | |
"traceback": [ | |
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", | |
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", | |
"Cell \u001b[0;32mIn[111], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m \u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;241m.\u001b[39mjoin(\u001b[38;5;28mlist\u001b[39m(\u001b[38;5;28mreversed\u001b[39m(\u001b[43mreversed_model\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmake_sentence\u001b[49m\u001b[43m(\u001b[49m\u001b[43minit_state\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mAllison\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241m.\u001b[39msplit())))\n", | |
"File \u001b[0;32m~/opt/miniconda3/envs/rwet-2023/lib/python3.9/site-packages/markovify/text.py:231\u001b[0m, in \u001b[0;36mText.make_sentence\u001b[0;34m(self, init_state, **kwargs)\u001b[0m\n\u001b[1;32m 228\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n\u001b[1;32m 230\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m _ \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mrange\u001b[39m(tries):\n\u001b[0;32m--> 231\u001b[0m words \u001b[38;5;241m=\u001b[39m prefix \u001b[38;5;241m+\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mchain\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mwalk\u001b[49m\u001b[43m(\u001b[49m\u001b[43minit_state\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 232\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (max_words \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(words) \u001b[38;5;241m>\u001b[39m max_words) \u001b[38;5;129;01mor\u001b[39;00m (\n\u001b[1;32m 233\u001b[0m min_words \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(words) \u001b[38;5;241m<\u001b[39m min_words\n\u001b[1;32m 234\u001b[0m ):\n\u001b[1;32m 235\u001b[0m \u001b[38;5;28;01mcontinue\u001b[39;00m \u001b[38;5;66;03m# pragma: no cover # see coveragepy/issues/198\u001b[39;00m\n", | |
"File \u001b[0;32m~/opt/miniconda3/envs/rwet-2023/lib/python3.9/site-packages/markovify/chain.py:142\u001b[0m, in \u001b[0;36mChain.walk\u001b[0;34m(self, init_state)\u001b[0m\n\u001b[1;32m 136\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mwalk\u001b[39m(\u001b[38;5;28mself\u001b[39m, init_state\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mNone\u001b[39;00m):\n\u001b[1;32m 137\u001b[0m \u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 138\u001b[0m \u001b[38;5;124;03m Return a list representing a single run of the Markov model, either\u001b[39;00m\n\u001b[1;32m 139\u001b[0m \u001b[38;5;124;03m starting with a naive BEGIN state, or the provided `init_state`\u001b[39;00m\n\u001b[1;32m 140\u001b[0m \u001b[38;5;124;03m (as a tuple).\u001b[39;00m\n\u001b[1;32m 141\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m--> 142\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mlist\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgen\u001b[49m\u001b[43m(\u001b[49m\u001b[43minit_state\u001b[49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n", | |
"File \u001b[0;32m~/opt/miniconda3/envs/rwet-2023/lib/python3.9/site-packages/markovify/chain.py:130\u001b[0m, in \u001b[0;36mChain.gen\u001b[0;34m(self, init_state)\u001b[0m\n\u001b[1;32m 128\u001b[0m state \u001b[38;5;241m=\u001b[39m init_state \u001b[38;5;129;01mor\u001b[39;00m (BEGIN,) \u001b[38;5;241m*\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mstate_size\n\u001b[1;32m 129\u001b[0m \u001b[38;5;28;01mwhile\u001b[39;00m \u001b[38;5;28;01mTrue\u001b[39;00m:\n\u001b[0;32m--> 130\u001b[0m next_word \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmove\u001b[49m\u001b[43m(\u001b[49m\u001b[43mstate\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 131\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m next_word \u001b[38;5;241m==\u001b[39m END:\n\u001b[1;32m 132\u001b[0m \u001b[38;5;28;01mbreak\u001b[39;00m\n", | |
"File \u001b[0;32m~/opt/miniconda3/envs/rwet-2023/lib/python3.9/site-packages/markovify/chain.py:116\u001b[0m, in \u001b[0;36mChain.move\u001b[0;34m(self, state)\u001b[0m\n\u001b[1;32m 114\u001b[0m cumdist \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mbegin_cumdist\n\u001b[1;32m 115\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m--> 116\u001b[0m choices, weights \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mzip\u001b[39m(\u001b[38;5;241m*\u001b[39m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmodel\u001b[49m\u001b[43m[\u001b[49m\u001b[43mstate\u001b[49m\u001b[43m]\u001b[49m\u001b[38;5;241m.\u001b[39mitems())\n\u001b[1;32m 117\u001b[0m cumdist \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mlist\u001b[39m(accumulate(weights))\n\u001b[1;32m 118\u001b[0m r \u001b[38;5;241m=\u001b[39m random\u001b[38;5;241m.\u001b[39mrandom() \u001b[38;5;241m*\u001b[39m cumdist[\u001b[38;5;241m-\u001b[39m\u001b[38;5;241m1\u001b[39m]\n", | |
"\u001b[0;31mKeyError\u001b[0m: ('Allison',)" | |
] | |
} | |
], | |
"source": [ | |
"\" \".join(list(reversed(reversed_model.make_sentence(init_state=('Allison',)).split())))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "d925ddbb", | |
"metadata": {}, | |
"source": [ | |
"## rhyming couplets at random\n", | |
"\n", | |
"To generate rhyming couplets at random, here's what we're going to do:\n", | |
"\n", | |
"* make a list of all words that end lines in the corpus\n", | |
"* make a dictionary that maps each of those end words to a list of words that rhyme with that word\n", | |
"* select an end word at random from the dictionary\n", | |
"* generate lines whose end words are two randomly sampled words from that end word's rhymes\n", | |
"\n", | |
"Here's what it looks like. First, a list of all possible end-words:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 28, | |
"id": "4c94d80b", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"end_words = [item.split()[0] for item in reversed_by_word if len(item) > 0]" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "eb4a86db", | |
"metadata": {}, | |
"source": [ | |
"A function that gives us all of the words that rhyme with a word, constrained to the words in a particular list:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 122, | |
"id": "5c4b07ad", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# find rhyming words for word, limited to words in the list other_words\n", | |
"def find_rhyme_in(word, other_words):\n", | |
"    rhymes = []\n", | |
"    phones = pronouncing.phones_for_word(word)\n", | |
"    # if there are no pronunciations for this word, return empty\n", | |
"    if len(phones) == 0:\n", | |
"        return []\n", | |
"    # get the \"rhyming part\" of the list of phones\n", | |
"    word_rhyme = pronouncing.rhyming_part(phones[0])\n", | |
"    # for each of the words in the other_words list...\n", | |
"    # (TODO: optimize this for really big corpora)\n", | |
"    for item in other_words:\n", | |
"        phones = pronouncing.phones_for_word(item)\n", | |
"        if len(phones) == 0:\n", | |
"            continue\n", | |
"        # check to see if its rhyming part is the same as the word's\n", | |
"        if pronouncing.rhyming_part(phones[0]) == word_rhyme:\n", | |
"            rhymes.append(item)\n", | |
"    return list(set(rhymes)) # remove duplicates" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "4b9f51c4", | |
"metadata": {}, | |
"source": [ | |
"Picking a random end word and finding all of its rhyming words gives us a list of end words that are (a) present in the corpus and (b) rhyme with each other." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 61, | |
"id": "6ab5cd67", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['arrest',\n", | |
" 'west',\n", | |
" 'indigest',\n", | |
" 'chest',\n", | |
" 'rest',\n", | |
" 'best',\n", | |
" 'suppressed',\n", | |
" 'unrest',\n", | |
" 'guest',\n", | |
" 'breast']" | |
] | |
}, | |
"execution_count": 61, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"find_rhyme_in(random.choice(end_words), end_words)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "0abf62d4", | |
"metadata": {}, | |
"source": [ | |
"Now, we make the dictionary that maps each end word to a list of words that rhyme with it (drawing only from rhyming words present in the corpus):" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 62, | |
"id": "1b5fa5f8", | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"end_words_with_rhymes = {}\n", | |
"for i, item in enumerate(end_words):\n", | |
" rhymes = find_rhyme_in(item, end_words)\n", | |
" if len(rhymes) >= 2:\n", | |
" end_words_with_rhymes[item] = rhymes" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "8d6d7921", | |
"metadata": {}, | |
"source": [ | |
"For example, all of the end words that rhyme with `west`:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 78, | |
"id": "76c624c7", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"['arrest',\n", | |
" 'west',\n", | |
" 'indigest',\n", | |
" 'chest',\n", | |
" 'rest',\n", | |
" 'best',\n", | |
" 'suppressed',\n", | |
" 'unrest',\n", | |
" 'guest',\n", | |
" 'breast']" | |
] | |
}, | |
"execution_count": 78, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"end_words_with_rhymes['west']" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "50279875", | |
"metadata": {}, | |
"source": [ | |
"Now we can generate two lines that rhyme with each other, simply by picking end words that we know rhyme with each other:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 80, | |
"id": "c7d90474", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"'west to happy form should which means public Of'" | |
] | |
}, | |
"execution_count": 80, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"reversed_model.make_sentence(init_state=('west',))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 81, | |
"id": "1b05ea36", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"data": { | |
"text/plain": [ | |
"\"guest heart's to not canst audit acceptable What\"" | |
] | |
}, | |
"execution_count": 81, | |
"metadata": {}, | |
"output_type": "execute_result" | |
} | |
], | |
"source": [ | |
"reversed_model.make_sentence(init_state=('guest',))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "93817720", | |
"metadata": {}, | |
"source": [ | |
"Of course, we need to reverse the lines:" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 118, | |
"id": "8bacdc74", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"From fairest in the sober west\n", | |
"And all tyrant, for their dear heart's guest\n" | |
] | |
} | |
], | |
"source": [ | |
"print(\" \".join(list(reversed(reversed_model.make_sentence(init_state=('west',)).split()))))\n", | |
"print(\" \".join(list(reversed(reversed_model.make_sentence(init_state=('guest',)).split()))))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"id": "764fa114", | |
"metadata": {}, | |
"source": [ | |
"## putting it together" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 121, | |
"id": "1180ea84", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Give not she not kept\n", | |
"When what worth then did except\n", | |
"If the hungry ocean is my tongue\n", | |
"For no cause of me young\n", | |
"Rise, resty Muse, my state\n", | |
"Three winters shall I straight\n", | |
"Darkening thy face\n", | |
"What merit lived for her babe chase\n", | |
"And loathsome canker eat him there bred\n", | |
"When my head\n", | |
"Or, if by Time's fell sick Muse brings\n", | |
"Ah! if it with kings\n", | |
"Yet in thine annoy\n", | |
"So thou present'st a joy\n" | |
] | |
} | |
], | |
"source": [ | |
"for i in range(7):\n", | |
"    # get a random end word and the list of words that rhyme with it\n", | |
"    word, possibles = random.choice(list(end_words_with_rhymes.items()))\n", | |
"    # pick two rhyming words from that list\n", | |
"    first, second = random.sample(possibles, 2)\n", | |
"    # print randomly generated lines that end with those words,\n", | |
"    # limited to 40 characters for aesthetic reasons\n", | |
"    print(\" \".join(list(reversed(reversed_model.make_short_sentence(\n", | |
"        40, init_state=(first,)).split()))))\n", | |
"    print(\" \".join(list(reversed(reversed_model.make_short_sentence(\n", | |
"        40, init_state=(second,)).split()))))" | |
] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "Python 3 (ipykernel)", | |
"language": "python", | |
"name": "python3" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.9.16" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 5 | |
} |