Last active
February 15, 2016 08:55
1 0.00348258706468
2 0.00497760079642
3 0.00846613545817
4 0.028898854011
5 0.0827517447657
6 0.17506234414
7 0.318862275449
8 0.478781827259
9 0.627372627373
10 0.752123938031
11 0.8415
12 0.900450225113
13 0.937437437437
14 0.957936905358
15 0.972945891784
16 0.981453634085
17 0.986459378134
18 0.98996487707
19 0.99297188755
20 0.994475138122
21 0.99648241206
22 0.997988939165
23 0.998490945674
24 0.998993457474
25 0.999496475327
26 1.0
27 1.0
28 1.0
29 1.0
30 1.0
31 1.0
32 1.0
33 1.0
34 1.0
35 1.0
36 1.0
37 1.0
38 1.0
39 1.0
40 1.0
"""
Check how many substrings in sampled text are novel, not appearing in training
text. For different substring lengths, prints the fraction of sampled substrings
of that length that are novel.
"""
# print_function lets the script run under both Python 2 and Python 3
from __future__ import print_function
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('sampled_text')
parser.add_argument('training_text')
args = parser.parse_args()

with open(args.sampled_text, 'r') as f:
    s1 = f.read()
with open(args.training_text, 'r') as f:
    s2 = f.read()

# Substring lengths 1 through 40
for L in range(1, 41):
    num_searched = 0
    num_found = 0
    # Slide a window of length L over the sampled text
    for i in range(len(s1) - L + 1):
        num_searched += 1
        sub = s1[i:(i+L)]
        assert len(sub) == L
        if sub in s2:
            num_found += 1
    novel_frac = (num_searched - num_found) / float(num_searched)
    print(L, novel_frac)
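The same measurement can be sketched as a function that works on in-memory strings, which makes it easy to check on a toy input. The function name `novel_fraction` and the example strings below are assumptions made for illustration and are not part of the original script. Collecting the training text's substrings into a set is a swapped-in optimization, replacing the repeated `sub in s2` search; it should give the same fractions.

```python
def novel_fraction(sampled, training, length):
    """Fraction of length-`length` substrings of `sampled` absent from `training`."""
    num_searched = len(sampled) - length + 1
    if num_searched <= 0:
        raise ValueError("sampled text is shorter than the substring length")
    # Assumed optimization: gather every training substring of this length once,
    # so each lookup is a set membership test instead of a full string search.
    seen = {training[i:i + length] for i in range(len(training) - length + 1)}
    num_novel = sum(
        1 for i in range(num_searched)
        if sampled[i:i + length] not in seen
    )
    return num_novel / float(num_searched)


# Toy check with hypothetical inputs: only "d" is missing from the training text.
print(novel_fraction("abcd", "abc", 1))
```

The set trades memory for speed: it holds up to one entry per position in the training text for each length, so for very large corpora the original substring search may be the safer choice.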
My name is claim'd.
BRAKENBURY:
So, through though thoughts shalt have's done the better flout?
CLARENCE:
GLOUCESTER:
And I did welcome, I shall not
have but contract do me not thee w ornance to the courted their right in last n.
CLARENCE:
Come away! there wills fight of these mind; she then,
And me that you cannot ruin I know, hold by their shadow of mine opinion,
For in thy reason when I come.
Nurse:
Peter, he think
All holloake the senate, farewell: you have worn mine own thing of thy mother,
But to keep it buy banishment. Have I may were King gladly cheque,
There is't to lose the all-best,
Were now? therein shall may,
Provated king dead; and, we'll fear.
What is rather did the tender, my joys, I would another dames, this lining.
Are ever slew Gaung be that breathed
Both he and sent.
Of him to the happy thought
CORIOLANUS:
Ha!' that you have mine own our thing in Leicour news,
And help hanging that stay the cowardship, hold,
She not flattering? some after heavyself for the hour to a proud comes
For how I have in sleep,
And fight is enjoys me hear, my name?
Second so made?
CAPULET:
What. Why shall come
That she ron's foolish accidents shot--he's ne'er shall knew she, whereon
his body man 'twill dry him our poor will,
We revenge spining: whose supred, heen thou hast
Last you stoops doth predection,
I'll fun ships in the contract thee?
Glook'd
To increase you fought of citage you?
CORIOLANUS:
I know once, I say.
ROMEO:
A houses, that I may be degree parts and of gold,
The damnand!
ANGELO:
Uncle, slept, I do
Rume ignoble as smorty and toming good:
What forbid-lest more enemy is soon at hope to do us dot importune man: then my lord.
MENENIUS:
The shepherd, my good Henry's authority, he'll some adjesty and when your grace.
ROMEO:
Get full of men! ofth, having bless,
And that I would with that staff.
ANGELO:
Hail name,
My Lady Annamentny steeds doth begun:
Nay, through my heart, which use
It wilt wrind on thee.
I'll tell it with his hands of waspike of wi