Last active
August 16, 2024 09:49
Python dict with English letter frequency
letterFrequency = {'E' : 12.0,
                   'T' : 9.10,
                   'A' : 8.12,
                   'O' : 7.68,
                   'I' : 7.31,
                   'N' : 6.95,
                   'S' : 6.28,
                   'R' : 6.02,
                   'H' : 5.92,
                   'D' : 4.32,
                   'L' : 3.98,
                   'U' : 2.88,
                   'C' : 2.71,
                   'M' : 2.61,
                   'F' : 2.30,
                   'Y' : 2.11,
                   'W' : 2.09,
                   'G' : 2.03,
                   'P' : 1.82,
                   'B' : 1.49,
                   'V' : 1.11,
                   'K' : 0.69,
                   'X' : 0.17,
                   'Q' : 0.11,
                   'J' : 0.10,
                   'Z' : 0.07 }
It does not match the values at https://en.wikipedia.org/wiki/Letter_frequency. But thanks anyway.
Thank you! You really brought the time down for others having to type all this out haha 👍🏽
Thanks!
I found that the probabilities sum to about 99.97, not 100.
Try
values = letterFrequency.values()
s = sum(values)
s > 99.98 # False
If you want to choose random letters with numpy.random.choice, for example, you have to pass a normalized version of these values, because its p argument must sum to 1:
import numpy
WORD_LENGTH = 7
letters = list(letterFrequency.keys())
p = [v / s for v in values]
"".join([numpy.random.choice(letters, p=p) for _ in range(WORD_LENGTH)])
This relies on keys() and values() iterating in the same order, which Python guarantees as long as the dict is not modified in between.
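If NumPy is not available, the standard library's random.choices accepts relative weights that do not have to sum to 1, so the normalization step can be skipped. A minimal sketch, assuming the letterFrequency dict from the gist; random_word is a hypothetical helper name, not part of the original:

```python
import random

# Letter frequencies (percent), copied from the gist.
letterFrequency = {
    'E': 12.0, 'T': 9.10, 'A': 8.12, 'O': 7.68, 'I': 7.31, 'N': 6.95,
    'S': 6.28, 'R': 6.02, 'H': 5.92, 'D': 4.32, 'L': 3.98, 'U': 2.88,
    'C': 2.71, 'M': 2.61, 'F': 2.30, 'Y': 2.11, 'W': 2.09, 'G': 2.03,
    'P': 1.82, 'B': 1.49, 'V': 1.11, 'K': 0.69, 'X': 0.17, 'Q': 0.11,
    'J': 0.10, 'Z': 0.07,
}


def random_word(length, rng=random):
    """Hypothetical helper: draw `length` letters independently,
    weighted by their frequency, and join them into one string."""
    letters = list(letterFrequency)            # keys, in insertion order
    weights = list(letterFrequency.values())   # matching relative weights
    # random.choices treats weights as relative, so 99.97 vs 100 is harmless.
    return "".join(rng.choices(letters, weights=weights, k=length))


print(random_word(7))
```

Passing a seeded random.Random instance as rng makes the output reproducible, which can be handy in tests.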
You can compute a renormalized dict as follows (keep the factor of 100 for values that sum to 100, or drop it for values that sum to 1).
total = sum(letterFrequency.values())  # compute the sum once, not once per key
letterFrequency_renormalized = {k: 100 * v / total for k, v in letterFrequency.items()}
To 100:
{'E': 12.0036010803241, 'T': 9.102730819245775, 'A': 8.122436731019306, 'O': 7.682304691407423, 'I': 7.3121936580974305, 'N': 6.952085625687707, 'S': 6.281884565369612, 'R': 6.02180654196259, 'H': 5.921776532959889, 'D': 4.321296388916676, 'L': 3.981194358307493, 'U': 2.8808642592777836, 'C': 2.7108132439731922, 'M': 2.6107832349704916, 'F': 2.300690207062119, 'Y': 2.1106331899569875, 'W': 2.0906271881564473, 'G': 2.0306091827548265, 'P': 1.8205461638491551, 'B': 1.4904471341402423, 'V': 1.1103330999299794, 'K': 0.6902070621186357, 'X': 0.1700510153045914, 'Q': 0.11003300990297091, 'J': 0.10003000900270083, 'Z': 0.07002100630189059}
To 1:
{'E': 0.12003601080324099, 'T': 0.09102730819245775, 'A': 0.08122436731019306, 'O': 0.07682304691407423, 'I': 0.0731219365809743, 'N': 0.06952085625687708, 'S': 0.06281884565369612, 'R': 0.06021806541962589, 'H': 0.05921776532959889, 'D': 0.04321296388916676, 'L': 0.03981194358307493, 'U': 0.028808642592777836, 'C': 0.027108132439731925, 'M': 0.026107832349704915, 'F': 0.02300690207062119, 'Y': 0.021106331899569872, 'W': 0.02090627188156447, 'G': 0.020306091827548264, 'P': 0.01820546163849155, 'B': 0.014904471341402423, 'V': 0.011103330999299792, 'K': 0.006902070621186356, 'X': 0.0017005101530459142, 'Q': 0.0011003300990297092, 'J': 0.0010003000900270084, 'Z': 0.0007002100630189059}
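To confirm that renormalized values really add up to the intended total, a floating-point tolerant comparison is safer than ==. The sketch below is only an illustration: renormalize is a hypothetical helper name, and math.fsum is used for the total as one reasonable choice, not something the original comment does.

```python
import math

# Letter frequencies (percent), copied from the gist.
letterFrequency = {
    'E': 12.0, 'T': 9.10, 'A': 8.12, 'O': 7.68, 'I': 7.31, 'N': 6.95,
    'S': 6.28, 'R': 6.02, 'H': 5.92, 'D': 4.32, 'L': 3.98, 'U': 2.88,
    'C': 2.71, 'M': 2.61, 'F': 2.30, 'Y': 2.11, 'W': 2.09, 'G': 2.03,
    'P': 1.82, 'B': 1.49, 'V': 1.11, 'K': 0.69, 'X': 0.17, 'Q': 0.11,
    'J': 0.10, 'Z': 0.07,
}


def renormalize(freqs, target=100.0):
    """Hypothetical helper: scale the values so they sum to `target`."""
    total = math.fsum(freqs.values())  # accurately rounded float sum
    return {k: target * v / total for k, v in freqs.items()}


as_percent = renormalize(letterFrequency)            # sums to 100
as_probability = renormalize(letterFrequency, 1.0)   # sums to 1

# Compare with a tolerance; exact equality can fail due to rounding.
print(math.isclose(sum(as_percent.values()), 100.0))
print(math.isclose(sum(as_probability.values()), 1.0))
```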
Not all heroes wear capes
Just a thumbs up 👍
Thanks!