@pozhidaevak
Last active August 16, 2024 09:49
Python dict with English letter frequency
letterFrequency = {'E' : 12.0,
'T' : 9.10,
'A' : 8.12,
'O' : 7.68,
'I' : 7.31,
'N' : 6.95,
'S' : 6.28,
'R' : 6.02,
'H' : 5.92,
'D' : 4.32,
'L' : 3.98,
'U' : 2.88,
'C' : 2.71,
'M' : 2.61,
'F' : 2.30,
'Y' : 2.11,
'W' : 2.09,
'G' : 2.03,
'P' : 1.82,
'B' : 1.49,
'V' : 1.11,
'K' : 0.69,
'X' : 0.17,
'Q' : 0.11,
'J' : 0.10,
'Z' : 0.07 }
@pliniosilveira

These values do not match the ones at https://en.wikipedia.org/wiki/Letter_frequency. But thanks anyway.

@zflorez

zflorez commented Feb 17, 2022

Thank you! You really saved others the time of typing all this out, haha 👍🏽

@Nikolaj-K

Nikolaj-K commented Jul 1, 2022

Thanks!

I found that the percentages sum to about 99.97, not 100. Try:

values = letterFrequency.values()
s = sum(values)

s > 99.98  # False

If you want, for example, to draw random letters using numpy.random.choice, you have to pass a normalized version of these values, because its p argument must sum to 1. So:

import numpy

WORD_LENGTH = 7

letters = list(letterFrequency.keys())
p = [v / s for v in values]

"".join([numpy.random.choice(letters, p=p) for _ in range(WORD_LENGTH)])

This relies on keys() and values() iterating in the same order, which Python guarantees as long as the dict is not modified in between.
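
As an alternative to the numpy approach above, here is a minimal sketch using only the standard library. random.choices treats its weights as relative, so the raw percentages can be passed without normalizing first. The helper name random_word is hypothetical and only for illustration; the dict is copied from the gist.

```python
import random

# Percentages copied from the gist; they sum to about 99.97, not 100.
letterFrequency = {
    'E': 12.0, 'T': 9.10, 'A': 8.12, 'O': 7.68, 'I': 7.31, 'N': 6.95,
    'S': 6.28, 'R': 6.02, 'H': 5.92, 'D': 4.32, 'L': 3.98, 'U': 2.88,
    'C': 2.71, 'M': 2.61, 'F': 2.30, 'Y': 2.11, 'W': 2.09, 'G': 2.03,
    'P': 1.82, 'B': 1.49, 'V': 1.11, 'K': 0.69, 'X': 0.17, 'Q': 0.11,
    'J': 0.10, 'Z': 0.07,
}

WORD_LENGTH = 7


def random_word(length):
    """Draw `length` letters weighted by frequency (hypothetical helper)."""
    letters = list(letterFrequency.keys())
    weights = list(letterFrequency.values())
    # The weights are relative, so they do not need to sum to 1 or 100.
    return "".join(random.choices(letters, weights=weights, k=length))


print(random_word(WORD_LENGTH))
```

The output differs on every run; call random.seed with a fixed value first if you need reproducible words.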

You may compute a renormalized dict as follows. Keep the factor 100 to get percentages that sum to 100, or drop it to get probabilities that sum to 1.

letterFrequency_renormalized = {k: 100 * v / sum(letterFrequency.values()) for k, v in letterFrequency.items()}

To 100:

{'E': 12.0036010803241, 'T': 9.102730819245775, 'A': 8.122436731019306, 'O': 7.682304691407423, 'I': 7.3121936580974305, 'N': 6.952085625687707, 'S': 6.281884565369612, 'R': 6.02180654196259, 'H': 5.921776532959889, 'D': 4.321296388916676, 'L': 3.981194358307493, 'U': 2.8808642592777836, 'C': 2.7108132439731922, 'M': 2.6107832349704916, 'F': 2.300690207062119, 'Y': 2.1106331899569875, 'W': 2.0906271881564473, 'G': 2.0306091827548265, 'P': 1.8205461638491551, 'B': 1.4904471341402423, 'V': 1.1103330999299794, 'K': 0.6902070621186357, 'X': 0.1700510153045914, 'Q': 0.11003300990297091, 'J': 0.10003000900270083, 'Z': 0.07002100630189059}

To 1:

{'E': 0.12003601080324099, 'T': 0.09102730819245775, 'A': 0.08122436731019306, 'O': 0.07682304691407423, 'I': 0.0731219365809743, 'N': 0.06952085625687708, 'S': 0.06281884565369612, 'R': 0.06021806541962589, 'H': 0.05921776532959889, 'D': 0.04321296388916676, 'L': 0.03981194358307493, 'U': 0.028808642592777836, 'C': 0.027108132439731925, 'M': 0.026107832349704915, 'F': 0.02300690207062119, 'Y': 0.021106331899569872, 'W': 0.02090627188156447, 'G': 0.020306091827548264, 'P': 0.01820546163849155, 'B': 0.014904471341402423, 'V': 0.011103330999299792, 'K': 0.006902070621186356, 'X': 0.0017005101530459142, 'Q': 0.0011003300990297092, 'J': 0.0010003000900270084, 'Z': 0.0007002100630189059}
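
To double-check the renormalization, a small sketch along these lines may help. The function name normalize and its target parameter are assumptions made for illustration, not part of the gist. It uses math.fsum and math.isclose so that floating-point rounding does not hide or fake a mismatch.

```python
import math

# Percentages copied from the gist; they sum to about 99.97, not 100.
letterFrequency = {
    'E': 12.0, 'T': 9.10, 'A': 8.12, 'O': 7.68, 'I': 7.31, 'N': 6.95,
    'S': 6.28, 'R': 6.02, 'H': 5.92, 'D': 4.32, 'L': 3.98, 'U': 2.88,
    'C': 2.71, 'M': 2.61, 'F': 2.30, 'Y': 2.11, 'W': 2.09, 'G': 2.03,
    'P': 1.82, 'B': 1.49, 'V': 1.11, 'K': 0.69, 'X': 0.17, 'Q': 0.11,
    'J': 0.10, 'Z': 0.07,
}


def normalize(freqs, target=1.0):
    """Rescale the values so they sum to `target` (hypothetical helper)."""
    total = math.fsum(freqs.values())  # computed once, not once per key
    return {k: target * v / total for k, v in freqs.items()}


probabilities = normalize(letterFrequency)       # sums to 1
percentages = normalize(letterFrequency, 100.0)  # sums to 100

# Compare with a tolerance instead of == because these are floats.
print(math.isclose(math.fsum(probabilities.values()), 1.0))
print(math.isclose(math.fsum(percentages.values()), 100.0))
```

Computing the total once outside the comprehension also avoids re-summing the dict for every key.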

@huguesdevimeux

Not all heroes wear capes

@Flecart

Flecart commented May 7, 2024

Just a thumbs up 👍
