{
  "a": 8.167,
  "b": 1.492,
  "c": 2.782,
  "d": 4.253,
  "e": 12.702,
  "f": 2.228,
  "g": 2.015,
  "h": 6.094,
  "i": 6.966,
  "j": 0.153,
  "k": 0.772,
  "l": 4.025,
  "m": 2.406,
  "n": 6.749,
  "o": 7.507,
  "p": 1.929,
  "q": 0.095,
  "r": 5.987,
  "s": 6.327,
  "t": 9.056,
  "u": 2.758,
  "v": 0.978,
  "w": 2.360,
  "x": 0.150,
  "y": 1.974,
  "z": 0.074
}
In my opinion, this frequency table doesn't look accurate for JSON keys: "size" is used so often in tech that it alone seems like it would put 'z' at a higher frequency than 'q'.
Here is my attempt at this:
'A', 'B', 'C', 'D', 'E',
'F', 'G', 'H', 'I', 'J',
'K', 'L', 'M', 'N', 'O',
'P', 'Q', 'R', 'S', 'T',
'U', 'V', 'W', 'X', 'Y',
'Z', '_', '-'
68, 13, 29, 39, 110,
11, 15, 7, 64, 10,
10, 32, 30, 48, 43,
26, 5, 57, 61, 71,
23, 10, 10, 10, 18,
20, 41, 10
This is derived by taking a random sample of 100 reasonable keys, counting how often each character occurs, then adjusting the counts a bit.
Modifications:
Add a high likelihood of underscore.
Add a lesser arbitrarily chosen amount for dash.
Assume keys use only uppercase, lowercase, numbers, underscore, and dash. (JSON itself allows any string as a key, but these are the characters keys conventionally use.)
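For anyone who wants to repeat the counting step, here is a rough sketch in Python. The sample keys below are made up for illustration (my original 100 keys aren't listed here), and `count_key_chars` is just a hypothetical helper name. It folds case and only tallies letters, underscore, and dash, so digits are dropped.

```python
from collections import Counter

# Hypothetical sample; substitute your own list of real-world keys.
SAMPLE_KEYS = ["id", "name", "size", "created_at", "user-id"]

# Characters tallied in the table: A-Z plus underscore and dash.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ_-")

def count_key_chars(keys):
    """Count case-folded occurrences of each allowed character across keys."""
    counts = Counter()
    for key in keys:
        for ch in key.upper():
            if ch in ALLOWED:
                counts[ch] += 1
    return counts

counts = count_key_chars(SAMPLE_KEYS)
for symbol, n in counts.most_common():
    print(symbol, n)
```

The manual adjustments (boosting underscore, picking a value for dash) would then be applied on top of these raw counts.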
Here is my frequency normalized:
A 7.63
B 1.46
C 3.25
D 4.38
E 12.35
F 1.23
G 1.68
H 0.79
I 7.18
J 1.12
K 1.12
L 3.59
M 3.37
N 5.39
O 4.83
P 2.92
Q 0.56
R 6.40
S 6.85
T 7.97
U 2.58
V 1.12
W 1.12
X 1.12
Y 2.02
Z 2.24
_ 4.60
- 1.12
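The normalized values are just each count divided by the total of all counts (891), expressed as a percentage and rounded to two decimals. A minimal sketch of that arithmetic in Python, using the counts from my list above (`normalize` is a name I picked for illustration):

```python
SYMBOLS = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ") + ["_", "-"]
COUNTS = [
    68, 13, 29, 39, 110,
    11, 15, 7, 64, 10,
    10, 32, 30, 48, 43,
    26, 5, 57, 61, 71,
    23, 10, 10, 10, 18,
    20, 41, 10,
]

def normalize(symbols, counts):
    """Convert raw counts to percentages of the total, rounded to 2 places."""
    total = sum(counts)
    return {s: round(100 * c / total, 2) for s, c in zip(symbols, counts)}

percentages = normalize(SYMBOLS, COUNTS)
for symbol, pct in percentages.items():
    print(f"{symbol} {pct:.2f}")
```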
The biggest difference I see is actually that the Q frequency in the original list above is abnormally low.
@handcoding no clue after 11 years. I suspect I would have stolen it from a Wikipedia source, but I can't be certain at this point.