@tygern
Created July 9, 2013 13:37
Calculate the probability that two people in a given group will have the same first initial and last name.
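#!/usr/bin/env ruby
# Get name file from https://www.census.gov/genealogy/www/data/1990surnames/names_files.html
names = File.readlines('dist.male.first.txt')
total_people = 40000
last_name_freq = 0.01006 # Smith
slots = 10000.0 # number of equally likely values assumed by the collision product

('A'..'Z').each do |letter|
  # Characters 15...20 of each line hold the name's frequency, treated here as a percentage.
  frequencies = names.select { |line| line.start_with?(letter) }
                     .map { |line| line[15...20].to_f }
  frequency = frequencies.sum # 0 when no name starts with this letter

  # Expected number of people with this first initial and the given last name.
  number_of_people = (frequency * total_people / 100 * last_name_freq).to_i

  # Birthday-problem product: chance that all of these people get distinct values.
  probability = 1.0
  number_of_people.times do |i|
    probability *= (slots - i) / slots
  end
  puts "#{letter}: #{1 - probability}"
end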
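
The inner loop of the script is the classic birthday-problem product: it multiplies the chances that each successive person avoids every value already taken, then subtracts from 1. As a minimal sketch, that step could be pulled into a helper so it can be checked on its own. The name `collision_probability` and its parameters are hypothetical and not part of the gist; the helper assumes every value is equally likely, just as the hard-coded 10000 in the script does.

```ruby
# Hypothetical helper (not in the original gist): chance that at least two of
# `people` share a value when each gets one of `slots` equally likely values.
def collision_probability(people, slots)
  # Probability that all `people` draws are distinct.
  no_collision = (0...people).reduce(1.0) do |product, i|
    product * (slots - i) / slots.to_f
  end
  1.0 - no_collision
end

# Sanity check against the well-known birthday case: 23 people, 365 days.
puts collision_probability(23, 365).round(4) # roughly 0.5073
```

With this helper, the per-letter line in the script would reduce to `collision_probability(number_of_people, 10000)`.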