@simcap
Last active August 29, 2015 14:10
Returns the most likely version among multiple given hashes
#!/usr/bin/ruby
# Run with
# $ ruby -r minitest/autorun path/to/this_file.rb
#
# Returns the most likely correct version among multiple
# given hashes. In other words: which hash most
# resembles the other hashes?
#
# First a Hash has to be able to say
# how similar it is to another one
class Hash
  # Returns a factor between 0 and 1 indicating how similar
  # self is to other.
  #
  # Counts the keys whose values are equal in both hashes,
  # then divides that count by the number of keys in self.
  # Note: an empty receiver yields NaN, because 0.fdiv(0) is NaN.
  def similarity_to(other)
    sum = 0
    each_pair do |k, v|
      sum += 1 if v == other[k]
    end
    sum.fdiv(size)
  end
end
class SimilarityFactorTest < Minitest::Test
  def test_similarity
    h = {one: 1, two: 2, three: 3, four: 4}
    assert_equal 1, h.similarity_to({one: 1, two: 2, three: 3, four: 4})
    assert_equal 0.75, h.similarity_to({one: 1, two: 2, three: 0, four: 4})
    assert_equal 0.5, h.similarity_to({one: 0, two: 2, three: 0, four: 4})
    assert_equal 0.25, h.similarity_to({one: 0, two: 0, three: 0, four: 4})
    assert_equal 0.0, h.similarity_to({one: 0, two: 0, three: 0, four: 0})
  end
end
# Now to determine the correct version among a sample of hashes.
#
# Let the operator x return the similarity factor f
# between h1 and h2 (i.e. f = h1 x h2).
#
# fh1 = ((h1 x h1) + (h1 x h2) + ... + (h1 x hn)) / n
# fh2 = ((h2 x h1) + (h2 x h2) + ... + (h2 x hn)) / n
# ...
# fhn = ((hn x h1) + (hn x h2) + ... + (hn x hn)) / n
#
# Correct version = the hash whose factor is max(fh1, fh2, ..., fhn)
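# Returns [candidate, factor] pairs sorted from most to least
# likely. The order of pairs with equal factors is not guaranteed,
# since sort_by is not a stable sort.
def correct_version(*candidates)
  count = candidates.size
  candidates.map do |c|
    total = candidates.inject(0) do |acc, o|
      acc + c.similarity_to(o)
    end
    [c, total.fdiv(count).round(4)]
  end.sort_by(&:last).reverse
end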
class CorrectVersionTest < Minitest::Test
  def test_versions_sorting_by_correctness
    h1 = {one: 1, two: 2, three: 3, four: 4}
    h2 = {one: 1, two: 2, three: 3, four: 4}
    h3 = {one: 1, two: 2, three: 3, four: 0}
    h4 = {one: 1, two: 2, three: 0, four: 4}
    h5 = {one: 1, two: 2, three: 0, four: 0}
    h6 = {one: 1, two: 0, three: 0, four: 0}
    h7 = {one: 0, two: 0, three: 0, four: 0}
    results = correct_version(h1, h2, h3, h4, h5, h6, h7)
    assert_equal [[h5, 0.6786], [h3, 0.6429], [h4, 0.6429],
                  [h1, 0.6071], [h2, 0.6071], [h6, 0.5714], [h7, 0.3929]],
                 results
  end
end
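
The same averaging idea can be sketched without reopening Hash, for cases where only the single most likely candidate is needed. This is a minimal sketch under assumptions, not the gist's own code: the `Consensus` module and its `similarity`, `scores`, and `most_likely` methods are hypothetical names. Two behaviours also differ on purpose: an empty hash scores 0.0 instead of NaN, and a key missing from the other hash never counts as a match.

```ruby
# Hypothetical helper module; none of these names exist in the gist.
module Consensus
  module_function

  # Fraction of a's keys that b holds with an equal value.
  # Assumption of this sketch: an empty hash scores 0.0 rather than NaN.
  def similarity(a, b)
    return 0.0 if a.empty?
    matches = a.count { |key, value| b.key?(key) && b[key] == value }
    matches.fdiv(a.size)
  end

  # Average similarity of each candidate to every candidate,
  # itself included, as [candidate, score] pairs.
  def scores(candidates)
    candidates.map do |c|
      total = candidates.inject(0.0) { |acc, other| acc + similarity(c, other) }
      [c, total.fdiv(candidates.size).round(4)]
    end
  end

  # The [candidate, score] pair with the highest score.
  # With tied scores, one of the tied pairs is returned.
  def most_likely(candidates)
    scores(candidates).max_by { |_, score| score }
  end
end

h1 = {one: 1, two: 2, three: 3}
h2 = {one: 1, two: 2, three: 3}
h3 = {one: 1, two: 0, three: 0}

winner, score = Consensus.most_likely([h1, h2, h3])
puts "#{winner.inspect} scored #{score}"
```

Keeping the helpers in a module avoids monkey-patching a core class, at the cost of the less fluent call style compared with `h1.similarity_to(h2)`.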