@madlep
Created November 21, 2012 21:58
Spearman’s rank correlation coefficient function
# xs & ys are arrays of results in order of rank (best first);
# each item is assumed to appear at most once per array
def spearman(xs, ys)
  # remove items that aren't common to both result sets
  # these are mostly outliers anyway, which we want to ignore
  xs_common = xs.select { |i| ys.include?(i) }
  ys_common = ys.select { |i| xs.include?(i) }

  # calculate the mean of each set of ranks (simple sum/length calculation)
  # as both are just the sum of ranks [1,2,3,4...] and have the same length,
  # we can figure it out from the arithmetic series sum n * (n + 1) / 2
  total = 0.5 * xs_common.length * (xs_common.length + 1)
  x_mean = y_mean = total / xs_common.length

  # initialize totals that we'll need
  sum_mean_diff = 0
  sum_x_mean_diff_sq = 0
  sum_y_mean_diff_sq = 0

  # sum the differences of the items
  xs_common.each_with_index do |x, x_rank|
    x_rank = x_rank + 1 # ranking is 1-based, not 0-based

    # grab the rank of the same item in the other set of ranked items
    y_rank = ys_common.index(x) + 1

    # work out the deviation of each rank from its mean
    x_mean_diff = x_rank - x_mean
    y_mean_diff = y_rank - y_mean

    # aggregate totals for final calc
    sum_mean_diff += x_mean_diff * y_mean_diff
    sum_x_mean_diff_sq += x_mean_diff ** 2
    sum_y_mean_diff_sq += y_mean_diff ** 2
  end

  # final coefficient (NaN when fewer than two items are shared)
  sum_mean_diff / Math.sqrt(sum_x_mean_diff_sq * sum_y_mean_diff_sq)
end
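For reference, the quantity the function above evaluates can be written out as the Pearson correlation applied to the two rank sequences. The symbols here are notation introduced for illustration only: n is the number of shared items, and x_i and y_i are the 1-based ranks of item i in the two filtered arrays.

```latex
\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
            {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \bar{y} = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}
```

The numerator corresponds to `sum_mean_diff`, and the two sums under the square root correspond to `sum_x_mean_diff_sq` and `sum_y_mean_diff_sq`.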
require_relative 'spearman' # assumes the function above is saved as spearman.rb next to this file
ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:a, :b, :c, :d, :e]
puts spearman(ranks1, ranks2) # = 1.0 (in exactly the same order)
ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:e, :d, :c, :b, :a]
puts spearman(ranks1, ranks2) # = -1.0 (in exactly reverse order)
ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:b, :a, :c, :d, :e]
puts spearman(ranks1, ranks2) # = 0.9 (a & b are out of order)
ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:b, :d, :c, :a, :e]
puts spearman(ranks1, ranks2) # = 0.3 (stuff is all over the place)
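As a cross-check on the values in the usage examples, here is a minimal sketch of the same coefficient computed with the closed-form rank-difference formula, 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which agrees with the mean-deviation calculation when there are no tied ranks. The name `spearman_closed_form` is hypothetical and not part of the original gist. The sketch assumes Ruby 2.4 or later (for `Enumerable#sum`) and that no item appears twice in either array.

```ruby
# Hypothetical helper (not part of the original gist): Spearman's rho via the
# closed-form rank-difference formula. Assumes each item appears at most once
# per array, so there are no tied ranks.
def spearman_closed_form(xs, ys)
  # keep only items present in both rankings, preserving each list's order
  xs_common = xs.select { |i| ys.include?(i) }
  ys_common = ys.select { |i| xs.include?(i) }
  n = xs_common.length
  return Float::NAN if n < 2 # undefined for fewer than two shared items

  # map each item to its 0-based position in ys_common for constant-time lookups
  y_positions = ys_common.each_with_index.to_h

  # sum of squared rank differences (the 1-based offsets cancel out)
  sum_d_sq = xs_common.each_with_index.sum { |x, x_pos| (x_pos - y_positions[x])**2 }

  1.0 - (6.0 * sum_d_sq) / (n * (n**2 - 1))
end

# approximately 0.3, up to floating-point rounding
puts spearman_closed_form([:a, :b, :c, :d, :e], [:b, :d, :c, :a, :e])
```

Replacing the repeated `index` lookup with a hash keeps the loop linear in the number of shared items; the filtering step with `include?` is still quadratic, as it is in the original function.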