Created November 21, 2012 21:58
madlep/4128144
Spearman’s rank correlation coefficient function
# xs & ys are arrays of results in order of rank
def spearman(xs, ys)
  # remove items that aren't common to both result sets
  # these are mostly outliers, which we want to ignore anyway
  xs_common = xs.select { |i| ys.include?(i) }
  ys_common = ys.select { |i| xs.include?(i) }

  # calculate the mean of each set of ranks (simple sum/length calculation)
  # both sets are just the ranks [1, 2, 3, 4, ...] and have the same length,
  # so the total is the arithmetic series sum n * (n + 1) / 2
  total = 0.5 * xs_common.length * (xs_common.length + 1)
  x_mean = y_mean = total / xs_common.length

  # initialize totals that we'll need
  sum_mean_diff = 0
  sum_x_mean_diff_sq = 0
  sum_y_mean_diff_sq = 0

  # sum the differences of the items
  xs_common.each_with_index do |x, x_rank|
    x_rank = x_rank + 1 # ranking is 1-based, not 0-based

    # grab the corresponding item's rank from the other set of ranked items
    y_rank = ys_common.index(x) + 1

    # work out the deviation of each rank from its mean
    x_mean_diff = x_rank - x_mean
    y_mean_diff = y_rank - y_mean

    # aggregate totals for final calc
    sum_mean_diff += x_mean_diff * y_mean_diff
    sum_x_mean_diff_sq += x_mean_diff ** 2
    sum_y_mean_diff_sq += y_mean_diff ** 2
  end

  # final coefficient
  sum_mean_diff / Math.sqrt(sum_x_mean_diff_sq * sum_y_mean_diff_sq)
end
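
As a cross-check on the function above: when neither list contains duplicate items (so there are no tied ranks), the same coefficient can be computed with the well-known rank-difference shortcut, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between an item's two ranks. The sketch below is not part of the original gist. The name `spearman_by_rank_diff` is hypothetical, and it assumes at least two common items and no duplicates.

```ruby
# Hypothetical helper (not from the original gist): Spearman's rho via the
# rank-difference shortcut. Assumes no duplicate items in either array and
# at least two items common to both.
def spearman_by_rank_diff(xs, ys)
  # same filtering step as spearman: keep only items present in both lists
  xs_common = xs.select { |i| ys.include?(i) }
  ys_common = ys.select { |i| xs.include?(i) }
  n = xs_common.length

  # squared difference between each item's position in the two lists
  # (0-based positions are fine here, the +1 offsets cancel out)
  sum_d_sq = xs_common.each_with_index.map do |x, x_index|
    (x_index - ys_common.index(x)) ** 2
  end.reduce(0, :+)

  1.0 - (6.0 * sum_d_sq) / (n * (n**2 - 1))
end

puts spearman_by_rank_diff([:a, :b, :c, :d, :e], [:b, :a, :c, :d, :e])
```

Because of floating-point rounding, the two approaches may differ in the last decimal place, so compare the results with a small tolerance rather than `==`.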
require 'spearman'

ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:a, :b, :c, :d, :e]
puts spearman(ranks1, ranks2) # = 1.0 (in exactly the same order)

ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:e, :d, :c, :b, :a]
puts spearman(ranks1, ranks2) # = -1.0 (in exactly reverse order)

ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:b, :a, :c, :d, :e]
puts spearman(ranks1, ranks2) # = 0.9 (a & b are out of order)

ranks1 = [:a, :b, :c, :d, :e]
ranks2 = [:b, :d, :c, :a, :e]
puts spearman(ranks1, ranks2) # = 0.3 (stuff is all over the place)
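
The examples above all use lists containing the same items. When the lists only partly overlap, the filtering step drops the unshared items first, and ranks are then taken from positions within the filtered lists rather than the original ones. A minimal sketch of just that step, using made-up sample data (`:x` and `:y` are the unshared items, and `rank_pairs` is an illustrative name, not something the gist defines):

```ruby
# Made-up sample data: :x appears only in xs, :y only in ys
xs = [:a, :b, :x, :c]
ys = [:c, :y, :a, :b]

# same filtering as in spearman: keep only items present in both lists
xs_common = xs.select { |i| ys.include?(i) }
ys_common = ys.select { |i| xs.include?(i) }

# 1-based ranks are positions within the filtered lists
rank_pairs = xs_common.each_with_index.map do |x, x_index|
  [x, x_index + 1, ys_common.index(x) + 1]
end

p xs_common  # [:a, :b, :c]
p ys_common  # [:c, :a, :b]
p rank_pairs # [[:a, 1, 2], [:b, 2, 3], [:c, 3, 1]]
```

Note that `:c` gets rank 3 in `xs` even though it sat in fourth position originally, because `:x` was removed before ranking.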