@ploubser
Created June 28, 2012 16:30
Simple client-side function for determining outliers in a data set
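module MCollective
  class Aggregate
    # Collects the value returned by each node and reports those lying
    # outside the Tukey fences (1.5 * IQR beyond the lower and upper quartiles).
    class Outliers < Base
      def startup_hook
        result[:value] = {:high => [], :low => []}
        result[:type] = :collection
        @aggregate_format = "%s : %s" unless @aggregate_format
        @data_set = []
        # Despite the name, these hold the low and high fences, not the quartiles
        @quartiles = {:high => nil, :low => nil}
      end

      # Only store the value here; the analysis happens in summarize
      def process_result(value, reply)
        @data_set << value
      end

      def summarize
        # Nothing to analyse if no node replied
        return if @data_set.empty?

        @data_set.sort!
        set_quartiles
        find_outliers
      end

      def set_quartiles
        n = @data_set.size

        # 1-based quartile ranks: round the lower rank up, the upper rank down
        l = ((n + 1) / 4.0).ceil
        u = (3 * (n + 1) / 4.0).floor

        # Ruby arrays are 0-based, so subtract 1 from each rank
        lower = @data_set[l - 1]
        upper = @data_set[u - 1]

        # Interquartile range is a difference (the original assigned instead)
        iqr = upper - lower

        @quartiles[:low] = lower - (1.5 * iqr)
        @quartiles[:high] = upper + (1.5 * iqr)
      end

      def find_outliers
        @data_set.each do |data_item|
          result[:value][:high] << data_item if data_item > @quartiles[:high]
          result[:value][:low] << data_item if data_item < @quartiles[:low]
        end
      end
    end
  end
end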
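
The quartile and fence arithmetic can be tried outside MCollective. What follows is a minimal sketch, assuming a plain array of numbers: the method name `tukey_outliers` is hypothetical and is not part of this gist or of MCollective. It uses the same (n + 1)/4 rank rule, converts the 1-based ranks to 0-based array indices, and takes the interquartile range as the difference between the two quartile values.

```ruby
# Hypothetical helper, not part of MCollective: returns the values lying
# outside the 1.5 * IQR fences of a numeric array.
def tukey_outliers(values)
  sorted = values.sort
  n = sorted.size
  return {:high => [], :low => []} if n.zero?

  # 1-based quartile ranks: round the lower rank up and the upper rank down
  lower_rank = ((n + 1) / 4.0).ceil
  upper_rank = (3 * (n + 1) / 4.0).floor

  # Ruby arrays are 0-based, so subtract 1 from each rank
  q1 = sorted[lower_rank - 1]
  q3 = sorted[upper_rank - 1]
  iqr = q3 - q1

  low_fence = q1 - 1.5 * iqr
  high_fence = q3 + 1.5 * iqr

  {:high => sorted.select { |v| v > high_fence },
   :low  => sorted.select { |v| v < low_fence }}
end

# Eleven sample values: Q1 is 11, Q3 is 13, so the fences are 8.0 and 16.0
# and 95 is the only value reported as a high outlier.
p tukey_outliers([10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 95])
```

Rounding the lower rank up and the upper rank down keeps both indices inside the array for any non-empty input, at the cost of not interpolating between neighbouring values.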