@dhoss
Created December 9, 2013 18:44
simple ruby search
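The script calls `cosine` on arrays but only assumes that the method exists. A minimal sketch of one possible definition follows; it is an assumption, not the author's code. It reopens `Array`, treats `nil` entries as 0 (the script leaves gaps in its vectors), and treats the shorter vector as zero-padded.

```ruby
# Hypothetical helper: the script only assumes a method like this is defined.
class Array
  # Cosine similarity between two term vectors.
  # nil entries count as 0, and the shorter vector is treated as zero-padded.
  def cosine(other)
    length = [size, other.size].max
    dot = 0.0
    norm_a = 0.0
    norm_b = 0.0
    length.times do |i|
      a = self[i].to_f   # nil.to_f is 0.0, and so is any index past the end
      b = other[i].to_f
      dot += a * b
      norm_a += a * a
      norm_b += b * b
    end
    # An all-zero vector has no direction; report zero similarity.
    return 0.0 if norm_a.zero? || norm_b.zero?
    dot / Math.sqrt(norm_a * norm_b)
  end
end
```

With this in place, identical vectors score 1.0 and vectors with no shared terms score 0.0, so a higher score means a more closely related document.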
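# Map each distinct word to a stable integer index (0, 1, 2, ...).
terms = Hash.new { |h, k| h[k] = h.size }
# Build one binary term vector per document line.
docs = DATA.collect { |line|
  name = line[/^\d+/] # leading document number, as a String
  words = line.downcase.scan(/[a-z]+/)
  vector = []
  words.each { |word| vector[terms[word]] = 1 }
  { :name => name, :vector => vector }
}
# Fill the gaps so every vector has one 0/1 entry per known term.
docs.each { |doc| doc[:vector] = Array.new(terms.size) { |i| doc[:vector][i] || 0 } }
current = docs.first # or any other
# sort_by returns a new array, so keep the result; negate to put the most similar first.
ranked = docs.sort_by { |doc|
  # assume we have defined cosine on arrays
  -doc[:vector].cosine(current[:vector])
}
# Skip current itself, then take the five closest documents.
related = ranked.reject { |doc| doc.equal?(current) }.first(5).collect { |doc| doc[:name] }
puts related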
__END__
0 Human machine interface for Lab ABC computer applications
1 A survey of user opinion of computer system response time
2 The EPS user interface management system
3 System and human system engineering testing of EPS
4 Relation of user-perceived response time to error measurement
5 The generation of random, binary, unordered trees
6 The intersection graph of paths in trees
7 Graph minors IV: Widths of trees and well-quasi-ordering
8 Graph minors: A survey