@tieleman
Last active August 29, 2015 14:08
wmstats for Ruby
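require 'benchmark'

filename  = 'pagecounts-20141029-230000'
min_views = 500
prefix    = 'en ' # keep only rows whose project code is "en"
count     = []

time = Benchmark.measure do
  # File.foreach streams the file line by line and closes it when done.
  File.foreach(filename) do |line|
    next unless line.start_with?(prefix)

    # Each row is "project title views bytes"; titles may contain bytes
    # that are not valid UTF-8, so reinterpret the line before splitting.
    parts = line.force_encoding('iso-8859-1').split(' ', 3)
    article, views = parts[1], parts[2].to_i
    count << [article, views] if views > min_views
  end

  # Most-viewed articles first.
  count.sort_by! { |_, views| -views }
end

# Benchmark::Tms#real is elapsed wall-clock time; #total is CPU time only.
puts "Query took #{time.real} seconds"
count.first(10).each { |article, views| puts "#{article} (#{views})" }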