@hukl
Created February 12, 2011 13:08
Simple approach to parsing a large logfile
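require 'time'

# Prints the lines of an Apache log that fall within a five-minute
# window beginning at the given start time.
class LogfileParser
  # Regexp to match the timestamps in the apache common log format
  TIME_REGEXP = /\[\d{2}\/\w{3}\/\d{4}\:\d{2}:\d{2}:\d{2}\s.{5}\]/

  def initialize path, starting_at
    raise ArgumentError unless File.exist?( path )

    @log         = File.open( path )
    @starting_at = Time.parse( starting_at )
    @ending_at   = (@starting_at + 300) # window length: 300 seconds
  end

  # Yields every line whose timestamp lies within the window.
  # Assumes the log is in chronological order, so reading stops at the
  # first line past the end of the window.
  def emit &block
    @log.each_line do |line|
      next unless timestamp = line.match( TIME_REGEXP )

      # Replace the first colon (between date and time) with a space
      # so that Time.parse can read the timestamp
      current_time = Time.parse( timestamp[0].sub(":", " ") )

      if current_time >= @starting_at && current_time <= @ending_at
        yield line
      end

      if current_time > @ending_at
        return
      end
    end
  end
end

# Arguments: path to the logfile, start time (anything Time.parse accepts)
parser = LogfileParser.new( *ARGV )
parser.emit {|l| puts l}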
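
The timestamp handling is the least obvious step in the parser, so here is a minimal standalone sketch of it. The sample line and the `extract_time` helper are hypothetical, made up for illustration and not part of the gist. The regexp is copied from the class so that the snippet runs on its own.

```ruby
require 'time'

# Same pattern as LogfileParser::TIME_REGEXP, repeated here so the
# sketch is self-contained.
TIME_REGEXP = /\[\d{2}\/\w{3}\/\d{4}\:\d{2}:\d{2}:\d{2}\s.{5}\]/

# Hypothetical sample line in common log format (not taken from a real log).
SAMPLE_LINE = '127.0.0.1 - - [12/Feb/2011:13:08:00 +0100] "GET / HTTP/1.1" 200 2326'

# Hypothetical helper: returns the parsed Time for a log line,
# or nil when the line carries no timestamp.
def extract_time(line)
  match = line.match(TIME_REGEXP)
  return nil unless match

  # Only the first colon is replaced, i.e. the one separating the
  # date from the time of day: "12/Feb/2011:13:08:00" becomes
  # "12/Feb/2011 13:08:00", which Time.parse can read.
  Time.parse(match[0].sub(":", " "))
end

time = extract_time(SAMPLE_LINE)
puts time.utc.iso8601 if time
```

Lines without a timestamp, such as a continuation line or a blank line, yield `nil`, which corresponds to the `next unless` guard in `emit`.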