@lukesutton
Created September 20, 2013 07:20
The beginning of a screen scraper for AusPost's package tracking.
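require 'nokogiri'
require 'time'

# Parse AusPost tracking HTML into an array of event hashes.
def extract(doc_string)
  doc = Nokogiri::HTML(doc_string)
  doc.css('#more-details-content .ed-details-row').map do |r|
    time     = Time.parse(r.at_css('.ed-date p').inner_text.strip)
    activity = r.at_css('.ed-activity p').inner_text.strip
    location = r.at_css('.ed-location p').inner_text.strip
    # The state is the trailing 2-3 letter token, e.g. "TANUNDA SA".
    match  = location.match(/\s+(\w{2,3})$/)
    state  = match && match[1]
    # Strip only the trailing state, so a suburb containing the same letters survives.
    suburb = match ? match.pre_match.strip : location
    {
      :time => time,
      :activity => activity,
      :state => state,
      :suburb => suburb
    }
  end
end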
# extract(HTTParty.get('http://auspost.com.au/track/track.html?exp=b&id=<CODE>').body)
# => [{:time=>2020-09-13 10:54:00 +0930, :activity=>"Delivered", :state=>"SA", :suburb=>"TANUNDA"}, {:time=>2020-09-13 09:45:00 +0930, :activity=>"On board with driver for delivery today", :state=>"SA", :suburb=>"TANUNDA"}, {:time=>2020-09-13 06:52:00 +0930, :activity=>"In Transit", :state=>"SA", :suburb=>"TANUNDA"}, {:time=>2018-09-13 15:21:00 +0930, :activity=>"Received by Australia Post", :state=>"VIC", :suburb=>"WEST FOOTSCRAY"}]
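The state/suburb split depends on the location text ending in a 2-3 letter state code. Below is a minimal sketch of that step on its own, assuming locations look like "TANUNDA SA" as in the sample output above. `split_location` is a hypothetical helper name, not part of the gist, and it anchors the match at the end of the string so that only the trailing state token is removed.

```ruby
# Hypothetical helper (not in the original gist): split an AusPost location
# string such as "TANUNDA SA" into its suburb and state parts.
def split_location(location)
  text = location.strip
  # Lazily capture the suburb, then require a trailing 2-3 letter state code.
  match = text.match(/\A(.*?)\s+([A-Za-z]{2,3})\z/)
  # Assumption: with no trailing state token, the whole string is the suburb.
  return { :suburb => text, :state => nil } unless match
  { :suburb => match[1].strip, :state => match[2] }
end

p split_location('TANUNDA SA')
p split_location('WEST FOOTSCRAY VIC')
p split_location('SALISBURY SA')
```

Anchoring the pattern keeps a suburb such as "SALISBURY" intact even though it contains the letters "SA", which a plain substitution of the state string would also strip.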