@johana-star
Created December 10, 2013 22:41
Scrape location data out of Craigslist postings with Nokogiri.
require "nokogiri"
require "open-uri"

# URI.open is the open-uri entry point on Ruby 2.5+; bare Kernel#open no longer fetches URLs on Ruby 3.0+.
doc = Nokogiri::XML(URI.open('http://sfbay.craigslist.org/search/apa/sfc?bedrooms=1&catAbb=apa&maxAsk=2000&format=rss'))
urls = doc.xpath("//rdf:li").map { |entry| entry.values.first }
page = Nokogiri::HTML(URI.open(urls.first))
# This page contains a div with an id of 'map' which has two attributes:
# 'data-latitude' and 'data-longitude'. How do I get their values?
# Something like: (???)
page.css("div#map").???
@zph
zph commented Dec 10, 2013

This gives you a flat array of [lat, lng] strings:

page.css('div#map').map { |p| [p.attr('data-latitude'), p.attr('data-longitude')] }.flatten

@zph
zph commented Dec 10, 2013

Or to grab one at a time:

page.css('div#map').attr('data-longitude').value

@martinisoft
A first pass could be a basic XPath lookup on the attributes:

page.css("div#map").xpath("@data-latitude").first.value
page.css("div#map").xpath("@data-longitude").first.value
