@foreverbell
Created October 3, 2014 12:13
A utility to fetch the capital of every country.
#!/usr/bin/env python3
# encoding: utf-8
import json, random, re, urllib.request

url = 'http://www.tripmondo.com/magazine/facts-and-statistics/list-of-capitals-and-countries/'
# Assumes the page is served as UTF-8.
html = urllib.request.urlopen(url, timeout=5).read().decode('utf-8')
# Each table row holds: country link, capital link, latitude, longitude.
reg = re.compile(r'<td><a href=.*?>(.*?)</a></td>\s+<td><a href=.*?>(.*?)</a></td>\s+<td class="right">(.*?)</td>\s+<td class="right">(.*?)</td>')
rows = []
for country, capital, latitude, longitude in reg.findall(html):
    wiki = 'http://en.wikipedia.org/wiki/' + capital.replace(' ', '_')
    # Coordinates may use a decimal comma, so normalize before parsing.
    lat = float(latitude.replace(',', '.'))
    lng = float(longitude.replace(',', '.'))
    # Entry layout: [capital, latitude, longitude, random value, wiki URL].
    rows.append('\t[%s, %.2f, %.2f, %.4f, %s]' % (json.dumps(capital), lat, lng, random.random(), json.dumps(wiki)))
    print(country, capital, latitude, longitude)
with open('city.json', 'w') as f:
    # Join with ",\n" so the last entry has no trailing comma (keeps the JSON valid).
    f.write('[\n' + ',\n'.join(rows) + '\n]\n')
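The parsing step can be tried without network access. The sketch below applies the same pattern as the script above to a small HTML fragment. `SAMPLE_HTML` and `parse_capitals` are hypothetical names introduced here for illustration, and the fragment is only shaped the way the pattern expects; it is not copied from the real page. The random value is left out so the result is deterministic.

```python
import json
import re

# Same pattern as the script, split across lines and written as raw strings.
ROW = re.compile(
    r'<td><a href=.*?>(.*?)</a></td>\s+'
    r'<td><a href=.*?>(.*?)</a></td>\s+'
    r'<td class="right">(.*?)</td>\s+'
    r'<td class="right">(.*?)</td>'
)

# Hypothetical fragment (an assumption about the page layout, not real data from it).
SAMPLE_HTML = '''
<tr>
<td><a href="/france/">France</a></td>
<td><a href="/france/paris/">Paris</a></td>
<td class="right">48,85</td>
<td class="right">2,35</td>
</tr>
<tr>
<td><a href="/us/">United States</a></td>
<td><a href="/us/washington/">Washington D.C.</a></td>
<td class="right">38,90</td>
<td class="right">-77,04</td>
</tr>
'''


def parse_capitals(html):
    """Return a list of [capital, latitude, longitude, wiki_url] entries."""
    entries = []
    for country, capital, latitude, longitude in ROW.findall(html):
        wiki = 'http://en.wikipedia.org/wiki/' + capital.replace(' ', '_')
        entries.append([
            capital,
            # Convert a decimal comma to a decimal point before parsing.
            float(latitude.replace(',', '.')),
            float(longitude.replace(',', '.')),
            wiki,
        ])
    return entries


entries = parse_capitals(SAMPLE_HTML)
# json.dumps handles quoting and separators, so the output is always valid JSON.
text = json.dumps(entries, indent=1)
print(text)
```

Serializing with `json.dumps` rather than hand-built strings avoids the trailing-comma problem and escapes any quotes in city names, at the cost of losing the fixed two-decimal formatting.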