This Python script uses the GeoPy geocoding library to batch geocode a list of addresses, trying each configured service in turn until one returns a latitude/longitude pair.
# import the geocoding services you'd like to try
from geopy.geocoders import ArcGIS, Bing, Nominatim, OpenCage, GeocoderDotUS, GoogleV3, OpenMapQuest
import csv, sys

print 'creating geocoding objects!'

arcgis = ArcGIS(timeout=100)
bing = Bing('your-API-key',timeout=100)
nominatim = Nominatim(timeout=100)
opencage = OpenCage('your-API-key',timeout=100)
geocoderDotUS = GeocoderDotUS(timeout=100)
googlev3 = GoogleV3(timeout=100)
openmapquest = OpenMapQuest(timeout=100)

# choose and order your preference for geocoders here
geocoders = [googlev3, bing, nominatim]

def geocode(address):
    i = 0
    try:
        while i < len(geocoders):
            # try to geocode using a service
            location = geocoders[i].geocode(address)
            # if it returns a location
            if location is not None:
                # return those values
                return [location.latitude, location.longitude]
            else:
                # otherwise try the next one
                i += 1
    except:
        # catch whatever errors, likely timeout, and return null values
        print sys.exc_info()[0]
        return ['null','null']
    # if all services have failed to geocode, return null values
    return ['null','null']

print 'geocoding addresses!'

# list to hold all rows
dout = []

with open('data.csv', mode='rb') as fin:
    reader = csv.reader(fin)
    j = 0
    for row in reader:
        print 'processing #',j
        j+=1
        try:
            # configure this based upon your input CSV file
            street = row[4]
            city = row[6]
            state = row[7]
            postalcode = row[5]
            country = row[8]
            address = street + ", " + city + ", " + state + " " + postalcode + " " + country
            result = geocode(address)
            # add the lat/lon values to the row
            row.extend(result)
            # add the new row to master list
            dout.append(row)
        except:
            print 'you are a beautiful unicorn'

print 'writing the results to file'

# print results to file
with open('geocoded.csv', 'wb') as fout:
    writer = csv.writer(fout)
    writer.writerows(dout)

print 'all done!'
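One thing to note about the script: the try/except wraps the whole loop over services, so an exception raised by the first geocoder returns null values without trying the remaining ones. Below is a minimal Python 3 sketch of a variation that handles errors per service instead. It is not the author's code: the function name `geocode_with_fallback` and the three stub classes are hypothetical, and the stubs stand in for real geopy geocoders so the example runs without network access or API keys.

```python
from collections import namedtuple

# Stand-in for the object a geocoder returns; only the two attributes used here.
Location = namedtuple("Location", ["latitude", "longitude"])


class FailingService:
    """Hypothetical stub that always raises, like a service that times out."""
    def geocode(self, address):
        raise RuntimeError("simulated timeout")


class EmptyService:
    """Hypothetical stub that finds nothing, so it returns None."""
    def geocode(self, address):
        return None


class FixedService:
    """Hypothetical stub that always returns the same made-up coordinates."""
    def geocode(self, address):
        return Location(1.5, -2.5)


def geocode_with_fallback(address, services):
    """Try each service in order; an error in one does not stop the rest."""
    for service in services:
        try:
            location = service.geocode(address)
        except Exception as exc:
            # Report the failure and move on to the next service.
            print("service failed:", type(service).__name__, exc)
            continue
        if location is not None:
            return [location.latitude, location.longitude]
    # Every service either failed or found nothing.
    return ["null", "null"]


result = geocode_with_fallback(
    "any address", [FailingService(), EmptyService(), FixedService()]
)
print(result)
```

With real geocoders, the `services` list would hold objects such as the `googlev3`, `bing`, and `nominatim` instances created in the script, since each exposes a `geocode(address)` method.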
@rgdonohue Don't know if you ever got around to this - I had occasion to use this script and ported it to Python3/rewrote it to use Pandas (for no reason other than to cut down on the lines of code). https://gist.github.com/ericmhuntley/0c293113aa75a254237c143e0cf962fa
Hello,
Thanks for posting this valuable code (at least for a newbie like me).
I would like to ask if you could add a feature to it.
Imagine that you have thousands of addresses to geolocate: every single failure (a communication problem, for example)
makes you restart from the beginning.
Is it possible to include a counter so that when it reaches its value (which could be a parameter too), the script opens the output file, appends the new set of processed records, resets the counter, and goes on to the next record?
Thank you in advance!
Best regards
KV
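A rough Python 3 sketch of the batching idea described in this comment, for illustration only. The names `geocode_in_batches`, `append_rows`, `count_rows`, and the `batch_size` parameter are hypothetical, and `geocode_row` is any function you supply that takes a CSV row and returns the values to append (for example, a latitude/longitude pair). The sketch assumes every input row produces exactly one output row, so the number of rows already in the output file tells it where to resume.

```python
import csv
import os


def count_rows(path):
    """Number of rows already written to the output file (0 if it is absent)."""
    if not os.path.exists(path):
        return 0
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f))


def append_rows(path, rows):
    """Append a batch of rows to the output file."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(rows)


def geocode_in_batches(in_path, out_path, geocode_row, batch_size=100):
    """Append results every batch_size rows; skip rows done in an earlier run."""
    done = count_rows(out_path)
    batch = []
    with open(in_path, newline="") as fin:
        try:
            for index, row in enumerate(csv.reader(fin)):
                if index < done:
                    continue  # already in the output file from a previous run
                batch.append(row + geocode_row(row))
                if len(batch) >= batch_size:
                    append_rows(out_path, batch)
                    batch = []  # reset the counter
        finally:
            # Save whatever was finished, even if geocode_row raised.
            if batch:
                append_rows(out_path, batch)
```

If a run stops partway through, calling `geocode_in_batches` again with the same input and output paths picks up at the first row that is not yet in the output file.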
Can you please provide an example of the data.csv (even one record would be great)? I am new to Python and geopy 👍
@Vondoe79, sorry for the very late response. The data I used were kinda sensitive, so I didn't publish them. I'll work up a more updated example with Python 3 in a Jupyter notebook.
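Since the original data were not published, the layout the script expects can only be inferred from the column indexes it reads: street in column 4, postal code in column 5, city in column 6, state in column 7, and country in column 8 (counting from 0). The Python 3 sketch below shows how one record would be turned into an address string. The record itself is an invented placeholder, not real data, and the contents of columns 0 to 3 are pure guesses because the script never reads them.

```python
# Column positions the script reads; columns 0-3 are never used by it.
STREET, POSTALCODE, CITY, STATE, COUNTRY = 4, 5, 6, 7, 8

# Invented placeholder record, not taken from the author's data.
row = ["id", "name", "phone", "email",
       "123 Main St", "00000", "Anytown", "ST", "USA"]


def build_address(row):
    """Join the fields in the same order the script does."""
    return (row[STREET] + ", " + row[CITY] + ", " + row[STATE]
            + " " + row[POSTALCODE] + " " + row[COUNTRY])


print(build_address(row))  # 123 Main St, Anytown, ST 00000 USA
```

If your CSV uses different columns, adjust the index constants to match, which mirrors the "configure this based upon your input CSV file" comment in the script.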