This Python script uses the GeoPy geocoding library to batch geocode a list of addresses, trying several services in turn until one returns a latitude/longitude pair.
Last active January 11, 2023 21:43
Batch Geocoding Script with GeoPy
# import the geocoding services you'd like to try
from geopy.geocoders import ArcGIS, Bing, Nominatim, OpenCage, GeocoderDotUS, GoogleV3, OpenMapQuest
import csv, sys

print 'creating geocoding objects!'
arcgis = ArcGIS(timeout=100)
bing = Bing('your-API-key',timeout=100)
nominatim = Nominatim(timeout=100)
opencage = OpenCage('your-API-key',timeout=100)
geocoderDotUS = GeocoderDotUS(timeout=100)
googlev3 = GoogleV3(timeout=100)
openmapquest = OpenMapQuest(timeout=100)

# choose and order your preference for geocoders here
geocoders = [googlev3, bing, nominatim]

def geocode(address):
    i = 0
    try:
        while i < len(geocoders):
            # try to geocode using a service
            location = geocoders[i].geocode(address)
            # if it returns a location
            if location is not None:
                # return those values
                return [location.latitude, location.longitude]
            else:
                # otherwise try the next one
                i += 1
    except:
        # catch whatever errors, likely timeout, and return null values
        print sys.exc_info()[0]
        return ['null','null']
    # if all services have failed to geocode, return null values
    return ['null','null']

print 'geocoding addresses!'
# list to hold all rows
dout = []
with open('data.csv', mode='rb') as fin:
    reader = csv.reader(fin)
    j = 0
    for row in reader:
        print 'processing #',j
        j+=1
        try:
            # configure this based upon your input CSV file
            street = row[4]
            city = row[6]
            state = row[7]
            postalcode = row[5]
            country = row[8]
            address = street + ", " + city + ", " + state + " " + postalcode + " " + country
            result = geocode(address)
            # add the lat/lon values to the row
            row.extend(result)
            # add the new row to master list
            dout.append(row)
        except:
            print 'you are a beautiful unicorn'

print 'writing the results to file'
# print results to file
with open('geocoded.csv', 'wb') as fout:
    writer = csv.writer(fout)
    writer.writerows(dout)

print 'all done!'
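One thing to note about `geocode()` in the script: the `try` wraps the whole loop, so an exception raised by one service (a timeout, say) returns null values without trying the remaining services. Below is a minimal Python 3 sketch of a variant that handles errors per service. It is not part of the original script; `geocode_with_fallback`, `FakeLocation`, and `FakeGeocoder` are hypothetical names used for illustration. The fakes only mimic the `.geocode(address)` method and the `.latitude`/`.longitude` attributes that the script relies on, so the example runs without network access or API keys.

```python
import sys


def geocode_with_fallback(address, geocoders):
    """Try each geocoder in order; return [lat, lon] or ['null', 'null']."""
    for geocoder in geocoders:
        try:
            location = geocoder.geocode(address)
        except Exception as exc:
            # A failure in one service should not stop the next one from being tried.
            print('geocoder failed:', type(exc).__name__, file=sys.stderr)
            continue
        if location is not None:
            return [location.latitude, location.longitude]
    # Every service either failed or found nothing.
    return ['null', 'null']


class FakeLocation:
    """Hypothetical stand-in for a geocoding result."""

    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude


class FakeGeocoder:
    """Hypothetical stand-in for a geocoder object, for demonstration only."""

    def __init__(self, result=None, error=None):
        self.result = result
        self.error = error

    def geocode(self, address):
        if self.error is not None:
            raise self.error
        return self.result


# The first service raises, the second finds nothing, the third succeeds.
services = [
    FakeGeocoder(error=TimeoutError('timed out')),
    FakeGeocoder(),
    FakeGeocoder(result=FakeLocation(39.74, -104.99)),
]
print(geocode_with_fallback('some address', services))
```

In real use, the list of fakes would be replaced by the geocoder objects built at the top of the script. Depending on the installed geopy version, some of those constructor calls may need adjusting (for example, newer releases may expect a `user_agent` argument for `Nominatim`), so check the documentation for the version you have.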
Hello,
Thanks for posting this valuable code (at least for a newbie like me).
I would like to ask if you could add a feature to it.
Imagine that you have thousands of addresses to geolocate: every single failure (a communication problem, for example)
makes you restart from the beginning.
Is it possible to include a counter so that when it reaches its value (which could be a parameter too), the script edits the output file, appends the new set of processed records, resets the counter, and goes on to the next record?
Thank you in advance!
Best regards
KV
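One possible way to approach the request above is to append results to the output file in batches and, on restart, skip as many input rows as the output file already contains. The following is a rough Python 3 sketch under those assumptions, not the author's implementation. `count_rows`, `geocode_csv`, `geocode_row`, and `batch_size` are hypothetical names; `geocode_row` stands for any function that takes a CSV row and returns a `[lat, lon]` list. The resume logic assumes every input row produces exactly one output row, in the same order.

```python
import csv
import os


def count_rows(path):
    """Return the number of CSV records already in path (0 if it does not exist)."""
    if not os.path.exists(path):
        return 0
    with open(path, newline='') as f:
        return sum(1 for _ in csv.reader(f))


def geocode_csv(in_path, out_path, geocode_row, batch_size=100):
    """Geocode rows from in_path, appending them to out_path in batches.

    Running the function again resumes after the rows already in out_path.
    """
    done = count_rows(out_path)
    batch = []
    with open(in_path, newline='') as fin, open(out_path, 'a', newline='') as fout:
        writer = csv.writer(fout)
        try:
            for index, row in enumerate(csv.reader(fin)):
                if index < done:
                    continue  # already written during an earlier run
                batch.append(row + geocode_row(row))
                if len(batch) >= batch_size:
                    # The counter reached its limit: write, flush, reset.
                    writer.writerows(batch)
                    fout.flush()
                    batch = []
        finally:
            # Keep whatever was finished, even if a row raised an error.
            writer.writerows(batch)
```

If `geocode_row` raises partway through, the rows completed so far are still on disk, and calling `geocode_csv` again with the same paths picks up at the first unprocessed row. Note that the script as posted skips rows that raise an error in its main loop, which would break the one-row-in, one-row-out assumption this sketch depends on.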
Can you please provide an example of data.csv (even one record would be great)? I am new to using Python with geopy 👍
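The original data file is not shown, but the column indexes in the script give its expected layout: street in column 4, postal code in column 5, city in column 6, state in column 7, and country in column 8 (counting from 0). The Python 3 sketch below uses a made-up record to show how one such row would be turned into an address string. The address values and the `c0` to `c3` placeholders are invented for illustration, since the script never reads columns 0 to 3.

```python
import csv
import io

# Hypothetical record. Columns 0-3 are not used by the script, so their
# contents here are placeholders.
sample = 'c0,c1,c2,c3,123 Main St,62701,Springfield,IL,USA\n'

row = next(csv.reader(io.StringIO(sample)))

# Same column indexes as the script.
street = row[4]
postalcode = row[5]
city = row[6]
state = row[7]
country = row[8]

address = street + ", " + city + ", " + state + " " + postalcode + " " + country
print(address)  # 123 Main St, Springfield, IL 62701 USA
```

The script does not skip a header line, so if data.csv starts with column names, that first row is geocoded like any other.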
@rgdonohue Don't know if you ever got around to this - I had occasion to use this script and ported it to Python3/rewrote it to use Pandas (for no reason other than to cut down on the lines of code). https://gist.github.com/ericmhuntley/0c293113aa75a254237c143e0cf962fa