The Mapzen Search API is built from open-source data and technology and allows up to 30,000 geocoding requests per day. This gist shows how to use its "prioritize within a circular region" feature to get more accurate results.
Here are the results of geocoding the Starbucks locations in New York City, using the prioritization feature:
The data files in this gist:
- starbucks_geocoding.mapzen.priority.py - a Python script that reads starbucks.csv and creates starbucks.geocoded.csv
- starbucks.csv - a list of Starbucks locations as derived from the city's health inspection database
- starbucks.geocoded.csv - the results of running the Python script with Mapzen geocoding.
The API call, with parameters that prioritize results near a focus point and restrict them to a boundary circle, looks like this:
/v1/search?api_key=search-XXXXXXX
&text=YMCA
&focus.point.lat=-33.856680
&focus.point.lon=151.215281
&boundary.circle.lat=-33.856680
&boundary.circle.lon=151.215281
&boundary.circle.radius=50
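As a rough sketch, the same query string can be assembled in Python with the standard library. The parameter names come from the call above; the function name `build_search_url` and the `https://search.example.com` host are placeholders for illustration, not part of the original script.

```python
from urllib.parse import urlencode


def build_search_url(base_url, api_key, text, lat, lon, radius):
    """Build a /v1/search URL that uses one point as both focus and boundary circle.

    base_url is a placeholder for the API host (an assumption of this sketch).
    """
    params = [
        ("api_key", api_key),
        ("text", text),
        # Prioritize results near this point.
        ("focus.point.lat", lat),
        ("focus.point.lon", lon),
        # Only return results inside this circle.
        ("boundary.circle.lat", lat),
        ("boundary.circle.lon", lon),
        ("boundary.circle.radius", radius),
    ]
    # urlencode escapes spaces and commas in the address text.
    return base_url + "/v1/search?" + urlencode(params)


url = build_search_url(
    "https://search.example.com",  # hypothetical host
    "search-XXXXXXX",
    "YMCA",
    -33.856680,
    151.215281,
    50,
)
print(url)
```

Using `urlencode` rather than string concatenation matters for real addresses, which contain spaces and commas that must be escaped.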
Without these parameters to limit the scope of the search, Mapzen Search will sometimes return erroneous results.
For example, the top result for 1 PENN PLAZA MANHATTAN, NY, 10119 is geocoded to Kansas:
However, once a boundary circle around New York is specified, the erroneous points are (mostly) fixed:
There are still a few errors. For example, 0 JFK DELTA AIRLINES, QUEENS, NY 11422 is geocoded to a point near LaGuardia Airport.
In my Python script, I create a lookup table with a point for each borough (e.g. BRONX, NY), and for each location, I set the focus point to its borough's point. Better results could probably be had by creating a lookup table for each ZIP code.
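A borough lookup table along these lines might look like the sketch below. It is not the code from the script in this gist: the coordinates are approximate borough centers chosen for illustration, and the `BOROUGH_POINTS` table, the `focus_params` helper, and the default radius of 20 are all hypothetical.

```python
# Approximate borough center points (illustrative values, not from the original script).
BOROUGH_POINTS = {
    "MANHATTAN": (40.7831, -73.9712),
    "BROOKLYN": (40.6782, -73.9442),
    "QUEENS": (40.7282, -73.7949),
    "BRONX": (40.8448, -73.8648),
    "STATEN ISLAND": (40.5795, -74.1502),
}


def focus_params(address, radius=20):
    """Return focus/boundary parameters for the borough named in the address.

    If several borough names appear (e.g. a street called "Brooklyn Ave"),
    the one that occurs last in the string wins, since the borough usually
    follows the street. Returns an empty dict when no borough is found.
    """
    upper = address.upper()
    best_name, best_pos = None, -1
    for name in BOROUGH_POINTS:
        pos = upper.rfind(name)
        if pos > best_pos:
            best_name, best_pos = name, pos
    if best_name is None:
        return {}
    lat, lon = BOROUGH_POINTS[best_name]
    return {
        "focus.point.lat": lat,
        "focus.point.lon": lon,
        "boundary.circle.lat": lat,
        "boundary.circle.lon": lon,
        "boundary.circle.radius": radius,  # hypothetical default
    }


print(focus_params("1 PENN PLAZA MANHATTAN, NY, 10119"))
```

A ZIP code table would have the same shape, keyed on the five-digit code instead of the borough name, with a smaller radius per entry.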
See the CartoDB version of these maps: