Last active
October 13, 2025 10:52
-
-
Save cbeddow/79d68aa6ed0f028d8dbfdad2a4142cf5 to your computer and use it in GitHub Desktop.
Download Mapillary images a JPGs
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import mercantile, mapbox_vector_tile, requests, json, os | |
from vt2geojson.tools import vt_bytes_to_geojson | |
# define an empty geojson as output | |
output= { "type": "FeatureCollection", "features": [] } | |
# vector tile endpoints -- change this in the API request to reference the correct endpoint | |
tile_coverage = 'mly1_public' | |
# tile layer depends which vector tile endpoints: | |
# 1. if map features or traffic signs, it will be "point" always | |
# 2. if looking for coverage, it will be "image" for points, "sequence" for lines, or "overview" for far zoom | |
tile_layer = "image" | |
# Mapillary access token -- user should provide their own | |
access_token = 'MLY|XXX' | |
# a bounding box in [east_lng,_south_lat,west_lng,north_lat] format | |
west, south, east, north = [-80.13423442840576,25.77376933762778,-80.1264238357544,25.788608487732198] | |
# get the list of tiles with x and y coordinates which intersect our bounding box | |
# MUST be at zoom level 14 where the data is available, other zooms currently not supported | |
tiles = list(mercantile.tiles(west, south, east, north, 14)) | |
# loop through list of tiles to get tile z/x/y to plug in to Mapillary endpoints and make request | |
for tile in tiles: | |
tile_url = 'https://tiles.mapillary.com/maps/vtp/{}/2/{}/{}/{}?access_token={}'.format(tile_coverage,tile.z,tile.x,tile.y,access_token) | |
response = requests.get(tile_url) | |
data = vt_bytes_to_geojson(response.content, tile.x, tile.y, tile.z,layer=tile_layer) | |
# push to output geojson object if yes | |
for feature in data['features']: | |
# get lng,lat of each feature | |
lng = feature['geometry']['coordinates'][0] | |
lat = feature['geometry']['coordinates'][1] | |
# ensure feature falls inside bounding box since tiles can extend beyond | |
if lng > west and lng < east and lat > south and lat < north: | |
# create a folder for each unique sequence ID to group images by sequence | |
sequence_id = feature['properties']['sequence_id'] | |
if not os.path.exists(sequence_id): | |
os.makedirs(sequence_id) | |
# request the URL of each image | |
image_id = feature['properties']['id'] | |
header = {'Authorization' : 'OAuth {}'.format(access_token)} | |
url = 'https://graph.mapillary.com/{}?fields=thumb_2048_url'.format(image_id) | |
r = requests.get(url, headers=header) | |
data = r.json() | |
image_url = data['thumb_2048_url'] | |
# save each image with ID as filename to directory by sequence ID | |
with open('{}/{}.jpg'.format(sequence_id, image_id), 'wb') as handler: | |
image_data = requests.get(image_url, stream=True).content | |
handler.write(image_data) | |
@Oscar161 I've made a new script that has pausing and resuming functionality. However, since this new script can use multiple tokens just make sure you have a tokens.txt saved in the same directory. Also, since it downloads so many files, I've changed it so it downloads photos in the format of Images/[tile_id]/[sequence_id]/[imagename.jpg]
Other improved quality of life functions include performance (supports async and concurrency), reliability (takes API rate limits into account, notifies you of network errors + retries), usability (progress bars, logging, cli arguments),
https://gist.github.com/cat768/1f499a566c73d6d0e58172966ce3e467
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I wanted to download all images from one City. Therefor i got my boundingbox data and Started a server. I used this script, to download the pictures ordered by sequence via API :) so far everything was working.
Now i have a problem: My server (which i got from university) has got a timelimit and it stopped and crashed after 4 hours of downloading. Also i can see the last Picture downloaded and the sequence it was in (by timestamp). My question: is there a posibility to restart the Server and modify the script, so that i just start with the image i dowwnloaded last? anf if Yes: how could i modify that? Would be a big help, if someone knows a workaround on that issue, i think serverchrashes are common and i am sure there will be other people having the same problems!
Greetings and thanks!