#!/bin/bash
#
# Download the Large-scale CelebFaces Attributes (CelebA) Dataset
# from their Google Drive link.
#
# CelebA: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
#
# Google Drive: https://drive.google.com/drive/folders/0B7EVK8r0v71pWEZsZE9oNnFzTm8
python3 get_drive_file.py 0B7EVK8r0v71pZjFTYXZWM3FlRnM celebA.zip
import requests

def download_file_from_google_drive(id, destination):

    def get_confirm_token(response):
        for key, value in response.cookies.items():
            if key.startswith('download_warning'):
                return value
        return None

    def save_response_content(response, destination):
        CHUNK_SIZE = 32768
        with open(destination, "wb") as f:
            for chunk in response.iter_content(CHUNK_SIZE):
                if chunk:  # filter out keep-alive chunks
                    f.write(chunk)

    URL = "https://docs.google.com/uc?export=download"
    session = requests.Session()
    response = session.get(URL, params={'id': id}, stream=True)
    token = get_confirm_token(response)
    if token:
        params = {'id': id, 'confirm': token}
        response = session.get(URL, params=params, stream=True)
    save_response_content(response, destination)

if __name__ == "__main__":
    import sys
    if len(sys.argv) != 3:
        print("Usage: python google_drive.py drive_file_id destination_file_path")
    else:
        # TAKE ID FROM SHAREABLE LINK
        file_id = sys.argv[1]
        # DESTINATION FILE ON YOUR DISK
        destination = sys.argv[2]
        download_file_from_google_drive(file_id, destination)
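Several commenters below report "End-of-central-directory signature not found" when unzipping the result. A common cause (an assumption here, but consistent with how Google Drive behaves for large files) is that Drive returns a small HTML warning or quota page instead of the file, and the script saves that page as `celebA.zip`. A minimal post-download sanity check could look like this; the function names are illustrative, not part of the original script:

```python
import zipfile

def looks_like_drive_error_page(path):
    """Heuristic: a real zip starts with the 'PK' magic bytes, while a
    failed Google Drive download is usually a small HTML page."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Zip local-file-header magic is b'PK\x03\x04'
    return not head.startswith(b"PK")

def verify_zip(path):
    """Return True only if the file is a structurally valid zip archive."""
    return zipfile.is_zipfile(path) and not looks_like_drive_error_page(path)
```

Calling `verify_zip(destination)` right after the download and retrying (or reporting the failure) when it returns False would catch the truncated/HTML case before you attempt to unzip.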
I'll look into it!
The command
$ python3 get_drive_file.py 0B7EVK8r0v71pZjFTYXZWM3FlRnM celebA.zip
is working for me. It may have been a temporary issue, please follow up if you experience the error again!
Thanks for checking. It works now. Sorry for not being specific with the description last time. Seems to me like it was a networking error.
Thanks for providing this! The code runs, but suspiciously fast, and when unzipping the zip file, I get a
End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfile comment will be found on the last disk(s) of this archive.
I am having the same issue. Did you find a solution?
Thanks.
The problem seems to be the size of the files and the fact that they're hosted on Google Drive, which isn't really meant for sharing such big datasets. What happens is that the download fails, or only partially completes. I ended up getting the dataset from a different source. Downloading it with TensorFlow Datasets worked for me at some point, maybe give that a try?
same problem here.
https://github.com/matteodalessio/download_google_drive
# Install gdown
!pip install gdown
# bash download command
!gdown 0B7EVK8r0v71pZjFTYXZWM3FlRnM -O celebA.zip
This works for me.
Unfortunately this is broken.