@JoeThunyathep
Last active August 2, 2020 04:29
Python Script to Download Springer Textbooks
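import os
import requests, wget
import pandas as pd
# Load the Springer list of free textbooks
df = pd.read_excel("Free+English+textbooks.xlsx")
# Make sure the target folder exists before downloading into it
os.makedirs("./download", exist_ok=True)
for index, row in df.iterrows():
    # loop through the excel list
    file_name = f"{row.loc['Book Title']}_{row.loc['Edition']}".replace('/','-').replace(':','-')
    url = f"{row.loc['OpenURL']}"
    # Follow the OpenURL redirect, then derive the PDF URL from the final book URL
    r = requests.get(url)
    download_url = f"{r.url.replace('book','content/pdf')}.pdf"
    wget.download(download_url, f"./download/{file_name}.pdf")
    print(f"downloading {file_name}.pdf Complete ....")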

0xOneBeing commented May 21, 2020

This is a great project.
But after running it, I get this error:

[screenshot of the error; image not preserved]

Here is the code setup in my Sublime Text 3 editor:

[screenshot of the code in the editor; image not preserved]

I had already installed all the necessary packages using the pip install ... command. Please, what could be wrong?

UPDATE
I found out that there is a line-break character in some cells of the Edition column of Free+English+textbooks.xlsx.
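
A line break inside a cell ends up in the file name that the script builds from Book Title and Edition. One possible way to guard against that is sketched below. The helper name safe_file_name and the exact set of replaced characters are assumptions for illustration, not part of the original script.

```python
import re

def safe_file_name(title, edition):
    """Build a file name from two spreadsheet cells (hypothetical helper)."""
    raw = f"{title}_{edition}"
    # Collapse any run of whitespace, including line breaks, into one space
    raw = re.sub(r"\s+", " ", raw).strip()
    # Replace characters that are invalid in file names on common systems
    return re.sub(r'[\\/:*?"<>|]', "-", raw)

# Example cell values are made up; the second one contains a line break
print(safe_file_name("Data/Science: An Intro", "1st ed.\n2019"))
```

In the loop, this would take the place of the chained .replace('/','-').replace(':','-') calls.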

The problem now is that it's downloading every file as a PDF smaller than 100 KB (which is unusual).

[screenshot of the downloaded files; image not preserved]

Please, what could be wrong this time?


beardedherring commented Aug 2, 2020

Please, what could be wrong this time?

reCAPTCHA, unfortunately :(
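
If the server answers with a challenge page instead of the book, the saved ".pdf" is really a small HTML file. A rough way to notice this is to check whether the downloaded bytes start with the PDF header "%PDF-". The function name looks_like_pdf and the sample byte strings are made up for this sketch, and the check does not get around the challenge; it only detects that the download failed.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes begin with the standard PDF header."""
    return data[:5] == b"%PDF-"

# Made-up samples: an HTML challenge page and the start of a real PDF
html_page = b"<!DOCTYPE html><html><body>challenge</body></html>"
pdf_start = b"%PDF-1.7\n%binary data follows"

print(looks_like_pdf(html_page))
print(looks_like_pdf(pdf_start))
```

In the script, one option would be to fetch the file with requests.get(download_url).content, run this check, and only write the bytes to disk when it passes.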
