@jonathanoheix
Created December 11, 2018 14:56
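import requests

# Collect catalogue page URLs, stopping at the first page that does not return HTTP 200
pages_urls = []
new_page = "http://books.toscrape.com/catalogue/page-1.html"
while requests.get(new_page).status_code == 200:
    pages_urls.append(new_page)
    # Build the next URL by incrementing the page number: page-N.html -> page-(N+1).html
    new_page = pages_urls[-1].split("-")[0] + "-" + str(int(pages_urls[-1].split("-")[1].split(".")[0]) + 1) + ".html"

print(str(len(pages_urls)) + " fetched URLs")
print("Some examples:")
pages_urls[:5]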