@ScribbleGhost
Created January 19, 2020 21:53
Script from scrapethissite.com
import requests
from bs4 import BeautifulSoup

URL = 'https://scrapethissite.com/pages/forms/'


def scrape():
    """Print every team name, requesting pages until one has no team rows."""
    page = 1
    while True:
        r = requests.get(URL, params=dict(per_page=25, page_num=page), timeout=10)
        r.raise_for_status()  # stop on HTTP errors instead of parsing an error page
        soup = BeautifulSoup(r.text, 'html.parser')
        teams = soup.find_all('tr', 'team')  # table rows with class "team"
        if not teams:
            # An empty page means we have gone past the last page of results.
            print('---------Done Scraping---------')
            break
        print(f'---------Scraping page: {page}---------')
        for team in teams:
            print(team.find('td', 'name').get_text(strip=True))
        page += 1


if __name__ == '__main__':
    scrape()
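The script stops paging as soon as a request returns no team rows. A minimal sketch of that stop condition on its own, with the network and parsing step swapped for a stand-in, might look like the following. The names `iter_pages`, `fake_fetch`, `FAKE_SITE` and the `max_pages` cap are hypothetical and not part of the original script, and the sample data is made up for illustration.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int], List[str]],
               start: int = 1,
               max_pages: int = 1000) -> Iterator[List[str]]:
    """Yield each non-empty page of rows, stopping at the first empty one.

    max_pages is a safety cap (an assumption, not in the original script)
    so a misbehaving site cannot keep the loop running forever.
    """
    for page in range(start, start + max_pages):
        rows = fetch_page(page)
        if not rows:
            return  # first empty page: we are past the last page of results
        yield rows


# Hypothetical stand-in for "request the page and pull out the team names".
FAKE_SITE = {
    1: ["Boston Bruins", "Buffalo Sabres"],
    2: ["Calgary Flames"],
}


def fake_fetch(page: int) -> List[str]:
    return FAKE_SITE.get(page, [])


all_names = [name for rows in iter_pages(fake_fetch) for name in rows]
print(all_names)
```

Keeping the loop separate from the fetch step makes the paging logic checkable without touching the site; in the real script, the fetch function would wrap the `requests.get` and BeautifulSoup calls.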