@rafaeldalsenter
Created June 6, 2020 12:57
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

URL_STOCKS_LIST = 'https://www.infomoney.com.br/cotacoes/empresas-b3/'
HEADER_BASE = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'}


class Scraping:
    ...

    def set_urls(self, stocks):
        # Fetch the B3 company listing once, sending a browser-like user agent.
        req = Request(URL_STOCKS_LIST, headers=HEADER_BASE)
        with urlopen(req) as response:
            html = response.read()
        # __tratamento_html ("HTML treatment") lives in the elided part of the class.
        soup = BeautifulSoup(self.__tratamento_html(html), 'html.parser')
        for item in stocks:
            # The cell whose text equals the ticker code holds the link to the stock page.
            cell = soup.find('td', {'class': 'strong'}, string=item.codigo)
            item.url = cell.find('a')['href']
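
The loop above searches the whole document once per stock and raises if a ticker code is missing from the page. A minimal sketch of an alternative, using only the standard library's `html.parser` rather than BeautifulSoup, builds a code-to-URL map in a single pass. The names `StockLinkParser`, `build_url_map`, and `SAMPLE_HTML` are hypothetical, and the markup shape (`<td class="strong">` wrapping an `<a>` whose text is the ticker code) is an assumption inferred from the lookup in `set_urls`, not taken from the live page.

```python
from html.parser import HTMLParser


class StockLinkParser(HTMLParser):
    """Collects {ticker code: href} from <td class="strong"><a href=...>CODE</a></td> cells."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._in_cell = False   # inside a <td class="strong">
        self._href = None       # href of the <a> currently being read
        self._text = []         # text fragments of that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'td' and 'strong' in (attrs.get('class') or '').split():
            self._in_cell = True
        elif tag == 'a' and self._in_cell:
            self._href = attrs.get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links[''.join(self._text).strip()] = self._href
            self._href = None
        elif tag == 'td':
            self._in_cell = False


def build_url_map(html):
    """Parse an HTML string (decode bytes first) and return {code: url}."""
    parser = StockLinkParser()
    parser.feed(html)
    return parser.links


# Made-up sample markup; the real page layout is assumed, not verified.
SAMPLE_HTML = """
<table>
  <tr><td class="strong"><a href="/cotacoes/aaaa3/">AAAA3</a></td><td>Empresa A</td></tr>
  <tr><td class="strong"><a href="/cotacoes/bbbb4/">BBBB4</a></td><td>Empresa B</td></tr>
  <tr><td><a href="/outro/">CCCC3</a></td></tr>
</table>
"""

url_map = build_url_map(SAMPLE_HTML)
print(url_map.get('AAAA3'))  # link found in a "strong" cell
print(url_map.get('ZZZZ3'))  # missing code yields None instead of raising
```

With such a map, the loop body could become `item.url = url_map.get(item.codigo)`, leaving the caller to decide how to handle stocks that are not listed.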