@herberthamaral
Created October 24, 2011 11:33
spider receita
"""
Simple script to get a citizen's name from their CPF (Brazil's SSN).
"""
from BeautifulSoup import BeautifulSoup as bs
import requests
import urllib2

with requests.session() as session:
    url = 'http://www.receita.fazenda.gov.br/aplicacoes/atcta/cpf/ConsultaPublica.asp'
    response = session.get(url)
    element = bs(response.content)
    image_url = 'http://www.receita.fazenda.gov.br/scripts/srf/intercepta/captcha.aspx?opt=image&v=123'
    # reuse the cookies from the first request when fetching the captcha
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
    urllib2.install_opener(opener)
    # download the captcha image
    url_imagem = urllib2.urlopen(image_url)
    # save the image to disk
    with open('imagem.gif', 'wb') as image_file:
        image_file.write(url_imagem.read())
    # ask the user for the captcha text and the CPF
    captcha = raw_input('Digite o captcha:')
    cpf = raw_input('Digite o cpf:')
    # send the form with the user's data
    dados = {'txtCpf': cpf, 'idLetra': captcha}
    response = session.post(url, data=dados)
    element = bs(response.content)
    # the second span with class clConteudoDados holds "<label>: <name>"
    nome = element.findAll('span', {'class': 'clConteudoDados'})[1].string.split(':')[1].strip()
    print(nome)
requests==0.6.2
BeautifulSoup==3.2.0
@gilsondev

I don't see the captcha to type in. Does the CPF have to be digits only, or in the standard format (000.000.000-00)?

@herberthamaral
Author

The captcha is saved to a file named imagem.gif in the script's directory, and the CPF must be digits only.
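Since the form expects digits only, a CPF typed in the standard format could be normalized before it is sent. The sketch below is one way to do that; `normalize_cpf` is a hypothetical helper, not part of the gist, and it only checks the length (11 digits), not the check digits.

```python
import re


def normalize_cpf(raw):
    """Return the 11 digits of a CPF, dropping dots, dashes and spaces.

    Hypothetical helper: it validates the length only, not the check digits.
    """
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) != 11:
        raise ValueError("a CPF has 11 digits, got %d" % len(digits))
    return digits
```

In the script, this would wrap the value read from the prompt, for example `cpf = normalize_cpf(raw_input('Digite o cpf:'))`.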

@gilsondev

Now it works! Very good!! Congratulations!! =D

@djcapelli

I'm having the same problem as gilsondev:
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'
Is there any way to handle the cookie without going through the CookieJar?

@fndiaz

fndiaz commented Feb 20, 2014

I'm having this problem; can someone help me solve it?

File "cpf.py", line 14, in
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(response.request.cookiejar))
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'

@licensed

I have the same problem as the commenter above:
AttributeError: 'PreparedRequest' object has no attribute 'cookiejar'
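
The `AttributeError` reported above comes from reading `response.request.cookiejar`, which newer releases of requests do not provide on `PreparedRequest`. One possible workaround, sketched here and not tested against the Receita site, is to drop urllib2 and download the captcha through the same session, because a requests session keeps its cookies across calls. `save_captcha` is a hypothetical helper name, and `response.raise_for_status()` assumes a current requests release, not the pinned 0.6.2.

```python
def save_captcha(session, image_url, path):
    """Download the captcha through `session` and write it to `path`.

    Hypothetical helper: `session` is expected to behave like a
    requests session, so the cookies set by the form page are sent
    along with the image request.
    """
    response = session.get(image_url)
    # fail early on HTTP errors instead of saving an error page as the image
    response.raise_for_status()
    with open(path, "wb") as image_file:
        image_file.write(response.content)
    return path
```

In the script, this would replace the `opener`, `install_opener` and `urlopen` lines: after the first `session.get(url)`, call `save_captcha(session, image_url, 'imagem.gif')` inside the same `with` block, so that the later `session.post` carries the same cookies.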
