Created
August 13, 2019 05:42
Web scraping with urllib.request
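# This walks through downloading a web page's HTML source and the calls used to do it
import urllib.request as ur  # import the module under a short alias
url = "https://example.com"  # replace with the site you want to fetch
resp = ur.urlopen(url)  # open the URL; the response object is held in resp
data = resp.read()  # read the response body into data
print(data)
# The data read above is bytes; you can call decode() to convert it to a string
# urllib.request code tends to be more verbose, so the third-party requests module is recommended; its output is also cleaner
# As before, run pip install requests in cmd first, and then it is ready to use
import requests
url = "https://example.com"  # replace with the site you want to fetch
data = requests.get(url)  # data is a Response object, not the page text
print(data.text)  # .text is the body decoded to a string; print(data) alone shows only something like <Response [200]>
# Pretty simple, right?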
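The comment about `decode` can be made concrete. Below is a minimal sketch, not part of the original gist, of turning the bytes returned by `resp.read()` into a string. The helper name `fetch_text` and its parameters `fallback_encoding` and `timeout` are hypothetical, chosen only for illustration. It assumes the server declares its charset in the `Content-Type` header and falls back to UTF-8 when it does not.

```python
import urllib.request as ur


def fetch_text(url, fallback_encoding="utf-8", timeout=10):
    """Fetch url with urllib.request and return the body as a str.

    fetch_text and its parameters are hypothetical names for this sketch.
    """
    # The with-block closes the connection once the body has been read.
    with ur.urlopen(url, timeout=timeout) as resp:
        raw = resp.read()  # bytes, as in the snippet above
        # Use the charset declared in Content-Type, if any; otherwise fall back.
        charset = resp.headers.get_content_charset() or fallback_encoding
    # errors="replace" substitutes undecodable bytes instead of raising.
    return raw.decode(charset, errors="replace")
```

Calling `print(fetch_text(url))` with the `url` variable from above would then print readable text rather than a `b'...'` bytes literal.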