@shiumachi
Last active January 3, 2016 20:49
Get a list of hrefs from the requests object for a given URL (Note: this is BS3 code. For the BS4 version, see https://gist.github.com/shiumachi/8633275 )
# -*- coding: utf-8 -*-
# Reference: http://kondou.com/BS4/
# Note: This code uses BeautifulSoup3, which is deprecated.
# If you need a BS4 code sample, see https://gist.github.com/shiumachi/8633275
from BeautifulSoup import BeautifulSoup
import requests


def get_href_list(requests_obj):
    # Parse the response body and collect the href of every <a> tag.
    # Anchors without an href attribute yield None.
    soup = BeautifulSoup(requests_obj.text)
    href_list = []
    for i in soup.findAll('a'):
        href_list.append(i.get('href'))
    return href_list


if __name__ == '__main__':
    r = requests.get("http://yahoo.co.jp/")
    href_list = get_href_list(r)
    for h in href_list:
        print(h)
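
Because the note above flags BeautifulSoup3 as deprecated, here is a minimal sketch of the same href-collecting idea using only the standard library's html.parser. It is a swapped-in technique, not the author's BS3 or BS4 code. The names HrefCollector and get_href_list_from_html are hypothetical, chosen for illustration. The function takes HTML text directly (for example, the `.text` of a requests response) rather than the response object. Unlike the original, it skips anchors that have no href instead of returning None for them.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        # attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def get_href_list_from_html(html_text):
    """Return the href values of all <a> tags in html_text, in document order."""
    collector = HrefCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hrefs


if __name__ == "__main__":
    # Inline sample so the sketch runs without network access.
    sample = (
        '<p><a href="/news">News</a> '
        '<a name="top">Top</a> '
        '<a href="http://example.com/">Example</a></p>'
    )
    for h in get_href_list_from_html(sample):
        print(h)
```

To use it with a live page, one could pass `requests.get(url).text` to get_href_list_from_html, assuming requests is installed. The returned hrefs are left exactly as written in the page, so relative links are not resolved.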