@allieus
Created July 21, 2017 14:31
Answer to [a question about the crawling lecture](http://disq.us/p/1kovfwu) in the AskDjango VOD
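import json
import re

import requests

# Daum search results page for the query "영화" (movie)
daum_url = "http://search.daum.net/search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&sugo=&q=%EC%98%81%ED%99%94"
html = requests.get(daum_url).text

# Capture each `categoryIssueObj["key"] = {...};` assignment embedded in the page's JavaScript
issue_list = re.findall(r'categoryIssueObj\["(\w+)"\]\s*=\s*(.+?);', html, re.S)

for key, js_obj in issue_list:
    # Quote bare object keys so the JavaScript literal can be parsed as JSON
    json_string = re.sub(r'([{,])\s*(\w+):', r'\1"\2":', js_obj)
    print('## {} ##'.format(key))
    print(json.loads(json_string))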
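
The two regular expressions above can be tried without any network access. The sketch below runs them against a small, made-up HTML snippet. The `sample_html` content and the helper names `quote_bare_keys` and `extract_issues` are assumptions for illustration only, and the real Daum page may be structured differently.

```python
import json
import re


def quote_bare_keys(js_obj):
    """Wrap unquoted object keys in double quotes so json.loads accepts them."""
    return re.sub(r'([{,])\s*(\w+):', r'\1"\2":', js_obj)


def extract_issues(html):
    """Return {key: parsed object} for every categoryIssueObj["key"] = {...}; assignment."""
    pairs = re.findall(r'categoryIssueObj\["(\w+)"\]\s*=\s*(.+?);', html, re.S)
    return {key: json.loads(quote_bare_keys(js_obj)) for key, js_obj in pairs}


# Hypothetical sample, not copied from the real page.
sample_html = '''
<script>
categoryIssueObj["movie1"] = {title: "A", rank: 1};
categoryIssueObj["movie2"] = {title: "B", rank: 2};
</script>
'''

issues = extract_issues(sample_html)
print(issues)
```

This approach is only a heuristic. The non-greedy `(.+?);` stops at the first semicolon, so an object containing `;` inside a string is cut short, and the key-quoting pattern can also rewrite text inside a string value that happens to look like `, word:`. Values written with single quotes will still fail in `json.loads`.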