@dangra
Created April 19, 2013 14:53
$ scrapy shell http://taobao.com
2013-04-19 11:53:08-0300 [scrapy] INFO: Scrapy 0.17.0 started (bot: scrapybot)
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Enabled item pipelines:
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2013-04-19 11:53:08-0300 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2013-04-19 11:53:08-0300 [default] INFO: Spider opened
2013-04-19 11:53:10-0300 [default] DEBUG: Redirecting (302) to <GET http://www.taobao.com/> from <GET http://taobao.com>
2013-04-19 11:53:11-0300 [default] DEBUG: Redirecting (302) to <GET http://www.taobao.com/index_global.php> from <GET http://www.taobao.com/>
2013-04-19 11:53:13-0300 [default] DEBUG: Crawled (200) <GET http://www.taobao.com/index_global.php> (referer: None)
[s] Available Scrapy objects:
[s] hxs <HtmlXPathSelector xpath=None data=u'<html><head><meta charset="gbk"><title>\u6dd8'>
[s] item {}
[s] request <GET http://taobao.com>
[s] response <200 http://www.taobao.com/index_global.php>
[s] settings <CrawlerSettings module=None>
[s] spider <BaseSpider 'default' at 0x1d110d0>
[s] Useful shortcuts:
[s] shelp() Shell help (print this help)
[s] fetch(req_or_url) Fetch request (or URL) and update local objects
[s] view(response) View response in a browser
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
Type "copyright", "credits" or "license" for more information.
IPython 0.13.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
>>> response.encoding
'gb18030'
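
Note that the page's meta tag declares charset="gbk", while response.encoding reports 'gb18030'. GB18030 is a superset of GBK, so bytes produced by a GBK encoder should decode to the same text under either codec. The following is a minimal sketch of that relationship using only Python's standard codecs; the sample string is a hypothetical stand-in, not content fetched from the page, and the sketch does not show how Scrapy itself picks the encoding.

```python
# -*- coding: utf-8 -*-
# Sketch: text encoded as GBK decodes identically under GB18030.
# `sample` is a hypothetical stand-in for page text, not real response data.
sample = u"\u6dd8\u5b9d\u7f51"

# Encode with the charset the page declares.
raw = sample.encode("gbk")

# Decode the same bytes with both codecs.
decoded_gbk = raw.decode("gbk")
decoded_gb18030 = raw.decode("gb18030")

# Both decodings should round-trip to the original text.
same = decoded_gbk == decoded_gb18030 == sample
print(same)
```

Decoding with the wider codec is the more forgiving choice when a page labelled gbk contains characters outside the GBK repertoire.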