0xbf00
Last active February 14, 2023 17:38
Workaround for Scrapy issue #355 (Scrapy failure due to overly long headers)

The issue

So you've stumbled upon this bug, or gotten log output similar to the following?

2018-09-11 17:57:04 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: mac_scraper)
2018-09-11 17:57:04 [scrapy.utils.log] INFO: Versions: lxml, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.0, w3lib 1.19.0, Twisted 18.7.0dev0, Python 3.7.0 (default, Jun 29 2018, 20:13:13) - [Clang 9.1.0 (clang-902.0.39.2)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0i  14 Aug 2018), cryptography 2.3.1, Platform Darwin-17.7.0-x86_64-i386-64bit
2018-09-11 17:57:04 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'mac_scraper', 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0, 'NEWSPIDER_MODULE': 'mac_scraper.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['mac_scraper.spiders']}
2018-09-11 17:57:04 [scrapy.middleware] INFO: Enabled extensions: