Created
January 11, 2019 19:36
Run several spiders at the same time.
import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider1(scrapy.Spider):
    # Your first spider definition
    # e.g., fetching the pages for a given target
    ...

class MySpider2(scrapy.Spider):
    # Your second spider definition
    # e.g., building the list of user links from the pages gathered by MySpider1
    ...

class MySpider3(scrapy.Spider):
    # Your third spider definition
    # e.g., fetching each user's information from MySpider2's list of users
    ...

process = CrawlerProcess()
process.crawl(MySpider1)
process.crawl(MySpider2)
process.crawl(MySpider3)
process.start()  # the script will block here until all crawling jobs are finished