Created
January 11, 2019 19:37
Run a Scrapy spider forever by modifying the start_requests() method.
# Execution starts in start_requests()
from scrapy import Request, Spider


class Foo(Spider):
    name = 'foo'
    allowed_domains = ['foo.com']

    def start_requests(self):
        # self.coll is assumed to be a pymongo collection set up elsewhere.
        while True:
            # Materialize the batch as a list: a raw cursor object is truthy
            # even when it matches nothing, so `if not data` would never break.
            data = list(self.coll.find({'status': 'unscraped'}).limit(5000))
            if not data:
                break
            for row in data:
                pin = row['pin']
                url = 'http://foo.com/Pages/PIN-Results.aspx?PIN={}'.format(pin)
                yield Request(url, meta={'pin': pin})
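
The loop above re-queries rows whose status is 'unscraped', so it only finishes if something changes that status between queries; the spider code shown does not do this itself. The sketch below illustrates the same batching pattern without Scrapy or MongoDB. It assumes an in-memory list in place of the collection, and the names `iter_pins`, `fetch_batch` and `mark_done` are hypothetical helpers, not part of the original spider.

```python
from typing import Callable, Dict, Iterator, List


def iter_pins(fetch_batch: Callable[[int], List[Dict]],
              mark_done: Callable[[Dict], None],
              batch_size: int = 5000) -> Iterator[str]:
    """Yield pins batch by batch until no unscraped rows remain."""
    while True:
        batch = fetch_batch(batch_size)
        if not batch:  # a plain list is falsy when empty
            break
        for row in batch:
            # Mark the row first so the next query does not return it again.
            mark_done(row)
            yield row['pin']


# In-memory stand-in for the database (hypothetical data).
rows = [{'pin': str(n), 'status': 'unscraped'} for n in range(7)]


def fetch_batch(limit: int) -> List[Dict]:
    # Equivalent in spirit to find({'status': 'unscraped'}).limit(limit).
    return [r for r in rows if r['status'] == 'unscraped'][:limit]


def mark_done(row: Dict) -> None:
    row['status'] = 'scraped'


pins = list(iter_pins(fetch_batch, mark_done, batch_size=3))
```

In a real spider the status update would more likely live in the parse callback or an item pipeline, after the response has actually been processed; marking rows at yield time is only a simplification for this sketch.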