Last active
September 18, 2018 16:28
Possible new spec for passing variables, such as login credentials, to a scraper. Essentially, the value parsed by one scraper is passed as an input variable to another scraper. New syntax includes limiting the number of parsed values ("max"), parsing headers ("expect": "header"), and an "asInput" clause.
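As a rough illustration of the two parsing options named above, here is a minimal Python sketch of how "expect": "header" combined with "max" might behave. The function name `parse_header`, the shape of the headers mapping, and the sample values are all assumptions made for this example; the spec itself does not define an implementation.

```python
from typing import Dict, List, Optional


def parse_header(headers: Dict[str, List[str]], selector: str,
                 max_values: Optional[int] = None) -> List[str]:
    """Return the values of the header named `selector`, capped at `max_values`.

    Hypothetical helper: with "expect": "header", the selector is assumed
    to be a header name rather than a CSS selector.
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    wanted = selector.lower()
    values = [
        value
        for name, header_values in headers.items()
        if name.lower() == wanted
        for value in header_values
    ]
    # "max": N keeps only the first N parsed values; without it, keep all.
    return values if max_values is None else values[:max_values]


# Made-up login response headers, for illustration only.
response_headers = {
    "Content-Type": ["text/html"],
    "SESSIONID": ["abc123", "stale456"],
}
print(parse_header(response_headers, "SESSIONID", max_values=1))  # ['abc123']
```

With "max": 1 the later scraper receives exactly one session value, which is presumably what a single header template such as "{token}" needs.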
{
  "input": ["username", "password"],
  "scrape": {
    "download": {
      "urlTemplate": "http://example-site.com/login",
      "headers": {
        "username": "{username}",
        "password": "{password}"
      }
    },
    "parse": {
      "expect": "header",
      "selector": "SESSIONID",
      "max": 1
    },
    "asInput": {
      "alias": "token",
      "inputFor": {
        "input": ["token", "searchQuery"],
        "scrape": {
          "download": {
            "urlTemplate": "https://example-site.com/search?{searchQuery}",
            "headers": {
              "SESSIONID": "{token}"
            }
          },
          "parse": {
            "selector": "li.item span"
          }
        }
      }
    }
  }
}
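The chaining idea in this spec can be sketched in Python as follows. This is only one possible reading, assuming that "parse" and "asInput" sit beside "download" inside "scrape", and that the first parsed value is bound to the alias. The helpers `fill_template`, `build_request`, and `run_chain`, the injected `fetch_header` callable, and the sample credentials are hypothetical names introduced for this example, not part of the spec.

```python
import re
from typing import Callable, Dict, List


def fill_template(template: str, variables: Dict[str, str]) -> str:
    """Replace every {name} placeholder with variables[name]."""
    return re.sub(r"\{(\w+)\}", lambda m: variables[m.group(1)], template)


def build_request(download: dict, variables: Dict[str, str]) -> dict:
    """Fill the URL template and every header template of a download step."""
    return {
        "url": fill_template(download["urlTemplate"], variables),
        "headers": {
            name: fill_template(value, variables)
            for name, value in download.get("headers", {}).items()
        },
    }


def run_chain(config: dict, variables: Dict[str, str],
              fetch_header: Callable[[dict, str], List[str]]) -> dict:
    """Run the first scraper, then return the request the second one would send."""
    step = config["scrape"]
    first_request = build_request(step["download"], variables)
    # fetch_header stands in for the real network call plus header parsing.
    parsed = fetch_header(first_request, step["parse"]["selector"])
    parsed = parsed[: step["parse"]["max"]]
    # "asInput" renames the parsed value and hands it to the nested scraper.
    alias = step["asInput"]["alias"]
    next_config = step["asInput"]["inputFor"]
    next_variables = {**variables, alias: parsed[0]}
    missing = [n for n in next_config["input"] if n not in next_variables]
    if missing:
        raise KeyError(f"missing input variables: {missing}")
    return build_request(next_config["scrape"]["download"], next_variables)


config = {
    "input": ["username", "password"],
    "scrape": {
        "download": {
            "urlTemplate": "http://example-site.com/login",
            "headers": {"username": "{username}", "password": "{password}"},
        },
        "parse": {"expect": "header", "selector": "SESSIONID", "max": 1},
        "asInput": {
            "alias": "token",
            "inputFor": {
                "input": ["token", "searchQuery"],
                "scrape": {
                    "download": {
                        "urlTemplate": "https://example-site.com/search?{searchQuery}",
                        "headers": {"SESSIONID": "{token}"},
                    },
                    "parse": {"selector": "li.item span"},
                },
            },
        },
    },
}

# Made-up inputs and a fake fetcher, so the sketch runs without a network.
user_input = {"username": "alice", "password": "hunter2", "searchQuery": "shoes"}
next_request = run_chain(config, user_input, lambda request, name: ["abc123"])
print(next_request)
```

Note that "searchQuery" is not produced by the login scraper, so in this reading it has to be supplied by the caller alongside "username" and "password"; the spec leaves that point open.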