Created
October 14, 2015 15:45
scraper options
./script/scraper --help
Usage: scraper [options] [sources ...]
Specific Options:
    -l, --limit N                    Limit operations to N listings.
    -t, --throttle N                 Random amount of time to throttle between url gets.
    -c, --command [COMMAND]          Which types of scraping command you want to run (all harvest collect scrape rescrape all_agents new_agents fix_agents quick_harvest_and_close quick_close validate)
    -s, --sourceid id                Scrapes a single listing using its source id.
    -o, --office office_key(s)       Scrapes office's listings using its office key (for MRIS and ListHub scrapers). Can be comma-delimited.
    -e, --env [ENVIRONMENT]          Which types of environment you want to run in (prod gamma beta dev)
    -p, --proxy_list [DATE]          Specify a file with a list of proxies to use.
    -a, --date [date]                Use cached data from this date. Must be in a format that Chronic.parse can understand (ex. '2007-10-21')
    -n, --no_validation              Don't validate
        --hourly                     run the hourly scraper
        --rpt [timeframe]            Only runs if realplus listings found updated in given timeframe, defaulting to '48 hours ago' - Chronic.parse-able format required
        --skip-images                Skip image downloading this run
        --populate                   Run SourceGroup.populate this time
        --touch                      Touch untouched listings anyway
    -y, --yaml                       dump changes to YAML
    -d, --debug                      Debug mode
    -v, --verbose                    Verbose mode
    -q, --quiet                      Turn off verbose mode
        --skip-writing               Skip the writing step (validate only)
    -x, --do_not_close_listings      Don't close any listings
    -h, --help                       Show this message
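
The layout of the help text above resembles what Ruby's standard `OptionParser` prints, so a few of these options could plausibly be declared as in the sketch below. This is only an illustration under that assumption, not the scraper's actual source: the method name `parse_scraper_args`, the constants, the keys of the `options` hash, and the shortened command list are all hypothetical.

```ruby
require 'optparse'

# Hypothetical subsets of the values shown in the help text above.
COMMANDS     = %w[all harvest collect scrape rescrape validate].freeze
ENVIRONMENTS = %w[prod gamma beta dev].freeze

# Parses a subset of the documented flags and returns [options, sources].
# The option names mirror the help output; everything else is assumed.
def parse_scraper_args(argv)
  options = { verbose: false, skip_images: false, offices: [] }

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: scraper [options] [sources ...]'

    # Integer coercion rejects non-numeric limits.
    opts.on('-l', '--limit N', Integer, 'Limit operations to N listings.') do |n|
      options[:limit] = n
    end

    # Passing an array restricts the argument to the listed values.
    opts.on('-c', '--command [COMMAND]', COMMANDS, 'Scraping command to run') do |c|
      options[:command] = c
    end
    opts.on('-e', '--env [ENVIRONMENT]', ENVIRONMENTS, 'Environment to run in') do |e|
      options[:env] = e
    end

    # The Array type splits a comma-delimited value into a list.
    opts.on('-o', '--office KEYS', Array, 'Comma-delimited office keys') do |keys|
      options[:offices] = keys
    end

    opts.on('--skip-images', 'Skip image downloading this run') do
      options[:skip_images] = true
    end
    opts.on('-v', '--verbose', 'Verbose mode') { options[:verbose] = true }
    opts.on('-q', '--quiet', 'Turn off verbose mode') { options[:verbose] = false }
  end

  # parse (without the bang) leaves argv untouched and returns the
  # arguments that were not consumed as options, i.e. the sources.
  sources = parser.parse(argv)
  [options, sources]
end
```

Under these assumptions, a call such as `parse_scraper_args(%w[-l 5 --env=beta -o A1,B2 -v mris])` would yield the limit, environment, office keys and verbose flag in `options`, with `mris` (a made-up source name) left over in `sources`.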