Created
April 16, 2024 03:38
xidel scraping
# get the current url (watch your quoting):
xidel https://example.com --extract '$url'
# output JSON lines:
xidel --printed-json-format=compact https://example.com --extract '{"title"://title, "url": $url}'
# for a list of items, with pagination: follow item, extract data, paginate:
xidel https://example.com [ --follow "//a[@class='item']" --extract "//*[@class='price']" ] --follow "//a[@class='next-page']"
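
The quoting caveat above can be checked without running xidel at all. The sketch below only uses `printf` to show what argument a POSIX shell would hand to a program for an unquoted versus a double-quoted XPath expression. The variable names are illustrative, and `set -f` is there only to turn off filename globbing so the unquoted case behaves the same on every machine.

```shell
#!/bin/sh
# Unquoted: the shell performs quote removal, so the inner single quotes
# around 'item' are gone before the program ever sees the argument.
# set -f disables globbing inside the subshell, since [ ... ] is also a glob pattern.
unquoted=$(set -f; printf '%s' //a[@class='item'])

# Double-quoted: the single quotes are ordinary characters and survive intact.
quoted=$(printf '%s' "//a[@class='item']")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

In the unquoted case the expression compares `@class` against a child element named `item` rather than the string `'item'`, which is why wrapping each XPath argument in double quotes is the safer habit.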