Last active
November 6, 2020 13:45
Dead simple JavaScript browser scraper for backing up YOUR Quora content. Feel free to customize it to your needs.
/**
 * DISCLAIMER:
 * The only purpose of this "scraper script" is to serve as a backup tool for YOUR own content on Quora,
 * that is, PUBLIC/PRIVATE content with YOU as the author.
 * Use this script in accordance with Quora's policies and Terms of Service: https://www.quora.com/about/tos_archive
 **/
/**
 * Dead simple JavaScript browser scraper for backing up YOUR Quora content.
 * Feel free to customize it to your needs.
 *
 * Script also available at https://gist.github.com/joseluisq/8a066c9910952c142c43c58078e2f811
 *
 * USAGE:
 * Run this script from the browser Devtools console (Chrome, Firefox, etc.)
 * on either of these URLs or their equivalents:
 * - https://es.quora.com/content?content_types=answers
 * - https://es.quora.com/content?content_types=questions_asked
 *
 * 0. Log in to Quora and then go to, e.g., https://es.quora.com/content?content_types=answers
 * 1. Open Devtools (Chrome/Firefox) and select the HTML element node: `<div class="PagedListItem UserContentListItem">`
 * 2. Execute the code below. Remember that `$0` maps to the currently selected HTML node.
 *
 * NOTE: This script backs up only list pages, such as answers or questions asked,
 * not the content of each answer. Backing that up is left to you.
 **/
// Collect every list item under the node selected in Devtools (`$0`).
const nodes = Array.from($0.querySelectorAll("div.pagedlist_item"))
const answered_questions = []
nodes.forEach(($node) => {
  const $link = $node.querySelector("a.question_link")
  // Skip items without a question link instead of throwing on `null`.
  if (!$link) return
  const $meta = $node.querySelector("div.metadata")
  const $title = $link.querySelector("span.ui_qtext_rendered_qtext")
  const datetime = $meta ? $meta.innerText : ""
  const url = $link.getAttribute("href")
  const title = ($title || $link).innerText
  answered_questions.push({ title, url, datetime })
})
// 3. Copy the output array to the system clipboard in JSON format.
// `copy()` is part of the Chrome Devtools console utilities API.
// More details at https://developers.google.com/web/tools/chrome-devtools/console/utilities
copy(
  JSON.stringify(answered_questions)
)
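
The collected `url` values come straight from each link's `href` attribute, so they may be relative paths. As a minimal sketch, assuming that is the case on your pages, the records could be post-processed before saving. `toAbsoluteRecords` and the sample record are hypothetical and not part of the original script; the base URL is taken from the usage URLs above.

```javascript
// Hypothetical helper (not part of the original script): resolves each
// record's `url` against a base URL so relative hrefs become absolute.
// URLs that are already absolute are left unchanged by `new URL()`.
function toAbsoluteRecords(records, baseUrl) {
  return records.map((record) => ({
    ...record,
    url: new URL(record.url, baseUrl).href,
  }))
}

// Made-up sample shaped like the entries pushed into `answered_questions`.
const sample = [
  { title: "Example question?", url: "/Example-question", datetime: "Answered Nov 6, 2020" },
]

const absolute = toAbsoluteRecords(sample, "https://es.quora.com")

// Pretty-printed JSON is easier to review by hand than the compact form.
const prettyJson = JSON.stringify(absolute, null, 2)
```

In the Devtools console, `copy(prettyJson)` would then place the normalized backup on the clipboard in the same way as step 3.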