@toraritte
Created July 22, 2020 17:20
"scraping" Sacramento News and Review articles
// List every link in the main content area of the main page (https://sacramento.newsreview.com/); run in the browser console
let mainContent = document.getElementById("main-content");
Array.from(mainContent.querySelectorAll('a')).forEach((e) => { console.log(e.href); });
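// Optional refinement of the listing above. A sketch only: uniqueSameOriginLinks
// is a hypothetical helper, not part of the original snippet. It assumes the page
// may repeat links or mix in off-site, mailto:, and #fragment variants, and keeps
// one copy of each same-origin URL.
function uniqueSameOriginLinks(hrefs, origin) {
  const seen = new Set();
  for (const href of hrefs) {
    let url;
    try {
      url = new URL(href);
    } catch {
      continue; // skip empty or malformed hrefs
    }
    if (url.origin !== origin) continue; // skip off-site and non-http(s) links
    url.hash = ""; // treat #fragment variants as the same link
    seen.add(url.href);
  }
  return [...seen]; // a Set preserves first-seen order
}
// Possible usage in the console (relies on mainContent from above):
// uniqueSameOriginLinks(Array.from(mainContent.querySelectorAll('a'), (a) => a.href), location.origin).forEach((h) => console.log(h));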
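// Print an individual article to PDF (run on the article's page)
let article = document.querySelector('#main-content article');
let root = document.documentElement;
// Remove the whole <body> and re-attach only the article, so nothing else gets printed
root.removeChild(document.body);
root.insertAdjacentElement('beforeend', article);
// Browsers typically suggest the document title as the PDF file name: strip leading/trailing slashes from the path, then hyphenate the rest
document.title = "sacramento-news-and-review-" + new URL(document.URL).pathname.replace(/^\/+|\/+$/g, "").replace(/\//g, '-');
window.print();
// In the print dialog, choose "Save as PDF" as the destination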
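// The title-building step above, pulled into a standalone function so it can be
// checked outside the browser. A sketch only: titleFromUrl and its prefix
// parameter are hypothetical names, not part of the original snippet. It accepts
// paths with or without a trailing slash.
function titleFromUrl(url, prefix = "sacramento-news-and-review") {
  const slug = new URL(url).pathname
    .split("/")
    .filter(Boolean) // drop empty segments left by leading, trailing, or doubled slashes
    .join("-");
  return slug ? `${prefix}-${slug}` : prefix;
}
// Possible usage before calling window.print():
// document.title = titleFromUrl(document.URL);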