@larryyangsen
Last active September 15, 2017 09:43
puppeteer ptt example
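const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  // Selectors for a PTT board index page. These five are not used yet, and
  // the '@href' suffix is not valid CSS, so strip it before querying.
  const prePageSelector = '.btn-group-paging a:nth-child(2)@href';
  const titleLinkSelector = '.title a@href';
  const authorSelector = '.meta .author';
  const dateSelector = '.meta .date';
  const pushContentSelector = '.nrec';

  // Selectors used below: one '.r-ent' row per post, with the title link inside.
  const listSelector = '.r-ent';
  const titleSelector = '.title a';

  const run = async (boardName = 'Gossiping') => {
    const endPoint = `https://www.ptt.cc/bbs/${boardName}/index.html`;
    const over18btnSelector = 'div.over18-button-container button';
    await page.goto(endPoint);

    // Age-restricted boards show a confirmation button first. Clicking it
    // loads the board, so wait for that navigation before reading the list.
    const over18btn = await page.$(over18btnSelector);
    if (over18btn) {
      await Promise.all([page.waitForNavigation(), over18btn.click()]);
    }

    // Collect the title and link of every post on the index page.
    const lists = await page.evaluate(
      (list, title) =>
        [...document.querySelectorAll(`${list} ${title}`)].map(t => ({
          title: t.innerText,
          href: t.href
        })),
      listSelector,
      titleSelector
    );

    // Open each post in its own tab, print its HTML, then close the tab.
    for (const list of lists) {
      const newPage = await browser.newPage();
      await newPage.goto(list.href);
      console.log(await newPage.content());
      await newPage.close();
    }
  };

  // Await the run so failures are not silently dropped, and always close the browser.
  try {
    await run('GameSale');
  } finally {
    await browser.close();
  }
})();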
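
The `titleLinkSelector` and `prePageSelector` values end in `@href`, which `document.querySelector` would reject. The suffix looks like the `selector@attribute` shorthand used by some scraping libraries, but that is an assumption about the author's intent. A minimal sketch of how such a spec could be split and used with Puppeteer follows; `parseSelector` and `readField` are hypothetical helpers, not part of the gist or of Puppeteer.

```javascript
// Hypothetical helper: split a 'css selector@attr' spec into its two parts.
// Assumes the last '@' separates the selector from the attribute name, so it
// will misread selectors whose attribute values themselves contain '@'.
function parseSelector(spec) {
  const at = spec.lastIndexOf('@');
  if (at === -1) {
    return { selector: spec, attr: null };
  }
  return { selector: spec.slice(0, at), attr: spec.slice(at + 1) };
}

// Hypothetical helper: read one field from a Puppeteer page.
// Returns the attribute value when the spec names one, else the element text.
// page.$eval throws if nothing matches the selector.
async function readField(page, spec) {
  const { selector, attr } = parseSelector(spec);
  return page.$eval(
    selector,
    (el, name) => (name ? el.getAttribute(name) : el.innerText),
    attr
  );
}
```

Inside `run`, after the index page has loaded, `await readField(page, prePageSelector)` would then return the `href` attribute of the second paging link, assuming that element is present on the page.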