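// getPage() on the headless-browser wrapper: start the browser on first use, then open a new page.
public async getPage() {
  if (!this.browser) {
    await this.initBrowser();
  }
  const page = await this.browser.newPage();
  // Set a desktop Chrome user agent to reduce the chance of bot detection.
  const userAgent =
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.39 Safari/537.36';
  await page.setUserAgent(userAgent);
  // The snippet is cut off here in the source; returning the page is assumed from how getPage() is used.
  return page;
}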
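// closeBrowser(): shut the browser down if it was started.
public async closeBrowser() {
  if (this.browser) {
    // Closing the browser is asynchronous; await it so shutdown completes before continuing.
    await this.browser.close();
  }
}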
// Scraper: holds the target URL and the headless-browser wrapper.
export class Scraper {
  protected name: string;
  protected baseURL: string;
  protected browser = new HeadlessBrowser();

  constructor() {
    this.name = 'Scraper';
    this.baseURL = 'https://github.com/trending';
  }
}
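// Inside the scraper: open a page and load the trending page.
const page = await this.browser.getPage();
await page.goto(this.baseURL, {
  timeout: 300000, // milliseconds (5 minutes)
  waitUntil: 'networkidle0', // resolve once the network has gone idle
});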
// Run the extraction in the page context; the callback's return value becomes `data`.
const data = await page.evaluate(() => {
  // Your scraping code here
});
// Inside page.evaluate: collect the repository names.
const allReposArticles = document.querySelectorAll(
  '.Box-row h1.lh-condensed a',
);
const allReposArray = Array.from(allReposArticles);
const allNamesRepos = allReposArray.map((item: HTMLElement | any) => {
  return { name: item.innerText };
});
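// Inside page.evaluate: collect the "stars today" count for each repository.
const regexMatchDigits = /\d+/g;
const allStarArticles = document.querySelectorAll(
  '.Box-row .d-inline-block.float-sm-right',
);
const allStarReposArray = Array.from(allStarArticles);
const allStarsRepos = allStarReposArray.map((item: HTMLElement | any) => {
  const starDigits = item.innerText.match(regexMatchDigits);
  // match() returns null when there are no digits and splits "1,234" into two groups,
  // so join the groups before converting and fall back to 0.
  return { stars: starDigits ? Number(starDigits.join('')) : 0 };
});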
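The digit extraction in the snippet above can be tried outside the browser as a small pure function. This is only a sketch: `parseStarsToday` is a hypothetical helper name, the label format ("1,234 stars today") is an assumption about the page text, and returning 0 for a label without digits is an illustrative choice rather than something the original code does.

```typescript
// Hypothetical helper (not part of the original article): parse a "stars today" label.
function parseStarsToday(label: string): number {
  // Take the first run of digits, allowing commas as thousands separators.
  const match = label.match(/\d[\d,]*/);
  if (!match) {
    // Assumption: a label with no digits counts as zero stars.
    return 0;
  }
  // Strip the separators before converting to a number.
  return Number(match[0].replace(/,/g, ''));
}

console.log(parseStarsToday('1,234 stars today'));
console.log(parseStarsToday('87 stars today'));
console.log(parseStarsToday('no data'));
```

Keeping the parsing in a function like this makes it possible to unit-test the edge cases without launching a headless browser.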
// Inside page.evaluate: pair each repository name with its star count by position.
const dataMerged = allNamesRepos.map((repo: any, index: number) => {
  const obj = {
    name: repo.name,
    starsToday: allStarsRepos[index].stars,
  };
  return obj;
});
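The merge above pairs the two lists by index, which throws if the star list is shorter than the name list. A more defensive variant might look like the sketch below. The interface names and `mergeByIndex` are hypothetical, and using `null` for a missing count (plus trimming the name) is an assumption made for illustration.

```typescript
// Hypothetical types mirroring the objects built in the snippets above.
interface RepoName { name: string }
interface RepoStars { stars: number }
interface TrendingRepo { name: string; starsToday: number | null }

// Pair names and star counts by position, tolerating a shorter star list.
function mergeByIndex(names: RepoName[], stars: RepoStars[]): TrendingRepo[] {
  return names.map((repo, index) => {
    const entry = stars[index];
    return {
      name: repo.name.trim(),
      // Assumption: a missing entry is reported as null instead of throwing.
      starsToday: entry ? entry.stars : null,
    };
  });
}

const merged = mergeByIndex(
  [{ name: ' owner / repo-a ' }, { name: 'owner / repo-b' }],
  [{ stars: 5 }],
);
console.log(merged);
```

Returning `null` keeps a mismatch visible in the output, so a changed page layout shows up as missing data rather than as a crash inside `page.evaluate`.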
// Entry point: run the scraper and print the result.
import { Scraper } from './scraper';

async function bootstrap() {
  const scraper = new Scraper();
  try {
    const data = await scraper.scrape();
    console.log(data);
  } catch (e) {
    console.error('error', e);
  }
}

// The snippet is cut off in the source; calling bootstrap() here is an assumption.
bootstrap();
# Terraform settings: remote state in S3 and the AWS provider.
terraform {
  backend "s3" {
    bucket = "your bucket to store state"
    key    = "terraform.state"
    region = "us-east-1"
  }
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # The snippet is cut off here in the source; any version constraint is not shown.
    }
  }
}