@vheidari
Last active May 9, 2023 16:41
Let's learn Puppeteer with some examples :)

Puppeteer

In this article I want to introduce Puppeteer, a tool that helps us do cool things like web scraping or automating tasks. Puppeteer lets a developer start and control a Chromium browser from the command line. By default this browser runs headless, that is, without a visible window, but it behaves like a real browser. The Puppeteer API lets a developer do almost anything a user could do in their browser. For example:

  • we can open a new page or a new tab
  • we can select any element from the DOM with its API
  • we can type into input elements, select them, and manipulate their values
  • we can select a button and click on it
  • we can create a PDF from the current page
  • we can create a screenshot of the current page
  • ...
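
As a rough illustration of the actions listed above, the sketch below chains several of them against a page object. The function name runBasicActions, the shape of its options argument, and the URL, selectors and file names in the usage comment are placeholders made up for this example; only the page methods themselves (goto, type, click, pdf, screenshot) come from the Puppeteer API.

```javascript
// A minimal sketch: runs a few common actions on an already opened page.
// `page` is expected to be a Puppeteer Page (for example from browser.newPage()).
// `options` is a hypothetical object invented for this example.
async function runBasicActions(page, options) {
    // open the page
    await page.goto(options.url);

    // type into an input element selected with a CSS selector
    await page.type(options.inputSelector, options.text);

    // select a button and click on it
    await page.click(options.buttonSelector);

    // create a PDF from the current page
    await page.pdf({ path: options.pdfPath });

    // create a screenshot from the current page
    await page.screenshot({ path: options.screenshotPath });
}

// Possible usage with a real browser (all values are placeholders):
//   const page = await browser.newPage();
//   await runBasicActions(page, {
//       url: 'https://example.com/',
//       inputSelector: '#search',
//       text: 'hello',
//       buttonSelector: '#go',
//       pdfPath: 'page.pdf',
//       screenshotPath: 'page.png',
//   });
```

Passing the page in as an argument keeps the sketch independent of how the browser was launched.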

Here is the official description of Puppeteer:

"Puppeteer" is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full (non-headless) Chrome/Chromium.
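
To make the headless and non-headless modes from the description above concrete, here is a small sketch that builds the options object for puppeteer.launch(). The helper name buildLaunchOptions and its showBrowser flag are invented for this example; headless and slowMo are existing launch options, and the value 50 is an arbitrary choice.

```javascript
// A minimal sketch: builds the options object passed to puppeteer.launch().
// `showBrowser` is a hypothetical flag invented for this example.
function buildLaunchOptions(showBrowser) {
    if (showBrowser) {
        // headless: false opens a visible browser window;
        // slowMo delays each Puppeteer operation by the given number of
        // milliseconds, which makes the automation easier to follow
        return { headless: false, slowMo: 50 };
    }
    // an empty options object keeps the default (headless) behaviour
    return {};
}

// Possible usage:
//   const browser = await puppeteer.launch(buildLaunchOptions(true));
```

A visible window is mostly useful while debugging a script; for scraping or automation jobs the default headless mode is usually enough.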

Install Puppeteer

To install this tool, follow the instructions below. First, create a package.json file with this command:

npm init

Then use the command below to install Puppeteer:

npm install puppeteer --save

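When npm has installed all dependencies, open package.json and add "type": "module" to it as a key/value pair. This makes Node.js treat app.js as an ES module, so the import statements used in the examples below work.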

{
  "name": "project-name",
  "version": "1.0.0",
  "description": "",
  "type": "module",
  "main": "app.js",
  "scripts": {
    "run": "node app.js"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "puppeteer": "^19.8.3"
  }
}

OK, with all of the above done, we are ready to implement our first example.

Puppeteer Example 01 -> Take a screenshot

In this example we will learn how to take a screenshot of a web page or website and save it to the hard disk.

// import the puppeteer package
import puppeteer from 'puppeteer';


( async () => {

    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // set the viewport size (width and height) before loading the page,
    // so the page is laid out at this size from the start
    await page.setViewport({width: 1980, height: 1080});

    // go to this address https://developer.mozilla.org/en-US/
    await page.goto('https://developer.mozilla.org/en-US/');

    // take a screenshot of the full page and save it in the current directory
    await page.screenshot({path: 'mozilla-dev-center.png', fullPage: true});

    // close the browser
    await browser.close();
} )();
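
A possible variation on the example above is to capture a single element instead of the whole page. The sketch below assumes an already opened page; the function name screenshotElement and the selector and file name in the usage comment are placeholders, while waitForSelector and the element handle's screenshot method are part of the Puppeteer API.

```javascript
// A minimal sketch: screenshots one element instead of the whole page.
// `page` is a Puppeteer Page; `selector` and `path` are chosen by the caller.
async function screenshotElement(page, selector, path) {
    // wait until the element exists in the DOM and get a handle to it
    const element = await page.waitForSelector(selector);

    // the element handle's screenshot() captures only that element
    await element.screenshot({ path });

    // return the path so the caller can log or reuse it
    return path;
}

// Possible usage (the selector is a placeholder):
//   await screenshotElement(page, 'header', 'page-header.png');
```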

Puppeteer Example 02 -> how to read the bitcoin price from CoinMarketCap (CMC)

In this example we read the bitcoin price from the CoinMarketCap website. We will learn how to select an element and how to extract data from it.

// import the puppeteer package
import puppeteer from 'puppeteer';

( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // set the viewport size (width and height) before loading the page
    await page.setViewport({width: 1980, height: 1080});

    // go to this address https://coinmarketcap.com/currencies/bitcoin/
    await page.goto('https://coinmarketcap.com/currencies/bitcoin/');

    // wait for the price element to appear and store its handle in bitcoinElement
    const bitcoinElement = await page.waitForSelector('.priceValue>span');

    // extract the price text from bitcoinElement with the evaluate method
    const bitcoinPrice  = await bitcoinElement.evaluate( el => el.textContent );

    // print the bitcoin price
    console.log("bitcoin price on the cmc : " + bitcoinPrice);

    // close the browser
    await browser.close();

} )();
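
The value read with textContent is a string. If the price is needed as a number, for example to compare it with a threshold, it has to be parsed first. The sketch below assumes the text looks like "$27,123.45" (a currency symbol, commas as thousands separators and a dot as the decimal separator); the helper name parsePrice is invented for this example.

```javascript
// A minimal sketch: converts a price string such as "$27,123.45" to a number.
// Assumes a dot as decimal separator and commas as thousands separators.
function parsePrice(text) {
    // remove everything except digits, the decimal point and a minus sign
    const cleaned = text.replace(/[^0-9.-]/g, '');

    const value = Number.parseFloat(cleaned);
    if (Number.isNaN(value)) {
        throw new Error('could not parse a price from: ' + text);
    }
    return value;
}

// Possible usage with the variable from the example above:
//   const price = parsePrice(bitcoinPrice);
```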

Puppeteer Example 03 -> how to select a form input and type inside it

Example 03 shows how we can interact with an HTML form and type text into it.

// import the puppeteer package
import puppeteer from 'puppeteer';


( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // set the viewport size (width and height) before loading the page
    await page.setViewport({width: 1980, height: 1080});

    // go to this address https://github.com/
    await page.goto('https://github.com/');

    // select the search input with waitForSelector through input[name="q"]
    const searchBox = await page.waitForSelector('input[name="q"]');

    // type "puppeteer" inside the input element with the type method
    await searchBox.type('puppeteer');

    // take a screenshot of the page to check that the text was typed
    await page.screenshot({path: 'github-searchbox.png', fullPage: true});

    // close the browser
    await browser.close();

} )();
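
A natural next step after typing is to submit the form. The sketch below assumes that pressing Enter in the input triggers a full page navigation, which depends on the site and may not hold for every search box. The function name submitSearch is invented for this example; type, keyboard.press, waitForNavigation and url are existing page methods.

```javascript
// A minimal sketch: types a query, presses Enter and waits for the next page.
// Assumes that pressing Enter causes a navigation on the target site.
async function submitSearch(page, selector, query) {
    // type the query into the input element
    await page.type(selector, query);

    // start waiting for the navigation before pressing Enter,
    // so the navigation is not missed
    await Promise.all([
        page.waitForNavigation(),
        page.keyboard.press('Enter'),
    ]);

    // return the address of the page we ended up on
    return page.url();
}

// Possible usage:
//   const resultUrl = await submitSearch(page, 'input[name="q"]', 'puppeteer');
```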

Puppeteer Example 04 -> how to download an image with Puppeteer

In this example we will attempt to extract the first product image from the amazon.com website and save it to the hard disk.

// import the puppeteer package and the promise-based fs API
import puppeteer from 'puppeteer';
import * as fs  from 'node:fs/promises';


( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // set the viewport size (width and height) before loading the page
    await page.setViewport({width: 1980, height: 1080});

    // go to the product page
    await page.goto('https://www.amazon.com/Desktop-Processor-12-Thread-Unlocked-Motherboard/dp/B0972FHS7J');

    // wait 10 seconds so the page can finish loading
    // (a plain timer is used because page.waitForTimeout() is not available in newer Puppeteer versions)
    await new Promise(resolve => setTimeout(resolve, 10000));

    // take a screenshot of the page to check that everything loaded
    await page.screenshot({path:'amazon.png', fullPage:true});

    // select the image through its id
    const getLandingImage = await page.waitForSelector('#landingImage');

    // extract the url inside the browser with the evaluate method and pass it to landingImageUrl (Node.js environment)
    const landingImageUrl = await getLandingImage.evaluate( x => x.src);

    // go to the image url; goto returns the HTTP response for that url
    const imagePage = await page.goto(landingImageUrl);

    // write the image to the hard disk with the fs API and the response's buffer method
    await fs.writeFile(landingImageUrl.split("/").pop(), await imagePage.buffer());

    // log the image url to the terminal
    console.log(landingImageUrl);

    // close the browser
    await browser.close();

} )();
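
The example above takes the file name from the last segment of the image URL. If the URL carries a query string or ends with a slash, that segment may not be a usable file name. The sketch below shows one way to handle this with the global URL class; the helper name fileNameFromUrl and the fallback name image.jpg are assumptions made for this example.

```javascript
// A minimal sketch: derives a file name from an image URL.
// `fallback` is a hypothetical default used when the URL has no file name.
function fileNameFromUrl(imageUrl, fallback = 'image.jpg') {
    // URL is a global in Node.js; pathname excludes the query string and hash
    const { pathname } = new URL(imageUrl);

    // take the last path segment, which is empty when the path ends with "/"
    const name = pathname.split('/').pop();

    return name || fallback;
}

// Possible usage with the variables from the example above:
//   await fs.writeFile(fileNameFromUrl(landingImageUrl), await imagePage.buffer());
```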


Good Examples and Resources

https://github.com/puppeteer/puppeteer/tree/main/examples
