Puppeteer

In this article I want to introduce Puppeteer, a tool for tasks such as web scraping and browser automation. Puppeteer lets a developer launch and control a Chromium browser from a Node.js script. By default this browser runs headless (without a visible window), but otherwise it behaves like a real-world browser. The Puppeteer API lets a developer do almost anything a user could do in their browser. For example:

we can open a new page (a new tab)
we can select any element in the DOM through its API
we can type into input elements and change their values
we can select a button and click it
we can create a PDF of the current page
we can take a screenshot of the current page
...
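
The PDF item in the list above has no example later in this article, so here is a minimal sketch of it. The helper name `savePageAsPdf` and the output path are my own assumptions; `page.goto` and `page.pdf` are the Puppeteer calls it relies on. As far as I know, PDF generation has traditionally required headless mode, which is the default.

```javascript
// Hypothetical helper: load a URL in an existing Puppeteer page and save it as a PDF.
// `page` is expected to be the object returned by browser.newPage().
async function savePageAsPdf(page, url, outPath) {
  // wait until network activity has mostly settled before printing
  await page.goto(url, { waitUntil: 'networkidle2' });

  // write an A4 PDF to outPath, including background colors and images
  await page.pdf({ path: outPath, format: 'A4', printBackground: true });

  return outPath;
}

// Usage sketch (inside an async function, after puppeteer.launch() and browser.newPage()):
//   await savePageAsPdf(page, 'https://developer.mozilla.org/en-US/', 'mdn.pdf');
```

The helper takes the page as a parameter, so it can be reused with any page you have already opened.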

Here is the official description of Puppeteer:

"Puppeteer is a Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol. Puppeteer runs in headless mode by default, but can be configured to run in full (non-headless) Chrome/Chromium."

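To make the headless versus full-browser choice concrete, here is a small sketch that builds the options object passed to `puppeteer.launch()`. The helper `buildLaunchOptions` and its parameter names are hypothetical; `headless` and `slowMo` are the launch options it sets.

```javascript
// Hypothetical helper: build launch options for puppeteer.launch().
// showBrowser = true opens a visible browser window, which helps while debugging.
// slowMoMs delays every Puppeteer operation by that many milliseconds.
function buildLaunchOptions({ showBrowser = false, slowMoMs = 0 } = {}) {
  return {
    headless: !showBrowser,
    slowMo: slowMoMs,
  };
}

// Usage sketch:
//   const browser = await puppeteer.launch(buildLaunchOptions({ showBrowser: true, slowMoMs: 50 }));
console.log(buildLaunchOptions({ showBrowser: true, slowMoMs: 50 }));
```

With no arguments the helper keeps the default headless behavior described in the quote above.
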
Install `Puppeteer`

To install Puppeteer, follow the steps below. First, create a package.json file with this command:

npm init

Then install Puppeteer with this command:

npm install puppeteer --save

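When npm has finished installing the dependencies, open package.json and add the key/value pair "type": "module". This tells Node.js to treat the project's .js files as ES modules, so the import statements in the examples below work.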

{
  "name": "project-name",
  "version": "1.0.0",
  "description": "",
  "type": "module",
  "main": "app.js",
  "scripts": {
    "run": "node app.js"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "puppeteer": "^19.8.3"
  }
}

Once these steps are done, we are ready to write our first example.

`Puppeteer` Example 01 -> Take a screenshot

In this example we will learn how to take a screenshot of a web page and save it to disk.

// import puppeteer package
import puppeteer from 'puppeteer';


( async() => {

    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // go to this address https://developer.mozilla.org/en-US/
    await page.goto('https://developer.mozilla.org/en-US/');

    // set viewport size, width and height
    await page.setViewport({width: 1980, height: 1080});

    // take a screenshot of the full page and save it as a png file
    await page.screenshot({path: 'mozilla-dev-center.png', fullPage: true});

    // close browser
    await browser.close();
} )();
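
One thing the example above does not handle is an error between launch and close: if `page.goto` throws, `browser.close()` is never reached and the browser process may keep running. A possible way around this is a small wrapper with try/finally. The name `withBrowser` and its two parameters are my own assumptions, not part of the Puppeteer API.

```javascript
// Hypothetical helper: run a task with a browser and always close the browser afterwards.
// `launch` is a function that returns a browser, for example () => puppeteer.launch().
// `task` receives the browser and may return a value.
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // runs whether the task succeeded or threw
    await browser.close();
  }
}

// Usage sketch:
//   await withBrowser(() => puppeteer.launch(), async (browser) => {
//     const page = await browser.newPage();
//     await page.goto('https://developer.mozilla.org/en-US/');
//     await page.screenshot({ path: 'mozilla-dev-center.png', fullPage: true });
//   });
```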

`Puppeteer` Example 02 -> How to read the bitcoin price from CMC

In this example we read the bitcoin price from the CoinMarketCap (CMC) website. We will learn how to select an element and how to extract data from it.

// import puppeteer package
import puppeteer from 'puppeteer';

( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // go to this address https://coinmarketcap.com/currencies/bitcoin/
    await page.goto('https://coinmarketcap.com/currencies/bitcoin/');

    // set viewport size, width and height
    await page.setViewport({width: 1980, height: 1080});

    // wait for the price element and store it in bitcoinElement
    // (this selector depends on the site's markup and may need updating)
    const bitcoinElement = await page.waitForSelector('.priceValue>span');

    // extract the price text from bitcoinElement with the evaluate method
    const bitcoinPrice  = await bitcoinElement.evaluate( el => el.textContent );

    // print bitcoin price
    console.log("bitcoin price on the cmc : " + bitcoinPrice);

    // close browser
    await browser.close();

} )();
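
The value read with `textContent` is a string. If you want to compare or calculate with it, you have to convert it to a number first. Here is a minimal sketch, assuming the text looks like "$27,123.45" (currency symbol, comma as thousands separator, dot as decimal separator); the function name `parsePrice` and the sample value are made up for illustration.

```javascript
// Hypothetical helper: turn a price string such as "$27,123.45" into a number.
// Assumes US-style formatting: commas separate thousands, a dot marks decimals.
function parsePrice(text) {
  // drop everything except digits, the decimal dot and a minus sign
  const cleaned = text.replace(/[^0-9.-]/g, '');
  const value = Number.parseFloat(cleaned);
  if (Number.isNaN(value)) {
    throw new Error(`Could not parse a price from "${text}"`);
  }
  return value;
}

// Usage sketch with a made-up value:
console.log(parsePrice('$27,123.45'));
```

Throwing on unparsable text makes it obvious when the selector matched the wrong element.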

`Puppeteer` Example 03 -> How to select a form input and type into it

Example 03 shows how to interact with an HTML form and type text into one of its inputs.

// import puppeteer package
import puppeteer from 'puppeteer';


( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // go to this address https://github.com/
    await page.goto('https://github.com/');

    // set viewport size, width and height
    await page.setViewport({width: 1980, height: 1080});

    // wait for the search input and select it through input[name="q"]
    const searchBox = await page.waitForSelector('input[name="q"]');

    // type "puppeteer" into the input element with the type method
    await searchBox.type('puppeteer');

    // take a screenshot of the page to confirm that everything is ok
    await page.screenshot({path: 'github-searchbox.png', fullPage: true});

    // close browser
    await browser.close();

} )();
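
The example above stops after typing. A natural next step is to submit the form and wait for the result page. The sketch below is one way to do that; the helper name `typeAndSubmit` is hypothetical, and it assumes that pressing Enter in the input triggers a page navigation, which may not hold for every site.

```javascript
// Hypothetical helper: type text into an input, press Enter and wait for the navigation.
// `page` is expected to be the object returned by browser.newPage().
async function typeAndSubmit(page, selector, text) {
  // wait until the input exists, then type into it
  const input = await page.waitForSelector(selector);
  await input.type(text);

  // start waiting for the navigation before pressing Enter,
  // so a fast navigation is not missed
  await Promise.all([
    page.waitForNavigation(),
    input.press('Enter'),
  ]);

  // return the address of the page we ended up on
  return page.url();
}

// Usage sketch (inside an async function):
//   const resultUrl = await typeAndSubmit(page, 'input[name="q"]', 'puppeteer');
//   console.log(resultUrl);
```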

`Puppeteer` Example 04 -> How to download an image with Puppeteer

In this example we extract the first product image from an amazon.com product page and save it to disk.

// import puppeteer and the promise-based fs api
import puppeteer from 'puppeteer';
import * as fs  from 'node:fs/promises';


( async () => {
    // launch a browser
    const browser = await puppeteer.launch();

    // create a new page
    const page = await browser.newPage();

    // go to the product page
    await page.goto('https://www.amazon.com/Desktop-Processor-12-Thread-Unlocked-Motherboard/dp/B0972FHS7J');

    // set viewport size, width and height
    await page.setViewport({width: 1980, height: 1080});

    // wait 10 seconds so the page can finish loading
    // (waitForTimeout exists in the Puppeteer 19.x installed above; newer major versions removed it)
    await page.waitForTimeout(10000);

    // take a screenshot of the product page
    await page.screenshot({path:'amazon.png', fullPage:true});

    // select the image through its id
    const landingImage = await page.waitForSelector('#landingImage');

    // read the image url inside the browser with the evaluate method and return it to landingImageUrl (Node.js environment)
    const landingImageUrl = await landingImage.evaluate( x => x.src);

    // go to the image url; goto resolves with the HTTP response
    const imageResponse = await page.goto(landingImageUrl);

    // write the image to disk with the fs api and the response's buffer method
    await fs.writeFile(landingImageUrl.split("/").pop(), await imageResponse.buffer());

    // log image url to terminal
    console.log(landingImageUrl);

    // take a screenshot of the image page to confirm that everything is ok
    await page.screenshot({path: 'amazon-image.png', fullPage: true});

    // close browser
    await browser.close();

} )();
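
The example above builds the file name with `landingImageUrl.split("/").pop()`. That works for plain URLs, but a query string such as `?size=large` would end up in the file name. Here is a slightly more careful sketch that uses the built-in `URL` class. The function name `fileNameFromUrl`, the fallback name and the sample URLs are my own assumptions.

```javascript
// Hypothetical helper: derive a file name from the last path segment of a URL.
// Query string and fragment are ignored; an empty segment falls back to a default name.
function fileNameFromUrl(url, fallback = 'download.bin') {
  // URL is a global in Node.js; pathname excludes "?query" and "#fragment"
  const { pathname } = new URL(url);
  const name = pathname.split('/').pop();
  return name.length > 0 ? name : fallback;
}

// Usage sketch with made-up URLs:
console.log(fileNameFromUrl('https://example.com/images/I/71abc.jpg?size=large'));
console.log(fileNameFromUrl('https://example.com/'));
```

In the example it could replace the `split("/").pop()` expression passed to `fs.writeFile`.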

Good Examples and Resources

https://github.com/puppeteer/puppeteer/tree/main/examples

vheidari/Puppeteer.md
