🎮

Puppeteer

Puppeteer is a tool that allows you to automate interactions with web pages. It lets you control a headless Chrome browser (which means you won't see it on your screen) and do things like fill out forms, click buttons, and navigate to different pages.
Now, let's get started with an example. Let's say we want to use Puppeteer to go to the Google website and search for "puppies". Here's what our code might look like:
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto('https://www.google.com');
      await page.type('input[name="q"]', 'puppies');
      await page.click('input[type="submit"]');
      // Wait for search results to load
      await page.waitForSelector('#search');
      console.log('Search results loaded!');
      await browser.close();
    })();
What each line of this code does:
const puppeteer = require('puppeteer');
This line imports the Puppeteer library into our Node.js script.
const browser = await puppeteer.launch();
This line launches a new instance of the headless Chrome browser.
const page = await browser.newPage();
This line creates a new page in the browser instance.
await page.goto('https://www.google.com');
This line navigates to the Google website.
await page.type('input[name="q"]', 'puppies');
This line finds the search bar on the page (identified by the input name "q") and types the word "puppies" into it.
await page.click('input[type="submit"]');
This line finds the submit button on the page (identified by the input type "submit") and clicks it.
await page.waitForSelector('#search');
This line waits for the search results to load (identified by the CSS selector "#search").
console.log('Search results loaded!');
This line logs a message to the console indicating that the search results have loaded.
await browser.close();
Finally, this line closes the browser instance.
That's it! With these few lines of code, we were able to automate the process of searching for "puppies" on Google using Puppeteer. I hope this helps you understand how Puppeteer works in Node.js!
 

Puppeteer:

Puppeteer is a Node.js library developed by the Google Chrome team. It provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. Puppeteer allows you to automate the testing and scraping of web pages, as well as perform other tasks such as generating screenshots and PDFs of web pages.
In simple terms, Puppeteer is a tool that allows you to programmatically control a web browser (Chrome or Chromium) to interact with web pages and perform various actions, like clicking buttons, filling out forms, and navigating to different pages. With Puppeteer, you can write scripts in Node.js that automate repetitive tasks on the web, which can save you a lot of time and effort.
Puppeteer is built on top of the Chrome DevTools Protocol, which is a set of APIs for interacting with Chrome and Chromium. Puppeteer provides a simpler, more high-level API that abstracts away many of the complexities of the DevTools Protocol and makes it easier to write automation scripts.
Overall, Puppeteer is a powerful tool for web automation and testing, and is widely used in the web development and testing communities.
 

Here are some of its key features:

  1. Automating user interactions: With Puppeteer, you can simulate user interactions with a web page, such as clicking buttons, filling out forms, and navigating to different pages.
  1. Generating screenshots and PDFs: Puppeteer allows you to generate screenshots and PDFs of web pages, which can be useful for testing and debugging.
  1. Web scraping: Puppeteer makes it easy to scrape data from web pages, allowing you to extract information like prices, product details, and more.
  1. Performance testing: Puppeteer provides tools for measuring the performance of web pages, including metrics like page load time and resource usage.
  1. Mobile emulation: Puppeteer can simulate mobile devices, allowing you to test how your web pages look and perform on different devices and screen sizes.
  1. Headless mode: Puppeteer can run in headless mode, which means it runs without a visible user interface, making it faster and more efficient.
  1. Easy setup: Puppeteer can be installed with npm and is easy to set up, making it accessible to developers of all skill levels.
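
The screenshot and PDF features listed above can be sketched as one small helper. This is a minimal sketch, assuming `page` is a Puppeteer `Page` obtained from `browser.newPage()`; the helper name `captureArtifacts` and the output file names are hypothetical and not part of Puppeteer.

```javascript
// Hypothetical helper: saves a full-page screenshot and a PDF of one URL.
// `page` is assumed to be a Puppeteer Page (e.g. from browser.newPage()).
async function captureArtifacts(page, url, baseName = 'capture') {
  // Wait for the load event so images and stylesheets are in place.
  await page.goto(url, { waitUntil: 'load' });

  const screenshotPath = `${baseName}.png`;
  const pdfPath = `${baseName}.pdf`;

  // fullPage captures the whole scrollable page, not just the viewport.
  await page.screenshot({ path: screenshotPath, fullPage: true });

  // format sets the paper size of the generated PDF.
  await page.pdf({ path: pdfPath, format: 'A4' });

  return { screenshotPath, pdfPath };
}
```

A call might look like `await captureArtifacts(page, 'https://www.example.com', 'home')`, which would write `home.png` and `home.pdf` to the working directory.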
 

Here are some of the key classes in Puppeteer:

  1. Browser: The Browser class represents a browser instance, which can be used to create new pages and perform other browser-level tasks.
  1. Page: The Page class represents a web page, and provides methods for interacting with the page, such as navigating to a URL, clicking elements, and filling out forms.
  1. ElementHandle: The ElementHandle class represents a DOM element on a web page, and provides methods for interacting with the element, such as clicking it, typing into it, and getting its properties.
  1. Frame: The Frame class represents a frame or iframe on a web page, and provides methods for interacting with the frame, such as navigating it and evaluating JavaScript in it.
  1. Request: The Request class represents a network request made by a web page, and provides information about the request, such as its URL, headers, and response.
  1. Response: The Response class represents a network response received by a web page, and provides information about the response, such as its status code, headers, and content.
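
To show how the Page, Request, and Response classes relate, here is a minimal sketch that records every response a page receives. It assumes `page` is a Puppeteer `Page`; the helper name `collectResponses` is hypothetical.

```javascript
// Hypothetical helper: records the method, status, and URL of every
// response the page receives. `page` is assumed to be a Puppeteer Page.
function collectResponses(page) {
  const seen = [];
  page.on('response', (response) => {
    // response.request() links a Response back to the Request that caused it.
    seen.push({
      method: response.request().method(),
      status: response.status(),
      url: response.url(),
    });
  });
  return seen; // the array keeps filling as more responses arrive
}
```

Calling `collectResponses(page)` before `page.goto(...)` means the returned array also includes the main document response.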
 

Here are some of the key functionalities of Puppeteer:

  1. Web page automation: With Puppeteer, you can automate interactions with web pages, such as clicking buttons, filling out forms, and navigating to different pages. This allows you to test and debug web pages, and automate repetitive tasks.
  1. Web scraping: Puppeteer makes it easy to scrape data from web pages, allowing you to extract information like prices, product details, and more. This can be useful for a variety of applications, such as data mining and price comparison.
  1. Performance testing: Puppeteer provides tools for measuring the performance of web pages, including metrics like page load time and resource usage. This allows you to optimize the performance of your web pages and ensure that they are fast and responsive.
  1. PDF and screenshot generation: Puppeteer allows you to generate PDFs and screenshots of web pages, which can be useful for testing and debugging, as well as for generating reports and documentation.
  1. Mobile emulation: Puppeteer can simulate mobile devices, allowing you to test how your web pages look and perform on different devices and screen sizes. This can be useful for ensuring that your web pages are responsive and mobile-friendly.
  1. Headless mode: Puppeteer can run in headless mode, which means it runs without a visible user interface. This makes it faster and more efficient, and allows you to automate tasks without being distracted by a visual interface.
 
  1. Mouse interactions: Puppeteer allows you to simulate mouse interactions with a web page using the mouse object. You can move the mouse to a specific point on the page using the move method, click an element using the click method, and perform other mouse actions using other methods like down, up, and wheel.
    1. // Example: Click on a button using the mouse
       await page.waitForSelector('#my-button');
       const button = await page.$('#my-button');
       await button.click();
  1. Keyboard interactions: Puppeteer also allows you to simulate keyboard interactions with a web page using the keyboard object. You can type text into an element using the type method, press a key using the press method, hold and release keys using the down and up methods, and more.
    1. // Example: Type "hello world" into an input field using the keyboard
       await page.waitForSelector('#my-input');
       const input = await page.$('#my-input');
       await input.type('hello world');
  1. File chooser: Puppeteer provides a way to simulate the selection of a file using the FileChooser class. You obtain a FileChooser with page.waitForFileChooser() and pass the file paths to its accept method, which completes the selection.
    1. // Example: Upload a file using a file chooser
       await page.waitForSelector('#my-file-input');
       const input = await page.$('#my-file-input');
       // Start waiting for the chooser before the click that opens it
       const [fileChooser] = await Promise.all([
         page.waitForFileChooser(),
         input.click(),
       ]);
       await fileChooser.accept(['/path/to/my/file.pdf']);
  1. Browser context: Puppeteer allows you to create separate browser contexts using the BrowserContext class. A browser context is like a separate instance of the browser that has its own cookies, cache, and other state. This can be useful for testing scenarios where you need to isolate the state of the browser.
    1. // Example: Create a new browser context and navigate to a page
       // (Puppeteer v22 renamed this method to browser.createBrowserContext())
       const context = await browser.createIncognitoBrowserContext();
       const page = await context.newPage();
       await page.goto('https://www.example.com');
  1. Page navigation: Puppeteer provides a variety of methods for navigating between pages and controlling the browser history. You can navigate to a new page using the goto method, go back or forward in the browser history using the goBack and goForward methods, and reload the current page using the reload method.
    1. // Example: Navigate to a new page and go back in the browser history
       await page.goto('https://www.example.com');
       await page.goBack();
  1. Element handling: Puppeteer provides a variety of methods for interacting with elements on a page, including selecting elements by CSS selector, XPath, or other criteria, getting the text or value of an element, and more.
    1. // Example: Get the text of a paragraph element
       await page.waitForSelector('p');
       const paragraph = await page.$('p');
       const text = await page.evaluate(element => element.textContent, paragraph);
       console.log(text);
  1. Network interception: Puppeteer allows you to intercept and modify network requests made by a page by enabling interception with the setRequestInterception method and listening for the request event. You can use this to mock responses, block certain requests, or modify the request headers.
    1. // Example: Intercept a network request and modify the response
       await page.setRequestInterception(true);
       page.on('request', request => {
         if (request.url().endsWith('.png')) {
           // Answer .png requests with a mocked response
           request.respond({
             contentType: 'image/png',
             body: Buffer.from('fake-image-data'),
           });
         } else {
           // Let every other request through unchanged
           request.continue();
         }
       });
  1. Page events: Puppeteer provides a variety of events that you can listen for on a page, such as the load event, the dialog event (which is triggered when a JavaScript alert or confirmation dialog appears), and the console event (which is triggered when a page logs a message to the console).
    1. // Example: Log console messages to the console
       page.on('console', message => console.log(message.text()));
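
The mouse item above mentions the move, down, and up methods, while its example only clicks an element handle. As a minimal sketch of those lower-level calls, the helper below drags from the center of one element to the center of another. It assumes `page` is a Puppeteer `Page`; the helper name `dragBetween` and its selectors are hypothetical.

```javascript
// Hypothetical helper: drags from the center of one element to another
// using the low-level mouse API. `page` is assumed to be a Puppeteer Page.
async function dragBetween(page, fromSelector, toSelector) {
  const from = await page.$(fromSelector);
  const to = await page.$(toSelector);
  if (!from || !to) throw new Error('Source or target element not found');

  // boundingBox() resolves to null when the element is not visible.
  const a = await from.boundingBox();
  const b = await to.boundingBox();
  if (!a || !b) throw new Error('Source or target element is not visible');

  const center = (box) => ({ x: box.x + box.width / 2, y: box.y + box.height / 2 });
  const start = center(a);
  const end = center(b);

  await page.mouse.move(start.x, start.y);
  await page.mouse.down();
  // steps makes the pointer pass through intermediate positions.
  await page.mouse.move(end.x, end.y, { steps: 10 });
  await page.mouse.up();
  return { start, end };
}
```

Whether a page reacts to this as a drag depends on how it implements drag and drop, so treat the helper as a starting point rather than a general solution.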
       
 
Puppeteer provides several methods for working with URLs, which allow you to navigate to pages, manipulate URLs, and retrieve information about them. Here are some examples:
  1. Navigation: You can navigate to a new page using the goto method, which takes a URL as its argument. You can also retrieve the current URL of a page using the url method.
    1. // Navigate to a new page
       await page.goto('https://www.example.com');
       // Get the current URL (page.url() is synchronous and returns a string)
       const currentUrl = page.url();
       console.log(currentUrl);
  1. Manipulating URLs: The standard URL class, which is available as a global in Node.js (it is not part of Puppeteer), allows you to manipulate URLs by adding or removing query parameters, fragments, and more. You can create a new URL instance by passing a URL string to its constructor.
    1. // Create a new URL object
       const url = new URL('https://www.example.com');
       // Add a query parameter
       url.searchParams.set('key', 'value');
       // Remove a fragment
       url.hash = '';
       // Get the updated URL string
       const updatedUrl = url.toString();
       console.log(updatedUrl);
  1. Retrieving information about URLs: The same URL class parses a URL string and exposes its components, such as the protocol, hostname, and port, as properties.
    1. // Parse a URL string
       const url = new URL('https://www.example.com/path/to/page?query=parameter');
       // Get the protocol
       console.log(url.protocol); // "https:"
       // Get the hostname
       console.log(url.hostname); // "www.example.com"
       // Get the port (returns an empty string if the port is not specified)
       console.log(url.port); // ""
  1. Extracting URLs from a page: You can use Puppeteer to extract URLs from a page, for example by finding all the links on a page and retrieving their href attributes.
    1. // Get all links on the page and extract their URLs
       const links = await page.$$eval('a', elements => elements.map(element => element.href));
       console.log(links);
  1. Checking the URL of a page: You can use Puppeteer to check whether the URL of a page matches a certain pattern, for example to make sure that a redirect has taken you to the expected page.
    1. // Navigate to a page and check its URL
       const expectedUrl = 'https://www.example.com/expected-page';
       await page.goto('https://www.example.com/redirect');
       const currentUrl = page.url();
       if (currentUrl === expectedUrl) {
         console.log('Redirect succeeded!');
       } else {
         console.log('Redirect failed: expected URL was', expectedUrl, 'but actual URL was', currentUrl);
       }
  1. Handling URL fragments: You can retrieve and manipulate the fragment (the part of a URL after the # symbol) by passing the page's URL to the standard URL class and using its hash property.
    1. // Navigate to a page and retrieve the fragment
       await page.goto('https://www.example.com/page#fragment');
       const url = new URL(page.url());
       const fragment = url.hash;
       console.log(fragment);
       // Modify the fragment and navigate to the updated URL
       url.hash = 'new-fragment';
       await page.goto(url.toString());
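
The link extraction and URL handling steps above are often combined when crawling a site. The following is a minimal sketch that takes an array of links (such as the result of the `page.$$eval('a', ...)` call shown earlier) and keeps only those on the same origin as the current page. The helper name `sameOriginLinks` is hypothetical; it uses only the standard `URL` class and no Puppeteer API.

```javascript
// Hypothetical helper: keeps only links on the same origin as baseUrl,
// strips fragments, and removes duplicates.
function sameOriginLinks(links, baseUrl) {
  const origin = new URL(baseUrl).origin;
  const result = new Set();
  for (const link of links) {
    let url;
    try {
      url = new URL(link, baseUrl); // also resolves relative links
    } catch {
      continue; // skip strings that cannot be parsed as URLs
    }
    if (url.origin !== origin) continue; // other sites, mailto:, etc.
    url.hash = ''; // "#section" variants count as the same page
    result.add(url.toString());
  }
  return [...result];
}
```

For example, `sameOriginLinks(links, page.url())` would reduce the extracted links to a de-duplicated list of pages on the site being visited.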
       
 

List of some common methods provided by Puppeteer:

  1. browser.newPage(): Creates a new Page object in the current browser context.
  1. page.goto(url[, options]): Navigates to the specified URL.
  1. page.click(selector[, options]): Clicks the element specified by the given selector.
  1. page.type(selector, text[, options]): Types the given text into the element specified by the given selector.
  1. page.waitForSelector(selector[, options]): Waits for the element specified by the given selector to be added to the page.
  1. page.waitForNavigation([options]): Waits for the page to navigate to a new URL.
  1. page.screenshot([options]): Takes a screenshot of the current page and returns it as a PNG buffer.
  1. page.evaluate(pageFunction[, ...args]): Executes the given function in the context of the page and returns its result.
  1. page.$(selector): Finds the first element matching the given selector.
  1. page.$$(selector): Finds all elements matching the given selector.
  1. page.setContent(html[, options]): Sets the HTML content of the page.
  1. page.goBack([options]): Navigates to the previous page in the history.
  1. page.goForward([options]): Navigates to the next page in the history.
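  1. page.waitForTimeout(timeout): Waits for the specified amount of time (in milliseconds) before continuing. This method was deprecated and later removed in Puppeteer v22; in newer versions, wait with a setTimeout wrapped in a Promise instead.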
  1. page.waitForFunction(pageFunction[, options[, ...args]]): Waits for the given function to return a truthy value before continuing.
  1. page.setViewport(viewport): Sets the size of the viewport for the page.
  1. page.evaluateHandle(pageFunction[, ...args]): Executes the given function in the context of the page and returns a handle to its result.
  1. page.addScriptTag(options): Adds a script tag to the page.
  1. page.setRequestInterception(value): Enables or disables request interception for the page.
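
Several of the methods above are commonly used together. A frequent case is clicking something that triggers a page load, where `page.click()` and `page.waitForNavigation()` have to be combined with care. The following is a minimal sketch, assuming `page` is a Puppeteer `Page`; the helper name `clickAndWaitForNavigation` is hypothetical.

```javascript
// Hypothetical helper: clicks an element that triggers a page load and
// resolves once the navigation has finished.
async function clickAndWaitForNavigation(page, selector, timeout = 30000) {
  // Start waiting before clicking. If the click came first, a fast
  // navigation could finish before the wait is registered and time out.
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil: 'domcontentloaded', timeout }),
    page.click(selector),
  ]);
  return response; // response of the main resource, if there is one
}
```

The `Promise.all` form starts both operations together, so the wait is already registered by the time the click causes the navigation.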
 

Comparison between Puppeteer and Selenium

| Sr. No. | Puppeteer | Selenium |
| --- | --- | --- |
| 1. | Puppeteer is developed mainly for Chromium, so the tests developed are mainly executed in Chrome. | Selenium can be used to execute tests on multiple browsers like Chrome, Firefox, IE, Safari, and so on. |
| 2. | Puppeteer code can be implemented only in JavaScript. | Selenium code can be implemented in multiple languages like Java, Python, JavaScript, C#, and so on. |
| 3. | Puppeteer provides APIs to manage headless execution in Chrome by using the DevTools protocol. | Selenium requires additional external browser drivers that trigger tests as per the user commands. |
| 4. | Puppeteer manages the Chrome browser. | Selenium is primarily used to execute tests to automate the actions performed on the browser. |
| 5. | Puppeteer is faster in executing tests than Selenium. | Selenium is slower in executing tests than Puppeteer. |
| 6. | Puppeteer is a module in node developed for Chromium engine. | Selenium is a dedicated test automation tool. |
| 7. | Puppeteer can be used for API testing by utilising the requests and the responses. | API testing with Selenium is difficult. |
| 8. | Puppeteer can be used to verify the count of CSS and JavaScript files utilised for loading a webpage. | Selenium cannot be used to verify the count of CSS and JavaScript files utilised for loading a webpage. |
| 9. | Puppeteer can be used to work on the majority of features in the DevTools in the Chrome browser. | Selenium cannot be used to work on the majority of features in the DevTools in the Chrome browser. |
| 10. | Puppeteer can be used to execute tests on various devices with the help of the emulators. | Using an emulator with Selenium is not easy. |
| 11. | Puppeteer can be used to obtain the time needed for a page to load. | Selenium cannot be used to obtain the time needed for a page to load. |
| 12. | Puppeteer can be used to save a screenshot in both image and PDF formats. | Selenium can be used to save a screenshot in both image and PDF formats only in the Selenium 4 version. |
| 13. | Puppeteer was first introduced in the year 2017. | Selenium was first introduced in the year 2004. |
| 14. | In Puppeteer, we can verify an application without image loading. | In Selenium, we can verify an application without image loading. |
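
Rows 8 and 11 of the comparison mention counting the CSS and JavaScript files a page loads and measuring its load time. Here is a minimal sketch of how that might look, assuming `page` is a Puppeteer `Page`; the helper name `countResources` is hypothetical, and the timing is a simple wall-clock measurement around `page.goto()`, not a browser performance metric.

```javascript
// Hypothetical helper: tallies requests by resource type while a page
// loads and measures how long the navigation took.
async function countResources(page, url) {
  const counts = {};
  const onRequest = (request) => {
    // resourceType() returns values such as 'document', 'stylesheet',
    // 'script', and 'image'.
    const type = request.resourceType();
    counts[type] = (counts[type] || 0) + 1;
  };

  page.on('request', onRequest);
  const startedAt = Date.now();
  try {
    await page.goto(url, { waitUntil: 'load' });
  } finally {
    page.off('request', onRequest); // stop counting after this navigation
  }
  return { counts, loadTimeMs: Date.now() - startedAt };
}
```

In the returned object, `counts.stylesheet` and `counts.script` hold the number of CSS and JavaScript requests, and are undefined if none were made.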
Demo Code1:
    const puppeteer = require('puppeteer');

    (async function () {
      const browser = await puppeteer.launch();
      console.log("Launched");
      const page = await browser.newPage();
      await page.goto('https://www.google.com/');
      console.log("In Site");
      await page.screenshot({ path: './Demo2.png' });
      console.log("Captured");
      await browser.close(); // wait for Chrome to shut down before the script ends
    })();
This is what we are doing in this small script:
  1. We import the Puppeteer library using require.
  1. Launch a new browser.
  1. Open a new page (tab) inside that browser.
  1. Navigate to the Google home page.
  1. Take a screenshot.
  1. Close the browser.
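
The last step matters in practice: if an earlier step throws, the script above never reaches `browser.close()` and a Chrome process may be left running. One way to guard against that is a `try`/`finally` wrapper. This is a minimal sketch; the helper name `withPage` is hypothetical, and `launcher` is assumed to be the `puppeteer` module (anything with a `launch()` method works).

```javascript
// Hypothetical helper: runs `task` with a fresh page and always closes
// the browser afterwards.
// `launcher` is assumed to be the puppeteer module.
async function withPage(launcher, task, launchOptions = {}) {
  const browser = await launcher.launch(launchOptions);
  try {
    const page = await browser.newPage();
    return await task(page);
  } finally {
    // Runs on success and on error, so no browser process is left behind.
    await browser.close();
  }
}
```

With it, the demo could be written as `await withPage(require('puppeteer'), async (page) => { await page.goto('https://www.google.com/'); await page.screenshot({ path: './Demo2.png' }); });`.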
 
 
Some of the most common wait options:
  1. waitUntil: This option tells navigation methods such as page.goto() and page.waitForNavigation() when to consider the navigation finished. The available values are:
      • load: Wait until the load event fires (i.e., the page and its resources such as images, stylesheets, and scripts have finished loading).
      • domcontentloaded: Wait until the DOMContentLoaded event fires (i.e., the HTML has been parsed, but resources such as images and stylesheets may still be loading).
      • networkidle0: Wait until there have been no network connections for at least 500ms (i.e., the page is considered loaded when there are no pending network requests).
      • networkidle2: Wait until there have been no more than 2 network connections for at least 500ms (useful for pages that keep one or two connections open, for example for polling).
  1. timeout: This option specifies the maximum amount of time (in milliseconds) to wait for the condition to be met before timing out and throwing an error.
  1. visible: This option of page.waitForSelector() specifies whether to wait for the element to become visible on the page.
  1. hidden: This option of page.waitForSelector() specifies whether to wait for the element to become hidden or to be removed from the page.
  1. selector: This is the first argument of page.waitForSelector() and specifies the CSS selector of the element to wait for.
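
The options above can be combined as follows. This is a minimal sketch, assuming `page` is a Puppeteer `Page`; the helper name `openAndWaitFor` and the default timeout are hypothetical choices.

```javascript
// Hypothetical helper: opens a URL and waits for an element to be visible.
async function openAndWaitFor(page, url, selector, timeout = 15000) {
  // waitUntil and timeout are options of page.goto().
  await page.goto(url, { waitUntil: 'networkidle2', timeout });
  // visible and timeout are options of page.waitForSelector();
  // the selector itself is the first argument.
  return page.waitForSelector(selector, { visible: true, timeout });
}
```

If either wait exceeds `timeout`, the returned promise rejects, so callers may want to wrap the call in `try`/`catch`.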
 
 
appx-tiles-grid-ul
data-listing-name
 
 
    const puppeteer = require('puppeteer');
    const xlsx = require('xlsx');

    (async () => {
      const browser = await puppeteer.launch({headless: false});
      const page = await browser.newPage();
      await page.goto('https://appexchange.salesforce.com/consulting');

      // Scroll to the bottom of the page to load all data
      await autoScroll(page);

      // Collect the data and save it to an Excel file
      const data = await page.evaluate(() => {
        const rows = [];
        document.querySelectorAll('.appx-tile-content-el').forEach(span => {
          const cells = [span.innerText];
          rows.push(cells);
        });
        return rows;
      });

      const wb = xlsx.utils.book_new();
      const ws = xlsx.utils.aoa_to_sheet(data);
      xlsx.utils.book_append_sheet(wb, ws, 'Data');
      xlsx.writeFile(wb, 'data.xlsx');

      await browser.close();
    })();

    async function autoScroll(page) {
      await page.evaluate(async () => {
        await new Promise((resolve, reject) => {
          let totalHeight = 0;
          const distance = 100;
          const scrollInterval = setInterval(() => {
            const scrollHeight = document.body.scrollHeight;
            window.scrollBy(0, distance);
            totalHeight += distance;
            if (totalHeight >= scrollHeight) {
              clearInterval(scrollInterval);
              resolve();
            }
          }, 1000); // Scroll every 1 second
        });

        await new Promise(resolve => setTimeout(resolve, 2000)); // Wait for 2 seconds after scrolling

        // Click the "Load More" button repeatedly until it's no longer present
        while (document.querySelector('#appx-load-more-button-id')) {
          document.querySelector('#appx-load-more-button-id').click();
          await new Promise(resolve => setTimeout(resolve, 2000)); // Wait for 2 seconds after clicking
        }

        await new Promise(resolve => setTimeout(resolve, 2000)); // Wait for 2 seconds after clicking all "Load More" buttons
      });
    }
 
 
| Salesforce Application | Sales Cloud |
| --- | --- |
| Paid Applications | ZOOMINFO FOR SALESFORCE, DEMANDBASE (DATA AND SALES INTELLIGENCE CLOUD), CLEARBIT - AUTOMATICALLY ENRICH LEADS AND CONTACTS IN REAL-TIME, MERCURY SMS: SEND & RECEIVE TEXT MESSAGES, SMS-MAGIC |
| Free Applications | DATALOADER.IO, THE #1 DATA LOADER FOR SALESFORCE, ASANA FOR SALESFORCE, NATIVE DOCUMENT GENERATION & E-SIGNATURE: PDF, WORD, XLS, EMAIL, REPORTS: S-DOCS |
Leading enterprise cloud marketplace Apps, solutions, and consultants Every industry and department Sales, marketing, customer service, and more Service Cloud Phone, email, social media, apps, or any other channel Solve customer problems fast, get insights into their behavior Getfeedback: Surveys for Salesforce - the best rated for CSAT, CES, NPS Q-assign: Lead routing, case assignment, round robin distribution Distribution engine: Lead assignment & opportunity routing. Round robin. In-gage – Surveys, compliance checks, quality audits & case categorization Five9 for Service Cloud Voice BYOT Vonage for Service Cloud Voice and Contact Center, CTI, speech analytics (BYOT) Talkdesk for Service Cloud Voice Avaya OneCloud™ for Salesforce - Service Cloud Voice (BYOT) powered by Avaya B+S Connects for Service Cloud Voice CTI, omni-channel, HVS, dialer, BYOT voice InGenius Nice CXone Agent for Service Cloud Voice (BYOT) CTI, BYOT, phone, HVS Mirage Connector for Service Cloud Voice - BYOT Natterbox Glance Gainsight: The #1 Rated Customer Success Platform RWS Language Weaver for Live Agent UPS Shipping App: Shipping, Returns, RMAs and Tracking Vonage for Service Cloud Voice and Contact Center, CTI, speech analytics (BYOT) Nice CXone Agent for Salesforce - CTI / IVR / ACD / Dialer / Contact Center Five9 Intelligent Cloud Contact Center Interactive Intelligence CTI 8x8 Virtual Office: CTI Ingenius
| Key | Value |
| --- | --- |
| Salesforce Apps | Service Cloud Applications |
| AppExchange | Leading enterprise cloud marketplace |
| Ready-to-install | Apps, solutions, and consultants |
| Extend Salesforce | Every industry and department |
| Solutions | Sales, marketing, customer service, and more |
| Latest Collections | Service Cloud |
| Customer Service Platform | #1 |
| Support channels | Phone, email, social media, apps, or any other channel |
| Solutions from AppExchange | Solve customer problems fast, get insights into their behavior |
| Service & Support Dashboards | Getfeedback: Surveys for Salesforce - the best rated for CSAT, CES, NPS |
| Lead Routing | Q-assign: Lead routing, case assignment, round robin distribution |
| Lead assignment | Distribution engine: Lead assignment & opportunity routing. Round robin. |
| Surveys | In-gage – Surveys, compliance checks, quality audits & case categorization |
| Service Cloud Voice Telephony Partners | Five9 for Service Cloud Voice BYOT |
| Contact center | Vonage for Service Cloud Voice and Contact Center, CTI, speech analytics (BYOT) |
| Cloud Contact Center | Talkdesk for Service Cloud Voice |
| Salesforce | Avaya OneCloud™ for Salesforce - Service Cloud Voice (BYOT) powered by Avaya |
| Cisco Contact Center Integration | B+S Connects for Service Cloud Voice |
| Genesys Cloud for Salesforce | CTI, omni-channel, HVS, dialer, BYOT voice |
| Partner Telephony | InGenius |
| Omnichannel | Nice CXone Agent for Service Cloud Voice (BYOT) |
| Odigo for Salesforce Service Cloud Voice | CTI, BYOT, phone, HVS |
| Mirage Connector | Mirage Connector for Service Cloud Voice - BYOT |
| Speech Analytics | Natterbox |
| Advanced Service Cloud Features | Glance |
| Customer Success Platform | Gainsight: The #1 Rated Customer Success Platform |
| Live Agent | RWS Language Weaver for Live Agent |
| Shipping | UPS Shipping App: Shipping, Returns, RMAs and Tracking |
| Service Cloud CTI Partners | Vonage for Service Cloud Voice and Contact Center, CTI, speech analytics (BYOT) |
| CTI/IVR/ACD/Dialer/Contact Center | Nice CXone Agent for Salesforce - CTI / IVR / ACD / Dialer / Contact Center |
| Intelligent Cloud Contact Center | Five9 Intelligent Cloud Contact Center |
| PureConnect | Interactive Intelligence |
| Amazon Connect CTI Adapter | CTI |
| Virtual Office | 8x8 Virtual Office: CTI |
| Computer Telephony Integration | Ingenius |
| Offer your solution on AppExchange | - |