Generating PDFs has become a crucial feature for many modern web applications, from creating invoices to producing complex data-driven reports. Over the years, various HTML to PDF tools have emerged, each with its own approach and capabilities. In this post, we’ll take a brief look at the evolution of PDF generation solutions — from wkhtmltopdf and PrinceXML (via DocRaptor) to Puppeteer and Playwright — and then walk through a short tutorial on using Playwright (Node.js) to generate a sample PDF in 2025.
A Brief History of PDF Generation Tools
wkhtmltopdf
wkhtmltopdf is a command-line tool that uses the WebKit rendering engine (the foundation of early Safari) to convert HTML and CSS into PDF. Historically, it was one of the most popular open-source solutions for server-side PDF generation.
➡️ How it works:
Renders a static HTML document (with CSS) in a WebKit-based environment.
Outputs a PDF file that closely (but not always perfectly) matches the layout seen in a WebKit browser.
Integrates easily into scripts or back-end services via CLI.
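To make the CLI integration above concrete, here is a minimal Node.js sketch that shells out to wkhtmltopdf. It assumes the wkhtmltopdf binary is installed and available on your PATH. The helper names buildWkhtmltopdfArgs and convertWithWkhtmltopdf are hypothetical and exist only for illustration.

```javascript
const { execFile } = require('node:child_process');

// Build the argument list for: wkhtmltopdf [options] <input> <output>
// (hypothetical helper; --page-size is a standard wkhtmltopdf option)
function buildWkhtmltopdfArgs(input, output, pageSize = 'A4') {
  return ['--page-size', pageSize, input, output];
}

// Run wkhtmltopdf and resolve with the output path when it exits cleanly.
// Assumes the wkhtmltopdf binary is on the PATH.
function convertWithWkhtmltopdf(input, output) {
  return new Promise((resolve, reject) => {
    execFile('wkhtmltopdf', buildWkhtmltopdfArgs(input, output), (error) => {
      if (error) reject(error);
      else resolve(output);
    });
  });
}
```

A call such as convertWithWkhtmltopdf('invoice.html', 'invoice.pdf') would then produce the PDF. Passing arguments as an array to execFile, rather than building a shell string, avoids quoting problems with file names.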
✅ Pros:
Open-source and free to use.
Simple CLI usage makes it straightforward to integrate with different back-end languages.
Mature project with a large user base.
❌ Cons:
Limited support for modern JavaScript and for newer CSS features such as flexbox and grid.
Inconsistent behavior with more complex layouts or interactive elements.
Can be slower and produce lower-quality rendering compared to headless Chrome.
DocRaptor (PrinceXML)
DocRaptor is a cloud-based API that leverages the commercial PrinceXML rendering engine to convert HTML (and CSS) into high-quality PDF or Excel documents. PrinceXML itself is known for its advanced typesetting capabilities and thorough CSS support, including complex paged media features.
➡️ How it works:
You send your HTML/CSS (and optionally JavaScript) via an API call.
The service uses PrinceXML under the hood to produce a precisely formatted PDF.
You receive the final PDF via the API response.
✅ Pros:
High-fidelity rendering for complex layouts, including advanced typographic and pagination features.
Thorough CSS support, handling paged media, footnotes, multi-column layouts, and more.
Backed by a commercial solution with consistent updates and dedicated support.
❌ Cons:
Very high licensing cost: PrinceXML (and by extension DocRaptor) can be prohibitively expensive for many smaller projects or startups.
Not a full headless browser engine: Because PrinceXML isn’t based on Chromium/Firefox, it may struggle with dynamic JavaScript frameworks (React, Vue, Angular) that rely heavily on live browser rendering.
Headless Chrome as a Game Changer
For a long time, generating PDFs that perfectly mirrored a modern browser’s rendering was difficult. Older engines like WebKit in wkhtmltopdf or specialized solutions (e.g., PrinceXML) often lagged behind the latest HTML/CSS/JS capabilities found in Chrome. That changed significantly when Google introduced Headless Chrome, allowing developers to run the browser without a visible UI.
Shortly after, Puppeteer and then Playwright emerged as powerful tools for automating Headless Chrome, enabling developers to generate PDFs and screenshots with precision.
Puppeteer
Puppeteer is a Node.js library that provides an extensive API for automating tasks in headless (or full) Chrome/Chromium. Initially released by the Google Chrome team, it quickly became popular for web scraping, testing, and HTML to PDF conversion.
➡️ How it works:
Puppeteer spins up a headless Chrome/Chromium instance.
Navigates to a given web page or loads an HTML string.
Waits for the content (including JavaScript) to render fully, then exports a PDF that matches Chrome’s final layout.
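The three steps above can be sketched roughly as follows. This is only an illustration, assuming the puppeteer package is installed; the function names buildPdfOptions and htmlToPdfWithPuppeteer and the default option values are hypothetical choices, not something Puppeteer prescribes.

```javascript
// Default PDF options (illustrative values, not required by Puppeteer).
function buildPdfOptions(outputPath, overrides = {}) {
  return { path: outputPath, format: 'A4', printBackground: true, ...overrides };
}

// Render an HTML string to a PDF file with headless Chrome.
async function htmlToPdfWithPuppeteer(html, outputPath) {
  // Required lazily so the helper above works without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // 'networkidle0' waits until there have been no network connections for 500 ms.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    await page.pdf(buildPdfOptions(outputPath));
  } finally {
    // Always release the browser process, even if rendering fails.
    await browser.close();
  }
  return outputPath;
}
```

Calling htmlToPdfWithPuppeteer('<h1>Hello</h1>', 'hello.pdf') from an async context would write hello.pdf to the current working directory.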
✅ Pros:
Full support for modern HTML, CSS, and dynamic JavaScript.
Highly configurable: page size, margins, custom headers/footers, etc.
Maintained by the Google Chrome team, ensuring updates align with Chromium.
❌ Cons:
Language focus: Primarily supported in Node.js, so it’s less convenient for developers in other languages.
Resource usage: Spinning up a headless browser can be heavier compared to simpler tools.
Scaling complexity: Requires more infrastructure for large workloads.
Playwright
Playwright was developed by Microsoft and shares many similarities with Puppeteer. It supports automated testing and browser manipulation for Chromium, Firefox, and WebKit, although PDF generation currently works only in Chromium. Despite that limitation, Playwright’s design and multiple language SDKs make it a powerful option for diverse teams.
➡️ How it works:
Playwright launches a headless Chrome/Chromium instance.
It navigates to a specified webpage or processes an HTML string.
The tool waits until the content, including any JavaScript, is fully rendered, and then generates a PDF that reflects the final layout of the browser.
✅ Pros:
Multiple language SDKs (Python, Java, C#/.NET, JavaScript/TypeScript) accommodate varied development teams.
Modern standards compatibility: Because PDF generation runs on a real Chromium engine, flexbox, grid, and other advanced CSS features are supported.
Flexible configuration: Allows custom margins, paper sizes, headers/footers, and dynamic page manipulation before exporting.
Scalable & actively maintained: Backed by Microsoft and the open-source community, it evolves quickly and is well-documented. You can containerize headless Chromium and distribute load for large-scale PDF tasks.
❌ Cons:
PDF limited to Chromium: Even though Playwright supports Firefox and WebKit for automation, PDF generation is restricted to Chromium.
High resource usage: Like Puppeteer, running a headless browser is heavier than simpler command-line tools.
Overkill for minimal needs: If you only require simple, static PDFs, spinning up Chromium might be more complex than necessary.
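One common way to soften the resource cost mentioned above is to launch Chromium once and reuse it for many documents, while capping how many pages render at the same time. The sketch below illustrates that idea under stated assumptions: mapWithConcurrency and renderBatch are hypothetical helper names, and the default limit of 4 is arbitrary.

```javascript
// Run `worker` over `items`, with at most `limit` calls in flight at once.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Render many HTML strings to PDF buffers using one shared Chromium instance.
async function renderBatch(htmlDocuments, limit = 4) {
  // Required lazily so mapWithConcurrency can be used without Playwright.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    return await mapWithConcurrency(htmlDocuments, limit, async (html) => {
      const page = await browser.newPage();
      try {
        await page.setContent(html, { waitUntil: 'load' });
        // Without a `path` option, page.pdf() resolves with the PDF as a Buffer.
        return await page.pdf({ format: 'A4', printBackground: true });
      } finally {
        await page.close();
      }
    });
  } finally {
    await browser.close();
  }
}
```

The results come back in the same order as the input documents, so the caller can pair each buffer with its source and write it wherever it needs to go.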
Puppeteer vs Playwright
Both Puppeteer and Playwright are powerful tools for PDF generation, leveraging the same Chromium engine. The choice between the two largely depends on your specific needs, such as your preferred programming language, the complexity of your project, and whether you need to automate multiple browsers beyond PDF generation.
Here’s a comparison to help you choose the right tool for your needs:
Aspect | Puppeteer | Playwright |
PDF Generation | Supported via headless Chromium. | Supported via headless Chromium. |
Browser Support | Chrome/Chromium, plus Firefox in recent versions. | Chromium only for PDF. Supports Firefox and WebKit for other tasks. |
Multi-language Support | No - Limited to Node.js. | Yes - Python, Java, C#/.NET, JavaScript/TypeScript. |
Dynamic Content Support | Fully supports JavaScript, AJAX, etc. | Fully supports JavaScript, AJAX, etc. |
Configuration Options | Flexible - margins, headers/footers, paper size. | Flexible - margins, headers/footers, paper size. |
Ease of Integration | Ideal for Node.js developers. | Designed for teams using multiple programming languages. |
Scalability | Scales by running more browser instances; mainly Chromium-focused. | Scales the same way for PDFs; multi-browser support helps only with non-PDF automation. |
Development | Released in 2017 by Google. | Released in 2020 by Microsoft. |
Step-by-Step Guide: Generating Invoice PDF with Playwright
In this quick example, we’ll show how to generate an invoice PDF by rendering an EJS template in Node.js and converting it to a PDF using Playwright.
1️⃣ Set Up the Project
1. Install Node.js.
Make sure you have Node.js installed. If not, download and install it from the Node.js website.
2. Create a New Project Directory.
Open a Terminal and run:
mkdir invoice-generator
cd invoice-generator
3. Initialize a New Node.js Project.
Create a package.json file with the following command:
npm init -y
4. Install Required Packages.
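Install ejs for templating and playwright for generating PDFs:
npm install ejs playwright
Recent Playwright releases may not download browser binaries during npm install. If Chromium is missing when you run the script, install it with:
npx playwright install chromium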
2️⃣ Organize Your Project Structure
Here’s a suggested structure for better organization:
invoice-generator/
├── data/ // Directory for data files
│ └── invoice-data.json // JSON file for invoice data
├── templates/ // Directory for HTML templates
│ └── invoice.ejs // Template for the invoice
├── generate-invoice.js // Main script to generate PDFs
└── package.json // Project configuration file
3️⃣ Create an EJS Template
To start generating PDFs, you’ll first need an HTML template for your invoice. In this example we will use EJS, a simple templating language that allows you to generate HTML markup using plain JavaScript.
Save your template as invoice.ejs in the templates directory. It should define the structure of your invoice and include placeholders for dynamic data.
4️⃣ Add Invoice Data
Save your invoice details as invoice-data.json in the data directory. This file will hold the dynamic data, such as customer details, that will be used to populate the placeholders in the invoice.ejs template. By keeping the data in a separate file, you ensure that the template can be reused with different data sets.
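The contents of the two files are not reproduced in this post, so the following is only a hypothetical minimal pair to get started with. Every field name (invoiceNumber, customer, items, and so on) and every value is an illustrative assumption, as is the scaffoldSampleFiles helper. The script writes both files so that the placeholders in the template and the keys in the JSON stay in sync.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical invoice template; the field names are illustrative assumptions.
const sampleTemplate = `<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Invoice <%= invoiceNumber %></title>
</head>
<body>
  <h1>Invoice <%= invoiceNumber %></h1>
  <p>Billed to: <%= customer.name %></p>
  <table>
    <tr><th>Item</th><th>Qty</th><th>Unit price</th></tr>
    <% items.forEach(function (item) { %>
    <tr>
      <td><%= item.description %></td>
      <td><%= item.quantity %></td>
      <td><%= item.unitPrice.toFixed(2) %></td>
    </tr>
    <% }); %>
  </table>
</body>
</html>
`;

// Hypothetical data whose keys match the placeholders in the template above.
const sampleData = {
  invoiceNumber: 'INV-0001',
  customer: { name: 'Example Customer Ltd.' },
  items: [
    { description: 'Consulting', quantity: 2, unitPrice: 150 },
    { description: 'Hosting', quantity: 1, unitPrice: 40 },
  ],
};

// Write both files under `projectDir`, creating the directories if needed.
function scaffoldSampleFiles(projectDir) {
  fs.mkdirSync(path.join(projectDir, 'templates'), { recursive: true });
  fs.mkdirSync(path.join(projectDir, 'data'), { recursive: true });
  fs.writeFileSync(path.join(projectDir, 'templates', 'invoice.ejs'), sampleTemplate);
  fs.writeFileSync(
    path.join(projectDir, 'data', 'invoice-data.json'),
    JSON.stringify(sampleData, null, 2)
  );
}
```

Calling scaffoldSampleFiles(__dirname) from a script in the project root would create templates/invoice.ejs and data/invoice-data.json. In EJS, the <%= ... %> tags output an escaped value, while <% ... %> tags run plain JavaScript such as the forEach loop over the line items.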
5️⃣ Create the PDF Generator Script
To generate a PDF from your HTML template, you’ll need a script that renders the template and uses Playwright for PDF conversion.
Save the following script as generate-invoice.js in your project directory:
const ejs = require('ejs');
const fs = require('fs');
const path = require('path');
const { chromium } = require('playwright');

// Load invoice data from the JSON file
const invoiceData = JSON.parse(
  fs.readFileSync(path.join(__dirname, 'data', 'invoice-data.json'), 'utf8')
);

(async () => {
  let browser;
  try {
    // Build a filename-safe timestamp so each run gets a unique file
    const timestamp = new Date().toISOString().replace(/[:.]/g, '-');

    // Render the EJS template to HTML
    const templatePath = path.join(__dirname, 'templates', 'invoice.ejs');
    const html = await ejs.renderFile(templatePath, invoiceData);

    // Launch a headless browser using Playwright
    browser = await chromium.launch();
    const page = await browser.newPage();

    // Load the rendered HTML into the browser
    await page.setContent(html, { waitUntil: 'load' });

    // Generate the PDF and save it with a timestamped filename
    const pdfPath = `invoice-${timestamp}.pdf`;
    await page.pdf({
      path: pdfPath,
      format: 'A4',
      printBackground: true
      // Additional parameters can be added here
    });

    console.log(`PDF successfully created at: ${pdfPath}`);
  } catch (error) {
    console.error('An error occurred while generating the invoice:', error);
    process.exitCode = 1;
  } finally {
    // Close the browser even if rendering failed
    if (browser) await browser.close();
  }
})();
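The script leaves room for additional page.pdf() parameters. As one possible direction, the sketch below builds an options object with a header, a page-numbered footer, and explicit margins. The function name buildInvoicePdfOptions and all styling values are illustrative assumptions; the option names themselves (displayHeaderFooter, headerTemplate, footerTemplate, margin) are part of Playwright's page.pdf() API.

```javascript
// Build page.pdf() options with a simple header and page-numbered footer.
// The function name and the styling values are illustrative assumptions.
function buildInvoicePdfOptions(pdfPath, title) {
  return {
    path: pdfPath,
    format: 'A4',
    printBackground: true,
    displayHeaderFooter: true,
    // Templates are HTML strings. `title` is inserted as-is, so pass trusted text only.
    headerTemplate:
      `<div style="font-size:8px; width:100%; text-align:center;">${title}</div>`,
    // Chromium fills in elements that carry the pageNumber and totalPages classes.
    footerTemplate:
      '<div style="font-size:8px; width:100%; text-align:center;">' +
      'Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
    // Leave room for the header and footer so they do not overlap the content.
    margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
  };
}
```

In the script, the call would then become await page.pdf(buildInvoicePdfOptions(pdfPath, 'Invoice')). Header and footer templates are rendered in the page margins, so they only show up when the top and bottom margins are large enough to hold them.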
6️⃣ Run the Script
Open a terminal and navigate to the project directory:
cd invoice-generator
Run the script:
node generate-invoice.js
7️⃣ Check the Output
Once the script has executed successfully, you’ll find the following file in your project directory:
Generated PDF: invoice-<timestamp>.pdf - the final invoice, ready for sharing or printing.
Open the PDF to ensure it displays the invoice correctly.
It’s done! You’ve successfully created the invoice PDF. 🎉
Conclusion
Playwright (and headless browsers in general) represent the modern standard for HTML to PDF conversion, providing excellent support for contemporary web layouts and interactive elements. While some projects still rely on wkhtmltopdf or specialized engines like PrinceXML, using headless Chromium ensures your PDFs accurately reflect the latest HTML/CSS/JS capabilities.
Don’t want to manage browser instances yourself? If you’re looking for a simpler route, you could opt for an API-based solution such as PDFBolt — an approach that offloads the maintenance and scaling concerns, letting you focus on your core application logic.
No matter which method you choose, we hope this brief history and tutorial will help you generate PDFs with confidence in 2025 and beyond!
Good Luck and Happy PDFing! 🚀