How to Generate PDFs in 2025

Generating PDFs has become a crucial feature for many modern web applications, from creating invoices to producing complex data-driven reports. Over the years, various HTML to PDF tools have emerged, each with its own approach and capabilities. In this post, we’ll take a brief look at the evolution of PDF generation solutions — from wkhtmltopdf and PrinceXML (via DocRaptor) to Puppeteer and Playwright — and then walk through a short tutorial on using Playwright (Node.js) to generate a sample PDF in 2025.

A Brief History of PDF Generation Tools

wkhtmltopdf

wkhtmltopdf is a command-line tool that uses the Qt WebKit rendering engine (an older port of the engine behind Safari) to convert HTML and CSS into PDF. Historically, it was one of the most popular open-source solutions for server-side PDF generation. The project’s GitHub repository was archived in 2023, so it no longer receives updates.

➡️ How it works:

  • Renders a static HTML document (with CSS) in a WebKit-based environment.

  • Outputs a PDF file that closely (but not always perfectly) matches the layout seen in a WebKit browser.

  • Integrates easily into scripts or back-end services via CLI.
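The CLI integration described above can be sketched from Node.js with the built-in child_process module. This is a minimal sketch, assuming the wkhtmltopdf binary is installed and available on your PATH. The helper names buildWkhtmltopdfArgs and htmlFileToPdf, and the file names in the usage comment, are hypothetical.

```javascript
const { execFile } = require('child_process');

// Build the argument list for: wkhtmltopdf [options] <input> <output>
// (function name and defaults are illustrative, not part of wkhtmltopdf itself)
function buildWkhtmltopdfArgs(inputHtml, outputPdf, { pageSize = 'A4' } = {}) {
    return ['--page-size', pageSize, inputHtml, outputPdf];
}

// Run the CLI and resolve with the output path once the process exits cleanly
function htmlFileToPdf(inputHtml, outputPdf, options) {
    const args = buildWkhtmltopdfArgs(inputHtml, outputPdf, options);
    return new Promise((resolve, reject) => {
        execFile('wkhtmltopdf', args, (error) => {
            if (error) {
                reject(error); // e.g. binary not found or non-zero exit code
            } else {
                resolve(outputPdf);
            }
        });
    });
}

// Usage (assumes invoice.html exists in the current directory):
// htmlFileToPdf('invoice.html', 'invoice.pdf').then(console.log).catch(console.error);
```

Using execFile with an argument array (rather than building a shell string) avoids quoting problems when file names contain spaces.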

✅ Pros:

  • Open-source and free to use.

  • Simple CLI usage makes it straightforward to integrate with different back-end languages.

  • Mature project with a large user base.

❌ Cons:

  • Limited support for modern JavaScript and CSS features like flexbox or grid.

  • Inconsistent behavior with more complex layouts or interactive elements.

  • Can be slower and produce lower-quality rendering compared to headless Chrome.

DocRaptor (PrinceXML)

DocRaptor is a cloud-based API that leverages the commercial PrinceXML rendering engine to convert HTML (and CSS) into high-quality PDF or Excel documents. PrinceXML itself is known for its advanced typesetting capabilities and thorough CSS support, including complex paged media features.

➡️ How it works:

  • You send your HTML/CSS (and optionally JavaScript) via an API call.

  • The service uses PrinceXML under the hood to produce a precisely formatted PDF.

  • You receive the final PDF via the API response.
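The request/response flow above might look roughly like the following from Node.js (version 18 or later, for the built-in fetch). Treat this as a sketch under assumptions: the endpoint URL and the field names (user_credentials, doc, document_content, type, test) reflect my reading of DocRaptor’s public documentation and should be verified against the current API reference. The function names are hypothetical.

```javascript
// ASSUMPTION: endpoint and field names are based on DocRaptor's public docs; verify before use
const DOCRAPTOR_URL = 'https://api.docraptor.com/docs';

// Build the fetch() options for a PDF conversion request
function buildDocRaptorRequest(apiKey, html, { test = true } = {}) {
    return {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
            user_credentials: apiKey,
            // test: true asks for a test document rather than a production one
            doc: { document_content: html, type: 'pdf', test },
        }),
    };
}

// Send the HTML and resolve with the PDF bytes from the response body
async function htmlToPdfViaDocRaptor(apiKey, html) {
    const response = await fetch(DOCRAPTOR_URL, buildDocRaptorRequest(apiKey, html));
    if (!response.ok) {
        throw new Error(`DocRaptor request failed with status ${response.status}`);
    }
    return Buffer.from(await response.arrayBuffer());
}
```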

✅ Pros:

  • High-fidelity rendering for complex layouts, including advanced typographic and pagination features.

  • Thorough CSS support, handling paged media, footnotes, multi-column layouts, and more.

  • Backed by a commercial solution with consistent updates and dedicated support.

❌ Cons:

  • Very high licensing cost: PrinceXML (and by extension DocRaptor) can be prohibitively expensive for many smaller projects or startups.

  • Not a full headless browser engine: Because PrinceXML isn’t based on Chromium/Firefox, it may struggle with dynamic JavaScript frameworks (React, Vue, Angular) that rely heavily on live browser rendering.


Headless Chrome as a Game Changer

For a long time, generating PDFs that perfectly mirrored a modern browser’s rendering was difficult. Older engines like WebKit in wkhtmltopdf or specialized solutions (e.g., PrinceXML) often lagged behind the latest HTML/CSS/JS capabilities found in Chrome. That changed significantly in 2017, when Google introduced Headless Chrome, allowing developers to run the browser without a visible UI.

Puppeteer arrived later that year, and Playwright followed in 2020. Both drive a headless browser programmatically, enabling developers to generate PDFs and screenshots that match what the browser itself renders.

Puppeteer

Puppeteer is a Node.js library that provides an extensive API for automating tasks in headless (or full) Chrome/Chromium. Initially released by the Google Chrome team, it quickly became popular for web scraping, testing, and HTML to PDF conversion.

➡️ How it works:

  • Puppeteer spins up a headless Chrome/Chromium instance.

  • Navigates to a given web page or loads an HTML string.

  • Waits for the content (including JavaScript) to render fully, then exports a PDF that matches Chrome’s final layout.
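The three steps above can be sketched as a small helper. This is a minimal example rather than a recommended production setup: the function name urlToPdfWithPuppeteer is hypothetical, and the browser is passed in by the caller so that one launched instance can be reused across many pages.

```javascript
// Navigate to a URL and save the rendered page as a PDF,
// using an already-launched Puppeteer browser passed in by the caller.
async function urlToPdfWithPuppeteer(browser, url, outputPath) {
    const page = await browser.newPage();
    try {
        // 'networkidle0' resolves once there have been no open network connections
        // for 500 ms, which gives client-side JavaScript a chance to finish rendering
        await page.goto(url, { waitUntil: 'networkidle0' });
        await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
        return outputPath;
    } finally {
        await page.close(); // release the page even if navigation or printing fails
    }
}

// Usage (assumes `npm install puppeteer`; run inside an async function):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch();
// await urlToPdfWithPuppeteer(browser, 'https://example.com', 'example.pdf');
// await browser.close();
```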

✅ Pros:

  • Full support for modern HTML, CSS, and dynamic JavaScript.

  • Highly configurable: page size, margins, custom headers/footers, etc.

  • Maintained by the Google Chrome team, ensuring updates align with Chromium.

❌ Cons:

  • Language focus: Primarily supported in Node.js, so it’s less convenient for developers in other languages.

  • Resource usage: Spinning up a headless browser can be heavier compared to simpler tools.

  • Scaling complexity: Requires more infrastructure for large workloads.

Playwright

Playwright was developed by Microsoft and shares many similarities with Puppeteer. It supports automated testing and browser manipulation for Chromium, Firefox, and WebKit, although PDF generation currently works only in Chromium. Despite that limitation, Playwright’s design and multiple language SDKs make it a powerful option for diverse teams.

➡️ How it works:

  • Playwright launches a headless Chrome/Chromium instance.

  • It navigates to a specified webpage or processes an HTML string.

  • The tool waits until the content, including any JavaScript, is fully rendered, and then generates a PDF that reflects the final layout of the browser.

✅ Pros:

  • Multiple language SDKs (Python, Java, C#/.NET, JavaScript/TypeScript) accommodate varied development teams.

  • Modern standards compatibility: Because it uses a real Chromium engine for PDF, flexbox, grid, and other advanced CSS features are fully supported.

  • Flexible configuration: Allows custom margins, paper sizes, headers/footers, and dynamic page manipulation before exporting.

  • Scalable & actively maintained: Backed by Microsoft and the open-source community, it evolves quickly and is well-documented. You can containerize headless Chromium and distribute load for large-scale PDF tasks.
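As an illustration of the configuration options mentioned above, the sketch below builds a page.pdf() options object with custom margins and a page-number footer. The helper name buildPdfOptions and the specific sizes are arbitrary choices for this example. The pageNumber and totalPages class names are the placeholders that Chromium fills in at print time.

```javascript
// Build page.pdf() options with custom margins and a "Page X of Y" footer
function buildPdfOptions(outputPath) {
    return {
        path: outputPath,
        format: 'A4',
        printBackground: true,
        displayHeaderFooter: true,
        headerTemplate: '<span></span>', // intentionally empty header
        // Templates need an explicit font size to be legible
        footerTemplate:
            '<div style="font-size:9px; width:100%; text-align:center;">' +
            'Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
        // Margins leave room for the header and footer areas
        margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
    };
}

// Usage with Playwright (assumes an existing `page` inside an async function):
// await page.pdf(buildPdfOptions('report.pdf'));
```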

❌ Cons:

  • PDF limited to Chromium: Even though Playwright supports Firefox and WebKit for automation, PDF generation is restricted to Chromium.

  • High resource usage: Like Puppeteer, running a headless browser is heavier than simpler command-line tools.

  • Overkill for minimal needs: If you only require simple, static PDFs, spinning up Chromium might be more complex than necessary.

Puppeteer vs Playwright

Both Puppeteer and Playwright are powerful tools for PDF generation, leveraging the same Chromium engine. The choice between the two largely depends on your specific needs, such as your preferred programming language, the complexity of your project, and whether you need to automate multiple browsers beyond PDF generation.

Here’s a comparison to help you choose the right tool for your needs:

| Aspect | Puppeteer | Playwright |
|---|---|---|
| PDF Generation | Supported via headless Chromium. | Supported via headless Chromium. |
| Browser Support | Primarily Chrome/Chromium; Firefox is also supported in recent versions. | Chromium only for PDF. Supports Firefox and WebKit for other tasks. |
| Multi-language Support | No - Limited to Node.js. | Yes - Python, Java, C#/.NET, JavaScript/TypeScript. |
| Dynamic Content Support | Fully supports JavaScript, AJAX, etc. | Fully supports JavaScript, AJAX, etc. |
| Configuration Options | Flexible - margins, headers/footers, paper size. | Flexible - margins, headers/footers, paper size. |
| Ease of Integration | Ideal for Node.js developers. | Designed for teams using multiple programming languages. |
| Scalability | Focused on Chromium-based tasks. | Scales better with multi-browser support. |
| Development | Released in 2017 by Google. | Released in 2020 by Microsoft. |

Step-by-Step Guide: Generating Invoice PDF with Playwright

In this quick example, we’ll show how to generate an invoice PDF by rendering an EJS template in Node.js and converting it to a PDF using Playwright.

💡
Tip: The entire example is available on GitHub if you'd like to view or clone the full project.

1️⃣ Set Up the Project

1. Install Node.js.

Make sure you have Node.js installed. If not, download and install it from the Node.js website.

2. Create a New Project Directory.

Open a terminal and run:

mkdir invoice-generator
cd invoice-generator

3. Initialize a New Node.js Project.

Create a package.json file with the following command:

npm init -y

4. Install Required Packages.

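Install ejs for templating and playwright for generating PDFs:

npm install ejs playwright

Recent Playwright versions may not download browser binaries during npm install. If the script later reports a missing browser executable, download Chromium with:

npx playwright install chromium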

2️⃣ Organize Your Project Structure

Here’s a suggested structure for better organization:

invoice-generator/
├── data/                   // Directory for data files
│   └── invoice-data.json   // JSON file for invoice data
├── templates/              // Directory for HTML templates
│   └── invoice.ejs         // Template for the invoice
├── generate-invoice.js     // Main script to generate PDFs
└── package.json            // Project configuration file

3️⃣ Create an EJS Template

  • To start generating PDFs, you’ll first need an HTML template for your invoice. In this example we will use EJS, a simple templating language that allows you to generate HTML markup using plain JavaScript.

  • Save this template as invoice.ejs in the templates directory. It should define the structure of your invoice and include placeholders for dynamic data.

🔗
You can find my example template on GitHub.
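If you would rather start from scratch, a stripped-down template might look like the sketch below. The field names (invoiceNumber, customer.name, items, and so on) are hypothetical and are not taken from the example repository. In EJS, <%= %> outputs an HTML-escaped value and <% %> runs plain JavaScript such as a loop.

```javascript
// A minimal invoice template held in a string for illustration;
// in the tutorial layout this markup would live in templates/invoice.ejs.
const invoiceTemplate = `
<h1>Invoice <%= invoiceNumber %></h1>
<p>Billed to: <%= customer.name %></p>
<table>
  <% items.forEach((item) => { %>
  <tr>
    <td><%= item.description %></td>
    <td><%= item.quantity %></td>
    <td><%= item.unitPrice.toFixed(2) %></td>
  </tr>
  <% }); %>
</table>`;

// Render the template string to HTML (requires `npm install ejs`)
function renderInvoiceHtml(data) {
    const ejs = require('ejs'); // loaded lazily, only when rendering is requested
    return ejs.render(invoiceTemplate, data);
}
```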

4️⃣ Add Invoice Data

  • Save your invoice details as invoice-data.json in the data directory.

  • This file will hold the dynamic data, such as customer details, that will be used to populate the placeholders in the invoice.ejs template.

  • By keeping the data in a separate file, you ensure that the template can be reused with different data sets.

🔗
You can find the example invoice data on GitHub.
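For orientation, here is one possible shape for the invoice data, shown as a JavaScript object, together with a helper that derives the total before rendering. The field names and values are invented for this sketch and may differ from the example file on GitHub.

```javascript
// Hypothetical shape for data/invoice-data.json, shown as a JavaScript object
const sampleInvoice = {
    invoiceNumber: 'INV-0001',
    customer: { name: 'Example Customer', email: 'customer@example.com' },
    items: [
        { description: 'Consulting', quantity: 2, unitPrice: 150 },
        { description: 'Hosting', quantity: 1, unitPrice: 49.5 },
    ],
};

// Sum line items in integer cents to avoid floating-point rounding surprises
function invoiceTotal(invoice) {
    const cents = invoice.items.reduce(
        (sum, item) => sum + Math.round(item.unitPrice * 100) * item.quantity,
        0
    );
    return cents / 100;
}

console.log(invoiceTotal(sampleInvoice)); // 349.5
```

A derived value like this can be merged into the data object before it is passed to the template, which keeps arithmetic out of the markup.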

5️⃣ Create the PDF Generator Script

  • To generate a PDF from your HTML template, you’ll need a script that renders the template and uses Playwright for PDF conversion.

  • Save the script as generate-invoice.js in your project directory.

Below is the complete script:

const ejs = require('ejs');
const fs = require('fs');
const path = require('path');
const {chromium} = require('playwright');

// Load invoice data from the JSON file
const invoiceData = JSON.parse(fs.readFileSync(path.join(__dirname, 'data', 'invoice-data.json'), 'utf8'));

(async () => {
    let browser; // Declared outside try so the finally block can close it
    try {
        // Timestamp for the filename; ':' and '.' are replaced to keep the name filesystem-friendly
        const timestamp = new Date().toISOString().replace(/[:.]/g, '-');

        // Render the EJS template to HTML
        const templatePath = path.join(__dirname, 'templates', 'invoice.ejs');
        const html = await ejs.renderFile(templatePath, invoiceData);

        // Launch a headless browser using Playwright
        browser = await chromium.launch();
        const page = await browser.newPage();

        // Load the rendered HTML into the browser
        await page.setContent(html, {waitUntil: 'load'});

        // Generate the PDF and save it with a timestamped filename
        const pdfPath = `invoice-${timestamp}.pdf`;
        await page.pdf({
            path: pdfPath,
            format: 'A4',
            printBackground: true
            // Additional parameters can be added here
        });

        console.log(`PDF successfully created at: ${pdfPath}`);
    } catch (error) {
        console.error('An error occurred while generating the invoice:', error);
        process.exitCode = 1; // Signal failure to the calling shell or CI job
    } finally {
        // Always close the browser, even after an error, so no Chromium process is left running
        if (browser) {
            await browser.close();
        }
    }
})();

🔗
You can also find the complete script on GitHub.
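If you need the PDF in memory rather than on disk, for example to send it in an HTTP response, page.pdf() also resolves with the PDF bytes. The variant below is a sketch: the function name renderPdfBuffer is hypothetical, and the browser is passed in so a single launched Chromium can serve many documents.

```javascript
// Render an HTML string to an in-memory PDF using an already-launched Playwright browser
async function renderPdfBuffer(browser, html) {
    const page = await browser.newPage();
    try {
        await page.setContent(html, { waitUntil: 'load' });
        // Without a `path` option nothing is written to disk; the bytes are returned instead
        return await page.pdf({ format: 'A4', printBackground: true });
    } finally {
        await page.close(); // free the page but keep the browser for the next document
    }
}

// Usage (run inside an async function):
// const { chromium } = require('playwright');
// const browser = await chromium.launch();
// const pdf = await renderPdfBuffer(browser, '<h1>Invoice</h1>');
// await browser.close();
```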

6️⃣ Run the Script

Open a terminal and navigate to the project directory:

cd invoice-generator

Run the script:

node generate-invoice.js

7️⃣ Check the Output

Once the script has executed successfully, you’ll find the following file in your project directory:

  • Generated PDF: invoice-<timestamp>.pdf - the final invoice ready for sharing or printing.

  • Open the PDF to ensure it displays the invoice correctly.
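For an automated sanity check, for instance in a CI job, you can confirm that the output file at least begins with the PDF header bytes. This is only a rough sketch with a hypothetical helper name; it does not validate the layout or content, so opening the file by eye is still worthwhile.

```javascript
const fs = require('fs');

// Return true if the file starts with the PDF header "%PDF-"
function looksLikePdf(filePath) {
    const fd = fs.openSync(filePath, 'r');
    try {
        const header = Buffer.alloc(5);
        const bytesRead = fs.readSync(fd, header, 0, 5, 0);
        return bytesRead === 5 && header.toString('latin1') === '%PDF-';
    } finally {
        fs.closeSync(fd); // always release the file descriptor
    }
}

// Usage (replace the file name with the one printed by the script):
// console.log(looksLikePdf('invoice-<timestamp>.pdf'));
```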

It’s done! You’ve successfully created the invoice PDF. 🎉

Below is a preview of the generated invoice PDF:


Conclusion

Playwright (and headless browsers in general) represents the modern standard for HTML to PDF conversion, providing excellent support for contemporary web layouts and interactive elements. While some projects still rely on wkhtmltopdf or specialized engines like PrinceXML, using headless Chromium ensures your PDFs accurately reflect the latest HTML/CSS/JS capabilities.

Don’t want to manage browser instances yourself? If you’re looking for a simpler route, you could opt for an API-based solution such as PDFBolt — an approach that offloads the maintenance and scaling concerns, letting you focus on your core application logic.

No matter which method you choose, we hope this brief history and tutorial will help you generate PDFs with confidence in 2025 and beyond!

Good Luck and Happy PDFing! 🚀