html-to-pdf

Name: html-to-pdf
Author: owenrao

Description

Convert an HTML file to a PDF using headless Chrome (Puppeteer) — the same approach atypica uses for its AI-generated research reports. Use this skill whenev...

README (SKILL.md)

Overview

This skill converts an HTML file to PDF using Puppeteer (headless Chromium), the same way atypica exports its AI research reports. Two modes are supported:

| Mode | When to use |
| --- | --- |
| Single-page (default) | Design/report pages meant to look like one tall poster, with no page breaks. Full-width at 1440 px. |
| Paginated | Documents meant to be printed or read page-by-page (A4, Letter, etc.). |

Quickstart (3 steps)

# 1. Copy the bundled scripts to a working directory
cp <skill-dir>/scripts/html-to-pdf.js ./
cp <skill-dir>/scripts/package.json ./

# 2. Install the only dependency (downloads Chromium automatically, ~170 MB, one-time)
npm install

# 3. Run
node html-to-pdf.js report.html report.pdf

<skill-dir> is the directory that contains this SKILL.md file.

Note: npm install pulls in puppeteer, which downloads a pinned Chromium binary (~170 MB). This is the only install step: no system Chrome, no wkhtmltopdf, and no separate server is needed. If the environment already has Puppeteer installed, skip step 2.

Command reference

node html-to-pdf.js <input.html> <output.pdf> [options]

Options:
  --paginated         A4-paginated mode (respects @media print, page-breaks)
  --format <fmt>      Page format: A4 (default), A3, Letter, Legal
  --width <px>        Viewport width for single-page mode (default: 1440)
  --wait <ms>         Extra milliseconds to wait after page load (for JS-rendered content)
  --header-footer     Add page-number footer in paginated mode
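
As an illustration of the option list above, the flags could be parsed with a small hand-written loop. This is a minimal sketch, not the bundled script's actual code: the function name `parseArgs`, the returned field names, and the default of 0 ms for `--wait` are assumptions made for the example.

```javascript
// Hypothetical sketch of parsing the documented flags; the bundled script may differ.
function parseArgs(argv) {
  const opts = {
    input: null,
    output: null,
    paginated: false,
    format: 'A4',        // documented default
    width: 1440,         // documented default
    wait: 0,             // assumed default; the README does not state one
    headerFooter: false,
  };
  const positional = [];

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case '--paginated':
        opts.paginated = true;
        break;
      case '--header-footer':
        opts.headerFooter = true;
        break;
      case '--format':
        opts.format = argv[++i];
        break;
      case '--width':
        opts.width = Number(argv[++i]);
        break;
      case '--wait':
        opts.wait = Number(argv[++i]);
        break;
      default:
        if (arg.startsWith('--')) throw new Error(`Unknown option: ${arg}`);
        positional.push(arg);
    }
  }

  if (positional.length !== 2) {
    throw new Error('Usage: node html-to-pdf.js <input.html> <output.pdf> [options]');
  }
  if (!Number.isFinite(opts.width) || opts.width <= 0) {
    throw new Error('--width must be a positive number');
  }
  if (!Number.isFinite(opts.wait) || opts.wait < 0) {
    throw new Error('--wait must be a non-negative number');
  }
  [opts.input, opts.output] = positional;
  return opts;
}
```

In a script this would typically be called as `parseArgs(process.argv.slice(2))`.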

Examples

# Single-page full-height (atypica report style)
node html-to-pdf.js report.html report.pdf

# A4 paginated document
node html-to-pdf.js document.html document.pdf --paginated

# A4 with page numbers
node html-to-pdf.js document.html document.pdf --paginated --header-footer

# Narrower single-page layout
node html-to-pdf.js report.html report.pdf --width 1280

# Wait 2 s for JavaScript-rendered charts
node html-to-pdf.js dashboard.html dashboard.pdf --wait 2000

How it works (mirrors atypica's browser service)

1. Launches headless Chromium via Puppeteer with sandbox disabled and CJK font hints enabled.
2. Loads the HTML from a file:// URL so relative assets (images, local CSS) resolve correctly.
3. Injects system-font CSS to ensure Chinese/Japanese/Korean characters render on any OS.
4. Single-page mode: measures document.body.scrollHeight, sets viewport to that height, and generates a single-page PDF at that exact size, with no clipping and no page breaks.
5. Paginated mode: injects @media print CSS for clean page-breaks, then generates a standard-format paginated PDF.
6. Writes the PDF buffer to the output path.
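
The steps above can be sketched roughly as follows. This is an approximation under stated assumptions, not the bundled script: the helper names (`toFileUrl`, `buildPdfOptions`, `renderPdf`), the footer template, and the margins are hypothetical, and the font-CSS and print-CSS injection steps are omitted for brevity. Only standard Puppeteer calls are used (`launch`, `newPage`, `setViewport`, `goto`, `evaluate`, `pdf`).

```javascript
const fs = require('node:fs/promises');
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Hypothetical helper: build a file:// URL so relative assets resolve
// against the HTML file's own directory.
function toFileUrl(htmlPath) {
  return pathToFileURL(path.resolve(htmlPath)).href;
}

// Hypothetical helper: build the options object handed to page.pdf().
function buildPdfOptions(opts, contentHeightPx) {
  if (opts.paginated) {
    const pdf = { format: opts.format || 'A4', printBackground: true };
    if (opts.headerFooter) {
      pdf.displayHeaderFooter = true;
      pdf.headerTemplate = '<span></span>'; // empty header, footer only
      pdf.footerTemplate =
        '<div style="font-size:9px;width:100%;text-align:center;">' +
        '<span class="pageNumber"></span> / <span class="totalPages"></span></div>';
      pdf.margin = { top: '15mm', bottom: '15mm' }; // assumed margins
    }
    return pdf;
  }
  // Single-page mode: one page sized to the measured content.
  return {
    width: `${opts.width}px`,
    height: `${Math.ceil(contentHeightPx)}px`,
    printBackground: true,
  };
}

async function renderPdf(opts) {
  const puppeteer = require('puppeteer'); // loaded lazily; requires `npm install`
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-dev-shm-usage'],
  });
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: opts.width, height: 900 });
    await page.goto(toFileUrl(opts.input), { waitUntil: 'networkidle0' });
    if (opts.wait > 0) {
      await new Promise((resolve) => setTimeout(resolve, opts.wait));
    }
    const height = await page.evaluate(() => document.body.scrollHeight);
    if (!opts.paginated) {
      await page.setViewport({ width: opts.width, height });
    }
    const buffer = await page.pdf(buildPdfOptions(opts, height));
    await fs.writeFile(opts.output, buffer);
  } finally {
    await browser.close();
  }
}
```

Keeping the option-building logic in a pure function makes the difference between the two modes easy to inspect without launching a browser.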

Handling common issues

| Problem | Fix |
| --- | --- |
| Chromium not found after `npm install puppeteer` | Run `npx puppeteer browsers install chrome` |
| Missing system fonts / boxes instead of CJK chars | The injected font CSS works for most cases; for guaranteed rendering install `fonts-noto-cjk` (Linux) or ensure macOS system fonts are accessible |
| JavaScript-rendered content missing | Add `--wait 2000` (or more) to let JS execute after load |
| Images not loading | Make sure image `src` paths are relative to the HTML file location |
| PDF cut off at bottom | The script auto-measures height; if content loads lazily, add `--wait <ms>` |
| `--no-sandbox` error in strict container | Chromium usually needs `--no-sandbox` when running as root in Docker/CI; the script already sets this flag |

Dependency notes

Node.js ≥ 18 required (≥ 20 recommended)
puppeteer is the only npm dependency — it self-contains Chromium
No global Chrome installation needed
Works on macOS, Linux, and Windows (WSL)
The script already passes --disable-dev-shm-usage, which avoids Chromium crashes caused by a small /dev/shm in CI/Docker

Usage Guidance

This skill appears to do what it says. Before running: (1) Be aware npm install puppeteer will download many packages and a ~170 MB Chromium binary; ensure you have bandwidth/disk space. (2) Rendering may cause Chromium to fetch external assets (Google Fonts, CDNs, remote images referenced in the HTML) — if the HTML contains URLs to private services, those hosts will see requests (possible data leakage). (3) The script runs Chromium with --no-sandbox (often required in CI/Docker); for untrusted HTML run it in an isolated container or VM. (4) Requires Node ≥18; review the HTML you convert if it contains sensitive data or external references.
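
For point (2), one possible mitigation is to block every request that is not local while rendering. The sketch below is an assumption layered on top of the skill, not something the bundled script is stated to do: `isLocalResource` and `blockRemoteRequests` are hypothetical names, while `page.setRequestInterception`, `request.url()`, `request.continue()`, and `request.abort()` are standard Puppeteer APIs.

```javascript
// Hypothetical policy: allow only local files and inline resources.
function isLocalResource(url) {
  try {
    const { protocol } = new URL(url);
    return protocol === 'file:' || protocol === 'data:' || protocol === 'about:';
  } catch {
    return false; // unparsable URLs are treated as remote and blocked
  }
}

// Install the policy on a Puppeteer page before calling page.goto().
async function blockRemoteRequests(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (isLocalResource(request.url())) {
      request.continue();
    } else {
      request.abort();
    }
  });
}
```

The trade-off is fidelity: HTML that relies on Google Fonts, a CDN stylesheet, or remote images will render without them, so this suits sensitive documents better than design-heavy reports.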

Capability Analysis

Type: OpenClaw Skill
Name: html2pdf
Version: 1.0.0

The skill provides HTML-to-PDF conversion using Puppeteer in `scripts/html-to-pdf.js`, but it employs several high-risk configurations. Specifically, it disables the browser sandbox (`--no-sandbox`), uses the `file://` protocol to load local content, and lacks input sanitization for the file paths provided as arguments, which could lead to arbitrary file read/write vulnerabilities. While these behaviors are plausibly needed for the stated purpose and are documented in `SKILL.md`, they represent a significant attack surface and meet the criteria for suspicious classification due to the inherent security risks.
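
The missing path sanitization noted above could be addressed by confining both arguments to a working directory and checking their extensions before the script touches them. This is a minimal sketch of that idea, not part of the skill: `resolveInside` is a hypothetical helper, and it does not follow symlinks (a stricter version might compare `fs.realpath` results).

```javascript
const path = require('node:path');

// Hypothetical guard: resolve userPath against baseDir and reject anything
// that escapes baseDir or has an unexpected extension.
function resolveInside(baseDir, userPath, allowedExts) {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, userPath);
  const rel = path.relative(base, resolved);

  const escapes =
    rel === '' ||
    rel === '..' ||
    rel.startsWith('..' + path.sep) ||
    path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path is outside the working directory: ${userPath}`);
  }

  const ext = path.extname(resolved).toLowerCase();
  if (!allowedExts.includes(ext)) {
    throw new Error(`Unexpected file extension "${ext}" for ${userPath}`);
  }
  return resolved;
}
```

A caller might validate the input with `resolveInside(process.cwd(), input, ['.html', '.htm'])` and the output with `resolveInside(process.cwd(), output, ['.pdf'])` before launching the browser.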

Capability Assessment

✓ Purpose & Capability

Name/description (html-to-pdf via headless Chrome) match the provided files and instructions. The included script implements the stated functionality and there are no unrelated credentials, binaries, or config paths requested.

ℹ Instruction Scope

Instructions and the script operate on local HTML files (read, patch, write temp file, produce PDF) as described. The script intentionally fetches external resources (Google Fonts, Tailwind CDN, remote images/CSS referenced by the HTML) when rendering; this means Chromium will perform outgoing network requests to those hosts. The script also launches Chromium with --no-sandbox (documented in SKILL.md), which is commonly necessary in containers but reduces sandboxing. These behaviors are expected for accurate rendering but are worth noting as they cause network traffic and reduce process isolation.

ℹ Install Mechanism

This is an instruction-only skill (no registry install). The recommended install is npm install (puppeteer) which will download many npm packages and a pinned Chromium binary (~170 MB). The packages come from the npm registry (package.json/package-lock.json present); there are no downloads from obscure personal servers in the provided files. Installing will write dependencies and a large browser binary to disk.

✓ Credentials

No environment variables, credentials, or external config paths are required or requested. The script only uses local filesystem access to read input and write output (intended behavior).

✓ Persistence & Privilege

The skill does not request persistent or elevated platform privileges, does not set always:true, and does not modify other skills or system-wide agent settings. It writes a short-lived temporary file next to the input HTML and deletes it on exit.

Version History

v1.0.0

- Major update: Rewritten skill interface and README for a simplified, robust HTML-to-PDF export exactly as used in atypica AI research reports.
- Provides single-page (poster style, full height) and paginated (A4/Letter) PDF modes with clear CLI options.
- Adds concise usage instructions: one-step local install with bundled scripts and npm install.
- Clarifies command-line options and troubleshooting for JS-rendered content, images, and fonts.
- Removes former reference docs; workflow is now fully explained in the main documentation.
- Scripts and package files are now formally part of the skill for easy setup.

Metadata

Slug html2pdf

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is html-to-pdf?

html-to-pdf converts an HTML file to a PDF using headless Chrome (Puppeteer), the same approach atypica uses for its AI-generated research reports. It is an AI Agent Skill for Claude Code / OpenClaw, with 107 downloads so far.

How do I install html-to-pdf?

Run "/install html2pdf" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is html-to-pdf free?

Yes, html-to-pdf is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does html-to-pdf support?

html-to-pdf is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created html-to-pdf?

It is built and maintained by owenrao (@owenrao); the current version is v1.0.0.
