
pdf2ofd

Name: pdf2ofd
Author: xzw

By xzw · v1.0.2 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 238

Current installs: 0

Versions: 3

Install in OpenClaw

/install pdf2ofd

Description

Converts PDF documents (invoices, reports) to High-Fidelity OFD format with pixel-perfect precision.

Usage (SKILL.md)

PDF to OFD High-Fidelity Converter

🎯 Purpose

A specialized skill for converting PDF documents into the Chinese National Standard OFD (GB/T 33190-2016) format. Optimized for Electronic Invoices (OFD版式发票) with advanced rendering capabilities that exceed standard conversion libraries.

✨ Key Features

High-Fidelity Text Placement: Uses character-level positioning (DeltaX arrays) and baseline origin data extracted via rawdict to ensure text layout is 100% identical to the source PDF.
Advanced Vector Graphics: Directly extracts original stroke colors, fill colors, and line widths. Supports complex path types and fill instructions.
Transparency Preservation: Fully supports Alpha and FillOpacity for vector paths and SMask transparency for images (e.g., electronic seals and signatures).
Cross-Platform Font Mapping: Intelligent mapping of macOS-specific (STSong, STKaiti) and Windows-specific font names to standardized OFD font names (宋体, 楷体, 黑体).
In-Memory Packaging: Generates the final OFD zip structure entirely in memory to avoid temporary file clutter and ensure security.
Color Snapping: Heuristic "Invoice Red" correction (128 0 0) for financial documents while preserving non-standard colors.
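Two of the features above lend themselves to a short illustration. The following is a minimal sketch, not the skill's actual code. It assumes the x coordinates of character origins have already been read (for example from the `origin` field of each character in PyMuPDF's `rawdict` output), and derives the per-glyph advances that an OFD `DeltaX` array would carry. The helper names `delta_x` and `snap_invoice_red`, and the `tolerance` value, are hypothetical.

```python
from typing import List, Sequence, Tuple

# Target color for the heuristic, as stated in the feature list: 128 0 0.
INVOICE_RED: Tuple[int, int, int] = (128, 0, 0)


def delta_x(origins_x: Sequence[float]) -> List[float]:
    """Return the horizontal advance between each pair of consecutive
    character origins. N characters yield N-1 advances."""
    return [round(b - a, 3) for a, b in zip(origins_x, origins_x[1:])]


def snap_invoice_red(rgb: Tuple[int, int, int],
                     tolerance: int = 24) -> Tuple[int, int, int]:
    """Snap a color onto INVOICE_RED when every channel is within
    `tolerance` of it; otherwise return the color unchanged.
    The tolerance is an illustrative assumption."""
    if all(abs(c - t) <= tolerance for c, t in zip(rgb, INVOICE_RED)):
        return INVOICE_RED
    return rgb
```

For instance, `delta_x([10.0, 16.5, 23.0, 30.0])` gives the three advances between four glyphs, and a pure blue such as `(0, 0, 255)` passes through `snap_invoice_red` untouched.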

🛠️ Usage Instructions

When a user asks to convert a PDF or a "High-Fidelity" invoice to OFD:

Direct Execution:

python3 pdf2ofd.py <input_path.pdf> [output_path.ofd]

Plugin Integration: The script implements a PDF2OFDConverter class that can be easily imported and used in other Python workflows.

Example Output

Success: /path/to/invoice.ofd
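When calling the script from another Python program, the command line above can be wrapped with `subprocess`. This is a sketch under two assumptions not confirmed by the source: that `pdf2ofd.py` is in the working directory, and that it prints the `Success: <path>` line to standard output and exits with status 0. The function names are hypothetical.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(input_pdf: str, output_ofd: Optional[str] = None,
                  script: str = "pdf2ofd.py") -> List[str]:
    """Build the documented command line; the output path is optional."""
    cmd = [sys.executable, script, input_pdf]
    if output_ofd:
        cmd.append(output_ofd)
    return cmd


def parse_success(stdout: str) -> Optional[str]:
    """Extract the path from a 'Success: <path>' line, if one is present."""
    for line in stdout.splitlines():
        if line.startswith("Success:"):
            return line[len("Success:"):].strip()
    return None


def convert(input_pdf: str, output_ofd: Optional[str] = None) -> Optional[str]:
    """Run the converter and return the reported output path.
    Raises CalledProcessError if the script exits with a non-zero status."""
    result = subprocess.run(build_command(input_pdf, output_ofd),
                            capture_output=True, text=True, check=True)
    return parse_success(result.stdout)
```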

📦 Requirements

Dependencies required in the environment:

PyMuPDF (fitz): For advanced PDF parsing and raw character data extraction.
Pillow: For image processing and transparency handling.
easyofd: The base library for OFD structure (extended via internal monkey patches).
xmltodict: For XML manipulation.
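Before running the converter, it may help to confirm that the four packages are importable. The sketch below uses only the standard library. The import names (`fitz` for PyMuPDF, `PIL` for Pillow) follow those packages' usual conventions, and the import name `easyofd` is assumed to match the package name.

```python
from importlib.util import find_spec
from typing import Dict

# Distribution name -> import name (the easyofd import name is an assumption).
REQUIRED_MODULES: Dict[str, str] = {
    "PyMuPDF": "fitz",
    "Pillow": "PIL",
    "easyofd": "easyofd",
    "xmltodict": "xmltodict",
}


def check_dependencies() -> Dict[str, bool]:
    """Report, per package, whether its module can be located
    without actually importing it."""
    return {pkg: find_spec(mod) is not None
            for pkg, mod in REQUIRED_MODULES.items()}


if __name__ == "__main__":
    for package, present in check_dependencies().items():
        print(f"{package}: {'ok' if present else 'MISSING'}")
```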

💡 Notes

This skill uses deep monkey-patching on easyofd to fix known library limitations regarding character positioning and resource ID tracking.
The conversion process assumes standard Chinese fonts (SimSun, KaiTi, SimHei) are available on the viewing system.
Zero-copy resource handling: Images are extracted and re-compressed as PNG/JPG only when necessary to preserve quality.
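The font assumption in the notes ties in with the cross-platform mapping listed under Key Features. A minimal sketch of such a mapping follows. The table contents, the fallback to 宋体, and the helper name `map_font_name` are illustrative assumptions rather than the skill's actual table. The only PDF convention relied on is the subset prefix of six letters followed by `+`.

```python
from typing import Dict

# Lower-cased name prefix -> standardized OFD font name (illustrative only).
FONT_MAP: Dict[str, str] = {
    "stsong": "宋体",
    "simsun": "宋体",
    "stkaiti": "楷体",
    "kaiti": "楷体",
    "stheiti": "黑体",
    "simhei": "黑体",
}


def map_font_name(pdf_font_name: str, default: str = "宋体") -> str:
    """Map a PDF font name to a standardized OFD font name."""
    # Drop a subset prefix such as "ABCDEF+" if present.
    base = pdf_font_name.split("+", 1)[-1]
    # Drop style suffixes such as "-Light" or ",Bold".
    key = base.split("-", 1)[0].split(",", 1)[0].strip().lower()
    for prefix, target in FONT_MAP.items():
        if key.startswith(prefix):
            return target
    return default
```

With this table, `ABCDEF+STSong-Light` and `SimSun` both resolve to 宋体, while an unknown name such as `Helvetica` falls back to the default.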

Security Recommendations

This skill appears to do what it says: convert PDFs to OFD using PyMuPDF, Pillow, and easyofd. Before installing or running:

- Install dependencies in an isolated environment (virtualenv, venv, or container) and pin package versions from a trusted source.
- Review the remainder of pdf2ofd.py (the file was truncated in the provided excerpt) for any network/socket usage or subprocess calls before running it on sensitive documents.
- Consider changing uuid1 usage to uuid4 if you will share produced OFD files and want to avoid embedding a host MAC/time-based identifier.
- Because the skill monkey-patches easyofd, test conversion results and failure modes on non-sensitive sample files to ensure the patches behave correctly with the specific easyofd version you install.
- If you require stronger assurance, request the upstream source repository or an author/homepage so you can verify version history and maintainers.
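The uuid1/uuid4 recommendation can be illustrated with the standard `uuid` module. A version 1 UUID encodes a timestamp and a 48-bit node value (normally derived from a network interface's MAC address), while a version 4 UUID is random. The helper `new_doc_id` is hypothetical and is not taken from pdf2ofd.py.

```python
import uuid


def new_doc_id() -> str:
    """Return a random (version 4) identifier as 32 hex characters.
    Unlike uuid1, it carries no host- or time-derived fields."""
    return uuid.uuid4().hex


if __name__ == "__main__":
    u1 = uuid.uuid1()
    # The node field of a version 1 UUID is usually the host's MAC address.
    print("uuid1 node:", format(u1.node, "012x"), "version:", u1.version)
    print("uuid4 id:  ", new_doc_id())
```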

Functional Analysis

Type: OpenClaw Skill
Name: pdf2ofd
Version: 1.0.2

The skill implements a PDF to OFD (Chinese National Standard) converter by extending the 'easyofd' library through monkey-patching. The code focuses on high-fidelity rendering of text, images, and vector graphics using PyMuPDF (fitz) and Pillow. No indicators of data exfiltration, malicious execution, or prompt injection were found; all operations are local and consistent with the stated document conversion purpose in pdf2ofd.py and SKILL.md.

Capability Assessment

✓ Purpose & Capability

Name/description, SKILL.md, requirements.txt, and pdf2ofd.py align: PDF parsing (PyMuPDF), image handling (Pillow), OFD generation (easyofd + xmltodict) and monkey-patching of easyofd are coherent for a high-fidelity converter.

ℹ Instruction Scope

SKILL.md instructs only to run the Python converter or import its class; the script's logic operates on the provided PDF bytes and builds OFD in memory. Minor note: the code imports uuid1 (which embeds host MAC/time) — if those UUIDs are written into output artifacts they could leak a host identifier when the OFD is shared; consider using uuid4 or another non-MAC-based ID if privacy is a concern. No evidence in the shown code of reading unrelated system files, environment variables, or contacting external endpoints.
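Building the package in memory, as described above, can be done with the standard `io` and `zipfile` modules, since an OFD file is a ZIP container. This is a generic sketch rather than the skill's own packaging code. The entry names in the usage note follow the customary OFD layout, and the byte contents are placeholders.

```python
import io
import zipfile
from typing import Dict


def pack_in_memory(files: Dict[str, bytes]) -> bytes:
    """Zip the given {archive path: content} mapping without touching disk
    and return the archive as bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    # The archive is complete once the context manager has closed it.
    return buffer.getvalue()
```

A caller could pass something like `{"OFD.xml": b"<ofd/>", "Doc_0/Document.xml": b"<doc/>"}` and write the returned bytes to the final `.ofd` path in a single step.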

ℹ Install Mechanism

No install spec is provided (instruction-only with bundled source). This is low-risk for automatic install, but the skill requires several Python packages (requirements.txt). Users should install dependencies in a controlled Python environment (virtualenv/container) and pin versions before installing.

✓ Credentials

The skill requests no environment variables or credentials. Required runtime packages match the stated task; there are no unrelated secret accesses or config path requirements.

✓ Persistence & Privilege

Skill is not always-enabled and does not request elevated/persistent platform privileges. It monkey-patches the local easyofd library at runtime (explained in SKILL.md) but does not modify other skills or system-wide configuration files in the provided code.
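For readers unfamiliar with runtime monkey-patching, the sketch below shows the general technique on a stand-in class. `ThirdPartyWriter`, `render_text`, and the `delta_x` parameter are all hypothetical. They do not represent the easyofd API or the patches that pdf2ofd.py applies.

```python
import functools


class ThirdPartyWriter:
    """Stand-in for a library class; NOT the real easyofd API."""

    def render_text(self, text: str) -> dict:
        return {"text": text}


def patch_render_text(cls) -> None:
    """Replace cls.render_text with a wrapper that accepts an extra,
    optional delta_x argument. Safe to call more than once."""
    original = cls.render_text
    if getattr(original, "_patched", False):
        return  # already patched; do not wrap twice

    @functools.wraps(original)
    def patched(self, text, delta_x=None):
        result = original(self, text)
        if delta_x is not None:
            result["delta_x"] = list(delta_x)
        return result

    patched._patched = True
    cls.render_text = patched


patch_render_text(ThirdPartyWriter)
```

The patch affects only the class object in the running process, which matches the assessment above that nothing outside the local library is modified. It is also why results can vary with the installed library version.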

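How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install pdf2ofd
Once installed, invoke the Skill by name or trigger it with /pdf2ofd
Provide the required inputs per the Skill's parameter description to receive structured output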

Version History

v1.0.2

Version 1.1.0 (High-Fidelity Update)

- Greatly improved text matching with precise character-level placement and accurate baseline extraction.
- Enhanced preservation of original vector graphics, including exact stroke and fill colors, complex paths, and line styles.
- Full support for image and vector transparency, including alpha channels and SMask overlays.
- Intelligent, cross-platform mapping of Windows and macOS font names to OFD standard fonts.
- Now generates OFD output entirely in memory, avoiding temporary files for better performance and security.
- Added advanced color snapping for financial documents to ensure compliance with "Invoice Red" standards.

v1.0.1

- Initial release of the pdf2ofd skill.
- Converts PDF files, especially electronic invoices, to the OFD format with accurate rendering.
- Handles invoice stamp transparency, consistent dark-red border coloring, and path correction for proper OFD viewing.
- Provides command-line usage instructions and font requirements for best results.

v1.0.0

Initial release of pdf2ofd skill.

- Converts PDF documents, especially Chinese Electronic Invoices, to the Chinese National Standard OFD format.
- Accurately extracts stamp alpha-masks and applies solid dark-red (`128 0 0`) mapping to graphics elements.
- Handles complex layout issues like path closure and vector graphic transformations to ensure viewer compatibility.
- Includes CLI script for straightforward PDF-to-OFD conversion.
- Provides guidance on required dependencies and font setup.

Metadata

Slug pdf2ofd

Version 1.0.2

License MIT-0

Total installs 0

Current installs 0

Version count 3

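FAQ

What is pdf2ofd?

Converts PDF documents (invoices, reports) to High-Fidelity OFD format with pixel-perfect precision. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 238 times to date.

How do I install pdf2ofd?

Run the command "/install pdf2ofd" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is pdf2ofd free?

Yes. pdf2ofd is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.

Which platforms does pdf2ofd support?

pdf2ofd is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed pdf2ofd?

It is developed and maintained by xzw (@xzw); the current version is v1.0.2.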