ofd-text-extractor
Author: liuwei19820201 · GitHub · v1.0.1 · MIT-0
Total downloads: 101
Favorites: 0
Current installs: 0
Versions: 2
Install in OpenClaw
/install ofd-text-extractor
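Description
This skill extracts text content from OFD files while preserving position information. Use it when analyzing the contents of an OFD invoice, extracting information from a specific position in an OFD file, or inspecting the detailed structure of an OFD file.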
Safe-use recommendations
What to check before installing/using:
- Functional checks: The included Python script appears safe and runs locally (no network/credential access). Run it on a sample OFD to confirm it produces the JSON fields you need. Note: the script writes pages with separate pageTexts and templateTexts fields, whereas SKILL.md's sample JSON shows a combined texts array with isTemplate flags — adapt your consumers accordingly.
- Missing file: SKILL.md shows a PowerShell wrapper scripts/extract_ofd.ps1, but that file is not included. Use python scripts/extract_ofd.py ... directly or create your own wrapper.
- Robustness: the script uses regular expressions to parse XML in places (fragile for edge cases). If you rely on exact extraction (invoices, automated pipelines), test with representative OFD files and consider improving XML parsing for complex inputs.
- Safety: there is no obvious exfiltration (no network/socket modules, no external calls). Still review or run in an isolated environment if processing untrusted files, and verify outputs before feeding results into downstream automated systems.
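The schema mismatch noted above (separate pageTexts and templateTexts versus a combined texts array with isTemplate flags) can be bridged on the consumer side. The following is a minimal sketch, not the author's code: it assumes the script's JSON has a top-level `pages` list and that each text item is a plain object. Only the field names `pageTexts`, `templateTexts`, `texts`, and `isTemplate` come from the review; the `text`, `x`, and `y` keys in the sample are hypothetical.

```python
import json


def merge_page_texts(page: dict) -> dict:
    """Return a copy of one page with a single combined `texts` list."""
    merged = []
    # Template items first, then page items; the ordering is an arbitrary choice.
    for item in page.get("templateTexts", []):
        merged.append({**item, "isTemplate": True})
    for item in page.get("pageTexts", []):
        merged.append({**item, "isTemplate": False})
    # Keep every other page field untouched.
    out = {k: v for k, v in page.items() if k not in ("pageTexts", "templateTexts")}
    out["texts"] = merged
    return out


def normalize(doc: dict) -> dict:
    """Convert every page of the (assumed) script output to the combined shape."""
    return {**doc, "pages": [merge_page_texts(p) for p in doc.get("pages", [])]}


# Hypothetical sample shaped like the script output described in the review.
sample = {
    "pages": [
        {
            "pageTexts": [{"text": "Total", "x": 10.0, "y": 20.0}],
            "templateTexts": [{"text": "Invoice", "x": 5.0, "y": 5.0}],
        }
    ]
}
result = normalize(sample)
print(json.dumps(result, ensure_ascii=False, indent=2))
```

Doing the conversion in the consumer leaves the packaged script unmodified, which may be preferable if you expect to pick up later versions of the skill.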
Functional analysis
Type: OpenClaw Skill
Name: ofd-text-extractor
Version: 1.0.1
The ofd-text-extractor skill bundle is a legitimate tool for parsing OFD (Open Fixed-layout Document) files to extract text and coordinate data. The core logic in scripts/extract_ofd.py uses standard Python libraries (zipfile, re, json) to process the document structure without any network activity, unauthorized file access, or obfuscated code. The SKILL.md instructions are well-aligned with the stated purpose and do not contain any prompt-injection attempts or malicious directives.
Capability assessment
Purpose & Capability
Name/description (extract text+positions from OFD) matches the included Python script: it reads a local .ofd (ZIP), parses XML/Content.xml and template pages, and computes character positions. No unrelated binaries, credentials, or services are requested.
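As a rough illustration of the flow described here (open the .ofd as a ZIP, read the page XML, collect text with coordinates), the sketch below uses `xml.etree.ElementTree` rather than regular expressions. It is not the packaged script. It assumes page content lives in archive members whose names end in `Content.xml` and that text is carried by `TextCode` elements with `X`, `Y`, and optional `DeltaX` attributes, as in the OFD specification; elements are matched by local name so the namespace prefix does not matter. The in-memory sample archive is invented for the demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def iter_text_codes(ofd_source):
    """Yield one dict per TextCode element found in any *Content.xml member."""
    with zipfile.ZipFile(ofd_source) as archive:
        for name in archive.namelist():
            if not name.endswith("Content.xml"):
                continue
            root = ET.fromstring(archive.read(name))
            for elem in root.iter():
                if local_name(elem.tag) == "TextCode":
                    yield {
                        "part": name,
                        "text": elem.text or "",
                        "x": float(elem.get("X", "0")),
                        "y": float(elem.get("Y", "0")),
                        "deltaX": elem.get("DeltaX"),  # None when absent
                    }


# Hypothetical one-page archive built in memory for the demonstration.
SAMPLE = (
    '<ofd:Page xmlns:ofd="http://www.ofdspec.org/2016"><ofd:Content>'
    '<ofd:Layer ID="1"><ofd:TextObject ID="2" Boundary="10 20 50 5">'
    '<ofd:TextCode X="0" Y="4" DeltaX="3 3">ABC</ofd:TextCode>'
    "</ofd:TextObject></ofd:Layer></ofd:Content></ofd:Page>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Doc_0/Pages/Page_0/Content.xml", SAMPLE)
buffer.seek(0)

items = list(iter_text_codes(buffer))
print(items)
```

Note that Python's standard-library XML parsers are not hardened against maliciously constructed input, so the advice about isolating untrusted files still applies to a parser-based approach.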
Instruction Scope
SKILL.md instructs running scripts and describes an output JSON schema that differs from what extract_ofd.py actually writes. SKILL.md examples also show a PowerShell wrapper (scripts/extract_ofd.ps1) which is not present in the package. These mismatches could lead to broken automation or unexpected outputs.
Install Mechanism
No install spec; runtime is an included Python script with only standard-library imports. No external downloads or package installs are requested.
Credentials
The skill declares no environment variables, credentials, or config paths. The script operates on a user-supplied local OFD file only.
Persistence & Privilege
Skill does not request 'always' or any elevated/persistent privileges. It does not modify other skills or system config. Autonomous invocation is allowed (platform default) but not combined with other concerning requests.
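How to use
- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install ofd-text-extractor
- After installation, call the Skill by name or trigger it with /ofd-text-extractor
- Provide the required inputs according to the Skill's parameter description to get structured output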
Version history
v1.0.1
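- The main extraction script switched from a PowerShell script (extract_ofd.ps1) to a Python script (extract_ofd.py), improving cross-platform compatibility.
- Script usage and parameter documentation were updated to Python syntax, for example --output and --show-chars.
- Other features and usage instructions are unchanged.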
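Since the package no longer ships a wrapper, a small Python launcher is one way to call the script from other tooling. This is a hedged sketch only: the `--output` and `--show-chars` flags are the ones named in this changelog, but passing the OFD path as the first positional argument, `--output` taking a file path, and `--show-chars` being a plain switch are all assumptions to verify against SKILL.md. The function names are hypothetical.

```python
import subprocess
import sys


def build_command(ofd_path, output_path=None, show_chars=False,
                  script="scripts/extract_ofd.py"):
    """Build the argument list for the extractor (argument layout is assumed)."""
    command = [sys.executable, script, str(ofd_path)]  # positional input: assumption
    if output_path is not None:
        command += ["--output", str(output_path)]
    if show_chars:
        command.append("--show-chars")
    return command


def run_extractor(ofd_path, output_path=None, show_chars=False):
    """Run the extractor and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(ofd_path, output_path, show_chars), check=True)


# Only the command is printed here; nothing is executed.
print(build_command("invoice.ofd", "invoice.json", show_chars=True))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.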
v1.0.0
Initial release of ofd-text-extractor skill.
- Extracts text content from OFD files with precise position information.
- Supports template page extraction (Form-like pages).
- Calculates character-level positions, including support for DeltaX spacing.
- Outputs structured JSON, with optional character position details using the -ShowCharacters flag.
- Merges template and page content, marking sources with isTemplate field.
- Provides PowerShell extraction script with options for basic, positional, and JSON output.
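The character-level positioning with DeltaX mentioned in this list can be sketched as follows. This is an illustration under stated assumptions, not the skill's implementation: DeltaX is taken to be a space-separated list of offsets between consecutive characters, with `g n v` as shorthand for the value `v` repeated `n` times, as in the OFD specification. Reusing the last offset when the list runs short, and not advancing at all when DeltaX is absent, are assumptions made here for simplicity.

```python
def expand_deltas(delta: str) -> list:
    """Expand a DeltaX string such as 'g 2 3 5' into [3.0, 3.0, 5.0]."""
    tokens = delta.split()
    values = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "g":
            # 'g n v' means: repeat the offset v, n times.
            count = int(tokens[i + 1])
            values.extend([float(tokens[i + 2])] * count)
            i += 3
        else:
            values.append(float(tokens[i]))
            i += 1
    return values


def char_positions(text: str, x: float, y: float, delta_x=None) -> list:
    """Return one {'char', 'x', 'y'} dict per character, advancing x by DeltaX."""
    deltas = expand_deltas(delta_x) if delta_x else []
    positions = []
    current = x
    for index, char in enumerate(text):
        positions.append({"char": char, "x": current, "y": y})
        if index < len(text) - 1:
            if index < len(deltas):
                step = deltas[index]
            else:
                # Assumption: reuse the last offset, or stay put if there is none.
                step = deltas[-1] if deltas else 0.0
            current += step
    return positions


positions = char_positions("ABCD", 10.0, 4.0, "g 2 3 5")
print(positions)
```

The coordinates stay in whatever space the inputs use; mapping them to page coordinates (for example by adding a text object's boundary origin) is a separate step not shown here.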
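FAQ
What is ofd-text-extractor?
This skill extracts text content from OFD files while preserving position information. Use it when analyzing the contents of an OFD invoice, extracting information from a specific position in an OFD file, or inspecting the detailed structure of an OFD file. It is an AI Agent Skill plugin for Claude Code / OpenClaw and has been downloaded 101 times so far.
How do I install ofd-text-extractor?
Run the command "/install ofd-text-extractor" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.
Is ofd-text-extractor free?
Yes. ofd-text-extractor is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does ofd-text-extractor support?
ofd-text-extractor is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.
Who developed ofd-text-extractor?
It is developed and maintained by liuwei19820201 (@liuwei19820201); the current version is v1.0.1.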