Downloads: 1286
Stars: 0
Active Installs: 10
Versions: 2
Install in OpenClaw
/install document-parser
Description
Extract structured data from PDFs, images, and Word files with layout analysis, table recognition, OCR, seal detection, and directory extraction.
README (SKILL.md)
document-parser
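A high-accuracy document parsing skill that extracts structured data from PDFs, images, and Word documents.
Purpose
- Parse PDFs, images (JPG/PNG), and Word documents
- Layout analysis and structure extraction
- Table recognition (HTML/Markdown output)
- OCR text recognition
- Seal detection
- Table-of-contents extraction
Commands
Parse a document
document-parser parse <file path> [options]
Examples: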
document-parser parse C:\docs\report.pdf
document-parser parse C:\docs\scan.jpg --layout --table
document-parser parse C:\docs\contract.docx --output markdown
Check task status
document-parser status <task ID>
Parameters
| Parameter | Description | Example |
|---|---|---|
| file path | Path to a PDF, image, or Word file | C:\docs\report.pdf |
| --layout | Enable layout analysis | --layout |
| --table | Enable table recognition | --table |
| --seal | Enable seal detection | --seal |
| --output | Output format (json/markdown/both) | --output markdown |
| --pages | Page range | --pages 1-5,8,10-12 |
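The `--pages` value combines single pages and inclusive ranges separated by commas. The following is a minimal sketch of how such a value could be expanded on the client side. The function name `parse_page_range` is hypothetical, and both the 1-based numbering and the client-side expansion are assumptions; the README does not say whether the skill or the remote service interprets the range.

```python
def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as "1-5,8,10-12" into a sorted list of page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1,,3"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    # Assumption: pages are numbered from 1, as in the README example.
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are assumed to start at 1")
    return sorted(pages)
```

Called with the example from the table, `parse_page_range("1-5,8,10-12")` yields pages 1 through 5, 8, and 10 through 12.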
Configuration
Option 1: environment variables
DOCUMENT_PARSER_API_KEY=your_api_key
DOCUMENT_PARSER_BASE_URL=http://47.111.146.164:8088/taidp/v1/idp/general_parse
Option 2: configuration file
Create config.json in the skill directory:
{
"api_key": "your_api_key",
"base_url": "http://47.111.146.164:8088/taidp/v1/idp/general_parse"
}
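The two configuration methods can be combined in a small loader. The sketch below is illustrative and is not taken from the skill's index.py: the function name `load_settings` is hypothetical, and the precedence (environment variables override config.json) is an assumption, since the README does not state which source wins.

```python
import json
import os
from pathlib import Path


def load_settings(config_path: str = "config.json", env=None) -> dict:
    """Merge config.json and environment variables into one settings dict."""
    env = os.environ if env is None else env
    settings = {"api_key": None, "base_url": None}

    # Read config.json if it exists; missing files are simply skipped.
    path = Path(config_path)
    if path.is_file():
        with path.open(encoding="utf-8") as handle:
            file_values = json.load(handle)
        for key in settings:
            if file_values.get(key):
                settings[key] = file_values[key]

    # Assumption: environment variables take precedence over the file.
    if env.get("DOCUMENT_PARSER_API_KEY"):
        settings["api_key"] = env["DOCUMENT_PARSER_API_KEY"]
    if env.get("DOCUMENT_PARSER_BASE_URL"):
        settings["base_url"] = env["DOCUMENT_PARSER_BASE_URL"]
    return settings
```

Leaving `base_url` as `None` when nothing is configured, instead of falling back to a built-in address, forces the caller to choose an endpoint explicitly.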
Output format
Returns structured JSON containing:
- pages: array of parsed pages
- elements: layout elements (text, tables, images, etc.)
- markdown: text in Markdown format
- data: summary statistics
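A caller might summarize a result with these four keys as sketched below. Only the key names come from the README; everything about their shape is an assumption made for illustration, in particular that all four are top-level keys, that `elements` is a list of objects, and that each element carries a `type` field. The function name `summarize_result` is hypothetical.

```python
from collections import Counter


def summarize_result(result: dict) -> dict:
    """Build a small overview of a parsed-document result."""
    pages = result.get("pages") or []
    elements = result.get("elements") or []
    # Assumption: each element is a dict with a "type" such as "text" or "table".
    type_counts = Counter(
        element.get("type", "unknown")
        for element in elements
        if isinstance(element, dict)
    )
    return {
        "page_count": len(pages),
        "element_types": dict(type_counts),
        "markdown_chars": len(result.get("markdown") or ""),
        "has_data_summary": bool(result.get("data")),
    }
```

Using `.get` with fallbacks keeps the sketch tolerant of missing keys, which matters because the actual response schema is not documented here.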
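Dependencies
- requests
- python-docx (Word support)
- Pillow (image processing)
Error codes
| Code | Message | Description |
|---|---|---|
| 10000 | Success | Recognition succeeded |
| 10001 | Missing parameter | A required parameter is missing |
| 10002 | Invalid parameter | A parameter is invalid |
| 10003 | Invalid file | The file format is invalid |
| 10004 | Failed to recognize | Recognition failed |
| 10005 | Internal error | Internal error |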
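The error table maps naturally onto a lookup that turns any non-success code into an exception. This is a sketch under assumptions: the names `DocumentParserError` and `check_response_code` are hypothetical, and because the README does not say which response field carries the code, the function takes the integer directly.

```python
# Codes and messages copied from the error table above.
ERROR_MESSAGES = {
    10000: "Success",
    10001: "Missing parameter",
    10002: "Invalid parameter",
    10003: "Invalid file",
    10004: "Failed to recognize",
    10005: "Internal error",
}
SUCCESS_CODE = 10000


class DocumentParserError(Exception):
    """Raised when the service reports a non-success code (hypothetical name)."""

    def __init__(self, code: int, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def check_response_code(code: int) -> None:
    """Return silently on success; raise DocumentParserError otherwise."""
    if code == SUCCESS_CODE:
        return
    message = ERROR_MESSAGES.get(code, "Unknown error code")
    raise DocumentParserError(code, message)
```

Codes outside the table are reported as unknown so that an undocumented code cannot be mistaken for success.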
Usage Guidance
This skill is functionally coherent but risky by default: if you run 'document-parser parse <file>' as-is, the file will be uploaded to the default server at 47.111.146.164 (the README and config example point to that IP). Before installing or using it, consider:
1. Do not upload sensitive documents to an unknown third party.
2. Prefer to set DOCUMENT_PARSER_BASE_URL to a trusted or self-hosted parser endpoint, or host the parsing service yourself.
3. If you must use this skill, require and provide an API key, and confirm the operator and trustworthiness of the endpoint.
4. Audit network traffic (or run in an isolated environment) to verify where files are sent.
5. If you don't have a trusted remote parser, avoid using the skill, or inspect and modify index.py to implement local processing instead.
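One way to act on this guidance is to vet the configured endpoint before any file leaves the machine. The sketch below flags the two properties called out in this review, plain HTTP and a bare IP address in place of a domain name. The function name `endpoint_warnings` is hypothetical and is not part of the skill; it uses only the Python standard library.

```python
import ipaddress
from urllib.parse import urlparse


def endpoint_warnings(base_url: str) -> list[str]:
    """Return human-readable warnings about a parser endpoint URL."""
    warnings = []
    parsed = urlparse(base_url)

    if parsed.scheme != "https":
        warnings.append("endpoint does not use HTTPS")

    host = parsed.hostname or ""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, so it is a hostname (or empty)
    else:
        warnings.append("endpoint is a bare IP address rather than a domain name")
    return warnings
```

For the default URL shown in the README, `endpoint_warnings` returns both warnings; a caller could refuse to upload whenever the list is non-empty.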
Capability Analysis
Type: OpenClaw Skill
Name: document-parser
Version: 1.0.1
The skill sends local documents (PDF, Word, images) to a hardcoded external IP address (47.111.146.164) over unencrypted HTTP for processing. While this behavior is consistent with the stated purpose of a document parser, the use of an insecure protocol and a specific IP address instead of a verified domain name poses a high risk of data exposure or exfiltration. These indicators are found in index.py, SKILL.md, and clawhub.yaml.
Capability Assessment
Purpose & Capability
The code and documentation consistently implement a remote-document-parser client (PDF/image/Word parsing, OCR, table and seal detection). Using a remote API for heavy tasks like OCR/layout analysis is reasonable, so the capability aligns with the name/description. However, the packaged default base_url is an IP address (47.111.146.164) embedded in examples and defaults, which is unexpected for a generic skill and should be justified by the author.
Instruction Scope
Runtime instructions and the CLI cause the skill to read local files and POST their binary contents to a remote HTTP endpoint. The SKILL.md and config examples explicitly point to the same unknown IP. The skill will attempt uploads even without an API key (it logs a warning but proceeds), so users could inadvertently exfiltrate sensitive documents simply by running the default parse command.
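A fail-closed alternative to the warn-and-proceed behavior described above would be to stop before the upload when no key is configured. This is a minimal sketch of that idea, not the skill's actual code; the function name `require_api_key` is hypothetical.

```python
def require_api_key(api_key):
    """Return the trimmed key, or raise instead of uploading unauthenticated."""
    if api_key is None or not api_key.strip():
        raise RuntimeError(
            "refusing to upload: DOCUMENT_PARSER_API_KEY is not set"
        )
    return api_key.strip()
```

Calling this before reading the file ensures that a missing key ends the run before any document content is sent.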
Install Mechanism
This is instruction-only plus a Python script; there is no download-from-URL or post-install arbitrary code fetch. Dependencies are standard (requests, python-docx, Pillow) and listed in requirements.txt. No high-risk install behavior was found.
Credentials
The skill does not require environment variables but supports the optional DOCUMENT_PARSER_API_KEY and DOCUMENT_PARSER_BASE_URL. The problem is not the number of credentials requested; it is that the default configuration, README, and config.example all hardcode an explicit IP-based endpoint. Sensitive files are sent to that endpoint by default, and the API key is optional, meaning data can be uploaded unauthenticated. That is disproportionate for a drop-in skill where users may expect local processing or expect to configure their own server.
Persistence & Privilege
The package does not request always:true, does not modify other skills or system-wide settings, and only writes output files derived from user input to the current working directory. It does read a local config.json if present (expected). No elevated persistence or privilege escalation behavior observed.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install document-parser
- After installation, invoke the skill by name or use /document-parser
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.1
- Removed sample files: .gitignore and 使用手册_v1.1_parsed.json
- No changes to commands, features, configuration, or documentation content
v1.0.0
- Initial release of document-parser.
- Extracts structured data from PDF, images (JPG/PNG), and Word documents.
- Supports layout analysis, table recognition (HTML/Markdown output), OCR text extraction, seal detection, and catalog extraction.
- Provides flexible command-line options for parsing, status checks, and output formatting.
- Configuration supported via environment variables or config file.
- Returns structured JSON results, including pages, elements, markdown, and data summaries.
Frequently Asked Questions
What is document-parser?
document-parser extracts structured data from PDFs, images, and Word files with layout analysis, table recognition, OCR, seal detection, and directory extraction. It is an AI Agent Skill for Claude Code / OpenClaw, with 1286 downloads so far.
How do I install document-parser?
Run "/install document-parser" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is document-parser free?
Yes, document-parser is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does document-parser support?
document-parser is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created document-parser?
It is built and maintained by docpilot (@ankylala); the current version is v1.0.1.