Document Parse Ocr

Name: Document Parse Ocr
Author: scnet-sugon

by SCNet-sugon · GitHub ↗ · v1.0.0 · MIT-0

cross-platform ✓ Security Clean

Install in OpenClaw

/install document-parse-ocr

Description

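Asynchronous intelligent document recognition, suited to high-volume document processing. Submit a publicly accessible file URL; the service automatically recognizes structural information such as text, tables, and headings, and returns a download URL for the structured JSON result file.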

README (SKILL.md)

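# Sugon-Scnet Document Intelligence OCR Skill

This skill wraps the asynchronous API of the Scnet OCR document intelligence service. You submit a publicly accessible file URL, the service parses the document (text, tables, headings, and so on), and the skill polls until the recognition result is available.

### Features

- Asynchronous processing: suited to large batches of documents, with no long wait on a synchronous response
- Structured parsing: automatically recognizes paragraphs, headings, tables, charts, formulas, headers and footers, footnotes, seals, and other elements
- Result download: once a task succeeds, returns a temporary download URL for the result file (valid for 12 hours)
- Markdown output: automatically generates Markdown for each full page and maps image and seal paths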

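### Prerequisites

⚠️ Important: you need a Scnet API token before using this skill.

### Apply for an API token

1. Visit the Scnet website and register or log in
2. Request an API key in the console (format: sc-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx)
3. Copy the key for later use

### Configure the token

Manual configuration (recommended)
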
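1. Create a `config/.env` file in the skill directory with the following content:

```
# =====  Sugon-Scnet OCR API configuration =====
# Apply at: https://www.scnet.cn
SCNET_API_KEY=your_scnet_api_key_here

# API base URL (usually no need to change)
SCNET_API_BASE=https://api.scnet.cn/api/llm/v1
```

2. Fill in your key: SCNET_API_KEY=your key

3. Set the file permissions to 600 (readable and writable by the owner only)

**⚠️ Security warning**: never paste the API key directly into a chat conversation, where it may be logged or leaked.
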
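The configuration steps above can be sketched in Python. This is a minimal illustration, assuming the file contains simple `KEY=VALUE` lines; `load_env` and `is_owner_only` are hypothetical helper names, not the skill's actual code, and the permission check is only meaningful on POSIX systems.

```python
import os
import stat
from pathlib import Path


def load_env(path):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def is_owner_only(path):
    """Return True when group and others have no permission bits (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A caller could refuse to start, or print a warning, when `is_owner_only("config/.env")` returns `False`.
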
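### Updating the token

Once a token expires, calls return a 401 or 403 error. To update it, request a new token and replace the SCNET_API_KEY value in config/.env.

### Installing dependencies

This skill requires Python 3.6+ and the requests library. Run the following command:

```bash
pip install requests
```
---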
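### Usage

### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| ocrType | string | No | Recognition type. Currently only one value is supported:<br>• DOC_PARSING (default) |
| fileUrl | string | Yes | Publicly accessible download URL of the file to process (HTTP/HTTPS supported) |
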
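Since `fileUrl` must be an HTTP or HTTPS address, a caller may want to reject obvious local paths before submitting a task. The following is a minimal sketch; `looks_like_remote_url` is a hypothetical helper, and it only checks the URL's shape. It cannot confirm that the address is actually reachable from the public internet.

```python
from urllib.parse import urlparse


def looks_like_remote_url(file_url):
    """Return True for http(s) URLs that include a host name."""
    parsed = urlparse(file_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

For example, `looks_like_remote_url("/tmp/report.pdf")` is `False`, so the caller can ask the user to upload the file somewhere first.
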
### Command-line examples

```bash
python .claude/skills/document_parse_ocr/scripts/main.py DOC_PARSING "https://example.com/document.pdf"
```
If you omit ocrType, you can pass only fileUrl:
```bash
python scripts/main.py "https://example.com/document.pdf"
```

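One way a script could accept both call forms is to treat a single argument as `fileUrl` and fall back to the default type. This is a sketch under that assumption; `parse_cli_args` is a hypothetical name and the real `main.py` may parse its arguments differently.

```python
SUPPORTED_OCR_TYPES = ("DOC_PARSING",)
DEFAULT_OCR_TYPE = "DOC_PARSING"


def parse_cli_args(argv):
    """Return (ocr_type, file_url); argv excludes the script name."""
    if len(argv) == 1:
        ocr_type, file_url = DEFAULT_OCR_TYPE, argv[0]
    elif len(argv) == 2:
        ocr_type, file_url = argv
    else:
        raise SystemExit("usage: main.py [ocrType] fileUrl")
    if ocr_type not in SUPPORTED_OCR_TYPES:
        raise SystemExit("unsupported ocrType: %s" % ocr_type)
    return ocr_type, file_url
```

A script would typically call it as `parse_cli_args(sys.argv[1:])`.
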
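### Using the skill in an AI conversation

Users can say:

- "Help me parse this document: https://example.com/report.pdf"
- "Run OCR on this contract; the file is at https://..."

The AI triggers this skill automatically based on the keywords in the description.

### Recommendations for AI callers
Because tasks are processed asynchronously, the skill polls internally (waiting up to 10 minutes by default, configurable). Set a generous timeout on the call (for example 600 seconds) so that the command is not interrupted before polling finishes.
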
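The polling behaviour described above can be illustrated with a deadline-based loop. This is a minimal sketch, not the skill's implementation: `fetch_status` is a hypothetical callable that returns a dict with a `status` field, the 5-second interval is an assumed value, and the loop assumes `running` is the only non-terminal status.

```python
import time


def poll_until_done(fetch_status, timeout=600.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the task leaves 'running' or the deadline passes."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") != "running":
            return result  # terminal state, e.g. success or failed
        if clock() + interval > deadline:
            raise TimeoutError("task still running after %.0f seconds" % timeout)
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. An outer caller's timeout should be at least as long as `timeout` here, otherwise the command is cut off first.
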
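### Configuration options

Edit the `config/.env` file:

| Variable | Default | Description |
|----------|---------|-------------|
| SCNET_API_KEY | none (required) | Scnet API key |
| SCNET_API_BASE | https://api.scnet.cn/api/llm/v1 | API base URL (usually no need to change) |

### Output

- Standard output: the recognition result as JSON. On success, this is the parsed document content (the JSON object from the result file); on failure, an error message.
- Error messages: friendly hints prefixed with `错误:` ("Error:").
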
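A caller that captures the script's output might separate the two cases as follows. This is a sketch under stated assumptions: `interpret_output` is a hypothetical helper, it assumes the error hint arrives on the same stream as the JSON, and it accepts both the half-width and full-width colon after `错误` as a precaution.

```python
import json

ERROR_PREFIXES = ("错误:", "错误：")  # half-width and full-width colon


def interpret_output(stdout_text):
    """Return (ok, payload): parsed JSON on success, message text on failure."""
    text = stdout_text.strip()
    for prefix in ERROR_PREFIXES:
        if text.startswith(prefix):
            return False, text[len(prefix):].strip()
    try:
        return True, json.loads(text)
    except ValueError:
        # Not an error hint and not valid JSON: report the raw text.
        return False, text
```
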
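### Notes

- The file URL must be a publicly accessible download link (local file paths are not supported). To process a local file, first upload it to object storage or a temporary file service.
- The result file download URL is valid for 12 hours, so fetch the result promptly.
- How long an asynchronous task takes depends on the document's size and complexity. The polling timeout defaults to 600 seconds (10 minutes) and can be adjusted by changing the POLL_TIMEOUT variable.
- The skill handles rate limiting (429) internally by retrying, up to 3 times.
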
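The rate-limit handling mentioned in the last note could look like the sketch below. The source only states that 429 responses are retried up to 3 times; the exponential backoff delays and the names `call_with_retry` and `send` are assumptions for illustration. `send` is any zero-argument callable returning an object with a `status_code` attribute, such as a `requests` response.

```python
import time


def call_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Re-issue the request after a 429, at most max_retries extra times."""
    attempt = 0
    while True:
        response = send()
        if response.status_code != 429 or attempt >= max_retries:
            return response
        attempt += 1
        # Assumed backoff schedule: 1s, 2s, 4s between attempts.
        sleep(base_delay * 2 ** (attempt - 1))
```

If every attempt returns 429, the last response is handed back so the caller can report the failure.
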
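### Troubleshooting

| Problem | Solution |
|---------|----------|
| Configuration file not found | Create config/.env and fill in the token (see Prerequisites) |
| API key invalid or expired | Request a new token and update the `.env` file |
| File URL cannot be accessed | Make sure the URL is publicly downloadable and not blocked by a firewall |
| Network connection failed | Check the network connection or firewall settings |
| Task stays in running for a long time | Check whether the document exceeds the size limit (contact the service provider) |
| 401/403/Unauthorized | The token is invalid or expired; request a new one and configure it |
| 429 Too Many Requests | Requests are too frequent; the skill waits and retries automatically (up to 3 times) |
| Task failed (failed) | Check error_code and error_message; common causes include an unsupported file format or content that violates policy |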

Usage Guidance

Install only if you are comfortable sending the document URL and OCR output through Scnet. Use short-lived, access-scoped public URLs without embedded secrets, avoid confidential or regulated documents unless approved for third-party OCR processing, protect the SCNET_API_KEY, and do not change SCNET_API_BASE unless you trust the endpoint.

Capability Tags

crypto · requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The stated purpose is asynchronous document OCR, and the script matches it by submitting a provided file URL to Scnet, polling for completion, downloading the returned JSON result, and printing the parsed content.

ℹ Instruction Scope

The examples are centered on explicit document parsing or OCR requests, but the skill does not require a separate confirmation before sending the document URL and extracted content through a third-party service.

✓ Install Mechanism

No install hooks or hidden setup behavior were found; setup is manual, with Python requests as the only runtime library and a local config file for the API key.

ℹ Credentials

Network access and a Scnet API key are proportionate for OCR, but users must understand that public file URLs and OCR results leave the local environment; overriding SCNET_API_BASE can redirect traffic.

✓ Persistence & Privilege

The skill has no background persistence, privilege escalation, broad local indexing, destructive actions, or credential harvesting; the only persistent item is the user-created API-key config file.

How to Use

Make sure OpenClaw is installed (local or Docker)
Run the install command in chat: /install document-parse-ocr
After installation, invoke the skill by name or use /document-parse-ocr
Provide required inputs per the skill's parameter spec and get structured output

Version History

v1.0.0

Initial release of the document-parse-ocr skill.
- Supports intelligent document parsing (asynchronous) for batch processing scenarios.
- Submits publicly accessible file URLs for OCR and returns structured JSON results, including text, tables, titles, and more.
- Asynchronous workflow with polling and automatic result download link (valid for 12 hours).
- Requires SCNET_API_KEY; optional SCNET_API_BASE available.
- Handles API errors and rate limits, and provides user-friendly error messages.

Metadata

Slug document-parse-ocr

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Document Parse Ocr?

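Document Parse Ocr provides asynchronous intelligent document recognition, suited to high-volume document processing. You submit a publicly accessible file URL; it recognizes structural information such as text, tables, and headings, and returns a download URL for the structured JSON result file. It is an AI Agent Skill for Claude Code / OpenClaw, with 54 downloads so far.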

How do I install Document Parse Ocr?

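Run "/install document-parse-ocr" in the OpenClaw or Claude Code chat to install it. Before first use, request a Scnet API token, save it in config/.env, and install the Python requests library, as described in the README.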

Is Document Parse Ocr free?

The skill itself is free to download, install, and use under the MIT-0 license. It calls the Scnet OCR service, which requires a Scnet API token that you request separately.

Which platforms does Document Parse Ocr support?

Document Parse Ocr is cross-platform and runs anywhere OpenClaw or Claude Code is available.

Who created Document Parse Ocr?

It is built and maintained by SCNet-sugon (@scnet-sugon); the current version is v1.0.0.
