
Document Recognition: Table Recognition (invoice-ocr-xy), Xiangyun Platform

Author: liudengkui · v1.0.0 · MIT-0

cross-platform ✓ Security check passed

Install in OpenClaw

/install doc-ocr-xy

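Description

Document OCR recognition skill. It scans the documents (PDF/images) in a folder and calls the Xiangyun OCR API to recognize their contents. **Important: the Xiangyun credentials must be configured before first use. Proactively ask the user for netocr_key and netocr_secret, or guide the user to run the --config command to configure them.**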

Usage instructions (SKILL.md)

Document OCR recognition skill

Recognizes documents in batches.

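⚠️ Credentials must be configured before first use

This skill depends on the Xiangyun OCR service, so the credentials must be configured before it is used.

Option 1: Ask the user for the credentials (recommended)

Ask the user directly:

"This skill needs a Xiangyun netocr_key and netocr_secret. Please provide both credentials.
Where to get them: the Xiangyun personal center"

Then run:

python scripts/recognize_doc.py --config

Option 2: Guide the user to configure the credentials themselves

Tell the user:

"Please run the following command first to configure the Xiangyun credentials:"
python ~/.openclaw/skills/invoice-ocr/scripts/recognize_doc.py --config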

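Features

Supported language	Code
Simplified Chinese (printed)	0
Simplified Chinese (printed + handwritten)	3
Traditional Chinese (printed)	1
Traditional Chinese (printed + handwritten)	4
English	2
Arabic	5
Urdu	6
Georgian	7
Cyrillic	8
French	9
Spanish	10
Japanese	11
Korean	12
Portuguese	13
Vietnamese	14
Bengali	15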
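
The language table can be carried into a script as a lookup. The following is a minimal sketch, assuming a hypothetical `LANGUAGE_CODES` dictionary and `language_code` helper. Neither name comes from `recognize_doc.py`, and this page does not show how the script passes the code to the API.

```python
# Hypothetical lookup built from the language table; the names are illustrative only.
LANGUAGE_CODES = {
    "Simplified Chinese (printed)": 0,
    "Traditional Chinese (printed)": 1,
    "English": 2,
    "Simplified Chinese (printed + handwritten)": 3,
    "Traditional Chinese (printed + handwritten)": 4,
    "Arabic": 5,
    "Urdu": 6,
    "Georgian": 7,
    "Cyrillic": 8,
    "French": 9,
    "Spanish": 10,
    "Japanese": 11,
    "Korean": 12,
    "Portuguese": 13,
    "Vietnamese": 14,
    "Bengali": 15,
}


def language_code(name: str) -> int:
    """Return the numeric code for a language name, ignoring case and padding."""
    lookup = {key.lower(): value for key, value in LANGUAGE_CODES.items()}
    try:
        return lookup[name.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported language: {name!r}") from None
```

Rejecting unknown names with a `ValueError` keeps a typo from silently falling back to a default language.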

Supported file formats

Format	Extension
PDF	.pdf
OFD	.ofd
Image	.jpg, .jpeg, .png, .bmp, .tif, .tiff, .webp
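
Selecting files by these extensions could look like the sketch below. It is not the code in `recognize_doc.py`: the `find_documents` helper and `SUPPORTED_EXTENSIONS` constant are hypothetical names, and the sketch assumes a non-recursive scan and case-insensitive extension matching, neither of which this page confirms.

```python
from pathlib import Path

# Extensions taken from the supported-formats table.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".ofd", ".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".webp",
}


def find_documents(target: str) -> list[Path]:
    """Return the supported files for a target that is a single file or a folder."""
    path = Path(target)
    if path.is_file():
        # Single-document mode: accept the file only if its extension is supported.
        return [path] if path.suffix.lower() in SUPPORTED_EXTENSIONS else []
    # Folder mode (assumed non-recursive): keep supported files, sorted by path.
    return sorted(
        entry
        for entry in path.iterdir()
        if entry.is_file() and entry.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

A path that does not exist raises `FileNotFoundError` from `iterdir()`, which surfaces a mistyped folder name instead of returning an empty list.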

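Usage

Recognize documents

# Recognize every document in a folder
python scripts/recognize_doc.py /path/to/doc

# Recognize a single document
python scripts/recognize_doc.py /path/to/doc/123.png

Configuration management

# Set the Xiangyun credentials
python scripts/recognize_doc.py --config

# Show the current configuration
python scripts/recognize_doc.py --list-config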

Getting netocr_key and netocr_secret

Log in to Xiangyun
Copy both values from the personal center

For detailed API documentation, see the Xiangyun OCR API reference

Workflow

Document file → OCR recognition → Return result (output the original text; no translation needed)
      ↓                                 ↓
  PDF/images                     Markdown structure

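Notes

Images must be sharp; a width and height above 500 px are recommended
Each file must not exceed 10 MB
Xiangyun OCR is billed per call, so keep an eye on costs
The configuration file is saved as config.json in the skill directory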
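
The 10 MB limit can be checked locally before a file is uploaded. This is a minimal sketch; `within_size_limit` and `MAX_FILE_BYTES` are hypothetical names, and reading "10 MB" as 10 × 1024 × 1024 bytes is an assumption, since this page does not say which definition the service uses.

```python
import os

# 10 MB limit from the notes; the binary definition of MB is an assumption.
MAX_FILE_BYTES = 10 * 1024 * 1024


def within_size_limit(path: str, limit: int = MAX_FILE_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

Checking the recommended image dimensions would additionally need an image library such as Pillow, so it is left out of this sketch.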

Security recommendations

This skill appears to do what it says: send document data to the NetOCR API to perform OCR. Before installing or using it: (1) Do not paste your netocr_key/netocr_secret into chat. Instead, run the script locally with --config to store the credentials in the skill directory. (2) Be aware that config.json stores credentials in plaintext under the skill folder; treat that file as sensitive and restrict access to it. (3) Documents you process are uploaded to a third-party service (netocr.com). Make sure you are comfortable with that for any sensitive documents, and check the billing implications. (4) If you must provide credentials via conversation, understand that they may be retained in logs; prefer local configuration.
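
One way to act on recommendation (2) is to write the credentials file with owner-only permissions. The sketch below is not how `recognize_doc.py` saves its configuration: `save_credentials` is a hypothetical helper, and the JSON key names are assumed to match the credential names, because this page does not show the actual config.json schema. The permission bits take full effect on POSIX systems only.

```python
import json
import os


def save_credentials(config_path: str, netocr_key: str, netocr_secret: str) -> None:
    """Write the credentials as JSON, readable and writable by the owner only."""
    # Key names are an assumption; the real config.json layout is not documented here.
    payload = {"netocr_key": netocr_key, "netocr_secret": netocr_secret}
    # Create the file with mode 0o600 so it is never briefly readable by others.
    fd = os.open(config_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    # Tighten permissions too in case the file already existed with a looser mode.
    os.chmod(config_path, 0o600)
```

Reading the values with `getpass.getpass()` before calling such a helper would also keep them out of the terminal scrollback.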

Functional analysis

Type: OpenClaw Skill
Name: doc-ocr-xy
Version: 1.0.0

The skill is a legitimate tool for performing OCR on documents using the Xiangyun (netocr.com) API. The Python script `scripts/recognize_doc.py` correctly implements the API integration, handling file reading, base64 encoding, and multipart form-data submission to the official endpoint. While it requires and stores API credentials in `config.json`, this behavior is transparently documented in `SKILL.md` and is necessary for the tool's functionality. No evidence of data exfiltration to unauthorized domains, malicious execution, or prompt injection was found.

Capability assessment

✓ Purpose & Capability

Name/description, SKILL.md, and scripts/recognize_doc.py consistently implement a document OCR skill that calls the netocr.com API. The script sends base64-encoded file data to https://netocr.com/api/recog_table_base64 and expects netocr_key/netocr_secret credentials — this is coherent with the stated purpose.
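
The encoding step described above can be sketched as follows. Only the endpoint URL and the use of base64 come from this page. The `build_request_fields` helper and the form field names `key`, `secret`, and `img` are assumptions made for illustration and should be checked against the Xiangyun OCR API reference. The sketch builds the fields without sending a request.

```python
import base64

# Endpoint quoted in the assessment above.
API_URL = "https://netocr.com/api/recog_table_base64"


def build_request_fields(
    file_bytes: bytes, netocr_key: str, netocr_secret: str
) -> dict[str, str]:
    """Base64-encode a document and pair it with the credentials as form fields.

    The field names "key", "secret", and "img" are hypothetical placeholders.
    """
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return {"key": netocr_key, "secret": netocr_secret, "img": encoded}
```

Keeping field construction separate from the network call makes the encoding testable without spending a billed API call.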

ℹ Instruction Scope

SKILL.md instructs the agent to scan folders or single files and to ask the user for netocr_key/netocr_secret or guide them to run the script with --config. The runtime instructions and code operate only on files the user points the script at and the NetOCR endpoint. Note: the skill's instructions explicitly recommend 'proactively asking the user' for credentials (see risk section).

✓ Install Mechanism

No install spec; this is an instruction + local Python script. No external downloads or package installs are performed automatically. The only optional dependency is Pillow (PIL) for image conversion, which is documented in the script.

ℹ Credentials

The skill does not request unrelated environment variables. It requires the NetOCR API key and secret, which is appropriate. However, credentials are saved in a local config.json inside the skill directory in plaintext (unencrypted). SKILL.md encourages the agent to 'actively ask' the user for credentials in conversation — this risks credential disclosure into chat logs or conversation history. Prefer local --config usage rather than pasting secrets into chat.

✓ Persistence & Privilege

Skill is user-invocable and not always-enabled; it does not request elevated privileges, does not modify other skills, and stores config only in its own skill directory. No persistent system-wide changes detected.

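How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install doc-ocr-xy
Once installation finishes, call the Skill by name or trigger it with /doc-ocr-xy
Provide the inputs required by the Skill's parameter description to get structured output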

Version history

v1.0.0

- Initial release of doc-ocr: batch OCR document recognition using the Xiangyun OCR API.
- Supports PDF, OFD, and common image formats (.jpg, .png, .tif, etc.).
- Requires users to configure Xiangyun API credentials (netocr_key and netocr_secret) before first use.
- Supports multiple languages, including Simplified/Traditional Chinese (print/handwriting), English, Arabic, Cyrillic, Japanese, Korean, and more.
- Includes configuration commands for credential management and usage instructions.
- Credentials and configuration are stored locally in the skill's directory.

Metadata

Slug doc-ocr-xy

Version 1.0.0

License MIT-0

Total installs 0

Current installs 0

Number of versions 1
