TencentCloud ExtractDoc OCR
by tencent-ocr · GitHub · v1.0.2 · MIT-0
408 Downloads · 0 Stars · 3 Active Installs · 3 Versions
Install in OpenClaw
/install tencentcloud-ocr-extractdocagent
Description
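A skill for calling Tencent Cloud's real-time document extraction agent (ExtractDocAgent) API. Use it when a user needs to extract structured information from an image or PDF by custom field names. It supports custom field names, field types (key-value pairs or table fields), and field prompts for flexible document extraction. It suits structured data extraction from contracts, invoices, reports, and other document types.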
Usage Guidance
This skill's code and README implement a typical Tencent Cloud OCR integration and will call ocr.tencentcloudapi.com using your Tencent Cloud API keys. Before installing:
- Be aware that the skill requires TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY. The registry metadata omits these, so do not rely solely on the registry listing.
- Provide least-privilege API credentials (a key scoped only to the OCR service if possible) and monitor billing and usage, since calls are billable.
- Confirm you trust the skill source, because images and PDFs you send will be uploaded to Tencent's API. Avoid sending highly sensitive content unless you accept that external processing will occur.
- Note that SKILL.md contains a minor schema inconsistency (a 'UserAgent' entry shown inside ItemNames). This looks like documentation drift, not an exploit, but verify the expected CLI options and defaults.
- Consider running the script locally first to observe network traffic (ensure it uses TLS) and to confirm credentials and region behavior.
If the metadata owner or source cannot explain the missing credential declaration, treat the listing as untrusted.
Capability Analysis
Type: OpenClaw Skill
Name: tencentcloud-ocr-extractdocagent
Version: 1.0.2
The skill is a legitimate integration for the Tencent Cloud OCR ExtractDocAgent API, designed to extract structured data from images and PDFs. The Python script `scripts/main.py` uses the official `tencentcloud-sdk-python` and follows standard practices for handling cloud credentials via environment variables. No evidence of malicious behavior, data exfiltration, or prompt injection was found in the code or the `SKILL.md` instructions.
Capability Assessment
Purpose & Capability
The SKILL.md and scripts/main.py implement calling Tencent Cloud's ExtractDocAgent API (ocr.tencentcloudapi.com) and require Tencent Cloud API credentials — this matches the stated purpose. However, the registry metadata claims no required environment variables or primary credential, while both SKILL.md and the script require TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY. This metadata omission is an inconsistency that should be resolved before trusting the package listing.
Instruction Scope
The runtime instructions and the script stay within OCR/document-extraction scope: they accept image URLs or file paths (or Base64), build a request, call Tencent's OCR API, and format the response. A small incoherence: SKILL.md lists a 'UserAgent' field inside the ItemNames structure (marked as optional and fixed to 'Skills'), which is an odd place for a request-source identifier; the code instead uses a CLI arg (args.user_agent) to set the client request header. No instructions ask the agent to read unrelated system files or call external endpoints other than Tencent Cloud.
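The input handling described above (a URL or a local file turned into a request body) can be sketched roughly as follows. This is an illustration, not the skill's actual scripts/main.py: the function name `build_request` is hypothetical, and the field names `ImageUrl` and `ImageBase64`, as well as the shape of each `ItemNames` entry, are assumptions that should be checked against the Tencent Cloud API documentation.

```python
import base64
from pathlib import Path


def build_request(source, item_names):
    """Build a request dict from an image/PDF URL or a local file path.

    Field names here (ImageUrl, ImageBase64) are assumed, not confirmed
    by the listing; item_names is passed through unchanged.
    """
    params = {"ItemNames": item_names}
    if source.startswith(("http://", "https://")):
        # Remote input: pass the URL through as-is.
        params["ImageUrl"] = source
    else:
        # Local input: read the file and Base64-encode its bytes.
        data = Path(source).read_bytes()
        params["ImageBase64"] = base64.b64encode(data).decode("ascii")
    return params
```

Keeping the URL and Base64 branches mutually exclusive means a request never carries both forms of the same document, which makes it easier to see exactly what is being uploaded.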
Install Mechanism
There is no install spec (instruction-only plus an included script). The script requires the public package tencentcloud-sdk-python (pip). That dependency is proportional and expected for calling Tencent Cloud APIs. No downloads from untrusted URLs or archive extraction were found.
Credentials
The script needs TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY which are appropriate and necessary for API access. The proportionality concern is the metadata mismatch: the registry metadata lists no required env vars while the documentation and code require credentials. This discrepancy can mislead users about the secrets the skill needs.
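Because the registry metadata does not declare these variables, it can help to check for them explicitly before running anything. The snippet below is a minimal sketch of such a check; the helper name `load_credentials` is hypothetical and is not taken from the skill's script.

```python
import os

REQUIRED_VARS = ("TENCENTCLOUD_SECRET_ID", "TENCENTCLOUD_SECRET_KEY")


def load_credentials(env=None):
    """Return (secret_id, secret_key), or raise if either variable is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a later API auth error.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["TENCENTCLOUD_SECRET_ID"], env["TENCENTCLOUD_SECRET_KEY"]
```

Calling `load_credentials()` with no argument reads from the process environment; passing a dict makes the check easy to exercise without touching real secrets.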
Persistence & Privilege
The skill does not request always:true and has no install steps that modify other skills or system-wide config. It simply runs a client script and does not persist credentials itself. No elevated or persistent privileges were requested.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install tencentcloud-ocr-extractdocagent
- After installation, invoke the skill by name or use /tencentcloud-ocr-extractdocagent
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.2
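- Added documentation for the ItemNames parameter, which supports an optional UserAgent field that identifies the request source and is fixed to Skills.
- Clarified that the UserAgent field does not affect existing logic and is used only for call tracking and tracing.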
v1.0.1
Updated the display name.
v1.0.0
Initial release of tencentcloud-ocr-extractdocagent:
- Enables structured data extraction from images or PDFs using Tencent Cloud’s ExtractDocAgent.
- Supports custom field names, field types (key-value or table), and field prompts for flexible information extraction.
- Handles common document scenarios including contracts, invoices, and reports.
- Accepts multiple input formats: PNG, JPG, JPEG, BMP, and PDF.
- Provides both formatted and raw JSON output modes.
- Offers command-line usage with multiple configurable parameters.
Frequently Asked Questions
What is TencentCloud ExtractDoc OCR?
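It is a skill for calling Tencent Cloud's real-time document extraction agent (ExtractDocAgent) API. It extracts structured information from images or PDFs by custom field names, with support for field types (key-value pairs or table fields) and field prompts, and it suits documents such as contracts, invoices, and reports. It is an AI Agent Skill for Claude Code / OpenClaw, with 408 downloads so far.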
How do I install TencentCloud ExtractDoc OCR?
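Run "/install tencentcloud-ocr-extractdocagent" in the OpenClaw or Claude Code chat to install it in one step. Before first use, set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY and make sure the tencentcloud-sdk-python package is available, since the skill's script requires both.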
Is TencentCloud ExtractDoc OCR free?
The skill itself is free to download, install, and use under the MIT-0 license. The calls it makes to the Tencent Cloud OCR API are billable to your Tencent Cloud account.
Which platforms does TencentCloud ExtractDoc OCR support?
TencentCloud ExtractDoc OCR is cross-platform and runs anywhere OpenClaw or Claude Code is available.
Who created TencentCloud ExtractDoc OCR?
It is built and maintained by tencent-ocr (@zt1314p-design); the current version is v1.0.2.