
aiparse-ocr

Name: aiparse-ocr
Author: do0388309

By do0388309 · GitHub · v1.0.2 · MIT-0

cross-platform ✓ Security check passed

Total downloads: 170

Install in OpenClaw

/install aiparse-ocr

Description

Parse PDF files using LLM. **No registration required - free trial available!** Extract information from PDF files and return results in JSON or Markdown for...

Usage instructions (SKILL.md)

# AI Parse

A skill for parsing PDF files using Large Language Models.

## Capabilities

- Extract information from PDF files
- Return results in JSON or Markdown format
- Resume processing from existing task ID
- Save task ID information to JSON file for reference

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| pdf_path | string | required | Path to the PDF file to process |
| result_path | string | required | Path to save the parsing result |
| format | string | required | Output format: "json" or "md" |
| task_id_path | string | required | Path to save task ID information (JSON format) |
| --task-id | string | optional | Existing task ID to resume processing |

## Usage Examples

### Normal Upload Mode

```bash
python handler.py <pdf_path> <result_path> <format> <task_id_path>
```

### Resume from Existing Task or Check Status

```bash
python handler.py --task-id <task_id> <result_path> <format>
```
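The two invocation modes above can also be assembled programmatically. The following is a minimal sketch, assuming `handler.py` sits in the current working directory and accepts its arguments exactly in the order shown in the usage examples; `build_command` is a hypothetical helper and is not part of the skill.

```python
import sys
from typing import List, Optional

# Output formats listed in the Parameters table.
VALID_FORMATS = {"json", "md"}


def build_command(result_path: str,
                  fmt: str,
                  pdf_path: Optional[str] = None,
                  task_id_path: Optional[str] = None,
                  task_id: Optional[str] = None) -> List[str]:
    """Return the argv list for handler.py (hypothetical helper).

    If task_id is given, the resume/status form is built; otherwise the
    normal upload form is built, which needs pdf_path and task_id_path.
    """
    if fmt not in VALID_FORMATS:
        raise ValueError("format must be 'json' or 'md', got %r" % fmt)
    if task_id is not None:
        # Resume from an existing task or check its status.
        return [sys.executable, "handler.py", "--task-id", task_id,
                result_path, fmt]
    if pdf_path is None or task_id_path is None:
        raise ValueError("normal upload mode needs pdf_path and task_id_path")
    # Normal upload mode.
    return [sys.executable, "handler.py", pdf_path, result_path, fmt,
            task_id_path]
```

The returned list can be passed to `subprocess.run` as is. Building a list instead of a shell string avoids quoting problems with paths that contain spaces.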

## Task ID File Format

When using normal upload mode, a task ID file will be created at `task_id_path` with the following JSON structure:

```json
{
  "task_id": "AAFXKO",
  "pdf_path": "test.pdf",
  "submit_time": "2026-04-04 00:33:27"
}
```

This file can be used to:
- Track the submitted task
- Retrieve the task ID later for status checking
- Resume processing if interrupted
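
Reading the task ID file back is plain JSON parsing. The sketch below assumes the file has exactly the three fields shown in the example and that `submit_time` always uses the `YYYY-MM-DD HH:MM:SS` layout; `parse_task_record` and `load_task_record` are hypothetical helper names.

```python
import json
from datetime import datetime


def parse_task_record(text):
    """Parse the JSON text of a task ID file into a dict.

    Assumes the field names from the example above; submit_time is
    converted to a datetime (the file carries no timezone information).
    """
    record = json.loads(text)
    return {
        "task_id": record["task_id"],
        "pdf_path": record.get("pdf_path"),
        "submit_time": datetime.strptime(record["submit_time"],
                                         "%Y-%m-%d %H:%M:%S"),
    }


def load_task_record(task_id_path):
    """Read and parse the task ID file at task_id_path."""
    with open(task_id_path, encoding="utf-8") as fh:
        return parse_task_record(fh.read())
```

The `task_id` value obtained this way is what the `--task-id` parameter expects.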

## Implementation

Implemented by `handler.py` which:
- Uploads PDF files to the processing service
- Polls for processing completion
- Downloads and saves results in the requested format
- Supports resuming from existing task IDs
- Saves task ID information to JSON file
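
The polling step in the list above follows a common pattern, sketched here in isolation. This is not the code from `handler.py`: the status strings `"done"` and `"failed"`, the interval, and the timeout are all assumptions, and the status check is passed in as a callable because the service's endpoints are not documented in this file.

```python
import time


def poll_until_done(check_status,
                    interval_seconds=5.0,
                    timeout_seconds=600.0,
                    sleep=time.sleep,
                    clock=time.monotonic):
    """Call check_status() repeatedly until it reports a final state.

    check_status is any zero-argument callable returning a status string.
    "done" and "failed" are assumed to be final; anything else is treated
    as still processing. Raises TimeoutError once the deadline has passed.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = check_status()
        if status in ("done", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError("task did not finish within %.0f s"
                               % timeout_seconds)
        sleep(interval_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. A timeout here means only that the client stopped waiting; the task may still finish on the server, which is why the task ID should be kept.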

## Environment Requirements

- Python 3.6+
- requests library

## Return Value

The parsed result will be saved to the specified `result_path` in the requested format:
- **JSON format:** Structured JSON with task details and extracted content
- **Markdown format:** Formatted Markdown with page-by-page content

## Notes

- For large PDF files, processing may take multiple minutes
- Free users can process 30 PDF pages - visit https://api.pinocch.com/index for extra trial credits
- The `--task-id` parameter can be used to resume processing if interrupted
- Check the console output for processing progress and status updates
- The task ID file is created immediately after successful upload
- **IMPORTANT FOR AGENTS:** Before declaring a task as failed, always use the task ID to check the current status of the task. Use the `--task-id` parameter to resume or verify the task status. The task may still be processing or have completed successfully.
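
The guidance for agents can be expressed as a small wrapper: try the normal upload, and if it appears to fail, re-check through the saved task ID before giving up. This is a sketch under assumptions: it supposes that `handler.py` signals failure with a non-zero exit code, which this document does not confirm, and `run_with_status_check` is a hypothetical name.

```python
import json
import os
import subprocess
import sys


def run_with_status_check(pdf_path, result_path, fmt, task_id_path,
                          run=subprocess.call):
    """Run a normal upload, then fall back to a --task-id check on failure.

    run is any callable that takes an argv list and returns an exit code
    (subprocess.call by default). Assumes a non-zero exit code means the
    run did not complete; that convention is not documented by the skill.
    """
    upload_cmd = [sys.executable, "handler.py",
                  pdf_path, result_path, fmt, task_id_path]
    exit_code = run(upload_cmd)
    if exit_code == 0:
        return 0
    if not os.path.exists(task_id_path):
        # The upload never produced a task ID, so there is nothing to resume.
        return exit_code
    with open(task_id_path, encoding="utf-8") as fh:
        task_id = json.load(fh)["task_id"]
    # The task may still be processing or may already have completed.
    resume_cmd = [sys.executable, "handler.py",
                  "--task-id", task_id, result_path, fmt]
    return run(resume_cmd)
```

One caveat with this design: a task ID file left over from an earlier run at the same path would be picked up as well, so using a fresh `task_id_path` per submission avoids resuming the wrong task.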

Security advice

This skill uploads any PDF you give it to https://api.pinocch.com for processing. If your PDFs contain sensitive or confidential information, do not send them to an untrusted third party. The code allows optional username/api_token authentication but the SKILL.md does not document supplying those credentials — if you have an account, review handler.py to see how to supply credentials, or test in trial mode with non‑sensitive documents first. Review the handler.py file yourself (or have a developer do so) to confirm no unexpected network endpoints or behavior, and restrict use to documents you are comfortable sharing with the external service.

Functional analysis

Type: OpenClaw Skill
Name: aiparse-ocr
Version: 1.0.2

The aiparse-ocr skill is a legitimate tool for extracting data from PDF files using a third-party API (api.pinocch.com). The handler.py script implements standard file upload and polling logic using the requests library, and the SKILL.md instructions are focused on ensuring the AI agent correctly manages long-running tasks. No evidence of data exfiltration, malicious execution, or persistence was found.

Capability tags

crypto
can-make-purchases

Capability assessment

✓ Purpose & Capability

The skill name/description (PDF parsing / OCR using an LLM-backed service) matches the implementation: handler.py uploads PDFs to api.pinocch.com, polls for results, and saves parsed output. Required capabilities are consistent with a remote parsing service.

ℹ Instruction Scope

SKILL.md instructs the agent to run handler.py to upload PDFs, poll status, and save results — which is exactly what the code does. Important scope note: the skill uploads the user's PDF files to an external domain (api.pinocch.com). There are no instructions to read unrelated local files, but any PDF passed will be transmitted off‑device.

✓ Install Mechanism

No install spec is present (instruction-only with an included handler.py). This reduces install-time risk; the code will run when invoked and performs network calls. No unusual download/install operations are performed by the skill itself.

ℹ Credentials

The registry metadata declares no required environment variables or credentials, and the skill works in 'trial mode' with no auth. The code, however, supports optional username and api_token headers (Authorization: Bearer ...) even though SKILL.md does not document how to provide them — minor documentation inconsistency. No unrelated secrets or system credentials are requested.

✓ Persistence & Privilege

The skill does not request persistent/system privileges (always:false). It writes only task ID and result files in paths supplied by the user and does not modify other skills or global agent settings.

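How to use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install aiparse-ocr
After installation, invoke the Skill by name or trigger it with /aiparse-ocr
Provide the required inputs according to the Skill's parameter documentation to get structured output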

Version history

v1.0.2

- Removed secret.txt file from the repository.
- Updated documentation: No registration required and free trial mode highlighted.
- Authentication parameters (username, secret) removed from documentation and usage instructions.
- Clarified free page limit for unregistered users (30 pages).
- Streamlined usage examples for simpler, credential-free command structure.

v1.0.1

Major update: Improved task management, resume functionality, and output tracking for PDF parsing.

- Added support for resuming processing using task IDs and checking task status.
- Task ID is saved to a JSON file, enabling easier tracking and recovery.
- New parameters: task_id_path (required) and --task-id (optional) for managing ongoing or interrupted tasks.
- Enhanced usage documentation with authenticated/trial modes and examples for resuming tasks.
- Updated important notes: Agents should always check task status via task ID before declaring failure.
- Revised environment requirements, keywords, return value details, and general guidance for improved clarity.

v1.0.0

- Initial public release of aiparse-ocr skill.
- Provides parsing of PDF files using large language models.
- Extracts structured information from PDFs and outputs results in JSON or Markdown format.
- Supports optional authentication for usage beyond trial mode.
- Allows result file output for processed data.
- Includes clear usage instructions and parameter documentation.

Metadata

Slug: aiparse-ocr

Version: 1.0.2

License: MIT-0

Total installs: 0

Current installs: 0

Versions: 3

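FAQ

What is aiparse-ocr?

Parse PDF files using LLM. **No registration required - free trial available!** Extract information from PDF files and return results in JSON or Markdown for... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 170 total downloads so far.

How do I install aiparse-ocr?

Run the command "/install aiparse-ocr" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is aiparse-ocr free?

Yes. The skill itself is free under the MIT-0 license and can be freely downloaded, installed, and used. Note that the remote parsing service it calls limits free users to 30 PDF pages.

Which platforms does aiparse-ocr support?

aiparse-ocr is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed aiparse-ocr?

It is developed and maintained by do0388309 (@do0388309); the current version is v1.0.2.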