zhuo-yoyowz

Local Document AI OpenVINO

Author: Zhuo Wu · GitHub ↗ · v0.1.2 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 80
Favorites: 0
Current installs: 0
Versions: 3
Install in OpenClaw
/install local-document-ai-openvino
Description
Parse local PDFs and document images with PaddleOCR-VL or PaddleOCR-VL-1.5 on OpenVINO, then route the structured parse into downstream document-to-data or d...
Usage instructions (SKILL.md)

Local Document AI with OpenVINO

Use this skill as a local document-to-action pipeline:

  1. Parse the document into a canonical structured representation.
  2. Optionally continue into to-data or to-code.
  3. Save outputs into a predictable artifact folder with traceability.

Read only if needed

Load these references when you need the schema or output contracts:

  • {baseDir}/references/schema.md
  • {baseDir}/references/mode_guide.md
  • {baseDir}/references/output_contracts.md

Primary entrypoints

Use exactly one of these entrypoints:

  • CLI orchestrator: {baseDir}/scripts/run_skill.py
  • Optional local demo UI: {baseDir}/scripts/serve_skill_ui.py

Do not call these implementation scripts directly from the skill:

  • parse_document.py
  • transform_doc_to_data.py
  • transform_doc_to_code.py

Local readiness

Check the environment before processing real documents:

python "{baseDir}/scripts/check_env.py"

Install the base dependencies in a virtual environment:

python -m pip install -r "{baseDir}/requirements.txt"

Install the third-party paddleocr_vl_openvino package only after reviewing the source or wheel and only when you intend to run the real OCR pipeline. Prefer installing from a reviewed local wheel path inside a virtual environment.

Run a quick orchestration smoke test:

python "{baseDir}/scripts/smoke_test.py"

Model assets are discovered from:

  • PADDLEOCR_VL_OPENVINO_MODEL_DIR
  • PADDLEOCR_VL_LAYOUT_MODEL_DIR plus PADDLEOCR_VL_VLM_MODEL_DIR
  • {baseDir}/models/paddleocr-vl-1.5-openvino/
  • {baseDir}/models/paddleocr-vl-openvino/
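The discovery list above can be sketched as a small lookup. This is a minimal illustration, not the skill's actual code: the function name `find_model_dirs`, the returned dictionary layout, and the assumption that the sources are tried in the listed order are all hypothetical. Only the environment variable names and folder names come from the list.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_model_dirs(base_dir: Path, env: Optional[Mapping[str, str]] = None) -> Optional[dict]:
    """Return the first usable model location, or None if nothing is found.

    Assumption: sources are checked in the order they are listed in the
    skill documentation. The real scripts may use a different precedence.
    """
    env = os.environ if env is None else env

    # 1. A single directory that holds the whole model bundle.
    combined = env.get("PADDLEOCR_VL_OPENVINO_MODEL_DIR")
    if combined and Path(combined).is_dir():
        return {"source": "env:combined", "model_dir": Path(combined)}

    # 2. Separate layout and VLM directories; both must be set and exist.
    layout = env.get("PADDLEOCR_VL_LAYOUT_MODEL_DIR")
    vlm = env.get("PADDLEOCR_VL_VLM_MODEL_DIR")
    if layout and vlm and Path(layout).is_dir() and Path(vlm).is_dir():
        return {"source": "env:split", "layout_dir": Path(layout), "vlm_dir": Path(vlm)}

    # 3. and 4. Bundled folders under the skill directory.
    for name in ("paddleocr-vl-1.5-openvino", "paddleocr-vl-openvino"):
        candidate = base_dir / "models" / name
        if candidate.is_dir():
            return {"source": "bundled", "model_dir": candidate}

    # Nothing found. This sketch never downloads anything on its own.
    return None
```

Returning `None` instead of downloading leaves the decision with the caller, so a download can wait for explicit user approval.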

Allow model auto-download only when the user explicitly approves it.

Supported modes

parse

Use when the user wants the structured parse only.

Outputs:

  • parsed.json
  • parsed.md
  • result_report.html
  • extracted layout, tables, or figures when available

to-data

Use when the user wants structured extraction, normalization, or document classification.

Typical outputs under task_output/:

  • entities.json
  • kv_pairs.json
  • table_index.json
  • normalized.json
  • structured_record.json
  • traceability.json

to-code

Use when the user wants implementation-oriented output from the parse result.

Supported targets:

  • react
  • html-css
  • json-schema
  • jupyter-notebook

Typical outputs under task_output/:

  • component_map.json
  • field_schema.json
  • ui_blueprint.json
  • notes.md
  • traceability.json
  • target-specific artifacts such as app.jsx, index.html, styles.css, schema.json, notebook.ipynb, or notebook_plan.json

Treat all generated code and notebooks as drafts. Review them before running, publishing, or connecting them to real systems.

Pipeline rules

Always follow these rules:

  1. Prefer local execution.
  2. Always parse first into parsed.json.
  3. Generate downstream artifacts from parsed.json, not raw OCR text alone.
  4. Preserve page numbers, reading order, block types, and source anchors when possible.
  5. Write traceability for downstream outputs.
  6. Mark low-confidence regions or assumptions explicitly.
  7. Do not silently drop tables, figures, formulas, charts, or key-value regions.
  8. Save outputs into one artifact folder per run.
  9. For confidential documents, prefer an explicit private --out directory and remove artifacts after review.
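Rules 2, 3, and 5 can be illustrated with a short guard that refuses to run downstream steps without parsed.json and writes a traceability file alongside the outputs. This is only a sketch: the helper names and the traceability payload layout are hypothetical, and the real schema lives in the reference files, which are not reproduced here.

```python
import json
from pathlib import Path


def load_parsed(artifact_dir: Path) -> dict:
    """Load parsed.json, refusing to continue if the parse step has not run."""
    parsed_path = artifact_dir / "parsed.json"
    if not parsed_path.is_file():
        raise FileNotFoundError(f"Run parse first: {parsed_path} is missing")
    return json.loads(parsed_path.read_text(encoding="utf-8"))


def write_traceability(artifact_dir: Path, records: list) -> Path:
    """Write task_output/traceability.json for downstream outputs.

    The payload layout here is hypothetical, not the skill's real schema.
    """
    out_dir = artifact_dir / "task_output"
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "traceability.json"
    payload = {"source": "parsed.json", "records": records}
    out_path.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return out_path
```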

Output contract

Default output folder:

./artifacts/<document_stem>/

Expected top-level outputs:

  • effective_config.json
  • run_report.json
  • parsed.json
  • parsed.md
  • result_report.html
  • task_output/

to-code runs may also emit:

  • code_preview.html
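A caller could check a finished run against this contract before reporting success. The sketch below is illustrative: `default_artifact_dir` and `missing_outputs` are hypothetical helpers, and it assumes the document stem is the file name without its extension.

```python
from pathlib import Path

# Top-level outputs named in the output contract.
EXPECTED_TOP_LEVEL = (
    "effective_config.json",
    "run_report.json",
    "parsed.json",
    "parsed.md",
    "result_report.html",
    "task_output",
)


def default_artifact_dir(document: str, root: str = "./artifacts") -> Path:
    """Derive the default output folder from the document file name.

    Assumption: the document stem is the file name without its extension.
    """
    return Path(root) / Path(document).stem


def missing_outputs(artifact_dir: Path) -> list:
    """Return the expected top-level outputs that are absent from a run folder."""
    return [name for name in EXPECTED_TOP_LEVEL if not (artifact_dir / name).exists()]
```

An empty list from `missing_outputs` means every expected top-level entry is present. It says nothing about whether the contents are valid.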

CLI examples

Parse

python "{baseDir}/scripts/run_skill.py" \
  --mode parse \
  --file "/absolute/path/to/report.pdf" \
  --out "/absolute/path/to/artifacts/report_parse"

To-data

python "{baseDir}/scripts/run_skill.py" \
  --mode to-data \
  --file "/absolute/path/to/invoice.pdf" \
  --out "/absolute/path/to/artifacts/invoice_data" \
  --extract "tables,entities,kv_pairs"

To-code

python "{baseDir}/scripts/run_skill.py" \
  --mode to-code \
  --file "/absolute/path/to/ui_mockup.png" \
  --out "/absolute/path/to/artifacts/ui_code" \
  --target "react" \
  --title "Generated App"

To-code notebook target

python "{baseDir}/scripts/run_skill.py" \
  --mode to-code \
  --file "/absolute/path/to/architecture_diagram.png" \
  --out "/absolute/path/to/artifacts/notebook_code" \
  --target "jupyter-notebook" \
  --title "OpenVINO Notebook"
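The same invocations can be driven from Python. The sketch below builds the argument list from the flags shown above and runs it with the current interpreter. The helper names `build_command` and `run_skill` are hypothetical, and flags are passed through without checking which combinations run_skill.py actually accepts.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional

SUPPORTED_MODES = {"parse", "to-data", "to-code"}


def build_command(base_dir: Path, mode: str, file: Path, out: Path,
                  extract: Optional[str] = None,
                  target: Optional[str] = None,
                  title: Optional[str] = None) -> list:
    """Assemble the run_skill.py argument list using the documented flags."""
    if mode not in SUPPORTED_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    cmd = [
        sys.executable,  # current Python interpreter, no override
        str(base_dir / "scripts" / "run_skill.py"),
        "--mode", mode,
        "--file", str(file),
        "--out", str(out),
    ]
    # Optional flags are appended only when a value is supplied.
    if extract:
        cmd += ["--extract", extract]
    if target:
        cmd += ["--target", target]
    if title:
        cmd += ["--title", title]
    return cmd


def run_skill(cmd: list) -> subprocess.CompletedProcess:
    """Run the command and capture stdout/stderr without raising on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.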

Slash-command examples

/skill local-document-ai-openvino parse file=./docs/report.pdf
/skill local-document-ai-openvino to-data file=./docs/invoice.pdf extract=tables,entities,kv_pairs
/skill local-document-ai-openvino to-code file=./mockups/architecture.png target=jupyter-notebook

Optional local demo UI

Start the local UI when the user wants an interactive demo page:

python "{baseDir}/scripts/serve_skill_ui.py"

The UI lets the user:

  • preview a local file
  • choose parse, to-data, or to-code
  • choose the to-code target
  • run the pipeline and inspect the generated local HTML reports

The bundled UI only allows preview/run access for local files under the skill directory and common user content folders such as Downloads, Documents, Desktop, and Pictures.
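A root-restriction check of this kind can be sketched with pathlib. This is an illustration of the general technique, not the bundled implementation. The helper `default_allowed_roots` is hypothetical, and the folder names are taken from the sentence above.

```python
from pathlib import Path
from typing import Iterable, List


def default_allowed_roots(skill_dir: Path) -> List[Path]:
    """Skill directory plus common user content folders (hypothetical helper)."""
    home = Path.home()
    return [skill_dir] + [home / name for name in
                          ("Downloads", "Documents", "Desktop", "Pictures")]


def is_within_allowed_roots(candidate: Path, allowed_roots: Iterable[Path]) -> bool:
    """Return True only if the resolved path lies under one of the roots.

    Resolving first collapses ".." segments and follows symlinks, so a path
    such as root/../secret.txt is judged by where it really points.
    """
    resolved = candidate.resolve()
    for root in allowed_roots:
        try:
            resolved.relative_to(root.resolve())
            return True
        except ValueError:
            continue  # not under this root, try the next one
    return False
```

Comparing resolved paths rather than raw strings is what blocks traversal tricks. A plain prefix check would also accept sibling folders that merely share a name prefix with an allowed root.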

Failure behavior

If a run fails:

  • state which stage failed
  • do not claim outputs were created if they were not
  • prefer writing error.json with failure details
  • recommend parse first when the downstream request is ambiguous
  • surface stderr or a concise failure summary when available
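One way to follow these points is to record the failing stage and a short stderr excerpt in error.json. The sketch below is an assumption about what such a file could contain: the field names `stage`, `message`, and `stderr_tail` are hypothetical and not taken from the skill's output contracts.

```python
import json
from pathlib import Path


def write_error(artifact_dir: Path, stage: str, message: str, stderr: str = "") -> Path:
    """Write error.json describing which stage failed and why.

    Field names are hypothetical; check the output contracts for the real ones.
    """
    artifact_dir.mkdir(parents=True, exist_ok=True)
    payload = {
        "stage": stage,                 # e.g. "parse", "to-data", "to-code"
        "message": message,             # concise failure summary
        "stderr_tail": stderr[-2000:],  # keep only the last 2000 characters
    }
    error_path = artifact_dir / "error.json"
    error_path.write_text(
        json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return error_path
```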

Safety notes

  • Use a virtual environment for dependency installation.
  • Approve model downloads only after reviewing them, and only when you explicitly intend to download.
  • Keep outputs in a private local folder when documents are sensitive.
  • Review generated code and notebooks before execution.
  • Delete artifacts when they are no longer needed.
  • The wrapper always uses the bundled local scripts and the current Python interpreter. It does not allow custom interpreter or script-directory overrides.

Short reminder

Present this skill as a local document-understanding workflow with downstream actions, not as a plain OCR wrapper.

Safe usage recommendations
This looks appropriate for a local document OCR workflow. Before installing or running it, use a virtual environment, review external dependencies and model downloads, choose an output folder that is not publicly shared or cloud-synced for confidential documents, and inspect any generated code or notebooks before running them.
Functional analysis
Type: OpenClaw Skill
Name: local-document-ai-openvino
Version: 0.1.2
The skill bundle provides a legitimate local document AI pipeline for parsing PDFs and images using OpenVINO and PaddleOCR-VL. It features a well-structured orchestration system (run_skill.py) and a local web UI (serve_skill_ui.py) that includes security measures like path-restriction checks (is_within_allowed_roots) to prevent arbitrary file access. The scripts include safety warnings regarding the manual installation of third-party wheels and the review of generated code, demonstrating a security-conscious design without any evidence of malicious intent, exfiltration, or unauthorized execution.
Capability tags
crypto · can-make-purchases
Capability assessment
Purpose & Capability
The document parsing, structured extraction, and code/notebook artifact generation match the stated purpose; users should remember generated code is only a draft and is not meant to be run without review.
Instruction Scope
The skill allows implicit invocation and processes user-specified local files, which is reasonable for this purpose but means users should be clear about which documents may be parsed.
Install Mechanism
There is no install spec in registry metadata, but SKILL.md and requirements.txt disclose Python dependencies and advise manual review before installing the third-party PaddleOCR-VL OpenVINO package or allowing model downloads.
Credentials
Local PDF/image access and writes to artifact folders are proportionate for local document AI, but the outputs may contain sensitive document contents.
Persistence & Privilege
The skill writes parsed JSON, Markdown, HTML reports, extracted data, and generated artifacts to disk; this is disclosed and purpose-aligned, but confidential documents require careful output-directory handling and cleanup.
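How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install local-document-ai-openvino
  3. After installation, invoke the Skill by name or trigger it with /local-document-ai-openvino
  4. Provide the required inputs as described in the Skill's parameter notes to get structured output.
Version history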
v0.1.2
Harden the local UI file preview path handling, restrict UI file access to approved local folders, remove raw path preview URLs, and block custom interpreter/script override keys in the wrapper config.
v0.1.1
Remove non-essential screen/demo helpers, stop auto-installing remote OCR wheel by default, and clarify safety guidance for dependencies, generated code, and artifact handling.
v0.1.0
Initial public release
Metadata
Slug: local-document-ai-openvino
Version: 0.1.2
License: MIT-0
Total installs: 0
Current installs: 0
Version count: 3
FAQ

What is Local Document AI OpenVINO?

Parse local PDFs and document images with PaddleOCR-VL or PaddleOCR-VL-1.5 on OpenVINO, then route the structured parse into downstream document-to-data or d... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 80 total downloads so far.

How do I install Local Document AI OpenVINO?

Run the command "/install local-document-ai-openvino" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is Local Document AI OpenVINO free?

Yes. Local Document AI OpenVINO is completely free, is released under the MIT-0 license, and can be freely downloaded, installed, and used.

Which platforms does Local Document AI OpenVINO support?

Local Document AI OpenVINO is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Local Document AI OpenVINO?

It is developed and maintained by Zhuo Wu (@zhuo-yoyowz); the current version is v0.1.2.
