teoslayer

Pilot Document Processing Setup

by Calin Teodor · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
Downloads: 68 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw
/install pilot-document-processing-setup
Description
Deploy a document processing pipeline with 3 agents that automate ingestion, data extraction, and search indexing. Use this skill when: 1. User wants to set...
README (SKILL.md)

Document Processing Setup

Deploy 3 agents that automate document ingestion, data extraction, and search indexing.

Roles

| Role | Hostname | Skills | Purpose |
|---|---|---|---|
| ingester | <prefix>-ingester | pilot-stream-data, pilot-share, pilot-archive | Accepts documents, converts to processable format |
| extractor | <prefix>-extractor | pilot-task-router, pilot-dataset, pilot-receipt | Extracts structured data: tables, entities, amounts |
| indexer | <prefix>-indexer | pilot-webhook-bridge, pilot-announce, pilot-metrics | Indexes data for search, publishes to downstream systems |

Setup Procedure

Step 1: Ask the user which role this agent should play and what prefix to use.

Step 2: Install the skills for the chosen role:

# ingester:
clawhub install pilot-stream-data pilot-share pilot-archive
# extractor:
clawhub install pilot-task-router pilot-dataset pilot-receipt
# indexer:
clawhub install pilot-webhook-bridge pilot-announce pilot-metrics
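
The role-to-skills mapping above can be captured in a small lookup so the install command is built from the role chosen in Step 1. This is a minimal sketch; `skills_for_role` and the `ROLE` variable are hypothetical names, not part of the skill itself.

```shell
# Hypothetical helper: print the skills listed in the Roles table for a role.
skills_for_role() {
  case "$1" in
    ingester)  echo "pilot-stream-data pilot-share pilot-archive" ;;
    extractor) echo "pilot-task-router pilot-dataset pilot-receipt" ;;
    indexer)   echo "pilot-webhook-bridge pilot-announce pilot-metrics" ;;
    *)         echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Usage (ROLE is an assumed variable holding the answer from Step 1):
#   clawhub install $(skills_for_role "$ROLE")
```

An unknown role returns a non-zero status, so a typo does not lead to an empty install command.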

Step 3: Set the hostname:

pilotctl --json set-hostname <prefix>-<role>
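
The hostname is the prefix and the role joined by a hyphen. The sketch below composes it and rejects roles outside the three defined in this setup; `pilot_hostname` is a hypothetical helper and `acme` is an example prefix.

```shell
# Hypothetical helper: build "<prefix>-<role>" for one of the three roles.
pilot_hostname() {
  case "$2" in
    ingester|extractor|indexer) printf '%s-%s\n' "$1" "$2" ;;
    *) echo "unknown role: $2" >&2; return 1 ;;
  esac
}

# Usage, with the example prefix "acme":
#   pilotctl --json set-hostname "$(pilot_hostname acme ingester)"
```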

Step 4: Write the setup manifest:

mkdir -p ~/.pilot/setups
cat > ~/.pilot/setups/document-processing.json << 'MANIFEST'
<USE ROLE TEMPLATE BELOW>
MANIFEST_PLACEHOLDER_REMOVED
MANIFEST
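
Each role template contains `<prefix>` placeholders that must be replaced before the manifest is written. One way to do that is a `sed` substitution, sketched below. `render_manifest` and the template filename in the usage comment are hypothetical, and the sketch assumes the prefix contains only letters, digits, and hyphens.

```shell
# Hypothetical helper: read a role template on stdin and replace every
# <prefix> placeholder with the given prefix.
render_manifest() {
  case "$1" in
    ''|*[!A-Za-z0-9-]*) echo "invalid prefix: $1" >&2; return 1 ;;
  esac
  sed "s/<prefix>/$1/g"
}

# Usage (ingester.template.json is an assumed file holding the ingester template):
#   render_manifest acme < ingester.template.json > ~/.pilot/setups/document-processing.json
```

Restricting the prefix to a safe character set keeps characters such as `/` or `&` from being interpreted by `sed`.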

Step 5: Tell the user to initiate handshakes with the peers this role communicates with directly (the hostnames listed under handshakes_needed in the manifest).

Manifest Templates Per Role

ingester

{"setup":"document-processing","setup_name":"Document Processing","role":"ingester","role_name":"Document Ingester","hostname":"<prefix>-ingester","description":"Accepts documents (PDF, DOCX, images) via upload or webhook, converts to processable format.","skills":{"pilot-stream-data":"Stream raw document bytes to extractor for processing.","pilot-share":"Share converted document files with extractor.","pilot-archive":"Archive original documents for audit and reprocessing."},"peers":[{"role":"extractor","hostname":"<prefix>-extractor","description":"Receives raw documents for data extraction"}],"data_flows":[{"direction":"send","peer":"<prefix>-extractor","port":1002,"topic":"raw-document","description":"Raw documents in processable format"}],"handshakes_needed":["<prefix>-extractor"]}

extractor

{"setup":"document-processing","setup_name":"Document Processing","role":"extractor","role_name":"Data Extractor","hostname":"<prefix>-extractor","description":"Pulls structured data from documents: tables, key-value pairs, entities, dates, amounts.","skills":{"pilot-task-router":"Route documents to specialized extractors by type (invoice, contract, form).","pilot-dataset":"Store extraction results and training data for accuracy improvement.","pilot-receipt":"Confirm document receipt and report extraction status."},"peers":[{"role":"ingester","hostname":"<prefix>-ingester","description":"Sends raw documents"},{"role":"indexer","hostname":"<prefix>-indexer","description":"Receives extracted structured data"}],"data_flows":[{"direction":"receive","peer":"<prefix>-ingester","port":1002,"topic":"raw-document","description":"Raw documents in processable format"},{"direction":"send","peer":"<prefix>-indexer","port":1002,"topic":"extracted-data","description":"Extracted structured data as JSON"}],"handshakes_needed":["<prefix>-ingester","<prefix>-indexer"]}

indexer

{"setup":"document-processing","setup_name":"Document Processing","role":"indexer","role_name":"Search Indexer","hostname":"<prefix>-indexer","description":"Indexes extracted data for search, builds document summaries, publishes to downstream systems.","skills":{"pilot-webhook-bridge":"Push index events and summaries to downstream APIs and search engines.","pilot-announce":"Broadcast new document availability to interested subscribers.","pilot-metrics":"Track indexing throughput, search latency, and document counts."},"peers":[{"role":"extractor","hostname":"<prefix>-extractor","description":"Sends extracted structured data"}],"data_flows":[{"direction":"receive","peer":"<prefix>-extractor","port":1002,"topic":"extracted-data","description":"Extracted structured data as JSON"},{"direction":"send","peer":"external","port":443,"topic":"index-notification","description":"Index notifications to downstream systems"}],"handshakes_needed":["<prefix>-extractor"]}

Data Flows

  • ingester -> extractor : raw-document events (port 1002)
  • extractor -> indexer : extracted-data events (port 1002)
  • indexer -> downstream : index notifications via webhook (port 443)
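
The three flows above can be kept as a small lookup table, which makes it easy to see what a given role receives. This is an illustrative sketch only; the `FLOWS` variable and `inbound_flows` function are hypothetical, and the topic names are taken from the manifest templates.

```shell
# Data flows, one per line: source target topic port
FLOWS='ingester extractor raw-document 1002
extractor indexer extracted-data 1002
indexer downstream index-notification 443'

# Hypothetical helper: print "source topic port" for every flow a role receives.
inbound_flows() {
  printf '%s\n' "$FLOWS" | awk -v role="$1" '$2 == role { print $1, $3, $4 }'
}

# Usage:
#   inbound_flows extractor
```

The ingester has no inbound flow from another agent, so the lookup prints nothing for it.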

Handshakes

# ingester <-> extractor:
pilotctl --json handshake <prefix>-extractor "setup: document-processing"   # run on the ingester
pilotctl --json handshake <prefix>-ingester "setup: document-processing"    # run on the extractor
# extractor <-> indexer:
pilotctl --json handshake <prefix>-indexer "setup: document-processing"     # run on the extractor
pilotctl --json handshake <prefix>-extractor "setup: document-processing"   # run on the indexer
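
Which handshakes an agent must initiate follows from the handshakes_needed field of its manifest. The sketch below prints the commands for one role as a dry run rather than executing them; `handshake_peers` and `handshake_commands` are hypothetical helpers and `acme` is an example prefix.

```shell
# Hypothetical helper: list the peer hostnames from handshakes_needed for a role.
handshake_peers() {
  case "$2" in
    ingester)  echo "${1}-extractor" ;;
    extractor) echo "${1}-ingester ${1}-indexer" ;;
    indexer)   echo "${1}-extractor" ;;
    *)         echo "unknown role: $2" >&2; return 1 ;;
  esac
}

# Print (do not run) the handshake commands for the given prefix and role.
handshake_commands() {
  peers=$(handshake_peers "$1" "$2") || return 1
  for peer in $peers; do
    printf 'pilotctl --json handshake %s "setup: document-processing"\n' "$peer"
  done
}

# Usage:
#   handshake_commands acme extractor
```

Printing the commands first lets the user review them before running anything against the daemon.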

Workflow Example

# On extractor, subscribe to raw documents:
pilotctl --json subscribe <prefix>-ingester raw-document
# On indexer, subscribe to extracted data:
pilotctl --json subscribe <prefix>-extractor extracted-data
# On ingester, publish a document:
pilotctl --json publish <prefix>-extractor raw-document '{"filename":"invoice-2024-003.pdf","type":"pdf","pages":2}'
# On extractor, publish extracted data:
pilotctl --json publish <prefix>-indexer extracted-data '{"filename":"invoice-2024-003.pdf","vendor":"Acme Corp","amount":12500.00}'

Dependencies

Requires the pilot-protocol skill, the pilotctl and clawhub binaries, and a running daemon.

Usage Guidance
This skill appears internally consistent, but before installing or running it:
  1. Verify that pilotctl and clawhub are genuine, trusted binaries (install sources and checksums).
  2. Review each pilot-* skill that clawhub will install; those packages will execute code and may request credentials or network access.
  3. Be aware that the setup writes manifests under ~/.pilot and configures inter-agent handshakes and network ports (1002, and webhooks to external endpoints on 443). Ensure you understand which downstream systems will receive index notifications and whether that may expose sensitive data.
  4. If possible, test the deployment in an isolated environment or staging network first.
  5. If you need stronger guarantees, request the upstream package sources (URLs or registries) and inspect those packages before installing.
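
For the checksum check on the binaries, a comparison along the following lines may help. It is a sketch under assumptions: `sha256_of` and `verify_sha256` are hypothetical helpers, the expected digest must come from a source you trust (this page does not publish one), and the code relies on `sha256sum` or, failing that, `shasum` being installed.

```shell
# Hypothetical helper: print the SHA-256 digest of a file.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed only if the file's digest equals the expected hex string.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# Usage (EXPECTED is an assumed variable holding a digest obtained from the vendor):
#   verify_sha256 "$(command -v pilotctl)" "$EXPECTED" || echo "checksum mismatch" >&2
```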
Capability Analysis
Type: OpenClaw Skill
Name: pilot-document-processing-setup
Version: 1.0.0
The skill provides a legitimate setup for a three-agent document processing pipeline (ingester, extractor, and indexer). It uses standard ecosystem tools like `pilotctl` and `clawhub` to manage hostnames, install sub-skills, and establish secure handshakes between agents, with no evidence of malicious intent or unauthorized data access in SKILL.md or README.md.
Capability Tags
crypto
Capability Assessment
Purpose & Capability
The name/description (deploy a 3-agent document pipeline) matches the instructions and required binaries. pilotctl is used for agent control/handshake/subscribe/publish and clawhub is used to install related Pilot skills — both are appropriate for this setup task.
Instruction Scope
SKILL.md instructs the agent to install other pilot-* skills, set hostnames, write a JSON manifest under ~/.pilot/setups, and run pilotctl handshake/subscribe/publish commands. These actions are within the scope of configuring agents, but they create persistent configuration in the user's home directory and cause the system to fetch and install additional skills (see guidance). No environment variables, secret files, or unrelated system paths are referenced.
Install Mechanism
This is instruction-only (no install spec, no bundled code). That reduces direct risk. Note: the instructions call out 'clawhub install' which will download/install other pilot-* skills — the safety of the overall deployment depends on those packages and where clawhub fetches them from.
Credentials
The skill requests no environment variables, no credentials, and no config paths beyond creating a manifest in ~/.pilot/setups. That is proportionate to the purpose.
Persistence & Privilege
The skill does not request always:true, does not modify other skills' configuration beyond installing them via clawhub, and only writes a setup manifest in ~/.pilot — expected for a setup task. It does not claim elevated platform privileges.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install pilot-document-processing-setup
  3. After installation, invoke the skill by name or use /pilot-document-processing-setup
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release
Metadata
Slug pilot-document-processing-setup
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Pilot Document Processing Setup?

Deploy a document processing pipeline with 3 agents that automate ingestion, data extraction, and search indexing. Use this skill when: 1. User wants to set... It is an AI Agent Skill for Claude Code / OpenClaw, with 68 downloads so far.

How do I install Pilot Document Processing Setup?

Run "/install pilot-document-processing-setup" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Pilot Document Processing Setup free?

Yes, Pilot Document Processing Setup is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Pilot Document Processing Setup support?

Pilot Document Processing Setup is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Pilot Document Processing Setup?

It is built and maintained by Calin Teodor (@teoslayer); the current version is v1.0.0.
