zhiyuanw101

Claw Drive

Author: Dissao · GitHub ↗ · v0.4.4
cross-platform ✓ Security check passed
575 total downloads · 2 favorites · 4 current installs · 4 versions
Install in OpenClaw
/install claw-drive
Description
Claw Drive — AI-managed personal drive for OpenClaw. Auto-categorize, tag, deduplicate, and retrieve files with natural language. Backed by Google Drive for...
Usage instructions (SKILL.md)

Claw Drive

Organize and retrieve personal files with auto-categorization and a searchable index.

⚠️ Privacy — Read This First

File contents are personal data. Treat them accordingly.

  • NEVER read file contents without explicit user consent. Always ask first. Always.
  • If the user doesn't reply → default to SENSITIVE. Silence = no consent.
  • identity/ files are ALWAYS sensitive — never read, never extract, never log contents.
  • Extracted content enters the conversation transcript which is logged permanently to .jsonl files. Once you read a file, its contents are in the logs forever.
  • Descriptions in INDEX.jsonl are also persistent. Don't put sensitive details (SSNs, account numbers, passwords) in descriptions even for non-sensitive files — use redacted/partial forms (e.g. "account ending ****4321").
  • When in doubt, don't read. A vague index entry is better than leaked personal data.
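The redaction advice above can be sketched in a few lines. This is a minimal illustration, not part of the claw-drive CLI: the helper name `redact_description` and the two patterns (US-style SSNs, and any run of eight or more digits) are assumptions chosen for the example, and real descriptions may need broader rules.

```python
import re

# Hypothetical helper, not provided by claw-drive.
# Assumption: "sensitive numbers" means SSN-shaped strings or runs of 8+ digits.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
ACCOUNT_RE = re.compile(r"\b\d{8,}\b")


def redact_description(desc: str) -> str:
    """Mask sensitive-looking numbers, keeping only the last four digits."""
    desc = SSN_RE.sub(lambda m: "***-**-" + m.group(0)[-4:], desc)
    desc = ACCOUNT_RE.sub(lambda m: "****" + m.group(0)[-4:], desc)
    return desc


print(redact_description("Checking account 123456784321, statement 2026-02-15"))
```

Short numbers and ISO dates are left untouched, so the description stays searchable by date.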

Data locality: All data stays on your machine. INDEX.jsonl, stored files, and hash ledger are local. Conversation transcripts (.jsonl) are also local to your OpenClaw instance. Nothing is sent to external servers unless you explicitly enable Google Drive sync (optional, and only syncs the files you choose).

Dependencies

  • claw-drive CLI: brew install dissaozw/tap/claw-drive (or make install from the skill directory for manual setup)
  • pymupdf — PDF text extraction (uv run --with pymupdf — no global install needed)
  • rclone — Google Drive sync (optional): brew install rclone
  • fswatch — file watch daemon (optional): brew install fswatch

⚠️ CLI Usage — Read This Before Running Anything

ALWAYS use the claw-drive CLI. NEVER use cp, mv, or direct file writes to ~/claw-drive/.

The CLI handles copying, hashing, deduplication, and index updates atomically. Bypassing it causes:

  • Files stored without hash registration → dedup breaks silently
  • INDEX.jsonl out of sync with actual files
  • Version confusion when replacing files

PATH note: If installed via Homebrew (brew install dissaozw/tap/claw-drive), the binary is in /opt/homebrew/bin/ and should be in PATH automatically. If installed manually, ~/.local/bin may not be in the agent shell's PATH — use the full path:

~/.local/bin/claw-drive store ...

If the manual symlink is broken, re-run make install from ~/.openclaw/skills/claw-drive/ to fix it.

Setup

claw-drive init [path]

This creates the directory structure, INDEX.jsonl, and hash ledger. Default path: ~/claw-drive.

Workflow

Storing a file

When receiving a file (email attachment, Telegram upload, etc.):

  1. Privacy check — ask the user gracefully if the file contains sensitive/personal data:

    • Something like: "Should I read the contents to index it better, or would you prefer I keep it private and just use the filename?"
    • If user says sensitive, or if user doesn't reply → treat as sensitive (default-safe)
    • If user confirms it's fine to read → proceed with full extraction
    • Files going to identity/ are always sensitive — never read contents
    • Sensitive flow: classify by filename/metadata only. If that's not enough for a good description, ask the user for a brief description. Never read file contents into the conversation.
  2. Extract (normal files only) — read file contents:

    • PDFs: extract text via uv run --with pymupdf python3 -c "import pymupdf; ..." or use the image tool
    • Images: use the image tool to read/describe contents
    • Other formats: read directly if possible
    • Pull out key entities: names, dates, amounts, account/policy numbers, addresses, etc.
  3. Classify — determine the best category from the categories table below

  4. Inspect category structure — after choosing a category, examine existing subfolders in that category (e.g. with tree/find) before finalizing destination

  5. Choose destination path

    • If an existing subfolder is a clear semantic match, store there
    • If multiple existing subfolders could match (conflicting/ambiguous), store at category root
    • Store at category root when the file is only generally related to the category and lacks specific detail
    • Create a new subfolder only when no existing subfolder fits and the file has clear specific detail that justifies one
  6. Name — choose a descriptive filename: <subject>-<detail>-<YYYY-MM-DD>.<ext>

  7. Describe — write a rich description using extracted content (or user-provided description for sensitive files). Include key details (dates, amounts, IDs, names) so the file is findable by any relevant search term. Don't be vague — "insurance card" is bad, "Acme Insurance ID cards - 2024 Honda Civic, Policy ****3441, effective 1/21/2026–7/21/2026" is good.

  8. Tag — include specific tags from extracted content (model names, policy numbers, VINs, entity names) in addition to category tags

  9. Store — run the CLI (use full path if claw-drive not in PATH):

    claw-drive store <file> --category <cat> --name 'clean-name.ext' --desc 'Rich description with key details' --tags 'tag1, tag2' --source telegram
    
    • Shell quoting safety: Prefer single quotes for --desc/--tags/--name when constructing shell commands. This avoids $ expansion (e.g. currency amounts like $941.39) and prevents metadata corruption.
    • ⚠️ Do NOT use cp or write files directly to ~/claw-drive/. The CLI is the only correct way to store files: it handles copying, hashing, dedup, and index updates atomically.
  10. Report — tell the user: path, category, tags, key extracted details, and what was indexed

The CLI handles copying, hashing, deduplication, and index updates automatically. If the file is a duplicate, it will be rejected.

The --name flag lets you override the original filename (which may be ugly like file_17---8c1ee63d-...) with a clean, descriptive name.
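When the store command is assembled programmatically, the quoting advice above can be followed mechanically. The sketch below uses Python's standard `shlex` module; the function name `build_store_command` is hypothetical, and only flags shown in this document (`--category`, `--name`, `--desc`, `--tags`, `--source`) are used.

```python
import shlex


def build_store_command(file, category, name, desc, tags, source):
    """Return a shell-safe `claw-drive store` command line as one string."""
    argv = [
        "claw-drive", "store", file,
        "--category", category,
        "--name", name,
        "--desc", desc,
        "--tags", ", ".join(tags),
        "--source", source,
    ]
    # shlex.join single-quotes any argument containing spaces, $ or other
    # shell metacharacters, so "$941.39" reaches the CLI unchanged.
    return shlex.join(argv)


cmd = build_store_command(
    "invoice.pdf", "medical", "vet-invoice-2026-02-15.pdf",
    "Vet invoice, $941.39", ["medical", "invoice"], "email",
)
print(cmd)
```

If the command is executed from Python rather than pasted into a shell, passing the argument list directly to `subprocess.run` without `shell=True` sidesteps quoting altogether.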

Retrieving a file

Do NOT read INDEX.jsonl directly in the main session. Spawn a search sub-agent instead. This keeps the index out of your context window and scales to large file collections.

Why sub-agent?

The index grows with every stored file (~300 bytes/entry). At 1000+ files, reading the full index into the main agent's context wastes tokens and may hit context limits. A sub-agent runs in its own isolated session with a cheap model, reads the index, and returns only the matching entries.
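A back-of-envelope estimate makes the cost concrete. The 300 bytes per entry comes from the text above; the figure of roughly 4 bytes per token is an assumption for illustration only and varies by model and content.

```python
BYTES_PER_ENTRY = 300   # figure quoted in this document
BYTES_PER_TOKEN = 4     # rough assumption, not a measured value


def index_cost(entries: int) -> tuple[int, int]:
    """Return (index size in bytes, approximate tokens) for a given entry count."""
    size_bytes = entries * BYTES_PER_ENTRY
    return size_bytes, size_bytes // BYTES_PER_TOKEN


size, tokens = index_cost(1000)
print(f"{size / 1000:.0f} kB, ~{tokens:,} tokens")  # 300 kB, ~75,000 tokens
```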

How to spawn

Use sessions_spawn with:

  • mode: run
  • model: A lightweight model is recommended (the search task is simple). Resolution order:
    1. Explicit model param on sessions_spawn (if provided)
    2. agents.defaults.subagents.model in config (if set)
    3. Falls back to the main agent's model
  • task: The prompt below, with the user's query filled in
You are a file search agent. Read ~/claw-drive/INDEX.jsonl and find entries matching this query:

"<USER_QUERY>"

Return ONLY valid JSON, no explanation:

{
  "matches": [
    {
      "path": "<path from index>",
      "desc": "<desc from index>",
      "date": "<date from index>",
      "tags": ["<tags from index>"],
      "confidence": "high|medium|low"
    }
  ],
  "total_indexed": <number of entries in index>,
  "query": "<original query>"
}

Rules:
- Max 5 matches, sorted by relevance
- confidence: high = exact match, medium = likely relevant, low = tangential
- If no matches, return {"matches": [], "total_indexed": N, "query": "..."}
- Only read INDEX.jsonl, never read file contents

Receive and deliver

  1. The sub-agent auto-announces its result back to your session
  2. Parse the JSON from the announce message
  3. Prepend ~/claw-drive/ to each path to get the full file path
  4. Send the file: The claw-drive directory may not be in the message tool's allowed paths. If sending fails with "not under an allowed directory", copy the file to a temp location first (e.g. workspace), send it, then clean up:
    cp ~/claw-drive/<path> ~/.openclaw/workspace/
    # send via message tool
    rm ~/.openclaw/workspace/<filename>
    
  5. Never show raw sub-agent JSON to the user. The announce message is internal — immediately process it and deliver the file. The user should only see the file and a brief description, not search internals.
  6. For multiple matches, send the most relevant one and list the rest — let the user pick
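Steps 2, 3 and 6 above can be sketched as follows. This is an illustration only: the function name `pick_delivery` is hypothetical, and it assumes the announce message has already been reduced to the bare JSON object described in the sub-agent prompt.

```python
import json
from pathlib import Path
from typing import Optional


def pick_delivery(announce_json: str, drive_root: Path) -> tuple[Optional[dict], list[dict]]:
    """Parse sub-agent output; return (match to send, remaining matches to list)."""
    result = json.loads(announce_json)
    resolved = [
        {
            # Index paths are relative, so prepend the drive root.
            "full_path": drive_root / m["path"],
            "desc": m["desc"],
            "confidence": m["confidence"],
        }
        for m in result["matches"]
    ]
    if not resolved:
        return None, []
    # The sub-agent returns matches sorted by relevance, so the first one is sent.
    return resolved[0], resolved[1:]
```

A caller would send `best["full_path"]` with `best["desc"]` as the caption and offer the remaining entries as alternatives, never showing the raw JSON.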

Troubleshooting: pairing required

If sessions_spawn returns pairing required, the sub-agent's exec harness needs device pairing approval. Run:

openclaw devices list        # find the pending request
openclaw devices approve <request-id>

This is a one-time setup — once approved, subsequent spawns work without re-pairing.

Index format

INDEX.jsonl is a JSONL file — one JSON object per line. Each entry has: date, path, desc, tags (array), source, and optional fields metadata (JSON), original_name, correspondent.
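To make the format concrete, here is a minimal sketch that parses JSONL text and checks for the five non-optional fields listed above. The sample entry and its date format are invented for illustration; the actual values written by the CLI may look different.

```python
import json

REQUIRED_FIELDS = ("date", "path", "desc", "tags", "source")


def parse_index(text: str) -> list[dict]:
    """Parse JSONL text into entries, rejecting lines that lack a required field."""
    entries = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        missing = [f for f in REQUIRED_FIELDS if f not in entry]
        if missing:
            raise ValueError(f"line {lineno}: missing fields {missing}")
        entries.append(entry)
    return entries


# Hypothetical entry; field values are illustrative only.
sample = (
    '{"date": "2026-02-15", "path": "medical/my-cat-vet-invoice-2026-02-15.pdf", '
    '"desc": "Vet invoice", "tags": ["medical", "invoice"], "source": "email"}\n'
)
entries = parse_index(sample)
print(entries[0]["tags"])
```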

Updating an entry

claw-drive update <path> --desc "new description" --tags "new, tags"

Both --desc and --tags are optional (at least one required). Uses jq for atomic rewrite.

Deleting a file

claw-drive delete <path> --force

Without --force, shows what would be deleted (dry run). With --force, removes file + index entry + dedup hash.

Tagging

Tags add cross-category searchability. A file lives in one folder but can have multiple tags.

Guidelines:

  • 1-5 tags per file, comma-separated
  • Lowercase, single words or short hyphenated phrases
  • Always include the category name as a tag (e.g. medical for files in medical/)
  • Add cross-cutting tags for things like: entity names (my-cat), document type (invoice, receipt, report), context (emergency, tax-2025)
  • Reuse existing tags when possible — read INDEX.jsonl to see existing tags before inventing new ones
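The guidelines above translate into a small normalization step. The helper below is a sketch under assumptions: `normalize_tags` is a hypothetical name, and truncating to five tags is one possible way to honor the 1-5 limit (the CLI itself may not enforce it).

```python
import re


def normalize_tags(raw: str, category: str, limit: int = 5) -> list[str]:
    """Lowercase, hyphenate and de-duplicate tags, with the category tag first."""
    tags = [category.lower()]
    for piece in raw.split(","):
        # Collapse anything that is not a letter or digit into a single hyphen.
        tag = re.sub(r"[^a-z0-9]+", "-", piece.strip().lower()).strip("-")
        if tag and tag not in tags:
            tags.append(tag)
    return tags[:limit]


print(normalize_tags("Invoice, My Cat, emergency, medical", "medical"))
# ['medical', 'invoice', 'my-cat', 'emergency']
```

The result can be joined with `", ".join(...)` to form the value passed to `--tags`.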

Examples:

# Insurance PDF — after extracting: policy number, vehicle, VIN, dates, agent
claw-drive store file.pdf -c insurance -n "acme-auto-id-cards.pdf" \
  -d "Acme Insurance ID cards - 2024 Honda Civic, VIN 1HGBH41JXMN109186, Policy ****3441, effective 1/21/2026–7/21/2026, agent Jane Smith (555) 123-4567" \
  -t "insurance, auto, acme, id-card, honda-civic, california" -s telegram

# Vet invoice — after extracting: clinic, amount, diagnosis, pet name
claw-drive store invoice.pdf -c medical -n 'my-cat-vet-invoice-2026-02-15.pdf' \
  -d 'VEG emergency visit invoice - Max (cat), $1,234.56, bronchial pattern diagnosis, prednisolone prescribed' \
  -t 'medical, invoice, max, emergency, vet' -s email

# W-2 — after extracting: employer, tax year, wages
claw-drive store w2.pdf -c finance -n 'w2-2025.pdf' \
  -d 'W-2 tax form 2025 - Employer: Acme Corp, wages $120,000' \
  -t 'finance, tax-2025, w2' -s email

# Sensitive file — user said "keep it private" or didn't reply
claw-drive store scan.pdf -c identity -n "passport-scan-2026.pdf" \
  -d "Passport scan" \
  -t "identity, passport" -s telegram

# Sensitive file — user provided brief description
claw-drive store doc.pdf -c contracts -n "apartment-lease-2026.pdf" \
  -d "Apartment lease agreement, signed Jan 2026" \
  -t "contracts, lease, housing" -s email

Naming conventions

  • Lowercase, hyphens between words: my-cat-vet-invoice-2026-02-15.pdf
  • Include date when relevant
  • Include subject/entity name for clarity
  • Keep it human-readable — no UUIDs or timestamps
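These conventions, together with the `<subject>-<detail>-<YYYY-MM-DD>.<ext>` pattern from the store workflow, can be sketched as a small helper. `clean_filename` is a hypothetical name used only for illustration; the CLI simply stores whatever is passed to `--name`.

```python
import re
from datetime import date


def clean_filename(subject: str, detail: str, day: date, ext: str) -> str:
    """Build a lowercase, hyphenated name of the form subject-detail-YYYY-MM-DD.ext."""
    def slug(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    parts = [slug(subject), slug(detail), day.isoformat()]
    stem = "-".join(p for p in parts if p)  # skip empty subject/detail
    return f"{stem}.{ext.lower().lstrip('.')}"


print(clean_filename("My Cat", "Vet Invoice", date(2026, 2, 15), ".PDF"))
# my-cat-vet-invoice-2026-02-15.pdf
```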

Categories

Categories are not fixed — the agent can create any category that makes sense. The CLI does mkdir -p automatically. These are the defaults created by init, but use whatever fits:

Category    Use for
documents   General docs, letters, forms, manuals
finance     Tax returns, bank statements, investment docs, pay stubs
insurance   Insurance policies, claims, coverage documents
medical     Health records, lab results, prescriptions, pet health
travel      Boarding passes, itineraries, hotel bookings, visas
identity    Passport scans, birth certs, SSN docs (⚠️ sensitive)
receipts    Purchase receipts, warranties, service invoices
contracts   Leases, employment agreements, legal docs
photos      Personal photos, document scans
misc        Anything that doesn't fit above

Need housing/, work/, pets/? Just use it — the directory is created on first store.

When in doubt: misc/ is fine. Better to store it somewhere than not at all.

Migration

Bulk-import files from an existing directory:

# 1. Scan source directory into a plan
claw-drive migrate scan ~/messy-folder plan.json

# 2. Agent classifies each file (fills in category, name, tags, description in the JSON)

# 3. Review
claw-drive migrate summary plan.json

# 4. Dry run
claw-drive migrate apply plan.json --dry-run

# 5. Execute
claw-drive migrate apply plan.json

The plan JSON contains one entry per file with category, name, tags, description fields (initially null). The agent fills these in using the same extract-first approach, then apply copies files with full dedup and indexing.
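Before running `apply`, it can help to confirm that no entry was left unclassified. The sketch below is illustrative only: the document names the four fields but not the exact plan layout, so the code assumes the entries have already been loaded as a list of dicts, and the `source` key identifying each file is hypothetical.

```python
PLAN_FIELDS = ("category", "name", "tags", "description")


def unclassified(entries: list[dict]) -> list[dict]:
    """Return plan entries that still have at least one null classification field."""
    return [e for e in entries if any(e.get(f) is None for f in PLAN_FIELDS)]


# Hypothetical plan entries; "source" is an assumed key, not confirmed by the docs.
plan = [
    {"source": "scan1.pdf", "category": "finance", "name": "w2-2025.pdf",
     "tags": "finance, tax-2025, w2", "description": "W-2 tax form 2025"},
    {"source": "IMG_0001.jpg", "category": None, "name": None,
     "tags": None, "description": None},
]
pending = unclassified(plan)
print(len(pending), "entries still need classification")
```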

Sync (Optional)

Claw Drive can auto-sync to Google Drive (or any rclone-supported backend) via a background daemon.

Prerequisites

brew install rclone fswatch

Authorization

Run claw-drive sync auth. It opens a browser on the machine for Google sign-in.

What happens:

  • rclone requests Google Drive file access only (not full Google account)
  • OAuth token is stored locally at ~/.config/rclone/rclone.conf — never sent to any third party
  • Data flows directly from your machine to Google Drive — no intermediary servers
  • You can revoke access anytime via Google Account → Security → Third-party apps

Agent behavior during auth:

  1. Run claw-drive sync auth in background
  2. Try the OpenClaw browser tool to click through the Google consent screen
  3. If browser tool is unavailable, send the auth URL to the user and ask them to complete sign-in on the machine (e.g. via Screen Sharing)
  4. Wait for rclone to capture the token

Commands

claw-drive sync setup   # verify deps and config
claw-drive sync start   # start background daemon (fswatch + rclone)
claw-drive sync stop    # stop daemon
claw-drive sync push    # manual one-shot sync
claw-drive sync status  # show sync status

The daemon watches the drive directory for file changes and syncs to the remote within seconds. It runs as a launchd service — starts on login, restarts on failure.

Logs: ~/Library/Logs/claw-drive/sync.log

Per-category privacy

Use the exclude list in .sync-config to keep sensitive directories local-only. identity/ is excluded by default.

Verify

Check index ↔ disk ↔ hash consistency:

claw-drive verify          # report issues
claw-drive verify --fix    # auto-repair what's fixable

Auto-fixable: missing on disk (removes stale index entry), missing hash (re-registers). Manual review: orphan files (no metadata to index), hash mismatches (possible corruption).

Run verify after manual file operations or when something seems off.
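The index-versus-disk part of this check boils down to two set differences. The sketch below is not the CLI's implementation; `compare_index_to_disk` is a hypothetical name, it takes relative paths as plain strings, and it leaves out the hash comparison because the document does not describe the ledger format.

```python
def compare_index_to_disk(indexed_paths, disk_paths) -> dict:
    """Classify paths as stale index entries or orphan files."""
    indexed, on_disk = set(indexed_paths), set(disk_paths)
    return {
        # In the index but gone from disk: what `verify --fix` would drop.
        "missing_on_disk": sorted(indexed - on_disk),
        # On disk but never indexed: needs manual review.
        "orphans": sorted(on_disk - indexed),
    }


report = compare_index_to_disk(
    ["finance/w2-2025.pdf", "medical/old-invoice.pdf"],
    ["finance/w2-2025.pdf", "misc/stray-copy.pdf"],
)
print(report)
```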

Tips

  • The CLI maintains INDEX.jsonl automatically — don't edit it manually
  • PDF text extraction: uv run --with pymupdf python3 -c "import pymupdf; ..."
  • Use claw-drive status to see file counts, size, and sync status

Privacy Checklist (every store)

Before storing any file, verify:

  • Did I ask the user about privacy? (not optional)
  • If no reply: am I treating it as sensitive? (must be yes)
  • If sensitive: am I skipping content extraction? (must be yes)
  • If identity/: am I skipping extraction regardless? (must be yes)
  • Are there SSNs, full account numbers, or passwords in my description? (must be no)
  • Would I be comfortable if this INDEX.jsonl entry leaked? (must be yes)
Security recommendations
What to check before installing/using Claw Drive:

  • Review the Homebrew tap (dissaozw/tap) and the built binary before installing; third‑party taps require trusting the maintainer. If possible, build from the repository yourself or inspect the installed binary.
  • Understand logging and data flow: if the agent is allowed to read file contents, those extracted contents become part of conversation transcripts and are written to JSONL logs permanently on disk. Never allow reading of highly sensitive files unless you accept that logging behavior.
  • Keep Google sync optional: only run 'claw-drive sync auth' and 'sync start' if you intend to store credentials in your rclone config and to run a background daemon. Review ~/.config/rclone/rclone.conf and the generated launchd plist (~/Library/LaunchAgents/com.claw-drive.sync.plist) before enabling the daemon.
  • Consider agent invocation policy: by default the skill can be invoked autonomously. If you want an extra safety guard, require manual confirmation before letting the agent run store/reindex/migrate commands that read files.
  • When performing large migrations/reindexes, preview plans (dry run) and avoid enabling full content extraction until you confirm it will only run on files you expect.

If you want a more confident judgment, provide the Homebrew formula contents or the built binary so they can be inspected for unexpected network calls or privileged actions.
Functional analysis
Type: OpenClaw Skill
Name: claw-drive
Version: 0.4.4

The skill bundle demonstrates a high level of security awareness and proactive measures against common vulnerabilities. It includes robust shell injection prevention (e.g., `jq --arg`, `mapfile` for `rclone` excludes in `lib/config.sh` and `lib/sync.sh`), comprehensive path traversal checks (e.g., in `lib/migrate.sh`), and strong, explicit privacy and safety instructions for the AI agent in `SKILL.md` (e.g., 'NEVER read file contents without explicit user consent'). Network and persistence features (Google Drive sync via `rclone` and `launchd`) are optional, transparent, and securely implemented, with sensitive data excluded by default.
Capability assessment
Purpose & Capability
Name/description (AI-managed personal drive backed by Google Drive) match the files and runtime instructions. The skill requires only the 'claw-drive' binary and documents optional dependencies (rclone, fswatch, pymupdf) that are directly related to sync and content extraction. No unrelated credentials or unexpected binaries are requested.
Instruction Scope
SKILL.md explicitly tells the agent to run the claw-drive CLI and to never read file contents without explicit user consent. The instructions legitimately require reading/writing INDEX.jsonl, scanning/migrating arbitrary directories, and optionally extracting file contents (PDFs/images) into the conversation transcript. That behavior is consistent with the stated purpose, but it carries privacy risk because extracted contents are logged permanently to .jsonl transcripts — the skill documents this, but users must understand the implication before allowing reads. The docs also suggest running 'make install' from the skill directory if symlinks are broken, which means the agent (if permitted) could run build/install steps in the skill directory.
Install Mechanism
Install is via a Homebrew tap (dissaozw/tap/claw-drive) which builds/installs the 'claw-drive' binary. Using a third‑party brew tap (rather than an official org) is a reasonable delivery mechanism but requires trusting the tap maintainer. There are no opaque downloads or URLs in the install spec and the repository includes readable shell scripts (no obfuscated code).
Credentials
requires.env is empty and the skill does not demand unrelated secrets. Optional Google Drive sync uses rclone and will store credentials in the user's standard rclone config (~/.config/rclone/rclone.conf). The skill does not itself request API keys or other unrelated tokens. This is proportionate to a Google Drive sync feature.
Persistence & Privilege
always:false and the skill is user-invocable. Sync is opt-in, but starting sync installs a launchd service (com.claw-drive.sync) that the skill writes to ~/Library/LaunchAgents and loads. That creates a persistent background sync process (expected for a sync feature) and the plist embeds PATH and CLAW_DRIVE_DIR. Because the daemon can run at login and restart on failure, users should review the plist and logs before enabling.
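How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install claw-drive
  3. Once installed, invoke the Skill by name or trigger it with /claw-drive
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output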
Version history
v0.4.4
  • Adds detailed guidance for selecting destination subfolders (inspect existing, avoid ambiguous or duplicate subfolder creation)
  • Recommends always quoting CLI arguments (especially with `$`) to prevent shell expansion and metadata issues
  • Updates workflow steps to improve safety and organization when classifying and storing files
  • No breaking changes to CLI or indexing; documentation significantly clarified
v0.4.3
  • No code, file, or SKILL.md changes compared to the previous version
  • No new features, bug fixes, or updates included
v0.4.2
  • Added detailed documentation, including README, security guidelines, sync instructions, and tags usage
  • Introduced new shell libraries and scripts for configuration, deduplication, indexing, migration, reindexing, and sync operations
  • Added a test suite for improved reliability and verification
  • Updated SKILL.md with a clearer privacy statement and stronger emphasis on local data handling and user consent
  • Enhanced install and setup instructions, clarifying Homebrew and manual installation steps
v0.4.1
  • Added detailed documentation in SKILL.md covering privacy, dependencies, setup, storage workflow, and retrieval process
  • Emphasized privacy safeguards: always request explicit user consent before reading file contents; treat non-response as sensitive; files in "identity/" are strictly off-limits
  • Clarified requirement to use the dedicated `claw-drive` CLI for storing files; direct file operations are prohibited to maintain data integrity
  • Outlined step-by-step workflow for file storage (privacy check, extraction, classification, naming, description, tagging, reporting)
  • Documented retrieval via sub-agent workflow to avoid context overload and ensure scalable search across large file collections
  • Provided concrete setup, CLI usage, and retrieval instructions for end users
Metadata
Slug: claw-drive
Version: 0.4.4
License:
Total installs: 4
Current installs: 4
Versions: 4
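FAQ

What is Claw Drive?

Claw Drive: AI-managed personal drive for OpenClaw. Auto-categorize, tag, deduplicate, and retrieve files with natural language. Backed by Google Drive for... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 575 total downloads to date.

How do I install Claw Drive?

Run the command "/install claw-drive" in the OpenClaw or Claude Code chat box for a one-click install; no additional configuration is required.

Is Claw Drive free?

Yes. Claw Drive is completely free (free and open source) to download, install, and use.

Which platforms does Claw Drive support?

Claw Drive is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Claw Drive?

It is developed and maintained by Dissao (@zhiyuanw101); the current version is v0.4.4.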
