
data-pods

Name: data-pods
Author: init-v

By init-v · GitHub · v0.2.0

cross-platform ⚠ suspicious

Total downloads: 375

Install in OpenClaw

/install initv-data-pods

Description

Create and manage modular portable database pods (SQLite + metadata + embeddings). Includes document ingestion with embeddings for semantic search. Full auto...

Usage (SKILL.md)

Data Pods

Overview

Create and manage portable, consent-scoped database pods. Handles document ingestion with embeddings and semantic search.

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Ingestion  │ ──► │   DB Pods   │ ──► │  Generation │
│  (ingest)   │     │  (storage)  │     │   (query)   │
└─────────────┘     └─────────────┘     └─────────────┘

Triggers

"create a pod" / "new pod"
"list my pods" / "what pods do I have"
"add to pod" / "add note" / "add content"
"query pod" / "search pod"
"ingest documents" / "add files"
"semantic search" / "find related content"
"export pod" / "pack pod"

Core Features

1. Create Pod

When user asks to create a pod:

Ask for pod name and type (scholar/health/shared/projects)
Run: python3 .../scripts/pod.py create <name> --type <type>
Confirm creation
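The create step passes user-supplied values to a command line. The sketch below shows one way an agent could validate those values and assemble the command as an argument list, so that no shell ever interprets the input. The helper name `build_create_command`, the name pattern, and the `script_path` parameter are assumptions for illustration; the real script location is elided in the instructions above and has to be supplied by the caller.

```python
import re
import sys

# Pod types listed in the Create Pod step.
POD_TYPES = {"scholar", "health", "shared", "projects"}

# Assumed naming rule (not taken from pod.py): letters, digits, "_" and "-".
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_create_command(script_path: str, name: str, pod_type: str) -> list[str]:
    """Return an argv list for the 'create' subcommand, or raise ValueError."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid pod name: {name!r}")
    if pod_type not in POD_TYPES:
        raise ValueError(f"unknown pod type: {pod_type!r}")
    # A list of arguments is passed to the process as-is; nothing is parsed
    # by a shell, so characters such as ';' or '$( )' have no special meaning.
    return [sys.executable, script_path, "create", name, "--type", pod_type]


# Hypothetical usage (script_path must point at the installed pod.py):
#   import subprocess
#   subprocess.run(build_create_command(script_path, "research", "scholar"), check=True)
```

The same pattern would apply to the add, ingest, search, query, and pack commands, with free-text fields such as titles passed as single list elements.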

2. Add Content (Manual)

When user asks to add content:

Ask for pod name, title, content, tags
Run: python3 .../scripts/pod.py add <pod> --title "<title>" --content "<content>" --tags "<tags>"
Confirm

3. Ingest Documents (Automated)

When user wants to ingest files:

Ask for pod name and folder path
Run: python3 .../scripts/ingest.py ingest <pod> <folder>
Supports: PDF, TXT, MD, DOCX, PNG, JPG
Auto-embeds text (if sentence-transformers installed)
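To preview which files an ingestion run might pick up, the supported formats listed above can be turned into a simple extension filter. This is a sketch only: the function name `find_ingestable` is hypothetical, and whether ingest.py recurses into subfolders or matches extensions case-insensitively is an assumption here, not something the instructions state.

```python
from pathlib import Path

# Extensions taken from the "Supports" line above.
SUPPORTED_SUFFIXES = {".pdf", ".txt", ".md", ".docx", ".png", ".jpg"}


def find_ingestable(folder) -> list[Path]:
    """List files under `folder` whose extension is in the supported set."""
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")  # assumption: subfolders are included
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Running this on a folder before ingesting it gives a quick check that nothing unexpected is about to be added to a pod.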

4. Semantic Search

When user wants to search:

Ask for pod name and query
Run: python3 .../scripts/ingest.py search <pod> "<query>"
Returns ranked results with citations
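How ingest.py ranks results is not shown on this page. As a general illustration of embedding-based ranking, the sketch below scores stored chunks by cosine similarity against a query vector and returns the best matches together with their source, which is what makes a citation possible. The function names and the `(source, text, vector)` tuple layout are assumptions, and the vectors are toy values rather than real sentence-transformers output.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunks, top_k=3):
    """Return up to top_k (score, source, text) tuples, highest score first.

    `chunks` is an iterable of (source, text, vector) tuples.
    """
    scored = [(cosine(query_vec, vec), source, text) for source, text, vec in chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

In a real pod the vectors would come from an embedding model, and the source field would carry the file name or chunk position used in the citation.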

5. Query (Basic)

When user asks to search notes:

Run: python3 .../scripts/pod.py query <pod> --text "<query>"

6. Export

When user asks to export:

Run: python3 .../scripts/podsync.py pack <pod>

Dependencies

pip install PyPDF2 python-docx pillow pytesseract sentence-transformers

Storage Location

~/.openclaw/data-pods/

Key Commands

# Create pod
python3 .../scripts/pod.py create research --type scholar

# Add note
python3 .../scripts/pod.py add research --title "..." --content "..." --tags "..."

# Ingest folder
python3 .../scripts/ingest.py ingest research ./documents/

# Semantic search
python3 .../scripts/ingest.py search research "transformers"

# List documents
python3 .../scripts/ingest.py list research

# Query notes
python3 .../scripts/pod.py query research --text "..."

Notes

Ingestion auto-chunks large documents
Embeddings enable semantic search
File hash prevents duplicate ingestion
All data stored locally in SQLite
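Two of the notes above, chunking and hash-based duplicate detection, can be illustrated with a short sketch. It is not the code in ingest.py: the chunk size, the overlap, the SHA-256 choice, and the `documents` table are all assumptions made for the example.

```python
import hashlib
import sqlite3


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the window already reached the end of the text
        start += size - overlap
    return chunks


def init_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table: the content hash is the primary key.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (sha256 TEXT PRIMARY KEY, name TEXT)"
    )


def ingest_once(conn: sqlite3.Connection, name: str, data: bytes) -> bool:
    """Record a document by content hash; return False if it was seen before."""
    digest = hashlib.sha256(data).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO documents (sha256, name) VALUES (?, ?)",
        (digest, name),
    )
    conn.commit()
    return cur.rowcount == 1
```

Keying on the content hash rather than the file name means a renamed copy of the same file is still treated as a duplicate.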

Security Recommendations

This skill appears to implement local data pods and a consent layer as advertised, but review the following before installing or running it:

- Inconsistency: There are two consent implementations that store grants in different places (~/.config/data-pods/consents/grants.json vs ~/.openclaw/consent/consent.db). Verify which consent manager your agent will call so you don't accidentally bypass consent checks.
- Sensitive exports: The tool can export pods (.vpod/.zip) and pack entire pods as a single Markdown file intended for pasting into LLMs. Do not export or paste pods containing sensitive data (health, personal, or confidential research) into external services unless you explicitly intend to share.
- Raw SQL: pod.py supports a --sql option that executes arbitrary SQL against the local DB. Be cautious when running it in contexts where results might be returned to an agent or transmitted elsewhere.
- Dependencies: sentence-transformers is optional but required for semantic search; installing it can pull heavy model data. Because there's no automated install, manually review and install only the dependencies you need.
- Audit and sandbox: If you want to test, run the scripts in a disposable environment (temporary user account or VM), check where files and consent records are created, and inspect outputs before integrating into daily workflows.

If you plan to use this skill in production or with sensitive pods, ask the author to clarify which consent implementation is canonical, add a single consistent consent gateway, and consider removing or gating the 'pack for LLM' guidance to avoid accidental exfiltration.
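One lightweight way to see where a script writes its files, short of a full VM, is to run it with the home directory pointed at a scratch folder and list what appears there. The sketch below assumes a POSIX system where the scripts resolve `~` through the `HOME` environment variable; the function name `run_in_scratch_home` is hypothetical, and this is not a security sandbox, since the process can still read and write anywhere the user can.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def run_in_scratch_home(argv: list[str]) -> list[str]:
    """Run a command with HOME set to a temporary directory.

    Returns the files created under that directory, as paths relative to it.
    The directory is deleted when the function returns.
    """
    with tempfile.TemporaryDirectory() as scratch:
        env = dict(os.environ, HOME=scratch)
        subprocess.run(argv, env=env, check=True)
        root = Path(scratch)
        return sorted(
            str(path.relative_to(root)) for path in root.rglob("*") if path.is_file()
        )
```

Passing the argument list for a pod command to this helper would show whether files land under .openclaw, .config, or both.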

Functional Analysis

Type: OpenClaw Skill
Name: initv-data-pods
Version: 0.2.0

This skill bundle is classified as suspicious due to multiple critical vulnerabilities that could lead to arbitrary code execution and arbitrary file system access. The `SKILL.md` and `README.md` instruct the AI agent to construct shell commands using unsanitized user input, creating a shell injection vulnerability. The `scripts/pod.py` file contains a severe SQL injection vulnerability in its `query_pod` function, allowing direct execution of user-provided SQL. Additionally, `scripts/podsync.py` is vulnerable to a Zip Slip attack in its `import_pod` function, which could lead to arbitrary file overwrite.

While there is no clear evidence of intentional malicious behavior (e.g., data exfiltration or backdoor installation), these vulnerabilities are severe enough to enable such attacks if exploited by a malicious user or a prompt-injected agent.
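A Zip Slip attack relies on archive entries whose names, such as ../evil.txt, resolve outside the extraction folder. The sketch below shows a common defensive pattern: resolve every entry's target path and refuse the whole archive if any target escapes the destination. It is a generic illustration, not a patch for `import_pod`; the function name `safe_extract` is hypothetical, and entries that are symbolic links are not handled.

```python
import os
import zipfile


def safe_extract(archive_path: str, dest_dir: str) -> None:
    """Extract a zip archive only if every entry stays inside dest_dir."""
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        # Validate all entries first so nothing is written for a bad archive.
        for member in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.filename!r}")
        zf.extractall(dest_root)
```

Rejecting the archive outright, rather than silently skipping the offending entries, makes a tampered export visible to the user.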

Capability Assessment

ℹ Purpose & Capability

The code implements the advertised features: pod creation, notes, ingestion, optional embeddings (sentence-transformers), local storage under ~/.openclaw/data-pods, export/pack and a consent layer. However there are internal inconsistencies: the repository contains two consent implementations that use different storage locations (root consent.py writes grants to ~/.config/data-pods/consents/grants.json, while scripts/consent.py uses ~/.openclaw/consent/consent.db). README/usage also reference different paths (e.g. /home/claudio/.openclaw/workspace...). These mismatches could be harmless (old vs new code) but are disproportionate to a clean single-purpose skill and can create confusion about which consent check is actually used.
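Because the two consent implementations write to different locations, a quick check of which store actually exists on a machine can help show which one is in use. The sketch below only tests for the two paths named above; the function name `find_consent_stores` and the labels are assumptions for illustration.

```python
from pathlib import Path

# Relative locations of the two consent stores described above.
CONSENT_STORES = {
    "root consent.py (JSON grants)": Path(".config/data-pods/consents/grants.json"),
    "scripts/consent.py (SQLite)": Path(".openclaw/consent/consent.db"),
}


def find_consent_stores(home: Path) -> dict[str, bool]:
    """Report which of the two known consent stores exist under `home`."""
    return {label: (home / rel).is_file() for label, rel in CONSENT_STORES.items()}


# Hypothetical usage:
#   for label, present in find_consent_stores(Path.home()).items():
#       print(f"{label}: {'present' if present else 'absent'}")
```

If both stores are present after normal use, grants recorded in one are presumably invisible to code that reads the other, which is the mismatch the assessment warns about.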

⚠ Instruction Scope

SKILL.md instructs the agent to run the included Python scripts (pod.py, ingest.py, podsync.py, consent scripts). Those scripts operate on local files and databases only, which aligns with the description. Concerns:

1) pod.py supports a --sql option that executes raw SQL against the SQLite DB; this can leak arbitrary data if results are included in agent responses or exported.
2) podsync.pack writes a single Markdown file ('Ready to paste into ChatGPT!'); exporting sensitive data into a format intended for pasting into external LLMs increases exfiltration risk if users follow that guidance.
3) There are two different consent implementations/paths (see Purpose & Capability), so an agent following SKILL.md could call one path while a user expects a different consent store, creating the potential to bypass intended consent checks.
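For contrast with executing user-provided SQL, the sketch below shows a text search that binds the user's input as a parameter, so the input can never change the statement itself. The `notes` table with `title` and `content` columns is a hypothetical schema, not the one pod.py uses, and `search_notes` is an illustrative name.

```python
import sqlite3


def escape_like(text: str) -> str:
    """Escape LIKE wildcards so the user's text is matched literally."""
    return text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_notes(conn: sqlite3.Connection, text: str) -> list[tuple]:
    """Substring search over a hypothetical notes(title, content) table."""
    pattern = f"%{escape_like(text)}%"
    # The SQL text is fixed; the user's input travels only as a bound value.
    sql = (
        "SELECT title, content FROM notes "
        "WHERE title LIKE ? ESCAPE '\\' OR content LIKE ? ESCAPE '\\' "
        "ORDER BY title"
    )
    return conn.execute(sql, (pattern, pattern)).fetchall()
```

A query interface limited to calls like this one exposes far less than a --sql passthrough, at the cost of flexibility.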

ℹ Install Mechanism

No formal install spec is declared. SKILL.md lists pip dependencies (PyPDF2, python-docx, pillow, pytesseract, sentence-transformers). That's proportionate for document parsing and embeddings, but sentence-transformers is a heavy dependency that will download large models. Because there's no install automation, users must manually install dependencies; this reduces supply-chain risk but requires care when installing large ML packages.

✓ Credentials

The skill requests no environment variables, no credentials, and interacts with local file paths only. That is proportional to the stated purpose. There are no hard-coded network endpoints or secret exfiltration calls in the provided code.

✓ Persistence & Privilege

The skill does not request 'always: true' or any elevated platform privilege. It writes data under user dirs (~/.openclaw and ~/.config) which is expected for a local data management tool. Export and import functions create files under ~/.openclaw/sync; those are normal for export/sync functionality.

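How to Use

Make sure OpenClaw is installed (locally or via Docker)
Enter the install command in the chat box: /install initv-data-pods
Once installed, invoke the skill by name or trigger it with /initv-data-pods
Provide the required inputs according to the skill's parameter description to get structured output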

Version History

v0.2.0

v0.2 - Document ingestion with embeddings, semantic search

v0.1.0

v0.1 - Modular portable database pods with SQLite + metadata

Metadata

Slug initv-data-pods

Version 0.2.0

License: not specified

Total installs 0

Current installs 0

Version count 2

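FAQ

What is data-pods?

Create and manage modular portable database pods (SQLite + metadata + embeddings). Includes document ingestion with embeddings for semantic search. Full auto... It is an AI agent skill plugin for Claude Code / OpenClaw, with 375 total downloads to date.

How do I install data-pods?

Run the command "/install initv-data-pods" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is data-pods free?

Yes. data-pods is completely free (free and open source) and can be downloaded, installed, and used without restriction.

Which platforms does data-pods support?

data-pods is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops data-pods?

It is developed and maintained by init-v (@init-v); the current version is v0.2.0.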