data-pods

by init-v · GitHub ↗ · v0.2.0
cross-platform ⚠ suspicious
Downloads: 375
Stars: 0
Active Installs: 0
Versions: 2
Install in OpenClaw
/install initv-data-pods
Description
Create and manage modular portable database pods (SQLite + metadata + embeddings). Includes document ingestion with embeddings for semantic search. Full auto...
README (SKILL.md)

Data Pods

Overview

Create and manage portable, consent-scoped database pods. Handles document ingestion with embeddings and semantic search.

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Ingestion  │ ──► │   DB Pods   │ ──► │  Generation │
│  (ingest)   │     │  (storage)  │     │   (query)   │
└─────────────┘     └─────────────┘     └─────────────┘

Triggers

  • "create a pod" / "new pod"
  • "list my pods" / "what pods do I have"
  • "add to pod" / "add note" / "add content"
  • "query pod" / "search pod"
  • "ingest documents" / "add files"
  • "semantic search" / "find related content"
  • "export pod" / "pack pod"

Core Features

1. Create Pod

When user asks to create a pod:

  1. Ask for pod name and type (scholar/health/shared/projects)
  2. Run: python3 .../scripts/pod.py create <name> --type <type>
  3. Confirm creation
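
Step 2 places user-supplied values on a command line, so it may help to validate them and pass them as an argument list rather than a shell string. The following is a minimal sketch, not part of the skill: `build_create_command`, `NAME_PATTERN`, and the naming rule are hypothetical, and the script path is left as a parameter because the README elides it.

```python
import re
import sys

# Pod types listed in step 1 above.
ALLOWED_TYPES = {"scholar", "health", "shared", "projects"}

# Assumption: pod names are limited to a conservative character set.
# The skill itself does not document a naming rule.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def build_create_command(script_path: str, name: str, pod_type: str) -> list[str]:
    """Return an argv list for `pod.py create`, rejecting suspicious input."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid pod name: {name!r}")
    if pod_type not in ALLOWED_TYPES:
        raise ValueError(f"invalid pod type: {pod_type!r}")
    # An argv list (no shell=True) means no shell ever parses the user input.
    return [sys.executable, script_path, "create", name, "--type", pod_type]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`, with `script_path` set to wherever `scripts/pod.py` actually lives on the machine.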

2. Add Content (Manual)

When user asks to add content:

  1. Ask for pod name, title, content, tags
  2. Run: python3 .../scripts/pod.py add <pod> --title "<title>" --content "<content>" --tags "<tags>"
  3. Confirm

3. Ingest Documents (Automated)

When user wants to ingest files:

  1. Ask for pod name and folder path
  2. Run: python3 .../scripts/ingest.py ingest <pod> <folder>
  3. Supports: PDF, TXT, MD, DOCX, PNG, JPG
  4. Auto-embeds text (if sentence-transformers is installed)
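
Before running an ingest, it can be useful to preview which files in a folder match the supported types and whether the optional embedding package is present. This is a rough sketch under stated assumptions: `supported_files` and `embeddings_available` are hypothetical helpers, and the recursive folder walk is an assumption, since this README does not say whether ingest.py descends into subfolders.

```python
from importlib.util import find_spec
from pathlib import Path

# Extensions taken from the "Supports" line above.
SUPPORTED = {".pdf", ".txt", ".md", ".docx", ".png", ".jpg"}


def supported_files(folder: str) -> list[Path]:
    """List files under `folder` whose extension is in SUPPORTED (case-insensitive)."""
    root = Path(folder)
    # Assumption: subfolders are included; ingest.py may behave differently.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def embeddings_available() -> bool:
    """True if the optional sentence-transformers package can be found."""
    return find_spec("sentence_transformers") is not None
```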

4. Semantic Search

When user wants to search:

  1. Ask for pod name and query
  2. Run: python3 .../scripts/ingest.py search <pod> "<query>"
  3. Returns ranked results with citations
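
How ingest.py scores results is not documented here. As an illustration only, the sketch below ranks stored chunks by cosine similarity, a common choice for embedding search; `cosine`, `rank`, and the toy two-dimensional vectors are hypothetical stand-ins for real sentence-transformers embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, chunks, top_k=3):
    """chunks: iterable of (citation, vector). Returns [(score, citation)], best first."""
    scored = [(cosine(query_vec, vec), cite) for cite, vec in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]


# Toy data: each chunk carries a citation label and a made-up embedding.
chunks = [("a.pdf#1", [1.0, 0.0]), ("b.md#2", [0.0, 1.0]), ("c.txt#1", [1.0, 1.0])]
results = rank([1.0, 0.0], chunks, top_k=2)
```

Keeping the citation next to each vector is what lets a ranked result point back to its source document.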

5. Query (Basic)

When user asks to search notes:

  1. Run: python3 .../scripts/pod.py query <pod> --text "<query>"

6. Export

When user asks to export:

  1. Run: python3 .../scripts/podsync.py pack <pod>

Dependencies

pip install PyPDF2 python-docx pillow pytesseract sentence-transformers

Storage Location

~/.openclaw/data-pods/

Key Commands

# Create pod
python3 .../scripts/pod.py create research --type scholar

# Add note
python3 .../scripts/pod.py add research --title "..." --content "..." --tags "..."

# Ingest folder
python3 .../scripts/ingest.py ingest research ./documents/

# Semantic search
python3 .../scripts/ingest.py search research "transformers"

# List documents
python3 .../scripts/ingest.py list research

# Query notes
python3 .../scripts/pod.py query research --text "..."

Notes

  • Ingestion auto-chunks large documents
  • Embeddings enable semantic search
  • File hash prevents duplicate ingestion
  • All data stored locally in SQLite
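
The chunking and duplicate-detection notes above can be sketched as follows. This is an illustration, not the skill's code: the hash algorithm (SHA-256), the chunk size, and the overlap are all assumptions, and `file_digest` and `chunk_text` are hypothetical names.

```python
import hashlib


def file_digest(path: str) -> str:
    """SHA-256 of a file's bytes, read in blocks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def chunk_text(text: str, size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into windows of up to `size` characters overlapping by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
        start += size - overlap
    return chunks
```

An ingester could store each file's digest alongside its chunks and skip any file whose digest is already present.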
Usage Guidance
This skill appears to implement local data pods and a consent layer as advertised, but review the following before installing or running it:

  • Inconsistency: There are two consent implementations that store grants in different places (~/.config/data-pods/consents/grants.json vs ~/.openclaw/consent/consent.db). Verify which consent manager your agent will call so you don't accidentally bypass consent checks.
  • Sensitive exports: The tool can export pods (.vpod/.zip) and pack entire pods as a single Markdown file intended for pasting into LLMs. Do not export or paste pods containing sensitive data (health, personal, or confidential research) into external services unless you explicitly intend to share.
  • Raw SQL: pod.py supports a --sql option that executes arbitrary SQL against the local DB. Be cautious when running it in contexts where results might be returned to an agent or transmitted elsewhere.
  • Dependencies: sentence-transformers is optional but required for semantic search; installing it can pull heavy model data. Because there's no automated install, manually review and install only the dependencies you need.
  • Audit and sandbox: If you want to test, run the scripts in a disposable environment (temporary user account or VM), check where files and consent records are created, and inspect outputs before integrating into daily workflows.

If you plan to use this skill in production or with sensitive pods, ask the author to clarify which consent implementation is canonical, add a single consistent consent gateway, and consider removing or gating the 'pack for LLM' guidance to avoid accidental exfiltration.
Capability Analysis
Type: OpenClaw Skill
Name: initv-data-pods
Version: 0.2.0

This skill bundle is classified as suspicious due to multiple critical vulnerabilities that could lead to arbitrary code execution and arbitrary file system access:

  • The `SKILL.md` and `README.md` instruct the AI agent to construct shell commands using unsanitized user input, creating a shell injection vulnerability.
  • The `scripts/pod.py` file contains a severe SQL injection vulnerability in its `query_pod` function, allowing direct execution of user-provided SQL.
  • `scripts/podsync.py` is vulnerable to a Zip Slip attack in its `import_pod` function, which could lead to arbitrary file overwrite.

While there is no clear evidence of intentional malicious behavior (e.g., data exfiltration or backdoor installation), these vulnerabilities are severe enough to enable such attacks if exploited by a malicious user or a prompt-injected agent.
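
The Zip Slip finding concerns archive members whose paths resolve outside the extraction directory. A common mitigation is to resolve every member path and reject the archive if any of them escapes the destination. The sketch below shows that check in isolation; `safe_extract` is a hypothetical helper and is not the `import_pod` code from `scripts/podsync.py`.

```python
import os
import zipfile


def safe_extract(archive_path: str, dest_dir: str) -> None:
    """Extract a zip archive, refusing members that would land outside dest_dir."""
    dest = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        # Validate every member first so nothing is written if one is unsafe.
        for member in zf.infolist():
            target = os.path.realpath(os.path.join(dest, member.filename))
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError(f"unsafe path in archive: {member.filename!r}")
        zf.extractall(dest)
```

Rejecting the whole archive, rather than silently rewriting suspicious names, makes a tampered export visible to the user.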
Capability Assessment
Purpose & Capability
The code implements the advertised features: pod creation, notes, ingestion, optional embeddings (sentence-transformers), local storage under ~/.openclaw/data-pods, export/pack and a consent layer. However, there are internal inconsistencies: the repository contains two consent implementations that use different storage locations (root consent.py writes grants to ~/.config/data-pods/consents/grants.json, while scripts/consent.py uses ~/.openclaw/consent/consent.db). The README and usage text also reference different paths (e.g. /home/claudio/.openclaw/workspace...). These mismatches could be harmless (old vs new code), but they are unexpected in a clean single-purpose skill and make it unclear which consent check is actually used.
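
One way to see which consent implementation has actually run on a machine is to check which of the two stores exists. The sketch below uses only the two paths quoted above; `existing_consent_stores` is a hypothetical helper, and the optional `home` parameter exists only so the check can be pointed at a test directory.

```python
from pathlib import Path
from typing import Optional

# Store locations quoted in the assessment above, relative to the home directory.
CONSENT_STORES = {
    "consent.py (repository root)": ".config/data-pods/consents/grants.json",
    "scripts/consent.py": ".openclaw/consent/consent.db",
}


def existing_consent_stores(home: Optional[Path] = None) -> list[str]:
    """Return the labels of the consent stores that exist under `home`."""
    base = home if home is not None else Path.home()
    return [label for label, rel in CONSENT_STORES.items() if (base / rel).exists()]
```

If both labels come back, both implementations have written state, and grants recorded in one store may not be visible to the other.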
Instruction Scope
SKILL.md instructs the agent to run the included Python scripts (pod.py, ingest.py, podsync.py, consent scripts). Those scripts operate on local files and databases only, which aligns with the description. Concerns:

  1. pod.py supports a --sql option that executes raw SQL against the SQLite DB. This can leak arbitrary data if results are included in agent responses or exported.
  2. podsync.pack writes a single Markdown file 'Ready to paste into ChatGPT!'. Exporting sensitive data into a format intended for pasting into external LLMs increases exfiltration risk if users follow that guidance.
  3. There are two different consent implementations/paths (see Purpose & Capability), so an agent following SKILL.md could call one path while a user expects a different consent store, creating potential for bypassing intended consent checks.
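
For the raw-SQL concern, a text search does not need to execute user-provided SQL at all: the search term can be bound as a parameter. The sketch below assumes a hypothetical `notes(title, content)` table, since the real pod schema is not shown here, and `search_notes` is likewise a made-up name.

```python
import sqlite3


def search_notes(conn: sqlite3.Connection, text: str) -> list[tuple]:
    """Substring search with a bound parameter; `text` is never spliced into SQL."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = text.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    sql = "SELECT title, content FROM notes WHERE content LIKE ? ESCAPE '\\'"
    return conn.execute(sql, (f"%{escaped}%",)).fetchall()


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (title TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?)",
    [("a", "attention is all"), ("b", "100% sure")],
)
hits = search_notes(conn, "attention")
```

Where ad hoc SQL is still wanted, opening the database read-only (for example with a `file:` URI, `mode=ro`, and `uri=True` in `sqlite3.connect`) at least prevents writes.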
Install Mechanism
No formal install spec is declared. SKILL.md lists pip dependencies (PyPDF2, python-docx, pillow, pytesseract, sentence-transformers). That's proportionate for document parsing and embeddings, but sentence-transformers is a heavy dependency that will download large models. Because there's no install automation, users must manually install dependencies; this reduces supply-chain risk but requires care when installing large ML packages.
Credentials
The skill requests no environment variables, no credentials, and interacts with local file paths only. That is proportional to the stated purpose. There are no hard-coded network endpoints or secret exfiltration calls in the provided code.
Persistence & Privilege
The skill does not request 'always: true' or any elevated platform privilege. It writes data under user dirs (~/.openclaw and ~/.config) which is expected for a local data management tool. Export and import functions create files under ~/.openclaw/sync; those are normal for export/sync functionality.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install initv-data-pods
  3. After installation, invoke the skill by name or use /initv-data-pods
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v0.2.0
v0.2 - Document ingestion with embeddings, semantic search
v0.1.0
v0.1 - Modular portable database pods with SQLite + metadata
Metadata
Slug initv-data-pods
Version 0.2.0
License
All-time Installs 0
Active Installs 0
Total Versions 2
Frequently Asked Questions

What is data-pods?

Create and manage modular portable database pods (SQLite + metadata + embeddings). Includes document ingestion with embeddings for semantic search. Full auto... It is an AI Agent Skill for Claude Code / OpenClaw, with 375 downloads so far.

How do I install data-pods?

Run "/install initv-data-pods" in the OpenClaw or Claude Code chat to install the skill. The Python packages listed under Dependencies are not installed automatically and must be installed manually with pip.

Is data-pods free?

Yes, data-pods is completely free (open-source). You can download, install and use it at no cost.

Which platforms does data-pods support?

data-pods is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created data-pods?

It is built and maintained by init-v (@init-v); the current version is v0.2.0.
