RAG Pipeline Starter

by abhinas90 · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ✓ Security Clean
Downloads: 71 · Stars: 0 · Active Installs: 0 · Versions: 1
Install in OpenClaw: /install rag-pipeline-starter
Description
Set up and optimize RAG pipelines for large datasets (50K-500K rows) with document chunking, embedding benchmarking, vector indexing, and retrieval tuning.
Usage Guidance
What to consider before installing or running:
- The package is instruction+code only and runs entirely on local files. There are no network calls or credential requests in the code, which reduces exfiltration risk.
- The scripts create and modify files under the directories you pass as --output, --index, or --chunks. Run them in a controlled workspace or sandbox if you are testing, and avoid pointing them at sensitive system directories.
- The embedding benchmark is mostly a mock/demo implementation. It contains a small bug (a function name mismatch: compute_similarity__mock vs. compute_similarity_mock) that may cause runtime errors, so expect to edit the code before production use.
- The recommendation logic picks a strategy from the first analyzed document rather than aggregating across all documents. Review it if you need different behavior.
- If you plan to plug in real (paid) embedding providers, you will need to manage API keys yourself; this skill does not request or manage credentials. Keep keys out of plain text and use secure storage.
- Best practice: inspect the files locally (you already have them), run on a small sample dataset first, and run under a restricted environment (container or VM) if you are unsure.
Given the available materials, the skill appears internally consistent and implements the features it claims; no indicators of data exfiltration or unrelated privileges were found.
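The note above about the strategy being chosen from the first analyzed document could be addressed with a simple majority vote across all documents. The following is only a sketch: the function name `recommend_strategy`, the `"strategy"` key, and the strategy labels are hypothetical and are not taken from the skill's actual code, whose data structures may differ.

```python
from collections import Counter
from typing import Dict, List


def recommend_strategy(per_document: List[Dict[str, str]]) -> str:
    """Return the strategy recommended most often across all documents.

    Each item is a hypothetical per-document analysis result carrying a
    "strategy" key; adapt the key to whatever the real analyzer emits.
    """
    if not per_document:
        raise ValueError("no analyzed documents to aggregate")
    votes = Counter(doc["strategy"] for doc in per_document)
    # most_common(1) yields the top (strategy, count) pair; on a tie the
    # strategy encountered first wins.
    return votes.most_common(1)[0][0]


# Illustrative input: three analyzed documents with made-up labels.
analyses = [
    {"path": "a.md", "strategy": "fixed"},
    {"path": "b.md", "strategy": "semantic"},
    {"path": "c.md", "strategy": "semantic"},
]
print(recommend_strategy(analyses))
```

A majority vote is the simplest aggregation; weighting each vote by document length would be a reasonable alternative when document sizes vary widely.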
Capability Analysis
Type: OpenClaw Skill
Name: rag-pipeline-starter
Version: 1.0.0
The skill bundle provides a legitimate boilerplate toolkit for building RAG (Retrieval-Augmented Generation) pipelines, including document chunking, embedding benchmarking, and retrieval tuning. Analysis of the Python scripts (chunking_analyzer.py, embedding_benchmark.py, retrieval_tuner.py, and vector_store_manager.py) reveals standard data processing logic and local file I/O without any evidence of network exfiltration, unauthorized command execution, or prompt injection attacks.
Capability Assessment
Purpose & Capability
Name/description match what the code provides: chunking analyzer, embedding benchmark, retrieval tuner, and a simple vector store manager. Required resources (none) and included scripts are proportionate to the stated purpose.
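For readers unfamiliar with what a chunking step does, here is a minimal sketch of fixed-size chunking with overlap, one common strategy such an analyzer might compare. It is not the code from chunking_analyzer.py; the function name `chunk_text` and its parameters are assumptions made for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij"
print(chunk_text(sample, chunk_size=4, overlap=2))
```

Overlap keeps sentences that straddle a boundary retrievable from at least one chunk, at the cost of storing and embedding some text twice.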
Instruction Scope
SKILL.md instructs running the included Python scripts on local data and creating local indexes. The runtime instructions only reference local files/directories and the included scripts; they do not instruct the agent to read unrelated system files, access credentials, or send data to external endpoints.
Install Mechanism
No install spec is provided; the skill is instruction+code only. This minimizes installation risk because nothing is downloaded or written by an installer. The only runtime requirements are Python 3.8+ and typical Python packages (numpy, and optionally sentence-transformers).
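Because nothing is installed automatically, it can help to confirm the runtime requirements before running the scripts. The check below is a sketch using only the standard library; the helper name `check_environment` is hypothetical, and `sentence_transformers` is the import name of the sentence-transformers package.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)
PACKAGES = ["numpy", "sentence_transformers"]  # the second one is optional


def check_environment(packages=PACKAGES, min_python=MIN_PYTHON):
    """Report whether the interpreter and each package are available."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


print(check_environment())
```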
Credentials
The skill requests no environment variables or credentials. The code reads and writes local files (chunks, indexes) which is expected for its purpose. There are no references to network endpoints, cloud credentials, or unrelated secrets.
Persistence & Privilege
The skill does not set always: true and does not modify other skills or system-wide agent configuration. It persists data only to its own index directories and files when run, which is expected behavior for a local vector store manager.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install rag-pipeline-starter
  3. After installation, invoke the skill by name or use /rag-pipeline-starter
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release
Metadata
Slug rag-pipeline-starter
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is RAG Pipeline Starter?

RAG Pipeline Starter sets up and optimizes RAG pipelines for large datasets (50K-500K rows) with document chunking, embedding benchmarking, vector indexing, and retrieval tuning. It is an AI Agent Skill for Claude Code / OpenClaw, with 71 downloads so far.

How do I install RAG Pipeline Starter?

Run "/install rag-pipeline-starter" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is RAG Pipeline Starter free?

Yes, RAG Pipeline Starter is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does RAG Pipeline Starter support?

RAG Pipeline Starter is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created RAG Pipeline Starter?

It is built and maintained by abhinas90 (@abhinas90); the current version is v1.0.0.
