
Rag Retriever

Name: Rag Retriever
Author: yuyonghao-123

by yuyonghao-123 · GitHub ↗ · v0.1.0 · MIT-0

cross-platform ⚠ suspicious

139 Downloads

Install in OpenClaw

/install rag-retriever

Description

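Provides a RAG 2.0 retrieval system built on document chunking, simple term-frequency embeddings, and hybrid vector + keyword search, with support for Chinese and English text and source citations.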

Usage Guidance

This skill is largely coherent with its advertised RAG retriever purpose, but please consider the following before installing:

1) npm install will fetch non-trivial packages with native components (transformers, onnxruntime, sharp). Verify you are comfortable with the build time and disk usage, and run inside a controlled environment if unsure.

2) The code includes an OpenAI embedding provider that will call https://api.openai.com/v1/embeddings if you supply an API key (process.env.OPENAI_API_KEY). The registry metadata doesn't declare that env var, so only provide a key if you intend to use OpenAI and understand that network calls will be made.

3) The skill writes cache and database files under its local data/ directory (embedding-cache.json, LanceDB files). Ensure you trust the documents you add.

4) If you need higher assurance, review package-lock.json for dependency origins, run the skill in a sandbox/container, or audit the few JS files that perform network calls (embeddings.js) and file writes before granting credentials.

Capability Analysis

Type: OpenClaw Skill

Name: rag-retriever

Version: 0.1.0

The bundle implements a functional RAG (Retrieval-Augmented Generation) system providing document chunking, hybrid search (BM25 and vector), and citation management. It supports multiple embedding providers, including OpenAI and local models via Transformers.js, and utilizes LanceDB for vector storage. The code logic in files like `src/rag2.js`, `src/hybrid-search.js`, and `src/retriever.js` is consistent with the stated purpose of document retrieval and context augmentation. No evidence of malicious intent, data exfiltration, or harmful prompt injection was found; the network requests to OpenAI and Hugging Face mirrors (hf-mirror.com) are standard for this type of application.
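
For readers unfamiliar with hybrid search, the sketch below shows one common way to merge a vector ranking and a BM25 keyword ranking: Reciprocal Rank Fusion (RRF), which the release notes name as the fusion method. It is an illustration only, not the code in `src/hybrid-search.js`. The function name `reciprocalRankFusion`, the document IDs, and the constant `k = 60` (a conventional default) are assumptions.

```javascript
// Illustrative sketch only; names and the k constant are assumptions,
// not taken from the skill's source.
// RRF score for a document = sum over result lists of 1 / (k + rank).
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId, score]) => ({ docId, score }));
}

// Hypothetical result lists, best match first.
const vectorHits = ["doc-a", "doc-b", "doc-c"];
const keywordHits = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map((hit) => hit.docId).join(", "));
```

Because RRF uses only rank positions, it does not need the vector and BM25 scores to be on a comparable scale, which is why it is a popular choice for this kind of fusion.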

Capability Assessment

ℹ Purpose & Capability

Name/description, README and SKILL.md match the included code and data: document chunking, simple TF embeddings, BM25, LanceDB storage and hybrid search are implemented. Minor mismatch: the code contains an OpenAIEmbedding provider that will call OpenAI's embeddings endpoint if used, but the skill's registry metadata does not declare OPENAI_API_KEY (no required env vars). This is optional behavior and coherent with the stated plan to optionally integrate OpenAI embeddings.
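
To make "simple TF embeddings" concrete, here is a minimal sketch of a term-frequency vector compared by cosine similarity. It assumes a fixed vocabulary built from the indexed documents and a crude tokenizer that treats each CJK character as one token; the skill's actual SimpleEmbedding and its Chinese segmentation may work differently. All function names here are hypothetical.

```javascript
// Illustrative sketch only; not the skill's SimpleEmbedding implementation.
// Assumption: lowercase alphanumeric words, plus one token per CJK character.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+|[\u4e00-\u9fff]/g) ?? [];
}

// Maps text to a vector of relative term frequencies over a fixed vocabulary.
function termFrequencyVector(text, vocabulary) {
  const tokens = tokenize(text);
  const counts = new Map();
  for (const token of tokens) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  const total = tokens.length || 1; // avoid division by zero for empty text
  return vocabulary.map((term) => (counts.get(term) ?? 0) / total);
}

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical documents and query.
const docs = ["vector search with embeddings", "keyword search with BM25"];
const vocabulary = [...new Set(docs.flatMap(tokenize))];
const docVectors = docs.map((doc) => termFrequencyVector(doc, vocabulary));
const queryVector = termFrequencyVector("embeddings search", vocabulary);
const similarities = docVectors.map((v) => cosineSimilarity(queryVector, v));
console.log(similarities);
```

Embeddings of this kind capture only word overlap, not meaning, which is one reason a retriever might pair them with BM25 or offer model-based providers as an option.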

✓ Instruction Scope

SKILL.md instructs only to run local CLI commands (init/add/search/rag) and use the provided JavaScript API; runtime instructions and implementation operate on local files and the included LanceDB path. There are no instructions to scan arbitrary system files or to exfiltrate agent data. The only external network call in code is to the OpenAI embeddings API when the OpenAIEmbedding provider is used, which aligns with an embedding provider.

ℹ Install Mechanism

There is no special install spec in registry metadata (instruction-only), but package.json and package-lock indicate npm install is expected. npm will pull sizeable dependencies (e.g., @huggingface/transformers, onnxruntime variants, sharp) which may compile native modules. The repository also includes local model/tokenizer JSON files (large assets) — this increases disk usage but is not inherently malicious. Review native dependency installation and disk requirements before installing.

ℹ Credentials

The skill declares no required environment variables, which is reasonable for its local/simple-embedding default. However, the OpenAIEmbedding implementation will use process.env.OPENAI_API_KEY if present or supplied — that credential is not declared in the metadata. No other credentials or unrelated env vars are requested. If you plan to use OpenAI embeddings, you must provide an API key; otherwise the default SimpleEmbedding is used.
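
The fallback described above could be guarded explicitly, so that a key in the environment never triggers network calls by accident. The following is a hypothetical sketch, not the skill's actual selection logic: the function `selectEmbeddingProvider`, its `preferOpenAI` flag, and the returned object shape are all assumptions. Only the variable name `OPENAI_API_KEY` and the endpoint URL come from the review.

```javascript
// Hypothetical sketch; the skill's real provider selection may differ.
// Network use is opt-in: a key alone is not enough to enable OpenAI.
function selectEmbeddingProvider(env, preferOpenAI = false) {
  const apiKey = env.OPENAI_API_KEY;
  const hasKey = typeof apiKey === "string" && apiKey.trim() !== "";
  if (preferOpenAI && hasKey) {
    return {
      provider: "openai",
      endpoint: "https://api.openai.com/v1/embeddings",
      usesNetwork: true,
    };
  }
  // Default: local term-frequency embedding, no credentials, no network.
  return { provider: "simple", endpoint: null, usesNetwork: false };
}

// Plain objects stand in for process.env in this example.
const withoutKey = selectEmbeddingProvider({}, true);
const withKeyNotRequested = selectEmbeddingProvider({ OPENAI_API_KEY: "sk-example" });
const withKeyRequested = selectEmbeddingProvider({ OPENAI_API_KEY: "sk-example" }, true);
console.log(withoutKey.provider, withKeyNotRequested.provider, withKeyRequested.provider);
```

In a real run you would pass `process.env` as the first argument. Requiring an explicit opt-in alongside the key is a design choice for this sketch, intended to keep the local default predictable.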

✓ Persistence & Privilege

The skill is not always-enabled, is user-invocable, and does not request system-wide privileges or modify other skills. It writes caches and LanceDB files to its local data/ directory (e.g., data/embedding-cache.json, data/lancedb), which is expected for a retriever and is scoped to the skill's folder.

How to Use

1) Make sure OpenClaw is installed (local or Docker).
2) Run the install command in chat: /install rag-retriever
3) After installation, invoke the skill by name or use /rag-retriever
4) Provide the required inputs per the skill's parameter spec and get structured output.

Version History

v0.1.0

Initial release of RAG 2.0 retrieval system for OpenClaw.

- Implements document chunking with configurable size and overlap.
- Supports simple text embedding (TF-based) and LanceDB vector storage.
- Provides hybrid search: vector similarity plus BM25 keyword search (RRF fusion).
- Enables source citation tracking and context-augmented RAG prompts.
- Adds Chinese segmentation for multilingual search.
- Includes CLI tool, JavaScript API, and basic tests.
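
As an illustration of chunking with configurable size and overlap, the sketch below splits text into fixed-size character windows that share a margin with their neighbors. It is a minimal example under stated assumptions, not the skill's chunker: the function name `chunkText`, the default values, and the character-based (rather than sentence- or token-based) splitting are all hypothetical.

```javascript
// Illustrative sketch only; the skill's chunker may split differently.
// Assumption: sizes are measured in characters (UTF-16 code units).
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be an integer in [0, chunkSize)");
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    // Offsets are kept so a retrieved chunk can be traced back to its source.
    chunks.push({ text: text.slice(start, end), start, end });
    if (end === text.length) break; // last window reached the end of the text
  }
  return chunks;
}

const sample = "abcdefghijklmnopqrstuvwxyz";
const chunks = chunkText(sample, 10, 3);
for (const chunk of chunks) {
  console.log(chunk.start, chunk.end, chunk.text);
}
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk when it is shorter than the overlap, at the cost of storing some text twice.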

Metadata

Slug rag-retriever

Version 0.1.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Rag Retriever?

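Rag Retriever provides a RAG 2.0 retrieval system built on document chunking, simple term-frequency embeddings, and hybrid vector + keyword search, with support for Chinese and English text and source citations. It is an AI Agent Skill for Claude Code / OpenClaw, with 139 downloads so far.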

How do I install Rag Retriever?

Run "/install rag-retriever" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Rag Retriever free?

Yes, Rag Retriever is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Rag Retriever support?

Rag Retriever is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Rag Retriever?

It is built and maintained by yuyonghao-123 (@yuyonghao-123); the current version is v0.1.0.
