ryno2390

Codebase Search

Author: Ryne Schultz · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 124
Favorites: 0
Current installs: 0
Versions: 1
Install in OpenClaw
/install codebase-search
Description
Build a persistent semantic vector index over a Python codebase and search it with natural language. Use when an agent needs to find relevant classes, functi...
Usage instructions (SKILL.md)

Codebase Search

Builds a persistent ChromaDB vector index over Python source files and enables semantic search with natural language queries.

Quick Start

1. Install the scripts

Copy scripts/code_chunker.py and scripts/code_index.py into your project. They have no dependencies beyond chromadb (install with pip install chromadb).

2. Build the index

import asyncio
from code_index import CodebaseIndex

index = CodebaseIndex(repo_root="/path/to/repo")
count = asyncio.run(index.build())
print(f"Indexed {count} symbols")

The index persists to {repo_root}/.codebase_index/ and survives restarts. Subsequent calls to build() are fast — only new/changed files are indexed.
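
The skill's actual change-detection logic is not shown in this document. As a rough sketch of how incremental indexing of this kind can work, the following compares content hashes against a manifest stored in the index directory. The manifest filename (manifest.json) and the function name changed_files are hypothetical and are not part of the skill.

```python
import hashlib
import json
from pathlib import Path


def changed_files(repo_root: str, manifest_name: str = "manifest.json") -> list[Path]:
    """Return .py files whose content hash differs from the saved manifest.

    Hypothetical helper: the manifest name and layout are assumptions,
    not the skill's real on-disk format.
    """
    root = Path(repo_root)
    index_dir = root / ".codebase_index"
    manifest_path = index_dir / manifest_name

    # Load the hashes recorded by the previous run, if any.
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}

    new, changed = {}, []
    for path in sorted(root.rglob("*.py")):
        if index_dir in path.parents:
            continue  # never index the index directory itself
        rel = path.relative_to(root).as_posix()
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        new[rel] = digest
        if old.get(rel) != digest:
            changed.append(path)

    # Persist the new state so the next call only reports later edits.
    index_dir.mkdir(exist_ok=True)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

With a scheme like this, a second call on an unchanged repository returns an empty list, which is why a rebuild can skip most of the work.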

3. Search

results = asyncio.run(index.search("token payment handling", top_k=5))
for r in results:
    print(f"[{r.score:.2f}] {r.symbol_name} ({r.symbol_type}) — {r.filepath}:{r.start_line}")

Convenience API (when integrated into a project)

If code_chunker.py and code_index.py are in the project as a module, use the singleton helper:

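import asyncio
from prsm.compute.nwtn.corpus import search_codebase

# search_codebase is awaited in the original; inside an async function, use `await` instead
results = asyncio.run(search_codebase("circuit breaker", top_k=3))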

Key Options

| Parameter | Default | Description |
| --- | --- | --- |
| top_k | 5 | Number of results to return |
| symbol_type | None | Filter to "class" or "function" |
| force_rebuild | False | Wipe and rebuild entire index |
| exclude_patterns | see below | Directories to skip |

Default excludes: __pycache__, .venv, migrations, tests, scripts, .git, node_modules, .codebase_index
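
How the skill matches exclude_patterns is not documented here. A minimal sketch, assuming each pattern is compared against individual directory names while walking the tree (the function name iter_python_files and the constant DEFAULT_EXCLUDES are hypothetical):

```python
import os
from pathlib import Path
from typing import Iterator

# Mirrors the default exclude list documented above.
DEFAULT_EXCLUDES = {
    "__pycache__", ".venv", "migrations", "tests",
    "scripts", ".git", "node_modules", ".codebase_index",
}


def iter_python_files(repo_root: str, excludes=DEFAULT_EXCLUDES) -> Iterator[Path]:
    """Yield .py files under repo_root, skipping excluded directory names."""
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in excludes)
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield Path(dirpath) / name
```

Because tests and scripts are in the default list, a smaller set would be needed if those directories should be searchable.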

How It Works

  1. Chunking: CodeChunker uses Python's ast module to extract every top-level class and function from each .py file. It captures the name, type, docstring, line numbers, and source of each symbol.
  2. Indexing: ChromaDB stores each chunk as a document of the form "{symbol_name}: {docstring or first 300 chars of source}", embedded with ChromaDB's default embedding function (no API key needed).
  3. Search: A cosine-similarity query returns ranked SearchResult objects with filepath, symbol name, line numbers, docstring, and relevance score.
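
The chunking and document-building steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the skill's CodeChunker: the names Chunk, chunk_source, and to_document are hypothetical, and only the document format string is taken from the description above.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str
    symbol_type: str  # "class" or "function"
    docstring: str
    start_line: int
    end_line: int
    source: str


def chunk_source(source: str) -> list[Chunk]:
    """Extract top-level classes and functions from Python source text."""
    chunks = []
    for node in ast.parse(source).body:  # .body holds top-level statements only
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue  # imports, assignments, etc. are not indexed
        chunks.append(Chunk(
            name=node.name,
            symbol_type=kind,
            docstring=ast.get_docstring(node) or "",
            start_line=node.lineno,
            end_line=node.end_lineno,
            source=ast.get_source_segment(source, node) or "",
        ))
    return chunks


def to_document(chunk: Chunk) -> str:
    """Build the text to embed: symbol name plus docstring, else a source prefix."""
    return f"{chunk.name}: {chunk.docstring or chunk.source[:300]}"
```

Because ast only parses the source and never executes it, chunking in this style is safe to run over untrusted files. Methods nested inside a class are not separate chunks here; they are covered only through the enclosing class's source.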

.gitignore

Always add .codebase_index/ to .gitignore — it's a local artifact, not source code.

Reference

See references/integration.md for integration patterns, including how to wire semantic search into sub-agent delegation prompts.

Security usage recommendations
This skill appears to do what it says (build a local ChromaDB-based semantic index of Python code), but there are practical inconsistencies you should be aware of before installing or copying files:

  - Import/path mismatch: code_index.py uses imports like from prsm.compute.nwtn.corpus.code_chunker import CodeChunker and the SKILL.md shows convenience imports under prsm.compute.nwtn.corpus. The provided files live under scripts/. If you copy the scripts into a project as-is, those imports will likely fail. Either run the scripts from a package layout matching prsm.* or edit imports to use relative/local module names (e.g., import code_chunker).
  - Dependencies: you must install chromadb (and its runtime dependencies such as onnxruntime/tokenizers) as noted. Verify those packages are acceptable in your environment (they can be heavy and may require native wheels).
  - Local indexing and privacy: the index is persisted to {repo_root}/.codebase_index and will contain snippets and docstrings from your code. Add that directory to .gitignore (as SKILL.md suggests) and consider whether you want it included in backups or shared storage.
  - Behavior is local-only: there are no network calls or secret exfiltration in the code, but the indexer will read all .py files not excluded by default. Review the exclude list and adjust if you need to skip sensitive directories.

Recommended next steps: test in a disposable/sandbox repo first, fix the import paths or package layout before using in production, and confirm chromadb works in your environment. If the package is to be used by an agent, ensure the agent's runtime has chromadb and the correct module path available.
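
One way to make the copied scripts tolerate either layout is an import fallback. The helper below is a hypothetical sketch (import_first is not part of the skill); it tries each candidate module path in order and returns the first one that imports.

```python
import importlib
from types import ModuleType


def import_first(*candidates: str) -> ModuleType:
    """Import and return the first module in candidates that is importable."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Possible use inside code_index.py (assumes code_chunker.py sits next to it):
#   chunker = import_first("prsm.compute.nwtn.corpus.code_chunker", "code_chunker")
#   CodeChunker = chunker.CodeChunker
```

Editing the import to a plain local name is simpler if the scripts will only ever live in one place; the fallback is mainly useful when the same files must work both inside and outside the prsm package.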
Functional analysis
Type: OpenClaw Skill
Name: codebase-search
Version: 1.0.0

The skill provides a legitimate semantic search tool for Python codebases using ChromaDB and Python's AST module for safe code parsing. The logic is transparent and aligns with the stated purpose of indexing and searching functions and classes. While there is a functional bug in `scripts/code_index.py` (a hardcoded internal import path `prsm.compute.nwtn.corpus.code_chunker` that would fail if the files are moved as suggested in the documentation), there is no evidence of malicious intent, data exfiltration, or prompt injection.
Capability assessment
Purpose & Capability
The name/description (persistent semantic vector index over a Python codebase) matches the code and instructions: the scripts chunk .py files and persist vectors to a local .codebase_index using ChromaDB. However, code_index.py and SKILL.md reference a package namespace (prsm.compute.nwtn.corpus.* and a convenience import path) that does not match the provided file layout (scripts/*.py). That mismatch means the scripts will likely fail if simply copied into a project as the Quick Start suggests.
Instruction Scope
Runtime instructions and the code stay within the stated purpose: they scan Python files (with explicit exclude patterns), extract top-level classes/functions via ast, store docstrings/snippets, and run local ChromaDB queries. The skill does read the repository tree and files (expected for this purpose) and persists an index to {repo_root}/.codebase_index/; it also instructs adding that dir to .gitignore.
Install Mechanism
There is no install spec (instruction-only), which is low-risk, but the SKILL.md requires pip install chromadb and mentions onnxruntime/tokenizers. The code imports chromadb at runtime. The documentation omits explicit steps for making the provided scripts available under the prsm.* package namespace, creating a likely friction/operational issue. No remote downloads or obscure URLs are used.
Credentials
The skill requests no environment variables, credentials, or config paths. It only needs local filesystem access to the repository to index and to write the .codebase_index persistence directory — this is proportional to its stated purpose.
Persistence & Privilege
The skill persists an index under the repo (.codebase_index) and asks users to .gitignore it. It does not request always:true and does not modify other skills or system-wide settings. The persistence is limited to its own directory.
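How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install codebase-search
  3. Once installation finishes, invoke the Skill by name or trigger it with /codebase-search
  4. Provide the required inputs as described by the Skill's parameters to get structured output.
Version history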
v1.0.0
  - Initial release of codebase-search: build and persist a semantic vector index over a Python codebase for natural language search.
  - Enables retrieval of relevant classes, functions, or modules by meaning, not just exact name matches.
  - Fast incremental indexing with support for filtering by symbol type, customizing exclusions, and easy integration into agentic workflows.
  - Designed for Python projects; not intended for non-Python codebases or exact-string/grep searches.
Metadata
Slug codebase-search
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Version count 1
FAQ

What is Codebase Search?

Build a persistent semantic vector index over a Python codebase and search it with natural language. Use when an agent needs to find relevant classes, functi... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 124 total downloads so far.

How do I install Codebase Search?

Run the command "/install codebase-search" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Codebase Search free?

Yes. Codebase Search is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Codebase Search support?

Codebase Search is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Codebase Search?

It is developed and maintained by Ryne Schultz (@ryno2390); the current version is v1.0.0.
