ryno2390

Codebase Search

by Ryne Schultz · GitHub ↗ · v1.0.0 · MIT-0
cross-platform · ⚠ suspicious · 124 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install codebase-search
Description
Build a persistent semantic vector index over a Python codebase and search it with natural language. Use when an agent needs to find relevant classes, functi...
README (SKILL.md)

Codebase Search

Builds a persistent ChromaDB vector index over Python source files and enables semantic search with natural language queries.

Quick Start

1. Install the scripts

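Copy scripts/code_chunker.py and scripts/code_index.py into your project. Their only third-party dependency is chromadb (install with pip install chromadb). Note that code_index.py imports CodeChunker from prsm.compute.nwtn.corpus.code_chunker, so after copying the files you will likely need to change that import to a local one (for example, import code_chunker) or place the files in a matching package layout.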

2. Build the index

import asyncio
from code_index import CodebaseIndex

index = CodebaseIndex(repo_root="/path/to/repo")
count = asyncio.run(index.build())
print(f"Indexed {count} symbols")

The index persists to {repo_root}/.codebase_index/ and survives restarts. Subsequent calls to build() are fast — only new/changed files are indexed.
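The README does not say how build() decides which files are new or changed. One common approach, shown here only as an illustrative sketch and not as the skill's actual implementation, is to keep a manifest of content hashes and re-index only the files whose hash differs. The helper names and the manifest file are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(repo_root: Path, manifest_path: Path) -> list[Path]:
    """Return .py files whose content differs from the saved manifest, then update it.

    Hypothetical helper: the manifest maps repo-relative paths to content hashes.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for path in sorted(repo_root.rglob("*.py")):
        rel = path.relative_to(repo_root).as_posix()
        new[rel] = file_digest(path)
        if old.get(rel) != new[rel]:
            changed.append(path)
    # Persist the fresh hashes so the next call only reports later edits.
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

A real indexer would also need to drop entries for deleted files; this sketch only detects additions and edits.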

3. Search

results = asyncio.run(index.search("token payment handling", top_k=5))
for r in results:
    print(f"[{r.score:.2f}] {r.symbol_name} ({r.symbol_type}) — {r.filepath}:{r.start_line}")

Convenience API (when integrated into a project)

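If code_chunker.py and code_index.py live in a package inside your project, use the singleton helper. The import path below reflects the author's own prsm package layout; adjust it to match yours.

from prsm.compute.nwtn.corpus import search_codebase

# search_codebase is awaited, so call it inside an async function,
# or wrap the call in asyncio.run() from synchronous code.
results = await search_codebase("circuit breaker", top_k=3)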

Key Options

| Parameter | Default | Description |
| --- | --- | --- |
| top_k | 5 | Number of results to return |
| symbol_type | None | Filter to "class" or "function" |
| force_rebuild | False | Wipe and rebuild entire index |
| exclude_patterns | see below | Directories to skip |

Default excludes: __pycache__, .venv, migrations, tests, scripts, .git, node_modules, .codebase_index
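To make the effect of exclude_patterns concrete, here is a minimal sketch of a directory walk that skips the default excludes. It assumes the patterns are matched against exact directory names; the README does not say whether the skill uses exact names, substrings, or globs, and iter_python_files is a hypothetical name.

```python
import os
from pathlib import Path

# Default excludes as listed in the README.
DEFAULT_EXCLUDES = {
    "__pycache__", ".venv", "migrations", "tests",
    "scripts", ".git", "node_modules", ".codebase_index",
}


def iter_python_files(repo_root, exclude_patterns=DEFAULT_EXCLUDES):
    """Yield .py files under repo_root, skipping excluded directory names."""
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in exclude_patterns)
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield Path(dirpath) / name
```

Because tests and scripts are excluded by default, code in those directories will not show up in search results unless you pass a narrower exclude set.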

How It Works

  1. Chunking: CodeChunker uses Python's ast module to extract every top-level class and function from each .py file. It captures each symbol's name, type, docstring, line numbers, and source.
  2. Indexing: ChromaDB stores each chunk as a document of the form "{symbol_name}: {docstring or first 300 chars of source}". It uses ChromaDB's default embedding function (no API key needed).
  3. Search: A cosine-similarity query returns ranked SearchResult objects with filepath, symbol name, line numbers, docstring, and relevance score.
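The chunking and document-building steps above can be sketched with the standard library alone. This is an approximation written from the description, not the skill's CodeChunker: the Chunk class and chunk_source function are hypothetical names, and treating async functions as "function" is an assumption.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical stand-in for the skill's chunk record."""
    symbol_name: str
    symbol_type: str  # "class" or "function"
    docstring: str
    start_line: int
    end_line: int
    source: str

    def document(self) -> str:
        # Mirrors the format quoted in the README:
        # "{symbol_name}: {docstring or first 300 chars of source}"
        return f"{self.symbol_name}: {self.docstring or self.source[:300]}"


def chunk_source(source: str) -> list[Chunk]:
    """Extract top-level classes and functions from Python source text."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:  # top-level statements only, so nested symbols are skipped
        if isinstance(node, ast.ClassDef):
            kind = "class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        else:
            continue
        chunks.append(Chunk(
            symbol_name=node.name,
            symbol_type=kind,
            docstring=ast.get_docstring(node) or "",
            start_line=node.lineno,
            end_line=node.end_lineno,
            source=ast.get_source_segment(source, node) or "",
        ))
    return chunks
```

Since only tree.body is inspected, methods are not indexed as separate symbols; they are reachable only through the class that contains them, which matches the "top-level" wording above.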

.gitignore

Always add .codebase_index/ to .gitignore — it's a local artifact, not source code.

Reference

See references/integration.md for integration patterns, including how to wire semantic search into sub-agent delegation prompts.
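The contents of references/integration.md are not reproduced here, but one plausible way to feed search results into a delegation prompt is sketched below. The SearchResult class is a stand-in carrying only the fields the README mentions, build_delegation_prompt is a hypothetical helper, and the sketch assumes a higher score means a more relevant hit.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Stand-in with the fields the README mentions; the real class may differ."""
    filepath: str
    symbol_name: str
    symbol_type: str
    start_line: int
    docstring: str
    score: float


def build_delegation_prompt(task: str, results: list[SearchResult]) -> str:
    """Prepend ranked search hits to a task description for a sub-agent."""
    lines = [f"Task: {task}", "", "Relevant code (most similar first):"]
    for r in sorted(results, key=lambda r: r.score, reverse=True):
        # Use only the first docstring line to keep the prompt short.
        doc = r.docstring.strip()
        summary = doc.splitlines()[0] if doc else "(no docstring)"
        lines.append(
            f"- {r.symbol_name} ({r.symbol_type}) at {r.filepath}:{r.start_line} "
            f"[score {r.score:.2f}]: {summary}"
        )
    return "\n".join(lines)
```

Passing file paths and line numbers rather than full source keeps the prompt small and lets the sub-agent open only the files it needs.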

Usage Guidance
This skill appears to do what it says (build a local ChromaDB-based semantic index of Python code), but there are practical inconsistencies you should be aware of before installing or copying files:

- Import/path mismatch: code_index.py uses imports like from prsm.compute.nwtn.corpus.code_chunker import CodeChunker and the SKILL.md shows convenience imports under prsm.compute.nwtn.corpus. The provided files live under scripts/. If you copy the scripts into a project as-is, those imports will likely fail. Either run the scripts from a package layout matching prsm.* or edit imports to use relative/local module names (e.g., import code_chunker).
- Dependencies: you must install chromadb (and its runtime dependencies such as onnxruntime/tokenizers) as noted. Verify those packages are acceptable in your environment (they can be heavy and may require native wheels).
- Local indexing and privacy: the index is persisted to {repo_root}/.codebase_index and will contain snippets and docstrings from your code. Add that directory to .gitignore (as SKILL.md suggests) and consider whether you want it included in backups or shared storage.
- Behavior is local-only: there are no network calls or secret exfiltration in the code, but the indexer will read all .py files not excluded by default. Review the exclude list and adjust if you need to skip sensitive directories.

Recommended next steps: test in a disposable/sandbox repo first, fix the import paths or package layout before using in production, and confirm chromadb works in your environment. If the package is to be used by an agent, ensure the agent's runtime has chromadb and the correct module path available.
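One way to work around the import mismatch without committing to a single layout is a fallback import that tries the package path first and the local module second. This is a sketch of that idea; import_first_available is a hypothetical helper, not part of the skill.

```python
import importlib


def import_first_available(*module_names):
    """Return the first module in module_names that imports successfully."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append(f"{name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# In code_index.py this could replace the hard-coded package import:
# chunker = import_first_available(
#     "prsm.compute.nwtn.corpus.code_chunker",  # original package layout
#     "code_chunker",                           # file copied next to code_index.py
# )
# CodeChunker = chunker.CodeChunker
```

If you control the copied files, simply editing the import is the simpler fix; the fallback is mainly useful when the same file must work in both layouts.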
Capability Analysis
Type: OpenClaw Skill
Name: codebase-search
Version: 1.0.0

The skill provides a legitimate semantic search tool for Python codebases using ChromaDB and Python's AST module for safe code parsing. The logic is transparent and aligns with the stated purpose of indexing and searching functions and classes. While there is a functional bug in `scripts/code_index.py` (a hardcoded internal import path `prsm.compute.nwtn.corpus.code_chunker` that would fail if the files are moved as suggested in the documentation), there is no evidence of malicious intent, data exfiltration, or prompt injection.
Capability Assessment
Purpose & Capability
The name/description (persistent semantic vector index over a Python codebase) matches the code and instructions: the scripts chunk .py files and persist vectors to a local .codebase_index using ChromaDB. However, code_index.py and SKILL.md reference a package namespace (prsm.compute.nwtn.corpus.* and a convenience import path) that does not match the provided file layout (scripts/*.py). That mismatch means the scripts will likely fail if simply copied into a project as the Quick Start suggests.
Instruction Scope
Runtime instructions and the code stay within the stated purpose: they scan Python files (with explicit exclude patterns), extract top-level classes/functions via ast, store docstrings/snippets, and run local ChromaDB queries. The skill does read the repository tree and files (expected for this purpose) and persists an index to {repo_root}/.codebase_index/; it also instructs adding that dir to .gitignore.
Install Mechanism
There is no install spec (instruction-only), which is low-risk, but the SKILL.md requires pip install chromadb and mentions onnxruntime/tokenizers. The code imports chromadb at runtime. The documentation omits explicit steps for making the provided scripts available under the prsm.* package namespace, creating a likely friction/operational issue. No remote downloads or obscure URLs are used.
Credentials
The skill requests no environment variables, credentials, or config paths. It only needs local filesystem access to the repository to index and to write the .codebase_index persistence directory — this is proportional to its stated purpose.
Persistence & Privilege
The skill persists an index under the repo (.codebase_index) and asks users to .gitignore it. It does not request always:true and does not modify other skills or system-wide settings. The persistence is limited to its own directory.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install codebase-search
  3. After installation, invoke the skill by name or use /codebase-search
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
- Initial release of codebase-search: build and persist a semantic vector index over a Python codebase for natural language search.
- Enables retrieval of relevant classes, functions, or modules by meaning, not just exact name matches.
- Fast incremental indexing with support for filtering by symbol type, customizing exclusions, and easy integration into agentic workflows.
- Designed for Python projects; not intended for non-Python codebases or exact-string/grep searches.
Metadata
Slug: codebase-search
Version: 1.0.0
License: MIT-0
All-time Installs: 0
Active Installs: 0
Total Versions: 1
Frequently Asked Questions

What is Codebase Search?

Build a persistent semantic vector index over a Python codebase and search it with natural language. Use when an agent needs to find relevant classes, functi... It is an AI Agent Skill for Claude Code / OpenClaw, with 124 downloads so far.

How do I install Codebase Search?

Run "/install codebase-search" in the OpenClaw or Claude Code chat. Per the README, the scripts also need chromadb at runtime (pip install chromadb).

Is Codebase Search free?

Yes, Codebase Search is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Codebase Search support?

Codebase Search is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Codebase Search?

It is built and maintained by Ryne Schultz (@ryno2390); the current version is v1.0.0.
