Description

AI-powered codebase analysis — generate architecture docs, onboarding guides, and key-flow walkthroughs for any project. Use when joining a new codebase, onb...

README (SKILL.md)

Codebase Onboarder

Name: Codebase Onboarder
Author: charlie-morrison

Analyze any codebase and produce a structured onboarding guide. Covers architecture, key flows, patterns, dependencies, entry points, and gotchas — the things that take weeks to figure out by reading code.

Use when: someone says "help me understand this codebase", "onboard me", "document this project", or "what does this repo do".

Analysis Steps

Run these in order. Each step informs the next.

1. Project Identity

# What is this?
cat README.md 2>/dev/null || cat readme.md 2>/dev/null
cat package.json 2>/dev/null | jq '{name, description, scripts}'
cat pyproject.toml 2>/dev/null | head -30
cat Cargo.toml 2>/dev/null | head -20
cat go.mod 2>/dev/null | head -10
cat Makefile 2>/dev/null | head -40

Determine: language, framework, purpose, build system.
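
The "determine" step can be roughed out as a small helper. This is a minimal sketch, assuming that the presence of a manifest file is a good-enough signal of the ecosystem; the function name `detect_stack` and the labels it prints are hypothetical and not part of the skill.

```shell
# Hypothetical helper: print one line per manifest file found in a directory.
# Assumption: a manifest's presence implies the ecosystem named in its label.
detect_stack() {
  dir="${1:-.}"
  for pair in package.json:node pyproject.toml:python Cargo.toml:rust go.mod:go Makefile:make; do
    file="${pair%%:*}"    # text before the colon
    label="${pair##*:}"   # text after the colon
    if [ -f "$dir/$file" ]; then
      echo "$label ($file)"
    fi
  done
}
```

Calling `detect_stack .` in a repository root lists every ecosystem whose manifest is present, so a polyglot repo produces several lines.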

2. Project Structure

# Directory tree (depth 3, ignore noise)
find . -maxdepth 3 -type d \
  -not -path '*/node_modules/*' \
  -not -path '*/.git/*' \
  -not -path '*/vendor/*' \
  -not -path '*/__pycache__/*' \
  -not -path '*/dist/*' \
  -not -path '*/build/*' \
  -not -path '*/.next/*' \
  -not -path '*/target/*' \
  | head -80

# Count files by extension (files without an extension are skipped)
find . -type f -not -path '*/node_modules/*' -not -path '*/.git/*' \
  | sed 's|.*/||' | grep '\.' | sed 's/.*\.//' | sort | uniq -c | sort -rn | head -15

Map the architecture: where does business logic live, where are configs, where are tests, what's the convention.
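
One rough way to see where most of the code lives is to count files under each top-level directory. The sketch below assumes that file count is a usable proxy for "where the logic is"; `files_per_top_dir` is a hypothetical name, and the exclusion list is only an example.

```shell
# Hypothetical helper: count files under each top-level directory,
# largest first. Files sitting directly in the root are ignored.
files_per_top_dir() {
  root="${1:-.}"
  ( cd "$root" && find . -mindepth 2 -type f \
      -not -path './node_modules/*' -not -path './.git/*' \
    | cut -d/ -f2 | sort | uniq -c | sort -rn )
}
```

Running `files_per_top_dir .` prints lines of the form `count directory`, which gives a first guess at which directories deserve a closer read.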

3. Entry Points

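# Web apps (a quoted brace list is not expanded by grep, so each extension gets its own --include)
grep -rl "listen\|createServer\|app\.run\|uvicorn\|Flask(__name__)" \
  --include='*.js' --include='*.ts' --include='*.py' --include='*.go' --include='*.rb' \
  . 2>/dev/null | head -10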

# CLI tools
grep -rl "if __name__\|func main\|fn main\|bin.*:" \
  --include='*.py' --include='*.go' --include='*.rs' --include='*.json' \
  . 2>/dev/null | head -10

# Config-declared entry points
cat package.json 2>/dev/null | jq '.main, .bin, .scripts.start, .scripts.dev'
cat pyproject.toml 2>/dev/null | grep -A5 'scripts\|entry_points'

Identify: where does execution start, what are the main scripts/commands, how do you run it locally.
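
Because grep's `--include` takes one glob per flag and does not expand brace lists itself, searches across several languages get verbose. A minimal sketch of a wrapper that builds the flags from a comma-separated extension list follows; `grep_exts` and its argument order are hypothetical.

```shell
# Hypothetical helper: grep_exts PATTERN EXT1,EXT2,... [DIR]
# Lists files under DIR whose extension is in the list and that match PATTERN.
grep_exts() {
  pattern="$1"
  exts="$2"
  dir="${3:-.}"
  set --   # reuse "$@" to collect the --include flags
  for ext in $(printf '%s' "$exts" | tr ',' ' '); do
    set -- "$@" "--include=*.$ext"
  done
  grep -rl "$@" -e "$pattern" "$dir" 2>/dev/null
}
```

For example, `grep_exts 'func main\|fn main' go,rs .` would list Go and Rust files that define a `main` function.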

4. Dependencies & Stack

# Key dependencies (not all — just the important ones)
cat package.json 2>/dev/null | jq '.dependencies | keys' | head -20
cat requirements.txt 2>/dev/null | head -20
cat go.mod 2>/dev/null | grep -v '//' | tail -20
cat Cargo.toml 2>/dev/null | grep -A50 '\[dependencies\]' | head -30

Identify: database (postgres, mongo, redis), framework (express, fastapi, gin), ORM, auth, queue, cloud SDKs. These define the project's personality.

5. Data Layer

# Database schemas, migrations, models
find . -type f \( -name "*.sql" -o -name "*migration*" -o -name "*schema*" -o -name "*model*" \) \
  -not -path '*/node_modules/*' 2>/dev/null | head -20

# ORM models
grep -rl "class.*Model\|@Entity\|schema\.\|CREATE TABLE\|db\.Column" \
  --include='*.py' --include='*.ts' --include='*.js' --include='*.go' --include='*.rb' --include='*.java' \
  . 2>/dev/null | head -10

Map: what are the core data entities, how are they related, where do migrations live.
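
When migrations are plain SQL, the core entities can often be read straight from the `CREATE TABLE` statements. The sketch below assumes `.sql` files with conventional, unquoted or simply quoted table names; `sql_tables` is a hypothetical name, and ORM-only projects will need a different approach.

```shell
# Hypothetical helper: list table names declared in *.sql files under a directory.
# Assumption: statements look like "CREATE TABLE [IF NOT EXISTS] name".
sql_tables() {
  dir="${1:-.}"
  grep -rhoiE --include='*.sql' \
    'create table (if not exists )?[A-Za-z0-9_"`.]+' "$dir" 2>/dev/null \
    | awk '{print $NF}' | tr -d '"`' | sort -u
}
```

The output is a sorted, de-duplicated list of table names, which is a reasonable starting point for sketching entity relationships by hand.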

6. API Surface

# REST routes
grep -rn "app\.\(get\|post\|put\|delete\|patch\)\|@app\.route\|router\.\(get\|post\)\|@Get\|@Post\|@Controller" \
  --include='*.ts' --include='*.js' --include='*.py' --include='*.go' --include='*.rb' --include='*.java' \
  . 2>/dev/null | head -30

# GraphQL
find . \( -name "*.graphql" -o -name "*.gql" \) -not -path '*/node_modules/*' 2>/dev/null | head -10
grep -rl "type Query\|type Mutation\|@Query\|@Mutation" \
  --include='*.ts' --include='*.js' --include='*.py' --include='*.go' \
  . 2>/dev/null | head -10

List the key endpoints/operations, grouped by domain.
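
To group endpoints by domain, it can help to first see which files define the most routes. This is a minimal sketch that only recognizes Express-style calls such as `app.get(` or `router.post(` in JavaScript and TypeScript files; `routes_per_file` is a hypothetical name, and paths containing a colon would confuse the sort.

```shell
# Hypothetical helper: count Express-style route definitions per file,
# most routes first. Files with no routes are omitted.
routes_per_file() {
  dir="${1:-.}"
  grep -rcE --include='*.js' --include='*.ts' \
    '(app|router)\.(get|post|put|delete|patch)\(' "$dir" 2>/dev/null \
    | grep -v ':0$' | sort -t: -k2 -rn
}
```

Each output line has the form `path:count`; files that cluster under the same directory usually correspond to one domain.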

7. Config & Environment

# Environment variables
cat .env.example 2>/dev/null || cat .env.sample 2>/dev/null || cat .env.template 2>/dev/null
grep -rh "process\.env\.\|os\.environ\|os\.getenv\|env::\|std::env" \
  --include='*.ts' --include='*.js' --include='*.py' --include='*.go' --include='*.rs' --include='*.rb' \
  . 2>/dev/null | sort -u | head -30

Document: what env vars are needed, which are secrets, what services need to be running.
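
The raw grep output is easier to document once it is reduced to variable names. The sketch below assumes upper-case names accessed as `process.env.NAME`, `os.getenv("NAME")`, or `os.environ["NAME"]`; both function names are hypothetical, and the secret detection is a naming heuristic that will miss values such as connection strings with embedded passwords.

```shell
# Hypothetical helper: print the unique environment variable names
# referenced in JS/TS/Python sources under a directory.
env_var_names() {
  dir="${1:-.}"
  grep -rhoE --include='*.js' --include='*.ts' --include='*.py' \
    '(process\.env\.|os\.getenv\(.|os\.environ\[.)[A-Z_][A-Z0-9_]*' "$dir" 2>/dev/null \
    | sed -E 's/.*[^A-Z0-9_]//' | sort -u
}

# Hypothetical filter: keep names that look like secrets (heuristic only).
likely_secrets() {
  grep -E 'SECRET|TOKEN|PASSWORD|KEY' || true
}
```

`env_var_names . | likely_secrets` gives a short list of names to mark as sensitive in the onboarding guide; everything else from `env_var_names .` can be documented as plain configuration.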

8. Testing

# Test structure
find . -type f \( -name "*test*" -o -name "*spec*" -o -name "*_test.*" \) \
  -not -path '*/node_modules/*' 2>/dev/null | head -20

# How to run tests
cat package.json 2>/dev/null | jq '.scripts.test'
grep -r "pytest\|jest\|mocha\|vitest\|go test\|cargo test" Makefile* 2>/dev/null

9. CI/CD & Deployment

ls -la .github/workflows/ 2>/dev/null
ls -la .gitlab-ci.yml 2>/dev/null
cat Dockerfile 2>/dev/null | head -20
cat docker-compose.yml 2>/dev/null | head -30
ls -la k8s/ kubernetes/ helm/ 2>/dev/null

Output Template

After analysis, produce a document with these sections:

# [Project Name] — Onboarding Guide

## What This Is
One paragraph: what it does, who it's for, what problem it solves.

## Tech Stack
- Language: X
- Framework: X
- Database: X
- Key dependencies: X, Y, Z

## Architecture
Describe the high-level architecture in 3-5 sentences. Include a simple diagram if helpful:
- Monolith / microservices / serverless
- Request flow: client → API → service → database
- Key patterns: MVC, event-driven, CQRS, etc.

## Directory Map
| Path | Purpose |
|------|---------|
| src/api/ | REST endpoints |
| src/services/ | Business logic |
| src/models/ | Database models |
| ... | ... |

## Key Flows
Walk through 2-3 critical user journeys:
1. **User signup** — POST /auth/register → validate → hash password → insert user → send email → return token
2. **Place order** — POST /orders → check inventory → charge payment → create order → notify warehouse

## Getting Started
Step-by-step: clone, install, configure env, seed database, run locally.

## Gotchas
Things that are non-obvious, surprising, or likely to trip someone up:
- "The auth middleware silently returns 200 on missing tokens (legacy behavior)"
- "Tests require a running Redis instance on port 6380 (not default)"
- "The migration in 0042 takes 20 minutes on large datasets"

## Where to Look
| I want to... | Look at... |
|--------------|-----------|
| Add an API endpoint | src/api/routes/ |
| Change the database schema | src/models/ + migrations/ |
| Debug auth issues | src/middleware/auth.ts |
| Understand the build | Makefile + .github/workflows/ |

Tips

Read tests first — they document behavior better than comments
Check git log for the most-changed files — those are the hot paths
Look at recent PRs for coding conventions and review standards
If something is confusing, it's a gotcha worth documenting
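
The git log tip can be turned into a one-function sketch. It assumes the directory is a git repository with history available locally; `hot_files` is a hypothetical name, and renamed files are counted under each name they have had.

```shell
# Hypothetical helper: hot_files [REPO] [N]
# Prints the N files touched by the most commits, as "count path".
hot_files() {
  repo="${1:-.}"
  n="${2:-10}"
  git -C "$repo" log --pretty=format: --name-only \
    | grep -v '^$' | sort | uniq -c | sort -rn | head -n "$n"
}
```

`hot_files . 20` gives a quick shortlist of files worth reading first, since frequent change usually marks the hot paths.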

Usage Guidance

This skill appears to do what it says (scan a repo and produce onboarding docs), but take these precautions before installing or running it:
- Ensure the runtime environment has the common CLI tools the SKILL.md expects (jq, grep, find, sed, head, tail). The metadata does not declare jq, so the commands that use it may fail where jq is missing.
- Be aware that the instructions read environment templates (.env.example, .env.sample, .env.template) and other config files, and grep source files for environment-variable usage. If your repo contains secrets or credentials, run the skill in a safe, isolated copy of the repo (or remove/redact secrets first).
- Review the outputs the agent will produce or transmit. If the agent sends analysis off-host (not stated here, but possible in your agent configuration), it could leak sensitive contents extracted from the repo.
- If you want tighter safety, modify the SKILL.md, run the skill only on sanitized repositories (strip .env, credentials, and private keys), or restrict network egress for the agent that uses this skill.
- If the agent will run these shell commands autonomously, verify that it runs them in the intended working directory and with least privilege. Consider running the analysis locally yourself first to confirm the results and check for any sensitive data exposure.
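
The "isolated copy" precaution can be sketched as a copy that leaves obvious secret files behind. This is only an illustration: `sanitized_copy` is a hypothetical name, the exclusion list (`.env`, `*.pem`, `*.key`, and the `.git` directory) is an assumption that will not cover every credential format, and the destination must sit outside the source tree.

```shell
# Hypothetical helper: sanitized_copy SRC DEST
# Copies SRC into DEST, skipping .git, .env, and common private-key files.
# Example templates such as .env.example are kept.
sanitized_copy() {
  src="$1"
  dest="$2"
  ( cd "$src" && find . -type f \
      -not -path './.git/*' \
      -not -name '.env' -not -name '*.pem' -not -name '*.key' ) |
  while IFS= read -r f; do
    mkdir -p "$dest/$(dirname "$f")"
    cp "$src/$f" "$dest/$f"
  done
}
```

After `sanitized_copy ~/work/repo /tmp/repo-clean`, the analysis can be pointed at the copy; secrets hard-coded in source files are still present and need a separate review.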

Capability Analysis

Type: OpenClaw Skill
Name: codebase-onboarder
Version: 1.0.0

The codebase-onboarder skill is a legitimate tool designed to automate the analysis of a software project for documentation purposes. It uses standard read-only shell commands (cat, grep, find, jq) to identify project structure, dependencies, entry points, and API routes across various languages and frameworks. There are no signs of data exfiltration, malicious execution, or prompt injection; it specifically targets example environment files (.env.example) rather than active secrets, aligning perfectly with its stated goal of generating onboarding guides.

Capability Tags

crypto
can-make-purchases

Capability Assessment

ℹ Purpose & Capability

The name/description (codebase onboarding) aligns with the instructions: everything the SKILL.md does is about scanning a repository and summarizing it. However the runtime commands use utilities (notably jq) while the skill declares no required binaries — a minor mismatch that could cause errors or indicates the author omitted declared requirements.

ℹ Instruction Scope

The SKILL.md explicitly instructs the agent to read many repository files (README, package.json, pyproject.toml, .env.* examples, migrations, Dockerfiles, CI configs, etc.). This is exactly what an onboarder needs, but it also means the agent will access any secrets or credentials stored in the repo (e.g., .env, .env.example, config files). There are no instructions to access system paths outside the repo or to send data to unexpected external endpoints, which is good.

✓ Install Mechanism

No install spec or code is provided (instruction-only), so nothing is written to disk by the skill itself. This minimizes install-time risk. The SKILL.md does assume standard shell utilities are available.

⚠ Credentials

The skill requests no environment variables or credentials, which is proportionate. However, it directs the agent to read environment templates (.env.example, .env.sample, .env.template) and other local configuration files, which may contain secrets, without guidance to redact them or treat them as sensitive. The SKILL.md also uses jq, but the skill metadata does not declare jq as a required binary.

✓ Persistence & Privilege

always:false and no install hooks are present. The skill does not request persistent system-wide changes or privileges beyond reading files in the repository working directory.

Version History

v1.0.0

Initial release: AI-powered codebase analysis with structured onboarding guide output

Metadata

Slug codebase-onboarder

Version 1.0.0

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 1

Frequently Asked Questions

What is Codebase Onboarder?

AI-powered codebase analysis — generate architecture docs, onboarding guides, and key-flow walkthroughs for any project. Use when joining a new codebase, onb... It is an AI Agent Skill for Claude Code / OpenClaw, with 47 downloads so far.

How do I install Codebase Onboarder?

Run "/install codebase-onboarder" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Codebase Onboarder free?

Yes, Codebase Onboarder is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Codebase Onboarder support?

Codebase Onboarder is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Codebase Onboarder?

It is built and maintained by charlie-morrison (@charlie-morrison); the current version is v1.0.0.
