← Back to Skills Marketplace
yoloyyh

Repo Explainer

by yoloyyh · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
42
Downloads
0
Stars
0
Active Installs
1
Versions
Install in OpenClaw
/install repo-explainer
Description
Decode a GitHub repository **or a local code directory** into a non-developer-friendly report, delivered as BOTH a polished HTML page AND a Markdown document...
README (SKILL.md)

repo-explainer

Mission

Turn a GitHub repo URL into a self-contained HTML report that a non-developer (PM, designer, manager) can read and walk away understanding:

  1. What the project does (in plain language)
  2. Who it is for and what problem it solves
  3. How it is built (tech stack + architecture)
  4. What the key modules are and how they interact
  5. What the main user flow looks like
  6. What external dependencies it relies on

Hard rule: every claim must be grounded in actual source code, not just the README. Cite path:line for each non-trivial conclusion.

Inputs

The skill accepts either of two source forms:

  1. GitHub URLhttps://github.com/\x3Cowner>/\x3Crepo> (with optional /tree/\x3Cbranch> suffix). The repo is shallow-cloned into ./workspace/\x3Cowner>__\x3Crepo>/.
  2. Local directory path — absolute, relative, or ~-prefixed. Analyzed in-place; nothing is cloned or copied. If the directory is a git checkout whose origin (or first available remote) points at github.com, the owner / repo / branch / commit SHA are auto-recovered so path:line evidence still renders as clickable permalinks. If not, evidence renders as monospace text and the report still works.

Optional:

  • --subpath — focus on a sub-directory (for monorepos), e.g. packages/core. Applies to both modes.
  • Audience hint — pm | designer | engineer (default pm).

If the user gives a non-GitHub URL (and not a local path), refuse politely and point at the scope limitation.

Workflow

Execute the following 6 stages in order. Each stage has a dedicated script under scripts/. Do NOT skip stages.

Stage 1 — Fetch

# Remote mode — shallow-clone a public GitHub repo
python3 scripts/fetch_repo.py \x3Cgithub_url> [--subpath \x3Cpath>]

# Local mode — analyze a directory in-place (auto-detected when the
# argument is an existing path; force with --local for ambiguous cases)
python3 scripts/fetch_repo.py \x3Clocal_dir> [--subpath \x3Cpath>] [--local]

The script auto-detects the mode:

  • Argument starts with http:// / https:// and contains github.comremote mode (clone)
  • Argument is an existing local directory → local mode (in-place)
  • Otherwise → fall back to URL parser (which will refuse non-GitHub URLs)

Remote mode behavior:

  • Shallow-clone (--depth=1) to ./workspace/\x3Cowner>__\x3Crepo>/
  • Hard cap repo size at 500 MB; if exceeded, abort and ask the user to pick a sub-path
  • Capture default branch, latest commit SHA, repo description, star count, primary language (via GitHub REST API, no auth needed for public repos but works with GH_TOKEN if set)

Local mode behavior:

  • Resolve the path (absolute / relative / ~-prefixed); reject if not a directory
  • Apply the same 500 MB size cap (excluding .git, node_modules, venv, __pycache__, dist, build, target)
  • Try git -C \x3Cpath> for: HEAD SHA, current branch, origin remote. If the remote URL matches github.com/\x3Cowner>/\x3Crepo>(.git)?, owner / repo / branch / commit_sha are populated so downstream path:line evidence becomes clickable permalinks identical to remote mode.
  • If the directory is not a git repo, all four GitHub coordinates are left null — the renderers degrade evidence to monospace text and the hero shows a 📁 \x3Clocal-path> badge instead of a repo link.

The output meta.json schema is identical across both modes; only the mode field ("remote" / "local") and the optional null-ness of owner / repo / branch / commit_sha distinguish them.

Stage 2 — Scan

python3 scripts/scan_structure.py ./workspace/\x3Cowner>__\x3Crepo>/

Produces scan.json containing:

  • Directory tree (depth ≤ 4, files counted per dir)
  • Language breakdown (lines of code per language, via simple extension map)
  • Manifest files detected (package.json, pyproject.toml, go.mod, pom.xml, Cargo.toml, Gemfile, composer.json, requirements.txt, Dockerfile, docker-compose.yml, CI YAMLs)
  • Parsed dependencies from each manifest
  • Entry-point candidates (main.*, index.*, app.*, cmd/, bin/, cli.*, plus framework-specific routes like routes/, pages/, controllers/)
  • README excerpts (first 300 lines)

Stage 3 — Analyze

python3 scripts/analyze_modules.py ./workspace/\x3Cowner>__\x3Crepo>/ scan.json

Produces analysis.json containing:

  • Top-N modules: ranked by (a) inbound import count, (b) LOC, (c) presence in entry-point graph
  • For each top module: file paths, exported symbols (best-effort regex per language), one-line "what it does" guess from docstring/comment
  • API surface: HTTP routes, CLI commands, exported public functions
  • Data models: classes/structs that look like entities (heuristic: fields-only, named like nouns)
  • External calls: detected SDK usage (e.g. boto3, openai, redis, kafka, axios, fetch) with path:line evidence

Stage 4 — Synthesize

python3 scripts/synthesize_summary.py analysis.json scan.json > summary.json

Calls the LLM (via Mira's built-in capability — the orchestrating agent performs this step in-context, the script just prepares a structured prompt payload). The agent must produce JSON conforming to templates/summary.schema.json:

{
  "project_name": "...",
  "one_liner": "≤30 字白话定位",
  "who_for": "目标用户群体",
  "problem_solved": "...",
  "core_value": "...",
  "tech_stack": [{"layer": "前端|后端|数据|基础设施", "items": ["..."]}],
  "key_models": [{"name": "PolicyAST", "path": "src/core/types", "purpose": "...", "key_fields": "id, rules[], snapshot_id", "evidence": "path:line"}],
  "key_modules": [{"name": "...", "path": "...", "responsibility": "...", "principle": "实现原理一句话", "subflow": ["步骤1","步骤2"], "evidence_list": ["path:line", "path:line"]}],
  "limitations": [{"title": "...", "detail": "...", "category": "performance|security|scalability|usability|compatibility|documentation|test_coverage|architecture|other", "severity": "low|medium|high", "evidence": "path:line 或 issue 链接,可空"}],
  "main_flow": [{"step": 1, "actor": "用户|系统", "action": "...", "evidence": "path:line"}],
  "api_surface": [{"type": "HTTP|CLI|SDK", "signature": "...", "purpose": "...", "evidence": "path:line"}],
  "dependencies": [{"name": "...", "purpose": "...", "criticality": "核心|辅助"}],
  "highlights": ["特色 1", "特色 2", "..."],
  "architecture_mermaid": "graph TD\
  ...",
  "sequence_mermaid": "sequenceDiagram\
  ...",
  "glossary": [{"term": "...", "plain": "白话解释"}]
}

Rules for the agent during synthesis:

  • Every evidence field MUST be a real path:line from the scanned repo
  • Mermaid node labels MUST use business language ("登录请求"), NOT class names ("AuthController")
  • Mermaid safety (HARD RULE — violating this breaks rendering on Feishu / GitHub / Notion / Mermaid Live). Both architecture_mermaid and sequence_mermaid MUST follow:
    • No HTML-confusable chars in node labels or sequence message text: forbid \x3C > " ' (use full-width 「」 or remove); replace path/file.ts style with path file.ts or file 文件.
    • No code-fence-confusable chars: forbid backticks , pipes |` inside labels (Markdown table parser eats them).
    • No rect rgb(...) colour blocks inside sequenceDiagram — Feishu / Notion Mermaid renderers commonly fail on them. Use Note over X,Y: 阶段 N 描述 to mark phases instead.
    • Quote labels with non-ASCII: in graph TD, wrap labels containing Chinese / spaces in double quotes — A["登录请求"], not A[登录请求]. In sequenceDiagram, use participant ALIAS as 显示名 (NO quotes).
    • Replace these symbols inside labels / messages: \x3C > → 全角 〈 〉 or remove; &; em-dash → space; slash / inside message text → space (slash in participant alias is fine).
    • One statement per line, no inline ;. Indent body by 2-4 spaces.
    • Self-check before emitting: paste your mermaid into https://mermaid.live mentally — if any line contains the forbidden chars above, rewrite it.
  • glossary MUST contain every technical term that appears in one_liner / core_value / highlights
  • key_models: extract core domain/data structs/classes/protobuf/schemas (e.g. Alert, PolicyAST, Snapshot, User). These are the nouns the project moves around. Render BEFORE key_modules so readers learn the "what" before the "how". Keep 5-12 entries.
  • key_modules: keep principle (核心原理 1-2 句) and subflow (3-6 步关键流程) for each module — readers should leave understanding not just what each module does but HOW it works.
  • limitations: MUST include 4-8 entries. Sources allowed: (a) TODO/FIXME/XXX/HACK comments grep'd from source, (b) open issues / known issues in README, (c) obvious architectural gaps (e.g. no rate limiting on public API, single-node only, no auth on /metrics, hardcoded secrets in tests, missing test coverage, deprecated dependency). Be honest and specific — generic items like "需要更多测试" are forbidden. If evidence cannot be cited, leave evidence empty rather than fake.
  • If a field cannot be confidently filled, leave empty string — never fabricate

Stage 5 — Render (BOTH formats, in parallel)

The pipeline MUST produce both an HTML report and a Markdown report from the same summary.json — never just one. Markdown is for embedding in Feishu / Notion / GitHub READMEs / chat; HTML is for sharing / printing / standalone viewing.

# 5a · HTML (themed, single-file, interactive)
python3 scripts/render_html.py summary.json \
    --meta_json meta.json \
    --file_count "$(jq '.structure.total_files' scan.json)" \
    --theme midnight \
    > report.html

# 5b · Markdown (Mermaid-native, GitHub/Feishu-friendly)
python3 scripts/render_markdown.py summary.json \
    --meta_json meta.json \
    --file_count "$(jq '.structure.total_files' scan.json)" \
    > report.md

Both renderers share the same section order and the same evidence-link contract:

  • Sections (identical in both formats): Hero (one_liner + project_name + repo link) → TL;DR (who_for / problem / value) → Tech Stack table → Architecture diagram (Mermaid) → Key Models table → Key Modules detail cards → Main Flow (sequence diagram) → API Surface table → Dependencies table → Highlights → Limitations / 待优化点 → Glossary
  • Every evidence: path:line rendered as a clickable link to https://github.com/\x3Cowner>/\x3Crepo>/blob/\x3Csha>/\x3Cpath>#L\x3Cline>
  • HTML extras: 3 themes (midnight / light / terminal), @media print rules, drop-shadow + auto-classified subgraph colors + legend, localStorage theme persistence
  • Markdown extras: ```mermaid fenced blocks (rendered natively by GitHub / GitLab / Obsidian / Typora / Bear / VS Code preview), shields.io badges in the hero, severity emoji (🔴 high / 🟠 medium / 🟢 low) in the limitations section

Stage 6 — Deliver

  • Upload both report.html AND report.md via mcp__runtime__upload_file — never just one
  • Return to user: ① uploaded HTML URL ② uploaded Markdown URL ③ brief 3-line summary (project name, one-liner, top-3 highlights) ④ note that they can click any path:line link to verify, and that the Markdown version can be pasted directly into Feishu / Notion / GitHub issues

Failure Modes

Symptom Action
404 from GitHub API Tell user repo is private/non-existent; stop
Clone >500 MB Ask user to pick a sub-path; offer top-3 sub-dirs by size
Zero recognized manifests AND zero code files Report "not a code repo" and stop
LLM synthesis returns invalid JSON Retry once with schema reminder; if still fails, fall back to a degraded report (skip the broken section, mark it [未能可靠生成])
Mermaid render error in HTML The HTML uses Mermaid 10+; on parse error the diagram block shows the raw text instead — DO NOT block the rest of the report. The Markdown report is unaffected since GitHub/Feishu render Mermaid independently.
Markdown render error The Markdown renderer is pure Python with no runtime deps and emits raw fenced ```mermaid blocks — only fails on invalid summary.json. If render_markdown.py fails, still deliver the HTML; report markdown error to the user.

Non-Goals (explicit)

  • Not a security auditor
  • Not a code review tool
  • Not a runner/installer
  • Not a comparison-with-other-projects tool (use competitive-analysis)
  • Not a deep call-graph tracer (out of MVP scope)

Examples

Trigger — GitHub URL:

帮我解读一下 https://github.com/tiangolo/fastapi 这个项目

Trigger — local directory:

帮我分析下 ~/code/my-side-project 这个目录在做什么

explain this folder /Users/me/work/checkout-service

Not a trigger (private/internal GitLab, out of scope):

帮我分析下 https://code.byted.org/some/private/repo

Not a trigger (only wants to run, not understand):

这个 github 项目怎么跑起来

Usage Guidance
Install only if you are comfortable with the agent reading the repository or local folder you ask it to analyze. Avoid pointing it at sensitive private code unless your agent/LLM environment is approved for that data, and do not store important manual work inside the generated ./workspace/<owner>__<repo>/ clone folder because reruns may replace it.
Capability Tags
crypto
Capability Assessment
Purpose & Capability
The scripts and instructions match the stated purpose: clone public GitHub repos or inspect a user-selected local directory, scan code structure, use an LLM synthesis step, and render HTML plus Markdown reports.
Instruction Scope
Scope is mostly clear and user-directed, but local-directory analysis naturally reads source files and git metadata, and Stage 4 uses the agent's LLM context for source-derived summaries.
Install Mechanism
Installation is a standard skill package with markdown, Python scripts, and JSON templates; there is no auto-starting service, package installer, or hidden install-time execution.
Credentials
Network access to GitHub for repo metadata/cloning and CDN/font requests from generated HTML are proportionate to the report workflow, but users in restricted environments should be aware of them.
Persistence & Privilege
No background persistence or privilege escalation is present; the HTML stores only a theme preference in localStorage, while remote-mode cloning can delete an existing deterministic workspace folder before recloning.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install repo-explainer
  3. After installation, invoke the skill by name or use /repo-explainer
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
repo-explainer 1.0.0 - Initial release of repo-explainer, generating polished HTML and Markdown reports for GitHub repositories or local code directories. - Fully analyzes actual source code (not just README), producing project overview, tech stack table, architecture diagrams, key modules, user flows, API/CLI surface, third-party dependency table, and plain-language summaries. - Provides permalinks or path:line references for every claim, supporting both GitHub repos and non-git local directories. - Supports monorepos via --subpath, large local projects, and work-in-progress folders. - Automatically tailors reports to non-developer audiences (PM, designer, manager).
Metadata
Slug repo-explainer
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is Repo Explainer?

Decode a GitHub repository **or a local code directory** into a non-developer-friendly report, delivered as BOTH a polished HTML page AND a Markdown document... It is an AI Agent Skill for Claude Code / OpenClaw, with 42 downloads so far.

How do I install Repo Explainer?

Run "/install repo-explainer" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Repo Explainer free?

Yes, Repo Explainer is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Repo Explainer support?

Repo Explainer is cross-platform and runs anywhere OpenClaw / Claude Code is available (cross-platform).

Who created Repo Explainer?

It is built and maintained by yoloyyh (@yoloyyh); the current version is v1.0.0.

💬 Comments