/install convert-github-repository
Core Position
This skill converts GitHub repository content and metadata into target formats (Markdown documentation, JSON metadata, CSV tables, folder tree JSON). It preserves structure and content fidelity across formats — not a simple file copy, but a semantic transformation that respects GitHub's data model (repos, trees, commits, issues, PRs, releases).
Key responsibilities:
- Parse GitHub repository structure (branches, directory tree, files) using the GitHub API or a local .git directory
- Convert repository data models (issues, PRs, releases, contributors) into target format representations
- Preserve file content and encoding (UTF-8, binary detection) during format transformation
- Provide a manifest of what was converted (file count, type breakdown, size summary)
Modes
/convert-github-repository --markdown
Repository → Markdown documentation. Converts the entire repository into a set of Markdown files:
- README.md from repository root
- Each directory becomes a subdirectory with README.md describing its contents
- Issues exported as issues/YYYY-MM-DD-{number}-{title}.md
- PRs exported as pull-requests/YYYY-MM-DD-{number}-{title}.md
- Releases exported as releases/v{semver}.md
- Code files preserved in original language with syntax highlighting fences
/convert-github-repository --json
Repository → JSON metadata. Exports repository structure and content as a JSON file:
{
"repo": { "name": "...", "description": "...", "stars": N, "license": "...", "topics": [...] },
"files": [{ "path": "...", "size": N, "type": "file|dir", "sha": "..." }],
"branches": [{ "name": "...", "last_commit": "..." }],
"contributors": [{ "login": "...", "contributions": N }]
}
/convert-github-repository --csv
Issues + PRs → CSV. Exports issues and pull requests as CSV rows with columns: number, title, state, author, created_at, updated_at, labels, assignees, milestone, body_preview.
/convert-github-repository --tree
Repository → folder tree JSON. Outputs a tree structure representing the repository layout:
{
"path": "/",
"type": "directory",
"children": [
{ "path": "src/index.js", "type": "file", "size": 1234, "language": "JavaScript" },
{ "path": "tests/", "type": "directory", "children": [...] }
]
}
/convert-github-repository --readme-to-json
README.md → structured JSON. Parses a README.md and extracts: title (first H1), description (first paragraph after title), installation steps, usage examples, contributing guidelines, license.
Execution Steps
- Identify repository source
Remote GitHub repo: Parse URL to extract owner and repo:
https://github.com/owner/repo -> {owner: "owner", repo: "repo"}
Use GitHub API with GITHUB_TOKEN from env (os.getenv("GITHUB_TOKEN")).
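The URL-parsing step could be sketched roughly as follows. The function name `parse_github_url` and the handling of a trailing `.git` suffix are illustrative assumptions, not part of the skill's stated behaviour.

```python
from urllib.parse import urlparse


def parse_github_url(url: str) -> dict:
    """Split a GitHub repository URL into its owner and repo parts (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"Not a GitHub URL: {url}")
    # Drop empty segments so a trailing slash does not matter.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"URL does not name a repository: {url}")
    owner, repo = parts[0], parts[1]
    # Assumption: clone-style URLs ending in ".git" should map to the bare repo name.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return {"owner": owner, "repo": repo}


print(parse_github_url("https://github.com/owner/repo"))
# prints {'owner': 'owner', 'repo': 'repo'}
```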
Local repository: Verify .git directory exists. Use git CLI to extract:
- git ls-files for file list
- git log --oneline for recent commits
- git remote get-url origin to confirm repo identity
If neither source is available, report: "Cannot identify repository source — provide either a GitHub URL (https://github.com/owner/repo) or a local path with a .git directory."
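For the local case, a minimal sketch might look like the one below. The helper names (`is_local_repo`, `git_lines`, `describe_local_repo`) and the `max_commits` parameter are hypothetical; only the three git commands come from the text above. A failed git command is treated as an empty result here, which is one possible choice rather than a stated requirement.

```python
import subprocess
from pathlib import Path


def is_local_repo(path: str) -> bool:
    """True when the path contains a .git entry."""
    return (Path(path) / ".git").exists()


def git_lines(repo_path: str, *args: str) -> list:
    """Run a git subcommand inside repo_path; return output lines, or [] if it fails."""
    result = subprocess.run(
        ["git", "-C", repo_path, *args],
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines() if result.returncode == 0 else []


def describe_local_repo(repo_path: str, max_commits: int = 10) -> dict:
    """Collect the file list, recent commits, and origin URL via the git CLI."""
    if not is_local_repo(repo_path):
        raise ValueError(f"Path is not a git repository: {repo_path}")
    return {
        "files": git_lines(repo_path, "ls-files"),
        "recent_commits": git_lines(repo_path, "log", "--oneline", f"-{max_commits}"),
        "origin": git_lines(repo_path, "remote", "get-url", "origin"),
    }
```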
- Fetch repository metadata
For remote repos, call GitHub API:
GET https://api.github.com/repos/{owner}/{repo}
Authorization: Bearer {GITHUB_TOKEN}
Accept: application/vnd.github+json
Response contains: full_name, description, stargazers_count, forks_count, license.spdx_id, topics, default_branch, created_at, updated_at, homepage, language.
If the token is missing or rate-limited (403), try unauthenticated (lower rate limit) or report: "GitHub API unavailable — check GITHUB_TOKEN or try again later (rate limit: 60 req/hr unauthenticated)."
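One way to build the request and surface rate limiting is sketched below using only the standard library. `build_repo_request` and `fetch_repo_metadata` are hypothetical names, and treating 429 the same as 403 is an assumption added here; the text above only mentions 403.

```python
import json
import os
import urllib.error
import urllib.request
from typing import Optional

API_ROOT = "https://api.github.com"


def build_repo_request(owner: str, repo: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build GET /repos/{owner}/{repo}; send credentials only when a token is available."""
    if token is None:
        token = os.getenv("GITHUB_TOKEN")
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # Without a token the request is unauthenticated and subject to the lower rate limit.
    return urllib.request.Request(f"{API_ROOT}/repos/{owner}/{repo}", headers=headers)


def fetch_repo_metadata(owner: str, repo: str) -> dict:
    """Fetch repository metadata and turn rate-limit responses into a readable error."""
    request = build_repo_request(owner, repo)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code in (403, 429):  # 429 handling is an assumption of this sketch
            raise RuntimeError(
                "GitHub API unavailable: check GITHUB_TOKEN or try again later "
                "(rate limit: 60 req/hr unauthenticated)."
            ) from err
        raise
```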
- Fetch repository contents
Get default branch tree:
GET https://api.github.com/repos/{owner}/{repo}/git/trees/{default_branch}?recursive=1
Returns a flat list of all files with path, type (blob/tree), size, sha.
For local repos: Run git ls-tree -r --name-only {branch} to get the file list, then git show {branch}:{path} to read each file's content.
Filter out common non-essential paths:
- node_modules/, .git/, vendor/: skip unless specifically requested
- Binary files (images, PDFs, compiled binaries): include in manifest but not content
- .gitignore-referenced files that are not committed: skip
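The filtering rules above might be expressed as a single function that returns a skip reason, so every exclusion can later be reported. The name `skip_reason`, the reason strings, and the exact extension list are illustrative assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional

SKIPPED_DIRS = {"node_modules", ".git", "vendor"}
# Assumed extension list; a real implementation would likely be longer.
BINARY_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".pdf", ".zip", ".exe", ".so", ".dll"}


def skip_reason(path: str) -> Optional[str]:
    """Return why a path is excluded from content conversion, or None to keep it."""
    pure = PurePosixPath(path)
    # Only the directory components are checked, not the file name itself.
    for part in pure.parts[:-1]:
        if part in SKIPPED_DIRS:
            return f"inside {part}/, excluded by default"
    if pure.suffix.lower() in BINARY_EXTENSIONS:
        return "binary file, listed in manifest without content"
    return None
```

Returning a reason rather than a boolean keeps the filter and the manifest in step: a path is never dropped without an explanation attached.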
- Convert to target format
--markdown conversion:
- README.md: Preserve as-is, update relative links to point to local files
- Code files: Write with syntax-highlighting fence (```language) based on file extension mapping:
  .js → ```javascript, .py → ```python, .go → ```go, .rs → ```rust, etc.
- Directories: Create README.md in each directory with list of contained files
- Issues: Each issue becomes {number}-{slug}.md with frontmatter:
    ---
    number: 42
    state: open
    author: username
    created: 2024-01-15
    labels: [bug, help-wanted]
    ---
    # Title
    body text...
- PRs: Same format as issues, plus merged field and review comments
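A sketch of the issue export is shown below. It assumes each issue has already been flattened into a dict with `number`, `state`, `author`, `created`, `labels` (a list of strings), `title`, and `body`; the raw GitHub API response is shaped differently, so that flattening step and the `slugify` rules are assumptions of this example.

```python
import re


def slugify(title: str, max_length: int = 50) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug[:max_length].rstrip("-") or "untitled"


def issue_to_markdown(issue: dict) -> tuple:
    """Return (filename, file body) for one flattened issue dict."""
    labels = ", ".join(issue["labels"])
    lines = [
        "---",
        f"number: {issue['number']}",
        f"state: {issue['state']}",
        f"author: {issue['author']}",
        f"created: {issue['created'][:10]}",  # keep only the YYYY-MM-DD part
        f"labels: [{labels}]",
        "---",
        f"# {issue['title']}",
        "",
        issue.get("body") or "",
    ]
    filename = f"{issue['number']}-{slugify(issue['title'])}.md"
    return filename, "\n".join(lines) + "\n"
```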
--json conversion:
- Build nested structure from flat file list (group by directory path)
- Include file content for text files (base64 encode if large > 1MB)
- For large repos (>10K files), stream output to avoid memory exhaustion
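Grouping the flat file list by directory could be done along these lines. The input shape (`path`, `type` of `blob` or `tree`, `size`) follows the tree listing described earlier; the function name `build_nested` and the trailing slash on directory paths are illustrative choices.

```python
def build_nested(entries: list) -> dict:
    """Turn a flat list of {"path", "type", "size"} entries into a nested tree."""
    root = {"path": "/", "type": "directory", "children": []}
    dirs = {"": root}  # directory path -> node, so each parent lookup is a dict hit

    def ensure_dir(dir_path: str) -> dict:
        # Create the directory node (and any missing ancestors) on first use.
        if dir_path not in dirs:
            parent_path = dir_path.rpartition("/")[0]
            node = {"path": dir_path + "/", "type": "directory", "children": []}
            ensure_dir(parent_path)["children"].append(node)
            dirs[dir_path] = node
        return dirs[dir_path]

    for entry in entries:
        if entry["type"] == "tree":
            ensure_dir(entry["path"])
            continue
        parent_path = entry["path"].rpartition("/")[0]
        file_node = {"path": entry["path"], "type": "file", "size": entry.get("size", 0)}
        ensure_dir(parent_path)["children"].append(file_node)
    return root
```

Because missing parents are created on demand, the function does not depend on directories appearing before their files in the input.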
--csv conversion:
- Headers: number,title,state,author,created_at,updated_at,labels,assignees,milestone,body_preview
- Labels: join with ; delimiter
- Assignees: join with ; delimiter
- Body preview: first 200 chars, strip markdown formatting
- Quote fields that contain commas, double quotes, or line breaks; escape internal double quotes as ""
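The CSV rules above map closely onto Python's standard `csv` module, which quotes fields and doubles embedded quotes on its own. The sketch below assumes issues have been flattened into dicts with string lists for `labels` and `assignees`; the Markdown stripping in `body_preview` is deliberately crude and only an approximation.

```python
import csv
import io
import re

COLUMNS = ["number", "title", "state", "author", "created_at", "updated_at",
           "labels", "assignees", "milestone", "body_preview"]


def body_preview(body: str, limit: int = 200) -> str:
    """Strip common Markdown markers, collapse whitespace, keep the first `limit` chars."""
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", body or "")  # [text](url) -> text
    text = re.sub(r"[#*`>~]", "", text)  # drop heading, emphasis, code, and quote markers
    text = re.sub(r"\s+", " ", text).strip()
    return text[:limit]


def issues_to_csv(issues: list) -> str:
    """Serialize flattened issue dicts; csv.writer handles quoting and "" escaping."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for issue in issues:
        writer.writerow([
            issue["number"], issue["title"], issue["state"], issue["author"],
            issue["created_at"], issue["updated_at"],
            ";".join(issue.get("labels", [])),
            ";".join(issue.get("assignees", [])),
            issue.get("milestone") or "",
            body_preview(issue.get("body", "")),
        ])
    return buffer.getvalue()
```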
--tree conversion:
- Build recursive tree structure
- Detect language from extension using common mapping (.js → JavaScript, .py → Python, etc.)
- Report total file count, directory count, size breakdown by type
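Language detection and the summary counts could be computed in one pass over an already-built tree, as in this sketch. The mapping is intentionally tiny, the fallback label "Other" is an assumption, and the root node is counted as a directory.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed, abbreviated mapping; extend as needed.
LANGUAGE_BY_EXTENSION = {".js": "JavaScript", ".py": "Python", ".go": "Go", ".rs": "Rust"}


def annotate_and_summarize(node: dict) -> dict:
    """Add a "language" key to every file node and return totals for the whole tree."""
    summary = {"files": 0, "directories": 0, "bytes_by_language": Counter()}

    def walk(current: dict) -> None:
        if current["type"] == "directory":
            summary["directories"] += 1
            for child in current.get("children", []):
                walk(child)
            return
        suffix = PurePosixPath(current["path"]).suffix.lower()
        current["language"] = LANGUAGE_BY_EXTENSION.get(suffix, "Other")
        summary["files"] += 1
        summary["bytes_by_language"][current["language"]] += current.get("size", 0)

    walk(node)
    return summary
```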
--readme-to-json conversion:
- Parse markdown using regex or a simple state machine:
  - First # Heading → title
  - Paragraphs between headings → sections (installation, usage, contributing, etc.)
  - Code blocks → code_examples array
  - Tables → parsed as arrays of objects
  - Links [text](url) → collected in links array
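A regex-plus-state-machine parser for this mode might look like the sketch below. It covers the title, description, sections, code blocks, and links; table parsing is left out for brevity. The output key names beyond those listed above (for example `sections`) and the choice to key sections by lowercased heading text are assumptions.

```python
import re

FENCE = re.compile(r"^```[^\n]*\n(.*?)^```[ \t]*$", re.DOTALL | re.MULTILINE)
HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def readme_to_json(markdown: str) -> dict:
    """Extract title, description, sections, code examples, and links from a README."""
    # Remove fenced code first so "#" comments inside code are not read as headings.
    code_examples = [m.group(1).rstrip("\n") for m in FENCE.finditer(markdown)]
    prose = FENCE.sub("", markdown)

    title, intro, sections = None, [], {}
    target = None  # the list currently receiving body lines
    for line in prose.splitlines():
        match = HEADING.match(line)
        if match and len(match.group(1)) == 1 and title is None:
            title, target = match.group(2), intro
        elif match:
            target = sections.setdefault(match.group(2).lower(), [])
        elif target is not None:
            target.append(line)

    # The description is only the first paragraph under the title.
    description = "\n".join(intro).strip().split("\n\n")[0] or None
    return {
        "title": title,
        "description": description,
        "sections": {name: "\n".join(body).strip() for name, body in sections.items()},
        "code_examples": code_examples,
        "links": [{"text": text, "url": url} for text, url in LINK.findall(prose)],
    }
```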
- Validate output
Before delivering:
- If format is JSON: parse with json.loads() and confirm no errors
- If format is CSV: verify all rows have same column count
- If format is Markdown: verify all files are readable UTF-8
- Check that files were not silently skipped; report count of skipped vs converted
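The three format checks could be written as small functions that return a list of problems (empty when the output is valid). The function names and message wording are illustrative assumptions.

```python
import csv
import io
import json


def validate_json(text: str) -> list:
    """Parse the output; report the parser's message and line on failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return [f"Output invalid: {err.msg} at line {err.lineno}"]
    return []


def validate_csv(text: str) -> list:
    """Check that every row has as many columns as the header row."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["CSV output is empty"]
    width = len(rows[0])
    return [
        f"Row {number} has {len(row)} columns, expected {width}"
        for number, row in enumerate(rows[1:], start=2)
        if len(row) != width
    ]


def validate_utf8(data: bytes) -> list:
    """Confirm the bytes of a Markdown file decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"Invalid UTF-8 at byte {err.start}"]
    return []
```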
- Deliver with manifest
Return the converted output plus a manifest:
{
"format": "markdown",
"files_converted": 142,
"files_skipped": 3,
"skipped_reasons": [
{"path": "node_modules/package/index.js", "reason": "dependency directory, excluded by default"},
{"path": ".git/config", "reason": "contains sensitive data"}
],
"total_size_bytes": 2048576,
"output_path": "./{repo-name}-converted/"
}
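Assembling the manifest from the conversion results might look like this. The input shapes (lists of tuples) and the function name `build_manifest` are assumptions; the output keys match the example above.

```python
def build_manifest(fmt: str, converted: list, skipped: list, output_path: str) -> dict:
    """Build the manifest.

    converted: list of (path, size_bytes) tuples for files that were written.
    skipped:   list of (path, reason) tuples for files that were left out.
    """
    return {
        "format": fmt,
        "files_converted": len(converted),
        "files_skipped": len(skipped),
        # One entry per skipped file, so nothing is dropped without a stated reason.
        "skipped_reasons": [{"path": path, "reason": reason} for path, reason in skipped],
        "total_size_bytes": sum(size for _, size in converted),
        "output_path": output_path,
    }
```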
Mandatory Rules
Do not
- Do not hardcode GitHub token; use os.getenv("GITHUB_TOKEN")
- Do not convert binary files (images, PDFs, compiled binaries) as text; detect by extension or size > 5MB
- Do not convert files listed in .gitignore unless user explicitly requests --include-ignored
- Do not silently skip files; every skip must appear in skipped_reasons in the manifest
- Do not produce output that fails format validation (malformed JSON, unparseable CSV)
- Do not convert private repos without authentication; confirm the API does not return 401 (or 404, which GitHub returns for private repos the caller cannot see) before proceeding
Do
- Always produce a manifest with files_converted, files_skipped, skipped_reasons
- Preserve file encoding: read as binary, decode as UTF-8, replace invalid bytes with \uFFFD
- Report the exact file count and size for transparency
- Use pagination when fetching issues/PRs (the GitHub API returns 30 items per page by default, up to 100 with per_page)
- For local repos, use the git CLI; never assume raw file access is available
- Filter out common non-code directories (node_modules, .git, __pycache__, build/, dist/) by default
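Two of the rules above, lossy-but-visible decoding and pagination, can be sketched with the standard library alone. `decode_text` and `next_page_url` are hypothetical helper names; the second assumes the caller passes in the value of the `Link` response header that GitHub sends on paginated endpoints.

```python
import re
from typing import Optional


def decode_text(raw: bytes) -> str:
    """Decode as UTF-8, turning undecodable bytes into U+FFFD instead of failing."""
    return raw.decode("utf-8", errors="replace")


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from a Link response header, if there is one."""
    if not link_header:
        return None
    for url, rel in re.findall(r'<([^>]+)>;\s*rel="([^"]+)"', link_header):
        if rel == "next":
            return url
    return None  # no "next" relation means this was the last page
```

A fetch loop would keep requesting `next_page_url(...)` until it returns None, rather than guessing the page count up front.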
Quality Bar
| Criterion | Minimum | Ideal |
|---|---|---|
| Content preservation | 100% of text file content preserved | Binary files listed in manifest, not content-lost |
| Format validity | Output passes format parser | Strict schema validation (JSON: draft-7, CSV: consistent columns) |
| Manifest completeness | Every skipped file has a reason | Every converted file listed with size and type |
| Encoding correctness | UTF-8 for all text files | Invalid bytes replaced with U+FFFD, not dropped or garbled |
| Rate limit handling | 403 triggers re-auth attempt | Proactive pagination with token refresh on 403 |
| Large repo handling | < 500MB memory for repos up to 10K files | Streaming JSON output, chunked file writes |
A good output passes the target format parser without errors, preserves all semantic content, and includes a complete manifest of what was converted and why some files were skipped.
Good vs. Bad Examples
| Scenario | Bad | Good |
|---|---|---|
| Missing token | Retries unauthenticated forever | Reports "GitHub API auth required — set GITHUB_TOKEN environment variable" |
| Binary file | Tries to read as text, garbles content | Skips binary, reports in manifest: {"path": "logo.png", "reason": "binary/image, excluded"} |
| Large repo | Loads all files into memory, crashes | Streams output, reports "Processed 8,432 of 12,000 files (70%)" |
| Rate limited | Fails silently after 3 requests | Reports "Rate limit exceeded (403) — retry after 14:32 UTC or set GITHUB_TOKEN" |
| Missing field | Skips homepage when absent | Includes "homepage": null; no field is silently dropped |
| Format error | Writes malformed JSON with trailing comma | Validates with json.loads(), reports "Output invalid: unexpected token at line 847" |
| Local repo path | Assumes ~/repo exists | Checks .git directory, reports "Path is not a git repository: ./myrepo" |
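- Make sure OpenClaw is installed (locally or via Docker)
- Enter the install command in the chat box: /install convert-github-repository
- Once installation finishes, invoke the Skill by name or trigger it with /convert-github-repository
- Provide the required inputs as described in the Skill's parameter notes to get structured output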
What is Convert Github Repository?
Use when (1) user provides a GitHub repository URL or local repo path and asks to convert it to a different format. (2) user asks to export a GitHub reposito... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 33 downloads to date.
How do I install Convert Github Repository?
Run the command "/install convert-github-repository" in the OpenClaw or Claude Code chat box to install it in one step; no extra configuration is needed.
Is Convert Github Repository free?
Yes. Convert Github Repository is completely free and released under the MIT-0 license; it can be freely downloaded, installed, and used.
Which platforms does Convert Github Repository support?
Convert Github Repository is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed Convert Github Repository?
It is developed and maintained by 王继鹏 (@wangjipeng977); the current version is v1.0.0.