功能描述

Structured code reviews with severity-ranked findings and deep multi-agent mode. Use when performing a code review, auditing code quality, or critiquing PRs,...

使用说明 (SKILL.md)

Code Review

Name: ia-code-review
Author: iliaal

Two-Stage Review

Stage 1 -- Spec compliance (do this FIRST): verify the changes implement what was intended. Check against the PR description, issue, or task spec. Identify missing requirements, unnecessary additions, and interpretation gaps. If the implementation is wrong, stop here -- reviewing code quality on the wrong feature wastes effort.

Stage 2 -- Code quality: only after Stage 1 passes, review for correctness, maintainability, security, and performance.

Scope Resolution

Pre-flight: verify git rev-parse --git-dir exists before anything else. If not in a git repo, ask for explicit file paths.

When no specific files are given, resolve scope via this fallback chain:

User-specified files/directories (explicit request)
Session-modified files (git diff --name-only for unstaged + staged)
All uncommitted files (git diff --name-only HEAD)
Untracked files (git ls-files --others --exclude-standard) -- new files are often most review-worthy
Zero files → stop. Ask what to review.

Exclude: lockfiles, minified/bundled output, vendored/generated code.

Review Mode Selection

Run this BEFORE reading the full diff. Use metadata only (git diff --stat, file list from scope resolution) to count signals. Reading the diff first creates analysis momentum that bypasses mode selection.

Signal	Threshold	Detect from
Lines changed	>300	`git diff --stat` insertion + deletion totals, excluding test files
Files touched	>8	File count from scope resolution, excluding test files
Modules/directories spanned	>3	Unique top-level directories from non-test file list
Security-sensitive files (auth, crypto, payments, permissions)	any	File path matching
Database migrations present	any	File path matching
API surface changes (public endpoints, exported interfaces)	any	File path matching

Test file exclusion: test files inflate complexity signals without adding review risk -- they're boilerplate-heavy and follow repetitive patterns. Exclude paths matching tests/, test/, __tests__/, *.test.*, *.spec.*, *_test.* from the lines, files, and directories signals. Use git diff --stat -- ':!tests/' ':!test/' ':!__tests__/' ':!*.test.*' ':!*.spec.*' ':!*_test.*' for the filtered count. Report both totals for transparency: "450 lines changed (280 excluding tests)."

3+ signals → deep review. Inform the user, then dispatch parallel specialist agents per deep-review.md. Pass the diff to agents -- do NOT read it first. Reading and analyzing the diff yourself before dispatching agents defeats the purpose of deep review. Stop here -- do not proceed to the Review Process section.

2 signals → suggest: "This touches N files across M modules. Deep review? (y/n)"

0-1 signals → standard review. Proceed to Review Process below.

Before auto-switching to deep review, check the exceptions list in deep-review.md -- certain change types (pure docs, mechanical refactors, single-file \x3C50 lines) override signal count.

Override: deep forces multi-agent, quick forces single-pass.

Review Process

Standard reviews only. If mode selection triggered deep review, specialist agents handle the review per deep-review.md -- do not run these steps yourself.

Context — do these before reading code:
- Scope Drift Check: compare git diff --stat against the PR's stated intent. Classify as CLEAN / DRIFT DETECTED / REQUIREMENTS MISSING. If DRIFT, note the drifted files and ask the author: ship as-is, split, or remove unrelated changes?
- Read the intent: PR description, linked issue, or task spec. If the code does something the intent doesn't describe, or fails to do something the intent promises, flag as a finding — correct code that solves the wrong problem is still wrong.
- Fetch existing discussions: prior review comments may have already resolved issues you'd otherwise re-raise. Check gh api repos/{owner}/{repo}/pulls/{pr}/comments before starting.
- Run automated gates: execute the project's test/lint suite if available (check CI config for the canonical commands) to catch failures before manual review.
Structural scan -- architecture, file organization, API surface changes. Flag breaking changes. For files marked as added (A) in the diff, use the diff content directly -- don't attempt to read them from the working tree when reviewing a remote branch.
Line-by-line -- correctness, edge cases, error handling, naming, readability. Use question-based feedback ("What happens if input is empty here?") instead of declarative statements to encourage author thinking.
Security -- input validation, auth checks, secrets exposure, injection vectors (SQL, XSS, CSRF, SSRF, command, path traversal, unsafe deserialization). Flag race conditions (TOCTOU, check-then-act). Use security-patterns.md for grep-able detection patterns across 11 vulnerability classes.
Test coverage -- verify new code paths have tests. Flag untested error paths, edge cases, and behavioral changes without corresponding test updates. Flag tests coupled to implementation details (mocking internals, testing private methods) -- test behavior, not wiring.
Reliability -- error handling completeness, timeout/retry logic, resource cleanup on error paths, graceful degradation. Use reliability-patterns.md for detection patterns and grep-able signals.
Removal candidates -- identify dead code, unused imports, feature-flagged code that can be cleaned up. Distinguish safe-to-delete (no references) from defer-with-plan (needs migration).
Verify -- run formatter/lint/tests on touched files. State what was skipped and why. If code changes affect features described in README/ARCHITECTURE/CONTRIBUTING, note doc staleness as informational.
Summary -- present findings grouped by severity with verdict: Ready to merge / Ready with fixes / Not ready.

Large diffs (>500 lines): Review by module/directory rather than file-by-file. Summarize each module's changes first, then drill into high-risk areas. Flag if the PR should be split.

Change sizing: Ideal PRs are ~100-300 lines of meaningful changes (excluding generated code, lockfiles, snapshots). PRs beyond this range have slower review cycles and higher defect rates. When a PR exceeds this, suggest splitting using one of these strategies: (a) Stack -- sequential PRs where each builds on the previous, merged in order; (b) By file group -- group related files (e.g., model + migration + tests) into separate PRs; (c) Horizontal -- split by layer (frontend, API, database); (d) Vertical -- split by feature slice (each PR delivers one user-visible behavior end-to-end).

Severity and Confidence

Four severity tiers (Critical / Important / Medium / Minor) and a 5-band confidence rubric (0.0-1.0 → Report / Report-if-actionable / Suppress) govern what lands in the report. Full rules, false-positive suppression categories, and the LLM-specific prompt-injection exception in severity-and-confidence.md.

Tie every finding to concrete code evidence (file path, line number, specific pattern). Never fabricate references.

What to Check

For category checklists (Correctness, Maintainability & Readability, Performance, Adversarial red-team pass, AI-generated code lens), load check-categories.md. It's the structured checklist for the line-by-line review step.

Language-specific checks live in language-profiles.md — load the profile matching the file extensions in the diff (TypeScript/React, Python, PHP, Shell/CI, Configuration, Data Formats, Security, LLM Trust Boundaries).

Action Routing

For every finding, classify the fix into one of four tiers: safe_auto / gated_auto / manual / advisory. Full decision rules and conflict-resolution policy in action-routing.md. When in doubt, escalate to gated_auto — never promote toward safe_auto on disagreement.

Comment Labels

Prefix inline review comments so authors know what requires action:

(no prefix) -- required change (maps to Critical or Important severity), blocks merge
Nit: -- style preference, optional
Consider: -- suggestion worth evaluating, not blocking
FYI: -- informational, no action expected

Anti-Patterns in Reviews

Nitpicking style when linters exist -- defer to automated tools instead
"While you're at it..." scope creep -- open a separate issue instead
Blocking on personal preference -- approve with a Minor comment instead
Rubber-stamping without reading -- always verify at least Stage 1
Reviewing code quality before verifying spec compliance -- do Stage 1 first
Recommending fix patterns without checking currency -- verify the pattern is current for the project's framework version before suggesting it. Prefer built-in alternatives from newer versions
Fighting documented overrides -- if the project's CLAUDE.md, AGENTS.md, or an inline comment documents a deliberate bypass of a general rule (e.g., "we intentionally allow X because Y"), honor it. Do not re-raise the underlying concern; do not "just to be safe" around it. If the override lacks rationale, suggest documenting the reason — don't argue the rule.

When to Stop and Ask

Fixing the issues would require an API redesign beyond the PR's scope
Intent behind a change is ambiguous -- ask rather than assume
Missing validation tooling (no linter, no tests) -- flag the gap, don't guess

Output Format

## Review: [brief title]

### Critical
- **CR-001.** [file:line] `quoted code` -- [issue]. Score: [0.0-1.0]. [What happens if not fixed]. Fix: [concrete suggestion].

### Important
- **CR-002.** [file:line] `quoted code` -- [issue]. Score: [0.0-1.0]. [Why it matters]. Consider: [alternative approach].

### Medium
- **CR-003.** [file:line] -- [issue]. Score: [0.0-1.0]. [Why it matters].

### Minor
- **CR-004.** [file:line] -- [observation].

### What's Working Well
- [specific positive observation with why it's good]

### Residual Risks
- [unresolved assumptions, areas not fully covered, open questions]

### Verdict
Ready to merge / Ready with fixes / Not ready -- [one-sentence rationale]

Number findings as CR-001, CR-002... sequentially across all severity levels so they can be referenced by ID in discussions, PR comments, and follow-up todos. Limit to 10 findings per severity. If more exist, note the count and show the highest-impact ones.

For multi-agent consolidation (deep review, parallel specialists), apply the merge algorithm in deep-review.md — it handles same-line dedupe, conflicting severity, NEEDS DECISION flagging, and cross-lens confidence boosting.

Clean review (no findings): If the code is solid, say so explicitly. Summarize what was checked and why no issues were found. A clean review is a valid outcome, not an indication of insufficient effort.

References

Document	When to load	What it covers
security-patterns.md	Security review step or deep review security agent	Grep-able detection patterns across 11 vulnerability classes
security-test-coverage.md	Full security audit deliverable (used by `ia-security-sentinel` agent)	Auth edge cases, authorization, input boundary, concurrency, session hygiene, output boundary checklist
language-profiles.md	Language-specific checks step	TypeScript/React, Python, PHP, Shell/CI, Config, Security, LLM Trust
deep-review.md	When mode selection triggers deep review	Specialist agents, prompt template, merge algorithm, model selection
review-traps-catalog.md	Starting any non-trivial review; writing findings with "should"/"could"/"what if"	Reachability-before-severity, docs-idiom smoke test, convention-from-3-files, speculative future-design, paired-enum drift, cross-repo contract staleness, language-version gotchas
check-categories.md	Line-by-line step, structuring the read	Correctness, Maintainability, Performance, Adversarial, AI-generated-code lens
action-routing.md	Classifying fix-application for each finding	4-tier split (safe_auto / gated_auto / manual / advisory), conflict resolution
severity-and-confidence.md	Classifying severity + confidence for each finding	Critical/Important/Medium/Minor tiers, 5-band confidence rubric, FP suppression categories
false-positive-suppression.md	Detailed FP categories with framework-idiom and test-specific examples	Linked from severity-and-confidence; covers overridable patterns

Integration

ia-receiving-code-review -- the inbound side (processing review feedback received from others). Action-routing terminology maps across: safe_auto ≈ AUTO-FIX, gated_auto ≈ ESCALATE-for-approval, manual ≈ ESCALATE, advisory ≈ FYI (no-op).
ia-kieran-reviewer agent -- persona-driven Python/TypeScript deep quality review (type safety, naming, modern patterns)
/ia-review -- full ceremony review (worktrees, ultra-thinking, multi-agent). Deep review is lighter: no worktrees, no plan verification, just parallel specialist agents on the same diff.
/resolve-pr-parallel command -- batch-resolve PR comments with parallel agents
ia-security-sentinel agent -- deep security audit beyond the security step in this skill. Also supports threat-model mode for architectural security analysis when the diff introduces new trust boundaries, auth flows, or external API surfaces.

安全使用建议

What to consider before installing/using this skill: - Credentials and CLI: The instructions call `gh api` (GitHub CLI) but the skill declares no required environment variables or credentials. Confirm whether your agent environment already provides an authenticated `gh` context (GITHUB_TOKEN or gh auth). Do not provide credentials unless you trust the skill and its operator. - Running repo tests: The skill recommends executing the project's test/lint suite. Tests run arbitrary code from the repository and can execute malicious or harmful actions on the host. If you run this skill on untrusted PRs, run tests in an isolated/sandboxed environment (container, CI runner, or VM) and review test commands before execution. - Data exposure: Deep review dispatches the full diff to multiple specialist agents/models. If diffs contain secrets, proprietary code, or private data, you may be exposing them to other models or logging. Confirm how your agent platform handles model contexts and whether cross-agent data sharing is permitted and auditable. - Clarify assumptions: Ask the skill author to document required tools (gh CLI), expected authentication (GITHUB_TOKEN), network access, and any sandboxing recommendations. Prefer skills that declare required env vars and explicitly warn about executing repository code. - Mitigations: run the skill only on trusted repos or within sandboxed runners, restrict `gh` permissions (least privilege token), audit outputs for unexpected external calls, and confirm where diffs are sent (internal agents vs external endpoints).

功能分析

Type: OpenClaw Skill Name: compound-eng-code-review Version: 3.0.4 The compound-eng-code-review skill bundle is a highly sophisticated and professional tool designed for structured, multi-stage code reviews. It includes comprehensive references for security patterns (security-patterns.md), reliability (reliability-patterns.md), and a detailed catalog of common review pitfalls (review-traps-catalog.md). While the instructions in SKILL.md authorize the agent to execute shell commands for git operations, GitHub API calls, and local test/lint suites, these capabilities are strictly aligned with the stated purpose of auditing code quality and verifying PR compliance. There is no evidence of malicious intent, data exfiltration, or unauthorized persistence.

能力标签

cryptocan-make-purchasesrequires-oauth-tokenrequires-sensitive-credentials

能力评估

ℹ Purpose & Capability

Name/description (structured code reviews, deep multi-agent mode) aligns with the instructions: scope resolution, diff parsing, multi-agent deep review, severity/confidence handling, and many review checklists are present. Nothing obvious in the files suggests a different functional goal. However, the runtime instructions reference external tooling/APIs (e.g., `gh api`) and model names (default/opus) yet the skill declares no required env vars, credentials, or binaries — a modest mismatch between declared requirements and what the skill expects at runtime.

⚠ Instruction Scope

SKILL.md instructs the agent to run many repo-scoped commands (`git rev-parse`, `git diff`, `git ls-files`) and to fetch PR comments via `gh api repos/{owner}/{repo}/pulls/{pr}/comments`. It also tells the agent to 'execute the project's test/lint suite if available' and to dispatch full diffs to multiple specialist agents/models. Running the project's tests/lints executes arbitrary repository code on the host and may be unsafe if the code is untrusted. Dispatching full diffs to other agents/models can expose sensitive source code to those models. The instructions do not document sandboxing, credential sources, or data-handling constraints.

✓ Install Mechanism

Instruction-only skill with no install spec and no code files; nothing will be written to disk by an installer. This is low risk from an install-mechanism perspective.

⚠ Credentials

No required environment variables, credentials, or binaries are declared, yet the instructions expect use of the GitHub CLI (`gh api`) and model selection (default/opus). In practice `gh` usage usually requires authentication (GITHUB_TOKEN or `gh auth login`) and network access. The lack of declared env requirements is an incoherence: either the skill assumes the agent environment already provides these secrets, or it omits necessary requirements. That should be clarified before use.

✓ Persistence & Privilege

The skill does not request permanent presence (always:false), doesn't modify other skills or system-wide settings in the provided materials, and has no install step. No elevated persistence privileges are requested.

版本历史

v3.0.4

v3.0.3

v3.0.2

v3.0.1

v3.0.0

v2.56.1

v2.56.0

v2.55.1

v2.55.0

v2.53.2

v2.53.0

元数据

Slug compound-eng-code-review

版本 3.0.4

许可证 MIT-0

累计安装 0

当前安装数 0

历史版本数 11

常见问题

ia-code-review 是什么？

Structured code reviews with severity-ranked findings and deep multi-agent mode. Use when performing a code review, auditing code quality, or critiquing PRs,... 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件，目前累计下载 276 次。

如何安装 ia-code-review？

在 OpenClaw 或 Claude Code 对话框中运行命令「/install compound-eng-code-review」即可一键安装，无需额外配置。

ia-code-review 是免费的吗？

是的，ia-code-review 完全免费，采用 MIT-0 许可证，可自由下载、安装和使用。

ia-code-review 支持哪些平台？

ia-code-review 跨平台运行，可在任意部署了 OpenClaw / Claude Code 的环境中使用（cross-platform）。

谁开发了 ia-code-review？

由 Ilia Alshanetsky（@iliaal）开发并维护，当前版本 v3.0.4。

ia-code-review