Description

Three-tier code quality defense: L1 quick scan, L2 deep audit (via bug-audit), L3 cross-validation with adversarial testing.

Usage (SKILL.md)

Codex Review — Three-Tier Code Quality Defense

Name: Codex Review
Author: abczsl520

Unified orchestration layer: picks audit depth based on trigger phrases. bug-audit is invoked as an independent skill — never modified.

Security & Privacy

Read-only by default: This skill only reads your project files for analysis. It does NOT modify or delete your code, and nothing is uploaded unless you opt in to the external model described in the next item.
Optional external model: L1/L3 can use an external code-review API (OpenAI-compatible) for a second opinion. This is opt-in — if no API key is configured, the skill works fine with agent-only review.
Credentials via environment variables only: API keys are loaded from CODEX_REVIEW_API_KEY env var. Never hardcoded, never logged, never stored.
Local-only artifacts: Hotspot files are written to system temp directory and auto-cleaned. No network transmission of analysis results.
No data exfiltration: Code snippets sent to the external API are limited to the files being reviewed. No telemetry, no analytics, no third-party data sharing beyond the configured review model.

Prerequisites

External model API (optional, for L1 Round 1 and L3): Any OpenAI-compatible endpoint.
- Set env vars: CODEX_REVIEW_API_BASE (default: https://api.openai.com/v1), CODEX_REVIEW_API_KEY, CODEX_REVIEW_MODEL (default: gpt-4o)
- Works without this — falls back to agent-only audit
bug-audit skill (optional): Required for L2/L3. Without it, L2 uses a built-in fallback.
curl: For API calls (standard on macOS/Linux)
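
The environment variables and defaults listed above could be resolved along the following lines. This is a minimal sketch, not the skill's actual loader: `ReviewConfig` and `load_config` are hypothetical names, and treating an empty variable the same as an unset one is an assumption chosen to mirror the shell `${VAR:-default}` form.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReviewConfig:
    """Hypothetical container for the three CODEX_REVIEW_* settings."""
    api_base: str
    api_key: Optional[str]
    model: str

    @property
    def external_enabled(self) -> bool:
        # Without a key the skill falls back to agent-only review.
        return bool(self.api_key)


def load_config(env: Mapping[str, str] = os.environ) -> ReviewConfig:
    """Read the variables, applying the documented defaults when unset or empty."""
    return ReviewConfig(
        api_base=env.get("CODEX_REVIEW_API_BASE") or "https://api.openai.com/v1",
        api_key=env.get("CODEX_REVIEW_API_KEY") or None,
        model=env.get("CODEX_REVIEW_MODEL") or "gpt-4o",
    )
```

Calling `load_config().external_enabled` before Round 1 gives a single place to decide whether the external scan runs at all.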

Trigger Mapping

| User says | Level | What it does | Est. time |
|---|---|---|---|
| "review" / "quick scan" / "review下" / "检查下" | L1 | External model scan + agent deep pass | 5-10 min |
| "audit" / "deep audit" / "审计下" / "排查下" | L2 | Full bug-audit flow (or built-in fallback) | 30-60 min |
| "pre-deploy check" / "上线前检查" | L1→L2 | L1 scan → record hotspots → L2 audit → hotspot gap check | 40-70 min |
| "cross-validate" / "highest level" / "交叉验证" | L3 | Dual independent audits + compare + adversarial test | 60-90 min |

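The trigger mapping above can be pictured as a simple phrase lookup. The sketch below is illustrative only: `TRIGGERS` and `pick_level` are hypothetical names, and checking the deepest level first (so that a message matching several rows resolves to the most thorough audit) is an assumption, since the source does not say how overlaps are resolved.

```python
from typing import Optional

# Phrases copied from the trigger table; checked from deepest level to shallowest.
TRIGGERS = [
    ("L3", ["cross-validate", "highest level", "交叉验证"]),
    ("L1→L2", ["pre-deploy check", "上线前检查"]),
    ("L2", ["deep audit", "audit", "审计下", "排查下"]),
    ("L1", ["quick scan", "review", "review下", "检查下"]),
]


def pick_level(message: str) -> Optional[str]:
    """Return the audit level for a user message, or None if nothing matches."""
    text = message.lower()
    for level, phrases in TRIGGERS:
        if any(phrase in text for phrase in phrases):
            return level
    return None
```

For example, `pick_level("run a deep audit")` returns `"L2"`, while a message with no trigger phrase returns `None`.
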
Level 1: Quick Scan (core of codex-review)

Flow

Gather code: local read, git clone <url>, server scp, user-pasted snippet, or PR diff
Exclude: node_modules/, .git/, package-lock.json, dist/, *.db, __pycache__/, vendor/
Round 1: send to external model API for automated scan (skipped if no API key)
Round 2: current agent does deep supplementary pass
Merge & dedup: output severity-graded report
Write hotspot file (for L1→L2 handoff)
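
The Exclude step in the flow above amounts to a path filter. A minimal sketch, assuming paths are given relative to the project root with forward slashes; `is_excluded` and `select_files` are hypothetical helper names, and the Python cache directory is spelled out in full as `__pycache__`.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Iterable, List

EXCLUDED_DIRS = {"node_modules", ".git", "dist", "__pycache__", "vendor"}
EXCLUDED_FILES = ["package-lock.json", "*.db"]


def is_excluded(path: str) -> bool:
    """True if the path is under an excluded directory or matches an excluded file pattern."""
    parts = PurePosixPath(path).parts
    if not parts:
        return True  # Nothing to scan for an empty path.
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return True
    return any(fnmatch(parts[-1], pattern) for pattern in EXCLUDED_FILES)


def select_files(paths: Iterable[str]) -> List[str]:
    """Keep only the paths that should be sent on to Round 1 and Round 2."""
    return [p for p in paths if not is_excluded(p)]
```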

External Model API Call

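# Replace the <...> placeholders with JSON-escaped text before running.
# The unquoted heredoc lets the shell expand ${CODEX_REVIEW_MODEL:-gpt-4o};
# inside single quotes it would be sent to the API literally.
curl -s --max-time 120 "${CODEX_REVIEW_API_BASE:-https://api.openai.com/v1}/chat/completions" \
  -H "Authorization: Bearer ${CODEX_REVIEW_API_KEY}" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "${CODEX_REVIEW_MODEL:-gpt-4o}",
  "messages": [
    {"role": "system", "content": "<REVIEW_SYSTEM_PROMPT>"},
    {"role": "user", "content": "<code content>"}
  ],
  "temperature": 0.2,
  "max_tokens": 6000
}
EOF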

Fallback: If API call fails or times out (120s), skip Round 1 and complete with agent-only audit.
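
For readers who prefer to see the request and its fallback in one place, here is a rough Python equivalent of the call above using only the standard library. It is a sketch under assumptions: `build_payload` and `external_scan` are hypothetical names, the response is assumed to follow the usual OpenAI-compatible `choices[0].message.content` shape, and `urlopen`'s `timeout` applies to blocking socket operations rather than being a strict wall-clock limit.

```python
import json
import os
import urllib.request
from typing import Optional


def build_payload(system_prompt: str, code: str, model: str) -> dict:
    """Assemble the chat-completions request body described in this section."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": code},
        ],
        "temperature": 0.2,
        "max_tokens": 6000,
    }


def external_scan(system_prompt: str, code: str, timeout: float = 120.0) -> Optional[str]:
    """Return the model's review text, or None so the caller falls back to agent-only audit."""
    api_key = os.environ.get("CODEX_REVIEW_API_KEY")
    if not api_key:
        return None  # Round 1 is skipped when no key is configured.
    base = os.environ.get("CODEX_REVIEW_API_BASE") or "https://api.openai.com/v1"
    model = os.environ.get("CODEX_REVIEW_MODEL") or "gpt-4o"
    request = urllib.request.Request(
        base.rstrip("/") + "/chat/completions",
        data=json.dumps(build_payload(system_prompt, code, model)).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.load(response)
        return body["choices"][0]["message"]["content"]
    except (OSError, ValueError, KeyError, IndexError, TypeError):
        return None  # Network error, timeout, or unexpected response shape.
```

Returning `None` for every failure mode keeps the caller simple: a `None` result means "skip Round 1 and continue with the agent pass".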

System Prompt (L1 External Scan)

You are an expert code reviewer. Find ALL bugs and security issues:
1. CRITICAL — Security vulnerabilities (XSS, injection, auth bypass), crash bugs
2. HIGH — Logic errors, race conditions, unhandled exceptions
3. MEDIUM — Missing validation, edge cases, performance issues
4. LOW — Code style, dead code, minor improvements

For each: Severity, File+line, Issue, Fix with code snippet.
Focus on real bugs, not style opinions. Output language: match the user's language.

Agent Round 2 — Universal Checklist

Cross-file logic consistency (imports, exports, shared state)
Authentication & authorization bypass
Race conditions (concurrent requests, DB write conflicts)
Unhandled exceptions / missing error boundaries
Input validation & sanitization (SQL injection, XSS, path traversal)
Memory/resource leaks (unclosed connections, event listener buildup)
Sensitive data exposure (keys in code, logs, error messages)
Timezone handling (UTC vs local)
Dependency vulnerabilities (outdated packages, known CVEs)

Agent Round 2 — Tech-Stack Specific (auto-detect & apply)

Node.js/Express:

SQLite pitfalls (DEFAULT doesn't support functions, double-quote = column name)
Middleware ordering (auth before route handlers)
pm2/cluster mode compatibility

Python/Django/Flask:

ORM N+1 queries
CSRF protection enabled
Debug mode in production

Frontend (React/Vue/vanilla):

innerHTML / dangerouslySetInnerHTML without sanitization
WebView compatibility (WeChat, in-app browsers)
Nginx sub-path / base URL issues

Other stacks: adapt checklist to detected technology.

Code Volume Control

Single API request: backend core files only (server + routes + db + config)
Send frontend as a second batch if needed
Very large projects (>50 files): summarize file tree first, then scan in priority order
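
One way to picture the batching rules above is a small planner that puts backend core files first and flags very large projects. This is a deliberately crude sketch: the substring hints in `BACKEND_HINTS` are a hypothetical heuristic (the source does not say how backend files are detected), and `plan_batches` is an illustrative name.

```python
from typing import Dict, List

# Hypothetical heuristic: path fragments that suggest backend core files.
BACKEND_HINTS = ("server", "routes", "db", "config")
LARGE_PROJECT_THRESHOLD = 50  # ">50 files" from the rule above


def plan_batches(paths: List[str]) -> Dict[str, object]:
    """Split files into a backend batch and a second batch, preserving input order."""
    backend = [p for p in paths if any(hint in p.lower() for hint in BACKEND_HINTS)]
    backend_set = set(backend)
    rest = [p for p in paths if p not in backend_set]
    return {
        # Large projects get a file-tree summary before any scan.
        "summarize_tree_first": len(paths) > LARGE_PROJECT_THRESHOLD,
        "batches": [batch for batch in (backend, rest) if batch],
    }
```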

Hotspot File (L1→L2 handoff)

After L1, write issue summary to ${TMPDIR:-/tmp}/codex-review-hotspots.json:

{
  "project": "my-project",
  "timestamp": "2026-03-05T22:00:00",
  "hotspots": [
    {"file": "routes/admin.js", "severity": "CRITICAL", "brief": "Admin auth bypass via localhost"},
    {"file": "routes/game.js", "severity": "CRITICAL", "brief": "Score submission no server validation"}
  ]
}

This file is only used internally for L1→L2 handoff. bug-audit is unaware of it.
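
Writing and reading the hotspot file could look roughly like this. The sketch assumes Python's `tempfile.gettempdir()` is an acceptable stand-in for `${TMPDIR:-/tmp}` (it also consults `TEMP` and `TMP`); `hotspot_path`, `write_hotspots`, and `read_hotspots` are hypothetical helper names.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

HOTSPOT_NAME = "codex-review-hotspots.json"


def hotspot_path(tmp_dir: Optional[str] = None) -> Path:
    """Locate the hotspot file in the given directory or the system temp directory."""
    return Path(tmp_dir or tempfile.gettempdir()) / HOTSPOT_NAME


def write_hotspots(project: str, hotspots: List[Dict[str, str]],
                   tmp_dir: Optional[str] = None) -> Path:
    """Overwrite the hotspot file with the summary of the latest L1 run."""
    payload = {
        "project": project,
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "hotspots": hotspots,
    }
    path = hotspot_path(tmp_dir)
    path.write_text(json.dumps(payload, indent=2, ensure_ascii=False), encoding="utf-8")
    return path


def read_hotspots(tmp_dir: Optional[str] = None) -> List[Dict[str, str]]:
    """Return the recorded hotspots, or an empty list if no L1 run has written the file."""
    path = hotspot_path(tmp_dir)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))["hotspots"]
```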

Level 2: Deep Audit

Flow (bug-audit available)

Read bug-audit's SKILL.md and execute its full flow (6 Phases)
bug-audit itself is never modified
Agent strictly follows bug-audit's specification

Flow (bug-audit NOT available — built-in fallback)

Phase 1: Project Dissection — read all source files, build dependency graph
Phase 2: Build Check Matrix — generate project-specific checklist from actual code patterns
Phase 3: Exhaustive Verification — verify every checklist item against real code
Phase 4: Reproduce — for each finding, trace the exact execution path
Phase 5: Report — output full severity-graded report
Phase 6: Fix Suggestions — provide concrete code patches

Level 1→2 Cascade: Pre-Deploy Check

Flow

Execute L1 quick scan
Write hotspot file
Execute L2 (bug-audit or fallback)
After L2, agent does hotspot gap analysis:
- Read hotspot file
- Check if L2 report covers each L1 hotspot
- Uncovered hotspots → targeted deep analysis, add to report
- L1 vs L2 conclusions conflict → flag for manual review
Output final merged report
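
The hotspot gap analysis in the flow above can be sketched as a comparison between L1 hotspots and L2 findings. The matching rules here are assumptions made for illustration: findings are matched by file path only, and a "conflict" is taken to mean that L2 reported on the same file but never at the severity L1 assigned. `gap_analysis` is a hypothetical name.

```python
from typing import Dict, List

Finding = Dict[str, str]  # expects at least "file" and "severity" keys


def gap_analysis(hotspots: List[Finding], l2_findings: List[Finding]) -> Dict[str, List[Finding]]:
    """Split L1 hotspots into covered, uncovered, and conflicting against the L2 report."""
    l2_by_file: Dict[str, List[Finding]] = {}
    for finding in l2_findings:
        l2_by_file.setdefault(finding["file"], []).append(finding)

    result: Dict[str, List[Finding]] = {"covered": [], "uncovered": [], "conflict": []}
    for spot in hotspots:
        matches = l2_by_file.get(spot["file"], [])
        if not matches:
            result["uncovered"].append(spot)  # needs targeted deep analysis
        elif all(m["severity"] != spot["severity"] for m in matches):
            result["conflict"].append(spot)   # flag for manual review
        else:
            result["covered"].append(spot)
    return result
```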

Level 3: Cross-Validation (highest level)

Flow

Step 1: External model independent audit
  → Full code to external API with detailed system prompt
  → Output: Report A

Step 2: Agent independent audit (bug-audit or fallback)
  → Full bug-audit flow (or built-in fallback)
  → Output: Report B

Step 3: Cross-compare
  → Both found       → 🔴 Confirmed high-risk (high confidence)
  → Only external    → 🟡 Agent verifies (possible false positive)
  → Only agent       → 🟡 External verifies (possible deep logic bug)
  → Contradictory    → ⚠️ Deep analysis, provide judgment

Step 4: Adversarial testing
  → Ask external model to bypass discovered fixes
  → Validate fix robustness
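
Step 3 above can be illustrated with a small classifier over the two reports. This is a sketch under stated assumptions: each report is modelled as a mapping from a hypothetical `(file, issue label)` key to `True` (the auditor flagged the issue) or `False` (the auditor looked at it and judged it safe), and "contradictory" means the two auditors gave opposite verdicts on the same key. Real reports would need fuzzier matching than exact keys.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (file, short issue label): a hypothetical matching key


def cross_compare(external: Dict[Key, bool], agent: Dict[Key, bool]) -> Dict[Key, str]:
    """Classify each issue as confirmed, external_only, agent_only, or contradictory."""
    verdicts: Dict[Key, str] = {}
    for key in sorted(set(external) | set(agent)):
        a, b = external.get(key), agent.get(key)
        if a and b:
            verdicts[key] = "confirmed"       # both found it: high confidence
        elif a is not None and b is not None and a != b:
            verdicts[key] = "contradictory"   # opposite verdicts: needs deep analysis
        elif a:
            verdicts[key] = "external_only"   # agent should verify
        elif b:
            verdicts[key] = "agent_only"      # external model should verify
        # Issues that nobody flagged are left out of the result.
    return verdicts
```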

Adversarial Test Prompt

You are a security researcher. The following security fixes were applied to a project.
For each fix, analyze:
1. Can the fix be bypassed? How?
2. Does the fix introduce new vulnerabilities?
3. Are there edge cases the fix doesn't cover?
Be adversarial and thorough. Output language: match the user's language.

Report Format (all levels)

# 🔍 Code Audit Report — [Project Name]
## Audit Level: L1 / L2 / L3
## 📊 Overview
- Files scanned: X
- Issues found: X (🔴 Critical X | 🟠 High X | 🟡 Medium X | 🔵 Low X)
- [L3 only] Cross-validation: Both agreed X | External only X | Agent only X | Conflict X

## 🔴 Critical Issues
### 1. [Issue Title]
- **File**: `path/to/file.js:42-55`
- **Found by**: External model / Agent / Both
- **Description**: ...
- **Fix**:
(code snippet)

## ✅ Highlights
- [What's done well]

User Options

Users can customize behavior by saying:

"only scan backend" / "只扫后端" → skip frontend files
"ignore LOW" / "忽略低级别" → filter out LOW severity
"output in English/Chinese" → control report language
"scan this PR" / "审这个PR" → fetch PR diff instead of full codebase
"skip external model" / "不用外部模型" → agent-only audit

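As an illustration of how the "ignore LOW" option could interact with the overview line of the report format, the sketch below counts findings per severity and renders that line. `overview_line` is a hypothetical helper, and each finding is assumed to be a dict with a `severity` key holding one of the four level names in upper case.

```python
from collections import Counter
from typing import Dict, List

SEVERITIES = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]
ICONS = {"CRITICAL": "🔴", "HIGH": "🟠", "MEDIUM": "🟡", "LOW": "🔵"}


def overview_line(findings: List[Dict[str, str]], ignore_low: bool = False) -> str:
    """Render the 'Issues found' line, optionally dropping LOW findings first."""
    kept = [f for f in findings if not (ignore_low and f["severity"] == "LOW")]
    counts = Counter(f["severity"] for f in kept)
    parts = " | ".join(f"{ICONS[s]} {s.capitalize()} {counts[s]}" for s in SEVERITIES)
    return f"- Issues found: {len(kept)} ({parts})"
```
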
Notes

External API timeout: 120 seconds. On failure, skip that round — agent completes independently
Large projects: split into batches (backend → frontend → config)
Long reports: split across multiple messages, adapted to current channel
L2/L3 bug-audit execution strictly follows its own SKILL.md — no modifications or shortcuts
Hotspot file is ephemeral — overwritten each L1 run, not persisted
All secrets/keys must come from env vars or user config — never hardcoded in this skill

Safe Usage Recommendations

This skill appears to do what it says: review code locally and optionally call a configured external model. Before installing or running:

(1) Understand that enabling the optional API key will send code snippets to the configured endpoint; only provide keys for providers you trust.
(2) The skill may clone repos or scp files you instruct it to fetch, and it writes a hotspot file to /tmp; avoid giving it URLs or credentials for sensitive/private systems unless you intend it to fetch them.
(3) The README recommends the third-party 'bug-audit' companion for deeper scans; review that skill separately.
(4) There are minor metadata mismatches (e.g., SKILL.md references curl/git but the registry lists no required binaries); this is not dangerous but worth being aware of.

If you need higher assurance, ask the author for an explicit privacy/data-flow statement and a published homepage/source repository to inspect.

Functional Analysis

Type: OpenClaw Skill
Name: codex-review
Version: 2.1.0

The codex-review skill is a legitimate orchestration tool designed for multi-tier code auditing and security reviews. It utilizes local analysis, integration with the bug-audit skill, and optional external LLM APIs (via curl) to identify vulnerabilities. The SKILL.md and README.md files provide transparent documentation regarding data handling, environment variable usage for API keys, and the use of temporary local files for state management, with no evidence of malicious intent or unauthorized data exfiltration.

Capability Assessment

ℹ Purpose & Capability

The name/description match the instructions: multi‑level code review that may use an external model and an optional 'bug-audit' companion. Small inconsistencies: SKILL.md expects curl/git/scp operations and an optional bug-audit companion, but the registry metadata lists no required binaries or required companion — these are optional but should be documented in metadata.

ℹ Instruction Scope

Instructions explicitly allow reading project files via local read, git clone <url>, server scp, pasted snippets, or PR diffs and write a hotspot file to ${TMPDIR:-/tmp}. This is within scope for a code-review skill, but the skill will send code snippets to an external model if configured — the 'read-only' claim is broadly true (no deletions by default) but network transmission of code to a configured API is expected and must be opted into by supplying an API key.

✓ Install Mechanism

No install spec or code is present (instruction-only), so there is no package download or archive extraction risk.

ℹ Credentials

The skill uses optional env vars (CODEX_REVIEW_API_KEY, CODEX_REVIEW_API_BASE, CODEX_REVIEW_MODEL) for an external model; registry metadata declares no required envs — this is consistent if treated as opt-in. No unrelated credentials or broad secret access are requested.

✓ Persistence & Privilege

always:false and no claims of persistent modifications to agent/system settings. The skill writes a temporary hotspot JSON to a temp directory for handoff, which is reasonable and scoped to its function.

Version History

v2.1.0

Added Security & Privacy section, env var credential handling, explicit data handling transparency to address safety scan concerns

v2.0.0

SECURITY: Remove hardcoded API key. Abstract to any OpenAI-compatible API. Bilingual triggers. Universal + tech-stack checklists. Built-in L2 fallback. User options. Cross-platform temp dir.

v1.0.0

Initial release: L1/L2/L3 three-tier code quality defense

Metadata

Slug codex-review

Version 2.1.0

License (not listed)

Total installs 5

Current installs 5

Number of released versions 3

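FAQ

What is Codex Review?

Three-tier code quality defense: L1 quick scan, L2 deep audit (via bug-audit), L3 cross-validation with adversarial testing. It is an AI agent skill plugin for Claude Code / OpenClaw, with 346 cumulative downloads to date.

How do I install Codex Review?

Run the command "/install codex-review" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Codex Review free?

Yes. Codex Review is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Codex Review support?

Codex Review is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Codex Review?

It is developed and maintained by abczsl520 (@abczsl520). The current version is v2.1.0.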
