Chapter 70
Case Study: A Code Review and Auto-Fix Agent
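Chapter Introduction
Code review is one of the most time- and attention-intensive activities in software engineering. Statistics suggest that a mid-sized team spends roughly 15-20% of its engineering time each week on code review, and a sizable share of the resulting comments concern repetitive, convention-level problems: inconsistent naming, unhandled exceptions, missing unit tests, SQL injection risks, and so on. Tools should be finding and fixing these automatically. This chapter builds a complete Hermes code review and auto-fix Agent and wires it into a GitHub Actions CI/CD pipeline, so that every Pull Request gets an in-depth AI audit before a human looks at it. The goal is to cut review cost substantially and improve the overall health of the codebase.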
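70.1 Requirements Analysis: Pain Points of Automated Code Review in CI/CD
Typical Problems with Code Review Today
The traditional PR review flow:
Developer opens a PR
↓
Waits for a reviewer to become available (0.5-2 days)
↓
Reviewer reads the change line by line (30-90 minutes per PR)
↓
Reviewer leaves comments (50% concern repetitive convention issues)
↓
Developer revises → waits again → the cycle repeats...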
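Summary of the core pain points:
| Pain point | Impact | Severity |
|---|---|---|
| Long wait for review | Blocks feature delivery | High |
| Repetitive convention issues consume reviewer attention | Lowers review quality | High |
| Security vulnerabilities may be missed | Risk of production incidents | Critical |
| No shared quality baseline | Uneven quality across the codebase | Medium |
| Large subjective differences between reviewers | Team standards are hard to unify | Medium |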
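Target Capabilities of the Agent
We want this Agent to provide:
- Multi-language support: Python, JavaScript/TypeScript, Go
- Multi-dimensional review: security, performance, readability, conventions
- Automatic fixes: generate a patch directly for issues that have a clear fix
- PR integration: deliver results as GitHub PR Review comments
- Configurable rules: teams can define their own review standards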
70.2 System Architecture
Overall Architecture
┌─────────────────────────────────────────────────────────┐
│                    GitHub Repository                    │
│                                                         │
│   Developer push → Pull Request opened / updated        │
└────────────────────────┬────────────────────────────────┘
                         │ webhook / GitHub Actions trigger
                         ▼
┌─────────────────────────────────────────────────────────┐
│                GitHub Actions CI Runner                 │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │           Code Review Agent Entrypoint           │   │
│  │                                                  │   │
│  │  1. Fetch the PR diff (GitHub API)               │   │
│  │  2. Parse the list of changed files              │   │
│  │  3. Invoke the Hermes Agent for analysis         │   │
│  │  4. Format the results and post a Review         │   │
│  └──────────────────────────────────────────────────┘   │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│                    Hermes Agent Core                    │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │ Static      │  │ Semantic    │  │ Fix generation  │  │
│  │ analysis    │  │ (LLM core)  │  │ tools           │  │
│  │             │  │             │  │                 │  │
│  │ - AST parse │  │ - Security  │  │ - Diff creation │  │
│  │ - Linters   │  │ - Logic     │  │ - Patch apply   │  │
│  │ - Dep check │  │ - Practices │  │ - Suggestions   │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│                 GitHub PR Review Output                 │
│                                                         │
│  • Line-level comments (pinpointing issues)             │
│  • Fix suggestions (code blocks)                        │
│  • Verdict (APPROVE / REQUEST_CHANGES / COMMENT)        │
│  • Auto-committed fix (optional)                        │
└─────────────────────────────────────────────────────────┘
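Tool Set Design
| Tool | Purpose | Input | Output |
|---|---|---|---|
| `get_pr_diff` | Fetch the changes in a PR | PR number | diff text |
| `parse_code_ast` | AST parsing | code content + language | AST tree |
| `run_static_analysis` | Static analysis | file path + language | list of issues |
| `check_security_patterns` | Security pattern check | code snippet | vulnerability report |
| `generate_fix_patch` | Generate a fix patch | issue description + code | unified diff |
| `post_pr_review` | Post PR comments | review data | GitHub API response |
| `search_similar_issues` | Search for similar issues | issue description | past cases |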
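The table lists `parse_code_ast`, but the code shown in this chapter does not include an implementation. As a minimal sketch for the Python case only, assuming the standard-library `ast` module is an acceptable parser and that a compact outline is more useful to the model than a full tree, it might look like the following. The name `summarize_python_ast` and the shape of the returned dictionary are illustrative assumptions, not part of the chapter's code.

```python
import ast


def summarize_python_ast(source: str) -> dict:
    """Return a small, JSON-friendly outline of a Python module (hypothetical helper)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Report unparsable input instead of raising, so the Agent loop keeps running
        return {"error": f"syntax error at line {exc.lineno}: {exc.msg}"}

    outline = {"functions": [], "classes": [], "imports": []}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            outline["functions"].append({
                "name": node.name,
                "line": node.lineno,
                "args": [a.arg for a in node.args.args],
                # end_lineno is available on Python 3.8+
                "length": node.end_lineno - node.lineno + 1,
            })
        elif isinstance(node, ast.ClassDef):
            outline["classes"].append({"name": node.name, "line": node.lineno})
        elif isinstance(node, ast.Import):
            outline["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            outline["imports"].append(node.module or ".")
    return outline
```

The `length` field gives a rule such as a maximum function length something concrete to check. Other languages would need their own parsers, which this sketch does not attempt.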
70.3 Complete Implementation
Project Structure
code-review-agent/
├── agent/
│   ├── __init__.py
│   ├── hermes_agent.py          # Hermes Agent core
│   ├── tools/
│   │   ├── github_tools.py      # GitHub API tools
│   │   ├── analysis_tools.py    # code analysis tools
│   │   └── fix_tools.py         # fix generation tools
│   └── prompts/
│       ├── system_prompt.py
│       └── review_templates.py
├── config/
│   └── review_rules.yaml        # custom review rules
├── .github/
│   └── workflows/
│       └── code_review.yml      # GitHub Actions
└── main.py                      # entry point
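Core Agent Implementation
# agent/hermes_agent.py
import os
import json
from typing import Optional
from openai import OpenAI
from .tools import github_tools, analysis_tools, fix_tools

# Hermes is served through an OpenAI-compatible API
client = OpenAI(
    base_url=os.getenv("HERMES_BASE_URL", "http://localhost:11434/v1"),
    api_key=os.getenv("HERMES_API_KEY", "ollama"),
)
MODEL = os.getenv("HERMES_MODEL", "nous-hermes-2-mixtral-8x7b-dpo")

SYSTEM_PROMPT = """You are a senior software engineer who specializes in code quality and security review.
Your task is to review the code changes in a Pull Request thoroughly, identify potential problems, and propose fixes.
Review dimensions:
1. Security: SQL injection, XSS, unsafe deserialization, hard-coded secrets, etc.
2. Performance: N+1 queries, unnecessary loops, memory leaks, blocking I/O, etc.
3. Maintainability: naming conventions, function complexity, duplicated code, missing comments, etc.
4. Robustness: exception handling, boundary conditions, null handling, etc.
5. Test coverage: whether critical paths are tested
For each issue, provide:
- Location (file name + line number)
- Severity (critical/major/minor/suggestion)
- A clear description of the problem
- Concrete fix code
Use the tools methodically, step by step, and do not skip important files."""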
# Tool definitions
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_pr_diff",
            "description": "Fetch the code-change diff of a Pull Request",
            "parameters": {
                "type": "object",
                "properties": {
                    "pr_number": {"type": "integer", "description": "PR number"},
                    "file_filter": {
                        "type": "string",
                        "description": "File filter such as '*.py' or '*.js'",
                        "default": "*",
                    },
                },
                "required": ["pr_number"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "run_static_analysis",
            "description": "Run static code analysis on the given file",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_content": {"type": "string", "description": "File content"},
                    "language": {
                        "type": "string",
                        "enum": ["python", "javascript", "typescript", "go"],
                        "description": "Programming language",
                    },
                    "filename": {"type": "string", "description": "File name"},
                },
                "required": ["file_content", "language", "filename"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "check_security_patterns",
            "description": "Check code for known security vulnerability patterns",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {"type": "string", "description": "Code to check"},
                    "language": {"type": "string", "description": "Programming language"},
                },
                "required": ["code", "language"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "generate_fix_patch",
            "description": "Generate fixed code for a detected issue",
            "parameters": {
                "type": "object",
                "properties": {
                    "original_code": {"type": "string", "description": "Original code"},
                    "issue_description": {"type": "string", "description": "Description of the issue"},
                    "language": {"type": "string", "description": "Programming language"},
                },
                "required": ["original_code", "issue_description", "language"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "post_pr_review",
            "description": "Publish the review results as a GitHub PR Review",
            "parameters": {
                "type": "object",
                "properties": {
                    "pr_number": {"type": "integer"},
                    "review_body": {"type": "string", "description": "Overall review summary"},
                    "comments": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "path": {"type": "string"},
                                "line": {"type": "integer"},
                                "body": {"type": "string"},
                                "severity": {"type": "string"},
                                # Read by _format_comment() in github_tools.py
                                "fix_code": {"type": "string"},
                            },
                        },
                    },
                    "action": {
                        "type": "string",
                        "enum": ["APPROVE", "REQUEST_CHANGES", "COMMENT"],
                    },
                },
                "required": ["pr_number", "review_body", "comments", "action"],
            },
        },
    },
]
def dispatch_tool(tool_name: str, tool_args: dict) -> str:
    """Dispatch a tool call by name and return the result as a JSON string."""
    tool_map = {
        "get_pr_diff": github_tools.get_pr_diff,
        "run_static_analysis": analysis_tools.run_static_analysis,
        "check_security_patterns": analysis_tools.check_security_patterns,
        "generate_fix_patch": fix_tools.generate_fix_patch,
        "post_pr_review": github_tools.post_pr_review,
    }
    if tool_name not in tool_map:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    try:
        result = tool_map[tool_name](**tool_args)
        return json.dumps(result, ensure_ascii=False)
    except Exception as e:
        # Return the error to the model so it can adjust its next call
        return json.dumps({"error": str(e)})
def run_code_review_agent(pr_number: int, repo: str) -> dict:
    """Run the code review Agent."""
    print(f"[Agent] Starting review of PR #{pr_number} in {repo}")
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": f"""Please perform a full code review of PR #{pr_number} in the GitHub repository {repo}.
Steps:
1. Fetch the code changes in the PR
2. Run static analysis and security checks on every changed file
3. Summarize the issues found, grouped by severity
4. Generate fix suggestions for the issues that can be fixed
5. Post the PR Review comments
Begin the review now.""",
        },
    ]
    # Agentic loop: the model keeps calling tools until it replies in plain text
    max_iterations = 20
    iteration = 0
    while iteration < max_iterations:
        iteration += 1
        print(f"[Agent] Reasoning round {iteration}...")
        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=TOOLS,
            tool_choice="auto",
            temperature=0.1,  # a low temperature keeps reviews consistent
        )
        message = response.choices[0].message
        messages.append(message)
        # No tool calls means the task is finished
        if not message.tool_calls:
            print("[Agent] Review finished")
            return {
                "status": "completed",
                "summary": message.content,
                "iterations": iteration,
            }
        # Execute the requested tool calls
        for tool_call in message.tool_calls:
            tool_name = tool_call.function.name
            try:
                tool_args = json.loads(tool_call.function.arguments)
            except json.JSONDecodeError as e:
                # Report malformed arguments to the model instead of crashing
                result = json.dumps({"error": f"Invalid tool arguments: {e}"})
            else:
                print(f"[Agent] Calling tool: {tool_name}({tool_args})")
                result = dispatch_tool(tool_name, tool_args)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result,
            })
    return {"status": "max_iterations_reached", "iterations": iteration}
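`dispatch_tool` passes the model's arguments straight into the tool function with `**tool_args`, so an extra or missing key surfaces as a `TypeError` whose message is not very helpful to the model. One possible hardening step, sketched below under the assumption that silently dropping unknown keys is acceptable, is to filter the arguments against the function's signature first. The helper name `call_with_known_args` and its return shape are hypothetical.

```python
import inspect


def call_with_known_args(func, raw_args: dict) -> dict:
    """Call func with only the keyword arguments it declares (hypothetical helper).

    Returns {"result": ...} on success, or {"error": ...} when a required
    parameter is missing, instead of letting a TypeError propagate.
    """
    params = inspect.signature(func).parameters
    accepted = {k: v for k, v in raw_args.items() if k in params}
    dropped = sorted(set(raw_args) - set(accepted))
    missing = [
        name
        for name, p in params.items()
        if p.default is inspect.Parameter.empty
        and p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD,
                       inspect.Parameter.KEYWORD_ONLY)
        and name not in accepted
    ]
    if missing:
        return {"error": f"missing required arguments: {missing}", "dropped": dropped}
    return {"result": func(**accepted), "dropped": dropped}
```

For example, if the model adds a `repo` key that `get_pr_diff` does not accept, the call still succeeds and `dropped` records what was ignored, which can be logged or returned to the model.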
GitHub API Tools
# agent/tools/github_tools.py
import fnmatch
import os
import requests

GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
REPO = os.getenv("GITHUB_REPOSITORY")  # "owner/repo" format

def _headers():
    return {
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json",
        "X-GitHub-Api-Version": "2022-11-28"
    }

def get_pr_diff(pr_number: int, file_filter: str = "*") -> dict:
    """Fetch the changed files of a PR together with their patches."""
    url = f"https://api.github.com/repos/{REPO}/pulls/{pr_number}/files"
    result = {"files": [], "total_changes": 0}
    page = 1
    while True:
        # The endpoint is paginated, so request the largest page size and loop
        resp = requests.get(
            url, headers=_headers(),
            params={"per_page": 100, "page": page}, timeout=30,
        )
        resp.raise_for_status()
        files = resp.json()
        for f in files:
            filename = f["filename"]
            # Apply the glob filter, e.g. "*.py"
            if file_filter != "*" and not fnmatch.fnmatch(filename, file_filter):
                continue
            # Skip deleted files
            if f["status"] == "removed":
                continue
            result["files"].append({
                "filename": filename,
                "status": f["status"],
                "additions": f["additions"],
                "deletions": f["deletions"],
                "patch": f.get("patch", ""),
                "raw_url": f.get("raw_url", "")
            })
            result["total_changes"] += f["additions"] + f["deletions"]
        if len(files) < 100:
            break
        page += 1
    return result
def post_pr_review(
    pr_number: int,
    review_body: str,
    comments: list,
    action: str = "COMMENT"
) -> dict:
    """Publish the review as a GitHub PR Review."""
    # Look up the latest commit SHA on the PR head
    pr_url = f"https://api.github.com/repos/{REPO}/pulls/{pr_number}"
    pr_resp = requests.get(pr_url, headers=_headers(), timeout=30)
    pr_resp.raise_for_status()
    commit_id = pr_resp.json()["head"]["sha"]
    # Build the review request
    review_payload = {
        "commit_id": commit_id,
        "body": review_body,
        "event": action,
        "comments": [
            {
                "path": c["path"],
                "line": c["line"],
                "side": "RIGHT",  # comment on the new version of the file
                "body": _format_comment(c)
            }
            for c in comments
            if c.get("line")  # only comments that carry a line number
        ]
    }
    url = f"https://api.github.com/repos/{REPO}/pulls/{pr_number}/reviews"
    resp = requests.post(url, json=review_payload, headers=_headers(), timeout=30)
    resp.raise_for_status()
    return {"success": True, "review_id": resp.json()["id"]}

def _format_comment(comment: dict) -> str:
    """Format a single review comment."""
    severity_emoji = {
        "critical": "🔴",
        "major": "🟠",
        "minor": "🟡",
        "suggestion": "💡"
    }
    emoji = severity_emoji.get(comment.get("severity", "minor"), "💬")
    body = f"{emoji} **[{comment.get('severity', 'comment').upper()}]** {comment['body']}"
    if comment.get("fix_code"):
        body += f"\n\n**Suggested fix:**\n```\n{comment['fix_code']}\n```"
    return body
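`post_pr_review` forwards whatever line numbers the model supplies. GitHub generally accepts a review comment only when its line is part of the PR diff, so a line number the model got slightly wrong can cause the whole review request to be rejected. A minimal sketch of a guard, assuming the `patch` text returned by `get_pr_diff` uses standard unified-diff hunk headers, is to compute the set of new-file line numbers that appear in the patch and drop comments that fall outside it. The helper name `commentable_lines` is hypothetical.

```python
import re

# Matches hunk headers such as "@@ -10,7 +12,9 @@ def foo():"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def commentable_lines(patch: str) -> set:
    """Return the new-file line numbers that appear in a unified-diff patch."""
    lines_in_diff = set()
    new_line = None  # current line number in the new version of the file
    for raw in patch.splitlines():
        header = HUNK_HEADER.match(raw)
        if header:
            new_line = int(header.group(1))
            continue
        if new_line is None or raw.startswith("\\"):
            # Text before the first hunk, or "\ No newline at end of file"
            continue
        if raw.startswith("-"):
            # Removed lines exist only in the old version of the file
            continue
        # Added ("+") and context (" ") lines both exist in the new file
        lines_in_diff.add(new_line)
        new_line += 1
    return lines_in_diff
```

A caller could then keep only the comments where `c["line"] in commentable_lines(patch)` and fold the remaining findings into the review body rather than losing them.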
Static Analysis Tools
# agent/tools/analysis_tools.py
import re
import subprocess
import tempfile
import os
from typing import List, Dict

# Library of security vulnerability patterns
SECURITY_PATTERNS = {
    "python": [
        {
            "pattern": r"eval\s*\(",
            "name": "Dangerous eval() call",
            "severity": "critical",
            "description": "eval() can execute arbitrary code, which creates a code injection risk"
        },
        {
            "pattern": r"exec\s*\(",
            "name": "Dangerous exec() call",
            "severity": "critical",
            "description": "exec() can execute arbitrary code, which creates a code injection risk"
        },
        {
            "pattern": r'f["\'].*SELECT.*\{',
            "name": "SQL injection risk",
            "severity": "critical",
            "description": "SQL built with an f-string is open to injection; use parameterized queries"
        },
        {
            "pattern": r"pickle\.loads?\(",
            "name": "Unsafe deserialization",
            "severity": "major",
            "description": "Unpickling untrusted data can lead to RCE"
        },
        {
            "pattern": r'(password|secret|api_key|token)\s*=\s*["\'][^"\']+["\']',
            "name": "Hard-coded credential",
            "severity": "critical",
            "description": "Secrets should not be hard-coded in source code"
        },
        {
            "pattern": r"shell=True",
            "name": "Shell injection risk",
            "severity": "major",
            "description": "subprocess with shell=True is open to injection when arguments contain user input"
        },
    ],
    "javascript": [
        {
            "pattern": r"eval\s*\(",
            "name": "Dangerous eval() call",
            "severity": "critical",
            "description": "eval() creates XSS and code injection risks"
        },
        {
            "pattern": r"innerHTML\s*=",
            "name": "XSS risk",
            "severity": "major",
            "description": "Assigning to innerHTML can cause XSS; use textContent or DOMPurify"
        },
        {
            "pattern": r"document\.write\(",
            "name": "Unsafe document.write",
            "severity": "major",
            "description": "document.write can enable XSS attacks"
        },
    ],
    "go": [
        {
            "pattern": r'fmt\.Sprintf.*".*SELECT',
            "name": "SQL injection risk",
            "severity": "critical",
            "description": "SQL built with a format string; use parameterized queries"
        },
        {
            # Go runs external commands through os/exec (exec.Command)
            "pattern": r"exec\.Command(Context)?\(",
            "name": "Command injection risk",
            "severity": "major",
            "description": "Validate input before executing external commands"
        },
    ]
}
# TypeScript files are checked with the JavaScript patterns
SECURITY_PATTERNS["typescript"] = SECURITY_PATTERNS["javascript"]
def check_security_patterns(code: str, language: str) -> dict:
    """Scan code line by line against the security pattern library."""
    patterns = SECURITY_PATTERNS.get(language, [])
    issues = []
    lines = code.split("\n")
    for i, line in enumerate(lines, 1):
        for pattern_def in patterns:
            if re.search(pattern_def["pattern"], line, re.IGNORECASE):
                issues.append({
                    "line": i,
                    "line_content": line.strip(),
                    "issue_name": pattern_def["name"],
                    "severity": pattern_def["severity"],
                    "description": pattern_def["description"]
                })
    return {
        "language": language,
        "total_issues": len(issues),
        "issues": issues
    }

def run_static_analysis(
    file_content: str, language: str, filename: str
) -> dict:
    """Run static analysis (linter results plus security patterns)."""
    results = {"filename": filename, "language": language, "issues": []}
    if language == "python":
        results["issues"].extend(_run_python_analysis(file_content, filename))
    elif language in ("javascript", "typescript"):
        results["issues"].extend(_run_js_analysis(file_content, filename))
    elif language == "go":
        results["issues"].extend(_run_go_analysis(file_content, filename))
    # Merge in the security check results
    sec_result = check_security_patterns(file_content, language)
    for issue in sec_result["issues"]:
        results["issues"].append({
            "line": issue["line"],
            "severity": issue["severity"],
            "message": f"[Security] {issue['issue_name']}: {issue['description']}",
            "rule": "security"
        })
    return results
def _run_python_analysis(content: str, filename: str) -> List[Dict]:
    """Python static analysis (runs flake8 on a temporary copy of the file)."""
    issues = []
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".py", delete=False, encoding="utf-8"
    ) as f:
        f.write(content)
        tmp_path = f.name
    try:
        # Run flake8 with a machine-friendly output format
        result = subprocess.run(
            ["flake8", "--max-line-length=100",
             "--format=%(row)d:%(col)d:%(code)s:%(text)s", tmp_path],
            capture_output=True, text=True, timeout=30
        )
        for line in result.stdout.strip().split("\n"):
            if not line:
                continue
            parts = line.split(":", 3)
            if len(parts) >= 4:
                issues.append({
                    "line": int(parts[0]),
                    "col": int(parts[1]),
                    "rule": parts[2],
                    "message": parts[3].strip(),
                    # E = pycodestyle errors, F = pyflakes findings
                    "severity": "major" if parts[2].startswith(("E", "F")) else "minor"
                })
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # flake8 timed out or is not installed: return no linter findings
        pass
    finally:
        os.unlink(tmp_path)
    return issues
def _run_js_analysis(content: str, filename: str) -> List[Dict]:
    """JavaScript/TypeScript static analysis."""
    # Simplified version: basic pattern checks only
    issues = []
    lines = content.split("\n")
    for i, line in enumerate(lines, 1):
        # Check for console.log (should not appear in production code)
        if re.search(r"console\.(log|debug|info)\(", line):
            issues.append({
                "line": i,
                "severity": "minor",
                "message": "Found console output; remove debug logging from production code",
                "rule": "no-console"
            })
        # Check for var declarations (let/const should be used instead)
        if re.search(r"^\s*var\s+", line):
            issues.append({
                "line": i,
                "severity": "minor",
                "message": "Declared with var; prefer let or const",
                "rule": "no-var"
            })
    return issues

def _run_go_analysis(content: str, filename: str) -> List[Dict]:
    """Go static analysis (heuristic check for ignored errors)."""
    issues = []
    lines = content.split("\n")
    for i, line in enumerate(lines, 1):
        # Heuristic: a blank identifier after a comma on a line that also
        # mentions "err" suggests a discarded error return value
        if re.search(r",\s*_\b", line) and "err" in line.lower():
            issues.append({
                "line": i,
                "severity": "major",
                "message": "Error is ignored; handle the returned error value",
                "rule": "errcheck"
            })
    return issues
Fix Generation Tools
# agent/tools/fix_tools.py
import os
import re
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("HERMES_BASE_URL", "http://localhost:11434/v1"),
    api_key=os.getenv("HERMES_API_KEY", "ollama"),
)

def generate_fix_patch(
    original_code: str,
    issue_description: str,
    language: str
) -> dict:
    """Use the LLM to generate fixed code."""
    prompt = f"""You are a professional {language} developer.
The following code has a problem:
{issue_description}
Original code:
```{language}
{original_code}
```
Provide the complete fixed code. Return only the fixed code, with no explanation."""
    response = client.chat.completions.create(
        model=os.getenv("HERMES_MODEL", "nous-hermes-2-mixtral-8x7b-dpo"),
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
        max_tokens=1000
    )
    fixed_code = response.choices[0].message.content
    # If the model wrapped its answer in a code fence, keep only the fenced content
    code_match = re.search(r"```\w*\n(.*?)```", fixed_code, re.DOTALL)
    if code_match:
        fixed_code = code_match.group(1)
    return {
        "original": original_code,
        "fixed": fixed_code,
        "issue": issue_description
    }
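The tool table describes the output of `generate_fix_patch` as a unified diff, while the function above returns the original and fixed code as whole strings. One way to bridge that gap, sketched here with the standard-library `difflib` module, is to derive the diff from the two strings after the model responds. The helper name `to_unified_diff` and the `a/` and `b/` path prefixes are illustrative assumptions.

```python
import difflib


def to_unified_diff(original: str, fixed: str, filename: str) -> str:
    """Render the change from original to fixed as unified-diff text."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        fixed.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    # unified_diff yields nothing when the two inputs are identical
    return "".join(diff)
```

An empty result is a useful signal in its own right: it means the model returned the code unchanged, so there is no fix to propose.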
### GitHub Actions Integration
```yaml
# .github/workflows/code_review.yml
name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize, reopened]
    # Only review changes to these file types
    paths:
      - '**.py'
      - '**.js'
      - '**.ts'
      - '**.go'
# Prevent concurrent reviews of the same PR
concurrency:
  group: code-review-${{ github.event.pull_request.number }}
  cancel-in-progress: true
jobs:
  ai-code-review:
    runs-on: ubuntu-latest
    # Run only on PRs from the same repository, not forks (avoids leaking secrets)
    if: github.event.pull_request.head.repo.full_name == github.repository
    permissions:
      contents: read
      pull-requests: write  # write access is needed to post the Review
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install openai requests flake8
      - name: Run AI Code Review Agent
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY: ${{ github.repository }}
          HERMES_BASE_URL: ${{ secrets.HERMES_BASE_URL }}
          HERMES_API_KEY: ${{ secrets.HERMES_API_KEY }}
          HERMES_MODEL: ${{ vars.HERMES_MODEL || 'nous-hermes-2-mixtral-8x7b-dpo' }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          python main.py
        timeout-minutes: 10
```
Main Entry Point
# main.py
import os
import sys
from agent.hermes_agent import run_code_review_agent

def main():
    pr_number = int(os.getenv("PR_NUMBER", "0"))
    repo = os.getenv("GITHUB_REPOSITORY", "")
    if not pr_number or not repo:
        print("Error: missing required environment variable PR_NUMBER or GITHUB_REPOSITORY")
        sys.exit(1)
    result = run_code_review_agent(pr_number, repo)
    print(f"Review finished: {result}")
    # Exit non-zero when the Agent did not finish within its iteration budget
    if result["status"] == "completed":
        sys.exit(0)
    else:
        sys.exit(1)

if __name__ == "__main__":
    main()
70.4 Multi-Language Support Configuration
Language Detection and Routing
# agent/tools/language_detector.py
import os
from typing import Optional

LANGUAGE_MAP = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".go": "go",
    ".java": "java",
    ".rs": "rust",
    ".rb": "ruby",
    ".php": "php",
}

def detect_language(filename: str) -> Optional[str]:
    """Detect the language from the file extension."""
    _, ext = os.path.splitext(filename.lower())
    return LANGUAGE_MAP.get(ext)

def is_supported_language(filename: str) -> bool:
    """Return True when the file's language has an analyzer in this Agent."""
    lang = detect_language(filename)
    return lang in ("python", "javascript", "typescript", "go")
Custom Review Rule Configuration
# config/review_rules.yaml
# Review rules that each team can customize
general:
  max_function_lines: 50        # maximum lines per function
  max_file_lines: 500           # maximum lines per file
  require_docstrings: true      # whether docstrings are required
  min_test_coverage: 80         # minimum test coverage (%)
python:
  style_guide: "pep8"
  max_complexity: 10            # cyclomatic complexity limit
  forbidden_imports:            # modules that must not be imported
    - "pickle"
    - "marshal"
javascript:
  style_guide: "airbnb"
  allow_var: false
  require_strict_mode: true
go:
  require_error_handling: true
  max_goroutine_depth: 3
security:
  block_pr_on_critical: true    # block the merge when a critical issue is found
  block_pr_on_major: false      # whether a major issue blocks the merge
  require_human_review_patterns:  # force human review when these patterns appear
    - "authentication"
    - "authorization"
    - "payment"
    - "crypto"
    - "password"
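The `security` section implies a decision step: given the issues found, should the review block the merge, and does the change need a human regardless? The chapter's code leaves that choice to the model, so the following is only a sketch of how the rules could be applied deterministically. It assumes the `security` mapping has already been loaded from `review_rules.yaml` into a plain dictionary; the function names `decide_review_action` and `needs_human_review` are hypothetical.

```python
def decide_review_action(issues: list, security_rules: dict) -> str:
    """Map the found issues to a GitHub review event using the security rules."""
    severities = {issue.get("severity") for issue in issues}
    if "critical" in severities and security_rules.get("block_pr_on_critical", True):
        return "REQUEST_CHANGES"
    if "major" in severities and security_rules.get("block_pr_on_major", False):
        return "REQUEST_CHANGES"
    # Non-blocking issues are reported as comments; a clean result is approved
    return "COMMENT" if issues else "APPROVE"


def needs_human_review(changed_text: str, patterns: list) -> bool:
    """Return True when the changed text mentions any sensitive keyword."""
    lowered = changed_text.lower()
    return any(pattern.lower() in lowered for pattern in patterns)
```

A team that does not want an automated approval could replace the final `"APPROVE"` with `"COMMENT"`. Either way, the verdict then depends on configuration rather than on the model's judgment.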
70.5 Sample Output: PR Review Comment Format
## 🤖 AI Code Review Report
**Review summary:**
- 📁 Files scanned: 5
- ✅ Passed: 2 files
- ⚠️ Needs attention: 3 files
- 🔴 Critical issues: 1
- 🟠 Major issues: 3
- 🟡 Minor issues: 7
**Overall rating: ⭐⭐⭐ (3/5) - changes required**
---
### 🔴 Critical Issues (must fix)
**[`api/user.py`, line 42]** SQL injection vulnerability
```python
# ❌ Dangerous: SQL built by string interpolation
query = f"SELECT * FROM users WHERE name = '{username}'"
cursor.execute(query)
# ✅ Fix: use a parameterized query
query = "SELECT * FROM users WHERE name = %s"
cursor.execute(query, (username,))
```
Generated automatically by Hermes Code Review Agent | View configuration
---
## 70.6 Pitfalls and Lessons Learned
### Pitfall 1: Rate Limits
**Problem**: A large PR (100+ files) can hit the GitHub API rate limit within a short time.
**Solution**:
```python
# Intended to live in agent/tools/github_tools.py, next to _headers()
import time
import requests

def get_file_content_with_retry(url: str, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        resp = requests.get(url, headers=_headers(), timeout=30)
        if resp.status_code == 200:
            return resp.text
        if resp.status_code in (403, 429):
            # Rate limited: wait, then retry
            retry_after = int(resp.headers.get("Retry-After", 60))
            print(f"Rate limited, waiting {retry_after} seconds...")
            time.sleep(retry_after)
        else:
            resp.raise_for_status()
    raise RuntimeError("Maximum number of retries exceeded")
```
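Pitfall 2: Unstable Agent Output Format
Problem: The Hermes model sometimes does not follow the tool-call format strictly, which makes parsing fail.
Solution: Add output validation with a graceful fallback:
import ast
import json

def safe_parse_tool_args(raw_args: str) -> dict:
    """Parse tool arguments defensively, tolerating common format errors."""
    try:
        parsed = json.loads(raw_args)
    except json.JSONDecodeError:
        # Common failure: single quotes used in place of double quotes.
        # ast.literal_eval reads such Python-style dicts without corrupting
        # apostrophes inside values, which a blanket quote replacement would.
        try:
            parsed = ast.literal_eval(raw_args)
        except (ValueError, SyntaxError, TypeError):
            return {}
    return parsed if isinstance(parsed, dict) else {}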
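Pitfall 3: Large Files Exceed the Token Limit
Problem: Putting a file of more than 1000 lines directly into the context exceeds the model's token limit.
Solution: Analyze only the changed parts of the diff rather than the whole file:
def extract_changed_sections(patch: str, context_lines: int = 10) -> str:
    """Extract the changed parts of a diff patch, with surrounding context."""
    lines = patch.split("\n")
    # Mark every line within context_lines of an added or removed line
    keep = set()
    for i, line in enumerate(lines):
        if line.startswith("+") or line.startswith("-"):
            start = max(0, i - context_lines)
            end = min(len(lines), i + context_lines + 1)
            keep.update(range(start, end))
    # Join contiguous kept lines into sections so that overlapping windows
    # are emitted once instead of as near-duplicate sections
    sections, current, previous = [], [], None
    for i in sorted(keep):
        if previous is not None and i != previous + 1:
            sections.append("\n".join(current))
            current = []
        current.append(lines[i])
        previous = i
    if current:
        sections.append("\n".join(current))
    return "\n---\n".join(sections)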
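Pitfall 4: False-Positive Security Findings
Problem: The regex patterns also match "dangerous code" that appears only inside comments or strings, which produces many false positives.
Solution: Filter out comments and string literals before matching:
import re

def remove_comments_and_strings(code: str, language: str) -> str:
    """Strip comments and triple-quoted strings before pattern matching."""
    if language == "python":
        # Keep the newlines of removed blocks so reported line numbers stay valid
        def keep_newlines(match):
            return '""' + "\n" * match.group(0).count("\n")
        # Remove triple-quoted strings
        code = re.sub(r'""".*?"""', keep_newlines, code, flags=re.DOTALL)
        code = re.sub(r"'''.*?'''", keep_newlines, code, flags=re.DOTALL)
        # Remove # comments (heuristic: also cuts at a '#' inside an ordinary string)
        code = re.sub(r'#.*$', '', code, flags=re.MULTILINE)
    return code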
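Regex-based stripping cannot tell a `#` inside a string from a real comment. A more precise alternative for Python, offered here as a sketch rather than as the chapter's method, is to let the standard-library `tokenize` module locate comments and string literals and blank them out in place. This version keeps the quote characters and every newline so that line and column positions stay stable, and it deliberately leaves f-strings untouched so that the f-string SQL pattern can still match. The function name `mask_comments_and_strings` is hypothetical.

```python
import io
import tokenize


def mask_comments_and_strings(code: str) -> str:
    """Blank out comment text and string contents, preserving the layout."""
    # Offset of the first character of each line, for (row, col) → index
    line_starts = [0]
    for i, ch in enumerate(code):
        if ch == "\n":
            line_starts.append(i + 1)

    def offset(pos):
        row, col = pos
        return line_starts[row - 1] + col

    try:
        tokens = list(tokenize.generate_tokens(io.StringIO(code).readline))
    except (tokenize.TokenError, SyntaxError):
        return code  # unparsable input: fall back to the unmasked text

    chars = list(code)
    for tok in tokens:
        if tok.type == tokenize.COMMENT:
            start, end = offset(tok.start), offset(tok.end)
        elif tok.type == tokenize.STRING:
            prefix_len = len(tok.string) - len(tok.string.lstrip("rRbBuUfF"))
            if "f" in tok.string[:prefix_len].lower():
                continue  # keep f-strings (older Pythons report them as STRING)
            body = tok.string[prefix_len:]
            quote_len = 3 if body[:3] in ('"""', "'''") else 1
            # Mask only the contents, leaving the prefix and quotes in place
            start = offset(tok.start) + prefix_len + quote_len
            end = offset(tok.end) - quote_len
        else:
            continue
        for i in range(start, end):
            if chars[i] != "\n":
                chars[i] = " "
    return "".join(chars)
```

Because the masked text has the same length and line structure as the input, line numbers reported by `check_security_patterns` on the masked text still point at the right lines of the original file.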
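Chapter Summary
This chapter built a complete code review and auto-fix system on top of the Hermes Agent:
- Requirements: it targets the core problems of traditional manual review, namely repetitive comments and long waiting times
- Architecture: it designs a full pipeline in which GitHub Actions, the Hermes Agent, and multiple tools work together
- Implementation: it provides code covering security detection, static analysis, fix generation, and PR comment publishing
- Engineering: it summarizes solutions to common problems such as rate limits, unstable output formats, token overruns, and false positives
The value of this Agent lies in filtering noise and focusing attention, not in replacing human review: it lets reviewers spend their limited attention on architectural design and business logic.
Review Questions
- How would you design a scoring mechanism that quantifies the agreement between the Agent's reviews and human reviews?
- For security-sensitive code (payments, authentication), should automatic fixes by the Agent be prohibited entirely?
- How can historical PR data be used to keep improving the accuracy of the review Agent?
- How can a multi-language project (for example, a Python backend with a TypeScript frontend) maintain a unified quality baseline?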