Chapter 70

Case Study: Code Review and Auto-Fix Agent

Chapter Introduction

Code review is one of the most time-consuming and cognitively demanding phases of software engineering. Studies show that mid-sized engineering teams spend 15–20% of their weekly development time on code review, and a significant portion of review comments address repetitive, mechanical issues: naming violations, unhandled exceptions, missing tests, SQL injection risks. These problems should be caught and fixed automatically. This chapter builds a complete Hermes Code Review & Auto-Fix Agent integrated with GitHub Actions CI/CD, giving every Pull Request an AI-powered audit pass before human review begins.


70.1 Requirements: The Pain Points of Automated Code Review

The Traditional PR Review Bottleneck

Traditional PR review flow:
Developer opens PR
    ↓
Wait for reviewer availability (0.5–2 days)
    ↓
Reviewer reads code line-by-line (30–90 min/PR)
    ↓
Leave comments (50% are repetitive style/safety issues)
    ↓
Developer fixes → waits again → repeat...

Core pain point summary:

| Pain Point | Impact | Severity |
|---|---|---|
| Long review wait times | Blocks feature delivery | High |
| Repetitive issues drain reviewer attention | Reduces review quality | High |
| Security vulnerabilities may slip through | Production incident risk | Critical |
| No unified quality baseline | Inconsistent codebase quality | Medium |
| Subjective reviewer differences | Hard to enforce team standards | Medium |

Agent Target Capabilities

The agent should:

  1. Multi-language support: Python, JavaScript/TypeScript, Go
  2. Multi-dimensional review: Security, performance, readability, conventions
  3. Auto-fix: Generate patches for clearly fixable issues
  4. PR integration: Output results as GitHub PR Review comments
  5. Configurable rules: Teams define their own standards

70.2 System Architecture

High-Level Architecture

┌─────────────────────────────────────────────────────────┐
│                    GitHub Repository                    │
│                                                         │
│  Developer Push → Pull Request created/updated          │
└────────────────────────┬────────────────────────────────┘
                         │ webhook / GitHub Actions trigger
                         ▼
┌─────────────────────────────────────────────────────────┐
│                GitHub Actions CI Runner                 │
│                                                         │
│  1. Fetch PR diff (GitHub API)                          │
│  2. Parse changed file list                             │
│  3. Invoke Hermes Agent for analysis                    │
│  4. Format results and publish Review                   │
└────────────────────────┬────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│                    Hermes Agent Core                    │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────┐  │
│  │  Static     │  │  Semantic   │  │   Fix           │  │
│  │  Analysis   │  │  (LLM Core) │  │   Generation    │  │
│  │  Toolset    │  │             │  │   Toolset       │  │
│  │             │  │ - Security  │  │                 │  │
│  │ - AST parse │  │ - Logic     │  │ - diff gen      │  │
│  │ - Linter    │  │ - Best prac │  │ - patch apply   │  │
│  │ - Dep scan  │  │ - Review    │  │ - suggestions   │  │
│  └─────────────┘  └─────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────┐
│                 GitHub PR Review Output                 │
│                                                         │
│  • Line-level comments (issue location)                 │
│  • Fix suggestions (code blocks)                        │
│  • Overall decision (APPROVE / REQUEST_CHANGES)         │
│  • Optional: auto-commit fix patches                    │
└─────────────────────────────────────────────────────────┘
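
The output stage at the bottom of the diagram can be sketched as two small pure functions that turn a list of issues into the JSON body for GitHub's "create a review for a pull request" endpoint. This is only a sketch: the function names, the issue fields (`path`, `line`, `severity`, `message`), and the policy that critical or major issues block the PR are assumptions, not something the chapter specifies.

```python
from typing import Any

# Assumed policy: any critical or major issue blocks the PR.
BLOCKING = {"critical", "major"}


def decide_action(issues: list[dict[str, Any]]) -> str:
    """Map issue severities to a GitHub review event."""
    if any(i["severity"] in BLOCKING for i in issues):
        return "REQUEST_CHANGES"
    return "COMMENT" if issues else "APPROVE"


def build_review_payload(summary: str, issues: list[dict[str, Any]]) -> dict[str, Any]:
    """Build the JSON body for a pull request review.

    Each issue is assumed to carry `path`, `line`, `severity`, and `message`.
    """
    comments = [
        {
            "path": i["path"],
            "line": i["line"],
            "side": "RIGHT",  # comment on the new version of the file
            "body": f"**{i['severity']}**: {i['message']}",
        }
        for i in issues
    ]
    return {"body": summary, "event": decide_action(issues), "comments": comments}
```

Keeping the payload construction free of network calls makes the decision policy easy to unit-test. Depending on repository settings, a workflow token may not be permitted to approve pull requests, in which case returning COMMENT for clean PRs is the safer choice.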

Tool Inventory

| Tool Name | Purpose | Input | Output |
|---|---|---|---|
| get_pr_diff | Fetch PR changes | PR number | diff text |
| parse_code_ast | AST parse | code + language | AST tree |
| run_static_analysis | Lint + static checks | file + language | issue list |
| check_security_patterns | Security pattern match | code snippet | vuln report |
| generate_fix_patch | Generate fix patch | issue + code | unified diff |
| post_pr_review | Publish PR comments | review data | GitHub API response |
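
The inventory lists parse_code_ast, but the chapter does not show an implementation. A minimal sketch for the Python case, using the standard-library `ast` module, might look like the following. The return shape (a summary of functions and imports rather than a full tree) is an assumption, and other languages would need their own parsers.

```python
import ast


def parse_code_ast(code: str, language: str) -> dict:
    """Summarize the structure of a source file.

    Only Python is handled in this sketch.
    """
    if language != "python":
        return {"error": f"AST parsing not implemented for {language}"}
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return {"error": f"syntax error at line {exc.lineno}: {exc.msg}"}

    functions = []
    imports = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append({
                "name": node.name,
                "line": node.lineno,
                "args": [a.arg for a in node.args.args],
                "has_docstring": ast.get_docstring(node) is not None,
            })
        elif isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.add(node.module)

    functions.sort(key=lambda f: f["line"])
    return {"functions": functions, "imports": sorted(imports)}
```

A compact summary like this is cheaper to pass back to the model than a serialized tree, and fields such as `has_docstring` feed directly into the maintainability checks.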

70.3 Full Implementation

Project Structure

code-review-agent/
├── agent/
│   ├── hermes_agent.py
│   ├── tools/
│   │   ├── github_tools.py
│   │   ├── analysis_tools.py
│   │   └── fix_tools.py
│   └── prompts/
│       └── system_prompt.py
├── config/
│   └── review_rules.yaml
├── .github/
│   └── workflows/
│       └── code_review.yml
└── main.py

Core Agent

# agent/hermes_agent.py
import os
import json
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("HERMES_BASE_URL", "http://localhost:11434/v1"),
    api_key=os.getenv("HERMES_API_KEY", "ollama"),
)
MODEL = os.getenv("HERMES_MODEL", "nous-hermes-2-mixtral-8x7b-dpo")

SYSTEM_PROMPT = """You are a senior software engineer specializing in code quality and security review.
Your task is to thoroughly review Pull Request changes and provide actionable feedback.

Review dimensions:
1. Security: SQL injection, XSS, unsafe deserialization, hardcoded secrets
2. Performance: N+1 queries, unnecessary loops, memory leaks, blocking I/O
3. Maintainability: naming, function complexity, duplication, missing docs
4. Robustness: error handling, edge cases, null checks
5. Test coverage: critical paths covered

For each issue, provide:
- Location (filename + line number)
- Severity (critical/major/minor/suggestion)
- Clear description of the problem
- Concrete fix code

Work systematically through each changed file."""

TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_pr_diff",
            "description": "Fetch the code diff for a Pull Request",
            "parameters": {
                "type": "object",
                "properties": {
                    "pr_number": {"type": "integer"},
                    "file_filter": {"type": "string", "default": "*"}
                },
                "required": ["pr_number"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "run_static_analysis",
            "description": "Run static analysis on a file",
            "parameters": {
                "type": "object",
                "properties": {
                    "file_content": {"type": "string"},
                    "language": {
                        "type": "string",
                        "enum": ["python", "javascript", "typescript", "go"]
                    },
                    "filename": {"type": "string"}
                },
                "required": ["file_content", "language", "filename"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "check_security_patterns",
            "description": "Check for security vulnerability patterns",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {"type": "string"},
                    "language": {"type": "string"}
                },
                "required": ["code", "language"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "generate_fix_patch",
            "description": "Generate a fix patch for a discovered issue",
            "parameters": {
                "type": "object",
                "properties": {
                    "original_code": {"type": "string"},
                    "issue_description": {"type": "string"},
                    "language": {"type": "string"}
                },
                "required": ["original_code", "issue_description", "language"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "post_pr_review",
            "description": "Post review comments to the GitHub PR",
            "parameters": {
                "type": "object",
                "properties": {
                    "pr_number": {"type": "integer"},
                    "review_body": {"type": "string"},
                    "comments": {"type": "array"},
                    "action": {
                        "type": "string",
                        "enum": ["APPROVE", "REQUEST_CHANGES", "COMMENT"]
                    }
                },
                "required": ["pr_number", "review_body", "comments", "action"]
            }
        }
    }
]


def run_code_review_agent(pr_number: int, repo: str) -> dict:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {
            "role": "user",
            "content": f"""Please review PR #{pr_number} in repository {repo}.

Steps:
1. Fetch the PR code changes
2. Run static analysis and security checks on each changed file
3. Summarize issues by severity
4. Generate fix suggestions for fixable issues
5. Post the PR Review

Begin the review."""
        }
    ]

    max_iterations = 20
    for iteration in range(max_iterations):
        response = client.chat.completions.create(
            model=MODEL,
            messages=messages,
            tools=TOOLS,
            tool_choice="auto",
            temperature=0.1,
        )

        message = response.choices[0].message
        messages.append(message)

        if not message.tool_calls:
            return {"status": "completed", "summary": message.content, "iterations": iteration + 1}

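        for tool_call in message.tool_calls:
            # Report malformed arguments or tool failures back to the model
            # instead of crashing the whole review run.
            try:
                tool_args = json.loads(tool_call.function.arguments)
                result = _dispatch_tool(tool_call.function.name, tool_args)
            except Exception as exc:
                result = {"error": f"{type(exc).__name__}: {exc}"}
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })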

    return {"status": "max_iterations_reached"}


def _dispatch_tool(name: str, args: dict):
    from .tools import github_tools, analysis_tools, fix_tools
    dispatch = {
        "get_pr_diff": github_tools.get_pr_diff,
        "run_static_analysis": analysis_tools.run_static_analysis,
        "check_security_patterns": analysis_tools.check_security_patterns,
        "generate_fix_patch": fix_tools.generate_fix_patch,
        "post_pr_review": github_tools.post_pr_review,
    }
    return dispatch[name](**args) if name in dispatch else {"error": f"unknown tool: {name}"}
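
The dispatcher refers to fix_tools.generate_fix_patch, which the chapter does not show. The part of that tool that does not depend on the model is rendering the patch itself. A minimal sketch with the standard-library `difflib` follows. `render_patch` is a hypothetical helper, and it assumes the corrected code has already been produced elsewhere, for example by a model call.

```python
import difflib


def render_patch(original_code: str, fixed_code: str, filename: str = "snippet") -> str:
    """Return a unified diff that turns original_code into fixed_code."""
    diff = difflib.unified_diff(
        original_code.splitlines(keepends=True),
        fixed_code.splitlines(keepends=True),
        fromfile=f"a/{filename}",
        tofile=f"b/{filename}",
    )
    # An empty string means the two versions are identical.
    return "".join(diff)
```

For example, `render_patch("x = eval(s)\n", "x = int(s)\n", "app.py")` yields a diff with one removed and one added line under `--- a/app.py` and `+++ b/app.py` headers.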

Security Pattern Checker

# agent/tools/analysis_tools.py
import re

SECURITY_PATTERNS = {
    "python": [
        {"pattern": r"eval\s*\(", "name": "Dangerous eval()", "severity": "critical",
         "description": "eval() executes arbitrary code: code injection risk"},
        {"pattern": r'f".*SELECT.*\{', "name": "SQL Injection", "severity": "critical",
         "description": "f-string SQL concatenation: use parameterized queries"},
        {"pattern": r"pickle\.loads?\(", "name": "Unsafe deserialization", "severity": "major",
         "description": "pickle.loads on untrusted data can lead to RCE"},
        {"pattern": r'(password|secret|api_key)\s*=\s*["\'][^"\']+["\']',
         "name": "Hardcoded credential", "severity": "critical",
         "description": "Secrets must not be hardcoded in source"},
    ],
    "javascript": [
        {"pattern": r"eval\s*\(", "name": "Dangerous eval()", "severity": "critical",
         "description": "eval() creates XSS and code injection vectors"},
        {"pattern": r"innerHTML\s*=", "name": "XSS risk", "severity": "major",
         "description": "Direct innerHTML assignment: use textContent or DOMPurify"},
    ],
    "go": [
        {"pattern": r'fmt\.Sprintf.*".*SELECT', "name": "SQL Injection", "severity": "critical",
         "description": "String-formatted SQL: use parameterized queries"},
    ]
}

def check_security_patterns(code: str, language: str) -> dict:
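    # TypeScript reuses the JavaScript patterns; without this alias it would get no checks.
    lookup = "javascript" if language == "typescript" else language
    patterns = SECURITY_PATTERNS.get(lookup, [])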
    issues = []
    for i, line in enumerate(code.split("\n"), 1):
        for p in patterns:
            if re.search(p["pattern"], line, re.IGNORECASE):
                issues.append({"line": i, "line_content": line.strip(),
                               "issue_name": p["name"], "severity": p["severity"],
                               "description": p["description"]})
    return {"language": language, "total_issues": len(issues), "issues": issues}


def run_static_analysis(file_content: str, language: str, filename: str) -> dict:
    issues = []
    sec = check_security_patterns(file_content, language)
    for issue in sec["issues"]:
        issues.append({
            "line": issue["line"],
            "severity": issue["severity"],
            "message": f"[Security] {issue['issue_name']}: {issue['description']}",
            "rule": "security"
        })
    # Additional linter integration would go here
    return {"filename": filename, "language": language, "issues": issues}
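
The linter hook mentioned in the comment above could, for Python files, start from flake8, which the workflow later installs. The sketch below only parses flake8's default output format (`path:row:col: CODE message`) into the same issue shape that run_static_analysis returns. The function name and the severity mapping are assumptions.

```python
import re

# flake8's default output format: path:row:col: CODE message
_FLAKE8_LINE = re.compile(
    r"^(?P<path>.+?):(?P<row>\d+):(?P<col>\d+): (?P<code>\w+) (?P<text>.*)$"
)


def parse_flake8_output(output: str) -> list[dict]:
    """Convert flake8 text output into issue dicts; unmatched lines are skipped."""
    issues = []
    for line in output.splitlines():
        m = _FLAKE8_LINE.match(line)
        if not m:
            continue
        code = m.group("code")
        issues.append({
            "line": int(m.group("row")),
            # Assumed mapping: pyflakes findings (F...) and E9 syntax errors are major.
            "severity": "major" if code.startswith(("F", "E9")) else "minor",
            "message": f"[Lint] {code}: {m.group('text')}",
            "rule": code,
        })
    return issues
```

The raw text could come from something like `subprocess.run(["flake8", path], capture_output=True, text=True).stdout`. Note that flake8 exits with a non-zero status when it reports findings, so the exit code alone should not be treated as a failure.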

GitHub Actions Workflow

# .github/workflows/code_review.yml
name: AI Code Review

on:
  pull_request:
    types: [opened, synchronize, reopened]
    paths: ['**.py', '**.js', '**.ts', '**.go']

concurrency:
  group: code-review-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  ai-code-review:
    runs-on: ubuntu-latest
    if: github.event.pull_request.head.repo.full_name == github.repository
    permissions:
      contents: read
      pull-requests: write

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - run: pip install openai requests flake8

      - name: Run AI Code Review Agent
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY: ${{ github.repository }}
          HERMES_BASE_URL: ${{ secrets.HERMES_BASE_URL }}
          HERMES_API_KEY: ${{ secrets.HERMES_API_KEY }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: python main.py
        timeout-minutes: 10
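
The workflow runs main.py, which the chapter does not list. A minimal sketch of that entry point, assuming it only needs the PR_NUMBER and GITHUB_REPOSITORY variables set in the workflow, could look like this. `load_settings` is a hypothetical helper, and failing the CI step when the iteration cap is hit is an assumed policy.

```python
# main.py (sketch)
import os
from collections.abc import Mapping


def load_settings(env: Mapping[str, str]) -> tuple[int, str]:
    """Read the PR number and repository name from environment variables."""
    try:
        pr_number = int(env["PR_NUMBER"])
        repo = env["GITHUB_REPOSITORY"]
    except (KeyError, ValueError) as exc:
        raise SystemExit(f"invalid or missing environment variable: {exc}")
    return pr_number, repo


def main() -> int:
    pr_number, repo = load_settings(os.environ)
    # Imported here so load_settings can be tested without the agent package.
    from agent.hermes_agent import run_code_review_agent

    result = run_code_review_agent(pr_number, repo)
    print(result.get("summary") or result["status"])
    # Assumption: hitting the iteration cap should fail the CI step.
    return 0 if result["status"] == "completed" else 1
```

In the real file this would end with the usual `if __name__ == "__main__":` guard calling `sys.exit(main())`, so the step's exit status reflects the review outcome.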

70.4 Pitfalls & Solutions

Pitfall 1: GitHub API Rate Limiting

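Large PRs (100+ files) can quickly exhaust the GitHub API rate limit. Solution: retry with backoff, honoring the Retry-After header when GitHub sends one and backing off exponentially otherwise.

import time
import requests

def get_with_retry(url, headers, max_retries=3):
    """GET with retries on GitHub rate-limit responses (403 or 429)."""
    for attempt in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.ok:
            return resp
        if resp.status_code in (403, 429):
            # Prefer the server's hint; otherwise wait 5s, 10s, 20s, ...
            retry_after = resp.headers.get("Retry-After")
            wait = int(retry_after) if retry_after else 5 * 2 ** attempt
            time.sleep(wait)
        else:
            resp.raise_for_status()
    raise RuntimeError("Max retries exceeded")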

Pitfall 2: Unstable Tool-Call Format

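Hermes occasionally produces malformed JSON in tool arguments, most often a Python-style dict with single quotes. Blindly replacing quotes corrupts strings that contain apostrophes, so fall back to a safe literal parse instead:

import ast
import json

def safe_parse_args(raw: str) -> dict:
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback for Python-style dicts with single-quoted keys and strings.
        # literal_eval raises ValueError or SyntaxError if this also fails.
        return ast.literal_eval(raw)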

Pitfall 3: Token Limit on Large Files

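Never pass full file content; analyze only the changed lines of each diff plus surrounding context, merging overlapping windows so the extract never grows larger than the patch itself:

def extract_changed_sections(patch: str, context: int = 10) -> str:
    lines = patch.split("\n")
    ranges = []  # merged [start, end) windows around changed lines
    for i, line in enumerate(lines):
        if line.startswith(("+", "-")):
            s, e = max(0, i - context), min(len(lines), i + context + 1)
            if ranges and s <= ranges[-1][1]:
                ranges[-1][1] = max(ranges[-1][1], e)  # extend the previous window
            else:
                ranges.append([s, e])
    return "\n---\n".join("\n".join(lines[s:e]) for s, e in ranges)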

Pitfall 4: False Positives in Comments/Strings

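Pattern matching fires on comments and docstrings containing "dangerous" keywords. Strip them first, preserving line breaks so that reported line numbers still match the original file:

import re

def strip_comments(code: str, language: str) -> str:
    if language == "python":
        # Blank out triple-quoted strings but keep their newlines.
        code = re.sub(r'("""|\'\'\')(.*?)\1',
                      lambda m: '""' + "\n" * m.group(0).count("\n"), code, flags=re.DOTALL)
        code = re.sub(r'#.*$', '', code, flags=re.MULTILINE)
    return code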
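
A regex such as `#.*$` also truncates string literals that contain a hash character, for example a color value like "#fff". As one possible refinement for Python sources, the standard-library `tokenize` module can identify real comment tokens. The sketch below (the function name is hypothetical) blanks only those, leaving strings untouched so that the credential and SQL patterns can still match.

```python
import io
import tokenize


def mask_python_comments(source: str) -> str:
    """Replace comment text with spaces, leaving strings and layout intact."""
    lines = source.split("\n")
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            if tok.type == tokenize.COMMENT:
                row, start = tok.start
                _, end = tok.end
                line = lines[row - 1]
                # Same-length replacement keeps columns and line numbers stable.
                lines[row - 1] = line[:start] + " " * (end - start) + line[end:]
    except (tokenize.TokenError, SyntaxError):
        # Incomplete snippets (for example a bare diff hunk) may not tokenize;
        # fall back to the unmodified source rather than guessing.
        return source
    return "\n".join(lines)
```

Because diff hunks are often not valid standalone Python, the fallback path matters in practice: this approach works best on full file contents, with the regex version as a rough alternative for fragments.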

Chapter Summary

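This chapter built a Hermes-powered code review and auto-fix agent: a tool-calling review loop, a regex-based security pattern checker, and a GitHub Actions workflow that runs the agent on every pull request, along with four practical pitfalls and their fixes.

The agent's value is not to replace human review but to filter noise, so reviewers focus on architecture and business logic, not console.log left in production.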

Discussion Questions

  1. How would you design a metric to measure the alignment rate between AI review and human review?
  2. Should the agent be allowed to auto-merge PRs if it finds no critical issues?
  3. How can historical PR data be used to fine-tune the review agent over time?
  4. In a polyglot project (Python backend + TypeScript frontend), how do you unify the quality baseline?