Hermes Agent Health Check
/install hermes-agent-health-check
Audit the architecture and health of a Hermes Agent checkout, fork, or deployment support repo.
Hermes Agent has a connected runtime: agent loop, command registry, CLI, TUI, gateway, skills, memory, cron, tools, plugins, and terminal environments. hermescheck helps keep those surfaces aligned.
When to Use
- You are preparing a Hermes Agent PR and want a repeatable architecture review
- A Hermes fork works in CLI but not gateway, TUI, skills, cron, or plugins
- A new slash command risks drifting across surfaces
- A tool or environment change needs clearer capability boundaries
- Memory, session search, or skill behavior regressed after a refactor
- Startup paths or background jobs became hard to reason about
Quick Start
pip install hermescheck
hermescheck /path/to/hermes-agent
Produces audit_results.json and audit_report.md.
The 12-Layer Stack
| # | Layer | What Goes Wrong |
|---|---|---|
| 1 | System prompt | Conflicting instructions, instruction bloat |
| 2 | Session history | Stale context from previous turns |
| 3 | Long-term memory | Pollution across sessions |
| 4 | Distillation | Compressed artifacts re-entering as pseudo-facts |
| 5 | Active recall | Redundant re-summary layers wasting context |
| 6 | Tool selection | Wrong tool routing, model skips required tools |
| 7 | Tool execution | Hallucinated execution — claims to call but doesn't |
| 8 | Tool interpretation | Misread or ignored tool output |
| 9 | Answer shaping | Format corruption in final response |
| 10 | Platform rendering | UI/API/CLI mutates valid answers |
| 11 | Hidden repair loops | Silent fallback/retry agents running second LLM pass |
| 12 | Persistence | Expired state or cached artifacts reused as live evidence |
Audit Scanners
| # | Scanner | Severity | What It Catches |
|---|---|---|---|
| 1 | Hardcoded Secrets | critical | API keys, tokens, credentials in source code |
| 2 | Tool Enforcement Gap | high | "Must use tool X" in prompt but no code validation |
| 3 | Hidden LLM Calls | high | Secret second-pass LLM calls in fallback/repair loops |
| 4 | Unrestricted Code Execution | critical | exec(), eval(), subprocess(shell=True) without sandbox |
| 5 | Static Bug Inference | high | Code-level bug patterns inferred without runtime execution |
| 6 | Token Usage Budget | high | Large default context windows, full-history prompts, missing thrift controls |
| 7 | Memory Lifecycle Governance | medium | Memory without types, lifecycle, retrieval budgets, decay, or evidence pointers |
| 8 | RAG Pipeline Governance | medium | Retrieval without chunk, top-k, rerank, ingestion, or context budget controls |
| 9 | Self-Evolution Capability | high | Learning loops without external signals, source reading, constraint fit, safe landing, or verification |
| 10 | Loop Safety Budget | high | Tool/agent loops without max-iteration, retry budget, stuck-job, or duplicate-call controls |
| 11 | Plugin / Remote Tool Boundary | high | Executable plugins and MCP/OpenAPI tools without sandbox, schema, allowlist, or approval boundaries |
| 12 | Output Pipeline Mutation | medium | Response transformation corrupting correct answers |
| 13 | Missing Observability | medium | No tracing, logging, cost tracking, or audit trail |
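To make the scanner idea concrete, here is a minimal sketch of how a static check in the spirit of scanner 4 (Unrestricted Code Execution) could work. It is an illustration built on Python's standard `ast` module, not hermescheck's actual implementation; the function name `find_unsafe_calls` and the rule set are hypothetical, and it does not attempt to detect whether a sandbox is present.

```python
import ast

# Hypothetical rule set for illustration; not hermescheck's actual rules.
DANGEROUS_BUILTINS = {"exec", "eval"}


def find_unsafe_calls(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, description) pairs for risky calls in source."""
    hits = []
    # ast.parse only parses the text; the scanned code is never executed.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Bare exec(...) or eval(...)
        if isinstance(func, ast.Name) and func.id in DANGEROUS_BUILTINS:
            hits.append((node.lineno, f"{func.id}() call"))
        # Any call passing the literal shell=True, e.g. subprocess.run(cmd, shell=True)
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append((node.lineno, "call with shell=True"))
    return sorted(hits)


sample = (
    "import subprocess\n"
    "result = eval(user_input)\n"
    "subprocess.run(cmd, shell=True)\n"
)
for line, what in find_unsafe_calls(sample):
    print(f"line {line}: {what}")
```

Working on the syntax tree rather than on raw text avoids false hits from comments and string literals, at the cost of missing dynamically constructed calls.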
Severity Model
| Level | Meaning |
|---|---|
| critical | Agent can confidently produce wrong operational behavior |
| high | Agent frequently degrades correctness or stability |
| medium | Correctness usually survives but output is fragile or wasteful |
| low | Mostly cosmetic or maintainability issues |
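The four levels form an ordering, which can be sketched as an integer enum. The rule that overall health mirrors the worst finding is an assumption made for illustration, as are the numeric values and the names `Severity`, `worst_severity`, and `overall_health_from`; the tool's real scoring may weigh findings differently.

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Higher value means more severe; the numbers themselves are illustrative.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def worst_severity(levels: list[str]) -> Optional[Severity]:
    """Return the most severe level in the list, or None if it is empty."""
    parsed = [Severity[name.upper()] for name in levels]
    return max(parsed, default=None)


def overall_health_from(levels: list[str]) -> str:
    """Assumed rule: overall health mirrors the worst finding."""
    worst = worst_severity(levels) or Severity.LOW
    return f"{worst.name.lower()}_risk"


print(overall_health_from(["medium", "critical", "low"]))
```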
Fix Strategy
Default fix order (code-first, not prompt-first):
- Code-gate tool requirements — enforce in code, not just prompt text
- Remove or narrow hidden repair agents — make fallback explicit with contracts
- Reduce context duplication — same info through prompt + history + memory + distillation
- Tighten memory admission — user corrections > agent assertions
- Tighten distillation triggers — don't compress what shouldn't be compressed
- Reduce rendering mutation — pass-through, don't transform
- Convert to typed JSON envelopes — structured internal flow, not freeform prose
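The last step, typed JSON envelopes, could look roughly like the sketch below. The class `ToolResultEnvelope` and all of its field names are hypothetical, chosen only to show the shape of the idea: internal hand-offs carry a fixed, validated structure instead of freeform prose.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolResultEnvelope:
    # All field names are hypothetical, chosen for illustration.
    tool_name: str
    ok: bool
    payload: dict
    error: Optional[str] = None

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "ToolResultEnvelope":
        data = json.loads(raw)
        expected = {"tool_name", "ok", "payload", "error"}
        # Reject anything that is not exactly the expected object shape.
        if not isinstance(data, dict) or set(data) != expected:
            raise ValueError("envelope must be an object with exactly the expected keys")
        if not isinstance(data["ok"], bool):
            raise ValueError("'ok' must be a boolean")
        return cls(**data)


envelope = ToolResultEnvelope(tool_name="search", ok=True, payload={"hits": 3})
restored = ToolResultEnvelope.from_json(envelope.to_json())
print(restored == envelope)
```

Failing loudly on an unexpected shape is the point: a malformed hand-off surfaces as an error at the boundary rather than as a plausible-looking wrong answer later.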
Report Schema
Reports follow a formal JSON Schema (see references/report-schema.json) with:
- overall_health: critical_risk | high_risk | medium_risk | low_risk
- findings: array of severity-ranked issues with evidence refs
- maturity_score: positive signal ledger, penalty ledger, score formula, and expected recovery directions
- ordered_fix_plan: prioritized fix steps with rationale
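A quick shape check against the four top-level fields listed above might look like the following sketch. It only covers what this page states; the authoritative contract is references/report-schema.json, and the helper name `check_report` is hypothetical. It also assumes that audit_results.json is the file that follows this schema.

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"overall_health", "findings", "maturity_score", "ordered_fix_plan"}
HEALTH_VALUES = {"critical_risk", "high_risk", "medium_risk", "low_risk"}


def check_report(report: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic shape looks right."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(report))]
    health = report.get("overall_health")
    if health is not None and health not in HEALTH_VALUES:
        problems.append(f"unknown overall_health: {health!r}")
    if "findings" in report and not isinstance(report["findings"], list):
        problems.append("findings should be an array")
    return problems


def check_report_file(path: str) -> list[str]:
    """Load a report from disk (assumed to be audit_results.json) and check it."""
    return check_report(json.loads(Path(path).read_text(encoding="utf-8")))
```

For full validation, a JSON Schema validator run against the referenced schema file would be the more reliable route.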
Anti-Patterns to Avoid
- ❌ Saying "the model is weak" without falsifying the wrapper first
- ❌ Saying "memory is bad" without showing the contamination path
- ❌ Letting a clean current state erase a dirty historical incident
- ❌ Treating markdown prose as a trustworthy internal protocol
- ❌ Accepting "must use tool" in prompt text when code never enforces it
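The last anti-pattern has a straightforward code-side counterpart. The sketch below shows one possible gate, assuming the runtime keeps its own log of executed tool names; `ToolGateError`, `enforce_required_tools`, and the tool names are hypothetical and not part of Hermes Agent.

```python
class ToolGateError(RuntimeError):
    """Raised when a turn finishes without a required tool call."""


def enforce_required_tools(required: set[str], executed: list[str]) -> None:
    """Fail the turn if any required tool is absent from the execution log.

    `executed` must come from the runtime's own record of tool invocations,
    not from the model's description of what it did.
    """
    missing = sorted(required - set(executed))
    if missing:
        raise ToolGateError(f"required tools not executed: {', '.join(missing)}")


# Hypothetical tool names for illustration.
enforce_required_tools({"search"}, ["search", "read_file"])  # passes silently

try:
    enforce_required_tools({"search", "verify"}, ["search"])
except ToolGateError as exc:
    print(exc)
```

Because the check reads the execution log rather than the model's text, a response that merely claims to have called a tool still fails the gate.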
Related
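- Make sure OpenClaw is installed (local or Docker deployment)
- Enter the install command in the chat box: /install hermes-agent-health-check
- Once installation finishes, call the Skill by name or use /hermes-agent-health-check to trigger it
- Provide the inputs described in the Skill's parameter documentation to get structured output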
What is Hermes Agent Health Check?
Audit a NousResearch/hermes-agent checkout or fork for Hermes-specific runtime-contract drift, command-surface splits, memory/skill/gateway health, and agent... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 58 downloads to date.
How do I install Hermes Agent Health Check?
Run the command "/install hermes-agent-health-check" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.
Is Hermes Agent Health Check free?
Yes. Hermes Agent Health Check is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does Hermes Agent Health Check support?
Hermes Agent Health Check is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who develops Hermes Agent Health Check?
It is developed and maintained by huangrichao2020 (@huangrichao2020); the current version is v1.1.2.