weixuanjiang

Agent Guru

Author: Weixuan Jiang · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 100 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install agent-guru
Description
Use when building, designing, or reviewing a multi-agent system for production — routing agents, orchestrating subagents, guarding tools with permissions, ma...
Usage (SKILL.md)

Production Agent Design

Core Principle

The LLM is the reasoning engine. Your code is the execution engine. The loop is the contract between them.

Every production concern — safety, cost, retries, logging, permissions — lives in the harness, not the prompt. A prompt that says "be careful with deletions" is a suggestion. A GuardedToolNode that intercepts delete_* calls is a guarantee.
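To make the suggestion-versus-guarantee distinction concrete, here is a minimal sketch of a harness-side guard in plain Python. It is not the skill's GuardedToolNode (which is not reproduced on this page); `ToolRule`, `guarded_call`, and `PermissionDenied` are hypothetical names chosen for illustration, and the allow-by-default behavior is an assumption.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Any, Callable


class PermissionDenied(Exception):
    """Raised by the harness when a rule blocks a tool call."""


@dataclass(frozen=True)
class ToolRule:  # hypothetical; not the skill's PermissionRule
    pattern: str  # glob matched against the tool name, e.g. "delete_*"
    allow: bool   # False means calls matching the pattern are blocked


def guarded_call(
    name: str,
    fn: Callable[..., Any],
    args: dict[str, Any],
    rules: list[ToolRule],
) -> Any:
    """Run fn(**args) unless the first matching rule denies the call."""
    for rule in rules:
        if fnmatchcase(name, rule.pattern):
            if not rule.allow:
                raise PermissionDenied(f"{name} blocked by rule {rule.pattern!r}")
            break  # first matching rule wins
    # Assumption: tools with no matching rule are allowed.
    return fn(**args)
```

The model can still request `delete_user`, but with a `ToolRule("delete_*", allow=False)` in place the call never reaches the tool function. A stricter harness would deny by default when no rule matches.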

When to Use This Skill

  • Designing a new multi-agent system from scratch
  • Adding safety, cost controls, or observability to an existing agent
  • Debugging runaway cost, infinite loops, or context window exhaustion
  • Choosing between single-agent vs multi-agent topology
  • Implementing human-in-the-loop (HITL) for irreversible actions
  • Setting up session persistence and resumption

Architecture at a Glance

INGRESS (HTTP / CLI / Webhook / Schedule)
    │
ROUTER LAYER          — classify intent, dispatch cheaply
    │
ORCHESTRATOR          — decompose tasks, delegate to specialists
    ├── Agent A (scoped tools)
    └── Agent B (scoped tools)
         │
TOOL LAYER            — validate schema → check permission → execute → truncate
         │
CROSS-CUTTING CONCERNS
    ├── MEMORY         (short-term / working / long-term)
    ├── OBSERVABILITY  (traces, cost, session replay)
    └── RESILIENCE     (retry, circuit breaker, loop guard)
         │
PERSISTENCE           — checkpoints (Redis / Postgres) + audit log
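The tool-layer stages in the diagram (validate schema → check permission → execute → truncate) could be sketched as a single function. This is an illustrative outline, not the skill's implementation: `run_tool`, the `registry` shape, the `allowed` set, and the 2,000-character budget are all assumptions.

```python
from typing import Any, Callable

MAX_RESULT_CHARS = 2_000  # assumed budget; tune per model and tool


def run_tool(
    name: str,
    args: dict[str, Any],
    registry: dict[str, tuple[Callable[..., Any], dict[str, type]]],
    allowed: set[str],
) -> str:
    """Validate schema -> check permission -> execute -> truncate."""
    if name not in registry:
        return f"error: unknown tool {name!r}"
    fn, schema = registry[name]

    # 1. Validate: no unexpected arguments, and each declared one has its type.
    unexpected = set(args) - set(schema)
    if unexpected:
        return f"error: unexpected arguments {sorted(unexpected)}"
    for key, expected in schema.items():
        if not isinstance(args.get(key), expected):
            return f"error: argument {key!r} must be {expected.__name__}"

    # 2. Permission: the calling agent only sees its scoped tool subset.
    if name not in allowed:
        return f"error: tool {name!r} is not permitted for this agent"

    # 3. Execute.
    result = str(fn(**args))

    # 4. Truncate before the result re-enters the context window.
    if len(result) > MAX_RESULT_CHARS:
        omitted = len(result) - MAX_RESULT_CHARS
        result = result[:MAX_RESULT_CHARS] + f"\n[truncated {omitted} chars]"
    return result
```

Validation and permission failures are returned as strings so the model can read them as observations and adjust its next step, rather than crashing the loop.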

Single Agent vs Multi-Agent

Task scoped to ONE domain?
  YES → Single ReAct agent with appropriate tools
  NO  → Independent subtasks?
          YES → Parallel multi-agent (supervisor + specialists)
          NO  → Sequential / hierarchical orchestrator
                  │
              Any irreversible step requiring human review?
                YES → Plan-then-execute with HITL interrupt
                NO  → Orchestrator with auto-delegation

Rule: Start with a single agent. Add multi-agent complexity only when you hit a concrete limit — context window size, tool set sprawl, latency, or accuracy.
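The decision tree above can be written as a small function, which is handy for design reviews or for documenting why a topology was chosen. The function name and the three boolean parameters are assumptions made for this sketch; the returned labels mirror the tree.

```python
def choose_topology(
    single_domain: bool,
    independent_subtasks: bool,
    needs_human_review: bool,
) -> str:
    """Walk the single-agent vs multi-agent decision tree."""
    if single_domain:
        return "single ReAct agent"
    if independent_subtasks:
        return "parallel multi-agent (supervisor + specialists)"
    # Sequential / hierarchical orchestrator branch.
    if needs_human_review:
        return "plan-then-execute with HITL interrupt"
    return "orchestrator with auto-delegation"
```

For example, a cross-domain task with dependent steps and one irreversible action resolves to the plan-then-execute branch: `choose_topology(False, False, True)`.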

Framework Selection

| Need | Use |
|------|-----|
| Complex branching, HITL, durable persistence, fine-grained control | LangGraph |
| Simple loop, minimal boilerplate, rapid prototype, leaf agents | Strands |
| Orchestration graph + simple leaf agents | LangGraph + Strands hybrid |

Reference Files

Load these on demand using the triggers listed below. Do not load all of them upfront.

| File | Load when... |
|------|--------------|
| references/router-layer.md | Designing intent routing, building a classifier node, handling misrouting |
| references/orchestrator-layer.md | Decomposing tasks, spawning subagents, implementing plan-then-execute |
| references/tool-safety-layer.md | Designing tools, adding permission rules, implementing HITL or killswitch |
| references/memory-layer.md | Context window approaching limit, adding long-term memory, injecting project context |
| references/observability-layer.md | Adding tracing, tracking token cost, debugging agent behavior, setting up alerts |
| references/resilience-layer.md | Adding retry logic, circuit breakers, preventing infinite loops |
| references/persistence-layer.md | Choosing a checkpointer, implementing session resume, session branching |
| references/production-checklist.md | Before deploying to production (full ~40-point readiness checklist) |

Quick Reference

| Pattern | Key implementation | Reference |
|---------|--------------------|-----------|
| Intent routing | conditional_edges + confidence threshold | router-layer.md |
| Scoped subagents | create_react_agent with tool subset | orchestrator-layer.md |
| Plan-then-execute | Two nodes, read-only tools in plan phase | orchestrator-layer.md |
| Tool schema | args_schema=PydanticModel on @tool | tool-safety-layer.md |
| Permission guard | GuardedToolNode with PermissionRule list | tool-safety-layer.md |
| HITL interrupt | interrupt() + Command(resume=...) | tool-safety-layer.md |
| Runtime concurrency | is_concurrency_safe(input) per tool call | tool-safety-layer.md |
| Abort hierarchy | Query-level abort + sibling-level child abort | tool-safety-layer.md |
| Tiered compaction | budget → snip → microcompact → autocompact | memory-layer.md |
| Auto-compaction | Summarization node at 80% context | memory-layer.md |
| Context injection | AGENT.md loaded into system prompt | memory-layer.md |
| Full trace | BaseCallbackHandler + structured events | observability-layer.md |
| Cost tracking | Per-turn token accounting in callback | observability-layer.md |
| Config snapshot | Freeze all feature flags at query entry | observability-layer.md |
| Diminishing returns | Track token deltas; stop if delta < 500 × 2 | resilience-layer.md |
| Output limit escalation | Escalate to 64k tokens before compaction | resilience-layer.md |
| Streaming cleanup | Tombstone partial messages on fallback | resilience-layer.md |
| Error-as-observation | try/except → ToolMessage | resilience-layer.md |
| Circuit breaker | State machine wrapping tool fn | resilience-layer.md |
| Session resume | Checkpointer + stable thread_id | persistence-layer.md |
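As one illustration of the table's "state machine wrapping tool fn" entry, a circuit breaker might look like the following. This is a generic sketch rather than the code in resilience-layer.md; the class name, the `max_failures` and `reset_after` defaults, and the injectable `clock` are assumptions.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Closed -> open after repeated failures -> half-open after a cooldown."""

    def __init__(
        self,
        fn: Callable[..., Any],
        max_failures: int = 3,       # assumed default
        reset_after: float = 30.0,   # assumed cooldown in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fn = fn
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means closed

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cooldown elapsed: half-open, let one trial call through.
        try:
            result = self.fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping a flaky tool as `search = CircuitBreaker(search_fn)` lets the harness fail fast during an outage instead of spending agent iterations on calls that are likely to fail.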

Gotchas

  • Safety rules must be code, not prompts. A prompt saying "don't delete production data" is not a safety control.
  • Never dump the full parent message history into a subagent. Pass only the specific task and relevant data — context pollution degrades performance and wastes tokens.
  • InMemorySaver is for development only. Use Redis or Postgres checkpointers in production.
  • interrupt() pauses the graph. Resume it by calling graph.invoke(Command(resume=...), config=config) — forgetting this leaves the agent stuck.
  • Tool result truncation is mandatory. Large tool outputs (file reads, search results) will exhaust the context window if not truncated before returning.
  • Always set max_iterations. Without a loop guard, a miscalibrated agent runs indefinitely and incurs unbounded cost.
  • Apply compaction in tiers. Budget tool results → snip → microcompact → autocompact. Jumping straight to full summarization wastes tokens when a cheaper step would suffice.
  • Track diminishing returns, not just token budget. An agent can burn through its iteration budget producing nearly empty continuations. Stop when the last 2 deltas are both below ~500 tokens.
  • Snapshot config at query entry. Never re-read feature flags or env vars mid-turn — a remote config change during a 30-second response causes inconsistent behavior within a single turn.
  • Concurrency safety must be checked at runtime. Schema metadata cannot determine if a bash command is safe — inspect the actual input string at call time. Fail conservatively (serial) if parsing fails.
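The loop-guard and diminishing-returns gotchas above can be combined in a few lines. The sketch below assumes a hypothetical `step` callable that runs one agent turn and returns the number of new tokens it produced; the 500-token threshold and two-turn window come from the text, while the default of 25 iterations is an arbitrary assumption.

```python
from typing import Callable


def should_stop(token_deltas: list[int], threshold: int = 500, window: int = 2) -> bool:
    """True when the last `window` turns each produced fewer than `threshold` tokens."""
    if len(token_deltas) < window:
        return False
    return all(delta < threshold for delta in token_deltas[-window:])


def run_loop(step: Callable[[int], int], max_iterations: int = 25) -> tuple[str, int]:
    """Run agent turns until output dries up or the iteration cap is hit.

    Returns the stop reason and the number of turns executed.
    """
    deltas: list[int] = []
    for i in range(max_iterations):
        deltas.append(step(i))  # hypothetical: one agent turn, returns new tokens
        if should_stop(deltas):
            return "diminishing_returns", i + 1
    return "max_iterations", max_iterations
```

Returning the stop reason alongside the turn count makes it easy to log why a run ended, which helps when debugging runaway cost.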
Security Recommendations
This is a content-rich, instruction-only playbook for production multi-agent systems — it appears coherent with that purpose. Before you copy or run any examples: (1) review and remove hardcoded credentials and replace with secured secrets; (2) sandbox code that reads files (AGENT.md, ~/.agent, /etc) to avoid unintentionally exposing local secrets; (3) validate any remote endpoints before allowing the agent to call them (the remote killswitch example calls an internal config URL); (4) adopt the GuardedToolNode / HITL patterns for any destructive tooling; and (5) if you need higher assurance, ask the publisher for provenance (homepage, repo) or run the code in an isolated dev environment. If you want a deeper risk review, provide the publisher/source URL or say which code snippets you intend to reuse.
Functional Analysis
Type: OpenClaw Skill Name: agent-guru Version: 1.0.0 The skill bundle is a comprehensive architectural guide and reference library for building production-grade, safe, and observable multi-agent systems using frameworks like LangGraph. It contains high-quality code examples for critical safety patterns such as human-in-the-loop (HITL) interrupts, permission guards (GuardedToolNode), context window management (auto-compaction), and cost tracking. There is no evidence of malicious intent, data exfiltration, or obfuscation; rather, the content focuses on preventing common agent failure modes like infinite loops and unauthorized tool execution.
Capability Assessment
Purpose & Capability
The name/description (production multi-agent design) aligns with the content: detailed architecture patterns, tooling, and code examples for routing, orchestration, safety, memory, observability and persistence. It does not request unrelated credentials, binaries, or installs.
Instruction Scope
SKILL.md and the reference files contain runnable examples that read local files (e.g., AGENT.md from working_dir, ~/.agent, /etc/agent/global), connect to DBs/Redis/Postgres (example connection strings), spin up an HTTP endpoint, and fetch remote policy via httpx.get. Those are appropriate for the stated purpose (production agent harnesses) but they do instruct accessing filesystem and network resources — review and sandbox any copied examples before running.
Install Mechanism
Instruction-only skill with no install spec or shipped code — lowest install risk. Example pip install lines appear in docs (langgraph, langgraph-supervisor) but no code is downloaded by the skill itself.
Credentials
The skill does not declare required env vars or credentials, but examples reference environment-driven config (os.getenv), DB URLs, Redis/Postgres connection examples, and snapshotting of MAX_OUTPUT_TOKENS etc. These are reasonable for production guidance but you should not copy hardcoded credentials (e.g., 'postgresql://user:pass@db:5432/agents') into real deployments and should limit which env vars or secrets are used.
Persistence & Privilege
always is false and there is no install-time persistence or privileged modification of other skills. The guidance describes persistent components (checkpointers, vector stores) that are normal in production — the skill itself does not request permanent platform privileges.
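How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install agent-guru
  3. Once installation finishes, invoke the skill by name or trigger it with /agent-guru.
  4. Provide the inputs described in the skill's parameter notes to get structured output.
Version History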
v1.0.0
Initial release: comprehensive multi-agent production design patterns and reference for LangGraph-based frameworks.
  • Introduces best practices and architectural patterns for scalable, safe, and observable agent systems.
  • Includes decision trees for agent topology and framework selection (LangGraph, Strands).
  • Provides modular, on-demand reference files for each system layer (routing, orchestrator, tools, memory, observability, resilience, persistence, checklist).
  • Documents concrete implementation tips, safeguards, and gotchas for production reliability and cost control.
  • Emphasizes code-level enforcement of safety, memory management, error handling, and concurrency controls.
Metadata
Slug agent-guru
Version 1.0.0
License MIT-0
Total installs 0
Current installs 0
Versions 1
FAQ

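What is Agent Guru?

Use when building, designing, or reviewing a multi-agent system for production — routing agents, orchestrating subagents, guarding tools with permissions, ma... It is an AI agent skill plugin for Claude Code / OpenClaw, with 100 total downloads to date.

How do I install Agent Guru?

Run the command "/install agent-guru" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is needed.

Is Agent Guru free?

Yes. Agent Guru is completely free and released under the MIT-0 license, so it can be freely downloaded, installed, and used.

Which platforms does Agent Guru support?

Agent Guru is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Agent Guru?

It is developed and maintained by Weixuan Jiang (@weixuanjiang); the current version is v1.0.0.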
