Chaos Engineer

Author: lhwa8685 · GitHub ↗ · v0.1.0
cross-platform ⚠ suspicious
Total downloads: 399
Favorites: 0
Current installs: 1
Versions: 1
Install in OpenClaw
/install chaos-engineer
Description
Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience t...
Usage instructions (SKILL.md)

Chaos Engineer

Senior chaos engineer with deep expertise in controlled failure injection, resilience testing, and building systems that get stronger under stress.

Role Definition

You are a senior chaos engineer with 10+ years of experience in reliability engineering and resilience testing. You specialize in designing and executing controlled chaos experiments, managing blast radius, and building organizational resilience through scientific experimentation and continuous learning from controlled failures.

When to Use This Skill

  • Designing and executing chaos experiments
  • Implementing failure injection frameworks (Chaos Monkey, Litmus, etc.)
  • Planning and conducting game day exercises
  • Building blast radius controls and safety mechanisms
  • Setting up continuous chaos testing in CI/CD
  • Improving system resilience based on experiment findings

Core Workflow

  1. System Analysis - Map architecture, dependencies, critical paths, and failure modes
  2. Experiment Design - Define hypothesis, steady state, blast radius, and safety controls
  3. Execute Chaos - Run controlled experiments with monitoring and quick rollback
  4. Learn & Improve - Document findings, implement fixes, enhance monitoring
  5. Automate - Integrate chaos testing into CI/CD for continuous resilience
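As a rough illustration of steps 2 and 3, the sketch below runs a single experiment against an in-memory stand-in for a real system. The `Experiment` class, `run_experiment` function, and the `replicas` dictionary are hypothetical names invented for this example; they are not part of the skill or of any chaos tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Experiment:
    """Hypothetical container for one chaos experiment (illustrative only)."""
    hypothesis: str
    steady_state: Callable[[], bool]  # returns True while the system is healthy
    inject: Callable[[], None]        # starts the failure
    rollback: Callable[[], None]      # undoes the failure


def run_experiment(exp: Experiment) -> dict:
    """Check steady state, inject the failure, re-check, and always roll back."""
    if not exp.steady_state():
        # Do not inject a failure into a system that is already unhealthy.
        return {"hypothesis": exp.hypothesis, "ran": False, "held": None}
    try:
        exp.inject()
        held = exp.steady_state()     # did the hypothesis hold under failure?
    finally:
        exp.rollback()                # runs even if inject or the check raises
    return {"hypothesis": exp.hypothesis, "ran": True, "held": held}


# Usage: a dictionary stands in for a real replica set.
replicas = {"healthy": 3}
exp = Experiment(
    hypothesis="Service stays available with one replica down",
    steady_state=lambda: replicas["healthy"] >= 2,
    inject=lambda: replicas.update(healthy=replicas["healthy"] - 1),
    rollback=lambda: replicas.update(healthy=3),
)
result = run_experiment(exp)
```

Placing the rollback in a `finally` clause is one way to honor the "quick rollback" requirement even when the injection itself fails partway through.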

Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
| --- | --- | --- |
| Experiments | references/experiment-design.md | Designing hypothesis, blast radius, rollback |
| Infrastructure | references/infrastructure-chaos.md | Server, network, zone, region failures |
| Kubernetes | references/kubernetes-chaos.md | Pod, node, Litmus, Chaos Mesh experiments |
| Tools & Automation | references/chaos-tools.md | Chaos Monkey, Gremlin, Pumba, CI/CD integration |
| Game Days | references/game-days.md | Planning, executing, learning from game days |

Constraints

MUST DO

  • Define steady state metrics before experiments
  • Document hypothesis clearly
  • Control blast radius (start small, isolate impact)
  • Enable automated rollback that completes in under 30 seconds
  • Monitor continuously during experiments
  • Ensure zero customer impact in initial experiments
  • Capture all learnings and share
  • Implement improvements from findings

MUST NOT DO

  • Run experiments without hypothesis
  • Skip blast radius controls
  • Test in production without safety nets
  • Ignore monitoring during experiments
  • Vary multiple variables simultaneously (initially)
  • Forget to document learnings
  • Skip team communication
  • Leave systems in degraded state
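Two of the constraints above, limiting blast radius and keeping rollback under 30 seconds, lend themselves to small guard functions. The following is a minimal sketch under stated assumptions: `select_targets` and `timed_rollback` are hypothetical helpers, and the rule of never selecting every candidate is one possible policy, not one prescribed by the skill.

```python
import time
from typing import Callable


def select_targets(candidates: list[str], max_fraction: float) -> list[str]:
    """Pick at most max_fraction of the candidates, and never all of them."""
    if not 0 < max_fraction <= 1:
        raise ValueError("max_fraction must be in (0, 1]")
    limit = max(1, int(len(candidates) * max_fraction))
    # Always leave at least one candidate untouched; a lone target is skipped.
    limit = max(0, min(limit, len(candidates) - 1))
    return sorted(candidates)[:limit]


def timed_rollback(rollback: Callable[[], None], budget_seconds: float = 30.0) -> float:
    """Run rollback and raise if it took longer than the budget.

    This measures the rollback after the fact; it does not interrupt a slow one.
    """
    start = time.monotonic()
    rollback()
    elapsed = time.monotonic() - start
    if elapsed > budget_seconds:
        raise RuntimeError(f"rollback took {elapsed:.1f}s, budget is {budget_seconds}s")
    return elapsed


# Usage: 10% of ten pods yields a single target.
pods = [f"pod-{i}" for i in range(10)]
targets = select_targets(pods, 0.1)
```

Sorting before slicing keeps the selection deterministic, which makes an experiment easier to repeat; a real setup might prefer random selection within the same limit.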

Output Templates

When implementing chaos engineering, provide:

  1. Experiment design document (hypothesis, metrics, blast radius)
  2. Implementation code (failure injection scripts/manifests)
  3. Monitoring setup and alert configuration
  4. Rollback procedures and safety controls
  5. Learning summary and improvement recommendations
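The first deliverable, the experiment design document, can be kept honest with a small structure that refuses to render when a section is missing. This is only a sketch: the `ExperimentDesign` fields and the `render` function are assumptions made for illustration, and the skill does not define a document schema.

```python
from dataclasses import dataclass, fields


@dataclass
class ExperimentDesign:
    """Hypothetical schema for an experiment design document."""
    hypothesis: str
    steady_state_metrics: str
    blast_radius: str
    rollback_procedure: str


def render(design: ExperimentDesign) -> str:
    """Render the design as plain text, rejecting empty sections."""
    lines = []
    for field in fields(design):
        value = getattr(design, field.name).strip()
        if not value:
            raise ValueError(f"missing required section: {field.name}")
        title = field.name.replace("_", " ").capitalize()
        lines.append(f"{title}: {value}")
    return "\n".join(lines)


# Usage with placeholder content.
doc = render(ExperimentDesign(
    hypothesis="Checkout survives the loss of one cache node",
    steady_state_metrics="p99 latency and error rate",
    blast_radius="one cache node in staging",
    rollback_procedure="restart the node and verify cache warm-up",
))
```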

Knowledge Reference

Chaos Monkey, Litmus Chaos, Chaos Mesh, Gremlin, Pumba, toxiproxy, chaos experiments, blast radius control, game days, failure injection, network chaos, infrastructure resilience, Kubernetes chaos, organizational resilience, MTTR reduction, antifragile systems

Safe usage recommendations
This skill legitimately contains destructive chaos-engineering operations, but it omits any declaration of the credentials and privileges those operations require. Before using or installing it:

  1. Treat it as high-risk: run it only in isolated, non-production environments (staging or labs).
  2. Expect to need AWS keys, kubeconfig/EKS access, database connection strings, Gremlin/toxiproxy API keys, Slack webhook secrets, and sudo/root on hosts; do NOT supply production credentials.
  3. Review and approve every command and manifest the skill will run (especially any aws ec2/terminate, kubectl apply, apt-get, /etc/hosts edits).
  4. Ensure automated rollback, kill switches, monitoring, and manual approval gates are in place.
  5. Prefer a version that explicitly lists required env vars, required config paths, and a safer 'dry-run' or simulated mode.

If the author provides an updated metadata manifest that declares required credentials and clearly documents scopes and safeguards, re-evaluate; that would reduce the current incoherence.
Capability analysis
Type: OpenClaw Skill
Name: chaos-engineer
Version: 0.1.0

The bundle provides a comprehensive set of tools for chaos engineering, including scripts for AWS instance termination, Kubernetes node draining, and system-level network manipulation. While these capabilities are aligned with the stated purpose of resilience testing and the SKILL.md includes safety constraints (e.g., blast radius control), they constitute high-risk behaviors. Notable examples include references/infrastructure-chaos.md, which uses sudo to modify /etc/hosts, and references/chaos-tools.md, which contains scripts for terminating EC2 instances via the AWS CLI. These scripts also exhibit potential shell injection vulnerabilities due to the use of f-strings in subprocess calls without rigorous sanitization. No evidence of intentional malice or data exfiltration was found.
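The shell injection concern can be made concrete. Interpolating an untrusted value into a command string that is run through a shell lets input such as `i-12345678; rm -rf /` execute a second command. A common mitigation, sketched below, is to validate the value and pass arguments as a list so that no shell parses them. The `terminate_argv` helper is hypothetical and is not the bundle's actual code; the instance ID pattern (`i-` followed by 8 or 17 hexadecimal characters) is an assumption about the expected format.

```python
import re

# Assumed instance ID format: "i-" plus 8 or 17 lowercase hex characters.
INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def terminate_argv(instance_id: str, dry_run: bool = True) -> list[str]:
    """Build an argument list for the AWS CLI without going through a shell."""
    if not INSTANCE_ID.fullmatch(instance_id):
        raise ValueError(f"refusing suspicious instance id: {instance_id!r}")
    argv = ["aws", "ec2", "terminate-instances", "--instance-ids", instance_id]
    if dry_run:
        # --dry-run asks AWS to check permissions without terminating anything.
        argv.append("--dry-run")
    return argv


# Usage: hand the list to subprocess.run(argv, check=True), leaving shell=False.
argv = terminate_argv("i-0123456789abcdef0")
```

Because each list element reaches the program as a single argument, shell metacharacters in a value are never interpreted, and the validation step rejects malformed IDs before anything runs. Defaulting `dry_run` to `True` is a design choice for this sketch, so that destructive behavior has to be requested explicitly.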
Capability assessment
Purpose & Capability
The name/description (chaos engineering) aligns with the actions in the references (terminating instances, node drains, Litmus/Chaos Mesh manifests, network/DNS tampering, stress tests). However, the skill declares no required env vars/configs while the content expects access to AWS, Kubernetes, the local system (sudo, /etc/hosts), Gremlin/toxiproxy, Prometheus, database connections, Slack webhooks, etc. This is legitimate for the purpose, but the absence of declared required credentials/config paths is an incoherence.
Instruction Scope
The SKILL.md references and embedded examples instruct running destructive or privileged commands: aws cli terminate-instances, boto3 ec2/asg calls, kubectl apply/wait, editing /etc/hosts with sudo, apt-get install stress-ng, pumba/pod kills, modifying load balancers/target groups, and executing simulated connection leaks against DATABASE_URL. It also queries Prometheus and posts to Slack. These instructions read, mutate, or depend on system state and secrets that are not declared or scoped in the skill metadata.
Install Mechanism
There is no install spec (instruction-only), which reduces installer risk (nothing automatically downloaded at install time). However the content contains commands that will install packages or fetch remote manifests at runtime (apt-get, kubectl apply from remote URLs, downloading Litmus YAML). That means runtime actions can modify the host environment even though no installer is declared.
Credentials
The skill metadata declares no required environment variables or credentials, but the referenced scripts/workflows clearly require: AWS credentials (AWS_ACCESS_KEY_ID/SECRET), kubeconfig or EKS access, Gremlin API key/team id, toxiproxy endpoints, DATABASE_URL (Postgres), Prometheus endpoint, Slack webhook secret, and sudo/root privileges for /etc/hosts edits and package installs. The number and sensitivity of these secrets is large and not represented in the metadata.
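Until the metadata declares its requirements, a user can approximate that declaration with a preflight check that reports which variables are missing before any experiment starts. The sketch below is illustrative only: the `REQUIRED` tuple is an assumed subset drawn from the credentials named above (with `AWS_SECRET_ACCESS_KEY` and `KUBECONFIG` as the conventional names for the AWS secret and kubeconfig path), and `missing_vars` is a hypothetical helper.

```python
import os
from typing import Mapping, Sequence

# Assumed subset of what the referenced scripts need; adjust to the scripts in use.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "KUBECONFIG", "DATABASE_URL")


def missing_vars(
    env: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED,
) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


# Usage with an explicit mapping, so the check does not depend on the real environment.
example_env = {"AWS_ACCESS_KEY_ID": "placeholder", "DATABASE_URL": ""}
missing = missing_vars(example_env)
```

Passing the environment as a parameter keeps the check testable and makes it easy to verify that staging values, not production ones, are supplied.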
Persistence & Privilege
The skill is not marked always:true and is user-invocable (normal). It does not request persistent platform-level privileges in metadata, but the instructions explicitly perform privileged actions at runtime (sudo, system package installs, editing /etc/hosts, terminating cloud instances). Those runtime privileges are significant even though the skill does not request persistent presence.
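How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the chat box: /install chaos-engineer
  3. After installation, invoke the Skill by name or trigger it with /chaos-engineer
  4. Provide the inputs described in the Skill's parameter documentation to get structured output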
Version history
v0.1.0
  • Initial release of the chaos-engineer skill focused on controlled failure injection and resilience testing.
  • Designed for use in chaos experiments, implementing failure injection frameworks, conducting game day exercises, and blast radius control.
  • Provides a core workflow covering system analysis, experiment design, execution, learning, and automation.
  • Includes must-do and must-not-do constraints for safe chaos engineering practice.
  • Offers reference guides and output templates for experiment design, implementation, monitoring, rollback, and post-mortem learning.
  • Related to SRE and DevOps skills, with emphasis on antifragile systems and organizational resilience.
Metadata
Slug: chaos-engineer
Version: 0.1.0
License:
Total installs: 1
Current installs: 1
Historical versions: 1
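FAQ

What is Chaos Engineer?

Use when designing chaos experiments, implementing failure injection frameworks, or conducting game day exercises. Invoke for chaos experiments, resilience t... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 399 total downloads to date.

How do I install Chaos Engineer?

Run the command "/install chaos-engineer" in the OpenClaw or Claude Code chat box to install it in one step; installation itself requires no additional configuration.

Is Chaos Engineer free?

Yes. Chaos Engineer is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Chaos Engineer support?

Chaos Engineer is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who developed Chaos Engineer?

It is developed and maintained by lhwa8685 (@lhwa8685); the current version is v0.1.0.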