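Description

An automated iteration lab for engineering problems. Given an idea or an engineering problem, it researches approaches, designs and implements a solution, validates the result, and iterates, storing the results in Notion. Trigger phrases: "idea-lab", "实验一下" (run an experiment), "帮我验证" (help me validate), "迭代优化" (iterate and optimize), "idea 验证" (idea validation). Use this skill when the user raises an engineering problem and wants the research → design → validate → iterate loop automated.

Usage (SKILL.md)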

Idea Storm

Name: Idea Storm
Author: c4chuan

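A closed design → validate → iterate loop for engineering problems. It runs in the background and does not block the main session.

Runtime architecture

The skill uses a segmented spawn model: the work between two checkpoints runs in its own sub-agent, and state is passed between segments through files.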

Main session                                  Sub-agent (isolated)
  │                                             │
  ├─ create experiment.yaml                     │
  ├─ spawn("idea-storm: research+design") ────→ │
  │   (keep chatting)                           ├─ Phase 2: Research
  │                                             ├─ Phase 3: Design
  │                                             ├─ update experiment.yaml
  │  ◄── announce design summary ───────────────┤  ✅ Checkpoint 1
  │                                             └─ (exit)
  │
  ├─ user approves the design
  ├─ spawn("idea-storm: implement+validate") ─→ │
  │   (keep chatting)                           ├─ read experiment.yaml to restore state
  │                                             ├─ Phase 4: Implementation
  │                                             ├─ Phase 5: Validation
  │                                             ├─ Phase 6: Evaluation
  │                                             ├─ update experiment.yaml
  │  ◄── announce iteration results ────────────┤  ✅ Checkpoint 2
  │                                             └─ (exit)
  │
  ├─ user decides (iterate again / converge)
  ├─ spawn("idea-storm: iteration N") ────────→    ... (repeat until converged)
  │
  ├─ spawn("idea-storm: convergence report") ─→ │
  │  ◄── announce final report ─────────────────┤  ✅ Checkpoint 3
  └─ done

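spawn task template

Every spawn task must include:

The path to the experiment state file: experiments/<id>/experiment.yaml
The phase(s) to execute
The user's confirmation or feedback, if any

Example:

sessions_spawn(task="Run the idea-storm experiment.
Read the experiment state: experiments/facial-gan-20260213/experiment.yaml
Phases to execute: Phase 4-6 (implement → validate → evaluate)
User feedback: design approved, go with the StyleGAN3 route
Follow the idea-storm skill workflow; when finished, update experiment.yaml and report the results.")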
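
The three required fields can be assembled programmatically so that none is forgotten. The following is a minimal sketch, not part of the skill itself: `build_spawn_task` is a hypothetical helper, and the returned string is what would be passed as `task` to `sessions_spawn`.

```python
def build_spawn_task(state_path: str, phases: str, feedback: str = "") -> str:
    """Assemble the task text for one spawn (hypothetical helper).

    state_path and phases are always required; feedback is added only
    when the user actually gave some.
    """
    lines = [
        "Run the idea-storm experiment.",
        f"Read the experiment state: {state_path}",
        f"Phases to execute: {phases}",
    ]
    if feedback:
        lines.append(f"User feedback: {feedback}")
    lines.append(
        "Follow the idea-storm skill workflow; when finished, "
        "update experiment.yaml and report the results."
    )
    return "\n".join(lines)


task = build_spawn_task(
    "experiments/facial-gan-20260213/experiment.yaml",
    "Phase 4-6 (implement → validate → evaluate)",
    "design approved, go with the StyleGAN3 route",
)
print(task)
```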

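Once started, the sub-agent:

Reads the idea-storm SKILL.md for the workflow
Reads experiment.yaml to restore the experiment state
Executes the requested phase(s)
Updates experiment.yaml and Notion
Announces a summary of the results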

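Memory management

Three storage tiers ensure that state is not lost:

Tier 1: hot state (SESSION-STATE.md)

The main session's SESSION-STATE.md records a summary of the currently active experiment:

idea_lab:
  active_experiment: "facial-gan-20260213"
  experiment_path: "experiments/facial-gan-20260213/"
  current_phase: "waiting for user confirmation at checkpoint 2"
  last_spawn_label: "idea-storm-facial-gan-iter2"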

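Tier 2: experiment workspace

Each experiment has its own directory under the workspace:

experiments/<experiment-id>/
├── experiment.yaml          # full experiment state (core file)
├── research/                # research material
│   └── findings.md
├── design/                  # design documents
│   └── plan.md
├── src/                     # implementation code
├── data/                    # input data, reference images, etc.
├── results/                 # validation results, one folder per iteration
│   ├── iter-1/
│   ├── iter-2/
│   └── ...
└── report.md                # final report (local copy)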
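
As an illustration of this layout, the sketch below creates the directory skeleton and allocates the next `results/iter-N/` folder. The function names `create_workspace` and `next_iteration_dir` are assumptions made for this example; the skill does not prescribe how the directories are created.

```python
import tempfile
from pathlib import Path

# Sub-directories taken from the layout above.
SUBDIRS = ("research", "design", "src", "data", "results")


def create_workspace(root: Path, experiment_id: str) -> Path:
    """Create experiments/<experiment-id>/ with its sub-directories."""
    exp_dir = root / "experiments" / experiment_id
    for name in SUBDIRS:
        (exp_dir / name).mkdir(parents=True, exist_ok=True)
    return exp_dir


def next_iteration_dir(exp_dir: Path) -> Path:
    """Create and return results/iter-N for the next unused N."""
    results = exp_dir / "results"
    used = [
        int(p.name.split("-", 1)[1])
        for p in results.glob("iter-*")
        if p.is_dir() and p.name.split("-", 1)[1].isdigit()
    ]
    new_dir = results / f"iter-{max(used, default=0) + 1}"
    new_dir.mkdir()
    return new_dir


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    exp = create_workspace(Path(tmp), "facial-gan-20260213")
    iter_names = (next_iteration_dir(exp).name, next_iteration_dir(exp).name)
    created = sorted(p.name for p in exp.iterdir())
```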

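Tier 3: long-term record in Notion

Structured experiment reports, organized by time and category. See "Notion experiment page structure" for details.

experiment.yaml specification

This file holds the complete state of an experiment; a sub-agent relies on it to restore its context: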

id: "facial-gan-20260213"
title: "Generate facial micro-expressions with a GAN"
created: "2026-02-13T12:00:00+08:00"
status: "running"  # running | paused | completed | abandoned

# Current progress
phase: "Phase 5: Validation"
iteration: 2
max_iterations: 5

# Problem definition
problem:
  description: "Need to generate realistic facial micro-expression animation"
  constraints: "Real-time rendering, latency < 50 ms"

# Validation config
validation:
  mode: "B"  # A = image comparison, B = metric optimization, C = functional validation, D = custom
  description: "Optimize the FID score"
  threshold: 50
  current_best: 67.3

# Checkpoint log
checkpoints:
  - phase: 3
    time: "2026-02-13T13:00:00+08:00"
    status: "approved"
    user_feedback: "Design approved, use StyleGAN3"
  - phase: 6
    iteration: 1
    time: "2026-02-13T14:30:00+08:00"
    status: "continue"
    user_feedback: "FID 67.3, keep going and tune the learning rate"

# Iteration history
iterations:
  - round: 1
    changes: "Initial implementation, lr=0.001"
    result: "FID 67.3"
    decision: "Continue, adjust the learning rate"
  - round: 2
    changes: "lr=0.0003, added data augmentation"
    result: "pending"

# Notion
notion_page_id: "xxx-xxx-xxx"
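
Because every sub-agent depends on this file, it may be worth checking the state before acting on it. The sketch below validates an already parsed state (a plain `dict`, however it was loaded). The set of required fields and the `validate_state` helper are assumptions for illustration; the allowed `status` and `mode` values come from the comments in the specification above.

```python
ALLOWED_STATUS = {"running", "paused", "completed", "abandoned"}
ALLOWED_MODES = {"A", "B", "C", "D"}
# Assumption: these are the fields a sub-agent cannot work without.
REQUIRED = ("id", "title", "status", "phase",
            "iteration", "max_iterations", "validation")


def validate_state(state: dict) -> list[str]:
    """Return a list of problems; an empty list means the state looks usable."""
    problems = [f"missing field: {key}" for key in REQUIRED if key not in state]
    if "status" in state and state["status"] not in ALLOWED_STATUS:
        problems.append(f"invalid status: {state['status']}")
    mode = state.get("validation", {}).get("mode")
    if mode is not None and mode not in ALLOWED_MODES:
        problems.append(f"invalid validation mode: {mode}")
    if "iteration" in state and "max_iterations" in state:
        if state["iteration"] > state["max_iterations"]:
            problems.append("iteration exceeds max_iterations")
    return problems


good = {
    "id": "facial-gan-20260213",
    "title": "Generate facial micro-expressions with a GAN",
    "status": "running",
    "phase": "Phase 5: Validation",
    "iteration": 2,
    "max_iterations": 5,
    "validation": {"mode": "B", "threshold": 50, "current_best": 67.3},
}
bad = {**good, "status": "done", "iteration": 6}

print(validate_state(good))
print(validate_state(bad))
```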

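Core workflow

Phase 1: Problem definition (runs in the main session)

The user supplies an engineering problem or an idea. Extract and confirm:

Problem description: what needs to be solved
Success criteria: what counts as solved
Constraints: tech stack, resource limits
Validation method: see "Validation modes"

If the user has not stated these clearly, ask for them (but do not ask too much at once).

After confirmation:

Create the experiment directory experiments/<id>/
Write experiment.yaml
Create the Notion experiment page
Update SESSION-STATE.md
Spawn a sub-agent to run Phases 2-3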

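Phase 2: Research (sub-agent)

Favor engineering-oriented sources, in this order of priority:

Open-source projects and implementations on GitHub
Technical blogs, Stack Overflow, engineering write-ups
Product and API documentation
Papers (only when engineering material is insufficient)

Tools: web_search + web_fetch

Outputs:

research/findings.md: the research findings
An updated experiment.yaml
An updated "Research notes" section in Notion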

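Phase 3: Design (sub-agent)

Design the technical approach based on the research:

Overall architecture
Key technology choices
Implementation steps
Expected results

Outputs:

design/plan.md: the detailed design
An updated experiment.yaml (phase → "waiting for checkpoint 1")
An updated "Design" section in Notion
A design summary announced to the main session

✅ Checkpoint 1: design approval (main session)

Once the user approves, the main session spawns a new sub-agent to run Phases 4-6.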

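Phase 4: Implementation (sub-agent)

Carry out the design. This may include writing code, configuring the environment, generating assets, and calling APIs.

Outputs:

Implementation code under src/
An updated "Experiment log" section in Notion

Phase 5: Validation (sub-agent)

Run the validation method defined in experiment.yaml. See "Validation modes" for details.

Outputs:

results/iter-N/: validation data for this round
An updated "Validation results" section in Notion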

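Phase 6: Evaluation and iteration decision (sub-agent)

Decide based on the validation results:

Situation	Action
Target met	Mark as converged and announce the result
Close to target, parameters tunable	Adjust parameters automatically and return to Phase 4 (at most max_iterations rounds)
Wrong direction	Announce a recommendation to switch approach

After updating experiment.yaml, announce the result to the main session.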
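
The decision table can be read as a small function. The sketch below is one possible interpretation, not the skill's own logic: the source does not define "close to target", so `near_ratio` is a hypothetical tolerance, and treating an exhausted iteration budget as a reason to rethink is likewise an assumption. Metrics are assumed to be positive.

```python
from enum import Enum


class Decision(Enum):
    CONVERGED = "mark as converged"
    ITERATE = "tune parameters, return to Phase 4"
    RETHINK = "announce a recommendation to switch approach"


def decide(metric: float, threshold: float, iteration: int,
           max_iterations: int, near_ratio: float = 1.5,
           lower_is_better: bool = True) -> Decision:
    """Map one validation result onto the three rows of the table."""
    if lower_is_better:
        met = metric <= threshold
        near = metric <= threshold * near_ratio  # hypothetical tolerance
    else:
        met = metric >= threshold
        near = metric >= threshold / near_ratio
    if met:
        return Decision.CONVERGED
    if near and iteration < max_iterations:
        return Decision.ITERATE
    # Either far from the target or out of iteration budget.
    return Decision.RETHINK


# FID 67.3 against a threshold of 50, in round 2 of 5.
print(decide(67.3, 50, iteration=2, max_iterations=5))
```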

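✅ Checkpoint 2: iteration approval (main session)

The report covers:

What was done in this round
Result data and screenshots
The suggested next step

Once the user confirms, spawn the next round or move on to convergence.

Phase 7: Convergence report (sub-agent)

Produce the final report:

report.md: the complete local report
An updated final-report block in Notion
A report summary announced to the main session

✅ Checkpoint 3: final approval (main session)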

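Validation modes

The user chooses the mode in Phase 1.

Mode A: image comparison

The user provides reference images and a set of inputs. The agent generates outputs and compares them with the reference images.

Tools: scripts/compare_images.py (SSIM / pixel difference) or the image tool (visual analysis)
The pass criterion is defined by the user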
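
To make the "pixel difference" idea concrete, here is a dependency-free sketch that scores two grayscale images given as nested lists of 0-255 values. It is not the contents of scripts/compare_images.py, which are not shown here; `mean_abs_diff` and the 0.02 pass threshold are made up for illustration, and SSIM is not covered.

```python
def mean_abs_diff(a: list[list[int]], b: list[list[int]]) -> float:
    """Mean absolute pixel difference, normalized to the range 0.0-1.0.

    0.0 means the images are identical; 1.0 means every pixel differs
    by the full 0-255 range.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    count = sum(len(row) for row in a)
    if count == 0:
        raise ValueError("images must not be empty")
    total = sum(abs(pa - pb)
                for ra, rb in zip(a, b)
                for pa, pb in zip(ra, rb))
    return total / (255 * count)


reference = [[0, 255], [128, 128]]
output = [[0, 250], [128, 138]]

score = mean_abs_diff(reference, output)
passed = score <= 0.02  # hypothetical user-defined pass criterion
print(f"difference={score:.4f} passed={passed}")
```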

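Mode B: metric optimization

The user defines an evaluation function or metric, and the agent optimizes the implementation to improve it.

The user provides an evaluation script or a metric definition
The metric is recorded every round
The experiment converges once the threshold is reached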
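
Recording the metric each round and testing it against the threshold might look like the sketch below. It works on the parsed state as a `dict` and reuses the `validation.threshold`, `validation.current_best`, and `iterations` fields from experiment.yaml. `record_metric` is a hypothetical helper, results are stored as plain numbers for simplicity, and every value after 67.3 is invented for the demo.

```python
def record_metric(state: dict, value: float,
                  lower_is_better: bool = True) -> bool:
    """Log one round's metric, update current_best, report convergence."""
    validation = state["validation"]
    best = validation.get("current_best")
    if best is None or (value < best if lower_is_better else value > best):
        best = value
        validation["current_best"] = best
    history = state.setdefault("iterations", [])
    history.append({"round": len(history) + 1, "result": value})
    threshold = validation["threshold"]
    return best <= threshold if lower_is_better else best >= threshold


state = {"validation": {"threshold": 50, "current_best": None},
         "iterations": []}

# One call per round; the third round regresses and must not lower the best.
reached = [record_metric(state, fid) for fid in (67.3, 58.1, 61.0, 49.5)]
print(reached, state["validation"]["current_best"])
```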

Mode C: functional validation

The user defines test cases or acceptance criteria, and the agent verifies them one by one.
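
A minimal sketch of item-by-item verification, assuming each acceptance criterion can be expressed as a named callable that returns true or false. `run_checks`, the expression list, and the stubbed latency measurement are all hypothetical.

```python
from typing import Callable


def run_checks(checks: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Run every check; a check that raises counts as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


# Hypothetical stand-ins for the system under test.
EXPRESSIONS = ["happy", "sad", "angry", "surprised", "afraid", "disgusted"]


def measured_latency_ms() -> float:
    return 62.0  # stub value, not a real measurement


results = run_checks({
    "supports 6 basic expressions": lambda: len(EXPRESSIONS) == 6,
    "inference latency under 50 ms": lambda: measured_latency_ms() < 50,
})
all_passed = all(results.values())
for name, ok in results.items():
    print("PASS" if ok else "FAIL", name)
```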

Mode D: custom

The user describes how to validate, and the agent follows that description.

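Notion experiment page structure

A new page is created each time an experiment starts. For configuration, see references/notion-setup.md.

📋 [Experiment title] - [Date]
├── Problem definition
├── Research notes
├── Design
├── Experiment log (per iteration)
├── Validation results (per iteration)
└── Final report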
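
The page skeleton can be expressed as a request body for the Notion "create a page" endpoint. The sketch below only builds the payload and sends nothing. The section names are taken from the structure above, while `build_page_payload`, the database ID, and the title property name `Name` are assumptions; the actual setup is described in references/notion-setup.md, which is not reproduced here.

```python
SECTIONS = [
    "Problem definition",
    "Research notes",
    "Design",
    "Experiment log",
    "Validation results",
    "Final report",
]


def heading_block(title: str) -> dict:
    """One Notion heading_2 block containing plain text."""
    return {
        "object": "block",
        "type": "heading_2",
        "heading_2": {
            "rich_text": [{"type": "text", "text": {"content": title}}],
        },
    }


def build_page_payload(database_id: str, title: str, date: str) -> dict:
    """Body for creating one experiment page with its six sections."""
    return {
        "parent": {"database_id": database_id},
        "properties": {
            # Assumption: the database's title property is called "Name".
            "Name": {"title": [{"text": {"content": f"📋 {title} - {date}"}}]},
        },
        "children": [heading_block(name) for name in SECTIONS],
    }


payload = build_page_payload(
    "hypothetical-database-id",
    "Generate facial micro-expressions with a GAN",
    "2026-02-13",
)
print(len(payload["children"]), "section blocks")
```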

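Tool usage

Stage	Tools
Research	`web_search`, `web_fetch`
Implementation	Claude Code (preferred), `exec`, `write`, `edit`
Image validation	`image`, `scripts/compare_images.py`
Metric validation	`exec` (runs the evaluation script)
Notion	Notion API via `exec`
Background execution	`sessions_spawn`
State passing	the `experiment.yaml` file
Notifying the user	announce (done automatically by the sub-agent)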

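Claude Code integration

In Phase 4 (implementation), prefer running Claude Code inside a Docker sandbox for coding tasks.

Docker sandbox architecture

Each experiment runs Claude Code in its own Docker container, isolated from the host:

Host                                  Docker container (idea-storm-sandbox)
├── openclaw.json ──(env injection)──→ ANTHROPIC_AUTH_TOKEN / ANTHROPIC_BASE_URL
├── experiments/<id>/ ──(volume)─────→ /workspace
│                                      ├── non-root user (coder)
│                                      ├── Claude Code CLI + --dangerously-skip-permissions
│                                      ├── Python3 / Node.js / Git
│                                      └── code is written to /workspace and persists automatically

Advantages:

Full isolation; the host environment is not polluted
Running as a non-root user allows --dangerously-skip-permissions, which skips permission prompts automatically
The API configuration is injected from openclaw.json at run time, so switching to another relay means changing a single place
The container is deleted after use and leaves nothing behind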

Building the image

Use the prebuilt idea-storm-sandbox image. The Dockerfile is at scripts/Dockerfile:

FROM node:22-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip python3-venv git curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
RUN npm install -g @anthropic-ai/claude-code
RUN useradd -m -s /bin/bash coder
RUN mkdir -p /home/coder/.openclaw /workspace && chown -R coder:coder /workspace /home/coder
USER coder
WORKDIR /workspace
CMD ["bash"]

Build: docker build -t idea-storm-sandbox -f scripts/Dockerfile .

Invocation

Extract the API configuration from openclaw.json at run time and inject it into the container as environment variables:

# Extract the API configuration
API_KEY=$(python3 -c "import json; print(json.load(open('/root/.openclaw/openclaw.json'))['models']['providers']['cc']['apiKey'])")
BASE_URL=$(python3 -c "import json; print(json.load(open('/root/.openclaw/openclaw.json'))['models']['providers']['cc']['baseUrl'])")

# Run Claude Code (single task)
# The bind mount needs an absolute host path; the prompt is passed as $1
# so that it is not re-parsed as shell code inside the container.
docker run --rm -t \
  -e ANTHROPIC_AUTH_TOKEN="$API_KEY" \
  -e ANTHROPIC_BASE_URL="$BASE_URL" \
  -v "$(pwd)/experiments/<id>":/workspace \
  idea-storm-sandbox \
  bash -c 'cd /workspace && git init -q 2>/dev/null; claude --print --dangerously-skip-permissions "$1"' _ "<prompt>"
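
The same invocation can be built from Python as an argument list, which avoids shell quoting altogether. This is a sketch under stated assumptions rather than the skill's run-sandbox.sh: `load_provider`, `docker_argv`, and `run_in_sandbox` are hypothetical names, the `git init` step is left out, and the two secrets are handed to Docker through the client environment (`-e NAME` without a value) so that they do not appear on the command line. The demo reads a throwaway config file and does not start a container.

```python
import json
import os
import subprocess
import tempfile
from pathlib import Path

IMAGE = "idea-storm-sandbox"


def load_provider(config_path: Path, provider: str = "cc") -> dict[str, str]:
    """Read only the two values the sandbox needs from openclaw.json."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    entry = config["models"]["providers"][provider]
    return {
        "ANTHROPIC_AUTH_TOKEN": entry["apiKey"],
        "ANTHROPIC_BASE_URL": entry["baseUrl"],
    }


def docker_argv(exp_dir: Path, prompt: str) -> list[str]:
    """Build the docker command; the prompt stays a single argument."""
    return [
        "docker", "run", "--rm",
        "-e", "ANTHROPIC_AUTH_TOKEN",   # value taken from the environment
        "-e", "ANTHROPIC_BASE_URL",
        "-v", f"{exp_dir.resolve()}:/workspace",  # absolute host path
        IMAGE,
        "claude", "--print", "--dangerously-skip-permissions", prompt,
    ]


def run_in_sandbox(exp_dir: Path, prompt: str, secrets: dict[str, str],
                   timeout: int = 300) -> subprocess.CompletedProcess:
    """Run one task in the container (requires Docker and the image)."""
    env = {**os.environ, **secrets}
    return subprocess.run(docker_argv(exp_dir, prompt), env=env,
                          capture_output=True, text=True, timeout=timeout)


# Demo with a fake config; nothing is executed.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "openclaw.json"
    cfg.write_text(json.dumps({"models": {"providers": {"cc": {
        "apiKey": "sk-demo", "baseUrl": "https://example.invalid"}}}}),
        encoding="utf-8")
    secrets = load_provider(cfg)
    argv = docker_argv(Path(tmp), 'fix the bug"; echo pwned #')
```

Because the prompt is one element of the list, quotes and semicolons inside it reach `claude` as plain text.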

Using it from a sub-agent

When a sub-agent runs Phase 4, it invokes the container through exec with pty:true:

exec(
  command="API_KEY=$(python3 -c \"import json; print(json.load(open('/root/.openclaw/openclaw.json'))['models']['providers']['cc']['apiKey'])\") && BASE_URL=$(python3 -c \"import json; print(json.load(open('/root/.openclaw/openclaw.json'))['models']['providers']['cc']['baseUrl'])\") && docker run --rm -t -e ANTHROPIC_AUTH_TOKEN=\"$API_KEY\" -e ANTHROPIC_BASE_URL=\"$BASE_URL\" -v /root/.openclaw/workspace/experiments/<id>:/workspace idea-storm-sandbox bash -c 'cd /workspace && git init -q 2>/dev/null; claude --print --dangerously-skip-permissions \"$1\"' _ \"<prompt>\"",
  pty=true,
  timeout=300
)

The helper script scripts/run-sandbox.sh can also be used to simplify the call (see "Iteration mode (Ralph Loop)").

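Prompt construction principles

A prompt for Claude Code should contain:

Goal: what to implement
Context: the current project structure, tech stack, and existing code
Constraints: file paths, naming conventions, dependency limits
Verification: how to verify the implementation afterwards (test commands, etc.)

Example:

Based on the design in design/plan.md, implement the facial micro-expression generation module in the current directory.
Tech stack: Python 3.11 + PyTorch + StyleGAN3
Requirements:
1. Implement the FacialExpressionGenerator class
2. Support 6 basic expressions
3. Inference latency < 50 ms
4. Write unit tests
When done, run pytest to confirm that the tests pass.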

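Iteration mode (Ralph Loop)

For multi-round optimization, call Claude Code in the container in a loop:

Write the task to PROMPT.md in the experiment directory
Call the Docker container in a loop; each round reads PROMPT.md
Pass iteration state through a file (experiment.yaml)
Check the completion marker to decide whether to continue

# One implementation round (inside the container)
scripts/run-sandbox.sh <experiment-id> "$(cat experiments/<id>/PROMPT.md)"

# Verify the result on the host
cd experiments/<id> && python3 -m pytest

# On failure, add the error output to PROMPT.md and run another round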
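
The loop itself can be sketched in a few lines. In the version below, `run_round` and `verify` are injected callables standing in for the sandbox call and the pytest run; `ralph_loop`, the failure-note format appended to PROMPT.md, and the `max_rounds` bound are assumptions for illustration. The demo uses stubs that succeed on the third attempt, so no container is started.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ralph_loop(prompt_file: Path,
               run_round: Callable[[str], None],
               verify: Callable[[], tuple[bool, str]],
               max_rounds: int = 5) -> tuple[bool, int]:
    """Repeat implement → verify until verification passes or rounds run out.

    Returns (converged, rounds_used). After each failed round the error
    output is appended to PROMPT.md so the next round can see it.
    """
    for round_no in range(1, max_rounds + 1):
        run_round(prompt_file.read_text(encoding="utf-8"))
        ok, log = verify()
        if ok:
            return True, round_no
        with prompt_file.open("a", encoding="utf-8") as fh:
            fh.write(f"\n\nRound {round_no} error output:\n{log}\n")
    return False, max_rounds


# Stubs: pretend the tests pass on the third attempt.
attempts: list[str] = []


def fake_run(prompt: str) -> None:
    attempts.append(prompt)


def fake_verify() -> tuple[bool, str]:
    return len(attempts) >= 3, "AssertionError: latency 62 ms exceeds 50 ms"


with tempfile.TemporaryDirectory() as tmp:
    prompt_path = Path(tmp) / "PROMPT.md"
    prompt_path.write_text("Implement the generator.", encoding="utf-8")
    converged, rounds = ralph_loop(prompt_path, fake_run, fake_verify)
    final_prompt = prompt_path.read_text(encoding="utf-8")

print(converged, rounds)
```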

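When to use the Docker sandbox vs. running directly on the host

Scenario	Recommendation
Scaffolding a project, editing many files	Docker sandbox (Claude Code)
Complex code refactoring	Docker sandbox (Claude Code)
Installing unknown dependencies, running untrusted code	Docker sandbox
Simple file writes, small changes	OpenClaw `write`/`edit` on the host
Running commands that are already verified	OpenClaw `exec` on the host
Reading experiment state to make a decision	OpenClaw on the host (the sub-agent itself)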

Security recommendations

This skill largely matches its stated goal, but it has important mismatches you should address before installing:

- The run-sandbox.sh script reads your OpenClaw agent config (openclaw.json) to extract model/provider API keys and injects them into a Docker image named 'idea-lab-sandbox'. That file may contain other secrets; the skill metadata does not declare this. Do not allow this unless you explicitly trust the container image and understand which keys are being used.
- The skill also expects a NOTION_TOKEN and database ID (used to write experiment pages) but did not declare required env vars in the registry. Provide a dedicated Notion integration token with least privilege and store it separately rather than relying on undeclared variables.
- Inspect and vet the Docker image 'idea-lab-sandbox' before running: verify its source, contents, and whether it exfiltrates data. Prefer to run the sandbox image in an environment with limited network access and with only the experiment directory mounted (not the whole workspace), or run the code locally without Docker.
- Consider modifying run-sandbox.sh to avoid reading openclaw.json. Instead require the user to pass an explicit sandbox API key or configure a dedicated service account for sandboxing. Remove the '--dangerously-skip-permissions' flag or understand why it's needed.
- If you cannot verify the container or do not want to share provider credentials, decline running the sandbox and use the skill in a manual mode (have the agent produce code and run it yourself), or request the maintainer to remove the secret-extraction behavior and to declare required env vars/config paths in the skill metadata.

Because these issues involve undisclosed credential access and running arbitrary container code with mounted files, treat the skill as suspicious until those concerns are resolved.

Functional analysis

Type: OpenClaw Skill
Name: idea-storm
Version: 1.1.0

The skill is classified as suspicious due to several high-risk vulnerabilities, primarily shell injection within the Docker sandbox and direct access to sensitive API keys. The `scripts/run-sandbox.sh` script directly embeds the `$PROMPT` variable into a `bash -c` command, allowing for arbitrary command execution inside the Docker container if the prompt is not properly sanitized. Additionally, the skill explicitly accesses sensitive API keys/tokens from `openclaw.json` and environment variables (e.g., `ANTHROPIC_AUTH_TOKEN`, `NOTION_TOKEN`) for its operations. While these actions are intended for the skill's stated purpose (using Claude Code and Notion), they represent significant attack surfaces. There is no clear evidence of intentional malicious behavior like unauthorized data exfiltration, persistence, or remote control.

Capability assessment

ℹ Purpose & Capability

The declared purpose (automating design→implement→validate experiments and storing results in Notion) aligns with the provided files: an image-compare script, Notion API examples, and a sandbox runner. However the skill also reads the agent's local config to extract model provider API keys and base URLs (see run-sandbox.sh). That access to agent model credentials is not mentioned in the skill metadata or description and is not necessary to the stated high-level purpose if a user provides or configures a separate sandbox credential. This is an unexpected privilege.

⚠ Instruction Scope

SKILL.md and scripts instruct reading/writing experiment state files and calling Notion — reasonable. But run-sandbox.sh specifically reads /root/.openclaw/openclaw.json (or OPENCLAW_CONFIG) to extract provider apiKey/baseUrl, mounts the experiment workspace into a container, and runs 'claude --dangerously-skip-permissions'. These steps cause the agent to read internal config and expose provider credentials to an external Docker image and bypass permission checks; that goes beyond the documented scope and is a clear scope creep / data-exfiltration risk.

⚠ Install Mechanism

There is no formal install spec (instruction-only), which limits static install risk. However run-sandbox.sh expects and will docker pull/run an image named 'idea-lab-sandbox' (an arbitrary container image). Running an externally provided image with a mounted workspace and injected credentials is a significant runtime risk: the image can execute arbitrary code with access to mounted files and any env vars passed in.

⚠ Credentials

The registry metadata declares no required env vars or config paths, but the docs and scripts require/expect NOTION_TOKEN/IDEA_LAB_DB_ID and — crucially — read the agent's openclaw.json to extract model provider API keys. Access to the agent's model API keys (and implicitly other config in openclaw.json) is disproportionate to a Notion-logging/experiment orchestrator and is not declared. Passing those keys into the sandbox container is especially problematic.

ℹ Persistence & Privilege

The skill is not force-installed (always:false) and does not request persistent system-level modifications. However it is designed to spawn background child agents and to run Docker sandboxes that mount workspace directories and receive provider credentials. Autonomous invocation combined with the credential-access behavior increases blast radius if the spawned or containerized code is malicious; this combination is noteworthy although the skill itself is not marked always:true.

Version history

v1.1.0

- Skill renamed from "idea-lab" to "idea-storm" throughout documentation and usage.
- Updated trigger words to reflect the new skill name ("idea-storm" etc.).
- Documentation and process instructions updated to consistently use "idea-storm" instead of "idea-lab".
- Added preferred use of Claude Code running in a Docker sandbox for implementation tasks (Phase 4).
- Appended a detailed section describing Claude Code integration, Docker sandbox architecture, and how to invoke coding tasks securely.
- Added helper script `scripts/run-sandbox.sh` for managing the Claude Code development environment.

v1.0.0

- Initial release of idea-lab (v1.0.0): an automated engineering experiment workflow for idea/problem research, design, implementation, validation, and iterative optimization.
- Supports multi-phase, checkpointed experiments managed via workspace directories and experiment.yaml files, with state announcements at key phases.
- Runs phases in isolated sub-agents via spawn, ensuring main chat remains unblocked and experiment context is persisted.
- Integrates with Notion for structured, long-term recording of experiment progress and results.
- Features multi-level memory management (session state, workspace, Notion) for robust state tracking.
- Provides user-triggered keywords for starting or progressing experiments and supports various validation modes (image, metric, custom, etc.).

Metadata

Slug idea-storm

Version 1.1.0

License not specified

Total installs 1

Current installs 1

Released versions 2

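FAQ

What is Idea Storm?

An automated iteration lab for engineering problems. Given an idea or an engineering problem, it researches approaches, designs and implements a solution, validates the result, and iterates, storing the results in Notion. Trigger phrases: "idea-lab", "实验一下" (run an experiment), "帮我验证" (help me validate), "迭代优化" (iterate and optimize), "idea 验证" (idea validation). Use this skill when the user raises an engineering problem and wants the research → design → validate → iterate loop automated. It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 772 times to date.

How do I install Idea Storm?

Run the command "/install idea-storm" in an OpenClaw or Claude Code chat to install it in one step; no additional configuration is needed.

Is Idea Storm free?

Yes. Idea Storm is completely free (free and open source) and may be downloaded, installed, and used freely.

Which platforms does Idea Storm support?

Idea Storm is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Idea Storm?

It is developed and maintained by c4chuan (@c4chuan). The current version is v1.1.0.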
