0xcjl

Autoresearch

Author: Jialin · GitHub · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Total downloads: 133
Favorites: 2
Current installs: 1
Versions: 1
Install in OpenClaw
/install autoresearch-pro
Description
Automatically improve OpenClaw skills, prompts, or articles through iterative mutation-testing loops. Inspired by Karpathy's autoresearch. Use when user says...
Usage instructions (SKILL.md)

autoresearch-pro

Overview

Automatically improve any OpenClaw skill, prompt, or article through iterative mutation-testing: small edits → run test cases → score with checklist → keep improvements, discard regressions.

Inspired by Karpathy/autoresearch.

Supports three optimization modes:

| Mode | Input | Output |
| --- | --- | --- |
| Skill | Path to a skill directory | Improved SKILL.md |
| Prompt | A prompt text string | Improved prompt |
| Article | An article/document text | Improved article |

Workflow

Step 1 — Identify Mode and Input

Ask the user to confirm:

  • Mode 1 — Skill: User says "optimize [skill-name]" or provides a skill path
  • Mode 2 — Prompt: User says "optimize this prompt" or pastes a prompt
  • Mode 3 — Article: User says "improve this article" or pastes article text

For Skill mode, resolve the skill path to ~/.openclaw/skills/<skill-name>/SKILL.md. For Prompt/Article mode, keep the text in context (do not write to disk unless needed).
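The path resolution above can be sketched as follows. This is a minimal illustration, not the skill's own code: the function name `resolve_skill_path`, the optional `root` parameter, and the name validation are all assumptions added for the example.

```python
from pathlib import Path
from typing import Optional


def resolve_skill_path(skill_name: str, root: Optional[Path] = None) -> Path:
    """Return the SKILL.md path for a skill under the OpenClaw skills root.

    Hypothetical helper: the name and the validation rule are illustrative.
    """
    # Reject names that could point outside the skills directory.
    if not skill_name or "/" in skill_name or "\\" in skill_name or skill_name in {".", ".."}:
        raise ValueError(f"invalid skill name: {skill_name!r}")
    # Default root follows the path given in Step 1.
    base = root if root is not None else Path.home() / ".openclaw" / "skills"
    return base / skill_name / "SKILL.md"
```

Passing an explicit `root` makes it easy to point the sketch at a copy of a skill rather than the live directory.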

Step 2 — Generate Checklist (10 Questions)

Read the target content first. Then generate 10 diverse, specific yes/no checklist questions relevant to the content type:

For Skill mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Description clarity | Is the frontmatter description precise and actionable? |
| 2 | Trigger coverage | Does it cover the main real-world use cases? |
| 3 | Workflow structure | Are steps clearly sequenced and unambiguous? |
| 4 | Error guidance | Does it handle error states and edge cases? |
| 5 | Tool usage accuracy | Are tool names and parameters correct for OpenClaw? |
| 6 | Example quality | Do examples reflect real usage patterns? |
| 7 | Conciseness | Is content free of redundant repetition? |
| 8 | Freedom calibration | Is instruction specificity appropriate? |
| 9 | Reference quality | Are references and links accurate? |
| 10 | Completeness | Are all sections filled with real content? |

For Prompt mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Goal clarity | Does the prompt state a clear, specific goal? |
| 2 | Role/tone | Is the desired role or tone specified? |
| 3 | Input format | Is the input format clearly described? |
| 4 | Output format | Is the expected output format specified? |
| 5 | Constraints | Are key constraints and boundaries stated? |
| 6 | Context sufficiency | Is enough context provided to avoid hallucination? |
| 7 | Edge cases | Does it handle ambiguous or edge case inputs? |
| 8 | Conciseness | Is it free of redundant or contradictory instructions? |
| 9 | Actionability | Are instructions concrete and actionable vs. vague? |
| 10 | Completeness | Are all necessary elements for the task present? |

For Article mode (10 tailored questions):

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Title quality | Does the title clearly convey the main value? |
| 2 | Opening hook | Does the opening grab attention and set expectations? |
| 3 | Logical structure | Are ideas logically organized (not random)? |
| 4 | Argument clarity | Are claims supported with evidence or reasoning? |
| 5 | Conciseness | Is unnecessary padding or repetition removed? |
| 6 | Transition flow | Do paragraphs/sections flow smoothly? |
| 7 | Closing strength | Does the conclusion summarize and inspire action? |
| 8 | Tone consistency | Is the tone consistent throughout? |
| 9 | Readability | Is sentence/paragraph length varied appropriately? |
| 10 | Audience match | Does language match the target audience level? |

Present the 10 questions, numbered 1-10. Ask the user to select which ones to activate (e.g., "use questions 1, 3, 5, 7"). If the user does not specify a selection, use all 10.
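One way to turn the user's reply into a list of active questions is sketched below. The function name `parse_active_questions` is hypothetical, and the parsing rule (collect every integer between 1 and 10 that appears in the reply) is an assumption; the skill itself leaves this to the agent.

```python
import re
from typing import List


def parse_active_questions(reply: str, total: int = 10) -> List[int]:
    """Extract question numbers from a reply such as 'use questions 1, 3, 5, 7'.

    Hypothetical helper: numbers outside 1..total are ignored.
    """
    picked = sorted({int(n) for n in re.findall(r"\d+", reply) if 1 <= int(n) <= total})
    # Default from Step 2: activate every question when none is named.
    return picked or list(range(1, total + 1))
```

Note that this sketch does not expand ranges: a reply like "1-3" yields questions 1 and 3 only, so a real implementation would need to decide how to treat that form.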

Step 3 — Prepare Test Cases

  • Skill mode: Generate 3-5 realistic prompts a user would send when using the skill
  • Prompt mode: Generate 3-5 test inputs that the prompt would process
  • Article mode: Generate 3-5 ways the article might be read or consumed

Store test cases in context — do not write to disk.

Step 4 — Run Autoresearch Loop

Loop configuration:

  • Rounds per batch: 30
  • Max total rounds: 100
  • Pause: After every 30 rounds, show summary and ask user to continue or stop
  • Stop conditions: User says stop, OR 100 rounds completed
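Because 100 is not a multiple of 30, the configuration above implies a short final batch. The sketch below makes the round boundaries explicit, assuming the last batch is simply cut off at round 100 (the source does not state this). The function name `batch_ranges` is hypothetical.

```python
from typing import Iterator, Tuple


def batch_ranges(max_rounds: int = 100, batch_size: int = 30) -> Iterator[Tuple[int, int]]:
    """Yield (first_round, last_round) for each batch, 1-indexed and inclusive."""
    start = 1
    while start <= max_rounds:
        # The final batch is truncated at max_rounds (assumed behavior).
        end = min(start + batch_size - 1, max_rounds)
        yield start, end
        start = end + 1
```

Under these assumptions the pauses fall after rounds 30, 60, and 90, and the run ends at round 100 after a 10-round batch.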

Per-round procedure:

  1. Mutate: Make ONE small edit to the target content:

    • Skill mode: edit SKILL.md
    • Prompt mode: edit the prompt string
    • Article mode: edit the article text
  2. Test: For each test case, simulate what output the content would produce.

  3. Score: Apply each active checklist question (0 or 1 per question). Score = (passed / total) × 100.

  4. Decide: If new score ≥ best score → keep the mutation. If lower → revert.

  5. Log: Round number, mutation type, score, keep/revert decision.
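The score and keep/revert rule in the per-round procedure can be sketched as below. In the skill, the agent performs the mutation and the simulated testing itself; here the `mutate` and `evaluate` callables are hypothetical stand-ins for those steps, and `run_round` is an illustrative name.

```python
from typing import Callable, List, Tuple


def score(answers: List[bool]) -> float:
    """Score = (passed / total) x 100 over the active checklist questions."""
    if not answers:
        raise ValueError("at least one checklist question must be active")
    return 100.0 * sum(answers) / len(answers)


def run_round(
    best_text: str,
    best_score: float,
    mutate: Callable[[str], str],            # hypothetical: applies ONE small edit
    evaluate: Callable[[str], List[bool]],   # hypothetical: one yes/no per question
) -> Tuple[str, float, str]:
    """Run one mutate, score, decide cycle and return (text, score, decision)."""
    candidate = mutate(best_text)
    new_score = score(evaluate(candidate))
    # Ties are kept, matching "new score >= best score" in the Decide step.
    if new_score >= best_score:
        return candidate, new_score, "keep"
    return best_text, best_score, "revert"
```

Keeping ties means a mutation that leaves the score unchanged still replaces the current best text, so content can drift across rounds even when the score is flat.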

Mutation types (pick one per round):

| Type | Description |
| --- | --- |
| A | Add a constraint rule |
| B | Strengthen trigger/coverage |
| C | Add a concrete example |
| D | Tighten vague language |
| E | Improve error/edge case handling |
| F | Remove redundant content |
| G | Improve transitions |
| H | Expand a thin section |
| I | Add cross-reference |
| J | Adjust degree-of-freedom |

Step 5 — Report Results

After each batch (30 rounds):

Batch N (rounds X-Y):
  Best score: XX%
  Mutations kept: N  |  Reverted: N
  Most effective types: [list top 2-3]
Accumulated improvements: [summary]
Continue? (yes/stop)
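The batch report above could be assembled from the per-round log roughly as follows. This is a sketch under assumptions: the log entry keys (`round`, `type`, `kept`), the function name `format_batch_report`, and the choice to rank "most effective" types by how many of their mutations were kept are all illustrative, since the source does not define them.

```python
from collections import Counter
from typing import Dict, List


def format_batch_report(batch_no: int, log: List[Dict], best_score: float, summary: str) -> str:
    """Render the batch report from a non-empty list of round log entries.

    Each entry is assumed to look like {"round": 3, "type": "D", "kept": True}.
    """
    rounds = [entry["round"] for entry in log]
    kept = [entry for entry in log if entry["kept"]]
    # Assumption: "most effective" means most often kept.
    top = [t for t, _ in Counter(entry["type"] for entry in kept).most_common(3)]
    return "\n".join([
        f"Batch {batch_no} (rounds {min(rounds)}-{max(rounds)}):",
        f"  Best score: {best_score:.0f}%",
        f"  Mutations kept: {len(kept)}  |  Reverted: {len(log) - len(kept)}",
        f"  Most effective types: {', '.join(top) or 'none'}",
        f"Accumulated improvements: {summary}",
        "Continue? (yes/stop)",
    ])
```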

After full completion:

  • Original score vs. final score
  • Top 3 most impactful mutations
  • Final improved content (inline or diff)
  • File path (skill mode only)

Mutation Strategy Reference

High-impact, low-risk changes:

  • Adding explicit constraints where the content is vague
  • Expanding coverage to cover edge cases
  • Adding concrete examples to abstract instructions
  • Tightening soft language ("try to" → "must")

Avoid in one round:

  • Large rewrites of entire sections
  • Multiple unrelated changes at once
  • Changing fundamental scope or purpose

See references/mutation_strategies.md for the full strategy guide.


Mode Selection Quick Reference

| User says | Mode |
| --- | --- |
| "optimize [skill]" / "autoresearch [skill]" | Skill |
| "optimize this prompt" / "improve my prompt" | Prompt |
| "polish this article" / "improve this article" | Article |
| "optimize this document" | Article |

Default to Prompt mode if the input is a text string without a skill path.
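The quick reference and the default rule can be approximated with simple keyword matching, as sketched below. The function name `detect_mode`, the `known_skills` parameter (a set of lowercase installed skill names), and the matching order are assumptions; the skill asks the user to confirm the mode, so treat this only as a first guess.

```python
import re
from typing import Set


def detect_mode(message: str, known_skills: Set[str]) -> str:
    """Guess the optimization mode from a user message (hypothetical heuristic)."""
    text = message.lower()
    if "prompt" in text:
        return "prompt"
    if "article" in text or "document" in text:
        return "article"
    # "optimize <skill>" or "autoresearch <skill>" counts only for a known skill.
    match = re.match(r"\s*(?:optimize|autoresearch)\s+(\S+)", text)
    if match and match.group(1) in known_skills:
        return "skill"
    # Default from the quick reference: plain text without a skill path.
    return "prompt"
```

A pasted prompt that happens to contain the word "article" would be misclassified by this heuristic, which is one reason Step 1 asks the user to confirm the mode.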

Security recommendations
This skill appears internally consistent with its purpose, but it will read and (for Skill mode) modify SKILL.md files under ~/.openclaw/skills and create a .snapshots directory there. Before installing or running it: (1) inspect the included scripts/run_eval.py yourself (it is the only code file) to confirm you are comfortable with its behavior; (2) run the tool on a copy of a skill (not production) to observe changes and scoring behavior; (3) ensure the agent asks for and obtains your explicit confirmation before it writes to any skill path or proceeds beyond the first batch; (4) if you do not want any disk changes, avoid giving it Skill mode access or restrict the agent's filesystem permissions. No network endpoints or credentials are requested by the skill, which reduces exfiltration risk, but filesystem modification is intrinsic to its function — treat snapshots as your first line of rollback and verify them before accepting changes.
Functional analysis
Type: OpenClaw Skill
Name: autoresearch-pro
Version: 1.0.0
The skill is designed to programmatically modify other OpenClaw skills by editing their 'SKILL.md' files in the '~/.openclaw/skills/' directory. It uses a Python helper script ('scripts/run_eval.py') to read, write, and manage snapshots of these files during an iterative 'mutation-testing' loop. While the stated intent is optimization (inspired by Karpathy's autoresearch), the capability to overwrite the instructions of other skills constitutes a high-risk behavior that could be used for persistent prompt injection or lateral modification of agent behavior across the system.
Capability assessment
Purpose & Capability
The skill claims to iteratively improve SKILL.md, prompts, or articles. The provided SKILL.md and the helper script (scripts/run_eval.py) consistently implement that: they read/write SKILL.md, create .snapshots, generate checklists, and score mutations. There are no unrelated environment variables, binaries, or network endpoints required, so requested capabilities are proportional to the described purpose.
Instruction Scope
The runtime instructions explicitly require reading and (for Skill mode) writing the target SKILL.md at ~/.openclaw/skills/<skill-name>/SKILL.md and creating snapshots in a .snapshots directory. That is coherent for a tool that edits skills, but it does mean the skill will modify user files. Prompt/Article modes state they should avoid writing to disk unless needed; the included script supports file I/O only for Skill mode. User confirmation is described in the workflow (pause after batches), but actual enforcement depends on the agent implementation — the user should expect and approve filesystem changes before running.
Install Mechanism
There is no install spec and no external downloads. This is an instruction-only skill with an optional helper Python script included. No archives or network fetches are performed by the provided files.
Credentials
The skill declares no required environment variables, no credentials, and no config paths other than the target skill directory under the user's home (~/.openclaw/skills). There are no indications of requests for unrelated secrets or cloud credentials.
Persistence & Privilege
always: false (normal). The helper will create snapshots in ~/.openclaw/skills/<skill>/.snapshots and will write changes to SKILL.md when operating in Skill mode. This is expected behavior for a mutation-based editor, but it does constitute persistent modification of user skill files, so the user should ensure they want this and review snapshots. The skill does not modify other skills' config beyond writing to the target skill directory.
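To make the snapshot-and-rollback idea concrete, here is a minimal sketch of taking a snapshot before a write and restoring from it afterwards. It is not the code in scripts/run_eval.py, whose behavior should be inspected directly; the function names `snapshot` and `restore` and the snapshot file naming are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def snapshot(skill_md: Path, round_no: int) -> Path:
    """Copy SKILL.md into a sibling .snapshots directory and return the copy's path."""
    snap_dir = skill_md.parent / ".snapshots"
    snap_dir.mkdir(exist_ok=True)
    # Hypothetical naming scheme; the real helper may name snapshots differently.
    dest = snap_dir / f"SKILL.round{round_no:03d}.md"
    shutil.copy2(skill_md, dest)
    return dest


def restore(skill_md: Path, snap: Path) -> None:
    """Overwrite SKILL.md with the contents of a previously taken snapshot."""
    shutil.copy2(snap, skill_md)
```

Running this against a copy of a skill directory, rather than the live one, is a low-risk way to see what a rollback would look like.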
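How to use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install autoresearch-pro
  3. Once installation finishes, call the skill by name or trigger it with /autoresearch-pro.
  4. Provide the inputs described in the skill's parameter documentation to get structured output.
Version history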
v1.0.0
Initial release: skill/prompt/article optimization, inspired by Karpathy autoresearch
Metadata
Slug autoresearch-pro
Version 1.0.0
License MIT-0
Total installs 1
Current installs 1
Version count 1
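FAQ

What is Autoresearch?

Automatically improve OpenClaw skills, prompts, or articles through iterative mutation-testing loops. Inspired by Karpathy's autoresearch. Use when user says... It is an AI agent skill plugin for Claude Code / OpenClaw, with 133 total downloads to date.

How do I install Autoresearch?

Run the command "/install autoresearch-pro" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is needed.

Is Autoresearch free?

Yes. Autoresearch is completely free and released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Autoresearch support?

Autoresearch is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Autoresearch?

It is developed and maintained by Jialin (@0xcjl); the current version is v1.0.0.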
