Description

Detect repeated capability gaps, convert recurring user needs into candidate skills, scaffold new OpenClaw-compatible skills, and validate them before instal...

README (SKILL.md)

skill-forge

Name: Skill Forge (legacy slug)
Author: sheepxux

Use this skill to turn repeated demand into a reviewed skill candidate.

Current version: v0.4.3 "Safety Tightening".

Core jobs

Detect repeated capability gaps from logs, .learnings, and feature requests.
Decide whether the pattern is stable enough to deserve its own skill.
Classify the opportunity as academic, product, integration, script, or workflow.
Generate a candidate skill folder with a usable SKILL.md, agents/openai.yaml, and profile-specific resources.
Validate and score the candidate before proposing installation.
Run hidden smoke evaluation and Agent profile authorization before confirmed installation.
Record feedback from later usage and propose reviewed updates instead of mutating installed skills directly.
Provide the Somnia runtime used by scheduled nightly review to find skills with bugs, weak scores, or update-worthy feedback.
Use redacted replay cases to check whether candidates actually cover real feedback-derived tasks.

Trigger Cues

Use this skill when the user or agent mentions:

repeated failure
missing capability
recurring workflow
feature request
make a new skill
scaffold a skill
validate a generated skill

Default Workflow

Detect repeated capability gaps from learning files or session-derived notes.
Classify the strongest opportunity into a skill profile.
Scaffold a candidate skill with profile-specific resources.
Validate the candidate and inspect score, warnings, and references.
Run hidden smoke evaluation without exposing simulated cases to users.
Propose installation, then require Telegram approval before applying it.
Record future usage feedback and run an evolution pipeline when enough feedback accumulates.
During scheduled reviews, write summary reports and optionally propose updates without exposing hidden evaluation details.
Run replay evaluation before approving evolved candidates when replay cases exist.

Commands

Detect

Run:

python3 {baseDir}/scripts/detect_skill_opportunities.py --json

Add --source paths when a specific workspace or learnings file should be analyzed.

Choose

Only create a skill when the pattern is:

recurring
broad enough to reuse
structured enough to document
more stable than a one-off prompt

If the need is too narrow, keep it as a note or workflow rule instead.
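
The four criteria above can be read as an all-must-hold check. The following is a minimal sketch, not Skill Forge's own logic: the `Pattern` record, its field names, and the thresholds (3 occurrences, 2 contexts) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """Hypothetical summary of a detected need; not a Skill Forge type."""
    occurrences: int             # how often the need was observed
    distinct_contexts: int       # how many separate tasks or sessions showed it
    has_clear_steps: bool        # can the workflow be written down?
    one_off_prompt_enough: bool  # would a single prompt solve it?


def deserves_skill(p: Pattern, min_occurrences: int = 3, min_contexts: int = 2) -> bool:
    """Return True only when all four criteria hold (thresholds are assumed)."""
    recurring = p.occurrences >= min_occurrences
    broad = p.distinct_contexts >= min_contexts
    structured = p.has_clear_steps
    stable = not p.one_off_prompt_enough
    return recurring and broad and structured and stable
```

A pattern that fails any single check stays a note or workflow rule, which matches the guidance above.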

Scaffold

Run:

python3 {baseDir}/scripts/generate_skill_scaffold.py \
  --skill-name my-skill \
  --output ./generated \
  --goal "What the skill should achieve." \
  --triggers "keyword1, keyword2" \
  --template auto

Validate

Run:

python3 {baseDir}/scripts/validate_skill_candidate.py ./generated/my-skill

Propose Installation

Run:

python3 {baseDir}/scripts/propose_skill_install.py ./generated/my-skill

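Use --apply only after the candidate has been reviewed. Per the Decision rules, apply is guarded for Telegram-approved pipelines and should not be called directly.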

Record Feedback

Run:

python3 {baseDir}/scripts/record_skill_feedback.py \
  --skill my-skill \
  --agent-name StudyAgent \
  --rating negative \
  --feedback "Trigger: literature review. The skill should handle literature review planning better." \
  --json
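
The Decision rules require feedback text to be redacted before it is stored. The sketch below shows one way that could look, assuming a few simple regular expressions. The patterns and the `redact` function are illustrative only and are not the redaction engine the skill ships.

```python
import re

# Illustrative patterns only (assumptions); the shipped redaction engine may differ.
_PATTERNS = [
    # Telegram-style bot tokens: digits, a colon, then a long token body.
    (re.compile(r"\b\d{6,}:[A-Za-z0-9_-]{30,}\b"), "[REDACTED_TOKEN]"),
    # Email addresses.
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[REDACTED_EMAIL]"),
    # key=value or key: value secrets such as api_key=... or password: ...
    (re.compile(r"(?i)\b(api[_-]?key|password|secret|token)\s*[=:]\s*\S+"),
     r"\1=[REDACTED]"),
]


def redact(text: str) -> str:
    """Apply each pattern in order and return the scrubbed text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Text without any matching pattern passes through unchanged, so ordinary feedback stays readable.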

Propose Evolution

Run:

python3 {baseDir}/scripts/evolve_skill_pipeline.py \
  --skill my-skill \
  --output ./generated-updates \
  --install plan \
  --replay hidden \
  --agent-name StudyAgent \
  --json

Replay Evaluation

Run:

python3 {baseDir}/scripts/replay/collect_replay_cases.py \
  --feedback-file ~/.openclaw/workspace/.learnings/skill-feedback.jsonl \
  --skill my-skill \
  --json

python3 {baseDir}/scripts/replay/run_replay_eval.py \
  --skill-dir ./generated-updates/my-skill \
  --skill my-skill \
  --json

Nightly Review

Somnia is now packaged as its own skill for scheduled maintenance. These compatibility commands remain available from Skill Forge:

python3 {baseDir}/scripts/nightly_skill_review.py \
  --scope managed \
  --propose-updates \
  --replay hidden \
  --update-install plan \
  --json

Install a daily schedule on macOS (via launchd) that runs during sleep hours, defaulting to 03:00 local time:

python3 {baseDir}/scripts/schedule_nightly_review.py \
  --hour 3 \
  --minute 0 \
  --scope managed \
  --propose-updates \
  --update-install plan \
  --apply \
  --json

Use --telegram-report with --env-file ~/.openclaw/skill-forge.env when scheduled runs should send a summary report to Telegram. The env file should define TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID.
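
Before enabling Telegram reports, it can help to confirm that the env file defines both variables. This is a minimal sketch assuming the file uses plain KEY=VALUE lines, optionally prefixed with `export`. The helper names are hypothetical and the skill's own scripts may parse the file differently.

```python
from pathlib import Path

REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not values.get(k)]


def check_env_file(path: str) -> list[str]:
    """Read the env file (with ~ expanded) and report missing keys."""
    return missing_keys(parse_env_text(Path(path).expanduser().read_text()))
```

For example, `check_env_file("~/.openclaw/skill-forge.env")` returns an empty list when both values are set. The check never prints the values, so the token is not echoed.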

Uninstall or roll back a generated skill:

python3 {baseDir}/scripts/propose_skill_install.py my-skill --uninstall --apply
python3 {baseDir}/scripts/propose_skill_install.py my-skill --uninstall --restore-backup --apply

Full Pipeline

For one-shot operation:

python3 {baseDir}/scripts/forge_pipeline.py \
  --source ~/.openclaw/workspace/.learnings/FEATURE_REQUESTS.md \
  --source ~/.openclaw/workspace/.learnings/ERRORS.md \
  --output ./generated \
  --eval hidden \
  --json

Ask through Telegram before installing:

export TELEGRAM_BOT_TOKEN="..."
export TELEGRAM_CHAT_ID="..."

python3 {baseDir}/scripts/forge_pipeline.py \
  --source ~/.openclaw/workspace/.learnings/FEATURE_REQUESTS.md \
  --source ~/.openclaw/workspace/.learnings/ERRORS.md \
  --output ./generated \
  --install telegram \
  --agent-name StudyAgent \
  --json

Decision rules

Prefer creating a skill over a new agent when the new capability is narrow.
Prefer a new agent over a skill when the work needs a distinct role, tools, and long-term memory boundary.
Prefer references/ when the skill mainly teaches structure and judgment.
Prefer scripts/ when the same code would otherwise be rewritten repeatedly.
Default to --install plan; use --install telegram for any mutation.
Treat --install ask and --install auto as blocked compatibility aliases.
Do not call propose_skill_install.py --apply directly; apply is guarded for Telegram-approved pipelines.
Keep --eval hidden for user-facing flows so simulated checks and prompts are not exposed.
Use --agent-name before installation when a specific agent will receive the skill.
Never hard-code Telegram tokens; discover them from OpenClaw/env files or environment variables.
Redact feedback text before storing it.
Treat feedback-driven changes as update candidates, not direct edits to installed skills.
Install evolved skills only after validation, hidden evaluation, authorization, and approval.
Use replay as a regression gate when feedback-derived cases exist.
Scheduled nightly review may propose updates, but install changes still require Telegram approval.
Nightly reports should show only health summaries, not hidden evaluation prompts or simulated checks.
Default nightly review scope is managed; use --scope all only for explicit full inventory audits.
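
The install-mode rules above can be summarized as a small resolver. This is a sketch under stated assumptions: only the mode names (plan, telegram, ask, auto) come from the rules, while the function name, its arguments, and the exception types are hypothetical.

```python
BLOCKED_ALIASES = {"ask", "auto"}     # compatibility aliases the rules treat as blocked
ALLOWED_MODES = {"plan", "telegram"}  # plan is the default; telegram gates mutations


def resolve_install_mode(requested=None, mutates=False):
    """Return the effective install mode, or raise if the rules forbid it."""
    mode = requested or "plan"
    if mode in BLOCKED_ALIASES:
        raise ValueError(f"--install {mode} is a blocked compatibility alias")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown install mode: {mode}")
    if mutates and mode != "telegram":
        raise PermissionError("mutations require --install telegram")
    return mode
```

Failing closed keeps a mistyped or legacy flag from silently turning into an install.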

Output Contract

The skill-forge output should include:

detected opportunity name
recommended template profile
generated candidate path
validation score and grade
install status
approval status when Telegram confirmation is used
nightly review report path when running scheduled review
install plan
review warnings, if any
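
A caller consuming --json output could check for the items listed above before relying on them. The key names below are hypothetical, one per listed item, and the real JSON keys emitted by the scripts may differ. Treat this as a sketch of the check, not the schema.

```python
# Hypothetical key names (assumptions); the scripts' actual JSON keys may differ.
EXPECTED_FIELDS = (
    "opportunity_name",
    "template_profile",
    "candidate_path",
    "validation_score",
    "validation_grade",
    "install_status",
    "install_plan",
)

# Items the contract lists as conditional: Telegram approval, scheduled review, warnings.
OPTIONAL_FIELDS = ("approval_status", "nightly_report_path", "review_warnings")


def missing_fields(result: dict) -> list[str]:
    """Return expected fields that are absent from a pipeline result."""
    return [name for name in EXPECTED_FIELDS if name not in result]
```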

Quality Gates

Before proposing installation, confirm:

the skill name is concrete and reusable
the description has clear trigger conditions
the generated structure matches the intended job
the candidate improves a real recurring workflow
validation score is at least 70
grade=milestone is preferred before sharing externally
installation threshold is at least 85 unless the user explicitly chooses otherwise
hidden smoke evaluation passes
target Agent profile policy allows the generated skill profile
Telegram approval is available before applying install or uninstall changes
feedback-derived updates are reviewed as candidates before replacing an installed skill
replay evaluation passes when replay cases exist
nightly review reports are written before any update proposal is installed
scheduled automation uses explicit launchd configuration and can be uninstalled

Read references/skill-quality-rubric.md when evaluating a draft.
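
The numeric gates above (a score of at least 70 to propose, 85 by default to install) combine with the smoke and replay checks roughly as follows. The function name and the shape of the returned dictionary are assumptions, and the sketch leaves out the Agent profile policy and Telegram approval gates.

```python
def gate_candidate(score, smoke_passed, replay_passed=None, install_threshold=85):
    """Evaluate the numeric and evaluation gates for a candidate.

    replay_passed is None when no replay cases exist, so the replay gate is skipped.
    """
    reasons = []
    if score < 70:
        reasons.append("validation score below 70")
    if not smoke_passed:
        reasons.append("hidden smoke evaluation failed")
    if replay_passed is False:
        reasons.append("replay evaluation failed")
    can_propose = not reasons
    return {
        "can_propose": can_propose,
        # Meeting the threshold is necessary but not sufficient to install.
        "meets_install_threshold": can_propose and score >= install_threshold,
        "reasons": reasons,
    }
```

A candidate scoring 75 can be proposed but falls short of the default install threshold unless the user explicitly lowers it.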

Resources

References:

references/heuristics.md
references/skill-quality-rubric.md
references/milestone-architecture.md

Scripts:

scripts/detect_skill_opportunities.py
scripts/evaluate_skill_candidate.py
scripts/generate_skill_scaffold.py
scripts/validate_skill_candidate.py
scripts/forge_pipeline.py
scripts/install/propose_skill_install.py
scripts/install/telegram_approval.py
scripts/evolve/evolve_skill_pipeline.py
scripts/evolve/propose_skill_update.py
scripts/evolve/record_skill_feedback.py
scripts/replay/collect_replay_cases.py
scripts/replay/compare_replay_outputs.py
scripts/replay/redact_replay_case.py
scripts/replay/replay_report.py
scripts/replay/run_replay_eval.py
scripts/somnia/nightly_skill_review.py
scripts/somnia/schedule_nightly_review.py

Compatibility wrappers:

scripts/propose_skill_install.py
scripts/telegram_approval.py
scripts/evolve_skill_pipeline.py
scripts/propose_skill_update.py
scripts/record_skill_feedback.py
scripts/nightly_skill_review.py
scripts/schedule_nightly_review.py

Usage Guidance

Install only if you want a skill that can manage other skills. Keep workflows in plan/review mode, inspect generated or evolved skill folders before applying them, protect Telegram credentials, and only enable the nightly schedule if you understand how to monitor and remove it.

Capability Analysis

Type: OpenClaw Skill
Name: skill-forge-toolkit
Version: 0.4.3

The skill-forge-toolkit is a sophisticated framework for automated skill generation, validation, and lifecycle management. It possesses high-privilege capabilities, such as generating Python code, installing files to the local system, and establishing persistence via macOS LaunchAgents (scripts/somnia/schedule_nightly_review.py), but these actions are strictly aligned with its stated purpose of 'self-evolving' agent capabilities. The bundle demonstrates a strong security posture by implementing mandatory human-in-the-loop approval via Telegram (scripts/install/telegram_approval.py) for all installations, a robust secret redaction engine to prevent credential leakage in logs (scripts/replay/replay_common.py), and agent-specific authorization policies (scripts/lib/policy.py). No evidence of malicious intent, data exfiltration, or unauthorized remote execution was found.

Capability Tags

requires-oauth-token
requires-sensitive-credentials

Capability Assessment

ℹ Purpose & Capability

The artifacts are coherent with the stated purpose of detecting repeated needs, scaffolding skills, validating candidates, and proposing installation, but that purpose inherently affects the agent runtime.

⚠ Instruction Scope

The documented workflows include high-impact actions such as applying skill installs/uninstalls and scheduling nightly reviews. The instructions include approval language, but users should treat these as runtime-changing actions.

⚠ Install Mechanism

There is no dependency install spec, but the skill ships many Python scripts and documents commands that can apply generated skills into the OpenClaw skills area or uninstall/restore them.

ℹ Credentials

Reading `.openclaw/workspace/.learnings` files and using optional Telegram credentials are purpose-aligned, but those paths and credentials can contain sensitive operational context.

⚠ Persistence & Privilege

The skill documents persistent feedback storage and a daily scheduled review job; both are disclosed, but they continue to influence future skill changes beyond a single session.

Version History

v0.4.3

Legacy slug update. Use skills-forge instead. Includes the v0.4.3 path-safety fixes and clean static scan posture.

v0.4.2

Initial public preview with Telegram-gated installs, hidden evaluation, replay regression checks, release checks, and optional Somnia support.

Metadata

Slug skill-forge-toolkit

Version 0.4.3

License MIT-0

All-time Installs 0

Active Installs 0

Total Versions 2

Frequently Asked Questions

What is Skill Forge (legacy slug)?

Detect repeated capability gaps, convert recurring user needs into candidate skills, scaffold new OpenClaw-compatible skills, and validate them before instal... It is an AI Agent Skill for Claude Code / OpenClaw, with 31 downloads so far.

How do I install Skill Forge (legacy slug)?

Run "/install skill-forge-toolkit" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Skill Forge (legacy slug) free?

Yes, Skill Forge (legacy slug) is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Skill Forge (legacy slug) support?

Skill Forge (legacy slug) runs anywhere OpenClaw / Claude Code is available. The nightly schedule installer targets macOS (launchd).

Who created Skill Forge (legacy slug)?

It is built and maintained by SheepXu (@sheepxux); the current version is v0.4.3.
