skill-forge
Use this skill to turn repeated demand into a reviewed skill candidate.
Current version: v0.4.3 "Safety Tightening".
Core jobs
- Detect repeated capability gaps from logs, .learnings, and feature requests.
- Decide whether the pattern is stable enough to deserve its own skill.
- Classify the opportunity as academic, product, integration, script, or workflow.
- Generate a candidate skill folder with a usable SKILL.md, agents/openai.yaml, and profile-specific resources.
- Validate and score the candidate before proposing installation.
- Run hidden smoke evaluation and Agent profile authorization before confirmed installation.
- Record feedback from later usage and propose reviewed updates instead of mutating installed skills directly.
- Provide the Somnia runtime used by scheduled nightly review to find skills with bugs, weak scores, or update-worthy feedback.
- Use redacted replay cases to check whether candidates actually cover real feedback-derived tasks.
Trigger Cues
Use this skill when the user or agent mentions:
- repeated failure
- missing capability
- recurring workflow
- feature request
- make a new skill
- scaffold a skill
- validate a generated skill
Default Workflow
- Detect repeated capability gaps from learning files or session-derived notes.
- Classify the strongest opportunity into a skill profile.
- Scaffold a candidate skill with profile-specific resources.
- Validate the candidate and inspect score, warnings, and references.
- Run hidden smoke evaluation without exposing simulated cases to users.
- Propose installation, then require Telegram approval before applying it.
- Record future usage feedback and run an evolution pipeline when enough feedback accumulates.
- During scheduled reviews, write summary reports and optionally propose updates without exposing hidden evaluation details.
- Run replay evaluation before approving evolved candidates when replay cases exist.
Commands
Detect
Run:
python3 {baseDir}/scripts/detect_skill_opportunities.py --json
Add --source paths when a specific workspace or learnings file should be analyzed.
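As a rough illustration of adding --source paths, the sketch below assembles the detection command as an argument list. The helper name build_detect_command is hypothetical, and it assumes --source is given once per path, as the Full Pipeline example does for forge_pipeline.py; the real script may accept paths differently.

```python
from pathlib import Path
from typing import Iterable, List


def build_detect_command(base_dir: str, sources: Iterable[str] = ()) -> List[str]:
    """Assemble the argv list for the detection script (hypothetical helper)."""
    script = Path(base_dir) / "scripts" / "detect_skill_opportunities.py"
    argv = ["python3", str(script), "--json"]
    for source in sources:
        # Assumption: --source is passed once per path to analyze.
        argv += ["--source", str(Path(source).expanduser())]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_detect_command("/opt/skill-forge", ["/tmp/ERRORS.md"])))
```

The resulting list can be handed to subprocess.run; the shape of the JSON the script prints is not documented here, so parse it defensively.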
Choose
Only create a skill when the pattern is:
- recurring
- broad enough to reuse
- structured enough to document
- more stable than a one-off prompt
If the need is too narrow, keep it as a note or workflow rule instead.
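The four criteria above can be read as an all-or-nothing check. The following is a minimal sketch of that reading; the class, field, and function names are hypothetical, and the toolkit's own detection scoring may weigh these signals differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternAssessment:
    """A reviewer's yes/no answers to the four criteria (hypothetical field names)."""
    recurring: bool
    broad_enough_to_reuse: bool
    structured_enough_to_document: bool
    more_stable_than_one_off_prompt: bool


def recommend(assessment: PatternAssessment) -> str:
    """Return "skill" only when every criterion holds, otherwise "note"."""
    checks = (
        assessment.recurring,
        assessment.broad_enough_to_reuse,
        assessment.structured_enough_to_document,
        assessment.more_stable_than_one_off_prompt,
    )
    return "skill" if all(checks) else "note"
```

A pattern that recurs but is too narrow to reuse would return "note" here, matching the advice to keep it as a note or workflow rule.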
Scaffold
Run:
python3 {baseDir}/scripts/generate_skill_scaffold.py \
--skill-name my-skill \
--output ./generated \
--goal "What the skill should achieve." \
--triggers "keyword1, keyword2" \
--template auto
Validate
Run:
python3 {baseDir}/scripts/validate_skill_candidate.py ./generated/my-skill
Propose Installation
Run:
python3 {baseDir}/scripts/propose_skill_install.py ./generated/my-skill
Use --apply only after the candidate has been reviewed.
Record Feedback
Run:
python3 {baseDir}/scripts/record_skill_feedback.py \
--skill my-skill \
--agent-name StudyAgent \
--rating negative \
--feedback "触发词: literature review. The skill should handle literature review planning better." \
--json
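The decision rules ask for feedback text to be redacted before it is stored. The sketch below shows one way such a step might look; the patterns and placeholders are illustrative assumptions, not the redaction rules that record_skill_feedback.py actually applies.

```python
import re

# Illustrative patterns only (assumptions): an email address and a string
# shaped like a Telegram bot token (digits, a colon, then a long key).
_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    (re.compile(r"\b\d{6,}:[A-Za-z0-9_-]{30,}"), "[TOKEN]"),
]


def redact(text: str) -> str:
    """Replace each match of the illustrative patterns with a placeholder."""
    for pattern, placeholder in _PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Text without matching patterns passes through unchanged, so ordinary feedback wording is preserved.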
Propose Evolution
Run:
python3 {baseDir}/scripts/evolve_skill_pipeline.py \
--skill my-skill \
--output ./generated-updates \
--install plan \
--replay hidden \
--agent-name StudyAgent \
--json
Replay Evaluation
Run:
python3 {baseDir}/scripts/replay/collect_replay_cases.py \
--feedback-file ~/.openclaw/workspace/.learnings/skill-feedback.jsonl \
--skill my-skill \
--json
python3 {baseDir}/scripts/replay/run_replay_eval.py \
--skill-dir ./generated-updates/my-skill \
--skill my-skill \
--json
Nightly Review
Somnia is now packaged as its own skill for scheduled maintenance. These compatibility commands remain available from Skill Forge:
python3 {baseDir}/scripts/nightly_skill_review.py \
--scope managed \
--propose-updates \
--replay hidden \
--update-install plan \
--json
Install a daily macOS schedule that runs during sleep hours (03:00 local time by default):
python3 {baseDir}/scripts/schedule_nightly_review.py \
--hour 3 \
--minute 0 \
--scope managed \
--propose-updates \
--update-install plan \
--apply \
--json
Use --telegram-report with --env-file ~/.openclaw/skill-forge.env when scheduled runs should send a summary report to Telegram. The env file should define TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID.
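Before enabling Telegram reports, it can help to confirm that the env file defines both variables. The sketch below assumes a plain KEY=VALUE file, optionally with an export prefix and quoted values; the function names are hypothetical, and the toolkit's own loader may parse the file differently.

```python
from pathlib import Path
from typing import Dict, List

REQUIRED_KEYS = ("TELEGRAM_BOT_TOKEN", "TELEGRAM_CHAT_ID")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments (assumed format)."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """List required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def check_env_file(path: str) -> List[str]:
    """Read the env file and report which required keys are missing."""
    return missing_keys(parse_env_text(Path(path).expanduser().read_text()))
```

Calling check_env_file("~/.openclaw/skill-forge.env") would return an empty list when both keys are set. The sketch reports only key names, never the values, so tokens stay out of logs.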
Uninstall or roll back a generated skill:
python3 {baseDir}/scripts/propose_skill_install.py my-skill --uninstall --apply
python3 {baseDir}/scripts/propose_skill_install.py my-skill --uninstall --restore-backup --apply
Full Pipeline
For one-shot operation:
python3 {baseDir}/scripts/forge_pipeline.py \
--source ~/.openclaw/workspace/.learnings/FEATURE_REQUESTS.md \
--source ~/.openclaw/workspace/.learnings/ERRORS.md \
--output ./generated \
--eval hidden \
--json
Ask through Telegram before installing:
export TELEGRAM_BOT_TOKEN="..."
export TELEGRAM_CHAT_ID="..."
python3 {baseDir}/scripts/forge_pipeline.py \
--source ~/.openclaw/workspace/.learnings/FEATURE_REQUESTS.md \
--source ~/.openclaw/workspace/.learnings/ERRORS.md \
--output ./generated \
--install telegram \
--agent-name StudyAgent \
--json
Decision rules
- Prefer creating a skill over a new agent when the new capability is narrow.
- Prefer a new agent over a skill when the work needs a distinct role, tools, and long-term memory boundary.
- Prefer references/ when the skill mainly teaches structure and judgment.
- Prefer scripts/ when the same code would otherwise be rewritten repeatedly.
- Default to --install plan; use --install telegram for any mutation.
- Treat --install ask and --install auto as blocked compatibility aliases.
- Do not call propose_skill_install.py --apply directly; apply is guarded for Telegram-approved pipelines.
- Keep --eval hidden for user-facing flows so simulated checks and prompts are not exposed.
- Use --agent-name before installation when a specific agent will receive the skill.
- Never hard-code Telegram tokens; discover them from OpenClaw/env files or environment variables.
- Redact feedback text before storing it.
- Treat feedback-driven changes as update candidates, not direct edits to installed skills.
- Install evolved skills only after validation, hidden evaluation, authorization, and approval.
- Use replay as a regression gate when feedback-derived cases exist.
- Scheduled nightly review may propose updates, but install changes still require Telegram approval.
- Nightly reports should show only health summaries, not hidden evaluation prompts or simulated checks.
- Default nightly review scope is managed; use --scope all only for explicit full inventory audits.
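The install-mode rules above can be sketched as a small resolver. This is an illustration under stated assumptions: the function names are hypothetical, and rejecting the blocked aliases with an error is one possible behavior; the real pipeline may handle them another way.

```python
from typing import Optional

ALLOWED_MODES = {"plan", "telegram"}
BLOCKED_ALIASES = {"ask", "auto"}  # kept only as blocked compatibility aliases


def resolve_install_mode(requested: Optional[str] = None) -> str:
    """Default to "plan" and refuse blocked or unknown modes (assumed behavior)."""
    if requested is None:
        return "plan"
    if requested in BLOCKED_ALIASES:
        raise ValueError(f"--install {requested} is a blocked compatibility alias")
    if requested not in ALLOWED_MODES:
        raise ValueError(f"unknown install mode: {requested}")
    return requested


def may_mutate(mode: str, telegram_approved: bool) -> bool:
    """Allow changes only in telegram mode and only after approval was granted."""
    return mode == "telegram" and telegram_approved
```

Under this reading, plan mode never changes an installed skill, and telegram mode changes one only once approval has arrived.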
Output Contract
The skill-forge output should include:
- detected opportunity name
- recommended template profile
- generated candidate path
- validation score and grade
- install status
- approval status when Telegram confirmation is used
- nightly review report path when running scheduled review
- install plan
- review warnings, if any
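One way to picture the output contract is as a single record. The field names below are hypothetical, chosen to mirror the list above; the actual JSON keys emitted by the scripts are not documented in this section.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ForgeReport:
    """Illustrative container for the output contract (hypothetical field names)."""
    opportunity_name: str
    template_profile: str
    candidate_path: str
    validation_score: int
    validation_grade: str
    install_status: str
    install_plan: List[str] = field(default_factory=list)
    approval_status: Optional[str] = None      # set only when Telegram confirmation is used
    nightly_report_path: Optional[str] = None  # set only for scheduled review runs
    review_warnings: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the report as a JSON object."""
        return json.dumps(asdict(self), indent=2)
```

Optional fields default to None so that a run without Telegram confirmation or scheduled review still produces a complete record.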
Quality Gates
Before proposing installation, confirm:
- the skill name is concrete and reusable
- the description has clear trigger conditions
- the generated structure matches the intended job
- the candidate improves a real recurring workflow
- validation score is at least 70
- grade=milestone is preferred before sharing externally
- installation threshold is at least 85 unless the user explicitly chooses otherwise
- hidden smoke evaluation passes
- target Agent profile policy allows the generated skill profile
- Telegram approval is available before applying install or uninstall changes
- feedback-derived updates are reviewed as candidates before replacing an installed skill
- replay evaluation passes when replay cases exist
- nightly review reports are written before any update proposal is installed
- scheduled automation uses explicit launchd configuration and can be uninstalled
Read references/skill-quality-rubric.md when evaluating a draft.
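The two numeric thresholds among the gates can be sketched as follows. This assumes 70 is the minimum to pass validation and 85 the default minimum to propose installation; the status labels and function name are hypothetical, and the remaining gates (profile policy, Telegram approval, replay) are deliberately left out.

```python
VALIDATION_MIN = 70  # minimum validation score
INSTALL_MIN = 85     # default installation threshold; the user may choose another


def gate_status(score: int, hidden_smoke_passed: bool, install_min: int = INSTALL_MIN) -> str:
    """Classify a candidate by score and smoke result (illustrative labels)."""
    if score < VALIDATION_MIN:
        return "reject"
    if score < install_min or not hidden_smoke_passed:
        return "review"
    # Eligible to be proposed; approval is still required before applying.
    return "eligible"
```

Passing install_min explicitly models the case where the user chooses a different installation threshold.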
Resources
References:
- references/heuristics.md
- references/skill-quality-rubric.md
- references/milestone-architecture.md
Scripts:
- scripts/detect_skill_opportunities.py
- scripts/evaluate_skill_candidate.py
- scripts/generate_skill_scaffold.py
- scripts/validate_skill_candidate.py
- scripts/forge_pipeline.py
- scripts/install/propose_skill_install.py
- scripts/install/telegram_approval.py
- scripts/evolve/evolve_skill_pipeline.py
- scripts/evolve/propose_skill_update.py
- scripts/evolve/record_skill_feedback.py
- scripts/replay/collect_replay_cases.py
- scripts/replay/compare_replay_outputs.py
- scripts/replay/redact_replay_case.py
- scripts/replay/replay_report.py
- scripts/replay/run_replay_eval.py
- scripts/somnia/nightly_skill_review.py
- scripts/somnia/schedule_nightly_review.py
Compatibility wrappers:
- scripts/propose_skill_install.py
- scripts/telegram_approval.py
- scripts/evolve_skill_pipeline.py
- scripts/propose_skill_update.py
- scripts/record_skill_feedback.py
- scripts/nightly_skill_review.py
- scripts/schedule_nightly_review.py
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install skill-forge-toolkit
- After installation, invoke the skill by name or use /skill-forge-toolkit
- Provide required inputs per the skill's parameter spec and get structured output
What is Skill Forge (legacy slug)?
Detect repeated capability gaps, convert recurring user needs into candidate skills, scaffold new OpenClaw-compatible skills, and validate them before instal... It is an AI Agent Skill for Claude Code / OpenClaw, with 31 downloads so far.
How do I install Skill Forge (legacy slug)?
Run "/install skill-forge-toolkit" in the OpenClaw or Claude Code chat to install it in one step; no extra setup is required.
Is Skill Forge (legacy slug) free?
Yes, Skill Forge (legacy slug) is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does Skill Forge (legacy slug) support?
Skill Forge (legacy slug) is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created Skill Forge (legacy slug)?
It is built and maintained by SheepXu (@sheepxux); the current version is v0.4.3.