Downloads: 68
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install talonforge-safety
Description
Automatically configures trust levels, non-negotiable safety rules, prompt injection defenses, and approval workflows for secure AI interactions.
Usage Guidance
This skill looks like a genuine safety-rails template, but it contains several gaps you should resolve before installing:
(1) Verify the npm packages it asks you to install (ai-sentinel, skill-guard): inspect their source, maintainers, and npm page; don't run npx blindly.
(2) Ask the author how the agent is expected to access email/messaging channels and where any tokens are stored; prefer explicit, minimal credential requirements and short-lived tokens.
(3) Confirm where installed tooling will be placed and what permissions it will have.
(4) Prefer an install manifest from a known origin (GitHub release or vetted registry) rather than ad-hoc npx commands.
(5) If you plan to allow the agent to read emails/files, limit access scope and test in a sandbox first.
If you cannot verify the third-party packages or the homepage/author identity, treat this as higher risk and do not install.
Capability Analysis
Type: OpenClaw Skill
Name: talonforge-safety
Version: 1.0.0
The bundle provides a framework for AI safety guardrails, including a multi-level trust system and defensive instructions designed to mitigate prompt injection and unauthorized autonomous actions. The SKILL.md file contains purely instructional content that constrains the agent's behavior (e.g., prohibiting financial transactions and treating inbound email as untrusted), and it references the installation of safety-oriented utilities (ai-sentinel, skill-guard) via the platform's package manager. No malicious logic, data exfiltration, or obfuscation was detected.
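To make the described behaviour concrete, here is a minimal Python sketch of a trust-level gate. It is not taken from the skill's SKILL.md: the TrustLevel names, the HARD_DENY and UNTRUSTED_SOURCES sets, and the decide function are all hypothetical, chosen only to illustrate how hard rules (such as prohibiting financial transactions) and an untrusted-email rule could take precedence over a configured trust level.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    # Hypothetical levels; the skill's actual level names are not shown in this listing.
    READ_ONLY = 0
    ASK_FIRST = 1
    AUTONOMOUS = 2


# Hypothetical hard rules: actions refused at every trust level.
HARD_DENY = {"financial_transaction"}

# Hypothetical sources whose content is treated as data, never as instructions.
UNTRUSTED_SOURCES = {"inbound_email"}


def decide(action: str, source: str, level: TrustLevel, mutating: bool) -> str:
    """Return "deny", "ask", or "allow" for a requested action."""
    if action in HARD_DENY:
        return "deny"   # non-negotiable rules win over any trust level
    if source in UNTRUSTED_SOURCES:
        return "deny"   # requests originating from untrusted content are not acted on
    if not mutating:
        return "allow"  # read-only actions pass at every level
    if level == TrustLevel.AUTONOMOUS:
        return "allow"
    if level == TrustLevel.ASK_FIRST:
        return "ask"    # route to the approval workflow
    return "deny"       # READ_ONLY never performs mutating actions
```

The ordering is the point of the sketch: the hard rules and the source check run before the trust level is consulted, so raising the trust level cannot unlock a prohibited action.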
Capability Assessment
Purpose & Capability
The skill claims to set up safety rails that include reading files, messages and emails and integrating with a 'verified messaging channel', but the package metadata declares no required env vars, credentials, or config paths. The ability to read and act on messages without any declared access requirements is an unexplained mismatch.
Instruction Scope
SKILL.md instructs the agent to collect user answers (risk tolerance, hard rules, verified channel) and to generate configuration, but it also prescribes behaviors that imply reading emails/messages and preventing/handling prompt-injection. The instructions also tell the user/agent to run npx install commands to add third-party components — this expands scope beyond the simple prose and is vague about what those components will do or what data they will access.
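The question-and-answer flow described above could be sketched as follows. The Answers fields mirror the three inputs the assessment names (risk tolerance, hard rules, verified channel); the allowed values, the output keys, and the build_config function are hypothetical and do not reflect the skill's real configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class Answers:
    risk_tolerance: str                                  # assumed values: "low", "medium", "high"
    hard_rules: list[str] = field(default_factory=list)
    verified_channel: str = ""                           # a channel identifier only, never a token


def build_config(answers: Answers) -> dict:
    """Validate the user's answers and turn them into a configuration dict."""
    allowed = {"low", "medium", "high"}
    if answers.risk_tolerance not in allowed:
        raise ValueError(f"risk_tolerance must be one of {sorted(allowed)}")
    if not answers.verified_channel:
        raise ValueError("a verified channel is required for approvals")
    return {
        "risk_tolerance": answers.risk_tolerance,
        "hard_rules": sorted(set(answers.hard_rules)),   # de-duplicate, stable order
        "approval_channel": answers.verified_channel,
        # Assumption for illustration: only "high" tolerance skips approval.
        "require_approval": answers.risk_tolerance != "high",
    }
```

Note that nothing in this sketch touches credentials or installs tooling, which is the narrower scope the prose instructions suggest.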
Install Mechanism
Although there is no formal install spec, the SKILL.md tells the operator to run 'npx clawhub@latest install ai-sentinel' and 'npx clawhub@latest install skill-guard'. That implies installing public npm packages at runtime via npx (moderate risk): those packages are external, their provenance and behavior are unknown, and installing them will persist executable code on disk without a vetted install manifest.
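A reviewer can list such commands mechanically before deciding what to vet. The sketch below scans instruction text for commands of the form quoted above; the regular expression and the find_npx_installs helper are illustrative assumptions, and the sample string simply repeats the two commands from this assessment.

```python
import re

# Matches commands of the form quoted in the assessment:
#   npx <runner> install <component>
NPX_INSTALL = re.compile(r"\bnpx\s+(\S+)\s+install\s+([A-Za-z0-9@/._-]+)")


def find_npx_installs(text: str) -> list[tuple[str, str]]:
    """Return (runner, component) pairs for every npx install command in text."""
    return NPX_INSTALL.findall(text)


sample = (
    "Run 'npx clawhub@latest install ai-sentinel' and "
    "'npx clawhub@latest install skill-guard'."
)
for runner, component in find_npx_installs(sample):
    print(f"needs review: {component} (installed via {runner})")
```

This only surfaces what would be installed; it says nothing about whether a component is safe, which still requires inspecting its source and maintainers.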
Credentials
The skill will likely need access to messaging channel credentials and possibly mailbox access to enforce email rules, but requires.env and primary credential fields are empty. Asking for a 'verified messaging channel' without declaring how tokens/credentials are supplied or stored is a proportionality mismatch and a potential blind spot for credential handling.
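The mismatch described here can be expressed as a simple check. In this sketch the declared_env argument stands in for the requires.env field named above; the IMPLIED_NEEDS mapping and the undeclared_needs function are hypothetical, and the keyword matching is a deliberately crude heuristic rather than a real analysis.

```python
# Hypothetical mapping from capabilities a skill's instructions imply
# to the kind of credential a reviewer would expect to see declared.
IMPLIED_NEEDS = {
    "email": "mailbox credential",
    "messaging channel": "messaging token",
}


def undeclared_needs(instructions: str, declared_env: list[str]) -> list[str]:
    """List credentials the instructions imply when the metadata declares none."""
    if declared_env:
        # Something is declared; a human still has to judge whether it is sufficient.
        return []
    text = instructions.lower()
    return [need for keyword, need in IMPLIED_NEEDS.items() if keyword in text]
```

For a skill whose instructions mention both email and a messaging channel while declaring no env vars, the function returns both expected credentials, which is the blind spot the assessment flags.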
Persistence & Privilege
The always flag is false and the skill is user-invocable (normal). However, the SKILL.md's recommended npx installs imply adding persistent tools (ai-sentinel, skill-guard) to the environment, which increases the long-term privilege surface even though the skill itself does not request always:true or system-wide config changes.
How to Use
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install talonforge-safety
- After installation, invoke the skill by name or use /talonforge-safety
- Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
First release. Bilingual safety guardrails from TalonForge.
Frequently Asked Questions
What is TalonForge Safety Rails (EN/AR)?
Automatically configures trust levels, non-negotiable safety rules, prompt injection defenses, and approval workflows for secure AI interactions. It is an AI Agent Skill for Claude Code / OpenClaw, with 68 downloads so far.
How do I install TalonForge Safety Rails (EN/AR)?
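Run "/install talonforge-safety" in the OpenClaw or Claude Code chat to install the skill itself. Note that its SKILL.md also asks you to install two additional components (ai-sentinel, skill-guard) via npx, so review those before treating setup as complete.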
Is TalonForge Safety Rails (EN/AR) free?
Yes, TalonForge Safety Rails (EN/AR) is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does TalonForge Safety Rails (EN/AR) support?
TalonForge Safety Rails (EN/AR) is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created TalonForge Safety Rails (EN/AR)?
It is built and maintained by zinou (@casperzinou); the current version is v1.0.0.