Distil Open Claw Pii

Name: Distil Open Claw Pii
Author: jgolebiowski

By Jacek Golebiowski · GitHub · v1.1.1 · MIT-0

cross-platform ⚠ suspicious

Install in OpenClaw

/install distil-open-claw-pii

Description

Redact PII from text locally using a fine-tuned 1B SLM. Text never leaves your machine. Handles names, emails, phones, addresses, SSNs, credit cards, IBANs,...

Usage instructions (SKILL.md)

PII Redaction Skill

When to use

Use this skill when the user asks to redact, anonymize, sanitize, or remove PII / personal data from text.

Privacy guarantee

CRITICAL: NEVER include the user's raw input text in your own responses, context, or reasoning. The entire point of this skill is that the frontier LLM (you) never sees the PII. You pass the text directly to the redaction script and only return the redacted output.

Prerequisites

Python 3
curl (for model download)

The setup script handles everything else (model download + server startup).

First-time setup

If the model server is not running yet, run:

bash scripts/setup.sh

This downloads the GGUF model (~5 GB) and starts the local inference server on port 8712.
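
One way to decide whether setup is needed is to attempt a plain TCP connection to port 8712. The sketch below is an illustration and not part of the skill: the helper name `is_server_running` is hypothetical, and a successful connection only shows that some process accepts connections on that port, not that it is the skill's inference server.

```python
import socket


def is_server_running(host: str = "127.0.0.1", port: int = 8712, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_server_running():
        print("Something is listening on 127.0.0.1:8712")
    else:
        print("Nothing is listening on port 8712; run: bash scripts/setup.sh")
```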

How to redact

Pass the user's text directly to the redaction script. Do not echo or repeat the raw text yourself.

python scripts/redact.py "text to redact"

For longer text, pipe it via stdin:

echo "text to redact" | python scripts/redact.py

Return the output to the user as-is.
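
If the redaction step is driven from Python rather than from the shell, the stdin form can be wrapped with `subprocess`. This is a minimal sketch, not the skill's own code: `redact_via_stdin` and its `command` parameter are hypothetical names, and it assumes that `scripts/redact.py` reads the text from stdin and prints the redacted text to stdout, as the pipe example suggests.

```python
import subprocess
import sys
from typing import Optional, Sequence


def redact_via_stdin(text: str, command: Optional[Sequence[str]] = None) -> str:
    """Send text to the redaction script on stdin and return its stdout.

    The raw text is never printed or logged here; only the script's
    output is handed back to the caller.
    """
    if command is None:
        # Assumed default: the stdin form of the script shown in SKILL.md.
        command = [sys.executable, "scripts/redact.py"]
    result = subprocess.run(
        list(command),
        input=text,           # delivered on stdin, not as a command-line argument
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the text on stdin also keeps it out of the script's command-line arguments, which other local users can typically see in the process list.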

`--show-entities` flag (use sparingly)

Adding --show-entities outputs the full JSON including the original PII values. Only use this when the user explicitly asks to see which entities were detected or needs the mapping for a downstream task. In normal redaction workflows, omit this flag -- displaying the raw entity values defeats the purpose of PII redaction.

python scripts/redact.py --show-entities "text to redact"

How to stop the server

bash scripts/stop.sh

Output format

By default the script prints only the redacted text -- PII tokens replace the sensitive data and the original values are never shown:

Hi, my name is [PERSON] and I need help with my recent order #ORD-29481.

You can reach me at [EMAIL] or call me at [PHONE]. I'm a [AGE_YEARS:34]-year-old [MARITAL_STATUS] woman living at [ADDRESS]...

With --show-entities, the script returns full JSON including original PII values (see flag note above for when this is appropriate).

See examples/ for full input/output samples.
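
To summarize what was redacted without touching any original values, the placeholder tokens in the default output can be counted by type. The sketch below is an assumption-laden illustration: the token pattern is inferred only from the sample output above (for example [PERSON] and [AGE_YEARS:34]), and `count_placeholders` is a hypothetical helper, not part of the skill.

```python
import re
from collections import Counter

# Matches tokens such as [PERSON] or [AGE_YEARS:34]. The pattern is inferred
# from the sample output only; it is an assumption, not a documented format.
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z_]*)(?::[^\]]*)?\]")


def count_placeholders(redacted_text: str) -> Counter:
    """Count placeholder tokens in redacted text, keyed by entity type."""
    return Counter(PLACEHOLDER.findall(redacted_text))


if __name__ == "__main__":
    sample = (
        "Hi, my name is [PERSON] and I need help with my recent order #ORD-29481. "
        "You can reach me at [EMAIL] or call me at [PHONE]. "
        "I'm a [AGE_YEARS:34]-year-old [MARITAL_STATUS] woman living at [ADDRESS]..."
    )
    for label, n in sorted(count_placeholders(sample).items()):
        print(label, n)
```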

Security recommendations

Key things to consider before installing:
- The metadata claims no required binaries, but the scripts require llama-server (llama.cpp), curl, and Python. Verify you have (or want) those installed.
- The model is downloaded from Hugging Face (official domain) into $HOME/.distil-pii; expect ~5 GB disk usage and network download.
- Important privacy nuance: the local model is instructed to include original PII values in the 'entities' array. The script prints only the redacted_text by default, but the original values are present inside the model response and will be printed if you use --show-entities (or if a bug/logging step captures the full response). If you need a stronger guarantee that original values are never returned, modify the system prompt/code so the model never includes raw values (e.g., store hashed/masked values or omit the 'value' field entirely).
- Confirm llama-server actually binds to localhost (not 0.0.0.0) and that your firewall blocks external access to port 8712 to avoid local network exposure.
- Run the setup in an isolated environment (VM/container) if you handle high-risk PII until you verify behavior; inspect server logs and verify no unexpected outbound connections.
- If you plan to share this skill in production, ask the publisher to: (1) update registry metadata to list required binaries, (2) document the privacy tradeoffs of the 'entities.value' field, and (3) provide checksums for the model download so you can verify integrity.

Functional analysis

Type: OpenClaw Skill
Name: distil-open-claw-pii
Version: 1.1.1

The skill provides local PII redaction by downloading a fine-tuned 1B parameter model from HuggingFace and running it via a local llama.cpp server (port 8712). The implementation in scripts/redact.py and scripts/setup.sh is transparent, uses standard libraries, and strictly follows the stated privacy goal of ensuring sensitive data never leaves the user's machine. The instructions in SKILL.md are designed to prevent the primary AI agent from seeing or leaking raw PII, which serves as a security best practice rather than a malicious injection.

Capability tags

crypto · can-make-purchases

Capability assessment

⚠ Purpose & Capability

The skill claims local-only redaction and no required binaries, but the provided scripts require a local 'llama-server' (llama.cpp) binary, curl, and Python. The registry metadata lists no required binaries even though setup.sh explicitly checks for llama-server and downloads a model. The metadata and the scripts contradict each other, and the publisher should correct the metadata or confirm which is accurate.

⚠ Instruction Scope

SKILL.md instructs the agent to 'NEVER include the user's raw input' and to return only redacted text by default. However, scripts/redact.py's system prompt and output schema explicitly require the entities array to include the original value field (the original PII). The script only prints the redacted text by default, but the model response will contain the original values (and --show-entities prints them). That contradiction increases risk of accidental exposure (logging, debugging, or misuse of --show-entities). The script only talks to localhost, not external endpoints.
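
One mitigation for the accidental-exposure risk is to drop the original values from the model response before anything else can log or print it. The sketch below assumes the response is a JSON object with a 'redacted_text' string and an 'entities' array whose items carry a 'value' field, as described above. The 'type' key in the example data and the helper name `strip_entity_values` are hypothetical, and this is not how scripts/redact.py currently behaves.

```python
import copy
import json


def strip_entity_values(response: dict) -> dict:
    """Return a copy of the response with every entity's 'value' field removed."""
    cleaned = copy.deepcopy(response)  # leave the caller's object untouched
    for entity in cleaned.get("entities") or []:
        if isinstance(entity, dict):
            entity.pop("value", None)
    return cleaned


if __name__ == "__main__":
    # Made-up example data; the "type" key is an assumption about the schema.
    raw = {
        "redacted_text": "Contact [PERSON] at [EMAIL].",
        "entities": [
            {"type": "PERSON", "value": "Jane Doe"},
            {"type": "EMAIL", "value": "jane@example.com"},
        ],
    }
    print(json.dumps(strip_entity_values(raw), indent=2))
```

Stripping after the fact narrows the window in which raw values exist, but the model still emits them; changing the system prompt so they are never generated is the stronger fix.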

✓ Install Mechanism

There is no package install spec; setup.sh downloads a ~5 GB GGUF model from huggingface.co (a known host) and starts a local llama-server. Downloading from Hugging Face is expected for local models; the install does not use obscure URLs or extract untrusted archives. It does start a background server and writes to $HOME/.distil-pii.
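
If an expected checksum for the model file is available, the download can be verified before the server loads it. The sketch below is a generic SHA-256 check, not part of setup.sh: the function names are hypothetical, and the expected digest has to come from a trusted source, since no checksum is mentioned in the listing.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a multi-gigabyte model never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare the file's SHA-256 digest with an expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A caller would pass the path of the downloaded GGUF file under $HOME/.distil-pii together with the published digest, and refuse to start the server when `matches_checksum` returns False.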

✓ Credentials

The skill requests no environment variables or external credentials, which is appropriate for a local redactor. It does create files under $HOME/.distil-pii (model and PID) which is proportionate for this purpose.

ℹ Persistence & Privilege

The skill runs a persistent local server (llama-server) and stores the model and a PID file under $HOME/.distil-pii. always:false (good). Running a local HTTP server on port 8712 is expected, but you should confirm the server binds only to localhost and verify the process is trusted.
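
Once the listening address of the server has been looked up (for example with a tool such as `ss -ltn` or `lsof -i :8712`), a small check can classify it as loopback-only or not. This is a sketch under that assumption: `is_loopback_bind` is a hypothetical helper, and it only inspects an address string; it does not probe the server or the firewall.

```python
import ipaddress


def is_loopback_bind(address: str) -> bool:
    """Return True if an IP literal is a loopback address (127.0.0.0/8 or ::1)."""
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Not an IP literal (e.g. "*" or a hostname): treat as not confirmed.
        return False


if __name__ == "__main__":
    for addr in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.20"):
        verdict = "loopback-only" if is_loopback_bind(addr) else "not loopback-only"
        print(addr, verdict)
```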

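How to use

Make sure OpenClaw is installed (local or Docker deployment).
Enter the install command in the chat box: /install distil-open-claw-pii
Once installation finishes, invoke the skill by name or trigger it with /distil-open-claw-pii
Provide the required input according to the skill's parameter description to get structured output.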

Version history

v1.1.1

Update docs to discourage --show-entities in normal workflows

v1.1.0

Default to redacted-text-only output, add --show-entities flag for full JSON

v1.0.1

Add ClawHub install instructions to README

v1.0.0

Initial release of PII redaction skill.
- Redacts personal information (names, emails, phones, addresses, SSNs, credit cards, IBANs, and more) from text using a fine-tuned local model.
- Operates entirely on your machine; text is never sent externally.
- Returns structured JSON with redacted text and detected entities.
- Includes setup, usage, and server management instructions.

Metadata

Slug distil-open-claw-pii

Version 1.1.1

License MIT-0

Total installs 0

Current installs 0

Number of versions 4

FAQ