xueyetianya

Dedupe

Author: bytesagain4 · GitHub · v1.0.0 · MIT-0
cross-platform ✓ Security check passed
Total downloads: 178
Favorites: 0
Current installs: 2
Versions: 1
Install in OpenClaw
/install dedupe
Description
Deduplication reference — exact matching, fuzzy matching, hash-based dedup, bloom filters, and data quality. Use when removing duplicate records, files, or d...
Usage (SKILL.md)

Dedupe — Data Deduplication Reference

Quick-reference skill for deduplication strategies, algorithms, and data quality patterns.

When to Use

  • Removing duplicate rows from datasets or databases
  • Deduplicating files in storage systems
  • Implementing fuzzy matching for near-duplicate detection
  • Choosing between exact and probabilistic dedup methods
  • Building ETL pipelines with deduplication stages

Commands

intro

scripts/script.sh intro

Overview of deduplication — types, strategies, and tradeoffs.

exact

scripts/script.sh exact

Exact deduplication — hash-based, key-based, and sorting approaches.
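As a rough illustration of the hash-based approach, the sketch below keeps the first occurrence of each record and compares records by SHA-256 digest. The function name `dedupe_exact` and the sample rows are hypothetical and are not taken from the skill's script.

```python
import hashlib


def dedupe_exact(records):
    """Return records with exact duplicates removed, preserving first-seen order.

    Hypothetical helper: records are compared by the SHA-256 digest of their
    UTF-8 bytes, so only byte-identical strings count as duplicates.
    """
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


rows = ["alice,30", "bob,25", "alice,30", "carol,41", "bob,25"]
print(dedupe_exact(rows))
```

Storing digests rather than full records keeps memory bounded per entry when records are large. For short strings, a plain set of the records themselves would work just as well.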

fuzzy

scripts/script.sh fuzzy

Fuzzy deduplication — similarity measures, blocking, and record linkage.
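A minimal sketch of fuzzy matching with blocking, assuming records are short name strings. It uses the standard library's `difflib.SequenceMatcher` as the similarity measure and the first character of the normalized name as the blocking key. The names `normalize`, `fuzzy_pairs`, the blocking rule, and the 0.85 threshold are illustrative assumptions.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def fuzzy_pairs(names, threshold=0.85):
    """Return (a, b, score) for candidate near-duplicate pairs.

    Blocking: only names whose normalized form starts with the same character
    are compared, which avoids scoring every possible pair.
    """
    blocks = defaultdict(list)
    for name in names:
        blocks[normalize(name)[:1]].append(name)

    pairs = []
    for members in blocks.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                a, b = members[i], members[j]
                score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
                if score >= threshold:
                    pairs.append((a, b, score))
    return pairs


for a, b, score in fuzzy_pairs(["Jon Smith", "John Smith", "Jane Doe", "Mary Major"]):
    print(f"{a!r} ~ {b!r} ({score:.2f})")
```

A coarse blocking key such as the first character can miss pairs that differ in that character. Real pipelines often combine several blocking keys to reduce that risk.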

files

scripts/script.sh files

File-level deduplication — fdupes, jdupes, rdfind, and storage dedup.
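The general idea behind content-based file dedup can be sketched as follows: group files by size first, then hash only the files that share a size. This is an illustrative sketch and not how fdupes, jdupes, or rdfind are implemented. The function names `file_digest` and `find_duplicate_files` are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Hash a file in chunks so large files are never loaded fully into memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicate_files(root):
    """Return groups (sorted lists of paths) of files with identical content."""
    # Pass 1: files with a unique size cannot have a duplicate, so skip hashing them.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the size collisions and group by digest.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicate_files("some/dir")` only reports groups; it deletes nothing. Deciding which copy to keep is left to the caller.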

algorithms

scripts/script.sh algorithms

Dedup algorithms — bloom filters, HyperLogLog, MinHash, SimHash.
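To make the bloom filter idea concrete, here is a small sketch of one used as a probabilistic "seen before?" check. The class name, the bit-array size, and the way hash positions are derived from SHA-256 are assumptions chosen for readability, not tuned values.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: no false negatives, but false positives are possible."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
for event_id in ["evt-1", "evt-2", "evt-1"]:
    if event_id in seen:
        print("probably a duplicate:", event_id)
    else:
        seen.add(event_id)
```

Because a positive answer may be a false positive, a filter like this suits cases where occasionally dropping a non-duplicate is acceptable, or where a positive triggers an exact check against a slower store.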

sql

scripts/script.sh sql

SQL deduplication patterns — ROW_NUMBER, DISTINCT, GROUP BY strategies.
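One of the GROUP BY strategies can be tried end to end with Python's built-in `sqlite3` module. The sketch below keeps the row with the lowest `id` for each email and deletes the rest. The `users` table, its columns, and the function name `dedupe_users` are hypothetical.

```python
import sqlite3


def dedupe_users(rows):
    """Load (email, name) rows, delete duplicates by email, return what remains."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)"
    )
    conn.executemany("INSERT INTO users (email, name) VALUES (?, ?)", rows)

    # Keep the first-inserted row per email; every other row is a duplicate.
    conn.execute(
        "DELETE FROM users "
        "WHERE id NOT IN (SELECT MIN(id) FROM users GROUP BY email)"
    )
    remaining = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()
    conn.close()
    return remaining


sample = [
    ("a@example.com", "Ann"),
    ("b@example.com", "Bob"),
    ("a@example.com", "Ann B."),
]
print(dedupe_users(sample))
```

The same result can be written with `ROW_NUMBER() OVER (PARTITION BY email ORDER BY id)` on engines that support window functions. The GROUP BY form is used here because it also runs on older SQLite builds.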

cli

scripts/script.sh cli

Command-line dedup tools — sort, uniq, awk, and stream processing.

checklist

scripts/script.sh checklist

Deduplication quality checklist and validation steps.

help

scripts/script.sh help

version

scripts/script.sh version

Configuration

Variable Description
DEDUPE_DIR Data directory (default: ~/.dedupe/)

Powered by BytesAgain | bytesagain.com | [email protected]

Security recommendations
This skill is a local documentation/reference tool implemented as a shell script and appears coherent with its description. Before installing, you can: (1) quickly inspect the full scripts/script.sh file to confirm it only prints documentation and does not execute network commands or remove files; (2) ensure you are comfortable allowing the agent to run the included script when you invoke the skill. If you see commands that read arbitrary paths, call curl/wget, or run delete operations, treat those as a potential risk and ask for clarification.
Functional analysis
Type: OpenClaw Skill
Name: dedupe
Version: 1.0.0
The 'dedupe' skill is a purely educational reference tool for data deduplication strategies. The primary script, `scripts/script.sh`, contains only static text output (via heredocs) providing information on algorithms, SQL patterns, and CLI tools; it performs no actual data manipulation or network activity. No indicators of malicious intent, data exfiltration, or prompt injection were found.
Capability assessment
Purpose & Capability
Name/description match the provided artifacts: a reference skill for deduplication. It does not ask for unrelated credentials, binaries, or system access.
Instruction Scope
SKILL.md instructs running the included scripts/script.sh commands to show reference content (intro, exact, fuzzy, files, etc.). The instructions do not ask the agent to read unrelated files, contact external endpoints, or exfiltrate data. The only optional configuration variable is DEDUPE_DIR (a local data dir), which is reasonable for a local reference tool.
Install Mechanism
No install spec — instruction-only plus an included script. Nothing is downloaded or written to disk at install time beyond the skill files themselves.
Credentials
The skill declares no required environment variables or credentials. The SKILL.md mentions an optional DEDUPE_DIR, which is proportional to the purpose.
Persistence & Privilege
Skill does not request always:true and is user-invocable only. It does not modify other skills or system-wide agent settings.
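How to use
  1. Make sure OpenClaw is installed (locally or via Docker)
  2. Enter the install command in the dialog: /install dedupe
  3. Once installed, invoke the Skill by name or trigger it with /dedupe
  4. Provide the required inputs as described in the Skill's parameter documentation to get structured output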
Version history
v1.0.0
publish v1.0.0
Metadata
Slug dedupe
Version 1.0.0
License MIT-0
Total installs 2
Current installs 2
Versions 1
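FAQ

What is Dedupe?

Deduplication reference: exact matching, fuzzy matching, hash-based dedup, bloom filters, and data quality. Use when removing duplicate records, files, or d... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 178 total downloads to date.

How do I install Dedupe?

Run the command "/install dedupe" in the OpenClaw or Claude Code dialog to install it in one step; no additional configuration is needed.

Is Dedupe free?

Yes. Dedupe is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.

Which platforms does Dedupe support?

Dedupe is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed.

Who develops Dedupe?

It is developed and maintained by bytesagain4 (@xueyetianya); the current version is v1.0.0.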
