data-quality-checker
Author: charlie-morrison (GitHub) · v1.0.0 · MIT-0
Total downloads: 114 · Favorites: 0 · Current installs: 0 · Versions: 1
Install in OpenClaw
/install data-quality-checker
Description
Validate CSV, JSON, and JSONL data files for quality issues. Detects missing values, duplicates, type inconsistencies, statistical outliers, format violation...
Usage (SKILL.md)
Data Quality Checker
Validate CSV/JSON/JSONL data for quality issues. Pure Python, zero dependencies.
Quick Start
# Full quality check
python3 scripts/check_data_quality.py data.csv
# JSON/JSONL support
python3 scripts/check_data_quality.py data.json
python3 scripts/check_data_quality.py data.jsonl
# Markdown report
python3 scripts/check_data_quality.py data.csv --format markdown
# JSON report (for CI/CD)
python3 scripts/check_data_quality.py data.csv --format json
# Only specific checks
python3 scripts/check_data_quality.py data.csv --checks missing,duplicates,types
# Only warnings and critical
python3 scripts/check_data_quality.py data.csv --severity warning
# Save report
python3 scripts/check_data_quality.py data.csv --format markdown --output report.md
Schema Validation
# Generate schema from existing data
python3 scripts/check_data_quality.py data.csv --generate-schema schema.json
# Validate against schema
python3 scripts/check_data_quality.py data.csv --schema schema.json
Checks Performed
| Check | Description | Severity |
|---|---|---|
| missing | Missing/null/empty values per column | info → critical |
| duplicates | Duplicate rows and potential ID conflicts | warning |
| types | Mixed data types within columns | info → warning |
| outliers | Statistical outliers via IQR method | info → warning |
| formats | Email/phone/URL/date format violations | warning |
| whitespace | Leading/trailing whitespace | info |
| empty | Entirely empty columns | warning |
| drift | Extra/missing keys across rows (schema drift) | warning |
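To make the `outliers` check concrete, here is a minimal sketch of the IQR method in pure Python. It is an illustration, not the skill's actual code: the function name `iqr_outliers`, the multiplier `k=1.5`, and the quartile method (the default of `statistics.quantiles`) are all assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper: the skill's own thresholds and quartile
    method are not documented here, so k=1.5 is an assumption.
    """
    if len(values) < 4:
        return []  # too few points for meaningful quartiles
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


ages = [23, 25, 27, 29, 31, 33, 35, 240]
print(iqr_outliers(ages))  # prints [240]
```

With the sample above, Q1 is 25.5 and Q3 is 34.5, so the fences sit at 12 and 48 and only 240 is flagged.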
Quality Score
0-100 score based on weighted severity:
- 90-100: Clean data, minor issues
- 70-89: Usable but needs attention
- 50-69: Significant issues
- 0-49: Critical problems
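One way a severity-weighted score could work is sketched below. The penalty weights and the subtract-from-100 formula are purely hypothetical, since the skill's real weighting is not documented here; only the four score bands come from the list above.

```python
# Hypothetical penalty per issue; the skill's actual weights are unknown.
PENALTY = {"info": 1, "warning": 5, "critical": 15}


def quality_score(severities):
    """Start at 100 and subtract an assumed penalty per issue, floored at 0."""
    penalty = sum(PENALTY[s] for s in severities)
    return max(0, 100 - penalty)


def score_band(score):
    """Map a 0-100 score to the bands listed in the documentation."""
    if score >= 90:
        return "Clean data, minor issues"
    if score >= 70:
        return "Usable but needs attention"
    if score >= 50:
        return "Significant issues"
    return "Critical problems"


score = quality_score(["info", "warning"])
print(score, score_band(score))  # prints 94 Clean data, minor issues
```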
Exit Codes
- 0: No warnings or critical issues
- 1: Warnings found
- 2: Critical issues found
Use in CI: python3 scripts/check_data_quality.py data.csv || echo "Quality check failed"
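For pipelines that need to tell warnings apart from critical issues, a small Python wrapper can interpret the exit code instead of relying on `||`. This is a sketch under stated assumptions: the helper names and the `fail_on_warning` policy are hypothetical, while the script path and the three exit codes come from the documentation above.

```python
import subprocess
import sys

# Exit-code meanings as documented for the checker.
EXIT_MEANINGS = {
    0: "No warnings or critical issues",
    1: "Warnings found",
    2: "Critical issues found",
}


def describe_exit(code):
    """Translate an exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def should_fail_build(code, fail_on_warning=False):
    """Hypothetical CI policy: always fail on critical, optionally on warnings."""
    if code == 0:
        return False
    if code == 1:
        return fail_on_warning
    return True


def run_check(path, script="scripts/check_data_quality.py"):
    """Run the checker on one file and return its exit code."""
    return subprocess.run([sys.executable, script, path]).returncode
```

A CI step could then call `code = run_check("data.csv")`, print `describe_exit(code)`, and exit non-zero only when `should_fail_build(code)` is true.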
Schema Format
JSON schema with validation rules:
{
  "required": ["id", "email", "name"],
  "properties": {
    "id": {"type": "integer", "minimum": 1},
    "email": {"type": "string", "pattern": "^[^@]+@[^@]+\\.[^@]+$"},
    "age": {"type": "number", "minimum": 0, "maximum": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]}
  }
}
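To show how rules like these might be applied to a single row, here is a minimal validator sketch. It is not the skill's implementation: `validate_record`, `TYPE_MAP`, and the error messages are hypothetical, and only the rule keywords (`required`, `type`, `minimum`, `maximum`, `pattern`, `enum`) are taken from the schema above.

```python
import re

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {"integer": int, "number": (int, float), "string": str}

SCHEMA = {
    "required": ["id", "email", "name"],
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "email": {"type": "string", "pattern": r"^[^@]+@[^@]+\.[^@]+$"},
        "age": {"type": "number", "minimum": 0, "maximum": 150},
        "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
    },
}


def validate_record(record, schema):
    """Return a list of human-readable rule violations for one record."""
    errors = []
    for key in schema.get("required", []):
        if record.get(key) is None or record.get(key) == "":
            errors.append(f"{key}: required field is missing")
    for key, rules in schema.get("properties", {}).items():
        value = record.get(key)
        if value is None or value == "":
            continue  # absence is handled by the "required" pass
        expected = TYPE_MAP.get(rules.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            errors.append(f"{key}: expected {rules['type']}")
            continue
        if isinstance(value, (int, float)):
            if "minimum" in rules and value < rules["minimum"]:
                errors.append(f"{key}: below minimum {rules['minimum']}")
            if "maximum" in rules and value > rules["maximum"]:
                errors.append(f"{key}: above maximum {rules['maximum']}")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{key}: not one of {rules['enum']}")
        if "pattern" in rules and isinstance(value, str):
            if not re.search(rules["pattern"], value):
                errors.append(f"{key}: does not match pattern")
    return errors


row = {"id": 0, "email": "nope", "age": 200, "status": "gone"}
for problem in validate_record(row, SCHEMA):
    print(problem)
```

The sample row violates five rules: `name` is missing, `id` is below its minimum, `email` fails the pattern, `age` exceeds its maximum, and `status` is outside the enum.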
Security recommendations
This skill appears to do what it says: run the included Python script on local CSV/JSON files to produce a data-quality report. Before installing or running it on sensitive data:
1) Review the entire scripts/check_data_quality.py file. The listing reviewed was truncated, so the tail of the file, where networking or other behavior could appear, was not inspected.
2) Run it first on non-sensitive sample data in an isolated environment.
3) Check memory/CPU behavior on large files (the tool appears to work in memory and may not stream very large datasets).
4) Prefer installing skills from a known/published source (owner and homepage are unknown and STATUS.md notes a price).
5) If you need to use it in CI on sensitive datasets, consider adding monitoring or sandboxing and/or reimplementing core checks within your vetted tooling.
Functional analysis
Type: OpenClaw Skill
Name: data-quality-checker
Version: 1.0.0
The data-quality-checker skill is a legitimate utility for validating CSV and JSON data. The core logic in scripts/check_data_quality.py uses only Python standard libraries to perform statistical analysis, type inference, and schema validation. There are no signs of data exfiltration, network activity, command execution, or prompt injection attempts.
Capability assessment
Purpose & Capability
Name/description match the included script: the code implements CSV/JSON/JSONL loading and the listed quality checks (missing, duplicates, types, outliers, formats, whitespace, empty, drift). No unrelated binaries, env vars, or services are requested.
Instruction Scope
SKILL.md instructs running the included Python script against local data files and generating reports; the instructions do not ask the agent to read unrelated system files, credentials, or transmit data externally.
Install Mechanism
No install spec (instruction-only + bundled script). This is low risk: nothing is downloaded or installed automatically by the skill.
Credentials
The skill declares no required environment variables or credentials and the visible code does not access environment secrets or configuration. No excessive permissions are requested.
Persistence & Privilege
The skill is not marked always:true and does not attempt to modify system or other skills' configurations in the visible code. Autonomous invocation is allowed (platform default) but not combined with other red flags.
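How to use
- Make sure OpenClaw is installed (local or Docker deployment).
- Enter the install command in the chat box: /install data-quality-checker
- Once installed, invoke the Skill by name or trigger it with /data-quality-checker
- Provide the inputs described in the Skill's parameter documentation to get structured output.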
Version history
v1.0.0
Initial release
FAQ
What is data-quality-checker?
Validate CSV, JSON, and JSONL data files for quality issues. Detects missing values, duplicates, type inconsistencies, statistical outliers, format violation... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 114 total downloads to date.
How do I install data-quality-checker?
Run the command "/install data-quality-checker" in the OpenClaw or Claude Code chat box to install it in one step, with no extra configuration.
Is data-quality-checker free?
Yes. data-quality-checker is completely free and released under the MIT-0 license, so it can be downloaded, installed, and used freely.
Which platforms does data-quality-checker support?
data-quality-checker is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.
Who developed data-quality-checker?
It is developed and maintained by charlie-morrison (@charlie-morrison); the current version is v1.0.0.