Data Quality Checker
Validate CSV/JSON/JSONL data for quality issues. Pure Python, zero dependencies.
Quick Start
# Full quality check
python3 scripts/check_data_quality.py data.csv
# JSON/JSONL support
python3 scripts/check_data_quality.py data.json
python3 scripts/check_data_quality.py data.jsonl
# Markdown report
python3 scripts/check_data_quality.py data.csv --format markdown
# JSON report (for CI/CD)
python3 scripts/check_data_quality.py data.csv --format json
# Only specific checks
python3 scripts/check_data_quality.py data.csv --checks missing,duplicates,types
# Only warnings and critical
python3 scripts/check_data_quality.py data.csv --severity warning
# Save report
python3 scripts/check_data_quality.py data.csv --format markdown --output report.md
Schema Validation
# Generate schema from existing data
python3 scripts/check_data_quality.py data.csv --generate-schema schema.json
# Validate against schema
python3 scripts/check_data_quality.py data.csv --schema schema.json
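How the tool infers a schema is not documented here. As a rough illustration of what generating a schema from existing data can involve, the sketch below derives `required` and `type` entries from a list of already-parsed rows. The function names `infer_type` and `infer_schema` are hypothetical, and the rules (a field is required if it is never null, and the narrowest matching type wins) are assumptions rather than the script's actual behavior.

```python
def infer_type(values):
    """Pick the narrowest type name that covers every non-null value."""
    present = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if present and all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "integer"
    if present and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "number"
    return "string"


def infer_schema(rows):
    """Build a {"required": [...], "properties": {...}} dict from sample rows."""
    fields = []
    for row in rows:
        for key in row:
            if key not in fields:
                fields.append(key)  # keep first-seen order
    # Assumption: a field is required only if every row has a non-null value.
    required = [f for f in fields if all(r.get(f) is not None for r in rows)]
    properties = {f: {"type": infer_type([r.get(f) for r in rows])} for f in fields}
    return {"required": required, "properties": properties}


rows = [
    {"id": 1, "email": "a@example.com", "age": 31.5},
    {"id": 2, "email": "b@example.com", "age": None},
]
schema = infer_schema(rows)
print(schema["required"])  # ['id', 'email']
```

Constraints such as `minimum`, `pattern`, or `enum` cannot be inferred reliably from a sample, so a generated schema is usually a starting point to edit by hand.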
Checks Performed
| Check | Description | Severity |
|---|---|---|
| missing | Missing/null/empty values per column | info → critical |
| duplicates | Duplicate rows and potential ID conflicts | warning |
| types | Mixed data types within columns | info → warning |
| outliers | Statistical outliers via IQR method | info → warning |
| formats | Email/phone/URL/date format violations | warning |
| whitespace | Leading/trailing whitespace | info |
| empty | Entirely empty columns | warning |
| drift | Extra/missing keys across rows (schema drift) | warning |
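To make the IQR method behind the outliers check concrete, here is a minimal sketch using only the standard library. It flags values outside `[Q1 - k*IQR, Q3 + k*IQR]`. The function name `iqr_outliers`, the multiplier `k=1.5`, and the inclusive quartile method are assumptions for illustration; the script may compute quartiles or thresholds differently.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


ages = [23, 25, 27, 29, 31, 33, 35, 120]
print(iqr_outliers(ages))  # [120]
```

For this sample Q1 is 26.5 and Q3 is 33.5, so the fences sit at 16 and 44 and only 120 falls outside them.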
Quality Score
0-100 score based on weighted severity:
- 90-100: Clean data, minor issues
- 70-89: Usable but needs attention
- 50-69: Significant issues
- 0-49: Critical problems
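The exact weighting is not spelled out above. As a sketch of how a severity-weighted score could work, the example below subtracts a penalty per issue from 100 and maps the result onto the four bands listed. The penalty values and the names `PENALTY`, `quality_score`, and `score_band` are assumptions, not the tool's actual formula.

```python
# Assumed penalty per issue; the real weights used by the script may differ.
PENALTY = {"info": 1, "warning": 5, "critical": 20}


def quality_score(severities):
    """Start at 100 and subtract a penalty for each reported issue."""
    penalty = sum(PENALTY[s] for s in severities)
    return max(0, 100 - penalty)


def score_band(score):
    """Map a 0-100 score onto the bands described in this section."""
    if score >= 90:
        return "Clean data, minor issues"
    if score >= 70:
        return "Usable but needs attention"
    if score >= 50:
        return "Significant issues"
    return "Critical problems"


issues = ["info", "info", "warning", "critical"]
score = quality_score(issues)
print(score, score_band(score))  # 73 Usable but needs attention
```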
Exit Codes
- 0: No warnings or critical issues
- 1: Warnings found
- 2: Critical issues found
Use in CI: python3 scripts/check_data_quality.py data.csv || echo "Quality check failed"
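When a pipeline needs to treat warnings and critical issues differently, the exit code can be inspected from a small wrapper. The sketch below assumes the script path from Quick Start and the three exit codes listed above; `describe_exit`, `should_pass`, and `run_check` are hypothetical helper names.

```python
import subprocess
import sys

EXIT_MEANINGS = {
    0: "No warnings or critical issues",
    1: "Warnings found",
    2: "Critical issues found",
}


def describe_exit(code):
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"Unexpected exit code {code}")


def should_pass(code, fail_on=2):
    """Pass the build only for codes below the chosen failure threshold."""
    return 0 <= code < fail_on


def run_check(path, fail_on=2):
    """Run the checker on one file and report whether the build should pass."""
    result = subprocess.run(
        [sys.executable, "scripts/check_data_quality.py", path]
    )
    print(describe_exit(result.returncode))
    return should_pass(result.returncode, fail_on)
```

A CI step could then call `sys.exit(0 if run_check("data.csv") else 1)`. With the default `fail_on=2`, warnings are reported but only critical issues fail the build; pass `fail_on=1` to fail on warnings as well.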
Schema Format
JSON schema with validation rules:
{
  "required": ["id", "email", "name"],
  "properties": {
    "id": {"type": "integer", "minimum": 1},
    "email": {"type": "string", "pattern": "^[^@]+@[^@]+\\.[^@]+$"},
    "age": {"type": "number", "minimum": 0, "maximum": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]}
  }
}
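To show how rules like these can be applied to a single record, here is a minimal sketch that covers only the keywords in the example schema (`required`, `type`, `minimum`, `maximum`, `pattern`, `enum`). The function name `validate_row` and the error wording are hypothetical, and the sketch assumes values are already parsed into native Python types. It is not the script's own validator.

```python
import re

# bool is a subclass of int in Python, so exclude it from numeric types.
TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def validate_row(row, schema):
    """Return a list of human-readable violations for one record."""
    errors = []
    for field in schema.get("required", []):
        if row.get(field) in (None, ""):
            errors.append(f"{field}: required field is missing")
    for field, rules in schema.get("properties", {}).items():
        value = row.get(field)
        if value in (None, ""):
            continue  # absence is reported by the "required" pass
        check = TYPE_CHECKS.get(rules.get("type"))
        if check and not check(value):
            errors.append(f"{field}: expected {rules['type']}")
            continue  # skip range/pattern checks on a wrongly typed value
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{field}: {value} is below minimum {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            errors.append(f"{field}: {value} is above maximum {rules['maximum']}")
        if "pattern" in rules and not re.search(rules["pattern"], value):
            errors.append(f"{field}: does not match pattern")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{field}: {value!r} is not an allowed value")
    return errors


schema = {
    "required": ["id", "email", "name"],
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "email": {"type": "string", "pattern": r"^[^@]+@[^@]+\.[^@]+$"},
        "age": {"type": "number", "minimum": 0, "maximum": 150},
        "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
    },
}

bad = {"id": 0, "email": "not-an-email", "age": 200, "status": "archived"}
for problem in validate_row(bad, schema):
    print(problem)
```

Values read from CSV arrive as strings, so a real validator would need to coerce them before type and range checks; that step is left out here.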
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat: /install data-quality-checker
- After installation, invoke the skill by name or use /data-quality-checker
- Provide required inputs per the skill's parameter spec and get structured output
What is data-quality-checker?
Validate CSV, JSON, and JSONL data files for quality issues. Detects missing values, duplicates, type inconsistencies, statistical outliers, format violation... It is an AI Agent Skill for Claude Code / OpenClaw, with 114 downloads so far.
How do I install data-quality-checker?
Run "/install data-quality-checker" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is data-quality-checker free?
Yes, data-quality-checker is completely free, licensed under MIT-0. You can download, install and use it at no cost.
Which platforms does data-quality-checker support?
data-quality-checker is cross-platform and runs anywhere OpenClaw / Claude Code is available.
Who created data-quality-checker?
It is built and maintained by charlie-morrison (@charlie-morrison); the current version is v1.0.0.