
data-quality-checker

by charlie-morrison · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ✓ Security Clean
114 Downloads · 0 Stars · 0 Active Installs · 1 Version
Install in OpenClaw
/install data-quality-checker
Description
Validate CSV, JSON, and JSONL data files for quality issues. Detects missing values, duplicates, type inconsistencies, statistical outliers, format violation...
README (SKILL.md)

Data Quality Checker

Validate CSV/JSON/JSONL data for quality issues. Pure Python, zero dependencies.

Quick Start

# Full quality check
python3 scripts/check_data_quality.py data.csv

# JSON/JSONL support
python3 scripts/check_data_quality.py data.json
python3 scripts/check_data_quality.py data.jsonl

# Markdown report
python3 scripts/check_data_quality.py data.csv --format markdown

# JSON report (for CI/CD)
python3 scripts/check_data_quality.py data.csv --format json

# Only specific checks
python3 scripts/check_data_quality.py data.csv --checks missing,duplicates,types

# Only warnings and critical
python3 scripts/check_data_quality.py data.csv --severity warning

# Save report
python3 scripts/check_data_quality.py data.csv --format markdown --output report.md

Schema Validation

# Generate schema from existing data
python3 scripts/check_data_quality.py data.csv --generate-schema schema.json

# Validate against schema
python3 scripts/check_data_quality.py data.csv --schema schema.json

Checks Performed

| Check | Description | Severity |
| --- | --- | --- |
| missing | Missing/null/empty values per column | info → critical |
| duplicates | Duplicate rows and potential ID conflicts | warning |
| types | Mixed data types within columns | info → warning |
| outliers | Statistical outliers via IQR method | info → warning |
| formats | Email/phone/URL/date format violations | warning |
| whitespace | Leading/trailing whitespace | info |
| empty | Entirely empty columns | warning |
| drift | Extra/missing keys across rows (schema drift) | warning |
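
To make the outliers check concrete, here is a minimal sketch of the IQR method using only the standard library. It is an illustration, not the code in scripts/check_data_quality.py: the function name `iqr_outliers`, the multiplier `k=1.5`, and the minimum sample size are assumptions chosen for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k=1.5 is the conventional fence multiplier,
    not a value confirmed by the skill's documentation.
    """
    if len(values) < 4:
        return []  # too few points for meaningful quartiles
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


ages = [23, 25, 27, 29, 31, 33, 35, 410]
print(iqr_outliers(ages))  # [410]
```

Because the fences are built from quartiles rather than the mean, a single extreme value such as 410 does not widen them enough to hide itself.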

Quality Score

0-100 score based on weighted severity:

  • 90-100: Clean data, minor issues
  • 70-89: Usable but needs attention
  • 50-69: Significant issues
  • 0-49: Critical problems
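
The bands above can be expressed as a small lookup, which is handy when post-processing a report. This sketch only maps a score to its band; how the tool weights severities to arrive at the score is not documented here, so that part is not modelled. The function name `score_band` is hypothetical.

```python
def score_band(score):
    """Map a 0-100 quality score to the band listed in the README."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Clean data, minor issues"
    if score >= 70:
        return "Usable but needs attention"
    if score >= 50:
        return "Significant issues"
    return "Critical problems"


for example in (97, 70, 12):
    print(example, "->", score_band(example))
```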

Exit Codes

  • 0 — No warnings or critical issues
  • 1 — Warnings found
  • 2 — Critical issues found

Use in CI: python3 scripts/check_data_quality.py data.csv || echo "Quality check failed"
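
The shell one-liner treats exit codes 1 and 2 the same way. If a pipeline should tolerate warnings but stop on critical issues, a small wrapper can branch on the code instead. The sketch below assumes the exit codes behave exactly as listed above; `run_check` and `should_fail_build` are hypothetical names, and treating any unlisted code as a failure is a design choice made for this example.

```python
import subprocess
import sys

# Meanings taken from the Exit Codes list above.
EXIT_MEANINGS = {0: "clean", 1: "warnings", 2: "critical"}


def should_fail_build(exit_code, fail_on_warnings=False):
    """Decide whether CI should stop for a given checker exit code."""
    if exit_code == 0:
        return False
    if exit_code == 1:
        return fail_on_warnings
    # 2 (critical) or anything unexpected, such as a crash, fails the build.
    return True


def run_check(path):
    """Run the checker on one file and return its exit code."""
    result = subprocess.run(
        [sys.executable, "scripts/check_data_quality.py", path, "--format", "json"]
    )
    return result.returncode
```

A CI step could then call `sys.exit(1 if should_fail_build(run_check("data.csv")) else 0)`.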

Schema Format

JSON schema with validation rules:

{
  "required": ["id", "email", "name"],
  "properties": {
    "id": {"type": "integer", "minimum": 1},
    "email": {"type": "string", "pattern": "^[^@]+@[^@]+\\.[^@]+$"},
    "age": {"type": "number", "minimum": 0, "maximum": 150},
    "status": {"type": "string", "enum": ["active", "inactive", "pending"]}
  }
}
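
To show how rules like these could be applied to a single record, here is a minimal sketch covering the keywords used in the example schema (required, type, minimum, maximum, pattern, enum). It is not the skill's own validator; `validate_record` and `TYPE_CHECKS` are hypothetical names, and the error messages are invented for illustration.

```python
import re

SCHEMA = {
    "required": ["id", "email", "name"],
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "email": {"type": "string", "pattern": r"^[^@]+@[^@]+\.[^@]+$"},
        "age": {"type": "number", "minimum": 0, "maximum": 150},
        "status": {"type": "string", "enum": ["active", "inactive", "pending"]},
    },
}

# bool is a subclass of int in Python, so it is excluded explicitly.
TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def validate_record(record, schema):
    """Return a list of human-readable violations for one record."""
    errors = []
    for field in schema.get("required", []):
        if record.get(field) in (None, ""):
            errors.append(f"{field}: required field is missing")
    for field, rules in schema.get("properties", {}).items():
        value = record.get(field)
        if value is None:
            continue  # absence is handled by the required check
        check = TYPE_CHECKS.get(rules.get("type"))
        if check and not check(value):
            errors.append(f"{field}: expected {rules['type']}")
            continue  # skip range/pattern checks on a wrongly typed value
        is_number = TYPE_CHECKS["number"](value)
        if is_number and "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{field}: below minimum {rules['minimum']}")
        if is_number and "maximum" in rules and value > rules["maximum"]:
            errors.append(f"{field}: above maximum {rules['maximum']}")
        if isinstance(value, str) and "pattern" in rules:
            if not re.search(rules["pattern"], value):
                errors.append(f"{field}: does not match pattern")
        if "enum" in rules and value not in rules["enum"]:
            errors.append(f"{field}: not one of {rules['enum']}")
    return errors


bad = {"id": 0, "email": "not-an-email", "name": "Ada", "age": 200, "status": "archived"}
for problem in validate_record(bad, SCHEMA):
    print(problem)
```

Each record yields a list of violations, so a caller can aggregate them per column before assigning a severity.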
Usage Guidance
This skill appears to do what it says: it runs the included Python script on local CSV/JSON files to produce a data-quality report. Before installing it or running it on sensitive data:
  1. Review the entire scripts/check_data_quality.py file. The listing reviewed was truncated, so the tail of the file, where networking or other behavior could appear, was not inspected.
  2. Run it first on non-sensitive sample data in an isolated environment.
  3. Check memory and CPU behavior on large files; the tool appears to load data in memory and may not stream very large datasets.
  4. Prefer installing skills from a known, published source; the owner and homepage are unknown, and STATUS.md notes a price.
  5. If you need to use it in CI on sensitive datasets, consider adding monitoring or sandboxing, or reimplementing the core checks in your own vetted tooling.
Capability Analysis
Type: OpenClaw Skill
Name: data-quality-checker
Version: 1.0.0
The data-quality-checker skill is a legitimate utility for validating CSV and JSON data. The core logic in scripts/check_data_quality.py uses only Python standard libraries to perform statistical analysis, type inference, and schema validation. There are no signs of data exfiltration, network activity, command execution, or prompt injection attempts.
Capability Assessment
Purpose & Capability
Name/description match the included script: the code implements CSV/JSON/JSONL loading and the listed quality checks (missing, duplicates, types, outliers, formats, whitespace, empty, drift). No unrelated binaries, env vars, or services are requested.
Instruction Scope
SKILL.md instructs running the included Python script against local data files and generating reports; the instructions do not ask the agent to read unrelated system files, credentials, or transmit data externally.
Install Mechanism
No install spec (instruction-only + bundled script). This is low risk: nothing is downloaded or installed automatically by the skill.
Credentials
The skill declares no required environment variables or credentials and the visible code does not access environment secrets or configuration. No excessive permissions are requested.
Persistence & Privilege
The skill is not marked always:true and does not attempt to modify system or other skills' configurations in the visible code. Autonomous invocation is allowed (platform default) but not combined with other red flags.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install data-quality-checker
  3. After installation, invoke the skill by name or use /data-quality-checker
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release
Metadata
Slug data-quality-checker
Version 1.0.0
License MIT-0
All-time Installs 0
Active Installs 0
Total Versions 1
Frequently Asked Questions

What is data-quality-checker?

Validate CSV, JSON, and JSONL data files for quality issues. Detects missing values, duplicates, type inconsistencies, statistical outliers, format violation... It is an AI Agent Skill for Claude Code / OpenClaw, with 114 downloads so far.

How do I install data-quality-checker?

Run "/install data-quality-checker" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is data-quality-checker free?

Yes, data-quality-checker is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does data-quality-checker support?

data-quality-checker is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created data-quality-checker?

It is built and maintained by charlie-morrison (@charlie-morrison); the current version is v1.0.0.
