Description

Handle CSV files from construction software exports. Auto-detect delimiters, encodings, and clean messy data.

Usage Instructions (SKILL.md)

# CSV Handler for Construction Data

Name: Csv Handler
Author: datadrivenconstruction

## Overview

CSV is a near-universal exchange format in construction, from scheduling exports to cost databases. This skill handles encoding issues, delimiter detection, and data cleaning.
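Encoding trouble usually shows up as a `UnicodeDecodeError` or as mangled characters in text columns. As a rough illustration, independent of the handler in the next section, one common fallback strategy is to try a short list of candidate encodings in order. The function name `decode_with_fallback` and the candidate list are assumptions made for this sketch, not part of the skill.

```python
from typing import Iterable, Tuple


def decode_with_fallback(raw: bytes,
                         candidates: Iterable[str] = ("utf-8-sig", "cp1252", "latin-1")) -> Tuple[str, str]:
    """Return (text, encoding) for the first candidate that decodes without error.

    Hypothetical helper for illustration. latin-1 accepts any byte sequence,
    so it acts as a last resort and may still yield the wrong characters.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Only reached if the caller passes a candidate list without a catch-all
    raise ValueError("none of the candidate encodings could decode the data")


# A cp1252-encoded line, as a Windows-based export might produce
raw = "Größe;Menge\n12,5;3\n".encode("cp1252")
text, used = decode_with_fallback(raw)
print(used)
```

Trying strict encodings first and permissive ones last matters here: a permissive encoding placed early would "succeed" on almost any input and hide the real encoding.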

## Python Implementation

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional

import chardet
import pandas as pd


@dataclass
class CSVProfile:
    """Profile of CSV file."""
    encoding: str
    delimiter: str
    has_header: bool
    row_count: int
    column_count: int
    columns: List[str]


class ConstructionCSVHandler:
    """Handle CSV files from construction software."""

    COMMON_DELIMITERS = [',', ';', '\t', '|']
    COMMON_ENCODINGS = ['utf-8', 'utf-8-sig', 'latin-1', 'cp1252', 'iso-8859-1']

    def __init__(self):
        self.last_profile: Optional[CSVProfile] = None

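    def detect_encoding(self, file_path: str) -> str:
        """Detect file encoding from the first 10,000 bytes."""
        with open(file_path, 'rb') as f:
            raw = f.read(10000)
        result = chardet.detect(raw)
        # chardet may report None as the encoding; fall back to UTF-8
        return result.get('encoding') or 'utf-8'

    def detect_delimiter(self, file_path: str, encoding: str) -> str:
        """Detect CSV delimiter."""
        sample_size = 5000
        with open(file_path, 'r', encoding=encoding, errors='replace') as f:
            sample = f.read(sample_size)

        lines = [line for line in sample.splitlines() if line.strip()]
        if len(sample) == sample_size and len(lines) > 1:
            lines = lines[:-1]  # the last line may be cut off mid-row

        # Prefer a delimiter that appears the same number of times on every line
        consistent = {}
        for d in self.COMMON_DELIMITERS:
            per_line = {line.count(d) for line in lines}
            if len(per_line) == 1 and 0 not in per_line:
                consistent[d] = per_line.pop()
        if consistent:
            return max(consistent, key=consistent.get)

        # Otherwise fall back to the most frequent candidate, defaulting to a comma
        counts = {d: sample.count(d) for d in self.COMMON_DELIMITERS}
        if any(counts.values()):
            return max(counts, key=counts.get)
        return ','
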
    def profile_csv(self, file_path: str) -> CSVProfile:
        """Profile CSV file."""
        encoding = self.detect_encoding(file_path)
        delimiter = self.detect_delimiter(file_path, encoding)

        # Read a small sample to inspect the columns
        df = pd.read_csv(file_path, encoding=encoding, delimiter=delimiter,
                         nrows=10, on_bad_lines='skip')

        # Heuristic: a numeric-looking first column label suggests there is no header row
        first_label = str(df.columns[0])
        has_header = not first_label.replace('.', '').replace('-', '').isdigit()

        # Full row count (counts physical lines, so multi-line quoted fields inflate it)
        with open(file_path, 'r', encoding=encoding, errors='replace') as f:
            row_count = sum(1 for _ in f) - (1 if has_header else 0)

        profile = CSVProfile(
            encoding=encoding,
            delimiter=delimiter,
            has_header=has_header,
            row_count=row_count,
            column_count=len(df.columns),
            columns=list(df.columns)
        )
        self.last_profile = profile
        return profile

    def read_csv(self, file_path: str,
                 encoding: Optional[str] = None,
                 delimiter: Optional[str] = None,
                 clean: bool = True) -> pd.DataFrame:
        """Read CSV with auto-detection."""

        # Auto-detect if not provided
        if encoding is None:
            encoding = self.detect_encoding(file_path)
        if delimiter is None:
            delimiter = self.detect_delimiter(file_path, encoding)

        # Read with error handling
        df = pd.read_csv(
            file_path,
            encoding=encoding,
            delimiter=delimiter,
            on_bad_lines='skip',
            low_memory=False
        )

        if clean:
            df = self.clean_dataframe(df)

        return df

    def clean_dataframe(self, df: pd.DataFrame) -> pd.DataFrame:
        """Clean construction CSV data."""
        # Work on a copy so the caller's DataFrame is not modified
        df = df.copy()

        # Clean column names
        df.columns = [self._clean_column_name(c) for c in df.columns]

        # Remove empty rows and columns
        df = df.dropna(how='all')
        df = df.dropna(axis=1, how='all')

        # Strip whitespace from string values; leave non-string values untouched
        for col in df.select_dtypes(include=['object']).columns:
            df[col] = df[col].map(lambda v: v.strip() if isinstance(v, str) else v)

        return df

    def _clean_column_name(self, name: str) -> str:
        """Clean column name."""
        if not isinstance(name, str):
            return str(name)

        # Remove special characters, replace spaces
        clean = name.strip().lower()
        clean = clean.replace(' ', '_').replace('-', '_')
        clean = ''.join(c for c in clean if c.isalnum() or c == '_')
        return clean

    def merge_csvs(self, file_paths: List[str],
                   on_column: Optional[str] = None) -> pd.DataFrame:
        """Merge multiple CSV files."""
        dfs = []
        for path in file_paths:
            df = self.read_csv(path)
            df['_source_file'] = Path(path).name
            dfs.append(df)

        if not dfs:
            return pd.DataFrame()

        # Join on the key only when every file actually has it
        if on_column and all(on_column in df.columns for df in dfs):
            result = dfs[0]
            for i, df in enumerate(dfs[1:], start=2):
                # Suffix overlapping columns with the file's position to avoid name clashes
                result = pd.merge(result, df, on=on_column, how='outer',
                                  suffixes=('', f'_{i}'))
            return result

        # Otherwise stack the files row-wise
        return pd.concat(dfs, ignore_index=True)

    def split_csv(self, df: pd.DataFrame,
                  group_column: str,
                  output_dir: str) -> List[str]:
        """Split CSV by column values."""
        output_path = Path(output_dir)
        output_path.mkdir(parents=True, exist_ok=True)

        files = []
        # groupby skips rows whose group value is missing; sort=False keeps first-seen order
        for value, subset in df.groupby(group_column, sort=False):
            # Replace characters that are unsafe in file names (e.g. '/' or ':')
            safe_value = ''.join(c if c.isalnum() or c in '-_.' else '_' for c in str(value))
            filename = f"{group_column}_{safe_value}.csv"
            filepath = output_path / filename
            subset.to_csv(filepath, index=False)
            files.append(str(filepath))

        return files

    def convert_types(self, df: pd.DataFrame,
                      type_map: Optional[Dict[str, str]] = None) -> pd.DataFrame:
        """Convert column types intelligently."""
        df = df.copy()

        if type_map:
            for col, dtype in type_map.items():
                if col in df.columns:
                    try:
                        df[col] = df[col].astype(dtype)
                    except (ValueError, TypeError):
                        # Leave the column unchanged if it cannot be cast
                        pass
        else:
            # Auto-convert
            for col in df.columns:
                # Try numeric
                try:
                    df[col] = pd.to_numeric(df[col])
                    continue
                except (ValueError, TypeError):
                    pass

                # Try datetime
                try:
                    df[col] = pd.to_datetime(df[col])
                except (ValueError, TypeError):
                    pass

        return df

    def export_csv(self, df: pd.DataFrame,
                   file_path: str,
                   encoding: str = 'utf-8-sig',
                   delimiter: str = ',') -> str:
        """Export DataFrame to CSV."""
        df.to_csv(file_path, encoding=encoding, sep=delimiter, index=False)
        return file_path


# Specialized handlers
class ScheduleCSVHandler(ConstructionCSVHandler):
    """Handler for project schedule CSVs."""

    SCHEDULE_COLUMNS = ['task_id', 'task_name', 'start_date', 'end_date',
                        'duration', 'predecessors', 'resources']

    def parse_schedule(self, file_path: str) -> pd.DataFrame:
        """Parse schedule CSV."""
        df = self.read_csv(file_path)

        # Convert date columns
        for col in df.columns:
            name = str(col).lower()
            tokens = set(name.split('_'))
            # Match 'date' anywhere, but 'start'/'end' only as whole words,
            # so columns such as 'vendor' or 'pending_count' are left alone
            if 'date' in name or tokens & {'start', 'end'}:
                try:
                    df[col] = pd.to_datetime(df[col])
                except (ValueError, TypeError):
                    # Leave the column unchanged if it does not parse as dates
                    pass

        return df


class CostCSVHandler(ConstructionCSVHandler):
    """Handler for cost/estimate CSVs."""

    def parse_costs(self, file_path: str) -> pd.DataFrame:
        """Parse cost CSV."""
        df = self.read_csv(file_path)

        # Find and convert numeric columns
        for col in df.columns:
            if any(word in col.lower() for word in ['cost', 'price', 'amount', 'total', 'qty', 'quantity']):
                df[col] = pd.to_numeric(df[col].replace(r'[\$,]', '', regex=True), errors='coerce')

        return df
```

## Quick Start

```python
handler = ConstructionCSVHandler()

# Profile CSV first
profile = handler.profile_csv("export.csv")
print(f"Encoding: {profile.encoding}, Delimiter: '{profile.delimiter}'")

# Read with auto-detection
df = handler.read_csv("export.csv")
print(f"Loaded {len(df)} rows, {len(df.columns)} columns")
```
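
If the profile reports a delimiter that looks wrong, it can be cross-checked with the standard library. The sketch below runs `csv.Sniffer` on a text sample. The helper name `sniff_delimiter` and the sample data are made up for illustration, and this is an independent check rather than a description of how the handler detects delimiters.

```python
import csv


def sniff_delimiter(sample: str, candidates: str = ",;\t|", default: str = ",") -> str:
    """Guess the delimiter of a CSV text sample (hypothetical helper)."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # Sniffer could not decide; fall back to the default
        return default


sample = "task_id;task_name;duration\n1;Excavation;5\n2;Foundation;10\n"
print(repr(sniff_delimiter(sample)))
```

When the two methods disagree, passing the delimiter explicitly to `read_csv` is probably the safer choice.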

## Common Use Cases

### 1. Merge Multiple Exports
```python
files = ["jan_export.csv", "feb_export.csv", "mar_export.csv"]
merged = handler.merge_csvs(files)
```

### 2. Split by Category
```python
handler.split_csv(df, group_column='category', output_dir='./split_files')
```

### 3. Schedule Import
```python
schedule_handler = ScheduleCSVHandler()
schedule = schedule_handler.parse_schedule("p6_export.csv")
```
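
Cost exports follow the same pattern through `CostCSVHandler.parse_costs`, which strips `$` and `,` from matching columns and coerces what remains to numbers. To make that step concrete without needing pandas, here is a plain-Python sketch of roughly the same cleaning applied to single values. `parse_amount` is a hypothetical helper, and returning `None` stands in for the `NaN` that `errors='coerce'` would produce.

```python
import re
from typing import Optional


def parse_amount(value: str) -> Optional[float]:
    """Strip '$' and ',' from a cost string and convert it to float.

    Hypothetical helper; assumes ',' is a thousands separator, as the
    handler's regex does. Returns None when the result is not a number.
    """
    cleaned = re.sub(r"[$,]", "", value).strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


for raw_value in ["$1,234.50", "980", "n/a"]:
    print(raw_value, "->", parse_amount(raw_value))
```

Exports that use a comma as the decimal separator (for example `1.234,50`) would be misread by this approach, so such columns likely need locale-specific handling first.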

## Resources
- **DDC Book**: Chapter 2.1 - Structured Data

Security Recommendations

This skill appears to do what it says: profile and clean construction CSV exports. Before installing or running it: 1) Ensure your runtime has python3 with required libraries (pandas, chardet) since the skill assumes those but doesn't install them. 2) Only supply files you are comfortable having the agent read — the skill will read user-provided file paths and may write output files. 3) There are no declared network endpoints or credential requests, but if you need stronger assurance, ask the author for a full dependency list and confirm there is no hidden behavior beyond the visible SKILL.md. 4) Run it in an environment with appropriate data access controls if the CSVs contain sensitive information.

Functional Analysis

Type: OpenClaw Skill
Name: csv-handler
Version: 2.1.0

The skill is designed for handling CSV files, which inherently requires file system access. The `claw.json` explicitly declares 'filesystem' permissions, aligning with the Python code's operations (reading and writing CSVs). There is no evidence of malicious intent such as data exfiltration, unauthorized network communication, persistence mechanisms, or prompt injection attempts in `SKILL.md` or `instructions.md`. All code and instructions are consistent with the stated purpose of a CSV handler for construction data.

Capability Assessment

✓ Purpose & Capability

Name/description match the instructions and included Python code (CSV encoding/delimiter detection, cleaning, merging, splitting). Required binary (python3) is appropriate for a Python-based implementation. The declared filesystem permission in claw.json aligns with reading/writing CSV files.

ℹ Instruction Scope

SKILL.md contains concrete Python code for reading, profiling, cleaning, merging, and splitting CSVs and limits actions to files the user provides. It will read files from disk (user-supplied paths) and write split/merged outputs; there are no instructions to read unrelated system configuration or to transmit data to external endpoints. Note: because it reads arbitrary files the user supplies, only provide non-sensitive data if you have concerns.

ℹ Install Mechanism

This is an instruction-only skill (no install spec), which is low risk. However, the bundled Python code depends on third-party packages (pandas, chardet) that are not declared in the skill metadata or install spec — the environment must already have these libraries installed or the code will fail.
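
Since the dependencies are not declared, a quick pre-flight check can confirm they are importable before the skill's code runs. This is a minimal sketch using only the standard library; the helper name `missing_packages` is an assumption for illustration.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that cannot be found as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["pandas", "chardet"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available")
```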

✓ Credentials

The skill requests no environment variables, no credentials, and no external config paths. That is proportionate for a CSV-processing utility.

✓ Persistence & Privilege

The skill is not set to always:true and uses normal agent invocation. It does not attempt to modify other skills or system-wide settings in the provided materials.

Version History

v2.1.0

- Added auto-detection of CSV delimiters and file encodings.
- Improved data cleaning: trims whitespace, removes empty rows/columns, and standardizes column names.
- Allows merging and splitting CSV files by column values.
- Introduced intelligent type conversion for numeric and date fields.
- Specialized classes for handling construction schedule and cost CSVs.

v1.0.0

CSV Handler version 1.0.0, initial release.

- Introduces robust handling for CSV files exported from construction software.
- Automatically detects delimiters and file encoding to minimize import errors.
- Includes powerful data cleaning, merging, splitting, and column type conversion utilities.
- Offers specialized support for schedule and cost CSVs common in the construction industry.

Metadata

Slug csv-handler

Version 2.1.0

License —

Total installs 12

Current installs 12

Number of versions 2

FAQ

What is Csv Handler?

Handle CSV files from construction software exports. Auto-detect delimiters, encodings, and clean messy data. It is an AI agent skill plugin for Claude Code / OpenClaw and has been downloaded 2280 times in total.

How do I install Csv Handler?

Run the command "/install csv-handler" in the OpenClaw or Claude Code chat box to install it in one step; no additional configuration is required.

Is Csv Handler free?

Yes. Csv Handler is completely free (free and open source) and can be freely downloaded, installed, and used.

Which platforms does Csv Handler support?

Csv Handler is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (darwin, linux, win32).

Who develops Csv Handler?

It is developed and maintained by datadrivenconstruction (@datadrivenconstruction); the current version is v2.1.0.
