Image To Data

Author: datadrivenconstruction · GitHub · v2.0.0
cross-platform ⚠ suspicious
Total downloads: 1693
Favorites: 0
Current installs: 5
Versions: 2
Install in OpenClaw: /install image-to-data
Description
Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings.
Usage (SKILL.md)

# Image To Data

## Overview

Based on the DDC methodology (Chapter 2.4), this skill extracts structured data from construction images. It uses computer vision, OCR, and AI models to analyze site photos, scanned documents, and drawings.

Book Reference: "Преобразование данных в структурированную форму" / "Data Transformation to Structured Form"

## Quick Start

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Any, Tuple
from datetime import datetime
import json
import base64

class ImageType(Enum):
    """Types of construction images"""
    SITE_PHOTO = "site_photo"
    SCANNED_DOCUMENT = "scanned_document"
    FLOOR_PLAN = "floor_plan"
    ELEVATION = "elevation"
    DETAIL_DRAWING = "detail_drawing"
    PROGRESS_PHOTO = "progress_photo"
    SAFETY_PHOTO = "safety_photo"
    DEFECT_PHOTO = "defect_photo"
    MATERIAL_PHOTO = "material_photo"
    EQUIPMENT_PHOTO = "equipment_photo"

class ExtractionType(Enum):
    """Types of data extraction"""
    OCR_TEXT = "ocr_text"
    TABLE = "table"
    OBJECT_DETECTION = "object_detection"
    MEASUREMENT = "measurement"
    CLASSIFICATION = "classification"
    PROGRESS = "progress"

@dataclass
class BoundingBox:
    """Bounding box for detected region"""
    x: int
    y: int
    width: int
    height: int
    confidence: float = 1.0

@dataclass
class TextRegion:
    """Extracted text region from image"""
    text: str
    bbox: BoundingBox
    confidence: float
    language: str = "en"

@dataclass
class DetectedObject:
    """Detected object in image"""
    label: str
    bbox: BoundingBox
    confidence: float
    attributes: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ExtractedTable:
    """Extracted table from image"""
    headers: List[str]
    rows: List[List[str]]
    bbox: BoundingBox
    confidence: float

@dataclass
class ProgressMeasurement:
    """Progress measurement from image"""
    element_type: str
    total_count: int
    completed_count: int
    percent_complete: float
    area_sqft: Optional[float] = None
    volume_cuft: Optional[float] = None

@dataclass
class ImageAnalysisResult:
    """Complete image analysis result"""
    image_id: str
    image_type: ImageType
    text_regions: List[TextRegion]
    detected_objects: List[DetectedObject]
    tables: List[ExtractedTable]
    progress: Optional[ProgressMeasurement] = None
    metadata: Dict[str, Any] = field(default_factory=dict)
    processing_time: float = 0.0


class OCREngine:
    """OCR engine for text extraction"""

    def __init__(self, engine: str = "tesseract"):
        self.engine = engine
        self.supported_languages = ["en", "ru", "de", "fr", "es"]

    def extract_text(
        self,
        image_data: bytes,
        language: str = "en"
    ) -> List[TextRegion]:
        """Extract text from image"""
        # Simulated OCR extraction (use actual OCR library in production)
        # In production: pytesseract, EasyOCR, or cloud OCR services

        regions = []

        # Simulate detecting title block in drawing
        regions.append(TextRegion(
            text="PROJECT: OFFICE BUILDING",
            bbox=BoundingBox(x=100, y=50, width=300, height=30, confidence=0.95),
            confidence=0.95,
            language=language
        ))

        regions.append(TextRegion(
            text="DRAWING: A-101",
            bbox=BoundingBox(x=100, y=90, width=200, height=25, confidence=0.92),
            confidence=0.92,
            language=language
        ))

        regions.append(TextRegion(
            text="SCALE: 1:100",
            bbox=BoundingBox(x=100, y=120, width=150, height=20, confidence=0.88),
            confidence=0.88,
            language=language
        ))

        return regions

    def extract_structured_text(
        self,
        image_data: bytes,
        template: Optional[Dict] = None
    ) -> Dict[str, str]:
        """Extract structured text using template matching"""
        # Extract text regions
        regions = self.extract_text(image_data)

        # Match to template fields
        structured = {}

        if template:
            for field_name, field_config in template.items():
                # Lowercase the keyword so it can match the lowercased text;
                # skip fields with no keyword instead of raising TypeError
                keyword = (field_config.get("keyword") or "").lower()
                if not keyword:
                    continue
                # Find matching region
                for region in regions:
                    if keyword in region.text.lower():
                        structured[field_name] = region.text
                        break
        else:
            # Default extraction: split on the first colon only, so that
            # values such as "1:100" are kept whole
            for region in regions:
                if "PROJECT:" in region.text:
                    structured["project_name"] = region.text.split(":", 1)[1].strip()
                elif "DRAWING:" in region.text:
                    structured["drawing_number"] = region.text.split(":", 1)[1].strip()
                elif "SCALE:" in region.text:
                    structured["scale"] = region.text.split(":", 1)[1].strip()

        return structured


class ObjectDetector:
    """Object detection for construction images"""

    def __init__(self, model: str = "yolov8"):
        self.model = model
        self.construction_classes = self._load_construction_classes()

    def _load_construction_classes(self) -> Dict[str, Dict]:
        """Load construction-specific object classes"""
        return {
            # Equipment
            "excavator": {"category": "equipment", "safety_zone": 20},
            "crane": {"category": "equipment", "safety_zone": 30},
            "forklift": {"category": "equipment", "safety_zone": 10},
            "concrete_mixer": {"category": "equipment", "safety_zone": 5},
            "scaffolding": {"category": "equipment", "safety_zone": 5},

            # Safety
            "hard_hat": {"category": "ppe", "required": True},
            "safety_vest": {"category": "ppe", "required": True},
            "safety_glasses": {"category": "ppe", "required": False},
            "harness": {"category": "ppe", "required": False},

            # Materials
            "rebar_bundle": {"category": "material", "unit": "bundle"},
            "concrete_block": {"category": "material", "unit": "pallet"},
            "lumber_stack": {"category": "material", "unit": "bundle"},
            "pipe_stack": {"category": "material", "unit": "bundle"},

            # Workers
            "worker": {"category": "person", "track": True},

            # Building elements
            "column": {"category": "structure"},
            "beam": {"category": "structure"},
            "slab": {"category": "structure"},
            "wall": {"category": "structure"},
        }

    def detect(
        self,
        image_data: bytes,
        confidence_threshold: float = 0.5
    ) -> List[DetectedObject]:
        """Detect objects in image"""
        # Simulated detection (use actual model in production)
        # In production: YOLO, Faster R-CNN, etc.

        detected = []

        # Simulate detected objects
        sample_detections = [
            ("worker", 0.92, BoundingBox(200, 300, 80, 180, 0.92)),
            ("hard_hat", 0.88, BoundingBox(210, 300, 30, 25, 0.88)),
            ("safety_vest", 0.85, BoundingBox(210, 340, 60, 80, 0.85)),
            ("scaffolding", 0.78, BoundingBox(400, 100, 200, 400, 0.78)),
            ("concrete_block", 0.72, BoundingBox(50, 450, 100, 50, 0.72)),
        ]

        for label, conf, bbox in sample_detections:
            if conf >= confidence_threshold:
                class_info = self.construction_classes.get(label, {})
                detected.append(DetectedObject(
                    label=label,
                    bbox=bbox,
                    confidence=conf,
                    attributes=class_info
                ))

        return detected

    def detect_safety_compliance(
        self,
        image_data: bytes
    ) -> Dict:
        """Detect safety compliance in image"""
        objects = self.detect(image_data)

        workers = [o for o in objects if o.label == "worker"]
        hard_hats = [o for o in objects if o.label == "hard_hat"]
        vests = [o for o in objects if o.label == "safety_vest"]

        # Hard hats and vests are both marked "required" in the class table,
        # so overall compliance needs at least one of each per worker.
        # This is a count-based heuristic; it does not match PPE to workers.
        compliant = len(hard_hats) >= len(workers) and len(vests) >= len(workers)

        compliance = {
            "workers_detected": len(workers),
            "hard_hats_detected": len(hard_hats),
            "vests_detected": len(vests),
            # Ratios are capped at 1.0 in case more PPE than workers is detected
            "hard_hat_compliance": min(1.0, len(hard_hats) / len(workers)) if workers else 1.0,
            "vest_compliance": min(1.0, len(vests) / len(workers)) if workers else 1.0,
            "overall_compliance": "compliant" if compliant else "non-compliant",
            "violations": []
        }

        if len(hard_hats) < len(workers):
            compliance["violations"].append({
                "type": "missing_hard_hat",
                "count": len(workers) - len(hard_hats)
            })

        if len(vests) < len(workers):
            compliance["violations"].append({
                "type": "missing_safety_vest",
                "count": len(workers) - len(vests)
            })

        return compliance


class TableExtractor:
    """Extract tables from images"""

    def extract_tables(
        self,
        image_data: bytes,
        detect_headers: bool = True
    ) -> List[ExtractedTable]:
        """Extract tables from image"""
        # Simulated table extraction
        # In production: Camelot, Tabula, or custom CNN

        tables = []

        # Simulate a schedule table
        tables.append(ExtractedTable(
            headers=["Activity", "Start", "End", "Duration"],
            rows=[
                ["Foundation", "2024-01-01", "2024-01-15", "14 days"],
                ["Framing", "2024-01-16", "2024-02-28", "44 days"],
                ["MEP Rough-in", "2024-03-01", "2024-03-31", "31 days"]
            ],
            bbox=BoundingBox(50, 200, 500, 200, 0.85),
            confidence=0.85
        ))

        return tables

    def table_to_dataframe(self, table: ExtractedTable) -> Dict:
        """Convert table to dictionary (DataFrame-like)"""
        return {
            "columns": table.headers,
            "data": table.rows,
            "records": [
                dict(zip(table.headers, row))
                for row in table.rows
            ]
        }


class ProgressAnalyzer:
    """Analyze construction progress from images"""

    def __init__(self):
        self.reference_models = {}

    def analyze_progress(
        self,
        current_image: bytes,
        reference_image: Optional[bytes] = None,
        element_type: str = "general"
    ) -> ProgressMeasurement:
        """Analyze progress by comparing images"""
        # Simulated progress analysis
        # In production: Use semantic segmentation + comparison

        # Simulate progress detection
        return ProgressMeasurement(
            element_type=element_type,
            total_count=100,
            completed_count=65,
            percent_complete=65.0,
            area_sqft=15000.0,
            volume_cuft=None
        )

    def compare_with_plan(
        self,
        site_photo: bytes,
        plan_image: bytes
    ) -> Dict:
        """Compare site photo with plan"""
        # Simulated comparison; returns fixed placeholder values
        return {
            "match_score": 0.78,
            "deviations": [],
            "completion_estimate": 65.0,
            "areas_of_concern": []
        }


class ConstructionImageAnalyzer:
    """
    Main class for construction image analysis.
    Based on DDC methodology Chapter 2.4.
    """

    def __init__(self):
        self.ocr = OCREngine()
        self.detector = ObjectDetector()
        self.table_extractor = TableExtractor()
        self.progress_analyzer = ProgressAnalyzer()

    def analyze_image(
        self,
        image_data: bytes,
        image_type: ImageType,
        image_id: str = "img_001",
        extract_types: Optional[List[ExtractionType]] = None
    ) -> ImageAnalysisResult:
        """
        Analyze a construction image.

        Args:
            image_data: Image data as bytes
            image_type: Type of image
            image_id: Unique image identifier
            extract_types: Types of extraction to perform

        Returns:
            Complete analysis result
        """
        start_time = datetime.now()

        if extract_types is None:
            extract_types = [ExtractionType.OCR_TEXT, ExtractionType.OBJECT_DETECTION]

        text_regions = []
        detected_objects = []
        tables = []
        progress = None

        # OCR extraction
        if ExtractionType.OCR_TEXT in extract_types:
            text_regions = self.ocr.extract_text(image_data)

        # Object detection
        if ExtractionType.OBJECT_DETECTION in extract_types:
            detected_objects = self.detector.detect(image_data)

        # Table extraction
        if ExtractionType.TABLE in extract_types:
            tables = self.table_extractor.extract_tables(image_data)

        # Progress analysis
        if ExtractionType.PROGRESS in extract_types:
            progress = self.progress_analyzer.analyze_progress(image_data)

        processing_time = (datetime.now() - start_time).total_seconds()

        return ImageAnalysisResult(
            image_id=image_id,
            image_type=image_type,
            text_regions=text_regions,
            detected_objects=detected_objects,
            tables=tables,
            progress=progress,
            metadata={"extraction_types": [e.value for e in extract_types]},
            processing_time=processing_time
        )

    def analyze_site_photo(
        self,
        image_data: bytes,
        image_id: str = "site_001"
    ) -> Dict:
        """Analyze site photo for progress and safety"""
        result = self.analyze_image(
            image_data,
            ImageType.SITE_PHOTO,
            image_id,
            [ExtractionType.OBJECT_DETECTION, ExtractionType.PROGRESS]
        )

        safety = self.detector.detect_safety_compliance(image_data)

        return {
            "image_id": result.image_id,
            "objects_detected": len(result.detected_objects),
            "progress": result.progress,
            "safety_compliance": safety,
            "equipment": [o.label for o in result.detected_objects if o.attributes.get("category") == "equipment"],
            "materials": [o.label for o in result.detected_objects if o.attributes.get("category") == "material"]
        }

    def extract_drawing_data(
        self,
        image_data: bytes,
        image_id: str = "dwg_001"
    ) -> Dict:
        """Extract data from scanned drawing"""
        result = self.analyze_image(
            image_data,
            ImageType.FLOOR_PLAN,
            image_id,
            [ExtractionType.OCR_TEXT, ExtractionType.TABLE]
        )

        # Extract title block info
        title_block = self.ocr.extract_structured_text(image_data)

        return {
            "image_id": result.image_id,
            "title_block": title_block,
            "text_regions": len(result.text_regions),
            "tables": [
                self.table_extractor.table_to_dataframe(t)
                for t in result.tables
            ],
            "all_text": [r.text for r in result.text_regions]
        }

    def batch_analyze(
        self,
        images: List[Tuple[bytes, ImageType, str]]
    ) -> List[ImageAnalysisResult]:
        """Analyze multiple images"""
        results = []
        for image_data, image_type, image_id in images:
            result = self.analyze_image(image_data, image_type, image_id)
            results.append(result)
        return results

    def export_results(
        self,
        result: ImageAnalysisResult,
        format: str = "json"
    ) -> str:
        """Export analysis results"""
        data = {
            "image_id": result.image_id,
            "image_type": result.image_type.value,
            "text_count": len(result.text_regions),
            "object_count": len(result.detected_objects),
            "table_count": len(result.tables),
            "texts": [
                {"text": r.text, "confidence": r.confidence}
                for r in result.text_regions
            ],
            "objects": [
                {"label": o.label, "confidence": o.confidence}
                for o in result.detected_objects
            ],
            "processing_time": result.processing_time
        }

        if format == "json":
            return json.dumps(data, indent=2)
        else:
            raise ValueError(f"Unsupported format: {format}")
```
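
`export_results` above writes only counts, texts, and labels; bounding boxes, table contents, and progress are left out. If the full nested result is needed as JSON, one option is `dataclasses.asdict` with a `default` hook for `Enum` values. The following is a minimal sketch under that assumption. `Kind`, `Box`, `Region`, `Result`, and `to_json` are hypothetical stand-ins for the Quick Start classes so that the snippet runs on its own; they are not part of the skill.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Any, List


class Kind(Enum):  # hypothetical stand-in for ImageType
    FLOOR_PLAN = "floor_plan"


@dataclass
class Box:  # hypothetical stand-in for BoundingBox
    x: int
    y: int
    width: int
    height: int


@dataclass
class Region:  # hypothetical stand-in for TextRegion
    text: str
    bbox: Box
    confidence: float


@dataclass
class Result:  # hypothetical, reduced stand-in for ImageAnalysisResult
    image_id: str
    image_type: Kind
    text_regions: List[Region] = field(default_factory=list)


def _encode(value: Any) -> Any:
    """Fallback for objects json cannot serialize natively (here: Enum members)."""
    if isinstance(value, Enum):
        return value.value
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def to_json(result: Result) -> str:
    # asdict() recurses into nested dataclasses and lists; Enum members pass
    # through unchanged and are converted by the default hook.
    return json.dumps(asdict(result), default=_encode, indent=2)


result = Result(
    image_id="dwg_001",
    image_type=Kind.FLOOR_PLAN,
    text_regions=[Region("SCALE: 1:100", Box(100, 120, 150, 20), 0.88)],
)
payload = json.loads(to_json(result))
print(payload["text_regions"][0]["bbox"])
```

The same hook could be extended for `datetime` or `bytes` fields if the result type grows.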

## Common Use Cases

### Analyze Site Photo

```python
analyzer = ConstructionImageAnalyzer()

# Load image (in production, read from file)
with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

result = analyzer.analyze_site_photo(image_data)

print(f"Objects detected: {result['objects_detected']}")
print(f"Safety compliance: {result['safety_compliance']['overall_compliance']}")
print(f"Progress: {result['progress'].percent_complete}%")
```

### Extract Drawing Data

```python
with open("floor_plan.png", "rb") as f:
    drawing_data = f.read()

data = analyzer.extract_drawing_data(drawing_data)

print(f"Drawing: {data['title_block'].get('drawing_number')}")
print(f"Project: {data['title_block'].get('project_name')}")
for table in data['tables']:
    print(f"Table with {len(table['records'])} rows")
```
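
`OCREngine.extract_structured_text` also accepts a `template` that maps each field name to a config dict with a `keyword` entry. The sketch below illustrates that shape on plain strings so it runs without the Quick Start classes. `match_template` and the `revision` field are hypothetical, and keeping only the text after the first colon is an assumption of this sketch (the skill's template branch stores the whole matched line).

```python
from typing import Dict, List


def match_template(texts: List[str], template: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Map each template field to the first text line containing its keyword."""
    structured: Dict[str, str] = {}
    for field_name, config in template.items():
        keyword = config.get("keyword", "").lower()
        if not keyword:
            continue  # a field without a keyword cannot be matched
        for text in texts:
            if keyword in text.lower():
                # Split on the first colon only, so "1:100" stays intact
                _, _, value = text.partition(":")
                structured[field_name] = value.strip() or text
                break
    return structured


title_block_template = {
    "project_name": {"keyword": "project"},
    "drawing_number": {"keyword": "drawing"},
    "scale": {"keyword": "scale"},
    "revision": {"keyword": "rev"},  # hypothetical field with no matching line
}
ocr_lines = ["PROJECT: OFFICE BUILDING", "DRAWING: A-101", "SCALE: 1:100"]

fields = match_template(ocr_lines, title_block_template)
print(fields)
```

Fields whose keyword matches nothing are simply absent from the result, so callers should read them with `.get()`.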

### Detect Safety Violations

```python
detector = ObjectDetector()

with open("site_photo.jpg", "rb") as f:
    image_data = f.read()

safety = detector.detect_safety_compliance(image_data)

if safety['overall_compliance'] == 'non-compliant':
    for violation in safety['violations']:
        print(f"Violation: {violation['type']} - Count: {violation['count']}")
```
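
`detect_safety_compliance` compares counts of hard hats and workers, so it cannot say which worker is missing PPE. One possible refinement, sketched here as an assumption rather than as the skill's method, is to assign each PPE detection to the worker whose bounding box contains the PPE box's center. `Box` and `ppe_per_worker` are hypothetical names, and the second worker's coordinates are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Box:  # hypothetical stand-in for BoundingBox (x, y = top-left corner)
    x: int
    y: int
    width: int
    height: int

    def center(self) -> Tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)

    def contains(self, point: Tuple[float, float]) -> bool:
        px, py = point
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def ppe_per_worker(
    workers: List[Box],
    ppe: List[Tuple[str, Box]],
    required: Sequence[str] = ("hard_hat", "safety_vest"),
) -> List[Dict]:
    """List the required PPE items not found inside each worker's box."""
    report = []
    for index, worker in enumerate(workers):
        # A PPE item counts as worn if its center lies inside the worker box
        worn = {label for label, box in ppe if worker.contains(box.center())}
        missing = [item for item in required if item not in worn]
        report.append({"worker": index, "missing": missing})
    return report


workers = [Box(200, 300, 80, 180), Box(500, 280, 80, 190)]
ppe = [
    ("hard_hat", Box(210, 300, 30, 25)),
    ("safety_vest", Box(210, 340, 60, 80)),
    ("safety_vest", Box(510, 330, 60, 80)),  # second worker has a vest only
]

report = ppe_per_worker(workers, ppe)
for entry in report:
    print(entry)
```

Center containment is a crude rule: it can mis-assign PPE when workers overlap in the frame, so a real pipeline would likely need overlap scores or pose estimation.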

## Quick Reference

| Component | Purpose |
|-----------|---------|
| `ConstructionImageAnalyzer` | Main analysis engine |
| `OCREngine` | Text extraction |
| `ObjectDetector` | Object detection |
| `TableExtractor` | Table extraction |
| `ProgressAnalyzer` | Progress analysis |
| `ImageAnalysisResult` | Complete analysis result |
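
The `columns` and `records` entries returned by `TableExtractor.table_to_dataframe` map directly onto the standard `csv` module. As a minimal sketch, assuming CSV is the desired hand-off format, the hypothetical helper `records_to_csv` below writes them to a string; the rows are the simulated schedule from the Quick Start code.

```python
import csv
import io
from typing import Dict, List


def records_to_csv(columns: List[str], records: List[Dict[str, str]]) -> str:
    """Serialize table records to CSV text, with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


columns = ["Activity", "Start", "End", "Duration"]
rows = [
    ["Foundation", "2024-01-01", "2024-01-15", "14 days"],
    ["Framing", "2024-01-16", "2024-02-28", "44 days"],
    ["MEP Rough-in", "2024-03-01", "2024-03-31", "31 days"],
]
records = [dict(zip(columns, row)) for row in rows]

csv_text = records_to_csv(columns, records)
print(csv_text)
```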

## Resources

- **Book**: "Data-Driven Construction" by Artem Boiko, Chapter 2.4
- **Website**: https://datadrivenconstruction.io

## Next Steps

- Use [cad-to-data](../cad-to-data/SKILL.md) for CAD/BIM extraction
- Use [defect-detection-ai](../../../DDC_Innovative/defect-detection-ai/SKILL.md) for defects
- Use [safety-compliance-checker](../../../DDC_Innovative/safety-compliance-checker/SKILL.md) for safety
Security Recommendations
Before installing, ask the publisher which exact API keys/environment variables are required (e.g., OPENAI_API_KEY, CLAUDE_API_KEY) and why. If you proceed: (1) only provide the minimum-scoped key(s) with restricted permissions; (2) run the skill in a sandboxed agent environment so it cannot read unrelated filesystem paths; (3) monitor outbound network requests (to confirm calls go only to expected Vision API endpoints); (4) prefer a version that explicitly lists required env vars in the manifest; and (5) if you cannot verify the required keys/endpoints, avoid giving it any global or high-privilege secrets.
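Point (1) can be made concrete by handing the skill only the variables it explicitly declares, instead of letting it read the whole environment. The sketch below is one way to do that, not something the skill provides. `load_declared_keys` is a hypothetical helper, and `OPENAI_API_KEY` is used only because the text above names it as an example; the keys the skill actually needs are not documented.

```python
import os
from typing import Dict, Iterable, Mapping


def load_declared_keys(
    declared: Iterable[str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Return only the declared variables; fail loudly if any is missing."""
    names = list(declared)
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    # Everything not explicitly declared stays out of the returned dict
    return {name: env[name] for name in names}


# A fake environment for illustration; real use would rely on os.environ
fake_env = {"OPENAI_API_KEY": "sk-test", "AWS_SECRET_ACCESS_KEY": "unrelated"}
keys = load_declared_keys(["OPENAI_API_KEY"], env=fake_env)
print(sorted(keys))
```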
Capability Analysis
Type: OpenClaw Skill
Name: image-to-data
Version: 2.0.0
The skill requests broad `filesystem` and `network` permissions in `claw.json`. Although the Python code in `SKILL.md` is a simulation and does not actively utilize these permissions, the `instructions.md` explicitly directs the AI agent to use these permissions for reading image files and making external AI Vision API calls. Additionally, the agent is instructed to load 'All API keys from environment variables'. These capabilities, while plausibly needed for the skill's stated purpose, involve accessing sensitive environment variables and performing network I/O, which are considered risky capabilities that could be exploited if the skill were fully implemented without robust security measures.
Capability Assessment
Purpose & Capability
The skill claims to extract structured data from construction images (OCR, object detection, measurements) which coheres with requiring filesystem and network access (for local images + vision APIs). However, the manifest (requires.env: none) does not declare any API keys even though the instructions explicitly say 'All API keys loaded from environment variables' and mention calling Claude/OpenAI Vision. That omission is an inconsistency: either the skill should enumerate required credentials, or it will attempt to use any available secrets in the environment.
Instruction Scope
SKILL.md and instructions.md direct the agent to read arbitrary image file paths, perform OCR/detection, and call external AI Vision APIs. The docs are vague about which env vars/endpoints to use and do not constrain filesystem paths. The agent is therefore instructed to access local files and make network calls, and could access environment variables broadly because no specific keys are declared.
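Because the docs do not constrain filesystem paths, a wrapper can do so before any image is read. The sketch below assumes a single allowed image directory. `resolve_image_path`, `ALLOWED_SUFFIXES`, and the `project_images` directory are hypothetical names for illustration and are not part of the skill.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats actually used
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def resolve_image_path(candidate: str, allowed_root: Path) -> Path:
    """Resolve a requested path and reject anything outside allowed_root."""
    root = allowed_root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real target
    path = (root / candidate).resolve()
    if root not in path.parents:
        raise PermissionError(f"{candidate!r} is outside {root}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {path.suffix!r}")
    return path


root = Path("project_images")  # hypothetical allowed directory
safe = resolve_image_path("photos/site_photo.jpg", root)
print(safe.name)
```

Absolute paths and `..` traversal both resolve to a location outside the root and are rejected with `PermissionError`.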
Install Mechanism
This is an instruction-only skill with no install spec and no code files to be downloaded or executed at install time, which minimizes install-time risk.
Credentials
The skill requires network and filesystem permissions (declared in claw.json) but declares no required environment variables. The instructions nevertheless expect API keys from env vars (e.g., Claude/OpenAI Vision). The access requested is therefore out of proportion to what is declared: it is unclear which specific secrets are needed, and the skill could try to use any env var present. Network + filesystem access combined with unspecified secret usage increases the risk of accidental or malicious exfiltration.
Persistence & Privilege
always is false and there is no install step that modifies other skills or system-wide configuration. The skill does not request permanent/autonomous elevation beyond normal agent invocation.
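How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install image-to-data
  3. Once installed, invoke the skill by name or trigger it with /image-to-data.
  4. Provide the inputs described in the skill's parameter notes to get structured output.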
Version History
v2.0.0
- Major redesign: now extracts structured data from construction images using vision, OCR, and AI models.
- Supports multiple construction image types (site photos, floor plans, scanned documents, etc.).
- Provides data extraction for text, tables, detected objects, classification, and progress measurement.
- Introduces detailed schemas for detected objects, bounding boxes, OCR text regions, and tables.
- Modular architecture for OCR and object detection tailored to common construction needs.
- Enables template-based structured text extraction and construction-specific object class detection.
v1.0.0
- Initial release of the skill to extract structured data from construction images using AI vision methods.
- Supports analysis of site photos, scanned documents, and construction drawings.
- Provides data types and classes for OCR text extraction, table detection, object detection, and construction progress measurement.
- Includes example implementations and simulated outputs for OCR and object detection tailored to construction use cases.
Metadata
Slug image-to-data
Version 2.0.0
License
Total installs 5
Current installs 5
Version count 2
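FAQ

What is Image To Data?

Extract data from construction images using AI Vision. Analyze site photos, scanned documents, drawings. It is an AI agent skill plugin for Claude Code / OpenClaw, with 1693 total downloads to date.

How do I install Image To Data?

Run the command "/install image-to-data" in the OpenClaw or Claude Code chat box for a one-step install; no additional configuration is required.

Is Image To Data free?

Yes. Image To Data is completely free (free and open source) and can be downloaded, installed, and used freely.

Which platforms does Image To Data support?

Image To Data is cross-platform and runs in any environment where OpenClaw / Claude Code is deployed.

Who develops Image To Data?

It is developed and maintained by datadrivenconstruction (@datadrivenconstruction); the current version is v2.0.0.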