kaiyuelv

Data Labeling Studio

by Lv Lancer · GitHub ↗ · v1.0.0 · MIT-0
cross-platform ⚠ suspicious
Downloads: 58
Stars: 0
Active Installs: 0
Versions: 1
Install in OpenClaw
/install data-labeling-studio
Description
Intelligent toolkit for annotating images, text, audio, and video with active learning, quality control, and exporting labeled datasets.
README (SKILL.md)

Data Labeling Studio

Metadata

  • Name: data-labeling-studio
  • Display Name: Data Labeling Studio | 数据标注工作室
  • Description:
    • EN: Intelligent data labeling and annotation toolkit supporting image, text, audio, and video with active learning and quality control.
    • ZH: 智能数据标注和注释工具包,支持图像、文本、音频和视频,包含主动学习和质量控制。
  • Version: 1.0.0
  • Author: Kimi Claw
  • Tags: data-labeling, annotation, image-annotation, text-annotation, active-learning, quality-control, dataset, ml-training
  • Category: Data Processing
  • Icon: 🏷️

Capabilities

Actions

image_annotate

Perform image annotation

  • image_dir: Image directory path (string, required)
  • annotation_type: Type of annotation (string, required) - bounding_box, polygon, keypoint, segmentation
  • labels: Label categories (array, required)
  • output_format: Output format (string) - coco, pascal_voc, yolo
  • active_learning: Enable active learning suggestions (boolean, default: true)
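
The listing does not say how the active_learning suggestions are chosen. One common strategy is uncertainty sampling: rank unlabeled samples by the entropy of a model's predicted class probabilities and send the most uncertain ones to the annotator first. The sketch below illustrates that idea only; the function names and the prediction data are hypothetical and are not part of this skill.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(predictions, k=2):
    """Return the k sample ids whose predicted distribution has the highest entropy.

    `predictions` is a list of (sample_id, class_probabilities) pairs.
    """
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [sample_id for sample_id, _ in ranked[:k]]

# Hypothetical model outputs over the four labels used elsewhere in this README.
predictions = [
    ("img_001.jpg", [0.97, 0.01, 0.01, 0.01]),  # confident prediction
    ("img_002.jpg", [0.25, 0.25, 0.25, 0.25]),  # maximally uncertain
    ("img_003.jpg", [0.60, 0.30, 0.05, 0.05]),  # somewhat uncertain
]

queue = most_uncertain(predictions, k=2)
print(queue)
```

Entropy is only one possible uncertainty score; least-confidence or margin sampling would slot into the same ranking function.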

text_annotate

Perform text annotation

  • text_data: Text data source (string/object, required)
  • annotation_task: Task type (string, required) - classification, ner, sentiment, summarization
  • labels: Label categories (array, required)
  • output_format: Output format (string) - json, csv, spacy

audio_annotate

Perform audio annotation

  • audio_dir: Audio directory path (string, required)
  • annotation_type: Type (string, required) - transcription, speaker_id, emotion, event
  • segment_duration: Segment duration in seconds (float, default: 5.0)
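
To make segment_duration concrete, the sketch below splits a clip of known length into fixed windows, with a shorter final window when the length is not an exact multiple. It assumes segments are contiguous and non-overlapping, which the listing does not confirm; segment_bounds is a hypothetical helper, not a function of this skill.

```python
def segment_bounds(total_duration, segment_duration=5.0):
    """Return (start, end) pairs in seconds covering a clip of `total_duration` seconds."""
    if segment_duration <= 0:
        raise ValueError("segment_duration must be positive")
    bounds = []
    index = 0
    # Multiply by the index instead of accumulating to avoid float drift.
    while index * segment_duration < total_duration:
        start = index * segment_duration
        end = min(start + segment_duration, total_duration)
        bounds.append((start, end))
        index += 1
    return bounds

segments = segment_bounds(12.0)
print(segments)
```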

video_annotate

Perform video annotation

  • video_path: Video file path (string, required)
  • annotation_type: Type (string, required) - object_tracking, action_recognition, scene_detection
  • frame_sample_rate: Frame sampling rate (int, default: 1)
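
The listing does not define the unit of frame_sample_rate. Under the assumption that it means "keep every Nth frame" (so the default of 1 keeps all frames), the selected frame indices could be computed as in this sketch. The helper name is hypothetical.

```python
def sampled_frame_indices(total_frames, frame_sample_rate=1):
    """Indices of the frames kept when sampling every `frame_sample_rate`-th frame."""
    if frame_sample_rate < 1:
        raise ValueError("frame_sample_rate must be at least 1")
    return list(range(0, total_frames, frame_sample_rate))

indices = sampled_frame_indices(10, frame_sample_rate=3)
print(indices)
```

If the parameter instead means frames per second, the step would be the video's native frame rate divided by the requested rate.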

quality_check

Check annotation quality and consistency

  • annotations: Annotation file path (string, required)
  • ground_truth: Ground truth file path (string, optional)
  • metrics: Quality metrics (array) - iou, accuracy, consistency, coverage
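
For reference, the iou metric for two bounding boxes is the area of their intersection divided by the area of their union. The sketch below assumes boxes in COCO-style [x_min, y_min, width, height] form; it is an illustration of the metric, not the code shipped in scripts/quality_check.py.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, width, height] boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Overlap along each axis, clamped to zero when the boxes do not touch.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0

predicted = [0, 0, 10, 10]
ground_truth = [5, 5, 10, 10]
print(f"IoU: {iou(predicted, ground_truth):.3f}")
```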

dataset_export

Export labeled dataset to ML format

  • annotations: Annotation source (string, required)
  • format: Target format (string, required) - coco, yolo, tfrecord, huggingface
  • output_dir: Output directory (string, required)
  • split_ratios: Train/val/test split (object) - {train: 0.8, val: 0.1, test: 0.1}
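
A minimal sketch of how split_ratios might be applied: shuffle the items with a fixed seed, then slice by the requested proportions. Whether dataset_export shuffles, and how it rounds, is not documented, so the seeded shuffle and the rounding here are assumptions, and split_dataset is a hypothetical helper.

```python
import random

def split_dataset(items, split_ratios=None, seed=0):
    """Shuffle `items` deterministically and split them into train/val/test lists."""
    ratios = split_ratios or {"train": 0.8, "val": 0.1, "test": 0.1}
    if abs(sum(ratios.values()) - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_train = round(len(shuffled) * ratios["train"])
    n_val = round(len(shuffled) * ratios["val"])
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # the remainder goes to test
    }

image_ids = [f"img_{i:03d}.jpg" for i in range(10)]
splits = split_dataset(image_ids)
print({name: len(members) for name, members in splits.items()})
```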

Requirements

  • Python 3.8+
  • Pillow >= 10.0.0 (for image processing)
  • OpenCV >= 4.8.0 (for image/video annotation)
  • NumPy >= 1.24.0
  • Pandas >= 2.0.0
  • LabelImg >= 1.8.0 (optional)
  • Librosa >= 0.10.0 (for audio processing)
  • scikit-learn >= 1.3.0 (for active learning)

Examples

Image Annotation

from labeling_studio import ImageAnnotator

# Initialize annotator
annotator = ImageAnnotator(
    annotation_type="bounding_box",
    labels=["person", "car", "dog", "cat"],
    output_format="coco"
)

# Annotate images with active learning
annotator.annotate(
    image_dir="./images",
    output_file="./annotations/coco.json",
    active_learning=True  # AI suggests uncertain samples
)

# Export to YOLO format
annotator.export("./annotations", format="yolo")
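
What annotator.export does internally is not shown here. As background, COCO stores a box as [x_min, y_min, width, height] in pixels, while YOLO label files store the box center and size normalized by the image dimensions. The hypothetical helper below sketches that coordinate conversion for a single box.

```python
def coco_to_yolo(bbox, image_width, image_height):
    """Convert a COCO [x_min, y_min, width, height] pixel box to YOLO's
    normalized (x_center, y_center, width, height)."""
    x_min, y_min, width, height = bbox
    return (
        (x_min + width / 2) / image_width,
        (y_min + height / 2) / image_height,
        width / image_width,
        height / image_height,
    )

yolo_box = coco_to_yolo([100, 50, 200, 100], image_width=400, image_height=200)
print(yolo_box)
```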

Text Annotation

from labeling_studio import TextAnnotator

# NER annotation
annotator = TextAnnotator(
    annotation_task="ner",
    labels=["PERSON", "ORG", "LOC", "DATE"]
)

# Annotate from file
annotations = annotator.annotate(
    text_data="./data/corpus.txt",
    output_file="./annotations/ner.json"
)

Quality Check

from labeling_studio import QualityChecker

# Check annotation quality
checker = QualityChecker()
report = checker.check(
    annotations="./annotations/coco.json",
    ground_truth="./annotations/ground_truth.json",
    metrics=["iou", "consistency", "coverage"]
)

print(f"Average IoU: {report['iou']:.2f}")
print(f"Consistency Score: {report['consistency']:.2f}")
print(f"Coverage: {report['coverage']:.2f}")

Scripts

  • scripts/annotate_images.py: Image annotation tool
  • scripts/annotate_text.py: Text annotation tool
  • scripts/annotate_audio.py: Audio annotation tool
  • scripts/annotate_video.py: Video annotation tool
  • scripts/quality_check.py: Quality check tool
  • scripts/export_dataset.py: Dataset export tool

Installation

pip install -r requirements.txt

Usage

# Image annotation with active learning
python scripts/annotate_images.py --input ./images --type bbox --labels person,car --format coco

# Text NER annotation
python scripts/annotate_text.py --input ./texts.txt --task ner --labels PERSON,ORG,LOC

# Quality check
python scripts/quality_check.py --annotations ./coco.json --ground-truth ./gt.json

# Export to YOLO
python scripts/export_dataset.py --input ./coco.json --format yolo --output ./yolo_dataset

License

MIT License

Usage Guidance
This package looks internally inconsistent rather than blatantly malicious: it promises a full multi‑modal 'labeling_studio' with many helper scripts and model integrations, but the archive only contains an image annotator script, a quality checker, example/test mocks, and a requirements.txt. Before installing or running anything:
  • Don't pip install the requirements into your main environment. Use a disposable virtualenv or container to avoid pulling heavy packages unnecessarily.
  • Inspect or run the included scripts locally to confirm behavior. The image annotator uses mocked/simulated annotations (random), not real models; 'active learning' appears not to be implemented here.
  • Be cautious that examples import a module (labeling_studio) that isn't included; this may mean the published bundle is incomplete or the real implementation is fetched from elsewhere (ask the author or source). If the package intended to download or fetch code at runtime, that would be higher risk, but no such downloader is present in the files.
  • If you need multi‑modal capabilities, request the missing source files or a packaged release (e.g., on GitHub) and verify the code that integrates models or remote endpoints. If you don't get clear answers, prefer an alternative with a complete source/release.
Overall: don't run or install this in a production environment until the mismatches are resolved; treat it as incomplete/misleading and proceed in a sandbox if you want to experiment.
Capability Analysis
Type: OpenClaw Skill
Name: data-labeling-studio
Version: 1.0.0
The data-labeling-studio skill bundle is a legitimate toolkit for data annotation tasks across multiple modalities. The provided Python scripts (scripts/annotate_images.py, scripts/quality_check.py) and documentation (SKILL.md, README.md) contain standard data processing logic, such as image scanning, IoU calculation, and JSON-based annotation management, without any evidence of malicious intent, data exfiltration, or unauthorized execution.
Capability Tags
crypto
Capability Assessment
Purpose & Capability
The skill claims multi‑modal support (image, text, audio, video) and an importable package 'labeling_studio', but the bundle only includes scripts for image annotation and quality checks. Several scripts referenced in SKILL.md (annotate_text.py, annotate_audio.py, annotate_video.py, export_dataset.py) and the labeling_studio module used in examples are not present. Declared requirements (librosa, OpenCV, Pillow, scikit‑learn) are heavier than what the included scripts actually use.
Instruction Scope
SKILL.md instructs running scripts and doing pip install -r requirements.txt which is expected, but many example commands and APIs reference missing files/modules (labeling_studio import, scripts that aren't in the manifest). The runtime instructions also enable 'active learning' and 'pre_annotate' but the included code only contains mock/simulated behavior rather than actual model integration — this is scope creep / mismatch between promised capabilities and real instructions.
Install Mechanism
There is no formal install spec (instruction-only), which is low risk. However SKILL.md and README suggest running 'pip install -r requirements.txt' which will pull several heavy third‑party packages; because the project is incomplete, installing those deps may be unnecessary and should be done in an isolated environment if attempted.
Credentials
The skill requests no environment variables, no credentials, and no config paths. The code reads only local file paths supplied by the user. There is no evidence of attempts to access unrelated secrets or network endpoints in the provided files.
Persistence & Privilege
The skill is not always-enabled and does not request persistent system privileges or modify other skills. It does not include an installer that writes to system locations; it is run on demand as scripts.
How to Use
  1. Make sure OpenClaw is installed (local or Docker)
  2. Run the install command in chat: /install data-labeling-studio
  3. After installation, invoke the skill by name or use /data-labeling-studio
  4. Provide required inputs per the skill's parameter spec and get structured output
Version History
v1.0.0
Initial release of Data Labeling Studio.
  • Supports intelligent data labeling and annotation for images, text, audio, and video
  • Includes active learning suggestions and quality control checks
  • Multiple annotation formats supported: COCO, YOLO, Pascal VOC, TFRecord, HuggingFace, and more
  • Tools provided for annotation, quality checking, and dataset export
  • Example usage and script files included for all major features
Metadata
  • Slug: data-labeling-studio
  • Version: 1.0.0
  • License: MIT-0
  • All-time Installs: 0
  • Active Installs: 0
  • Total Versions: 1
Frequently Asked Questions

What is Data Labeling Studio?

Intelligent toolkit for annotating images, text, audio, and video with active learning, quality control, and exporting labeled datasets. It is an AI Agent Skill for Claude Code / OpenClaw, with 58 downloads so far.

How do I install Data Labeling Studio?

Run "/install data-labeling-studio" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.

Is Data Labeling Studio free?

Yes, Data Labeling Studio is completely free, licensed under MIT-0. You can download, install and use it at no cost.

Which platforms does Data Labeling Studio support?

Data Labeling Studio is cross-platform and runs anywhere OpenClaw / Claude Code is available.

Who created Data Labeling Studio?

It is built and maintained by Lv Lancer (@kaiyuelv); the current version is v1.0.0.
