nitishgargiitd

Data Cog

Author: CellCog · GitHub · v1.0.11 · MIT-0
darwin · linux · windows ✓ Security check passed
Total downloads: 2219
Favorites: 4
Current installs: 8
Versions: 12
Install in OpenClaw
/install data-cog
Description
AI data analysis and visualization powered by CellCog. Data cleaning, exploratory analysis, hypothesis testing, statistical reports, ML model evaluation, dat...
Usage Instructions (SKILL.md)

Data Cog - Your Data Has Answers, CellCog Finds Them

Data analysis and visualization from uploaded files.

Most AI tools return code when you ask about data. CellCog returns answers — actual charts, clean datasets, statistical reports, and visual dashboards. Upload messy CSVs with a minimal prompt, and CellCog's coding agent explores your data, finds the patterns, and presents them beautifully. Full Python access for everything from data cleaning to ML model evaluation.

How to Use

For your first CellCog task in a session, read the cellcog skill for the full SDK reference — file handling, chat modes, timeouts, and more.

OpenClaw (fire-and-forget):

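from cellcog import CellCogClient

# Assumption: "openclaw" is the provider value to use here; it is one of the
# options listed in the next example.
client = CellCogClient(agent_provider="openclaw")
result = client.create_chat(
    prompt="[your task prompt]",
    notify_session_key="agent:main:main",
    task_label="my-task",
    chat_mode="agent",
)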

All agents except OpenClaw (blocks until done):

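from cellcog import CellCogClient

# The pipe-separated string is a placeholder: pass the single provider name that matches your agent.
client = CellCogClient(agent_provider="openclaw|cursor|claude-code|codex|...")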
result = client.create_chat(
    prompt="[your task prompt]",
    task_label="my-task",
    chat_mode="agent",
)
print(result["message"])

What Makes Data-Cog Different

Code as Tool, Not as Output

Other AI tools give you Python code and say "run this." CellCog runs the code for you and delivers the results:

| Other AI Tools | Data-Cog |
|---|---|
| "Here's a pandas script to analyze your data" | Here are your actual insights with charts |
| "Run this matplotlib code to see the chart" | Here's the chart, annotated with findings |
| "This SQL query will find outliers" | Found 23 outliers, here's what they mean |
| "You'll need scikit-learn for this" | Model trained, here's accuracy and feature importance |

You upload data. You get answers. The code runs behind the scenes.


What Data Work You Can Do

Exploratory Data Analysis

Understand your data fast:

  • Dataset Profiling: "Analyze this CSV — distributions, missing values, outliers, correlations, and data quality summary"
  • Pattern Discovery: "What patterns and trends exist in this sales data? Surprise me."
  • Anomaly Detection: "Find unusual patterns in this server log data — what looks abnormal?"
  • Relationship Analysis: "What factors most strongly correlate with customer churn in this dataset?"

Example prompt:

"Analyze this dataset: <SHOW_FILE>/path/to/customer_data.csv</SHOW_FILE>

I don't know much about this data yet. Give me:

  • Overview: rows, columns, data types, missing values
  • Key distributions and summary statistics
  • Most interesting correlations
  • Any outliers or data quality issues
  • 3-5 insights that jump out

Present findings as an interactive HTML report with charts."
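
Prompts like the one above can also be assembled programmatically. The following is a minimal sketch, assuming you want to build the prompt string in Python before sending it. The helper names `show_file` and `build_eda_prompt` are hypothetical and not part of the cellcog SDK; only the `<SHOW_FILE>` tag syntax comes from the examples in this document.

```python
def show_file(path: str) -> str:
    """Wrap a file path in the SHOW_FILE tags used in the example prompts."""
    return f"<SHOW_FILE>{path}</SHOW_FILE>"


def build_eda_prompt(path: str, requests: list[str], output: str) -> str:
    """Assemble an exploratory-analysis prompt: file reference, bullet list, output format."""
    bullets = "\n".join(f"  • {item}" for item in requests)
    return (
        f"Analyze this dataset: {show_file(path)}\n\n"
        f"Give me:\n\n{bullets}\n\n"
        f"{output}"
    )


prompt = build_eda_prompt(
    "/path/to/customer_data.csv",
    [
        "Overview: rows, columns, data types, missing values",
        "Most interesting correlations",
    ],
    "Present findings as an interactive HTML report with charts.",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument of `client.create_chat(...)` shown under How to Use.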

Data Cleaning & Transformation

Wrangle messy data into shape:

  • Clean Messy Data: "Clean this CSV — fix inconsistent date formats, handle missing values, remove duplicates, standardize column names"
  • Data Transformation: "Pivot this transaction data into a monthly summary by product category"
  • Data Merging: "Join these three CSV files on customer_id and create a unified dataset"
  • Feature Engineering: "Create useful features from this raw data for predicting house prices"

Example prompt:

"Clean and transform this dataset: <SHOW_FILE>/path/to/messy_data.csv</SHOW_FILE>

Issues I know about:

  • Dates are in mixed formats (MM/DD/YYYY and YYYY-MM-DD)
  • 'Revenue' column has some values with $ signs and commas
  • Duplicate rows exist
  • Missing values in 'Region' column

Clean it up and give me back a clean CSV plus a summary of what you changed."
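
To make the four issues in this prompt concrete, here is a small local sketch in plain Python of the kind of fixes being requested. It is illustrative only and is not how CellCog performs cleaning; the sample rows, the "Unknown" fill value, and the helper names are all assumptions.

```python
from datetime import datetime

# Hypothetical sample rows showing the issues listed in the prompt.
rows = [
    {"Date": "03/15/2024", "Revenue": "$1,200.50", "Region": "West"},
    {"Date": "2024-03-16", "Revenue": "980", "Region": ""},
    {"Date": "03/15/2024", "Revenue": "$1,200.50", "Region": "West"},  # duplicate
]


def parse_date(text: str) -> str:
    """Normalize MM/DD/YYYY or YYYY-MM-DD to ISO YYYY-MM-DD."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def parse_revenue(text: str) -> float:
    """Strip $ signs and thousands separators, then convert to float."""
    return float(text.replace("$", "").replace(",", ""))


def clean(rows):
    """Normalize each row, fill missing regions, and drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        fixed = {
            "Date": parse_date(row["Date"]),
            "Revenue": parse_revenue(row["Revenue"]),
            # Assumption: missing regions get a placeholder label.
            "Region": row["Region"].strip() or "Unknown",
        }
        key = tuple(fixed.items())
        if key not in seen:  # duplicates are detected after normalization
            seen.add(key)
            cleaned.append(fixed)
    return cleaned


cleaned = clean(rows)
print(cleaned)
```

Deduplicating after normalization means two rows that differ only in date format or currency formatting are treated as the same record.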

Statistical Analysis

Rigorous analysis with real numbers:

  • Hypothesis Testing: "Is there a statistically significant difference in conversion rates between our A and B variants?"
  • Regression Analysis: "What factors predict employee salary in this HR dataset? Build a regression model."
  • Time Series Analysis: "Analyze this monthly revenue data — trend, seasonality, and forecast next 6 months"
  • Cohort Analysis: "Create a cohort analysis showing user retention by signup month"

Example prompt:

"I ran an A/B test on our checkout page: <SHOW_FILE>/path/to/ab_test_results.csv</SHOW_FILE>

Columns: user_id, variant (A or B), converted (0/1), revenue, timestamp

Tell me:

  • Is variant B statistically better? (p-value, confidence interval)
  • Conversion rate difference
  • Revenue per user difference
  • Sample size adequacy check
  • My recommendation: ship B or keep testing?

Present with clear charts and a plain-English conclusion."
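
For reference, the significance question in this prompt is commonly answered with a two-proportion z-test. The sketch below shows that calculation in plain Python, assuming a pooled standard error and a two-sided test. It is one standard approach and not necessarily the method CellCog selects; the counts are made up for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates (pooled standard error)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF evaluated at |z| via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return p_b - p_a, z, p_value


# Hypothetical counts, not taken from any real test.
diff, z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"difference={diff:.3f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value alone does not settle the "ship B or keep testing" question; the prompt's sample size and revenue-per-user checks still matter.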

Visualization & Reporting

Turn data into visual stories:

  • Chart Generation: "Create a set of charts showing our quarterly performance from this data"
  • Dashboard Reports: "Build an interactive dashboard from this sales dataset with filters by region and product"
  • Presentation-Ready Visuals: "Create publication-quality charts from this research data"
  • Comparison Visuals: "Visualize how our metrics compare to industry benchmarks"

Machine Learning

Applied ML without the setup:

  • Classification: "Predict which customers will churn based on this dataset — train a model, show feature importance"
  • Clustering: "Segment these customers into groups based on behavior — how many natural clusters exist?"
  • Forecasting: "Forecast next quarter's sales using this historical data"
  • Model Evaluation: "I trained a model — here are the predictions. Evaluate: accuracy, precision, recall, confusion matrix, ROC curve"

Example prompt:

"Predict customer churn from this dataset: <SHOW_FILE>/path/to/customer_features.csv</SHOW_FILE>

Target column: 'churned'

  • Train a model, try at least 2 algorithms
  • Show feature importance — what drives churn?
  • Confusion matrix and ROC curve
  • Plain-English summary: 'The top 3 reasons customers churn are...'
  • Actionable recommendations based on findings

I want insights, not just metrics."
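
The evaluation metrics named in this section (accuracy, precision, recall, confusion matrix) all derive from four counts. A minimal sketch for binary 0/1 labels follows; the function name and sample labels are hypothetical, and this is not a description of CellCog's internals.

```python
def classification_metrics(y_true, y_pred):
    """Confusion-matrix counts plus accuracy, precision, and recall for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        # Rows are actual classes (0, 1); columns are predicted classes (0, 1).
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical labels: 1 = churned, 0 = retained.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For churn, recall is often the metric to watch, since a missed churner (false negative) usually costs more than an unnecessary retention offer.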


Supported Data Formats

| Format | How to Send |
|---|---|
| CSV | Upload via SHOW_FILE |
| Excel (XLSX) | Upload via SHOW_FILE |
| JSON | Upload via SHOW_FILE |
| Parquet | Upload via SHOW_FILE |
| SQL exports | Upload the dump via SHOW_FILE |
| Inline data | Describe small datasets directly in prompt |

Output Formats

| Format | Best For |
|---|---|
| Interactive HTML Dashboard | Explorable charts, filters, drill-downs |
| PDF Report | Shareable analysis reports with charts and findings |
| Clean CSV/XLSX | Cleaned or transformed data files for downstream use |
| Markdown | Quick insights for integration into docs |

Chat Mode for Data

| Scenario | Recommended Mode |
|---|---|
| Quick data cleaning, simple charts, basic statistics | "agent" |
| Deep analysis with multiple techniques, ML modeling, comprehensive reports | "agent team" |

Use "agent" for most data work. Data cleaning, EDA, chart generation, and standard statistical analysis execute well in agent mode.

Use "agent team" for complex analytical projects — multi-technique analysis, ML model comparisons, or when you need deep domain reasoning about what the data means.
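
If you dispatch tasks from code, the guidance above reduces to a simple lookup. This is a hedged sketch: the task category names are invented for illustration, and only the two mode strings, "agent" and "agent team", come from this document.

```python
# Hypothetical task categories that warrant the heavier mode.
AGENT_TEAM_TASKS = {
    "multi_technique_analysis",
    "ml_model_comparison",
    "comprehensive_report",
}


def pick_chat_mode(task_kind: str) -> str:
    """Return "agent team" for complex analytical projects, "agent" for everything else."""
    return "agent team" if task_kind in AGENT_TEAM_TASKS else "agent"


print(pick_chat_mode("data_cleaning"))
print(pick_chat_mode("ml_model_comparison"))
```

Defaulting to "agent" mirrors the recommendation to use it for most data work.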


Example Prompts

Minimal prompt, maximum insight:

"Analyze this: <SHOW_FILE>/path/to/data.csv</SHOW_FILE>

Tell me everything interesting."

That's it. CellCog's coding agent will profile the data, run exploratory analysis, find patterns, and present findings with charts. You don't need to know what to ask — the agent figures it out.

Business analysis:

"Analyze our e-commerce data: <SHOW_FILE>/path/to/orders.csv</SHOW_FILE>

I need:

  • Revenue trends (daily, weekly, monthly)
  • Best and worst performing products
  • Customer purchase frequency distribution
  • Average order value trends
  • Seasonal patterns
  • Top 5 actionable insights for growing revenue

Interactive HTML dashboard with all charts."

Research data analysis:

"Analyze this survey data from 500 respondents: <SHOW_FILE>/path/to/survey.csv</SHOW_FILE>

Research questions:

  1. Is there a significant relationship between age group and product preference?
  2. Do satisfaction scores differ by region? (ANOVA)
  3. What factors best predict likelihood to recommend? (regression)

Include: statistical tests, p-values, effect sizes, and publication-ready charts. PDF report format."


Tips for Better Data Analysis

  1. Just upload and ask: You don't need to describe every column. CellCog reads the data and figures out what's there.

  2. State your question: "What drives churn?" is more focused than "Analyze this data." Both work, but the first gets faster results.

  3. Mention the audience: "For my CEO" means executive summary. "For the data team" means show the methodology.

  4. Specify what you'll do with it: "I need to present this to the board" vs "I need clean data for my ML pipeline" — context shapes the output.

  5. Don't over-specify methods: Let CellCog choose the right statistical approach. Say what you want to learn, not which algorithm to use.

  6. Iterate: Upload data → get initial analysis → ask follow-up questions → go deeper. CellCog maintains context across messages.


If CellCog is not installed

Run /cellcog-setup (or /cellcog:cellcog-setup depending on your tool) to install and authenticate. OpenClaw users: Run clawhub install cellcog instead. Manual setup: pip install -U cellcog and set CELLCOG_API_KEY. See the cellcog skill for SDK reference.
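
Before the first task, it can help to confirm that the API key is actually visible to your Python process. A minimal sketch, assuming only that the key lives in the `CELLCOG_API_KEY` environment variable as stated above; the helper name is hypothetical.

```python
import os


def cellcog_key_configured(env=os.environ) -> bool:
    """Return True if CELLCOG_API_KEY is set to a non-empty value."""
    return bool(env.get("CELLCOG_API_KEY", "").strip())


if not cellcog_key_configured():
    print("CELLCOG_API_KEY is not set; run the setup command or export the key first.")
```

This checks only that the variable is present; it does not validate the key against the service.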

Security Recommendations
This skill appears to be what it claims, but before installing: (1) Verify the CELLCOG_API_KEY issuer and limit its permissions; consider using an ephemeral/minimally-privileged key. (2) Do not upload sensitive PII, secrets, or proprietary data unless you have reviewed CellCog's privacy, retention, and security practices. (3) Because the skill executes Python via CellCog, treat its processing as remote execution — sanitize or anonymize data if needed. (4) Confirm the skill author/owner (registry shows an owner ID but 'Source' is unknown) and prefer official vendor-provided skills when possible. (5) If you want to limit autonomous uploads, review agent invocation policies or disable autonomous invocation for sessions that may expose sensitive files.
Functional Analysis
Type: OpenClaw Skill · Name: data-cog · Version: 1.0.11
The data-cog skill bundle is a documentation-centric integration for the CellCog data analysis platform. The SKILL.md and _meta.json files contain instructions and examples for performing data science tasks like cleaning, visualization, and machine learning using the 'cellcog' Python library. There is no evidence of malicious code, data exfiltration, or prompt injection; the requirements (CELLCOG_API_KEY and the cellcog dependency) are consistent with the tool's stated purpose of providing remote or automated data analysis services.
Capability Tags
crypto · can-make-purchases · requires-sensitive-credentials
Capability Assessment
Purpose & Capability
Name/description (data analysis/visualization via CellCog) align with required binaries (python3) and required env var (CELLCOG_API_KEY). No unrelated credentials, binaries, or config paths are requested.
Instruction Scope
SKILL.md instructs the agent to use the CellCog SDK to run Python analysis on uploaded files (<SHOW_FILE> tags). This is coherent with the stated purpose, but the skill repeatedly emphasizes 'full Python access' and that CellCog will 'run the code for you' — meaning user data and code will be executed/processed by the external CellCog service. That has privacy/operational implications (see guidance). The instructions do not ask the agent to read unrelated system files or secret env vars.
Install Mechanism
No install spec or external downloads; instruction-only skill (lowest install risk). It references a 'cellcog' dependency/SDK but does not pull arbitrary archives or nonstandard installers.
Credentials
Only CELLCOG_API_KEY is required, which is proportional for a third-party hosted API. No additional unrelated secrets or broad system credentials are requested.
Persistence & Privilege
always is false and the skill does not request persistent system changes or access to other skills' configs. Model invocation is allowed (platform default), which is expected for an integration like this.
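How to Use
  1. Make sure OpenClaw is installed (locally or via Docker).
  2. Enter the install command in the chat box: /install data-cog
  3. Once installed, invoke the Skill by name or trigger it with /data-cog.
  4. Provide the inputs described in the Skill's parameter notes to get structured output.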
Version History
v1.0.11
  • Added `requires` field to metadata, specifying dependencies on Python 3 and the CELLCOG_API_KEY environment variable for improved skill setup clarity.
  • No changes to features or code behavior; documentation updated to reflect requirements.
v1.0.10
  • Simplified and shortened the SKILL.md description for clarity.
  • Updated agent usage instructions for greater accuracy, clarifying "OpenClaw" vs. other agents.
  • No functional or code changes; documentation only.
v1.0.9
  • Updated skill description for improved clarity and added latest benchmark achievement (#1 on DeepResearch Bench, Apr 2026).
  • Enhanced usage section with explicit CellCogClient example for agent provider configuration.
  • Minor edits for conciseness and modernized language throughout documentation.
  • No code or interface changes; documentation only.
v1.0.8
  • Major rewrite of SKILL.md for clarity and brevity: description and documentation made much more concise.
  • Simplified feature lists and usage instructions, focusing on core data analysis, visualization, and supported formats.
  • Updated usage examples for compatibility with new agents and platforms.
  • Removed excessive marketing, benchmark claims, and redundant text.
  • Improved section organization for easier reading and faster onboarding.
v1.0.7
  • Expanded and restructured documentation in SKILL.md for improved clarity and detail.
  • Added more example prompts and use cases, showing a wide range of data analysis tasks.
  • Clarified what sets Data Cog apart from other AI tools: results, not just code.
  • Listed supported input and output formats more explicitly.
  • Updated tool description, highlighting performance on DeepResearch Bench (Apr 2026).
v1.0.6
  • Major documentation update: SKILL.md rewritten for brevity and clarity.
  • Simplified description and usage instructions to focus on main features.
  • Added concise summaries of internal capabilities (Python libraries, dashboards, spreadsheet, PDF output).
  • Streamlined data analysis examples and recommended chat modes.
  • Linked to related skills for broader use cases.
v1.0.5
  • Added detailed usage instructions for OpenClaw agents and other agents in the Prerequisites section.
  • Updated the sample Python code to clarify blocking vs. fire-and-forget modes.
  • Improved references to the cellcog skill for SDK details and usage guidance.
  • No changes to core features or supported data types; documentation only.
v1.0.4
  • Updated SKILL.md for a more streamlined "Quick start" example and clarified SDK instructions.
  • Added references to the cellcog skill for full SDK API details, delivery modes, and advanced file handling.
  • Removed the previous detailed Python example in favor of a more general usage pattern.
  • No changes to core functionality; documentation update only.
v1.0.3
  • Updated DeepResearch Bench rating date from Feb 2026 to Apr 2026 in the description and introduction.
  • No feature or functional changes; documentation update only.
v1.0.2
  • Updated skill description to clarify features and simplify language.
  • Added supported operating systems (darwin, linux, windows) in metadata.
  • Included a homepage link in the metadata.
  • No changes to functionality; documentation only.
v1.0.1
  • Added clear author and dependency metadata to SKILL.md.
  • Changed prerequisite wording to reference the `cellcog` skill directly.
  • No functional changes; documentation improvements only.
v1.0.0
  • Initial release of Data-Cog skill.
  • Enables analysis of messy CSVs and other data files with minimal prompts, returning structured insights (charts, dashboards, reports, and clean data).
  • Provides full Python access for tasks such as data cleaning, exploratory analysis, visualization, hypothesis testing, ML model evaluation, and dataset profiling.
  • Focuses on delivering actual answers and visual summaries instead of just sharing code.
  • Supports multiple data and output formats, including CSV, Excel, JSON, Parquet, and SQL exports.
  • Requires the CellCog mothership skill for SDK and API usage.
Metadata
Slug data-cog
Version 1.0.11
License MIT-0
Total installs 8
Current installs 8
Versions 12
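FAQ

What is Data Cog?

AI data analysis and visualization powered by CellCog. Data cleaning, exploratory analysis, hypothesis testing, statistical reports, ML model evaluation, dat... It is an AI Agent Skill plugin for Claude Code / OpenClaw, with 2219 total downloads to date.

How do I install Data Cog?

Run the command "/install data-cog" in the OpenClaw or Claude Code chat box for a one-step install; no extra configuration is needed.

Is Data Cog free?

Yes. Data Cog is completely free and is released under the MIT-0 license; you can download, install, and use it freely.

Which platforms does Data Cog support?

Data Cog is cross-platform and can be used in any environment where OpenClaw / Claude Code is deployed (darwin, linux, windows).

Who develops Data Cog?

It is developed and maintained by CellCog (@nitishgargiitd); the current version is v1.0.11.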
