← 返回 Skills 市场
ivangdavila

ClickHouse

作者 Iván · GitHub ↗ · v1.0.1
linuxdarwin ⚠ suspicious
577
总下载
1
收藏
3
当前安装
2
版本数
在 OpenClaw 中安装
/install clickhouse
功能描述
Query, optimize, and administer ClickHouse OLAP databases with schema design, performance tuning, and data ingestion patterns.
使用说明 (SKILL.md)

ClickHouse 🏠

Real-time analytics on billions of rows. Sub-second queries. No indexes needed.

Setup

On first use, read setup.md for connection configuration.

When to Use

User needs OLAP analytics, log analysis, time-series data, or real-time dashboards. Agent handles schema design, query optimization, data ingestion, and cluster administration.

Architecture

Memory lives in ~/clickhouse/. See memory-template.md for structure.

~/clickhouse/
├── memory.md        # Connection profiles + query patterns
├── schemas/         # Table definitions per database
└── queries/         # Saved analytical queries

Quick Reference

Topic File
Setup & connection setup.md
Memory template memory-template.md
Query patterns queries.md
Performance tuning performance.md
Data ingestion ingestion.md

Core Rules

1. Always Specify Engine

Every table needs an explicit engine. Default to MergeTree family:

-- Time-series / logs
CREATE TABLE events (
    timestamp DateTime,
    event_type String,
    data String
) ENGINE = MergeTree()
ORDER BY (timestamp, event_type);

-- Aggregated metrics
CREATE TABLE daily_stats (
    date Date,
    metric String,
    value AggregateFunction(sum, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY (date, metric);

2. ORDER BY is Your Index

ClickHouse has no traditional indexes. The ORDER BY clause determines data layout:

  • Put high-cardinality filter columns first
  • Put range columns (dates, timestamps) early
  • Match your most common WHERE patterns
-- Good: filters by user_id, then date range
ORDER BY (user_id, date, event_type)

-- Bad: date first when you filter by user_id
ORDER BY (date, user_id, event_type)

3. Use Appropriate Data Types

Use Case Type Why
Timestamps DateTime or DateTime64 Native time functions
Low-cardinality strings LowCardinality(String) 10x compression
Enums with few values Enum8 or Enum16 Smallest footprint
Nullable only if needed Nullable(T) Adds overhead
IPs IPv4 or IPv6 4 bytes vs 16+

4. Batch Inserts

Never insert row-by-row. ClickHouse is optimized for batch writes:

# Good: batch insert
clickhouse-client --query="INSERT INTO events FORMAT JSONEachRow" \x3C batch.json

# Bad: individual inserts in a loop
for row in data:
    INSERT INTO events VALUES (...)

Minimum batch: 1,000 rows. Optimal: 10,000-100,000 rows.

5. Prewarm Queries with FINAL

Queries on ReplacingMergeTree/CollapsingMergeTree need FINAL for accuracy:

-- May return duplicates/old versions
SELECT * FROM users WHERE id = 123;

-- Guaranteed latest version
SELECT * FROM users FINAL WHERE id = 123;

FINAL has performance cost. For dashboards, consider materialized views.

6. Materialized Views for Speed

Pre-aggregate expensive computations:

CREATE MATERIALIZED VIEW hourly_events
ENGINE = SummingMergeTree()
ORDER BY (hour, event_type)
AS SELECT
    toStartOfHour(timestamp) AS hour,
    event_type,
    count() AS events
FROM events
GROUP BY hour, event_type;

7. Check System Tables First

Before debugging, check system tables:

-- Running queries
SELECT * FROM system.processes;

-- Recent query performance
SELECT query, elapsed, read_rows, memory_usage
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 10;

-- Table sizes
SELECT database, table, formatReadableSize(total_bytes) as size
FROM system.tables
ORDER BY total_bytes DESC;

Common Traps

  • String instead of LowCardinality → 10x larger storage for status/type columns
  • Wrong ORDER BY → Full table scans instead of index lookups
  • Row-by-row inserts → Massive part fragmentation, slow writes
  • Missing TTL → Unbounded table growth, disk full
  • **SELECT *** → Reads all columns, kills columnar advantage
  • Nullable everywhere → Overhead + NULL handling complexity
  • Forgetting FINAL → Stale/duplicate data in merge tables

Performance Checklist

Before running expensive queries:

  1. Check EXPLAIN: EXPLAIN SELECT ... shows execution plan
  2. Sample first: SELECT ... FROM table SAMPLE 0.01 for 1% sample
  3. Limit columns: Only SELECT what you need
  4. Use PREWHERE: Filters before reading all columns
  5. Check parts: SELECT count() FROM system.parts WHERE table='X'
-- PREWHERE optimization
SELECT user_id, event_type, data
FROM events
PREWHERE date = today()
WHERE event_type = 'click';

Cluster Administration

Adding TTL for Data Retention

-- Delete old data
ALTER TABLE events
MODIFY TTL timestamp + INTERVAL 90 DAY;

-- Move to cold storage
ALTER TABLE events
MODIFY TTL timestamp + INTERVAL 30 DAY TO VOLUME 'cold';

Monitoring Disk Usage

SELECT
    database,
    table,
    formatReadableSize(sum(bytes_on_disk)) as disk_size,
    sum(rows) as total_rows,
    count() as parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC;

External Endpoints

Endpoint Data Sent Purpose
localhost:8123 SQL queries HTTP interface
localhost:9000 SQL queries Native TCP interface

No external services contacted. All queries run against user-specified ClickHouse instances.

Security & Privacy

Data saved locally (with user consent):

  • Connection profiles (host, port, database) in ~/clickhouse/memory.md
  • Query patterns and schema documentation
  • Authentication method preferences (password vs certificate)

Important: If you provide database passwords, they are stored in plain text in ~/clickhouse/. Consider using environment variables or connection profiles managed by clickhouse-client instead.

This skill does NOT:

  • Connect to any ClickHouse without explicit user configuration
  • Send data to external services
  • Automatically collect or store credentials without asking

Related Skills

Install with clawhub install \x3Cslug> if user confirms:

  • sql — SQL query patterns
  • analytics — data analysis workflows
  • data-analysis — structured data exploration

Feedback

  • If useful: clawhub star clickhouse
  • Stay updated: clawhub sync
安全使用建议
This skill appears to be what it claims: a ClickHouse helper that expects clickhouse-client and will create/read files under ~/clickhouse/ (memory.md, schemas/, queries/). Before installing or invoking it: (1) Confirm you have or want clickhouse-client installed (Homebrew formula is offered). (2) Review any files created in ~/clickhouse/ and do not store plaintext passwords there — follow the skill's own advice to use env vars or clickhouse-client profiles. (3) Be cautious when following examples that reference S3, Kafka, or URL ingestion: those examples may require cloud credentials or will connect to the remote endpoints you specify. (4) If you want to restrict filesystem writes, plan to inspect or sandbox the agent's file operations. Otherwise the skill's required access and instructions are proportionate to its purpose.
功能分析
Type: OpenClaw Skill Name: clickhouse Version: 1.0.1 The skill is classified as suspicious due to a critical security vulnerability related to plain-text password storage. The `SKILL.md` explicitly warns that "If you provide database passwords, they are stored in plain text in ~/clickhouse/". While `memory-template.md` advises against storing passwords directly, the skill's design allows for this insecure practice, making it a significant risk for credential exposure. There is no evidence of intentional malicious behavior such as data exfiltration to external endpoints or unauthorized remote control; the prompt injection attempts are for guiding agent behavior rather than subversion.
能力评估
Purpose & Capability
Name/description match the content: the skill is an instruction-only ClickHouse assistant. It only requires clickhouse-client and offers guidance on schema, ingestion, queries, and admin tasks — all consistent with an OLAP DB helper.
Instruction Scope
Instructions tell the agent to read setup.md on first use and to create/read files under ~/clickhouse/ (memory.md, schemas/, queries/). Saving connection profiles and table definitions is explicit. This is coherent for the skill, but it means the agent will read/write files in the user's home directory (avoid storing passwords in memory.md as the skill itself also recommends).
Install Mechanism
Install spec is a Homebrew formula (clickhouse) producing clickhouse-client. Homebrew is a standard, low-risk package source for macOS/Linuxbrew and matches the required binary; no arbitrary URLs or extraction steps are present.
Credentials
The skill declares no required environment variables or credentials (primaryEnv none). Documentation includes examples for S3, Kafka, and HTTP ingestion that mention credentials (AWS_KEY/AWS_SECRET, kafka broker strings) but those are examples — the skill does not demand unrelated secrets. Users should avoid placing secrets in ~/clickhouse/memory.md and instead use env vars or client profiles as suggested.
Persistence & Privilege
always:false (no forced inclusion). The skill persists only under ~/clickhouse/ per its docs (connection profiles, schemas, saved queries). This is appropriate for a DB helper. The skill does not request modification of other skills or system-wide settings.
如何使用
  1. 确保已安装 OpenClaw(本地或 Docker 部署)
  2. 在对话框中输入安装命令:/install clickhouse
  3. 安装完成后,直接呼叫该 Skill 的名称或使用 /clickhouse 触发
  4. 根据 Skill 的参数说明提供必要输入,即可获得结构化输出
版本历史
v1.0.1
Initial release
v1.0.0
Initial release
元数据
Slug clickhouse
版本 1.0.1
许可证
累计安装 4
当前安装数 3
历史版本数 2
常见问题

ClickHouse 是什么?

Query, optimize, and administer ClickHouse OLAP databases with schema design, performance tuning, and data ingestion patterns. 它是一个面向 Claude Code / OpenClaw 的 AI Agent Skill 插件,目前累计下载 577 次。

如何安装 ClickHouse?

在 OpenClaw 或 Claude Code 对话框中运行命令「/install clickhouse」即可一键安装,无需额外配置。

ClickHouse 是免费的吗?

是的,ClickHouse 完全免费(开源免费),可自由下载、安装和使用。

ClickHouse 支持哪些平台?

ClickHouse 跨平台运行,可在任意部署了 OpenClaw / Claude Code 的环境中使用(linux, darwin)。

谁开发了 ClickHouse?

由 Iván(@ivangdavila)开发并维护,当前版本 v1.0.1。

💬 留言讨论