/install clickhouse
ClickHouse 🏠
Real-time analytics on billions of rows. Sub-second queries. No indexes needed.
Setup
On first use, read setup.md for connection configuration.
When to Use
User needs OLAP analytics, log analysis, time-series data, or real-time dashboards. Agent handles schema design, query optimization, data ingestion, and cluster administration.
Architecture
Memory lives in ~/clickhouse/. See memory-template.md for structure.
~/clickhouse/
├── memory.md # Connection profiles + query patterns
├── schemas/ # Table definitions per database
└── queries/ # Saved analytical queries
Quick Reference
| Topic | File |
|---|---|
| Setup & connection | setup.md |
| Memory template | memory-template.md |
| Query patterns | queries.md |
| Performance tuning | performance.md |
| Data ingestion | ingestion.md |
Core Rules
1. Always Specify Engine
Every table needs an explicit engine. Default to MergeTree family:
-- Time-series / logs
CREATE TABLE events (
timestamp DateTime,
event_type String,
data String
) ENGINE = MergeTree()
ORDER BY (timestamp, event_type);
-- Aggregated metrics
CREATE TABLE daily_stats (
date Date,
metric String,
value AggregateFunction(sum, UInt64)
) ENGINE = AggregatingMergeTree()
ORDER BY (date, metric);
2. ORDER BY is Your Index
ClickHouse has no traditional indexes. The ORDER BY clause determines data layout:
- Put high-cardinality filter columns first
- Put range columns (dates, timestamps) early
- Match your most common WHERE patterns
-- Good: filters by user_id, then date range
ORDER BY (user_id, date, event_type)
-- Bad: date first when you filter by user_id
ORDER BY (date, user_id, event_type)
3. Use Appropriate Data Types
| Use Case | Type | Why |
|---|---|---|
| Timestamps | DateTime or DateTime64 |
Native time functions |
| Low-cardinality strings | LowCardinality(String) |
10x compression |
| Enums with few values | Enum8 or Enum16 |
Smallest footprint |
| Nullable only if needed | Nullable(T) |
Adds overhead |
| IPs | IPv4 or IPv6 |
4 bytes vs 16+ |
4. Batch Inserts
Never insert row-by-row. ClickHouse is optimized for batch writes:
# Good: batch insert
clickhouse-client --query="INSERT INTO events FORMAT JSONEachRow" \x3C batch.json
# Bad: individual inserts in a loop
for row in data:
INSERT INTO events VALUES (...)
Minimum batch: 1,000 rows. Optimal: 10,000-100,000 rows.
5. Prewarm Queries with FINAL
Queries on ReplacingMergeTree/CollapsingMergeTree need FINAL for accuracy:
-- May return duplicates/old versions
SELECT * FROM users WHERE id = 123;
-- Guaranteed latest version
SELECT * FROM users FINAL WHERE id = 123;
FINAL has performance cost. For dashboards, consider materialized views.
6. Materialized Views for Speed
Pre-aggregate expensive computations:
CREATE MATERIALIZED VIEW hourly_events
ENGINE = SummingMergeTree()
ORDER BY (hour, event_type)
AS SELECT
toStartOfHour(timestamp) AS hour,
event_type,
count() AS events
FROM events
GROUP BY hour, event_type;
7. Check System Tables First
Before debugging, check system tables:
-- Running queries
SELECT * FROM system.processes;
-- Recent query performance
SELECT query, elapsed, read_rows, memory_usage
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY event_time DESC
LIMIT 10;
-- Table sizes
SELECT database, table, formatReadableSize(total_bytes) as size
FROM system.tables
ORDER BY total_bytes DESC;
Common Traps
- String instead of LowCardinality → 10x larger storage for status/type columns
- Wrong ORDER BY → Full table scans instead of index lookups
- Row-by-row inserts → Massive part fragmentation, slow writes
- Missing TTL → Unbounded table growth, disk full
- **SELECT *** → Reads all columns, kills columnar advantage
- Nullable everywhere → Overhead + NULL handling complexity
- Forgetting FINAL → Stale/duplicate data in merge tables
Performance Checklist
Before running expensive queries:
- Check EXPLAIN:
EXPLAIN SELECT ...shows execution plan - Sample first:
SELECT ... FROM table SAMPLE 0.01for 1% sample - Limit columns: Only SELECT what you need
- Use PREWHERE: Filters before reading all columns
- Check parts:
SELECT count() FROM system.parts WHERE table='X'
-- PREWHERE optimization
SELECT user_id, event_type, data
FROM events
PREWHERE date = today()
WHERE event_type = 'click';
Cluster Administration
Adding TTL for Data Retention
-- Delete old data
ALTER TABLE events
MODIFY TTL timestamp + INTERVAL 90 DAY;
-- Move to cold storage
ALTER TABLE events
MODIFY TTL timestamp + INTERVAL 30 DAY TO VOLUME 'cold';
Monitoring Disk Usage
SELECT
database,
table,
formatReadableSize(sum(bytes_on_disk)) as disk_size,
sum(rows) as total_rows,
count() as parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC;
External Endpoints
| Endpoint | Data Sent | Purpose |
|---|---|---|
| localhost:8123 | SQL queries | HTTP interface |
| localhost:9000 | SQL queries | Native TCP interface |
No external services contacted. All queries run against user-specified ClickHouse instances.
Security & Privacy
Data saved locally (with user consent):
- Connection profiles (host, port, database) in ~/clickhouse/memory.md
- Query patterns and schema documentation
- Authentication method preferences (password vs certificate)
Important: If you provide database passwords, they are stored in plain text in ~/clickhouse/. Consider using environment variables or connection profiles managed by clickhouse-client instead.
This skill does NOT:
- Connect to any ClickHouse without explicit user configuration
- Send data to external services
- Automatically collect or store credentials without asking
Related Skills
Install with clawhub install \x3Cslug> if user confirms:
sql— SQL query patternsanalytics— data analysis workflowsdata-analysis— structured data exploration
Feedback
- If useful:
clawhub star clickhouse - Stay updated:
clawhub sync
- Make sure OpenClaw is installed (local or Docker)
- Run the install command in chat:
/install clickhouse - After installation, invoke the skill by name or use
/clickhouse - Provide required inputs per the skill's parameter spec and get structured output
What is ClickHouse?
Query, optimize, and administer ClickHouse OLAP databases with schema design, performance tuning, and data ingestion patterns. It is an AI Agent Skill for Claude Code / OpenClaw, with 577 downloads so far.
How do I install ClickHouse?
Run "/install clickhouse" in the OpenClaw or Claude Code chat to install it in one step — no extra setup required.
Is ClickHouse free?
Yes, ClickHouse is completely free (open-source). You can download, install and use it at no cost.
Which platforms does ClickHouse support?
ClickHouse is cross-platform and runs anywhere OpenClaw / Claude Code is available (linux, darwin).
Who created ClickHouse?
It is built and maintained by Iván (@ivangdavila); the current version is v1.0.1.