Cassandra Guide
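Data Modeling Rules
- Design tables around queries, not around relationships
- Use one table per query pattern (denormalization is fine)
- The partition key determines how data is distributed across nodes
- Clustering keys determine how rows are sorted within a partition
- Avoid large partitions (more than 100 MB or 100,000 rows)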
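
The last rule can be checked with simple arithmetic before a table is created. The following is a minimal sketch, not a sizing tool: the thresholds are the rule-of-thumb limits from the list above, and `partition_too_large` and `month_bucket` are hypothetical helper names introduced here for illustration. Adding a time bucket to the partition key is one common way to cap partition growth, assuming queries can name the bucket.

```python
from datetime import datetime, timezone

# Rule-of-thumb limits from the list above (not hard limits enforced by Cassandra).
MAX_PARTITION_BYTES = 100 * 1024 * 1024  # 100 MB
MAX_PARTITION_ROWS = 100_000


def partition_too_large(row_count: int, avg_row_bytes: int) -> bool:
    """Return True if the estimated partition exceeds either limit."""
    return (
        row_count > MAX_PARTITION_ROWS
        or row_count * avg_row_bytes > MAX_PARTITION_BYTES
    )


def month_bucket(ts: datetime) -> str:
    """Hypothetical bucket value such as '2024-01' to add to the partition key.

    Expects a timezone-aware datetime; a naive one is treated as local time.
    """
    return ts.astimezone(timezone.utc).strftime("%Y-%m")
```

With a bucket like this, a partition key of `(user_id, month)` holds at most one month of rows per user, at the cost of one query per month when reading across buckets.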
CQL Examples
-- Create keyspace
CREATE KEYSPACE my_app
WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};
-- Create table (query: get user's recent posts by date)
CREATE TABLE posts_by_user (
  user_id UUID,
  created_at TIMESTAMP,
  post_id UUID,
  title TEXT,
  content TEXT,
  tags SET<TEXT>,
  metadata MAP<TEXT, TEXT>,
  PRIMARY KEY ((user_id), created_at, post_id) -- (partition, clustering...)
) WITH CLUSTERING ORDER BY (created_at DESC);
-- Insert (uuid() stands in for a real user_id here)
INSERT INTO posts_by_user (user_id, created_at, post_id, title)
VALUES (uuid(), toTimestamp(now()), uuid(), 'Hello World')
USING TTL 2592000; -- 30 days TTL
-- Query (must include full partition key)
SELECT * FROM posts_by_user
WHERE user_id = ? AND created_at > '2024-01-01'
LIMIT 20;
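
To see why the query must name the partition key, it can help to model the table layout in memory. The sketch below is a toy stand-in written for illustration, not a Cassandra client and not how Cassandra stores data on disk: `PostsByUser` is a hypothetical class that keeps one sorted list of rows per `user_id`, mirroring the `created_at DESC` clustering order of `posts_by_user`.

```python
from collections import defaultdict
from datetime import datetime


class PostsByUser:
    """Toy in-memory model of the posts_by_user table (illustration only)."""

    def __init__(self):
        # partition key (user_id) -> list of rows in that partition
        self._partitions = defaultdict(list)

    def insert(self, user_id, created_at, post_id, title):
        rows = self._partitions[user_id]
        rows.append({"created_at": created_at, "post_id": post_id, "title": title})
        # Keep clustering order: created_at DESC, then post_id ASC.
        # Two stable sorts: secondary key first, primary key second.
        rows.sort(key=lambda r: r["post_id"])
        rows.sort(key=lambda r: r["created_at"], reverse=True)

    def recent_posts(self, user_id, after, limit=20):
        # One partition lookup, then a filter over rows already sorted newest-first.
        rows = self._partitions.get(user_id, [])
        return [r for r in rows if r["created_at"] > after][:limit]
```

Calling `recent_posts("u1", datetime(2024, 1, 1))` touches only the `"u1"` list, which is the analogue of reading a single partition. Without a `user_id` there is no list to start from, which is why a query that omits the partition key would have to scan every partition.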
Cassandra vs MongoDB vs DynamoDB
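| | Cassandra | MongoDB | DynamoDB |
|---|---|---|---|
| Best for | Write-heavy time-series data | Flexible documents | Serverless on AWS |
| Query flexibility | Low (partition key required) | High | Medium |
| Write throughput | Excellent | Good | Excellent (managed) |
| ACID | Lightweight transactions (compare-and-set within one partition) | Multi-document transactions | Single-item ACID; multi-item via transaction APIs |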