Database Indexing Guide

Index Types

TypeBest ForSupportsNotes
B-tree (default)Equality, range, sorting=, <, >, BETWEEN, LIKE 'foo%', ORDER BYWorks for 90%+ use cases
HashEquality only= onlyFaster for exact match; no range support
GINMulti-valued (arrays, JSONB, tsvector)@>, ?, @@ full-textLarger, slower to build, fast to query
GiSTGeometric, range, proximityPostGIS, tsrange, nearest neighborLossy — recheck needed
BRINVery large sequential tablesRange queries on naturally ordered dataTiny index; good for time-series data
PartialSubset of rows (WHERE clause)Any type, smaller and fasterOnly indexes matching rows
ExpressionFunction result, LOWER(email)Query must use same expressionMust match query expression exactly
CoveringInclude extra columns (INCLUDE)Index-only scansAvoids heap fetch entirely

Composite Index Column Order

-- Rule: Equality columns first, then range, then ORDER BY -- Scenario: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY user_id -- Optimal index order: CREATE INDEX idx_status_created_user ON orders(status, created_at, user_id); -- Leftmost prefix rule examples: -- Can use index: -- WHERE status = 'active' ✓ (prefix) -- WHERE status = 'active' AND created_at > X ✓ (prefix + range) -- Cannot use index: -- WHERE created_at > '2024-01-01' ✗ (skips first column) -- WHERE user_id = 42 ✗ (skips first two) -- Covering index (INCLUDE for non-filter columns) CREATE INDEX idx_orders_cover ON orders(user_id, status) INCLUDE (amount, created_at); -- Query: SELECT amount, created_at FROM orders WHERE user_id=42 AND status='paid' -- → Index-only scan, zero heap access -- Partial index (index only active records) CREATE INDEX idx_active_orders ON orders(user_id, created_at) WHERE status = 'active'; -- Smaller index, faster queries for active orders -- Expression index CREATE INDEX idx_email_lower ON users(LOWER(email)); -- Use with: WHERE LOWER(email) = '[email protected]'

Index Selectivity & Maintenance

-- Check index selectivity (PostgreSQL) SELECT attname, n_distinct, correlation -- 1.0 = perfectly ordered (BRIN-friendly) FROM pg_stats WHERE tablename = 'orders'; -- Find unused indexes SELECT schemaname, tablename, indexname, idx_scan FROM pg_stat_user_indexes WHERE idx_scan = 0 AND indexname NOT LIKE '%pkey'; -- Check index bloat (rebuild if bloat > 30%) SELECT indexname, pg_size_pretty(pg_relation_size(indexrelid)) AS size FROM pg_stat_user_indexes WHERE tablename = 'orders'; -- Rebuild index (PostgreSQL) REINDEX INDEX CONCURRENTLY idx_orders_user_id; -- MySQL: check index usage SELECT * FROM sys.schema_unused_indexes; SELECT * FROM sys.schema_redundant_indexes; -- When NOT to index: -- • Small tables (full scan is faster) -- • Low-cardinality columns (boolean, gender) -- • Columns rarely used in WHERE/JOIN/ORDER BY -- • Write-heavy tables (each index slows INSERT/UPDATE/DELETE)