MySQL EXPLAIN Analyzer
id – Query Identifier
The id column is the sequential number of the SELECT within the query. When a query contains subqueries, UNIONs, or derived tables, multiple rows appear, each corresponding to a SELECT operation. Understanding id numbering is the first step to reading complex EXPLAIN output.
Numbering Rules
- Same id: tables are JOINed in the same SELECT. MySQL processes same-id rows top-to-bottom – the upper table is the driving table, the lower is the driven table.
- Different id: higher id = higher priority = executed first. A subquery has a larger id than its outer query.
- id is NULL: this row represents the temporary table created for a UNION RESULT.
Examples
-- Simple JOIN: both rows have id = 1
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- id=1, table=o (driving table)
-- id=1, table=c (driven table)
-- Subquery: id=2 executes first
EXPLAIN SELECT * FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE status='active');
-- id=1, table=orders
-- id=2, table=customers
-- UNION: UNION RESULT has id=NULL
EXPLAIN SELECT id FROM t1 UNION SELECT id FROM t2;
-- id=1, table=t1
-- id=2, table=t2
-- id=NULL, table=<union1,2>
Note: The optimizer may rewrite subqueries as JOINs, merging originally distinct ids. Run SHOW WARNINGS immediately after EXPLAIN to see the rewritten SQL and confirm.
select_type – SELECT Type
select_type identifies the role of the current row's SELECT within the overall query. It is the key to understanding query structure – nesting, dependencies, and materialization.
| select_type | Meaning | When It Appears |
|---|---|---|
| SIMPLE | Simple query without subqueries or UNION | SELECT * FROM t |
| PRIMARY | Outermost SELECT | Outer query when subqueries/UNION present |
| SUBQUERY | Non-correlated subquery in SELECT list or WHERE | SELECT (SELECT MAX(id) FROM t2) FROM t1 |
| DEPENDENT SUBQUERY | Correlated subquery, re-executed for every outer row | WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.fk = t1.id) |
| DERIVED | Subquery in FROM clause (derived table), materialized into a temp table | SELECT * FROM (SELECT ...) AS dt |
| MATERIALIZED | Subquery materialized into a temp table (MySQL 5.6+, optimizes IN subqueries) | WHERE id IN (SELECT fk FROM t2) |
| UNION | Second or later SELECT in a UNION | SELECT ... UNION SELECT ... |
| UNION RESULT | Temp table holding the deduplicated UNION result | UNION (not UNION ALL) |
| DEPENDENT UNION | UNION member dependent on the outer query | UNION with outer table reference |
| UNCACHEABLE SUBQUERY | Subquery whose result cannot be cached and must be re-evaluated | Uses user variables, non-deterministic functions, etc. |
| UNCACHEABLE UNION | UNION member that is uncacheable | Same as above, within a UNION |
Performance Warnings
The most dangerous value is DEPENDENT SUBQUERY: the subquery executes once per outer row. With 1 million outer rows, the subquery runs 1 million times. Common fixes: rewrite it as a JOIN, or rely on the optimizer's semi-join conversion (automatic for IN subqueries in MySQL 5.6+, and for EXISTS in 8.0.16+).
-- Dangerous: DEPENDENT SUBQUERY
SELECT * FROM orders
WHERE customer_id IN (
SELECT id FROM customers WHERE region = orders.region -- references outer orders.region
);
-- Fix: rewrite as JOIN
SELECT DISTINCT o.* FROM orders o
JOIN customers c ON o.customer_id = c.id AND c.region = o.region;
table – Table Name
Identifies the table accessed by the current row. Besides actual table names, it may show these special values:
| Value | Meaning |
|---|---|
| <derivedN> | Derived table from the subquery with id=N |
| <subqueryN> | Materialized subquery with id=N (MySQL 5.6+ optimization) |
| <unionM,N,...> | UNION result temp table combining ids M, N, ... |
| NULL | Optimizer resolved the query at optimization time without table access |
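A sketch of how <derivedN> appears in practice (table and column names are hypothetical; MySQL 5.7+ may merge a derived table into the outer query, but the GROUP BY here forces materialization):

```sql
-- The FROM-clause subquery gets id=2; the outer query reads it as <derived2>
EXPLAIN SELECT dt.customer_id, dt.cnt
FROM (
  SELECT customer_id, COUNT(*) AS cnt
  FROM orders
  GROUP BY customer_id
) AS dt
WHERE dt.cnt > 10;
-- id=1, table=<derived2>              (scans the materialized temp table)
-- id=2, select_type=DERIVED, table=orders
```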
partitions – Partition Pruning
If the table is partitioned, this column shows which partitions the query actually accesses. NULL means the table is not partitioned. Partition pruning is the core benefit of partitioning – the query scans only the necessary partitions, not all of them.
-- Orders table partitioned by year
CREATE TABLE orders (
id BIGINT, order_date DATE, ...
) PARTITION BY RANGE (YEAR(order_date)) (
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
EXPLAIN SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- partitions: p2024 (only one partition scanned)
EXPLAIN SELECT * FROM orders WHERE amount > 100;
-- partitions: p2023,p2024,p2025,pmax (all partitions – WHERE not on partition key)
Diagnostic tip: if you see all partitions listed, pruning is not working. Check whether the WHERE clause hits the partition key and whether the partition function expression is compatible with the condition.
type – Access Type (THE Most Important Column)
The type column describes how MySQL finds rows in the table. It directly determines query performance. The values below are listed from best to worst:
system
The table has at most one row (system table). A special case of const, rarely seen in production queries.
const
At most one matching row found via PRIMARY KEY or UNIQUE index equality lookup. MySQL treats the result as a constant during optimization.
EXPLAIN SELECT * FROM users WHERE id = 42;
-- type: const, key: PRIMARY
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- type: const (requires UNIQUE index on email)
eq_ref
For each row from the driving table, exactly one row is read from this table using a PRIMARY KEY or UNIQUE NOT NULL index. This is the best JOIN access type.
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- orders: type=ALL (driving table full scan)
-- customers: type=eq_ref, key=PRIMARY (perfect: one row per lookup)
ref
Non-unique index lookup or leftmost prefix of a unique index. May return multiple rows. Common for ordinary index equality conditions and non-unique JOINs.
EXPLAIN SELECT * FROM orders WHERE customer_id = 100;
-- type: ref, key: idx_customer_id (customer_id has a regular index)
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ref, key: idx_status
fulltext
Uses a FULLTEXT index. Appears only with MATCH ... AGAINST queries.
EXPLAIN SELECT * FROM articles WHERE MATCH(title, body) AGAINST ('mysql performance');
-- type: fulltext, key: ft_idx
ref_or_null
Like ref, but also searches for rows with NULL values. Common with WHERE col = val OR col IS NULL.
EXPLAIN SELECT * FROM t WHERE key_col = 10 OR key_col IS NULL;
-- type: ref_or_null
index_merge
The optimizer uses multiple indexes and merges results (intersection or union). The Extra column shows the merge strategy: Using intersect, Using union, Using sort_union.
EXPLAIN SELECT * FROM t WHERE col_a = 1 OR col_b = 2;
-- type: index_merge, key: idx_a,idx_b, Extra: Using union(idx_a,idx_b); Using where
Optimization tip: while index_merge is better than a full scan, a composite index is usually faster. If you see this type frequently, consider a composite index covering multiple conditions.
unique_subquery
Uses a PRIMARY KEY or UNIQUE index in an IN subquery – the subquery version of eq_ref. Usually replaced by semi-join optimization in MySQL 5.6+.
index_subquery
Like unique_subquery, but uses a non-unique index.
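For illustration, the classic shapes that produce these two types (hypothetical tables; on MySQL 5.6+ the optimizer will usually choose a semi-join strategy instead, so you may see MATERIALIZED or a plain JOIN plan):

```sql
-- unique_subquery: IN subquery probed via t2's PRIMARY KEY
EXPLAIN SELECT * FROM t1
WHERE t1.ref_id IN (SELECT id FROM t2 WHERE t2.flag = 1);
-- older plans: id=2, type=unique_subquery, key=PRIMARY

-- index_subquery: same shape, probed via a non-unique index idx_code
EXPLAIN SELECT * FROM t1
WHERE t1.code IN (SELECT code FROM t2 WHERE t2.flag = 1);
-- older plans: id=2, type=index_subquery, key=idx_code
```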
range
Index range scan. Appears with =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, LIKE (prefix), IN(). Scans one or more index ranges, not the entire index.
EXPLAIN SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31';
-- type: range, key: idx_created_at
EXPLAIN SELECT * FROM users WHERE age IN (25, 30, 35);
-- type: range, key: idx_age
Performance: range performance depends on selectivity. Scanning 1% of the index is fast; scanning 80% may be slower than a full index scan.
index
Full index scan. Like ALL, but scans the index tree instead of data rows. Two cases:
1) The index covers all needed columns (Extra shows Using index) – reads only index pages, no table access;
2) Reads all rows in index order (avoids sorting).
-- Covering index: reads only the index
EXPLAIN SELECT customer_id FROM orders;
-- type: index, key: idx_customer_id, Extra: Using index
-- Full index scan: reads all rows in PK order
EXPLAIN SELECT * FROM orders ORDER BY id;
-- type: index, key: PRIMARY
Warning: type=index is still slow on large tables. While better than ALL (indexes are smaller than data), scanning the entire index of a 10M-row table still requires significant I/O.
ALL
Full table scan. Reads every row in the table. Unless the table is small (a few hundred rows) or you genuinely need most rows (analytics), type=ALL is almost always a sign that optimization is needed.
EXPLAIN SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- type: ALL (function on column prevents index use)
-- Fix: rewrite condition as range
EXPLAIN SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- type: range (index usable)
type=NULL (Best Extreme Case)
MySQL resolved the query at optimization time without even accessing the table or index. Examples: empty table, MIN()/MAX() on indexed column, WHERE 1=0.
EXPLAIN SELECT MIN(id) FROM users;
-- type: NULL, Extra: Select tables optimized away
possible_keys – Candidate Indexes
possible_keys lists all indexes the optimizer considers potentially usable for this query. It checks which index definitions match the columns and conditions in the query.
- NULL: no candidate indexes. Usually means the columns in WHERE are not indexed, or functions/expressions prevent index usage.
- Multiple indexes: the optimizer listed several candidates; the actual choice (or lack thereof) appears in the key column.
Note: possible_keys not NULL while key is NULL is a common problem signal – the optimizer decided a full scan is cheaper than any candidate index. Reasons: (1) small table; (2) low index selectivity; (3) too many columns require table lookups, causing excessive random I/O.
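A sketch of the symptom (column, index, and data distribution are assumed for illustration):

```sql
-- status is indexed, but suppose 90% of rows are 'active':
-- index lookup + per-row table lookups would cost more random I/O
-- than one sequential full scan, so the optimizer skips the index
EXPLAIN SELECT * FROM users WHERE status = 'active';
-- possible_keys: idx_status, key: NULL, type: ALL
-- If key is still NULL after ANALYZE TABLE users, the cost estimate
-- genuinely favors the full scan for this value's selectivity
```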
key – Actually Used Index
The key column shows the index the optimizer actually chose. This is one of the most direct "answers" in EXPLAIN: was your index actually used?
- NULL: no index used (full table scan).
- PRIMARY: primary key index used.
- Index name: a specific secondary index was used.
Sometimes the optimizer's choice is suboptimal. You can steer it with FORCE INDEX(idx_name) or USE INDEX(idx_name) hints – but first run ANALYZE TABLE to rule out stale statistics, the most common cause of bad index choices:
-- Force a specific index
EXPLAIN SELECT * FROM orders FORCE INDEX(idx_created_at)
WHERE created_at > '2024-01-01' AND status = 'pending';
-- Ignore a specific index
EXPLAIN SELECT * FROM orders IGNORE INDEX(idx_status)
WHERE status = 'pending';
key_len – Index Bytes Used
key_len shows the number of bytes MySQL actually uses from the index. For composite indexes, key_len tells you how many columns of the index are being used. This is key to diagnosing composite index utilization.
Calculation Rules
| Data Type | Bytes | Notes |
|---|---|---|
| TINYINT | 1 | |
| SMALLINT | 2 | |
| MEDIUMINT | 3 | |
| INT | 4 | |
| BIGINT | 8 | |
| FLOAT | 4 | |
| DOUBLE | 8 | |
| DATE | 3 | |
| DATETIME | 5 | MySQL 5.6.4+ |
| TIMESTAMP | 4 | |
| CHAR(N) | N × charset_bytes | utf8mb4: N × 4 |
| VARCHAR(N) | N × charset_bytes + 2 | +2 for length prefix |
If the column is NULLable, add 1 byte for the NULL flag.
Practical Decoding
-- Composite index: INDEX idx_abc (a INT NOT NULL, b VARCHAR(50) NOT NULL, c DATE NOT NULL)
-- Full usage: key_len = 4 + (50*4+2) + 3 = 209
-- Only a: key_len = 4
-- a + b: key_len = 4 + 202 = 206
-- a + b + c: key_len = 209
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 'hello';
-- key_len: 206 โ used columns a + b
Leftmost prefix rule: a composite index (a, b, c) can only be used when the query includes a. WHERE b=1 cannot use this index. WHERE a=1 AND c=1 uses only column a (key_len is only 4).
ref – Values Compared to the Index
ref shows which columns or constants are compared to the index indicated by the key column. It helps you understand the details of join conditions.
| Value | Meaning |
|---|---|
| const | Compared against a constant (e.g., WHERE col = 42) |
| db.table.col | Compared against a column from another table (JOIN) |
| func | Comparison value is the result of a function |
| NULL | No index reference used (common when type is range or ALL) |
-- ref = const
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- ref: const
-- ref = db.table.column
EXPLAIN SELECT * FROM orders o JOIN users u ON o.user_id = u.id;
-- orders row: ref: NULL (driving table)
-- users row: ref: mydb.o.user_id
rows – Estimated Rows to Examine
rows is the optimizer's estimate of how many rows must be examined to find the target rows, based on index statistics. This is an estimate and can be significantly off – especially when statistics are stale.
- For JOINs, the total cost is roughly the product of all row estimates. If three tables estimate 100, 200, 50 rows, total rows examined is approximately 100 × 200 × 50 = 1,000,000.
- When statistics are inaccurate, run ANALYZE TABLE tablename to refresh them.
-- Symptom of stale statistics:
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
-- rows: 500000 (estimated)
-- actually only 128 rows – run ANALYZE TABLE orders, then re-EXPLAIN
filtered – Row Filter Percentage
filtered indicates the estimated percentage of rows that remain after the table's WHERE condition is applied. Effective rows = rows × filtered / 100.
- filtered = 100.00: all examined rows satisfy the condition (no extra filtering), or MySQL lacks the information to estimate.
- filtered = 10.00: only 10% of rows satisfy the condition. If rows=10000, about 1000 rows pass to the next step.
A low filtered value (e.g., 1%) with a high rows value is an optimization opportunity: many rows are read then discarded. Consider adding an index to reduce rows examined.
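As a sketch of that opportunity (numbers and index names are illustrative):

```sql
-- idx_created covers created_at but not status,
-- so status is filtered only after rows are fetched
EXPLAIN SELECT * FROM orders
WHERE created_at > '2024-01-01' AND status = 'cancelled';
-- type: range, key: idx_created, rows: 40000, filtered: 1.00
-- ~40000 rows read, only ~400 survive the status filter
-- Candidate fix: a composite index (status, created_at)
```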
Extra – Additional Information (30+ Values Explained)
The Extra column contains additional information about how MySQL executes the query. It is the richest source of performance diagnostics. Grouped by performance impact below.
Positive Signals (Green)
Using index
All columns needed by the query can be retrieved from the index alone, without reading the data row. This is a "covering index" – one of the best performance scenarios.
-- Covering index example: INDEX idx_cover (status, created_at)
EXPLAIN SELECT status, created_at FROM orders WHERE status = 'pending';
-- Extra: Using index (no table lookup needed)
Using index condition
Index Condition Pushdown (ICP), introduced in MySQL 5.6. Conditions that would have been filtered at the server layer after table lookup are now pushed down to the storage engine and filtered within the index, reducing table lookups.
-- INDEX idx_name_age (name, age)
EXPLAIN SELECT * FROM users WHERE name LIKE 'John%' AND age > 30;
-- Extra: Using index condition
-- ICP filters age > 30 at index level, avoiding table lookup for non-matching rows
Select tables optimized away
The query was resolved during optimization, no table access needed at execution time. Common for MIN()/MAX() on columns with B-tree indexes.
Using index for group-by
MySQL uses the index to process GROUP BY without creating a temp table or sorting. Occurs when GROUP BY columns are an index prefix.
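A minimal sketch of this "loose index scan" (index and table names assumed):

```sql
-- INDEX idx_cat_price (category, price)
-- GROUP BY on the index prefix plus MIN() on the next key part
-- lets MySQL jump from group to group inside the index
EXPLAIN SELECT category, MIN(price) FROM products GROUP BY category;
-- Extra: Using index for group-by
```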
Using index for skip scan
Skip Scan optimization introduced in MySQL 8.0.13. When composite index (a, b) prefix column a is not in WHERE but b is, the optimizer can skip the prefix and scan by b.
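A sketch mirroring the shape described in the MySQL manual (names assumed; the optimization pays off only when the skipped prefix column has few distinct values):

```sql
-- INDEX idx_f1_f2 (f1, f2); f1 is low-cardinality
-- 8.0.13+ can run one range scan on f2 per distinct f1 value
EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
-- Extra: Using index for skip scan
```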
Neutral / Informational
Using where
The server layer applies additional filtering to rows returned by the storage engine. Almost every query with a WHERE clause shows this. Not a problem by itself, but combined with type=ALL and high rows, it means many rows are read then discarded.
Using MRR
Multi-Range Read optimization. MySQL converts random disk access to sequential by sorting row IDs before table lookup, reducing I/O.
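A hedged sketch – whether MRR actually appears depends on the mrr/mrr_cost_based flags in optimizer_switch and on the cost model:

```sql
-- Secondary-index range scan followed by table lookups:
-- MRR sorts the collected row IDs before fetching the rows
EXPLAIN SELECT * FROM orders WHERE customer_id BETWEEN 100 AND 500;
-- type: range, key: idx_customer_id, Extra: Using MRR
-- To observe it reliably while testing:
-- SET optimizer_switch = 'mrr=on,mrr_cost_based=off';
```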
Using join buffer (Block Nested Loop / hash join)
The driven table has no usable index. MySQL uses a Join Buffer to reduce scans of the driven table. MySQL 8.0.18+ introduced Hash Join on top of this.
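A sketch of the join shape that triggers it (columns are hypothetical and deliberately unindexed):

```sql
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.ref_code = t2.ref_code;
-- MySQL 8.0.18+:   Extra: Using join buffer (hash join)
-- earlier versions: Extra: Using join buffer (Block Nested Loop)
```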
Impossible WHERE
The WHERE condition is logically impossible (e.g., WHERE 1 = 0). MySQL returns empty result immediately.
Impossible WHERE noticed after reading const tables
After reading const tables, MySQL determined the WHERE is impossible.
Impossible HAVING
The HAVING condition is logically impossible.
No matching min/max row
No rows satisfy the condition for MIN()/MAX().
Using union / Using intersect / Using sort_union
Merge strategies for index_merge. union for OR, intersect for AND, sort_union for OR conditions requiring sorting.
Distinct
MySQL stops searching after finding the first matching row.
Start temporary / End temporary
Markers for the DuplicateWeedout semi-join strategy.
FirstMatch(tbl_name)
Semi-join FirstMatch strategy – returns on first match.
LooseScan(m..n)
Semi-join LooseScan strategy.
Rematerialize
Re-materialization of a Lateral Derived Table in MySQL 8.0+.
Warning Signals (Yellow)
Using filesort
MySQL needs an extra sorting pass to satisfy ORDER BY. Despite the misleading name, this may happen in memory. But it means the index cannot satisfy the sort order, requiring extra CPU and memory. Large datasets may spill to disk temp files.
-- Triggers filesort: INDEX idx_status (status)
EXPLAIN SELECT * FROM orders WHERE status='pending' ORDER BY created_at;
-- Extra: Using where; Using filesort
-- Eliminates filesort: INDEX idx_status_created (status, created_at)
EXPLAIN SELECT * FROM orders WHERE status='pending' ORDER BY created_at;
-- Extra: Using where (no filesort)
Using temporary
MySQL needs to create a temporary table. Usually for GROUP BY, DISTINCT, UNION, subqueries. Small temp tables stay in memory; large ones spill to disk.
-- Triggers temp table:
EXPLAIN SELECT DISTINCT status FROM orders ORDER BY created_at;
-- Extra: Using temporary; Using filesort
-- Fix: make ORDER BY and DISTINCT/GROUP BY use the same index
Using join buffer (Block Nested Loop)
The driven table lacks an index (or the optimizer chose not to use one). MySQL uses Block Nested Loop. This is usually a performance signal – add an index for the JOIN condition.
Danger Signals (Red)
Using temporary; Using filesort
Both appearing together is the most serious performance warning. A temp table is created AND sorted. Can cause severe performance issues on large datasets.
Full scan on NULL key
Subquery optimization fallback – when the optimizer cannot use an index, it falls back to a full scan on the subquery.
Range checked for each record
MySQL cannot determine which index to use and re-evaluates for each row. Performance is usually poor.
Other / Informational
Backward index scan
Reverse index scan, reported in MySQL 8.0+. Appears with ORDER BY ... DESC when the index is ascending. A backward scan is usually only slightly slower than a forward one; MySQL 8.0 also supports descending indexes (CREATE INDEX ... (col DESC)) to avoid it entirely.
Zero limit
Query has LIMIT 0; MySQL returns empty result immediately.
No matching rows after partition pruning
No matching partitions after pruning.
Using where with pushed condition
NDB Cluster specific – the condition is pushed down to the data nodes.
Plan isn't ready yet
Appears with EXPLAIN FOR CONNECTION in MySQL 8.0+ – the target connection's plan is not ready yet.
EXPLAIN ANALYZE (MySQL 8.0.18+)
EXPLAIN ANALYZE is the most powerful execution plan tool, introduced in MySQL 8.0.18. Unlike regular EXPLAIN, it actually executes the query and records real execution statistics: actual rows, actual time, loop count.
EXPLAIN ANALYZE SELECT o.*, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending' AND o.total > 100
ORDER BY o.created_at DESC LIMIT 10;
/*
-> Limit: 10 row(s) (cost=2.54 rows=10) (actual time=0.892..0.943 rows=10 loops=1)
-> Nested loop inner join (cost=2.54 rows=10) (actual time=0.889..0.938 rows=10 loops=1)
-> Sort: o.created_at DESC, limit input to 10 row(s) per chunk
-> Filter: ((o.status = 'pending') and (o.total > 100)) (cost=1.25 rows=10) (actual time=0.325..0.678 rows=42 loops=1)
-> Table scan on o (cost=1.25 rows=1000) (actual time=0.312..0.590 rows=1000 loops=1)
-> Single-row index lookup on c using PRIMARY (id=o.customer_id) (cost=0.13 rows=1) (actual time=0.024..0.024 rows=1 loops=10)
*/
How to Read the Output
| Field | Meaning |
|---|---|
| cost=X.XX | Optimizer's estimated cost (lower is better) |
| rows=N (in parens) | Estimated row count |
| actual time=A..B | A = time to first row (ms), B = time to all rows (ms) |
| rows=N (actual) | Actual rows processed |
| loops=N | Times this operation executed (in a nested loop, inner-table loops = outer-table rows) |
Estimated vs Actual Discrepancy Analysis
The greatest value of EXPLAIN ANALYZE is revealing discrepancies between estimates and actuals:
- Estimated rows << actual rows: statistics are stale – run ANALYZE TABLE. The optimizer may have chosen a wrong join order or index.
- Estimated rows >> actual rows: the optimizer overestimated cost, possibly abandoning an index. Try FORCE INDEX to see if it's faster.
- One layer has disproportionately high actual time: that layer is the bottleneck. Focus optimization there.
- High loops value: the operation repeats many times. Even at 0.01 ms per iteration, loops=50000 totals 500 ms.
Warning: EXPLAIN ANALYZE actually executes the query. For write statements (INSERT/UPDATE/DELETE), it really modifies data. Wrap it in a transaction and ROLLBACK, or use it only on SELECTs.
EXPLAIN FORMAT=JSON
JSON format provides much richer information than tabular format, including cost details, used columns, attached conditions, and more. Especially useful for diagnosing complex queries.
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'pending'\G
Key JSON Fields
| Field | Meaning |
|---|---|
| query_cost | Total estimated cost of the entire query; useful for comparing different query formulations |
| cost_info.read_cost | I/O cost of reading data |
| cost_info.eval_cost | CPU cost of processing rows |
| cost_info.prefix_cost | Cumulative cost from query start to the current table |
| used_columns | List of columns actually used – helps design covering indexes |
| attached_condition | Full expression of the filtering condition attached to this table |
| possible_keys | Candidate indexes (same as tabular) |
| key | Chosen index |
| rows_examined_per_scan | Rows examined per scan |
| rows_produced_per_join | Rows produced per join step |
| filtered | Filter percentage (with decimals) |
| using_index | Boolean: whether a covering index is used |
Example JSON Output
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "12.75"
},
"table": {
"table_name": "orders",
"access_type": "ref",
"possible_keys": ["idx_status", "idx_status_created"],
"key": "idx_status_created",
"used_key_parts": ["status"],
"key_length": "42",
"ref": ["const"],
"rows_examined_per_scan": 50,
"rows_produced_per_join": 50,
"filtered": "100.00",
"cost_info": {
"read_cost": "7.75",
"eval_cost": "5.00",
"prefix_cost": "12.75",
"data_read_per_join": "25K"
},
"used_columns": ["id", "customer_id", "status", "total", "created_at"],
"attached_condition": "(`mydb`.`orders`.`status` = 'pending')"
}
}
}
Use case: used_columns lets you precisely design a covering index – include all columns from used_columns in your index to achieve a covering index scan.
EXPLAIN FORMAT=TREE (MySQL 8.0.16+)
TREE format shows the iterator-based execution plan. Unlike traditional row-level EXPLAIN, it shows how data flows between operations, closer to how the database actually executes.
EXPLAIN FORMAT=TREE SELECT o.*, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending' ORDER BY o.created_at LIMIT 10\G
/*
-> Limit: 10 row(s) (cost=45.30 rows=10)
-> Nested loop inner join (cost=45.30 rows=50)
-> Sort: o.created_at, limit input to 10 row(s) per chunk (cost=27.80 rows=50)
-> Index lookup on o using idx_status (status='pending') (cost=27.80 rows=50)
-> Single-row index lookup on c using PRIMARY (id=o.customer_id) (cost=0.25 rows=1)
*/
Iterator Types
| Iterator | Meaning |
|---|---|
| Table scan | Full table scan |
| Index scan | Full index scan |
| Index lookup | Equality index lookup (ref/eq_ref/const etc.) |
| Index range scan | Index range scan |
| Single-row index lookup | Single-row lookup (const/eq_ref) |
| Nested loop inner join | Nested loop inner join |
| Hash join | Hash join (8.0.18+) |
| Sort | Sort operation (corresponds to Using filesort) |
| Filter | Filter condition (corresponds to Using where) |
| Aggregate | Aggregation operation |
| Materialize | Materialize a subquery/CTE |
| Limit | Limit output rows |
| Stream results | Stream results to the client |
Read TREE format from innermost (deepest indent) to outermost. The innermost level is the data source (table scan or index lookup); data flows upward through filters, sorts, joins, and limits before returning to the client.
Common key_len Quick Reference
| key_len | Common Source |
|---|---|
| 4 | INT NOT NULL |
| 5 | INT NULL (4 + 1 NULL flag) |
| 8 | BIGINT NOT NULL |
| 9 | BIGINT NULL |
| 3 | DATE NOT NULL |
| 5 | DATETIME NOT NULL (MySQL 5.6.4+) |
| 4 | TIMESTAMP NOT NULL |
| 42 | VARCHAR(10) NOT NULL (utf8mb4: 10×4+2) |
| 43 | VARCHAR(10) NULL (utf8mb4: 10×4+2+1) |
| 82 | VARCHAR(20) NOT NULL (utf8mb4: 20×4+2) |
| 202 | VARCHAR(50) NOT NULL (utf8mb4: 50×4+2) |
| 767 | InnoDB single column index prefix max (MySQL 5.6 innodb_large_prefix=OFF) |
| 3072 | InnoDB single column index prefix max (MySQL 5.7+/8.0) |
Real-World EXPLAIN Analysis Examples
Example 1: Full Table Scan โ Add Index
SELECT * FROM orders WHERE customer_email = '[email protected]';
-- EXPLAIN:
+----+--------+--------+------+------+------+------+--------+-------+-------------+
| id | s_type | table | type | p_keys| key | k_len| ref | rows | Extra |
+----+--------+--------+------+------+------+------+--------+-------+-------------+
| 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 850000| Using where |
+----+--------+--------+------+------+------+------+--------+-------+-------------+
-- Problem: scanning 850K rows for one email, 3.2 seconds
ALTER TABLE orders ADD INDEX idx_customer_email (customer_email);
-- EXPLAIN after fix:
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
| id | s_type | table | type | possible_keys | key | k_len| ref | rows | Extra |
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
| 1 | SIMPLE | orders | ref | idx_customer_email | idx_customer_email | 1022 | const | 3 | |
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
-- After: 3 rows examined, 0.001 seconds
Example 2: JOIN with type=ALL โ Optimize Join
SELECT o.id, p.name, oi.quantity
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.created_at > '2024-01-01';
-- EXPLAIN:
-- id=1, table=o, type=ALL, rows=850000 (no index on created_at)
-- id=1, table=oi, type=ALL, rows=3200000 (no index on order_id)
-- id=1, table=p, type=eq_ref, key=PRIMARY, rows=1
-- Total rows: 850000 x 3200000 x 1 = 2.7 trillion!
ALTER TABLE orders ADD INDEX idx_created_at (created_at);
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);
-- EXPLAIN after fix:
-- id=1, table=o, type=range, key=idx_created_at, rows=42000
-- id=1, table=oi, type=ref, key=idx_order_id, rows=4
-- id=1, table=p, type=eq_ref, key=PRIMARY, rows=1
-- Total: 42000 x 4 x 1 = 168,000
Example 3: Correlated Subquery โ Rewrite as JOIN
SELECT c.name, (
SELECT COUNT(*) FROM orders WHERE customer_id = c.id
) AS order_count
FROM customers c WHERE c.status = 'active';
-- EXPLAIN:
-- id=1, select_type=PRIMARY, table=c, type=ref, rows=5000
-- id=2, select_type=DEPENDENT SUBQUERY, table=orders, type=ALL, rows=850000
-- Subquery runs 5000 times, each scanning 850K rows!
-- Option A: add index on orders, then LEFT JOIN + GROUP BY
ALTER TABLE orders ADD INDEX idx_customer_id (customer_id);
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE c.status = 'active'
GROUP BY c.id, c.name;
-- EXPLAIN:
-- id=1, table=c, type=ref, key=idx_status, rows=5000
-- id=1, table=o, type=ref, key=idx_customer_id, rows=4
-- Total: 5000 x 4 = 20,000
Example 4: filesort โ Add Sort-Covering Index
SELECT id, status, created_at, total
FROM orders WHERE status = 'pending'
ORDER BY created_at DESC LIMIT 20;
-- EXPLAIN:
-- type=ref, key=idx_status, rows=42000, Extra: Using filesort
-- Problem: index finds 42K rows, then sorts ALL of them, then takes 20
-- Create composite index covering both WHERE and ORDER BY
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);
-- EXPLAIN:
-- type=ref, key=idx_status_created, rows=20, Extra: Backward index scan
-- MySQL scans index backwards (DESC), takes first 20 rows, no filesort needed
-- Further optimization: create descending index
ALTER TABLE orders ADD INDEX idx_status_created_desc (status, created_at DESC);
Example 5: Derived Table Materialization โ CTE or Lateral Join
SELECT c.name, recent.last_order_date, recent.total
FROM customers c
JOIN (
SELECT customer_id, MAX(created_at) AS last_order_date, SUM(total) AS total
FROM orders GROUP BY customer_id
) AS recent ON c.id = recent.customer_id
WHERE c.status = 'vip';
-- EXPLAIN:
-- id=1, table=c, type=ref, rows=100
-- id=1, table=<derived2>, type=ref, rows=10
-- id=2, select_type=DERIVED, table=orders, type=ALL, rows=850000
-- Problem: derived table materializes aggregation of all 850K rows
SELECT c.name, recent.last_order_date, recent.total
FROM customers c,
LATERAL (
SELECT MAX(created_at) AS last_order_date, SUM(total) AS total
FROM orders WHERE customer_id = c.id
) AS recent
WHERE c.status = 'vip';
-- Or use CTE + window function (8.0+):
WITH ranked AS (
SELECT customer_id, created_at, total,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
FROM orders
)
SELECT c.name, r.created_at AS last_order_date, r.total
FROM customers c JOIN ranked r ON c.id = r.customer_id AND r.rn = 1
WHERE c.status = 'vip';
Common Anti-Patterns Detected by EXPLAIN
Below are common performance anti-patterns detectable through EXPLAIN output. Each includes detection criteria and fix suggestions.
Anti-Pattern 1: Full Table Scan on Large Table
| Attribute | Value |
|---|---|
| Detection | type=ALL AND rows > 1000 |
| Impact | O(N) complexity, slower as table grows. Can take tens of seconds on million-row tables. |
| Fix | Create indexes on columns in WHERE, JOIN, ORDER BY. Check for functions on columns preventing index use. Check for implicit type conversions. |
Anti-Pattern 2: Temporary Table + Filesort
| Attribute | Value |
|---|---|
| Detection | Extra contains both Using temporary AND Using filesort |
| Impact | Materializes intermediate results and sorts them. Large data may spill to disk, causing severe slowdown. |
| Fix | Ensure GROUP BY / ORDER BY columns are in the same index. If possible, share a composite index prefix across WHERE + GROUP BY + ORDER BY. |
Anti-Pattern 3: Rows Estimate >> Actual Result Set
| Attribute | Value |
|---|---|
| Detection | rows very high but filtered very low (e.g., rows=100000, filtered=0.10) |
| Impact | Many rows read then immediately discarded – wasted I/O and CPU. |
| Fix | Add more precise indexes covering filter columns in WHERE. Consider composite indexes. |
Anti-Pattern 4: possible_keys Exists but key is NULL
| Attribute | Value |
|---|---|
| Detection | possible_keys != NULL AND key = NULL |
| Impact | Index available but optimizer chose full scan instead. |
| Fix | 1) Run ANALYZE TABLE to refresh statistics; 2) Check index selectivity (SELECT COUNT(DISTINCT col)/COUNT(*)); 3) Try FORCE INDEX; 4) Check whether SELECT * makes table lookups too expensive – select only the needed columns. |
Anti-Pattern 5: DEPENDENT SUBQUERY Per-Row Execution
| Attribute | Value |
|---|---|
| Detection | select_type = DEPENDENT SUBQUERY |
| Impact | Subquery executes once per outer row. 100K outer rows × 500K subquery scan = 50 billion row checks. |
| Fix | Rewrite as JOIN or EXISTS (let optimizer auto-convert to semi-join). For scalar subqueries (in SELECT list), rewrite as LEFT JOIN + aggregation. |
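A before/after sketch for the scalar-subquery case (schema is illustrative):

```sql
-- Before: select_type = DEPENDENT SUBQUERY, runs once per customer row
SELECT c.name,
       (SELECT MAX(o.created_at)
        FROM orders o
        WHERE o.customer_id = c.id) AS last_order
FROM customers c;

-- After: aggregate orders once, then a single LEFT JOIN
SELECT c.name, t.last_order
FROM customers c
LEFT JOIN (SELECT customer_id, MAX(created_at) AS last_order
           FROM orders
           GROUP BY customer_id) t ON t.customer_id = c.id;
```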
Anti-Pattern 6: Functions on Indexed Columns
| Attribute | Value |
|---|---|
| Detection | Indexed column in WHERE, but type=ALL and key=NULL |
| Impact | Index completely disabled, full table scan. |
| Fix | Avoid WHERE YEAR(date_col) = 2024; rewrite as WHERE date_col >= '2024-01-01' AND date_col < '2025-01-01'. Or use functional indexes in MySQL 8.0+: ALTER TABLE t ADD INDEX ((YEAR(date_col))). |
Anti-Pattern 7: Implicit Type Conversion
| Attribute | Value |
|---|---|
| Detection | Similar to anti-pattern 6 โ index exists but not used |
| Typical | Column is VARCHAR but WHERE uses a number: WHERE phone = 13800138000. MySQL CASTs every row, index disabled. |
| Fix | Ensure type match: WHERE phone = '13800138000'. |
Anti-Pattern 8: JOIN Multiplier Explosion
| Attribute | Value |
|---|---|
| Detection | Product of rows across multiple JOINed tables is very large |
| Impact | Three-table JOIN with 10000 rows each = 10^12 combinations (without indexes) |
| Fix | Add indexes on the driven side of each JOIN condition. Ensure the driving table has the smallest result set. Use STRAIGHT_JOIN to control order if needed. |
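A sketch of the fix, assuming hypothetical orders, order_items, and products tables:

```sql
-- Make sure every driven table can be probed by index
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);
-- (products.id is usually already the primary key)

-- Force the JOIN order if the optimizer picks a bad driving table
SELECT STRAIGHT_JOIN o.id, p.name
FROM orders o                          -- smallest filtered set drives
JOIN order_items i ON i.order_id = o.id
JOIN products p    ON p.id = i.product_id
WHERE o.status = 'pending';
```

With indexes on the JOIN conditions, each probe is a constant-time lookup instead of a scan, so the row-count product never materializes.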
Optimizer Trace: Inside the Optimizer's Decision Process
When EXPLAIN tells you "what the optimizer chose" but not "why it chose it," optimizer_trace is the tool you need. It records the optimizer's complete decision chain from receiving the SQL to generating the final execution plan.
-- Enable optimizer_trace
SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576; -- 1MB
-- Execute the query you want to analyze
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at LIMIT 10;
-- View the complete optimizer decision process
SELECT * FROM information_schema.OPTIMIZER_TRACE\G
-- Disable trace (avoid overhead)
SET optimizer_trace = 'enabled=off';
Three Phases of Trace Output
1. join_preparation
SQL parsing and preprocessing. Here you see: whether subqueries are converted to semi-joins, whether views/derived tables are merged or materialized, equality propagation, and other optimization transformations.
2. join_optimization
This is the core phase. Contains:
- condition_processing: condition simplification (constant folding, removing redundant conditions).
- table_dependencies: dependency relationships between tables.
- ref_optimizer_key_uses: candidate ref access methods.
- rows_estimation: row count estimation and index selection per table. Here you see why an index was chosen or abandoned, with the complete cost-comparison numbers.
- considered_execution_plans: all considered JOIN orders and their costs.
- attaching_conditions_to_tables: which conditions are assigned to which tables.
3. join_execution
Execution phase information (usually minimal).
Practical: Using Trace to Diagnose Index Selection Issues
-- Query: optimizer has two candidate indexes but chose the "wrong" one
-- Look in the rows_estimation section of the trace:
"rows_estimation": [
{
"table": "`orders`",
"range_analysis": {
"table_scan": {
"rows": 850000,
"cost": 87412.1 -- full scan cost
},
"potential_range_indexes": [
{
"index": "idx_status",
"usable": true,
"key_parts": ["status"]
},
{
"index": "idx_created_at",
"usable": true,
"key_parts": ["created_at"]
}
],
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "idx_status",
"ranges": ["pending <= status <= pending"],
"rows": 425000, -- 50% of rows
"cost": 148750, -- more expensive than full scan!
"chosen": false,
"cause": "cost"
},
{
"index": "idx_created_at",
"ranges": ["2024-01-01 <= created_at"],
"rows": 42000,
"cost": 50400,
"chosen": true
}
]
}
}
}
]
The trace clearly shows: idx_status matches 50% of rows (too low selectivity), cost 148,750 is even higher than full scan's 87,412. Meanwhile idx_created_at matches only 42,000 rows at cost 50,400, so it was chosen.
EXPLAIN FOR CONNECTION: Diagnose Running Queries
When you see a long-running query in SHOW PROCESSLIST, use EXPLAIN FOR CONNECTION to view its execution plan without copying the SQL and re-executing it.
-- Step 1: find the problem query's connection ID
SELECT id, time, state, LEFT(info, 100) AS query
FROM information_schema.PROCESSLIST
WHERE command = 'Query' AND time > 5
ORDER BY time DESC;
-- Step 2: get the execution plan
EXPLAIN FOR CONNECTION 12345;
-- JSON format for more detail
EXPLAIN FORMAT=JSON FOR CONNECTION 12345;
Use Cases
- Production diagnosis: inspect a slow query's plan directly in production without copying the SQL and re-executing it.
- Prepared statements: parameters are already bound, so EXPLAIN FOR CONNECTION shows the actual execution plan with bound values.
- Long transaction analysis: combine with performance_schema.events_statements_current for fuller context.
Limitations: (1) requires the PROCESS privilege; (2) if the target query is in the "Sending data" state, the plan may be incomplete; (3) very short queries may finish before you can run EXPLAIN.
Histogram Statistics and EXPLAIN Accuracy (MySQL 8.0+)
MySQL 8.0 introduced Column Histograms, significantly improving the accuracy of the filtered column estimate, especially for columns with skewed data distribution.
Why Histograms Are Needed
Traditional index statistics only record column cardinality (number of distinct values). This works for uniformly distributed data, but real data is often heavily skewed. For example, order statuses on an e-commerce platform:
-- Data distribution:
-- completed: 85% (850,000 rows)
-- shipped: 10% (100,000)
-- pending: 4% (40,000)
-- cancelled: 1% (10,000)
-- Without histogram, optimizer assumes uniform distribution:
-- Rows per status = 1,000,000 / 4 = 250,000
-- This causes severe overestimation for WHERE status = 'cancelled'
-- Estimated 250K, actual 10K
Creating and Using Histograms
-- Create histogram
ANALYZE TABLE orders UPDATE HISTOGRAM ON status WITH 100 BUCKETS;
ANALYZE TABLE orders UPDATE HISTOGRAM ON total WITH 256 BUCKETS;
-- View histogram info
SELECT SCHEMA_NAME, TABLE_NAME, COLUMN_NAME,
       HISTOGRAM->>'$."histogram-type"' AS type,
       JSON_LENGTH(HISTOGRAM->'$.buckets') AS buckets
FROM information_schema.COLUMN_STATISTICS;
-- Drop histogram
ANALYZE TABLE orders DROP HISTOGRAM ON status;
Note: Histograms are independent of indexes. Even without an index on the column, histograms help the optimizer estimate filtered more accurately. Two histogram types:
- Singleton: used when distinct values <= buckets. Each bucket stores an exact value and its frequency.
- Equi-height: used when distinct values > buckets. Each bucket stores a value range and cumulative frequency.
EXPLAIN Optimization Decision Flowchart
When facing EXPLAIN output, follow this flowchart to quickly locate issues:
Step 1: Check the type Column
- type = ALL and rows > 1000? → Optimize immediately: add an index or rewrite the query.
- type = index without Using index? → Not a covering index; may need optimization.
- type is ref / eq_ref / const / range? → Usually acceptable; check the other columns.
Step 2: Check key and possible_keys
- possible_keys = NULL? → No usable index; create one.
- possible_keys has values but key = NULL? → Index exists but unused; check selectivity and statistics.
- Wrong index used? → Test with FORCE INDEX and check optimizer_trace for the reason.
Step 3: Check rows and filtered
- rows very high? → Index selectivity may be insufficient; consider a more precise composite index.
- filtered very low (< 20%)? → Many rows are read then discarded; consider adding the filter columns to the index.
- Product of rows across JOINed tables is large? → Check indexes on the JOIN conditions; consider adjusting the JOIN order.
Step 4: Check Extra
- Using filesort? → Create a composite index covering WHERE + ORDER BY.
- Using temporary? → Ensure GROUP BY/DISTINCT columns are indexed, or align them with the ORDER BY columns.
- Using index? → Covering index; good performance.
- Using index condition? → ICP is working; generally a good signal.
Step 5: Check select_type
- DEPENDENT SUBQUERY? → Highest priority: rewrite as a JOIN.
- DERIVED with high rows? → Consider a CTE or a lateral join (MySQL 8.0.14+).
Step 6: Validate with EXPLAIN ANALYZE
- Compare estimated vs. actual rows; large gaps indicate stale statistics.
- Find the node with the highest actual time; that is the bottleneck.
- Watch the loops value: total actual time = time per loop × loops.
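A typical EXPLAIN ANALYZE node carries all three signals at once. The numbers below are illustrative and the exact formatting varies by MySQL version:

```sql
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
-- -> Filter: (orders.`status` = 'pending')
--      (cost=85000 rows=250000)                       <- optimizer estimate
--      (actual time=0.081..950.2 rows=40000 loops=1)  <- measured reality
--
-- Estimated 250000 vs. actual 40000 rows: statistics are stale or the
-- distribution is skewed (consider ANALYZE TABLE or a histogram).
-- A node's total contribution = its actual time per loop * loops.
```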
Using Performance Schema with EXPLAIN
EXPLAIN tells you the plan; Performance Schema tells you the actual runtime behavior. Combining both is the best practice for diagnosing complex performance issues.
Finding the Slowest Query Patterns
-- Top 10 query patterns by total execution time
SELECT DIGEST_TEXT,
COUNT_STAR AS exec_count,
ROUND(SUM_TIMER_WAIT/1e12, 2) AS total_sec,
ROUND(AVG_TIMER_WAIT/1e12, 4) AS avg_sec,
SUM_ROWS_EXAMINED,
SUM_ROWS_SENT,
ROUND(SUM_ROWS_EXAMINED/NULLIF(SUM_ROWS_SENT,0), 1) AS examine_to_send_ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'your_db'
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
Key Metrics Explained
| Metric | Meaning | EXPLAIN Correlation |
|---|---|---|
| SUM_ROWS_EXAMINED / SUM_ROWS_SENT | Ratio of rows examined to rows returned | A ratio >> 1 means many rows are read then discarded; corresponds to high rows with low filtered in EXPLAIN. |
| SUM_SORT_ROWS | Number of rows sorted | A high value corresponds to Using filesort in EXPLAIN. |
| SUM_CREATED_TMP_TABLES | In-memory temp tables created | Corresponds to Using temporary. |
| SUM_CREATED_TMP_DISK_TABLES | On-disk temp tables created | Much worse than in-memory; indicates tmp_table_size is too small or the data volume is too large. |
| SUM_NO_INDEX_USED | Number of executions that used no index | Corresponds to type=ALL in EXPLAIN. |
Workflow Example
-- 1. Find query with highest examine/send ratio from Performance Schema
SELECT DIGEST_TEXT, SUM_ROWS_EXAMINED, SUM_ROWS_SENT,
ROUND(SUM_ROWS_EXAMINED/NULLIF(SUM_ROWS_SENT,0)) AS ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'mydb' AND SUM_ROWS_SENT > 0
ORDER BY ratio DESC LIMIT 5;
-- 2. After finding the problem query, analyze with EXPLAIN
EXPLAIN FORMAT=JSON SELECT ...;
-- 3. After optimization, check Performance Schema again to confirm improvement
-- (Note: TRUNCATE the stats table first to start fresh)
TRUNCATE performance_schema.events_statements_summary_by_digest;
Index Design Mistakes Revealed by EXPLAIN
Mistake 1: Wrong Column Order in Composite Index
-- Index: INDEX (created_at, status)
-- Query:
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01';
-- EXPLAIN shows key_len covers only created_at (columns after range condition cannot be used for exact match)
-- Correct order: INDEX (status, created_at):
-- Equality columns first, range columns last
Mistake 2: Redundant Indexes
-- These indexes are redundant:
INDEX idx_a (a) -- covered by idx_ab
INDEX idx_ab (a, b)
INDEX idx_abc (a, b, c) -- idx_ab is also covered by idx_abc
-- Check redundant indexes:
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'mydb';
Mistake 3: Over-Indexing
Every index adds cost to write operations (INSERT/UPDATE/DELETE must maintain all indexes). Use EXPLAIN to check which indexes never appear in the key column; they may be useless.
-- Find indexes never used (MySQL 5.6+):
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND count_star = 0
AND object_schema = 'mydb'
ORDER BY object_name;
Mistake 4: Ignoring Charset Impact on key_len
The same VARCHAR(255) has key_len = 257 (255+2) in latin1 but key_len = 1022 (255×4+2) in utf8mb4. This directly impacts:
- Entries per index page (larger key_len = fewer entries = taller B-tree).
- Maximum columns and total length of composite indexes (InnoDB max 3072 bytes).
- Memory usage per row during in-memory sorts.
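The effect is easy to verify with two throwaway tables. The schema below is illustrative; NOT NULL keeps key_len free of the extra null-flag byte:

```sql
CREATE TABLE t_latin1 (
  name VARCHAR(255) CHARACTER SET latin1 NOT NULL,
  KEY idx_name (name)
);
CREATE TABLE t_utf8mb4 (
  name VARCHAR(255) CHARACTER SET utf8mb4 NOT NULL,
  KEY idx_name (name)
);

EXPLAIN SELECT name FROM t_latin1  WHERE name = 'x';  -- key_len = 257  (255*1 + 2)
EXPLAIN SELECT name FROM t_utf8mb4 WHERE name = 'x';  -- key_len = 1022 (255*4 + 2)
```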
EXPLAIN Quick Reference Cheat Sheet
type Column Quick Reference
| type | One-Line Description | Target |
|---|---|---|
| system/const | PK/UK exact single-row lookup | Best |
| eq_ref | PK/UK one-to-one JOIN lookup | Best for JOIN |
| ref | Non-unique index equality | Good |
| range | Index range scan | Acceptable |
| index | Full index scan (OK if covering) | Evaluate |
| ALL | Full table scan; slow on large tables | Must optimize |
Extra Column Quick Reference
| Extra | Signal | Action |
|---|---|---|
| Using index | Good | Covering index; no action needed |
| Using index condition | Good | ICP working |
| Using where | Neutral | Evaluate together with type and rows |
| Using filesort | Warning | Consider a composite index for the sort |
| Using temporary | Warning | Optimize GROUP BY / DISTINCT |
| Using join buffer | Attention | Add an index on the JOIN condition |
| Using temporary + Using filesort | Danger | Highest-priority fix |
EXPLAIN Commands Quick Reference
-- Basic usage
EXPLAIN SELECT ...;
-- JSON format (more detail)
EXPLAIN FORMAT=JSON SELECT ...;
-- TREE format (iterator view, 8.0.16+)
EXPLAIN FORMAT=TREE SELECT ...;
-- Actually execute and collect stats (8.0.18+)
EXPLAIN ANALYZE SELECT ...;
-- Analyze running query
EXPLAIN FOR CONNECTION <connection_id>;
-- View optimizer-rewritten SQL
EXPLAIN SELECT ...; SHOW WARNINGS;
-- Refresh table statistics
ANALYZE TABLE table_name;
-- Create histogram (8.0+)
ANALYZE TABLE t UPDATE HISTOGRAM ON col WITH 100 BUCKETS;
FAQ
Does EXPLAIN actually execute the query?
Regular EXPLAIN and EXPLAIN FORMAT=JSON/TREE do not execute the query; they only have the optimizer generate the plan. The sole exception is EXPLAIN ANALYZE (MySQL 8.0.18+), which actually executes the query to collect real runtime statistics. For write statements (INSERT/UPDATE/DELETE), EXPLAIN ANALYZE really modifies data, so run it inside a transaction you can roll back.
Note: even with regular EXPLAIN, queries with derived tables (DERIVED) may require materialization to generate the plan, consuming some resources.
Why are EXPLAIN's row estimates inaccurate?
rows is based on index statistics (cardinality), which are approximate values from random sampling. Common reasons for inaccuracy:
- Statistics are stale after heavy DML: run ANALYZE TABLE to refresh them.
- InnoDB samples a limited number of pages (default innodb_stats_persistent_sample_pages=20): increase the sample pages.
- The data distribution is highly skewed (e.g., 99% of status is 'active'): histograms (MySQL 8.0: ANALYZE TABLE t UPDATE HISTOGRAM ON col) improve the estimates.
- Multi-column conditions assume column independence, while the actual data may be highly correlated.
What's the difference between type=index and type=ALL? Which is faster?
Both are full scans, but scan different things:
- type=ALL: full table scan. Reads every row from the clustered index (data pages); each page contains complete row data.
- type=index: full index scan. Reads every entry from a secondary index; index entries are much smaller than data rows.
With a covering index (Extra: Using index), type=index is significantly faster since it reads only the smaller index pages. Without a covering index (no Using index), type=index may actually be slower: the per-entry table lookups cause heavy random I/O.
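A quick way to observe the difference on a hypothetical orders table with an index on customer_id:

```sql
-- Non-covering: SELECT * would need a table lookup per index entry,
-- so the optimizer often prefers type=ALL (plus a filesort) instead
EXPLAIN SELECT * FROM orders ORDER BY customer_id;

-- Covering: the query touches only the indexed column
EXPLAIN SELECT customer_id FROM orders;
-- expect type=index with Extra: Using index, reading only index pages
```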
How can I analyze a slow query without executing it?
Use regular EXPLAIN (without ANALYZE). It generates the plan without executing:
EXPLAIN FORMAT=JSON SELECT ... your slow query ...;
If you need actual timing but worry that the query is too slow to run:
- Use EXPLAIN ANALYZE in a test environment.
- Add a LIMIT, then use EXPLAIN ANALYZE.
- Use optimizer_trace for the full optimizer decision process: SET optimizer_trace='enabled=on'; SELECT ...; SELECT * FROM information_schema.optimizer_trace;
EXPLAIN shows index is used but query is still slow โ why?
Index usage does not guarantee good performance. Common reasons:
- High lookup cost: type=range scans 100K index entries, each requiring a table lookup, i.e., 100K random I/Os. Fix: use a covering index.
- Low selectivity: the index returns too many rows (e.g., a gender column with only M/F values). Check the rows column.
- Sort bottleneck: the index serves the WHERE, but the ORDER BY still needs a filesort. Check the Extra column.
- Lock waits: the query itself is fast but is waiting for locks. Check SHOW ENGINE INNODB STATUS.
- Network transfer: the query returns massive amounts of data (SELECT * on millions of rows). Reduce the columns and rows returned.
Can query_cost from EXPLAIN FORMAT=JSON be compared across different queries?
To some extent, but with caveats:
- query_cost is a dimensionless number from the optimizer's internal cost model, not milliseconds or I/O counts.
- Different formulations of the same query can be compared directly by query_cost; lower cost is usually faster.
- Comparing across different queries has limited meaning: a query returning 1 row may have a higher cost than one returning 1M rows yet execute faster.
- The cost model takes its parameters from the mysql.server_cost and mysql.engine_cost tables (memory vs. disk access costs); they are tunable but rarely need changing.
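To see where the number lives, read cost_info in the JSON output. The query and cost value below are illustrative:

```sql
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'pending';
-- Excerpt of the output:
-- "query_block": {
--   "cost_info": { "query_cost": "87412.10" },
--   ...
-- }
-- Re-run this after each rewrite of the same query and compare
-- the query_cost values; lower is usually (not always) faster.
```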
What are the differences in EXPLAIN between MySQL 5.7 and 8.0?
MySQL 8.0 adds significant EXPLAIN enhancements:
- FORMAT=TREE (8.0.16+): new iterator-based tree execution plan format.
- EXPLAIN ANALYZE (8.0.18+): actual execution with runtime statistics.
- Hash Join (8.0.18+): new JOIN algorithm, shown as "hash join" in TREE format or "Using join buffer (hash join)" in Extra.
- EXPLAIN FOR CONNECTION (available since 5.7, improved in 8.0): view the plan of a running query.
- Skip Scan (8.0.13+): "Using index for skip scan" in Extra.
- Descending indexes (8.0+): INDEX (col DESC) avoids the cost of a Backward index scan.
- Histograms (8.0+): improve the accuracy of the filtered estimate.
How to view the execution plan of a query running in another connection?
Use EXPLAIN FOR CONNECTION (MySQL 5.7+):
-- 1. Find the target connection ID
SHOW PROCESSLIST;
-- or
SELECT * FROM performance_schema.processlist WHERE INFO IS NOT NULL;
-- 2. View its execution plan
EXPLAIN FOR CONNECTION 42;
EXPLAIN FORMAT=JSON FOR CONNECTION 42;
Note: if the query finishes too quickly, EXPLAIN FOR CONNECTION may return an error. This feature is best for analyzing long-running queries.
EXPLAIN results differ when run at different times โ is this normal?
Completely normal. EXPLAIN results are influenced by:
- Index statistics: InnoDB statistics are approximate, based on sampling. After background resampling (or running ANALYZE TABLE), the rows and filtered estimates may change.
- Data volume changes: as data is inserted, updated, and deleted, changes in table size may lead the optimizer to choose different paths.
- Buffer Pool state: EXPLAIN itself is unaffected by caching (it does not execute), but EXPLAIN ANALYZE's actual time varies with whether the data is in memory.
- Optimizer hints and variables: session-level optimizer_switch settings affect the plan.
- MySQL version upgrades: different versions have different optimizer capabilities; plans may change after upgrading.
If a production execution plan suddenly degrades, first run ANALYZE TABLE to refresh statistics, then re-EXPLAIN.