MySQL EXPLAIN Analyzer
id – Query Identifier
The id column is the sequential number of the SELECT within the query. When a query contains subqueries, UNIONs, or derived tables, multiple rows appear, each corresponding to a SELECT operation. Understanding id numbering is the first step to reading complex EXPLAIN output.
Numbering Rules
- Same id: tables are JOINed in the same SELECT. MySQL processes same-id rows top-to-bottom – the upper table is the driving table, the lower is the driven table.
- Different id: higher id = higher priority = executed first. A subquery has a larger id than its outer query.
- id is NULL: this row represents the temporary table created for a UNION RESULT.
Examples
-- Simple JOIN: both rows have id = 1
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- id=1, table=o (driving table)
-- id=1, table=c (driven table)
-- Subquery: id=2 executes first
EXPLAIN SELECT * FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE status='active');
-- id=1, table=orders
-- id=2, table=customers
-- UNION: UNION RESULT has id=NULL
EXPLAIN SELECT id FROM t1 UNION SELECT id FROM t2;
-- id=1, table=t1
-- id=2, table=t2
-- id=NULL, table=<union1,2>
Note: The optimizer may rewrite subqueries as JOINs, merging originally distinct ids. Run SHOW WARNINGS immediately after EXPLAIN to see the rewritten SQL and confirm.
select_type – SELECT Type
select_type identifies the role of the current row's SELECT within the overall query. It is the key to understanding query structure – nesting, dependencies, and materialization.
| select_type | Meaning | When It Appears |
|---|---|---|
| SIMPLE | Simple query without subqueries or UNION | SELECT * FROM t |
| PRIMARY | Outermost SELECT | Outer query when subqueries/UNION present |
| SUBQUERY | Non-correlated subquery in SELECT list or WHERE | SELECT (SELECT MAX(id) FROM t2) FROM t1 |
| DEPENDENT SUBQUERY | Correlated subquery, re-executed for every outer row | WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.fk = t1.id) |
| DERIVED | Subquery in FROM clause (derived table), materialized into a temp table | SELECT * FROM (SELECT ...) AS dt |
| MATERIALIZED | Subquery materialized into a temp table (MySQL 5.6+, optimizes IN subqueries) | WHERE id IN (SELECT fk FROM t2) |
| UNION | Second or later SELECT in a UNION | SELECT ... UNION SELECT ... |
| UNION RESULT | Temp table holding the deduplicated UNION result | UNION (not UNION ALL) |
| DEPENDENT UNION | UNION member dependent on the outer query | UNION with outer table reference |
| UNCACHEABLE SUBQUERY | Subquery whose result cannot be cached and must be re-evaluated | Uses user variables, non-deterministic functions, etc. |
| UNCACHEABLE UNION | UNION member that is uncacheable | Same as above, within a UNION |
Performance Warnings
The most dangerous value is DEPENDENT SUBQUERY: the subquery executes once per outer row. With 1 million outer rows, the subquery runs 1 million times. Common fixes: rewrite it as a JOIN, or rely on the optimizer's semi-join conversion (automatic for IN subqueries in MySQL 5.6+, and for EXISTS in 8.0.16+).
-- Dangerous: DEPENDENT SUBQUERY
SELECT * FROM orders
WHERE customer_id IN (
SELECT id FROM customers WHERE region = orders.region -- references outer orders.region
);
-- Fix: rewrite as JOIN
SELECT DISTINCT o.* FROM orders o
JOIN customers c ON o.customer_id = c.id AND c.region = o.region;
table – Table Name
Identifies the table accessed by the current row. Besides actual table names, it may show these special values:
| Value | Meaning |
|---|---|
| <derivedN> | Derived table from the subquery with id=N |
| <subqueryN> | Materialized subquery with id=N (MySQL 5.6+ optimization) |
| <unionM,N,...> | UNION result temp table combining ids M, N, ... |
| NULL | Optimizer resolved the query at optimization time without table access |
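A sketch of how <derivedN> appears in practice (table and column names are hypothetical; MySQL 5.7+ may merge a derived table into the outer query, but the GROUP BY here forces materialization):

```sql
-- The FROM-clause subquery gets id=2; the outer query reads it as <derived2>
EXPLAIN SELECT dt.customer_id, dt.cnt
FROM (
  SELECT customer_id, COUNT(*) AS cnt
  FROM orders
  GROUP BY customer_id
) AS dt
WHERE dt.cnt > 10;
-- id=1, table=<derived2>              (scans the materialized temp table)
-- id=2, select_type=DERIVED, table=orders
```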
partitions – Partition Pruning
If the table is partitioned, this column shows which partitions the query actually accesses. NULL means the table is not partitioned. Partition pruning is the core benefit of partitioning – the query scans only the necessary partitions, not all of them.
-- Orders table partitioned by year
CREATE TABLE orders (
id BIGINT, order_date DATE, ...
) PARTITION BY RANGE (YEAR(order_date)) (
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025),
PARTITION p2025 VALUES LESS THAN (2026),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
EXPLAIN SELECT * FROM orders WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- partitions: p2024 (only one partition scanned)
EXPLAIN SELECT * FROM orders WHERE amount > 100;
-- partitions: p2023,p2024,p2025,pmax (all partitions – WHERE not on partition key)
Diagnostic tip: if you see all partitions listed, pruning is not working. Check whether the WHERE clause hits the partition key and whether the partition function expression is compatible with the condition.
type – Access Type (THE Most Important Column)
The type column describes how MySQL finds rows in the table. It directly determines query performance. The values below are listed from best to worst:
system
The table has at most one row (system table). A special case of const, rarely seen in production queries.
const
At most one matching row found via PRIMARY KEY or UNIQUE index equality lookup. MySQL treats the result as a constant during optimization.
EXPLAIN SELECT * FROM users WHERE id = 42;
-- type: const, key: PRIMARY
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- type: const (requires UNIQUE index on email)
eq_ref
For each row from the driving table, exactly one row is read from this table using a PRIMARY KEY or UNIQUE NOT NULL index. This is the best JOIN access type.
EXPLAIN SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
-- orders: type=ALL (driving table full scan)
-- customers: type=eq_ref, key=PRIMARY (perfect: one row per lookup)
ref
Non-unique index lookup or leftmost prefix of a unique index. May return multiple rows. Common for ordinary index equality conditions and non-unique JOINs.
EXPLAIN SELECT * FROM orders WHERE customer_id = 100;
-- type: ref, key: idx_customer_id (customer_id has a regular index)
EXPLAIN SELECT * FROM orders WHERE status = 'pending';
-- type: ref, key: idx_status
fulltext
Uses a FULLTEXT index. Appears only with MATCH ... AGAINST queries.
EXPLAIN SELECT * FROM articles WHERE MATCH(title, body) AGAINST ('mysql performance');
-- type: fulltext, key: ft_idx
ref_or_null
Like ref, but also searches for rows with NULL values. Common with WHERE col = val OR col IS NULL.
EXPLAIN SELECT * FROM t WHERE key_col = 10 OR key_col IS NULL;
-- type: ref_or_null
index_merge
The optimizer uses multiple indexes and merges results (intersection or union). The Extra column shows the merge strategy: Using intersect, Using union, Using sort_union.
EXPLAIN SELECT * FROM t WHERE col_a = 1 OR col_b = 2;
-- type: index_merge, key: idx_a,idx_b, Extra: Using union(idx_a,idx_b); Using where
Optimization tip: while index_merge is better than a full scan, a composite index is usually faster. If you see this type frequently, consider a composite index covering multiple conditions.
unique_subquery
Uses a PRIMARY KEY or UNIQUE index in an IN subquery – the subquery version of eq_ref. Usually replaced by semi-join optimization in MySQL 5.6+.
index_subquery
Like unique_subquery, but uses a non-unique index.
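For illustration, the classic shapes that produce these two types (hypothetical tables; on MySQL 5.6+ the optimizer will usually choose a semi-join strategy instead, so you may see MATERIALIZED or a plain JOIN plan):

```sql
-- unique_subquery: IN subquery probed via t2's PRIMARY KEY
EXPLAIN SELECT * FROM t1
WHERE t1.ref_id IN (SELECT id FROM t2 WHERE t2.flag = 1);
-- older plans: id=2, type=unique_subquery, key=PRIMARY

-- index_subquery: same shape, probed via a non-unique index idx_code
EXPLAIN SELECT * FROM t1
WHERE t1.code IN (SELECT code FROM t2 WHERE t2.flag = 1);
-- older plans: id=2, type=index_subquery, key=idx_code
```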
range
Index range scan. Appears with =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, LIKE (prefix), IN(). Scans one or more index ranges, not the entire index.
EXPLAIN SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31';
-- type: range, key: idx_created_at
EXPLAIN SELECT * FROM users WHERE age IN (25, 30, 35);
-- type: range, key: idx_age
Performance: range performance depends on selectivity. Scanning 1% of the index is fast; scanning 80% may be slower than a full index scan.
index
Full index scan. Like ALL, but scans the index tree instead of data rows. Two cases:
1) The index covers all needed columns (Extra shows Using index) – reads only index pages, no table access;
2) Reads all rows in index order (avoids sorting).
-- Covering index: reads only the index
EXPLAIN SELECT customer_id FROM orders;
-- type: index, key: idx_customer_id, Extra: Using index
-- Full index scan: reads all rows in PK order
EXPLAIN SELECT * FROM orders ORDER BY id;
-- type: index, key: PRIMARY
Warning: type=index is still slow on large tables. While better than ALL (indexes are smaller than data), scanning the entire index of a 10M-row table still requires significant I/O.
ALL
Full table scan. Reads every row in the table. Unless the table is small (a few hundred rows) or you genuinely need most rows (analytics), type=ALL is almost always a sign that optimization is needed.
EXPLAIN SELECT * FROM orders WHERE YEAR(created_at) = 2024;
-- type: ALL (function on column prevents index use)
-- Fix: rewrite condition as range
EXPLAIN SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- type: range (index usable)
type=NULL (Best Extreme Case)
MySQL resolved the query at optimization time without even accessing the table or index. Examples: empty table, MIN()/MAX() on indexed column, WHERE 1=0.
EXPLAIN SELECT MIN(id) FROM users;
-- type: NULL, Extra: Select tables optimized away
possible_keys – Candidate Indexes
possible_keys lists all indexes the optimizer considers potentially usable for this query. It checks which index definitions match the columns and conditions in the query.
- NULL: no candidate indexes. Usually means the columns in WHERE are not indexed, or functions/expressions prevent index usage.
- Multiple indexes: the optimizer listed several candidates; the actual choice (or lack thereof) appears in the key column.
Note: possible_keys not NULL while key is NULL is a common problem signal – the optimizer decided a full scan is cheaper than any candidate index. Reasons: (1) small table; (2) low index selectivity; (3) too many columns require table lookups, causing excessive random I/O.
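A sketch of the symptom (column, index, and data distribution are assumed for illustration):

```sql
-- status is indexed, but suppose 90% of rows are 'active':
-- index lookup + per-row table lookups would cost more random I/O
-- than one sequential full scan, so the optimizer skips the index
EXPLAIN SELECT * FROM users WHERE status = 'active';
-- possible_keys: idx_status, key: NULL, type: ALL
-- If key is still NULL after ANALYZE TABLE users, the cost estimate
-- genuinely favors the full scan for this value's selectivity
```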
key – Actually Used Index
The key column shows the index the optimizer actually chose. This is one of the most direct "answers" in EXPLAIN: was your index actually used?
- NULL: no index used (full table scan).
- PRIMARY: primary key index used.
- Index name: a specific secondary index was used.
Sometimes the optimizer's choice is suboptimal. You can steer it with FORCE INDEX(idx_name) or USE INDEX(idx_name) hints – but first run ANALYZE TABLE to rule out stale statistics, the most common cause of bad index choices:
-- Force a specific index
EXPLAIN SELECT * FROM orders FORCE INDEX(idx_created_at)
WHERE created_at > '2024-01-01' AND status = 'pending';
-- Ignore a specific index
EXPLAIN SELECT * FROM orders IGNORE INDEX(idx_status)
WHERE status = 'pending';
key_len – Index Bytes Used
key_len shows the number of bytes MySQL actually uses from the index. For composite indexes, key_len tells you how many columns of the index are being used. This is key to diagnosing composite index utilization.
Calculation Rules
| Data Type | Bytes | Notes |
|---|---|---|
| TINYINT | 1 | |
| SMALLINT | 2 | |
| MEDIUMINT | 3 | |
| INT | 4 | |
| BIGINT | 8 | |
| FLOAT | 4 | |
| DOUBLE | 8 | |
| DATE | 3 | |
| DATETIME | 5 | MySQL 5.6.4+ |
| TIMESTAMP | 4 | |
| CHAR(N) | N × charset_bytes | utf8mb4: N × 4 |
| VARCHAR(N) | N × charset_bytes + 2 | +2 for length prefix |
If the column is NULLable, add 1 byte for the NULL flag.
Practical Decoding
-- Composite index: INDEX idx_abc (a INT NOT NULL, b VARCHAR(50) NOT NULL, c DATE NOT NULL)
-- Full usage: key_len = 4 + (50*4+2) + 3 = 209
-- Only a: key_len = 4
-- a + b: key_len = 4 + 202 = 206
-- a + b + c: key_len = 209
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 'hello';
-- key_len: 206 โ used columns a + b
Leftmost prefix rule: a composite index (a, b, c) can only be used when the query includes a. WHERE b=1 cannot use this index. WHERE a=1 AND c=1 uses only column a (key_len is only 4).
ref – Values Compared to the Index
ref shows which columns or constants are compared to the index indicated by the key column. It helps you understand the details of join conditions.
| Value | Meaning |
|---|---|
| const | Compared against a constant (e.g., WHERE col = 42) |
| db.table.col | Compared against a column from another table (JOIN) |
| func | Comparison value is the result of a function |
| NULL | No index reference used (common when type is range or ALL) |
-- ref = const
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- ref: const
-- ref = db.table.column
EXPLAIN SELECT * FROM orders o JOIN users u ON o.user_id = u.id;
-- orders row: ref: NULL (driving table)
-- users row: ref: mydb.o.user_id
rows – Estimated Rows to Examine
rows is the optimizer's estimate of how many rows must be examined to find the target rows, based on index statistics. This is an estimate and can be significantly off – especially when statistics are stale.
- For JOINs, the total cost is roughly the product of all row estimates. If three tables estimate 100, 200, 50 rows, total rows examined is approximately 100 × 200 × 50 = 1,000,000.
- When statistics are inaccurate, run ANALYZE TABLE tablename to refresh them.
-- Symptom of stale statistics:
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';
-- rows: 500000 (estimated)
-- actually only 128 rows – run ANALYZE TABLE orders, then re-EXPLAIN
filtered – Row Filter Percentage
filtered indicates the estimated percentage of rows that remain after the table's WHERE condition is applied. Effective rows = rows × filtered / 100.
- filtered = 100.00: all examined rows satisfy the condition (no extra filtering), or MySQL lacks the information to estimate.
- filtered = 10.00: only 10% of rows satisfy the condition. If rows=10000, about 1000 rows pass to the next step.
A low filtered value (e.g., 1%) with a high rows value is an optimization opportunity: many rows are read then discarded. Consider adding an index to reduce rows examined.
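As a sketch of that opportunity (numbers and index names are illustrative):

```sql
-- idx_created covers created_at but not status,
-- so status is filtered only after rows are fetched
EXPLAIN SELECT * FROM orders
WHERE created_at > '2024-01-01' AND status = 'cancelled';
-- type: range, key: idx_created, rows: 40000, filtered: 1.00
-- ~40000 rows read, only ~400 survive the status filter
-- Candidate fix: a composite index (status, created_at)
```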
Extra – Additional Information (30+ Values Explained)
The Extra column contains additional information about how MySQL executes the query. It is the richest source of performance diagnostics. Grouped by performance impact below.
Positive Signals (Green)
Using index
All columns needed by the query can be retrieved from the index alone, without reading the data row. This is a "covering index" – one of the best performance scenarios.
-- Covering index example: INDEX idx_cover (status, created_at)
EXPLAIN SELECT status, created_at FROM orders WHERE status = 'pending';
-- Extra: Using index (no table lookup needed)
Using index condition
Index Condition Pushdown (ICP), introduced in MySQL 5.6. Conditions that would have been filtered at the server layer after table lookup are now pushed down to the storage engine and filtered within the index, reducing table lookups.
-- INDEX idx_name_age (name, age)
EXPLAIN SELECT * FROM users WHERE name LIKE 'John%' AND age > 30;
-- Extra: Using index condition
-- ICP filters age > 30 at index level, avoiding table lookup for non-matching rows
Select tables optimized away
The query was resolved during optimization, no table access needed at execution time. Common for MIN()/MAX() on columns with B-tree indexes.
Using index for group-by
MySQL uses the index to process GROUP BY without creating a temp table or sorting. Occurs when GROUP BY columns are an index prefix.
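A minimal sketch of this "loose index scan" (index and table names assumed):

```sql
-- INDEX idx_cat_price (category, price)
-- GROUP BY on the index prefix plus MIN() on the next key part
-- lets MySQL jump from group to group inside the index
EXPLAIN SELECT category, MIN(price) FROM products GROUP BY category;
-- Extra: Using index for group-by
```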
Using index for skip scan
Skip Scan optimization introduced in MySQL 8.0.13. When composite index (a, b) prefix column a is not in WHERE but b is, the optimizer can skip the prefix and scan by b.
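A sketch mirroring the shape described in the MySQL manual (names assumed; the optimization pays off only when the skipped prefix column has few distinct values):

```sql
-- INDEX idx_f1_f2 (f1, f2); f1 is low-cardinality
-- 8.0.13+ can run one range scan on f2 per distinct f1 value
EXPLAIN SELECT f1, f2 FROM t1 WHERE f2 > 40;
-- Extra: Using index for skip scan
```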
Neutral / Informational
Using where
The server layer applies additional filtering to rows returned by the storage engine. Almost every query with a WHERE clause shows this. Not a problem by itself, but combined with type=ALL and high rows, it means many rows are read then discarded.
Using MRR
Multi-Range Read optimization. MySQL converts random disk access to sequential by sorting row IDs before table lookup, reducing I/O.
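A hedged sketch – whether MRR actually appears depends on the mrr/mrr_cost_based flags in optimizer_switch and on the cost model:

```sql
-- Secondary-index range scan followed by table lookups:
-- MRR sorts the collected row IDs before fetching the rows
EXPLAIN SELECT * FROM orders WHERE customer_id BETWEEN 100 AND 500;
-- type: range, key: idx_customer_id, Extra: Using MRR
-- To observe it reliably while testing:
-- SET optimizer_switch = 'mrr=on,mrr_cost_based=off';
```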
Using join buffer (Block Nested Loop / hash join)
The driven table has no usable index. MySQL uses a Join Buffer to reduce scans of the driven table. MySQL 8.0.18+ introduced Hash Join on top of this.
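A sketch of the join shape that triggers it (columns are hypothetical and deliberately unindexed):

```sql
EXPLAIN SELECT * FROM t1 JOIN t2 ON t1.ref_code = t2.ref_code;
-- MySQL 8.0.18+:   Extra: Using join buffer (hash join)
-- earlier versions: Extra: Using join buffer (Block Nested Loop)
```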
Impossible WHERE
The WHERE condition is logically impossible (e.g., WHERE 1 = 0). MySQL returns empty result immediately.
Impossible WHERE noticed after reading const tables
After reading const tables, MySQL determined the WHERE is impossible.
Impossible HAVING
The HAVING condition is logically impossible.
No matching min/max row
No rows satisfy the condition for MIN()/MAX().
Using union / Using intersect / Using sort_union
Merge strategies for index_merge. union for OR, intersect for AND, sort_union for OR conditions requiring sorting.
Distinct
MySQL stops searching after finding the first matching row.
Start temporary / End temporary
Markers for the DuplicateWeedout semi-join strategy.
FirstMatch(tbl_name)
Semi-join FirstMatch strategy – returns on first match.
LooseScan(m..n)
Semi-join LooseScan strategy.
Rematerialize
Re-materialization of a Lateral Derived Table in MySQL 8.0+.
Warning Signals (Yellow)
Using filesort
MySQL needs an extra sorting pass to satisfy ORDER BY. Despite the misleading name, this may happen in memory. But it means the index cannot satisfy the sort order, requiring extra CPU and memory. Large datasets may spill to disk temp files.
-- Triggers filesort: INDEX idx_status (status)
EXPLAIN SELECT * FROM orders WHERE status='pending' ORDER BY created_at;
-- Extra: Using where; Using filesort
-- Eliminates filesort: INDEX idx_status_created (status, created_at)
EXPLAIN SELECT * FROM orders WHERE status='pending' ORDER BY created_at;
-- Extra: Using where (no filesort)
Using temporary
MySQL needs to create a temporary table. Usually for GROUP BY, DISTINCT, UNION, subqueries. Small temp tables stay in memory; large ones spill to disk.
-- Triggers temp table:
EXPLAIN SELECT DISTINCT status FROM orders ORDER BY created_at;
-- Extra: Using temporary; Using filesort
-- Fix: make ORDER BY and DISTINCT/GROUP BY use the same index
Using join buffer (Block Nested Loop)
The driven table lacks an index (or the optimizer chose not to use one). MySQL uses Block Nested Loop. This is usually a performance signal – add an index for the JOIN condition.
Danger Signals (Red)
Using temporary; Using filesort
Both appearing together is the most serious performance warning. A temp table is created AND sorted. Can cause severe performance issues on large datasets.
Full scan on NULL key
Subquery optimization fallback – when the optimizer cannot use an index, it falls back to a full scan on the subquery.
Range checked for each record
MySQL cannot determine which index to use and re-evaluates for each row. Performance is usually poor.
Other / Informational
Backward index scan
Reverse index scan, reported in MySQL 8.0+. Appears with ORDER BY ... DESC when the index is ascending. A backward scan is usually only slightly slower than a forward one; MySQL 8.0 also supports descending indexes (CREATE INDEX ... (col DESC)) to avoid it entirely.
Zero limit
Query has LIMIT 0; MySQL returns empty result immediately.
No matching rows after partition pruning
No matching partitions after pruning.
Using where with pushed condition
NDB Cluster specific – the condition is pushed down to the data nodes.
Plan isn't ready yet
Appears with EXPLAIN FOR CONNECTION in MySQL 8.0+ – the target connection's plan is not ready yet.
EXPLAIN ANALYZE (MySQL 8.0.18+)
EXPLAIN ANALYZE is the most powerful execution plan tool, introduced in MySQL 8.0.18. Unlike regular EXPLAIN, it actually executes the query and records real execution statistics: actual rows, actual time, loop count.
EXPLAIN ANALYZE SELECT o.*, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending' AND o.total > 100
ORDER BY o.created_at DESC LIMIT 10;
/*
-> Limit: 10 row(s) (cost=2.54 rows=10) (actual time=0.892..0.943 rows=10 loops=1)
-> Nested loop inner join (cost=2.54 rows=10) (actual time=0.889..0.938 rows=10 loops=1)
-> Sort: o.created_at DESC, limit input to 10 row(s) per chunk
-> Filter: ((o.status = 'pending') and (o.total > 100)) (cost=1.25 rows=10) (actual time=0.325..0.678 rows=42 loops=1)
-> Table scan on o (cost=1.25 rows=1000) (actual time=0.312..0.590 rows=1000 loops=1)
-> Single-row index lookup on c using PRIMARY (id=o.customer_id) (cost=0.13 rows=1) (actual time=0.024..0.024 rows=1 loops=10)
*/
How to Read the Output
| Field | Meaning |
|---|---|
| cost=X.XX | Optimizer's estimated cost (lower is better) |
| rows=N (in parens) | Estimated row count |
| actual time=A..B | A = time to first row (ms), B = time to all rows (ms) |
| rows=N (actual) | Actual rows processed |
| loops=N | Times this operation executed (in a nested loop, inner-table loops = outer-table rows) |
Estimated vs Actual Discrepancy Analysis
The greatest value of EXPLAIN ANALYZE is revealing discrepancies between estimates and actuals:
- Estimated rows << actual rows: statistics are stale – run ANALYZE TABLE. The optimizer may have chosen a wrong join order or index.
- Estimated rows >> actual rows: the optimizer overestimated cost, possibly abandoning an index. Try FORCE INDEX to see if it's faster.
- One layer has disproportionately high actual time: that layer is the bottleneck. Focus optimization there.
- High loops value: the operation repeats many times. Even at 0.01 ms per iteration, loops=50000 totals 500 ms.
Warning: EXPLAIN ANALYZE actually executes the query. For write statements (INSERT/UPDATE/DELETE), it really modifies data. Wrap it in a transaction and ROLLBACK, or use it only on SELECTs.
EXPLAIN FORMAT=JSON
JSON format provides much richer information than tabular format, including cost details, used columns, attached conditions, and more. Especially useful for diagnosing complex queries.
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'pending'\G
Key JSON Fields
| Field | Meaning |
|---|---|
| query_cost | Total estimated cost of the entire query; useful for comparing different query formulations |
| cost_info.read_cost | I/O cost of reading data |
| cost_info.eval_cost | CPU cost of processing rows |
| cost_info.prefix_cost | Cumulative cost from query start to the current table |
| used_columns | List of columns actually used – helps design covering indexes |
| attached_condition | Full expression of the filtering condition attached to this table |
| possible_keys | Candidate indexes (same as tabular) |
| key | Chosen index |
| rows_examined_per_scan | Rows examined per scan |
| rows_produced_per_join | Rows produced per join step |
| filtered | Filter percentage (with decimals) |
| using_index | Boolean: whether a covering index is used |
Example JSON Output
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "12.75"
},
"table": {
"table_name": "orders",
"access_type": "ref",
"possible_keys": ["idx_status", "idx_status_created"],
"key": "idx_status_created",
"used_key_parts": ["status"],
"key_length": "42",
"ref": ["const"],
"rows_examined_per_scan": 50,
"rows_produced_per_join": 50,
"filtered": "100.00",
"cost_info": {
"read_cost": "7.75",
"eval_cost": "5.00",
"prefix_cost": "12.75",
"data_read_per_join": "25K"
},
"used_columns": ["id", "customer_id", "status", "total", "created_at"],
"attached_condition": "(`mydb`.`orders`.`status` = 'pending')"
}
}
}
Use case: used_columns lets you precisely design a covering index – include all columns from used_columns in your index to achieve a covering index scan.
EXPLAIN FORMAT=TREE (MySQL 8.0.16+)
TREE format shows the iterator-based execution plan. Unlike traditional row-level EXPLAIN, it shows how data flows between operations, closer to how the database actually executes.
EXPLAIN FORMAT=TREE SELECT o.*, c.name
FROM orders o JOIN customers c ON o.customer_id = c.id
WHERE o.status = 'pending' ORDER BY o.created_at LIMIT 10\G
/*
-> Limit: 10 row(s) (cost=45.30 rows=10)
-> Nested loop inner join (cost=45.30 rows=50)
-> Sort: o.created_at, limit input to 10 row(s) per chunk (cost=27.80 rows=50)
-> Index lookup on o using idx_status (status='pending') (cost=27.80 rows=50)
-> Single-row index lookup on c using PRIMARY (id=o.customer_id) (cost=0.25 rows=1)
*/
Iterator Types
| Iterator | Meaning |
|---|---|
| Table scan | Full table scan |
| Index scan | Full index scan |
| Index lookup | Equality index lookup (ref/eq_ref/const etc.) |
| Index range scan | Index range scan |
| Single-row index lookup | Single-row lookup (const/eq_ref) |
| Nested loop inner join | Nested loop inner join |
| Hash join | Hash join (8.0.18+) |
| Sort | Sort operation (corresponds to Using filesort) |
| Filter | Filter condition (corresponds to Using where) |
| Aggregate | Aggregation operation |
| Materialize | Materialize a subquery/CTE |
| Limit | Limit output rows |
| Stream results | Stream results to the client |
Read TREE format from innermost (deepest indent) to outermost. The innermost level is the data source (table scan or index lookup); data flows upward through filters, sorts, joins, and limits before returning to the client.
Common key_len Quick Reference
| key_len | Common Source |
|---|---|
| 4 | INT NOT NULL |
| 5 | INT NULL (4 + 1 NULL flag) |
| 8 | BIGINT NOT NULL |
| 9 | BIGINT NULL |
| 3 | DATE NOT NULL |
| 5 | DATETIME NOT NULL (MySQL 5.6.4+) |
| 4 | TIMESTAMP NOT NULL |
| 42 | VARCHAR(10) NOT NULL (utf8mb4: 10×4+2) |
| 43 | VARCHAR(10) NULL (utf8mb4: 10×4+2+1) |
| 82 | VARCHAR(20) NOT NULL (utf8mb4: 20×4+2) |
| 202 | VARCHAR(50) NOT NULL (utf8mb4: 50×4+2) |
| 767 | InnoDB single column index prefix max (MySQL 5.6 innodb_large_prefix=OFF) |
| 3072 | InnoDB single column index prefix max (MySQL 5.7+/8.0) |
Real-World EXPLAIN Analysis Examples
Example 1: Full Table Scan โ Add Index
SELECT * FROM orders WHERE customer_email = '[email protected]';
-- EXPLAIN:
+----+--------+--------+------+------+------+------+--------+-------+-------------+
| id | s_type | table | type | p_keys| key | k_len| ref | rows | Extra |
+----+--------+--------+------+------+------+------+--------+-------+-------------+
| 1 | SIMPLE | orders | ALL | NULL | NULL | NULL | NULL | 850000| Using where |
+----+--------+--------+------+------+------+------+--------+-------+-------------+
-- Problem: scanning 850K rows for one email, 3.2 seconds
ALTER TABLE orders ADD INDEX idx_customer_email (customer_email);
-- EXPLAIN after fix:
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
| id | s_type | table | type | possible_keys | key | k_len| ref | rows | Extra |
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
| 1 | SIMPLE | orders | ref | idx_customer_email | idx_customer_email | 1022 | const | 3 | |
+----+--------+--------+------+--------------------+--------------------+------+-------+------+-------+
-- After: 3 rows examined, 0.001 seconds
Example 2: JOIN with type=ALL โ Optimize Join
SELECT o.id, p.name, oi.quantity
FROM orders o
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE o.created_at > '2024-01-01';
-- EXPLAIN:
-- id=1, table=o, type=ALL, rows=850000 (no index on created_at)
-- id=1, table=oi, type=ALL, rows=3200000 (no index on order_id)
-- id=1, table=p, type=eq_ref, key=PRIMARY, rows=1
-- Total rows: 850000 x 3200000 x 1 = 2.7 trillion!
ALTER TABLE orders ADD INDEX idx_created_at (created_at);
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);
-- EXPLAIN after fix:
-- id=1, table=o, type=range, key=idx_created_at, rows=42000
-- id=1, table=oi, type=ref, key=idx_order_id, rows=4
-- id=1, table=p, type=eq_ref, key=PRIMARY, rows=1
-- Total: 42000 x 4 x 1 = 168,000
Example 3: Correlated Subquery โ Rewrite as JOIN
SELECT c.name, (
SELECT COUNT(*) FROM orders WHERE customer_id = c.id
) AS order_count
FROM customers c WHERE c.status = 'active';
-- EXPLAIN:
-- id=1, select_type=PRIMARY, table=c, type=ref, rows=5000
-- id=2, select_type=DEPENDENT SUBQUERY, table=orders, type=ALL, rows=850000
-- Subquery runs 5000 times, each scanning 850K rows!
-- Option A: add index on orders, then LEFT JOIN + GROUP BY
ALTER TABLE orders ADD INDEX idx_customer_id (customer_id);
SELECT c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE c.status = 'active'
GROUP BY c.id, c.name;
-- EXPLAIN:
-- id=1, table=c, type=ref, key=idx_status, rows=5000
-- id=1, table=o, type=ref, key=idx_customer_id, rows=4
-- Total: 5000 x 4 = 20,000
Example 4: filesort โ Add Sort-Covering Index
SELECT id, status, created_at, total
FROM orders WHERE status = 'pending'
ORDER BY created_at DESC LIMIT 20;
-- EXPLAIN:
-- type=ref, key=idx_status, rows=42000, Extra: Using filesort
-- Problem: index finds 42K rows, then sorts ALL of them, then takes 20
-- Create composite index covering both WHERE and ORDER BY
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);
-- EXPLAIN:
-- type=ref, key=idx_status_created, rows=20, Extra: Backward index scan
-- MySQL scans index backwards (DESC), takes first 20 rows, no filesort needed
-- Further optimization: create descending index
ALTER TABLE orders ADD INDEX idx_status_created_desc (status, created_at DESC);
Example 5: Derived Table Materialization โ CTE or Lateral Join
SELECT c.name, recent.last_order_date, recent.total
FROM customers c
JOIN (
SELECT customer_id, MAX(created_at) AS last_order_date, SUM(total) AS total
FROM orders GROUP BY customer_id
) AS recent ON c.id = recent.customer_id
WHERE c.status = 'vip';
-- EXPLAIN:
-- id=1, table=c, type=ref, rows=100
-- id=1, table=<derived2>, type=ref, rows=10
-- id=2, select_type=DERIVED, table=orders, type=ALL, rows=850000
-- Problem: derived table materializes aggregation of all 850K rows
SELECT c.name, recent.last_order_date, recent.total
FROM customers c,
LATERAL (
SELECT MAX(created_at) AS last_order_date, SUM(total) AS total
FROM orders WHERE customer_id = c.id
) AS recent
WHERE c.status = 'vip';
-- Or use CTE + window function (8.0+):
WITH ranked AS (
SELECT customer_id, created_at, total,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
FROM orders
)
SELECT c.name, r.created_at AS last_order_date, r.total
FROM customers c JOIN ranked r ON c.id = r.customer_id AND r.rn = 1
WHERE c.status = 'vip';
Common Anti-Patterns Detected by EXPLAIN
Below are common performance anti-patterns detectable through EXPLAIN output. Each includes detection criteria and fix suggestions.
Anti-Pattern 1: Full Table Scan on Large Table
| Attribute | Value |
|---|---|
| Detection | type=ALL AND rows > 1000 |
| Impact | O(N) complexity, slower as table grows. Can take tens of seconds on million-row tables. |
| Fix | Create indexes on columns in WHERE, JOIN, ORDER BY. Check for functions on columns preventing index use. Check for implicit type conversions. |
Anti-Pattern 2: Temporary Table + Filesort
| Attribute | Value |
|---|---|
| Detection | Extra contains both Using temporary AND Using filesort |
| Impact | Materializes intermediate results and sorts them. Large data may spill to disk, causing severe slowdown. |
| Fix | Ensure GROUP BY / ORDER BY columns are in the same index. If possible, share a composite index prefix across WHERE + GROUP BY + ORDER BY. |
Anti-Pattern 3: Rows Estimate >> Actual Result Set
| Attribute | Value |
|---|---|
| Detection | rows very high but filtered very low (e.g., rows=100000, filtered=0.10) |
| Impact | Many rows read then immediately discarded – wasted I/O and CPU. |
| Fix | Add more precise indexes covering filter columns in WHERE. Consider composite indexes. |
Anti-Pattern 4: possible_keys Exists but key is NULL
| Attribute | Value |
|---|---|
| Detection | possible_keys != NULL AND key = NULL |
| Impact | Index available but optimizer chose full scan instead. |
| Fix | 1) Run ANALYZE TABLE to refresh statistics; 2) Check index selectivity (SELECT COUNT(DISTINCT col)/COUNT(*)); 3) Try FORCE INDEX; 4) Check whether SELECT * makes table lookups too expensive – select only the needed columns. |
Anti-Pattern 5: DEPENDENT SUBQUERY Per-Row Execution
| Attribute | Value |
|---|---|
| Detection | select_type = DEPENDENT SUBQUERY |
| Impact | Subquery executes once per outer row. 100K outer rows × 500K subquery scan = 50 billion row checks. |
| Fix | Rewrite as JOIN or EXISTS (let optimizer auto-convert to semi-join). For scalar subqueries (in SELECT list), rewrite as LEFT JOIN + aggregation. |
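A before/after sketch for the scalar-subquery case (schema is illustrative):

```sql
-- Before: select_type = DEPENDENT SUBQUERY, runs once per customer row
SELECT c.name,
       (SELECT MAX(o.created_at)
        FROM orders o
        WHERE o.customer_id = c.id) AS last_order
FROM customers c;

-- After: aggregate orders once, then a single LEFT JOIN
SELECT c.name, t.last_order
FROM customers c
LEFT JOIN (SELECT customer_id, MAX(created_at) AS last_order
           FROM orders
           GROUP BY customer_id) t ON t.customer_id = c.id;
```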
Anti-Pattern 6: Functions on Indexed Columns
| Attribute | Value |
|---|---|
| Detection | Indexed column in WHERE, but type=ALL and key=NULL |
| Impact | Index completely disabled, full table scan. |
| Fix | Avoid WHERE YEAR(date_col) = 2024; rewrite as WHERE date_col >= '2024-01-01' AND date_col < '2025-01-01'. Or use functional indexes in MySQL 8.0+: ALTER TABLE t ADD INDEX ((YEAR(date_col))). |
Anti-Pattern 7: Implicit Type Conversion
| Attribute | Value |
|---|---|
| Detection | Similar to anti-pattern 6 โ index exists but not used |
| Typical | Column is VARCHAR but WHERE uses a number: WHERE phone = 13800138000. MySQL CASTs every row, index disabled. |
| Fix | Ensure type match: WHERE phone = '13800138000'. |
Anti-Pattern 8: JOIN Multiplier Explosion
| Attribute | Value |
|---|---|
| Detection | Product of rows across multiple JOINed tables is very large |
| Impact | Three-table JOIN with 10000 rows each = 10^12 combinations (without indexes) |
| Fix | Add indexes on the driven side of each JOIN condition. Ensure the driving table has the smallest result set. Use STRAIGHT_JOIN to control order if needed. |
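A sketch of the fix, assuming hypothetical orders, order_items, and products tables:

```sql
-- Make sure every driven table can be probed by index
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);
-- (products.id is usually already the primary key)

-- Force the JOIN order if the optimizer picks a bad driving table
SELECT STRAIGHT_JOIN o.id, p.name
FROM orders o                          -- smallest filtered set drives
JOIN order_items i ON i.order_id = o.id
JOIN products p    ON p.id = i.product_id
WHERE o.status = 'pending';
```

With indexes on the JOIN conditions, each probe is a constant-time lookup instead of a scan, so the row-count product never materializes.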
Optimizer Trace: Inside the Optimizer's Decision Process
When EXPLAIN tells you "what the optimizer chose" but not "why it chose it," optimizer_trace is the tool you need. It records the optimizer's complete decision chain from receiving the SQL to generating the final execution plan.
-- Enable optimizer_trace
SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576; -- 1MB
-- Execute the query you want to analyze
SELECT * FROM orders WHERE status = 'pending' ORDER BY created_at LIMIT 10;
-- View the complete optimizer decision process
SELECT * FROM information_schema.OPTIMIZER_TRACE\G
-- Disable trace (avoid overhead)
SET optimizer_trace = 'enabled=off';
Three Phases of Trace Output
1. join_preparation
SQL parsing and preprocessing. Here you see: whether subqueries are converted to semi-joins, whether views/derived tables are merged or materialized, equality propagation, and other optimization transformations.
2. join_optimization
This is the core phase. Contains:
- condition_processing: condition simplification (constant folding, removing redundant conditions).
- table_dependencies: dependency relationships between tables.
- ref_optimizer_key_uses: candidate ref access methods.
- rows_estimation: row count estimation and index selection per table. Here you see why an index was chosen or abandoned, with the complete cost-comparison numbers.
- considered_execution_plans: all considered JOIN orders and their costs.
- attaching_conditions_to_tables: which conditions are assigned to which tables.
3. join_execution
Execution phase information (usually minimal).
Practical: Using Trace to Diagnose Index Selection Issues
-- Query: optimizer has two candidate indexes but chose the "wrong" one
-- Look in the rows_estimation section of the trace:
"rows_estimation": [
{
"table": "`orders`",
"range_analysis": {
"table_scan": {
"rows": 850000,
"cost": 87412.1 -- full scan cost
},
"potential_range_indexes": [
{
"index": "idx_status",
"usable": true,
"key_parts": ["status"]
},
{
"index": "idx_created_at",
"usable": true,
"key_parts": ["created_at"]
}
],
"analyzing_range_alternatives": {
"range_scan_alternatives": [
{
"index": "idx_status",
"ranges": ["pending <= status <= pending"],
"rows": 425000, -- 50% of rows
"cost": 148750, -- more expensive than full scan!
"chosen": false,
"cause": "cost"
},
{
"index": "idx_created_at",
"ranges": ["2024-01-01 <= created_at"],
"rows": 42000,
"cost": 50400,
"chosen": true
}
]
}
}
}
]
The trace clearly shows: idx_status matches 50% of rows (too low selectivity), cost 148,750 is even higher than full scan's 87,412. Meanwhile idx_created_at matches only 42,000 rows at cost 50,400, so it was chosen.
EXPLAIN FOR CONNECTION: Diagnose Running Queries
When you see a long-running query in SHOW PROCESSLIST, use EXPLAIN FOR CONNECTION to view its execution plan without copying the SQL and re-executing it.
-- Step 1: find the problem query's connection ID
SELECT id, time, state, LEFT(info, 100) AS query
FROM information_schema.PROCESSLIST
WHERE command = 'Query' AND time > 5
ORDER BY time DESC;
-- Step 2: get the execution plan
EXPLAIN FOR CONNECTION 12345;
-- JSON format for more detail
EXPLAIN FORMAT=JSON FOR CONNECTION 12345;
Use Cases
- Production diagnosis: inspect a slow query's plan directly in production without copying the SQL and re-executing it.
- Prepared statements: parameters are already bound, so EXPLAIN FOR CONNECTION shows the actual execution plan with bound values.
- Long transaction analysis: combine with performance_schema.events_statements_current for fuller context.
Limitations: (1) requires the PROCESS privilege; (2) if the target query is in the "Sending data" state, the plan may be incomplete; (3) very short queries may finish before you can run EXPLAIN.
Histogram Statistics and EXPLAIN Accuracy (MySQL 8.0+)
MySQL 8.0 introduced Column Histograms, significantly improving the accuracy of the filtered column estimate, especially for columns with skewed data distribution.
Why Histograms Are Needed
Traditional index statistics only record column cardinality (number of distinct values). This works for uniformly distributed data, but real data is often heavily skewed. For example, order statuses on an e-commerce platform:
-- Data distribution:
-- completed: 85% (850,000 rows)
-- shipped: 10% (100,000)
-- pending: 4% (40,000)
-- cancelled: 1% (10,000)
-- Without histogram, optimizer assumes uniform distribution:
-- Rows per status = 1,000,000 / 4 = 250,000
-- This causes severe overestimation for WHERE status = 'cancelled'
-- Estimated 250K, actual 10K
Creating and Using Histograms
-- Create histogram
ANALYZE TABLE orders UPDATE HISTOGRAM ON status WITH 100 BUCKETS;
ANALYZE TABLE orders UPDATE HISTOGRAM ON total WITH 256 BUCKETS;
-- View histogram info
SELECT SCHEMA_NAME, TABLE_NAME, COLUMN_NAME,
       HISTOGRAM->>'$."histogram-type"' AS type,
       JSON_LENGTH(HISTOGRAM->'$.buckets') AS buckets
FROM information_schema.COLUMN_STATISTICS;
-- Drop histogram
ANALYZE TABLE orders DROP HISTOGRAM ON status;
Note: Histograms are independent of indexes. Even without an index on the column, histograms help the optimizer estimate filtered more accurately. Two histogram types:
- Singleton: used when distinct values <= buckets. Each bucket stores an exact value and its frequency.
- Equi-height: used when distinct values > buckets. Each bucket stores a value range and cumulative frequency.
EXPLAIN Optimization Decision Flowchart
When facing EXPLAIN output, follow this flowchart to quickly locate issues:
Step 1: Check the type Column
- type = ALL and rows > 1000? → Optimize immediately: add an index or rewrite the query.
- type = index without Using index? → Not a covering index; may need optimization.
- type is ref / eq_ref / const / range? → Usually acceptable; check the other columns.
Step 2: Check key and possible_keys
- possible_keys = NULL? → No usable index; create one.
- possible_keys has values but key = NULL? → Index exists but unused; check selectivity and statistics.
- Wrong index used? → Test with FORCE INDEX and check optimizer_trace for the reason.
Step 3: Check rows and filtered
- rows very high? → Index selectivity may be insufficient; consider a more precise composite index.
- filtered very low (< 20%)? → Many rows are read then discarded; consider adding the filter columns to the index.
- Product of rows across JOINed tables is large? → Check indexes on the JOIN conditions; consider adjusting the JOIN order.
Step 4: Check Extra
- Using filesort? → Create a composite index covering WHERE + ORDER BY.
- Using temporary? → Ensure GROUP BY/DISTINCT columns are indexed, or align them with the ORDER BY columns.
- Using index? → Covering index; good performance.
- Using index condition? → ICP is working; generally a good signal.
Step 5: Check select_type
- DEPENDENT SUBQUERY? → Highest priority: rewrite as a JOIN.
- DERIVED with high rows? → Consider a CTE or a lateral join (MySQL 8.0.14+).
Step 6: Validate with EXPLAIN ANALYZE
- Compare estimated vs. actual rows; large gaps indicate stale statistics.
- Find the node with the highest actual time; that is the bottleneck.
- Watch the loops value: total actual time = time per loop × loops.
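A typical EXPLAIN ANALYZE node carries all three signals at once. The numbers below are illustrative and the exact formatting varies by MySQL version:

```sql
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
-- -> Filter: (orders.`status` = 'pending')
--      (cost=85000 rows=250000)                       <- optimizer estimate
--      (actual time=0.081..950.2 rows=40000 loops=1)  <- measured reality
--
-- Estimated 250000 vs. actual 40000 rows: statistics are stale or the
-- distribution is skewed (consider ANALYZE TABLE or a histogram).
-- A node's total contribution = its actual time per loop * loops.
```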
Using Performance Schema with EXPLAIN
EXPLAIN tells you the plan; Performance Schema tells you the actual runtime behavior. Combining both is the best practice for diagnosing complex performance issues.
Finding the Slowest Query Patterns
-- Top 10 query patterns by total execution time
SELECT DIGEST_TEXT,
COUNT_STAR AS exec_count,
ROUND(SUM_TIMER_WAIT/1e12, 2) AS total_sec,
ROUND(AVG_TIMER_WAIT/1e12, 4) AS avg_sec,
SUM_ROWS_EXAMINED,
SUM_ROWS_SENT,
ROUND(SUM_ROWS_EXAMINED/NULLIF(SUM_ROWS_SENT,0), 1) AS examine_to_send_ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'your_db'
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
Key Metrics Explained
| Metric | Meaning | EXPLAIN Correlation |
|---|---|---|
| SUM_ROWS_EXAMINED / SUM_ROWS_SENT | Ratio of rows examined to rows returned | A ratio >> 1 means many rows are read then discarded; corresponds to high rows with low filtered in EXPLAIN. |
| SUM_SORT_ROWS | Number of rows sorted | A high value corresponds to Using filesort in EXPLAIN. |
| SUM_CREATED_TMP_TABLES | In-memory temp tables created | Corresponds to Using temporary. |
| SUM_CREATED_TMP_DISK_TABLES | On-disk temp tables created | Much worse than in-memory; indicates tmp_table_size is too small or the data volume is too large. |
| SUM_NO_INDEX_USED | Number of executions that used no index | Corresponds to type=ALL in EXPLAIN. |
Workflow Example
-- 1. Find query with highest examine/send ratio from Performance Schema
SELECT DIGEST_TEXT, SUM_ROWS_EXAMINED, SUM_ROWS_SENT,
ROUND(SUM_ROWS_EXAMINED/NULLIF(SUM_ROWS_SENT,0)) AS ratio
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = 'mydb' AND SUM_ROWS_SENT > 0
ORDER BY ratio DESC LIMIT 5;
-- 2. After finding the problem query, analyze with EXPLAIN
EXPLAIN FORMAT=JSON SELECT ...;
-- 3. After optimization, check Performance Schema again to confirm improvement
-- (Note: TRUNCATE the stats table first to start fresh)
TRUNCATE performance_schema.events_statements_summary_by_digest;
Index Design Mistakes Revealed by EXPLAIN
Mistake 1: Wrong Column Order in Composite Index
-- Index: INDEX (created_at, status)
-- Query:
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01';
-- EXPLAIN shows key_len covers only created_at (columns after range condition cannot be used for exact match)
-- Correct order: INDEX (status, created_at):
-- Equality columns first, range columns last
Mistake 2: Redundant Indexes
-- These indexes are redundant:
INDEX idx_a (a) -- covered by idx_ab
INDEX idx_ab (a, b)
INDEX idx_abc (a, b, c) -- idx_ab is also covered by idx_abc
-- Check redundant indexes:
SELECT * FROM sys.schema_redundant_indexes WHERE table_schema = 'mydb';
Mistake 3: Over-Indexing
Every index adds cost to write operations (INSERT/UPDATE/DELETE must maintain all indexes). Use EXPLAIN to check which indexes never appear in the key column; they may be useless.
-- Find indexes never used (MySQL 5.6+):
SELECT object_schema, object_name, index_name
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE index_name IS NOT NULL
AND count_star = 0
AND object_schema = 'mydb'
ORDER BY object_name;
Mistake 4: Ignoring Charset Impact on key_len
The same VARCHAR(255) has key_len = 257 (255+2) in latin1 but key_len = 1022 (255×4+2) in utf8mb4. This directly impacts:
- Entries per index page (larger key_len = fewer entries = taller B-tree).
- Maximum columns and total length of composite indexes (InnoDB max 3072 bytes).
- Memory usage per row during in-memory sorts.
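The effect is easy to verify with two throwaway tables. The schema below is illustrative; NOT NULL keeps key_len free of the extra null-flag byte:

```sql
CREATE TABLE t_latin1 (
  name VARCHAR(255) CHARACTER SET latin1 NOT NULL,
  KEY idx_name (name)
);
CREATE TABLE t_utf8mb4 (
  name VARCHAR(255) CHARACTER SET utf8mb4 NOT NULL,
  KEY idx_name (name)
);

EXPLAIN SELECT name FROM t_latin1  WHERE name = 'x';  -- key_len = 257  (255*1 + 2)
EXPLAIN SELECT name FROM t_utf8mb4 WHERE name = 'x';  -- key_len = 1022 (255*4 + 2)
```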
EXPLAIN Quick Reference Cheat Sheet
type Column Quick Reference
| type | One-Line Description | Target |
|---|---|---|
| system/const | PK/UK exact single-row lookup | Best |
| eq_ref | PK/UK one-to-one JOIN lookup | Best for JOIN |
| ref | Non-unique index equality | Good |
| range | Index range scan | Acceptable |
| index | Full index scan (OK if covering) | Evaluate |
| ALL | Full table scan; slow on large tables | Must optimize |
Extra Column Quick Reference
| Extra | Signal | Action |
|---|---|---|
| Using index | Good | Covering index; no action needed |
| Using index condition | Good | ICP working |
| Using where | Neutral | Evaluate together with type and rows |
| Using filesort | Warning | Consider a composite index for the sort |
| Using temporary | Warning | Optimize GROUP BY / DISTINCT |
| Using join buffer | Attention | Add an index on the JOIN condition |
| Using temporary + Using filesort | Danger | Highest-priority fix |
EXPLAIN Commands Quick Reference
-- Basic usage
EXPLAIN SELECT ...;
-- JSON format (more detail)
EXPLAIN FORMAT=JSON SELECT ...;
-- TREE format (iterator view, 8.0.16+)
EXPLAIN FORMAT=TREE SELECT ...;
-- Actually execute and collect stats (8.0.18+)
EXPLAIN ANALYZE SELECT ...;
-- Analyze running query
EXPLAIN FOR CONNECTION <connection_id>;
-- View optimizer-rewritten SQL
EXPLAIN SELECT ...; SHOW WARNINGS;
-- Refresh table statistics
ANALYZE TABLE table_name;
-- Create histogram (8.0+)
ANALYZE TABLE t UPDATE HISTOGRAM ON col WITH 100 BUCKETS;
FAQ
Does EXPLAIN actually execute the query?
Regular EXPLAIN and EXPLAIN FORMAT=JSON/TREE do not execute the query; they only have the optimizer generate the plan. The sole exception is EXPLAIN ANALYZE (MySQL 8.0.18+), which actually executes the query to collect real runtime statistics. For write statements (INSERT/UPDATE/DELETE), EXPLAIN ANALYZE really modifies data, so run it inside a transaction you can roll back.
Note: even with regular EXPLAIN, queries with derived tables (DERIVED) may require materialization to generate the plan, consuming some resources.
Why are EXPLAIN's row estimates inaccurate?
rows is based on index statistics (cardinality), which are approximate values from random sampling. Common reasons for inaccuracy:
- Statistics are stale after heavy DML: run ANALYZE TABLE to refresh them.
- InnoDB samples a limited number of pages (default innodb_stats_persistent_sample_pages=20): increase the sample pages.
- The data distribution is highly skewed (e.g., 99% of status is 'active'): histograms (MySQL 8.0: ANALYZE TABLE t UPDATE HISTOGRAM ON col) improve the estimates.
- Multi-column conditions assume column independence, while the actual data may be highly correlated.
What's the difference between type=index and type=ALL? Which is faster?
Both are full scans, but scan different things:
- type=ALL: full table scan. Reads every row from the clustered index (data pages); each page contains complete row data.
- type=index: full index scan. Reads every entry from a secondary index; index entries are much smaller than data rows.
With a covering index (Extra: Using index), type=index is significantly faster since it reads only the smaller index pages. Without a covering index (no Using index), type=index may actually be slower: the per-entry table lookups cause heavy random I/O.
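A quick way to observe the difference on a hypothetical orders table with an index on customer_id:

```sql
-- Non-covering: SELECT * would need a table lookup per index entry,
-- so the optimizer often prefers type=ALL (plus a filesort) instead
EXPLAIN SELECT * FROM orders ORDER BY customer_id;

-- Covering: the query touches only the indexed column
EXPLAIN SELECT customer_id FROM orders;
-- expect type=index with Extra: Using index, reading only index pages
```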
How can I analyze a slow query without executing it?
Use regular EXPLAIN (without ANALYZE). It generates the plan without executing:
EXPLAIN FORMAT=JSON SELECT ... your slow query ...;
If you need actual timing but worry that the query is too slow to run:
- Use EXPLAIN ANALYZE in a test environment.
- Add a LIMIT, then use EXPLAIN ANALYZE.
- Use optimizer_trace for the full optimizer decision process: SET optimizer_trace='enabled=on'; SELECT ...; SELECT * FROM information_schema.optimizer_trace;
EXPLAIN shows index is used but query is still slow โ why?
Index usage does not guarantee good performance. Common reasons:
- High lookup cost: type=range scans 100K index entries, each requiring a table lookup, i.e., 100K random I/Os. Fix: use a covering index.
- Low selectivity: the index returns too many rows (e.g., a gender column with only M/F values). Check the rows column.
- Sort bottleneck: the index serves the WHERE, but the ORDER BY still needs a filesort. Check the Extra column.
- Lock waits: the query itself is fast but is waiting for locks. Check SHOW ENGINE INNODB STATUS.
- Network transfer: the query returns massive amounts of data (SELECT * on millions of rows). Reduce the columns and rows returned.
Can query_cost from EXPLAIN FORMAT=JSON be compared across different queries?
To some extent, but with caveats:
- query_cost is a dimensionless number from the optimizer's internal cost model, not milliseconds or I/O counts.
- Different formulations of the same query can be compared directly by query_cost; lower cost is usually faster.
- Comparing across different queries has limited meaning: a query returning 1 row may have a higher cost than one returning 1M rows yet execute faster.
- The cost model takes its parameters from the mysql.server_cost and mysql.engine_cost tables (memory vs. disk access costs); they are tunable but rarely need changing.
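To see where the number lives, read cost_info in the JSON output. The query and cost value below are illustrative:

```sql
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE status = 'pending';
-- Excerpt of the output:
-- "query_block": {
--   "cost_info": { "query_cost": "87412.10" },
--   ...
-- }
-- Re-run this after each rewrite of the same query and compare
-- the query_cost values; lower is usually (not always) faster.
```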
What are the differences in EXPLAIN between MySQL 5.7 and 8.0?
MySQL 8.0 adds significant EXPLAIN enhancements:
- FORMAT=TREE (8.0.16+): new iterator-based tree execution plan format.
- EXPLAIN ANALYZE (8.0.18+): actual execution with runtime statistics.
- Hash Join (8.0.18+): new JOIN algorithm, shown as "hash join" in TREE format or "Using join buffer (hash join)" in Extra.
- EXPLAIN FOR CONNECTION (available since 5.7, improved in 8.0): view the plan of a running query.
- Skip Scan (8.0.13+): "Using index for skip scan" in Extra.
- Descending indexes (8.0+): INDEX (col DESC) avoids the cost of a Backward index scan.
- Histograms (8.0+): improve the accuracy of the filtered estimate.
How to view the execution plan of a query running in another connection?
Use EXPLAIN FOR CONNECTION (MySQL 5.7+):
-- 1. Find the target connection ID
SHOW PROCESSLIST;
-- or
SELECT * FROM performance_schema.processlist WHERE INFO IS NOT NULL;
-- 2. View its execution plan
EXPLAIN FOR CONNECTION 42;
EXPLAIN FORMAT=JSON FOR CONNECTION 42;
Note: if the query finishes too quickly, EXPLAIN FOR CONNECTION may return an error. This feature is best for analyzing long-running queries.
EXPLAIN results differ when run at different times โ is this normal?
Completely normal. EXPLAIN results are influenced by:
- Index statistics: InnoDB statistics are approximate, based on sampling. After background resampling (or running ANALYZE TABLE), the rows and filtered estimates may change.
- Data volume changes: as data is inserted, updated, and deleted, changes in table size may lead the optimizer to choose different paths.
- Buffer Pool state: EXPLAIN itself is unaffected by caching (it does not execute), but EXPLAIN ANALYZE's actual time varies with whether the data is in memory.
- Optimizer hints and variables: session-level optimizer_switch settings affect the plan.
- MySQL version upgrades: different versions have different optimizer capabilities; plans may change after upgrading.
If a production execution plan suddenly degrades, first run ANALYZE TABLE to refresh statistics, then re-EXPLAIN.