Top Interview Questions
MySQL Interview Questions Bank
This chapter covers 100+ high-frequency MySQL interview questions organized by topic. Each question includes a detailed answer and interviewer perspective notes. Difficulty: Basic Mid Hard.
Study tip: Try to answer each question yourself before reading the solution. During interviews, don't just recite conclusions — explain the "why" and edge cases.
I. Indexing (25 Questions)
BasicQ1. What is the underlying data structure of MySQL indexes? Why B+Tree over B-Tree or hash?
Answer: InnoDB uses B+Tree. Key differences:
- vs Hash: Hash gives O(1) equality lookup but cannot support range queries, sorting, or prefix matching. MySQL Memory engine uses hash; InnoDB does not.
- vs B-Tree: B-Tree nodes store data too, meaning fewer keys per node, taller tree, more I/O. B+Tree internal nodes only store keys — with a 16KB page, ~1000+ keys fit, so 3-4 levels cover billions of rows.
- Leaf linked list: B+Tree leaves are doubly linked, so range scans just find the start and scan forward — extremely efficient.
Interviewer expects: InnoDB page = 16KB, tree height = log_m(N), why disk I/O determines index structure choice for databases.
BasicQ2. What is the difference between clustered and secondary indexes? What is a "table lookup" (回表)?
Clustered Index: InnoDB stores row data together with the primary key in one B+Tree. Leaf nodes contain the full row. Each table has exactly one clustered index.
Secondary Index: Leaf nodes store the primary key value, not the row data.
Table lookup (回表): After finding the PK via a secondary index, fetching the full row from the clustered index. If EXPLAIN shows Using index (covering index), no lookup needed.
-- Requires table lookup: age is secondary index but we need name
SELECT name FROM users WHERE age = 25;
-- Covering index, no lookup needed: INDEX(age, name)
SELECT name FROM users WHERE age = 25;
MidQ3. What is the leftmost prefix rule for composite indexes?
A composite index INDEX(a, b, c) sorts rows first by a, then b, then c. Queries must match from the left:
| Query Condition | Index Used? |
|---|---|
| WHERE a=1 | ✅ Uses a |
| WHERE a=1 AND b=2 | ✅ Uses a,b |
| WHERE a=1 AND b=2 AND c=3 | ✅ Uses a,b,c |
| WHERE a=1 AND c=3 | ⚠️ Uses a only (b gap breaks c) |
| WHERE b=2 | ❌ Missing a prefix |
| WHERE a=1 AND b>2 AND c=3 | ⚠️ Uses a,b only (range stops at b) |
| WHERE a=1 ORDER BY b | ✅ Uses a,b and avoids filesort |
MidQ4. List 10 common index failure scenarios
- Function on index column:
WHERE YEAR(create_time) = 2024→ rewrite as range - Implicit type conversion: VARCHAR column compared with integer literal
- Leading wildcard LIKE:
WHERE name LIKE '%Zhang'(trailing wildcard'Zhang%'can use index) - OR with non-indexed column:
WHERE a=1 OR b=2when b has no index - NOT IN / NOT EXISTS: optimizer often chooses full scan
- Range breaks composite index:
(a,b,c)—WHERE a=1 AND b>2 AND c=3can't use c - Charset/collation mismatch: JOIN columns with different charsets trigger conversion
- IS NULL / IS NOT NULL: may not use index for low-selectivity columns
- Optimizer abandons index: estimated rows >30% of table, full scan seems cheaper
- Arithmetic on indexed column:
WHERE id + 1 = 100vsWHERE id = 99
II. Transactions & Locks (25 Questions)
BasicQ5. What are the 4 transaction isolation levels? Which is MySQL's default?
| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| READ UNCOMMITTED | ✅ yes | ✅ yes | ✅ yes |
| READ COMMITTED | ❌ | ✅ yes | ✅ yes |
| REPEATABLE READ (default) | ❌ | ❌ | ⚠️ mostly prevented |
| SERIALIZABLE | ❌ | ❌ | ❌ |
MySQL defaults to REPEATABLE READ. InnoDB uses Next-Key Lock to prevent most phantom reads in RR, but mixing snapshot reads with current reads (SELECT FOR UPDATE) can still produce phantom-read-like behavior.
MidQ6. What is MVCC? How does InnoDB implement it?
MVCC (Multi-Version Concurrency Control): maintains multiple row versions so reads see a consistent snapshot without blocking writes.
Implementation:
- Hidden columns: Each row has
DB_TRX_ID(last modifying transaction ID) andDB_ROLL_PTR(pointer to Undo Log chain) - Read View: Created at snapshot read time; records currently active transaction IDs. Used to determine which row version is visible.
- Undo Log: Chain of old row versions, traversed when an older version is needed
RC creates a new Read View on each SELECT. RR creates one Read View at transaction start and reuses it.
III. Query Optimization (20 Questions)
MidQ7. Why is deep pagination (LIMIT 1000000, 10) slow? How to fix it?
LIMIT 1000000, 10 must scan and discard 1M rows even with an index (1M index lookups → 1M table lookups).
Fix 1 — Deferred join:
SELECT t.* FROM t
INNER JOIN (
SELECT id FROM t ORDER BY create_time LIMIT 1000000, 10
) sub ON t.id = sub.id;
Fix 2 — Cursor pagination (recommended):
SELECT * FROM t
WHERE (create_time, id) > ('2024-01-15 10:30:00', 99980)
ORDER BY create_time, id LIMIT 10;
Fix 2 is O(log n) but requires frontend to pass cursor, no random page jumping.
HardQ8. How to prevent overselling inventory under high concurrency?
Simplest — WHERE guard:
UPDATE inventory SET stock = stock - 1
WHERE item_id = 100 AND stock > 0;
-- InnoDB row lock serializes updates; stock > 0 prevents negatives
Optimistic lock:
UPDATE inventory
SET stock = stock - 1, version = version + 1
WHERE item_id = 100 AND version = :old_version AND stock > 0;
-- Retry if affected rows = 0
Redis pre-decrement (recommended for massive scale): Atomic Redis DECR → success triggers async message queue → consumer writes to MySQL.