Chapter 15

Locking & Deadlock Diagnosis

Chapter 3: MySQL Locks & Deadlocks Complete Guide

  Level: Advanced
  Reading time ~75 min
  Based on MySQL 8.0 / 8.4

Locks are the cornerstone of concurrency control and a frequent source of performance issues and outages. This chapter starts with a visual map of all InnoDB lock types, then dissects Record Lock, Gap Lock, Next-Key Lock, and Insert Intention Lock internals with a complete compatibility matrix. Through 15+ SQL locking scenario analyses, we show the precise locking behavior of different statements under each isolation level. We deep-dive into the deadlock detection algorithm (wait-for graph), walk through reading SHOW ENGINE INNODB STATUS, and provide an interactive deadlock simulator for reproducing common deadlocks in your browser. Finally we cover MDL metadata locks, hot row optimization patterns, and real-world production incident case studies.

1. InnoDB Lock Types Visual Map

1.1 Lock Type Hierarchy

InnoDB's locking system can be understood along two dimensions: granularity and mode. The ASCII diagram below shows all lock types and their hierarchy:


┌─────────────────────────────────────────────────────────────────────────────┐
│                         InnoDB Lock Taxonomy                               │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─── By Granularity ──────────────────────────────────────────────────┐   │
│  │                                                                      │   │
│  │   Table-Level                      Row-Level                        │   │
│  │   ┌──────────────────┐             ┌──────────────────────────┐     │   │
│  │   │ Table Lock (LOCK  │             │ Record Lock              │     │   │
│  │   │  TABLES ... )     │             │  - locks single index rec │     │   │
│  │   ├──────────────────┤             ├──────────────────────────┤     │   │
│  │   │ Intention Lock    │             │ Gap Lock                 │     │   │
│  │   │  - IS (Intent     │             │  - locks gap before rec  │     │   │
│  │   │    Shared)        │             ├──────────────────────────┤     │   │
│  │   │  - IX (Intent     │             │ Next-Key Lock            │     │   │
│  │   │    Exclusive)     │             │  = Record + Gap          │     │   │
│  │   ├──────────────────┤             │  (default in RR)         │     │   │
│  │   │ Auto-Inc Lock     │             ├──────────────────────────┤     │   │
│  │   │  (AUTO_INCREMENT) │             │ Insert Intention Lock    │     │   │
│  │   └──────────────────┘             │  - special gap lock for  │     │   │
│  │                                     │    concurrent INSERTs    │     │   │
│  │                                     └──────────────────────────┘     │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
│  ┌─── By Mode ─────────────────────────────────────────────────────────┐   │
│  │                                                                      │   │
│  │   Shared (S)                        Exclusive (X)                    │   │
│  │   ┌──────────────────┐             ┌──────────────────────────┐     │   │
│  │   │ SELECT ... FOR    │             │ SELECT ... FOR UPDATE     │     │   │
│  │   │  SHARE / LOCK IN  │             │ UPDATE ...                │     │   │
│  │   │  SHARE MODE       │             │ DELETE ...                │     │   │
│  │   │ Multiple holders  │             │ Single holder only        │     │   │
│  │   │  allowed          │             │                           │     │   │
│  │   └──────────────────┘             └──────────────────────────┘     │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘

Source ref: storage/innobase/include/lock0types.h, storage/innobase/lock/lock0lock.cc

1.2 Table Locks vs Row Locks vs Intention Locks

Table locks have the coarsest granularity and lock the entire table. Row locks have the finest granularity and lock specific records in an index. Intention locks are lightweight table-level locks that coordinate between table locks and row locks.

Lock Type	Granularity	How Acquired	Purpose
`LOCK TABLES ... READ`	Table-level S	Explicit LOCK TABLES	Block writes, allow reads
`LOCK TABLES ... WRITE`	Table-level X	Explicit LOCK TABLES	Block all reads and writes
IS (Intention Shared)	Table-level	Auto-acquired before row S-lock	Declare intent to S-lock some rows
IX (Intention Exclusive)	Table-level	Auto-acquired before row X-lock	Declare intent to X-lock some rows
AUTO-INC	Table-level	INSERT	Guarantee unique monotonic AUTO_INCREMENT

1.3 S/X Lock Compatibility Matrix (Table-level)

Intention locks never conflict with each other (they are merely "declarations of intent"), but they do conflict with table-level S/X locks:

Held ↓ \ Requested →	X	IX	S	IS
X	N	N	N	N
IX	N	Y	N	Y
S	N	N	Y	Y
IS	N	Y	Y	Y

Key Insight: IX is compatible with IX -- that is why multiple transactions can simultaneously hold exclusive row locks on different rows of the same table. Without intention locks, MySQL would need to scan every row for row locks when executing LOCK TABLES ... WRITE; intention locks make this check O(1).

1.4 AUTO-INC Lock and innodb_autoinc_lock_mode

The AUTO-INC lock is a special table-level lock that ensures unique monotonic AUTO_INCREMENT values. innodb_autoinc_lock_mode controls its behavior:

Mode	Value	Behavior	Use Case
Traditional	0	Table-level AUTO-INC lock held until statement end	SBR replication requiring contiguous IDs
Consecutive	1 (pre-8.0 default)	Lightweight mutex for simple INSERT; AUTO-INC lock for bulk INSERT	Balance SBR safety and INSERT concurrency
Interleaved	2 (8.0+ default)	Lightweight mutex for all INSERTs, IDs may have gaps	RBR replication, maximum concurrency

Production Tip: MySQL 8.0 changed the default from 1 to 2 alongside changing the default binlog format to ROW. If you still use STATEMENT-based replication, you must set innodb_autoinc_lock_mode to 0 or 1, otherwise replica AUTO_INCREMENT values may diverge from the source.

2. Row Lock Deep Dive

InnoDB row locks are placed on index records, not on data rows themselves. If a table has no indexes, InnoDB uses the implicit clustered index (ROW_ID), which effectively results in table-wide locking. Understanding this is crucial for analyzing locking behavior.

2.1 Record Lock

A Record Lock locks a single index record. When you use an equality condition on a unique index that matches exactly one record, InnoDB places only a Record Lock without a Gap Lock.

-- Setup
CREATE TABLE t (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  age INT,
  INDEX idx_age (age)
) ENGINE=InnoDB;

INSERT INTO t VALUES (1,'Alice',25),(5,'Bob',30),(10,'Carol',35),(15,'Dave',40);

-- Transaction A: exact match on PK → Record Lock only
BEGIN;
SELECT * FROM t WHERE id = 5 FOR UPDATE;
-- Locks: X,REC_NOT_GAP on index `PRIMARY`, record (id=5)

-- Transaction B: can still insert id=3 or id=7 (no gap lock)
BEGIN;
INSERT INTO t VALUES (3,'Eve',28);  -- ✅ succeeds
INSERT INTO t VALUES (7,'Frank',32); -- ✅ succeeds

-- Index structure diagram: Record Lock on id=5

   Primary Index B+ Tree Leaf Page
  ┌────────┬────────┬────────┬────────┬──────────┐
  │  id=1  │  id=5  │ id=10  │ id=15  │ supremum │
  │        │  🔒 X  │        │        │          │
  │        │REC_NOT │        │        │          │
  │        │  _GAP  │        │        │          │
  └────────┴────────┴────────┴────────┴──────────┘

The exact lock information can be observed via performance_schema.data_locks:

SELECT ENGINE_TRANSACTION_ID, INDEX_NAME, LOCK_TYPE, LOCK_MODE, LOCK_DATA
FROM performance_schema.data_locks;

+------------------------+------------+-----------+-----------+-----------+
| ENGINE_TRANSACTION_ID  | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_DATA |
+------------------------+------------+-----------+-----------+-----------+
| 281479386710048        | NULL       | TABLE     | IX        | NULL      |
| 281479386710048        | PRIMARY    | RECORD    | X,REC_NOT | 5         |
|                        |            |           | _GAP      |           |
+------------------------+------------+-----------+-----------+-----------+

Source ref: storage/innobase/lock/lock0lock.cc — lock_rec_lock()

2.2 Gap Lock

A Gap Lock locks the gap between index records, preventing other transactions from inserting new records into the gap. Gap Locks are only used under the REPEATABLE READ (RR) isolation level and not under READ COMMITTED (RC).

Core Property: Gap Locks do NOT conflict with each other! Two transactions can simultaneously hold Gap Locks on the same gap (regardless of S or X mode). The sole purpose of a Gap Lock is to block INSERTs. This is one reason why Gap Locks are a common source of deadlocks.

-- Transaction A: query on non-existent key → Gap Lock
BEGIN;
SELECT * FROM t WHERE id = 7 FOR UPDATE;
-- id=7 doesn't exist. InnoDB locks the gap (5, 10)
-- Locks: X,GAP on index `PRIMARY`, record (id=10)  — means gap before 10

-- Transaction B: trying to insert into the locked gap
BEGIN;
INSERT INTO t VALUES (6,'Eve',28);  -- ❌ BLOCKED (waiting for gap lock)
INSERT INTO t VALUES (8,'Frank',32); -- ❌ BLOCKED
INSERT INTO t VALUES (9,'Grace',29); -- ❌ BLOCKED
INSERT INTO t VALUES (3,'Hank',27);  -- ✅ succeeds (outside gap)
INSERT INTO t VALUES (11,'Ivy',31);  -- ✅ succeeds (outside gap)

-- Gap Lock diagram

   Primary Index B+ Tree Leaf Page
  ┌────────┬───────────────────┬────────┬────────┬──────────┐
  │  id=1  │    GAP (5,10)     │ id=10  │ id=15  │ supremum │
  │        │   🔒 X,GAP        │        │        │          │
  │        │   blocks INSERT   │        │        │          │
  │        │   of id=6,7,8,9   │        │        │          │
  └────────┴───────────────────┴────────┴────────┴──────────┘
             ↑                  ↑
         gap starts         gap lock is attached
         after id=5         to record id=10

-- performance_schema.data_locks output:
+------------------------+------------+-----------+-----------+-----------+
| ENGINE_TRANSACTION_ID  | INDEX_NAME | LOCK_TYPE | LOCK_MODE | LOCK_DATA |
+------------------------+------------+-----------+-----------+-----------+
| 281479386710049        | NULL       | TABLE     | IX        | NULL      |
| 281479386710049        | PRIMARY    | RECORD    | X,GAP     | 10        |
+------------------------+------------+-----------+-----------+-----------+

Source ref: storage/innobase/lock/lock0lock.cc — lock_rec_lock(), LOCK_GAP flag

2.3 Next-Key Lock

Next-Key Lock = Record Lock + Gap Lock, locking both a record and the gap before it. Next-Key Lock is InnoDB's default row lock type under the RR isolation level, designed to prevent phantom reads.

-- Transaction A: range query on secondary index → Next-Key Locks
BEGIN;
SELECT * FROM t WHERE age >= 30 AND age < 40 FOR UPDATE;
-- InnoDB scans idx_age and locks:
-- 1. Next-Key Lock on idx_age record (age=30, id=5):  locks (25,30]
-- 2. Next-Key Lock on idx_age record (age=35, id=10): locks (30,35]
-- 3. Gap Lock on idx_age record (age=40, id=15):      locks (35,40)  — gap only
-- 4. Record Lock (X,REC_NOT_GAP) on PRIMARY id=5 and id=10

-- Net effect: age range [30, 40) is fully locked, no phantom possible

-- Next-Key Lock diagram on secondary index idx_age

  idx_age B+ Tree Leaf Page
  ┌──────────────┬──────────────┬──────────────┬──────────────┬──────────┐
  │ age=25,id=1  │ age=30,id=5  │ age=35,id=10 │ age=40,id=15 │ supremum │
  │              │  🔒 X,NK     │  🔒 X,NK     │  🔒 X,GAP    │          │
  │              │  (25,30]     │  (30,35]     │  (35,40)     │          │
  └──────────────┴──────────────┴──────────────┴──────────────┴──────────┘

  PRIMARY B+ Tree (record locks for matched rows)
  ┌────────┬────────┬────────┬────────┬──────────┐
  │  id=1  │  id=5  │ id=10  │ id=15  │ supremum │
  │        │ 🔒 X   │ 🔒 X   │        │          │
  │        │REC_NOT │REC_NOT │        │          │
  │        │  _GAP  │  _GAP  │        │          │
  └────────┴────────┴────────┴────────┴──────────┘

Source ref: storage/innobase/lock/lock0lock.cc — LOCK_ORDINARY (next-key lock is the default, no special flag)

2.4 Insert Intention Lock

An Insert Intention Lock is a special type of Gap Lock acquired during INSERT operations. It signals that a transaction intends to insert a record at a specific position within a gap. Multiple transactions inserting at different positions within the same gap will not block each other via their Insert Intention Locks, enabling concurrent inserts.

-- Existing records: id = 1, 5, 10, 15
-- Gap between id=5 and id=10: (5, 10)

-- Transaction A:
BEGIN;
INSERT INTO t VALUES (6,'Eve',28);
-- Acquires: Insert Intention Lock on gap (5,10), then Record Lock on id=6

-- Transaction B (concurrent):
BEGIN;
INSERT INTO t VALUES (8,'Frank',32);
-- Acquires: Insert Intention Lock on gap (5,10), then Record Lock on id=8
-- ✅ Both succeed concurrently — Insert Intention Locks don't conflict!

However: If another transaction already holds a Gap Lock (non-Insert Intention type) on that gap, the Insert Intention Lock will be blocked. This is the root cause of Gap Lock + INSERT deadlocks.

-- Deadlock scenario: Gap Lock vs Insert Intention Lock

-- TX A:
BEGIN;
SELECT * FROM t WHERE id = 7 FOR UPDATE;   -- Gap Lock on (5,10)

-- TX B:
BEGIN;
SELECT * FROM t WHERE id = 8 FOR UPDATE;   -- Gap Lock on (5,10) — compatible!

-- TX A:
INSERT INTO t VALUES (7,'Eve',28);
-- Needs Insert Intention Lock on (5,10), blocked by TX B's Gap Lock → WAIT

-- TX B:
INSERT INTO t VALUES (8,'Frank',32);
-- Needs Insert Intention Lock on (5,10), blocked by TX A's Gap Lock → DEADLOCK!

Source ref: storage/innobase/lock/lock0lock.cc — LOCK_INSERT_INTENTION flag

2.5 Row Lock Types Summary

Lock Type	LOCK_MODE Display	Scope	Primary Purpose
Record Lock	`X,REC_NOT_GAP` / `S,REC_NOT_GAP`	Single index record	Exact match on unique index
Gap Lock	`X,GAP` / `S,GAP`	Gap between two records	Prevent phantom reads (block INSERT)
Next-Key Lock	`X` / `S`	Record + preceding gap (gap, record]	Default row lock type under RR
Insert Intention Lock	`X,INSERT_INTENTION`	A point within a gap	Allow concurrent INSERT into same gap

3. Row Lock Compatibility Matrix

Row lock compatibility depends not only on the mode (S/X) but also on the lock type (Record/Gap/Next-Key/Insert Intention). This is the most confusing part of InnoDB's locking system:

3.1 Full Compatibility Matrix

Held ↓ \ Requested →	Record S	Record X	Gap S	Gap X	Next-Key S	Next-Key X	Insert Intention
Record S	Y	N	Y	Y	Y	N	Y
Record X	N	N	Y	Y	N	N	Y
Gap S	Y	Y	Y	Y	Y	Y	N
Gap X	Y	Y	Y	Y	Y	Y	N
Next-Key S	Y	N	Y	Y	Y	N	N
Next-Key X	N	N	Y	Y	N	N	N
Insert Intention	Y	Y	Y	Y	Y	Y	Y

Core Rules: - Gap Locks are always compatible with each other (Gap S/X vs Gap S/X are all Y) - Gap Locks (and the gap portion of Next-Key Locks) block Insert Intention Locks - Insert Intention Locks do not block anything (including other Insert Intention Locks) - Record Lock S/X conflict rules follow standard read-write lock semantics

3.2 Lock Compatibility Source Code Logic

InnoDB determines lock conflicts in the lock_rec_has_to_wait() function. The simplified logic is:

// Simplified from storage/innobase/lock/lock0lock.cc
static bool lock_rec_has_to_wait(
    const lock_t *lock1,  // requesting lock
    const lock_t *lock2,  // existing lock
    ulint heap_no)
{
    // Rule 1: Same transaction never waits for itself
    if (lock1->trx == lock2->trx) return false;

    // Rule 2: If modes are compatible (S+S), no wait
    if (lock_mode_compatible(lock1->mode, lock2->mode)) return false;

    // Rule 3: Gap lock never blocks anything except Insert Intention
    if (lock_rec_get_gap(lock2)) {
        // Existing lock is a gap-only lock
        if (!lock_rec_get_insert_intention(lock1)) return false;
        // Insert Intention vs Gap → must wait
    }

    // Rule 4: Requesting a gap lock? Never wait
    if (lock_rec_get_gap(lock1)) return false;

    // Rule 5: Insert Intention never has to wait for Insert Intention
    if (lock_rec_get_insert_intention(lock1)
        && lock_rec_get_insert_intention(lock2)) return false;

    // Rule 6: Record part of requesting lock vs record part of existing lock
    if (lock_rec_get_rec_not_gap(lock1)) {
        // Only the record component matters
        if (lock_rec_get_gap(lock2)) return false;
    }

    return true;  // must wait
}

Source ref: storage/innobase/lock/lock0lock.cc — lock_rec_has_to_wait()

4. Locking Scenarios (15+ Examples)

All examples below are based on this table (REPEATABLE READ isolation level, unless stated otherwise):

CREATE TABLE orders (
  id       INT PRIMARY KEY,
  user_id  INT,
  amount   DECIMAL(10,2),
  status   VARCHAR(20),
  INDEX idx_user (user_id),
  INDEX idx_status (status)
) ENGINE=InnoDB;

INSERT INTO orders VALUES
  (1, 100, 50.00, 'paid'),
  (5, 100, 80.00, 'paid'),
  (10, 200, 120.00, 'pending'),
  (15, 200, 200.00, 'paid'),
  (20, 300, 90.00,  'shipped'),
  (25, 300, 150.00, 'paid');

Scenario 1: SELECT ... FOR UPDATE — PK Equality

SELECT * FROM orders WHERE id = 10 FOR UPDATE;

-- Index used: PRIMARY
-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=10: X,REC_NOT_GAP  (record lock only, no gap)
-- Why no gap lock? PK is unique, exact match → record lock suffices

Scenario 2: SELECT ... FOR UPDATE — PK Not Found

SELECT * FROM orders WHERE id = 12 FOR UPDATE;

-- Index used: PRIMARY
-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=15: X,GAP  (gap lock on (10, 15))
-- Why? id=12 doesn't exist, InnoDB locks the gap to prevent phantom

Scenario 3: SELECT ... FOR UPDATE — PK Range

SELECT * FROM orders WHERE id BETWEEN 5 AND 15 FOR UPDATE;

-- Index used: PRIMARY (range scan)
-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=5:  X (next-key lock, locks (1,5])
--   PRIMARY id=10: X (next-key lock, locks (5,10])
--   PRIMARY id=15: X (next-key lock, locks (10,15])
--   PRIMARY id=20: X,GAP (gap lock, locks (15,20) — scan stop point)
-- Total locked range: (1, 20) — wider than you might expect!

SELECT * FROM orders WHERE id = 10 FOR SHARE;
-- MySQL 8.0+ syntax; equivalent to: LOCK IN SHARE MODE

-- Locks acquired:
--   TABLE: IS
--   PRIMARY id=10: S,REC_NOT_GAP  (shared record lock)
-- Another tx can also FOR SHARE on id=10 → both succeed
-- But FOR UPDATE on id=10 → blocked until S lock released

Scenario 5: SELECT ... FOR UPDATE — Non-unique Index Equality

SELECT * FROM orders WHERE user_id = 200 FOR UPDATE;

-- Index used: idx_user
-- Locks acquired:
--   TABLE: IX
--   idx_user (user_id=200, id=10): X (next-key, locks gap before + record)
--   idx_user (user_id=200, id=15): X (next-key)
--   idx_user (user_id=300, id=20): X,GAP (gap lock — scan stop)
--   PRIMARY id=10: X,REC_NOT_GAP
--   PRIMARY id=15: X,REC_NOT_GAP
-- Why next-key on non-unique? To prevent phantom: INSERT user_id=200

Scenario 6: UPDATE — PK Equality

UPDATE orders SET amount = 100.00 WHERE id = 10;

-- Index used: PRIMARY
-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=10: X,REC_NOT_GAP
-- Same as FOR UPDATE with PK equality — record lock only

Scenario 7: UPDATE — Non-unique Index Range

UPDATE orders SET amount = amount * 1.1 WHERE user_id >= 200;

-- Index used: idx_user (range scan from user_id=200 to supremum)
-- Locks acquired:
--   TABLE: IX
--   idx_user (user_id=200, id=10): X (next-key)
--   idx_user (user_id=200, id=15): X (next-key)
--   idx_user (user_id=300, id=20): X (next-key)
--   idx_user (user_id=300, id=25): X (next-key)
--   idx_user supremum:              X (next-key — locks to +infinity)
--   PRIMARY id=10,15,20,25: X,REC_NOT_GAP (each)
-- WARNING: supremum lock blocks ALL inserts with user_id ≥ 200

Scenario 8: UPDATE — Modifying Indexed Column

UPDATE orders SET user_id = 400 WHERE id = 10;

-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=10: X,REC_NOT_GAP
-- Note: When updating a secondary index column, InnoDB also needs to
-- delete the old secondary index entry and insert a new one.
-- This involves additional implicit locking on idx_user.

Scenario 9: DELETE — PK Equality

DELETE FROM orders WHERE id = 10;

-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=10: X,REC_NOT_GAP
-- Same as UPDATE on PK equality
-- The record is delete-marked, purge thread removes it later

Scenario 10: DELETE — Non-unique Index

DELETE FROM orders WHERE user_id = 200;

-- Index used: idx_user
-- Locks (same pattern as SELECT ... FOR UPDATE on non-unique index):
--   idx_user (user_id=200, id=10): X (next-key)
--   idx_user (user_id=200, id=15): X (next-key)
--   idx_user (user_id=300, id=20): X,GAP
--   PRIMARY id=10: X,REC_NOT_GAP
--   PRIMARY id=15: X,REC_NOT_GAP

Scenario 11: DELETE — No Index (Full Table Scan)

DELETE FROM orders WHERE amount > 100.00;
-- No index on amount → full table scan on PRIMARY

-- Locks acquired (TERRIFYING):
--   TABLE: IX
--   PRIMARY: next-key lock on EVERY record + supremum
--   Effectively locks the entire table!
--
-- Why? Without an index, InnoDB must scan all rows and lock each one.
-- It cannot release locks on non-matching rows mid-transaction.

Critical Warning: UPDATE/DELETE without a suitable index will lock the entire table! This is the most common cause of "table-wide lock" incidents in production. Always ensure WHERE conditions hit an index.

Scenario 12: INSERT — Basic Locking

INSERT INTO orders VALUES (12, 200, 75.00, 'pending');

-- Locks acquired (briefly):
--   TABLE: IX
--   Insert Intention Lock on gap (10, 15) of PRIMARY
--   PRIMARY id=12: X,REC_NOT_GAP (held until commit)
--
-- The insert intention lock is released as soon as the record lock
-- is placed on the newly inserted row.

Scenario 13: INSERT — Unique Key Conflict

-- TX A:
BEGIN;
INSERT INTO orders VALUES (10, 500, 99.00, 'new');
-- ERROR 1062: Duplicate entry '10' for key 'PRIMARY'
-- BUT: TX A now holds S,REC_NOT_GAP on PRIMARY id=10 (shared lock!)
-- Why S lock on failure? To prevent the conflicting row from being
-- deleted before the failing transaction decides what to do.

Scenario 14: INSERT ... ON DUPLICATE KEY UPDATE

INSERT INTO orders VALUES (10, 200, 130.00, 'paid')
  ON DUPLICATE KEY UPDATE amount = VALUES(amount);

-- Locks acquired:
--   TABLE: IX
--   PRIMARY id=10: X,REC_NOT_GAP  (exclusive — because it's an UPDATE)
--
-- Difference from plain INSERT:
--   Plain INSERT failure → S lock (shared)
--   ON DUPLICATE KEY UPDATE → X lock (exclusive)
-- This distinction matters for deadlock analysis!

Scenario 15: REPLACE INTO

REPLACE INTO orders VALUES (10, 200, 130.00, 'paid');

-- If id=10 exists:
--   Equivalent to DELETE + INSERT
--   PRIMARY id=10: X (next-key lock — includes gap!)
--   The next-key lock (vs record lock) is because REPLACE may need
--   to prevent phantom on secondary unique indexes.
-- If id=10 doesn't exist:
--   Same as plain INSERT (insert intention + record lock)

Scenario 16: SELECT ... FOR UPDATE with LIMIT

SELECT * FROM orders WHERE user_id = 100 FOR UPDATE LIMIT 1;

-- Index used: idx_user
-- Locks acquired:
--   idx_user (user_id=100, id=1): X (next-key)
--   PRIMARY id=1: X,REC_NOT_GAP
--
-- Stops after first match! Fewer locks than without LIMIT.
-- LIMIT can significantly reduce lock contention.

Scenario 17: Lock Accumulation Across Statements

BEGIN;
SELECT * FROM orders WHERE id = 5 FOR UPDATE;     -- Lock 1: X on PK=5
SELECT * FROM orders WHERE id = 15 FOR UPDATE;    -- Lock 2: X on PK=15
UPDATE orders SET amount = 0 WHERE user_id = 300;  -- Lock 3: X on user_id=300 range

-- ALL locks held simultaneously until COMMIT/ROLLBACK!
-- InnoDB does not release locks within a transaction (2PL: two-phase locking)
-- This is the fundamental reason why long transactions cause lock contention.

Scenario 18: Locking in Subqueries

-- Subquery with FOR UPDATE — be careful!
SELECT * FROM orders
  WHERE user_id IN (
    SELECT user_id FROM vip_users WHERE level = 'gold'
  )
  FOR UPDATE;

-- The FOR UPDATE applies to the outer query (orders), not the subquery
-- vip_users: only snapshot read (no locks)
-- orders: X locks on all matching rows
--
-- WARNING: If MySQL transforms this into a JOIN internally,
-- locking behavior may differ from what you expect.
-- Check EXPLAIN to verify the actual execution plan.

Scenario 19: Full Table COUNT Locking

-- Plain COUNT — no locks (snapshot read)
SELECT COUNT(*) FROM orders;
-- Uses MVCC snapshot, no row locks at all

-- COUNT with FOR UPDATE — locks every row!
SELECT COUNT(*) FROM orders FOR UPDATE;
-- Scans entire PRIMARY index
-- Next-Key Lock on every record + supremum
-- Effectively table-wide lock. Almost never what you want!

Scenarios: RC vs RR Comparison

REPEATABLE READ (RR)

Uses Next-Key Lock (Record + Gap)
Range queries lock gaps
Prevents phantom reads
Larger lock scope, lower concurrency

READ COMMITTED (RC)

Uses Record Lock only (no Gap Lock)
Does not lock gaps
Allows phantom reads
Smallest lock scope, highest concurrency
Releases locks on non-matching rows immediately (Semi-consistent read)

5. MVCC and Locking Interaction

5.1 Consistent Read (Snapshot) vs Locking Read

InnoDB supports two read modes with entirely different locking implications:

Read Mode	SQL	Locking	Data Source
Consistent Read (Snapshot)	`SELECT ...` (plain SELECT)	No locks at all	ReadView snapshot via Undo Log
Locking Read (Current Read)	`SELECT ... FOR UPDATE/SHARE`	Row locks	Latest committed version (not snapshot)
Implicit Locking Read	`UPDATE / DELETE / INSERT`	Row locks	Latest committed version

Key Distinction: Under RR, a plain SELECT sees a snapshot from transaction start, while SELECT ... FOR UPDATE sees the latest data. These two can return different results within the same transaction! This is not a bug but by MVCC design.

-- Demonstrating snapshot read vs locking read difference

-- TX A:                                   TX B:
BEGIN;                                     BEGIN;
SELECT * FROM orders WHERE id=10;
-- sees: amount=120.00 (snapshot)
                                            UPDATE orders SET amount=999
                                              WHERE id=10;
                                            COMMIT;

SELECT * FROM orders WHERE id=10;
-- still sees: amount=120.00 (same snapshot)

SELECT * FROM orders WHERE id=10 FOR UPDATE;
-- sees: amount=999.00 (current read, latest committed)

5.2 Why RR Uses Gap Locks But RC Does Not

RR must guarantee repeatable reads and prevent phantoms, so it needs Gap Locks to prevent other transactions from inserting new rows into scanned ranges. RC only cares about reading committed data, gets a fresh snapshot for each SELECT, naturally allows phantoms, and therefore does not need Gap Locks.

-- Phantom row problem (without gap locks, i.e. under RC)

-- TX A:                                   TX B:
BEGIN;
SELECT * FROM orders
  WHERE user_id = 200 FOR UPDATE;
-- Returns: id=10, id=15
                                            BEGIN;
                                            INSERT INTO orders VALUES
                                              (12, 200, 55.00, 'new');
                                            COMMIT;

SELECT * FROM orders
  WHERE user_id = 200 FOR UPDATE;
-- Under RC: Returns id=10, id=12, id=15 — PHANTOM ROW (id=12)!
-- Under RR: TX B's INSERT would be BLOCKED by gap lock → no phantom

5.3 Semi-Consistent Read (RC Optimization)

Under RC isolation, UPDATE performs a "semi-consistent read": if a scanned row is locked by another transaction and doesn't match the WHERE condition, InnoDB reads the latest committed version of that row to evaluate the condition. If it still doesn't match, the lock on that row is immediately released. This significantly reduces lock waits under RC.

-- Semi-consistent read example (RC isolation level)
-- Table has: id=1 (status='paid'), id=5 (status='paid'), id=10 (status='pending')

-- TX A:
BEGIN;
SELECT * FROM orders WHERE id = 5 FOR UPDATE;  -- X lock on id=5

-- TX B (RC):
BEGIN;
UPDATE orders SET amount = 0 WHERE status = 'pending';
-- Full scan: encounters id=5 locked by TX A
-- Semi-consistent: reads latest committed version of id=5
-- status='paid' ≠ 'pending' → skips, does NOT wait for TX A's lock
-- Continues to id=10 (status='pending') → locks and updates

-- Without semi-consistent read, TX B would block on id=5 waiting for TX A

6. Deadlock Deep Dive

6.1 What Causes Deadlocks

A deadlock occurs when two or more transactions mutually wait for locks held by each other, forming a cycle where no transaction can proceed. InnoDB automatically detects this and rolls back the transaction with the least cost to break the cycle.

-- Classic AB-BA deadlock

  TX A                              TX B
  ────────────────────              ────────────────────
  BEGIN;                             BEGIN;
  UPDATE orders SET amount=0         UPDATE orders SET amount=0
    WHERE id = 5;                      WHERE id = 10;
  -- holds X lock on PK=5            -- holds X lock on PK=10

  UPDATE orders SET amount=0
    WHERE id = 10;
  -- WAIT: needs X lock on PK=10
  -- (held by TX B)
                                     UPDATE orders SET amount=0
                                       WHERE id = 5;
                                     -- WAIT: needs X lock on PK=5
                                     -- (held by TX A)

  ┌────────┐         ┌────────┐
  │  TX A   │ waits → │  TX B   │
  │         │ ← waits │         │
  └────────┘         └────────┘
       DEADLOCK! Circular wait detected.
       InnoDB rolls back TX B (fewer undo log records)
       TX A proceeds, TX B gets:
       ERROR 1213 (40001): Deadlock found when trying to get lock

6.2 Deadlock Detection Algorithm (Wait-for Graph)

InnoDB uses a Wait-for Graph algorithm to detect deadlocks. Whenever a transaction needs to wait for a lock, InnoDB adds an edge to the wait-for graph and checks for cycles. If a cycle exists, a deadlock has occurred.

-- Wait-for Graph visualization
--
-- Each node = a transaction
-- Each edge = "waits for lock held by"
--
-- No deadlock (linear chain):      Deadlock (cycle):
--
--   TX A → TX B → TX C              TX A → TX B
--   (A waits B waits C)              ↑         ↓
--   No cycle → just waiting          TX D ← TX C
--                                    Cycle detected → deadlock!
--
-- Algorithm: depth-first search (DFS) on the wait-for graph
-- Complexity: O(V + E) where V = transactions, E = wait edges
--
-- Victim selection: InnoDB picks the transaction with the fewest
-- undo log records (least work to roll back) as the victim.

Related configuration parameters:

Parameter	Default	Description
`innodb_deadlock_detect`	ON	Whether to enable proactive deadlock detection. When OFF, relies on timeout to resolve deadlocks.
`innodb_lock_wait_timeout`	50 (seconds)	Maximum time to wait for a row lock; timeout produces `ERROR 1205`.

High Concurrency: Under high concurrency (hundreds of active transactions), deadlock detection itself can become a bottleneck because each wait triggers a DFS traversal. MySQL 8.0.18+ improved the algorithm by delegating detection to a background thread. For extreme write-hot scenarios, consider disabling innodb_deadlock_detect and setting innodb_lock_wait_timeout to a small value (e.g., 5 seconds).

6.3 Reading SHOW ENGINE INNODB STATUS Deadlock Section

SHOW ENGINE INNODB STATUS\G

------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-01-15 14:23:07 140234567890432
*** (1) TRANSACTION:                                    ← TX A info
TRANSACTION 421937285, ACTIVE 3 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 12, OS thread handle 140234567890432, query id 5678
UPDATE orders SET amount=0 WHERE id = 10               ← TX A's blocked query

*** (1) HOLDS THE LOCK(S):                              ← What TX A already holds
RECORD LOCKS space id 58 page no 4 n bits 72
  index PRIMARY of table `test`.`orders` trx id 421937285 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; ...
 0: len 4; hex 80000005; asc     ;;                    ← id=5 (X record lock)

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:            ← What TX A is waiting for
RECORD LOCKS space id 58 page no 4 n bits 72
  index PRIMARY of table `test`.`orders` trx id 421937285 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 6; ...
 0: len 4; hex 8000000a; asc     ;;                    ← id=10 (waiting for X)

*** (2) TRANSACTION:                                    ← TX B info
TRANSACTION 421937286, ACTIVE 2 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1128, 2 row lock(s)
MySQL thread id 15, OS thread handle 140234567890433, query id 5679
UPDATE orders SET amount=0 WHERE id = 5                ← TX B's blocked query

*** (2) HOLDS THE LOCK(S):                              ← What TX B already holds
RECORD LOCKS space id 58 page no 4 n bits 72
  index PRIMARY of table `test`.`orders` trx id 421937286 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 6; ...
 0: len 4; hex 8000000a; asc     ;;                    ← id=10 (X record lock)

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:            ← What TX B is waiting for
RECORD LOCKS space id 58 page no 4 n bits 72
  index PRIMARY of table `test`.`orders` trx id 421937286 lock_mode X locks rec but not gap waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 6; ...
 0: len 4; hex 80000005; asc     ;;                    ← id=5 (waiting for X)

*** WE ROLL BACK TRANSACTION (2)                        ← TX B is the victim

Key interpretation points:

hex 80000005 → 0x80000005 - 0x80000000 = 5, i.e., id=5. InnoDB adds 0x80000000 offset when storing signed integers.
lock_mode X locks rec but not gap = Record Lock (X, REC_NOT_GAP)
lock_mode X (without "but not gap") = Next-Key Lock
lock_mode X locks gap before rec = Gap Lock
lock_mode X insert intention = Insert Intention Lock

6.4 Common Deadlock Patterns

Pattern A: AB-BA Ordering

Two transactions access the same set of resources in opposite order. This is the most classic deadlock pattern; the solution is to ensure all transactions access resources in the same order.

-- TX A: UPDATE WHERE id=5, then UPDATE WHERE id=10
-- TX B: UPDATE WHERE id=10, then UPDATE WHERE id=5
-- Fix: both should access id=5 first, then id=10 (ascending order)

Pattern B: Gap Lock + INSERT

This is the most insidious deadlock pattern. Two transactions both acquire a Gap Lock on the same gap (Gap Locks are compatible), then both try to INSERT into that gap, each blocked by the other's Gap Lock.

-- Records: id = 1, 5, 10, 15

-- TX A:
BEGIN;
SELECT * FROM t WHERE id = 7 FOR UPDATE;  -- Gap Lock on (5,10)

-- TX B:
BEGIN;
SELECT * FROM t WHERE id = 8 FOR UPDATE;  -- Gap Lock on (5,10) — COMPATIBLE!

-- TX A:
INSERT INTO t VALUES (7, 'x', 0);        -- Insert Intention on (5,10) → BLOCKED by TX B's Gap

-- TX B:
INSERT INTO t VALUES (8, 'y', 0);        -- Insert Intention on (5,10) → BLOCKED by TX A's Gap
-- 💀 DEADLOCK

Pattern C: Secondary Index + Clustered Index

One transaction locks via a secondary index (secondary first, then clustered), while another locks directly via the clustered index. Different locking orders lead to deadlock.

-- TX A: uses secondary index idx_user
BEGIN;
UPDATE orders SET amount = 0 WHERE user_id = 200;
-- Lock order: idx_user(200,id=10) → PRIMARY(id=10) → idx_user(200,id=15) → PRIMARY(id=15)

-- TX B: uses primary index directly
BEGIN;
UPDATE orders SET user_id = 999 WHERE id = 15;
-- Lock order: PRIMARY(id=15) → idx_user(200,id=15) (old entry) → idx_user(999,id=15) (new)

-- If TX A has locked idx_user(200,id=10)+PRIMARY(10) and is waiting for PRIMARY(15),
-- while TX B has locked PRIMARY(15) and is waiting for idx_user(200,id=15):
-- 💀 DEADLOCK

Pattern D: Batch INSERT Deadlock

Multiple transactions simultaneously batch-insert records with a unique constraint; the S locks from unique key conflicts cross-wait with X locks from subsequent INSERTs.

-- Table with unique index on (email)
-- TX A: INSERT (email='[email protected]') → holds X on '[email protected]'
-- TX B: INSERT (email='[email protected]') → duplicate → S lock wait on '[email protected]'
-- TX C: INSERT (email='[email protected]') → duplicate → S lock wait on '[email protected]'
-- TX A: ROLLBACK
-- TX B and TX C both get S lock on '[email protected]', both try to INSERT →
-- Both need X lock, both blocked by the other's S lock → DEADLOCK

Source ref: storage/innobase/lock/lock0lock.cc — DeadlockChecker::check_and_resolve()

7. Deadlock Simulator (Interactive)

Choose a preset deadlock scenario, step through two transactions' SQL statements, and observe the locks acquired at each step and the eventual deadlock.

    Scenario:

      AB-BA Ordering Deadlock
      Gap Lock + INSERT Deadlock
      Unique Key Conflict Deadlock
      Secondary + Primary Index Deadlock

Transaction A

        Locks held:

Transaction B

        Locks held:












    Next Step
    Reset
    Click "Next Step" to begin





    -- Execution log will appear here

8. Deadlock Prevention Strategies

8.1 Consistent Lock Ordering

Ensuring all transactions access resources in the same order (e.g., ascending by primary key) is the most effective way to prevent AB-BA deadlocks.

-- BAD: inconsistent order
-- TX A: lock id=5, then id=10
-- TX B: lock id=10, then id=5

-- GOOD: always ascending order
-- TX A: lock id=5, then id=10
-- TX B: lock id=5, then id=10  (same order)

-- Application code pattern:
def transfer(from_id, to_id, amount):
    ids = sorted([from_id, to_id])  # always lock in ascending order
    with transaction():
        lock_account(ids[0])
        lock_account(ids[1])
        do_transfer(from_id, to_id, amount)

8.2 Keep Transactions Short

The longer a transaction holds locks, the higher the deadlock probability. Principles:

Move time-consuming operations (external API calls, file I/O) outside the transaction
Do read-only queries (snapshot reads, no locks) first, write operations last
Never wait for user input inside a transaction
Consider splitting large transactions into smaller ones

-- BAD: long transaction with external call
BEGIN;
SELECT * FROM orders WHERE id = 10 FOR UPDATE;
-- ... call payment gateway (3 seconds) ...
UPDATE orders SET status = 'paid' WHERE id = 10;
COMMIT;

-- GOOD: minimize lock hold time
-- 1. Read data (no lock)
SELECT * FROM orders WHERE id = 10;
-- 2. Call payment gateway (no transaction)
-- ... call payment gateway (3 seconds) ...
-- 3. Quick write (minimal lock time)
BEGIN;
UPDATE orders SET status = 'paid' WHERE id = 10 AND status = 'pending';
COMMIT;

8.3 Use RC Instead of RR When Possible

READ COMMITTED does not use Gap Locks, dramatically reducing deadlock probability. If your workload does not need repeatable reads or phantom prevention (most OLTP scenarios), switching to RC is the simplest lock reduction strategy.

-- Global setting
SET GLOBAL transaction_isolation = 'READ-COMMITTED';

-- Or per-session
SET SESSION transaction_isolation = 'READ-COMMITTED';

-- Or in my.cnf
[mysqld]
transaction_isolation = READ-COMMITTED

8.4 Retry Logic with Exponential Backoff

Deadlocks are normal; the application layer must handle ERROR 1213 and retry. Use exponential backoff with random jitter:

import time, random

def execute_with_retry(func, max_retries=3):
    for attempt in range(max_retries):
        try:
            return func()
        except MySQLDeadlockError:      # ERROR 1213
            if attempt == max_retries - 1:
                raise
            wait = (2 ** attempt) * 0.1  # 0.1s, 0.2s, 0.4s
            jitter = random.uniform(0, wait * 0.5)
            time.sleep(wait + jitter)
            log.warning(f"Deadlock retry {attempt+1}/{max_retries}")

// Go version
func executeWithRetry(fn func() error, maxRetries int) error {
    for i := 0; i < maxRetries; i++ {
        err := fn()
        if err == nil {
            return nil
        }
        if !isDeadlockError(err) {
            return err
        }
        if i == maxRetries-1 {
            return fmt.Errorf("deadlock after %d retries: %w", maxRetries, err)
        }
        backoff := time.Duration(1< **Production Monitoring Tips:** 1. Enable `innodb_print_all_deadlocks` and parse deadlock logs with log collection systems (ELK/Loki) 2. Monitor `Innodb_row_lock_waits` and `Innodb_row_lock_time_avg` status variables 3. Set alerts when `Innodb_row_lock_time_avg > 1000ms` or `Innodb_deadlocks` increment is abnormal 4. Periodically check `information_schema.INNODB_TRX` for transactions with `trx_started` older than 60 seconds





### 9.6 Lock Wait Monitoring Script


Here is a practical lock wait monitoring query that can serve as a periodic monitoring script:

-- Comprehensive lock wait report SELECT r.trx_id AS waiting_trx_id, r.trx_mysql_thread_id AS waiting_thread, r.trx_query AS waiting_query, TIMESTAMPDIFF(SECOND, r.trx_wait_started, NOW()) AS wait_seconds, b.trx_id AS blocking_trx_id, b.trx_mysql_thread_id AS blocking_thread, b.trx_query AS blocking_query, TIMESTAMPDIFF(SECOND, b.trx_started, NOW()) AS blocking_age_seconds, b.trx_rows_locked AS blocking_rows_locked, b.trx_rows_modified AS blocking_rows_modified, CONCAT(dl.OBJECT_SCHEMA, '.', dl.OBJECT_NAME) AS locked_table, dl.INDEX_NAME AS locked_index, dl.LOCK_MODE AS waiting_lock_mode, dl.LOCK_DATA AS lock_data FROM information_schema.INNODB_TRX r JOIN performance_schema.data_lock_waits w ON r.trx_id = w.REQUESTING_ENGINE_TRANSACTION_ID JOIN information_schema.INNODB_TRX b ON b.trx_id = w.BLOCKING_ENGINE_TRANSACTION_ID JOIN performance_schema.data_locks dl ON dl.ENGINE_LOCK_ID = w.REQUESTING_ENGINE_LOCK_ID WHERE r.trx_state = 'LOCK WAIT' ORDER BY wait_seconds DESC;




### 9.7 Key Status Variables

-- Lock-related status variables SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%';

-- Deadlock count SHOW GLOBAL STATUS LIKE 'Innodb_deadlocks'; -- Monitor the rate of change, not absolute value

-- Long-running transactions (potential lock holders) SELECT trx_id, trx_state, trx_started, TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_sec, trx_rows_locked, trx_rows_modified, trx_query, trx_mysql_thread_id FROM information_schema.INNODB_TRX WHERE TIMESTAMPDIFF(SECOND, trx_started, NOW()) > 30 ORDER BY trx_started;




### 9.8 Grafana Dashboard Recommendations


Recommended lock-related panels for Grafana:


| Panel Name | Metric Source | Alert Threshold |
| --- | --- | --- |
| Row Lock Wait Rate | `rate(Innodb_row_lock_waits)` | > 50/s |
| Avg Lock Wait Time | `Innodb_row_lock_time_avg` | > 500ms |
| Deadlock Rate | `rate(Innodb_deadlocks)` | > 1/min |
| Current Lock Waits | `Innodb_row_lock_current_waits` | > 10 |
| Long Transactions | `COUNT(INNODB_TRX WHERE age > 60s)` | > 0 |
| History List Length | `Innodb_history_list_length` | > 10000 |








## 10. Metadata Locks (MDL)



### 10.1 What is MDL


Metadata Lock (MDL) was introduced in MySQL 5.5.3 to protect table structures from being modified while queries are in progress. MySQL automatically acquires MDL whenever you execute any SQL statement:


- SELECT / DML → MDL Shared Read / Shared Write
- DDL (ALTER TABLE, DROP TABLE, etc.) → MDL Exclusive



### 10.2 MDL Blocking ALTER TABLE — Typical Scenario

-- Timeline showing how MDL can cascade into a total block

Time Session 1 Session 2 Session 3 ───── ────────────────── ────────────────── ────────────────── T1 BEGIN; SELECT * FROM t; -- holds MDL_SHARED_READ

T2 ALTER TABLE t ADD COLUMN c INT; -- needs MDL_EXCLUSIVE -- BLOCKED by Session 1's MDL_SHARED_READ

T3 SELECT * FROM t; -- needs MDL_SHARED_READ -- BLOCKED! Why? -- Session 2's pending -- MDL_EXCLUSIVE is ahead -- in the wait queue

Result: Session 1's long transaction causes Session 2 (DDL) to wait, which in turn blocks ALL subsequent queries (Session 3, 4, 5, ...)! This can take down an entire service.




> **Common Production Incident:** One uncommitted transaction + one ALTER TABLE = all queries blocked. This is one of the most common causes of MySQL production outages. Always check for long-running transactions before executing DDL: `SELECT * FROM information_schema.INNODB_TRX WHERE TIME_TO_SEC(TIMEDIFF(NOW(), trx_started)) > 10;`



### 10.3 Monitoring MDL

-- Enable MDL instrumentation (MySQL 8.0+) UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES' WHERE NAME = 'wait/lock/metadata/sql/mdl';

-- View current MDL locks SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME, LOCK_TYPE, LOCK_DURATION, LOCK_STATUS, OWNER_THREAD_ID, OWNER_EVENT_ID FROM performance_schema.metadata_locks WHERE OBJECT_TYPE = 'TABLE';

-- Find the blocking thread SELECT ml.OBJECT_SCHEMA, ml.OBJECT_NAME, ml.LOCK_TYPE, ml.LOCK_STATUS, t.PROCESSLIST_ID, t.PROCESSLIST_INFO AS query FROM performance_schema.metadata_locks ml JOIN performance_schema.threads t ON ml.OWNER_THREAD_ID = t.THREAD_ID WHERE ml.OBJECT_SCHEMA = 'your_db' AND ml.OBJECT_NAME = 'your_table';




### 10.4 Online DDL and MDL


InnoDB Online DDL reduces DDL's impact on DML, but still requires a brief MDL Exclusive lock at the start and end:

-- Online DDL lifecycle:

-- Phase 1: Preparation (brief MDL Exclusive) -- - Acquire MDL_EXCLUSIVE -- - Determine DDL algorithm (INPLACE, INSTANT, COPY) -- - Downgrade to MDL_SHARED_UPGRADABLE -- - Very short, usually milliseconds

-- Phase 2: Execution (MDL_SHARED_UPGRADABLE — allows DML) -- - Rebuild table / add index in background -- - DML continues, changes logged to row log -- - Can be minutes/hours for large tables

-- Phase 3: Commit (brief MDL Exclusive again) -- - Upgrade to MDL_EXCLUSIVE -- - Apply row log (accumulated DML changes) -- - Swap old/new table -- - Release MDL_EXCLUSIVE -- - This is where the blocking happens if there are long transactions!

-- Recommended: set a lock wait timeout for DDL SET lock_wait_timeout = 5; -- give up after 5 seconds ALTER TABLE orders ADD COLUMN note TEXT, ALGORITHM=INPLACE, LOCK=NONE; -- If MDL Exclusive can't be acquired in 5s, the ALTER fails gracefully -- instead of blocking all traffic




| DDL Operation | ALGORITHM | Rebuild Table | Concurrent DML |
| --- | --- | --- | --- |
| ADD COLUMN (at end) | INSTANT (8.0.12+) | No | Yes |
| ADD INDEX | INPLACE | No | Yes |
| DROP COLUMN | INPLACE | Yes | Yes |
| CHANGE COLUMN TYPE | COPY | Yes | No |
| ADD FOREIGN KEY | INPLACE | No | Yes |










## 11. Hot Row / Lock Contention Solutions



When multiple transactions frequently update the same row (inventory deduction, counter increment, account balance changes), row lock contention becomes a severe performance bottleneck, dropping TPS from tens of thousands to hundreds or even tens. Here are 5 production-proven solutions.



### Solution 1: Sharded Counters


Split one hot row into N sub-rows; update a random sub-row, read by SUMming all sub-rows.

-- Original: single hot row CREATE TABLE product_stock ( product_id INT PRIMARY KEY, stock INT ); -- All threads fight for the same row → bottleneck

-- Sharded: 16 sub-rows per product CREATE TABLE product_stock_shard ( product_id INT, shard_id TINYINT, -- 0..15 stock INT, PRIMARY KEY (product_id, shard_id) );

-- Initialize: distribute stock across shards -- product_id=100, total stock=1000, 16 shards → 62 or 63 each

-- Deduct stock: randomly pick a shard UPDATE product_stock_shard SET stock = stock - 1 WHERE product_id = 100 AND shard_id = FLOOR(RAND() * 16) AND stock > 0;

-- Read total stock: SELECT SUM(stock) FROM product_stock_shard WHERE product_id = 100;

-- Throughput improvement: ~16x (linear with shard count) -- Tradeoff: reads become slightly more expensive (SUM over N shards)




### Solution 2: Queue + Batch Merge


Write change requests to a queue table (high throughput, no contention), background worker periodically merges to the main table. See Section 8.6.



### Solution 3: Optimistic Locking (Version Column)


Instead of explicit `FOR UPDATE` locking, use a version column for CAS (Compare-And-Swap) semantics:

CREATE TABLE product ( id INT PRIMARY KEY, stock INT, version INT DEFAULT 0 );

-- Step 1: Read (no lock, snapshot read) SELECT id, stock, version FROM product WHERE id = 100; -- Returns: stock=50, version=7

-- Step 2: Update with version check (CAS) UPDATE product SET stock = stock - 1, version = version + 1 WHERE id = 100 AND version = 7;

-- If affected_rows = 1 → success -- If affected_rows = 0 → another transaction already modified it → retry

-- Advantage: no lock wait, losers retry immediately -- Disadvantage: high retry rate under extreme contention (bad for 100+ concurrent)




### Solution 4: SELECT ... SKIP LOCKED (MySQL 8.0+)


Skip already-locked rows; ideal for job queues and task assignment:

-- Job queue table CREATE TABLE jobs ( id INT PRIMARY KEY, status ENUM('pending','running','done'), payload JSON );

-- Worker picks next available job, skipping locked ones BEGIN; SELECT * FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED; -- If another worker already locked the first pending job, skip to the next one -- No blocking, no contention!

UPDATE jobs SET status = 'running' WHERE id = ?; COMMIT;




### Solution 5: NOWAIT (MySQL 8.0+)


If the lock cannot be acquired immediately, error out instead of waiting. Ideal for "fail fast" scenarios:

BEGIN; SELECT * FROM orders WHERE id = 10 FOR UPDATE NOWAIT; -- If id=10 is already locked: -- ERROR 3572 (HY000): Statement aborted because lock(s) could not be acquired -- immediately and NOWAIT is set. -- Application catches error and handles gracefully (show "item busy" to user)




### Solution Comparison


| Solution | Throughput Gain | Complexity | Best For |
| --- | --- | --- | --- |
| Sharded Counters | ~Nx (N=shard count) | Medium | Counters, inventory |
| Queue + Merge | Very high | High | Additive ops tolerating brief delay |
| Optimistic Lock | High under low contention | Low | Updates with low contention |
| SKIP LOCKED | High (no waits) | Low | Job queues, task assignment |
| NOWAIT | High (fast fail) | Low | Scenarios needing fast feedback |





### 11.6 Strategy Decision Tree

Hot Row Handling Decision Tree:

Need real-time accurate balance/stock? │ ├── Yes │ │ │ ├── Concurrency < 100 TPS? │ │ ├── Yes → Optimistic Lock (version) │ │ └── No → Sharded Counters │ │ │ └── Can queue processing? │ ├── Yes → Queue + Batch Merge │ └── No → Sharded Counters + NOWAIT fast-fail │ └── No (tolerate brief delay) │ ├── Is it a task assignment scenario? │ ├── Yes → SKIP LOCKED │ └── No → Queue + Batch Merge │ └── Consider Redis cache + MySQL async writeback




### 11.7 Sharded Counter Implementation Details & Caveats


Sharded counters seem simple but have several easily overlooked issues in production:

-- Problem 1: shard exhaustion -- If one shard reaches 0 while others still have stock, -- the random selection may fail repeatedly.

-- Fix: retry with different shard, or use a smarter selection: UPDATE product_stock_shard SET stock = stock - 1 WHERE product_id = 100 AND stock > 0 ORDER BY RAND() LIMIT 1;

-- Problem 2: rebalancing -- When restocking, distribute evenly across shards: UPDATE product_stock_shard SET stock = stock + FLOOR(1000 / 16) WHERE product_id = 100; -- Remainder goes to shard 0: UPDATE product_stock_shard SET stock = stock + (1000 MOD 16) WHERE product_id = 100 AND shard_id = 0;

-- Problem 3: negative stock race -- The AND stock > 0 check + UPDATE is atomic within a single row -- but total stock across all shards might briefly show wrong totals -- during concurrent operations. This is acceptable for most use cases.




### 11.8 Redis + MySQL Hybrid Approach


For extreme concurrency scenarios (10K+ TPS), cache the counter/inventory in Redis with async writeback to MySQL:

-- Architecture:

-- Client → Redis DECR → check >= 0? -- ├── Yes → Queue message → Worker → MySQL UPDATE -- └── No → Redis INCR (restore) → return "out of stock"

-- Pros: Redis handles 100K+ TPS, MySQL only sees merged writes -- Cons: Redis crash loses in-flight data (mitigate with AOF + queue) -- Temporary inconsistency between Redis and MySQL

-- Redis commands (pseudocode): -- DECR stock:product:100 -- if result < 0: -- INCR stock:product:100 # restore -- return ERROR_OUT_OF_STOCK -- else: -- publish deduction event to message queue -- return SUCCESS









## 12. Real-World Cases



### Case 1: E-commerce Inventory Deduction Deadlock



#### Problem Description


During a flash sale, massive concurrent requests deduct inventory. The application first queries stock (FOR UPDATE), checks sufficiency, then UPDATEs. Frequent deadlocks under high concurrency.

-- Problem code (simplified) BEGIN; -- Step 1: lock the SKU row SELECT stock FROM sku WHERE sku_id = 1001 FOR UPDATE; -- Step 2: check stock -- if stock < quantity: rollback -- Step 3: deduct UPDATE sku SET stock = stock - 1 WHERE sku_id = 1001; -- Step 4: create order (on different table) INSERT INTO orders (sku_id, user_id, qty) VALUES (1001, ?, 1); COMMIT;

-- Deadlock scenario: -- An order might contain multiple SKUs. If TX A locks sku_id=1001 then 1002, -- while TX B locks sku_id=1002 then 1001 → classic AB-BA deadlock




#### Solution

-- Fix 1: Sort SKU IDs before locking (consistent ordering) sku_ids = sorted(order_items.keys()) # [1001, 1002] for sku_id in sku_ids: SELECT stock FROM sku WHERE sku_id = ? FOR UPDATE

-- Fix 2: Single UPDATE with stock check (atomic, no FOR UPDATE needed) UPDATE sku SET stock = stock - 1 WHERE sku_id = 1001 AND stock >= 1; -- Check affected_rows: 1=success, 0=insufficient stock -- No separate SELECT FOR UPDATE → simpler, less lock contention

-- Fix 3: For extreme concurrency, use sharded counters (see Section 11)




### Case 2: Account Transfer (Double-entry Bookkeeping) Deadlock



#### Problem Description


User A transfers to User B, while simultaneously User B transfers to User A. The two transactions lock the two accounts in opposite order.

-- Transfer: A → B (TX 1) BEGIN; UPDATE accounts SET balance = balance - 100 WHERE user_id = 'A'; -- lock A UPDATE accounts SET balance = balance + 100 WHERE user_id = 'B'; -- lock B → wait COMMIT;

-- Transfer: B → A (TX 2) BEGIN; UPDATE accounts SET balance = balance - 50 WHERE user_id = 'B'; -- lock B UPDATE accounts SET balance = balance + 50 WHERE user_id = 'A'; -- lock A → DEADLOCK COMMIT;




#### Solution

-- Fix: Always lock accounts in ascending user_id order def transfer(from_user, to_user, amount): # Ensure consistent lock ordering first, second = sorted([from_user, to_user])

BEGIN;
SELECT * FROM accounts WHERE user_id = first  FOR UPDATE;
SELECT * FROM accounts WHERE user_id = second FOR UPDATE;

UPDATE accounts SET balance = balance - amount WHERE user_id = from_user;
UPDATE accounts SET balance = balance + amount WHERE user_id = to_user;
COMMIT;

-- Now both TX 1 and TX 2 will lock 'A' first, then 'B' (if A < B) -- No deadlock possible




### Case 3: Batch Import Gap Lock Storm



#### Problem Description


Batch importing user data into a table with a unique index (email). Multiple threads concurrently execute `INSERT ... ON DUPLICATE KEY UPDATE`, causing frequent deadlocks.

-- Multiple threads doing: INSERT INTO users (email, name) VALUES ('[email protected]', 'Alice') ON DUPLICATE KEY UPDATE name = VALUES(name);

-- Problem: INSERT on unique index acquires next-key lock, not just record lock -- When the key doesn't exist yet, InnoDB acquires gap locks on idx_email -- Multiple threads with gap locks on adjacent gaps → INSERT intention deadlock




#### Solution

-- Fix 1: Switch to READ COMMITTED (no gap locks) SET SESSION transaction_isolation = 'READ-COMMITTED';

-- Fix 2: Batch INSERTs in a single statement (fewer transactions) INSERT INTO users (email, name) VALUES ('[email protected]', 'Alice'), ('[email protected]', 'Bob'), ... ('[email protected]', 'Zach') ON DUPLICATE KEY UPDATE name = VALUES(name);

-- Fix 3: Sort import data by the unique key to minimize gap lock range -- Process emails alphabetically: a@..., b@..., c@... → sequential gaps

-- Fix 4: Use LOAD DATA INFILE for very large imports (single transaction)











## 13. FAQ





**Q1: Are InnoDB row locks placed on rows or indexes?**

On index records. Every InnoDB row lock (Record Lock, Gap Lock, Next-Key Lock) is attached to a leaf node in a B+Tree index. If the table has no defined indexes, InnoDB creates an implicit clustered index (based on a 6-byte ROW_ID), and row locks are placed on this implicit index. This is why `UPDATE/DELETE` on an unindexed table locks every row -- it must scan every record in the implicit index.



**Q2: Why don't Gap Locks conflict with each other?**

Because the sole purpose of a Gap Lock is to prevent INSERTs into the gap, not to protect "data" in the gap (there is no data in a gap by definition). Two transactions simultaneously declaring "no inserts allowed in this gap" creates no contradiction. This is a special check in InnoDB's `lock_rec_has_to_wait()`: if the requesting lock is not Insert Intention type, it never waits for a Gap Lock. However, this design is also the root cause of Gap Lock + INSERT deadlocks -- two transactions easily acquire Gap Locks, then both cannot INSERT due to the other's Gap Lock.



**Q3: Can deadlocks still occur under RC isolation level?**

Yes, but with dramatically reduced probability. RC has no Gap Locks, eliminating all Gap Lock related deadlock patterns (Patterns B and D). However, AB-BA ordering deadlocks (Pattern A) and secondary+primary index deadlocks (Pattern C) can still occur because they involve Record Locks, which are isolation-level independent. In practice, switching from RR to RC typically reduces deadlocks by 80%+.



**Q4: SHOW ENGINE INNODB STATUS only keeps the last deadlock — how to see history?**

Set `innodb_print_all_deadlocks = ON` to write every deadlock to the MySQL error log. Combined with ELK, Loki, or other log collection systems, you can parse and aggregate deadlock patterns, identifying high-frequency deadlock tables and indexes. Additionally, MySQL 8.0's `performance_schema.events_errors_summary_by_account_by_error` can count ERROR 1213 occurrences.



**Q5: When should innodb_deadlock_detect be turned OFF?**

Almost never. Only consider it in very specific scenarios: (1) Extreme concurrent writes to the same row (e.g., 1000+ TPS updating one row), where deadlock detection CPU overhead is the bottleneck. (2) Actual deadlock possibility has been eliminated via sharded counters/queues. (3) `innodb_lock_wait_timeout` is set to a small value (e.g., 3-5 seconds) as a fallback. Disabling deadlock detection means deadlocks require a timeout to be discovered, worsening user experience.



**Q6: What are best practices for FOR UPDATE vs FOR SHARE?**

Guidelines: (1) If you read data and intend to modify it, use `FOR UPDATE`. (2) If you only need to ensure data isn't modified during your transaction, use `FOR SHARE`. (3) Avoid locking reads on queries that don't need locks -- most read-only queries should use plain `SELECT` (snapshot read). (4) `FOR SHARE` can easily cause deadlocks: two transactions FOR SHARE the same row (compatible), then both try to UPDATE it (need X lock) → deadlock. If you read-then-write, use `FOR UPDATE` directly.



**Q7: What is the relationship between LOCK TABLES and InnoDB row locks?**

`LOCK TABLES` is a table-level lock at the MySQL Server layer, independent from InnoDB's row locks at the engine layer. Using `LOCK TABLES` with InnoDB is not recommended because it disables InnoDB's row-level lock optimizations. The correct approach is to use `SELECT ... FOR UPDATE` and transactions for concurrency control. The only scenario requiring `LOCK TABLES` is interacting with MyISAM tables (which don't support row locks or transactions).



**Q8: How to determine if my application is affected by lock contention?**

Monitor these metrics: (1) `SHOW GLOBAL STATUS LIKE 'Innodb_row_lock%'`: `Innodb_row_lock_waits` (cumulative wait count) and `Innodb_row_lock_time_avg` (average wait time in ms). (2) Row count in `performance_schema.data_lock_waits` (current waits). (3) Transactions in `information_schema.INNODB_TRX` with `trx_state='LOCK WAIT'`. (4) Queries with high `Lock_time` in the slow query log. If `Innodb_row_lock_time_avg > 500ms` or `Innodb_row_lock_waits` increases faster than 100/s, you need to investigate.



**Q9: What exact range does a Next-Key Lock cover?**

A Next-Key Lock covers a left-open, right-closed interval `(prev_record, current_record]`. For example, with index records 5, 10, 15, a Next-Key Lock on 10 covers `(5, 10]`, blocking INSERT of id=6,7,8,9 into the gap and locking record 10 itself. Note that InnoDB has optimization rules that automatically degenerate Next-Key Locks: (1) Unique index equality hit → degenerates to Record Lock; (2) Non-unique index equality scan reaches first non-matching record → degenerates to Gap Lock on that record.



**Q10: How to find "ghost transactions" holding locks without releasing?**

Common causes: application code that BEGINs without COMMIT/ROLLBACK (connection leak), or waits for external resources inside a transaction. Investigation: 1. `SELECT trx_id, trx_started, trx_state, trx_query, trx_mysql_thread_id FROM information_schema.INNODB_TRX ORDER BY trx_started;` — find transactions with old `trx_started` and `trx_query=NULL` 2. `SELECT * FROM performance_schema.threads WHERE THREAD_ID = (...);` — trace thread origin 3. If confirmed as leaked connection, `KILL ` to terminate 4. Root fix: ensure the application uses a connection pool with idle timeout and leak detection










    [← Prev: InnoDB Internals](/books/high-performance-mysql/innodb-internals)
    [Back to Contents](/books/high-performance-mysql)
    [Next: Index Optimization →](/books/high-performance-mysql/index-guide)

Rate this chapter

4.8 / 5 (21 ratings)