Chapter 29

SharedArrayBuffer and Atomics: The Memory Model for Multi-threaded JS

Web Workers gave JavaScript true multi-threading. SharedArrayBuffer lets multiple threads share a single block of memory, but shared memory brings CPU-level concurrency problems: out-of-order execution, cache incoherence, and data races. Atomics is the low-level tool for solving them.

🔹 Level 1 · What You Need to Know

Web Worker: Threads Without Shared Memory

// Main thread
const worker = new Worker('worker.js');
worker.postMessage({ data: [1, 2, 3, 4, 5] });  // deep copy via structured clone
worker.onmessage = e => console.log('result:', e.data);

postMessage uses Structured Clone by default: data is deep-copied, and both the main thread and the Worker hold independent copies. For large data (video frames, scientific arrays), copying on every message incurs significant overhead.
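The copy semantics can be observed without a Worker. The global structuredClone function (available in current browsers and in Node.js 17 or later) runs the same structured clone algorithm that postMessage applies by default. A minimal sketch; the variable names are illustrative:

```javascript
// structuredClone applies the algorithm postMessage uses by default.
const original = { data: [1, 2, 3, 4, 5] };
const copy = structuredClone(original);

// Mutating the copy leaves the original untouched: two independent object graphs.
copy.data[0] = 99;

console.log(original.data[0]); // 1
console.log(copy.data[0]);     // 99
```

Every postMessage of a plain value pays for this kind of full copy, which is why the cost grows with the size of the data.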

SharedArrayBuffer: True Shared Memory

// Main thread
const sab = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);  // 16 bytes
const view = new Int32Array(sab);
view[0] = 42;

const worker = new Worker('worker.js');
worker.postMessage(sab);  // sends a reference, not a copy!

// worker.js
self.onmessage = e => {
  const view = new Int32Array(e.data);  // same memory as main thread
  console.log(view[0]);  // 42 (same memory block)
  view[0] = 100;         // main thread sees this change too
};

Feature               | postMessage (plain value)    | postMessage (SharedArrayBuffer)
Transfer method       | Structured clone (deep copy) | Reference (shared)
Memory usage          | Two copies                   | One copy
Communication cost    | Proportional to data size    | O(1) (reference only)
Data consistency      | Independent                  | Shared; requires synchronization
Security requirements | None                         | Requires COOP + COEP headers
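The "one copy" row can be checked in a single thread, since any number of typed-array views can sit on top of the same SharedArrayBuffer. In this sketch the two views stand in for the main thread's view and the one built inside worker.js; the names are illustrative:

```javascript
// One 16-byte shared block, viewed through two independent typed arrays.
const shared = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);
const mainView = new Int32Array(shared);   // stands in for the main thread's view
const workerView = new Int32Array(shared); // stands in for the view created in worker.js

mainView[0] = 42;
console.log(workerView[0]); // 42: both views read the same bytes

workerView[0] = 100;
console.log(mainView[0]);       // 100: the write is visible through the other view
console.log(shared.byteLength); // 16
```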

Why Atomics Are Necessary

Plain reads and writes on shared memory are not safe. Even the simplest i++ can lose updates in multi-threaded code:

i++ broken down into 3 assembly steps:
  1. LOAD  r0, [addr]    // read i from memory into register
  2. ADD   r0, r0, 1     // increment register
  3. STORE [addr], r0    // write result back to memory

Two threads executing i++ simultaneously (initial i=0):
  Thread A: LOAD r0=0         Thread B: LOAD r1=0
  Thread A: ADD  r0=1         Thread B: ADD  r1=1
  Thread A: STORE i=1         Thread B: STORE i=1
  
  Result: i=1 (should be 2! One update is lost)

An atomic read-modify-write such as Atomics.add(ta, index, 1) executes these 3 steps as one indivisible operation.
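The lost update can be reproduced deterministically by writing out the interleaving by hand in a single thread. This is only a simulation of the schedule shown above, not real concurrency; the variable names are illustrative:

```javascript
const counter = new Int32Array(new SharedArrayBuffer(4));

// Simulated interleaving: both "threads" load before either stores.
const loadedByA = counter[0];   // Thread A: LOAD  (0)
const loadedByB = counter[0];   // Thread B: LOAD  (0)
counter[0] = loadedByA + 1;     // Thread A: STORE (1)
counter[0] = loadedByB + 1;     // Thread B: STORE (1), overwrites A's update
const lostUpdateResult = counter[0];

// With Atomics.add, load + add + store form one indivisible step,
// so no interleaving can separate them.
counter[0] = 0;
Atomics.add(counter, 0, 1);
Atomics.add(counter, 0, 1);
const atomicResult = Atomics.load(counter, 0);

console.log(lostUpdateResult, atomicResult); // 1 2
```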

Prerequisites: COOP + COEP Headers

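SharedArrayBuffer was disabled by default in major browsers in January 2018 in response to the Spectre attack. Since 2020, browsers have offered it to pages that opt in to cross-origin isolation with these headers: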

# Server response headers (both required)
Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp

When set, window.crossOriginIsolated is true and SharedArrayBuffer is available.
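A page can check at runtime whether shared memory is usable before relying on it. The helpers below are a hypothetical sketch (canUseSharedMemory and allocateBuffer are not standard APIs); they take the global scope as a parameter so the logic can be exercised outside a browser:

```javascript
// Hypothetical helper: is shared memory usable in the given global scope?
function canUseSharedMemory(scope = globalThis) {
  return typeof scope.SharedArrayBuffer === 'function' &&
         scope.crossOriginIsolated === true;
}

// Hypothetical fallback: a plain ArrayBuffer still works within one thread,
// but it is copied (not shared) when posted to a Worker.
function allocateBuffer(byteLength, scope = globalThis) {
  return canUseSharedMemory(scope)
    ? new scope.SharedArrayBuffer(byteLength)
    : new ArrayBuffer(byteLength);
}

// Example with stand-in scopes:
const isolated = { SharedArrayBuffer, crossOriginIsolated: true };
const notIsolated = { SharedArrayBuffer, crossOriginIsolated: false };
console.log(canUseSharedMemory(isolated));    // true
console.log(canUseSharedMemory(notIsolated)); // false
```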


🔸 Level 2 · How It Actually Works

Why Shared Memory Is Unsafe: CPU-Level Problems

Modern CPUs perform many optimizations that are invisible in single-threaded code but cause problems with shared memory in multi-threaded scenarios:

Problem 1: Out-of-Order Execution

// Your code:
store(flag, 1);    // step 1: set flag
store(data, 42);   // step 2: write data

// CPU actual execution order (possible):
store(data, 42);   // step 2 runs first (pipeline optimization)
store(flag, 1);    // step 1 runs second

// Another thread seeing flag=1 might read data before it's written!

Problem 2: CPU Cache Incoherence

CPU 0 (Thread A)              CPU 1 (Thread B)
  L1 Cache: data=0              L1 Cache: data=0 (stale)
  Write data=42 → L1            Read data (from L1 cache: still 0!)
  Waiting for cache coherency
  (MESI protocol takes time)

Each CPU core has its own caches and store buffers. A write can sit in the writing core's store buffer or L1/L2 cache before it propagates, so other cores may briefly read a stale value.

Problem 3: Compiler Reordering

Compilers (including JITs) may reorder instructions freely as long as single-threaded semantics do not change:

let x = 0, y = 0, r1, r2;
function threadA() { x = 1; r1 = y; }
function threadB() { y = 1; r2 = x; }

// After compiler reordering (legal: single-threaded semantics unchanged):
function threadA() { r1 = y; x = 1; }  // reordered!
function threadB() { r2 = x; y = 1; }  // reordered!

// Result: r1=0 and r2=0 can both happen (seemingly impossible, but real in multi-threaded code)

Atomics Operations Semantics

Atomics operations (on integer typed arrays: Int8, Uint8, Int16, Uint16, Int32, Uint32, BigInt64, BigUint64)

Read/Write:
  Atomics.load(ta, index)            atomic read
  Atomics.store(ta, index, value)    atomic write

Arithmetic and bitwise (each returns the old value):
  Atomics.add(ta, index, value)      atomic add
  Atomics.sub(ta, index, value)      atomic subtract
  Atomics.and(ta, index, value)      atomic bitwise AND
  Atomics.or(ta, index, value)       atomic bitwise OR
  Atomics.xor(ta, index, value)      atomic bitwise XOR

CAS operations:
  Atomics.compareExchange(ta, index, expected, replacement)
      if the current value equals expected, replace it with replacement;
      returns the old value either way
  Atomics.exchange(ta, index, value)
      unconditional replace; returns the old value

Wait/Notify (Int32Array and BigInt64Array only):
  Atomics.wait(ta, index, value, timeout?)
      if ta[index] === value, sleep until notified or timed out
      (only in agents that may block, such as Dedicated Workers)
  Atomics.notify(ta, index, count?)
      wake waiting agents (can be called from the main thread)
  Atomics.waitAsync(ta, index, value, timeout?)
      non-blocking wait; returns { async, value }, where value is a
      Promise when async is true (ES2024)
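Because every read-modify-write operation returns the previous value, the return values are worth seeing once. A small single-threaded sketch; the array and variable names are illustrative:

```javascript
const ta = new Int32Array(new SharedArrayBuffer(8));

Atomics.store(ta, 0, 10);
const beforeAdd = Atomics.add(ta, 0, 5);       // returns 10; ta[0] is now 15
const beforeAnd = Atomics.and(ta, 0, 0b1100);  // returns 15; ta[0] is now 12 (15 & 12)

// CAS succeeds only when the current value matches the expected one.
const casHit = Atomics.compareExchange(ta, 0, 12, 7);   // returns 12; ta[0] becomes 7
const casMiss = Atomics.compareExchange(ta, 0, 12, 99); // returns 7; ta[0] is unchanged

const beforeSwap = Atomics.exchange(ta, 0, 1); // returns 7; ta[0] becomes 1

console.log(beforeAdd, beforeAnd, casHit, casMiss, beforeSwap, Atomics.load(ta, 0));
// 10 15 12 7 7 1
```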

Implementing a Mutex with SharedArrayBuffer + Atomics

class Mutex {
  constructor(sharedBuffer, byteOffset = 0) {
    this._sa = new Int32Array(sharedBuffer, byteOffset, 1);
    // _sa[0]: 0 = unlocked, 1 = locked
  }

  lock() {
    // Try the CAS 0 → 1; if it fails, sleep until notified and try again
    while (true) {
      // compareExchange(array, index, expectedValue, replacementValue)
      // If _sa[0] === 0 (unlocked): set to 1 (locked), return 0 (success)
      // If _sa[0] !== 0 (locked): no change, return current value (failure)
      const previous = Atomics.compareExchange(this._sa, 0, 0, 1);
      if (previous === 0) {
        return;  // lock acquired
      }
      // Failed: sleep while _sa[0] is still 1 (returns at once if it already changed).
      // Atomics.wait needs an agent that may block, so call lock() from a Worker.
      Atomics.wait(this._sa, 0, 1);
    }
  }

  unlock() {
    Atomics.store(this._sa, 0, 0);   // release lock (write 0)
    Atomics.notify(this._sa, 0, 1);  // wake one waiting Worker
  }
}

// Usage in a Worker:
mutex.lock();
try {
  sharedData[0] += 1;  // critical section: only one Worker at a time
} finally {
  mutex.unlock();
}
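Where blocking is not allowed (for example on the browser main thread), a non-blocking attempt is sometimes enough. The class below is a hypothetical variant, not part of the Mutex above, that uses the same 0 = unlocked / 1 = locked convention and only the compareExchange step:

```javascript
// Hypothetical non-blocking variant of the lock word protocol.
class TryMutex {
  constructor(sharedBuffer, byteOffset = 0) {
    this._sa = new Int32Array(sharedBuffer, byteOffset, 1);
  }

  // Returns true if the lock was acquired, false if it is already held.
  tryLock() {
    return Atomics.compareExchange(this._sa, 0, 0, 1) === 0;
  }

  unlock() {
    Atomics.store(this._sa, 0, 0);   // release lock
    Atomics.notify(this._sa, 0, 1);  // wake one blocked waiter, if any
  }
}

// Two handles over the same lock word, as two Workers would hold.
const lockWord = new SharedArrayBuffer(4);
const first = new TryMutex(lockWord);
const second = new TryMutex(lockWord);

const firstAttempt = first.tryLock();    // true: lock was free
const secondAttempt = second.tryLock();  // false: first still holds it
first.unlock();
const thirdAttempt = second.tryLock();   // true: lock was released
```

The caller decides what to do on failure (retry later, skip the work, or fall back to messaging), which keeps the thread responsive at the price of extra control flow.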

happens-before and the Memory Model

JavaScript's memory model (ECMAScript spec Chapter 29) defines "when is a write visible to a read":

happens-before Rules (simplified):

1. Program order: within a single Agent, earlier events happen-before later ones
   write(x, 1)  happens-before  read(x)  (same thread)

2. Synchronizes-with:
   Atomics.store(ta, i, v)  synchronizes-with  the Atomics.load(ta, i) that reads v
   Atomics.notify()         synchronizes-with  the corresponding Atomics.wait() that returns

3. Transitivity:
   A happens-before B, B happens-before C  →  A happens-before C

Practical meaning: If thread A writes via Atomics.store and thread B reads that value via Atomics.load, then all of A's writes before the store are visible to all of B's code after the load. In effect, the store/load pair acts as a memory barrier.
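This is the classic flag-and-payload pattern from Problem 1, made safe. In the sketch below the layout (index 0 as flag, index 1 as payload) and the function names publish and tryRead are assumptions for illustration. The two functions would normally run in different agents; here they are called in sequence only to show the protocol:

```javascript
// Assumed layout for this sketch: index 0 = flag, index 1 = payload.
const FLAG = 0;
const PAYLOAD = 1;

// Writer side: plain write first, then the atomic store that publishes it.
function publish(ta, value) {
  ta[PAYLOAD] = value;         // ordered before the atomic store below
  Atomics.store(ta, FLAG, 1);  // readers that load 1 also see the payload
}

// Reader side: atomic load first, plain read only after the flag is seen.
function tryRead(ta) {
  if (Atomics.load(ta, FLAG) !== 1) {
    return undefined;          // nothing published yet
  }
  return ta[PAYLOAD];          // happens-after the writer's payload write
}

const channel = new Int32Array(new SharedArrayBuffer(8));
const beforePublish = tryRead(channel); // undefined
publish(channel, 42);
const afterPublish = tryRead(channel);  // 42
```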

Producer-Consumer Pattern

// Shared memory layout (one producer, one consumer):
// [0]: status (0=empty, 1=writing, 2=ready)
// [1]: data value

// Producer (Worker A):
function produce(sab, value) {
  const control = new Int32Array(sab, 0, 1);
  const data = new Int32Array(sab, 4, 1);

  // Sleep until the slot is empty; wait() returns at once if the status already changed
  let status;
  while ((status = Atomics.load(control, 0)) !== 0) {
    Atomics.wait(control, 0, status);
  }

  Atomics.store(control, 0, 1);  // mark as writing
  Atomics.store(data, 0, value);
  Atomics.store(control, 0, 2);  // mark as ready
  Atomics.notify(control, 0);    // wake consumer
}

// Consumer (Worker B):
function consume(sab) {
  const control = new Int32Array(sab, 0, 1);
  const data = new Int32Array(sab, 4, 1);

  // Sleep until the slot is ready (status 2)
  let status;
  while ((status = Atomics.load(control, 0)) !== 2) {
    Atomics.wait(control, 0, status);
  }

  const value = Atomics.load(data, 0);
  Atomics.store(control, 0, 0);  // reset to empty
  Atomics.notify(control, 0);    // wake producer
  return value;
}
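Waiting loops like these depend on what Atomics.wait returns. The sketch below shows the two results that need no second thread. It assumes an environment where blocking is allowed, such as Node.js or a Dedicated Worker; on the browser main thread the calls would throw instead:

```javascript
const control = new Int32Array(new SharedArrayBuffer(4)); // control[0] starts at 0

// The stored value (0) differs from the expected value (1),
// so wait() returns immediately without sleeping.
const mismatch = Atomics.wait(control, 0, 1, 50);

// The stored value matches and nobody calls notify,
// so wait() sleeps for roughly 50 ms and then gives up.
const expired = Atomics.wait(control, 0, 0, 50);

console.log(mismatch, expired); // not-equal timed-out
```

The third possible result, 'ok', is returned when another agent calls Atomics.notify on the same location while this one is asleep. Since a wake-up does not say what the value now is, the loops re-check the status after every return.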

🔺 Level 3 · How the Spec Defines It

ECMAScript Spec: Agent and AgentCluster

Spec §9.7 defines the Agent concept:

An agent comprises a set of ECMAScript execution contexts, an execution context stack, a running execution context, an Agent Record, and an executing thread.

Every JavaScript thread (main thread, each Web Worker) is an independent Agent.

An agent cluster is a maximal set of agents that can communicate by operating on shared memory.

The main thread and all Workers sharing memory via SharedArrayBuffer belong to the same AgentCluster. Agents within the same cluster can communicate via SharedArrayBuffer; different clusters are completely isolated.

The [[CanBlock]] property:

Agent Type            | [[CanBlock]] | Can call Atomics.wait
Main thread (browser) | false        | No (would block UI)
Dedicated Worker      | true         | Yes
Shared Worker         | true         | Yes
Service Worker        | false        | No

(Values as assigned by the HTML Standard when it creates each kind of agent.)

Spec §25.4: Atomics Object

The spec defines Atomics.compareExchange semantics under §25.4. In simplified form (exact steps and subsection numbers vary by edition):

Atomics.compareExchange ( typedArray, index, expectedValue, replacementValue )

  1. Let buffer be ? ValidateIntegerTypedArray(typedArray).
  2. Let i be ? ValidateAtomicAccess(typedArray, index).
  3. Let expected be ? ToInteger(expectedValue).
  4. Let replacement be ? ToInteger(replacementValue).
  5. Let elementSize be the Element Size value for typedArray.[[TypedArrayName]].
  6. Let offset be typedArray.[[ByteOffset]].
  7. Let byteIndex be (i × elementSize) + offset.
  8. Let v be ? GetValueFromBuffer(buffer, byteIndex, typedArray.[[TypedArrayName]], true, SeqCst).
  9. If v = expected, then
     a. Perform ? SetValueInBuffer(buffer, byteIndex, typedArray.[[TypedArrayName]], replacement, true, SeqCst).
  10. Return v.

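The SeqCst (Sequentially Consistent) parameter specifies the memory order. All Atomics operations use sequential consistency, the strongest memory ordering guarantee. The read and the conditional write are carried out as one indivisible read-modify-write, not as two separately observable steps.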
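compareExchange is the building block for atomic updates that have no dedicated Atomics method. A common pattern is the CAS retry loop, sketched below with a hypothetical helper named atomicUpdate. It is exercised in a single thread here, so the retry branch never triggers, but the loop is written for concurrent use:

```javascript
// Hypothetical helper: atomically replaces ta[index] with update(oldValue)
// and returns the new value.
function atomicUpdate(ta, index, update) {
  let old = Atomics.load(ta, index);
  while (true) {
    const next = update(old);
    const seen = Atomics.compareExchange(ta, index, old, next);
    if (seen === old) {
      return next;  // no other agent changed the value in between
    }
    old = seen;     // lost the race: retry from the value just observed
  }
}

const cell = new Int32Array(new SharedArrayBuffer(4));
Atomics.store(cell, 0, 3);

const doubled = atomicUpdate(cell, 0, v => v * 2);          // 6
const raised = atomicUpdate(cell, 0, v => Math.max(v, 4));  // 6 (already above 4)
```

The update callback should be a pure function that returns a value within the element type's range, because it may run more than once when the loop retries.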

Spec §29: The Memory Model

The ECMAScript memory model (introduced in ES2017, based on a subset of the C++11 memory model) formalizes shared memory behavior.

Candidate Execution is the core formalization concept. A candidate execution is a tuple of event sets and relations:

CandidateExecution = {
  EventSet: E,               // all memory events (reads/writes)
  AgentOrder: ao,            // program order within each Agent
  ReadsBytesFrom: rbf,       // which write each read byte comes from
  ReadsFrom: rf,             // higher-level read-from relation
  HostSynchronizesWith: hsw, // host-defined synchronization
  SynchronizesWith: sw,      // synchronization from Atomics ops
  HappensBefore: hb          // transitive closure of sw and ao
}

A valid execution must satisfy all of the memory model's constraints. Within that framework, the spec identifies which executions contain a data race.

Spec definition of data race (paraphrased):

There is a data race in a candidate execution E if there exist two events (E1, E2) in EventSet such that:

  • E1 and E2 are in different agents
  • E1 and E2 access overlapping byte ranges
  • At least one of E1, E2 is a write
  • At least one of E1, E2 is a non-atomic access
  • It is not the case that E1 happens-before E2 or E2 happens-before E1
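The conditions can be read as a predicate over pairs of events. The sketch below uses a toy event shape ({ agent, start, end, atomic, write }) that is an assumption of this example, not the spec's record types. It also includes the usual requirement that at least one access is a write, and the happens-before relation is supplied by the caller:

```javascript
// Toy event model: byte range is half-open, [start, end).
function overlaps(e1, e2) {
  return e1.start < e2.end && e2.start < e1.end;
}

// happensBefore(a, b) is a caller-supplied function returning a boolean.
function isDataRace(e1, e2, happensBefore) {
  return e1.agent !== e2.agent &&
         overlaps(e1, e2) &&
         (e1.write || e2.write) &&        // at least one access is a write
         (!e1.atomic || !e2.atomic) &&    // at least one access is non-atomic
         !happensBefore(e1, e2) &&
         !happensBefore(e2, e1);
}

const plainWrite = { agent: 'A', start: 0, end: 4, atomic: false, write: true };
const plainRead = { agent: 'B', start: 0, end: 4, atomic: false, write: false };
const atomicWrite = { agent: 'A', start: 0, end: 4, atomic: true, write: true };
const atomicRead = { agent: 'B', start: 0, end: 4, atomic: true, write: false };

const unordered = () => false;                    // no synchronization at all
const writeFirst = (a, b) => a.write && !b.write; // the write is ordered before the read

console.log(isDataRace(plainWrite, plainRead, unordered));   // true
console.log(isDataRace(atomicWrite, atomicRead, unordered)); // false
console.log(isDataRace(plainWrite, plainRead, writeFirst));  // false
```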

💎 Level 4 · Edge Cases and Traps

Trap 1: The Spectre Attack Background (Why SharedArrayBuffer Was Disabled in 2018)

In January 2018, the Spectre CPU vulnerability was disclosed. Attackers use speculative execution and side-channel timing attacks to read memory that should be isolated. SharedArrayBuffer provided a high-precision shared timer (a Worker incrementing a shared counter in a tight loop) that made Spectre attacks practical:

// Conceptual Spectre timing primitive (browsers now gate SharedArrayBuffer behind cross-origin isolation):
const sab = new SharedArrayBuffer(8);
const timer = new BigInt64Array(sab);

// Worker continuously increments counter:
Atomics.add(timer, 0, 1n);  // loop in Worker

// Main thread uses it as a microsecond-precision clock:
const start = Atomics.load(timer, 0);
// ... perform some operation ...
const elapsed = Atomics.load(timer, 0) - start;
// elapsed precision reaches microseconds, which is sufficient for side-channel attacks

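Solution: COOP and COEP place the page in its own browsing context group (typically its own process) and ensure that every cross-origin resource loaded into it has opted in. A Spectre-style read is then limited to data the page was already allowed to see, which mitigates the attack rather than removing the underlying CPU behavior.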

Trap 2: Atomics.wait Is Unavailable on the Main Thread

const sab = new SharedArrayBuffer(4);
const sa = new Int32Array(sab);

// Wrong: calling Atomics.wait on the browser main thread throws a TypeError
Atomics.wait(sa, 0, 0);
// TypeError (the exact message varies by engine)

// Correct option 1: use wait inside a Worker
// worker.js:
Atomics.wait(sa, 0, 0);  // legal: a Dedicated Worker has [[CanBlock]] = true

// Correct option 2: use waitAsync on the main thread (ES2024)
const { async: isAsync, value } = Atomics.waitAsync(sa, 0, 0);
if (isAsync) {
  const result = await value;  // resolves to 'ok' or 'timed-out'
} else {
  // value is already a string: 'not-equal' or 'timed-out'
}
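The two result shapes of Atomics.waitAsync can be seen without a second thread. The sketch below guards the call because older runtimes lack it; the variable names are illustrative, and the pending waiter is released by a notify from the same thread:

```javascript
const sa = new Int32Array(new SharedArrayBuffer(4)); // sa[0] starts at 0

let immediate = null;
let pending = null;
let woken = 0;

if (typeof Atomics.waitAsync === 'function') {
  // Value mismatch (0 stored, 1 expected): answered synchronously, no Promise.
  immediate = Atomics.waitAsync(sa, 0, 1);

  // Value matches: a waiter is registered and value is a Promise.
  pending = Atomics.waitAsync(sa, 0, 0);

  // Any agent, including the main thread, may wake the waiter.
  Atomics.store(sa, 0, 1);
  woken = Atomics.notify(sa, 0);

  console.log(immediate.async, immediate.value); // false 'not-equal'
  console.log(pending.async);                    // true
}
```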

Trap 3: Data Race vs. Race Condition: They Are Different

// Data race: unsynchronized conflicting access; the memory model gives only weak guarantees
const sab = new SharedArrayBuffer(4);
const sa = new Int32Array(sab);

// Worker 1:
sa[0] = 1;  // non-atomic write

// Worker 2:
const v = sa[0];  // non-atomic read, conflicts with Worker 1's write

// This IS a data race. Unlike C/C++, JavaScript does not make it undefined
// behavior, but the outcome is unspecified among the values the model allows:
// - v might be 0 (old value)
// - v might be 1 (new value)
// - accesses through DataView, Float64Array, or plain BigInt64Array reads may tear,
//   mixing bytes from different writes
// - nothing is guaranteed about the order in which Worker 1's other plain writes become visible

// ──────────────────────────────────────────────────────

// Race condition: a logic bug; every access is well-defined, but the result depends on timing
const count = new Int32Array(sab);

// Worker 1:
Atomics.add(count, 0, 1);   // atomic: no data race
const seen = Atomics.load(count, 0);
// But if Worker 2 also ran add concurrently, the value of seen depends on scheduling

// Race condition: result depends on execution order, but behavior is defined
// Needs a lock or other synchronization to be correct
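For a simple counter, the race condition often disappears once the code uses the value returned by the atomic operation instead of issuing a separate load. A minimal sketch; takeTicket is a hypothetical helper, called twice in one thread to stand in for two Workers:

```javascript
const tickets = new Int32Array(new SharedArrayBuffer(4));

// Atomics.add returns the value from before the increment. Each caller
// therefore receives a distinct number, however the calls interleave.
function takeTicket(ta) {
  return Atomics.add(ta, 0, 1);
}

const ticketA = takeTicket(tickets); // 0
const ticketB = takeTicket(tickets); // 1
console.log(ticketA, ticketB, Atomics.load(tickets, 0)); // 0 1 2
```

Updates that span several locations cannot be expressed as one atomic operation and still need a lock such as the Mutex shown earlier.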

Trap 4: WebAssembly Threads and SharedArrayBuffer

// Load a threading-capable WASM module
const memory = new WebAssembly.Memory({
  initial: 10,    // 10 pages = 640KB
  maximum: 100,
  shared: true    // use SharedArrayBuffer as backing store
});

const { instance } = await WebAssembly.instantiateStreaming(
  fetch('threaded.wasm'),
  { env: { memory } }
);

// WASM memory's backing store is SharedArrayBuffer
console.log(memory.buffer instanceof SharedArrayBuffer);  // true

// WASM threads directly access the same memory block
// pthreads in WASM are emulated via Worker + SharedArrayBuffer

Key points:

  1. WASM threads require the module compiled with threads enabled (-pthread flag in Emscripten)
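  2. The WASM instructions memory.atomic.wait32 and memory.atomic.wait64 are the counterparts of Atomics.wait, and memory.atomic.notify is the counterpart of Atomics.notify
  3. WASM synchronization primitives (mutex, semaphore) are typically built on these wait/notify operations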
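Because a shared WebAssembly.Memory is backed by a SharedArrayBuffer, JavaScript can apply Atomics to it directly, without loading any module. A small sketch; the sizes and names are illustrative, and it assumes a runtime with WebAssembly threads support:

```javascript
// One 64 KiB page, growable to two, backed by shared memory.
const wasmMemory = new WebAssembly.Memory({ initial: 1, maximum: 2, shared: true });

const isShared = wasmMemory.buffer instanceof SharedArrayBuffer;
const byteLength = wasmMemory.buffer.byteLength; // 1 page = 65536 bytes

// JS-side atomics operate on the same bytes that WASM threads would see.
const cells = new Int32Array(wasmMemory.buffer);
Atomics.add(cells, 0, 5);
const firstCell = Atomics.load(cells, 0);

console.log(isShared, byteLength, firstCell); // true 65536 5
```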

Trap 5: ArrayBuffer Transfer vs. SharedArrayBuffer

// ArrayBuffer supports transfer (zero-copy ownership transfer)
const buffer = new ArrayBuffer(1024);
const view = new Int32Array(buffer);
view[0] = 42;

worker.postMessage(buffer, [buffer]);  // transfer ownership
console.log(buffer.byteLength);  // 0 (now detached)

// ──────────────────────────────────────────────

// SharedArrayBuffer does NOT support transfer
// (it's shared by definition, so there is no "owner" concept)
const sab = new SharedArrayBuffer(1024);
worker.postMessage(sab);  // shares directly, no transfer needed
// sab remains valid in the main thread (byteLength still 1024)

Performance best practices:

Scenario                     | Recommended Approach
One-time large data transfer | ArrayBuffer transfer (zero-copy, but ownership moves)
Continuously shared data     | SharedArrayBuffer + Atomics (permanent sharing, needs sync)
Small frequent messages      | postMessage (simple, accepts copying overhead)
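Ownership transfer can be tried without a Worker, because structuredClone accepts the same transfer list as postMessage (in current browsers and Node.js 17 or later). A minimal sketch with illustrative names:

```javascript
const source = new ArrayBuffer(1024);
new Int32Array(source)[0] = 42;

// Transfer: the bytes move to `moved`, and `source` becomes detached.
const moved = structuredClone(source, { transfer: [source] });

console.log(source.byteLength);        // 0
console.log(moved.byteLength);         // 1024
console.log(new Int32Array(moved)[0]); // 42
```

After the transfer only one side can touch the data, which is why no synchronization is needed, unlike with SharedArrayBuffer.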

Summary

  1. SharedArrayBuffer lets multiple Workers share one memory block, but requires COOP + COEP headers as a Spectre mitigation.
  2. Unsynchronized shared memory access causes data races: the memory model leaves the observed values unspecified among several allowed outcomes, which is more than a plain logic error.
  3. Atomics.wait is only available in Agents with [[CanBlock]] = true (for example Dedicated Workers); the main thread must use Atomics.waitAsync (ES2024).
  4. All Atomics operations use Sequential Consistency memory ordering, the strongest multi-threading visibility guarantee.
  5. A data race differs from a race condition: the former is an unsynchronized conflicting access for which the memory model gives only weak guarantees; the latter is a logic bug in which every access is well-defined but the result depends on timing.