sync Package: Mutex, WaitGroup, Once, Pool
Go's concurrency model centers on CSP (Communicating Sequential Processes), with channels as the preferred synchronization method. But in practice, not every concurrency problem is best solved with channels. When multiple goroutines need to access shared data structures, direct locking is often simpler and more efficient than passing ownership through channels. The standard library's sync package provides a carefully designed set of low-level synchronization primitives—the foundational tools for building high-performance concurrent programs.
The design philosophy of sync is "less is more"—it provides only the most essential primitives, each with a clear use case. As Russ Cox discussed in Go 2017: "The sync package is for those scenarios where channels can't solve the problem or would be too awkward."
Level 1: What You Need to Know
Mutex: Mutual Exclusion Lock
sync.Mutex is the most basic synchronization primitive—it ensures only one goroutine can access a critical section at a time.
type SafeCounter struct {
mu sync.Mutex
count int
}
func (c *SafeCounter) Increment() {
c.mu.Lock()
c.count++
c.mu.Unlock()
}
func (c *SafeCounter) Get() int {
c.mu.Lock()
defer c.mu.Unlock()
return c.count
}
Key rules:
- Lock() acquires the lock; blocks if it is already held
- Unlock() releases the lock; calling it on an unlocked mutex is a fatal run-time error that cannot be recovered
- Locks aren't bound to goroutines: goroutine A can lock and goroutine B can unlock (but this is bad practice)
- Always use defer for unlock, unless you have a clear performance reason not to
func (c *SafeCounter) IncrementBad() {
c.mu.Lock()
// If this panics, lock is never released—deadlock!
riskyOperation()
c.mu.Unlock()
}
func (c *SafeCounter) IncrementGood() {
c.mu.Lock()
defer c.mu.Unlock() // Released even on panic
riskyOperation()
}
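The difference between the two versions above can be observed directly. The following is a minimal sketch, assuming a hypothetical guarded type and helper names that are not part of the original code; it recovers from a panic and then uses TryLock (available since Go 1.18) to check whether the mutex was released.

```go
package main

import (
	"fmt"
	"sync"
)

// guarded is a hypothetical type used only for this illustration.
type guarded struct {
	mu sync.Mutex
}

// withDefer locks, defers the unlock, then runs fn.
// If fn panics, the deferred Unlock still runs while the stack unwinds.
func (g *guarded) withDefer(fn func()) {
	g.mu.Lock()
	defer g.mu.Unlock()
	fn()
}

// withoutDefer unlocks manually, so a panic in fn skips the Unlock.
func (g *guarded) withoutDefer(fn func()) {
	g.mu.Lock()
	fn()
	g.mu.Unlock()
}

// lockFreeAfterPanic runs call with a panicking function, recovers,
// and reports whether the mutex can still be acquired afterwards.
func lockFreeAfterPanic(g *guarded, call func(func())) bool {
	func() {
		defer func() { _ = recover() }()
		call(func() { panic("boom") })
	}()
	if g.mu.TryLock() { // TryLock requires Go 1.18+
		g.mu.Unlock()
		return true
	}
	return false
}

func main() {
	a, b := &guarded{}, &guarded{}
	fmt.Println("with defer, lock free after panic:", lockFreeAfterPanic(a, a.withDefer))       // true
	fmt.Println("without defer, lock free after panic:", lockFreeAfterPanic(b, b.withoutDefer)) // false
}
```

In the second case the mutex stays locked forever, so any later Lock call on it would block.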
Complete Example: Thread-Safe Map
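The built-in map is not safe for concurrent use. Since Go 1.6, the runtime makes a best-effort attempt to detect concurrent map reads and writes; when it does, the program crashes with an unrecoverable fatal error (for example "fatal error: concurrent map read and map write") rather than a panic that recover could catch.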
type SafeMap[K comparable, V any] struct {
mu sync.Mutex
m map[K]V
}
func NewSafeMap[K comparable, V any]() *SafeMap[K, V] {
return &SafeMap[K, V]{m: make(map[K]V)}
}
func (sm *SafeMap[K, V]) Get(key K) (V, bool) {
sm.mu.Lock()
defer sm.mu.Unlock()
v, ok := sm.m[key]
return v, ok
}
func (sm *SafeMap[K, V]) Set(key K, value V) {
sm.mu.Lock()
defer sm.mu.Unlock()
sm.m[key] = value
}
func (sm *SafeMap[K, V]) Delete(key K) {
sm.mu.Lock()
defer sm.mu.Unlock()
delete(sm.m, key)
}
RWMutex: Read-Write Lock
If reads vastly outnumber writes, sync.Mutex makes all reads block each other—wasteful. sync.RWMutex allows multiple concurrent readers, only requiring mutual exclusion for writes.
type Config struct {
mu sync.RWMutex
data map[string]string
}
func (c *Config) Get(key string) string {
c.mu.RLock() // Read lock: multiple reads can proceed concurrently
defer c.mu.RUnlock()
return c.data[key]
}
func (c *Config) Set(key, value string) {
c.mu.Lock() // Write lock: exclusive access
defer c.mu.Unlock()
c.data[key] = value
}
Read-write lock semantics:
- RLock(): Acquire read lock. Blocks if the write lock is held or a writer is already waiting; otherwise succeeds (can coexist with other read locks)
- RUnlock(): Release read lock
- Lock(): Acquire write lock. Blocks if any lock (read or write) is held
- Unlock(): Release write lock
When to use RWMutex?
Rule of thumb: RWMutex tends to pay off only when reads outnumber writes by roughly 10:1 or more. When the ratio is near 1:1, RWMutex's extra bookkeeping (it maintains reader counters internally) can make it slower than a plain Mutex, so benchmark your own workload before switching.
| Read:Write Ratio | Recommendation |
|---|---|
| 1:1 | sync.Mutex |
| 5:1 | sync.Mutex (borderline, benchmark needed) |
| 10:1+ | sync.RWMutex |
| 100:1+ | Consider sync.Map or atomic |
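Because these thresholds depend heavily on hardware and critical-section length, it helps to measure. The following is a rough sketch, assuming hypothetical mutexStore and rwStore types and a timeReads helper that are not from the original text. It uses wall-clock timing only, so treat the numbers as a hint; a proper comparison would use go test -bench.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// mutexStore guards reads with a plain Mutex.
type mutexStore struct {
	mu   sync.Mutex
	data map[string]string
}

func (s *mutexStore) Get(k string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.data[k]
}

// rwStore guards reads with the read side of an RWMutex.
type rwStore struct {
	mu   sync.RWMutex
	data map[string]string
}

func (s *rwStore) Get(k string) string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.data[k]
}

// timeReads runs `readers` goroutines that each call get `iters` times
// and returns the total wall-clock time. It is a coarse measurement only.
func timeReads(get func(string) string, readers, iters int) time.Duration {
	var wg sync.WaitGroup
	start := time.Now()
	for r := 0; r < readers; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				_ = get("k")
			}
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	m := &mutexStore{data: map[string]string{"k": "v"}}
	rw := &rwStore{data: map[string]string{"k": "v"}}
	fmt.Println("Mutex:  ", timeReads(m.Get, 8, 200000))
	fmt.Println("RWMutex:", timeReads(rw.Get, 8, 200000))
}
```

This sketch measures a read-only workload; adding a writer goroutine at different frequencies would show where the crossover falls on a given machine.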
WaitGroup: Waiting for a Group of Goroutines
sync.WaitGroup waits for a group of goroutines to complete. It's the core tool for the "fork-join" concurrency model.
func fetchAll(urls []string) []string {
var wg sync.WaitGroup
results := make([]string, len(urls))
for i, url := range urls {
wg.Add(1) // Call Add BEFORE launching goroutine
go func(idx int, u string) {
defer wg.Done() // Call Done when goroutine completes
resp, err := http.Get(u)
if err != nil {
results[idx] = "error"
return
}
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)
results[idx] = string(body)
}(i, url)
}
wg.Wait() // Blocks until all goroutines call Done
return results
}
Key rules:
- Add(n) must be called before the go statement (otherwise Wait might return before Add)
- Done() is equivalent to Add(-1)
- Counter going negative causes panic
- WaitGroup can be reused (after counter returns to 0, you can Add again)
Common mistake: Calling Add inside the goroutine
// Wrong! May cause Wait to return early
for _, url := range urls {
go func(u string) {
wg.Add(1) // Too late! main goroutine may already be at Wait()
defer wg.Done()
fetch(u)
}(url)
}
wg.Wait()
Correct approach:
for _, url := range urls {
wg.Add(1) // Add before launching goroutine
go func(u string) {
defer wg.Done()
fetch(u)
}(url)
}
wg.Wait()
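The snippets above depend on an undefined fetch function. As a self-contained illustration of the same fork-join shape, here is a minimal sketch that assumes a hypothetical squareAll function in place of network calls.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll computes the square of every input in its own goroutine.
// Each goroutine writes only to its own index, so no mutex is needed.
func squareAll(nums []int) []int {
	var wg sync.WaitGroup
	out := make([]int, len(nums))
	for i, n := range nums {
		wg.Add(1) // counted before the goroutine starts
		go func(idx, v int) {
			defer wg.Done()
			out[idx] = v * v
		}(i, n)
	}
	wg.Wait() // all writes to out are visible once Wait returns
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```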
sync.Once: Guarantee Single Execution
sync.Once ensures a function executes exactly once regardless of how many goroutines call it. The most common use is singleton initialization.
var (
instance *Database
once sync.Once
)
func GetDB() *Database {
once.Do(func() {
// This function runs only once, even if 1000 goroutines call GetDB simultaneously
instance = &Database{
conn: connectToDB(),
}
})
return instance
}
sync.Once guarantees:
- Function executes only once (even with concurrent calls)
- All callers wait until the first execution completes before returning
- After first execution completes, all subsequent Do calls return immediately (near-zero overhead)
Note: If the function passed to Once.Do panics, Once still considers it "done." Subsequent calls won't re-execute:
var once sync.Once
once.Do(func() {
panic("oops") // Panicked
})
once.Do(func() {
fmt.Println("this will never print") // Won't execute
})
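The snippet above would crash the program at the first panic. To observe the "still counts as done" behavior, the panic has to be recovered first. A minimal sketch, assuming a hypothetical helper named runsAfterPanic:

```go
package main

import (
	"fmt"
	"sync"
)

// runsAfterPanic reports whether a second Do call runs its function
// after the first Do call panicked.
func runsAfterPanic() bool {
	var once sync.Once
	func() {
		defer func() { _ = recover() }() // swallow the panic from the first Do
		once.Do(func() { panic("oops") })
	}()
	ran := false
	once.Do(func() { ran = true })
	return ran
}

func main() {
	fmt.Println("second Do ran:", runsAfterPanic()) // prints "second Do ran: false"
}
```

If initialization can fail, it is usually safer to have the function record an error value than to rely on a panic, since Once will not retry.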
Starting from Go 1.21, new helpers sync.OnceFunc, sync.OnceValue, and sync.OnceValues provide a more convenient API:
// Go 1.21+
getDB := sync.OnceValue(func() *Database {
return &Database{conn: connectToDB()}
})
db := getDB() // First call initializes, subsequent calls return cached value
Practical Example: Concurrency-Safe Cache
A simple expiring cache built from an RWMutex and a background cleanup goroutine:
type Cache struct {
mu sync.RWMutex
items map[string]*cacheItem
}
type cacheItem struct {
value interface{}
expiry time.Time
}
func NewCache() *Cache {
c := &Cache{items: make(map[string]*cacheItem)}
// Launch background goroutine to clean expired items
go c.janitor()
return c
}
func (c *Cache) Get(key string) (interface{}, bool) {
c.mu.RLock()
defer c.mu.RUnlock()
item, exists := c.items[key]
if !exists {
return nil, false
}
if time.Now().After(item.expiry) {
return nil, false // Expired, treat as non-existent
}
return item.value, true
}
func (c *Cache) Set(key string, value interface{}, ttl time.Duration) {
c.mu.Lock()
defer c.mu.Unlock()
c.items[key] = &cacheItem{
value: value,
expiry: time.Now().Add(ttl),
}
}
func (c *Cache) janitor() {
ticker := time.NewTicker(1 * time.Minute)
defer ticker.Stop()
for range ticker.C {
c.mu.Lock()
for key, item := range c.items {
if time.Now().After(item.expiry) {
delete(c.items, key)
}
}
c.mu.Unlock()
}
}
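One gap in the cache above is that the janitor goroutine never stops, so every Cache leaks a goroutine. The following is a minimal sketch of one possible shutdown design, assuming a cut-down hypothetical type named expiringSet rather than the Cache itself. It uses sync.Once so Close can be called more than once, and a WaitGroup so Close returns only after the janitor has exited.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// expiringSet is a hypothetical stand-in for the Cache above: keys with expiry times.
type expiringSet struct {
	mu     sync.RWMutex
	expiry map[string]time.Time

	stop     chan struct{}  // closed to ask the janitor to exit
	stopOnce sync.Once      // makes Close safe to call more than once
	wg       sync.WaitGroup // lets Close wait for the janitor to finish
}

func newExpiringSet(interval time.Duration) *expiringSet {
	s := &expiringSet{expiry: make(map[string]time.Time), stop: make(chan struct{})}
	s.wg.Add(1)
	go s.janitor(interval)
	return s
}

func (s *expiringSet) Add(key string, ttl time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expiry[key] = time.Now().Add(ttl)
}

func (s *expiringSet) Len() int {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return len(s.expiry)
}

func (s *expiringSet) janitor(interval time.Duration) {
	defer s.wg.Done()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-s.stop:
			return
		case now := <-ticker.C:
			s.mu.Lock()
			for k, t := range s.expiry {
				if now.After(t) {
					delete(s.expiry, k)
				}
			}
			s.mu.Unlock()
		}
	}
}

// Close stops the janitor and waits for it to exit. Extra calls are no-ops.
func (s *expiringSet) Close() {
	s.stopOnce.Do(func() { close(s.stop) })
	s.wg.Wait()
}

func main() {
	s := newExpiringSet(10 * time.Millisecond)
	s.Add("a", time.Hour)
	s.Close()
	s.Close() // safe: the channel is closed only once
	fmt.Println("entries after close:", s.Len()) // prints "entries after close: 1"
}
```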
Level 2: How It Works Under the Hood
sync.Pool: Object Reuse
sync.Pool is a temporary object pool that caches allocated objects for reuse, reducing memory allocation and GC pressure.
var bufPool = sync.Pool{
New: func() interface{} {
return new(bytes.Buffer)
},
}
func processRequest(data []byte) string {
buf := bufPool.Get().(*bytes.Buffer) // Get from pool
buf.Reset() // Reset state
defer bufPool.Put(buf) // Return when done
buf.Write(data)
buf.WriteString(" processed")
return buf.String()
}
sync.Pool characteristics:
- Get(): Retrieves an object from the pool. If empty, calls New to create one
- Put(): Returns an object to the pool
- Cleared on GC: Every GC cycle, all objects in the Pool may be cleared (no survival guarantee)
- No size limit: Pool grows as needed, GC reclaims
Critical constraint: Objects in Pool can be reclaimed at any time. Don't store persistent data in Pool, and don't rely on Pool size.
Real-world usage in standard library—fmt package:
// fmt/print.go (simplified)
var ppFree = sync.Pool{
New: func() interface{} { return new(pp) },
}
func Fprintf(w io.Writer, format string, a ...interface{}) (n int, err error) {
p := ppFree.Get().(*pp)
p.doPrintf(format, a)
n, err = w.Write(p.buf)
p.free() // Internally calls ppFree.Put(p)
return
}
fmt.Printf needs a pp struct for formatting on every call. If every call uses new, high-frequency calls generate massive garbage. With Pool reuse, GC pressure drops dramatically.
Performance comparison:
// Benchmark: without Pool
func BenchmarkNoPool(b *testing.B) {
for i := 0; i < b.N; i++ {
buf := new(bytes.Buffer)
buf.WriteString("hello")
_ = buf.String()
}
}
// Benchmark: with Pool
func BenchmarkWithPool(b *testing.B) {
pool := sync.Pool{New: func() interface{} { return new(bytes.Buffer) }}
for i := 0; i < b.N; i++ {
buf := pool.Get().(*bytes.Buffer)
buf.Reset()
buf.WriteString("hello")
_ = buf.String()
pool.Put(buf)
}
}
// Typical results (Go 1.21, Apple M1):
// BenchmarkNoPool-8 30000000 42 ns/op 64 B/op 1 allocs/op
// BenchmarkWithPool-8 50000000 28 ns/op 0 B/op 0 allocs/op
Pool reduces operation time by ~33%, and more importantly achieves 0 allocs/op—no burden on GC.
sync.Map: Concurrency-Safe Map
sync.Map, introduced in Go 1.9, is a concurrency-safe map optimized for specific scenarios.
var cache sync.Map
// Store
cache.Store("key1", "value1")
// Load
if val, ok := cache.Load("key1"); ok {
fmt.Println(val.(string))
}
// LoadOrStore (atomic operation)
actual, loaded := cache.LoadOrStore("key2", "value2")
// loaded = false: didn't exist before, stored "value2"
// loaded = true: already existed, returned old value
// Delete
cache.Delete("key1")
// Range
cache.Range(func(key, value interface{}) bool {
fmt.Println(key, value)
return true // Return false to stop iteration
})
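To see the atomicity of LoadOrStore in action, here is a small sketch, assuming a hypothetical firstWriterWins helper: many goroutines race to store a value under the same key, and all of them should end up observing the single value that won.

```go
package main

import (
	"fmt"
	"sync"
)

// firstWriterWins has n goroutines race to store their own ID under one key
// and returns how many distinct values the goroutines observed.
func firstWriterWins(n int) int {
	var m sync.Map
	var wg sync.WaitGroup
	seen := make([]int, n)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Exactly one goroutine stores; the rest get that stored value back.
			actual, _ := m.LoadOrStore("leader", id)
			seen[id] = actual.(int)
		}(i)
	}
	wg.Wait()
	distinct := map[int]bool{}
	for _, v := range seen {
		distinct[v] = true
	}
	return len(distinct)
}

func main() {
	fmt.Println("distinct leaders observed:", firstWriterWins(50)) // prints 1
}
```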
sync.Map's two ideal scenarios (from official docs):
- Entries written once but read many times (e.g., a growing cache)
- Multiple goroutines read and write disjoint key sets
In other scenarios, sync.Mutex + regular map is usually faster.
Why? In the implementation described here (used through Go 1.23; Go 1.24 replaced the internals with a different data structure), sync.Map keeps two maps: a read-only read map and a dirty map that requires the lock. Read operations first check the read map (lock-free, using atomic operations); only on a miss do they lock and access dirty. If the key set is stable (new keys are rarely added), most operations take the lock-free path.
// sync.Map internals (simplified)
type Map struct {
mu Mutex
read atomic.Pointer[readOnly] // Lock-free reads
dirty map[interface{}]*entry // Requires lock
misses int
}
type readOnly struct {
m map[interface{}]*entry
amended bool // Whether dirty has keys not in read
}
Performance comparison: choosing by scenario
| Scenario | sync.Map | Mutex+Map |
|---|---|---|
| Read-heavy (99:1) | 2-5x faster | slower |
| Balanced read/write (50:50) | 1-3x slower | faster |
| Key set constantly growing | slower | faster |
| Fixed keys, goroutines operate on different keys | 3-10x faster | slower |
Mutex vs Channel: When to Use Which
Use Mutex when:
- Protecting shared data structures (maps, slices, struct fields)
- Simple counters, flags
- Short critical sections (a few lines)
- Performance-sensitive hot paths
Use Channel when:
- Transferring data ownership between goroutines
- Coordinating execution order of multiple goroutines
- Implementing timeout, cancellation
- Complex patterns like fan-out/fan-in
Rob Pike's advice: "If you're protecting a data structure, use a mutex. If you're coordinating workflow, use a channel."
// Mutex: protecting shared state
type Counter struct {
mu sync.Mutex
n int
}
// Channel: coordinating workflow
func pipeline(input <-chan int) <-chan int {
output := make(chan int)
go func() {
defer close(output)
for v := range input {
output <- transform(v)
}
}()
return output
}
WaitGroup Internal Implementation
sync.WaitGroup's core is a 64-bit atomic counter and a semaphore:
// Simplified WaitGroup structure
type WaitGroup struct {
// High 32 bits: counter
// Low 32 bits: waiter count
state atomic.Uint64
sema uint32 // Semaphore for blocking/waking
}
- Add(n): Atomically adds n to counter
- Done(): Atomically decrements counter by 1; if counter reaches zero, wakes all waiters
- Wait(): If counter > 0, increments waiter count, then calls runtime_Semacquire to block
Counter and waiter count are packed into a single 64-bit integer so Add can atomically check both whether counter reached zero and whether there are waiters—more efficient than using two separate variables.
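The packing trick can be tried in isolation. The sketch below assumes a hypothetical packedState type with the layout described above (counter in the high 32 bits, waiter count in the low 32 bits). It is not the real WaitGroup code and leaves out the semaphore entirely.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// packedState mimics the layout described above:
// high 32 bits hold the counter, low 32 bits hold the waiter count.
type packedState struct {
	v atomic.Uint64
}

// addCounter adds delta to the counter half and returns both halves
// as they were immediately after the single atomic update.
func (s *packedState) addCounter(delta int) (counter int32, waiters uint32) {
	state := s.v.Add(uint64(delta) << 32)
	return int32(state >> 32), uint32(state)
}

// addWaiter increments the waiter half by one.
func (s *packedState) addWaiter() {
	s.v.Add(1)
}

func main() {
	var s packedState
	s.addCounter(2) // two tasks outstanding
	s.addWaiter()   // one goroutine is waiting
	c, w := s.addCounter(-1)
	fmt.Println(c, w) // prints "1 1"
	c, w = s.addCounter(-1)
	fmt.Println(c, w) // prints "0 1": counter hit zero while a waiter exists
}
```

Because both halves come from one atomic Add, the caller that drives the counter to zero also learns, from the same value, whether anyone needs waking.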
Cond: Condition Variable
sync.Cond is a relatively uncommon but highly valuable primitive for specific scenarios. It allows goroutines to wait until a condition becomes true.
type BoundedQueue struct {
mu sync.Mutex
notEmpty *sync.Cond
notFull *sync.Cond
buf []int
capacity int
}
func NewBoundedQueue(cap int) *BoundedQueue {
q := &BoundedQueue{
buf: make([]int, 0, cap),
capacity: cap,
}
q.notEmpty = sync.NewCond(&q.mu)
q.notFull = sync.NewCond(&q.mu)
return q
}
func (q *BoundedQueue) Put(val int) {
q.mu.Lock()
defer q.mu.Unlock()
for len(q.buf) == q.capacity {
q.notFull.Wait() // Releases lock and waits; reacquires lock when woken
}
q.buf = append(q.buf, val)
q.notEmpty.Signal() // Notify one waiting consumer
}
func (q *BoundedQueue) Get() int {
q.mu.Lock()
defer q.mu.Unlock()
for len(q.buf) == 0 {
q.notEmpty.Wait()
}
val := q.buf[0]
q.buf = q.buf[1:]
q.notFull.Signal()
return val
}
Cond.Wait() performs three steps; the first two happen atomically:
- Release the associated lock
- Suspend the current goroutine
- Reacquire the lock after being woken, before returning
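Why use a for loop instead of if? Because the lock is not held while the goroutine is waiting, another goroutine may change the state between the wakeup and the moment Wait reacquires the lock, so the condition may no longer hold when Wait returns. Some other systems also have "spurious wakeups"; Go's Wait returns only after a Signal or Broadcast, but the sync.Cond documentation still says to call Wait in a loop that rechecks the condition.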
Signal() wakes one waiter; Broadcast() wakes all waiters.
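The queue above only uses Signal. As a complement, here is a minimal sketch of Broadcast, assuming a hypothetical gate type that holds back any number of goroutines until it is opened.

```go
package main

import (
	"fmt"
	"sync"
)

// gate blocks callers of Wait until Open is called.
type gate struct {
	mu   sync.Mutex
	cond *sync.Cond
	open bool
}

func newGate() *gate {
	g := &gate{}
	g.cond = sync.NewCond(&g.mu)
	return g
}

func (g *gate) Wait() {
	g.mu.Lock()
	defer g.mu.Unlock()
	for !g.open { // recheck the condition after every wakeup
		g.cond.Wait()
	}
}

func (g *gate) Open() {
	g.mu.Lock()
	g.open = true
	g.mu.Unlock()
	g.cond.Broadcast() // wake every waiter, not just one
}

// releaseAll starts n waiters, opens the gate, and returns how many got through.
func releaseAll(n int) int {
	g := newGate()
	var wg sync.WaitGroup
	var mu sync.Mutex
	passed := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Wait()
			mu.Lock()
			passed++
			mu.Unlock()
		}()
	}
	g.Open()
	wg.Wait()
	return passed
}

func main() {
	fmt.Println("waiters released:", releaseAll(5)) // prints "waiters released: 5"
}
```

Goroutines that reach Wait after Open has run see open == true and never block, which is why the flag is checked under the lock rather than relying on the wakeup alone.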
Atomic Operations: sync/atomic
For simple counters and flags, atomic operations are lighter than Mutex:
import "sync/atomic"
var counter int64
func increment() {
atomic.AddInt64(&counter, 1)
}
func get() int64 {
return atomic.LoadInt64(&counter)
}
Go 1.19 introduced typed atomic variables for safer, more ergonomic usage:
var counter atomic.Int64
func increment() {
counter.Add(1)
}
func get() int64 {
return counter.Load()
}
Atomic operations:
| Operation | Function | Go 1.19+ Type Method |
|---|---|---|
| Load | LoadInt64(&x) | x.Load() |
| Store | StoreInt64(&x, v) | x.Store(v) |
| Add | AddInt64(&x, n) | x.Add(n) |
| CAS | CompareAndSwapInt64(&x, old, new) | x.CompareAndSwap(old, new) |
| Swap | SwapInt64(&x, new) | x.Swap(new) |
When atomic vs Mutex?
- Single variable, simple operation → atomic
- Multiple variables need updating together → Mutex
- Complex logic (if-then-update) → Mutex
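One nuance to the last rule: when the check and the update involve only a single variable, a compare-and-swap retry loop can still do the job without a lock. The sketch below assumes hypothetical storeMax and maxOf helpers and tracks a running maximum. Anything touching more than one variable still calls for a Mutex.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// storeMax raises *max to v if v is larger, using a compare-and-swap retry loop.
func storeMax(max *atomic.Int64, v int64) {
	for {
		cur := max.Load()
		if v <= cur {
			return // nothing to do
		}
		if max.CompareAndSwap(cur, v) {
			return // our update won
		}
		// Another goroutine changed the value in between; reload and retry.
	}
}

// maxOf feeds every value to storeMax from its own goroutine.
// The running maximum starts at zero, so inputs are assumed non-negative.
func maxOf(values []int64) int64 {
	var max atomic.Int64
	var wg sync.WaitGroup
	for _, v := range values {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			storeMax(&max, v)
		}(v)
	}
	wg.Wait()
	return max.Load()
}

func main() {
	fmt.Println(maxOf([]int64{3, 9, 4, 7})) // prints 9
}
```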
Level 3: What the Specification Says
Mutex Implementation: From Spinning to Semaphore
Go's Mutex implementation has evolved through multiple iterations. The current implementation (Go 1.9+) combines spinning and semaphores, introducing starvation mode.
// sync/mutex.go (simplified)
type Mutex struct {
state int32 // Lock state (multiple flag bits)
sema uint32 // Semaphore
}
const (
mutexLocked = 1 << iota // 1: lock is held
mutexWoken // 2: a goroutine has been woken
mutexStarving // 4: starvation mode
mutexWaiterShift = iota // 3: bit offset for waiter count
)
Complete Lock() flow:
- Fast path: CAS attempts to set state from 0 to mutexLocked. If successful, returns immediately. This is the uncontended path, requiring only one atomic operation.
- Slow path: If fast path fails (lock already held), enters lockSlow():
  - Spinning phase: If lock is held and in normal mode, goroutine spins. Spin conditions:
    - Running on multicore machine
    - Current GOMAXPROCS > 1
    - At least one other P (processor) is running
    - Spin count < 4
  - Semaphore phase: After spin limit exceeded, goroutine calls runtime_SemacquireMutex to sleep
- After waking: Goroutine woken by semaphore must compete with newly arriving goroutines for the lock.
Why spin first, then semaphore? Spinning avoids thread switch overhead (~1-2 microseconds). For short critical sections (tens of nanoseconds), spinning until lock release is much faster than sleep-wake. But infinite spinning wastes CPU, so after 4 iterations it switches to semaphore.
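The fast-path, spin, then sleep structure can be imitated with ordinary Go. The following is a teaching sketch only, assuming a hypothetical toyMutex type. It uses a buffered channel in place of the runtime semaphore and runtime.Gosched in place of real spinning, and it has no starvation mode, so it is not how sync.Mutex is actually written.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// toyMutex is a teaching sketch, not a replacement for sync.Mutex.
type toyMutex struct {
	state atomic.Int32  // 0 = unlocked, 1 = locked
	sema  chan struct{} // capacity 1; holds at most one wake-up token
}

func newToyMutex() *toyMutex {
	return &toyMutex{sema: make(chan struct{}, 1)}
}

func (m *toyMutex) Lock() {
	// Fast path: one CAS when the lock is free.
	if m.state.CompareAndSwap(0, 1) {
		return
	}
	// Slow path: retry a few times, then sleep until an Unlock posts a token.
	for spins := 0; ; spins++ {
		if m.state.CompareAndSwap(0, 1) {
			return
		}
		if spins < 4 {
			runtime.Gosched() // yield instead of busy-waiting on the CPU
			continue
		}
		<-m.sema // woken waiters loop back and compete again
	}
}

func (m *toyMutex) Unlock() {
	m.state.Store(0)
	select {
	case m.sema <- struct{}{}: // leave a token for one sleeping waiter
	default: // a token is already pending
	}
}

// countWith increments a plain int from several goroutines under the toy lock.
func countWith(goroutines, perGoroutine int) int {
	m := newToyMutex()
	var wg sync.WaitGroup
	n := 0
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perGoroutine; i++ {
				m.Lock()
				n++
				m.Unlock()
			}
		}()
	}
	wg.Wait()
	return n
}

func main() {
	fmt.Println(countWith(8, 1000)) // prints 8000
}
```

As in the real normal mode, a woken waiter here has no priority over a newly arriving goroutine. That unfairness is the problem starvation mode addresses.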
Starvation Mode (Go 1.9+)
Before Go 1.9, Mutex had a serious problem: newly arriving goroutines could acquire the lock more easily than already-waiting ones (because new goroutines are already running on CPU and can immediately spin-compete). This caused waiting goroutines to potentially "starve"—unbounded wait time.
Go 1.9 introduced starvation mode to solve this:
Normal mode:
- Waiters queue in FIFO order
- Woken waiters compete with newly arriving goroutines
- New arrivals have advantage (already running on CPU)
Starvation mode:
- Trigger condition: a waiter has waited over 1ms
- Behavior: lock is handed directly to head of wait queue; new arrivals don't compete
- Exit condition: goroutine that acquired the lock is the last in queue, or wait time < 1ms
Timeline (starvation problem in normal mode):
G1 holds lock
G2 waiting... (100us)
G3 arrives -> spins -> acquires lock (G2 keeps waiting)
G4 arrives -> spins -> acquires lock (G2 keeps waiting)
...
G2 may wait indefinitely
Timeline (starvation mode):
G1 holds lock
G2 waiting... (>1ms) -> triggers starvation mode
G1 unlocks -> directly handed to G2 (G3, G4 must queue)
Dmitry Vyukov implemented this improvement for Go issue #13086 in a commit titled "sync: make Mutex more fair." Benchmarks showed starvation mode reduced worst-case latency from hundreds of milliseconds to ~1ms, though average throughput slightly decreased (fewer spinning opportunities).
sync.Pool and GC Interaction
sync.Pool's lifecycle is tightly coupled with GC. Its internal implementation uses per-P (processor) local pools to reduce lock contention:
// sync/pool.go (simplified)
type Pool struct {
noCopy noCopy
local unsafe.Pointer // [P]poolLocal array
localSize uintptr
victim unsafe.Pointer // local from previous GC cycle
victimSize uintptr
New func() interface{}
}
type poolLocal struct {
poolLocalInternal
pad [128 - unsafe.Sizeof(poolLocalInternal{})%128]byte // Prevent false sharing
}
type poolLocalInternal struct {
private interface{} // Only current P can access (lock-free)
shared poolChain // Other Ps can steal from (lock-free)
}
Get() flow:
- Pin current goroutine to P (pin())
- Check current P's private field (lock-free)
- If private is nil, pop from head of current P's shared list
- If shared is also empty, steal from tail of other Ps' shared lists (work-stealing)
- If all empty, check victim pool (leftover from previous GC)
- If everything empty, call New()
GC cleanup:
- Every GC cycle: victim = local; local = nil
- This means objects survive at most two GC cycles: first cycle moves from local to victim, second cycle clears victim
- This "double buffering" strategy avoids the performance cliff of clearing all objects immediately after GC
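The two-cycle lifetime can be observed with forced collections. This is a small sketch, assuming a hypothetical survivesTwoGCs helper; it relies on the implementation behavior described above, which the sync.Pool API itself does not promise, so it should report false on current Go releases but is not a contract.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// survivesTwoGCs puts one object into a Pool with no New function,
// forces two collections, and reports whether Get still returns an object.
func survivesTwoGCs() bool {
	var pool sync.Pool // New is nil, so an empty pool makes Get return nil
	pool.Put(new(int))
	runtime.GC() // pooled object moves to the victim cache (or is already dropped)
	runtime.GC() // victim cache is cleared
	return pool.Get() != nil
}

func main() {
	fmt.Println("object survived two GCs:", survivesTwoGCs())
}
```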
Why Pool isn't suitable for connection pools:
// Wrong usage: database connection pool
var connPool = sync.Pool{
New: func() interface{} {
conn, _ := sql.Open("mysql", dsn)
return conn
},
}
// Problem: connections cleared after GC, next Get requires reconnection (expensive)
// Correct: use sql.DB's built-in pool, or implement channel-based pool yourself
sync.Pool is for "cheap to create but high frequency" temporary objects (like bytes.Buffer), not "expensive to create but low frequency" long-lived resources (like DB connections).
sync.Once Implementation
Once's implementation appears simple but has subtle performance considerations:
// sync/once.go
type Once struct {
done atomic.Uint32
m Mutex
}
func (o *Once) Do(f func()) {
// Fast path: already done, return immediately
if o.done.Load() == 1 {
return
}
// Slow path: first call (or first is still executing)
o.doSlow(f)
}
func (o *Once) doSlow(f func()) {
o.m.Lock()
defer o.m.Unlock()
if o.done.Load() == 0 {
defer o.done.Store(1)
f()
}
}
Why not just use CAS?
// Wrong implementation (conceptual)
func (o *Once) Do(f func()) {
if o.done.CompareAndSwap(0, 1) {
f()
}
}
Problem: If goroutine A wins the CAS and starts executing f(), goroutine B sees done=1 and returns immediately—but f() might not have finished yet! B could use an uninitialized object.
The correct implementation uses Mutex to ensure: all concurrent callers wait until f() completes before returning. This is stronger than "execute only once"—it guarantees "complete execution before anyone else proceeds."
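The early-return problem can be reproduced deterministically. The sketch below assumes a hypothetical casOnce type implementing the flawed CAS-only design, plus channels to hold the first caller inside its function while a second caller arrives.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casOnce is the flawed CAS-only design discussed above.
type casOnce struct {
	done atomic.Uint32
}

func (o *casOnce) Do(f func()) {
	if o.done.CompareAndSwap(0, 1) {
		f()
	}
}

// secondCallerSawInit reports whether a second caller of Do observed the
// initialization as finished at the moment its own Do call returned.
func secondCallerSawInit() bool {
	var o casOnce
	var initialized atomic.Bool
	started := make(chan struct{})
	release := make(chan struct{})
	finished := make(chan struct{})

	go func() { // caller A: wins the CAS and runs a slow init
		defer close(finished)
		o.Do(func() {
			close(started)
			<-release // simulate slow initialization
			initialized.Store(true)
		})
	}()

	<-started                 // A is now inside f
	o.Do(func() {})           // caller B: loses the CAS and returns at once
	saw := initialized.Load() // B proceeds although init has not finished
	close(release)
	<-finished
	return saw
}

func main() {
	fmt.Println("second caller saw finished init:", secondCallerSawInit()) // prints false
}
```

Running the same scenario with sync.Once would block caller B inside Do until caller A finished, which is exactly the stronger guarantee described above.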
Memory Model Guarantees for sync
The Go Memory Model's happens-before guarantees for sync package:
- Mutex: The nth Unlock() happens-before the (n+1)th Lock() returns
- RWMutex: For any RLock() call, there exists some n such that the nth Unlock() happens-before that RLock() returns, and the corresponding RUnlock() happens-before the (n+1)th Lock() returns
- Once: Completion of f in once.Do(f) happens-before any once.Do returns
- WaitGroup: wg.Done() happens-before the corresponding wg.Wait() returns
- atomic: The Go 1.19 revision of the memory model specified that atomic operations establish happens-before relationships (previously this was not formally documented)
These guarantees mean:
var data string
var mu sync.Mutex
// Goroutine A
mu.Lock()
data = "hello"
mu.Unlock()
// Goroutine B (acquires lock after A unlocks)
mu.Lock()
fmt.Println(data) // Guaranteed to see "hello"
mu.Unlock()
Without Mutex, even if A runs first, B isn't guaranteed to see A's write (due to CPU caches and compiler optimizations).
RWMutex Implementation Details
RWMutex uses a counter to track reader count, plus a Mutex to protect write operations:
// sync/rwmutex.go (simplified)
type RWMutex struct {
w Mutex // Write lock mutex
writerSem uint32 // Writer semaphore
readerSem uint32 // Reader semaphore
readerCount atomic.Int32 // Reader count (may be negative)
readerWait atomic.Int32 // Readers waiting to finish
}
const rwmutexMaxReaders = 1 << 30
RLock() implementation:
func (rw *RWMutex) RLock() {
if rw.readerCount.Add(1) < 0 {
// Writer waiting or holding write lock, block
runtime_SemacquireRWMutexR(&rw.readerSem, false, 0)
}
}
Lock() (write lock) implementation:
func (rw *RWMutex) Lock() {
rw.w.Lock() // Exclude other writers
// Notify readers a writer has arrived: subtract rwmutexMaxReaders from readerCount
r := rw.readerCount.Add(-rwmutexMaxReaders) + rwmutexMaxReaders
// If there are active readers, wait for them to finish
if r != 0 && rw.readerWait.Add(r) != 0 {
runtime_SemacquireRWMutex(&rw.writerSem, false, 0)
}
}
The clever trick: readerCount going negative signals "a writer is waiting." New RLock() calls seeing a negative value know they must wait.
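The arithmetic behind this trick is easy to check by hand. The sketch below assumes hypothetical writerArrives and readerMustWait helpers that apply the same additions and sign test to a plain int32, without any atomics or semaphores.

```go
package main

import "fmt"

const maxReaders = 1 << 30 // same value as rwmutexMaxReaders above

// writerArrives applies the subtraction that Lock performs on readerCount.
// It returns the new counter value and the number of readers still active.
func writerArrives(readerCount int32) (newCount, active int32) {
	newCount = readerCount - maxReaders
	active = newCount + maxReaders
	return newCount, active
}

// readerMustWait mirrors the check in RLock: increment, then test the sign.
func readerMustWait(readerCount int32) bool {
	return readerCount+1 < 0
}

func main() {
	newCount, active := writerArrives(3)  // three readers hold the lock
	fmt.Println(newCount, active)         // prints "-1073741821 3"
	fmt.Println(readerMustWait(newCount)) // prints true: a writer is pending
	fmt.Println(readerMustWait(3))        // prints false: no writer
}
```

A single counter therefore carries two facts at once: its sign says whether a writer is pending, and adding rwmutexMaxReaders back recovers the reader count.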
Level 4: Edge Cases and Pitfalls
Pitfall 1: Lock Copying (Mutex/WaitGroup Must Not Be Copied)
One of the most common mistakes for Go beginners:
type Service struct {
mu sync.Mutex
// ... fields
}
// Wrong! Value passing copies the Mutex
func process(s Service) {
s.mu.Lock()
// ... operating on copy's lock, original object unprotected
s.mu.Unlock()
}
// Correct: pass pointer
func process(s *Service) {
s.mu.Lock()
defer s.mu.Unlock()
// ...
}
Why can't locks be copied? Mutex's internal state (whether held, wait queue) is specific to that instance. Copying a held lock results in two goroutines thinking they hold "the same lock"—but they're actually two different locks.
go vet detection: Go's built-in go vet tool detects lock copying:
$ go vet ./...
# example.com/myapp
./main.go:15:17: process passes lock by value: example.com/myapp.Service contains sync.Mutex
All sync types must not be copied: Mutex, RWMutex, WaitGroup, Once, Cond, Pool, Map.
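The mechanism behind this check is go vet's copylocks analyzer, which flags copies of any value whose type has Lock and Unlock methods on its pointer receiver. Mutex and RWMutex qualify directly; types such as WaitGroup, Cond, and Pool embed noCopy, an empty struct with no-op Lock and Unlock methods, purely so that vet treats them the same way.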
Pitfall 2: Deadlock Patterns
Pattern 1: Self-locking (recursive lock death)
Go's Mutex is not reentrant—the same goroutine locking the same Mutex twice deadlocks:
func (s *Service) A() {
s.mu.Lock()
defer s.mu.Unlock()
s.B() // Deadlock! B also needs the lock
}
func (s *Service) B() {
s.mu.Lock() // Blocks forever—lock already held by current goroutine
defer s.mu.Unlock()
// ...
}
Why doesn't Go have reentrant locks? Russ Cox explained clearly in Go issue #14939: "Recursive mutexes do not protect invariants. Mutual exclusion locks protect invariants. If the lock protects some invariant, then no reentrant call is safe to make while the invariant may be broken."
Meaning: if A modifies shared data halfway then calls B, B re-acquiring the lock would see inconsistent intermediate state. Reentrant locks mask this problem rather than solving it.
Fix approaches:
// Approach 1: Split into internal unlocked version
func (s *Service) A() {
s.mu.Lock()
defer s.mu.Unlock()
s.bLocked() // Internal version without locking
}
func (s *Service) B() {
s.mu.Lock()
defer s.mu.Unlock()
s.bLocked()
}
func (s *Service) bLocked() {
// Assumes caller holds lock
// ...
}
Pattern 2: AB-BA deadlock
// Goroutine 1 Goroutine 2
mu1.Lock() mu2.Lock()
mu2.Lock() // waits G2 mu1.Lock() // waits G1
// Deadlock!
Fix: Always lock in consistent order
// Convention: always lock mu1 before mu2
func transferLocked(mu1, mu2 *sync.Mutex) {
// Sort by address to ensure global consistency
if uintptr(unsafe.Pointer(mu1)) > uintptr(unsafe.Pointer(mu2)) {
mu1, mu2 = mu2, mu1
}
mu1.Lock()
mu2.Lock()
// ...
mu2.Unlock()
mu1.Unlock()
}
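Sorting by address works but needs the unsafe package. A common alternative is to give each lock owner a stable rank and sort by that. The following sketch assumes a hypothetical account type with unique id values; none of these names come from the original text.

```go
package main

import (
	"fmt"
	"sync"
)

// account is a hypothetical type; id gives every lock a fixed global rank.
type account struct {
	id      int
	mu      sync.Mutex
	balance int
}

// transfer moves amount from one account to another.
// It always locks the account with the smaller id first, so two opposite
// transfers can never hold one lock each while waiting for the other.
func transfer(from, to *account, amount int) {
	if from == to {
		return // same account: nothing to move, and locking twice would deadlock
	}
	first, second := from, to
	if second.id < first.id {
		first, second = second, first
	}
	first.mu.Lock()
	defer first.mu.Unlock()
	second.mu.Lock()
	defer second.mu.Unlock()

	from.balance -= amount
	to.balance += amount
}

// opposingTransfers runs a->b and b->a transfers concurrently
// and returns the final balances.
func opposingTransfers(rounds int) (int, int) {
	a := &account{id: 1, balance: 1000}
	b := &account{id: 2, balance: 1000}
	var wg sync.WaitGroup
	for i := 0; i < rounds; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); transfer(a, b, 1) }()
		go func() { defer wg.Done(); transfer(b, a, 1) }()
	}
	wg.Wait()
	return a.balance, b.balance
}

func main() {
	fmt.Println(opposingTransfers(500)) // prints "1000 1000"
}
```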
Pattern 3: Forgetting to unlock on an early return
func bad(mu *sync.Mutex) {
mu.Lock()
if someCondition {
return // Forgot Unlock!
}
mu.Unlock()
}
Solution: Always use defer
Pitfall 3: RWMutex Writer Starvation
With a naive read-write lock, a steady stream of readers can keep a writer from ever acquiring the lock:
// Scenario: 100 goroutines continuously RLocking
// 1 goroutine tries to Lock
// If readers never pause, writer can never find a moment when all readers released
Go's RWMutex has protection for this: When a writer arrives (calls Lock), new readers are blocked (because readerCount becomes negative). Existing readers can continue to completion, but no new readers join. This ensures the writer eventually acquires the lock.
However, if existing readers do long operations in their critical sections (e.g., network requests), the writer still waits a long time.
Best practices:
- Keep read critical sections short
- If read operations involve IO, copy needed data inside the lock, then unlock before doing IO
// Wrong: network request inside read lock
func (c *Cache) GetAndFetch(key string) (string, error) {
c.mu.RLock()
defer c.mu.RUnlock()
if val, ok := c.data[key]; ok {
return val, nil
}
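// Network request inside read lock: blocks writers for a long time.
// fetchRemote is a hypothetical helper that performs http.Get and returns the body as a string.
return fetchRemote("http://example.com/" + key)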
}
// Correct: minimize critical section
func (c *Cache) GetAndFetch(key string) (string, error) {
c.mu.RLock()
val, ok := c.data[key]
c.mu.RUnlock() // Release immediately
if ok {
return val, nil
}
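// Network request outside the lock.
// fetchRemote is a hypothetical helper that performs http.Get and returns the body as a string.
return fetchRemote("http://example.com/" + key)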
}
Pitfall 4: sync.Pool Usage Mistakes
Mistake 1: Forgetting to Reset
var bufPool = sync.Pool{
New: func() interface{} { return new(bytes.Buffer) },
}
func process(data string) string {
buf := bufPool.Get().(*bytes.Buffer)
// Forgot buf.Reset()!
// buf may still contain data from previous use
buf.WriteString(data)
result := buf.String() // May include leftover data from previous use
bufPool.Put(buf)
return result
}
Mistake 2: Using after Put
func process() {
buf := bufPool.Get().(*bytes.Buffer)
buf.Reset()
buf.WriteString("hello")
bufPool.Put(buf)
// Wrong! buf is back in pool, may be acquired and modified by another goroutine
fmt.Println(buf.String()) // Data race!
}
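For contrast with the two mistakes above, here is a minimal corrected sketch, assuming a hypothetical render function: it resets the buffer after Get, copies the result out before Put, and never touches the buffer afterwards.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render builds a string using a pooled buffer.
func render(data string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a recycled buffer may still hold old bytes
	buf.WriteString(data)
	buf.WriteString(" processed")
	result := buf.String() // copy the contents out before giving the buffer back
	bufPool.Put(buf)
	return result // buf is not touched after Put
}

func main() {
	fmt.Println(render("first"))  // prints "first processed"
	fmt.Println(render("second")) // prints "second processed"
}
```

buf.String() returns a new string, so the result stays valid even if another goroutine reuses the buffer immediately after Put.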
Mistake 3: Returning oversized buffers to the Pool
// A buffer that has grown very large keeps its whole backing array alive while it sits in the Pool
var bigBufPool = sync.Pool{
New: func() interface{} {
buf := make([]byte, 0, 1<<20) // 1MB
return &buf
},
}
// Better approach: limit size of objects returned to pool
func putBuf(buf *[]byte) {
if cap(*buf) > 1<<20 {
return // Too large, let GC reclaim
}
*buf = (*buf)[:0]
bigBufPool.Put(buf)
}
Pitfall 5: WaitGroup Reuse Race Condition
var wg sync.WaitGroup
// First round
wg.Add(2)
go func() { defer wg.Done(); work1() }()
go func() { defer wg.Done(); work2() }()
wg.Wait()
// Second round: new Add calls must happen only after the previous Wait has returned.
// Wait returns only once the counter is zero, so every Done from the first round
// has already completed and reusing the WaitGroup here is safe.
wg.Add(2)
The code above is safe because Wait() returns only after every first-round Done() call has brought the counter to zero. Problems arise only when some other goroutine can still call Add() or Done() for the old round while the next round is starting (a design bug): the counter may then go negative and panic with "sync: negative WaitGroup counter".
Pitfall 6: sync.Map Type Safety
sync.Map uses interface{} for key and value types, losing compile-time type checking:
var m sync.Map
m.Store("count", 42)
m.Store("count", "not a number") // Type mismatch only discovered at runtime
val, _ := m.Load("count")
n := val.(int) // If stored value is string, panics here
Go 1.18+ solution—wrap with generics:
type TypedMap[K comparable, V any] struct {
m sync.Map
}
func (tm *TypedMap[K, V]) Store(key K, value V) {
tm.m.Store(key, value)
}
func (tm *TypedMap[K, V]) Load(key K) (V, bool) {
val, ok := tm.m.Load(key)
if !ok {
var zero V
return zero, false
}
return val.(V), true
}
Real-World Case: Deadlock Bug in Docker
Docker had a famous deadlock bug (docker/docker#22507): the container's Mutex and network's Mutex formed an AB-BA deadlock. Simplified:
// container.go
func (c *Container) Stop() {
c.mu.Lock() // Lock A
defer c.mu.Unlock()
c.network.Disconnect(c) // Internally needs Lock B
}
// network.go
func (n *Network) Disconnect(c *Container) {
n.mu.Lock() // Lock B
defer n.mu.Unlock()
c.UpdateState() // Needs Lock A -> deadlock!
}
Fix: Reduce lock scope to avoid calling functions that may acquire another lock while holding one lock.
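One way to apply that fix is sketched below, assuming hypothetical lowercase container and network types modeled on the simplified code above; this is not Docker's actual patch. Each method holds its own lock only while touching its own fields and releases it before calling into the other object.

```go
package main

import (
	"fmt"
	"sync"
)

// network and container are hypothetical stand-ins for the simplified code above.
type network struct {
	mu      sync.Mutex
	members map[string]bool
}

type container struct {
	mu      sync.Mutex
	id      string // set once at creation and never changed
	state   string
	network *network
}

// UpdateState takes the container lock on its own.
func (c *container) UpdateState(s string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.state = s
}

// Disconnect holds the network lock only while editing its own map,
// then calls back into the container after releasing it.
func (n *network) Disconnect(c *container) {
	n.mu.Lock()
	delete(n.members, c.id)
	n.mu.Unlock()
	c.UpdateState("disconnected")
}

// Stop reads what it needs under the container lock, releases it,
// and only then calls into the network.
func (c *container) Stop() {
	c.mu.Lock()
	nw := c.network
	c.state = "stopping"
	c.mu.Unlock()

	nw.Disconnect(c) // no lock held here, so no lock-order cycle
}

func main() {
	n := &network{members: map[string]bool{"c1": true}}
	c := &container{id: "c1", state: "running", network: n}
	c.Stop()
	fmt.Println(c.state, len(n.members)) // prints "disconnected 0"
}
```

The trade-off is that Stop is no longer one atomic step: other goroutines can observe the intermediate "stopping" state, so the design has to tolerate that.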
Interview Questions
- Is sync.Mutex reentrant? Why not?
  - No. Reentrant locks don't protect invariants: functions called while holding the lock may see intermediate state
- When are objects in sync.Pool reclaimed?
  - May be cleared every GC cycle. Specifically, double buffering: local -> victim -> cleared
- When is sync.Map faster than Mutex+map?
  - Read-heavy/write-light, or multiple goroutines operating on disjoint key sets
- How to detect lock copying?
  - The go vet tool automatically detects structs containing sync.Mutex etc. being passed by value
- What problem does Go 1.9's Mutex starvation mode solve?
  - Prevents waiters from being indefinitely preempted by new arrivals. Waiting over 1ms triggers starvation mode, lock handed over directly
- Difference between sync.Once and atomic.CompareAndSwap?
  - Once guarantees function execution completes before other callers return; CAS only guarantees one executor, doesn't wait for completion
Summary
The sync package is the "low-level but efficient" part of Go's concurrency toolbox. Each primitive has a clear use case:
| Primitive | Core Use | Caveats |
|---|---|---|
| Mutex | Protecting shared data | Not reentrant, not copyable |
| RWMutex | Read-heavy shared data | Keep read critical sections short |
| WaitGroup | Waiting for goroutine group to finish | Add before go |
| Once | Single initialization | Panic counts as "done" |
| Pool | Reducing frequent small allocations | Not for connection pools |
| Map | Specific-pattern concurrent map | Use for read-heavy workloads |
| Cond | Waiting for condition | Check in for loop |
| atomic | Single-variable atomic ops | Can't protect multiple variables |
Selection criteria:
- Need to transfer ownership → channel
- Need to protect shared state → mutex
- Need to reduce GC → Pool
- Need to wait for completion → WaitGroup
- Need to do something once → Once