Chapter 33

WebSocket Real-Time Communication

Chapter 33: WebSocket Real-Time Communication

In 2005, Jesse James Garrett published the article that named the technology powering Google Maps and Gmail: Ajax (Asynchronous JavaScript and XML). Ajax let browsers send HTTP requests and update portions of a page without a full reload. It was revolutionary — but it had one fundamental constraint: communication was always client-initiated. The server could never push data on its own.

To receive real-time updates, developers invented workarounds: short polling (send an HTTP request every few seconds), long polling (keep the request open until the server has data, then immediately issue the next one), and HTTP streaming (keep the response open and let the server write continuously). Each approach wasted resources, was complex to implement correctly, or had significant compatibility problems.

In 2011 the WebSocket protocol (RFC 6455) became a standard, cleanly solving this problem. WebSocket is the first genuinely bidirectional, full-duplex, persistent browser communication protocol.

This chapter dissects every technical detail of WebSocket and then builds a real-time multi-room chat server in Go, with full connection management, broadcasting, graceful shutdown, and heartbeats.


Level 1 · What You Need to Know

WebSocket vs. SSE vs. Long Polling: How to Choose

Real-time web applications have three primary options:

Long Polling

The client issues an HTTP request; the server holds it open until new data is available, then returns the response. The client immediately fires another request.

Client ──GET /events──→ Server
                         waiting...
Client ←──200 OK data── Server
Client ──GET /events──→ Server  ← fires immediately

Suitable for: maximum compatibility (IE8), very low message frequency (a few per minute). Drawbacks: every message incurs a full HTTP round-trip; latency is higher than WebSocket; server-side connection bookkeeping is complex.

SSE (Server-Sent Events)

The server streams events to the client over a regular persistent HTTP connection with Content-Type: text/event-stream.

Client ──GET /stream──→ Server
Client ←──event: msg── Server  ← push continuously, connection stays open
Client ←──event: msg── Server

Suitable for: one-way server-to-client pushes such as news feeds, stock tickers, and log tails. Advantages: built on plain HTTP, automatic reconnection in browsers, CDN and proxy compatible. Disadvantages: unidirectional — the client cannot send data back through the same connection; subject to HTTP/1.1 connection limits (not an issue under HTTP/2).

WebSocket

A full-duplex bidirectional protocol that starts life as an HTTP connection and then upgrades. After the handshake, either side may send data frames at any time, unconstrained by the request-response model.

Client ──HTTP Upgrade──→ Server
Client ←──101 Switching── Server  ← protocol switched
Client ←──────────────── Server  ← either direction, any time
Client ──────────────────→ Server

Suitable for: chat, online games, collaborative editors, real-time dashboards, financial trading terminals. Advantages: true bidirectional real-time communication, lowest latency, a single connection carries traffic both ways. Disadvantages: stateful connections complicate horizontal scaling (see Level 4); some corporate firewalls and proxies block WebSocket.

Decision rule: if you only need server-push, SSE is simpler; if you need bidirectional communication or the lowest possible latency, choose WebSocket; if you must support ancient browsers or hostile network environments, implement long polling as a fallback.

The HTTP Upgrade Handshake

A WebSocket connection begins as an HTTP/1.1 request with special Upgrade headers:

Client request:

GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

Server response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

Sec-WebSocket-Key is a random 16-byte value the client generates and Base64-encodes. The server concatenates it with the fixed magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (defined in RFC 6455), computes a SHA-1 hash, and Base64-encodes the result to produce Sec-WebSocket-Accept. This mechanism is not a security feature; it is a proof that the server genuinely understands the WebSocket protocol, preventing HTTP caches from returning a cached 101 response.

After the handshake, the underlying TCP connection remains open but HTTP is no longer used — communication proceeds in the WebSocket frame format.


Level 2 · How It Works Under the Hood

gorilla/websocket Internals

github.com/gorilla/websocket is the most widely used WebSocket library in the Go ecosystem. It encapsulates the framing protocol defined in RFC 6455 and provides a clean read/write API.

WebSocket frame layout:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)    |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - -+
|   Extended payload length continued, if payload len == 127   |
+ - - - - - - - - - - - - - - -+-------------------------------+
|                               |Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - -+

Key fields:

gorilla/websocket's Conn type hides all of this, but imposes one critical constraint: reads and writes are not concurrent-safe. A single goroutine must own each direction.

Connection Lifecycle Management

A WebSocket connection lifecycle:

Established (Upgrade) → Normal operation → Close handshake → TCP teardown

Normal close: either party sends a close frame (opcode=8) with an optional status code and reason string. The other side replies with a close frame, and the TCP connection is torn down. This is a graceful close.

Abnormal close: network interruption, process crash, and similar events terminate the TCP connection without a close handshake. This is why heartbeats are essential.

With gorilla/websocket:

// Set a read deadline: if no data (including Pong) arrives within this window,
// treat the connection as dead.
conn.SetReadDeadline(time.Now().Add(60 * time.Second))

// On each Pong, extend the read deadline (reset the heartbeat timer).
conn.SetPongHandler(func(string) error {
    conn.SetReadDeadline(time.Now().Add(60 * time.Second))
    return nil
})

Concurrent Reads and Writes: One Goroutine Per Direction

The gorilla/websocket documentation is explicit: Connections support one concurrent reader and one concurrent writer. This means:

The reason: conn.Write() writes directly into the TCP send buffer. If two goroutines write concurrently, their frames will be interleaved, and the receiver will see corrupted data.

The correct concurrency pattern:

// Goroutine 1: dedicated reader
go func() {
    for {
        msgType, data, err := conn.ReadMessage()
        // handle...
    }
}()

// Goroutine 2: dedicated writer — receives messages through a channel
go func() {
    for msg := range sendChan {
        conn.WriteMessage(msg.Type, msg.Data)
    }
}()

// All other code sends to the channel, never calls WriteMessage directly
sendChan <- Message{Type: websocket.TextMessage, Data: []byte("hello")}

Ping/Pong Heartbeat Protocol

WebSocket RFC defines Ping (opcode=9) and Pong (opcode=10) control frames for keepalive purposes. Browsers automatically reply to server Pings with a Pong.

The two purposes of heartbeats:

  1. Detect half-open connections: The TCP connection appears alive but the peer has vanished (network partition, process crash). Without heartbeats the server never discovers the connection is dead.
  2. Keep NAT/firewall mappings alive: Many NAT devices remove idle mappings after a timeout. Periodic Pings keep the path active.

Correct heartbeat timing constants:

const (
    writeWait  = 10 * time.Second           // write operation deadline
    pongWait   = 60 * time.Second           // max wait for a Pong
    pingPeriod = (pongWait * 9) / 10        // send Ping before pongWait expires
)

The Connection Registry Pattern (Hub)

The standard pattern for managing large numbers of WebSocket connections is a central registry — the Hub — where all state mutations happen through channels:

connect    → register channel   → Hub goroutine → adds to connections map
disconnect → unregister channel → Hub goroutine → removes + cleans up
broadcast  → broadcast channel  → Hub goroutine → iterates connections, sends

The Hub goroutine is the sole modifier of the connections map, so no mutex is needed. This is one of Go's core concurrency insights: do not communicate by sharing memory; share memory by communicating.


Level 3 · Code in Practice

Building a Real-Time Chat Server

Below is a complete multi-room chat server with connection management, broadcasting, graceful disconnect handling, and heartbeats.

Message types (message.go):

package main

import (
    "encoding/json"
    "time"
)

type MessageType string

const (
    MsgTypeChat   MessageType = "chat"
    MsgTypeSystem MessageType = "system"
    MsgTypeError  MessageType = "error"
)

type IncomingMessage struct {
    Type    MessageType `json:"type"`
    Room    string      `json:"room"`
    Content string      `json:"content"`
}

type OutgoingMessage struct {
    Type      MessageType `json:"type"`
    Room      string      `json:"room"`
    Sender    string      `json:"sender"`
    Content   string      `json:"content"`
    Timestamp time.Time   `json:"timestamp"`
}

func (m OutgoingMessage) Encode() ([]byte, error) {
    return json.Marshal(m)
}

Client connection (client.go):

package main

import (
    "encoding/json"
    "log"
    "time"

    "github.com/gorilla/websocket"
)

const (
    writeWait      = 10 * time.Second
    pongWait       = 60 * time.Second
    pingPeriod     = (pongWait * 9) / 10
    maxMessageSize = 4096
)

type Client struct {
    id   string
    room string
    hub  *Hub
    conn *websocket.Conn
    send chan []byte // buffered outbound message queue
}

// readPump runs in its own goroutine and is the sole reader on this connection.
func (c *Client) readPump() {
    defer func() {
        c.hub.unregister <- c
        c.conn.Close()
    }()

    c.conn.SetReadLimit(maxMessageSize)
    c.conn.SetReadDeadline(time.Now().Add(pongWait))
    c.conn.SetPongHandler(func(string) error {
        c.conn.SetReadDeadline(time.Now().Add(pongWait))
        return nil
    })

    for {
        _, rawMsg, err := c.conn.ReadMessage()
        if err != nil {
            if websocket.IsUnexpectedCloseError(err,
                websocket.CloseGoingAway,
                websocket.CloseAbnormalClosure,
            ) {
                log.Printf("client %s unexpected close: %v", c.id, err)
            }
            break
        }

        var incoming IncomingMessage
        if err := json.Unmarshal(rawMsg, &incoming); err != nil {
            log.Printf("client %s invalid JSON: %v", c.id, err)
            continue
        }

        outgoing := OutgoingMessage{
            Type:      MsgTypeChat,
            Room:      c.room,
            Sender:    c.id,
            Content:   incoming.Content,
            Timestamp: time.Now(),
        }
        if encoded, err := outgoing.Encode(); err == nil {
            c.hub.broadcast <- BroadcastRequest{Room: c.room, Payload: encoded}
        }
    }
}

// writePump runs in its own goroutine and is the sole writer on this connection.
// It also owns the Ping ticker.
func (c *Client) writePump() {
    ticker := time.NewTicker(pingPeriod)
    defer func() {
        ticker.Stop()
        c.conn.Close()
    }()

    for {
        select {
        case message, ok := <-c.send:
            c.conn.SetWriteDeadline(time.Now().Add(writeWait))
            if !ok {
                // Hub closed the channel: send a WebSocket close frame.
                c.conn.WriteMessage(websocket.CloseMessage, []byte{})
                return
            }

            w, err := c.conn.NextWriter(websocket.TextMessage)
            if err != nil {
                return
            }
            w.Write(message)

            // Coalesce queued messages into one write to reduce syscall overhead.
            n := len(c.send)
            for i := 0; i < n; i++ {
                w.Write([]byte{'\n'})
                w.Write(<-c.send)
            }
            if err := w.Close(); err != nil {
                return
            }

        case <-ticker.C:
            c.conn.SetWriteDeadline(time.Now().Add(writeWait))
            if err := c.conn.WriteMessage(websocket.PingMessage, nil); err != nil {
                return
            }
        }
    }
}

Hub: connection registry and message routing (hub.go):

package main

import (
    "log"
    "sync"
    "time"
)

type BroadcastRequest struct {
    Room    string
    Payload []byte
}

type HubStats struct {
    TotalConnections  int64
    ActiveConnections int64
    MessagesSent      int64
}

// Hub manages all active client connections.
// All modifications to the rooms map happen inside Run(), which is a single
// goroutine — no mutex needed on the map itself.
type Hub struct {
    rooms      map[string]map[*Client]bool // room → set of clients
    register   chan *Client
    unregister chan *Client
    broadcast  chan BroadcastRequest

    mu    sync.RWMutex
    stats HubStats
}

func NewHub() *Hub {
    return &Hub{
        rooms:      make(map[string]map[*Client]bool),
        register:   make(chan *Client, 256),
        unregister: make(chan *Client, 256),
        broadcast:  make(chan BroadcastRequest, 1024),
    }
}

func (h *Hub) Run() {
    for {
        select {
        case c := <-h.register:
            h.addClient(c)
        case c := <-h.unregister:
            h.removeClient(c)
        case req := <-h.broadcast:
            h.send(req)
        }
    }
}

func (h *Hub) addClient(c *Client) {
    if h.rooms[c.room] == nil {
        h.rooms[c.room] = make(map[*Client]bool)
    }
    h.rooms[c.room][c] = true

    h.mu.Lock()
    h.stats.TotalConnections++
    h.stats.ActiveConnections++
    h.mu.Unlock()

    log.Printf("client %s joined room %s (%d in room)", c.id, c.room, len(h.rooms[c.room]))

    joinMsg := OutgoingMessage{
        Type:      MsgTypeSystem,
        Room:      c.room,
        Sender:    "system",
        Content:   c.id + " joined",
        Timestamp: time.Now(),
    }
    if encoded, err := joinMsg.Encode(); err == nil {
        h.send(BroadcastRequest{Room: c.room, Payload: encoded})
    }
}

func (h *Hub) removeClient(c *Client) {
    if clients, ok := h.rooms[c.room]; ok {
        if _, exists := clients[c]; exists {
            delete(clients, c)
            close(c.send) // closing the channel triggers writePump to exit
            if len(clients) == 0 {
                delete(h.rooms, c.room)
            }
        }
    }
    h.mu.Lock()
    h.stats.ActiveConnections--
    h.mu.Unlock()
    log.Printf("client %s left room %s", c.id, c.room)
}

func (h *Hub) send(req BroadcastRequest) {
    clients, ok := h.rooms[req.Room]
    if !ok {
        return
    }
    for c := range clients {
        select {
        case c.send <- req.Payload:
            h.mu.Lock()
            h.stats.MessagesSent++
            h.mu.Unlock()
        default:
            // The client's send buffer is full — it is too slow to keep up.
            // Drop it to protect the server.
            log.Printf("client %s buffer full, disconnecting", c.id)
            delete(clients, c)
            close(c.send)
        }
    }
}

func (h *Hub) Stats() HubStats {
    h.mu.RLock()
    defer h.mu.RUnlock()
    return h.stats
}

HTTP handler and main (main.go):

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "strings"

    "github.com/google/uuid"
    "github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{
    ReadBufferSize:  1024,
    WriteBufferSize: 1024,
    CheckOrigin: func(r *http.Request) bool {
        origin := r.Header.Get("Origin")
        return strings.HasPrefix(origin, "https://example.com") ||
            origin == "http://localhost:3000"
    },
}

var hub = NewHub()

func wsHandler(w http.ResponseWriter, r *http.Request) {
    room := strings.TrimPrefix(r.URL.Path, "/ws/")
    if room == "" {
        room = "general"
    }

    conn, err := upgrader.Upgrade(w, r, nil)
    if err != nil {
        log.Printf("upgrade: %v", err)
        return
    }

    client := &Client{
        id:   uuid.New().String()[:8],
        room: room,
        hub:  hub,
        conn: conn,
        send: make(chan []byte, 256),
    }
    hub.register <- client

    go client.writePump()
    go client.readPump()
}

func statsHandler(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(hub.Stats())
}

func main() {
    go hub.Run()

    http.HandleFunc("/ws/", wsHandler)
    http.HandleFunc("/stats", statsHandler)
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "Chat server. Connect to /ws/{room}")
    })

    log.Println("Listening on :8080")
    log.Fatal(http.ListenAndServe(":8080", nil))
}

Client-side reconnection with exponential backoff (JavaScript):

class ReconnectingWebSocket {
    constructor(url) {
        this.url = url;
        this.delay = 1000;     // initial reconnect delay
        this.maxDelay = 30000; // cap at 30 s
        this.connect();
    }

    connect() {
        this.ws = new WebSocket(this.url);

        this.ws.onopen = () => {
            console.log('connected');
            this.delay = 1000; // reset backoff on successful connect
        };

        this.ws.onmessage = (e) => this.onmessage(JSON.parse(e.data));

        this.ws.onclose = (e) => {
            if (e.wasClean) return;
            console.log(`reconnecting in ${this.delay}ms`);
            setTimeout(() => {
                this.delay = Math.min(this.delay * 2, this.maxDelay);
                this.connect();
            }, this.delay);
        };
    }

    send(data) {
        if (this.ws.readyState === WebSocket.OPEN) {
            this.ws.send(JSON.stringify(data));
        }
    }

    onmessage(msg) {} // override in subclass
}

Level 4 · Advanced Topics and Edge Cases

Scaling WebSocket Across Multiple Servers: Redis Pub/Sub

A single-server Hub does not scale horizontally. Client A connects to server 1; client B connects to server 2; server 1's Hub is unaware of server 2's connections, so a broadcast reaches only half the audience.

The solution: use Redis Pub/Sub as a message bus. Every server instance subscribes to a shared channel and publishes outbound messages to it:

import (
    "context"
    "encoding/json"
    "github.com/redis/go-redis/v9"
)

type DistributedHub struct {
    *Hub
    rdb     *redis.Client
    channel string
}

func NewDistributedHub(redisAddr string) *DistributedHub {
    rdb := redis.NewClient(&redis.Options{Addr: redisAddr})
    h := &DistributedHub{
        Hub:     NewHub(),
        rdb:     rdb,
        channel: "chat:broadcast",
    }
    go h.subscribe()
    return h
}

// subscribe listens to the Redis channel and forwards messages to the local Hub.
func (h *DistributedHub) subscribe() {
    ctx := context.Background()
    sub := h.rdb.Subscribe(ctx, h.channel)
    for msg := range sub.Channel() {
        var req BroadcastRequest
        if err := json.Unmarshal([]byte(msg.Payload), &req); err != nil {
            continue
        }
        h.Hub.broadcast <- req
    }
}

// Publish sends a broadcast request to every server instance via Redis.
func (h *DistributedHub) Publish(req BroadcastRequest) {
    data, _ := json.Marshal(req)
    h.rdb.Publish(context.Background(), h.channel, data)
}

Each server's local Hub handles only the clients directly connected to it. When a message must reach a room whose members are spread across servers, publish it to Redis; all subscribers pick it up and deliver it locally.

If you also need room membership counts visible cluster-wide, store member sets in Redis using SADD/SREM/SCARD. Retrieve the count with SCARD roomName.

WebSocket Compression: permessage-deflate

For text messages such as JSON payloads, compression can substantially reduce bandwidth. RFC 7692 defines the permessage-deflate extension; gorilla/websocket supports it with a single flag:

var upgrader = websocket.Upgrader{
    EnableCompression: true,
}

Client and server negotiate compression capability during the handshake. If both agree, subsequent message frames are deflate-compressed.

Compression trades CPU for bandwidth. Benchmark before enabling in production:

Per-Connection Rate Limiting

Prevent malicious clients from flooding the server with messages using Go's golang.org/x/time/rate token bucket:

import "golang.org/x/time/rate"

type Client struct {
    // ...
    limiter *rate.Limiter
}

// In NewClient: allow 10 messages per second, burst of 20.
client := &Client{
    limiter: rate.NewLimiter(10, 20),
    // ...
}

// In readPump, before processing each message:
if !c.limiter.Allow() {
    log.Printf("client %s rate limited", c.id)
    // Option A: silently drop the message
    continue
    // Option B: disconnect the client (stricter policy)
    // break
}

For more sophisticated policies — for example, rate limiting by room rather than by connection, or applying different limits to authenticated vs. anonymous users — maintain a shared sync.Map of limiters keyed by user ID rather than connection.

Load Testing with k6

import ws from 'k6/ws';
import { check } from 'k6';

export let options = {
    vus: 1000,       // 1000 concurrent virtual users
    duration: '60s',
};

export default function() {
    const url = 'ws://localhost:8080/ws/loadtest';
    const res = ws.connect(url, {}, function(socket) {
        socket.on('open', () => {
            socket.setInterval(() => {
                socket.send(JSON.stringify({
                    type: 'chat',
                    room: 'loadtest',
                    content: 'hello from k6',
                }));
            }, 1000);
        });

        socket.on('message', (data) => {
            const msg = JSON.parse(data);
            check(msg, { 'has type field': (m) => m.type !== undefined });
        });

        socket.setTimeout(() => socket.close(), 30000);
    });

    check(res, { 'status 101': (r) => r && r.status === 101 });
}

Pay attention to these metrics in the k6 output:

Goroutine Leak Detection

Every WebSocket connection in our server spawns exactly 2 goroutines (readPump + writePump). If connections are not properly cleaned up, goroutines accumulate silently until the process runs out of memory. Detect leaks in tests with go.uber.org/goleak:

import (
    "testing"
    "time"
    "go.uber.org/goleak"
    "github.com/gorilla/websocket"
)

func TestNoLeak(t *testing.T) {
    defer goleak.VerifyNone(t)

    server := httptest.NewServer(http.HandlerFunc(wsHandler))
    defer server.Close()

    url := "ws" + strings.TrimPrefix(server.URL, "http") + "/ws/test"
    conn, _, err := websocket.DefaultDialer.Dial(url, nil)
    if err != nil {
        t.Fatal(err)
    }

    // Normal close
    conn.WriteMessage(
        websocket.CloseMessage,
        websocket.FormatCloseMessage(websocket.CloseNormalClosure, ""),
    )
    conn.Close()
    time.Sleep(100 * time.Millisecond)
    // goleak.VerifyNone fires here and fails if any goroutines remain
}

The exit chain that prevents leaks:

  1. readPump encounters an error or close frame → executes defer hub.unregister <- c → sends client to Hub
  2. Hub's removeClient closes c.send (the write channel)
  3. writePump's select receives from a closed channel with ok=false → sends WebSocket close frame → returns → TCP connection closes

Every link in this chain must be present. A missing defer or an unclosed channel anywhere breaks the chain and leaks a goroutine permanently.

WebSocket is a powerful tool for real-time applications, but its statefulness and the connection management complexity it introduces are genuine engineering challenges. Master the design patterns covered in this chapter — the Hub registry, one goroutine per direction, Ping/Pong heartbeats, distributed Redis routing, and goroutine leak prevention — and you have a complete toolkit for building high-concurrency real-time services in Go.

Rate this chapter
4.7  / 5  (3 ratings)

💬 Comments