WebSocket Real-Time Communication
Chapter 33: WebSocket Real-Time Communication
In 2005, Jesse James Garrett published the article that named the technology powering Google Maps and Gmail: Ajax (Asynchronous JavaScript and XML). Ajax let browsers send HTTP requests and update portions of a page without a full reload. It was revolutionary — but it had one fundamental constraint: communication was always client-initiated. The server could never push data on its own.
To receive real-time updates, developers invented workarounds: short polling (send an HTTP request every few seconds), long polling (keep the request open until the server has data, then immediately issue the next one), and HTTP streaming (keep the response open and let the server write continuously). Each approach wasted resources, was complex to implement correctly, or had significant compatibility problems.
In 2011 the WebSocket protocol (RFC 6455) became a standard, cleanly solving this problem. WebSocket is the first genuinely bidirectional, full-duplex, persistent browser communication protocol.
This chapter dissects every technical detail of WebSocket and then builds a real-time multi-room chat server in Go, with full connection management, broadcasting, graceful shutdown, and heartbeats.
Level 1 · What You Need to Know
WebSocket vs. SSE vs. Long Polling: How to Choose
Real-time web applications have three primary options:
Long Polling
The client issues an HTTP request; the server holds it open until new data is available, then returns the response. The client immediately fires another request.
Client ──GET /events──→ Server
waiting...
Client ←──200 OK data── Server
Client ──GET /events──→ Server ← fires immediately
Suitable for: maximum compatibility (IE8), very low message frequency (a few per minute). Drawbacks: every message incurs a full HTTP round-trip; latency is higher than WebSocket; server-side connection bookkeeping is complex.
SSE (Server-Sent Events)
The server streams events to the client over a regular persistent HTTP connection with Content-Type: text/event-stream.
Client ──GET /stream──→ Server
Client ←──event: msg── Server ← push continuously, connection stays open
Client ←──event: msg── Server
Suitable for: one-way server-to-client pushes such as news feeds, stock tickers, and log tails. Advantages: built on plain HTTP, automatic reconnection in browsers, CDN and proxy compatible. Disadvantages: unidirectional — the client cannot send data back through the same connection; subject to HTTP/1.1 connection limits (not an issue under HTTP/2).
WebSocket
A full-duplex bidirectional protocol that starts life as an HTTP connection and then upgrades. After the handshake, either side may send data frames at any time, unconstrained by the request-response model.
Client ──HTTP Upgrade──→ Server
Client ←──101 Switching── Server ← protocol switched
Client ←──────────────── Server ← either direction, any time
Client ──────────────────→ Server
Suitable for: chat, online games, collaborative editors, real-time dashboards, financial trading terminals. Advantages: true bidirectional real-time communication, lowest latency, a single connection carries traffic both ways. Disadvantages: stateful connections complicate horizontal scaling (see Level 4); some corporate firewalls and proxies block WebSocket.
Decision rule: if you only need server-push, SSE is simpler; if you need bidirectional communication or the lowest possible latency, choose WebSocket; if you must support ancient browsers or hostile network environments, implement long polling as a fallback.
The HTTP Upgrade Handshake
A WebSocket connection begins as an HTTP/1.1 request with special Upgrade headers:
Client request:
GET /chat HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Server response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Key is a random 16-byte value the client generates and Base64-encodes. The server concatenates it with the fixed magic string 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 (defined in RFC 6455), computes a SHA-1 hash, and Base64-encodes the result to produce Sec-WebSocket-Accept. This mechanism is not a security feature; it is a proof that the server genuinely understands the WebSocket protocol, preventing HTTP caches from returning a cached 101 response.
After the handshake, the underlying TCP connection remains open but HTTP is no longer used — communication proceeds in the WebSocket frame format.
Level 2 · How It Works Under the Hood
gorilla/websocket Internals
github.com/gorilla/websocket is the most widely used WebSocket library in the Go ecosystem. It encapsulates the framing protocol defined in RFC 6455 and provides a clean read/write API.
WebSocket frame layout:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len | Extended payload length |
|I|S|S|S| (4) |A| (7) | (16/64) |
|N|V|V|V| |S| | (if payload len==126/127) |
| |1|2|3| |K| | |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - -+
| Extended payload length continued, if payload len == 127 |
+ - - - - - - - - - - - - - - -+-------------------------------+
| |Masking-key, if MASK set to 1 |
+-------------------------------+-------------------------------+
| Masking-key (continued) | Payload Data |
+-------------------------------- - - - - - - - - - - - - - - -+
Key fields:
- FIN: 1 means this is the last fragment of the message
- opcode: 1=text, 2=binary, 8=close, 9=Ping, 10=Pong
- MASK: frames from client to server must be XOR-masked (prevents cache poisoning via intermediaries); server-to-client frames are not masked
- Masking-key: 4 random bytes used for the XOR operation
gorilla/websocket's Conn type hides all of this, but imposes one critical constraint: reads and writes are not concurrent-safe. A single goroutine must own each direction.
Connection Lifecycle Management
A WebSocket connection lifecycle:
Established (Upgrade) → Normal operation → Close handshake → TCP teardown
Normal close: either party sends a close frame (opcode=8) with an optional status code and reason string. The other side replies with a close frame, and the TCP connection is torn down. This is a graceful close.
Abnormal close: network interruption, process crash, and similar events terminate the TCP connection without a close handshake. This is why heartbeats are essential.
With gorilla/websocket:
// Set a read deadline: if no data (including Pong) arrives within this window,
// treat the connection as dead.
conn.SetReadDeadline(time.Now().Add(60 * time.Second))
// On each Pong, extend the read deadline (reset the heartbeat timer).
conn.SetPongHandler(func(string) error {
conn.SetReadDeadline(time.Now().Add(60 * time.Second))
return nil
})
Concurrent Reads and Writes: One Goroutine Per Direction
The gorilla/websocket documentation is explicit: Connections support one concurrent reader and one concurrent writer. This means:
- Only one goroutine may be reading at a time
- Only one goroutine may be writing at a time
- Concurrent read + write is fine (one goroutine each)
The reason: conn.Write() writes directly into the TCP send buffer. If two goroutines write concurrently, their frames will be interleaved, and the receiver will see corrupted data.
The correct concurrency pattern:
// Goroutine 1: dedicated reader
go func() {
for {
msgType, data, err := conn.ReadMessage()
// handle...
}
}()
// Goroutine 2: dedicated writer — receives messages through a channel
go func() {
for msg := range sendChan {
conn.WriteMessage(msg.Type, msg.Data)
}
}()
// All other code sends to the channel, never calls WriteMessage directly
sendChan <- Message{Type: websocket.TextMessage, Data: []byte("hello")}
Ping/Pong Heartbeat Protocol
WebSocket RFC defines Ping (opcode=9) and Pong (opcode=10) control frames for keepalive purposes. Browsers automatically reply to server Pings with a Pong.
The two purposes of heartbeats:
- Detect half-open connections: The TCP connection appears alive but the peer has vanished (network partition, process crash). Without heartbeats the server never discovers the connection is dead.
- Keep NAT/firewall mappings alive: Many NAT devices remove idle mappings after a timeout. Periodic Pings keep the path active.
Correct heartbeat timing constants:
const (
writeWait = 10 * time.Second // write operation deadline
pongWait = 60 * time.Second // max wait for a Pong
pingPeriod = (pongWait * 9) / 10 // send Ping before pongWait expires
)
The Connection Registry Pattern (Hub)
The standard pattern for managing large numbers of WebSocket connections is a central registry — the Hub — where all state mutations happen through channels:
connect → register channel → Hub goroutine → adds to connections map
disconnect → unregister channel → Hub goroutine → removes + cleans up
broadcast → broadcast channel → Hub goroutine → iterates connections, sends
The Hub goroutine is the sole modifier of the connections map, so no mutex is needed. This is one of Go's core concurrency insights: do not communicate by sharing memory; share memory by communicating.
Level 3 · Code in Practice
Building a Real-Time Chat Server
Below is a complete multi-room chat server with connection management, broadcasting, graceful disconnect handling, and heartbeats.
Message types (message.go):
package main
import (
"encoding/json"
"time"
)
type MessageType string
const (
MsgTypeChat MessageType = "chat"
MsgTypeSystem MessageType = "system"
MsgTypeError MessageType = "error"
)
type IncomingMessage struct {
Type MessageType `json:"type"`
Room string `json:"room"`
Content string `json:"content"`
}
type OutgoingMessage struct {
Type MessageType `json:"type"`
Room string `json:"room"`
Sender string `json:"sender"`
Content string `json:"content"`
Timestamp time.Time `json:"timestamp"`
}
func (m OutgoingMessage) Encode() ([]byte, error) {
return json.Marshal(m)
}
Client connection (client.go):
package main
import (
"encoding/json"
"log"
"time"
"github.com/gorilla/websocket"
)
const (
writeWait = 10 * time.Second
pongWait = 60 * time.Second
pingPeriod = (pongWait * 9) / 10
maxMessageSize = 4096
)
type Client struct {
id string
room string
hub *Hub
conn *websocket.Conn
send chan []byte // buffered outbound message queue
}
// readPump runs in its own goroutine and is the sole reader on this connection.
func (c *Client) readPump() {
defer func() {
c.hub.unregister <- c
c.conn.Close()
}()
c.conn.SetReadLimit(maxMessageSize)
c.conn.SetReadDeadline(time.Now().Add(pongWait))
c.conn.SetPongHandler(func(string) error {
c.conn.SetReadDeadline(time.Now().Add(pongWait))
return nil
})
for {
_, rawMsg, err := c.conn.ReadMessage()
if err != nil {
if websocket.IsUnexpectedCloseError(err,
websocket.CloseGoingAway,
websocket.CloseAbnormalClosure,
) {
log.Printf("client %s unexpected close: %v", c.id, err)
}
break
}
var incoming IncomingMessage
if err := json.Unmarshal(rawMsg, &incoming); err != nil {
log.Printf("client %s invalid JSON: %v", c.id, err)
continue
}
outgoing := OutgoingMessage{
Type: MsgTypeChat,
Room: c.room,
Sender: c.id,
Content: incoming.Content,
Timestamp: time.Now(),
}
if encoded, err := outgoing.Encode(); err == nil {
c.hub.broadcast <- BroadcastRequest{Room: c.room, Payload: encoded}
}
}
}
// writePump runs in its own goroutine and is the sole writer on this connection.
// It also owns the Ping ticker.
func (c *Client) writePump() {
ticker := time.NewTicker(pingPeriod)
defer func() {
ticker.Stop()
c.conn.Close()
}()
for {
select {
case message, ok := <-c.send:
c.conn.SetWriteDeadline(time.Now().Add(writeWait))
if !ok {
// Hub closed the channel: send a WebSocket close frame.
c.conn.WriteMessage(websocket.CloseMessage, []byte{})
return
}
w, err := c.conn.NextWriter(websocket.TextMessage)
if err != nil {
return
}
w.Write(message)
// Coalesce queued messages into one write to reduce syscall overhead.
n := len(c.send)
for i := 0; i < n; i++ {
w.Write([]byte{'\n'})
w.Write(<-c.send)
}
if err := w.Close(); err != nil {
return
}
case <-ticker.C:
c.conn.SetWriteDeadline(time.Now().Add(writeWait))
if err := c.conn.WriteMessage(websocket.PingMessage, nil); err != nil {
return
}
}
}
}
Hub: connection registry and message routing (hub.go):
package main
import (
"log"
"sync"
"time"
)
type BroadcastRequest struct {
Room string
Payload []byte
}
type HubStats struct {
TotalConnections int64
ActiveConnections int64
MessagesSent int64
}
// Hub manages all active client connections.
// All modifications to the rooms map happen inside Run(), which is a single
// goroutine — no mutex needed on the map itself.
type Hub struct {
rooms map[string]map[*Client]bool // room → set of clients
register chan *Client
unregister chan *Client
broadcast chan BroadcastRequest
mu sync.RWMutex
stats HubStats
}
func NewHub() *Hub {
return &Hub{
rooms: make(map[string]map[*Client]bool),
register: make(chan *Client, 256),
unregister: make(chan *Client, 256),
broadcast: make(chan BroadcastRequest, 1024),
}
}
func (h *Hub) Run() {
for {
select {
case c := <-h.register:
h.addClient(c)
case c := <-h.unregister:
h.removeClient(c)
case req := <-h.broadcast:
h.send(req)
}
}
}
func (h *Hub) addClient(c *Client) {
if h.rooms[c.room] == nil {
h.rooms[c.room] = make(map[*Client]bool)
}
h.rooms[c.room][c] = true
h.mu.Lock()
h.stats.TotalConnections++
h.stats.ActiveConnections++
h.mu.Unlock()
log.Printf("client %s joined room %s (%d in room)", c.id, c.room, len(h.rooms[c.room]))
joinMsg := OutgoingMessage{
Type: MsgTypeSystem,
Room: c.room,
Sender: "system",
Content: c.id + " joined",
Timestamp: time.Now(),
}
if encoded, err := joinMsg.Encode(); err == nil {
h.send(BroadcastRequest{Room: c.room, Payload: encoded})
}
}
func (h *Hub) removeClient(c *Client) {
if clients, ok := h.rooms[c.room]; ok {
if _, exists := clients[c]; exists {
delete(clients, c)
close(c.send) // closing the channel triggers writePump to exit
if len(clients) == 0 {
delete(h.rooms, c.room)
}
}
}
h.mu.Lock()
h.stats.ActiveConnections--
h.mu.Unlock()
log.Printf("client %s left room %s", c.id, c.room)
}
func (h *Hub) send(req BroadcastRequest) {
clients, ok := h.rooms[req.Room]
if !ok {
return
}
for c := range clients {
select {
case c.send <- req.Payload:
h.mu.Lock()
h.stats.MessagesSent++
h.mu.Unlock()
default:
// The client's send buffer is full — it is too slow to keep up.
// Drop it to protect the server.
log.Printf("client %s buffer full, disconnecting", c.id)
delete(clients, c)
close(c.send)
}
}
}
func (h *Hub) Stats() HubStats {
h.mu.RLock()
defer h.mu.RUnlock()
return h.stats
}
HTTP handler and main (main.go):
package main
import (
"encoding/json"
"fmt"
"log"
"net/http"
"strings"
"github.com/google/uuid"
"github.com/gorilla/websocket"
)
var upgrader = websocket.Upgrader{
ReadBufferSize: 1024,
WriteBufferSize: 1024,
CheckOrigin: func(r *http.Request) bool {
origin := r.Header.Get("Origin")
return strings.HasPrefix(origin, "https://example.com") ||
origin == "http://localhost:3000"
},
}
var hub = NewHub()
func wsHandler(w http.ResponseWriter, r *http.Request) {
room := strings.TrimPrefix(r.URL.Path, "/ws/")
if room == "" {
room = "general"
}
conn, err := upgrader.Upgrade(w, r, nil)
if err != nil {
log.Printf("upgrade: %v", err)
return
}
client := &Client{
id: uuid.New().String()[:8],
room: room,
hub: hub,
conn: conn,
send: make(chan []byte, 256),
}
hub.register <- client
go client.writePump()
go client.readPump()
}
func statsHandler(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(hub.Stats())
}
func main() {
go hub.Run()
http.HandleFunc("/ws/", wsHandler)
http.HandleFunc("/stats", statsHandler)
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintln(w, "Chat server. Connect to /ws/{room}")
})
log.Println("Listening on :8080")
log.Fatal(http.ListenAndServe(":8080", nil))
}
Client-side reconnection with exponential backoff (JavaScript):
class ReconnectingWebSocket {
constructor(url) {
this.url = url;
this.delay = 1000; // initial reconnect delay
this.maxDelay = 30000; // cap at 30 s
this.connect();
}
connect() {
this.ws = new WebSocket(this.url);
this.ws.onopen = () => {
console.log('connected');
this.delay = 1000; // reset backoff on successful connect
};
this.ws.onmessage = (e) => this.onmessage(JSON.parse(e.data));
this.ws.onclose = (e) => {
if (e.wasClean) return;
console.log(`reconnecting in ${this.delay}ms`);
setTimeout(() => {
this.delay = Math.min(this.delay * 2, this.maxDelay);
this.connect();
}, this.delay);
};
}
send(data) {
if (this.ws.readyState === WebSocket.OPEN) {
this.ws.send(JSON.stringify(data));
}
}
onmessage(msg) {} // override in subclass
}
Level 4 · Advanced Topics and Edge Cases
Scaling WebSocket Across Multiple Servers: Redis Pub/Sub
A single-server Hub does not scale horizontally. Client A connects to server 1; client B connects to server 2; server 1's Hub is unaware of server 2's connections, so a broadcast reaches only half the audience.
The solution: use Redis Pub/Sub as a message bus. Every server instance subscribes to a shared channel and publishes outbound messages to it:
import (
"context"
"encoding/json"
"github.com/redis/go-redis/v9"
)
type DistributedHub struct {
*Hub
rdb *redis.Client
channel string
}
func NewDistributedHub(redisAddr string) *DistributedHub {
rdb := redis.NewClient(&redis.Options{Addr: redisAddr})
h := &DistributedHub{
Hub: NewHub(),
rdb: rdb,
channel: "chat:broadcast",
}
go h.subscribe()
return h
}
// subscribe listens to the Redis channel and forwards messages to the local Hub.
func (h *DistributedHub) subscribe() {
ctx := context.Background()
sub := h.rdb.Subscribe(ctx, h.channel)
for msg := range sub.Channel() {
var req BroadcastRequest
if err := json.Unmarshal([]byte(msg.Payload), &req); err != nil {
continue
}
h.Hub.broadcast <- req
}
}
// Publish sends a broadcast request to every server instance via Redis.
func (h *DistributedHub) Publish(req BroadcastRequest) {
data, _ := json.Marshal(req)
h.rdb.Publish(context.Background(), h.channel, data)
}
Each server's local Hub handles only the clients directly connected to it. When a message must reach a room whose members are spread across servers, publish it to Redis; all subscribers pick it up and deliver it locally.
If you also need room membership counts visible cluster-wide, store member sets in Redis using SADD/SREM/SCARD. Retrieve the count with SCARD roomName.
WebSocket Compression: permessage-deflate
For text messages such as JSON payloads, compression can substantially reduce bandwidth. RFC 7692 defines the permessage-deflate extension; gorilla/websocket supports it with a single flag:
var upgrader = websocket.Upgrader{
EnableCompression: true,
}
Client and server negotiate compression capability during the handshake. If both agree, subsequent message frames are deflate-compressed.
Compression trades CPU for bandwidth. Benchmark before enabling in production:
- Messages smaller than ~100 bytes may grow after compression due to the deflate header overhead.
- General guideline: enable compression when the average message size exceeds 512 bytes.
- Under heavy concurrency, compression CPU cost can become the throughput bottleneck.
Per-Connection Rate Limiting
Prevent malicious clients from flooding the server with messages using Go's golang.org/x/time/rate token bucket:
import "golang.org/x/time/rate"
type Client struct {
// ...
limiter *rate.Limiter
}
// In NewClient: allow 10 messages per second, burst of 20.
client := &Client{
limiter: rate.NewLimiter(10, 20),
// ...
}
// In readPump, before processing each message:
if !c.limiter.Allow() {
log.Printf("client %s rate limited", c.id)
// Option A: silently drop the message
continue
// Option B: disconnect the client (stricter policy)
// break
}
For more sophisticated policies — for example, rate limiting by room rather than by connection, or applying different limits to authenticated vs. anonymous users — maintain a shared sync.Map of limiters keyed by user ID rather than connection.
Load Testing with k6
import ws from 'k6/ws';
import { check } from 'k6';
export let options = {
vus: 1000, // 1000 concurrent virtual users
duration: '60s',
};
export default function() {
const url = 'ws://localhost:8080/ws/loadtest';
const res = ws.connect(url, {}, function(socket) {
socket.on('open', () => {
socket.setInterval(() => {
socket.send(JSON.stringify({
type: 'chat',
room: 'loadtest',
content: 'hello from k6',
}));
}, 1000);
});
socket.on('message', (data) => {
const msg = JSON.parse(data);
check(msg, { 'has type field': (m) => m.type !== undefined });
});
socket.setTimeout(() => socket.close(), 30000);
});
check(res, { 'status 101': (r) => r && r.status === 101 });
}
Pay attention to these metrics in the k6 output:
- ws_sessions: total WebSocket sessions opened
- ws_msgs_received / ws_msgs_sent: message throughput
- ws_connecting: time to complete the Upgrade handshake
Goroutine Leak Detection
Every WebSocket connection in our server spawns exactly 2 goroutines (readPump + writePump). If connections are not properly cleaned up, goroutines accumulate silently until the process runs out of memory. Detect leaks in tests with go.uber.org/goleak:
import (
"testing"
"time"
"go.uber.org/goleak"
"github.com/gorilla/websocket"
)
func TestNoLeak(t *testing.T) {
defer goleak.VerifyNone(t)
server := httptest.NewServer(http.HandlerFunc(wsHandler))
defer server.Close()
url := "ws" + strings.TrimPrefix(server.URL, "http") + "/ws/test"
conn, _, err := websocket.DefaultDialer.Dial(url, nil)
if err != nil {
t.Fatal(err)
}
// Normal close
conn.WriteMessage(
websocket.CloseMessage,
websocket.FormatCloseMessage(websocket.CloseNormalClosure, ""),
)
conn.Close()
time.Sleep(100 * time.Millisecond)
// goleak.VerifyNone fires here and fails if any goroutines remain
}
The exit chain that prevents leaks:
readPumpencounters an error or close frame → executesdefer hub.unregister <- c→ sends client to Hub- Hub's
removeClientclosesc.send(the write channel) writePump'sselectreceives from a closed channel withok=false→ sends WebSocket close frame → returns → TCP connection closes
Every link in this chain must be present. A missing defer or an unclosed channel anywhere breaks the chain and leaks a goroutine permanently.
WebSocket is a powerful tool for real-time applications, but its statefulness and the connection management complexity it introduces are genuine engineering challenges. Master the design patterns covered in this chapter — the Hub registry, one goroutine per direction, Ping/Pong heartbeats, distributed Redis routing, and goroutine leak prevention — and you have a complete toolkit for building high-concurrency real-time services in Go.