Chapter 42

Build an RPC Framework

In 1984, Birrell and Nelson published "Implementing Remote Procedure Calls." They proposed a deceptively simple idea: calling a function on a remote machine should be as simple as calling a local function. That idea gave birth to RPC (Remote Procedure Call), and to 40 years of core infrastructure for distributed systems.

Today, almost all internal service communication in large systems goes through RPC. Google's gRPC, Facebook's Thrift, Alibaba's Dubbo — they are all solving the same fundamental problem: how to wrap network calls so transparently they feel like function calls.

In this chapter we build a complete RPC framework in Go from scratch, then progressively add production-grade features: interceptors, load balancing, health checking. This journey reveals the reasoning behind every core design decision in frameworks like gRPC.

Level 1 · What RPC Really Is

Hiding Network Calls

Without an RPC framework, calling a service on another machine looks like this:

// The pain of manual network communication
conn, _ := net.Dial("tcp", "10.0.0.1:8080")
request := serialize(AddRequest{A: 1, B: 2})
conn.Write(request)
buf := make([]byte, 1024)
conn.Read(buf)
response := deserialize(buf)
fmt.Println(response.Result)

Every service call requires manual handling: establishing a connection, serializing arguments, sending data, receiving the response, deserializing the result, handling errors. Repetitive, tedious, and error-prone.

An RPC framework hides all of this, making the call look like this:

// With an RPC framework
client := NewMathClient("10.0.0.1:8080")
result, err := client.Add(ctx, &AddRequest{A: 1, B: 2})
fmt.Println(result.Result)

This involves several core problems:

Serialization: How do arguments become bytes? How do bytes become arguments?
Transport: How are bytes sent over the network? What protocol?
Service discovery: How does the client know where the server is?
Error handling: How do network errors, timeouts, and server exceptions propagate back?

REST vs RPC vs GraphQL: The Essential Tradeoffs

These three API styles represent three different design philosophies, each with its own appropriate use cases:

REST (Representational State Transfer): resource-centric, using HTTP verbs (GET/POST/PUT/DELETE) to express operations and URL paths to express resource hierarchies. Strengths: broad generality (any HTTP client can use it), cacheability, human-readable. Weaknesses: complex operations ("transfer funds," "approve request") are hard to express in a resource model; protocol overhead is high (HTTP/1.1 headers repeat every request); versioning is complex.

RPC: action-centric, directly exposing service methods. Strengths: type safety (parameter types defined in schema), high performance (custom serialization, HTTP/2 multiplexing), ideal for internal service-to-service communication. Weaknesses: requires specific clients (ordinary browsers can't use it directly), cross-language support requires code generation, interface changes require version management.

GraphQL: data graph-centric, clients precisely declare which fields they need. Strengths: solves REST's over-fetching and under-fetching problems, especially suitable for frontend-driven scenarios. Weaknesses: server implementation is complex (N+1 query problem), effective caching is difficult, unsuitable for service-to-service communication.

Rule of thumb:

External APIs (for browsers, third parties): REST or GraphQL
Internal service-to-service communication (microservices): RPC (gRPC)
Frontend-heavy products needing flexible data queries: GraphQL

Protobuf vs JSON: Deep Serialization Tradeoffs

The choice of serialization format has a massive impact on performance:

JSON: human-readable, broad cross-language support, easy to debug. But the text format means: the number 1234567890 requires 10 bytes, while binary format needs only 4 bytes (int32); field names are transmitted every time ("user_id": — those 9 bytes repeat in every record); parsing requires string comparison, 5-10x slower than binary parsing.

Protobuf: Google's open-source binary serialization format. Uses field numbers instead of field names (user_id is field 1 in the schema, taking only 1-2 bytes when serialized); uses variable-length integer encoding (Varint) to compress numeric values; can add or remove fields while remaining forward and backward compatible. In typical scenarios, Protobuf is 3-10x smaller than JSON and parses 5-20x faster.

The cost: binary is not human-readable; you need the .proto schema file to parse data; debugging requires specialized tools.

Level 2 · gRPC Architecture Principles

HTTP/2 as the Transport Layer

gRPC's choice of HTTP/2 as its transport layer was deliberate:

Multiplexing: In HTTP/1.1, a single connection can process only one request at a time (serial). In HTTP/2, one connection can process multiple requests simultaneously (streams), completely eliminating the Head-of-Line Blocking problem of HTTP/1.1. gRPC concurrent calls multiplex over the same TCP connection, saving connection setup overhead.

Header compression: HTTP/1.1 sends complete headers with every request. gRPC's metadata (method name, auth token, etc.) is HTTP/2 headers, compressed with HPACK — repeated headers consume almost no bandwidth.

Streaming transport: HTTP/2 natively supports bidirectional streams. gRPC's Server Streaming, Client Streaming, and Bidirectional Streaming all build on this.

Frames: HTTP/2 cuts data into frames. Each frame has a Stream ID; the receiver can reassemble frames from different streams. This is what enables a single connection to truly process multiple requests concurrently.

Interceptors: The Middleware Layer for RPCs

gRPC interceptors are one of the most important extension mechanisms in any RPC framework. They allow adding to all RPC calls — without modifying business code — logging, metrics collection, authentication, rate limiting, and distributed tracing.

Interceptors form a call chain (similar to HTTP middleware):

Request → Interceptor A → Interceptor B → Business Handler → Interceptor B → Interceptor A → Response

gRPC provides Server-side and Client-side interceptors, each in two variants: Unary and Stream.

Service Discovery and Load Balancing

gRPC natively supports service discovery through the Resolver interface, connecting to etcd, Consul, Kubernetes, and other registries:

Client → Resolver (resolve service name → address list) → Balancer (select one address) → Establish connection

Load balancing strategies:

Round Robin: rotate through servers, simplest, suitable for stateless services
Least Connection: select the instance with the fewest current connections, suitable when request processing times vary greatly
Weighted Round Robin: rotate by weight, suitable for heterogeneous machines
Consistent Hashing: suitable for session affinity scenarios

Level 3 · Building the RPC Framework

Overall Architecture

Our RPC framework has these layers:

Client Side:                     Server Side:

+---------------------+         +---------------------+
|  Generated Stub     |         |   Service Registry  |
|  (type-safe API)    |         |   (reflect-based)   |
+---------------------+         +---------------------+
         |                               |
+---------------------+         +---------------------+
|  Client Interceptor |         |  Server Interceptor |
|  Chain              |         |  Chain              |
+---------------------+         +---------------------+
         |                               |
+---------------------+         +---------------------+
|  Codec (gob/json/   |         |  Codec (decode req, |
|  protobuf)          |         |  encode resp)       |
+---------------------+         +---------------------+
         |                               |
+---------------------+         +---------------------+
|  TCP Transport      |<------->|  TCP Transport      |
|  + Connection Pool  |         |  (goroutine/conn)   |
+---------------------+         +---------------------+

Step 1: Request/Response Protocol and Codec

package codec

import (
    "bufio"
    "encoding/gob"
    "encoding/json"
    "fmt"
    "io"
)

// Header holds metadata for each RPC call
type Header struct {
    ServiceMethod string // "ServiceName.MethodName"
    Seq           uint64 // request sequence number (for matching async responses)
    Error         string // server-side error (body is ignored when non-empty)
}

// Codec encodes and decodes Header+Body
type Codec interface {
    io.Closer
    ReadHeader(*Header) error
    ReadBody(interface{}) error
    Write(*Header, interface{}) error
}

// GobCodec uses Go's standard gob encoding
type GobCodec struct {
    conn io.ReadWriteCloser
    buf  *bufio.Writer
    dec  *gob.Decoder
    enc  *gob.Encoder
}

func NewGobCodec(conn io.ReadWriteCloser) Codec {
    buf := bufio.NewWriter(conn)
    return &GobCodec{
        conn: conn,
        buf:  buf,
        dec:  gob.NewDecoder(conn),
        enc:  gob.NewEncoder(buf),
    }
}

func (c *GobCodec) ReadHeader(h *Header) error {
    return c.dec.Decode(h)
}

func (c *GobCodec) ReadBody(body interface{}) error {
    return c.dec.Decode(body)
}

func (c *GobCodec) Write(h *Header, body interface{}) (err error) {
    defer func() {
        _ = c.buf.Flush()
        if err != nil {
            _ = c.Close()
        }
    }()
    if err = c.enc.Encode(h); err != nil {
        return fmt.Errorf("encode header: %w", err)
    }
    if err = c.enc.Encode(body); err != nil {
        return fmt.Errorf("encode body: %w", err)
    }
    return nil
}

func (c *GobCodec) Close() error { return c.conn.Close() }

// JSONCodec uses JSON encoding (easier to debug)
type JSONCodec struct {
    conn io.ReadWriteCloser
    dec  *json.Decoder
    enc  *json.Encoder
}

func NewJSONCodec(conn io.ReadWriteCloser) Codec {
    return &JSONCodec{
        conn: conn,
        dec:  json.NewDecoder(conn),
        enc:  json.NewEncoder(conn),
    }
}

func (c *JSONCodec) ReadHeader(h *Header) error    { return c.dec.Decode(h) }
func (c *JSONCodec) ReadBody(body interface{}) error { return c.dec.Decode(body) }
func (c *JSONCodec) Write(h *Header, body interface{}) error {
    if err := c.enc.Encode(h); err != nil {
        return err
    }
    return c.enc.Encode(body)
}
func (c *JSONCodec) Close() error { return c.conn.Close() }

type CodecType string

const (
    GobType  CodecType = "application/gob"
    JSONType CodecType = "application/json"
)

type NewCodecFunc func(io.ReadWriteCloser) Codec

var NewCodecFuncMap = map[CodecType]NewCodecFunc{
    GobType:  NewGobCodec,
    JSONType: NewJSONCodec,
}

Step 2: Service Registry via Reflection

package server

import (
    "fmt"
    "go/token"
    "log"
    "reflect"
    "strings"
    "sync"
)

// methodType describes a method available for remote invocation
type methodType struct {
    method    reflect.Method
    ArgType   reflect.Type
    ReplyType reflect.Type
    numCalls  uint64
}

func (m *methodType) newArgv() reflect.Value {
    if m.ArgType.Kind() == reflect.Ptr {
        return reflect.New(m.ArgType.Elem())
    }
    return reflect.New(m.ArgType).Elem()
}

func (m *methodType) newReplyv() reflect.Value {
    replyv := reflect.New(m.ReplyType.Elem())
    switch m.ReplyType.Elem().Kind() {
    case reflect.Map:
        replyv.Elem().Set(reflect.MakeMap(m.ReplyType.Elem()))
    case reflect.Slice:
        replyv.Elem().Set(reflect.MakeSlice(m.ReplyType.Elem(), 0, 0))
    }
    return replyv
}

// service represents a registered service
type service struct {
    name    string
    typ     reflect.Type
    rcvr    reflect.Value
    methods map[string]*methodType
}

func newService(rcvr interface{}) *service {
    s := &service{}
    s.rcvr = reflect.ValueOf(rcvr)
    s.typ = reflect.TypeOf(rcvr)
    s.name = reflect.Indirect(s.rcvr).Type().Name()
    if !token.IsExported(s.name) {
        log.Fatalf("rpc server: %s is not a valid service name", s.name)
    }
    s.registerMethods()
    return s
}

func (s *service) registerMethods() {
    s.methods = make(map[string]*methodType)
    for i := 0; i < s.typ.NumMethod(); i++ {
        method := s.typ.Method(i)
        mType := method.Type
        // Valid RPC method signature: func (t *T) Method(args ArgType, reply *ReplyType) error
        if mType.NumIn() != 3 || mType.NumOut() != 1 {
            continue
        }
        if mType.Out(0) != reflect.TypeOf((*error)(nil)).Elem() {
            continue
        }
        argType, replyType := mType.In(1), mType.In(2)
        if !isExportedOrBuiltinType(argType) || !isExportedOrBuiltinType(replyType) {
            continue
        }
        s.methods[method.Name] = &methodType{
            method:    method,
            ArgType:   argType,
            ReplyType: replyType,
        }
        log.Printf("rpc server: register %s.%s\n", s.name, method.Name)
    }
}

func (s *service) call(m *methodType, argv, replyv reflect.Value) error {
    f := m.method.Func
    returnValues := f.Call([]reflect.Value{s.rcvr, argv, replyv})
    if errInter := returnValues[0].Interface(); errInter != nil {
        return errInter.(error)
    }
    return nil
}

func isExportedOrBuiltinType(t reflect.Type) bool {
    return token.IsExported(t.Name()) || t.PkgPath() == ""
}

// Server is the RPC server
type Server struct {
    serviceMap sync.Map
}

func NewServer() *Server { return &Server{} }

func (s *Server) Register(rcvr interface{}) error {
    svc := newService(rcvr)
    if _, dup := s.serviceMap.LoadOrStore(svc.name, svc); dup {
        return fmt.Errorf("rpc: service already defined: %s", svc.name)
    }
    return nil
}

func (s *Server) findService(serviceMethod string) (svc *service, mtype *methodType, err error) {
    dot := strings.LastIndex(serviceMethod, ".")
    if dot < 0 {
        err = fmt.Errorf("rpc server: malformed service method: %s", serviceMethod)
        return
    }
    serviceName, methodName := serviceMethod[:dot], serviceMethod[dot+1:]
    svci, ok := s.serviceMap.Load(serviceName)
    if !ok {
        err = fmt.Errorf("rpc server: can't find service %s", serviceName)
        return
    }
    svc = svci.(*service)
    mtype, ok = svc.methods[methodName]
    if !ok {
        err = fmt.Errorf("rpc server: can't find method %s", methodName)
    }
    return
}

Step 3: Server Connection Handling

package server

import (
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net"
    "reflect"
    "strings"
    "sync"
    "time"

    "github.com/yourname/minirpc/codec"
)

// Option is the negotiation payload (client sends, server reads)
type Option struct {
    MagicNumber   int
    CodecType     codec.CodecType
    ConnTimeout   time.Duration
    HandleTimeout time.Duration
}

const MagicNumber = 0x3bef5c

var DefaultOption = &Option{
    MagicNumber: MagicNumber,
    CodecType:   codec.GobType,
    ConnTimeout: time.Second * 10,
}

func (s *Server) ServeConn(conn io.ReadWriteCloser) {
    defer conn.Close()
    var opt Option
    if err := json.NewDecoder(conn).Decode(&opt); err != nil {
        log.Printf("rpc server: decode option error: %v", err)
        return
    }
    if opt.MagicNumber != MagicNumber {
        log.Printf("rpc server: invalid magic number %x", opt.MagicNumber)
        return
    }
    newCodec := codec.NewCodecFuncMap[opt.CodecType]
    if newCodec == nil {
        log.Printf("rpc server: invalid codec type %s", opt.CodecType)
        return
    }
    s.serveCodec(newCodec(conn), &opt)
}

type request struct {
    h      *codec.Header
    argv   reflect.Value
    replyv reflect.Value
    mtype  *methodType
    svc    *service
}

func (s *Server) serveCodec(cc codec.Codec, opt *Option) {
    sending := new(sync.Mutex)
    wg := new(sync.WaitGroup)

    for {
        req, err := s.readRequest(cc)
        if err != nil {
            if req == nil {
                break
            }
            req.h.Error = err.Error()
            s.sendResponse(cc, req.h, struct{}{}, sending)
            continue
        }
        wg.Add(1)
        go s.handleRequest(cc, req, sending, wg, opt.HandleTimeout)
    }
    wg.Wait()
    _ = cc.Close()
}

func (s *Server) readRequest(cc codec.Codec) (*request, error) {
    var h codec.Header
    if err := cc.ReadHeader(&h); err != nil {
        if err != io.EOF && !strings.HasSuffix(err.Error(), "EOF") {
            log.Printf("rpc server: read header error: %v", err)
        }
        return nil, err
    }
    req := &request{h: &h}
    var err error
    req.svc, req.mtype, err = s.findService(h.ServiceMethod)
    if err != nil {
        _ = cc.ReadBody(nil)
        return req, err
    }
    req.argv = req.mtype.newArgv()
    req.replyv = req.mtype.newReplyv()

    argvi := req.argv.Interface()
    if req.argv.Type().Kind() != reflect.Ptr {
        argvi = req.argv.Addr().Interface()
    }
    if err = cc.ReadBody(argvi); err != nil {
        log.Printf("rpc server: read body err: %v", err)
        return req, err
    }
    return req, nil
}

func (s *Server) handleRequest(cc codec.Codec, req *request, sending *sync.Mutex, wg *sync.WaitGroup, timeout time.Duration) {
    defer wg.Done()
    called := make(chan struct{})
    sent := make(chan struct{})

    go func() {
        err := req.svc.call(req.mtype, req.argv, req.replyv)
        called <- struct{}{}
        if err != nil {
            req.h.Error = err.Error()
            s.sendResponse(cc, req.h, struct{}{}, sending)
        } else {
            s.sendResponse(cc, req.h, req.replyv.Interface(), sending)
        }
        sent <- struct{}{}
    }()

    if timeout == 0 {
        <-called
        <-sent
        return
    }
    select {
    case <-time.After(timeout):
        req.h.Error = fmt.Sprintf("rpc server: handle timeout within %s", timeout)
        s.sendResponse(cc, req.h, struct{}{}, sending)
    case <-called:
        <-sent
    }
}

func (s *Server) sendResponse(cc codec.Codec, h *codec.Header, body interface{}, sending *sync.Mutex) {
    sending.Lock()
    defer sending.Unlock()
    if err := cc.Write(h, body); err != nil {
        log.Printf("rpc server: write response error: %v", err)
    }
}

func (s *Server) Accept(lis net.Listener) {
    for {
        conn, err := lis.Accept()
        if err != nil {
            log.Printf("rpc server: accept error: %v", err)
            return
        }
        go s.ServeConn(conn)
    }
}

var DefaultServer = NewServer()

func Register(rcvr interface{}) error { return DefaultServer.Register(rcvr) }
func Accept(lis net.Listener)         { DefaultServer.Accept(lis) }

Step 4: Client with Async Support

package client

import (
    "context"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net"
    "sync"
    "time"

    "github.com/yourname/minirpc/codec"
    "github.com/yourname/minirpc/server"
)

// Call represents an active or completed RPC
type Call struct {
    Seq           uint64
    ServiceMethod string
    Args          interface{}
    Reply         interface{}
    Error         error
    Done          chan *Call
}

func (c *Call) done() {
    c.Done <- c
}

// Client is the RPC client
type Client struct {
    cc       codec.Codec
    opt      *server.Option
    sending  sync.Mutex
    header   codec.Header
    mu       sync.Mutex
    seq      uint64
    pending  map[uint64]*Call
    closing  bool
    shutdown bool
}

func NewClient(conn net.Conn, opt *server.Option) (*Client, error) {
    newCodecFunc := codec.NewCodecFuncMap[opt.CodecType]
    if newCodecFunc == nil {
        return nil, fmt.Errorf("invalid codec type %s", opt.CodecType)
    }
    if err := json.NewEncoder(conn).Encode(opt); err != nil {
        return nil, fmt.Errorf("rpc client: encode option: %w", err)
    }
    c := &Client{
        seq:     1,
        cc:      newCodecFunc(conn),
        opt:     opt,
        pending: make(map[uint64]*Call),
    }
    go c.receive()
    return c, nil
}

func Dial(network, address string, opts ...*server.Option) (*Client, error) {
    opt := server.DefaultOption
    if len(opts) > 0 && opts[0] != nil {
        opt = opts[0]
    }
    conn, err := net.DialTimeout(network, address, opt.ConnTimeout)
    if err != nil {
        return nil, err
    }
    c, err := NewClient(conn, opt)
    if err != nil {
        conn.Close()
    }
    return c, err
}

func (c *Client) Close() error {
    c.mu.Lock()
    defer c.mu.Unlock()
    if c.closing {
        return fmt.Errorf("connection already closed")
    }
    c.closing = true
    return c.cc.Close()
}

func (c *Client) IsAvailable() bool {
    c.mu.Lock()
    defer c.mu.Unlock()
    return !c.shutdown && !c.closing
}

func (c *Client) registerCall(call *Call) (uint64, error) {
    c.mu.Lock()
    defer c.mu.Unlock()
    if c.closing || c.shutdown {
        return 0, fmt.Errorf("rpc client: shutting down")
    }
    call.Seq = c.seq
    c.pending[call.Seq] = call
    c.seq++
    return call.Seq, nil
}

func (c *Client) removeCall(seq uint64) *Call {
    c.mu.Lock()
    defer c.mu.Unlock()
    call := c.pending[seq]
    delete(c.pending, seq)
    return call
}

func (c *Client) terminateCalls(err error) {
    c.sending.Lock()
    defer c.sending.Unlock()
    c.mu.Lock()
    defer c.mu.Unlock()
    c.shutdown = true
    for _, call := range c.pending {
        call.Error = err
        call.done()
    }
}

func (c *Client) receive() {
    var err error
    for err == nil {
        var h codec.Header
        if err = c.cc.ReadHeader(&h); err != nil {
            break
        }
        call := c.removeCall(h.Seq)
        switch {
        case call == nil:
            err = c.cc.ReadBody(nil)
        case h.Error != "":
            call.Error = fmt.Errorf(h.Error)
            _ = c.cc.ReadBody(nil)
            call.done()
        default:
            err = c.cc.ReadBody(call.Reply)
            if err != nil {
                call.Error = fmt.Errorf("reading body: %w", err)
            }
            call.done()
        }
    }
    c.terminateCalls(err)
}

func (c *Client) send(call *Call) {
    c.sending.Lock()
    defer c.sending.Unlock()

    seq, err := c.registerCall(call)
    if err != nil {
        call.Error = err
        call.done()
        return
    }

    c.header.ServiceMethod = call.ServiceMethod
    c.header.Seq = seq
    c.header.Error = ""

    if err := c.cc.Write(&c.header, call.Args); err != nil {
        if call := c.removeCall(seq); call != nil {
            call.Error = err
            call.done()
        }
    }
}

// Go performs an asynchronous call, returns immediately
func (c *Client) Go(serviceMethod string, args, reply interface{}, done chan *Call) *Call {
    if done == nil {
        done = make(chan *Call, 1)
    }
    call := &Call{
        ServiceMethod: serviceMethod,
        Args:          args,
        Reply:         reply,
        Done:          done,
    }
    c.send(call)
    return call
}

// Call performs a synchronous call, blocking until complete or timeout
func (c *Client) Call(ctx context.Context, serviceMethod string, args, reply interface{}) error {
    call := c.Go(serviceMethod, args, reply, make(chan *Call, 1))
    select {
    case <-ctx.Done():
        c.removeCall(call.Seq)
        return fmt.Errorf("rpc client: call failed: %s", ctx.Err())
    case call := <-call.Done:
        return call.Error
    }
}

Step 5: Interceptors and Load Balancer

package middleware

import (
    "context"
    "fmt"
    "log"
    "time"
)

type UnaryInterceptor func(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error)

type UnaryServerInfo struct {
    Server     interface{}
    FullMethod string
}

type UnaryHandler func(ctx context.Context, req interface{}) (interface{}, error)

// ChainUnaryInterceptors links multiple interceptors into one
func ChainUnaryInterceptors(interceptors ...UnaryInterceptor) UnaryInterceptor {
    return func(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error) {
        chained := handler
        for i := len(interceptors) - 1; i >= 0; i-- {
            interceptor := interceptors[i]
            next := chained
            chained = func(ctx context.Context, req interface{}) (interface{}, error) {
                return interceptor(ctx, req, info, next)
            }
        }
        return chained(ctx, req)
    }
}

// LoggingInterceptor logs each RPC call with timing
func LoggingInterceptor(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error) {
    start := time.Now()
    log.Printf("RPC call: %s req=%v", info.FullMethod, req)
    resp, err := handler(ctx, req)
    elapsed := time.Since(start)
    if err != nil {
        log.Printf("RPC error: %s err=%v elapsed=%v", info.FullMethod, err, elapsed)
    } else {
        log.Printf("RPC ok: %s elapsed=%v", info.FullMethod, elapsed)
    }
    return resp, err
}

// RecoveryInterceptor recovers from panics in handler
func RecoveryInterceptor(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (resp interface{}, err error) {
    defer func() {
        if r := recover(); r != nil {
            log.Printf("RPC panic: %s panic=%v", info.FullMethod, r)
            err = fmt.Errorf("internal server error")
        }
    }()
    return handler(ctx, req)
}

// ===================== Load Balancer =====================

package balancer

import (
    "fmt"
    "math/rand"
    "sync"
    "sync/atomic"
)

type SelectMode int

const (
    RandomSelect     SelectMode = iota
    RoundRobinSelect
)

type Discovery interface {
    Refresh() error
    Update(servers []string) error
    Get(mode SelectMode) (string, error)
    GetAll() ([]string, error)
}

// MultiServerDiscovery is a manually maintained server list (no registry dependency)
type MultiServerDiscovery struct {
    r       *rand.Rand
    mu      sync.RWMutex
    servers []string
    index   uint64
}

func NewMultiServerDiscovery(servers []string) *MultiServerDiscovery {
    return &MultiServerDiscovery{
        r:       rand.New(rand.NewSource(rand.Int63())),
        servers: servers,
    }
}

func (d *MultiServerDiscovery) Refresh() error { return nil }

func (d *MultiServerDiscovery) Update(servers []string) error {
    d.mu.Lock()
    defer d.mu.Unlock()
    d.servers = servers
    return nil
}

func (d *MultiServerDiscovery) Get(mode SelectMode) (string, error) {
    d.mu.RLock()
    defer d.mu.RUnlock()
    n := len(d.servers)
    if n == 0 {
        return "", fmt.Errorf("rpc discovery: no available servers")
    }
    switch mode {
    case RandomSelect:
        return d.servers[d.r.Intn(n)], nil
    case RoundRobinSelect:
        idx := atomic.AddUint64(&d.index, 1) - 1
        return d.servers[idx%uint64(n)], nil
    default:
        return "", fmt.Errorf("unsupported select mode")
    }
}

func (d *MultiServerDiscovery) GetAll() ([]string, error) {
    d.mu.RLock()
    defer d.mu.RUnlock()
    return d.servers, nil
}

Full Usage Example

package main

import (
    "context"
    "fmt"
    "log"
    "net"
    "sync"
    "time"

    "github.com/yourname/minirpc/client"
    "github.com/yourname/minirpc/server"
)

type MathService struct{}
type Args struct{ A, B int }
type Reply struct{ C int }

func (m *MathService) Add(args Args, reply *Reply) error {
    reply.C = args.A + args.B
    return nil
}

func (m *MathService) Multiply(args Args, reply *Reply) error {
    reply.C = args.A * args.B
    return nil
}

func startServer(wg *sync.WaitGroup) {
    if err := server.Register(&MathService{}); err != nil {
        log.Fatal("register:", err)
    }
    lis, err := net.Listen("tcp", ":9999")
    if err != nil {
        log.Fatal("listen:", err)
    }
    log.Println("server on :9999")
    wg.Done()
    server.Accept(lis)
}

func main() {
    var wg sync.WaitGroup
    wg.Add(1)
    go startServer(&wg)
    wg.Wait()
    time.Sleep(time.Second)

    c, err := client.Dial("tcp", ":9999")
    if err != nil {
        log.Fatal("dial:", err)
    }
    defer c.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    var reply Reply
    if err := c.Call(ctx, "MathService.Add", Args{1, 2}, &reply); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("1 + 2 = %d\n", reply.C)

    if err := c.Call(ctx, "MathService.Multiply", Args{6, 7}, &reply); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("6 * 7 = %d\n", reply.C)
}

Level 4 · Advanced Topics and Edge Cases

Health Check Protocol

Production RPC frameworks must be able to detect whether backend instances are healthy. gRPC defines a standard health checking protocol. Services implement the grpc.health.v1.Health service:

// Periodic liveness probe
func healthCheck(addr string, interval time.Duration, unhealthy chan<- string) {
    for {
        conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
        if err != nil {
            unhealthy <- addr
        } else {
            conn.Close()
        }
        time.Sleep(interval)
    }
}

gRPC-Gateway: REST Compatibility

Internal services use gRPC, but external APIs need REST. grpc-gateway solves this: it reads HTTP annotations in Protobuf files and auto-generates an HTTP proxy layer that converts REST requests into gRPC calls:

service MathService {
  rpc Add(AddRequest) returns (AddResponse) {
    option (google.api.http) = {
      post: "/v1/math/add"
      body: "*"
    };
  }
}

With this annotation, the gateway accepts POST /v1/math/add with a JSON body and forwards it as a gRPC call to the backend. Zero business code changes required.

Timeout Propagation

In a microservice call chain, timeouts must propagate downstream. If the chain is A → B → C and A sets a 1-second timeout, B must know how much time remains when calling C — otherwise C may be executing long after A has already timed out.

gRPC propagates deadlines through context.Context:

// A initiates the call with a deadline
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
// context.Deadline() is serialized into the request metadata by gRPC
resp, err := clientB.Call(ctx, ...)

// When B receives the request, the context already carries the deadline
// B passes the same ctx to C
respC, err := clientC.Call(ctx, ...) // ctx.Deadline() is still A's original deadline

This means if A's 1 second is almost up, B's call to C will automatically get a very short timeout, preventing cascading latency buildup.

gRPC vs Thrift vs Dubbo

gRPC: from Google, HTTP/2 + Protobuf, broadest cross-language support (nearly every language has official support), strong streaming RPC support, deep CNCF ecosystem integration (Kubernetes, Istio, and others support it natively).

Thrift: from Facebook, custom binary protocol, older than gRPC, similar performance, but smaller ecosystem, limited streaming support. Used internally at Facebook; HBase's client protocol also uses Thrift.

Dubbo: from Alibaba, focused on the Java ecosystem, more complete service governance features (rate limiting, circuit breaking, gray deployment built in), suitable for Java microservice architectures.

Rough performance comparison (QPS, highly workload-dependent):

Framework	Serialization	Transport	Relative Performance
gRPC	Protobuf	HTTP/2	baseline
Thrift Binary	Thrift	TCP	~110%
JSON over HTTP/1.1	JSON	HTTP/1.1	~30-40%
gob over TCP	gob	TCP	~90%

Protobuf Schema Evolution Rules

These rules govern backward-compatible schema changes — the key to safe rolling upgrades:

Can add new fields: old code ignores unknown field numbers
Cannot change a field's number: field numbers are the wire identity
Cannot reuse a deleted field's number: mark it reserved instead of removing
Changing field types requires caution: int32 → int64 is wire-compatible; int32 → string is not

message User {
  uint64 id = 1;
  string name = 2;
  reserved 3;          // field 3 was deleted, number is reserved
  reserved "email";    // field name "email" is reserved
  int32 age = 4;       // newly added field
}

These rules guarantee that rolling upgrades — where old and new versions of a service run simultaneously — don't cause communication failures due to schema mismatches. This is one of the features that makes Protobuf-based RPC genuinely viable at the scale of large organizations, not just a performance optimization but a correctness guarantee.

Understanding RPC frameworks at this depth — wire protocols, reflection-based dispatch, interceptor chains, timeout propagation — makes you a far more effective user of gRPC in production. You stop treating it as a black box and start understanding exactly which lever to pull when something goes wrong.

Rate this chapter

4.7 / 5 (3 ratings)

Build an RPC Framework

Build an RPC Framework

Level 1 · What RPC Really Is

Hiding Network Calls

REST vs RPC vs GraphQL: The Essential Tradeoffs

Protobuf vs JSON: Deep Serialization Tradeoffs

Level 2 · gRPC Architecture Principles

HTTP/2 as the Transport Layer

Interceptors: The Middleware Layer for RPCs

Service Discovery and Load Balancing

Level 3 · Building the RPC Framework

Overall Architecture

Step 1: Request/Response Protocol and Codec

Step 2: Service Registry via Reflection

Step 3: Server Connection Handling

Step 4: Client with Async Support

Step 5: Interceptors and Load Balancer

Full Usage Example

Level 4 · Advanced Topics and Edge Cases

Health Check Protocol

gRPC-Gateway: REST Compatibility

Timeout Propagation

gRPC vs Thrift vs Dubbo

Protobuf Schema Evolution Rules

💬 Comments