Build an RPC Framework
Build an RPC Framework
In 1984, Birrell and Nelson published "Implementing Remote Procedure Calls." They proposed a deceptively simple idea: calling a function on a remote machine should be as simple as calling a local function. That idea gave birth to RPC (Remote Procedure Call), and to 40 years of core infrastructure for distributed systems.
Today, almost all internal service communication in large systems goes through RPC. Google's gRPC, Facebook's Thrift, Alibaba's Dubbo — they are all solving the same fundamental problem: how to wrap network calls so transparently they feel like function calls.
In this chapter we build a complete RPC framework in Go from scratch, then progressively add production-grade features: interceptors, load balancing, health checking. This journey reveals the reasoning behind every core design decision in frameworks like gRPC.
Level 1 · What RPC Really Is
Hiding Network Calls
Without an RPC framework, calling a service on another machine looks like this:
// The pain of manual network communication
conn, _ := net.Dial("tcp", "10.0.0.1:8080")
request := serialize(AddRequest{A: 1, B: 2})
conn.Write(request)
buf := make([]byte, 1024)
conn.Read(buf)
response := deserialize(buf)
fmt.Println(response.Result)
Every service call requires manual handling: establishing a connection, serializing arguments, sending data, receiving the response, deserializing the result, handling errors. Repetitive, tedious, and error-prone.
An RPC framework hides all of this, making the call look like this:
// With an RPC framework
client := NewMathClient("10.0.0.1:8080")
result, err := client.Add(ctx, &AddRequest{A: 1, B: 2})
fmt.Println(result.Result)
This involves several core problems:
- Serialization: How do arguments become bytes? How do bytes become arguments?
- Transport: How are bytes sent over the network? What protocol?
- Service discovery: How does the client know where the server is?
- Error handling: How do network errors, timeouts, and server exceptions propagate back?
REST vs RPC vs GraphQL: The Essential Tradeoffs
These three API styles represent three different design philosophies, each with its own appropriate use cases:
REST (Representational State Transfer): resource-centric, using HTTP verbs (GET/POST/PUT/DELETE) to express operations and URL paths to express resource hierarchies. Strengths: broad generality (any HTTP client can use it), cacheability, human-readable. Weaknesses: complex operations ("transfer funds," "approve request") are hard to express in a resource model; protocol overhead is high (HTTP/1.1 headers repeat every request); versioning is complex.
RPC: action-centric, directly exposing service methods. Strengths: type safety (parameter types defined in schema), high performance (custom serialization, HTTP/2 multiplexing), ideal for internal service-to-service communication. Weaknesses: requires specific clients (ordinary browsers can't use it directly), cross-language support requires code generation, interface changes require version management.
GraphQL: data graph-centric, clients precisely declare which fields they need. Strengths: solves REST's over-fetching and under-fetching problems, especially suitable for frontend-driven scenarios. Weaknesses: server implementation is complex (N+1 query problem), effective caching is difficult, unsuitable for service-to-service communication.
Rule of thumb:
- External APIs (for browsers, third parties): REST or GraphQL
- Internal service-to-service communication (microservices): RPC (gRPC)
- Frontend-heavy products needing flexible data queries: GraphQL
Protobuf vs JSON: Deep Serialization Tradeoffs
The choice of serialization format has a massive impact on performance:
JSON: human-readable, broad cross-language support, easy to debug. But the text format means: the number 1234567890 requires 10 bytes, while binary format needs only 4 bytes (int32); field names are transmitted every time ("user_id": — those 9 bytes repeat in every record); parsing requires string comparison, 5-10x slower than binary parsing.
Protobuf: Google's open-source binary serialization format. Uses field numbers instead of field names (user_id is field 1 in the schema, taking only 1-2 bytes when serialized); uses variable-length integer encoding (Varint) to compress numeric values; can add or remove fields while remaining forward and backward compatible. In typical scenarios, Protobuf is 3-10x smaller than JSON and parses 5-20x faster.
The cost: binary is not human-readable; you need the .proto schema file to parse data; debugging requires specialized tools.
Level 2 · gRPC Architecture Principles
HTTP/2 as the Transport Layer
gRPC's choice of HTTP/2 as its transport layer was deliberate:
Multiplexing: In HTTP/1.1, a single connection can process only one request at a time (serial). In HTTP/2, one connection can process multiple requests simultaneously (streams), completely eliminating the Head-of-Line Blocking problem of HTTP/1.1. gRPC concurrent calls multiplex over the same TCP connection, saving connection setup overhead.
Header compression: HTTP/1.1 sends complete headers with every request. gRPC's metadata (method name, auth token, etc.) is HTTP/2 headers, compressed with HPACK — repeated headers consume almost no bandwidth.
Streaming transport: HTTP/2 natively supports bidirectional streams. gRPC's Server Streaming, Client Streaming, and Bidirectional Streaming all build on this.
Frames: HTTP/2 cuts data into frames. Each frame has a Stream ID; the receiver can reassemble frames from different streams. This is what enables a single connection to truly process multiple requests concurrently.
Interceptors: The Middleware Layer for RPCs
gRPC interceptors are one of the most important extension mechanisms in any RPC framework. They allow adding to all RPC calls — without modifying business code — logging, metrics collection, authentication, rate limiting, and distributed tracing.
Interceptors form a call chain (similar to HTTP middleware):
Request → Interceptor A → Interceptor B → Business Handler → Interceptor B → Interceptor A → Response
gRPC provides Server-side and Client-side interceptors, each in two variants: Unary and Stream.
Service Discovery and Load Balancing
gRPC natively supports service discovery through the Resolver interface, connecting to etcd, Consul, Kubernetes, and other registries:
Client → Resolver (resolve service name → address list) → Balancer (select one address) → Establish connection
Load balancing strategies:
- Round Robin: rotate through servers, simplest, suitable for stateless services
- Least Connection: select the instance with the fewest current connections, suitable when request processing times vary greatly
- Weighted Round Robin: rotate by weight, suitable for heterogeneous machines
- Consistent Hashing: suitable for session affinity scenarios
Level 3 · Building the RPC Framework
Overall Architecture
Our RPC framework has these layers:
Client Side: Server Side:
+---------------------+ +---------------------+
| Generated Stub | | Service Registry |
| (type-safe API) | | (reflect-based) |
+---------------------+ +---------------------+
| |
+---------------------+ +---------------------+
| Client Interceptor | | Server Interceptor |
| Chain | | Chain |
+---------------------+ +---------------------+
| |
+---------------------+ +---------------------+
| Codec (gob/json/ | | Codec (decode req, |
| protobuf) | | encode resp) |
+---------------------+ +---------------------+
| |
+---------------------+ +---------------------+
| TCP Transport |<------->| TCP Transport |
| + Connection Pool | | (goroutine/conn) |
+---------------------+ +---------------------+
Step 1: Request/Response Protocol and Codec
package codec
import (
"bufio"
"encoding/gob"
"encoding/json"
"fmt"
"io"
)
// Header holds metadata for each RPC call
type Header struct {
ServiceMethod string // "ServiceName.MethodName"
Seq uint64 // request sequence number (for matching async responses)
Error string // server-side error (body is ignored when non-empty)
}
// Codec encodes and decodes Header+Body
type Codec interface {
io.Closer
ReadHeader(*Header) error
ReadBody(interface{}) error
Write(*Header, interface{}) error
}
// GobCodec uses Go's standard gob encoding
type GobCodec struct {
conn io.ReadWriteCloser
buf *bufio.Writer
dec *gob.Decoder
enc *gob.Encoder
}
func NewGobCodec(conn io.ReadWriteCloser) Codec {
buf := bufio.NewWriter(conn)
return &GobCodec{
conn: conn,
buf: buf,
dec: gob.NewDecoder(conn),
enc: gob.NewEncoder(buf),
}
}
func (c *GobCodec) ReadHeader(h *Header) error {
return c.dec.Decode(h)
}
func (c *GobCodec) ReadBody(body interface{}) error {
return c.dec.Decode(body)
}
func (c *GobCodec) Write(h *Header, body interface{}) (err error) {
defer func() {
_ = c.buf.Flush()
if err != nil {
_ = c.Close()
}
}()
if err = c.enc.Encode(h); err != nil {
return fmt.Errorf("encode header: %w", err)
}
if err = c.enc.Encode(body); err != nil {
return fmt.Errorf("encode body: %w", err)
}
return nil
}
func (c *GobCodec) Close() error { return c.conn.Close() }
// JSONCodec uses JSON encoding (easier to debug)
type JSONCodec struct {
conn io.ReadWriteCloser
dec *json.Decoder
enc *json.Encoder
}
func NewJSONCodec(conn io.ReadWriteCloser) Codec {
return &JSONCodec{
conn: conn,
dec: json.NewDecoder(conn),
enc: json.NewEncoder(conn),
}
}
func (c *JSONCodec) ReadHeader(h *Header) error { return c.dec.Decode(h) }
func (c *JSONCodec) ReadBody(body interface{}) error { return c.dec.Decode(body) }
func (c *JSONCodec) Write(h *Header, body interface{}) error {
if err := c.enc.Encode(h); err != nil {
return err
}
return c.enc.Encode(body)
}
func (c *JSONCodec) Close() error { return c.conn.Close() }
type CodecType string
const (
GobType CodecType = "application/gob"
JSONType CodecType = "application/json"
)
type NewCodecFunc func(io.ReadWriteCloser) Codec
var NewCodecFuncMap = map[CodecType]NewCodecFunc{
GobType: NewGobCodec,
JSONType: NewJSONCodec,
}
Step 2: Service Registry via Reflection
package server
import (
"fmt"
"go/token"
"log"
"reflect"
"strings"
"sync"
)
// methodType describes a method available for remote invocation
type methodType struct {
method reflect.Method
ArgType reflect.Type
ReplyType reflect.Type
numCalls uint64
}
func (m *methodType) newArgv() reflect.Value {
if m.ArgType.Kind() == reflect.Ptr {
return reflect.New(m.ArgType.Elem())
}
return reflect.New(m.ArgType).Elem()
}
func (m *methodType) newReplyv() reflect.Value {
replyv := reflect.New(m.ReplyType.Elem())
switch m.ReplyType.Elem().Kind() {
case reflect.Map:
replyv.Elem().Set(reflect.MakeMap(m.ReplyType.Elem()))
case reflect.Slice:
replyv.Elem().Set(reflect.MakeSlice(m.ReplyType.Elem(), 0, 0))
}
return replyv
}
// service represents a registered service
type service struct {
name string
typ reflect.Type
rcvr reflect.Value
methods map[string]*methodType
}
func newService(rcvr interface{}) *service {
s := &service{}
s.rcvr = reflect.ValueOf(rcvr)
s.typ = reflect.TypeOf(rcvr)
s.name = reflect.Indirect(s.rcvr).Type().Name()
if !token.IsExported(s.name) {
log.Fatalf("rpc server: %s is not a valid service name", s.name)
}
s.registerMethods()
return s
}
func (s *service) registerMethods() {
s.methods = make(map[string]*methodType)
for i := 0; i < s.typ.NumMethod(); i++ {
method := s.typ.Method(i)
mType := method.Type
// Valid RPC method signature: func (t *T) Method(args ArgType, reply *ReplyType) error
if mType.NumIn() != 3 || mType.NumOut() != 1 {
continue
}
if mType.Out(0) != reflect.TypeOf((*error)(nil)).Elem() {
continue
}
argType, replyType := mType.In(1), mType.In(2)
if !isExportedOrBuiltinType(argType) || !isExportedOrBuiltinType(replyType) {
continue
}
s.methods[method.Name] = &methodType{
method: method,
ArgType: argType,
ReplyType: replyType,
}
log.Printf("rpc server: register %s.%s\n", s.name, method.Name)
}
}
func (s *service) call(m *methodType, argv, replyv reflect.Value) error {
f := m.method.Func
returnValues := f.Call([]reflect.Value{s.rcvr, argv, replyv})
if errInter := returnValues[0].Interface(); errInter != nil {
return errInter.(error)
}
return nil
}
func isExportedOrBuiltinType(t reflect.Type) bool {
return token.IsExported(t.Name()) || t.PkgPath() == ""
}
// Server is the RPC server
type Server struct {
serviceMap sync.Map
}
func NewServer() *Server { return &Server{} }
func (s *Server) Register(rcvr interface{}) error {
svc := newService(rcvr)
if _, dup := s.serviceMap.LoadOrStore(svc.name, svc); dup {
return fmt.Errorf("rpc: service already defined: %s", svc.name)
}
return nil
}
func (s *Server) findService(serviceMethod string) (svc *service, mtype *methodType, err error) {
dot := strings.LastIndex(serviceMethod, ".")
if dot < 0 {
err = fmt.Errorf("rpc server: malformed service method: %s", serviceMethod)
return
}
serviceName, methodName := serviceMethod[:dot], serviceMethod[dot+1:]
svci, ok := s.serviceMap.Load(serviceName)
if !ok {
err = fmt.Errorf("rpc server: can't find service %s", serviceName)
return
}
svc = svci.(*service)
mtype, ok = svc.methods[methodName]
if !ok {
err = fmt.Errorf("rpc server: can't find method %s", methodName)
}
return
}
Step 3: Server Connection Handling
package server
import (
"encoding/json"
"fmt"
"io"
"log"
"net"
"reflect"
"strings"
"sync"
"time"
"github.com/yourname/minirpc/codec"
)
// Option is the negotiation payload (client sends, server reads)
type Option struct {
MagicNumber int
CodecType codec.CodecType
ConnTimeout time.Duration
HandleTimeout time.Duration
}
const MagicNumber = 0x3bef5c
var DefaultOption = &Option{
MagicNumber: MagicNumber,
CodecType: codec.GobType,
ConnTimeout: time.Second * 10,
}
func (s *Server) ServeConn(conn io.ReadWriteCloser) {
defer conn.Close()
var opt Option
if err := json.NewDecoder(conn).Decode(&opt); err != nil {
log.Printf("rpc server: decode option error: %v", err)
return
}
if opt.MagicNumber != MagicNumber {
log.Printf("rpc server: invalid magic number %x", opt.MagicNumber)
return
}
newCodec := codec.NewCodecFuncMap[opt.CodecType]
if newCodec == nil {
log.Printf("rpc server: invalid codec type %s", opt.CodecType)
return
}
s.serveCodec(newCodec(conn), &opt)
}
type request struct {
h *codec.Header
argv reflect.Value
replyv reflect.Value
mtype *methodType
svc *service
}
func (s *Server) serveCodec(cc codec.Codec, opt *Option) {
sending := new(sync.Mutex)
wg := new(sync.WaitGroup)
for {
req, err := s.readRequest(cc)
if err != nil {
if req == nil {
break
}
req.h.Error = err.Error()
s.sendResponse(cc, req.h, struct{}{}, sending)
continue
}
wg.Add(1)
go s.handleRequest(cc, req, sending, wg, opt.HandleTimeout)
}
wg.Wait()
_ = cc.Close()
}
func (s *Server) readRequest(cc codec.Codec) (*request, error) {
var h codec.Header
if err := cc.ReadHeader(&h); err != nil {
if err != io.EOF && !strings.HasSuffix(err.Error(), "EOF") {
log.Printf("rpc server: read header error: %v", err)
}
return nil, err
}
req := &request{h: &h}
var err error
req.svc, req.mtype, err = s.findService(h.ServiceMethod)
if err != nil {
_ = cc.ReadBody(nil)
return req, err
}
req.argv = req.mtype.newArgv()
req.replyv = req.mtype.newReplyv()
argvi := req.argv.Interface()
if req.argv.Type().Kind() != reflect.Ptr {
argvi = req.argv.Addr().Interface()
}
if err = cc.ReadBody(argvi); err != nil {
log.Printf("rpc server: read body err: %v", err)
return req, err
}
return req, nil
}
func (s *Server) handleRequest(cc codec.Codec, req *request, sending *sync.Mutex, wg *sync.WaitGroup, timeout time.Duration) {
defer wg.Done()
called := make(chan struct{})
sent := make(chan struct{})
go func() {
err := req.svc.call(req.mtype, req.argv, req.replyv)
called <- struct{}{}
if err != nil {
req.h.Error = err.Error()
s.sendResponse(cc, req.h, struct{}{}, sending)
} else {
s.sendResponse(cc, req.h, req.replyv.Interface(), sending)
}
sent <- struct{}{}
}()
if timeout == 0 {
<-called
<-sent
return
}
select {
case <-time.After(timeout):
req.h.Error = fmt.Sprintf("rpc server: handle timeout within %s", timeout)
s.sendResponse(cc, req.h, struct{}{}, sending)
case <-called:
<-sent
}
}
func (s *Server) sendResponse(cc codec.Codec, h *codec.Header, body interface{}, sending *sync.Mutex) {
sending.Lock()
defer sending.Unlock()
if err := cc.Write(h, body); err != nil {
log.Printf("rpc server: write response error: %v", err)
}
}
func (s *Server) Accept(lis net.Listener) {
for {
conn, err := lis.Accept()
if err != nil {
log.Printf("rpc server: accept error: %v", err)
return
}
go s.ServeConn(conn)
}
}
var DefaultServer = NewServer()
func Register(rcvr interface{}) error { return DefaultServer.Register(rcvr) }
func Accept(lis net.Listener) { DefaultServer.Accept(lis) }
Step 4: Client with Async Support
package client
import (
"context"
"encoding/json"
"fmt"
"io"
"log"
"net"
"sync"
"time"
"github.com/yourname/minirpc/codec"
"github.com/yourname/minirpc/server"
)
// Call represents an active or completed RPC
type Call struct {
Seq uint64
ServiceMethod string
Args interface{}
Reply interface{}
Error error
Done chan *Call
}
func (c *Call) done() {
c.Done <- c
}
// Client is the RPC client
type Client struct {
cc codec.Codec
opt *server.Option
sending sync.Mutex
header codec.Header
mu sync.Mutex
seq uint64
pending map[uint64]*Call
closing bool
shutdown bool
}
func NewClient(conn net.Conn, opt *server.Option) (*Client, error) {
newCodecFunc := codec.NewCodecFuncMap[opt.CodecType]
if newCodecFunc == nil {
return nil, fmt.Errorf("invalid codec type %s", opt.CodecType)
}
if err := json.NewEncoder(conn).Encode(opt); err != nil {
return nil, fmt.Errorf("rpc client: encode option: %w", err)
}
c := &Client{
seq: 1,
cc: newCodecFunc(conn),
opt: opt,
pending: make(map[uint64]*Call),
}
go c.receive()
return c, nil
}
func Dial(network, address string, opts ...*server.Option) (*Client, error) {
opt := server.DefaultOption
if len(opts) > 0 && opts[0] != nil {
opt = opts[0]
}
conn, err := net.DialTimeout(network, address, opt.ConnTimeout)
if err != nil {
return nil, err
}
c, err := NewClient(conn, opt)
if err != nil {
conn.Close()
}
return c, err
}
func (c *Client) Close() error {
c.mu.Lock()
defer c.mu.Unlock()
if c.closing {
return fmt.Errorf("connection already closed")
}
c.closing = true
return c.cc.Close()
}
func (c *Client) IsAvailable() bool {
c.mu.Lock()
defer c.mu.Unlock()
return !c.shutdown && !c.closing
}
func (c *Client) registerCall(call *Call) (uint64, error) {
c.mu.Lock()
defer c.mu.Unlock()
if c.closing || c.shutdown {
return 0, fmt.Errorf("rpc client: shutting down")
}
call.Seq = c.seq
c.pending[call.Seq] = call
c.seq++
return call.Seq, nil
}
func (c *Client) removeCall(seq uint64) *Call {
c.mu.Lock()
defer c.mu.Unlock()
call := c.pending[seq]
delete(c.pending, seq)
return call
}
func (c *Client) terminateCalls(err error) {
c.sending.Lock()
defer c.sending.Unlock()
c.mu.Lock()
defer c.mu.Unlock()
c.shutdown = true
for _, call := range c.pending {
call.Error = err
call.done()
}
}
func (c *Client) receive() {
var err error
for err == nil {
var h codec.Header
if err = c.cc.ReadHeader(&h); err != nil {
break
}
call := c.removeCall(h.Seq)
switch {
case call == nil:
err = c.cc.ReadBody(nil)
case h.Error != "":
call.Error = fmt.Errorf(h.Error)
_ = c.cc.ReadBody(nil)
call.done()
default:
err = c.cc.ReadBody(call.Reply)
if err != nil {
call.Error = fmt.Errorf("reading body: %w", err)
}
call.done()
}
}
c.terminateCalls(err)
}
func (c *Client) send(call *Call) {
c.sending.Lock()
defer c.sending.Unlock()
seq, err := c.registerCall(call)
if err != nil {
call.Error = err
call.done()
return
}
c.header.ServiceMethod = call.ServiceMethod
c.header.Seq = seq
c.header.Error = ""
if err := c.cc.Write(&c.header, call.Args); err != nil {
if call := c.removeCall(seq); call != nil {
call.Error = err
call.done()
}
}
}
// Go performs an asynchronous call, returns immediately
func (c *Client) Go(serviceMethod string, args, reply interface{}, done chan *Call) *Call {
if done == nil {
done = make(chan *Call, 1)
}
call := &Call{
ServiceMethod: serviceMethod,
Args: args,
Reply: reply,
Done: done,
}
c.send(call)
return call
}
// Call performs a synchronous call, blocking until complete or timeout
func (c *Client) Call(ctx context.Context, serviceMethod string, args, reply interface{}) error {
call := c.Go(serviceMethod, args, reply, make(chan *Call, 1))
select {
case <-ctx.Done():
c.removeCall(call.Seq)
return fmt.Errorf("rpc client: call failed: %s", ctx.Err())
case call := <-call.Done:
return call.Error
}
}
Step 5: Interceptors and Load Balancer
package middleware
import (
"context"
"fmt"
"log"
"time"
)
type UnaryInterceptor func(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error)
type UnaryServerInfo struct {
Server interface{}
FullMethod string
}
type UnaryHandler func(ctx context.Context, req interface{}) (interface{}, error)
// ChainUnaryInterceptors links multiple interceptors into one
func ChainUnaryInterceptors(interceptors ...UnaryInterceptor) UnaryInterceptor {
return func(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error) {
chained := handler
for i := len(interceptors) - 1; i >= 0; i-- {
interceptor := interceptors[i]
next := chained
chained = func(ctx context.Context, req interface{}) (interface{}, error) {
return interceptor(ctx, req, info, next)
}
}
return chained(ctx, req)
}
}
// LoggingInterceptor logs each RPC call with timing
func LoggingInterceptor(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (interface{}, error) {
start := time.Now()
log.Printf("RPC call: %s req=%v", info.FullMethod, req)
resp, err := handler(ctx, req)
elapsed := time.Since(start)
if err != nil {
log.Printf("RPC error: %s err=%v elapsed=%v", info.FullMethod, err, elapsed)
} else {
log.Printf("RPC ok: %s elapsed=%v", info.FullMethod, elapsed)
}
return resp, err
}
// RecoveryInterceptor recovers from panics in handler
func RecoveryInterceptor(ctx context.Context, req interface{}, info *UnaryServerInfo, handler UnaryHandler) (resp interface{}, err error) {
defer func() {
if r := recover(); r != nil {
log.Printf("RPC panic: %s panic=%v", info.FullMethod, r)
err = fmt.Errorf("internal server error")
}
}()
return handler(ctx, req)
}
// ===================== Load Balancer =====================
package balancer
import (
"fmt"
"math/rand"
"sync"
"sync/atomic"
)
type SelectMode int
const (
RandomSelect SelectMode = iota
RoundRobinSelect
)
type Discovery interface {
Refresh() error
Update(servers []string) error
Get(mode SelectMode) (string, error)
GetAll() ([]string, error)
}
// MultiServerDiscovery is a manually maintained server list (no registry dependency)
type MultiServerDiscovery struct {
r *rand.Rand
mu sync.RWMutex
servers []string
index uint64
}
func NewMultiServerDiscovery(servers []string) *MultiServerDiscovery {
return &MultiServerDiscovery{
r: rand.New(rand.NewSource(rand.Int63())),
servers: servers,
}
}
func (d *MultiServerDiscovery) Refresh() error { return nil }
func (d *MultiServerDiscovery) Update(servers []string) error {
d.mu.Lock()
defer d.mu.Unlock()
d.servers = servers
return nil
}
func (d *MultiServerDiscovery) Get(mode SelectMode) (string, error) {
d.mu.RLock()
defer d.mu.RUnlock()
n := len(d.servers)
if n == 0 {
return "", fmt.Errorf("rpc discovery: no available servers")
}
switch mode {
case RandomSelect:
return d.servers[d.r.Intn(n)], nil
case RoundRobinSelect:
idx := atomic.AddUint64(&d.index, 1) - 1
return d.servers[idx%uint64(n)], nil
default:
return "", fmt.Errorf("unsupported select mode")
}
}
func (d *MultiServerDiscovery) GetAll() ([]string, error) {
d.mu.RLock()
defer d.mu.RUnlock()
return d.servers, nil
}
Full Usage Example
package main
import (
"context"
"fmt"
"log"
"net"
"sync"
"time"
"github.com/yourname/minirpc/client"
"github.com/yourname/minirpc/server"
)
type MathService struct{}
type Args struct{ A, B int }
type Reply struct{ C int }
func (m *MathService) Add(args Args, reply *Reply) error {
reply.C = args.A + args.B
return nil
}
func (m *MathService) Multiply(args Args, reply *Reply) error {
reply.C = args.A * args.B
return nil
}
func startServer(wg *sync.WaitGroup) {
if err := server.Register(&MathService{}); err != nil {
log.Fatal("register:", err)
}
lis, err := net.Listen("tcp", ":9999")
if err != nil {
log.Fatal("listen:", err)
}
log.Println("server on :9999")
wg.Done()
server.Accept(lis)
}
func main() {
var wg sync.WaitGroup
wg.Add(1)
go startServer(&wg)
wg.Wait()
time.Sleep(time.Second)
c, err := client.Dial("tcp", ":9999")
if err != nil {
log.Fatal("dial:", err)
}
defer c.Close()
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
var reply Reply
if err := c.Call(ctx, "MathService.Add", Args{1, 2}, &reply); err != nil {
log.Fatal(err)
}
fmt.Printf("1 + 2 = %d\n", reply.C)
if err := c.Call(ctx, "MathService.Multiply", Args{6, 7}, &reply); err != nil {
log.Fatal(err)
}
fmt.Printf("6 * 7 = %d\n", reply.C)
}
Level 4 · Advanced Topics and Edge Cases
Health Check Protocol
Production RPC frameworks must be able to detect whether backend instances are healthy. gRPC defines a standard health checking protocol. Services implement the grpc.health.v1.Health service:
// Periodic liveness probe
func healthCheck(addr string, interval time.Duration, unhealthy chan<- string) {
for {
conn, err := net.DialTimeout("tcp", addr, 2*time.Second)
if err != nil {
unhealthy <- addr
} else {
conn.Close()
}
time.Sleep(interval)
}
}
gRPC-Gateway: REST Compatibility
Internal services use gRPC, but external APIs need REST. grpc-gateway solves this: it reads HTTP annotations in Protobuf files and auto-generates an HTTP proxy layer that converts REST requests into gRPC calls:
service MathService {
rpc Add(AddRequest) returns (AddResponse) {
option (google.api.http) = {
post: "/v1/math/add"
body: "*"
};
}
}
With this annotation, the gateway accepts POST /v1/math/add with a JSON body and forwards it as a gRPC call to the backend. Zero business code changes required.
Timeout Propagation
In a microservice call chain, timeouts must propagate downstream. If the chain is A → B → C and A sets a 1-second timeout, B must know how much time remains when calling C — otherwise C may be executing long after A has already timed out.
gRPC propagates deadlines through context.Context:
// A initiates the call with a deadline
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
// context.Deadline() is serialized into the request metadata by gRPC
resp, err := clientB.Call(ctx, ...)
// When B receives the request, the context already carries the deadline
// B passes the same ctx to C
respC, err := clientC.Call(ctx, ...) // ctx.Deadline() is still A's original deadline
This means if A's 1 second is almost up, B's call to C will automatically get a very short timeout, preventing cascading latency buildup.
gRPC vs Thrift vs Dubbo
gRPC: from Google, HTTP/2 + Protobuf, broadest cross-language support (nearly every language has official support), strong streaming RPC support, deep CNCF ecosystem integration (Kubernetes, Istio, and others support it natively).
Thrift: from Facebook, custom binary protocol, older than gRPC, similar performance, but smaller ecosystem, limited streaming support. Used internally at Facebook; HBase's client protocol also uses Thrift.
Dubbo: from Alibaba, focused on the Java ecosystem, more complete service governance features (rate limiting, circuit breaking, gray deployment built in), suitable for Java microservice architectures.
Rough performance comparison (QPS, highly workload-dependent):
| Framework | Serialization | Transport | Relative Performance |
|---|---|---|---|
| gRPC | Protobuf | HTTP/2 | baseline |
| Thrift Binary | Thrift | TCP | ~110% |
| JSON over HTTP/1.1 | JSON | HTTP/1.1 | ~30-40% |
| gob over TCP | gob | TCP | ~90% |
Protobuf Schema Evolution Rules
These rules govern backward-compatible schema changes — the key to safe rolling upgrades:
- Can add new fields: old code ignores unknown field numbers
- Cannot change a field's number: field numbers are the wire identity
- Cannot reuse a deleted field's number: mark it
reservedinstead of removing - Changing field types requires caution: int32 → int64 is wire-compatible; int32 → string is not
message User {
uint64 id = 1;
string name = 2;
reserved 3; // field 3 was deleted, number is reserved
reserved "email"; // field name "email" is reserved
int32 age = 4; // newly added field
}
These rules guarantee that rolling upgrades — where old and new versions of a service run simultaneously — don't cause communication failures due to schema mismatches. This is one of the features that makes Protobuf-based RPC genuinely viable at the scale of large organizations, not just a performance optimization but a correctness guarantee.
Understanding RPC frameworks at this depth — wire protocols, reflection-based dispatch, interceptor chains, timeout propagation — makes you a far more effective user of gRPC in production. You stop treating it as a black box and start understanding exactly which lever to pull when something goes wrong.