RPC & gRPC

When two services running on different machines need to communicate, they can exchange JSON over plain HTTP, or they can use a higher-level abstraction called Remote Procedure Call (RPC). RPC makes calling a function on a remote server look much like calling a local function in your code. The most widely used modern implementation is gRPC, an open-source framework that Google released in 2015 as the successor to its internal RPC system, Stubby. It underpins the Google Cloud API surface as well as countless external microservice architectures.

Core idea: REST asks you to think in terms of resources and HTTP verbs. RPC asks you to think in terms of actions — you define a service interface as a set of callable methods, and the framework handles serialization, transport, and deserialization for you.

From Local Calls to Remote Calls

In a monolith, one module calls another directly in memory. The call is near-instant, strongly typed by the language, and the compiler catches mistakes. When you split that monolith into microservices, the call crosses a network boundary. Suddenly you face: latency (milliseconds instead of nanoseconds), partial failure (the callee may be down), and the need for a serialization format both sides understand.

Early RPC systems of the 1980s and 1990s (Sun RPC, CORBA, DCOM) tried to hide all of this behind transparent stubs: they generated code that made remote calls look local. The problem was that they over-hid the network. Developers wrote code as if the network did not exist, then were surprised by timeouts and retries. Modern RPC frameworks are more honest: they still generate stub code, but they surface network concerns explicitly through error codes, deadlines, and streaming APIs.
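
To make "surfacing network concerns" concrete, here is a minimal sketch in plain Python, not real gRPC code. The names `call_with_deadline`, `RpcError`, and `simulated_latency_s` are hypothetical; only the status code names and numbers follow gRPC's status table. The caller has to supply a deadline and has to handle a typed failure, which is exactly what the older transparent stubs let you forget.

```python
import enum


class StatusCode(enum.Enum):
    # Names and numeric values follow the gRPC status code table.
    OK = 0
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    UNAVAILABLE = 14


class RpcError(Exception):
    """Hypothetical error type raised when a call does not finish with OK."""

    def __init__(self, code, detail):
        super().__init__(f"{code.name}: {detail}")
        self.code = code


USERS = {42: "alice"}


def get_user(user_id):
    # Stand-in for the remote implementation; raises KeyError if absent.
    return USERS[user_id]


def call_with_deadline(handler, request, timeout_s, simulated_latency_s=0.0):
    """Invoke `handler` as if it were remote, honouring a caller-supplied deadline.

    `simulated_latency_s` is an assumption of this sketch: it models network
    plus server time without actually sleeping.
    """
    if simulated_latency_s > timeout_s:
        raise RpcError(StatusCode.DEADLINE_EXCEEDED, f"no reply within {timeout_s}s")
    try:
        return handler(request)
    except KeyError:
        raise RpcError(StatusCode.NOT_FOUND, f"no such entity: {request!r}") from None
    except ConnectionError as exc:
        raise RpcError(StatusCode.UNAVAILABLE, str(exc)) from None


print(call_with_deadline(get_user, 42, timeout_s=0.5))
try:
    call_with_deadline(get_user, 42, timeout_s=0.5, simulated_latency_s=2.0)
except RpcError as err:
    print("call failed with", err.code.name)
```

The design point is that the failure modes are part of the call's signature: a caller cannot invoke the method without choosing a timeout, and every failure arrives as a status code it can branch on.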

What gRPC Is

gRPC (officially a recursive acronym, "gRPC Remote Procedure Calls", rather than "Google Remote Procedure Call") is built on three foundations:

  • Protocol Buffers (protobuf) — a binary serialization format and Interface Definition Language (IDL). You describe your data structures and service methods in a .proto file; the protoc compiler generates client and server code in your language of choice (Go, Java, Python, C++, Node.js, Dart, and more).
  • HTTP/2 — the transport layer. HTTP/2 supports multiplexed streams (many requests over one TCP connection), header compression, and bidirectional streaming. gRPC exploits all of these.
  • Generated stubs — the compiler produces a client stub and a server skeleton. The client calls a method on the stub object; the stub serializes the arguments, sends them over HTTP/2, and deserializes the response. The server implements the skeleton interface.
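
The division of labour between stub and skeleton can be sketched in a few lines of Python. This is an illustration only: the classes below are hand-written stand-ins for what protoc would generate, `struct` packing stands in for protobuf, and a direct function call stands in for the HTTP/2 transport.

```python
import struct


class UserServiceSkeleton:
    """Server side: decodes the request bytes and dispatches to user code."""

    def __init__(self, handler):
        self._handler = handler

    def dispatch(self, method, payload):
        if method != "GetUser":
            raise ValueError(f"unknown method {method!r}")
        (user_id,) = struct.unpack(">q", payload)  # deserialize the request
        username = self._handler(user_id)          # call the implementation
        return username.encode("utf-8")            # serialize the response


class UserServiceStub:
    """Client side: looks like a local object, but only moves bytes around."""

    def __init__(self, transport):
        self._transport = transport  # callable(method_name, bytes) -> bytes

    def GetUser(self, user_id):
        payload = struct.pack(">q", user_id)         # serialize the argument
        reply = self._transport("GetUser", payload)  # "send" it to the server
        return reply.decode("utf-8")                 # deserialize the result


def lookup(user_id):
    # The only code the service author writes by hand in this sketch.
    return {42: "alice"}.get(user_id, "")


skeleton = UserServiceSkeleton(lookup)
stub = UserServiceStub(skeleton.dispatch)  # in-process stand-in for the network
print(stub.GetUser(42))  # alice
```

Swapping `skeleton.dispatch` for a function that writes to a socket would not change the caller's code, which is the property generated stubs are meant to provide.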

Protocol Buffers: The Serialization Engine

Protocol Buffers are the key to gRPC's performance advantage over REST+JSON. A protobuf message is encoded as a compact binary stream using field numbers rather than field names. Consider a simple user lookup:

    // user.proto
    syntax = "proto3";

    service UserService {
      rpc GetUser (GetUserRequest) returns (UserResponse);
      rpc ListUsers (ListUsersRequest) returns (stream UserResponse);
    }

    message GetUserRequest {
      int64 user_id = 1;
    }

    // Placeholder so the file compiles; the original listing does not define this message.
    message ListUsersRequest {}

    message UserResponse {
      int64 id = 1;
      string username = 2;
      string email = 3;
      int32 age = 4;
    }

The same data that takes 65 bytes as compact JSON ({"id":42,"username":"alice","email":"alice@example.com","age":30}) takes 30 bytes as protobuf, roughly half. Messages dominated by numbers and short values save more, because JSON repeats every field name as text while protobuf sends a one-byte tag for field numbers 1 through 15. At millions of calls per second, this matters for both latency and bandwidth cost. Protobuf also enforces a schema: both producer and consumer must agree on field types, which eliminates a whole class of runtime errors common in JSON APIs.
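
The size difference for the example message can be checked by hand-encoding it. The sketch below implements only the two protobuf wire types this message needs (varint and length-delimited) and assumes non-negative integers; it is not a general protobuf encoder, and the helper names are made up for illustration.

```python
import json


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number, value):
    # Wire type 0 (varint): tag = (field_number << 3) | 0
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_string_field(field_number, text):
    # Wire type 2 (length-delimited): tag, then byte length, then UTF-8 data
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


user = {"id": 42, "username": "alice", "email": "alice@example.com", "age": 30}

proto_bytes = (
    encode_int_field(1, user["id"])
    + encode_string_field(2, user["username"])
    + encode_string_field(3, user["email"])
    + encode_int_field(4, user["age"])
)
json_bytes = json.dumps(user, separators=(",", ":")).encode("utf-8")

print(len(json_bytes), len(proto_bytes))  # 65 30
```

Most of the 30 protobuf bytes are the two strings themselves; each field costs only one tag byte, plus one length byte for strings.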

The Four Communication Modes

gRPC supports four interaction patterns, all defined in the same .proto service definition:

  • Unary RPC — one request, one response. Equivalent to a classic HTTP request/response. Most common. (rpc GetUser(...) returns (...))
  • Server streaming — one request, a stream of responses. Useful for large result sets or live feeds. (rpc ListUsers(...) returns (stream ...))
  • Client streaming — a stream of requests, one response. Useful for bulk uploads or aggregation. (rpc UploadChunks(stream ...) returns (...))
  • Bidirectional streaming — both sides stream concurrently. Useful for real-time collaboration, chat, or sensor telemetry. (rpc Chat(stream ...) returns (stream ...))

Figure: The four gRPC communication modes, all running over Protocol Buffers serialization on HTTP/2. Panels: 1. Unary (1 req → 1 resp, e.g. GetUser); 2. Server stream (1 req → many resp, e.g. ListUsers); 3. Client stream (many req → 1 resp, e.g. UploadChunks); 4. Bidirectional (concurrent streams, e.g. Chat, telemetry).
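
The shape of each mode maps naturally onto plain values and iterators. The following sketch uses ordinary Python functions with no gRPC runtime; the `prefix` parameter and the in-memory `USERS` table are assumptions made for illustration. Python gRPC handlers follow a broadly similar pattern, receiving an iterator for streamed requests and yielding streamed responses.

```python
from typing import Iterable, Iterator

USERS = {1: "alice", 2: "bob", 3: "carol"}


def get_user(user_id: int) -> str:
    """Unary: one request value in, one response value out."""
    return USERS[user_id]


def list_users(prefix: str) -> Iterator[str]:
    """Server streaming: one request in, responses yielded one at a time."""
    for name in USERS.values():
        if name.startswith(prefix):
            yield name


def upload_chunks(chunks: Iterable[bytes]) -> int:
    """Client streaming: consume a stream of requests, return one summary."""
    return sum(len(chunk) for chunk in chunks)


def chat(messages: Iterable[str]) -> Iterator[str]:
    """Bidirectional: replies are produced while requests are still arriving."""
    for message in messages:
        yield f"echo: {message}"


print(get_user(1))
print(list(list_users("")))
print(upload_chunks([b"ab", b"cde"]))
print(list(chat(iter(["hi", "bye"]))))
```

Because `chat` is a generator over an iterator, each reply is produced as soon as its request is read. That is the in-process analogue of both sides streaming at once.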

gRPC vs REST: A Real-World Comparison

The comparison is not "which is better" but "which fits the context". Here is how they differ on the dimensions that matter most in system design:

  • Performance: binary serialization and HTTP/2 multiplexing usually give gRPC higher throughput and lower latency than REST+JSON on the same hardware. The size of the gap depends heavily on payload shape, language, and library, so quoted multipliers (5–10× is a common figure) should be treated as workload-specific rather than guaranteed; measure with your own messages. JSON is typically slower to parse because the parser has to scan text and convert numbers from strings.
  • Schema and contracts: gRPC's .proto files are the contract, and both client and server code are generated from the same source. REST APIs rely on OpenAPI/Swagger for documentation, but that does not prevent a mismatch between the spec and the actual implementation. With protobuf-generated code in a statically typed language, a type mismatch is a compile error, not a runtime surprise at 2 AM.
  • Browser support: REST works natively in every browser. gRPC does not — HTTP/2 trailers (used for gRPC status codes) are not accessible via the browser Fetch API. You need gRPC-Web (a translation proxy) or gRPC-Gateway (a generated REST transcoding layer) to expose gRPC services to browsers.
  • Human readability: A REST JSON call is trivially readable with curl. A gRPC call is a binary stream — you need grpcurl or a dedicated tool. This slows down debugging and third-party integration for public APIs.
  • Ecosystem and tooling: REST has decades of tooling — API gateways, load balancers, caches, and security products all speak HTTP/1.1+JSON natively. gRPC is increasingly well supported but still requires more deliberate infrastructure choices.
Design rule of thumb: Use gRPC for internal service-to-service communication where you control both ends, performance matters, and schema evolution is important. Use REST (or GraphQL) for external-facing APIs consumed by browsers, mobile apps, or third-party developers where accessibility and human readability win.
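
One job of the REST transcoding layer mentioned under browser support is translating gRPC status codes into HTTP status codes. The table below follows the commonly documented gRPC-to-HTTP mapping; the `to_http_response` helper and its fallback to 500 are hypothetical, a sketch of the idea rather than gRPC-Gateway's actual code.

```python
import json

# gRPC status name -> HTTP status code, per the commonly documented mapping.
GRPC_TO_HTTP = {
    "OK": 200,
    "INVALID_ARGUMENT": 400,
    "UNAUTHENTICATED": 401,
    "PERMISSION_DENIED": 403,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "RESOURCE_EXHAUSTED": 429,
    "INTERNAL": 500,
    "UNIMPLEMENTED": 501,
    "UNAVAILABLE": 503,
    "DEADLINE_EXCEEDED": 504,
}


def to_http_response(grpc_status, message):
    """Turn a gRPC result into an (http_status, json_body) pair.

    `message` is the response already converted to a dict; codes missing
    from the table fall back to 500 in this sketch.
    """
    http_status = GRPC_TO_HTTP.get(grpc_status, 500)
    if grpc_status == "OK":
        body = message
    else:
        body = {"error": grpc_status, "message": message.get("detail", "")}
    return http_status, json.dumps(body)


print(to_http_response("OK", {"id": 42, "username": "alice"}))
print(to_http_response("NOT_FOUND", {"detail": "user 7 does not exist"}))
```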

How gRPC Works: The Request Lifecycle

Walking through a single unary call helps solidify the mechanics:

  1. The client code calls a method on the generated stub: stub.GetUser(request).
  2. The stub serializes the GetUserRequest protobuf message to a binary byte array.
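  3. The gRPC runtime opens (or reuses) an HTTP/2 connection to the server, commonly on port 443 for TLS endpoints or on 50051, the port used in many gRPC examples, for internal services. Each RPC call gets its own HTTP/2 stream ID, so multiple in-flight calls share one TCP connection without blocking each other at the HTTP layer. A lost TCP packet still stalls every stream on that connection until it is retransmitted.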
  4. The server receives the bytes, the generated skeleton deserializes them into a GetUserRequest object, and calls your handler.
  5. Your handler builds a UserResponse, the skeleton serializes it, and sends it back on the same HTTP/2 stream.
  6. The stream is closed with an HTTP/2 trailer carrying the gRPC status code (OK, NOT_FOUND, UNAVAILABLE, etc.).

Figure: The lifecycle of a single gRPC unary call, from application code through the client stub and HTTP/2 transport to the server handler and back. Sequence: stub.GetUser(req) → serialize + open stream → binary payload (HTTP/2 frame) → deserialize + handle → binary response → deserialize response → UserResponse object + status OK.
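
What travels on the HTTP/2 stream in the steps above can be sketched without any networking. In the gRPC-over-HTTP/2 protocol, each message is prefixed with a 1-byte compression flag and a 4-byte big-endian length, and the final status arrives in a `grpc-status` trailer. The helper names below are hypothetical, and the dictionaries only illustrate the headers involved; a real runtime writes actual HTTP/2 frames.

```python
import struct


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with gRPC's 5-byte header."""
    return struct.pack(">BI", int(compressed), len(payload)) + payload


def unframe_message(data: bytes) -> bytes:
    """Strip the 5-byte header and return the serialized message."""
    compressed, length = struct.unpack(">BI", data[:5])
    if compressed:
        raise NotImplementedError("compression is not handled in this sketch")
    body = data[5:5 + length]
    if len(body) != length:
        raise ValueError("truncated message")
    return body


# Steps 1-3: request headers plus the framed GetUserRequest{user_id: 42}.
request_headers = {
    ":method": "POST",
    ":path": "/UserService/GetUser",  # "/<service>/<method>"; no package in user.proto
    "content-type": "application/grpc",
    "te": "trailers",
}
request_body = frame_message(b"\x08\x2a")  # field 1, varint 42

# Step 4: the server strips the frame before deserializing.
assert unframe_message(request_body) == b"\x08\x2a"

# Step 6: the stream ends with trailers; "0" is the numeric code for OK.
response_trailers = {"grpc-status": "0"}

print(request_body.hex())  # 0000000002082a
```

The length prefix is what lets streaming RPCs send many messages on one HTTP/2 stream: the receiver always knows where one message ends and the next begins.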

Schema Evolution: Adding Fields Safely

Real systems evolve. Protobuf handles schema changes gracefully thanks to its field-number system. The rules are:

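  • Adding a new field is safe as long as it uses a field number that has never been used before. Old readers skip field numbers they do not recognize, and new readers see the default value when an old writer omits the field.
  • Removing a field requires reserving its field number to prevent reuse: reserved 3;. If you reuse an old number for a different type, old and new code will misinterpret each other's bytes.
  • Renaming a field is safe on the wire, because the binary encoding carries only field numbers. It still changes the generated accessor names and the field's name in protobuf's JSON mapping, so source code and JSON consumers need updating.
  • Changing a field type is dangerous and usually breaks compatibility. Only a few changes are wire-compatible, such as switching among int32, int64, uint32, uint64, and bool.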
Schema versioning pitfall: Never delete a field number and reuse it for a different field type. Even after all services are updated, old clients that were cached or running in long-lived jobs will misread the bytes. Reserve the number forever.
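
Why unknown fields are harmless and names are irrelevant becomes clearer with a toy decoder. The sketch below handles only varint and length-delimited fields and treats every length-delimited value as a UTF-8 string; the `timezone` field with number 5 is a hypothetical addition, not part of the original user.proto.

```python
def read_varint(buf, pos):
    """Read a base-128 varint starting at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode(buf, known_fields):
    """Decode a message, keeping only field numbers listed in `known_fields`."""
    out = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known_fields:  # unknown numbers are parsed, then dropped
            out[known_fields[number]] = value
    return out


# Bytes from a newer server: id=42, username="alice", new field 5="UTC".
new_message = b"\x08\x2a" + b"\x12\x05alice" + b"\x2a\x03UTC"

OLD_SCHEMA = {1: "id", 2: "username"}
NEW_SCHEMA = {1: "id", 2: "username", 5: "timezone"}
RENAMED_SCHEMA = {1: "id", 2: "handle"}  # same numbers, different name

print(decode(new_message, OLD_SCHEMA))
print(decode(new_message, NEW_SCHEMA))
print(decode(new_message, RENAMED_SCHEMA))
```

The old schema reads the new bytes without error, and the renamed schema reads the same value under a different key. A reader that still mapped number 5 to a different meaning would accept the bytes just as readily, which is why retired numbers are reserved rather than recycled.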

Real-World Usage at Scale

Google uses a single monorepo .proto repository that defines all internal service contracts. When a team changes a .proto file, the CI system automatically regenerates and recompiles all dependent services, catching breakage at build time rather than in production. Netflix, Uber, Lyft, Square, and Dropbox all use gRPC heavily for internal service-to-service traffic. Kubernetes uses gRPC between its API server and etcd and for plugin interfaces such as the Container Runtime Interface (CRI). Envoy Proxy speaks gRPC natively and is the sidecar used by Istio; Linkerd ships its own proxy, which also handles gRPC traffic.

At Google scale (on the order of 10 billion internal RPCs per second), even a 10% reduction in serialization overhead can translate into thousands of servers removed from the fleet, which is one reason Google invested heavily in protobuf and HTTP/2.

When gRPC shines: polyglot microservices (one .proto generates stubs in every language your org uses), high-throughput internal APIs (order processing, recommendation engines, ad serving), and bidirectional streaming use cases (telemetry pipelines, collaborative editing, live game state sync).