RPC & gRPC
When two services running on different machines need to communicate, they can exchange JSON over plain HTTP, or they can use a higher-level abstraction called Remote Procedure Call (RPC). RPC makes calling a function on a remote server feel almost as natural as calling a local function in your code. The most widely used modern implementation of RPC is gRPC, an open-source framework from Google that grew out of its internal RPC system (Stubby). It backs much of the Google Cloud API surface as well as countless external microservice architectures.
From Local Calls to Remote Calls
In a monolith, one module calls another directly in memory. The call is near-instant, strongly typed by the language, and the compiler catches mistakes. When you split that monolith into microservices, the call crosses a network boundary. Suddenly you face: latency (milliseconds instead of nanoseconds), partial failure (the callee may be down), and the need for a serialization format both sides understand.
Early RPC systems from the 1980s and 1990s (Sun RPC, CORBA, DCOM) tried to hide all of this behind transparent stubs: they generated code that made remote calls look local. The problem was that they over-hid the network. Developers wrote code as if the network did not exist, then were surprised by timeouts and retries. Modern RPC frameworks are more honest: they still generate stub code, but they surface network concerns explicitly through error codes, deadlines, and streaming APIs.
What gRPC Is
gRPC (officially a recursive acronym for "gRPC Remote Procedure Calls", though often read as "Google RPC") is built on three foundations:
- Protocol Buffers (protobuf): a binary serialization format and Interface Definition Language (IDL). You describe your data structures and service methods in a .proto file; the protoc compiler generates client and server code in your language of choice (Go, Java, Python, C++, Node.js, Dart, and more).
- HTTP/2: the transport layer. HTTP/2 supports multiplexed streams (many requests over one TCP connection), header compression, and bidirectional streaming. gRPC exploits all of these.
- Generated stubs: the compiler produces a client stub and a server skeleton. The client calls a method on the stub object; the stub serializes the arguments, sends them over HTTP/2, and deserializes the response. The server implements the skeleton interface.
Protocol Buffers: The Serialization Engine
Protocol Buffers are the key to gRPC's performance advantage over REST+JSON. A protobuf message is encoded as a compact binary stream using field numbers rather than field names. Consider a simple user record with four fields: id, username, email, and age.
As compact JSON ({"id":42,"username":"alice","email":"alice@example.com","age":30}) the record costs 65 bytes. As protobuf it costs roughly 30 bytes, assuming small field numbers and integer id and age fields: about half the size, because field names are never sent. The saving grows when field names are long relative to the values. At millions of calls per second, this matters for both latency and bandwidth cost. Protobuf also enforces a schema: both producer and consumer must agree on field types, which eliminates a whole class of runtime errors common in JSON APIs.
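The size difference can be checked by hand. The following is a minimal sketch that encodes the user record in the protobuf wire format without the protobuf library. The field numbers 1 to 4 and the helper names are assumptions made for this illustration, not part of any real schema.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0 (varint): tag, then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2 (length-delimited): tag, byte length, UTF-8 payload."""
    payload = text.encode("utf-8")
    tag = encode_varint((field_number << 3) | 2)
    return tag + encode_varint(len(payload)) + payload


def encode_user(user: dict) -> bytes:
    # Field numbers 1-4 are assumed for this sketch.
    return (
        encode_int_field(1, user["id"])
        + encode_string_field(2, user["username"])
        + encode_string_field(3, user["email"])
        + encode_int_field(4, user["age"])
    )


user = {"id": 42, "username": "alice", "email": "alice@example.com", "age": 30}
json_bytes = json.dumps(user, separators=(",", ":")).encode("utf-8")
proto_bytes = encode_user(user)
print(len(json_bytes), len(proto_bytes))  # prints: 65 30
```

Each field costs one tag byte plus its value, while JSON repeats every field name in every message. That is where most of the saving comes from.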
The Four Communication Modes
gRPC supports four interaction patterns, all defined in the same .proto service definition:
- Unary RPC: one request, one response. Equivalent to a classic HTTP request/response. Most common. (rpc GetUser(...) returns (...))
- Server streaming: one request, a stream of responses. Useful for large result sets or live feeds. (rpc ListUsers(...) returns (stream ...))
- Client streaming: a stream of requests, one response. Useful for bulk uploads or aggregation. (rpc UploadChunks(stream ...) returns (...))
- Bidirectional streaming: both sides stream concurrently. Useful for real-time collaboration, chat, or sensor telemetry. (rpc Chat(stream ...) returns (stream ...))
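The shape of the four modes can be illustrated with plain Python functions and generators. This is only a sketch of the call shapes with no gRPC library involved. The function names mirror the rpc names in the list above, and the bodies are hypothetical.

```python
from typing import Iterable, Iterator


def get_user(user_id: int) -> dict:
    """Unary: one request in, one response out."""
    return {"id": user_id, "username": f"user{user_id}"}


def list_users(limit: int) -> Iterator[dict]:
    """Server streaming: one request in, responses yielded one at a time."""
    for user_id in range(1, limit + 1):
        yield get_user(user_id)


def upload_chunks(chunks: Iterable[bytes]) -> int:
    """Client streaming: consume a stream of requests, return one summary."""
    return sum(len(chunk) for chunk in chunks)


def chat(messages: Iterable[str]) -> Iterator[str]:
    """Bidirectional streaming: emit replies while requests keep arriving."""
    for message in messages:
        yield f"echo: {message}"


unary_result = get_user(7)
streamed_users = list(list_users(3))
total_bytes = upload_chunks(iter([b"abc", b"de"]))
replies = list(chat(iter(["hi", "bye"])))
print(unary_result, len(streamed_users), total_bytes, replies)
```

Generators are a natural fit here because a stream is consumed lazily: the receiver can act on the first item before the sender has produced the last one.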
gRPC vs REST: A Real-World Comparison
The comparison is not "which is better" but "which fits the context". Here is how they differ on the dimensions that matter most in system design:
- Performance: gRPC's binary serialization and HTTP/2 multiplexing usually give it a clear throughput and latency advantage over REST+JSON on equivalent hardware, although the size of the gain depends heavily on payload shape, language, and runtime. JSON is slower to parse because it is text that must be tokenized and converted on every call.
- Schema and contracts: gRPC's .proto files are the contract, and both client and server code are generated from the same source. REST APIs rely on OpenAPI/Swagger for documentation, but that does not prevent a mismatch between the spec and the actual implementation. With protobuf in a statically typed language, a type mismatch is a compile error, not a runtime surprise at 2 AM.
- Browser support: REST works natively in every browser. gRPC does not, because HTTP/2 trailers (used for gRPC status codes) are not accessible via the browser Fetch API. You need gRPC-Web (a browser-compatible variant of the protocol, typically served through a translating proxy) or gRPC-Gateway (a generated REST transcoding layer) to expose gRPC services to browsers.
- Human readability: A REST JSON call is trivially readable with curl. A gRPC call is a binary stream, so you need grpcurl or a dedicated tool. This slows down debugging and third-party integration for public APIs.
- Ecosystem and tooling: REST has decades of tooling. API gateways, load balancers, caches, and security products all speak HTTP/1.1+JSON natively. gRPC is increasingly well supported but still requires more deliberate infrastructure choices.
How gRPC Works: The Request Lifecycle
Walking through a single unary call helps solidify the mechanics:
- The client code calls a method on the generated stub: stub.GetUser(request).
- The stub serializes the GetUserRequest protobuf message to a binary byte array.
- The gRPC runtime opens (or reuses) an HTTP/2 stream to the server, typically on port 443 (or, by convention, 50051 for internal services). Each RPC call gets its own stream ID, so multiple in-flight calls share one TCP connection without blocking each other at the HTTP level. Packet loss on the underlying TCP connection can still stall all streams.
- The server receives the bytes, the generated skeleton deserializes them into a GetUserRequest object, and calls your handler.
- Your handler builds a UserResponse, the skeleton serializes it, and sends it back on the same HTTP/2 stream.
- The stream is closed with an HTTP/2 trailer carrying the gRPC status code (OK, NOT_FOUND, UNAVAILABLE, etc.).
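The steps above can be sketched in a few dozen lines of Python. This is a simplified in-process model, not the real gRPC runtime: JSON stands in for protobuf serialization, a direct function call stands in for the HTTP/2 stream, and the UserStub class, USERS table, and handler are hypothetical. The 5-byte message prefix and the numeric status codes do follow the gRPC wire protocol.

```python
import json
import struct

# Numeric values of three gRPC status codes.
OK, NOT_FOUND, UNAVAILABLE = 0, 5, 14

USERS = {42: {"id": 42, "username": "alice"}}  # hypothetical data store


def frame(message: bytes) -> bytes:
    """gRPC framing: 1-byte compressed flag, 4-byte big-endian length, payload."""
    return struct.pack(">BI", 0, len(message)) + message


def unframe(data: bytes) -> bytes:
    compressed, length = struct.unpack(">BI", data[:5])
    if compressed:
        raise ValueError("compression is not handled in this sketch")
    return data[5:5 + length]


def serialize(obj: dict) -> bytes:
    # Stand-in for protobuf; JSON keeps the sketch dependency-free.
    return json.dumps(obj).encode("utf-8")


def deserialize(data: bytes) -> dict:
    return json.loads(data.decode("utf-8"))


def get_user_handler(request: dict):
    """Application handler: returns (response, status code)."""
    user = USERS.get(request["id"])
    if user is None:
        return {}, NOT_FOUND  # simplification: an empty body accompanies the error
    return user, OK


def server_skeleton(request_frame: bytes):
    """Deserialize, call the handler, serialize, attach the status trailer."""
    request = deserialize(unframe(request_frame))
    response, status = get_user_handler(request)
    trailers = {"grpc-status": str(status)}
    return frame(serialize(response)), trailers


class UserStub:
    """Client stub: hides serialization and transport behind a method call."""

    def __init__(self, transport):
        self._transport = transport  # callable: request frame -> (frame, trailers)

    def GetUser(self, request: dict):
        response_frame, trailers = self._transport(frame(serialize(request)))
        return deserialize(unframe(response_frame)), int(trailers["grpc-status"])


stub = UserStub(server_skeleton)  # the in-process call replaces the network hop
found, found_status = stub.GetUser({"id": 42})
missing, missing_status = stub.GetUser({"id": 7})
print(found, found_status, missing_status)
```

Because the status travels in the trailer rather than in the response body, the client can tell a successful empty result apart from a failure such as NOT_FOUND.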
Schema Evolution: Adding Fields Safely
Real systems evolve. Protobuf handles schema changes gracefully thanks to its field-number system. The rules are:
- Adding a new field is safe as long as it uses a fresh field number: old clients skip field numbers they do not recognize, and new code sees a default value when an old sender omits the field.
- Removing a field requires reserving its field number to prevent reuse: reserved 3;. If you reuse an old number for a different type, old and new code will misinterpret each other's bytes.
- Renaming a field is safe on the wire, because protobuf's binary format carries only field numbers, not names. Generated code and the JSON mapping do use names, so callers of those still need updating.
- Changing a field type is dangerous and usually breaks compatibility.
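A small decoder makes these rules concrete. The sketch below hand-rolls just enough of the protobuf wire format (varint and length-delimited fields) to show an old reader skipping a field it does not know and a renamed field decoding unchanged. The schemas, field numbers, and helper names are hypothetical, and real protobuf runtimes handle more wire types than this.

```python
def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data: bytes, pos: int):
    """Return (value, position after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_field(number: int, value) -> bytes:
    if isinstance(value, int):
        return encode_varint(number << 3) + encode_varint(value)  # wire type 0
    payload = value.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(payload)) + payload


def decode(data: bytes, schema: dict) -> dict:
    """Decode fields listed in schema {number: name}; skip all others."""
    message = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = decode_varint(data, pos)
        elif wire_type == 2:
            length, pos = decode_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in schema:  # unknown field numbers are parsed past, then dropped
            message[schema[number]] = value
    return message


OLD_SCHEMA = {1: "id", 2: "username"}
NEW_SCHEMA = {1: "id", 2: "login", 5: "country"}  # field 2 renamed, field 5 added

new_bytes = encode_field(1, 42) + encode_field(2, "alice") + encode_field(5, "NZ")
old_bytes = encode_field(1, 42) + encode_field(2, "alice")

old_reader_view = decode(new_bytes, OLD_SCHEMA)
new_reader_view = decode(new_bytes, NEW_SCHEMA)
new_reader_old_data = decode(old_bytes, NEW_SCHEMA)
print(old_reader_view, new_reader_view, new_reader_old_data)
```

The old reader can skip field 5 only because the wire type in the tag tells it how many bytes to step over. That is also why reusing a number with a different type is dangerous: the reader would step over or interpret the wrong bytes.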
Real-World Usage at Scale
Google keeps its internal service contracts as .proto files in a single monorepo. When a team changes a .proto file, the CI system automatically regenerates and recompiles all dependent services, catching breakage at build time rather than in production. Netflix, Uber, Lyft, Square, and Dropbox all use gRPC heavily for internal service meshes. Kubernetes uses gRPC for etcd communication and for the Container Runtime Interface (CRI). Envoy Proxy speaks gRPC natively and is the sidecar used by Istio; Linkerd ships its own proxy, which also handles gRPC traffic.
At Google scale (~10 billion RPC calls per second internally), even a 10% reduction in serialization overhead translates directly to thousands of servers removed from the fleet — the reason Google invested heavily in protobuf and HTTP/2.
gRPC is a strong fit for polyglot environments (a single .proto generates stubs in every language your org uses), high-throughput internal APIs (order processing, recommendation engines, ad serving), and bidirectional streaming use cases (telemetry pipelines, collaborative editing, live game state sync).