gRPC Introduction: Protobuf, Streaming, and Why

Intermediate 10 min read

What you'll learn

✓ Write a .proto file and generate code
✓ Implement a unary and a streaming RPC
✓ Use deadlines, errors, and metadata correctly
✓ Compare gRPC to REST honestly
✓ Decide where gRPC belongs in your stack

Prerequisites

• Comfort with one of Python or Node.js
• Read [What is REST](/blog/what-is-rest)
• Optional: [What is Node.js](/blog/what-is-nodejs)

gRPC is HTTP/2 plus protobuf plus codegen. You define services in a schema, run a generator, and call methods on a typed stub. The protocol disappears; what you write feels like local function calls with deadlines.

The schema

Protobuf is the contract. A .proto file describes services and messages.

syntax = "proto3";

package shop.v1;

service Catalog {
  rpc GetProduct(GetProductRequest) returns (Product);
  rpc ListProducts(ListProductsRequest) returns (stream Product);
  rpc Bulk(stream Product) returns (BulkResult);
  rpc Chat(stream Message) returns (stream Message);
}

message GetProductRequest { string id = 1; }
message Product { string id = 1; string name = 2; int32 price = 3; }
message ListProductsRequest { string category = 1; }
message BulkResult { int32 written = 1; }
message Message { string text = 1; }

Field numbers are part of the wire format. Never change a number after release. Add fields, never reuse tags.
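To see why the number matters, here is a minimal sketch of how a string field is laid out on the wire. It hand-rolls the encoding for illustration only (the helper names are made up, and real code should use the generated classes): each field starts with a tag that packs the field number together with a wire type, and the field name never appears.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one string field: tag, length, then the UTF-8 bytes."""
    data = value.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(data)) + data


# The same value written as field 1 and as field 2.
as_field_1 = encode_string_field(1, "p1")
as_field_2 = encode_string_field(2, "p1")
print(as_field_1.hex())  # 0a027031
print(as_field_2.hex())  # 12027031
```

Only the first byte differs. A reader compiled against the old schema would decode the second payload as a different field, which is why renumbering silently corrupts data instead of failing loudly.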

The four RPC styles

Unary: one request, one response. The familiar RPC shape.
Server streaming: one request, many responses. Good for feeds.
Client streaming: many requests, one response. Good for uploads.
Bidirectional streaming: independent streams in both directions. Good for chat or sync.

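All four run as streams multiplexed over a single HTTP/2 connection, so many concurrent calls share one connection instead of each opening its own, and latency stays low even when many are in flight.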
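In Python the four styles map onto plain functions and generators. The sketch below shows only the shapes: the dataclasses are stand-ins for the generated shop_pb2 messages so it runs without generated stubs, and the method names and simplified arguments are illustrative. A real servicer method also receives a context argument.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


# Plain dataclasses standing in for the generated shop_pb2 messages.
@dataclass
class Product:
    id: str
    name: str
    price: int


@dataclass
class BulkResult:
    written: int


@dataclass
class Message:
    text: str


class CatalogShapes:
    """Shows only the call shapes, not a real gRPC servicer."""

    def __init__(self) -> None:
        self.products: Dict[str, Product] = {"p1": Product("p1", "Mug", 900)}

    # Unary: one request in, one response returned.
    def get_product(self, product_id: str) -> Product:
        return self.products[product_id]

    # Server streaming: one request in, responses yielded one at a time.
    def list_products(self) -> Iterator[Product]:
        yield from self.products.values()

    # Client streaming: iterate over the requests, return a single response.
    def bulk(self, requests: Iterable[Product]) -> BulkResult:
        written = 0
        for product in requests:
            self.products[product.id] = product
            written += 1
        return BulkResult(written=written)

    # Bidirectional: iterate over requests and yield responses as you go.
    def chat(self, requests: Iterable[Message]) -> Iterator[Message]:
        for message in requests:
            yield Message(text=f"echo: {message.text}")


svc = CatalogShapes()
print(svc.get_product("p1").name)
print(svc.bulk(iter([Product("p2", "Plate", 1200)])).written)
print([m.text for m in svc.chat(iter([Message("hi")]))])
```

The rule of thumb: a streaming input becomes an iterator argument, and a streaming output becomes a generator.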

Generating code

For Python:

pip install grpcio grpcio-tools
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. shop.proto

For Node:

npm i @grpc/grpc-js @grpc/proto-loader

Node can load protos at runtime, which is convenient for small services. Static codegen is better for large schemas and editor tooling.

A Python server

import grpc
from concurrent import futures
import shop_pb2, shop_pb2_grpc

# In-memory stand-in for a real data store.
PRODUCTS = {"p1": shop_pb2.Product(id="p1", name="Mug", price=900)}

class Catalog(shop_pb2_grpc.CatalogServicer):
    def GetProduct(self, request, context):
        p = PRODUCTS.get(request.id)
        if p is None:
            # abort() raises, so the return below is skipped on a miss.
            context.abort(grpc.StatusCode.NOT_FOUND, "no such product")
        return p

    def ListProducts(self, request, context):
        # Server streaming: each yielded Product is sent as one message.
        for p in PRODUCTS.values():
            yield p

server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
shop_pb2_grpc.add_CatalogServicer_to_server(Catalog(), server)
# Plaintext port for local development; use add_secure_port in production.
server.add_insecure_port("[::]:50051")
server.start()
server.wait_for_termination()

context.abort is how you return errors. Use the canonical status codes; clients across languages share them.

A Python client

import grpc, shop_pb2, shop_pb2_grpc

with grpc.insecure_channel("localhost:50051") as ch:
    stub = shop_pb2_grpc.CatalogStub(ch)

    p = stub.GetProduct(shop_pb2.GetProductRequest(id="p1"), timeout=2.0)
    print(p.name)

    for prod in stub.ListProducts(shop_pb2.ListProductsRequest(), timeout=5.0):
        print(prod.id, prod.name)

Note timeout=2.0. The client turns it into a deadline and sends it along with the call. Inside the server, context.time_remaining() reports how much of that budget is left, so a handler can short-circuit slow work.

Deadlines, not timeouts

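A deadline is an absolute time the response must arrive by. Because it is absolute, it can be handed down the call chain: a 200 ms budget at the edge stays one shared 200 ms budget across three services, not 200 ms per hop. Some gRPC implementations forward the deadline for you when you pass the incoming call context to outgoing calls; in Python, forward it yourself by passing context.time_remaining() as the timeout of each downstream call. Always set a deadline on every call; the default is no deadline at all.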
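The budget arithmetic can be sketched without any gRPC machinery. Deadline and FakeClock below are hypothetical helpers, not part of the grpc package; the fake clock only makes the example deterministic. In a servicer, context.time_remaining() plays the role of remaining().

```python
import time


class Deadline:
    """An absolute expiry time shared by every hop of one request."""

    def __init__(self, budget_seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + budget_seconds

    def remaining(self):
        """Seconds left in the budget, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def expired(self):
        return self.remaining() == 0.0


class FakeClock:
    """Manually advanced clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
deadline = Deadline(0.200, clock=clock)  # 200 ms budget set at the edge

clock.advance(0.050)                  # first service spends 50 ms
hop2_timeout = deadline.remaining()   # what it passes downstream as timeout

clock.advance(0.120)                  # second service spends 120 ms
hop3_timeout = deadline.remaining()   # third service gets only what is left

clock.advance(0.050)                  # third service overruns the budget
print(round(hop2_timeout, 3), round(hop3_timeout, 3), deadline.expired())
```

Each hop passes the remaining budget as the timeout of its own outgoing call, so nobody keeps working for a caller that has already given up.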

Metadata

Metadata is HTTP/2 headers in disguise. Use it for auth tokens, request ids, and trace context.

stub.GetProduct(req, timeout=2.0, metadata=[("authorization", f"Bearer {token}")])

On the server, read it from context.invocation_metadata(). Do not stuff business data into metadata; put it in the proto.
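On the server side the metadata arrives as a sequence of key/value pairs. A minimal sketch of pulling the token back out, assuming the "Bearer " scheme from the client call above; bearer_token is a hypothetical helper and the sample values are made up:

```python
from typing import Iterable, Optional, Tuple


def bearer_token(metadata: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the bearer token from (key, value) metadata pairs, or None."""
    prefix = "Bearer "
    for key, value in metadata:
        # Header names are lowercase on the wire, so compare in lowercase.
        if key == "authorization" and value.startswith(prefix):
            return value[len(prefix):]
    return None


# In a servicer this would be called as
# bearer_token(context.invocation_metadata()).
sample = [("x-request-id", "req-123"), ("authorization", "Bearer t0k3n")]
print(bearer_token(sample))
```

A handler that gets None back would typically abort with UNAUTHENTICATED before doing any work.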

Errors that mean something

gRPC defines 17 status codes: OK plus 16 error codes. A few map cleanly to common failures:

INVALID_ARGUMENT: client sent garbage.
NOT_FOUND: the entity does not exist.
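ALREADY_EXISTS: the entity the client tried to create already exists.
PERMISSION_DENIED: authn ok, authz failed. Use UNAUTHENTICATED when credentials are missing or invalid.
RESOURCE_EXHAUSTED: rate limited or out of quota.
UNAVAILABLE: transient failure; retry with backoff.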

Use them. Generic INTERNAL everywhere defeats the typing.
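One way to keep INTERNAL from creeping in is to translate application exceptions to codes in a single place. The sketch below is an assumption about how you might organize that, not a gRPC feature: the custom exception classes are hypothetical, and the values are the names of grpc.StatusCode members kept as strings so the example runs without grpcio installed.

```python
# Hypothetical application exceptions; the names are illustrative only.
class ProductNotFound(Exception):
    pass


class DuplicateProduct(Exception):
    pass


class QuotaExceeded(Exception):
    pass


# Values are names of grpc.StatusCode members. In a servicer you could look
# one up with grpc.StatusCode[name] and pass it to context.abort.
STATUS_FOR_EXCEPTION = {
    ValueError: "INVALID_ARGUMENT",
    ProductNotFound: "NOT_FOUND",
    DuplicateProduct: "ALREADY_EXISTS",
    PermissionError: "PERMISSION_DENIED",
    QuotaExceeded: "RESOURCE_EXHAUSTED",
    ConnectionError: "UNAVAILABLE",
}


def status_name_for(exc: Exception) -> str:
    """Pick the status code name for an exception, INTERNAL as a last resort."""
    for exc_type, name in STATUS_FOR_EXCEPTION.items():
        if isinstance(exc, exc_type):
            return name
    return "INTERNAL"


print(status_name_for(ProductNotFound("p9")))
print(status_name_for(ConnectionRefusedError()))
print(status_name_for(RuntimeError("bug")))
```

With a table like this, INTERNAL shows up on dashboards only for failures nobody anticipated, which is what makes it a useful signal.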

gRPC vs REST

REST wins for browsers, public APIs, and human-readable debugging. gRPC wins for internal service-to-service traffic where you want typed contracts, streaming, and lower CPU per call. Compare to the basics in [What is REST](/blog/what-is-rest).

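Where they overlap, gRPC-Web bridges to browsers with limitations: it typically needs a translating proxy such as Envoy in front of the server, and it supports unary and server-streaming calls but not client or bidirectional streaming. For mobile, gRPC is excellent and ships with both Kotlin and Swift codegen.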

Versioning the contract

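Put your protos in a versioned package (shop.v1 above), never change field numbers, and deprecate fields before removing them. When you do remove a field, mark its number as reserved so it cannot be reused later. Treat the .proto file as the API; clients and servers can deploy independently because the schema mediates.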
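The numbering rules are mechanical enough to check in CI. Here is a deliberately small sketch that compares two versions of a message described as name-to-number dictionaries. The function and the dictionaries are hypothetical; a real setup would more likely run a dedicated schema linter against the .proto files.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, int], new: Dict[str, int]) -> List[str]:
    """List field-number problems between two versions of one message.

    Conservative on purpose: a pure rename that keeps its number is also
    reported, because names and numbers alone cannot tell it from a reuse.
    """
    problems = []
    old_by_number = {number: name for name, number in old.items()}
    for name, number in new.items():
        if name in old and old[name] != number:
            problems.append(f"{name}: number changed {old[name]} -> {number}")
        elif name not in old and number in old_by_number:
            problems.append(f"{name}: reuses number {number} (was {old_by_number[number]})")
    return problems


product_v1 = {"id": 1, "name": 2, "price": 3}
product_ok = {"id": 1, "name": 2, "price": 3, "currency": 4}  # additive only
product_bad = {"id": 1, "name": 3, "sku": 2}                  # renumber and reuse

print(breaking_changes(product_v1, product_ok))
print(breaking_changes(product_v1, product_bad))
```

An additive change produces an empty list; the second comparison reports both the renumbered field and the reused tag.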

Observability

gRPC integrates with OpenTelemetry. Trace context flows through metadata. Status codes turn into structured error rates on dashboards. Streaming RPCs need extra care: one long-lived stream counts as a single RPC no matter how much data it carries, so measure messages per stream, not just RPC counts.
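Counting messages per stream can be as simple as wrapping the response generator. This is a sketch under the assumption that you report the count yourself; count_messages and the on_close callback are hypothetical names, and in practice the callback would record to whatever metrics library you use.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def count_messages(stream: Iterable[T], on_close: Callable[[int], None]) -> Iterator[T]:
    """Yield every message unchanged and report the total when the stream ends."""
    count = 0
    try:
        for message in stream:
            count += 1
            yield message
    finally:
        # Runs on normal completion and also if the consumer stops early.
        on_close(count)


recorded = []  # stand-in for a histogram of messages per stream
delivered = list(count_messages(iter(["a", "b", "c"]), recorded.append))
print(delivered, recorded)
```

In a server-streaming handler you would wrap the generator you already yield from, so the count is recorded once per stream rather than once per message.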

When gRPC is the wrong tool

If your callers are a marketing site, a JS SDK on npm, or a partner integrating once a year, REST or GraphQL is friendlier. gRPC pays off when both sides ship from the same monorepo or share an internal SDK.

Wrap up

Write a proto, generate stubs, set deadlines on every call, use the status codes, and pick the right streaming style for the workload. The result is service-to-service traffic that feels like calling a library, with the performance to match.