Protobuf Schema Evolution Without Breaking Clients
Protobuf schema evolution has a small set of rules, and following them is the difference between shipping a new field and triggering a multi-service outage. Keep field numbers stable forever, reserve anything you remove, and add rather than mutate. Do that, and old and new clients keep talking through every deploy.
Protobuf is built for backward and forward compatibility, but only if you respect how it encodes data. The wire format identifies fields by number, not name, so the field number is the real contract. Break that and a “harmless” schema change silently corrupts data on clients you forgot existed.
Why Protobuf schema evolution matters
In a polyglot system, one .proto file generates clients in Go, Java, Python, TypeScript, and more. A single schema change ripples to every one of them, and they do not all deploy at the same instant.
That means you always run old and new code against the same schema during a rollout. If a change is not backward and forward compatible, the window between the first and last deploy is an outage waiting to happen. This post is part of the Language choices in polyglot microservices series, and it builds on the cross-language failures in Why Language Boundaries Break Polyglot Microservices.
The field number is the contract
Protobuf serializes each field as a tag (the field number plus wire type) followed by the value. Names are a source-code convenience that never travel on the wire in the binary encoding.
This single fact explains every rule that follows. The decoder on the other side matches incoming bytes to fields by number. If your number means one thing to the sender and another to the receiver, you do not get an error. You get the wrong value, parsed confidently, with no exception to alert anyone.
That silent-corruption property is why schema discipline matters more than it seems. A type error would at least crash loudly. A field-number collision just quietly hands service B the wrong data.
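To make the tag layout concrete, here is a minimal Python sketch that hand-encodes one string field the way the binary wire format lays it out: a varint tag built from the field number and wire type, then a length, then the payload. It is an illustration, not a replacement for generated code. The helper names are hypothetical, and the choice of field 2 holding an email address is borrowed from the User message later in this post.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Tag = (field number << 3) | wire type 2, then length, then UTF-8 bytes."""
    payload = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(payload)) + payload


encoded = encode_string_field(2, "a@b.c")
print(encoded.hex())  # 12056140622e63
```

The first byte, 0x12, carries the number 2 and the wire type. The word "email" appears nowhere in the output, which is why a rename is invisible to binary readers and a renumber is not.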
What changes break Protobuf backward compatibility?
The breaking changes are reusing or changing a field number, changing a field’s type, and removing a field without reserving its number. Each one makes new code misread bytes that old code wrote, or vice versa, with no error raised. Treat all three as forbidden in a live system.
Here is the cheat sheet I keep next to any .proto review.
| Change | Safe? | Why |
|---|---|---|
| Add a field with a new number | ✅ Safe | Old clients ignore it; new clients default it |
| Remove a field, then reserve its number | ✅ Safe | No one can recycle the number later |
| Rename a field (same number) | ✅ Binary-safe | Binary uses numbers; JSON/text mapping does change |
| Change a field’s type | ❌ Breaking | Decoders misread the bytes, except for a few wire-compatible pairs such as int32 and int64 |
| Reuse an old field number for a new field | ❌ Breaking | New code misreads old serialized data |
| Remove a field without reserving | ⚠️ Risky | A future edit can recycle the number |
| Change field number of an existing field | ❌ Breaking | It is a different field on the wire |
| Move a field in/out of a oneof | ❌ Breaking | Changes wire semantics |
Can I safely add fields to a Protobuf message?
Yes. Adding a field with a brand-new, never-used field number is the safest change you can make. Old clients silently ignore fields they don’t recognize, and new clients see the default value for fields that old servers don’t send. This is the additive path every safe evolution uses.
The discipline is to make every change additive. Need to “change” a field? Add a new one with a new number, migrate readers to it, and retire the old one later by reserving it. You never edit an existing field in place; you deprecate and add alongside.
message User {
  string id = 1;
  string email = 2;
  // Deprecated: use full_name (5). Kept for old clients.
  string name = 3 [deprecated = true];
  reserved 4;              // a field we removed; never reuse 4
  reserved "legacy_flag";  // and its old name
  string full_name = 5;    // the additive replacement
}
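The behavior on both sides of that additive change can be sketched in a few lines of Python. This is a toy decoder, not the real runtime: it handles only string fields and assumes field numbers below 16 and values shorter than 128 bytes, so every tag and length fits in one byte. The two schema dictionaries are hypothetical stand-ins for an old and a new build of the User message.

```python
def encode(fields: dict[int, str]) -> bytes:
    """Encode string fields. Assumes field numbers < 16 and values < 128 bytes,
    so each tag and each length is a single-byte varint."""
    out = bytearray()
    for number, text in fields.items():
        payload = text.encode("utf-8")
        out += bytes([(number << 3) | 2, len(payload)]) + payload
    return bytes(out)


def decode(buf: bytes, schema: dict[int, str]) -> dict[str, str]:
    """Decode against a reader schema: unknown numbers are skipped,
    and fields missing from the bytes keep the default empty string."""
    message = {name: "" for name in schema.values()}
    pos = 0
    while pos < len(buf):
        number, length = buf[pos] >> 3, buf[pos + 1]
        value = buf[pos + 2 : pos + 2 + length]
        pos += 2 + length
        if number in schema:  # known field: keep it
            message[schema[number]] = value.decode("utf-8")
        # unknown field: skipped without raising
    return message


OLD_SCHEMA = {1: "id", 2: "email", 3: "name"}
NEW_SCHEMA = {1: "id", 2: "email", 3: "name", 5: "full_name"}

from_new_writer = encode({1: "u1", 2: "a@b.c", 5: "Ada Lovelace"})
from_old_writer = encode({1: "u1", 2: "a@b.c", 3: "Ada"})

print(decode(from_new_writer, OLD_SCHEMA))  # old reader, new bytes
print(decode(from_old_writer, NEW_SCHEMA))  # new reader, old bytes
```

The old reader never sees field 5 and does not fail. The new reader gets an empty full_name and can fall back to name until every writer has migrated.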
Why should I reserve removed field numbers in Protobuf?
Because reserved stops a future engineer from recycling the number or name for an unrelated field. If number 4 once held a timestamp and someone later reuses 4 for a string, new code reading old data interprets timestamp bytes as a string. Reserving turns that silent corruption into a compile-time error.
Reserve both the number and the name. Reserving the number protects the binary wire format; reserving the name protects JSON and text encodings and prevents accidental source-level reuse. It costs one line and removes an entire class of 2 a.m. incidents.
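Here is a small Python sketch of the failure that reserved prevents. It assumes a hypothetical history in which field 4 used to be a string called shipping_note and was later recycled as a string called promo_code. Neither name comes from the schema above, and the decoder is a simplified stand-in for generated code that only understands length-delimited fields.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_strings(buf: bytes, names_by_number: dict[int, str]) -> dict[str, str]:
    """Decode string fields, naming each one according to the reader's schema."""
    fields = {}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = decode_varint(buf, pos)
        value = buf[pos : pos + length]
        pos += length
        if number in names_by_number:
            fields[names_by_number[number]] = value.decode("utf-8")
    return fields


# Bytes written long ago, when field 4 meant shipping_note (hypothetical).
payload = b"leave at door"
old_bytes = bytes([(4 << 3) | 2, len(payload)]) + payload

# A newer reader whose schema recycled number 4 for promo_code (hypothetical).
misread = decode_strings(old_bytes, {4: "promo_code"})
print(misread)  # {'promo_code': 'leave at door'}
```

Nothing raises. The reader simply reports a promo code that was never a promo code, which is the kind of bug a single reserved line rules out at compile time.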
How do I handle unknown enum values across languages?
Define the zero value as an explicit UNKNOWN and handle that case on every client. When a newer server sends an enum value an older client was never compiled with, each language surfaces the “unknown” differently: in proto3, generated Java code returns UNRECOGNIZED, for example, while Go keeps the raw integer. An explicit unknown branch that also catches unrecognized numbers is what stops a silent misroute.
enum OrderState {
  ORDER_STATE_UNKNOWN = 0;  // zero value: always a safe default
  ORDER_STATE_PENDING = 1;
  ORDER_STATE_SHIPPED = 2;
  ORDER_STATE_DELIVERED = 3;
}
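On the client side, the unknown branch might look like the following Python sketch. The IntEnum is a hand-written mirror of OrderState, not generated code, and the route names are hypothetical. The point is only that a number the client was never compiled with lands in the same explicit branch as the zero value instead of falling through to a real state.

```python
from enum import IntEnum


class OrderState(IntEnum):
    """Hand-written mirror of the enum above, as an older client knows it."""
    ORDER_STATE_UNKNOWN = 0
    ORDER_STATE_PENDING = 1
    ORDER_STATE_SHIPPED = 2
    ORDER_STATE_DELIVERED = 3


# Hypothetical routing table for the states this client understands.
ROUTES = {
    OrderState.ORDER_STATE_PENDING: "payment-queue",
    OrderState.ORDER_STATE_SHIPPED: "tracking-queue",
    OrderState.ORDER_STATE_DELIVERED: "archive",
}


def parse_order_state(raw: int) -> OrderState:
    """Map a raw wire integer to a known state, or to UNKNOWN if unrecognized."""
    try:
        return OrderState(raw)
    except ValueError:
        return OrderState.ORDER_STATE_UNKNOWN


def route(raw: int) -> str:
    state = parse_order_state(raw)
    if state is OrderState.ORDER_STATE_UNKNOWN:
        return "manual-review"  # explicit unknown branch, never a guess
    return ROUTES[state]


print(route(2))  # tracking-queue
print(route(7))  # manual-review: 7 was added by a newer server
```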
Proto3 also collapses the distinction between an absent scalar field and one explicitly set to its zero value, unless you opt into field presence. With implicit presence, a zero is not written to the wire at all, so the reader cannot tell the two apart. When that distinction matters to your logic, use explicit presence (optional in proto3) so a missing value and a real zero are not confused across runtimes.
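The difference is easiest to see in the bytes. The Python sketch below models a single integer field with number 1 under both presence modes. It is a simplified model, not the real encoder: the function names are hypothetical, and values are assumed to be between 0 and 127 so they fit in one varint byte.

```python
from typing import Optional


def encode_count(value: Optional[int], explicit_presence: bool) -> bytes:
    """Encode field 1 as a varint. Assumes 0 <= value < 128 (one byte)."""
    if value is None:
        return b""  # unset: nothing on the wire in either mode
    if value == 0 and not explicit_presence:
        return b""  # implicit presence: a zero is not serialized
    return bytes([(1 << 3) | 0, value])  # tag for field 1, wire type 0


def decode_count(buf: bytes) -> Optional[int]:
    """Return the value if field 1 is on the wire, else None."""
    return buf[1] if buf else None


# Implicit presence: "set to 0" and "never set" produce identical bytes.
print(encode_count(0, explicit_presence=False) == encode_count(None, explicit_presence=False))  # True

# Explicit presence: the zero survives the round trip.
print(decode_count(encode_count(0, explicit_presence=True)))     # 0
print(decode_count(encode_count(None, explicit_presence=True)))  # None
```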
Should you version your Protobuf API?
Prefer in-place additive evolution over hard version bumps. Because Protobuf changes can almost always be made backward compatible, you rarely need a v2 package; you add fields and deprecate old ones within the same message. Reserve a new versioned package for a genuine, incompatible redesign you cannot reach additively.
The reason matters. A v2 package means generating, deploying, and maintaining two full sets of clients and servers, plus a migration window where both run. That is a large, recurring cost. In-place evolution avoids it entirely as long as you obey the field-number rules: old clients keep working against the new schema, and you migrate readers to new fields at your own pace.
When you genuinely do need a breaking redesign, a new package version (mypackage.v2) run alongside v1 is the clean path: stand up the new contract, migrate consumers one at a time, and retire v1 once traffic drains. The key is that this is the exception you reach for deliberately, not the default cadence for every change.
A schema-change review checklist
Run this before merging any .proto change.
- No existing field number is changed, reused, or retyped.
- Every removed field has a reserved number and name.
- New fields use new numbers and have sensible defaults.
- Every enum has an explicit zero UNKNOWN, handled on all clients.
- Field presence is explicit wherever absent-versus-zero matters.
- A round-trip test encodes with the old schema and decodes with the new one, and vice versa.
That last point is the one teams skip. A compatibility test that serializes with version N and deserializes with version N+1 (and the reverse) catches the breakage your code review missed.
What I’d do differently
The lesson I learned the slow way is that schema review cannot be vibes. “It’s just a small change” is exactly how field numbers get reused. The fix is to make the rules mechanical: a linter or a buf breaking check in CI that rejects field-number changes and unreserved removals, so the discipline does not depend on a tired reviewer at the end of a sprint.
If you treat the .proto as a versioned, owned interface with automated compatibility checks, schema evolution becomes boring, which is exactly what you want from the contract that ties your whole system together. For how those contracts then behave at runtime across languages, see gRPC Across Languages: Production Lessons.
Sources
- Protocol Buffers, Updating a message type: protobuf.dev/programming-guides/proto3/#updating
- Protocol Buffers, Field presence: protobuf.dev/programming-guides/field_presence
- Buf, Breaking change detection: buf.build/docs/breaking/overview
Frequently asked questions
What changes break Protobuf backward compatibility?
Reusing or changing a field number, changing a field's type, and renaming fields in a way that affects JSON or text encoding. Removing a field without reserving its number is also dangerous, because a future engineer can recycle the number and corrupt old data.
Can I safely add fields to a Protobuf message?
Yes. Adding a new field with a new, never-used field number is the safest change in Protobuf. Old clients ignore fields they don't know, and new clients see the default value for fields old servers don't send.
Why should I reserve removed field numbers in Protobuf?
Because reserving the number and name prevents a future engineer from recycling them for a different field. Reusing an old number makes new code misread old serialized data, a silent and hard-to-trace data-corruption bug.
How do I handle unknown enum values across languages?
Define the zero value as an explicit UNKNOWN case and handle it on every client. When a newer server sends an enum value an older client has never seen, languages differ in how they surface it, so an explicit unknown branch prevents silent misroutes.