Protobuf Testing: Validating Schemas & Serialization
Protocol Buffers (Protobuf) is not just a serialization format — it's a schema system. When you change a .proto file, you're changing a contract that potentially affects every service that uses it. Testing Protobuf schemas means validating not just that serialization works, but that your changes don't break consumers across language boundaries and service versions.
What Needs Testing in Protobuf
Before jumping to code, clarify what can go wrong:
- Serialization correctness — does the message serialize and deserialize without data loss?
- Backward compatibility — can the new schema read messages written by the old schema?
- Forward compatibility — can the old schema read messages written by the new schema?
- Validation — are invalid messages rejected before reaching business logic?
- Cross-language compatibility — does a Go-serialized message deserialize correctly in Python?
- Default values — are unset fields handled consistently?
Testing Serialization Roundtrips
The most basic test: serialize a message, deserialize it, assert equality.
Python
```python
import unittest

from google.protobuf import json_format

import user_pb2


class TestUserSerialization(unittest.TestCase):
    def test_serialize_deserialize_preserves_all_fields(self):
        original = user_pb2.User(
            user_id="user-123",
            name="Alice",
            email="alice@example.com",
            age=30,
            tags=["admin", "beta-tester"],
            metadata={"plan": "pro", "region": "us-east-1"},
        )
        # Serialize to binary
        serialized = original.SerializeToString()
        # Deserialize
        restored = user_pb2.User()
        restored.ParseFromString(serialized)
        self.assertEqual(restored.user_id, "user-123")
        self.assertEqual(restored.name, "Alice")
        self.assertEqual(restored.email, "alice@example.com")
        self.assertEqual(restored.age, 30)
        self.assertEqual(list(restored.tags), ["admin", "beta-tester"])
        self.assertEqual(dict(restored.metadata), {"plan": "pro", "region": "us-east-1"})

    def test_json_serialization_roundtrip(self):
        original = user_pb2.User(user_id="user-456", name="Bob")
        # To JSON
        json_str = json_format.MessageToJson(original)
        # From JSON
        restored = json_format.Parse(json_str, user_pb2.User())
        self.assertEqual(restored.user_id, "user-456")
        self.assertEqual(restored.name, "Bob")

    def test_empty_string_vs_unset_field(self):
        """Verify proto3 default value behavior."""
        msg = user_pb2.User()
        # In proto3, string fields default to empty string
        self.assertEqual(msg.name, "")
        # HasField() only works for fields with explicit presence (declared
        # `optional`, message-typed, or part of a oneof). On a plain proto3
        # scalar it raises ValueError, so this assumes `name` is `optional`.
        self.assertFalse(msg.HasField("name"))
```
Go
```go
package proto_test

import (
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"google.golang.org/protobuf/proto"

	pb "github.com/your-org/user/pb"
)

func TestUserSerialization(t *testing.T) {
	original := &pb.User{
		UserId: "user-123",
		Name:   "Alice",
		Email:  "alice@example.com",
		Age:    30,
		Tags:   []string{"admin", "beta-tester"},
	}
	// Serialize
	data, err := proto.Marshal(original)
	require.NoError(t, err)
	assert.NotEmpty(t, data)
	// Deserialize
	restored := &pb.User{}
	err = proto.Unmarshal(data, restored)
	require.NoError(t, err)
	// Compare using proto.Equal (not reflect.DeepEqual)
	assert.True(t, proto.Equal(original, restored))
}

func TestProtoEqual_HandlesDefaultValues(t *testing.T) {
	// Two messages with different construction but same logical value
	msg1 := &pb.User{Name: "Alice", Age: 0} // Age explicitly 0
	msg2 := &pb.User{Name: "Alice"}         // Age unset (defaults to 0)
	// proto.Equal handles this correctly
	assert.True(t, proto.Equal(msg1, msg2))
	// reflect.DeepEqual would also be equal here, but proto.Equal is safer
	// for complex messages with nested types and maps
}
```
Java
```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

import com.google.protobuf.InvalidProtocolBufferException;
// JsonFormat ships in the protobuf-java-util artifact
import com.google.protobuf.util.JsonFormat;
import com.example.UserProto.User;

class UserSerializationTest {
    @Test
    void serializeDeserialize_preservesAllFields() throws InvalidProtocolBufferException {
        User original = User.newBuilder()
            .setUserId("user-123")
            .setName("Alice")
            .setEmail("alice@example.com")
            .setAge(30)
            .addTags("admin")
            .addTags("beta-tester")
            .build();
        byte[] serialized = original.toByteArray();
        User restored = User.parseFrom(serialized);
        assertEquals(original, restored);
        assertEquals("user-123", restored.getUserId());
        assertEquals(2, restored.getTagsCount());
    }

    @Test
    void jsonSerialization_roundtrip() throws Exception {
        User original = User.newBuilder()
            .setUserId("user-456")
            .setName("Bob")
            .build();
        String json = JsonFormat.printer().print(original);
        User.Builder builder = User.newBuilder();
        JsonFormat.parser().merge(json, builder);
        User restored = builder.build();
        assertEquals(original, restored);
    }
}
```
Testing Backward Compatibility
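Proto3 compatibility rules in short: adding a field under a new, previously unused field number is safe; changing the number or type of an existing field is not. A removed field's number should be marked reserved so it is never reused with a different meaning.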
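These rules follow from the binary wire format: each field is written as a tag that combines its field number with a wire type, followed by the value, and a reader skips any field number it does not recognize. The sketch below models that mechanism for string fields only, using the standard library. The helper names (`encode_string_field`, `decode_known_strings`) and the `V1`/`V2` field maps are hypothetical and exist only for illustration; real code should use the generated classes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


LENGTH_DELIMITED = 2  # wire type for strings, bytes and nested messages


def encode_string_field(field_number: int, value: str) -> bytes:
    """Write tag = (field_number << 3) | wire_type, then length, then bytes."""
    payload = value.encode("utf-8")
    tag = (field_number << 3) | LENGTH_DELIMITED
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def decode_known_strings(data: bytes, schema: dict[int, str]) -> dict[str, str]:
    """Decode the string fields listed in schema and skip unknown numbers."""
    fields = {}
    pos = 0
    while pos < len(data):
        tag, pos = decode_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != LENGTH_DELIMITED:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = decode_varint(data, pos)
        payload = data[pos:pos + length]
        pos += length
        if field_number in schema:
            fields[schema[field_number]] = payload.decode("utf-8")
        # Unknown field numbers fall through: the bytes are consumed, not decoded.
    return fields


# Hypothetical schemas: v2 adds `phone` under a new field number.
V1 = {1: "user_id", 2: "name"}
V2 = {1: "user_id", 2: "name", 3: "phone"}

new_message = (
    encode_string_field(1, "user-456")
    + encode_string_field(2, "Bob")
    + encode_string_field(3, "+1-555-0100")
)

old_view = decode_known_strings(new_message, V1)  # phone is skipped
new_view = decode_known_strings(new_message, V2)  # phone is decoded
```

Field names never appear in the binary encoding, so only numbers and types matter here: renumbering a field makes existing data land under a different or unknown field. The same two directions, written as tests against two generated versions of the schema: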
```python
import unittest

# Modules generated from the old and new versions of the schema
import user_v1_pb2
import user_v2_pb2


class TestSchemaCompatibility(unittest.TestCase):
    def test_new_schema_reads_old_message(self):
        """
        Simulate: producer uses old schema (without 'phone' field),
        consumer uses new schema (with 'phone' field added).
        The consumer should read the message successfully, with phone defaulting.
        """
        # Old message: serialized without the phone field
        old_message = user_v1_pb2.User(
            user_id="user-123",
            name="Alice",
        )
        serialized = old_message.SerializeToString()
        # New schema: includes the phone field
        new_message = user_v2_pb2.User()
        new_message.ParseFromString(serialized)
        # Fields present in old schema are preserved
        self.assertEqual(new_message.user_id, "user-123")
        self.assertEqual(new_message.name, "Alice")
        # New field defaults to empty string (proto3 default)
        self.assertEqual(new_message.phone, "")

    def test_old_schema_reads_new_message(self):
        """Forward compatibility: old reader ignores unknown fields."""
        new_message = user_v2_pb2.User(
            user_id="user-456",
            name="Bob",
            phone="+1-555-0100",  # field unknown to old schema
        )
        serialized = new_message.SerializeToString()
        old_message = user_v1_pb2.User()
        old_message.ParseFromString(serialized)
        # Known fields preserved
        self.assertEqual(old_message.user_id, "user-456")
        self.assertEqual(old_message.name, "Bob")
        # 'phone' has no accessor in the old schema; parsing still succeeds
        # and the value is carried as an unknown field, which is expected.
```
Testing with buf (Recommended)
buf is the standard tool for Protobuf linting and breaking change detection. Add it to CI to catch compatibility issues before they reach production.
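To make "breaking change" concrete, the sketch below compares two versions of a message, described as plain dictionaries, and lists the kinds of changes such a check looks for. It is a simplified illustration and not how buf is implemented; `find_breaking_changes` and the `{field_number: (name, type)}` layout are hypothetical.

```python
def find_breaking_changes(
    old_fields: dict[int, tuple[str, str]],
    new_fields: dict[int, tuple[str, str]],
) -> list[str]:
    """Compare {field_number: (name, type)} maps and list incompatible changes."""
    problems = []
    for number, (old_name, old_type) in sorted(old_fields.items()):
        if number not in new_fields:
            problems.append(f"field {number} ({old_name}) was removed")
            continue
        new_name, new_type = new_fields[number]
        if new_type != old_type:
            problems.append(
                f"field {number} ({old_name}) changed type {old_type} -> {new_type}"
            )
        if new_name != old_name:
            # A rename breaks generated code and JSON, not the binary encoding.
            problems.append(f"field {number} renamed {old_name} -> {new_name}")
    return problems


# Hypothetical before/after versions of a User message.
old = {1: ("user_id", "string"), 2: ("name", "string"), 4: ("age", "int32")}
new = {
    1: ("user_id", "string"),
    2: ("full_name", "string"),   # renamed
    4: ("age", "string"),         # type changed
    5: ("phone", "string"),       # added under a new number
}

report = find_breaking_changes(old, new)
for line in report:
    print(line)
```

Field 5 is not reported, which matches the rule that adding a field under a new number is safe.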
Install:
```shell
brew install bufbuild/buf/buf  # macOS
# or download from https://github.com/bufbuild/buf/releases
```
buf.yaml:
```yaml
version: v1
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT
```
Check for breaking changes against the main branch:
```shell
# Compare the working tree against the main branch
buf breaking --against '.git#branch=main'
# Check against a remote schema registry
buf breaking --against 'buf.build/your-org/your-schemas'
```
Add to GitHub Actions:
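```yaml
# Both actions expect the buf CLI on the runner, for example installed by
# an earlier bufbuild/buf-setup-action@v1 step.
- name: Protobuf lint
  uses: bufbuild/buf-lint-action@v1
- name: Check breaking changes
  uses: bufbuild/buf-breaking-action@v1
  with:
    # `with:` inputs are not shell-expanded, so use an expression
    against: 'https://github.com/${{ github.repository }}.git#branch=main'
```
Validating Proto Messages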
Proto3 doesn't enforce validation rules by default. Use protoc-gen-validate (PGV) or protovalidate for field-level validation:
With protovalidate:
```protobuf
syntax = "proto3";

import "buf/validate/validate.proto";

message CreateUserRequest {
  string name = 1 [(buf.validate.field).string = {
    min_len: 1,
    max_len: 100,
  }];
  string email = 2 [(buf.validate.field).string.email = true];
  int32 age = 3 [(buf.validate.field).int32 = {
    gte: 0,
    lte: 150,
  }];
}
```
Test the validation:
```python
import unittest

import protovalidate  # pip install protovalidate

import user_pb2


class TestCreateUserRequestValidation(unittest.TestCase):
    def test_valid_user_request_passes_validation(self):
        request = user_pb2.CreateUserRequest(
            name="Alice",
            email="alice@example.com",
            age=30,
        )
        # validate() returns normally when every rule is satisfied
        protovalidate.validate(request)

    def test_invalid_email_fails_validation(self):
        request = user_pb2.CreateUserRequest(
            name="Alice",
            email="not-an-email",
            age=30,
        )
        # validate() raises ValidationError when any rule is violated.
        # The exception carries the individual violations; how each one
        # exposes its field path differs between protovalidate releases,
        # so assert on the "email" path against the version you pin.
        with self.assertRaises(protovalidate.ValidationError):
            protovalidate.validate(request)

    def test_empty_name_fails_validation(self):
        request = user_pb2.CreateUserRequest(
            name="",
            email="alice@example.com",
        )
        with self.assertRaises(protovalidate.ValidationError):
            protovalidate.validate(request)
```
Cross-Language Compatibility Testing
If your services span multiple languages, test that a message serialized in one language is readable in another:
```python
# test_cross_language.py
# Strategy: serialize in Python, write the bytes to a file, read them in a Go test
import os
import subprocess
import tempfile
import unittest

import user_pb2


class TestCrossLanguage(unittest.TestCase):
    def test_python_serialized_message_readable_by_go(self):
        user = user_pb2.User(
            user_id="cross-lang-test",
            name="CrossLang User",
            email="cross@example.com",
        )
        with tempfile.NamedTemporaryFile(suffix='.bin', delete=False) as f:
            f.write(user.SerializeToString())
            temp_path = f.name
        try:
            # Run the Go test that reads this file. The package pattern must
            # come before any flag that `go test` itself does not define, and
            # every package it matches must register -test-file.
            result = subprocess.run(
                ['go', 'test', './proto/...',
                 '-run', 'TestReadPythonSerializedMessage',
                 f'-test-file={temp_path}'],
                capture_output=True, text=True,
                cwd='/path/to/go-service'
            )
            self.assertEqual(result.returncode, 0, result.stdout + result.stderr)
        finally:
            os.unlink(temp_path)
```
Go test that reads the Python-serialized binary:
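```go
package proto_test

import (
	"flag"
	"os"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	"google.golang.org/protobuf/proto"

	pb "github.com/your-org/user/pb"
)

// Registered at package level so the flag exists before the testing
// package parses the command line; no extra flag.Parse() call is needed.
var testFile = flag.String("test-file", "", "path to protobuf binary")

func TestReadPythonSerializedMessage(t *testing.T) {
	if *testFile == "" {
		t.Skip("No test file provided")
	}
	data, err := os.ReadFile(*testFile)
	require.NoError(t, err)
	user := &pb.User{}
	err = proto.Unmarshal(data, user)
	require.NoError(t, err)
	assert.Equal(t, "cross-lang-test", user.UserId)
	assert.Equal(t, "CrossLang User", user.Name)
}
```
Testing Oneof Fields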
```python
import unittest

import order_pb2


class TestOrderOneof(unittest.TestCase):
    def test_payment_method_oneof(self):
        # Only one payment method should be set
        order = order_pb2.Order(
            order_id="order-1",
            credit_card=order_pb2.CreditCard(
                number="4242424242424242",
                expiry="12/28",
            )
        )
        # Check which oneof is set
        self.assertEqual(order.WhichOneof('payment_method'), 'credit_card')
        self.assertEqual(order.credit_card.number, "4242424242424242")
        # Setting another oneof field clears the first
        order.bank_transfer.CopyFrom(order_pb2.BankTransfer(account="DE89..."))
        self.assertEqual(order.WhichOneof('payment_method'), 'bank_transfer')
        self.assertEqual(order.credit_card.number, "")  # cleared
```
Regression Testing for Schema Changes
Create a golden file test to catch unexpected schema changes:
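The golden value can be a hash or the schema description itself. Storing the description makes failures easier to read, because a diff shows which field changed. The sketch below shows that variant using only the standard library. The `describe_schema` and `schema_diff` helpers and the field lists are hypothetical stand-ins; in a real test the field data would come from the message descriptor, where `type` is an integer code instead of the readable strings used here.

```python
import difflib
import json


def describe_schema(full_name: str, fields: list[dict]) -> str:
    """Render a schema description as stable, line-oriented JSON."""
    ordered = sorted(fields, key=lambda f: f["number"])
    return json.dumps(
        {"full_name": full_name, "fields": ordered},
        indent=2,
        sort_keys=True,
    )


def schema_diff(golden: str, current: str) -> str:
    """Return a unified diff of golden vs current; empty when identical."""
    lines = difflib.unified_diff(
        golden.splitlines(),
        current.splitlines(),
        fromfile="golden",
        tofile="current",
        lineterm="",
    )
    return "\n".join(lines)


# Hypothetical golden description, as it would be committed to the repo.
golden = describe_schema("example.User", [
    {"name": "user_id", "number": 1, "type": "string"},
    {"name": "name", "number": 2, "type": "string"},
])

# Hypothetical current description: same fields, but user_id changed type.
current = describe_schema("example.User", [
    {"name": "name", "number": 2, "type": "string"},
    {"name": "user_id", "number": 1, "type": "int64"},
])

diff = schema_diff(golden, current)
if diff:
    print(diff)
```

Sorting by field number keeps the description stable when fields are reordered in the .proto file, so only real changes show up in the diff. The more compact hash-based form of the same check: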
```python
import hashlib
import json
import unittest

import user_pb2


class TestUserSchemaGolden(unittest.TestCase):
    def test_user_proto_descriptor_unchanged(self):
        """Detect unintended proto schema changes."""
        descriptor = user_pb2.User.DESCRIPTOR
        schema_info = {
            'full_name': descriptor.full_name,
            'fields': [
                {
                    'name': f.name,
                    'number': f.number,
                    'type': f.type,
                    'label': f.label,
                }
                for f in descriptor.fields
            ]
        }
        schema_json = json.dumps(schema_info, sort_keys=True)
        schema_hash = hashlib.sha256(schema_json.encode()).hexdigest()
        # Update this hash intentionally when schema changes are approved
        EXPECTED_HASH = "abc123..."
        self.assertEqual(
            schema_hash, EXPECTED_HASH,
            "User proto schema changed unexpectedly. If intentional, "
            "update EXPECTED_HASH and verify backward compatibility."
        )
```
Monitoring Protobuf Services
gRPC services built on Protobuf schemas often expose metrics and health endpoints. HelpMeTest monitors these with 5-minute intervals:
```shell
curl -fsSL https://helpmetest.com/install | bash
helpmetest health "user-grpc-service" "5m"
```
Schema changes that break clients show up as service errors, and monitoring catches these before users escalate.
Summary
| Test Type | What It Catches | Tool |
|---|---|---|
| Serialization roundtrip | Data loss in encode/decode | unittest, testify |
| Backward compatibility | New schema breaking old readers | Python/Go roundtrip tests |
| Breaking change detection | Removed fields, changed types | buf CLI in CI |
| Field validation | Invalid data bypassing service | protovalidate |
| Cross-language compat | Binary format consistency | Shared binary files |
| Golden file tests | Unintended schema drift | Hash comparison |
Protobuf schemas are contracts. Test them like contracts — with explicit compatibility assertions, not just "does it serialize."