Understanding Protocol Buffers with Practical Examples
Protocol Buffers, often called Protobufs, are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data. They are used for data storage and for communication over the network, and they are typically more compact and faster to process than traditional XML and JSON.
Unlike XML and JSON, which are human-readable, Protocol Buffers use a binary format, which is more compact and faster for a computer to read and write. Additionally, Protobufs offer an interface definition language (IDL) that allows you to define the data structure to be serialized.
Protocol Buffer Basics
Protobufs are defined in .proto files, which specify the messages to be exchanged. Each message is a small logical record of information containing a series of name-value pairs. Here’s a simple example of how a Protobuf message could be defined:
syntax = "proto2";
message Employee {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}
In this example, the Employee message contains three fields: a string field named ‘name’, an integer field named ‘id’, and a string field named ‘email’. Each field has a unique number assigned, which is used to identify your field in the message binary format.
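To make the role of field numbers concrete, here is a minimal pure-Python sketch of how the Employee fields above would look in the binary wire format. It is an illustration written by hand, not the code that protoc generates: the helper names (encode_varint, encode_tag, encode_employee) are made up for this example, and it only handles non-negative integers and strings. Each field is written as a key, computed as (field_number << 3) | wire_type, followed by its value.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field's key on the wire: (field_number << 3) | wire_type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    # Wire type 0 = varint (this sketch ignores negative numbers).
    return encode_tag(field_number, 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 = length-delimited: key, byte length, then the bytes.
    data = text.encode("utf-8")
    return encode_tag(field_number, 2) + encode_varint(len(data)) + data


def encode_employee(name: str, employee_id: int, email: str) -> bytes:
    # Field numbers 1, 2, 3 come from the Employee definition above.
    return (
        encode_string_field(1, name)
        + encode_int_field(2, employee_id)
        + encode_string_field(3, email)
    )


wire = encode_employee("Alice", 1234, "alice@example.com")
as_json = json.dumps(
    {"name": "Alice", "id": 1234, "email": "alice@example.com"},
    separators=(",", ":"),
).encode("utf-8")

print(wire[:10].hex(" "))        # 0a 05 41 6c 69 63 65 10 d2 09
print(len(wire), len(as_json))   # 29 54
```

Note that the field names never appear in the binary output, only the field numbers, which is a large part of why the encoding is smaller than the equivalent JSON.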
In proto2 syntax, fields can be assigned one of three “labels”: ‘required’, ‘optional’, or ‘repeated’. A ‘required’ field must be set in all messages of this type, ‘optional’ means that the field may or may not be set, and ‘repeated’ indicates that the field can be repeated any number of times (including zero). Note that proto3 syntax drops the ‘required’ label, so the examples in this article assume proto2.
Compiling Protocol Buffers
The .proto files can be compiled into code for various languages using protoc, the Protocol Buffer compiler. For instance, to generate Python code, you would use the following command:
protoc --python_out=. employee.proto
This would output a Python file named employee_pb2.py, which contains generated code for creating, manipulating, and serializing your defined message.
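If you would rather drive the compiler from a Python build script, the same command can be wrapped in a small helper. This is only a sketch: build_protoc_command and compile_proto are hypothetical names, and it assumes protoc is installed and on your PATH.

```python
import shutil
import subprocess


def build_protoc_command(proto_file: str, out_dir: str = ".") -> list:
    """Build the same command line shown above as an argument list."""
    return ["protoc", f"--python_out={out_dir}", proto_file]


def compile_proto(proto_file: str, out_dir: str = ".") -> bool:
    """Run protoc if it is available; return False when it is not installed."""
    if shutil.which("protoc") is None:
        return False
    # check=True raises CalledProcessError if protoc reports an error.
    subprocess.run(build_protoc_command(proto_file, out_dir), check=True)
    return True


print(" ".join(build_protoc_command("employee.proto")))
# protoc --python_out=. employee.proto
```

Calling compile_proto("employee.proto") from the directory that holds the .proto file would then produce employee_pb2.py next to it.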
Using Protocol Buffers in Your Code
Here’s how you could use the generated Python code to create an Employee message, set its fields, and then serialize the message to a byte string:
import employee_pb2
# Create an Employee message
employee = employee_pb2.Employee()
# Set the fields
employee.name = "Alice"
employee.id = 1234
employee.email = "alice@example.com"
# Serialize the message to a binary string
serialized_employee = employee.SerializeToString()
You can also parse serialized data back into a Protobuf message like this:
import employee_pb2
# Create a new Employee message
employee = employee_pb2.Employee()
# Parse the serialized data
employee.ParseFromString(serialized_employee)
# Now you can access the fields again
print(employee.name)  # Outputs: Alice
Benefits of Protocol Buffers
- Efficiency: Protobufs are more compact and faster for a computer to parse than XML or JSON.
- Backward and Forward Compatibility: You can add new optional or repeated fields to your message formats without breaking old programs, as long as existing field numbers are not changed or reused.
- Language Interoperability: Protobufs can be used in almost any language, including Python, Java, C++, Go, Ruby, and others.
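The compatibility point can be illustrated with a hand-written decoder. The sketch below is not the real Protobuf library: read_varint, decode_known_fields, and the extra field number 4 (a hypothetical ‘team’ string added by a newer writer) are assumptions made for the example. It shows an "old" reader that only knows fields 1 to 3 stepping over a field it does not recognize instead of failing. Real implementations are more complete and typically retain unknown fields rather than discarding them.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


# Field numbers the "old" reader knows, taken from the Employee message.
KNOWN_FIELDS = {1: "name", 2: "id", 3: "email"}


def decode_known_fields(buf: bytes) -> dict:
    """Decode varint and length-delimited fields, skipping unknown numbers."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        name = KNOWN_FIELDS.get(field_number)
        if name is None:
            continue  # unknown field: the wire type told us how to skip it
        fields[name] = value.decode("utf-8") if wire_type == 2 else value
    return fields


# name = "Alice" (field 1), id = 1234 (field 2)
old_message = bytes.fromhex("0a05416c69636510d209")
# The same message plus hypothetical field 4 = "RnD" from a newer writer.
new_message = old_message + bytes.fromhex("2203526e44")

print(decode_known_fields(old_message))  # {'name': 'Alice', 'id': 1234}
print(decode_known_fields(new_message))  # {'name': 'Alice', 'id': 1234}
```

Because every key carries a wire type, a reader can always tell how many bytes to skip, even for fields it has never heard of.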
Advanced Usage of Protocol Buffers
While our earlier example of Protocol Buffers demonstrated the core functionality, the power of Protobufs is in their extended features, including nested types, default values, and enumerations.
Nested Types
You can define Protobuf messages within other messages — effectively creating nested types. Here’s an example:
message Department {
  message Employee {
    required string name = 1;
    required int32 id = 2;
    optional string email = 3;
  }
  required string name = 1;
  repeated Employee employees = 2;
}
In this example, we have a Department message that has an Employee message nested within it. We can then have a repeated field of these Employee messages, allowing a single Department to have multiple Employee objects.
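On the wire, an embedded message is written as a length-delimited field whose payload is the encoded inner message, and a repeated field simply appears once per element. The following pure-Python sketch shows this for the Department message above. The helper names are invented for the example, the email field is left out for brevity, and this is not the generated code.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Key with wire type 2, then the payload length, then the payload."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(payload)) + payload


def encode_employee(name: str, employee_id: int) -> bytes:
    # name = field 1 (string), id = field 2 (varint, wire type 0)
    return (
        length_delimited(1, name.encode("utf-8"))
        + encode_varint((2 << 3) | 0)
        + encode_varint(employee_id)
    )


def encode_department(name: str, employees) -> bytes:
    out = length_delimited(1, name.encode("utf-8"))
    for emp_name, emp_id in employees:
        # Each element of the repeated field gets its own key (field 2)
        # wrapping a complete encoded Employee.
        out += length_delimited(2, encode_employee(emp_name, emp_id))
    return out


wire = encode_department("Eng", [("Alice", 1234), ("Bob", 7)])
print(wire.hex(" "))
print(len(wire))  # 26
```

With the generated Python classes you would not build these bytes yourself; the sketch is only meant to show what nesting and repetition amount to in the serialized form.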
Default Values
In proto2 syntax, you can set explicit default values for optional fields in Protobuf messages. The default is returned when the field is read and no value has been set on the message.
message Employee {
  required string name = 1;
  optional int32 age = 2 [default = 30];
}
In this example, if an Employee message is created and no value is set for the age field, it will default to 30.
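One subtlety worth seeing in code is the difference between a field that was never set and a field that was explicitly set to the default value. The class below is a hand-written stand-in, not generated Protobuf code: EmployeeSketch, has_field, and get are hypothetical names chosen to mimic, loosely, how a proto2 message distinguishes the two cases (the generated Python classes offer HasField for this purpose).

```python
class EmployeeSketch:
    """Hand-written stand-in for a proto2 message with a default value."""

    # Mirrors: optional int32 age = 2 [default = 30];
    DEFAULTS = {"age": 30}

    def __init__(self, **explicit_fields):
        # Only values the caller actually provided are stored.
        self._explicit = dict(explicit_fields)

    def has_field(self, name: str) -> bool:
        """True only if the field was explicitly set."""
        return name in self._explicit

    def get(self, name: str):
        """Return the explicit value, falling back to the declared default."""
        if name in self._explicit:
            return self._explicit[name]
        return self.DEFAULTS[name]


unset = EmployeeSketch(name="Alice")
print(unset.get("age"), unset.has_field("age"))        # 30 False

explicit = EmployeeSketch(name="Bob", age=30)
print(explicit.get("age"), explicit.has_field("age"))  # 30 True
```

Both objects report an age of 30, but only the second one actually carries the value.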
Enumerations
Protocol Buffers also support enumerated types — enums — which create a type with a predefined list of values. Here’s how you could define an enum in a Protobuf message:
message Employee {
  enum JobTitle {
    SOFTWARE_ENGINEER = 0;
    PRODUCT_MANAGER = 1;
    DATA_SCIENTIST = 2;
  }
  required string name = 1;
  required JobTitle title = 2;
}
In this example, JobTitle is an enumerated type that has three possible values. The title field in the Employee message must be one of these values.
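Enum values are plain integers under the hood, which is what gets written to the wire. A rough Python analogue of the JobTitle enum, using the standard library instead of generated code, might look like the sketch below. The parse_title helper is a hypothetical name for this example.

```python
from enum import IntEnum


class JobTitle(IntEnum):
    """Python-side mirror of the JobTitle enum defined above."""
    SOFTWARE_ENGINEER = 0
    PRODUCT_MANAGER = 1
    DATA_SCIENTIST = 2


def parse_title(number: int) -> JobTitle:
    """Map a number read from a message back to a named value."""
    try:
        return JobTitle(number)
    except ValueError:
        raise ValueError(f"{number} is not a defined JobTitle") from None


print(parse_title(2).name)            # DATA_SCIENTIST
print(int(JobTitle.PRODUCT_MANAGER))  # 1
```

Keeping the numbers stable matters for the same reason field numbers do: readers and writers agree on the integers, not on the names.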
Services in Protocol Buffers
In addition to messages, Protocol Buffers also allow you to define services, which are a collection of RPC (remote procedure call) endpoints that provide specific methods that can be implemented on a server and called on a client. Here’s an example of a service definition:
service EmployeeService {
  rpc GetEmployee (EmployeeRequest) returns (EmployeeResponse) {}
}
message EmployeeRequest {
  required int32 id = 1;
}
message EmployeeResponse {
  required Employee employee = 1;
}
In this example, EmployeeService provides a single RPC method GetEmployee, which takes an EmployeeRequest message and returns an EmployeeResponse message.
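To show the shape of such a service on the server side, here is a plain-Python stand-in. It is deliberately not generated code: RPC stubs for a service definition usually come from an RPC framework's protoc plugin (gRPC is a common choice), and the dataclasses below only imitate the request and response messages. The in-memory lookup and the LookupError behavior are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    name: str
    id: int
    email: Optional[str] = None


@dataclass
class EmployeeRequest:
    id: int


@dataclass
class EmployeeResponse:
    employee: Employee


class EmployeeService:
    """In-memory stand-in for a server implementing GetEmployee."""

    def __init__(self, employees):
        self._by_id = {employee.id: employee for employee in employees}

    def GetEmployee(self, request: EmployeeRequest) -> EmployeeResponse:
        # One request message in, one response message out.
        try:
            employee = self._by_id[request.id]
        except KeyError:
            raise LookupError(f"no employee with id {request.id}") from None
        return EmployeeResponse(employee=employee)


service = EmployeeService([Employee("Alice", 1234, "alice@example.com")])
response = service.GetEmployee(EmployeeRequest(id=1234))
print(response.employee.name)  # Alice
```

A real client would call the same method through a generated stub, with the framework handling serialization and the network transport.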
Conclusion
With features such as nested types, default values, enumerations, and services, Protocol Buffers offer rich constructs that make your data storage and inter-service communication more efficient and maintainable. While there can be a learning curve when initially working with Protocol Buffers, the performance, backward compatibility, and language interoperability benefits make them a solid choice for a wide range of applications.
Stay tuned, and happy coding!
All the best,
Luis Soares