Understanding Protocol Buffers with Practical Examples

Luis Soares

06 Jun 2023 — 4 min read

Protocol Buffers, often called Protobufs, are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data. They are used for data storage and for communication over the network, and they are typically more compact and faster to parse than text formats such as XML and JSON.

Unlike XML and JSON, which are human-readable, Protocol Buffers use a binary format, which is more compact and faster for a computer to read and write. Protobufs also come with an interface definition language (IDL) for defining the structure of the data to be serialized.

Protocol Buffer Basics

Protobufs are defined in .proto files, which specify the messages to be exchanged. Each message is a small logical record of information containing a series of name-value pairs. Here’s a simple example of how a Protobuf message could be defined:

syntax = "proto2";

message Employee {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}

In this example, the Employee message contains three fields: a string field named ‘name’, an integer field named ‘id’, and a string field named ‘email’. Each field is assigned a unique number, which identifies the field in the binary encoding of the message.
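To make the role of field numbers concrete, here is a minimal sketch that encodes an Employee by hand using only the Python standard library. The helper names (encode_varint, encode_tag, encode_employee) are made up for this illustration; in real code you would use the classes generated by protoc. It assumes the standard Protobuf wire format, where every field starts with a tag equal to (field_number << 3) | wire_type, and it only handles non-negative integers.

```python
import json

VARINT = 0            # wire type used for int32, int64, bool, enum
LENGTH_DELIMITED = 2  # wire type used for string, bytes, nested messages


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A tag packs the field number and the wire type into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_employee(name: str, emp_id: int, email: str) -> bytes:
    """Hand-rolled encoding of Employee (hypothetical helper, not generated code)."""
    name_bytes = name.encode("utf-8")
    email_bytes = email.encode("utf-8")
    return (
        encode_tag(1, LENGTH_DELIMITED) + encode_varint(len(name_bytes)) + name_bytes
        + encode_tag(2, VARINT) + encode_varint(emp_id)
        + encode_tag(3, LENGTH_DELIMITED) + encode_varint(len(email_bytes)) + email_bytes
    )


encoded = encode_employee("Alice", 1234, "alice@example.com")
as_json = json.dumps({"name": "Alice", "id": 1234, "email": "alice@example.com"})

print(encoded.hex())
print(len(encoded), "bytes as Protobuf,", len(as_json.encode("utf-8")), "bytes as JSON")
```

The first byte of the output, 0x0a, is the tag for field 1 with wire type 2. The field names ‘name’, ‘id’, and ‘email’ never appear in the encoded bytes, only their numbers do, which is a large part of why the encoding is smaller than the equivalent JSON.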

In proto2 syntax, which these examples use, each field carries one of three labels: ‘required’, ‘optional’, or ‘repeated’. A ‘required’ field must be set in every message of this type, ‘optional’ means that the field may or may not be set, and ‘repeated’ indicates that the field can appear any number of times (including zero). Note that proto3 syntax removed the ‘required’ label.

Compiling Protocol Buffers

The .proto files can be compiled into code for various languages using protoc, the Protocol Buffer compiler. For instance, to generate Python code, you would use the following command:

protoc --python_out=. employee.proto

This would output a Python file named employee_pb2.py, which contains generated code for creating, manipulating, and serializing your defined message.

Using Protocol Buffers in Your Code

Here’s how you could use the generated Python code to create an Employee message, set its fields, and then serialize the message to bytes:

import employee_pb2 
 
# Create an Employee message 
employee = employee_pb2.Employee() 
 
# Set the fields 
employee.name = "Alice" 
employee.id = 1234 
employee.email = "alice@example.com" 
 
# Serialize the message; despite its name, SerializeToString returns bytes in Python 3
serialized_employee = employee.SerializeToString()

You can also parse serialized data back into a Protobuf message like this:

import employee_pb2 
 
# Create a new Employee message 
employee = employee_pb2.Employee() 
 
# Parse the serialized data 
employee.ParseFromString(serialized_employee) 
 
# Now you can access the fields again 
print(employee.name)  # Prints: Alice

Benefits of Protocol Buffers

Efficiency: Protobufs are more compact and faster for a computer to parse than XML or JSON.
Backward-Forward Compatibility: You can add new fields to your message formats without breaking old programs.
Language Interoperability: Protobufs can be used in almost any language, including Python, Java, C++, Go, Ruby, and others.
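The compatibility claim rests on a simple rule: a reader skips any field number it does not recognize. The sketch below is a hand-written, deliberately incomplete decoder (it handles only the varint and length-delimited wire types), not the Protobuf library's implementation, and the function names are invented for this example. An "old" reader that knows only fields 1 and 2 is handed bytes that also contain a newer field 3.

```python
from typing import Dict, Tuple


def read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode(data: bytes, known_fields: Dict[int, str]) -> Dict[str, object]:
    """Decode a message, keeping only the field numbers this reader knows."""
    fields: Dict[str, object] = {}
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        name = known_fields.get(field_number)
        if name is not None:
            fields[name] = value
        # Otherwise the field is unknown to this reader and is skipped.
    return fields


# An Employee written by a newer program: name (1), id (2) and email (3).
new_message = bytes.fromhex("0a05416c69636510d2091a11") + b"alice@example.com"

old_reader = decode(new_message, {1: "name", 2: "id"})
new_reader = decode(new_message, {1: "name", 2: "id", 3: "email"})
print(old_reader)
print(new_reader)
```

The old reader still recovers name and id, because the wire type tells it how many bytes to skip for field 3. This only works as long as existing field numbers are never reused or renumbered, and new fields are not marked ‘required’. Real implementations typically also preserve unknown fields so they survive re-serialization; this sketch simply drops them.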

Advanced Usage of Protocol Buffers

While our earlier example of Protocol Buffers demonstrated the core functionality, the power of Protobufs is in their extended features, including nested types, default values, and enumerations.

Nested Types

You can define Protobuf messages within other messages — effectively creating nested types. Here’s an example:

message Department { 
  message Employee { 
    required string name = 1; 
    required int32 id = 2; 
    optional string email = 3; 
  } 
  required string name = 1; 
  repeated Employee employees = 2; 
}

In this example, we have a Department message that has an Employee message nested within it. We can then have a repeated field of these Employee messages, allowing a single Department to have multiple Employee objects.
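On the wire, a nested message is a length-delimited field whose payload is the encoded inner message, and a repeated field is the same tag appearing once per element. The following sketch encodes a Department with two employees by hand to show that layout. The helpers are hypothetical and stdlib-only, the optional email field is left out for brevity, and generated code would normally do all of this for you.

```python
from typing import List, Tuple


def varint(value: int) -> bytes:
    """Base-128 varint for a non-negative integer."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)
    return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Tag with wire type 2, then the payload length, then the payload."""
    return varint((field_number << 3) | 2) + varint(len(payload)) + payload


def varint_field(field_number: int, value: int) -> bytes:
    """Tag with wire type 0, then the value as a varint."""
    return varint((field_number << 3) | 0) + varint(value)


def encode_employee(name: str, emp_id: int) -> bytes:
    return length_delimited(1, name.encode("utf-8")) + varint_field(2, emp_id)


def encode_department(name: str, employees: List[Tuple[str, int]]) -> bytes:
    out = length_delimited(1, name.encode("utf-8"))
    for emp_name, emp_id in employees:
        # Repeated field 2: one tagged, length-prefixed Employee per element.
        out += length_delimited(2, encode_employee(emp_name, emp_id))
    return out


dept = encode_department("Engineering", [("Alice", 1234), ("Bob", 5678)])
print(dept.hex())
```

In the output, the byte 0x12 (field 2, wire type 2) introduces each employee, followed by the length of that employee's encoded bytes. An empty employees list would simply contribute no bytes at all.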

Default Values

In proto2 syntax, you can declare an explicit default value for an optional field. Reading the field returns this default whenever no value has been set on the message.

message Employee { 
  required string name = 1; 
  optional int32 age = 2 [default = 30]; 
}

In this example, if an Employee message is created and no value is set for the age field, reading age returns 30.

Enumerations

Protocol Buffers also support enumerated types — enums — which create a type with a predefined list of values. Here’s how you could define an enum in a Protobuf message:

message Employee { 
  enum JobTitle { 
    SOFTWARE_ENGINEER = 0; 
    PRODUCT_MANAGER = 1; 
    DATA_SCIENTIST = 2; 
  } 
  required string name = 1; 
  required JobTitle title = 2; 
}

In this example, JobTitle is an enumerated type that has three possible values. The title field in the Employee message must be one of these values.

Services in Protocol Buffers

In addition to messages, Protocol Buffers let you define services: collections of RPC (remote procedure call) methods that are implemented on a server and called from a client. Here’s an example of a service definition:

service EmployeeService { 
  rpc GetEmployee (EmployeeRequest) returns (EmployeeResponse) {} 
} 
 
message EmployeeRequest { 
  required int32 id = 1; 
} 
 
message EmployeeResponse { 
  required Employee employee = 1; 
}

In this example, EmployeeService provides a single RPC method GetEmployee, which takes an EmployeeRequest message and returns an EmployeeResponse message.

Conclusion

With features such as nested types, default values, enumerations, and services, Protocol Buffers offer rich constructs that make your data storage and inter-service communication more efficient and maintainable. While there can be a learning curve when initially working with Protocol Buffers, the performance, backward compatibility, and language interoperability benefits make them a solid choice for a wide range of applications.

Stay tuned, and happy coding!

All the best,

Luis Soares