
Google Releases New Version Of Protocol Buffers

by Dilip Krishnan on May 14, 2009. Estimated reading time: 2 minutes

Google has released a new version of protocol buffers, a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more. The changes in this release are outlined in the change notes.

Protocol buffers are a flexible, efficient, automated mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. You can even update your data structure without breaking deployed programs that are compiled against the "old" format.

According to the released documentation, the commonly available techniques for serializing objects across process or machine boundaries are:

  • Native serialization, where objects are serialized using the native implementation of the language being used, e.g. Java or C++
  • Custom serialization, where the data is written in a one-off serialization format
  • XML serialization, where the data is serialized to XML

Each of these approaches has its own set of problems. Native serialization requires the platforms at both ends of the serialization pipe to be the same in order to materialize the serialized objects. XML is known to be a verbose and inefficient serialization format. Custom serialization formats increase cost, because a one-off parser has to be developed for each of them.

Protocol buffers aim to be the flexible, efficient, automated solution to exactly this problem. With protocol buffers, you write a .proto description of the data structure you wish to store. From that, the protocol buffer compiler creates a class that implements automatic encoding and parsing of the protocol buffer data with an efficient binary format. The generated class provides getters and setters for the fields that make up a protocol buffer and takes care of the details of reading and writing the protocol buffer as a unit. Importantly, the protocol buffer format supports the idea of extending the format over time in such a way that the code can still read data encoded with the old format.

Protocol buffers support the following primitive data types, grouped here by how they are represented on the wire:

  • Base 128 varint representations - int32, int64, uint32, uint64, sint32, sint64, bool, enum (Varints are a method of serializing integers using one or more bytes. Smaller numbers take a smaller number of bytes.)
  • Fixed size 64 bit representations - fixed64, sfixed64, double
  • Length-delimited representations - string, bytes, embedded messages, packed repeated fields
  • Fixed size 32 bit representations - fixed32, sfixed32, float
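
To make the varint idea concrete, here is a minimal Python sketch of Base 128 varint encoding and decoding, plus the ZigZag mapping that the sint32 type uses so that small negative numbers also stay short. The function names are made up for illustration and are not part of the generated protocol buffer API; the sketch only handles non-negative inputs for the varint itself.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Base 128 varint (illustrative helper)."""
    if value < 0:
        raise ValueError("this sketch only encodes non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # groups arrive least significant first
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag32(n: int) -> int:
    """Map a signed 32-bit integer to an unsigned one: 0, -1, 1, -2 become 0, 1, 2, 3."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


if __name__ == "__main__":
    print(encode_varint(1).hex())    # one byte
    print(encode_varint(300).hex())  # two bytes
    print(zigzag32(-1), zigzag32(1))
```

Values below 128 fit in a single byte, which is why small field values and small field numbers keep encoded messages compact.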

 

The unit of serialization is a message, which can contain fields of the primitive data types or embedded messages. Protocol buffers support optional, required, and repeated fields. An address book message definition using protocol buffers looks like this:

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}

The features of the message definition language are described in the language guide. When a definition is compiled with the protocol buffer compiler, the generated encoders and parsers use protocol buffers' own efficient binary serialization format. The current release includes compilers and APIs for C++, Java, and Python. There are also community projects adding new language implementations to Protocol Buffers, including Perl, C#, and Ruby.
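
As a rough illustration of what the generated encoders and parsers do, the Python sketch below hand-encodes the name, id, and email fields of the Person message above and then reads them back with a parser that skips field numbers it does not know. This is not the code the compiler generates: every function name here is hypothetical, only varint and length-delimited fields are handled, and ids are assumed to be non-negative.

```python
VARINT, LENGTH_DELIMITED = 0, 2  # wire type numbers used in field keys


def encode_varint(value):
    """Base 128 varint for a non-negative integer."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data, pos):
    """Return (value, next position) for the varint starting at pos."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_key(field_number, wire_type):
    # Each field starts with a key: (field number << 3) | wire type.
    return encode_varint((field_number << 3) | wire_type)


def encode_string_field(field_number, text):
    payload = text.encode("utf-8")
    return encode_key(field_number, LENGTH_DELIMITED) + encode_varint(len(payload)) + payload


def encode_person(name, person_id, email=None):
    """Hand-written stand-in for a generated Person encoder (fields 1, 2, 3)."""
    out = encode_string_field(1, name)
    out += encode_key(2, VARINT) + encode_varint(person_id)
    if email is not None:  # optional field: simply omitted when unset
        out += encode_string_field(3, email)
    return out


def decode_person(data, known=(1, 2)):
    """Parse fields, keeping only known field numbers and skipping the rest."""
    fields, pos = {}, 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == VARINT:
            value, pos = decode_varint(data, pos)
        elif wire_type == LENGTH_DELIMITED:
            length, pos = decode_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not handled in this sketch")
        if field_number in known:
            fields[field_number] = value
    return fields


if __name__ == "__main__":
    newer = encode_person("Al", 5, "a@b.c")
    print(newer.hex())
    print(decode_person(newer))                    # reader that predates the email field
    print(decode_person(newer, known=(1, 2, 3)))   # reader that knows all three fields
```

Because every field carries its own number and wire type, a reader can step over fields it does not recognize, which is what lets message definitions grow over time without breaking deployed programs.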

 
