
BERT as Dynamic Alternative to Protocol Buffers/Thrift



Despite the prominence of XML for encoding data, there are many situations where its inefficiency is a problem, both in the time spent encoding and decoding and in the space used. Popular binary serialization formats that address this include the widely used ASN.1, Google's Protocol Buffers, and Facebook's Thrift.

A new format now powers GitHub's backend: BERT, created by Tom Preston-Werner, builds upon Erlang's External Term Format (ETF), which is used to encode Erlang terms for communication across nodes.

BERT extends ETF with complex data types such as dictionaries, time and regular expressions.
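As a rough illustration of how these types sit on top of ETF, the following Python sketch encodes a small subset of values. It assumes the ETF tag numbers from the Erlang documentation (131 for the version byte, 97/98 for integers, 100 for atoms, 104 for small tuples, 106/108 for lists, 109 for binaries) and the BERT convention of wrapping complex types in tuples that start with the atom `bert`. The `Atom` class and the `encode` function are hypothetical names for this sketch, not the API of any existing BERT library; times, regular expressions, floats, and big integers are left out.

```python
import struct


class Atom(str):
    """Hypothetical marker type for Erlang atoms (not from a real BERT library)."""


def _encode(term):
    # Atom is checked first; bool is checked before int because bool subclasses int.
    if isinstance(term, Atom):
        raw = term.encode("latin-1")
        return struct.pack(">BH", 100, len(raw)) + raw        # ATOM_EXT
    if term is None:
        return _encode((Atom("bert"), Atom("nil")))           # {bert, nil}
    if isinstance(term, bool):
        name = "true" if term else "false"
        return _encode((Atom("bert"), Atom(name)))            # {bert, true | false}
    if isinstance(term, int):
        if 0 <= term <= 255:
            return struct.pack(">BB", 97, term)               # SMALL_INTEGER_EXT
        if -2**31 <= term < 2**31:
            return struct.pack(">Bi", 98, term)               # INTEGER_EXT
        raise ValueError("big integers are outside this sketch")
    if isinstance(term, bytes):
        return struct.pack(">BI", 109, len(term)) + term      # BINARY_EXT
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are outside this sketch")
        body = b"".join(_encode(item) for item in term)
        return struct.pack(">BB", 104, len(term)) + body      # SMALL_TUPLE_EXT
    if isinstance(term, list):
        if not term:
            return bytes([106])                               # NIL_EXT (empty list)
        body = b"".join(_encode(item) for item in term)
        # LIST_EXT: element count, elements, then a NIL_EXT tail
        return struct.pack(">BI", 108, len(term)) + body + bytes([106])
    if isinstance(term, dict):
        # {bert, dict, [{Key, Value}, ...]}
        return _encode((Atom("bert"), Atom("dict"), list(term.items())))
    raise TypeError(f"unsupported type in this sketch: {type(term).__name__}")


def encode(term):
    """Prefix the encoded term with the ETF version byte (131)."""
    return bytes([131]) + _encode(term)
```

For example, `encode({Atom("name"): b"Tom"})` produces the bytes for the Erlang tuple `{bert, dict, [{name, <<"Tom">>}]}`. Text is passed as `bytes` here, following the common practice of sending strings as binaries.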

BERT differs from ASN.1 and Protocol Buffers in that it does not require a schema or IDL specification for its data formats. Tom Preston-Werner explains that this makes BERT a kind of binary version of the idea behind JSON:

I like JSON. I love the concept of extracting a subset of a language and using that to facilitate interprocess communication. This got me thinking about the work I’d done with Erlectricity. About two years ago I wrote a C extension for Erlectricity to speed up the deserialization of Erlang’s external term format.
[..] What if I extracted the generic parts of Erlang’s external term format and made that into a standard for interprocess communication? What if Erlang had the equivalent of JavaScript’s JSON? And what if an RPC protocol could be built on top of that format? What would those things look like and how simple could they be made?
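The schema-less property can be seen in how a receiver reads the data: every value carries its own type tag, so a decoder needs no IDL file or generated code. The sketch below is a minimal Python decoder for a few ETF tags, under the assumption that only small integers, integers, atoms, binaries, small tuples, and proper lists appear. The function names and the `("atom", name)` representation are choices made for this example, not part of any BERT implementation.

```python
import struct


def _decode_at(data, pos):
    """Decode one term starting at data[pos]; return (value, next_position)."""
    tag = data[pos]
    pos += 1
    if tag == 97:                                   # SMALL_INTEGER_EXT
        return data[pos], pos + 1
    if tag == 98:                                   # INTEGER_EXT
        return struct.unpack_from(">i", data, pos)[0], pos + 4
    if tag == 100:                                  # ATOM_EXT
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        name = data[pos:pos + length].decode("latin-1")
        return ("atom", name), pos + length
    if tag == 109:                                  # BINARY_EXT
        (length,) = struct.unpack_from(">I", data, pos)
        pos += 4
        return data[pos:pos + length], pos + length
    if tag == 104:                                  # SMALL_TUPLE_EXT
        arity = data[pos]
        pos += 1
        items = []
        for _ in range(arity):
            item, pos = _decode_at(data, pos)
            items.append(item)
        return tuple(items), pos
    if tag == 106:                                  # NIL_EXT (empty list)
        return [], pos
    if tag == 108:                                  # LIST_EXT
        (count,) = struct.unpack_from(">I", data, pos)
        pos += 4
        items = []
        for _ in range(count):
            item, pos = _decode_at(data, pos)
            items.append(item)
        tail, pos = _decode_at(data, pos)           # proper lists end with NIL_EXT
        if tail != []:
            raise ValueError("improper lists are outside this sketch")
        return items, pos
    raise ValueError(f"unsupported tag in this sketch: {tag}")


def decode(data):
    """Decode a complete term that starts with the ETF version byte (131)."""
    if not data or data[0] != 131:
        raise ValueError("missing ETF version byte")
    value, pos = _decode_at(data, 1)
    if pos != len(data):
        raise ValueError("trailing bytes after term")
    return value
```

Nothing in `decode` knows what message it is about to read; the structure is recovered from the tags alone, which is the same trade-off JSON makes in text form.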

BERT-RPC makes it possible to remotely call code hosted on a BERT-RPC server, using BERTs to encode the arguments and return values of the calls. Tom mentions a few of BERT-RPC's features:

- Synchronous and Asynchronous calls[..]
- Streaming (to and from)
- Caching directives
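To make the synchronous call concrete, here is a hedged Python sketch of how a request might be put on the wire. It assumes the request shape `{call, Module, Function, Arguments}` and the 4-byte big-endian length header that the BERT-RPC specification describes for its packets. The helper names, the `calc` module, and its `add` function are hypothetical, and no network code or response handling is shown.

```python
import io
import struct


def atom(name):
    raw = name.encode("latin-1")
    return struct.pack(">BH", 100, len(raw)) + raw            # ATOM_EXT


def small_int(value):
    return struct.pack(">BB", 97, value)                      # SMALL_INTEGER_EXT (0..255)


def small_tuple(*encoded_items):
    return struct.pack(">BB", 104, len(encoded_items)) + b"".join(encoded_items)


def proper_list(*encoded_items):
    if not encoded_items:
        return bytes([106])                                   # NIL_EXT
    body = b"".join(encoded_items)
    return struct.pack(">BI", 108, len(encoded_items)) + body + bytes([106])


def call_request(module, function, encoded_args):
    """Build a length-prefixed packet carrying {call, Module, Function, Args}."""
    term = small_tuple(atom("call"), atom(module), atom(function),
                       proper_list(*encoded_args))
    payload = bytes([131]) + term                             # ETF version byte + term
    return struct.pack(">I", len(payload)) + payload          # 4-byte length header


def read_packet(stream):
    """Read one length-prefixed packet body from a file-like object."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("stream ended before a full length header")
    (length,) = struct.unpack(">I", header)
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("stream ended before the full packet body")
    return body


# Hypothetical call: calc:add(1, 2)
packet = call_request("calc", "add", [small_int(1), small_int(2)])
body = read_packet(io.BytesIO(packet))
```

A server following the specification would answer a successful call with a packet whose body encodes `{reply, Result}`, framed the same way.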

Ruby code can be made available using a BERT-RPC server such as Ernie.

A specification for BERT and BERT-RPC exists. Besides the Ruby and Erlang implementations, BERT libraries are available for other languages, including JavaScript and Python.

Do you prefer a schema-less approach like BERT or one of the IDL-based options such as ASN.1 or Protocol Buffers?


Community comments

  • Avro

    by Fox Touche,

    It seems like Avro, at least on the surface, had/has similar goals (schema-less, compact binary, multi-language, etc):

    Also, this might be of interest (a benchmark of many various serialization options):

  • Re: Avro

    by Rhys Parsons,

    Just looked at Avro out of interest. It does have schemas, although the schemas don't have to be transmitted when using it for RPC if both ends already have the schemas.

  • Re: Avro

    by Fox Touche,

    Yes, I guess what I really meant was that the serialized data is self-describing and that a code generation phase is only optional (for optimization purposes). So in this sense, data can be serialized "dynamically", because the schemas themselves can be defined dynamically. At least from what I understand.

    Also, Protobuf messages can be dynamic (I believe there is a DynamicMessage or some equivalent and there are also "reflection" capabilities). But it is kind of painful to use (very verbose) and I think much of the speed is lost.
