BERT as Dynamic Alternative to Protocol Buffers/Thrift

by Werner Schuster on Nov 02, 2009

Despite the prominence of XML for encoding data, there are many situations where its inefficiency is a problem, both in the time spent encoding and decoding and in the space used. Popular binary serialization formats include the widely used ASN.1, Google's Protocol Buffers and Facebook's Thrift.

A new format now powers GitHub's backend: BERT, created by Tom Preston-Werner, builds upon Erlang's External Term Format (ETF), which is used to encode Erlang terms for communication across nodes.

BERT extends ETF with complex data types such as dictionaries, time and regular expressions.
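
As a rough illustration of how the complex types sit on top of ETF, the following Python sketch encodes a small subset of terms. It assumes the standard ETF tags (version byte 131, SMALL_INTEGER_EXT 97, INTEGER_EXT 98, ATOM_EXT 100, SMALL_TUPLE_EXT 104, NIL_EXT 106, LIST_EXT 108, BINARY_EXT 109) and the BERT convention of wrapping complex values in tuples that start with the atom bert, such as {bert, dict, Pairs}. The Atom class and the function names are hypothetical helpers for this example, not the API of any BERT library; floats, bignums, time and regular expressions are left out.

```python
import struct


class Atom(str):
    """Illustrative stand-in for an Erlang atom (not part of any BERT library)."""


def encode_term(term):
    """Encode one term, without the leading version byte."""
    if isinstance(term, bool):  # must precede int: bool is a subclass of int
        return encode_term((Atom("bert"), Atom("true" if term else "false")))
    if term is None:
        return encode_term((Atom("bert"), Atom("nil")))
    if isinstance(term, dict):
        # BERT complex type: {bert, dict, [{Key, Value}, ...]}
        return encode_term((Atom("bert"), Atom("dict"), list(term.items())))
    if isinstance(term, Atom):
        raw = term.encode("latin-1")
        return b"d" + struct.pack(">H", len(raw)) + raw  # ATOM_EXT (100)
    if isinstance(term, int):
        if 0 <= term <= 255:
            return b"a" + bytes([term])  # SMALL_INTEGER_EXT (97)
        if -2**31 <= term < 2**31:
            return b"b" + struct.pack(">i", term)  # INTEGER_EXT (98)
        raise ValueError("bignums are omitted from this sketch")
    if isinstance(term, bytes):
        return b"m" + struct.pack(">I", len(term)) + term  # BINARY_EXT (109)
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are omitted from this sketch")
        body = b"".join(encode_term(t) for t in term)
        return b"h" + bytes([len(term)]) + body  # SMALL_TUPLE_EXT (104)
    if isinstance(term, list):
        if not term:
            return b"j"  # NIL_EXT (106) represents the empty list
        body = b"".join(encode_term(t) for t in term)
        return b"l" + struct.pack(">I", len(term)) + body + b"j"  # LIST_EXT (108)
    raise TypeError(f"unsupported type in this sketch: {type(term).__name__}")


def encode(term):
    """Prefix the encoded term with the ETF version byte (131)."""
    return bytes([131]) + encode_term(term)


if __name__ == "__main__":
    # A plain ETF tuple, then a dictionary wrapped as a BERT complex type.
    print(encode((Atom("ok"), 1)))
    print(encode({Atom("name"): b"bert"}))
```

In this sketch a dictionary adds no new wire tag: it is an ordinary ETF tuple whose first element is the atom bert, so an existing ETF decoder can still read it.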

BERT differs from ASN.1 and Protocol Buffers in that it requires no schema or IDL specification for message formats. Tom Preston-Werner explains that this makes BERT a kind of binary version of the idea behind JSON:

I like JSON. I love the concept of extracting a subset of a language and using that to facilitate interprocess communication. This got me thinking about the work I’d done with Erlectricity. About two years ago I wrote a C extension for Erlectricity to speed up the deserialization of Erlang’s external term format.
[..] What if I extracted the generic parts of Erlang’s external term format and made that into a standard for interprocess communication? What if Erlang had the equivalent of JavaScript’s JSON? And what if an RPC protocol could be built on top of that format? What would those things look like and how simple could they be made?
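
The schema-less property can be illustrated with a small decoder: because every ETF value begins with a type tag, a receiver can rebuild the data without any IDL file or generated code. The Python sketch below handles only a handful of tags (97, 98, 100, 104, 106, 108, 109) and the BERT dict, true, false and nil wrappers; the Atom class and function names are hypothetical and chosen for this example only.

```python
import struct


class Atom(str):
    """Illustrative stand-in for an Erlang atom (not a library class)."""


def _decode_seq(buf, count):
    """Decode `count` consecutive terms and return them with the leftover bytes."""
    items = []
    for _ in range(count):
        item, buf = _decode_term(buf)
        items.append(item)
    return items, buf


def _from_bert_tuple(items):
    """Map {bert, ...} wrapper tuples to native values; other tuples pass through."""
    if len(items) >= 2 and isinstance(items[0], Atom) and items[0] == "bert":
        if items[1] == "dict":
            return dict(items[2])
        if items[1] == "true":
            return True
        if items[1] == "false":
            return False
        if items[1] == "nil":
            return None
    return tuple(items)


def _decode_term(buf):
    tag, body = buf[0], buf[1:]
    if tag == 97:  # SMALL_INTEGER_EXT
        return body[0], body[1:]
    if tag == 98:  # INTEGER_EXT, signed 32-bit big-endian
        return struct.unpack(">i", body[:4])[0], body[4:]
    if tag == 100:  # ATOM_EXT, 2-byte length then the name
        (n,) = struct.unpack(">H", body[:2])
        return Atom(body[2:2 + n].decode("latin-1")), body[2 + n:]
    if tag == 104:  # SMALL_TUPLE_EXT, 1-byte arity
        items, rest = _decode_seq(body[1:], body[0])
        return _from_bert_tuple(items), rest
    if tag == 106:  # NIL_EXT, the empty list
        return [], body
    if tag == 108:  # LIST_EXT, 4-byte length, elements, then the tail
        (n,) = struct.unpack(">I", body[:4])
        items, rest = _decode_seq(body[4:], n)
        tail, rest = _decode_term(rest)
        if tail != []:
            raise ValueError("improper lists are omitted from this sketch")
        return items, rest
    if tag == 109:  # BINARY_EXT, 4-byte length then raw bytes
        (n,) = struct.unpack(">I", body[:4])
        return body[4:4 + n], body[4 + n:]
    raise ValueError(f"unsupported tag in this sketch: {tag}")


def decode(data):
    """Decode a complete message that starts with the version byte 131."""
    if not data or data[0] != 131:
        raise ValueError("missing ETF version byte 131")
    term, rest = _decode_term(data[1:])
    if rest:
        raise ValueError("trailing bytes after term")
    return term


if __name__ == "__main__":
    print(decode(b"\x83h\x02d\x00\x02oka\x01"))
```

The decoder never consults a message definition; the trade-off, compared with IDL-based formats, is that type tags and atom names travel with every message.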

BERT-RPC makes it possible to remotely call code hosted on a BERT-RPC server, using BERTs to encode the arguments and return values of the calls. Tom mentions a few of BERT-RPC's features:

- Synchronous and Asynchronous calls[..]
- Streaming (to and from)
- Caching directives
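
A synchronous call can be sketched at the wire level as follows. The sketch assumes, based on the BERT-RPC specification, that a request is the tuple {call, Module, Function, Arguments}, that the server answers with {reply, Result}, and that each message travels as a BERP: a 4-byte big-endian length header followed by the BERT payload. The module name calc and function name add are made up for the example, and the payload is written out by hand so the block stays independent of any encoder.

```python
import io
import struct


def write_berp(stream, payload):
    """Write one packet: 4-byte big-endian length, then the BERT bytes."""
    stream.write(struct.pack(">I", len(payload)) + payload)


def read_exact(stream, n):
    """Read exactly n bytes, since sockets and pipes may return short reads."""
    chunks = []
    while n:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("stream closed in the middle of a packet")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_berp(stream):
    """Read one length-prefixed packet and return its BERT payload."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length)


# Hand-encoded BERT for {call, calc, add, [1, 2]} (hypothetical module/function).
CALL_CALC_ADD = (
    b"\x83"                 # ETF version byte 131
    b"h\x04"                # SMALL_TUPLE_EXT, arity 4
    b"d\x00\x04call"        # atom call
    b"d\x00\x04calc"        # atom calc
    b"d\x00\x03add"         # atom add
    b"l\x00\x00\x00\x02"    # LIST_EXT with 2 elements
    b"a\x01"                # small integer 1
    b"a\x02"                # small integer 2
    b"j"                    # NIL_EXT list tail
)

if __name__ == "__main__":
    # An in-memory buffer stands in for the TCP connection to the server.
    wire = io.BytesIO()
    write_berp(wire, CALL_CALC_ADD)
    wire.seek(0)
    assert read_berp(wire) == CALL_CALC_ADD
```

With a real server the stream would be a socket file object, and the client would call read_berp again to receive the reply packet.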

Ruby code can be made available using a BERT-RPC server such as Ernie.

A specification for BERT and BERT-RPC exists. Besides the Ruby and Erlang implementations, BERT libraries are available for other languages, including JavaScript and Python.

Do you prefer a schema-less approach like BERT or one of the IDL-based options such as ASN.1 or Protocol Buffers?

Avro by Fox Touche

It seems like Avro, at least on the surface, had/has similar goals (schema-less, compact binary, multi-language, etc):

hadoop.apache.org/avro/

Also, this might be of interest (a benchmark of various serialization options):

code.google.com/p/thrift-protobuf-compare/wiki/...

Re: Avro by Rhys Parsons

Just looked at Avro out of interest. It does have schemas, although the schemas don't have to be transmitted when using it for RPC if both ends already have the schemas.

Re: Avro by Fox Touche

Yes, I guess what I really meant was that the serialized data is self-describing and that a code generation phase is only optional (for optimization purposes). So in this sense, data can be serialized "dynamically", because the schemas themselves can be defined dynamically. At least from what I understand.

Also, Protobuf messages can be dynamic (I believe there is a DynamicMessage or some equivalent and there are also "reflection" capabilities). But it is kind of painful to use (very verbose) and I think much of the speed is lost.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc. InfoQ.com hosted at Contegix, the best ISP we've ever worked with.