InfoQ

BERT as Dynamic Alternative to Protocol Buffers/Thrift

Posted by Werner Schuster on Nov 02, 2009

Sections: Architecture & Design, Development, Enterprise Architecture
Topics: Dynamic Languages, Performance & Scalability, Architecture, Ruby, SOA
Tags: github, Erlang

Despite the prominence of XML for encoding data, its inefficiency is a problem in many situations, both in encoding/decoding time and in the space used. Popular binary serialization formats include the widely used ASN.1, Google's Protocol Buffers and Facebook's Thrift.

A new format now powers GitHub's backend: BERT, created by Tom Preston-Werner, builds upon Erlang's External Term Format (ETF), which is used to encode Erlang terms for communication across nodes.
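
To give a feel for the underlying format, here is a minimal sketch of an encoder for a small subset of ETF. The tag numbers (131 for the version byte, 97, 98, 100, 104, 106, 108, 109) are taken from Erlang's external term format documentation; the `Atom` class and the function names are hypothetical helpers made up for this illustration, not part of any BERT library.

```python
import struct

VERSION = 131  # magic byte that starts every encoded term


class Atom(str):
    """Hypothetical marker type: a Python string standing for an Erlang atom."""


def encode_term(term):
    """Encode one term without the leading version byte."""
    if isinstance(term, Atom):
        name = term.encode("latin-1")
        return bytes([100]) + struct.pack(">H", len(name)) + name   # ATOM_EXT
    if isinstance(term, bool):
        raise TypeError("booleans are not covered by this subset")
    if isinstance(term, int):
        if 0 <= term <= 255:
            return bytes([97, term])                                # SMALL_INTEGER_EXT
        if -(2 ** 31) <= term < 2 ** 31:
            return bytes([98]) + struct.pack(">i", term)            # INTEGER_EXT
        raise ValueError("big integers are left out of this sketch")
    if isinstance(term, bytes):
        return bytes([109]) + struct.pack(">I", len(term)) + term   # BINARY_EXT
    if isinstance(term, tuple):
        if len(term) > 255:
            raise ValueError("large tuples are left out of this sketch")
        body = b"".join(encode_term(item) for item in term)
        return bytes([104, len(term)]) + body                       # SMALL_TUPLE_EXT
    if isinstance(term, list):
        if not term:
            return bytes([106])                                     # NIL_EXT (empty list)
        body = b"".join(encode_term(item) for item in term)
        # LIST_EXT: 4-byte length, the elements, then the tail (NIL_EXT here)
        return bytes([108]) + struct.pack(">I", len(term)) + body + bytes([106])
    raise TypeError(f"unsupported type: {type(term).__name__}")


def encode(term):
    """Encode a term including the version byte."""
    return bytes([VERSION]) + encode_term(term)
```

For example, `encode((Atom("ok"), 1))` yields the ten bytes 131, 104, 2, 100, 0, 2, 111, 107, 97, 1: the version byte, a 2-tuple, the two-character atom `ok`, and the small integer 1.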

BERT extends ETF with complex data types such as dictionaries, time and regular expressions.
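
As I read the BERT 1.0 specification, these complex types are not new wire tags but ordinary tuples whose first element is the atom `bert`, for instance `{bert, dict, [{Key, Value}]}` or `{bert, time, Megaseconds, Seconds, Microseconds}`. The sketch below shows that mapping for a few Python values; the `Atom` class and the `to_simple` function are hypothetical names, regular expressions are left out, and turning Python strings into binaries is an assumption of this example.

```python
from datetime import datetime, timezone


class Atom(str):
    """Hypothetical marker type: a Python string standing for an Erlang atom."""


BERT = Atom("bert")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_simple(value):
    """Rewrite complex Python values as plain tuples tagged with the atom bert."""
    if value is None:
        return (BERT, Atom("nil"))
    if isinstance(value, bool):
        return (BERT, Atom("true" if value else "false"))
    if isinstance(value, dict):
        pairs = [(to_simple(k), to_simple(v)) for k, v in value.items()]
        return (BERT, Atom("dict"), pairs)
    if isinstance(value, datetime):
        # Expects a timezone-aware datetime; a naive one raises TypeError here.
        delta = value - EPOCH
        whole_seconds = delta.days * 86400 + delta.seconds
        return (BERT, Atom("time"),
                whole_seconds // 1_000_000,   # megaseconds
                whole_seconds % 1_000_000,    # seconds
                delta.microseconds)           # microseconds
    if isinstance(value, Atom):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")          # assumption: strings become binaries
    if isinstance(value, (list, tuple)):
        return type(value)(to_simple(item) for item in value)
    return value                              # integers, bytes, ... pass through
```

The result contains only atoms, integers, binaries, tuples and lists, so it can be handed to any plain ETF encoder.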

Unlike ASN.1 or Protocol Buffers, BERT does not require a schema or IDL specification for its data. Tom Preston-Werner explains that this makes BERT a kind of binary version of the idea behind JSON:
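
The schema-less property follows from the encoding itself: every value starts with a type tag, so a receiver can rebuild the term without knowing its shape in advance. A minimal decoder sketch for a subset of the External Term Format illustrates this; the tag numbers come from Erlang's documentation, while `Atom`, `decode` and `_decode_term` are hypothetical names chosen for this example.

```python
import struct


class Atom(str):
    """Hypothetical marker type: a Python string standing for an Erlang atom."""


def decode(data):
    """Decode one complete term, including the leading version byte."""
    if not data or data[0] != 131:
        raise ValueError("missing version byte 131")
    term, rest = _decode_term(data[1:])
    if rest:
        raise ValueError("trailing bytes after term")
    return term


def _decode_term(buf):
    """Return (term, remaining_bytes); the tag byte alone decides the type."""
    tag, buf = buf[0], buf[1:]
    if tag == 97:                                   # SMALL_INTEGER_EXT
        return buf[0], buf[1:]
    if tag == 98:                                   # INTEGER_EXT
        return struct.unpack(">i", buf[:4])[0], buf[4:]
    if tag == 100:                                  # ATOM_EXT
        (size,) = struct.unpack(">H", buf[:2])
        return Atom(buf[2:2 + size].decode("latin-1")), buf[2 + size:]
    if tag == 109:                                  # BINARY_EXT
        (size,) = struct.unpack(">I", buf[:4])
        return bytes(buf[4:4 + size]), buf[4 + size:]
    if tag == 104:                                  # SMALL_TUPLE_EXT
        arity, buf = buf[0], buf[1:]
        items = []
        for _ in range(arity):
            item, buf = _decode_term(buf)
            items.append(item)
        return tuple(items), buf
    if tag == 106:                                  # NIL_EXT (empty list)
        return [], buf
    if tag == 108:                                  # LIST_EXT
        (size,) = struct.unpack(">I", buf[:4])
        buf = buf[4:]
        items = []
        for _ in range(size):
            item, buf = _decode_term(buf)
            items.append(item)
        tail, buf = _decode_term(buf)
        if tail != []:
            raise ValueError("improper lists are left out of this sketch")
        return items, buf
    raise ValueError(f"unsupported tag: {tag}")
```

No message definition is consulted anywhere: the same function decodes `{ok, 1}`, a list of integers, or any nesting of the supported types.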

I like JSON. I love the concept of extracting a subset of a language and using that to facilitate interprocess communication. This got me thinking about the work I’d done with Erlectricity. About two years ago I wrote a C extension for Erlectricity to speed up the deserialization of Erlang’s external term format.
[..] What if I extracted the generic parts of Erlang’s external term format and made that into a standard for interprocess communication? What if Erlang had the equivalent of JavaScript’s JSON? And what if an RPC protocol could be built on top of that format? What would those things look like and how simple could they be made?

BERT-RPC makes it possible to remotely call code hosted on a BERT-RPC server, using BERTs to encode the arguments and return values of the calls. Tom mentions a few of BERT-RPC's features:

- Synchronous and Asynchronous calls[..]
- Streaming (to and from)
- Caching directives
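
On the wire, as I understand the BERT-RPC 1.0 specification, a synchronous request is the term `{call, Module, Function, Arguments}` answered by `{reply, Result}`, while an asynchronous request uses `cast` in place of `call` and is acknowledged with `{noreply}`. Each term travels as a BERP: a 4-byte big-endian length header followed by the BERT bytes. The sketch below shows only that framing; `write_berp` and `read_berp` are hypothetical names, and an in-memory buffer stands in for a socket.

```python
import io
import struct


def write_berp(stream, bert_bytes):
    """Write one BERP: 4-byte big-endian length, then the encoded term."""
    stream.write(struct.pack(">I", len(bert_bytes)) + bert_bytes)


def read_berp(stream):
    """Read one BERP and return the encoded term it carries."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("stream ended inside the length header")
    (length,) = struct.unpack(">I", header)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside the payload")
    return payload


# The term {noreply}, encoded by hand: version byte 131, a 1-tuple (104, 1),
# and the seven-character atom "noreply" (100, two length bytes, the name).
NOREPLY = bytes([131, 104, 1, 100, 0, 7]) + b"noreply"

# An in-memory buffer stands in for a network connection in this sketch.
buffer = io.BytesIO()
write_berp(buffer, NOREPLY)
buffer.seek(0)
received = read_berp(buffer)
```

Because the length comes first, a reader always knows how many bytes to wait for before handing the payload to a decoder.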

Ruby code can be made available using a BERT-RPC server such as Ernie.

A specification for BERT and BERT-RPC exists. Besides the Ruby and Erlang implementations, BERT libraries are available for other languages, including JavaScript and Python.

Do you prefer a schema-less approach like BERT or one of the IDL-based options such as ASN.1 or Protocol Buffers?

  1. Avro

    by Fox Touche

    It seems like Avro, at least on the surface, had/has similar goals (schema-less, compact binary, multi-language, etc):

    hadoop.apache.org/avro/

    Also, this might be of interest (a benchmark of various serialization options):

    code.google.com/p/thrift-protobuf-compare/wiki/...

  2. Re: Avro

    by Rhys Parsons

    Just looked at Avro out of interest. It does have schemas, although the schemas don't have to be transmitted when using it for RPC if both ends already have the schemas.

  3. Re: Avro

    by Fox Touche

    Yes, I guess what I really meant was that the serialized data is self-describing and that a code generation phase is only optional (for optimization purposes). So in this sense, data can be serialized "dynamically", because the schemas themselves can be defined dynamically. At least from what I understand.

    Also, Protobuf messages can be dynamic (I believe there is a DynamicMessage or some equivalent and there are also "reflection" capabilities). But it is kind of painful to use (very verbose) and I think much of the speed is lost.
