BERT as Dynamic Alternative to Protocol Buffers/Thrift
Despite the prominence of XML for encoding data, there are many situations where its inefficiency is a problem, both in the time spent encoding and decoding and in the space used. Popular binary serialization formats include the widely used ASN.1, Google's Protocol Buffers and Facebook's Thrift.
A new format now powers GitHub's backend: BERT, created by Tom Preston-Werner, builds upon Erlang's External Term Format (ETF), which is used to encode Erlang terms for communication across nodes.
BERT extends ETF with complex data types such as dictionaries, time and regular expressions.
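To make the layering concrete, here is a minimal Python sketch of how a few terms could be serialized in ETF, with BERT's nil, booleans and dictionaries expressed as tagged tuples such as {bert, dict, Pairs}, as the BERT specification describes. The `Atom` class and the `encode` and `encode_term` helpers are hypothetical names invented for this illustration, not part of any BERT library, and only a small subset of types is handled.

```python
import struct


class Atom(str):
    """Marks a string that should be encoded as an Erlang atom (hypothetical helper)."""


def encode_term(term):
    """Encode one value as an ETF term, without the leading version byte."""
    if isinstance(term, Atom):
        raw = term.encode("latin-1")
        return b"d" + struct.pack(">H", len(raw)) + raw            # ATOM_EXT (100)
    if term is None:
        return encode_term((Atom("bert"), Atom("nil")))            # BERT nil
    if isinstance(term, bool):                                     # check before int
        return encode_term((Atom("bert"), Atom("true" if term else "false")))
    if isinstance(term, int):
        if 0 <= term <= 255:
            return b"a" + bytes([term])                            # SMALL_INTEGER_EXT (97)
        return b"b" + struct.pack(">i", term)                      # INTEGER_EXT (98), 32-bit only
    if isinstance(term, bytes):
        return b"m" + struct.pack(">I", len(term)) + term          # BINARY_EXT (109)
    if isinstance(term, tuple):
        body = b"".join(encode_term(t) for t in term)
        return b"h" + bytes([len(term)]) + body                    # SMALL_TUPLE_EXT (104)
    if isinstance(term, list):
        if not term:
            return b"j"                                            # NIL_EXT (106), empty list
        body = b"".join(encode_term(t) for t in term)
        return b"l" + struct.pack(">I", len(term)) + body + b"j"   # LIST_EXT (108)
    if isinstance(term, dict):
        # BERT dictionary: {bert, dict, [{Key, Value}, ...]}
        pairs = [(k, v) for k, v in term.items()]
        return encode_term((Atom("bert"), Atom("dict"), pairs))
    raise TypeError(f"unsupported type in this sketch: {type(term).__name__}")


def encode(term):
    """Prefix the encoded term with the ETF version byte (131)."""
    return bytes([131]) + encode_term(term)


# The Erlang tuple {ok, 1}
print(list(encode((Atom("ok"), 1))))
# A BERT dictionary with one entry, i.e. {bert, dict, [{name, <<"Tom">>}]}
print(encode({Atom("name"): b"Tom"}))
```

Note that nothing in the byte stream refers to an external schema: every term carries its own type tag, which is what lets a receiver decode it without an IDL.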
BERT differs from ASN.1 and Protocol Buffers in that it requires no schema or IDL specification for its data formats. Tom Preston-Werner explains that this makes BERT a kind of binary version of the idea behind JSON:
I like JSON. I love the concept of extracting a subset of a language and using that to facilitate interprocess communication. This got me thinking about the work I’d done with Erlectricity. About two years ago I wrote a C extension for Erlectricity to speed up the deserialization of Erlang’s external term format.
BERT-RPC allows code hosted on a BERT-RPC server to be called remotely, using BERTs to encode the arguments and return values of the calls. Tom mentions a few of BERT-RPC's features:
- Synchronous and Asynchronous calls[..]
- Streaming (to and from)
- Caching directives
Ruby code can be made available using a BERT-RPC server such as Ernie.
Do you prefer a schema-less approach like BERT or one of the IDL-based options such as ASN.1 or Protocol Buffers?
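As a rough illustration of what travels over the wire, the BERT-RPC specification describes a synchronous call as the tuple {call, Module, Function, Arguments}, sent as a BERP: a 4-byte big-endian length header followed by the BERT itself. The Python sketch below frames and unframes such a packet using an in-memory stream. The module and function names (`calc`, `add`) and the helpers `atom`, `write_berp` and `read_berp` are assumptions made up for this example, and the request bytes are assembled by hand rather than by a real BERT library.

```python
import io
import struct


def atom(name):
    """Hand-encode an atom as ETF ATOM_EXT (tag 100, 2-byte length, name)."""
    raw = name.encode("latin-1")
    return b"d" + struct.pack(">H", len(raw)) + raw


# {call, calc, add, [1, 2]} with hypothetical module `calc` and function `add`
CALL_REQUEST = (
    bytes([131])                                  # ETF version byte
    + b"h" + bytes([4])                           # SMALL_TUPLE_EXT with 4 elements
    + atom("call") + atom("calc") + atom("add")
    + b"l" + struct.pack(">I", 2)                 # LIST_EXT with 2 elements
    + b"a" + bytes([1]) + b"a" + bytes([2])       # SMALL_INTEGER_EXT 1 and 2
    + b"j"                                        # NIL_EXT list tail
)


def write_berp(stream, payload):
    """Write one BERP: 4-byte big-endian length, then the BERT payload."""
    stream.write(struct.pack(">I", len(payload)) + payload)


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError if the stream ends early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a BERP")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_berp(stream):
    """Read one BERP and return its BERT payload."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length)


# Round trip through an in-memory buffer standing in for a TCP socket
buf = io.BytesIO()
write_berp(buf, CALL_REQUEST)
buf.seek(0)
print(read_berp(buf) == CALL_REQUEST)
```

With a real server, the same framing would be applied to a socket file object, and the reply would come back as another BERP carrying a tuple such as {reply, Result}.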
Also, this might be of interest (a benchmark of various serialization options):
Also, Protobuf messages can be dynamic (I believe there is a DynamicMessage or some equivalent and there are also "reflection" capabilities). But it is kind of painful to use (very verbose) and I think much of the speed is lost.