Cassandra Gets Atomic Batches, Virtual Nodes, CQL Improvements
The new features include:
- Atomic batches for better transactional integrity, even if the coordinator fails mid-batch.
- Virtual nodes (vnodes) allow better control over clustering. You can also upgrade an existing node to vnodes.
- CQL3, with several improvements such as a new native binary protocol, support for collection types, and a system keyspace.
- Faster serialization through use of binary format instead of JSON
- Request Tracing
- Several performance improvements
- Faster startup times
- Ability to choose a policy to determine what happens to a node on disk failure
- Concurrent Schema changes by multiple users, including creating and dropping tables
Collection types are especially useful for simplifying data models in a natural way, since CQL doesn't support JOINs. Atomic batches can be used to avoid having to program for retries and idempotent writes. However, they come with an approximately 30% performance hit, and can be turned off if performance is more important.
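As a rough illustration of the two features above, the following Python sketch builds the CQL3 text for a batch that also updates a collection column. It does not connect to a cluster. The `users` table, its columns, and the `build_batch` helper are hypothetical names chosen for this example. The sketch assumes the CQL3 batch syntax in which `BEGIN BATCH` is atomic by default and `BEGIN UNLOGGED BATCH` opts out of atomicity in exchange for speed.

```python
def build_batch(statements, logged=True):
    """Wrap CQL statements in a BATCH block (hypothetical helper).

    logged=True  -> BEGIN BATCH (atomic by default)
    logged=False -> BEGIN UNLOGGED BATCH (gives up atomicity to avoid its overhead)
    """
    if not statements:
        raise ValueError("a batch needs at least one statement")
    header = "BEGIN BATCH" if logged else "BEGIN UNLOGGED BATCH"
    # Normalize each statement so it ends with exactly one semicolon.
    body = "\n".join("  " + s.strip().rstrip(";") + ";" for s in statements)
    return header + "\n" + body + "\nAPPLY BATCH;"


# Hypothetical schema: a set<text> column keeps a user's e-mail addresses
# in the same row, so no JOIN against a separate table is needed.
CREATE_USERS = (
    "CREATE TABLE users ("
    "user_id text PRIMARY KEY, name text, emails set<text>)"
)

statements = [
    "INSERT INTO users (user_id, name) VALUES ('jsmith', 'John Smith')",
    # Adding an element to a set collection.
    "UPDATE users SET emails = emails + {'jsmith@example.com'} "
    "WHERE user_id = 'jsmith'",
]

atomic_batch = build_batch(statements)                  # atomic, slower
unlogged_batch = build_batch(statements, logged=False)  # faster, not atomic

print(atomic_batch)
print(unlogged_batch)
```

The generated strings could then be passed to whichever CQL client is in use. Whether the unlogged form is acceptable depends on whether the application can tolerate a partially applied batch.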
Cassandra 1.2 is designed to handle several terabytes of data per node, compared to the limit of up to 500 GB of disk space per node recommended in 1.0.
Cassandra is an open source, columnar, distributed NoSQL database written in Java. It was originally developed by Facebook to power their inbox search, and it became an Apache project in 2009. You can read more about Cassandra on InfoQ or refer to the updated official documentation.