Last month DataStax announced the release of Astra, a cloud-native Database-as-a-Service (DBaaS) built on the Apache Cassandra database.
Astra, built on technologies like Kubernetes, Prometheus, and Envoy, provides cloud-native support for the Google Kubernetes Engine (GKE) and Amazon Elastic Kubernetes Service (EKS) management planes.
Astra provides the following components to run a cloud-native service with Apache Cassandra:
- Gateway: Traffic coming into Astra clusters is managed by the gateway, and Envoy proxies route the requests. This reduces the number of open ports exposed and enables Astra to be elastic and secure. The gateway is also used to expose REST and GraphQL APIs for interacting with data assets.
- Operations: Astra automates cluster operations with two components: a Kubernetes operator called Cass Operator and a Kubernetes sidecar called Management API for Apache Cassandra (MAAC).
- Deployment: Astra achieves continuous deployment using the Kubernetes operator. It uses cass-config-builder to drive configuration and NoSQLBench for continuous testing of the cloud environment.
- Metrics Collector: Astra uses the Metrics Collector for Apache Cassandra (MCAC) to provide real-time system health metrics, including database latency and throughput.
Select components of Astra will be open sourced in the future under the CNCF, and a few other components will be open sourced under Apache. Astra also offers a free tier for learning purposes.
InfoQ spoke with Matt Kennedy, senior director of cloud solutions at DataStax, and Chris Bradford, product manager at DataStax, about the Astra product features and how it helps the developer community.
InfoQ: What tools does the DataStax Astra platform include?
Matt Kennedy and Chris Bradford: The most obvious tools you’ll see immediately when you start to use Astra are the built-in CQLsh console and the Developer Studio. Those are two different flavors of the same concept, an interactive CQL executor. Developer Studio takes a notebook approach similar to Jupyter, and the CQLsh console is the same shell you’d get with a download of Cassandra, but conveniently run from your browser window.
InfoQ: Can you discuss how the new DataStax Kubernetes Operator for Apache Cassandra works?
Kennedy and Bradford: The DataStax Kubernetes Operator for Apache Cassandra, cass-operator, plays a key role in reducing the operational tasks surrounding the creation and management of Cassandra clusters. Traditionally, to succeed with Cassandra, users must understand the entire software stack from the operating system, through dependencies, to calling the right binary and following strict run books for operations.
With cass-operator, users focus on the logical data plane topology instead. They define a datacenter with failure domains and a size. From there, the operator takes care of creating the underlying Kubernetes (k8s) resources like Pods, Services, and StatefulSets. Once these components are in place, the operator communicates with each Pod via the Management API to retrieve vital information surrounding the pod's health and ability to process queries. From there, it can make intelligent decisions surrounding the overarching operations in the cluster. This may include remediation of worker failures or deploying configuration changes in a rolling, non-disruptive, fashion.
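As a rough illustration of what "a datacenter with failure domains and a size" might look like, the sketch below builds a manifest shaped like a cass-operator `CassandraDatacenter` resource. The field names (`clusterName`, `serverType`, `serverVersion`, `size`, `racks`) and the `v1beta1` API version reflect our understanding of the operator's custom resource and should be checked against the cass-operator documentation. The helper `build_datacenter_manifest`, the example names, and the server version are hypothetical, and a real resource would also need storage settings that are omitted here.

```python
import json


def build_datacenter_manifest(name, cluster_name, size, racks,
                              server_version="3.11.6"):
    """Return a dict shaped like a CassandraDatacenter custom resource.

    Hypothetical helper: field names are assumptions based on the
    cass-operator CRD and are not verified against a live cluster.
    """
    if size < 1:
        raise ValueError("size must be a positive node count")
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",
        "kind": "CassandraDatacenter",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster_name,
            "serverType": "cassandra",
            "serverVersion": server_version,
            # Total number of Cassandra nodes in the datacenter.
            "size": size,
            # Each rack stands for one failure domain.
            "racks": [{"name": rack} for rack in racks],
        },
    }


manifest = build_datacenter_manifest(
    name="dc1",
    cluster_name="demo",
    size=3,
    racks=["rack1", "rack2", "rack3"],
)
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML. The point of the sketch is that the user states only the logical topology; the Pods, Services, and StatefulSets are left to the operator.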
InfoQ: What role does the DataStax control plane play in the Kubernetes platform? How does this compare with traditional Cassandra installations?
Kennedy and Bradford: Cassandra has traditionally filled this role of a highly performant, scalable, database outside of Kubernetes. Now we're bringing this data layer into the k8s world and have made it trivial to run at the same time. With previous Cassandra installations, there are entire teams dedicated to managing clusters. Whether it's through monitoring and the development of automation or coordinating data movement across infrastructures, there are numerous players involved to keep everything running smoothly. The DataStax operator strives to free up those human resources, allowing them to focus on the higher-level tasks involved instead of the tedium of making sure all the configuration files are the same on every node.
There are still all the advantages of traditional Cassandra deployments -- linear scalability, fault tolerance, and wicked performance. Now there is less work involved in managing the stack. For instance, when a node goes down, there's no paging or alerts being fired to signal that human operators need to be involved. Instead, the operator attempts to restart the process and if it's unsuccessful or there are underlying issues, then the node is moved to another k8s worker. All without needing to disturb the human operations team.
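The restart-then-relocate behavior described above can be modeled as a small decision function. This is a toy sketch and not cass-operator's actual code: `NodeState`, `remediate`, the `try_restart` callback, and the `max_restarts` limit are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    name: str
    worker: str
    healthy: bool
    restart_attempts: int = 0


def remediate(node, try_restart, spare_workers, max_restarts=1):
    """Decide how to recover an unhealthy node (toy model).

    Returns one of "none", "restarted", "rescheduled", "escalate".
    """
    if node.healthy:
        return "none"
    # First, try restarting the process in place.
    while node.restart_attempts < max_restarts:
        node.restart_attempts += 1
        if try_restart(node):
            node.healthy = True
            return "restarted"
    # Restart failed: move the node to another worker if one is free.
    if spare_workers:
        node.worker = spare_workers.pop(0)
        node.healthy = True  # assumed to recover on the new worker
        node.restart_attempts = 0
        return "rescheduled"
    # Nothing left to try automatically; a human needs to look.
    return "escalate"


flaky = NodeState(name="demo-dc1-rack1-0", worker="worker-a", healthy=False)
outcome = remediate(flaky, try_restart=lambda n: False,
                    spare_workers=["worker-b"])
print(outcome, flaky.worker)
```

In this model a human is involved only on the `"escalate"` path, which corresponds to the claim that most node failures no longer need to page the operations team.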
InfoQ: How can developers use Guardrails to ensure the safety of the code they are creating?
Kennedy and Bradford: Guardrails help developers get feedback more immediately when they violate best practices. Rather than waiting until a behavior manifests itself in a runtime problem, Guardrails will manifest as errors during development and testing. One really practical way that all Cassandra developers can benefit from them now, even before we've had a chance to contribute them to open-source, is to use Astra as a development platform. Then, you'll know that the code you developed against Astra will avoid anti-patterns even if you eventually run that code on a different variant of Cassandra.
For more information on the Astra product, check out the documentation and the Getting Started web page.
DataStax also recently announced an AIOps product called Vector that proactively monitors the health of Apache Cassandra clusters. Vector continually assesses the behavior of a Cassandra cluster to provide developers and operators with automated diagnostics and advice, helping them be consistently successful with Cassandra and DataStax Enterprise (DSE) clusters.