Following up on the release of a Kubernetes (K8s) operator for Cassandra last spring, DataStax is unveiling a full-blown open source distribution of Cassandra built for Kubernetes. It expands on the original operator with preconfigured tooling and guides for deploying and configuring Cassandra on Kubernetes clusters, and it leverages several open source projects for observability. It’s the latest move by DataStax to jumpstart cloud-native development in the Apache open source project.
Announced at the virtual edition of KubeCon this week, K8ssandra (pronounced “Kay-sandra”) starts with the original DataStax Cass-operator. It also includes some tooling whose naming suggests that at least some members of the project have a sense of humor. The tools include Cassandra Reaper, which is not the personification of death, but rather, a very un-grim garbage collector or defragger for disk, cleaning up committed or overridden write-ahead logs. Cassandra Reaper was developed by Cassandra consultancy the Last Pickle before DataStax acquired the company last March. And then there is Cassandra Medusa, which is not a Greek mythological monster, but a modest utility for backing up and restoring data.
Additionally, K8ssandra builds on the open source Prometheus for metrics collection and Grafana for visualization, with both of them preconfigured for collecting specific metrics and providing some jumpstart dashboards. Rounding out the distro are Helm charts for guiding database administrators and Site Reliability Engineers in setting up and operating Cassandra clusters within a Kubernetes environment.
Much of the content for the distro, from the preconfiguration of Prometheus and Grafana to the best practices outlined in the Helm charts, has come straight out of DataStax’s experience running Astra, its managed DataStax Enterprise cloud service. For instance, Prometheus was preconfigured to pick up specific key metrics from Cassandra, and Grafana was packaged with several prebuilt dashboards. Customers, of course, can extend and customize from these jumpstarts, or develop their own integrations.
More to the point, K8ssandra is part of DataStax’s strategy to make the runtime align with the open source community. While DataStax no longer controls the Apache Cassandra project, it has within the past couple of years redoubled efforts to realign with the community. The K8s operator that is the linchpin of the distro was introduced to the community as an open source project on GitHub last spring. As we noted a few months back, while job one for the community right now is getting Cassandra 4.0 out the door (it is still in beta), the group is starting to consider how and whether to put cloud-native into the project mainstream.
So, the DataStax operator is not currently part of the core Apache project; it was introduced to the community, some of whose members had previously developed their own operators. It’s notable that in the press release announcing K8ssandra, Orange, which developed one of the operators, endorsed DataStax’s move. “I’m happy to see K8ssandra will expand what we are doing as a community to make Cassandra the standard for data on Kubernetes,” said Franck Dehay, Software Engineer at Orange.