Deploy a Multi-Node Apache Cassandra Cluster on Kubernetes with the K8ssandra Operator

Apache Cassandra is a massively scalable, distributed NoSQL database built to manage huge data sets spread across many standard servers. Because of its distributed design, it removes single points of failure and supports easy horizontal growth. Cassandra performs especially well for write-intensive use cases and delivers strong write and read throughput, which makes it a solid choice for data-heavy applications. It also offers tunable consistency, so you can align consistency behavior with your application’s requirements.

K8ssandra is an open-source project that makes deploying and operating Apache Cassandra on Kubernetes much easier. It ships with the K8ssandra Operator, which automates key operations like provisioning clusters, scaling, handling backups, and running repairs.

This guide walks through deploying a multi-node Apache Cassandra cluster on a Kubernetes Engine cluster using the K8ssandra Operator.

Prerequisites

Before starting, make sure you:

  • Have access to a Kubernetes Engine cluster with a minimum of 4 nodes and at least 4 GB RAM per node.
  • Have an Ubuntu server with a non-root sudo user available to act as the management workstation.
  • Have kubectl installed on your workstation.
  • Have the Helm package manager installed on your workstation.
  • Have your cluster's kubeconfig file, and have configured kubectl on your workstation to connect to the cluster.

Install Cassandra CLI (cqlsh)

The Cassandra CLI called cqlsh is a Python-based command-line tool that lets you communicate with Cassandra databases. Use the steps below to install it.

Update the server package index

$ sudo apt update

Install Python and Pip

$ sudo apt install -y python3 python3-pip python3.12-venv

Check the installed Python version

$ python3 --version

Your output should show Python 3.12 or newer.

Check the installed Pip version

$ pip3 --version

Your output should resemble the following:

pip 24.2 from /usr/lib/python3/dist-packages/pip (python 3.12)

Create a Python virtual environment

$ python3 -m venv cassandra

Activate the Python virtual environment

$ source cassandra/bin/activate

Install the latest cqlsh CLI

Install the newest version of the cqlsh command-line interface from PyPI:

$ pip install cqlsh

Verify the installed cqlsh version

$ cqlsh --version

The output shows the installed cqlsh version.

Install Cert-Manager

Cert-Manager is a Kubernetes operator responsible for issuing and managing TLS/SSL certificates inside a cluster from trusted authorities like Let’s Encrypt. K8ssandra relies on cert-manager to automate certificate handling for Cassandra clusters. This also covers generating the Java keystores and truststores that Cassandra needs from the provided certificates. Follow the instructions below to install the cert-manager components required by the K8ssandra Operator.

Add the Cert-Manager Helm repository

Using Helm, add the Cert-Manager repository to your local chart sources.

$ helm repo add jetstack https://charts.jetstack.io

Update the Helm chart index

$ helm repo update

Install Cert-Manager in your cluster

Install Cert-Manager into your Kubernetes Engine cluster.

$ helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true

Confirm Cert-Manager resources are running

After a successful install, confirm that all Cert-Manager resources are present in the cluster.

$ kubectl get all -n cert-manager

Your output should look similar to the example below:

NAME                                           READY   STATUS    RESTARTS   AGE
pod/cert-manager-cainjector-686546c9f7-m9gp7   1/1     Running   0          43s
pod/cert-manager-d6746cf45-sjjs6               1/1     Running   0          43s
...
NAME                              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)            AGE
service/cert-manager              ClusterIP   10.110.17.176   <none>        9402/TCP           44s
...
NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cert-manager              1/1     1            1           43s
...
NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/cert-manager-cainjector-686546c9f7   1         1         1       43s
...

Install the K8ssandra Operator

To operate Apache Cassandra clusters on Kubernetes, install the K8ssandra Operator with Helm. Follow these steps to deploy the operator.

Add the K8ssandra Helm repository

Add the K8ssandra operator repository to your Helm sources.

$ helm repo add k8ssandra https://helm.k8ssandra.io/stable

Install the K8ssandra operator

Deploy the K8ssandra operator into your cluster.

$ helm install k8ssandra-operator k8ssandra/k8ssandra-operator \
  --namespace k8ssandra-operator \
  --create-namespace

Verify the operator deployment

After a few minutes, review the deployment to confirm the operator is available.

$ kubectl -n k8ssandra-operator get deployment

Your output should resemble the following:

NAME                               READY   UP-TO-DATE   AVAILABLE   AGE
k8ssandra-operator                 1/1     1            1           20s
k8ssandra-operator-cass-operator   1/1     1            1           20s

Check that operator pods are running

Ensure the K8ssandra operator pods are ready and in a running state.

$ kubectl get pods -n k8ssandra-operator

Your output should resemble the following:

NAME                                                READY   STATUS    RESTARTS   AGE
k8ssandra-operator-65b9c7c9c-km28b                  1/1     Running   0          46s
k8ssandra-operator-cass-operator-54845bc4f6-hsqds   1/1     Running   0          46s

Set Up a Multi-Node Apache Cassandra Cluster on Kubernetes Engine

Deploy a highly available Cassandra cluster on Kubernetes Engine with the K8ssandra Operator. Use the steps below to create and bring up an Apache Cassandra cluster.

Check StorageClasses in Your Cluster

First, list the StorageClasses that are available in your cluster.

$ kubectl get storageclass

Your output should look similar to the example below:

NAME                             PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
centron-block-storage (default)  block.csi.centron.de  Delete          Immediate           true                   6m24s
centron-block-storage-hdd        block.csi.centron.de  Delete          Immediate           true                   6m25s
centron-block-storage-hdd-retain block.csi.centron.de  Retain          Immediate           true                   6m25s
...

Create the Cassandra Cluster Manifest

Using a text editor such as nano, create a new manifest file named cluster.yaml.

$ nano cluster.yaml

Add the following content to the file. Replace centron-block-storage with the StorageClass that exists in your cluster (as shown in the previous step).

apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: centron-block-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M

Save and close the file.

This manifest specifies the Cassandra cluster with these settings:

  • Cassandra version: 4.0.1
  • Three Cassandra nodes (pods) in the dc1 datacenter.
  • The centron-block-storage StorageClass with a 10 Gi persistent volume for each node.
  • Cassandra JVM heap size per node set to 512 MB.
  • One Stargate node with a JVM heap size of 256 MB.

Note
If you are running this on centron’s Kubernetes Engine, ensure the centron CSI driver is configured with the required credentials before you deploy the cluster. This allows dynamic provisioning of block storage volumes. Check the centron CSI documentation for complete setup details.

Apply the Cluster Deployment

Apply the manifest in your cluster.

$ kubectl apply -n k8ssandra-operator -f cluster.yaml

Allow at least 15 minutes, then watch the cluster pods.

$ kubectl get pods -n k8ssandra-operator --watch

Confirm that all pods are ready and running, similar to the output below:

NAME                                                    READY   STATUS    RESTARTS   AGE
demo-dc1-default-stargate-deployment-64747477d7-hfck9   1/1     Running   0          78s
demo-dc1-default-sts-0                                  2/2     Running   0          6m5s
demo-dc1-default-sts-1                                  2/2     Running   0          6m5s
demo-dc1-default-sts-2                                  2/2     Running   0          6m5s
k8ssandra-operator-65b9c7c9c-km28b                      1/1     Running   0          17m
k8ssandra-operator-cass-operator-54845bc4f6-hsqds       1/1     Running   0          17m

Once all Cassandra database pods are healthy, Stargate pods start to initialize. Stargate acts as a data gateway that provides REST, GraphQL, and Document APIs in front of Cassandra. The Stargate pod name should follow a pattern like: demo-dc1-default-stargate-deployment-xxxxxxxxx-xxxxx.

Verify the Linked Block Storage PVCs for Cassandra Cluster Persistence

The K8ssandra Operator deploys Cassandra pods as StatefulSets to keep stable network identities and persistent storage. Each StatefulSet corresponds to a Cassandra node and uses a Persistent Volume Claim (PVC) to store data reliably. This section shows how to confirm that the PVCs are created through the Block Storage service in your Kubernetes Engine cluster.
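
As an illustration of the naming involved (these names follow from the demo cluster and dc1 datacenter defined earlier), each data PVC combines the volume name server-data with the StatefulSet pod name and its ordinal:

```shell
# PVC names follow <volume-name>-<statefulset-name>-<ordinal>.
# For the three-node dc1 datacenter of the demo cluster:
sts="demo-dc1-default-sts"
for i in 0 1 2; do
  echo "server-data-${sts}-${i}"
done
# -> server-data-demo-dc1-default-sts-0
# -> server-data-demo-dc1-default-sts-1
# -> server-data-demo-dc1-default-sts-2
```

These are the PVC names you should see in the verification steps below.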

Check StatefulSet Readiness

Verify that the StatefulSets are present and ready.

$ kubectl get statefulset -n k8ssandra-operator

Your output should look similar to the example below:

NAME                   READY   AGE
demo-dc1-default-sts   3/3     7m14s

This output confirms that all three Cassandra nodes are up and the StatefulSet is running.

Verify the StorageClass

Check the StorageClass that backs your cluster volumes.

$ kubectl get sc centron-block-storage

Your output should look similar to the example below:

NAME                             PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
centron-block-storage (default)  block.csi.centron.de  Delete          Immediate           true                   24m

List PVCs Across Namespaces

List all PVCs in all namespaces to ensure they are bound correctly.

$ kubectl get pvc --all-namespaces

Your output should look similar to the example below:

NAMESPACE            NAME                                 STATUS   VOLUME                 CAPACITY   ACCESS MODES   STORAGECLASS           VOLUMEATTRIBUTESCLASS   AGE
k8ssandra-operator   server-data-demo-dc1-default-sts-0   Bound    pvc-a62852bae9d24dad   10Gi       RWO            centron-block-storage                   13m
k8ssandra-operator   server-data-demo-dc1-default-sts-1   Bound    pvc-cfb279b19d0c4a55   10Gi       RWO            centron-block-storage                   13m
k8ssandra-operator   server-data-demo-dc1-default-sts-2   Bound    pvc-b09e184f4d7741f6   10Gi       RWO            centron-block-storage                   13m

Create a Kubernetes Service To Access the Cassandra Cluster

To expose the Cassandra cluster externally and allow connections through the native Cassandra protocol (CQL) on port 9042, you need a Kubernetes Service of type LoadBalancer. This assigns a public IP through load balancer integration.

Create the Service Manifest

Create a new service resource file named service.yaml.

$ nano service.yaml

Add the following content to the file.

apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - port: 9042
      targetPort: 9042
  selector:
    app.kubernetes.io/name: cassandra

Save and close the file.

This configuration defines a LoadBalancer service that exposes the Cassandra cluster on port 9042. Setting externalTrafficPolicy: Local routes external traffic only to nodes that run a matching pod and preserves the client source IP.

Apply the Service

Apply the service in the k8ssandra-operator namespace.

$ kubectl apply -n k8ssandra-operator -f service.yaml

Wait about 5 minutes for the LoadBalancer to be created, then check the Cassandra service.

$ kubectl get svc/cassandra -n k8ssandra-operator

Check the EXTERNAL-IP column and note the IP address for accessing the cluster.

NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
cassandra   LoadBalancer   10.103.92.169   192.0.2.1     9042:32444/TCP   2m12s

Note
If EXTERNAL-IP shows <pending>, wait a little longer and run the command again. After an IP is assigned, use it to connect to Cassandra with a CQL client like cqlsh.

Test the Apache Cassandra Cluster

cqlsh is a command-line tool used to connect to Cassandra. The steps below show how to run CQL (Cassandra Query Language) commands and perform actions such as creating, updating, and querying data.

Export the LoadBalancer IP

Store your Cassandra service LoadBalancer IP in the CASS_IP variable.

$ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")

Print the assigned Cassandra IP.

$ echo $CASS_IP
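
For reference, the jsonpath expression above walks the Service object's status fields. The same lookup, sketched with Python's standard json module against a hard-coded sample object rather than a live cluster:

```shell
# Sample Service object, trimmed to the fields the jsonpath expression reads.
cat > /tmp/svc.json <<'EOF'
{"status": {"loadBalancer": {"ingress": [{"ip": "192.0.2.1"}]}}}
EOF
# Equivalent of jsonpath {.status.loadBalancer.ingress[*].ip}
python3 -c "
import json
svc = json.load(open('/tmp/svc.json'))
print(' '.join(entry['ip'] for entry in svc['status']['loadBalancer']['ingress']))
"
# -> 192.0.2.1
```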

Export the Cluster Username

Export the cluster access username into the CASS_USERNAME variable.

$ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)

Print the Cassandra username.

$ echo $CASS_USERNAME

Export the Cluster Password

Export the cluster password into the CASS_PASSWORD variable.

$ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)

Print the Cassandra password.

$ echo $CASS_PASSWORD
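
The export commands above read base64-encoded values from the demo-superuser Secret, which is why each one pipes through base64 --decode. The decoding step in isolation (the encoded string below is a sample value, not your real credential):

```shell
# Kubernetes stores Secret data base64-encoded; decode it to recover the
# plain-text value. "ZGVtby1zdXBlcnVzZXI=" is a sample, not a real credential.
encoded="ZGVtby1zdXBlcnVzZXI="
echo "$encoded" | base64 --decode
# -> demo-superuser
```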

Log In with cqlsh

Use cqlsh to connect to the Cassandra cluster with your variable values.

$ cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_IP 9042

Create a Keyspace

Create a new keyspace named demo.

demo-superuser@cqlsh> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

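As a side note on the replication factor chosen here: with replication_factor 3, a QUORUM read or write must be acknowledged by floor(RF / 2) + 1 = 2 replicas, so the three-node cluster stays available for QUORUM operations with one node down. A quick arithmetic check in shell:

```shell
# Quorum size for a given replication factor: floor(RF / 2) + 1
for rf in 1 3 5; do
  echo "RF=$rf quorum=$(( rf / 2 + 1 ))"
done
# -> RF=1 quorum=1
# -> RF=3 quorum=2
# -> RF=5 quorum=3
```
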
Create a Table

Create a table called users inside the demo keyspace.

demo-superuser@cqlsh> CREATE TABLE demo.users (id text primary key, name text, country text);

Insert Records

Insert sample records into the users table.

demo-superuser@cqlsh> BEGIN BATCH
                         INSERT INTO demo.users (id, name, country) VALUES ('42', 'John Doe', 'UK');
                         INSERT INTO demo.users (id, name, country) VALUES ('43', 'Joe Smith', 'US');
                      APPLY BATCH;

Query the Data

Query the table to display the stored values.

demo-superuser@cqlsh> SELECT * FROM demo.users;

Your output should be similar to the following:

 id | country | name
----+---------+-----------
 43 |      US | Joe Smith
 42 |      UK |  John Doe

(2 rows)
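
Because the demo keyspace uses replication_factor 3, you can also experiment with Cassandra's tunable consistency from the same cqlsh session. A hypothetical continuation, raising the consistency level to QUORUM (two of the three replicas must respond) before reading:

demo-superuser@cqlsh> CONSISTENCY QUORUM;
Consistency level set to QUORUM.
demo-superuser@cqlsh> SELECT * FROM demo.users WHERE id = '42';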

Conclusion

You have set up an Apache Cassandra cluster on a Kubernetes Engine environment using the open-source K8ssandra Operator. You enabled persistent storage with Block Storage and connected to the Cassandra cluster using the cqlsh CLI. For deeper configuration options and advanced use cases, consult the official Cassandra documentation.

Source: vultr.com
