Deploy a Multi-Node Apache Cassandra Cluster on Kubernetes with the K8ssandra Operator
Apache Cassandra is a massively scalable, distributed NoSQL database built to manage huge data sets spread across many commodity servers. Its distributed design removes single points of failure and makes horizontal scaling straightforward. Cassandra excels at write-intensive workloads while still delivering strong read throughput, which makes it a solid choice for data-heavy applications. It also offers tunable consistency, so you can align consistency behavior with your application’s requirements.
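Tunable consistency means each read or write can choose how many replicas must acknowledge it. A common rule of thumb is that reads are guaranteed to see the latest write when the read and write replica counts overlap, that is, when R + W exceeds the replication factor. The short Python sketch below is purely illustrative (it is not part of the deployment steps) and demonstrates the arithmetic:

```python
# Illustrative sketch of Cassandra's tunable consistency arithmetic.
# Not part of the deployment; it only demonstrates the replica math.

def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge an operation at QUORUM consistency."""
    return replication_factor // 2 + 1

def is_strongly_consistent(reads: int, writes: int, replication_factor: int) -> bool:
    """Reads see the latest write when read and write replica sets overlap."""
    return reads + writes > replication_factor

# With replication_factor=3, as used later in this guide:
print(quorum(3))                        # 2 replicas form a quorum
print(is_strongly_consistent(2, 2, 3))  # QUORUM reads + QUORUM writes -> True
print(is_strongly_consistent(1, 1, 3))  # ONE reads + ONE writes -> False
```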
K8ssandra is an open-source project that makes deploying and operating Apache Cassandra on Kubernetes much easier. It ships with the K8ssandra Operator, which automates key operations like provisioning clusters, scaling, handling backups, and running repairs.
This guide walks through deploying a multi-node Apache Cassandra cluster on a Kubernetes Engine cluster using the K8ssandra Operator.
Prerequisites
Before starting, make sure you:
- Have access to a Kubernetes Engine cluster with at least 4 nodes and a minimum of 4 GB of RAM per node.
- Have access to an Ubuntu server as a non-root user with sudo privileges to use as the management workstation.
- Have kubectl installed on your workstation.
- Have the Helm package manager installed on your workstation.
- Have your cluster's kubeconfig file and have configured kubectl on your workstation to connect to the cluster.
Install Cassandra CLI (cqlsh)
cqlsh, the Cassandra command-line shell, is a Python-based tool that lets you communicate with Cassandra databases. Use the steps below to install it.
Update the server package index
$ sudo apt update
Install Python and Pip
$ sudo apt install -y python3 python3-pip python3.12-venv
Check the installed Python version
$ python3 --version
Your output should resemble the following:
Python 3.12.7
Check the installed Pip version
$ pip --version
Your output should resemble the following:
pip 24.2 from /usr/lib/python3/dist-packages/pip (python 3.12)
Create a Python virtual environment
$ python3 -m venv cassandra
Activate the Python virtual environment
$ source cassandra/bin/activate
Install the latest cqlsh CLI
Install the newest version of the cqlsh command-line interface:
$ pip install -U cqlsh
Verify the installed cqlsh version
$ cqlsh --version
Your output should resemble the following:
cqlsh 6.2.0
Install Cert-Manager
Cert-Manager is a Kubernetes operator that issues and manages TLS/SSL certificates inside a cluster, including certificates from trusted authorities such as Let’s Encrypt. K8ssandra relies on cert-manager to automate certificate handling for Cassandra clusters, including generating the Java keystores and truststores that Cassandra needs from the provided certificates. Follow the instructions below to install the cert-manager components required by the K8ssandra Operator.
Add the Cert-Manager Helm repository
Using Helm, add the Cert-Manager repository to your local chart sources.
$ helm repo add jetstack https://charts.jetstack.io
Update the Helm chart index
$ helm repo update
Install Cert-Manager in your cluster
Install Cert-Manager into your Kubernetes Engine cluster.
$ helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--set crds.enabled=true
Confirm Cert-Manager resources are running
After a successful install, confirm that all Cert-Manager resources are present in the cluster.
$ kubectl get all -n cert-manager
Your output should look similar to the example below:
NAME READY STATUS RESTARTS AGE
pod/cert-manager-cainjector-686546c9f7-m9gp7 1/1 Running 0 43s
pod/cert-manager-d6746cf45-sjjs6 1/1 Running 0 43s
...
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/cert-manager ClusterIP 10.110.17.176 <none> 9402/TCP 44s
...
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cert-manager 1/1 1 1 43s
...
NAME DESIRED CURRENT READY AGE
replicaset.apps/cert-manager-cainjector-686546c9f7 1 1 1 43s
...
Install the K8ssandra Operator
To operate Apache Cassandra clusters on Kubernetes, install the K8ssandra Operator with Helm. Follow these steps to deploy the operator.
Add the K8ssandra Helm repository
Add the K8ssandra operator repository to your Helm sources.
$ helm repo add k8ssandra https://helm.k8ssandra.io/stable
Install the K8ssandra operator
Deploy the K8ssandra operator into your cluster.
$ helm install k8ssandra-operator k8ssandra/k8ssandra-operator \
--namespace k8ssandra-operator \
--create-namespace
Verify the operator deployment
After a few minutes, review the deployment to confirm the operator is available.
$ kubectl -n k8ssandra-operator get deployment
Your output should resemble the following:
NAME READY UP-TO-DATE AVAILABLE AGE
k8ssandra-operator 1/1 1 1 20s
k8ssandra-operator-cass-operator 1/1 1 1 20s
Check that operator pods are running
Ensure the K8ssandra operator pods are ready and in a running state.
$ kubectl get pods -n k8ssandra-operator
Your output should resemble the following:
NAME READY STATUS RESTARTS AGE
k8ssandra-operator-65b9c7c9c-km28b 1/1 Running 0 46s
k8ssandra-operator-cass-operator-54845bc4f6-hsqds 1/1 Running 0 46s
Set Up a Multi-Node Apache Cassandra Cluster on Kubernetes Engine
Deploy a highly available Cassandra cluster on Kubernetes Engine with the K8ssandra Operator. Use the steps below to create and bring up an Apache Cassandra cluster.
Check StorageClasses in Your Cluster
First, list the StorageClasses that are available in your cluster.
$ kubectl get storageclass
Your output should look similar to the example below:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
centron-block-storage (default) block.csi.centron.de Delete Immediate true 6m24s
centron-block-storage-hdd block.csi.centron.de Delete Immediate true 6m25s
centron-block-storage-hdd-retain block.csi.centron.de Retain Immediate true 6m25s
...
Create the Cassandra Cluster Manifest
Using a text editor such as nano, create a new manifest file named cluster.yaml.
$ nano cluster.yaml
Add the following content to the file. Replace centron-block-storage with the StorageClass that exists in your cluster (as shown in the previous step).
apiVersion: k8ssandra.io/v1alpha1
kind: K8ssandraCluster
metadata:
  name: demo
spec:
  cassandra:
    serverVersion: "4.0.1"
    datacenters:
      - metadata:
          name: dc1
        size: 3
        storageConfig:
          cassandraDataVolumeClaimSpec:
            storageClassName: centron-block-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
        config:
          jvmOptions:
            heapSize: 512M
        stargate:
          size: 1
          heapSize: 256M
Save and close the file.
This manifest specifies the Cassandra cluster with these settings:
- Cassandra version: 4.0.1
- Three Cassandra nodes (size: 3) in a single datacenter named dc1.
- The centron-block-storage StorageClass with a 10 Gi persistent volume for each node.
- Cassandra JVM heap size per node set to 512 MB.
- One Stargate node with a JVM heap size of 256 MB.
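Before deploying, it can help to tally what the manifest asks of the cluster. The sketch below uses plain Python with the numbers from the manifest above; note that JVM heap is only part of each pod's actual memory footprint, so treat the totals as lower bounds:

```python
# Rough capacity check for the manifest above: three Cassandra nodes with
# 10 Gi storage and a 512 MiB heap each, plus one Stargate node at 256 MiB.
nodes = 3
storage_gi_per_node = 10
cassandra_heap_mib = 512
stargate_heap_mib = 256

total_storage_gi = nodes * storage_gi_per_node
total_heap_mib = nodes * cassandra_heap_mib + stargate_heap_mib

print(total_storage_gi)  # 30 Gi of block storage requested in total
print(total_heap_mib)    # 1792 MiB of JVM heap across all pods
```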
Note
If you are running this on centron’s Kubernetes Engine, ensure the centron CSI driver is configured with the required credentials before you deploy the cluster. This allows dynamic provisioning of block storage volumes. Check the centron CSI documentation for complete setup details.
Apply the Cluster Deployment
Apply the manifest in your cluster.
$ kubectl apply -n k8ssandra-operator -f cluster.yaml
Allow at least 15 minutes, then watch the cluster pods.
$ kubectl get pods -n k8ssandra-operator --watch
Confirm that all pods are ready and running, similar to the output below:
NAME READY STATUS RESTARTS AGE
demo-dc1-default-stargate-deployment-64747477d7-hfck9 1/1 Running 0 78s
demo-dc1-default-sts-0 2/2 Running 0 6m5s
demo-dc1-default-sts-1 2/2 Running 0 6m5s
demo-dc1-default-sts-2 2/2 Running 0 6m5s
k8ssandra-operator-65b9c7c9c-km28b 1/1 Running 0 17m
k8ssandra-operator-cass-operator-54845bc4f6-hsqds 1/1 Running 0 17m
Once all Cassandra database pods are healthy, Stargate pods start to initialize. Stargate acts as a data gateway that provides REST, GraphQL, and Document APIs in front of Cassandra. The Stargate pod name should follow a pattern like: demo-dc1-default-stargate-deployment-xxxxxxxxx-xxxxx.
Verify the Linked Block Storage PVCs for Cassandra Cluster Persistence
The K8ssandra Operator deploys Cassandra pods as StatefulSets to keep stable network identities and persistent storage. Each StatefulSet corresponds to a Cassandra node and uses a Persistent Volume Claim (PVC) to store data reliably. This section shows how to confirm that the PVCs are created through the Block Storage service in your Kubernetes Engine cluster.
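Kubernetes names each StatefulSet replica's PVC by combining the volume claim template name, the StatefulSet name, and the replica's ordinal. The sketch below shows the pattern; the template name server-data is what the operator uses in this guide's output, but verify the exact names against `kubectl get pvc` in your own cluster:

```python
# Kubernetes StatefulSets name each replica's PVC as
# <volume-claim-template>-<statefulset-name>-<ordinal>.
# Values below match this guide's cluster; confirm with `kubectl get pvc`.
claim_template = "server-data"
statefulset = "demo-dc1-default-sts"
replicas = 3

pvc_names = [f"{claim_template}-{statefulset}-{i}" for i in range(replicas)]
for name in pvc_names:
    print(name)
# server-data-demo-dc1-default-sts-0
# server-data-demo-dc1-default-sts-1
# server-data-demo-dc1-default-sts-2
```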
Check StatefulSet Readiness
Verify that the StatefulSets are present and ready.
$ kubectl get statefulset -n k8ssandra-operator
Your output should look similar to the example below:
NAME READY AGE
demo-dc1-default-sts 3/3 7m14s
This output confirms that all three Cassandra nodes are up and the StatefulSet is running.
Verify the StorageClass
Check the StorageClass that backs your cluster volumes.
$ kubectl get sc centron-block-storage
Your output should look similar to the example below:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
centron-block-storage (default) block.csi.centron.de Delete Immediate true 24m
List PVCs Across Namespaces
List all PVCs in all namespaces to ensure they are bound correctly.
$ kubectl get pvc --all-namespaces
Your output should look similar to the example below:
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
k8ssandra-operator server-data-demo-dc1-default-sts-0 Bound pvc-a62852bae9d24dad 10Gi RWO centron-block-storage 13m
k8ssandra-operator server-data-demo-dc1-default-sts-1 Bound pvc-cfb279b19d0c4a55 10Gi RWO centron-block-storage 13m
k8ssandra-operator server-data-demo-dc1-default-sts-2 Bound pvc-b09e184f4d7741f6 10Gi RWO centron-block-storage 13m
Create a Kubernetes Service To Access the Cassandra Cluster
To expose the Cassandra cluster externally and allow connections through the native Cassandra protocol (CQL) on port 9042, you need a Kubernetes Service of type LoadBalancer. This assigns a public IP through load balancer integration.
Create the Service Manifest
Create a new service resource file named service.yaml.
$ nano service.yaml
Add the following content to the file.
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - port: 9042
      targetPort: 9042
  selector:
    app.kubernetes.io/name: cassandra
Save and close the file.
This configuration defines a LoadBalancer service that exposes the Cassandra cluster on port 9042.
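The service routes traffic by label selection: any pod whose labels contain every key/value pair in the service's selector receives traffic. Conceptually this is a subset check, sketched below (illustrative only; the pod label sets are example values, not a live query of your cluster):

```python
# Conceptual sketch of Kubernetes label selection: a Service targets every
# pod whose labels include all of the selector's key/value pairs.
def matches(selector: dict, pod_labels: dict) -> bool:
    return all(pod_labels.get(k) == v for k, v in selector.items())

selector = {"app.kubernetes.io/name": "cassandra"}

# Example label sets (illustrative; real pods carry many more labels):
cassandra_pod = {
    "app.kubernetes.io/name": "cassandra",
    "statefulset.kubernetes.io/pod-name": "demo-dc1-default-sts-0",
}
operator_pod = {"app.kubernetes.io/name": "k8ssandra-operator"}

print(matches(selector, cassandra_pod))  # True: receives CQL traffic
print(matches(selector, operator_pod))   # False: not selected
```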
Apply the Service
Apply the service in the k8ssandra-operator namespace.
$ kubectl apply -n k8ssandra-operator -f service.yaml
Wait about 5 minutes for the LoadBalancer to be created, then check the Cassandra service.
$ kubectl get svc/cassandra -n k8ssandra-operator
Check the EXTERNAL-IP column and note the IP address for accessing the cluster.
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cassandra LoadBalancer 10.103.92.169 192.0.2.1 9042:32444/TCP 2m12s
Note
If EXTERNAL-IP shows <pending>, wait a little longer and run the command again. After an IP is assigned, use it to connect to Cassandra with a CQL client like cqlsh.
Test the Apache Cassandra Cluster
cqlsh is a command-line tool used to connect to Cassandra. The steps below show how to run CQL (Cassandra Query Language) commands and perform actions such as creating, updating, and querying data.
Export the LoadBalancer IP
Store your Cassandra service LoadBalancer IP in the CASS_IP variable.
$ CASS_IP=$(kubectl get svc cassandra -n k8ssandra-operator -o jsonpath="{.status.loadBalancer.ingress[*].ip}")
Print the assigned Cassandra IP.
$ echo $CASS_IP
Export the Cluster Username
Export the cluster access username into the CASS_USERNAME variable.
$ CASS_USERNAME=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.username}' | base64 --decode)
Print the Cassandra username.
$ echo $CASS_USERNAME
Export the Cluster Password
Export the cluster password into the CASS_PASSWORD variable.
$ CASS_PASSWORD=$(kubectl get secret demo-superuser -n k8ssandra-operator -o=jsonpath='{.data.password}' | base64 --decode)
Print the Cassandra password.
$ echo $CASS_PASSWORD
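Kubernetes secrets store their values base64-encoded, which is why the commands above pipe the jsonpath output through base64 --decode. The round trip looks like this (demo-superuser is an illustrative value; your actual username comes from the demo-superuser secret):

```shell
# Secret values are stored base64-encoded; decode to recover the plaintext.
# "demo-superuser" is an illustrative value, not necessarily your username.
ENCODED=$(printf 'demo-superuser' | base64)
echo "$ENCODED"                    # ZGVtby1zdXBlcnVzZXI=
echo "$ENCODED" | base64 --decode  # demo-superuser
```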
Log In with cqlsh
Use cqlsh to connect to the Cassandra cluster with your variable values.
$ cqlsh -u "$CASS_USERNAME" -p "$CASS_PASSWORD" "$CASS_IP" 9042
Create a Keyspace
Create a new keyspace named demo.
demo-superuser@cqlsh> CREATE KEYSPACE demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
Create a Table
Create a table called users inside the demo keyspace.
demo-superuser@cqlsh> CREATE TABLE demo.users (id text primary key, name text, country text);
Insert Records
Insert sample records into the users table.
demo-superuser@cqlsh> BEGIN BATCH
INSERT INTO demo.users (id, name, country) VALUES ('42', 'John Doe', 'UK');
INSERT INTO demo.users (id, name, country) VALUES ('43', 'Joe Smith', 'US');
APPLY BATCH;
Query the Data
Query the table to display the stored values.
demo-superuser@cqlsh> SELECT * FROM demo.users;
Your output should be similar to the following:
id | country | name
----+---------+-----------
43 | US | Joe Smith
42 | UK | John Doe
(2 rows)
Conclusion
You have set up an Apache Cassandra cluster on a Kubernetes Engine environment using the open-source K8ssandra Operator. You enabled persistent storage with Block Storage and connected to the Cassandra cluster using the cqlsh CLI. For deeper configuration options and advanced use cases, consult the official Cassandra documentation.


