Persistent Storage with OpenEBS on Kubernetes
Today we will explore persistent storage for Cassandra on Kubernetes with OpenEBS.
If you want to follow along, you can sign up for a Civo account first.
What we will be doing
We will deploy a k3s distribution of Kubernetes on Civo, deploy a Cassandra cluster on top of it, write some data to the cluster, and then test persistence by deleting all the Cassandra pods.
About OpenEBS
Taken from their GitHub page, OpenEBS describes itself as the leading open-source container attached storage, built using a cloud-native architecture, which simplifies running stateful applications on Kubernetes.
Deploy Kubernetes
Create a new Civo k3s cluster with 3 nodes:
$ civo kubernetes create demo-cluster --size=g2.small --nodes=3 --wait
Append the kubernetes config to your kubeconfig file:
$ civo kubernetes config demo-cluster --save
Merged config into ~/.kube/config
Switch the context to the new cluster (using kubectx here; kubectl config use-context demo-cluster works just as well):
$ kubectx demo-cluster
Switched to context "demo-cluster".
Install OpenEBS
Deploy OpenEBS to your Kubernetes cluster by applying the operator manifest from their GitHub page:
$ kubectl apply -f https://openebs.github.io/charts/openebs-operator.yaml
Give it some time for all the pods to check in, then verify that all the pods in the openebs namespace are ready:
$ kubectl get pods -n openebs
NAME READY STATUS RESTARTS AGE
openebs-provisioner-c68bfd6d4-5n7kk 1/1 Running 0 5m52s
openebs-ndm-df25v 1/1 Running 0 5m51s
openebs-snapshot-operator-7ffd685677-h8jpf 2/2 Running 0 5m52s
openebs-admission-server-889d78f96-t64m6 1/1 Running 0 5m50s
openebs-ndm-44k6n 1/1 Running 0 5m51s
openebs-ndm-hpmg8 1/1 Running 0 5m51s
openebs-localpv-provisioner-67bddc8568-shkqr 1/1 Running 0 5m49s
openebs-ndm-operator-5db67cd5bb-mfg5m 1/1 Running 0 5m50s
maya-apiserver-7f664b95bb-l87bm 1/1 Running 0 5m53s
We can see the list of storage classes that come with OpenEBS:
$ kubectl get sc
NAME PROVISIONER AGE
local-path (default) rancher.io/local-path 28m
openebs-jiva-default openebs.io/provisioner-iscsi 5m43s
openebs-snapshot-promoter volumesnapshot.external-storage.k8s.io/snapshot-promoter 5m41s
openebs-hostpath openebs.io/local 5m41s
openebs-device openebs.io/local 5m41s
Deploy Cassandra
Deploy the Cassandra service to your Kubernetes cluster:
$ kubectl apply -f https://raw.githubusercontent.com/ruanbekker/blog-assets/master/civo.com-openebs-kubernetes/manifests/cassandra/cassandra-service.yaml
The Cassandra StatefulSet has 3 replicas, defined in the manifest as replicas: 3. I am also using the Jiva storage engine, selected via the annotation volume.beta.kubernetes.io/storage-class: openebs-jiva-default, which is well suited for replicated block storage on worker nodes that only have ephemeral storage.
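To show where those settings live, here is a rough sketch of the relevant part of the StatefulSet manifest. This is not the full manifest from the repository above — the field values are assumed from the PV/PVC output later in this post (claim template name cassandra-data, 5G volumes, ReadWriteOnce access):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 3                    # three Cassandra pods: cassandra-0..2
  # ... pod template omitted ...
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data       # becomes the PVC name prefix
      annotations:
        # tells Kubernetes to provision the volume with the Jiva engine
        volume.beta.kubernetes.io/storage-class: openebs-jiva-default
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 5G            # matches the 5G capacity seen in kubectl get pv
```

Each replica gets its own PersistentVolumeClaim stamped out from this template, which is what makes the storage survive pod deletion later on.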
Now let's deploy the Cassandra StatefulSet to the cluster:
$ kubectl apply -f https://raw.githubusercontent.com/ruanbekker/blog-assets/master/civo.com-openebs-kubernetes/manifests/cassandra/cassandra-statefulset.yaml
Give it some time for all the pods to check in, then have a look at the pods:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-7nwfl 1/1 Running 0 5m39s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-wr4hl 1/1 Running 0 5m39s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-ctrl-6f95fd555-ldd86 2/2 Running 0 5m53s
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93-rep-ffb5b9f9-2v75w 1/1 Running 1 5m39s
cassandra-0 1/1 Running 0 5m53s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-ctrl-bc4769cb4-c8stc 2/2 Running 0 3m40s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-w2gqs 1/1 Running 0 3m37s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-jk6c2 1/1 Running 0 3m37s
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2-rep-7565997776-qq9bb 1/1 Running 1 3m37s
cassandra-1 1/1 Running 0 3m40s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-ctrl-55887dc66f-zjbx8 2/2 Running 0 117s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-cqvdb 1/1 Running 0 114s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-hclgs 1/1 Running 0 114s
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456-rep-7fbc8785b8-85nqt 1/1 Running 1 114s
cassandra-2 0/1 Running 0 118s
When we look at our persistent volumes:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93 5G RWO Delete Bound default/cassandra-data-cassandra-0 openebs-jiva-default 12m
pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2 5G RWO Delete Bound default/cassandra-data-cassandra-1 openebs-jiva-default 10m
pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456 5G RWO Delete Bound default/cassandra-data-cassandra-2 openebs-jiva-default 8m49s
And our persistent volume claims:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
cassandra-data-cassandra-0 Bound pvc-b29a4e4b-20d7-4fd4-83d9-a155ab5c1a93 5G RWO openebs-jiva-default 13m
cassandra-data-cassandra-1 Bound pvc-2fe5a138-6c1d-4bcd-a0e4-e653793abea2 5G RWO openebs-jiva-default 10m
cassandra-data-cassandra-2 Bound pvc-a1ff3db9-d2fd-43a8-9095-c3529fe2e456 5G RWO openebs-jiva-default 9m11s
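Notice the PVC names above: a StatefulSet names each claim as the volume claim template name, then the StatefulSet name, then the pod's ordinal. This stable naming is what lets a recreated pod find its old volume again. A quick sketch of the convention (the names cassandra-data and cassandra are taken from the output above):

```shell
# StatefulSet PVCs are named <claimTemplate>-<statefulset>-<ordinal>,
# so each pod keeps a stable claim across restarts.
template="cassandra-data"
statefulset="cassandra"
for ordinal in 0 1 2; do
  echo "${template}-${statefulset}-${ordinal}"
done
```

Running this prints the exact three claim names shown in the kubectl get pvc output.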
Interact with Cassandra
View the cluster status:
$ kubectl exec cassandra-0 -- nodetool status
Datacenter: dc1-civo-k3s
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 192.168.1.9 83.16 KiB 32 63.4% f3debec3-77c7-438e-bf86-38630fa01c14 rack1-civo-k3s
UN 192.168.0.10 65.6 KiB 32 66.1% 3f78ec71-7485-477a-a095-44aa8fade0bd rack1-civo-k3s
UN 192.168.2.12 84.73 KiB 32 70.5% 0160da34-5975-43d9-ab89-4c92cfe4ceeb rack1-civo-k3s
Now let's connect to our Cassandra cluster:
$ kubectl exec -it cassandra-0 -- cqlsh cassandra 9042 --cqlversion="3.4.2"
Connected to civo-k3s at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh>
Then create a keyspace (with a replication factor of 2, so each row is stored on two of the three nodes), create a table, and write some dummy data to our cluster:
cqlsh> create keyspace test with replication = {'class': 'SimpleStrategy', 'replication_factor': 2 };
cqlsh> use test;
cqlsh:test> create table people (id uuid, name varchar, age int, PRIMARY KEY ((id), name));
cqlsh:test> insert into people (id, name, age) values(uuid(), 'ruan', 28);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'samantha', 25);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'stefan', 29);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'james', 25);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'michelle', 30);
cqlsh:test> insert into people (id, name, age) values(uuid(), 'tim', 32);
Now let's read the data:
cqlsh:test> select * from people;
id | name | age
--------------------------------------+----------+-----
e3a2261d-77d3-4ed6-9daf-51e65ccf618f | tim | 32
594b24ee-e5bc-44cf-87cf-afed48a74df9 | samantha | 25
ca65c207-8dd0-4dc0-99e3-1a0301128921 | michelle | 30
1eeb2b77-57d0-4e10-b5d7-6dc58c43006e | james | 25
3cc12f26-395d-41d1-a954-f80f2c2ff88d | ruan | 28
755d52f2-5a27-4431-8591-7d34bacf7bee | stefan | 29
(6 rows)
Test Data Persistence
Now it's time to delete our pods and see if our data persists. First we will look at our pods to determine how long they have been running and on which node each one is running:
$ kubectl get pods --selector app=cassandra -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cassandra-0 1/1 Running 0 35m 192.168.0.10 kube-master-75a2 <none> <none>
cassandra-1 1/1 Running 0 33m 192.168.1.9 kube-node-5d0a <none> <none>
cassandra-2 1/1 Running 0 30m 192.168.2.12 kube-node-aca8 <none> <none>
Let's delete all our Cassandra pods:
$ kubectl delete pod/cassandra-{0..2}
pod "cassandra-0" deleted
pod "cassandra-1" deleted
pod "cassandra-2" deleted
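A side note on the delete command above: the {0..2} part is bash brace expansion, not kubectl syntax. The shell expands it into three separate arguments before kubectl ever runs, which you can see with a plain echo:

```shell
# bash expands the braces before running the command,
# so kubectl actually receives three separate pod names:
echo pod/cassandra-{0..2}
# pod/cassandra-0 pod/cassandra-1 pod/cassandra-2
```

This means the shortcut only works in shells that support brace expansion (bash, zsh); in a plain POSIX sh you would list the pods out, or use kubectl delete pods --selector app=cassandra instead.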
As we can see, the first pod is busy starting:
$ kubectl get pods --selector app=cassandra -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cassandra-0 0/1 Running 0 37s 192.168.2.13 kube-node-aca8 <none> <none>
Give it some time to allow all three pods to check in:
$ kubectl get pods --selector app=cassandra -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cassandra-0 1/1 Running 0 4m36s 192.168.2.13 kube-node-aca8 <none> <none>
cassandra-1 1/1 Running 0 3m24s 192.168.0.17 kube-master-75a2 <none> <none>
cassandra-2 1/1 Running 0 2m3s 192.168.1.12 kube-node-5d0a <none> <none>
Right, now that all the pods have checked in, let's connect to our cluster and see if the data is still there:
$ kubectl exec -it cassandra-0 -- cqlsh cassandra 9042 --cqlversion="3.4.2"
Connected to civo-k3s at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.9 | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cqlsh> use test;
cqlsh:test> select * from people;
id | name | age
--------------------------------------+----------+-----
e3a2261d-77d3-4ed6-9daf-51e65ccf618f | tim | 32
594b24ee-e5bc-44cf-87cf-afed48a74df9 | samantha | 25
ca65c207-8dd0-4dc0-99e3-1a0301128921 | michelle | 30
1eeb2b77-57d0-4e10-b5d7-6dc58c43006e | james | 25
3cc12f26-395d-41d1-a954-f80f2c2ff88d | ruan | 28
755d52f2-5a27-4431-8591-7d34bacf7bee | stefan | 29
(6 rows)
cqlsh:test> exit;
And as you can see, the data was persisted across the pod deletions.
Thank You
I am super impressed with OpenEBS. Have a look at their docs, and also at the examples on their GitHub page.
Let me know what you think. If you liked my content, feel free to visit me at ruan.dev or follow me on Twitter at @ruanbekker