While there are plenty of examples how to write stateless applications on Kubernetes, there are relative few simple samples explaining how to write stateful applications. This article describes how to write a simple database system with Quarkus.
The complete code of this article can be found in the ibm/operator-sample-go repo.
My previous article How to build your own Database on Kubernetes explains the concepts how stateful workloads can be run on Kubernetes. Before reading on, make sure you understand StatefulSets. To recap, here are the main components.
Let’s look at the StatefulSet definition first:
apiVersion: apps/v1 kind: StatefulSet metadata: name: database-cluster namespace: database labels: app: database-cluster spec: serviceName: database-service replicas: 3 selector: matchLabels: app: database-cluster template: metadata: labels: app: database-cluster spec: securityContext: fsGroup: 2000 terminationGracePeriodSeconds: 10 containers: - name: database-container image: nheidloff/database-service:v1.0.22 imagePullPolicy: IfNotPresent ports: - containerPort: 8089 name: api volumeMounts: - name: data-volume mountPath: /data env: - name: DATA_DIRECTORY value: /data/ - name: POD_NAME valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.name - name: NAMESPACE valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.namespace volumeClaimTemplates: - metadata: name: data-volume spec: accessModes: [ "ReadWriteOnce" ] storageClassName: ibmc-vpc-block-5iops-tier resources: requests: storage: 1Mi
Notes about the stateful set:
- There are three replicas: One lead and two followers.
- A storage class is used to provision volumes automatically.
- Each pod/container has its own volume.
- The volume is mounted into the container.
- To allow containers to read metadata like their pod names, environment variables are used.
- The security context is set to “fsGroup: 2000” which allows file access from the Quarkus image.
To access the pods, a service is defined. For example the leader can be invoked via “http://database-cluster-0.database-service.database:8089/persons”.
apiVersion: v1 kind: Service metadata: labels: app: database-service name: database-service namespace: database spec: clusterIP: None ports: - port: 8089 selector: app: database-cluster
The database service uses a single JSON file for storage. For the leader the file is created when the leader is initialized. Followers synchronize the data from the leader when they are initialized.
public static Response synchronizeDataFromLeader(LeaderUtils leaderUtils, PersonResource personResource) { System.out.println("LeaderUtils.synchronizeDataFromLeader()"); String leaderAddress = "http://database-cluster-0.database-service.database:8089/persons"; int httpStatus = 200; if (leaderUtils.isLeader() == true) { httpStatus = 501; // Not Implemented } else { Set<Person> persons = null; try { // Note: This follower should update from the previous follower (or leader) // For simplification purposes updates are only read from the leader URL apiUrl = new URL(leaderAddress); System.out.println("Leader found. URL: " + leaderAddress); RemoteDatabaseService customRestClient = RestClientBuilder.newBuilder().baseUrl(apiUrl). register(ExceptionMapper.class).build(RemoteDatabaseService.class); persons = customRestClient.getAll(); } catch (Exception e) { System.out.println("/persons could not be invoked"); httpStatus = 503; // Service Unavailable } if (persons != null) { try { personResource.updateAllPersons(persons); } catch (RuntimeException e) { System.out.println("Data could not be written"); httpStatus = 503; // Service Unavailable } } } return Response.status(httpStatus).build(); }
Write operations are only allowed on the leader. When they are executed on the leader, the followers need to be notified to update their state (see code).
public static void notifyFollowers() { KubernetesClient client = new DefaultKubernetesClient(); String serviceName = "database-service"; String namespace = System.getenv("NAMESPACE"); PodList podList = client.pods().inNamespace(namespace).list(); podList.getItems().forEach(pod -> { if (pod.getMetadata().getName().endsWith("-0") == false) { String followerAddress = pod.getMetadata().getName() + "." + serviceName + "." + namespace + ":8089"; System.out.println("Follower found: " + pod.getMetadata().getName() + " - " + followerAddress); try { URL apiUrl = new URL("http://" + followerAddress + "/api/onleaderupdated"); RemoteDatabaseService customRestClient = RestClientBuilder.newBuilder(). register(ExceptionMapper.class).baseUrl(apiUrl).build(RemoteDatabaseService.class); customRestClient.onLeaderUpdated(); } catch (Exception e) { System.out.println("/onleaderupdated could not be invoked"); } } }); }
The next question is how the leader is determined. In this sample a simple mechanism is used which is to check whether the container’s pod name ends with “-0”.
public void electLeader() { String podName = System.getenv("POD_NAME"); if ((podName != null) && (podName.endsWith("-0"))) { setLeader(true); } }
The state of all pods is stored on the volumes too (podstate.json) so that the new pods can continue with the state previous pod instances left off.
To simulate a real database system, the database application has SQL-like APIs to execute statements and queries.
To learn more, check out the complete source code.