While there are plenty of examples of how to write stateless applications on Kubernetes, there are relatively few simple samples that explain how to write stateful applications. This article describes how to write a simple database system with Quarkus.
The complete code of this article can be found in the ibm/operator-sample-go repo.
My previous article, How to build your own Database on Kubernetes, explains the concepts behind running stateful workloads on Kubernetes. Before reading on, make sure you understand StatefulSets. To recap, here are the main components.
Let’s look at the StatefulSet definition first:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database-cluster
  namespace: database
  labels:
    app: database-cluster
spec:
  serviceName: database-service
  replicas: 3
  selector:
    matchLabels:
      app: database-cluster
  template:
    metadata:
      labels:
        app: database-cluster
    spec:
      securityContext:
        fsGroup: 2000
      terminationGracePeriodSeconds: 10
      containers:
      - name: database-container
        image: nheidloff/database-service:v1.0.22
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8089
          name: api
        volumeMounts:
        - name: data-volume
          mountPath: /data
        env:
        - name: DATA_DIRECTORY
          value: /data/
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
  volumeClaimTemplates:
  - metadata:
      name: data-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: ibmc-vpc-block-5iops-tier
      resources:
        requests:
          storage: 1Mi
Notes about the StatefulSet:
- There are three replicas: one leader and two followers.
- A storage class is used to provision volumes automatically.
- Each pod/container has its own volume.
- The volume is mounted into the container.
- To allow containers to read metadata like their pod names, environment variables are used.
- The security context is set to “fsGroup: 2000” which allows file access from the Quarkus image.
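To illustrate how a container might consume these environment variables, here is a minimal sketch. The class `PodSettings` and its fallback to `/data/` are assumptions made for illustration and are not taken from the sample repository; only the variable names POD_NAME, NAMESPACE and DATA_DIRECTORY come from the StatefulSet definition above.

```java
import java.util.Map;

/** Hypothetical holder for the settings the StatefulSet passes to each container. */
final class PodSettings {
    final String podName;
    final String namespace;
    final String dataDirectory;

    private PodSettings(String podName, String namespace, String dataDirectory) {
        this.podName = podName;
        this.namespace = namespace;
        this.dataDirectory = dataDirectory;
    }

    /** Reads the three variables from the given map, for example System.getenv(). */
    static PodSettings from(Map<String, String> env) {
        String podName = env.get("POD_NAME");
        String namespace = env.get("NAMESPACE");
        if (podName == null || namespace == null) {
            throw new IllegalStateException("POD_NAME and NAMESPACE must be set");
        }
        // Assumption: fall back to the mount path when DATA_DIRECTORY is missing.
        String dataDirectory = env.getOrDefault("DATA_DIRECTORY", "/data/");
        return new PodSettings(podName, namespace, dataDirectory);
    }
}
```

In a running pod this would be called as `PodSettings.from(System.getenv())`. Passing the map in explicitly keeps the logic testable outside of Kubernetes.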
To access the pods, a headless service (clusterIP: None) is defined, which gives each pod a stable DNS name. For example, the leader can be invoked via “http://database-cluster-0.database-service.database:8089/persons”.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: database-service
  name: database-service
  namespace: database
spec:
  clusterIP: None
  ports:
  - port: 8089
  selector:
    app: database-cluster
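The address pattern used above is pod name, service name and namespace, joined by dots, where the pod name is the StatefulSet name plus an ordinal. As a rough sketch, a small helper could assemble such addresses. The class `PodAddress` and its method are hypothetical names chosen for this illustration.

```java
/** Hypothetical helper that builds the stable address of a StatefulSet pod. */
final class PodAddress {
    private PodAddress() {
    }

    /**
     * Builds "http://<statefulSet>-<ordinal>.<service>.<namespace>:<port>",
     * the pattern that a headless service exposes for StatefulSet pods.
     */
    static String baseUrl(String statefulSetName, int ordinal,
                          String serviceName, String namespace, int port) {
        return "http://" + statefulSetName + "-" + ordinal
                + "." + serviceName + "." + namespace + ":" + port;
    }
}
```

With the values from this article, `PodAddress.baseUrl("database-cluster", 0, "database-service", "database", 8089) + "/persons"` produces the leader URL quoted above.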
The database service uses a single JSON file for storage. On the leader, the file is created when the leader is initialized. Followers synchronize the data from the leader when they are initialized.
public static Response synchronizeDataFromLeader(LeaderUtils leaderUtils, PersonResource personResource) {
    System.out.println("LeaderUtils.synchronizeDataFromLeader()");
    String leaderAddress = "http://database-cluster-0.database-service.database:8089/persons";
    int httpStatus = 200;
    if (leaderUtils.isLeader()) {
        httpStatus = 501; // Not Implemented
    } else {
        Set<Person> persons = null;
        try {
            // Note: This follower should update from the previous follower (or leader)
            // For simplification purposes updates are only read from the leader
            URL apiUrl = new URL(leaderAddress);
            System.out.println("Leader found. URL: " + leaderAddress);
            RemoteDatabaseService customRestClient = RestClientBuilder.newBuilder().baseUrl(apiUrl).
                    register(ExceptionMapper.class).build(RemoteDatabaseService.class);
            persons = customRestClient.getAll();
        } catch (Exception e) {
            System.out.println("/persons could not be invoked");
            httpStatus = 503; // Service Unavailable
        }
        if (persons != null) {
            try {
                personResource.updateAllPersons(persons);
            } catch (RuntimeException e) {
                System.out.println("Data could not be written");
                httpStatus = 503; // Service Unavailable
            }
        }
    }
    return Response.status(httpStatus).build();
}
Write operations are only allowed on the leader. When they are executed on the leader, the followers need to be notified to update their state (see code).
public static void notifyFollowers() {
    String serviceName = "database-service";
    String namespace = System.getenv("NAMESPACE");
    // try-with-resources closes the Kubernetes client after all followers have been notified
    try (KubernetesClient client = new DefaultKubernetesClient()) {
        PodList podList = client.pods().inNamespace(namespace).list();
        podList.getItems().forEach(pod -> {
            String podName = pod.getMetadata().getName();
            // Every pod except the leader (ordinal 0) is treated as a follower
            if (!podName.endsWith("-0")) {
                String followerAddress = podName + "." + serviceName + "." + namespace + ":8089";
                System.out.println("Follower found: " + podName + " - " + followerAddress);
                try {
                    URL apiUrl = new URL("http://" + followerAddress + "/api/onleaderupdated");
                    RemoteDatabaseService customRestClient = RestClientBuilder.newBuilder().
                            register(ExceptionMapper.class).baseUrl(apiUrl).build(RemoteDatabaseService.class);
                    customRestClient.onLeaderUpdated();
                } catch (Exception e) {
                    System.out.println("/onleaderupdated could not be invoked");
                }
            }
        });
    }
}
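Note that the code above lists every pod in the namespace, so any unrelated pod whose name does not end with “-0” would also be contacted. If other workloads share the namespace, the pod names could be filtered first. The following is only a sketch of that idea: `FollowerSelector` is a hypothetical helper, not part of the sample, and it assumes pod names follow the StatefulSet pattern of name plus ordinal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper that picks the follower pods of one StatefulSet. */
final class FollowerSelector {
    private FollowerSelector() {
    }

    /**
     * Keeps only names of the form "<statefulSetName>-<ordinal>"
     * and drops the leader, which has ordinal 0.
     */
    static List<String> followers(List<String> podNames, String statefulSetName) {
        Pattern member = Pattern.compile(Pattern.quote(statefulSetName) + "-\\d+");
        String leaderName = statefulSetName + "-0";
        List<String> result = new ArrayList<>();
        for (String name : podNames) {
            if (member.matcher(name).matches() && !name.equals(leaderName)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

The names passed in would come from `pod.getMetadata().getName()` for each item of the pod list, and the StatefulSet name would be “database-cluster” in this article's setup.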
The next question is how the leader is determined. This sample uses a simple mechanism: it checks whether the container’s pod name ends with “-0”.
public void electLeader() {
    String podName = System.getenv("POD_NAME");
    if ((podName != null) && (podName.endsWith("-0"))) {
        setLeader(true);
    }
}
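The same naming convention could also be used to find the pod a follower should synchronize from, as the comment in the synchronization code suggests (the previous follower or the leader). The sketch below parses the ordinal from the pod name. `UpstreamPod` and its methods are hypothetical and not part of the sample, which always reads from the leader.

```java
import java.util.Optional;

/** Hypothetical helper that derives ordinals and upstream pods from pod names. */
final class UpstreamPod {
    private UpstreamPod() {
    }

    /** Returns the ordinal after the last '-' of a StatefulSet pod name. */
    static int ordinalOf(String podName) {
        int dash = podName.lastIndexOf('-');
        if (dash < 0 || dash == podName.length() - 1) {
            throw new IllegalArgumentException("Not a StatefulSet pod name: " + podName);
        }
        // Throws NumberFormatException if the suffix is not a number.
        return Integer.parseInt(podName.substring(dash + 1));
    }

    /** Returns the pod with the next lower ordinal, or empty for the leader (ordinal 0). */
    static Optional<String> upstreamOf(String podName) {
        int ordinal = ordinalOf(podName);
        if (ordinal == 0) {
            return Optional.empty();
        }
        String prefix = podName.substring(0, podName.lastIndexOf('-'));
        return Optional.of(prefix + "-" + (ordinal - 1));
    }
}
```

For example, `UpstreamPod.upstreamOf(System.getenv("POD_NAME"))` would be empty on the leader, and on “database-cluster-2” it would name “database-cluster-1”.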
The state of all pods is also stored on the volumes (podstate.json), so that new pods can continue from where previous pod instances left off.
To simulate a real database system, the database application has SQL-like APIs to execute statements and queries.
To learn more, check out the complete source code.