Using Kubernetes Persistent Volumes

In the previous tutorial we looked at running a Rails application utilizing a Kubernetes Deployment, Service, and ConfigMap. This week, we’ll take a look at running a stateful database in Kubernetes.

Goals:

Create a service for our Postgres database.
Connect our mealplan service to our database.

Stateful vs Stateless

Our Rails application is an example of a “stateless” application because no state is maintained between requests. It might seem like there is, but all of the actual state is managed in the database. It’s very easy to tear down and spin up stateless applications because you only need to make sure that you’re not terminating a request while it’s being processed. Stateful applications aren’t as easy to work with in this way because any data stored inside the container is lost when the container goes away.

For us to run our Postgres database in Kubernetes we’re going to need to use the following Kubernetes object types:

Service
Deployment
ConfigMap
PersistentVolumeClaim

We’ve seen Service, Deployment, and ConfigMap, but PersistentVolumeClaim is new. A PersistentVolumeClaim object indicates that we need access to some storage that isn’t going to go away. It’s not quite like a Docker volume because we need to specify more information, such as how much storage we need and how the volume may be accessed.

Creating a Database Deployment

For this tutorial, we’re going to group all of the definitions necessary to run our database in Kubernetes into the same YAML file. Before we get too far let’s create the Service and Deployment that we need without any persistence (running it this way would cause data loss when the container stopped):

deployments/postgres.yml

apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  ports:
    - port: 5432
  selector:
    app: postgres
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
spec:
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - image: "postgres:9.6.2"
          name: postgres
          env:
            - name: POSTGRES_USER
              valueFrom:
                configMapKeyRef:
                  name: mealplan-config
                  key: postgres_user
            - name: POSTGRES_PASSWORD
              valueFrom:
                configMapKeyRef:
                  name: mealplan-config
                  key: postgres_password
          ports:
            - containerPort: 5432
              name: postgres

This isn’t far off from how we defined our mealplan Deployment or Service. We’re reusing the configMapKeyRef lines from earlier since we want to use the same user and password. The biggest differences are that we’re defining both objects in the same file and we’re separating them with three dashes (---).

If we run this YAML file using the kubectl create -f command we should see the following:

$ kubectl create -f deployments/postgres.yml
service "postgres" created
deployment "postgres" created
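
A Service only forwards traffic to pods whose labels match its selector (app: postgres in our case), so a quick sanity check is to list the pod IPs the Service has picked up. The following is a minimal sketch rather than part of the original setup: the function name service_pod_ips is made up for this example, and it assumes kubectl is pointed at the same cluster.

```shell
# Hypothetical helper: print the pod IPs currently backing a Service.
# An empty result suggests the selector matches no ready pods.
service_pod_ips() {
  kubectl get endpoints "$1" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Usage: service_pod_ips postgres
```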

Adding Data Persistence

Now that we have our Service and Deployment created successfully we’re going to set up our PersistentVolumeClaim so that we can persist the data. The postgres image that we’re using only sets up the username/password combination when the container first starts with no existing database. Because of this, we’re going to remove our deployment and service for the time being:

$ kubectl delete -f deployments/postgres.yml
service "postgres" deleted
deployment "postgres" deleted

Now we will define the PersistentVolumeClaim and also mount the volume into our deployment’s containers. The Service portion of the file will be unchanged so we’ll leave that out.

# Service declaration left out for brevity
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
spec:
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - image: "postgres:9.6.2"
          name: postgres
          env:
            - name: POSTGRES_USER
              valueFrom:
                configMapKeyRef:
                  name: mealplan-config
                  key: postgres_user
            - name: POSTGRES_PASSWORD
              valueFrom:
                configMapKeyRef:
                  name: mealplan-config
                  key: postgres_password
          ports:
            - containerPort: 5432
              name: postgres
          volumeMounts:
            - name: postgres-storage
              # The postgres image keeps its data in /var/lib/postgresql/data by default
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: postgres-pv-claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pv-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

We added a volumeMounts section to our container in the Deployment and a volumes section that tells the Deployment which volumes it has access to. Lastly, we created a PersistentVolumeClaim. The claim is a request for storage: Kubernetes binds it to a PersistentVolume, and whatever uses the claim is allowed to access the volume behind it. Let’s create the resources and do some poking around.

$ kubectl create -f deployments/postgres.yml
service "postgres" created
deployment "postgres" created
persistentvolumeclaim "postgres-pv-claim" created

First, let’s take a look at our PersistentVolumeClaim using the describe command:

$ kubectl describe pvc postgres-pv-claim
Name:           postgres-pv-claim
Namespace:      default
StorageClass:   standard
Status:         Bound
Volume:         pvc-9af882c8-1348-11e7-9408-080027b11ce5
Labels:         <none>
Capacity:       5Gi
Access Modes:   RWO
No events.

We can see from the describe output that a PersistentVolume was created for our claim. Getting the description of that volume should give us even more insight into how this is set up:

$ kubectl describe pv pvc-9af882c8-1348-11e7-9408-080027b11ce5
Name:           pvc-9af882c8-1348-11e7-9408-080027b11ce5
Labels:         <none>
StorageClass:   standard
Status:         Bound
Claim:          default/postgres-pv-claim
Reclaim Policy: Delete
Access Modes:   RWO
Capacity:       5Gi
Message:
Source:
    Type:       HostPath (bare host directory volume)
    Path:       /tmp/hostpath-provisioner/pvc-9af882c8-1348-11e7-9408-080027b11ce5
No events.

From the looks of it, the volume was created as a directory on the host (under /tmp/hostpath-provisioner) that we have read/write access to. That works perfectly fine for our needs.
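
The volume only protects our data if Postgres actually writes inside the mounted directory. The official postgres image stores its data in the directory named by the PGDATA environment variable (by default /var/lib/postgresql/data), so it’s worth comparing that value with the mountPath in the Deployment. A small sketch of that check follows; the function name pod_pgdata is made up for this example, and the pod name you pass in comes from kubectl get pods.

```shell
# Hypothetical helper: print the data directory that Postgres uses inside a pod.
# Compare the output with the mountPath in deployments/postgres.yml.
pod_pgdata() {
  kubectl exec "$1" -- sh -c 'echo "$PGDATA"'
}

# Usage: pod_pgdata <name of the postgres pod>
```

If the printed path is not at or below the mountPath, the database files live outside the claimed volume and won’t survive the pod being replaced.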

Connecting Rails to our Database

Now that we have a volume and Postgres running we need to reconfigure our Rails application to connect to it. To complete our setup we need to do a few more things:

Change our ConfigMap’s POSTGRES_HOST value so Rails knows what to talk to.
Deploy the ConfigMap changes.
Set up our new database (we’re not going to migrate any of the data).

Updating the mealplan-config ConfigMap

Thankfully, Kubernetes makes it easy for us to edit the data in a config map using the kubectl edit command. Let’s change the POSTGRES_HOST value now:

$ kubectl edit configmap mealplan-config

This will open up your $EDITOR (in my case that’s Vim). Change the postgres_host value from an IP address to postgres. It should look like this when we’re done:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  postgres_host: postgres
  postgres_password: dF1nu8xT6jBz01iXAfYDCmGdQO1IOc4EOgqVB703
  postgres_user: meal_planner
  rails_env: production
  rails_log_to_stdout: "true"
  secret_key_base: 7a475ef05d7f1100ae91c5e7ad6ab4706ce5d303e6bbb8da153d2accb7cb53fa5faeff3161b29232b3c08d6417bd05686094d04e22950a4767bc9236991570ad
kind: ConfigMap
metadata:
  creationTimestamp: 2017-03-21T22:46:45Z
  name: mealplan-config
  namespace: default
  resourceVersion: "82058"
  selfLink: /api/v1/namespaces/default/configmaps/mealplan-config
  uid: 427c493c-0e88-11e7-99a4-080027b11ce5

When you save in your editor it will update the ConfigMap object that Kubernetes maintains.
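
If you’d rather not open an editor (in a script, for example), kubectl patch can make the same change. This is a minimal sketch of that alternative, not something the rest of the tutorial depends on; the function name set_config_value is made up for this example, and it assumes the key and value contain no quotes or other characters that need JSON escaping.

```shell
# Hypothetical helper: set one key in the mealplan-config ConfigMap
# without opening an editor. Key and value must not contain characters
# that need JSON escaping (quotes, backslashes).
set_config_value() {
  key="$1"
  value="$2"
  kubectl patch configmap mealplan-config --type merge \
    -p "{\"data\":{\"$key\":\"$value\"}}"
}

# Usage: set_config_value postgres_host postgres
```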

Deploying ConfigMap Changes

It was very easy to update the ConfigMap, but unfortunately it’s less obvious how to get those changes reflected in our deployment. Similar to the way that environment variable changes are deployed to Docker containers using docker-compose, we need to stop our container and start a new one for the value to be updated. Since this is done at the deployment level, we will use the kubectl scale command to stop our container(s) and spin up new ones:

$ kubectl scale deployment/mealplan --replicas=0; kubectl scale deployment/mealplan --replicas=1
deployment "mealplan" scaled
deployment "mealplan" scaled

Now our new container will be connected to the Kubernetes-managed postgres service.
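
The scale commands return as soon as the change is accepted, not when the new pod is ready. One way to wait for it is kubectl rollout status, which blocks until the deployment has finished rolling out. The sketch below combines the two steps; the function name restart_deployment is made up for this example, and the replica count defaults to 1 to match what we used above.

```shell
# Hypothetical helper: scale a deployment down to zero, back up again,
# and wait until the new pods are ready.
restart_deployment() {
  name="$1"
  replicas="${2:-1}"   # desired replica count after the restart
  kubectl scale "deployment/$name" --replicas=0 &&
    kubectl scale "deployment/$name" --replicas="$replicas" &&
    kubectl rollout status "deployment/$name"
}

# Usage: restart_deployment mealplan
```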

Setting up the Database in Kubernetes

Now that our Rails application is connected to the postgres service we need to set up the database for the first time. Since this is something that we’ll only be doing once, we will run it manually using the kubectl exec command. First, we need to find the pod that we’ll exec into:

$ kubectl get pods
NAME                        READY     STATUS    RESTARTS   AGE
mealplan-4029866644-mhhtf   1/1       Running   0          5m
postgres-4220950182-qphj2   1/1       Running   0          15m

The pod name that I’ll be using is mealplan-4029866644-mhhtf but yours will be different. Now let’s use the kubectl exec command to make sure we’re talking to the proper host and set up our database:

$ kubectl exec mealplan-4029866644-mhhtf -it -- env | grep POSTGRES_HOST
POSTGRES_HOST=postgres
$ kubectl exec mealplan-4029866644-mhhtf -it -- rake db:setup
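
Because pod names include a generated suffix, copying them by hand gets tedious. A label selector can look the name up instead. This is a minimal sketch: the function name first_pod_for_app is made up for this example, and using it for the Rails pod assumes that the mealplan Deployment labels its pods with app: mealplan, the same way our postgres Deployment uses app: postgres.

```shell
# Hypothetical helper: print the name of the first pod carrying the
# label app=<name>.
first_pod_for_app() {
  kubectl get pods -l "app=$1" -o jsonpath='{.items[0].metadata.name}'
}

# Usage (assumes the mealplan pods carry the label app=mealplan):
#   kubectl exec "$(first_pod_for_app mealplan)" -it -- rake db:setup
```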

Tearing Down the Old Postgres Container

Let’s make sure that we don’t leave the old Postgres container running outside of Kubernetes. Set the minikube host as your DOCKER_HOST if it’s not already set:

$ eval $(minikube docker-env)

Now we can stop & remove the database container that we were using:

$ docker stop mealplan_prod_db_1
mealplan_prod_db_1
$ docker rm mealplan_prod_db_1
mealplan_prod_db_1

Recap

In this tutorial, you went through and learned how to set up a Kubernetes PersistentVolume and run a stateful application within Kubernetes. You also learned how to edit a ConfigMap and deploy the changes to an existing deployment. We had some unfortunate downtime that we had to deal with, but in later tutorials, we will address how to overcome that hurdle.