Deploy Minio with persistent storage
About this lab
In this lab exercise I will deploy Minio on a Kubernetes cluster, using persistent storage provisioned by the Pure Service Orchestrator.
vTeam Specialization Program
Recently I was nominated to join the Pure Storage vTeam Specialization Program for New Stack. The idea behind the program is to provide a community within Pure for Puritans to learn, develop skills and grow into subject matter experts.
The program consists of training and lab exercises that are focused on developing experience in the New Stack space (Kubernetes, Ansible, OpenStack and more). And since I think there are more people out there who want to learn more about New Stack, I thought I'd blog my progress through the lab exercises.
Lab instructions
The purpose of this lab is to deploy persistent volume claims on the Kubernetes cluster deployed earlier (see these blogs: part 1, part 2, part 3) by using PSO (see this blog). To simulate an actual use case, we'll be deploying the Minio application.
Name: Deploy a simple application with PSO provided PVCs
Description: Create a Pure based volume
Objective: Deploy Minio application
Task #1: Create a YAML to create a PV on a FlashBlade
Task #2: Create a YAML to create a PV on a FlashArray
Task #3: Create a YAML for a Minio deployment using the FlashArray PVC already created
Task #4: Create an SVC YAML and access the Minio UI
Task #5: Upload data to Minio
Task #6: Failover the Minio application to a new node in the k8s cluster
Success Criteria: All PSO k8s resources are correctly running
This lab will be done in a single blog, since it doesn't require too many steps. As said, I'll be using the Kubernetes cluster and PSO installation that I've set up in the previous labs. My Kubernetes cluster is running on VMware and will use iSCSI and NFS to connect to FlashArray and FlashBlade; however, for provisioning storage from Kubernetes using PSO it actually doesn't matter, the steps are always the same.
Deploy persistent storage
How storage is provisioned in Kubernetes
Before we can deploy our Minio application, we need to deploy the required persistent storage volumes. Creating a persistent volume from Kubernetes using PSO is really simple. We need to define a Persistent Volume Claim, which will instruct PSO to provision a volume and connect it to our Kubernetes cluster. Since Kubernetes is declarative the provisioning steps are as follows:
- We apply our Persistent Volume Claim (PVC) to the cluster, specifying that we want to use PSO for provisioning. We specify that by using a StorageClass which points to PSO. During the installation the storage classes pure-block and pure-file are created automatically.
- Once we’ve applied our PVC, PSO will create a volume on the array, it will create a host definition on the array and mount the volume onto a Kubernetes worker node. Then it will create a Persistent Volume object in Kubernetes that points to the actual storage volume but also links to the PVC.
A container (Pod) that wants to use the volume will reference the PVC, the PVC will point to the PV and the PV will point to the actual volume mounted on the Kubernetes node.
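To make that chain a bit more concrete, below is a minimal sketch of a pod that uses a PVC. It is not part of the lab and the pod name and image are just placeholders; the only link to the storage is the claimName, everything else (finding the PV, attaching and mounting the volume) is handled by Kubernetes and PSO.

apiVersion: v1
kind: Pod
metadata:
  name: example-pod             # placeholder name, for illustration only
spec:
  containers:
  - name: app
    image: nginx                # any image works for this illustration
    volumeMounts:
    - name: data
      mountPath: /data          # where the volume shows up inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: pure-file      # references the PVC we are about to create below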
Deploy persistent file storage
The YAML code below is what is used to create a file (NFS) volume, save this in a file called pure-file.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pure-file
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: pure-file
If you know Kubernetes, lines 1 through 5 are pretty basic. In the first two lines we specify the API version (v1) and the kind of object we are creating (PersistentVolumeClaim). Then in lines 3 and 4 we give our object a name: pure-file.
In the spec part we give the parameters that we want to pass to the API. First the access mode: the two most used are ReadWriteOnce (RWO) and ReadWriteMany (RWX). The first (RWO) can only be used by a single pod and is generally associated with block storage. The second (RWX) can be shared by multiple pods. NFS storage is very suitable for sharing across multiple hosts, so RWX is very common for file requests. We also request a size (10Gi) and specify the storageClassName (pure-file), which tells Kubernetes to let PSO handle the provisioning.
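If you want to double-check which storage classes PSO created during the installation, you can list them:

kubectl get storageclass

This should list pure-block and pure-file, backed by the pure-csi provisioner that also shows up in the describe output further down.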
To apply this to our Kubernetes cluster and have PSO create our first volumes, we run:
kubectl apply -f pure-file.yaml
Deploy persistent block storage
And that is it. So basically we specified a name, a size, the access mode (RWO/RWX) and the storageclass. Everything else is automated. Now on to the block volume:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pure-block
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: pure-block
Identical to the file example, only I changed the name, access mode (RWO) and storageclass. Apply this as well using kubectl apply -f <filename> and we also have our block volume.
Checking our persistent storage volumes
To check if both are created successfully, run the following:
kubectl get pvc
This should output something similar to this:
NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pure-block   Bound    pvc-cea80b7d-9753-403b-8828-1821a31e328d   10Gi       RWO            pure-block     38s
pure-file    Bound    pvc-4b380555-b60d-49df-92f2-1acde995f826   10Gi       RWX            pure-file      20m
If the status is shown as Bound, this means that the volume was created by PSO and a PersistentVolume (PV) was created and bound to the claim, so it is available for use by your pod.
To get a bit more info, we can use:
kubectl describe pvc pure-file
This shows more information about the PersistentVolumeClaim, including the name of the underlying volume (PersistentVolume) and the events that show PSO picking up the request and successfully provisioning the volume.
Name:          pure-file
Namespace:     default
StorageClass:  pure-file
Status:        Bound
Volume:        pvc-4b380555-b60d-49df-92f2-1acde995f826
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: pure-csi
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      10Gi
Access Modes:  RWX
VolumeMode:    Filesystem
Mounted By:    <none>
Events:
  Type    Reason                 Age                From                                                              Message
  ----    ------                 ----               ----                                                              -------
  Normal  ExternalProvisioning   31m (x2 over 31m)  persistentvolume-controller                                       waiting for a volume to be created, either by external provisioner "pure-csi" or manually created by system administrator
  Normal  Provisioning           31m                pure-csi_pure-provisioner-0_937faed8-104e-4aab-aefb-786f6f66a9d8  External provisioner is provisioning volume for claim "default/pure-file"
  Normal  ProvisioningSucceeded  31m                pure-csi_pure-provisioner-0_937faed8-104e-4aab-aefb-786f6f66a9d8  Successfully provisioned volume pvc-4b380555-b60d-49df-92f2-1acde995f826
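Out of curiosity you can also look at the PersistentVolume object that PSO created behind the scenes; the volume name below is taken from the describe output above:

kubectl get pv pvc-4b380555-b60d-49df-92f2-1acde995f826

This shows the capacity, access mode, storage class and the claim (default/pure-file) the volume is bound to.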
Deploy Minio
Ideally, when we deploy (third party) software solutions on our Kubernetes environment, we want to use a package manager. Most commonly Helm is used, as described on the Minio site: https://docs.min.io/docs/deploy-minio-on-kubernetes. You'd also generally deploy Minio with persistent storage in a distributed manner, where the different replicas (pods) replicate their data to protect against data loss.
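Just as a reference (we won't use it in this lab), a Helm based standalone install could look something like the sketch below. I'm assuming the legacy minio/minio chart here; the repository URL and value names differ per chart version, so check helm show values for the chart you actually use before copying this.

# sketch only: add the (legacy) Minio chart repo and install a standalone instance
helm repo add minio https://helm.min.io/
helm install my-minio minio/minio \
  --set accessKey=minio,secretKey=minio123 \
  --set persistence.storageClass=pure-block,persistence.size=10Gi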
However, the lab states to create YAML files, so let's go ahead and deploy Minio the "hard" way using YAML, as a single instance (standalone) deployment.
Application YAML
First the deployment; we'll use the following (save as minio-deployment.yaml):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio-deployment
spec:
  selector:
    matchLabels:
      app: minio
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: minio
    spec:
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: pure-block
      containers:
      - name: minio
        image: minio/minio:latest
        args:
        - server
        - /storage
        env:
        - name: MINIO_ACCESS_KEY
          value: "minio"
        - name: MINIO_SECRET_KEY
          value: "minio123"
        ports:
        - containerPort: 9000
          hostPort: 9000
        volumeMounts:
        - name: storage # must match the volume name, above
          mountPath: "/storage"
To be honest, I just copied this file from the internet (Google is my friend 🙂). All I did was change line 19 (claimName) to point to our block storage volume pure-block. Also note the access key and secret key on lines 27 – 30, since we will need these to log in to Minio.
Network service YAML
Next we’ll deploy a service to be able to access the Minio deployment, using a load balancer (save as minio-service.yaml).
apiVersion: v1
kind: Service
metadata:
  name: minio-service
spec:
  type: LoadBalancer
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
  selector:
    app: minio
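A quick side note: a Service of type LoadBalancer only gets an external IP if your cluster has a load balancer implementation (more on that below). If you don't have one, a NodePort service is a simple alternative that exposes Minio on a port on every worker node; this is just a variant of the service above, not part of the lab:

apiVersion: v1
kind: Service
metadata:
  name: minio-service
spec:
  type: NodePort          # expose Minio on a node port instead of an external IP
  ports:
    - port: 9000
      targetPort: 9000
      protocol: TCP
  selector:
    app: minio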
To create the deployment and service, we need to apply the YAML files, just like we did for our PVCs earlier.
kubectl apply -f minio-deployment.yaml
kubectl apply -f minio-service.yaml
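Optionally (not required by the lab) you can wait for the deployment to finish rolling out before looking at the service; this command blocks until the Minio pod reports ready:

kubectl rollout status deployment/minio-deployment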
In my environment I installed the MetalLB load balancer (https://metallb.universe.tf/), so if I now issue kubectl get service, I get the following output:
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
minio-service   LoadBalancer   10.234.50.116   10.1.1.11     9000:31852/TCP   2m22s
The output shows an external IP address which I can use to access my Minio instance. Browse to http://10.1.1.11:9000 to access Minio. Login using the access key and secret key mentioned above and upload some files.
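If you prefer the command line over the web UI, the MinIO client (mc) can also be used to upload some test data. This is just a sketch, assuming mc is installed on your workstation; the alias, bucket and file names are placeholders, while the IP address and credentials are the ones from above (on older mc releases, mc config host add replaces mc alias set).

# point mc at our Minio service and copy a file into a new bucket
mc alias set lab http://10.1.1.11:9000 minio minio123
mc mb lab/testbucket
mc cp ./somefile.txt lab/testbucket/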
Do a failover of Minio
To now do a failover, let’s see on which node Minio is running:
kubectl get pod -o wide
This should output something like:
NAME                                READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
minio-deployment-7cc4c464bb-kln89   1/1     Running   0          14m   10.234.90.26   node1   <none>           <none>
The output shows that Minio runs on node1. Now we will kill this pod and have Kubernetes restart it:
kubectl delete pod minio-deployment-7cc4c464bb-kln89
kubectl get pod -o wide
The first command deletes the pod. Kubernetes automatically restarts it, since our deployment states we want one replica. The get pod output now shows Minio running on node3:
NAME                                READY   STATUS              RESTARTS   AGE   IP       NODE    NOMINATED NODE   READINESS GATES
minio-deployment-7cc4c464bb-tskzn   0/1     ContainerCreating   0          7s    <none>   node3   <none>           <none>
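To convince yourself that the new pod attached the same persistent volume, you can describe it and look at the claim it mounts (the pod name is of course specific to my run):

kubectl describe pod minio-deployment-7cc4c464bb-tskzn | grep ClaimName

This should return ClaimName: pure-block, the same PVC as before.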
So the next step is to log in to Minio again and, sure enough, all our files are still there!
Conclusion
So we've seen how easy it is to deploy volumes using PSO: the YAML to do it takes just four variables. Then we used YAML to deploy Minio with persistent storage, by pointing the Minio configuration to one of those volumes. Finally we proved to ourselves that by storing data on a persistent volume (provisioned by PSO), we still have access to our data even if the pod fails.
I hope you enjoyed reading along and maybe trying some of these exercises yourself. Hope to see you in the next lab!