In this third post of the series, I’ll describe how to deploy Elasticsearch on Kubernetes using Elastic Cloud on Kubernetes (ECK). We need this Elastic deployment to ingest the log files created by the Python application that I built in the first part of this series. As a quick reminder, the purpose of this blog series is to write about the steps I took in my journey to build a teeny tiny data pipeline using Python, Kubernetes and Elasticsearch.

The idea of the data pipeline I’m creating is to use Python to export data from the FlashArray and FlashBlade to a log file and then ingest this data into Elasticsearch. In my example, I’ll be collecting volume statistics, such as size, used space and the data reduction numbers per volume.

Data pipeline overview

So let’s move on and start deploying Elasticsearch on Kubernetes!

Elastic Cloud on Kubernetes

There are numerous ways to deploy Elasticsearch, and if you are planning a production deployment, I would encourage you to do your own research on the deployment method that works best for you. The Elastic site provides great resources on architecture, planning, requirements, etc. However, for the purpose of this blog, where I just want to get an Elasticsearch and Kibana instance up quickly, I’ll be using the Elastic Cloud on Kubernetes deployment.

About Elastic Cloud on Kubernetes

Why did I choose Elastic Cloud on Kubernetes? Well, primarily because it’s really simple to deploy. The way it works is that you deploy an Operator on your Kubernetes environment. Once this Operator is up and running, it takes care of deploying, updating and maintaining your Elastic deployments. So if you want to get up and running quickly, this is a great way to get started. And if you want to, for example, scale the number of nodes after the initial deployment, you change the node count in your manifest and the Operator takes care of the heavy lifting for you.

Deploy Elastic Cloud operator

For this deployment, I’ll follow the steps outlined in the Elastic documentation, but so that you can follow along, I’ll write down the commands I used. As mentioned, the first step is to deploy the Elastic Cloud operator, for which the Elastic site provides a single command.

kubectl apply -f https://download.elastic.co/downloads/eck/1.3.0/all-in-one.yaml

This will install the operator in the elastic-system namespace, so we can see the operator using:

kubectl get all -n elastic-system

This should show you a single elastic-operator-0 pod in a Running state. To follow the logs of the operator, use the following:

kubectl -n elastic-system logs -f statefulset.apps/elastic-operator

Deploy Elasticsearch

Next we want to deploy our actual Elasticsearch instance, which is now pretty straightforward.

I’ve defined the following YAML file, which will deploy a three-node Elasticsearch cluster. In the volumeClaimTemplates I’ve specified the pure-file storage class, so each node gets a persistent volume backed by a high-performance NFS share on the FlashBlade system in my lab.

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.10.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: pure-file
  # http:
  #   service:
  #     spec:
  #       # expose this cluster Service with a LoadBalancer
  #       type: LoadBalancer

I’ve saved the yaml above as cluster.yaml and can now deploy that on my Kubernetes cluster.

kubectl apply -f cluster.yaml

This will deploy my three-node Elastic cluster in the current namespace. Depending on your environment, you might first want to create a new namespace and use the -n flag to deploy Elastic to that namespace.
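Since the namespace, node count and storage class are the values you’re most likely to change, here is a minimal Python sketch that generates the same manifest with those values as parameters. The function names (elasticsearch_manifest, write_manifest) and the example namespace "elastic" are my own for illustration. I’m writing JSON instead of YAML only to stay within the Python standard library; kubectl accepts JSON manifests as well.

```python
import json


def elasticsearch_manifest(name="quickstart", namespace="default", count=3,
                           storage_class="pure-file", size="100Gi",
                           version="7.10.0"):
    """Build the Elasticsearch custom resource from this post as a plain dict."""
    return {
        "apiVersion": "elasticsearch.k8s.elastic.co/v1",
        "kind": "Elasticsearch",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "version": version,
            "nodeSets": [{
                "name": "default",
                "count": count,
                "config": {"node.store.allow_mmap": False},
                "volumeClaimTemplates": [{
                    "metadata": {"name": "elasticsearch-data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "resources": {"requests": {"storage": size}},
                        "storageClassName": storage_class,
                    },
                }],
            }],
        },
    }


def write_manifest(path, manifest):
    """Write the manifest as JSON so it can be passed to `kubectl apply -f`."""
    with open(path, "w") as fh:
        json.dump(manifest, fh, indent=2)


# Example (the "elastic" namespace is hypothetical and must already exist):
# write_manifest("cluster.json", elasticsearch_manifest(namespace="elastic"))
# then: kubectl apply -f cluster.json
```

Changing count and applying the file again is also how you would scale the cluster later; the operator picks up the difference.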

To check the deployment use:

kubectl get all,pvc

This should show you the three pods being created, initializing and going into a running state. You should also see the three persistent volume claims being created and bound.
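If you want to wait for the cluster from a script instead of watching the output, something like the sketch below could work. It reads the status that the operator writes on the Elasticsearch resource. The field names (health, phase, availableNodes) are what I understand ECK to report, so treat them as an assumption and compare them with the output of kubectl get elasticsearch quickstart -o json on your own cluster. The helper names are made up for this example.

```python
import json
import subprocess


def cluster_summary(es_resource: dict) -> dict:
    """Pull the interesting status fields out of the Elasticsearch resource."""
    status = es_resource.get("status", {})
    return {
        "health": status.get("health", "unknown"),
        "phase": status.get("phase", "unknown"),
        "available_nodes": status.get("availableNodes", 0),
    }


def is_ready(es_resource: dict, expected_nodes: int = 3) -> bool:
    """True when the cluster is green, Ready and has all expected nodes."""
    summary = cluster_summary(es_resource)
    return (
        summary["health"] == "green"
        and summary["phase"] == "Ready"
        and summary["available_nodes"] >= expected_nodes
    )


def fetch_elasticsearch(name: str = "quickstart") -> dict:
    """Fetch the resource with kubectl (needs access to the cluster)."""
    out = subprocess.run(
        ["kubectl", "get", "elasticsearch", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Example: print(is_ready(fetch_elasticsearch()))
```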

Finally, you’ll notice the three services created for this Elasticsearch instance (quickstart-es-default, quickstart-es-http, quickstart-es-transport). All services are created with type ClusterIP, which is fine for me. However, if you want to expose Elasticsearch outside of Kubernetes, you could for example change the service type to LoadBalancer. I’ve added the YAML to do so in the example above, but commented it out.

The final step is to get the password that was automatically generated for the elastic user. Use the following to show the password, and note it down.

kubectl get secret quickstart-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode; echo
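The base64 step is there because Kubernetes stores secret values base64-encoded under the data key. As a small sketch, this is the same lookup in Python. It assumes kubectl is available on your PATH, and the function names are my own.

```python
import base64
import json
import subprocess


def decode_secret_value(secret: dict, key: str = "elastic") -> str:
    """Return the plain-text value stored under `key` in a Kubernetes Secret."""
    return base64.b64decode(secret["data"][key]).decode("utf-8")


def fetch_secret(name: str = "quickstart-es-elastic-user") -> dict:
    """Fetch the Secret as JSON with kubectl (needs access to the cluster)."""
    out = subprocess.run(
        ["kubectl", "get", "secret", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Example: password = decode_secret_value(fetch_secret())
```

Keep in mind that base64 is an encoding, not encryption, so anyone who can read the secret can read the password.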

Deploy Kibana

Next we want to deploy Kibana, since it’s just a lot easier to interact with Elastic through the Kibana web interface. We use the default sample provided on the Elastic website, however in this case I did change the service type from ClusterIP to LoadBalancer, so that I can access my Kibana instance from outside of the Kubernetes cluster.

apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart
spec:
  version: 7.10.0
  count: 1
  elasticsearchRef:
    name: quickstart
  http:
    service:
      spec:
        type: LoadBalancer

Save the yaml to kibana.yaml and apply this:

kubectl apply -f kibana.yaml

That will kick off our Kibana deployment. Check the pods using:

kubectl get pod --selector='kibana.k8s.elastic.co/name=quickstart'

And get the IP address assigned by the load balancer using:

kubectl get service quickstart-kb-http

You should now be able to navigate to Kibana using the External IP address, pointing your browser to:

https://[IP address]:5601

This should open up the login page for Kibana.

Kibana login dialog

Here you can log in with the user elastic and the password that you noted down earlier during the Elastic deployment.
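If you script your lab setup, the Kibana URL can also be derived from the service itself. The sketch below assumes the standard Kubernetes Service layout, where the load balancer address shows up under status.loadBalancer.ingress. The kibana_url helper is a name I made up for this example.

```python
import json
import subprocess


def kibana_url(service: dict, port: int = 5601) -> str:
    """Build the Kibana URL from the JSON of the quickstart-kb-http service."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    if not ingress:
        raise ValueError("the load balancer has not assigned an external address yet")
    # Some load balancers report a hostname instead of an IP address.
    address = ingress[0].get("ip") or ingress[0].get("hostname")
    return f"https://{address}:{port}"


def fetch_service(name: str = "quickstart-kb-http") -> dict:
    """Fetch the Service as JSON with kubectl (needs access to the cluster)."""
    out = subprocess.run(
        ["kubectl", "get", "service", name, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Example: print(kibana_url(fetch_service()))
```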

Once you’ve logged in, go ahead and explore some of the options of Elastic and Kibana. The possibilities are almost infinite.

Deploy Filebeat

Finally, we want to ship the Kubernetes log files to Elasticsearch so that we can start analysing the data in Kibana. For this we will use Filebeat, which can also be deployed using Elastic Cloud on Kubernetes.

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: quickstart
spec:
  type: filebeat
  version: 7.10.0
  elasticsearchRef:
    name: quickstart
  kibanaRef:
    name: quickstart
  config:
    filebeat.inputs:
    - type: container
      paths:
      - /var/log/containers/*.log
    processors:
    - add_cloud_metadata: {}
    - add_host_metadata: {}
  daemonSet:
    podTemplate:
      spec:
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
        - name: filebeat
          volumeMounts:
          - name: varlogcontainers
            mountPath: /var/log/containers
          - name: varlogpods
            mountPath: /var/log/pods
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
        volumes:
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers

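This is quite a straightforward deployment as described in the documentation, so there is not a lot else to mention here. Filebeat runs as a DaemonSet, so every node gets a pod that reads the container logs from the host paths mounted above. Let’s save this as filebeat.yaml and apply it.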

kubectl apply -f filebeat.yaml

This should deploy filebeat and start shipping the log info into Elastic. Use the following command to check the pod status.

kubectl get pod --selector='beat.k8s.elastic.co/name=quickstart'
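To confirm that logs are actually arriving, you could ask Elasticsearch which Filebeat indices exist. The sketch below makes a few assumptions: the Elasticsearch service is reachable on https://localhost:9200 (for example through kubectl port-forward service/quickstart-es-http 9200), Filebeat writes to indices whose names start with filebeat-, and the password is the one for the elastic user from earlier. Certificate verification is turned off because the certificate is self-signed by default, which I’d only do in a lab like mine. The function names are my own.

```python
import base64
import json
import ssl
import urllib.request


def build_request(password: str, host: str = "https://localhost:9200",
                  pattern: str = "filebeat-*") -> urllib.request.Request:
    """Prepare a request to the _cat/indices API with basic authentication."""
    token = base64.b64encode(f"elastic:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{host}/_cat/indices/{pattern}?format=json")
    request.add_header("Authorization", f"Basic {token}")
    return request


def list_filebeat_indices(password: str) -> list:
    """Return the matching indices as a list of dicts (needs network access)."""
    # Lab only: skip verification of the self-signed certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(build_request(password), context=context) as resp:
        return json.load(resp)


# Example:
# for index in list_filebeat_indices("<password noted down earlier>"):
#     print(index["index"], index["docs.count"])
```

A document count that keeps growing between two calls is a good sign that Filebeat is shipping logs.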


We’ve made great progress on our data pipeline journey! In the first two blogs we created and containerised our log-generating application, and we have now deployed a basic Elasticsearch environment that is ingesting our Kubernetes logs, so we should be able to bring this all together! However, that’s something for the next blog in this series, where we’ll deploy our containerised app on Kubernetes and transform the logs into a usable format for Elasticsearch and Kibana. I hope you enjoyed the read, and I’m looking forward to seeing you in the next one.
