Azure Kubernetes Service (AKS): Deploying Elasticsearch, Logstash and Kibana (ELK) and consuming messages from Azure Event Hub

This article is part of a series of blogs on Azure Kubernetes Service (AKS). In this article I am going to share my experience of setting up an Elasticsearch, Logstash and Kibana (ELK) cluster in Azure Kubernetes Service (AKS) and consuming messages from Event Hub. By the end of this article, we are going to have a fully functional ELK stack with Azure Event Hub integration.

A sample client app (e.g. an IoT device) will be publishing messages to Event Hub, and these messages will be ingested into Elasticsearch using Logstash's 'Azure Event Hubs' plugin. This article needs the x-pack features of Elasticsearch, thus I will also show the steps needed to activate a trial license.

The second part of this series goes through the steps needed to enable Azure AD SAML based single sign-on to secure Elasticsearch and Kibana hosted in AKS. The third part of this series goes through the steps needed to ingest Azure Redis Cache messages into Elasticsearch using Logstash's Redis plugin.

The dev tools used to develop these components are Visual Studio for Mac/Visual Studio 2017; the AKS Dashboard as well as kubectl commands are used to create/manage Kubernetes resources in AKS.

Azure Kubernetes Service (AKS) Cluster

In case you don't have an AKS cluster up and running, please go through this article to Create AKS Cluster. Once the AKS cluster is up and running, you can go through the rest of the article. The code snippets in this article are mostly yaml snippets and are included for reference only, as formatting may get distorted; please refer to the GitHub repository for the formatted resources.
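If kubectl isn't pointed at your cluster yet, a quick sanity check looks like this (the resource group and cluster names below are placeholders for your own):

# Merge the AKS credentials into your local kubeconfig
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
# Every node should report STATUS 'Ready'
kubectl get nodes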

Event Hub

Event Hub messages will be ingested into Elasticsearch using a Logstash pipeline, thus the first resource I am going to create is an Event Hub. Please follow the steps listed in this article to create an Event Hub, or use the CLI sketch after the list below. The main pointers are

  • The Event Hub name I have selected for this sample solution is 'logstash'. If you select a different name, update the event hub name in the source code.
  • Keep a note of the Event Hub connection string as this needs to be updated in the Logstash pipeline and the Event Hub messages publisher client.
  • The source code uses the '$Default' consumer group. Update this value if you create a dedicated consumer group.
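If you prefer the Azure CLI over the portal, a rough sketch of the equivalent commands is below; the resource group and namespace names are placeholders, only the hub name 'logstash' matches the sample code.

# Create an Event Hubs namespace and the 'logstash' event hub
az eventhubs namespace create --resource-group myResourceGroup --name my-logstash-ns --location westus2
az eventhubs eventhub create --resource-group myResourceGroup --namespace-name my-logstash-ns --name logstash
# Print the namespace connection string needed by the Logstash pipeline and the publisher client
az eventhubs namespace authorization-rule keys list --resource-group myResourceGroup --namespace-name my-logstash-ns --name RootManageSharedAccessKey --query primaryConnectionString -o tsv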

Azure Blob Storage

The next resource which you will need to create is Azure Blob Storage; the Logstash Event Hubs plugin uses it to persist its processing offsets. Please follow the steps listed in this article to create a storage account, or use the CLI sketch after the list below. Once the storage account is created, create a blob container under Blob Service -> Blobs. The main pointers are

  • The blob container name I have specified in the source code is 'logstash'. If you select a different name, update the storage container name in the source code.
  • Keep a note of the storage connection string as this needs to be updated in the Logstash pipeline.
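Again, a hedged Azure CLI alternative to the portal steps; the account name mylogstashstore is a placeholder, only the container name 'logstash' matches the sample code.

# Create the storage account and grab its connection string
az storage account create --resource-group myResourceGroup --name mylogstashstore --sku Standard_LRS
conn=$(az storage account show-connection-string --resource-group myResourceGroup --name mylogstashstore -o tsv)
# Create the 'logstash' blob container used by the Logstash pipeline
az storage container create --name logstash --connection-string "$conn"
# Print the connection string so it can be pasted into the pipeline
echo "$conn"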

Client App to send messages to Event Hub

The AzureEventHubSample project is the client app that sends messages to Event Hub. You will need to update the connectionString variable with the Event Hub connection string and the name of the hub. You can download the source code of this publisher client from GitHub.

Deploy Elasticsearch to Azure Kubernetes Service

Elasticsearch is a near real-time search platform. The steps needed to deploy Elasticsearch to the AKS cluster are listed below.

Create a Persistent Volume Claim

A persistent volume claim is needed to store Elasticsearch data. The yaml snippet to create a 5 GB storage claim is displayed below. The StatefulSet resource is going to mount files to this storage claim. You can read more about Persistent Volumes.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sample-elasticsearch-data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
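Assuming you saved the snippet above as elasticsearch-pvc.yaml (the file name is mine, not from the repository), the claim can be created and verified with:

kubectl apply -f elasticsearch-pvc.yaml
# STATUS should move to 'Bound' once AKS provisions the underlying disk
kubectl get pvc sample-elasticsearch-data-claim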

Create a Kubernetes ConfigMap

ConfigMaps allow you to decouple configuration aspects from the image. A few pointers about the yaml snippet displayed below are

  • elasticsearch.yml and role_mapping.yml files will be mounted from ConfigMap
  • xpack-security is enabled
  • Update max_local_storage_nodes based on number of Pods
  • role_mapping.yml is intentionally left blank as it will be discussed in detail in next blogs of this series

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-elasticsearch-configmap
  namespace: default
data:
  elasticsearch.yml: |
    cluster.name: "sample-elasticsearch-cluster"
    network.host: 0.0.0.0
    discovery.zen.minimum_master_nodes: 1
    #Update max_local_storage_nodes value based on number of nodes
    node.max_local_storage_nodes: 1
    xpack.security.enabled: true
    xpack.monitoring.collection.enabled: true
  role_mapping.yml: |

You can read more about ConfigMap.

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Elasticsearch. As you can see in the yaml snippet below, ports 9200/9300 are defined and the type is ClusterIP, i.e. this service doesn't have external endpoints. Kubernetes will use the selector 'service: sample-elasticsearch' to map the service to the deployment, as you are going to see next. You can read more about Services.

apiVersion: v1
kind: Service
metadata:
  name: sample-elasticsearch
  labels:
    service: sample-elasticsearch
spec:
  type: ClusterIP
  selector:
    service: sample-elasticsearch
  ports:
  - name: http
    port: 9200
    targetPort: 9200
    protocol: TCP
  - name: transport
    port: 9300
    targetPort: 9300
    protocol: TCP

Create a Kubernetes StatefulSet

Kubernetes StatefulSet is the workload API used to manage stateful applications. You can read more about StatefulSets. The yaml snippet to create a StatefulSet is displayed below and a few pointers are

  • service: sample-elasticsearch matches the selector defined in the service.
  • I have specified replicas: 1, which means that only one instance of the Pod will be created by Kubernetes. You can update this value as needed; however, you will need to update the ConfigMap value node.max_local_storage_nodes: 1 too.
  • The docker image being used to create this resource is docker.elastic.co/elasticsearch/elasticsearch:6.4.1.
  • By default, Elasticsearch runs inside the container as user elasticsearch using uid:gid 1000:1000. If you are bind-mounting a local directory or file, ensure it is readable by this user, while the data and log dirs additionally require write access. This is the reason I have used the environment variable - name: "TAKE_FILE_OWNERSHIP" value: "1". The alternate option is to grant write access by adding an initContainer step, e.g. command: - sh - -c - chown -R 1000:1000 /usr/share/elasticsearch/data. You can read more at Elasticsearch Docker.
  • The mmap count has been increased to vm.max_map_count=262144 by adding an initContainer step. You can read more about Elasticsearch virtual memory.
  • The ConfigMap defined in the previous step is used to mount the elasticsearch.yml and role_mapping.yml files, i.e. configMap: name: sample-elasticsearch-configmap.
  • Lastly, the persistent volume claim created above is used for storage, i.e. persistentVolumeClaim: claimName: sample-elasticsearch-data-claim.
  • Assign CPU resources as needed by updating the resources: limits: section. You can read more about Assign CPU Resources to Containers and Pods.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sample-elasticsearch
  labels:
    service: sample-elasticsearch
spec:
  serviceName: sample-elasticsearch
  # Number of PODS
  replicas: 1
  selector:
    matchLabels:
      service: sample-elasticsearch
  template:
    metadata:
      labels:
        service: sample-elasticsearch
    spec:
      terminationGracePeriodSeconds: 15
      initContainers:
      # https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html
      - name: increase-the-vm-max-map-count
        image: busybox
        command:
        - sysctl
        - -w
        - vm.max_map_count=262144
        securityContext:
          privileged: true
      containers:
      - name: sample-elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:6.4.1
        resources:
          limits:
            cpu: 2000m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
        ports:
        - containerPort: 9200
          name: http
        - containerPort: 9300
          name: tcp
        env:
          - name: cluster.name
            value: "sample-elasticsearch-cluster"
          - name: "xpack.security.enabled"
            value: "true"
          - name: "TAKE_FILE_OWNERSHIP"
            value: "1"
        volumeMounts:
        - name: sample-elasticsearch-data-claim
          mountPath: /usr/share/elasticsearch/data
        - name: sample-elasticsearch-configmap
          mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
          subPath: elasticsearch.yml
        - name: sample-elasticsearch-configmap
          mountPath: /usr/share/elasticsearch/config/role_mapping.yml
          subPath: role_mapping.yml
      volumes:
      - name: sample-elasticsearch-data-claim
        persistentVolumeClaim:
          claimName: sample-elasticsearch-data-claim
      - name: sample-elasticsearch-configmap
        configMap:
          name: sample-elasticsearch-configmap
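Assuming the ConfigMap, Service and StatefulSet snippets were saved to the file names below (the names are mine), the resources can be created and watched until the Pod is ready:

kubectl apply -f elasticsearch-configmap.yaml
kubectl apply -f elasticsearch-service.yaml
kubectl apply -f elasticsearch-statefulset.yaml
# Wait until sample-elasticsearch-0 reports 1/1 Running
kubectl get pods -l service=sample-elasticsearch -w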

By default, Elasticsearch will be deployed with a basic license. After Elasticsearch is deployed, the next step is to activate the trial license of Elasticsearch in order to use its x-pack features.

Enable Trial Version of Elasticsearch

The steps needed to activate the trial license are

  • Run command kubectl port-forward sample-elasticsearch-0 9200:9200 and you can now access the Elasticsearch endpoint at https://localhost:9200. Send a POST https://localhost:9200/_xpack/license/start_trial?acknowledge=true request from any REST client; this is going to activate the trial license (see the curl sketch after this list).

  • You can verify whether the trial license is activated by sending a GET https://localhost:9200/_xpack/license request from any REST client.
  • Once the trial license is activated, you can close the terminal as port forwarding isn't needed anymore.
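For reference, the same two requests as curl commands; this is a sketch, and if the request fails over https, switch to http:// as TLS isn't configured until the next part of this series:

# Activate the trial license
curl -XPOST 'https://localhost:9200/_xpack/license/start_trial?acknowledge=true'
# Verify the active license type is 'trial'
curl 'https://localhost:9200/_xpack/license'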

Setup Passwords for built-in user accounts of Elasticsearch

The x-pack security feature of Elasticsearch is used to secure access, thus we now need to set up passwords for the built-in user accounts. The steps are

  • Connect to the Elasticsearch POD by running command kubectl exec -ti sample-elasticsearch-0 bash
  • Run command bin/elasticsearch-setup-passwords interactive to set up the built-in user passwords interactively. For this sample I have specified the same password, i.e. Password1$, for all accounts; if you specify a different password, you will need to change it in a few places in the source code (a quick verification sketch follows this list).
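A quick way to confirm the passwords took effect; the single quotes keep the shell from expanding the $ in Password1$ (again, switch to http:// if TLS isn't configured):

kubectl port-forward sample-elasticsearch-0 9200:9200 &
# Should return cluster health instead of a 401
curl -u elastic:'Password1$' 'https://localhost:9200/_cluster/health?pretty'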

Deploy Kibana to Azure Kubernetes Service

Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack. The steps needed to deploy Kibana to AKS are listed below

Create a Kubernetes ConfigMap

The yaml snippet to create this resource is displayed below and a few pointers are

  • kibana.yml file will be mounted from ConfigMap
  • Kibana points to Elasticsearch based on elasticsearch.url: https://sample-elasticsearch:9200
  • Update elasticsearch.password: Password1$ based on your specified password for the kibana (built-in) user

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-kibana-configmap
  namespace: default
data:
  kibana.yml: |
    server.name: sample-kibana
    server.host: "0"
    elasticsearch.url: https://sample-elasticsearch:9200
    xpack.monitoring.ui.container.elasticsearch.enabled: true
    elasticsearch.username: kibana
    elasticsearch.password: Password1$

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Kibana. As you can see in the yaml snippet below, port 80 is defined and the type is LoadBalancer, i.e. this service has external endpoints. Kubernetes will use the selector 'component: sample-kibana' to map the service to the deployment, as you are going to see next. The creation of this service is going to take a while; once done, you can get the external endpoint of this service either by opening the AKS Dashboard or by running the kubectl command kubectl describe services sample-kibana.

apiVersion: v1
kind: Service
metadata:
  name: sample-kibana
  labels:
    component: sample-kibana
spec:
  type: LoadBalancer
  selector:
    component: sample-kibana
  ports:
  - name: http
    port: 80
    targetPort: http

Create a Kubernetes Deployment

The next step is to create a Kubernetes Deployment for Kibana. The yaml snippet is displayed below and a few pointers are

  • The docker image being used to create this resource is image: docker.elastic.co/kibana/kibana:6.4.1
  • You can change the number of pods by updating replicas: 1
  • Label component: sample-kibana has to match the selector defined in the service
  • ConfigMap defined in previous step is used to mount kibana.yml file i.e. configMap: name: sample-kibana-configmap

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: sample-kibana
  labels:
    component: sample-kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      component: sample-kibana
  template:
    metadata:
      labels:
        component: sample-kibana
    spec:
      containers:
      - name: sample-kibana
        image: docker.elastic.co/kibana/kibana:6.4.1
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 5601
          name: http
        volumeMounts:
        - name: sample-kibana-configmap
          mountPath: /usr/share/kibana/config/kibana.yml
          subPath: kibana.yml
        terminationMessagePath: "/dev/termination-log"
        terminationMessagePolicy: File
        imagePullPolicy: Always
        securityContext:
          privileged: false
      volumes:
      - name: sample-kibana-configmap
        configMap:
          name: sample-kibana-configmap
      restartPolicy: Always
      terminationGracePeriodSeconds: 5
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler
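Assuming the Kibana snippets were saved to the file names below (the names are mine), deploy and wait for the public endpoint:

kubectl apply -f kibana-configmap.yaml
kubectl apply -f kibana-service.yaml
kubectl apply -f kibana-deployment.yaml
# Repeat until EXTERNAL-IP changes from <pending> to a public address
kubectl get service sample-kibana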

Open the external endpoint of Kibana service once deployment is completed. Since x-pack security is enabled, Kibana will prompt for credentials.

Deploy Logstash to Azure Kubernetes Service

Logstash is a data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to Elasticsearch. Logstash will use the Azure Event Hubs plugin to ingest data into Elasticsearch. The steps needed to deploy Logstash to AKS are listed below

Create a Kubernetes ConfigMap

The yaml snippet to create ConfigMap is displayed below and a few pointers are

  • logstash.yml file will be mounted from ConfigMap
  • pipelines.yml file will be mounted from ConfigMap. Multiple pipelines can be defined in this file, e.g. you can see that the azureeventhubs pipeline is defined.
  • azureeventhubs.cfg file will be mounted from ConfigMap. The Logstash event processing pipeline has three stages: inputs → filters → outputs. This file defines the Logstash pipeline for Azure Event Hub.
    • Update the {AZURE_EVENT_HUB_CONNECTION_STRING} and {STORAGE_ACCOUNT_CONNECTION_STRING} values based on your Event Hub and Storage Account values.
    • Update EntityPath in case your event hub is named differently than logstash.
    • Update storage_container in case your storage container is named differently than logstash.
    • Update consumer_group in case your consumer group is different than $Default. Specifying $Default isn't an ideal option.
    • The index name is defined as index => "azureeventhub-%{+YYYY.MM.dd}"
  • logstash.conf file will be mounted from ConfigMap
  • Elasticsearch URL is defined in xpack.monitoring.elasticsearch.url: https://sample-elasticsearch:9200
  • Update xpack.monitoring.elasticsearch.password: Password1$ based on your specified password for the logstash_system (built-in) user.
  • Elasticsearch endpoint is defined in hosts => [ "sample-elasticsearch:9200" ]
  • Update password => "Password1$" to the specified password of your elastic (built-in) user

apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-logstash-configmap
  namespace: default
data:
  logstash.yml: |
    xpack.monitoring.elasticsearch.url: https://sample-elasticsearch:9200
    dead_letter_queue.enable: true
    xpack.monitoring.enabled: true
    xpack.monitoring.elasticsearch.username: logstash_system
    xpack.monitoring.elasticsearch.password: Password1$
  pipelines.yml: |
    - pipeline.id: azureeventhubs
      path.config: "/usr/share/logstash/azureeventhubs.cfg"
  azureeventhubs.cfg: |
    input {
      azure_event_hubs {
        event_hub_connections => ["{AZURE_EVENT_HUB_CONNECTION_STRING};EntityPath=logstash"]
        threads => 2
        decorate_events => true
        consumer_group => "$Default"
        storage_connection => "{STORAGE_ACCOUNT_CONNECTION_STRING}"
        storage_container => "logstash"
      }
    }
    filter {
    }
    output {
      elasticsearch {
        hosts => [ "sample-elasticsearch:9200" ]
        user => "elastic"
        password => "Password1$"
        index => "azureeventhub-%{+YYYY.MM.dd}"
      }
    }
  logstash.conf: |

Create a Kubernetes Service

The next step is to create a Kubernetes Service for Logstash. As you can see in the yaml snippet below, port 80 is defined and the type is ClusterIP, i.e. this service has no external endpoints. Kubernetes will use the selector 'component: sample-logstash' to map the service to the deployment, as you are going to see next.

apiVersion: v1
kind: Service
metadata:
  name: sample-logstash
  labels:
    component: sample-logstash
spec:
  type: ClusterIP
  selector:
    component: sample-logstash
  ports:
  - name: http
    port: 80
    targetPort: http

Create a Kubernetes Deployment

The next step is to create a Kubernetes Deployment for Logstash. The yaml snippet is displayed below and a few pointers are

  • The docker image being used to create this resource is docker.elastic.co/logstash/logstash:6.4.1
  • You can change the number of pods by updating replicas: 1
  • Label component: sample-logstash has to match the selector defined in the service
  • ConfigMap defined in previous step is used to mount logstash.yml, logstash.conf, pipelines.yml and azureeventhubs.cfg files i.e. configMap: name: sample-logstash-configmap

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: sample-logstash
  labels:
    component: sample-logstash
spec:
  replicas: 1
  selector:
    matchLabels:
      component: sample-logstash
  template:
    metadata:
      labels:
        component: sample-logstash
    spec:
      containers:
      - name: sample-logstash
        image: docker.elastic.co/logstash/logstash:6.4.1
        volumeMounts:
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/config/logstash.yml
          subPath: logstash.yml
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/pipeline/logstash.conf
          subPath: logstash.conf
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/azureeventhubs.cfg
          subPath: azureeventhubs.cfg
        - name: sample-logstash-configmap
          mountPath: /usr/share/logstash/config/pipelines.yml
          subPath: pipelines.yml
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 5601
          name: http
      volumes:
      - name: sample-logstash-configmap
        configMap:
          name: sample-logstash-configmap
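Assuming the Logstash snippets were saved to the file names below (the names are mine), deploy and tail the logs to confirm the pipeline starts:

kubectl apply -f logstash-configmap.yaml
kubectl apply -f logstash-service.yaml
kubectl apply -f logstash-deployment.yaml
# Look for the azureeventhubs pipeline start message in the output
kubectl logs -f deployment/sample-logstash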

As mentioned earlier, Logstash will use the azure_event_hubs plugin to ingest data into Elasticsearch. You can get the list of installed plugins by following the steps listed below

  • Run command kubectl exec -ti {Logstash_Pod_Name} bash to connect to the Logstash POD.
  • Run command bin/logstash-plugin list to see the installed plugins

Demo

After all resources are deployed to AKS, run the client app to send messages to Event Hub. Open Kibana and you will see ELK stack statistics in the Monitoring section. The number of messages sent by the client app to Event Hub will be displayed in Logstash's events received and events emitted statistics.

The Discover tab of Kibana will display the events ingested into Elasticsearch once you create index patterns, e.g. I have selected the index pattern 'azureeventhub*'; you can see that this index naming convention was defined in the azureeventhubs.cfg pipeline: index => "azureeventhub-%{+YYYY.MM.dd}".
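You can also verify the ingested documents from the Elasticsearch side; a sketch using port forwarding again (single quotes protect the $ and *, and switch to http:// if TLS isn't configured):

kubectl port-forward sample-elasticsearch-0 9200:9200 &
# Returns one sample document from the azureeventhub-* indices
curl -u elastic:'Password1$' 'https://localhost:9200/azureeventhub-*/_search?size=1&pretty'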


You can download the source code for this article from the GitHub repository.


The second part of this series is Azure Kubernetes Service (AKS): Azure AD SAML based Single Sign on to secure Elasticsearch and Kibana and securing communications in ELK