Solving ELK problems: a diary

F. Ana. D
Nov 30, 2020

So a few weeks ago, the server in my office had a poor availability score, and our team was asked to analyze it and find the cause. The server runs a lot of different technology, and long story short, I was given the task of analyzing the ELK service.

I had no knowledge of the ELK stack, but that’s not an excuse! I can just do the research, as I always do. So here I am, collecting what I found: what ELK is, how our service uses ELK, and what can be optimized in our existing ELK setup.

  1. What is the ELK stack
    With zero knowledge about the ELK stack, all you can do is this … googling. haha. Here, I’ll try to summarize what I found. So, ELK is an acronym for three open-source products (Elasticsearch, Logstash, Kibana).
    The ELK stack is a compelling solution for taking data from any source, in any format, and then searching, analyzing, and visualizing that data in near real-time.
    # Elasticsearch is similar to a database. Several of its terms can be compared with relational database terms.
    ELASTICSEARCH → RELATIONAL DATABASE
    Elasticsearch → MySQL
    Index → Database
    Types → Tables (mapping types were deprecated in Elasticsearch 6 and removed in later versions)
    Properties → Columns
    Documents → Rows
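To make the comparison a bit more concrete, here is a minimal sketch of what a "row" and its "columns" could look like on the Elasticsearch side. The index name and the fields are made up for illustration, and the mapping uses the typeless format of newer Elasticsearch versions, which is an assumption about the version in use.

```python
import json

# Hypothetical index name and fields, for illustration only.
INDEX_NAME = "service-logs"

# "Properties" play the role of columns: each one has a name and a type.
mapping = {
    "mappings": {
        "properties": {
            "timestamp": {"type": "date"},
            "level": {"type": "keyword"},
            "message": {"type": "text"},
        }
    }
}

# A "document" plays the role of a row: one JSON object stored in the index.
document = {
    "timestamp": "2020-11-30T08:15:00Z",
    "level": "ERROR",
    "message": "connection timed out",
}

# Check that every field in the document is declared in the mapping.
properties = mapping["mappings"]["properties"]
unknown_fields = [field for field in document if field not in properties]

print(f"document for index {INDEX_NAME}:")
print(json.dumps(document, indent=2))
print("fields missing from the mapping:", unknown_fields)
```

Unlike a relational table, Elasticsearch can also add properties on the fly when a document contains a field it has not seen before, so the analogy is only a rough one.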

    Besides that, there are several terms worth mentioning here for better understanding later.
    → Shards (Primary). Every index is built from one or more shards. Each shard is an instance of a Lucene index, which you can think of as a self-contained search engine that indexes and handles queries for a SUBSET of the data. A common rule of thumb is to keep a shard under about 50GB of data, so an index with 1TB of data needs to be split across more shards (around 20).
    → Replica Shards. A replica shard is just a copy of a primary shard, kept to prevent data loss when something bad happens. An index can have no replica shards at all.
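The shard sizing above is simple arithmetic, so here is a small sketch of it. The 50GB figure is treated as a rule-of-thumb ceiling rather than a hard limit, and the function name is my own, not part of any Elasticsearch API.

```python
import math


def shards_needed(index_size_gb: float, max_shard_gb: float = 50.0) -> int:
    """Estimate how many primary shards keep each shard under max_shard_gb."""
    # An index always has at least one primary shard, even when it is tiny.
    return max(1, math.ceil(index_size_gb / max_shard_gb))


print(shards_needed(1000))   # a 1TB index (counted as 1000GB)
print(shards_needed(0.01))   # a 10MB index
```

With these numbers, the 1TB index comes out at 20 primary shards and the 10MB index at a single shard.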

    # Logstash collects data (from across many systems in many formats), transforms it (deriving structure from unstructured data with grok, deciphering geo coordinates from IP addresses, anonymizing or excluding sensitive fields), and sends it to your favorite “stash” (Elasticsearch, Slack, syslog, etc.).
    # Kibana lets you visualize your Elasticsearch data and navigate the Elastic Stack.
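To get a feel for the "transform" step that Logstash performs, here is a rough Python imitation of it. This is not Logstash configuration and not how grok is implemented; it only mimics the idea of deriving structure from a raw log line and anonymizing a sensitive field. The log format, the field names, and the `parse_failure` tag are all assumptions made up for this sketch.

```python
import re

# Hypothetical log format: "<timestamp> <LEVEL> <client ip> <message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\S+)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<client_ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<message>.*)"
)


def transform(line: str) -> dict:
    """Turn one raw log line into a structured event (a plain dict)."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep the raw line and tag it, so unparsed data is not silently lost.
        return {"message": line, "tags": ["parse_failure"]}
    event = match.groupdict()
    # Anonymize: zero out the last octet of the client IP.
    event["client_ip"] = event["client_ip"].rsplit(".", 1)[0] + ".0"
    return event


raw = "2020-11-30T08:15:00Z ERROR 10.1.2.3 connection timed out"
print(transform(raw))
print(transform("garbage"))
```

In a real pipeline, the structured event would then be shipped to an output such as Elasticsearch instead of being printed.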
  2. How our service uses the ELK stack
    Now that I have a bare-minimum knowledge of the ELK stack, I am trying to figure out where the E, the L, and the K are in our server.
    → The Elasticsearch. Based on the documentation, our service uses Elasticsearch as a view database, which is a read-only replica of the real database.
    → The Logstash. Our service uses Logstash to send daily service-log data to Elasticsearch.
    → The Kibana. Kibana is used to visualize service status, server resource status, and other metrics based on the Elasticsearch data.
  3. What can be optimized in the existing ELK service
    → Indexes. Because Logstash sends the log stream daily, a lot of old Logstash indexes are left open, which affects system performance. We can try using Index Lifecycle Management (ILM) to manage them, for example by deleting indexes once they pass a certain age.
    → Shards. Each Logstash index has 5 shards but very little data (around 1–10MB), which is overkill. We can try reconfiguring Logstash to use one shard per index instead.
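Both ideas above can be sketched as request bodies for Elasticsearch. The snippet below only builds and prints the JSON; it does not talk to a cluster. The 30-day retention, the policy name `logs-cleanup`, and the `logstash-*` index pattern are assumptions for illustration, and the template uses the legacy `_template` format, so check which template API your Elasticsearch version expects before using anything like this.

```python
import json

# Hypothetical values: adjust the retention and names to your own setup.
RETENTION = "30d"
POLICY_NAME = "logs-cleanup"

# Body for `PUT _ilm/policy/<POLICY_NAME>`:
# delete an index once it is older than RETENTION.
ilm_policy = {
    "policy": {
        "phases": {
            "delete": {
                "min_age": RETENTION,
                "actions": {"delete": {}},
            }
        }
    }
}

# Body for a legacy index template (`PUT _template/<template name>`):
# new indexes matching the pattern get one primary shard and the policy above.
index_template = {
    "index_patterns": ["logstash-*"],  # assumes the indexes are named logstash-*
    "settings": {
        "index.number_of_shards": 1,
        "index.lifecycle.name": POLICY_NAME,
    },
}

print(json.dumps(ilm_policy, indent=2))
print(json.dumps(index_template, indent=2))
```

Keep in mind that a template only affects indexes created after it is in place, so the existing 5-shard indexes would stay as they are until they age out or are cleaned up separately.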

That’s enough babble from me for today. Thanks for reading.

F. Ana. D

Software Engineer — [Note: English is not my first language so please bear with me]