ELK Stack
Elasticsearch, Logstash, and Kibana for centralized logging, search, and visualization
The ELK Stack (Elasticsearch, Logstash, Kibana) is one of the most widely adopted self-hosted solutions for centralized log management, full-text search, and data visualization. Elastic also provides Beats, a family of lightweight data shippers; with Beats included, the combination is known as the Elastic Stack.
Overview
| Component | Role | Description |
|---|---|---|
| Elasticsearch | Search & Storage | Distributed search and analytics engine based on Apache Lucene |
| Logstash | Data Processing | Server-side data processing pipeline for ingestion and transformation |
| Kibana | Visualization | Web UI for searching, visualizing, and dashboarding Elasticsearch data |
| Beats | Data Shipping | Lightweight agents for shipping data from edge machines |
Architecture
┌─────────┐    ┌─────────┐    ┌──────────┐
│  App 1  │    │  App 2  │    │  App 3   │
└────┬────┘    └────┬────┘    └────┬─────┘
     │              │              │
     ▼              ▼              ▼
┌─────────┐    ┌─────────┐    ┌──────────┐
│Filebeat │    │Filebeat │    │Metricbeat│
└────┬────┘    └────┬────┘    └────┬─────┘
     │              │              │
     └──────────────┼──────────────┘
                    ▼
             ┌────────────┐
             │  Logstash  │  (Parse, Transform, Enrich)
             └──────┬─────┘
                    ▼
             ┌──────────────┐
             │Elasticsearch │  (Index, Store, Search)
             └──────┬───────┘
                    ▼
               ┌─────────┐
               │ Kibana  │  (Visualize, Dashboard, Alert)
               └─────────┘
Quick Start with Docker Compose
# docker-compose.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.13.0
    container_name: elasticsearch
    environment:
      - discovery.type=single-node
      # Security is disabled for local development only; enable it in production.
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ports:
      - "9200:9200"
    volumes:
      - es-data:/usr/share/elasticsearch/data
    networks:
      - elk

  logstash:
    image: docker.elastic.co/logstash/logstash:8.13.0
    container_name: logstash
    volumes:
      # Pipeline definitions (*.conf) are read from this directory.
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    ports:
      - "5044:5044"       # Beats input
      - "5000:5000/tcp"
      - "5000:5000/udp"
    environment:
      - "LS_JAVA_OPTS=-Xms512m -Xmx512m"
    depends_on:
      - elasticsearch
    networks:
      - elk

  kibana:
    image: docker.elastic.co/kibana/kibana:8.13.0
    container_name: kibana
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch
    networks:
      - elk

  filebeat:
    image: docker.elastic.co/beats/filebeat:8.13.0
    container_name: filebeat
    # Root is needed to read the Docker socket and container log files.
    user: root
    volumes:
      - ./filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - elasticsearch
      - logstash
    networks:
      - elk

volumes:
  es-data:

networks:
  elk:
    driver: bridge
Beats: Data Shippers
| Beat | Purpose | Data Source |
|---|---|---|
| Filebeat | Log files | Application logs, system logs, container logs |
| Metricbeat | System metrics | CPU, memory, disk, network, container stats |
| Packetbeat | Network data | HTTP, DNS, MySQL, Redis protocol analysis |
| Heartbeat | Uptime monitoring | HTTP, TCP, ICMP health checks |
| Auditbeat | Audit data | File integrity, system calls, user activity |
Filebeat Configuration
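# filebeat.yml
filebeat.inputs:
  # Application log files. The `log` input is deprecated in favor of
  # `filestream` in recent Filebeat releases but still works in 8.13.
  - type: log
    enabled: true
    paths:
      - /var/log/app/*.log
    fields:
      service: my-app
    # Lines that do not start with a date (for example stack traces)
    # are appended to the preceding line that does.
    multiline:
      pattern: '^\d{4}-\d{2}-\d{2}'
      negate: true
      match: after

  # Docker container logs
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log

# Enrich events with container metadata (name, image, labels)
processors:
  - add_docker_metadata: ~

output.logstash:
  hosts: ["logstash:5044"]

# Or output directly to Elasticsearch (only one output may be enabled).
# A custom index name also requires setup.template.name and setup.template.pattern.
# output.elasticsearch:
#   hosts: ["elasticsearch:9200"]
#   index: "filebeat-%{+yyyy.MM.dd}"
Deployment Patterns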
Small (Development / Small Team)
- Single-node Elasticsearch
- Logstash on the same host
- Filebeat on application servers
Medium (Production)
- 3-node Elasticsearch cluster with all three nodes master-eligible (a single master-eligible node is a single point of failure)
- Dedicated Logstash instances
- Kafka/Redis as buffer between Beats and Logstash
- Kibana behind reverse proxy with authentication
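The number of master-eligible nodes in a cluster like this matters because Elasticsearch elects its master by majority vote among them, so the cluster can only afford to lose a minority. The sketch below is plain arithmetic rather than an Elasticsearch API, and the function names are hypothetical; it ignores finer points such as how Elasticsearch adjusts its voting configuration automatically.

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of the given number of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    # Print node count, quorum size, and tolerated failures for a few sizes.
    for nodes in (1, 2, 3, 5):
        print(nodes, quorum(nodes), tolerated_failures(nodes))
```

In this simplified model, one or two master-eligible nodes tolerate no failures, and three is the smallest count that survives the loss of one.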
Large (Enterprise)
- Dedicated master, data, ingest, and coordinating nodes
- Hot-warm-cold architecture for data lifecycle
- Cross-cluster replication for disaster recovery
- Kafka as durable message buffer
- Multiple Logstash pipelines
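A hot-warm-cold lifecycle is usually expressed as an index lifecycle management (ILM) policy. The following is a minimal sketch that only builds and prints a policy body; the function name, thresholds, and phase timings are illustrative assumptions, not recommendations from this guide.

```python
import json


def build_ilm_policy(rollover_size="40gb", warm_after="7d",
                     cold_after="30d", delete_after="90d"):
    """Return an ILM policy body with hot, warm, cold, and delete phases."""
    return {
        "policy": {
            "phases": {
                # Hot: roll over to a new index once it is large or old enough.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": rollover_size,
                            "max_age": "1d",
                        }
                    }
                },
                # Warm: merge segments, since the index is no longer written to.
                "warm": {
                    "min_age": warm_after,
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                # Cold: lower recovery priority for rarely queried data.
                "cold": {
                    "min_age": cold_after,
                    "actions": {"set_priority": {"priority": 0}},
                },
                # Delete: remove the index once retention has passed.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

The printed JSON has the shape accepted by `PUT _ilm/policy/<policy-name>`. Moving data onto warm or cold hardware additionally depends on node roles (data tiers), which this sketch does not configure.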
Best Practices
ELK Stack Guidelines
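- Sizing: Give the Elasticsearch JVM heap no more than 50% of available RAM and keep it at or below about 31GB so compressed object pointers stay enabled; the remaining memory serves the filesystem cache
- Sharding: Aim for roughly 20-40GB of data per primary shard; avoid over-sharding, since every shard carries fixed overhead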
- Index Lifecycle: Use ILM policies to manage hot/warm/cold/delete phases
- Security: Enable TLS between nodes and authentication for production
- Buffering: Use Kafka or Redis between Beats and Logstash for resilience
- Monitoring: Use Elastic's built-in monitoring or Metricbeat to monitor the stack itself
- Mapping: Define explicit index mappings instead of relying on dynamic mapping
- Retention: Set index lifecycle policies to automatically delete old data
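The sizing and sharding guidelines above reduce to simple arithmetic. The sketch below works through them under stated assumptions: the helper names are hypothetical, and the 30GB default shard target is just the midpoint of the 20-40GB range.

```python
import math


def heap_gb(ram_gb: float, cap_gb: float = 31.0) -> float:
    """Half of available RAM, capped at cap_gb (the 31GB ceiling above)."""
    return min(ram_gb / 2, cap_gb)


def primary_shards(data_gb: float, target_shard_gb: float = 30.0) -> int:
    """Primary shards needed so each holds at most target_shard_gb of data."""
    if target_shard_gb <= 0:
        raise ValueError("target_shard_gb must be positive")
    return max(1, math.ceil(data_gb / target_shard_gb))


if __name__ == "__main__":
    print(heap_gb(16))          # 8.0
    print(heap_gb(128))         # 31.0
    print(primary_shards(300))  # 10
```

A 16GB host gets an 8GB heap, while a 128GB host is capped at 31GB and leaves the rest to the filesystem cache. An index expected to hold 300GB would start with 10 primary shards at the assumed 30GB target.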
ELK vs Alternatives
| Feature | ELK Stack | Grafana Loki | Datadog | Splunk |
|---|---|---|---|---|
| Cost | Free (self-hosted) | Free (self-hosted) | Per-GB ingested | Per-GB indexed |
| Full-text Search | ★★★★★ | ★★☆☆☆ | ★★★★☆ | ★★★★★ |
| Log Aggregation | ★★★★★ | ★★★★★ | ★★★★★ | ★★★★★ |
| Resource Usage | High | Low | N/A (SaaS) | High |
| Setup Complexity | Medium | Low | Low (SaaS) | Medium |
| Scalability | ★★★★★ | ★★★★☆ | ★★★★★ | ★★★★★ |
| Visualization | ★★★★☆ | ★★★★★ (Grafana) | ★★★★★ | ★★★★☆ |
| APM Integration | ★★★★☆ | ★★★☆☆ | ★★★★★ | ★★★★☆ |