Elasticsearch

System Architecture

Difficulty: advanced
Reading time: 40-55 minutes
Tags: elasticsearch, search, analytics, lucene, distributed-systems, logging

Distributed search and analytics engine built on Apache Lucene for real-time search, logging, and data analytics


Overview

Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene, designed for horizontal scalability, reliability, and real-time search capabilities. It addresses the critical challenge of providing fast, relevant search and complex analytics across large volumes of structured and unstructured data.

Originally developed by Shay Banon and first released in 2010, Elasticsearch has become a standard choice for search, logging, and analytics at companies such as GitHub, Netflix, Uber, and Stack Overflow. Production deployments routinely index billions of documents and petabytes of data while serving sub-second queries, with a design focused on high availability, distributed operation, and operational simplicity.

Key capabilities include:

  • Full-text search: Advanced text analysis with relevance scoring and fuzzy matching
  • Real-time analytics: Complex aggregations and time-series analysis at scale
  • Distributed architecture: Automatic sharding, replication, and cluster management
  • Schema flexibility: Dynamic mapping with support for complex nested data structures
  • Near real-time indexing: Documents available for search within seconds of ingestion
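
As a quick illustration of these capabilities, the sketch below (elasticsearch-py 8.x; host and index name are placeholders) indexes a JSON document and immediately searches it back with a full-text query:

# Minimal quick-start sketch; connection details are placeholders
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a document; the mapping for new fields is created dynamically
es.index(index="app-logs", document={"level": "ERROR", "message": "database connection refused"})

# Force a refresh so the document is searchable right away, then run a match query
es.indices.refresh(index="app-logs")
resp = es.search(index="app-logs", query={"match": {"message": "database"}})
print(resp["hits"]["total"]["value"])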

Architecture & Core Components

System Architecture

System Architecture Diagram

Core Components

1. Cluster and Nodes

  • Cluster: Collection of nodes storing data and providing search capabilities
  • Master node: Manages cluster metadata, index creation, and shard allocation
  • Data node: Stores data and executes search and aggregation operations
  • Coordinating node: Routes requests and merges results from data nodes
  • Ingest node: Preprocesses documents before indexing
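
To see how these roles are laid out on a running cluster, the _cat/nodes API reports each node's roles; a minimal sketch with the Python client (connection details are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder; add basic_auth/ca_certs for a secured cluster

# One row per node: name, role letters (m=master, d=data, i=ingest, ...), elected-master marker, heap usage
for node in es.cat.nodes(h="name,node.role,master,heap.percent", format="json"):
    print(node["name"], node["node.role"], node["master"], node["heap.percent"])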

2. Indices and Shards

  • Index: Logical collection of documents with similar characteristics
  • Shard: Physical subdivision of an index for horizontal scaling
  • Primary shard: Original shard that handles write operations
  • Replica shard: Copy of primary shard for redundancy and read scaling
  • Segment: Immutable Lucene index containing subset of shard data

3. Documents and Mapping

  • Document: JSON object stored in an index with unique ID
  • Mapping: Schema definition specifying field types and analysis settings
  • Field types: Text, keyword, numeric, date, boolean, geo, nested, object
  • Dynamic mapping: Automatic field type detection and mapping creation
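
Dynamic mapping is easy to observe: index a document into a new index and fetch the mapping Elasticsearch inferred for it. A small sketch (index name and fields are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Indexing into a non-existent index creates it and infers field types
es.index(index="orders-demo", id="1", document={
    "order_id": 1001,                       # inferred as long
    "total": 19.99,                         # inferred as float
    "created_at": "2023-12-01T10:00:00Z",   # inferred as date
    "note": "gift wrap requested"           # inferred as text with a .keyword sub-field
})

print(es.indices.get_mapping(index="orders-demo"))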

4. Search and Query Engine

  • Query DSL: JSON-based query language for complex search operations
  • Analyzers: Text processing pipeline for tokenization and normalization
  • Scoring: Relevance ranking using BM25 (the default scoring model, which superseded classic TF-IDF)
  • Aggregations: Real-time analytics and data summarization framework

Data Flow & Indexing Process

System Architecture Diagram

Lucene Integration

  • Inverted index: Core data structure for fast text search
  • Segment merging: Background optimization for search performance
  • Translog: Write-ahead log for durability and crash recovery
  • Memory management: Java heap, off-heap caches, and OS page cache
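
The relationship between the index buffer, refresh, translog, and segments can be exercised directly: refresh turns buffered documents into new searchable segments, while flush performs a Lucene commit and trims the translog. A sketch (index name is illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.index(index="translog-demo", document={"msg": "hello"})

# Refresh: write the in-memory buffer into a new searchable segment
es.indices.refresh(index="translog-demo")

# Flush: fsync segments to disk (Lucene commit) and clear the translog
es.indices.flush(index="translog-demo")

# Inspect the Lucene segments backing each shard
for seg in es.cat.segments(index="translog-demo", format="json"):
    print(seg["segment"], seg["docs.count"], seg["size"])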

Configuration & Deployment

Production Cluster Configuration

Node Configuration

# elasticsearch.yml - Master Node
cluster.name: production-cluster
node.name: master-node-1
node.roles: [master]
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

# Network settings
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300

# Discovery settings
discovery.seed_hosts: ["master-1", "master-2", "master-3"]
cluster.initial_master_nodes: ["master-node-1", "master-node-2", "master-node-3"]

# Memory settings
bootstrap.memory_lock: true
indices.memory.index_buffer_size: 10%
indices.memory.min_index_buffer_size: 48mb

# Security
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.http.ssl.enabled: true

Data Node Configuration

# elasticsearch.yml - Data Node
cluster.name: production-cluster
node.name: data-node-1
node.roles: [data, ingest]

# Storage configuration
path.data: ["/data1/elasticsearch", "/data2/elasticsearch"]
cluster.routing.allocation.same_shard.host: false

# Index settings
indices.queries.cache.size: 10%
indices.requests.cache.size: 1%
indices.fielddata.cache.size: 30%

# Performance tuning
thread_pool.write.queue_size: 1000
thread_pool.search.queue_size: 1000

JVM Configuration

# jvm.options
-Xms31g
-Xmx31g
-XX:+UseG1GC
-XX:G1HeapRegionSize=32m
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCCountTarget=4
-XX:G1MixedGCLiveThresholdPercent=90
-XX:+DisableExplicitGC
-Djava.io.tmpdir=/tmp
-Dlog4j2.disable.jmx=true

Index Templates and Mappings

Production Index Template

{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "refresh_interval": "10s",
      "index.codec": "best_compression",
      "index.lifecycle.name": "logs-policy"
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "date_optional_time||epoch_millis"
        },
        "level": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "analyzer": "standard",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 512
            }
          }
        },
        "service": {
          "type": "keyword"
        },
        "host": {
          "properties": {
            "name": {"type": "keyword"},
            "ip": {"type": "ip"}
          }
        }
      }
    }
  }
}
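
The template body above is registered under a name with the composable index template API so that every new index matching logs-* picks it up. A trimmed sketch with the Python client (template and index names are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Register a (trimmed) version of the template above under the name "logs-template"
es.indices.put_index_template(
    name="logs-template",
    index_patterns=["logs-*"],
    priority=100,
    template={
        "settings": {"number_of_shards": 3, "number_of_replicas": 1},
        "mappings": {"properties": {"@timestamp": {"type": "date"}, "level": {"type": "keyword"}}},
    },
)

# Any index created afterwards that matches logs-* inherits these settings and mappings
es.indices.create(index="logs-2023-12-01")
print(es.indices.get_mapping(index="logs-2023-12-01"))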

Custom Analyzers

{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_search_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "custom_stemmer",
            "custom_synonyms"
          ]
        }
      },
      "filter": {
        "custom_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "custom_synonyms": {
          "type": "synonym",
          "synonyms": [
            "fast,quick,rapid",
            "big,large,huge"
          ]
        }
      }
    }
  }
}
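
The _analyze API shows how this analyzer tokenizes input, which is useful for verifying stemming and synonym expansion before indexing real data. A sketch, assuming the analysis settings above were applied to an index named search-demo:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Assumes an index "search-demo" was created with the analysis settings above
resp = es.indices.analyze(
    index="search-demo",
    analyzer="custom_search_analyzer",
    text="The QUICK database was fast",
)

# Expect lowercased, stemmed tokens plus synonym expansions of "quick"/"fast"
print([t["token"] for t in resp["tokens"]])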

Docker & Kubernetes Deployment

Docker Compose Setup

# docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - cluster.name=docker-cluster
      - node.name=es01
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms2g -Xmx2g"
      - xpack.security.enabled=false
    volumes:
      - esdata:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
      - "9300:9300"
    ulimits:
      memlock:
        soft: -1
        hard: -1

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

volumes:
  esdata:

Kubernetes Deployment

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: production-es
spec:
  version: 8.11.0
  nodeSets:
  - name: master
    count: 3
    config:
      node.roles: ["master"]
      xpack.security.authc:
        anonymous:
          enabled: false
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms4g -Xmx4g"
          resources:
            requests:
              memory: 8Gi
              cpu: 2
            limits:
              memory: 8Gi
              cpu: 4
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: fast-ssd
  
  - name: data
    count: 6
    config:
      node.roles: ["data", "ingest"]
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms31g -Xmx31g"
          resources:
            requests:
              memory: 64Gi
              cpu: 4
            limits:
              memory: 64Gi
              cpu: 8
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Ti
        storageClassName: fast-ssd

Security Configuration

Authentication and Authorization

# Security settings
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
xpack.security.authc.realms:
  native:
    native1:
      order: 0
  ldap:
    ldap1:
      order: 1
      url: "ldap://ldap.company.com:389"
      bind_dn: "cn=admin,dc=company,dc=com"

# Role-based access control
xpack.security.authz.run_as_enabled: true

TLS Configuration

# Transport layer security
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12

# HTTP layer security
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: elastic-certificates.p12
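
Clients connecting to a TLS-enabled cluster need the CA that signed the HTTP-layer certificates; a minimal sketch with the Python client (certificate path and credentials are placeholders):

from elasticsearch import Elasticsearch

# ca_certs points at the CA used to sign the HTTP-layer certificates
es = Elasticsearch(
    "https://es-node1:9200",
    ca_certs="/etc/elasticsearch/certs/http_ca.crt",
    basic_auth=("elastic", "changeme"),
    verify_certs=True,
)
print(es.info()["version"]["number"])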

Performance Characteristics

Search Performance Metrics

  • Query latency: Sub-100ms for simple queries, <1s for complex aggregations
  • Indexing throughput: 10K-100K documents/sec per node, depending on document size and mapping complexity
  • Search throughput: 1K-10K queries/sec per node based on complexity
  • Index size: Typically 1.5-3x raw data size depending on mappings

Latency Characteristics

Query Type              | P50       | P95       | P99       | P99.9
------------------------|-----------|-----------|-----------|-----------
Term query              | 5-15ms    | 15-50ms   | 50-100ms  | 100-300ms
Match query             | 10-30ms   | 30-100ms  | 100-200ms | 200-500ms
Complex aggregation     | 50-200ms  | 200ms-1s  | 1-5s      | 5-20s
Large result set        | 100-500ms | 500ms-2s  | 2-10s     | 10-30s
Cross-cluster search    | 100-300ms | 300ms-1s  | 1-3s      | 3-10s

Resource Utilization Patterns

Memory Usage

  • Heap memory: 50% of available RAM, max 31GB for compressed OOPs
  • Page cache: Remaining RAM for Lucene segment caching
  • Field data: In-memory data structures for aggregations and sorting
  • Query cache: Cached query results for frequently accessed data

CPU Patterns

  • Search operations: CPU-intensive for text analysis and scoring
  • Indexing: CPU-intensive for document analysis and Lucene operations
  • Merging: Background CPU usage for segment optimization
  • GC overhead: 5-10% with properly tuned garbage collection

Storage Patterns

  • Index growth: 1.5-3x raw data size depending on analysis and mappings
  • Segment files: Multiple small files per shard requiring fast I/O
  • Translog: Sequential writes for durability
  • Snapshots: Point-in-time backups to object storage

Scalability Patterns

  • Horizontal scaling: Add nodes and increase shard count
  • Index partitioning: Time-based indices for data lifecycle management
  • Cross-cluster replication: Multi-region deployment for global scaling
  • Hot-warm-cold architecture: Tiered storage for cost optimization

Operational Considerations

Failure Modes & Detection

Node Failures

Symptoms:

  • Unassigned shards
  • Increased search latency
  • Indexing failures
  • Cluster state changes

Detection:

# Check cluster health
curl -X GET "localhost:9200/_cluster/health?pretty"

# Monitor node status
curl -X GET "localhost:9200/_cat/nodes?v"

# Check shard allocation
curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason"

Memory Issues

Symptoms:

  • OutOfMemoryError exceptions
  • Long GC pauses
  • Circuit breaker trips
  • Slow query performance

Detection:

# Monitor JVM memory
curl -X GET "localhost:9200/_nodes/stats/jvm?pretty"

# Check circuit breakers
curl -X GET "localhost:9200/_nodes/stats/breaker?pretty"

# Field data usage
curl -X GET "localhost:9200/_nodes/stats/indices/fielddata?pretty"

Split Brain Scenarios

Symptoms:

  • Multiple master nodes
  • Inconsistent cluster state
  • Data inconsistency
  • Write conflicts

Detection:

# Check master nodes
curl -X GET "localhost:9200/_cat/master?v"

# Monitor cluster state
curl -X GET "localhost:9200/_cluster/state?filter_path=master_node,nodes"

Disaster Recovery

Snapshot and Restore

# Register snapshot repository
PUT _snapshot/my_backup
{
  "type": "s3",
  "settings": {
    "bucket": "elasticsearch-backups",
    "region": "us-east-1",
    "base_path": "production-cluster"
  }
}

# Create snapshot
PUT _snapshot/my_backup/snapshot_1
{
  "indices": "logs-*,metrics-*",
  "ignore_unavailable": true,
  "include_global_state": false,
  "metadata": {
    "taken_by": "backup-service",
    "taken_because": "daily-backup"
  }
}

# Restore from snapshot
POST _snapshot/my_backup/snapshot_1/_restore
{
  "indices": "logs-2023-12-*",
  "ignore_unavailable": true,
  "index_settings": {
    "index.number_of_replicas": 1
  }
}

Cross-Cluster Replication

# On the follower (backup) cluster: register the leader cluster as a remote
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.leader_cluster.seeds": [
      "primary-es-1:9300",
      "primary-es-2:9300"
    ]
  }
}

# Create a follower index on the local cluster that replicates the leader index
PUT logs-replica/_ccr/follow
{
  "remote_cluster": "leader_cluster",
  "leader_index": "logs-primary"
}

Point-in-Time Recovery

# Create point-in-time for search
POST logs-*/_pit?keep_alive=1m

# Search with PIT
POST _search
{
  "pit": {
    "id": "pit_id_here",
    "keep_alive": "1m"
  },
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2023-12-01T00:00:00",
        "lte": "2023-12-01T23:59:59"
      }
    }
  }
}

Maintenance Procedures

Index Lifecycle Management

{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "7d",
            "max_docs": 100000000
          }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          },
          "forcemerge": {
            "max_num_segments": 1
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "allocate": {
            "include": {
              "box_type": "cold"
            }
          }
        }
      },
      "delete": {
        "min_age": "90d"
      }
    }
  }
}
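
Rollover in the hot phase only takes effect when the policy is attached to an index template and the first index is bootstrapped with a write alias. A sketch of that wiring (policy, template, and alias names are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Register a trimmed version of the lifecycle policy shown above
es.ilm.put_lifecycle(name="logs-policy", policy={
    "phases": {
        "hot": {"actions": {"rollover": {"max_size": "50gb", "max_age": "7d"}}},
        "delete": {"min_age": "90d", "actions": {"delete": {}}},
    }
})

# Attach the policy and rollover alias to future logs-* indices via a template
es.indices.put_index_template(
    name="logs-ilm-template",
    index_patterns=["logs-*"],
    template={"settings": {
        "index.lifecycle.name": "logs-policy",
        "index.lifecycle.rollover_alias": "logs",
    }},
)

# Bootstrap index carrying the write alias that rollover advances
es.indices.create(index="logs-000001", aliases={"logs": {"is_write_index": True}})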

Cluster Maintenance

# Rolling restart procedure
# 1. Disable shard allocation
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}

# 2. Stop indexing, restart node

# 3. Re-enable allocation
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  }
}

# 4. Wait for cluster recovery
GET _cluster/health?wait_for_status=green&timeout=30s

Troubleshooting Guide

Performance Issues

# Identify slow queries
GET _nodes/stats/indices/search?level=shards

# Check index statistics
GET logs-*/_stats

# Monitor thread pools
GET _nodes/stats/thread_pool

# Hot threads analysis
GET _nodes/hot_threads?threads=3&interval=500ms

Memory Problems

# Clear field data cache
POST _cache/clear?fielddata=true

# Clear query cache
POST _cache/clear?query=true

# Note: there is no API to force garbage collection; if heap pressure persists
# after clearing caches, reduce query/aggregation load or restart the affected node

Indexing Issues

# Check indexing rate
GET _cat/indices?v&s=docs.count:desc

# Monitor refresh intervals
GET _stats/refresh

# Check merge statistics
GET _cat/segments?v

Production Best Practices

Index Design and Optimization

Mapping Optimization

{
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "date_optional_time||epoch_millis"
      },
      "message": {
        "type": "text",
        "analyzer": "english",
        "index_options": "offsets",
        "fields": {
          "raw": {
            "type": "keyword",
            "ignore_above": 1024
          }
        }
      },
      "tags": {
        "type": "keyword",
        "doc_values": false,
        "index": true
      },
      "metadata": {
        "type": "object",
        "enabled": false
      }
    }
  }
}

Performance Tuning

# Bulk indexing optimization
PUT _template/bulk_template
{
  "index_patterns": ["bulk-*"],
  "settings": {
    "refresh_interval": "30s",
    "number_of_replicas": 0,
    "index.translog.durability": "async",
    "index.translog.sync_interval": "30s"
  }
}

# Search optimization (dynamic index settings; index.queries.cache.enabled is
# static and can only be set at index creation or on a closed index)
PUT logs-*/_settings
{
  "index.requests.cache.enable": true,
  "index.refresh_interval": "10s"
}

Query Optimization

Efficient Query Patterns

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "status": "error"
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-1h"
            }
          }
        }
      ],
      "must": [
        {
          "match": {
            "message": "database connection"
          }
        }
      ]
    }
  },
  "aggs": {
    "error_count_by_service": {
      "terms": {
        "field": "service.keyword",
        "size": 10
      }
    }
  }
}

Search Templates

{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": "{{start_time}}",
                  "lte": "{{end_time}}"
                }
              }
            }
          ],
          "must": [
            {
              "match": {
                "{{search_field}}": "{{search_term}}"
              }
            }
          ]
        }
      }
    }
  }
}
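
A template like this is stored as a mustache script and invoked with concrete parameters at search time. A sketch (script id and parameter values are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Store the mustache template shown above under an id
es.put_script(id="time_window_search", script={
    "lang": "mustache",
    "source": {
        "query": {
            "bool": {
                "filter": [{"range": {"@timestamp": {"gte": "{{start_time}}", "lte": "{{end_time}}"}}}],
                "must": [{"match": {"{{search_field}}": "{{search_term}}"}}],
            }
        }
    },
})

# Execute the stored template with parameters
resp = es.search_template(index="logs-*", id="time_window_search", params={
    "start_time": "now-1h",
    "end_time": "now",
    "search_field": "message",
    "search_term": "timeout",
})
print(resp["hits"]["total"])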

Monitoring Setup

Essential Metrics

# Cluster health metrics
cluster.status
cluster.active_primary_shards
cluster.active_shards
cluster.unassigned_shards

# Node metrics
node.jvm.mem.heap_used_percent
node.process.cpu.percent
node.fs.io_stats.total.read_operations
node.fs.io_stats.total.write_operations

# Index metrics
index.docs.count
index.size_in_bytes
index.refresh.total_time_in_millis
index.search.query_total

Alerting Configuration

# Watcher alert example
PUT _watcher/watch/cluster_health_watch
{
  "trigger": {
    "schedule": {
      "interval": "30s"
    }
  },
  "input": {
    "http": {
      "request": {
        "host": "localhost",
        "port": 9200,
        "path": "/_cluster/health"
      }
    }
  },
  "condition": {
    "compare": {
      "payload.status": {
        "eq": "red"
      }
    }
  },
  "actions": {
    "send_email": {
      "email": {
        "to": ["admin@company.com"],
        "subject": "Elasticsearch Cluster Alert",
        "body": "Cluster status is RED"
      }
    }
  }
}

Capacity Planning

Sizing Guidelines

# Shard sizing calculation
# Optimal shard size: 20-40GB
# Max shards per node: 20 × heap_size_gb
# Example: 31GB heap → max 620 shards per node

# Storage estimation
# Index size = raw_data_size × (1 + replica_count) × mapping_overhead
# Example: 100GB × 2 (1 replica) × 1.3 (30% overhead) = 260GB

# Memory requirements
# Heap: 50% of RAM, max 31GB
# Remaining RAM: OS page cache for Lucene segments
# Example: 64GB RAM → 31GB heap + 33GB page cache
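
These rules of thumb are straightforward to encode; the sketch below turns the comments above into a small sizing calculator (the inputs are examples, not recommendations):

# Back-of-the-envelope sizing based on the guidelines above

def estimate_cluster(raw_data_gb, replicas=1, mapping_overhead=1.3,
                     ram_per_node_gb=64, target_shard_gb=30):
    heap_gb = min(ram_per_node_gb // 2, 31)       # 50% of RAM, capped at 31GB
    page_cache_gb = ram_per_node_gb - heap_gb     # remainder left to the OS page cache
    total_index_gb = raw_data_gb * (1 + replicas) * mapping_overhead
    shard_count = max(1, round(total_index_gb / target_shard_gb))
    max_shards_per_node = 20 * heap_gb            # guideline: ~20 shards per GB of heap
    return {
        "total_index_gb": total_index_gb,
        "shard_count": shard_count,
        "heap_gb": heap_gb,
        "page_cache_gb": page_cache_gb,
        "max_shards_per_node": max_shards_per_node,
    }

# Example: 100GB raw data, 1 replica, 30% overhead -> ~260GB on disk
print(estimate_cluster(raw_data_gb=100))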

Scaling Triggers

  • Search latency P95 > 1 second
  • Heap utilization > 75%
  • CPU utilization > 80%
  • Disk utilization > 85%
  • Queue rejections > 0

Integration Patterns

Application Integration

# Python Elasticsearch client
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk, scan

# Connection with retry and timeout
es = Elasticsearch(
    ['https://es-node1:9200', 'https://es-node2:9200'],
    basic_auth=('username', 'password'),
    verify_certs=True,
    retry_on_timeout=True,
    max_retries=3,
    request_timeout=30
)

# Bulk indexing
def bulk_index_documents(documents):
    actions = [
        {
            "_index": "logs-2023-12",
            "_source": doc
        }
        for doc in documents
    ]
    
    bulk(es, actions, chunk_size=1000, request_timeout=60)

# Search with aggregations
def search_logs(query, start_time, end_time):
    body = {
        "query": {
            "bool": {
                "must": [
                    {"match": {"message": query}}
                ],
                "filter": [
                    {
                        "range": {
                            "@timestamp": {
                                "gte": start_time,
                                "lte": end_time
                            }
                        }
                    }
                ]
            }
        },
        "aggs": {
            "logs_over_time": {
                "date_histogram": {
                    "field": "@timestamp",
                    "interval": "1h"
                }
            }
        }
    }
    
    return es.search(index="logs-*", body=body)

ELK Stack Integration

# Logstash configuration
input {
  beats {
    port => 5044
  }
}

filter {
  if [fields][service] == "web" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch1:9200", "elasticsearch2:9200"]
    index => "logs-%{+YYYY.MM.dd}"
    template_name => "logs"
    template => "/etc/logstash/templates/logs.json"
  }
}

Security Best Practices

Authentication and Authorization

# Create roles
POST _security/role/logs_reader
{
  "indices": [
    {
      "names": ["logs-*"],
      "privileges": ["read", "view_index_metadata"]
    }
  ]
}

# Create users
POST _security/user/log_analyst
{
  "password": "secure_password",
  "roles": ["logs_reader"],
  "full_name": "Log Analyst",
  "email": "analyst@company.com"
}

Field and Document Level Security

# Field level security
POST _security/role/sensitive_data_access
{
  "indices": [
    {
      "names": ["user-data-*"],
      "privileges": ["read"],
      "field_security": {
        "grant": ["*"],
        "except": ["ssn", "credit_card"]
      }
    }
  ]
}

# Document level security
POST _security/role/regional_access
{
  "indices": [
    {
      "names": ["sales-*"],
      "privileges": ["read"],
      "query": {
        "term": {
          "region": "{{user.metadata.region}}"
        }
      }
    }
  ]
}

Interview-Focused Content

Technology-Specific Questions

Junior Level (2-4 YOE)

Q: What's the difference between a primary shard and a replica shard in Elasticsearch?

A: Primary and replica shards serve different purposes in Elasticsearch:

Primary Shard:

  • Original shard containing the actual data
  • Handles all write operations (index, update, delete)
  • One primary shard per partition of data
  • Cannot be changed after index creation

Replica Shard:

  • Exact copy of a primary shard
  • Provides redundancy for fault tolerance
  • Can handle read operations (search, get)
  • Can be added or removed dynamically
  • Automatically promoted to primary if original primary fails

Example: An index with 3 primary shards and 1 replica will have 6 total shards (3 primary + 3 replica).
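
This is easy to confirm against a running cluster: create an index with 3 primaries and 1 replica and count what _cat/shards reports. A sketch (index name and connection are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.indices.create(index="shard-demo", settings={
    "number_of_shards": 3,
    "number_of_replicas": 1,
})

shards = es.cat.shards(index="shard-demo", h="shard,prirep,state", format="json")
print(len(shards))                                      # 6 shards total
print(sum(1 for s in shards if s["prirep"] == "p"))     # 3 primaries
print(sum(1 for s in shards if s["prirep"] == "r"))     # 3 replicas (unassigned on a single node)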

Q: How does Elasticsearch achieve near real-time search?

A: Elasticsearch achieves near real-time search through several mechanisms:

  1. In-memory indexing: Documents are first written to memory (index buffer)
  2. Refresh operation: Periodically (default 1s) moves documents from memory to searchable segments
  3. Translog: Write-ahead log ensures durability before refresh
  4. Segment creation: New segments become immediately searchable
  5. Background merging: Optimizes segments for better search performance

This process makes documents searchable within seconds of ingestion while maintaining durability.
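
The effect of the refresh interval is easy to demonstrate: a freshly indexed document is returned by a real-time GET but not by search until a refresh happens. A sketch (index name is illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.indices.create(index="nrt-demo", settings={"refresh_interval": "30s"})
es.index(index="nrt-demo", id="1", document={"msg": "not yet searchable"})

# GET by ID is real-time (served via the translog), but search only sees refreshed segments
print(es.get(index="nrt-demo", id="1")["found"])    # True
print(es.count(index="nrt-demo")["count"])          # 0 before refresh

es.indices.refresh(index="nrt-demo")
print(es.count(index="nrt-demo")["count"])          # 1 after refresh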

Mid-Level (4-8 YOE)

Q: How would you optimize an Elasticsearch cluster experiencing slow search performance?

A: Search performance optimization involves multiple approaches:

1. Index Level Optimization:

  • Mapping optimization: Use appropriate field types, disable unnecessary features
  • Analyzer tuning: Choose efficient analyzers for your use case
  • Shard sizing: Keep shards between 20-40GB for optimal performance

2. Query Optimization:

  • Use filters over queries: Filters are cached and faster
  • Avoid expensive operations: Script queries, wildcard queries on large datasets
  • Optimize aggregations: Use composite aggregations for large cardinality

3. Hardware/Configuration:

  • Memory allocation: 50% heap, rest for page cache
  • SSD storage: Fast storage for segment files
  • CPU optimization: Sufficient cores for search threads

4. Monitoring and Diagnostics:

# Identify slow queries
GET _nodes/hot_threads
GET _cat/thread_pool?v&h=name,active,queue,rejected

# Check index statistics
GET _cat/indices?v&s=search.query_time_in_millis:desc

Q: Explain Elasticsearch's distributed search execution process.

A: Distributed search in Elasticsearch follows a two-phase process:

Query Phase:

  1. Coordination: Coordinating node receives search request
  2. Broadcast: Query sent to all relevant shards (primary or replica)
  3. Local execution: Each shard executes query locally
  4. Scoring: Local relevance scores calculated
  5. Priority queue: Each shard returns top N document IDs and scores

Fetch Phase:

  1. Global scoring: Coordinating node merges and re-ranks results
  2. Document retrieval: Fetch actual documents from relevant shards
  3. Response assembly: Combine documents and return to client

Optimization:

  • Use _source filtering to reduce network overhead
  • Implement proper routing for single-shard queries
  • Use search templates for frequently executed queries
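
Custom routing is the concrete form of the single-shard optimization: documents indexed with the same routing value land on one shard, and searches that pass the same value query only that shard. A sketch (index and routing values are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Route all of tenant-42's documents to a single shard
es.index(index="events", routing="tenant-42",
         document={"tenant": "tenant-42", "action": "login"})
es.indices.refresh(index="events")

# Searching with the same routing value touches only that shard instead of all of them
resp = es.search(index="events", routing="tenant-42",
                 query={"term": {"tenant.keyword": "tenant-42"}})
print(resp["hits"]["total"]["value"])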

Senior Level (8+ YOE)

Q: Design an Elasticsearch architecture for a multi-tenant SaaS application handling 100TB of log data with strict data isolation requirements.

A: Multi-tenant logging architecture design:

Requirements Analysis:

  • 100TB data volume requires distributed storage
  • Data isolation prevents tenant data leakage
  • Search performance across tenant boundaries
  • Cost optimization through tiered storage
  • Compliance and audit requirements

Architecture Design:

Application Layer → Load Balancer → Coordinating Nodes
                                         ↓
Hot Data Nodes ← ILM → Warm Data Nodes ← ILM → Cold Data Nodes
     ↓                      ↓                      ↓
 Fast SSD Storage      Standard SSD          Object Storage

Implementation Strategy:

  1. Index Strategy:
    • Index per tenant: logs-{tenant-id}-{date}
    • Time-based rollover for lifecycle management
    • Index templates for consistent mapping
  2. Security Implementation:
    • Document Level Security (DLS) based on tenant ID
    • Role-based access control per tenant
    • API key authentication with tenant scoping (see the sketch after this list)
  3. Performance Optimization:
    • Dedicated coordinating nodes for search routing
    • Hot-warm-cold architecture for cost optimization
    • Cross-cluster search for historical data
  4. Operational Considerations:
    • Automated backup and restore per tenant
    • Monitoring and alerting per tenant metrics
    • Capacity planning based on tenant growth
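
On the security side, tenant-scoped API keys can carry an index-restricted role descriptor (optionally with a document-level query) so an application token can only reach its own tenant's data. A hedged sketch (key name, index pattern, and field names are illustrative):

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder; security APIs require a secured, authenticated cluster

# API key restricted to one tenant's indices, with a document-level filter
key = es.security.create_api_key(
    name="tenant-acme-logs",
    expiration="30d",
    role_descriptors={
        "acme_logs_reader": {
            "indices": [{
                "names": ["logs-acme-*"],
                "privileges": ["read", "view_index_metadata"],
                "query": {"term": {"tenant_id": "acme"}},
            }]
        }
    },
)
print(key["id"], key["api_key"])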

Operational Questions

Q: Your Elasticsearch cluster shows "red" status with unassigned shards. Walk through your troubleshooting process.

A: Red cluster troubleshooting methodology:

1. Immediate Assessment:

# Check overall cluster health
GET _cluster/health?level=indices

# Identify unassigned shards
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason

# Check node availability
GET _cat/nodes?v

2. Common Root Causes:

Node Failure:

# Check if nodes left the cluster
GET _cluster/state/nodes

# Verify disk space and memory
GET _nodes/stats/fs,jvm

Shard Allocation Issues:

# Check allocation explain
GET _cluster/allocation/explain

# Review allocation settings
GET _cluster/settings?include_defaults=true

3. Resolution Steps:

For Missing Nodes:

  • Restart failed nodes if possible
  • If data is lost, consider allocating replicas as primaries

For Allocation Issues:

# Enable allocation if disabled
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}

# Force allocation if necessary (data loss risk)
POST _cluster/reroute
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "my-index",
        "shard": 0,
        "node": "node-1",
        "accept_data_loss": true
      }
    }
  ]
}

Q: How do you handle capacity planning for an Elasticsearch cluster with unpredictable growth patterns?

A: Dynamic capacity planning approach:

1. Monitoring Strategy:

  • Track growth rates: daily, weekly, monthly patterns
  • Monitor resource utilization trends
  • Set up predictive alerting based on growth rate

2. Elastic Scaling Architecture:

# Hot-warm-cold with automatic tier movement
PUT _ilm/policy/dynamic_logs_policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "1d"
          }
        }
      },
      "warm": {
        "min_age": "3d",
        "actions": {
          "allocate": {
            "number_of_replicas": 0
          }
        }
      },
      "cold": {
        "min_age": "30d",
        "actions": {
          "searchable_snapshot": {
            "snapshot_repository": "cold_repository"
          }
        }
      }
    }
  }
}

3. Automated Scaling:

  • Kubernetes HPA for coordinating nodes
  • Auto-scaling data nodes based on storage utilization
  • Scheduled scaling for predictable load patterns

4. Cost Optimization:

  • Use searchable snapshots for long-term retention
  • Implement compression and forcemerge in warm phase
  • Archive to cheaper storage tiers automatically

Design Integration

Q: How would you integrate Elasticsearch into a microservices architecture for centralized logging and monitoring?

A: Microservices logging integration design:

Architecture Overview:

Microservice A → Filebeat → Logstash → Elasticsearch
Microservice B → Fluentd  → Kafka → Logstash → Elasticsearch
Microservice C → Direct API calls → Elasticsearch ingest nodes
Elasticsearch  → Kibana / Grafana (dashboards and alerting)

Implementation Strategy:

  1. Log Collection:
    • Sidecar pattern with Filebeat for container logs
    • Structured logging with correlation IDs
    • Service mesh integration for automatic metadata
  2. Processing Pipeline:
# Logstash configuration
filter {
  if [kubernetes][container][name] == "api-service" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} \[%{DATA:trace_id}\] %{GREEDYDATA:msg}" }
    }
    mutate {
      add_field => { "service_type" => "api" }
    }
  }
}
  3. Index Strategy:
    • Service-specific indices: logs-{service}-{date}
    • Common mapping for correlation across services
    • Retention policies based on service criticality
  4. Observability Integration:
    • Correlation with metrics (Prometheus) and traces (Jaeger)
    • Alerting based on log patterns and anomalies
    • Dashboard creation for service-specific insights

Trade-off Analysis

Q: When would you choose Elasticsearch over other search solutions like Solr or database-based search?

A: Search technology selection criteria:

Choose Elasticsearch when:

  • Real-time analytics: Need complex aggregations and visualizations
  • Scalability: Horizontal scaling requirements
  • Developer experience: REST API and JSON-based queries
  • Ecosystem: Integration with Logstash, Kibana, Beats
  • Cloud-native: Containerized deployments and auto-scaling

Choose Solr when:

  • Advanced search features: Faceting, highlighting, spell checking
  • Document-centric: Traditional search applications
  • SQL support: Need SQL-like query interface
  • Mature deployments: Existing Solr expertise and infrastructure

Choose Database Search when:

  • Simple search: Basic text search within existing data
  • ACID requirements: Strong consistency guarantees
  • Small scale: Limited data volume and query complexity
  • Cost sensitivity: Avoiding additional infrastructure

Specific Scenarios:

  • E-commerce product search: Elasticsearch (real-time updates, faceting)
  • Academic paper search: Solr (complex text analysis, relevance tuning)
  • Internal document search: Database FTS (simple, existing infrastructure)
  • Log analytics: Elasticsearch (time-series data, real-time dashboards)

Troubleshooting Scenarios

Q: Users report that search results are missing recent documents. How do you investigate and resolve this issue?

A: Missing recent documents troubleshooting:

1. Initial Investigation:

# Check refresh interval and last refresh time
GET logs-*/_settings?filter_path=*.settings.index.refresh_interval
GET logs-*/_stats/refresh

# Verify document count trends
GET _cat/indices/logs-*?v&s=creation.date:desc

2. Common Root Causes:

Refresh Configuration:

# Check if refresh is disabled
GET logs-*/_settings/index.refresh_interval

# Manual refresh to test
POST logs-*/_refresh

Indexing Pipeline Issues:

# Check indexing rate
GET _stats/indexing

# Monitor ingest pipelines
GET _nodes/stats/ingest

Routing/Shard Issues:

# Verify document routing
GET logs-*/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-5m"
      }
    }
  }
}

3. Resolution Steps:

Immediate Fix:

# Force refresh if needed
POST _refresh

# Check for stuck operations
GET _cat/pending_tasks?v

Long-term Solution:

  • Adjust refresh interval based on requirements
  • Monitor indexing pipeline health
  • Set up alerting for indexing delays
  • Implement proper error handling in applications

4. Prevention:

  • Regular monitoring of indexing metrics
  • Proper refresh interval configuration
  • Health checks for indexing pipeline
  • Documentation of troubleshooting procedures

Related Systems

  • Apache Solr
  • OpenSearch
  • Amazon CloudWatch
  • Splunk

Used By

Elastic, GitHub, Netflix, Uber, LinkedIn, Stack Overflow, Wikipedia