Elasticsearch

Overview

Causely integrates with Elasticsearch in two ways:

  1. Log Retrieval: Causely can ingest and analyze container logs stored in Elasticsearch. This allows you to correlate error and exception patterns with service-level incidents and root causes.
  2. Cluster Performance Insights: Causely connects directly to Elasticsearch cluster APIs to detect node-level or infrastructure bottlenecks that can cause degraded application performance.

Use this integration if:

  • Your container logs are already shipped to Elasticsearch and you want them surfaced in the context of root causes and service malfunctions.
  • You run Elasticsearch clusters and want Causely to detect node-level or infrastructure bottlenecks that can degrade dependent applications.

Use Case 1: Log Retrieval from Elasticsearch

When your logs are already shipped to Elasticsearch, for example via Fluentd, Logstash, or Beats, Causely can pull those logs directly to enhance root cause analysis.
Instead of collecting logs from Kubernetes directly, Causely queries Elasticsearch indices for logs tied to specific services or containers.
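
For illustration, this kind of log lookup corresponds roughly to a term-plus-time-range query against the configured indices. The index name, field names, and pod name below are placeholders taken from the example mapping later in this page, not the exact query Causely issues:

curl -s "http://elasticsearch.logging.svc.cluster.local:9200/logstash-app_log-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "filter": [
          { "term":  { "host": "checkout-6f7d9c-abcde" } },
          { "range": { "@timestamp": { "gte": "now-15m" } } }
        ]
      }
    },
    "sort": [ { "@timestamp": "desc" } ],
    "size": 100
  }'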

Benefits

  • Automatically surfaces relevant logs in the context of an active root cause or service malfunction.
  • Displays container-level logs under affected services when abnormal behavior occurs, for example error spikes or degraded performance.
  • Shows log lines and exceptions alongside root causes to validate issues and dramatically shorten time to understanding and resolution.

Setup

Create a Kubernetes Secret for your Elasticsearch credentials (if authentication is required):

kubectl create secret generic elasticsearch-logs-credentials \
  --namespace causely \
  --from-literal=api_key="your-api-key"

Or for basic authentication:

kubectl create secret generic elasticsearch-logs-credentials \
  --namespace causely \
  --from-literal=username="your-username" \
  --from-literal=password="your-password"

How Elasticsearch Works

Schema Flexibility

Elasticsearch is largely schema-less: unlike SQL databases, you don't have to define a strict schema upfront.

  • Dynamic mapping: When you index a document, Elasticsearch automatically creates field mappings based on what it sees
  • No enforcement: Different documents in the same index can have completely different fields
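
As a quick illustration (the index name and field values are hypothetical), two documents with entirely different shapes can be indexed side by side, and Elasticsearch will simply map whichever fields it encounters:

# Filebeat-style document: flat "message" and "log.level" fields
curl -s -X POST "http://elasticsearch.logging.svc.cluster.local:9200/app-logs/_doc" \
  -H 'Content-Type: application/json' \
  -d '{ "message": "order created", "log.level": "info" }'

# Fluentd-style document: message and priority nested under "data"
curl -s -X POST "http://elasticsearch.logging.svc.cluster.local:9200/app-logs/_doc" \
  -H 'Content-Type: application/json' \
  -d '{ "data": { "message": "payment failed", "priority": "error" }, "fluentd": "true" }'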

Different Log Shippers/Collectors

Different log shipping tools structure data differently:

  • Fluentd: Creates fields like data.message, data.priority, adds fluentd: "true"
  • Filebeat: Might use message, log.level, kubernetes.namespace
  • Logstash: Often uses message, @timestamp, custom parsed fields

This is why Causely requires explicit field mappings: they tell Causely where to find pod names, messages, severity, and other metadata in your specific Elasticsearch index structure.

Then update your Kubernetes scraper configuration with the Elasticsearch endpoint, indices, and field mappings:

scrapers:
  - type: Kubernetes
    enabled: true
    logs_enabled: true
    elasticsearch:
      endpoint: "http://elasticsearch.logging.svc.cluster.local:9200"
      secret: "elasticsearch-logs-credentials"  # Optional: omit if no auth required
      indices:
        - "logstash-app_log-*"     # Example: Fluentd/Logstash format
        # - "filebeat-*"           # Example: Filebeat format
      fields:
        timestamp: "@timestamp"
        message: "data.message"      # Example: nested field (Fluentd)
        # message: "message"         # Example: flat field (Filebeat)
        severity: "data.priority"    # Example: explicit severity field (optional)
        pod: "host"                  # Example: Fluentd uses "host" field
        # pod: "kubernetes.pod.name" # Example: Filebeat uses standard k8s metadata
        namespace: ""                # Optional: set if available
        container: ""                # Optional: set if available

The field mappings allow you to configure how Causely extracts pod names, namespaces, containers, messages, and severity from your Elasticsearch documents. This supports different log shipping formats (Fluentd, Filebeat, Logstash, etc.) with varying field structures.
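
For example, given the Fluentd-style mapping above, a document like the following (illustrative values) would yield the pod name from host, the message from data.message, and the severity from data.priority:

{
  "@timestamp": "2024-05-01T12:34:56Z",
  "host": "checkout-6f7d9c-abcde",
  "data": {
    "message": "java.net.SocketTimeoutException: connect timed out",
    "priority": "ERROR"
  },
  "fluentd": "true"
}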

Authentication Options

  • API Key: Include api_key in the Kubernetes secret
  • Basic Auth: Include username and password in the Kubernetes secret
  • No Auth: Omit the secret field in the configuration if authentication is not required

Supported Log Types

  • Container logs (stdout/stderr)

Use Case 2: Cluster Performance Monitoring

Setup Guide

Step 1: Create an API Key

Create an API Key for your Elasticsearch cluster with the following permissions:

  • Cluster monitoring: Required to access cluster health and node statistics
  • Node stats: Required to monitor individual node performance and resource usage
  • Read access: Required to query cluster metadata and configuration

For Elastic Cloud:

  1. Go to Elastic Cloud Console → Security → API Keys
  2. Create a new API key with appropriate permissions
  3. Copy the API key for use in the next step

For self-hosted Elasticsearch:

  1. Use Kibana Security → Users → Create API Key
  2. Or use the Elasticsearch API: POST /_security/api_key
  3. Ensure the API key has monitoring privileges
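
A minimal sketch of the API-based approach, assuming the monitor cluster privilege covers the cluster health and node statistics access Causely needs; the key name and role name are placeholders:

curl -s -X POST "https://your-elasticsearch-host:9200/_security/api_key" \
  -u elastic \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "causely-monitoring",
    "role_descriptors": {
      "causely_monitor": { "cluster": ["monitor"] }
    }
  }'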

Step 2: Create a Kubernetes Secret for the API Key

After creating the API Key, create a Kubernetes Secret:

kubectl create secret generic elastic-credentials \
  --namespace causely \
  --from-literal=api_key="..." \
  --from-literal=url='https://....eastus2.azure.elastic-cloud.com'

The url must be the endpoint URL of your Elasticsearch cluster:

  • Elastic Cloud: https://your-cluster-id.region.elastic-cloud.com
  • AWS OpenSearch: https://your-domain.region.es.amazonaws.com
  • Azure Search: https://your-service.search.windows.net
  • Self-hosted: https://your-elasticsearch-host:9200
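
Before creating the secret, you can verify the endpoint and key with a quick health check, assuming API_KEY holds the encoded value returned when the key was created:

curl -s -H "Authorization: ApiKey $API_KEY" \
  "https://your-cluster-id.region.elastic-cloud.com/_cluster/health?pretty"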

Step 3: Update Causely Configuration

Once the Secret is created, update the Causely configuration to enable scraping for the new cluster. Below is an example configuration:

scrapers:
  elasticsearch:
    enabled: true
    instances:
      - secretName: elastic-credentials
        namespace: causely

Alternative: Enable Credentials Autodiscovery

Causely also supports credentials autodiscovery. This feature allows you to add new scraping targets without updating the Causely configuration. Label the Kubernetes Secret to enable autodiscovery for the corresponding scraper.

kubectl --namespace causely label secret elastic-credentials "causely.ai/scraper=ElasticSearch"
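
To confirm the label was applied:

kubectl --namespace causely get secret elastic-credentials --show-labels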

What Data is Collected

From Logs

  • Error and warning messages from container stdout/stderr logs linked to affected services or root causes.

From Cluster Performance

  • Cluster entities with names and health status
  • Service-to-cluster mappings (which service provides the Elasticsearch cluster)
  • Connection details including endpoint URL and API authentication
  • Cluster health metrics including shard allocation status
  • Cluster status (green, yellow, red) and health indicators
  • Shard allocation metrics (active, relocating, initializing, unassigned shards)
  • Task queue monitoring (pending tasks, in-flight operations, queue wait times)
  • Active shards percentage for overall cluster health
  • Node information including names, roles, and attributes
  • File descriptor usage (FileDescriptorUsage, FileDescriptorCapacity)
  • Memory utilization (MemoryUsage, MemoryCapacity)
  • CPU performance (CPUUsage, CPUCapacity)
  • Load average metrics (1m, 5m, 15m) for trend analysis
  • Disk space metrics (Usage, Capacity)
  • File system statistics (total, free, available bytes, watermarks)
  • I/O performance metrics (read/write operations, data transfer, I/O time)
  • JVM memory pool statistics (heap used/committed/max, non-heap usage)
  • Garbage collection metrics (collection count, time per collector)
  • Thread statistics (current/peak thread counts, thread pool utilization)
  • Buffer pool performance (buffer count, usage, capacity)
  • Class loading statistics (loaded/unloaded classes)
  • Operating system metrics (memory, CPU, swap usage)
  • Cgroup metrics for containerized deployments
  • Transport layer information (addresses, ports)
  • Network endpoint mapping for service discovery
  • Host and IP address tracking for infrastructure mapping
  • Service-to-node mappings
  • Node-to-VM relationships
  • VM-to-disk relationships
  • Attribute-based labeling for custom categorization
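
Most of these metrics correspond to data exposed by Elasticsearch's standard monitoring endpoints. For reference, you can inspect the same data directly (Causely's exact calls may differ; the host placeholder is illustrative):

curl -s -H "Authorization: ApiKey $API_KEY" \
  "https://your-elasticsearch-host:9200/_cluster/health?pretty"
curl -s -H "Authorization: ApiKey $API_KEY" \
  "https://your-elasticsearch-host:9200/_nodes/stats/jvm,os,fs,process,thread_pool?pretty"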

Result

  • Captures and displays logs alongside a root cause to highlight the precise errors or stack traces that occurred around the time of failure.
  • Detects Elasticsearch cluster issues that may propagate to upstream applications.
  • Identifies and explains root causes automatically, rather than requiring manual investigation across dashboards.