Elasticsearch
Overview
Causely integrates with Elasticsearch in two ways:
- Log Retrieval: Causely can ingest and analyze container logs stored in Elasticsearch. This allows you to correlate error and exception patterns with service-level incidents and root causes.
- Cluster Performance Insights: Causely connects directly to Elasticsearch cluster APIs to detect node-level or infrastructure bottlenecks that can cause degraded application performance.
Use this integration if:
- Your logs are already centralized in Elasticsearch.
- You operate Elasticsearch clusters and want to go beyond symptom monitoring: Causely analyzes real-time signals to surface the actual causes of Elasticsearch reliability problems.
Use Case 1: Log Retrieval from Elasticsearch
When your logs are already shipped to Elasticsearch, for example via Fluentd, Logstash, or Beats, Causely can pull those logs directly to enhance root cause analysis.
Instead of collecting logs from Kubernetes directly, Causely queries Elasticsearch indices for logs tied to specific services or containers.
Benefits
- Automatically surfaces relevant logs in the context of an active root cause or service malfunction.
- Displays container-level logs under affected services when abnormal behavior occurs, for example error spikes or degraded performance.
- Shows log lines and exceptions alongside root causes to validate issues and dramatically shorten time to understanding and resolution.
Setup
Create a Kubernetes Secret for your Elasticsearch credentials (if authentication is required):
kubectl create secret generic elasticsearch-logs-credentials \
  --namespace causely \
  --from-literal=api_key="your-api-key"
Or for basic authentication:
kubectl create secret generic elasticsearch-logs-credentials \
  --namespace causely \
  --from-literal=username="your-username" \
  --from-literal=password="your-password"
How Elasticsearch Works
Schema Flexibility
Elasticsearch is schema-less (mostly): Unlike SQL databases, you don't define a strict schema upfront.
- Dynamic mapping: When you index a document, Elasticsearch automatically creates field mappings based on what it sees
- No enforcement: Different documents in the same index can have completely different fields
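For example, here is a minimal sketch of dynamic mapping in action, assuming an unauthenticated cluster on localhost:9200 and a throwaway app-logs index:
# Index two documents with completely different fields into the same index;
# Elasticsearch infers a mapping for each new field it encounters.
curl -X POST "localhost:9200/app-logs/_doc" -H 'Content-Type: application/json' \
  -d '{"message": "user login ok", "data": {"priority": "INFO"}}'
curl -X POST "localhost:9200/app-logs/_doc" -H 'Content-Type: application/json' \
  -d '{"msg": "connection timeout", "log": {"level": "error"}}'
# Inspect the mappings Elasticsearch created automatically:
curl "localhost:9200/app-logs/_mapping?pretty"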
Different Log Shippers/Collectors
Different log shipping tools structure data differently:
- Fluentd: Creates fields like data.message and data.priority, and adds fluentd: "true"
- Filebeat: Might use message, log.level, kubernetes.namespace
- Logstash: Often uses message, @timestamp, and custom parsed fields
This is why Causely requires explicit field mappings: to know where to find pod names, messages, severity, and other metadata in your specific Elasticsearch index structure.
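Before configuring the mappings, it helps to pull one raw document from your index and see where your shipper actually puts these values (a sketch assuming in-cluster access; the index pattern is an example):
# Fetch a single document to see which fields hold the pod name,
# message, and severity in your particular setup:
curl -s "http://elasticsearch.logging.svc.cluster.local:9200/logstash-app_log-*/_search?size=1&pretty"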
Then update your Kubernetes scraper configuration with the Elasticsearch endpoint, indices, and field mappings:
scrapers:
  - type: Kubernetes
    enabled: true
    logs_enabled: true
    elasticsearch:
      endpoint: "http://elasticsearch.logging.svc.cluster.local:9200"
      secret: "elasticsearch-logs-credentials" # Optional: omit if no auth required
      indices:
        - "logstash-app_log-*" # Example: Fluentd/Logstash format
        # - "filebeat-*"       # Example: Filebeat format
      fields:
        timestamp: "@timestamp"
        message: "data.message"    # Example: nested field (Fluentd)
        # message: "message"       # Example: flat field (Filebeat)
        severity: "data.priority"  # Example: explicit severity field (optional)
        pod: "host"                # Example: Fluentd uses "host" field
        # pod: "kubernetes.pod.name" # Example: Filebeat uses standard k8s metadata
        namespace: ""              # Optional: set if available
        container: ""              # Optional: set if available
The field mappings configure how Causely extracts pod names, namespaces, containers, messages, and severity from your Elasticsearch documents, accommodating the varying field structures produced by different log shippers (Fluentd, Filebeat, Logstash, etc.).
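Once configured, you can spot-check that the mapped fields resolve against a live document (a sketch; the pod name is a placeholder):
# Search one pod's logs and return only the fields referenced in the
# configuration above; empty hits or missing fields indicate a mapping mismatch.
curl -s "http://elasticsearch.logging.svc.cluster.local:9200/logstash-app_log-*/_search?pretty" \
  -H 'Content-Type: application/json' \
  -d '{
    "size": 1,
    "query": { "match": { "host": "my-pod-abc123" } },
    "_source": ["@timestamp", "data.message", "data.priority", "host"]
  }'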
Authentication Options
- API Key: Include api_key in the Kubernetes secret
- Basic Auth: Include username and password in the Kubernetes secret
- No Auth: Omit the secret field in the configuration if authentication is not required
Supported Log Types
- Container logs (stdout/stderr)
Use Case 2: Cluster Performance Monitoring
Setup Guide
Step 1: Create an API Key
Create an API Key for your Elasticsearch cluster with the following permissions:
- Cluster monitoring: Required to access cluster health and node statistics
- Node stats: Required to monitor individual node performance and resource usage
- Read access: Required to query cluster metadata and configuration
For Elastic Cloud:
- Go to Elastic Cloud Console → Security → API Keys
- Create a new API key with appropriate permissions
- Copy the API key for use in the next step
For self-hosted Elasticsearch:
- Use Kibana Security → Users → Create API Key
- Or use the Elasticsearch API: POST /_security/api_key (see the example below)
- Ensure the API key has monitoring privileges
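For the API route, a minimal sketch of the request (the key and role names are placeholders; the cluster-level monitor privilege covers cluster health and node stats):
# Create an API key whose role grants cluster monitoring privileges:
curl -X POST "https://your-elasticsearch-host:9200/_security/api_key" \
  -u elastic \
  -H 'Content-Type: application/json' \
  -d '{
    "name": "causely-monitoring",
    "role_descriptors": {
      "causely_monitor": { "cluster": ["monitor"] }
    }
  }'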
Step 2: Create a Kubernetes Secret for the API Key
After creating the API key, create a Kubernetes Secret:
kubectl create secret generic elastic-credentials \
  --namespace causely \
  --from-literal=api_key="..." \
  --from-literal=url='https://....eastus2.azure.elastic-cloud.com'
The url must be the endpoint URL of your Elasticsearch cluster:
- Elastic Cloud: https://your-cluster-id.region.elastic-cloud.com
- AWS OpenSearch: https://your-domain.region.es.amazonaws.com
- Azure Search: https://your-service.search.windows.net
- Self-hosted: https://your-elasticsearch-host:9200
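Before creating the Secret, you can confirm that the endpoint and key work together (the header value is the encoded key string returned when the API key was created):
# A 200 response with cluster status confirms both the URL and the credentials:
curl -s -H "Authorization: ApiKey your-api-key" \
  "https://your-elasticsearch-host:9200/_cluster/health?pretty"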
Step 3: Update Causely Configuration
Once the Secret is created, update the Causely configuration to enable scraping for the new cluster. Below is an example configuration:
scrapers:
  elasticsearch:
    enabled: true
    instances:
      - secretName: elastic-credentials
        namespace: causely
Alternative: Enable Credentials Autodiscovery
Causely also supports credentials autodiscovery. This feature allows you to add new scraping targets without updating the Causely configuration. Label the Kubernetes Secret to enable autodiscovery for the corresponding scraper.
kubectl --namespace causely label secret elastic-credentials "causely.ai/scraper=ElasticSearch"
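To confirm which Secrets will be discovered, list those carrying the label:
kubectl get secrets --namespace causely -l causely.ai/scraper=ElasticSearch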
What Data is Collected
From Logs
- Error and warning messages from container stdout/stderr logs linked to affected services or root causes.
From Cluster Performance
- Cluster entities with names and health status
- Service-to-cluster mappings (which service provides the Elasticsearch cluster)
- Connection details including endpoint URL and API authentication
- Cluster health metrics including shard allocation status
- Cluster status (green, yellow, red) and health indicators
- Shard allocation metrics (active, relocating, initializing, unassigned shards)
- Task queue monitoring (pending tasks, in-flight operations, queue wait times)
- Active shards percentage for overall cluster health
- Node information including names, roles, and attributes
- File descriptor usage (FileDescriptorUsage, FileDescriptorCapacity)
- Memory utilization (MemoryUsage, MemoryCapacity)
- CPU performance (CPUUsage, CPUCapacity)
- Load average metrics (1m, 5m, 15m) for trend analysis
- Disk space metrics (Usage, Capacity)
- File system statistics (total, free, available bytes, watermarks)
- I/O performance metrics (read/write operations, data transfer, I/O time)
- JVM memory pool statistics (heap used/committed/max, non-heap usage)
- Garbage collection metrics (collection count, time per collector)
- Thread statistics (current/peak thread counts, thread pool utilization)
- Buffer pool performance (buffer count, usage, capacity)
- Class loading statistics (loaded/unloaded classes)
- Operating system metrics (memory, CPU, swap usage)
- Cgroup metrics for containerized deployments
- Transport layer information (addresses, ports)
- Network endpoint mapping for service discovery
- Host and IP address tracking for infrastructure mapping
- Service-to-node mappings
- Node-to-VM relationships
- VM-to-disk relationships
- Attribute-based labeling for custom categorization
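These signals map onto standard Elasticsearch monitoring APIs; for a sense of the raw data involved (assuming direct cluster access), compare:
# Cluster status, shard allocation, and pending tasks:
curl -s "https://your-elasticsearch-host:9200/_cluster/health?pretty"
# Per-node JVM, OS, file system, and thread pool statistics:
curl -s "https://your-elasticsearch-host:9200/_nodes/stats?pretty"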
Result
- Captures and displays logs alongside a root cause to highlight the precise errors or stack traces that occurred around the time of failure.
- Detects Elasticsearch cluster issues that may propagate to upstream applications.
- Identifies and explains root causes automatically, rather than requiring manual investigation across dashboards.