Elasticsearch
Overview
Causely provides native integration with Elasticsearch to help you identify and resolve cluster performance issues before they impact your users.
Instead of just monitoring symptoms, Causely analyzes real-time signals to surface the actual root causes of Elasticsearch problems.
This integration helps you identify common root causes, including:
- Memory Pressure and Disk Pressure
- CPU Congested and Memory Congested
- Memory Failure and Frequent Memory Failure
- Disk Total IOPs Congested and Inode Usage Congested
The integration supports both self-hosted Elasticsearch clusters and cloud-managed services including Elastic Cloud, AWS OpenSearch, and Azure Search.
Setup Guide
Step 1: Create an API Key
Create an API Key for your Elasticsearch cluster with the following permissions:
- Cluster monitoring: Required to access cluster health and node statistics
- Node stats: Required to monitor individual node performance and resource usage
- Read access: Required to query cluster metadata and configuration
For Elastic Cloud:
- Go to Elastic Cloud Console → Security → API Keys
- Create a new API key with appropriate permissions
- Copy the API key for use in the next step
For self-hosted Elasticsearch:
- Use Kibana Security → Users → Create API Key
- Or use the Elasticsearch API: POST /_security/api_key
- Ensure the API key has monitoring privileges
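For the self-hosted API route, the request could look like the sketch below. It assumes that the monitor cluster privilege, together with the monitor and view_index_metadata index privileges, covers the permissions listed above; confirm this against your own security policy. The key name causely-monitoring, the role name causely_monitor, and the ES_URL and ES_USER variables are illustrative placeholders, not names defined by Causely or Elasticsearch.

```shell
# Print the JSON body for POST /_security/api_key.
# Key name, role name, and privilege selection are illustrative assumptions.
api_key_request_body() {
  cat <<'EOF'
{
  "name": "causely-monitoring",
  "role_descriptors": {
    "causely_monitor": {
      "cluster": ["monitor"],
      "indices": [
        { "names": ["*"], "privileges": ["monitor", "view_index_metadata"] }
      ]
    }
  }
}
EOF
}

# Send the request only when ES_URL is set,
# for example https://your-elasticsearch-host:9200.
# ES_USER (hypothetical) is a user allowed to manage API keys;
# curl prompts for the password.
if [ -n "${ES_URL:-}" ]; then
  api_key_request_body | curl --fail --silent --show-error \
    --user "${ES_USER:-elastic}" \
    --header 'Content-Type: application/json' \
    --request POST "${ES_URL}/_security/api_key" \
    --data @-
fi
```

The response contains the generated key material. Store it securely, since it is needed when creating the Kubernetes Secret.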
Step 2: Create a Kubernetes Secret for the API Key
After creating the API key, create a Kubernetes Secret that stores the key and the cluster URL:
kubectl create secret generic \
  --namespace causely elastic-credentials \
  --from-literal=api_key="..." \
  --from-literal=url='https://....eastus2.azure.elastic-cloud.com'
The url must be the endpoint URL of your Elasticsearch cluster:
- Elastic Cloud: https://your-cluster-id.region.elastic-cloud.com
- AWS OpenSearch: https://your-domain.region.es.amazonaws.com
- Azure Search: https://your-service.search.windows.net
- Self-hosted: https://your-elasticsearch-host:9200
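Before handing the URL and key to Causely, it can help to confirm that they work together. The sketch below calls the Elasticsearch cluster health endpoint with API key authentication. It assumes an Elasticsearch endpoint (Elastic Cloud or self-hosted) and that API_KEY holds the base64-encoded key value; ES_URL and API_KEY are placeholder variables introduced for this example.

```shell
# Build the Authorization header for an Elasticsearch API key.
# Assumes the argument is the base64-encoded key value.
auth_header() {
  printf 'Authorization: ApiKey %s' "$1"
}

# Query cluster health only when both placeholders are set.
# ES_URL and API_KEY stand for the values stored in the Secret.
if [ -n "${ES_URL:-}" ] && [ -n "${API_KEY:-}" ]; then
  curl --fail --silent --show-error \
    --header "$(auth_header "$API_KEY")" \
    "${ES_URL}/_cluster/health?pretty"
fi
```

A JSON response with a status field of green, yellow, or red suggests that the endpoint is reachable and the key is accepted for cluster monitoring.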
Step 3: Update Causely Configuration
Once the Secret is created, update the Causely configuration to enable scraping for the new cluster. Below is an example configuration:
scrapers:
  elasticsearch:
    enabled: true
    instances:
      - secretName: elastic-credentials
        namespace: causely
Alternative: Enable Credentials Autodiscovery
Causely also supports credentials autodiscovery. This feature allows you to add new scraping targets without updating the Causely configuration. Label the Kubernetes Secret to enable autodiscovery for the corresponding scraper.
kubectl --namespace causely label secret elastic-credentials "causely.ai/scraper=ElasticSearch"
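To check that the label was applied, you can list the Secrets that match it. This is a minimal sketch using a standard kubectl label selector; the NAMESPACE and SELECTOR variables are introduced here only for readability.

```shell
NAMESPACE="causely"
SELECTOR="causely.ai/scraper=ElasticSearch"

# List Secrets carrying the scraper label; skipped when kubectl is not installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl --namespace "$NAMESPACE" get secret \
    --selector "$SELECTOR" --show-labels
fi
```

The elastic-credentials Secret should appear in the output once the label is in place.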
What Data is Collected
The Elasticsearch scraper collects comprehensive metadata and performance information from your Elasticsearch clusters, including:
- Cluster entities with names and health status
- Service-to-cluster mappings (which service provides the Elasticsearch cluster)
- Connection details including endpoint URL and API authentication
- Cluster health metrics including shard allocation status
- Cluster status (green, yellow, red) and health indicators
- Shard allocation metrics (active, relocating, initializing, unassigned shards)
- Task queue monitoring (pending tasks, in-flight operations, queue wait times)
- Active shards percentage for overall cluster health
- Node information including names, roles, and attributes
- File descriptor usage (FileDescriptorUsage, FileDescriptorCapacity)
- Memory utilization (MemoryUsage, MemoryCapacity)
- CPU performance (CPUUsage, CPUCapacity)
- Load average metrics (1m, 5m, 15m) for trend analysis
- Disk space metrics (Usage, Capacity)
- File system statistics (total, free, available bytes, watermarks)
- I/O performance metrics (read/write operations, data transfer, I/O time)
- JVM memory pool statistics (heap used/committed/max, non-heap usage)
- Garbage collection metrics (collection count, time per collector)
- Thread statistics (current/peak thread counts, thread pool utilization)
- Buffer pool performance (buffer count, usage, capacity)
- Class loading statistics (loaded/unloaded classes)
- Operating system metrics (memory, CPU, swap usage)
- Cgroup metrics for containerized deployments
- Transport layer information (addresses, ports)
- Network endpoint mapping for service discovery
- Host and IP address tracking for infrastructure mapping
- Service-to-node mappings
- Node-to-VM relationships
- VM-to-disk relationships
- Attribute-based labeling for custom categorization