Thresholds for Service Symptoms
Causely automatically learns thresholds for service symptoms from observed metrics and uses them to detect those symptoms. You can override the learned thresholds to match your own requirements and SLO definitions. This document explains how to configure custom thresholds for your services.
Overview
Thresholds in Causely are used to:
- Define custom boundaries for service symptoms
- Specify SLO violation criteria
- Override automatically learned thresholds
Supported Thresholds
Currently, you can configure the following thresholds:
- Error Rate Threshold: Defines the maximum acceptable error rate for a service
- Latency Threshold: Defines the maximum acceptable latency for a service (in milliseconds)
Configuration Methods
Using Kubernetes Labels
The recommended way to configure thresholds is to use Kubernetes labels. Apply these labels to your services:
# Configure error rate threshold (for example, 1% error rate)
kubectl label svc -n <namespace> <service-name> "causely.ai/error-rate-threshold=0.01"
# Configure latency threshold (for example, 500ms)
kubectl label svc -n <namespace> <service-name> "causely.ai/latency-threshold=500.0"
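If a label is already set on a service, `kubectl label` refuses to change it unless `--overwrite` is passed. The sketch below wraps the commands above in two small shell functions so that a threshold can be set repeatedly and then inspected. The function names and the namespace and service in the usage comments are hypothetical; only the label keys come from this document.

```shell
# Set or update one threshold label on a service.
# $1 = namespace, $2 = service, $3 = label name after "causely.ai/", $4 = value
set_threshold() {
  # --overwrite lets the command succeed when the label already has a value.
  kubectl label svc -n "$1" "$2" --overwrite "causely.ai/$3=$4"
}

# Show both threshold labels as extra columns in the service listing.
# $1 = namespace, $2 = service
show_thresholds() {
  kubectl get svc -n "$1" "$2" \
    -L causely.ai/error-rate-threshold,causely.ai/latency-threshold
}

# Example usage (hypothetical namespace "shop" and service "checkout"):
#   set_threshold shop checkout error-rate-threshold 0.005
#   set_threshold shop checkout latency-threshold 250.0
#   show_thresholds shop checkout
```

Checking the labels after each change is a cheap way to confirm that the value was applied to the intended service.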
Threshold Values
- Error Rate Threshold: Expressed as a decimal (for example, 0.01 for 1%)
- Latency Threshold: Expressed in milliseconds (for example, 500.0 for 500 ms)
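SLOs are often written as percentages and seconds, while the labels expect a decimal fraction and milliseconds. The helpers below are a minimal sketch of that conversion; the function names are hypothetical and are not part of Causely.

```shell
# Convert a percentage (for example, 1 for 1%) to the decimal form used by
# the error rate threshold. Note: %g switches to exponent notation for very
# small results, so check the output for thresholds below 0.01%.
pct_to_ratio() {
  awk -v p="$1" 'BEGIN { printf "%g\n", p / 100 }'
}

# Convert seconds (for example, 0.5) to the millisecond form used by the
# latency threshold, with one decimal place.
seconds_to_ms() {
  awk -v s="$1" 'BEGIN { printf "%.1f\n", s * 1000 }'
}

pct_to_ratio 1      # prints 0.01
seconds_to_ms 0.5   # prints 500.0
```

The printed values can be pasted directly into the label commands, for example `causely.ai/error-rate-threshold=0.01`.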
Best Practices
- Start with Defaults: Begin with Causely's automatically learned thresholds
- Adjust Based on SLOs: Modify thresholds to match your specific SLO requirements
- Monitor Impact: After changing thresholds, monitor how they affect symptom detection
- Document Changes: Keep track of threshold changes and their rationale
Example Use Cases
- Strict SLO Requirements: Set lower error rate and latency thresholds for critical services
- Service-Specific Requirements: Configure different thresholds for different services
- Temporary Adjustments: Modify thresholds during maintenance windows
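When different services need different thresholds, keeping the values in one small table makes them easier to review and document. The following sketch reads such a table and builds one `kubectl label` command per service. The function name, the `APPLY` variable, and all namespaces, services, and values are hypothetical. By default it only prints the commands; setting `APPLY=1` runs them.

```shell
# Read rows of "namespace service error-rate latency-ms" from stdin and
# label each service. Prints the commands unless APPLY=1 is set.
apply_thresholds() {
  while read -r ns svc err lat; do
    # Skip blank lines and comment lines.
    case "$ns" in ''|'#'*) continue ;; esac
    set -- kubectl label svc -n "$ns" "$svc" --overwrite \
      "causely.ai/error-rate-threshold=$err" \
      "causely.ai/latency-threshold=$lat"
    if [ "${APPLY:-0}" = "1" ]; then
      # Keep kubectl from consuming the rest of the table on stdin.
      "$@" </dev/null
    else
      echo "$*"
    fi
  done
}

# Dry run with a stricter threshold for a critical service and a looser
# one for a batch service (all values are illustrative).
apply_thresholds <<'EOF'
# namespace  service       error-rate  latency-ms
shop         checkout      0.001       200.0
reports      batch-export  0.05        2000.0
EOF
```

Keeping the table under version control is one way to record threshold changes and their rationale, and the same file can be reapplied to restore values after a temporary adjustment.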