Thresholds for Service Symptoms

Causely automatically learns thresholds from service metrics and uses them to detect service symptoms. You can override these learned thresholds to better match your own requirements and SLO definitions. This document explains how to configure custom thresholds for your services.

Overview

Thresholds in Causely are used to:

  • Define custom boundaries for service symptoms
  • Specify SLO violation criteria
  • Override automatically learned thresholds

Supported Thresholds

Currently, you can configure the following thresholds:

  • Error Rate Threshold: Defines the maximum acceptable error rate for a service
  • Latency Threshold: Defines the maximum acceptable latency for a service (in milliseconds)

Configuration Methods

Using Kubernetes Labels

The recommended way to configure thresholds is to apply Kubernetes labels to your Service objects. Replace <namespace> and <service-name> with your own values:

# Configure error rate threshold (for example, 1% error rate)
kubectl label svc -n <namespace> <service-name> "causely.ai/error-rate-threshold=0.01"

# Configure latency threshold (for example, 500ms)
kubectl label svc -n <namespace> <service-name> "causely.ai/latency-threshold=500.0"
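
To check which thresholds are set on a service, change an existing value, or remove an override, the standard kubectl label operations apply. The sketch below wraps them in small shell functions; the function names are illustrative only and are not part of Causely or kubectl. It also assumes, without confirmation from this document, that removing a label lets Causely fall back to its learned threshold. Note that kubectl refuses to change a label that already exists unless --overwrite is passed.

```shell
# Hypothetical helper functions; only the kubectl subcommands and flags are real.

# Show all labels on a service, including any Causely threshold labels.
# Usage: show_thresholds <namespace> <service-name>
show_thresholds() {
  kubectl get svc -n "$1" "$2" --show-labels
}

# Change a latency threshold that is already set.
# Usage: update_latency_threshold <namespace> <service-name> <milliseconds>
update_latency_threshold() {
  kubectl label svc -n "$1" "$2" "causely.ai/latency-threshold=$3" --overwrite
}

# Remove the latency threshold label (a trailing "-" deletes a label).
# Assumption: Causely then reverts to its automatically learned threshold.
# Usage: remove_latency_threshold <namespace> <service-name>
remove_latency_threshold() {
  kubectl label svc -n "$1" "$2" "causely.ai/latency-threshold-"
}
```

For example, update_latency_threshold shop checkout 750.0 would raise the threshold for a service named checkout in a namespace named shop (both names are placeholders).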

Threshold Values

  • Error Rate Threshold: Expressed as a decimal fraction, not a percentage (for example, 0.01 for 1%)
  • Latency Threshold: Expressed in milliseconds (for example, 500.0 for 500 ms)
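
SLO targets are often written as percentages or in seconds, so a small conversion step can help avoid unit mistakes when setting these labels. The helper functions below are hypothetical (they are not part of Causely or kubectl) and only produce values in the formats listed above.

```shell
# Hypothetical helpers for converting SLO figures into label values.

# Convert a percentage to a decimal fraction, e.g. 1 -> 0.01.
percent_to_ratio() {
  LC_ALL=C awk -v p="$1" 'BEGIN { printf "%g\n", p / 100 }'
}

# Convert seconds to milliseconds with one decimal place, e.g. 0.5 -> 500.0.
seconds_to_ms() {
  LC_ALL=C awk -v s="$1" 'BEGIN { printf "%.1f\n", s * 1000 }'
}

# Example usage (placeholders left as in the commands above):
#   kubectl label svc -n <namespace> <service-name> \
#     "causely.ai/error-rate-threshold=$(percent_to_ratio 1)"
```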

Best Practices

  1. Start with Defaults: Begin with Causely's automatically learned thresholds
  2. Adjust Based on SLOs: Modify thresholds to match your specific SLO requirements
  3. Monitor Impact: After changing thresholds, monitor how they affect symptom detection
  4. Document Changes: Keep track of threshold changes and their rationale
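
One lightweight way to document a change is to store its rationale next to the label as a Kubernetes annotation. In the sketch below, the function name and the annotation key example.com/threshold-change-reason are made up for illustration; nothing in this document suggests that Causely defines or reads them.

```shell
# Hypothetical annotation key used only to record why a threshold changed.
REASON_KEY="example.com/threshold-change-reason"

# Set a threshold label and record the rationale in an annotation.
# Usage: set_threshold_with_reason <namespace> <service-name> <label> <value> <reason>
set_threshold_with_reason() {
  ns="$1"; svc="$2"; label="$3"; value="$4"; reason="$5"
  # Only annotate if the label was applied successfully.
  kubectl label svc -n "$ns" "$svc" "$label=$value" --overwrite &&
    kubectl annotate svc -n "$ns" "$svc" "$REASON_KEY=$reason" --overwrite
}
```

The annotation can later be read back with kubectl describe svc, which lists both labels and annotations.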

Example Use Cases

  1. Strict SLO Requirements: Set lower thresholds for critical services
  2. Service-Specific Requirements: Configure different thresholds for different services
  3. Temporary Adjustments: Modify thresholds during maintenance windows
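
For a temporary adjustment, it helps to save the current value first so it can be restored afterwards. The following is a minimal sketch, assuming the latency label shown earlier; the function names are hypothetical, and it assumes that a service with no label should end up with no label again after the maintenance window.

```shell
LABEL="causely.ai/latency-threshold"

# Print the current latency threshold label, or nothing if it is not set.
# Dots inside the label key are escaped for kubectl's JSONPath syntax.
# Usage: current_threshold <namespace> <service-name>
current_threshold() {
  kubectl get svc -n "$1" "$2" \
    -o jsonpath='{.metadata.labels.causely\.ai/latency-threshold}'
}

# Restore a previously saved value; remove the label if none was saved.
# Usage: restore_threshold <namespace> <service-name> <saved-value>
restore_threshold() {
  if [ -n "$3" ]; then
    kubectl label svc -n "$1" "$2" "$LABEL=$3" --overwrite
  else
    kubectl label svc -n "$1" "$2" "$LABEL-"
  fi
}

# Example flow (namespace "shop" and service "checkout" are placeholders):
#   saved="$(current_threshold shop checkout)"
#   kubectl label svc -n shop checkout "$LABEL=2000.0" --overwrite
#   ... perform maintenance ...
#   restore_threshold shop checkout "$saved"
```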