Skip to main content

v1.0.126

Generic Application Error Log Detection

Causely now detects generic application errors in logs and incorporates them into the causal model via a new Application Error Spike symptom and root cause. Previously, log-based root cause detection relied on predefined error patterns; errors that did not match a known pattern were not surfaced. When Causely observes more than 2,000 generic errors within a one-hour window, it activates the application error log symptom, which is then used in continuous root cause analysis to help agents and on-call engineers identify and resolve the underlying issue faster. Note that the threshold for error log symptom activation is configurable, see more detail in threshold configuration.

Application Error Spike root cause

Learn more about log-based root causes

Token Management and Developer Role

Causely now supports rotating and revoking the tokens used with the Causely mediator, and introduces a new Developer role. Developers can add new mediators and manage tokens for those mediators; Administrator roles retain the ability to manage tokens across all mediators in a tenant. Read Only users cannot add mediators or manage tokens.

Causely mediator token management

Minor Improvements

  • AWS RDS database cluster support: Causely now identifies both the overall RDS cluster and its individual databases, improving visibility into database-layer root causes.
  • Symptom activation timing: Improved the symptom activation delay logic to better distinguish bursty anomalies from sustained ones. Learn more
  • Generic webhook notifications: Improved support for sending Causely notifications to generic webhooks. Learn more
  • Agent-not-deployed false positives: Addressed incorrect "Causely agent is down" symptom activation for nodes where an agent was intentionally not deployed.
  • Elasticsearch integration: Addressed an issue affecting the Elasticsearch integration.
  • Service logs tab: Clarified that errors and warnings shown in the Service logs tab reflect the last 5 minutes of activity.
  • Node malfunction detection: Improved detection of the node malfunction root cause to correctly handle scenarios where nodes are automatically removed by autoscaling mechanisms.