Release Notes

Each release of Causely includes new features, bug fixes, and performance improvements. This page provides highlights of the most recent releases.

We'd Love Your Feedback!

Have ideas, questions, or feedback? Please reach out to us at community@causely.ai.

August 11, 2025

Version 1.0.88

Root Cause View—Sort Historical Root Causes by Symptom Count and Duration

Understanding what happened in past incidents is key to preventing them in the future and conducting more effective postmortem analyses. We've made this easier with new sorting options in the Historical Root Causes view:

  • Symptom Count: Focus on root causes that triggered a significant number of symptoms. These often represent issues with a large blast radius that, if they recur, can have substantial impact on the environment.
  • Duration: Pinpoint issues that have been degrading performance for a significant time and need to be addressed to restore stability.

For clarity, the Symptoms, Services Degraded, Duration, Start, and End columns now reflect the occurrence that matches your selected sort order. For example, if you sort by Symptoms, you'll see the occurrence with the highest symptom count; if you sort by Duration, you'll see the occurrence with the longest duration, and so on.
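As a rough illustration of that behavior, the sketch below picks the occurrence that matches the selected sort order. The `Occurrence` type, its field names, and the `SORT_KEYS` mapping are hypothetical and are not Causely's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Occurrence:
    # Hypothetical record for one occurrence of a historical root cause.
    start: datetime
    end: datetime
    symptom_count: int

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


# Hypothetical sort keys mirroring the two new sort options.
SORT_KEYS = {
    "symptoms": lambda o: o.symptom_count,
    "duration": lambda o: o.duration,
}


def representative_occurrence(occurrences, sort_by):
    """Return the occurrence with the largest value for the chosen sort key."""
    return max(occurrences, key=SORT_KEYS[sort_by])


occurrences = [
    Occurrence(datetime(2025, 8, 1, 9, 0), datetime(2025, 8, 1, 9, 20), symptom_count=12),
    Occurrence(datetime(2025, 8, 3, 14, 0), datetime(2025, 8, 3, 17, 0), symptom_count=4),
]
by_symptoms = representative_occurrence(occurrences, "symptoms")
by_duration = representative_occurrence(occurrences, "duration")
```

Here the short occurrence with 12 symptoms is shown when sorting by Symptoms, and the three-hour occurrence is shown when sorting by Duration.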

Ask Causely—Smarter Natural Language Investigations

Ask Causely, our natural language interface for understanding what’s going on in your environment, just got more powerful.

You can now:

  • Access more metrics across more entities, including disks.
  • Quickly get summarized views of key metrics over time for faster analysis.
  • Investigate faster by simply asking questions, even if you’re not deeply familiar with the environment.

For example, you can now ask:

“Show CPU and memory utilization for the virtual machines with the highest CPU over the last few hours.”
…and instantly see the charts you need to take action.

Did you know?

Greater root cause clarity with database instrumentation

By leveraging eBPF, and without any additional instrumentation, Causely discovers databases in your environment, including Postgres, MySQL, MongoDB, CockroachDB, and more. Causely represents these databases as services and will correctly infer them as the source of root causes impacting other services (for example, service congestion causing increased latency or error rates).

You can gain access to more specific database-related root causes by configuring Prometheus as a data source. We support exporters for Postgres, MySQL, MongoDB, and CockroachDB, which enable Causely to infer root causes such as:

  • Slow Database Query: Database queries are taking longer than expected, slowing dependent services and risking resource starvation.
  • Database Connection Pool Saturated: All available connections in the client-side database pool are in use, blocking new queries and potentially causing requests to hang or fail.

You can take this one step further by adding specific instrumentation for Postgres or MySQL in Causely, which unlocks:

  • 3 additional root causes on the database table entity:
    • Malfunction: The database table is degraded or erroring, causing slow queries, potential errors, and reduced performance for dependent applications.
    • Excessive Lock: An abnormally high rate of exclusive locks is blocking concurrent access to the table, creating a performance bottleneck for client applications.
    • DDL Excessive Lock: Frequent DDL locks are blocking both reads and writes during schema changes, disrupting and slowing all dependent services.
  • The ability to view slow queries directly from a database table in the UI.

Bug Fixes and Minor Improvements

  • Added support for multi-tenant Mimir deployments. See the documentation to learn more.
  • Improved formatting and clarity in the weekly email digest to make it easier to review notable root causes.
  • Enhanced performance of Ask Causely queries when returning large datasets.

July 29, 2025

Version 1.0.85

Faster Recovery with Resource Contention Remediation

You can now remediate resource contention issues directly from the Causely UI. This helps you resolve incidents faster and restore service performance without breaking your flow.

Supported Root Causes:

The remediation interface provides step-by-step guidance and Kubernetes configuration examples, making it easy to implement fixes with confidence.

Smarter Urgency Detection for Root Causes

We've improved how Causely flags urgent issues. Root causes that lead to SLO violations, or put your SLOs at risk, are now automatically marked as Urgent and sent to your alerting channels (for example, Slack) by default. Stay focused on what matters most.

This enhancement ensures that critical issues requiring immediate attention are automatically prioritized and routed to the right teams, reducing response times and improving incident management workflows.

Customizable SLO Settings

You can now fine-tune SLO targets and burn rate thresholds within Causely to align with your team's reliability goals. This helps improve the precision of urgency detection and alerting.

Key Features:

  • Configure custom SLO targets for different services
  • Set burn rate thresholds to control alert sensitivity
  • Align reliability goals with business requirements
  • Improve alert precision and reduce false positives

This feature is currently in preview. Check out our SLO Configuration documentation for detailed setup instructions.
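To make the relationship between an SLO target and a burn rate threshold concrete, here is a minimal sketch using the common SRE definition of burn rate (observed error rate divided by the error budget). Whether Causely computes burn rate exactly this way is an assumption, and the function names and threshold values are illustrative only.

```python
def burn_rate(error_rate: float, slo_target: float) -> float:
    """Ratio of the observed error rate to the error budget implied by the SLO."""
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_rate / error_budget


def exceeds_threshold(error_rate: float, slo_target: float, threshold: float) -> bool:
    """True when the budget is being consumed at or above the threshold rate."""
    return burn_rate(error_rate, slo_target) >= threshold


# A 99.9% target leaves a 0.1% error budget, so 0.5% errors
# consume that budget roughly five times faster than allowed.
rate = burn_rate(error_rate=0.005, slo_target=0.999)
sensitive = exceeds_threshold(0.005, 0.999, threshold=2.0)   # lower threshold, alerts sooner
relaxed = exceeds_threshold(0.005, 0.999, threshold=10.0)    # higher threshold, alerts later
```

Under these assumptions, lowering the threshold makes alerting more sensitive, while raising it trades sensitivity for fewer false positives.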

Did you know?

Causely supports multiple sources for tracing out of the box. While our agents come with eBPF-based automatic instrumentation by default, you can also use Odigos, groundcover, Grafana Beyla, or existing OpenTelemetry data.

Additional service dependencies can be identified from Datadog or Dynatrace data, giving you flexibility to work with your existing observability stack.

Bug Fixes and Minor Improvements

  • Load Balancer Support: Added support for Google Cloud internal load balancers, enabling better visibility into private service endpoints.
  • Controller Discovery: Improved discovery of custom controllers such as GitLab Runner to enhance coverage of user-defined Kubernetes workloads.

July 21, 2025

Version 1.0.84

Root Cause Impact—Now in Architectural Context

Quickly see which services are impacted by a root cause. Causely now provides a direct view of the services being affected, along with a link to the live service-to-service topology graph.

This helps teams align on the scope of the issue, prioritize response, and collaborate on remediation—all from a shared system view.

Easier Alerting Integration Setup and Testing

Streamline your incident response workflows. We now support in-product testing for alerting integrations (like Slack, PagerDuty, and more), so you can confirm everything works before going live.

Causely supports multiple workflow destinations—ensuring your on-call engineers get timely, actionable context in the tools they already use.

Fine-Tuned Symptom Activation—Because Every Second Counts

Control how fast (or slow) Causely infers a root cause. You can now customize the activation and deactivation delays for symptoms on a per-service basis using Kubernetes labels, Nomad service tags, or Consul service metadata.

By default, a condition must persist for 10 minutes before a symptom becomes active—but now you can tighten that window for mission-critical services or extend it for lower-priority ones. See our symptom delay configuration guide for detailed setup instructions and best practices.

This helps you tune responsiveness without sacrificing signal quality.
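The effect of an activation delay can be sketched as a simple persistence check. This is an illustrative model only, not Causely's implementation: the `symptom_active` function and its sample format are hypothetical, and deactivation delays are not modeled.

```python
from datetime import datetime, timedelta


def symptom_active(samples, now, activation_delay=timedelta(minutes=10)):
    """Return True if the condition has held continuously for at least activation_delay.

    samples: list of (timestamp, condition_met) tuples, sorted by timestamp.
    """
    held_since = None
    for ts, met in samples:
        if met:
            if held_since is None:
                held_since = ts  # condition just started holding
        else:
            held_since = None  # any gap resets the window
    return held_since is not None and now - held_since >= activation_delay


t0 = datetime(2025, 7, 21, 12, 0)
now = t0 + timedelta(minutes=12)
steady = [(t0, True), (t0 + timedelta(minutes=4), True), (t0 + timedelta(minutes=8), True)]
flapping = [(t0, True), (t0 + timedelta(minutes=4), False), (t0 + timedelta(minutes=8), True)]

default_window = symptom_active(steady, now)
longer_window = symptom_active(steady, now, activation_delay=timedelta(minutes=15))
flapping_default = symptom_active(flapping, now)
flapping_tight = symptom_active(flapping, now, activation_delay=timedelta(minutes=2))
```

In this model, a shorter delay reacts to the flapping condition sooner, while a longer delay waits for more sustained evidence before activating the symptom.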

Understand Past Incidents Faster

Explore historical root causes with more precision. You can now sort by services degraded count, start time, or end time to quickly find the root causes that matter most from previous days or weeks.

Whether you're doing a postmortem or scanning for recurring issues, it's now much easier to answer: "What was going on at that time?"

Did you know?

Causely uses eBPF technology to automatically instrument your applications with zero code changes and minimal overhead. Powered by Grafana Beyla, our eBPF-based instrumentation extracts rich telemetry data from services written in Go, Java, .NET, NodeJS, Python, Ruby, Rust, and more—without requiring language-specific agents or application modifications.

This zero-effort integration provides actionable insights into service interactions, latencies, and system performance, and it's enabled by default in all Causely deployments.

Learn more about how Causely leverages eBPF for automatic instrumentation.

July 1, 2025

Version 1.0.83

Improved Root Cause Views

We've redesigned our Root Cause views to help you quickly identify and address the most urgent service-impacting issues. The new interface prioritizes critical root causes based on their impact scope and severity, making incident triage more efficient than ever.

API Documentation Now Available

Our comprehensive GraphQL API documentation is now available! Programmatically access Causely's root cause analysis engine, query defects, and integrate with your CI/CD pipelines.

Smarter Post-Deployment Analysis

Causely now provides enhanced visibility into code change-related root causes with improved clarity around version change events:

  • Immediate Detection: Catch resource usage changes right after a deployment to identify regressions faster
  • Precise Correlation: Version timestamps are now directly correlated with resource metrics like CPU, memory, and latency
  • Before vs. After Insights: Get proactive visibility into what changed pre and post-deployment
  • Automatic Code Change Attribution: Causely automatically infers if a root cause stems from a code change, helping teams quickly connect symptoms to recent deployments

This feature is particularly valuable for understanding the real performance impact of new releases on your services.
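The before vs. after idea can be sketched as a comparison of a metric's mean on either side of a version change timestamp. This is a simplified illustration, not Causely's analysis; the `before_after_change` helper and the sample data are hypothetical.

```python
from statistics import mean


def before_after_change(samples, deployed_at):
    """Relative change in the mean of a metric after a deployment.

    samples: list of (timestamp_seconds, value) tuples.
    Returns None when either side has no data or the baseline is zero.
    """
    before = [value for ts, value in samples if ts < deployed_at]
    after = [value for ts, value in samples if ts >= deployed_at]
    if not before or not after:
        return None
    baseline = mean(before)
    if baseline == 0:
        return None
    return (mean(after) - baseline) / baseline


# Hypothetical CPU samples (millicores) around a deployment at t=120s.
cpu_samples = [(0, 100), (60, 100), (120, 150), (180, 150)]
change = before_after_change(cpu_samples, deployed_at=120)
```

With this data, `change` is the fractional increase in mean CPU after the deployment relative to the pre-deployment baseline.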

Better Visibility Into Asynchronous Data Flows

Understanding complex message flows across your distributed system just got easier with our improved data flow graphs:

  • End-to-End Tracing: Follow messages from publish to RPC method or HTTP path, even across multiple service hops
  • Topic Filtering: Isolate behavior for specific customers or queues by filtering data flows by topic
  • Causal Integration: These improvements are fully integrated into our causal engine, enhancing root cause accuracy for asynchronous systems

Did you know?

Scopes in Causely allow you to define and manage custom subsets of your environment's topology. As Causely automatically discovers the full topology of your environment, it can present a rich but potentially overwhelming set of entities—services, infrastructure components, and identified problems. Scopes help you focus on the specific subset of data that matters most to your role, responsibilities, or current investigative tasks.

Learn more about scopes and how to use them in our documentation.

Bug Fixes and Minor Improvements

  • Database Performance: Added indices to active actions and increased max DB pool connections for better performance
  • Slack Integration: Enhanced support for users in multiple Slack teams
  • Kubernetes Improvements: Added option to disable entity log collection from Kubernetes and improved pod metadata for network endpoints
  • Alert Management: Added alerts as context in symptoms and implemented continuous sender for alert manager notifications
  • Log Management: Optimized log storage by limiting to 1000 log lines per evidence
  • UI Improvements: Fixed time duration format display and headline database scanning
  • Root Cause Quality: Filtered out root causes with only low probability symptoms and fixed flapping issues due to equal probabilities
  • Monitoring Enhancements: Added support for symptom monitoring from Prometheus and improved topology scraper metrics for SLOs
  • SQL Parsing: Enhanced SQL query parsing for better database performance analysis
  • Email Notifications: Improved weekly email summary delivery system

June 2025

Version 1.0.81

Ask Causely: SQL Query Analysis & Service Graphs

Ask Causely has leveled up. In addition to answering questions about incidents and metrics, it now helps you analyze and troubleshoot your slowest SQL queries and automatically visualizes query topology and service graphs.

Want to enable Ask Causely? Reach out to your Causely team to activate early access.

Grafana Plugin is Now Publicly Available

We're excited to officially launch the Causely plugin for Grafana! Now anyone can bring Causely's root cause analysis directly into their existing dashboards with no extra context-switching required!

Latest Highlights:

  • Root Cause Panel: See urgent issues and causal summaries in place.
  • Root Cause Headlines: Clear, contextual insights without leaving Grafana.
  • Unhealthy Services Panel: Quickly identify degraded services and their downstream impact at a glance.

This marks a major step in bringing Causely's causal intelligence to where your teams already work. With the new plugin, Grafana becomes not just a place for metrics, but a place for action. We'll continue to expand the plugin with even more capabilities in upcoming releases.

Enriching Root Cause Analysis with Trusted Signals from Prometheus, Alertmanager, and Checkly

Causely now ingests alerts from Prometheus, Alertmanager, and Checkly—bringing the signals you already trust into our causal engine. These alerts are automatically mapped to known symptoms in your environment, giving incidents immediate structure, historical continuity, and causal depth.

  • Prometheus + Alertmanager: Pull alerts in real-time and map them to symptoms in your knowledge graph—enhancing situational awareness and accelerating investigations.
  • Checkly: API check failures are now linked to the services they impact and surfaced as active symptoms, giving synthetic monitoring real operational context.
  • Symptom Activation/Deactivation: Alerts can now directly toggle symptom states, powering more dynamic, accurate, and automated RCA workflows.

With these integrations, Causely doesn't just observe alerts—it interprets them in context. You're extending your causal graph with trusted telemetry, making every incident easier to understand, triage, and resolve.
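As a hedged sketch of how alerts could toggle symptom states, the example below walks an Alertmanager-style webhook payload (a top-level `alerts` list whose entries carry `status` and `labels`). The `ALERT_TO_SYMPTOM` mapping, the symptom names, and the use of a `service` label are assumptions for illustration and do not describe Causely's actual mapping.

```python
# Hypothetical mapping from alert names to symptom names; illustrative only.
ALERT_TO_SYMPTOM = {
    "HighRequestLatency": "HighLatency",
    "HighErrorRate": "HighErrorRate",
}


def apply_alerts(payload, symptom_states):
    """Update symptom_states in place from an Alertmanager-style webhook payload.

    Keys are (service, symptom) tuples; values are True while the alert is firing.
    Alerts without a known alertname or a `service` label (an assumed label) are skipped.
    """
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        symptom = ALERT_TO_SYMPTOM.get(labels.get("alertname"))
        service = labels.get("service")
        if symptom is None or service is None:
            continue
        symptom_states[(service, symptom)] = alert.get("status") == "firing"
    return symptom_states


payload = {
    "status": "firing",
    "alerts": [
        {"status": "firing", "labels": {"alertname": "HighRequestLatency", "service": "checkout"}},
        {"status": "resolved", "labels": {"alertname": "HighErrorRate", "service": "cart"}},
        {"status": "firing", "labels": {"alertname": "UnmappedAlert", "service": "cart"}},
    ],
}
states = apply_alerts(payload, {("cart", "HighErrorRate"): True})
```

A firing alert activates the mapped symptom, a resolved alert deactivates it, and unmapped alerts are ignored in this sketch.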

Did you know?

Causely automatically discovers and visualizes your service dependencies—no manual config required. We analyze runtime communication patterns (HTTP, gRPC, SQL, Kafka, etc.) to give you a live, layered map of your architecture—from services to infra to messaging layers—enabling more accurate root cause analysis and impact prediction.

Bug Fixes and Minor Improvements

  • Weekly Email Summaries: Automatically receive a weekly summary of incidents and RCA results to stay aligned.

May 2025

Version 1.0.79

New Landing Page: See What's Most Interesting in Your Environment

The new Causely landing page gives you a high-signal, low-noise view of the Root Cause Headlines in your environment over the last 24 hours.

Whether it's a sudden spike in latency, a critical root cause affecting your services, or key SLO risks, we now highlight it the moment you log in.

This helps teams prioritize actions based on impact.

Ask Causely: Your Incident Copilot in Causely and Slack (Early Access)

Introducing Ask Causely, your LLM-powered assistant built for real-time operational insight. Whether you're in Slack or in the Causely UI, Ask Causely helps you resolve incidents faster and improve service health.

What it can do:

  • Respond to users' natural language questions, like "Which services are currently impacted by active root causes?"
  • Get context-aware answers: root cause + symptoms + suggested next steps
  • Integrated into both Slack and Causely Web UI

Want to enable Ask Causely? Reach out to your Causely team to activate early access.

Simplified Navigation: Focused on What Matters Most

We've rethought the structure of Causely's interface to spotlight our core value: real-time, automated analysis of what's causing service latency and errors.

What's improved:

  • Streamlined layout with fewer distractions
  • Root causes now front and center
  • A new getting-started checklist to help you activate value faster

This refined navigation ensures that your attention goes straight to high-impact issues.

Bug Fixes and Minor Improvements

  • Smarter Root Cause Alerts: We now notify you only for root causes that impact multiple services beyond just SLO violations, reducing noise and helping you prioritize real incidents.
  • Refined Symptom Deactivation Logic: Error symptoms are now tied to real request activity, preventing premature deactivation or activation in idle services.
  • Per-Service Thresholds: Teams can now configure latency and error thresholds for individual services, replacing the default ML-based learned thresholds. This allows for more fine-grained alert tuning and better alignment with service-specific expectations.
  • Splunk OnCall Integration: Causely now supports Splunk OnCall notifications, expanding your ops toolchain with automated incident routing.
  • AWS Discovery & Metrics Improvements: We've added pagination for AWS ALB discovery, improved tag handling, and now set ALB latency directly from observed values for faster, more accurate symptom detection.

Older Releases

To get details on older releases, please reach out to us at community@causely.ai.