
v1.0.101

November 12, 2025


MCP Server: Causely's Causal Reasoning Engine, Now in Your IDE

The new MCP Server gives you direct access to Causely's Causal Reasoning Engine (CRE) from any MCP-compatible IDE.

You can now pull the full causal context of an inferred cause to automatically generate a code fix and a pull request, without leaving your workflow. This capability is in Early Access; let us know if you'd like to try it in your environment.

Read the blog post announcing MCP Server for IDEs: Introducing Causely’s MCP Server.

Telemetry Integrations in the UI

You can now add and manage telemetry data sources directly in the Causely UI. The Integrations experience shows:

  • The health of all configured sources
  • The last discovery time and data retrieved
  • Recommendations for which data sources to prioritize next

As you expand your telemetry coverage, Causely builds a more complete causal model of your system, improving the precision of root cause inference and proactive detection.

Tip: Adding instrumentation enhances Causely's ability to model service-to-service communication and automatically detect reliability risks.

Integrations page showing configured data sources
Integration details view with entity discovery and health
Suggested integrations showing recommended data sources

Background Operation Root Causes

Causely's causal model now includes background operations such as asynchronous message consumers (Kafka, RabbitMQ, and more).
This enables automatic inference of slow consumer root causes for message-driven workloads.
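
To make the notion of a slow consumer concrete, the sketch below flags a consumer whose lag keeps growing across consecutive samples. It is an illustration only, not Causely's inference logic; the function name, the sample format, and the `min_growth_steps` parameter are assumptions made for this example.

```python
from typing import Sequence


def is_slow_consumer(lag_samples: Sequence[int], min_growth_steps: int = 3) -> bool:
    """Return True if lag grew on each of the last `min_growth_steps` intervals.

    `lag_samples` holds consumer lag (messages produced but not yet consumed),
    oldest first. Both the format and the threshold are hypothetical.
    """
    if len(lag_samples) < min_growth_steps + 1:
        return False  # not enough history to judge
    recent = lag_samples[-(min_growth_steps + 1):]
    # Compare each sample with the one that follows it.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

A consumer that merely has a large but stable backlog is not flagged here; only sustained growth counts, which is one simple way to separate a temporary burst from a consumer that cannot keep up.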

Learn more in the docs.

Expanded Elasticsearch Support

Causely now supports additional Elasticsearch indices, ensuring that the most relevant logs are linked to each inferred root cause — giving you the “why” behind degradations with more context and precision.

Did you know?

Not only can Causely leverage Grafana Alloy, Loki and Beyla as telemetry sources, but you can also use the Grafana Plugin to bring Causely's causal insights directly into your Grafana dashboards. Learn more in the docs.

Bug Fixes and Minor Improvements

  • Automatically discover Docker services exposed on host IPs for more complete topology mapping
  • Make OpenTelemetry sample rate configurable to maintain mediation health
  • Show active scope and filter details in the Topology view
  • Update service graph to display intermediate services along route destinations

v1.0.99

October 30, 2025


Ask Causely: Logs and Events Access

When you ask about a service's health, Ask Causely now summarizes recent errors and warnings from container logs and relevant events. This gives you faster context on what changed and where to look next.

Uptime SLO Added

Beyond response error rate and latency SLOs, uptime SLOs are now automatically configured for every discovered service. Causely continuously learns baseline thresholds for request durations. You can customize SLOs and thresholds per service or deprioritize services you don't want to operate under an SLO.

SLO view

HTTP Path and RPC Method Mapping

You now have clear visibility into which service each HTTP path or RPC method belongs to. When latency or error rates spike on a specific endpoint, you can immediately identify the underlying service for faster remediation.

HTTP path to service association

Historical Root Causes Filter

Filter historical root causes by precise time windows to narrow your analysis and track behavior over specific periods.

Did you know?

We've partnered with Google Gemini to enhance Causely's AI SRE experience. Causely’s causal reasoning engine delivers deterministic diagnoses, while Gemini adds grounded summarization, entity recognition, and code generation to power Ask Causely and causal explanations/remediations. Read the blog: How Causely and Google Gemini Are Powering Autonomous Reliability.

Bug Fixes and Minor Improvements

  • Updated Beyla agent to address incorrect service graph rendering in the Causely UI.
  • Loki logs display fix: timestamps now correctly show the original date and time of each message.

v1.0.96

October 13, 2025


Service Prioritization

Not all services are created equal. You can now set priorities for your services so notifications reflect what matters most. Assign strict SLOs to critical services and reduce noise from lower-priority ones, helping teams focus where impact is highest.

service prioritization

incident.io Integration

Causely integrates with incident.io to automate incident response. Causely analyzes alerts in real time, maps them onto live dependency graphs, and pinpoints the single root cause behind cascading failures; incident.io then routes the incident to the right team with full causal context, reducing triage time and accelerating recovery.

Elastic Log Ingestion

For customers sending logs directly to Elasticsearch, Causely can now ingest and analyze those logs. See log-based errors and contributing signals alongside metrics and traces in your causal graph.

Instana Integration

Causely now supports Instana for trace, metric, and data-flow ingestion. This enables teams using Instana to visualize dependencies, identify performance bottlenecks, and detect root causes directly within Causely.

Early Access: Enhanced Root Cause Descriptions and Remediations

We've introduced an LLM-powered preview that enhances root cause descriptions and recommended remediations. Causely synthesizes the most relevant context—symptoms, metrics, and logs—to produce clear, actionable explanations. We'd love your feedback as we refine this capability.

Did you know?

Causely has a feature-rich API that allows you to programmatically access and integrate with Causely's causal engine. Learn more about the Causely API.

Bug Fixes and Minor Improvements

  • Improved auto-mapping of Prometheus alerts defined for RabbitMQ
  • Improved resource metric visualization for Nomad environments
  • Improved Datadog integration to leverage OTel traces
  • Improved handling of context in initial prompts and answer parsing for Ask Causely

v1.0.92

September 23, 2025


Enhanced Ask Causely with Integration and Documentation Support

Ask Causely now provides intelligent assistance across two critical areas that significantly expand its utility:

Integration Status Intelligence:

  • Real-Time Integration Health: Ask about the current state of your telemetry integrations and data sources
  • Configuration Validation: Get insights into integration setup and troubleshooting guidance
  • Coverage Analysis: Understand which parts of your infrastructure are being monitored and identify gaps
  • Performance Metrics: Query integration performance, data ingestion rates, and connection health

Documentation-Aware Assistance:

  • Context-Aware Documentation: Ask Causely can now answer questions directly from Causely's extensive documentation
  • Feature Guidance: Get help understanding how to use specific Causely features and capabilities
  • Best Practice Recommendations: Receive expert advice on configuration, setup, and optimization
  • Troubleshooting Support: Access step-by-step guidance for common issues and advanced scenarios

This enhancement makes Ask Causely your comprehensive assistant for both operational troubleshooting and platform guidance, reducing context switching and accelerating problem resolution.

Enhanced Root Cause Management

Managing and analyzing root causes becomes more powerful with new filtering and historical analysis capabilities:

Priority Filtering and Tagging:

  • Priority-Based Filtering: Filter root causes by priority levels to focus on the most critical issues
  • Tag-Based Organization: Use custom tags for root causes for better categorization and filtering
  • Enhanced Search: Quickly find specific root causes using tag and priority filters

Historical Analysis:

  • Custom Date Filters: Analyze root causes across any custom date range for historical insights
  • Trend Analysis: Identify patterns and recurring issues over time
  • Historical Context: Better understand how root causes have evolved and been resolved

These capabilities help teams prioritize incident response and gain insights from historical patterns.

filter historical root causes

Mediation Insights: Understanding Your Causely Setup

Causely now provides comprehensive visibility into its own operations, giving you clear insights into how your Causely deployment is working and what it's discovering in your environment:

See What Causely Knows About Your Environment:

  • Discovered Entities Overview: Get a clear picture of all the services, databases, load balancers, and infrastructure components that Causely has discovered
  • Discovery Progress: Track which parts of your infrastructure Causely is actively monitoring and identify any gaps in coverage
  • Setup Validation: Understand whether your Causely installation is working as expected and discovering the entities you expect it to find
  • Environment Coverage: See the full scope of what Causely is monitoring across your entire system

Monitor Causely's Health and Performance:

  • Integration Status: Check the health of all your configured data sources and integrations
  • Connection Monitoring: See which external systems (monitoring tools, databases, cloud providers) Causely is successfully connecting to
  • Processing Performance: Monitor how efficiently Causely is analyzing your telemetry data and generating insights
  • System Resource Usage: Track Causely's own resource consumption and performance metrics

These self-insights give you confidence that Causely is working correctly, help you optimize your setup, and ensure you're getting comprehensive coverage of your infrastructure.

discovered entities

Enhanced eBPF Instrumentation with Beyla 2.6

Causely now leverages Grafana Beyla 2.6, bringing significant improvements to automatic instrumentation capabilities:

New Beyla 2.6 Features:

  • Improved MongoDB Instrumentation: Enhanced support for MongoDB monitoring and trace collection
  • Advanced Service Discovery: Better automatic discovery of services and applications
  • Enhanced OpenTelemetry Integration: Improved compatibility with OpenTelemetry ecosystem components
  • Stability Improvements: Various fixes and enhancements for more reliable instrumentation

Enhanced Integration Capabilities:

  • Configurable OpenTelemetry SDK: More flexible configuration options for telemetry collection
  • Improved Metrics Collection: Enhanced metrics gathering through OpenTelemetry protocols
  • Advanced Discovery Configuration: Fine-tuned discovery settings for different deployment scenarios

This upgrade keeps Causely's automatic instrumentation current with recent advances in eBPF-based observability.

Did you know?

Causely works with virtually any programming language because it's built on OpenTelemetry, the industry-standard observability framework. OpenTelemetry provides native instrumentation libraries for all major languages including:

  • Java
  • .NET
  • Go
  • Python
  • JavaScript
  • C++
  • Rust
  • PHP
  • Ruby
  • Swift
  • Erlang/Elixir
  • ... and many more.

Whether your applications are already instrumented with OpenTelemetry or you're just getting started, Causely can help.

This flexibility means Causely can provide root cause analysis regardless of your technology stack or observability maturity.

Bug Fixes and Minor Improvements

This release includes numerous enhancements and fixes across the platform:

  • incident.io Integration: Added support for incident.io auto-mapping and integration for enhanced workflow management
  • Improved Notification System: Fixed notification ordering and enhanced support for multiple owner notifications
  • Enhanced Performance: Resolved race conditions, improved queue processing, and better latency threshold calculations
  • Better Visualization: Improved topology graphs with bidirectional edges and enhanced root cause naming for AWS ALB and GCP load balancers

v1.0.90

August 29, 2025


Enhanced Impact Understanding with Blast Radius Analysis

Understanding the scope and potential impact of a root cause is critical for effective incident response. Causely now provides comprehensive Blast Radius Analysis that gives you a clearer understanding of both the current and potential impact of every root cause.

Key capabilities:

  • Current Impact Assessment: See exactly which services are currently affected by a root cause and how the issue is propagating through your system
  • Potential Impact Prediction: Understand which additional services could be impacted if the root cause persists or worsens
  • Risk Assessment: Evaluate the broader implications of each root cause to prioritize response efforts effectively
  • Visual Impact Mapping: Clear visualization of impact scope helps teams align on response priorities and resource allocation

This enhanced impact analysis ensures teams can make informed decisions about incident response priorities and resource allocation, leading to more effective incident management.
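
The difference between current and potential impact can be sketched as a graph traversal: everything that transitively depends on the failing entity could be affected, and the potential impact is that set minus what is already degraded. This is a minimal sketch under stated assumptions, not Causely's algorithm; the `dependents` input format, the function name, and the service names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def potential_blast_radius(dependents: Dict[str, List[str]], origin: str) -> Set[str]:
    """Collect every service that transitively depends on `origin`.

    `dependents` maps a service to the services that call it directly
    (hypothetical input format). The origin itself is excluded.
    """
    reached: Set[str] = set()
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for caller in dependents.get(current, []):
            # The `reached` check also protects against dependency cycles.
            if caller not in reached and caller != origin:
                reached.add(caller)
                queue.append(caller)
    return reached


# Hypothetical topology: checkout and search call catalog; catalog calls postgres.
topology = {"postgres": ["catalog"], "catalog": ["checkout", "search"]}
currently_affected = {"catalog"}
potential = potential_blast_radius(topology, "postgres") - currently_affected
```

In this example `potential` contains `checkout` and `search`: they are not degraded yet, but they sit downstream of the already affected `catalog` service.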

blast radius

Datadog Alert Auto-Mapping for Trusted Signal Integration

Causely now automatically maps Datadog monitors to corresponding service symptoms, enabling you to leverage your existing curated signals for enhanced root cause analysis.

For organizations that have invested in creating reliable Datadog alerts and monitors, this integration provides a powerful way to incorporate those trusted signals into Causely's causal reasoning engine.

Key benefits:

  • Leverage Existing Investments: Utilize your carefully curated Datadog monitors and alerts within Causely's causal analysis
  • Enhanced Signal Quality: Combine Causely's automatic discovery with your domain expertise encoded in Datadog alerts
  • Faster Root Cause Identification: Trusted signals accelerate the path from symptom detection to root cause identification
  • Seamless Integration: Automatic mapping requires no additional configuration—Causely intelligently correlates Datadog alerts with service symptoms

This capability is particularly valuable for teams who have developed sophisticated monitoring strategies in Datadog and want to enhance their root cause analysis capabilities.

Improved Dataflow Visualization and Topic Metrics

We've enhanced Causely's ability to understand and visualize complex asynchronous data flows across your distributed systems:

Enhanced Topic Metrics:

  • Comprehensive Topic Monitoring: Better visibility into message queue performance and throughput
  • Producer-Consumer Relationships: Clear mapping of data flow relationships between services through messaging systems
  • Queue Depth Analysis: Monitor queue depths and identify potential bottlenecks in asynchronous processing

Improved Flow Visualization:

  • Directional Flow Mapping: Enhanced visualization of data flow direction between topics, tables, and services
  • End-to-End Tracing: Follow data flows from producers through topics to consumers with improved clarity
  • Performance Correlation: Better correlation between data flow metrics and service performance impacts

These improvements provide deeper insights into how data moves through your systems, making it easier to identify bottlenecks and performance issues in event-driven architectures.

Aggregated Pod-Level Metrics

Causely now provides enhanced pod-level observability with aggregated metrics that give you better insights into container performance:

Enhanced Pod Visibility:

  • Aggregated Resource Metrics: Combined view of resource utilization across related pods
  • Node-Container Relationships: Improved tracking of how containers relate to their host nodes
  • Performance Correlation: Better correlation between individual pod performance and overall service health
  • Resource Optimization Insights: Clearer understanding of resource usage patterns to inform optimization decisions

This enhancement provides more granular visibility into your containerized workloads while maintaining the broader service-level context that's crucial for effective root cause analysis.

Location-Based Notification Routing

Organizations operating across multiple regions or locations can now route root cause notifications to different destinations based on where entities are located:

Key Features:

  • Geographic Routing: Automatically route notifications based on entity location or region
  • Multi-Region Operations: Support for organizations managing infrastructure across different states, countries, or data centers
  • Customizable Routing Rules: Configure notification destinations based on entity labels, namespaces, or other location indicators
  • Operational Efficiency: Ensure the right teams receive notifications for incidents in their operational domain

This capability is particularly valuable for organizations with distributed operations teams who need location-specific incident routing to ensure fast response times and appropriate escalation paths.
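
The routing idea can be illustrated with a small first-match rule table keyed on entity labels. This is a sketch only and does not reflect Causely's configuration format; the label keys, label values, destination names, and the `route` function are all assumptions.

```python
from typing import Dict, List, Tuple

# Each rule pairs required entity labels with a destination name.
# Label keys, values, and destinations are hypothetical.
RoutingRule = Tuple[Dict[str, str], str]

RULES: List[RoutingRule] = [
    ({"region": "eu-west-1"}, "slack-emea-oncall"),
    ({"region": "us-east-1", "namespace": "payments"}, "pagerduty-payments-us"),
    ({"region": "us-east-1"}, "slack-us-oncall"),
]


def route(entity_labels: Dict[str, str], default: str = "slack-global") -> str:
    """Return the destination of the first rule whose labels all match."""
    for required, destination in RULES:
        if all(entity_labels.get(key) == value for key, value in required.items()):
            return destination
    return default
```

Because the first matching rule wins, more specific rules (region plus namespace) are listed before broader ones (region only), and a default destination catches entities with no location label.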

Did you know?

Even without an active root cause, you can drill down into a service and get a list of potential root causes that could be impacting it, and a list of symptoms across entities that could be indicative of those root causes. This enables you to assess risk and prioritize response efforts effectively.

potential root causes

Bug Fixes and Minor Improvements

This release includes numerous stability improvements and enhancements across the platform:

  • Database Performance: Improved database headline scanning for better performance and reliability
  • Entity Management: Enhanced handling of entity data request pathways with status tracking
  • Cloud SQL Stability: Improvements to cloud-sql proxy for more stable database connections
  • Memory Management: Enhanced memory noisy neighbor detection with additional condition checks
  • Notification System: Improved notification routing per mediator for better distribution
  • Network Metrics: Enhanced collection of basic network metrics for improved observability
  • Entity Relationships: Better management of node-container relationships in the mediator
  • eBPF Instrumentation: Upgraded to Beyla 2.5 for improved automatic instrumentation capabilities
  • Entity Configuration: New database table and APIs for enhanced entity configuration management
  • Manifestation Handling: Improved handling of manifestations for deleted entities
  • Redis Performance: Enhanced Redis queue performance for better system responsiveness

v1.0.89

August 25, 2025


New Features

Docker Host Installation Support

Causely now offers a new streamlined installation option for Docker hosts. This installation method enables telemetry collection and root-cause analysis for containerized services on any Docker-enabled host, complementing our existing deployment options.

Key capabilities:

  • eBPF-based telemetry collection for Docker containers
  • Root cause analysis for services running on Docker hosts
  • Easy setup with automated installation scripts
  • Support for privileged container deployment with host PID access

Learn more about setting up Causely on Docker hosts in our Docker Host Installation guide.

Multi-Database Support for MySQL and PostgreSQL

Causely now supports multiple MySQL and PostgreSQL databases using the same secret configuration. This allows you to monitor multiple databases with the same credentials, reducing the need for separate secret management.

Learn more about setting up multiple MySQL and PostgreSQL databases in our MySQL Configuration guide and PostgreSQL Configuration guide.

Did You Know?

Streamline incident response with CauselyBot webhook integration

Causely can automatically forward root cause notifications to your preferred collaboration tools through CauselyBot, our open source webhook service. CauselyBot receives authenticated payloads from Causely and intelligently routes them to platforms like Slack, Microsoft Teams, and OpsGenie.

Key capabilities include:

  • Smart Filtering: Configure custom filters based on severity, entity type, SLO impact, or root cause name to ensure teams only receive relevant notifications
  • Multiple Destinations: Route different types of incidents to appropriate teams—send critical database issues to the DBA team while routing general service degradations to the on-call engineers
  • Secure Authentication: Built-in bearer token validation ensures only authorized notifications reach your systems
  • Easy Deployment: Available as Docker containers or Helm charts for seamless integration into your existing infrastructure

This enables faster incident response by delivering actionable context directly to the tools your teams already use, reducing mean time to resolution and improving collaboration during critical incidents.
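
As a rough illustration of bearer token validation and severity filtering in a webhook receiver, consider the sketch below. It is not CauselyBot's code: the `severity` payload field, the severity scale, and both function names are assumptions, and the header lookup assumes the key is spelled exactly `Authorization`.

```python
import hmac
from typing import Mapping

SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]  # hypothetical scale


def is_authorized(headers: Mapping[str, str], expected_token: str) -> bool:
    """Validate an `Authorization: Bearer <token>` header in constant time."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), expected_token.encode())


def passes_filter(payload: Mapping[str, object], min_severity: str = "High") -> bool:
    """Forward only payloads at or above `min_severity`; drop unknown values."""
    severity = payload.get("severity")
    if severity not in SEVERITY_ORDER:
        return False
    return SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(min_severity)
```

A receiver would typically reject the request when `is_authorized` returns False and only then evaluate filters, so unauthenticated payloads never reach the routing step.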

Bug Fixes and Minor Improvements

This release focuses on stability improvements and bug fixes across the platform:

  • Impact Graph Resilience: Improved impact graph calculation to handle missing services gracefully, resulting in faster loading of the impact graph UI.
  • Database Health Checks: Added comprehensive health checks for cloud-sql-proxy and ping-check database connections for improved monitoring.
  • Redis Span Enhancement: Improved Redis span collection for better distributed tracing coverage.
  • Service Impact Calculation: Improved accuracy of service impact calculations to address scenarios where impact was too broad.
  • Entity Deletion Handling: Improved logic for handling deleted entities in the system. For example, clusters or namespaces that have been removed are no longer shown in the topology.
  • Copilot Improvements: Various enhancements to the Causely Copilot functionality.

v1.0.88

August 11, 2025


Root Cause View—Sort Historical Root Causes by Symptom Count and Duration

Understanding what happened in past incidents is key to preventing them in the future and conducting more effective postmortem analyses. We've made this easier with new sorting options in the Historical Root Causes view:

  • Symptom Count: Focus on root causes that triggered a significant number of symptoms. These often represent issues with a large blast radius that, if they recur, can have substantial impact on the environment.
  • Duration: Pinpoint issues that have been degrading performance for a significant time and need to be addressed to restore stability.

For clarity, the Symptoms, Services Degraded, Duration, Start, and End columns now reflect the occurrence that matches your selected sort order. For example, if you sort by Symptoms, you'll see the occurrence with the highest symptom count; if you sort by Duration, you'll see the occurrence with the longest duration, and so on.
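
The column behavior described above can be sketched as "pick one representative occurrence per root cause, then sort by the same field". The code is an illustration under assumptions, not the product's implementation; the `Occurrence` type, its field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Occurrence:
    root_cause: str
    symptoms: int
    duration_min: int


def representative_rows(occurrences: Iterable[Occurrence], sort_by: str) -> List[Occurrence]:
    """Keep, per root cause, the occurrence with the highest `sort_by` value,
    then order the rows by that same value, largest first."""
    best: Dict[str, Occurrence] = {}
    for occ in occurrences:
        current = best.get(occ.root_cause)
        if current is None or getattr(occ, sort_by) > getattr(current, sort_by):
            best[occ.root_cause] = occ
    return sorted(best.values(), key=lambda o: getattr(o, sort_by), reverse=True)


# Hypothetical history: the same root cause occurred twice.
history = [
    Occurrence("db-congested", symptoms=12, duration_min=20),
    Occurrence("db-congested", symptoms=4, duration_min=95),
    Occurrence("cpu-throttled", symptoms=7, duration_min=40),
]
```

Sorting `history` by `symptoms` shows the 12-symptom occurrence of `db-congested`, while sorting by `duration_min` shows its 95-minute occurrence instead, which is why the displayed columns change with the sort order.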

v1.0.85

July 29, 2025


Faster Recovery with Resource Contention Remediation

You can now remediate resource contention issues directly from the Causely UI. This helps you resolve incidents faster and restore service performance without breaking your flow.

Supported Root Causes:

The remediation interface provides step-by-step guidance and Kubernetes configuration examples, making it easy to implement fixes with confidence.

remediate now interface

Smarter Urgency Detection for Root Causes

We've improved how Causely flags urgent issues. Root causes that lead to SLO violations, or put your SLOs at risk, are now automatically marked as Urgent and sent to your alerting channels (for example, Slack) by default. Stay focused on what matters most.

This enhancement ensures that critical issues requiring immediate attention are automatically prioritized and routed to the right teams, reducing response times and improving incident management workflows.

Customizable SLO Settings

You can now fine-tune SLO targets and burn rate thresholds within Causely to align with your team's reliability goals. This helps improve the precision of urgency detection and alerting.

Key Features:

  • Configure custom SLO targets for different services
  • Set burn rate thresholds to control alert sensitivity
  • Align reliability goals with business requirements
  • Improve alert precision and reduce false positives

This feature is currently in preview. Check out our SLO Configuration documentation for detailed setup instructions.
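
For readers new to burn rates, the standard SRE definition is the observed error ratio divided by the error budget, where the budget is one minus the SLO target. The sketch below shows that arithmetic; the function names and the example threshold are assumptions for illustration and do not describe how Causely evaluates SLOs internally.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Burn rate = observed error ratio / error budget (budget = 1 - target)."""
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def is_burning_too_fast(error_ratio: float, slo_target: float, threshold: float) -> bool:
    """True when the budget is being consumed at or above `threshold` times
    the rate that would exactly exhaust it over the SLO window."""
    return burn_rate(error_ratio, slo_target) >= threshold
```

With a 99.9% target, a 0.5% error ratio gives a burn rate of about 5, meaning the budget would be spent roughly five times faster than planned. A lower threshold makes alerting more sensitive; a higher one trades sensitivity for fewer false positives.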

Did you know?

Causely supports multiple tracing sources out of the box. While our agents ship with eBPF-based automatic instrumentation, you can also use Odigos, groundcover, Grafana Beyla, or existing OpenTelemetry data.

Additional service dependencies can be identified from Datadog or Dynatrace data, giving you flexibility to work with your existing observability stack.

Bug Fixes and Minor Improvements

  • Load Balancer Support: Added support for Google Cloud internal load balancers, enabling better visibility into private service endpoints.
  • Controller Discovery: Improved discovery of custom controllers such as GitLab Runner to enhance coverage of user-defined Kubernetes workloads.

v1.0.84

July 21, 2025


Root Cause Impact—Now in Architectural Context

Quickly see which services are impacted by a root cause. Causely now provides a direct view of the services being affected, along with a link to the live service-to-service topology graph.

This helps teams align on the scope of the issue, prioritize response, and collaborate on remediation—all from a shared system view.

Easier Alerting Integration Setup and Testing

Streamline your incident response workflows. We now support in-product testing for alerting integrations (like Slack, PagerDuty, and more), so you can confirm everything works before going live.

Causely supports multiple workflow destinations—ensuring your on-call engineers get timely, actionable context in the tools they already use.

alerting integration testing

Fine-Tuned Symptom Activation—Because Every Second Counts

Control how fast (or slow) Causely infers a root cause. You can now customize the activation and deactivation delays for symptoms on a per-service basis using Kubernetes labels, Nomad service tags, or Consul service metadata.

By default, a condition must persist for 10 minutes before a symptom becomes active—but now you can tighten that window for mission-critical services or extend it for lower-priority ones. See our symptom delay configuration guide for detailed setup instructions and best practices.
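
The effect of an activation delay can be sketched as a simple debounce: a symptom only becomes active once its condition has held without interruption for the full delay. This is an illustrative model, not Causely's implementation; the sample format and function name are assumptions, and the deactivation delay is left out to keep the example short.

```python
from typing import Iterable, List, Optional, Tuple


def symptom_active_at(
    samples: Iterable[Tuple[int, bool]], activation_delay_s: int = 600
) -> List[Tuple[int, bool]]:
    """For each (timestamp_s, condition_true) sample, report whether the symptom
    is active: the condition must have held continuously for the full delay."""
    result: List[Tuple[int, bool]] = []
    true_since: Optional[int] = None  # when the current unbroken run began
    for ts, condition in samples:
        if condition:
            if true_since is None:
                true_since = ts
            active = ts - true_since >= activation_delay_s
        else:
            true_since = None  # any gap resets the timer
            active = False
        result.append((ts, active))
    return result
```

With the default of 600 seconds (the 10 minutes mentioned above), a condition that flickers off at any point restarts the wait. Lowering `activation_delay_s` for a critical service makes the symptom fire sooner at the cost of reacting to shorter blips.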

This helps you tune responsiveness without sacrificing signal quality.

Understand Past Incidents Faster

Explore historical root causes with more precision. You can now sort by services degraded count, start time, or end time to quickly find the root causes that matter most from previous days or weeks.

Whether you're doing a postmortem or scanning for recurring issues, it's now much easier to answer: "What was going on at that time?"

historical root cause filtering

Did you know?

Causely uses eBPF technology to automatically instrument your applications with zero code changes and minimal overhead. Powered by Grafana Beyla, our eBPF-based instrumentation extracts rich telemetry data from services written in Go, Java, .NET, Node.js, Python, Ruby, Rust, and more, without requiring language-specific agents or application modifications.

This zero-effort integration provides actionable insights into service interactions, latencies, and system performance, and it's enabled by default in all Causely deployments.

Learn more about how Causely leverages eBPF for automatic instrumentation.

v1.0.83

July 1, 2025


Improved Root Cause Views

We've redesigned our Root Cause views to help you quickly identify and address the most urgent service-impacting issues. The new interface prioritizes critical root causes based on their impact scope and severity, making incident triage more efficient than ever.

API Documentation Now Available

Our comprehensive GraphQL API documentation is now available! Programmatically access Causely's root cause analysis engine, query defects, and integrate with your CI/CD pipelines.

Smarter Post-Deployment Analysis

Causely now provides enhanced visibility into code change-related root causes with improved clarity around version change events:

  • Immediate Detection: Catch resource usage changes right after a deployment to identify regressions faster
  • Precise Correlation: Version timestamps are now directly correlated with resource metrics like CPU, memory, and latency
  • Before vs. After Insights: Get proactive visibility into what changed pre- and post-deployment
  • Automatic Code Change Attribution: Causely automatically infers if a root cause stems from a code change, helping teams quickly connect symptoms to recent deployments

This feature is particularly valuable for understanding the real performance impact of new releases on your services.
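
A before-versus-after comparison around a version change can be sketched as the relative shift in a metric's mean across the deployment timestamp. This is a simplified illustration rather than Causely's detection method; the sample format and the `deployment_delta` function are assumptions.

```python
from statistics import mean
from typing import List, Tuple


def deployment_delta(samples: List[Tuple[float, float]], deployed_at: float) -> float:
    """Relative change in the mean of a metric after `deployed_at`.

    `samples` is a list of (timestamp, value) pairs; a return value of 0.5
    means the metric's mean rose by 50% after the deployment.
    """
    before = [value for ts, value in samples if ts < deployed_at]
    after = [value for ts, value in samples if ts >= deployed_at]
    if not before or not after:
        raise ValueError("need samples on both sides of the deployment")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(after) - baseline) / baseline
```

For example, CPU samples averaging 100 millicores before a release and 150 after it yield 0.5, a 50% increase that lines up with the version timestamp.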

post-deployment analysis

Better Visibility Into Asynchronous Data Flows

Understanding complex message flows across your distributed system just got easier with our improved data flow graphs:

  • End-to-End Tracing: Follow messages from publish to RPC method or HTTP path, even across multiple service hops
  • Topic Filtering: Isolate behavior for specific customers or queues by filtering data flows by topic
  • Causal Integration: These improvements are fully integrated into our causal engine, enhancing root cause accuracy for asynchronous systems

data flow

Did you know?

Scopes in Causely allow you to define and manage custom subsets of your environment's topology. As Causely automatically discovers the full topology of your environment, it can present a rich but potentially overwhelming set of entities—services, infrastructure components, and identified problems. Scopes help you focus on the specific subset of data that matters most to your role, responsibilities, or current investigative tasks.

Learn more about scopes and how to use them in our documentation.

Bug Fixes and Minor Improvements

  • Database Performance: Added indices to active actions and increased max DB pool connections for better performance
  • Slack Integration: Enhanced support for users in multiple Slack teams
  • Kubernetes Improvements: Added option to disable entity log collection from Kubernetes and improved pod metadata for network endpoints
  • Alert Management: Added alerts as context in symptoms and implemented continuous sender for alert manager notifications
  • Log Management: Optimized log storage by limiting to 1000 log lines per evidence
  • UI Improvements: Fixed time duration format display and headline database scanning
  • Root Cause Quality: Filtered out root causes with only low-probability symptoms and fixed flapping issues caused by equal probabilities
  • Monitoring Enhancements: Added support for symptom monitoring from Prometheus and improved topology scraper metrics for SLOs
  • SQL Parsing: Enhanced SQL query parsing for better database performance analysis
  • Email Notifications: Improved weekly email summary delivery system

v1.0.81

June 2025


Ask Causely: SQL Query Analysis & Service Graphs

Ask Causely has leveled up. In addition to answering questions about incidents and metrics, it now helps you analyze and troubleshoot your slowest SQL queries and automatically visualizes query topology and service graphs.

Want to enable Ask Causely? Reach out to your Causely team to activate early access.

Ask Causely

Grafana Plugin is Now Publicly Available

We're excited to officially launch the Causely plugin for Grafana! Now anyone can bring Causely's root cause analysis directly into their existing dashboards with no extra context-switching required!

Latest Highlights:

  • Root Cause Panel: See urgent issues and causal summaries in place.
  • Root Cause Headlines: Clear, contextual insights without leaving Grafana.
  • Unhealthy Services Panel: Quickly identify degraded services and their downstream impact at a glance.

This marks a major step in bringing Causely's causal intelligence to where your teams already work. With the new plugin, Grafana becomes not just a place for metrics, but a place for action. We'll continue to expand the plugin with even more capabilities in upcoming releases.

Enriching Root Cause Analysis with Trusted Signals from Prometheus, Alertmanager, and Checkly

Causely now ingests alerts from Prometheus, Alertmanager, and Checkly—bringing the signals you already trust into our causal engine. These alerts are automatically mapped to known symptoms in your environment, giving incidents immediate structure, historical continuity, and causal depth.

  • Prometheus + Alertmanager: Pull alerts in real time and map them to symptoms in your knowledge graph, enhancing situational awareness and accelerating investigations.
  • Checkly: API check failures are now linked to the services they impact and surfaced as active symptoms, giving synthetic monitoring real operational context.
  • Symptom Activation/Deactivation: Alerts can now directly toggle symptom states, powering more dynamic, accurate, and automated RCA workflows.

With these integrations, Causely doesn't just observe alerts—it interprets them in context. You're extending your causal graph with trusted telemetry, making every incident easier to understand, triage, and resolve.
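
As a rough illustration of alerts toggling symptom states, the sketch below consumes a payload shaped like Alertmanager's webhook format (an `alerts` list whose entries carry `status` and `labels`). Everything on the symptom side is hypothetical: the `ALERT_TO_SYMPTOM` table, the `SymptomStore` class, and the assumption that each alert carries a `service` label are illustrative only and do not describe Causely's internal model.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from alert names to symptom names (illustration only).
ALERT_TO_SYMPTOM = {
    "HighRequestLatency": "HighLatency",
    "HighErrorRate": "HighErrorRate",
}


@dataclass
class SymptomStore:
    """Tracks active (service, symptom) pairs; not a Causely type."""
    active: set = field(default_factory=set)

    def apply_alert(self, alert: dict) -> None:
        labels = alert.get("labels", {})
        symptom = ALERT_TO_SYMPTOM.get(labels.get("alertname"))
        service = labels.get("service")  # assumes alerts are labeled with a service
        if symptom is None or service is None:
            return  # unmapped alerts are ignored in this sketch
        key = (service, symptom)
        if alert.get("status") == "firing":
            self.active.add(key)       # a firing alert activates the symptom
        elif alert.get("status") == "resolved":
            self.active.discard(key)   # a resolved alert deactivates it


def apply_webhook(store: SymptomStore, payload: dict) -> None:
    # An Alertmanager webhook payload groups individual alerts under "alerts".
    for alert in payload.get("alerts", []):
        store.apply_alert(alert)
```

A firing `HighRequestLatency` alert labeled `service="checkout"` would add `("checkout", "HighLatency")` to the active set, and the matching resolved alert would remove it.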

Did you know?

Causely automatically discovers and visualizes your service dependencies—no manual config required. We analyze runtime communication patterns (HTTP, gRPC, SQL, Kafka, etc.) to give you a live, layered map of your architecture—from services to infra to messaging layers—enabling more accurate root cause analysis and impact prediction.

Bug Fixes and Minor Improvements

  • Weekly Email Summaries: Automatically receive a weekly summary of incidents and RCA results to stay aligned.

v1.0.79

May 2025

New Landing Page: See What's Most Interesting in Your Environment

The new Causely landing page gives you a high-signal, low-noise view of the Root Cause Headlines in your environment over the last 24 hours.

Whether it's a sudden spike in latency, a critical root cause affecting your services, or key SLO risks, we now highlight it the moment you log in.

This helps teams prioritize actions based on impact.

Ask Causely: Your Incident Copilot in Causely and Slack (Early Access)

Introducing Ask Causely, your LLM-powered assistant built for real-time operational insight. Whether you're in Slack or in the Causely UI, Ask Causely helps you resolve incidents faster and improve service health.

What it can do:

  • Answer natural language questions, like "Which services are currently impacted by active root causes?"
  • Give context-aware answers: root cause + symptoms + suggested next steps
  • Work in both Slack and the Causely Web UI

Want to enable Ask Causely? Reach out to your Causely team to activate early access.

Simplified Navigation: Focused on What Matters Most

We've rethought the structure of Causely's interface to spotlight our core value: real-time, automated analysis of what's causing service latency and errors.

What's improved:

  • Streamlined layout with fewer distractions
  • Root causes now front and center
  • A new getting-started checklist to help you activate value faster

This refined navigation ensures that your attention goes straight to high-impact issues.

Bug Fixes and Minor Improvements

  • Smarter Root Cause Alerts: We now notify you only for root causes that impact multiple services beyond just SLO violations, reducing noise and helping you prioritize real incidents.
  • Refined Symptom Deactivation Logic: Error symptoms are now tied to real request activity, preventing premature deactivation or activation in idle services.
  • Per-Service Thresholds: Teams can now configure latency and error thresholds for individual services, replacing the default ML-based learned thresholds. This allows for more fine-grained alert tuning and better alignment with service-specific expectations.
  • Splunk OnCall Integration: Causely now supports Splunk OnCall notifications, expanding your ops toolchain with automated incident routing.
  • AWS Discovery & Metrics Improvements: We've added pagination for AWS ALB discovery, improved tag handling, and now set ALB latency directly from observed values for faster, more accurate symptom detection.
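
To make the per-service thresholds item concrete, here is a minimal sketch of how a configured threshold could take precedence over a learned one. The `Thresholds` type, the field names, and the field-by-field fallback are assumptions for illustration; they do not describe Causely's configuration format or behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical threshold pair; None means "not set"."""
    latency_ms: Optional[float] = None
    error_rate: Optional[float] = None


def effective_thresholds(learned: Thresholds,
                         override: Optional[Thresholds]) -> Thresholds:
    # With no per-service configuration, the learned thresholds apply as-is.
    if override is None:
        return learned
    # Assumed behavior: each configured field replaces the learned value,
    # and any field left unset falls back to the learned one.
    return Thresholds(
        latency_ms=(override.latency_ms
                    if override.latency_ms is not None
                    else learned.latency_ms),
        error_rate=(override.error_rate
                    if override.error_rate is not None
                    else learned.error_rate),
    )
```

Under these assumptions, a service that configures only a latency threshold keeps its learned error-rate threshold.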