Agent Integration

Building Reliable Agents with Causely

Agents fail not because they lack data. They fail because data alone does not explain causality.

An agent with access to metrics, logs, and traces still cannot reliably determine what caused an issue, how far it has spread, or what action is safe to take. That requires a causal model: a structured understanding of how services, dependencies, and failure patterns relate.

Causely provides that model. Agents query Causely through the MCP server and receive structured, deterministic answers, including root causes, blast radius, dependency maps, and remediation guidance, instead of raw signals to interpret.

The Gap in Today's Agent Architectures

Most agent-driven systems run into three core limitations:

Information gap
Agents can retrieve telemetry, but cannot consistently determine what is happening or what matters.

System gap
There is no shared understanding of how services, infrastructure, and dependencies relate to each other.

Execution gap
Agents lack a reliable way to determine which actions are safe and how to coordinate them.

As a result, agents require human interpretation, and automation breaks down at scale.

Benchmark: agents with and without Causely

Across 72 benchmark experiments, agents using Causely consumed 48% fewer tokens, ran 63% faster, and reached 100% diagnosis accuracy. See the benchmark.

Where Causely Fits

Causely provides a system intelligence layer that continuously models how your system behaves: its services, dependencies, and failure propagation.

Instead of reasoning over raw telemetry, agents interact with structured, deterministic system knowledge. Decisions are based on how the system actually behaves, not on correlation or heuristics.

Architecture Overview

[Agent (for example, Holmes or a custom agent)]
                ↓
[Causely (causal model + reasoning engine)]
                ↓
[Observability + Infrastructure (metrics, traces, logs, alerts)]

Agent: orchestrates workflows, queries systems, and takes action
Causely: builds and maintains a causal model and provides deterministic reasoning
Observability + Infrastructure: provides raw signals and telemetry

What Your Agent Can Do

The Causely MCP server exposes 24 tools across 5 categories. Here is what each category enables:

Entity Resolution: Resolve service and database names to IDs, enumerate namespaces and clusters, check current health status. Most workflows start here.
Data Retrieval: Retrieve time-series metrics, live logs, alert history, deployment events, configuration files, and slow query analysis for any entity.
Health & Diagnosis: Get active symptoms environment-wide, identify root causes with impacted services and remediation guidance, check SLOs, map service topology, and get structured health summaries for services, teams, or individual entities.
Reporting & Postmortems: Generate deterministic postmortem drafts and structured engineering tickets from resolved incident data.
Reliability & Deployment: Compare resource consumption before and after deployments for a single service or an entire fleet.

Integration Paths

Choose based on how much you want to build.

Option	Best for	What you get
MCP Server	Any MCP-compatible agent or assistant	Standardized interface to all 24 Causely tools; works with Cursor, Claude Code, VS Code, and others
HolmesGPT	Teams already using Holmes	Pre-built agent with Causely MCP configured; no custom integration required
Custom Agents	Teams building internal tooling or automation pipelines	Full control over logic, policies, and execution; MCP or direct API

If you are starting fresh, use the MCP Server. It works with any agent that supports the Model Context Protocol and requires no custom code.
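
To make the "standardized interface" concrete, here is a minimal sketch of what a tool invocation looks like at the protocol level. MCP messages are JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The tool name `get_entities` comes from this page; the `query` argument and the service name `checkout-service` are assumptions for illustration, so check the tool schema that the server reports before relying on them. In practice an MCP-compatible agent or client SDK builds these messages for you.

```python
import itertools
import json
from typing import Any

# JSON-RPC requests carry an id so responses can be matched to requests.
_request_ids = itertools.count(1)


def build_tool_call(tool_name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Tool name is from this page; the "query" argument name is an assumption.
request = build_tool_call("get_entities", {"query": "checkout-service"})
print(json.dumps(request, indent=2))
```

Every tool is called through the same message shape, which is why no Causely-specific client code is needed on the agent side.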

Example Workflow

Scenario: High error rate alert

1. The agent receives an alert.
2. The agent calls get_entities() to resolve the alerted service name to an entity ID.
3. The agent calls get_root_causes() to identify the source.
4. Causely returns:
   - Root cause service
   - Affected dependencies
   - Explanation of why this is the cause
   - Remediation guidance
5. The agent:
   - Notifies the correct team
   - Suggests or executes remediation
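
The workflow above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Causely's actual API: only the tool names `get_entities` and `get_root_causes` come from this page. The `ToolClient` interface, the argument names, the response shapes, and the `FakeClient` with its canned data are all hypothetical, and exist only so the sketch runs without a real deployment.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class ToolClient(Protocol):
    """Hypothetical interface: anything that can invoke an MCP tool by name."""

    def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]: ...


@dataclass
class Diagnosis:
    root_cause_service: str
    affected_dependencies: list[str]
    explanation: str
    remediation: str


def handle_alert(client: ToolClient, service_name: str) -> Diagnosis:
    # Resolve the alerted service name to an entity ID (assumed response shape).
    entities = client.call_tool("get_entities", {"query": service_name})["entities"]
    if not entities:
        raise LookupError(f"no entity found for {service_name!r}")
    entity_id = entities[0]["id"]

    # Ask for the root cause affecting that entity (assumed response shape).
    causes = client.call_tool("get_root_causes", {"entity_id": entity_id})["root_causes"]
    if not causes:
        raise LookupError(f"no root cause reported for {entity_id!r}")
    cause = causes[0]

    # Map the answer onto the four items the workflow says Causely returns.
    return Diagnosis(
        root_cause_service=cause["service"],
        affected_dependencies=cause["affected"],
        explanation=cause["explanation"],
        remediation=cause["remediation"],
    )


class FakeClient:
    """Canned, made-up responses standing in for a real MCP session."""

    def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        if name == "get_entities":
            return {"entities": [{"id": "svc-123", "name": arguments["query"]}]}
        if name == "get_root_causes":
            return {"root_causes": [{
                "service": "payments-db",
                "affected": ["checkout", "cart"],
                "explanation": "Connection pool exhausted",
                "remediation": "Increase pool size or roll back the last deploy",
            }]}
        raise ValueError(f"unknown tool: {name}")


diagnosis = handle_alert(FakeClient(), "checkout")
# Final step: notify the owning team and suggest the remediation.
print(f"Notify owners of {diagnosis.root_cause_service}: {diagnosis.remediation}")
```

The agent code contains no telemetry parsing. It resolves an entity, asks for a root cause, and acts on a structured answer, which matches the division of responsibility described in the architecture overview.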

When This Approach Is Most Valuable

This architecture is most effective when:

- You operate distributed systems with many interdependent services
- You already have observability in place
- You are building or evaluating automated incident workflows

Summary

Causely does not replace your agents or your observability stack.

It provides the system intelligence layer required for agents to interpret telemetry consistently, identify true root causes, and take safe, coordinated action.
