
MCP Server Integration

The Causely MCP server gives agents and AI assistants direct access to Causely's causal reasoning engine. 24 tools across 5 categories let your agent move from raw alerts to structured root cause analysis, dependency maps, and reliability reports, without writing custom integrations.

Key Workflows

These are the four workflows agents use most often. Each maps to a specific sequence of MCP tool calls.

Incident Triage

Identify what's broken and how far it has spread.

  1. get_symptoms(): see all active symptoms across the entire environment (no filters needed)
  2. get_root_causes(): identify all active root causes and impacted services
  3. get_root_causes(symptom_ids=[...]): drill into a specific symptom's causes
  4. get_topology(entity_id=..., mode="dependents"): map which upstream services are affected
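The sequence above can be sketched as MCP `tools/call` requests for teams scripting it by hand. This is a minimal sketch, not Causely's client code: the JSON-RPC envelope follows the MCP specification, the tool and argument names come from the steps above, and the `tool_call` and `triage_requests` helpers plus the example IDs are hypothetical. In a real run, steps 3 and 4 would take their IDs from the responses to steps 1 and 2, whose shape is not documented here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC request IDs; any unique values work


def tool_call(name, **arguments):
    """Build one MCP `tools/call` JSON-RPC request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def triage_requests(symptom_ids, entity_id):
    """Return the four triage requests in the order listed above.

    symptom_ids and entity_id are supplied by the caller; they are
    placeholders for values read from earlier tool responses.
    """
    return [
        tool_call("get_symptoms"),                         # step 1: no filters
        tool_call("get_root_causes"),                      # step 2: all active root causes
        tool_call("get_root_causes", symptom_ids=symptom_ids),  # step 3
        tool_call("get_topology", entity_id=entity_id, mode="dependents"),  # step 4
    ]


if __name__ == "__main__":
    for request in triage_requests(["symptom-123"], "entity-456"):
        print(json.dumps(request))
```

Building the requests separately from sending them keeps the sequence easy to log and replay, whichever MCP transport your agent uses.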

Quick Service Health

Get a complete health picture for a specific service in two calls.

  1. get_entities(query="service-name", entity_types=["Service"]): resolve the service name to its entity ID
  2. get_service_summary(service="service-name"): return a full snapshot (status, active symptoms, root causes, SLOs, metrics, recent events, error logs)

Post-Deploy Validation

Check whether a deployment introduced regressions.

  1. reliability_delta(service="service-name"): compare CPU, memory, latency, and error rate before vs after the most recent deployment
  2. fleet_reliability_delta(team="team-name"): batch check across all services for a team, namespace, or explicit list

Post-Incident Reporting

Generate postmortem documentation and action items from a resolved incident.

  1. get_root_causes(root_cause_id=...): retrieve full root cause details, timeline, and blast radius
  2. postmortem(root_cause_id=...): generate a structured postmortem draft
  3. generate_ticket(task="..."): create a follow-up engineering ticket for Jira, GitHub Issues, or Linear

Using the Tool Reference

Tip

Most agents, including Claude, Cursor, and Codex, will pick the right tools based on your prompt. You don't need to specify tool names or sequences. Describe what you need and the agent will handle the rest.

The tool reference below is for teams building custom agents or automations who want precise control over which tools are called and when. One thing worth knowing: most structured tools require an entity ID, so get_entities() is usually the right first call when working programmatically.

Tool Selection: Ask Causely vs Structured Tools

The MCP server exposes two interaction styles. Choose based on what your agent needs to do with the result.

| Use case | Recommended tool |
| --- | --- |
| Narrative health summary ("Is checkout healthy?") | get_service_summary |
| Historical questions ("What happened last night?") | ask_causely |
| Incident standup summary ("What happened to checkout yesterday?") | ask_causely |
| System-wide SLO overview ("Are any current SLOs at risk?") | get_slo |
| Programmatic root cause output ("What is the root cause of latency on payments?") | get_root_causes |
| Time-series metric data ("What is the p95 latency for the last hour on payments?") | get_metrics |
| Entity ID resolution ("Resolve the entity ID for the payments service") | get_entities |
| SLO burn rate and violation status ("Is the payments SLO burning?") | get_slo |
| Dependency graph ("What services depend on payments?") | get_topology |
| Post-deploy regression check ("Did the latest payments deploy introduce a regression?") | reliability_delta |

Ask Causely: natural language in. Best for open-ended exploration and synthesis.

Structured tools: explicit named inputs. Best when your agent needs to act on the result, apply logic, or chain calls.

What Agents Get vs Raw Telemetry

| Capability | Raw telemetry | Causely MCP |
| --- | --- | --- |
| Root cause identification | Correlation-based, requires analysis | Deterministic causal analysis |
| Dependency awareness | Manual mapping required | Live topology from observed traffic |
| Blast radius | Estimated | Computed from causal graph |
| Structured output | Custom parsing required | Typed tool responses |
| Time to insight | Minutes of analysis | Single tool call |

Authentication

The MCP server validates Frontegg-issued access tokens (JWT) in Authorization: Bearer, the same identity layer as the rest of the Causely API. How that token reaches your tool depends on the flow you use.

Browser-based OAuth (default)

Tools such as mcp-remote follow the MCP authorization model: they discover Causely’s OAuth protected-resource metadata, perform dynamic client registration (POST to the registration endpoint advertised in that metadata—for Causely SaaS this is under https://api.causely.app/mcp/oauth/register), open a browser login (Frontegg), then attach the resulting Bearer token to MCP requests. No manual client secret is required for this path.

Client ID and client secret (machine access)

For non-interactive MCP calls (automation, CI, or proxies that do not complete browser OAuth), supply credentials the MCP server can exchange for a Frontegg access token. The HTTP Basic username and password are always client_id and client_secret in that order.

Tenant API tokens (typical for your Causely tenant)
Generate OAuth client credentials for your tenant in the Frontegg account portal:

https://auth.causely.app/oauth/portal/api-tokens

Use the issued client ID as the Basic “username” and the client secret as the Basic “password” when building the payload below. Treat these secrets like any other API key (store them in a secret manager, rotate when needed).

MCP dynamic client registration (alternative)
MCP clients that complete dynamic client registration receive their own client_id and client_secret. Those values use the same encoding and headers as tenant API tokens. Registered MCP client secrets are time-limited; when they expire, register again or switch to tenant API tokens from the portal.

Encoding: concatenate client_id, a single colon (:), and client_secret, then Base64-encode that string (standard HTTP Basic user-info, same as Authorization: Basic elsewhere). Do not add a newline before encoding—the decoded value must be exactly client_id:secret.

Generating the Base64 string
In a shell, set CLIENT_ID and CLIENT_SECRET to your values (or substitute quoted literals), then run:

printf '%s:%s' "$CLIENT_ID" "$CLIENT_SECRET" | base64

Use the single line of output as the Base64 payload (no line breaks). For X-Causely-Client-Basic, you may send either that raw string or prefix it with Basic when building the header.
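For automation that builds the header in code rather than in a shell, the same encoding can be sketched in Python. This is an illustrative sketch using only the standard library; the function names `basic_payload` and `auth_headers` are hypothetical, and the header names are the two described in this section.

```python
import base64


def basic_payload(client_id: str, client_secret: str) -> str:
    """Base64-encode "client_id:client_secret" with no trailing newline."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    # b64encode never inserts line breaks, so the result is a single line.
    return base64.b64encode(raw).decode("ascii")


def auth_headers(client_id: str, client_secret: str, use_custom_header: bool = False) -> dict:
    """Return the header to attach to an MCP request.

    use_custom_header=True selects X-Causely-Client-Basic, for setups
    where a proxy reserves or rewrites Authorization.
    """
    name = "X-Causely-Client-Basic" if use_custom_header else "Authorization"
    return {name: f"Basic {basic_payload(client_id, client_secret)}"}


if __name__ == "__main__":
    # Placeholder credentials; load real values from a secret manager.
    print(auth_headers("id", "secret"))
```

In practice, read the client ID and secret from your secret manager or environment rather than from literals.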

How to send it:

| Method | When to use |
| --- | --- |
| Authorization: Basic <Base64(client_id:secret)> | Supported: the server reads the Authorization header when it uses the Basic scheme (RFC 7617; scheme is matched case-insensitively, with optional whitespace after Basic). Used when the request has no Authorization: Bearer token, which is typical for curl or custom HTTP clients. |
| X-Causely-Client-Basic | Same payload: either raw Base64 or Basic <Base64(client_id:secret)>. Use when you must not put those bytes on Authorization (for example an MCP proxy reserves or rewrites Authorization). If both this header and Authorization: Basic are set, the server prefers X-Causely-Client-Basic when resolving credentials for exchange. |

If the request already includes a non-empty Authorization: Bearer token, the server validates that JWT and does not run the Basic exchange; values on X-Causely-Client-Basic are ignored in that case.
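The precedence rules above can be modeled in a few lines, which may help when debugging which credential a proxy actually forwards. This is an illustrative model of the documented behavior, not the server's implementation; the function name `resolve_credentials` and its return shape are hypothetical, and matching Bearer case-insensitively is an assumption.

```python
import re


def resolve_credentials(headers):
    """Return ("bearer", token), ("basic", payload), or (None, None)."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    auth = h.get("authorization", "")

    # 1. A non-empty Bearer token wins; the Basic exchange is skipped.
    #    (Case-insensitive scheme matching for Bearer is an assumption.)
    bearer = re.match(r"bearer\s+(\S+)$", auth, re.IGNORECASE)
    if bearer:
        return ("bearer", bearer.group(1))

    # 2. X-Causely-Client-Basic is preferred over Authorization: Basic.
    #    It may carry raw Base64 or "Basic <Base64>".
    custom = h.get("x-causely-client-basic", "")
    if custom:
        prefixed = re.match(r"basic\s+(\S+)$", custom, re.IGNORECASE)
        return ("basic", prefixed.group(1) if prefixed else custom)

    # 3. Authorization: Basic, scheme matched case-insensitively,
    #    with optional whitespace after the scheme.
    basic = re.match(r"basic\s*(\S+)$", auth, re.IGNORECASE)
    if basic:
        return ("basic", basic.group(1))

    return (None, None)


if __name__ == "__main__":
    print(resolve_credentials({
        "Authorization": "Basic aaa",
        "X-Causely-Client-Basic": "Basic bbb",
    }))
```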

Examples

Direct HTTP (no Bearer on the request):

Authorization: Basic <Base64(client_id:secret)>

mcp-remote accepts repeated --header "Name: value" arguments and expands ${ENV_VAR} inside header values. Use that when your MCP stack should attach static credentials without hard‑coding secrets in config:

{
  "mcpServers": {
    "causely": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://api.causely.app/mcp/",
        "--header",
        "X-Causely-Client-Basic: Basic ${CAUSELY_MCP_CLIENT_BASIC}"
      ]
    }
  }
}

Set CAUSELY_MCP_CLIENT_BASIC to the Base64 string (not including the Basic prefix in the env value—the example adds the prefix in the header). This path only affects authentication when outbound MCP requests are sent without a Bearer token from the interactive OAuth flow (for example automation-oriented setups); the default browser OAuth configuration does not need it.

Setup

Step 1: Configure Your Development Tool

For most users, the hosted MCP URL plus mcp-remote is enough: OAuth and dynamic registration run automatically. Add this configuration to your MCP-compatible tool:

{
  "mcpServers": {
    "causely": {
      "command": "npx",
      "args": ["mcp-remote", "https://api.causely.app/mcp/"]
    }
  }
}

Prerequisites:

  • Active Causely account
  • An MCP-compatible tool (see Supported Tools below)
  • Node.js (typically pre-installed with most development environments)

Step 2: Verify Connection

Test with: "Ask Causely: What defects are currently active?"

If you don't receive a response, check the following:

  • For browser OAuth: confirm you are logged into Causely in the browser when the tool completes login (the access token comes from that OAuth flow, not from the portal session alone)
  • For client ID / secret: confirm the Base64 value encodes client_id:secret from https://auth.causely.app/oauth/portal/api-tokens, and that the header name and Basic prefix match the table in Authentication
  • Restart your development tool after configuration
  • Confirm Node.js is available in your PATH
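Two of the checks above can be scripted. The following is a minimal preflight sketch using only the Python standard library; the function names `node_on_path` and `payload_problem` are hypothetical, and it only inspects local state (it does not contact Causely or validate that the credentials are accepted).

```python
import base64
import shutil


def node_on_path() -> bool:
    """True if both node and npx resolve on PATH (mcp-remote runs via npx)."""
    return shutil.which("node") is not None and shutil.which("npx") is not None


def payload_problem(value: str):
    """Describe what is wrong with a Base64 credential payload, or return None."""
    if value.lower().startswith("basic "):
        return "value includes the Basic prefix; store only the Base64 string"
    if any(ch.isspace() for ch in value):
        return "value contains whitespace or line breaks"
    try:
        decoded = base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return "value is not valid Base64"
    if decoded.endswith("\n"):
        return "a trailing newline was encoded with the credentials"
    client_id, sep, secret = decoded.partition(":")
    if not (client_id and sep and secret):
        return "decoded value is not in client_id:secret form"
    return None


if __name__ == "__main__":
    import os

    print("node/npx on PATH:", node_on_path())
    print("payload problem:", payload_problem(os.environ.get("CAUSELY_MCP_CLIENT_BASIC", "")))
```

The trailing-newline check catches the common mistake of encoding with `echo` instead of `printf`.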

Supported Tools

IDEs and Editors: Cursor, Codex, Visual Studio Code (with MCP extension), JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm, GoLand, and others), Windsurf, Zed

Desktop Applications: Claude Desktop

CLIs: Claude Code, Kiro CLI, Amp, Atlassian Rovo Dev CLI, and other MCP-compatible CLI tools

Agent Frameworks: HolmesGPT

Full Tool Reference

24 tools across 5 categories. All tools are available to any MCP-compatible agent or assistant.

Entity Resolution

| Tool | When to use |
| --- | --- |
| get_entities | Start here. Resolve a service or database name to its ID; list all entities in a namespace; check current health status |
| get_label_values | Enumerate valid label values (team, product, cluster, namespace) before fanning out queries across environments |
| list_namespaces | Discover Kubernetes namespace names before resolving entities or scanning a namespace |
| list_clusters | Discover cluster names before scoping multi-cluster queries |

Data Retrieval

| Tool | When to use |
| --- | --- |
| get_metrics | Retrieve numeric metric data (p95 latency, error rate, CPU, memory, throughput). This is the only tool that returns time-series data |
| get_logs | Inspect live service logs, or retrieve evidence logs captured at root cause detection time |
| get_alerts | Start triage from an alert name (PagerDuty, Slack, Datadog); distinguish alerts mapped to causal analysis from noise |
| get_events | Correlate symptom onset with deployments, restarts, scaling events, or config changes |
| get_config | Investigate configuration drift; verify deployment manifest matches expectations |
| get_slow_queries | Identify database queries consuming the most execution time; follow up on database root causes |

Health & Diagnosis

| Tool | When to use |
| --- | --- |
| get_symptoms | Call with no filters to see all active symptoms across the entire environment, or filter by entity, namespace, or cluster |
| get_root_causes | Identify all active root causes; filter by impacted service, symptom, or root cause ID |
| get_entity_health | Structured health summary for non-Service entities (databases, pods, queues, topics, tables) |
| get_slo | Check SLO state, error budget remaining, and burn rate |
| get_topology | Find upstream blast radius (dependents), downstream dependencies, or full data-flow graph |
| get_integration_status | Verify monitoring coverage; check scraper health by cluster |
| triage | Focused health summary by entity name or root cause ID; no entity ID pre-resolution needed |
| team_health | Health summary for all services owned by a team; degraded and critical services listed first |
| get_service_summary | Comprehensive health snapshot for a single service: status, symptoms, root causes, SLOs, metrics, events, logs |

Reporting & Postmortems

| Tool | When to use |
| --- | --- |
| postmortem | Generate a deterministic postmortem draft for a resolved incident from Causely data |
| generate_ticket | Create a structured engineering ticket suitable for Jira, GitHub Issues, or Linear |

Reliability & Deployment

| Tool | When to use |
| --- | --- |
| reliability_delta | Post-deploy regression check for a single service: compare resource consumption before/after most recent deployment |
| fleet_reliability_delta | Batch regression check across a team, namespace, or explicit service list (up to 20 services per call) |
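Because fleet_reliability_delta accepts at most 20 services per call, an automation that checks a longer explicit list has to split it first. The sketch below shows one way to do that; the helper names and the `services` argument name are assumptions for illustration, since this page does not give the tool's parameter name for an explicit list.

```python
MAX_SERVICES_PER_CALL = 20  # limit stated in the tool reference above


def batches(services, size=MAX_SERVICES_PER_CALL):
    """Yield de-duplicated service names in order, at most `size` per batch."""
    unique = list(dict.fromkeys(services))  # drops repeats, keeps first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]


def fleet_delta_arguments(services):
    """Build one argument dict per fleet_reliability_delta call.

    "services" is a hypothetical argument name; check the tool's
    input schema for the real one before using this.
    """
    return [{"services": batch} for batch in batches(services)]


if __name__ == "__main__":
    names = [f"svc-{i}" for i in range(45)]
    for args in fleet_delta_arguments(names):
        print(len(args["services"]), "services in this call")
```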

Feature Demos

Solving Slow Database Queries

Helm Chart Example