MCP Server Integration
The Causely MCP server gives agents and AI assistants direct access to Causely's causal reasoning engine. Its 24 tools across 5 categories let your agent move from raw alerts to structured root cause analysis, dependency maps, and reliability reports, without writing custom integrations.
Key Workflows
These are the four workflows agents use most often. Each maps to a specific sequence of MCP tool calls.
Incident Triage
Identify what's broken and how far it has spread.
- get_symptoms(): see all active symptoms across the entire environment (no filters needed)
- get_root_causes(): identify all active root causes and impacted services
- get_root_causes(symptom_ids=[...]): drill into a specific symptom's causes
- get_topology(entity_id=..., mode="dependents"): map which upstream services are affected
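Driven programmatically, these calls chain naturally: symptom IDs feed the root-cause lookup, and the affected entity feeds the topology query. A minimal Python sketch, where call_tool is a hypothetical stand-in for your MCP client's tool invocation and the response fields are illustrative, not the server's actual schema:

```python
# Hypothetical shim: replace call_tool with your MCP framework's
# tool-invocation method. Stubbed responses stand in for real data.
def call_tool(name, **params):
    stub = {
        "get_symptoms": {"symptoms": [{"id": "sym-1", "entity_id": "svc-42"}]},
        "get_root_causes": {"root_causes": [{"id": "rc-9", "entity_id": "svc-42"}]},
        "get_topology": {"dependents": ["svc-7", "svc-13"]},
    }
    return stub[name]

# 1. All active symptoms, environment-wide (no filters).
symptoms = call_tool("get_symptoms")["symptoms"]

# 2. Drill into the causes behind those specific symptoms.
causes = call_tool("get_root_causes",
                   symptom_ids=[s["id"] for s in symptoms])["root_causes"]

# 3. Map the upstream blast radius of the affected entity.
blast = call_tool("get_topology",
                  entity_id=causes[0]["entity_id"],
                  mode="dependents")["dependents"]
```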
Quick Service Health
Get a complete health picture for a specific service in two calls.
- get_entities(query="service-name", entity_types=["Service"]): resolve the service name to its entity ID
- get_service_summary(service="service-name"): full snapshot: status, active symptoms, root causes, SLOs, metrics, recent events, error logs
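The resolve-then-summarize pattern can be sketched in Python. Here call_tool is a hypothetical shim for your MCP client, and the response shapes are illustrative only:

```python
# Hypothetical client shim; stubbed payloads stand in for real responses.
def call_tool(name, **params):
    stub = {
        "get_entities": {"entities": [{"id": "svc-42", "name": "checkout"}]},
        "get_service_summary": {"status": "degraded", "active_symptoms": 2},
    }
    return stub[name]

# Resolve the name first; most structured tools want an entity ID.
entity = call_tool("get_entities", query="checkout",
                   entity_types=["Service"])["entities"][0]

# One call then returns the full health snapshot for that service.
summary = call_tool("get_service_summary", service=entity["name"])
```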
Post-Deploy Validation
Check whether a deployment introduced regressions.
- reliability_delta(service="service-name"): compare CPU, memory, latency, and error rate before vs after the most recent deployment
- fleet_reliability_delta(team="team-name"): batch check across all services for a team, namespace, or explicit list
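An agent might gate a deploy pipeline on the delta like this. The call_tool shim and the delta field names are assumptions for illustration, not the documented response schema:

```python
# Hypothetical client shim returning illustrative before/after deltas.
def call_tool(name, **params):
    return {"deltas": {"p95_latency_ms": 40, "error_rate": 0.0,
                       "cpu": -0.05, "memory": 0.01}}

delta = call_tool("reliability_delta", service="payments")["deltas"]

# Flag the deploy if latency or errors regressed past a chosen threshold.
regressed = delta["p95_latency_ms"] > 25 or delta["error_rate"] > 0
verdict = "regression" if regressed else "clean"
```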
Post-Incident Reporting
Generate postmortem documentation and action items from a resolved incident.
- get_root_causes(root_cause_id=...): retrieve full root cause details, timeline, and blast radius
- postmortem(root_cause_id=...): generate a structured postmortem draft
- generate_ticket(task="..."): create a follow-up engineering ticket for Jira, GitHub Issues, or Linear
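Chained together, the reporting flow looks like the sketch below. As in the other sketches, call_tool is a hypothetical stand-in for your MCP client and the payload fields are illustrative:

```python
# Hypothetical client shim; stubbed payloads, not the real schema.
def call_tool(name, **params):
    stub = {
        "get_root_causes": {"root_causes": [
            {"id": "rc-9", "title": "connection pool exhaustion"}]},
        "postmortem": {"draft": "## Postmortem: connection pool exhaustion"},
        "generate_ticket": {"ticket": {"title": "Raise pool limits"}},
    }
    return stub[name]

# Pull details for the resolved incident, then generate artifacts from it.
rc = call_tool("get_root_causes", root_cause_id="rc-9")["root_causes"][0]
draft = call_tool("postmortem", root_cause_id=rc["id"])["draft"]
ticket = call_tool("generate_ticket",
                   task=f"Follow up on {rc['title']}")["ticket"]
```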
Using the Tool Reference
Most agents (Claude, Cursor, Codex) will intuitively pick the right tools based on your prompt. You don't need to specify tool names or sequences; just describe what you need and the agent will handle the rest.
The tool reference below is for teams building custom agents or automations who want precise control over which tools are called and when. One thing worth knowing: most structured tools require an entity ID, so get_entities() is usually the right first call when working programmatically.
Tool Selection: Ask Causely vs Structured Tools
The MCP server exposes two interaction styles. Choose based on what your agent needs to do with the result.
| Use case | Recommended tool |
|---|---|
| Narrative health summary ("Is checkout healthy?") | get_service_summary |
| Historical questions ("What happened last night?") | ask_causely |
| Incident standup summary ("What happened to checkout yesterday?") | ask_causely |
| System-wide SLO overview ("Are any current SLOs at risk?") | get_slo |
| Programmatic root cause output ("What is the root cause of latency on payments?") | get_root_causes |
| Time-series metric data ("What is the p95 latency for the last hour on payments?") | get_metrics |
| Entity ID resolution ("Resolve the entity ID for the payments service") | get_entities |
| SLO burn rate and violation status ("Is the payments SLO burning?") | get_slo |
| Dependency graph ("What services depend on payments?") | get_topology |
| Post-deploy regression check ("Did the latest payments deploy introduce a regression?") | reliability_delta |
Ask Causely: natural language in. Best for open-ended exploration and synthesis.
Structured tools: explicit named inputs. Best when your agent needs to act on the result, apply logic, or chain calls.
What Agents Get vs Raw Telemetry
| Capability | Raw telemetry | Causely MCP |
|---|---|---|
| Root cause identification | Correlation-based, requires analysis | Deterministic causal analysis |
| Dependency awareness | Manual mapping required | Live topology from observed traffic |
| Blast radius | Estimated | Computed from causal graph |
| Structured output | Custom parsing required | Typed tool responses |
| Time to insight | Minutes of analysis | Single tool call |
Authentication
The MCP server validates Frontegg-issued access tokens (JWT) sent in the Authorization: Bearer header, using the same identity layer as the rest of the Causely API. How that token reaches your tool depends on the flow you use.
Browser-based OAuth (default)
Tools such as mcp-remote follow the MCP authorization model: they discover Causely’s OAuth protected-resource metadata, perform dynamic client registration (POST to the registration endpoint advertised in that metadata—for Causely SaaS this is under https://api.causely.app/mcp/oauth/register), open a browser login (Frontegg), then attach the resulting Bearer token to MCP requests. No manual client secret is required for this path.
Client ID and client secret (machine access)
For non-interactive MCP calls (automation, CI, or proxies that do not complete browser OAuth), supply credentials the MCP server can exchange for a Frontegg access token. The HTTP Basic username and password are always client_id and client_secret in that order.
Tenant API tokens (typical for your Causely tenant)
Generate OAuth client credentials for your tenant in the Frontegg account portal:
https://auth.causely.app/oauth/portal/api-tokens
Use the issued client ID as the Basic “username” and the client secret as the Basic “password” when building the payload below. Treat these secrets like any other API key (store them in a secret manager, rotate when needed).
MCP dynamic client registration (alternative)
MCP clients that complete dynamic client registration receive their own client_id and client_secret. Those values use the same encoding and headers as tenant API tokens. Registered MCP client secrets are time-limited; when they expire, register again or switch to tenant API tokens from the portal.
Encoding: concatenate client_id, a single colon (:), and client_secret, then Base64-encode that string (standard HTTP Basic user-info, same as Authorization: Basic elsewhere). Do not add a newline before encoding—the decoded value must be exactly client_id:secret.
Generating the Base64 string
In a shell, set CLIENT_ID and CLIENT_SECRET to your values (or substitute quoted literals), then run:
printf '%s:%s' "$CLIENT_ID" "$CLIENT_SECRET" | base64
Use the single line of output as the Base64 payload (no line breaks). For X-Causely-Client-Basic, you may send either that raw string or prefix it with Basic when building the header.
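For automation written in Python rather than shell, the same encoding needs only the standard library. The credential values here are placeholders:

```python
import base64

client_id = "my-client-id"    # placeholder: from the api-tokens portal
client_secret = "my-secret"   # placeholder

# Standard HTTP Basic user-info: client_id, a single colon, client_secret,
# then Base64. No trailing newline is added before encoding.
payload = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()

# Send payload raw on X-Causely-Client-Basic, or with the scheme prefix:
auth_header = f"Basic {payload}"
```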
How to send it:
| Method | When to use |
|---|---|
Authorization: Basic <Base64(client_id:secret)> | Supported: the server reads the Authorization header when it uses the Basic scheme (RFC 7617; scheme is matched case-insensitively, with optional whitespace after Basic). Used when the request has no Authorization: Bearer token—typical for curl or custom HTTP clients. |
X-Causely-Client-Basic | Same payload: either raw Base64 or Basic <Base64(client_id:secret)>. Use when you must not put those bytes on Authorization (for example an MCP proxy reserves or rewrites Authorization). If both this header and Authorization: Basic are set, the server prefers X-Causely-Client-Basic when resolving credentials for exchange. |
If the request already includes a non-empty Authorization: Bearer token, the server validates that JWT and does not run the Basic exchange; values on X-Causely-Client-Basic are ignored in that case.
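The precedence just described can be summarized as a small resolver. This mirrors the documented behavior for illustration; it is not the server's actual implementation:

```python
def resolve_credentials(headers):
    """Decide how a request authenticates, per the documented precedence."""
    auth = headers.get("Authorization", "")
    scheme, _, value = auth.partition(" ")
    # 1. A non-empty Bearer token wins; Basic credentials are ignored.
    if scheme.lower() == "bearer" and value.strip():
        return ("jwt", value.strip())
    # 2. X-Causely-Client-Basic is preferred over Authorization: Basic.
    xbasic = headers.get("X-Causely-Client-Basic", "").strip()
    if xbasic:
        return ("basic", xbasic.removeprefix("Basic ").strip())
    # 3. Fall back to Authorization: Basic (scheme case-insensitive, RFC 7617).
    if scheme.lower() == "basic" and value.strip():
        return ("basic", value.strip())
    return (None, None)
```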
Examples
Direct HTTP (no Bearer on the request):
Authorization: Basic <Base64(client_id:secret)>
mcp-remote accepts repeated --header "Name: value" arguments and expands ${ENV_VAR} inside header values. Use that when your MCP stack should attach static credentials without hard-coding secrets in config:
{
"mcpServers": {
"causely": {
"command": "npx",
"args": [
"mcp-remote",
"https://api.causely.app/mcp/",
"--header",
"X-Causely-Client-Basic: Basic ${CAUSELY_MCP_CLIENT_BASIC}"
]
}
}
}
Set CAUSELY_MCP_CLIENT_BASIC to the Base64 string (not including the Basic prefix in the env value—the example adds the prefix in the header). This path only affects authentication when outbound MCP requests are sent without a Bearer token from the interactive OAuth flow (for example automation-oriented setups); the default browser OAuth configuration does not need it.
Setup
Step 1: Configure Your Development Tool
For most users, the hosted MCP URL plus mcp-remote is enough: OAuth and dynamic registration run automatically. Add this configuration to your MCP-compatible tool:
{
"mcpServers": {
"causely": {
"command": "npx",
"args": ["mcp-remote", "https://api.causely.app/mcp/"]
}
}
}
Prerequisites:
- Active Causely account
- An MCP-compatible tool (see Supported Tools below)
- Node.js (typically pre-installed with most development environments)
Step 2: Verify Connection
Test with: "Ask Causely: What defects are currently active?"
If you don't receive a response:
- For browser OAuth: confirm you're logged into Causely in the browser when the tool completes login (the access token comes from that OAuth flow, not from the portal session alone)
- For client ID / secret: confirm the Base64 value encodes client_id:secret from https://auth.causely.app/oauth/portal/api-tokens, and that the header name and Basic prefix match the table in Authentication
- Restart your development tool after configuration
- Confirm Node.js is available in your PATH
Supported Tools
IDEs and Editors: Cursor, Codex, Visual Studio Code (with MCP extension), JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm, GoLand, and others), Windsurf, Zed
Desktop Applications: Claude Desktop
CLIs: Claude Code, Kiro CLI, Amp, Atlassian Rovo DEV CLI, and other MCP-compatible CLI tools
Agent Frameworks: HolmesGPT
Full Tool Reference
24 tools across 5 categories. All tools are available to any MCP-compatible agent or assistant.
Entity Resolution
| Tool | When to use |
|---|---|
get_entities | Start here. Resolve a service or database name to its ID; list all entities in a namespace; check current health status |
get_label_values | Enumerate valid label values (team, product, cluster, namespace) before fanning out queries across environments |
list_namespaces | Discover Kubernetes namespace names before resolving entities or scanning a namespace |
list_clusters | Discover cluster names before scoping multi-cluster queries |
Data Retrieval
| Tool | When to use |
|---|---|
get_metrics | Retrieve numeric metric data (p95 latency, error rate, CPU, memory, throughput): the only tool that returns time-series |
get_logs | Inspect live service logs, or retrieve evidence logs captured at root cause detection time |
get_alerts | Start triage from an alert name (PagerDuty, Slack, Datadog); distinguish alerts mapped to causal analysis from noise |
get_events | Correlate symptom onset with deployments, restarts, scaling events, or config changes |
get_config | Investigate configuration drift; verify deployment manifest matches expectations |
get_slow_queries | Identify database queries consuming the most execution time; follow up on database root causes |
Health & Diagnosis
| Tool | When to use |
|---|---|
get_symptoms | Call with no filters to see all active symptoms across the entire environment, or filter by a specific entity, namespace, or cluster |
get_root_causes | Identify all active root causes; filter by impacted service, symptom, or root cause ID |
get_entity_health | Structured health summary for non-Service entities (databases, pods, queues, topics, tables) |
get_slo | Check SLO state, error budget remaining, and burn rate |
get_topology | Find upstream blast radius (dependents), downstream dependencies, or full data-flow graph |
get_integration_status | Verify monitoring coverage; check scraper health by cluster |
triage | Focused health summary by entity name or root cause ID: no entity ID pre-resolution needed |
team_health | Health summary for all services owned by a team; degraded and critical services listed first |
get_service_summary | Comprehensive health snapshot for a single service: status, symptoms, root causes, SLOs, metrics, events, logs |
Reporting & Postmortems
| Tool | When to use |
|---|---|
postmortem | Generate a deterministic postmortem draft for a resolved incident from Causely data |
generate_ticket | Create a structured engineering ticket suitable for Jira, GitHub Issues, or Linear |
Reliability & Deployment
| Tool | When to use |
|---|---|
reliability_delta | Post-deploy regression check for a single service: compare resource consumption before/after most recent deployment |
fleet_reliability_delta | Batch regression check across a team, namespace, or explicit service list (up to 20 services per call) |