Datadog
- Metrics - Performance metrics from applications and infrastructure
- Symptoms - Automatic symptom detection from metrics, traces, and external monitoring systems
- Traces - Distributed traces for service dependency discovery and communication monitoring
Enable Dual Shipping for Datadog APM
Causely can leverage Datadog APM instrumentation with dual shipping to discover and monitor service dependencies.
Option 1: Enable via Helm Deployment
To enable dual shipping when deploying the Datadog Agent using Helm:
- Add the following configuration to your values.yaml:
agents:
  useConfigMap: true
  customAgentConfig:
    apm_config:
      additional_endpoints:
        'http://mediator.causely:8126':
          - 'datadog-receiver'
- Upgrade or install the Datadog Helm chart:
helm upgrade --install datadog datadog/datadog -f ./values.yaml
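To confirm the setting took effect, you can inspect the runtime configuration of a node agent pod. This is a sketch: the datadog namespace, the app=datadog label selector, and the agent container name assume a default Helm install and may differ in your cluster.
# Find a node agent pod (label selector and namespace may differ in your install)
kubectl -n datadog get pods -l app=datadog
# Print the runtime configuration and look for the Causely mediator endpoint
kubectl -n datadog exec <datadog-agent-pod> -c agent -- agent config | grep -A 2 additional_endpoints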
Option 2: Enable via Datadog Operator
If you're managing the Datadog Agent using the Datadog Operator, modify the DatadogAgent custom resource as follows:
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    nodeAgent:
      env:
        - name: DD_APM_ADDITIONAL_ENDPOINTS
          value: '{"http://mediator.causely:8126": ["datadog-receiver"]}'
Adding Host Identity Tags for EC2-Based Datadog APM Traces
When applications run outside Kubernetes (such as on EC2 instances), Datadog APM traces may not include hostname or other identifying metadata by default. Without this information, Causely cannot associate incoming traces with the EC2 instances discovered from AWS.
To ensure Causely can correctly match traces to EC2 hosts, add a unique identifying tag to the application's Datadog configuration:
DD_TAGS="pm-name:<unique-host-identifier>"
Set this environment variable in the same location where Datadog APM configuration is applied, such as:
- ECS task definitions
- Docker environment variables
- systemd unit files
- CI/CD deployment configurations
Example (Docker):
docker run \
  -e DD_AGENT_HOST="<agent-host>" \
  -e DD_TAGS="pm-name:my-ec2-hostname" \
  my-app:latest
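Example (systemd drop-in, a sketch; my-app.service is a hypothetical unit name):
sudo mkdir -p /etc/systemd/system/my-app.service.d
sudo tee /etc/systemd/system/my-app.service.d/datadog.conf <<'EOF'
[Service]
Environment="DD_AGENT_HOST=<agent-host>"
Environment="DD_TAGS=pm-name:my-ec2-hostname"
EOF
sudo systemctl daemon-reload
sudo systemctl restart my-app.service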
Causely uses this tag to stitch Datadog APM traces to the corresponding EC2 instance in your environment model.
For more information on Datadog tagging, refer to Datadog's documentation.
Enabling Datadog Watchdog Monitors as a Data Source
Causely can also leverage Datadog monitors for Postgres, Redis, and other integrations as input to its Causal Reasoning.
To enable Datadog as a data source in Causely, add the following configuration to your Causely values.yaml:
scrapers:
  datadog:
    enabled: true
    instances:
      - secretName: datadog-credentials
        # namespace: your-namespace # optional; defaults to the pod namespace
        # event_tag_filters: [] # optional
        # monitor_mapping: [] # optional
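Then roll out the change to your Causely installation. The release name and chart reference below are placeholders; substitute the values from your own install (the causely namespace matches the label example further down):
helm upgrade --install causely <causely-chart> --namespace causely -f ./values.yaml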
Alternative: Enable Credentials Autodiscovery
Causely supports credentials autodiscovery, so you can add new accounts without editing the configuration. Label the Kubernetes Secret to enable autodiscovery for the Datadog scraper:
kubectl --namespace causely label secret datadog-credentials "causely.ai/scraper=Datadog"
For the Secret itself, you can use the following example manifest:
apiVersion: v1
kind: Secret
metadata:
  name: datadog-credentials
type: Opaque
stringData:
  org: 'YourOrg'
  apiKey: '<YOUR_DATADOG_API_KEY>'
  appKey: '<YOUR_DATADOG_APP_KEY>'
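If you prefer to create the secret imperatively instead of applying a manifest, an equivalent command (using the causely namespace from the label example above) is:
kubectl --namespace causely create secret generic datadog-credentials \
  --from-literal=org='YourOrg' \
  --from-literal=apiKey='<YOUR_DATADOG_API_KEY>' \
  --from-literal=appKey='<YOUR_DATADOG_APP_KEY>'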
The optional event_tag_filters setting filters triggered Datadog events by tag, for example:
event_tag_filters:
  - 'pagerduty'
  - 'integration:postgres'
  - 'integration:redisdb'
The optional monitor_mapping setting maps Datadog monitors to Causely entities, for example:
monitor_mapping:
  - entityType: ApplicationLoadBalancer
    createIfMissing: true
    attributes:
      service_name:
        label: ['hostname']
        isIdentifier: true
    resources:
      - monitorId: 123267804
        attribute: RequestsTotal
Without a monitor_mapping, Causely will try to automatically map latency and error-rate monitors to the corresponding symptoms of the Causely Service.
For example, the following event will be mapped automatically to the high error rate symptom of the causely-analysis service:
{
  "id": "AwAAAZiZU7KAMHtQKwAAABhBWmlaVlJXNUFBRFhWU3hKanRwb1ZyYl8AAAAkMTE5ODk5ZGMtMjgyYS00YWJmLWFjYzUtNmQ4NjkyZjBlM2FlAAEjVg",
  "type": "event",
  "attributes": {
    "attributes": {
      "aggregation_key": "2b57d1af567993ace06eebaf9dc0e669",
      "evt": {
        "uid": "AZiZVRiqAABt7uudQBprAQAA",
        "name": "Errors are high",
        "id": "8231846403567092370",
        "source_id": 36,
        "type": "log_alert"
      },
      "monitor_id": 150830126,
      "monitor_notifications": [
        "causely-alerts"
      ],
      "monitor": {
        "group_status": 5,
        "alert_cycle_key_txt": "8231846399607848082",
        "query": "logs(\"service:(causely-analysis*production) status:error -service:* -service:*compliancev2* -service:*ocr* -\\\"Request failed with status code 404\\\" -\\\"#8b6d6338\\\"\").index(\"*\").rollup(\"count\").by(\"service\").last(\"10m\") > 2000",
        "groups": [
          "service:causely-analysis"
        ],
        "created_at": 1723077612000,
        "priority": 1,
        "type": "log alert",
        "transition": {
          "destination_state": "Warn",
          "transition_type": "warn",
          "source_state": "OK"
        },
        "tags": [
          "team:analysis-team"
        ],
        "result": {
          "result_id_txt": "8231846397529909389",
          "result_id": 8231846397529909389,
          "result_ts": 1754919056,
          "group_key": "service"
        },
        "name": "Errors are high",
        "options": {
          "on_missing_data": "default",
          "thresholds": {
            "critical": 2000,
            "warning": 1000
          },
          "new_group_delay": 0,
          "enable_logs_sample": true,
          "include_tags": false,
          "groupby_simple_monitor": false,
          "notify_audit": false
        },
        "modified": 1754582393000,
        "id": 150830516,
        "templated_name": "Errors are high in production"
      },
      "priority": "normal",
      "title": "[P1] [Warn] Errors are high",
      "service": "causely-analysis",
      "sourcecategory": "monitor_alert",
      "event_object": "047235cd648d8d01abf7bc03c7c11bc8",
      "_dd": {
        "has_notification": false,
        "internal": "1",
        "version": "1"
      },
      "timestamp": 1754919056000,
      "status": "warning"
    },
    "timestamp": "2025-08-11T13:30:56Z",
    "tags": [
      "monitor",
      "priority:p1",
      "service:causely-analysis",
      "source:alert",
      "team:analysis-team"
    ]
  }
}