ClickHouse
- Infrastructure Entities - Complete infrastructure topology including compute, storage, and networking resources
- Metrics - Performance metrics from applications and infrastructure
- Symptoms - Automatic symptom detection from metrics, traces, and external monitoring systems
Overview
Causely provides native integration with ClickHouse to help you identify and resolve database issues before they impact your users.
Instead of just monitoring symptoms, Causely analyzes real-time signals to surface the underlying causal factors driving database issues.
By setting up the ClickHouse integration, you will be able to do the following:
- Identify causes for reliability issues originating from your ClickHouse database.
- Observe the database as an entity in the Topology Graph, including its relationships to other entities on the service map, infrastructure stack, and dataflow map.
- Get insights into the slowest queries over a rolling 12-hour window, and troubleshoot them with Ask Causely directly from the UI.
The integration supports both self-hosted ClickHouse instances and cloud-managed deployments.
Setup Guide
Step 1: Create a user
Create a dedicated user in your ClickHouse instance and grant it access to the system tables that Causely requires:
CREATE USER causely_user IDENTIFIED BY 'your-password';
GRANT SELECT ON system.tables TO causely_user;
GRANT SELECT ON system.columns TO causely_user;
GRANT SELECT ON system.databases TO causely_user;
GRANT SELECT ON system.query_log TO causely_user;
GRANT SELECT ON system.mutations TO causely_user;
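To confirm the grants took effect, you can list them for the new user:

```sql
SHOW GRANTS FOR causely_user;
```

The output should include a SELECT grant for each of the five system tables below.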
The integration reads from the following system tables:
| Table | Purpose |
|---|---|
| system.tables | Table names, row counts, and sizes |
| system.columns | Column definitions and schema information |
| system.databases | Database discovery |
| system.query_log | Slow query analysis (top 10 by total execution time, rolling 12-hour window) |
| system.mutations | Active mutation detection for lock monitoring |
Step 2: Create a Kubernetes secret for the user
Create a Kubernetes secret with the ClickHouse connection details. The secret supports two protocols:
- Native (default): binary protocol on port 9000
- HTTP: HTTP/HTTPS protocol on port 8123 (HTTP) or 8443 (HTTPS)
Option 1: Single Database Configuration
kubectl create secret generic \
--namespace causely clickhouse-credentials \
--from-literal=username="causely_user" \
--from-literal=password='...' \
--from-literal=host="..." \
--from-literal=port="9000" \
--from-literal=database="..." \
--from-literal=protocol="native" \
--from-literal=secure="false"
To connect over HTTP instead of the native protocol:
kubectl create secret generic \
--namespace causely clickhouse-credentials \
--from-literal=username="causely_user" \
--from-literal=password='...' \
--from-literal=host="..." \
--from-literal=port="8123" \
--from-literal=database="..." \
--from-literal=protocol="http" \
--from-literal=secure="false"
Option 2: Multiple Databases Configuration
To monitor multiple databases within the same ClickHouse instance, specify them as a comma-separated list using the databases field:
kubectl create secret generic \
--namespace causely clickhouse-credentials-multidb \
--from-literal=username="causely_user" \
--from-literal=password='...' \
--from-literal=host="..." \
--from-literal=port="9000" \
--from-literal=databases="database1,database2,database3" \
--from-literal=protocol="native" \
--from-literal=secure="false"
Alternatively, use a YAML manifest:
apiVersion: v1
kind: Secret
metadata:
name: clickhouse-credentials-multidb
namespace: causely
type: Opaque
stringData:
username: 'causely_user'
password: '...'
host: '...'
port: '9000'
databases: 'database1,database2,database3'
protocol: 'native'
secure: 'false'
Note: Use either the database field for a single database or the databases field for multiple databases. Do not use both in the same secret.
Option 3: Database Auto-Discovery
Causely can automatically discover all databases on a ClickHouse server. Add auto_discovery: "true" to the secret:
kubectl create secret generic \
--namespace causely clickhouse-credentials \
--from-literal=username="causely_user" \
--from-literal=password='...' \
--from-literal=host="..." \
--from-literal=port="9000" \
--from-literal=protocol="native" \
--from-literal=secure="false" \
--from-literal=auto_discovery="true"
When auto-discovery is enabled, Causely queries system.databases (excluding system, INFORMATION_SCHEMA, and information_schema) and starts a scraper for each discovered database. Discovery runs periodically to pick up newly created databases.
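The discovery step is equivalent to running a query like the following (a sketch based on the exclusions listed above; the exact query Causely issues may differ):

```sql
SELECT name
FROM system.databases
WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');
```

Each name this returns gets its own scraper.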
The host must be the FQDN of your ClickHouse instance, or an IP address if no DNS entry is set up. It must match the FQDN/IP Causely would discover either from the Kubernetes API or your cloud provider's API.
If you are connecting through a proxy, set host to the proxy address and host_overwrite to the actual ClickHouse instance address:
--from-literal=host="my-proxy.example.com"
--from-literal=host_overwrite="my-clickhouse.example.com"
Secret field reference
| Field | Required | Default | Description |
|---|---|---|---|
| host | Yes | | Hostname or IP of the ClickHouse server |
| username | Yes | | ClickHouse user name |
| password | Yes | | ClickHouse user password |
| database | Yes* | | Single database to monitor |
| databases | Yes* | | Comma-separated list of databases to monitor |
| port | No | 9000 (native) / 8123 (HTTP) | ClickHouse port |
| protocol | No | native | Connection protocol: native or http |
| secure | No | false | Enable TLS (true or false) |
| host_overwrite | No | | Override the host used for entity resolution |
| port_overwrite | No | | Override the port used for entity resolution |
| auto_discovery | No | false | Automatically discover all databases |

*Either database or databases must be set, unless auto_discovery is enabled.
Step 3: Update Causely Configuration
Once the secret is created, update the Causely configuration to enable scraping for the new instance:
scrapers:
clickhouse:
enabled: true
instances:
- secretName: clickhouse-credentials
namespace: causely
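If you monitor several ClickHouse servers, the instances list accepts one entry per secret. For example, reusing the secret names from the setup options above:

```yaml
scrapers:
  clickhouse:
    enabled: true
    instances:
      - secretName: clickhouse-credentials
        namespace: causely
      - secretName: clickhouse-credentials-multidb
        namespace: causely
```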
Alternative: Enable Credentials Autodiscovery
Causely also supports credentials autodiscovery, which lets you add new scraping targets without modifying the Causely configuration. Label the Kubernetes secret to enable autodiscovery:
kubectl --namespace causely label secret clickhouse-credentials "causely.ai/scraper=ClickHouse"
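The same label can also be set declaratively, so autodiscovery applies as soon as the secret is created. A sketch combining the manifest from Option 2 with the label above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: clickhouse-credentials
  namespace: causely
  labels:
    causely.ai/scraper: ClickHouse
type: Opaque
stringData:
  username: 'causely_user'
  password: '...'
  host: '...'
  port: '9000'
  database: '...'
  protocol: 'native'
  secure: 'false'
```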
Verify Your Configuration
After completing the setup, run these queries against your ClickHouse instance to verify that the Causely user has the required access.
Quick Access Check
SELECT
(SELECT count() FROM system.tables LIMIT 1) > 0 AS tables_ok,
(SELECT count() FROM system.columns LIMIT 1) > 0 AS columns_ok,
(SELECT count() FROM system.databases LIMIT 1) > 0 AS databases_ok,
(SELECT count() FROM system.query_log LIMIT 1) >= 0 AS query_log_ok,
(SELECT count() FROM system.mutations LIMIT 1) >= 0 AS mutations_ok;
All columns should return 1 (true).
Detailed Checks
1. System tables access
-- Each of these should return a result without error
SELECT 1 FROM system.tables LIMIT 1;
SELECT 1 FROM system.columns LIMIT 1;
SELECT 1 FROM system.databases LIMIT 1;
SELECT 1 FROM system.query_log LIMIT 1;
SELECT 1 FROM system.mutations LIMIT 1;
If any query fails with an access denied error, grant the missing privilege to your Causely user:
-- Run as admin
GRANT SELECT ON system.<table_name> TO causely_user;
2. Test slow query collection
Run this query to confirm Causely can collect slow query data:
SELECT
normalized_query_hash,
count() AS calls,
sum(query_duration_ms) AS total_exec_time_ms
FROM system.query_log
WHERE type = 'QueryFinish'
AND event_time >= now() - toIntervalHour(12)
GROUP BY normalized_query_hash
ORDER BY total_exec_time_ms DESC
LIMIT 5;
This should return results without error. An empty result set is normal on a freshly configured instance—entries will appear as queries run.
If all checks pass, your ClickHouse instance is correctly configured for Causely monitoring.
Setup Checklist
- causely_user created in ClickHouse
- SELECT granted on system.tables, system.columns, system.databases, system.query_log, system.mutations
- Kubernetes secret created with correct host, username, password, database/databases, protocol, and port
- Causely configuration updated (or secret labeled for autodiscovery)
- Verification queries succeed without access errors
What Data is Collected
The ClickHouse scraper collects comprehensive metadata and performance information from your ClickHouse databases, including:
- Database entities with names and relationships to hosting services
- Service-to-database mappings (which service provides which database)
- Connection details including host, port, and protocol configuration
- Table information including names, row counts, and sizes (from system.tables)
- Complete table schemas with column definitions, data types, default expressions, and comments (from system.columns)
- Slow query analysis using system.query_log: top 10 queries by total execution time over a rolling 12-hour window, including call counts, total and average execution time, and rows read
- Mutation lock monitoring using system.mutations: active mutations are tracked as exclusive locks to detect contention on tables
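To see what the mutation lock monitoring is based on, you can list currently running mutations yourself using standard system.mutations columns:

```sql
SELECT database, table, mutation_id, command, create_time
FROM system.mutations
WHERE is_done = 0;
```

Rows returned here correspond to the mutations Causely tracks as exclusive locks.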