
ClickHouse

Signals Provided
  • Infrastructure Entities - Complete infrastructure topology including compute, storage, and networking resources
  • Metrics - Performance metrics from applications and infrastructure
  • Symptoms - Automatic symptom detection from metrics, traces, and external monitoring systems

Overview

Causely provides native integration with ClickHouse to help you identify and resolve database issues before they impact your users.

Instead of just monitoring symptoms, Causely analyzes real-time signals to surface the underlying causal factors driving database issues.

By setting up the ClickHouse integration, you will be able to do the following:

  • Identify causes for reliability issues originating from your ClickHouse database

  • Observe the database as an entity in the Topology Graph, including its relationships to other entities on the service map, infrastructure stack, and dataflow map.

  • Get insights into the slowest queries over a rolling 12-hour window, and troubleshoot them with Ask Causely directly from the UI.

The integration supports both self-hosted ClickHouse instances and cloud-managed deployments.

Setup Guide

Step 1: Create a user

Create a dedicated user in your ClickHouse instance and grant it access to the system tables that Causely requires:

CREATE USER causely_user IDENTIFIED BY 'your-password';

GRANT SELECT ON system.tables TO causely_user;
GRANT SELECT ON system.columns TO causely_user;
GRANT SELECT ON system.databases TO causely_user;
GRANT SELECT ON system.query_log TO causely_user;
GRANT SELECT ON system.mutations TO causely_user;
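
To confirm the privileges took effect, you can inspect the user's grants with standard ClickHouse syntax:

```sql
-- Run as admin; lists every privilege granted to the Causely user
SHOW GRANTS FOR causely_user;
```

The output should include a SELECT grant for each of the five system tables listed above.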

The integration reads from the following system tables:

Table               Purpose
system.tables       Table names, row counts, and sizes
system.columns      Column definitions and schema information
system.databases    Database discovery
system.query_log    Slow query analysis (top 10 by total execution time, rolling 12-hour window)
system.mutations    Active mutation detection for lock monitoring
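
The row counts and sizes come from columns such as total_rows and total_bytes in system.tables. A query of the following shape (illustrative only, not Causely's exact internal query) shows the kind of data collected:

```sql
-- List user tables with their row counts and on-disk sizes
SELECT database, name, total_rows, total_bytes
FROM system.tables
WHERE database NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');
```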

Step 2: Create a Kubernetes secret for the user

Create a Kubernetes secret with the ClickHouse connection details. The secret supports two protocols:

  • Native (default): binary protocol on port 9000
  • HTTP: HTTP/HTTPS protocol on port 8123 / 8443

Option 1: Single Database Configuration

kubectl create secret generic clickhouse-credentials \
  --namespace causely \
  --from-literal=username="causely_user" \
  --from-literal=password='...' \
  --from-literal=host="..." \
  --from-literal=port="9000" \
  --from-literal=database="..." \
  --from-literal=protocol="native" \
  --from-literal=secure="false"

To connect over HTTP instead of the native protocol:

kubectl create secret generic clickhouse-credentials \
  --namespace causely \
  --from-literal=username="causely_user" \
  --from-literal=password='...' \
  --from-literal=host="..." \
  --from-literal=port="8123" \
  --from-literal=database="..." \
  --from-literal=protocol="http" \
  --from-literal=secure="false"

Option 2: Multiple Databases Configuration

To monitor multiple databases within the same ClickHouse instance, specify them as a comma-separated list using the databases field:

kubectl create secret generic clickhouse-credentials-multidb \
  --namespace causely \
  --from-literal=username="causely_user" \
  --from-literal=password='...' \
  --from-literal=host="..." \
  --from-literal=port="9000" \
  --from-literal=databases="database1,database2,database3" \
  --from-literal=protocol="native" \
  --from-literal=secure="false"

Alternatively, use a YAML manifest:

apiVersion: v1
kind: Secret
metadata:
  name: clickhouse-credentials-multidb
  namespace: causely
type: Opaque
stringData:
  username: 'causely_user'
  password: '...'
  host: '...'
  port: '9000'
  databases: 'database1,database2,database3'
  protocol: 'native'
  secure: 'false'

Note: Use either the database field for a single database or the databases field for multiple databases. Do not use both in the same secret.

Option 3: Database Auto-Discovery

Causely can automatically discover all databases on a ClickHouse server. Add auto_discovery: "true" to the secret:

kubectl create secret generic clickhouse-credentials \
  --namespace causely \
  --from-literal=username="causely_user" \
  --from-literal=password='...' \
  --from-literal=host="..." \
  --from-literal=port="9000" \
  --from-literal=protocol="native" \
  --from-literal=secure="false" \
  --from-literal=auto_discovery="true"

When auto-discovery is enabled, Causely queries system.databases (excluding system, INFORMATION_SCHEMA, and information_schema) and starts a scraper for each discovered database. Discovery runs periodically to pick up newly created databases.
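
The discovery step can be reproduced by hand. The following query is illustrative of the filter described above and lists the databases a scraper would be started for:

```sql
-- Databases eligible for auto-discovery (system schemas excluded)
SELECT name
FROM system.databases
WHERE name NOT IN ('system', 'INFORMATION_SCHEMA', 'information_schema');
```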

The host must be the FQDN of your ClickHouse instance, or an IP address if no DNS entry is set up. It must match the FQDN/IP Causely would discover either from the Kubernetes API or your cloud provider's API.

If you are connecting through a proxy, set host to the proxy address and host_overwrite to the actual ClickHouse instance address:

--from-literal=host="my-proxy.example.com"
--from-literal=host_overwrite="my-clickhouse.example.com"

Secret field reference

Field           Required  Default                       Description
host            Yes                                     Hostname or IP of the ClickHouse server
username        Yes                                     ClickHouse user name
password        Yes                                     ClickHouse user password
database        Yes*                                    Single database to monitor
databases       Yes*                                    Comma-separated list of databases to monitor
port            No        9000 (native) / 8123 (HTTP)   ClickHouse port
protocol        No        native                        Connection protocol: native or http
secure          No        false                         Enable TLS (true or false)
host_overwrite  No                                      Override the host used for entity resolution
port_overwrite  No                                      Override the port used for entity resolution
auto_discovery  No        false                         Automatically discover all databases

*Either database or databases must be set, unless auto_discovery is enabled.

Step 3: Update Causely Configuration

Once the secret is created, update the Causely configuration to enable scraping for the new instance:

scrapers:
  clickhouse:
    enabled: true
    instances:
      - secretName: clickhouse-credentials
        namespace: causely

Alternative: Enable Credentials Autodiscovery

Causely also supports credentials autodiscovery, which lets you add new scraping targets without modifying the Causely configuration. Label the Kubernetes secret to enable autodiscovery:

kubectl --namespace causely label secret clickhouse-credentials "causely.ai/scraper=ClickHouse"
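
If you manage secrets as manifests, the same label can be set declaratively. A sketch of the metadata section, reusing the secret name from the examples above:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: clickhouse-credentials
  namespace: causely
  labels:
    # Marks this secret as a target for Causely's ClickHouse scraper
    causely.ai/scraper: ClickHouse
type: Opaque
```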

Verify Your Configuration

After completing the setup, run these queries against your ClickHouse instance to verify that the Causely user has the required access.

Quick Access Check

SELECT
    (SELECT count() FROM system.tables) > 0 AS tables_ok,
    (SELECT count() FROM system.columns) > 0 AS columns_ok,
    (SELECT count() FROM system.databases) > 0 AS databases_ok,
    (SELECT count() FROM system.query_log) >= 0 AS query_log_ok,
    (SELECT count() FROM system.mutations) >= 0 AS mutations_ok;

All columns should return 1 (true).

Detailed Checks

1. System tables access

-- Each of these should return a result without error
SELECT 1 FROM system.tables LIMIT 1;
SELECT 1 FROM system.columns LIMIT 1;
SELECT 1 FROM system.databases LIMIT 1;
SELECT 1 FROM system.query_log LIMIT 1;
SELECT 1 FROM system.mutations LIMIT 1;

If any query fails with an access denied error, grant the missing privilege to your Causely user:

-- Run as admin
GRANT SELECT ON system.<table_name> TO causely_user;

2. Test slow query collection

Run this query to confirm Causely can collect slow query data:

SELECT
    normalized_query_hash,
    count() AS calls,
    sum(query_duration_ms) AS total_exec_time_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - toIntervalHour(12)
GROUP BY normalized_query_hash
ORDER BY total_exec_time_ms DESC
LIMIT 5;

This should return results without error. An empty result set is normal on a freshly configured instance—entries will appear as queries run.
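
One common reason for a persistently empty system.query_log is that query logging has been disabled for the connecting users. Logging is controlled by the log_queries setting, which is enabled by default; in a users.xml-based deployment it can be enforced in the profile, for example:

```xml
<profiles>
    <default>
        <!-- Ensure finished queries are written to system.query_log -->
        <log_queries>1</log_queries>
    </default>
</profiles>
```

The exact profile name and file layout depend on your deployment; cloud-managed services typically expose this as a user-level setting instead.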

Success

If all checks pass, your ClickHouse instance is correctly configured for Causely monitoring.

Setup Checklist

  • causely_user created in ClickHouse
  • SELECT granted on system.tables, system.columns, system.databases, system.query_log, system.mutations
  • Kubernetes secret created with correct host, username, password, database/databases, protocol, and port
  • Causely configuration updated (or secret labeled for autodiscovery)
  • Verification queries succeed without access errors

What Data is Collected

The ClickHouse scraper collects comprehensive metadata and performance information from your ClickHouse databases, including:

  • Database entities with names and relationships to hosting services
  • Service-to-database mappings (which service provides which database)
  • Connection details including host, port, and protocol configuration
  • Table information including names, row counts, and sizes (from system.tables)
  • Complete table schemas with column definitions, data types, default expressions, and comments (from system.columns)
  • Slow query analysis using system.query_log: top 10 queries by total execution time over a rolling 12-hour window, including call counts, total and average execution time, and rows read
  • Mutation lock monitoring using system.mutations: active mutations are tracked as exclusive locks to detect contention on tables
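
The mutation-lock signal in the last bullet corresponds to rows in system.mutations that have not yet finished. An illustrative query for the same condition:

```sql
-- Mutations still running; these are tracked as exclusive locks
SELECT database, table, mutation_id, command, create_time
FROM system.mutations
WHERE is_done = 0;
```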