DataLakeCatalog
The DataLakeCatalog database engine connects ClickHouse to external data catalogs
so that you can query open table format data in place, without duplicating it.
ClickHouse then acts as a query engine on top of your existing data lake
infrastructure.
Supported catalogs
The DataLakeCatalog engine supports the following data catalogs:
- AWS Glue Catalog - For Iceberg tables in AWS environments
- Databricks Unity Catalog - For Delta Lake and Iceberg tables
- Hive Metastore - Traditional Hadoop ecosystem catalog
- REST Catalogs - Any catalog supporting the Iceberg REST specification
Creating a database
You will need to enable the relevant settings below to use the DataLakeCatalog engine:
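As a sketch, the catalog-specific settings can be enabled for the current session with `SET`. The setting names below are the experimental flags found in recent ClickHouse releases and should be treated as an assumption: names and defaults can change between versions, so check the settings reference for your release. Only the flag that matches your catalog type is needed.

```sql
-- Assumed setting names; verify them against your ClickHouse version.
SET allow_experimental_database_iceberg = 1;        -- REST (Iceberg) catalogs
SET allow_experimental_database_unity_catalog = 1;  -- Databricks Unity Catalog
SET allow_experimental_database_glue_catalog = 1;   -- AWS Glue Catalog
SET allow_experimental_database_hms_catalog = 1;    -- Hive Metastore
```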
Databases with the DataLakeCatalog engine can be created using the following syntax:
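A minimal sketch of the statement, assuming a REST (Iceberg) catalog: the engine takes the catalog endpoint as its argument, and the catalog is configured through the `SETTINGS` clause. The database name `rest_catalog_db`, the endpoint URL, and the warehouse name `demo` are hypothetical placeholders, not values from this page.

```sql
-- Hypothetical database name, endpoint, and warehouse; replace with your own.
CREATE DATABASE rest_catalog_db
ENGINE = DataLakeCatalog('http://localhost:8181/v1')  -- catalog endpoint
SETTINGS
    catalog_type = 'rest',   -- one of: glue, unity, rest, hive
    warehouse = 'demo';      -- warehouse/database name inside the catalog
```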
The following settings are supported:
| Setting | Description |
|---|---|
| `catalog_type` | Type of catalog: `glue`, `unity` (Delta), `rest` (Iceberg), `hive` |
| `warehouse` | The warehouse/database name to use in the catalog. |
| `catalog_credential` | Authentication credential for the catalog (e.g., API key or token) |
| `auth_header` | Custom HTTP header for authentication with the catalog service |
| `auth_scope` | OAuth2 scope for authentication (if using OAuth) |
| `storage_endpoint` | Endpoint URL for the underlying storage |
| `oauth_server_uri` | URI of the OAuth2 authorization server for authentication |
| `vended_credentials` | Boolean indicating whether to use vended credentials (AWS-specific) |
| `aws_access_key_id` | AWS access key ID for S3/Glue access (if not using vended credentials) |
| `aws_secret_access_key` | AWS secret access key for S3/Glue access (if not using vended credentials) |
| `region` | AWS region for the service (e.g., us-east-1) |
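To illustrate how several of these settings combine, here is a hedged sketch for an AWS Glue catalog that authenticates with static keys rather than vended credentials. The database name `glue_db` and the credential placeholders are hypothetical, and the sketch assumes that no endpoint argument is required for Glue because the service is located through `region`.

```sql
-- Hypothetical database name and placeholder credentials; replace with your own.
CREATE DATABASE glue_db
ENGINE = DataLakeCatalog
SETTINGS
    catalog_type = 'glue',
    region = 'us-east-1',
    aws_access_key_id = '<access-key-id>',
    aws_secret_access_key = '<secret-access-key>';

-- List the tables that the catalog exposes through the new database.
SHOW TABLES FROM glue_db;
```

Keeping credentials out of query text (for example, by using vended credentials where the catalog supports them) is usually preferable to inlining keys as shown in this sketch.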
Examples
See the pages below for examples of using the DataLakeCatalog engine: