Unity Catalog with Delta Lake#

When using the Databricks Unity Catalog as a metastore on AWS, Azure, or GCP, the Delta Lake connector supports reading from managed and external tables, as well as the INSERT, UPDATE, MERGE, and DELETE write operations on both table types.

Configuration#

To use the Unity Catalog metastore, add the following configuration properties to your catalog configuration file:

delta.security=unity
hive.metastore.unity.host=host
hive.metastore.unity.token=token
hive.metastore.unity.catalog-name=main
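Assembled into a complete catalog file, a minimal configuration might look like the following sketch. The connector name line is an assumption (it is not shown in this guide), and the host, token, and catalog name values are placeholders:

```properties
# Minimal sketch of a complete catalog file.
# connector.name is assumed; host, token, and catalog name are placeholders.
connector.name=delta_lake
delta.security=unity
hive.metastore.unity.host=dbc-a1b2345c-d6e7.cloud.databricks.com
hive.metastore.unity.token=<personal-access-token>
hive.metastore.unity.catalog-name=main
```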

When using Unity Catalog with managed tables, add hive.metastore.unity.catalog-owned-table-enabled=true to your catalog configuration file. Then set the following table properties on the Delta Lake table in Databricks to ensure compatibility with SEP:

CREATE TABLE catalog_name.schema_name.table_name (
  id int
) USING delta
TBLPROPERTIES (
  'delta.enableRowTracking' = 'false',
  'delta.checkpointPolicy' = 'classic'
)
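To confirm the properties took effect, you can inspect them from Databricks. The statement below uses the same placeholder table name as the example above:

```sql
-- Verify the table properties in Databricks (placeholder table name)
SHOW TBLPROPERTIES catalog_name.schema_name.table_name;
```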

The following table shows the configuration properties used to connect SEP to Unity Catalog as a metastore.

Unity configuration properties#

hive.metastore.unity.host
    Name of the host without the http(s) prefix. For example: dbc-a1b2345c-d6e7.cloud.databricks.com

hive.metastore.unity.token
    Personal access token used to authenticate the connection to the Unity Catalog metastore. For more information about generating access tokens, see the Databricks documentation.

hive.metastore.unity.catalog-name
    Name of the catalog in Databricks.

hive.metastore.unity.catalog-owned-table-enabled
    Enables support for Databricks Unity Catalog-owned tables.

Enable OAuth 2.0 token passthrough#

Unity Catalog supports OAuth 2.0 token passthrough.

To enable OAuth 2.0 token passthrough:

  1. Add the following configuration properties to the config.properties file on the coordinator:

    http-server.authentication.type=DELEGATED-OAUTH2
    web-ui.authentication.type=DELEGATED-OAUTH2
    http-server.authentication.oauth2.scopes=<AzureDatabricks-ApplicationID>/.default,openid
    http-server.authentication.oauth2.additional-audiences=<AzureDatabricks-ApplicationID>
    

Replace <AzureDatabricks-ApplicationID> with the Application ID of your Azure Databricks Microsoft application, which you can find in the Azure Portal under Enterprise applications.

  2. Add only the following configuration properties to the delta.properties catalog configuration file:

    delta.metastore.unity.authentication-type=OAUTH2_PASSTHROUGH
    delta.security=unity
    hive.metastore-cache-ttl=0s
    

Limitations:

  • Credential passthrough is only supported with Azure Databricks and when Microsoft Entra is the IdP.

  • When credential passthrough is enabled, you cannot use Hive passthrough.

Location alias mapping#

When using Unity Catalog as a metastore to access external tables, the Starburst Delta Lake connector supports using a bucket-style alias for your Amazon S3 bucket access point.

To enable location alias mapping:

  1. Create a bucket alias mapping file in JSON format:

    {
      "bucket_name_1": "bucket_alias_1",
      "bucket_name_2": "bucket_alias_2"
    }

  2. Add the following properties to your catalog configuration:

    location-alias.provider-type=file
    location-alias.mapping.file.path=/path_to_bucket_alias_mapping_file

  3. Optionally, use location-alias.mapping.file.expiration-time to specify the interval at which SEP rereads the bucket alias mapping file. The default is 1m.

SEP uses the new external location path specified in the bucket alias mapping file to access the data. Only the bucket name is replaced. The URI is otherwise unchanged.
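For example, given the mapping file shown earlier, an external location resolves as follows. The bucket names and path are illustrative:

```text
s3://bucket_name_1/warehouse/sales/part-0000.parquet
  resolves to
s3://bucket_alias_1/warehouse/sales/part-0000.parquet
```

Only the bucket name changes; the scheme and the object path are preserved.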