Unity Catalog with Delta Lake#
The Delta Lake connector supports the INSERT, UPDATE, MERGE, and DELETE write operations for managed and external tables, and supports reading from both managed and external tables when using the Databricks Unity Catalog as a metastore on AWS, Azure, or GCP.
Configuration#
To use the Unity Catalog metastore, add the following configuration properties to your catalog configuration file:
delta.security=unity
hive.metastore.unity.host=host
hive.metastore.unity.token=token
hive.metastore.unity.catalog-name=main
When using Unity Catalog with managed tables, add hive.metastore.unity.catalog-owned-table-enabled=true to your catalog configuration file. Then add the following table properties on the Delta Lake table in Databricks to ensure compatibility with SEP:
CREATE TABLE catalog_name.schema_name.table_name (id INT)
USING delta
TBLPROPERTIES (
'delta.enableRowTracking' = 'false',
'delta.checkpointPolicy' = 'classic'
)
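For a table that already exists, the same properties can be applied with an ALTER TABLE statement in Databricks. This is a sketch using the same placeholder table name as above:

```sql
ALTER TABLE catalog_name.schema_name.table_name SET TBLPROPERTIES (
  'delta.enableRowTracking' = 'false',
  'delta.checkpointPolicy' = 'classic'
)
```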
The following table shows the configuration properties used to connect SEP to Unity Catalog as a metastore.
| Property name | Description |
| --- | --- |
| `hive.metastore.unity.host` | Name of the host, without the http(s) prefix. |
| `hive.metastore.unity.token` | The personal access token used to authenticate a connection to the Unity Catalog metastore. For more information about generating access tokens, see the Databricks documentation. |
| `hive.metastore.unity.catalog-name` | Name of the catalog in Databricks. |
| `hive.metastore.unity.catalog-owned-table-enabled` | Enables support for Databricks Unity Catalog-owned tables. |
Enable OAuth 2.0 token passthrough#
Unity Catalog supports OAuth 2.0 token passthrough.
To enable OAuth 2.0 token passthrough:
Add the following configuration properties to the config.properties file on the coordinator:

http-server.authentication.type=DELEGATED-OAUTH2
web-ui.authentication.type=DELEGATED-OAUTH2
http-server.authentication.oauth2.scopes=<AzureDatabricks-ApplicationID>/.default,openid
http-server.authentication.oauth2.additional-audiences=<AzureDatabricks-ApplicationID>
Replace <AzureDatabricks-ApplicationID> with the Application ID of your Azure Databricks Microsoft application, which can be found in the Azure Portal under Enterprise applications.
Add only the following configuration properties to the delta.properties catalog configuration file:

delta.metastore.unity.authentication-type=OAUTH2_PASSTHROUGH
delta.security=unity
hive.metastore-cache-ttl=0s
Limitations:
Credential passthrough is only supported with Azure Databricks, and only when Microsoft Entra is the IdP.
When credential passthrough is enabled, you cannot use Hive passthrough.
Location alias mapping#
If you are using Unity Catalog as a metastore to access external tables, the Starburst Delta Lake connector supports using a bucket-style alias for your Amazon S3 bucket access point.
To enable location alias mapping:
Create a bucket alias mapping file in JSON format:
{
"bucket_name_1": "bucket_alias_1",
"bucket_name_2": "bucket_alias_2"
}
Add the following properties to your catalog configuration:
location-alias.provider-type=file
location-alias.mapping.file.path=/path_to_bucket_alias_mapping_file
Optionally, use location-alias.mapping.file.expiration-time to specify the interval at which SEP rereads the bucket alias mapping file. The default is 1m.
SEP uses the new external location path specified in the bucket alias mapping file to access the data. Only the bucket name is replaced. The URI is otherwise unchanged.
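For example, given the mapping file above, a location in bucket_name_1 is accessed through its alias. Only the bucket name changes; the object key shown here is a hypothetical placeholder:

```text
Location in the metastore: s3://bucket_name_1/warehouse/sales/part-00000.parquet
Location used by SEP:      s3://bucket_alias_1/warehouse/sales/part-00000.parquet
```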