Hive and Delta Lake access control with Apache Ranger#
You can use the Apache Ranger integration with SEP to control access to Hive, Delta Lake, and Iceberg data sources configured in any catalog using the SEP Hive, Delta Lake, or Iceberg connectors.
This is specifically useful in the following scenarios:
You already use Apache Ranger to control access for these data sources
You use SQL standard-based authorization
If you want to control access for other catalogs and connectors with Ranger, you need to adopt global access control.
Note
Hive and Delta Lake access control with Apache Ranger requires a valid Starburst license.
Installation#
Verify you fulfill the Apache Ranger requirements.
If you do not yet have Ranger running and configured, you can take advantage of the supported installation methods for global access control with Apache Ranger.
Configuration#
To use Ranger with Hive, Delta Lake, and Iceberg catalogs, you must set the appropriate property for your setup in the catalog configuration file:
For Hive catalogs:
hive.security=system
For Delta Lake catalogs:
delta.security=system
For Iceberg catalogs:
iceberg.security=system
Additionally, you must have the access-control.properties
file as well as the
hive.security
file for Hive catalogs. The access-control.properties
and
hive.security
files are configured to use Apache Ranger for authorization and
Kerberos for authentication.
An example access-control.properties
file:
access-control.name=hive-based-ranger
ranger.hive-based.catalogs=hive
ranger.hive-based.fallback-access-control=ALLOW_ALL
ranger.policy-rest-url=https://ranger-host:6182
ranger.service-name=hive
ranger.authentication-type=KERBEROS
ranger.kerberos-principal=sep-server/sep-server-node@EXAMPLE.COM
ranger.kerberos-keytab=/etc/sep/conf/sep-server.keytab
ranger.plugin-policy-ssl-config-file=/etc/hive/conf/ranger-policymgr-ssl.xml
An example hive.properties
file:
connector.name=hive
hive.metastore.uri=thrift://hive-metastore-node:9083
hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/hive-metastore-node@EXAMPLE.COM
hive.metastore.client.principal=hive/sep-server-node@EXAMPLE.COM
hive.metastore.client.keytab=/etc/hive/conf/hive.keytab
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.impersonation.enabled=false
hive.hdfs.trino.principal=hdfs/sep-server-node@EXAMPLE.COM
hive.hdfs.trino.keytab=/etc/hadoop/conf/hdfs.keytab
hive.security=system
Property name |
Description |
---|---|
|
Set to |
|
Specifies the comma-separated names of catalogs to be secured by Hive Ranger policies. |
|
Fallback access control to control resources that are out of scope of Hive Ranger policies. There are three options:
|
|
Determines if access to query execution and killing should be controlled. Default is |
|
Determines if access to user defined functions should be controlled. Default is |
|
Specifies default value for the run-as-invoker property. |
See additional configuration properties available in Ranger.
Resource and privilege relationship limitations#
The integration of Ranger with the Hive, Delta Lake, and Iceberg connectors does not support any kind of relation between a resource, such as a table, and the granted access. This is particularly evident when newly created resources are treated with the access rights of the existing policies.
For example, as a user without full administrative access (ALL PRIVILEGES
) you
can end up in a situation where you create a new table, but can not grant access
to that table to other users. As a workaround you can create global policies
that use matching patterns and define the desired default access.
If you want all tables ending with -reporting
to be accessible by everyone,
you can create a policy that uses the value *-reporting
for the table. If
you use *
for database and *
for column, you can grant a specific privilege
such as reporting a table and all of its columns in any database.
Alternatively, you can grant all privileges to anyone creating new objects and require privileges to grant access to other users. This policy might be too open for your use case, depending on your security policies, the data, the number of users, and other aspects.
Role management limitations#
SQL supports the SET ROLE [role]
statement to enable or disable a role. This
feature is not supported in Ranger. All user roles are always enabled. Querying
information_schema.applicable_roles
and information_schema.enabled_roles
yields the same results.