Starburst Galaxy

    Access control policy types #

    A policy supplies privileges to an entity such as a catalog, schema, table, view, or column, based on the entity’s assigned attributes, such as tags.

    A policy has the following characteristics:

    Attribute | Description
    Name and description | The name and description given to the policy upon creation.
    Matching expression | A boolean expression that can test whether tags are associated with an entity. Tags can be associated with one or more catalogs, schemas, tables, views, or columns.
    A role that the policy is assigned to | The policy is only active if the policy role is contained in the user's current active role set.
    Grants of privileges on entities | A set of privileges that can either be granted or denied. You can identify all tables in a catalog by selecting All schemas and All tables in the Scope section of the Policies tab. The entities on which policies can grant privileges are catalogs, schemas, tables, views, and columns.
    Combined privilege checks with role-based privileges | When checking privileges, role-based privileges and attribute-based privileges from policies are combined using the same rules as are used for multiple role-based privileges:

    • ALLOW privileges are additive, so the allowed privileges on the entity are the union of the role-based and attribute-based ALLOW privileges.
    • However, a DENY privilege from either source overrides an ALLOW privilege from either source.
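
    The combination rule can be modeled with plain set operations. The following Python sketch is only an illustration, not Galaxy's implementation. The function name effective_privileges and the representation of grants as sets of privilege names are assumptions made for the example.

```python
def effective_privileges(role_allow, role_deny, policy_allow, policy_deny):
    """Combine role-based and attribute-based grants for one entity.

    Each argument is a set of privilege names (hypothetical representation).
    """
    # ALLOW privileges are additive across both sources.
    allowed = set(role_allow) | set(policy_allow)
    # A DENY from either source overrides an ALLOW from either source.
    denied = set(role_deny) | set(policy_deny)
    return allowed - denied


result = effective_privileges(
    role_allow={"SELECT"},
    role_deny=set(),
    policy_allow={"INSERT", "UPDATE"},
    policy_deny={"UPDATE"},
)
print(sorted(result))  # ['INSERT', 'SELECT']
```

    In this example the policy's DENY on UPDATE removes the UPDATE privilege even though the same policy also allows it.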

    Role-based access control #

    A role has a name and an optional description. A role can be granted privileges on entities such as clusters, catalogs, and tables. This provides fine-grained control that protects your data, and allows you to define just the right mix of allowed actions and access for each function in your organization.

    You can manage users, roles, and privileges in the Access - Users section and the Access - Roles and privileges section of Starburst Galaxy.

    All actions are controlled by privileges or ownership.

    Attribute-based access control #

    The attribute-based access control system of Starburst Galaxy allows the combination of policies and attributes, such as tags, to further manage role access to entities such as catalogs, schemas, tables, views, and columns.

    A policy can grant privileges that can apply to different entities, such as a schema and a table. A policy is only active if the policy role is contained in the user’s current active role set.

    Starburst Galaxy combines attribute-based privilege grants and role-based privilege grants to determine role access to entities.

    For more information, see Access control basics.

    Matching expressions #

    The matching expression is a boolean expression that uses parentheses and the AND, OR, and NOT operators. HAS_TAG(example_tag) is a special function that takes the name of a tag as its parameter and returns true if that tag is in the set of tags associated with an entity. As in SQL, matching expression operators are case-insensitive, but by convention they are written in uppercase.

    Matching expressions do not refer to entities directly. They only refer to the tags associated with an entity. This means that the same policy can apply to a catalog, schema, table, view, or column.

    The following example shows a matching expression that tests if the set of tags on an entity includes the tag sales_department and does not include the tag pii:

    HAS_TAG(sales_department) AND NOT HAS_TAG(pii)
    

    The following example shows a matching expression that returns true if the set of tags on an entity includes either the tag sales_department, or both the tag marketing_department and the tag sales_liaison:

    HAS_TAG(sales_department) OR (HAS_TAG(marketing_department) AND HAS_TAG(sales_liaison))
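
    To make the semantics concrete, the following Python sketch evaluates a matching expression against a set of tags. It is a small illustrative parser, not the parser used by Galaxy. Several details are assumptions: the function names, the restriction of tag names to letters, digits, and underscores, exact (case-sensitive) comparison of tag names, and the SQL-style precedence of NOT over AND over OR.

```python
import re

# A token is a parenthesis or a word (operator, HAS_TAG, or tag name).
TOKEN = re.compile(r"\s*(\(|\)|[A-Za-z_][A-Za-z0-9_]*)")


def tokenize(expression):
    tokens, pos = [], 0
    expression = expression.rstrip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def matches(expression, tags):
    """Return True if the expression is true for the given set of tags."""
    tokens = tokenize(expression)
    pos = 0

    def peek():
        # Operators are case-insensitive, so compare in uppercase.
        return tokens[pos].upper() if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        if expected is not None and token.upper() != expected:
            raise ValueError(f"expected {expected}, found {token}")
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            right = parse_and()  # always parse the right side
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        if peek() == "(":
            take()
            value = parse_or()
            take(")")
            return value
        take("HAS_TAG")
        take("(")
        tag = take()
        take(")")
        return tag in tags

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]}")
    return result


print(matches("HAS_TAG(sales_department) AND NOT HAS_TAG(pii)", {"sales_department"}))
```

    A syntax error raises ValueError in this sketch, which loosely corresponds to the input field turning pink in the user interface.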
    

    Policy grants #

    A policy can define grants on multiple entities, and a single policy can grant privileges on different entity kinds. For example, a policy named sales_admin might allow the CREATE_TABLE privilege on all schemas in a catalog named sales_data and allow privileges SELECT, UPDATE and INSERT on all tables in catalog sales_data.

    Defining policies #

    The Policies tab lists all policies. The tab is only visible if a role in the user’s active role set has the account-level privilege Manage security.

    If a role in the user’s active role set has the Manage security privilege, the user can create a new policy by clicking Add policy.

    When creating a policy, you can enter the matching expression as free text in the matching expression input field. For each character typed, the expression parser is run, and the input field turns pink if there is a syntax error in the matching expression, or if it references a tag that does not exist.

    You can change any attribute of a policy listed in the policies tab by clicking .

    Policy grants and privilege checking #

    The Starburst Galaxy access control system performs the following operations to check whether the user has access to a catalog, schema, table, view, or column entity:

    • The tags associated with the catalog, schema, table, view, or column are fetched. Because of tag inheritance, a tag applied to a catalog is returned in the set of tags for any schema, table, view, or column in the catalog. Similarly, a tag applied to a schema is returned in the set of tags for any table, view, or column in the schema. Finally, a tag applied to a table or view is returned in the set of tags for any column in the table or view.
    • The system evaluates the matching expression for all policies based on the tags associated with the entity. The system collects the privileges granted by the policies on the entity for which the matching expressions are true.
    • Role-based grants for the entity are retrieved and combined with the policy grants. The usual rules for ALLOW and DENY privilege grants are applied, so a DENY privilege always overrides an ALLOW privilege whether the privileges come from role-based grants or from attribute-based grants.
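
    The three steps above can be sketched in Python as follows. This is a simplified model under stated assumptions, not Galaxy's implementation: entities are identified by hypothetical path tuples of the form (catalog, schema, table, column), each policy is a dictionary with a callable standing in for its matching expression, and the policy role and scope are left out.

```python
def inherited_tags(path, tags_by_entity):
    """Step 1: collect the entity's own tags plus tags from its ancestors."""
    tags = set()
    for depth in range(1, len(path) + 1):
        # path[:1] is the catalog, path[:2] the schema, and so on.
        tags |= tags_by_entity.get(path[:depth], set())
    return tags


def has_privilege(privilege, path, tags_by_entity, policies, role_allow, role_deny):
    tags = inherited_tags(path, tags_by_entity)
    allow, deny = set(role_allow), set(role_deny)
    # Step 2: collect grants from every policy whose expression is true.
    for policy in policies:
        if policy["matches"](tags):
            allow |= policy["allow"]
            deny |= policy["deny"]
    # Step 3: a DENY from any source overrides an ALLOW from any source.
    return privilege in allow and privilege not in deny


tags_by_entity = {
    ("sales_data",): {"sales_department"},
    ("sales_data", "orders", "customers", "email"): {"pii"},
}
policies = [
    {"matches": lambda tags: "sales_department" in tags, "allow": {"SELECT"}, "deny": set()},
    {"matches": lambda tags: "pii" in tags, "allow": set(), "deny": {"SELECT"}},
]

table = ("sales_data", "orders", "customers")
column = table + ("email",)
print(has_privilege("SELECT", table, tags_by_entity, policies, set(), set()))   # True
print(has_privilege("SELECT", column, tags_by_entity, policies, set(), set()))  # False
```

    The email column inherits sales_department from the catalog and also carries pii, so both policies match and the DENY wins.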

    Policy grants and catalog, schema, table, and view visibility #

    A key operation performed by the access control system determines visibility of catalogs, schemas, tables, and views:

    • A catalog is visible to a user if a role in the user’s active role set owns the catalog, or has an ALLOW privilege on the catalog or on any schema, table, view, or column in the catalog, provided that the ALLOW privilege is not overridden by a DENY privilege. The privileges can come from role-based grants, or from policies whose matching expression is true for the tags associated with a tagged entity in the catalog.
    • Similarly, a schema is visible to a user if a role in the user’s active role set owns the schema, or has an ALLOW privilege on the schema or on any table, view, or column in the schema, provided that the ALLOW privilege is not overridden by a DENY privilege. The privileges can come from role-based grants, or from policies whose matching expression is true for the tags associated with a tagged entity in the schema.
    • Finally, a table or view is visible to a user if a role in the user’s active role set owns the table or view, or has an ALLOW privilege on the table or view or on any column in it, provided that the ALLOW privilege is not overridden by a DENY privilege. The privileges can come from role-based grants, or from policies whose matching expression is true for the tags associated with a tagged entity in the table or view.
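
    The visibility rules share one pattern: an entity is visible if it is owned, or if an effective ALLOW privilege exists on the entity itself or on anything beneath it. The following Python sketch models that pattern. The path tuples, the owned set, and the effective_allow mapping (ALLOW privileges that remain after DENY privileges are applied) are hypothetical structures chosen for the example.

```python
def is_visible(path, owned, effective_allow):
    """Return True if the entity at `path` is visible to the active role set.

    path            -- tuple such as (catalog,), (catalog, schema) or
                       (catalog, schema, table)
    owned           -- set of paths owned by a role in the active role set
    effective_allow -- dict mapping a path to the set of ALLOW privileges
                       that are not overridden by a DENY
    """
    if path in owned:
        return True
    # Visible if any effective ALLOW exists on the entity or a descendant.
    return any(
        bool(privileges) and candidate[:len(path)] == path
        for candidate, privileges in effective_allow.items()
    )


owned = {("sales_data", "orders")}
effective_allow = {
    ("marketing", "campaigns", "clicks", "url"): {"SELECT"},
    ("finance",): set(),  # every ALLOW here was overridden by a DENY
}

print(is_visible(("marketing",), owned, effective_allow))                      # True
print(is_visible(("marketing", "campaigns", "views"), owned, effective_allow)) # False
print(is_visible(("finance",), owned, effective_allow))                        # False
```

    A single SELECT privilege on the url column is enough to make the clicks table, the campaigns schema, and the marketing catalog visible.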

    Row-level filters #

    Row-level filters are an important component of data governance. They ensure that the rows returned by queries are rows the user is entitled to see. Unlike RBAC grants, which apply to entire tables or columns, row filters determine access on a row-by-row basis.

    A row-level filter is a named SQL expression. One or more row filters can be added to a policy whose target is a table or view. When Starburst Galaxy fetches privileges for a table or view, if the table or view matches the policy target, and the policy matching expression evaluates to true for the table’s tags, the policy’s row filters are included in the table privileges. Multiple different policies can match the table, so the row filters returned can come from multiple different policies.

    The SQL expressions of all row filters from all matching policies are combined with the OR operator and added to the query’s WHERE clause. Rows for which any of the row filter expressions returns true are included in the query result, and rows for which all row filter expressions return false are excluded.
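
    The following Python sketch illustrates how several row filter expressions could be combined into one predicate. It only builds strings for illustration and does not show Galaxy's actual query rewriting. The function names, the table name orders, and the columns region and owner are hypothetical.

```python
def combined_row_filter(filter_expressions):
    """Join row filter expressions with OR, or return None if there are none."""
    if not filter_expressions:
        return None
    # Parenthesize each filter so its own AND/OR operators keep their meaning.
    return " OR ".join(f"({expression})" for expression in filter_expressions)


def apply_row_filters(table, filter_expressions):
    """Build an illustrative query with the combined filter in its WHERE clause."""
    predicate = combined_row_filter(filter_expressions)
    query = f"SELECT * FROM {table}"
    return query if predicate is None else f"{query} WHERE {predicate}"


filters = ["region = 'EMEA'", "owner = 'alice'"]
print(apply_row_filters("orders", filters))
# SELECT * FROM orders WHERE (region = 'EMEA') OR (owner = 'alice')
```

    Because the filters are joined with OR, adding another matching policy with a row filter can only widen the set of rows a user sees, never narrow it.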

    Row filters are evaluated using the privileges of the role that owns the row filter, which may not be the same as the current role of the user running a query to which the row filter is applied. This means that a row filter can reference tables that the user running the query does not have the privileges to access.

    Creating row-level filters #

    To create row filters, a user’s active role set must have the Manage security privilege. In addition, the Row filters side pane link is only visible to users whose active role set has the Manage security privilege.

    See Row filters for more details.

    Column masks #

    Column masks are another access control tool that can be used to protect sensitive data. They ensure that column data is only visible to those who have permission to see the data by masking the values returned by the query.

    One or more column masks can be added to a policy whose target is a table or view. Column masks function similarly to row filters: when Starburst Galaxy fetches privileges for a table or view, if the table or view matches the policy target and the policy’s matching expression is true for the table or view’s tags, the policy’s column masks are included in the table or view privileges.

    Column masks are evaluated using the privileges of the role that owns the column mask, which may not be the same as the current role of the user running a query to which the column mask is applied. This means that a column mask can reference tables that the user running the query does not have the privileges to access.

    When you execute a query, Galaxy automatically rewrites your query and applies a column mask expression to the specified column. The column rewrite applies the mask expression everywhere the column appears in the query. Users see masked data based on the conditions you define.

    When to use masking vs hashing #

    Masking and hashing are different obfuscation techniques that have different use cases:

    • Masking: Use masking to protect privacy or to obscure sensitive data to meet compliance requirements. Masking lets you hide specific portions of a value, such as displaying only the last four digits of a social security number. Do not perform joins on masked columns, as this produces unexpected output.

    • Hashing: Use hashing when you need to obfuscate data irreversibly. Hashing lets you anonymize data while ensuring that the same input always produces the same hashed output.
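
    The difference between the two techniques can be illustrated in Python. In Galaxy a column mask is defined as an expression on the column, so this sketch only demonstrates the concepts. The function names, the sample values, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def mask_ssn(ssn):
    """Masking: hide everything except the last four digits."""
    return "***-**-" + ssn[-4:]


def hash_value(value):
    """Hashing: the same input always yields the same digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


first, second = "123-45-6789", "987-65-6789"

# Two different values can share the same masked form.
print(mask_ssn(first))                      # ***-**-6789
print(mask_ssn(first) == mask_ssn(second))  # True

# Hashes are deterministic and differ for these two inputs.
print(hash_value(first) == hash_value(first))   # True
print(hash_value(first) == hash_value(second))  # False
```

    The second print shows one way a join on a masked column can go wrong: distinct values collapse to the same masked value and would match each other.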

    Creating column masks #

    To create column masks, a user’s active role set must have the Manage security privilege. In addition, the Column mask side pane link is only visible to users whose active role set has the Manage security privilege.

    See Column masks for more details.

    User attributes #

    User attributes can be imported into Galaxy from your identity provider (IdP), such as Okta. A total of 1 KB in attribute statements is supported. Statements exceeding 1 KB may result in attributes being dropped.

    For more information on importing attribute statements from Okta, see Okta SAML setup.

    User attributes in policy expressions #

    User attribute statements as sent from Okta or any other SSO provider can be evaluated as part of a policy expression.

    These expressions are supported as part of the policy language:

    • user_attribute_exists('attributeName') evaluates to true when attributeName is sent from your IdP and at least one value corresponding to attributeName is not NULL.
    • user_has_attribute('attributeName', 'attributeValue') evaluates to true when attributeName is sent from your IdP and at least one value corresponding to attributeName equals attributeValue.

    Note that attributeName and attributeValue are placeholders and can be replaced with any values you specify in your IdP. Additionally, strings can be escaped using a backslash \. For example, an expression such as user_attribute_exists('it\'s an example') is valid and matches with the SSO attribute name it's an example.
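
    The semantics of the two functions can be modeled in Python as follows. This sketch mirrors the descriptions above and is not Galaxy's implementation. Representing the attributes sent by the IdP as a dictionary that maps each attribute name to a list of values is an assumption.

```python
def user_attribute_exists(attributes, name):
    """True if the attribute was sent and at least one value is not None."""
    return any(value is not None for value in attributes.get(name, []))


def user_has_attribute(attributes, name, expected):
    """True if the attribute was sent and at least one value equals expected."""
    return expected in attributes.get(name, [])


# Hypothetical attributes for one user, as sent by the IdP.
attributes = {
    "department": ["sales", "marketing"],
    "region": [None],
}

print(user_attribute_exists(attributes, "department"))          # True
print(user_attribute_exists(attributes, "region"))              # False
print(user_has_attribute(attributes, "department", "finance"))  # False
```

    An attribute that is sent with only NULL values counts as not existing, as the region example shows.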

    User attributes in row filter expressions #

    User attribute statements can also be substituted into row filter expressions automatically on a per-user basis. Two special expressions allow for this substitution:

    • $USER_ATTRIBUTE('attributeName') is replaced with the first matching value for the attribute named attributeName, as sent from your SSO provider. If no values for attributeName are found, NULL is returned.
    • $USER_ATTRIBUTE_LIST('attributeName') is replaced with all matching values for attributeName, as a list in the form (val1, val2, ...).

    The standard convention is to use $USER_ATTRIBUTE('attributeName') when checking for strict equality in the filter expression, and $USER_ATTRIBUTE_LIST('attributeName') when you want to match any value in the list. Note that there is no validation performed on the row filter expression.
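
    The following Python sketch illustrates the substitution on a row filter expression. It is a model for illustration only. The function names are hypothetical, and rendering each value as a single-quoted SQL string literal is an assumption, because the exact output format is not described here beyond the list form (val1, val2, ...). The sketch supports only attribute names without quotes or escapes.

```python
import re

# Matches $USER_ATTRIBUTE('name') and $USER_ATTRIBUTE_LIST('name').
PLACEHOLDER = re.compile(r"\$USER_ATTRIBUTE(_LIST)?\('([^']*)'\)")


def sql_literal(value):
    # Assumption: values are rendered as quoted SQL string literals.
    return "'" + value.replace("'", "''") + "'"


def substitute(expression, attributes):
    """Replace user attribute placeholders with the user's values."""

    def replace(match):
        is_list, name = match.group(1), match.group(2)
        values = attributes.get(name, [])
        if is_list:
            return "(" + ", ".join(sql_literal(v) for v in values) + ")"
        # The single-value form uses the first value, or NULL if none exist.
        return sql_literal(values[0]) if values else "NULL"

    return PLACEHOLDER.sub(replace, expression)


attributes = {"region": ["EMEA", "APAC"]}
print(substitute("region = $USER_ATTRIBUTE('region')", attributes))
# region = 'EMEA'
print(substitute("region IN $USER_ATTRIBUTE_LIST('region')", attributes))
# region IN ('EMEA', 'APAC')
print(substitute("team = $USER_ATTRIBUTE('team')", attributes))
# team = NULL
```

    In standard SQL a comparison with NULL is never true, so in the last case the filter would match no rows. Because no validation is performed on row filter expressions, it is worth checking the substituted result for users with missing or multi-valued attributes.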