Routing rules#

Starburst Gateway includes a routing rules engine that lets you customize how the Gateway routes requests to backend clusters.

Default routing behavior#

By default, Starburst Gateway reads the X-Trino-Routing-Group request header to route requests. If you do not specify this header, the Gateway sends requests to the default routing group called adhoc.

Custom routing#

The routing rules engine enables you to create custom routing logic based on request information such as request headers. You can either:

  • Define routing logic in your configuration file

  • Use a custom, external service

Enable routing rules#

To configure the routing rules engine in your config.yaml file, use the following configuration properties:

  • rulesEngineEnabled: Set to true to enable the routing rules engine.

  • rulesType: Set to FILE for file-based rules or EXTERNAL for external service routing (default: FILE)

  • rulesConfigPath: The path to your routing rules configuration file (required when rulesType is FILE)

  • rulesRefreshPeriod: Specifies how often the rules file reloads (default: 1m)

  • rulesExternalConfiguration: The URL and related settings for an external routing service (required when rulesType is EXTERNAL)

See the following example configuration:

routingRules:
    rulesEngineEnabled: true
    rulesType: FILE  # or EXTERNAL
    rulesConfigPath: "app/config/routing_rules.yml"  # for FILE type
    rulesExternalConfiguration:
        urlPath: https://router.example.com/gateway-rules  # for EXTERNAL type
        excludeHeaders:
            - 'Authorization'
            - 'Accept-Encoding'
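
The example above omits rulesRefreshPeriod. As a minimal sketch of a file-based setup that also sets the reload interval, the fragment below assumes the property sits alongside the other routingRules keys and accepts the same duration format as its 1m default. The 5m value is illustrative only.

```yaml
routingRules:
    rulesEngineEnabled: true
    rulesType: FILE
    rulesConfigPath: "app/config/routing_rules.yml"
    # Assumed placement and duration syntax, same form as the 1m default
    rulesRefreshPeriod: 5m
```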

Note

If there are errors parsing the routing rules configuration file, Starburst Gateway logs the error and uses the X-Trino-Routing-Group header for routing decisions.

File-based routing rules#

Define routing rules in a YAML file using conditions and actions. Separate each rule with ---.

Rules consist of:

  • name: A descriptive identifier for the rule

  • description: Human-readable explanation of what the rule does

  • condition: An expression that determines when the rule applies

  • actions: What to do when the condition is met

  • priority (optional): Execution order (lower numbers run first)

See the following example rules:

---
name: "airflow"
description: "if query from airflow, route to etl group"
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
  - 'result.put("routingGroup", "etl")'
---
name: "airflow special"
description: "if query from airflow with special label, route to etl-special group"
condition: 'request.getHeader("X-Trino-Source") == "airflow" && request.getHeader("X-Trino-Client-Tags") contains "label=special"'
actions:
  - 'result.put("routingGroup", "etl-special")'

Available objects#

By default, rules can access three objects:

  • request: The incoming HTTP request as an HttpServletRequest

  • state: A HashMap for passing data between rules

  • result: A HashMap for returning routing decisions

Optional objects#

Enable additional objects by setting the requestAnalyzerConfig.analyzeRequest property to true:

  • trinoRequestUser: Information about the authenticated user

  • trinoQueryProperties: Details about the SQL query being executed

Expression language#

Rules use MVEL, an expression language with Java-like syntax. Available classes include:

  • java.util.*

  • java.lang.* (Integer, String, etc.)

  • java.lang.Math and java.lang.StrictMath

Note

java.lang.System and other system access classes are not available for security reasons.

MVEL operators#

MVEL provides convenient operators for common operations.

Instead of:

condition: 'request.getHeader("X-Trino-Client-Tags") != null && request.getHeader("X-Trino-Client-Tags").contains("label=foo")'

Use the contains operator:

condition: 'request.getHeader("X-Trino-Client-Tags") contains "label=foo"'

MVEL does not support generic type parameters. Use new HashSet() instead of new HashSet<String>().
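
As a sketch of how an untyped collection might be used inside a rule, the following hypothetical rule builds a raw HashSet of source names and routes any request whose X-Trino-Source header is in that set. The rule name, the etlSources variable, and the dbt source value are illustrative assumptions rather than anything the product defines.

```yaml
---
name: "etl sources"
description: "route queries from known ETL tools to the etl group"
# Skip the rule entirely when the header is absent
condition: 'request.getHeader("X-Trino-Source") != null'
actions:
  # etlSources is a hypothetical local variable; note the raw type, no <String>
  - 'etlSources = new HashSet();
     etlSources.add("airflow");
     etlSources.add("dbt");
     if (etlSources.contains(request.getHeader("X-Trino-Source"))) {
       result.put("routingGroup", "etl")
     }'
```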

Flow control#

Use MVEL flow control statements for complex logic. For example:

---
name: "airflow routing"
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
  - 'if (request.getHeader("X-Trino-Client-Tags") contains "label=foo") {
       result.put("routingGroup", "etl-foo")
     }
     else if (request.getHeader("X-Trino-Client-Tags") contains "label=bar") {
       result.put("routingGroup", "etl-bar")
     }
     else {
       result.put("routingGroup", "etl")
     }'

Execution order#

By default, rules execute in the order they appear in the file.

Assign integer priorities to control execution order. Lower numbers execute first:

---
name: "general airflow"
description: "Route Airflow queries to ETL group"
priority: 0
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
  - 'result.put("routingGroup", "etl")'
---
name: "special airflow"
description: "Override for special Airflow queries"
priority: 1
condition: 'request.getHeader("X-Trino-Source") == "airflow" && request.getHeader("X-Trino-Client-Tags") contains "label=special"'
actions:
  - 'result.put("routingGroup", "etl-special")'

Rules without specified priority default to INT_MAX and execute last.

When multiple rules match a request, all matching rules execute. A query can satisfy multiple conditions and have its routing group changed multiple times. The final routing group is determined by the last rule that executes.
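
Because the last matching rule wins, a catch-all rule only works as a baseline if it runs first. The following is a minimal sketch with hypothetical rule names and priority values: the baseline rule has the lowest priority number, so the more specific rule that executes later can override it. If the baseline rule had no priority, it would default to INT_MAX, run last, and overwrite every earlier decision.

```yaml
---
name: "baseline"
description: "start every request in the default group"
priority: 0
condition: "true"
actions:
  - 'result.put("routingGroup", "adhoc")'
---
name: "airflow override"
description: "runs after the baseline, so its routing group is the one that sticks"
priority: 10
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
  - 'result.put("routingGroup", "etl")'
```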

Sharing state between rules#

Use the state object to pass information between rules:

---
name: "initialize state"
priority: 0
condition: "true"
actions:
  - 'state.put("triggeredRules", new HashSet())'
---
name: "airflow detection"
priority: 1
condition: 'request.getHeader("X-Trino-Source") == "airflow"'
actions:
  - 'result.put("routingGroup", "etl")'
  - 'state.get("triggeredRules").add("airflow")'
---
name: "special airflow routing"
priority: 2
condition: 'state.get("triggeredRules").contains("airflow") && request.getHeader("X-Trino-Client-Tags") contains "label=special"'
actions:
  - 'result.put("routingGroup", "etl-special")'

External service routing rules#

Configure Starburst Gateway to use an external service for routing decisions.

Set the rulesType property to EXTERNAL and configure the rulesExternalConfiguration property:

routingRules:
    rulesEngineEnabled: true
    rulesType: EXTERNAL
    rulesExternalConfiguration:
        urlPath: https://router.example.com/gateway-rules
        excludeHeaders:
            - 'Authorization'
            - 'Accept-Encoding'

Note

Starburst Gateway does not support redirect URLs.

Optionally, add headers to the excludeHeaders list to prevent sending specific header values in the POST request.

Request format#

Starburst Gateway sends a POST request to the external service containing:

  • All HTTP headers from the original request (excluding those in excludeHeaders)

  • The following HTTP information:

    • remoteUser

    • method

    • requestURI

    • queryString

    • session

    • remoteAddr

    • remoteHost

    • parameterMap

  • Optional information (if requestAnalyzerConfig.analyzeRequest is set to true):

    • TrinoRequestUser

    • TrinoQueryProperties

Response format#

The external service must return a JSON response with an HTTP 200 status code.

See the following JSON object containing a routing decision:

{
    "routingGroup": "etl-cluster",
    "errors": [
        "Optional error message 1",
        "Optional error message 2"
    ]
}

If the errors field is present and contains at least one entry, the request routes to the default adhoc group.

HTTP client configuration#

Configure the HTTP client for external service requests in the serverConfig section:

serverConfig:
    router.http-client.request-timeout: 1s
    router.http-client.connect-timeout: 500ms

For all available options, see the HTTP client properties documentation.

Health status#

Starburst Gateway tracks the health status of the configured clusters. The Gateway updates cluster states with every health check.

There are four possible states:

  • PENDING: The cluster is starting up (it is treated as unhealthy for routing)

  • HEALTHY: The cluster passed health checks and is ready for requests

  • UNHEALTHY: The cluster failed health checks (no requests routed)

  • UNSUPPORTED: A non-SEP cluster is configured

Request analysis configuration#

Configure request analysis in the requestAnalyzerConfig section of your YAML file.

requestAnalyzerConfig:
    analyzeRequest: true
    maxBodySize: 1000000
    isClientsUseV2Format: false
    tokenUserField: "email"
    oauthTokenInfoUrl: "https://auth.example.com/userinfo"

Configuration options#

To configure request analysis, use the following configuration properties:

  • analyzeRequest: Set to true to enable trinoQueryProperties and trinoRequestUser objects (default: false)

  • maxBodySize: The maximum request body size to analyze in characters (default: 1,000,000)

  • isClientsUseV2Format: Set to true to support the V2-style request structure (default: false)

  • tokenUserField: The JWT claim to use as username (default: "email")

  • oauthTokenInfoUrl (optional): The URL for OAuth token exchange

User information extraction#

The trinoRequestUser object attempts to extract user information from requests in the following order:

  1. X-Trino-User header

  2. Authorization: Basic header

  3. Authorization: Bearer header (requires OAuth2 configuration)

  4. Trino-UI-Token or __Secure-Trino-ID-Token cookie

To enable user information extraction, set the requestAnalyzerConfig.analyzeRequest property to true.

Available methods#

  • trinoRequestUser.getUser(): Returns an Optional<String> containing the extracted username, or an empty Optional if no user information could be extracted.

  • trinoRequestUser.getUserInfo(): Returns an Optional<UserInfo>, with an OpenID Connect UserInfo if a token is successfully exchanged with the oauthTokenInfoUrl, and an empty Optional otherwise.

  • trinoRequestUser.userExistsAndEquals("username"): Checks whether the given username matches the extracted user.
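
As a sketch of how these methods might appear in a rules file, the rules below route one named user and one email domain to dedicated groups. The user name, domain, group names, and rule names are illustrative assumptions, and both rules assume requestAnalyzerConfig.analyzeRequest is set to true. The second rule calls the standard java.util.Optional methods isPresent() and get() on the value returned by getUser().

```yaml
---
name: "single user"
description: "route one named user to a dedicated group"
# "alice" and "alice-dedicated" are hypothetical values
condition: 'trinoRequestUser.userExistsAndEquals("alice")'
actions:
  - 'result.put("routingGroup", "alice-dedicated")'
---
name: "analytics domain"
description: "route users whose extracted name ends with a given domain"
# Check isPresent() first so get() is never called on an empty Optional
condition: 'trinoRequestUser.getUser().isPresent() && trinoRequestUser.getUser().get().endsWith("@analytics.example.com")'
actions:
  - 'result.put("routingGroup", "analytics")'
```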

Query analysis#

The trinoQueryProperties object provides information about SQL queries.

Available methods#

  • errorMessage(): Returns error message if analysis failed

  • isNewQuerySubmission(): Returns true if request is a new query submission

  • getQueryType(): Returns statement class name (e.g., “ShowCreate”)

  • getResourceGroupQueryType(): Returns resource group query type (e.g., “SELECT”, “DATA_DEFINITION”)

  • getDefaultCatalog(): Returns default catalog if set

  • getDefaultSchema(): Returns default schema if set

  • getCatalogs(): Returns set of catalogs used in the query

  • getSchemas(): Returns set of schemas used in the query

  • getCatalogSchemas(): Returns set of qualified schemas (catalog.schema format)

  • tablesContains(String tableName): Returns true if query references the specified table

  • getTables(): Returns set of fully qualified table names
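
The methods above could be combined into rules along the following lines. This is a sketch under stated assumptions: it assumes requestAnalyzerConfig.analyzeRequest is set to true, that getResourceGroupQueryType() returns a String that can be compared with ==, and that tablesContains() accepts a fully qualified catalog.schema.table name. The table name, group names, and rule names are hypothetical.

```yaml
---
name: "ddl"
description: "route new data definition statements to a maintenance group"
condition: 'trinoQueryProperties.isNewQuerySubmission() && trinoQueryProperties.getResourceGroupQueryType() == "DATA_DEFINITION"'
actions:
  - 'result.put("routingGroup", "maintenance")'
---
name: "orders table"
description: "route queries that reference one table to a reporting group"
# hive.sales.orders is a hypothetical fully qualified table name
condition: 'trinoQueryProperties.tablesContains("hive.sales.orders")'
actions:
  - 'result.put("routingGroup", "reporting")'
```

Because the analysis is syntactic only, a rule like the second one would not match a query that reaches the table through a view.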

Limitations#

  • Only performs syntactic analysis

  • Does not expand views

  • Does not analyze view dependencies

  • Treats views and materialized views as tables