Kubernetes configuration examples#

The following sections show the default configuration and practical examples for common use cases of configuring Starburst Enterprise (SEP) in Kubernetes.

Adding the license file#

Starburst provides customers with a license file that unlocks additional features of SEP. Provide the license file to SEP in the cluster as follows:

1. Rename the file you received to starburstdata.license.

2. Create a k8s secret in the cluster that contains the license file, with a name of your choice.

kubectl create secret generic mylicense --from-file=starburstdata.license

3. Configure the secret name as the Starburst platform license.

starburstPlatformLicense: mylicense

Images and repository registry credentials examples#

Defaults#

The following are the image- and registry-related defaults. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

image:
  repository: "harbor.starburstdata.net/starburstdata/starburst-enterprise"
  tag: "475-e"
  pullPolicy: "IfNotPresent"

initImage:
  repository: "harbor.starburstdata.net/starburstdata/starburst-enterprise-init"
  tag: "475.0.0"
  pullPolicy: "IfNotPresent"

registryCredentials:
  enabled: false
  # Replace this with Docker Registry that you use
  registry:
  username:
  password:

imagePullSecrets:

Controlling SEP releases#

In normal usage you do not need to configure the image, since each chart version update automatically includes the version update of the Docker images. In rare cases, it can be useful to update the Docker image without changing the chart version in use. For example, you can upgrade from 3xx-e.1 to a newer patch version 3xx-e.3 of SEP, while keeping the rest of the chart configuration unchanged:

image:
  repository: "harbor.starburstdata.net/starburstdata"
  tag: "3xx-e.3"
  pullPolicy: "IfNotPresent"

Using private registries#

In some organizations you need to use private registries and repositories instead of Starburst Harbor. They are often hosted on a private Harbor instance or in a repository manager. You can publish the Helm charts and Docker containers to your private setup, or use a proxying setup. Steps to set this up vary widely based on your tools, and require both Docker and Helm expertise:

Pull the Docker image from the Starburst Harbor registry with your credentials
Tag the image as desired for your internal registry
Push the image to your registry
Download the Helm charts
Publish the Helm charts to your Helm repository

You can use your private setup with the following steps:

Add your private Helm chart repository using the same steps as for adding the Starburst repository, but replace the values with those of your private repository.
Update your registry credentials with details for your private Docker registry.

The following example overrides the Docker image repository to use your private registry:

image:
  repository: "docker.example.com/thirdparty"
  tag: "475-e"
  pullPolicy: "IfNotPresent"

You also need to update your registry access configuration:

registryCredentials:
  enabled: true
  registry: docker.example.com
  username: myusername
  password: mypassword

You can also use imagePullSecrets: with private registries instead of registryCredentials.

If you changed the Docker image organization, name, or version tags, you also need to override these details in your YAML configuration files. For example, update image and initImage for SEP. Similar steps are necessary if you are using the HMS or Ranger charts.
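
As a minimal sketch of such an override, assuming you pushed both images to the private registry docker.example.com under a thirdparty organization and kept the original image names and tags, the values file could set both nodes. The repository paths are hypothetical and depend on how you tagged the images:

```yaml
# Hypothetical private registry paths; adjust them to where you pushed the images.
image:
  repository: "docker.example.com/thirdparty/starburst-enterprise"
  tag: "475-e"
initImage:
  repository: "docker.example.com/thirdparty/starburst-enterprise-init"
  tag: "475.0.0"
```

Only the values that differ from the defaults are listed, in line with the advice not to place unchanged values in customization files.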

Using `imagePullSecrets`#

You can use imagePullSecrets: to authenticate with a Docker registry as an alternative to using registryCredentials:. You can pass a list of names of Kubernetes secrets of type kubernetes.io/dockerconfigjson in the following format:

imagePullSecrets:
 - name: secret1
 - name: secret2

Detailed instructions for using private registries with pull secrets can be found in the Kubernetes documentation.
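
For orientation, a secret of type kubernetes.io/dockerconfigjson generally has the following shape. This is a sketch of a standard Kubernetes manifest, not something the SEP chart creates for you. The name secret1 matches the example above, and the data value is a placeholder for the base64-encoded content of your own Docker config.json file:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret1
type: kubernetes.io/dockerconfigjson
data:
  # Placeholder: base64-encoded Docker config.json holding your registry credentials
  .dockerconfigjson: <base64_encoded_docker_config_json>
```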

Internal communications configuration#

Defaults#

The following are the internal communications-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

sharedSecret:

environment:

internalTls: false

internal:
  ports:
    http:
      port: 8080
    https:
      port: 8443

Using TLS for internal communication#

You can optionally enable TLS for internal communication, if the cluster is deemed insecure, or TLS is otherwise required and the performance overhead is acceptable.

Configuring automatic internal TLS#

Set internalTls to true to enable TLS for internal communication. Certificates are generated and used automatically. To enable this feature, you must also configure the environment: and sharedSecret: top-level nodes, as well as internal.ports.https.port:.

environment: production
sharedSecret: AN0Qhhw9PsZmEgEXAMPLEkIj3AJZ5/Mnyy5iRANDOMceM+SSV+APSTiSTRING
internalTls: true
internal:
  ports:
    http:
      port: 8080
    https:
      port: 8443

Manual TLS configuration#

Warning

We strongly recommend that you use automatic internal TLS as described in the preceding section. Manual TLS configuration is deprecated functionality. Configuring TLS for internal communication manually is very complex, and requires you to implement a certificate manager and manage certificates within the cluster.

All cluster nodes must have a fully qualified domain name (FQDN) that matches the Kubernetes naming scheme. When node.internal-address-source is set to FQDN (in both the coordinator.additionalProperties: and worker.additionalProperties: nodes), the chart manages the node.internal-address property automatically, and the SAN field in the TLS certificates must match.

The TLS certificates used must have starburst, coordinator.<namespace>.svc and *.worker.<namespace>.svc in the Subject Alternative Name (SAN) field. Replace <namespace> with a real value.

coordinator:
  additionalProperties: |
    node.internal-address-source=FQDN

worker:
  additionalProperties: |
    node.internal-address-source=FQDN

Non-standard HTTPS port numbers#

If a non-standard port (other than 8443) is used for HTTPS, the same value must be set for both internal.ports.https.port in the YAML file and the http-server.https.port property in config.properties. This is best achieved by adding the setting to the additionalProperties for coordinator and workers as detailed in Using TLS for internal communication.

internal:
  ports:
    https:
      port: 8440
coordinator:
  additionalProperties: |
    http-server.https.port=8440
worker:
  additionalProperties: |
    http-server.https.port=8440

External communications configuration#

Defaults#

The following are the external communications-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

expose:
  type: "clusterIp"
  clusterIp:
    name: "starburst"
    ports:
      http:
        port: 8080
  nodePort:
    name: "starburst"
    ports:
      http:
        port: 8080
        nodePort: 30080
    extraLabels: {}
  loadBalancer:
    name: "starburst"
    IP: ""
    ports:
      http:
        port: 8080
    annotations: {}
    sourceRanges: []
  ingress:
    ingressName: "coordinator-ingress"
    serviceName: "starburst"
    servicePort: 8080
    ingressClassName:
    tls:
      enabled: true
      secretName:
    host:
    path: "/"
    annotations: {}

`clusterIp` type#

expose:
  type: "clusterIp"
  clusterIp:
    name: "starburst"
    ports:
      http:
        port: 8080

`nodePort` type#

expose:
  type: "nodePort"
  nodePort:
    name: "starburst"
    ports:
      http:
        port: 8080
        nodePort: 30080

`loadBalancer` type#

expose:
  type: "loadBalancer"
  loadBalancer:
    name: "starburst"
    IP: ""
    ports:
      http:
        port: 8080
    annotations: {}
    sourceRanges: []

Basic `ingress` type#

expose:
  type: "ingress"
  ingress:
    serviceName: "starburst"
    servicePort: 8080
    tls:
      enabled: true
      secretName:
    host:
    path: "/"
    annotations: {}

`ingress` with nginx and cert-manager#

nginx is a powerful HTTP and proxy server, commonly used as a load balancer. You can combine it with cert-manager backed by Let’s Encrypt.

As a first step you need to deploy an HTTPS ingress controller for your cluster. You can follow a tutorial from the cert-manager documentation.

With the setup done, and an A record in your DNS zone ready, you can expose the Starburst Enterprise web UI:

expose:
  type: "ingress"
  ingress:
    serviceName: "starburst"
    servicePort: 8080
    tls:
      enabled: true
      secretName: "tls-secret-starburst"
    host: ""
    path: "/(.*)"
    annotations:
      kubernetes.io/ingress.class: "nginx"
      cert-manager.io/issuer: "letsencrypt-staging"

The secretName is used by the cert-manager to store the generated certificate, and can be any value.

The annotations section uses nginx, the default ingress class value for a single ingress controller installation. It also assumes that a certificate issuer named letsencrypt-staging exists.
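
For reference, an issuer with that name could look roughly like the following cert-manager manifest. This is a sketch based on the cert-manager ACME tutorial, not part of the SEP chart. The email address is a placeholder, and the solver settings assume an nginx ingress class and a recent cert-manager version:

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # Let's Encrypt staging endpoint, suitable for testing
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Placeholder contact address for certificate notices
    email: user@example.com
    # Secret in which cert-manager stores the ACME account key
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```

An Issuer is namespaced, so it needs to exist in the same namespace as the ingress that references it through the cert-manager.io/issuer annotation.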

The Ranger user interface can be exposed in exactly the same way:

expose:
  type: "ingress"
  ingress:
    tls:
      enabled: true
      secretName: "tls-secret-ranger"
    host: ""
    path: "/(.*)"
    annotations:
      kubernetes.io/ingress.class: "nginx"
      cert-manager.io/issuer: "letsencrypt-staging"

Coordinator configuration#

Defaults#

The following are the coordinator-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

coordinator:
  etcFiles:
    jvm.config: |
      -server
      -Xmx16G
      -XX:InitialRAMPercentage=80
      -XX:MaxRAMPercentage=80
      -XX:G1HeapRegionSize=32M
      -XX:+ExplicitGCInvokesConcurrent
      -XX:+ExitOnOutOfMemoryError
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:-OmitStackTraceInFastThrow
      -XX:ReservedCodeCacheSize=512M
      -XX:PerMethodRecompilationCutoff=10000
      -XX:PerBytecodeRecompilationCutoff=10000
      -Djdk.attach.allowAttachSelf=true
      -Djdk.nio.maxCachedBufferSize=2000000
      -Dfile.encoding=UTF-8
      # Allow loading dynamic agent used by JOL
      -XX:+EnableDynamicAgentLoading
    properties:
      config.properties: |
        coordinator=true
        node-scheduler.include-coordinator=false
        http-server.http.port=8080
        discovery-server.enabled=true
        discovery.uri=http://localhost:8080
      node.properties: |
        node.environment={{include "starburst.environment" .}}
        node.data-dir=/data/starburst
        plugin.dir=/usr/lib/starburst/plugin
        node.server-log-file=/var/log/starburst/server.log
        node.launcher-log-file=/var/log/starburst/launcher.log
      log.properties: |
        # Enable verbose logging from Starburst Enterprise
        #io.trino=DEBUG
        #com.starburstdata.presto=DEBUG
      password-authenticator.properties: |
        password-authenticator.name=file
        file.password-file=/usr/lib/starburst/etc/auth/password.db
      access-control.properties:
    other: {}
  resources:
    memory: "60Gi"
    cpu: 16
  nodeMemoryHeadroom: "2Gi"
  heapSizePercentage: 90
  heapHeadroomPercentage: 30
  additionalProperties: ""

  envFrom: []
  nodeSelector: {}
  affinity: {}
  tolerations: []
  deploymentAnnotations: {}
  podAnnotations: {}
  priorityClassName:

JVM configuration#

The JVM configuration is included automatically, with memory settings derived from the configured resources. In rare cases you might need to add or modify some parameters. In this case you need to include the full default JVM configuration along with the modified values. The following example only modifies G1HeapRegionSize and ReservedCodeCacheSize, but all values must be included.

coordinator:
  etcFiles:
    jvm.config: |
      -server
      -Xmx16G
      -XX:InitialRAMPercentage=80
      -XX:MaxRAMPercentage=80
      -XX:G1HeapRegionSize=32M
      -XX:+ExplicitGCInvokesConcurrent
      -XX:+ExitOnOutOfMemoryError
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:-OmitStackTraceInFastThrow
      -XX:ReservedCodeCacheSize=512M
      -XX:PerMethodRecompilationCutoff=10000
      -XX:PerBytecodeRecompilationCutoff=10000
      -Djdk.attach.allowAttachSelf=true
      -Djdk.nio.maxCachedBufferSize=2000000
      -Dfile.encoding=UTF-8
      # Allow loading dynamic agent used by JOL
      -XX:+EnableDynamicAgentLoading

The same settings need to be applied to the worker configuration.

Adding other files in `etc`#

You can add any other required configuration file to the etc folder by adding a node with the desired filename in the other section, and using YAML multi-line sections to define the content. The following example adds the two files etc/resource-groups.json and etc/kafka/tpch.customer.json:

coordinator:
  etcFiles:
    other:
      resource-groups.json: |
          {
          <<json_here>>
          }
      kafka/tpch.customer.json: |
          {
          <<json_here>>
          }

Logging#

Writing log output to files is not enabled in k8s by default. The logs are written to stdout and stderr streams. Kubernetes-native logging tools should be used for log analysis and troubleshooting.

In rare cases you can temporarily enable file logging by setting the log.path property in additionalProperties, and configuring log levels in log.properties to your desired settings. For more information, see Logging properties.

The package structure of the source code determines the package to use for logging specific plugins or functionality. The following example shows how to debug the plugin codebase of the Iceberg connector.

coordinator:
  additionalProperties: |
    log.path=/var/log/starburst/server.log
  etcFiles:
    properties:
      log.properties: |
        io.trino.plugin.iceberg=DEBUG

CPU and memory allocation#

You can configure the desired CPU and memory allocation to use for coordinator and worker pods. The parameters affect the requested pod size via k8s resource management settings. The settings are also used to determine the memory settings for the JVM running SEP.

coordinator:
  resources:
    memory: "256Gi"
    cpu: 32
  nodeMemoryHeadroom: "4Gi"

By default, SEP uses the same values for resource requests and limits. You can specify different settings for requests and limits using the same syntax that k8s uses for pods.

For example, you can make sure CPU limits are not specified, in order to bypass a known issue with k8s throttling:

coordinator:
  resources:
    requests:
      memory: "256Gi"
      cpu: 32
    limits:
      memory: "256Gi"

The query.max-memory property is set to 1PB. This setting overrides the low default value.

Adding properties to `config.properties`#

You can set additional properties to change the default configuration file and override the default values. For example, you can set client and query timeout values:

additionalProperties: |
  query.client.timeout=5m
  query.min-expire-age=30m
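
In a values file, the additionalProperties node sits under the coordinator node, as shown in the coordinator defaults. A minimal sketch of the same timeout settings in that context follows. Whether the workers need the same properties depends on the property, so that part is left to you:

```yaml
coordinator:
  # Lines in this block are appended to the coordinator's config.properties
  additionalProperties: |
    query.client.timeout=5m
    query.min-expire-age=30m
```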

Using environment variables for secrets#

envFrom:
  - secretRef:
      name: <<secret_name>>

To use secrets for sensitive credentials in a catalog properties file:

1. Create a secret holding variables. You can statically create secrets using base64-encoded values of your configuration. Make sure each secret key, which defines the environment variable name, matches the regex pattern [a-zA-Z][a-zA-Z0-9_]*, so that only alphanumeric characters and underscores are used. By convention, use all caps and underscores, such as PSQL_USERNAME.

$ echo -n user | base64
$ echo -n pass | base64

apiVersion: v1
kind: Secret
metadata:
  name: variables-secret
type: Opaque
data:
  PSQL_USERNAME: <base64_encoded_user>
  PSQL_PASSWORD: <base64_encoded_pass>

2. Add the secret reference in envFrom for both coordinator and worker to make it accessible on all nodes:

envFrom:
  - secretRef:
      name: variables-secret

3. Reference variables in properties files using the built-in placeholder pattern enabled by the secrets support.

catalogs:
  postgresql: |
    connector.name=postgresql
    connection-url=jdbc:postgresql://postgresql:5432/postgres
    connection-password=${ENV:PSQL_PASSWORD}
    connection-user=${ENV:PSQL_USERNAME}
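
Because step 2 applies to both the coordinator and the workers, the envFrom entry is nested under each of those nodes in the values file. A minimal sketch, reusing the variables-secret name from step 1:

```yaml
coordinator:
  envFrom:
    - secretRef:
        name: variables-secret
worker:
  envFrom:
    - secretRef:
        name: variables-secret
```

With this in place, the ${ENV:PSQL_USERNAME} and ${ENV:PSQL_PASSWORD} placeholders in the catalog can be resolved on every node.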

Enabling the SEP backend service and Insights#

Note

The insights.jdbc.url, insights.jdbc.user and insights.jdbc.password configuration properties are part of the backend service and are required. Together, they enable a number of features besides Insights. Read more about it in our backend service topic.

Starburst Insights provides a visual overview of important metrics about your cluster as well as a query editor feature to write and run SQL queries. We strongly suggest reading about the options and capabilities of this tool.

Enable and configure Insights in the additionalProperties section of the coordinator configuration:

coordinator:
  additionalProperties: |
    insights.persistence-enabled=true
    insights.metrics-persistence-enabled=true
    insights.jdbc.url=jdbc:postgresql://<database hostname>:5432/starburstenterprise
    insights.jdbc.user=<user>
    insights.jdbc.password=<password>
    insights.authorized-users=<superuser>

This example activates persistence for the query history feature.

Our Insights documentation contains a complete description of all Insights configuration properties.

Enabling access control#

SEP has several options for implementing access control. You can add any desired content of the properties file inside the YAML file. For example, you can use the read-only System access control:

coordinator:
  etcFiles:
    properties:
      access-control.properties: |
        access-control.name=read-only

To implement Ranger, use the Apache Ranger Helm chart.

Other access control choices, such as file-based user access control, require configuration in one or more nodes of the SEP Helm chart.
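
As a sketch of what file-based access control spanning two nodes could look like, the following combines access-control.properties with a rules file added through the other section. The file name rules.json, its location under /usr/lib/starburst/etc, and the rules themselves are assumptions for illustration. The property names and rule format follow the Trino file-based access control documentation:

```yaml
coordinator:
  etcFiles:
    properties:
      access-control.properties: |
        access-control.name=file
        security.config-file=/usr/lib/starburst/etc/rules.json
    other:
      # Hypothetical rules: full access for admin, read-only tpch for everyone else
      rules.json: |
        {
          "catalogs": [
            {"user": "admin", "catalog": ".*", "allow": "all"},
            {"catalog": "tpch", "allow": "read-only"}
          ]
        }
```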

SEP also supports Privacera. Please refer to your Privacera user documentation to learn how to connect that service to your k8s cluster.

Also see: file-based user authentication example.

Worker configuration#

Defaults#

The following are the worker-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

worker:
  etcFiles:
    jvm.config: |
      -server
      -Xmx16G
      -XX:InitialRAMPercentage=80
      -XX:MaxRAMPercentage=80
      -XX:G1HeapRegionSize=32M
      -XX:+ExplicitGCInvokesConcurrent
      -XX:+ExitOnOutOfMemoryError
      -XX:+HeapDumpOnOutOfMemoryError
      -XX:-OmitStackTraceInFastThrow
      -XX:ReservedCodeCacheSize=512M
      -XX:PerMethodRecompilationCutoff=10000
      -XX:PerBytecodeRecompilationCutoff=10000
      -Djdk.attach.allowAttachSelf=true
      -Djdk.nio.maxCachedBufferSize=2000000
      -Dfile.encoding=UTF-8
      # Allow loading dynamic agent used by JOL
      -XX:+EnableDynamicAgentLoading
    properties:
      config.properties: |
        coordinator=false
        http-server.http.port=8080
        discovery.uri=http://{{include "starburst.service.name" .}}:8080
      node.properties: |
        node.environment={{include "starburst.environment" .}}
        node.data-dir=/data/starburst
        plugin.dir=/usr/lib/starburst/plugin
        node.server-log-file=/var/log/starburst/server.log
        node.launcher-log-file=/var/log/starburst/launcher.log
      log.properties: |
        # Enable verbose logging from Starburst Enterprise
        #io.trino=DEBUG
        #com.starburstdata.presto=DEBUG
    other: {}
  replicas: 2
  autoscaling:
    enabled: false
    minReplicas: 1
    maxReplicas: 100
    targetCPUUtilizationPercentage: 80
  deploymentTerminationGracePeriodSeconds: 300 # 5 minutes
  starburstWorkerShutdownGracePeriodSeconds: 120 # 2 minutes
  resources:
    memory: "100Gi"
    cpu: 16
  nodeMemoryHeadroom: "2Gi"
  heapSizePercentage: 90
  heapHeadroomPercentage: 30
  additionalProperties: ""
  envFrom: []
  nodeSelector: {}
  affinity: {}
  tolerations: []
  deploymentAnnotations: {}
  podAnnotations: {}
  priorityClassName:
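
The worker defaults above include an autoscaling node that is disabled by default. As a minimal sketch, assuming the chart scales the number of worker replicas between minReplicas and maxReplicas based on targetCPUUtilizationPercentage, as the key names suggest, an override could look like this. The replica counts are illustrative values, not recommendations:

```yaml
worker:
  autoscaling:
    enabled: true
    # Illustrative bounds; size them for your workload and cluster capacity
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 80
```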

Startup script#

Defaults#

The following are the startup script-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

initFile: ""
extraArguments: []

Using initFile#

initFile:
extraArguments:
extraSecret:
  name:
  file:

The following example shows how you can use initFile to run a custom init script on the coordinator and workers:

initFile: |
  #!/bin/bash
  echo "Custom init for $1 $2"
  exec /usr/lib/starburst/bin/run-starburst
extraArguments:
  - TEST_ARG

Output on the coordinator:

Custom init for coordinator TEST_ARG
<<starburst_logs>>

Output on a worker:

Custom init for worker TEST_ARG
<<starburst_logs>>

To retrieve and load drivers and other large binaries at startup, Starburst recommends using an init container.

Security considerations#

Defaults#

The following are the security-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

externalSecrets:
  enabled: false
  type: eso
  secretPrefix: external/
  eso:
    refreshInterval: 1m
    secretStoreRef:

userDatabase:
  enabled: false
  users:
    - username: admin
      password: thepassword

securityContext: {}

extraSecret:
  name:
  file:

External secret reference#

To configure SEP to work with LDAP as an external secret reference, first create a k8s secret holding the file:

$ kubectl create secret generic ldap-ca --from-file=ca.crt

When the secret is created, you can reference it in the configuration as follows:

coordinator:
  etcFiles:
    properties:
      password-authenticator.properties: |
        password-authenticator.name=ldap
        ldap.url=ldaps://ldap-server:636
        ldap.user-bind-pattern=uid=${USER},OU=America,DC=corp,DC=example,DC=com
        ldap.ssl-trust-certificate=secretRef:ldap-ca:ca.crt

This mounts the secret named ldap-ca at the path /mnt/secretRef/ldap-ca and replaces each secretRef:ldap-ca occurrence with the absolute path, resulting in the following configuration property setting:

ldap.ssl-trust-certificate=/mnt/secretRef/ldap-ca/ca.crt

Note

Specific secret values, such as passwords, can be passed into properties files using the envFrom parameters available for coordinator and worker.

Defining external secrets#

You can automatically mount external secrets, for example from the AWS Secrets Manager, using the secretRef or secretEnv notation.

externalSecrets:
  enabled: true # disabled by default
  type: eso
  secretPrefix: <<secret_name_prefix>>
  eso:
    refreshInterval: 1m
    secretStoreRef:

An example of this configuration:

Create AWS Secrets Manager secret:

$ aws secretsmanager create-secret --name external/starburst-http-server-port --secret-string 8888

Reference it from your configuration section in config.properties:

coordinator:
  etcFiles:
    properties:
      config.properties: |
        http-server.http.port=secretEnv:external/starburst-http-server-port

Configure the external secrets:

externalSecrets:
  enabled: true
  type: eso
  secretPrefix: external/
  eso:
    refreshInterval: 1m
    secretStoreRef:

This creates the following external secret manifest:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: external.starburst-http-server-port
spec:
  refreshInterval: 1m
  secretStoreRef: # values passed in externalSecrets.eso.secretStoreRef object
    name: my-store
    kind: ClusterSecretStore
  target:
    name: external.starburst-http-server-port
  data:
    - secretKey: EXTERNAL_STARBURST_HTTP_SERVER_PORT
      remoteRef:
        key: external/starburst-http-server-port

Additionally, the external secrets provider fetches secrets from AWS and creates a k8s secret:

apiVersion: v1
kind: Secret
metadata:
  name: external.starburst-http-server-port
type: Opaque
data:
  EXTERNAL_STARBURST_HTTP_SERVER_PORT: ODg4OA== # base64-encoded value 8888

The k8s secret is now bound to the container as the EXTERNAL_STARBURST_HTTP_SERVER_PORT environment variable. SEP config.properties is resolved to:

http-server.http.port=${ENV:EXTERNAL_STARBURST_HTTP_SERVER_PORT}

If you have a secret with multiple values, such as a JSON-formatted secret, you can reference the secret values independently.

For example, you may have a secret named external-starburst-creds-mysql that is structured like this in the AWS Secrets Manager:

{
  "MYSQL_USER": "user",
  "MYSQL_PASSWORD": "password"
}

The MYSQL_USER and MYSQL_PASSWORD keys can be referenced in the values.yaml file:

externalSecrets:
  enabled: true
  type: eso
  secretPrefix: external/
  eso:
    refreshInterval: 1m
    secretStoreRef:

catalogs:
  mysqldb: |-
    connector.name=mysql
    connection-url=jdbc:mysql://<<dns>>:3306
    connection-user=secretEnv:external-starburst-creds-mysql:MYSQL_USER
    connection-password=secretEnv:external-starburst-creds-mysql:MYSQL_PASSWORD

File-based authentication#

Using the htpasswd-generated file:

userDatabase:
  enabled: true
  name: password.db
  users:
    - username: admin
      password: thepassword

Using an externally-created user database:

userDatabase:
  enabled: false
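
With the built-in user database disabled, the password authenticator needs to point at your own file. One possible sketch, assuming you created a k8s secret named password-db that holds a key password.db (both names are hypothetical), uses the secretRef notation described under External secret reference:

```yaml
userDatabase:
  enabled: false
coordinator:
  etcFiles:
    properties:
      password-authenticator.properties: |
        password-authenticator.name=file
        # Hypothetical secret "password-db" with key "password.db", created beforehand
        file.password-file=secretRef:password-db:password.db
```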

RBAC-enabled clusters#

In the following example, a user steve is configured to work with SEP in a namespace called dev-sandbox:

Create RoleBinding steve-edit to bind the edit ClusterRole to the user in the specific namespace:

kubectl create rolebinding steve-edit \
  --clusterrole edit \
  --user steve \
  --namespace dev-sandbox
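
If you manage RBAC objects declaratively, the same binding can be expressed as a manifest. The following sketch is the standard Kubernetes equivalent of the command above, using the same names:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: steve-edit
  namespace: dev-sandbox
subjects:
  # The user that receives the permissions
  - kind: User
    name: steve
    apiGroup: rbac.authorization.k8s.io
roleRef:
  # The built-in edit ClusterRole, granted only within dev-sandbox
  kind: ClusterRole
  name: edit
  apiGroup: rbac.authorization.k8s.io
```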

If externalSecrets: are in use, then a role with the following permissions must be additionally bound to the user:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: external-secrets-edit
rules:
- apiGroups:
  - external-secrets.io
  resources:
  - externalsecrets
  - externalsecrets/status
  verbs:
  - create
  - get
  - list
  - watch
  - update
  - patch
  - delete

kubectl create rolebinding steve-external-secrets-edit \
  --clusterrole external-secrets-edit \
  --user steve \
  --namespace dev-sandbox

The Starburst Enterprise Helm chart does not provide functionality to use a service account for deployments. None of the containers deployed to the cluster needs access to the Kubernetes API.

Performance considerations#

Concurrent query defaults#

The following is the query: default in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

query:
  maxConcurrentQueries: 3

Concurrent query increase example#

query:
  maxConcurrentQueries: 5

Spilling defaults#

The following are the spilling-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

spilling:
  enabled: false
  volume:
    emptyDir: {}

Spilling example#

spilling:
  enabled: true
  volume:
    emptyDir: {}

Hive connector storage caching defaults#

The following are the storage caching-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

cache:
  enabled: false
  diskUsagePercentage: 80
  ttl: "7d"
  volume:
    emptyDir: {}

Hive connector storage caching example#

cache:
  enabled: true
  diskUsagePercentage: 75
  ttl: "5d"

Catalogs#

Default#

The following is the default catalog entry in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

catalogs:
  tpch: |-
    connector.name=tpch
    tpch.splits-per-node=4

Catalog examples#

The following snippet adds the tpcds-testdata catalog. It uses the TPC-DS connector and only specifies the connector name.

catalogs:
  tpcds-testdata: |
    connector.name=tpcds

Multiple catalogs are configured one after the other:

catalogs:
  tpch-testdata: |
    connector.name=tpch
  tpcds-testdata: |
    connector.name=tpcds
  tmpmemory: |
    connector.name=memory
  metrics: |
    connector.name=jmx
  devnull: |
    connector.name=blackhole
  datalake: |
    connector.name=hive
    hive.metastore.uri=thrift://hive:9083
  s3: |
    connector.name=hive
    hive.metastore=glue

The name of each catalog is defined by the name chosen for its node within catalogs. The above example results in catalog names such as tpch-testdata, tpcds-testdata, tmpmemory, metrics, and others. These names are visible to CLI and other tool users with SHOW CATALOGS, and potentially in the user interface.

Each catalog properties file can use the configuration options supported by the connector designated by the configured connector.name.

Teradata Direct connector#

The Starburst Teradata Direct connector is supported for Kubernetes deployments in AWS EKS and in Azure AKS. Follow the detailed instructions to configure the necessary networking and components.

Warning

The configuration to use the Starburst Teradata Direct connector on Kubernetes is complex. You need significant Kubernetes and networking knowledge. Contact our Starburst Support team for assistance.

Additional volumes#

Default#

The following is the additionalVolumes: default in the values.yaml file. Do not place unchanged values in customization files:

additionalVolumes: []

Adding volumes examples#

additionalVolumes:
  - path: /mnt/InContainer
    volume:
      emptyDir: {}
  - path: /var/lib/starburst/cache1
    volume:
      hostPath:
        path: /media/nv1/starburst-cache
  - path: /var/lib/starburst/cache2
    volume:
      hostPath:
        path: /media/nv2/starburst-cache

Adding files examples#

<< Return to section in k8s configuration documentation.

To add a file to an existing location such as /usr/lib/starburst/plugin, store the file in a Kubernetes volume such as a ConfigMap, and mount it with a subPath at the target path:

additionalVolumes:
- path: /usr/lib/starburst/plugin/x.jar
  subPath: x.jar
  volume:
    configMap:
      name: "configmap-in-volume"

In this case, the key named x.jar from the ConfigMap is mounted as that file in the location provided in path.
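
A minimal sketch of a matching ConfigMap follows. It is a standard Kubernetes manifest, not part of the values file. Binary content such as a JAR file goes under binaryData as base64 text. The value shown is a placeholder, not real data. ConfigMaps are limited to 1 MiB, so this approach only suits small files.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Must match the configMap name referenced in additionalVolumes
  name: configmap-in-volume
binaryData:
  # The key must match subPath; replace the placeholder with the
  # base64-encoded contents of the file
  x.jar: "<base64-encoded file contents>"
```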

Add large binaries, such as drivers, at cluster start time with the top-level initFile: node.

Prometheus#

Default#

<< Return to section in k8s configuration documentation.

The following are the prometheus: defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

prometheus:
  enabled: true
  agent:
    version: "0.20.0"
    port: 8081
    config: "/etc/starburst/telemetry/prometheus.yaml"
  rules:
    - pattern: trino.execution<name=QueryManager><>(running_queries|queued_queries)
      name: $1
      attrNameSnakeCase: true
      type: GAUGE
    - pattern: 'trino.execution<name=QueryManager><>FailedQueries\.TotalCount'
      name: 'failed_queries'
      type: COUNTER
  serviceMonitor:
    enabled: true
    apiVersion: monitoring.coreos.com/v1
    labels:
      prometheus: kube-prometheus
    interval: "30s"
    coordinator: {}
    worker: {}

The following table explains the relevant YAML sections:

Prometheus YAML configuration sections#
YAML section	Purpose
`enabled`	Enables or disables Prometheus metrics integration.
`agent`	Specifies the version of the Prometheus agent used for metrics collection, the port on which the agent exposes metrics, and the path to the agent configuration file.
`rules`	A list of custom metric rules for Prometheus. Each rule can define a pattern, name, and type.
`serviceMonitor`	Defines configuration for creating a serviceMonitor resource to automatically discover and scrape metrics, including scrape interval and overriding or extending the default service monitor configuration for the coordinator and workers. You must install the Prometheus Operator in the cluster to use service monitors.
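
As a sketch, a customization file for a cluster that scrapes more frequently and uses a differently labeled Prometheus instance might look like the following. Only changed values are listed. The interval and the label value are illustrative assumptions.

```yaml
prometheus:
  serviceMonitor:
    # Scrape every 15 seconds instead of the default 30 seconds
    interval: "15s"
    labels:
      # Hypothetical value; it must match the selector of your Prometheus instance
      prometheus: my-prometheus
```

If the Prometheus Operator is not installed, set serviceMonitor.enabled to false instead. Helm replaces lists rather than merging them, so a customized rules list must repeat any default rules you want to keep.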

Kubernetes management and monitoring#

Default#

The following are the Kubernetes-related defaults in the values.yaml file. Do not place unchanged values in customization files. Instead, follow best practices for creating YAML files for customizing SEP:

readinessProbe:
  httpGet:
    scheme: HTTP
    path: /v1/readiness
    port: 8085
  timeoutSeconds: 10
  periodSeconds: 15
  failureThreshold: 3

livenessProbe:
  httpGet:
    scheme: HTTP
    path: /v1/info
    port: 8085
  timeoutSeconds: 10
  periodSeconds: 60
  failureThreshold: 3

commonLabels: {}

Note

The scheme setting of both probes supports HTTP and HTTPS.
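
As a sketch of tuning these values, the following override gives slow nodes more time before they are marked as not ready, and adds labels. With the defaults, a pod is considered not ready after 3 consecutive failed checks spaced 15 seconds apart. The threshold and the label names are illustrative assumptions, and the example assumes that the chart applies commonLabels to the resources it creates.

```yaml
readinessProbe:
  # Tolerate 6 consecutive failures; other probe fields keep their defaults
  failureThreshold: 6

commonLabels:
  # Hypothetical labels
  team: data-platform
  environment: staging
```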
