Starburst smart indexing and caching#

Starburst smart indexing and caching enables higher performance for catalogs using the Hive, Iceberg, or Delta Lake connectors by transparently adding an indexing and caching layer.

Note

Learn more about the high-level architecture and characteristics of smart indexing and caching before using it.

Note

Smart indexing and caching is a public preview feature. Contact Starburst support with questions or feedback.

Requirements#

To use smart indexing and caching, you need a supported object storage system and a suitable cluster configuration, as described in the following sections.

Supported object storage#

Smart indexing and caching only supports the following object storage systems:

  • Amazon S3

  • Azure Data Lake Storage (ADLS) Gen2

No other object storage systems are supported, including on-premises storage, Google Cloud Storage, and S3-compatible storage such as MinIO.

Cluster configuration#

Smart indexing and caching requires your cluster to operate on the Kubernetes-based platform with the recommended node sizes. These sizes provide the necessary resources in terms of CPU, memory, and storage. In particular, the included SSD storage is heavily used, and you must ensure that these devices are not used for any other purpose.

EKS node sizes:

  • r5d.4xlarge or larger

  • r5dn.4xlarge or larger

  • r6gd.4xlarge or larger

  • i3.4xlarge or larger

Create an EKS Managed Node Group with a specific size, and use it for all nodes in the cluster.
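
For example, a minimal sketch of creating such a node group with eksctl might look like the following; the cluster name, node group name, and node count are placeholders that you must adapt to your environment:

eksctl create nodegroup \
  --cluster my-sep-cluster \
  --name siac-nodes \
  --node-type r5d.4xlarge \
  --nodes 4 \
  --managed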

AKS node size:

  • Standard_L16s_v2 or larger with SSDs attached

Warning

Smart indexing and caching is not supported on a cluster running in Fault-tolerant execution mode.

Deployment and management are performed with the SEP Helm charts detailed in Deploying with Kubernetes and Configuring Starburst Enterprise in Kubernetes.

Add the following section in your values file for the specific cluster to enable smart indexing and caching:

siac:
  enabled: true

By default, smart indexing and caching is disabled. The recommended memory allocation is configured automatically, but can optionally be overridden:

siac:
  enabled: true
  additionalHeapSizePercentage: 15

Smart indexing and caching requires privileged access to the attached storage on the nodes. Configure this for the coordinator and all workers.

Coordinator configuration for privileged access:

coordinator:
  securityContext:
    privileged: true

Worker configuration for privileged access:

worker:
  securityContext:
    privileged: true

As a last step, ensure that you use a dedicated coordinator that is not scheduled for query processing, and adjust the query processing configuration to allow for more splits:

coordinator:
  additionalProperties: |
    ...
    node-scheduler.include-coordinator=false
    node-scheduler.max-splits-per-node=4096
    node-scheduler.max-unacknowledged-splits-per-task=1024
    node-scheduler.max-adjusted-pending-splits-per-task=1024

Use Helm to update the values and restart the cluster nodes. Confirm the cluster is operating correctly with the new configuration, but without any adjusted catalogs, and then proceed to configure catalogs.
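
For example, assuming a Helm release named my-sep, the starburstdata chart repository alias, and a values file named sep-values.yaml (all placeholder names), the update might look like the following sketch:

helm upgrade my-sep starburstdata/starburst-enterprise \
  --values ./sep-values.yaml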

Related to catalog usage, the cluster must allow internal communication between all workers, as well as with the coordinator, on all the HTTP ports configured by the http-rest-port values in your catalogs.

When starting the cluster, smart indexing and caching parses all configuration parameters and can emit spurious warnings such as Configuration property 'cache-service.password' was not used. You can safely ignore these warnings.

Catalog configuration#

After a successful Cluster configuration, you can configure the desired catalog to use smart indexing and caching.

Only catalogs using the Hive, Iceberg, or Delta Lake connectors can be accelerated:

  • connector.name=hive

  • connector.name=iceberg

  • connector.name=delta-lake

For more details, see Delta Lake considerations, Iceberg considerations, and Hive considerations.

Only catalogs backed by S3 and ADLS object storage are supported. For more details, see S3 considerations and ADLS considerations.

For example, update the example catalog that uses the Hive connector and AWS Glue as the metastore in the values file:

catalogs:
  example: |
    connector.name=hive
    hive.metastore=glue
    ...

Enable smart indexing and caching on the catalog by updating the connector name to siac and adding the required configuration properties:

catalogs:
  example: |
    connector.name=siac
    siac.proxied-connector=HIVE
    siac.store.path=s3://example/path/
    siac.cluster-uuid=example-cluster-567891234567
    siac.workerdb.hibernate.connection.url=jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db
    http-rest-port=8088
    hive.metastore=glue
    ...

For testing purposes, or for permanent use of a new catalog name (for example, faster) in parallel to the existing catalog, you can copy the configuration of a catalog and update it:

catalogs:
  example: |
    connector.name=hive
    hive.metastore=glue
    ...
  faster: |
    connector.name=siac
    siac.proxied-connector=HIVE
    siac.store.path=s3://example/path/
    siac.cluster-uuid=example-cluster-567891234567
    siac.workerdb.hibernate.connection.url=jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db
    http-rest-port=8088
    hive.metastore=glue
    ...

This allows you to query the same data with or without smart indexing and caching using different catalog names. However, existing scripts and statements that include the old catalog name example are not accelerated.
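
For example, with a hypothetical schema and table named sales.orders, the same data can be queried through either catalog:

-- served by the original catalog without acceleration
SELECT count(*) FROM example.sales.orders;

-- served by the new catalog with smart indexing and caching
SELECT count(*) FROM faster.sales.orders;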

Catalog configuration properties#

The following list describes the available catalog configuration properties:

connector.name
    Required. Must be set to siac.

siac.proxied-connector
    Required. The type of embedded connector used for accessing cold data through the smart indexing and caching connector. Defaults to HIVE. Valid values are HIVE, ICEBERG, and DELTA_LAKE. All properties supported by these connectors can be used to configure the catalog.

siac.store.path
    Required. The path of the object store where call-home data, metadata, and import-export data is managed. Must use s3://, abfs://, or abfss:// as the protocol. Write access privileges are necessary.

siac.cluster-uuid
    Required. Unique identifier of the cluster. Used as the folder name in the store path. If a new cluster is created with the same siac.store.path and siac.cluster-uuid, the cluster warmup rules for index and cache creation are imported into the newly created cluster.

siac.workerdb.hibernate.connection.url
    Required. JDBC connection string for the internal database. Must be set to jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db. Include [CATALOG_NAME] verbatim in the string, including the brackets. Do not replace the CATALOG_NAME text, because the catalog directory name is generated for you. Write access privileges are necessary.

http-rest-port
    Required if using a unique port for every catalog. The port used to run the REST server for smart indexing and caching configuration. The port must be open across the cluster. You can have each catalog use a different port value, or use a single port for all catalogs. For details, see Indexing and caching management. Defaults to 8088.

siac.config.http-rest-server-enabled
    Enables the REST server for smart indexing and caching. Defaults to true, which uses a unique port for each catalog. To use a single port for all requests, remove http-rest-port from your configuration and set siac.config.http-rest-server-enabled to false. For details, see Indexing and caching management.

siac.call-home.enable
    Enables pushing logs and metadata to the object store path configured in siac.store.path. Defaults to true.

siac.config.azure.connection-string
    Required when using ADLS for storage. Authorizes access to data in an Azure Storage account. For more information, see configure a connection string for an Azure storage account and copy the connection string.

Hive considerations#

All configurations of the Hive connector are supported. Additionally, the following considerations apply when using the Hive connector as the proxied connector for smart indexing and caching:

For optimal performance, add the following properties to your catalog configuration:

catalogs:
  example: |
    ...
    hive.max-outstanding-splits-size=512MB
    hive.max-initial-splits=0
    hive.max-outstanding-splits=2000
    hive.max-split-size=64MB
    ...

Iceberg considerations#

All configurations of the Iceberg connector are supported. Additionally, the following considerations apply when using the Iceberg connector as the proxied connector for smart indexing and caching:

  • Materialized views are not supported.

  • An associated split is served from object storage and no acceleration occurs when:

Delta Lake considerations#

All configurations of the Delta Lake connector are supported. Additionally, the following considerations apply when using the Delta Lake connector as the proxied connector for smart indexing and caching:

  • Materialized views are not supported.

  • An associated split is served from object storage and no acceleration occurs when:

S3 considerations#

Smart indexing and caching supports Amazon S3 with catalogs using the Hive, Iceberg, and Delta Lake connectors.

Using the s3:// protocol is required.

ADLS considerations#

Smart indexing and caching supports Microsoft ADLS Gen2. ADLS Gen1 is not supported.

Using the abfs:// or abfss:// protocol is required.

ADLS can be used with catalogs using the Hive and Delta Lake connectors with the following configuration properties to connect to Azure storage:

catalogs:
  faster: |
    ...
    siac.store.path=abfs://<container_name>@<account_name>.dfs.core.windows.net/folder
    siac.config.azure.connection-string=<connection_string>
    hive.azure.abfs-storage-account=<storage_account_name>
    hive.azure.abfs-access-key=xxx
    ...

In addition, siac.call-home.enable must be set to false. You can then secure the connection with TLS by using the abfss:// protocol in the siac.store.path URI.
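
For example, a sketch of the relevant catalog properties with call-home disabled and the abfss:// protocol, reusing the placeholders from the previous example, might look like the following:

catalogs:
  faster: |
    ...
    siac.store.path=abfss://<container_name>@<account_name>.dfs.core.windows.net/folder
    siac.config.azure.connection-string=<connection_string>
    siac.call-home.enable=false
    hive.azure.abfs-storage-account=<storage_account_name>
    hive.azure.abfs-access-key=xxx
    ...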

Indexing and caching management#

Smart indexing and caching automatically creates and manages its data based on processed queries; this behavior is also called the default acceleration.

Catalog-specific REST API access#

You can configure smart indexing and caching with the REST API exposed on the coordinator at the port configured with http-rest-port for each catalog. This port on the coordinator node must be exposed to the machine from which you call the REST API, which is typically outside the cluster. Every catalog requires a separate port, and therefore a separate REST API server. No authentication or authorization system is supported for API access.

The following example call accesses the row-group/row-group-count endpoint for the catalog configured for port 8088 on the cluster at sep.example.com via HTTP with a GET command.

curl -X GET "http://sep.example.com:8088/row-group/row-group-count" -H "accept: application/json"

Alternatively, you can use the curl installed on the coordinator, and execute commands against the row-group/row-group-count endpoint of the API directly in the coordinator container, using localhost and the HTTP port for the specific catalog:

kubectl exec <coordinator-pod> -c coordinator --  \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json"

The JSON-formatted response is returned to your machine, where you run kubectl. You can pipe the result into a locally installed jq command to make it easier to read.

kubectl exec <coordinator-pod> -c coordinator --  \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json" \
| jq

Alternatively, you can redirect the output to a file, and then view it with an editor or another tool that improves the JSON formatting.

kubectl exec <coordinator-pod> -c coordinator -- \
curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json" \
> result.json

Centralized REST API access#

You can configure smart indexing and caching with its REST API exposed on the coordinator in a separate context for each catalog on the same port and domain as the SEP web UI and the Trino REST API. Access to the REST API is controlled by the authentication-types system configured for SEP.

To use centralized REST API access, remove http-rest-port from your catalog configuration, and set siac.config.http-rest-server-enabled to false in all of your accelerated catalog configurations.
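
For example, adapting the faster catalog from the earlier example, the configuration might look like the following sketch; all other values are carried over unchanged:

catalogs:
  faster: |
    connector.name=siac
    siac.proxied-connector=HIVE
    siac.store.path=s3://example/path/
    siac.cluster-uuid=example-cluster-567891234567
    siac.workerdb.hibernate.connection.url=jdbc:hsqldb:file:/usr/lib/starburst/[CATALOG_NAME]/workerDB/db
    siac.config.http-rest-server-enabled=false
    hive.metastore=glue
    ...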

The endpoints path structure changes to accommodate the name of the catalog: /ext/{catalogName}/{paths: .+}. The following example shows the /ext path to the warmup/warmup-rule-set endpoint for the catalog named faster on a cluster without authentication exposed via HTTP:

curl -X GET "http://sep.example.com:8080/ext/faster/warmup/warmup-rule-set" -H "accept: application/json"

An authenticated server must be accessed via HTTPS and include authentication information:

curl -X GET "https://sep.example.com/ext/faster/warmup/warmup-rule-set" -H "accept: application/json"

REST API overview#

The following sections detail the REST API and available endpoints. The example calls use plain curl calls to the endpoints via HTTP on localhost. Adapt them to your usage by adding the kubectl invocation or updating the coordinator URL.

Warming status#

You can determine the status of the warmup for smart indexing and caching with a GET operation on the /warming/status endpoint. It measures the warmup progress for splits across workers, and reports whether warming is currently in progress.

curl -X GET "http://localhost:8088/warming/status" -H "accept: application/json"

Example response:

{"nodesStatus":
  {"172.31.16.98": {"started":22136,"finished":22136},
   "172.31.25.207":{"started":20702,"finished":20702},
   "172.31.19.167":{"started":21116,"finished":21116},
   "172.31.22.28":{"started":20678,"finished":20678}},
   "warming":false}

The response shows that warmup started and finished on four workers, and is currently not in progress.

Debug tools#

The debug-tools endpoint requires an HTTP POST with a JSON payload that specifies the command to run. You can use it to return the storage utilization:

curl -X POST "http://localhost:8088/debug-tools" -H "Content-Type: application/json" \
-d '{"commandName" : "all","@class" : "io.trino.plugin.varada.execution.debugtools.DebugToolData"}'

Example response:

{"coordinator-container":
  {"result":
    {"Storage_capacity":15000000,
     "Allocated 8k pages":1000000,
     "Num used stripes":0
    }
  }
}

Calculate the storage utilization percentage with (Allocated 8k pages / Storage_capacity) * 100.
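
For the example response above, the utilization is (1000000 / 15000000) * 100, or approximately 6.7 percent.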

Row group count#

A row group in smart indexing and caching is a collection of index and cache elements that are used to accelerate processing of Trino splits from the SSD storage.

A row group in smart indexing and caching is not equivalent to a Parquet row group or an ORC stripe, but a higher level artifact specific to smart indexing and caching. It can be related to a specific Parquet row group or ORC stripe but can also represent data from a whole file or more.

The row-group/row-group-count endpoint exposes all currently warmed up columns via an HTTP GET:

curl -X GET "http://localhost:8088/row-group/row-group-count" -H "accept: application/json"

The result is a list of columns with schema.table.column.warmuptype as the key. The value represents the corresponding count of accelerated row groups. Warmup types:

  • COL_WARM_UP_TYPE_DATA for data cache acceleration

  • COL_WARM_UP_TYPE_BASIC for index acceleration

  • COL_WARM_UP_TYPE_LUCENE for text search acceleration

In the following example, 20 row groups of the tripid column of the trips_data table in the trips schema are accelerated with a data cache and an index.

{
  "trips.trips_data.tripid.COL_WARM_UP_TYPE_DATA": 20,
  "trips.trips_data.tripid.COL_WARM_UP_TYPE_BASIC": 20
}

Create a warmup rule#

Use the warmup/warmup-rule-set endpoint with an HTTP POST and a JSON payload to create a new rule for a column of a table in a schema. This forces the creation of index and caching data for the column, independent of processed queries and the default acceleration.

curl -X POST -H "Content-Type: application/json" http://localhost:8088/warmup/warmup-rule-set \
-d '[ { "colNameId": "int_1", "schema": "tmp", "table": "aaa", "colWarmUpType": "COL_WARM_UP_TYPE_BASIC", "priority": 8, "ttl": "PT720H", "predicates": [ ] } ]'

Warmup rule properties#

colNameId
    Name of the column to which a warmup rule is attached.

schema
    Name of the schema that contains the specified table.

table
    Name of the table that contains the specified column.

predicates
    Defaults to all partitions. Use the JSON array syntax ["example1", "example2"] to limit the rule to specific partitions.

colWarmUpType
    The materialization type performed on the specified column in the specified table. Valid values are COL_WARM_UP_TYPE_DATA for data cache acceleration, COL_WARM_UP_TYPE_BASIC for index acceleration, and COL_WARM_UP_TYPE_LUCENE for text search acceleration.

priority
    Priority for the warmup rule. Valid range of values: -10 to 10. To ensure a column is accelerated even if storage capacity is exceeded, set the priority as high as 10. To ensure a column is never accelerated and prevent data cache or index creation, set the priority to -10.

ttl
    Duration for which the warmup rule remains active, specified in ISO-8601 duration format (for example, PT720H). Use PT0M to prevent expiration of the rule.
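
As another sketch, the following call creates a data cache rule with the highest priority that never expires. The column, schema, and table names are placeholders:

curl -X POST -H "Content-Type: application/json" http://localhost:8088/warmup/warmup-rule-set \
-d '[ { "colNameId": "company", "schema": "tmp", "table": "aaa", "colWarmUpType": "COL_WARM_UP_TYPE_DATA", "priority": 10, "ttl": "PT0M", "predicates": [ ] } ]'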

Get all warmup rules#

The warmup/warmup-rule-get endpoint exposes all defined warmup rules via an HTTP GET:

curl -X GET "http://localhost:8088/warmup/warmup-rule-get" -H "accept: application/json"

Example response:

{
  "id":186229827,
  "schema":"ride_sharing_dataset",
  "table":"trips_data_big",
  "colNameId":"d_date",
  "column":
    {
      "classType":"RegularColumn",
      "key":"d_date"
     },
  "colWarmUpType":"COL_WARM_UP_TYPE_BASIC",
  "priority":8.0,
  "ttl":2592000.000000000,
  "predicates":[]
}

Delete a warmup rule#

The warmup/warmup-rule-delete endpoint allows you to delete a warmup rule via an HTTP DELETE. The identifier for the rule is a required parameter; it appears as the id value in the result of warmup/warmup-rule-get.

curl -X DELETE "http://localhost:8088/warmup/warmup-rule-delete" -H "accept: application/json" -H "Content-Type: application/json" -d "[186229827]"

When you delete a warmup rule, the index and cache data for the column is de-prioritized compared to data from default acceleration, and is therefore subject to earlier deletion.

SQL support#

All SQL statements and functions supported by the connector used in the accelerated catalog are supported.

When functions are applied to columns in a filtering operation, the filtering is not accelerated for some functions. For example, the filtering operation in the following query is not accelerated:

SELECT count(*)
FROM catalog.schema.table
WHERE lower(company) = 'starburst';

In addition, the following functions are accelerated by indexing when used on the left or the right side of the predicate, as shown in the example after this list:

  • ceil(x) with real or double data type

  • is_nan(x) with real or double data type

  • cast(x as type) with double cast to real, or any type cast to varchar

  • day(d) and day_of_month(d) with date and timestamp data types

  • day_of_year(d) and doy(d) with date and timestamp data types

  • day_of_week(d) and dow(d) with date and timestamp data types

  • year(d) with date and timestamp data types

  • year_of_week(d) and yow(d) with date and timestamp data types

  • week(d) and week_of_year(d) with date and timestamp data types

  • LIKE, NOT LIKE with varchar data type

  • contains(arr_varchar, value) with array of varchar data type
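
For example, a filter such as the following, with hypothetical table and column names, can be accelerated by indexing:

SELECT count(*)
FROM catalog.schema.table
WHERE year(pickup_date) = 2023;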

The maximum supported string length for any data type used for caching and indexing is 50,000 characters.