Release 429-e LTS (29 Nov 2023)#

Starburst Enterprise platform (SEP) 429-e LTS is the follow-up release to the 429-e STS release and the 423-e LTS release.

This release is a promotion of the original 429-e STS release in November 2023 into a long-term support (LTS) release.

The 429-e release includes all improvements from the following Trino releases:

It contains all improvements from the Starburst Enterprise releases since 423-e LTS:

Highlights since 423-e#

Breaking changes#

  • The SEP backend service has been updated to require PostgreSQL 12.0+ when using PostgreSQL as the underlying RDBMS.

  • MySQL TIMESTAMP is no longer mapped to Trino TIMESTAMP. It is now mapped to Trino TIMESTAMP WITH TIME ZONE. Depending on the query, mapping from MySQL TIMESTAMP to Trino TIMESTAMP may result in an error message.

  • Privileged access to the attached storage on nodes is no longer required for Starburst Warp Speed cluster configuration in Helm deployments. Existing cluster configurations must be updated, with EKS deployments requiring the addition of a bootstrap script. Review and follow the considerations for your platform in the Starburst Warp Speed documentation.

  • The deprecated.hive.metastore.glue-read-properties-based-column-statistics Hive Metastore configuration property and its underlying functionality have been removed. You must remove this configuration property or the cluster fails to start.

  • The updated base Docker image for SEP no longer includes curl, vi, nano, sed, awk, grep, and other popular command line tools. Starburst recommends using an init container with a base image that includes your needed command line tools. Guidance on using init containers and selecting suitable base images can be found in our init container documentation.

  • A new autoConfigure property was added to the Starburst Warp Speed Helm chart which defaults to false. Starburst Warp Speed deployments on AKS and GKE upgrading from 426-e that have already been reconfigured to use the filesystem instead of privileged mode must set this property to true or the filesystem is not created. Review and follow the migration guide for detailed instructions for your cloud platform.

  • SEP 427-e uses a base system image that does not contain a system-wide trust store. Trusted, self-signed certificates must now be added to the Java distribution CA certificates located under $JAVA_HOME/lib/security/cacerts.

  • The legacy parse-decimal-literals-as-double configuration property has been removed. Clusters that use this property must have it removed from configuration or the cluster does not start.

  • The following deprecated task writer configuration properties have been removed:

    • task.writer-count, replaced by task.min-writer-count.

    • task.partitioned-writer-count, replaced by task.max-writer-count.

    • task.scale-writers.max-writer-count, replaced by task.max-writer-count.

    • writer-min-size, replaced by writer-scaling-min-data-processed.

    You must remove these properties from the cluster configuration and use the replacement properties listed above, or the cluster does not start.

  • The Snowflake distributed connector is now deprecated and is planned to be removed in a future SEP release, in favor of the improved Snowflake parallel connector. Existing catalogs that use the Snowflake distributed connector must be migrated to the Snowflake parallel connector.

  • The RPM package service daemon script is now deprecated and is planned to be removed in a future SEP release. Configurations that rely on this script must be updated to use the systemctl daemon script instead.

  • As of the 429-e release, privileges allowing execution of table functions such as query must be qualified with a schema. Privileges for table function execution that are still not qualified with a schema result in an Access Denied error.

  • The legacy Parquet reader has been removed from the Hive, Hudi, Delta Lake, and Iceberg connectors. The parquet.optimized-reader.enabled and parquet.optimized-nested-reader.enabled catalog configuration properties must be removed from your catalog configurations or the cluster does not start.

  • The legacy Hive readers and writers are removed in Trino, as well as other deprecated features. The following catalog configuration properties and their respective session properties have been removed:

    • *.native-reader.enabled and *_native_reader_enabled

    • *.native-writer.enabled and *_native_writer_enabled

    • hive.s3select* and s3_select_pushdown_enabled

    • hive.optimize-symlink-listing and optimize_symlink_listing

    You must remove these properties from your catalog configurations or the cluster does not start.

  • Trino 429 removed differentiation between function types in file-based access control rules. Any rules that rely on function type must be updated accordingly.
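
Several of the breaking changes above stop the cluster from starting if a removed property is still present. As a rough pre-upgrade aid, the following sketch scans the text of a properties file for the removed names listed in this section. It is not a Starburst tool. The function name find_removed, the sample file content, and the placeholder key fmt.native-reader.enabled are hypothetical, and the replacement names task.min-writer-count and task.max-writer-count are assumed to be the properties that the task writer list refers to.

```python
from fnmatch import fnmatchcase

# Removed properties named in the breaking changes above, mapped to the
# replacement to use (None means: delete the line, nothing replaces it).
# Keys may be fnmatch-style patterns, mirroring the wildcards in the notes.
# The task.min-writer-count / task.max-writer-count names are an assumption.
REMOVED = {
    "task.writer-count": "task.min-writer-count",
    "task.partitioned-writer-count": "task.max-writer-count",
    "task.scale-writers.max-writer-count": "task.max-writer-count",
    "writer-min-size": "writer-scaling-min-data-processed",
    "parse-decimal-literals-as-double": None,
    "parquet.optimized-reader.enabled": None,
    "parquet.optimized-nested-reader.enabled": None,
    "*.native-reader.enabled": None,
    "*.native-writer.enabled": None,
    "hive.s3select*": None,
    "hive.optimize-symlink-listing": None,
}


def find_removed(properties_text):
    """Return (line_number, key, replacement) for each removed property found."""
    findings = []
    for number, raw in enumerate(properties_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments in the Java properties format.
        if not line or line.startswith(("#", "!")):
            continue
        # Simplification: only the "key=value" separator is handled.
        key = line.split("=", 1)[0].strip()
        for pattern, replacement in REMOVED.items():
            if fnmatchcase(key, pattern):
                findings.append((number, key, replacement))
                break
    return findings


if __name__ == "__main__":
    # Hypothetical file content, for illustration only.
    sample = """\
# example properties fragment
task.writer-count=4
query.max-memory=50GB
fmt.native-reader.enabled=true
hive.s3select-pushdown.enabled=true
"""
    for number, key, replacement in find_removed(sample):
        action = f"replace with {replacement}" if replacement else "remove"
        print(f"line {number}: {key} -> {action}")
```

Running the check against every config.properties and catalog properties file before the upgrade gives a list of lines to edit. Session properties set by clients, such as s3_select_pushdown_enabled, are not covered by this sketch.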

429-e initial changes#

General#

  • Added support for publishing data products that contain decimal literals.

  • Updated usage metrics to upload data collected between previous upload and the coordinator shutdown or restart.

  • Fixed issue that prevented the Run and troubleshoot option in the query editor from working when built-in access control is enabled.

Security#

  • Added session logout to OAuth 2.0 providers when logging out from the SEP web UI.

  • Changed built-in functions to be qualified under the system.builtin schema. No access control privileges are necessary to grant access to these basic, non user-defined functions.

  • Fixed issue that prevented tables and columns inside information_schema from being displayed when built-in access control is used.

  • Fixed JavaScript policy evaluation in Ranger and Privacera.

Hive connector#

Delta Lake connector#

MongoDB connector#

Snowflake connector#

  • Added support for RENAME SCHEMA and RENAME TABLE when the snowflake.database-prefix-for-schema.enabled configuration property is set to true.

  • Updated connectors to use fully parallel mode by default for more query shapes.

SQL Server connector#

  • Added the sqlserver.database-prefix-for-schema.enabled catalog configuration property that allows SQL Server catalogs to access multiple databases.

429-e.0 changes (29 Nov 2023)#

  • Improved support for concurrent updates of table statistics in Glue.

  • Added masking for additional sensitive values in log files.

  • Added casting of char fields, if necessary, to varchar type in Hive view translations.

  • Added support for RENAME SCHEMA and RENAME TABLE when the snowflake.database-prefix-for-schema.enabled property is set to true.

  • Remediated CVE-2023-41900.

  • Fixed incorrect results for queries involving an aggregation in a correlated subquery.

  • Fixed incorrect results for queries involving ORDER BY and window functions with ordered frames.

  • Fixed launcher start command not working with default directories.

  • Fixed possible JVM crash when reading short decimal columns in Parquet files created by Impala. Applies to the Hive, Hudi, Delta Lake, and Iceberg connectors.

  • Fixed incorrect results when a query contains several != or NOT IN predicates in MongoDB catalogs.

429-e.1 changes (21 Dec 2023)#

Warning

This release upgrades Ranger to 2.4. To avoid a breaking change, follow these steps. Before upgrading to this version of SEP, record your current version of Ranger. For Kubernetes deployments, it is found in your starburst-ranger values.yaml file as admin.image.tag and usersync.image.tag. While upgrading SEP, before deploying the update to your cluster nodes, revert the Ranger 2.4 version tag back to the previous version.

  • Improved query planning time on Hive tables without statistics generated.

  • Fixed long query planning times for queries with many local exchanges.

  • Fixed query failure when reading the Parquet column index for timestamped columns in Hive, Delta Lake, Iceberg, and Hudi tables.

  • Fixed incorrect results for LIKE with some strings containing repeated substrings.

  • Fixed coordinator memory leak.

429-e.2 changes (18 Jan 2024)#

  • Fixed a potential issue with SEP inadvertently changing users’ passwords in Ranger when used with Ranger Admin 2.4.0.

  • Fixed incorrect results on Parquet files containing page indexes when the query has filters on multiple columns in Hive, Delta Lake, and Hudi tables.

  • Fixed an issue with the Run and troubleshoot option of the Run button writing to empty directories without the option being selected.

429-e.3 changes (14 Feb 2024)#

  • Fixed Teradata custom dates format.

  • Fixed query failure when reading array columns.

  • Fixed a bug where an entire directory is skipped from schema discovery if at least one file matched the excludePatterns option.

  • Fixed out-of-bound (OOB) telemetry null pointer exception in parallel Snowflake connector.

  • Fixed complex expression pushdown in the Redshift connector.

  • Fixed a bug where query history displayed queries of another user.

429-e.4 changes (11 Mar 2024)#

  • Updated Kubernetes external secret operator.

  • Fixed UI authentication for large authentication tokens.

  • Fixed incorrect results for DATETIMEOFFSET values before the year 1400.

  • Fixed query failure when using char types with the reverse() function.

  • Fixed potential incorrect results when using the ST_Centroid() and ST_Buffer() functions for tiny geometries.

  • Fixed schema, table, and function visibility in BIAC filtering.

  • Fixed a bug where column statistics created in SEP would not be visible in Hive when using CDP 7.

429-e.5 changes (28 Mar 2024)#

  • Added support for setting endpoint and region in STS clients in Lake Formation.

  • Added AWS endpoint configuration for Lake Formation client.

  • Fixed an issue which caused the sync_partition_metadata operation to fail when partition paths had case changes.

  • Restored support for SymlinkTextInputFormat for text formats.

  • Fixed reading Delta Lake files with encoded characters on Azure.

  • Fixed failure when reading certain Avro data with UNION data types.

429-e.6 changes (17 Apr 2024)#

  • Enabled PyStarburst dataframe API by default.

  • Fixed possible worker crashes when running aggregation queries due to out-of-memory error.

  • Fixed incorrect results when querying a table being modified concurrently.

  • Fixed handling of union options in Hive and Avro to allow coercion to a single type.

  • Fixed a bug that caused the creation of materialized views to fail when using MySQL as the cache service backend database if materialized_view_definitions is longer than 64K characters.

429-e.7 changes (20 May 2024)#

  • Fixed potential query failure due to worker nodes running out of memory in concurrent scenarios.

  • Fixed incorrect result with deletion vector on Delta partitioned table.

  • Fixed correctness bug in constant literal distinct aggregation.

  • Fixed Prometheus whiteListObjectNames being overwritten when KEDA is enabled.

429-e.8 changes (14 Jun 2024)#

  • Fixed potential failure when reading ORC files larger than 2GB.

  • Fixed startup failure when fault-tolerant execution is enabled with Google Cloud Storage exchange.

  • Fixed potential loss of a query completion event when multiple queries fail at the same time.

  • Backported IMDSv2 service metadata access.

429-e.9 changes (28 Jun 2024)#

  • Fixed incorrect results when specifying a value for the cassandra.partition-size-for-batch-select configuration property.

  • Fixed failure when writing to tables with Iceberg VARBINARY values.

  • Fixed a correctness issue on receivers refresh that could cause queries to hang.

429-e.10 changes (11 Jul 2024)#

  • Added encoding to error code in OAuth2 callback handler.

  • Fixed reading empty files from S3 and GCS.

  • Fixed issue syncing partition metadata which could cause data deletion.

429-e.11 changes (29 Jul 2024)#

  • Fixed bug preventing use of Starburst security in Delta Lake connector.

429-e.12 changes (14 Aug 2024)#

  • Fixed optimizer timeout for certain queries involving aggregations and CASE expressions.

  • Fixed failure when adding new columns with a decimal type.

  • Fixed failure to read Hive tables migrated to Iceberg with Apache Spark.

  • Fixed issue that caused the error ‘Multiple masks on a single column are not supported’ to occur unintentionally.

429-e.13 changes (30 Aug 2024)#

  • Fixed query failure when file-based network topology is configured with the node-scheduler.network-topology.file configuration property.

429-e.14 changes (13 Sep 2024)#

  • Fixed a bug that caused cluster metrics to be created with incorrect intervals and subsequently led to loss of cluster metrics data.

  • Fixed the Run and troubleshoot feature when the insights.authorized-groups configuration property contains authorized groups.

  • Fixed numeric overflow during managed statistics computation for large tables in Teradata mode session.

429-e.15 was skipped.

429-e.16 changes (18 Oct 2024)#

  • Fixed OpenX JSON decoding a JSON array line that resulted in data being written to the wrong output column.

  • Fixed reading large Prometheus responses.

  • Fixed failures for count(*) queries with predicates containing non-ASCII strings. Applies to the Elasticsearch connector.

429-e.17 was skipped.

429-e.18 changes (4 Nov 2024)#

  • Changed the sync_partition_metadata procedure to use the value of the hive.metastore.partition-batch-size.max configuration property. The default batch size is changed from 1000 to 100.
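
To give a feel for what the smaller default means, the sketch below splits a list of partitions into batches of a given maximum size and counts them. It only illustrates generic batching and makes no claim about how SEP implements sync_partition_metadata. The helper name batches and the partition names are hypothetical.

```python
def batches(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical partition names, for illustration only.
    partitions = [f"ds=day-{n}" for n in range(2500)]
    for size in (1000, 100):
        count = sum(1 for _ in batches(partitions, size))
        print(f"batch size {size}: {count} batches")
```

With 2500 partitions, a batch size of 1000 yields 3 batches and a batch size of 100 yields 25. Smaller batches mean more round trips, with less data in each one.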

429-e.19 changes (14 Nov 2024)#

  • Fixed memory leak in InMemoryEventClient within cache service.