Telemetry#
Starburst Enterprise platform (SEP) collects the following data about product performance and usage. This information is sent to Starburst and informs our development efforts. Starburst can share the data with you on a per-request basis:
Environmental data
Configuration log data
Metrics
Optional anonymized query logs
Security#
The collected data is sent to the following endpoints in a Protobuf compressed, binary format:
https://telemetry.eng.starburstdata.net/v1/metrics
https://telemetry.eng.starburstdata.net/v1/logs
All connections use TLS, so all transmitted data is encrypted in transit.
telemetry.eng.starburstdata.net uses the following static IP addresses:
99.83.235.49
75.2.114.217
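These static addresses can be used to allow outbound traffic in restricted network environments. To confirm that a cluster host can reach the endpoint, a minimal check such as the following sketch can be used; it assumes standard outbound HTTPS on port 443 and is not part of SEP itself:

import socket
import ssl

# Minimal connectivity check (illustration only): verify that a TLS connection
# can be established to the telemetry endpoint over standard HTTPS (port 443).
host = "telemetry.eng.starburstdata.net"
context = ssl.create_default_context()

with socket.create_connection((host, 443), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        print(f"Connected to {host} using {tls.version()}")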
Configuration#
Telemetry is enabled by default, with the exception of query logging. You must opt in to query logging by configuring it. All configuration properties listed here are set in the config properties file.
Property name | Description
---|---
| Set to
| Set to
| Frequency at which metrics are sent to Starburst. Defaults to
| Set to
| Set to
| Set to
telemetry.log-query-types | A comma-separated list of query types to be logged. Possible query types are
| Set to
| Set to
| Set to
| Set to
| Set to
| Set to
| Set to
| Duration to wait and batch more logs before sending them to Starburst. Defaults to
| Maximum number of log entries collected before sending them to Starburst. Defaults to
| Maximum size of the logs batch before sending it to Starburst. Defaults to
All the data collected is annotated with the environmental data described in the next section.
Enable query logging#
SEP can be configured to collect query statistics, input-output metadata, query text, query plan, and failure information for every executed query.
Query statistics and input-output metadata do not contain any sensitive information and only expose performance-related metrics. This data helps Starburst improve the user experience as well as query performance, so we generally suggest enabling this feature in all your deployments.
To enable these logs, use the following configuration:
telemetry.log-completed=true
telemetry.log-query-statistics=ANONYMIZED
telemetry.log-query-io-metadata=ANONYMIZED
The query text and query plan may contain sensitive information, such as values in WHERE clauses and other predicates. Logging this data is therefore an opt-in feature that must be specifically configured.
To configure SEP to collect the query statistics, input-output metadata, query text, query plan, and failure info of all completed SQL queries, use the following configuration:
telemetry.log-completed=true
telemetry.log-query-types=select
telemetry.log-query-statistics=ORIGINAL
telemetry.log-query-io-metadata=ORIGINAL
telemetry.log-query-plan=ORIGINAL
telemetry.log-query-text=ORIGINAL
telemetry.log-query-failure-info=ORIGINAL
Collected data#
With the exception of environmental and configuration log data, the collected data is based on completed queries, whether successful or not. Examples of this data are provided in the following sections.
Environmental data#
The environmental data describes the ownership, licensing and service information of every SEP cluster.
Key | Value
---|---
deployment.environment | Environment name defined in SEP.
license.hash | The hash of the license file, if present.
license.owner | Defined by the license.
license.type |
service.instance.id | A random UUID generated each time SEP starts.
service.name | Always set to starburst-enterprise.
service.version | The SEP version number of the cluster.
telemetry.sdk.language | Always set to java.
telemetry.sdk.name | Always set to opentelemetry.
telemetry.sdk.version | The version of the OpenTelemetry library used.
| The ISO8601 date and time when the coordinator last started.
The following is an example of the data collected that describes the SEP environment:
"resource":{
"attributes":[
{
"key":"deployment.environment",
"value":{
"string_value":"prod"
}
},
{
"key":"license.hash",
"value":{
"string_value":"5000eRAND0M967d0004a4eLICENSEa97b00006023dedeSTRING82460c8500055"
}
},
{
"key":"license.owner",
"value":{
"string_value":"Example Company"
}
},
{
"key":"license.type",
"value":{
"string_value":"JSON"
}
},
{
"key":"service.instance.id",
"value":{
"string_value":"6d35zzzz-2000-4628-zzzz-120000zzzzed"
}
},
{
"key":"service.name",
"value":{
"string_value":"starburst-enterprise"
}
},
{
"key":"service.version",
"value":{
"string_value":"prod"
}
},
{
"key":"telemetry.sdk.language",
"value":{
"string_value":"java"
}
},
{
"key":"telemetry.sdk.name",
"value":{
"string_value":"opentelemetry"
}
},
{
"key":"telemetry.sdk.version",
"value":{
"string_value":"1.6.0"
}
}
]
}
Configuration log data#
SEP collects configuration property names and a representation of the value. Boolean values are recorded as-is. Binary values are rounded to the nearest base-two magnitude. For example, 72 GB is recorded as 64 GB. Other numeric values, such as INTEGER and DOUBLE values, are rounded to the nearest order of magnitude. For example, 54,321 is rounded to 100,000. For most configuration properties, SEP does not record text values, only that they are set.
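As an illustration of these rounding rules, the following minimal sketch reproduces the two examples above. It assumes rounding on a logarithmic scale, which matches the examples; it is not SEP's actual implementation, and the function names are hypothetical:

import math

def round_to_power_of_two(value_bytes: float) -> float:
    """Round a binary (byte) value to the nearest power of two, e.g. 72 GB -> 64 GB."""
    if value_bytes <= 0:
        return 0
    return 2 ** round(math.log2(value_bytes))

def round_to_order_of_magnitude(value: float) -> float:
    """Round a numeric value to the nearest order of magnitude, e.g. 54,321 -> 100,000."""
    if value <= 0:
        return 0
    return 10 ** round(math.log10(value))

GB = 1024 ** 3
print(round_to_power_of_two(72 * GB) / GB)   # 64.0
print(round_to_order_of_magnitude(54_321))   # 100000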
Text values are recorded for the following configuration properties:
access-control.name
connector.name
delta.security
hive.security
http-server.authentication.oauth2.issuer
http-server.authentication.type
iceberg.security
password-authenticator.name
retry-policy
warp-speed.proxied-connector
web-ui.authentication.type
The following JSON snippet is an example of the data collected that describes SEP configuration properties:
"logs": [
{
"time_unix_nano": "1637193575209705000",
"severity_number": "SEVERITY_NUMBER_INFO",
"name": "bootstrap",
"body": {
"string_value": ""
},
"attributes": [
{
"key": "propertyName",
"value": {
"string_value": "cache-service.cache-ttl"
}
},
{
"key": "propertyValue",
"value": {
"string_value": "0.00ns"
}
}
]
},
{
"time_unix_nano": "1637193575223536000",
"severity_number": "SEVERITY_NUMBER_INFO",
"name": "bootstrap",
"body": {
"string_value": ""
},
"attributes": [
{
"key": "propertyName",
"value": {
"string_value": "cache-service.uri"
}
}
]
},
{
"time_unix_nano": "1637193575224097000",
"severity_number": "SEVERITY_NUMBER_INFO",
"name": "bootstrap",
"body": {
"string_value": ""
},
"attributes": [
{
"key": "propertyName",
"value": {
"string_value": "materialized-views.namespace"
}
}
]
}
]
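Each log entry carries its data in an attributes array of key entries with typed value wrappers. The following minimal sketch (hypothetical helper and variable names; not part of SEP) shows how such a payload can be flattened into plain key/value pairs for inspection:

import json

def flatten_attributes(record: dict) -> dict:
    """Turn [{"key": k, "value": {"string_value": v}}, ...] into {k: v, ...}."""
    flat = {}
    for attribute in record.get("attributes", []):
        value = attribute.get("value", {})
        # Each value wrapper holds exactly one typed field, e.g. string_value or int_value.
        flat[attribute["key"]] = next(iter(value.values()), None)
    return flat

payload = json.loads("""
{"logs": [{"time_unix_nano": "1637193575209705000",
           "attributes": [{"key": "propertyName",
                           "value": {"string_value": "cache-service.cache-ttl"}},
                          {"key": "propertyValue",
                           "value": {"string_value": "0.00ns"}}]}]}
""")

for record in payload["logs"]:
    print(flatten_attributes(record))
    # {'propertyName': 'cache-service.cache-ttl', 'propertyValue': '0.00ns'}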
Metrics#
All metrics collected by SEP are aggregated for the time period starting at start_time_unix_nano and ending at time_unix_nano. The same timestamp values are repeated across most metrics.
Metrics are based on completed queries, whether successful or not. Examples of this data are provided below.
queries_executed#
SEP collects aggregated counts of specific query dimensions as described in the following table.
Dimension | Description
---|---
| Total queries per column type, across all sources.
connector | Total queries per connector.
connector, queryType | Total queries by connector and query type. Possible query types are
| Total queries using a named session property or catalog session property, and a representation of the value. Boolean values are recorded as-is. Binary values are rounded to the nearest base-two magnitude. For example, 72 GB is recorded as 64 GB. Other numeric values are rounded to the nearest order of magnitude. For example, 54,321 is rounded to 100,000. Text values are not recorded, only the fact that they were set.
source | Total queries per named client, as supplied by the client, such as "trino-cli".
The following is an example of the collected dimensional query execution data:
"name":"queries_executed",
"unit":"1",
"sum":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"3",
"attributes":[
{
"key":"source",
"value":{
"string_value":"trino-cli"
}
}
]
},
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"1",
"attributes":[
{
"key":"function",
"value":{
"string_value":"max"
}
}
]
},
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"2",
"attributes":[
{
"key":"connector",
"value":{
"string_value":"postgresql"
}
},
{
"key":"queryType",
"value":{
"string_value":"SELECT"
}
}
]
},
queries_failed#
SEP collects aggregated counts of query failures.
Dimension | Description
---|---
error_code, failure_type | Total failed queries by error code and failure type. Error codes are numeric code values. Failure types are exception class names such as
The following is an example of the collected data:
"name":"queries_failed",
"unit":"1",
"sum":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"3",
"attributes":[
{
"key":"error_code",
"value":{
"int_value":"400"
}
},
{
"key":"failure_type",
"value":{
"string_value":"Can't create database 'foo'; database exists"
}
}
]
}
]
}
physical_input_bytes#
SEP collects the aggregated byte count of data in all processed queries.
Dimension | Description
---|---
connector | Total input bytes by connector.
The following is an example of the collected data:
"name":"physical_input_bytes",
"unit":"byte",
"sum":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"300",
"attributes":[
{
"key":"connector",
"value":{
"string_value":"postgresql"
}
}
]
},
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"300"
}
]
}
physical_input_rows#
SEP collects the aggregated count of input rows of data in all processed queries.
Dimension | Description
---|---
connector | Total input rows by connector.
The following is an example of the collected data:
"name":"physical_input_rows",
"unit":"1",
"sum":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"300",
"attributes":[
{
"key":"connector",
"value":{
"string_value":"postgresql"
}
}
]
},
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"300"
}
]
}
Query performance and complexity metrics#
SEP collects aggregations of key performance and complexity measures of the queries it processes.
Metric | Data type | Description
---|---|---
analysis_time | Histogram | Binned query analysis times for all queries in the collection time period.
| Histogram | Binned number of distinct catalogs used in a query for all queries in the collection time period.
| Histogram | Binned number of distinct connectors used in a query for all queries in the collection time period.
| Histogram | Binned total CPU time spent processing a query, for all queries in the collection time period.
| Single value | Binned cumulative memory for a single query throughout its processing, for all queries in the collection time period. This is different from peak memory; not all of the cumulative memory may have been in use at the same time.
| Single value | Cumulative memory used by queries in the collection period.
| Histogram | Binned query execution times for all queries in the collection time period.
| Histogram | Binned number of input columns used in a query for all queries in the collection time period.
| Histogram | Binned number of output columns resulting from a query for all queries in the collection time period.
peak_task_total_memory | Single value | Highest measured memory used by a task in the collection period.
| Single value | Highest measured user memory used by a task in the collection period.
| Histogram | Binned resource waiting times for all queries in the collection time period.
| Histogram | Binned query queued times for all queries in the collection time period.
| Histogram | Binned resource waiting times for all queries in the collection time period.
| Histogram | Binned scheduled times for all queries in the collection time period.
| Histogram | Binned number of distinct schemas used in a query for all queries in the collection time period.
| Single value | Total number of splits across all queries in the collection time period.
| Single value | Binned number of stages for a single query, for all queries in the collection time period.
| Histogram | Binned number of tasks in any given stage for a single query, for all queries in the collection time period.
| Histogram | Binned number of distinct tables used in a query for all queries in the collection time period.
| Histogram | Binned number of columns in a single table for all tables used in a query, for all queries in the collection time period.
| Histogram | Binned query wall times for all queries in the collection time period. Wall time does not include queued time.
The following is an example of a single-value metric:
{
"name":"peak_task_total_memory",
"unit":"byte",
"sum":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"as_int":"66609"
}
],
"aggregation_temporality":"AGGREGATION_TEMPORALITY_CUMULATIVE",
"is_monotonic":true
}
}
Performance data that is presented in a histogram also includes count and sum values, where count is equal to the number of instances represented in the histogram, and sum is the metric aggregated across all instances. In the following example, three queries had an aggregated analysis time of 1396.0 ms:
{
"name":"analysis_time",
"unit":"millisecond",
"histogram":{
"data_points":[
{
"start_time_unix_nano":"1635164762424772000",
"time_unix_nano":"1635172027851773000",
"count":"3",
"sum":1396.0,
"bucket_counts":[
"0",
"0",
"2",
"1",
"0",
"0",
"0",
"0",
"0",
"0",
"0"
],
"explicit_bounds":[
10.0,
100.0,
500.0,
1000.0,
2000.0,
10000.0,
60000.0,
300000.0,
3600000.0,
86400000.0
]
}
],
"aggregation_temporality":"AGGREGATION_TEMPORALITY_CUMULATIVE"
}
}
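In a histogram data point, the explicit_bounds array defines the bucket boundaries and bucket_counts holds one count per bucket, with one extra bucket for values above the last bound. The following minimal sketch (illustrative only; the variable names are not part of any SEP or OpenTelemetry API) shows how the bounds and counts in the example above relate:

# Bounds and counts copied from the analysis_time example above.
explicit_bounds = [10.0, 100.0, 500.0, 1000.0, 2000.0, 10000.0,
                   60000.0, 300000.0, 3600000.0, 86400000.0]
bucket_counts = [0, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0]

# N bounds define N+1 buckets; bucket i covers (bounds[i-1], bounds[i]],
# and the final bucket is unbounded above.
lower = float("-inf")
for upper, count in zip(explicit_bounds + [float("inf")], bucket_counts):
    if count:
        print(f"{count} queries with analysis time in ({lower}, {upper}] ms")
    lower = upper

# Output:
# 2 queries with analysis time in (100.0, 500.0] ms
# 1 queries with analysis time in (500.0, 1000.0] ms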
Optional query log data#
If query log collection is enabled, each query processed results in one or more associated log entries. The following is an example of a query log entry:
"logs": [
{
"time_unix_nano": "1635515535751000000",
"severity_number": "SEVERITY_NUMBER_INFO",
"name": "queryCompletedEvent",
"body": {
"string_value": ""
},
"attributes": [
{
"key": "createTime",
"value": {
"string_value": "2021-10-29T13:52:13.288Z"
}
},
{
"key": "endTime",
"value": {
"string_value": "2021-10-29T13:52:15.654Z"
}
},
{
"key": "executionStartTime",
"value": {
"string_value": "2021-10-29T13:52:13.501Z"
}
},
{
"key": "failureInfo",
"value": {
"string_value": "null"
}
},
{
"key": "metadata.plan",
"value": {
"string_value": "Fragment 0 [SINGLE]\n CPU: 18.33ms, Scheduled: 24.11ms, Input: 598 rows (65.56kB); per task: avg.: 598.00 std.dev.: 0.00, Output: 598 rows (57.21kB)\n Output layout: [field, field_0, field_1, field_2, field_3, field_4]\n ..."
}
},
{
"key": "metadata.query",
"value": {
"string_value": "SHOW FUNCTIONS"
}
},
{
"key": "statistics",
"value": {
"string_value": "{\"cpuTime\":0.097000000,...}"
}
},
{
"key": "inputIOMetadata",
"value": {
"string_value": "[{"connectorMetrics":{"Physical input read time":...}}]"
}
},
{
"key": "warnings",
"value": {
"string_value": "[]"
}
}
]
}
]
Anonymized query statistics#
Anonymized query statistics contain metrics related to query execution without exposing any sensitive information.
Note
We generally suggest enabling this feature in all your deployments to assist Starburst in improving query performance by using this data.
To enable it, use the following properties:
telemetry.log-completed=true
telemetry.log-query-statistics=ANONYMIZED
The following is an example of the collected query statistics:
{
"cpuTime": 0.005,
"failedCpuTime": 0,
"wallTime": 0.113,
"queuedTime": 0,
"scheduledTime": 0.019,
"failedScheduledTime": 0,
"analysisTime": 0.01,
"planningTime": 0.026,
"executionTime": 0.103,
"inputBlockedTime": 0,
"failedInputBlockedTime": 0,
"outputBlockedTime": 0,
"failedOutputBlockedTime": 0,
"peakUserMemoryBytes": 117,
"peakTaskUserMemory": 117,
"peakTaskTotalMemory": 117,
"physicalInputBytes": 1512,
"physicalInputRows": 4,
"processedInputBytes": 31,
"processedInputRows": 3,
"internalNetworkBytes": 0,
"internalNetworkRows": 0,
"totalBytes": 1512,
"totalRows": 3,
"outputBytes": 31,
"outputRows": 3,
"writtenBytes": 0,
"writtenRows": 0,
"cumulativeMemory": 0,
"failedCumulativeMemory": 0,
"stageGcStatistics": [
{
"stageId": 0,
"tasks": 1,
"fullGcTasks": 0,
"minFullGcSec": 0,
"maxFullGcSec": 0,
"totalFullGcSec": 0,
"averageFullGcSec": 0
}
],
"completedSplits": 2,
"complete": true,
"cpuTimeDistribution": [
{
"stageId": 0,
"tasks": 1,
"p25": 5,
"p50": 5,
"p75": 5,
"p90": 5,
"p95": 5,
"p99": 5,
"min": 5,
"max": 5,
"total": 5,
"average": 5
}
],
"operatorSummaries": [
"{\n \"stageId\" : 0,\n \"pipelineId\" : 0,\n \"operatorId\" : 0,\n \"planNodeId\" : \"0\",\n \"operatorType\" : \"TableScanOperator\",...}",
"{\n \"stageId\" : 0,\n \"pipelineId\" : 0,\n \"operatorId\" : 1,\n \"planNodeId\" : \"6\",\n \"operatorType\" : \"TaskOutputOperator\",...}"
],
"planNodeStatsAndCosts": "{\n \"stats\" : { },\n \"costs\" : { }\n}",
"resourceWaitingTime": 0.01
}
Anonymized query input-output metadata#
Anonymized query input-output metadata consists of connector metrics along with anonymized table-level metadata. Hence, it does not expose any sensitive information. For instance, the catalog name tpch is anonymized to catalog_1.
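The exact anonymization scheme is internal to SEP and not documented here, but the following hypothetical sketch illustrates the general idea: each identifier is replaced with an opaque token, so the original catalog, schema, table, and column names are never sent.

import hashlib

# Hypothetical illustration only; SEP's actual anonymization may differ.
# Each identifier is replaced by a prefixed, truncated hash so that only
# an opaque token leaves the cluster.
def anonymize(kind: str, name: str) -> str:
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:8]
    return f"{kind}_{digest}"

print(anonymize("catalog", "tpch"))  # prints an opaque token such as catalog_<8 hex chars>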
Note
We generally suggest enabling this feature in all your deployments to assist Starburst in improving query performance by using this data.
To enable it, use the following properties:
telemetry.log-completed=true
telemetry.log-query-io-metadata=ANONYMIZED
The following is an example of the collected input-output metadata:
[
{
"catalogName": "catalog_dd42e04c",
"schema": "schema_8fcc1516",
"table": "table_dd3bdbd2",
"columns": [
"column_144ef030",
"column_6c2bbea9"
],
"connectorMetrics": {
"Physical input read time": {
"@class": "io.trino.plugin.base.metrics.DurationTiming",
"duration": "1266330.00ns"
},
"OrcReaderCompressionFormat_ZLIB": {
"@class": "io.trino.plugin.base.metrics.LongCount",
"total": 112
}
},
"physicalInputBytes": 1512,
"physicalInputRows": 4
}
]