Starburst Galaxy

  •  Get started

  •  Working with data

  •  AI workflows

  •  Data engineering

  •  Developer tools

  •  Cluster administration

  •  Security and compliance

  •  Troubleshooting

  • Galaxy status

  •  Reference

  • Galaxy Telemetry #

    The Galaxy Telemetry catalog is a built-in catalog that lets you access Starburst Galaxy managed datasets such as query insights.

    The catalog is available by default. However, you must add it to an AWS cluster, in the us-east-1 region, in order to access it. The catalog is read-only and unmodifiable.

    Query history #

    Query history data is located in the query_history table within the public schema. New history data is accessible within two hours of query completion.

    Users can query up to the last 30 days of their query history by selecting from galaxy_telemetry.public.query_history.

    The query_history table contains the following columns:

    Column Type Description
    account_name varchar Domain name of the account.
    client_info varchar Information about the client submitting the query.
    cluster_name varchar Name of the cluster running the query.
    create_time timestamp Timestamp when the query was received.
    date varchar Date (yyyy-MM-dd) of end_time.
    email varchar Email address of the user running the query.
    end_time timestamp Timestamp when results were finished being consumed by the client.
    error_code_category varchar Query failure error category.
    error_code_name varchar Query failure error code name.
    error_exception_message varchar Query failure exception message.
    execution_start_time timestamp Timestamp when query execution started.
    hour varchar Hour (00-23) of end_time.
    index_and_cache_usage_filtering double Indicates index usage for filtering operations.
    index_and_cache_usage_overall double Combined score for performance gains from filtering and projection.
    index_and_cache_usage_projection double Indicates data cache usage for projection operations.
    internal_network_bytes bigint Number of bytes of data exchanged between nodes during query execution.
    internal_network_rows bigint Number of rows of data exchanged between nodes during query execution.
    original_query_id varchar Unique identifier of the original query, in case the results were served from cache.
    output_bytes bigint Number of filtered bytes on query output.
    output_rows bigint Number of filtered rows on query output.
    peak_task_total_memory_bytes bigint Maximum amount of user and system memory reserved at one time by one task.
    peak_task_user_memory_bytes bigint Maximum amount of user memory reserved at one time by one task.
    peak_user_memory_bytes bigint Maximum amount of memory directly tied to query resources used by a query at one time.
    physical_input_bytes bigint Number of uncompressed bytes read from the source.
    physical_input_rows bigint Number of rows read from the source.
    plannning_time_secs double The time spent actively planning the execution of the query.
    query varchar Full SQL statement of the submitted query.
    query_id varchar Unique identifier of the query.
    query_plan varchar Full explain plan of the submitted query, including costs for each stage. Identical to the output from the EXPLAIN ANALYZE SQL statement.
    query_state varchar The state of the query, such as FINISHED, FAILED, or RUNNING.
    query_type varchar Query type, such as SELECT, UPDATE, or INSERT.
    read_bytes bigint Sum of all bytes used in the query input.
    read_rows bigint Sum of all rows used in the query input.
    remote_client_address varchar Address of the client that submitted the query.
    role_name varchar Role of the user running the query.
    session_catalog varchar The session catalog.
    session_properties map(varchar, varchar) Temporary properties for the query session.
    session_schema varchar The session schema.
    source varchar Client tool or driver used to execute query.
    tables array(ROW(catalog, varchar, schema varchar, table varchar)) Catalog, schema, and table names accessed by the query.
    update_type varchar The type of update the query performed, such as CREATE TABLE. This column is null for SELECT queries.
    user_agent varchar User agent that submitted the query.
    written_bytes bigint Number of bytes of inserted rows.
    written_rows bigint Number of inserted rows.

    Billable usage #

    Billable usage information is located in the billable_usage table within the public schema.

    New usage details are fetched every hour and may take anywhere from 24-48 hours to become available. Usage details are retained for 12 months.

    The billable_usage table contains the following information:

    Column Type description
    account_name varchar Domain name of the account.
    cloud varchar Cloud provider.
    cluster_name varchar Name of the cluster.
    cluster_owner_role varchar Role of the cluster owner.
    cost double Cost of the usage.
    credits_used double Credits used.
    date varchar Date of end_time in UTC (yyyy-MM-dd).
    end_time timestamp(3) End time of the usage.
    livetable_name varchar Name of the live table.
    privatelink_connection_id varchar PrivateLink connection ID.
    quantity double Quantity of usage.
    region varchar Cloud region where the service is used.
    service_type timestamp(3) Type of service used.
    start_time varchar Start time of the usage.
    usage_units varchar Units of usage.
    • The Index status column shows check Successful or errorFailed indexing.
    • All columns are nullable, and have a NULL default value.
    • Any tags assigned are listed in the Tags column. Click add to manage tags.

    Click edit at the end of each row to edit descriptions.

    Set permissions #

    The galaxy_telemetry catalog is owned by the accountadmin role. Read-only access can be granted to other roles.

    SQL support #

    The catalog provides read access to data and metadata in the datasets. It supports the following features: