ThoughtSpot #

Requirements #

ThoughtSpot provides a built-in Starburst data connector. The requirements for using this connector in ThoughtSpot Cloud are:

  • Users of ThoughtSpot Cloud must connect to a cluster running Starburst Galaxy or SEP version 354-e or newer.

  • Other ThoughtSpot Cloud requirements are described in the ThoughtSpot documentation.

Add a cluster connection #

Use the following steps to connect to ThoughtSpot and add your cluster as a data source:

  1. Determine the connection information for your cluster, including its network name, port, and your login credentials.

  2. Using a supported web browser, log into ThoughtSpot using the URL and login credentials provided for your site by your network administrators. If you are evaluating ThoughtSpot and have trial login credentials, use try.thoughtspot.cloud.

  3. To add a connection to your cluster, select Data in the navigation bar.

  4. Select the Connections tab, then Add a connection.

  5. In the Choose your data warehouse page, in the first field, Name your connection, such as production cluster. Optionally enter a connection description.

  6. Select the Starburst tile and click Continue.

      ThoughtSpot choose connection page

  7. Enter the connection details for your cluster. Leave the Database field empty.

      ThoughtSpot connection details page

  8. If your cluster uses secure TLS/HTTPS connections, open the Advanced Config drop-down. In the first row of Key and Value fields, enter SSL and true.

  9. Click Continue.

  10. In the Select tables page, ThoughtSpot connects to your cluster and returns a list of configured catalogs. Open a catalog entry to see the schemas and tables it contains.

      ThoughtSpot select tables page

  11. Use the checkboxes to select one or more tables on the left and columns on the right. Choose a set of tables and columns from which you can make meaningful queries by dragging column names into ThoughtSpot’s Search field.

    You can return to this page under a different connection name to select another set of tables and columns to support a different set of queries. Therefore, do not select in this current named connection every table you might be interested in querying.

      ThoughtSpot select table columns page

  12. Click Confirm. This adds a named connection that includes only the tables and columns you selected.

    You can now perform a live query on this connection, following the instructions in ThoughtSpot’s documentation.

    To query a different set of tables and columns, create a different data connection to the same cluster.

Joining tables #

Use the ThoughtSpot Select tables UI to perform join operations, including joining tables across different data sources. ThoughtSpot generates a SQL query that specifies the joins and passes that to the SEP query engine, which is then responsible for analysis and execution of the query.

You can also create a view directly in SEP from a SELECT query that includes the joins of interest. This allows you to select that view by name in ThoughtSpot’s Select tables page, and issue queries on that view. This is appropriate for a static set of joined tables that forms a platform for many successive queries.

In both cases, ThoughtSpot generates the SQL for your analysis query and forwards that to SEP for execution.

To create a view of joined tables, use a SQL editor client such as the Trino CLI or the query editor in the Starburst Enterprise web UI.

Your cluster must have a Hive catalog that implements a Hive metastore service such as the AWS Glue Data Catalog, which serves as a location to store your created view.

The following example saves a view of a SELECT statement that accesses three catalogs: postgresql, hive, and sqlserver.

CREATE VIEW hive.savedviews.income_by_customer_segment_vw AS
SELECT c.custkey,
       c.estimated_income,
       c.fico,
       o.risk_appetite,
       l.cc_type
FROM glue.burst_bank.customer c
   INNER JOIN postgresql.burst_bank.customer_profile o
     ON c.custkey = o.custkey
   INNER JOIN sqlserver.burst_bank.product_profile l
     ON o.custkey = l.custkey;

Resources #