Starburst AI agent-based data product enrichment

Starburst supports data product enrichment using the Starburst AI Agent. This feature automatically generates descriptions for data products and datasets from their metadata and the relationships between data elements.

Requirements

To use the data product enrichment feature, you need:

  • A valid AI_WORKFLOWS license.

  • Access to at least one configured AI language model.

  • Write permissions for data products.

Configuration

To configure the enrichment feature, add the following property to your coordinator configuration file:

starburst.agent.enabled=true
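
As a quick sanity check, the following Python sketch parses a properties-style file and reports whether the property above is set to true. It is a minimal illustration, not a Starburst tool: the helper names are hypothetical, the path `etc/config.properties` is an assumed location that may differ in your installation, and the parser handles only plain `key=value` lines (no line continuations or `:` separators).

```python
from pathlib import Path


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and comments."""
    props: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Properties files treat lines starting with '#' or '!' as comments.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def agent_enabled(text: str) -> bool:
    """Return True if starburst.agent.enabled is set to true."""
    value = parse_properties(text).get("starburst.agent.enabled", "")
    return value.lower() == "true"


if __name__ == "__main__":
    # Assumed path for illustration; adjust to your coordinator's layout.
    config_path = Path("etc/config.properties")
    if config_path.exists():
        print(agent_enabled(config_path.read_text()))
    else:
        print(f"{config_path} not found")
```

A commented-out property (for example, `# starburst.agent.enabled=true`) is treated as not set.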

Enrich a data product

After confirming access and configuration, follow these steps to generate metadata for a data product:

  1. In the Data products tab of the Starburst Enterprise web UI, select an existing data product.

  2. Click the Enrich with AI button to open the Datasets section of the Enrich data product dialog.

  3. From the drop-down menu, choose an AI model.

  4. Click AI generate all datasets to enrich each dataset within the data product:

    • Column descriptions are generated based on their names, types, any existing descriptions, and the surrounding context such as dataset and data product names.

    • Dataset descriptions are generated based on column descriptions, existing dataset descriptions (if available), and context.

    To generate metadata for a specific dataset, select it, then choose AI generate this dataset from the AI generate all drop-down menu and click AI generate this dataset.

  5. Click Next.

  6. In the Data product details section, click Generate values for all fields.

    This generates a description for the data product, based on its name, current description (if any), and its datasets and their descriptions. The Summary and Tags are based on the generated description.

  7. Click Save.
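
The steps above imply a bottom-up order: column descriptions feed dataset descriptions, which in turn feed the data product description. The Python sketch below only models which inputs this page lists for each level. It does not call any Starburst API or model, every class and function name in it is hypothetical, and the exact "context" passed at the dataset level is an assumption, since the page does not spell it out.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    type: str
    description: str = ""


@dataclass
class Dataset:
    name: str
    columns: list[Column] = field(default_factory=list)
    description: str = ""


@dataclass
class DataProduct:
    name: str
    datasets: list[Dataset] = field(default_factory=list)
    description: str = ""


def column_inputs(product: DataProduct, dataset: Dataset, column: Column) -> dict:
    """Inputs listed for a column description: name, type, existing text, context."""
    return {
        "column_name": column.name,
        "column_type": column.type,
        "existing_description": column.description,
        "context": {"dataset": dataset.name, "data_product": product.name},
    }


def dataset_inputs(product: DataProduct, dataset: Dataset) -> dict:
    """Inputs listed for a dataset description: column descriptions, existing text, context."""
    return {
        "column_descriptions": {c.name: c.description for c in dataset.columns},
        "existing_description": dataset.description,
        # Assumption: the surrounding context includes the data product name.
        "context": {"data_product": product.name},
    }


def product_inputs(product: DataProduct) -> dict:
    """Inputs listed for the data product description: name, existing text, datasets."""
    return {
        "name": product.name,
        "existing_description": product.description,
        "datasets": {d.name: d.description for d in product.datasets},
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    product = DataProduct("sales", [Dataset("orders", [Column("order_id", "bigint")])])
    orders = product.datasets[0]
    print(column_inputs(product, orders, orders.columns[0]))
    print(dataset_inputs(product, orders))
    print(product_inputs(product))
```

Because each level consumes the output of the level below it, generating datasets before the data product details (the order the dialog follows) gives the later steps the most complete input.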

Enrich a specific field

To enrich a specific dataset field using the AI agent:

  1. Navigate to the Data products tab and select a data product.

  2. Click Enrich with AI.

  3. Click the AI generate button for the field you want to generate metadata for. You can generate metadata for an individual column, all columns, or the dataset description.

Resetting dataset fields

You can reset metadata fields during the initial AI generation, either for all datasets or for specific ones.

To reset a single field within a dataset, navigate to the field and click the corresponding reset button.

To reset all fields in a specific dataset:

  1. Navigate to the Data products tab and select a data product.

  2. Select the dataset you want to reset.

  3. Click Enrich with AI.

  4. From the AI generate all datasets drop-down menu, select Reset this dataset.

  5. Click Reset this dataset.

  6. Click Next.

  7. In the Data product details section, select Reset values for all fields from the AI generate all datasets drop-down menu.

  8. Click Reset values for all fields.

  9. Click Save.

To reset all fields across all datasets:

  1. Navigate to the Data products tab and select a data product.

  2. Click Enrich with AI.

  3. From the AI generate all datasets drop-down menu, select Reset all.

  4. Click Reset all to reset all fields across all datasets in the selected data product.

  5. Click Next.

  6. In the Data product details section, select Reset values for all fields from the AI generate all datasets drop-down menu.

  7. Click Reset values for all fields.

  8. Click Save.