Welcome to the Learn SQL basics tutorial, where you explore useful ways to craft
your SQL statements using the Sample dataset. The
SQL statements start with SELECT
expressions that return every row, and
increase in complexity with the addition of clauses and operations like WHERE
,
GROUP BY
, and ORDER BY
.
To complete the tutorial using Starburst Galaxy, you must create a free trial or activate your user account.
Once you login to Starburst Galaxy, a cluster named sample
or free-cluster
,
containing a catalog named sample
, is usually pre-configured and ready
for querying.
sample
catalog automatically available, create a
Sample dataset catalog with the name sample
.free-cluster
or sample
cluster automatically
available, create a cluster named sample
and
add the sample
catalog.In the query editor, navigate to the Cluster explorer.
free-cluster
or sample
cluster to view
its catalogs.Select the sample
catalog.
In the location drop-down
menus,
select the cluster and the catalog sample
in order to run the queries without
having to specify the full table path location in each query.
The namespace for a table is typically specified as
catalog_name.schema_name.table_name
. For the predefined sample dataset,
this configuration is as follows:
sample.demo.<table_name>
If you choose to name your catalog differently, either adjust the queries in the tutorials accordingly or select the appropriate catalog in the location drop-down menus.
If you choose to name your cluster differently, select the appropriate cluster in the location drop-down menus.
The SELECT
clause specifies the output of the query. There are multiple
Select expressions that can be
utilized depending on the query requirements.
Use the SELECT *
expression and query the astronauts
table to view the
data available for analysis.
SELECT
*
FROM
sample.demo.astronauts;
The astronauts table displays all the astronauts that are associated with a mission, which creates duplicate entries of the same astronaut because each row represents not only the astronaut but each mission the astronaut has completed.
Use the SELECT *
expression and query the missions
table to view the data
available for analysis.
SELECT
*
FROM
sample.demo.missions;
Instead of using the SELECT *
expression to return all the columns in the
table, only return columns that are beneficial for your analysis of the
astronauts
table.
SELECT
name,
nationality,
mission_title,
mission_number,
hours_mission
FROM
sample.demo.astronauts;
Instead of using the SELECT *
expression to return all the columns in the
table, only return columns that are beneficial for your analysis of the
missions
table.
SELECT
company_name,
status_rocket,
cost,
status_mission
FROM
sample.demo.missions;
The WHERE
clause is used to specify a condition and only returns records that
satisfy the conditional criteria.
There are many astronauts from the U.S. and U.S.S.R, so use the WHERE
clause
to alter the query and only view the astronauts of other countries.
SELECT
name,
nationality,
mission_title,
mission_number,
hours_mission
FROM
sample.demo.astronauts
WHERE
nationality NOT LIKE 'U.S.%';
Use the WHERE
clause to query the table and only return missions that are
classified as a success.
SELECT
company_name,
status_rocket,
cost,
status_mission
FROM
sample.demo.missions
WHERE
status_mission = 'Success';
The ORDER BY
clause is used to sort a result set by one or more output
expressions.
Add a sort to the astronauts query to view the results in order of nationality and name.
SELECT
name,
nationality,
mission_title,
mission_number,
hours_mission
FROM
sample.demo.astronauts
WHERE
nationality NOT LIKE 'U.S.%'
ORDER BY
nationality,
name;
A gentle reminder that the astronauts table includes duplicate entries for selected astronauts; however, these astronauts have a unique identifier for each completed mission.
Aggregate functions operate on a set of values to compute a single result.
Use one query to find three different aggregate values. Discover the amount of trips, the longest mission in hours, and the smallest mission in hours from select countries.
SELECT
count() as trips,
max(hours_mission) as longest_mission,
min(hours_mission) as shortest_mission
FROM
sample.demo.astronauts
WHERE
nationality NOT LIKE 'U.S.%';
Instead of evaluating all the data together, increase the granularity level and evaluate the aggregates specifically for each country.
SELECT
nationality,
count() AS number_trips,
max(hours_mission) AS longest_time,
min(hours_mission) AS shortest_time
FROM
sample.demo.astronauts
WHERE
nationality NOT LIKE 'U.S.%'
GROUP BY
nationality;
Add an ORDER BY
clause and sort the results by most trips per country. In the
case of a tie, sort additionally by longest mission.
SELECT
nationality,
count() AS number_trips,
max(hours_mission) AS longest_time,
min(hours_mission) AS shortest_time
FROM
sample.demo.astronauts
WHERE
nationality NOT LIKE 'U.S.%'
GROUP BY
nationality
ORDER BY
number_trips DESC,
longest_time DESC;
Revisit the previous missions
query and add an Aggregate
function to determine which company
spent the most money on successful missions. Order this query by the total
highest cost.
SELECT
company_name,
sum(cost),
status_rocket
FROM
sample.demo.missions
WHERE
status_mission = 'Success'
GROUP BY
company_name,
status_rocket
ORDER BY
sum(cost) DESC;
Check out our other tutorials, or dive right into the SQL documentation and experiment with your own data.
Is the information on this page helpful?
Yes
No