Python clients #
Starburst Galaxy, Starburst Enterprise platform (SEP), and Trino fully support client access from Python code and Python-based client apps.
Python libraries and clients #
The following data query libraries and clients take advantage of the Trino Python client package.
PyStarburst is Starburst Galaxy’s library that supports the Python DataFrame API.
Ibis is a portable Python DataFrame library, with support for Trino connections.
dbt is a data transformation workflow development framework that lets teams quickly and collaboratively deploy analytics code. Starburst provides a supported adapter. The dbt client page describes the steps to use the adapter and dbt with Trino, Starburst Enterprise, or Starburst Galaxy.
Apache Superset is a data exploration and visualization platform. Connections to clusters use the SQLAlchemy-Trino package in conjunction with the Trino Python client package. The Superset client page describes the steps to use Superset with Trino, Starburst Enterprise, or Starburst Galaxy.
Querybook is a browser-based data analysis tool that turns SQL queries into natural language reports and graphs called DataDocs. The Querybook client page describes the steps to use Querybook with Trino, Starburst Enterprise, or Starburst Galaxy.
Trino Python client #
The client supports running queries within transactions, as described in the GitHub project’s README.
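As a rough illustration of the transaction pattern, the sketch below wraps a batch of statements in commit-or-rollback logic using only the standard DBAPI connection methods. The helper name run_in_transaction is hypothetical, and the standard-library sqlite3 module stands in for a cluster so the snippet runs anywhere. With the Trino client, the assumption is that the connection was opened with a non-autocommit isolation level, as the project README describes.

```python
import sqlite3


def run_in_transaction(conn, statements):
    """Run statements on one DBAPI connection; commit on success, roll back on error.

    Hypothetical helper for illustration, not part of the Trino client package.
    """
    cur = conn.cursor()
    try:
        for sql in statements:
            cur.execute(sql)
        conn.commit()
    except Exception:
        # Undo every statement in the batch, then let the caller see the error.
        conn.rollback()
        raise


# sqlite3 is only a stand-in DBAPI driver for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.commit()

run_in_transaction(conn, ["INSERT INTO t VALUES (1)", "INSERT INTO t VALUES (2)"])

batch_failed = False
try:
    # The second statement violates the primary key, so the whole batch is rolled back.
    run_in_transaction(conn, ["INSERT INTO t VALUES (3)", "INSERT INTO t VALUES (1)"])
except sqlite3.IntegrityError:
    batch_failed = True

count = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(batch_failed, count)
```

Only the rows from the first batch remain after the failed batch is rolled back. Re-raising the exception keeps the failure visible to the caller instead of hiding it behind a return value.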
The Python client package requires Python 3.6 or later, or PyPy 3.
To use the package directly in your Python code, install it locally with pip install trino (or use pip3 if your system is so configured). Thereafter, import trino into your code.
To use one of the Python-based clients, follow the setup instructions for that client, which incorporates the Trino Python client package.
Authentication methods #
The Python client package supports the following Trino authentication methods:
- No authentication
- Basic authentication using passwords, which includes:
Package comparison #
The Python Database API Specification (DBAPI) defines a standard way for Python clients to access databases. The Trino Python client is a direct implementation of the DBAPI specification.
SQLAlchemy is a toolkit whose core component provides a SQL abstraction layer over many DBAPI implementations. Several Python clients use SQLAlchemy along with the SQLAlchemy-Trino package to provide SQL access to Trino clusters.
Python clients that use the Trino DBAPI implementation directly, or that use SQLAlchemy along with the Trino DBAPI package, are the most direct path to querying Trino, Starburst Enterprise, and Starburst Galaxy clusters.
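For the SQLAlchemy path, clients usually identify the cluster with a connection URL. The following is a minimal sketch of building such a URL, assuming the commonly documented form trino://user@host:port/catalog/schema. The helper name trino_sqlalchemy_url is hypothetical, and the commented lines at the end assume that SQLAlchemy and the Trino dialect are installed.

```python
from urllib.parse import quote


def trino_sqlalchemy_url(user, host, port=8080, catalog=None, schema=None):
    """Build a URL of the assumed form trino://user@host:port/catalog/schema."""
    url = f"trino://{quote(user, safe='')}@{host}:{port}"
    if catalog:
        url += f"/{quote(catalog, safe='')}"
        if schema:
            url += f"/{quote(schema, safe='')}"
    return url


url = trino_sqlalchemy_url("sep-user", "localhost", catalog="system", schema="runtime")
print(url)

# With SQLAlchemy and the Trino dialect installed, the URL would typically be used as:
#   from sqlalchemy import create_engine, text
#   engine = create_engine(url)
#   with engine.connect() as connection:
#       rows = connection.execute(text("SELECT * FROM nodes")).fetchall()
```

Percent-encoding the user, catalog, and schema keeps characters such as @ or / in those values from being misread as URL delimiters.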
Several alternative Python access methods are not as direct, and are not recommended:
PySpark requires Spark JARs as well as a JDBC driver. This leaves your SQL query two layers removed from a direct DBAPI implementation.
PyJDBC does implement DBAPI, but also inserts the requirement of a JDBC driver in the path of your query.
PyHive implements DBAPI, can support use with SQLAlchemy, and has support for the Trino client package. However, it is designed to use the Hive query language, and not SQL. While both languages are similar, they are not identical and using the PyHive library can therefore result in unexpected query results or failures.
The following example shows how to use the Python API to connect to a local cluster running without security to submit a single query and return the results.
```python
import trino

conn = trino.dbapi.connect(
    host='localhost',
    port=8080,
    user='sep-user',
    catalog='system',
    schema='runtime',
)
cur = conn.cursor()
cur.execute('SELECT * FROM nodes')
rows = cur.fetchall()
for row in rows:
    print(row)
```
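The rows come back as plain sequences without column names. As a possible follow-up, the sketch below pairs each row with the names in the DBAPI cursor.description attribute. The helper rows_as_dicts is hypothetical, and sqlite3 with a made-up two-column nodes table stands in for the cluster so that the snippet runs without one. The assumption is that the same helper works with a Trino cursor after execute has been called.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Pair each fetched row with the column names from cursor.description."""
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Stand-in data: sqlite3 replaces the cluster, and the table layout is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (node_id TEXT, state TEXT)")
conn.execute("INSERT INTO nodes VALUES ('worker-1', 'active')")

cur = conn.cursor()
cur.execute("SELECT * FROM nodes")
records = rows_as_dicts(cur)
print(records)
```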
The next example runs the same query on a remote cluster secured with LDAP. The user parameter is not needed for LDAP because you specify the username in the auth parameter. The catalog and schema parameters are not required for this query format, which specifies the entire table path, system.runtime.nodes, in the query itself.
```python
import trino

conn = trino.dbapi.connect(
    host='cluster.example.com',
    port=8443,
    http_scheme='https',
    auth=trino.auth.BasicAuthentication("ldap-username", "ldap-password"),
)
cur = conn.cursor()
cur.execute('SELECT * FROM system.runtime.nodes')
rows = cur.fetchall()
for row in rows:
    print(row)
```
The next example runs a query on a Starburst Galaxy cluster secured using HTTPS on port 443. This example uses username and password credentials for authentication and is appropriate for establishing a connection to any cluster that relies on basic authentication.
```python
import trino

conn = trino.dbapi.connect(
    host='cluster.trino.galaxy.starburst.io',
    port=443,
    http_scheme='https',
    auth=trino.auth.BasicAuthentication("username", "password"),
)
cur = conn.cursor()
cur.execute('SELECT nationkey, name FROM tpch.sf1.nation')
rows = cur.fetchall()
for row in rows:
    print(row)
```
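The examples hardcode credentials for brevity. One possible variation, sketched here under stated assumptions, reads them from environment variables instead. The TRINO_* variable names and both helper functions are hypothetical. The trino package is imported only inside connect_from_env, so the settings helper can be exercised without the package or a cluster.

```python
import os


def connection_settings(env):
    """Collect connection settings from a mapping such as os.environ.

    The TRINO_* variable names are hypothetical, chosen for this sketch.
    """
    return {
        "host": env["TRINO_HOST"],
        "port": int(env.get("TRINO_PORT", "443")),
        "http_scheme": "https",
        "username": env["TRINO_USER"],
        "password": env["TRINO_PASSWORD"],
    }


def connect_from_env(env=os.environ):
    """Open a connection with basic authentication, using the same calls as the examples."""
    import trino  # imported lazily so connection_settings works without the package

    s = connection_settings(env)
    return trino.dbapi.connect(
        host=s["host"],
        port=s["port"],
        http_scheme=s["http_scheme"],
        auth=trino.auth.BasicAuthentication(s["username"], s["password"]),
    )


# Exercise the settings helper with a plain dict instead of the real environment.
example = connection_settings(
    {"TRINO_HOST": "cluster.example.com", "TRINO_USER": "username", "TRINO_PASSWORD": "password"}
)
print(example["host"], example["port"])
```

Calling connect_from_env() with the variables set returns a connection that can be used exactly like conn in the examples above.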