Starburst Generic JDBC connector#
The Generic JDBC connector allows querying and creating tables in an external database that SEP does not have a dedicated connector for. This can be used for proof-of-concept deployments of federation use-cases, enabling joining multiple data sources. For production deployments a dedicated supported connector should be used.
Due to its generic nature, this connector may or may not work with a particular database or deployment scenario. In particular, you should expect the connector to handle at most only a subset of data types available in the external database.
To connect to a data source with the Generic JDBC connector, you need:
A JDBC driver for the database or system you want to connect.
Network access from the coordinator and workers to the data source. The used port varies for different databases and configurations.
A valid Starburst Enterprise license.
The Generic JDBC connector requires a JDBC 4.2 driver dedicated for the external database. No JDBC drivers are shipped with the connector. The installation steps assume you have the external database already running and that you have the appropriate JDBC driver jar.
generic-jdbcdirectory under SEP’s plugin directory. When SEP is installed from RPM, this is
Add the appropriate JDBC driver jar file to the
etc/catalogdirectory containing configuration for SEP catalogs. When SEP is installed from RPM, this is
some_catalog.propertiesfile in the catalogs configuration directory (replace
some_catalogwith your database name or some other descriptive name of the catalog) with the following contents:
connector.name=generic-jdbc # the name of the Driver class. E.g. for Postgres this would be org.postgresql.Driver generic-jdbc.driver-class=FULLY_QUALIFIED_NAME_OF_THE_DRIVER_CLASS # JDBC connection URL. E.g. for Postgres this could be jdbc:postgresql://example.net:5432/database?defaultRowFetchSize=10000 connection-url=JDBC_CONNECTION_URL_AS_APPROPRIATE_FOR_THE_EXTERNAL_DATABASE_DRIVER # connection credentials connection-user=USERNAME connection-password=PASSWORD # additional configuration case-insensitive-name-matching=true
Perform the above steps on every SEP cluster node.
Restart SEP on every node.
Multiple external databases#
If you want to connect to multiple external databases using the Generic JDBC connector, follow these steps:
Add appropriate JDBC driver jars to the
generic-jdbcdirectory under SEP’s plugin directory for all database servers you are going to connect to. If two external databases use the same driver jar, you do not need to add the driver jar twice.
Define additional catalogs in SEP catalogs configuration directory, by creating
some_additional_catalogwith some descriptive name of the catalog).
Generic JDBC connector provides best-effort compatibility with older JDBC drivers, including drivers implementing version 3 of the JDBC specification (JDBC 3).
The best-effort compatibility can be turned on with:
The connector includes a number of performance improvements, detailed in the following sections.
Dynamic filtering is enabled by default. It causes the connector to wait for dynamic filtering to complete before starting a JDBC query.
You can disable dynamic filtering by setting the property
dynamic-filtering.enabled in your catalog properties file to
Table scan redirections#
The connector supports table scan redirection to improve performance and reduce load on the data source.
The connector provides read and write access to data and metadata in a database using a JDBC connection. In addition to the globally available and read operation statements, the connector supports the following features:
Some SQL statements may not work depending on the database and JDBC driver.
General configuration properties#
The following properties can be used to configure how data types from the connected data source are mapped to Trino data types and how the metadata is cached in Trino.
Configure how unsupported column data types are handled:
The respective catalog session property is
Allow forced mapping of comma separated lists of data types to convert to
Support case insensitive database and collection names
Duration for which metadata, including table and column statistics, is cached
0 (disabled caching)
Cache the fact that metadata, including table and column statistics, is not available
The connector includes a number of security-related features, detailed in the following sections.
Generic JDBC connector supports user impersonation.
User impersonation can be enabled in the catalog file:
User impersonation in Generic JDBC connector is based on
Support for data types depends on the data types in the external database and how the data types are reported by the JDBC driver of the external database. Actual support may vary from database to database. The Generic JDBC connector will hide columns which data type it considered unsupported. In the extreme case, this may mean that no columns are visible, if all data types being used are unsupported.
If you want to query data with unsupported types, you can define a view in your
external database, converting unsupported data types to supported ones
varchar is usually a good choice). Then you should query this
view from SEP, instead of the underlying table.