Apache Gravitino Trino connector - Hive catalog
The Hive catalog allows Trino to query data stored in an Apache Hive data warehouse.
Requirements
The Hive connector requires a Hive metastore service (HMS), or a compatible implementation of the Hive metastore, such as AWS Glue.
Apache Hadoop HDFS 2.x is supported.
Many distributed storage systems including HDFS, Amazon S3 or S3-compatible systems, Google Cloud Storage, Azure Storage, and IBM Cloud Object Storage can be queried with the Hive connector.
The coordinator and all workers must have network access to the Hive metastore and the storage system.
Hive metastore access over the Thrift protocol uses port 9083 by default.
Data files must be in a supported file format. Some file formats can be configured using file format configuration properties per catalog:
- ORC
- PARQUET
- AVRO
- RCFILE
- SEQUENCEFILE
- JSON
- CSV
- TEXTFILE
Schema operations
Create a schema
Users can create a schema with properties through the Apache Gravitino Trino connector as follows:
CREATE SCHEMA catalog.schema_name
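The statement above creates a schema without any properties. As a minimal sketch of the "with properties" form, the example below uses Trino's standard `WITH (...)` clause. It assumes the catalog accepts a `location` schema property, as the Trino Hive connector does; the HDFS URI, including the `namenode-host` name and port, is a hypothetical placeholder and not a value from this document.

```sql
-- Sketch only: assumes a `location` schema property is supported.
-- The HDFS URI below is a hypothetical placeholder.
CREATE SCHEMA catalog.schema_name
WITH (
  location = 'hdfs://namenode-host:9000/user/hive/warehouse/schema_name.db'
);
```

If the property is not recognized in your deployment, Trino rejects the statement at analysis time, so the plain form shown earlier remains the safe default.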
Table operations
Create table
The Gravitino Trino connector currently supports basic Hive table creation statements, such as defining fields,
allowing null values, and adding comments. The Gravitino Trino connector does not support CREATE TABLE AS SELECT.
The following example shows how to create a table in the Hive catalog:
CREATE TABLE catalog.schema_name.table_name
(
name varchar,
salary int
)
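The example above defines only column names and types. As a hedged sketch of the comment support described in this section, the statement below adds column-level and table-level comments using standard Trino `CREATE TABLE` syntax. The table name `table_name_commented` and the comment strings are illustrative, and the `format` table property is an assumption borrowed from the Trino Hive connector; drop the `WITH` clause if your catalog does not accept it.

```sql
-- Sketch only: the table name and comment text are illustrative.
CREATE TABLE catalog.schema_name.table_name_commented
(
  name varchar COMMENT 'Employee name',   -- columns are nullable unless declared otherwise
  salary int COMMENT 'Monthly salary'
)
COMMENT 'Example table with column and table comments'
WITH (
  format = 'ORC'  -- assumption: `format` table property, one of the file formats listed under Requirements
);
```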