Apache Gravitino Trino connector - Iceberg catalog
Apache Iceberg is an open table format for huge analytic datasets. The Iceberg catalog allows Trino to query data stored in files written in the Iceberg format, as defined in the Iceberg Table Spec. The catalog supports Apache Iceberg table spec versions 1 and 2.
Requirements
To use Iceberg, you need:
- Network access from the Trino coordinator and workers to the distributed object storage.
- Access to a Hive metastore service (HMS), an AWS Glue catalog, a JDBC catalog, a REST catalog, or a Nessie server.
- Data files stored in a supported file format. The format can be configured per catalog using the file format configuration properties:
  - ORC
  - Parquet (default)
Schema operations
Create a schema
Users can create a schema through the Apache Gravitino Trino connector as follows:
CREATE SCHEMA catalog.schema_name
Table operations
Create table
The Apache Gravitino Trino connector currently supports basic Iceberg table creation statements, such as defining fields, allowing null values, and adding comments. It does not support CREATE TABLE AS SELECT.
The following example shows how to create a table in the Iceberg catalog:
CREATE TABLE catalog.schema_name.table_name
(
name varchar,
salary int
)
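The nullability and comment features mentioned above can be sketched with standard Trino syntax. This is a minimal illustration, not a verified statement against a specific Gravitino release; the comment texts are made up for the example, and the table-level COMMENT clause is an assumption, since the text above does not say whether comments are supported at the column level, the table level, or both.

```sql
-- Sketch: NOT NULL marks a column as required; columns without it allow nulls.
CREATE TABLE catalog.schema_name.table_name
(
  name varchar NOT NULL COMMENT 'Employee name',  -- required column with a comment
  salary int COMMENT 'Monthly salary'             -- nullable column with a comment
)
COMMENT 'Example employee table';                 -- table-level comment (assumed to be supported)
```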
Alter table
The following alter table operations are supported:
- Rename table
- Add a column
- Drop a column
- Rename a column
- Change a column type
- Set a table property
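As a rough sketch, the operations listed above map onto standard Trino ALTER TABLE statements as shown below. The column names `age` and `years`, the new table name, and the property value are hypothetical. Whether a given property can be changed after table creation is not stated here, so treat the SET PROPERTIES line as an assumption.

```sql
-- Add a column
ALTER TABLE catalog.schema_name.table_name ADD COLUMN age int;

-- Rename a column
ALTER TABLE catalog.schema_name.table_name RENAME COLUMN age TO years;

-- Change a column type (int to bigint is a widening change)
ALTER TABLE catalog.schema_name.table_name ALTER COLUMN salary SET DATA TYPE bigint;

-- Drop a column
ALTER TABLE catalog.schema_name.table_name DROP COLUMN years;

-- Set a table property (property and value are illustrative assumptions)
ALTER TABLE catalog.schema_name.table_name SET PROPERTIES sorted_by = ARRAY['salary'];

-- Rename the table (done last so the statements above keep using the old name)
ALTER TABLE catalog.schema_name.table_name RENAME TO catalog.schema_name.new_table_name;
```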
Select
The Apache Gravitino Trino connector supports most SELECT statements, so typical queries run successfully. It does not currently support certain query optimizations, such as pushdown and pruning.
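For illustration, a simple query against the table created earlier might look like the following. The filter value is arbitrary. Because pushdown is not supported, a predicate like this one is presumably evaluated by Trino after the data is read rather than by the underlying catalog.

```sql
-- Sketch: filter and sort rows from the example table
SELECT name, salary
FROM catalog.schema_name.table_name
WHERE salary > 1000
ORDER BY salary DESC;
```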
Table and Schema properties
Create a schema with properties
Iceberg schemas do not support properties.
Create a table with properties
Users can use the following example to create a table with properties:
CREATE TABLE catalog.dbname.tablename
(
name varchar,
salary int
) WITH (
KEY = 'VALUE',
...
);
The following table lists the properties supported by Iceberg tables:
Property | Description | Default Value | Required | Reserved | Since Version |
---|---|---|---|---|---|
partitioning | Partition columns for the table | (none) | No | No | 0.4.0 |
sorted_by | Sorted columns for the table | (none) | No | No | 0.4.0 |
Reserved properties: A reserved property is one that users can read but cannot set.
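A concrete version of the generic template, using the two properties listed in the table, could look like this. It is a sketch that assumes both properties take an array of column names, as in Trino's own Iceberg connector; the choice of columns is illustrative.

```sql
-- Sketch: partition by name and sort data files by salary
CREATE TABLE catalog.dbname.tablename
(
  name varchar,
  salary int
) WITH (
  partitioning = ARRAY['name'],  -- partition columns
  sorted_by = ARRAY['salary']    -- sort columns
);
```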