Hudi catalog
Introduction
Apache Gravitino provides the ability to manage Apache Hudi metadata.
Requirements and limitations
Tested and verified with Apache Hudi 0.15.0.
Catalog
Catalog capabilities
- Works as a catalog proxy, supporting HMS as the catalog backend.
- Only supports read operations (list and load) for Hudi schemas and tables.
- Doesn't support timeline management operations yet.
 
Catalog properties
| Property name | Description | Default value | Required | Since Version | 
|---|---|---|---|---|
| catalog-backend | Catalog backend of the Gravitino Hudi catalog. Only hms is supported now. | (none) | Yes | 0.7.0-incubating |
| uri | The URI associated with the backend, for example thrift://127.0.0.1:9083 for the HMS backend. | (none) | Yes | 0.7.0-incubating |
| client.pool-size | For the HMS backend. The maximum number of Hive metastore clients in the pool for Gravitino. | 1 | No | 0.7.0-incubating |
| client.pool-cache.eviction-interval-ms | For the HMS backend. The cache pool eviction interval, in milliseconds. | 300000 | No | 0.7.0-incubating |
| gravitino.bypass. | Properties whose names start with this prefix are passed down to the underlying backend client. For example, gravitino.bypass.hive.metastore.failure.retries = 3 sets 3 retries upon failure of Thrift metastore calls for the HMS backend. | (none) | No | 0.7.0-incubating |
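As an illustration of how these properties fit together, the following Python sketch validates a property map and builds a catalog-creation request body. The property names, defaults, and example URI come from the table above. The helper names, the catalog name `hudi_catalog`, and the `type` and `provider` values are assumptions made for illustration, as is the idea that the `gravitino.bypass.` prefix is stripped before a key reaches the backend client; check them against your Gravitino version.

```python
import json

# Required properties and defaults, taken from the table above.
REQUIRED = ("catalog-backend", "uri")
DEFAULTS = {
    "client.pool-size": "1",
    "client.pool-cache.eviction-interval-ms": "300000",
}
BYPASS_PREFIX = "gravitino.bypass."


def build_hudi_catalog_request(name, properties):
    """Hypothetical helper: validate properties and build a request body."""
    missing = [key for key in REQUIRED if key not in properties]
    if missing:
        raise ValueError(f"missing required properties: {missing}")
    if properties["catalog-backend"] != "hms":
        raise ValueError("only the hms backend is supported")
    # Defaults are made explicit here only for illustration.
    merged = {**DEFAULTS, **properties}
    return {
        "name": name,
        "type": "RELATIONAL",          # assumed catalog type
        "provider": "lakehouse-hudi",  # assumed provider name
        "properties": merged,
    }


def bypass_properties(properties):
    """Return the bypass keys with the prefix removed (assumed behavior)."""
    return {
        key[len(BYPASS_PREFIX):]: value
        for key, value in properties.items()
        if key.startswith(BYPASS_PREFIX)
    }


request = build_hudi_catalog_request(
    "hudi_catalog",  # hypothetical catalog name
    {
        "catalog-backend": "hms",
        "uri": "thrift://127.0.0.1:9083",
        "gravitino.bypass.hive.metastore.failure.retries": "3",
    },
)
print(json.dumps(request, indent=2))
```

Validating the two required properties up front surfaces a misconfigured catalog before any request is sent.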
Catalog operations
Please refer to Manage Relational Metadata Using Gravitino for more details.
Schema
Schema capabilities
- Only supports read operations: listSchema, loadSchema, and schemaExists.
 
Schema properties
- The Location property is optional and shows the storage path of the Hudi database.
Schema operations
Only read operations are supported: listSchema, loadSchema, and schemaExists. Please refer to Manage Relational Metadata Using Gravitino for more details.
Table
Table capabilities
- Only supports read operations: listTable, loadTable, and tableExists.
 
Table partitions
- Supports loading Hudi partitioned tables (Hudi only supports identity partitioning).
 
Table sort orders
- Doesn't support table sort orders.
 
Table distributions
- Doesn't support table distributions.
 
Table indexes
- Doesn't support table indexes.
 
Table properties
- For the HMS backend, all table parameters stored in the HMS are exposed as table properties.
 
Table column types
The following table shows the mapping between Gravitino and Apache Hudi column types:
| Gravitino Type | Apache Hudi Type | 
|---|---|
| boolean | boolean |
| integer | int |
| long | long |
| date | date |
| timestamp | timestamp |
| float | float |
| double | double |
| string | string |
| decimal | decimal |
| binary | bytes |
| array | array |
| map | map |
| struct | struct |
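The mapping above can be expressed as a simple lookup table. The sketch below is illustrative only: the dictionary contents come from the table, while the helper name `to_gravitino_type` is hypothetical and not part of any Gravitino API. It handles bare type names only, not parameterized forms such as a decimal with precision and scale.

```python
# Gravitino type -> Apache Hudi type, copied from the table above.
GRAVITINO_TO_HUDI = {
    "boolean": "boolean",
    "integer": "int",
    "long": "long",
    "date": "date",
    "timestamp": "timestamp",
    "float": "float",
    "double": "double",
    "string": "string",
    "decimal": "decimal",
    "binary": "bytes",
    "array": "array",
    "map": "map",
    "struct": "struct",
}

# Reverse lookup; valid because each Hudi type appears exactly once.
HUDI_TO_GRAVITINO = {
    hudi: gravitino for gravitino, hudi in GRAVITINO_TO_HUDI.items()
}


def to_gravitino_type(hudi_type):
    """Hypothetical helper: map a Hudi type name to its Gravitino name."""
    try:
        return HUDI_TO_GRAVITINO[hudi_type]
    except KeyError:
        raise ValueError(
            f"no Gravitino mapping for Hudi type: {hudi_type}"
        ) from None


print(to_gravitino_type("int"))    # integer
print(to_gravitino_type("bytes"))  # binary
```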
Table operations
Only read operations are supported: listTable, loadTable, and tableExists. Please refer to Manage Relational Metadata Using Gravitino for more details.
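Because the Hudi catalog is read-only, a client only ever needs the list and load calls for schemas and tables. The sketch below builds the URLs such a client might request. The server address, the REST path layout, and all names (`demo_metalake`, `hudi_catalog`, `sales`, `orders`) are assumptions for illustration; confirm the paths against the Gravitino REST API reference for your version.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8090/api"  # assumed server address


def _join(*parts):
    """Join URL-escaped path segments onto the assumed base URL."""
    return "/".join([BASE_URL] + [quote(part, safe="") for part in parts])


def list_schemas_url(metalake, catalog):
    return _join("metalakes", metalake, "catalogs", catalog, "schemas")


def load_schema_url(metalake, catalog, schema):
    return _join("metalakes", metalake, "catalogs", catalog,
                 "schemas", schema)


def list_tables_url(metalake, catalog, schema):
    return _join("metalakes", metalake, "catalogs", catalog,
                 "schemas", schema, "tables")


def load_table_url(metalake, catalog, schema, table):
    return _join("metalakes", metalake, "catalogs", catalog,
                 "schemas", schema, "tables", table)


# Hypothetical names, used only to show the resulting URL shapes.
print(list_schemas_url("demo_metalake", "hudi_catalog"))
print(load_table_url("demo_metalake", "hudi_catalog", "sales", "orders"))
```

Each of these would be issued as an HTTP GET; no create, alter, or drop requests apply to this catalog.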