Skip to main content
Gravitino is a high-performance, geo-distributed, and federated metadata lake. It manages the metadata directly in different sources, types, and regions. It also provides users with unified metadata access for data and AI assets.

Unified metadata management

Gravitino abstracts the unified metadata models and APIs for different kinds of metadata sources. For example, relational metadata models for tabular data, like Hive, MySQL, PostgreSQL, etc. File metadata model for all the unstructured data, like HDFS, S3, and other formats.

Unified metadata governance

Gravitino also provides a unified metadata governance layer (WIP) to manage the metadata in a unified way, including access control, auditing, discovery and other features.

Direct metadata management

Unlike traditional metadata management systems, which need to collect the metadata actively or passively from underlying systems, Gravitino manages these systems directly. It provides a set of connectors to connect to different metadata sources. The changes in Gravitino directly reflect in the underlying systems, and vice versa.

Geo-distribution support

Gravitino supports geo-distribution deployment, which means different instances of Gravitino can deploy in different regions or clouds, and they can connect to get the metadata from each other. With this, users can get a global view of metadata across the regions or clouds.

Multi-engine support

Gravitino supports different query engines to access the metadata. Currently, it supports Trino, users can use Trino to query the metadata and data without needing to change the existing SQL dialects. Other query engine support is on the roadmap, including Apache Spark, Apache Flink and others.

AI asset management (WIP)

The goal of Gravitino is to unify the data management in both data and AI assets. The support of AI assets like models, features, and others are under development.