Apache Gravitino 1.1.0 - An AI-native metadata management platform

December 16, 2025 · 6 min read

PMC Member

We are glad to announce the release of Apache Gravitino 1.1.0! This release builds upon the solid foundation laid by Apache Gravitino 1.0.0, introducing a range of new features, improvements, and bug fixes that enhance the platform's capabilities, performance, and security.

Highlights

Broader catalog support (initial Lance REST service, a reusable generic lakehouse catalog, and Hive 3) to simplify integration with diverse lakehouse deployments.
Stronger metadata-level authorization and security hardening for the Iceberg REST surface.
Multi-cluster fileset support and Python client improvements for real-world multi-region and migration workflows.
Stability, performance and observability work across the entity-store, caches, scan planning, connectors and CI — reducing operational friction and test flakiness.

New Features

Built for the Future of AI Data: Lance REST service. #8889

As AI and ML workflows become central to data platforms, efficient access to vector data is crucial. The new Lance REST service exposes Lance datasets through a managed HTTP interface. This allows remote clients—such as inference services or notebooks—to access vector data with the high performance of the Lance format, all while adhering to Apache Gravitino's centralized security and governance policies.

Generic lakehouse catalog. #8828

The lakehouse ecosystem is diverse and rapidly evolving, with new table formats and engines emerging frequently. To keep pace, we introduced a generic lakehouse catalog framework. This abstraction reduces the boilerplate code required to integrate new engines, standardizing how capabilities are negotiated and how namespaces are handled. This means faster support for new formats and a more consistent experience for developers and users alike.

Access control for Iceberg REST service. #4290

The Iceberg REST catalog is becoming the standard for open table access, but production use demands robust security. We have hardened the Iceberg REST service with comprehensive authentication and authorization checks. This ensures that data accessed via standard Iceberg clients is fully protected, making Apache Gravitino a secure choice for multi-tenant and public-facing data lake deployments.

Hive 3 catalog support. #5912

Many enterprises still rely on Hive 3 for their core data infrastructure, making migration a risky and complex endeavor. This feature allows users to register existing Hive 3 metastores directly as Apache Gravitino catalogs. By doing so, organizations can instantly bring their legacy data under Apache Gravitino's unified governance and management umbrella without moving data or disrupting existing workloads, paving the way for a smoother transition to modern lakehouse architectures.

Multiple HDFS clusters support. #9117, #9288

In large-scale production environments, data is often distributed across multiple HDFS clusters to ensure isolation and disaster recovery. Previously, Apache Gravitino was limited in how it handled these complex topologies. With this release, users can manage filesets across multiple HDFS clusters within a single Apache Gravitino instance. This capability simplifies cross-cluster data management, improves resource isolation, and provides greater flexibility for multi-tenant architectures.

Metadata authorization for IRC, statistics, tags, jobs, and policies. #4361, #8752, #8944, #8943

True governance requires securing every aspect of the metadata platform. We have expanded fine-grained authorization to cover auxiliary resources like tags, statistics, and background jobs. This enhancement closes previous security gaps, ensuring that all user interactions with the system—whether viewing statistics or managing tags—are strictly governed by least-privilege policies.

New Iceberg REST endpoints. #6336

To support the full range of capabilities expected by modern analytics tools, we have implemented additional endpoints from the Iceberg REST specification. This improves compatibility with the latest query engines and clients, ensuring that users can leverage advanced planning and catalog operations without running into compatibility issues.

Improvements

Core & Server

Entity store and Cache: Fixed several performance and logic issues to improve stability and speed. #8697, #8743, #8815, #8817, #8710, #9148, #7916, #8546
Metrics: Expose more metrics for server and catalogs to enhance observability. #8594
Authorization: Refined permission checks. #7942
Resource management: Improved resource release and closure mechanisms to prevent leaks. #8981, #9002, #8999
JDBC metric store: Support storing Iceberg metrics in JDBC. #8899
Job system enhancement: Support job alteration. #8638, #8814

Catalogs & Connectors

Iceberg catalog: Support metadata cache. #8314
Upgrade Iceberg to 1.10.0 to support scan planning. #9046
Improve dynamic config provider for better usability. #8970
Fileset catalog: Prevented filesystem instances from hanging for a long time. #9280
Trino connector: Support SQL UPDATE/DELETE/MERGE. #8241
Fix getTableStatistics in GravitinoMetadata. #9100

Clients

GVFS client: Improved stability and error handling. #8752, #8882, #8948, #8953
Fileset bundle JARs: Refactored for a more detailed delivery strategy. #9106
Python client: Added support for relational catalog. #5198

Developer Experience & Operations

Helm chart: Enhanced configuration options and stability. #8747, #8174
GitHub templates: Added templates to support AI coding. #9227
Tests: Refactoring and enhancement of test suites. #9223, #9107
Docker: Changed Apache Gravitino Docker base image. #8817
Code Style: Upgrade Google Java Format to support JDK 17. #8792

Frontend Updates

Added pagination for files list. #8987
Displayed the index type in UI. #6997
Upgraded dependabot affected versions. #9357
Fixed routing issue where path '/' may not route to 'metalakes'. #9354

Bug Fixes

Create topic encounters NoSuchTopicException when Kafka is deployed with 3 brokers on EKS. #4168
Apache Gravitino IRC server returns java.lang.NoSuchMethodError: void org.apache.hadoop.security.HadoopKerberosName.setRuleMechanism. #8754
Several bugs in SQL provider. #8659, #9166
Unknown error when using fsspec through JNI. #8858

Still, there are many bug fixes that have not been listed due to limited space. Please refer to the full list of issues and pull requests merged since the 1.0.0 release for more details.

Acknowledgements

Thanks to everyone who contributed to the 1.1.0 work — code, reviews, tests, issue triage, design, and feedback. Below is a consolidated list of contributor GitHub IDs extracted from issue and PR activity.

Apache, Apache Flink, Apache Hive, Apache Hudi, Apache Iceberg, Apache Ranger, Apache Spark, Apache Paimon and Apache Gravitino are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Apache Gravitino 1.0.1 - Release Notes

November 14, 2025 · 3 min read

Minghuang Li

committer

We are pleased to announce the release of Gravitino 1.0.1. This version introduces comprehensive support for job template alterations, along with significant improvements and bug fixes across the core engine, various catalogs, and clients.

Major Features & Improvements

Job and Job Template

Supports altering job templates. #8638, #8639, #8781, #8783, #8640, #8641, #8642
Supports placeholders for all job template fields. #8865
Supports running Spark jobs in the local environment. #7962

Gravitino Core

Refactored tag operations by leveraging the entity store's relation operations. #7916
Made several optimizations to the Caffeine cache, including adjusting weight values, resolving a performance issue with reverseIndex, and prioritizing the eviction of tags and policies when the cache is full. #8697, #8743, #8815, #8871, #8937

Catalogs

Kafka: Fixed an issue where topic creation was asynchronous, ensuring the operation is now synchronous. #4168
Iceberg: Fixed a failure in starting the Iceberg REST server within a Docker environment. #8733
Doris, StarRocks, PostgreSQL: Fixed incorrect parsing of column default values and types for these data sources. #8277

Python Client

Added metadata objects to the Python client. #8627
Fixed an incorrect credential URL and a fileset test issue on GCS. #8935, #8969

Authorization

Authorization is supported for the testCatalogConnection operation. #7893

Web UI

Fixed an issue with reconfiguring submission parameters when creating a catalog. #8694
Added pagination support for the fileset file list. #8987

Bug Fixes

Fixed a Null Pointer Exception (NPE) in TableFormat.java when a user has no roles. #8202
Corrected exception handling in the setPolicy operation. #8661
Fixed missing policy operations in the OpenAPI entry point. #8706
Fixed a build failure in the gvfs-fuse module. #8830
Fixed an issue where the hard deletion of statistics would fail. #9038
Corrected index names for statistics and job names in the database upgrade script. #8979
Fixed deletePolicyAndVersionMetasByLegacyTimeline error. #9031
Fixed an issue where roles were not updated when a table is deleted. #8824

Credits

We would like to thank the following contributors for their valuable contributions to this release:

@dyrnq @yuqi1129 @LauraXia123 @jerryshao @danhuawang @playasim @keepConcentration @KayMas2808 @jerqi @mchades @HugoSalaDev @FANNG1 @diqiu50 @hdygxsj @tsungchih

Apache Gravitino 1.0.0 - From Metadata Management to Contextual Engineering

September 24, 2025 · 8 min read

Jerry Shao

PMC Member

Apache Gravitino was designed from day one to provide a unified framework for metadata management across heterogeneous sources, regions, and clouds—what we define as the metadata lake (or metalake). Throughout its evolution, Gravitino has extended support to multiple data modalities, including tabular metadata from Apache Hive, Apache Iceberg, MySQL, and PostgreSQL; unstructured assets from HDFS and S3; streaming and messaging metadata from Apache Kafka; and metadata for machine learning models. To further strengthen governance in Gravitino, we have also integrated advanced capabilities, including tagging, audit logging, and end-to-end lineage capture.

After all enterprise metadata has been centralized through Gravitino, it forms a data brain: a structured, queryable, and semantically enriched representation of data assets. This enables not only consistent metadata access but also knowledge grounding, contextual reasoning, tool use, and more. As we approach the 1.0 milestone, our focus shifts from pure metadata storage to metadata-driven contextual engineering, built on a foundation we call the Metadata-driven Action System, which provides the building blocks for contextual engineering.

The release of Apache Gravitino 1.0.0 marks a significant engineering step forward, with robust APIs, extensible connectors, enhanced governance primitives, and improved scalability and reliability in distributed environments. In the following sections, I will dive into the new features and architectural improvements introduced in Gravitino 1.0.0.

Metadata-driven action system

In version 1.0.0, we introduced three new components that enable us to build jobs to accomplish metadata-driven actions, such as table compaction, TTL data management, and PII identification. These three new components are: the statistics system, the policy system, and the job system.

Taking table compaction as an example:

Firstly, users can define the table compaction policy in Gravitino and associate this policy with the tables that need to be compacted.
Then, users can save the statistics of the table to Gravitino.
Also, users can define a job template for the compaction.
Lastly, users can use the statistics with the defined policy to generate the compaction parameters and use these parameters to trigger a compaction job based on the defined job templates.
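The four steps above can be sketched in a few lines of Python. This is only an illustration of the flow, not Gravitino's API: the `CompactionPolicy` and `TableStatistics` classes, their fields, the `plan_compaction` and `render_job` helpers, and the `{{name}}` placeholder style are all hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class CompactionPolicy:
    # Hypothetical rule: compact when the average file is smaller than this.
    min_avg_file_size_mb: float
    target_file_size_mb: int


@dataclass
class TableStatistics:
    # Hypothetical table-level statistics saved by the user.
    file_count: int
    total_size_mb: float


def plan_compaction(policy, stats):
    """Return job parameters if the policy calls for compaction, else None."""
    if stats.file_count == 0:
        return None
    avg_file_size_mb = stats.total_size_mb / stats.file_count
    if avg_file_size_mb >= policy.min_avg_file_size_mb:
        return None  # Files are already large enough.
    output_files = max(1, math.ceil(stats.total_size_mb / policy.target_file_size_mb))
    return {
        "target_file_size_mb": str(policy.target_file_size_mb),
        "expected_output_files": str(output_files),
    }


def render_job(template_args, params):
    """Fill {{name}} placeholders in each argument of a job template."""
    rendered = []
    for arg in template_args:
        for key, value in params.items():
            arg = arg.replace("{{" + key + "}}", value)
        rendered.append(arg)
    return rendered


policy = CompactionPolicy(min_avg_file_size_mb=32, target_file_size_mb=128)
stats = TableStatistics(file_count=1000, total_size_mb=4000)
params = plan_compaction(policy, stats)
job_args = render_job(["--target-size", "{{target_file_size_mb}}"], params)
```

The policy and the statistics stay independent of each other; only the planning step combines them, which is what lets the same job template serve many tables.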

Statistics system

The statistics system is a new component for storing and retrieving statistics. You can define and store table-level and partition-level statistics in Gravitino, and fetch them through Gravitino for different purposes.

For the details of how we design this component, please see #7268. For instructions on using the statistics system, refer to the documentation here.

Policy system

The policy system enables you to define action rules in Gravitino, like compaction rules or TTL rules. The defined policy can be associated with the metadata, which means these rules will be enforced on the dedicated metadata. Users can leverage these enforced policies to decide how to trigger an action on the dedicated metadata.

Please refer to the policy system documentation to learn how to use it. For more information on the policy system's implementation details, please refer to #7139.

Job system

The job system is another feature that allows you to submit and run jobs through Gravitino. Users can register a job template, then trigger a job based on that template. Gravitino submits the job to the dedicated job executor, such as Apache Airflow, manages the job lifecycle, and stores the job status. With the job system, users can run a self-defined job to accomplish a metadata-driven action.

In version 1.0.0, we have an initial version that supports running jobs as a local process. If you want to know more about the design details, you can follow issue #7154. User-facing documentation can also be found here.
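To make the register, trigger, and track flow concrete, here is a minimal sketch of a runner that executes a registered template as a local process. It is not the Gravitino implementation: `JobTemplate`, `LocalJobRunner`, the status strings, and the `{{name}}` placeholder syntax are assumptions made for illustration.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class JobTemplate:
    # Hypothetical template: an executable plus arguments with placeholders.
    name: str
    executable: str
    arguments: list


class LocalJobRunner:
    """Keeps registered templates and the last known status of each job."""

    def __init__(self):
        self.templates = {}
        self.status = {}

    def register(self, template):
        self.templates[template.name] = template

    def run(self, template_name, job_id, params):
        template = self.templates[template_name]
        args = []
        for arg in template.arguments:
            for key, value in params.items():
                arg = arg.replace("{{" + key + "}}", value)
            args.append(arg)
        self.status[job_id] = "STARTED"
        # Run the job as a child process and wait for it to finish.
        result = subprocess.run(
            [template.executable, *args], capture_output=True, text=True
        )
        self.status[job_id] = "SUCCEEDED" if result.returncode == 0 else "FAILED"
        return result.stdout


runner = LocalJobRunner()
runner.register(
    JobTemplate("echo", sys.executable, ["-c", "print('compacting {{table}}')"])
)
output = runner.run("echo", "job-1", {"table": "db.t1"})
```

A real executor would run jobs asynchronously and persist their status; the synchronous call here only keeps the example short.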

The whole metadata-driven action system is still in an alpha phase for version 1.0.0. The community will continue to evolve the code and take the Iceberg table maintenance as a reference implementation in the next version. Please stay tuned.

Agent-ready through the MCP server

MCP is a powerful protocol to bridge the gap between human languages and machine interfaces. With MCP, users can communicate with the LLM using natural language, and the LLM can understand the context and invoke the appropriate tools.

In version 1.0.0, the community officially delivered the MCP server for Gravitino. Users can launch it as a remote or local MCP server and connect to various MCP applications, such as Cursor and Claude Desktop. Additionally, we exposed all metadata-related interfaces as tools that MCP clients can call.

With the Gravitino MCP server, users can manage and govern metadata, as well as perform metadata-driven actions using natural language. Please follow issue #7483 for more details. Additionally, you can refer to the documentation for instructions on how to start the MCP server locally or in Docker.

Unified access control framework

Gravitino introduced the RBAC system in the previous version, but it only offers users the ability to grant privileges to roles and users, without enforcing access control when manipulating the secure objects. In 1.0.0, we complete this missing piece in Gravitino.

Currently, users can set access control policies through our RBAC system and enforce these controls when accessing secure objects. For details, you can refer to the umbrella issue #6762.
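The difference between granting privileges and enforcing them can be shown with a small sketch. The `AccessControl` class, the `load_table` function, and the `SELECT_TABLE` privilege string below are illustrative assumptions, not Gravitino's server code.

```python
class AccessControl:
    """Toy RBAC store: privileges are granted to roles, roles to users."""

    def __init__(self):
        self._role_privileges = {}  # role -> {(securable object, privilege)}
        self._user_roles = {}       # user -> {role}

    def grant_privilege(self, role, securable_object, privilege):
        self._role_privileges.setdefault(role, set()).add(
            (securable_object, privilege)
        )

    def grant_role(self, user, role):
        self._user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user, securable_object, privilege):
        return any(
            (securable_object, privilege) in self._role_privileges.get(role, set())
            for role in self._user_roles.get(user, set())
        )

    def enforce(self, user, securable_object, privilege):
        # Enforcement is the step that granting alone does not provide.
        if not self.is_allowed(user, securable_object, privilege):
            raise PermissionError(
                f"{user} lacks {privilege} on {securable_object}"
            )


def load_table(acl, user, table):
    """Check access before returning anything about the table."""
    acl.enforce(user, table, "SELECT_TABLE")
    return f"metadata for {table}"


acl = AccessControl()
acl.grant_privilege("analyst", "catalog.db.t1", "SELECT_TABLE")
acl.grant_role("alice", "analyst")
```

With this setup, `load_table(acl, "alice", "catalog.db.t1")` succeeds, while the same call for a user without the role raises `PermissionError`.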

Add multiple-location support for model management

Model management was introduced in Gravitino 0.9.0. Users have since requested support for multiple storage locations within a single model version, allowing them to select a model version with a preferred location.

In 1.0.0, the community added multiple locations for model management. This feature is similar to the fileset's support for multiple locations. Users can check the document here for more information. For implementation details, please refer to issue #7363.

Support the latest Apache Iceberg and Paimon versions

In Gravitino 1.0.0, we have upgraded the supported Iceberg version to 1.9.0. With the new version, we will add more feature support in the next release. Additionally, we have upgraded the supported Paimon version to 1.2.0, introducing new features for Paimon support.

See issue #6719 for the Iceberg upgrade and issue #8163 for the Paimon upgrade.

Various core features

Core:

Add the cache system in the Gravitino entity store #7175.
Add Marquez integration as a lineage sink in Gravitino #7396.

Server:

Add Azure AD login support for OAuth authentication #7538.

Catalogs:

Support StarRocks catalog management in Gravitino #3302.

Clients:

Add custom configurations for clients #7816, #7817, #7670, #7456.

Spark connector:

Upgrade the supported Kyuubi version #7480.

UI:

Add web UI for listing files / directories under a fileset #7477.

Deployment:

Add Helm chart deployment for Iceberg REST catalog #7159.

Behavior changes

Compatible changes:

Rename the Hadoop catalog to fileset catalog #7184.
Allow event listeners to change the Iceberg create table request #6486.
Support returning aliases when listing model versions #7307.

Breaking changes:

Change the supported Java version to JDK 17 for the Gravitino server.
Remove the Python 3.8 support for the Gravitino Python client #7491.
Fix the unnecessary double encoding and decoding issue for fileset get location and list files interfaces #8335. This change is incompatible with older versions of the Java and Python clients. Using an old client with a new server may cause decoding issues in some scenarios.

Overall

There are still lots of features, improvements, and bug fixes that are not mentioned here. We thank the community for their continued support and valuable contributions.

Apache Gravitino 1.0.0 opens a new chapter from the data catalog to the smart catalog. We will continue to innovate and build, to add more Data and AI features. Please stay tuned!

Credits

This release acknowledges the hard work and dedication of all contributors who have helped make this release possible.

1161623489@qq.com, Aamir, Aaryan Kumar Sinha, Ajax, Akshat Tiwari, Akshat kumar gupta, Aman Chandra Kumar, AndreVale69, Ashwil-Colaco, BIN, Ben Coke, Bharath Krishna, Brijesh Thummar, Bryan Maloyer, Cyber Star, Danhua Wang, Daniel, Daniele Carpentiero, Dentalkart399, Drinkaiii, Edie, Eric Chang, FANNG, Gagan B Mishra, George T. C. Lai, Guilherme Santos, Hatim Kagalwala, Jackeyzhe, Jarvis, JeonDaehong, Jerry Shao, Jimmy Lee, Joonha, Joonseo Lee, Joseph C., Justin Mclean, KWON TAE HEON, Kang, KeeProMise, Khawaja Abdullah Ansar, Kwon Taeheon, Kyle Lin, KyleLin0927, Lord of Abyss, MaAng, Mathieu Baurin, Maxspace1024, Mikshakecere, Mini Yu, Minji Kim, Minji Ryu, Nithish Kumar S, Pacman, Peidian li, Praveen, Qian Xia, Qiang-Liu, Qiming Teng, Raj Gupta, Ratnesh Rastogi, Raveendra Pujari, Reuben George, RickyMa, Rory, Sambhavi Pandey, Sébastien Brochet, Shaofeng Shi, Spiritedswordsman, Sua Bae, Surya B, Tarun, Tian Lu, Tianhang, Timur, Viral Kachhadiya, Will Guo, XiaoZ, Xiaojian Sun, Xun, Yftach Zur, Yuhui, Yujiang Zhong, Yunchi Pang, Zhengke Zhou, _.mung, ankamde, arjun, danielyyang, dependabot[bot], fad, fanng, gavin.wang, guow34, jackeyzhe, kaghatim, keepConcentration, kerenpas, kitoha, lipeidian, liuxian, liuxian131, lsyulong, mchades, mingdaoy, predator4ann, qbhan, raveendra11, roryqi, senlizishi, slimtom95, taylor.fan, taylor12805, teo, tian bao, vishnu, yangyang zhong, youngseojeon, yuhui, yunchi, yuqi, zacsun, zhanghan, zhanghan18, 梁自强, 박용현, 배수아, 신동재, 이승주, 이준하

Apache Gravitino 0.9.1

July 21, 2025 · 2 min read

Rory Qi

committer

Model Management

Support updating aliases for model versions #6814, #7158

Add file viewer support for Filesets #6860

Implement ListFilesEvent in FilesetEventDispatcher #7314

Support setOwner/getOwner event operations #7646

Trino Connector

Auto-load multiple metalakes in Trino connector #7288

JDBC Validation

Validate JDBC URLs during store initialization #7547

Bug Fixes

Core & Catalogs

Fix H2 backend file lock issues during deletion #7406

Prevent SQL session commit errors #7403

Correct OAuth token refresh in web UI #7426

Validate namespace string conversions #7516

Improve server force-kill shutdown logic #7513

Fix bypass key handling in Hive catalog #7416

Filter empty Hadoop storage locations #7190

Fix model catalog error messages #7346

Connectors

Spark Connector

Remove conflicting slf4j dependency #7287

Fix S3 credential test errors #7432

Trino Connector

Handle unsupported catalog providers #7322

Python Client

Fix storage handler mappings for S3/OSS/ABS #7225

Improve Java client error messages #7344

Filesets

Fix multi-location file paths #7371

Improvements

Core & Catalogs

Optimize column deletion logic #7415

Auto-register mappers via SPI #7529

Validate JDBC entity store URLs #7614

Fix catalog index existence checks #7660

CLI & Clients

Remove duplicate owner field in CLI #7639

URL-encode paths in Java client #7686

Testing

Refactor Hadoop catalog test stubbing #7280

Fix precondition message mismatches #7521

Documentation

Add Trino REST catalog example #7121

Iceberg IRC guides for StarRocks/Doris #7368

OpenAPI specs for Fileset/File #6860

Fix access control docs #7195

Update model privilege docs #7555

Typo fixes #7448, #7647

Remove incubating status markers #7492

Add 0.9.1 release notes #7485

Build & Infra

Fix Helm chart versioning #7129, #7134

Upgrade Kyuubi dependency #7480

Credits

FANNG1 Abyss-lord jerqi jerryshao slimtom95 flaming-archer yunchipang KyleLin0927 xiaozcy diqiu50 yuqi1129 ziqiangliang carl239 LauraXia123 guov100 senlizishi fivedragon5 justinmclean Jackeyzhe Spiritedswordsman su8y

Apache Gravitino 0.9.0 - Focus on AI, data governance, and security with multi-dimensional feature upgrade

May 7, 2025 · 4 min read

Rory Qi

committer

Gravitino 0.9.0 focuses on advancements in AI, data governance, and security. Many of its new features are already being used in production environments. The release has attracted strong interest from users from well-known companies, with AI and security capabilities drawing attention.

In this version, the community optimized the user experience for fileset catalogs and model catalogs, making it easier for users to manage their unstructured AI data and model data.

The community added a new data lineage interface. Users can now implement a custom data lineage plugin to adapt to their own system.

For security, the community has corrected some privilege semantics and fixed authorization plugin corner cases to make the entire system more robust.

Model Catalog

Before 0.9.0, the model catalog was immutable, which was not flexible. In the new version, users can alter models and model versions and add tags #6626 #6222.

Fileset Catalog

Gravitino now supports multiple named storage locations within a single fileset and placeholder-based path generation.

With multiple location support, users can reference data across different file systems (HDFS, S3, GCS, local, etc.) through a unified fileset interface, each with a unique location name.

The placeholder feature allows dynamic storage path generation using the {{placeholder}} syntax, automatically replacing placeholders with corresponding fileset properties.

These enhancements significantly improve flexibility for multi-cloud environments and complex data organization patterns while maintaining a clean abstraction layer for data asset management #6681.
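As a rough illustration of placeholder-based path generation, the sketch below replaces each `{{placeholder}}` in a location template with the matching property value. The `resolve_location` helper and the property names `team` and `project` are hypothetical; the exact property naming rules Gravitino applies are not covered here.

```python
import re

# Matches {{name}} where name is made of letters, digits, or underscores.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def resolve_location(template, properties):
    """Replace every {{name}} in the template with properties[name]."""

    def substitute(match):
        key = match.group(1)
        if key not in properties:
            raise KeyError(f"no fileset property for placeholder {key!r}")
        return properties[key]

    return PLACEHOLDER.sub(substitute, template)


location = resolve_location(
    "s3://bucket/{{team}}/{{project}}/data",
    {"team": "ml", "project": "ranking"},
)
```

Failing on an unknown placeholder, rather than leaving it in the path, is a design choice made for this sketch so that a misconfigured fileset is caught early.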

GVFS (Gravitino Virtual File System)

GVFS has been enhanced to support accessing multiple locations within filesets. Users can now select which location to use through configuration properties, environment variables, or fileset default settings.
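One way to picture the three sources is as a simple fallback chain. In the sketch below, the order (configuration first, then environment variable, then fileset default) is an assumption for illustration, as are the `current_location_name` key and the `CURRENT_LOCATION_NAME` variable; check the GVFS documentation for the actual names and precedence.

```python
import os


def select_location_name(config, fileset_default, environ=None):
    """Pick a location name from config, then environment, then the default."""
    environ = os.environ if environ is None else environ
    return (
        config.get("current_location_name")       # hypothetical config key
        or environ.get("CURRENT_LOCATION_NAME")   # hypothetical variable name
        or fileset_default
    )


chosen = select_location_name(
    config={},
    fileset_default="hdfs-primary",
    environ={"CURRENT_LOCATION_NAME": "s3-backup"},
)
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.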

GVFS has also been refactored with a pluggable architecture allowing custom operations and hooks. This enables users to extend functionality through operations_class and hook_class configuration options for more flexible integration with their specific infrastructure #6938.

Security

The new version has added privileges for the data model and corrected some privilege semantics. It has also fixed some bugs with the Ranger path-based plugin #6620 #6575 #6821 #6864. All of the user-related, group-related, and role-related events are now supported for the event system #2969.

Data Lineage

The community added a data lineage interface that follows the OpenLineage API specification. Users can implement their custom data lineage plugin to adapt to their system #6617.

Core

The community also worked on performance, which was improved by reducing lock scope and by reading data from storage in batches #6744 #6560 #2870.

CLI

Additionally, there is one more change worth mentioning. Users no longer need to rely on the alias command to use the CLI. Instead, the community provided a convenient script located at ./bin/gcli.sh so that a user can directly use the CLI client #5383.

Connector

Both the Flink connector and the Spark connector added JDBC support #6233 #6164.

Chart

Gravitino can now be deployed on Kubernetes with a fully customizable configuration #6594.

Overall

Gravitino 0.9.0 focuses on advancements in AI, data governance, and security. We thank the Gravitino community for their continued support and valuable contributions. We can continue to innovate and build thanks to all our users' feedback. Thank you for taking the time to read this! To dive deeper into the Gravitino 0.9.0 release, explore the full documentation. Your feedback is greatly valued and helps shape the future of the Gravitino project and community.

Credits

JavedAbdullah AndreVale69 Brijeshthummar02 cool9850311 liuchunhao danhuawang unknowntpo FANNG1 tsungchih jerryshao justinmclean zhoukangcn Abyss-lord amazingLyche yuqi1129 Pranaykarvi puchengy LauraXia123 tengqm rud9192 antony0016 frankvicky TEOTEO520 TungYuChiang sunxiaojian xunliu LuciferYang diqiu50 zhengkezhou1 caican00 granewang yunchipang jerqi mchades rickyma Xander-run flaming-archer waukin lsyulong luoshipeng FourFriends this-user vitamin43 hdygxsj liangyouze

Apache, Apache Flink, Apache Hive, Apache Hudi, Apache Iceberg, Apache Ranger, Apache Spark, Apache Paimon and Apache Gravitino are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Apache Gravitino 0.8.0 - strengthen the AI support for Apache Gravitino™ (incubating)

January 24, 2025 · 6 min read

Xiaojing Fang

committer

Apache Gravitino 0.8.0 is the third major release after entering the ASF. In this release, the community provides several exciting features like model catalog, Fuse for Fileset, credential vending for Fileset, Flink Iceberg and Paimon connector, Spark Paimon connector, and security enforcement.

This release blog will briefly introduce the new significant features and improvements. Please keep reading to learn more about what the community has worked on.

Model Catalog

Besides table and messaging metadata, Gravitino supports model metadata management in version 0.8. Gravitino allows a model to have multiple versions, and users can choose the best version. 0.8 provides basic functionality, and more features will be provided in the future, such as tagging models and better integration with machine learning workflows, to help users better manage models and extract more value from data and models.

Support model versioning metadata #4783.

Credential vending

Credential vending is a fundamental function in the cloud. In version 0.7, credential vending was supported for the Iceberg REST server. In version 0.8, we offer support for the Gravitino server and integrate it with Fileset. With credential vending, Fileset can be used more securely and conveniently. The Gravitino server centrally manages the security key and issues a temporary token that is valid only for the Fileset being accessed by the request. This is more secure and removes the need for the user side to provide information such as an access key and secret key (AK/SK).

In addition to the support for GCS and S3, version 0.8 also has built-in support for OSS and ADLS credential vending, and can support other storage in a pluggable manner.

Support credential vending for fileset client #5677.
Support credential vending for Gravitino #4398.
Support Aliyun OSS credential provider #5625.
Support ADLS credential provider #5624.
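The idea behind credential vending, a server that keeps the long-lived key and hands out short-lived tokens scoped to one fileset, can be sketched as follows. This is a simplified stand-in that signs tokens with HMAC; real deployments rely on the cloud provider's temporary-credential services, and the `CredentialVendor` class and its token layout are hypothetical.

```python
import hashlib
import hmac
import time


class CredentialVendor:
    """Holds the long-lived key and issues short-lived, path-scoped tokens."""

    def __init__(self, secret_key, ttl_seconds=900):
        self._secret_key = secret_key
        self._ttl_seconds = ttl_seconds

    def _sign(self, path, expires_at):
        message = f"{path}|{expires_at}".encode()
        return hmac.new(self._secret_key, message, hashlib.sha256).hexdigest()

    def issue(self, fileset_path, now=None):
        now = int(time.time()) if now is None else now
        expires_at = now + self._ttl_seconds
        return {
            "path": fileset_path,
            "expires_at": expires_at,
            "signature": self._sign(fileset_path, expires_at),
        }

    def verify(self, token, requested_path, now=None):
        now = int(time.time()) if now is None else now
        expected = self._sign(token["path"], token["expires_at"])
        if not hmac.compare_digest(expected, token["signature"]):
            return False  # Token was tampered with.
        if now >= token["expires_at"]:
            return False  # Token has expired.
        root = token["path"].rstrip("/")
        # Only the fileset root and paths beneath it are covered.
        return requested_path == root or requested_path.startswith(root + "/")


vendor = CredentialVendor(b"server-side-secret", ttl_seconds=60)
token = vendor.issue("s3://bucket/fileset1", now=1000)
```

The client never sees `server-side-secret`; it only holds a token that stops working after 60 seconds and cannot be used for any other fileset.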

Fuse for Fileset

With the widespread use of Fileset in AI scenarios, improving usability and lowering the cost of adoption has become a major concern. In AI scenarios, users tend to access remote data as if it were on a local disk. Fuse for fileset is designed for this, enabling users to access data managed by Fileset as if they were using local disks. Currently, basic alpha functionality is provided, which allows access to S3 data managed by Fileset. In subsequent versions, metadata caching functionality and support for more storage will be provided. Fuse for fileset is developed in Rust for performance considerations, and everyone is welcome to join the development.

Implement GVFS fuse to access Gravitino fileset in the POSIX Protocol #5504.

Lakehouse Federation

Gravitino provides a variety of catalogs, such as Apache Hive, Apache Iceberg, Apache Hudi, and Apache Paimon. To connect them better to the surrounding ecosystem, this iteration provides a Flink Paimon connector, a Flink Iceberg connector, and a Spark Paimon connector to access data from Paimon and Iceberg. More connectors will be supported in the future.

Support Iceberg catalog in Flink connector #3515.
Support Paimon catalog in Flink connector #5194 #5193 #5192.
Support Paimon catalog in Spark connector #5722.

Security enforcement

Security is of the utmost importance for a metadata management system. In this iteration, we manage security policies through chain authorization, and support the push-down of SQL security policies and path-based security policies. Additionally, the privilege policies of Iceberg and Paimon tables can be pushed down to Ranger. Gravitino's security policies provide a solid foundation for your business development.

Chain authorization across multiple underlying data sources #5774.
SQL based authorization plugin #5530.
Add path-based authorization securable object and user-group mapping interface #5966.
Use chain authorization to support Hive and HDFS authorization #5956.
Ranger Authorization HDFS Plugin #5731.

Other notable enhancements

Iceberg REST server

Generate credential according to the data path and metadata path #5648.
Integrate audit log framework for Iceberg REST server #5556.
Add schema and view event for Iceberg REST server #5438 #5437.
Add HTTP header to Iceberg event #5518.

Core

Optimize the tree lock when dropping and loading tables and schemas #6044.
Support ADLS storage for Gravitino Iceberg catalog and Spark connector #5954.
Support pre-event for Gravitino server #5317.

Gravitino Client

Add CLI interface for Gravitino #4943.
Support Python client for table operations #5198.

Bug Fixes

Version 0.8 has fixed a large number of bugs, especially in terms of security and fileset usage. Some are listed below.

Can't load filesystem 'gs' when using Spark to access Gravitino GCS bundles #5609.
Invalid token issue in GVFS during long-running Spark jobs #5596.
Trino, Hive catalog: COMMENT COLUMN with ' ' or NULL causes an ArrayIndexOutOfBoundsException error #5533.
Correct the behavior when creating an Iceberg table with none distribution #6196.
Unable to create fileset with MinIO #6156.
When granting privileges to a role, a duplicated privilege name with a different condition shouldn’t be allowed #6116.
The owner of the catalog is incorrect when using Basic Auth and the password is empty #5968.
Granting a metalake-level privilege won't take effect #5892.

Overall

Apache Gravitino 0.8.0 is the third ASF release. This version adds a bunch of new features. We thank the Gravitino community for their continued support and valuable contributions. Thanks to our users' feedback, we can continue to innovate and build, so thanks to all those reading this!

To further explore the Gravitino 0.8.0 release, please check the documentation. Your feedback is invaluable to the community and the project.

Credits

This release acknowledges the hard work and dedication of all contributors who have helped make this release possible.

@Abyss-lord @Aireed @FANNG1 @LauraXia123 @LindaSummer @LiuQhahah @SophieTech88 @TungYuChiang @caican00 @chenyuan99 @cool9850311 @danhuawang @deeshantk @diqiu50 @featherchen @frankvicky @fsalhi2 @hdygxsj @hienduyph @jerqi @jerryshao @justinmclean @liangyouze @liuchunhao @luoshipeng @mchades @orenccl @pithecuse527 @rud9192 @sunxiaojian @theoryxu @waukin @xloya @xunliu @yuqi1129

Apache Gravitino 0.7.0 - strengthen the cloud support for Apache Gravitino™ (incubating)

November 14, 2024 · 6 min read

Jerry Shao

PMC Member

Apache Gravitino 0.7.0 is the second major release after entering the ASF. In this release, the community mainly focused on strengthening cloud support to make Gravitino work better in cloud environments.

This release blog will briefly introduce the new features related to cloud support and other significant features and improvements. Please keep reading to learn more about what the community has worked on.

Cloud storage support for Gravitino

As more and more users run their data stacks in the cloud and use cloud object storage, cloud storage support has become an imperative requirement. In this release, the community has mainly focused on adding cloud storage support for Gravitino to make sure Gravitino itself and its connectors/sources can work smoothly with cloud storage.

In this release:

Gravitino Iceberg REST catalog server now supports different cloud storage services, including AWS S3, Google GCS, Aliyun OSS. Users can simply configure it to make it work.
Gravitino Fileset catalog now supports managing files (objects) stored in S3, GCS, and OSS. Gravitino provides both server-side pluggable framework and client-side Java / Python GVFS (Gravitino Virtual File System) SDK. Users can easily use their existing tools with the Gravitino provided bundled packages to access the data in the cloud. Also, Gravitino provides a pluggable framework for users to implement their own storage support.
Gravitino’s Hive, Paimon, and Iceberg catalogs also add and verify support for different cloud storage services.
Gravitino’s Spark and Trino connectors also verify cloud storage support.

Overall, with the 0.7.0 release, Gravitino generally supports working with different cloud storage services. To know more, you can check our issue #4396. Also, we’re continuing to add more cloud storage support in future releases. Please stay tuned.

Credential vending support in Gravitino

Besides cloud storage support, credential vending is also vital for Gravitino, especially when working with cloud storage. The traditional approach of distributing static access key/secret key (AKSK) pairs is neither convenient nor safe. With credential vending, the Gravitino server obtains temporary tokens on behalf of users, which significantly simplifies client-side configuration and centralizes authentication.

Gravitino 0.7.0 introduces a framework to support credential vending and adds S3 and GCS token support. This framework is integrated into Gravitino's Iceberg REST catalog service, so users can smoothly access Iceberg tables on S3 and GCS with authentication.
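The idea behind credential vending can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Gravitino's implementation: `CredentialVendor`, `TemporaryCredential`, the one-hour lifetime, and the prefix-based scoping are all hypothetical, and a random string stands in for the call a real server would make to the cloud provider's token service.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TemporaryCredential:
    """A short-lived, path-scoped token handed to a client (hypothetical shape)."""
    token: str
    allowed_prefix: str
    expires_at: datetime

    def permits(self, path: str, now: datetime) -> bool:
        # Valid only before expiry and only under the scoped prefix.
        return now < self.expires_at and path.startswith(self.allowed_prefix)


class CredentialVendor:
    """Keeps the long-lived secret on the server; clients never see it."""

    def __init__(self, long_lived_secret: str, ttl: timedelta = timedelta(hours=1)):
        self._secret = long_lived_secret  # stays server-side
        self._ttl = ttl

    def vend(self, table_location: str, now: datetime) -> TemporaryCredential:
        # Assumption: a real server would exchange the secret with the cloud
        # provider's token service here; a random token stands in for that.
        return TemporaryCredential(
            token=secrets.token_urlsafe(16),
            allowed_prefix=table_location.rstrip("/") + "/",
            expires_at=now + self._ttl,
        )


now = datetime(2024, 11, 14, tzinfo=timezone.utc)
vendor = CredentialVendor("server-side-aksk")
cred = vendor.vend("s3://bucket/warehouse/db/table", now)

in_scope = "s3://bucket/warehouse/db/table/data/f.parquet"
print(cred.permits(in_scope, now))                                    # True
print(cred.permits("s3://bucket/warehouse/db/other/f.parquet", now))  # False
print(cred.permits(in_scope, now + timedelta(hours=2)))               # False
```

The client only ever holds a token that expires and is limited to one table location, which is what makes this safer than shipping the long-lived key pair to every client.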

This is just the first step of credential vending, and we will add more integrations with Gravitino, like fileset support, connector support, etc, in the next release.

For the details of credential vending, please check the issue #4398 and the design document.

Unified access control improvements

Gravitino 0.6.0 introduced the alpha version of unified access control with Apache Ranger support, but this feature still needed improvement. In version 0.7.0, we added many improvements and bug fixes to make end-to-end access control workable. Now, with the release of 0.7.0, Gravitino's unified access control works well with Spark and Ranger to secure end-to-end table access. To see what we have fixed, please check out our issue #4615. You can also use our playground to try out the unified access control feature.

Centralized audit log support

Thanks to the community, Gravitino now supports centralized audit logs. With this feature enabled, users can get the audit information in a centralized place, whether they’re accessing tables or filesets from various sources.

Gravitino’s audit log framework also supports different plugin formatters and writers, and users can implement their own log format and output destinations.
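The split between formatters and writers can be pictured with a small Python sketch. All names here (`AuditEvent`, `AuditLogger`, `text_formatter`, `json_formatter`) are hypothetical and chosen for illustration; they are not Gravitino's actual plugin interfaces, which should be taken from the linked issues.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class AuditEvent:
    user: str
    operation: str
    target: str
    succeeded: bool


# A formatter turns an event into a line; a writer sends the line somewhere.
Formatter = Callable[[AuditEvent], str]
Writer = Callable[[str], None]


def text_formatter(event: AuditEvent) -> str:
    status = "OK" if event.succeeded else "FAILED"
    return f"{event.user} {event.operation} {event.target} {status}"


def json_formatter(event: AuditEvent) -> str:
    return json.dumps(
        {"user": event.user, "op": event.operation,
         "target": event.target, "ok": event.succeeded},
        sort_keys=True,
    )


class AuditLogger:
    """Combines one formatter with one writer; either can be swapped."""

    def __init__(self, formatter: Formatter, writer: Writer):
        self._formatter = formatter
        self._writer = writer

    def log(self, event: AuditEvent) -> None:
        self._writer(self._formatter(event))


# An in-memory list stands in for a file or a remote log service.
sink: List[str] = []
logger = AuditLogger(text_formatter, sink.append)
logger.log(AuditEvent("alice", "LOAD_TABLE", "catalog.db.orders", True))
logger.log(AuditEvent("bob", "DROP_FILESET", "catalog.schema.logs", False))
print(sink)
```

Because table and fileset operations both pass through the same logger, changing the output format or destination means replacing one function rather than touching every source.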

Please see the issues #4887 and #4021 to know more about Gravitino’s centralized audit log.

New data sources support

As a unified data catalog, Gravitino always aims to support more data sources. In this version, Gravitino adds two new ones: Apache Hudi and OceanBase. You can now use Gravitino to manipulate Hudi and OceanBase metadata in a unified manner.

Various core features

Apart from the features listed above, this version also improves a lot at its core. Here are several important features:

Add PostgreSQL support for storage backend #4101. Gravitino already supports using MySQL and H2 as its backend metadata storage. In 0.7.0, the community adds PostgreSQL support to broaden adoption.
Unify the catalog and metalake drop behavior #5031. In the previous version, we didn’t enforce the behavior of catalog and metalake drop operations. In this version, we redefine the behavior and make it much safer.
Manage columns in Gravitino #4493. In 0.7.0, we introduce the column entity, which Gravitino can manage. With this feature, Gravitino can now support tagging on columns, and in the future it will support column-level operations.
Add an event listener for Iceberg REST catalog server #5204 and support pre-event for event listener #5112.

Other notable enhancements

Gravitino core

Support storing column metadata in Gravitino #4493.
Support pre-event for Gravitino #5049.
Unify drop metalake and catalog behavior #5031.
Add credential vending support in Gravitino #4398.
Support audit log in Gravitino #4887.
Shrink the package size of Gravitino #4513.

Iceberg REST catalog server

Add credential vending for Iceberg REST server #4993.
Add an event listener for Iceberg REST server #5204.
Support pre-event for event listener #5112.

Catalog related

Add OSS support for fileset catalog #5173.
Add GCS support for fileset catalog #5074.
Add S3 support for fileset catalog #3379.
Add pluggable storage support for fileset catalog #5019.
Add S3 support for Paimon catalog #4938.
Add catalog support for Hudi #4306.
Add catalog support for OceanBase #4848.

API and client

Add S3 fileset support for Python GVFS client #5188.
Add GCS fileset support for Python GVFS client #5139.
Add OSS fileset support for Python GVFS client #5221.
Support unified auditing of Fileset metadata and data operations #4021.
Support OAuth2 in Python GVFS #3758.

UI

Add UI support for operating fileset #5167.
Add UI support for operating schema #5140.

All the resolved issues targeting the 0.7.0 release can be seen at https://github.com/apache/gravitino/issues?q=is%3Aissue+is%3Aclosed+label%3A0.7.0+.

Overall

Apache Gravitino 0.7.0 is the second ASF release. This version adds a bunch of new features. We thank the Gravitino community for their continued support and valuable contributions. Thanks to our users' feedback, we can continue to innovate and build, so thanks to all those reading this!

To further explore the Gravitino 0.7.0 release, please check the documentation. Your feedback is invaluable to the community and the project.

Credits

This release acknowledges the hard work and dedication of all contributors who have helped make this release possible.

@FANNG1 @LauraXia123 @LindaSummer @LiuQhahah @Naresh-kumar-Thodupunoori @SeanAverS @caican00 @coolderli @diqiu50 @featherchen @hanwxx @jerqi @jerryshao @jingjia88 @justinmclean @koonchen @lsyulong @lw-yang @mchades @noidname01 @puchengy @shaofengshi @theoryxu @xiaozcy @xloya @xunliu @yangyuxia @yaoderek @yuanoOo @yuqi1129

Apache, Apache Flink, Apache Hive, Apache Hudi, Apache Iceberg, Apache Ranger, Apache Spark, Apache Paimon and Apache Gravitino are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Apache Gravitino 0.6.1 release for Apache Gravitino™ (incubating)

October 21, 2024 · 3 min read

Minghuang Li

committer

We are pleased to announce the stable release of Gravitino 0.6.1-incubating, based on branch-0.6. This release brings a suite of new features and enhancements, particularly focusing on the unified access control system. Additionally, it includes various bug fixes and optimizations across other components.

Security

Supports list users #3348
Supports list roles #3346
Supports list roles by object #4886
Supports list group #4873
Supports grant or revoke privileges for a role #4903
Improved security with additional checks for privilege APIs #5054 #5070
Fix Hive metastore authentication failed when creating a role #4960
Remove role local cache #4246
Addressed a response error in Ranger when calling the Ranger CREATE_GROUP API #4975

Gravitino Core

Fixed an issue with updating comments in metalake or catalog operations #4845
Introduced a basic framework to support multiple JDBC backends #4832 #4868
Fixed a cleanup bug occurring after failed catalog creation attempts #5082

Catalogs

Iceberg

Use unified logic to transform catalog backend name to handle the renaming of catalog #4718

Doris

Fix the missing distribution information when loading Doris tables #4988

Trino Connector

Corrected the default precision settings for Time and Timestamp column types in the Iceberg catalog #4743

UI

Supports creating Paimon catalog #4742
Improved user experience by showing an expand arrow when reloading tree nodes #5042

Build and Others

Fix the env of openAPI lint plugin #4876
Addressed an Out Of Memory (OOM) issue during Trino connector tests #4871
Resolved a test failure in testCheckLinkDocs for the web module #4914
Increase the Python timeout minutes to 45 minutes #5038
Ensured that TestHiveTableOperations can be run independently #4851
Added LICENSE and NOTICE files for the Iceberg REST binary to comply with licensing requirements #5010

Limitations and Known Issues

Please be aware that the Ranger authorization plugin within the unified access control system may exhibit some limitations and known issues. For detailed information, refer to issue #5115.

Credits

We would like to thank the following contributors for their valuable contributions to this release:

@diqiu50 @FANNG1 @jerqi @jerryshao @justinmclean @LauraXia123 @LindaSummer @LiuQhahah @lsyulong @lw-yang @mchades @tyoushinya @yangyuxia @yuqi1129

Apache, Apache Iceberg, Apache Hive, Apache Flink, Apache Paimon and Apache Gravitino are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

Apache Gravitino 0.6.0 - First ASF release for Apache Gravitino™ (incubating)

September 9, 2024 · 7 min read

Jerry Shao

PMC Member

This blog post will briefly introduce the new features and significant improvements. Keep reading to learn what the community has worked on and understand Gravitino’s use cases.

Introducing the unified RBAC model for Gravitino

Access control is a crucial feature for the enterprise use of a data catalog, providing users with unified and centralized authorization and authentication capabilities. This release introduces a role-based access control (RBAC) model in Gravitino to authorize different securable objects in a unified manner.

We use Privilege, SecurableObject, Role, User, and Group to define the permissions.

RBAC model

Privilege

Privilege defines the types of operations on different metadata objects, and is used to allow or deny a specific type of operation on a metadata object.

SecurableObject

SecurableObject binds multiple operation-specific types of privileges to a single metadata object.

Role

A Role is a collection of SecurableObjects, and a role represents multiple operation type permissions on multiple metadata objects.

User

Users are granted one or multiple roles, and users have different operating privileges depending on their roles.

Group

To make it easier to grant a single permission to multiple users, we can add users to a group, and then grant one or more roles to that user group. This process allows all users belonging to that user group to have the permissions in those roles.

More importantly, the privileges granted to users in Gravitino are pushed down to the underlying permission system. Currently, we support pushing permissions down to Apache Ranger; others, like IAM, are under development.
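The relationships among Privilege, SecurableObject, Role, User, and Group described above can be sketched as plain Python data structures. This is an illustrative model only, not Gravitino's API: the class layout, the `AccessControl` helper, and the privilege strings are assumptions, and the sketch covers only "allow" privileges, leaving out deny semantics and push-down to Ranger.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple


@dataclass(frozen=True)
class SecurableObject:
    """Binds a set of privileges to one metadata object."""
    name: str                    # e.g. "sales.orders" (illustrative)
    privileges: FrozenSet[str]   # e.g. {"SELECT_TABLE"} (illustrative)


@dataclass(frozen=True)
class Role:
    """A named collection of securable objects."""
    name: str
    securable_objects: Tuple[SecurableObject, ...]

    def allows(self, privilege: str, obj: str) -> bool:
        return any(
            so.name == obj and privilege in so.privileges
            for so in self.securable_objects
        )


@dataclass
class AccessControl:
    """Hypothetical registry of roles and their grants to users and groups."""
    roles: Dict[str, Role] = field(default_factory=dict)
    user_roles: Dict[str, Set[str]] = field(default_factory=dict)
    group_roles: Dict[str, Set[str]] = field(default_factory=dict)
    user_groups: Dict[str, Set[str]] = field(default_factory=dict)

    def effective_roles(self, user: str) -> Set[str]:
        # Roles granted directly plus roles inherited from the user's groups.
        names = set(self.user_roles.get(user, set()))
        for group in self.user_groups.get(user, set()):
            names |= self.group_roles.get(group, set())
        return names

    def is_allowed(self, user: str, privilege: str, obj: str) -> bool:
        return any(
            self.roles[name].allows(privilege, obj)
            for name in self.effective_roles(user)
        )


reader = Role("reader", (SecurableObject("sales.orders", frozenset({"SELECT_TABLE"})),))
acl = AccessControl(
    roles={"reader": reader},
    group_roles={"analysts": {"reader"}},
    user_groups={"alice": {"analysts"}},
)
print(acl.is_allowed("alice", "SELECT_TABLE", "sales.orders"))  # True, via group
print(acl.is_allowed("alice", "MODIFY_TABLE", "sales.orders"))  # False
print(acl.is_allowed("bob", "SELECT_TABLE", "sales.orders"))    # False
```

Here `alice` holds no role directly; she gains `reader` only through the `analysts` group, which is the convenience the Group concept is meant to provide.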

Authorization flow

For more information about how our RBAC works, please check out our design document. To enable and use access control in Gravitino, please refer to the user document.

Our implementation of unified access control capability is still in the alpha stage, and we’re striving to add more features and make it stable as soon as possible, so please stay tuned.

Separation of the Iceberg REST catalog service

Apache Iceberg is a first-class citizen, and Gravitino has provided an embedded Iceberg REST catalog service since version 0.3. We have seen the increased demands and adoption of Iceberg REST catalog service as a standalone server. So, in version 0.6.0, we refactored the whole architecture and modularized the Iceberg REST catalog service as a standalone service, allowing it to be deployed with or without the Gravitino server. Besides the refactoring, we also bumped the supported version to Iceberg 1.5.2, added support for S3 cloud storage, and now support the registerTable interface.

Iceberg REST catalog support is crucial to Gravitino, and modularization is just the first step. In future releases, we will add more features, such as cloud storage support, integration with Gravitino’s RBAC model, and credential vending.

To use the Gravitino Iceberg REST catalog service, please check our user document. The umbrella issue is #4058.

Tagging support

Tagging on metadata objects is useful for data discovery, classification, and data governance. It can also be leveraged by query engines to provide tag-based access control. In Gravitino 0.6.0, we introduce tag support: users can add tags on metadata objects like CATALOG, SCHEMA, TABLE, FILESET, and TOPIC. To learn how our tag system is designed, please check out the design document and issue #3344. To use tags in both the REST API and the Java SDK, please see how to manage tags.

Apache Flink Gravitino connector

As an open data catalog, we want to be able to support all query engines. Therefore, alongside Trino and Apache Spark, we have added Apache Flink as our newest supported query engine.

In 0.6.0, we added a new Flink Gravitino connector #1354 and supported querying Hive tables using Flink with Gravitino. Hive support is just our first step; we will continue to add support for more table types.

To know how to use the Flink Gravitino connector, please refer to our documentation.

Apache Paimon table management in Gravitino

Apache Paimon has become quite popular this year, and many companies use Paimon to build their streaming warehouse or lakehouse. To manage all the lakehouse tables in a unified manner, Gravitino has added Paimon table management in 0.6.0 #1129. Users can use our unified API to manage Paimon tables as well as other tables. To learn more about how to manage Paimon tables, please refer to the Lakehouse Paimon Catalog document.

Add Python GVFS support for fileset

In Gravitino 0.5, we added the Gravitino Virtual File System (GVFS), a Java implementation of the Hadoop Compatible Filesystem (HCFS) interface, for fileset read/write in Gravitino. The Java GVFS can be used by query engines like Apache Spark to read/write data from files or folders. Although this works well for big data, AI development is largely dominated by Python, which made it hard for users to use Fileset with AI frameworks.

In 0.6.0, we followed the Python fsspec interface to provide a Python GVFS package that can be used by popular Python frameworks like Apache Arrow, Pandas, Ray, LlamaIndex, and more. You can check out the Python GVFS document for more information.
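To give a feel for what a virtual file system does for filesets, the sketch below maps a virtual path to a storage location. It is an assumption-laden illustration, not the GVFS package itself: it assumes virtual paths of the form `gvfs://fileset/<catalog>/<schema>/<fileset>/<sub path>`, the `resolve` function and the `locations` table are hypothetical, and a real client would look up the fileset's location from the Gravitino server rather than from a dictionary.

```python
from typing import Dict, Tuple

GVFS_PREFIX = "gvfs://fileset/"


def resolve(virtual_path: str, locations: Dict[Tuple[str, str, str], str]) -> str:
    """Map gvfs://fileset/<catalog>/<schema>/<fileset>/<sub path> to storage."""
    if not virtual_path.startswith(GVFS_PREFIX):
        raise ValueError(f"not a GVFS path: {virtual_path}")
    # Split off catalog, schema, and fileset; keep the rest as the sub path.
    parts = virtual_path[len(GVFS_PREFIX):].split("/", 3)
    if len(parts) < 3 or not all(parts[:3]):
        raise ValueError(f"expected catalog/schema/fileset in: {virtual_path}")
    catalog, schema, fileset = parts[:3]
    sub_path = parts[3] if len(parts) == 4 else ""
    base = locations[(catalog, schema, fileset)].rstrip("/")
    return f"{base}/{sub_path}" if sub_path else base


# Hypothetical fileset registered with an HDFS storage location.
locations = {("ml", "raw", "images"): "hdfs://namenode:9000/data/images/"}

print(resolve("gvfs://fileset/ml/raw/images/2024/cat.png", locations))
# hdfs://namenode:9000/data/images/2024/cat.png
print(resolve("gvfs://fileset/ml/raw/images", locations))
# hdfs://namenode:9000/data/images
```

Because Python frameworks built on fsspec accept a filesystem object and a path, a layer like this lets them read fileset data without knowing where it physically lives.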

Notable enhancements

Gravitino core

Support catalog reload after a property is altered #2267.
Deprecate KV store and add H2 support as embedded storage backend #3968.

Catalog related

Add API to test catalog connection #4107.
Improve the type system to support unknown types #3427.
Add Kerberos support for fileset Hadoop catalog #3462.
Add S3 support for Iceberg #4264.
Support cloud and region property when creating catalog #3966.
Support multiple Kerberos authentication for Hive catalog #3906.
Unify the behavior of purge for all the catalogs #3685.

API and client

Refactor Java and Python API for better user experience #3626.
Add missing error handlers in Python client #4225.

All the resolved issues targeting the 0.6.0 release can be seen at https://github.com/apache/gravitino/issues?page=12&q=is%3Aissue+is%3Aclosed+label%3A0.6.0.

Overall

Apache Gravitino 0.6.0 is the first ASF release. We would like to show our appreciation to the Gravitino community for their continued support and valuable contributions. Thanks to the feedback of our users, we are able to continue to innovate and build, so thanks to all those reading this!

To explore the Gravitino 0.6.0 release, please check the documentation. Your feedback is invaluable to the community and the project.

Credits

This release acknowledges the hard work and dedication of all contributors who have helped make this release possible.

@1996fanrui @BSSsunny @FANNG1 @IamSaker @JinsYin @JosefinaOller @LanceHsun @LauraXia123 @Leonidas963 @LindaSummer @MukarramHaq @Naresh-kumar-Thodupunoori @Nishtha-Jain-1119 @SteNicholas @TEOTEO520 @Vishesh-Paliwal @ashwin1596 @bknbkn @caican00 @ch3yne @charliecheng630 @coolderli @danhuawang @diqiu50 @featherchen @hanwxx @ian910297 @jenish-thapa @jerqi @jerryshao @jingjia88 @jtao1 @justinmclean @kalencaya @khmgobe @kiratkumar47 @kohantikanath @kristopherkane @lsyulong @lw-yang @mchades @mygrsun @noidname01 @pan3793 @pravo23 @qqqttt123 @rich7420 @rohit-satya @shaofengshi @theoryxu @totalo @unknowntpo @xiaozcy @xloya @xunliu @yijhenlin @yuqi1129 @zhoukangcn @zivali

Apache Gravitino is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by ASF Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

Gravitino is an Open Source Data and AI Multi-Cloud Solution

July 5, 2024 · 4 min read

Justin Mclean

PMC Member

In the ever-evolving landscape of data and artificial intelligence, innovation is the key driver of progress. Gravitino is an open source, next-generation data and AI platform. Gravitino aims to unify all aspects of your data, analytics, and AI in one seamless accessible platform.

The power of open source

Open source embodies collaboration, transparency, and community-driven development. Making Gravitino open source, as an incubating project of the Apache Software Foundation, extends an invitation to developers worldwide to participate in shaping the future of multi-cloud data management and analytics.

Unified data, analytics, and AI fabric

Gravitino isn't just a tool; it's a fabric that weaves together all your data, analytics, and AI into a single, unified platform. Regardless of where your data resides, be it in various public or private cloud environments, different vendors or different regions, Gravitino provides a solution and delivers optimal performance and cost efficiency.

Operational Simplicity

Gravitino offers a unified perspective of all your data and AI models, ensuring seamless access to all your data. Gravitino empowers users with operational simplicity, allowing them to focus on deriving insights rather than managing complex data infrastructure.

Developer experience

For developers, Gravitino enables a unified ANSI standard-compatible SQL interface, making data handling ETL-free and codeless. Its REST interface, coupled with a built-in SQL optimizer and intelligent query execution, ensures an efficient developer experience. Gravitino empowers developers to focus on innovation rather than grappling with the intricacies of data handling.

Performance and cost efficiency

Gravitino aims to take data management to the next level by eliminating unnecessary data transmission, providing the best performance for data queries on multi-cloud environments. With global data acceleration, Gravitino enables faster and more cost-effective data analysis. This performance boost ensures that organizations can derive insights quicker and more efficiently.

Data source connection, data virtualization, federated computing

Gravitino comes equipped with enterprise-ready connectors for seamless, high-performance access to cloud data lakes. It offers a unified experience for data in remote regions through data virtualization, is making progress on intelligent acceleration, and allows effortless data analysis and training across different data sources, breaking down traditional silos.

Why Gravitino?

Breaking down data silos

Gravitino tackles the age-old challenge of data silos by providing a unified metadata management and federated analytics engine. This allows for direct data analysis from various cloud and SaaS services without the need for time-consuming ETL processes.

Query federation and in-situ analysis

Gravitino is creating a world where users can access data from diverse systems within a single query, eliminating the need for complex data replication and transformation processes.

Open source commitment

Gravitino's journey isn't just about software; it's about community-driven development. The project is actively developed in the open under the Apache License, a business-friendly permissive license. Join the developer community to be part of this exciting journey.

The future of multi-cloud data management

In the era of data-driven decision-making, Gravitino emerges as a beacon of innovation and collaboration. Its embrace of open source reflects a belief in the power of community-driven development to shape the future of data and AI. Gravitino isn't just a platform; it represents a movement toward a more connected, efficient, and accessible data landscape. Join the journey to redefine the possibilities of data management and analytics with Gravitino, the next-generation data and AI fabric.

Discover the power of Gravitino, an open source platform reshaping multi-cloud data and AI. Join the community and redefine the possibilities of data management. Get started on GitHub, where you will also find documentation and a Docker playground to help you get started. You can also join the community Slack channel to discuss ideas and seek help.
