Version: 1.3.0

MCP Server

Introduction

The Gravitino MCP server exposes Gravitino metadata management to LLMs through the Model Context Protocol (MCP).

Requirements

  1. Python 3.10+
  2. uv is installed. Install uv by following the official guide.

Usage

  1. Clone the code from GitHub and change to the mcp-server directory.
  2. Create a virtual environment: uv venv
  3. Install the required Python packages: uv pip install -e .
  4. Add the Gravitino MCP server to your LLM tool. Taking Cursor as an example, edit ~/.cursor/mcp.json and use the following configuration for a local Gravitino MCP server:
{
  "mcpServers": {
    "gravitino": {
      "command": "uv",
      "args": [
        "--directory",
        "$path/mcp-server",
        "run",
        "mcp_server",
        "--metalake",
        "test",
        "--gravitino-uri",
        "http://127.0.0.1:8090"
      ]
    }
  }
}

Alternatively, start an HTTP MCP server:

uv run mcp_server --metalake test --gravitino-uri http://127.0.0.1:8090 --transport http --mcp-url http://localhost:8000/mcp

Then use the following configuration:

{
  "mcpServers": {
    "gravitino": {
      "url": "http://localhost:8000/mcp"
    }
  }
}
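
The two configurations above can also be generated with a small script, which helps when the checkout path or metalake differs between machines. This is a minimal sketch, not part of the Gravitino distribution: the helper names (`build_stdio_entry`, `build_http_entry`, `merge_server`) and the `/path/to/gravitino/mcp-server` directory are hypothetical placeholders. The command-line flags are the ones documented on this page.

```python
import json


def build_stdio_entry(mcp_server_dir: str, metalake: str, gravitino_uri: str) -> dict:
    """Build an mcpServers entry that launches the server over stdio via uv."""
    return {
        "command": "uv",
        "args": [
            "--directory", mcp_server_dir,
            "run", "mcp_server",
            "--metalake", metalake,
            "--gravitino-uri", gravitino_uri,
        ],
    }


def build_http_entry(mcp_url: str) -> dict:
    """Build an mcpServers entry that points at an already running HTTP MCP server."""
    return {"url": mcp_url}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = entry
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # Placeholder path: replace with the location of your own checkout.
    entry = build_stdio_entry(
        "/path/to/gravitino/mcp-server", "test", "http://127.0.0.1:8090"
    )
    print(json.dumps(merge_server({}, "gravitino", entry), indent=2))
```

`merge_server` returns a new dictionary instead of modifying its input, so other servers already present in an existing mcp.json are preserved when the result is written back.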

Docker Instructions

You can start the Gravitino MCP server from the Docker image:

docker run -p 8000:8000 --network=host apache/gravitino-mcp-server:latest --metalake test --transport http --mcp-url http://0.0.0.0:8000/mcp --gravitino-uri http://127.0.0.1:8090

Note that the MCP server running in a Docker container doesn't support the stdio transport mode.

Supported Tools

The Gravitino MCP server supports the following tools. Each tool carries a tag, and you can export tools by tag.

| Tool name | Description | Tag | Since version |
| --- | --- | --- | --- |
| get_list_of_catalogs | Retrieve a list of all catalogs in the system. | catalog | 1.0.0 |
| get_list_of_schemas | Retrieve a list of schemas belonging to a specific catalog. | schema | 1.0.0 |
| get_list_of_tables | Retrieve a list of tables within a specific catalog and schema. | table | 1.0.0 |
| get_table_metadata_details | Retrieve comprehensive metadata details for a specific table. | table | 1.0.0 |
| list_of_models | Retrieve a list of models within a specific catalog and schema. | model | 1.0.0 |
| load_model | Retrieve comprehensive metadata details for a specific model. | model | 1.0.0 |
| list_model_versions | Retrieve a list of versions for a specific model. | model | 1.0.0 |
| load_model_version | Retrieve comprehensive metadata details for a specific model version. | model | 1.0.0 |
| load_model_version_by_alias | Retrieve comprehensive metadata details for a specific model version by alias. | model | 1.0.0 |
| metadata_type_to_fullname_formats | Retrieve the metadata type to fullname formats mapping. | metadata | 1.0.0 |
| list_of_topics | Retrieve a list of topics within a specific catalog and schema. | topic | 1.0.0 |
| load_topic | Retrieve comprehensive metadata details for a specific topic. | topic | 1.0.0 |
| list_of_filesets | Retrieve a list of filesets within a specific catalog and schema. | fileset | 1.0.0 |
| load_fileset | Retrieve comprehensive metadata details for a specific fileset. | fileset | 1.0.0 |
| list_files_in_fileset | Retrieve a list of files within a specific fileset. | fileset | 1.0.0 |
| list_of_jobs | Retrieve a list of jobs. | job | 1.0.0 |
| get_job_by_id | Retrieve a job by its ID. | job | 1.0.0 |
| list_of_job_templates | Retrieve a list of job templates. | job | 1.0.0 |
| get_job_template_by_name | Retrieve a job template by its name. | job | 1.0.0 |
| run_job | Run a job with the specified parameters. | job | 1.0.0 |
| cancel_job | Cancel a running job by its ID. | job | 1.0.0 |
| get_tag_by_name | Retrieve a tag by its name. | tag | 1.0.0 |
| list_of_tags | Retrieve a list of tags. | tag | 1.0.0 |
| list_tags_for_metadata | Retrieve a list of tags associated with a specific metadata item. | tag | 1.0.0 |
| list_metadata_by_tag | Retrieve a list of metadata items associated with a specific tag. | tag | 1.0.0 |
| associate_tag_with_metadata | Associate tags with a specific metadata item. | tag | 1.0.0 |
| disassociate_tag_from_metadata | Disassociate tags from a specific metadata item. | tag | 1.0.0 |
| list_statistics_for_metadata | Retrieve a list of statistics associated with a specific metadata item. | statistics | 1.0.0 |
| list_statistics_for_partition | Retrieve a list of statistics associated with a specific partition. | statistics | 1.0.0 |
| get_list_of_policies | Retrieve a list of policies in the system. | policy | 1.0.0 |
| get_policy_detail_information | Retrieve detailed information for a specific policy by policy name. | policy | 1.0.0 |
| list_policies_for_metadata | List all policies associated with a specific metadata item. | policy | 1.0.0 |
| list_metadata_by_policy | List all metadata items associated with a specific policy. | policy | 1.0.0 |
| get_policy_for_metadata | Get a policy associated with a specific metadata item. | policy | 1.0.0 |

Configuration

You can configure the Gravitino MCP server with command-line arguments. Run uv run mcp_server -h to see the detailed information.

| Argument | Description | Default value | Required | Since version |
| --- | --- | --- | --- | --- |
| --metalake | The Gravitino metalake name. | none | Yes | 1.0.0 |
| --gravitino-uri | The URI of Gravitino server. | http://127.0.0.1:8090 | No | 1.0.0 |
| --transport | Transport protocol: stdio (local), http / streamable-http (Streamable HTTP). | stdio | No | 1.0.0 |
| --mcp-url | The URL of MCP server if using HTTP transport. | http://127.0.0.1:8000/mcp | No | 1.0.0 |
| --token | OAuth2 Bearer token for Gravitino; or set GRAVITINO_TOKEN. See Authentication. | none (anonymous) | No | 1.3.0 |
| --tls-cert | PEM certificate to serve the endpoint over HTTPS. Requires --tls-key. | none | No | 1.3.0 |
| --tls-key | PEM private key to serve the endpoint over HTTPS. Requires --tls-cert. | none | No | 1.3.0 |

Authentication

By default the MCP server talks to Gravitino anonymously. There are two ways to attach an identity, depending on the transport.

Static startup token (stdio and HTTP)

Pass --token (or set the GRAVITINO_TOKEN environment variable) to authenticate the server with a static OAuth2 Bearer token. The value is treated as a Bearer token and sent as Authorization: Bearer <token>. The token is masked in the server's log output.

uv run mcp_server --metalake test --gravitino-uri http://127.0.0.1:8090 --token <your-token>
# or
export GRAVITINO_TOKEN=<your-token>
uv run mcp_server --metalake test --gravitino-uri http://127.0.0.1:8090

In stdio mode this token is used for every request. In HTTP mode it is only the fallback, used when an incoming request does not carry its own Authorization header.

Per-request identity (HTTP)

When the server runs with HTTP transport, the Authorization header of each incoming MCP request is forwarded verbatim to Gravitino. The scheme is preserved, so OAuth2 (Bearer), Gravitino simple authentication (Basic <base64(user:dummy)>) and others all work. This keeps concurrent sessions from different principals isolated — one principal's identity never leaks into another's calls — and lets Gravitino enforce authorization per caller. The per-request header takes priority over the static --token.

Authorization itself is always enforced by Gravitino: the MCP server forwards the identity but does not make access-control decisions of its own.
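
The precedence between the two identity sources can be summarized in a few lines. The following is a minimal sketch of the rule described above, not the server's actual code: the function name `resolve_authorization` and its signature are hypothetical, and treating an empty header as absent is an assumption.

```python
from typing import Mapping, Optional


def resolve_authorization(
    request_headers: Mapping[str, str], static_token: Optional[str]
) -> Optional[str]:
    """Pick the Authorization value to send to Gravitino for one MCP request.

    Hypothetical helper illustrating the documented precedence:
    1. the incoming request's Authorization header, forwarded verbatim;
    2. otherwise the static startup token, sent as a Bearer token;
    3. otherwise nothing (anonymous).
    """
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in request_headers.items():
        if name.lower() == "authorization" and value.strip():
            return value  # scheme preserved: Bearer, Basic, and so on
    if static_token:
        return f"Bearer {static_token}"
    return None
```

In stdio mode there is no incoming HTTP request, which corresponds to calling the helper with an empty header mapping: the static token, if any, is always used.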

Serving over HTTPS (TLS)

To serve the MCP HTTP endpoint (the --mcp-url, not the --gravitino-uri) over TLS, provide both --tls-cert and --tls-key and use an https:// --mcp-url. The certificate and key must be provided together, and the URL scheme must match the TLS setting (an https:// URL without a cert/key, or a cert/key behind an http:// URL, is rejected at startup).

uv run mcp_server --metalake test --gravitino-uri http://127.0.0.1:8090 \
--transport streamable-http --mcp-url https://localhost:8000/mcp \
--tls-cert /path/to/cert.pem --tls-key /path/to/key.pem
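
The startup rules above (certificate and key together, URL scheme matching the TLS setting) can be expressed as a small validation function. This is an illustrative sketch under stated assumptions rather than the server's implementation: `validate_tls_settings` is a hypothetical name, the error messages are invented, and the sketch checks only the consistency of the settings, not whether the files exist or contain valid PEM data.

```python
from typing import Optional
from urllib.parse import urlparse


def validate_tls_settings(
    mcp_url: str, tls_cert: Optional[str], tls_key: Optional[str]
) -> bool:
    """Return True when the endpoint should be served over TLS.

    Raises ValueError for the inconsistent combinations that the
    documentation says are rejected at startup.
    """
    # The certificate and the key must be provided together.
    if bool(tls_cert) != bool(tls_key):
        raise ValueError("--tls-cert and --tls-key must be provided together")

    tls_enabled = bool(tls_cert)
    scheme = urlparse(mcp_url).scheme.lower()

    # The URL scheme must match the TLS setting.
    if scheme == "https" and not tls_enabled:
        raise ValueError("an https:// --mcp-url requires --tls-cert and --tls-key")
    if scheme == "http" and tls_enabled:
        raise ValueError("--tls-cert/--tls-key require an https:// --mcp-url")
    return tls_enabled
```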

Audit Logging

Every tool invocation is recorded as one structured JSON line in gravitino-mcp-audit.log (written to the server's working directory). Each record is attributed to the principal derived from the request's Authorization header.

| Field | Description |
| --- | --- |
| timestamp | UTC ISO-8601 time of the call. |
| principal | Caller identity: username for Basic simple auth, bearer:<first-8-chars> for a Bearer token, or anonymous when no identity is present. |
| tool | Name of the invoked MCP tool. |
| outcome | allow for successful calls, deny for failed ones. deny is emitted for any tool-call exception (authorization denial being the common case); inspect error_type to disambiguate. |
| error_type | Exception class name, present only when outcome is deny. |

Example record:

{"timestamp": "2026-06-16T03:21:09.123456+00:00", "principal": "alice", "tool": "get_list_of_catalogs", "outcome": "allow"}
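
To make the field definitions concrete, the sketch below derives a principal from an Authorization header and builds a record with the same shape as the example. It is a hedged illustration, not the server's code: the function names `derive_principal` and `audit_record` are hypothetical, and falling back to anonymous for malformed or unrecognized credentials is an assumption that the documentation does not confirm.

```python
import base64
import binascii
import json
from datetime import datetime, timezone
from typing import Optional


def derive_principal(authorization: Optional[str]) -> str:
    """Map an Authorization header value to the audit log's principal field."""
    if not authorization:
        return "anonymous"
    scheme, _, credentials = authorization.strip().partition(" ")
    scheme, credentials = scheme.lower(), credentials.strip()
    if scheme == "bearer" and credentials:
        # Only a short prefix is logged, never the full token.
        return f"bearer:{credentials[:8]}"
    if scheme == "basic" and credentials:
        try:
            decoded = base64.b64decode(credentials, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            return "anonymous"  # assumption: malformed credentials are anonymous
        username = decoded.partition(":")[0]
        return username or "anonymous"
    return "anonymous"  # assumption: unrecognized schemes are anonymous


def audit_record(
    tool: str, authorization: Optional[str], error: Optional[BaseException] = None
) -> str:
    """Build one JSON line with the documented audit fields."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "principal": derive_principal(authorization),
        "tool": tool,
        "outcome": "allow" if error is None else "deny",
    }
    if error is not None:
        # error_type is present only for denied (failed) calls.
        record["error_type"] = type(error).__name__
    return json.dumps(record)
```

Because each record is a single JSON line, the log can be processed by reading it line by line and passing each line to `json.loads`.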