@geruh geruh commented Dec 20, 2025

closes to #2847

Rationale for this change

This PR adds support for server endpoint capabilities, aligning with the Java implementation. While working on the REST scanning support, we need to know whether a server supports a specific capability before making any calls. So this PR also adds some extra support to the current PyIceberg REST catalog implementation.

The REST catalog now parses the endpoints field from the config response to determine server capabilities. When a server doesn't return any endpoints, we fall back to a default set, matching the behavior of Java's REST catalog. The view endpoints are added to that default set only when the corresponding config property is enabled.
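A minimal sketch of the negotiation described above, not the code in this PR. The function name, the abbreviated endpoint sets, and the `view-endpoints-supported` property name are assumptions made for illustration.

```python
from typing import Any

# Abbreviated, illustrative defaults; the real default set is defined in the PR.
DEFAULT_ENDPOINTS = frozenset(
    {
        "GET /v1/{prefix}/namespaces",
        "POST /v1/{prefix}/namespaces",
        "GET /v1/{prefix}/namespaces/{namespace}",
        "GET /v1/{prefix}/namespaces/{namespace}/tables",
        "GET /v1/{prefix}/namespaces/{namespace}/tables/{table}",
    }
)
VIEW_ENDPOINTS = frozenset(
    {
        "GET /v1/{prefix}/namespaces/{namespace}/views",
        "GET /v1/{prefix}/namespaces/{namespace}/views/{view}",
    }
)
# Assumed property name, used here only as a placeholder.
VIEW_ENDPOINTS_SUPPORTED = "view-endpoints-supported"


def resolve_supported_endpoints(
    config_response: dict[str, Any], properties: dict[str, str]
) -> frozenset[str]:
    """Return the endpoints the client should treat as supported."""
    advertised = config_response.get("endpoints")
    if advertised:
        # The server advertised its capabilities, so trust that list as-is.
        return frozenset(advertised)
    # No endpoints advertised (e.g. an older server): use the defaults.
    endpoints = set(DEFAULT_ENDPOINTS)
    if properties.get(VIEW_ENDPOINTS_SUPPORTED, "false").lower() == "true":
        endpoints |= VIEW_ENDPOINTS
    return frozenset(endpoints)
```

With this shape, a server that returns an `endpoints` list is never second-guessed, and the view endpoints only appear for older servers when the user opts in.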

Are these changes tested?

Added unit tests and tested with the iceberg rest fixture.

Are there any user-facing changes?

Yes: added a config property and aligned behavior with the Java implementation.

cc: @kevinjqliu @Fokko

@kevinjqliu

wdyt about adding integration tests against the iceberg-rest-fixture?

Running the integration test infra gives me this response from http://localhost:8181/v1/config:

{
  "defaults": {},
  "overrides": { "namespace-separator": "%2E" },
  "endpoints": [
    "POST v1/oauth/tokens",
    "POST https://auth-server.com/token",
    "GET v1/config",
    "GET /v1/{prefix}/namespaces",
    "POST /v1/{prefix}/namespaces",
    "HEAD /v1/{prefix}/namespaces/{namespace}",
    "GET /v1/{prefix}/namespaces/{namespace}",
    "DELETE /v1/{prefix}/namespaces/{namespace}",
    "POST /v1/{prefix}/namespaces/{namespace}/properties",
    "GET /v1/{prefix}/namespaces/{namespace}/tables",
    "POST /v1/{prefix}/namespaces/{namespace}/tables",
    "HEAD /v1/{prefix}/namespaces/{namespace}/tables/{table}",
    "GET /v1/{prefix}/namespaces/{namespace}/tables/{table}",
    "POST /v1/{prefix}/namespaces/{namespace}/register",
    "POST /v1/{prefix}/namespaces/{namespace}/tables/{table}",
    "DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}",
    "POST /v1/{prefix}/tables/rename",
    "POST /v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics",
    "POST /v1/{prefix}/transactions/commit",
    "GET /v1/{prefix}/namespaces/{namespace}/views",
    "HEAD /v1/{prefix}/namespaces/{namespace}/views/{view}",
    "GET /v1/{prefix}/namespaces/{namespace}/views/{view}",
    "POST /v1/{prefix}/namespaces/{namespace}/views",
    "POST /v1/{prefix}/namespaces/{namespace}/views/{view}",
    "POST /v1/{prefix}/views/rename",
    "DELETE /v1/{prefix}/namespaces/{namespace}/views/{view}",
    "POST /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan",
    "GET /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{plan-id}",
    "POST /v1/{prefix}/namespaces/{namespace}/tables/{table}/tasks",
    "DELETE /v1/{prefix}/namespaces/{namespace}/tables/{table}/plan/{plan-id}"
  ]
}

@kevinjqliu kevinjqliu left a comment

LGTM! Thanks for adding this feature.
The PR looks good and is thoroughly tested. I just have a few nit comments.
Feel free to address them here or in a follow-up PR.

Comment on lines +95 to +99
if not raw_path:
    raise ValueError("Invalid path: empty")
raw_path = raw_path.strip()
if not raw_path:
    raise ValueError("Invalid path: empty")

Suggested change
- if not raw_path:
-     raise ValueError("Invalid path: empty")
- raw_path = raw_path.strip()
- if not raw_path:
-     raise ValueError("Invalid path: empty")
+ raw_path = raw_path.strip()
+ if not raw_path:
+     raise ValueError("Invalid path: empty")

I think we can just check once here.

        NotImplementedError: If the endpoint is not supported.
    """
    if endpoint not in self._supported_endpoints:
        raise NotImplementedError(f"Server does not support endpoint: {endpoint}")

nit: Java throws UnsupportedOperationException here

    return f"{self.http_method.value} {self.path}"

@classmethod
def from_string(cls, endpoint: str | None) -> "Endpoint":

Suggested change
- def from_string(cls, endpoint: str | None) -> "Endpoint":
+ def from_string(cls, endpoint: str) -> "Endpoint":

Can we enforce that endpoint must be a str?

def from_string(cls, endpoint: str | None) -> "Endpoint":
    if endpoint is None:
        raise ValueError("Invalid endpoint (must consist of 'METHOD /path'): None")
    elements = endpoint.split(None, 1)

Suggested change
- elements = endpoint.split(None, 1)
+ elements = endpoint.strip().split(None, 1)

Strip leading/trailing whitespace before the split, just in case.
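For reference, a self-contained sketch of how `from_string` might look with both suggestions applied (a plain `str` parameter and a `strip()` before the split). The `HttpMethod` enum and the frozen dataclass layout are assumptions for illustration, not the PR's actual definitions.

```python
from dataclasses import dataclass
from enum import Enum


class HttpMethod(Enum):
    # Illustrative subset of HTTP methods.
    GET = "GET"
    HEAD = "HEAD"
    POST = "POST"
    DELETE = "DELETE"


@dataclass(frozen=True)
class Endpoint:
    http_method: HttpMethod
    path: str

    def __str__(self) -> str:
        return f"{self.http_method.value} {self.path}"

    @classmethod
    def from_string(cls, endpoint: str) -> "Endpoint":
        # Strip first so surrounding whitespace cannot leak into the path.
        elements = endpoint.strip().split(None, 1)
        if len(elements) != 2:
            raise ValueError(
                f"Invalid endpoint (must consist of 'METHOD /path'): {endpoint}"
            )
        method, path = elements
        # HttpMethod(...) raises ValueError for methods outside the enum.
        return cls(HttpMethod(method.upper()), path)
```

Because the dataclass is frozen, instances are hashable and can be stored in a set for membership checks such as `endpoint in self._supported_endpoints`.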

    fetch_scan_tasks: str = "namespaces/{namespace}/tables/{table}/tasks"


class Capability:

nit: maybe we can refactor the Endpoints class and consolidate this class

Comment on lines +969 to +976

if Capability.V1_NAMESPACE_EXISTS not in self._supported_endpoints:
    try:
        self.load_namespace_properties(namespace_tuple)
        return True
    except NoSuchNamespaceError:
        return False
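The fallback in this snippet can be exercised in isolation. The sketch below is a toy stand-in rather than the PR's catalog: `SketchCatalog`, `_head_namespace`, and the in-memory namespace dict are hypothetical, and only the branch structure mirrors the diff.

```python
class NoSuchNamespaceError(Exception):
    """Stand-in for the catalog exception of the same name."""


NAMESPACE_EXISTS = "HEAD /v1/{prefix}/namespaces/{namespace}"


class SketchCatalog:
    def __init__(self, supported_endpoints, namespaces):
        self._supported_endpoints = set(supported_endpoints)
        self._namespaces = dict(namespaces)
        self.calls = []  # records which "request" was made, for inspection

    def _head_namespace(self, namespace):
        # Placeholder for the cheap HEAD request.
        self.calls.append("HEAD")
        return namespace in self._namespaces

    def load_namespace_properties(self, namespace):
        # Placeholder for the GET request that loads namespace metadata.
        self.calls.append("GET")
        try:
            return self._namespaces[namespace]
        except KeyError:
            raise NoSuchNamespaceError(namespace) from None

    def namespace_exists(self, namespace):
        if NAMESPACE_EXISTS not in self._supported_endpoints:
            # Server lacks the HEAD endpoint: fall back to a full load.
            try:
                self.load_namespace_properties(namespace)
                return True
            except NoSuchNamespaceError:
                return False
        return self._head_namespace(namespace)
```

A catalog built with an empty endpoint set answers `namespace_exists` through the GET path, while one that advertises the HEAD endpoint never calls `load_namespace_properties`.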


    Returns:
        bool: True if the table exists, False otherwise.
"""
if Capability.V1_TABLE_EXISTS not in self._supported_endpoints:

nit: wdyt of adding the endpoints in the rest_mock here?
That way we don't need to add them to each test.

We can modify it when testing specific cases, such as when an older server does not return the view endpoints, or when testing the endpoint response directly
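One way to centralize this, sketched with the standard library only: a helper that builds the mocked config payload with a default endpoint list, which individual tests can override. The helper name, its parameters, and the abbreviated endpoint lists are hypothetical; wiring it into the existing `rest_mock` fixture is left out.

```python
import json

# Abbreviated lists for illustration; a real fixture would carry the full set.
TABLE_AND_NAMESPACE_ENDPOINTS = [
    "GET /v1/{prefix}/namespaces",
    "HEAD /v1/{prefix}/namespaces/{namespace}",
    "GET /v1/{prefix}/namespaces/{namespace}/tables",
    "HEAD /v1/{prefix}/namespaces/{namespace}/tables/{table}",
]
VIEW_ENDPOINTS = [
    "GET /v1/{prefix}/namespaces/{namespace}/views",
    "HEAD /v1/{prefix}/namespaces/{namespace}/views/{view}",
]


def config_payload(*, include_views=True, endpoints=None):
    """Build the JSON body a mocked config call would return."""
    if endpoints is None:
        endpoints = list(TABLE_AND_NAMESPACE_ENDPOINTS)
        if include_views:
            endpoints += VIEW_ENDPOINTS
    return json.dumps(
        {"defaults": {}, "overrides": {}, "endpoints": list(endpoints)}
    )
```

Most tests would use `config_payload()` unchanged; a test for an older server could pass `include_views=False`, and a test of the fallback path could pass `endpoints=[]`.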
