Skip to content

[core] Introduce Catalog#loadTableMetadata#7851

Open
tsreaper wants to merge 1 commit into
apache:masterfrom
tsreaper:feature/catalog-load-schema
Open

[core] Introduce Catalog#loadTableMetadata#7851
tsreaper wants to merge 1 commit into
apache:masterfrom
tsreaper:feature/catalog-load-schema

Conversation

@tsreaper
Copy link
Copy Markdown
Contributor

Purpose

All catalogs have implemented the loadTableMetadata method. In this PR we move this method into the interface, so we can use it if we only want to get some metadata like schemas from tables and don't want the table object itself.

Tests

This is a refactor. Existing tests should cover this change.

Copy link
Copy Markdown
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: [core] Introduce Catalog#loadTableMetadata

The interface promotion (moving loadTableMetadata to public on Catalog) is a sensible refactor. DelegateCatalog delegation and Javadoc are properly done.

Correctness Concerns

1. Error-handling semantics change (potential bug)

Callers rely on createOtherBranchTable returning null when a branch does not exist:

FileStoreTable otherTable = createOtherBranchTable(...);
if (otherTable != null) { ... } else { LOG.error(...); }
  • Before: SchemaManager.latest() returns Optional.empty() for a non-existent branch, method returns null gracefully.
  • After: catalog.loadTableMetadata(branchIdentifier) throws TableNotExistException. The broad catch (Exception e) { throw new RuntimeException(e); } escalates -- turning graceful degradation into hard failure.

Suggested fix: catch TableNotExistException specifically and return Optional.empty().

2. JdbcCatalog branch compatibility

JdbcCatalog.loadTableSchema calls assertMainBranch(identifier) which throws UnsupportedOperationException for non-main branches. After this PR, any fallback branch on JdbcCatalog will fail with wrapped UnsupportedOperationException.

Options:

  • Fall back to filesystem-based schema loading when the catalog throws UnsupportedOperationException
  • Guard this path to only activate for catalog types that support branch resolution

3. Minor: catalog lifecycle cost

Opening a full catalog via catalogLoader.load() just to load a schema could be more expensive than a direct filesystem read (especially for REST or Hive catalogs). The try-with-resources ensures proper cleanup, but worth considering.

Summary

The interface refactoring part is good. The FileStoreTableFactory change needs the error-handling fix (point 1) and JdbcCatalog compatibility (point 2) addressed to avoid regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants