Skip to content

fix: validate filesystem catalog identifiers#334

Merged
JingsongLi merged 6 commits into
apache:mainfrom
fallintoplace:fix/filesystem-catalog-identifier-validation
May 21, 2026
Merged

fix: validate filesystem catalog identifiers#334
JingsongLi merged 6 commits into
apache:mainfrom
fallintoplace:fix/filesystem-catalog-identifier-validation

Conversation

@fallintoplace
Copy link
Copy Markdown
Contributor

@fallintoplace fallintoplace commented May 19, 2026

What changed

This adds path-safe identifier validation helpers and enforces them in FileSystemCatalog before database or table paths are constructed.

The existing Identifier::new API remains infallible and compatible. FileSystemCatalog now rejects database and object names that are empty or whitespace-only, ., .., contain / or \, or contain control characters before using them in database_path or table_path.

C/Go binding identifier constructors remain compatible and do not reject path-control names at identifier construction time. Unsafe names are rejected when they are used with FileSystemCatalog operations.

Why

Filesystem catalog paths are derived from database and table names. Without validation, path-control input can escape the intended warehouse namespace on local filesystem storage.

Fixes #333.

Validation

  • cargo fmt --all -- --check
  • cargo check -p paimon --all-targets
  • cargo test -p paimon catalog::
  • cargo clippy --all-targets --workspace --features fulltext,vortex -- -D warnings
  • git diff --check

@fallintoplace fallintoplace marked this pull request as ready for review May 19, 2026 18:34
Comment thread crates/paimon/src/catalog/mod.rs Outdated
}

/// Validate this identifier's database and object names.
pub fn validate(&self) -> Result<()> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just in Identifier.new?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I will implement this.

@JingsongLi
Copy link
Copy Markdown
Contributor

After the signature modification, this is indeed a major change. Perhaps we should still keep Identifier:: new() invincible (without changing the signature) and only perform validation in the entry method of the FileSystemCatalog. (RESTCatalog will have validation in server side.)

Copy link
Copy Markdown
Contributor

@JingsongLi JingsongLi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JingsongLi JingsongLi merged commit ab9fb84 into apache:main May 21, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Filesystem catalog should reject path-control characters in identifiers

2 participants