[python] Add create_tag_from_timestamp to FileStoreTable #7946

Open
JunRuiLee wants to merge 2 commits into apache:master from JunRuiLee:pypaimon-create-tag-from-timestamp

@JunRuiLee
Contributor

Summary

  • Add create_tag_from_timestamp(tag_name, timestamp_millis, ignore_if_exists) method to FileStoreTable
  • Uses the existing SnapshotManager.earlier_or_equal_time_mills() binary search to find the latest snapshot at or before the given timestamp
  • Aligns with Java's CreateTagFromTimestampProcedure behavior
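
A minimal sketch of the lookup described above, for readers without the diff at hand. It is not the PR's code: the `Snapshot` dataclass and the `earlier_or_equal` function are hypothetical stand-ins, and the sketch assumes snapshots are held in a list sorted by ascending commit time.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical stand-in for illustration; not pypaimon's Snapshot class.
    id: int
    time_millis: int


def earlier_or_equal(
    snapshots: Sequence[Snapshot], timestamp_millis: int
) -> Optional[Snapshot]:
    """Return the latest snapshot committed at or before timestamp_millis.

    Assumes `snapshots` is sorted by ascending commit time.
    Returns None when every snapshot is newer than the timestamp.
    """
    lo, hi = 0, len(snapshots) - 1
    result = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if snapshots[mid].time_millis <= timestamp_millis:
            # Candidate found; keep searching to the right for a later one.
            result = snapshots[mid]
            lo = mid + 1
        else:
            # Snapshot is too new; search the older half.
            hi = mid - 1
    return result
```

A `None` result corresponds to the case where no snapshot exists at or before the timestamp, which the tagging method would have to report as an error rather than tag anything.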

Motivation

PyPaimon currently supports creating tags from a specific snapshot ID or the latest snapshot, but lacks the ability to create a tag from a timestamp — a feature available in the Java API via CreateTagFromTimestampProcedure. This is useful for batch workflows that need to pin a snapshot based on wall-clock time.

Tests

  • No new tests added in this PR (existing earlier_or_equal_time_mills is already covered by SnapshotManager tests)

Contributor

@JingsongLi left a comment

Clean and concise implementation. A couple of minor points:

  1. Consistency with Java API: In the Java Paimon TagManager, create_tag_from_timestamp typically uses SnapshotManager.earlierOrEqualTimeMills() which returns a Snapshot or null. Your implementation follows the same pattern. LGTM.

  2. Error message: The ValueError message says "earlier than or equal to" but the method name says earlier_or_equal_time_mills. Consider aligning: either say "at or before" or "earlier than or equal to" — just pick one style.

  3. Suggest adding a test: Even though this is a thin wrapper, a unit test that creates a couple of snapshots with known timestamps and then calls create_tag_from_timestamp would protect against regressions (e.g., if earlier_or_equal_time_mills semantics change).

Otherwise LGTM — this is a useful addition for time-based tag creation from the Python API.

JunRuiLee added 2 commits May 24, 2026 21:08
Allow creating a tag from the latest snapshot at or before a given
timestamp (in milliseconds), aligning with Java's
CreateTagFromTimestampProcedure behavior.
- Align error message wording to "at or before"
- Add unit tests for create_tag_from_timestamp
@JunRuiLee force-pushed the pypaimon-create-tag-from-timestamp branch from b0ec6c2 to 8f72e99 on May 24, 2026 13:09
Contributor Author

@JunRuiLee left a comment

@JingsongLi Thanks for the review!

  1. Consistency with Java API: Thanks, glad the pattern aligns.

  2. Error message: Updated to use "at or before" consistently (matching the docstring).

  3. Test: Added test_create_tag_from_timestamp (creates 2 snapshots, tags from a future timestamp, asserts correct snapshot) and test_create_tag_from_timestamp_no_snapshot_raises (timestamp 0, asserts ValueError).
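
For reference, the two test cases described above might look roughly like the sketch below. This is not the PR's test code: `FakeTable` is a hypothetical in-memory stand-in for the real table, the error message text and the behavior when a tag already exists are assumptions, and the lookup is a plain linear scan to keep the example short.

```python
from typing import Dict, List, Tuple


class FakeTable:
    """Hypothetical in-memory stand-in; not pypaimon's FileStoreTable."""

    def __init__(self) -> None:
        self.snapshots: List[Tuple[int, int]] = []  # (snapshot_id, commit_time_millis)
        self.tags: Dict[str, int] = {}  # tag_name -> snapshot_id

    def commit(self, time_millis: int) -> int:
        """Record a new snapshot with the given commit time and return its id."""
        snapshot_id = len(self.snapshots) + 1
        self.snapshots.append((snapshot_id, time_millis))
        return snapshot_id

    def create_tag_from_timestamp(
        self, tag_name: str, timestamp_millis: int, ignore_if_exists: bool = False
    ) -> None:
        candidates = [s for s in self.snapshots if s[1] <= timestamp_millis]
        if not candidates:
            raise ValueError(
                f"No snapshot at or before timestamp {timestamp_millis}"
            )
        if tag_name in self.tags:
            if ignore_if_exists:
                return
            # Assumption: a duplicate tag is an error unless explicitly ignored.
            raise ValueError(f"Tag '{tag_name}' already exists")
        # Tag the latest snapshot among those at or before the timestamp.
        self.tags[tag_name] = max(candidates, key=lambda s: s[1])[0]


def test_create_tag_from_timestamp() -> None:
    table = FakeTable()
    table.commit(1_000)
    second = table.commit(2_000)
    # A timestamp after both commits should resolve to the newest snapshot.
    table.create_tag_from_timestamp("t", 9_999_999)
    assert table.tags["t"] == second


def test_create_tag_from_timestamp_no_snapshot_raises() -> None:
    table = FakeTable()
    table.commit(1_000)
    try:
        table.create_tag_from_timestamp("t", 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

A timestamp that falls between the two commits would be a natural third case, since it checks that the earlier snapshot is chosen rather than the latest one.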

2 participants