Skip to content

Conversation

@geruh
Copy link
Contributor

@geruh geruh commented Jan 3, 2026

Rationale for this change

This PR adds the ability to rollback a table to a ancestoral snapshot given a timestamp. Some of this work was also done in #758, and is a progress pr to be merged after #2871 & #2878. This is standalone from the other changes but it makes use of the helpers in the other prs.

Additionally, adding some more tests.

Are these changes tested?

Yes

Are there any user-facing changes?

New API for meta

@geruh
Copy link
Contributor Author

geruh commented Jan 4, 2026

Looks like Test Infra failed need to retrigger tests.

Co-authored-by: Chinmay Bhat <12948588+chinmay-bhat@users.noreply.github.com>
Copy link
Contributor

@jayceslesar jayceslesar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a couple of suggestions and the tests also look good!

Comment on lines +477 to +478
def latest_ancestor_before_timestamp(table_metadata: TableMetadata, timestamp_ms: int) -> Snapshot | None:
"""Find the latest ancestor snapshot whose timestamp is before the provided timestamp.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice from a users perspective to allow this to be a datetime as well? Does differ from the java impl though...

Comment on lines +490 to +493
for ancestor in ancestors_of(table_metadata.current_snapshot(), table_metadata):
if timestamp_ms > ancestor.timestamp_ms > result_timestamp:
result = ancestor
result_timestamp = ancestor.timestamp_ms
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about using a max(filter(...)) statement here? I think a little easier to follow than the double greater than expression?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not, this does match the java


return self.set_current_snapshot(snapshot_id=snapshot_id)

def rollback_to_timestamp(self, timestamp_ms: int) -> ManageSnapshots:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could add support for datetime too.

yield from ancestors_of(to_snapshot, table_metadata)


def latest_ancestor_before_timestamp(table_metadata: TableMetadata, timestamp_ms: int) -> Snapshot | None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is an edge case if the timestamp exactly matches a snapshot's timestamp_ms.
Update: timestamp_ms: lookup snapshots before this timestamp
to
timestamp_ms: lookup snapshots strictly before this timestamp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe, but this logic has been in iceberg for at least 9 years though so would be hard to argue against or change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants