Skip to content

Conversation

@zhjwpku
Copy link
Collaborator

@zhjwpku zhjwpku commented Jan 17, 2026

No description provided.

/// When committing, these changes will be applied to the latest table snapshot. Commit
/// conflicts will be resolved by applying the changes to the new latest snapshot and
/// reattempting the commit.
class ICEBERG_EXPORT AppendFiles {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, we don't need this class. The Java impl's AppendFiles extends SnapshotUpdate so that one has more methods to call. We can just return FastAppend from NewFastAppend() and then return AppendFiles|MergeAppend from NewAppend().

/// \return Reference to this for method chaining
FastAppend& Set(const std::string& property, const std::string& value);

Kind kind() const override { return Kind::kUpdateSnapshot; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved to the base SnapshotUpdate class?

///
/// \param branch The branch name
/// \return Reference to this for method chaining
FastAppend& ToBranch(const std::string& branch);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simplify reuse SetTargetBranch or rename it to ToBranch if this is a better name?

/// \param property The property name
/// \param value The property value
/// \return Reference to this for method chaining
FastAppend& Set(const std::string& property, const std::string& value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we directly define this as a virtual function in the SnapshotUpdate?

Result<std::vector<ManifestFile>> WriteNewManifests();

private:
struct DataFilePtrHash {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move DataFilePtrHash and DataFilePtrEqual to manifest_entry.h (where DataFile is defined) or content_file_util.h (if DataFileSet is needed as below comment)?

// Cleanup after committing is disabled for FastAppend unless there are
// rewritten_append_manifests because:
// 1.) Appended manifests are never rewritten
// 2.) Manifests which are written out as part of appendFile are already cleaned
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// 2.) Manifests which are written out as part of appendFile are already cleaned
// 2.) Manifests which are written out as part of AppendFile are already cleaned

new_manifests_.clear();
}

// Clean up only rewritten_append_manifests as they are always owned by the table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Clean up only rewritten_append_manifests as they are always owned by the table
// Clean up only rewritten append manifests as they are always owned by the table

}

// Clean up only rewritten_append_manifests as they are always owned by the table
// Don't clean up append_manifests as they are added to the manifest list and are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Don't clean up append_manifests as they are added to the manifest list and are
// Don't clean up append manifests as they are added to the manifest list and are

std::vector<std::shared_ptr<DataFile>> files;
files.reserve(data_files.size());
std::ranges::copy(data_files, std::back_inserter(files));
ICEBERG_ASSIGN_OR_RAISE(auto written_manifests, WriteDataManifests(files, spec));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to revisit the signature of WriteDataManifests to use iterator begin and end of DataFile. We can add a TODO comment to WriteDataManifests for now.

// If there are new files and manifests were already written, clean them up
if (has_new_files_ && !new_manifests_.empty()) {
for (const auto& manifest : new_manifests_) {
ICEBERG_RETURN_UNEXPECTED(DeleteFile(manifest.manifest_path));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore the error?

for (const auto& manifest : append_manifests_) {
ManifestFile updated = manifest;
updated.added_snapshot_id = snapshot_id;
manifests.push_back(updated);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and std::move

for (const auto& manifest : rewritten_append_manifests_) {
ManifestFile updated = manifest;
updated.added_snapshot_id = snapshot_id;
manifests.push_back(updated);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

// Clean up new manifests that were written but not committed
if (!new_manifests_.empty()) {
for (const auto& manifest : new_manifests_) {
if (committed.find(manifest.manifest_path) == committed.end()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (committed.find(manifest.manifest_path) == committed.end()) {
if (!committed.contains(manifest.manifest_path))) {

// not compacted
if (!rewritten_append_manifests_.empty()) {
for (const auto& manifest : rewritten_append_manifests_) {
if (committed.find(manifest.manifest_path) == committed.end()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants