-
Notifications
You must be signed in to change notification settings - Fork 14
cfsctl: Fix garbage collection #200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I think a lot of this will change once #185 lands. We'd probably want to wait until then, I think |
|
I briefly skimmed the PR, I think as long as the split stream changes exposes the same That being said, it's totally fair to delay this PR ahead of a potentially dependent change as large as that. I could work on a rebase once that is landed. |
cgwalters
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not adding any tests here, but we definitely want that. Basic ones would be ensuring adding content and removing all the streams also GC's all objects, and two streams with shared objects but removing just one keeps the right objects etc.
|
Also sorry to be clear the above review was using the newly landed https://github.com/bootc-dev/agent-skills/blob/main/perform-forge-review/SKILL.md flow...and it somehow seems to have lost my header comment? I'll look at that. |
|
I rebased the branch onto main after the splitstream changes. It seems that in the new format manifests referring to other streams in a separate table |
87a7a1a to
5d61fcf
Compare
|
Done, I have included some improvements to the GC code as well. Now GC process can take caller-specified GC roots, and remove broken links in the repo. PTAL @cgwalters |
|
Hold on, I found another bug with actual bootc repository testing. |
|
Should be good now, I also included a regression test for the bug mentioned above |
cgwalters
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!! Sorry about the delay. I only have a relatively superficial review so far
crates/cfsctl/src/main.rs
Outdated
| GC { | ||
| // digest of root images for gc operations | ||
| #[clap(long, short = 'i')] | ||
| root_images: Vec<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait why would these be provided externally? Are they additional to the images present?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are added to the list of GC roots that would be used for GC analysis. The existing internal references (in streams/refs/ and in images/refs) will always be considered and added to the GC root list.
The main reasoning here to allow additional external roots is mainly because we don't have a concrete standard to partition */refs directory into different namespaces for different users of composefs. Even bootc does not call composefs in a way that would create these references. So it is better to just assume the */refs references are not enough and it is up to the user to supplement all sensible roots. This concrete standard/scheme is particularly more relevant in pursuing the unified containers storage approach and we probably need to come to it later.
Besides it's really handy to be able to specify the additional roots when doing tests, management and diagnosis.
crates/composefs/src/repository.rs
Outdated
| } | ||
|
|
||
| fn walk_symlinkdir(fd: OwnedFd, objects: &mut HashSet<ObjectID>) -> Result<()> { | ||
| fn walk_symlinkdir(fd: OwnedFd, entry_digests: &mut HashSet<CString>) -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should prefer Rust-native representations in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now uses OsString
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we want String here or OsString. Since the hash set stores digest strings, they should always be UTF8 and not really giving any UTF8 parsing errors. But the UTF8 conversion feels redundant anyways. Let me know your preferences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Unix platforms there's not a "conversion" to UTF-8, only a validation. (Yes, Windows is messier w/UTF-16 but we are not targeting that)
Now when we start to get into representation precision: any reason not to require a digest type here?
|
Addressed most of the comments, PTAL. |
|
OK I was looking at this more, and the thing that strongly relates to this (that to be clear is a pre-existing problem) is how we've ended up having this project trying to be too OCI independent in my opinion. This relates to the topic of garbage collection in that it's really important to have a strong data model - what references what and how. So on this topic I put up #216 as draft which has a high level goal of just starting by ensuring we store the manifest too (and have a schema for tags that name OCI images specifically including transports). But the big next step we could take is that when this project is compiled with oci support (and why wouldn't one do that?) the core GC model just directly parses manifest JSON and not splitstream - we would in fact cease to store splitstream for manifests and configs, only for tar layers. Anyways...just something I'm thinking about while looking at this, they are still distinct threads. |
(Commit message from Colin Walters, Assisted-by: Opus 4.5) The previous GC implementation was completely broken - it never added any object IDs to the live set and would mark every object for deletion. The code assumed an old composefs structure where non-first-level entries could directly link to object stores. This commit provides a working GC implementation that: - Properly walks named references in `*/refs/` directories to find GC roots - Recursively traverses splitstream named references (stream_refs) to find all transitively reachable objects, handling the two-layer OCI structure where config splitstreams reference layer splitstreams - Supports caller-specified additional GC roots via `--root-images` and `--root-streams` flags, useful for external integrations like bootc - Actually performs deletions (previously was dry-run only) - Cleans up broken symlinks in images/ and streams/ after GC - Uses `--dry-run` flag (conventional) instead of `--force` (inverted) Includes comprehensive test coverage for: - Stream and image GC with various root configurations - Shared objects between multiple streams/images - Named reference traversal with different table vs repo names Signed-off-by: Chaser Huang <huangkangjing@gmail.com> Signed-off-by: Colin Walters <walters@verbum.org>
Refactor the GC API: - gc(additional_roots) performs GC, returns GcResult with statistics - gc_dry_run(additional_roots) previews without deleting Key difference: gc_dry_run() only acquires a shared lock since it doesn't modify anything, allowing concurrent reads during preview. Additional roots are looked up in both images and streams by name, useful for external integrations (like bootc) that track roots outside the repository's refs. GcResult includes: - objects_removed: count of unreferenced objects deleted - objects_bytes: total bytes freed - images_pruned/streams_pruned: broken symlinks cleaned up Tests verify GcResult values match expected counts. Assisted-by: Claude (Opus) Signed-off-by: Colin Walters <walters@verbum.org>
|
Hi @chaserhkj I took the liberty of making the "force => dry-run" change per above, and also:
Then I added a commit on top which further tweaks the API. WDYT? |
Johan-Liebert1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested some basic cases locally. LGTM
|
Sorry was working on another project in the last few days. Thanks for the changes and merging! |
This PR fixes the broken garbage collection implementation provided by
cfsctl. To my knowledge this GC implementation is not used anywhere else, but this fixed implementation could be helpful for downstream project like bootc as well, see bootc-dev/bootc#1808Previously, the GC implementation was completely broken as it didn't add any object IDs to the live set and would thus always mark every object for deletion. The code in
gc_categoryseems to be assuming an old composefs structure where non-first-level entries instreams/andimages/directory (e.g.streams/refs/some_distro/some_version) could directly link to object stores. But currently these links always link to first-level entries first and first-level entries would then link to objects in store.This PR fixes this by doing a proper walk to add all objects referred from named references to the live set, such that all unlinked objects would be marked for deletion.
Furthermore, the old implementation was doing a naive shallow walk for the streams, this is problematic for pulled OCI images in composefs repo, since they have two layers of links, a config split stream linking all layers, and each layers linking to their layer contents. This PR adds a full walk algorithm to walk down and prune the entire stream tree to mark unlinked objects for deletion.
Currently this is still dry-run only, but I have changed the output format to add a "#" before all non-delete lines and the output could now just be piped to a shell to perform the deletion.
Note that bootc has its own GC implementations here. But bootc uses bootloader entries as part of the GC root, which I believe should be considered out of composefs scope. Bootc GC also does not prune streams. I think ideally we should have a complete GC implementation in composefs and bootc should just forward the call.