@alexlarsson
Contributor

Based on ideas from #141

This is an initial version of ostree support. This allows pulling
from local and remote ostree repos, which will create a set of
regular file content objects, as well as a blob containing all the
remaining ostree objects. From the blob we can create an image.

When pulling a commit, a base blob (i.e. "the previous version") can be
specified. Any objects in that base blob will not be downloaded. If a
name is given for the pulled commit, then pre-existing blobs with the
same name will automatically be used as a base blob.
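
The base-blob behaviour can be pictured roughly as follows. This is only a sketch: the `Checksum` alias and the `objects_to_fetch` helper are hypothetical names, and the real code tracks objects differently.

```rust
use std::collections::HashSet;

/// Hypothetical 32-byte ostree object checksum (sha256).
type Checksum = [u8; 32];

/// Return the objects that still need to be downloaded: everything the new
/// commit wants, minus whatever the base blob already provides.
fn objects_to_fetch(wanted: &[Checksum], base: &HashSet<Checksum>) -> Vec<Checksum> {
    wanted
        .iter()
        .filter(|id| !base.contains(*id))
        .copied()
        .collect()
}
```

With a base containing objects A and B, pulling a commit that needs A, B and C would only download C.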

This is an initial version and there are several things missing:

  • Pull operations are completely serial
  • There is no support for ostree summary files
  • There is no support for ostree delta files
  • There is no caching of local file availability (other than base blob)
  • Local ostree repos only support archive mode

@alexlarsson alexlarsson force-pushed the ostree-support branch 2 times, most recently from e0e827f to 9c5b086 Compare June 17, 2025 06:54
Collaborator

@allisonkarlitskaya allisonkarlitskaya left a comment

I love this! Thanks for working on it!

I made some comments on the first round of commits. Feel free to adjust those and PR them separately: we can merge those now without further discussion.

The blobs thing is going to need a call.

I didn't review the crate addition in any detail at all. That's probably also going to need a call :)

@alexlarsson
Contributor Author

Hmmm, thinking more about this. We probably want a "content type" magic thing in the splitstream header as well, so we can error out if the wrapped thing is of the wrong type.
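
A minimal sketch of such a check, assuming a hypothetical `Header` struct and made-up tag values (the actual splitstream header layout is not shown in this thread):

```rust
/// Hypothetical header; only the field relevant to this check is modelled.
struct Header {
    content_type: u64,
}

/// Hypothetical content-type tags, chosen only for illustration.
const OSTREE_COMMIT: u64 = 0x1;
const OCI_LAYER: u64 = 0x2;

/// Refuse to interpret a splitstream whose wrapped content is of another type.
fn check_content_type(header: &Header, expected: u64) -> Result<(), String> {
    if header.content_type == expected {
        Ok(())
    } else {
        Err(format!(
            "unexpected splitstream content type {:#x}, expected {:#x}",
            header.content_type, expected
        ))
    }
}
```

The point is that a reader expecting an ostree commit fails early with a clear error instead of misparsing, say, an OCI layer.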

@alexlarsson alexlarsson force-pushed the ostree-support branch 2 times, most recently from 2ed83a2 to c041afe Compare June 19, 2025 09:11
@alexlarsson
Contributor Author

Ok. I reworked this to use splitstreams for object maps and commits. By using an object mapping to find the object map, the content of the commit's splitstream is just the commit data, so the sha256 of that splitstream matches the ostree commit ID.

@alexlarsson
Contributor Author

@allisonkarlitskaya There is still lots to do here. But have a look at this approach and see what you think.

@alexlarsson
Contributor Author

Added some further changes. We now validate all objects when pulling and all non-file objects when creating images. It's hard to efficiently validate file objects during create-image though, since we would like to avoid re-reading the external object files to compute the sha256.

Remaining things to do:

  • Stream larger objects into repo
  • Support summaries and summary branches for remote repos
  • Support deltas when remote pulling
  • Parallelize downloads of objects
  • Report pull progress in some sane way
  • Use some kind of local cache for available objects other than just those from "previous version"
  • Handle GPG validation of commit objects

@alexlarsson alexlarsson force-pushed the ostree-support branch 4 times, most recently from 481e604 to e88573d Compare June 30, 2025 14:26
@alexlarsson
Contributor Author

I started working on the delta support, but it failed because of an issue in gvariant-rs.

Collaborator

@allisonkarlitskaya allisonkarlitskaya left a comment

It occurs to me that it might be interesting not to sort the table of fs-verity references, and it might also be interesting to permit duplicate items.

On the topic of deferring writing of objects to a background thread, this would allow us to write "external object #123" based on a sequential index to the splitstream without actually knowing the hash value yet, and then fill in the actual values in the header at the end when we're writing: it helps there that the fs-verity references aren't compressed and therefore not part of the stream...
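
One way to picture the "sequential index now, hash later" idea is the sketch below. The `RefTable` type and its methods are hypothetical and only illustrate the bookkeeping; the table is deliberately unsorted and allows duplicates, as suggested above.

```rust
/// Hypothetical digest type; the real code is generic over an fs-verity hash.
type Digest = [u8; 32];

/// Collects external references by sequential index while the stream body is
/// being written. Digests are filled in once they are known.
#[derive(Default)]
struct RefTable {
    slots: Vec<Option<Digest>>,
}

impl RefTable {
    /// Reserve the next index; the stream body can record it immediately.
    fn reserve(&mut self) -> usize {
        self.slots.push(None);
        self.slots.len() - 1
    }

    /// Record a digest once it is known (for example, reported by a
    /// background writer). Returns false if the index was never reserved.
    fn resolve(&mut self, index: usize, digest: Digest) -> bool {
        match self.slots.get_mut(index) {
            Some(slot) => {
                *slot = Some(digest);
                true
            }
            None => false,
        }
    }

    /// Produce the table for the header, or the first unresolved index.
    fn finish(self) -> Result<Vec<Digest>, usize> {
        self.slots
            .into_iter()
            .enumerate()
            .map(|(i, d)| d.ok_or(i))
            .collect()
    }
}
```

Because the uncompressed header is written last, `finish` can run after every pending object write has reported its digest.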

@cgwalters
Collaborator

It seems like we should get in the splitstream changes in 0f6d69e at least sooner rather than later? Can you file a separate PR?

alexlarsson added a commit to alexlarsson/composefs-rs that referenced this pull request Sep 29, 2025
This changes the splitstream format a bit, with the goal of allowing
splitstreams to support ostree files as well (see containers#144)

The primary differences are:

 * The header is not compressed
 * All referenced fs-verity objects are stored in the header, including
   external chunks, mapped splitstreams and (a new feature) references
   that are not used in chunks.
 * The mapping table is separate from the reference table (and generally
   smaller), and indexes into it.
 * There is a magic value to detect the file format.
 * There is a magic content type to detect the type wrapped in the stream.
 * We store a tag for what ObjectID format is used
 * The total size of the stream is stored in the header.

The ability to reference file objects in the repo even if they are not
part of the splitstream "content" will be useful for the ostree
support to reference file content objects.

This change also allows more efficient GC enumeration, because we
don't have to parse the entire splitstream to find the referenced
objects.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
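
As a rough illustration of the layout this commit message describes, here is an in-memory sketch. All field names, types and the shape of the mapping key are assumptions; this is not the on-disk format.

```rust
type ObjectId = [u8; 32];

/// Hypothetical in-memory view of the header described above.
struct SplitstreamHeader {
    /// Identifies the file format.
    magic: [u8; 8],
    /// Identifies the type of content wrapped in the stream.
    content_type: u64,
    /// Total size of the stream.
    total_size: u64,
    /// Every referenced fs-verity object, whether used in chunks or not.
    refs: Vec<ObjectId>,
    /// Mapped splitstreams: (lookup key, index into `refs`).
    mappings: Vec<(ObjectId, usize)>,
}

impl SplitstreamHeader {
    /// GC enumeration only needs the header, never the compressed body.
    fn referenced_objects(&self) -> impl Iterator<Item = &ObjectId> {
        self.refs.iter()
    }

    /// Resolve a mapping to the fs-verity object it points at.
    fn lookup_mapping(&self, key: &ObjectId) -> Option<&ObjectId> {
        self.mappings
            .iter()
            .find(|(k, _)| k == key)
            .and_then(|(_, idx)| self.refs.get(*idx))
    }
}
```

Keeping the mapping table as indexes into `refs` means each object ID is stored once, and a reference can exist without any mapping or chunk using it.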
alexlarsson added a commit to alexlarsson/composefs-rs that referenced this pull request Oct 6, 2025
@alexlarsson alexlarsson force-pushed the ostree-support branch 2 times, most recently from c788da2 to 2ee193a Compare October 6, 2025 14:58
allisonkarlitskaya pushed a commit to allisonkarlitskaya/composefs-rs that referenced this pull request Nov 12, 2025
@alexlarsson
Contributor Author

I rebased the code I had here on top of the splitstream changes that were merged into main. I removed all the code that gave ostree commits some kind of named ref for now, as we need to figure out exactly how that will look. But it has the basic minimal support to pull from a (remote or local) ostree repo into an ostree-commit-* splitstream object, and you can specify another such splitstream as a base name so that only objects not in that version are downloaded. (Eventually we want to automatically pass the previous commit of the named ostree ref as the base.)

It also supports converting ostree commits to images, which can then be mounted.

There are tons of things that could be added on top of this:

  • Named refs support
  • Support summaries for remote repos
  • Use of static deltas when downloading from http
  • Some kind of object file map caching between commits other than the specified base name
  • Parallel pulling with progress
  • gpg validation
  • Handle more local repo modes (currently only archive and bare-user-only)

To test this you can run for example:

$ cfsctl --repo repo ostree pull-local /var/lib/flatpak/repo remotes/flathub/app/org.gnome.gedit/x86_64/stable
$ cfsctl --repo repo pull --base-name ostree-commit-306eed5546b7af406377e4947a118bd942729d9b380099fb191728654e2c925b https://dl.flathub.org/repo/ app/org.gnome.gedit/aarch64/stable
$ cfsctl --repo repo ostree create-image --image-name gedit ostree-commit-306eed5546b7af406377e4947a118bd942729d9b380099fb191728654e2c925b
$ sudo cfsctl --repo repo mount refs/gedit mnt

@alexlarsson
Contributor Author

The CI failures seem unrelated to this change. Weird.

@cgwalters
Collaborator

Can you explain a bit what the medium-term plan is for this? Is it flatpak using this? Would we try to create a shim C API? Or is it for something like bootc (or rpm-ostree?) fetching from ostree repos? Something else?


Something else I'm thinking about here is the intersection with #216 - this is a half-baked thought but...maybe we could always map ostree to OCI (instead of mapping ostree to splitstream), though depending on how we do that there could be nontrivial metadata overhead.

@alexlarsson
Contributor Author

@cgwalters The origin of this is the work that @allisonkarlitskaya was doing on a quickly moving, experimental flatpak-next codebase (in Rust). The idea for a flatpak-next would be to have a native composefs local on-disk format, but still allow installing from ostree-based remotes.

I would not say there is any real, concrete usecase for this atm, but it seems important to me to at least have the new codebase and on-disk format flexible enough to be able to handle ostree if we need it for backwards compatibility reasons.

@alexlarsson
Contributor Author

> Something else I'm thinking about here is the intersection with #216 - this is a half-baked thought but...maybe we could always map ostree to OCI (instead of mapping ostree to splitstream), though depending on how we do that there could be nontrivial metadata overhead.

I'm not sure about this. For sure we can convert to anything, but if we ever want to have efficient "delta ostree pull" based on a previous commit we can't lose fidelity of the ostree objects.

@allisonkarlitskaya
Collaborator

I would actually like to see this merged. It's good code and will almost definitely be useful to somebody at some point. It also doesn't really hurt anyone since it's in a separate crate.

@allisonkarlitskaya
Collaborator

I also think the fact that it has nothing to do with OCI is great. I'm not as much on the OCI train and one of the main things you hear from composefs detractors is "but OCI sucks!". Showing that composefs != OCI is good IMHO.

@cgwalters
Collaborator

> @cgwalters The origins of this is the work that @allisonkarlitskaya was doing on a quickly moving flatpak-next experimental codebase (in rust).

OK right! That is super exciting to me and a lot of potential there. If we're semi-serious about pushing it forward then it is indeed a strong argument against #213 as written (tangent but an important one).

> For sure we can convert to anything, but if we ever want to have efficient "delta ostree pull" based on a previous commit we can't lose fidelity of the ostree objects.

Yes, of course - my proposal wouldn't lose that, but it's not the most important thread right now.

@cgwalters
Collaborator

> I also think the fact that it has nothing to do with OCI is great.

I think unless we prove out that composefs can be a very good way to store OCI, then it is not worth investing in. Thankfully that's not the case - I think it is (and I believe you do too!).

So it's not that it has "nothing to do with OCI" (right?) - how about "has the capability to easily/natively store any type of content that one would want to represent as read-only immutable versioned filesystem trees".

For example, today Android as far as I know uses fsverity on single zip files, and they've made it work quite well, but it's harder to get deduplication across apps that way, and maybe someday they go to a composefs-like model.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
@alexlarsson
Contributor Author

I rebased this, let's see if CI passes now.

This lets you look up a ref digest from the splitstream by index
and is needed by the ostree code.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
@allisonkarlitskaya
Collaborator

> > I also think the fact that it has nothing to do with OCI is great.
>
> I think unless we prove out that composefs can be a very good way to store OCI, then it is not worth investing in. Thankfully that's not the case - I think it is (and I believe you do too!).
>
> So it's not that it has "nothing to do with OCI" (right?) - how about "has the capability to easily/natively store any type of content that one would want to represent as read-only immutable versioned filesystem trees".

Just to be clear, when I said "it has nothing to do with OCI" I specifically meant composefs-ostree, not composefs-rs generally (which very clearly was designed with OCI in mind).

Very obviously the main target of composefs-rs right now is bootc (OCI), probably followed by container storage (obviously also OCI). flatpak is probably a distant third at the moment, and indeed, even that has something to do with OCI (the current flatpak demo only works with OCI, in fact)...

> For example, today Android as far as I know uses fsverity on single zip files, and they've made it work quite well, but it's harder to get deduplication across apps that way, and maybe someday they go to a composefs-like model.

Ya, that's sort of what I meant... it would be cool to show that you can really do a lot of different things with this stuff...

Based on ideas from containers#141

This is an initial version of ostree support. This allows pulling from
local and remote ostree repos, which will create a set of regular file
content objects, as well as a commit splitstream containing all the
remaining ostree objects and file data. From the splitstream we can
create an image.

When pulling a commit, a base commit (i.e. "the previous version") can be
specified. Any objects in that base commit will not be downloaded. If a
name is given for the pulled commit, then pre-existing blobs with the
same name will automatically be used as a base commit.

This is an initial version and there are several things missing:
 * Pull operations are completely serial
 * There is no support for ostree summary files
 * There is no support for ostree delta files
 * There is no caching of local file availability (other than base commit)
 * Local ostree repos only support archive mode
 * There is no GPG validation on ostree pull

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Collaborator

@cgwalters cgwalters left a comment

Just an initial pass

for d in dirs_data.iter() {
let (_name, tree_checksum, meta_checksum) = d.to_tuple();

self.maybe_fetch_dirmeta(meta_checksum.try_into().unwrap());
Collaborator

These should use ? instead of unwrap(), since this code is reachable with data that comes over the network.

Hmm, we probably want a clippy lint so that unwraps in non-test code require a justification.
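
A small sketch of the `?` style suggested here, using only the standard library. The helper names `parse_checksum` and `fetch_dirmeta` are hypothetical, and `String` stands in for whatever error type the crate actually uses.

```rust
use std::convert::TryInto;

/// Convert a checksum received from a remote into a fixed-size array,
/// returning an error instead of panicking on malformed input.
fn parse_checksum(bytes: &[u8]) -> Result<[u8; 32], String> {
    bytes
        .try_into()
        .map_err(|_| format!("invalid checksum length {}", bytes.len()))
}

/// Hypothetical caller: `?` hands the failure to whoever started the pull.
fn fetch_dirmeta(meta_checksum: &[u8]) -> Result<[u8; 32], String> {
    let id = parse_checksum(meta_checksum)?;
    // The real code would enqueue a fetch for `id` here.
    Ok(id)
}
```

A truncated checksum from a misbehaving server then surfaces as an error from the pull rather than aborting the process.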


self.enqueue_fetch(commit_id, ObjectType::Commit);

// TODO: Support deltas
Collaborator

Going to be fun...

// TODO: At least for http we should make parallel fetches
while !self.outstanding.is_empty() {
let fetch = self.outstanding.pop_front().unwrap();
println!(
Collaborator

We shouldn't println! in a library. Let's use indicatif at least...

}

impl RepoMode {
pub fn parse(s: &str) -> Result<RepoMode> {
Collaborator

I think this should be an impl FromStr; there are also crates for that.
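
For illustration, a `FromStr` implementation might look like the sketch below. The variant list and the `String` error type are assumptions; the PR's actual `RepoMode` may differ.

```rust
use std::str::FromStr;

/// Hypothetical set of modes, named after the `mode=` values found in an
/// ostree repo config.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum RepoMode {
    Bare,
    BareUser,
    BareUserOnly,
    Archive,
}

impl FromStr for RepoMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "bare" => Ok(RepoMode::Bare),
            "bare-user" => Ok(RepoMode::BareUser),
            "bare-user-only" => Ok(RepoMode::BareUserOnly),
            // ostree historically spelled archive mode "archive-z2".
            "archive" | "archive-z2" => Ok(RepoMode::Archive),
            other => Err(format!("unsupported repo mode {:?}", other)),
        }
    }
}
```

Callers can then write `"archive".parse::<RepoMode>()` instead of calling a bespoke `parse` function.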

Comment on lines +108 to +113
let variant_header_size = size_of::<SizedVariantHeader>();
if data.len() < variant_header_size {
bail!("Sized variant too small");
}

let aligned: AlignedBuf = data[0..variant_header_size].to_vec().into();
Collaborator

In my experience almost all uses of array slicing can be replaced with a helper function that can help avoid implicit panics, and this is I think one of them.

let data = data.get(..variant_header_size).ok_or_else(|| anyhow!("Sized variant too small"))?;

Ditto re various other cases here.
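
A self-contained sketch of the kind of helper meant here, using only the standard library. The name `split_header` and the `&'static str` error type are assumptions made for the example.

```rust
/// Split off the first `n` bytes, or return an error if the buffer is too
/// short. Unlike `data[..n]`, this never panics.
fn split_header(data: &[u8], n: usize) -> Result<(&[u8], &[u8]), &'static str> {
    let header = data.get(..n).ok_or("Sized variant too small")?;
    let rest = data.get(n..).ok_or("Sized variant too small")?;
    Ok((header, rest))
}
```

Folding the length check and the slicing into one fallible call means the two cannot drift apart as the code changes.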

s.push((key, value))
}

gv!("(uuuusa(ayay))").serialize_to_vec(&(*uid, *gid, *mode, *zero, symlink_target.to_str(), &s))
Collaborator

(Can we hoist some of these into nicely documented consts, or is that not possible with the macro?)

// Decompress rest
let mut uncompressed = DeflateDecoder::new(compressed_data);

// TODO: Stream files into repo instead of reading it all
Collaborator

We just landed a lot of code for this for the OCI importer, so it should be quite doable.

}

impl<ObjectID: FsVerityHashValue> LocalRepo<ObjectID> {
pub fn open_path(
Collaborator

Not a blocker, but I would like to be sure we always have #[deny(missing_docs)]. Hmm, I bet it's that we're not inheriting the workspace lints.

if filetype.is_symlink() {
Ok((zlib_header, Box::new(empty())))
} else {
let fd_path = format!("/proc/self/fd/{}", path_fd.as_fd().as_raw_fd());
Collaborator

Tangential to this but I'd like to use https://docs.rs/crate/rustix-linux-procfs/latest I think
