
Conversation

@baszalmstra
Contributor

Description

When the build backend metadata cache is outdated (e.g., due to timestamp changes), pixi would regenerate metadata by calling the build backend again. Previously, this always generated a new random ID, which invalidated downstream caches (like source metadata) even when the actual metadata content was unchanged.

This change:

  • Compares the newly generated metadata with the previously cached metadata
  • Reuses the existing ID if the content is equivalent (excluding the id, timestamp, and cache_version fields)
  • Switches from random u64 IDs to nanoid strings for better uniqueness
  • Bumps the cache version to v1 due to the format change

This optimization prevents unnecessary cache invalidation when the build backend returns the same metadata, improving build performance.
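
A minimal sketch of the ID-reuse decision described above, assuming the `nanoid` crate. The struct and field names are illustrative, not pixi's actual types:

```rust
use nanoid::nanoid;

// Hypothetical content struct: only fields that reflect the backend response.
#[derive(Clone, Debug, PartialEq, Eq)]
struct MetadataContent {
    project_model_hash: String,
    outputs: Vec<String>,
}

// Hypothetical cache entry: bookkeeping fields plus the content.
#[derive(Clone, Debug)]
struct CachedMetadata {
    id: String,         // previously a random u64, now a nanoid string
    timestamp: u64,     // excluded from the equivalence check
    cache_version: u32, // excluded from the equivalence check
    content: MetadataContent,
}

/// Build the cache entry for freshly generated metadata, reusing the old ID
/// when only the bookkeeping fields (id, timestamp, cache_version) differ.
fn cache_entry_for(
    new_content: MetadataContent,
    previous: Option<&CachedMetadata>,
    now: u64,
    cache_version: u32,
) -> CachedMetadata {
    let id = match previous {
        // Content unchanged: keep the old ID so downstream caches stay valid.
        Some(prev) if prev.content == new_content => prev.id.clone(),
        // Content changed (or no previous entry): mint a fresh nanoid.
        _ => nanoid!(),
    };
    CachedMetadata { id, timestamp: now, cache_version, content: new_content }
}

fn main() {
    let content = MetadataContent {
        project_model_hash: "abc123".into(),
        outputs: vec!["my-pkg-1.0-h0_0.conda".into()],
    };
    let old = CachedMetadata { id: nanoid!(), timestamp: 1, cache_version: 1, content: content.clone() };
    let refreshed = cache_entry_for(content, Some(&old), 2, 1);
    assert_eq!(refreshed.id, old.id); // ID preserved, downstream caches untouched
}
```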

How Has This Been Tested?

  • Manually tested

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.

Tools: Claude Code Opus 4.5

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added sufficient tests to cover my changes
  • I have verified that changes that would impact the JSON schema have been made in schema/model.py

@baszalmstra changed the title from "Claude/fix metadata cache invalidation 5p pf1" to "fix: preserve cache ID when metadata is content-equivalent" on Jan 20, 2026
@baszalmstra force-pushed the claude/fix-metadata-cache-invalidation-5pPf1 branch 2 times, most recently from c68717b to b480472 on January 20, 2026 at 15:21
This refactoring improves maintainability of the metadata comparison logic:

- Created `CachedCondaMetadataContent` struct containing all fields that
  represent the actual build backend response (project_model_hash, build_source,
  build_variants, build_variant_files, input_globs, input_files, outputs)
- `CachedCondaMetadata` now wraps this content struct with cache metadata
  (id, cache_version, timestamp)
- Changed `BinaryHeap` to `BTreeSet` for build_variant_files and input_globs
  to enable `PartialEq`/`Eq` derives
- Simplified `is_content_equivalent()` to just compare `self.content == other.content`
- Refactored `verify_cache_freshness` to take a reference and return bool,
  avoiding unnecessary clones when cache is stale

This makes the equality comparison less error-prone: when fields are added or
removed from `CachedCondaMetadataContent`, the derived `PartialEq` automatically
includes them, preventing bugs from manual field listing.
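
A compact sketch of the resulting shape, under the same caveat that field types and the exact field set are assumptions (only a subset of the fields listed above is shown):

```rust
use std::collections::BTreeSet;

// Fields that represent the actual build backend response.
#[derive(Debug, PartialEq, Eq)]
struct CachedCondaMetadataContent {
    project_model_hash: String,
    build_variant_files: BTreeSet<String>, // BTreeSet rather than BinaryHeap so Eq can be derived
    input_globs: BTreeSet<String>,
    outputs: Vec<String>,
}

// Cache bookkeeping wraps the content; it never takes part in equality.
#[derive(Debug)]
struct CachedCondaMetadata {
    id: String,
    cache_version: u32,
    timestamp: u64,
    content: CachedCondaMetadataContent,
}

impl CachedCondaMetadata {
    // Compare only the wrapped content: adding or removing a field on the
    // content struct automatically flows into the derived PartialEq.
    fn is_content_equivalent(&self, other: &Self) -> bool {
        self.content == other.content
    }
}

fn main() {
    let content = || CachedCondaMetadataContent {
        project_model_hash: "abc123".into(),
        build_variant_files: BTreeSet::from(["variants.yaml".to_string()]),
        input_globs: BTreeSet::from(["src/**/*.rs".to_string()]),
        outputs: vec!["my-pkg".into()],
    };
    let a = CachedCondaMetadata { id: "old".into(), cache_version: 0, timestamp: 1, content: content() };
    let b = CachedCondaMetadata { id: "new".into(), cache_version: 1, timestamp: 2, content: content() };
    assert!(a.is_content_equivalent(&b)); // same content despite different bookkeeping
}
```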
@baszalmstra force-pushed the claude/fix-metadata-cache-invalidation-5pPf1 branch from b480472 to 74ba0c7 on January 20, 2026 at 15:33