
Conversation

@baszalmstra
Contributor

Description

When the build backend metadata cache is outdated (e.g., due to timestamp changes), pixi would regenerate metadata by calling the build backend again. Previously, this always generated a new random ID, which invalidated downstream caches (like source metadata) even when the actual metadata content was unchanged.

This change:

  • Compares the newly generated metadata with the previously cached metadata
  • Reuses the existing ID if the content is equivalent (excluding the id, timestamp, and cache_version fields)
  • Switches from random u64 IDs to nanoid strings for better uniqueness
  • Bumps the cache version to v1 due to the format change

This optimization prevents unnecessary cache invalidation when the build backend returns the same metadata, improving build performance.
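
A minimal sketch of the ID-reuse decision described above, assuming the `nanoid` crate. The struct and field names are illustrative, not pixi's actual types:

```rust
use nanoid::nanoid;

// Hypothetical content struct: only fields that reflect the backend response.
#[derive(Clone, Debug, PartialEq, Eq)]
struct MetadataContent {
    project_model_hash: String,
    outputs: Vec<String>,
}

// Hypothetical cache entry: bookkeeping fields plus the content.
#[derive(Clone, Debug)]
struct CachedMetadata {
    id: String,         // previously a random u64, now a nanoid string
    timestamp: u64,     // excluded from the equivalence check
    cache_version: u32, // excluded from the equivalence check
    content: MetadataContent,
}

/// Build the cache entry for freshly generated metadata, reusing the old ID
/// when only the bookkeeping fields (id, timestamp, cache_version) differ.
fn cache_entry_for(
    new_content: MetadataContent,
    previous: Option<&CachedMetadata>,
    now: u64,
    cache_version: u32,
) -> CachedMetadata {
    let id = match previous {
        // Content unchanged: keep the old ID so downstream caches stay valid.
        Some(prev) if prev.content == new_content => prev.id.clone(),
        // Content changed (or no previous entry): mint a fresh nanoid.
        _ => nanoid!(),
    };
    CachedMetadata { id, timestamp: now, cache_version, content: new_content }
}

fn main() {
    let content = MetadataContent {
        project_model_hash: "abc123".into(),
        outputs: vec!["my-pkg-1.0-h0_0.conda".into()],
    };
    let old = CachedMetadata { id: nanoid!(), timestamp: 1, cache_version: 1, content: content.clone() };
    let refreshed = cache_entry_for(content, Some(&old), 2, 1);
    assert_eq!(refreshed.id, old.id); // ID preserved, downstream caches untouched
}
```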

How Has This Been Tested?

  • Manually tested

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.

Tools: Claude Code Opus 4.5

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added sufficient tests to cover my changes
  • I have verified that changes that would impact the JSON schema have been made in schema/model.py

@baszalmstra changed the title from "Claude/fix metadata cache invalidation 5p pf1" to "fix: preserve cache ID when metadata is content-equivalent" on Jan 20, 2026
@baszalmstra force-pushed the claude/fix-metadata-cache-invalidation-5pPf1 branch 2 times, most recently from c68717b to b480472 on January 20, 2026 at 15:21
This refactoring improves maintainability of the metadata comparison logic:

- Created `CachedCondaMetadataContent` struct containing all fields that
  represent the actual build backend response (project_model_hash, build_source,
  build_variants, build_variant_files, input_globs, input_files, outputs)
- `CachedCondaMetadata` now wraps this content struct with cache metadata
  (id, cache_version, timestamp)
- Changed `BinaryHeap` to `BTreeSet` for build_variant_files and input_globs
  to enable `PartialEq`/`Eq` derives
- Simplified `is_content_equivalent()` to just compare `self.content == other.content`
- Refactored `verify_cache_freshness` to take a reference and return bool,
  avoiding unnecessary clones when cache is stale

This makes the equality comparison less error-prone: when fields are added or
removed from `CachedCondaMetadataContent`, the derived `PartialEq` automatically
includes them, preventing bugs from manual field listing.
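
A compact sketch of the resulting shape, under the same caveat that field types and the exact field set are assumptions (only a subset of the fields listed above is shown):

```rust
use std::collections::BTreeSet;

// Fields that represent the actual build backend response.
#[derive(Debug, PartialEq, Eq)]
struct CachedCondaMetadataContent {
    project_model_hash: String,
    build_variant_files: BTreeSet<String>, // BTreeSet rather than BinaryHeap so Eq can be derived
    input_globs: BTreeSet<String>,
    outputs: Vec<String>,
}

// Cache bookkeeping wraps the content; it never takes part in equality.
#[derive(Debug)]
struct CachedCondaMetadata {
    id: String,
    cache_version: u32,
    timestamp: u64,
    content: CachedCondaMetadataContent,
}

impl CachedCondaMetadata {
    // Compare only the wrapped content: adding or removing a field on the
    // content struct automatically flows into the derived PartialEq.
    fn is_content_equivalent(&self, other: &Self) -> bool {
        self.content == other.content
    }
}

fn main() {
    let content = || CachedCondaMetadataContent {
        project_model_hash: "abc123".into(),
        build_variant_files: BTreeSet::from(["variants.yaml".to_string()]),
        input_globs: BTreeSet::from(["src/**/*.rs".to_string()]),
        outputs: vec!["my-pkg".into()],
    };
    let a = CachedCondaMetadata { id: "old".into(), cache_version: 0, timestamp: 1, content: content() };
    let b = CachedCondaMetadata { id: "new".into(), cache_version: 1, timestamp: 2, content: content() };
    assert!(a.is_content_equivalent(&b)); // same content despite different bookkeeping
}
```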
@baszalmstra force-pushed the claude/fix-metadata-cache-invalidation-5pPf1 branch from b480472 to 74ba0c7 on January 20, 2026 at 15:33