Skip to content

Conversation

@JohnSwan1503
Copy link
Contributor

Introduces a new garbage collection pattern where file_metadata entries are eagerly deleted during compaction while preserving footer_cache entries until the Collector runs. This allows immediate query visibility changes while maintaining footer data for in-flight reads.

Migration changes:

  • Drop footer column from file_metadata (now only in footer_cache)
  • Remove FK constraint from footer_cache to file_metadata
  • Remove FK constraint from gc_manifest to file_metadata
  • Remove CHECK constraint on gc_manifest.expiration

Compaction flow:

  • upsert_gc_manifest now deletes from file_metadata via CTE and inserts into gc_manifest
  • File metadata disappears from queries immediately after compaction
  • Footer cache entries preserved for reads during grace period

Collector changes:

  • No longer deletes from file_metadata (already done during compaction)
  • Now deletes from footer_cache after gc_manifest entries expire
  • Now deletes from gc_manifest after physical files are removed

Consistency check:

  • Remove orphan file deletion logic (now handled by garbage collector)
  • Orphaned files may be pending GC via gc_manifest

Test utilities:

  • Add test-utils feature flag to metadata-db
  • Add count_footer_cache_by_location method for test assertions
  • Update sql_dataset_input_batch_size test to verify new behavior

…ile_metadata

  Migrations:
  - Add footer_cache table to store footer bytes separately from file_metadata
  - Add url column to file_metadata, denormalized from physical_tables

  metadata-db:
  - Update insert to use CTE that writes to both file_metadata and footer_cache
  - Add url parameter to insert/register_file functions
  - Update get_footer_bytes_by_id to read from footer_cache instead of file_metadata
  - Update select queries to use denormalized fm.url instead of joining physical_tables

  dump:
  - Add url field to ParquetFileWriterOutput struct
  - Add url parameter to commit_metadata function
  - Update all call sites in raw_dataset_writer, derived_dataset, and compactor

  common:
  - Update register_file call in physical.rs to pass url parameter

Signed-off-by: John Swanson <jswanson@edgeandnode.com>
  Introduces a new garbage collection pattern where file_metadata entries are
  eagerly deleted during compaction while preserving footer_cache entries until
  the Collector runs. This allows immediate query visibility changes while
  maintaining footer data for in-flight reads.

  **Migration changes:**
  - Drop `footer` column from `file_metadata` (now only in `footer_cache`)
  - Remove FK constraint from `footer_cache` to `file_metadata`
  - Remove FK constraint from `gc_manifest` to `file_metadata`
  - Remove CHECK constraint on `gc_manifest.expiration`

  **Compaction flow:**
  - `upsert_gc_manifest` now deletes from `file_metadata` via CTE and inserts into `gc_manifest`
  - File metadata disappears from queries immediately after compaction
  - Footer cache entries preserved for reads during grace period

  **Collector changes:**
  - No longer deletes from `file_metadata` (already done during compaction)
  - Now deletes from `footer_cache` after gc_manifest entries expire
  - Now deletes from `gc_manifest` after physical files are removed

  **Consistency check:**
  - Remove orphan file deletion logic (now handled by garbage collector)
  - Orphaned files may be pending GC via `gc_manifest`

  **Test utilities:**
  - Add `test-utils` feature flag to `metadata-db`
  - Add `count_footer_cache_by_location` method for test assertions
  - Update `sql_dataset_input_batch_size` test to verify new behavior

Signed-off-by: John Swanson <jswanson@edgeandnode.com>
…oter-cache-table

Incorporates:
  - bring back EmptyExec check (#1404)
  - feat(metadata-db): add footer_cache table and denormalize url into file_metadata (#1403)
  - feat(controller): add retry for failed jobs (#1398)
Signed-off-by: John Swanson <jswanson@edgeandnode.com>
@JohnSwan1503 JohnSwan1503 requested a review from leoyvens December 4, 2025 21:53
  Remove duplicate doc comment and `url` parameter in `files::insert`
  that were accidentally introduced during merge from main.

Signed-off-by: John Swanson <jswanson@edgeandnode.com>
@JohnSwan1503
Copy link
Contributor Author

This is a follow up of #1403

Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This PR implements a significant architectural change to the garbage collection pattern, introducing eager deletion of file_metadata entries during compaction while preserving footer_cache entries for in-flight reads. The implementation is well-structured and the test coverage is comprehensive.

✅ Strengths

  1. Clear separation of concerns: The new pattern cleanly separates metadata visibility (immediate deletion from file_metadata) from data availability (footer_cache preservation)
  2. Atomic operations: Good use of CTEs to ensure atomic transitions between tables
  3. Comprehensive test updates: The test clearly validates the new multi-phase lifecycle
  4. Removal of side effects: The consistency check function no longer performs deletions, improving code clarity

📝 Suggestions for Improvement

  1. Migration safety: The schema migration drops critical constraints - ensure deployment is carefully coordinated
  2. Return values: Consider having deletion methods return affected row counts for better observability
  3. Performance optimization: Use UNION ALL instead of UNION in the test utility query
  4. Error handling: Consider adding validation for partial deletion scenarios

⚠️ Points to Consider

  • The removal of FK constraints means the application must now maintain referential integrity - ensure all code paths handle this correctly
  • Silent ignoring of non-existent file IDs in upsert_gc_manifest - verify this is intended behavior
  • The collector now has more deletion responsibilities - ensure proper error handling and retry logic

Overall, this is a well-thought-out change that improves the garbage collection architecture. The code is clean and the test coverage demonstrates the new behavior effectively.

.delete_file_ids(found_file_ids_to_paths.keys())
// Delete from footer_cache (file_metadata was already deleted during compaction)
metadata_db
.delete_footer_cache(found_file_ids_to_paths.keys())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding error handling or logging if the number of deleted entries doesn't match expectations. For example, if delete_footer_cache or delete_gc_manifest operations partially succeed, it might be worth logging the discrepancy.


-- Remove the CHECK constraint on gc_manifest.expiration
-- This allows inserting entries with any expiration time
ALTER TABLE gc_manifest DROP CONSTRAINT gc_manifest_expiration_check;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important: This migration makes breaking schema changes by dropping constraints and columns. Ensure you have:

  1. A rollback plan if needed
  2. Verified that no running processes depend on these constraints
  3. Considered the impact on any concurrent operations during deployment

The removal of FK constraints enables the new eager deletion pattern, but make sure to coordinate the deployment carefully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1403 Prepares the metadata db for the schema changes made here in this PR's migrations.

Comment on lines +377 to +378
let sql = indoc::indoc! {r#"
WITH deleted_files AS (
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CTE pattern here is elegant - atomically deleting from file_metadata and inserting into gc_manifest in a single query. This ensures consistency without needing an explicit transaction.

However, consider what happens if some file IDs don't exist in file_metadata. The deleted_files CTE will only return rows that were actually deleted, so non-existent IDs will be silently ignored. Is this the intended behavior?

Comment on lines +423 to +430
/// Deletes footer cache entries for the given file IDs.
///
/// Called by the Collector when gc_manifest entries have expired
/// and physical files are ready to be deleted.
pub async fn delete_footer_cache(
&self,
file_ids: impl Iterator<Item = &FileId>,
) -> Result<(), Error> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These new deletion methods (delete_footer_cache and delete_gc_manifest) don't return information about how many rows were actually deleted. Consider returning the count of affected rows, which could be useful for validation and debugging:

Suggested change
/// Deletes footer cache entries for the given file IDs.
///
/// Called by the Collector when gc_manifest entries have expired
/// and physical files are ready to be deleted.
pub async fn delete_footer_cache(
&self,
file_ids: impl Iterator<Item = &FileId>,
) -> Result<(), Error> {
pub async fn delete_footer_cache(
&self,
file_ids: impl Iterator<Item = &FileId>,
) -> Result<u64, Error> {
let sql = "DELETE FROM footer_cache WHERE file_id = ANY($1)";
let result = sqlx::query(sql)
.bind(file_ids.map(|id| **id).collect::<Vec<_>>())
.execute(&*self.pool)
.await
.map_err(Error::Database)?;
Ok(result.rows_affected())
}

Comment on lines +58 to +71
// 6. After compaction:
// - file_metadata: only 1 entry (the compacted file) since old files are eagerly deleted
// - physical files: 5 (4 original + 1 compacted, waiting for garbage collection)
// - footer_cache: 5 (preserved for all files until garbage collection)
let physical_file_count_after = test.file_count("sql_stream_ds", "even_blocks").await;
let file_metadata_count_after = test
.file_metadata_count("sql_stream_ds", "even_blocks")
.await;
let footer_cache_count_after = test
.footer_cache_count("sql_stream_ds", "even_blocks")
.await;

assert_eq!(file_count_after, 5);
assert_eq!(
file_metadata_count_after, 1,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent test updates! The test now properly validates the new behavior where:

  1. file_metadata entries are eagerly deleted during compaction
  2. footer_cache entries are preserved until garbage collection
  3. Physical files remain until the collector runs

The assertions clearly document the expected state at each phase of the lifecycle.

JohnSwan1503 and others added 2 commits December 5, 2025 08:15
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Signed-off-by: JohnSwan1503 <82048481+JohnSwan1503@users.noreply.github.com>
This reverts commit aa913c4. Bad bot!

Signed-off-by: John Swanson <jswanson@edgeandnode.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants