-
Notifications
You must be signed in to change notification settings - Fork 492
Persist: Add new API for run based spine replacement #33044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new API for run-based spine replacement in the persist client, extending the existing merge resolution functionality to support fine-grained replacement of specific runs within hollow batches rather than just entire batches.
- Adds
RunLocationstruct to identify specific runs within hollow batches usingSpineIdandRunIdpairs - Introduces a new
apply_merge_res_checkedmethod that can replace individual runs within batches - Extends
FueledMergeReswithinputsandnew_active_compactionfields to support the new replacement logic
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/persist-client/src/internal/trace.rs | Core implementation of run-based spine replacement with new data structures and methods |
| src/persist-client/src/internal/state_serde.json | Test data updates reflecting the new run-based structure |
| src/persist-client/src/internal/state_diff.rs | Updates to populate new fields in FueledMergeRes |
| src/persist-client/src/internal/state.rs | Test utilities and method renaming for compatibility |
| src/persist-client/src/internal/machine.rs | Updates to construct FueledMergeRes with new fields |
| src/persist-client/src/internal/compact.rs | Updates to construct FueledMergeRes with new fields |
| src/persist-client/src/cli/admin.rs | Updates to construct FueledMergeRes with new fields |
f035a90 to
23e21c8
Compare
23e21c8 to
f497c4a
Compare
f497c4a to
ae7312a
Compare
ae7312a to
2bb9775
Compare
2bb9775 to
a15c651
Compare
a15c651 to
85910d2
Compare
85910d2 to
0274b1f
Compare
0274b1f to
ee85994
Compare
ee85994 to
3c38325
Compare
a97ff26 to
a946d3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good changes! If nightlies are happy then so am I.
This PR could use a description also.
| #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)] | ||
| #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)] | ||
| pub enum CompactionInput { | ||
| /// We don't know what our inputs where, this should only be used for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// We don't know what our inputs where, this should only be used for | |
| /// We don't know what our inputs were; this should only be used for |
| Legacy, | ||
| /// This compaction output is a total replacement for all batches in this id range. | ||
| #[allow(dead_code)] | ||
| IdRange(BTreeSet<SpineId>), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be a SpineId instead of a set, since a single id is already a range. (Optional, but would save you some validation below.)
| let old_batch_diff_sum = Self::diffs_sum::<D>(batch.parts.iter(), metrics); | ||
| let old_diffs_sum = Self::diffs_sum_for_runs::<D>(batch, &run_ids, metrics); | ||
|
|
||
| if let (Some(old_diffs_sum), Some(new_diffs_sum)) = (old_diffs_sum, new_diffs_sum) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also assert that the old_diffs_sum is None when the new_diffs_some is None... for new diffs None should always mean 0, and it would be bad if we let that replace data with nonzero sum!
| let Some((i, batch)) = part else { | ||
| return ApplyMergeResult::NotAppliedNoMatch; | ||
| }; | ||
| let replacement_range = i..i + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| let replacement_range = i..i + 1; | |
| let replacement_range = i..(i + 1); |
| } | ||
|
|
||
| let parts = &self.parts[replacement_range.clone()]; | ||
| let id = SpineId(parts.first().unwrap().id.0, parts.last().unwrap().id.1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is oddly baroque... we're only targeting a single batch. Why a single-element range that we then unwrap?
| .prop_filter("non-empty batch", |(batch, _)| batch.run_meta.len() >= 1) | ||
| .prop_flat_map(|(batch, to_replace)| { | ||
| let batch_len = batch.run_meta.len(); | ||
| (0..batch_len).prop_flat_map(move |start| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I'd extract this gen out to a let above so you get proper formatting.
| (1..=max_len).prop_map(move |len| { | ||
| let range = start..(start + len); | ||
| (batch_clone.clone(), to_replace_clone.clone(), range) | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test still bakes in the "must replace a contiguous range of runs from the original batch" idea that the function under test no longer requires. Might as well choose a random subset instead and get better coverage?
|
Looks like you need to opt out the new test from I find the scalability benchmark result sort of implausible, given the limited stuff you've changed... maybe spurious? |
a946d3c to
aee2ad8
Compare
Depends on #33044 <!-- Describe the contents of the PR briefly but completely. If you write detailed commit messages, it is acceptable to copy/paste them here, or write "see commit messages for details." If there is only one commit in the PR, GitHub will have already added its commit message above. --> ### Motivation <!-- Which of the following best describes the motivation behind this PR? * This PR fixes a recognized bug. [Ensure issue is linked somewhere.] * This PR adds a known-desirable feature. [Ensure issue is linked somewhere.] * This PR fixes a previously unreported bug. [Describe the bug in detail, as if you were filing a bug report.] * This PR adds a feature that has not yet been specified. [Write a brief specification for the feature, including justification for its inclusion in Materialize, as if you were writing the original feature specification.] * This PR refactors existing code. [Describe what was wrong with the existing code, if it is not obvious.] --> ### Tips for reviewer <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [ ] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [ ] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [ ] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [ ] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
Motivation
Tips for reviewer
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.