-
Notifications
You must be signed in to change notification settings - Fork 25
Balanced index splits (byte-based multi-split), forward leaf link #1140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| right-children (subvec (vec child-nodes) median-i) | ||
| left-branch (reconstruct-branch branch t left-children) | ||
| right-branch (reconstruct-branch branch t right-children)] | ||
| (update-sibling-leftmost [left-branch right-branch]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change would only cut a branch in half. A reindexing that incorporates a large amount of novelty could theoretically cause a branch to have more than double the target count of children, which would mean that the new branches resulting from this process would still be too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was intentional here, as the scenario that would apply is getting close to 100MB of novelty, likely beyond a reasonable amount. In the scenario this did happen, the second split would just happen at next index and a larger branch is pretty benign.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of only splitting by two instead of the way it was before?
| right-children (subvec (vec child-nodes) median-i) | ||
| left-branch (reconstruct-branch branch t left-children) | ||
| right-branch (reconstruct-branch branch t right-children)] | ||
| (update-sibling-leftmost [left-branch right-branch]))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of only splitting by two instead of the way it was before?
| (let [size (flake/size-flake f)] | ||
| [(conj tuples [f size]) | ||
| (+ sum size)])) | ||
| [[] 0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tend to prefer reduce over loop myself, but I think loop will be faster in this particular case because of all the extra vector allocations for the intermediate steps the reduce version has to do.
| - :target-leaves - number of leaves to create | ||
| - :bytes-per-leaf - target bytes per leaf" | ||
| [flakes] | ||
| (let [flake-vec (vec flakes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this necessary? It looks like flake-vec is only used in the reduce call below. I think flakes is a clojure.data.avl sorted set, which also implements IReduce.
Resolved conflict in novelty.cljc by keeping the balanced splitting algorithm which already handles the edge case fixed in main (ensuring each leaf has at least one flake via seq check before splitting).
What part of the diff from this pull request changes the previous behavior related to these bullet points? |
The way we indexed, the persistent index tree could become unbalanced (varying heights). In larger ledgers with specific insert patterns that can affect performance adversely.
This fix keeps the index tree at constant height. I also added a :next-id to the leaf which will allow range scans in the future (once support is implemented) to skip directly to the next node which can also improve performance but TBD on how much.
Summary
:next-idadded (optional) and resolved at write-time in a single pass.Motivation
Heavy inserts localized to one key range could repeatedly split the same path, increasing depth on the “hot” side. Prior leaf splitting didn’t always produce balanced results when a single leaf was significantly over target. We also lacked a forward link to optimize sequential scans across leaves. This change ensures:
:next-id.Key Changes
:rhson each left sibling to the next sibling’s first flake; final sibling inherits original:rhs.:next-idto its right sibling’s final id without extra writes.:first,:rhs), and ensure only the first sibling is:leftmost?.:next-idadded to leaf node schema (backward compatible).:next-tempidto the right sibling’s temp id. Before persistence,:next-tempidis resolved to the right sibling’s final id using the existing updated-ids mapping. No second write required.Behavior and Invariants
left.rhs = right.first.:leftmost?; others are false.:next-idis forward-only and optional; readers fall back to traversal if absent.Backward Compatibility
:next-idon leaves. Old nodes remain valid; readers ignore missing:next-id.