Skip to content

Conversation

@faithokamoto
Copy link
Contributor

Since vgteam/vg#4745 is going to lead to old distance indexes being incompatible with new vg giraffe, this PR introduces a way to version check. The version number field is stored in the root record as the first item. To avoid possible confusion with the previous first item (the connected component count), the number is stored as (2^26 minus version). If anyone has a graph with that many connected components, something very weird has occurred. The distance index will error on load if you try to read something with the wrong version number.

@adamnovak please review.

Copy link
Member

@adamnovak adamnovak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good, but I could be happier if the justification for the cool subtraction trick was in here, because I can't tell why you're doing that and not bit masks.

Comment on lines 823 to 827
// While the version number is 3, we will store (max - version)
// to avoid getting confused with old indexes without version numbers
// that start with component count
const static size_t CURRENT_VERSION_NUMBER = 3;
const static size_t VERSION_NUMBER_SENTINEL = (1 << 26) - CURRENT_VERSION_NUMBER;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting some high bits will make for a big page in the paged vector, but that shouldn't be a big problem since it's just the first page.

Why 26? And why subtract from it instead of setting a high bit and masking down to the bits where the version actually is? I think this way we can keep our values at the high end in 1 fewer bits than with the masking approach, which then means 1 fewer bits in each entry in the page, which is nice. But if that's the reasoning it should probably be explained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was 26 because when I tried to set it to the maximum size_t that was too big; it complained that it couldn't store it in 26 bits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but the tests just complained again, so I'm gonna change it to 10 bits I guess. I could do bit masking; didn't think of that.

@faithokamoto faithokamoto merged commit 1ae5dc8 into master Nov 14, 2025
1 check passed
@faithokamoto faithokamoto deleted the version_num branch November 14, 2025 03:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants