MB-69881: Re-architect vector search #2270

CascadingRadium · 2025-12-28T05:18:07Z

Use a bitset to track eligible documents instead of a slice of uint64s. For a segment in which all N documents are eligible, memory drops from 8N bytes to N/8 bytes (up to a 64× reduction), and cache locality improves.
Pass an iterator over eligible documents that iterates the bitset directly, allowing direct translation into a bitset of eligible vector IDs in the storage layer and eliminating the need for a separate slice intermediary.
Fix garbage creation in the UnadornedPostingsIterator, which previously allocated a temporary struct on every Next() call to wrap a doc number and satisfy the Postings interface. The iterator now returns a single reusable struct (a one-time allocation), consistent with how the PostingsIterator in the storage layer works.
Avoid unnecessary BytesRead statistics computation when executing searches in no-scoring mode, removing redundant work as a micro-optimization.
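
To make the first two points concrete, here is a minimal sketch of a bitset with a direct iterator. It is not the code from this PR: `eligibleBitset`, `eligibleIterator`, and `collect` are hypothetical names, and the sketch uses only the standard library rather than whatever bitset package the real change relies on.

```go
package main

import (
	"fmt"
	"math/bits"
)

// eligibleBitset is a hypothetical stand-in for the per-segment bitset:
// bit i is set when local doc number i is eligible.
type eligibleBitset struct {
	words []uint64
}

// newEligibleBitset sizes the bitset at one bit per document in the segment.
func newEligibleBitset(numDocs uint64) *eligibleBitset {
	return &eligibleBitset{words: make([]uint64, (numDocs+63)/64)}
}

func (b *eligibleBitset) set(doc uint64) {
	b.words[doc/64] |= 1 << (doc % 64)
}

// eligibleIterator walks the set bits in ascending order without
// materializing a []uint64 of doc numbers.
type eligibleIterator struct {
	words []uint64
	idx   int    // index of the word currently being scanned
	cur   uint64 // bits of words[idx] that have not been visited yet
}

func (b *eligibleBitset) iterator() *eligibleIterator {
	it := &eligibleIterator{words: b.words}
	if len(b.words) > 0 {
		it.cur = b.words[0]
	}
	return it
}

// next returns the next eligible doc number, or false when exhausted.
func (it *eligibleIterator) next() (uint64, bool) {
	for it.idx < len(it.words) {
		if it.cur != 0 {
			tz := bits.TrailingZeros64(it.cur)
			it.cur &= it.cur - 1 // clear the lowest set bit
			return uint64(it.idx)*64 + uint64(tz), true
		}
		it.idx++
		if it.idx < len(it.words) {
			it.cur = it.words[it.idx]
		}
	}
	return 0, false
}

// collect drains an iterator into a slice; it exists only for the demo.
func collect(it *eligibleIterator) []uint64 {
	var out []uint64
	for doc, ok := it.next(); ok; doc, ok = it.next() {
		out = append(out, doc)
	}
	return out
}

func main() {
	const numDocs = 1 << 20
	bs := newEligibleBitset(numDocs)
	for _, d := range []uint64{3, 64, 65, 1000} {
		bs.set(d)
	}
	fmt.Println(collect(bs.iterator()))
	fmt.Println("slice bytes if every doc is eligible:", 8*numDocs)
	fmt.Println("bitset bytes:", 8*len(bs.words))
}
```

In the sketch the bitset size depends only on the number of documents in the segment, so the 64× figure is the best case, reached when every document is eligible. A consumer in the storage layer could call `next` in a loop and set the corresponding vector-ID bit as it goes, which is the role the description assigns to the iterator.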

Copilot

Pull request overview

This PR re-architects vector search to improve memory efficiency and reduce garbage collection pressure. The changes replace slice-based eligible document tracking with bitsets, achieving up to 64× memory reduction per segment, and optimize the iterator pattern to eliminate per-call allocations in the unadorned postings iterator.

Key changes:

Replaced slice-based eligible document tracking ([]uint64) with bitsets, reducing memory from 8N bytes to N/8 bytes per segment
Introduced iterator-based API for eligible documents that directly translates to bitset iteration at the storage layer
Fixed garbage creation in UnadornedPostingsIterator by reusing a single struct instance instead of allocating per Next() call
Optimized bytes read tracking to skip computation in no-scoring mode
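
The struct-reuse change in the third point can be illustrated with a small sketch. The types below are simplified and hypothetical (`Posting`, `unadornedPosting`, and `sliceIterator` are not the definitions in index/scorch/unadorned.go); the only idea taken from the PR is that the iterator owns one posting struct and hands out a pointer to it on every call.

```go
package main

import "fmt"

// Posting is a simplified, hypothetical version of the postings interface.
type Posting interface {
	Number() uint64
}

// unadornedPosting carries only a doc number. Methods use a pointer
// receiver so that a pointer to a reused struct satisfies Posting.
type unadornedPosting struct {
	docNum uint64
}

func (p *unadornedPosting) Number() uint64 { return p.docNum }

// sliceIterator yields postings for a fixed list of doc numbers.
type sliceIterator struct {
	docNums []uint64
	pos     int
	current unadornedPosting // allocated once with the iterator, overwritten by each Next
}

// Next returns nil when the iterator is exhausted. The returned Posting
// is only valid until the following call to Next.
func (it *sliceIterator) Next() Posting {
	if it.pos >= len(it.docNums) {
		return nil
	}
	it.current.docNum = it.docNums[it.pos]
	it.pos++
	return &it.current // no new allocation: same struct every time
}

func main() {
	it := &sliceIterator{docNums: []uint64{7, 42, 99}}
	for p := it.Next(); p != nil; p = it.Next() {
		fmt.Println(p.Number())
	}
}
```

The trade-off in this pattern is aliasing: a caller that keeps a posting across calls to `Next` will see it change, so any value that must outlive the iteration step has to be copied out first.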

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
index/scorch/snapshot_vector_index.go	Introduces bitset-based eligible document storage and iterator API, replacing the previous slice-based approach
index/scorch/unadorned.go	Changes `UnadornedPosting` from `uint64` to struct with pointer receivers and adds reusable struct fields to iterators to eliminate per-call allocations
index/scorch/snapshot_index_tfr.go	Adds conditional bytes read tracking via `updateBytesRead` flag to skip computation in no-scoring mode
index/scorch/snapshot_index.go	Initializes `updateBytesRead` flag based on scoring requirements
index/scorch/optimize_knn.go	Removes `requiresFiltering` flag and updates to use new `SegmentEligibleDocuments` API
index/scorch/optimize.go	Sets `updateBytesRead` to false for unadorned term field readers
index/scorch/snapshot_index_vr.go	Updates `InterpretVectorIndex` call to remove filtering parameter
index_test.go	Updates expected bytes read values to reflect the optimization that skips unnecessary computation
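
As a rough illustration of the `updateBytesRead` rows above, the sketch below gates a bytes-read counter behind a flag that is fixed when the reader is created. Only the flag name comes from the summary; `termFieldReader`, `newTermFieldReader`, the `noScoring` parameter, and the `[][]byte` postings are assumptions made for the example and do not mirror the real reader.

```go
package main

import "fmt"

// termFieldReader is a hypothetical, heavily simplified reader.
type termFieldReader struct {
	postings        [][]byte // stand-in for encoded postings, one entry per hit
	pos             int
	updateBytesRead bool   // decided once at construction time
	bytesRead       uint64 // statistic that only scoring searches need here
}

// newTermFieldReader disables bytes-read accounting for no-scoring searches
// (an assumption about how the flag is derived).
func newTermFieldReader(postings [][]byte, noScoring bool) *termFieldReader {
	return &termFieldReader{postings: postings, updateBytesRead: !noScoring}
}

// Next returns the next encoded posting, or false when exhausted.
func (r *termFieldReader) Next() ([]byte, bool) {
	if r.pos >= len(r.postings) {
		return nil, false
	}
	p := r.postings[r.pos]
	r.pos++
	if r.updateBytesRead {
		// Skipped entirely in no-scoring mode.
		r.bytesRead += uint64(len(p))
	}
	return p, true
}

func main() {
	data := [][]byte{{1, 2, 3}, {4, 5}}
	for _, noScoring := range []bool{false, true} {
		r := newTermFieldReader(data, noScoring)
		for _, ok := r.Next(); ok; _, ok = r.Next() {
		}
		fmt.Printf("noScoring=%v bytesRead=%d\n", noScoring, r.bytesRead)
	}
}
```

A consequence visible even in this sketch is that the reported statistic changes for no-scoring searches, which matches the note that expected bytes read values in index_test.go had to be updated.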



CascadingRadium added 6 commits December 25, 2025 01:26

minor opt

65b931b

remove redundant variable

f6bf3af

overhaul the eligible iterator for performance

ec24d99

fix bytes stat

27f2d9f

fix unadorned posting garbage

da11922

micro optimization

a76bdab

CascadingRadium added this to Vector Search v2 Dec 28, 2025

github-project-automation bot moved this to Todo in Vector Search v2 Dec 28, 2025

CascadingRadium moved this from Todo to Done in Vector Search v2 Dec 28, 2025

CascadingRadium requested review from Likith101, Thejas-bhat, abhinavdangeti, capemox, Copilot and maneuvertomars December 28, 2025 06:35

Copilot started reviewing on behalf of CascadingRadium December 28, 2025 06:35

Copilot AI reviewed Dec 28, 2025

index/scorch/snapshot_vector_index.go (outdated, resolved)

index/scorch/snapshot_index_tfr.go (outdated, resolved)

index/scorch/snapshot_vector_index.go (resolved)

index/scorch/snapshot_vector_index.go (outdated, resolved)

CascadingRadium added 2 commits January 7, 2026 15:19

code review

4ffeb2a

code review

39d9ec2
