Add OpenMP parallelization to IVFFlatIndex #5
5000user5000
left a comment
Modified Files
- src/IVFFlatIndex.cpp
  - Added `#include <omp.h>` header
  - Parallelized centroid distance calculation with `#pragma omp parallel for schedule(static)`
  - Parallelized list probing using thread-local heaps with `schedule(dynamic)` and `#pragma omp critical` for merging
  - Parallelized batch search with `schedule(dynamic)` across multiple queries
  - Added clear comments explaining parallelization strategy
- Makefile
  - Added `-fopenmp` flag to `CXXFLAGS` for compilation
  - Added `-fopenmp` to `LDFLAGS` for linking
Parallelization Strategy
- Centroid Distance Calculation
  - Uses `schedule(static)` for balanced workload distribution
  - Parallel computation of L2 distance to all centroids
- List Probing
  - Uses `schedule(dynamic)` to handle variable cluster sizes
  - Each thread maintains a local heap to avoid contention
  - Results merged into global heap via `#pragma omp critical`
- Batch Search
  - Uses `schedule(dynamic)` for load balancing across queries
  - Each query processed independently in parallel
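The batch search item can be illustrated with a minimal sketch. It is not the PR's code: `batch_search`, `Query`, `Result`, and the `search_one` callback are hypothetical names chosen for illustration, and the sketch assumes that the single-query search is safe to call from several threads at once. It avoids `omp.h` so that it also compiles and runs serially when `-fopenmp` is not passed.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Query = std::vector<float>;
using Result = std::vector<int>;  // ids of the neighbours found for one query

// Hypothetical helper: runs search_one on every query in the batch.
inline std::vector<Result> batch_search(
        const std::vector<Query>& queries,
        const std::function<Result(const Query&)>& search_one) {
    std::vector<Result> results(queries.size());

    // Iteration q writes only results[q], so no locking is needed here.
    // Queries can take different amounts of time, hence schedule(dynamic).
    #pragma omp parallel for schedule(dynamic)
    for (std::size_t q = 0; q < queries.size(); ++q) {
        results[q] = search_one(queries[q]);
    }
    return results;
}
```

If `search_one` opens its own parallel region, that inner region usually runs on a single thread unless nested parallelism has been enabled, so it is worth checking which level of the search actually gets the threads.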
| @@ -1,5 +1,5 @@ | |||
| CXX := g++ | |||
| -CXXFLAGS := -std=c++17 -O3 -fPIC | |||
| +CXXFLAGS := -std=c++17 -O3 -fPIC -fopenmp | |||
added -fopenmp for compilation
| # Add OpenMP linking | ||
| LDFLAGS += -fopenmp |
added -fopenmp for linking
| #pragma omp parallel for schedule(static) | ||
| for (size_t c = 0; c < nlist_; ++c) { | ||
| float d = l2_naive(query.data(), centroids_[c].data(), dimension_); | ||
| cdist[c] = {d, c}; | ||
| } |
The query-to-centroid distance computation is split across multiple threads.
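A self-contained sketch of this step follows. It is an illustration under assumptions, not the PR's implementation: `l2_sq` is a stand-in for `l2_naive` (whose body is not shown in the diff), and `nearest_centroids` and the use of `std::partial_sort` to pick the `nprobe` closest centroids are hypothetical choices for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the PR's l2_naive (assumed): squared L2 distance.
inline float l2_sq(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: ids of the nprobe centroids closest to the query,
// nearest first.
inline std::vector<std::size_t> nearest_centroids(
        const std::vector<float>& query,
        const std::vector<std::vector<float>>& centroids,
        std::size_t nprobe) {
    const std::size_t nlist = centroids.size();
    std::vector<std::pair<float, std::size_t>> cdist(nlist);

    // Each iteration writes only cdist[c], so no synchronization is needed.
    // Every iteration does the same amount of work, which suits static scheduling.
    #pragma omp parallel for schedule(static)
    for (std::size_t c = 0; c < nlist; ++c) {
        cdist[c] = {l2_sq(query.data(), centroids[c].data(), query.size()), c};
    }

    // Serial selection of the nprobe smallest distances.
    nprobe = std::min(nprobe, nlist);
    std::partial_sort(cdist.begin(),
                      cdist.begin() + static_cast<std::ptrdiff_t>(nprobe),
                      cdist.end());

    std::vector<std::size_t> ids(nprobe);
    for (std::size_t i = 0; i < nprobe; ++i) ids[i] = cdist[i].second;
    return ids;
}
```

Because the loop body has no shared writes, the pragma can be removed or ignored (for example when building without `-fopenmp`) and the function returns the same result.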
| - if (heap.size() < k) { | ||
| - heap.emplace_back(dist, id); | ||
| - if (heap.size() == k) | ||
| - std::make_heap(heap.begin(), heap.end()); | ||
| - } else if (dist < heap.front().first) { | ||
| - std::pop_heap(heap.begin(), heap.end()); | ||
| - heap.back() = {dist, id}; | ||
| - std::push_heap(heap.begin(), heap.end()); | ||
| + if (local.size() < k) { | ||
| + local.emplace_back(dist, id); | ||
| + if (local.size() == k) { | ||
| + std::make_heap(local.begin(), local.end()); | ||
| + } | ||
| + } else if (dist < local.front().first) { | ||
| + std::pop_heap(local.begin(), local.end()); | ||
| + local.back() = {dist, id}; | ||
| + std::push_heap(local.begin(), local.end()); | ||
| } | ||
| } |
Search within this cluster's inverted list: the original shared heap is replaced so that each OpenMP thread has its own heap (`local`).
| // Merge local results into global heap (thread-safe) | ||
| #pragma omp critical | ||
| { | ||
| for (auto& p : local) { | ||
| if (heap.size() < k) { | ||
| heap.push_back(p); | ||
| if (heap.size() == k) { | ||
| std::make_heap(heap.begin(), heap.end()); | ||
| } | ||
| } else if (p.first < heap.front().first) { | ||
| std::pop_heap(heap.begin(), heap.end()); | ||
| heap.back() = p; | ||
| std::push_heap(heap.begin(), heap.end()); | ||
| } |
Merge the thread-local results into the global heap.
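The thread-local heap plus critical-section merge can be sketched end to end as below. This is a minimal example under assumptions rather than the PR's code: `Hit`, `Entry`, `push_topk`, and `probe_lists` are hypothetical names, the distance is computed inline as a squared L2 distance, and the heap helper uses `std::push_heap` on every insertion instead of the diff's `std::make_heap` once `k` items are reached. The overall structure is assumed to be one `parallel` region with a per-thread `local` vector, which matches the description but is not visible in the diff.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Hit = std::pair<float, int>;  // (distance, vector id)

struct Entry {                      // one stored vector in an inverted list
    int id;
    std::vector<float> vec;
};

// Keeps the k smallest hits; `heap` is a max-heap, so front() is the worst kept hit.
inline void push_topk(std::vector<Hit>& heap, const Hit& h, std::size_t k) {
    if (k == 0) return;
    if (heap.size() < k) {
        heap.push_back(h);
        std::push_heap(heap.begin(), heap.end());
    } else if (h.first < heap.front().first) {
        std::pop_heap(heap.begin(), heap.end());
        heap.back() = h;
        std::push_heap(heap.begin(), heap.end());
    }
}

// Scans the inverted lists named in `probe` and returns the k nearest hits,
// nearest first.
inline std::vector<Hit> probe_lists(const std::vector<float>& query,
                                    const std::vector<std::vector<Entry>>& lists,
                                    const std::vector<std::size_t>& probe,
                                    std::size_t k) {
    std::vector<Hit> heap;  // shared result, touched only inside the critical section

    #pragma omp parallel
    {
        std::vector<Hit> local;  // declared inside the region, so private to each thread

        // Lists differ in length, so dynamic scheduling hands them out as threads free up.
        #pragma omp for schedule(dynamic) nowait
        for (std::size_t p = 0; p < probe.size(); ++p) {
            for (const Entry& e : lists[probe[p]]) {
                float d = 0.0f;
                for (std::size_t i = 0; i < query.size(); ++i) {
                    const float diff = query[i] - e.vec[i];
                    d += diff * diff;
                }
                push_topk(local, {d, e.id}, k);
            }
        }

        // Each thread merges at most k hits, and enters the critical section once.
        #pragma omp critical
        {
            for (const Hit& h : local) push_topk(heap, h, k);
        }
    }

    std::sort(heap.begin(), heap.end());
    return heap;
}
```

Merging once per thread rather than once per candidate keeps time spent in the critical section small: each thread contributes at most `k` hits, however many vectors it scanned.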
5000user5000
left a comment
LGTM
Summary
This PR adds OpenMP-based multi-threaded parallelization to `IVFFlatIndex`, significantly improving search performance for approximate nearest neighbor queries.
Modified Files
- src/IVFFlatIndex.cpp
  - Added `#include <omp.h>`
  - Applied OpenMP parallelization:
    - Centroid distance calculation: `schedule(static)`
    - List probing: `schedule(dynamic)` with thread-local heaps and `#pragma omp critical` for merging
    - Batch search: `schedule(dynamic)` for parallel query processing
  - Added comments explaining parallelization
- Makefile
  - Added `-fopenmp` to both `CXXFLAGS` and `LDFLAGS`
Parallelization Overview
Related Issue