feat: incremental matcher rework #44

saghen · 2025-08-07T19:34:01Z

The Matcher API should be the main entrypoint for consumers. We want to do as much work up front as possible.

Copy strings to predefined 16-byte-aligned widths ahead of time
- Enables loop unrolling and faster SIMD loads
Perform the interleave on every iteration
- Lets us change the number of items we operate on, but requires converting the score matrices to an array. TBD how slow this will be
- If we didn't do this, how would we handle filtering? Perhaps we could use scatter/gather to pack the SIMD vectors with only the unfiltered items. We would then get the same performance as the one_shot matcher, or better because of the string copy
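
To make the two ideas above concrete, here is a minimal scalar sketch of the up-front copy and the interleave. It is not the code in this PR: the bucket widths, `Chunk`, `bucket_width`, `pad_to_bucket`, and `interleave` are all hypothetical names chosen for illustration, and a real implementation would use SIMD shuffles for the interleave rather than a byte-by-byte loop.

```rust
/// Hypothetical bucket widths; each is a multiple of 16 bytes.
const WIDTHS: [usize; 4] = [16, 32, 64, 128];

/// Smallest bucket width that fits `len` bytes, or `None` if the string is too long.
fn bucket_width(len: usize) -> Option<usize> {
    WIDTHS.iter().copied().find(|&w| len <= w)
}

/// A 16-byte-aligned block, so a padded string is a whole number of aligned loads.
#[repr(C, align(16))]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Chunk([u8; 16]);

/// Copy `s` into zero-padded chunks sized to its bucket (done once, ahead of time).
fn pad_to_bucket(s: &str) -> Option<Vec<Chunk>> {
    let width = bucket_width(s.len())?;
    let mut chunks = vec![Chunk([0u8; 16]); width / 16];
    for (i, &b) in s.as_bytes().iter().enumerate() {
        chunks[i / 16].0[i % 16] = b;
    }
    Some(chunks)
}

/// Interleave `LANES` strings from the same bucket so that byte `i` of every
/// string sits in one contiguous group, one lane per string.
/// Assumes `LANES > 0` and that all inputs have the same padded width.
fn interleave<const LANES: usize>(strs: &[Vec<Chunk>; LANES]) -> Vec<[u8; LANES]> {
    let width = strs[0].len() * 16;
    (0..width)
        .map(|i| std::array::from_fn(|lane| strs[lane][i / 16].0[i % 16]))
        .collect()
}
```

Because every string in a bucket has the same padded width, the inner loop has a fixed trip count per bucket, which is what makes unrolling straightforward. Interleaving on every iteration means `LANES` can shrink as items are filtered out, at the cost of redoing the shuffle each time.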

Currently, copying, prefiltering, Smith-Waterman, and storing scores are implemented. We don't yet calculate typos, support incremental matching, or push matches to the output. Preliminary benchmarks show 67ms vs 39ms on the chromium bench. By the end of this PR, I'm hoping to be within ~20% of the one shot bench.
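
For readers unfamiliar with the pipeline, the prefilter, Smith-Waterman, and score-storing stages can be sketched in scalar form as below. This is an illustration only, not the SIMD implementation in this PR: the scoring constants, the in-order subsequence prefilter, and the names `prefilter`, `smith_waterman`, and `score_all` are assumptions made for the example.

```rust
// Hypothetical scoring constants, chosen only to keep the example readable.
const MATCH: i32 = 2;
const MISMATCH: i32 = -1;
const GAP: i32 = -1;

/// Cheap rejection test (assumed form): every needle byte must appear
/// in the haystack, in order.
fn prefilter(needle: &[u8], haystack: &[u8]) -> bool {
    let mut it = haystack.iter();
    needle.iter().all(|n| it.any(|h| h == n))
}

/// Scalar Smith-Waterman local alignment with a linear gap penalty.
/// Keeps two rows of the score matrix and returns the best cell.
fn smith_waterman(needle: &[u8], haystack: &[u8]) -> i32 {
    let mut prev = vec![0i32; haystack.len() + 1];
    let mut curr = vec![0i32; haystack.len() + 1];
    let mut best = 0;
    for &n in needle {
        curr[0] = 0;
        for (j, &h) in haystack.iter().enumerate() {
            let diag = prev[j] + if n == h { MATCH } else { MISMATCH };
            let up = prev[j + 1] + GAP;
            let left = curr[j] + GAP;
            // Local alignment: a cell never drops below zero.
            let cell = diag.max(up).max(left).max(0);
            curr[j + 1] = cell;
            best = best.max(cell);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    best
}

/// Score every haystack; `None` marks items rejected by the prefilter.
fn score_all(needle: &str, haystacks: &[&str]) -> Vec<Option<i32>> {
    haystacks
        .iter()
        .map(|h| {
            prefilter(needle.as_bytes(), h.as_bytes())
                .then(|| smith_waterman(needle.as_bytes(), h.as_bytes()))
        })
        .collect()
}
```

With these constants, `score_all("abc", &["abc", "xyz"])` yields `[Some(6), None]`: the exact match scores three matches at 2 each, and `"xyz"` never reaches the alignment step.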

Shout out to ii14 for all the ideas :)

feat: just some ideas for incremental matcher

407037f

saghen marked this pull request as draft August 7, 2025 19:35

feat: WIP ideas

ec71f1c

saghen mentioned this pull request Oct 30, 2025

Custom scoring support #46

Open

2 participants
