@codeflash-ai codeflash-ai bot commented Oct 1, 2025

📄 6% (0.06x) speedup for merge_multi_strings in doctr/models/recognition/utils.py

⏱️ Runtime : 7.28 milliseconds → 6.88 milliseconds (best of 241 runs)
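
As a quick sanity check on the headline number, the sketch below recomputes the speedup from the two runtimes. It assumes speedup is defined as original / optimized - 1, which is an assumption about how the figure is derived; the `speedup` helper is hypothetical and not part of the PR.

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup, assuming the definition original / optimized - 1."""
    return original / optimized - 1.0


# Runtimes in milliseconds, taken from the Runtime line above.
headline = speedup(7.28, 6.88)
print(f"{headline:.1%}")  # 5.8%, which rounds to the reported 6%
```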

📝 Explanation and details

The optimized code achieves a speedup of about 6% (7.28 ms → 6.88 ms) by eliminating redundant computations and reducing memory allocations in the merge_strings function that merge_multi_strings relies on:

Key Optimizations:

  1. Single-pass score computation: Instead of creating the scores list in one pass and then creating zero_matches in a separate enumeration, the optimized version combines both operations in a single loop. This eliminates the need to iterate through scores twice.

  2. Precomputed expected_overlap: Moved the calculation of expected_overlap outside of conditional branches to avoid redundant computation on every path.

  3. In-place minimum finding: Replaced combined_scores.index(min(combined_scores)) with a manual loop that finds the minimum without creating an intermediate list. This eliminates the memory allocation for combined_scores and avoids a second pass through the data.

  4. Reduced list comprehension overhead: The combined loop approach avoids the overhead of multiple list comprehensions and their associated memory allocations.
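
The first and third points can be sketched as below. This is a minimal illustration reconstructed from the description only, not the actual diff: the function names, the `a_crop`/`b_crop` parameters, and the pure-Python `hamming` helper (a stand-in for `rapidfuzz.distance.Hamming.distance`) are all assumptions.

```python
def hamming(x: str, y: str) -> int:
    """Pure-Python stand-in for a Hamming distance on equal-length strings."""
    return sum(cx != cy for cx, cy in zip(x, y))


def scores_two_pass(a_crop: str, b_crop: str, max_overlap: int):
    """Original shape: build scores, then enumerate them again for zero matches."""
    scores = [hamming(a_crop[-i:], b_crop[:i]) / i for i in range(1, max_overlap + 1)]
    zero_matches = [i for i, s in enumerate(scores) if s == 0]
    return scores, zero_matches


def scores_single_pass(a_crop: str, b_crop: str, max_overlap: int):
    """Optimized shape: collect scores and zero matches in the same loop."""
    scores, zero_matches = [], []
    for i in range(1, max_overlap + 1):
        s = hamming(a_crop[-i:], b_crop[:i]) / i
        scores.append(s)
        if s == 0:
            zero_matches.append(i - 1)  # same 0-based index enumerate() would give
    return scores, zero_matches


def argmin_with_list(scores, expected_overlap: int, max_overlap: int) -> int:
    """Original shape: materialize combined_scores, then index(min(...))."""
    combined = [s + abs(i - expected_overlap) / max_overlap for i, s in enumerate(scores)]
    return combined.index(min(combined))


def argmin_in_place(scores, expected_overlap: int, max_overlap: int) -> int:
    """Optimized shape: track the running minimum without an intermediate list."""
    best_i, best = 0, float("inf")
    for i, s in enumerate(scores):
        c = s + abs(i - expected_overlap) / max_overlap
        if c < best:  # strict < keeps the first minimum, like list.index(min(...))
            best_i, best = i, c
    return best_i


scores, zeros = scores_single_pass("abcd", "cdef", 4)
print(scores, zeros, argmin_in_place(scores, 1, 4))
```

The strict comparison in `argmin_in_place` matters: it preserves the tie-breaking of `list.index(min(...))`, which returns the first minimal index.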

Performance Benefits:

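  • Best for medium-scale merging: Most test cases with several short strings and moderate overlaps run roughly 4-16% faster; trivial inputs (empty list, single string) and the large-string cases stay within a few percent either way, some of them slightly slower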
  • Memory efficient: Eliminates temporary list allocations, reducing garbage collection pressure
  • Maintains correctness: All optimization paths preserve the original logic while reducing computational overhead

The optimizations are particularly effective for the common use cases shown in tests, where strings have partial overlaps and the function needs to compute Hamming distances across multiple potential alignment positions.
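
To make "Hamming distances across multiple potential alignment positions" concrete, here is a simplified sketch. It is deliberately not the library's implementation: it ignores the expected_overlap term mentioned above, and `hamming`, `alignment_scores`, and `merge_at_best_overlap` are hypothetical names used only for illustration.

```python
def hamming(x: str, y: str) -> int:
    """Pure-Python stand-in for a Hamming distance on equal-length strings."""
    return sum(cx != cy for cx, cy in zip(x, y))


def alignment_scores(a: str, b: str) -> dict[int, float]:
    """Map each overlap length k to the mismatch rate between
    the last k characters of a and the first k characters of b."""
    max_overlap = min(len(a), len(b))
    return {k: hamming(a[-k:], b[:k]) / k for k in range(1, max_overlap + 1)}


def merge_at_best_overlap(a: str, b: str) -> str:
    """Merge at the first perfect alignment; otherwise just concatenate."""
    scores = alignment_scores(a, b)
    if not scores:
        return a + b
    best_k = min(scores, key=scores.get)  # first key with the lowest score
    if scores[best_k] > 0:
        return a + b  # no perfect alignment found
    return a + b[best_k:]


print(alignment_scores("abcde", "cdefg"))
print(merge_at_best_overlap("abcde", "cdefg"))  # abcdefg
```

Each candidate overlap length costs one distance computation, so the number of alignment positions grows with the shorter string's length.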

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 6 Passed |
| 🌀 Generated Regression Tests | 75 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| common/test_models_recognition_utils.py::test_merge_multi_strings | 43.4μs | 41.1μs | 5.52%✅ |

🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from doctr.models.recognition.utils import merge_multi_strings
from rapidfuzz.distance import Hamming

# unit tests

# ----------------------------
# Basic Test Cases
# ----------------------------

def test_single_string():
    # Single string should return itself
    codeflash_output = merge_multi_strings(["hello"], 0.5, 0.5) # 830ns -> 868ns (4.38% slower)

def test_two_strings_perfect_overlap():
    # Overlap "abc" and "bcd" with overlap_ratio=0.5
    codeflash_output = merge_multi_strings(["abc", "bcd"], 0.5, 0.5) # 6.73μs -> 5.95μs (13.2% faster)

def test_three_strings_simple_overlap():
    # Overlap ['abc', 'cde', 'efg'] with overlap_ratio=0.5
    codeflash_output = merge_multi_strings(['abc', 'cde', 'efg'], 0.5, 0.5) # 8.58μs -> 7.91μs (8.45% faster)

def test_no_overlap():
    # No overlap between strings
    codeflash_output = merge_multi_strings(['abc', 'def', 'ghi'], 0.0, 0.0) # 8.13μs -> 7.41μs (9.74% faster)

def test_partial_overlap():
    # Partial overlap, not perfect
    codeflash_output = merge_multi_strings(['abc', 'bcdef', 'defgh'], 0.4, 0.4) # 9.07μs -> 8.42μs (7.78% faster)

def test_last_overlap_ratio():
    # Last overlap ratio is different
    codeflash_output = merge_multi_strings(['abc', 'bcdef', 'defgh'], 0.5, 0.1) # 8.76μs -> 8.18μs (7.04% faster)

def test_overlap_with_repeated_chars():
    # Overlap with repeated characters
    codeflash_output = merge_multi_strings(['aaa', 'aaa', 'aaa'], 0.8, 0.8) # 10.2μs -> 9.81μs (4.10% faster)

def test_overlap_with_numbers():
    # Overlap with digits
    codeflash_output = merge_multi_strings(['123', '234', '345'], 0.5, 0.5) # 7.50μs -> 7.06μs (6.28% faster)

def test_overlap_with_special_characters():
    # Overlap with special characters
    codeflash_output = merge_multi_strings(['!@#', '@#$%', '$%^'], 0.5, 0.5) # 7.66μs -> 7.02μs (9.13% faster)

# ----------------------------
# Edge Test Cases
# ----------------------------

def test_empty_list():
    # Empty list should return empty string
    codeflash_output = merge_multi_strings([], 0.5, 0.5) # 330ns -> 344ns (4.07% slower)

def test_list_with_empty_strings():
    # List with empty strings
    codeflash_output = merge_multi_strings(['', '', ''], 0.5, 0.5) # 2.60μs -> 2.52μs (3.54% faster)

def test_mixed_empty_and_nonempty_strings():
    # List with empty and non-empty strings
    codeflash_output = merge_multi_strings(['', 'abc', '', 'def'], 0.5, 0.5) # 7.09μs -> 6.43μs (10.3% faster)

def test_all_single_char_strings():
    # List of single character strings
    codeflash_output = merge_multi_strings(['a', 'b', 'c', 'd'], 0.5, 0.5) # 3.17μs -> 3.15μs (0.476% faster)

def test_overlap_ratio_zero():
    # Overlap ratio is zero, so no merging
    codeflash_output = merge_multi_strings(['abc', 'cde', 'efg'], 0.0, 0.0) # 8.61μs -> 7.57μs (13.8% faster)

def test_overlap_ratio_one():
    # Overlap ratio is one, so maximal overlap
    codeflash_output = merge_multi_strings(['abc', 'bc', 'c'], 1.0, 1.0) # 5.99μs -> 5.34μs (12.1% faster)

def test_overlap_ratio_greater_than_one():
    # Overlap ratio > 1 should behave like maximal overlap
    codeflash_output = merge_multi_strings(['abc', 'bc', 'c'], 2.0, 2.0) # 7.08μs -> 6.39μs (10.8% faster)

def test_overlap_ratio_negative():
    # Negative overlap ratio should behave like no overlap
    codeflash_output = merge_multi_strings(['abc', 'cde', 'efg'], -1.0, -1.0) # 8.06μs -> 7.67μs (5.01% faster)

def test_strings_with_unicode():
    # Unicode characters
    codeflash_output = merge_multi_strings(['héllo', 'élloworld', 'world!'], 0.5, 0.5) # 13.0μs -> 12.5μs (3.86% faster)

def test_strings_with_spaces():
    # Strings with spaces
    codeflash_output = merge_multi_strings(['hello ', ' world', ' world!'], 0.5, 0.5) # 12.4μs -> 11.5μs (7.89% faster)

def test_strings_with_different_lengths():
    # Strings of varying lengths
    codeflash_output = merge_multi_strings(['a', 'ab', 'abc', 'abcd'], 0.5, 0.5) # 9.07μs -> 8.41μs (7.78% faster)

def test_strings_with_non_ascii():
    # Non-ASCII characters
    codeflash_output = merge_multi_strings(['你好', '好世界', '世界!'], 0.5, 0.5) # 8.26μs -> 7.67μs (7.69% faster)

def test_overlap_at_start_and_end():
    # Overlap only at start and end
    codeflash_output = merge_multi_strings(['abc', 'cde', 'efg', 'g'], 0.5, 0.5) # 8.29μs -> 7.73μs (7.12% faster)

def test_overlap_with_empty_last_string():
    # Last string is empty
    codeflash_output = merge_multi_strings(['abc', 'bcdef', ''], 0.5, 0.1) # 6.10μs -> 5.52μs (10.5% faster)

def test_overlap_with_empty_first_string():
    # First string is empty
    codeflash_output = merge_multi_strings(['', 'abc', 'bcdef'], 0.5, 0.5) # 5.98μs -> 5.52μs (8.45% faster)

def test_overlap_with_identical_strings():
    # All strings are identical
    codeflash_output = merge_multi_strings(['abc', 'abc', 'abc'], 0.5, 0.5) # 7.43μs -> 7.15μs (3.87% faster)

def test_overlap_with_substrings():
    # Each string is a substring of the next
    codeflash_output = merge_multi_strings(['a', 'ab', 'abc', 'abcd', 'abcde'], 0.5, 0.5) # 11.1μs -> 10.7μs (4.44% faster)

def test_overlap_with_long_repeated_characters():
    # Long repeated characters
    codeflash_output = merge_multi_strings(['aaaaa', 'aaaaa', 'aaaaa'], 0.9, 0.9) # 12.0μs -> 11.3μs (5.56% faster)

# ----------------------------
# Large Scale Test Cases
# ----------------------------

def test_large_number_of_strings():
    # Merge 500 strings each with overlap
    seqs = [f"abc{i}" for i in range(500)]
    # Each string overlaps with the previous by "abc"
    codeflash_output = merge_multi_strings(seqs, 0.75, 0.75); merged = codeflash_output # 1.22ms -> 1.05ms (15.9% faster)
    # Should contain all numbers from 0 to 499, prefixed by 'abc'
    for i in range(500):
        pass

def test_large_strings_with_overlap():
    # Merge two large strings with significant overlap
    a = "a" * 500 + "b" * 500
    b = "b" * 500 + "c" * 500
    codeflash_output = merge_multi_strings([a, b], 0.5, 0.5); result = codeflash_output # 370μs -> 372μs (0.624% slower)

def test_large_strings_no_overlap():
    # Merge two large strings with no overlap
    a = "x" * 1000
    b = "y" * 1000
    codeflash_output = merge_multi_strings([a, b], 0.0, 0.0); result = codeflash_output # 335μs -> 330μs (1.79% faster)

def test_large_strings_full_overlap():
    # Merge two large identical strings with maximal overlap
    a = "z" * 1000
    b = "z" * 1000
    codeflash_output = merge_multi_strings([a, b], 1.0, 1.0); result = codeflash_output # 396μs -> 411μs (3.59% slower)

def test_large_list_single_char_strings():
    # Merge 1000 single character strings
    seqs = [chr(65 + (i % 26)) for i in range(1000)]  # A-Z repeated
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); result = codeflash_output # 260μs -> 261μs (0.192% slower)
    for i in range(1000):
        pass

def test_large_list_with_varied_overlap():
    # Merge 1000 strings, each overlaps with the next by 1 char
    seqs = [f"{chr(65 + (i % 26))}test{i}" for i in range(1000)]
    codeflash_output = merge_multi_strings(seqs, 0.2, 0.2); result = codeflash_output # 2.26ms -> 2.12ms (6.62% faster)
    # Should contain all substrings
    for i in range(1000):
        pass

# ----------------------------
# Mutation-sensitive test cases
# ----------------------------

def test_mutation_sensitive_overlap():
    # If the function fails to merge at the correct overlap, this will fail
    seqs = ["abcde", "cdefg", "efghi"]
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); result = codeflash_output # 9.23μs -> 8.47μs (8.97% faster)

def test_mutation_sensitive_no_overlap():
    # If the function merges when it shouldn't, this will fail
    seqs = ["abc", "xyz", "123"]
    codeflash_output = merge_multi_strings(seqs, 0.0, 0.0); result = codeflash_output # 7.88μs -> 7.07μs (11.5% faster)

def test_mutation_sensitive_last_overlap():
    # If the function uses wrong overlap ratio for last merge, this will fail
    seqs = ["abc", "bcdef", "defgh"]
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.1); result = codeflash_output # 8.44μs -> 7.81μs (8.01% faster)

def test_mutation_sensitive_multiple_perfect_matches():
    # If the function chooses wrong overlap for repeated chars, this will fail
    seqs = ["aaaa", "aaaa", "aaaa"]
    codeflash_output = merge_multi_strings(seqs, 0.8, 0.8); result = codeflash_output # 10.9μs -> 10.5μs (3.93% faster)

def test_mutation_sensitive_unicode():
    # If the function fails on unicode, this will fail
    seqs = ["héllo", "élloworld", "world!"]
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); result = codeflash_output # 11.9μs -> 11.3μs (5.32% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from doctr.models.recognition.utils import merge_multi_strings
from rapidfuzz.distance import Hamming

# unit tests

# ----------------------
# Basic Test Cases
# ----------------------

def test_basic_no_overlap():
    # No overlap between strings, should concatenate
    codeflash_output = merge_multi_strings(['abc', 'def', 'ghi'], 0.0, 0.0) # 7.51μs -> 7.02μs (6.96% faster)

def test_basic_partial_overlap():
    # Overlap between suffix and prefix
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 0.5, 0.5) # 7.55μs -> 6.84μs (10.4% faster)

def test_basic_full_overlap():
    # Full overlap, strings are identical
    codeflash_output = merge_multi_strings(['abc', 'abc', 'abc'], 1.0, 1.0) # 7.48μs -> 6.75μs (10.8% faster)

def test_basic_realistic():
    # Example from docstring
    codeflash_output = merge_multi_strings(['abc', 'bcdef', 'difghi', 'aijkl'], 0.5, 0.1) # 12.9μs -> 12.0μs (7.74% faster)

def test_basic_two_strings_overlap():
    # Two strings, overlap in the middle
    codeflash_output = merge_multi_strings(['hello', 'llo world'], 0.6, 0.6) # 5.91μs -> 5.30μs (11.3% faster)

def test_basic_two_strings_no_overlap():
    # Two strings, no overlap
    codeflash_output = merge_multi_strings(['foo', 'bar'], 0.0, 0.0) # 5.09μs -> 4.63μs (9.96% faster)

def test_basic_single_string():
    # Only one string, should return itself
    codeflash_output = merge_multi_strings(['single'], 0.5, 0.5) # 805ns -> 782ns (2.94% faster)

def test_basic_empty_list():
    # Empty input list
    codeflash_output = merge_multi_strings([], 0.5, 0.5) # 327ns -> 343ns (4.66% slower)

def test_basic_overlap_with_repeated_chars():
    # Repeated characters in overlap
    codeflash_output = merge_multi_strings(['abbb', 'bbbcc'], 0.6, 0.6) # 8.19μs -> 7.71μs (6.22% faster)

def test_basic_overlap_with_numbers():
    # Overlap with digits
    codeflash_output = merge_multi_strings(['12345', '34567', '56789'], 0.4, 0.4) # 9.62μs -> 8.96μs (7.43% faster)

# ----------------------
# Edge Test Cases
# ----------------------

def test_edge_empty_strings():
    # List contains empty strings
    codeflash_output = merge_multi_strings(['', '', ''], 0.5, 0.5) # 2.49μs -> 2.43μs (2.55% faster)
    codeflash_output = merge_multi_strings(['abc', '', 'def'], 0.5, 0.5) # 5.15μs -> 4.67μs (10.1% faster)
    codeflash_output = merge_multi_strings(['', 'abc'], 0.5, 0.5) # 768ns -> 715ns (7.41% faster)
    codeflash_output = merge_multi_strings(['abc', ''], 0.5, 0.5) # 480ns -> 520ns (7.69% slower)

def test_edge_one_char_strings():
    # Strings with only one character
    codeflash_output = merge_multi_strings(['a', 'b', 'c'], 0.5, 0.5) # 2.64μs -> 2.57μs (2.45% faster)
    codeflash_output = merge_multi_strings(['a', 'a', 'a'], 1.0, 1.0) # 1.12μs -> 1.10μs (0.995% faster)

def test_edge_overlap_ratio_zero():
    # Overlap ratio is zero, should concatenate
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 0.0, 0.0) # 8.02μs -> 7.32μs (9.62% faster)

def test_edge_overlap_ratio_one():
    # Overlap ratio is one, should maximize overlap
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 1.0, 1.0) # 9.26μs -> 8.52μs (8.73% faster)

def test_edge_last_overlap_ratio_differs():
    # Last overlap ratio is different
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 0.5, 0.0) # 7.67μs -> 7.12μs (7.80% faster)
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 0.0, 1.0) # 5.07μs -> 4.84μs (4.86% faster)

def test_edge_overlap_ratio_negative():
    # Negative overlap ratio, should behave as no overlap
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], -1.0, -1.0) # 7.41μs -> 6.76μs (9.55% faster)

def test_edge_non_ascii_characters():
    # Unicode characters
    codeflash_output = merge_multi_strings(['héllo', 'llo wørld', 'ørld!'], 0.6, 0.6) # 10.2μs -> 9.41μs (8.46% faster)

def test_edge_special_characters():
    # Special characters
    codeflash_output = merge_multi_strings(['foo', '$bar', 'bar#'], 0.5, 0.5) # 8.14μs -> 7.43μs (9.56% faster)

def test_edge_overlap_at_start_and_end():
    # Overlap at start or end
    codeflash_output = merge_multi_strings(['start', 'artistic', 'icend'], 0.5, 0.5) # 9.16μs -> 8.45μs (8.45% faster)

def test_edge_overlap_with_spaces():
    # Overlap with spaces
    codeflash_output = merge_multi_strings(['hello ', 'lo world'], 0.5, 0.5) # 6.11μs -> 5.62μs (8.73% faster)

def test_edge_overlap_with_empty_and_nonempty():
    # Mix of empty and non-empty strings
    codeflash_output = merge_multi_strings(['', 'abc', ''], 0.5, 0.5) # 2.58μs -> 2.53μs (1.86% faster)
    codeflash_output = merge_multi_strings(['', '', 'abc'], 0.5, 0.5) # 1.12μs -> 1.15μs (2.09% slower)

def test_edge_overlap_with_long_single_string():
    # Single long string in list
    s = 'a' * 500
    codeflash_output = merge_multi_strings([s], 0.5, 0.5) # 793ns -> 775ns (2.32% faster)

def test_edge_overlap_ratio_greater_than_one():
    # Overlap ratio > 1, should not crash
    codeflash_output = merge_multi_strings(['abc', 'bcd'], 2.0, 2.0) # 7.37μs -> 6.69μs (10.3% faster)

def test_edge_overlap_ratio_float_precision():
    # Test with float overlap ratio
    codeflash_output = merge_multi_strings(['abc', 'bcd', 'cde'], 0.333, 0.666) # 7.94μs -> 7.53μs (5.43% faster)

# ----------------------
# Large Scale Test Cases
# ----------------------

def test_large_scale_many_short_strings():
    # Merge 1000 single-character strings
    seqs = [chr(65 + (i % 26)) for i in range(1000)]  # A-Z repeated
    codeflash_output = merge_multi_strings(seqs, 0.0, 0.0); result = codeflash_output # 267μs -> 266μs (0.461% faster)

def test_large_scale_long_strings_with_overlap():
    # Merge 10 long strings with 50-char overlaps
    base = 'A' * 50
    seqs = []
    for i in range(10):
        seqs.append(base + str(i) + base)
    # Each string overlaps with the next by 50 chars
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); merged = codeflash_output # 236μs -> 239μs (1.54% slower)
    # Should contain all unique suffixes
    for i in range(10):
        pass

def test_large_scale_large_overlap_ratio():
    # Merge 100 strings, each with large overlaps
    seqs = []
    for i in range(100):
        seqs.append('X' * 20 + str(i) + 'Y' * 20)
    codeflash_output = merge_multi_strings(seqs, 0.9, 0.9); merged = codeflash_output # 1.16ms -> 1.08ms (7.31% faster)
    # All numbers should be present
    for i in range(100):
        pass

def test_large_scale_varying_overlap_ratios():
    # Merge strings with varying overlap ratios
    seqs = ['abc', 'bcd', 'cde', 'def', 'efg']
    # Use different ratios for last
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.1); merged = codeflash_output # 11.4μs -> 10.2μs (12.3% faster)

def test_large_scale_overlap_with_repeated_patterns():
    # Overlap with repeated patterns
    seqs = ['ababab', 'baba', 'abab', 'bababa']
    codeflash_output = merge_multi_strings(seqs, 0.7, 0.7); merged = codeflash_output # 14.1μs -> 13.4μs (5.02% faster)

def test_large_scale_unicode():
    # Large scale with unicode
    seqs = ['汉字' * 10, '字汉' * 10, '汉字' * 10]
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); merged = codeflash_output # 21.2μs -> 21.2μs (0.019% faster)

def test_large_scale_efficiency():
    # Test that merging 1000 short strings is fast and correct
    seqs = ['x'] * 1000
    codeflash_output = merge_multi_strings(seqs, 0.5, 0.5); merged = codeflash_output # 264μs -> 263μs (0.234% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
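
The equivalence check described in the comment above can be pictured roughly as follows. This is a generic sketch, not Codeflash's actual harness: `assert_same_outputs` and the two toy `join_*` functions are hypothetical and stand in for the original and optimized implementations.

```python
from typing import Callable, Iterable, Sequence


def assert_same_outputs(
    original: Callable[..., object],
    optimized: Callable[..., object],
    cases: Iterable[Sequence[object]],
) -> None:
    """Call both implementations on every argument tuple and compare results."""
    for args in cases:
        expected = original(*args)
        actual = optimized(*args)
        assert actual == expected, f"mismatch for {args!r}: {actual!r} != {expected!r}"


def join_original(parts: list[str]) -> str:
    """Toy 'original' implementation: repeated concatenation."""
    out = ""
    for part in parts:
        out += part
    return out


def join_optimized(parts: list[str]) -> str:
    """Toy 'optimized' implementation with the same observable behavior."""
    return "".join(parts)


cases = [([],), (["abc"],), (["abc", "def", "ghi"],)]
assert_same_outputs(join_original, join_optimized, cases)
print("outputs match")
```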

To edit these changes git checkout codeflash/optimize-merge_multi_strings-mg7io5aq and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 1, 2025 04:59
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 1, 2025