@codeflash-ai codeflash-ai bot commented Oct 31, 2025

📄 12% (0.12x) speedup for DriverManager.find_driver_class in nvflare/fuel/f3/drivers/driver_manager.py

⏱️ Runtime : 1.29 milliseconds → 1.16 milliseconds (best of 204 runs)

📝 Explanation and details

The optimized code achieves a 12% speedup through two key optimizations:

1. Replace .lower() with .casefold() for case-insensitive matching:

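  • For ASCII scheme names, str.casefold() and str.lower() return the same string, and the generated tests show the casefold version running faster on most such inputs. The claim does not hold uniformly for non-ASCII input: "schème:foo" ran 7.55% slower and "httpé" ran 3.23% slower. casefold() is also more aggressive than lower() (for example, "ß" folds to "ss"), so lookups can differ for non-ASCII scheme names
  • The profiler shows the most expensive line (the dictionary lookup) moving from 35% to 34.3% of execution time, which is a marginal change
  • Test results show 10-20% improvements in most cases, with the strongest gains for URL inputs (21-26% faster in the basic URL tests)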
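As a minimal sketch of the normalization change (the helper names and sample strings below are illustrative and not taken from nvflare), the two methods agree on ASCII scheme names but can diverge on non-ASCII input:

```python
# Illustrative helpers; these names are hypothetical and not part of nvflare.
def normalize_lower(scheme: str) -> str:
    """Normalize with str.lower(), as the original code is described as doing."""
    return scheme.lower()


def normalize_casefold(scheme: str) -> str:
    """Normalize with str.casefold(), as the optimized code is described as doing."""
    return scheme.casefold()


# ASCII scheme names: both methods produce the same dictionary key.
for name in ("HTTP", "Ftp", "grpc"):
    assert normalize_lower(name) == normalize_casefold(name)

# Non-ASCII input: casefold() is more aggressive than lower().
print(normalize_lower("Straße"))     # straße
print(normalize_casefold("Straße"))  # strasse
```

A registry keyed by "straße" would be found via lower() but not via casefold(), so the swap preserves behavior only if registered scheme names are ASCII, which is an assumption about how the registry is populated.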

2. Use slice notation [:index] instead of [0:index]:

  • [:index] and [0:index] return the same substring; omitting the explicit start index is at most a marginal saving in CPython
  • This micro-optimization likely accounts for only a small part of the overall gain
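The slicing change can be sketched as follows. The function names and the colon-handling rule (split only when a colon appears after the first character) are assumptions made for illustration; the actual find_driver_class body is not shown in this PR description.

```python
def scheme_explicit_start(scheme_or_url: str) -> str:
    """Extract the scheme using an explicit start index, [0:index]."""
    index = scheme_or_url.find(":")
    # Assumption: no colon, or a colon at position 0, means "no scheme prefix".
    return scheme_or_url[0:index] if index > 0 else scheme_or_url


def scheme_implicit_start(scheme_or_url: str) -> str:
    """Extract the scheme using the shorter [:index] form."""
    index = scheme_or_url.find(":")
    return scheme_or_url[:index] if index > 0 else scheme_or_url


# Both forms return the same substring; only the spelling differs.
for text in ("http://example.com", "ftp", "x:abc:def"):
    assert scheme_explicit_start(text) == scheme_implicit_start(text)

print(scheme_implicit_start("x:abc:def"))  # x
```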

Performance characteristics (per-call figures from the annotated generated tests):

  • Best improvements occur with URL inputs containing colons (up to 32% faster), as these exercise both the slicing and casefold changes
  • Case-insensitive lookups improve in every annotated case, though unevenly (roughly 3-22% faster)
  • Large-scale scenarios with 1000 registered drivers keep the improvement (11-13% faster in the looped lookups, up to 27% for single calls)
  • Smallest gains on simple scheme-only lookups, though still positive (often 1-15% faster)
  • Three annotated calls ran slower: "schème:foo" (7.55%), "httpé" (3.23%), and the list-input error case (1.24%)

The changes are most effective for URL inputs and case-insensitive scheme matching, which are the scenarios the generated tests exercise most heavily.
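To reproduce this kind of comparison locally, a rough timeit harness might look like the following. The two lookup functions and the registry contents are stand-ins (assumptions), not the nvflare implementation, and absolute timings will vary by machine and Python version.

```python
import timeit

# Hypothetical registry; string values stand in for driver classes.
DRIVERS = {"http": "HttpDriver", "ftp": "FtpDriver"}


def find_original(scheme_or_url: str):
    """Stand-in for the pre-change lookup: [0:index] and lower()."""
    index = scheme_or_url.find(":")
    scheme = scheme_or_url[0:index] if index > 0 else scheme_or_url
    return DRIVERS.get(scheme.lower())


def find_optimized(scheme_or_url: str):
    """Stand-in for the post-change lookup: [:index] and casefold()."""
    index = scheme_or_url.find(":")
    scheme = scheme_or_url[:index] if index > 0 else scheme_or_url
    return DRIVERS.get(scheme.casefold())


def best_time(func, arg: str, repeat: int = 5, number: int = 20_000) -> float:
    """Return the best total time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number))


for sample in ("http", "HTTP://example.com", "unknown://host"):
    before = best_time(find_original, sample)
    after = best_time(find_optimized, sample)
    print(f"{sample!r}: {before:.4f}s -> {after:.4f}s")
```

Taking the minimum over several repeats reduces noise from other processes; with differences of tens of nanoseconds per call, single runs are not reliable.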

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 4131 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from typing import Optional, Type

# imports
import pytest  # used for our unit tests
from nvflare.fuel.f3.drivers.driver_manager import DriverManager


# Minimal stub for the Driver base class
class Driver:
    pass


# Helper classes for testing
class HttpDriver(Driver):
    pass

class FtpDriver(Driver):
    pass

class CustomDriver(Driver):
    pass

class XDriver(Driver):
    pass

# -------------------- Unit Tests --------------------

# Basic Test Cases
def test_find_driver_class_with_scheme_only():
    # Test with scheme only (basic case)
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
        "ftp": FtpDriver,
    }
    # Should find the correct class for 'http'
    codeflash_output = mgr.find_driver_class("http") # 924ns -> 808ns (14.4% faster)
    # Should find the correct class for 'ftp'
    codeflash_output = mgr.find_driver_class("ftp") # 421ns -> 414ns (1.69% faster)

def test_find_driver_class_with_url():
    # Test with URL (scheme + colon)
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
        "ftp": FtpDriver,
    }
    # Should extract scheme and find the correct class
    codeflash_output = mgr.find_driver_class("http://example.com") # 1.24μs -> 1.02μs (21.4% faster)
    codeflash_output = mgr.find_driver_class("ftp://host/file.txt") # 515ns -> 467ns (10.3% faster)

def test_find_driver_class_case_insensitivity():
    # Should be case-insensitive for scheme
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
        "ftp": FtpDriver,
    }
    codeflash_output = mgr.find_driver_class("HTTP") # 859ns -> 787ns (9.15% faster)
    codeflash_output = mgr.find_driver_class("Ftp://host/file.txt") # 807ns -> 672ns (20.1% faster)

def test_find_driver_class_returns_none_for_unknown_scheme():
    # Should return None if scheme not found
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
    }
    codeflash_output = mgr.find_driver_class("ftp") # 845ns -> 760ns (11.2% faster)
    codeflash_output = mgr.find_driver_class("ftp://host/file.txt") # 732ns -> 615ns (19.0% faster)
    codeflash_output = mgr.find_driver_class("unknown") # 456ns -> 405ns (12.6% faster)

# Edge Test Cases
def test_find_driver_class_with_empty_string():
    # Should return None for empty string
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
    }
    codeflash_output = mgr.find_driver_class("") # 858ns -> 787ns (9.02% faster)

def test_find_driver_class_with_colon_at_start():
    # Should treat string starting with ':' as scheme
    mgr = DriverManager()
    mgr.drivers = {
        "": CustomDriver,
    }
    # The scheme is '', so it should match the empty string key
    codeflash_output = mgr.find_driver_class(":something") # 919ns -> 820ns (12.1% faster)

def test_find_driver_class_with_multiple_colons():
    # Should only consider the first colon
    mgr = DriverManager()
    mgr.drivers = {
        "x": XDriver,
    }
    # The scheme is 'x' before the first colon
    codeflash_output = mgr.find_driver_class("x:abc:def") # 1.29μs -> 1.04μs (23.3% faster)

def test_find_driver_class_with_no_colon_and_no_matching_driver():
    # Should return None if no colon and no matching driver
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
    }
    codeflash_output = mgr.find_driver_class("ftp") # 877ns -> 813ns (7.87% faster)

def test_find_driver_class_with_uppercase_and_mixed_case_scheme():
    # Should handle mixed/uppercase scheme correctly
    mgr = DriverManager()
    mgr.drivers = {
        "custom": CustomDriver,
    }
    codeflash_output = mgr.find_driver_class("Custom:foo") # 1.30μs -> 1.08μs (20.5% faster)
    codeflash_output = mgr.find_driver_class("CUSTOM") # 535ns -> 478ns (11.9% faster)

def test_find_driver_class_with_scheme_containing_spaces():
    # Should handle schemes with spaces (though unusual)
    mgr = DriverManager()
    mgr.drivers = {
        "my scheme": CustomDriver,
    }
    codeflash_output = mgr.find_driver_class("my scheme:foo") # 1.21μs -> 1.06μs (13.4% faster)
    codeflash_output = mgr.find_driver_class("MY SCHEME") # 499ns -> 471ns (5.94% faster)

def test_find_driver_class_with_non_ascii_scheme():
    # Should handle non-ASCII schemes
    mgr = DriverManager()
    mgr.drivers = {
        "schème": CustomDriver,
    }
    codeflash_output = mgr.find_driver_class("schème:foo") # 1.69μs -> 1.83μs (7.55% slower)
    codeflash_output = mgr.find_driver_class("SCHÈME") # 573ns -> 529ns (8.32% faster)

def test_find_driver_class_with_numeric_scheme():
    # Should handle numeric schemes
    mgr = DriverManager()
    mgr.drivers = {
        "123": CustomDriver,
    }
    codeflash_output = mgr.find_driver_class("123:abc") # 1.26μs -> 1.06μs (18.9% faster)
    codeflash_output = mgr.find_driver_class("123") # 485ns -> 457ns (6.13% faster)

def test_find_driver_class_with_scheme_containing_special_characters():
    # Should handle schemes with special characters
    mgr = DriverManager()
    mgr.drivers = {
        "foo-bar": CustomDriver,
        "foo_bar": XDriver,
    }
    codeflash_output = mgr.find_driver_class("foo-bar:abc") # 1.25μs -> 1.04μs (19.5% faster)
    codeflash_output = mgr.find_driver_class("foo_bar") # 497ns -> 432ns (15.0% faster)

# Large Scale Test Cases
def test_find_driver_class_with_many_drivers():
    # Test with a large number of drivers (up to 1000)
    mgr = DriverManager()
    # Create 1000 dummy driver classes with unique scheme names
    for i in range(1000):
        # Dynamically create a driver class
        driver_cls = type(f"Driver{i}", (Driver,), {})
        mgr.drivers[f"scheme{i}"] = driver_cls

    # Test that each scheme resolves to the correct class
    for i in range(1000):
        scheme = f"scheme{i}"
        url = f"{scheme}:somevalue"
        expected_cls = mgr.drivers[scheme]
        # Check with scheme only
        codeflash_output = mgr.find_driver_class(scheme) # 291μs -> 259μs (12.3% faster)
        # Check with scheme in URL
        codeflash_output = mgr.find_driver_class(url)
        # Check with uppercase scheme
        codeflash_output = mgr.find_driver_class(scheme.upper()) # 343μs -> 309μs (11.2% faster)

def test_find_driver_class_performance_with_large_input():
    # Test performance with large input strings
    mgr = DriverManager()
    mgr.drivers = {
        "http": HttpDriver,
    }
    # Create a very long URL (999 'a's after the scheme)
    long_url = "http:" + "a" * 999
    codeflash_output = mgr.find_driver_class(long_url) # 1.72μs -> 1.54μs (11.1% faster)

def test_find_driver_class_with_large_scheme_name():
    # Test with a very long scheme name (999 chars)
    mgr = DriverManager()
    long_scheme = "a" * 999
    driver_cls = type("LongSchemeDriver", (Driver,), {})
    mgr.drivers[long_scheme] = driver_cls
    codeflash_output = mgr.find_driver_class(long_scheme) # 2.33μs -> 2.11μs (10.0% faster)
    # Also test with long scheme in URL
    codeflash_output = mgr.find_driver_class(long_scheme + ":foo") # 1.88μs -> 1.70μs (10.7% faster)

def test_find_driver_class_with_large_number_of_non_matching_schemes():
    # Test with many drivers, none matching the input
    mgr = DriverManager()
    for i in range(1000):
        driver_cls = type(f"Driver{i}", (Driver,), {})
        mgr.drivers[f"scheme{i}"] = driver_cls
    # Use a scheme not present
    codeflash_output = mgr.find_driver_class("notfound") # 1.83μs -> 1.77μs (3.74% faster)
    codeflash_output = mgr.find_driver_class("notfound:abc") # 1.05μs -> 848ns (23.3% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Optional, Type

# imports
import pytest  # used for our unit tests
from nvflare.fuel.f3.drivers.driver_manager import DriverManager


# --- Minimal stub for Driver base class ---
class Driver:
    """Base class for all drivers."""
    pass


# --- Example concrete driver classes for testing ---
class HttpDriver(Driver):
    pass

class FtpDriver(Driver):
    pass

class SshDriver(Driver):
    pass

class CustomDriver(Driver):
    pass

# --- Pytest fixtures for DriverManager setup ---
@pytest.fixture
def driver_manager_basic():
    """Fixture with basic drivers registered."""
    dm = DriverManager()
    dm.drivers = {
        "http": HttpDriver,
        "ftp": FtpDriver,
        "ssh": SshDriver,
    }
    return dm

@pytest.fixture
def driver_manager_edge():
    """Fixture with edge-case drivers registered."""
    dm = DriverManager()
    dm.drivers = {
        "": CustomDriver,  # empty string as scheme
        "special": CustomDriver,
        "http": HttpDriver,
        "ftp": FtpDriver,
        "ssh": SshDriver,
    }
    return dm

@pytest.fixture
def driver_manager_large():
    """Fixture with a large number of drivers registered."""
    dm = DriverManager()
    # Generate 1000 fake schemes and assign CustomDriver to each
    for i in range(1000):
        dm.drivers[f"scheme{i}"] = CustomDriver
    # Add a few known drivers for control
    dm.drivers["http"] = HttpDriver
    dm.drivers["ftp"] = FtpDriver
    dm.drivers["ssh"] = SshDriver
    return dm

# --- Basic Test Cases ---

def test_find_driver_class_basic_scheme(driver_manager_basic):
    # Test with exact scheme
    codeflash_output = driver_manager_basic.find_driver_class("http") # 1.25μs -> 1.11μs (12.3% faster)
    codeflash_output = driver_manager_basic.find_driver_class("ftp") # 496ns -> 438ns (13.2% faster)
    codeflash_output = driver_manager_basic.find_driver_class("ssh") # 291ns -> 262ns (11.1% faster)

def test_find_driver_class_basic_url(driver_manager_basic):
    # Test with full URL
    codeflash_output = driver_manager_basic.find_driver_class("http://example.com") # 1.45μs -> 1.14μs (26.3% faster)
    codeflash_output = driver_manager_basic.find_driver_class("ftp://myhost") # 606ns -> 510ns (18.8% faster)
    codeflash_output = driver_manager_basic.find_driver_class("ssh://host:22") # 370ns -> 318ns (16.4% faster)

def test_find_driver_class_basic_case_insensitive(driver_manager_basic):
    # Test with mixed-case schemes
    codeflash_output = driver_manager_basic.find_driver_class("HTTP") # 920ns -> 845ns (8.88% faster)
    codeflash_output = driver_manager_basic.find_driver_class("Ftp") # 429ns -> 418ns (2.63% faster)
    codeflash_output = driver_manager_basic.find_driver_class("SSh") # 317ns -> 259ns (22.4% faster)

def test_find_driver_class_basic_not_found(driver_manager_basic):
    # Test with unknown scheme
    codeflash_output = driver_manager_basic.find_driver_class("smtp") # 863ns -> 778ns (10.9% faster)
    codeflash_output = driver_manager_basic.find_driver_class("smtp://mailhost") # 751ns -> 612ns (22.7% faster)

# --- Edge Test Cases ---

def test_find_driver_class_edge_empty_string(driver_manager_edge):
    # Test with empty string as input (should match "" scheme)
    codeflash_output = driver_manager_edge.find_driver_class("") # 883ns -> 798ns (10.7% faster)

def test_find_driver_class_edge_colon_at_start(driver_manager_edge):
    # Test with colon at start (should treat whole string as scheme)
    codeflash_output = driver_manager_edge.find_driver_class(":http") # 969ns -> 867ns (11.8% faster)
    # If "" is registered, ":http" should match "" scheme
    codeflash_output = driver_manager_edge.find_driver_class(":") # 441ns -> 400ns (10.2% faster)

def test_find_driver_class_edge_no_colon(driver_manager_edge):
    # Test with no colon, but scheme exists
    codeflash_output = driver_manager_edge.find_driver_class("special") # 981ns -> 853ns (15.0% faster)

def test_find_driver_class_edge_multiple_colons(driver_manager_basic):
    # Should only consider first colon
    codeflash_output = driver_manager_basic.find_driver_class("http:extra:stuff") # 1.41μs -> 1.17μs (20.4% faster)
    codeflash_output = driver_manager_basic.find_driver_class("ftp:foo:bar:baz") # 580ns -> 525ns (10.5% faster)

def test_find_driver_class_edge_spaces_and_whitespace(driver_manager_basic):
    # Schemes with leading/trailing spaces should not match
    codeflash_output = driver_manager_basic.find_driver_class(" http") # 966ns -> 817ns (18.2% faster)
    codeflash_output = driver_manager_basic.find_driver_class("http ") # 363ns -> 301ns (20.6% faster)
    codeflash_output = driver_manager_basic.find_driver_class(" http://host") # 677ns -> 554ns (22.2% faster)
    # But stripping spaces before passing should work
    codeflash_output = driver_manager_basic.find_driver_class("http".strip()) # 419ns -> 361ns (16.1% faster)

def test_find_driver_class_edge_uppercase_url(driver_manager_basic):
    # URLs with uppercase scheme should match
    codeflash_output = driver_manager_basic.find_driver_class("HTTP://example.com") # 1.24μs -> 1.02μs (21.0% faster)

def test_find_driver_class_edge_non_string_input(driver_manager_basic):
    # Non-string input should raise AttributeError
    with pytest.raises(AttributeError):
        driver_manager_basic.find_driver_class(None) # 1.60μs -> 1.28μs (24.5% faster)
    with pytest.raises(AttributeError):
        driver_manager_basic.find_driver_class(123) # 945ns -> 777ns (21.6% faster)
    with pytest.raises(AttributeError):
        driver_manager_basic.find_driver_class(["http"]) # 639ns -> 647ns (1.24% slower)

def test_find_driver_class_edge_scheme_not_in_drivers(driver_manager_basic):
    # Scheme not registered should return None
    codeflash_output = driver_manager_basic.find_driver_class("unknown") # 1.10μs -> 941ns (17.0% faster)
    codeflash_output = driver_manager_basic.find_driver_class("unknown://host") # 903ns -> 682ns (32.4% faster)

def test_find_driver_class_edge_scheme_with_special_chars(driver_manager_edge):
    # Scheme with special chars not registered should return None
    codeflash_output = driver_manager_edge.find_driver_class("http$") # 945ns -> 832ns (13.6% faster)
    codeflash_output = driver_manager_edge.find_driver_class("ftp!") # 477ns -> 456ns (4.61% faster)

# --- Large Scale Test Cases ---

def test_find_driver_class_large_scale_known_scheme(driver_manager_large):
    # Should still find known drivers among many
    codeflash_output = driver_manager_large.find_driver_class("http") # 1.12μs -> 881ns (27.4% faster)
    codeflash_output = driver_manager_large.find_driver_class("ftp") # 539ns -> 535ns (0.748% faster)
    codeflash_output = driver_manager_large.find_driver_class("ssh") # 349ns -> 295ns (18.3% faster)

def test_find_driver_class_large_scale_fake_scheme(driver_manager_large):
    # Should find CustomDriver for any fake scheme
    for i in range(0, 1000, 100):  # test every 100th scheme
        scheme = f"scheme{i}"
        codeflash_output = driver_manager_large.find_driver_class(scheme) # 3.96μs -> 3.52μs (12.7% faster)
        # test with URL format
        url = f"{scheme}://host"
        codeflash_output = driver_manager_large.find_driver_class(url)

def test_find_driver_class_large_scale_unknown_scheme(driver_manager_large):
    # Should return None for unknown scheme
    codeflash_output = driver_manager_large.find_driver_class("notarealscheme") # 1.02μs -> 957ns (6.17% faster)
    codeflash_output = driver_manager_large.find_driver_class("notarealscheme://host") # 856ns -> 746ns (14.7% faster)

def test_find_driver_class_large_scale_performance(driver_manager_large):
    # Performance: ensure lookup is fast for large number of drivers
    import time
    start = time.time()
    for i in range(1000):
        scheme = f"scheme{i}"
        codeflash_output = driver_manager_large.find_driver_class(scheme) # 300μs -> 266μs (12.5% faster)
    duration = time.time() - start

# --- Additional Edge Case: Unicode and Non-ASCII ---
def test_find_driver_class_edge_unicode_scheme(driver_manager_edge):
    # Unicode schemes not registered should return None
    codeflash_output = driver_manager_edge.find_driver_class("httpé") # 1.74μs -> 1.80μs (3.23% slower)
    # Unicode scheme registered should work
    dm = DriverManager()
    dm.drivers = {"unicøde": CustomDriver}
    codeflash_output = dm.find_driver_class("unicøde") # 662ns -> 601ns (10.1% faster)
    codeflash_output = dm.find_driver_class("unicøde://host") # 1.20μs -> 941ns (27.2% faster)

# --- Additional Edge Case: Scheme with digits ---
def test_find_driver_class_edge_scheme_with_digits(driver_manager_edge):
    dm = DriverManager()
    dm.drivers = {"http2": CustomDriver}
    codeflash_output = dm.find_driver_class("http2") # 888ns -> 829ns (7.12% faster)
    codeflash_output = dm.find_driver_class("http2://host") # 815ns -> 690ns (18.1% faster)

# --- Additional Edge Case: Scheme with hyphens and underscores ---
def test_find_driver_class_edge_scheme_with_hyphen_and_underscore(driver_manager_edge):
    dm = DriverManager()
    dm.drivers = {"my-scheme": CustomDriver, "my_scheme": CustomDriver}
    codeflash_output = dm.find_driver_class("my-scheme") # 887ns -> 837ns (5.97% faster)
    codeflash_output = dm.find_driver_class("my-scheme://host") # 863ns -> 679ns (27.1% faster)
    codeflash_output = dm.find_driver_class("my_scheme") # 349ns -> 319ns (9.40% faster)
    codeflash_output = dm.find_driver_class("my_scheme://host") # 367ns -> 338ns (8.58% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-DriverManager.find_driver_class-mhe72zx9 and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 31, 2025 01:49
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 31, 2025