@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 5% (0.05x) speedup for require_arguments in nvflare/fuel/utils/function_utils.py

⏱️ Runtime: 3.38 milliseconds → 3.22 milliseconds (best of 231 runs)
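
As a quick sanity check, the headline figure can be reproduced from the two runtimes above. This sketch assumes the speedup is measured relative to the optimized runtime; that formula is an assumption, not something the report states, and the variable names are illustrative.

```python
# Runtimes taken from the report above, in milliseconds.
original_ms = 3.38
optimized_ms = 3.22

# Assumed definition: time saved, relative to the optimized runtime.
speedup = (original_ms - optimized_ms) / optimized_ms

print(f"speedup: {speedup:.2%} ({speedup:.2f}x)")
```

Dividing by the original runtime instead gives a slightly smaller figure, which still rounds to 5%.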

📝 Explanation and details

The optimization achieves a 5% speedup by eliminating redundant iterations over function parameters and caching frequently-accessed constants.

Key optimizations:

  1. Single-pass parameter processing: The original code made separate passes over parameters.values(): one for the any() call that checks for positional arguments, and another for the list comprehension that counts defaults. The optimized version processes all parameters in a single loop, reducing iteration overhead.

  2. Constant caching: Frequently accessed constants like inspect.Parameter.POSITIONAL_OR_KEYWORD and inspect.Parameter.empty are stored in local variables (param_pos_key, param_empty). This eliminates repeated attribute lookups during the loop, which is especially beneficial for functions with many parameters.

  3. Early termination optimization: The req flag is set to True as soon as the first positional-or-keyword parameter is found, and that check is skipped for the remaining parameters. The loop itself still runs to the end, because the defaults must be counted for every parameter.

  4. Eliminated intermediate data structures: The original code created a temporary list args_with_defaults via list comprehension, then called len() on it. The optimized version directly increments a counter, avoiding memory allocation and the overhead of building an intermediate list.
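
The four points above can be sketched side by side. This is a minimal reconstruction from the description, not the nvflare source: the function names are hypothetical, and the reading of req as "some positional-or-keyword parameter has no default" is an assumption. Only the local names param_pos_key, param_empty, req, and args_with_defaults come from the explanation.

```python
import inspect


def require_arguments_two_pass(func):
    """Two-pass style, reconstructed from the description (hypothetical)."""
    parameters = inspect.signature(func).parameters
    # Pass 1: any() scans for a positional-or-keyword parameter without a default.
    req = any(
        p.default is inspect.Parameter.empty
        for p in parameters.values()
        if p.kind == inspect.Parameter.POSITIONAL_OR_KEYWORD
    )
    # Pass 2: a temporary list is built only so that len() can be taken.
    args_with_defaults = [
        p for p in parameters.values() if p.default is not inspect.Parameter.empty
    ]
    return req, len(parameters), len(args_with_defaults)


def require_arguments_single_pass(func):
    """Single-pass style with cached constants (hypothetical sketch)."""
    parameters = inspect.signature(func).parameters
    # Cache the constants once instead of looking them up per iteration.
    param_pos_key = inspect.Parameter.POSITIONAL_OR_KEYWORD
    param_empty = inspect.Parameter.empty
    req = False
    default_size = 0
    for p in parameters.values():
        if p.default is param_empty:
            # Once req is True, the kind comparison is skipped entirely.
            if not req and p.kind == param_pos_key:
                req = True
        else:
            # Count directly instead of building an intermediate list.
            default_size += 1
    return req, len(parameters), default_size


def example(a, b=2, *args, c, d=4, **kwargs):
    pass


print(require_arguments_two_pass(example))
print(require_arguments_single_pass(example))
```

Both versions call inspect.signature() exactly once, so any difference between them comes only from the loop body. That is consistent with the small gains reported for short signatures.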

Performance impact by test case type:

  • Large-scale functions (500+ parameters) see the biggest gain in the keyword-only case (493μs -> 446μs, 10.5% faster); the other large-scale cases improve by roughly 4%
  • Standard functions (1-10 parameters) see modest but consistent improvements of roughly 2-9%
  • Built-in functions show smaller gains (~2-3%) since most time is spent in inspect.signature() itself

The optimization is most effective for functions with many parameters, where the single-pass approach and constant caching compound their benefits.
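
One way to check the claim that inspect.signature() dominates for built-ins is to time it on its own and compare with the full call. The sketch below uses a hypothetical stand-in, count_defaults, rather than the real require_arguments, and the helper names are illustrative. Results vary by machine and Python version, so no expected output is shown.

```python
import inspect
import timeit


def count_defaults(func):
    """Hypothetical stand-in: one signature lookup plus one counting loop."""
    empty = inspect.Parameter.empty
    count = 0
    for p in inspect.signature(func).parameters.values():
        if p.default is not empty:
            count += 1
    return count


def signature_share(func, number=2000):
    """Estimate the fraction of total time spent inside inspect.signature()."""
    sig_time = timeit.timeit(lambda: inspect.signature(func), number=number)
    total_time = timeit.timeit(lambda: count_defaults(func), number=number)
    return sig_time / total_time


def small(a, b=2):
    pass


if __name__ == "__main__":
    for name, func in [("len", len), ("small", small)]:
        share = signature_share(func)
        print(f"{name}: about {share:.0%} of the time is in inspect.signature()")
```

When the share is close to 100%, the loop-level changes have little room to help, which would explain why the built-in cases gain least.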

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 46 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
import inspect

# imports
import pytest  # used for our unit tests
from nvflare.fuel.utils.function_utils import require_arguments

# unit tests

# 1. Basic Test Cases

def test_no_arguments():
    # Function with no arguments
    def foo():
        pass
    require_args, size, default_size = require_arguments(foo) # 14.9μs -> 14.5μs (3.39% faster)

def test_one_positional_argument():
    # Function with one required positional argument
    def foo(a):
        pass
    require_args, size, default_size = require_arguments(foo) # 17.2μs -> 16.3μs (5.83% faster)

def test_one_default_argument():
    # Function with one positional argument with default
    def foo(a=1):
        pass
    require_args, size, default_size = require_arguments(foo) # 16.4μs -> 15.4μs (6.64% faster)

def test_multiple_arguments_mixed_defaults():
    # Function with mixed required and default arguments
    def foo(a, b=2, c=3):
        pass
    require_args, size, default_size = require_arguments(foo) # 18.8μs -> 18.1μs (3.55% faster)

def test_keyword_only_arguments():
    # Function with keyword-only arguments
    def foo(*, a, b=2):
        pass
    require_args, size, default_size = require_arguments(foo) # 17.3μs -> 16.3μs (5.72% faster)

def test_varargs_and_kwargs():
    # Function with *args and **kwargs only
    def foo(*args, **kwargs):
        pass
    require_args, size, default_size = require_arguments(foo) # 16.9μs -> 15.9μs (6.20% faster)

def test_mixed_varargs_and_positional():
    # Function with positional and *args/**kwargs
    def foo(a, *args, b=2, **kwargs):
        pass
    require_args, size, default_size = require_arguments(foo) # 19.6μs -> 18.3μs (7.46% faster)

# 2. Edge Test Cases

def test_positional_only_arguments():
    # Function with positional-only arguments (Python 3.8+)
    def foo(a, /, b):
        pass
    require_args, size, default_size = require_arguments(foo) # 16.1μs -> 15.0μs (7.67% faster)

def test_all_default_arguments():
    # Function where all arguments have defaults
    def foo(a=1, b=2, c=3):
        pass
    require_args, size, default_size = require_arguments(foo) # 17.5μs -> 17.0μs (2.80% faster)

def test_only_keyword_only_arguments_with_defaults():
    # Only keyword-only arguments, all with defaults
    def foo(*, a=1, b=2):
        pass
    require_args, size, default_size = require_arguments(foo) # 16.3μs -> 15.5μs (5.46% faster)

def test_only_keyword_only_arguments_no_defaults():
    # Only keyword-only arguments, none with defaults
    def foo(*, a, b):
        pass
    require_args, size, default_size = require_arguments(foo) # 15.7μs -> 15.2μs (3.29% faster)

def test_callable_object():
    # Callable object with __call__ method
    class Foo:
        def __call__(self, a, b=2):
            pass
    f = Foo()
    require_args, size, default_size = require_arguments(f) # 32.9μs -> 31.4μs (4.90% faster)

def test_builtin_function():
    # Built-in function (should not fail)
    require_args, size, default_size = require_arguments(len) # 114μs -> 111μs (2.22% faster)

def test_lambda_function():
    # Lambda function with default argument
    f = lambda a, b=2: a + b
    require_args, size, default_size = require_arguments(f) # 18.6μs -> 17.9μs (3.62% faster)

def test_function_with_annotations():
    # Function with annotated arguments
    def foo(a: int, b: str = "hi"):
        pass
    require_args, size, default_size = require_arguments(foo) # 17.5μs -> 16.2μs (7.78% faster)

def test_function_with_varargs_and_kwargs_and_defaults():
    # Complex signature
    def foo(a, b=2, *args, c, d=4, **kwargs):
        pass
    require_args, size, default_size = require_arguments(foo) # 21.8μs -> 20.7μs (5.17% faster)

# 3. Large Scale Test Cases

def test_large_number_of_positional_arguments():
    # Function with 1000 positional arguments
    code = "def foo(" + ", ".join(f"a{i}" for i in range(1000)) + "): pass"
    ns = {}
    exec(code, ns)
    foo = ns["foo"]
    require_args, size, default_size = require_arguments(foo) # 781μs -> 752μs (3.94% faster)




#------------------------------------------------
import inspect

# imports
import pytest  # used for our unit tests
from nvflare.fuel.utils.function_utils import require_arguments

# unit tests

# -------------- BASIC TEST CASES --------------

def test_no_arguments():
    # Function with no arguments
    def f(): pass
    req, size, default_size = require_arguments(f) # 17.1μs -> 16.2μs (5.69% faster)

def test_one_required_argument():
    # Function with a single required argument
    def f(a): pass
    req, size, default_size = require_arguments(f) # 18.7μs -> 17.3μs (8.62% faster)

def test_one_default_argument():
    # Function with a single argument with default value
    def f(a=42): pass
    req, size, default_size = require_arguments(f) # 16.9μs -> 15.6μs (8.23% faster)

def test_multiple_required_arguments():
    # Function with multiple required arguments
    def f(a, b, c): pass
    req, size, default_size = require_arguments(f) # 18.1μs -> 17.2μs (4.83% faster)

def test_mixed_required_and_default_arguments():
    # Function with required and default arguments
    def f(a, b=1, c=2): pass
    req, size, default_size = require_arguments(f) # 18.9μs -> 17.9μs (5.66% faster)

def test_all_default_arguments():
    # Function with all arguments having defaults
    def f(a=1, b=2, c=3): pass
    req, size, default_size = require_arguments(f) # 17.6μs -> 17.2μs (2.33% faster)

def test_varargs_only():
    # Function with only *args
    def f(*args): pass
    req, size, default_size = require_arguments(f) # 14.8μs -> 14.1μs (5.35% faster)

def test_kwargs_only():
    # Function with only **kwargs
    def f(**kwargs): pass
    req, size, default_size = require_arguments(f) # 14.9μs -> 14.3μs (3.83% faster)

def test_varargs_and_kwargs():
    # Function with both *args and **kwargs
    def f(*args, **kwargs): pass
    req, size, default_size = require_arguments(f) # 16.1μs -> 15.6μs (3.09% faster)

def test_positional_and_varargs_kwargs():
    # Function with required, default, *args, and **kwargs
    def f(a, b=2, *args, **kwargs): pass
    req, size, default_size = require_arguments(f) # 19.8μs -> 18.7μs (6.19% faster)

# -------------- EDGE TEST CASES --------------

def test_keyword_only_arguments():
    # Function with keyword-only arguments
    def f(*, a, b=2): pass
    req, size, default_size = require_arguments(f) # 16.7μs -> 15.6μs (6.80% faster)

def test_positional_only_arguments():
    # Python 3.8+: positional-only arguments with "/"
    def f(a, b, /, c, d=2): pass
    req, size, default_size = require_arguments(f) # 19.3μs -> 18.1μs (6.52% faster)

def test_no_parameters_but_varargs_and_kwargs():
    # Function with only *args and **kwargs, no named parameters
    def f(*args, **kwargs): pass
    req, size, default_size = require_arguments(f) # 16.0μs -> 15.1μs (6.05% faster)

def test_required_and_keyword_only():
    # Function with required positional and required keyword-only
    def f(a, *, b): pass
    req, size, default_size = require_arguments(f) # 16.0μs -> 15.1μs (5.59% faster)

def test_required_and_varargs_and_keyword_only():
    # Function with required positional, *args, and required keyword-only
    def f(a, *args, b): pass
    req, size, default_size = require_arguments(f) # 17.0μs -> 16.1μs (5.94% faster)

def test_all_argument_types():
    # Function with all argument types
    def f(a, b=1, *args, c, d=2, **kwargs): pass
    req, size, default_size = require_arguments(f) # 21.4μs -> 20.6μs (3.70% faster)

def test_method_self_argument():
    # Method with 'self' argument
    class C:
        def method(self, a, b=2): pass
    req, size, default_size = require_arguments(C.method) # 18.3μs -> 17.5μs (4.12% faster)

def test_static_method():
    # Static method
    class C:
        @staticmethod
        def method(a, b=2): pass
    req, size, default_size = require_arguments(C.method) # 16.6μs -> 16.1μs (3.00% faster)

def test_class_method():
    # Class method
    class C:
        @classmethod
        def method(cls, a, b=2): pass
    req, size, default_size = require_arguments(C.method) # 24.7μs -> 23.4μs (5.74% faster)

def test_lambda_function():
    # Lambda function
    f = lambda a, b=2: a + b
    req, size, default_size = require_arguments(f) # 15.9μs -> 15.2μs (4.13% faster)

def test_builtin_function():
    # Built-in function (should not raise)
    req, size, default_size = require_arguments(len) # 115μs -> 112μs (3.06% faster)

def test_partial_function():
    # functools.partial should be handled as a normal function
    import functools
    def f(a, b, c=3): pass
    pf = functools.partial(f, 1)
    req, size, default_size = require_arguments(pf) # 42.8μs -> 41.8μs (2.18% faster)

def test_function_with_annotations():
    # Function with type annotations
    def f(a: int, b: str = "x"): pass
    req, size, default_size = require_arguments(f) # 16.9μs -> 16.1μs (5.22% faster)

# -------------- LARGE SCALE TEST CASES --------------

def test_many_required_arguments():
    # Function with 100 required arguments
    code = "def f(" + ",".join(f"a{i}" for i in range(100)) + "): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 96.2μs -> 92.2μs (4.35% faster)

def test_many_default_arguments():
    # Function with 100 default arguments
    code = "def f(" + ",".join(f"a{i}=0" for i in range(100)) + "): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 107μs -> 103μs (4.48% faster)

def test_many_mixed_arguments():
    # Function with 50 required and 50 default arguments
    code = "def f(" + ",".join(f"a{i}" for i in range(50)) + "," + ",".join(f"b{i}=0" for i in range(50)) + "): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 101μs -> 98.1μs (3.74% faster)

def test_large_varargs_and_kwargs():
    # Function with many arguments, *args, **kwargs
    code = "def f(" + ",".join(f"a{i}=0" for i in range(500)) + ",*args,**kwargs): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 447μs -> 430μs (4.00% faster)

def test_large_keyword_only_arguments():
    # Function with many keyword-only arguments
    code = "def f(*, " + ",".join(f"a{i}=0" for i in range(500)) + "): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 493μs -> 446μs (10.5% faster)

def test_large_positional_and_keyword_only():
    # Function with 250 positional, 250 keyword-only arguments
    code = "def f(" + ",".join(f"a{i}" for i in range(250)) + ", *, " + ",".join(f"b{i}=0" for i in range(250)) + "): pass"
    ns = {}
    exec(code, ns)
    f = ns["f"]
    req, size, default_size = require_arguments(f) # 427μs -> 409μs (4.47% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, check out the branch with git checkout codeflash/optimize-require_arguments-mhcf5u09, then push your commits.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 19:59
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 29, 2025