⚡️ Speed up function `find_last_node` by 10,383% #250

codeflash-ai · 2026-01-08T08:10:54Z

📄 10,383% (103.83x) speedup for `find_last_node` in `src/algorithms/graph.py`

⏱️ Runtime : 46.5 milliseconds → 443 microseconds (best of 240 runs)

📝 Explanation and details

The optimized code achieves a ~104x speedup by eliminating a quadratic complexity bottleneck in the original implementation.

Key Performance Problem in Original Code:
The original code uses a nested iteration pattern: for each node, it checks `all(e["source"] != n["id"] for e in edges)`. This means:

- For every node examined, the edge list is rescanned until an edge with that node as its source is found (the full list for a terminal node)
- Time complexity: O(nodes × edges) in the worst case, i.e. quadratic behavior
- With 1000 nodes and 999 edges (linear chain), this results in roughly 500k comparisons, since `all` short-circuits at the first match (the upper bound is ~1M)
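
To make the cost concrete, here is a minimal sketch of the nested pattern described above. The name `find_last_node_naive` and the helper `count_comparisons` are hypothetical: they are reconstructed from the quoted `all(...)` expression, not copied from `src/algorithms/graph.py`. Because `all` stops at the first matching edge, the linear chain performs about half of the 1000 × 999 upper bound.

```python
def find_last_node_naive(nodes, edges):
    """Hypothetical reconstruction of the original: rescan edges for every node."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def count_comparisons(nodes, edges):
    """Count the source/id comparisons the nested scan makes before returning."""
    comparisons = 0
    for n in nodes:
        is_terminal = True
        for e in edges:
            comparisons += 1
            if e["source"] == n["id"]:
                is_terminal = False  # all() would stop here as well
                break
        if is_terminal:
            break  # next() stops at the first terminal node
    return comparisons


# Linear chain 0 -> 1 -> ... -> 999, as in the generated large-scale test.
N = 1000
nodes = [{"id": i} for i in range(N)]
edges = [{"source": i, "target": i + 1} for i in range(N - 1)]

print(find_last_node_naive(nodes, edges))  # {'id': 999}
print(count_comparisons(nodes, edges))     # 500499
```

The count grows quadratically with the chain length, which is why the 1000-node case dominates the measured runtime.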

The Optimization:
The optimized version performs a one-time preprocessing step:

sources = {e["source"] for e in edges}  # Build set of all source IDs
return next((n for n in nodes if n["id"] not in sources), None)  # O(1) lookup per node

This changes the complexity from O(nodes × edges) to O(nodes + edges), where:

1. Building the `sources` set: O(edges), a single pass through edges
2. Finding the terminal node: O(nodes) with O(1) set membership checks
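
Put together, the two lines above might sit in a function like the one below. This is a sketch under assumptions: the signature `find_last_node(nodes, edges)` is inferred from the generated tests, and `find_last_node_naive` is a hypothetical reconstruction of the original, included only so the two can be compared on random small graphs.

```python
import random


def find_last_node(nodes, edges):
    """Return the first node that is never an edge source, or None."""
    sources = {e["source"] for e in edges}  # O(edges): one pass over edges
    return next((n for n in nodes if n["id"] not in sources), None)  # O(nodes)


def find_last_node_naive(nodes, edges):
    """Hypothetical reconstruction of the original nested scan."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


# Spot-check that both versions agree on small random graphs with integer ids.
rng = random.Random(0)
for _ in range(200):
    size = rng.randint(1, 8)
    nodes = [{"id": i} for i in range(size)]
    edges = [
        {"source": rng.randrange(size), "target": rng.randrange(size)}
        for _ in range(rng.randint(0, 12))
    ]
    assert find_last_node(nodes, edges) == find_last_node_naive(nodes, edges)

print("both versions agree on 200 random graphs")
```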

Why This Works:

- Set membership testing (`n["id"] not in sources`) is O(1) average case in Python
- Dictionary/set lookups use hash tables, providing average-case constant-time access
- The set is built once and then reused for every node check, so its cost is paid a single time

Performance by Test Case Type:

- Small graphs (2-3 nodes): 25-100% faster - overhead of set creation is minimal
- Linear chains (1000 nodes): ~330x faster (18.4ms → 55.7μs) - eliminates catastrophic quadratic behavior
- Dense graphs (100 nodes, 9900 edges): ~87x faster (16.9ms → 193μs) - set lookup vastly superior to repeated edge iteration
- Star graphs (1 source, 999 targets): 88% faster - single set entry, fast lookups
- Empty inputs (no nodes): 5-20% slower due to set creation cost, but negligible in absolute terms (roughly 40-210 ns per call)
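
The figures above come from the Codeflash run. A rough way to reproduce the trend locally is sketched below with the standard-library `timeit` module. The function names and the `linear_chain` helper are hypothetical stand-ins, and absolute timings will differ by machine, so no expected numbers are shown.

```python
import timeit


def find_last_node_naive(nodes, edges):
    """Hypothetical reconstruction of the original nested scan."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes, edges):
    """Set-based version, as described in this PR."""
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


def linear_chain(n):
    """Build the chain 0 -> 1 -> ... -> n-1 used by the large-scale tests."""
    nodes = [{"id": i} for i in range(n)]
    edges = [{"source": i, "target": i + 1} for i in range(n - 1)]
    return nodes, edges


def time_call(fn, nodes, edges, repeat=5):
    """Best-of-`repeat` wall time in seconds for a single call."""
    return min(timeit.repeat(lambda: fn(nodes, edges), number=1, repeat=repeat))


for n in (10, 100, 1000):
    nodes, edges = linear_chain(n)
    slow = time_call(find_last_node_naive, nodes, edges)
    fast = time_call(find_last_node_set, nodes, edges)
    print(f"n={n:>4}: naive {slow * 1e6:10.1f} µs | set {fast * 1e6:8.1f} µs")
```

Taking the minimum over several repeats mirrors the "best of N runs" style of the reported runtime and reduces noise from other processes.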

Impact:
This optimization is particularly valuable when the function is called frequently or on larger graphs, as the speedup grows with the number of nodes and edges (25-100% on 2-3 node graphs, over 300x on a 1000-node chain). The test results show orders-of-magnitude improvements for graphs with hundreds of nodes while producing the same results as the original code on all 51 generated regression tests.
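
The generated tests agree for both versions, but the two formulations are not equivalent for every conceivable input. The sketch below uses hypothetical reconstructions of both versions to show two inputs where they appear to diverge: an unhashable `source` value, and a malformed edge combined with an empty `nodes` list. Whether either case matters depends on what callers actually pass.

```python
def find_last_node_naive(nodes, edges):
    """Hypothetical reconstruction of the original nested scan."""
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_set(nodes, edges):
    """Set-based version, as described in this PR."""
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


# Case 1: an unhashable source (a list). `!=` works on lists, a set cannot hold one.
nodes = [{"id": 1}]
edges = [{"source": [1], "target": 2}]
print(find_last_node_naive(nodes, edges))  # {'id': 1}
try:
    find_last_node_set(nodes, edges)
except TypeError as exc:
    print("set version raised", type(exc).__name__)  # set version raised TypeError

# Case 2: no nodes, and an edge missing "source". The lazy scan never touches
# the edge; the set is built eagerly and raises.
bad_edges = [{"target": 2}]
print(find_last_node_naive([], bad_edges))  # None
try:
    find_last_node_set([], bad_edges)
except KeyError as exc:
    print("set version raised", type(exc).__name__)  # set version raised KeyError
```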

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 51 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests

from __future__ import annotations

# imports
import pytest  # used for our unit tests
from src.algorithms.graph import find_last_node

# unit tests

# -----------------------------
# 1. Basic Test Cases
# -----------------------------


def test_single_node_no_edges():
    # Only one node, no edges. That node is the last node.
    nodes = [{"id": 1}]
    edges = []
    codeflash_output = find_last_node(nodes, edges)  # 1.25μs -> 1.00μs (25.0% faster)


def test_two_nodes_one_edge():
    # Two nodes, one edge from 1 to 2. Node 2 is the last node.
    nodes = [{"id": 1}, {"id": 2}]
    edges = [{"source": 1, "target": 2}]
    codeflash_output = find_last_node(nodes, edges)  # 1.79μs -> 1.17μs (53.6% faster)


def test_three_nodes_chain():
    # Three nodes in a chain: 1->2->3. Node 3 is the last node.
    nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
    edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]
    codeflash_output = find_last_node(nodes, edges)  # 2.08μs -> 1.25μs (66.6% faster)


def test_multiple_terminal_nodes():
    # 1->2, 1->3; both 2 and 3 are terminal, function should return first found (2).
    nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
    edges = [{"source": 1, "target": 2}, {"source": 1, "target": 3}]
    codeflash_output = find_last_node(nodes, edges)
    result = codeflash_output  # 1.83μs -> 1.21μs (51.7% faster)


# -----------------------------
# 2. Edge Test Cases
# -----------------------------


def test_empty_nodes_and_edges():
    # No nodes or edges. Should return None.
    nodes = []
    edges = []
    codeflash_output = find_last_node(nodes, edges)  # 750ns -> 917ns (18.2% slower)


def test_no_edges_multiple_nodes():
    # Multiple nodes, no edges. All nodes are terminal, should return first node.
    nodes = [{"id": 10}, {"id": 20}, {"id": 30}]
    edges = []
    codeflash_output = find_last_node(nodes, edges)  # 1.25μs -> 1.00μs (25.0% faster)


def test_all_nodes_have_outgoing_edges():
    # Each node is a source in at least one edge, so no terminal node.
    nodes = [{"id": 1}, {"id": 2}]
    edges = [{"source": 1, "target": 2}, {"source": 2, "target": 1}]
    codeflash_output = find_last_node(nodes, edges)  # 1.83μs -> 1.21μs (51.7% faster)


def test_edges_with_unknown_sources():
    # Edge refers to a source not in nodes. Should not affect result.
    nodes = [{"id": 1}, {"id": 2}]
    edges = [{"source": 3, "target": 1}]
    # Both nodes are not source of any edge, so first node is returned.
    codeflash_output = find_last_node(nodes, edges)  # 1.50μs -> 1.17μs (28.6% faster)


def test_duplicate_nodes():
    # Duplicate nodes with same id. Should return the first one not a source.
    nodes = [{"id": 1}, {"id": 1}, {"id": 2}]
    edges = [{"source": 1, "target": 2}]
    # Node with id 2 is not a source, so should be returned.
    codeflash_output = find_last_node(nodes, edges)  # 2.04μs -> 1.21μs (69.0% faster)


def test_node_with_non_integer_id():
    # IDs are strings.
    nodes = [{"id": "A"}, {"id": "B"}]
    edges = [{"source": "A", "target": "B"}]
    codeflash_output = find_last_node(nodes, edges)  # 1.88μs -> 1.17μs (60.7% faster)


def test_node_with_mixed_type_ids():
    # IDs are mixed types.
    nodes = [{"id": 1}, {"id": "2"}]
    edges = [{"source": 1, "target": "2"}]
    codeflash_output = find_last_node(nodes, edges)  # 1.88μs -> 1.17μs (60.7% faster)


def test_edge_with_extra_keys():
    # Edge dicts have extra keys; should be ignored.
    nodes = [{"id": 1}, {"id": 2}]
    edges = [{"source": 1, "target": 2, "weight": 10}]
    codeflash_output = find_last_node(nodes, edges)  # 1.79μs -> 1.17μs (53.6% faster)


def test_node_dict_with_extra_keys():
    # Node dicts have extra keys; should be ignored.
    nodes = [{"id": 1, "label": "A"}, {"id": 2, "label": "B"}]
    edges = [{"source": 1, "target": 2}]
    codeflash_output = find_last_node(nodes, edges)  # 1.79μs -> 1.17μs (53.6% faster)


def test_cycle_graph():
    # Graph with a cycle: 1->2->3->1. No terminal node.
    nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
    edges = [
        {"source": 1, "target": 2},
        {"source": 2, "target": 3},
        {"source": 3, "target": 1},
    ]
    codeflash_output = find_last_node(nodes, edges)  # 2.21μs -> 1.29μs (71.0% faster)


def test_disconnected_graph():
    # Two disconnected components, one is a chain, one is a single node.
    nodes = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]
    edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]
    # Node 3 and 4 are terminal, function returns first found (3).
    codeflash_output = find_last_node(nodes, edges)  # 2.12μs -> 1.25μs (70.0% faster)


def test_node_id_none():
    # Node with id None.
    nodes = [{"id": None}, {"id": 2}]
    edges = [{"source": 2, "target": None}]
    # Node with id None is not a source, so should be returned.
    codeflash_output = find_last_node(nodes, edges)  # 1.58μs -> 1.12μs (40.8% faster)


def test_edges_with_missing_source_key():
    # Edge missing 'source' key should raise KeyError.
    nodes = [{"id": 1}, {"id": 2}]
    edges = [{"target": 2}]
    with pytest.raises(KeyError):
        find_last_node(nodes, edges)  # 1.88μs -> 875ns (114% faster)


def test_large_linear_chain():
    # 1000 nodes in a chain: 0->1->2->...->999
    N = 1000
    nodes = [{"id": i} for i in range(N)]
    edges = [{"source": i, "target": i + 1} for i in range(N - 1)]
    codeflash_output = find_last_node(nodes, edges)  # 18.4ms -> 55.7μs (32941% faster)


def test_large_star_graph():
    # One central node (0) points to 999 others.
    N = 1000
    nodes = [{"id": i} for i in range(N)]
    edges = [{"source": 0, "target": i} for i in range(1, N)]
    # All nodes except 0 are terminal; function returns first found (id=1).
    codeflash_output = find_last_node(nodes, edges)  # 38.5μs -> 20.5μs (88.2% faster)


def test_large_fully_connected_graph():
    # Every node is a source in at least one edge; no terminal node.
    N = 100
    nodes = [{"id": i} for i in range(N)]
    edges = [{"source": i, "target": j} for i in range(N) for j in range(N) if i != j]
    codeflash_output = find_last_node(nodes, edges)  # 16.9ms -> 193μs (8654% faster)


def test_large_disconnected_nodes():
    # 1000 nodes, no edges. All are terminal; function returns first.
    N = 1000
    nodes = [{"id": i} for i in range(N)]
    edges = []
    codeflash_output = find_last_node(nodes, edges)  # 1.25μs -> 1.08μs (15.4% faster)


def test_large_graph_with_duplicate_ids():
    # 500 nodes with id=1, 500 with id=2, one edge from 1 to 2.
    nodes = [{"id": 1} for _ in range(500)] + [{"id": 2} for _ in range(500)]
    edges = [{"source": 1, "target": 2}]
    # Node with id=2 is terminal, should return first such node.
    codeflash_output = find_last_node(nodes, edges)
    result = codeflash_output  # 99.5μs -> 12.5μs (696% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import pytest  # used for our unit tests
from src.algorithms.graph import find_last_node

# unit tests


class TestFindLastNodeBasic:
    """Basic test cases for normal operation"""

    def test_simple_chain_two_nodes(self):
        """Test a simple chain: A -> B, where B is the terminal"""
        nodes = [{"id": "A"}, {"id": "B"}]
        edges = [{"source": "A", "target": "B"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.96μs -> 1.21μs (62.2% faster)

    def test_simple_chain_three_nodes(self):
        """Test a chain: A -> B -> C, where C is the terminal"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = [{"source": "A", "target": "B"}, {"source": "B", "target": "C"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.38μs -> 1.21μs (96.4% faster)

    def test_branching_graph_single_terminal(self):
        """Test a branching graph: A -> B, A -> C, B -> D, C -> D"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}, {"id": "D"}]
        edges = [
            {"source": "A", "target": "B"},
            {"source": "A", "target": "C"},
            {"source": "B", "target": "D"},
            {"source": "C", "target": "D"},
        ]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.96μs -> 1.42μs (109% faster)

    def test_nodes_with_additional_properties(self):
        """Test that nodes with extra properties work correctly"""
        nodes = [
            {"id": "start", "name": "Start Node", "value": 100},
            {"id": "end", "name": "End Node", "value": 200},
        ]
        edges = [{"source": "start", "target": "end", "weight": 5}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.92μs -> 1.17μs (64.2% faster)


class TestFindLastNodeMultipleTerminals:
    """Test cases where multiple nodes could be terminals"""

    def test_two_terminal_nodes_returns_first(self):
        """Test that when multiple terminals exist, the first one in nodes list is returned"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = [{"source": "A", "target": "B"}]
        # Both B and C are never sources, should return first one found (B)
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.88μs -> 1.17μs (60.8% faster)

    def test_multiple_disconnected_terminals(self):
        """Test graph with disconnected components having multiple terminals"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}, {"id": "D"}]
        edges = [{"source": "A", "target": "B"}, {"source": "C", "target": "D"}]
        # B and D are both terminals, should return first one (B)
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.88μs -> 1.21μs (55.2% faster)

    def test_all_nodes_are_terminals_no_edges(self):
        """Test when there are no edges, all nodes are terminals"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = []
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.25μs -> 1.00μs (25.0% faster)


class TestFindLastNodeNoTerminals:
    """Test cases where no terminal node exists"""

    def test_circular_graph_no_terminal(self):
        """Test a circular graph where every node is a source"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = [
            {"source": "A", "target": "B"},
            {"source": "B", "target": "C"},
            {"source": "C", "target": "A"},
        ]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.38μs -> 1.29μs (83.8% faster)

    def test_self_loop_single_node(self):
        """Test a single node with a self-loop"""
        nodes = [{"id": "A"}]
        edges = [{"source": "A", "target": "A"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.46μs -> 1.12μs (29.6% faster)

    def test_complete_graph_no_terminal(self):
        """Test a complete graph where every node points to every other node"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = [
            {"source": "A", "target": "B"},
            {"source": "A", "target": "C"},
            {"source": "B", "target": "A"},
            {"source": "B", "target": "C"},
            {"source": "C", "target": "A"},
            {"source": "C", "target": "B"},
        ]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.58μs -> 1.38μs (87.9% faster)


class TestFindLastNodeEmptyInputs:
    """Test cases with empty or minimal inputs"""

    def test_empty_nodes_empty_edges(self):
        """Test with both empty lists"""
        nodes = []
        edges = []
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 791ns -> 834ns (5.16% slower)

    def test_empty_nodes_with_edges(self):
        """Test with empty nodes but edges present (unusual case)"""
        nodes = []
        edges = [{"source": "A", "target": "B"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 791ns -> 1.00μs (20.9% slower)

    def test_single_node_no_edges(self):
        """Test with a single node and no edges"""
        nodes = [{"id": "A"}]
        edges = []
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.29μs -> 1.04μs (24.1% faster)

    def test_nodes_with_empty_edges(self):
        """Test multiple nodes with no edges"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = []
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.29μs -> 958ns (34.8% faster)


class TestFindLastNodeComplexStructures:
    """Test cases with complex or unusual structures"""

    def test_edges_with_nonexistent_sources(self):
        """Test edges that reference nodes not in the nodes list"""
        nodes = [{"id": "A"}, {"id": "B"}]
        edges = [{"source": "X", "target": "A"}, {"source": "Y", "target": "B"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.67μs -> 1.17μs (42.8% faster)

    def test_duplicate_edges(self):
        """Test with duplicate edges"""
        nodes = [{"id": "A"}, {"id": "B"}]
        edges = [
            {"source": "A", "target": "B"},
            {"source": "A", "target": "B"},
            {"source": "A", "target": "B"},
        ]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.12μs -> 1.25μs (70.0% faster)

    def test_multiple_self_loops(self):
        """Test nodes with self-loops and other edges"""
        nodes = [{"id": "A"}, {"id": "B"}, {"id": "C"}]
        edges = [
            {"source": "A", "target": "A"},
            {"source": "A", "target": "B"},
            {"source": "B", "target": "B"},
            {"source": "B", "target": "C"},
        ]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.46μs -> 1.29μs (90.2% faster)

    def test_edges_with_extra_properties(self):
        """Test that edges with extra properties are handled correctly"""
        nodes = [{"id": "A"}, {"id": "B"}]
        edges = [{"source": "A", "target": "B", "weight": 10, "color": "red"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.88μs -> 1.17μs (60.7% faster)

    def test_node_ids_as_numbers(self):
        """Test with numeric node IDs"""
        nodes = [{"id": 1}, {"id": 2}, {"id": 3}]
        edges = [{"source": 1, "target": 2}, {"source": 2, "target": 3}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.33μs -> 1.17μs (100% faster)

    def test_mixed_type_node_ids(self):
        """Test with mixed type node IDs (strings and numbers)"""
        nodes = [{"id": "A"}, {"id": 1}, {"id": "B"}]
        edges = [{"source": "A", "target": 1}, {"source": 1, "target": "B"}]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 2.54μs -> 1.25μs (103% faster)


class TestFindLastNodeLargeScale:
    """Large scale test cases for performance and scalability"""

    def test_long_linear_chain(self):
        """Test a long linear chain of nodes"""
        # Create a chain of 500 nodes: 0 -> 1 -> 2 -> ... -> 499
        num_nodes = 500
        nodes = [{"id": i} for i in range(num_nodes)]
        edges = [{"source": i, "target": i + 1} for i in range(num_nodes - 1)]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 4.64ms -> 28.7μs (16086% faster)

    def test_dense_graph_single_terminal(self):
        """Test a dense graph where many nodes point to a single terminal"""
        # Create 200 nodes, all pointing to node 200 (the terminal)
        num_sources = 200
        nodes = [{"id": i} for i in range(num_sources + 1)]
        edges = [{"source": i, "target": num_sources} for i in range(num_sources)]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 784μs -> 11.0μs (7063% faster)

    def test_wide_graph_many_parallel_paths(self):
        """Test a graph with many parallel paths converging to one terminal"""
        # Create 100 parallel paths: (0,1,2) -> 300, (3,4,5) -> 300, etc.
        num_paths = 100
        nodes_per_path = 3
        terminal_id = num_paths * nodes_per_path

        nodes = [{"id": i} for i in range(terminal_id + 1)]
        edges = []

        for path in range(num_paths):
            base = path * nodes_per_path
            # Create chain within path
            for i in range(nodes_per_path - 1):
                edges.append({"source": base + i, "target": base + i + 1})
            # Connect last node of path to terminal
            edges.append({"source": base + nodes_per_path - 1, "target": terminal_id})

        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.69ms -> 16.2μs (10341% faster)

    def test_many_disconnected_terminals(self):
        """Test a graph with many disconnected components, each with a terminal"""
        # Create 100 disconnected pairs: 0->1, 2->3, 4->5, etc.
        num_pairs = 100
        nodes = [{"id": i} for i in range(num_pairs * 2)]
        edges = [{"source": i * 2, "target": i * 2 + 1} for i in range(num_pairs)]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 6.08μs -> 4.46μs (36.4% faster)

    def test_large_circular_graph(self):
        """Test a large circular graph with no terminal"""
        # Create a circle of 300 nodes
        num_nodes = 300
        nodes = [{"id": i} for i in range(num_nodes)]
        edges = [{"source": i, "target": (i + 1) % num_nodes} for i in range(num_nodes)]
        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.69ms -> 16.2μs (10325% faster)

    def test_binary_tree_structure(self):
        """Test a binary tree structure with multiple leaf nodes"""
        # Create a binary tree with depth 8 (255 nodes)
        # Nodes 0-126 are internal, 127-254 are leaves
        num_nodes = 255
        nodes = [{"id": i} for i in range(num_nodes)]
        edges = []

        # Create binary tree edges (parent i has children 2i+1 and 2i+2)
        for i in range(127):
            left_child = 2 * i + 1
            right_child = 2 * i + 2
            if left_child < num_nodes:
                edges.append({"source": i, "target": left_child})
            if right_child < num_nodes:
                edges.append({"source": i, "target": right_child})

        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 618μs -> 10.8μs (5656% faster)

    def test_large_graph_with_many_edges_per_node(self):
        """Test a graph where each node has many outgoing edges"""
        # Create 50 nodes, each of first 49 points to all subsequent nodes
        num_nodes = 50
        nodes = [{"id": i} for i in range(num_nodes)]
        edges = []

        for i in range(num_nodes - 1):
            for j in range(i + 1, num_nodes):
                edges.append({"source": i, "target": j})

        codeflash_output = find_last_node(nodes, edges)
        result = codeflash_output  # 1.51ms -> 28.7μs (5176% faster)


class TestFindLastNodeOrderDependence:
    """Test that the function returns the first terminal in iteration order"""

    def test_order_matters_when_multiple_terminals(self):
        """Verify that node order in list determines which terminal is returned"""
        # Test with different orderings
        nodes1 = [{"id": "X"}, {"id": "Y"}, {"id": "Z"}]
        edges1 = [{"source": "X", "target": "Y"}]
        codeflash_output = find_last_node(nodes1, edges1)
        result1 = codeflash_output  # 1.96μs -> 1.17μs (67.9% faster)

        nodes2 = [{"id": "X"}, {"id": "Z"}, {"id": "Y"}]
        edges2 = [{"source": "X", "target": "Y"}]
        codeflash_output = find_last_node(nodes2, edges2)
        result2 = codeflash_output  # 1.04μs -> 541ns (92.4% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
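
The generated tests store the result in `codeflash_output` and rely on Codeflash comparing it between the original and optimized code, so they contain no literal assertions. If some of them were kept as ordinary unit tests, the expected values stated in their comments could be asserted directly. A minimal sketch follows; the inline `find_last_node` is a stand-in so the example runs on its own (in the repository it would be imported from `src.algorithms.graph`), and the `CASES` table is a hypothetical selection drawn from the tests above.

```python
def find_last_node(nodes, edges):
    """Stand-in for src.algorithms.graph.find_last_node (assumed behavior)."""
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


# (nodes, edges, expected) triples taken from comments in the generated tests.
CASES = [
    # Single node, no edges: that node is returned.
    ([{"id": 1}], [], {"id": 1}),
    # 1 -> 2: node 2 is terminal.
    ([{"id": 1}, {"id": 2}], [{"source": 1, "target": 2}], {"id": 2}),
    # 1 -> 2 and 1 -> 3: the first terminal in list order wins.
    (
        [{"id": 1}, {"id": 2}, {"id": 3}],
        [{"source": 1, "target": 2}, {"source": 1, "target": 3}],
        {"id": 2},
    ),
    # No nodes at all: None.
    ([], [], None),
    # Two-node cycle: every node is a source, so None.
    (
        [{"id": 1}, {"id": 2}],
        [{"source": 1, "target": 2}, {"source": 2, "target": 1}],
        None,
    ),
    # Node order decides which terminal is returned (Z comes before Y).
    (
        [{"id": "X"}, {"id": "Z"}, {"id": "Y"}],
        [{"source": "X", "target": "Y"}],
        {"id": "Z"},
    ),
]


def test_expected_results():
    """Assert the literal expected value for every case in the table."""
    for nodes, edges, expected in CASES:
        assert find_last_node(nodes, edges) == expected


if __name__ == "__main__":
    test_expected_results()
    print(f"{len(CASES)} cases passed")
```

pytest would collect `test_expected_results` as written; the `__main__` guard only allows running the file directly.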

To edit these changes, run `git checkout codeflash/optimize-find_last_node-mk563xrs` and push.


codeflash-ai bot requested a review from KRRT7 January 8, 2026 08:10

codeflash-ai bot added the `⚡️ codeflash` (Optimization PR opened by Codeflash AI) and `🎯 Quality: High` (Optimization Quality according to Codeflash) labels Jan 8, 2026
