⚡️ Speed up function find_last_node by 5,297%
#245
📄 5,297% (52.97x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 5.81 milliseconds → 108 microseconds (best of 250 runs)
📝 Explanation and details
The optimization achieves a 53x speedup by eliminating nested iteration, a classic algorithmic improvement that replaces O(n×m) complexity with O(n+m), where n is the number of nodes and m is the number of edges.
What changed:
The original code checks every edge for every node using nested iteration: `all(e["source"] != n["id"] for e in edges)`. The optimized version pre-computes a set of all source node IDs once, `sources = {e["source"] for e in edges}`, then performs fast O(1) membership lookups: `n["id"] not in sources`.
Why this is faster:
Set membership (`in`) is O(1) in the average case because sets are backed by hash tables, while the `all()` check may iterate through every edge for each node, costing O(m) per node.
Performance characteristics by test case:
The line profiler confirms this: the original code spent 100% of its time (46.7ms) in the nested iteration, while the optimized version spends only 43.8% (149μs) building the set and 56.2% (191μs) doing lookups, for a total of 340μs versus 46.7ms.
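To make the change concrete, here is a minimal sketch of the two approaches side by side. Only the two expressions quoted above come from this PR; the function signature, the `next(..., None)` wrapper, the `_original`/`_optimized` suffixes, and the sample graph are assumptions made for illustration and may not match the code in `src/algorithms/graph.py`.

```python
from typing import Any, Optional

# Assumed shapes: nodes carry an "id" key, edges carry a "source" key.
Node = dict[str, Any]
Edge = dict[str, Any]


def find_last_node_original(nodes: list[Node], edges: list[Edge]) -> Optional[Node]:
    # For each node, scan the edges until one starts at that node.
    # Worst case this touches every edge for every node: O(n * m).
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes: list[Node], edges: list[Edge]) -> Optional[Node]:
    # Build the set of source IDs once: O(m).
    sources = {e["source"] for e in edges}
    # Each membership test is O(1) on average, so the node scan is O(n).
    return next((n for n in nodes if n["id"] not in sources), None)


# Hypothetical three-node chain a -> b -> c; only "c" has no outgoing edge.
nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]

assert find_last_node_original(nodes, edges) == {"id": "c"}
assert find_last_node_optimized(nodes, edges) == {"id": "c"}
```

One trade-off of the set-based version: every `e["source"]` value must be hashable, which holds for string or integer IDs but not for lists or dicts.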
Impact considerations:
Without `function_references`, we cannot determine whether this function is in a hot path. However, because it processes graph structures (potentially in flow/workflow systems, based on "flow" in the docstring), any system that repeatedly queries for terminal nodes would benefit significantly, especially as graph size scales.
✅ Correctness verification report:
🌀 Click to see Generated Regression Tests
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mk3kjwyw` and push.