⚡️ Speed up function find_last_node by 13,556%
#246
📄 13,556% (135.56x) speedup for find_last_node in src/algorithms/graph.py
⏱️ Runtime: 25.4 milliseconds → 186 microseconds (best of 250 runs)
📝 Explanation and details
The optimized code achieves a 135x speedup by replacing a repeated scan of the edge list with a single precomputed set of source node IDs.
Key Optimization:
The original code uses a nested iteration pattern: for each candidate node, it iterates through the edges to verify that none has that node as its source. In the worst case it checks every edge for every node, giving O(n*m) complexity, where n is the number of nodes and m is the number of edges.
The optimized version pre-computes a set of all source node IDs once (O(m) operation), then performs constant-time membership checks (O(1) per node) as it iterates through nodes (O(n) total). This reduces the overall complexity to O(n+m).
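The two approaches described above can be sketched roughly as follows. This is an illustration, not the code from the PR: the function names find_last_node_nested and find_last_node_with_set are hypothetical, and the data shape (nodes as dicts with an "id" key, edges as dicts with "source" and "target" keys) is an assumption.

```python
from typing import Any, Optional

# Assumed data shape: nodes and edges are plain dicts (not confirmed by the PR).
Node = dict[str, Any]
Edge = dict[str, Any]


def find_last_node_nested(nodes: list[Node], edges: list[Edge]) -> Optional[Node]:
    """Nested scan: walk the edge list again for every candidate node, O(n*m)."""
    for node in nodes:
        if all(edge["source"] != node["id"] for edge in edges):
            return node
    return None


def find_last_node_with_set(nodes: list[Node], edges: list[Edge]) -> Optional[Node]:
    """Set lookup: collect source IDs once, then test membership, O(n+m)."""
    source_ids = {edge["source"] for edge in edges}  # one pass over the edges
    for node in nodes:
        if node["id"] not in source_ids:  # average O(1) membership check
            return node
    return None


nodes = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
edges = [{"source": "a", "target": "b"}, {"source": "b", "target": "c"}]
print(find_last_node_with_set(nodes, edges))  # {'id': 'c'}
```

Both versions return the first node that never appears as an edge source, or None if every node has an outgoing edge. The set-based version requires node IDs to be hashable.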
Why This Matters:
Performance Characteristics:
The speedup is most dramatic on large graphs:
Even tiny test cases show consistent improvements because building the source set is very cheap, while the original's nested iteration is expensive regardless of early termination.
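One way to observe this kind of difference locally is a small timeit comparison. The sketch below is not the benchmark used in the PR: the helper build_chain, the two function names, the dict-based data shape, and the graph size of 500 nodes are all assumptions chosen for illustration, and the absolute timings will vary by machine.

```python
import timeit


def build_chain(n):
    """Hypothetical test graph: a linear chain 0 -> 1 -> ... -> n-1."""
    nodes = [{"id": i} for i in range(n)]
    edges = [{"source": i, "target": i + 1} for i in range(n - 1)]
    return nodes, edges


def nested(nodes, edges):
    # Rescans the edge list for each node until a match is found.
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)), None
    )


def with_set(nodes, edges):
    # Builds the set of source IDs once, then does constant-time lookups.
    source_ids = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in source_ids), None)


nodes, edges = build_chain(500)

# Take the best of several repeats to reduce timing noise.
slow = min(timeit.repeat(lambda: nested(nodes, edges), number=1, repeat=5))
fast = min(timeit.repeat(lambda: with_set(nodes, edges), number=1, repeat=5))
print(f"nested: {slow * 1e3:.3f} ms, set-based: {fast * 1e3:.3f} ms")
```

A chain is a demanding input for the nested version because the only node without an outgoing edge is the last one, so nearly every node triggers a partial scan of the edge list before the answer is found.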
Trade-offs:
The optimization adds minimal memory overhead (one set storing source IDs) but dramatically reduces CPU cycles, making it beneficial across all workload sizes tested.
✅ Correctness verification report:
To edit these changes, run git checkout codeflash/optimize-find_last_node-mk3kwjau and push.