Performance Optimization: Reduce Redundant Computations in Polynomial Regression & Eye-Tracking Clustering #34

@Soumyals

Description

Issues Identified:
⚠️ Repeated instantiation of models (StandardScaler, PolynomialFeatures, LinearRegression) within loops.
⚠️ Suboptimal filtering & grouping of data using .groupby().apply().
⚠️ Multiple redundant DataFrame operations slowing down performance.
⚠️ Lack of efficient lookup for True vs. Predicted values causing slowdowns in dictionary operations.

Optimizations suggested:
✅ Reuse a single StandardScaler instance instead of reinitializing it multiple times.
✅ Create fit_polynomial_regression() function to avoid redundant model instantiations.
✅ Optimize groupby.apply() using NumPy for precision and accuracy calculations.
✅ Use .query() for DataFrame filtering instead of chained boolean conditions.
✅ Optimize dictionary creation with .set_index() for fast lookups.
✅ Set n_init='auto' in KMeans for better stability and efficiency.
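
A minimal sketch of the fit_polynomial_regression() suggestion, assuming scikit-learn. The issue names the function but not its arguments, so the signature, the default degree, and the demo data below are hypothetical. Wrapping the three estimators in one pipeline keeps their construction in a single place instead of repeating it inside each loop body.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def fit_polynomial_regression(X, y, degree=2):
    """Fit scaler -> polynomial features -> linear regression as one pipeline.

    Hypothetical signature: argument names and the default degree are
    assumptions, not taken from the project code.
    """
    model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=degree),
        LinearRegression(),
    )
    return model.fit(X, y)


# Synthetic demo data (not from the project): y = 2x^2 + 3x + 1
X_demo = np.linspace(-3, 3, 50).reshape(-1, 1)
y_demo = 2 * X_demo[:, 0] ** 2 + 3 * X_demo[:, 0] + 1

demo_model = fit_polynomial_regression(X_demo, y_demo, degree=2)
demo_pred = demo_model.predict(X_demo)
```

Each call still builds a fresh pipeline, which is what is needed when every group gets its own fit. If one fitted scaler should be shared across groups instead, it could be fitted once outside the loop and passed in; which of the two matches the original behavior depends on the existing code.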

Expected Behavior:
The optimized implementation should produce the same output as the original.
Performance should improve significantly by reducing redundant computations.
The clustering and prediction results should remain stable across multiple runs.
