Contributor

@kavehshahedi kavehshahedi commented Feb 27, 2025

What it does

The original implementation created and processed sliding windows one by one, resulting in hour-long processing times for million-row datasets. The new implementation maintains identical detection logic but uses a vectorized cumulative sum approach to calculate all window means at once, dramatically reducing computation time.
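The cumulative sum idea described above can be sketched as follows. This is a minimal illustration with NumPy, not the code merged in this PR: the function names `rolling_mean_loop` and `rolling_mean_cumsum` and the `window` parameter are hypothetical, and the detection logic that consumes the window means is omitted. The sketch relies on the fact that the sum of any window equals the difference of two prefix sums, so all window means come out of one pass over the data.

```python
import numpy as np


def rolling_mean_loop(values, window):
    """Baseline for comparison: slice and average each window one by one."""
    values = np.asarray(values, dtype=float)
    return np.array([
        values[i:i + window].mean()
        for i in range(len(values) - window + 1)
    ])


def rolling_mean_cumsum(values, window):
    """Compute every sliding-window mean at once from prefix sums.

    Hypothetical helper for illustration; not the project's actual API.
    """
    values = np.asarray(values, dtype=float)
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    # Prepend 0 so that csum[k] is the sum of the first k values.
    csum = np.concatenate(([0.0], np.cumsum(values)))
    # Sum of values[i:i + window] is csum[i + window] - csum[i].
    return (csum[window:] - csum[:-window]) / window


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(rolling_mean_cumsum(data, 3))  # [2. 3. 4.]
```

One caveat on this sketch: prefix sums accumulate floating-point rounding, so on long series the two functions agree to within a small tolerance rather than bit for bit. Comparing them with `np.allclose` on a sample of the data is a reasonable sanity check.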

This PR also aims to fix the excessive calculation time reported in a previous pull request.

How to test

Initialize the AnomalyDetection module with your custom outputs (e.g., CPU Usage, Memory Usage). On a very large dataset (e.g., millions of data points), you should now observe a significant performance improvement when detecting anomalies.

Follow-ups

N/A

Review checklist

  • As an author, I have thoroughly tested my changes and carefully followed the instructions in this template

Previously, we were creating each sliding window one by one, which took
forever on large datasets. Now we use a cumulative sum approach that gives
identical results but runs way faster on extremely large datasets.

Signed-off-by: Kaveh Shahedi <kaveh.shahedi@ericsson.com>
Contributor

@bhufmann bhufmann left a comment

Thanks for this contribution. It improves performance significantly.

@kavehshahedi kavehshahedi merged commit 67fe0e1 into eclipse-tmll:main Feb 28, 2025
4 checks passed
@kavehshahedi kavehshahedi deleted the window-optimization branch March 6, 2025 17:03

2 participants