-
Notifications
You must be signed in to change notification settings - Fork 3
Sed 4415 optimize reporting request and query performance #566
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 29
Are you sure you want to change the base?
Sed 4415 optimize reporting request and query performance #566
Conversation
This reverts commit eb71b1f.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
...ntroller/step-controller-server/src/main/java/step/core/controller/StepControllerPlugin.java
Show resolved
Hide resolved
| AsyncTaskManager asyncTaskManager = context.require(AsyncTaskManager.class); | ||
| asyncTaskManager.scheduleAsyncTask((empty) -> { | ||
| logger.info("ReportNode timeSeries ingestion for empty resolutions has started"); | ||
| reportNodeTimeSeries.getTimeSeries().ingestDataForEmptyCollections(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I couldn't find the place where this was called previously. Wasn't this called?
- Is it safe to do it asynchronously? I assume the controller would start and executions could start creating data too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have the same logic implemented for the time-series (response times) but it did not exists for the reportNodeTimeSeries.
Reinsgesting such new resolutions can take quite some time and I would say doing it async is the only option. Since empty resolutions are determined before any new execution can be triggered and a dedicated ingestion pipeline is used it should be safe However the filter currently used to re-ingest data it too permissive and could cause to re-ingest "new" buckets (created while re-ingesting). I will update the filter to make sure begin < controller_start_time (in ingestDataForEmptyCollections)
Filter filter = (Filter)(collection.getTtl() > 0L ? Filters.gte("begin", System.currentTimeMillis() - collection.getTtl()) : Filters.empty()); try (Stream<Bucket> bucketStream = previousCollection.findLazy(filter, searchOrder)) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification. Agree that it should be done asynchronously. There's however a few things we should think about:
- Ensure the re-ingestion is not interrupted: We could explicitly write in the logs that the controller shouldn't be restarted during the re-ingestion to avoid incomplete re-ingestions.
- Ensure new data are not re-ingested: not sure if the condition begin < controller_start_time would be enough for large resolution. In theory the last bucket could fulfill this condition and be used for new executions while it is being re-ingested. Right?
- In any case, we should benchmark the re-ingestion for large data sets
|
|
||
| @Override | ||
| public void runUpgradeScript() { | ||
| log.info("Renaming time-series 'main' collections to match their resolutions"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to write that we should have only one main collection and understood later that we have 2 timeseries. We might explicitly list the 2 collection names
| private void updateSettingKeyIfPresent(String oldKey, String newKey) { | ||
| Optional<Document> setting = settings.find(Filters.equals("key", oldKey), null, null, null, 0).findFirst(); | ||
| setting.ifPresent(s -> { | ||
| s.put("key", newKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add an explicit log entry here to be fully transparent
...ep-plans-core/src/main/java/step/core/artefacts/reports/aggregated/ReportNodeTimeSeries.java
Show resolved
Hide resolved
| public enum Resolution { | ||
| FIVE_SECONDS("5_seconds", Duration.ofSeconds(5), Duration.ofSeconds(1), false), | ||
| FIFTEEN_SECONDS("15_seconds", Duration.ofSeconds(15), Duration.ofSeconds(5), false), | ||
| ONE_MINUTE("minute", Duration.ofMinutes(1), Duration.ofSeconds(10), false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we rename the main collections, shouldn't we rename to 1_minute to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to limit the migrations, but you're right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is too much of effort (including FE), we can also do it for the next major
| return enabledCollections; | ||
| } | ||
|
|
||
| public List<TimeSeriesCollection> getSingleTimeSeriesCollections(String mainCollectionName, TimeSeriesCollectionsSettings collectionsSettings, Duration resolution, Long flushInterval) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't seem to be used
| List<TimeSeriesCollection> enabledCollections = new ArrayList<>(); | ||
| int flushSeriesQueueSize = collectionsSettings.getFlushSeriesQueueSize(); | ||
| int flushAsyncQueueSize = collectionsSettings.getFlushAsyncQueueSize(); | ||
| addIfEnabled(enabledCollections, mainCollectionName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If for some reason the settings of a time series don't exist, the whole collection is drop in addIfEnabled. Isn't this a bit risky? should we foresee an explicit cleanup flag or something similar to avoid unexpected drops of the main collection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll completely remove the drop when the collection is disabled. Admins can always manually drop the collection in DB if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. I think it is safer
No description provided.