
Conversation

@david-stephan (Contributor)

No description provided.

AsyncTaskManager asyncTaskManager = context.require(AsyncTaskManager.class);
asyncTaskManager.scheduleAsyncTask((empty) -> {
    logger.info("ReportNode timeSeries ingestion for empty resolutions has started");
    reportNodeTimeSeries.getTimeSeries().ingestDataForEmptyCollections();
});

Contributor:

  • I couldn't find the place where this was called previously. Wasn't this called before?
  • Is it safe to do it asynchronously? I assume the controller would start, and executions could already start creating data?

Contributor (Author):

We have the same logic implemented for the time series (response times), but it did not exist for the reportNodeTimeSeries.
Re-ingesting such new resolutions can take quite some time, so I would say doing it asynchronously is the only option. Since empty resolutions are determined before any new execution can be triggered, and a dedicated ingestion pipeline is used, it should be safe. However, the filter currently used to re-ingest data is too permissive and could cause "new" buckets (created while re-ingesting) to be re-ingested. I will update the filter to ensure begin < controller_start_time (in ingestDataForEmptyCollections):

Filter filter = (Filter) (collection.getTtl() > 0L
        ? Filters.gte("begin", System.currentTimeMillis() - collection.getTtl())
        : Filters.empty());
try (Stream<Bucket> bucketStream = previousCollection.findLazy(filter, searchOrder)) {
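
For illustration, a minimal sketch of the tightened filter could look like this. It assumes the Filters API also provides and(...) and lt(...), and that the controller start time is captured at startup as controllerStartTime; neither appears in this diff:

// Sketch only: restrict re-ingestion to buckets created before the controller
// started, so buckets written by new executions are never re-ingested.
// Filters.and(...) and Filters.lt(...) are assumed to exist in this API.
private Filter buildReIngestionFilter(TimeSeriesCollection collection, long controllerStartTime) {
    Filter lowerBound = (Filter) (collection.getTtl() > 0L
            ? Filters.gte("begin", System.currentTimeMillis() - collection.getTtl())
            : Filters.empty());
    return Filters.and(List.of(lowerBound, Filters.lt("begin", controllerStartTime)));
}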

Contributor:

Thanks for the clarification. I agree that it should be done asynchronously. There are, however, a few things we should think about:

  • Ensure the re-ingestion is not interrupted: we could explicitly state in the logs that the controller shouldn't be restarted during the re-ingestion, to avoid incomplete re-ingestions.
  • Ensure new data is not re-ingested: I'm not sure the condition begin < controller_start_time would be enough for large resolutions. In theory, the last bucket could fulfill this condition and still be used by new executions while it is being re-ingested, right? (See the sketch below.)
  • In any case, we should benchmark the re-ingestion for large data sets.
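
To illustrate the concern about the last bucket: a cutoff floored to the resolution boundary would exclude the bucket that may still be open at controller start. This is a hypothetical sketch; the method name and parameters are illustrative, not from the PR:

// Sketch only: floor the cutoff to the previous resolution boundary so the
// bucket that could still receive new data points is excluded entirely.
private long reIngestionCutoff(long controllerStartTime, Duration resolution) {
    long resolutionMs = resolution.toMillis();
    // Buckets with begin >= cutoff might still be open; don't re-ingest them.
    return (controllerStartTime / resolutionMs) * resolutionMs;
}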


@Override
public void runUpgradeScript() {
    log.info("Renaming time-series 'main' collections to match their resolutions");

Contributor:

I was about to write that we should have only one main collection, and only understood later that we have 2 time series. We might explicitly list the 2 collection names.
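
For instance, something along these lines. This is a hypothetical sketch: the two collection names and the renameCollection helper are illustrative, not taken from the PR:

// Sketch only: list the 2 renamed 'main' collections explicitly.
// Collection names and renameCollection(...) are hypothetical.
@Override
public void runUpgradeScript() {
    Map<String, String> renames = Map.of(
            "timeseries", "timeseries_minute",
            "reportNodeTimeSeries", "reportNodeTimeSeries_minute");
    renames.forEach((oldName, newName) -> {
        log.info("Renaming time-series 'main' collection '" + oldName + "' to '" + newName + "'");
        renameCollection(oldName, newName); // hypothetical helper
    });
}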

private void updateSettingKeyIfPresent(String oldKey, String newKey) {
    Optional<Document> setting = settings.find(Filters.equals("key", oldKey), null, null, null, 0).findFirst();
    setting.ifPresent(s -> {
        s.put("key", newKey);

Contributor:

I would add an explicit log entry here to be fully transparent.
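
A minimal sketch of the method with such a log entry; the settings.save(...) call is assumed here and is not visible in this diff:

private void updateSettingKeyIfPresent(String oldKey, String newKey) {
    Optional<Document> setting = settings.find(Filters.equals("key", oldKey), null, null, null, 0).findFirst();
    setting.ifPresent(s -> {
        // Explicit log entry so the key migration is visible in the upgrade logs.
        log.info("Renaming setting key '" + oldKey + "' to '" + newKey + "'");
        s.put("key", newKey);
        settings.save(s); // assumed persistence call, not shown in this diff
    });
}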

public enum Resolution {
    FIVE_SECONDS("5_seconds", Duration.ofSeconds(5), Duration.ofSeconds(1), false),
    FIFTEEN_SECONDS("15_seconds", Duration.ofSeconds(15), Duration.ofSeconds(5), false),
    ONE_MINUTE("minute", Duration.ofMinutes(1), Duration.ofSeconds(10), false),

Contributor:

If we rename the main collections, shouldn't we rename this one to 1_minute to be consistent?

Contributor (Author):

I wanted to limit the migrations, but you're right.

Contributor:

If this is too much effort (including the FE), we can also do it in the next major release.
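
For reference, the consistent entry would only change the name string (same parameters as today), alongside the corresponding collection migration:

ONE_MINUTE("1_minute", Duration.ofMinutes(1), Duration.ofSeconds(10), false),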

    return enabledCollections;
}

public List<TimeSeriesCollection> getSingleTimeSeriesCollections(String mainCollectionName, TimeSeriesCollectionsSettings collectionsSettings, Duration resolution, Long flushInterval) {

Contributor:

This doesn't seem to be used.

List<TimeSeriesCollection> enabledCollections = new ArrayList<>();
int flushSeriesQueueSize = collectionsSettings.getFlushSeriesQueueSize();
int flushAsyncQueueSize = collectionsSettings.getFlushAsyncQueueSize();
addIfEnabled(enabledCollections, mainCollectionName,

Contributor:

If, for some reason, the settings of a time series don't exist, the whole collection is dropped in addIfEnabled. Isn't this a bit risky? Should we foresee an explicit cleanup flag or something similar to avoid unexpected drops of the main collection?

Contributor (Author):

I think I'll completely remove the drop when the collection is disabled. Admins can always drop the collection manually in the DB if required.

Contributor:

Exactly. I think it is safer
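
A minimal sketch of addIfEnabled without the drop; the signature is assumed for illustration and may differ from the actual method in the PR:

// Sketch only: when a resolution is disabled, skip it and log it instead of
// dropping its collection. Signature is illustrative.
private void addIfEnabled(List<TimeSeriesCollection> enabledCollections, String collectionName,
                          boolean enabled, TimeSeriesCollection collection) {
    if (enabled) {
        enabledCollections.add(collection);
    } else {
        // No automatic drop: admins can drop the collection manually if required.
        log.info("Time-series collection '" + collectionName + "' is disabled; keeping existing data");
    }
}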
