Conversation

@Pajaraja
Collaborator

Summary:
Currently we have a bug with quarterly data based on end-of-quarter dates, caused by months not having the same number of days: September 30th minus 6 months is March 30th, which falls before March 31st and breaks the custom cutoff generation logic. To fix this, we adopt an addition-based approach instead of a subtraction-based one, since a date that would land past the last day of a month is rounded down to the last day of that month.
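
The asymmetry described above can be reproduced with a minimal sketch. It assumes plain `pd.DateOffset(months=6)` stands in for the offsets the library builds internally, and the variable names are illustrative only, not taken from the PR:

```python
import pandas as pd

six_months = pd.DateOffset(months=6)

# Subtraction keeps the day-of-month (30), so the result is March 30th,
# one day before the end-of-quarter observation on March 31st.
shifted_back = pd.Timestamp("2020-09-30") - six_months

# Addition from March 31st asks for "September 31st", which does not exist,
# so the result is rounded down to the last day of September.
shifted_forward = pd.Timestamp("2020-03-31") + six_months

print(shifted_back)     # 2020-03-30 00:00:00
print(shifted_forward)  # 2020-09-30 00:00:00
```

Going forward from a month-end date lands on a month-end date again, while going backward from September 30th does not, which is why the direction of the arithmetic matters here.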

Test plan:
New tests in diagnostics_test and utils_test.

@Pajaraja Pajaraja requested a review from Copilot January 23, 2026 16:34

Copilot AI left a comment

Pull request overview

This PR fixes a bug in quarterly data handling where end-of-quarter dates caused incorrect custom cutoff generation. The issue occurred because subtracting months doesn't account for varying month lengths (e.g., September 30th minus 6 months equals March 30th, not March 31st). The fix switches from a subtraction-based to an addition-based approach for calculating cutoffs.

Changes:

  • Modified generate_custom_cutoffs to use addition instead of subtraction when calculating maximum cutoff
  • Updated cross_validation validation logic to check if max cutoff plus horizon exceeds end date
  • Added test cases for quarterly end-of-month scenarios
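
As a rough illustration of the difference between the two checks, here is a hedged sketch. The helper names `fits_by_subtraction` and `fits_by_addition` are hypothetical and do not appear in the PR, and a fixed `pd.DateOffset(months=6)` horizon is assumed in place of the offsets the library actually constructs:

```python
import pandas as pd

def fits_by_subtraction(cutoff, horizon_offset, end_date):
    # Hypothetical old-style check: compare the cutoff against end_date - horizon.
    return cutoff <= end_date - horizon_offset

def fits_by_addition(cutoff, horizon_offset, end_date):
    # Hypothetical new-style check: compare cutoff + horizon against end_date.
    return cutoff + horizon_offset <= end_date

horizon = pd.DateOffset(months=6)      # two quarters, assumed for illustration
end_date = pd.Timestamp("2020-09-30")  # last end-of-quarter observation
cutoff = pd.Timestamp("2020-03-31")    # candidate cutoff two quarters earlier

old_result = fits_by_subtraction(cutoff, horizon, end_date)
new_result = fits_by_addition(cutoff, horizon, end_date)

print(old_result)  # False: 2020-03-31 is after 2020-09-30 minus 6 months (2020-03-30)
print(new_result)  # True: 2020-03-31 plus 6 months is 2020-09-30
```

In this sketch the subtraction form rejects a cutoff that leaves exactly one full horizon of data, while the addition form accepts it.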

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File | Description
runtime/databricks/automl_runtime/forecast/utils.py | Changed max_cutoff calculation from subtraction to addition-based approach
runtime/databricks/automl_runtime/forecast/prophet/diagnostics.py | Updated validation logic to use addition instead of subtraction for consistency
runtime/tests/automl_runtime/forecast/utils_test.py | Added test case for quarterly data with end-of-quarter dates
runtime/tests/automl_runtime/forecast/prophet/diagnostics_test.py | Added test case for cross-validation with month-end cutoffs

cutoffs = generate_custom_cutoffs(df, horizon=7, frequency_unit="QS", split_cutoff=pd.Timestamp('2020-07-12 00:00:00'))
self.assertEqual([pd.Timestamp('2020-07-12 00:00:00'), pd.Timestamp('2020-10-12 00:00:00')], cutoffs)

def test_generate_custom_cutoffs_success_quaterly_end(self):

Copilot AI Jan 23, 2026

Corrected spelling of 'quaterly' to 'quarterly'.

Collaborator Author

Typo is in other places too. Will fix in followup PR.

period_dateoffset = pd.DateOffset(**DATE_OFFSET_KEYWORD_MAP[frequency_unit])*period*frequency_quantity
horizon_dateoffset = pd.DateOffset(**DATE_OFFSET_KEYWORD_MAP[frequency_unit])*horizon*frequency_quantity

# First cutoff is the cutoff bewteen splits

Copilot AI Jan 23, 2026

Corrected spelling of 'bewteen' to 'between'.

Collaborator Author

Will fix in followup PR

@Pajaraja Pajaraja merged commit 2a2a7d8 into databricks:branch-0.2.20.13 Jan 23, 2026
7 of 8 checks passed