docs: RFC - Multi-Quantile Target Support for ForecastInputDataset #770
RFC-0001: Multi-Quantile Target Support for ForecastInputDataset
Summary
Extend ForecastInputDataset to support multiple target series as quantiles (e.g., P10, P50, P90). This enables training models directly on probabilistic targets from upstream forecasters.
Motivation
We're building a metaforecasting module with:
Both need multi-quantile targets as "ground truth" for training.
Why not train N separate models?
Problems:
XGBoost supports multi-target training natively (multi_strategy="one_output_per_tree"). Multi-quantile targets enable efficient joint training in a single pass.
Design
API Changes
Column Naming
Pattern: {target_column}_{quantile.format()}
Example columns: load, load_quantile_P10, load_quantile_P50
Detection
target_quantiles param provided → use those
{target_column}_quantile_P*
Validation
target_series compatibility)
Design Decisions
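As context for the decisions below, the column naming and detection rules from the Design section can be sketched in plain Python. This is a minimal illustration, not the proposed implementation: `format_quantile`, `quantile_column`, and `detect_target_quantiles` are hypothetical names, the two-digit `P10` formatting is an assumption about what `quantile.format()` produces, and raising `ValueError` for a missing column is only a placeholder for the validation rules.

```python
import re


def format_quantile(q: float) -> str:
    """Hypothetical stand-in for quantile.format(): 0.1 -> 'quantile_P10'."""
    return f"quantile_P{round(q * 100):02d}"


def quantile_column(target_column: str, q: float) -> str:
    """Build a column name following {target_column}_{quantile.format()}."""
    return f"{target_column}_{format_quantile(q)}"


def detect_target_quantiles(columns, target_column, target_quantiles=None):
    """Return a mapping of quantile -> column name.

    If target_quantiles is given, use exactly those quantiles.
    Otherwise scan the columns for {target_column}_quantile_P*.
    """
    if target_quantiles is not None:
        mapping = {q: quantile_column(target_column, q) for q in target_quantiles}
        missing = [name for name in mapping.values() if name not in columns]
        if missing:
            # Assumption: the real validation may raise a different error type.
            raise ValueError(f"missing quantile columns: {missing}")
        return mapping

    pattern = re.compile(rf"^{re.escape(target_column)}_quantile_P(\d+)$")
    found = {}
    for name in columns:
        match = pattern.match(name)
        if match:
            found[int(match.group(1)) / 100] = name
    # Sort by quantile so the result does not depend on column order.
    return dict(sorted(found.items()))


columns = ["load", "load_quantile_P90", "load_quantile_P10", "load_quantile_P50", "temperature"]
detected = detect_target_quantiles(columns, "load")
explicit = detect_target_quantiles(columns, "load", target_quantiles=[0.5])
```

Here `detected` covers all three quantile columns, while `explicit` is restricted to the P50 column because the explicit parameter takes precedence over pattern matching.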
D1: Sample weights use primary target
Sample weights based on
primary_target_series. Custom forecasters can override viahas_quantile_targets.D2: Forecasters must validate support
Forecasters raise InputValidationError if they receive multi-quantile targets but don't support them. Fail-fast prevents silent bugs.
D3: Evaluation uses primary target only
Always use primary_target_series (P50 for multi-quantile) as ground truth. Quantile-to-quantile metrics deferred to future work.
D4: "Primary" not "median" terminology
Single-target datasets aren't necessarily medians (could be mean). primary_target_series avoids implying statistical meaning.
D5: Target quantiles must match forecaster quantiles
If forecaster supports multi-quantile targets, data.target_quantiles must exactly match forecaster.config.quantiles. Prevents undefined training behavior.
Drawbacks
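Decisions D2 and D5 together amount to a fail-fast check before training, which can be sketched as follows. This is an illustration under stated assumptions: `validate_quantile_targets` and its parameters are hypothetical, `InputValidationError` is redefined locally as a stand-in for the project's exception, and "exactly match" is interpreted here as the same quantiles regardless of order.

```python
class InputValidationError(ValueError):
    """Local stand-in for the project's InputValidationError."""


def validate_quantile_targets(data_quantiles, forecaster_quantiles, supports_multi_quantile):
    """Raise InputValidationError if the dataset and forecaster are incompatible.

    data_quantiles: quantiles present as targets in the dataset (empty for single-target).
    forecaster_quantiles: quantiles the forecaster is configured to predict.
    supports_multi_quantile: whether the forecaster can train on quantile targets.
    """
    if not data_quantiles:
        return  # Single-target dataset: nothing to check.

    # D2: a forecaster that cannot handle quantile targets must fail fast.
    if not supports_multi_quantile:
        raise InputValidationError("forecaster does not support multi-quantile targets")

    # D5: target quantiles must match the forecaster's quantiles.
    # Assumption: order is ignored when comparing.
    if sorted(data_quantiles) != sorted(forecaster_quantiles):
        raise InputValidationError(
            f"target quantiles {sorted(data_quantiles)} do not match "
            f"forecaster quantiles {sorted(forecaster_quantiles)}"
        )


# Matching quantiles pass silently.
validate_quantile_targets([0.1, 0.5, 0.9], [0.9, 0.5, 0.1], supports_multi_quantile=True)
```

Raising at validation time, rather than during fitting, keeps the error message close to the misconfiguration that caused it.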
Alternatives Considered
MultiQuantileTargetDataset class
quantile_P* naming (like ForecastDataset)
Future Possibilities