Binary classification probabilities and regression forecasts are blended using
the conditional mean of a zero-truncated distribution. Given a classifier
probability p and an unconditional mean forecast \mu_u, the zero
probability is P0 = (\kappa / (\kappa + \mu_u))^\kappa. The conditional
mean \mu_c is then \mu_u / (1 - P0), and the final prediction is:

```
y_hat = ((1 - \epsilon_{leaky}) * p + \epsilon_{leaky}) * \mu_c
```
During training the model jointly learns the mean \mu and shape \kappa
of a truncated negative binomial distribution, while the binary head employs
focal loss. The leakage term \epsilon_{leaky}, configurable via `PATCH_PARAMS`,
stabilizes the blend by preserving a small positive probability even when the
classifier predicts zero.
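As a minimal sketch of the blend above (the function name `blend_predictions` and the default leakage value are illustrative, not the repo's actual API):

```python
import numpy as np

def blend_predictions(p, mu_u, kappa, eps_leaky=0.01):
    """Blend a classifier probability with a zero-truncated NB mean forecast.

    p         : P(y > 0) from the binary head
    mu_u      : unconditional mean forecast
    kappa     : learned negative binomial shape
    eps_leaky : leakage term keeping the blend strictly positive (assumed default)
    """
    # Zero probability of the negative binomial: P0 = (kappa / (kappa + mu_u))^kappa
    p0 = (kappa / (kappa + mu_u)) ** kappa
    # Conditional (zero-truncated) mean: mu_c = mu_u / (1 - P0)
    mu_c = mu_u / (1.0 - p0)
    # Leaky blend: even p == 0 still contributes eps_leaky * mu_c
    return ((1.0 - eps_leaky) * p + eps_leaky) * mu_c
```

With `kappa = 1` and `mu_u = 2`, P0 = 1/3 and \mu_c = 3, so a confident classifier (`p = 1`, no leakage) recovers the conditional mean exactly.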
`SampleWindowizer.build_lgbm_train` now vectorizes target generation and
unpivoting, producing the same dataset as the previous row-wise
implementation but more efficiently.
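For illustration, the row-wise-to-vectorized change can be sketched with `pandas.melt`; the frame layout and column names below are assumptions, not the repo's actual schema:

```python
import pandas as pd

# Hypothetical wide frame: one row per series, one column per horizon step.
wide = pd.DataFrame({
    "series_id": ["a", "b"],
    "h1": [10, 20],
    "h2": [11, 21],
})

# Vectorized unpivot: a single melt call replaces a Python-level loop over rows,
# emitting one (series, horizon, target) row per cell.
long = wide.melt(id_vars="series_id", var_name="horizon", value_name="target")
```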
Run baseline models with the provided configuration:
```shell
python LGHackerton/train_baseline.py --config configs/baseline.yaml --model naive
```

Run a grid search over PatchTST settings:

```shell
python LGHackerton/tune.py --task patchtst_grid --config configs/patchtst.yaml
```

`train.py` determines PatchTST settings using the following order:

- If `artifacts/patchtst_search.csv` exists (created by the grid-search task), the combination with the lowest `val_wsmape` is chosen. The associated `input_len` is applied to window generation and to the model.
- Otherwise, Optuna results from `artifacts/optuna/patchtst_best.json` are used when available.
- If neither artifact is present, default parameters from `PATCH_PARAMS` are used.
This avoids confusion when both grid-search and Optuna artifacts may exist.
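The precedence above can be sketched as follows; the function name, the placeholder `PATCH_PARAMS` values, and the CSV handling are assumptions for illustration, not the repo's actual code:

```python
import json
from pathlib import Path

# Placeholder defaults standing in for the repo's PATCH_PARAMS.
PATCH_PARAMS = {"input_len": 96, "d_model": 64}

def resolve_patchtst_params(artifact_dir="artifacts"):
    """Resolve PatchTST settings: grid search > Optuna > defaults."""
    grid_csv = Path(artifact_dir) / "patchtst_search.csv"
    optuna_json = Path(artifact_dir) / "optuna" / "patchtst_best.json"
    if grid_csv.exists():
        import pandas as pd
        df = pd.read_csv(grid_csv)
        # Pick the row with the lowest validation wSMAPE.
        return df.loc[df["val_wsmape"].idxmin()].to_dict()
    if optuna_json.exists():
        return json.loads(optuna_json.read_text())
    return dict(PATCH_PARAMS)  # fall back to defaults
```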
Training scripts resolve trainer classes and Optuna-based tuners through
lightweight registries. Providing an unknown name raises a ValueError listing
available options. See docs/registry_tuner.md for full
usage and error-handling rules.
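A minimal registry of the kind described might look like this (the `TRAINERS` mapping and error message wording are illustrative):

```python
# Lightweight name -> trainer-class registry; entries here are placeholders.
TRAINERS = {"patchtst": object, "lgbm": object}

def get_trainer(name):
    """Look up a trainer class, raising ValueError listing valid names."""
    try:
        return TRAINERS[name]
    except KeyError:
        raise ValueError(
            f"Unknown trainer '{name}'. Available: {sorted(TRAINERS)}"
        ) from None
```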
After generating predictions from different models, use the postprocessing utilities to ensemble them and create a submission file:
```shell
python LGHackerton/predict.py --model patchtst --out patch.csv
python LGHackerton/predict.py --model lgbm --out lgbm.csv
python - <<'PY'
import pandas as pd
from LGHackerton.postprocess import aggregate_predictions, convert_to_submission

preds = [pd.read_csv('patch.csv'), pd.read_csv('lgbm.csv')]
ens = aggregate_predictions(preds, weights=[0.7, 0.3])
convert_to_submission(ens).to_csv('submission.csv', index=False, encoding='utf-8-sig')
PY
```

The helper functions `aggregate_predictions` and `convert_to_submission` ensure
consistent formatting and handle any missing or duplicate entries.
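For intuition, a weighted ensemble of this kind reduces to a weighted average of aligned prediction frames; the sketch below assumes `id` and `pred` column names, which are illustrative rather than the repo's actual schema:

```python
import numpy as np
import pandas as pd

def weighted_average(preds, weights):
    """Weighted average of prediction frames with identically ordered rows.

    Assumes each frame carries the same 'id' rows and a 'pred' column
    (column names are illustrative).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so weights sum to 1
    out = preds[0][["id"]].copy()
    out["pred"] = sum(wi * df["pred"].to_numpy() for wi, df in zip(w, preds))
    return out
```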