This repository contains the code and reproducible submission pipeline for the Data Mining Spring 2026 final project.
Best verified public leaderboard score:
Public MAE: 0.8151
Final submission file:
submissions/final_submission.csv
The final submission is copied from:
submissions/c28_after8162_C28_HYBRID_G275_R120_CAP840.csv
The final system uses deterministic ensemble calibration and post-processing. The main stages are:
- Strict 91-day-based learned-rank signal generation.
- Distribution-preserving calibration of prediction scores.
- Public-anchor guided continuation using verified submissions.
- Hybrid calibration combining validated continuation and learned rank agreement.
No test labels, private leaderboard labels, or external datasets are used.
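The exact calibration procedure is not spelled out here, so the following is only a minimal sketch of one common way to make a calibration step distribution-preserving: each new score is replaced by the reference value at the same rank, so the output is a permutation of the reference predictions. The function name `rank_preserving_remap` and the sample values are hypothetical and are not taken from the repository's scripts.

```python
def rank_preserving_remap(scores, reference):
    """Give each score the reference value that sits at the same rank.

    The result is a permutation of `reference`, so its value distribution
    is unchanged, while the ordering of rows follows `scores`.
    """
    if len(scores) != len(reference):
        raise ValueError("scores and reference must have the same length")
    # Row indices ordered from lowest to highest score.
    # sorted() is stable, so tied scores keep their input order (deterministic).
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ref_sorted = sorted(reference)
    out = [0.0] * len(scores)
    for rank, idx in enumerate(order):
        out[idx] = ref_sorted[rank]
    return out


# Hypothetical inputs: a new ranking signal and an existing set of predictions.
new_signal = [0.2, 0.9, 0.5, 0.1]
anchor_preds = [3.0, 1.0, 4.0, 2.0]

calibrated = rank_preserving_remap(new_signal, anchor_preds)
print(calibrated)  # [2.0, 4.0, 3.0, 1.0]
```

Because the output reuses the reference values exactly, summary statistics such as the mean and quantiles of the predictions stay fixed, and only the assignment of values to rows changes.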
Create and activate a virtual environment (Windows PowerShell), then install the dependencies:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Place the competition files in:
data/train.csv
data/test.csv
data/sample_submission.csv
The raw data files are not included in this repository.
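Since the raw data has to be added by hand, a quick check before running the pipeline can save a failed run. The sketch below is not part of the repository. It assumes the `data/` directory and the three file names listed above, and the helper name `missing_data_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = ("train.csv", "test.csv", "sample_submission.csv")


def missing_data_files(data_dir="data"):
    """Return the required competition files that are absent from data_dir."""
    base = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


missing = missing_data_files()
if missing:
    print("Missing competition files:", ", ".join(missing))
else:
    print("All competition files found.")
```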
The public leaderboard progress is recorded in:
outputs/public_score_log.csv
All final scripts are deterministic. The post-processing scripts preserve the official sample submission order and required columns. Predictions are clipped to the valid score range.
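The post-processing guarantees above can be illustrated with a minimal sketch that uses only the standard library. It is an assumption-laden illustration, not the repository's script: the column names `id` and `score`, the bounds `0.0` and `5.0`, and the function name `finalize_submission` are all hypothetical placeholders for the competition's real values.

```python
import csv
import io


def finalize_submission(sample_rows, predictions, id_col, target_col, lo, hi):
    """Return rows in sample-submission order with predictions clipped to [lo, hi]."""
    out = []
    for row in sample_rows:  # iterate in the official sample order
        key = row[id_col]
        if key not in predictions:
            raise KeyError(f"missing prediction for id {key!r}")
        clipped = min(max(float(predictions[key]), lo), hi)
        out.append({id_col: key, target_col: clipped})
    return out


# Hypothetical sample submission held in memory; a real run would open the CSV file.
sample_csv = "id,score\nb,0\na,0\nc,0\n"
sample_rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Hypothetical raw predictions keyed by id, in arbitrary order.
preds = {"a": 7.3, "b": -0.4, "c": 2.5}

rows = finalize_submission(sample_rows, preds, "id", "score", lo=0.0, hi=5.0)

# Write out with exactly the required columns, in the sample's column order.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "score"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Iterating over the sample rows rather than over the predictions keeps the row order identical to the official file, and raising on a missing id avoids silently submitting an incomplete file.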