Fix WeatherDataset boundary checks for analysis indexing and forecast forcing horizon (Fixes #311, Closes #319) by kshirajahere · Pull Request #312 · mllam/neural-lam

kshirajahere · 2026-03-02T13:13:22Z

Describe your changes

Fix analysis-mode sample counting in WeatherDataset.__len__ (off-by-one correction).
Add explicit index bounds handling in WeatherDataset.__getitem__:
- supports Python-style negative indexing,
- raises IndexError for out-of-range indices.
Add forecast-mode horizon validation in WeatherDataset.__len__ so forcing windows cannot overrun available forecast steps.
Add focused regression tests for:
- analysis-mode out-of-bounds indexing,
- forecast-mode forcing horizon boundary checks.
Add CHANGELOG entry under ## [unreleased] -> ### Fixed.
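
The analysis-mode off-by-one can be illustrated with a generic sliding-window count. This is a minimal sketch, not the code from this PR: the helper name `count_analysis_samples`, its default arguments, and the assumption that one sample spans `max(2, num_past_forcing_steps) + ar_steps + num_future_forcing_steps` consecutive time steps are illustrative only, borrowing the window terms quoted later in the review.

```python
def count_analysis_samples(
    n_time_steps: int,
    ar_steps: int,
    num_past_forcing_steps: int = 1,
    num_future_forcing_steps: int = 1,
) -> int:
    """Count complete sliding windows in a time series (illustrative only)."""
    # Assumed window: initial states / past forcing, AR targets, future forcing.
    window = (
        max(2, num_past_forcing_steps) + ar_steps + num_future_forcing_steps
    )
    # n - window + 1 complete windows fit; dropping the "+ 1" undercounts by one.
    return max(0, n_time_steps - window + 1)
```

Under these assumptions, 10 time steps and a 6-step window allow start indices 0 through 4, so 5 samples; a count of `n_time_steps - window` would report 4 and hide the last valid sample.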

Issue Link

Fixes #311
Closes #319

Type of change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
I have performed a self-review of my code
For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
I have updated the README to cover introduced code changes
I have added tests that prove my fix is effective or that my feature works
I have given the PR a name that clearly describes the change, written in imperative form (context).
I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

the code is readable
the code is well tested
the code is documented (including return types and parameters)
the code is easy to maintain

Author checklist after completed review

I have added a line to the CHANGELOG describing this change, in a section
reflecting type of change (add section where missing):
- added: when you have added new functionality
- changed: when default behaviour of the code has been changed
- fixes: when your contribution fixes a bug
- maintenance: when your contribution relates to repo maintenance, e.g. CI/CD or documentation

Checklist for assignee

PR is up to date with the base branch
the tests pass
(if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)
Once the PR is ready to be merged, squash commits and merge the PR.

kshirajahere · 2026-03-02T13:37:05Z

Ready for review. Happy to make any requested adjustments.

@Jayant-kernel

Co-developed the forcing-bounds equation using @Jayant-kernel's detailed breakdown in issue 319!

kshirajahere · 2026-03-03T14:08:42Z

Update: I’ve integrated the related forecast-boundary validation from #319 into this PR, so WeatherDataset boundary checks are handled in one place.

This PR now includes:

analysis-mode indexing bounds (IndexError behavior),
forecast-mode forcing-horizon validation in __len__,
regression tests for both paths,
CHANGELOG update.

@sadamov @Jayant-kernel please let me know if you’d prefer this split, otherwise this PR should close both #311 and #319 on merge.

kshirajahere · 2026-03-04T16:09:03Z

@sadamov @leifdenby can someone pls review the PR :) thanks

kshirajahere · 2026-03-06T17:43:48Z

@sadamov Hey sir :), just pinging to request a review when you have time. The regression tests are done. If you need me to do any cleanups or other changes, let me know :) thanks

kshirajahere · 2026-03-08T05:57:42Z

@sadamov Please if you get time, do review 😸

sadamov

@kshirajahere Thanks a lot, this has indeed been a bug. And the fact that datasets were silently truncated is certainly not great. I agree with your suggested fixes and only request a reduction in lines of code, because we know that forcing and state share temporal dimensions.

Because this fix has some importance and you introduced 3 new tests I am asking for a second review as well. I prefer a third pair of eyes before merging a bugfix of this size.

sadamov · 2026-03-08T20:18:52Z

+            # If forcing data is present, also validate that the complete
+            # forcing window can be constructed for each autoregressive target
+            # step without truncation.
+            if self.da_forcing is not None:
+                required_forcing_steps = (
+                    max(2, self.num_past_forcing_steps)
+                    + self.ar_steps
+                    + self.num_future_forcing_steps
+                )
+                n_forcing_forecast_steps = (
+                    self.da_forcing.elapsed_forecast_duration.size
+                )
+                if n_forcing_forecast_steps < required_forcing_steps:
+                    raise ValueError(
+                        "The number of forcing forecast steps available "
+                        f"({n_forcing_forecast_steps}) is less than the "
+                        f"required {required_forcing_steps} "
+                        f"(max(2, num_past_forcing_steps={self.num_past_forcing_steps})"
+                        f" + ar_steps={self.ar_steps} + "
+                        f"num_future_forcing_steps={self.num_future_forcing_steps}) "
+                        "for constructing forcing windows."
+                    )


Both da_state and da_forcing are always built from the same datastore time coordinates, so their sizes are guaranteed to be equal and a separate shape check is not needed. The single forecast-mode check is still necessary, but only because the required minimum size is larger when forcing is present (+ num_future_forcing_steps), not because the arrays could ever differ in size.

The no-forcing path can remain unchanged in behaviour; the with-forcing path should skip the redundant state check and go straight to the stricter (and sufficient) forcing constraint.
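
The two paths described above can be sketched as a single minimum-length computation. This is a minimal sketch under stated assumptions: the function names `required_forecast_steps` and `validate_forecast_horizon` and the `has_forcing` flag are hypothetical, and the formulas mirror the diff quoted in this review (and the `2 + ar_steps` no-forcing behaviour mentioned in the thread), not necessarily the code that was eventually merged.

```python
def required_forecast_steps(
    ar_steps: int,
    num_past_forcing_steps: int,
    num_future_forcing_steps: int,
    has_forcing: bool,
) -> int:
    """Minimum number of forecast steps needed to build one sample."""
    if not has_forcing:
        # No-forcing path: two initial states plus the AR targets.
        return 2 + ar_steps
    # With-forcing path: stricter bound covering the full forcing window.
    return max(2, num_past_forcing_steps) + ar_steps + num_future_forcing_steps


def validate_forecast_horizon(
    n_forecast_steps: int,
    ar_steps: int,
    num_past_forcing_steps: int,
    num_future_forcing_steps: int,
    has_forcing: bool,
) -> None:
    """Raise ValueError if the shared forecast horizon is too short."""
    required = required_forecast_steps(
        ar_steps, num_past_forcing_steps, num_future_forcing_steps, has_forcing
    )
    if n_forecast_steps < required:
        raise ValueError(
            f"The number of forecast steps available ({n_forecast_steps}) "
            f"is less than the required {required}."
        )
```

Because state and forcing are said to share the same time coordinates, one check against the larger of the two requirements is enough; the with-forcing bound is never smaller than `2 + ar_steps`.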

I removed the redundant separate forcing-size check and now use the shared forecast horizon once:

no-forcing path keeps the original 2 + ar_steps behavior

with-forcing path applies the stricter minimum needed for the full forcing window

Re-ran:

pytest -q tests/test_datasets.py -k "dataset_length or out_of_bounds or forecast_len"

ruff check neural_lam/weather_dataset.py tests/test_datasets.py

sadamov · 2026-03-08T20:21:27Z

+        # Match Python sequence semantics for negative indexing.
+        if idx < 0:
+            idx += dataset_len
+        if idx < 0 or idx >= dataset_len:
+            raise IndexError(
+                f"Index {idx} is out of bounds for dataset of size "
+                f"{dataset_len}"
+            )
+


I was at first conflicted about this change, because it only affects the analysis-type data, and only if accessed programmatically, e.g. from a test. And I thought that it might rather be a separate issue.

But, after consideration, I think this fits well within this PR. And knowing that there is more flexibility and robustness needed when we introduce #138 boundary datastores, this is good. Nothing to change here, just some context.
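
For context, the index handling in the diff above can be written as a standalone helper. This is a minimal sketch, assuming a hypothetical function name `normalize_index`; it follows the same two checks as the diff, with one deliberate difference: the error message reports the index the caller passed in rather than the shifted value.

```python
def normalize_index(idx: int, dataset_len: int) -> int:
    """Map a possibly negative index to [0, dataset_len) or raise IndexError."""
    original_idx = idx  # kept so the error message shows what the caller passed
    # Match Python sequence semantics for negative indexing.
    if idx < 0:
        idx += dataset_len
    if idx < 0 or idx >= dataset_len:
        raise IndexError(
            f"Index {original_idx} is out of bounds for dataset of size "
            f"{dataset_len}"
        )
    return idx
```

With a dataset of size 5, `normalize_index(-1, 5)` returns 4, while both 5 and -6 raise `IndexError`, which is also what lets a plain `for sample in dataset` loop terminate when the class only defines `__getitem__`.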

kshirajahere · 2026-03-08T21:37:12Z

Hey @sadamov, can you re-review? I have reduced the forecast-mode validation.

sadamov

Looks good now! Just waiting for @joeloskarsson review and then we can merge

kshirajahere · 2026-03-17T19:00:31Z

@joeloskarsson pinging for review :)

kshirajahere · 2026-03-21T20:53:32Z

Was going thru the code after pinging @joeloskarsson for the review on slack 😅 and found some gaps (fixed them)
@sadamov small heads-up: I pushed a follow-up fix after rechecking the branch, covering two forecast-mode validation gaps plus regression tests. Since this is post-approval in the same area, I wanted to flag it here.

kshirajahere · 2026-03-30T13:44:10Z

@joeloskarsson i request you to please review my PR :)

sadamov

your latest commits did not break anything since my last review. thanks for notifying me

kshirajahere · 2026-04-03T17:51:42Z

Thanks @sadamov, just waiting for @joeloskarsson review so i can make those changes and we can merge this 🚀

kshirajahere · 2026-04-09T07:23:59Z

Heads-up: the two GPU checks are stuck in queue again (Cirun runner availability). Everything else is green. Could a maintainer re-run checks once the Cirun pool is free? If you prefer, I can re-run them too once the runner frees up.

kshirajahere added 3 commits March 2, 2026 18:20

Fix mllam#311

b3c1012

Fix mllam#311

c2d443c

Update CHANGELOG for Fix mllam#311 and Pull mllam#312

a08eff0

kshirajahere mentioned this pull request Mar 2, 2026

[BUG] WeatherDataset undercounts __len__ in analysis mode and silently yields NaN tensors out-of-bounds #311

Open

sadamov mentioned this pull request Mar 3, 2026

[Bug] Forecast-mode WeatherDataset length check ignores forcing window, causing truncated/NaN forcing tensors #319

Open

kshirajahere added 3 commits March 3, 2026 19:25

fix: account for forecast mode forcing horizons (Closes mllam#319)

f3aaeca

Co-developed the forcing-bounds equation using @Jayant-kernel's detailed breakdown in issue 319!

fix: account for forecast mode forcing horizons (Closes mllam#319)

3b1e014

Co-developed the forcing-bounds equation using @Jayant-kernel's detailed breakdown in issue 319!

Clarify fixes in CHANGELOG.md

7987bf4

Updated the changelog to clarify fixes in WeatherDataset and other issues.

kshirajahere changed the title ~~Fix WeatherDataset analysis-mode indexing bounds and add regression test (Fixes #311)~~ Fix WeatherDataset boundary checks for analysis indexing and forecast forcing horizon (Fixes #311, Closes #319) Mar 3, 2026

Merge branch 'main' into fix/weatherdataset-index-bounds-311

1c588da

sadamov self-requested a review March 4, 2026 17:22

kshirajahere mentioned this pull request Mar 5, 2026

[RFC/Design] Standardize probabilistic vs deterministic return contract to unblock evaluation integrations #335

Open

sadamov requested changes Mar 8, 2026

sadamov requested a review from joeloskarsson March 8, 2026 20:24

sadamov mentioned this pull request Mar 8, 2026

Fix silent truncation crash for short forecast data #254

Open

21 tasks

Reduced the forecast-mode validation (mllam#312)

ce4286e

kshirajahere requested a review from sadamov March 8, 2026 21:38

sadamov added 2 commits March 9, 2026 13:26

Merge remote-tracking branch 'mllam' into pr/kshirajahere/312

10ead93

linting

e02b422

sadamov approved these changes Mar 9, 2026

sadamov self-assigned this Mar 12, 2026

sadamov added the bug Something isn't working label Mar 12, 2026

Fix forecast forcing horizon validation

ed0f1a8

kshirajahere added 3 commits March 20, 2026 02:47

Resolve mllam#312 test conflict and harden forcing handling

075d92c

Merge origin/main into fix/weatherdataset-index-bounds-311

a1db949

Fix forecast-mode WeatherDataset length validation

5618bd9

kshirajahere requested a review from sadamov March 21, 2026 21:13

kshirajahere added 3 commits March 22, 2026 04:07

Validate forecast coordinate consistency in WeatherDataset

4335729

Merge origin/main into fix/weatherdataset-index-bounds-311

9329f54

style: format test_datasets with black

764fd7f

Merge branch 'main' into fix/weatherdataset-index-bounds-311

af95a8d

sadamov approved these changes Apr 1, 2026

Merge branch 'main' into fix/weatherdataset-index-bounds-311

9d1bf17

kshirajahere added 2 commits April 8, 2026 23:26

ci: unblock PR GPU checks

e554aa8

Route pull_request GPU matrix jobs to ubuntu-latest so they no longer wait 24 hours for an unavailable Cirun runner. Keep Cirun on push and workflow_dispatch, and only use the NVMe-specific pip GPU setup on the self-hosted path.

Revert "ci: unblock PR GPU checks"

29220cf

This reverts commit e554aa8.

kshirajahere added 7 commits April 10, 2026 22:41

Merge branch 'main' into fix/weatherdataset-index-bounds-311

c9a0a72

Move dataset validation out of __len__

040ba8f

Format dataset validation for pre-commit

bf161bd

Merge branch 'main' into fix/weatherdataset-index-bounds-311

4bacdc9

fix: allow longer forecast forcing horizons

b15ba68

Merge branch 'main' into fix/weatherdataset-index-bounds-311

99fa750

Merge branch 'main' into fix/weatherdataset-index-bounds-311

37947f3
