-
Notifications
You must be signed in to change notification settings - Fork 262
Fix WeatherDataset boundary checks for analysis indexing and forecast forcing horizon (Fixes #311, Closes #319) #312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 7 commits
b3c1012
c2d443c
a08eff0
f3aaeca
3b1e014
7987bf4
1c588da
ce4286e
10ead93
e02b422
ed0f1a8
075d92c
a1db949
5618bd9
4335729
9329f54
764fd7f
af95a8d
9d1bf17
e554aa8
29220cf
c9a0a72
040ba8f
bf161bd
4bacdc9
b15ba68
99fa750
37947f3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -131,16 +131,47 @@ def __len__(self): | |
| UserWarning, | ||
| ) | ||
|
|
||
| # check that there are enough forecast steps available to create | ||
| # samples given the number of autoregressive steps requested | ||
| n_forecast_steps = self.da_state.elapsed_forecast_duration.size | ||
| if n_forecast_steps < 2 + self.ar_steps: | ||
| # Check that there are enough forecast steps available for state | ||
| # slicing. This includes two initial states and `ar_steps` targets, | ||
| # potentially offset by past forcing. | ||
| required_state_steps = ( | ||
| max(2, self.num_past_forcing_steps) + self.ar_steps | ||
| ) | ||
| n_state_forecast_steps = ( | ||
| self.da_state.elapsed_forecast_duration.size | ||
| ) | ||
| if n_state_forecast_steps < required_state_steps: | ||
| raise ValueError( | ||
| "The number of forecast steps available " | ||
| f"({n_forecast_steps}) is less than the required " | ||
| f"2+ar_steps (2+{self.ar_steps}={2 + self.ar_steps}) for " | ||
| "creating a sample with initial and target states." | ||
| "The number of state forecast steps available " | ||
| f"({n_state_forecast_steps}) is less than the required " | ||
| f"{required_state_steps} " | ||
| f"(max(2, num_past_forcing_steps={self.num_past_forcing_steps})" | ||
| f" + ar_steps={self.ar_steps}) for creating a sample with " | ||
| "initial and target states." | ||
| ) | ||
|
|
||
| # If forcing data is present, also validate that the complete | ||
| # forcing window can be constructed for each autoregressive target | ||
| # step without truncation. | ||
| if self.da_forcing is not None: | ||
| required_forcing_steps = ( | ||
| max(2, self.num_past_forcing_steps) | ||
| + self.ar_steps | ||
| + self.num_future_forcing_steps | ||
| ) | ||
| n_forcing_forecast_steps = ( | ||
| self.da_forcing.elapsed_forecast_duration.size | ||
| ) | ||
| if n_forcing_forecast_steps < required_forcing_steps: | ||
| raise ValueError( | ||
| "The number of forcing forecast steps available " | ||
| f"({n_forcing_forecast_steps}) is less than the " | ||
| f"required {required_forcing_steps} " | ||
| f"(max(2, num_past_forcing_steps={self.num_past_forcing_steps})" | ||
| f" + ar_steps={self.ar_steps} + " | ||
| f"num_future_forcing_steps={self.num_future_forcing_steps}) " | ||
| "for constructing forcing windows." | ||
| ) | ||
|
|
||
| return self.da_state.analysis_time.size | ||
| else: | ||
|
|
@@ -159,6 +190,7 @@ def __len__(self): | |
| - self.ar_steps | ||
| - max(2, self.num_past_forcing_steps) | ||
| - self.num_future_forcing_steps | ||
| + 1 | ||
| ) | ||
|
|
||
| def _slice_state_time(self, da_state, idx, n_steps: int): | ||
|
|
@@ -468,6 +500,17 @@ def __getitem__(self, idx): | |
| the target steps. | ||
|
|
||
| """ | ||
| dataset_len = len(self) | ||
|
|
||
| # Match Python sequence semantics for negative indexing. | ||
| if idx < 0: | ||
| idx += dataset_len | ||
| if idx < 0 or idx >= dataset_len: | ||
| raise IndexError( | ||
| f"Index {idx} is out of bounds for dataset of size " | ||
| f"{dataset_len}" | ||
| ) | ||
|
|
||
|
Comment on lines
+574
to
+582
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was at first conflicted about this change, because it only affects the analysis type data, and only if accessed programmatically e.g. from a test. And I thought that it might rather be separate issue. But, after consideration, I think this fits well within this PR. And knowing that there is more flexibility and robustness needed when we introduce #138 boundary datastores, this is good. Nothing to change here, just some context. |
||
| ( | ||
| da_init_states, | ||
| da_target_states, | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both da_state and da_forcing are always built from the same datastore time coordinates, so their sizes are guaranteed to be equal and a separate shape check is not needed. The single forecast-mode check is still necessary, but only because the required minimum size is larger when forcing is present (+ num_future_forcing_steps), not because the arrays could ever differ in size.
The no-forcing path can remain unchanged in behaviour; the with-forcing path should skip the redundant state check and goes straight to the stricter (and sufficient) forcing constraint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the redundant separate forcing-size check and now use the shared forecast horizon once:
2 + ar_stepsbehaviorRe-ran: