
[Feature] Copy Imputer #72

Open
wants to merge 4 commits into base: develop
Conversation

@icedoom888 (Contributor) commented Jan 9, 2025

Add an imputer that copies missing information from another field.

When a value is missing at a certain pressure level, it is useful to copy the value from the pressure level above. This keeps the imputation consistent in regions with missing pressure levels and avoids introducing large jumps in the vertical profile.
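To illustrate the idea, here is a minimal standalone sketch (hypothetical code, not the anemoi-models implementation): wherever the destination channel is NaN, the value of a designated source channel at the same grid point is copied in.

```python
import torch


def copy_impute(x: torch.Tensor, dst: int, src: int) -> torch.Tensor:
    """Fill NaNs in channel `dst` of x[..., channel] with values from channel `src`.

    Hypothetical sketch of the copy-imputer idea: `dst` could be a field at a
    pressure level with missing data, `src` the same field at the level above.
    """
    nan_mask = torch.isnan(x[..., dst])
    # The source field must itself be complete wherever we copy from.
    assert not torch.isnan(x[..., src][nan_mask]).any(), "NaNs found in source field."
    x[..., dst][nan_mask] = x[..., src][nan_mask]
    return x


# Example: two grid points, channels [source, destination].
x = torch.tensor([[1.0, float("nan")], [2.0, 3.0]])
copy_impute(x, dst=1, src=0)
# x[0, 1] is filled from channel 0; x[1, 1] is left untouched.
```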

Solves #71

@icedoom888 icedoom888 self-assigned this Jan 9, 2025
@FussyDuck commented Jan 9, 2025

CLA assistant check
All committers have signed the CLA.

@icedoom888 icedoom888 changed the title Feature/copy imputer [Feature] Copy Imputer Jan 9, 2025
@HCookie (Member) commented Jan 14, 2025

Can you please provide more details in the description about where this imputer may be useful, along with some example code/config of it in action?

@icedoom888 (Contributor, Author) replied:

> Can you please provide more details in the description about where this imputer may be useful, along with some example code/config of it in action?

@HCookie Done ✔️ Is it clearer now?

@icedoom888 icedoom888 requested a review from HCookie January 15, 2025 14:14
Comment on lines +237 to +241

```
default: "none"
x:
- y
- q
```
Member

Some comments explaining what these characters represent would help here.

```
@@ -303,3 +408,50 @@ def __init__(
"You are using a dynamic Imputer: NaN values will not be present in the model predictions. \
The model will be trained to predict imputed values. This might deteriorate performances."
)


class DynamicCopyImputer(CopyImputer):
```
Member

A little more detail on the exact differences and intended uses would be helpful to other users.

Comment on lines +428 to +453

```python
self.loss_mask_training = torch.ones(
    (x.shape[-2], len(self.data_indices.model.output.name_to_index)), device=x.device
)

# Choose correct index based on number of variables
if x.shape[-1] == self.num_training_input_vars:
    index = self.index_training_input
elif x.shape[-1] == self.num_inference_input_vars:
    index = self.index_inference_input
else:
    raise ValueError(
        f"Input tensor ({x.shape[-1]}) does not match the training "
        f"({self.num_training_input_vars}) or inference shape ({self.num_inference_input_vars})",
    )

# Replace values
for idx_src, (idx_dst, value) in zip(self.index_training_input, zip(index, self.replacement)):
    if idx_dst is not None:
        assert not torch.isnan(
            x[..., self.data_indices.data.input.name_to_index[value]][nan_locations[..., idx_src]]
        ).any(), f"NaNs found in {value}."
        x[..., idx_dst][nan_locations[..., idx_src]] = x[
            ..., self.data_indices.data.input.name_to_index[value]
        ][nan_locations[..., idx_src]]

return x
```
Member

Given the similarities between this and the parent class implementation, would it make sense to factor it out into a shared function, or into a generic parent class with two children?

@HCookie HCookie removed the request for review from JesperDramsch January 16, 2025 16:14
Labels
None yet
Projects
Status: Under Review
3 participants