Copy Imputer #71

Open
icedoom888 opened this issue Jan 9, 2025 · 3 comments

@icedoom888
Contributor

icedoom888 commented Jan 9, 2025

Is your feature request related to a problem? Please describe.

Within the Alpine domain, converting from model levels to pressure levels produces many NaN values on the lower pressure levels. This is due to the steep orography.
The locations of the missing values change over time.

Describe the solution you'd like

Add an imputer that fills missing values by copying them from another field.
When a value is missing at a given pressure level, it would be useful to copy the value from the pressure level above.
This gives a consistent way of imputing areas with missing pressure levels and avoids introducing large jumps in the values along the vertical.
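
As a rough illustration of the idea (not the proposed implementation), the sketch below fills NaNs in one pressure level with the value at the same grid point on the level above. The function name `copy_impute`, the list-based data layout, and the toy values are all assumptions made for this example; a real processor would operate on tensors.

```python
import math


def copy_impute(target, source):
    """Return a copy of `target` where NaNs are replaced by `source` values.

    Both arguments are equal-length lists of floats, one value per grid point.
    Hypothetical helper, for illustration only.
    """
    if len(target) != len(source):
        raise ValueError("target and source must have the same number of grid points")
    return [s if math.isnan(t) else t for t, s in zip(target, source)]


# Toy temperatures (K) at three grid points; values are invented.
t_700 = [275.0, 274.0, 273.5]
t_850 = [283.0, math.nan, 282.0]
t_925 = [288.0, math.nan, math.nan]

# Work from the top down so a value imputed at 850 hPa can be reused at 925 hPa.
t_850_filled = copy_impute(t_850, t_700)
t_925_filled = copy_impute(t_925, t_850_filled)

print(t_850_filled)
print(t_925_filled)
```

Because the missing positions are detected from the data on every call, this approach does not rely on a fixed NaN map and so copes with NaN locations that move between time steps.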

Describe alternatives you've considered

No response

Additional context

Here is an example config for such an imputer:


mean_imputer:
  default: "none"

  "mean":
   - 10u
   - 10v
   - 2d
   - 2t
   - msl
   - sp

   - q_100
   - q_150
   - q_200
   - q_250
   - q_300
   - q_400
   - q_50
   - q_500

   - t_100
   - t_150
   - t_200
   - t_250
   - t_300
   - t_400
   - t_50
   - t_500
  
   - u_100
   - u_150
   - u_200
   - u_250
   - u_300
   - u_400
   - u_50
   - u_500
  
   - v_100
   - v_150
   - v_200
   - v_250
   - v_300
   - v_400
   - v_50
   - v_500
  
   - w_100
   - w_150
   - w_200
   - w_250
   - w_300
   - w_400
   - w_50
   - w_500
  
   - z_100
   - z_150
   - z_200
   - z_250
   - z_300
   - z_400
   - z_50
   - z_500
  
copy_imputer:
  default: "none"
  
  "q_500":
    - q_600
  "q_600":
    - q_700
  "q_700":
    - q_850
  "q_850":
    - q_925

  "t_500":
    - t_600
  "t_600":
    - t_700
  "t_700":
    - t_850
  "t_850":
    - t_925


  "u_500":
    - u_600
  "u_600":
    - u_700
  "u_700":
    - u_850
  "u_850":
    - u_925

  "v_500":
    - v_600
  "v_600":
    - v_700
  "v_700":
    - v_850
  "v_850":
    - v_925
  

  "w_500":
    - w_600
  "w_600":
    - w_700
  "w_700":
    - w_850
  "w_850":
    - w_925


  "z_500":
    - z_600
  "z_600":
    - z_700
  "z_700":
    - z_850
  "z_850":
    - z_925

processors:
  mean_imputer:
    _target_: anemoi.models.preprocessing.imputer.DynamicInputImputer
    _convert_: all
    config: ${data.mean_imputer}

  copy_imputer:
    _target_: anemoi.models.preprocessing.imputer.DynamicCopyImputer
    _convert_: all
    config: ${data.copy_imputer}
  
  normalizer:
    _target_: anemoi.models.preprocessing.normalizer.InputNormalizer
    _convert_: all
    config: ${data.normalizer}
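
One way to read the `copy_imputer` block, assuming it follows the same `method: [variables]` layout as `mean_imputer`, is that each key names the source field and each list entry names a field whose NaNs are filled from it. The sketch below inverts a small excerpt of that mapping into target-to-source form and orders the targets so that each source is filled before anything copies from it. `invert_copy_config` and `resolve_order` are hypothetical names used for illustration, not part of anemoi.

```python
def invert_copy_config(config):
    """Map each target variable to the source it copies from."""
    target_to_source = {}
    for source, targets in config.items():
        if source == "default":
            continue  # "default" holds the fallback method, not a source field
        for target in targets:
            if target in target_to_source:
                raise ValueError(f"{target} has more than one source")
            target_to_source[target] = source
    return target_to_source


def resolve_order(target_to_source):
    """Order targets so each source is imputed before anything copies from it."""
    ordered, visiting = [], set()

    def visit(var):
        if var in ordered or var not in target_to_source:
            return  # already placed, or a field that is never imputed
        if var in visiting:
            raise ValueError(f"copy cycle involving {var}")
        visiting.add(var)
        visit(target_to_source[var])
        visiting.discard(var)
        ordered.append(var)

    for target in target_to_source:
        visit(target)
    return ordered


# Excerpt of the config above, deliberately listed out of order.
copy_imputer = {
    "default": "none",
    "q_700": ["q_850"],
    "q_500": ["q_600"],
    "q_850": ["q_925"],
    "q_600": ["q_700"],
}

mapping = invert_copy_config(copy_imputer)
order = resolve_order(mapping)
print(order)
```

Resolving the order explicitly means the result would not depend on how the entries happen to be listed in the YAML file, and a circular chain of copies is rejected early.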

Organisation

MeteoSwiss

icedoom888 added the enhancement and models labels on Jan 9, 2025
icedoom888 self-assigned this on Jan 9, 2025
@HCookie
Member

HCookie commented Jan 15, 2025

Could these issues with NaN values within the orography be resolved by ignoring them in the calculation of the loss with the loss_weights_mask? Or are there other issues with the NaNs?

@icedoom888
Contributor Author

@HCookie the dynamic nature of these NaNs makes it impossible to simply ignore them in the loss. If NaNs are ignored during training, you cannot mask them out at inference time using a constant NaN map (as was done before). The model therefore has to learn to predict the imputed values, which means those values need to be meaningful rather than statistics-based or constant.
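
To make the argument concrete, here is a toy sketch (invented values, hypothetical variable names) in which the NaN positions differ between two time steps, so a constant mask derived from one step no longer matches the other.

```python
import math


def nan_mask(field):
    """Return True where the field is missing."""
    return [math.isnan(v) for v in field]


# Same pressure level at two time steps; values are invented for illustration.
z_850_step0 = [1500.0, math.nan, 1480.0, math.nan]
z_850_step1 = [1510.0, 1495.0, math.nan, math.nan]

static_mask = nan_mask(z_850_step0)  # a constant map fixed from one time step
actual_mask = nan_mask(z_850_step1)  # where data is really missing later on

# Grid points where the constant map disagrees with the actual missing data.
mismatch = [i for i, (s, a) in enumerate(zip(static_mask, actual_mask)) if s != a]
print(mismatch)
```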

@HCookie
Member

HCookie commented Jan 16, 2025

@icedoom888 Thanks for the clarification, makes sense.
