
fix!: Rework Loss Scalings to provide better modularity #52

Open · wants to merge 58 commits into base: main
Conversation

sahahner
Member

@sahahner sahahner commented Dec 27, 2024

Solve the problem explained in issue #7 by refactoring the variable scalings into a general variable scaling and a pressure level scaling.
@mc4117, @pinnstorm and I came up with a new structure, which this PR implements.

Changes by this PR

  • collect all available loss scalings in anemoi/training/losses/scaling
    • the new modular structure makes it easier to implement new scalings
  • move configuration of all available scalings into config/training/scalers
    • we provide a list of possible scalers
    • list the scalings you want to apply as scalers in a dictionary in the training_loss configuration
      -> unlike before, applying additional scalings no longer requires code changes; they can simply be added to the training-config files
    • the new default config files do not change the default scalers to be applied
  • new features of this PR include
    • define new scalers that are applied to a group of model/pressure level variables
    • the grouping of variables into surface and pressure level variables is no longer defined by the parsing of strings but is defined in config/training/scalers
    • introduce a util function to retrieve this grouping at other places in the code
    • tendency scaler:
      • introduce the ability to scale the losses by the statistical tendencies in the dataset. At infinite precision, this is equivalent to training towards a tendency loss
  • rename loss scalars to scalers, as this is the correct naming for the feature
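The tendency-scaler claim above can be checked numerically: because the prediction and the target share the same previous state, the state error equals the tendency error, so a state loss scaled by the squared tendency statistic matches an MSE on normalised tendencies. A minimal numpy sketch (variable names are illustrative, not the anemoi-training API):

```python
import numpy as np

rng = np.random.default_rng(0)
x_prev = rng.normal(size=8)           # state at the previous step, shared by pred and target
x_true = x_prev + rng.normal(size=8)  # true next state
x_pred = x_prev + rng.normal(size=8)  # predicted next state
sigma_tend = np.std(x_true - x_prev)  # stand-in for the dataset tendency statistic

# State-space MSE, scaled by the squared tendency statistic ...
scaled_state_loss = np.mean((x_pred - x_true) ** 2) / sigma_tend**2

# ... and the MSE on normalised tendencies. Because
# (x_pred - x_prev) - (x_true - x_prev) == x_pred - x_true,
# the two losses agree up to floating-point error.
tendency_loss = np.mean(
    ((x_pred - x_prev) / sigma_tend - (x_true - x_prev) / sigma_tend) ** 2
)
```

In exact arithmetic the two values are identical, which is the "at infinite precision" equivalence stated above.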

While this PR does introduce a breaking change to the training-config files, it comes with neat new features that should be useful to many of us.
We did our best to document the changes to the training-config files.

Tasklist

  • allow several variable level scalings (i.e. pressure level and model level)
  • implement/update tests
  • decide: do we want to allow scaling by variable_ref and variable_name, i.e. scale q_50 by q and q_50?
  • get variable level and name from dataset metadata if available
  • Change of name: loss scalars to scalers.
  • move node weights into new scaling submodule

📚 Documentation preview 📚: https://anemoi-training--52.org.readthedocs.build/en/52/

📚 Documentation preview 📚: https://anemoi-graphs--52.org.readthedocs.build/en/52/

📚 Documentation preview 📚: https://anemoi-models--52.org.readthedocs.build/en/52/

@sahahner sahahner linked an issue Dec 27, 2024 that may be closed by this pull request
@b8raoult
Collaborator

Please consider using the knowledge about variables that come from the dataset metadata. See https://github.com/ecmwf/anemoi-transform/blob/7cbf5f3d4baa37453022a5a97e17cc71a5b8ceeb/src/anemoi/transform/variables/__init__.py#L47

@sahahner sahahner linked an issue Dec 30, 2024 that may be closed by this pull request
@sahahner
Member Author

> Please consider using the knowledge about variables that come from the dataset metadata. See https://github.com/ecmwf/anemoi-transform/blob/7cbf5f3d4baa37453022a5a97e17cc71a5b8ceeb/src/anemoi/transform/variables/__init__.py#L47

We have given this some thought. Although we initially wanted to use the information from the dataset, I have opted to allow defining our own groups here, so that different scalings can be applied to self-defined groups.
I was also told that it is possible to build datasets without information about the variable types, so we should not rely on that metadata.
If you have strong opinions on this, I am happy to discuss it again.

@sahahner changed the title from "pressure level scalings only applied in specific circumstances" to "refactor variable scaling, pressure level scalings only applied in specific circumstances" on Jan 2, 2025
@FussyDuck

FussyDuck commented Jan 2, 2025

CLA assistant check
All committers have signed the CLA.

@HCookie HCookie self-requested a review January 6, 2025 14:36
@JPXKQX
Member

JPXKQX commented Jan 8, 2025

Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.

@mc4117
Member

mc4117 commented Jan 9, 2025

> Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.

Seems like a good idea! Would you like to add this in this PR?

@JPXKQX
Member

JPXKQX commented Jan 9, 2025

>> Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.
>
> Seems like a good idea! Would you like to add this in this PR?

I’m not sure what the best approach is. On the one hand, adding more work to this PR would increase its complexity, which might make it more logical to address this refactor in a future PR. On the other hand, this PR already introduces some changes to the configs, and the future PRs would also involve changes to the configs. From this, it might be better to have 1 PR and communicate all the changes to users at once. What do you think?

@pinnstorm
Member

>>> Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.
>>
>> Seems like a good idea! Would you like to add this in this PR?
>
> I’m not sure what the best approach is. On the one hand, adding more work to this PR would increase its complexity, which might make it more logical to address this refactor in a future PR. On the other hand, this PR already introduces some changes to the configs, and the future PRs would also involve changes to the configs. From this, it might be better to have 1 PR and communicate all the changes to users at once. What do you think?

I'm happy for it to be included in this PR! Not sure if @sahahner or @mc4117 have other views?

sahahner and others added 4 commits January 22, 2025 15:56
@jakob-schloer
Collaborator

jakob-schloer commented Jan 24, 2025

>>> Hi, I would like to know what you think about making all scalers explicit in the config file. Something similar to the additional_scalers: field, but including not only the scalers per variable, but also the node_loss_weight,... The positive aspect I see is that there would be more homogeneity in the scalers defined in the metrics/loss fields.
>>
>> Seems like a good idea! Would you like to add this in this PR?
>
> I’m not sure what the best approach is. On the one hand, adding more work to this PR would increase its complexity, which might make it more logical to address this refactor in a future PR. On the other hand, this PR already introduces some changes to the configs, and the future PRs would also involve changes to the configs. From this, it might be better to have 1 PR and communicate all the changes to users at once. What do you think?

I fully agree with @JPXKQX. I personally think that the config keywords related to the loss and its scaling are a bit scattered across the training config.
I would suggest bringing the restructuring into this PR: if this PR goes in, it will break old configs, and the restructuring would break old configs a second time.

Ideally, I think the config could look something like this:

loss_scaling:
  default: 1
  groups:
    default: sfc
    pl: []
  scalers:
    - _target_: anemoi.training.losses.scaling.ConstVariableScaler
      default: 1
      variables:
        q: 0.6 # 1
        t: 6   # 1
        u: 0.8 # 0.5
        ...
    - _target_: anemoi.training.losses.scaling.ReluVariableLevelScaler
      group: pl
      y_intercept: 0.2
      ...
    - _target_: anemoi.training.losses.nodeweights.GraphNodeAttribute
      target_nodes: ${graph.data}
      node_attribute: area_weight

@jakob-schloer left a comment (Collaborator)


Great effort! The flexible scaling of the loss is really nice.

However, in its current version, the config keywords related to the loss and its scaling, as well as the code, are a bit scattered. Why is variable scaling under training while nodeweights is under losses? In my opinion, everything should be under losses. See my comment above.

training/src/anemoi/training/config/training/default.yaml
@HCookie HCookie mentioned this pull request Jan 27, 2025
@HCookie changed the title from "fix!: variable scaling, pressure level scalings only applied in specific circumstances" to "fix!: Rework Loss Scalings to provide better modularity" on Jan 27, 2025
@@ -111,25 +134,24 @@ def __init__(

# Kwargs to pass to the loss function
Member


This should be added to the new structure.

Comment on lines +52 to +56
node_weights:
  _target_: anemoi.training.losses.nodeweights.GraphNodeAttribute
  target_nodes: ${graph.data}
  node_attribute: area_weight
  scale_dim: 2 # dimension on which scaling applied
Member


This class doesn't have a scale_dim attribute.
It may also be useful to add a general scale by node attribute scaler.
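A general scale-by-node-attribute scaler along those lines might look like the following minimal sketch. Class and method names here are hypothetical, not the actual anemoi-training API, and numpy stands in for the torch tensors used in practice:

```python
import numpy as np

# Hypothetical sketch of a general "scale by node attribute" scaler.
# Names are illustrative; the real anemoi-training API may differ.
class NodeAttributeScaler:
    def __init__(self, node_attribute: str):
        self.node_attribute = node_attribute

    def get_scaling(self, graph_data: dict) -> np.ndarray:
        # Look up the per-node attribute (e.g. "area_weight") and
        # normalise it so the weights average to 1 over the nodes,
        # preserving the overall loss magnitude.
        weights = np.asarray(graph_data[self.node_attribute], dtype=float)
        return weights * weights.size / weights.sum()

graph_data = {"area_weight": [1.0, 2.0, 3.0, 2.0]}
scaling = NodeAttributeScaler("area_weight").get_scaling(graph_data)
# scaling has one weight per node and averages to 1
```

Normalising to mean 1 keeps the scaled loss comparable in magnitude to the unscaled one, which is one plausible convention for such a scaler.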

Member Author


Not yet. Refactor is still ongoing.

Status: Under Review

Successfully merging this pull request may close these issues.

  • Pressure Level Scalings only applied in specific circumstances
  • Loss scalings
9 participants