Fix/combined loss #70

Draft
wants to merge 5 commits into
base: main

Conversation

OpheliaMiralles
Contributor

Try to fix #68
Add tests

@FussyDuck

FussyDuck commented Jan 9, 2025

CLA assistant check
All committers have signed the CLA.

@HCookie HCookie self-requested a review January 14, 2025 15:08
Comment on lines 124 to 141
if config.training.training_loss._target_ == 'anemoi.training.losses.combined.CombinedLoss':
    assert "loss_weights" in config.training.training_loss, "Loss weights must be provided for combined loss"
    losses = []
    ignore_nans = config.training.training_loss.get("ignore_nans", False)  # no point in doing this for each loss, nan+nan is nan
    for loss in config.training.training_loss.losses:
        node_weighting = instantiate(loss.node_weights)
        loss_node_weights = node_weighting.weights(graph_data)
        loss_node_weights = self.output_mask.apply(loss_node_weights, dim=0, fill_value=0.0)
        loss_instantiated = self.get_loss_function(loss, scalars=self.scalars, **{"node_weights": loss_node_weights, "ignore_nans": ignore_nans})
        losses.append(loss_instantiated)
        assert isinstance(loss_instantiated, BaseWeightedLoss)
    self.loss = instantiate({"_target_": config.training.training_loss._target_}, losses=losses, loss_weights=config.training.training_loss.loss_weights, **loss_kwargs)
else:
    self.loss = self.get_loss_function(config.training.training_loss, scalars=self.scalars, **loss_kwargs)
assert isinstance(self.loss, BaseWeightedLoss) and not isinstance(
    self.loss,
    torch.nn.ModuleList,
), f"Loss function must be a `BaseWeightedLoss`, not a {type(self.loss).__name__!r}"
Member

I think this is overly specific to this use case, and it instantiates objects unnecessarily.

Contributor Author

Instantiating node_weights was necessary to call the combined loss but if you find a way around it, please let me know... I have another version where all of this is implemented in the get_loss_function from the forecaster. It is cleaner so I'll try to commit it soon.

Member

Hi, yeah, as I wrote the loss function code originally, I was able to find a way around it by updating only the CombinedLoss class.

Member

@HCookie HCookie Jan 16, 2025

If you'd like, we can work together on https://github.com/ecmwf/anemoi-core/tree/fix/combined_loss_hcookie to make sure your use case is addressed.

Contributor Author

@OpheliaMiralles OpheliaMiralles Jan 20, 2025

I don't know, for me the CombinedLoss is not a BaseWeightedLoss, so I don't really see the point in trying to make it fit this base class. The weights don't mean the same thing here, and the individual losses should probably all have separate node weights. I'll update this PR today. Let me know what you think, but really we should not be afraid of separating use cases when they don't match, don't you think?

Member

@HCookie HCookie Jan 20, 2025

While the CombinedLoss may not be a clear use case of BaseWeightedLoss, it is still an anemoi loss function, so an inheritance-based structure makes sense. I am very wary of any solution that requires hard-coding of any sort. Anemoi is designed to be a generic framework, so following proper OOP principles is a must; otherwise the main classes end up with massive branching behaviour, which is both hard to read and hard to use. (This is already the case in the GraphForecaster.)

> the individual losses should probably all have separate node weights

Excluding your use case of different losses for different params, in what case will this be true? Having a weighting between losses and then different relative weightings within the losses will massively increase complexity, and in my opinion be very hard to interpret.

Member

There have been some changes implemented in #52 that I think may be interesting for your use cases? Shall we move this discussion to slack and organise a call?

Contributor Author

In this use case, the node weights are defined as several masks, each describing the grid for a different data source (a radar mask, a pointwise station mask, and a satellite mask). I believe this may be a common use case in data assimilation, but of course it is part of a broader discussion. OK to move the discussion to Slack. We can have a call next week; let's schedule it on Slack too.
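To make the per-source masking idea concrete, here is a minimal sketch in plain Python. It is not the anemoi API: `masked_mse`, `combined_loss`, the five-node grid, and the mask values are all hypothetical, and every sub-loss is assumed to be a weighted MSE purely for illustration.

```python
def masked_mse(pred, target, node_weights):
    """Weighted mean squared error over nodes; zero-weight nodes are ignored."""
    total_weight = sum(node_weights)
    if total_weight == 0:
        return 0.0
    weighted = sum(w * (p - t) ** 2 for p, t, w in zip(pred, target, node_weights))
    return weighted / total_weight


def combined_loss(pred, target, masks, loss_weights):
    """Weighted sum of per-source losses, each using its own node mask."""
    assert len(masks) == len(loss_weights), "one loss weight per mask"
    return sum(lw * masked_mse(pred, target, m) for m, lw in zip(masks, loss_weights))


# Hypothetical grid of five nodes; each data source observes a different subset.
radar_mask = [1.0, 1.0, 0.0, 0.0, 0.0]
station_mask = [0.0, 0.0, 1.0, 0.0, 0.0]
satellite_mask = [1.0, 1.0, 1.0, 1.0, 1.0]

pred = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [1.0, 1.0, 1.0, 4.0, 5.0]

loss = combined_loss(
    pred,
    target,
    masks=[radar_mask, station_mask, satellite_mask],
    loss_weights=[0.5, 0.3, 0.2],
)
```

In this sketch the two levels of weighting stay separate: `loss_weights` balances the data sources against each other, while each mask only selects which nodes a given source contributes to.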

Comment on lines 90 to 91
elif hasattr(loss, "__class__"):
    self.losses.append(loss)
Member

Why are we checking for __class__? If checking for an object why not isinstance(loss, object)?
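For context on this question: in Python 3 every object, including a class itself, has a `__class__` attribute and is an instance of `object`, so neither check separates an instantiated loss from a loss class. A minimal sketch, using a hypothetical `DummyLoss` that is not part of anemoi, with `isinstance(x, type)` shown as one possible way to tell the two apart:

```python
class DummyLoss:
    """Hypothetical stand-in for a loss class; not an anemoi class."""

    def __call__(self, pred, target):
        return abs(pred - target)


loss_class = DummyLoss       # the class itself, not yet instantiated
loss_instance = DummyLoss()  # an instantiated loss

# Both checks succeed for the class and for the instance alike.
checks = [
    hasattr(loss_class, "__class__"),
    hasattr(loss_instance, "__class__"),
    isinstance(loss_class, object),
    isinstance(loss_instance, object),
]

# isinstance(x, type) is True only for the class, not for the instance.
is_class = [isinstance(loss_class, type), isinstance(loss_instance, type)]
```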

Contributor Author

Because it could originally only take a class (of type "type", not instantiated) as the losses argument. Indeed, the loss(**kwargs) call later in the function expects init arguments for the individual loss object, not forward arguments. As I said, I'll try to commit my recent changes later.
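The distinction between init arguments and forward arguments can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `DummyLoss`, its `scale` parameter, and `resolve_loss` are hypothetical names.

```python
class DummyLoss:
    """Hypothetical loss; `scale` is an init argument, not a forward argument."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def __call__(self, pred, target):
        return self.scale * (pred - target) ** 2


def resolve_loss(loss, **init_kwargs):
    """Return an instantiated loss from either a class or an instance."""
    if isinstance(loss, type):
        # A class: the keyword arguments are constructor arguments.
        return loss(**init_kwargs)
    if callable(loss):
        # Already instantiated: use it as-is and ignore the init kwargs,
        # since calling it here would run the forward pass instead.
        return loss
    raise TypeError(f"Unsupported loss specification: {type(loss).__name__!r}")


from_class = resolve_loss(DummyLoss, scale=2.0)
from_instance = resolve_loss(DummyLoss(scale=3.0), scale=2.0)
```

Here `from_class` is built with the supplied init arguments, while `from_instance` keeps the configuration it was created with.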

@OpheliaMiralles OpheliaMiralles marked this pull request as draft January 16, 2025 19:26
Labels
None yet
Projects
Status: Now In Progress
Development

Successfully merging this pull request may close these issues.

CombinedLoss not working/not tested
4 participants