Specifying normalization layers. #31
Comments
Cathal already did experiments with RMSNorm (but from TransformerEngine, I think). It might have been hard-coded, but it would be good to coordinate. CC: @cathalobrien
Hey, yeah, I have this PR: ecmwf/anemoi-models#35. I put it on ice a while back because I thought it would cause problems in inference if we had arbitrary functions in the checkpoint file. But now that the checkpoints are weights-only, it should be fine. I can refresh it next week.
I see, this is related but I was thinking of something more general. I would like to be able to write custom normalization layers, e.g.
Do you think this could be combined with your PR @cathalobrien?
Ah I see, yeah, I think this should work. I already have this implemented:
LayerNorm:
  #_target_: "torch.nn.LayerNorm" # the default PyTorch implementation
  _target_: "liger_kernel.transformers.rms_norm.LigerRMSNorm" # my desired layernorm
  _partial_: True
I haven't tried with a handwritten layernorm, but I assume as long as the import in [...] I like your idea of passing [...]
On second thought, I believe it should be only **kwargs, in case someone wants to do something else in the forward function in the future.
Yes, e.g. cross attention or some fancy bias terms for the attention could also be passed.
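To make the two ideas in this thread concrete, here is a minimal, framework-free sketch. It is not the code from the linked PR. `ToyRMSNorm`, `num_channels`, `layer_factory` and the `scale` keyword are hypothetical names chosen for illustration, and plain Python lists stand in for tensors so the example runs without PyTorch. It shows (a) a factory with bound arguments, which is roughly what a config entry with `_partial_: True` produces, and (b) a `forward` that accepts `**kwargs` so extra inputs can be passed later without changing the signature.

```python
import math
from functools import partial


class ToyRMSNorm:
    """Hypothetical stand-in for a custom normalization layer.

    A real layer would subclass torch.nn.Module; lists are used here
    only so that the sketch has no third-party dependencies.
    """

    def __init__(self, num_channels, eps=1e-6):
        self.num_channels = num_channels
        self.eps = eps
        # Per-channel scale; this would be a learnable parameter in a real layer.
        self.weight = [1.0] * num_channels

    def forward(self, x, **kwargs):
        # Extra keyword arguments are accepted so callers can pass more
        # inputs in the future. `scale` is a made-up example of one.
        scale = kwargs.get("scale", 1.0)
        rms = math.sqrt(sum(v * v for v in x) / len(x) + self.eps)
        return [scale * w * v / rms for w, v in zip(self.weight, x)]


# Roughly what a `_partial_: True` config entry yields: the arguments from
# the config are bound, and the model supplies the size when it builds the layer.
layer_factory = partial(ToyRMSNorm, eps=1e-6)

norm = layer_factory(num_channels=4)
out = norm.forward([1.0, 2.0, 3.0, 4.0])
out_scaled = norm.forward([1.0, 2.0, 3.0, 4.0], scale=2.0)
```

Layers that ignore the extra keyword arguments keep working unchanged, which is the main argument for using only `**kwargs` in the signature.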
I am closing this, since PR ecmwf/anemoi-models#35 already covers it.
Is your feature request related to a problem? Please describe.
Currently, the processor is implemented with LayerNormalization. I would like to use other normalization layers (https://pytorch.org/docs/stable/nn.html#normalization-layers) including custom normalization layers.
Describe the solution you'd like
I would like to specify the normalization layer of the processor in the config, e.g. transformer.yaml:
Describe alternatives you've considered
No response
Additional context
No response
Organisation
No response