deprecate LayerNormFp32 #850

EIFY · 2024-04-01T00:01:04Z

Modern pytorch (1.10+) always performs LN in fp32:

For example, LayerNorm has to be done in fp32 and recent pytorch (1.10+) has been fixed to do that regardless of the input types, but earlier pytorch versions accumulate in the input type which can be an issue.

So it's no longer necessary to use LayerNormFp32 to explicitly cast to fp32. However, the built-in torch.nn.LayerNorm always returns in fp32 when run under the autocast() context, so we still need the LayerNorm subclass to cast back. See also pytorch/pytorch#66707 (comment).
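A minimal sketch of the "cast back" subclass described above, under stated assumptions: the class name `CastBackLayerNorm` is hypothetical (not taken from the repository), and the autocast demonstration only runs when a CUDA device is present. It prints the observed dtypes rather than asserting them.

```python
import torch
import torch.nn as nn


class CastBackLayerNorm(nn.LayerNorm):
    """Hypothetical subclass: run the stock LayerNorm, then return the input dtype."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        orig_dtype = x.dtype
        y = super().forward(x)
        # Under autocast the built-in op may hand back float32; cast it back.
        return y.to(orig_dtype)


if torch.cuda.is_available():
    plain = nn.LayerNorm(8).cuda()
    cast_back = CastBackLayerNorm(8).cuda()
    x = torch.randn(2, 8, device="cuda", dtype=torch.float16)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        print("built-in output dtype:", plain(x).dtype)
        print("cast-back output dtype:", cast_back(x).dtype)
```

The subclass does not change how the normalization itself is computed; it only restores the caller's dtype on the way out.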

Modern pytorch always performs LN in fp32.

rwightman · 2024-05-09T15:25:02Z

@EIFY I don't think this is quite the case. In an autocast context it returns float32 because the op is upcast to float32 under AMP. But we aren't using this when AMP is enabled; it's used when pure float16/bfloat16 is enabled, and then it does make a difference. Even if the reduction is done internally in float32, the affine ops will be done in low precision, whereas in LayerNormFp32 everything is done in float32 regardless of the input dtype.
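To make the pure low-precision case concrete, here is a rough sketch, not the repository's implementation. The function names `ln_keep_dtype` and `ln_fp32` are hypothetical, the random affine parameters are made up for illustration, and bfloat16 on CPU is an assumption chosen only so the snippet runs without a GPU. It compares both paths against a float64 reference.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 64
x = torch.randn(4, dim).to(torch.bfloat16)  # activations in pure bf16 mode
weight = torch.randn(dim)                   # fp32 copies of the affine parameters
bias = torch.randn(dim)


def ln_keep_dtype(x, weight, bias, eps=1e-5):
    # Low-precision path: the affine parameters are cast down to the input dtype.
    return F.layer_norm(x, (x.shape[-1],), weight.to(x.dtype), bias.to(x.dtype), eps)


def ln_fp32(x, weight, bias, eps=1e-5):
    # Fp32 path: upcast the input, keep fp32 parameters, cast only the result back.
    y = F.layer_norm(x.float(), (x.shape[-1],), weight.float(), bias.float(), eps)
    return y.to(x.dtype)


# float64 reference computed from the same bf16 inputs and the fp32 parameters
ref = F.layer_norm(x.double(), (dim,), weight.double(), bias.double(), 1e-5)

out_lp = ln_keep_dtype(x, weight, bias)
out_fp32 = ln_fp32(x, weight, bias)

print("output dtypes:", out_lp.dtype, out_fp32.dtype)
print("max abs error, low-precision path:", (out_lp.double() - ref).abs().max().item())
print("max abs error, fp32 path:", (out_fp32.double() - ref).abs().max().item())
```

Both paths return the input dtype, so any difference comes from where the affine parameters and arithmetic lose precision, not from the output type.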

deprecate LayerNormFp32

6849a61

Modern pytorch always performs LN in fp32.

rwightman closed this May 31, 2024
