I am using the AFTFull model with 6 layers.
I create the layers in `__init__` with this code:
self.encoder_transformer = nn.ModuleList()
for _ in range(6):
    self.encoder_transformer.append(AFTFull(max_seqlen=500, dim=512, hidden_dim=256))
In the forward function, I apply the layers with a residual connection:
for layer in self.encoder_transformer:
    x = layer(x) + x
Originally I used a standard Transformer encoder. After replacing it with AFTFull, the training loss becomes NaN. Is something wrong with how I am using it? How should the model be used with multiple layers? Please help, thank you.
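To help narrow this down, here is a minimal, self-contained sketch of the setup described above. It is not the actual model: `StandInLayer` is a hypothetical placeholder used instead of AFTFull so the snippet runs without the library, and the `Encoder` class name and the finiteness check are additions for illustration only. The check raises as soon as activations stop being finite, which should show which layer first produces NaN or Inf.

```python
import torch
import torch.nn as nn


class StandInLayer(nn.Module):
    """Hypothetical placeholder for AFTFull: (batch, seq, dim) -> (batch, seq, dim)."""

    def __init__(self, dim=512, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, dim),
        )

    def forward(self, x):
        return self.net(x)


class Encoder(nn.Module):
    def __init__(self, num_layers=6, dim=512, hidden_dim=256):
        super().__init__()
        # Same stacking pattern as in the question: a ModuleList of identical layers.
        self.encoder_transformer = nn.ModuleList(
            [StandInLayer(dim, hidden_dim) for _ in range(num_layers)]
        )

    def forward(self, x):
        for i, layer in enumerate(self.encoder_transformer):
            x = layer(x) + x  # residual connection, as in the question
            # Debugging aid (not in the original code): stop at the first bad layer.
            if not torch.isfinite(x).all():
                raise RuntimeError(f"non-finite activations after layer {i}")
        return x


torch.manual_seed(0)
encoder = Encoder()
out = encoder(torch.randn(2, 500, 512))
print(out.shape)  # torch.Size([2, 500, 512])
```

Swapping `StandInLayer` for the real AFTFull layer while keeping the check in place should show whether the NaN originates inside a specific layer's forward pass or only appears later, for example during the backward pass or the optimizer step.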