I test the model in an NLP task. #5

zshy1205 · 2021-06-22T02:35:48Z

I am using the AFTFull model with 6 layers.
I build the layers in the init method with this code:

self.encoder_transformer = nn.ModuleList()
for _ in range(6):
    self.encoder_transformer.append(AFTFull(max_seqlen=500, dim=512, hidden_dim=256))

In the forward function, I use this code:

for layer in self.encoder_transformer:
    x = layer(x) + x

I originally used a standard transformer encoder and have now replaced it with this model, but the training loss becomes NaN. Is something wrong with how I use it? How should the model be stacked for multiple layers? Please help, thank you.
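
For narrowing down where the NaN first appears, here is a minimal sketch of the same residual stacking with a per-layer finiteness check. It is not the library's recommended usage: `ResidualStack` and `make_layer` are hypothetical names, the `nn.LayerNorm` before each layer is a swapped-in pre-norm technique that the original code does not have, and `nn.Linear` is only a stand-in so the snippet runs without the AFT package.

```python
import torch
import torch.nn as nn


class ResidualStack(nn.Module):
    """Hypothetical helper: pre-norm residual stack with a per-layer NaN/Inf check."""

    def __init__(self, make_layer, dim, depth=6):
        super().__init__()
        # One sub-layer and one LayerNorm per depth level.
        self.layers = nn.ModuleList([make_layer() for _ in range(depth)])
        self.norms = nn.ModuleList([nn.LayerNorm(dim) for _ in range(depth)])

    def forward(self, x):
        for i, (norm, layer) in enumerate(zip(self.norms, self.layers)):
            out = layer(norm(x))  # normalize the input before the sub-layer (assumption)
            if not torch.isfinite(out).all():
                # Report the first layer whose output contains NaN or Inf.
                raise RuntimeError(f"non-finite output at layer {i}")
            x = x + out  # same residual connection as in the forward loop above
        return x


# Stand-in layer (assumption). With the real model this would be something like
# lambda: AFTFull(max_seqlen=500, dim=512, hidden_dim=256)
stack = ResidualStack(lambda: nn.Linear(512, 512), dim=512, depth=6)
x = torch.randn(2, 500, 512)  # (batch, sequence length, dim)
y = stack(x)
```

If the check fires on the very first training steps, the inputs or the layer itself are the likely suspects; if it fires only later, lowering the learning rate or clipping gradients with `torch.nn.utils.clip_grad_norm_` may be worth trying.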

rish-16 · 2021-06-22T14:51:35Z

Hey, thanks. I'll look into it ASAP. Give me a while!
