I tested the model on an NLP task. #5

Open
zshy1205 opened this issue Jun 22, 2021 · 1 comment

Comments

zshy1205 commented Jun 22, 2021

I use the AFTFull model with 6 layers. I construct it in __init__ with this code:

import torch.nn as nn
from aft_pytorch import AFTFull

self.encoder_transformer = nn.ModuleList()
for _ in range(6):
    self.encoder_transformer.append(AFTFull(max_seqlen=500, dim=512, hidden_dim=256))

and in the forward function I apply them with residual connections:

for layer in self.encoder_transformer:
    x = layer(x) + x  # residual connection around each layer

Originally I used a standard Transformer encoder. After replacing it with AFTFull, the training loss becomes NaN. Is something wrong? How do you use the model with many layers? Please help me, thank you.
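For reference, one common way to stabilize a deep stack of attention layers is a pre-norm residual wrapper (LayerNorm before each layer), as in standard Transformer encoders. Below is a minimal sketch assuming the AFTFull signature used above; the PreNormAFTEncoder class name is hypothetical, not part of the library:

import torch
import torch.nn as nn
from aft_pytorch import AFTFull

class PreNormAFTEncoder(nn.Module):
    # Hypothetical sketch: apply LayerNorm before each AFTFull layer
    # (pre-norm), which often keeps deep residual stacks from blowing up.
    def __init__(self, depth=6, max_seqlen=500, dim=512, hidden_dim=256):
        super().__init__()
        self.layers = nn.ModuleList(
            AFTFull(max_seqlen=max_seqlen, dim=dim, hidden_dim=hidden_dim)
            for _ in range(depth)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(dim) for _ in range(depth))

    def forward(self, x):
        for norm, layer in zip(self.norms, self.layers):
            x = x + layer(norm(x))  # residual around the normalized branch
        return x

enc = PreNormAFTEncoder()
x = torch.randn(4, 500, 512)  # (batch, seq_len, dim)
out = enc(x)                  # same shape: (4, 500, 512)

Gradient clipping and a lower learning rate are also common first checks when a loss goes NaN after swapping attention implementations.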

rish-16 (Owner) commented Jun 22, 2021

Hey, thanks. I'll look into it ASAP. Give me a while!
