
Update vit.py to include LayerNorm in the MLP head which is missing #317

Open · wants to merge 1 commit into main

Conversation

vivekh2000

I have included the LayerNorm layer in the classification head to align it with the original implementation; it will be easy to compare the two now.
There was also an isolated LayerNorm, which I have removed and folded into the MLP head, again to make the structure more intuitive (sketched below). For reference, you include a LayerNorm in the self.distill_mlp attribute of the DistillWrapper class in distill.py. I hope these changes are acceptable.

Thank you for the excellent code base.
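
For context, a minimal sketch of the two placements under discussion, assuming the usual vit-pytorch structure (the variable names are illustrative, not the repo's):

```python
import torch.nn as nn

dim, num_classes = 1024, 1000  # illustrative sizes

# current vit.py arrangement: a standalone LayerNorm applied after the
# transformer, followed by a bare linear classification head
norm = nn.LayerNorm(dim)
head = nn.Linear(dim, num_classes)

# proposed arrangement: fold the norm into the classification head,
# mirroring how distill.py builds its distill_mlp
mlp_head = nn.Sequential(
    nn.LayerNorm(dim),
    nn.Linear(dim, num_classes),
)
```

When the standalone norm is applied immediately before the head, the two arrangements compute exactly the same function; the difference is purely organizational.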

@lucidrains
Owner

@vivekh2000 hey Vivek, thanks for this PR

so i think the layernorm should be kept at its current position, as most transformers these days end with a final norm

will definitely remove the layernorm from the distill wrapper though (or make it an option, in the case that some of the variants do not have the final norm)
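
One way the optional norm might be sketched (the DistillMLP name and use_final_norm flag here are hypothetical, not taken from distill.py):

```python
import torch.nn as nn

class DistillMLP(nn.Module):
    # hypothetical sketch: gate the final LayerNorm behind a flag so
    # ViT variants that lack a final norm can opt out of it
    def __init__(self, dim, num_classes, use_final_norm = True):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim) if use_final_norm else nn.Identity(),
            nn.Linear(dim, num_classes)
        )

    def forward(self, x):
        return self.net(x)
```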
