A comprehensive implementation of the Transformer architecture from the ground up, inspired by the seminal "Attention Is All You Need" paper by Vaswani et al. This project meticulously constructs each component of the Transformer model, from multi-head self-attention to positional encodings, providing a clear, step-by-step exploration of how these elements combine into one of the most influential architectures in natural language processing.
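To give a flavor of the core building blocks, here is a minimal PyTorch sketch of scaled dot-product attention and the sinusoidal positional encoding as defined in the paper. The function and tensor names are illustrative and do not necessarily match this repository's code:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    # Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Block disallowed positions (e.g. padding or future tokens)
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v

def sinusoidal_positional_encoding(seq_len, d_model):
    # PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    pos = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                    * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe  # (seq_len, d_model), added to token embeddings
```

Multi-head attention then splits `d_model` into `h` heads, applies this attention function to each head in parallel, and concatenates the results before a final linear projection.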
In addition to the model architecture, this repository also includes fully implemented training and validation loops, allowing you to train the Transformer model on real-world datasets. As a demonstration of its capabilities, the model is applied to the OPUS Books dataset for language translation, showcasing the potential of Transformers in machine translation tasks.
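As a rough illustration of what such a training loop looks like, here is a hedged sketch of one teacher-forced epoch for a sequence-to-sequence Transformer. The model interface, batch keys, and dataloader are assumptions for the sketch, not this repository's exact API; the label smoothing of 0.1 follows the original paper:

```python
import torch
import torch.nn as nn

def train_one_epoch(model, dataloader, optimizer, device, pad_idx):
    # Assumes `model(src, tgt_input)` returns logits over the target
    # vocabulary with shape (batch, tgt_len - 1, vocab_size).
    criterion = nn.CrossEntropyLoss(ignore_index=pad_idx, label_smoothing=0.1)
    model.train()
    total_loss = 0.0
    for batch in dataloader:
        src = batch["src"].to(device)  # (batch, src_len) source token ids
        tgt = batch["tgt"].to(device)  # (batch, tgt_len) target token ids
        # Shift the target: the decoder sees tokens [0..n-1] and
        # predicts tokens [1..n] (teacher forcing).
        tgt_input, tgt_labels = tgt[:, :-1], tgt[:, 1:]
        logits = model(src, tgt_input)
        loss = criterion(logits.reshape(-1, logits.size(-1)),
                         tgt_labels.reshape(-1))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(dataloader)
```

The validation loop follows the same pattern with `model.eval()`, `torch.no_grad()`, and no optimizer step; translation quality is typically inspected via greedy or beam-search decoding on held-out OPUS Books pairs.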
This project is an ideal resource for anyone looking to gain a deeper understanding of Transformers by building and experimenting with the model from scratch.
References: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems, 30.