This repository contains a from-scratch implementation of the Transformer model, built with TensorFlow. The model dimensions are the same as those used in the paper "Attention Is All You Need" (Vaswani et al.). The model is built for language translation: the encoder takes the input sentence and converts it into a matrix, which is then passed to the decoder; during training, the decoder also receives the target output (teacher forcing). The project consists of six files, all of which depend on each other. If you want to understand Transformers in more depth, step through the code in a debugger.
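To make the training-time data flow concrete, here is a minimal sketch of teacher forcing: the decoder is fed the target sequence shifted right by one position, and is trained to predict the next token at each step. The token ids and the `START`/`END` markers below are hypothetical, purely for illustration.

```python
# Hypothetical token ids illustrating teacher forcing:
# the decoder input is the target shifted right, the labels
# are the target shifted left, so each position predicts the next token.
START, END = 1, 2
target = [START, 5, 9, 7, END]   # ground-truth translation, as token ids
decoder_input = target[:-1]      # what the decoder sees: [START, 5, 9, 7]
decoder_labels = target[1:]      # what it must predict:  [5, 9, 7, END]
print(decoder_input, decoder_labels)
```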
- Self-attention mechanism
- Multi-head attention
- Positional encoding
- Encoder-decoder architecture
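The first three components above can be sketched in a few lines. The following is a minimal NumPy illustration (not the repository's TensorFlow code) of single-head scaled dot-product self-attention and sinusoidal positional encoding, following the formulas in the paper; the small `d_model = 8` is chosen only to keep the example readable (the paper uses 512).

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    # Numerically stable softmax over the last axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

def positional_encoding(seq_len, d_model):
    # PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return pe

# One batch, 4 tokens, toy d_model = 8 (the paper uses 512)
x = np.random.default_rng(0).normal(size=(1, 4, 8))
x = x + positional_encoding(4, 8)                 # inject position information
out, w = scaled_dot_product_attention(x, x, x)    # self-attention: Q = K = V
print(out.shape)  # (1, 4, 8)
```

Multi-head attention repeats this computation over several learned projections of Q, K, and V in parallel and concatenates the results.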
```shell
git clone https://github.com/Rohit2sali/TransformerFromScratch.git
cd TransformerFromScratch
```