# MT Exercise 4: Layer Normalization for Transformer Models

## Changes made to the code

- Uncommented the code marked in the exercise sheet that caused issues when training on the CPU.
- Changed train.sh to use a different model name and config file, and set the number of CPU cores to 16 (see the sketch after this list).
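
To make the second change concrete, here is a minimal sketch of what the adjusted lines in train.sh could look like, assuming a JoeyNMT-style training call; the variable names, paths, and entry point are assumptions, not this repo's actual contents.

```bash
#!/bin/bash
# Hypothetical sketch only: variable names, paths, and the training entry
# point are assumptions, not this repo's actual contents.

export OMP_NUM_THREADS=16             # cap CPU threads at 16 (one common way to do this)

model_name=transformer_prenorm        # new model name (assumed)
config=configs/"$model_name".yaml     # one of the added config files (assumed)

python -m joeynmt train "$config"     # JoeyNMT-style training call (assumed)
```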

## Added files

- In the configs folder, I added new config files for pre- and post-normalization. Both config files are set to train on the GPU.
- Added scripts/parse_format_ppl.py, which takes the values extracted from the training logs, writes the perplexity values to a TSV file, and creates a line plot of them.
- Added scripts/extraxt_ppl.sh, which iterates over the training logs, extracts the loss, perplexity, and accuracy values, and then calls parse_format_ppl.py to process them as described above (see the sketch after this list).
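
As a rough illustration of how the two scripts fit together, here is a minimal sketch of the extraction step; the log locations, the log format, the output directory, and the command-line interface of parse_format_ppl.py are all assumptions, not the repo's actual values.

```bash
#!/bin/bash
# Sketch only: log locations, log format, output directory, and the
# parse_format_ppl.py interface are assumptions.

logdir=models            # one subfolder per trained model (assumed)
outdir=ppl_extracted     # where the raw extracted lines go (assumed)
mkdir -p "$outdir"

for log in "$logdir"/*/train.log; do
    run=$(basename "$(dirname "$log")")
    # keep only the lines that report loss, perplexity, and accuracy
    grep -E "loss|ppl|acc" "$log" > "$outdir/$run.txt"
done

# format the extracted values into a TSV file and draw the line plot
python scripts/parse_format_ppl.py "$outdir"
```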

## How to use

After running the setup described in the exercise sheet and activating the virtual environment, you can run the training script; make sure train.sh points at the config file you want. Before extracting the perplexity values, install matplotlib with `pip install matplotlib`. Once your model variations have finished training, run scripts/extraxt_ppl.sh to extract the perplexity values from the training logs and produce the TSV table and the line plot.
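
Put together, a full run looks roughly like the following; the virtual-environment path and the script locations are assumptions and may differ from your setup.

```bash
# activate the virtual environment from the exercise setup (path assumed)
source venvs/torch3/bin/activate

# plotting dependency for parse_format_ppl.py
pip install matplotlib

# train each model variation; edit train.sh first so it points at the
# pre- or post-normalization config you want to run
bash scripts/train.sh

# extract the perplexity values and build the TSV table and line plot
bash scripts/extraxt_ppl.sh
```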
