TL;DR

An educational implementation of GPT-2 (and its variants), refactored from the projects listed under Reference. I like the code style here, and you may too!

Test

Reference

minGPT
nanoGPT