
LLaMa2 evaluation repo

This repo serves as an experimentation playground for usage and performance evaluation of the LLaMa2 model family.

Currently covers:

  • Loading the models in various precisions, including quantized formats (see the loading sketch after this list)
  • Running standard inference, plus properly formatted prompt-based inference with the chat models (see the chat sketch after this list)
  • Evaluating NF4 vs. int8 performance on a few benchmarks
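
The snippet below is a minimal sketch of the quantized loading path, assuming the Hugging Face `transformers` + `bitsandbytes` stack and the `meta-llama/Llama-2-7b-chat-hf` checkpoint (a placeholder choice, not necessarily what the scripts in this repo use):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Hypothetical checkpoint choice; any LLaMa2 model id loads the same way.
model_id = "meta-llama/Llama-2-7b-chat-hf"

# 4-bit NF4 quantization config ...
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# ... or plain 8-bit int8 quantization.
int8_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=nf4_config,  # swap in int8_config for the int8 runs
    device_map="auto",
)
```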
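
For the chat models, Llama 2 expects the `[INST]` / `<<SYS>>` prompt template. Below is a sketch of a single-turn formatted prompt plus generation, reusing `model` and `tokenizer` from the loading sketch above; the system prompt and user message are placeholder text, not taken from this repo:

```python
# Single-turn Llama 2 chat template; the tokenizer prepends the <s> BOS token itself.
system_prompt = "You are a helpful assistant."           # placeholder system message
user_message = "Summarise what NF4 quantization does."   # placeholder user turn

prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Strip the prompt tokens and decode only the newly generated answer.
answer = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```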

Future TODOs:

  • QLoRA training
  • Evaluation of 16-bit precisions + the larger LLaMa models
  • Implement batch inference for each of the evaluations
  • Evaluation against other non-LLaMa models + derivatives of the base models
