Goal: to create the best possible Amharic language models that can run offline on a budget smartphone

simonbutt/amharic_llama

Amharic Llama

Goal: To create the best possible Amharic language models that can run offline on a budget smartphone.

All resources created from this repository will be stored in the amharic_llama Hugging Face collection.

Unless otherwise stated, Google Colab notebooks are the primary supported way to run everything here.

Vision

  • Create a robust evaluation framework to understand current Amharic LLM capabilities.
  • Develop state-of-the-art Amharic language models in two size classes: under 3B parameters and under 10B parameters.
  • Develop an offline AI app that runs on low-RAM hardware, to enable broad distribution to Ethiopians.
  • Progress to Oromo, Tigrinya, Afar and Somali languages.
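
To make the parameter targets above concrete, the following is a rough back-of-the-envelope sketch of how much memory the weights alone would need at different precisions. The bit widths (16, 8, 4) are illustrative assumptions, not choices stated by this repository, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


if __name__ == "__main__":
    # Upper bounds of the two size classes named in the Vision list.
    for n_params in (3e9, 10e9):
        # Assumed precisions: 16-bit floats, 8-bit and 4-bit quantization.
        for bits in (16, 8, 4):
            gib = weight_memory_gib(n_params, bits)
            print(f"{n_params / 1e9:.0f}B params @ {bits}-bit: {gib:.1f} GiB")
```

Under these assumptions, only the lower precisions bring a model near the 3B bound within reach of a phone with a few gigabytes of RAM.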

Process

Phase 1 - Evaluate

Objective: Understand the current state of the art in Amharic language modelling.

  • Fine-tune several state-of-the-art English LLMs on datasets translated into Amharic, to provide a set of control models.
  • Develop an open Amharic LLM leaderboard for model evaluation.
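
As an illustration of what a leaderboard computes, here is a minimal sketch that scores several models on a shared evaluation set and ranks them. Everything in it is hypothetical: the two-item `EVAL_SET`, the stub models, and the choice of exact-match accuracy as the metric are assumptions for illustration, not the repository's actual evaluation design.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical evaluation items: (prompt, reference answer).
EVAL_SET: List[Tuple[str, str]] = [
    ("የኢትዮጵያ ዋና ከተማ?", "አዲስ አበባ"),
    ("2 + 2 = ?", "4"),
]

Model = Callable[[str], str]


def exact_match_accuracy(model: Model, eval_set: List[Tuple[str, str]]) -> float:
    """Fraction of prompts whose stripped output equals the reference."""
    correct = sum(model(prompt).strip() == ref for prompt, ref in eval_set)
    return correct / len(eval_set)


def build_leaderboard(
    models: Dict[str, Model], eval_set: List[Tuple[str, str]]
) -> List[Tuple[str, float]]:
    """Score every model and return (name, score) pairs, best first."""
    scores = {name: exact_match_accuracy(fn, eval_set) for name, fn in models.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Stub "models" standing in for real fine-tuned control models.
def constant_stub(prompt: str) -> str:
    return "አዲስ አበባ"


def lookup_stub(prompt: str) -> str:
    return dict(EVAL_SET).get(prompt, "")


leaderboard = build_leaderboard(
    {"constant_stub": constant_stub, "lookup_stub": lookup_stub}, EVAL_SET
)

if __name__ == "__main__":
    for rank, (name, score) in enumerate(leaderboard, start=1):
        print(f"{rank}. {name}: {score:.2f}")
```

A real leaderboard would replace the stubs with calls to actual models and would likely need metrics more tolerant than exact match, since correct Amharic answers can vary in spelling and phrasing.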

Phase 2 - Train

Objective: Use EEVE (Efficient and Effective Vocabulary Expansion) to train a state-of-the-art Amharic language model.
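
The sketch below illustrates one ingredient commonly used in vocabulary expansion: a new token's embedding is initialised as the mean of the embeddings of the existing subword pieces it replaces. It is a toy, pure-Python illustration with a made-up three-character vocabulary and 2-dimensional embeddings. The helper names `greedy_tokenize` and `expand_vocab` are hypothetical, and the sketch does not cover the rest of the EEVE recipe (such as its staged training), nor is it confirmed to be how this repository will implement Phase 2.

```python
from typing import Dict, List


def greedy_tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Longest-match-first segmentation of `text` over `vocab`."""
    ids: List[int] = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"cannot tokenize {text[i]!r}")
    return ids


def expand_vocab(
    vocab: Dict[str, int], embeddings: List[List[float]], new_tokens: List[str]
) -> None:
    """Add `new_tokens` in place, initialising each new embedding as the
    mean of the embeddings of the pieces the token currently splits into."""
    dim = len(embeddings[0])
    for token in new_tokens:
        if token in vocab:
            continue
        # Segment with the vocabulary as it is before this token is added.
        piece_ids = greedy_tokenize(token, vocab)
        mean = [
            sum(embeddings[pid][d] for pid in piece_ids) / len(piece_ids)
            for d in range(dim)
        ]
        vocab[token] = len(embeddings)
        embeddings.append(mean)


# Toy data (hypothetical): three single-character tokens, 2-d embeddings.
vocab = {"ሰ": 0, "ላ": 1, "ም": 2}
embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

expand_vocab(vocab, embeddings, ["ሰላም"])

if __name__ == "__main__":
    print(vocab["ሰላም"], embeddings[vocab["ሰላም"]])
    print(greedy_tokenize("ሰላም", vocab))
```

After expansion, the word is encoded as a single token instead of three, which shortens sequences and is the main reason a dedicated Amharic vocabulary matters for low-RAM inference.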
