- The intent of this document is to outline how to run llama.cpp on your local machine; it may also include AWS / EC2 build recommendations for compute providers
- The end state is a private endpoint, reachable by the proxy-router, that serves the proxy-router's models (see the sketch below)
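
As a rough illustration of that end state, the sketch below starts llama.cpp's built-in HTTP server and verifies it is reachable locally. The model path, host, port, and tuning flags are placeholders, and depending on how llama.cpp was built the binary may be named `llama-server` or `server`.

```bash
# Launch the llama.cpp server with a local GGUF model (paths/ports are examples)
./llama-server \
  -m ./models/your-model.gguf \   # path to a GGUF model file (placeholder)
  --host 127.0.0.1 \              # bind to localhost so the endpoint stays private
  --port 8080 \                   # port the proxy-router would be pointed at
  -c 4096                         # context size; tune for the model and hardware

# Quick check that the OpenAI-compatible endpoint responds
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}]}'
```

The proxy-router would then be configured to use `http://127.0.0.1:8080` (or the equivalent private address on an EC2 instance) as its model endpoint.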