Custom GGML outside LlamaCpp scope #38
Outsource a curated list of supported models; add it to README.md later.
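Such a list could live in a small machine-readable registry that a README table is generated from. A minimal sketch, assuming a hypothetical Python module; the schema and field names are illustrative, only the repo ids and filenames come from this thread:

```python
# Hypothetical registry of supported models. The repo_id/filename values
# are the ones mentioned in this issue; the schema itself is illustrative.
SUPPORTED_MODELS = [
    {
        "repo_id": "LLukas22/gpt4all-lora-quantized-ggjt",
        "filename": "ggjt-model.bin",
        "format": "ggjt",
    },
    {
        "repo_id": "LLukas22/mpt-7b-ggml",
        "filename": "mpt-7b-q4_0.bin",
        "format": "ggml",
    },
]


def to_markdown_table(models) -> str:
    """Render the registry as a table that can be pasted into README.md."""
    lines = ["| Repo | File | Format |", "| --- | --- | --- |"]
    for m in models:
        lines.append(f"| {m['repo_id']} | {m['filename']} | {m['format']} |")
    return "\n".join(lines)


print(to_markdown_table(SUPPORTED_MODELS))
```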
Maybe create a setup.py that fetches models directly from HF. Edit: this does counteract the air-gapped idea.

```python
from huggingface_hub import hf_hub_download

# Download the model from the Hugging Face Hub into the current directory
hf_hub_download(
    repo_id="LLukas22/gpt4all-lora-quantized-ggjt",
    filename="ggjt-model.bin",
    local_dir=".",
)
```

Edit: implemented with #61.
Only mpt-7b-q4_0.bin from https://huggingface.co/LLukas22/mpt-7b-ggml.
I feel this MPT-7B is faster than the existing one here.
You got it running? We should add benchmark runs so everyone can plot and share results.
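For sharing comparable numbers, something like this could be a starting point. A minimal sketch, assuming the bindings expose some `generate(prompt, max_tokens=...)` callable that returns the number of tokens produced; that callable and its signature are placeholders, not this project's actual API:

```python
import json
import platform
import time


def benchmark(generate, prompt: str, max_tokens: int = 128) -> dict:
    """Time one generation run and report tokens/sec as a shareable record."""
    start = time.perf_counter()
    tokens_produced = generate(prompt, max_tokens=max_tokens)  # placeholder API
    elapsed = time.perf_counter() - start
    return {
        "machine": platform.machine(),
        "system": platform.system(),
        "tokens_produced": tokens_produced,
        "seconds": round(elapsed, 3),
        "tokens_per_second": round(tokens_produced / elapsed, 2),
    }


# Example usage with whatever generate callable the bindings provide:
# print(json.dumps(benchmark(model.generate, "Hello"), indent=2))
```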
For the MosaicML one: I haven't tried it yet; feel free to create another issue so that we don't forget after closing this one.
Update: mpt-7b-q4_0.bin doesn't work out of the box; loading it fails with `what(): unexpectedly reached end of file` and a runtime error.
Originally posted by @hippalectryon-0 in #33 (comment)
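That `unexpectedly reached end of file` message typically means the loader ran out of bytes while parsing the header or tensors, i.e. a truncated download or an incompatible container version. A quick sanity check, sketched under the assumption that the file starts with one of the public GGML-family magic numbers; verify these constants against the loader actually in use:

```python
import struct

# Known GGML-family magics (uint32, little-endian on disk). These values
# are assumptions based on the public ggml/llama.cpp headers -- double-check
# them against the loader you actually use.
MAGICS = {
    0x67676D6C: "ggml (unversioned)",
    0x67676D66: "ggmf",
    0x67676A74: "ggjt",
}


def inspect_model(path: str) -> None:
    """Print the file's magic and whether it looks like a known GGML format."""
    with open(path, "rb") as f:
        header = f.read(4)
    if len(header) < 4:
        print(f"{path}: truncated file (shorter than 4 bytes)")
        return
    (magic,) = struct.unpack("<I", header)
    print(f"{path}: magic 0x{magic:08x} -> {MAGICS.get(magic, 'unknown format')}")


inspect_model("mpt-7b-q4_0.bin")
```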