Because projects like Explore the LLMs specialize in model indexing, the custom list has been removed.
- Cerebras GPT-13b (release notes)
- LAION OpenFlamingo | Multimodal model and training architecture
- GeoV/GeoV-9b - 9B parameter, in-progress training to 300B tokens (33:1)
- RWKV: Parallelizable RNN with Transformer-level LLM Performance
- CodeGeeX 13B | Multilingual code generation model
- BigCode | Open scientific collaboration to train a coding LLM
- MOSS by Fudan University, a 16B Chinese/English custom foundational model, with additional variants fine-tuned for SFT and plugin usage
- mPLUG-Owl Multimodal fine-tuned model for visual/language tasks
- Multimodal-GPT multi-modal visual/language chatbot, using LLaMA with custom LoRA weights and OpenFlamingo-9B.
- Visual-med-alpaca fine-tunes LLaMA-7B on self-instruct data for the biomedical domain. Models are locked behind a request form.
- replit-code focused on code completion. The model was trained on a subset of the Stack Dedup v1.2 dataset.
- VPGTrans transfers a Visual Prompt Generator across LLMs; the accompanying VL-Vicuna model is a novel VL-LLM. Paper, code
- salesforce/CodeT5 code assistant; CodeT5+ has been released at 16B and other model sizes
- baichuan-7b Baichuan Intelligent Technology's open-source language model with 7 billion parameters, trained on 1.2 trillion tokens. Supporting Chinese and English, it achieves top performance on authoritative benchmarks (C-EVAL, MMLU)
- ChatGLM2-6B v2 of the GLM 6B open bilingual EN/CN model
- sqlcoder 15B-parameter model that outperforms gpt-3.5-turbo on natural-language-to-SQL generation tasks
- CodeShell code LLM with 7B parameters, trained on 500B tokens with a context length of 8K, outperforming CodeLlama and StarCoder on HumanEval; weights
- SauerkrautLM-13B-v1 Llama-2 13B fine-tuned on a mix of German data augmentation and translations; SauerkrautLM-7b-v1-mistral is the German SauerkrautLM-7b fine-tuned using QLoRA on one A100 80GB with Axolotl
- em_german_leo_mistral LeoLM Mistral fine-tune of LeoLM with German instructions
- leo-hessianai-13b-chat-bilingual based on Llama-2 13B, a fine-tune of the base leo-hessianai-13b for chat
- WizardMath-70B-V1.0 SOTA Mathematical Reasoning
- Mistral-7B-german-assistant-v3 fine-tuned for German instructions and conversations in the Alpaca style ("### User:" / "### Assistant:"), trained with a context length of 8K tokens. The dataset is deduplicated and cleaned and contains no code; the focus is on instruction following and conversational tasks
- HelixNet Mixture of Experts built from three Mistral-7B models with LoRA; HelixNet-LMoE is an optimized version
- llmware RAG models: small LLMs and sentence-transformer embedding models specifically fine-tuned for RAG workflows
- openchat Advancing Open-source Language Models with Mixed-Quality Data
- deepseek-coder code language models trained on 2T tokens (87% code, 13% English/Chinese), up to 33B parameters with a 16K context size, achieving SOTA performance on coding benchmarks
- Poro SiloGen model checkpoints of a family of multilingual open-source LLMs covering all official European languages and code, news
- Mixtral of experts A high-quality sparse Mixture-of-Experts model.
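
Most of the models above publish weights on the Hugging Face Hub, so a single transformers snippet is usually enough to try any of them. The sketch below is illustrative only: the repo id, prompt, and generation settings are assumptions rather than part of this list; swap in the Hub path from the model card of whichever entry you want to evaluate.

```python
# Minimal sketch for sampling from one of the open models listed above using
# Hugging Face transformers. The repo id below is an assumed example for
# illustration; replace it with the Hub path from the model card you want.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed example; any causal LM repo works

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available GPUs/CPU (requires accelerate)
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

prompt = "Write a SQL query that returns the number of orders per customer."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```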