English | 简体中文
This is the first LLM for HEP and the official implementation of Xiwu (溪悟): A Basis Flexible and Learnable LLM for High Energy Physics. The model is designed to possess exceptional capabilities in common-sense question answering, BOSS code generation, and physical logical reasoning.
Xi (溪): streamlet → drops of water; Wu (悟): to understand and gain insight
- Xiwu, the first LLM specialized for high energy physics, outperforms the foundation model in accuracy on domain-specific knowledge question answering and exceeds GPT-4 in BOSS (BESIII Offline Software System) code generation.
- Xiwu is a Level 2 model that can smoothly switch between foundation models such as LLaMA, Vicuna, ChatGLM and Grok-1.
- Xiwu is equipped with two learning systems: the Just-In-Time Learning system, based on RAG, can acquire new knowledge instantly, and the On-The-Fly Training system, based on secondary pre-training and fine-tuning, can be used to enhance the model's performance on specific tasks. A minimal sketch of the RAG pattern is shown below.
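To make the Just-In-Time Learning idea concrete, here is a minimal sketch of the retrieval-augmented generation (RAG) pattern it is based on: embed the query, retrieve the most similar documents, and prepend them to the prompt. This is illustrative only, not the actual Xiwu implementation; the encoder model and corpus are placeholders.

```python
# Minimal RAG sketch (illustrative only; not the actual Xiwu implementation).
# Assumes sentence-transformers is installed; corpus entries are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer

corpus = [
    "BOSS is the BESIII Offline Software System.",
    "The BESIII detector operates at the BEPCII collider.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
corpus_emb = encoder.encode(corpus, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k corpus entries most similar to the query."""
    q = encoder.encode([query], normalize_embeddings=True)
    scores = (corpus_emb @ q.T).ravel()  # cosine similarity (normalized vectors)
    return [corpus[i] for i in np.argsort(-scores)[:k]]

query = "What is BOSS?"
context = "\n".join(retrieve(query))
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
# The augmented prompt is then passed to the LLM for generation.
```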
pip install -r requirements.txt
You can see the basic configurations in the configs.py and constant.py files.
By default, the model weights are stored in the /data/<USERNAME>/weights directory. To change the default, set the PRETRAINED_WEIGHTS_DIR constant in the constant.py file or the PRETRAINED_WEIGHTS_DIR environment variable.
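For example, the environment variable can be overridden from Python before the project modules are loaded. This is a minimal sketch, assuming constant.py reads the variable at import time; the path is a placeholder.

```python
# A minimal sketch: override the default weights directory (placeholder path).
# Assumes constant.py reads PRETRAINED_WEIGHTS_DIR at import time.
import os
os.environ["PRETRAINED_WEIGHTS_DIR"] = "/path/to/your/weights"
```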
You can run ./prepare_weights.sh --list_all to see all available weights, and run the following command to download the trained weights:
./prepare_weights.sh --model lmsys/vicuna-7b-v1.5
python run_cli.py \
--model_path xiwu/xiwu-13b-16k-20240417 \
--load_8bit False
You can switch to any supported model. For more available arguments, run python run_cli.py -h. The assembler will automatically search for the model in the PRETRAINED_WEIGHTS_DIR directory.
python run_worker.py \
    --model_path xiwu/xiwu-13b-16k-20240417
For more available arguments, run python run_worker.py -h.
After the worker is started, you can open a new terminal and access the model via the following script:
python request_api.py
Note that you should set the base_url in the script to the address of the worker. The streaming API is also supported in this script.
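As an illustration, a minimal client might look like the sketch below. The endpoint path, port, and payload fields are assumptions based on a typical FastChat-style worker (which this project builds on); check request_api.py for the actual interface.

```python
# A minimal client sketch (illustrative; see request_api.py for the real
# interface). The endpoint path, port, and payload fields are assumptions
# based on FastChat-style workers.
import requests

base_url = "http://localhost:21002"  # set to the address of your worker

payload = {
    "model": "xiwu/xiwu-13b-16k-20240417",
    "prompt": "What is the BESIII experiment?",
    "temperature": 0.7,
    "max_new_tokens": 256,
}

# Non-streaming request; FastChat-style workers also expose a streaming
# endpoint that yields incremental JSON chunks.
resp = requests.post(f"{base_url}/worker_generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["text"])
```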
bash scripts/train_xiwu.sh
If you are interested in contributing to Xiwu, please refer to the Contributing Guidelines.
Currently, Xiwu is authored by Zhengde Zhang, Yiyu Zhang, Haodong Yao, Jianwen Luo, Rui Zhao, Bo Huang, Jiameng Zhao, Yipu Liao, Ke Li, Lina Zhao, Fazhi Qi and Changzheng Yuan. It is maintained by Zhengde Zhang ([email protected]).
This work is supported by the Informatization Plan of the Chinese Academy of Sciences, Grant No. CAS-WX2022SF-0104, and the "From 0 to 1" Original Innovation Project of IHEP, Grant No. E3545PU2. We would like to express our gratitude to Beijiang Liu, Yaquan Fang, Gang Li, Wuming Luo, Ye Yuan, Shengsen Sun, Yi Jiao and others who are not listed here for engaging in beneficial discussions or providing computing resources.
We are very grateful to the LLaMA and FastChat projects for the foundation models.
@misc{zhang2024xiwu,
title={Xiwu: A Basis Flexible and Learnable LLM for High Energy Physics},
author={Zhengde Zhang and Yiyu Zhang and Haodong Yao and Jianwen Luo and Rui Zhao and Bo Huang and Jiameng Zhao and Yipu Liao and Ke Li and Lina Zhao and Fazhi Qi and Changzheng Yuan},
year={2024},
eprint={2404.08001},
archivePrefix={arXiv},
primaryClass={hep-ph}
}
This project is licensed under the terms of the CC BY-NC-SA 4.0 license.