refact quantization, support torchao quant and vllm w8a8 #596

Open

shihaobai wants to merge 58 commits into main from quantization

Commits on Oct 25, 2024

refactor ppl quant

baishihao committed Oct 25, 2024
Commit 08531cd

ppl w4a16 for llama-cls models

baishihao committed Oct 25, 2024
Commit e6022da

Commits on Oct 28, 2024

add torchao

baishihao committed Oct 28, 2024
Commit b4cd881

Commits on Oct 29, 2024

add vllm w8a8

baishihao committed Oct 29, 2024
Commit 2b28cd1

merge main

baishihao committed Oct 29, 2024
Commit 8cb6526

vllm alloc

baishihao committed Oct 29, 2024
Commit 416e940

Commits on Nov 1, 2024

modify router model backend for quant

baishihao committed Nov 1, 2024
Commit 33ff5e9

remove unused code

baishihao committed Nov 1, 2024
Commit 15ab7db

remove unused code

baishihao committed Nov 1, 2024
Commit ed40f35

Commits on Nov 4, 2024

modify all models

baishihao committed Nov 4, 2024
Commit cd92bf6

Commits on Nov 5, 2024

add vllm fp8 w8a8 (per-channel/per-token)

baishihao committed Nov 5, 2024
Commit 253725c

remove wa & aw template

baishihao committed Nov 5, 2024
Commit 309fefe

Commits on Nov 7, 2024

refact quantization to support mix quantization

baishihao committed Nov 7, 2024
Commit f57b996

Commits on Nov 8, 2024

fix mix quantization

baishihao committed Nov 8, 2024
Commit b038857

update the baseweight cls

baishihao committed Nov 8, 2024
Commit 24add93

Commits on Nov 12, 2024

support bf16 int8kv

baishihao committed Nov 12, 2024
Commit dfd16be

Commits on Nov 13, 2024

fix load weight with multi-threads

baishihao committed Nov 13, 2024
Commit 116e85a

refactor internlm2

baishihao committed Nov 13, 2024
Commit f213eab

add name_mapping

baishihao committed Nov 13, 2024
Commit cba8e07

refactor gemma

baishihao committed Nov 13, 2024
Commit 7ede4a7

fix gemma2 tp2 for multi-query

baishihao committed Nov 13, 2024
Commit 99a7f76

remove offset of mm weight

baishihao committed Nov 13, 2024
Commit 5d0eae6

remove name_mapping

baishihao committed Nov 13, 2024
Commit 17b9839

refact deepseek-v2 transformer_layer_weight

WANDY666 committed Nov 13, 2024
Commit b51b573

Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization
```
Conflicts:
	lightllm/common/basemodel/layer_weights/meta_weights/__init__.py
```
WANDY666 committed Nov 13, 2024
Commit 6d6f83c

refactor internlm & fix ppl

baishihao committed Nov 13, 2024
Commit 2060038

Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

baishihao committed Nov 13, 2024
Commit 5e132aa

remove print

baishihao committed Nov 13, 2024
Commit 65de425

fix ppl

baishihao committed Nov 13, 2024
Commit ed84426

Commits on Nov 14, 2024

update ppl dequant

baishihao committed Nov 14, 2024
Commit ef06da6

fix ppl

baishihao committed Nov 14, 2024
Commit b55ea10

refactor minicpm

baishihao committed Nov 14, 2024
Commit 49dfb51

deepseek-v2-lite can run

WANDY666 committed Nov 14, 2024
Commit 28ada22

Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization

WANDY666 committed Nov 14, 2024
Commit 20b4909

refactor qwen

baishihao committed Nov 14, 2024
Commit 60e16db

Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

baishihao committed Nov 14, 2024
Commit 0a21710

refactor phi-3

baishihao committed Nov 14, 2024
Commit 61c4ceb

gemma for int8kv

baishihao committed Nov 14, 2024
Commit 3428699

refactor qwen2

baishihao committed Nov 14, 2024
Commit 1fcd74e

remove import for qwen2-vl

baishihao committed Nov 14, 2024
Commit 0171a12

fix some bugs, can run deepseek-v2

WANDY666 committed Nov 14, 2024
Commit 3fcfa82

Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization

WANDY666 committed Nov 14, 2024
Commit 7a40bed

mistral for int8kv

baishihao committed Nov 14, 2024
Commit a82148d

Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

baishihao committed Nov 14, 2024
Commit f1ee848

refactor yi

baishihao committed Nov 14, 2024
Commit c5f8288

update yi

baishihao committed Nov 14, 2024
Commit 211ffc9

refactor stablm

baishihao committed Nov 14, 2024
Commit 91132c4

Commits on Nov 15, 2024

refactor starcoder

baishihao committed Nov 15, 2024
Commit 50135e5

add disable_qk_absorb and disable_vo_absorb

WANDY666 committed Nov 15, 2024
Commit e58ac95

Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization

WANDY666 committed Nov 15, 2024
Commit 3bde26a

fix

baishihao committed Nov 15, 2024
Commit 84dfadb

Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

baishihao committed Nov 15, 2024
Commit 1099279

fix absorb

WANDY666 committed Nov 15, 2024
Commit 76cda5a

refactor models (#606)

sufubao authored Nov 15, 2024
Commit b8467de

import vllm moe

WANDY666 committed Nov 15, 2024
Commit 32b542d

Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization

WANDY666 committed Nov 15, 2024
Commit 5d88f4a

deepseek fp8 moe

baishihao committed Nov 15, 2024
Commit 8b1b2a1

fix

baishihao committed Nov 15, 2024
Commit b1d4f52
