refact quantization, support torchao quant and vllm w8a8 #596

Open
wants to merge 58 commits into base: main

Commits on Oct 25, 2024

  1. refactor ppl quant

    baishihao committed Oct 25, 2024
    08531cd
  2. ppl w4a16 for llama-cls models

    baishihao committed Oct 25, 2024
    e6022da

Commits on Oct 28, 2024

  1. add torchao

    baishihao committed Oct 28, 2024
    b4cd881

Commits on Oct 29, 2024

  1. add vllm w8a8

    baishihao committed Oct 29, 2024
    2b28cd1
  2. merge main

    baishihao committed Oct 29, 2024
    8cb6526
  3. vllm alloc

    baishihao committed Oct 29, 2024
    416e940

Commits on Nov 1, 2024

  1. modify router model backend for quant

    baishihao committed Nov 1, 2024
    33ff5e9
  2. remove unused code

    baishihao committed Nov 1, 2024
    15ab7db
  3. remove unused code

    baishihao committed Nov 1, 2024
    ed40f35

Commits on Nov 4, 2024

  1. modify all models

    baishihao committed Nov 4, 2024
    cd92bf6

Commits on Nov 5, 2024

  1. add vllm fp8 w8a8 (per-channel/per-token)

    baishihao committed Nov 5, 2024
    253725c
  2. remove wa & aw template

    baishihao committed Nov 5, 2024
    309fefe

Commits on Nov 7, 2024

  1. refact quantization to support mix quantization

    baishihao committed Nov 7, 2024
    f57b996

Commits on Nov 8, 2024

  1. fix mix quantization

    baishihao committed Nov 8, 2024
    b038857
  2. update the baseweight cls

    baishihao committed Nov 8, 2024
    24add93

Commits on Nov 12, 2024

  1. support bf16 int8kv

    baishihao committed Nov 12, 2024
    dfd16be

Commits on Nov 13, 2024

  1. fix load weight with multi-threads

    baishihao committed Nov 13, 2024
    116e85a
  2. refactor internlm2

    baishihao committed Nov 13, 2024
    f213eab
  3. add name_mapping

    baishihao committed Nov 13, 2024
    cba8e07
  4. refactor gemma

    baishihao committed Nov 13, 2024
    7ede4a7
  5. fix gemma2 tp2 for multi-query

    baishihao committed Nov 13, 2024
    99a7f76
  6. remove offset of mm weight

    baishihao committed Nov 13, 2024
    5d0eae6
  7. remove name_mapping

    baishihao committed Nov 13, 2024
    17b9839
  8. b51b573
  9. Merge branch 'quantization' of github.com:ModelTC/lightllm into quantization
    
    Conflicts:
    	lightllm/common/basemodel/layer_weights/meta_weights/__init__.py
    WANDY666 committed Nov 13, 2024
    6d6f83c
  10. refactor internlm & fix ppl

    baishihao committed Nov 13, 2024
    2060038
  11. Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

    baishihao committed Nov 13, 2024
    5e132aa
  12. remove print

    baishihao committed Nov 13, 2024
    65de425
  13. fix ppl

    baishihao committed Nov 13, 2024
    ed84426

Commits on Nov 14, 2024

  1. update ppl dequant

    baishihao committed Nov 14, 2024
    ef06da6
  2. fix ppl

    baishihao committed Nov 14, 2024
    b55ea10
  3. refactor minicpm

    baishihao committed Nov 14, 2024
    49dfb51
  4. deepseek-v2-lite can run

    WANDY666 committed Nov 14, 2024
    28ada22
  5. 20b4909
  6. refactor qwen

    baishihao committed Nov 14, 2024
    60e16db
  7. Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

    baishihao committed Nov 14, 2024
    0a21710
  8. refactor phi-3

    baishihao committed Nov 14, 2024
    61c4ceb
  9. gemma for int8kv

    baishihao committed Nov 14, 2024
    3428699
  10. refactor qwen2

    baishihao committed Nov 14, 2024
    1fcd74e
  11. remove import for qwen2-vl

    baishihao committed Nov 14, 2024
    0171a12
  12. 3fcfa82
  13. 7a40bed
  14. mistral for int8kv

    baishihao committed Nov 14, 2024
    a82148d
  15. Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

    baishihao committed Nov 14, 2024
    f1ee848
  16. refactor yi

    baishihao committed Nov 14, 2024
    c5f8288
  17. update yi

    baishihao committed Nov 14, 2024
    211ffc9
  18. refactor stablm

    baishihao committed Nov 14, 2024
    91132c4

Commits on Nov 15, 2024

  1. refactor starcoder

    baishihao committed Nov 15, 2024
    50135e5
  2. e58ac95
  3. 3bde26a
  4. fix

    baishihao committed Nov 15, 2024
    84dfadb
  5. Merge branch 'quantization' of https://github.com/ModelTC/lightllm into quantization

    baishihao committed Nov 15, 2024
    1099279
  6. fix absorb

    WANDY666 committed Nov 15, 2024
    76cda5a
  7. refactor models (#606)

    sufubao authored Nov 15, 2024
    b8467de
  8. import vllm moe

    WANDY666 committed Nov 15, 2024
    32b542d
  9. 5d88f4a
  10. deepseek fp8 moe

    baishihao committed Nov 15, 2024
    8b1b2a1
  11. fix

    baishihao committed Nov 15, 2024
    b1d4f52