From 42af2df8626aac4bb928b4f3293ffd95694b59f7 Mon Sep 17 00:00:00 2001
From: Seungduk Kim
Date: Wed, 13 Nov 2024 07:29:27 +0000
Subject: [PATCH] Update README.md

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3aa59bc..92b67d8 100644
--- a/README.md
+++ b/README.md
@@ -132,8 +132,9 @@ fmo quantize \
 ```
 The command line arguments means :
 - **`model-name-or-path`**: Hugging Face pretrained model name or directory path of the saved model checkpoint.
+- **`local-dataset-type`**: Type of the local dataset file. Defaults to `inferred`. You can choose from `inferred`, `json`, `csv`, `parquet`, and `arrow`.
 - **`output-dir`**: Directory path to save the quantized checkpoint and related configurations.
-- **`mode`**: Quantization techniques to apply. You can use `fp8`, `int8`.
+- **`mode`**: Quantization techniques to apply. You can use `fp8`, `int8`, and `awq`.
 - **`pedantic-level`**: Represent to accuracy-latency trade-off. Higher pedantic level ensure a more accurate representaition of the model, but increase the quantization processing time. Defaults to 1.
 - **`device`**: Device to run the quantization process. Defaults to "cuda:0".
 - **`offload`**: When enabled, this option significantly reduces GPU memory usage by offloading model layers onto CPU RAM. Defaults to False.