Commit

Update README.md
seungduk-yanolja committed Nov 13, 2024
1 parent 4082f57 commit 42af2df
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion README.md
@@ -132,8 +132,9 @@ fmo quantize \
```
The command line arguments mean:
- **`model-name-or-path`**: Hugging Face pretrained model name or directory path of the saved model checkpoint.
+- **`local-dataset-type`**: Type of the local dataset file. Defaults to `inferred`. You can choose from `inferred`, `json`, `csv`, `parquet`, and `arrow`.
 - **`output-dir`**: Directory path to save the quantized checkpoint and related configurations.
-- **`mode`**: Quantization techniques to apply. You can use `fp8`, `int8`.
+- **`mode`**: Quantization techniques to apply. You can use `fp8`, `int8`, and `awq`.
- **`pedantic-level`**: Controls the accuracy-latency trade-off. A higher pedantic level gives a more accurate representation of the model but increases the quantization processing time. Defaults to 1.
- **`device`**: Device to run the quantization process. Defaults to "cuda:0".
- **`offload`**: When enabled, this option significantly reduces GPU memory usage by offloading model layers onto CPU RAM. Defaults to False.
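
To make the documented options and defaults concrete, here is a minimal Python sketch that mirrors them with `argparse`. It is not `fmo`'s actual parser: the `--` flag spellings, the `quantize-sketch` program name, the `build_parser` helper, the example model name, and the choice of which options are required are all assumptions made for illustration. Only the option names, allowed values, and defaults come from the list above.

```python
import argparse

# Hypothetical mirror of the documented options; NOT fmo's real implementation.
MODES = ("fp8", "int8", "awq")
DATASET_TYPES = ("inferred", "json", "csv", "parquet", "arrow")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="quantize-sketch")
    # Assumption: these three options are treated as required in this sketch.
    parser.add_argument("--model-name-or-path", required=True)
    parser.add_argument("--output-dir", required=True)
    parser.add_argument("--mode", required=True, choices=MODES)
    # Defaults below are the ones stated in the argument list.
    parser.add_argument("--local-dataset-type", default="inferred",
                        choices=DATASET_TYPES)
    parser.add_argument("--pedantic-level", type=int, default=1)
    parser.add_argument("--device", default="cuda:0")
    parser.add_argument("--offload", action="store_true")  # defaults to False
    return parser


if __name__ == "__main__":
    # "my-org/my-model" is a placeholder, not a real checkpoint.
    args = build_parser().parse_args(
        ["--model-name-or-path", "my-org/my-model",
         "--output-dir", "./out",
         "--mode", "awq"]
    )
    print(args.mode, args.local_dataset_type, args.pedantic_level,
          args.device, args.offload)
```

In this sketch, an unsupported value such as `--mode int4` is rejected by `argparse` before any work starts, which is one simple way to keep the accepted modes in sync with the documented list.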