Add explanation of the related arguments as well
seungduk-yanolja committed Nov 13, 2024
1 parent 42af2df commit c3892c6
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion README.md
@@ -132,12 +132,22 @@ fmo quantize \
```
The command line arguments mean:
- **`model-name-or-path`**: Hugging Face pretrained model name or directory path of the saved model checkpoint.
- **`local-dataset-type`**: Type of the local dataset file. Defaults to `inferred`. You can choose from `inferred`, `json`, `csv`, `parquet`, and `arrow`.
- **`output-dir`**: Directory path to save the quantized checkpoint and related configurations.
- **`mode`**: Quantization techniques to apply. You can use `fp8`, `int8`, and `awq`.
- **`pedantic-level`**: Controls the accuracy-latency trade-off. A higher pedantic level ensures a more accurate representation of the model, but increases the quantization processing time. Defaults to 1.
- **`device`**: Device on which to run the quantization process. Defaults to "cuda:0".
- **`offload`**: When enabled, this option significantly reduces GPU memory usage by offloading model layers onto CPU RAM. Defaults to False.
- **`dataset-name-or-path`**: Hugging Face dataset name or directory path of the local dataset file. If you use a single file, you can set this option to the path of the file and set `local-dataset-type` to the appropriate type.
- **`local-dataset-type`**: Type of the local dataset file. Set this only when you use a local dataset file. Defaults to `inferred`. You can choose from `inferred`, `json`, `csv`, `parquet`, and `arrow`.
- **`dataset-split-name`**: The split of the dataset to use (e.g., "train", "test", "validation"). Defaults to "test".
- **`dataset-target-column-name`**: The name of the column in your dataset containing the text to be processed. For example:
  - "article" for CNN/DailyMail dataset
  - "text" for many standard datasets
  - "content" for custom datasets

  Defaults to "article". Note that if you want to apply a chat template, you should preprocess your dataset to have a single field for the formatted text.
- **`dataset-num-samples`**: Number of samples to use from the dataset for calibration. More samples may improve quantization quality but increase processing time. Defaults to 512.
- **`dataset-max-length`**: Maximum length (in tokens) for each sample from the dataset. Longer sequences will be truncated. Defaults to 1024.
- **`dataset-batch-size`**: Number of samples to process simultaneously during calibration. If not specified (None), FMO will automatically determine an optimal batch size based on available GPU memory.
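
As a rough illustration of how the dataset arguments above fit together, the sketch below assembles a command that calibrates on a single local JSON file. It assumes that each flag is simply the argument name prefixed with `--`; the output directory, the dataset file path, and the `text` column name are hypothetical placeholders, not values taken from this repository.

```bash
# Placeholder values; replace them with your own.
MODEL_NAME_OR_PATH="meta-llama/Meta-Llama-3-8B-Instruct"
OUTPUT_DIR="./quantized-checkpoint"   # hypothetical output directory
DATASET_FILE="./calibration.json"     # hypothetical local dataset file
MODE="fp8"
LOCAL_DATASET_TYPE="json"             # matches the file type of DATASET_FILE
TARGET_COLUMN="text"                  # hypothetical column holding the text

# Dry run: `echo` prints the assembled command instead of executing it.
# Remove the leading `echo` to actually run the quantization.
# Assumption: each flag is the documented argument name prefixed with `--`.
echo fmo quantize \
  --model-name-or-path "$MODEL_NAME_OR_PATH" \
  --output-dir "$OUTPUT_DIR" \
  --mode "$MODE" \
  --dataset-name-or-path "$DATASET_FILE" \
  --local-dataset-type "$LOCAL_DATASET_TYPE" \
  --dataset-target-column-name "$TARGET_COLUMN" \
  --dataset-num-samples 512 \
  --dataset-max-length 1024
```

Here `--dataset-num-samples` and `--dataset-max-length` are set to their documented defaults only to make them visible. `dataset-batch-size` is left out so that FMO chooses a batch size automatically.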

## Example: Run FP8 quantization with Meta-Llama-3-8B-Instruct
```bash