Added support for running official HF baseline FSDP-QLoRA benchmark #16

Merged (3 commits) on May 21, 2024

Conversation

achew010 (Contributor)

This PR addresses issue #10 by adding support for an FSDP-compatible HF QLoRA baseline to our benchmarks.

Feature

This allows users to specify a no_peft_model field in the plugin config bnb.yaml. Setting this field bypasses the plugin.augmentation function and lets SFTTrainer manage the PEFT preparation of the model instead.
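
For illustration, a minimal sketch of the branch this field introduces is shown below; the no_peft_model key comes from this PR, while the helper function, the config layout, and the plugin.augmentation call signature are assumptions made for the sketch, not the repository's actual code.

# Illustrative sketch only: the helper name, config layout, and
# plugin.augmentation signature are assumptions, not the repo's code.
from peft import LoraConfig
from trl import SFTTrainer

def build_trainer(model, train_dataset, train_args, plugin, plugin_config):
    # LoRA hyperparameters and target modules here are placeholders
    peft_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])

    if plugin_config.get("no_peft_model", False):
        # Baseline path added by this PR: skip plugin.augmentation and hand the
        # LoraConfig to SFTTrainer, which performs the PEFT preparation itself.
        return SFTTrainer(model=model, args=train_args,
                          train_dataset=train_dataset, peft_config=peft_config)

    # Default path: the acceleration plugin augments the quantized model first.
    model, _ = plugin.augmentation(model, train_args, modifiable_args=(peft_config,))
    return SFTTrainer(model=model, args=train_args, train_dataset=train_dataset)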

NOTE:

  • While the open-source approach to FSDP-compatible QLoRA removes the extraneous dtype casting in prepare_model_for_kbit_training, it only does so when the model is sharded. When running on a single device, SFTTrainer still calls prepare_model_for_kbit_training, so users will continue to experience a slowdown due to the extraneous casting (see the sketch below).
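
To make the note concrete, below is a minimal sketch of the single-device path, assuming a placeholder model and that the extraneous casting is the float32 upcast peft applies to the remaining fp16/bf16 parameters.

# Minimal sketch of the single-device path; the model name is a placeholder.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",  # placeholder model
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
    ),
    torch_dtype=torch.float16,
)

# On a single device SFTTrainer still runs this step, which upcasts the
# remaining fp16/bf16 (non-quantized) parameters to float32 -- the extra
# casting that causes the slowdown noted above.
model = prepare_model_for_kbit_training(model)
print({p.dtype for p in model.parameters()})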

achew010 requested a review from fabianlim as a code owner on May 21, 2024 03:40
fabianlim (Contributor) left a comment


LGTM save for that linting issue, but I'm OK to merge this first and then handle #7 later.

require_packages_check=False,
):
# check flags and callbacks
assert (not correct_value)==framework.requires_agumentation
Contributor


I can see some linting issues, but we can take care of them in #9.

fabianlim merged commit d510ceb into foundation-model-stack:dev on May 21, 2024
2 checks passed
achew010 deleted the fsdp-qlora-baseline branch on May 26, 2024 16:00
fabianlim added a commit that referenced this pull request May 27, 2024
…or GPTQ-LoRA (#20)

* Add GitHub Workflow for Linting, Formatting and Test. Activate Workflow for Framework (#7)

* add lint workflow

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add pylintrc, update .tox fix files

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* activate test and minor fix

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* lint benchmarks.py and add workflow to dev

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Improvements to Benchmark Scripts and Config Generation Workflow (#13)

* fix benches and add verify configs

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* update readme and add workflow

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add packaging dep

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* update torch dep in framework and run-benches

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* take host env in run-benches

* add display bench results script

* rename summary.csv to raw_summary.csv and update run_benchmarks.sh

* export environment variables in shell command

* dump out pip requirements for repro, and add default FHT_branch

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Added support for running official HF baseline FSDP-QLoRA benchmark (#16)

* new baseline scenario

* rename variables

* added warning when plugin allows SFTTrainer to handle PEFT on single device

* Fix FSDP when performing GPTQ-LoRA with Triton V2  (#15)

* wrap in parameters and torch view to correct dtype

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* refactor to apply patch only on FSDP and simplify

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Provide Memory Benchmarking Feature to Benchmarking Code (#14)

* add gpu memory logging support

* made improvements to GPU reference and result collation

* Renamed memory logging argument to reflect its readings as reserved memory using nvidia-smi and changed aggregation function in result collation

* variable renames

* manual linting

* added memory logging functionality via HFTrainer

* added support to benchmark memory using HFTrainer and updated README with explanation of the 2 memory benchmarking options

* addressed changes requested in PR #14

* fix bug and simplify gpu logs aggregation logic

* fixes to calculation of HFTrainer Mem Logging values

* fix calculations

* more fixes

* fix to ignore including stage inside max calculation of alloc memory

* more comments and README updates

* added fix to KeyError due to empty output dict from OOM

* manual linting

* added benchmark results to refs

* remove unnecessary columns in results gathering

* made changes to results gathering

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: achew010 <[email protected]>