Upstream Main: Linting, Benchmarking, HF QLoRA baseline, FSDP fixes for GPTQ-LoRA #20

fabianlim · 2024-05-27T07:41:48Z

Upstreaming to main, see commit notes.

…low for Framework (#7)

* add lint workflow
* add pylintrc, update .tox fix files
* activate test and minor fix
* lint benchmarks.py and add workflow to dev

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* fix benches and add verify configs
* update readme and add workflow
* add packaging dep
* update torch dep in framework and run-benches
* take host env in run-benches
* add display bench results script
* rename summary.csv to raw_summary.csv and update run_benchmarks.sh
* export environment variables in shell command
* dump out pip requirements for repro, and add default FHT_branch

Signed-off-by: Yu Chin Fabian Lim <[email protected]>


* wrap in parameters and torch view to correct dtype
* refactor to apply patch only on FSDP and simplify

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add GPU memory logging support
* made improvements to GPU reference and result collation
* renamed memory logging argument to reflect its readings as reserved memory using nvidia-smi, and changed aggregation function in result collation
* variable renames
* manual linting
* added memory logging functionality via HFTrainer
* added support to benchmark memory using HFTrainer and updated README with explanation of the 2 memory benchmarking options
* addressed changes requested in PR #14
* fix bug and simplify GPU logs aggregation logic
* fixes to calculation of HFTrainer memory logging values
* fix calculations
* more fixes
* fix to exclude the stage from the max calculation of allocated memory
* more comments and README updates
* added fix for KeyError caused by an empty output dict after OOM
* manual linting
* added benchmark results to refs
* remove unnecessary columns in results gathering
* made changes to results gathering
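As a rough illustration of the nvidia-smi based memory logging mentioned in these notes, the sketch below parses periodic `nvidia-smi` samples and collates a per-GPU peak. It is not the code from this PR: the function names (`parse_sample`, `peak_per_gpu`, `sample_once`) and the choice of `max` as the aggregation are assumptions made for illustration. Only the `nvidia-smi` query flags are real CLI options.

```python
import subprocess
from typing import Dict, List

# Real nvidia-smi options: query the GPU index and used memory (in MiB), CSV without header or units.
QUERY = ["nvidia-smi", "--query-gpu=index,memory.used", "--format=csv,noheader,nounits"]


def parse_sample(csv_text: str) -> Dict[int, int]:
    """Parse one nvidia-smi CSV sample into {gpu_index: used_MiB} (hypothetical helper)."""
    readings: Dict[int, int] = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used = (field.strip() for field in line.split(","))
        readings[int(index)] = int(used)
    return readings


def peak_per_gpu(samples: List[Dict[int, int]]) -> Dict[int, int]:
    """Collate samples into the peak reading per GPU.

    Returns an empty dict when there are no samples (for example, a run
    that produced no output), so callers can check for that case instead
    of indexing into missing keys.
    """
    peaks: Dict[int, int] = {}
    for sample in samples:
        for gpu, used in sample.items():
            peaks[gpu] = max(used, peaks.get(gpu, 0))
    return peaks


def sample_once() -> Dict[int, int]:
    """Take one live reading; requires nvidia-smi on the PATH."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_sample(out)
```

A caller would typically invoke `sample_once()` on a timer while the benchmark runs, then pass the collected list to `peak_per_gpu`. Because `nvidia-smi` reports memory held by the process as a whole, these readings correspond to reserved rather than allocated memory, which is why the notes distinguish this option from the HFTrainer-based one.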

fabianlim and others added 5 commits May 17, 2024 10:33

Added support for running official HF baseline FSDP-QLoRA benchmark (#16)

d510ceb

* new baseline scenario
* rename variables
* added warning when plugin allows SFTTrainer to handle PEFT on single device

fabianlim self-assigned this May 27, 2024

fabianlim merged commit e8e06c9 into main May 27, 2024
4 checks passed

This was referenced May 27, 2024

Revert "Upstream Main: Linting, Benchmarking, HF QLoRA baseline, FSDP fixes for GPTQ-LoRA" #21

Closed

Upstream Main: Linting, Benchmarking, HF QLoRA baseline, FSDP fixes for GPTQ-LoRA #22

Merged
