ReIntroduce Package for FMS Accel #223

Merged
3 commits merged into foundation-model-stack:main on Jul 9, 2024

Conversation

@fabianlim (Collaborator) commented Jun 30, 2024

Description of the change

@tedhtchang @anhuong
This reintroduces the fms-acceleration package, which will become available once foundation-model-stack/fms-acceleration#45 is merged.

Related issue number

#219

How to verify the PR

  1. Install and check that it is properly installed:
pip install "fms-hf-tuning[fms-accel] @ git+https://github.com/fabianlim/fms-hf-tuning.git@fix/accel-ref"

Then verify the install output:

Successfully installed fms-acceleration-0.1.0 fms-hf-tuning-0.1.dev208+g3157e6f simpleeval-0.9.13 tokenizers-0.15.2 transformers-4.39.3
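
As an extra sanity check (a minimal sketch, not part of the PR's documented steps), you can also confirm the package imports and that pip sees it:

$ python -c "import fms_acceleration"
$ pip show fms-acceleration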

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

@fabianlim fabianlim requested review from tedhtchang and anhuong and removed request for alex-jw-brooks, Ssukriti and anhuong June 30, 2024 15:58
@fabianlim fabianlim self-assigned this Jun 30, 2024
@fabianlim fabianlim marked this pull request as draft June 30, 2024 15:58
@fabianlim fabianlim marked this pull request as ready for review July 1, 2024 07:18
tedhtchang previously approved these changes Jul 5, 2024

@tedhtchang (Collaborator) left a comment:

/LGTM

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim (Collaborator, Author) commented:

@tedhtchang is there anybody else who should review this, or can we merge?

@anhuong (Collaborator) commented Jul 8, 2024

@fabianlim I am reviewing and testing; will get my review in today.

anhuong previously approved these changes Jul 8, 2024

@anhuong (Collaborator) left a comment:

Looks good. I had a question on the usage of the fms-acceleration library: is it assumed that if one installs fms-acceleration, one must configure an acceleration framework? Because if I install it and try to run sft_trainer normally, it fails with the error ValueError: No plugins could be configured. Please check the acceleration framework configuration file.
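
For reference, a run that does configure the framework would point sft_trainer at an acceleration config file. This is a hypothetical sketch; the --acceleration_framework_config_file flag name and the config path are assumptions inferred from the error message, not verified in this PR:

# Hypothetical invocation; flag name and ./framework.yaml are assumptions.
$ python tuning/sft_trainer.py \
    --acceleration_framework_config_file ./framework.yaml \
    ...usual training arguments...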

@anhuong (Collaborator) commented Jul 8, 2024

Also, when I try installing the plugin, I get an error that torch isn't installed when it is.

$ pip install fms-acceleration
...
Successfully installed fms-acceleration-0.1.1 tokenizers-0.15.2 transformers-4.39.3

$ python -m fms_acceleration.cli install fms_acceleration_peft
/usr/local/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
Collecting git+https://github.com/foundation-model-stack/fms-acceleration.git#subdirectory=plugins/accelerated-peft
  Cloning https://github.com/foundation-model-stack/fms-acceleration.git to /tmp/pip-req-build-c7fg3o71
  Running command git clone --filter=blob:none --quiet https://github.com/foundation-model-stack/fms-acceleration.git /tmp/pip-req-build-c7fg3o71
  Resolved https://github.com/foundation-model-stack/fms-acceleration.git to commit 06dcc4d967cb13842660eed0ebf403a191748ce8
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting auto-gptq@ git+https://github.com/AutoGPTQ/AutoGPTQ.git@ea829c7bbe83561c2b1de26795b6592992373ef7 (from fms-acceleration-peft==0.0.1)
  Cloning https://github.com/AutoGPTQ/AutoGPTQ.git (to revision ea829c7bbe83561c2b1de26795b6592992373ef7) to /tmp/pip-install-91u6z775/auto-gptq_65c3bf33f0094c9295adc1efa87da6b0
  Running command git clone --filter=blob:none --quiet https://github.com/AutoGPTQ/AutoGPTQ.git /tmp/pip-install-91u6z775/auto-gptq_65c3bf33f0094c9295adc1efa87da6b0
  Running command git rev-parse -q --verify 'sha^ea829c7bbe83561c2b1de26795b6592992373ef7'
  Running command git fetch -q https://github.com/AutoGPTQ/AutoGPTQ.git ea829c7bbe83561c2b1de26795b6592992373ef7
  Running command git checkout -q ea829c7bbe83561c2b1de26795b6592992373ef7
  Resolved https://github.com/AutoGPTQ/AutoGPTQ.git to commit ea829c7bbe83561c2b1de26795b6592992373ef7
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [2 lines of output]
      Building PyTorch CUDA extension requires PyTorch being installed, please install PyTorch first: No module named 'torch'.
       NOTE: This issue may be raised due to pip build isolation system (ignoring local packages). Please use `--no-build-isolation` when installing with pip, and refer to https://github.com/AutoGPTQ/AutoGPTQ/pull/620 for more details.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

# but I see torch does exist
$ pip show torch
Name: torch
Version: 2.3.1
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /home/tuning/.local/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, nvidia-cublas-cu12, nvidia-cuda-cupti-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-runtime-cu12, nvidia-cudnn-cu12, nvidia-cufft-cu12, nvidia-curand-cu12, nvidia-cusolver-cu12, nvidia-cusparse-cu12, nvidia-nccl-cu12, nvidia-nvtx-cu12, sympy, triton, typing-extensions
Required-by: accelerate, flash-attn, fms-acceleration, fms-hf-tuning, peft, trl

# and the following works as well
$ python 
Python 3.11.7 (main, May 16 2024, 00:00:00) [GCC 11.4.1 20231218 (Red Hat 11.4.1-3)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> import fms_acceleration

Also verified that flash-attn and packaging are installed.

@fabianlim fabianlim dismissed stale reviews from anhuong and tedhtchang via 1d2b016 July 9, 2024 02:58
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim (Collaborator, Author) commented Jul 9, 2024

@anhuong thanks for reviewing. We have pushed another commit 3ef14a2 to clean up one of the previous commits.

> Looks good, I had a question on the usage of fms-acceleration library. It is assumed that if one installs fms-acceleration, that one must configure an acceleration framework correct? Because if I install and try to run sft_trainer normally, it fails with error ValueError: No plugins could be configured. Please check the acceleration framework configuration file.

This is a little strange; the code should already handle this case. As you can see below, I have fms-accel installed:

fms-acceleration==0.1.1
fms-hf-tuning @ file:///data/repos/fms-hf-tuning

And if I run the following, without any framework config arguments, it runs as normal. Maybe you can test on the latest commit.

TRANSFORMERS_VERBOSITY=info \
	python \
	tuning/sft_trainer.py \
	--training_data_path $DATA_PATH \
	--output_dir ./results \
	--num_train_epochs 1 \
	--torch_dtype float16 \
	--model_name TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
	--per_device_train_batch_size 4 \
	--per_device_eval_batch_size 1 \
	--gradient_accumulation_steps 1 \
	--gradient_checkpointing True \
	--evaluation_strategy "no" \
	--save_strategy "no" \
	--learning_rate 2e-4 \
	--weight_decay 0.01 \
	--warmup_steps 10 \
	--adam_epsilon 1e-4 \
	--lr_scheduler_type "linear" \
	--logging_strategy steps \
	--logging_steps 10 \
	--include_tokens_per_second \
	--packing True \
	--use_flash_attn True \
	--response_template "\n### Response:" \
	--dataset_text_field "output" \
	--max_steps 200 \
	--peft_method lora \
	--r 16 --lora_alpha 16 --lora_dropout 0.1 \
	--target_modules q_proj k_proj v_proj o_proj

As you can see, no fms-acceleration framework plugins are activated:

***** Running training *****
  Num examples = 3,251
  Num Epochs = 1
  Instantaneous batch size per device = 4
  Total train batch size (w. parallel, distributed & accumulation) = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 200
  Number of trainable parameters = 4,505,600
  0%|                                                                                                                                                                                                    | 0/200 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
/home/flim/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
  warnings.warn(
{'loss': 1.2318, 'grad_norm': 0.342529296875, 'learning_rate': 0.0002, 'epoch': 0.01}

> Also when I try installing the plugin, I get an error that torch isn't installed when it is.

The error is not a torch installation error, but rather that AutoGPTQ requires the CUDA toolkit to be installed.

  • In our README, we provide sample NVIDIA images that have the CUDA toolkit already installed.
  • This is a common thing with libraries that ship kernels. For example, flash-attention used to have this problem (before they published their prebuilt binaries).
  • And we have an issue to remove the CUDA toolkit dependency by extracting out only the Triton portions of AutoGPTQ. This will be merged and published soon, making the installation process more seamless for future users. In the meantime, the workaround suggested in the error output may help; see the sketch below.
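
As the AutoGPTQ note in the error output suggests, one interim workaround (a sketch, not verified here) is to install the plugin with pip's build isolation disabled, so the build can see the torch already in the environment. The URL below is the same one the fms_acceleration CLI resolves to in the log above:

# Sketch: install the accelerated-peft plugin directly, without build isolation.
$ pip install --no-build-isolation "git+https://github.com/foundation-model-stack/fms-acceleration.git#subdirectory=plugins/accelerated-peft"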

If all looks good to you again, do you mind giving me another approval so I can merge?

@anhuong (Collaborator) left a comment:

Thanks for the details, Fabian. Running on this PR, I no longer get the initial error where, with fms-acceleration installed, I was forced to configure it in order to run sft_trainer.py.

@anhuong anhuong merged commit bf22a2f into foundation-model-stack:main Jul 9, 2024
7 checks passed