-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: use poetry for reproducible virtual environments #209
base: main
Are you sure you want to change the base?
build: use poetry for reproducible virtual environments #209
Conversation
pyproject.toml
Outdated
optional=true | ||
|
||
[tool.poetry.group.fms-accel.dependencies] | ||
fms_acceleration = {git = "https://github.com/foundation-model-stack/fms-acceleration.git", subdirectory="plugins/framework", rev = "40aad4683e01be3faebbcaae2670524c431cbf3e"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this doesn't make sense, so I'll revert this change. Poetry already takes care of "pinning" this dependency for the reproducible virtual environments it creates (i.e. the poetry.lock
file).
We just have to run poetry lock
(or manually update python dependencies) whenever fms_acceleration releases a new version and we'd like to update the dependencies of fms-hf-tuning to match those of fms_acceleration.
For example, at the moment fms_acceleration has a constraint transformers<4.40.0
which propagates to fms-hf-tuning and sets transformers to 4.39.3.
c0ebb5d
to
336a684
Compare
@@ -515,7 +515,7 @@ def main(**kwargs): # pylint: disable=unused-argument | |||
exp_metadata, | |||
) | |||
except Exception as e: # pylint: disable=broad-except | |||
logging.error(traceback.format_exc()) | |||
logging.get_logger("error").error(traceback.format_exc()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious what's this change for ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logging.error
isn't part of transformers.utils.logging
.
For example, when you run this:
python -c 'from transformers.utils import logging; logging.error("foo")'
python raises the following exception:
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: module 'transformers.utils.logging' has no attribute 'error'
so I just created a logger object called "error" and had that print the traceback. I assume that this line of code was originally using the logging
module for which logging.error("foo")
works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I don't remember how I made sft_trainer.py execute that code.
85da5d9
to
e56ee75
Compare
I updated the description of the issue with a couple more changes that I had to do for the tests to pass. |
For reviewers For package developers, these are the most used poetry commands. So you may follow those examples to verify the PR:
|
@VassilisVassiliadis some files have been updated could you rebase. thanks |
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
pip install `fms-hf-tuning[flash-attn]` is equivalent to `poetry install --extras flash-attn` not `poetry install --with flash-attn`. Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Co-authored-by: ted chang <[email protected]> Signed-off-by: Vassilis Vassiliadis <[email protected]>
…onment Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
Signed-off-by: Vassilis Vassiliadis <[email protected]>
d939b0c
to
ca9ab4e
Compare
I rebased the PR and removed the instructions to install fms-accell using poetry. I replaced that text with a link to the relevant section in the README.md file. |
Signed-off-by: Vassilis Vassiliadis <[email protected]>
I also tested building the dockerfile for the last release of fms-hf-tuning on pyli i.e.
In a future PR we could have the images built for a release actually install the dependencies from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VassilisVassiliadis Thanks this looks good. And I agree. We need a mechanism to build/rebuild any release image with consistent dependencies for a given commit/tag/sha.
Hi @tedhtchang, I tried it and it looked good...
and see that it had an update:
I was then able to install aim with the original orjson and then update it:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, tested on my Red Hat Server
``` | ||
|
||
If you wish to use [fms-acceleration](https://github.com/foundation-model-stack/fms-acceleration) follow the instructions in [this section of README.md](README.md#fms-acceleration). | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, you could continue `pip install -e . `, if you do not wish to leverage the lock file and have environmental contrainsts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some changes, please review.
Additionally in release process, @tedhtchang please add a post step to run poetry update
and commit the new lock file. This has to be done post release by maintainers to keep it updated
|
||
# Install from the wheel | ||
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi | |
# Install using poetry if PyPi wheel_version is empty else download the wheel from PyPi | |
# If creating your own dockerfile we suggest to use poetry export for a reproducible environment |
Signed-off-by: Sukriti-Sharma4 <[email protected]>
Signed-off-by: Sukriti-Sharma4 <[email protected]>
@@ -43,6 +43,8 @@ jobs: | |||
run: | | |||
python -m pip install --upgrade pip | |||
python -m pip install tox | |||
python -m pip install poetry --user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is there a need to install this to --user
? If you can install peotry the same way as tox, that this obliviates the need for the PATH
setting below.
If the aim is for isolation, then I dont think installing in the user space may achieve it. Since any further pip install commands
will still find the peoetry
in user
and try to update it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There quite a few instances of these in other worflows as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tox always creates it's own isolated venv.
Agree it would probably work without --user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, tox
and poetry
should go in the same virtual-environment and that environment should not be the one that we install fms-hf-tuning
in. Because tox
creates a new virtual environment for the environments
it processes (e.g. py
, `fmt, etc) poetry and tox will be outside the tested virtual-environment.
We don't need to run tox "inside" one of the virtual environments that tox creates but we do need to run poetry
. Here's how I chose to do that.
First, I installed poetry under the user pip install directory (which by default for linux is ~/.local/
. Then I updated the file that $GITHUB_ENV points to so that in subsequent steps the $PATH
environment variable contains the path ~/.local/bin
.
This ensures that the poetry
commands running inside tox
can use the poetry
executable. Alternatively, we can create a new virtual environment under a directory of our choosing e.g. /tmp/isolated
in there, we install just poetry and then update $GITHUB_ENV
so that $PATH
includes /tmp/isolated/bin
.
@@ -42,6 +42,8 @@ If additional new Python module dependencies are required, think about where to | |||
- If they're optional dependencies for additional functionality, then put them in the pyproject.toml file like were done for [flash-attn](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L44) or [aim](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L45). | |||
- If it's an additional dependency for development, then add it to the [dev](https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/pyproject.toml#L43) dependencies. | |||
|
|||
if using `poetry` to install dependencies, you can optionally leverage [poetry add](https://python-poetry.org/docs/cli/#add) to add new dependencies to pyproject. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel this is abit confusing. If we have a peoetry flow, and there is a new dep, shouldnt we have some guidance on when it has to be added to the lockfile?
Also some guidance to say that for optional deps it should not be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true, maybe we can continue adding to pyproject.yml , like we do now and then we have to run poetry update
?
I can change it to that instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There're 2 ways to add a new dependency. One is to use poetry
i.e. poetry add
the other is to manually edit the pyproject.toml
file and then run poetry update
which will then ensure poetry.lock
is consistent with pyproject.toml (that's basically what poetry add
does on your behalf).
If you opt for the 2nd route but forget to run poetry update
then the next time you run poetry install
(or the CI/CD runs it) you'll get an exception that the poetry.lock
and pyproject.toml
files are inconsistent and that you should run the poetry update
command.
@VassilisVassiliadis there is a PR here to add back the
|
|
||
We have committed a `poetry.lock` file to allow reproducible enviornments. If building from source, you can clone the repository and use poetry to install as mentioned in [development docs](/CONTRIBUTING.md#development) | ||
|
||
If building in a dockerfile use `poetry export --format requirements.txt` can install same dependencies from the lock file. Maintainers regularly update the lock file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is not very clear.. this will do not any installation correct? Its just an export of the locked deps to a requirements file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ya its just to export the requirements and same lock file and then we have to install. hmm should be just link our dockerfile as example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VassilisVassiliadis there is a PR here to add back the [.fms-accel] dep. It was removed before with the intention of being added back #223
@fabianlim I'll rebase the PR once #223 makes its way into the main
branch.
why will poetry constraint packages under optional dependencies? that makes no sense
I'm actually not sure. I assume that they do this so that poetry install
and poetry install -E fms-accel
both end up installing the same non-optional dependencies (in our case transformers).
Description of the change
accelerate.commands.launch.launch_command(args)
raises when the accelerate optionnum_processes
is greater than 1test_launch_script.py
to try and use a free port. This is useful if you have a default accelerate_config.yml file (e.g. under~/.cache/transformers
) with a num_processes > 1 field.--eval_strategy epoch
to fallback to--evaluation_strategy
for backwards compatibility with older version of transformers (e.g.4.39.3
and4.40.2
).due to fms_acceleration having a constraint to transformers<4.40.0 transformers gets set to 4.39.3 when you use poetry to install fms-hf-tuning (even if you do not add the cli argfms_acceleration is no longer an optional package of fms-hf-tuning, there's an explicit constraint for transformers which restricts the package to <=4.40.2--with fms-accel
).get_train_args()
intest_sft_trainer.py
for building theconfigs.TrainingArguments
objectRelated issue number
Contributes to #60
How to verify the PR
Use
poetry install
to installfms-hf-tuning
from source. You will end up installing the python dependencies listed inpoetry.lock
Was the PR tested