Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Config manager supports command line overrides #69

Closed
wants to merge 2 commits into from

Conversation

gnadathur
Copy link
Contributor

@gnadathur gnadathur commented Feb 23, 2024

Stack from ghstack (oldest at bottom):

Summary:
PR implements following enhancements to config manager.

  1. Command line args and toml file args are now unified.
  2. Defaults can be loaded from either.
  3. Defaults can be overridden through command line. Overrides will
    be applied irrespective of where the defaults were loaded from.

Test Plan:
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.0.1, pluggy-1.4.0 -- /home/gnadathur/local/a/pytorch-env/bin/python
cachedir: .pytest_cache
rootdir: /data/users/gnadathur/a/torchtrain
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 5 items

test/test_job_config.py::TestJobConfig::test_command_line_args PASSED [ 20%]
test/test_job_config.py::TestJobConfig::test_command_line_args_with_override PASSED [ 40%]
test/test_job_config.py::TestJobConfig::test_job_config_file PASSED [ 60%]
test/test_job_config.py::TestJobConfig::test_job_config_file_with_override PASSED [ 80%]
test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist PASSED [100%]

---------- coverage: platform linux, python 3.10.13-final-0 ----------
Coverage XML written to file coverage.xml

============================= slowest 20 durations =============================
0.01s call test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s call test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s call test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s call test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s call test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s setup test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s setup test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s setup test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
============================== 5 passed in 0.10s ===============================

Successful 10 iterations llama.sh run

Reviewers:

Subscribers:

Tasks:

Tags:

Summary:
PR implements following enhancements to config manager.
1. Command line args and toml file args are now unified.
2. Defaults can be loaded from either.
3. Defaults can be overridden through command line. Overrides will
be applied irrespective of where the defaults were loaded from.

Test Plan:
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.0.1, pluggy-1.4.0 -- /home/gnadathur/local/a/pytorch-env/bin/python
cachedir: .pytest_cache
rootdir: /data/users/gnadathur/a/torchtrain
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 5 items

test/test_job_config.py::TestJobConfig::test_command_line_args PASSED    [ 20%]
test/test_job_config.py::TestJobConfig::test_command_line_args_with_override PASSED [ 40%]
test/test_job_config.py::TestJobConfig::test_job_config_file PASSED      [ 60%]
test/test_job_config.py::TestJobConfig::test_job_config_file_with_override PASSED [ 80%]
test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist PASSED [100%]

---------- coverage: platform linux, python 3.10.13-final-0 ----------
Coverage XML written to file coverage.xml

============================= slowest 20 durations =============================
0.01s call     test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
============================== 5 passed in 0.10s ===============================

Successful 10 iterations llama.sh run

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
gnadathur pushed a commit that referenced this pull request Feb 23, 2024
Summary:
PR implements following enhancements to config manager.
1. Command line args and toml file args are now unified.
2. Defaults can be loaded from either.
3. Defaults can be overridden through command line. Overrides will
be applied irrespective of where the defaults were loaded from.

Test Plan:
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.0.1, pluggy-1.4.0 -- /home/gnadathur/local/a/pytorch-env/bin/python
cachedir: .pytest_cache
rootdir: /data/users/gnadathur/a/torchtrain
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 5 items

test/test_job_config.py::TestJobConfig::test_command_line_args PASSED    [ 20%]
test/test_job_config.py::TestJobConfig::test_command_line_args_with_override PASSED [ 40%]
test/test_job_config.py::TestJobConfig::test_job_config_file PASSED      [ 60%]
test/test_job_config.py::TestJobConfig::test_job_config_file_with_override PASSED [ 80%]
test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist PASSED [100%]

---------- coverage: platform linux, python 3.10.13-final-0 ----------
Coverage XML written to file coverage.xml

============================= slowest 20 durations =============================
0.01s call     test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
============================== 5 passed in 0.10s ===============================

Successful 10 iterations llama.sh run

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 9a0f588849f520652a431df438318be3ab0f9578
Pull Request resolved: #69
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Feb 23, 2024
Summary:
PR implements following enhancements to config manager.
1. Command line args and toml file args are now unified.
2. Defaults can be loaded from either.
3. Defaults can be overridden through command line. Overrides will
be applied irrespective of where the defaults were loaded from.

Test Plan:
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.0.1, pluggy-1.4.0 -- /home/gnadathur/local/a/pytorch-env/bin/python
cachedir: .pytest_cache
rootdir: /data/users/gnadathur/a/torchtrain
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 5 items

test/test_job_config.py::TestJobConfig::test_command_line_args PASSED    [ 20%]
test/test_job_config.py::TestJobConfig::test_command_line_args_with_override PASSED [ 40%]
test/test_job_config.py::TestJobConfig::test_job_config_file PASSED      [ 60%]
test/test_job_config.py::TestJobConfig::test_job_config_file_with_override PASSED [ 80%]
test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist PASSED [100%]

---------- coverage: platform linux, python 3.10.13-final-0 ----------
Coverage XML written to file coverage.xml

============================= slowest 20 durations =============================
0.01s call     test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
============================== 5 passed in 0.10s ===============================

Successful 10 iterations llama.sh run

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
gnadathur pushed a commit that referenced this pull request Feb 23, 2024
Summary:
PR implements following enhancements to config manager.
1. Command line args and toml file args are now unified.
2. Defaults can be loaded from either.
3. Defaults can be overridden through command line. Overrides will
be applied irrespective of where the defaults were loaded from.

Test Plan:
============================= test session starts ==============================
platform linux -- Python 3.10.13, pytest-8.0.1, pluggy-1.4.0 -- /home/gnadathur/local/a/pytorch-env/bin/python
cachedir: .pytest_cache
rootdir: /data/users/gnadathur/a/torchtrain
configfile: pyproject.toml
plugins: cov-4.1.0
collecting ... collected 5 items

test/test_job_config.py::TestJobConfig::test_command_line_args PASSED    [ 20%]
test/test_job_config.py::TestJobConfig::test_command_line_args_with_override PASSED [ 40%]
test/test_job_config.py::TestJobConfig::test_job_config_file PASSED      [ 60%]
test/test_job_config.py::TestJobConfig::test_job_config_file_with_override PASSED [ 80%]
test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist PASSED [100%]

---------- coverage: platform linux, python 3.10.13-final-0 ----------
Coverage XML written to file coverage.xml

============================= slowest 20 durations =============================
0.01s call     test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s call     test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s call     test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s setup    test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s teardown test/test_job_config.py::TestJobConfig::test_command_line_args_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
0.00s setup    test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_file_does_not_exist
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file
0.00s teardown test/test_job_config.py::TestJobConfig::test_job_config_file_with_override
============================== 5 passed in 0.10s ===============================

Successful 10 iterations llama.sh run

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 9a0f588849f520652a431df438318be3ab0f9578
Pull Request resolved: #69
Copy link
Contributor

@wanchaol wanchaol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks great to me! have a few suggestions inlined


args = parser.parse_args()
main(args)
config = JobConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should define a config_from_cli method in the config_manager module, and do things like config = config_from_cli(JobConfig())

@@ -15,7 +15,7 @@ log_freq = 10

[model]
name = "llama"
model_conf = "debugmodel"
config = "debugmodel"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe call it type to avoid confusion with JobConfig?

Semantics:
- Default config is loaded from a toml file. If no toml file is provided,
then the default config is loaded from argparse defaults.
- Then, Override config is loaded from command line arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should either accept a config file, or a full set of arguments, wondering do you think override config would be something helpful here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we assume exclusive either or, that simplifies implementation. There are a couple of use cases to consider:

  1. Do we want the ability to use a base config file for a model say LLaMa-13b and submit a bunch of jobs with say different batch size overridden from command line ?

  2. For using it with internal environments, would we be able to pass the full set of arguments through command line or do we want to leverage command line defaults and pass overrides only ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So regarding the cases:

  1. I think often the time if I want to submit a job with a slight different config, I just modify the config file locally, and the torchx launcher is able to pick up the changes, we can also create separate configs for such a case
  2. for internal using, torchx launcher would also do the same thing, load the config options from a config file, and launch with those config options, and for the case where the config is not specified in the config file, the command line defaults is valuable there.

return True

def _init_args(self) -> Tuple:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can put both: arg parser initialization, and arg parsing into this function, call it sth like parse_args? what I think would be nice in train.py:

config = JobConfig()
config.parse_args(args[1:])
main(config)

class_type = type(k.title(), (), v)
setattr(self, k, class_type())
self._validate_config()
default_args, override_args = self._init_args()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function can happen inside parse_args?

@gnadathur gnadathur closed this Feb 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants