Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for 4 merging modes of DictConfig #917

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

luisherrmann
Copy link

@luisherrmann luisherrmann commented May 9, 2022

See the description of omegaconf/omegaconf.py. Essentially, the idea is to introduce 4 different modes of merging DictConfigs on their keys, where the keys are to be understood as nested keys. E.g. the dictionary

  • a1: 1
  • a2:
    • b1: 2
    • b2:
      • c1: 3
      • c2: 4

or {a1: 1, a2: {b1: 2, b2: {c1: 3, c2: 4}}} can be understood as a dictionary mapping nested keys to values as

  • a1: 1
  • a2.b1: 2
  • a2.b2.c1: 3
  • a2.b2.c2: 4

The way of merging can be provided through the how keyword in OmegaConf.merge(), for example as OmegaConf.merge(config1, config2, how='left').

The four different modes are as follows:

1. left:

Only merge on keys from the left DictConfig. E.g.
{a1: 1, a2: {b1: 2, b2: 3}}, {a2: {b2: 11, b2: 12}, a3: 13} -> {a1: 1, a2: {b1: 2, b2: 11}}

2. inner:

Only merge on keys from both DictConfigs. E.g.
{a1: 1, a2: {b1: 2, b2: 3}}, {a2: {b2: 11, b2: 12}, a3: 13} -> {a2: {b2: 11}}

3. outer-left:

Merge on keys from both DictConfigs. E.g.
{a1: 1, a2: {b1: 2, b2: 3}}, {a2: {b2: 11, b2: 12}, a3: 13} -> {a1: 1, a2: {b1: 2, b2: 11, b3: }}
If nesting levels conflict, use keys on the left. E.g.
{a1: {b1: 9}}, {a1: {b1: {c1: 21}}, a2: 22} -> {a1: {b1: 9}, a2: 22}

4. outer-right:

Merge on keys from both DictConfigs (see outer-left).
If nesting levels conflict, use keys on the right. E.g.
{a1: {b1: 9}}, {a1: {b1: {c1: 21}}, a2: 22} -> {a1: {b1: {c1: 21}}, a2: 22}

The current default is "outer-right", which corresponds to the current Omegaconf merging behavior.

@Jasha10
Copy link
Collaborator

Jasha10 commented May 11, 2022

Hi @luisherrmann,
Thank you for contributing this interesting idea.

To make sure the tests are correct, I would want to add tests for interaction of the merge modes with each of the following:

  • None values
  • OmegaConf interpolations (both interpolations to scalars and to other dicts/lists)
  • OmegaConf missing values (???)
  • merging deeply nested containers

To be honest, I'm not sure if having many merge modes is in-scope for this project.
There are questions such as "should the nested DictConfig object have the same merge mode as the top-level DictConfig object?" What if I want to do an "inner" merge at the top level and an "outer" merge at a nested level? Since there are so many possible variations, I feel that it may be better for the core omegaconf package to support just one simple merge strategy and have 3rd-party libraries implement more advanced functionality. What do you think?

@luisherrmann
Copy link
Author

luisherrmann commented May 17, 2022

Hello Jasha,

thanks for considering the idea! Your suggestion for extended tests sounds very reasonable, I will add the tests you suggested as soon as I have a free moment. 🙂

Regarding your concerns about the scope of the project, I think the different merge-modes can be quite useful for cases where you may have to resolve merge conflicts between OmegaConfs, i.e. in cases where you have OmegaConfs which conform to a certain nested structure, but need to merge those with configs that could conform to a different nested structure.

Personally, I have encountered this issue on a project where we needed to merge several configs, and some of those could be legacy versions with a different nested structure. Since we needed a way to deal with the possible legacy versions, I thought it would be a good idea to allow for possible conflicts to be addressed in different ways, by specifying how exactly to merge the OmegaConfs. These 4 different modes should cover all basic scenarios. In projects where the code base evolves very quickly (e.g. Data Science or ML projects), I expect it to be a common problem that the config structure needs to change very often in order to accommodate to new requirements from experimental results. However, supporting old configs may still be necessary, either because it takes time to rewrite the configs to match the new requirements, or because old configs still need to be supported to allow for backwards-compatibility, reproducibility of old experiments, etc.

You are of course right that the possibilities of merging different OmegaConfs could, in principle, become arbitrarily complex as the config size increases, but I think the four basic scenarios left, inner, outer-left and outer-right should cover the most common use cases. My reasoning here is that if left, inner, right and outer cover most use cases of merging database tables on keys, then these four cases should also cover most use cases of dict merging on (nested) keys. However, I don't have a mode right here because in the context of dict merging, it would make no sense to merge a src dict into a dest dict and only use keys of the src dict - this would simply be equivalent to using the src dict - and for the outer merge, it makes sense to deal with conflicting nesting levels in two possible ways, thus leading to outer-left and outer-right. The latter is how OmegaConf merging works right now, the former is how I needed OmegaConf merging to work in the project I mentioned earlier. 😉 But I can definitely see left or inner being useful as well.

What are your thoughts? Can you see this being a useful contribution to the OmegaConf project after reading more about my motivation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants