Optimize map unions to avoid building long lists #14215

Merged
merged 6 commits into from
Jan 24, 2025

@sabiwara sabiwara commented Jan 23, 2025

Fixes #14203
Replaces #14205

Only implemented for maps for now; once approved, I'm happy to port it to tuples.

We make only one pass over the keys and can switch strategy midway (including simply bailing out if a key is missing or an expectation is not met).
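The single-pass idea can be sketched on a toy model. This is not the code from this PR: the module and function names are hypothetical, field types are modelled as `MapSet`s of tags with subtyping as set inclusion, and only same-size maps are handled. The strategy atoms are the ones that appear in the reviewed `case` expression.

```elixir
defmodule UnionStrategySketch do
  # Toy model (an assumption, not the compiler's representation):
  # a field type is a MapSet of tags and subtyping is set inclusion.
  defp subtype?(a, b), do: MapSet.subset?(a, b)

  # Returns :all_equal, {:one_key_difference, key, v1, v2},
  # :left_subtype_of_right, :right_subtype_of_left, or nil when we bail.
  def strategy(pos1, pos2) when map_size(pos1) == map_size(pos2) do
    pos1 |> :maps.iterator() |> :maps.next() |> walk(pos2, :all_equal)
  end

  def strategy(_pos1, _pos2), do: nil

  # Iterator exhausted: the status that survived the pass is the answer.
  defp walk(:none, _pos2, status), do: status

  defp walk({key, v1, iterator}, pos2, status) do
    # Bail (nil) as soon as a key is missing or no strategy fits anymore.
    with %{^key => v2} <- pos2,
         next_status when next_status != nil <- advance(status, key, v1, v2) do
      walk(:maps.next(iterator), pos2, next_status)
    else
      _ -> nil
    end
  end

  defp advance(status, key, v1, v2) do
    cond do
      # Equal values never change the ongoing strategy.
      MapSet.equal?(v1, v2) -> status
      # First difference seen so far.
      status == :all_equal -> {:one_key_difference, key, v1, v2}
      # Second or later difference: only a subtype relation can still hold.
      true -> switch(status, v1, v2)
    end
  end

  defp switch({:one_key_difference, _key, d1, d2}, v1, v2) do
    cond do
      subtype?(d1, d2) and subtype?(v1, v2) -> :left_subtype_of_right
      subtype?(d2, d1) and subtype?(v2, v1) -> :right_subtype_of_left
      true -> nil
    end
  end

  defp switch(:left_subtype_of_right, v1, v2),
    do: if(subtype?(v1, v2), do: :left_subtype_of_right)

  defp switch(:right_subtype_of_left, v1, v2),
    do: if(subtype?(v2, v1), do: :right_subtype_of_left)
end
```

The status argument is what lets the pass change its mind: it starts optimistic (`:all_equal`), downgrades on the first difference, and from the second difference onward either keeps a subtype direction or gives up.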

It also fixes the bamboo example.

Perf-wise, very little time seems to be spent in the new code:

Profiling details

:tprof.profile(Mix, :install, [[:bamboo], [force: true]], %{type: :call_time, pattern: [{Module.Types.Descr,:_, :_}], report: {:total, {:measurement, :descending}}})
Generated bamboo app
FUNCTION                                                                        CALLS  TIME (us)  PER CALL  [    %]
'Elixir.Module.Types.Descr':'empty?'/1                                         133937      10362      0.08  [10.89]
'Elixir.Module.Types.Descr':iterator_intersection/4                             50587       7093      0.14  [ 7.45]
'Elixir.Module.Types.Descr':'non_empty_intersection!'/2                         23751       4280      0.18  [ 4.50]
'Elixir.Module.Types.Descr':intersection/2                                      45968       4043      0.09  [ 4.25]
'Elixir.Module.Types.Descr':'gradual?'/1                                        54133       3706      0.07  [ 3.89]
'Elixir.Module.Types.Descr':term/0                                              54112       3028      0.06  [ 3.18]
'Elixir.Module.Types.Descr':'map_empty?'/3                                       4815       2609      0.54  [ 2.74]
'Elixir.Module.Types.Descr':iterator_merge/3                                    23944       2247      0.09  [ 2.36]
'Elixir.Module.Types.Descr':'disjoint?'/2                                       19417       2235      0.12  [ 2.35]
'Elixir.Module.Types.Descr':union/2                                             19824       2216      0.11  [ 2.33]
'Elixir.Module.Types.Descr':atom/1                                              17552       2116      0.12  [ 2.22]
'Elixir.Module.Types.Descr':atom_new/1                                          17552       1935      0.11  [ 2.03]
'Elixir.Module.Types.Descr':symmetrical_intersection/3                          13292       1909      0.14  [ 2.01]
'Elixir.Module.Types.Descr':'compatible?'/2                                      9400       1877      0.20  [ 1.97]
'Elixir.Module.Types.Descr':'-map_empty?/3-fun-0-'/1                            33105       1866      0.06  [ 1.96]
'Elixir.Module.Types.Descr':intersection/3                                      22489       1800      0.08  [ 1.89]
'Elixir.Module.Types.Descr':tuple_fetch/2                                       10462       1691      0.16  [ 1.78]
'Elixir.Module.Types.Descr':compatible_intersection/2                            3983       1643      0.41  [ 1.73]
'Elixir.Module.Types.Descr':atom_fetch/1                                         5452       1434      0.26  [ 1.51]
'Elixir.Module.Types.Descr':map_descr_pairs/3                                   19865       1357      0.07  [ 1.43]
'Elixir.Module.Types.Descr':map_descr/2                                          2568       1284      0.50  [ 1.35]
'Elixir.Module.Types.Descr':'-map_literal_intersection/4-fun-1-'/3              18962       1255      0.07  [ 1.32]
'Elixir.Module.Types.Descr':'zip_non_empty_intersection!'/3                      6882       1233      0.18  [ 1.30]
'Elixir.Module.Types.Descr':tuple_descr/3                                       17812       1227      0.07  [ 1.29]
'Elixir.Module.Types.Descr':dynamic/0                                           18284       1150      0.06  [ 1.21]
'Elixir.Module.Types.Descr':map_fetch/2                                          3249       1150      0.35  [ 1.21]
'Elixir.Module.Types.Descr':symmetrical_merge/3                                 12873        996      0.08  [ 1.05]
'Elixir.Module.Types.Descr':tuple_descr/2                                        5218        942      0.18  [ 0.99]
'Elixir.Module.Types.Descr':none/0                                              12028        854      0.07  [ 0.90]
'Elixir.Module.Types.Descr':union/3                                             10002        816      0.08  [ 0.86]
'Elixir.Module.Types.Descr':iterator_difference_static/2                         5576        783      0.14  [ 0.82]
'Elixir.Module.Types.Descr':'iterator_non_disjoint_intersection?'/2              4830        782      0.16  [ 0.82]
'Elixir.Module.Types.Descr':list_descr/3                                         3239        751      0.23  [ 0.79]
'Elixir.Module.Types.Descr':atom_intersection/2                                  4536        732      0.16  [ 0.77]
'Elixir.Module.Types.Descr':'map_empty?'/1                                       4814        707      0.15  [ 0.74]
'Elixir.Module.Types.Descr':'-map_intersection/2-fun-0-'/5                       4233        694      0.16  [ 0.73]
'Elixir.Module.Types.Descr':map_fetch_static/2                                   4136        675      0.16  [ 0.71]
'Elixir.Module.Types.Descr':'tuple_empty?'/3                                     3337        659      0.20  [ 0.69]
'Elixir.Module.Types.Descr':'-list_intersection/2-fun-0-'/5                      2006        622      0.31  [ 0.65]
'Elixir.Module.Types.Descr':list_pop_dynamic/1                                   6478        616      0.10  [ 0.65]
'Elixir.Module.Types.Descr':'-list_empty?/1-fun-1-'/1                            1252        600      0.48  [ 0.63]
'Elixir.Module.Types.Descr':tuple_fetch_static/2                                 5244        596      0.11  [ 0.63]
'Elixir.Module.Types.Descr':'-map_intersection/2-fun-1-'/3                       4206        591      0.14  [ 0.62]
'Elixir.Module.Types.Descr':dynamic/1                                            8683        587      0.07  [ 0.62]
'Elixir.Module.Types.Descr':map_intersection/2                                   4168        587      0.14  [ 0.62]
'Elixir.Module.Types.Descr':tuple_literal_intersection/4                         3831        558      0.15  [ 0.59]
'Elixir.Module.Types.Descr':'-tuple_intersection/2-fun-0-'/5                     3559        540      0.15  [ 0.57]
'Elixir.Module.Types.Descr':map_literal_intersection/4                           4233        498      0.12  [ 0.52]
'Elixir.Module.Types.Descr':list_hd/1                                            3063        490      0.16  [ 0.51]
'Elixir.Module.Types.Descr':'tuple_empty?'/1                                     3317        490      0.15  [ 0.51]
'Elixir.Module.Types.Descr':'atom_only?'/1                                       5452        478      0.09  [ 0.50]
'Elixir.Module.Types.Descr':'subtype_static?'/2                                  3397        462      0.14  [ 0.49]
'Elixir.Module.Types.Descr':'non_disjoint_intersection?'/2                       4343        427      0.10  [ 0.45]
'Elixir.Module.Types.Descr':'-tuple_intersection/2-fun-1-'/3                     2861        425      0.15  [ 0.45]
'Elixir.Module.Types.Descr':'descr_key?'/2                                       5015        417      0.08  [ 0.44]
'Elixir.Module.Types.Descr':tuple_get/2                                          1707        413      0.24  [ 0.43]
'Elixir.Module.Types.Descr':tuple_intersection/2                                 2472        373      0.15  [ 0.39]
'Elixir.Module.Types.Descr':empty_list/0                                         5154        364      0.07  [ 0.38]
'Elixir.Module.Types.Descr':tuple_new/2                                          5218        339      0.06  [ 0.36]
'Elixir.Module.Types.Descr':atom/0                                               5172        323      0.06  [ 0.34]
'Elixir.Module.Types.Descr':list_intersection/2                                  1992        316      0.16  [ 0.33]
'Elixir.Module.Types.Descr':dynamic_union/2                                      4780        306      0.06  [ 0.32]
'Elixir.Module.Types.Descr':unfolded_term/0                                      4656        305      0.07  [ 0.32]
'Elixir.Module.Types.Descr':'-list_intersection/2-fun-1-'/3                      1996        296      0.15  [ 0.31]
'Elixir.Module.Types.Descr':'-map_empty?/1-fun-0-'/1                             4814        295      0.06  [ 0.31]
'Elixir.Module.Types.Descr':tuple/1                                              5194        294      0.06  [ 0.31]
'Elixir.Module.Types.Descr':'-tuple_get/2-fun-0-'/3                              1808        294      0.16  [ 0.31]
'Elixir.Module.Types.Descr':map_new/2                                            2725        293      0.11  [ 0.31]
'Elixir.Module.Types.Descr':tuple_union/2                                        1148        270      0.24  [ 0.28]
'Elixir.Module.Types.Descr':difference_static/2                                  2868        249      0.09  [ 0.26]
'Elixir.Module.Types.Descr':integer/0                                            3840        240      0.06  [ 0.25]
'Elixir.Module.Types.Descr':'empty_key?'/2                                       3829        233      0.06  [ 0.24]
'Elixir.Module.Types.Descr':dynamic_intersection/2                               3398        229      0.07  [ 0.24]
'Elixir.Module.Types.Descr':binary/0                                             3874        221      0.06  [ 0.23]
'Elixir.Module.Types.Descr':do_map_union_optimization_strategy/3                 1866        210      0.11  [ 0.22]
'Elixir.Module.Types.Descr':pop_optional_static/1                                3167        202      0.06  [ 0.21]
'Elixir.Module.Types.Descr':'-tuple_empty?/1-fun-0-'/1                           3337        201      0.06  [ 0.21]
'Elixir.Module.Types.Descr':'tuple_only?'/1                                      2622        191      0.07  [ 0.20]
'Elixir.Module.Types.Descr':'map_only?'/1                                        2280        184      0.08  [ 0.19]
'Elixir.Module.Types.Descr':'list_empty?'/1                                      1252        181      0.14  [ 0.19]
'Elixir.Module.Types.Descr':list_new/2                                           3239        171      0.05  [ 0.18]
'Elixir.Module.Types.Descr':non_empty_list/2                                     3179        156      0.05  [ 0.16]
'Elixir.Module.Types.Descr':atom_difference/2                                     769        147      0.19  [ 0.15]
'Elixir.Module.Types.Descr':map_fetch_and_put_shared/3                            212        140      0.66  [ 0.15]
'Elixir.Module.Types.Descr':atom_union/2                                          807        132      0.16  [ 0.14]
'Elixir.Module.Types.Descr':difference/3                                         2046        131      0.06  [ 0.14]
'Elixir.Module.Types.Descr':map_literal_intersection_loop/2                      1284        129      0.10  [ 0.14]
'Elixir.Module.Types.Descr':list_tail_unfold/1                                   1668        126      0.08  [ 0.13]
'Elixir.Module.Types.Descr':'tuple_elements_empty?'/5                             561        114      0.20  [ 0.12]
'Elixir.Module.Types.Descr':closed_map/1                                         1460        113      0.08  [ 0.12]
'Elixir.Module.Types.Descr':list_hd_static/1                                     1645        108      0.07  [ 0.11]
'Elixir.Module.Types.Descr':map_union_next_strategy/4                            1762        107      0.06  [ 0.11]
'Elixir.Module.Types.Descr':tag_to_type/1                                        1819        104      0.06  [ 0.11]
'Elixir.Module.Types.Descr':map_take_static/3                                     424        103      0.24  [ 0.11]
'Elixir.Module.Types.Descr':'-list_difference/2-fun-1-'/4                         208         96      0.46  [ 0.10]
'Elixir.Module.Types.Descr':'-map_fetch_static/2-fun-0-'/3                       1303         92      0.07  [ 0.10]
'Elixir.Module.Types.Descr':not_set/0                                            1720         73      0.04  [ 0.08]
'Elixir.Module.Types.Descr':boolean/0                                            1314         70      0.05  [ 0.07]
'Elixir.Module.Types.Descr':difference/2                                          452         61      0.13  [ 0.06]
'Elixir.Module.Types.Descr':'non_empty_list_only?'/1                             1009         61      0.06  [ 0.06]
'Elixir.Module.Types.Descr':'subtype?'/2                                          440         60      0.14  [ 0.06]
'Elixir.Module.Types.Descr':open_map/1                                           1108         59      0.05  [ 0.06]
'Elixir.Module.Types.Descr':'-tuple_difference/2-fun-1-'/5                        272         53      0.19  [ 0.06]
'Elixir.Module.Types.Descr':map_put_static/3                                      424         51      0.12  [ 0.05]
'Elixir.Module.Types.Descr':map_union_optimization_strategy/4                     183         48      0.26  [ 0.05]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-3-'/3                         157         47      0.30  [ 0.05]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-3-'/2                          257         47      0.18  [ 0.05]
'Elixir.Module.Types.Descr':'-list_difference/2-fun-2-'/2                         208         47      0.23  [ 0.05]
'Elixir.Module.Types.Descr':map_pop_key/3                                         157         41      0.26  [ 0.04]
'Elixir.Module.Types.Descr':map_union/2                                           247         41      0.17  [ 0.04]
'Elixir.Module.Types.Descr':'-map_put_static/3-fun-1-'/3                          212         39      0.18  [ 0.04]
'Elixir.Module.Types.Descr':list_tl/1                                             184         35      0.19  [ 0.04]
'Elixir.Module.Types.Descr':map_fetch_and_put/3                                   336         34      0.10  [ 0.04]
'Elixir.Module.Types.Descr':tuple_compatibility/6                                 183         34      0.19  [ 0.04]
'Elixir.Module.Types.Descr':list_union/2                                          457         32      0.07  [ 0.03]
'Elixir.Module.Types.Descr':'-tuple_difference/2-fun-2-'/2                        208         28      0.13  [ 0.03]
'Elixir.Module.Types.Descr':tuple_difference/2                                    180         27      0.15  [ 0.03]
'Elixir.Module.Types.Descr':'-map_literal_intersection/4-fun-0-'/3                523         26      0.05  [ 0.03]
'Elixir.Module.Types.Descr':'fun'/0                                               464         25      0.05  [ 0.03]
'Elixir.Module.Types.Descr':map_difference/2                                      189         24      0.13  [ 0.03]
'Elixir.Module.Types.Descr':list_difference/2                                     208         23      0.11  [ 0.02]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-2-'/5                          246         23      0.09  [ 0.02]
'Elixir.Module.Types.Descr':'-map_fetch_and_put_shared/3-fun-0-'/3                424         22      0.05  [ 0.02]
'Elixir.Module.Types.Descr':'only_gradual?'/1                                     381         18      0.05  [ 0.02]
'Elixir.Module.Types.Descr':list_tl_static/1                                      184         16      0.09  [ 0.02]
'Elixir.Module.Types.Descr':'-iterator_non_disjoint_intersection?/2-fun-0-'/3     487         16      0.03  [ 0.02]
'Elixir.Module.Types.Descr':'-map_union/2-fun-0-'/3                                83         15      0.18  [ 0.02]
'Elixir.Module.Types.Descr':'number_type?'/1                                      256         13      0.05  [ 0.01]
'Elixir.Module.Types.Descr':open_tuple/1                                           24         11      0.46  [ 0.01]
'Elixir.Module.Types.Descr':open_map/0                                            194          9      0.05  [ 0.01]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-1-'/1                         157          8      0.05  [ 0.01]
'Elixir.Module.Types.Descr':term_or_optional/0                                     99          6      0.06  [ 0.01]
'Elixir.Module.Types.Descr':'trivial_subtype?'/2                                   35          6      0.17  [ 0.01]
'Elixir.Module.Types.Descr':fun_fetch/2                                            40          5      0.13  [ 0.01]
'Elixir.Module.Types.Descr':'fun_only?'/1                                          40          5      0.13  [ 0.01]
'Elixir.Module.Types.Descr':truthness/1                                            43          4      0.09  [ 0.00]
'Elixir.Module.Types.Descr':float/0                                                20          3      0.15  [ 0.00]
'Elixir.Module.Types.Descr':list/2                                                 60          3      0.05  [ 0.00]
'Elixir.Module.Types.Descr':open_tuple/2                                           24          3      0.13  [ 0.00]
'Elixir.Module.Types.Descr':'-list_tl_static/1-fun-0-'/2                           78          3      0.04  [ 0.00]
'Elixir.Module.Types.Descr':list/1                                                 60          2      0.03  [ 0.00]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-0-'/0                          45          1      0.02  [ 0.00]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-4-'/0                          10          1      0.10  [ 0.00]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-0-'/3                           11          1      0.09  [ 0.00]
'Elixir.Module.Types.Descr':'-do_map_union_optimization_strategy/3-fun-0-'/1        7          1      0.14  [ 0.00]
'Elixir.Module.Types.Descr':negation/1                                              9          0      0.00  [ 0.00]
                                                                                           95193            [100.0]


  defp trivial_subtype?(%{} = left, %{} = right)
       when map_size(left) == 1 and map_size(right) == 1 do
    case {left, right} do
Contributor Author

@sabiwara sabiwara Jan 23, 2025

I went for an even simpler version with just shallow comparisons (except for the structural comparison on top)

Member

Suggestion: let's go with regular subtyping? We apply this in very few cases, and once we start having structs nested inside structs, this will make a difference.

In other words, let's go with the "slow" but more general and less code version and we optimize it again in the future? Especially to avoid getting the logic wrong here... :)

Contributor Author

This makes total sense, especially since
a) we'll bail at the first non-match
b) we avoid building lists and make a bunch of stuff cheaper downstream, so the extra precision might pay off
c) it's easier to optimize based on actually slow projects, so deferring this one seems like it's a good idea

And perhaps we'll be able to make subtype? cheaper in the future and bail early in some cases or something, in which case the whole typing would benefit.

Comment on lines 1288 to 1305
         strategy when strategy != nil <- map_union_optimization_strategy(tag1, pos1, tag2, pos2) do
      case strategy do
        :all_equal ->
          dnf1

        :any_map ->
          [{:open, %{}, []}]

        {:one_key_difference, key, v1, v2} ->
          new_pos = Map.put(pos1, key, union(v1, v2))
          [{tag1, new_pos, []}]

        :left_subtype_of_right ->
          dnf2

        :right_subtype_of_left ->
          dnf1
      end
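
To see why each strategy collapses two clauses into one, here is a minimal, self-contained sketch of the same dispatch. It is not the PR's code: the module name, the `MapSet`-of-tags model for field types, and the `nil` fallback that simply keeps both clauses are all assumptions made for illustration.

```elixir
defmodule ApplyStrategySketch do
  # Each DNF here is a single clause: [{tag, fields, negations}].
  # Field types are MapSets of tags (toy model, not the compiler's descr).
  def fuse(strategy, [{tag1, pos1, []}] = dnf1, [{_tag2, _pos2, []}] = dnf2) do
    case strategy do
      # Both sides are identical: either one is the union.
      :all_equal ->
        dnf1

      # One side is the empty open map, which already contains every map.
      :any_map ->
        [{:open, %{}, []}]

      # Exactly one key differs: widen that key and keep a single clause.
      {:one_key_difference, key, v1, v2} ->
        [{tag1, Map.put(pos1, key, MapSet.union(v1, v2)), []}]

      # A subtype adds nothing to the union: keep the supertype only.
      :left_subtype_of_right ->
        dnf2

      :right_subtype_of_left ->
        dnf1

      # Assumed fallback for this sketch: no shortcut, keep both clauses.
      nil ->
        dnf1 ++ dnf2
    end
  end
end
```

Widening is only exact when a single key differs. With two differing keys, `%{a: integer, b: integer}` united with `%{a: float, b: float}` is not `%{a: integer or float, b: integer or float}`, because the latter also admits `%{a: integer, b: float}`. That is why a second difference forces a switch to a subtype strategy or a bail.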
Member

I believe we can encapsulate this and use it on map_normalize too?

Contributor Author

Oh, you mean instead of the map_non_negated_fuse / fusible_maps / map_non_negated_fuse_pair part right?
Seems it's doing very similar stuff, let's see!

Member

Yes, but I can also be the one giving that a try. We can merge this once the comments above are addressed, then I can work on the map_normalize and you work on tuples? :)

Contributor Author

I think I got it actually! Will push in a sec.

Contributor Author

ec96574

I didn't spend time on renaming but I think we can probably use the term fuse / fusion etc more for consistency.
Feel free to rename as you see fit!

@josevalim
Member

Beautifully written and well done! 😍

@josevalim josevalim requested a review from gldubc January 23, 2025 09:51
lib/elixir/test/elixir/module/types/descr_test.exs (outdated, resolved)
    |> :maps.next()
    |> do_map_union_optimization_strategy(pos2, :all_equal)
  end

Member

This does not handle cases where the open map on the left has some extra fields that are set to if_set(term()), and the right map is closed.

Example:

assert union(
        open_map(a: if_set(term())),
        closed_map([])
      ) == open_map(a: if_set(term()))

Similarly, what if we have a larger (in size) open map as pos1, but which is a supertype of pos2? Then the only strategy tried will be the one at l.1340, which leads to :left_subtype_of_right.

Example:

 assert union(
        open_map(a: if_set(term()), b: number()),
        open_map(b: integer())
      ) == open_map(a: if_set(term()), b: number())

I don't think those are cases that necessarily need to be covered, but adding those tests to highlight it would prevent us from rediscovering this later.

Contributor Author

@sabiwara sabiwara Jan 23, 2025

Oh I was totally forgetting about if_set.

> I don't think those are cases that necessarily need to be covered

Yeah, this optimization is meant to handle some "obvious" cases that happen frequently, so it might be OK not to catch every case (we're not dealing with negs either).

But in this case it might be possible to implement in the current pass with something like:

  • if one key is only on the side of the supertype and its value is if_set, continue inferring this supertype relation
  • if we can switch to the supertype strategy, do it
  • otherwise bail
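
The first rule in that list can be sketched as follows, for the narrow case of an open map on the supertype side and a closed map on the other. Everything here is hypothetical: the module name, the `{:required, type}` / `{:if_set, type}` field encoding, and the use of `MapSet`s of tags as types are assumptions for illustration, not the representation used in `Module.Types.Descr`.

```elixir
defmodule IfSetSketch do
  # Hypothetical toy encoding: a field is {:required, type} or
  # {:if_set, type}, where type is a MapSet of tags.
  # Returns true when the open map `open` is a supertype of the
  # closed map `closed`. Enum.all?/2 stops at the first failing key.
  def open_covers_closed?(open, closed) do
    Enum.all?(open, fn
      {key, {:if_set, t1}} ->
        case closed do
          %{^key => {_presence, t2}} -> MapSet.subset?(t2, t1)
          # Key absent from a closed map means "not set", which if_set allows,
          # so we keep inferring the supertype relation.
          _ -> true
        end

      {key, {:required, t1}} ->
        case closed do
          %{^key => {:required, t2}} -> MapSet.subset?(t2, t1)
          # A required key that is missing or only optional: bail.
          _ -> false
        end
    end)
  end
end
```

Keys that exist only in the closed map need no check in this sketch, since an open map accepts any additional keys.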

The map size issue is a real problem though... Perhaps by changing the internal representation to store if_set as part of a different map, we can easily compute the size of the required map, and have a separate pass for optional keys?
This might be overkill for this particular use case, but if this new representation can simplify a bunch of other places such as subtyping etc it might be worth considering.

Member

> The map size issue is a real problem though...

I think it is fine because our goal is to traverse the smallest map for performance. The full algorithm does require traversing both sides but the point here is precisely to not implement the full algorithm. :)
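
The trade-off can be illustrated with a toy check that walks only the smaller map. This is a sketch under assumptions, not the PR's implementation: the module and function names are hypothetical, all fields are treated as required, and field types are `MapSet`s of tags.

```elixir
defmodule SmallestMapSketch do
  # `open` is an open map, `other` may be open or closed.
  # Returns true when `other` is provably a subtype of `open`
  # by visiting only the keys of `open`, the smaller map.
  def open_covers?(open, other) when map_size(open) <= map_size(other) do
    Enum.all?(open, fn {key, t1} ->
      case other do
        %{^key => t2} -> MapSet.subset?(t2, t1)
        _ -> false
      end
    end)
  end

  # A larger open map is never traversed: the shortcut simply gives up,
  # even if a full subtyping check would have succeeded.
  def open_covers?(_open, _other), do: false
end
```

Extra keys in `other` cost nothing because an open map accepts them. The price is the second clause: a larger open supertype, such as one carrying additional `if_set` keys, is not recognized, which is the gap the tests above document.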

Contributor Author

Sounds good!

I was thinking more about whether we might hit bottlenecks in the future due to if_set, and whether there is a way to optimize them in other parts too. But it's best to avoid speculation, wait for real-world slow/pathological cases to show up, and iterate then.

@sabiwara
Contributor Author

@gldubc thanks a lot for the review 💜

@josevalim will merge this and open one for tuples this weekend.

@sabiwara sabiwara merged commit 85d2e16 into elixir-lang:main Jan 24, 2025
9 checks passed
@sabiwara sabiwara deleted the opti-union2 branch January 24, 2025 08:12
Development

Successfully merging this pull request may close these issues.

Compiler hang on bamboo dependency
3 participants