Optimize map unions to avoid building long lists #14215

sabiwara · 2025-01-23T03:04:55Z

Only implemented for maps for now; once approved, I'm happy to port it to tuples.

We make only one pass over the keys and can switch strategy midway (including bailing out as soon as a key is missing or an expectation is not met).
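To make the single-pass idea concrete, here is a minimal sketch under simplifying assumptions: field types are modelled as MapSets of base-type atoms (so "subtype" is just `MapSet.subset?/2`), both maps are assumed to carry the same tag, and the module and function names are hypothetical. It is not the code in descr.ex, only an illustration of switching strategy mid-traversal and bailing with `nil`.

```elixir
defmodule UnionStrategySketch do
  # Toy model (assumption): a field type is a MapSet of base-type atoms.
  # Walks pos1 once and returns a strategy, or nil when no shortcut applies.
  def strategy(pos1, pos2) when map_size(pos1) == map_size(pos2) do
    pos1 |> :maps.iterator() |> :maps.next() |> walk(pos2, :all_equal)
  end

  def strategy(_pos1, _pos2), do: nil

  defp walk(:none, _pos2, status), do: status

  defp walk({key, v1, iterator}, pos2, status) do
    # Bail (nil) as soon as a key is missing or the current strategy breaks.
    with %{^key => v2} <- pos2,
         next when next != nil <- next_status(status, key, v1, v2) do
      walk(:maps.next(iterator), pos2, next)
    else
      _ -> nil
    end
  end

  # Equal values never change the current strategy.
  defp next_status(status, _key, v, v), do: status

  # First differing key: remember it, the union may only need this one merge.
  defp next_status(:all_equal, key, v1, v2), do: {:one_key_difference, key, v1, v2}

  # Second differing key: switch to a subtype strategy if both differences agree.
  defp next_status({:one_key_difference, _key, d1, d2}, _key2, v1, v2) do
    cond do
      MapSet.subset?(d1, d2) and MapSet.subset?(v1, v2) -> :left_subtype_of_right
      MapSet.subset?(d2, d1) and MapSet.subset?(v2, v1) -> :right_subtype_of_left
      true -> nil
    end
  end

  defp next_status(:left_subtype_of_right, _key, v1, v2) do
    if MapSet.subset?(v1, v2), do: :left_subtype_of_right
  end

  defp next_status(:right_subtype_of_left, _key, v1, v2) do
    if MapSet.subset?(v2, v1), do: :right_subtype_of_left
  end
end
```

In this toy version the traversal stops at the first key that is missing or breaks the current strategy, so the cost is bounded by the number of keys in the first map.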

It also fixes the bamboo example.

Perf-wise, it seems to spend very little time in the new code:

Profiling details

:tprof.profile(Mix, :install, [[:bamboo], [force: true]], %{type: :call_time, pattern: [{Module.Types.Descr,:_, :_}], report: {:total, {:measurement, :descending}}})

Generated bamboo app
FUNCTION                                                                        CALLS  TIME (us)  PER CALL  [    %]
'Elixir.Module.Types.Descr':'empty?'/1                                         133937      10362      0.08  [10.89]
'Elixir.Module.Types.Descr':iterator_intersection/4                             50587       7093      0.14  [ 7.45]
'Elixir.Module.Types.Descr':'non_empty_intersection!'/2                         23751       4280      0.18  [ 4.50]
'Elixir.Module.Types.Descr':intersection/2                                      45968       4043      0.09  [ 4.25]
'Elixir.Module.Types.Descr':'gradual?'/1                                        54133       3706      0.07  [ 3.89]
'Elixir.Module.Types.Descr':term/0                                              54112       3028      0.06  [ 3.18]
'Elixir.Module.Types.Descr':'map_empty?'/3                                       4815       2609      0.54  [ 2.74]
'Elixir.Module.Types.Descr':iterator_merge/3                                    23944       2247      0.09  [ 2.36]
'Elixir.Module.Types.Descr':'disjoint?'/2                                       19417       2235      0.12  [ 2.35]
'Elixir.Module.Types.Descr':union/2                                             19824       2216      0.11  [ 2.33]
'Elixir.Module.Types.Descr':atom/1                                              17552       2116      0.12  [ 2.22]
'Elixir.Module.Types.Descr':atom_new/1                                          17552       1935      0.11  [ 2.03]
'Elixir.Module.Types.Descr':symmetrical_intersection/3                          13292       1909      0.14  [ 2.01]
'Elixir.Module.Types.Descr':'compatible?'/2                                      9400       1877      0.20  [ 1.97]
'Elixir.Module.Types.Descr':'-map_empty?/3-fun-0-'/1                            33105       1866      0.06  [ 1.96]
'Elixir.Module.Types.Descr':intersection/3                                      22489       1800      0.08  [ 1.89]
'Elixir.Module.Types.Descr':tuple_fetch/2                                       10462       1691      0.16  [ 1.78]
'Elixir.Module.Types.Descr':compatible_intersection/2                            3983       1643      0.41  [ 1.73]
'Elixir.Module.Types.Descr':atom_fetch/1                                         5452       1434      0.26  [ 1.51]
'Elixir.Module.Types.Descr':map_descr_pairs/3                                   19865       1357      0.07  [ 1.43]
'Elixir.Module.Types.Descr':map_descr/2                                          2568       1284      0.50  [ 1.35]
'Elixir.Module.Types.Descr':'-map_literal_intersection/4-fun-1-'/3              18962       1255      0.07  [ 1.32]
'Elixir.Module.Types.Descr':'zip_non_empty_intersection!'/3                      6882       1233      0.18  [ 1.30]
'Elixir.Module.Types.Descr':tuple_descr/3                                       17812       1227      0.07  [ 1.29]
'Elixir.Module.Types.Descr':dynamic/0                                           18284       1150      0.06  [ 1.21]
'Elixir.Module.Types.Descr':map_fetch/2                                          3249       1150      0.35  [ 1.21]
'Elixir.Module.Types.Descr':symmetrical_merge/3                                 12873        996      0.08  [ 1.05]
'Elixir.Module.Types.Descr':tuple_descr/2                                        5218        942      0.18  [ 0.99]
'Elixir.Module.Types.Descr':none/0                                              12028        854      0.07  [ 0.90]
'Elixir.Module.Types.Descr':union/3                                             10002        816      0.08  [ 0.86]
'Elixir.Module.Types.Descr':iterator_difference_static/2                         5576        783      0.14  [ 0.82]
'Elixir.Module.Types.Descr':'iterator_non_disjoint_intersection?'/2              4830        782      0.16  [ 0.82]
'Elixir.Module.Types.Descr':list_descr/3                                         3239        751      0.23  [ 0.79]
'Elixir.Module.Types.Descr':atom_intersection/2                                  4536        732      0.16  [ 0.77]
'Elixir.Module.Types.Descr':'map_empty?'/1                                       4814        707      0.15  [ 0.74]
'Elixir.Module.Types.Descr':'-map_intersection/2-fun-0-'/5                       4233        694      0.16  [ 0.73]
'Elixir.Module.Types.Descr':map_fetch_static/2                                   4136        675      0.16  [ 0.71]
'Elixir.Module.Types.Descr':'tuple_empty?'/3                                     3337        659      0.20  [ 0.69]
'Elixir.Module.Types.Descr':'-list_intersection/2-fun-0-'/5                      2006        622      0.31  [ 0.65]
'Elixir.Module.Types.Descr':list_pop_dynamic/1                                   6478        616      0.10  [ 0.65]
'Elixir.Module.Types.Descr':'-list_empty?/1-fun-1-'/1                            1252        600      0.48  [ 0.63]
'Elixir.Module.Types.Descr':tuple_fetch_static/2                                 5244        596      0.11  [ 0.63]
'Elixir.Module.Types.Descr':'-map_intersection/2-fun-1-'/3                       4206        591      0.14  [ 0.62]
'Elixir.Module.Types.Descr':dynamic/1                                            8683        587      0.07  [ 0.62]
'Elixir.Module.Types.Descr':map_intersection/2                                   4168        587      0.14  [ 0.62]
'Elixir.Module.Types.Descr':tuple_literal_intersection/4                         3831        558      0.15  [ 0.59]
'Elixir.Module.Types.Descr':'-tuple_intersection/2-fun-0-'/5                     3559        540      0.15  [ 0.57]
'Elixir.Module.Types.Descr':map_literal_intersection/4                           4233        498      0.12  [ 0.52]
'Elixir.Module.Types.Descr':list_hd/1                                            3063        490      0.16  [ 0.51]
'Elixir.Module.Types.Descr':'tuple_empty?'/1                                     3317        490      0.15  [ 0.51]
'Elixir.Module.Types.Descr':'atom_only?'/1                                       5452        478      0.09  [ 0.50]
'Elixir.Module.Types.Descr':'subtype_static?'/2                                  3397        462      0.14  [ 0.49]
'Elixir.Module.Types.Descr':'non_disjoint_intersection?'/2                       4343        427      0.10  [ 0.45]
'Elixir.Module.Types.Descr':'-tuple_intersection/2-fun-1-'/3                     2861        425      0.15  [ 0.45]
'Elixir.Module.Types.Descr':'descr_key?'/2                                       5015        417      0.08  [ 0.44]
'Elixir.Module.Types.Descr':tuple_get/2                                          1707        413      0.24  [ 0.43]
'Elixir.Module.Types.Descr':tuple_intersection/2                                 2472        373      0.15  [ 0.39]
'Elixir.Module.Types.Descr':empty_list/0                                         5154        364      0.07  [ 0.38]
'Elixir.Module.Types.Descr':tuple_new/2                                          5218        339      0.06  [ 0.36]
'Elixir.Module.Types.Descr':atom/0                                               5172        323      0.06  [ 0.34]
'Elixir.Module.Types.Descr':list_intersection/2                                  1992        316      0.16  [ 0.33]
'Elixir.Module.Types.Descr':dynamic_union/2                                      4780        306      0.06  [ 0.32]
'Elixir.Module.Types.Descr':unfolded_term/0                                      4656        305      0.07  [ 0.32]
'Elixir.Module.Types.Descr':'-list_intersection/2-fun-1-'/3                      1996        296      0.15  [ 0.31]
'Elixir.Module.Types.Descr':'-map_empty?/1-fun-0-'/1                             4814        295      0.06  [ 0.31]
'Elixir.Module.Types.Descr':tuple/1                                              5194        294      0.06  [ 0.31]
'Elixir.Module.Types.Descr':'-tuple_get/2-fun-0-'/3                              1808        294      0.16  [ 0.31]
'Elixir.Module.Types.Descr':map_new/2                                            2725        293      0.11  [ 0.31]
'Elixir.Module.Types.Descr':tuple_union/2                                        1148        270      0.24  [ 0.28]
'Elixir.Module.Types.Descr':difference_static/2                                  2868        249      0.09  [ 0.26]
'Elixir.Module.Types.Descr':integer/0                                            3840        240      0.06  [ 0.25]
'Elixir.Module.Types.Descr':'empty_key?'/2                                       3829        233      0.06  [ 0.24]
'Elixir.Module.Types.Descr':dynamic_intersection/2                               3398        229      0.07  [ 0.24]
'Elixir.Module.Types.Descr':binary/0                                             3874        221      0.06  [ 0.23]
'Elixir.Module.Types.Descr':do_map_union_optimization_strategy/3                 1866        210      0.11  [ 0.22]
'Elixir.Module.Types.Descr':pop_optional_static/1                                3167        202      0.06  [ 0.21]
'Elixir.Module.Types.Descr':'-tuple_empty?/1-fun-0-'/1                           3337        201      0.06  [ 0.21]
'Elixir.Module.Types.Descr':'tuple_only?'/1                                      2622        191      0.07  [ 0.20]
'Elixir.Module.Types.Descr':'map_only?'/1                                        2280        184      0.08  [ 0.19]
'Elixir.Module.Types.Descr':'list_empty?'/1                                      1252        181      0.14  [ 0.19]
'Elixir.Module.Types.Descr':list_new/2                                           3239        171      0.05  [ 0.18]
'Elixir.Module.Types.Descr':non_empty_list/2                                     3179        156      0.05  [ 0.16]
'Elixir.Module.Types.Descr':atom_difference/2                                     769        147      0.19  [ 0.15]
'Elixir.Module.Types.Descr':map_fetch_and_put_shared/3                            212        140      0.66  [ 0.15]
'Elixir.Module.Types.Descr':atom_union/2                                          807        132      0.16  [ 0.14]
'Elixir.Module.Types.Descr':difference/3                                         2046        131      0.06  [ 0.14]
'Elixir.Module.Types.Descr':map_literal_intersection_loop/2                      1284        129      0.10  [ 0.14]
'Elixir.Module.Types.Descr':list_tail_unfold/1                                   1668        126      0.08  [ 0.13]
'Elixir.Module.Types.Descr':'tuple_elements_empty?'/5                             561        114      0.20  [ 0.12]
'Elixir.Module.Types.Descr':closed_map/1                                         1460        113      0.08  [ 0.12]
'Elixir.Module.Types.Descr':list_hd_static/1                                     1645        108      0.07  [ 0.11]
'Elixir.Module.Types.Descr':map_union_next_strategy/4                            1762        107      0.06  [ 0.11]
'Elixir.Module.Types.Descr':tag_to_type/1                                        1819        104      0.06  [ 0.11]
'Elixir.Module.Types.Descr':map_take_static/3                                     424        103      0.24  [ 0.11]
'Elixir.Module.Types.Descr':'-list_difference/2-fun-1-'/4                         208         96      0.46  [ 0.10]
'Elixir.Module.Types.Descr':'-map_fetch_static/2-fun-0-'/3                       1303         92      0.07  [ 0.10]
'Elixir.Module.Types.Descr':not_set/0                                            1720         73      0.04  [ 0.08]
'Elixir.Module.Types.Descr':boolean/0                                            1314         70      0.05  [ 0.07]
'Elixir.Module.Types.Descr':difference/2                                          452         61      0.13  [ 0.06]
'Elixir.Module.Types.Descr':'non_empty_list_only?'/1                             1009         61      0.06  [ 0.06]
'Elixir.Module.Types.Descr':'subtype?'/2                                          440         60      0.14  [ 0.06]
'Elixir.Module.Types.Descr':open_map/1                                           1108         59      0.05  [ 0.06]
'Elixir.Module.Types.Descr':'-tuple_difference/2-fun-1-'/5                        272         53      0.19  [ 0.06]
'Elixir.Module.Types.Descr':map_put_static/3                                      424         51      0.12  [ 0.05]
'Elixir.Module.Types.Descr':map_union_optimization_strategy/4                     183         48      0.26  [ 0.05]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-3-'/3                         157         47      0.30  [ 0.05]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-3-'/2                          257         47      0.18  [ 0.05]
'Elixir.Module.Types.Descr':'-list_difference/2-fun-2-'/2                         208         47      0.23  [ 0.05]
'Elixir.Module.Types.Descr':map_pop_key/3                                         157         41      0.26  [ 0.04]
'Elixir.Module.Types.Descr':map_union/2                                           247         41      0.17  [ 0.04]
'Elixir.Module.Types.Descr':'-map_put_static/3-fun-1-'/3                          212         39      0.18  [ 0.04]
'Elixir.Module.Types.Descr':list_tl/1                                             184         35      0.19  [ 0.04]
'Elixir.Module.Types.Descr':map_fetch_and_put/3                                   336         34      0.10  [ 0.04]
'Elixir.Module.Types.Descr':tuple_compatibility/6                                 183         34      0.19  [ 0.04]
'Elixir.Module.Types.Descr':list_union/2                                          457         32      0.07  [ 0.03]
'Elixir.Module.Types.Descr':'-tuple_difference/2-fun-2-'/2                        208         28      0.13  [ 0.03]
'Elixir.Module.Types.Descr':tuple_difference/2                                    180         27      0.15  [ 0.03]
'Elixir.Module.Types.Descr':'-map_literal_intersection/4-fun-0-'/3                523         26      0.05  [ 0.03]
'Elixir.Module.Types.Descr':'fun'/0                                               464         25      0.05  [ 0.03]
'Elixir.Module.Types.Descr':map_difference/2                                      189         24      0.13  [ 0.03]
'Elixir.Module.Types.Descr':list_difference/2                                     208         23      0.11  [ 0.02]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-2-'/5                          246         23      0.09  [ 0.02]
'Elixir.Module.Types.Descr':'-map_fetch_and_put_shared/3-fun-0-'/3                424         22      0.05  [ 0.02]
'Elixir.Module.Types.Descr':'only_gradual?'/1                                     381         18      0.05  [ 0.02]
'Elixir.Module.Types.Descr':list_tl_static/1                                      184         16      0.09  [ 0.02]
'Elixir.Module.Types.Descr':'-iterator_non_disjoint_intersection?/2-fun-0-'/3     487         16      0.03  [ 0.02]
'Elixir.Module.Types.Descr':'-map_union/2-fun-0-'/3                                83         15      0.18  [ 0.02]
'Elixir.Module.Types.Descr':'number_type?'/1                                      256         13      0.05  [ 0.01]
'Elixir.Module.Types.Descr':open_tuple/1                                           24         11      0.46  [ 0.01]
'Elixir.Module.Types.Descr':open_map/0                                            194          9      0.05  [ 0.01]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-1-'/1                         157          8      0.05  [ 0.01]
'Elixir.Module.Types.Descr':term_or_optional/0                                     99          6      0.06  [ 0.01]
'Elixir.Module.Types.Descr':'trivial_subtype?'/2                                   35          6      0.17  [ 0.01]
'Elixir.Module.Types.Descr':fun_fetch/2                                            40          5      0.13  [ 0.01]
'Elixir.Module.Types.Descr':'fun_only?'/1                                          40          5      0.13  [ 0.01]
'Elixir.Module.Types.Descr':truthness/1                                            43          4      0.09  [ 0.00]
'Elixir.Module.Types.Descr':float/0                                                20          3      0.15  [ 0.00]
'Elixir.Module.Types.Descr':list/2                                                 60          3      0.05  [ 0.00]
'Elixir.Module.Types.Descr':open_tuple/2                                           24          3      0.13  [ 0.00]
'Elixir.Module.Types.Descr':'-list_tl_static/1-fun-0-'/2                           78          3      0.04  [ 0.00]
'Elixir.Module.Types.Descr':list/1                                                 60          2      0.03  [ 0.00]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-0-'/0                          45          1      0.02  [ 0.00]
'Elixir.Module.Types.Descr':'-map_take_static/3-fun-4-'/0                          10          1      0.10  [ 0.00]
'Elixir.Module.Types.Descr':'-map_difference/2-fun-0-'/3                           11          1      0.09  [ 0.00]
'Elixir.Module.Types.Descr':'-do_map_union_optimization_strategy/3-fun-0-'/1        7          1      0.14  [ 0.00]
'Elixir.Module.Types.Descr':negation/1                                              9          0      0.00  [ 0.00]
                                                                                           95193            [100.0]

sabiwara · 2025-01-23T03:33:21Z

lib/elixir/lib/module/types/descr.ex

+
+  defp trivial_subtype?(%{} = left, %{} = right)
+       when map_size(left) == 1 and map_size(right) == 1 do
+    case {left, right} do


I went for an even simpler version with just shallow comparisons (except for the structural comparison on top)

Suggestion: let's go with the regular subtyping? We apply this in very few cases and once we start having structs nested inside structs, this will make a difference?

In other words, let's go with the "slow" but more general version that needs less code, and optimize it again in the future? Especially to avoid getting the logic wrong here... :)

This makes total sense, especially since
a) we'll bail at the first non-match
b) we avoid building lists and make a bunch of stuff cheaper downstream, so the extra precision might pay off
c) it's easier to optimize based on actually slow projects, so deferring this one seems like it's a good idea

And perhaps we'll be able to make subtype? cheaper in the future and bail early in some cases or something, in which case the whole typing would benefit.

josevalim · 2025-01-23T08:28:34Z

lib/elixir/lib/module/types/descr.ex

+         strategy when strategy != nil <- map_union_optimization_strategy(tag1, pos1, tag2, pos2) do
+      case strategy do
+        :all_equal ->
+          dnf1
+
+        :any_map ->
+          [{:open, %{}, []}]
+
+        {:one_key_difference, key, v1, v2} ->
+          new_pos = Map.put(pos1, key, union(v1, v2))
+          [{tag1, new_pos, []}]
+
+        :left_subtype_of_right ->
+          dnf2
+
+        :right_subtype_of_left ->
+          dnf1
+      end


I believe we can encapsulate this and use it on map_normalize too?

Oh, you mean instead of the map_non_negated_fuse / fusible_maps / map_non_negated_fuse_pair part, right?
Seems it's doing very similar stuff, let's see!

Yes, but I can also be the one giving that a try. We can merge this once the comments above are addressed, then I can work on the map_normalize and you work on tuples? :)

I think I got it actually! Will push in a sec.

ec96574

I didn't spend time on renaming, but I think we can probably use the terms fuse / fusion etc. more, for consistency.
Feel free to rename as you see fit!
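As a rough illustration of the reuse discussed in this thread, a pairwise "fuse" helper that returns either a merged entry or `nil` can drive both a two-map union and a pass over a whole list of maps. The sketch below is hypothetical: `FuseSketch`, `fuse_all/2` and `one_key_fuse` are made-up names, field types are modelled as MapSets, and it does not claim to match how map_normalize is implemented.

```elixir
defmodule FuseSketch do
  # Folds a list of entries, merging each new entry into the first existing
  # one it can be fused with. `fuse` returns the fused entry, or nil when the
  # pair cannot be represented as a single entry.
  # Note: this is a single pass and does not re-check already fused entries.
  def fuse_all(entries, fuse) do
    Enum.reduce(entries, [], fn entry, acc -> insert(acc, entry, fuse) end)
  end

  defp insert([], entry, _fuse), do: [entry]

  defp insert([head | tail], entry, fuse) do
    case fuse.(head, entry) do
      nil -> [head | insert(tail, entry, fuse)]
      fused -> [fused | tail]
    end
  end
end

# Toy fuse function (assumption): same keys and at most one differing field.
one_key_fuse = fn m1, m2 ->
  same_keys? = MapSet.new(Map.keys(m1)) == MapSet.new(Map.keys(m2))
  differing = for {k, v} <- m1, Map.get(m2, k) != v, do: k

  case {same_keys?, differing} do
    {true, []} -> m1
    {true, [k]} -> Map.put(m1, k, MapSet.union(m1[k], m2[k]))
    _ -> nil
  end
end
```

Passing the fuse function as an argument keeps the traversal independent of the strategy logic, which is one way the same code path could serve both call sites.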

josevalim · 2025-01-23T08:28:52Z

Beautifully written and well done! 😍

lib/elixir/test/elixir/module/types/descr_test.exs

gldubc · 2025-01-23T14:37:09Z

lib/elixir/lib/module/types/descr.ex

+    |> :maps.next()
+    |> do_map_union_optimization_strategy(pos2, :all_equal)
+  end
+


This does not handle cases where the open map on the left has some extra fields that are set to if_set(term()), and the right map is closed.

Example:

assert union( open_map(a: if_set(term())), closed_map([]) ) == open_map(a: if_set(term()))

Similarly, what if we have a larger (in size) open map as pos1, but which is a supertype of pos2? Then the only tried strategy will be l.1340 which leads to :left_subtype_of_right.

Example:

assert union( open_map(a: if_set(term()), b: number()), open_map(b: integer()) ) == open_map(a: if_set(term()), b: number())

I don't think those are cases that necessarily need to be covered, but adding tests that highlight them would prevent us from rediscovering this later.

Oh I was totally forgetting about if_set.

I don't think those are cases that necessarily need to be covered

Yeah, this is an optimization meant to deal with some "obvious" cases that happen frequently, so it might be OK not to catch all cases (we're not dealing with negs either).

But in this case it might be possible to implement in the current pass with something like:

if one key is only on the side of the supertype and its value is if_set, continue inferring this supertype relation

if we can switch to the supertype strategy, do it

otherwise bail
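The rule sketched above could look roughly like the following, under loud assumptions: this is a toy model where a field type is a MapSet of base-type atoms, `:not_set` in the set stands in for if_set, and `OptionalKeySketch` with its functions is hypothetical. It is not what was merged in this PR; it only shows how a key present on the supertype side alone can still be accepted when its type admits "not set".

```elixir
defmodule OptionalKeySketch do
  # Toy universe of base types (assumption, not the descr.ex representation).
  @term MapSet.new([:integer, :float, :binary, :atom])

  def term, do: @term

  # if_set(t): the key may hold a value of type t or be absent.
  def if_set(type), do: MapSet.put(type, :not_set)

  # What a map says about a key it does not list: a closed map forbids it,
  # an open map allows anything, including absence.
  defp default(:closed), do: MapSet.new([:not_set])
  defp default(:open), do: if_set(@term)

  # True when the open map `left_pos` accepts every value described by
  # {right_tag, right_pos}. Enum.all?/2 stops at the first failing key.
  def open_supertype?(left_pos, right_tag, right_pos) do
    Enum.all?(left_pos, fn {key, left_type} ->
      right_type = Map.get(right_pos, key, default(right_tag))
      MapSet.subset?(right_type, left_type)
    end)
  end
end
```

With this model, both examples from the review comment come out as supertype relations, while an open map that requires a key is correctly not a supertype of the empty closed map. Note that it iterates over the supertype's keys, which may be the larger map.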

The map size issue is a real problem though... Perhaps by changing the internal representation to store if_set as part of a different map, we can easily compute the size of the required map, and have a separate pass for optional keys?
This might be overkill for this particular use case, but if this new representation can simplify a bunch of other places such as subtyping etc it might be worth considering.

The map size issue is a real problem though...

I think it is fine because our goal is to traverse the smallest map for performance. The full algorithm does require traversing both sides but the point here is precisely to not implement the full algorithm. :)

Sounds good!

I was thinking more about whether we might get bottlenecks in the future due to if_set, and whether there was a way to optimize them in other parts too, but it's best to avoid speculation, wait for real-world slow/pathological cases to show up, and iterate then.

Co-authored-by: Guillaume Duboc <[email protected]>

sabiwara · 2025-01-24T08:12:00Z

@gldubc thanks a lot for the review 💜

@josevalim will merge this and open one for tuples this weekend.

sabiwara mentioned this pull request Jan 23, 2025

Optimization attempt for union of maps and tuples #14205

Closed

Optimize map unions to avoid building long lists

e3fb8ed

sabiwara force-pushed the opti-union2 branch from a0d98c7 to e3fb8ed Compare January 23, 2025 03:31

sabiwara commented Jan 23, 2025

josevalim reviewed Jan 23, 2025

sabiwara added 3 commits January 23, 2025 17:58

Remove trivial subtype optimization for now

5255a75

Extract for reuse

9d74860

Reuse fuse logic

ec96574

josevalim requested a review from gldubc January 23, 2025 09:51

gldubc approved these changes Jan 23, 2025

sabiwara and others added 2 commits January 24, 2025 08:33

Update test

55ae816

Co-authored-by: Guillaume Duboc <[email protected]>

Uncomment test now that we remove shallow subtyping

6e489d4

josevalim approved these changes Jan 24, 2025

sabiwara merged commit 85d2e16 into elixir-lang:main Jan 24, 2025
9 checks passed

sabiwara deleted the opti-union2 branch January 24, 2025 08:12

sabiwara mentioned this pull request Jan 26, 2025

Implement optimization of tuple unions #14228

Merged
