Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed CSE opportunity #3181

Open
bclement-ocp opened this issue Oct 24, 2024 · 1 comment · May be fixed by #3182
Open

Missed CSE opportunity #3181

bclement-ocp opened this issue Oct 24, 2024 · 1 comment · May be fixed by #3182
Labels
flambda2 Prerequisite for, or part of, flambda2

Comments

@bclement-ocp
Copy link
Contributor

bclement-ocp commented Oct 24, 2024

On current main (ab2272b) with the following code:

let always_true x =
  if match x with None -> true | Some _ -> false then
    match x with None -> true | Some _ -> false
  else true

I get the following output in -dflambda:

$ OCAMLLIB=_build/runtime_stdlib_install/lib/ocaml_runtime_stdlib _build/_bootinstall/bin/ocamlopt.opt -dflambda test.ml

After simplify:
((module_symbol Test.camlTest) (return_continuation k3) (exn_continuation k2)
 (toplevel_my_region toplevel_my_region/0N)
 (toplevel_my_ghost_region toplevel_my_ghost_region/1N)
 (used_value_slots { })
 camlTest__always_true_0_0_code = (Deleted)
 camlTest__always_true_0_1_code =
  ((code_metadata
    ((newer_version_of camlTest__always_true_0_0_code) (stub false)
     (inline Default_inline) () (poll_attribute Default) (is_a_functor false)
     (is_opaque false)
     (params_arity 𝕍=Variant((consts ({ 0 })) (non_consts ({(0 [*])}))))
     (param_modes (Heap)) (first_complex_local_param 1)
     (result_arity 𝕍=tagged_ℕ𝕚) (result_types (Unknown)) (result_mode Heap)
     (contains_no_escaping_local_allocs true) (recursive Non_recursive)
     (cost_metrics
      size: 32 removed: {call: 0 alloc: 0 prim: 14 branch: 1 direct: 0
                         poly_cmp: 0 requested: 0})
     (inlining_arguments Oclassic) (dbg test.ml:1,16--133) (is_tupled false)
     (is_my_closure_used false)
     (inlining_decision (Function_body_too_large 10))
     (loopify Never_loopify)))
   (λ〈k13〉《k12》⟅my_region/27N⟆ ⟅my_ghost_region/26N⟆
      (x/29UV ∷ 𝕍=Variant((consts ({ 0 })) (non_consts ({(0 [*])})))) 
      (my_closure/28N ∷ 𝕍) my_depth/25N .
    (prim/32N = ((Is_int x/29UV) test.ml:2,18--22)
     (switch prim/32N test.ml:2,18--22
     | 0 ↦ apply_cont k16 #0
     | 1 ↦ apply_cont k16 #1)
     k16 (naked_immediate/37N ∷ ℕ𝕚) #More_than_one:
     ((switch naked_immediate/37N test.ml:2,18--22
      | 0 ↦ return k13 1
      | 1 ↦ goto k17 test.ml:3,17--21)
      k17 #One:
      prim/41N = ((Is_int x/29UV) test.ml:3,17--21)
      (switch prim/41N test.ml:3,17--21
      | 0 ↦ return k13 0
      | 1 ↦ return k13 1))))))
 Test.camlTest__always_true_1 ↤ (always_true/0 ∷ 𝕍) =
  (set_of_closures Heap
   ({((always_true/0 ∷ 𝕍) camlTest__always_true_0_1_code)}))
 Test.camlTest =
  (Immutable_block (tag 0) (shape Value_only) (Test.camlTest__always_true_1))
 module_init_end k3 Test.camlTest)

Both prim/41N and prim/32N are equal to Is_int x/29UV, and prim/32N dominates prim/41N, so I'd expect CSE to kick in and eliminate prim/41N, but evidently it does not.

@lthls
Copy link
Contributor

lthls commented Oct 24, 2024

I looked into this, and the issue is during the CSE join.
We have two branches with the same CSE equation (Is_int x/29 = prim/32), but in one of them we also know that x/29 = 0. Since we canonicalise our primitives at each use, this turns into a join between Is_int 0 on one side and Is_int x/29 on the other side, the result being then empty.

In this particular example, canonicalising with the typing env at fork instead solves our problem, but we lose cases where each branch has a local alias of a common variable.
We could instead add CSE equations for every primitive that is equivalent to the current one up to aliases, but that will likely blow up quickly.
I think I will propose a middle-ground solution: for each CSE primitive, generate two primitives (one with the arguments canonicalised in the env at fork, one canonicalised in the env at use), and if they differ add both to the set of eligible primitives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flambda2 Prerequisite for, or part of, flambda2
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants