Inefficient categorical IR? #52

Open
ahorn opened this issue Jun 5, 2019 · 3 comments
@ahorn

ahorn commented Jun 5, 2019

This is not so much a bug report but a technical question:

In your papers, an expression such as x + (x * y) would be encoded as add ○ (exl △ mul). However, your compiler extension uses dup instead. This seems to lead to inefficiencies.

I am therefore wondering: what is the reason for the difference between the categorical operators in your papers and the implementation? Is it due to the way GHC works? Or does it solve some other issues? Could the above inefficiencies be avoided by using the more "rudimentary" categorical operators that you use in your papers?
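The two encodings being contrasted can be sketched in plain Haskell, with the combinators specialized to the ordinary function category `(->)`. This is illustrative only; the names (`exl`, `dup`, `(&&&)`, `(***)`) follow the paper's notation rather than concat's actual class hierarchy:

```haskell
-- Combinators specialized to plain functions, for illustration.
exl :: (a, b) -> a
exl = fst

dup :: a -> (a, a)
dup x = (x, x)

(&&&) :: (a -> c) -> (a -> d) -> a -> (c, d)
(f &&& g) x = (f x, g x)

(***) :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
(f *** g) (a, b) = (f a, g b)

add, mul :: Num a => (a, a) -> a
add = uncurry (+)
mul = uncurry (*)

-- Paper-style encoding of \(x, y) -> x + (x * y):
paper :: (Int, Int) -> Int
paper = add . (exl &&& mul)

-- dup-based encoding, routing the whole input pair to both branches:
viaDup :: (Int, Int) -> Int
viaDup = add . (exl *** mul) . dup
```

Both definitions compute the same function; the difference is only in how the input pair is routed through the term, which is where the perceived inefficiency comes from.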

Many thanks in advance for your time!

@conal
Collaborator

conal commented Jun 6, 2019

Thanks for the question. After writing the paper, I introduced a class for monoidal categories with products as a superclass of cartesian categories. (My non-cartesian, monoidal, motivating example was a category of isomorphisms---really a category transformer.) Such a monoidal category has (***), while a cartesian category further adds exl, exr, and dup. The old (&&&) method now has a definition: f &&& g = (f *** g) . dup.

Without optimization, we can get inefficient translations for a variety of reasons. My intent (partially successful) has been to provide GHC rewrite rules to optimize categorical expressions, and if these optimizations worked better, I think the example you mentioned would improve. For subtle reasons, however, it's been difficult to get GHC to perform all of these rewrites.
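The class split described above can be sketched roughly as follows. This is a simplified reconstruction, not concat's actual code: the real hierarchy carries extra machinery (constraint families and so on), and the simplifications such as `exl ∘ (f △ g) = f` are what the GHC rewrite rules are meant to perform:

```haskell
import Prelude hiding ((.), id)
import Control.Category

-- Monoidal categories with products: parallel composition only.
class Category k => MonoidalP k where
  (***) :: k a c -> k b d -> k (a, b) (c, d)

-- Cartesian categories further add projections and duplication.
class MonoidalP k => Cartesian k where
  exl :: k (a, b) a
  exr :: k (a, b) b
  dup :: k a (a, a)

-- The old (&&&) method is now a derived definition:
(&&&) :: Cartesian k => k a c -> k a d -> k a (c, d)
f &&& g = (f *** g) . dup

-- The function category is an instance:
instance MonoidalP (->) where
  f *** g = \(a, b) -> (f a, g b)

instance Cartesian (->) where
  exl = fst
  exr = snd
  dup x = (x, x)
```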

@ahorn
Author

ahorn commented Jun 6, 2019

Thank you! It would be interesting to have a list of "requirements" or "overarching issues" that are essential for the approach to be viable when IR optimizations matter. This is related to a recent remark by David Spivak where he mentions that "isomorphisms are free [in mathematics] but expensive [in computer science]". All this seems to put rewrite systems in the spotlight, and it would be interesting to know from a research perspective what the key problems are in practice.

@ahorn
Author

ahorn commented Jun 12, 2019

I realize there isn't an easy answer, and I'd like to give an example to illustrate what I mean in case others stumble across our conversation in the future.

Consider again the function λ(x, y). x + (x * y). It takes a pair of numbers as input. Suppose we set y = 3, possibly during "constant propagation", a common compiler optimization.

Setting y = 3, we get λx. x + (x * 3). This corresponds to add ○ (id △ [mul ○ (id △ (const 3))]).

The question is whether it is feasible to get such a categorical form (or maybe something simpler) without relying on the λ-calculus. Put differently, can we simplify add ○ (exl △ mul) ○ (id △ (const 3)) directly? Through an easy calculation, we find the answer to be "yes" here:

add ○ (exl △ mul) ○ (id △ (const 3))
     = add ○ ([exl ○ (id △ (const 3))] △ [mul ○ (id △ (const 3))])
     = add ○ (id △ [mul ○ (id △ (const 3))])
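The two sides of this calculation can be checked directly, again with the combinators specialized to plain functions (an illustrative sketch, not concat's real IR):

```haskell
-- Combinators specialized to (->); `const 3` plays the role of the
-- constant arrow in the calculation above.
exl :: (a, b) -> a
exl = fst

(&&&) :: (a -> c) -> (a -> d) -> a -> (c, d)
(f &&& g) x = (f x, g x)

add, mul :: Num a => (a, a) -> a
add = uncurry (+)
mul = uncurry (*)

-- Left-hand side: the unoptimized composite after setting y = 3.
lhs :: Int -> Int
lhs = add . (exl &&& mul) . (id &&& const 3)

-- Right-hand side: the simplified form from the calculation.
rhs :: Int -> Int
rhs = add . (id &&& (mul . (id &&& const 3)))

-- Both should agree with \x -> x + x * 3.
```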

The rationale behind such calculations using a "categorical IR" is that there may be theorems we can exploit to optimize the code. The purpose of this GitHub issue is to find out whether you think such an approach is viable. I suspect there may be a good reason why something like CAM has never caught on in the larger community around functional programming compilers.
