-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrictions on disjunctive patterns #44
Comments
The usual argument is that each clause means “this, and not any of the previous”, and it is burdensome to manually make them disjoint. It’s fine to permute them when they are disjoint (but I’m not sure what this would benefit), but I don’t think it’s a good idea to insist that they be. For one thing, the usual catch-all “underscore” case must surely be last, and that is often useful.
Best,
Bob
(c) Robert Harper All Rights Reserved.
… On Sep 7, 2020, at 17:30, John Reppy ***@***.***> wrote:
The Definition of Successor ML specifies that the pattern pat1 | pat2 must be irredundant, which is defined that there must exist a value that is matched by pat2 but not pat1. After discussions with Dave MacQueen, I wonder if the restriction should be that pat1 and pat2 are required to be disjoint; i.e., that there is no value that is matched by both patterns.
This restriction has the property that it makes the | pattern constructor commutative, which I think is intuitive.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#44>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AALWY5OOWOHG4CBV35CWRO3SEVGINANCNFSM4Q65Q65A>.
|
@RobertHarper I believe that this issue is about matches like Maybe I'm misinterpreting, but I don't think an expression like Edit: Updated my first case to actually be sensical. |
Ah, I misunderstood. Never mind.
(c) Robert Harper All Rights Reserved.
… On Sep 7, 2020, at 19:33, matt rice ***@***.***> wrote:
@RobertHarper <https://github.com/RobertHarper> I believe that this issue is about matches like (x : bool, y) | (x, y: Int) => x', not affecting the normal match ::= pat => exp ⟨| match⟩`. I also noted a paper referring to some issues with these disjunctive patterns in #42 <#42>.
Maybe I'm misinterpreting, but I don't think an expression like x | _ => () makes sense since x isn't bound in all the patterns.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#44 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AALWY5J3FJHHWNCX2GTLX6DSEVUUPANCNFSM4Q65Q65A>.
|
I'm not sure. The advantage of specifying it asymmetrically is that it naturally corresponds to the naive implementation of just expanding all the or-patterns into sequential matches, where one would only have a warning if pat2 was irredundant with pat1. |
A good argument in favor is that the restriction proposed could be relaxed, Given that this would fix that other issue without the addition of a new warning, which it seems only occurs where overlapping disjunctive patterns meet pattern guards, I would be inclined to argue in favor of it. |
I’ve spent a lot of time over the last six months or so rewriting the match compiler for SML/NJ, trying to go back to first principles for
the design. The rewrite has made me look closely at a number of issues, including the semantics of OR patterns.
The old match compiler treated OR patterns naively. It eliminated OR patterns by splitting rules whose pattern contained OR patterns
into multiple rules. This approach caused a rule like p => e, where p contains a sub pattern of the form (p1 | p2) to split into at least two
rules p’ => e and p’’ => e where p’ contains just p1 and p’’ contains just p2 (unless there is some implicit permuting of the rules going
on). So with this approach it turns out that p1 seems to have priority over p2 — ignoring other things that might be going on in the
pattern context, if p1 matches then the p’ => e rule fires and p2 doesn’t get matched, and conversely if p1 fails to match then p2
(as part of p’’ => e) gets a chance.
But this approach of eliminating embedded OR patterns and splitting rules was a kind of quick and dirty solution. A more fundamental
analysis can be made that does not “eliminate” OR patterns but treats them as a “first-class” feature of pattern matching. My new
match compiler does this analysis and as a consequence there is no intrinsic reason to treat (p1 | p2) asymmetrically. (Just as there
is no reason to treat the pattern (p1, p2) asymmetrically.) [In a decision-tree based match compiler, there are heuristics that determine
the sequence in which pattern points are matched, but that is an implementation issue.]
So I wanted to find some “semantic” principles to define what a “proper” or “legal” OR pattern is.
Here are the basic points of my proposal: In an OR pattern (p1 | p2)
(1) p1 and p2 have the same type
(2) p1 and p2 contain the same pattern variables
(3) p1 and p2 are disjoint (meaning Val(p1) \intersect Val(p2) = \emptyset, where Val(p) is the set of values matching p.)
These rules eliminate certain “bugus” OR patterns like, e.g.
x | x
x | true, true | x
x | 1::x, 1::x | x
These all violate the disjointness principle (3).
I had first thought of a set of reduction rules that would “normalize” OR patterns, such as
x | p —> x
p | x —> x
(p1,p2) | (p3, p4) —> ((p1 | p3), (p2 | p4))
but after further reflection, it seems simpler and better to just report the bogus OR patterns as errors.
I am writing up a detailed description of the new match compiler and its “theory”, which I hope to complete in the not
too distant future after I finish debugging the implementation (intended for SML/NJ 110.99).
Dave
On Sep 8, 2020, at 8:50 AM, matt rice ***@***.***> wrote:
A good argument in favor is that the restriction proposed could be relaxed,
However if we take the relaxed behavior as standard we cannot restrict it further.
Given that this would fix that other issue without the addition of a new warning, which it seems only occurs where overlapping disjunctive patterns meet pattern guards, I would be inclined to argue in favor of it.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub <#44 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGXNPJD5E2N4625UXAKH3DSEZHEJANCNFSM4Q65Q65A>.
David MacQueen
[email protected]
|
That sounds great, send me the paper when it’s done.
(c) Robert Harper. All Rights Reserved.
… On Sep 8, 2020, at 18:41, David MacQueen ***@***.***> wrote:
I’ve spent a lot of time over the last six months or so rewriting the match compiler for SML/NJ, trying to go back to first principles for
the design. The rewrite has made me look closely at a number of issues, including the semantics of OR patterns.
The old match compiler treated OR patterns naively. It eliminated OR patterns by splitting rules whose pattern contained OR patterns
into multiple rules. This approach caused a rule like p => e, where p contains a sub pattern of the form (p1 | p2) to split into at least two
rules p’ => e and p’’ => e where p’ contains just p1 and p’’ contains just p2 (unless there is some implicit permuting of the rules going
on). So with this approach it turns out that p1 seems to have priority over p2 — ignoring other things that might be going on in the
pattern context, if p1 matches then the p’ => e rule fires and p2 doesn’t get matched, and conversely if p1 fails to match then p2
(as part of p’’ => e) gets a chance.
But this approach of eliminating embedded OR patterns and splitting rules was a kind of quick and dirty solution. A more fundamental
analysis can be made that does not “eliminate” OR patterns but treats them as a “first-class” feature of pattern matching. My new
match compiler does this analysis and as a consequence there is no intrinsic reason to treat (p1 | p2) asymmetrically. (Just as there
is no reason to treat the pattern (p1, p2) asymmetrically.) [In a decision-tree based match compiler, there are heuristics that determine
the sequence in which pattern points are matched, but that is an implementation issue.]
So I wanted to find some “semantic” principles to define what a “proper” or “legal” OR pattern is.
Here are the basic points of my proposal: In an OR pattern (p1 | p2)
(1) p1 and p2 have the same type
(2) p1 and p2 contain the same pattern variables
(3) p1 and p2 are disjoint (meaning Val(p1) \intersect Val(p2) = \emptyset, where Val(p) is the set of values matching p.)
These rules eliminate certain “bugus” OR patterns like, e.g.
x | x
x | true, true | x
x | 1::x, 1::x | x
These all violate the disjointness principle (3).
I had first thought of a set of reduction rules that would “normalize” OR patterns, such as
x | p —> x
p | x —> x
(p1,p2) | (p3, p4) —> ((p1 | p3), (p2 | p4))
but after further reflection, it seems simpler and better to just report the bogus OR patterns as errors.
I am writing up a detailed description of the new match compiler and its “theory”, which I hope to complete in the not
too distant future after I finish debugging the implementation (intended for SML/NJ 110.99).
Dave
> On Sep 8, 2020, at 8:50 AM, matt rice ***@***.***> wrote:
>
>
> A good argument in favor is that the restriction proposed could be relaxed,
> However if we take the relaxed behavior as standard we cannot restrict it further.
>
> Given that this would fix that other issue without the addition of a new warning, which it seems only occurs where overlapping disjunctive patterns meet pattern guards, I would be inclined to argue in favor of it.
>
> —
> You are receiving this because you are subscribed to this thread.
> Reply to this email directly, view it on GitHub <#44 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGXNPJD5E2N4625UXAKH3DSEZHEJANCNFSM4Q65Q65A>.
>
David MacQueen
***@***.***
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Wouldn't such a restriction harm usability? You want overlapping or patterns for similar reasons that you want overlapping cases. Imagine a large datatype like
and a pattern like the following:
With the proposed restriction, I'd have to write this as
which seems rather silly and tedious, especially for larger N. |
Ah, good point, which is similar to the one I raised in my misdirected response.
(c) Robert Harper All Rights Reserved.
… On Sep 9, 2020, at 08:25, Andreas Rossberg ***@***.***> wrote:
Wouldn't such a restriction harm usability? You want overlapping or patterns for similar reasons that you want overlapping cases. Imagine a large datatype like
datatype t = C1 of int | C2 of t2 | ... | CN of tN
and a pattern like the following:
SOME ((C1 n, _) | (_, n)) => e(n)
With the proposed restriction, I'd have to write this as
SOME ((C1 n, _) | ((C2 _ | C3 _ | ... | CN _), n)) => e(n)
which seems rather silly, especially for larger N.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub <#44 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AALWY5LCBOF5AIR7DDBOP5TSE5X4VANCNFSM4Q65Q65A>.
|
In your example, how the value (C1 2, 3) would be matched would be ambiguous — n might be bound to 2 or to 3.
If you say that this ambiguity should be resolved by choosing the left-most match, then you are imposing an ordering
on the match compiler that limits its ability to optimize the order in which pattern nodes are examined.
So there is a trade-off. My strong preference is not to impose a pre-determined ordering on the way that pattern
nodes are examined during matches.
Another approach might be to do an analysis leading to a disambiguating transformation of the OR pattern. In this
case, your first example might be transformed into your second example automatically.
Dave
On Sep 9, 2020, at 5:25 AM, Andreas Rossberg ***@***.***> wrote:
Wouldn't such a restriction harm usability? You want overlapping or patterns for similar reasons that you want overlapping cases. Imagine a large datatype like
datatype t = C1 of int | C2 of t2 | ... | CN of tN
and a pattern like the following:
SOME ((C1 n, _) | (_, n)) => e(n)
With the proposed restriction, I'd have to write this as
SOME ((C1 n, _) | ((C2 _ | C3 _ | ... | CN _), n)) => e(n)
which seems rather silly, especially for larger N.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#44 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAGXNPLMLMVEEYCKKYYANNTSE5X4VANCNFSM4Q65Q65A>.
David MacQueen
[email protected]
|
Well, the dynamic semantics of OR-patterns at https://github.com/SMLFamily/Successor-ML/blob/master/definition/dyncor.tex#L1016 pretty clearly specifies that the matches are tried in left-to-right order. (Of course, a static semantics that rejected non-commutative OR-patterns would make the specifics of the dynamic semantics less relevant.) And, there is clearly an ordering imposed on the match compiler based on the order of the matches. |
Even if the static semantics rejected overlapping or-patterns, their dynamic semantics would still need to be ordered, due to the presence of nested matches and guards, which (unfortunately) can have observable side effects. So they still wouldn't be commutative. |
Another consideration is that such a restriction would complicate the static semantics considerably. So far, the Definition never needs to define coverage for patterns, since it's only needed to implement an informal warning. If we were to elevate that to an error in some cases, we'd have to go through the trouble of formalising coverage. Moreover, we'd run into the known issue with exception constructors, where overlap cannot generally be decided during type checking of a pattern itself:
In other words, such a requirement isn't compositional. |
I would at least like prefer if come up with a better name than disjunctive patterns at least, since disjunction is commutative and this is not, something like ordered intersection patterns (but preferably less terrible 🤷) |
"left-biased OR-patterns"? |
FWIW, SML's regular disjunction operator ( |
The Definition of Successor ML specifies that the pattern
pat1 | pat2
must be irredundant, which is defined that there must exist a value that is matched bypat2
but notpat1
. After discussions with Dave MacQueen, I wonder if the restriction should be thatpat1
andpat2
are required to be disjoint; i.e., that there is no value that is matched by both patterns.This restriction has the property that it makes the
|
pattern constructor commutative, which I think is intuitive.The text was updated successfully, but these errors were encountered: