Inconsistent Range behavior when dealing with reverse ranges #564

isomarcte · 2022-11-15T13:55:49Z

I've noticed some inconsistencies with how several methods on Range and that use Range are defined when dealing with ranges which go different directions.

Range.contains

contains for Range is not symmetric around reverse.

scala> val a = Range(1, 10)
a: cats.collections.Range[Int] = Range(1,10)

scala> val b = a.reverse
b: cats.collections.Range[Int] = Range(10,1)

scala> a.contains(b)
res0: Boolean = true

scala> b.contains(a)
res1: Boolean = false

Perhaps most troubling,

scala> Range(1, 10).contains(Range(15, 9))
res0: Boolean = true

This is made more confusing when looking at how Range is used with Diet.

scala> val a = Range(1, 10)
a: cats.collections.Range[Int] = Range(1,10)

scala> val b = a.reverse
b: cats.collections.Range[Int] = Range(10,1)

scala> Diet.fromRange(b) === Diet.fromRange(a)
res0: Boolean = false

scala> Diet.empty.addRange(b) === Diet.fromRange(a)
res1: Boolean = true

There are a few ways to remedy this that come to mind, though I think the most simple would be to change Range to be defined as always going in one direction. If a Range is desired that is going in the inverse order, then an newtype (perhaps Inverse) could be used.

The text was updated successfully, but these errors were encountered:

satorg · 2022-11-15T17:12:12Z

I would say it is by design: Range defines boundaries but not specify Order upon creation. Then every single method has to provide the Order implicitly for itself. In particular, both the contains methods take the implicit Order parameter. Therefore, if we reverse Range then the Order used for its methods has also be reversed:

scala> val rng1 = Range(1, 10)
rng1: cats.collections.Range[Int] = Range(1,10)

scala> rng1.contains(5)
res0: Boolean = true

scala> val rng2 = rng1.reverse
rng2: cats.collections.Range[Int] = Range(10,1)

scala> rng2.contains(5) // won't work since it uses a regular Order for Int
res1: Boolean = false

scala> rng2.contains(5)(Order.reverse(Order[Int]))
res2: Boolean = true

That said, I personally find such an approach controversial quite a bit.

armanbilge · 2022-11-15T17:14:23Z

Range defines boundaries but not specify Order upon creation. Then every single method has to provide the Order implicitly for itself.

Yes, this is one of the things I think we should consider changing for 1.0. See my comment in #277 (comment).

Indeed, this design was rejected for the new HashMap and HashSet, on grounds of be a "Haskell cargo-cult". So at the very least, this library now has inconsistent designs.

satorg · 2022-11-15T17:23:10Z

this is one of the things I think we should consider changing for 1.0

If we could change that, then perhaps it would be a cool change. Would we consider making Range to be not "all-inclusive" only? I mean, if we could create a Range instance having one or both boundaries that would be exclusive, then it would make this type way more useful.

armanbilge · 2022-11-15T17:24:53Z

See also some discussion in #315 (comment).

isomarcte · 2022-11-15T18:20:14Z

Just to clarify, are we saying that we intentionally want the usage of non-coherent instances to be normative? Like, we demand either Order[Int] or Order.reverse(Order[Int]) on Range creation as a design feature to change the ordering of the Range?

Using HashSet as an example, is this an intended usage?

scala> val sillyButValidHash: Hash[Int] = new Hash[Int] { override def hash(a: Int): Int = 1; override def eqv(x: Int, y: Int): Boolean = Eq[Int].eqv(x, y)}
sillyButValidHash: cats.Hash[Int] = $anon$1@1f21f982

scala> HashSet(1) === HashSet(1)(sillyButValidHash)
res0: Boolean = true

scala> HashSet(1).add(2) === HashSet(1)(sillyButValidHash).add(2)
res1: Boolean = false

scala> HashSet(1).add(2).iterator.toList === HashSet(1)(sillyButValidHash).add(2).iterator.toList
res2: Boolean = true

I personally would never want this to be intended behavior, as if I have more than one instance of HashSet[A] I completely lose the ability to reason about how they relate to each other, unless I've defined them in the same local scope.

I was under the impression that binding the instance at the method or at the class was mostly for convenience (and preventing damage to internal state of an already allocated structure) and not as a mechanism for using alternative instances. I would expect to use newtypes (ideally opaque types) to index alternative implementations.

armanbilge · 2022-11-15T18:31:29Z

The problem is if you take the instance at the call site then you can do this:

HashSet(1).add(2)(sillyButValidHash)

Which is also bad. Oscar just summarized this nicely in #277 (comment).

I still believe that in a language like scala where typeclasses are first class values and we don't have coherence, it doesn't make since to consume required typeclasses at callsites if using the wrong one invalidates the datastructure (which it does for hashmaps and trees with Hash and Order respectively).

armanbilge · 2022-11-15T18:33:20Z

I would expect to use newtypes (ideally opaque types) to index alternative implementations.

👍 yes, this is the only safe way.

isomarcte · 2022-11-15T18:41:14Z

👍 You scared me for a minute.

So coming back to Range. If we define ranges as intervals going in a single direction, then I think we can make the interface consistent and relatively easy to use. This would require the use of a newtype for reversing direction. I'm happy to POC that if there are no objections. It would also make the case for moving Inverse to cats-collections more compelling.

My intended strategy would probably be to introduce a new class Interval and deprecate Range.

Thoughts?

armanbilge · 2022-11-15T19:20:55Z

Sure, maybe Interval is a better name for the concept anyway.

satorg · 2022-11-15T19:22:29Z

@isomarcte I think there are two strategies possible:

As you suggested, to make a type that always uses a single direction and then create a newtype for the inverse directions.
- PROS: (as I can see it) the direction is unified and alway apparent.
- CONS: the range type and its inverse counterpart become two different types and thus cannot be compared to each other directly. I.e. we have to either implement separate methods for correlating those types or make a conversion before each comparison.
Another way could be to keep Range being bi-directional but latching its Order upon a new instance creation. Therefore, when we call .reverse we can also reverse the Order.
- PROS: a single type that is "up" to be compared with a counterpart regardless of each other's direction.
- CONS: the Range becomes more complex and has to maintain more complex structure rather than just boundaries.

Also as I mentioned above, I personally would find useful to have Range that could have non-inclusive boundaries, e.g. [0..10) which is similar to Scala's 0 until 10 .

All above is my personal perception of course.

armanbilge · 2022-11-15T19:31:13Z

Thanks for that distinction, at least for me I think that helped clarify things.

I think (1) is an "interval": it is a set of things, in some natural order. Whereas (2) is a "range" since you are not only specifying the set, but in which order to iterate it. I am not sure if that means we have to reverse the Order instance: the direction is a property of the "range", and not of its element. I think :)

satorg · 2022-11-15T19:52:53Z

I am not sure if that means we have to reverse the Order instance: the direction is a property of the "range", and not of its element.

In theory yes, it should not be the property of Range. But the way it is currently implemented is that Order is used to correlate the Range's boundaries with other items of the same type (including boundaries of another Range). Therefore, currently Order magically becomes a property of the Range itself. And yeah, I agree it should not be so...

isomarcte · 2022-11-15T20:14:26Z

@satorg RE 1.CONS

Assuming the inclusion of Inverse from the aforementioned PR, do you think something along these lines would work?

val a: Interval[Int] = Interval(1, 10)
val b: Interval[Inverse[Int]] = a.reverse

def reverse: Interval[Inverse[A]] = 
  Interval(Inverse(end), Inverse(start))
def contains(value: Interval[A]): Boolean = ???
def containsInverse(value: Interval[Inverse[A]]): Boolean = 
  contains(value.reverse)

// Just a rough sketch. I might be able to do something interesting so that
// (a: Interval[Inverse[A]]).reverse => Interval[A], and not
// Interval[Inverse[Inverse[A]]]
:t reverse 
Interval[A] => Interval[Inverse[A]]
:t contains 
Interval[A] => Interval[A] => Boolean
:t containsInverse
Interval[A] => Interval[Inverse[A]] => Boolean

And I'm completely fine making it exclusive on the upper bound. I think that would even be required, since if we require that start <= end, then the only want to define the empty interval is to have the upper bound be exclusive.

isomarcte · 2022-11-15T20:15:29Z

I can also just make a POC PR as well, if that would be easier for discussion.

satorg · 2022-11-16T02:56:21Z

And I'm completely fine making it exclusive on the upper bound. I think that would even be required, since if we require that start <= end, then the only want to define the empty interval is to have the upper bound be exclusive.

Yes, I think it makes sense as an option... But on the other hand.... Imagine that we need to make Interval for the entire Short type range. In that case Interval(Short.MinVal, Short.MaxVal) wouldn't work since the last value will be always excluded.

Another (but similar) example:

sealed trait MyNum

object MyNum {
  object One extends MyNum
  object Two extends MyNum
  object Three extends MyNum
  object Four extends MyNum
}

Apparently we can define some Order for such an enum as well as Discrete (or perhaps even BoundedEnumerable from cats) for that. But if Interval can only have an exclusive upper bound, then there's no way to define such an Interval that would contain all the members of an enum like one above.

satorg · 2022-11-16T02:59:33Z

As a bottomline: there's a controverity: if Interval only allows both lower and upper bounds to be inclusive, then there's no way to define an empty Interval. Otherwise, if it always defined as [inclusive, exclusive), then we cannot describe a full range of some limited set of values.

This is why I think it should allow both options.

satorg · 2022-11-16T03:04:57Z

On the second thought... Perhaps, the idea is too drastic, but... is there really a necessity for having classes like Range or Interval in principle, taking into account that there are cool typeclasses like UnboundedEnumerable, LowerBoundedEnumerable, BoundedEnumerable, etc already defined in cats?

isomarcte · 2022-11-16T14:23:15Z

@satorg I would say yes. An Interval type, not only providing some utility itself, is fundamental to a class of data structures called Interval Trees, of which we have a variant in cats-collection already, e.g. Diet.

Having a base type with the correct semantics to implement these is, I think, quite useful.

I am of course heavily biased here. I started down this path because a nice Interval type and an Interval Tree which supports key value pairs would be amazing for my work in idna4s. 😉

satorg · 2022-11-16T23:00:35Z

I basically agree. But what I am thinking about (and yeah, perhaps it is too crazy) is to make Diet more generic and don't stick to a particular interval/range type. Therefore if we made implementation of Diet based on typeclasses rather than a particular range implementation, then it should be possible to plug any range implementation into it, including scala.collection.immutable.Range, something like

class Diet[A, R] {}

where A – element type and R – range type

armanbilge added the bug label Nov 15, 2022

isomarcte mentioned this issue Nov 16, 2022

Add Syntax For Enumerable typelevel/cats#4347

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistent Range behavior when dealing with reverse ranges #564

Inconsistent Range behavior when dealing with reverse ranges #564

isomarcte commented Nov 15, 2022

satorg commented Nov 15, 2022 •

edited

Loading

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022

armanbilge commented Nov 15, 2022

isomarcte commented Nov 15, 2022 •

edited

Loading

armanbilge commented Nov 15, 2022

armanbilge commented Nov 15, 2022

isomarcte commented Nov 15, 2022

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022 •

edited

Loading

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022

isomarcte commented Nov 15, 2022

isomarcte commented Nov 15, 2022

satorg commented Nov 16, 2022

satorg commented Nov 16, 2022

satorg commented Nov 16, 2022

isomarcte commented Nov 16, 2022

satorg commented Nov 16, 2022

Inconsistent Range behavior when dealing with reverse ranges #564

Inconsistent Range behavior when dealing with reverse ranges #564

Comments

isomarcte commented Nov 15, 2022

Range.contains

satorg commented Nov 15, 2022 • edited Loading

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022

armanbilge commented Nov 15, 2022

isomarcte commented Nov 15, 2022 • edited Loading

armanbilge commented Nov 15, 2022

armanbilge commented Nov 15, 2022

isomarcte commented Nov 15, 2022

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022 • edited Loading

armanbilge commented Nov 15, 2022

satorg commented Nov 15, 2022

isomarcte commented Nov 15, 2022

isomarcte commented Nov 15, 2022

satorg commented Nov 16, 2022

satorg commented Nov 16, 2022

satorg commented Nov 16, 2022

isomarcte commented Nov 16, 2022

satorg commented Nov 16, 2022

satorg commented Nov 15, 2022 •

edited

Loading

isomarcte commented Nov 15, 2022 •

edited

Loading

satorg commented Nov 15, 2022 •

edited

Loading