Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

a faster immutable stack/list #634

Open
johnynek opened this issue Sep 12, 2023 · 2 comments
Open

a faster immutable stack/list #634

johnynek opened this issue Sep 12, 2023 · 2 comments

Comments

@johnynek
Copy link
Contributor

The canonical encoding of list in scala 2 would be:

sealed trait List[+A]
object List {
  case object Empty extends List[Nothing]
  case class NonEmpty[A](head: A, tail: List[A]) extends List[A]
}

but this implementation involves a lot of pointer chasing (to iterate N times into the List we need to dereference N times).

We know from work on immutable vectors that having blocks of arrays that we copy on change can give considerable improvements due to cache locality (and fewer derefs).

So one can imagine another list:

sealed trait List[+A] {
  def uncons: Option[(A, List[A])]
  def prepend(a: A): List[A]
}

object List {
  private val BlockSize = 16 // experiment on this
  private case object Empty extends List[Nothing] {
    def uncons = None
    def prepend(a: A) = {
      val ary = new Array[Any](BlockSize)
      val offset = BlockSize - 1
      ary(offset) = a
      Impl(offset, ary, Empty)
    }
  }
  private case class Impl[A](offset: Int, block: Array[Any], tail: List[A]) extends List[A] {
    def uncons = {
      val nextOffset = offset + 1
      val next = if (nextOffset == block.length) tail else Impl(nextOffset, block, tail)
      Some((block(offset).asInstanceOf[A], next)
   }
    def prepend(a: A) = {
      val ary = new Array[Any](BlockSize)
      if (offset > 0) {
        // copy the right side
        System.arraycopy(block, offset, ary, offset, BlockSize - offset) 
        val nextOffset = offset - 1
        ary(nextOffset) = a
        Impl(nextOffset, ary, tail)
      }
      else {
        val offset = BlockSize - 1
        ary(offset) = a
        Impl(offset, ary, this)
      }
    }
}

or something like this...

The idea is to copy the block on write. Of course, apply, map, foldLeft, foreach, iterator, etc... could be significantly optimized since once you have a handle on the block iterating to the next value will be in cache.

With some appropriate benchmarks you could set the size of the block to some optimal value that I bet would be bigger than 1. It could be that this beats naive List in almost all cases, and when the List isn't too large it could compete with Vector (which isn't as fast as List for stack uses).

@armanbilge
Copy link
Member

armanbilge commented Sep 12, 2023

I like this!

A couple other thoughts:

  1. creating a List by concatenating to a mutable builder would also be a lot faster since you can just place elements in the Array. I guess you'd have to allow the last block to be incomplete (needs logic but seems fine ...)

  2. I wonder if it's worth to try to avoid copying the block on the first prepend, optimistically assuming that it will not be prepended more than once. scodec ByteVector has some optimizations like this for appending:

https://github.com/scodec/scodec-bits/blob/8a23a541def13150768c83aa0b077afbda4b3f34/core/shared/src/main/scala/scodec/bits/ByteVector.scala#L2311-L2317

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants