Specialization for Scala 3 #18044
Replies: 2 comments 2 replies
-
It's an interesting scheme, though I think a worked-through example would help clarify some of the details and potential rough edges. In particular, I'm not sure that it interacts well enough with opaque types to be clearly superior to the tedious but somewhat-workable alternative of inlining-based specialization by hand. A great deal of the pointless boilerplate that was previously required by hand-based specialization can be avoided by judicious use of inlining. For instance, if you have sealed abstract class Thing[A, B] {
def foo(a: A, b: B): Thing[A, B]
} and final class ThingIntLong extends Thing[A, B] {
def foo(a: Int, b: Long): ThingIntLong = ...
} you can very often write an inline function final class ThingIntLong() extends Thing[A, B] {
def foo(a: Int, b: Long): ThingIntLong = fooImpl[Int, Long, ThingIntLong](a, b, this, (i: Int, l: Long) => ..., ...)
} This is still a lot of boilerplate. But it's far less boilerplate than manually specializing everything by hand, and it's very flexible, so it raises the bar for what specialization needs to offer. The key is, if you do it this way, if you have Anyway, at least for me, it's very hard to judge whether this type of specialization would be worth it without some examples that let us judge whether the assumption that we won't cross abstraction boundaries too much seems to be true in practice. For instance, one of the problems that image processing libraries tend to run into is that pixel-by-pixel access with the right specialized data type for the pixel, from at least two images at once, is essential for performance. With Scala 3, you can do that with inlining, but it isn't clear to me whether the above specialization scheme would allow it. |
Beta Was this translation helpful? Give feedback.
-
Yes, the way I look at it are two complementary things:
|
Beta Was this translation helpful? Give feedback.
-
Specialization for Scala 3
Specialization is one of the few remaining desirable features from Scala 2 that's are as yet missing in Scala 3. We could try to port the Scala 2 scheme, which would be non-trivial since the implementation is quite complex. But that scheme is problematic enough to suggest that we also look for alternatives. A possible alternative is described in this note. It is meant to complement the proposal on inline traits. That proposal also contains a more detailed critique of Scala 2 specialization.
The main problem of Scala-2 specialization is code bloat. We have to pro-actively generate up to 11 copies of functions and classes when they have a specialized type parameter, and this grows exponentially with the number of such type parameters. Miniboxing tries to reduce the number under the exponent from ~10 to 3 or 4, but it has problems dealing with arrays.
Languages like C++, Rust, Go, D, or Zig avoid the proactive generation of all possible specializations by monomorphizing the whole program. This means we only need to generate a specialized version of a function or class if it is actually used in the program. On the other hand, a global monomorphization can lead itself to code bloat and long compile times. It is also a problematic choice for binary APIs.
This note discusses a different scheme to get specialization for Scala 3, which is somewhat between Scala 2's selective specialization and full monomorphization. As in Scala 2, specialized type parameters are tagged explicitly (but not with a annotation). But as for monomorphization, specializations are only generated if a specialized type is referenced in the program. To make this work efficiently, we need a way to transport information about possible specialization types through generic code (full monomorphization does not need that since it eliminates all generic code).
We do that using a type class
Specialized
that is typically used as a context bound on a type parameter of some class. It indicates that we want to create specialized versions of that class where the type parameter is instantiated to the type argument. The specialized versions offer optimization opportunities compared to the generic class.Example
As an example, consider a
Vec
class for vectors over a numeric type.The idea is that we want to specialize vectors on the type parameter
T
in order to get important efficiency gains, including the following:arr
specialized to the actual element instead of a fully generic array that has to be accessed via reflectionresult
scalarProduct
Numeric
class instance forT
, so that calls tonum
's methods have static targets and can be inlined.Terminology and Restrictions
A class such as
Vec
which has some type parameterX
that comes with a context parameter of typeSpecialized[X]
is called a specialized class, and its type parameterX
is called a specialized type parameter.Restrictions:
legal if the class is translated to an inline trait.
The types which can be substituted for a
Specialized
parameter areAnyRef
or a primitive typeByte
,Short
,Char
,Int
,Long
,Float
,Double
,Boolean
,Unit
. We call these types the specializable argument types.Besides
Specialized
, there is also aSpecializedAt
type class which allows to specify the specializable argument types explicitly. Given a context boundX: SpecializedAt[T_1, ..., T_n]
the specializable argument types forX
areT1, ..., Tn
. As forSpecialized
, these types must be primitive types orAnyRef
.Examples:
[X: SpecializedAt[Long | Double | AnyRef]]
areLong
,Double
, andAnyRef
.[X: SpecializedAt[Int]]
consist of justInt
.(The precise definition of specializable argument types can be changed without changing the essence of the proposal, we keep it here for concreteness).
A Closer Look At Specialized
Specialized
arguments can be inter-converted withClassTag
s. For instanceSpecialized
could be a subclass ofClassTag
and an additional given could produce aSpecialized
value from a classtag.Specialized
differs fromClassTag
mainly in that it has some additional fields that help make specialization efficient. For instance, it could carry a tag field that maps every specializable type to a valid index, and maps all other types to a sentinel number.A concrete implementation of
Specialized
andSpecializedAt
could be as follows:Note that the first
S
parameter ofSpecializedAt
is a phantom type that is not used on the right hand side. In fact the type argument forS
is only used to determine the specialized argument types for a type variable with theSpecializedAt
annotation. When it comes to parameter passing, allSpecializedAt
dictionaries are equivalent toSpecialized
.Expansion of Specialized Classes
A general form of a specialized class modelled after
Vec
is:We will use this class as an example to explain how specialized classes are compiled. In general, we can have
X
,a
,Specialized
context parameters likes
, each over a separate type parameter,b
that take specialized type parameters as arguments.A specialized class expands to several artifacts.
First, there is an inline trait with the parents and body as given in the class.
For instance, here is the inline trait for class
C
above:Second, there are zero or more specialized versions of the class.The name of the specialized version is of the form
C$sp$N
whereC
is the specialized class andN
is a tag of a specialized argument type. We propose for now to use the same naming schemes as for Scala 2 specialization, but this aspect could be modified.The definition of the specialized version depends on whether the specialized argument
T
is final or not. IfT
is final, we determine a canonical implementations of each typeclass argument overT
usingIn this case, we assume that any specialized typeclass arguments like
b
will be the same whereeverC[T]
gets instantiated. We test that this assumption is correct at least at the toplevel by comparing the class of any specialized instantiation argument with the class of the corresponding argument with whichC$sp$T
was first instantiated. Fixing type class arguments in this way creates important potential for optimizations when inlining traitC
atT
.If
T
is not final, we expand instead to:If
T
is not final, we can still specialize the class to an element type upper-bounded byT
. This could be important in some cases. For instance, if classC
is specialized atAnyRef
and there are generic arrays of typeArray[X]
inC
, those arrays would be specialized to anArray[Object]
representation. This is much more efficient than an unconstrainedArray[X]
which needs reflection for element access.On the other hand, we cannot fix the type class argument
b
since different subtypes ofT
might have different instances, even if we assume global coherence.Third, there is a generic implementation of the class which is used for all type arguments that are not specializable. Here is the one for
C
:Creating Specialized Versions
Scala erasure maps the type
C[T]
toC$sp$S
orC$sp$S[T]
ifT
conforms to a specializable argumentS
forC
. This step is important for performance. Without it, any interaction with classC
would go through the inefficient generic API ofC
that relies on boxing.To make this erasure work, the compiler needs to create a specialized version of
C
for every specializable argument typeS
whereC[S]
is referred to in a program. A compiler can scan its classpath to see whether a required specialized version already exists. If none exists, the specialized version of classC
atS
will be created on the fly. This typically involves reading the inline trait forC
from Tasty and expanding it in the freshly created classC$sp$S
.Now consider a construction of a specializable class, say
new C[T](a)(using s, b)
. IfT
is statically known to conform to a specializable typeS
, we can rewrite this tonew C$sp$S(a)
ornew C$sp$[T](a)(using s, b)
depending on whetherS
is final or not. IfT
is a monomorphic type that does not conform to any specializable typeS
, we rewrite tonew C$generic[T](a)(s, b)
instead.That leaves type variables
T
where the actual runtime instance is unknown at compile time. In this case we must still create the appropriate value of classC$sp$T
in the case where at runtime theSpecialized
values
indicatesT
is a specializableargument for
C
. This is necessary, since otherwise it would not be sound to erase all occurrences of typeC[T]
to their specialized versions.We do this by implementing
new
expressions as calls to a special factory method in the companion object of the specializable class. For instance, the new expression above would be rewritten toC.newInstance(s)[T](a, s, b)
, where the companion object ofC
contains a factory methodnewInstance
defined like this:The implementation of
newInstance
depends on whether we assume static linking or not.With static linking, the linker could generate code that maps every
Specialized[T]
entry whereT
is a specializable argument forC
mentioned in the code to the functionSpecialized
entries could contain consecutive tags for specializable types and a separate tag for other types, so this mapping could be a simple table lookup on the tag.If we don't assume static linking, we can get equivalent functionality using reflection. For instance, on the JVM, we can implement
newInstance
by callingClass.forName
with the name of the specialized version corresponding to a type and the class loader oftrait
C
itself. If such a class exists, we can then extract the calling function by getting a method handle on the constructor.If the specialized version does not exist we would instead return the creation method for the generic version of
C
(this could happen if classes are created only from generic code, but never mentioned explicitly at the specialized type in the source). We can use hashes to make sure that any specialized version found matches the version of the inline trait it implements. We would cache these computations in a lambda factory so all calls after the first one are simple lambda invocations.AOT Specialization
Our scheme does specialization on demand. A type
C[T]
gets specialized ifT
is a specializable argument forC
and the type is mentioned in the program (either explicitly or inferred).A typical use case is where a library defines specializable generic classes and then applications using that library would use these classes at specific argument types. Then the classes would be specialized on demand in the binary code of such applications.
We might also want to define some common specializations that are generated with the library and thus can be shared by all applications using them. To do this it is enough to mention the specialized type in the library at any point whatsoever.
As a convenience feature, we can define a method
Then libraries can force specializations by calling
specialized
like this:Discussion
Upsides of this proposal relative to Scala 2 specialization:
Numeric
which could be very expensive if compiled as is.Downsides:
Class.forName
or a linking step.The characteristics of the new scheme mean that it is ideally suited for larger classes such as
The larger the specialized code body is, the more likely it is that we will not cross abstraction boundaries requiring boxing, and the fewer lookups are needed to find the right constructor. On the other hand, for Scala 2 specialization, exponential code blow up makes all these usage scenarios unrealistic. So Scala 2 specialization is best used for small classes and traits such as functions and tuples.
Scala 3 already supports specialized functions in the Scala 2 style. It should not be too hard to support specialized tuples in that style as well. For most other things, the scheme described here looks like a better option.
Beta Was this translation helpful? Give feedback.
All reactions