Java 8 introduced lambdas to the Java language. While the design choices differ in many regards from Scala's functions, the underlying mechanics used to represent Java lambdas are flexible enough to be used as a target for the Scala compiler.
Java does not have a canonical hierarchy of generic function types (à la scala.FunctionN), but instead allows a lambda to be used as shorthand for an anonymous implementation of a functional interface.
Here's an example of creating a predicate that closes over one value:
public class Test {
  void test() {
    String s = "foo";
    java.util.function.Predicate<String> pred = (str) -> str.equals(s);
  }
}
This is compiled to:
// javac -d . sandbox/Test.java && javap -classpath . -v -private Test | subl
void test();
0: ldc #2 // String foo
2: astore_1
3: aload_1
4: invokedynamic #3, 0 // InvokeDynamic #0:test:(Ljava/lang/String;)Ljava/util/function/Predicate;
9: astore_2
10: return
private static boolean lambda$test$0(java.lang.String, java.lang.String);
0: aload_1
1: aload_0
2: invokevirtual #4 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
5: ireturn
}
BootstrapMethods:
0: #19 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
Method arguments:
#20 (Ljava/lang/Object;)Z
#21 invokestatic Test.lambda$test$0:(Ljava/lang/String;Ljava/lang/String;)Z
#22 (Ljava/lang/String;)Z
The first time that the method test is invoked, the Java runtime will call the bootstrap method LambdaMetafactory.metafactory, asking it to create a class that implements the interface Predicate and whose implementation calls the lambda target method lambda$test$0.
LambdaMetafactory was designed as a public API for other language implementors to target.
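Because it is a public API, the same linkage can be performed by hand. The following is a minimal sketch, assuming a hypothetical class MetafactoryDemo whose static method lambdaBody plays the role of the synthetic lambda$test$0; the three method types passed to the metafactory correspond to the three bootstrap arguments in the javap output above.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Predicate;

class MetafactoryDemo {
    // Hypothetical stand-in for the synthetic lambda$test$0:
    // the captured value comes first, the lambda argument second.
    public static boolean lambdaBody(String captured, String str) {
        return str.equals(captured);
    }

    @SuppressWarnings("unchecked")
    public static Predicate<String> makePredicate(String captured) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                MetafactoryDemo.class, "lambdaBody",
                MethodType.methodType(boolean.class, String.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "test",                                                // name of the interface method
                MethodType.methodType(Predicate.class, String.class),  // (captured values) -> interface
                MethodType.methodType(boolean.class, Object.class),    // erased signature of Predicate.test
                impl,                                                  // the lambda target
                MethodType.methodType(boolean.class, String.class));   // signature after type instantiation
            // Invoking the call site's target performs the lambda capture.
            return (Predicate<String>) site.getTarget().invoke(captured);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        Predicate<String> pred = makePredicate("foo");
        System.out.println(pred.test("foo")); // true
        System.out.println(pred.test("bar")); // false
    }
}
```

An invokedynamic instruction does the same thing, except that the JVM calls the bootstrap method once and caches the resulting call site.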
class Test {
  def test = {
    val s = "foo"
    (x: String) => x == s
  }
}
javap -classpath . -private -c Test
Compiled from "Test.scala"
public class Test {
public scala.Function1<java.lang.String, java.lang.Object> test();
Code:
0: ldc #12 // String foo
2: astore_1
3: new #14 // class Test$$anonfun$test$1
6: dup
7: aload_0
8: aload_1
9: invokespecial #18 // Method Test$$anonfun$test$1."<init>":(LTest;Ljava/lang/String;)V
12: areturn
public Test();
Code:
0: aload_0
1: invokespecial #25 // Method java/lang/Object."<init>":()V
4: return
}
// javap -classpath . -private -c 'Test$$anonfun$test$1'
Compiled from "Test.scala"
public final class Test$$anonfun$test$1 extends scala.runtime.AbstractFunction1<java.lang.String, java.lang.Object> implements scala.Serializable {
public static final long serialVersionUID;
private final java.lang.String s$1;
public final boolean apply(java.lang.String);
Code:
0: aload_1
1: aload_0
2: getfield #23 // Field s$1:Ljava/lang/String;
5: astore_2
6: dup
7: ifnonnull 18
10: pop
11: aload_2
12: ifnull 25
15: goto 29
18: aload_2
19: invokevirtual #29 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
22: ifeq 29
25: iconst_1
26: goto 30
29: iconst_0
30: ireturn
public final java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: checkcast #34 // class java/lang/String
5: invokevirtual #37 // Method apply:(Ljava/lang/String;)Z
8: invokestatic #43 // Method scala/runtime/BoxesRunTime.boxToBoolean:(Z)Ljava/lang/Boolean;
11: areturn
public Test$$anonfun$test$1(Test, java.lang.String);
Code:
0: aload_0
1: aload_2
2: putfield #23 // Field s$1:Ljava/lang/String;
5: aload_0
6: invokespecial #50 // Method scala/runtime/AbstractFunction1."<init>":()V
9: return
}
The compiler eagerly creates a subclass of FunctionN, and lambda capture is simply an instantiation of this class. The body of the lambda is copied into the apply method of this class.
// scalac -Ydelambdafy:method sandbox/Test.scala && javap -classpath . -private -c 'Test'
Compiled from "Test.scala"
public class Test {
public scala.Function1<java.lang.String, java.lang.Object> test();
Code:
0: ldc #12 // String foo
2: astore_1
3: new #14 // class Test$lambda$$test$1
6: dup
7: aload_1
8: invokespecial #18 // Method Test$lambda$$test$1."<init>":(Ljava/lang/String;)V
11: checkcast #20 // class scala/Function1
14: areturn
private static final boolean $anonfun$1(java.lang.String, java.lang.String);
Code:
0: aload_0
1: aload_1
2: astore_2
3: dup
4: ifnonnull 15
7: pop
8: aload_2
9: ifnull 22
12: goto 26
15: aload_2
16: invokevirtual #30 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
19: ifeq 26
22: iconst_1
23: goto 27
26: iconst_0
27: ireturn
public Test();
Code:
0: aload_0
1: invokespecial #37 // Method java/lang/Object."<init>":()V
4: return
public static final boolean accessor$1(java.lang.String, java.lang.String);
Code:
0: aload_0
1: aload_1
2: invokestatic #40 // Method $anonfun$1:(Ljava/lang/String;Ljava/lang/String;)Z
5: ireturn
}
// javap -classpath . -private -c 'Test$lambda$$test$1'
Compiled from "Test.scala"
public final class Test$lambda$$test$1 extends scala.runtime.AbstractFunction1 implements scala.Serializable {
public static final long serialVersionUID;
public java.lang.String s$2;
public Test$lambda$$test$1(java.lang.String);
Code:
0: aload_0
1: invokespecial #18 // Method scala/runtime/AbstractFunction1."<init>":()V
4: aload_0
5: aload_1
6: putfield #20 // Field s$2:Ljava/lang/String;
9: return
public final boolean apply(java.lang.String);
Code:
0: aload_1
1: aload_0
2: getfield #20 // Field s$2:Ljava/lang/String;
5: invokestatic #30 // Method Test.accessor$1:(Ljava/lang/String;Ljava/lang/String;)Z
8: ireturn
public final java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: checkcast #34 // class java/lang/String
5: invokevirtual #36 // Method apply:(Ljava/lang/String;)Z
8: invokestatic #42 // Method scala/runtime/BoxesRunTime.boxToBoolean:(Z)Ljava/lang/Boolean;
11: areturn
}
This is quite similar; lambda capture is still instantiation of an anonymous class. The class name is a little different, and its apply method no longer contains the lambda body, but rather delegates to a method in the enclosing class of the lambda, into which the body has been copied.
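As a rough source-level picture of that shape, here is a hand-written Java sketch. The names DelegatingClosureDemo, anonfun1 and Closure are hypothetical, and java.util.function.Predicate stands in for scala.Function1 to keep the example free of the Scala library.

```java
import java.util.function.Predicate;

class DelegatingClosureDemo {
    // The lambda body, lifted into a static method of the enclosing class.
    // The null checks mirror what Scala's == compiles to in the bytecode above.
    public static boolean anonfun1(String x, String s) {
        return x == null ? s == null : x.equals(s);
    }

    // Hand-written stand-in for the generated closure class: it only stores
    // the captured value and forwards to the lifted method.
    static final class Closure implements Predicate<String> {
        private final String s;

        Closure(String s) {
            this.s = s;
        }

        @Override
        public boolean test(String x) {
            return anonfun1(x, s);
        }
    }

    public static Predicate<String> test() {
        String s = "foo";
        return new Closure(s); // lambda capture is a plain instantiation
    }

    public static void main(String[] args) {
        System.out.println(test().test("foo")); // true
        System.out.println(test().test("bar")); // false
    }
}
```

With the body already living in a method of the enclosing class, the closure class is the only part left to replace.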
This is a stepping stone towards...
// topic/indylambda-emit-indy /code/scala2 qscalac -Ydelambdafy:method -Ybackend:GenBCode -target:jvm-1.8 sandbox/Test.scala && javap -classpath . -private -c 'Test'
Compiled from "Test.scala"
public class Test {
public scala.Function1<java.lang.String, java.lang.Object> test();
Code:
0: ldc #12 // String foo
2: astore_1
3: aload_1
4: invokedynamic #32, 0 // InvokeDynamic #0:apply:(Ljava/lang/String;)Lscala/compat/java8/JFunction1;
9: checkcast #34 // class scala/Function1
12: areturn
public static final boolean Test$$$anonfun$1(java.lang.String, java.lang.String);
Code:
0: aload_1
1: aload_0
2: astore_2
3: dup
4: ifnull 10
7: goto 18
10: pop
11: aload_2
12: ifnull 28
15: goto 32
18: aload_2
19: invokevirtual #42 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
22: ifne 28
25: goto 32
28: iconst_1
29: goto 33
32: iconst_0
33: ireturn
public Test();
Code:
0: aload_0
1: invokespecial #50 // Method java/lang/Object."<init>":()V
4: return
}
BootstrapMethods:
0: #19 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
Method arguments:
#21 (Ljava/lang/Object;)Ljava/lang/Object;
#26 invokestatic Test.Test$$$anonfun$1:(Ljava/lang/String;Ljava/lang/String;)Z
#28 (Ljava/lang/String;)Z
This looks a lot like the Java 8 encoding. The main point of difference is the use of the functional interface scala/compat/java8/JFunction1. The scala.FunctionN traits are not functional interfaces; they contain abstract alternatives of apply created by specialization. There are also a few concrete methods in the traits, which are usually filled in by mixin composition in the Scala compiler.
scala> classOf[Function1[_, _]].getMethods.filter(m => java.lang.reflect.Modifier.isAbstract(m.getModifiers)).map(_.getName)
res3: Array[String] = Array(toString, apply, apply$mcZD$sp, apply$mcDD$sp, apply$mcFD$sp, apply$mcID$sp, apply$mcJD$sp, apply$mcVD$sp, apply$mcZF$sp, apply$mcDF$sp, apply$mcFF$sp, apply$mcIF$sp, apply$mcJF$sp, apply$mcVF$sp, apply$mcZI$sp, apply$mcDI$sp, apply$mcFI$sp, apply$mcII$sp, apply$mcJI$sp, apply$mcVI$sp, apply$mcZJ$sp, apply$mcDJ$sp, apply$mcFJ$sp, apply$mcIJ$sp, apply$mcJJ$sp, apply$mcVJ$sp, compose, andThen)
scala> classOf[Function3[_, _, _, _]].getMethods.filter(m => java.lang.reflect.Modifier.isAbstract(m.getModifiers)).map(_.getName)
res4: Array[String] = Array(toString, apply, curried, tupled)
In Scala 2.12, we intend to use Default Methods in Java interfaces to make these types functional interfaces.
In Scala 2.11 under experimental mode, we require an additional JAR on the classpath from scala-java8-compat.
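To illustrate why default methods help, here is a small Java sketch of a function type with one abstract method and its combinators supplied as defaults. Fun1 is a hypothetical interface written for this example; it is not the actual definition of scala.Function1 or JFunction1.

```java
class DefaultMethodDemo {
    // Hypothetical function type: exactly one abstract method, so it
    // qualifies as a functional interface and can be a lambda target.
    @FunctionalInterface
    public interface Fun1<T, R> {
        R apply(T t);

        // Concrete members live in the interface itself instead of being
        // copied into every implementation class.
        default <A> Fun1<A, R> compose(Fun1<A, T> g) {
            return a -> apply(g.apply(a));
        }

        default <A> Fun1<T, A> andThen(Fun1<R, A> g) {
            return t -> g.apply(apply(t));
        }
    }

    public static int incThenDouble(int x) {
        Fun1<Integer, Integer> inc = n -> n + 1;
        Fun1<Integer, Integer> dbl = n -> n * 2;
        return inc.andThen(dbl).apply(x);
    }

    public static int incAfterDouble(int x) {
        Fun1<Integer, Integer> inc = n -> n + 1;
        Fun1<Integer, Integer> dbl = n -> n * 2;
        return inc.compose(dbl).apply(x);
    }

    public static void main(String[] args) {
        System.out.println(incThenDouble(3));  // (3 + 1) * 2 = 8
        System.out.println(incAfterDouble(3)); // (3 * 2) + 1 = 7
    }
}
```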
LambdaMetafactory allows a little slack between the signature of the lambda target and the abstract method being implemented; widening, casting, boxing, and unboxing operations are emitted in the generated class to join the two together.
There are two places where this doesn't work as needed for Scala: null and void.
Unit-returning functions in Scala currently have a generic bridge method that "boxes" void as BoxedUnit.
scala> class V extends scala.runtime.AbstractFunction1[String, Unit] { def apply(s: String): Unit = () }
defined class V
scala> :javap -c V
Compiled from "<console>"
public class V extends scala.runtime.AbstractFunction1<java.lang.String, scala.runtime.BoxedUnit> {
public void apply(java.lang.String);
Code:
0: return
public java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: checkcast #23 // class java/lang/String
5: invokevirtual #25 // Method apply:(Ljava/lang/String;)V
8: getstatic #31 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
11: areturn
public V();
Code:
0: aload_0
1: invokespecial #37 // Method scala/runtime/AbstractFunction1."<init>":()V
4: return
}
However, we are unable to emit:
class C {
  def anonfun$1(s: String): Unit = ()
  def test = indy(LMF, <anonfun$1>, (LObject;)LObject;)
}
This is because LambdaMetafactory doesn't know how to perform this boxing.
One solution is to introduce an intermediate functional interface, such as JProcedure1 (which extends JFunction1) from scala-java8-compat.
:javap -c scala.compat.java8.JProcedure1
Compiled from "JProcedure1.java"
public interface scala.compat.java8.JProcedure1<T1> extends scala.compat.java8.JFunction1<T1, scala.runtime.BoxedUnit> {
public void $init$();
Code:
0: return
public abstract void applyVoid(T1);
public scala.runtime.BoxedUnit apply(T1);
Code:
0: aload_0
1: aload_1
2: invokeinterface #1, 2 // InterfaceMethod applyVoid:(Ljava/lang/Object;)V
7: getstatic #2 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
10: areturn
public java.lang.Object apply(java.lang.Object);
Code:
0: aload_0
1: aload_1
2: invokeinterface #3, 2 // InterfaceMethod apply:(Ljava/lang/Object;)Lscala/runtime/BoxedUnit;
7: areturn
}
This leaves applyVoid abstract, and its return type exactly matches the lambda target method, so we don't require LMF to box.
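The same pattern can be written out in plain Java. This is only a sketch of the idea: Fun1, Proc1 and UnitValue are hypothetical stand-ins for JFunction1, JProcedure1 and scala.runtime.BoxedUnit, chosen so the example compiles without the Scala library.

```java
class ProcedureDemo {
    // Hypothetical stand-in for scala.runtime.BoxedUnit.
    public static final class UnitValue {
        public static final UnitValue UNIT = new UnitValue();

        private UnitValue() {}
    }

    // Hypothetical stand-in for the generic function interface.
    public interface Fun1<T, R> {
        R apply(T t);
    }

    // The only abstract method returns void, so a void lambda target matches
    // it exactly; the default apply supplies the boxed unit value.
    @FunctionalInterface
    public interface Proc1<T> extends Fun1<T, UnitValue> {
        void applyVoid(T t);

        @Override
        default UnitValue apply(T t) {
            applyVoid(t);
            return UnitValue.UNIT;
        }
    }

    public static Object run(StringBuilder log) {
        Proc1<String> proc = s -> log.append(s);  // implements applyVoid
        Fun1<String, UnitValue> asFunction = proc; // usable as a generic function
        return asFunction.apply("hi");
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Object result = run(log);
        System.out.println(result == UnitValue.UNIT); // true
        System.out.println(log);                      // hi
    }
}
```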
I had originally thought that this was the only material difference in type adaptation, until...
scala> val box: java.lang.Integer = null
box: Integer = null
scala> Int.unbox(box) // Scala's unboxing
res6: Int = 0
scala> box.intValue // Java's unboxing
java.lang.NullPointerException
... 34 elided
This difference means our semantics would change in:
scala> val i2i = (x: Int) => x + 1
i2i: Int => Int = <function1>
scala> i2i.asInstanceOf[AnyRef => Int]
res4: AnyRef => Int = <function1>
scala> res4(null) // would be an NPE under indylambda if we let LMF unbox
res5: Int = 1
I had identified early on that LMF would not be able to unbox value classes. I had worked around this by excluding functions that operated on value classes from the indylambda translation.
Fortunately, there is a solution here: we can always emit an additional method to bridge the signature of the lambda body method to the signature of the abstract method, and use that as the lambda target. (Or, just emit the lambda body with such a signature.)
This is a bit of a pain to implement as it means we'll have to get involved in erasure (and we're already touching uncurry, delambdafy and a little bit of specialization!).
We can deal with the differences in type adaptation semantics by:
- Being selective about whether or not to use indylambda (currently done for value classes)
- Using intermediate functional interfaces (currently done for void methods)
- Emitting the boxing/unboxing code ourselves, either
  - in the lambda target itself, or
  - in a bridging method.
I believe we should go with the last option to solve all of the boxing issues. Ideally we should only create this bridging method in cases where it is needed to preserve semantics; unneeded indirection is a pain in stack traces / debuggers.
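The effect of a bridging method can be observed from Java by linking the same body through LambdaMetafactory twice. This is a sketch under stated assumptions: UnboxBridgeDemo, inc, incAdapted and link are hypothetical names, and java.util.function.Function stands in for scala.Function1. The direct variant lets LMF insert the unboxing; the bridged variant does the unboxing itself, treating null as 0 in the way Int.unbox does.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class UnboxBridgeDemo {
    // The lambda body with its natural primitive signature.
    public static int inc(int x) {
        return x + 1;
    }

    // Bridging method: performs the boxing and unboxing itself, so that a
    // null argument is unboxed to 0 rather than throwing.
    public static Object incAdapted(Object x) {
        int unboxed = (x == null) ? 0 : ((Integer) x).intValue();
        return Integer.valueOf(inc(unboxed));
    }

    @SuppressWarnings("unchecked")
    static Function<Object, Object> link(String implName, MethodType implType, MethodType instantiated) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(UnboxBridgeDemo.class, implName, implType);
            CallSite site = LambdaMetafactory.metafactory(
                lookup, "apply",
                MethodType.methodType(Function.class),             // nothing captured
                MethodType.methodType(Object.class, Object.class), // erased Function.apply
                impl, instantiated);
            return (Function<Object, Object>) site.getTarget().invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // LMF adapts (Integer)Integer to (int)int and emits the unboxing itself.
    public static Function<Object, Object> direct() {
        return link("inc",
            MethodType.methodType(int.class, int.class),
            MethodType.methodType(Integer.class, Integer.class));
    }

    // Signatures match exactly, so LMF has nothing to adapt.
    public static Function<Object, Object> bridged() {
        return link("incAdapted",
            MethodType.methodType(Object.class, Object.class),
            MethodType.methodType(Object.class, Object.class));
    }

    public static void main(String[] args) {
        System.out.println(bridged().apply(null)); // 1
        try {
            direct().apply(null);
        } catch (NullPointerException e) {
            System.out.println("direct target: NullPointerException");
        }
    }
}
```

The cost is one extra frame in stack traces, which is why limiting the bridge to cases that need it matters.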
I was worried that it would be difficult to create the bridging method that accepts a boxed value class as a parameter, rather than its wrapped value. If we created this method in Delambdafy, we'd type check it with an ErasureTyper and it would be stripped back to the wrapped type. I thought we'd need to invent a way to influence erasure of value classes.
However, I think we can sidestep this problem by just using Object as the formal parameter type and casting. Here's what I have in mind; this should be implementable entirely within Delambdafy.
class V(val a: Int) extends AnyVal
// typer
class C {
  def test = (v: V, i: Int) => ""
}
// delambdafy
class C {
  def test = $anonfun$1$adapted(v, i).asInstanceOf[Function2[V, Int, String]]
  def $anonfun$1(v: V, i: Int) = ""
  // we only need to create this if value classes appear anywhere in the signature of `$anonfun$1` or
  // if it has any unspecialized primitive parameters.
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V], i.asInstanceOf[Int])
}
// erasure
class C {
  def test = $anonfun$1$adapted(v, i).asInstanceOf[Function2[V, Int, String]]
  def $anonfun$1(v: Int, i: Int) = ""
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V].a, i.asInstanceOf[Int])
}
// jvm
class C {
  def test = indy[LambdaMetafactory, $anonfun$1$adapted, "(LObject;LObject;)LObject;", "scala/Function2"]()
  def $anonfun$1(v: Int, i: Int) = ""
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V].a, i.asInstanceOf[Int])
}
- Compiler support: scala/scala#4501
  - Switch to using altMetafactory to synthesize serialization support in the generated lambdas
  - Add a $deserializeLambda$ method to lambda hosts to deserialize the serialization proxy, SerializedLambda.
- Library support: scala/scala-java8-compat#37
  - Forward to a generic deserializer that uses LambdaMetafactory in "user space", rather than via an invokedynamic call site.
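For comparison, this is how the same machinery surfaces in Java itself: when a lambda's target type is Serializable, javac links it through altMetafactory and adds a synthetic $deserializeLambda$ method to the enclosing class, and the serialized form is a java.lang.invoke.SerializedLambda proxy. The sketch below round-trips such a lambda; SerializableLambdaDemo and SerPredicate are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

class SerializableLambdaDemo {
    // A functional interface that is also Serializable; lambdas targeting
    // it are compiled with serialization support.
    public interface SerPredicate<T> extends Predicate<T>, Serializable {}

    public static SerPredicate<String> makePredicate(String s) {
        return str -> str.equals(s);
    }

    @SuppressWarnings("unchecked")
    public static SerPredicate<String> roundTrip(SerPredicate<String> p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p); // writes the SerializedLambda proxy
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                // Resolving the proxy calls $deserializeLambda$ on this class.
                return (SerPredicate<String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerPredicate<String> copy = roundTrip(makePredicate("foo"));
        System.out.println(copy.test("foo")); // true
        System.out.println(copy.test("bar")); // false
    }
}
```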
To support inlining, we'll need:
- Review, Test and Merge new optimizer: closure inlining
  - Update this to recognize invokedynamic <LambdaMetafactory implMethod FunctionN>(captures) as closure capture, rather than new anonFun$N(captures)
We shouldn't need GC Savvy Closures, as LambdaMetafactory takes care of this for any lambda that doesn't capture anything (that is, one that would have an empty constructor).
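This behaviour is easy to observe from Java. The sketch below assumes a current OpenJDK runtime, where a non-capturing lambda is instantiated once per call site and then reused; the language specification does not guarantee this identity, so treat it as an implementation detail. CaptureDemo and its methods are hypothetical names.

```java
import java.util.function.Supplier;

class CaptureDemo {
    // Captures nothing: the runtime is free to hand back one shared instance.
    public static Supplier<String> nonCapturing() {
        return () -> "constant";
    }

    // Captures s: each evaluation needs an object that holds the value.
    public static Supplier<String> capturing(String s) {
        return () -> s;
    }

    public static void main(String[] args) {
        // Observed on OpenJDK: the same instance is returned every time.
        System.out.println(nonCapturing() == nonCapturing());
        System.out.println(capturing("a").get());
    }
}
```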
This is a great summary @retronym. Thanks for putting this together.