@retronym
Last active February 5, 2022 10:47
indylambda: Putting invokedynamic to work for Scala

Java 8 introduced lambdas to the Java language. While the design choices differ in many regards from Scala's functions, the underlying mechanics used to represent Java lambdas are flexible enough to be used as a target for the Scala compiler.

Lambdas in Java

Java does not have a canonical hierarchy of generic function types (à la scala.FunctionN), but instead allows a lambda to be used as shorthand for an anonymous implementation of a functional interface.

Here's an example of creating a predicate that closes over one value:

public class Test {
	void test() {
		String s = "foo";
		java.util.function.Predicate<String> pred = (str) -> str.equals(s);
	}
}

This is compiled to:

// javac -d . sandbox/Test.java && javap -classpath . -v -private Test | subl
  void test();
         0: ldc           #2                  // String foo
         2: astore_1
         3: aload_1
         4: invokedynamic #3,  0              // InvokeDynamic #0:test:(Ljava/lang/String;)Ljava/util/function/Predicate;
         9: astore_2
        10: return

  private static boolean lambda$test$0(java.lang.String, java.lang.String);
         0: aload_1
         1: aload_0
         2: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
         5: ireturn
}

BootstrapMethods:
  0: #19 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #20 (Ljava/lang/Object;)Z
      #21 invokestatic Test.lambda$test$0:(Ljava/lang/String;Ljava/lang/String;)Z
      #22 (Ljava/lang/String;)Z

The first time the invokedynamic instruction in test is executed, the Java runtime calls the bootstrap method LambdaMetafactory.metafactory, asking it to create a class that implements the interface Predicate and whose implementation calls the lambda target method lambda$test$0.

LambdaMetafactory was designed as a public API for other language implementors to target.
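To make the bootstrap arguments above concrete, here is a minimal sketch that calls LambdaMetafactory.metafactory directly from ordinary Java code instead of through an invokedynamic call site. The class ManualLambda and the method body are hypothetical stand-ins for Test and lambda$test$0; the three MethodType arguments mirror the call-site type and the bootstrap method arguments #20 and #22 in the javap output.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Predicate;

final class ManualLambda {
    // Hypothetical stand-in for javac's synthetic lambda$test$0:
    // the captured value comes first, then the lambda's own argument.
    static boolean body(String captured, String str) {
        return str.equals(captured);
    }

    @SuppressWarnings("unchecked")
    static Predicate<String> capture(String s) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                ManualLambda.class, "body",
                MethodType.methodType(boolean.class, String.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "test",                                                // interface method name
                MethodType.methodType(Predicate.class, String.class),  // captures -> interface
                MethodType.methodType(boolean.class, Object.class),    // erased Predicate.test
                impl,                                                  // lambda target
                MethodType.methodType(boolean.class, String.class));   // instantiated signature
            // Invoking the call site target performs the capture.
            return (Predicate<String>) site.getTarget().invoke(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Calling ManualLambda.capture("foo") plays the role of the invokedynamic instruction: it links the factory and passes the captured value in one step.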

Lambdas in Scala 2.11-

class Test {
  def test = {
    val s = "foo"
    (x: String) => x == s
  }
}
// javap -classpath . -private -c Test
Compiled from "Test.scala"
public class Test {
  public scala.Function1<java.lang.String, java.lang.Object> test();
    Code:
       0: ldc           #12                 // String foo
       2: astore_1
       3: new           #14                 // class Test$$anonfun$test$1
       6: dup
       7: aload_0
       8: aload_1
       9: invokespecial #18                 // Method Test$$anonfun$test$1."<init>":(LTest;Ljava/lang/String;)V
      12: areturn

  public Test();
    Code:
       0: aload_0
       1: invokespecial #25                 // Method java/lang/Object."<init>":()V
       4: return
}

// javap -classpath . -private -c 'Test$$anonfun$test$1'
Compiled from "Test.scala"
public final class Test$$anonfun$test$1 extends scala.runtime.AbstractFunction1<java.lang.String, java.lang.Object> implements scala.Serializable {
  public static final long serialVersionUID;

  private final java.lang.String s$1;

  public final boolean apply(java.lang.String);
    Code:
       0: aload_1
       1: aload_0
       2: getfield      #23                 // Field s$1:Ljava/lang/String;
       5: astore_2
       6: dup
       7: ifnonnull     18
      10: pop
      11: aload_2
      12: ifnull        25
      15: goto          29
      18: aload_2
      19: invokevirtual #29                 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
      22: ifeq          29
      25: iconst_1
      26: goto          30
      29: iconst_0
      30: ireturn

  public final java.lang.Object apply(java.lang.Object);
    Code:
       0: aload_0
       1: aload_1
       2: checkcast     #34                 // class java/lang/String
       5: invokevirtual #37                 // Method apply:(Ljava/lang/String;)Z
       8: invokestatic  #43                 // Method scala/runtime/BoxesRunTime.boxToBoolean:(Z)Ljava/lang/Boolean;
      11: areturn

  public Test$$anonfun$test$1(Test, java.lang.String);
    Code:
       0: aload_0
       1: aload_2
       2: putfield      #23                 // Field s$1:Ljava/lang/String;
       5: aload_0
       6: invokespecial #50                 // Method scala/runtime/AbstractFunction1."<init>":()V
       9: return
}

The compiler eagerly creates a subclass of FunctionN, and lambda capture is simply an instantiation of this class. The body of the lambda is copied into the apply method of this class.

Lambdas in Scala 2.11 with -Ydelambdafy:method

// scalac -Ydelambdafy:method sandbox/Test.scala && javap -classpath . -private -c 'Test'
Compiled from "Test.scala"
public class Test {
  public scala.Function1<java.lang.String, java.lang.Object> test();
    Code:
       0: ldc           #12                 // String foo
       2: astore_1
       3: new           #14                 // class Test$lambda$$test$1
       6: dup
       7: aload_1
       8: invokespecial #18                 // Method Test$lambda$$test$1."<init>":(Ljava/lang/String;)V
      11: checkcast     #20                 // class scala/Function1
      14: areturn

  private static final boolean $anonfun$1(java.lang.String, java.lang.String);
    Code:
       0: aload_0
       1: aload_1
       2: astore_2
       3: dup
       4: ifnonnull     15
       7: pop
       8: aload_2
       9: ifnull        22
      12: goto          26
      15: aload_2
      16: invokevirtual #30                 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
      19: ifeq          26
      22: iconst_1
      23: goto          27
      26: iconst_0
      27: ireturn

  public Test();
    Code:
       0: aload_0
       1: invokespecial #37                 // Method java/lang/Object."<init>":()V
       4: return

  public static final boolean accessor$1(java.lang.String, java.lang.String);
    Code:
       0: aload_0
       1: aload_1
       2: invokestatic  #40                 // Method $anonfun$1:(Ljava/lang/String;Ljava/lang/String;)Z
       5: ireturn
}

// javap -classpath . -private -c 'Test$lambda$$test$1'
Compiled from "Test.scala"
public final class Test$lambda$$test$1 extends scala.runtime.AbstractFunction1 implements scala.Serializable {
  public static final long serialVersionUID;

  public java.lang.String s$2;

  public Test$lambda$$test$1(java.lang.String);
    Code:
       0: aload_0
       1: invokespecial #18                 // Method scala/runtime/AbstractFunction1."<init>":()V
       4: aload_0
       5: aload_1
       6: putfield      #20                 // Field s$2:Ljava/lang/String;
       9: return

  public final boolean apply(java.lang.String);
    Code:
       0: aload_1
       1: aload_0
       2: getfield      #20                 // Field s$2:Ljava/lang/String;
       5: invokestatic  #30                 // Method Test.accessor$1:(Ljava/lang/String;Ljava/lang/String;)Z
       8: ireturn

  public final java.lang.Object apply(java.lang.Object);
    Code:
       0: aload_0
       1: aload_1
       2: checkcast     #34                 // class java/lang/String
       5: invokevirtual #36                 // Method apply:(Ljava/lang/String;)Z
       8: invokestatic  #42                 // Method scala/runtime/BoxesRunTime.boxToBoolean:(Z)Ljava/lang/Boolean;
      11: areturn
}

This is quite similar; lambda capture is still the instantiation of an anonymous class. The class name is a little different, and its apply method no longer contains the lambda body, but rather delegates to a method in the enclosing class of the lambda, into which the body has been copied.
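As a rough Java analogue of this encoding (a sketch only, with hypothetical names, not what scalac emits), the lambda body lives in a static method of the enclosing class and the closure class merely stores the capture and forwards to it. Objects.equals is used here to approximate the null-safe == visible in the bytecode.

```java
import java.util.Objects;
import java.util.function.Function;

final class DelegatingClosure {
    // Lambda body lifted into a static method of the enclosing class,
    // comparable to $anonfun$1: the argument first, then the captured value.
    static boolean anonfun(String x, String s) {
        return Objects.equals(x, s);
    }

    // Thin closure class: holds the capture and forwards to the lifted body.
    static final class Lambda1 implements Function<String, Object> {
        private final String s;

        Lambda1(String s) {
            this.s = s;
        }

        @Override
        public Object apply(String x) {
            return anonfun(x, s); // the boolean result is boxed on return
        }
    }

    static Function<String, Object> test() {
        String s = "foo";
        return new Lambda1(s); // capture is still plain instantiation
    }
}
```

Once the body is a standalone method like anonfun, the closure class is pure boilerplate, which is what makes it a candidate for generation at link time.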

This is a stepping stone towards...

Lambdas in Scala 2.12 (or 2.11 with experimental options)

// topic/indylambda-emit-indy /code/scala2 qscalac -Ydelambdafy:method -Ybackend:GenBCode -target:jvm-1.8 sandbox/Test.scala && javap -classpath . -private -c 'Test'
Compiled from "Test.scala"
public class Test {
  public scala.Function1<java.lang.String, java.lang.Object> test();
    Code:
       0: ldc           #12                 // String foo
       2: astore_1
       3: aload_1
       4: invokedynamic #32,  0             // InvokeDynamic #0:apply:(Ljava/lang/String;)Lscala/compat/java8/JFunction1;
       9: checkcast     #34                 // class scala/Function1
      12: areturn

  public static final boolean Test$$$anonfun$1(java.lang.String, java.lang.String);
    Code:
       0: aload_1
       1: aload_0
       2: astore_2
       3: dup
       4: ifnull        10
       7: goto          18
      10: pop
      11: aload_2
      12: ifnull        28
      15: goto          32
      18: aload_2
      19: invokevirtual #42                 // Method java/lang/Object.equals:(Ljava/lang/Object;)Z
      22: ifne          28
      25: goto          32
      28: iconst_1
      29: goto          33
      32: iconst_0
      33: ireturn

  public Test();
    Code:
       0: aload_0
       1: invokespecial #50                 // Method java/lang/Object."<init>":()V
       4: return
}

BootstrapMethods:
  0: #19 invokestatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #21 (Ljava/lang/Object;)Ljava/lang/Object;
      #26 invokestatic Test.Test$$$anonfun$1:(Ljava/lang/String;Ljava/lang/String;)Z
      #28 (Ljava/lang/String;)Z

This looks a lot like the Java 8 encoding. The main point of difference is the use of the functional interface scala/compat/java8/JFunction1. The scala.FunctionN traits are not functional interfaces; they contain abstract alternatives of apply created by specialization. There are also a few concrete methods in the traits, which are usually filled in by mixin composition in the Scala compiler.

scala> classOf[Function1[_, _]].getMethods.filter(m => java.lang.reflect.Modifier.isAbstract(m.getModifiers)).map(_.getName)
res3: Array[String] = Array(toString, apply, apply$mcZD$sp, apply$mcDD$sp, apply$mcFD$sp, apply$mcID$sp, apply$mcJD$sp, apply$mcVD$sp, apply$mcZF$sp, apply$mcDF$sp, apply$mcFF$sp, apply$mcIF$sp, apply$mcJF$sp, apply$mcVF$sp, apply$mcZI$sp, apply$mcDI$sp, apply$mcFI$sp, apply$mcII$sp, apply$mcJI$sp, apply$mcVI$sp, apply$mcZJ$sp, apply$mcDJ$sp, apply$mcFJ$sp, apply$mcIJ$sp, apply$mcJJ$sp, apply$mcVJ$sp, compose, andThen)

scala> classOf[Function3[_, _, _, _]].getMethods.filter(m => java.lang.reflect.Modifier.isAbstract(m.getModifiers)).map(_.getName)
res4: Array[String] = Array(toString, apply, curried, tupled)

In Scala 2.12, we intend to use Default Methods in Java interfaces to make these types functional interfaces.
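The effect of default methods can be sketched in plain Java. Fn1 below is a hypothetical stand-in for a FunctionN-like type, not the real scala.Function1: because andThen and compose are default methods, only apply remains abstract, so the interface qualifies as a functional interface. The reflective count mirrors the REPL query above.

```java
import java.lang.reflect.Modifier;
import java.util.Arrays;

final class DefaultMethodsDemo {
    // Hypothetical FunctionN-like interface: one abstract method,
    // with the combinators supplied as default methods.
    @FunctionalInterface
    interface Fn1<T, R> {
        R apply(T t);

        default <A> Fn1<T, A> andThen(Fn1<? super R, ? extends A> g) {
            return t -> g.apply(apply(t));
        }

        default <A> Fn1<A, R> compose(Fn1<? super A, ? extends T> g) {
            return a -> apply(g.apply(a));
        }
    }

    // Counts the public abstract methods of a type via reflection.
    static long abstractMethodCount(Class<?> c) {
        return Arrays.stream(c.getMethods())
            .filter(m -> Modifier.isAbstract(m.getModifiers()))
            .count();
    }

    static int demo() {
        // Lambda-convertible because apply is the only abstract method.
        Fn1<String, Integer> length = String::length;
        return length.andThen(n -> n + 1).apply("foo");
    }
}
```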

In Scala 2.11 under experimental mode, we require an additional JAR on the classpath from scala-java8-compat.

Bridges and Boxes

LambdaMetafactory allows a little slack between the signature of the lambda target and the abstract method being implemented; widening, casting, boxing, and unboxing operations are emitted in the generated class to join the two together.
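The following sketch shows that slack from the Java side, assuming a hypothetical target method isBlank. The target takes a String and returns a primitive boolean, while the erased interface method is (Object)Object; the generated class casts the argument and boxes the result.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

final class AdaptationDemo {
    // Lambda target with a specific parameter type and a primitive result.
    static boolean isBlank(String s) {
        return s.trim().isEmpty();
    }

    @SuppressWarnings("unchecked")
    static Function<String, Boolean> link() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                AdaptationDemo.class, "isBlank",
                MethodType.methodType(boolean.class, String.class));
            CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "apply",
                MethodType.methodType(Function.class),                // no captured values
                MethodType.methodType(Object.class, Object.class),    // erased Function.apply
                impl,                                                 // (String)boolean
                MethodType.methodType(Boolean.class, String.class));  // instantiated signature
            return (Function<String, Boolean>) site.getTarget().invoke();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

This is the same adaptation that javac relies on for a method reference with a primitive-returning target assigned to a generic functional interface.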

There are two places where this doesn't work as needed for Scala: null and void.

Out of the void

Unit returning functions in Scala currently have a generic bridge method that "boxes" void as BoxedUnit.

scala> class V extends scala.runtime.AbstractFunction1[String, Unit] { def apply(s: String): Unit = () }
defined class V

scala> :javap -c V
Compiled from "<console>"
public class V extends scala.runtime.AbstractFunction1<java.lang.String, scala.runtime.BoxedUnit> {
  public void apply(java.lang.String);
    Code:
       0: return

  public java.lang.Object apply(java.lang.Object);
    Code:
       0: aload_0
       1: aload_1
       2: checkcast     #23                 // class java/lang/String
       5: invokevirtual #25                 // Method apply:(Ljava/lang/String;)V
       8: getstatic     #31                 // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      11: areturn

  public V();
    Code:
       0: aload_0
       1: invokespecial #37                 // Method scala/runtime/AbstractFunction1."<init>":()V
       4: return
}

However, we are unable to emit:

class C {
   def anonfun$1(s: String): Unit = ()
   def test = indy(LMF, <anonfun$1>, (LObject;)LObject;)
}   

This is because LambdaMetafactory doesn't know how to perform this boxing.
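The restriction can be observed from Java. In this sketch (hypothetical class and method names), a void-returning target is offered for an interface method that must return Object; the metafactory is expected to reject the combination with a LambdaConversionException when the call site is linked, since void cannot be adapted to a reference return type.

```java
import java.lang.invoke.LambdaConversionException;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

final class VoidBoxingDemo {
    // Lambda target that returns void, like a lifted Unit-returning body.
    static void consume(String s) {
    }

    // Returns true if LambdaMetafactory refuses to link a void target
    // against an Object-returning functional interface method.
    static boolean linkageRejected() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                VoidBoxingDemo.class, "consume",
                MethodType.methodType(void.class, String.class));
            LambdaMetafactory.metafactory(
                lookup,
                "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                impl,
                MethodType.methodType(Object.class, String.class));
            return false;
        } catch (LambdaConversionException e) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```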

One solution is to introduce an intermediate functional interface, such as JProcedure1 from scala-java8-compat.

:javap -c scala.compat.java8.JProcedure1
Compiled from "JProcedure1.java"
public interface scala.compat.java8.JProcedure1<T1> extends scala.compat.java8.JFunction1<T1, scala.runtime.BoxedUnit> {
  public void $init$();
    Code:
       0: return

  public abstract void applyVoid(T1);

  public scala.runtime.BoxedUnit apply(T1);
    Code:
       0: aload_0
       1: aload_1
       2: invokeinterface #1,  2            // InterfaceMethod applyVoid:(Ljava/lang/Object;)V
       7: getstatic     #2                  // Field scala/runtime/BoxedUnit.UNIT:Lscala/runtime/BoxedUnit;
      10: areturn

  public java.lang.Object apply(java.lang.Object);
    Code:
       0: aload_0
       1: aload_1
       2: invokeinterface #3,  2            // InterfaceMethod apply:(Ljava/lang/Object;)Lscala/runtime/BoxedUnit;
       7: areturn
}

This leaves applyVoid abstract, and its return type exactly matches the lambda target method, so we don't require LMF to box.
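The same idea can be sketched in plain Java. Procedure1 and UNIT below are hypothetical analogues of JProcedure1 and BoxedUnit.UNIT, with java.util.function.Function standing in for the generic function type: the only abstract method returns void, and a default method supplies the unit value.

```java
import java.util.function.Function;

final class ProcedureDemo {
    // Hypothetical stand-in for scala.runtime.BoxedUnit.UNIT.
    static final Object UNIT = new Object();

    // Hypothetical analogue of JProcedure1: applyVoid is the single abstract
    // method, and the generic apply is a default method that returns UNIT.
    @FunctionalInterface
    interface Procedure1<T> extends Function<T, Object> {
        void applyVoid(T t);

        @Override
        default Object apply(T t) {
            applyVoid(t);
            return UNIT;
        }
    }

    static Object demo(StringBuilder sink) {
        // The lambda is linked against applyVoid, so no boxing of void is needed.
        Procedure1<String> p = s -> sink.append(s);
        Function<String, Object> f = p;
        return f.apply("hi");
    }
}
```

Callers that only know the generic interface still receive a unit value, while the metafactory only ever sees a void-to-void match.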

I had originally thought that this was the only material difference in type adaptation, until...

Null boxes

scala> val box: java.lang.Integer = null
box: Integer = null

scala> Int.unbox(box) // Scala's unboxing
res6: Int = 0

scala> box.intValue  // Java's unboxing
java.lang.NullPointerException
  ... 34 elided

This difference means our semantics would change in:

scala> val i2i = (x: Int) => x + 1
i2i: Int => Int = <function1>

scala> i2i.asInstanceOf[AnyRef => Int]
res4: AnyRef => Int = <function1>

scala> res4(null) // would be an NPE under indylambda if we let LMF unbox
res5: Int = 1
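For comparison, here is what Java-style unboxing looks like when the metafactory performs the adaptation. This is a sketch with hypothetical names: a method reference to an (int)int target is used as a Function<Integer, Integer>, so the generated class unboxes the argument itself.

```java
import java.util.function.Function;

final class NullUnboxDemo {
    static int inc(int x) {
        return x + 1;
    }

    // Returns true if applying the function to null throws NullPointerException.
    static boolean throwsOnNull() {
        // The (int)int target is adapted to (Integer)Integer at link time,
        // so the argument is unboxed with Java semantics.
        Function<Integer, Integer> f = NullUnboxDemo::inc;
        try {
            f.apply(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```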

Value Classes

I had identified early on that LMF would not be able to unbox value classes. I had worked around this by excluding functions that operated on Value Classes from the indylambda translation.

DIY boxing

Fortunately, there is a solution here: we can always emit an additional method to bridge the signature of the lambda body method to the signature of the abstract method, and use that as the lambda target. (Or, just emit the lambda body with such a signature.)

This is a bit of a pain to implement as it means we'll have to get involved in erasure (and we're already touching uncurry, delambdafy and a little bit of specialization!)
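A Java sketch of such a bridge, using hypothetical names: incAdapted has the fully erased (Object)Object signature and does its own unboxing, treating null as 0 to model the Int.unbox behaviour shown in the REPL session above. Since the bridge already matches the interface method, nothing is left for the metafactory to adapt.

```java
import java.util.function.Function;

final class BridgeDemo {
    // The lambda body with its natural primitive signature.
    static int inc(int x) {
        return x + 1;
    }

    // Hand-written bridge with the erased signature. Unboxing null as 0 is an
    // assumption modelled on Scala's Int.unbox, not something the JDK does.
    static Object incAdapted(Object x) {
        int unboxed = (x == null) ? 0 : ((Integer) x).intValue();
        return Integer.valueOf(inc(unboxed));
    }

    static Function<Object, Object> link() {
        // The bridge, not inc, is used as the lambda target.
        return BridgeDemo::incAdapted;
    }
}
```

The cost is one extra frame between the functional interface and the body, which is why the bridge is worth emitting only when the signatures actually differ.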

Summary

We can deal with the differences in type adaptation semantics by:

  • Being selective about whether or not to use indylambda (currently done for value classes)
  • Using intermediate functional interfaces (currently done for void methods)
  • Emitting the box/unboxing code ourselves, either
    • in the lambda target itself, or
    • in a bridging method.

I believe we should go with the last option to solve all of the boxing issues. Ideally we should only create this bridging method in cases where it is needed to preserve semantics; unneeded indirection is a pain in stack traces / debuggers.

DIY Boxing implementation

I was worried that it would be difficult to create the bridging method that accepts a boxed value class as a parameter, rather than its wrapped value. If we created this method in Delambdafy, we'd type check it with an ErasureTyper and it would be stripped back to the wrapped type. I thought we'd need to invent a way to influence erasure of value classes.

However, I think we can sidestep this problem by just using Object as the formal parameter type and casting. Here's what I have in mind; this should be implementable entirely within Delambdafy.

class V(val a: Int) extends AnyVal

// typer
class C {
  def test = (v: V, i: Int) => ""
}

// delambdafy
class C {
  def test = $anonfun$1$adapted(v, i).asInstanceOf[Function2[V, Int, String]]

  def $anonfun$1(v: V, i: Int) = ""
  // we only need to create this if value classes appear anywhere in the signature of `$anonfun$1` or
  // if it has any unspecialized primitive parameters.
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V], i.asInstanceOf[Int])
}

// erasure
class C {
  def test = $anonfun$1$adapted(v, i).asInstanceOf[Function2[V, Int, String]]

  def $anonfun$1(v: Int, i: Int) = ""
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V].a, i.asInstanceOf[Int])
}

// jvm
class C {
  def test = indy[LambdaMetafactory, $anonfun$1$adapted, "(LObject;LObject;)LObject;", "scala/Function2"]()

  def $anonfun$1(v: Int, i: Int) = ""
  def $anonfun$1$adapted(v: Object, i: Integer) =
    $anonfun$1(v.asInstanceOf[V].a, i.asInstanceOf[Int])
}

Serialization

  • Compiler support: scala/scala#4501
    • Switch to using altMetafactory to synthesize serialization support in the generated lambdas
    • Add a $deserializeLambda$ method to lambda hosts to deserialize the serialization proxy, SerializedLambda.
  • Library support: scala/scala-java8-compat#37
    • Forward to a generic deserializer that uses LambdaMetafactory in "user space", rather than via an invokedynamic call site.
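For reference, this is how the same machinery looks from Java (a sketch with a hypothetical class name). The intersection cast asks javac for a serializable lambda; javac then links it through altMetafactory and adds a $deserializeLambda$ method to the host class, which is used when the SerializedLambda proxy is read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Predicate;

final class SerializableLambdaDemo {
    @SuppressWarnings("unchecked")
    static Predicate<String> roundTrip(String s) {
        // The intersection cast makes the lambda serializable.
        Predicate<String> pred =
            (Predicate<String> & Serializable) str -> str.equals(s);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                // Written as a java.lang.invoke.SerializedLambda proxy.
                out.writeObject(pred);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                // The proxy is resolved through the host's $deserializeLambda$.
                return (Predicate<String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The captured value travels inside the proxy, so the deserialized predicate behaves like the original.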

Optimizer

To support inlining, we'll need:

  • Review, Test and Merge new optimizer: closure inlining
  • Update this to recognize invokedynamic <LambdaMetafactory implMethod FunctionN>(captures) as closure capture, rather than new anonFun$N(captures)

We shouldn't need GC Savvy Closures, as LambdaMetafactory takes care of this for any lambda that doesn't capture anything (i.e. one that would have an empty constructor).

Default Methods for traits

See scala/scala-dev#35

@jvican
jvican commented Apr 8, 2017

This is a great summary @retronym. Thanks for putting this together.