I attempted to implement a String representation in our Wasm backend using an i16 array with WasmGC. However, we cannot simply switch the String representation from a JS String to an i16 array, as we still rely on JS Strings for JS interoperation. The plan is to allow these two String representations to coexist, using i16 arrays wherever possible while retaining JS Strings where necessary. This way, we should be able to keep the test suites passing, start using i16 arrays, and gradually remove JS interoperation.
To achieve this, we need to convert between JS Strings and i16 array Strings. The main idea is to
- convert an i16 array String to a JS String when upcasting String into Any,
- convert a JS String into an i16 array when downcasting from Any to String, and
- convert between JS Strings and i16 arrays when passing Strings between non-JS and JS classes (because we keep using JS Strings inside the JS classes), for example on a Select on a JS class from a non-JS class, and vice versa.
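The first two rules can be sketched in plain Scala. This is only a simplified model and not the backend code: an i16 array is modeled as Array[Char], a JS string as java.lang.String, and the names toJSString, toWasmString, upcastToAny and downcastToString are hypothetical. Null handling is left out.

```scala
// Simplified model of the boundary conversions (assumption: Array[Char]
// stands in for the i16 array, java.lang.String for the JS string).
object StringBoundarySketch {
  type WasmStr = Array[Char]

  // Hypothetical stand-ins for the conversion helpers.
  def toJSString(s: WasmStr): String = new String(s)
  def toWasmString(s: String): WasmStr = s.toCharArray

  // Upcasting String to Any: the value leaves as a JS string.
  def upcastToAny(s: WasmStr): Any = toJSString(s)

  // Downcasting Any to String: accept either representation.
  def downcastToString(x: Any): WasmStr = x match {
    case arr: Array[Char] => arr              // already an i16 array
    case js: String       => toWasmString(js) // JS string coming from Any
    case other =>
      throw new ClassCastException(s"$other is not a String")
  }
}
```

In this model, a round trip through Any gives back an array with the same characters, but not the same array instance.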
I tried to implement this approach, and most test suites passed, but I couldn't get all of them to be green.
Here's what I did and what isn't working:
First of all, we have two different TypeTransformers:
- JSTypeTransformer, which uses JSString ((null)? any) for Strings.
- WasmTypeTransformer, which uses (ref (null)? (array (mut i16))) for Strings.
// TypeTransformer.scala
object JSTypeTransformer extends TypeTransformer {
  override val useWasmString: Boolean = false
  override val stringType: Types.Type = watpe.RefType.any
  override val boxedStringType: Types.Type = watpe.RefType.anyref
}
object WasmTypeTransformer extends TypeTransformer {
  override val useWasmString: Boolean = true
  override val stringType: Types.Type = watpe.RefType(genTypeID.i16Array)
  override val boxedStringType: Types.Type = watpe.RefType.nullable(genTypeID.i16Array)
}
We use JSTypeTransformer in JSClasses, and WasmTypeTransformer for others.
// ClassEmitter.scala
val typeTransformer =
  if (clazz.kind.isJSType) TypeTransformer.JSTypeTransformer
  else TypeTransformer.WasmTypeTransformer
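The transformType method maps StringType and ClassType(BoxedStringClass) to the transformer's stringType and boxedStringType: any and anyref in JSTypeTransformer, and the i16 array reference types in WasmTypeTransformer.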
def transformType(tpe: Type)(implicit ctx: WasmContext): watpe.Type = {
  tpe match {
    case AnyType => watpe.RefType.anyref
    case ClassType(className) if className == BoxedStringClass => boxedStringType
    case ClassType(className) => transformClassType(className)
    case StringType => stringType
    case UndefType => watpe.RefType.any
    // ...
When passing a String from a non-JS class to a JS class, we need to convert the i16 array String into a JS String, and vice versa. We also need to convert the JS String back to an i16 array String when returning a String to a non-JS class.
I defined the methods genAdaptArgString and genAdaptResultString. These methods generate the necessary transformations between JS Strings and i16 arrays. We generate these conversions on the caller side.
private def genAdaptArgString(
    paramType: Type,
    callerUsesWasmStr: Boolean,
    calleeUsesWasmStr: Boolean,
    argNullable: Boolean = true // make it false when callee knows it's not nullable
): Unit = {
  if (paramType == StringType || paramType == ClassType(BoxedStringClass)) {
    val nullable = (paramType == ClassType(BoxedStringClass) && argNullable)
    // if caller uses Wasm string (i16 array), and callee doesn't,
    // transform i16 array to JS string
    if (callerUsesWasmStr && !calleeUsesWasmStr) {
      if (nullable) fb += wa.Call(genFunctionID.createJSStringFromArrayNullable)
      else fb += wa.Call(genFunctionID.createJSStringFromArray)
    // if caller doesn't use Wasm string (i16 array), and callee does,
    // transform JS string into i16 array
    } else if (!callerUsesWasmStr && calleeUsesWasmStr) {
      if (nullable) fb += wa.Call(genFunctionID.createArrayFromJSStringNullable)
      else fb += wa.Call(genFunctionID.createArrayFromJSString)
    }
  }
}
// genAdaptResultString is almost the same; it performs the same conversions in the opposite direction
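The decision made by genAdaptArgString can be summarized as a small pure function. This is a sketch for illustration only: AdaptSketch, Conversion, adaptArg and adaptResult are hypothetical names, the helper names are returned as plain strings instead of being emitted as Call instructions, and adaptResult assumes that "the opposite direction" means swapping the caller and callee roles.

```scala
// Sketch of the caller-side adaptation decision (hypothetical model).
object AdaptSketch {
  sealed trait Conversion
  case object NoConversion extends Conversion
  final case class CallHelper(name: String) extends Conversion

  // isStringLike: the parameter type is StringType or the boxed String class.
  def adaptArg(isStringLike: Boolean, callerUsesWasmStr: Boolean,
      calleeUsesWasmStr: Boolean, nullable: Boolean): Conversion = {
    if (!isStringLike || callerUsesWasmStr == calleeUsesWasmStr) {
      NoConversion // same representation on both sides
    } else if (callerUsesWasmStr) {
      // i16 array -> JS string
      CallHelper(
        if (nullable) "createJSStringFromArrayNullable"
        else "createJSStringFromArray")
    } else {
      // JS string -> i16 array
      CallHelper(
        if (nullable) "createArrayFromJSStringNullable"
        else "createArrayFromJSString")
    }
  }

  // Assumption: a result flows from callee to caller, so the roles swap.
  def adaptResult(isStringLike: Boolean, callerUsesWasmStr: Boolean,
      calleeUsesWasmStr: Boolean, nullable: Boolean): Conversion = {
    adaptArg(isStringLike, callerUsesWasmStr = calleeUsesWasmStr,
        calleeUsesWasmStr = callerUsesWasmStr, nullable = nullable)
  }
}
```

Only the two mixed cases produce a conversion; when both sides use the same representation, nothing is emitted.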
For example, genArgs and genReceiverNotNull call genAdaptArgString as follows:
private def genArgs(args: List[Tree], methodName: MethodName, receiverClassKind: ClassKind)(
    implicit typeTransformer: TypeTransformer): Unit = {
  for ((arg, paramTypeRef) <- args.zip(methodName.paramTypeRefs)) {
    val paramType = ctx.inferTypeFromTypeRef(paramTypeRef)
    genTree(arg, paramType)
    genAdaptArgString(
      paramType,
      // the caller uses i16Array strings iff `typeTransformer.useWasmString` is true
      callerUsesWasmStr = typeTransformer.useWasmString,
      // the callee uses i16Array strings iff the receiver's class kind is NOT a JSType
      calleeUsesWasmStr = !receiverClassKind.isJSType
    )
  }
}
def genReceiverNotNull(): Unit = {
  genTreeAuto(receiver)
  fb += wa.RefAsNonNull
  genAdaptArgString(
    receiver.tpe,
    typeTransformer.useWasmString,
    !receiverClassInfo.kind.isJSType,
    argNullable = false
  )
}
We execute genAdaptResultString after Call (or CallRef).
We convert the i16 array string to a JavaScript string when upcasting a String to AnyType, only when the surrounding class is not a JavaScript class. Inside a JavaScript class, the string is already a JavaScript string, so upcasting is a no-op.
I added the conversion when the generatedType is CharSequence. In this case, the stack should contain either an i16 array or an instance of CharSequence. For the former, a conversion is necessary; for the latter, no conversion is needed, as the underlying content will be handled elsewhere (???)
// genAdapt
case (ClassType(CharSequenceClass), AnyType) if typeTransformer.useWasmString =>
  // should be either an instance of `CharSequence` or `i16Array`
  // if it's i16Array -> convert to JS string
  // if it's an instance of `CharSequence` -> no-op
  val receiver = addSyntheticLocal(watpe.RefType.anyref)
  fb += wa.LocalSet(receiver)
  fb.block(watpe.RefType.anyref) { labelDone =>
    fb.block(watpe.RefType.anyref) { labelNotOurObject =>
      fb += wa.LocalGet(receiver)
      fb += wa.BrOnCastFail(
        labelNotOurObject,
        watpe.RefType.anyref,
        watpe.RefType(genTypeID.ObjectStruct)
      )
      fb += wa.Br(labelDone)
    } // end of labelNotOurObject
    // otherwise, it should be i16Array
    fb += wa.RefCast(watpe.RefType.nullable(genTypeID.i16Array))
    fb += wa.Call(genFunctionID.createJSStringFromArrayNullable)
  }
In genAsInstanceOf:
val isDownCastAnyToString: Boolean =
  (sourceWasmType, targetWasmType) match {
    case (watpe.RefType(_, sourceHeapType),
        watpe.RefType(_, targetHeapType))
        if sourceHeapType == watpe.HeapType.Any &&
          targetHeapType == watpe.HeapType(genTypeID.i16Array) =>
      true
    case _ => false
  }
// ...
} else if (isDownCastAnyToString && typeTransformer.useWasmString) {
  fb.block(targetWasmType) { foo =>
    genTreeAuto(expr)
    // In case the receiver value is already an i16Array
    fb += wa.BrOnCast(
      foo,
      watpe.RefType.anyref,
      watpe.RefType.nullable(genTypeID.i16Array)
    )
    fb += wa.Call(genFunctionID.createArrayFromJSStringNullable)
  }
  fb += wa.RefCast(watpe.RefType.nullable(genTypeID.i16Array))
Also, in genUnbox, if the surrounding type uses i16 array strings (but I'm not sure this is right...):
case StringType =>
  fb += wa.RefAsNonNull
  fb += wa.Call(genFunctionID.jsValueToString) // for `undefined`
  if (typeTransformer.useWasmString) {
    fb += wa.Call(genFunctionID.createArrayFromJSString)
  }
and in ArraySelect:
/* If it is a reference array type whose element type does not translate
 * to `anyref`, we must cast down the result.
 */
// ...
case refType @ watpe.RefType(nullable, heapType) if
    typeTransformer.useWasmString &&
    heapType == watpe.HeapType(genTypeID.i16Array) =>
  if (nullable) fb += wa.Call(genFunctionID.createArrayFromJSStringNullable)
  else fb += wa.Call(genFunctionID.createArrayFromJSString)
case refType: watpe.RefType =>
  fb += wa.RefCast(refType)
// ...
Checking if it's an i16 array
Added a block around notOurObject because, if the receiver is not an instance of j.l.Object, it can be either a JS value or an i16 array.
if (typeTransformer.useWasmString) {
  fb += wa.ArrayLen
} else {
  fb += wa.Call(genFunctionID.stringLength)
}
if (typeTransformer.useWasmString) {
  fb += wa.ArrayGetU(genTypeID.i16Array)
} else {
  fb += wa.Call(genFunctionID.stringCharAt)
}
Just use array.len and array.get_u.
// genEq
if (typeTransformer.useWasmString) {
  fb += wa.Call(genFunctionID.equals)
} else {
  fb += wa.Call(genFunctionID.is)
}
Defined an equals function that takes two anyref values and
- if both are i16Array, checks the deep equality of the two arrays,
- if neither is an i16Array, calls is,
- otherwise, returns false.
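The three cases above can be modeled in plain Scala. This is a sketch rather than the Wasm helper itself: Array[Char] stands in for the i16 array, the name stringAwareEquals is hypothetical, and the existing is helper is passed in as a parameter instead of being reimplemented.

```scala
// Sketch of the three-way equality check (hypothetical model).
object EqualsSketch {
  // `is` stands in for the existing helper used for non-string values.
  def stringAwareEquals(a: Any, b: Any)(is: (Any, Any) => Boolean): Boolean =
    (a, b) match {
      // both are i16 arrays: compare element by element
      case (x: Array[Char], y: Array[Char]) => java.util.Arrays.equals(x, y)
      // exactly one side is an i16 array: never equal
      case (_: Array[Char], _) | (_, _: Array[Char]) => false
      // neither side is an i16 array: defer to `is`
      case _ => is(a, b)
    }
}
```

Note that under this definition an i16 array and a JS string with the same characters compare unequal, so the check only gives the expected answer when both operands have already been brought to the same representation.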
if (typeTransformer.useWasmString) fb += wa.Call(genFunctionID.wasmStringConcat)
else fb += wa.Call(genFunctionID.stringConcat)
toString
if (typeTransformer.useWasmString && receiverClassName == CharSequenceClass) {
  // do nothing
  fb += wa.RefCast(watpe.RefType(genTypeID.i16Array)) // ???
} else {
  fb += wa.Call(genFunctionID.jsValueToString)
  if (typeTransformer.useWasmString) fb += wa.Call(genFunctionID.createArrayFromJSString)
}
If the receiverClass is CharSequence and the runtime type is not one of our Objects, it should be an i16Array, so we just ref.cast.
genLiteral
case StringLiteral(v) =>
  fb ++= ctx.stringPool.getConstantStringInstr(v)
  if (typeTransformer.useWasmString) fb += wa.Call(genFunctionID.createArrayFromJSString)
(This is redundant: getConstantStringInstr creates an i16Array and then converts it to a JS string, and createArrayFromJSString converts it back to an array. I don't care about this in the prototype.)
class ScalaClassContainerWithObject(xxx: String) {
  object InnerJSObject extends js.Object with TestInterface {
    val zzz: String = xxx + "zzz"
    def foo(a: String): String = xxx + "zzz" + a
  }
  def makeLocalJSObject(yyy: String): TestInterface = {
    object LocalJSObject extends js.Object with TestInterface {
      val zzz: String = xxx + yyy
      def foo(a: String): String = xxx + yyy + a
    }
    LocalJSObject
  }
}
In def makeLocalJSObject(yyy: String): TestInterface, yyy will be an i16 array because makeLocalJSObject is a member of ScalaClassContainerWithObject, which is a non-JS class.
On the other hand, when yyy is referenced from foo in LocalJSObject, it is expected to be a JS string.
We need to find a way to convert between the i16 array and a JS string when accessing the captured values.
I'm not sure what the root cause is, but the following check fails:
val a1 = Array[String]("a", "s", "d", "f")
val a2 = "asdf".split("")
java.util.Arrays.deepEquals(a1, a2) // -> false (should be true)
It seems that the elements of a1 become JS strings while the elements of a2 are i16 arrays. Maybe we should somehow suppress the conversion to JS strings when constructing the Array?
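If that guess is right, the failure is what you would expect from comparing two arrays whose elements use different representations. The sketch below reproduces the effect in plain Scala on the JVM, again modeling JS strings as java.lang.String and i16 arrays as Array[Char]; it is only an analogy and says nothing about where the backend actually introduces the mismatch. DeepEqualsSketch and its members are hypothetical names.

```scala
// JVM analogy of the suspected mismatch (hypothetical model).
object DeepEqualsSketch {
  // Elements kept as "JS strings".
  val asJSStrings: Array[AnyRef] = Array("a", "s", "d", "f")

  // Elements kept as "i16 arrays"; a fresh array is built on each call.
  def asI16Arrays: Array[AnyRef] =
    Array("a", "s", "d", "f").map(s => (s.toCharArray: AnyRef))

  // Mixed representations: a String is never equal to a char array.
  val mixed: Boolean = java.util.Arrays.deepEquals(asJSStrings, asI16Arrays)

  // Same representation on both sides: elements are compared by content.
  val uniform: Boolean = java.util.Arrays.deepEquals(asI16Arrays, asI16Arrays)
}
```

In this model, mixed is false and uniform is true, which matches the observed result when a1 and a2 end up holding different String representations.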