@jhugman
Created July 28, 2021 18:55

Tool specific Intermediate Representation

The Intermediate Representation (IR) resolves to a tree of Descriptors, e.g.:

  • EnumDescriptor, which has EnumVariantDescriptors, which may have FieldDescriptors.
  • RecordDescriptor, which has FieldDescriptors.
  • ObjectDescriptor, which has MethodDescriptors, which may have ArgDescriptors.

These represent the concrete types and syntactic structures within those types.

struct EnumDescriptor {
	name: String,
	variants: Vec<EnumVariantDescriptor>,
}

Some of these descriptors will point to other concrete types, e.g.:

struct FieldDescriptor {
	field_name: String,
	type_: TypeIdentifier,
	default: Option<Value>,
}

These descriptors are shared between all backends and serialize to the IR.
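As a minimal sketch of how such a tree might hang together, the following builds a small enum descriptor and walks it to find the types its fields point to. The shapes of TypeIdentifier, Value and EnumVariantDescriptor are not spelled out in this proposal, so the definitions here are assumptions for illustration; serialization (for example with serde derives) is left out to keep the sketch dependency-free.

```rust
// Assumed shapes: the proposal only shows EnumDescriptor and FieldDescriptor.
#[derive(Debug, Clone, PartialEq)]
enum TypeIdentifier {
    String,
    Int32,
    Enum(String), // points to another concrete type by name
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

#[derive(Debug, Clone)]
struct FieldDescriptor {
    field_name: String,
    type_: TypeIdentifier,
    default: Option<Value>,
}

#[derive(Debug, Clone)]
struct EnumVariantDescriptor {
    name: String,
    fields: Vec<FieldDescriptor>,
}

#[derive(Debug, Clone)]
struct EnumDescriptor {
    name: String,
    variants: Vec<EnumVariantDescriptor>,
}

/// Collects every type that the enum's variant fields point to.
fn referenced_types(e: &EnumDescriptor) -> Vec<TypeIdentifier> {
    e.variants
        .iter()
        .flat_map(|v| v.fields.iter())
        .map(|f| f.type_.clone())
        .collect()
}

/// Builds a hypothetical tree for: enum Shape { Point, Circle { radius: Int32 = 1 } }
fn example_enum() -> EnumDescriptor {
    EnumDescriptor {
        name: "Shape".to_string(),
        variants: vec![
            EnumVariantDescriptor { name: "Point".to_string(), fields: vec![] },
            EnumVariantDescriptor {
                name: "Circle".to_string(),
                fields: vec![FieldDescriptor {
                    field_name: "radius".to_string(),
                    type_: TypeIdentifier::Int32,
                    default: Some(Value::Int(1)),
                }],
            },
        ],
    }
}
```

Calling referenced_types(&example_enum()) yields the single Int32 label from the Circle variant; a backend would hand each such label to its type_oracle.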

Backend-specific types wrap type descriptors

For the descriptors that represent types (e.g. Object, Enum etc), there exists a struct that wraps the descriptor, and gives access to the sub-descriptor.

struct KotlinEnum {
	inner: EnumDescriptor
}

impl KotlinEnum {
	fn variants(&self) -> Vec<EnumVariantDescriptor> {
		self.inner.variants()
	}
}

These structs implement the trait CodeType.

CodeType is a trait that emits foreign language code for specific tasks, mostly identifiers, and expressions (e.g. function calls into its own machinery).

Precisely what is needed depends on which tool this generator is part of. For example, uniffi needs lift and lower machinery.

It also knows how to generate all the code for its inner Descriptor with the fn render_declaration(&self).

impl CodeType for KotlinEnum {
	fn name(&self) -> String {
		self.inner.name().to_camel_case()
	}

	fn internals(&self) -> String {
		format!("Uniffi{}Internals", self.name())
	}

	fn literal(&self, v: Value) -> String {
		…
	}
	fn lower_into(&self, value: String, buffer: String) -> String {
		format!("{}.lowerInto({}, {})", self.internals(), value, buffer)
	}
	
	fn render_declaration(&self) -> Result<String> {
		EnumDecl(&self).render()
	}
}

A type_oracle knows how to map TypeIdentifiers to CodeTypes.

i.e. if a render_declaration() or another CodeType has a TypeIdentifier and the type_oracle, it can look up the CodeType and then be able to reference and manipulate it in the foreign language.
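A rough sketch of that lookup, under some assumptions: the CodeType trait is cut down to three methods, the TypeIdentifier variants are invented for illustration, and KotlinPrimitive, KotlinTypeOracle and render_field are hypothetical names rather than anything uniffi defines. The oracle is a single match expression that hands back a boxed CodeType.

```rust
// Assumed shape of a type label; the variants are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum TypeIdentifier {
    Int32,
    String,
    Enum(String),
}

// A cut-down CodeType: just enough to name a type and call its machinery.
trait CodeType {
    fn name(&self) -> String;

    fn internals(&self) -> String {
        format!("Uniffi{}Internals", self.name())
    }

    fn lower_into(&self, value: &str, buffer: &str) -> String {
        format!("{}.lowerInto({}, {})", self.internals(), value, buffer)
    }
}

// Hypothetical code type for built-in Kotlin types.
struct KotlinPrimitive(&'static str);

impl CodeType for KotlinPrimitive {
    fn name(&self) -> String {
        self.0.to_string()
    }
}

// Simplified stand-in for the KotlinEnum wrapper: it only keeps the name.
struct KotlinEnum {
    name: String,
}

impl CodeType for KotlinEnum {
    fn name(&self) -> String {
        self.name.clone()
    }
}

// The oracle: one place that maps type labels to code types.
struct KotlinTypeOracle;

impl KotlinTypeOracle {
    fn find(&self, id: &TypeIdentifier) -> Box<dyn CodeType> {
        match id {
            TypeIdentifier::Int32 => Box::new(KotlinPrimitive("Int")),
            TypeIdentifier::String => Box::new(KotlinPrimitive("String")),
            TypeIdentifier::Enum(name) => Box::new(KotlinEnum { name: name.clone() }),
        }
    }
}

/// Renders a Kotlin property using only a type label and the oracle.
fn render_field(oracle: &KotlinTypeOracle, field_name: &str, id: &TypeIdentifier) -> String {
    let code_type = oracle.find(id);
    format!("val {}: {}", field_name, code_type.name())
}
```

With this, render_field never needs to know which concrete CodeType it is dealing with; it only needs the label and the oracle.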

Aside: how far can we take these CodeTypes? Can CodeType contain TypeIdentifiers?

Since we can generate a declaration, and ways to call into it, I suspect we can:

  • support compound code types (for Option<T>, Array<T>, and Map<String, T>)
  • support the TransformTowers proposal
  • support the external types proposal
  • support code types for primitives (though a Rust macro may be needed for this).
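One way the compound case might look, as a sketch rather than a settled design: a compound CodeType stores the TypeIdentifier of its inner type and asks the oracle for the matching CodeType when it needs a name. The Optional and Sequence variants, the Kotlin spellings (T? and List<T>), and passing the oracle into name() are all assumptions made for this example.

```rust
// Assumed compound type labels; Box allows arbitrary nesting.
#[derive(Debug, Clone, PartialEq)]
enum TypeIdentifier {
    Int32,
    String,
    Optional(Box<TypeIdentifier>),
    Sequence(Box<TypeIdentifier>),
}

// In this sketch a CodeType receives the oracle so it can resolve inner types.
trait CodeType {
    fn name(&self, oracle: &TypeOracle) -> String;
}

struct Primitive(&'static str);

impl CodeType for Primitive {
    fn name(&self, _oracle: &TypeOracle) -> String {
        self.0.to_string()
    }
}

// A compound code type holds a TypeIdentifier, not another CodeType.
struct OptionalCodeType {
    inner: TypeIdentifier,
}

impl CodeType for OptionalCodeType {
    fn name(&self, oracle: &TypeOracle) -> String {
        format!("{}?", oracle.find(&self.inner).name(oracle))
    }
}

struct SequenceCodeType {
    inner: TypeIdentifier,
}

impl CodeType for SequenceCodeType {
    fn name(&self, oracle: &TypeOracle) -> String {
        format!("List<{}>", oracle.find(&self.inner).name(oracle))
    }
}

struct TypeOracle;

impl TypeOracle {
    fn find(&self, id: &TypeIdentifier) -> Box<dyn CodeType> {
        match id {
            TypeIdentifier::Int32 => Box::new(Primitive("Int")),
            TypeIdentifier::String => Box::new(Primitive("String")),
            TypeIdentifier::Optional(inner) => Box::new(OptionalCodeType {
                inner: (**inner).clone(),
            }),
            TypeIdentifier::Sequence(inner) => Box::new(SequenceCodeType {
                inner: (**inner).clone(),
            }),
        }
    }
}
```

Because each level resolves its inner label through the oracle, nesting such as an optional sequence of strings falls out of the recursion without any extra cases.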

Generating the declaration with the main templates

The render_declaration method is almost certainly calling into a template, which contains all the code needed to define the type, plus whatever internal/private machinery is required.

The template has access to the type_oracle.

We can stop here, and do all the above with askama. In askama land, we have one template per struct, so in this proposal this would be one template file per struct that implements CodeType.

However, rfk asked me for dreamcode.

I've used an rsx! macro and #[component] syntax, taken directly from the render crate, which itself implements something like JSX. I added the triple backticks.

Components (in JSX speak) are templates that take a set of arguments and render a representation of those arguments using strings or other components to render themselves.

#[component]
fn EnumDecl(type_: &KotlinEnum) -> Result<String> {
	let type_name = type_.name();
	rsx! ```
		public sealed class {{ type_name }} {
			<EnumVariants type_={{type_}} />
		}
		
		internal class {{ type_.internals() }} {
			static fun downOne(v: {{ type_name }}): Int = …
			static fun upOne(v: Int): {{ type_name }} = …
			static fun lowerInto(v: {{ type_name }}, buffer: RustBuffer) {
				…
			}
		}
	```
}

#[component]
fn EnumVariants(type_: &KotlinEnum) -> Result<String> {
	let type_name = type_.name();
	type_.variants().iter().map(|v| {
		if v.fields().len() == 0 {
			rsx! ```
				public object {{ v.name() }} : {{ type_name }}
			```
		} else {
			rsx! ```
				public class {{ v.name() }}(
					<FieldsDecl fields={{ v.fields() }} />
				) : {{ type_name }}
			```
		}
	}).join("\n")
}

#[component]
fn FieldsDecl(fields: &Vec<FieldDescriptor>) -> Result<String> {
	fields.iter().map(|f| {
		let name = f.name();
		let type_ = type_oracle.find(f.type_id())?;
		if let Some(default) = f.default_value() {
			rsx! ``` 
				val {{ name }}: {{ type_.name() }} = {{ type_.literal(default) }}
			```
		} else {
			rsx! ```
				val {{ name }}: {{ type_.name() }}
			```
		}

	}).join(",\n")
}

The interesting bits here are:

  • templates are composed of text and other templates.
  • Intra-template logic is Rust, instead of additional templating logic.
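For comparison, the same two properties can be approximated today without any macro, by treating each component as a plain Rust function that returns a String. This is only a sketch: the Field, Variant and KotlinEnum structs below are simplified stand-ins for the descriptors, and the function names are hypothetical.

```rust
// Simplified stand-ins for the descriptor-backed types in the proposal.
struct Field {
    name: String,
    type_name: String,
}

struct Variant {
    name: String,
    fields: Vec<Field>,
}

struct KotlinEnum {
    name: String,
    variants: Vec<Variant>,
}

/// Prefixes every line with four spaces.
fn indent(text: &str) -> String {
    text.lines()
        .map(|line| format!("    {}", line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Component: the constructor parameters of one variant.
fn fields_decl(fields: &[Field]) -> String {
    fields
        .iter()
        .map(|f| format!("val {}: {}", f.name, f.type_name))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Component: all variants; the branching is ordinary Rust.
fn enum_variants(type_: &KotlinEnum) -> String {
    type_
        .variants
        .iter()
        .map(|v| {
            if v.fields.is_empty() {
                format!("object {} : {}()", v.name, type_.name)
            } else {
                format!("class {}({}) : {}()", v.name, fields_decl(&v.fields), type_.name)
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Component: the whole declaration, composed from text and other components.
fn enum_decl(type_: &KotlinEnum) -> String {
    format!("sealed class {} {{\n{}\n}}", type_.name, indent(&enum_variants(type_)))
}
```

What the rsx! syntax would add on top of this is mostly readability: the Kotlin text stays visually dominant instead of being buried in format! strings and escaped braces.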

We did use macros in askama, but components are a somewhat more ergonomic unit of template reuse.

I don't know if render can be persuaded to do this, or if we have to write something ourselves, perhaps based on syn-rsx.

As an aside, a wishlist if we were to build our own rsx:

  • works on Rust Stable
  • Markdown triple backticks FTW
  • Trimming the indent so it matches the indent of the rsx! token, or kotlin's trimIndent
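The indent trimming could be approximated with a small helper along these lines. This is a sketch loosely modelled on Kotlin's trimIndent, not a faithful port: trim_indent is a hypothetical name, indentation is measured in bytes of leading whitespace, and whitespace-only lines are emitted as empty lines.

```rust
/// Removes the common leading indentation from all non-blank lines, and
/// drops the first and last lines if they are blank.
fn trim_indent(text: &str) -> String {
    let lines: Vec<&str> = text.lines().collect();

    // Smallest indentation (in bytes) among lines that contain text.
    let min_indent = lines
        .iter()
        .filter(|line| !line.trim().is_empty())
        .map(|line| line.len() - line.trim_start().len())
        .min()
        .unwrap_or(0);

    let mut out: Vec<&str> = lines
        .iter()
        .map(|&line| {
            if line.trim().is_empty() {
                ""
            } else {
                // Falls back to the untouched line if the cut is not on a char boundary.
                line.get(min_indent..).unwrap_or(line)
            }
        })
        .collect();

    if out.first().map_or(false, |line| line.is_empty()) {
        out.remove(0);
    }
    if out.last().map_or(false, |line| line.is_empty()) {
        out.pop();
    }
    out.join("\n")
}
```

A macro could apply this to the template body at compile time, so that the generated Kotlin is indented relative to the template rather than to the surrounding Rust.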
@jhugman

jhugman commented Jul 30, 2021

It's not obvious to me what the KotlinEnum wrapper is for. Would it work if we implemented a trait directly on the underlying struct?

I'm not stuck on a KotlinEnum wrapper that implements a CodeType, but I ended up there because of function name collisions from the SwiftCodeType and PythonCodeType traits.
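To illustrate the kind of collision meant here, the sketch below implements two backend traits with the same method name directly on one descriptor, and then shows the wrapper alternative. The optional_name method and the Swift and Python spellings are hypothetical, chosen only to make the two backends differ.

```rust
struct EnumDescriptor {
    name: String,
}

// Option 1: one trait per backend, implemented directly on the descriptor.
trait SwiftCodeType {
    fn optional_name(&self) -> String;
}

trait PythonCodeType {
    fn optional_name(&self) -> String;
}

impl SwiftCodeType for EnumDescriptor {
    fn optional_name(&self) -> String {
        format!("{}?", self.name)
    }
}

impl PythonCodeType for EnumDescriptor {
    fn optional_name(&self) -> String {
        format!("Optional[{}]", self.name)
    }
}

// With both traits in scope, `descriptor.optional_name()` does not compile
// (error E0034: multiple applicable items in scope), so every call has to be
// written in the qualified form, e.g. `SwiftCodeType::optional_name(&descriptor)`.

// Option 2: a single CodeType trait, implemented on per-backend wrappers.
trait CodeType {
    fn optional_name(&self) -> String;
}

struct SwiftEnum<'a> {
    inner: &'a EnumDescriptor,
}

struct PythonEnum<'a> {
    inner: &'a EnumDescriptor,
}

impl CodeType for SwiftEnum<'_> {
    fn optional_name(&self) -> String {
        format!("{}?", self.inner.name)
    }
}

impl CodeType for PythonEnum<'_> {
    fn optional_name(&self) -> String {
        format!("Optional[{}]", self.inner.name)
    }
}
```

With the wrappers, plain method-call syntax works again because each wrapper type implements only one trait that provides the method.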

@jhugman

jhugman commented Jul 30, 2021

I feel like maybe the point of this proposal is that it doesn't matter too much what a TypeIdentifier actually is, so long as you can map it to a CodeType implementing the necessary rendering functions, but I want to check my understanding.

Yes, I think you're right; it doesn't matter.

My thinking was that the TypeIdentifier was a serializable type label, used to label the types of args, properties, return types, etc. That points to an enum not unlike Type that can represent compound type labels.

For the purpose of this proposal, uniffi's type_oracle could well be a mega-match expression on Type::. By my count, this would reduce the number of match expressions on Type:: from 6 to 1 per binding.

I think this would be a net positive. Adding new types to an existing backend becomes almost straightforward.

For the purposes of the External Types and Transform Towers proposals, the type_oracle then becomes quite an interesting place to start.
