@nasser
Created June 13, 2019 21:14
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Compiler Hacking in `F#`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, load `Mono.Cecil`. It's the CIL code generation library we're going to be using."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"#r \"Mono.Cecil.dll\""
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"open Mono.Cecil\n",
"open Mono.Cecil.Cil\n",
"open System"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, create an `AssemblyDefinition`. This will store all our types and methods."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"let assembly =\n",
" AssemblyDefinition.CreateAssembly(new AssemblyNameDefinition(\"hello\", new Version(0, 0)), \"Hello\", ModuleKind.Dll)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"hello, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"assembly"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's create a type called `ExampleType` with a static method called `Main` that prints `\"Hello, World!\"`. Cecil is a faithful object-oriented representation of the CLR's standardized assembly format, so you have to populate its data structures somewhat manually."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The basic types are all module-specific. We're going to need `object` and `void` later so we grab them now."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"let objectType = assembly.MainModule.TypeSystem.Object\n",
"let voidType = assembly.MainModule.TypeSystem.Void"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"let exampleType = new TypeDefinition(\"Example\", \"ExampleType\", TypeAttributes.Public ||| TypeAttributes.Class, objectType)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Example.ExampleType"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"exampleType"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is what I mean by manually populating the data structures:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"assembly.MainModule.Types.Add(exampleType)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"let mainMethod = new MethodDefinition(\"Main\", MethodAttributes.Public ||| MethodAttributes.Static, voidType)\n",
"exampleType.Methods.Add(mainMethod)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"System.Void Example.ExampleType::Main()"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Methods have *bodies*, which is where their bytecode lives."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Mono.Cecil.Cil.MethodBody"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod.Body"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"seq []"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod.Body.Instructions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To print `\"Hello, World!\"` we first need to load that string onto the evaluation stack with the `ldstr` opcode."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"mainMethod.Body.Instructions.Add(Instruction.Create(OpCodes.Ldstr, \"Hello, World!\"))"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"seq [IL_0000: ldstr \"Hello, World!\"]"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod.Body.Instructions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To print that string we need to call the appropriate overload of `System.Console.WriteLine`. We need to *import* the reference to the method into the module we are compiling."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can find the `WriteLine` method using standard .NET reflection."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"let writeLineMethod = typeof<Console>.GetMethod(\"WriteLine\", [|typeof<String>|])"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Void WriteLine(System.String)"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"writeLineMethod"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This is a reference to the method in memory, which is not usable by Cecil as is. It needs a reference *to* the method *from* the module we are compiling."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"let writeLineMethodReference = assembly.MainModule.ImportReference(writeLineMethod)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"System.Void System.Console::WriteLine(System.String)"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"writeLineMethodReference"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we call it using the `call` opcode. The argument that will be passed to it is the string on the evaluation stack."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"mainMethod.Body.Instructions.Add(Instruction.Create(OpCodes.Call, writeLineMethodReference))"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"seq\n",
" [IL_0000: ldstr \"Hello, World!\";\n",
" IL_0000: call System.Void System.Console::WriteLine(System.String)]"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod.Body.Instructions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Starting to look like coherent bytecode! We just need to return from the method with the `ret` opcode and we're done."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"mainMethod.Body.Instructions.Add(Instruction.Create(OpCodes.Ret))"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"seq\n",
" [IL_0000: ldstr \"Hello, World!\";\n",
" IL_0000: call System.Void System.Console::WriteLine(System.String);\n",
" IL_0000: ret]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mainMethod.Body.Instructions"
]
},
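{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same three instructions can also be emitted through Cecil's `ILProcessor`, which saves the repeated `Instruction.Create` calls. A minimal sketch, assuming the `voidType` and `writeLineMethodReference` values defined in the cells above; `otherMethod` and `il` are hypothetical names used only for illustration, and the method is never added to `exampleType`, so the assembly built in this notebook is unchanged:\n",
"\n",
"```fsharp\n",
"// Hypothetical second method, used only to illustrate ILProcessor.\n",
"let otherMethod = new MethodDefinition(\"Other\", MethodAttributes.Public ||| MethodAttributes.Static, voidType)\n",
"let il = otherMethod.Body.GetILProcessor()\n",
"\n",
"// Each Emit call appends one instruction to the method body.\n",
"il.Emit(OpCodes.Ldstr, \"Hello, World!\")\n",
"il.Emit(OpCodes.Call, writeLineMethodReference)\n",
"il.Emit(OpCodes.Ret)\n",
"```"
]
},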
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now have our assembly, module, type, and method all ready to go. What's left is to *write it* into memory to make it usable. You can point Cecil at a stream in memory to get the bytes for the generated assembly, then load them into the runtime using reflection."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"hello, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"open System.IO\n",
"open System.Reflection\n",
"\n",
"let outStream = new MemoryStream()\n",
"assembly.Write(outStream)\n",
"Assembly.Load(outStream.ToArray())"
]
},
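{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loaded assembly can also be driven through reflection directly, without going through disk. A minimal sketch, assuming the `outStream` from the previous cell has already been written to; the names `loaded`, `loadedType`, and `loadedMain` are introduced here for illustration only:\n",
"\n",
"```fsharp\n",
"open System.Reflection\n",
"\n",
"// Load the generated bytes, then look up the type and method by name.\n",
"let loaded = Assembly.Load(outStream.ToArray())\n",
"let loadedType = loaded.GetType(\"Example.ExampleType\")\n",
"let loadedMain = loadedType.GetMethod(\"Main\")\n",
"\n",
"// Main is static and takes no arguments, so the target is null and the argument array is empty.\n",
"loadedMain.Invoke(null, [||]) |> ignore\n",
"```\n",
"\n",
"`ToArray` is used here because it copies only the bytes that were actually written to the stream."
]
},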
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also write the assembly to disk, which is easier for this notebook to consume."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"assembly.Write(\"assembly.dll\")"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"#r \"assembly.dll\""
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Hello, World!\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Example.ExampleType.Main()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "F#",
"language": "fsharp",
"name": "ifsharp"
},
"language": "fsharp",
"language_info": {
"codemirror_mode": "",
"file_extension": ".fs",
"mimetype": "text/x-fsharp",
"name": "fsharp",
"nbconvert_exporter": "",
"pygments_lexer": "",
"version": "4.3.1.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}