Last active
July 30, 2020 19:51
Topics
1. tokenization
2. parsing
3. checking and elaboration (i.e. producing TypedTree)
   - $"..." as plain string
   - $"..." as FormattableString
   - $"..." as PrintfFormat
4. FSharp.Core support (printf.fs)

Code examples:
    printf "abc %d def" 3
    $"abc {1+1} def"
    @$"abc {1+1} def"
    $@"abc {1+1} def"
    """abc {1+1} def"""
## Tokenization
token, get " -> string (args with NormalString)
token, get $" -> string (args with InterpolatedString)
token, get {, and args.Stack~~InterpolatedString -> token (args.PushABrace())
token, get }, and args.Stack~~InterpolatedString and Braces=1 -> string/vstring/tqstring (args with NormalString)
token, get }, and args.Stack~~InterpolatedString and Braces=N -> token (args.PopABrace())
token, get @" -> verbatimString
token, get @$", $@" -> verbatimString
token, get """ -> tripleQuoteString
token, get $""" -> tripleQuoteString
string, get "{", and args.InterpolatedString --> produce INTERPOLATED_STRING_FRAGMENT, then go to token state (args.Push "string")
vstring, get "{", and args.InterpolatedString --> produce INTERPOLATED_STRING_FRAGMENT, then go to token state (args.Push "vstring")

New tokens:
    INTERP_STRING_BEGIN_END  --> $"cvkjvrkjhrve"    $"""vrkjhrvhrewhervj"""
    INTERP_STRING_BEGIN_PART --> $"vrwhwver {    $"""vrwjrlvjwe {
    INTERP_STRING_PART       --> } vrewhvrehkjervh {
    INTERP_STRING_END        --> } vrwkwjervh"

fsc --tokenize test.fs
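
The state transitions above can be sketched as a tiny standalone lexer. This is only an illustration, not the compiler's lexer: it assumes a single-line `$"..."` literal with no escapes, no `{{`/`}}`, and no string literals inside fills. The `FILL` case and the `tokenize` function are hypothetical names; `FILL` stands in for the ordinary tokens of a fill expression.

```fsharp
open System.Text

// The four new token kinds from the notes, plus a hypothetical FILL token
// that carries the raw text of a fill expression.
type Token =
    | INTERP_STRING_BEGIN_END of string   // $"text"
    | INTERP_STRING_BEGIN_PART of string  // $"text {
    | INTERP_STRING_PART of string        // } text {
    | INTERP_STRING_END of string         // } text"
    | FILL of string

/// Tokenize one literal such as $"a {x} b" (simplified, see assumptions above).
let tokenize (input: string) : Token list =
    if not (input.StartsWith("$\"")) then invalidArg "input" "expected a $\" prefix"
    let tokens = ResizeArray<Token>()
    let buf = StringBuilder()
    let mutable inString = true   // "string" state vs. "token" state
    let mutable atStart = true    // true until the first fill has been seen
    let mutable braces = 0        // brace depth while in the "token" state
    let mutable finished = false
    let mutable i = 2
    while i < input.Length && not finished do
        let c = input.[i]
        if inString then
            match c with
            | '{' ->
                // string, get "{": emit a fragment, go to token state, push a brace
                let text = buf.ToString()
                tokens.Add(if atStart then INTERP_STRING_BEGIN_PART text else INTERP_STRING_PART text)
                buf.Clear() |> ignore
                atStart <- false
                inString <- false
                braces <- 1
            | '"' ->
                let text = buf.ToString()
                tokens.Add(if atStart then INTERP_STRING_BEGIN_END text else INTERP_STRING_END text)
                finished <- true
            | _ -> buf.Append(c) |> ignore
        else
            match c with
            | '{' ->
                // token, get {: push a brace
                braces <- braces + 1
                buf.Append(c) |> ignore
            | '}' when braces = 1 ->
                // token, get } with Braces=1: back to the string state
                tokens.Add(FILL(buf.ToString()))
                buf.Clear() |> ignore
                braces <- 0
                inString <- true
            | '}' ->
                // token, get } with Braces=N: pop a brace
                braces <- braces - 1
                buf.Append(c) |> ignore
            | _ -> buf.Append(c) |> ignore
        i <- i + 1
    if not finished then failwith "unterminated interpolated string"
    List.ofSeq tokens
```

For example, `tokenize "$\"abc {1+1} def\""` yields a `INTERP_STRING_BEGIN_PART`, a `FILL`, and a `INTERP_STRING_END` token, while a literal without fills yields a single `INTERP_STRING_BEGIN_END`.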
## Parsing
```
atomicExprAfterType: // Q: WHY THIS ONE
  | interpolatedString

interpolatedStringFill:
  | declExpr
  | declExpr COLON ident %prec interpolation_fill

interpolatedStringParts:
  | INTERP_STRING_END
  | INTERP_STRING_PART interpolatedStringFill interpolatedStringParts

interpolatedString:
  | INTERP_STRING_BEGIN_PART interpolatedStringFill interpolatedStringParts
  | INTERP_STRING_BEGIN_END
```
Giving these SyntaxTree extensions:
```fsharp
type SynExpr =
    ...
    | InterpolatedString of
        contents: SynInterpolatedStringPart list *
        range: range

type SynInterpolatedStringPart =
    | String of string * range
    | FillExpr of SynExpr * Ident option
```
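
As a rough illustration of how the grammar rules map onto the part list, here is a hand-written recursive-descent sketch over a simplified token stream. It is not the generated parser: `EXPR` is a hypothetical stand-in for a whole `declExpr`, ranges are omitted, and `StringPart`/`FillPart` are simplified stand-ins for the `String` and `FillExpr` cases.

```fsharp
// Simplified token stream; EXPR stands in for a complete declExpr.
type Token =
    | INTERP_STRING_BEGIN_END of string
    | INTERP_STRING_BEGIN_PART of string
    | INTERP_STRING_PART of string
    | INTERP_STRING_END of string
    | EXPR of string
    | COLON
    | IDENT of string

// Simplified stand-in for SynInterpolatedStringPart (no ranges, no SynExpr).
type Part =
    | StringPart of string
    | FillPart of expr: string * format: string option

// interpolatedStringFill: declExpr | declExpr COLON ident
let parseFill (tokens: Token list) : Part * Token list =
    match tokens with
    | EXPR e :: COLON :: IDENT fmt :: rest -> FillPart(e, Some fmt), rest
    | EXPR e :: rest -> FillPart(e, None), rest
    | _ -> failwith "expected a fill expression"

// interpolatedStringParts:
//   INTERP_STRING_END
//   | INTERP_STRING_PART interpolatedStringFill interpolatedStringParts
let rec parseParts (tokens: Token list) : Part list * Token list =
    match tokens with
    | INTERP_STRING_END s :: rest -> [ StringPart s ], rest
    | INTERP_STRING_PART s :: rest ->
        let fill, rest = parseFill rest
        let parts, rest = parseParts rest
        StringPart s :: fill :: parts, rest
    | _ -> failwith "expected INTERP_STRING_PART or INTERP_STRING_END"

// interpolatedString:
//   INTERP_STRING_BEGIN_PART interpolatedStringFill interpolatedStringParts
//   | INTERP_STRING_BEGIN_END
let parseInterpolatedString (tokens: Token list) : Part list * Token list =
    match tokens with
    | INTERP_STRING_BEGIN_END s :: rest -> [ StringPart s ], rest
    | INTERP_STRING_BEGIN_PART s :: rest ->
        let fill, rest = parseFill rest
        let parts, rest = parseParts rest
        StringPart s :: fill :: parts, rest
    | _ -> failwith "expected an interpolated string"
```

Under these assumptions, the tokens for `$"abc {x} {y:N}"` produce five parts: three string parts (the last one empty) interleaved with two fills, the second carrying the format identifier `N`.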
## Checking and Elaboration
1. $"..." : overallTy --> Check if overallTy unifies with 'string' etc. as per spec
2. Then put together the fragments into one format string using `%P()` or `%alignmentP(format)` as holes as per spec
3. Do normal format string checking of the overall format string, with %P(..) allowed
   --> Extract type information about the format string
4. In the case where $"..." is being used as a string or a PrintfFormat:
   Make a call to PrintfFormat<...>(format)
   Fill in Captures and CaptureTypes in the PrintfFormat object.
   If $"..." is being used as a string then call "sprintf" taking the PrintfFormat as argument
   e.g.
       $"abc{x,5}" --> Printf.sprintf (new PrintfFormat("abc%5P()", [| box x |], null))
       $"abc{1+1}def" --> Printf.sprintf (new PrintfFormat("abc%P()def", [| box (1+1) |], null))
       $"abc%d{1+1}def" --> Printf.sprintf (new PrintfFormat("abc%d%P()def", [| box (1+1) |], null))

In the case where $"..." is being used as a .NET FormattableString, different codegen is needed and
more restrictions apply (e.g. no % patterns are allowed), as per spec. Codegen becomes a
call to FormattableStringFactory.Create, e.g.
    ($"abc {x} {y:N}" : FormattableString)
    --> FormattableStringFactory.Create("abc {0} {1:N}", [| box x; box y |])
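
Step 2 (joining the fragments into one format string with `%P()` holes) can be sketched as follows. This is a minimal illustration, not the type checker's code: `Part`, `FillPart`, and `buildFormat` are hypothetical names, fills are assumed to be already evaluated and boxed, and literal text is assumed to contain no characters that need escaping.

```fsharp
open System.Text

// Hypothetical, already-evaluated view of an interpolated string.
type Part =
    | StringPart of string
    | FillPart of value: obj * alignment: int option * format: string option

/// Build the combined format string and the captured values.
/// Each fill becomes a hole of the shape %P() or %<alignment>P(<format>).
let buildFormat (parts: Part list) : string * obj[] =
    let sb = StringBuilder()
    let captures = ResizeArray<obj>()
    for part in parts do
        match part with
        | StringPart s -> sb.Append(s) |> ignore
        | FillPart (value, alignment, format) ->
            let alignText =
                match alignment with
                | Some n -> string n
                | None -> ""
            let formatText = defaultArg format ""
            sb.Append("%" + alignText + "P(" + formatText + ")") |> ignore
            captures.Add(value)
    sb.ToString(), captures.ToArray()
```

With these assumptions, the parts of `$"abc{x,5}"` give the format string `abc%5P()` with one capture, matching the shape of the examples above.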
## printf at runtime
- Given a format string object containing
    .FormatString (.Value) --> the string, e.g. "abc%d%P()def"
    .Captures --> null for a normal old-style printf, non-null for capturing interpolation
    .CaptureTypes --> null for a normal old-style printf, non-null if there are %A patterns
- Aim of `sprintf` is EITHER
    1. produce a string (if interpolated printf formatting)
    2. produce a curried function of the right type (if old-style printf formatting)
- Two phase approach
    1. crack the format string into an array of "steps"
    1b. if producing a curried function, generate the curried function now `(fun arg1 -> (fun arg2 -> .... <phase2>))`
    2. iterate over the steps writing the output fragments

There is a two-level, type-directed cache table:
    type Cache<'Printer, 'Residue, '...> =
        static let mutable recent = ...
        static let mutable dict = ConcurrentDictionary....
The Cache holds the results of phase 1.

This is how printf has always worked since 2012 or so. The main addition here is that "phase 2" can fill in the arguments
and relevant %A types from Captures/CaptureTypes rather than from the arguments of the curried function chain.

Basic runtime action of sprintf:
    1. Look up cache, populate with phase 1 results if needed
    2. run phase 2, return the string.
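
The two-phase idea for the interpolation case can be sketched like this. It is a deliberately reduced illustration and not the code in printf.fs: it understands only `%P()` holes and `%%`, uses a single string-keyed dictionary instead of the two-level type-directed cache, and all names (`Step`, `crack`, `getSteps`, `run`, `sprintfCaptured`) are hypothetical.

```fsharp
open System
open System.Collections.Concurrent
open System.Text

type Step =
    | Text of string   // literal text copied verbatim
    | Hole             // a %P() placeholder, filled from the captures

/// Phase 1: crack the format string into steps.
let crack (format: string) : Step[] =
    let steps = ResizeArray<Step>()
    let literal = StringBuilder()
    let flush () =
        if literal.Length > 0 then
            steps.Add(Text(literal.ToString()))
            literal.Clear() |> ignore
    let mutable i = 0
    while i < format.Length do
        let rest = format.Substring(i)
        if rest.StartsWith("%P()", StringComparison.Ordinal) then
            flush ()
            steps.Add(Hole)
            i <- i + 4
        elif rest.StartsWith("%%", StringComparison.Ordinal) then
            literal.Append('%') |> ignore
            i <- i + 2
        else
            literal.Append(format.[i]) |> ignore
            i <- i + 1
    flush ()
    steps.ToArray()

// Cache of phase 1 results, keyed by the format string.
let cache = ConcurrentDictionary<string, Step[]>()

let getSteps (format: string) : Step[] =
    cache.GetOrAdd(format, Func<string, Step[]>(crack))

/// Phase 2: iterate over the steps, taking values from the captures array.
let run (steps: Step[]) (captures: obj[]) : string =
    let out = StringBuilder()
    let mutable next = 0
    for step in steps do
        match step with
        | Text s -> out.Append(s) |> ignore
        | Hole ->
            out.Append(string (captures.[next])) |> ignore
            next <- next + 1
    out.ToString()

/// Look up (or populate) the cache, then run phase 2.
let sprintfCaptured (format: string) (captures: obj[]) : string =
    run (getSteps format) captures
```

Calling `sprintfCaptured` twice with the same format string cracks it only once; the second call reuses the cached steps and only runs phase 2 with the new captures.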
## Tooling
1. Extra complication for reporting locations of %d etc. in interpolated strings.
2. Extra complication for making sure we can take a correct continuation from tokenization.
       $""" vwhvwerkhj vwekh wvekjh vwe {     <--- take continuation at the end of each line

Test cases related to tokenization:
```
$""" vwhvwerkhj vwekh wvekjh vwe {
#if GOO
vwevw
}
#else
fwewe
}
#endif
vwekhwevvew"""
```