4coder_qol/plugins/4coder_minimap.cpp
4coder is best thought of less as a customizable editor and more as a platform layer you can implement an editor on top of.
This means the implementation of base-level features like syntax highlighting and line layout is left up to the user.
That puts us in a bit of a bind if we want to implement a minimap, which relies on layout for where to place the rectangles, and on syntax highlighting for what color the rectangles should be.
A first thought might be "I can just call render_buffer with a really small font".
This is far more expensive than we'd like it to be and looks terrible. But there's a key insight we can bring into focus:
Providing a smaller font is "playing nice" with render_buffer, which calls some user function to do syntax highlighting.
So long as we "play nice" with that function (or class of functions), we're in the clear.
The function we're expecting has the form:
```cpp
void paint_my_tokens(App_Ctx *app, Token_Array tokens, Layout_ID layout_id){
    Range visible_range = layout_get_visible_range(layout_id);
    for (Token *token : token_iterator(tokens, visible_range)){
        Color color = default_color_for_token(token->kind);
        if (token->kind == TokenBaseKind_Identifier){
            Note *note = code_index_note_lookup(token);
            if (note != NULL){
                color = note_color_for_token(color, note->kind);
            }
        }
        // else if token is a member var
        // else if token is a method call
        // else if token is a label
        // else if token is a multi-line DSL string
        // else if I don't like the cut of the token's jib...
        paint_text_color(app, layout_id, color);
    }
}
```

So if we can provide a Layout_ID with a visible_range computed specifically for our minimap,
Then hook into paint_text_color to draw tiny rects instead of full glyphs
This would mean the user only needs to add the following:
```cpp
Layout_ID minimap_id = MM_begin(...);
paint_my_tokens(app, tokens, minimap_id);
MM_finish(...);
```

For the purpose of getting rough timing estimates, 4coder's startup is changed to synchronously load, lex, and parse all files beforehand to remove noise/variation.
Each run lasts 300 frames, moving 10% down the buffer (mod buffer size) every 10 frames.
The major constraints on the feature's implementation:
- Supports virtual white-space and line-wrapping
- A custom layer plugin
- Trivial to add to existing custom layers
- Can't depend on ABI breaking core changes
- Runs decently in debug mode on my potato laptop
Let's start with the most straightforward approach.
We have a core function intended to draw the cursor rectangle at a given character.
We can use it to get the token's rect, then map the rect's coords into "minimap-space".
```cpp
Token *t = token_from_pos(&mm.tokens, range.min);
if (t == 0 || t->kind == TokenBaseKind_Whitespace){ return; }
Rect a = text_layout_character_on_screen(app, mm.layout, range.min);
Rect b = text_layout_character_on_screen(app, mm.layout, range.max-1);
Rect r = rect_union(a, b);
Vec2 p0 = mm.region.p0 + mm.s_p0*r.p0;
Vec2 p1 = mm.region.p0 + mm.s_p0*r.p1;
draw_rectangle(app, rect(p0, p1), color);
```

| Naive | 66.57 ms per call | (19.97s / 23.92s) = 83% |
|---|---|---|
Performance-wise, this is entirely unusable, but it does serve as a proof of concept
We'll first note that we were already iterating tokens, then passed each token's position and size as a range,
then had to call token_from_pos, which binary-searches for the very token we just had in hand.
If we assume the token being painted is usually the one we just iterated, we can stash it as a hint and check it before searching.
By simply eliding the call to token_from_pos, we can bring this down to
```cpp
void MM_paint_token_color(App_Ctx *app, Layout_ID layout_id, Token *token, Color color){
    paint_text_color(app, layout_id, Range(mm.hint = token), color);
}
```

```cpp
Bool use_hint = (mm.hint && Range(mm.hint) == range);
Token *t = (use_hint ? mm.hint : token_from_pos(&mm.tokens, range.min));
if (t == 0 || t->kind == TokenBaseKind_Whitespace){ return; }
Rect a = text_layout_character_on_screen(app, mm.layout, range.min);
Rect b = text_layout_character_on_screen(app, mm.layout, range.max-1);
Rect r = rect_union(a, b);
Vec2 p0 = mm.region.p0 + mm.s_p0*r.p0;
Vec2 p1 = mm.region.p0 + mm.s_p0*r.p1;
draw_rectangle(app, rect(p0, p1), color);
```

| Naive | 66.57 ms per call | (19.97s / 23.92s) = 83% |
|---|---|---|
| Cached | 65.07 ms per call | (19.52s / 23.56s) = 83% |
Unfortunately, this barely made a dent. So what went wrong?
I mean, clearly academia has lied to us about algorithmic analysis,
and it's up to us enlightened few who are "in the know" to set things straight
Let's run an experiment and only call text_layout_character_on_screen for the starting position
Then just use the token size to compute the ending position[^1]
```cpp
Rect a = text_layout_character_on_screen(app, mm.layout, t->pos);
Vec2 p0 = mm.region.p0 + mm.s_p0*a.p0;
Vec2 dim = mm.s_str*vec2(f32(t->size), 1.0);
draw_rectangle(app, rect_xy_wh(p0, dim), color);
```

| Cached | 65.07 ms per call | (19.52s / 23.56s) = 83% |
|---|---|---|
| Elim. 1 call | 33.33 ms per call | (10.00s / 11.96s) = 83% |
By halving the number of calls to text_layout_character_on_screen we halve our cost
This confirms our time is clearly being dominated by text_layout_character_on_screen
So what can we do? Maybe we could try to find a cheaper core function to call?
```cpp
Rect a = view_relative_box_of_pos(app, mm.view, mm.line, t->pos);
Vec2 p0 = mm.region.p0 + mm.s_p0*a.p0;
Vec2 dim = mm.s_str*vec2(f32(t->size), 1.0);
draw_rectangle(app, rect_xy_wh(p0, dim), color);
```

| Elim. 1 call | 33.33 ms per call | (10.00s / 11.96s) = 83% |
|---|---|---|
| Relative box | 21.60 ms per call | (06.48s / 07.74s) = 84% |
Cheaper, but still not usable. So if it's expensive, let's try to minimize calls to this function
To do so, we'll try to reuse as much information as possible
If the token is on the same line as the previous one, then we can certainly reuse `y0` and `y1`.
By the same logic of how we compute `r0.x1 = r0.x0 + width*t0.size`,
we can compute `r1.x0 = r0.x0 + width*(t1.pos - t0.pos)`.
Then only when a token starts on a new line do we need to properly call the layout function
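The arithmetic behind that reuse is worth pinning down. Here's a tiny standalone sketch; `Tok`, `char_w`, and both function names are inventions for illustration, and it bakes in the same one-column-per-byte approximation the minimap itself uses:

```cpp
#include <cassert>

// Stand-in token: a buffer position and byte length (not the 4coder types).
struct Tok { long pos; long size; };

// r0.x1 = r0.x0 + width*t0.size: a token's right edge from its left edge.
static float token_x1(float x0, Tok t, float char_w){
    return x0 + char_w*(float)t.size;
}

// r1.x0 = r0.x0 + width*(t1.pos - t0.pos): the next token's left edge on the
// same line, derived purely from byte distance -- no layout call needed.
static float next_token_x0(float prev_x0, Tok prev, Tok next, float char_w){
    return prev_x0 + char_w*(float)(next.pos - prev.pos);
}
```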
```cpp
// Vec2 MM_get_xy(App_Ctx *app, i64 pos);
if (!range_contains(mm._range, pos)){
    mm._rel = view_relative_box_of_pos(app, mm.view, mm.line, pos).p0;
    mm._range.min = view_pos_at_relative_xy(app, mm.view, mm.line, vec2(0.0, mm._rel.y));
    mm._range.max = view_pos_at_relative_xy(app, mm.view, mm.line, vec2(max_f32, mm._rel.y));
    mm._p0 = mm.region.p0 + mm.s_p0*mm._rel;
}
return mm._p0 + mm.s_str * vec2(f32(pos - mm._range.min), 0);
```

```cpp
// void MM_paint_text_color(App_Ctx *app, Layout_ID layout_id, Range range, Color color)
Vec2 p0 = MM_get_xy(app, t->pos);
Vec2 dim = mm.s_str*vec2(f32(t->size), 1.0);
draw_rectangle(app, rect_xy_wh(p0, dim), color);
```

| Relative box | 21.60 ms per call | (6.48s / 7.74s) = 84% |
|---|---|---|
| per line | 10.43 ms per call | (3.13s / 4.38s) = 71% |
This is roughly the limit of how far we can go with our eyes closed.
If we look under the lid, file_pos_at_relative_xy gets a layout list starting at the base line,
then calls layout_nearest_pos_to_xy, which iterates every block in the list until it finds the one containing point p.
Given a constant base line, the number of layout blocks iterated grows with the distance of p from the base line.
Even in the best-case scenario, where each line contains only one block,
rendering n lines iterates 1 + 2 + ... + n = n(n+1)/2 blocks, i.e. O(n^2).
If we instead update the base line used to lay out positions and track the deltas between successive base lines,
then p will always be contained within the first block of the list.
This brings the number of blocks down to 1 for every p, totaling n, i.e. O(n).
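Those two totals are easy to sanity-check with a throwaway count (nothing here is 4coder API; we just count block visits instead of timing them):

```cpp
#include <cassert>
#include <cstdint>

// With a fixed base line, laying out line k scans k blocks, so n lines
// scan 1 + 2 + ... + n blocks in the best case of one block per line.
static int64_t blocks_fixed_base(int64_t n){
    int64_t total = 0;
    for (int64_t k = 1; k <= n; k += 1){ total += k; }
    return total;  // equals n*(n + 1)/2
}

// With the base line advanced alongside the queries, every lookup lands in
// the first block of its list, so n lines scan exactly n blocks.
static int64_t blocks_moving_base(int64_t n){
    return n;
}
```

For a 200-line minimap that's 20,100 block visits versus 200.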
```cpp
i64 line = get_line_number_from_pos(app, mm.buffer, pos);
mm._range.min = view_pos_at_relative_xy(app, mm.view, line, vec2(0.0, 0.0));
mm._range.max = view_pos_at_relative_xy(app, mm.view, line, vec2(max_f32, 0.0));
mm._rel.y += view_line_y_difference(app, mm.view, line, mm._line);
mm._rel.x = view_relative_box_of_pos(app, mm.view, line, mm._range.min).x0;
mm._line = line;
```

| per line | 10.43 ms per call | (3.13s / 4.38s) = 71% |
|---|---|---|
| track deltas | 4.933 ms per call | (1.48s / 2.79s) = 53% |
While we're at it, instead of calling view_line_y_difference every line,
we can just query the line height once at the beginning

```cpp
mm._rel.y += mm.line_height*(line - mm._line);
```

| track deltas | 4.933 ms per call | (1.48s / 2.79s) = 53% |
|---|---|---|
| Use font-height | 4.533 ms per call | (1.36s / 2.70s) = 50% |
Putting all this together gets us
| Naive | 66.57 ms per call | (19.97s / 23.92s) = 83% |
|---|---|---|
| Cached | 65.07 ms per call | (19.52s / 23.56s) = 83% |
| Elim. 1 call | 33.33 ms per call | (10.00s / 11.96s) = 83% |
| Relative box | 21.60 ms per call | (06.48s / 07.74s) = 84% |
| per line | 10.43 ms per call | (03.13s / 04.38s) = 71% |
| track deltas | 4.933 ms per call | (01.48s / 02.79s) = 53% |
| Use font-height | 4.533 ms per call | (01.36s / 02.70s) = 50% |
The same way that MM_get_xy() tells us when a token starts on a different line,
we can also see when it ends on a different line, signalling a multi-line token.
We initialize pos from token.pos and start iterating lines.
Each line, we compute rect.p0 from pos and rect.p1 from the line's end,
then compute the next line's range and update pos to that line's begin,
breaking out of the loop once pos surpasses either token.end or visible_range.end.
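As a standalone sketch of that loop (everything here is invented for illustration: `line_range_of` fakes a fixed-width layout in place of the real per-line queries):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#include <algorithm>

struct Range { int64_t min, max; };

static const int64_t LINE_WIDTH = 10;  // fake layout: every line is 10 chars

// Stand-in for the per-line layout query: the line containing pos.
static Range line_range_of(int64_t pos){
    int64_t start = (pos/LINE_WIDTH)*LINE_WIDTH;
    return Range{start, start + LINE_WIDTH};
}

// One span per line the token touches, clipped to the token's own end and
// to the visible range -- the loop described above.
static std::vector<Range> split_token(Range token, Range visible){
    std::vector<Range> spans;
    int64_t pos = token.min;
    int64_t end = std::min(token.max, visible.max);
    while (pos < end){
        Range line = line_range_of(pos);
        spans.push_back(Range{pos, std::min(line.max, end)});
        pos = line.max;  // the next span starts at the next line's begin
    }
    return spans;
}
```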
The finishing touch would be to add nests to the minimap
While tokens are stored in a homogeneous array, nests are stored in a heterogeneous tree
The natural way to walk the tree would be a pre-order traversal
```cpp
Bool MM_draw_scopes(App_Ctx *app, Nest_Array *array, Nest *last, Range range, Color color){
    for (Nest *nest : *array){
        Rect r0 = rect(MM_get_xy(app, nest->open.min));
        Rect r1 = rect(MM_get_xy(app, nest->close.min));
        Rect r = rect_union(r0, r1);
        draw_rectangle(app, rect(r.x0, r.y0, r.x0+1, r.y1 + mm.s_str.y), color);
        if (nest == last || MM_draw_scopes(app, &nest->nest_array, last, range, color)){ return true; }
    }
    return false;
}
```

This works fine, but if we take into consideration how MM_get_xy optimizes for laying out nearby lines,
we see that with a pre-order traversal, the lines we're laying out kinda ping-pong,
making large multi-line jumps from the top open brace to the bottom close brace as we walk parent to child.
By simply moving to an in-order traversal, the number of jumps stays the same, but the distance is minimized
```cpp
Bool MM_draw_scopes(App_Ctx *app, Nest_Array *array, Nest *last, Range range, Color color){
    for (Nest *nest : *array){
        Rect r0 = rect(MM_get_xy(app, nest->open.min));
        Bool did_last = (nest == last || MM_draw_scopes(app, &nest->nest_array, last, range, color));
        Rect r1 = rect(MM_get_xy(app, nest->close.min));
        Rect r = rect_union(r0, r1);
        draw_rectangle(app, rect(r.x0, r.y0, r.x0+1, r.y1 + mm.s_str.y), color);
        if (did_last){ return true; }
    }
    return false;
}
```

Now this isn't the end of the story, as it assumes nests must be rendered in isolation from tokens.
We know that nests are opened by a token, and closed by a token, and that we're already laying out every token
This means we can avoid duplicate work by interleaving the nest-stream with the token-stream
The only caveat is that our nest-stream contains all nests which overlap the visible range, rather than only those strictly contained by it.
This requires us to partially consume any nests not entirely contained, both before and after painting our tokens.
Here we're bump-allocating the nest list.
While we could iterate the nest-list in fixed space, allocating only an intermediate stack bounded by the maximum nest depth, there are two things to keep in mind:
- Accessing the nest entries requires acquiring a lock on the code index, and the caller likely wants that access for syntax-highlighting types, functions, globals, etc.
- The number of scopes visible on the minimap is trivially small
```cpp
// MM_begin(...)
MM_nest_list(arena, &index->nest_array, mm.range);
while (mm.head && !range_contains(mm.range, mm.head->p0)){ MM_nest_open(app); }

// MM_paint_text_color(...)
while (mm.head && range_contains(mm._range, mm.head->p0)){ MM_nest_open(app); }
while (mm.stack && range_contains(mm._range, mm.stack->p1)){ MM_nest_close(app); }

// MM_end(...)
while (mm.stack){ MM_nest_close(app); }
```

There's an order of magnitude fewer scopes than there are tokens, so we'll need to adjust the benchmark a bit:
- Different file with more nests (and more tokens per line)
- Bumped up from 300 -> 1200 total frames
| Pre-order | 8.075 ms per call | (9.69s / 15.89s) = 61% |
|---|---|---|
| In-order | 7.792 ms per call | (9.35s / 15.35s) = 61% |
| Interleaved | 7.158 ms per call | (8.59s / 14.71s) = 58% |
| Disabled | 7.075 ms per call | (8.49s / 14.43s) = 59% |
Within spitting distance of our "speed of light"
[^1]: This makes no change for purely ASCII buffers, but can make UTF-8 text appear slightly wider. It also causes multi-line tokens to appear as a single really long line, but those weren't handled correctly before either.

