My Dear ImGui render loop looks a bit unusual because I want to reduce calls to WebGL as much as possible, especially buffer update calls.
This means:
- only one buffer each for all per-frame vertex- and index-data
- only one update call each per frame for vertex- and index-data (with my own double-buffering, since buffer-orphaning doesn't work on WebGL, and with this I'm also independent from any 'under-the-hood' magic a GL driver might or might not perform)
- buffers and vertex attributes are 'bound' once before the draw loop
- the draw loop only changes the texture (if necessary), sets the scissor rect and performs draw calls
ImGui gives me 'local' vertex- and index-data for each ImDrawList though, so I need to merge the vertex-data first, then merge and 'rebase' the index data (so that all indices are relative to the per-frame merged vertex data). This is necessary because WebGL/GLES2 cannot render from a base-vertex-index, and I don't want to redefine the vertex-attributes during the draw loop.
WebGL also doesn't have glMapBuffer...
So updating the vertex- and index-data involves 2 copies:
- copy all 'local' vertex and index data chunks from ImDrawLists into two contiguous memory chunks, and while at it, 'rebase' the indices to the global vertex data chunk
- update the vertex- and index-buffers with 2 calls to glBufferSubData()
This also means only 64k vertices can be rendered per frame when using 16-bit indices, but I haven't hit that limit yet.
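To make the merge-and-rebase step concrete, here is a minimal sketch. It is not the sokol-gfx sample; `Vertex`, `DrawList`, `MergedFrame` and `merge_draw_lists` are hypothetical stand-ins reduced to what the merge needs, and the real ImDrawList layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for ImGui's vertex and draw-list types (assumption).
struct Vertex { float x, y, u, v; std::uint32_t color; };
struct DrawList {
    std::vector<Vertex> vertices;        // 'local' vertex data
    std::vector<std::uint16_t> indices;  // relative to this list's vertices
};

// Two contiguous per-frame chunks, ready for one buffer update each.
struct MergedFrame {
    std::vector<Vertex> vertices;
    std::vector<std::uint16_t> indices;  // relative to the merged vertex chunk
};

MergedFrame merge_draw_lists(const std::vector<DrawList>& lists) {
    MergedFrame out;
    for (const DrawList& dl : lists) {
        // Index of this list's first vertex inside the merged chunk.
        const std::size_t base = out.vertices.size();
        // 16-bit indices can address at most 65536 merged vertices.
        assert(base + dl.vertices.size() <= 65536);
        out.vertices.insert(out.vertices.end(), dl.vertices.begin(), dl.vertices.end());
        // 'Rebase' every local index so it points into the merged chunk.
        for (std::uint16_t idx : dl.indices) {
            out.indices.push_back(static_cast<std::uint16_t>(base + idx));
        }
    }
    return out;
}
```

After this, the two vectors can each be uploaded with a single update call, and the draw loop only needs index offsets and counts.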
Here's what the code with sokol-gfx looks like (this sample doesn't have support for custom textures or custom fonts):
- sg_update_buffer() is just glBufferSubData() but does an internal buffer rotation / double buffering
- sg_apply_draw_state() binds the buffers and calls glVertexAttribPointer (and also updates the other GL render states)
- sg_draw() does the glDrawElements() (...or glDrawArrays, or the instanced variants)
If I could have a few feature wishes granted, they would be these (all features would have to be enabled during initialization):
- optional feature to have ImGui write all vertex- and index-data into 2 contiguous memory chunks instead of having local vertex- and index-data per ImDrawList (an ImDrawList would only have offsets and size of the vertex- and index-range for this draw list, in case this is needed for rendering)
- optionally enable a 'global indices' mode, where indices are relative not to the current ImDrawList but to the global vertex data
- don't hardwire the ImDrawIdx type to 16-bit, but allow configuring ImGui during init to write 32-bit indices (not critical yet for me though, since I haven't hit the 16-bit limit)
I don't know if it is trivial to write per-frame vertex- and index-data instead of per-ImDrawList though :)
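As a rough illustration of the first two wishes, the layout could look something like the sketch below. Everything here is hypothetical (`FrameVertex`, `DrawListRange`, `FrameDrawData` and its methods are made-up names, not ImGui API); it only shows draw lists shrinking to offset/size ranges over two shared chunks, with indices written as global values up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex type (assumption, not ImDrawVert).
struct FrameVertex { float x, y, u, v; std::uint32_t color; };

// A draw list reduced to ranges into the shared per-frame chunks.
struct DrawListRange {
    std::uint32_t vertex_offset, vertex_count;
    std::uint32_t index_offset, index_count;
};

struct FrameDrawData {
    std::vector<FrameVertex> vertices;   // one chunk for the whole frame
    std::vector<std::uint16_t> indices;  // 'global' indices into vertices
    std::vector<DrawListRange> lists;

    // Start a new draw list at the current end of both chunks.
    void begin_list() {
        DrawListRange r{static_cast<std::uint32_t>(vertices.size()), 0,
                        static_cast<std::uint32_t>(indices.size()), 0};
        lists.push_back(r);
    }

    // Append one triangle to the current list, writing global indices directly.
    void add_triangle(const FrameVertex& a, const FrameVertex& b, const FrameVertex& c) {
        assert(!lists.empty());
        const std::size_t base = vertices.size();
        assert(base + 3 <= 65536);  // 16-bit index limit
        vertices.push_back(a);
        vertices.push_back(b);
        vertices.push_back(c);
        for (std::size_t i = 0; i < 3; ++i) {
            indices.push_back(static_cast<std::uint16_t>(base + i));
        }
        lists.back().vertex_count += 3;
        lists.back().index_count += 3;
    }
};
```

With a layout like this, no merge or rebase pass would be needed before the buffer updates; whether ImGui's internals can produce it cheaply is exactly the open question.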
But those requests are definitely not high-prio. Even with all the CPU work happening before the draw loop, ImGui rendering performance is never a problem in asm.js/wasm, no matter what I throw at it.
Cheers! -Floh.
A) and B) would go together. The Begin() submission order is obviously decorrelated from the visible z-order, but we could submit vertices in a single buffer, and copy/re-order only the indices at the end of the frame (the indices being one tenth the size of the vertices, it'd be a win over your approach; one fifth with 32-bit indices).
C) ImDrawIdx type: not sure how to do that at runtime without a performance hit. The easiest approach would probably be to implement most of the inner ImDrawList code/loop twice (perhaps with a template, or just hard-coding for both types may be more reasonable). It's not really difficult, but overhead/maintenance-wise I'm not really eager to do it without a good reason. You could just switch to 32-bit in your case?
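For C), the template route might look roughly like this. It is only a sketch of the idea, not ImGui code: `append_quad_indices` is a made-up helper standing in for the inner index-writing loop, instantiated once per index width so the hot path has no runtime branch on the index size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical inner loop, compiled once for each supported index type.
template <typename Index>
void append_quad_indices(std::vector<Index>& out, std::size_t first_vertex) {
    static_assert(std::is_same<Index, std::uint16_t>::value ||
                  std::is_same<Index, std::uint32_t>::value,
                  "only 16-bit and 32-bit indices are supported");
    // The last vertex of the quad must still be addressable with this index type.
    assert(first_vertex + 3 <=
           static_cast<std::size_t>(std::numeric_limits<Index>::max()));
    // Two triangles per quad: (0,1,2) and (0,2,3).
    static const std::size_t offsets[6] = {0, 1, 2, 0, 2, 3};
    for (std::size_t o : offsets) {
        out.push_back(static_cast<Index>(first_vertex + o));
    }
}
```

A runtime setting would then pick between `append_quad_indices<std::uint16_t>` and `append_quad_indices<std::uint32_t>` once, outside the loop, which is where the duplicated code and the maintenance cost come from.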