@DGriffin91
Last active November 4, 2024 08:59
bs13 brain dump
#[derive(Debug, Hash, PartialEq, Eq, Clone, SystemSet)]
pub enum BS13RenderSet {
Init,
Pipeline,
/// Just the spot where tangents are generated, if enabled. The renderer can also use tangents from the GLTF and
/// fall back to generating them in the fragment shader.
GenerateTangents,
/// Set up view target images, etc.
ViewTargetPrepare,
/// Send bindpose buffers (current and last frame) to the GPU
PrepareBindpose,
/// Generally used to init buffers, including sending them to the GPU in some cases
Prepare,
/// Queue draw commands, instance buffer, etc...
Queue,
/// DynMaterial
/// Materials are fully runtime dynamic. A material's data is just a list of u32s with an associated "Archetype"
/// that holds the shader string/path and a layout that maps the u32s to names/type info.
/// The layout can be generated from a description in an HLSL file or procedurally (as in the case of Blender
/// materials). The Handle<DynMaterialArchetype> is part of the pipeline key, so it can be retrieved during
/// get_pipeline() -> GraphicPipeline for DynMaterialPipelineKey. The layout is used directly to create named
/// shader defs that contain the material data offset for each item. In the case of Blender materials the Archetype
/// shader string holds the shader code derived from the Blender material graph. This generated code is inserted
/// into a Blender wrapper shader via a shader def. Both Blender materials and Bevy's standard materials are
/// translated to DynMaterials. Blender materials are generally rendered deferred, but in cases where there are
/// multiple BSDFs in one material, or alpha blending is used, they are rendered in a forward pass. (This is
/// somewhat similar to EEVEE.) Blender materials are inserted into the GLTF via a `gather_material_hook` plugin.
/// There's support for marking nodes to use material properties for inputs instead of constants; this, along with
/// using indices for textures, allows materials with the same graph layout to be merged into the same pipeline.
/// Maybe there will be a system in the future that automatically reconfigures graphs to minimize pipeline switches.
///
/// Generally, instances (transform, material index, etc.) are retained on the GPU. Change detection on the CPU
/// triggers a compute shader to update the specific instances that have changed. A free list is maintained on the
/// CPU for when instances are removed or the buffer is resized (updated via an on_remove component hook). Whenever
/// the instance buffer is created, its size is max(2x the currently needed size, 100k). 100k instances is only
/// 24MB, so this doesn't use much memory and avoids recreating the buffer often. If space runs out, the buffer is
/// recreated from scratch from the CPU, which probably causes very occasional stutters. This general update
/// strategy needs to be translated to materials, meshes, bindposes, lights, etc., but is not yet. Those all
/// generally send the full state every frame, or whenever something changes (with meshes, for example).
///
/// All of the render passes use write_indirect_and_culling.hlsl, which performs occlusion/frustum culling.
/// Bevy's existing frustum culling is currently used since it potentially reduces the work of various things down
/// the line. In the future this will probably be replaced with something faster and/or only happen on the GPU.
/// write_indirect_and_culling.hlsl takes a buffer of indices into the instance buffer and generates
/// DrawIndexedIndirectCommands. It writes these densely and updates the count buffer. On macOS the count buffer
/// isn't supported, so it writes empty DrawIndexedIndirectCommands in the gaps. (macOS then also greatly
/// benefits from CPU culling so the indirect count can be as low as possible. It may be possible at some point to
/// improve things by reading back the counts and using them to inform future frames.)
///
/// Instance data in GPU format and pipeline keys are cached on the mesh entity, updated via change detection.
/// Draws are binned by pipeline key in a vec (draw_cmd_lists). Hashmaps (main_draw_mapping & prepass_draw_mapping)
/// keep the mapping from a pipeline key to a vec index. Mesh entities keep a DrawCmdListIndices that indexes
/// directly into draw_cmd_lists. Draw command lists are just cleared each frame, keeping the allocation.
/// The first prepass renders only deferred items if the screen-ratio heuristic is being used, plus all the forward
/// prepass items. (The screen-ratio heuristic renders deferred things that take up a large portion of the screen
/// in a depth-only prepass.) This first prepass currently only has frustum culling. ("low_z_candidates")
RenderPrepass,
/// Optionally, a low-z pyramid is rendered from the prepass depth and the deferred high-screen-ratio depth-only
/// items. This LowZPyramid is currently only run if deferred was included and screen-ratio pre-prepass items are
/// in use.
LowZPyramid,
/// The second prepass skips any forward prepass items since they were already rendered in the first prepass,
/// so it just renders depth only for deferred, assuming the DepthPrepassForDeferred component is on the camera.
/// RenderPrepass2 uses both frustum and occlusion culling (culling things from being drawn in the 2nd prepass).
/// The previous frame's LowZPyramid and previous frame's camera view are used to cull, along with the first
/// LowZPyramid if it was run. This DepthPrepassForDeferred doesn't need to be perfect; the depth from what is drawn
/// in RenderPrepass2 will be used to cull the actual deferred pass and all passes after it.
RenderPrepass2,
/// LowZPyramid of the depth drawn so far ("low_z_full")
LowZPyramid2,
/// Shadows, currently only one cascade is drawn. Uses GPU frustum culling, no occlusion culling.
RenderShadows,
/// Deferred gbuffer pass. Uses occlusion/frustum culling via write_indirect_and_culling.hlsl. The gbuffer is just
/// U32x4 + depth; everything is crammed into those two attachments (motion vectors, normals, etc.).
RenderDeferred,
/// In a single pass, makes 5 mip levels of: depth, normal, normal_sm_vs, color (pre-reprojected from last frame),
/// and motion (motion vectors are currently only supported on deferred; falls back to camera motion from depth
/// otherwise).
FramePyramid,
/// Tiled lighting. 64x32 screen-space tiles. Each tile can hold 48 point lights and an additional 48 cheap lights.
/// (Cheap lights are Lambert, diffuse-only spot lights that fully fit in a U32x4.)
/// The tiles are fully preallocated so the start points are already known. Each light's range AABB is intersected
/// with each tile. The LowZPyramid is also used to occlude lights. Occlusion works better here than for draws
/// since it is constrained to the tile, allowing it to consider a much smaller area using a more accurate, higher
/// mip.
TiledLighting,
/// Basic visibility bit mask SSAO. Currently no filtering step. Run at full res.
Ssao,
/// Forward opaque pass. Used for blender materials with multiple BSDFs in the same material. (Can optionally be
/// used for everything instead of deferred)
RenderForwardOpaque,
/// Deferred lighting pass.
RenderLightDeferred,
/// SSR uses the BSDF from deferred; currently just one sample, no filtering yet. Uses VNDF. Falls back to the
/// environment. If SSR is on, environment specular is skipped, so it can just be summed on top of the frame in the
/// SSR pass. The reprojected previous frame color is used. Could also use the current frame color, which would be
/// missing SSR *and* missing env. One idea would actually be to use env initially in the deferred lighting pass,
/// then subtract it and add SSR, letting the SSR bounce be lit by the environment map. Idk, will probably do
/// something else with actual ray tracing.
Ssr,
/// Decals can be set up in Blender; these project their arbitrary Blender material onto opaque things. The Blender
/// material runs as usual, but the normals/positions/tangents/etc. of the underlying surface are used instead of
/// its own.
RenderForwardDecal,
/// Alpha blend
RenderForwardBlend,
/// A single pass of transmissive materials reads from the current state of the frame, currently without any mips.
RenderForwardTransmission,
/// Just copies the current state of the frame, only mip 0; leaves it to the next frame to reproject and make mips.
FramePyramidColorCopy,
/// Fxaa. Sucks like usual.
Fxaa,
/// CMAA2 can be used with TAA for TSCMAA
Cmaa,
/// Fairly basic TAA. Gets object and armature motion vectors. Mostly does stuff from:
/// https://www.elopezr.com/temporal-aa-and-the-quest-for-the-holy-trail/
Taa,
/// Just some white noise with a slight low pass. Will probably be moved to upscale to skip the full-screen
/// triangle pass.
FilmGrain,
/// Stuff
Debug,
/// Mostly the same as Bevy. TonyMcMapface & AgX FTW.
/// The render can run at a lower or higher scale than the swap chain; this pass resolves it with Catmull-Rom for a
/// bit of extra sharpness. Also converts from R16G16B16A16_SFLOAT to R8G8B8A8_UNORM.
TonemappingAndUpscale,
/// egui is currently the only thing supported. Thankfully the Bevy egui plugin supports custom render backends.
UI,
PrepareViewTargetForBlit,
Present,
}
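The retained-instance notes above (free list maintained on the CPU, buffer sized to max(2x needed, 100k)) could be sketched roughly like this. This is a hypothetical illustration, not the actual bs13 types:

```rust
// Hypothetical sketch of the instance-slot free list and growth policy
// described in the notes above. Names are illustrative.
const MIN_CAPACITY: u32 = 100_000; // 100k slots is cheap (~24MB at 240B/instance)

struct InstanceSlots {
    capacity: u32,
    next_unused: u32, // high-water mark
    free: Vec<u32>,   // slots returned by the on_remove component hook
}

impl InstanceSlots {
    fn new() -> Self {
        Self { capacity: MIN_CAPACITY, next_unused: 0, free: Vec::new() }
    }

    /// Reuse a freed slot if possible, otherwise take a fresh one, growing
    /// capacity to max(2x needed, 100k) when space runs out (which is where
    /// the GPU buffer would be recreated from scratch).
    fn allocate(&mut self) -> u32 {
        if let Some(slot) = self.free.pop() {
            return slot;
        }
        if self.next_unused == self.capacity {
            self.capacity = (self.next_unused + 1).saturating_mul(2).max(MIN_CAPACITY);
            // ... recreate the GPU instance buffer here ...
        }
        let slot = self.next_unused;
        self.next_unused += 1;
        slot
    }

    fn release(&mut self, slot: u32) {
        self.free.push(slot);
    }
}
```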
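A CPU-side model of the two output modes described for write_indirect_and_culling.hlsl (dense writes plus a count buffer, versus the macOS path that leaves empty commands in the gaps). The struct and function names here are assumptions for illustration:

```rust
// Hypothetical model: compact surviving draws and bump a count, or (no
// count-buffer support) keep slots in place and leave empty commands,
// which the GPU skips because instance_count is 0.
#[derive(Clone, Copy, Default, PartialEq, Debug)]
struct DrawIndexedIndirect {
    index_count: u32,
    instance_count: u32, // 0 => empty draw
    first_index: u32,
    base_vertex: i32,
    first_instance: u32,
}

fn write_indirect(
    draws: &[DrawIndexedIndirect],
    visible: &[bool],
    supports_count_buffer: bool,
) -> (Vec<DrawIndexedIndirect>, u32) {
    let mut out = vec![DrawIndexedIndirect::default(); draws.len()];
    let mut count = 0u32;
    for (i, draw) in draws.iter().enumerate() {
        if visible[i] {
            if supports_count_buffer {
                out[count as usize] = *draw; // write densely
            } else {
                out[i] = *draw; // gaps stay as empty commands
            }
            count += 1;
        }
    }
    (out, count)
}
```

On the dense path the count caps how many commands the GPU reads; on the gap-fill path every slot is consumed, which is why CPU culling matters more there.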
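The pipeline-key binning scheme (hashmap from key to vec index, lists cleared each frame to keep their allocations) can be sketched as follows; field and method names are hypothetical stand-ins for the gist's main_draw_mapping / draw_cmd_lists:

```rust
use std::collections::HashMap;

type PipelineKey = u64;

// Hypothetical sketch of per-pipeline draw binning.
#[derive(Default)]
struct DrawBins {
    main_draw_mapping: HashMap<PipelineKey, usize>,
    draw_cmd_lists: Vec<Vec<u32>>, // instance indices, binned by pipeline key
}

impl DrawBins {
    /// Stable list index for a key; a mesh entity could cache this in
    /// something like DrawCmdListIndices.
    fn bin_index(&mut self, key: PipelineKey) -> usize {
        if let Some(&idx) = self.main_draw_mapping.get(&key) {
            return idx;
        }
        self.draw_cmd_lists.push(Vec::new());
        let idx = self.draw_cmd_lists.len() - 1;
        self.main_draw_mapping.insert(key, idx);
        idx
    }

    fn queue(&mut self, key: PipelineKey, instance: u32) {
        let idx = self.bin_index(key);
        self.draw_cmd_lists[idx].push(instance);
    }

    /// Clear per-frame contents but keep both the vec allocations and the
    /// key -> index mapping, so cached indices stay valid across frames.
    fn clear(&mut self) {
        for list in &mut self.draw_cmd_lists {
            list.clear();
        }
    }
}
```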
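Because the tiled-lighting tiles are fully preallocated (64x32 tiles, 48 point + 48 cheap light slots each), every tile's start offsets are known up front. A minimal sketch of that layout math, assuming a flat slot buffer with point lights followed by cheap lights per tile:

```rust
// Hypothetical layout math for the preallocated light tiles described above.
const TILES_X: u32 = 64;
const TILES_Y: u32 = 32;
const POINT_LIGHTS_PER_TILE: u32 = 48;
const CHEAP_LIGHTS_PER_TILE: u32 = 48;
const STRIDE: u32 = POINT_LIGHTS_PER_TILE + CHEAP_LIGHTS_PER_TILE;

/// Tile index for a screen-space UV in [0, 1).
fn tile_index(u: f32, v: f32) -> u32 {
    let x = ((u * TILES_X as f32) as u32).min(TILES_X - 1);
    let y = ((v * TILES_Y as f32) as u32).min(TILES_Y - 1);
    y * TILES_X + x
}

/// Start offsets of the point-light and cheap-light slot ranges for a tile.
fn tile_offsets(tile: u32) -> (u32, u32) {
    let base = tile * STRIDE;
    (base, base + POINT_LIGHTS_PER_TILE)
}
```

Fixed strides mean the light-culling shader can write into each tile without any atomic allocation pass.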