Last active November 4, 2024 08:59
DGriffin91/0b1b8ab3a19993e6caeb74a16f7c2ba5
bs13 brain dump
#[derive(Debug, Hash, PartialEq, Eq, Clone, SystemSet)]
pub enum BS13RenderSet {
/// Just the spot where tangents are generated, if enabled. The renderer can also use tangents from the GLTF and
/// fall back to generating them in the fragment shader.
/// Set up view target images, etc.
/// Send bindpose buffers (current and last frame) to the GPU
/// Generally used to init buffers, including sending them to the GPU in some cases
/// Queue draw commands, instance buffer, etc...
/// DynMaterial
/// Materials are fully runtime dynamic. The materials data is just a list of u32s with an associated "Archetype"
/// that holds the shader string/path and a layout that maps u32s with names/type info.
/// The layout can be generated from a description in an hlsl file or procedurally (like in the case of blender
/// materials). The Handle<DynMaterialArchetype> is part of the pipeline key, so the archetype can be retrieved during
/// get_pipeline() -> GraphicPipeline for DynMaterialPipelineKey. The layout is used directly to create named
/// shader defs that contain the material data offset for each item. In the case of blender materials the Archetype
/// shader string holds the shader code derived from the blender material graph. This generated code is inserted
/// into a blender wrapper shader via a shader def. Both blender materials and Bevy's standard materials are
/// translated to DynMaterials. Blender materials are generally rendered deferred, but in cases where there are
/// multiple BSDFs in one material, or alpha blending is used, they are rendered in a forward pass. (This is
/// somewhat similar to EEVEE). Blender materials are inserted into the GLTF via a `gather_material_hook` plugin.
/// There's support for marking nodes to use material properties for inputs instead of constants, this and using
/// indices for textures allows for materials with the same graph layout to be merged into the same pipeline.
/// Maybe there will be a system in the future that automatically reconfigures graphs to minimize pipeline switches.
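/// A minimal sketch of how a layout could be turned into named shader defs holding each item's offset into the
/// material's u32 list. `LayoutField`, `layout_to_shader_defs` and the `<NAME>_OFFSET` naming are hypothetical,
/// chosen only for illustration; the real layout type and def format may differ.

```rust
/// One named field in a material layout (hypothetical type for this sketch).
pub struct LayoutField {
    pub name: &'static str,
    /// Size of the field in u32 words (e.g. 1 for an f32 or a texture index, 4 for a vec4).
    pub size_words: u32,
}

/// Emits one `#define <NAME>_OFFSET <n>` line per field, where `n` is the
/// field's starting index into the material's list of u32s.
pub fn layout_to_shader_defs(fields: &[LayoutField]) -> Vec<String> {
    let mut offset = 0u32;
    let mut defs = Vec::with_capacity(fields.len());
    for field in fields {
        defs.push(format!(
            "#define {}_OFFSET {}",
            field.name.to_uppercase(),
            offset
        ));
        // The next field starts right after this one.
        offset += field.size_words;
    }
    defs
}
```

/// Two materials whose layouts produce the same defs can share one shader permutation, which is what makes merging
/// by graph layout possible.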
/// Generally, instances (transform, material index, etc..) are retained on the GPU. Change detection on the CPU
/// triggers a compute shader to update the specific instances that have changed. A free list is maintained on the
/// CPU when instances are removed/when the buffer is resized (updated via on_remove component hook). Whenever the
/// instance buffer is created, the size is max(2x the current needed size, 100k). 100k instances is only 24MB so
/// this both doesn't use much memory and keeps from needing to recreate the buffer often. If space runs out the
/// buffer is recreated from scratch from the CPU. This probably causes very occasional stutters. This general
/// update strategy still needs to be applied to materials, meshes, bindposes, lights, etc. Those currently send
/// the full state every frame, or whenever something changes (meshes, for example).
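/// A rough CPU-side sketch of the sizing rule and free list described above. `InstanceSlots` and its methods are
/// hypothetical names; the actual bookkeeping (component hooks, GPU upload) is not shown.

```rust
/// Lower bound on instance buffer capacity, per the sizing rule above.
pub const MIN_INSTANCE_CAPACITY: usize = 100_000;

/// Capacity to allocate when the buffer is (re)created: max(2x needed, 100k).
pub fn instance_buffer_capacity(needed: usize) -> usize {
    (needed * 2).max(MIN_INSTANCE_CAPACITY)
}

/// Tracks which slots of the GPU instance buffer are in use (hypothetical type).
pub struct InstanceSlots {
    capacity: usize,
    next_unused: usize,
    free_list: Vec<usize>,
}

impl InstanceSlots {
    pub fn new(needed: usize) -> Self {
        Self {
            capacity: instance_buffer_capacity(needed),
            next_unused: 0,
            free_list: Vec::new(),
        }
    }

    /// Returns a slot index, reusing freed slots first. `None` means the
    /// buffer is full and has to be recreated with a larger capacity.
    pub fn allocate(&mut self) -> Option<usize> {
        if let Some(slot) = self.free_list.pop() {
            return Some(slot);
        }
        if self.next_unused < self.capacity {
            let slot = self.next_unused;
            self.next_unused += 1;
            Some(slot)
        } else {
            None
        }
    }

    /// Called when an instance is removed (e.g. from an on_remove hook).
    pub fn release(&mut self, slot: usize) {
        self.free_list.push(slot);
    }

    pub fn capacity(&self) -> usize {
        self.capacity
    }
}
```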
/// All of the render passes use write_indirect_and_culling.hlsl which performs occlusion/frustum culling.
/// Bevy's existing frustum culling is currently used since it potentially reduces the work of various things down
/// the line. In the future this will probably be replaced with something faster and/or only happen on the gpu.
/// write_indirect_and_culling.hlsl takes a buffer of indices into the instance buffer and generates
/// DrawIndexedIndirectCommands. It writes these densely and updates the count buffer. On macOS the count buffer
/// isn't supported, so it writes empty DrawIndexedIndirectCommands in the gaps. (macOS then also greatly
/// benefits from cpu culling so the indirect count can be as low as possible. It may be possible at some point to
/// improve things by reading back the counts and using them to inform future frames.)
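/// The two output strategies can be illustrated sequentially on the CPU. This is only a sketch: the real shader
/// runs in parallel and would presumably bump the count atomically. The struct follows the standard indexed
/// indirect argument layout; `write_dense` and `write_with_gaps` are hypothetical names.

```rust
/// Standard indexed indirect draw arguments (five 32-bit fields).
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct DrawIndexedIndirectCommand {
    pub index_count: u32,
    pub instance_count: u32,
    pub first_index: u32,
    pub vertex_offset: i32,
    pub first_instance: u32,
}

/// Dense path: surviving commands are packed at the front of `out` and the
/// returned value is what would be written to the count buffer.
pub fn write_dense(
    candidates: &[DrawIndexedIndirectCommand],
    visible: &[bool],
    out: &mut [DrawIndexedIndirectCommand],
) -> u32 {
    let mut count = 0u32;
    for (cmd, &is_visible) in candidates.iter().zip(visible) {
        if is_visible {
            out[count as usize] = *cmd;
            count += 1;
        }
    }
    count
}

/// Fallback path without a count buffer: every candidate keeps its slot and
/// culled ones become empty (all-zero) commands that draw nothing.
pub fn write_with_gaps(
    candidates: &[DrawIndexedIndirectCommand],
    visible: &[bool],
    out: &mut [DrawIndexedIndirectCommand],
) {
    for ((cmd, &is_visible), slot) in candidates.iter().zip(visible).zip(out.iter_mut()) {
        *slot = if is_visible {
            *cmd
        } else {
            DrawIndexedIndirectCommand::default()
        };
    }
}
```

/// In the fallback path the draw count always equals the number of candidates, which is why culling on the CPU
/// first pays off there.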
/// Instance data in GPU format and pipeline keys are cached on the mesh entity, updated via change detection.
/// Draws are binned by pipeline key in a vec (draw_cmd_lists). Hashmaps keep the mapping from the pipeline key
/// to vec index. (main_draw_mapping & prepass_draw_mapping) Mesh entities keep a DrawCmdListIndices that indexes
/// directly into the draw_cmd_lists. Draw command lists are just cleared each frame, keeping the allocation.
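/// A small sketch of this binning scheme, assuming a generic key and draw type. `DrawBins` and its method names
/// are hypothetical; only `draw_cmd_lists` and the idea of a key-to-index mapping come from the notes above.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Draw lists binned by pipeline key (hypothetical type for this sketch).
pub struct DrawBins<K, D> {
    pub draw_cmd_lists: Vec<Vec<D>>,
    pub mapping: HashMap<K, usize>,
}

impl<K: Hash + Eq, D> DrawBins<K, D> {
    pub fn new() -> Self {
        Self {
            draw_cmd_lists: Vec::new(),
            mapping: HashMap::new(),
        }
    }

    /// Returns the list index for `key`, creating an empty list on first use.
    /// An entity can cache this index and skip the hash lookup afterwards.
    pub fn list_index(&mut self, key: K) -> usize {
        let lists = &mut self.draw_cmd_lists;
        *self.mapping.entry(key).or_insert_with(|| {
            lists.push(Vec::new());
            lists.len() - 1
        })
    }

    /// Pushes a draw directly by cached index, with no hashing involved.
    pub fn push(&mut self, list_index: usize, draw: D) {
        self.draw_cmd_lists[list_index].push(draw);
    }

    /// Clears every list for the next frame; `Vec::clear` keeps the allocation.
    pub fn clear(&mut self) {
        for list in &mut self.draw_cmd_lists {
            list.clear();
        }
    }
}
```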
/// The first prepass just renders deferred things if the screen ratio is being used, and all the forward prepass items.
/// (The screen ratio renders deferred things that take a large portion of the screen in a depth only prepass)
/// This first prepass only currently has frustum culling. "low_z_candidates"
/// Optionally a low z pyramid is rendered from the prepass depth and deferred high screen ratio depth only items.
/// This LowZPyramid is currently only run if deferred was included and using screen ratio pre-prepass items.
/// The second prepass skips any forward prepass items since they were already rendered in the first prepass.
/// So it just renders depth only for deferred assuming the DepthPrepassForDeferred component is on the camera.
/// RenderPrepass2 uses both frustum and occlusion culling (culling things from being drawn in the 2nd prepass)
/// The previous frame's LowZPyramid and previous frame's camera view are used to cull, along with the first
/// LowZPyramid if it was run. This DepthPrepassForDeferred doesn't need to be perfect, the depth from what is drawn
/// in RenderPrepass2 will be used to cull the actual deferred and all passes after this.
/// LowZPyramid of the depth drawn so far "low_z_full"
/// Shadows, currently only one cascade is drawn. Uses GPU frustum culling, no occlusion culling.
/// Deferred gbuffer pass. Uses occlusion/frustum via write_indirect_and_culling.hlsl. Gbuffer is just U32x4 + depth
/// Everything is crammed into those two attachments. (motion vectors, normals, etc...)
/// In a single pass, makes 5 mip levels of: depth, normal, normal_sm_vs, color (pre-reprojected from last frame),
/// motion (motion vectors only currently supported on deferred, falls back to camera motion from depth otherwise)
/// Tiled lighting. 64x32 screen space tiles. Each tile can hold 48 point lights and an additional 48 cheap lights.
/// (Cheap lights are lambert, diffuse only, spot lights that fully fit in U32x4)
/// The tiles are fully preallocated so the start points are already known. Each light range aabb is intersected
/// with each tile. The LowZPyramid is also used to occlude lights. Occlusion works better here than for draws
/// since it is constrained to the tile allowing it to consider a much smaller area, using a more accurate, higher
/// mip.
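/// A sketch of why preallocated tiles make the start points known up front. It assumes "64x32" means a grid of
/// 64 by 32 tiles and that a light's range AABB has already been projected to a rectangle in normalized [0, 1]
/// screen coordinates; both are assumptions, as are all names below. Occlusion against the LowZPyramid is omitted.

```rust
pub const TILES_X: u32 = 64;
pub const TILES_Y: u32 = 32;
pub const LIGHTS_PER_TILE: u32 = 48;

/// Index of the first light slot of a tile. Because every tile owns a fixed
/// number of slots, this is pure arithmetic with no prefix sum needed.
pub fn tile_start(tile_x: u32, tile_y: u32) -> u32 {
    (tile_y * TILES_X + tile_x) * LIGHTS_PER_TILE
}

/// Inclusive range of tiles (min tile, max tile) touched by a rectangle given
/// in normalized screen coordinates, or `None` if it is entirely off screen.
pub fn tiles_covered(min: [f32; 2], max: [f32; 2]) -> Option<([u32; 2], [u32; 2])> {
    if max[0] < 0.0 || max[1] < 0.0 || min[0] > 1.0 || min[1] > 1.0 {
        return None;
    }
    // Clamp to the screen, scale to tile units, and keep 1.0 inside the last tile.
    let to_tile = |v: f32, n: u32| ((v.clamp(0.0, 1.0) * n as f32) as u32).min(n - 1);
    Some((
        [to_tile(min[0], TILES_X), to_tile(min[1], TILES_Y)],
        [to_tile(max[0], TILES_X), to_tile(max[1], TILES_Y)],
    ))
}
```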
/// Basic visibility bit mask SSAO. Currently no filtering step. Run at full res.
/// Forward opaque pass. Used for blender materials with multiple BSDFs in the same material. (Can optionally be
/// used for everything instead of deferred)
/// Deferred lighting pass.
/// SSR uses the BSDF from deferred, currently just one sample, no filtering yet. Uses VNDF. Falls back to
/// environment. If SSR is on, environment specular is skipped. So it can be just summed on top of the frame in the
/// SSR pass. The reprojected previous frame color is used. Could also use the current frame color which would be
/// missing ssr *and* missing env. One idea would actually be to use env initially in the deferred lighting pass,
/// then subtract it and add SSR. Letting the SSR bounce be lit by the environment map. Idk, will probably do
/// something else with actual ray tracing.
/// Decals can be set up in blender; these project their arbitrary blender material onto opaque things. The blender
/// material runs like usual but the normals/positions/tangents/etc... of the underlying surface are used instead of
/// its own.
/// Alpha blend
/// One single pass of transmissive materials reads from the current state of the frame without any mips currently.
/// Just copies the state of the current frame, just does mip 0, leaves to the next frame to reproject and make mips
/// FXAA. Sucks like usual.
/// CMAA2 can be used with TAA for TSCMAA
/// Fairly basic TAA. Gets object and armature motion vectors. Mostly does stuff from:
/// Just some white noise with a slight low pass. Will probably be moved to upscale to skip the full screen triangle
/// pass
/// Stuff
/// Mostly the same as Bevy. TonyMcMapface & AgX FTW
/// The render can run at a lower or higher scale than the swap chain. This pass resolves it with Catmull-Rom for a
/// bit of extra sharpness. Also converts from R16G16B16A16_SFLOAT to R8G8B8A8_UNORM.
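/// For reference, a 1D sketch of Catmull-Rom resampling using the standard cubic weights, plus a float to 8-bit
/// UNORM conversion. The real pass is a 2D shader and may use an optimized tap layout; function names here are
/// hypothetical. The negative outer weights are where the extra sharpness comes from.

```rust
/// Standard Catmull-Rom weights for the four samples around a position with
/// fractional offset `t` in [0, 1). The weights always sum to 1.
pub fn catmull_rom_weights(t: f32) -> [f32; 4] {
    let t2 = t * t;
    let t3 = t2 * t;
    [
        -0.5 * t3 + t2 - 0.5 * t,
        1.5 * t3 - 2.5 * t2 + 1.0,
        -1.5 * t3 + 2.0 * t2 + 0.5 * t,
        0.5 * t3 - 0.5 * t2,
    ]
}

/// Samples a non-empty 1D signal at fractional position `x`, clamping taps at the edges.
pub fn resample_1d(src: &[f32], x: f32) -> f32 {
    let base = x.floor();
    let weights = catmull_rom_weights(x - base);
    let last = src.len() as isize - 1;
    let mut sum = 0.0;
    for (k, w) in weights.iter().enumerate() {
        // Taps sit at base - 1, base, base + 1, base + 2.
        let i = (base as isize + k as isize - 1).clamp(0, last);
        sum += w * src[i as usize];
    }
    sum
}

/// Converts a float channel to 8-bit UNORM: clamp to [0, 1], scale, round to nearest.
pub fn to_unorm8(v: f32) -> u8 {
    (v.clamp(0.0, 1.0) * 255.0 + 0.5) as u8
}
```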
/// egui is currently the only thing supported. Thankfully the Bevy egui plugin supports custom render backends.