Vyacheslav S. S-V

  • Earth
@paniq
paniq / NOIR.md
Last active April 27, 2025 11:57
NOIR: A Neatly Optimal Intermediate Representation

By Leonard Ritter (25.9.2024, last update 6.1.2024)

NOIR is a high-level function-scope intermediate representation for programs, extending SSA Form but omitting basic blocks in favor of a simple acyclic dependency structure that is simpler to evaluate, analyze, canonicalize and optimize, particularly in the context of concurrent and iterative execution, and offers different lowering strategies.

In order to render a function-level IR fully functional, we only need five things: instructions, ordering, branches, merges and loops. In the following sections, we will introduce and describe each element.

As an introductory example and for overview, here is an iterative fibonacci procedure expressed in NOIR pseudocode:

@o11c
o11c / every-vm-tutorial-you-ever-studied-is-wrong.md
Last active June 17, 2025 05:38
Every VM tutorial you ever studied is wrong (and other compiler/interpreter-related knowledge)

Note: this was originally several Reddit posts, chained and linked. But now that Reddit is dying I've finally moved them out. Sorry about the mess.


URL: https://www.reddit.com/r/ProgrammingLanguages/comments/up206c/stack_machines_for_compilers/i8ikupw/ Summary: stack-based vs register-based in general.

There is a wide variety of machines that can be described as "stack-based" or "register-based", but not all of them are practical. And there are a lot of other decisions that affect that practicality (do variables have names or only addresses/indexes? fixed-width or variable-width instructions? are you interpreting the bytecode (and if so, are you using machine stack frames?) or turning it into machine code? how many registers are there, and how many are special? how do you represent multiple types of variable? how many scopes are there (various kinds of global, local, member, ...)? how much effort/complexity can you afford to put into your machine? etc.)

  • a pure stack VM can only access the top elemen
@vassvik
vassvik / Simulation_Projection.md
Last active May 14, 2025 12:19
Realtime Fluid Simulation: Projection

The core of most real-time fluid simulators, like the one in EmberGen, is based on the "Stable Fluids" algorithm by Jos Stam, which to my knowledge was first presented at SIGGRAPH '99. This post is about one part of the algorithm that's often underestimated: projection.

Stable Fluids

The Stable Fluids algorithm solves a subset of the famous Navier-Stokes equations, which describe how fluids interact and move. In particular, it typically solves what are called the "incompressible Euler equations", where viscous forces are ignored.
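
For reference, the incompressible Euler equations are usually written as below. The notation is an assumption of this note rather than something taken from the post: u is the velocity field, p the pressure, ρ the constant density, and f any external force such as buoyancy.

```latex
\begin{aligned}
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u} &= -\frac{1}{\rho}\nabla p + \mathbf{f}, \\
\nabla \cdot \mathbf{u} &= 0.
\end{aligned}
```

The second line is the incompressibility constraint, which is the condition the projection step enforces.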

@h3r2tic
h3r2tic / restir-meets-surfel-lighting-breakdown.md
Created November 23, 2021 02:15
A quick breakdown of lighting in the `restir-meets-surfel` branch of my renderer

A quick breakdown of lighting in the restir-meets-surfel branch of my renderer, where I revive some olde surfel experiments, and generously sprinkle ReSTIR on top.

General remarks

Please note that this is all based on work-in-progress experimental software, and represents a single snapshot in development history. Things will certainly change 😛

Due to how I'm capturing this, there's frame-to-frame variability, e.g. different rays being shot, TAA shimmering slightly. Some of the images come from a dedicated visualization pass, and are anti-aliased, and some show internal buffers which are not anti-aliased.

Final images

@randompast
randompast / list.md
Last active July 9, 2024 12:16
Paper title to Two Minute Papers video
@bitonic
bitonic / vectorized-atan2f.cpp
Last active June 24, 2025 04:56
Vectorized & branchless atan2f
// Copyright (c) 2021 Francesco Mazzoli <[email protected]>
//
// Permission to use, copy, modify, and distribute this software for any
// purpose with or without fee is hereby granted, provided that the above
// copyright notice and this permission notice appear in all copies.
//
// THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
// WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
// MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
// ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
@dondragmer
dondragmer / PrefixSort.compute
Created January 20, 2021 23:32
An optimized GPU counting sort
#pragma use_dxc //enable SM 6.0 features, in Unity this is only supported on version 2020.2.0a8 or later with D3D12 enabled
#pragma kernel CountTotalsInBlock
#pragma kernel BlockCountPostfixSum
#pragma kernel CalculateOffsetsForEachKey
#pragma kernel FinalSort
uint _FirstBitToSort;
int _NumElements;
int _NumBlocks;
bool _ShouldSortPayload;
@d3x0r
d3x0r / Marching-tets.md
Last active May 30, 2025 19:41
Marching tetrahedrons; Marching Diamond Crystal Lattice; Marching Cubes/Sliced into 5 Tetrahedrons
@buybackoff
buybackoff / Benchmark.md
Last active August 31, 2024 20:44
Avx2/branch-optimized binary search in .NET

Binary search is theoretically optimal, but it's possible to speed it up substantially using AVX2 and branchless code even in .NET Core.

Memory access is the limiting factor for binary search. Each time we access an element for comparison, a whole cache line is loaded, so we can load a 32-byte vector almost for free, check whether it contains the target value, and if it does not, reduce the search space by 32/sizeof(T) elements instead of 1 element. This gives a good performance improvement (code in BinarySearch1.cs and results in Table 1 below).

However, for larger N the search space reduction is minimal, and most of the gains come from switching to linear search. After an interesting discussion on Twitter (especially with @trav_downs) and an attempt to widen the pivot area to 2 AVX2 vectors, it became clear that switching to linear search sooner matters more than using AVX2 vectors as pivots.
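
The idea of switching to linear search early can be sketched in a few lines. This is a minimal Python illustration, not the gist's .NET code: it leaves out the AVX2 vector comparison entirely, and the function name `hybrid_search` and the `linear_threshold` parameter are hypothetical names chosen for this example.

```python
def hybrid_search(values, target, linear_threshold=16):
    """Return the index of target in the sorted list values, or -1 if absent.

    Binary search narrows the window until it holds at most
    linear_threshold elements, then a linear scan finishes the job.
    linear_threshold is a hypothetical tuning knob, not a value from the gist.
    """
    lo, hi = 0, len(values)
    # Invariant: everything before lo is < target,
    # everything at index >= hi is >= target.
    while hi - lo > linear_threshold:
        mid = (lo + hi) // 2
        if values[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    # Linear phase: walk the small remaining window sequentially.
    i = lo
    while i < hi and values[i] < target:
        i += 1
    # i is now the first position whose element is >= target (or len(values)).
    if i < len(values) and values[i] == target:
        return i
    return -1


if __name__ == "__main__":
    data = list(range(0, 200, 2))
    print(hybrid_search(data, 100))
    print(hybrid_search(data, 101))
```

A larger `linear_threshold` trades a few extra sequential comparisons for fewer unpredictable jumps through memory, which is the trade-off the paragraph above describes; the best value depends on the element type and hardware and would need to be measured.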

The linear search was not using AVX2, and for linear

@nadult
nadult / !vector.h
Last active December 25, 2024 17:56
fwk::Vector: improves compilation speed of whole project by ~20%, decreases binary size by 50% (compared with std::vector)
// Copyright (C) Krzysztof Jakubowski <[email protected]>
// This file is part of libfwk. See license.txt for details.
#pragma once
#include "fwk/base_vector.h"
#include "fwk/span.h"
#include "fwk/sys_base.h"
namespace fwk {