Vyacheslav S. S-V

NOIR: A Neatly Optimal Intermediate Representation

By Leonard Ritter (25.9.2024, last update 6.1.2024)

NOIR is a high-level function-scope intermediate representation for programs, extending SSA Form but omitting basic blocks in favor of a simple acyclic dependency structure that is simpler to evaluate, analyze, canonicalize and optimize, particularly in the context of concurrent and iterative execution, and offers different lowering strategies.

In order to render a function-level IR fully functional, we only need five things: instructions, ordering, branches, merges and loops. In the following sections, we will introduce and describe each element.

As an introductory example and for overview, here is an iterative fibonacci procedure expressed in NOIR pseudocode:

Note: this was originally several Reddit posts, chained and linked. But now that Reddit is dying I've finally moved them out. Sorry about the mess.

URL: https://www.reddit.com/r/ProgrammingLanguages/comments/up206c/stack_machines_for_compilers/i8ikupw/ Summary: stack-based vs register-based in general.

There are a wide variety of machines that can be described as "stack-based" or "register-based", but not all of them are practical. And there are a lot of other decisions that affect that practicality (do variables have names or only address/indexes? fixed-width or variable-width instructions? are you interpreting the bytecode (and if so, are you using machine stack frames?) or turning it into machine code? how many registers are there, and how many are special? how do you represent multiple types of variable? how many scopes are there(various kinds of global, local, member, ...)? how much effort/complexity can you afford to put into your machine? etc.)

a pure stack VM can only access the top elemen

Realtime Fluid Simulation: Projection

The core of most real-time fluid simulators, like the one in EmberGen, are based on the "Stable Fluids" algorithm by Jos Stam, which to my knowledge was first presented at SIGGRAPH '99. This is a post about one part of this algorithm that's often underestimated: Projection

MG4_F32.mp4

Stable Fluids

The Stable Fluids algorithm solves a subset of the famous "Navier Stokes equations", which describe how fluids interact and move. In particular, it typically solves what's called the "incompressible Euler equations", where viscous forces are often ignored.

A quick breakdown of lighting in the restir-meets-surfel branch of my renderer, where I revive some olde surfel experiments, and generously sprinkle ReSTIR on top.

General remarks

Please note that this is all based on work-in-progress experimental software, and represents a single snapshot in development history. Things will certainly change 😛

Due to how I'm capturing this, there's frame-to-frame variability, e.g. different rays being shot, TAA shimmering slightly. Some of the images come from a dedicated visualization pass, and are anti-aliased, and some show internal buffers which are not anti-aliased.

Final images

2022-02-01 Is Simulating Turbulent Smoke Possible?💨

Predicting High-Resolution Turbulence Details in Space and Time

2022-01-31 Next Level Paint Simulations Are Coming! 🎨🖌️

Practical Pigment Mixing for Digital Painting

2022-01-27 Adobe's New Simulation: Bunnies Everywhere! 🐰

FrictionalMonolith: A Monolithic Optimization-based Approach for Granular Flow with Contact-Aware Rigid-Body

2022-01-26 Wow, A Simulation That Matches Reality! 🤯

Marching Diamond Crystal Lattice

Git Repository

https://github.com/d3x0r/MarchingTetrahedra

Marching Diamond Crystal Lattice Detail

Binary search is theoretically optimal, but it's possible to speed it up substantially using AVX2 and branchless code even in .NET Core.

Memory access is the limiting factor for binary search. When we access each element for comparison a cache line is loaded, so we could load a 32-byte vector almost free, check if it contains the target value, and if not - reduce the search space by 32/sizeof(T) elements instead of 1 element. This gives quite good performance improvement (code in BinarySearch1.cs and results in the table 1 below).

However, for larger N the search space reduction is quite minimal and the most gains come from switching to linear search. After an interesting discussion in Twitter (especially with @trav_downs), and trying to widen the pivot area to use 2 AVX2 vectors it became clear that just switching to linear search sooner is more important than using AVX2 vectors as pivots.

The linear search was not using AVX2, and for linear

	// Copyright (c) 2021 Francesco Mazzoli <[email protected]>
	//
	// Permission to use, copy, modify, and distribute this software for any
	// purpose with or without fee is hereby granted, provided that the above
	// copyright notice and this permission notice appear in all copies.
	//
	// THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
	// WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
	// MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
	// ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES

	#pragma use_dxc //enable SM 6.0 features, in Unity this is only supported on version 2020.2.0a8 or later with D3D12 enabled
	#pragma kernel CountTotalsInBlock
	#pragma kernel BlockCountPostfixSum
	#pragma kernel CalculateOffsetsForEachKey
	#pragma kernel FinalSort

	uint _FirstBitToSort;
	int _NumElements;
	int _NumBlocks;
	bool _ShouldSortPayload;

	// Copyright (C) Krzysztof Jakubowski <[email protected]>
	// This file is part of libfwk. See license.txt for details.

	#pragma once

	#include "fwk/base_vector.h"
	#include "fwk/span.h"
	#include "fwk/sys_base.h"

	namespace fwk {