Last active
April 3, 2024 11:20
Avoiding Branching / Conditionals in Shaders Fast No If
"The common wisdom of "don't use conditionals in shaders" is one of my biggest frustrations with how shaders are taught.
step(y, x) _is_ a conditional! It compiles to identical code as:
float val = (x >= y ? 1.0 : 0.0)
or
float val = 0.0;
if (x >= y) val = 1.0;"
https://twitter.com/bgolus/status/1235254923819802626
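A small CPU-side sketch of the point in the quote, written in C rather than shader code. The names step_f and step_if are hypothetical helpers, not HLSL intrinsics; they only mirror the value that step(edge, x) is documented to return. This checks that the results match; it says nothing about what a particular shader compiler emits.

```c
/* Hypothetical stand-in for HLSL step(edge, x): 1.0 when x >= edge, else 0.0. */
static float step_f(float edge, float x) {
    return x >= edge ? 1.0f : 0.0f;
}

/* The same value written as an explicit branch, as in the quote. */
static float step_if(float edge, float x) {
    float val = 0.0f;
    if (x >= edge) {
        val = 1.0f;
    }
    return val;
}
```

Both functions return the same value for every input, which is the sense in which step() "is" a conditional.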
// Performing shader divisions without dividing (rcp(x) = approximation of 1/x)
float myDividedVal = myValToDivide * rcp(myDivider);
https://twitter.com/andreintg/status/1384136830430310405
pow vs loop
https://forum.unity.com/threads/power-or-multiplication-in-a-loop.654358/
// loop: each pass squares result, so 9 passes give result^(2^9) = result^512
for (int x = 0; x < 9; x++)
    result *= result;
// pow
pow(result, 512);
"Is the loop ever faster than the pow() function?"
The answer is no, it is never faster for hard-coded powers set in the shader. At best they are equivalent.
Basically, above powers of 512, pow() always wins.
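To make the exponent bookkeeping concrete, here is a minimal C sketch (CPU-side, not shader code). The function names are hypothetical. Squaring a value n times raises it to the power 2^n, so 8 squarings give the 256th power and 9 squarings are needed to reach the 512th.

```c
/* Square `base` n times; the result is base^(2^n). */
static double square_n_times(double base, int n) {
    double result = base;
    for (int i = 0; i < n; i++) {
        result *= result;
    }
    return result;
}

/* Reference: multiply `base` into an accumulator `exponent` times. */
static double mul_n_times(double base, int exponent) {
    double result = 1.0;
    for (int i = 0; i < exponent; i++) {
        result *= base;
    }
    return result;
}
```

With base 2.0 every intermediate value is an exact power of two, so the two routines can be compared with == without rounding concerns.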
---------------------------------------------------
// orig
fixed mask = 0;
if (i.uv.x > 1 && i.uv.x < 2) {
    mask = 1;
}
// new (step(edge, x) returns 1 when x >= edge, so this is 1 for 1 <= i.uv.x <= 2)
fixed mask = 0;
mask += 1 * step(1, i.uv.x) * step(i.uv.x, 2);
"Except that step() function is implemented as an "if". Those two examples will either compile to identical shader assembly, or the lerp will be slower!"
https://forum.unity.com/threads/color-replacement-without-texture.562819/#post-3737593
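The argument order of step() is easy to get backwards, so here is a hedged C sketch of the range mask. It is CPU-side only; step_f, mask_branch, and mask_step are hypothetical names. It assumes step(edge, x) returns 1 when x >= edge, which means the lower bound goes in the first argument of one call and the upper bound in the second argument of the other.

```c
/* Hypothetical stand-in for HLSL step(edge, x). */
static float step_f(float edge, float x) {
    return x >= edge ? 1.0f : 0.0f;
}

/* Branch version: 1 only when 1 < u < 2 (strict comparisons). */
static float mask_branch(float u) {
    float mask = 0.0f;
    if (u > 1.0f && u < 2.0f) {
        mask = 1.0f;
    }
    return mask;
}

/* step version: 1 when 1 <= u <= 2 (the edges are included). */
static float mask_step(float u) {
    return step_f(1.0f, u) * step_f(u, 2.0f);
}
```

The two agree everywhere except exactly at u == 1 and u == 2, where the step version includes the edge and the strict branch does not.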
---------------------------------------------------
Also this.
if (value > 0.5) {
    col = valueA;
}
else
{
    col = valueB;
}
//could (and should) be replaced with something like this:
col = lerp(valueB, valueA, step(0.5, value)); // step(0.5, value) is 1 when value >= 0.5
//or potentially:
col = lerp(valueB, valueA, saturate((value - 0.5) * 1000));
https://forum.unity.com/threads/mask-between-2-values.497799/
http://xissburg.com/eliminating-branches-in-shaders/
http://theorangeduck.com/page/avoiding-shader-conditionals
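A CPU-side C sketch of the three selection styles, using scalars in place of colors. All helper names (step_f, lerp_f, saturate_f, and the select_* functions) are hypothetical stand-ins for the HLSL intrinsics, under the usual definitions: lerp(a, b, t) = a + (b - a) * t and saturate(v) clamps to [0, 1].

```c
static float step_f(float edge, float x) { return x >= edge ? 1.0f : 0.0f; }
static float lerp_f(float a, float b, float t) { return a + (b - a) * t; }
static float saturate_f(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

/* Explicit branch. */
static float select_branch(float value, float valueA, float valueB) {
    if (value > 0.5f) {
        return valueA;
    }
    return valueB;
}

/* Hard select: the step result is 1 when value >= 0.5, which picks valueA. */
static float select_step(float value, float valueA, float valueB) {
    return lerp_f(valueB, valueA, step_f(0.5f, value));
}

/* Steep ramp: blends between the two within about 0.001 above 0.5. */
static float select_saturate(float value, float valueA, float valueB) {
    return lerp_f(valueB, valueA, saturate_f((value - 0.5f) * 1000.0f));
}
```

Away from the threshold all three return the same value; the saturate variant is not a hard switch and returns a blend in the narrow band just above 0.5.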
// slow
if (x < 0.5)
{
    x = a;
} else {
    x = b;
}
// faster (step(0.5, x) is 1 when x >= 0.5, so only one of the two terms is non-zero, including at x == 0.5)
x = a * (1 - step(0.5, x)) + b * step(0.5, x);
https://twitter.com/LeLocTai/status/834258304209555456
"You can always perform a signed integer range test with a single comparison. unsigned(x - lo) <= unsigned(hi - lo)"
https://twitter.com/ericlengyel/status/867967467770920960
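The single-comparison range test from the quote can be sketched in C as follows. The function names are hypothetical, and the sketch assumes lo <= hi. It converts each operand to unsigned before subtracting, which gives the same bits as unsigned(x - lo) but avoids signed overflow, which is undefined behavior in C.

```c
#include <stdbool.h>

/* Conventional test with two comparisons. */
static bool in_range_two_compares(int x, int lo, int hi) {
    return lo <= x && x <= hi;
}

/* Single comparison: if x < lo, the unsigned difference wraps around to a
   very large value, so it fails the <= test just as x > hi does. */
static bool in_range_one_compare(int x, int lo, int hi) {
    return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
}
```

For example, with lo = 1 and hi = 10, x = 0 yields an unsigned difference of UINT_MAX, which is greater than 9, so the test rejects it.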
**********************************************************
void vert(inout appdata_full v, out Input o)
{
    UNITY_INITIALIZE_OUTPUT(Input, o);
    //0 <= dot(AB,AM) <= dot(AB,AB) &&
    //0 <= dot(BC,BM) <= dot(BC,BC)
    o.worldPos = mul(unity_ObjectToWorld, v.vertex).xyz;
    half abam = dot(_Point1-_Point2, _Point1-o.worldPos);
    half abab = dot(_Point1-_Point2,_Point1-_Point2);
    half bcbm = dot(_Point2-_Point3, _Point2-o.worldPos);
    half bcbc = dot(_Point2-_Point3,_Point2-_Point3);
    if(0 <= abam && abam <= abab && 0 <= bcbm && bcbm <= bcbc)
        //o.delta = clamp(max(0,abam),0,1)*clamp(max(0,bcbm),0,1);
        v.vertex.y -= v.vertex.y;//*o.delta;
}
// to
half4 temp = saturate(half4(abam, bcbm, abab, bcbc) - half4(0.0, 0.0, abam, bcbm));
temp.xy *= temp.zw;
o.delta = temp.x * temp.y;
https://forum.unity.com/threads/avoiding-multiple-ifs-in-vert-shader.518551/
https://forum.unity.com/threads/add-one-pixel-border-to-a-rendertexture-result-in-shader.389121/#post-2535441
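As a rough check of how the saturate-based o.delta relates to the original four-way if in the rectangle test, here is a CPU-side C sketch working on the four dot products as plain floats. The function names are hypothetical, and saturate_f assumes the usual clamp-to-[0, 1] definition.

```c
static float saturate_f(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

/* Original condition: 0 <= abam <= abab and 0 <= bcbm <= bcbc. */
static int inside_branch(float abam, float abab, float bcbm, float bcbc) {
    return 0.0f <= abam && abam <= abab && 0.0f <= bcbm && bcbm <= bcbc;
}

/* Branch-free version, following the half4 arithmetic component by component. */
static float delta_saturate(float abam, float abab, float bcbm, float bcbc) {
    float tx = saturate_f(abam);        /* temp.x */
    float ty = saturate_f(bcbm);        /* temp.y */
    float tz = saturate_f(abab - abam); /* temp.z */
    float tw = saturate_f(bcbc - bcbm); /* temp.w */
    tx *= tz;                           /* temp.xy *= temp.zw */
    ty *= tw;
    return tx * ty;                     /* o.delta */
}
```

The result is 0 whenever the branch condition fails and 1 well inside the rectangle, but it is a fractional value where a dot product lies within 1 of a boundary, so it behaves as a soft mask rather than an exact replacement for the if.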
Branching on a GPU (is mostly ok, and good even!)
https://medium.com/@jasonbooth_86226/branching-on-a-gpu-18bfc83694f2