@rygorous
rygorous / gist:1907292
Created February 25, 2012 08:05
Getting rid of the LUTs
If you want to get rid of the LUTs:
lut16
=====
Assume a 4-bit x=abcd (a, b, c, d are bits) "spread" such that:
x_4bits = 0x0a0b0c0d;
(this can be done with 2 "shift-and-select" class operations, for instance).
Then compute:
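The preview is cut off before the computation itself, so the following only illustrates the "spread" step described above. It is a minimal sketch, assuming "shift-and-select" means a shift, an OR, and a mask; the helper name `spread4` and the particular mask constants are assumptions, not taken from the gist.

```c
#include <stdint.h>

/* Hypothetical helper (name and constants are assumptions, not from the gist):
   spread the 4 bits a,b,c,d of x so that each lands in the low bit of its
   own byte, i.e. 0x0a0b0c0d in the notation used above. */
static uint32_t spread4(uint32_t x)
{
    x &= 0xFu;                          /* keep only the bits abcd           */
    x = (x | (x << 14)) & 0x00030003u;  /* ab -> bits 17..16, cd -> bits 1..0 */
    x = (x | (x << 7))  & 0x01010101u;  /* a -> 24, b -> 16, c -> 8, d -> 0   */
    return x;
}
```

Each of the two steps is one shift plus one select (mask), which matches the count of two operations mentioned above.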
rygorous / fp16_to_32.asm
Created March 21, 2012 04:37
half->float using SSE2
; input: 4x F16 in XMM0 (low words of each DWord)
; original idea+implementation by Dean Macri
; WARNING: copy & pasted together from other code, this ver is untested!!
; though the original version was definitely correct.
bits 32
section .data
rygorous / gist:2144712
Created March 21, 2012 05:20
half->float variants
// half->float variants.
// by Fabian "ryg" Giesen.
//
// I hereby place this code in the public domain.
//
// half_to_float_fast: table based
// tables could be done in a more compact fashion (in particular, can store tab2 in low word of tab1!)
// but something of a dead end since not very SIMD-friendly. pretty much abandoned at this point.
//
// half_to_float_fast2: use FP adder hardware to deal with denormals.
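The gist body is truncated here, so the following is only a sketch of what "use FP adder hardware to deal with denormals" can look like, not the gist's own code. The function and helper names are assumptions. The idea is to place the half mantissa in the low bits of the float 0.5 and then subtract 0.5, so that the floating-point subtraction performs the renormalization.

```c
#include <stdint.h>
#include <string.h>

/* Bit-cast helpers (hypothetical names); memcpy avoids aliasing problems. */
static float bits_to_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }
static uint32_t float_to_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }

/* Sketch of a half->float conversion (name is an assumption). */
static float half_to_float_sketch(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t man  = h & 0x3FFu;
    uint32_t out;

    if (exp == 0) {
        /* Zero or denormal: magic is the bit pattern of 0.5f, whose ulp is
           2^-24, the same weight as one half-denormal mantissa step. */
        const uint32_t magic = 126u << 23;
        out = float_to_bits(bits_to_float(magic + man) - bits_to_float(magic));
    } else if (exp == 0x1Fu) {
        out = (255u << 23) | (man << 13);           /* Inf or NaN */
    } else {
        out = ((exp + 112u) << 23) | (man << 13);   /* rebias 15 -> 127 */
    }
    return bits_to_float(out | sign);
}
```

Because 0.5 + m * 2^-24 is exactly representable for every 10-bit m, the subtraction is exact and no rounding occurs on the denormal path.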
rygorous / gist:2156668
Last active June 17, 2025 11:16
float->half variants
// float->half variants.
// by Fabian "ryg" Giesen.
//
// I hereby place this code in the public domain, as per the terms of the
// CC0 license:
//
// https://creativecommons.org/publicdomain/zero/1.0/
//
// float_to_half_full: This is basically the ISPC stdlib code, except
// I preserve the sign of NaNs (any good reason not to?)
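The preview stops before the code, so the following is a hedged sketch of a straightforward float->half conversion that keeps the sign bit of NaNs, as the comment above describes. It is not the gist's implementation; the function name is an assumption, and this version rounds on the first discarded bit (ties go up in magnitude) rather than rounding ties to even.

```c
#include <stdint.h>
#include <string.h>

/* Sketch only (hypothetical name): float -> half, sign of NaN preserved. */
static uint16_t float_to_half_sketch(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);

    uint16_t sign = (uint16_t)((u >> 16) & 0x8000u);
    uint32_t exp  = (u >> 23) & 0xFFu;
    uint32_t man  = u & 0x7FFFFFu;
    uint16_t out  = 0;  /* float zero/denormal underflows to signed zero */

    if (exp == 255u) {
        /* Inf stays Inf; any NaN becomes a quiet NaN. */
        out = (uint16_t)(0x7C00u | (man ? 0x200u : 0u));
    } else if (exp != 0u) {
        int newexp = (int)exp - 127 + 15;       /* rebias 127 -> 15 */
        if (newexp >= 31) {
            out = 0x7C00u;                      /* overflow -> Inf */
        } else if (newexp <= 0) {
            if (14 - newexp <= 24) {            /* result may be a half denormal */
                uint32_t m = man | 0x800000u;   /* restore the hidden 1 bit */
                out = (uint16_t)(m >> (14 - newexp));
                if ((m >> (13 - newexp)) & 1u)
                    out++;                      /* round on first dropped bit */
            }
        } else {
            out = (uint16_t)(((uint32_t)newexp << 10) | (man >> 13));
            if (man & 0x1000u)
                out++;                          /* carry may reach the exponent; that is fine */
        }
    }
    return (uint16_t)(out | sign);              /* sign applied last, NaNs included */
}
```

Applying the sign as the final step is what makes a negative NaN come out with its sign bit set.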
rygorous / gist:2202577
Created March 26, 2012 02:57
ISPC surprising code-gen edge case
export void good(uniform int output[], uniform const int a[], uniform const int b[], uniform unsigned int count)
{
    count &= ~15;
    foreach (i = 0 ... count)
    {
        // This function works as you'd expect
        int x = a[i];
        int y = b[i];
        output[i] = (x > 0) ? x : y;
    }
rygorous / gist:2203834
Created March 26, 2012 08:03
float->sRGB8 using SSE2 (and a table)
// float->sRGB8 conversions - two variants.
// by Fabian "ryg" Giesen
//
// I hereby place this code in the public domain.
//
// Both variants come with absolute error bounds and a reversibility and monotonicity
// guarantee (see test driver code below). They should pass D3D10 conformance testing
// (not that you can verify this, but still). They are verified against a clean reference
// implementation provided below, and the test driver checks all floats exhaustively.
//
rygorous / gist:2246678
Created March 30, 2012 05:09
float->sRGB8 for ISPC 1.2.0
// float->sRGB8 conversions - two variants.
// by Fabian "ryg" Giesen
//
// I hereby place this code in the public domain.
//
// Both variants come with absolute error bounds and a reversibility and monotonicity
// guarantee. They should pass D3D10 conformance testing.
//
// This is an ISPC port of https://gist.github.com/2203834 - see there for a test
// driver and code that computes the tables.
rygorous / gist:2390021
Created April 15, 2012 04:28
ye olde texgen code
#include "types.h"
#include "opsys.h"
#include "texture.h"
#include "opdata.h"
#include "rtlib.h"
#include "math3d_2.h"
// for text stuff
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
rygorous / Stats.js
Created April 25, 2012 03:48
Half-space tri rasterizers in JS
// stats.js r9 - http://github.com/mrdoob/stats.js
var Stats=function(){var h,a,r=0,s=0,i=Date.now(),u=i,t=i,l=0,n=1E3,o=0,e,j,f,b=[[16,16,48],[0,255,255]],m=0,p=1E3,q=0,d,k,g,c=[[16,48,16],[0,255,0]];h=document.createElement("div");h.style.cursor="pointer";h.style.width="80px";h.style.opacity="0.9";h.style.zIndex="10001";h.addEventListener("mousedown",function(a){a.preventDefault();r=(r+1)%2;0==r?(e.style.display="block",d.style.display="none"):(e.style.display="none",d.style.display="block")},!1);e=document.createElement("div");e.style.textAlign=
"left";e.style.lineHeight="1.2em";e.style.backgroundColor="rgb("+Math.floor(b[0][0]/2)+","+Math.floor(b[0][1]/2)+","+Math.floor(b[0][2]/2)+")";e.style.padding="0 0 3px 3px";h.appendChild(e);j=document.createElement("div");j.style.fontFamily="Helvetica, Arial, sans-serif";j.style.fontSize="9px";j.style.color="rgb("+b[1][0]+","+b[1][1]+","+b[1][2]+")";j.style.fontWeight="bold";j.innerHTML="FPS";e.appendChild(j);f=document.createElement("div");f.style.position="relati
rygorous / p4v_wishlist.txt
Created June 1, 2012 17:42
P4V wishlist
(All issues here reported for P4V NTX64/2012.1/459107 on Windows 7)
P4V UI/Workflow issues
======================
- Feature request: Auto-"Refresh all" when switching current task to P4V (WM_ACTIVATEAPP).
  Reason: I frequently Alt-Tab to P4V to add a newly created file, only to not find it the
  first time, then hit F5, then actually get to add it. "Refresh all" seems to be
  reasonably fast, so why not automatically do it when the window has to repaint anyway?
- Feature request: "Edit Current Workspace" dialog (or maybe "Environment Variables"?)