rygorous · February 25, 2012 08:05
diff --git a/gistfile1.txt b/gistfile1.txt
 If you want to get rid of the LUTs:

 lut16
 =====

 Assume a 4-bit x=abcd (a, b, c, d are bits) "spread" so that each bit sits in the low bit of its own byte:
  x_4bits = 0x0a0b0c0d;
 (this can be done with 2 "shift-and-select" class operations, for instance).

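 Then compute (in 32-bit unsigned arithmetic):
  y = (x_4bits * 0x0103091b) >> 24;

 which returns y = 27*a + 9*b + 3*c + d. The multiplier bytes are 1, 3, 9, 27, so the top byte of the product collects each bit times its power of 3; every byte sum is at most 40, so no carries leak between bytes.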
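A rough C sketch of the lut16 replacement, in case it helps. The names `spread4` and `lut16_nolut` are made up for illustration, and the shift/mask constants are just one possible way to do the two shift-and-select steps; `uint32_t` is assumed so the multiply wraps mod 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Spread the 4 bits abcd of x into one bit per byte: 0x0a0b0c0d.
   (Hypothetical helper; the constants are one choice among several.) */
uint32_t spread4(uint32_t x)
{
    assert(x < 16);
    x = (x | (x << 14)) & 0x00030003u; /* ab -> bits 17..16, cd -> bits 1..0 */
    x = (x | (x << 7))  & 0x01010101u; /* a -> 24, b -> 16, c -> 8, d -> 0 */
    return x;
}

/* Returns 27*a + 9*b + 3*c + d for x = abcd, without a table. */
uint32_t lut16_nolut(uint32_t x)
{
    /* Bits above 31 of the product are dropped by 32-bit wraparound. */
    return (spread4(x) * 0x0103091bu) >> 24;
}
```

For example, x = 0xA (bits 1010) should give 27 + 3 = 30, and x = 0xF gives 40.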

 lut256
 ======

 With 64-bit unsigned ints, almost the same computation on 8 input bits handles two 4-bit chunks at once; then you use

  // x_8bits = 0x0a0b0c0d0e0f0g0h
  y = x_8bits * 0x0103091b;
  y = ((y >> 24) & 0xff) + 81 * (y >> (24 + 32));  // 81 = 3**4

 for the combination step. This is the easiest way I can think of to get a "lut256" equivalent.
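The 64-bit version could look roughly like this in C. Again, `spread8` and `lut256_nolut` are hypothetical names, and the three shift-and-select steps are an assumption about how one might do the spreading (the comment above only describes the 4-bit case).

```c
#include <assert.h>
#include <stdint.h>

/* Spread the 8 bits abcdefgh of x into one bit per byte:
   0x0a0b0c0d0e0f0g0h. (Hypothetical helper.) */
uint64_t spread8(uint64_t x)
{
    assert(x < 256);
    x = (x | (x << 28)) & 0x0000000F0000000Full; /* high nibble -> bits 35..32 */
    x = (x | (x << 14)) & 0x0003000300030003ull; /* bit pairs at 0, 16, 32, 48 */
    x = (x | (x << 7))  & 0x0101010101010101ull; /* one bit per byte */
    return x;
}

/* Returns sum of bit_i * 3^i over the 8 bits of x, without a table. */
uint32_t lut256_nolut(uint64_t x)
{
    uint64_t y = spread8(x) * 0x0103091bull;
    /* Byte 3 holds the low-nibble value, byte 7 the high-nibble value;
       81 = 3^4 scales the high nibble into place. */
    return (uint32_t)(((y >> 24) & 0xff) + 81 * (y >> 56));
}
```

As a sanity check, x = 0x80 should give 3^7 = 2187 and x = 0xFF should give 3280.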