@rygorous
rygorous / main.cpp
Created December 24, 2024 15:07
Direct UNORM/SNORM conversion test program
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
static uint32_t float_to_bits(float x)
{
    // Copy the raw IEEE-754 bit pattern; memcpy avoids aliasing problems.
    uint32_t u;
    memcpy(&u, &x, sizeof(x));
    return u;
}
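A minimal sketch of how a helper like float_to_bits can be sanity-checked, assuming 32-bit IEEE-754 floats. The inverse bits_to_float is a hypothetical companion added for illustration and does not appear in the preview above.

```cpp
#include <cstdint>
#include <cstring>

// Same idea as the gist's helper: reinterpret a float's bits via memcpy.
static uint32_t float_to_bits(float x)
{
    uint32_t u;
    std::memcpy(&u, &x, sizeof(x));
    return u;
}

// Hypothetical inverse (an assumption, not part of the original preview).
static float bits_to_float(uint32_t u)
{
    float x;
    std::memcpy(&x, &u, sizeof(u));
    return x;
}
```

For example, 1.0f has sign 0, biased exponent 127 and a zero mantissa, so float_to_bits(1.0f) yields 0x3f800000.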
// ep1 - ep0 >= 0: four-interpolated color mode
// ep1 - ep0 < 0: six-interpolated-color mode
//
// we want to avoid ep0 = ep1 in general (because it gives us no useful
// interpolated values). That means in four-interp mode we want ep1 - ep0 > 0,
// and in six-interp mode we want ep1 - ep0 < 0. If they're identical or
// have the wrong sign, we need to fix it!
int ep_diff = ((ep1_int - ep0_int) ^ target_sign) - target_sign;
if (ep_diff <= 0)
{
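The expression (x ^ s) - s above is a branch-free conditional negate. A small sketch of it follows, assuming target_sign is 0 in four-interpolated-color mode and -1 in six-interpolated-color mode (the preview does not show how it is set). The names apply_sign and endpoints_need_fix are hypothetical.

```cpp
// With s == 0:  (x ^ 0) - 0     == x
// With s == -1: (x ^ -1) - (-1) == ~x + 1 == -x   (two's complement)
constexpr int apply_sign(int x, int target_sign)
{
    return (x ^ target_sign) - target_sign;
}

// After the conditional negate, "correct order" always means a positive
// difference, so one test covers both equal endpoints and the wrong sign.
constexpr bool endpoints_need_fix(int ep0, int ep1, int target_sign)
{
    return apply_sign(ep1 - ep0, target_sign) <= 0;
}

static_assert(apply_sign(5, 0) == 5, "sign 0 keeps the value");
static_assert(apply_sign(5, -1) == -5, "sign -1 negates the value");
static_assert(!endpoints_need_fix(10, 20, 0), "four-interp: ep1 > ep0 is fine");
static_assert(endpoints_need_fix(20, 10, 0), "four-interp: ep1 < ep0 needs a fix");
static_assert(!endpoints_need_fix(20, 10, -1), "six-interp: ep1 < ep0 is fine");
static_assert(endpoints_need_fix(7, 7, -1), "equal endpoints always need a fix");
```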
template <int t_nbits>
static inline void quant_endpoint_with_pbit(U8 *deq0, U8 *deq1, int val)
{
    const int expanded_nbits = t_nbits + 1;
    const U32 range = 1u << expanded_nbits;
    const U32 recip255 = 0x8081; // enough bits for our value range
    const int postscale = (0x10000 >> t_nbits) + (0x10000 >> (t_nbits*2 + 1));
    // The reconstruction here adds the pbit as the lowest bit and then reconstructs
    // it as a (nbits+1)-bit value to float, i.e. (quant*2 + pbit) / (range - 1).
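The reconstruction formula in the comment above can be written out directly. This is only an illustrative sketch of (quant*2 + pbit) / (range - 1). The function name and signature are hypothetical, and the fixed-point path using recip255 and postscale in the original is not reproduced here.

```cpp
#include <cstdint>

// Append the p-bit below the nbits-wide quantized value, giving an
// (nbits+1)-bit code, then map that code linearly onto [0, 1].
constexpr float reconstruct_with_pbit(int nbits, uint32_t quant, uint32_t pbit)
{
    return float(quant * 2 + pbit) / float((1u << (nbits + 1)) - 1);
}
```

With nbits = 7, the largest code is 127*2 + 1 = 255 and range - 1 = 255, so the top value reconstructs to exactly 1.0.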
File Type: DLL
Section contains the following imports:
KERNEL32.dll
180085000 Import Address Table
18009B1B8 Import Name Table
0 time date stamp
0 Index of first forwarder reference
inName : c:\devel\media\verybig\enwik9
inSize64 : 1000000000
building LRM...
done!
ref_sink: 0x9f93e2e0
SimpleProf     : seconds   calls    count :     clk/call  clk/count
search_one     :  1.2668       1  8388480 : 5320774726.0     634.30
search_one_pf  :  0.9975       1  8388480 : 4189319442.0     499.41
search_multi2  :  1.1072       1  8388480 : 4650070068.0     554.34
search_multi4  :  1.1205       1  8388480 : 4706202060.0     561.03
00000000000001BC: D2802009 mov x9,#0x100
00000000000001C0: F2C00809 movk x9,#0x40,lsl #0x20
00000000000001C4: F90003E9 str x9,[sp]
00000000000001C8: D2800209 mov x9,#0x10
00000000000001CC: F2C00089 movk x9,#4,lsl #0x20
00000000000001D0: F90007E9 str x9,[sp,#8]
00000000000001D4: 58000869 ldr x9,$LN356
00000000000001D8: F9000BE9 str x9,[sp,#0x10]
00000000000001DC: 58000869 ldr x9,$LN357
00000000000001E0: F9000FE9 str x9,[sp,#0x18]
rygorous / timings.txt
Created February 29, 2024 19:21
Oodle Texture 2.9.12 results on Kodak set, BC7RGB, Ryzen 7950X3D, average time over 10 repeats
---- "Low" effort level (OodleTex_EncodeEffortLevel_Low = 10)
CompressBCN : 5.258 millis
kodim01.bmp BC7-RGB: rmse=2.4325 hash=0x5dcf9106f8f4415d
CompressBCN : 5.040 millis
kodim02.bmp BC7-RGB: rmse=2.1168 hash=0x79f45423cd9d3ec0
CompressBCN : 5.165 millis
kodim03.bmp BC7-RGB: rmse=1.6793 hash=0x806dce71d1ff8293
CompressBCN : 5.092 millis
kodim04.bmp BC7-RGB: rmse=2.1611 hash=0x681ebb3045e254ec
// advance
for (int i = 0; i < num_streams; ++i)
{
    std::string desc = formatf("advance %d", i);
    bool is_reverse_stream = (i % 3) == 1;
    if (EARLY_CLZ != 2)
    {
        if (EARLY_CLZ == 0)
            bb->append(CLZ(bits[i], bits[i]).set_comment(desc)); // figure out how many bits we consumed
        if (is_reverse_stream)
rygorous / bc7_single_color.cpp
Created July 7, 2023 03:09
BC7 encoding for single-color blocks
void bc7_encode_single_color_block(U8 * output_bc7, const U8 rgba[4])
{
    U64 r = rgba[0];
    U64 g = rgba[1];
    U64 b = rgba[2];
    U64 a = rgba[3];
    const U64 bit6_mask = (0x40 << 8) | (0x40 << 22) | (0x40ull << 36);
    const U64 lo7_mask = (0x7f << 8) | (0x7f << 22) | (0x7full << 36);
    U64 color_bits;
rygorous / gist:efc460d0154347ebe17bde7275070d9b
Created July 7, 2023 03:05
Oodle Texture BC7 mode stats on some test images (some game textures, some other)
reading: c:\devel\media\bc1speedlevel/M_MED_Kurohomura_Backpack_Textures_T_Kurohomura_BP_backpack_D.uasset_build.mip0.png
CompressBC7 : 65.720 millis, 3.51 kc/B, rate= 997.20 kB/s
per-pixel rmse : 0.6425
mode stats (8=invalid 9=solid):
[0] 567 ( 0.87%)
[1] 5465 ( 8.34%)
[2] 126 ( 0.19%)
[3] 15855 ( 24.19%)
[4] 909 ( 1.39%)
[5] 15072 ( 23.00%)