@rygorous
rygorous / gist:2937595
Created June 15, 2012 17:04
Another selb attempt
# Rough sketch, not actually tested or anything...
# in any case, way too many instrs to be actually useful.
rotqbyi inMaskY, inMask, 4
rotqbyi inMaskZ, inMask, 8
rotqbyi inMaskW, inMask, 12
cwd maskX, 4(inMask) # 0x00010203 in maskX.x if inMask.x set, 0x10111213 otherwise (rest id + 0x10)
cwd maskY, 8(inMaskY) # 0x00010203 in maskY.y if inMask.y set, 0x14151617 otherwise (rest id + 0x10)
cwd maskZ, 12(inMaskZ) # 0x00010203 in maskZ.z if inMask.z set, 0x18191a1b otherwise (rest id + 0x10)
cwd maskW, 0(inMaskW) # 0x00010203 in maskW.w if inMask.w set, 0x1c1d1e1f otherwise (rest id + 0x10)
rygorous / doubleprec.cpp
Created June 27, 2012 17:11
RAII class for when you actually need double precision. (MSVC on Win32 w/out /arch:SSE2)
// header
// RAII class. For use in 32-bit processes compiled without /arch:SSE2 - for
// when you actually need double precision. (D3D and GL like to set x87
// precision to single, which means the only thing you get out of doubles is
// the higher exponent range)
class DoublePrecisionContext
{
unsigned int oldfp;
rygorous / magic_ring.cpp
Created July 22, 2012 03:55
The magic ring buffer.
#define _CRT_SECURE_NO_DEPRECATE
#include <stdio.h>
#include <string.h>
#include <Windows.h>
// This allocates a "magic ring buffer" that is mapped twice, with the two
// copies being contiguous in (virtual) memory. The advantage of this is
// that this allows any function that expects data to be contiguous in
// memory to read from (or write to) such a buffer. It also means that
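The double-mapping idea described in the comment above can be sketched as follows. Note that this is not the gist's code: the gist targets Windows (it includes `<Windows.h>`), while this sketch swaps in POSIX `mmap` so that it stays short and self-contained. The names `MagicRing`, `magic_ring_alloc` and `magic_ring_free` are hypothetical, and the sketch assumes a writable `/tmp` and a size that is a multiple of the page size.

```cpp
#include <cstddef>
#include <cstdlib>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical handle type for this sketch (not from the gist).
struct MagicRing {
    unsigned char* base;  // start of the 2*size virtual address range
    size_t size;          // size of the underlying buffer in bytes
};

// Maps the same `size` bytes twice, back to back, so that
// base[i] and base[i + size] refer to the same memory.
// Returns true on success. `size` must be a nonzero multiple of the page size.
inline bool magic_ring_alloc(MagicRing* ring, size_t size) {
    ring->base = nullptr;
    ring->size = 0;

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || size == 0 || size % static_cast<size_t>(page) != 0)
        return false;

    // Backing storage: a temporary file that is unlinked right away,
    // so it disappears once the mappings are gone.
    char path[] = "/tmp/magic_ring_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return false;
    unlink(path);
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return false;
    }

    // Reserve 2*size bytes of contiguous address space.
    void* reserve = mmap(nullptr, 2 * size, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (reserve == MAP_FAILED) {
        close(fd);
        return false;
    }
    unsigned char* base = static_cast<unsigned char*>(reserve);

    // Map the file over both halves of the reserved range.
    void* lo = mmap(base, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    void* hi = mmap(base + size, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);  // the mappings keep the storage alive

    if (lo == MAP_FAILED || hi == MAP_FAILED) {
        munmap(base, 2 * size);
        return false;
    }

    ring->base = base;
    ring->size = size;
    return true;
}

inline void magic_ring_free(MagicRing* ring) {
    if (ring->base)
        munmap(ring->base, 2 * ring->size);
    ring->base = nullptr;
    ring->size = 0;
}
```

With such a mapping, a read or write of up to `size` bytes starting at any offset below `size` never needs to be split at the wrap point, which is the advantage the comment describes.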
rygorous / gist:3694094
Created September 10, 2012 21:40
7ms Present() fact-finding mission
The problem: Every 9-10 frames, Present() takes a huge amount of time.
Original tweet with Parallel Nsight screenshot: https://twitter.com/rygorous/status/245270559020171265
Short but representative log of Present() timings: (frame number:time taken)
4721: 7.628466ms
4722: 0.294534ms
4723: 0.340265ms
4724: 0.282702ms
4725: 0.283661ms
rygorous / stb_dxt.h
Created October 3, 2012 17:12
YannC's modified rygdxt code review
// stb_dxt.h - Real-Time DXT1/DXT5 compressor
// Based on original by fabian "ryg" giesen v1.04
// Custom version, modified by Yann Collet
//
/*
BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
rygorous / gist:3842461
Created October 5, 2012 21:13
16x tri binning setup
// Constants
static const OM_U32 cEdgeFlags[5] =
{
RAST_SCISSORS_FLAG_EDGE0,
RAST_SCISSORS_FLAG_EDGE1,
RAST_SCISSORS_FLAG_EDGE2,
RAST_SCISSORS_FLAG_EDGE3,
RAST_SCISSORS_FLAG_TRIANGLE_EXTENDS_OUTSIDE_SCISSORS
};
static RADINLINE bool is_fence_pending(GDrawFence fence)
{
U32 retired = get_gpu_fence();
// everything between "retired" (exclusive) and "next_fence_counter"
// (inclusive) is pending. everything else is definitely done.
//
// we need to be careful about this test since the "fence" value
// coming in is, for all practical purposes, an arbitrary U32. our
// fence counter might have wrapped around multiple times since we last
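The interval test described in the comment above ("retired" exclusive, "next_fence_counter" inclusive) can be written with unsigned modular arithmetic so that it survives counter wraparound. The following is a minimal sketch under stated assumptions, not the gist's actual implementation, which is cut off here: `FenceState` and `fence_pending` are hypothetical names, and the two counters are passed in explicitly rather than read from the GPU.

```cpp
#include <cstdint>

// Hypothetical snapshot of the fence counters (not from the gist).
struct FenceState {
    uint32_t retired;       // last fence value the GPU has completed
    uint32_t next_counter;  // most recently issued fence value
};

// Returns true if `fence` lies in the interval (retired, next_counter],
// evaluated modulo 2^32 so that counter wraparound is handled.
inline bool fence_pending(const FenceState& s, uint32_t fence) {
    // Number of fences currently in flight.
    uint32_t pending_count = s.next_counter - s.retired;
    // Distance of `fence` past the first pending value (retired + 1).
    // For values outside the interval this wraps to a number that is
    // >= pending_count.
    uint32_t offset = fence - s.retired - 1u;
    return offset < pending_count;
}
```

For example, with `retired = 0xFFFFFFFE` and `next_counter = 2`, the values `0xFFFFFFFF`, `0`, `1` and `2` are reported as pending and everything else as done. A single comparison is enough because unsigned subtraction maps the interval onto `[0, pending_count)`.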
// The native handle type holds resource handles and a coarse description.
typedef union {
// handle that is a texture
struct {
GLuint gl;
GLuint gl_renderbuf;
} tex;
// handle that is a vertex buffer
struct {
// This is your vertex buffer, split into components.
// On D3D11 HW, you can probably use structured buffers to de-SoA this, but I haven't checked.
Buffer<float3> buf_pos;
Buffer<float3> buf_norm;
Buffer<float2> buf_uvs;
float4x4 clip_from_model;
uint base_index, index_mask;
rygorous / gist:3982477
Created October 30, 2012 19:36
Resource manager state machine
// +------+ +--------+
// | Live |<------->| Locked |
// +------+ +--------+
// / \ ^
// / \ \
// v v \
// +------+ +------+ +------+ |
// | Dead |--->| Free |<---| User | |
// +------+ +------+ +------+ |
// ^ ^ ^ ^ |