@vassvik
vassvik / vs_parameters.txt
Last active September 8, 2018 13:12
Default Visual Studio compiler parameters and what they mean
/permissive- standards conformance
/MP build with multiple processes
/GS buffer security check
/Qpar auto-parallelizer
/GL whole program optimization
/W3 warning level
/wd"4305" disable warning C4305: 'identifier' : truncation from 'type1' to 'type2'
/Gy enable function-level linking
/Zc:wchar_t wchar_t is native type
/Zi debug information format
@vassvik
vassvik / quaternions.cpp
Last active September 8, 2018 13:09
Testing quaternion rotation equivalencies
#include <stdio.h>
#include <math.h>
struct vec3 {
float x, y, z;
vec3(float x, float y, float z) : x(x), y(y), z(z) {}
vec3(float x) : x(x), y(x), z(x) {}
};
@vassvik
vassvik / main.c
Last active August 16, 2024 18:43
Minimal GLFW example in C
// compile with one of the following, probably.
//
// in linux:
// if glfw is built as a dynamic lib:
// gcc main.c -lglfw
// gcc main.c -lglfw3
//
// or, if glfw is built as a static lib:
// gcc main.c -lglfw -lX11 -lXxf86vm -lpthread -lXrandr -lXinerama -lXcursor -lXi
// gcc main.c -lglfw3 -lX11 -lXxf86vm -lpthread -lXrandr -lXinerama -lXcursor -lXi
@vassvik
vassvik / circular_buffer.go
Last active September 8, 2018 13:07
Simple circular buffer in Odin
Circular_Buffer :: struct(T: type, N: int) {
data: [N]T,
cursor: int,
length: int,
}
push_back :: inline proc(using cb: ^$T/Circular_Buffer, v: T.T) -> bool #no_bounds_check {
data[(cursor + length) %% T.N] = v;
if length < T.N {
@vassvik
vassvik / obj_loader.go
Last active October 17, 2022 03:49
odin obj loader
package main
import "core:os";
import "core:fmt";
import "core:strconv";
// model data stuff
Model_Data :: struct {
vertices: [][3]f32,
indices: []i32,
@vassvik
vassvik / 1
Last active March 5, 2019 09:13
b: 1x1x1 = 1 elements, 1 non-zero, 0 zeros
1
x: 3x3x3 = 27 elements, 6 non-zeros, 21 zeros
0 0 0
0 1 0
0 0 0

Compute memory access pattern throughput benchmarks

We measure the runtime of simple compute shaders that read from one 3D texture through some kind of filter and write the result to another texture. The local work group size of the compute shader is varied over an arbitrary set of sizes, and the effect of different internal texture formats is studied.
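
To make the access pattern concrete, here is a minimal CPU reference sketch of a 27-point stencil. It assumes the filter is a plain 3x3x3 box average with clamped borders; the weights and border handling of the actual shader (`cs_filter3D_27stencil.glsl`) are not stated here, so treat both as assumptions. The function name `box_filter_27` is hypothetical.

```python
def box_filter_27(src, n):
    """Average each voxel with its 26 neighbours (3x3x3 box), clamping at borders.

    src is a flat list of n*n*n floats, with x varying fastest.
    ASSUMPTION: equal weights and clamp-to-edge; the real shader may differ.
    """
    def at(x, y, z):
        # Clamp coordinates to the valid range, like clamp-to-edge sampling.
        x = min(max(x, 0), n - 1)
        y = min(max(y, 0), n - 1)
        z = min(max(z, 0), n - 1)
        return src[(z * n + y) * n + x]

    dst = [0.0] * (n * n * n)
    for z in range(n):
        for y in range(n):
            for x in range(n):
                total = 0.0
                # 27 reads per output voxel, one write.
                for dz in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            total += at(x + dx, y + dy, z + dz)
                dst[(z * n + y) * n + x] = total / 27.0
    return dst


if __name__ == "__main__":
    n = 4
    constant = [2.0] * (n * n * n)
    print(box_filter_27(constant, n)[:4])
```

Each output voxel costs 27 reads and one write, which is why the kernel is expected to be bound by memory access rather than arithmetic.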

All tests are performed using 512x512x512 3D textures. At this size memory throughput and latency will be the primary bottleneck, so any extra calculations should have negligible impact on the timings.
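
As a rough illustration of the scale involved, the sketch below computes the size of one 512x512x512 texture and the number of work groups dispatched for a given local size. It assumes R16F stores 2 bytes per texel (one 16-bit float channel) and that the dispatch rounds up per axis; the helper names are hypothetical.

```python
import math


def texture_bytes(dim, bytes_per_texel):
    """Total size in bytes of a dim x dim x dim texture."""
    return dim ** 3 * bytes_per_texel


def dispatch_counts(dim, local_size):
    """Work groups needed per axis to cover the texture (rounded up)."""
    return tuple(math.ceil(dim / n) for n in local_size)


if __name__ == "__main__":
    DIM = 512
    # ASSUMPTION: R16F is one 16-bit float channel, i.e. 2 bytes per texel.
    print("R16F texture size (MiB):", texture_bytes(DIM, 2) / 2 ** 20)
    for local in [(8, 8, 8), (32, 32, 1), (16, 2, 4)]:
        groups = dispatch_counts(DIM, local)
        print(local, "threads/group:", math.prod(local), "groups:", groups)
```

A single R16F texture at this size is 256 MiB, far larger than typical GPU caches, which supports the expectation that memory access dominates the timings.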

All timings are measured by averaging the frame time over 128 frames, after a 128-frame warmup, with vsync disabled. Using GPU timer queries might provide more stable numbers.
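
The measurement procedure can be sketched as follows: discard the warmup frames, then average the next 128 frame times. Reporting the standard deviation alongside the mean is an addition for illustration; the function name and the choice of population standard deviation are assumptions, not taken from the benchmark code.

```python
import statistics


def summarize_frame_times(frame_times_ms, warmup=128, measured=128):
    """Return (mean, standard deviation) of the frames after the warmup."""
    samples = frame_times_ms[warmup:warmup + measured]
    if len(samples) < measured:
        raise ValueError("not enough frames recorded")
    # ASSUMPTION: population standard deviation as the spread measure.
    return statistics.fmean(samples), statistics.pstdev(samples)


if __name__ == "__main__":
    # Synthetic data: slow warmup frames followed by steady-state frames.
    times = [100.0] * 128 + [19.0, 20.0] * 64
    mean_ms, std_ms = summarize_frame_times(times)
    print(f"{mean_ms:.3f} ms +/- {std_ms:.3f} ms")
```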

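The work group sizes are [8, 8, 8], [32, 32, 1], [32, 1, 32], [16, 16, 1], [16, 1, 16], [16, 16, 4], [16, 4, 16], [4, 16, 16], [4, 2, 16] and [16, 2, 4]. Each row below lists the timing columns, followed by the shader, the texture size, the internal format and the work group size: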

19.344 0.114 cs_filter3D_27stencil.glsl 512x512x512 R16F [8, 8, 8]
19.045 0.116 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 32, 1]
18.796 0.202 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 1, 32]
18.108 0.386 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 1]
18.860 6.760 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 1, 16]
19.676 0.094 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 4]
19.427 0.106 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 4, 16]
19.628 0.196 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 16, 16]
18.416 0.249 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 2, 16]
18.358 0.250 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 2, 4]
19.432 0.095 386.792 cs_filter3D_27stencil.glsl 512x512x512 R16F [8, 8, 8]
19.150 0.149 392.494 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 32, 1]
18.925 0.132 397.147 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 1, 32]
18.203 0.138 412.910 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 1]
18.483 0.128 406.655 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 1, 16]
19.548 0.142 384.503 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 4]
19.298 0.167 389.487 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 4, 16]
19.458 0.116 386.272 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 16, 16]
18.272 0.117 411.344 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 2, 16]
18.279 0.696 411.186 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 2, 4]
31.932 0.781 235.379 cs_filter3D_27stencil.glsl 512x512x512 R16F [8, 8, 8]
31.973 1.101 235.080 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 32, 1]
31.285 0.552 240.247 cs_filter3D_27stencil.glsl 512x512x512 R16F [32, 1, 32]
30.455 1.047 246.794 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 1]
30.281 0.894 248.218 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 1, 16]
32.020 1.188 234.732 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 16, 4]
31.585 0.934 237.969 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 4, 16]
31.712 0.940 237.013 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 16, 16]
30.113 0.383 249.603 cs_filter3D_27stencil.glsl 512x512x512 R16F [4, 2, 16]
30.041 0.290 250.198 cs_filter3D_27stencil.glsl 512x512x512 R16F [16, 2, 4]
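
The columns of the table are not labeled. As a hedged reading, the third numeric column in the later rows is consistent with an effective throughput in GB/s, computed from the first column taken as a time in milliseconds, assuming 27 reads and 1 write of 2 bytes each (R16F) per voxel. The sketch below reproduces those values under that assumption; it is an inference from the numbers, not a documented formula, and the function name is hypothetical.

```python
def effective_throughput_gb_s(time_ms, dim=512, reads=27, writes=1, bytes_per_texel=2):
    """Effective throughput in GB/s (10^9 bytes per second).

    ASSUMPTION: every voxel performs `reads` texel reads and `writes` texel
    writes of `bytes_per_texel` bytes, with no credit given for cache reuse.
    """
    total_bytes = dim ** 3 * (reads + writes) * bytes_per_texel
    return total_bytes / (time_ms * 1e-3) / 1e9


if __name__ == "__main__":
    # Times taken from the table; compare against its third numeric column.
    for time_ms, listed in [(19.432, 386.792), (18.203, 412.910), (31.932, 235.379)]:
        print(f"{time_ms:7.3f} ms  computed {effective_throughput_gb_s(time_ms):8.3f}  listed {listed:8.3f}")
```

Because neighbouring voxels share most of their 27 reads, much of this traffic is likely served from cache, so the figure measures texel accesses rather than DRAM bandwidth.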