@raphlinus
Created February 8, 2020 21:11
32x32 bit matrix transpose in Metal using subgroups
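#include <metal_stdlib>
using namespace metal;

// One round of the transpose: trade s-bit-wide blocks with the lane whose
// index differs in bit s. `tix` is this thread's index within the SIMD group
// ([[thread_index_in_simdgroup]]); `m` selects the bits the lower lane keeps.
inline uint shuffle_round(uint a, uint m, ushort s, uint tix) {
    uint b = simd_shuffle_xor(a, s); // word held by the partner lane
    uint c;
    if ((tix & s) == 0) {
        // Lower lane of the pair: the partner's kept bits move up by s.
        c = b << s;
    } else {
        // Upper lane: mirror image, so invert the mask and shift down.
        m = ~m;
        c = b >> s;
    }
    return (a & m) | (c & ~m);
}

// Transpose bitmask. Assumes subgroup size = 32, with each thread holding one
// 32-bit row. Goes inside a kernel where `tix` is the thread's SIMD-group index.
uint bitmask; // value to transpose
bitmask = shuffle_round(bitmask, 0xffff, 16, tix);
bitmask = shuffle_round(bitmask, 0xff00ff, 8, tix);
bitmask = shuffle_round(bitmask, 0xf0f0f0f, 4, tix);
bitmask = shuffle_round(bitmask, 0x33333333, 2, tix);
bitmask = shuffle_round(bitmask, 0x55555555, 1, tix);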
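The five rounds above can be sanity-checked off the GPU. What follows is a minimal host-side C++ sketch, not part of the original gist: it emulates the 32 lanes with an array, models simd_shuffle_xor as a read of element `tix ^ s`, and compares the result against a naive bit-by-bit transpose. The names `Rows`, `shuffle_round_all`, `transpose_by_rounds`, `transpose_naive`, and `rounds_match_naive` are hypothetical helpers introduced only for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 32-bit word per emulated lane; lane i holds row i of the bit matrix.
using Rows = std::array<uint32_t, 32>;

// Emulates one shuffle round for all 32 lanes at once. Reading in[tix ^ s]
// stands in for simd_shuffle_xor: every lane sees its partner's value from
// before the round, because results go to a separate output array.
Rows shuffle_round_all(const Rows& in, uint32_t m, unsigned s) {
    assert(s > 0 && s < 32 && (s & (s - 1)) == 0); // s must be a power of two
    Rows out{};
    for (unsigned tix = 0; tix < 32; ++tix) {
        uint32_t a = in[tix];
        uint32_t b = in[tix ^ s];
        uint32_t mask = m;
        uint32_t c;
        if ((tix & s) == 0) {
            c = b << s;
        } else {
            mask = ~mask;
            c = b >> s;
        }
        out[tix] = (a & mask) | (c & ~mask);
    }
    return out;
}

// The same five rounds as the snippet above, with the same masks and shifts.
Rows transpose_by_rounds(Rows r) {
    r = shuffle_round_all(r, 0x0000ffffu, 16);
    r = shuffle_round_all(r, 0x00ff00ffu, 8);
    r = shuffle_round_all(r, 0x0f0f0f0fu, 4);
    r = shuffle_round_all(r, 0x33333333u, 2);
    r = shuffle_round_all(r, 0x55555555u, 1);
    return r;
}

// Reference transpose: bit j of row i becomes bit i of row j.
Rows transpose_naive(const Rows& in) {
    Rows out{};
    for (unsigned i = 0; i < 32; ++i) {
        for (unsigned j = 0; j < 32; ++j) {
            out[j] |= ((in[i] >> j) & 1u) << i;
        }
    }
    return out;
}

// Compares both versions on a pseudo-random matrix (simple LCG fill).
bool rounds_match_naive() {
    Rows r{};
    uint32_t x = 0x12345678u;
    for (auto& w : r) {
        x = x * 1664525u + 1013904223u;
        w = x;
    }
    return transpose_by_rounds(r) == transpose_naive(r);
}
```

Calling `rounds_match_naive()` from a test or from `main` exercises all five rounds. The emulation writes into a separate output array on purpose, since an in-place update would let some lanes read values already modified in the same round, which the hardware shuffle does not do.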