@evincarofautumn
Created January 28, 2018 04:24
Vectorized matrix multiplication in Kitten
define square_matrix_multiply
  <N as Size>
  (Array<(N * N), Float>, Array<(N * N), Float>
    -> Array<(N * N), Float>)
{
  -> left, right;

  // static::<N> lifts the value of the static constant N to runtime.

  // Block width: number of result columns covered by one block.
  if (static::<N> >= 256):
    512
  else:
    256
  -> block_width;

  // Block height: number of k steps accumulated per block.
  if (static::<N> >= 512):
    8
  elif (static::<N> >= 256):
    16
  else:
    32
  -> block_height;

  // +Unsafe to allow temporarily uninitialized result.
  with (+Unsafe):

    // Result array. Uninitialized: all cells will be filled.
    undef_array::<(N * N)>

    // Could be done much more functionally.
    do (0 static::<N> block_height map_from_to_by) -> row_offset:
      // Column blocks advance by block_width, matching the bound on j below.
      do (0 static::<N> block_width map_from_to_by) -> column_offset:
        do (0 static::<N> map_from_to) -> i:
          // Advance 8 columns at a time (one vector of Floats per step).
          do (column_offset (column_offset + block_width) 8 map_from_to_by) -> j:
            if (j < static::<N>):
              (i * static::<N> + j) load
              do (row_offset (row_offset + block_height) map_from_to) -> k:
                if (k < static::<N>):
                  left (i * static::<N> + k) set1 swap
                  right (k * static::<N> + j) load swap
                  multiply_add
              (i * static::<N> + j) store
}
// Based on Matrix Multiplication Revisited <http://richardstartin.uk/mmm-revisited/>
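
For readers who do not know Kitten, the same blocked loop nest can be sketched in plain Python. This is a scalar reference under stated assumptions, not the gist author's implementation: the names `block_sizes`, `lanes`, and the use of flat row-major lists are illustrative choices, the 8-wide vector operations (`load`, `set1`, `multiply_add`, `store`) are emulated with list slices, and the result is zero-initialized because each block accumulates into cells that earlier blocks have already written.

```python
def block_sizes(n):
    """Return (block_width, block_height) using the thresholds from the gist."""
    block_width = 512 if n >= 256 else 256
    if n >= 512:
        block_height = 8
    elif n >= 256:
        block_height = 16
    else:
        block_height = 32
    return block_width, block_height


def square_matrix_multiply(left, right, n, lanes=8):
    """Multiply two n*n matrices stored as flat row-major lists of floats.

    `lanes` stands in for the vector width (an assumption; 8 in the gist).
    """
    block_width, block_height = block_sizes(n)
    # Zero-initialized: partial sums from each row block are added on top.
    result = [0.0] * (n * n)
    for row_offset in range(0, n, block_height):
        k_end = min(row_offset + block_height, n)
        for column_offset in range(0, n, block_width):
            j_end = min(column_offset + block_width, n)
            for i in range(n):
                for j in range(column_offset, j_end, lanes):
                    width = min(lanes, j_end - j)  # shorter chunk at the edge
                    # "load": current partial sums for result[i][j : j + width]
                    acc = result[i * n + j : i * n + j + width]
                    for k in range(row_offset, k_end):
                        a = left[i * n + k]  # "set1": broadcast one scalar
                        row = right[k * n + j : k * n + j + width]
                        # "multiply_add": acc += a * row, lane by lane
                        acc = [s + a * b for s, b in zip(acc, row)]
                    # "store": write the updated partial sums back
                    result[i * n + j : i * n + j + width] = acc
    return result
```

Calling `square_matrix_multiply(a, b, n)` with two flat lists of length `n * n` returns a new flat list of the same length. Keeping `k` innermost means each chunk of the result is loaded once, updated `block_height` times, and stored once per block.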