backend: metal, device: Intel(R) Iris(TM) Plus Graphics 640
metal-threadgroup-Intel(R) Iris(TM) Plus Graphics 640
kernel type: threadgroup
cpu_execs: 2, gpu_execs: 5001
transpose-threadgroup-WGS=(1,32) kernel already compiled...
num bms: 4096, num dispatch groups: 4096
GPU results verified!
task name:metal-threadgroup-WGS=(32, 32)
TG size: 32
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 263.12 +/- 45.29 ms
transpose-threadgroup-WGS=(2,32) kernel already compiled...
num bms: 4096, num dispatch groups: 2048
GPU results verified!
task name:metal-threadgroup-WGS=(64, 32)
TG size: 64
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 200.00 +/- 6.21 ms
transpose-threadgroup-WGS=(4,32) kernel already compiled...
num bms: 4096, num dispatch groups: 1024
GPU results verified!
task name:metal-threadgroup-WGS=(128, 32)
TG size: 128
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 379.75 +/- 10.94 ms
transpose-threadgroup-WGS=(8,32) kernel already compiled...
num bms: 4096, num dispatch groups: 512
GPU results verified!
task name:metal-threadgroup-WGS=(256, 32)
TG size: 256
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 735.89 +/- 4.76 ms
transpose-threadgroup-WGS=(16,32) kernel already compiled...
num bms: 4096, num dispatch groups: 256
GPU results verified!
task name:metal-threadgroup-WGS=(512, 32)
TG size: 512
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 1424.03 +/- 0.29 ms
transpose-threadgroup-WGS=(32,32) kernel already compiled...
num bms: 4096, num dispatch groups: 128
GPU results verified!
task name:metal-threadgroup-WGS=(1024, 32)
TG size: 1024
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 1230.29 +/- 6.08 ms
backend: metal, device: Intel(R) Iris(TM) Plus Graphics 640
metal-hybrid-shuffle-Intel(R) Iris(TM) Plus Graphics 640
kernel type: hybrid-shuffle
cpu_execs: 2, gpu_execs: 5001
compiling kernel transpose-hybrid-shuffle-WGS=(32,1)...
num bms: 4096, num dispatch groups: 4096
GPU results verified!
task name:metal-hybrid-shuffle-WGS=(32,1)
TG size: 32
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 541.60 +/- 0.73 ms
compiling kernel transpose-hybrid-shuffle-WGS=(64,1)...
num bms: 4096, num dispatch groups: 2048
GPU results verified!
task name:metal-hybrid-shuffle-WGS=(64,1)
TG size: 64
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 540.33 +/- 0.54 ms
compiling kernel transpose-hybrid-shuffle-WGS=(128,1)...
num bms: 4096, num dispatch groups: 1024
GPU results verified!
task name:metal-hybrid-shuffle-WGS=(128,1)
TG size: 128
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 544.51 +/- 5.09 ms
compiling kernel transpose-hybrid-shuffle-WGS=(256,1)...
num bms: 4096, num dispatch groups: 512
GPU results verified!
task name:metal-hybrid-shuffle-WGS=(256,1)
TG size: 256
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 584.60 +/- 17.78 ms
compiling kernel transpose-hybrid-shuffle-WGS=(512,1)...
num bms: 4096, num dispatch groups: 256
GPU results verified!
task name:metal-hybrid-shuffle-WGS=(512,1)
TG size: 512
timestamp stats (N = 2): 0.00 +/- 0.00 ms
instant stats (N = 2): 652.31 +/- 4.39 ms
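
The "num dispatch groups" values in both runs are consistent with each bitmap being processed by 32 threads, so that the group count equals num bms * 32 / TG size. This relation is inferred from the numbers in the log, not taken from the benchmark source. A minimal Python sketch of that arithmetic follows; `THREADS_PER_BITMAP` and `dispatch_groups` are hypothetical names used only for illustration.

```python
# Assumption inferred from the log: each bitmap is handled by 32 threads.
THREADS_PER_BITMAP = 32


def dispatch_groups(num_bms: int, tg_size: int) -> int:
    """Return the number of threadgroups needed to cover num_bms bitmaps."""
    total_threads = num_bms * THREADS_PER_BITMAP
    if total_threads % tg_size != 0:
        raise ValueError("threadgroup size must divide the total thread count")
    return total_threads // tg_size


if __name__ == "__main__":
    # Threadgroup sizes that appear in the log above.
    for tg_size in (32, 64, 128, 256, 512, 1024):
        print(f"TG size: {tg_size}, num dispatch groups: {dispatch_groups(4096, tg_size)}")
```

For 4096 bitmaps this reproduces the logged pairs, for example TG size 32 with 4096 groups and TG size 1024 with 128 groups.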