@matpalm
Created March 3, 2015 04:08
mat@mat-desktop:~/dev/GTX970test$ make run
./test_bandwidth1.out
The bandwidth should stay be about the same each time:
Data size: 0.125000 GB; Bandwidth: 101725.257812 GB/s
Data size: 0.375000 GB; Bandwidth: 930059.500000 GB/s
Data size: 0.625000 GB; Bandwidth: 1514050.500000 GB/s
Data size: 0.875000 GB; Bandwidth: 2119670.750000 GB/s
Data size: 1.125000 GB; Bandwidth: 3662109.250000 GB/s
Data size: 1.375000 GB; Bandwidth: 4475911.500000 GB/s
Data size: 1.625000 GB; Bandwidth: 3998523.500000 GB/s
Data size: 1.875000 GB; Bandwidth: 6040593.000000 GB/s
Data size: 2.125000 GB; Bandwidth: 5147772.000000 GB/s
Data size: 2.375000 GB; Bandwidth: 5798340.000000 GB/s
Data size: 2.625000 GB; Bandwidth: 6459153.000000 GB/s
Data size: 2.875000 GB; Bandwidth: 9358724.000000 GB/s
Data size: 3.125000 GB; Bandwidth: 7750496.000000 GB/s
Data size: 3.375000 GB; Bandwidth: 5521924.000000 GB/s
Data size: 3.625000 GB; Bandwidth: 4968476.000000 GB/s
Data size: 3.875000 GB; Bandwidth: 7712977.500000 GB/s
./test_bandwidth2.out
--------------------------------------------
Allocated GPU memory size in GB: 2.861023
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16000 16000, Block size: 32 8, Tile size: 32 32
dimGrid: 500 500 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s) ### HANGS at this point for ~5 seconds
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16000.000000
*** FAILED ***
coalesced transpose1 0.000000 16000.000000
*** FAILED ***
conflict-free transpose1 0.000000 16000.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 2.953308
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16256 16256, Block size: 32 8, Tile size: 32 32
dimGrid: 508 508 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16256.000000
*** FAILED ***
coalesced transpose1 0.000000 16256.000000
*** FAILED ***
conflict-free transpose1 0.000000 16256.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.047058
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16512 16512, Block size: 32 8, Tile size: 32 32
dimGrid: 516 516 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16512.000000
*** FAILED ***
coalesced transpose1 0.000000 16512.000000
*** FAILED ***
conflict-free transpose1 0.000000 16512.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.142273
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16768 16768, Block size: 32 8, Tile size: 32 32
dimGrid: 524 524 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16768.000000
*** FAILED ***
coalesced transpose1 0.000000 16768.000000
*** FAILED ***
conflict-free transpose1 0.000000 16768.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.238953
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 17024 17024, Block size: 32 8, Tile size: 32 32
dimGrid: 532 532 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 17024.000000
*** FAILED ***
coalesced transpose1 0.000000 17024.000000
*** FAILED ***
conflict-free transpose1 0.000000 17024.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.337097
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 17280 17280, Block size: 32 8, Tile size: 32 32
dimGrid: 540 540 1. dimBlock: 32 8 1
^Cmake: *** [run] Interrupt
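
A note on the "Allocated GPU memory size" figures in the test_bandwidth2 output: they appear to match three N x N matrices of 4-byte elements, expressed in units of 2^30 bytes. This is an inference from the printed numbers only; the test source is not shown here, so the matrix count and the element size are assumptions, and the helper name `allocated_gib` is made up for this sketch.

```python
# Sanity check of the "Allocated GPU memory size" lines in the log above.
# Assumptions (not confirmed by the test source): three N x N buffers,
# 4 bytes per element, and "GB" meaning 2**30 bytes.

def allocated_gib(n, matrices=3, bytes_per_element=4):
    """Size in GiB of `matrices` square n x n buffers."""
    return matrices * n * n * bytes_per_element / 2**30

# (matrix size, allocated size reported in the log)
reported = [
    (16000, 2.861023),
    (16256, 2.953308),
    (16512, 3.047058),
    (16768, 3.142273),
    (17024, 3.238953),
    (17280, 3.337097),
]

for n, logged in reported:
    print(f"N={n}: computed {allocated_gib(n):.6f}, log reports {logged:.6f}")
```

If the assumptions hold, each computed value agrees with the logged one to six decimal places, which would mean the run was interrupted while working with roughly 3.34 GiB of buffers on the device.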