Created
March 3, 2015 04:08
mat@mat-desktop:~/dev/GTX970test$ make run
./test_bandwidth1.out
The bandwidth should stay be about the same each time:
Data size: 0.125000 GB; Bandwidth: 101725.257812 GB/s
Data size: 0.375000 GB; Bandwidth: 930059.500000 GB/s
Data size: 0.625000 GB; Bandwidth: 1514050.500000 GB/s
Data size: 0.875000 GB; Bandwidth: 2119670.750000 GB/s
Data size: 1.125000 GB; Bandwidth: 3662109.250000 GB/s
Data size: 1.375000 GB; Bandwidth: 4475911.500000 GB/s
Data size: 1.625000 GB; Bandwidth: 3998523.500000 GB/s
Data size: 1.875000 GB; Bandwidth: 6040593.000000 GB/s
Data size: 2.125000 GB; Bandwidth: 5147772.000000 GB/s
Data size: 2.375000 GB; Bandwidth: 5798340.000000 GB/s
Data size: 2.625000 GB; Bandwidth: 6459153.000000 GB/s
Data size: 2.875000 GB; Bandwidth: 9358724.000000 GB/s
Data size: 3.125000 GB; Bandwidth: 7750496.000000 GB/s
Data size: 3.375000 GB; Bandwidth: 5521924.000000 GB/s
Data size: 3.625000 GB; Bandwidth: 4968476.000000 GB/s
Data size: 3.875000 GB; Bandwidth: 7712977.500000 GB/s
./test_bandwidth2.out
--------------------------------------------
Allocated GPU memory size in GB: 2.861023
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16000 16000, Block size: 32 8, Tile size: 32 32
dimGrid: 500 500 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s) ### HANGS at this point for ~5 seconds
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16000.000000
*** FAILED ***
coalesced transpose1 0.000000 16000.000000
*** FAILED ***
conflict-free transpose1 0.000000 16000.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 2.953308
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16256 16256, Block size: 32 8, Tile size: 32 32
dimGrid: 508 508 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16256.000000
*** FAILED ***
coalesced transpose1 0.000000 16256.000000
*** FAILED ***
conflict-free transpose1 0.000000 16256.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.047058
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16512 16512, Block size: 32 8, Tile size: 32 32
dimGrid: 516 516 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16512.000000
*** FAILED ***
coalesced transpose1 0.000000 16512.000000
*** FAILED ***
conflict-free transpose1 0.000000 16512.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.142273
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 16768 16768, Block size: 32 8, Tile size: 32 32
dimGrid: 524 524 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 16768.000000
*** FAILED ***
coalesced transpose1 0.000000 16768.000000
*** FAILED ***
conflict-free transpose1 0.000000 16768.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.238953
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 17024 17024, Block size: 32 8, Tile size: 32 32
dimGrid: 532 532 1. dimBlock: 32 8 1
Routine Bandwidth (GB/s)
copy1 0.000000 1.000000
*** FAILED ***
shared memory copy1 0.000000 1.000000
*** FAILED ***
naive transpose1 0.000000 17024.000000
*** FAILED ***
coalesced transpose1 0.000000 17024.000000
*** FAILED ***
conflict-free transpose1 0.000000 17024.000000
*** FAILED ***
--------------------------------------------
Allocated GPU memory size in GB: 3.337097
Used GPU memory by other applications in GB: 0.179005
--------------------------------------------
Device : GeForce GTX 970
Matrix size: 17280 17280, Block size: 32 8, Tile size: 32 32
dimGrid: 540 540 1. dimBlock: 32 8 1
^Cmake: *** [run] Interrupt