CUDA helper for block and thread indexing
threadsperblock = 32
blockspergrid = (an_array.size + (threadsperblock - 1)) // threadsperblock
increment_by_one[blockspergrid, threadsperblock](an_array)
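The launch arithmetic above rounds the block count up, so the last block can contain threads that fall past the end of the array. The sketch below emulates that on the CPU in plain Python, with no GPU or Numba required. `launch_size` and `emulate_increment_by_one` are hypothetical names introduced here for illustration, and the loop body only assumes that the real `increment_by_one` kernel computes a global position and guards it with a bounds check.

```python
def launch_size(n_elements, threads_per_block=32):
    """Number of blocks needed to cover n_elements (ceiling division)."""
    return (n_elements + (threads_per_block - 1)) // threads_per_block


def emulate_increment_by_one(values, threads_per_block=32):
    """Run the kernel body once per (block, thread) pair, sequentially on the CPU."""
    out = list(values)
    blocks_per_grid = launch_size(len(out), threads_per_block)
    for block_idx in range(blocks_per_grid):
        for thread_idx in range(threads_per_block):
            # Global position of this thread in the whole grid.
            pos = block_idx * threads_per_block + thread_idx
            # Guard: threads in the last block may fall past the array end.
            if pos < len(out):
                out[pos] += 1
    return out


data = list(range(100))
blocks = launch_size(len(data))          # blocks launched for 100 elements
idle_threads = blocks * 32 - len(data)   # threads that fail the bounds check
result = emulate_increment_by_one(data)
print(blocks, idle_threads)
```

In an actual Numba kernel the position would typically come from `cuda.grid(1)`, which yields the same value as `blockIdx.x * blockDim.x + threadIdx.x`.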
Multi-dimensional block layout
1-dimensional block layout
blockDim.x is 4, so the marked tile (██) is executed by the thread with threadIdx.x == 1 in the block with blockIdx.x == 2.
B0: T0 T1 T2 T3
B1: T0 T1 T2 T3
B2: T0 ██ T2 T3
B3: T0 T1 T2 T3
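To make the layout concrete, here is a small sketch that computes the global index of every tile in a grid of 4 blocks with 4 threads each. `global_index` is a hypothetical helper name; the formula is the standard `blockIdx.x * blockDim.x + threadIdx.x`.

```python
BLOCK_DIM = 4  # threads per block (blockDim.x)
GRID_DIM = 4   # blocks in the grid (B0..B3)


def global_index(block_idx, block_dim, thread_idx):
    """Position of a thread in the whole 1-dimensional grid."""
    return block_idx * block_dim + thread_idx


# One row per block, one entry per thread in that block.
layout = [
    [global_index(b, BLOCK_DIM, t) for t in range(BLOCK_DIM)]
    for b in range(GRID_DIM)
]

# The marked tile: threadIdx.x == 1 in the block with blockIdx.x == 2.
marked = global_index(block_idx=2, block_dim=BLOCK_DIM, thread_idx=1)

for b, row in enumerate(layout):
    print(f"B{b}:", row)
print("marked tile handles element", marked)  # element 9
```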
2-dimensional block layout
3 x 3 blocks, each with 3 x 3 threads
import math
threadsPerBlock = (3, 3)
b_x = math.ceil(an_array.shape[0] / threadsPerBlock[0])
b_y = math.ceil(an_array.shape[1] / threadsPerBlock[1])
blocksPerGrid = (b_x, b_y)
increment_a_2D_array[blocksPerGrid, threadsPerBlock](an_array)
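The same rounding-up applies per axis in two dimensions. The following CPU-only sketch shows which grid sizes the calculation above produces and checks that every element is visited exactly once. `blocks_per_grid_2d` and `emulate_increment_2d` are hypothetical names, and the emulated body assumes the real `increment_a_2D_array` kernel guards both coordinates against the array shape.

```python
import math


def blocks_per_grid_2d(shape, threads_per_block):
    """Blocks needed along each axis, rounding up."""
    return tuple(math.ceil(n / t) for n, t in zip(shape, threads_per_block))


def emulate_increment_2d(rows, threads_per_block=(3, 3)):
    """Visit every (block, thread) pair sequentially and increment in bounds."""
    shape = (len(rows), len(rows[0]))
    out = [list(r) for r in rows]
    b_x, b_y = blocks_per_grid_2d(shape, threads_per_block)
    for bx in range(b_x):
        for by in range(b_y):
            for tx in range(threads_per_block[0]):
                for ty in range(threads_per_block[1]):
                    # Global coordinates of this thread.
                    x = bx * threads_per_block[0] + tx
                    y = by * threads_per_block[1] + ty
                    # Guard both axes: edge blocks may overhang the array.
                    if x < shape[0] and y < shape[1]:
                        out[x][y] += 1
    return out


grid_for_9x9 = blocks_per_grid_2d((9, 9), (3, 3))  # exact fit
grid_for_7x5 = blocks_per_grid_2d((7, 5), (3, 3))  # edge blocks partly idle
incremented = emulate_increment_2d([[0] * 5 for _ in range(7)])
print(grid_for_9x9, grid_for_7x5)
```

A 9 x 9 array fills the 3 x 3 grid of blocks exactly; any other shape leaves some threads in the edge blocks idle, which is why the bounds check matters.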
block indexes in the grid
[0, 0] [0, 1] [0, 2]
[1, 0] [1, 1] [1, 2]
[2, 0] [2, 1] [2, 2]
thread indexes within each block
[0, 0] [0, 1] [0, 2]
[1, 0] [1, 1] [1, 2]
[2, 0] [2, 1] [2, 2]
representation of the grid, the blocks in it, and the threads in each block
╔═══════════════╦═══════════════╦═══════════════╗
║ Block [0, 0]  ║ Block [0, 1]  ║ Block [0, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╠═══════════════╬═══════════════╬═══════════════╣
║ Block [1, 0]  ║ Block [1, 1]  ║ Block [1, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╠═══════════════╬═══════════════╬═══════════════╣
║ Block [2, 0]  ║ Block [2, 1]  ║ Block [2, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╚═══════════════╩═══════════════╩═══════════════╝
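The diagram numbers the threads in each block 0 to 8, while the index lists use two-component indexes. A minimal sketch of the conversion follows, assuming the numbering is row-major (number 4 is thread [1, 1]); `thread_index` and `global_position` are hypothetical helper names.

```python
THREADS = (3, 3)  # threads per block along each axis


def thread_index(flat_number, threads=THREADS):
    """Convert a 0..8 thread number from the diagram to a [row, col] index.

    Assumes row-major numbering, as the diagram appears to use.
    """
    return divmod(flat_number, threads[1])


def global_position(block, flat_number, threads=THREADS):
    """Global [row, col] position of a thread, given its block index."""
    t0, t1 = thread_index(flat_number, threads)
    return (block[0] * threads[0] + t0, block[1] * threads[1] + t1)


centre = thread_index(4)               # thread 4 is the centre of its block
corner = thread_index(8)               # thread 8 is the bottom-right corner
pos = global_position((1, 2), 4)       # centre thread of Block [1, 2]
print(centre, corner, pos)
```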