@ddjerqq
Last active March 6, 2023 10:46
A small helper for indexing blocks and threads with CUDA.

CUDA helper for block and thread indexing

General rule of thumb

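threadsperblock = 32  # one warp; block sizes are usually a multiple of 32
# ceiling division, so a final partial block is still launched
blockspergrid = (an_array.size + (threadsperblock - 1)) // threadsperblock
increment_by_one[blockspergrid, threadsperblock](an_array)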
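The grid-size formula above is a ceiling division, which means the last block can contain threads that fall past the end of the array. The following is a minimal CPU-only sketch of that arithmetic, not a real kernel launch: `blocks_needed` and `simulate_increment` are hypothetical names introduced here for illustration, and the nested loops stand in for the blocks and threads a GPU would run in parallel.

```python
def blocks_needed(n_items, threads_per_block):
    """Ceiling division: smallest block count whose threads cover n_items."""
    return (n_items + threads_per_block - 1) // threads_per_block


def simulate_increment(values, threads_per_block=32):
    """CPU stand-in for a 1D launch that adds one to every element."""
    out = list(values)
    blocks = blocks_needed(len(out), threads_per_block)
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            i = block_idx * threads_per_block + thread_idx
            if i < len(out):  # guard: the last block may overhang the array
                out[i] += 1
    return out


print(blocks_needed(100, 32))  # 4 blocks, i.e. 128 threads for 100 elements
```

The `if i < len(out)` check mirrors the bounds guard a real kernel would typically need, since 28 of the 128 threads in this example have no element to work on.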

Multi dimensional block layout

1 dimensional block layout

blockDim.x is 4, so the marked tile (██) is executed by the thread with threadIdx.x == 1 in the block with blockIdx.x == 2.

B0 T0 T1 T2 T3
B1 T0 T1 T2 T3
B2 T0 ██ T2 T3
B3 T0 T1 T2 T3

Launched with:

kernel[4, 4](an_array)
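To turn a (block, thread) pair from the layout above into a position in the array, the usual formula is `blockIdx.x * blockDim.x + threadIdx.x`. The sketch below works through it for the marked tile in plain Python; `global_index_1d` and `block_and_thread` are hypothetical helper names used only for illustration. In Numba, `cuda.grid(1)` returns this same absolute position inside a kernel.

```python
def global_index_1d(block_idx, block_dim, thread_idx):
    """Absolute position of a thread in a 1D grid."""
    return block_idx * block_dim + thread_idx


def block_and_thread(global_idx, block_dim):
    """Inverse mapping: which block and thread handle a given position."""
    return divmod(global_idx, block_dim)


marked = global_index_1d(block_idx=2, block_dim=4, thread_idx=1)
print(marked)                       # 9
print(block_and_thread(marked, 4))  # (2, 1)
```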

2 dimensional block layout

3 x 3 blocks, each with 3 x 3 threads

import math

threadsPerBlock = (3, 3)
# round up along each axis so partial blocks at the edges are still launched
b_x = math.ceil(an_array.shape[0] / threadsPerBlock[0])
b_y = math.ceil(an_array.shape[1] / threadsPerBlock[1])
blocksPerGrid = (b_x, b_y)

increment_a_2D_array[blocksPerGrid, threadsPerBlock](an_array)
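As a quick check of the grid-size calculation above, here is a small sketch that applies the same per-axis rounding to a couple of example shapes. The function name `grid_shape_2d` and the shapes `(9, 9)` and `(7, 8)` are assumptions chosen for illustration; they do not come from the original snippet.

```python
import math


def grid_shape_2d(array_shape, threads_per_block):
    """Blocks needed along each axis, rounding up for partial blocks."""
    return tuple(
        math.ceil(n / t) for n, t in zip(array_shape, threads_per_block)
    )


print(grid_shape_2d((9, 9), (3, 3)))  # (3, 3): blocks tile the array exactly
print(grid_shape_2d((7, 8), (3, 3)))  # (3, 3): edge blocks are only partly used
```

In the second case 81 threads are launched for 56 elements, so a kernel launched this way would normally check its coordinates against the array shape before writing.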

Block indices in the grid:

[0, 0] [0, 1] [0, 2]
[1, 0] [1, 1] [1, 2]
[2, 0] [2, 1] [2, 2]

Thread indices within each block:

[0, 0] [0, 1] [0, 2]
[1, 0] [1, 1] [1, 2]
[2, 0] [2, 1] [2, 2]

Representation of the grid, its blocks, and the threads in each block:

╔═══════════════╦═══════════════╦═══════════════╗
║ Block [0, 0]  ║ Block [0, 1]  ║ Block [0, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╠═══════════════╬═══════════════╬═══════════════╣
║ Block [1, 0]  ║ Block [1, 1]  ║ Block [1, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╠═══════════════╬═══════════════╬═══════════════╣
║ Block [2, 0]  ║ Block [2, 1]  ║ Block [2, 2]  ║
║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║ ┌───┬───┬───┐ ║
║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║ │ 0 │ 1 │ 2 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║ │ 3 │ 4 │ 5 │ ║
║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║ ├───┼───┼───┤ ║
║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║ │ 6 │ 7 │ 8 │ ║
║ └───┴───┴───┘ ║ └───┴───┴───┘ ║ └───┴───┴───┘ ║
╚═══════════════╩═══════════════╩═══════════════╝
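The diagram numbers the threads of each block 0 through 8, while the index lists earlier use pairs. Assuming the numbers are a row-major flattening of those pairs (an interpretation of the diagram, not something stated in the original), the sketch below recovers a thread's pair and its coordinates in the full 9 x 9 grid of threads. `thread_pair` and `global_coords_2d` are hypothetical helper names.

```python
BLOCK_DIM = (3, 3)  # threads per block along each axis, as in the diagram


def thread_pair(flat_number, block_dim=BLOCK_DIM):
    """Convert a diagram number 0..8 into a [row, col] thread index."""
    return divmod(flat_number, block_dim[1])


def global_coords_2d(block_index, flat_number, block_dim=BLOCK_DIM):
    """Coordinates of a thread in the whole grid: block * blockDim + thread."""
    t_row, t_col = thread_pair(flat_number, block_dim)
    b_row, b_col = block_index
    return (b_row * block_dim[0] + t_row, b_col * block_dim[1] + t_col)


print(thread_pair(5))                # (1, 2)
print(global_coords_2d((1, 2), 5))   # (4, 8)
```

So thread 5 of Block [1, 2] would handle element [4, 8] under this assumption, the same per-axis arithmetic as in the 1-dimensional case.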