
@taeguk
Created December 8, 2016 07:31
CUDA
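Open question: is ILP (instruction-level parallelism) in CUDA extracted by the compiler or by the hardware?
Open question: do CUDA cores execute instructions in order or out of order?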
Occupancy = (# of active warps on an SM) / (max # of resident warps the SM supports)
Occupancy is determined by the block size, register usage per thread, and shared memory usage per thread. Each of these can limit how many blocks fit on an SM at once.
Shared memory usage per thread = shared memory usage per block / block size.
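The occupancy formula above can be sketched as plain host-side arithmetic. This is a simplified estimate, not the method the CUDA toolkit uses: the `SmLimits` struct, its default values, and the function `estimate_occupancy` are hypothetical names and numbers chosen for illustration, and the sketch ignores register and shared memory allocation granularity. Real limits depend on the GPU's compute capability.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-SM limits (assumed values; check your GPU's actual limits).
struct SmLimits {
    int warp_size = 32;
    int max_warps_per_sm = 64;
    int max_blocks_per_sm = 32;
    int registers_per_sm = 65536;
    int shared_mem_per_sm = 49152;  // bytes
};

// Estimate occupancy = active warps / max resident warps for one SM.
// Allocation granularity is ignored, so this is an upper-bound style estimate.
double estimate_occupancy(int block_size, int regs_per_thread,
                          int shared_mem_per_block,
                          const SmLimits& lim = SmLimits{}) {
    assert(block_size > 0 && regs_per_thread > 0 && shared_mem_per_block >= 0);

    // A block always occupies whole warps.
    int warps_per_block = (block_size + lim.warp_size - 1) / lim.warp_size;

    // Resident blocks per SM as limited by each resource.
    int by_warps = lim.max_warps_per_sm / warps_per_block;
    int by_regs = lim.registers_per_sm /
                  (regs_per_thread * warps_per_block * lim.warp_size);
    int by_smem = shared_mem_per_block > 0
                      ? lim.shared_mem_per_sm / shared_mem_per_block
                      : lim.max_blocks_per_sm;

    // The tightest limit decides how many blocks are resident.
    int blocks = std::min({lim.max_blocks_per_sm, by_warps, by_regs, by_smem});
    int active_warps = blocks * warps_per_block;
    return static_cast<double>(active_warps) / lim.max_warps_per_sm;
}
```

Under these assumed limits, a 256-thread block using 32 registers per thread and no shared memory reaches full occupancy, while doubling the register usage to 64 per thread halves it. On a real device, the runtime function `cudaOccupancyMaxActiveBlocksPerMultiprocessor` reports the resident block count for a specific kernel.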
Launch at least as many blocks as there are SMs (# of blocks >= # of SMs), so that no SM sits idle.
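One way to apply this rule of thumb is to shrink the block size until the grid has at least one block per SM. The sketch below is only an illustration of that idea: `blocks_for` and `pick_block_size` are hypothetical helpers, and the halving strategy, the 1024-thread maximum, and the one-warp minimum are assumptions. In a real program the SM count would come from `cudaDeviceProp::multiProcessorCount`.

```cpp
#include <cassert>

// Number of blocks needed to cover n elements with the given block size.
int blocks_for(int n, int block_size) {
    return (n + block_size - 1) / block_size;
}

// Halve the block size (never below one warp) until the grid
// contains at least num_sms blocks.
int pick_block_size(int n, int num_sms,
                    int max_block_size = 1024, int warp_size = 32) {
    assert(n > 0 && num_sms > 0);
    int block_size = max_block_size;
    while (block_size > warp_size && blocks_for(n, block_size) < num_sms) {
        block_size /= 2;
    }
    return block_size;
}
```

For very small inputs the loop stops at one warp per block even if there are still fewer blocks than SMs, since the problem simply does not have enough work to fill the device.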