taeguk · December 8, 2016 07:31
diff --git a/cuda.txt b/cuda.txt
 CUDA ILP -> by compiler? or by hardware?
 CUDA -> in-order? or out-of-order?

 Occupancy = # of active warps / # of max resident warps
 Occupancy is determined by block size, register usage per thread, and shared memory usage per thread.
 Shared memory usage per thread = shared memory usage per block / block size.

 # of blocks >= # of SMs is good, so that every SM has at least one block to run.
	CUDA ILP -> by compiler? or by hardware?
	CUDA -> in-order? or out-of-order?

	Occupancy = # of active warps / # of max resident warps
	Occupancy is determined by block size, register usage per thread, and shared memory usage per thread.
	Shared memory usage per thread = shared memory usage per block / block size.

	# of blocks >= # of SMs is good, so that every SM has at least one block to run.
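
The occupancy notes can be checked with a small host-side estimate. This is only a rough sketch: the per-SM limits in `SMLimits` are assumed, illustrative values and not taken from any particular GPU, the names `SMLimits`, `occupancy`, and `covers_all_sms` are hypothetical, and the model ignores register and shared memory allocation granularity.

```python
from dataclasses import dataclass

WARP_SIZE = 32  # threads per warp


@dataclass(frozen=True)
class SMLimits:
    # Assumed per-SM limits for illustration only; look up real values for your GPU.
    max_warps: int = 64            # max resident warps per SM
    max_blocks: int = 32           # max resident blocks per SM
    registers: int = 65536         # 32-bit registers per SM
    shared_mem_bytes: int = 49152  # shared memory per SM


def occupancy(block_size, regs_per_thread, smem_per_block, sm=SMLimits()):
    """Estimate occupancy = active warps / max resident warps for one SM."""
    warps_per_block = -(-block_size // WARP_SIZE)  # ceiling division
    # How many blocks fit on one SM under each resource limit.
    by_warps = sm.max_warps // warps_per_block
    regs_per_block = regs_per_thread * warps_per_block * WARP_SIZE
    by_regs = sm.registers // regs_per_block if regs_per_block else sm.max_blocks
    by_smem = sm.shared_mem_bytes // smem_per_block if smem_per_block else sm.max_blocks
    # The tightest limit decides how many blocks are resident.
    blocks = min(sm.max_blocks, by_warps, by_regs, by_smem)
    return blocks * warps_per_block / sm.max_warps


def covers_all_sms(num_blocks, num_sms):
    """True when the grid has at least one block per SM."""
    return num_blocks >= num_sms
```

Under these assumed limits, a 256-thread block using 32 registers per thread and no shared memory reaches full occupancy, while doubling the register count to 64 per thread halves it. For real measurements, the CUDA runtime's `cudaOccupancyMaxActiveBlocksPerMultiprocessor` reports the resident block count for an actual kernel and device.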