OpenACC overlap of GPU work and CPU work
! Transform this:
do step = 1, n
  acc kernel
  acc update()
  call diagnostics_on_cpu()
enddo
! It needs to change to:
step = 1
acc kernel async(step)
acc update() async(step)
do step = 2, n
  ! Guard use of the CPU-updated buffers:
  acc wait(step-1)  ! so you don't launch this step's kernels before the previous step has finished moving to the CPU
  ! Launch the next step's GPU work:
  acc kernel async(step)
  call diagnostics_on_cpu( <on step-1> )
  ! Putting the update after the CPU code makes sure you are finished with the CPU buffers before the new update.
  acc update() async(step)
enddo
! Clean up the last diagnostics:
step = n
acc wait(step)  ! so the last step has finished moving to the CPU before its diagnostics run
call diagnostics_on_cpu( <on step> )
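
The pseudocode above can be spelled out as a small Fortran program with real OpenACC directives. This is only a sketch under stated assumptions: the array `field`, its length `m`, the step count `n`, the trivial kernel body, and the body of `diagnostics_on_cpu` are all hypothetical stand-ins, not taken from the original. It also assumes a device with its own memory; on a shared-memory target the diagnostics would see the newer step's data, because the kernel and the CPU would be touching the same array.

```fortran
program overlap_sketch
  implicit none
  integer, parameter :: n = 5      ! hypothetical number of steps
  integer, parameter :: m = 1024   ! hypothetical buffer length
  real    :: field(m)              ! hypothetical working buffer
  integer :: step, i

  field = 0.0
  !$acc data copyin(field)

  ! Prologue: step 1 kernel and its device-to-host copy, both on queue 1.
  step = 1
  !$acc kernels present(field) async(step)
  do i = 1, m
     field(i) = field(i) + 1.0
  end do
  !$acc end kernels
  !$acc update host(field) async(step)

  do step = 2, n
     ! The copy of step-1 to the host must finish before the device data changes.
     !$acc wait(step-1)
     ! Launch this step's GPU work; control returns to the CPU right away.
     !$acc kernels present(field) async(step)
     do i = 1, m
        field(i) = field(i) + 1.0
     end do
     !$acc end kernels
     ! CPU diagnostics on the step-1 host data overlap with the kernel above.
     call diagnostics_on_cpu(step - 1, field)
     ! Queue the next copy only after the CPU is done reading the host buffer.
     !$acc update host(field) async(step)
  end do

  ! Epilogue: wait for the last copy, then run the last diagnostics.
  !$acc wait(n)
  call diagnostics_on_cpu(n, field)
  !$acc end data

contains

  ! Hypothetical diagnostics routine: prints the mean of the host buffer.
  subroutine diagnostics_on_cpu(istep, buf)
    integer, intent(in) :: istep
    real,    intent(in) :: buf(:)
    print '(a,i0,a,f0.1)', 'step ', istep, ': mean = ', sum(buf) / size(buf)
  end subroutine diagnostics_on_cpu

end program overlap_sketch
```

Using `step` as the queue number mirrors the pseudocode. Because each `wait(step-1)` drains the previous queue before new work is launched, alternating between two queue numbers would likely work as well.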