Run the sample with the nvprof command. Use the -o option to specify the output profile file; -f forces nvprof to overwrite an existing file.
$ nvprof -f -o fuse.nvvp python fuse_sample.py
Then run the nvvp command and open the file fuse.nvvp.
import cupy
import cupy.prof
import numpy


def f(x):
    return x + x + x + x + x


# This decorator fuses the kernels into one.
@cupy.fuse()
def g(x):
    return x + x + x + x + x


x = cupy.arange(40000000)

# Each time_range block appears as a named, colored range in the profiler timeline.
with cupy.prof.time_range('without fuse', color_id=0):
    f(x)

with cupy.prof.time_range('with fuse', color_id=1):
    g(x)

# You can pass numpy arrays transparently to the fused function as well.
# In this case, no JIT compilation is applied and it just falls back to plain NumPy API calls.
g(numpy.arange(40000000))
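Fusion should change only how many kernels are launched, not the values computed. The following is a minimal sketch of a sanity check for that, not part of the original sample. The helper `load_backend` is hypothetical, and the NumPy fallback with a no-op decorator is an assumption added so the check can also run on a machine without CuPy or a usable GPU.

```python
import numpy


def load_backend():
    """Return (array_module, fuse_decorator_factory).

    Hypothetical helper: uses CuPy when it imports and a CUDA device
    responds, otherwise falls back to NumPy with a no-op decorator.
    """
    try:
        import cupy
        cupy.arange(1)  # probe; raises if no usable CUDA device is present
        return cupy, cupy.fuse
    except Exception:
        # Assumption: on CPU there is nothing to fuse, so return the function unchanged.
        return numpy, lambda: (lambda func: func)


xp, fuse = load_backend()


def f(x):
    return x + x + x + x + x


@fuse()
def g(x):
    return x + x + x + x + x


x = xp.arange(1000)

# Compare the unfused and fused results element by element.
same = bool(xp.array_equal(f(x), g(x)))
print('fused and unfused results match:', same)
```

A small array is used here because the check is about correctness only; the 40000000-element array in the sample is what makes the timing difference visible in the profiler.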