Run the sample with the nvprof command. Specify the output profile file with the -o option. The -f option means force: it overwrites an existing file.
$ nvprof -f -o fuse.nvvp python fuse_sample.py
Then run the nvvp command and open the file fuse.nvvp.
    import numpy

    import cupy
    import cupy.prof


    def f(x):
        return x + x + x + x + x


    # This decorator fuses the kernels into one.
    @cupy.fuse()
    def g(x):
        return x + x + x + x + x


    x = cupy.arange(40000000)

    with cupy.prof.time_range('without fuse', color_id=0):
        f(x)

    with cupy.prof.time_range('with fuse', color_id=1):
        g(x)

    # You can pass numpy arrays transparently to the fused function as well.
    # In this case, no JIT compilation is applied and it just falls back to plain NumPy API calls.
    g(numpy.arange(40000000))
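To see what the unfused f does before looking at the profile, here is a minimal sketch in plain Python that needs no GPU. The Traced class is a hypothetical stand-in written for this illustration; it is not part of CuPy. It only counts how many separate additions the expression performs. The assumption is that, without fusion, each of these additions would run as its own elementwise kernel, which is the work that @cupy.fuse() is meant to combine into one.

```python
class Traced:
    """Hypothetical stand-in for an array; logs every elementwise add."""

    def __init__(self, value, log):
        self.value = value
        self.log = log

    def __add__(self, other):
        # Record one operation per "+" and return a new intermediate result.
        self.log.append("add")
        return Traced(self.value + other.value, self.log)


def f(x):
    # Same expression as in the sample: evaluated left to right.
    return x + x + x + x + x


log = []
result = f(Traced(3, log))
num_ops = len(log)
print(num_ops, result.value)  # prints "4 15"
```

In the nvvp timeline you would therefore expect several kernels inside the 'without fuse' range and a single one inside the 'with fuse' range, though the exact picture depends on your CuPy version.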