Thanks for sharing. To avoid memory access errors, a CUDA kernel must still check whether the x- and y-indices are within the array boundaries. For large zoom depths, it is useful to integrate perturbation theory as shown here:
https://rosettacode.org/wiki/Mandelbrot_set#Normal_Map_Effect,_Mercator_Projection_and_Deep_Zoom_Images
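To make the perturbation idea concrete, here is a minimal sketch in plain Python. It is not the code from the linked page. It computes one reference orbit Z_n for a point `c_ref`, then iterates only the offset from that orbit for a nearby point `c_ref + delta_c`, using delta_{n+1} = 2 Z_n delta_n + delta_n^2 + delta_c. The function names are hypothetical, and the sketch assumes the reference point never escapes. It also uses ordinary double precision throughout, whereas a real deep-zoom renderer would compute the reference orbit in arbitrary precision and add glitch detection.

```python
def reference_orbit(c_ref, max_iter):
    """Return [Z_0, ..., Z_max_iter] with Z_0 = 0 and Z_{n+1} = Z_n**2 + c_ref."""
    orbit = [0j]
    z = 0j
    for _ in range(max_iter):
        z = z * z + c_ref
        if abs(z) > 2.0:
            # Assumption of this sketch: the reference must stay bounded.
            raise ValueError("reference point escapes; pick one inside the set")
        orbit.append(z)
    return orbit


def escape_count_direct(c, max_iter):
    """Plain escape-time iteration, kept here only for comparison."""
    z = 0j
    for n in range(1, max_iter + 1):
        z = z * z + c
        if abs(z) > 2.0:
            return n
    return max_iter


def escape_count_perturbed(orbit, delta_c):
    """Escape time of c_ref + delta_c, iterating only the offset delta_n."""
    max_iter = len(orbit) - 1
    delta = 0j
    for n in range(max_iter):
        # delta_{n+1} = 2 * Z_n * delta_n + delta_n**2 + delta_c
        delta = 2 * orbit[n] * delta + delta * delta + delta_c
        # The full iterate is the reference value plus the offset.
        if abs(orbit[n + 1] + delta) > 2.0:
            return n + 1
    return max_iter


# c_ref = -1 lies in the period-2 disc, so its orbit stays bounded.
c_ref = -1 + 0j
orbit = reference_orbit(c_ref, 50)
for delta_c in (1.5 + 0j, 1j, 0.01 + 0.01j):
    print(delta_c,
          escape_count_perturbed(orbit, delta_c),
          escape_count_direct(c_ref + delta_c, 50))
```

The two counts should agree for each offset. On a GPU the reference orbit would be computed once on the host, and each thread would run only the cheap offset loop for its own pixel.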
Some sample programs that use DPEP and Modular instead of CUDA on non-NVIDIA hardware can be found here:
https://github.com/IntelPython/DPEP/tree/main/demos/mandelbrot
https://github.com/modular/modular/tree/main/examples/custom_ops
https://github.com/modular/modular/tree/main/examples/mojo/python-interop
Just wanted to mention that I did similar research (comparing Numba, Taichi, Warp, and JAX with different numbers of pixels and 200 loops) at https://github.com/34j/mandelbrot-benchmark. Hope this helps.

I got intrigued and went and found a dual scalar / handwritten portable SIMD implementation of the Mandelbrot algorithm: https://pythonspeed.com/articles/optimizing-with-simd/