8th September 2023. Results from Constantin's development/benchmarks.py
script.
Summary: something in the AMG benchmark is severely degrading performance on the MPS backend.
- We'll need to do some line profiling on that part of the code to get more information.
- I suspect the fallback to CPU involves transferring large tensors back and forth between the CPU and MPS, and that these transfers may account for much of why it is so much slower than the CPU-only computation.
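
Before reaching for a full line profiler, one cheap way to test the transfer hypothesis is to accumulate wall-clock time per labelled section and compare "transfer" against the fallback op itself. The following is a minimal sketch using only the standard library. SectionTimer, fake_transfer and fake_nms are hypothetical names invented for illustration; the sleeps are stand-ins for the real device copies and the nms call, and none of this is taken from benchmarks.py.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionTimer:
    """Accumulate wall-clock time and call counts per named section."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def section(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the wrapped code raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return (name, total_seconds, calls) rows, slowest first."""
        rows = ((n, self.totals[n], self.counts[n]) for n in self.totals)
        return sorted(rows, key=lambda row: row[1], reverse=True)


def fake_transfer():
    # Hypothetical stand-in for moving a tensor between devices.
    time.sleep(0.02)


def fake_nms():
    # Hypothetical stand-in for the op that falls back to the CPU.
    time.sleep(0.005)


timer = SectionTimer()
for _ in range(3):
    with timer.section("transfer"):
        fake_transfer()  # device -> CPU
    with timer.section("nms"):
        fake_nms()
    with timer.section("transfer"):
        fake_transfer()  # CPU -> device

for name, total, calls in timer.report():
    print(f"{name:10s} {total:.3f}s over {calls} calls")
```

When wrapping real MPS work, note that GPU kernels run asynchronously, so the device would likely need to be synchronized (for example with torch.mps.synchronize()) before each section ends; otherwise the time may be attributed to whichever later section happens to block.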
Running benchmark_amg ...
[W MPSFallback.mm:11] Warning: The operator 'torchvision::nms' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (function operator())