The following results were obtained with the PyOP2 and matfree codes running on porter with 16 cores. Loopy was using the Pocl implementation.

Kernel | Loopy | PyOP2 | MatFree |
---|---|---|---|
Mass | 0.029 | 0.0013 | 0.0021 |
Laplace | 0.020 | 0.0013 | 0.0023 |
Hyperelasticity | 0.041 | 0.0034 | 0.0088 |

Work Group Size | Time | Registers Used |
---|---|---|
32 | 0.0056 | 60 |
64 | 0.0041 | 48 |
128 | 0.0038 | 48 |
256 | 0.0040 | 48 |