kaushikcfd/better_cpu_comparison.md

Last active February 5, 2018 19:53


The tables below report timings, in seconds, for the matvec kernels under the different strategies used to perform the matvec.

POCL and AMD are the two OpenCL implementations on which the timings were taken.

Single Core

| Kernel | POCL | AMD | MatFree | PyOP2(SpMV) |
|---|---|---|---|---|
| Mass | 0.122 | 0.136 | 0.031 | 0.011 |
| Laplace | 0.124 | 0.125 | 0.035 | 0.011 |
| Hyperelasticity | 0.268 | 0.264 | 0.105 | 0.027 |
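
To make the gap easier to read, the timings above can be turned into slowdown factors. This is only a sketch over the numbers in the table; the helper name `slowdown` and the dictionary layout are made up for illustration.

```python
# Single-core timings in seconds, copied from the table above.
timings = {
    "Mass":            {"POCL": 0.122, "AMD": 0.136, "MatFree": 0.031, "PyOP2(SpMV)": 0.011},
    "Laplace":         {"POCL": 0.124, "AMD": 0.125, "MatFree": 0.035, "PyOP2(SpMV)": 0.011},
    "Hyperelasticity": {"POCL": 0.268, "AMD": 0.264, "MatFree": 0.105, "PyOP2(SpMV)": 0.027},
}


def slowdown(kernel, impl, baseline):
    """Return how many times slower `impl` is than `baseline` for `kernel`."""
    row = timings[kernel]
    return row[impl] / row[baseline]


for kernel in timings:
    for impl in ("POCL", "AMD"):
        print(f"{kernel:16s} {impl:5s} "
              f"{slowdown(kernel, impl, 'MatFree'):5.1f}x MatFree, "
              f"{slowdown(kernel, impl, 'PyOP2(SpMV)'):5.1f}x PyOP2(SpMV)")
```

On these numbers the OpenCL runs are roughly 2.5x to 4.4x slower than MatFree and roughly 10x to 12x slower than PyOP2(SpMV).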

Cost of atomics on single core CPU

POCL:

| Kernel | With atomics | Without atomics |
|---|---|---|
| Mass | 0.122 | 0.064 |
| Laplace | 0.124 | 0.041 |
| Hyperelasticity | 0.268 | 0.091 |

AMD:

| Kernel | With atomics | Without atomics |
|---|---|---|
| Mass | 0.136 | 0.073 |
| Laplace | 0.125 | 0.057 |
| Hyperelasticity | 0.264 | 0.164 |
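
One way to read the two single-core tables is as the share of runtime that disappears when the atomics are removed. The sketch below assumes the "with" and "without" runs differ only in the atomics, which the tables do not state explicitly; `atomics_share` is a hypothetical helper name.

```python
# (with atomics, without atomics) timings in seconds, from the two tables above.
single_core = {
    "POCL": {"Mass": (0.122, 0.064), "Laplace": (0.124, 0.041), "Hyperelasticity": (0.268, 0.091)},
    "AMD":  {"Mass": (0.136, 0.073), "Laplace": (0.125, 0.057), "Hyperelasticity": (0.264, 0.164)},
}


def atomics_share(with_t, without_t):
    """Fraction of the 'with atomics' runtime that vanishes without atomics."""
    return (with_t - without_t) / with_t


for impl, kernels in single_core.items():
    for kernel, (with_t, without_t) in kernels.items():
        share = atomics_share(with_t, without_t)
        print(f"{impl:5s} {kernel:16s} {100 * share:4.0f}% of runtime")
```

Under that assumption, atomics account for roughly 48% to 67% of the runtime on POCL and roughly 38% to 54% on AMD.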

Cost of atomics on Multicore CPU with Vectorization

MatFree is included only to show the number we are chasing.

| Kernel | With atomics | Without atomics | MatFree |
|---|---|---|---|
| Mass | 0.029 | 0.009 | 0.002 |
| Laplace | 0.020 | 0.010 | 0.002 |
| Hyperelasticity | 0.041 | 0.019 | 0.009 |
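
Since MatFree is the target here, the remaining gap can be expressed as a multiple of the MatFree time. A small sketch over the table above; the names `multicore` and `gap_to_matfree` are illustrative only.

```python
# Multicore timings in seconds, copied from the table above.
multicore = {
    "Mass":            {"with": 0.029, "without": 0.009, "matfree": 0.002},
    "Laplace":         {"with": 0.020, "without": 0.010, "matfree": 0.002},
    "Hyperelasticity": {"with": 0.041, "without": 0.019, "matfree": 0.009},
}


def gap_to_matfree(kernel, variant):
    """Return the runtime of `variant` as a multiple of the MatFree runtime."""
    row = multicore[kernel]
    return row[variant] / row["matfree"]


for kernel in multicore:
    print(f"{kernel:16s} with atomics: {gap_to_matfree(kernel, 'with'):4.1f}x, "
          f"without atomics: {gap_to_matfree(kernel, 'without'):4.1f}x MatFree")
```

Even with the atomics removed, these numbers are still about 2x to 5x behind MatFree, so atomics explain only part of the gap.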

New lower bounds on bandwidth (using footprint measurement)

Mass: 52 GBps (was 1125 GBps)
Laplace: 53 GBps (was 270 GBps)
Hyperelasticity: 18 GBps (was 206 GBps)
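
The note does not say how the footprint was measured. As a rough sketch, assuming the bound is the number of unique bytes a kernel touches divided by its runtime (with 1 GB taken as 10^9 bytes), the calculation and the size of the correction look like this. The function name and the footprint and runtime passed to it are hypothetical; only the old and new estimates come from the list above.

```python
def bandwidth_gb_per_s(footprint_bytes, seconds):
    """Footprint-based bandwidth estimate in GB/s (assumes 1 GB = 1e9 bytes)."""
    return footprint_bytes / seconds / 1e9


# (new, old) bandwidth estimates in GBps, copied from the list above.
estimates = {"Mass": (52, 1125), "Laplace": (53, 270), "Hyperelasticity": (18, 206)}

for kernel, (new, old) in estimates.items():
    print(f"{kernel:16s} old estimate was {old / new:4.1f}x the new one")

# Hypothetical inputs, NOT measured values: 2e9 unique bytes in 0.5 s.
print(bandwidth_gb_per_s(2e9, 0.5))
```

The old figures were between about 5x and 22x higher than the footprint-based ones.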
