@genkuroki
Last active May 29, 2021 12:09
Octavian!
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "https://discourse.julialang.org/t/intel-c-c-compiler-performance-versus-julia/61929/18"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "versioninfo()",
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": "Julia Version 1.7.0-DEV.1129\nCommit 9117b4d6d6 (2021-05-20 16:42 UTC)\nPlatform Info:\n OS: Windows (x86_64-w64-mingw32)\n CPU: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz\n WORD_SIZE: 64\n LIBM: libopenlibm\n LLVM: libLLVM-11.0.1 (ORCJIT, skylake)\nEnvironment:\n JULIA_NUM_THREADS = 12\n JULIA_PYTHONCALL_EXE = C:\\Users\\genkuroki\\.julia\\conda\\3\\python.exe\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "using LinearAlgebra\nusing BLASBenchmarksCPU\nusing Octavian\nusing BenchmarkHistograms",
"execution_count": 2,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "M = K = N = 128\nA = rand(M, K)\nB = rand(K, N)\nC1 = @time(A * B)\nC0 = similar(C1);",
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": " 0.597493 seconds (2.53 M allocations: 134.252 MiB, 7.83% gc time, 99.48% compilation time)\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "@benchmark mul!($C0, $A, $B)",
"execution_count": 4,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 4,
"data": {
"text/plain": "samples: 10000; evals/sample: 1; memory estimate: 0 bytes; allocs estimate: 0\nns\n\n (48100.0 - 55200.0 ] \u001b[32m▏\u001b[39m24\n (55200.0 - 62200.0 ] \u001b[32m█\u001b[39m276\n (62200.0 - 69300.0 ] \u001b[32m██████████████████████████████ \u001b[39m8954\n (69300.0 - 76400.0 ] \u001b[32m█▍\u001b[39m404\n (76400.0 - 83500.0 ] \u001b[32m▋\u001b[39m154\n (83500.0 - 90500.0 ] \u001b[32m▎\u001b[39m44\n (90500.0 - 97600.0 ] \u001b[32m▎\u001b[39m49\n (97600.0 - 104700.0] \u001b[32m▏\u001b[39m23\n (104700.0 - 111700.0] \u001b[32m▏\u001b[39m15\n (111700.0 - 118800.0] \u001b[32m▏\u001b[39m18\n (118800.0 - 125900.0] \u001b[32m▏\u001b[39m8\n (125900.0 - 133000.0] \u001b[32m▏\u001b[39m13\n (133000.0 - 140000.0] \u001b[32m▏\u001b[39m4\n (140000.0 - 147100.0] \u001b[32m▏\u001b[39m4\n (147100.0 - 1.2566e6] \u001b[32m▏\u001b[39m10\n\n Counts\n\nmin: 48.100 μs (0.00% GC); mean: 66.320 μs (0.00% GC); median: 64.700 μs (0.00% GC); max: 1.257 ms (0.00% GC)."
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# MKL dgemm\n@benchmark gemmmkl!($C0, $A, $B)",
"execution_count": 5,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 5,
"data": {
"text/plain": "samples: 10000; evals/sample: 1; memory estimate: 0 bytes; allocs estimate: 0\nns\n\n (15200.0 - 18800.0 ] \u001b[32m█▎\u001b[39m155\n (18800.0 - 22500.0 ] \u001b[32m████▍\u001b[39m591\n (22500.0 - 26100.0 ] \u001b[32m██████████████████████▋\u001b[39m3063\n (26100.0 - 29700.0 ] \u001b[32m██████████████████████████████ \u001b[39m4066\n (29700.0 - 33400.0 ] \u001b[32m████████████▌\u001b[39m1692\n (33400.0 - 37000.0 ] \u001b[32m▍\u001b[39m49\n (37000.0 - 40700.0 ] \u001b[32m▍\u001b[39m45\n (40700.0 - 44300.0 ] \u001b[32m▉\u001b[39m118\n (44300.0 - 47900.0 ] \u001b[32m▊\u001b[39m99\n (47900.0 - 51600.0 ] \u001b[32m▋\u001b[39m76\n (51600.0 - 55200.0 ] \u001b[32m▏\u001b[39m13\n (55200.0 - 58800.0 ] \u001b[32m▏\u001b[39m5\n (58800.0 - 62500.0 ] \u001b[32m▏\u001b[39m4\n (62500.0 - 66100.0 ] \u001b[32m▏\u001b[39m14\n (66100.0 - 137500.0] \u001b[32m▏\u001b[39m10\n\n Counts\n\nmin: 15.200 μs (0.00% GC); mean: 27.755 μs (0.00% GC); median: 27.700 μs (0.00% GC); max: 137.500 μs (0.00% GC)."
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# MKL dgemm_direct\n@benchmark gemmmkl_direct!($C0, $A, $B)",
"execution_count": 6,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 6,
"data": {
"text/plain": "samples: 10000; evals/sample: 1; memory estimate: 0 bytes; allocs estimate: 0\nns\n\n (23500.0 - 26500.0 ] \u001b[32m█████████████████████▏\u001b[39m2994\n (26500.0 - 29500.0 ] \u001b[32m██████████████████████████████ \u001b[39m4277\n (29500.0 - 32600.0 ] \u001b[32m██████████████▎\u001b[39m2028\n (32600.0 - 35600.0 ] \u001b[32m███▎\u001b[39m449\n (35600.0 - 38600.0 ] \u001b[32m▌\u001b[39m60\n (38600.0 - 41600.0 ] \u001b[32m▎\u001b[39m20\n (41600.0 - 44700.0 ] \u001b[32m▌\u001b[39m63\n (44700.0 - 47700.0 ] \u001b[32m▍\u001b[39m39\n (47700.0 - 50700.0 ] \u001b[32m▎\u001b[39m31\n (50700.0 - 53700.0 ] \u001b[32m▏\u001b[39m9\n (53700.0 - 56700.0 ] \u001b[32m▏\u001b[39m4\n (56700.0 - 59800.0 ] \u001b[32m▏\u001b[39m8\n (59800.0 - 62800.0 ] \u001b[32m▏\u001b[39m5\n (62800.0 - 65800.0 ] \u001b[32m▏\u001b[39m3\n (65800.0 - 104900.0] \u001b[32m▏\u001b[39m10\n\n Counts\n\nmin: 23.500 μs (0.00% GC); mean: 28.604 μs (0.00% GC); median: 28.200 μs (0.00% GC); max: 104.900 μs (0.00% GC)."
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "# Octavian.jl\n@benchmark matmul!($C0, $A, $B)",
"execution_count": 7,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 7,
"data": {
"text/plain": "samples: 10000; evals/sample: 1; memory estimate: 0 bytes; allocs estimate: 0\nns\n\n (16600.0 - 20500.0 ] \u001b[32m▍\u001b[39m88\n (20500.0 - 24400.0 ] \u001b[32m██████▋\u001b[39m1531\n (24400.0 - 28300.0 ] \u001b[32m██████████████████████████████▏\u001b[39m7049\n (28300.0 - 32200.0 ] \u001b[32m███▏\u001b[39m710\n (32200.0 - 36100.0 ] \u001b[32m█▎\u001b[39m265\n (36100.0 - 40000.0 ] \u001b[32m▍\u001b[39m80\n (40000.0 - 43900.0 ] \u001b[32m▍\u001b[39m88\n (43900.0 - 47700.0 ] \u001b[32m▋\u001b[39m118\n (47700.0 - 51600.0 ] \u001b[32m▏\u001b[39m22\n (51600.0 - 55500.0 ] \u001b[32m▏\u001b[39m3\n (55500.0 - 59400.0 ] \u001b[32m▏\u001b[39m7\n (59400.0 - 63300.0 ] \u001b[32m▏\u001b[39m17\n (63300.0 - 67200.0 ] \u001b[32m▏\u001b[39m7\n (67200.0 - 71100.0 ] \u001b[32m▏\u001b[39m5\n (71100.0 - 150600.0] \u001b[32m▏\u001b[39m10\n\n Counts\n\nmin: 16.600 μs (0.00% GC); mean: 26.611 μs (0.00% GC); median: 25.600 μs (0.00% GC); max: 150.600 μs (0.00% GC)."
},
"metadata": {}
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"@webio": {
"lastKernelId": null,
"lastCommId": null
},
"kernelspec": {
"name": "julia-1.7-depwarn-o3",
"display_name": "Julia 1.7.0-DEV depwarn -O3",
"language": "julia"
},
"language_info": {
"file_extension": ".jl",
"name": "julia",
"mimetype": "application/julia",
"version": "1.7.0"
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"base_numbering": 1,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"gist": {
"id": "6123aef79488bc20b52047656fc6f015",
"data": {
"description": "Octavian",
"public": true
}
},
"_draft": {
"nbviewer_url": "https://gist.github.com/6123aef79488bc20b52047656fc6f015"
}
},
"nbformat": 4,
"nbformat_minor": 5
}