@JiaweiZhuang
Last active February 19, 2018 20:50
NN prediction benchmark
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Update**: Installed the Intel-optimized TensorFlow build following https://github.com/tensorflow/tensorflow/issues/17028#issuecomment-366548366. With it, plain TensorFlow is as fast as PyTorch on this benchmark, while Keras on the TF backend remains roughly 3x slower.\n",
"\n",
"**NN prediction timing (MLP with 2 hidden layers, CPU only)**\n",
"\n",
"|PyTorch|Keras (TF)|TensorFlow|\n",
"|------|------|------|\n",
"|133 ms|379 ms|131 ms|"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Platform info"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- System: AWS Ubuntu 16.04 base AMI (ami-66506c1c)\n",
"- Instance type: c5.large, CPU-only\n",
"\n",
"Software installation:\n",
"- Python: Miniconda Python 3.6 https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh\n",
"- TensorFlow: ```conda install -c intel tensorflow```\n",
"- Keras: ```pip install keras```\n",
"- PyTorch: ```conda install pytorch```"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Architecture: x86_64\r\n",
"CPU op-mode(s): 32-bit, 64-bit\r\n",
"Byte Order: Little Endian\r\n",
"CPU(s): 2\r\n",
"On-line CPU(s) list: 0,1\r\n",
"Thread(s) per core: 2\r\n",
"Core(s) per socket: 1\r\n",
"Socket(s): 1\r\n",
"NUMA node(s): 1\r\n",
"Vendor ID: GenuineIntel\r\n",
"CPU family: 6\r\n",
"Model: 85\r\n",
"Model name: Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz\r\n",
"Stepping: 3\r\n",
"CPU MHz: 3000.000\r\n",
"BogoMIPS: 6000.00\r\n",
"Hypervisor vendor: KVM\r\n",
"Virtualization type: full\r\n",
"L1d cache: 32K\r\n",
"L1i cache: 32K\r\n",
"L2 cache: 1024K\r\n",
"L3 cache: 25344K\r\n",
"NUMA node0 CPU(s): 0,1\r\n",
"Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f rdseed adx smap clflushopt clwb avx512cd xsaveopt xsavec xgetbv1 ida arat\r\n"
]
}
],
"source": [
"!lscpu"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preparation"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"n_sample = 20000\n",
"n_feature = 500\n",
"X = np.random.rand(n_sample, n_feature)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# use the same hidden-layer sizes for all frameworks\n",
"H1, H2 = 400, 400"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PyTorch"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"'0.3.0'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import torch\n",
"from torch.autograd import Variable\n",
"torch.__version__"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"model = torch.nn.Sequential(\n",
" torch.nn.Linear(n_feature, H1),\n",
" torch.nn.ReLU(),\n",
" torch.nn.Linear(H1, H2),\n",
" torch.nn.ReLU(),\n",
" torch.nn.Linear(H2, 1)\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"X_torch = Variable(torch.from_numpy(X).type(torch.FloatTensor))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"133 ms ± 1.18 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"%timeit model(X_torch).data.numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Keras (TF-backend)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
},
{
"data": {
"text/plain": [
"'2.1.4'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import keras\n",
"from keras import models, layers\n",
"keras.__version__"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"network = models.Sequential()\n",
"network.add(layers.Dense(H1, activation='relu', input_shape=(n_feature,)))\n",
"network.add(layers.Dense(H2, activation='relu'))\n",
"network.add(layers.Dense(1, activation=None))"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"379 ms ± 560 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit network.predict(X)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TensorFlow"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'1.3.1'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import tensorflow as tf\n",
"from tensorflow.contrib.layers import fully_connected\n",
"tf.__version__"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"tf.reset_default_graph()\n",
"\n",
"X_tf = tf.placeholder(tf.float32, shape=(None, n_feature), name=\"X\")\n",
"\n",
"# fully_connected applies ReLU by default; the output layer disables it\n",
"hidden1 = fully_connected(X_tf, H1, scope=\"hidden1\")\n",
"hidden2 = fully_connected(hidden1, H2, scope=\"hidden2\")\n",
"out = fully_connected(hidden2, 1, scope=\"outputs\", activation_fn=None)\n",
"\n",
"init = tf.global_variables_initializer()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"131 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
]
}
],
"source": [
"with tf.Session() as sess: \n",
" init.run()\n",
" %timeit out.eval(feed_dict={X_tf: X})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
},
"toc": {
"nav_menu": {},
"number_sections": true,
"sideBar": true,
"skip_h1_title": false,
"toc_cell": false,
"toc_position": {},
"toc_section_display": "block",
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}