@sub-mod
Last active April 16, 2019 20:49
GPU testing on OpenShift 3.11

https://gist.github.com/sub-mod/638f4201a919993827c29533ed5c85a8

{
	"apiVersion": "v1",
	"kind": "Pod",
	"metadata": {
		"name": "cuda-vector-add",
		"namespace": "nvidia"
	},
	"spec": {
		"restartPolicy": "OnFailure",
		"containers": [
			{
				"name": "cuda-vector-add",
				"image": "docker.io/mirrorgooglecontainers/cuda-vector-add:v0.1",
				"env": [
					{
						"name": "NVIDIA_VISIBLE_DEVICES",
						"value": "all"
					},
					{
						"name": "NVIDIA_DRIVER_CAPABILITIES",
						"value": "compute,utility"
					},
					{
						"name": "NVIDIA_REQUIRE_CUDA",
						"value": "cuda>=5.0"
					}
				],
				"securityContext": {
					"allowPrivilegeEscalation": false,
					"capabilities": {
						"drop": [
							"ALL"
						]
					},
					"seLinuxOptions": {
						"type": "nvidia_container_t"
					}
				},
				"resources": {
					"limits": {
						"nvidia.com/gpu": 1
					}
				}
			}
		]
	}
}
docker run --user 1000:1000 --security-opt=no-new-privileges --cap-drop=ALL --security-opt label=type:nvidia_container_t nvidia/cuda:9.0-base nvidia-smi
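
The `docker run` flags above appear to mirror the pod's `securityContext`. As a rough sketch of that correspondence (the helper name `docker_flags` is hypothetical, and only the three fields used in this gist are handled), the mapping could be written as follows. `--user 1000:1000` has no counterpart in the spec shown, so it is left out.

```python
def docker_flags(security_context):
    """Translate the securityContext fields used in this gist into docker run flags."""
    flags = []
    # allowPrivilegeEscalation: false corresponds to the no-new-privileges option.
    if security_context.get("allowPrivilegeEscalation") is False:
        flags.append("--security-opt=no-new-privileges")
    # Each dropped capability becomes one --cap-drop flag.
    for cap in security_context.get("capabilities", {}).get("drop", []):
        flags.append("--cap-drop=" + cap)
    # The SELinux type becomes a label=type:... security option.
    selinux_type = security_context.get("seLinuxOptions", {}).get("type")
    if selinux_type:
        flags += ["--security-opt", "label=type:" + selinux_type]
    return flags


# The securityContext copied from the pod spec above.
security_context = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "seLinuxOptions": {"type": "nvidia_container_t"},
}

print(" ".join(docker_flags(security_context)))
```

Running the command by hand with these flags is a quick way to check the node's GPU setup under the same restrictions before involving the scheduler.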


{
	"apiVersion": "v1",
	"kind": "Pod",
	"metadata": {
		"name": "test-gpu-310"
	},
	"spec": {
		"restartPolicy": "OnFailure",
		"containers": [
			{
				"name": "test-gpu-310",
				"image": "submod/test-gpu-310",
				"env": [
					{
						"name": "NVIDIA_VISIBLE_DEVICES",
						"value": "all"
					},
					{
						"name": "NVIDIA_DRIVER_CAPABILITIES",
						"value": "compute,utility"
					},
					{
						"name": "NVIDIA_REQUIRE_CUDA",
						"value": "cuda>=9.0"
					}
				],
				"securityContext": {
					"allowPrivilegeEscalation": false,
					"capabilities": {
						"drop": [
							"ALL"
						]
					},
					"seLinuxOptions": {
						"type": "nvidia_container_t"
					}
				},
				"resources": {
					"limits": {
						"nvidia.com/gpu": 1
					}
				}
			}
		]
	}
}
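
The two pod specs above differ only in name, namespace, image, and the minimum CUDA version. A minimal sketch that generates such a spec from those parameters might look like this; the function name `gpu_test_pod` and its parameters are made up for illustration and are not part of any OpenShift tooling.

```python
import json


def gpu_test_pod(name, image, min_cuda, namespace=None, gpus=1):
    """Build a GPU test pod spec shaped like the two JSON documents above."""
    metadata = {"name": name}
    if namespace is not None:
        metadata["namespace"] = namespace
    container = {
        "name": name,
        "image": image,
        "env": [
            {"name": "NVIDIA_VISIBLE_DEVICES", "value": "all"},
            {"name": "NVIDIA_DRIVER_CAPABILITIES", "value": "compute,utility"},
            {"name": "NVIDIA_REQUIRE_CUDA", "value": "cuda>=" + min_cuda},
        ],
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
            "seLinuxOptions": {"type": "nvidia_container_t"},
        },
        "resources": {"limits": {"nvidia.com/gpu": gpus}},
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": metadata,
        "spec": {"restartPolicy": "OnFailure", "containers": [container]},
    }


# Reproduce the second pod spec from this gist and print it as JSON.
pod = gpu_test_pod("test-gpu-310", "submod/test-gpu-310", "9.0")
print(json.dumps(pod, indent=2))
```

Assuming the script is saved under some name such as `make_pod.py` and the `oc` client is logged in to the cluster, the output could be piped into `oc create -f -`.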
sh-4.2$ curl "https://bootstrap.pypa.io/get-pip.py" -o "get-pip.py"
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1659k  100 1659k    0     0   989k      0  0:00:01  0:00:01 --:--:--  989k
sh-4.2$ python get-pip.py --user
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Collecting pip
  Downloading https://files.pythonhosted.org/packages/d8/f3/413bab4ff08e1fc4828dfc59996d721917df8e8583ea85385d51125dceff/pip-19.0.3-py2.py3-none-any.whl (1.4MB)
    100% |################################| 1.4MB 772kB/s
Collecting wheel
  Downloading https://files.pythonhosted.org/packages/96/ba/a4702cbb6a3a485239fbe9525443446203f00771af9ac000fa3ef2788201/wheel-0.33.1-py2.py3-none-any.whl
Installing collected packages: pip, wheel
  The script wheel is installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed pip-19.0.3 wheel-0.33.1
sh-4.2$ pip install  https://github.com/AICoE/tensorflow-wheels/releases/download/tf-r1.13-gpu-2019-04-16_003239/tensorflow-1.13.1-cp27-cp27mu-linux_x86_64.whl --user
sh: pip: command not found
sh-4.2$ /opt/app-root/src/.local/bin/pip install  https://github.com/AICoE/tensorflow-wheels/releases/download/tf-r1.13-gpu-2019-04-16_003239/tensorflow-1.13.1-cp27-cp27mu-linux_x86_64.whl --user
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Collecting tensorflow==1.13.1 from https://github.com/AICoE/tensorflow-wheels/releases/download/tf-r1.13-gpu-2019-04-16_003239/tensorflow-1.13.1-cp27-cp27mu-linux_x86_64.whl
  Downloading https://github.com/AICoE/tensorflow-wheels/releases/download/tf-r1.13-gpu-2019-04-16_003239/tensorflow-1.13.1-cp27-cp27mu-linux_x86_64.whl (285.4MB)
    100% |################################| 285.4MB 151kB/s
Collecting keras-preprocessing>=1.0.5 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/c0/bf/0315ef6a9fd3fc2346e85b0ff1f5f83ca17073f2c31ac719ab2e4da0d4a3/Keras_Preprocessing-1.0.9-py2.py3-none-any.whl (59kB)
    100% |################################| 61kB 263kB/s
Collecting enum34>=1.1.6 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/c5/db/e56e6b4bbac7c4a06de1c50de6fe1ef3810018ae11732a50f15f62c7d050/enum34-1.1.6-py2-none-any.whl
Collecting astor>=0.6.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/35/6b/11530768cac581a12952a2aad00e1526b89d242d0b9f59534ef6e6a1752f/astor-0.7.1-py2.py3-none-any.whl
Collecting backports.weakref>=1.0rc1 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/88/ec/f598b633c3d5ffe267aaada57d961c94fdfa183c5c3ebda2b6d151943db6/backports.weakref-1.0.post1-py2.py3-none-any.whl
Requirement already satisfied: wheel in ./.local/lib/python2.7/site-packages (from tensorflow==1.13.1) (0.33.1)
Collecting mock>=2.0.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/e6/35/f187bdf23be87092bd0f1200d43d23076cee4d0dec109f195173fd3ebc79/mock-2.0.0-py2.py3-none-any.whl (56kB)
    100% |################################| 61kB 525kB/s
Collecting tensorflow-estimator<1.14.0rc0,>=1.13.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/bb/48/13f49fc3fa0fdf916aa1419013bb8f2ad09674c275b4046d5ee669a46873/tensorflow_estimator-1.13.0-py2.py3-none-any.whl (367kB)
    100% |################################| 368kB 885kB/s
Collecting gast>=0.2.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/4e/35/11749bf99b2d4e3cceb4d55ca22590b0d7c2c62b9de38ac4a4a7f4687421/gast-0.2.2.tar.gz
Collecting termcolor>=1.1.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/8a/48/a76be51647d0eb9f10e2a4511bf3ffb8cc1e6b14e9e4fab46173aa79f981/termcolor-1.1.0.tar.gz
Collecting protobuf>=3.6.1 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/ea/72/5eadea03b06ca1320be2433ef2236155da17806b700efc92677ee99ae119/protobuf-3.7.1-cp27-cp27mu-manylinux1_x86_64.whl (1.2MB)
    100% |################################| 1.2MB 19.4MB/s
Collecting absl-py>=0.1.6 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/da/3f/9b0355080b81b15ba6a9ffcf1f5ea39e307a2778b2f2dc8694724e8abd5b/absl-py-0.7.1.tar.gz (99kB)
    100% |################################| 102kB 42.5MB/s
Collecting tensorboard<1.14.0,>=1.13.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/89/ac/48dd71c2bdc8d31e367f9b72f25ccb3b89bc6b9d664fee21f9a8efa5714d/tensorboard-1.13.1-py2-none-any.whl (3.2MB)
    100% |################################| 3.2MB 9.7MB/s
Collecting six>=1.10.0 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Collecting keras-applications>=1.0.6 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/90/85/64c82949765cfb246bbdaf5aca2d55f400f792655927a017710a78445def/Keras_Applications-1.0.7-py2.py3-none-any.whl (51kB)
    100% |################################| 61kB 34.7MB/s
Collecting grpcio>=1.8.6 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/b8/be/3bb6d8241b5ed1f8437169df53e7dd6ca986174e022585de15087a848c99/grpcio-1.19.0-cp27-cp27mu-manylinux1_x86_64.whl (10.7MB)
    100% |################################| 10.7MB 4.0MB/s
Collecting numpy>=1.13.3 (from tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/c4/33/8ec8dcdb4ede5d453047bbdbd01916dbaccdb63e98bba60989718f5f0876/numpy-1.16.2-cp27-cp27mu-manylinux1_x86_64.whl (17.0MB)
    100% |################################| 17.0MB 2.7MB/s
Collecting funcsigs>=1; python_version < "3.3" (from mock>=2.0.0->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/69/cb/f5be453359271714c01b9bd06126eaf2e368f1fddfff30818754b5ac2328/funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock>=2.0.0->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/14/09/12fe9a14237a6b7e0ba3a8d6fcf254bf4b10ec56a0185f73d651145e9222/pbr-5.1.3-py2.py3-none-any.whl (107kB)
    100% |################################| 112kB 42.0MB/s
Requirement already satisfied: setuptools in /usr/lib/python2.7/site-packages (from protobuf>=3.6.1->tensorflow==1.13.1) (0.9.8)
Collecting futures>=3.1.1; python_version < "3" (from tensorboard<1.14.0,>=1.13.0->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
Collecting werkzeug>=0.11.15 (from tensorboard<1.14.0,>=1.13.0->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/18/79/84f02539cc181cdbf5ff5a41b9f52cae870b6f632767e43ba6ac70132e92/Werkzeug-0.15.2-py2.py3-none-any.whl (328kB)
    100% |################################| 337kB 40.6MB/s
Collecting markdown>=2.6.8 (from tensorboard<1.14.0,>=1.13.0->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/f5/e4/d8c18f2555add57ff21bf25af36d827145896a07607486cc79a2aea641af/Markdown-3.1-py2.py3-none-any.whl (87kB)
    100% |################################| 92kB 42.9MB/s
Collecting h5py (from keras-applications>=1.0.6->tensorflow==1.13.1)
  Downloading https://files.pythonhosted.org/packages/53/08/27e4e9a369321862ffdce80ff1770553e9daec65d98befb2e14e7478b698/h5py-2.9.0-cp27-cp27mu-manylinux1_x86_64.whl (2.8MB)
    100% |################################| 2.8MB 18.3MB/s
Building wheels for collected packages: gast, termcolor, absl-py
  Building wheel for gast (setup.py) ... done
  Stored in directory: /opt/app-root/src/.cache/pip/wheels/5c/2e/7e/a1d4d4fcebe6c381f378ce7743a3ced3699feb89bcfbdadadd
  Building wheel for termcolor (setup.py) ... done
  Stored in directory: /opt/app-root/src/.cache/pip/wheels/7c/06/54/bc84598ba1daf8f970247f550b175aaaee85f68b4b0c5ab2c6
  Building wheel for absl-py (setup.py) ... done
  Stored in directory: /opt/app-root/src/.cache/pip/wheels/ee/98/38/46cbcc5a93cfea5492d19c38562691ddb23b940176c14f7b48
Successfully built gast termcolor absl-py
markdown 3.1 has requirement setuptools>=36, but you'll have setuptools 0.9.8 which is incompatible.
Installing collected packages: six, numpy, keras-preprocessing, enum34, astor, backports.weakref, funcsigs, pbr, mock, absl-py, tensorflow-estimator, gast, termcolor, protobuf, futures, grpcio, werkzeug, markdown, tensorboard, h5py, keras-applications, tensorflow
  The scripts f2py, f2py2 and f2py2.7 are installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The script pbr is installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The script markdown_py is installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The script tensorboard is installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  The scripts freeze_graph, saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/opt/app-root/src/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed absl-py-0.7.1 astor-0.7.1 backports.weakref-1.0.post1 enum34-1.1.6 funcsigs-1.0.2 futures-3.2.0 gast-0.2.2 grpcio-1.19.0 h5py-2.9.0 keras-applications-1.0.7 keras-preprocessing-1.0.9 markdown-3.1 mock-2.0.0 numpy-1.16.2 pbr-5.1.3 protobuf-3.7.1 six-1.12.0 tensorboard-1.13.1 tensorflow-1.13.1 tensorflow-estimator-1.13.0 termcolor-1.1.0 werkzeug-0.15.2
sh-4.2$
sh-4.2$
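
The `pip: command not found` error and the repeated PATH warnings in the transcript above come from `--user` installs placing scripts in a directory that is not on `PATH`. One way to avoid typing the full path is `python -m pip install ...`, which does not depend on `PATH`. Alternatively, the sketch below (helper names are hypothetical) locates the per-user script directory and prints an `export` line when it is missing from `PATH`. It assumes a Linux layout where scripts live under `<user base>/bin`.

```python
import os
import site


def user_script_dir():
    """Directory where `pip install --user` places console scripts on Linux."""
    return os.path.join(site.getuserbase(), "bin")


def is_on_path(directory, path_env):
    """Return True if directory is one of the entries of a PATH-style string."""
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return os.path.normpath(directory) in entries


script_dir = user_script_dir()
if not is_on_path(script_dir, os.environ.get("PATH", "")):
    # Print a line that can be pasted into the shell or a profile file.
    print('export PATH="{}:$PATH"'.format(script_dir))
```

In the container used here, where the home directory is `/opt/app-root/src`, the script directory would be the `/opt/app-root/src/.local/bin` named in the warnings.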
sh-4.2$ TEST_CMD="import tensorflow as tf ; a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') ; \
> b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') ; c = tf.matmul(a, b) ; \
> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) ;print(sess.run(c))"
sh-4.2$
sh-4.2$ python -c "$TEST_CMD"
2019-04-16 00:47:39.794173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: Tesla P100-SXM2-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.4805
pciBusID: 0000:06:00.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2019-04-16 00:47:39.924800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 1 with properties:
name: Tesla P100-SXM2-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.4805
pciBusID: 0000:84:00.0
totalMemory: 15.90GiB freeMemory: 15.64GiB
2019-04-16 00:47:39.924905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0, 1
2019-04-16 00:47:40.964429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-16 00:47:40.964490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 1
2019-04-16 00:47:40.964500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N Y
2019-04-16 00:47:40.964505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 1:   Y N
2019-04-16 00:47:40.966164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15148 MB memory) -> physical GPU (device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0, compute capability: 6.0)
2019-04-16 00:47:40.966960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 15148 MB memory) -> physical GPU (device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0, compute capability: 6.0)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0, compute capability: 6.0
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0, compute capability: 6.0
2019-04-16 00:47:40.974724: I tensorflow/core/common_runtime/direct_session.cc:317] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:06:00.0, compute capability: 6.0
/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla P100-SXM2-16GB, pci bus id: 0000:84:00.0, compute capability: 6.0

MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-16 00:47:40.976255: I tensorflow/core/common_runtime/placer.cc:1059] MatMul: (MatMul)/job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-16 00:47:40.976299: I tensorflow/core/common_runtime/placer.cc:1059] a: (Const)/job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2019-04-16 00:47:40.976326: I tensorflow/core/common_runtime/placer.cc:1059] b: (Const)/job:localhost/replica:0/task:0/device:GPU:0
[[22. 28.]
 [49. 64.]]
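
The output above shows two things: every op was placed on `GPU:0`, and the product matches the expected matrix. A small sketch for checking both without TensorFlow follows; the function names and the regular expression are assumptions based only on the log lines in this gist, and the format may differ in other TensorFlow versions.

```python
import re

# Matches un-timestamped placement lines such as
# "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0".
PLACEMENT = re.compile(r"^(\w+): \((\w+)\): (\S+)$")


def op_placements(log_text):
    """Map each op name to the device it was placed on."""
    placements = {}
    for line in log_text.splitlines():
        match = PLACEMENT.match(line.strip())
        if match:
            placements[match.group(1)] = match.group(3)
    return placements


def matmul(a, b):
    """Plain-Python matrix product, used only to cross-check the result."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Placement lines copied from the transcript above.
sample_log = """\
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
a: (Const): /job:localhost/replica:0/task:0/device:GPU:0
b: (Const): /job:localhost/replica:0/task:0/device:GPU:0
"""

placements = op_placements(sample_log)
all_on_gpu = all("/device:GPU:" in device for device in placements.values())
print(placements, all_on_gpu)

# The same values as in TEST_CMD, reshaped to 2x3 and 3x2 in row-major order.
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```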