@retep 12/27/18
The main inspiration comes from [here][1].
Here is what a deep learning system stack looks like nowadays (code sketches for two of the layers follow the list):
- Operator-level graph description languages: the graph formats of whatever DL frameworks you care about, plus [ONNX][2].
- Tensor-primitive-level graph description languages: [NNVM][3], [HLO/XLA][4], [nGraph][5]. This layer is close enough to the first that you can also build graph optimizations on the first layer and bypass this one.
- DSLs for description and code generation: TVM, and image-processing languages like [Halide][6] and [Darkroom][7].
- Hard-coded optimized kernel libraries: [NNPACK][8], [cuDNN][9], [libdnn][10].
- Device-dependent libraries: [maxas][11] (an assembler for the NVIDIA Maxwell architecture).
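To make the top layer concrete, here is a minimal sketch (not from the original note) of exporting an operator-level graph to ONNX. It assumes PyTorch, torchvision, and the `onnx` Python package are installed; `resnet18` is just an arbitrary example model.

```python
# Minimal sketch: export a framework model's operator-level graph to ONNX.
# Assumes torch, torchvision, and onnx are installed; resnet18 is an
# arbitrary example model, not one named in the note.
import torch
import torchvision
import onnx

model = torchvision.models.resnet18(pretrained=True).eval()
dummy = torch.randn(1, 3, 224, 224)  # a single 224x224 RGB image

# Trace the model and serialize its graph in the ONNX format.
torch.onnx.export(model, dummy, "resnet18.onnx")

# Load it back and print the graph, one operator per line.
print(onnx.helper.printable_graph(onnx.load("resnet18.onnx").graph))
```

And for the DSL-and-codegen layer, a similar sketch using TVM's tensor-expression API (the API shown comes from TVM releases later than this note, so treat the exact names as assumptions):

```python
import tvm
from tvm import te

# Describe the computation symbolically: elementwise C = A + B.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")

# Schedule the loop nest and generate machine code for the local CPU.
s = te.create_schedule(C.op)
fadd = tvm.build(s, [A, B, C], target="llvm")
```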
# Note: this is not a bash script (some of the steps require a reboot).
# I named it .sh just so GitHub does correct syntax highlighting.
#
# This is also available as an AMI in us-east-1 (Virginia): ami-cf5028a5
#
# The CUDA part is mostly based on this excellent blog post:
# http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/

# Install various packages
sudo apt-get update