Tech:
- Ubuntu
- NVIDIA CUDA
- Python
- Theano
- TensorFlow
- Keras
- scikit-learn
- Vowpal Wabbit
- lda2vec
- spaCy
- and more
Start with the Ubuntu 14.04 AMI ami-7c927e11 from Canonical (HVM-SSD), launched on a GPU instance.
sudo apt-get update
sudo apt-get install aptitude wget python-numpy python-scipy python-dev python-pip python-nose g++ libatlas-base-dev gfortran libopenblas-dev git build-essential linux-image-extra-virtual libboost-dev libboost-program-options-dev libboost-python-dev libboost-mpi-python-dev libhdf5-dev libhdf5-7 pkg-config zip zlib1g-dev unzip swig
echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
sudo reboot
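After the reboot it is worth confirming that the blacklist took effect before installing the NVIDIA driver. The following is a minimal sketch, not part of the original setup: the helper names are hypothetical, and it assumes the usual Linux /proc/modules listing, where each line starts with a module name.

```python
import os


def blacklisted_modules(conf_text):
    """Return module names listed on 'blacklist <name>' lines."""
    names = set()
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            names.add(parts[1])
    return names


def nouveau_still_loaded(proc_modules_text):
    """True if a module named exactly 'nouveau' appears in /proc/modules text."""
    loaded = set(
        line.split()[0] for line in proc_modules_text.splitlines() if line.strip()
    )
    return "nouveau" in loaded


if __name__ == "__main__":
    conf = "/etc/modprobe.d/blacklist-nouveau.conf"
    if os.path.exists(conf):
        with open(conf) as f:
            print("blacklisted: %s" % sorted(blacklisted_modules(f.read())))
    if os.path.exists("/proc/modules"):
        with open("/proc/modules") as f:
            print("nouveau loaded: %s" % nouveau_still_loaded(f.read()))
```

If the last line reports True, the blacklist file was probably not picked up by the initramfs, and the update-initramfs step is the first thing to re-check.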
# Headers must match the running kernel (3.13.0-86-generic when this was written)
sudo apt-get install linux-headers-$(uname -r)
sudo apt-get install -y build-essential python-pip python-dev git python-numpy swig zip zlib1g-dev
Download cuda-repo-ubuntu1404_7.5-18_amd64.deb from NVIDIA's CUDA downloads page first, then:
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda
sudo reboot
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
make
./deviceQuery
cd
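deviceQuery ends its report with a "Result = PASS" line when it can talk to the GPU. As a sketch (the helper names are hypothetical; the binary path is the one built above), a script can check for that line instead of reading the output by eye:

```python
import os
import subprocess

DEVICE_QUERY = "/usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery"


def device_query_passed(output):
    """True if deviceQuery's report contains its success line."""
    return "Result = PASS" in output


def run_device_query(binary=DEVICE_QUERY):
    """Run deviceQuery and return its output as text, or None if it is not built."""
    if not os.path.exists(binary):
        return None
    proc = subprocess.Popen(
        [binary], stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")


if __name__ == "__main__":
    report = run_device_query()
    if report is None:
        print("deviceQuery not found; run make first")
    elif device_query_passed(report):
        print("GPU OK")
    else:
        print("deviceQuery did not pass")
```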
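Download the two cuDNN 5 .deb packages for CUDA 7.5 from NVIDIA (this requires a developer account), then:
sudo dpkg -i libcudnn5_5.0.4-1+cuda7.5_amd64.deb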
sudo dpkg -i libcudnn5-dev_5.0.4-1+cuda7.5_amd64.deb
sudo cp /usr/include/cudnn* /usr/local/cuda/include/
echo -e "\nexport PATH=/usr/local/cuda/bin:\$PATH\n\nexport LD_LIBRARY_PATH=/usr/local/cuda/lib64" >> ~/.bashrc
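Once a new shell has picked up .bashrc, both variables should contain the CUDA directories. A minimal sketch of that check follows; the function name is hypothetical, and the two directories are the ones exported above.

```python
import os

CUDA_BIN = "/usr/local/cuda/bin"
CUDA_LIB = "/usr/local/cuda/lib64"


def cuda_env_problems(env):
    """Return a list of problems found in PATH and LD_LIBRARY_PATH."""
    problems = []
    # Entries are colon-separated on Linux.
    if CUDA_BIN not in env.get("PATH", "").split(":"):
        problems.append("PATH is missing " + CUDA_BIN)
    if CUDA_LIB not in env.get("LD_LIBRARY_PATH", "").split(":"):
        problems.append("LD_LIBRARY_PATH is missing " + CUDA_LIB)
    return problems


if __name__ == "__main__":
    for problem in cuda_env_problems(os.environ):
        print(problem)
```

No output means both directories are visible to the current process.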
Install Theano and Keras from GitHub. The upgrade also reinstalls SciPy, which can now build against gfortran and ATLAS.
# sudo pip install --upgrade pip
sudo pip install --upgrade git+https://github.com/Theano/Theano.git
sudo pip install --upgrade --no-cache git+https://github.com/fchollet/keras.git
Configure Theano to always use the GPU:
echo -e "\n[global]\nfloatX=float32\ndevice=gpu\nmode=FAST_RUN\n\n[nvcc]\nfastmath=True\n\n[cuda]\nroot=/usr/local/cuda" >> ~/.theanorc
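Because the command appends to ~/.theanorc, it is easy to end up with a file that does not say what you expect. The sketch below reads the file back as a plain INI file and checks the two [global] options that select the GPU. The helper names are hypothetical, and it only inspects the file; it does not ask Theano what it actually loaded.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2.7


def theano_settings(path):
    """Read a .theanorc-style INI file into {section: {option: value}}."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep option names such as floatX case-sensitive
    parser.read(path)
    return dict((s, dict(parser.items(s))) for s in parser.sections())


def gpu_configured(settings):
    """True if [global] selects a GPU device with float32 values."""
    g = settings.get("global", {})
    return g.get("device", "").startswith("gpu") and g.get("floatX") == "float32"


if __name__ == "__main__":
    rc = os.path.expanduser("~/.theanorc")
    if os.path.exists(rc):
        print("GPU configured: %s" % gpu_configured(theano_settings(rc)))
```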
You may see some warnings about the Python version.
cd
source .bashrc
git clone https://github.com/fchollet/keras.git
cd keras/examples/
python lstm_text_generation.py
cd
You should see output like:
Using Theano backend.
Using gpu device 0: GRID K520 (CNMeM is disabled, cuDNN 5004)
corpus length: 600901
total chars: 59
nb sequences: 200287
Vectorization...
Build model...
And in top, something like the following (cudafe and cc1plus are CUDA and C++ compiler processes):
18765 ubuntu 20 0 63484 53724 3568 R 17.6 0.3 0:00.53 cudafe
19055 ubuntu 20 0 95964 59760 4728 R 9.3 0.4 0:00.28 cc1plus
Install TensorFlow, following https://www.tensorflow.org/versions/r0.8/get_started/os_setup.html#installation-for-linux
sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
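The wheel's file name encodes what it was built for, following the standard wheel naming scheme (name, version, Python tag, ABI tag, platform tag). A small sketch, with a hypothetical helper name, that splits the name used above; for simplicity it assumes there is no optional build tag:

```python
def wheel_tags(filename):
    """Split a wheel file name into its dash-separated fields (no build tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    name, version, python_tag, abi_tag, platform_tag = filename[:-4].split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


tags = wheel_tags("tensorflow-0.8.0-cp27-none-linux_x86_64.whl")
print("%s %s" % (tags["python"], tags["platform"]))
```

Here cp27 means CPython 2.7 and linux_x86_64 means 64-bit Linux, so pip will only accept this wheel under the system Python 2.7 on this instance.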
Compiling from source may be necessary to make use of the GPU.
Building from source requires Java; you may prefer Java 7 for safety.
sudo apt-add-repository ppa:webupd8team/java
Note: Personal Package Archives include unsupported packages and are untrusted by the primary Ubuntu branch. At the time of publication, the WebUpd8 Oracle Java PPA is just an installer (meaning it does not include any Oracle Java binaries, but will download and install them). Use this PPA at your own risk.
sudo apt-get update
sudo apt-get install oracle-java8-installer
The material here is mostly from https://gist.github.com/erikbern/78ba519b97b440e10640 and https://github.com/tensorflow/models/tree/master/syntaxnet
sudo apt-get install
git clone https://github.com/bazelbuild/bazel.git
cd bazel
git checkout tags/0.2.2
./compile.sh
sudo cp output/bazel /usr/bin
sudo pip install -U protobuf==3.0.0b2
sudo pip install asciitree
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
git clone --recursive https://github.com/tensorflow/models.git
cd models/syntaxnet/tensorflow
./configure
cd ..
bazel test syntaxnet/... util/utf8/...
Compiling did not work for me at first because I was using the wrong version of Bazel.
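Since a mismatched Bazel version was the cause of the build failure noted above, it can help to check the installed version before starting a long build. This is a sketch only: the function name is hypothetical, and it assumes that the output of bazel version contains a "Build label:" line, which a build from source may report differently or omit.

```python
import re

REQUIRED_BAZEL = "0.2.2"  # the tag checked out earlier in this guide


def bazel_build_label(version_output):
    """Return the value of the 'Build label:' line, or None if it is absent."""
    match = re.search(r"^Build label:\s*(\S+)", version_output, re.MULTILINE)
    return match.group(1) if match else None


sample = "Build label: 0.2.2\nBuild target: (elided)\n"
print(bazel_build_label(sample) == REQUIRED_BAZEL)
```

To use it on the instance, pass it the captured text of bazel version; a result of None means the label could not be found rather than that the version is wrong.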
git clone --recurse-submodules https://github.com/tensorflow/tensorflow
cd tensorflow
TF_UNOFFICIAL_SETTING=1 ./configure
You will see prompts like:
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 7.5
...
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-*-none-linux_x86_64.whl
cd tensorflow/models/image/cifar10/
python cifar10_multi_gpu_train.py
sudo aptitude install vowpal-wabbit
# pandas and h5py can take a while to build
sudo pip install --upgrade --no-cache ipython scikit-learn sklearn-pandas nose nltk h5py
sudo pip install --upgrade --no-cache git+https://github.com/piskvorky/gensim.git
sudo pip install --upgrade --no-cache spacy
sudo pip install --upgrade --no-cache pyvw
sudo pip install --upgrade --no-cache git+https://github.com/cemoody/lda2vec.git#egg=lda2vec
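With this many packages installed from different sources, a quick import check catches a broken install early. The sketch below is an illustration, not part of the original notes: the function name is hypothetical, and the list uses the usual import names for the packages installed above (for example sklearn for scikit-learn).

```python
import importlib

# Import names for the packages installed in this guide.
MODULES = [
    "theano", "keras", "tensorflow", "sklearn",
    "gensim", "spacy", "lda2vec", "h5py", "nltk",
]


def failed_imports(names):
    """Try to import each module; return {name: error message} for failures."""
    failures = {}
    for name in names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report any import-time failure, not just ImportError
            failures[name] = "%s: %s" % (type(exc).__name__, exc)
    return failures
```

Calling failed_imports(MODULES) on the instance returns an empty dict when everything imports. Note that importing theano with device=gpu also initializes the GPU, so the first call can take a while.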
Skip-thoughts, from https://github.com/ryankiros/skip-thoughts:
git clone https://github.com/ryankiros/skip-thoughts
References:
- http://markus.com/install-theano-on-aws/
- http://tleyden.github.io/blog/2014/10/25/cuda-6-dot-5-on-aws-gpu-instance-running-ubuntu-14-dot-04/
- http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html