As configured in my dotfiles.
start new:
tmux
start new with session name:
# docker build -t ubuntu1604py36
FROM ubuntu:16.04
RUN apt-get update
RUN apt-get install -y software-properties-common vim
RUN add-apt-repository ppa:jonathonf/python-3.6
RUN apt-get update
RUN apt-get install -y build-essential python3.6 python3.6-dev python3-pip python3.6-venv
RUN apt-get install -y git
Cython has two major benefits:
Cython gains most of its benefit from statically typed arguments. Static typing is not required, however: regular Python code is valid Cython (but don't expect much of a speedup). Incrementally adding type information can make the code several times faster. This gist shows only very basic usage of Cython.
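To make the incremental-typing idea concrete, here is a minimal sketch. The function `integrate_f` and its integrand are hypothetical, chosen only for illustration; they are not taken from the gist. The untyped version runs as ordinary Python and would also compile unchanged as Cython. The typed variant is shown in comments because `cdef` declarations are Cython syntax, not valid Python.

```python
# Plain Python: this function is also valid Cython source as-is,
# but compiling it without type information gives little speedup.
def integrate_f(a, b, n):
    """Approximate the integral of x**2 - x on [a, b] with n rectangles."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * dx
        total += x * x - x
    return total * dx


# A statically typed variant, as it might look in a .pyx file
# (illustrative Cython syntax, kept in comments so this file stays valid Python):
#
#   def integrate_f(double a, double b, int n):
#       cdef int i
#       cdef double x
#       cdef double dx = (b - a) / n
#       cdef double total = 0.0
#       for i in range(n):
#           x = a + i * dx
#           total += x * x - x
#       return total * dx

print(integrate_f(0.0, 1.0, 1000))
```

Typing the loop counter and the accumulators is usually where the gain comes from, since it lets Cython turn the loop into plain C arithmetic instead of Python object operations.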
"""
Example TensorFlow script for finetuning a VGG model on your own data.
Uses tf.contrib.data module which is in release v1.2
Based on PyTorch example from Justin Johnson
(https://gist.github.com/jcjohnson/6e41e8512c17eae5da50aebef3378a4c)
Required packages: tensorflow (v1.2)
Download the weights trained on ImageNet for VGG:
```
wget http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz
A deploy key is an SSH key set on your repo to grant a client read-only (or read/write, if you want) access to that repo.
As the name suggests, its primary use is in the deploy process, where only read access is needed. This keeps the repo safe from attack if the server is compromised.
"""
A bare-bones example of optimizing a black-box function (f) using
Natural Evolution Strategies (NES), where the parameter distribution is a
gaussian of fixed standard deviation.
"""
import numpy as np
np.random.seed(0)
# the function we want to optimize
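The preview above stops before the objective and the update loop. As a rough sketch of the technique the docstring names (NES with a Gaussian of fixed standard deviation), the following uses only the standard library rather than numpy. The objective `f`, its `target`, and the hyperparameters `npop`, `sigma`, `alpha`, and `iters` are assumptions made for illustration, not the gist's own values.

```python
import random
import statistics


def f(w, target=(0.5, 0.1, -0.3)):
    """Hypothetical black-box objective: negative squared distance to target."""
    return -sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def nes(objective, w, npop=50, sigma=0.1, alpha=0.001, iters=300, seed=0):
    """Maximize objective with NES; sigma (the search std dev) stays fixed."""
    rng = random.Random(seed)
    w = list(w)
    for _ in range(iters):
        # Sample npop standard-normal perturbations of the current parameters.
        noise = [[rng.gauss(0.0, 1.0) for _ in w] for _ in range(npop)]
        rewards = [
            objective([wi + sigma * ni for wi, ni in zip(w, n)]) for n in noise
        ]
        # Standardize the rewards so the step size does not depend on their scale.
        mean = statistics.mean(rewards)
        std = statistics.pstdev(rewards)
        if std == 0.0:
            break  # all samples scored the same; no search direction to follow
        adv = [(r - mean) / std for r in rewards]
        # Move along the reward-weighted average of the perturbations.
        for k in range(len(w)):
            grad_k = sum(n[k] * a for n, a in zip(noise, adv)) / (npop * sigma)
            w[k] += alpha * grad_k
    return w


w_final = nes(f, [0.0, 0.0, 0.0])
print(w_final, f(w_final))
```

Only the mean of the search distribution is updated here. The objective is treated as a black box: it is only ever evaluated, never differentiated.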
# use ImageMagick convert
# the order is important: the density argument applies to input.pdf, while resize and rotate apply to output.pdf
convert -density 90 input.pdf -rotate 0.5 -attenuate 0.2 +noise Multiplicative -colorspace Gray output.pdf
#!/bin/bash
# This file sets the environment variable CUDA_VISIBLE_DEVICES to the MPI local rank to enable multi-GPU usage of this benchmark. Note that this disables any GPU distribution handling by the batch scheduler.
# Background: Most/some batch schedulers set CUDA_VISIBLE_DEVICES to all available GPUs on a node. In that case, the Arbor benchmark would only use the first entry in the list, probably GPU#0. This script changes that.
# -Andreas Herten, Nov 2018
_verbose=1
localrank=$CUDA_VISIBLE_DEVICES
if [[ -n "$OMPI_COMM_WORLD_NODE_RANK" ]]; then
#!/usr/bin/env bash
set -e
cd
case "$OSTYPE" in
darwin*) DOWNLOAD=https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh ;;
linux*) DOWNLOAD=https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh ;;
*) echo "unknown: $OSTYPE" ;;
esac
(Internal Training Material)
Usually the first step in performance optimization is profiling, e.g. to identify the performance hotspots of a workload. This gist covers the basics of performance profiling in PyTorch. You will get:
This tutorial uses one of my recent projects, pssp-transformer, as an example to guide you through the path of PyTorch CPU performance optimization. The focus is on Part 1 & Part 2.