- A Statistical View of Deep Learning
- What My Deep Model Doesn't Know...
- Bayesian Methods for Machine Learning
- Deep Learning Summer School 2015
- Bengio's Deep learning course notes
- deeplearning.net tutorial
- UFLDL Tutorial
- Reading lists for new MILA students
- [What's the best way to go about transitioning to a ML career? Is it even realistic for someone with my background?](https://www.reddit.com/r/MachineLearning/comments/3sknex/whats_the_bes
object main {
  val coins = Array(1, 5, 10, 25)
  val minCoin = coins.reduceLeft(_ min _)
  var table: Map[Int, Seq[Int]] = Map()
  // sentinel "infinitely long" answer, used for amounts that cannot be made
  val infinity: Seq[Int] = for (i <- 1 to 100000) yield i
  // memoized recursion: shortest sequence of coins summing to n
  def calculateMinCoins(n: Int): Seq[Int] =
    if (n == 0) Seq()
    else if (n < minCoin) infinity
    else table.getOrElse(n, {
      val best = coins.filter(_ <= n).map(c => c +: calculateMinCoins(n - c)).minBy(_.length)
      table += (n -> best)
      best
    })
}
- When the branch predictor is wrong and speculatively executes code from a branch that is not taken, it can pollute the caches, causing much worse performance than just the wasted fetch/decode/ALU cycles.
- Retiring: all instructions retire (commit) in program order, at a maximum rate of 2 per cycle.
- i.e. the visible side-effects of an instruction are committed in order, even if executed out of order.
- An L1 hit takes 3 cycles and an L2 hit takes 25 cycles, i.e. L2 is ~8x slower.
- Main memory takes ~200 cycles, i.e. around 66x slower than L1.
- Retire control unit (RCU) can only store 64 instructions.
- L2 miss + full RCU can be a recipe for disaster:
- An L2 miss will not retire for 200+ cycles while the frontend keeps fetching (almost) 2 instructions/cycle, so after ~32 cycles (~64 instructions) the RCU is full and the entire pipeline must stall: the CPU can no longer execute (out of order) the instructions after the memory op to hide that memory latency. A rough worked example follows this list.
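
A back-of-envelope version of that worst case, plugging in the numbers above (64-entry RCU, 2 instructions/cycle, ~200-cycle main-memory access) and assuming the frontend sustains its peak rate:

- 64 entries / 2 instructions per cycle ≈ 32 cycles until the RCU fills up behind the stalled load.
- ~200 - 32 ≈ ~170 further cycles during which nothing can issue or retire.
- ~170 cycles * 2 instructions/cycle ≈ ~340 instructions of potential throughput lost to that single miss.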
// spinlock.h
#include <thread>

class Mutex
{
public:
    Mutex();
    Mutex(Mutex const&) = delete;
Showcases some interesting and non-obvious optimizations that compilers can make on and around atomics. In particular, I liked this example. The following code:
int x = 0;
std::atomic<int> y;
int dso() {
x = 0;
int z = y.load(std::memory_order_seq_cst);
y.store(0, std::memory_order_seq_cst);
x = 1;
return z;
}
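
The point, presumably, is that the first `x = 0` is a dead store: another thread could only observe that intermediate value of the non-atomic `x` by racing with the later `x = 1`, which is undefined behaviour, so the compiler may delete the store even though seq_cst atomic operations sit between the two writes. A hypothetical sketch of the allowed transformation (the name `dso_optimized` and the exact shape are my assumptions, not the gist's):

```cpp
#include <atomic>

int x = 0;
std::atomic<int> y;

// Sketch of a transformation a conforming compiler may perform on dso():
// the dead store `x = 0` has been eliminated across the atomics.
int dso_optimized() {
    int z = y.load(std::memory_order_seq_cst);
    y.store(0, std::memory_order_seq_cst);
    x = 1;
    return z;
}
```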
// Example program
#include <iostream>
#include <string>
#include <vector>
#include <type_traits>
//---------------------
// Maybe and MyVector, two totally unrelated classes whose only commonality is that they are both type constructors of the same arity (i.e. 1) and order (i.e. 1).
//---------------------
template< typename T >
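
A rough sketch of where a setup like this typically goes (the member layouts, the `Functor` trait, and `fmap` below are assumptions for illustration, not the gist's code): the shared arity is exactly what lets both classes bind to a single `template <typename> class` parameter.

```cpp
#include <utility>
#include <vector>

template <typename T>
struct Maybe { bool has_value; T value; };

template <typename T>
struct MyVector { std::vector<T> data; };

// One trait specialization per unary type constructor; both Maybe and
// MyVector can be passed here only because they have the same arity (1)
// and order (1), so both match `template <typename> class F`.
template <template <typename> class F>
struct Functor;

template <>
struct Functor<Maybe> {
    template <typename T, typename Fn>
    static auto fmap(Maybe<T> const& m, Fn f) -> Maybe<decltype(f(m.value))> {
        using U = decltype(f(m.value));
        return m.has_value ? Maybe<U>{true, f(m.value)} : Maybe<U>{};
    }
};

template <>
struct Functor<MyVector> {
    template <typename T, typename Fn>
    static auto fmap(MyVector<T> const& v, Fn f)
        -> MyVector<decltype(f(std::declval<T const&>()))> {
        MyVector<decltype(f(std::declval<T const&>()))> out;
        for (auto const& x : v.data) out.data.push_back(f(x));
        return out;
    }
};
```

Usage would then look like `Functor<MyVector>::fmap(xs, [](int a) { return a + 1; })`, with the same call shape working for `Maybe`.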
http://courses.cms.caltech.edu/cs179/
http://www.amd.com/Documents/GCN_Architecture_whitepaper.pdf
https://community.arm.com/graphics/b/blog
http://cdn.imgtec.com/sdk-documentation/PowerVR+Hardware.Architecture+Overview+for+Developers.pdf
http://cdn.imgtec.com/sdk-documentation/PowerVR+Series5.Architecture+Guide+for+Developers.pdf
https://www.imgtec.com/blog/a-look-at-the-powervr-graphics-architecture-tile-based-rendering/
https://www.imgtec.com/blog/the-dr-in-tbdr-deferred-rendering-in-rogue/
http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk/opencl-optimization-guide/#50401334_pgfId-412605
https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/
https://community.arm.com/graphics/b/documents/posts/moving-mobile-graphics#siggraph2015
import numpy as np
from scipy.stats import norm
from math import log

N = 1000
true_loc = 10.0
true_stddev = 0.1
x_data = true_loc + (np.random.randn(N) * true_stddev)

def lognormalpdf(x, loc, scale):
    # total log-likelihood of the data under a Normal(loc, scale) model
    return np.sum(norm.logpdf(x, loc=loc, scale=scale))
import numpy as np
from skimage import filters

def optical_flow_lk(t0, t1, sigma):
    # setup the local linear systems of equations
    gradients = np.gradient(t0)
    dx, dy = gradients[1], gradients[0]
    dt = t1 - t0
    # Gaussian-weighted structure tensor and right-hand side per pixel
    A00 = filters.gaussian(dx * dx, sigma)
    A11 = filters.gaussian(dy * dy, sigma)
    A01 = filters.gaussian(dx * dy, sigma)
    b0 = -filters.gaussian(dx * dt, sigma)
    b1 = -filters.gaussian(dy * dt, sigma)
    # solve each pixel's 2x2 system via Cramer's rule (eps avoids /0 in flat regions)
    det = A00 * A11 - A01 * A01 + 1e-12
    u = (A11 * b0 - A01 * b1) / det
    v = (A00 * b1 - A01 * b0) / det
    return u, v
import numpy as np
from skimage import filters
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

def optical_flow_hs(t0, t1, alpha):
    h, w = t0.shape[:2]
    gradients = np.gradient(t0)
    dx, dy = gradients[1], gradients[0]
    dt = t1 - t0