@byelipk
byelipk / time-complexity.md
Created June 5, 2017 14:23
Common time complexity operations

O(1)

This denotes constant time. Running a statement like if (true) {...} takes constant time. Other examples are looking up a value by key in an object or hash table, or by index in an array.
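As a quick sketch (the data here is made up for illustration), key and index lookups take the same amount of work no matter how many items the collection holds:

```python
# O(1) lookups: the cost does not grow with the size of the collection.
prices = {"apple": 1.25, "banana": 0.50}   # hash table lookup by key
print(prices["banana"])                     # -> 0.5

items = [10, 20, 30]                        # array lookup by index
print(items[2])                             # -> 30
```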

O(log n)

This denotes logarithmic time. Algorithms that halve the problem size at each step, such as binary search, run in O(log n) time. Many divide-and-conquer algorithms also carry a logarithmic factor in their complexity.
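Binary search is the classic O(log n) example: a minimal sketch, discarding half of a sorted list on every comparison.

```python
# O(log n): the search interval halves each iteration, so a sorted list
# of n items needs at most about log2(n) comparisons.
def binary_search(sorted_list, target):
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid          # found: return the index
        elif sorted_list[mid] < target:
            lo = mid + 1        # discard the lower half
        else:
            hi = mid - 1        # discard the upper half
    return -1                   # target not present

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # -> 3
```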

O(n)

@byelipk
byelipk / timer.rb
Created June 3, 2017 13:57
Simple timer app in Ruby
def time_diff(start_time, end_time)
  seconds_diff = (start_time - end_time).to_i.abs
  hours = seconds_diff / 3600
  seconds_diff -= hours * 3600
  minutes = seconds_diff / 60
  seconds_diff -= minutes * 60
  seconds = seconds_diff
  format("%02d:%02d:%02d", hours, minutes, seconds)
end
@byelipk
byelipk / download_m3u8.rb
Created May 30, 2017 17:18
A utility written in Ruby to download video from M3U8 files. Requires ffmpeg and the Clipboard gem.
# A utility written in Ruby to download video from M3U8 files.
#
# From Wikipedia:
#
# An M3U file is a plain text file that specifies the locations of one or more
# media files. The file is saved with the "m3u" filename extension if the text
# is encoded in the local system's default non-Unicode encoding (e.g., a
# Windows codepage), or with the "m3u8" extension if the text is UTF-8 encoded.
require 'optparse'
@byelipk
byelipk / take_n.py
Created April 1, 2017 16:15
Return index position of first N items that match a test condition.
from collections import Counter
def take_n(data, n, test_condition=lambda x: True):
"""
Return index position of first N items that match a test condition.
Parameters
==========
:data: An enumerable, such as a list.
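The preview above is cut off mid-docstring. A minimal completion might look like this; the loop body is my own sketch of the described behavior, not the author's original implementation:

```python
def take_n(data, n, test_condition=lambda x: True):
    """Return index positions of the first n items matching test_condition."""
    matches = []
    for index, item in enumerate(data):
        if test_condition(item):
            matches.append(index)
        if len(matches) == n:
            break
    return matches

print(take_n([1, 4, 2, 8, 6], 2, lambda x: x % 2 == 0))  # -> [1, 2]
```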
@byelipk
byelipk / fetch_batch.py
Created March 22, 2017 15:23
Helpful for mini-batch optimizers
import numpy as np
def fetch_batch(X, y, epoch, n_batches, batch_index, batch_size):
"""
A generic function that returns the next batch of data to train on.
Parameters
==========
:X: The training examples
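The preview is truncated here. A plausible completion, assuming the common pattern of sampling a random batch with a seed derived from the epoch and batch index (the body is my guess, not the author's code):

```python
import numpy as np

def fetch_batch(X, y, epoch, n_batches, batch_index, batch_size):
    """Return a random mini-batch of (X, y) rows.

    Seeding with epoch * n_batches + batch_index makes each batch
    reproducible while still varying across batches and epochs.
    """
    rng = np.random.RandomState(epoch * n_batches + batch_index)
    indices = rng.randint(len(X), size=batch_size)
    return X[indices], y[indices]

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_batch, y_batch = fetch_batch(X, y, epoch=0, n_batches=5,
                               batch_index=0, batch_size=4)
print(X_batch.shape)  # -> (4, 2)
```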

Exercises

Is it okay to initialize all the weights to the same value as long as that value is selected randomly using He initialization?

No. All weights should be initialized to different random values. If the weights are symmetrical, meaning they share the same value, every neuron in a layer computes the same output and receives the same gradient update, so backpropagation can never differentiate them and will struggle to converge to a good solution.

Think of it this way: if all the weights are the same, it's like having just one neuron per layer, but much slower.

The technique we use to break this symmetry is to sample weights randomly.
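The symmetry problem can be seen numerically. This is an illustrative sketch (my own toy network, not from the text): with every hidden weight set to the same value, one backward pass gives every hidden unit an identical gradient, so the units stay interchangeable forever.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])          # one training example, 2 features
W1 = np.full((2, 3), 0.1)          # symmetric init: all weights equal
W2 = np.full((3, 1), 0.1)
y = 1.0

h = sigmoid(x @ W1)                # hidden activations: all identical
y_hat = float(sigmoid(h @ W2))

# Backprop for squared error 0.5 * (y_hat - y)**2
d_out = (y_hat - y) * y_hat * (1 - y_hat)
dW1 = np.outer(x, (d_out * W2.ravel()) * h * (1 - h))

print(h)      # three identical activations
print(dW1)    # every column (hidden unit) gets the same gradient
```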

Exercises

1. Draw an ANN using the original artificial neurons that compute the XOR operation.

TODO: Upload photo of XOR network

2. Why is it generally preferable to use a Logistic Regression classifier rather than a classical Perceptron (i.e. a single layer of Linear Threshold Units trained using the Perceptron training algorithm)? How can you tweak a Perceptron to make it equivalent to a Logistic Regression classifier?

A classical Perceptron will only converge if the data is linearly separable, and it cannot output class probabilities. A Logistic Regression classifier converges even when the data is not linearly separable, and it outputs class probabilities. To make a Perceptron equivalent to a Logistic Regression classifier, replace its step activation function with the logistic (sigmoid) function and train it with gradient descent.
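A quick sketch of that tweak (the weights and input here are made-up numbers): the only change is swapping the hard step activation for the logistic function.

```python
import numpy as np

def step(z):
    return np.where(z >= 0, 1, 0)        # Perceptron: hard 0/1 decision

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))      # Logistic Regression: probability

w = np.array([2.0, -1.0])
b = -0.5
x = np.array([0.3, 0.2])
z = w @ x + b                            # z = 2*0.3 - 1*0.2 - 0.5 = -0.1

print(step(z))      # -> 0      (class label only)
print(sigmoid(z))   # ~ 0.475   (probability of the positive class)
```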

@byelipk
byelipk / 9-ml-exercises.md
Last active April 22, 2022 05:18
Machine learning questions and answers

Exercises

1. What are the main benefits of creating a computation graph rather than directly executing the computations? What are the main drawbacks?

Deep Learning frameworks that generate computation graphs, like TensorFlow, have several things going for them.

For starters, computation graphs will compute the gradients automatically. This saves you from having to do lots of tedious calculus by hand.
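A toy illustration of why a graph gives you gradients for free (all names here are my own, not TensorFlow's API): each node records how it was built, so the chain rule can be applied backwards mechanically.

```python
# Minimal reverse-mode autodiff over a tiny computation graph.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        self.grad += seed                      # accumulate via chain rule
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x        # dz/dx = y + 1 = 5, dz/dy = x = 3
z.backward()
print(x.grad, y.grad)  # -> 5.0 3.0
```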

Another huge plus is that they are optimized to run on your computer's GPU. Without that support, you would need to learn CUDA or OpenCL and write lots of C++ by hand, which is not an easy thing to do.

@byelipk
byelipk / 2-ml-exercises.md
Last active October 15, 2024 20:25
Machine learning questions and answers

Exercises

1. What Linear Regression training algorithm can you use if you have a training set with millions of features?

You could use Batch Gradient Descent, Stochastic Gradient Descent, or Mini-batch Gradient Descent. SGD and Mini-batch GD would work best because neither needs to load the entire dataset into memory to take one step of gradient descent. Batch GD would also be fine, with the caveat that you have enough memory to load all the data.
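A minimal sketch of mini-batch gradient descent for linear regression (the data is synthetic and the hyperparameters are arbitrary): only `batch_size` rows are touched per step, which is why memory use stays small even on huge datasets.

```python
import numpy as np

rng = np.random.default_rng(42)
m, n = 200, 3
X = rng.normal(size=(m, n))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + rng.normal(scale=0.01, size=m)

theta = np.zeros(n)
lr, batch_size = 0.1, 20
for epoch in range(50):
    perm = rng.permutation(m)          # shuffle each epoch
    for start in range(0, m, batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # MSE gradient computed on the mini-batch only
        gradient = (2 / len(idx)) * Xb.T @ (Xb @ theta - yb)
        theta -= lr * gradient

print(np.round(theta, 2))  # close to [1.0, -2.0, 0.5]
```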

The Normal Equation method would not be a good choice because it is computationally inefficient. The main cost comes from inverting an (n x n) matrix, which takes roughly O(n^2.4) to O(n^3) time depending on the implementation.

2. Suppose the features in your training set have very different scales: what algorithms might suffer from this, and how? What can you do about it?