Chyld Medford chyld

version: "3.8"
services:
  db:
    image: mongo
  app:
    image: python
    command: tail -f /dev/null

Whiteboarding + Coding

Linear Algebra

  • You have a 2d vector, w.
  • w has coordinates (3, 3)
  • Generate 100 additional 2d vectors with random components, v_1 to v_100
  • Project each v_i onto w, creating a new vector, x_i
  • Display these 100 new x_i vectors, sorted from smallest magnitude to largest
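
One possible way to work through the steps above is sketched here with NumPy. The random seed and the sampling range of [-10, 10) are assumptions made for illustration; the problem statement does not say how the random components should be drawn. The projection uses the standard formula proj_w(v) = ((v · w) / (w · w)) w.

```python
import numpy as np

# Seed and sampling range are assumptions, chosen only for reproducibility.
rng = np.random.default_rng(seed=0)

w = np.array([3.0, 3.0])

# 100 random 2d vectors v_1 .. v_100, one per row.
V = rng.uniform(-10.0, 10.0, size=(100, 2))

# Projection of each v_i onto w: x_i = ((v_i . w) / (w . w)) * w
scalars = (V @ w) / (w @ w)      # shape (100,)
X = scalars[:, None] * w         # shape (100, 2)

# Sort the projected vectors from smallest magnitude to largest.
magnitudes = np.linalg.norm(X, axis=1)
order = np.argsort(magnitudes)
X_sorted = X[order]

for x, m in zip(X_sorted, magnitudes[order]):
    print(f"x = ({x[0]: .3f}, {x[1]: .3f})  |x| = {m:.3f}")
```

Because w = (3, 3), every x_i lies on the line y = x, so both components of each projected vector are equal.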

This is a Live Coding Problem

Problem Statement

  • Sally owns her own trucking company
  • Every time one of her drivers goes on a job, they log how many miles they drove and how many gallons of gas were consumed
  • Here is a list of the miles driven per job
10., 20., 30., 40., 50., 60., 70., 80., 90., 100., 110.,
import tensorflow as tf

print(tf.__version__)
# List the devices TensorFlow can see; device_type=None applies no filter
tf.config.list_physical_devices(
    device_type=None
)
https://www.youtube.com/watch?v=2nrgTossoB0
https://open.spotify.com/playlist/61RNVG9yeQpFBRi8OAVC9I
https://whoami.sh/thought/flow-playlist
https://www.youtube.com/playlist?list=PLuqZTQ_5gsppMEJWlJoH327iOkKLtcNHX
https://mynoise.net/
https://open.spotify.com/album/20owuzVYJHoBDBFOVbw3Qj
https://soundcloud.com/deep_electronic
https://open.spotify.com/playlist/5KmEKavq5Ux0IxY2d5VfyI?si=0QLJavB8Rpmzwtms8BwI5g
import numpy as np
import pandas as pd

def anomalyScores(originalDF, reducedDF):
    # Per-row reconstruction error: sum of squared differences across features
    loss = np.sum((np.array(originalDF) - np.array(reducedDF))**2, axis=1)
    loss = pd.Series(data=loss, index=originalDF.index)
    # Min-max scale: loss is between 0 and 1 ... 1 being highest reconstruction error
    loss = (loss - np.min(loss)) / (np.max(loss) - np.min(loss))
    return loss

def plotResults(trueLabels, anomalyScores, returnPreds = False):
    preds = pd.concat([trueLabels, anomalyScores], axis=1)
    preds.columns = ['trueLabel', 'anomalyScore']
TIME SERIES REVIEW
------------------
- Get Data (values depend on time, sampled at a consistent interval)
- Plot (visualize)
- Break down dataset into components
  - Trend (long-running average, or a 2nd- or 3rd-degree polynomial)
  - Seasonal (regular repeating interval based on the normal calendar: Christmas, New Year's, summer)
  - Cyclic (based on a business cycle, not tied to a season, e.g. every 3 weeks)
  - Seasonal and Cyclic can get conflated
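
The decomposition step in the review above can be sketched with plain NumPy. Everything about the data here is an assumption made for illustration: the series is synthetic (quadratic trend plus a 12-step seasonal pattern plus noise), and the period of 12 is taken as known. The trend is estimated with a 2nd-degree polynomial fit, one of the two options the notes mention, and the seasonal component is the average detrended value at each position in the period.

```python
import numpy as np

# Synthetic series (an assumption, not data from the notes).
rng = np.random.default_rng(0)
n, period = 120, 12
t = np.arange(n)
series = (0.002 * t**2 + 0.1 * t
          + 3.0 * np.sin(2 * np.pi * t / period)
          + rng.normal(0.0, 0.5, n))

# Trend: 2nd-degree polynomial fitted to the whole series.
coeffs = np.polyfit(t, series, deg=2)
trend = np.polyval(coeffs, t)

# Seasonal: mean detrended value at each position within the period.
detrended = series - trend
seasonal_profile = np.array([detrended[i::period].mean() for i in range(period)])
seasonal = np.tile(seasonal_profile, n // period)  # n is a multiple of period

# Residual: whatever trend and seasonality do not explain.
residual = series - trend - seasonal

print("std of detrended series:", detrended.std())
print("std of residual:        ", residual.std())
```

This is an additive decomposition (series = trend + seasonal + residual). A cyclic component with an irregular period would not be captured by the fixed-period averaging used here, which is one reason the two can get conflated.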
import numpy as np

data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
- list all the models we have learned so far in class
  - linear regression
  - ridge
  - lasso
  - knn (classifier / regressor)
  - logistic regression
  - decision tree (classifier / regressor)
  - random forest (classifier / regressor)
-------------------------------------------------------------------------------------------------------
- what is bias and variance in your model
  - too much bias (underfitting) - add features
  - too much variance (overfitting) - remove features, add data, ridge or lasso (regularization)
    - ridge (L2) - shrinks coefficients, reducing the impact of your features
    - lasso (L1) - can shrink coefficients to exactly zero, which acts as feature selection
- cross validation - validate the model on data it was not trained on
  - kfold, train_test_split
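
To make the regularization and cross-validation points above concrete, here is a minimal NumPy-only sketch of ridge regression scored with k-fold cross validation. The helper names `ridge_fit` and `kfold_mse`, the synthetic data, and the alpha values are all hypothetical choices for illustration. In class the same idea would more likely be run through scikit-learn's `Ridge` and `KFold`.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form ridge: solve (X^T X + alpha * I) coef = X^T y (no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def kfold_mse(X, y, alpha, k=5, seed=0):
    """Mean validation MSE over k folds; each fold is held out exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = ridge_fit(X[train], y[train], alpha)
        errors.append(np.mean((X[val] @ coef - y[val]) ** 2))
    return float(np.mean(errors))

# Synthetic high-variance setup (an assumption): 40 rows, 30 features, only 3 matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 30))
true_coef = np.zeros(30)
true_coef[:3] = [2.0, -1.0, 0.5]
y = X @ true_coef + rng.normal(scale=1.0, size=40)

for alpha in [0.0, 1.0, 10.0, 100.0]:
    coef = ridge_fit(X, y, alpha)
    print(f"alpha={alpha:6.1f}  ||coef||={np.linalg.norm(coef):.3f}  "
          f"cv_mse={kfold_mse(X, y, alpha):.3f}")
```

Larger alpha always shrinks the coefficient norm. Whether it also lowers the cross-validated error depends on the data, which is why alpha is usually chosen by comparing the validation scores.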