Chyld Medford chyld

Whiteboarding + Coding

You have a 2d vector, w.
w has coordinates (3, 3)
Generate 100 additional 2d vectors with random components, v_1 to v_100
Project each v_i onto w, creating a new vector, x_i
Display the 100 new x_i vectors, sorted from smallest magnitude to largest
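
A minimal sketch of one way to work the vector exercise. It assumes NumPy, components drawn uniformly from [-10, 10] (the exercise does not name a distribution), and a fixed seed for repeatability. The projection of v onto w is (v . w / w . w) * w.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption, only for repeatability

w = np.array([3.0, 3.0])
# Assumption: components are uniform in [-10, 10]; the exercise does not say.
v = rng.uniform(-10, 10, size=(100, 2))

# Projection of each v_i onto w: x_i = (v_i . w / w . w) * w
scale = (v @ w) / (w @ w)        # shape (100,)
x = scale[:, np.newaxis] * w     # shape (100, 2)

# Sort the projections from smallest magnitude to largest.
magnitudes = np.linalg.norm(x, axis=1)
order = np.argsort(magnitudes)
x_sorted = x[order]

for vec, mag in zip(x_sorted, magnitudes[order]):
    print(vec, round(float(mag), 3))
```

Because w points along (1, 1), every x_i has two equal components, which is a quick sanity check on the output.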

Sally owns her own trucking company
Every time one of her drivers goes on a job, they log how many miles they drove and how many gallons of gas were consumed
Here is a list of the miles driven per job:

10., 20., 30., 40., 50., 60., 70., 80., 90., 100., 110.,

	https://www.youtube.com/watch?v=2nrgTossoB0
	https://open.spotify.com/playlist/61RNVG9yeQpFBRi8OAVC9I
	https://whoami.sh/thought/flow-playlist
	https://www.youtube.com/playlist?list=PLuqZTQ_5gsppMEJWlJoH327iOkKLtcNHX
	https://mynoise.net/
	https://open.spotify.com/album/20owuzVYJHoBDBFOVbw3Qj
	https://soundcloud.com/deep_electronic
	https://open.spotify.com/playlist/5KmEKavq5Ux0IxY2d5VfyI?si=0QLJavB8Rpmzwtms8BwI5g

	import numpy as np
	import pandas as pd

	def anomalyScores(originalDF, reducedDF):
	    # Per-row reconstruction error: sum of squared differences
	    loss = np.sum((np.array(originalDF)-np.array(reducedDF))**2, axis=1)
	    loss = pd.Series(data=loss,index=originalDF.index)
	    # Min-max scale the errors
	    loss = (loss-np.min(loss))/(np.max(loss)-np.min(loss))
	    # loss is between 0 and 1 ... 1 being highest reconstruction error
	    return loss
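
A usage sketch for anomalyScores, assuming reducedDF is a reconstruction of originalDF from a lower-dimensional representation. The data here is synthetic, and a rank-1 SVD reconstruction stands in for whatever dimensionality reduction the notes had in mind (PCA, an autoencoder, etc.). The function is repeated so the block runs on its own.

```python
import numpy as np
import pandas as pd

def anomalyScores(originalDF, reducedDF):
    loss = np.sum((np.array(originalDF) - np.array(reducedDF)) ** 2, axis=1)
    loss = pd.Series(data=loss, index=originalDF.index)
    return (loss - np.min(loss)) / (np.max(loss) - np.min(loss))

rng = np.random.default_rng(0)
# Synthetic data (assumption): 200 rows close to a line in 3D, plus one outlier.
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, 3.0]]) + 0.01 * rng.normal(size=(200, 3))
X[0] = [10.0, -10.0, 10.0]  # row 0 is placed far from the line
originalDF = pd.DataFrame(X, columns=['a', 'b', 'c'])

# Rank-1 reconstruction via SVD (a stand-in for the real reduction step).
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
X_hat = (U[:, :1] * S[:1]) @ Vt[:1] + mean
reducedDF = pd.DataFrame(X_hat, index=originalDF.index, columns=originalDF.columns)

scores = anomalyScores(originalDF, reducedDF)
print(scores.sort_values(ascending=False).head())
```

Rows that the low-rank reconstruction cannot reproduce well get scores near 1, so sorting in descending order puts the most anomalous rows first.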

	def plotResults(trueLabels, anomalyScores, returnPreds = False):
	    preds = pd.concat([trueLabels, anomalyScores], axis=1)
	    preds.columns = ['trueLabel', 'anomalyScore']

	TIME SERIES REVIEW
	------------------

	- Get data (values depend on time and are sampled at a consistent interval)
	- Plot (visualize)
	- Break down the dataset into components
	    - Trend (long-running average, or a 2nd- or 3rd-degree polynomial fit)
	    - Seasonal (regular repeating interval tied to the calendar: Christmas, New Year's, summer)
	    - Cyclic (based on a business cycle, not tied to a season, e.g. every 3 weeks)
	    - Seasonal and cyclic can get conflated
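
The decomposition steps above can be sketched with pandas alone. This is an illustration under assumptions, not a definitive method: the monthly series is synthetic (linear trend plus a yearly sine plus noise), the trend is a plain centered 12-month rolling mean, and the cyclic component is not modeled.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range('2015-01-01', periods=72, freq='MS')  # 6 years, monthly
months = np.arange(72)
# Synthetic series (assumption): linear trend + yearly seasonality + noise.
y = pd.Series(0.5 * months + 10 * np.sin(2 * np.pi * months / 12)
              + rng.normal(scale=1.0, size=72), index=idx)

# Trend: centered 12-month rolling mean (a "long-running average").
trend = y.rolling(window=12, center=True).mean()

# Seasonal: average detrended value for each calendar month.
detrended = y - trend
seasonal = detrended.groupby(detrended.index.month).transform('mean')

# Residual: what is left after removing trend and seasonality.
residual = y - trend - seasonal

print(pd.DataFrame({'y': y, 'trend': trend, 'seasonal': seasonal,
                    'residual': residual}).dropna().head())
```

The rolling mean leaves NaN at both ends of the series, which is why the printout drops those rows.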

	data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
	        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
	        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
	        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}

	labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
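
A short sketch of how the data and labels above might be used, assuming the intent is to build a pandas DataFrame indexed by labels. The groupby and missing-value lines are illustrative additions, not part of the original notes.

```python
import numpy as np
import pandas as pd

data = {'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
        'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
        'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# Use the labels as the row index.
df = pd.DataFrame(data, index=labels)
print(df.head())

# Mean age per animal; NaN ages are skipped by default.
print(df.groupby('animal')['age'].mean())

# Number of rows with a missing age.
print(df['age'].isna().sum())
```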

	import tensorflow as tf

	print(tf.__version__)

	# device_type=None lists all physical devices; pass 'GPU' or 'CPU' to filter
	tf.config.list_physical_devices(device_type=None)

	- List all the models we have learned so far in class
	    - linear regression
	    - ridge
	    - lasso
	    - knn (classifier / regressor)
	    - logistic regression
	    - decision tree (classifier / regressor)
	    - random forest (classifier / regressor)

	-------------------------------------------------------------------------------------------------------

	- What are bias and variance in your model?
	    - Too much bias (underfitting): add features
	    - Too much variance (overfitting): remove features, add data, or regularize with ridge or lasso
	    - Ridge (L2): shrinks coefficients, reducing the impact of each feature
	    - Lasso (L1): can shrink coefficients to exactly zero, which acts as feature selection
	- Cross validation: use held-out data to validate the model
	    - KFold, train_test_split
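
The regularization and validation points above can be illustrated with scikit-learn. This is a sketch on synthetic data: the 10-feature dataset (only the first two features matter), the alpha values, and the seeds are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score, train_test_split

rng = np.random.default_rng(0)
# Synthetic data (assumption): 10 features, only the first two drive y.
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Hold out data the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

ridge = Ridge(alpha=1.0).fit(X_train, y_train)   # L2 penalty: shrinks coefficients
lasso = Lasso(alpha=0.1).fit(X_train, y_train)   # L1 penalty: can zero coefficients

print('ridge coefs:', np.round(ridge.coef_, 3))
print('lasso coefs:', np.round(lasso.coef_, 3))
print('ridge test R^2:', ridge.score(X_test, y_test))
print('lasso test R^2:', lasso.score(X_test, y_test))

# 5-fold cross validation: each fold takes a turn as the validation set.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=kfold)
print('5-fold mean R^2:', scores.mean())
```

Comparing the two coefficient printouts shows the difference in behavior: ridge keeps every feature with a small weight, while lasso tends to drop the uninformative ones entirely.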