# Q1
#
# Suppose we compute PageRank with a β of 0.7, and we introduce the additional constraint that the sum of the PageRanks of the three pages must be 3, to handle the problem that otherwise any multiple of a solution will also be a solution. Compute the PageRanks a, b, and c of the three pages A, B, and C, respectively. Then identify the true statement from the list below.
#
# Matrix
#
#     A    B    C
# A   0    0    0
# B   0.5  0    0
# C   0.5  1    1
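One way to check the answer is to solve the taxed PageRank equation directly. The sketch below assumes the matrix is column-stochastic (each column is a source page) and that the teleport share (1 - β) is spread evenly over the three pages, so the ranks sum to 3. The names `beta`, `M`, and `r` are illustrative and do not come from the original file.

```r
beta <- 0.7

# Transition matrix from the question; column j holds the out-links of page j.
M <- matrix(c(0,   0, 0,
              0.5, 0, 0,
              0.5, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c('A', 'B', 'C'), c('A', 'B', 'C')))
n <- nrow(M)

# r = beta * M r + (1 - beta) * 1  is equivalent to  (I - beta * M) r = (1 - beta) * 1.
# With a column-stochastic M this gives sum(r) = n = 3.
r <- solve(diag(n) - beta * M, rep(1 - beta, n))
names(r) <- c('a', 'b', 'c')
print(r)
```

Under these assumptions the solution is a = 0.3, b = 0.405, c = 2.295, which can then be checked against each candidate statement.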
#
# Quiz 2a
#
#
# Q1
# The edit distance is the minimum number of character insertions and character deletions required to turn one string into another. Compute the edit distance between each pair of the strings he, she, his, and hers. Then, identify which of the following is a true statement about the number of pairs at a certain edit distance.
#
packages <- c('combinat', 'stringdist') |
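Because this edit distance allows only insertions and deletions, it equals len(s) + len(t) - 2 * LCS(s, t), where LCS is the length of the longest common subsequence. The following is a minimal base-R sketch of that identity; the helper names `lcs_length` and `ins_del_distance` are hypothetical, and it is not necessarily how the original script used the packages listed above.

```r
# Length of the longest common subsequence, by dynamic programming.
lcs_length <- function(s, t) {
  a <- strsplit(s, '')[[1]]
  b <- strsplit(t, '')[[1]]
  tab <- matrix(0, nrow = length(a) + 1, ncol = length(b) + 1)
  for (i in seq_along(a)) {
    for (j in seq_along(b)) {
      if (a[i] == b[j]) {
        tab[i + 1, j + 1] <- tab[i, j] + 1
      } else {
        tab[i + 1, j + 1] <- max(tab[i, j + 1], tab[i + 1, j])
      }
    }
  }
  tab[length(a) + 1, length(b) + 1]
}

# Insert/delete-only edit distance.
ins_del_distance <- function(s, t) {
  nchar(s) + nchar(t) - 2 * lcs_length(s, t)
}

words <- c('he', 'she', 'his', 'hers')
pairs <- combn(words, 2)
d <- apply(pairs, 2, function(p) ins_del_distance(p[1], p[2]))
names(d) <- apply(pairs, 2, paste, collapse = '-')
print(d)
print(table(d))  # number of pairs at each distance
```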
Q1
  C -- D -- E
 /|    |    |\
A |    |    | B
 \|    |    |/
  F -- G -- H
Write the adjacency matrix A, the degree matrix D, and the Laplacian matrix L. For each, find the sum of all entries and the number of nonzero entries. Then identify the true statement from the list below. |
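The three matrices can be built from an edge list and the requested totals read off directly. This sketch assumes the 11 undirected edges as read from the diagram (A-C, A-F, C-D, D-E, F-G, G-H, C-F, D-G, E-H, B-E, B-H); if the diagram is interpreted differently, the edge list needs to change accordingly.

```r
nodes <- c('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H')

# Undirected edges, read off the diagram (an assumption of this sketch).
edges <- rbind(c('A', 'C'), c('A', 'F'), c('C', 'D'), c('D', 'E'),
               c('F', 'G'), c('G', 'H'), c('C', 'F'), c('D', 'G'),
               c('E', 'H'), c('B', 'E'), c('B', 'H'))

# Adjacency matrix: set both (u, v) and (v, u) for every edge.
A <- matrix(0, nrow = length(nodes), ncol = length(nodes),
            dimnames = list(nodes, nodes))
A[edges] <- 1
A[edges[, 2:1]] <- 1

D <- diag(rowSums(A))  # degree matrix
L <- D - A             # Laplacian matrix

summary_table <- rbind(
  A = c(sum = sum(A), nonzero = sum(A != 0)),
  D = c(sum = sum(D), nonzero = sum(D != 0)),
  L = c(sum = sum(L), nonzero = sum(L != 0))
)
print(summary_table)
```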
# Q1
# Here is a table of 1-5 star ratings for five movies (M, N, P, Q, R) by three raters (A, B, C).
#   M N P Q R
# A 1 2 3 4 5
# B 2 3 2 5 3
# C 5 5 5 3 2
# Normalize the ratings by subtracting the average for each row and then subtracting the average for each column in the resulting table. Then, identify the true statement about the normalized table.
# First, set up the data.
ratings <- data.frame(M = c(1, 2, 5), N = c(2, 3, 5), P = c(3, 2, 5), Q = c(4, 5, 3), R = c(5, 3, 2)) |
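The two normalization steps map onto two calls to `sweep`, whose default operation is subtraction. This is a sketch of one way to do it; the names `m`, `row_centered`, and `normalized` are illustrative, and the data frame is repeated so the block runs on its own.

```r
ratings <- data.frame(M = c(1, 2, 5), N = c(2, 3, 5), P = c(3, 2, 5),
                      Q = c(4, 5, 3), R = c(5, 3, 2))
rownames(ratings) <- c('A', 'B', 'C')
m <- as.matrix(ratings)

# Step 1: subtract each rater's (row) average.
row_centered <- sweep(m, 1, rowMeans(m))

# Step 2: subtract each movie's (column) average of the row-centered table.
normalized <- sweep(row_centered, 2, colMeans(row_centered))
print(round(normalized, 3))
```

After both steps every row and every column of `normalized` averages to zero, which is a quick sanity check on the result.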
#
# MMDS Programming Assignment 2: Finding Similar Sentences
# This assignment is an optional challenge and it won't count in your final grade.
# Your task is to quickly find the number of pairs of sentences that are at word-level edit distance at most 1. Two sentences S1 and S2 are at edit distance 1 if S1 can be transformed into S2 by adding, removing, or substituting a single word.
# For example, consider the following sentences where each letter represents a word:
# S1: A B C D
# S2: A B X D
# S3: A B C
# S4: A B X C
# Then the following pairs of sentences are at word edit distance 1 or less: (S1, S2), (S1, S3), (S2, S4), (S3, S4).
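The pairwise test itself can be sketched as below and checked against the four example sentences. This is a brute-force illustration only: comparing all pairs is quadratic and would not meet the "quickly" requirement on a large input, where some bucketing or hashing scheme would be needed first. The function name `within_one` is hypothetical.

```r
# TRUE if two space-separated sentences are at word-level edit distance <= 1.
within_one <- function(s1, s2) {
  a <- strsplit(s1, ' ', fixed = TRUE)[[1]]
  b <- strsplit(s2, ' ', fixed = TRUE)[[1]]
  if (length(a) < length(b)) {  # make a the longer sentence
    tmp <- a
    a <- b
    b <- tmp
  }
  if (length(a) - length(b) > 1) return(FALSE)
  if (length(a) == length(b)) return(sum(a != b) <= 1)  # at most one substitution
  # Lengths differ by one: drop the first mismatching word from the longer one.
  i <- which(a[seq_along(b)] != b)[1]
  if (is.na(i)) return(TRUE)  # b is a prefix of a
  identical(a[-i], b)
}

sentences <- c(S1 = 'A B C D', S2 = 'A B X D', S3 = 'A B C', S4 = 'A B X C')
pairs <- combn(names(sentences), 2)
close <- apply(pairs, 2, function(p) within_one(sentences[[p[1]]], sentences[[p[2]]]))
hits <- apply(pairs[, close, drop = FALSE], 2, paste, collapse = '-')
print(hits)
```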
#
# Mining Massive Datasets Quiz 5B
#
# Q1
# We wish to cluster the following set of points into 10 clusters. We initially choose each of the green points (25,125), (44,105), (29,97), (35,63), (55,63), (42,57), (23,40), (64,37), (33,22), and (55,20) as a centroid. Assign each of the gold points to their nearest centroid. (Note: the scales of the horizontal and vertical axes differ, so you really need to apply the formula for distance of points; you can't just "eyeball" it.) Then, recompute the centroids of each of the clusters. Do any of the points then get reassigned to a new cluster on the next round? Identify the true statement in the list below. Each statement refers either to a centroid AFTER recomputation of centroids (precise to one decimal place) or to a point that gets reclassified.
# Set up the data.
centroids <- t(data.frame(c(25,125), c(44,105), c(29,97), c(35,63), c(55,63), c(42,57), c(23,40), c(64,37), c(33,22), c(55,20)))
points <- t(data.frame(c(28,145), c(65,140), c(50,130), c(55,118), c(38,115 |
n choose k
n! / (k! * (n - k)!)
How many pairs in: x,y,a,b,c,d,e,f
8C2 = 8! / (2! * (8 - 2)!) = 40320 / (2 * 6!) = 40320 / (2 * 720) = 40320 / 1440 = 28
xy,xa,xb,xc,xd,xe,xf
ya,yb,yc,yd,ye,yf
ab,ac,ad,ae,af
bc,bd,be,bf |
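The count can be confirmed in R with `choose`, and the pairs themselves enumerated with `combn`. A small sketch (the variable names are illustrative):

```r
items <- c('x', 'y', 'a', 'b', 'c', 'd', 'e', 'f')

# Number of unordered pairs: n! / (k! * (n - k)!) with n = 8, k = 2.
n_pairs <- choose(length(items), 2)
by_formula <- factorial(8) / (factorial(2) * factorial(8 - 2))

# Enumerate the pairs; each column of combn() is one pair.
pair_labels <- apply(combn(items, 2), 2, paste, collapse = '')
print(n_pairs)
print(pair_labels)
```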
# MMDS Week 6A
# Q1
# Using the matrix-vector multiplication described in Section 2.3.1, applied to the matrix and vector:
#  1  2  3  4
#  5  6  7  8
#  9 10 11 12
# 13 14 15 16
#
# 1
# 2 |
#
# Q1
# Suppose we have an LSH family h of (d1,d2,.6,.4) hash functions. We can use three functions from h and the AND-construction to form a (d1,d2,w,x) family, and we can use two functions from h and the OR-construction to form a (d1,d2,y,z) family. Calculate w, x, y, and z, and then identify the correct value of one of these in the list below.
#
val1 <- .6
val2 <- .4
# AND construction
w <- val1 ^ 3
x <- val2 ^ 3 |
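For reference, both constructions can be written out together. An AND of r functions turns a probability p into p^r, and an OR of b functions turns it into 1 - (1 - p)^b. The sketch below repeats the AND step so it runs on its own and adds the OR step for y and z as an assumed continuation; it is not taken from the original file.

```r
val1 <- 0.6
val2 <- 0.4

# AND-construction with three functions: p -> p^3.
w <- val1 ^ 3
x <- val2 ^ 3

# OR-construction with two functions: p -> 1 - (1 - p)^2.
y <- 1 - (1 - val1) ^ 2
z <- 1 - (1 - val2) ^ 2

print(round(c(w = w, x = x, y = y, z = z), 3))
```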
#
# Converts text into a friendly url.
# Examples:
# friendlyUrl('She sells seashells by the seashore.') -> "she-sells-seashells-by-the-seashore"
# friendlyUrl('Learn To Program: 1,000 "Languages" in 2015.') -> "learn-to-program-1-000-languages-in-2015"
#
friendlyUrl <- function(text, sep = '-', max = 80) {
  # Replace non-alphanumeric characters.
  url <- gsub('[^A-Za-z0-9]', sep, text)