# Mining Massive Datasets Quiz 1
# Q1
#
# Suppose we compute PageRank with a β of 0.7, and we introduce the additional constraint that the sum of the PageRanks of the three pages must be 3, to handle the problem that otherwise any multiple of a solution will also be a solution. Compute the PageRanks a, b, and c of the three pages A, B, and C, respectively. Then, identify from the list below, the true statement.
#
# Matrix
#
#     A    B    C
# A   0    0    0
# B   0.5  0    0
# C   0.5  1    1
#
# Read the matrix by columns: each column holds the transition probabilities out of one node.
# For node A, the user has 0 probability of moving to A, 0.5 of moving to B, 0.5 of moving to C.
#
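# beta is the taxation parameter; it gets its own name so it is not confused with the rank b below.
beta = 0.7
# matrix() fills column by column, so each group of three values is one column of M.
M = matrix(c(0, 0.5, 0.5, 0, 0, 1, 0, 0, 1), ncol=3)
e = matrix(c(1, 1, 1), ncol=1)
# Start from the uniform distribution (ranks sum to 1).
v1 = matrix(c(1, 1, 1), ncol=1)
v1 = v1 / 3
for (i in 1:5) {
  v1 = ((beta * M) %*% v1) + (((1 - beta) * e) / 3)
}
# Rescale so that the ranks sum to 3, as the question requires.
v1 = v1 * 3
# Find values for a, b, c to match equations in possible answers.
a <- v1[1]
b <- v1[2]
c <- v1[3]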
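#
# Cross-check (a minimal sketch, not part of the original solution): assuming the
# iteration above settles at the fixed point of v = beta * M %*% v + (1 - beta) * e / 3,
# the same ranks can be obtained by solving (I - beta * M) v = (1 - beta) * e / 3 directly.
# The names beta_q1, M_q1 and v_exact are illustrative only.
beta_q1 <- 0.7
M_q1 <- matrix(c(0, 0.5, 0.5, 0, 0, 1, 0, 0, 1), ncol = 3)
# solve(A, rhs) returns the vector v with A %*% v = rhs; multiply by 3 so the ranks sum to 3.
v_exact <- solve(diag(3) - beta_q1 * M_q1, rep((1 - beta_q1) / 3, 3)) * 3
print(round(v_exact, 3))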
# Q2
#
# Suppose we compute PageRank with β=0.85. Write the equations for the PageRanks a, b, and c of the three pages A, B, and C, respectively. Then, identify in the list below, one of the equations.
#
# Matrix
#
#     A    B    C
# A   0    0    1
# B   0.5  0    0
# C   0.5  1    0
#
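# beta is the taxation parameter; it gets its own name so it is not confused with the rank b below.
beta = 0.85
M = matrix(c(0, 0.5, 0.5, 0, 0, 1, 1, 0, 0), ncol=3)
e = matrix(c(1, 1, 1), ncol=1)
v1 = matrix(c(1, 1, 1), ncol=1)
v1 = v1 / 3
# Iterate until the ranks settle; 4 iterations are too few for the equality checks below.
for (i in 1:100) {
  v1 = ((beta * M) %*% v1) + (((1 - beta) * e) / 3)
}
v1 = v1 * 3
# Find values for a, b, c to match equations in possible answers.
a <- v1[1]
b <- v1[2]
c <- v1[3]
# Determine which equation is true.
# Compare with all.equal() instead of ==, because floating-point results rarely match exactly.
isTRUE(all.equal(85 * b, (.575 * a) + (.15 * c)))
isTRUE(all.equal(.95 * a, (.9 * c) + (.05 * b)))
isTRUE(all.equal(b, (.475 * a) + (.05 * c)))
isTRUE(all.equal(c, b + (.575 * a)))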
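#
# How the candidate equations arise (a minimal sketch under one assumption: the teleport
# term (1 - beta) * e / 3 is rewritten as (1 - beta) / 3 * (a + b + c), which matches the
# iteration when the ranks sum to 1 and stays valid after rescaling). Row r of A_q2 then
# holds the coefficients of a, b, c in the equation for rank r.
# The names beta_q2, M_q2 and A_q2 are illustrative only.
beta_q2 <- 0.85
M_q2 <- matrix(c(0, 0.5, 0.5, 0, 0, 1, 1, 0, 0), ncol = 3)
A_q2 <- beta_q2 * M_q2 + (1 - beta_q2) / 3 * matrix(1, 3, 3)
dimnames(A_q2) <- list(c("a", "b", "c"), c("a", "b", "c"))
print(A_q2)
# Row "a" reads a = 0.05 a + 0.05 b + 0.9 c, which rearranges to 0.95 a = 0.9 c + 0.05 b.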
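# Q3
#
# Assuming no "taxation," compute the PageRanks a, b, and c of the three pages A, B, and C, using iteration, starting with the "0th" iteration where all three pages have rank a = b = c = 1. Compute as far as the 5th iteration, and also determine what the PageRanks are in the limit. Then, identify the true statement from the list below.
#
# We re-use the matrix M from Q2, but not its results: Q2 used β = 0.85, while
# "no taxation" means iterating v = M %*% v starting from all ones.
v5 = matrix(c(1, 1, 1), ncol=1)
for (i in 1:5) {
  v5 = M %*% v5
}
# Keep iterating to approximate the limit.
vlim = v5
for (i in 1:200) {
  vlim = M %*% vlim
}
isTRUE(all.equal(v5[2], 5/8))    # b after the 5th iteration
isTRUE(all.equal(vlim[1], 6/5))  # a in the limit
isTRUE(all.equal(v5[3], 11/8))   # c after the 5th iteration
#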
# Q4
#
# Suppose our input data to a map-reduce operation consists of integer values (the keys are not important). The map function takes an integer i and produces the list of pairs (p,i) such that p is a prime divisor of i. For example, map(12) = [(2,12), (3,12)].
# The reduce function is addition. That is, reduce(p, [i1, i2, ...,ik]) is (p,i1+i2+...+ik).
# Compute the output, if the input is the set of integers 15, 21, 24, 30, 49. Then, identify, in the list below, one of the pairs in the output.
#
# See https://gist.github.com/primaryobjects/8755398629a4e2ef74dd
#
# map(15) = [(3, 15), (5, 15)]
# map(21) = [(3, 21), (7, 21)]
# map(24) = [(2, 24), (3, 24)]
# map(30) = [(2, 30), (3, 30), (5, 30)]
# map(49) = [(7, 49)]
#
# reduce(2, [24, 30]) = (2, 54)
# reduce(3, [15, 21, 24, 30]) = (3, 90)
# reduce(5, [15, 30]) = (5, 45)
# reduce(7, [21, 49]) = (7, 70)
#
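# The map and reduce steps above can be reproduced in R (a minimal sketch, not a real
# map-reduce framework). The helper prime_divisors() and the names inputs, pairs and
# totals are illustrative only.
prime_divisors <- function(n) {
  ps <- integer(0)
  d <- 2L
  while (n > 1) {
    if (n %% d == 0) {
      ps <- c(ps, d)                        # d is prime here: smaller factors were removed
      while (n %% d == 0) n <- n %/% d      # strip every power of d
    }
    d <- d + 1L
  }
  ps
}
inputs <- c(15, 21, 24, 30, 49)
# Map: one (p, i) row per prime divisor p of each input i.
pairs <- do.call(rbind, lapply(inputs, function(i) {
  data.frame(p = prime_divisors(i), i = i)
}))
# Reduce: add up the i values that share the same key p.
totals <- tapply(pairs$i, pairs$p, sum)
print(totals)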