Created
June 10, 2009 19:24
-
-
Save williamcotton/127435 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
require "matrix" | |
#First, you construct an adjacency matrix. An adjacency matrix is just a matrix of what is linking to what. | |
#[0, 1, 1, 1, 1, 0, 1] | |
#[1, 0, 0, 0, 0, 0, 0] | |
#[1, 1, 0, 0, 0, 0, 0] | |
#[0, 1, 1, 0, 1, 0, 0] | |
#[1, 0, 1, 1, 0, 1, 0] | |
#[1, 0, 0, 0, 1, 0, 0] | |
#[0, 0, 0, 0, 1, 0, 0] | |
#This example is based on the wikipedia description, http://en.wikipedia.org/wiki/PageRank#Algorithm | |
#So, for example, row 1 is what Page ID=1 is linking to, ie, pages 2, 3, 4, 5, and 7. | |
#Let's call the matrix m1, | |
m1 = Matrix[[ 0.0,1.0,1.0,1.0,1.0,0.0,1.0],[1.0,0.0,0.0,0.0,0.0,0.0,0.0],[1.0,1.0,0.0,0.0,0.0,0.0,0.0],[0.0,1.0,1.0,0.0,1.0,0.0,0.0],[1.0,0.0,1.0,1.0,0.0,1.0,0.0],[1.0,0.0,0.0,0.0,1.0,0.0,0.0],[0.0,0.0,0.0,0.0,1.0,0.0,0.0]] | |
#I've got it in floating point so it'll end up in floating point... | |
#Now, the first thing you need to do is compute the row-stochastic matrix, which is the same thing as getting each row to add up to 1. | |
def row_stochastic(matrix) | |
x = 0 | |
row_total = [] | |
while x < matrix.row_size | |
y = 0 | |
row_total << 0 | |
while y < matrix.row_size | |
row_total[x] += matrix.row(x)[y] | |
y += 1 | |
end | |
x += 1 | |
end | |
a1 = matrix.to_a | |
x = 0 | |
while x < matrix.row_size | |
y = 0 | |
while y < matrix.row_size | |
a1[x][y] = a1[x][y]/row_total[x] | |
y += 1 | |
end | |
x += 1 | |
end | |
Matrix[*a1] | |
end | |
#You'd end up with a matrix like this: | |
puts row_stochastic(m1) | |
#[[0.0, 0.2, 0.2, 0.2, 0.2, 0.0, 0.2] | |
#[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0] | |
#[0.5, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0] | |
#[0.0, 0.33, 0.33, 0.0, 0.33, 0.0, 0.0] | |
#[0.25, 0.0, 0.25, 0.25 , 0.0, 0.25, 0.0] | |
#[0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0] | |
#[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]] | |
#Now, you need to transpose it before you continue. | |
row_stochastic(m1).transpose | |
#Now, you've got to compute the principal eigenvector. I won't go in to the details of what that is, but here's the method, and #what it returns. | |
def eigenvector(matrix) | |
i = 0 | |
a = [] | |
while i < matrix.row_size | |
a << [1] | |
i += 1 | |
end | |
eigen_vector = Matrix[*a] | |
i = 0 | |
while i < 100 | |
eigen_vector = matrix*eigen_vector | |
eigen_vector = eigen_vector / eigen_vector.row(0)[0] | |
i += 1 | |
end | |
eigen_vector | |
end | |
puts eigenvector(row_stochastic(m1).transpose) | |
#[[1.0], [0.547368421052632], [0.463157894736842], [0.347368421052632], [0.589473684210526], [0.147368421052632], [0.2 ]] | |
#Now, just normalize it, by having it all add up to 1. | |
def normalize(matrix) | |
i = 0 | |
t = 0 | |
while i < matrix.row_size | |
t += matrix.column(0)[i] | |
i += 1 | |
end | |
matrix = matrix/t | |
end | |
#Giving, in the end, | |
puts normalize(eigenvector(row_stochastic(m1).transpose)) | |
#[[0.303514376996805], [0.166134185303514], [0.140575079872204], [0.105431309904153], [0.178913738019169], [0.0447284345047923 ], #[0.060702875399361]] |
You're totally right! Here's an outline and working Javascript example: http://williamcotton.com/pagerank-explained-with-javascript
Line 64 computes the transposed matrix but discards it (it's not a mutating method). Did you mean to say this?
row_stochastic = row_stochastic(m1).transpose
Your Javascript example (now posted at http://www.cs.duke.edu/csed/principles/pagerank/) is also wrong. Did you try running the row_stochastic method? Because it doesn't produce the result shown. There's also Javascript in that page which prevents you scrolling to the bottom (in Chrome) - I had to save it and fix your JS just to read the page.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I belive this method breaks down if you have disconnected graphs. The google white paper adds a damping factor to solve this problem.