Created October 27, 2016 07:25
PCA in TensorFlow
import numpy as np
import tensorflow as tf

# N, size of matrix. R, rank of data
N = 100
R = 5

# generate data
W_true = np.random.randn(N,R)
C_true = np.random.randn(R,N)
Y_true = np.dot(W_true, C_true)
Y_tf = tf.constant(Y_true.astype(np.float32))

W = tf.Variable(np.random.randn(N,R).astype(np.float32))
C = tf.Variable(np.random.randn(R,N).astype(np.float32))
Y_est = tf.matmul(W,C)
loss = tf.reduce_sum((Y_tf-Y_est)**2)

# regularization
alpha = tf.constant(1e-4)
regW = alpha*tf.reduce_sum(W**2)
regC = alpha*tf.reduce_sum(C**2)

# full objective
objective = loss + regW + regC

# optimization setup
train_step = tf.train.AdamOptimizer(0.001).minimize(objective)

# fit the model
init_op = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init_op)
    for n in range(10000):
        sess.run(train_step)
        if (n+1) % 1000 == 0:
            print('iter %i, %f' % (n+1, sess.run(objective)))
Yeah sure, the point of this was just for demonstration. You could extend this to PCA models that can't be solved in closed form (e.g. sparse PCA).
I know the linear-algebraic implementation of PCA using SVD, but I see that you are minimizing a loss function here. Can you explain how this relates to PCA, or share resources on it?
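One way to see the connection, as a rough sketch rather than the gist author's own explanation: by the Eckart-Young theorem, the rank-R matrix that minimizes the squared reconstruction error is the truncated SVD, so minimizing `sum((Y - W C)**2)` over `W` and `C` should recover the same subspace that SVD-based PCA finds (PCA proper would also center the data first, which the gist does not do). The NumPy example below swaps Adam for plain alternating least squares to keep it deterministic and dependency-light; the sizes, noise level, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, R = 30, 3

# Noisy low-rank data (sizes and noise level are arbitrary choices).
Y = rng.standard_normal((N, R)) @ rng.standard_normal((R, N))
Y += 0.1 * rng.standard_normal((N, N))

# Closed form: best rank-R approximation via truncated SVD.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
Y_svd = (U[:, :R] * s[:R]) @ Vt[:R]
loss_svd = np.sum((Y - Y_svd) ** 2)

# Same loss, minimized iteratively by alternating least squares
# (a stand-in for the Adam updates used in the gist).
W = rng.standard_normal((N, R))
for _ in range(200):
    C = np.linalg.lstsq(W, Y, rcond=None)[0]          # fix W, solve for C
    W = np.linalg.lstsq(C.T, Y.T, rcond=None)[0].T    # fix C, solve for W
loss_fit = np.sum((Y - W @ C) ** 2)

# W itself is not orthonormal, but its column space matches the top-R
# left singular vectors. Compare the two orthogonal projectors.
Q, _ = np.linalg.qr(W)
P_fit = Q @ Q.T
P_svd = U[:, :R] @ U[:, :R].T

print("loss (SVD):", loss_svd)
print("loss (fit):", loss_fit)
print("same subspace:", np.allclose(P_fit, P_svd, atol=1e-6))
```

Note that the fitted `W` and `C` are only determined up to an invertible R x R transform, so an extra orthogonalization step (such as the QR above) is needed to read off orthonormal components.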
What does this op stand for?
regW = alpha*tf.reduce_sum(W**2)
regC = alpha*tf.reduce_sum(C**2)
It seems to regularize W and C?
I don't think W or C needs to be squared, I guess.
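On the squaring question, a small NumPy sketch may help (the matrix sizes, the scale factor `c`, and the helper name `l2_penalty` are illustrative assumptions, not part of the gist). `alpha * sum(W**2)` is the usual L2 (squared Frobenius norm) penalty: it is never negative and ignores the sign of the entries. Without the square, the sum can be made arbitrarily negative, so it would not act as a penalty. The squared form also discourages the rescaling `W -> c*W`, `C -> C/c`, which leaves the product `W C` unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 1e-4
W = rng.standard_normal((4, 2))
C = rng.standard_normal((2, 4))

def l2_penalty(M):
    """Squared Frobenius norm penalty, as in regW / regC."""
    return alpha * np.sum(M ** 2)

reg_squared = l2_penalty(W)      # what the gist computes
reg_plain = alpha * np.sum(W)    # hypothetical variant without the square

# Flipping the sign of W leaves the squared penalty unchanged,
# but negates the un-squared sum.
print(reg_squared, l2_penalty(-W))
print(reg_plain, alpha * np.sum(-W))

# Rescaling W and C keeps the product the same but changes the penalty,
# so the regularizer favors a balanced scaling of the two factors.
c = 10.0
W_big, C_small = c * W, C / c
same_product = np.allclose(W @ C, W_big @ C_small)
balanced = l2_penalty(W) + l2_penalty(C)
unbalanced = l2_penalty(W_big) + l2_penalty(C_small)
print(same_product, balanced, unbalanced)
```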
My implementation: https://gist.github.com/N-McA/bbbaed9d1a4b7c316f5d28cef1b96bdd
Couldn't one just use tf.svd? The SVD is how PCA is performed in the majority of implementations.
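For comparison, here is a minimal sketch of the closed-form route. It uses NumPy's `np.linalg.svd` rather than `tf.svd`, purely as an assumption to keep the example self-contained; the data shapes and noise level are made up for illustration. The data are centered, the top-R right singular vectors are taken as the principal components, and the result is cross-checked against the eigenvalues of the sample covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features, R = 200, 10, 3

# Synthetic data that is approximately rank R (illustrative only).
X = rng.standard_normal((n_samples, R)) @ rng.standard_normal((R, n_features))
X += 0.05 * rng.standard_normal((n_samples, n_features))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:R]                       # rows are principal axes
scores = Xc @ components.T                # data projected onto those axes
explained_variance = s[:R] ** 2 / (n_samples - 1)

# Cross-check: eigenvalues of the sample covariance, largest first.
cov = np.cov(Xc, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1][:R]

print("explained variance (SVD):", explained_variance)
print("top covariance eigenvalues:", eigvals)
```

This route gives orthonormal components directly, whereas the gradient-based factorization becomes useful once the objective includes terms with no closed-form solution.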