Problem: Transforming X into φ(X)
space can be expensive, and it is usually used as an intermediate result inside of a dot product like <φ(x[i]), φ(x[j])>
.
Trick to save computation time: Conditional on having a φ
where we know how to compute <φ(x[i]), φ(x[j])>
through a shortcut, we can use the shortcut instead of explicitly calling φ
and storing the long intermediate result. The savings stem from (a) saving calls to φ
, and (b) making the dot product operate on shorter vectors.
φ(x) = (x[1]**2, sqrt(2)*x[1]* x[2], x[2]**2)
<φ(x),φ(z)> = sum((x[1]**2)(z[1]**2), 2x[1]x[2]z[1]z[2], (x[2]**2)(z[2]**2))
<φ(x),φ(z)> = (x1z1 + x2z2)**2 = <x,z>**2
def shortcut(x, z):
return (x.dot(z)) **2
def old_way(x, y):
return φ(x).dot(φ(z))
In old_way
, each call to φ
costs 4 multiplications, and the dot product of the intermediate results costs 3 multiplications and 2 additions. So the cost is 11 multiplications and 2 additions.
In shortcut
, we need 2 multiplications and 1 addition to compute the dot product. And then another multiplication to square the scalar result, for a total of 3 multiplications and 1 addition.
So, by using shortcut
instead of old_way
we save 8 multiplications and 1 addition per call.