@dpiponi
Last active June 14, 2017 17:28
Here are some tanh units being self-normalising
# See "Self-Normalizing Neural Networks" https://arxiv.org/abs/1706.02515
# "SNNs cannot be derived with...tanh units..."
# So I'm probably missing the point somewhere...
import math
import numpy
# Magic number
lambda0 = 1.59254
n = 1000
nlayers = 100
# Incoming activations have mean 0, variance 1
x = numpy.random.normal(0, 1, n)
# Apply 100 fully connected random layers of 1000 units each
for i in range(nlayers):
    w = numpy.random.normal(0, 1.0/math.sqrt(n), (n, n))
    x = lambda0*numpy.tanh(w.dot(x))
# Mean and variance remain around 0, 1
print(numpy.mean(x), numpy.var(x))
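
The magic number is presumably chosen so that a unit-variance Gaussian pre-activation comes back out with unit variance: lambda0 = 1/sqrt(E[tanh(Z)^2]) with Z ~ N(0, 1). A quick sketch that recovers it numerically (this derivation is inferred from the code above, not stated in it; scipy is assumed to be available):

import math
from scipy.integrate import quad

# Standard normal density.
def gaussian_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# E[tanh(Z)^2] for Z ~ N(0, 1), by numerical integration.
second_moment, _ = quad(lambda z: math.tanh(z) ** 2 * gaussian_pdf(z), -10.0, 10.0)

# lambda0 = 1/sqrt(E[tanh(Z)^2]) makes x = lambda0*tanh(z) have variance 1
# (its mean is 0 by symmetry) whenever z ~ N(0, 1).
print(1.0 / math.sqrt(second_moment))  # ~1.5925, the magic number above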

dpiponi commented Jun 13, 2017

Try things like x = numpy.random.normal(-0.2, 1.2, n)

You still end up with mean and variance around 0 and 1.
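
A sketch of that experiment (same setup as the script above, just looping over a few different starting means and standard deviations; the fixed point at mean 0, variance 1 appears to be attracting):

import math
import numpy

lambda0 = 1.59254
n = 1000
nlayers = 100

for mean, std in [(-0.2, 1.2), (0.5, 0.5), (1.0, 2.0)]:
    x = numpy.random.normal(mean, std, n)
    for i in range(nlayers):
        w = numpy.random.normal(0, 1.0 / math.sqrt(n), (n, n))
        x = lambda0 * numpy.tanh(w.dot(x))
    # For these starting points the final mean and variance come out around 0 and 1.
    print(mean, std, numpy.mean(x), numpy.var(x))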


dpiponi commented Jun 13, 2017

Note that you can push the lambda0 into the weights so that everyone who's been using tanh has been using self-normalising units all along. They just weren't initialising the weights appropriately :-)
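
One way to make that concrete (my reading of the comment, not spelled out in it): drop the explicit lambda0 factor, use plain tanh, and instead initialise the weights with standard deviation lambda0/sqrt(n). The activations then settle at a fixed scale, mean about 0 and variance about 1/lambda0^2 (roughly 0.39), which is the same self-normalising behaviour up to a constant factor:

import math
import numpy

lambda0 = 1.59254
n = 1000
nlayers = 100

x = numpy.random.normal(0, 1, n)
for i in range(nlayers):
    # lambda0 absorbed into the weight initialisation; plain tanh activation.
    w = numpy.random.normal(0, lambda0 / math.sqrt(n), (n, n))
    x = numpy.tanh(w.dot(x))

# Mean stays near 0; variance settles near 1/lambda0**2 (about 0.39).
print(numpy.mean(x), numpy.var(x), 1.0 / lambda0 ** 2)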
