@fperez
Last active April 19, 2022 18:52
Notes for "Why does deep and cheap learning work so well?" (ArXiv:1608.08225v1/cond-mat.dis-nn) by Lin and Tegmark.

fperez commented Sep 12, 2016

Ah @hwlin76, of course! I just didn't pay enough attention to eq. (10) and fell into the same trap you probably did: once I accepted it as correct (yup, Taylor series, fine, move along), in my head I simply identified sigma_2 with sigma'', and the trap was closed.

I'd be happy to look at the numerical error for multivariate polynomials, but only in a couple of days, as I'm currently traveling.

BTW, if you don't mind, I'd like to use this little exchange to illustrate two things:

  • your very interesting results/paper (which I'm not done digesting)
  • the value of open work with Jupyter, GitHub, etc. for this kind of exchange, which is even more lightweight and direct than an exchange of ArXiv preprints (and there's code, so we can work reproducibly off actual implementations).

On Tuesday I'm giving the colloquium in the CS department at CU Boulder, and I know some of the folks from my former physics dept will be in attendance. Would you be OK if I mention it?


fperez commented Sep 12, 2016

ps - small typo, in the paragraph right after eq. (11) it says that the approximation is exact as lambda -> \infty. It should be "as lambda -> 0."
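A quick numerical sketch of why the limit should be lambda -> 0, not taken from the paper's code. It assumes the four-neuron product approximation with inputs scaled by lambda and the output rescaled by 1/lambda^2, and it is written in terms of sigma''(0) directly to sidestep the sigma_2 vs. sigma'' convention of eq. (10). Softplus is used only because its second derivative at 0 is nonzero (it equals 1/4); the names `softplus`, `approx_product` and `SOFTPLUS_D2_AT_0` are illustrative.

```python
import math

def softplus(x):
    """Smooth activation log(1 + e^x); sigma''(0) = 1/4."""
    return math.log1p(math.exp(x))

SOFTPLUS_D2_AT_0 = 0.25  # second derivative of softplus at 0

def approx_product(u, v, lam, sigma=softplus, d2=SOFTPLUS_D2_AT_0):
    """Approximate u*v with four evaluations of sigma on lambda-scaled inputs.

    The even part of sigma gives sigma(x) + sigma(-x) = 2*sigma(0) + sigma''(0)*x^2 + O(x^4),
    so the difference below equals 4*sigma''(0)*lam^2*u*v plus fourth-order terms.
    """
    s = lam * (u + v)
    d = lam * (u - v)
    numerator = sigma(s) + sigma(-s) - sigma(d) - sigma(-d)
    return numerator / (4.0 * d2 * lam ** 2)

u, v = 0.7, -1.3
lams = (1.0, 0.1, 0.01)
errors = [abs(approx_product(u, v, lam) - u * v) for lam in lams]
for lam, err in zip(lams, errors):
    print(f"lambda = {lam:<5} abs error = {err:.3e}")
```

The error should shrink roughly like lambda^2 as lambda decreases, until floating-point cancellation in the numerator takes over at very small lambda.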


fperez commented Sep 13, 2016

BTW @hwlin76, for the multivariate case, are you thinking of representing it as nested networks, e.g. g(u,v,w) = uvw as g = f(u, f(v,w)) where f(u,v) = uv, or do you have an explicit construction for the 3-variable case similar to the one in the paper, with a 3-n-1 network? I imagine this should be possible/easy, but I haven't yet tried to figure out what the n x 3 matrix would be, nor what \mu would become. And if that's the case, do you have the inductive result for the p-term multivariate product?

Before I dive into the multivariate analysis, it would be good to know which way you're thinking of it, and if you have these constructive results ready then we could formulate the implementation that way directly.
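For reference, the nested option can be sketched as below. This covers only the g = f(u, f(v,w)) composition, not a single-hidden-layer 3-n-1 construction, which is not attempted here. It assumes the same four-neuron approximation of f(u,v) = uv with a softplus activation (sigma''(0) = 1/4) and a small scaling lambda; `approx_pair` and `nested_product` are hypothetical names.

```python
import math

def softplus(x):
    """Smooth activation log(1 + e^x); sigma''(0) = 1/4."""
    return math.log1p(math.exp(x))

def approx_pair(u, v, lam=0.01, d2=0.25):
    """Approximate f(u, v) = u*v from four softplus evaluations (assumed construction)."""
    s = lam * (u + v)
    d = lam * (u - v)
    numerator = softplus(s) + softplus(-s) - softplus(d) - softplus(-d)
    return numerator / (4.0 * d2 * lam ** 2)

def nested_product(xs, lam=0.01):
    """Approximate the product of xs by right-nesting: f(x0, f(x1, f(x2, ...)))."""
    if len(xs) == 1:
        return xs[0]
    return approx_pair(xs[0], nested_product(xs[1:], lam), lam)

u, v, w = 0.7, -1.3, 0.5
approx_uvw = nested_product([u, v, w])
exact_uvw = u * v * w
print(f"nested approx = {approx_uvw:.6f}, exact = {exact_uvw:.6f}")
```

Each nesting level adds one more layer of four neurons, so a p-term product built this way uses p-1 pair gates and its depth grows with p; the per-gate errors compound, which is one reason a direct p-input construction would be worth comparing against.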
