Quoc V. Le - Google
parallel neural networks at Google scale
- machine learning traditionally requires domain knowledge from human experts
- we want to move beyond hiring domain experts; it would be better if machines created the features rather than human experts
deep learning:
- great performance on many problems
- works well with a lot of data
- requires less domain knowledge
applying a non-linearity (like a sigmoid) at successive layers to build complex neural networks
The network can "learn" many complex functions, largely independent of domain knowledge.
pixels -> edge detectors -> face detectors
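A minimal sketch of that stacking idea, assuming sigmoid layers in NumPy (the layer sizes and initialization are illustrative, not from the talk):

```python
import numpy as np

def sigmoid(x):
    """Elementwise sigmoid nonlinearity."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """Apply each layer's linear transform followed by a sigmoid.
    `layers` is a list of (W, b) weight/bias pairs."""
    h = x
    for W, b in layers:
        h = sigmoid(W @ h + b)
    return h

# Toy 3-layer network: 784 "pixels" -> 256 -> 64 -> 10 units.
rng = np.random.default_rng(0)
dims = [784, 256, 64, 10]
layers = [(rng.normal(scale=0.01, size=(m, n)), np.zeros(m))
          for n, m in zip(dims[:-1], dims[1:])]
print(forward(rng.normal(size=784), layers).shape)  # (10,)
```

Each layer composes a nonlinearity on top of the previous one, which is what lets the stack represent progressions like pixels -> edges -> faces.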
- deep learning models are trained on many machines (10K or more)
- forward pass to compute the activations and the loss, backward pass to compute the gradient
- model parameters are partitioned across machines (model parallelism)
- a single partitioned model can use up to 1000 cores
- "1000 cores is still really small" so they partition the data and apply the functions to separate nodes and then send answers back to a "parameter server"
- the problem with this model: the server needs to wait for all answers to compute. so they relax the constraint and allow for asynch computation
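The talk gives no code, so here is a minimal sketch of that asynchronous scheme, assuming a toy least-squares model; `ParameterServer`, `worker`, and all names and sizes below are illustrative, not Google's actual system:

```python
import threading
import numpy as np

class ParameterServer:
    """Holds the shared parameters; workers fetch and update them
    asynchronously, with no barrier waiting for every worker."""

    def __init__(self, dim):
        self.params = np.zeros(dim)
        self.lock = threading.Lock()

    def fetch(self):
        with self.lock:
            return self.params.copy()

    def apply_gradient(self, grad, lr=0.05):
        with self.lock:
            self.params -= lr * grad

def worker(server, data_shard):
    """One replica: compute gradients on its own data shard using a
    possibly stale copy of the parameters, then push them back."""
    for x, y in data_shard:
        w = server.fetch()              # may be slightly stale
        grad = 2 * (w @ x - y) * x      # toy least-squares gradient
        server.apply_gradient(grad)     # no synchronization barrier

# Toy setup: linear regression data split across 4 workers.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
data = [(x, true_w @ x) for x in rng.normal(size=(400, 3))]
server = ParameterServer(dim=3)
shards = [data[i::4] for i in range(4)]
threads = [threading.Thread(target=worker, args=(server, s)) for s in shards]
for t in threads: t.start()
for t in threads: t.join()
print(server.params)  # approaches true_w despite stale reads
```

Each worker reads possibly stale parameters and pushes its gradient without waiting for the others, which is exactly the relaxed constraint described above.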
voice search, photo search, and text understanding
Voice search: your speech is sent to a deep neural network that
- extracts speech frames from the audio
- classifies the phoneme in each frame
- then puts the phonemes together to recognize your speech
All of this is done with the parallelized networks described above; see the sketch below.
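A hedged sketch of the frame-by-frame classification step; a single linear layer stands in for the deep network here, and the phoneme set, sizes, and weights are made up:

```python
import numpy as np

def softmax(z):
    """Turn scores into a probability distribution over phonemes."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_frames(frames, W, b, phonemes):
    """Classify each speech frame, then string the per-frame phoneme
    labels together; real systems decode with a language model on top."""
    return [phonemes[int(softmax(W @ f + b).argmax())] for f in frames]

# Toy example: 5 frames of 40-dimensional acoustic features, 3 phonemes.
rng = np.random.default_rng(0)
phonemes = ["k", "ae", "t"]
W, b = rng.normal(size=(3, 40)), np.zeros(3)
frames = rng.normal(size=(5, 40))
print(classify_frames(frames, W, b, phonemes))
```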
Text understanding: useful but very difficult
- programmatically understanding the meaning of words in context (complete with metaphors and idioms)
- you can map each word to a point in a ~100-dimensional vector space (a word embedding)
- translation can be done geometrically: words with the same meaning sit at similar positions in each language's space, so matching up the spaces matches up the words (see the sketch below)
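A rough sketch of that geometric idea, assuming a linear map aligns the two embedding spaces; all words, dimensions, and matrices below are toy stand-ins, not real embeddings:

```python
import numpy as np

# Toy embeddings: each word is a point in a low-dimensional space.
# The talk's models use ~100 dimensions; 4 here for brevity.
rng = np.random.default_rng(0)
en = {w: rng.normal(size=4) for w in ["one", "two", "three", "four", "five"]}
# Pretend the Spanish space is a linearly transformed copy of the English one.
R = rng.normal(size=(4, 4))
es_words = ["uno", "dos", "tres", "cuatro", "cinco"]
es = {sw: R @ en[ew] for sw, ew in zip(es_words, en)}

# Fit a linear map that sends English vectors onto their Spanish
# counterparts, using a small seed dictionary (least squares).
X = np.stack([en[w] for w in en])        # English vectors
Z = np.stack([es[w] for w in es_words])  # aligned Spanish vectors
M, *_ = np.linalg.lstsq(X, Z, rcond=None)

# Translate by mapping a word across and taking its nearest neighbor.
query = en["three"] @ M
print(min(es, key=lambda w: np.linalg.norm(es[w] - query)))  # "tres"
```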