1. Most neural networks are essentially very large correlation engines that will home in on any statistical, potentially spurious pattern that allows them to model the observed data more accurately.
2. Generative adversarial networks (GANs) have been extremely effective in approximating complex distributions of high-dimensional input data samples.
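For reference, the mechanism behind that claim is the GAN minimax game between a generator G and a discriminator D (Goodfellow et al., 2014); at the optimum the generator's samples match the data distribution:

```latex
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
             + \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```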
3. The learning objective is to find the parameterization of those embeddings under which the correct answer has the highest likelihood among all possible answers.
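A minimal sketch of this kind of objective (illustrative names, not from the quoted source): score each candidate answer against a query embedding and maximize the softmax log-likelihood of the correct one.

```python
import numpy as np

def answer_log_likelihood(query_emb, answer_embs, correct_idx):
    """Log-likelihood of the correct answer under a softmax over
    dot-product scores between the query and each candidate answer."""
    scores = answer_embs @ query_emb                 # one score per candidate
    scores -= scores.max()                           # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return log_probs[correct_idx]                    # maximize w.r.t. embeddings

# toy usage: 3 candidate answers with 4-dim embeddings
rng = np.random.default_rng(0)
q = rng.normal(size=4)
A = rng.normal(size=(3, 4))
print(answer_log_likelihood(q, A, correct_idx=1))
```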
4. In contrast with past approaches in AI, modern deep learning methods (LeCun et al., 2015; Schmidhuber, 2015; Goodfellow et al., 2016) often follow an “end-to-end” design philosophy which emphasizes minimal a priori representational and computational assumptions, and seeks to avoid explicit structure and “hand-engineering”. This emphasis has fit well with—and has perhaps been affirmed by—the current abundance of cheap data and cheap computing resources, which make trading off sample efficiency for more flexible learning a rational choice. The remarkable and rapid advances across many challenging domains, from image classification (Krizhevsky et al., 2012; Szegedy et al., 2017), to natural language processing (Sutskever et al., 2014; Bahdanau et al., 2015), to game play (Mnih et al., 2015; Silver et al., 2016; Moravčík et al., 2017), are a testament to this minimalist principle. A prominent example is from language translation, where sequence-to-sequence approaches (Sutskever et al., 2014; Bahdanau et al., 2015) have proven very effective without using explicit parse trees or complex relationships between linguistic entities.
5. Bengio: You can't tell right away; you usually only realize years later. Typically there is a mismatch between the expectations of the two sides, so you have to be careful about this when collaborating with industry: you need to tell them clearly what academics can and cannot do for them. The important thing is to make them understand that academia is not cheap labor and does not ship products; what it can do is generate ideas that change how a business works. Companies need to understand that this is only part of the investment. They also need internal people who can turn algorithms and prototypes into products, otherwise the collaboration is doomed to fail. Honestly, many people don't want to hear this, because it means the company has to spend more money. But it has to be said.
Bengio: Listen to your intuition. Many people lack confidence and miss opportunities because of it. As researchers, our main job is to contribute meaningful ideas that advance knowledge. These ideas are hidden somewhere in our brains, and we need to cultivate the ability to let them mature and get them published, so you need enough time to think, rather than constantly programming, writing, or even reading. Spend more time on the big questions that bother you.
6. https://www.zhihu.com/question/21342077 | |
Elon Musk: Well, I do think there’s a good framework for thinking. It is physics. You know, the sort of first principles reasoning. Generally I think there are — what I mean by that is, boil things down to their fundamental truths and reason up from there, as opposed to reasoning by analogy. Through most of our life, we get through life by reasoning by analogy, which essentially means copying what other people do with slight variations. And you have to do that. Otherwise, mentally, you wouldn’t be able to get through the day. But when you want to do something new, you have to apply the physics approach. Physics is really figuring out how to discover new things that are counterintuitive, like quantum mechanics. It’s really counterintuitive. So I think that’s an important thing to do, and then also to really pay attention to negative feedback, and solicit it, particularly from friends. This may sound like simple advice, but hardly anyone does that, and it’s incredibly helpful.
7. Goodfellow's tweet giving his opinion on deep learning and the convex cost constraint, plus another researcher's rebuttal:
https://twitter.com/goodfellow_ian/status/964168396072828928
8. Michael I. Jordan:
The developments which are now being called “AI” arose mostly in the engineering fields associated with low-level pattern recognition and movement control, and in the field of statistics — the discipline focused on finding patterns in data and on making well-founded predictions, tests of hypotheses and decisions.
...
One could simply agree to refer to all of this as “AI,” and indeed that is what appears to have happened. Such labeling may come as a surprise to optimization or statistics researchers, who wake up to find themselves suddenly referred to as “AI researchers.”
...
Of course, classical human-imitative AI problems remain of great interest as well. However, the current focus on doing AI research via the gathering of data, the deployment of “deep learning” infrastructure, and the demonstration of systems that mimic certain narrowly-defined human skills — with little in the way of emerging explanatory principles — tends to deflect attention from major open problems in classical AI.
...
We need to realize that the current public dialog on AI — which focuses on a narrow subset of industry and a narrow subset of academia — risks blinding us to the challenges and opportunities that are presented by the full scope of AI, IA and II.
...
In the current era, we have a real opportunity to conceive of something historically new — a human-centric engineering discipline.
9. For thirty years, the state-of-the-art in speech recognition used hidden Markov models with Gaussian mixtures as output distributions. These models were easy to learn on small computers, but they had a representational limitation that was ultimately fatal: the one-of-n representations they use are exponentially inefficient compared with, say, a recurrent neural network that uses distributed representations. To double the amount of information that an HMM can remember about the string it has generated so far, we need to square the number of hidden nodes. For a recurrent net we only need to double the number of hidden neurons.
Now that convolutional neural networks have become the dominant approach to object recognition, it makes sense to ask whether there are any exponential inefficiencies that may lead to their demise.
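Spelled out, the capacity argument in the first paragraph (idealizing hidden units as roughly binary):

```latex
\text{HMM with } n \text{ states: memory} = \log_2 n \text{ bits};
\quad 2\log_2 n = \log_2(n^2) \text{ bits requires } n^2 \text{ states.} \\
\text{RNN with } n \text{ hidden units: memory} \approx n \text{ bits};
\quad 2n \text{ bits requires only } 2n \text{ units.}
```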
10. The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms.
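The temporal-difference signal referred to here is concretely the error term of TD(0); the claimed parallel is that phasic dopamine firing tracks this reward-prediction error. A tabular sketch:

```python
import numpy as np

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    """One TD(0) step. 'delta' is the reward-prediction error that
    phasic dopaminergic signals are said to parallel."""
    delta = r + gamma * V[s_next] - V[s]   # TD error: target minus estimate
    V[s] += alpha * delta                  # nudge the estimate toward the target
    return delta

V = np.zeros(5)                            # value table for 5 states
print(td0_update(V, s=0, r=1.0, s_next=1), V[0])
```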
11. Deep learning, Yann LeCun, Yoshua Bengio & Geoffrey Hinton
The key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from data using a general-purpose learning procedure.
...
For classification tasks, higher layers of representation amplify aspects of the input that are important for discrimination and suppress irrelevant variations.
...
The issue of representation lies at the heart of the debate between the logic-inspired and the neural-network-inspired paradigms for cognition. In the logic-inspired paradigm, an instance of a symbol is something for which the only property is that it is either identical or non-identical to other symbol instances. It has no internal structure that is relevant to its use; and to reason with symbols, they must be bound to the variables in judiciously chosen rules of inference. By contrast, neural networks just use big activity vectors, big weight matrices and scalar non-linearities to perform the type of fast 'intuitive' inference that underpins effortless commonsense reasoning.
...
Ultimately, major progress in artificial intelligence will come about through systems that combine representation learning with complex reasoning. Although deep learning and simple reasoning have been used for speech and handwriting recognition for a long time, new paradigms are needed to replace rule-based manipulation of symbolic expressions by operations on large vectors.
12. About CapsuleNet
Equivariance and invariance are also useful properties when aiming to produce data representations that disentangle factors of variation, which is one goal of capsule networks.
...
They aim to hard-wire the ability to disentangle the pose of an object from the evidence of its existence. This is done by encoding the output of one layer as a tuple of a pose vector and an activation, leading to a clearer geometric interpretation of learned representations. They are inspired by human vision and detect linear, hierarchical relationships occurring in the data.
...
The votes are used to compute a proposal for an output pose by a variant of weighted averaging. The weights are then iteratively refined using distances between the votes and the proposal. Finally, an agreement value is computed as the output activation, encoding how strongly the transformed input poses agree on the output pose. The capsule layer outputs a set of tuples (M, a), each containing a pose matrix and an agreement (as activation) for one output capsule.
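A rough sketch of that refine-by-distance loop (a deliberate simplification with vector poses and Gaussian-style weights, not the exact dynamic or EM routing procedure from the capsule papers):

```python
import numpy as np

def route_by_agreement(votes, n_iters=3, beta=1.0):
    """votes: (n_inputs, d) pose votes for one output capsule.
    Alternate between proposing a pose as a weighted average of the votes
    and reweighting each vote by its closeness to the proposal."""
    w = np.ones(len(votes)) / len(votes)           # start from uniform weights
    for _ in range(n_iters):
        pose = w @ votes                           # weighted-average proposal
        d2 = ((votes - pose) ** 2).sum(axis=1)     # squared vote-proposal distances
        w = np.exp(-beta * d2)                     # closer votes get more weight
        w /= w.sum()
    agreement = float(np.exp(-beta * (w @ d2)))    # near 1 when votes coincide
    return pose, agreement

votes = np.array([[1.0, 0.0], [1.1, 0.1], [0.9, -0.1], [5.0, 5.0]])  # one outlier
pose, a = route_by_agreement(votes)
print(pose, a)   # pose lands near [1, 0]; the outlier is down-weighted
```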
...
During inference propagation, the principle of coincidence filtering is employed to activate higher-level capsules and set up part-whole relationships among capsule entities. Such a part-whole hierarchy lays a solid foundation for viewpoint-invariant recognition, which can be implemented through dynamic routing or EM routing.
...
The success of capsule networks lies in their ability to preserve more information about the input by replacing max-pooling layers with convolutional strides and dynamic routing, allowing part-whole relationships in the data to be preserved. This preservation of the input is demonstrated by reconstructing the input from the output capsule vectors.
...
Viewpoint changes in capsule networks are linear effects on the pose matrices of the parts and the whole between different capsule layers.
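Schematically (my notation, not the papers'): if a part has pose matrix T and W is the learned part-whole transform, the part's vote for the whole is TW; a viewpoint change D maps T to DT, so

```latex
\text{vote} = T\,W \quad\longmapsto\quad (D\,T)\,W = D\,(T\,W),
```

i.e. the same D multiplies the poses at every level, which is why agreement between parts survives viewpoint changes.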
...
Hinton wants to revamp the approach to AI because there is little evidence inside the natural sciences that "backprop" and large "training data" are used during human learning. Unsupervised and reinforcement approaches share much more in common with what we know about the human brain and its capacity to adapt to its environment. But the current approach to unsupervised learning (for deep learning approaches) is really just converting an unsupervised problem into a supervised one so that we can apply backprop (e.g. GANs, autoencoders, etc.). So in some sense the progress toward human learning is much slower than promoted. The issues become even more challenging when we realize that the use of EM in generative models usually degrades into an optimization problem that is best solved using backprop anyway. So Hinton's frustration with the current approach is likely shared by others. The take-home message is that the SOTA is not moving away from large training sets, and is not moving towards how people learn.
...
Blending more traditional ML approaches with deep learning may in fact bring us closer to human intelligence (e.g. using kNNs to implement the availability and anchoring heuristics, or perhaps even analogy). Real neurons are vastly more complex than the dumb ANNs we construct with our models, and backprop is much too wasteful of training data to be the way forward. Backprop is likely a placeholder until we find ways to realistically move towards unsupervised, reinforcement, and human-heuristic approaches to learning.
13. Operator notation provides a higher level of mathematical abstraction, allowing the theorems derived below to express the relationship between transformations that we are interested in (e.g. image formation) rather than being tied up in the underlying functions being acted upon (e.g. light fields and photographs).
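For a concrete example of this style, a well-known operator identity from light-field imaging (offered as an illustration, not necessarily the exact theorem this note refers to) writes photographic refocusing as a composition of operators:

```latex
\mathcal{P}_\alpha \;=\; \mathcal{F}^{-2} \circ \mathcal{S}_\alpha \circ \mathcal{F}^{4}
% P_alpha : 4D light field -> 2D photograph focused at depth alpha
% F^4 : 4D Fourier transform;  S_alpha : slicing;  F^{-2} : inverse 2D transform
```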
14. (optical flow) Small motion minimizes the correspondence problem between successive images, but sacrifices depth resolution because of the small baseline between consecutive image pairs.
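The trade-off can be quantified with the standard triangulation relation (textbook stereo geometry, not from the quoted source): with focal length f, baseline B, and disparity d, depth is Z = fB/d, so a disparity error delta d gives

```latex
\delta Z \;\approx\; \left|\frac{\partial Z}{\partial d}\right| \delta d
        \;=\; \frac{fB}{d^{2}}\,\delta d
        \;=\; \frac{Z^{2}}{fB}\,\delta d,
```

so depth error grows as 1/B: a small baseline (small motion) means coarse depth, exactly the sacrifice the note describes.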
Life:
cross my/your heart and hope to die
A sexy voice is a sexy voice regardless of dialect
It's hopeless. He lets me run away.
dry humor
You keep telling yourself what you know, but what do you believe? What do you feel?
Deep down, I'm really superficial.
You want too much
Mirror, mirror, on the wall, who's the fairest of them all?
side character
He died badly because he had lived badly.