Last active
September 26, 2020 20:03
-
-
Save Nikolaj-K/5eb9efcad0292712189196b79d2761a1 to your computer and use it in GitHub Desktop.
Review of "At the Interface of Algebra and Statistics"
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Those are the notes that I got through in the video here: | |
https://youtu.be/sIy9pD4sTVA | |
########## Links: | |
https://arxiv.org/abs/2004.05631 | |
https://youtu.be/wiadG3ywJIs | |
https://youtu.be/hZfLEX6d-Dk | |
https://qcpages.qc.cuny.edu/~jterilla/ | |
https://www.youtube.com/channel/UCs4aHmggTfFrpkPcWSaBN9g/videos | |
https://www.math3ma.com/ | |
http://tensornetwork.org/diagrams/ | |
https://github.com/emstoudenmire/parity | |
https://en.wikipedia.org/wiki/Penrose_graphical_notation | |
https://en.wikipedia.org/wiki/Density_matrix | |
https://en.wikipedia.org/wiki/Matrix_product_state | |
https://en.wikipedia.org/wiki/Density_matrix_renormalization_group | |
https://en.wikipedia.org/wiki/Generative_model | |
https://en.wikipedia.org/wiki/Parity_problem | |
https://en.wikipedia.org/wiki/Quantum_tomography | |
https://www.ctan.org/pkg/tufte-latex | |
########## ########## ########## ########## Pitch | |
At the Interface of Algebra and Statistics (PhD thesis) | |
arXiv:2004.05631v1 [quant-ph] 12 Apr 2020 | |
130 pages | |
PhD thesis by Tai-Danae Bradley | |
# About | |
One main aspect is about casting joint probability on finite systems ($S=X\times Y$ with subsystem $Y$) as | |
a pure state (in a relatively large vector space), | |
a notion in the language of quantum theory. | |
This, in particular, allows taking partial traces (relating to SVD and marginal propabilities). | |
The framework is then used for generative modeling (ML) experiments. | |
Technical article (see end of chapter 3): | |
>Tai-Danae Bradley, E. Miles Stoudenmire, and John Terilla. Modeling sequences with quantum states: A look under the hood. Machine Learning: Science and Technology, 2020. https://doi.org/ 10.1088/2632-2153/ab8731. arXiv:1910.07425. | |
>https://iopscience.iop.org/article/10.1088/2632-2153/ab8731/pdf | |
# Wikipedia buzzwords | |
* Compositionality Mag. | |
* Linear algebra (over C) | |
- Tensor network | |
+ e.g. link in Bibliography is given to http://tensornetwork.org/diagrams/ | |
and https://arxiv.org/pdf/1812.04011.pdf | |
+ https://en.wikipedia.org/wiki/Penrose_graphical_notation | |
* https://en.wikipedia.org/wiki/Matrix_product_state | |
+ https://en.wikipedia.org/wiki/Density_matrix | |
+ all of this is e.g. used prominently in quantum-many-body physics (see e.g. Nolting for a physics exposition) | |
- Partial trace | |
- Free C-vector space on X (enabling linear algebra perspective for some f:X^C) | |
* Generative models (Machine learning) | |
- Language analysis | |
+ Parity dataset | |
* Lattice theory | |
* Adjoints | |
########## ########## ########## ########## Meta | |
* Related to "Applied Category theory" scene | |
* Video summary (or rather teaser and "how to read") of thesis topic | |
- https://www.youtube.com/watch?v=wiadG3ywJIs | |
- https://youtu.be/wiadG3ywJIs?t=198 | |
* PBS (or so) youtube (for 2 years) | |
# Advisor | |
John Terilla, CEO of Tunnel Technologies discusses his quantum mechanical approach to NLP | |
https://youtu.be/hZfLEX6d-Dk | |
Related: | |
* https://en.wikipedia.org/wiki/Parity_problem | |
* Quantum tomography | |
See also arxiv | |
* https://arxiv.org/abs/1910.07425 | |
Publications: | |
* https://qcpages.qc.cuny.edu/~jterilla/ | |
See also linked to small github repo | |
* https://github.com/emstoudenmire/parity | |
# Website (blog) | |
* Math blog | |
* Going back to 2015, an article about once a month | |
* Mostly short but, some longer, some personal | |
* Very aestetic presentation, also including drawing / handwriting | |
* Colimits article | |
* latest | |
* Crumbs on writing | |
# Interaction | |
Quickyl wrote her a question -> short but informative answer. | |
########## ########## ########## ########## Text | |
# Style | |
## Layout | |
* very very pretty / aestetic | |
* uses tufte-latex | |
- https://www.ctan.org/pkg/tufte-latex | |
+ Sidenote column (effectively instead of footnotes) | |
## Writing | |
* Thoughout "positive" (I'm more cynical, so it's a bit hard to read) | |
* aimed at "more general audience" (surprising for a PhD thesis), casual, talks to reader | |
- "Takeaways", used as summary (from a style perspective, these are similarly used as theorem or lemma headers) | |
- reads like a Jon Baez blog. Style examples: | |
+ "Let’s begin." | |
+ "Think of this chapter as the prelude to the main piece, which begins in Chapter 2." | |
## Bibliography | |
* I think about 70 items | |
* Many about tensor networks | |
* Some about category theory (introductory theory) | |
* Few about information theory | |
* Some links to MathOverflow questions / nLab / nForum discussions etc. | |
# Conclusion | |
Actually only 5 sentences pitching the work to come. | |
## Misc | |
* Starts with a note-to-reader | |
* Chapter 1 is a coarse intro to some concepts, more explicitly revisited later | |
* Chapter 2-4 is basically focused linear algebra text | |
* Chapter 5 is in very different tone (nLab speak) than the rest. | |
## Table of Contents | |
A Note to the Reader 5 | |
Abstract 7 | |
Acknowledgments 9 | |
1 Introduction 11 | |
Formal Concepts 11 | |
A Linear Algebraic Version 16 | |
Outline of Contents 17 | |
2 Preliminaries 21 | |
A Motivating Example 22 | |
Some Elementary Probability 22 | |
Marginal Probability With Memory 24 | |
Notation 27 | |
Bra-Ket Notation 27 | |
Tensor Network Diagrams 31 | |
Density Operators 37 | |
The Partial Trace 42 | |
Reduced Densities of Pure Quantum States 48 | |
Entanglement 55 | |
3 A Passage from Classical to Quantum Probability | |
Reduced Densities from Classical Probability Distributions | |
The Motivating Example, Revisited 67 | |
Reduced Densities from Empirical Distributions 73 | |
Eigenvectors versus Formal Concepts 79 | |
Modeling Entailment with Densities 82 | |
An Extended Example 84 | |
The General Idea 90 | |
4 An Application and an Experiment 93 | |
Building a Generative Model Initial Inputs 95 | |
A Bird’s-Eye View 97 | |
Intuition: Why Does This Work? The Experiment 105 | |
5 Fixed Points, Categorically | |
Revisiting Eigenvectors 110 | |
Free (Co)Completions 113 | |
Revisiting Formal Concepts 121 | |
Interlude: 2, Categorically 122 | |
A Special Adjunction 127 | |
A Blueprint 127 | |
Conclusion 131 | |
Bibliography 132 | |
########## | |
# A Note to the Reader 5 | |
* Assures nobody will be left behind | |
* => See also "buzzwords" section above | |
# Abstract 7 | |
* => See what follows below | |
# Acknowledgments 9 | |
* Shoutout to advisor | |
# 1 Introduction 11 | |
Starts by talking about adjoints in two pedestrian cases. | |
## Formal Concepts 11 | |
* Adjunctions via Galois connections via Formal Concept. | |
* Definition of what's called a "formal concept" is given: | |
- Colors fruits example | |
- Definition given via the lattice of posets, or as fixed point of the functors (function) involved | |
- Also the same as complete bipartite subgraphs | |
* Discussions on finiteness requirements (usually, sloppily, not mentioned) in Chapter 5. | |
## A Linear Algebraic Version 16 | |
* Linear algebra, C^n, except the ordinal n is replaced by any finite set X, so it looks a bit different. | |
E.g. a matrix M is X x Y->C and thus induces a linear map C^X->C^Y | |
(which means a collection of matrix elements n x m->C induces a linear map C^n->C^m) | |
* Fixed-points <=> Eigenvector correspondence (to be proven) | |
Later (p. 80): The correspondence between formal concepts (set theory view) and | |
eigenvectors (Linear algebra) checks out when the relation R in the former has the property that its bipartite graph is a disjoint union of complete bipartite subgraphs. | |
## Outline of Contents 17 | |
* Chapter two works out, e.g.: | |
M with $\sum_{x,y} |M(x, y)|^2 = 1$ corresponds to unit vectors in $C^X\tensor C^Y$. | |
$M^T\M$ and $M\,M^T$ are projections onto "left- and right aspect" of this vector. | |
* Chapter 3: "[...] we present the main contribution in Chapter 3. It is a procedure for modeling any joint probability distribution by a rank 1 density operator together with a decryption of the information carried in the eigenvalues and eigen- vectors of its reduced density operators." | |
* Chapter 4: Algorithm for handling/generating generative models. | |
* Chapter 5: Abstracting some notions in categorical language. | |
# 2 Preliminaries 21 | |
## A Motivating Example 22 | |
colored food (two dimensional finite space) | |
## Some Elementary Probability 22 | |
## Marginal Probability With Memory 24 | |
capturing marginal and conditional probability with M resp. eigenvectors of functions of M. | |
## Notation 27 | |
## Bra-Ket Notation 27 | |
Mentions S->"C^S" typing issue: Orten C^S denotes a set and sometimes a vector space. | |
Can e.g. be resolved by consistently writing S->UC^S, i.e. to the underlying set of C^S (a vector space) | |
Chapter 5 works like that. | |
## Tensor Network Diagrams 31 | |
E.g. matrix product states are introduced | |
## Density Operators 37 | |
"Non-diagonal density operator" | |
\sqrt{p_i\cdot p_j} | |
## The Partial Trace 42 | |
Pass from vector space $V\otimes W$ to End($V\otimes W$) since there are natural (in the formal sense of the word) | |
maps to its components definable on it - the partial traces. | |
## Reduced Densities of Pure Quantum States 48 | |
Use of partial trace to deifne reduced densities. | |
Lots of linear algebra one needs a bit routine with | |
* Schmidt decomposition | |
* Singular-value decomposition (SVD) | |
- Relation between unitary matrices from SVD and reduced densities | |
+ e.g. ρ_V = M^T M and ρ_W = M M^T | |
## Entanglement 55 | |
* Entropy. | |
* Emphasis/Warnings/Clarifications | |
- All those properties (mixing, entanglement, etc.) are "mediated"/determined by rank. | |
- Entanglement is defined _for density operators on a tensor product_ | |
- Entanglement is entanglement w.r.t. a chosen decomposition $V\otimes W$ of the underlying $H$ | |
- Entanglement is defined only for underlying pure states in H (as opposed to mixed densities). | |
Some more nice diagrams for mental visualization. | |
Note_n: | |
- Mixing is considering a non-trivial statistical ensemble of state. | |
- Entanglement of a pure state a property w.r.t. a H-decomposition. | |
# 3 A Passage from Classical to Quantum Probability | |
## Reduced Densities from Classical Probability Distributions | |
Redo of previous section with language used in QM. | |
Language: Prefix / Suffic subsystem. | |
Forms density operator on finite set $S=X\times X$ | |
- Emphasis on using ${\mathbb C}^X\otimes{\mathbb C}^Y$, such that diagonal entires remain | |
+ Partial trace swallows less information that classical marginalizing | |
## The Motivating Example, Revisited 67 | |
Definitions with realizes with colored food example. | |
## Reduced Densities from Empirical Distributions 73 | |
We deal with a finite state space here, so many entities can be cast/explained in a combinatorical/graph theoretical fashion. | |
## Eigenvectors versus Formal Concepts 79 | |
Correspondence (or non-correspondence) between (1.) some notion mentioned in the set/graph theory heavy parts and (2.) their linear algebra counterparts.. | |
## Modeling Entailment with Densities 82 | |
Example $S=A\times{}B\times{}C\times{}Y$. | |
Decomposition into projection operators, A (not to be confused with the set above). | |
Discussion of relationship to M. | |
Cool graphs. | |
#### An Extended Example 84 | |
Food example with more predicates. | |
#### The General Idea 90 | |
Discussion/Summary. | |
# 4 An Application and an Experiment 93 | |
## Building a Generative Model Initial Inputs 95 | |
Even parity example. | |
More detailed description of the alorithm used/discussed is found in the article posted at the beginning. | |
## A Bird’s-Eye View 97 | |
Description of the algorithm taking the state |psi> from an empirical distribution of a training set (some of the even parity bit-strings, T) to one that is claimed to predicts accurately the partity property of any bit-string. | |
Keyword: Matrix-Product-State. | |
The algorthm is iterative over the subsyste-basis vectors, i.e a stepwise refinements towards the best appriximation. | |
Unlike classical margnialization, at each step the partial trace | |
captures information of all T (even if one isn't at the N'th refinement step yet.) | |
## Intuition: Why Does This Work? | |
This section gives an example with |T|=7 on a N=5 date set, computing $\rho_2$. | |
## The Experiment 105 | |
Effectively a check how many datasets are necessary to obtain good approximations to the |E_n> vector. | |
(This section is a bit short and only sketchy.) | |
# 5 Fixed Points, Categorically | |
"the motif being that fixed points of the composition of a morphism with its “adjoint” are interesting" | |
Linear algebra | "Abstract" | Lattices | |
Vect, Set | (Co)compCAT, CAT | Join/Meet, Set | |
Eigenvectors | Nuclei | Formal concepts | |
Some of the more intricate relevant keywords here: | |
* Isbell completion | |
* Enriched categories | |
## Revisiting Eigenvectors 110 | |
* Some warning on sloppy language and notation w.r.t. sets with extra structure (e.g. vector spaces.) | |
* Adjoints in linear algebra, relation of adjoint matrices. | |
* Eigenvector equation viewed as fixed point equation. | |
- M · M^T f \propto f | |
- M^T · M f \propto f | |
## Free (Co)Completions 113 | |
* Clarifications on relations involving free cocomplete categories and such. | |
* Analogies are drawn with linear algebra. | |
* "Although linear algebra does not appear to be a special case of enriched category theory, it turns out that the formal concept con- struction discussed in Sections 1.1 and 3.3 is. This is explained in the next section." | |
## Revisiting Formal Concepts 121 | |
* Set replaces with the 2-value set | |
* Lattices | |
#### Interlude: 2, Categorically 122 | |
* Lattices in more depth | |
* More discussion of enriched categories | |
## A Special Adjunction 127 | |
#### A Blueprint 127 | |
# Conclusion 131 | |
See above | |
# Bibliography 132 | |
See above |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment