A complete list of books, articles, blog posts, videos and neat pages that support Data Fundamentals (H), organised by Unit.
If the resource is available online (legally) I have included a link to it. Each entry has symbols following it.
- ⨕⨕⨕ indicates difficulty/depth, from ⨕ (easy to pick up intro, no background required) through ⨕⨕⨕⨕⨕ (graduate level textbook, maths heavy, expect equations)
- ⭐ indicates a particularly recommended resource; 🌟 is a very strongly recommended resource and you should look at it.
- SciPy lecture notes introduces all of the scientific Python infrastructure and has lots of overlap with Data Fundamentals (H) ⨕⨕⨕
- Learning AI if you suck at math ⨕⨕
- Mathematical Notation: A Guide for Engineers and Scientists by Edward R. Scheinerman covers all of the mathematical notation (and more) that we will use in a very concise form. ⨕ 🌟
- Deep learning notation covers much of the same terminology and symbols. ⨕⨕
- Math As Code: A cheatsheet for Mathematical Notation A really nice explanation of mathematical notation in terms of simple code (in Javascript, but easily applicable) ⨕⨕ 🌟
Excerpt from Math as Code:
The big Greek Σ (Sigma) is for Summation. In other words: summing up some numbers.

$$\sum_{i=1}^{100} i$$

Here, `i=1` says to start at `1` and end at the number above the Sigma, `100`. These are the lower and upper bounds, respectively. The i to the right of the "E" tells us what we are summing. In code:

```js
var sum = 0
for (var i = 1; i <= 100; i++) {
  sum += i
}
```

The result of `sum` is `5050`.
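Since the course itself uses Python rather than JavaScript, the same summation can be sketched in Python as follows. This is an illustrative translation, not part of the Math as Code excerpt; the variable name `total` is chosen here to avoid shadowing Python's built-in `sum`.

```python
# Python version of the summation from the excerpt: add up i for i = 1, ..., 100.
total = 0
for i in range(1, 101):  # range() excludes its upper bound, so 101 gives 1..100
    total += i

# The built-in sum() over the same range gives the same value in one line.
assert total == sum(range(1, 101))
print(total)  # 5050
```

Note that the upper bound written above the Sigma is inclusive, whereas the second argument to `range` is exclusive, which is why the code uses `101`.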
If you don't know any Python, you will need to learn some.
You will need to know:
- basic syntax: expressions and function calls
- printing
- lists
- dictionaries
- basic iteration (for, while)
- functions, parameters
- (maybe) list comprehensions
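
As a rough self-check, the short sketch below touches each item in the list above. The data, the variable names and the `mean` helper are made up for illustration and are not taken from the course materials; if every line makes sense to you, you probably know enough Python to get started.

```python
# Hypothetical data, invented purely for illustration.
marks = {"alice": [62, 71, 80], "bob": [55, 90]}  # a dictionary whose values are lists


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)  # an expression built from two function calls


# Basic iteration with for, plus printing.
for name, scores in marks.items():
    print(name, mean(scores))

# Basic iteration with while.
countdown = 3
while countdown > 0:
    print(countdown)
    countdown -= 1

# (Maybe) a list comprehension.
squares = [n * n for n in range(5)]
print(squares)
```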
You will not need to know:
- classes
- exceptions
- file handling
- or anything more advanced
- Python cheat sheet A quick reference card.
- learnxinyminutes Python 3 A very concise reference
- Python for data science cheat sheet A quick reference card with a data science focus. ⭐
- "Think Python!" by Allen Downey A full textbook on Python. Easy to read.
- Try the online tutorials at LearnPython
We'll be using Jupyter for everything in DF(H). While it's not hard to learn, there are some guides:
Quick references for when you get stuck or need to code something up. These cover NumPy and Matplotlib, the two key software libraries we use in DF(H).
- NumPy cheatsheet
- NumPy API reference
- NumPy user guide
- Python for Data Science cheatsheet
- Another NumPy Cheatsheet
- Introduction to Matplotlib
- Matplotlib command summary
- From Python to Numpy ⨕⨕⨕⭐
- 100 numpy exercises ⨕⨕⭐
- NumPy tutorial ⨕⭐
- Introduction to NumPy ⨕⨕
- Linear algebra cheat sheet ⨕⨕
- 101 NumPy Exercises for Data Analysis ⨕⨕
- Floating point visually explained ⨕ 🌟
- A series of fascinating articles on the inner workings of floating point by a Google engineer ⨕⨕⨕⭐
- Floating point numbers ⨕
- What Every Computer Scientist Should Know About Floating-Point Arithmetic ⨕⨕⨕
- Demystifying floating point precision ⨕⨕
- Advanced NumPy ⨕⨕⨕⨕
- NumPy tricks and Part I ⨕⨕⨕⨕⨕
- Ten simple rules for better figures and the accompanying video (recommended; read this and watch the video) ⨕ 🌟
- [The Hacker's Guide to uncertainty visualisation](https://erikbern.com/2018/10/08/the-hackers-guide-to-uncertainty-estimates.html) ⨕ 🌟
- Understanding the Box plot Very thorough discussion of what Box plots are and how they should be used. ⨕⨕
- Simple Matplotlib cheatsheet ⨕
- Introduction to Matplotlib ⨕⨕
- Matplotlib command summary ⨕⨕
- [Extensive matplotlib cheatsheet](http://nbviewer.jupyter.org/urls/gist.githubusercontent.com/Jwink3101/e6b57eba3beca4b05ec146d9e38fc839/raw/f486ca3dcad44c33fc4e7ddedc1f83b82c02b492/Matplotlib_Cheatsheet) ⨕⨕⨕
- Randal Olson's blog has many, many examples of good visualization, mainly using Python for graph preparation. ⨕
- Layered Grammar of Graphics (long, but detailed) ⨕⨕⨕
- The Grammar of Graphics, Leland Wilkinson, Second ed. ⨕⨕⨕⨕
- How to Lie with Statistics Darrell Huff (short, easy to read, worth reading) ⨕⭐
- Information Visualization: Perception for Design Colin Ware: a serious book on advanced visualisations.⨕⨕⨕
- The "Tufte" books
  - The Visual Display of Quantitative Information by Edward Tufte ⨕⨕⨕
  - Visual Explanations: Images and Quantities, Evidence and Narrative by Edward Tufte ⨕⨕⨕
  - Envisioning Information by Edward Tufte ⨕⨕⨕
- WATCH THIS 3blue1brown Linear Algebra series (strongly recommended) ⨕ 🌟
- An introduction to linear algebra Jeremy Kun ⨕⨕⨕⭐
- A primer on inner product spaces Jeremy Kun ⨕⨕⨕
This can be mind-bending. Some further reading and viewing:
- AI experiments: Visualizing high dimensional spaces ⨕⭐
- 3blue1Brown A Trick to Visualizing Higher Dimensions ⨕⨕ 🌟
- High-dimensional spaces chapter ⨕⨕⨕
- Geometry in Very High Dimension ⨕⨕⨕⨕
- On the Surprising Behavior of Distance Metrics in High Dimensional Space ⨕⨕⨕⨕⨕
- Introduction to Applied Linear Algebra by Stephen Boyd and Lieven Vandenberghe (freely available) ⨕⨕⨕⭐
- Coding the Matrix Philip N. Klein An excellent and thorough introduction to linear algebra through Python programming ⨕⨕⨕
- Linear Algebra Done Right, Sheldon Axler a more pure mathematics perspective ⨕⨕⨕
- A tutorial on principal components analysis ⨕⨕⨕⭐
- Eigenvectors and eigenvalues ⨕⨕⭐
- An introduction to principal components and the geometric interpretation of the covariance matrix ⨕⨕⨕
- Matrix decompositions ⨕⨕ 🌟
- A tutorial on the singular value decomposition ⨕⨕⨕⭐
- Toward an exploratory medium for mathematics intuitive geometric explanation of the SVD⨕⨕⨕⭐
- SVD part 1 Jeremy Kun⨕⨕⨕
- SVD part 2 Jeremy Kun⨕⨕⨕
- The Singular Value Decomposition ⨕⨕⨕⨕
- The Matrix Cookbook Kaare Brandt Petersen and Michael Syskind Pedersen. If you need to do a tricky calculation with matrices, this book will probably tell you how to do it. ⨕⨕⨕⨕⨕
- Introduction to Linear Algebra Gilbert Strang The standard textbook on linear algebra⨕⨕⨕⨕
- A First Course in Numerical Methods Uri M. Ascher and Chen Greif⨕⨕⨕⨕
- On the Origin of Circuits covers genetic algorithms ⨕ ⭐
- Khan academy: Multivariable calculus, particularly "Thinking about multivariable functions", "Derivatives of multivariable functions" and "Applications of multivariable derivatives"
- Why have Sex? Information Acquisition and Evolution ⨕⨕⨕⨕
- When least is best: How Mathematicians Discovered Many Clever Ways to Make Things as Small (or as Large) as Possible by Paul J. Nahin An interesting and mathematically thorough description of the history of optimisation from a mathematical standpoint.⨕⨕⨕⨕
- The Blind Watchmaker Richard Dawkins An excellent popular science book on how evolution (genetic algorithms in the wild) can work, including some early computer simulations.
- Gradient based optimization ⨕⭐
- An overview of gradient descent optimization algorithms ⨕⨕
- An introduction to algorithms for continuous optimization by Nicholas Gould⨕⨕⨕⨕⨕
- What is backpropagation if you want more detail on how first-order optimisation is used in deep learning.⨕⨕
- How machines learn 3blue1brown strikes again⨕ 🌟 Recommended: WATCH THIS
- Introduction to automatic differentiation ⨕⨕⨕⭐
- A visual guide to Bayesian thinking ⨕ 🌟
- Veritasium explains Bayes' Theorem ⨕
- Count Bayesie's guide to Bayesian statistics A collection of very readable articles by Count Bayesie⨕⨕⨕⭐
- [Video by author of Think Bayes](https://www.youtube.com/watch?v=TpgiFIGXcT4) ⨕⨕
- Khan Academy materials on probability (more in depth than we cover, but high quality stuff) ⨕⨕⨕
These provide a formal basis for probability theory, if you feel more comfortable having a rigorous mathematical basis. These go way beyond the course.
- A formal introduction to probability for Scientists and Engineers ⨕⨕⨕⨕⨕
- Conditional Probability Theory (For Scientists and Engineers) ⨕⨕⨕⨕⨕
- Probability and statistics cookbook ⨕⨕⨕ Like the Matrix Cookbook, this provides a dense, quick reference to many problems in statistics and probability.
- Think Bayes, Allen B. Downey light, Python focused⨕⨕⭐
- All of Statistics: A Concise Course in Statistical Inference Larry Wasserman *Outstanding; the best of these books, but somewhat maths heavy.*⨕⨕⨕⨕⨕⭐
- Chapters 2 and 3 of Information Theory, Inference, and Learning Algorithms by David MacKay ⨕⨕⨕⨕
- **A First Course in Probability** by Sheldon Ross (standard textbook on probability) ⨕⨕⨕
- Probability theory: the logic of science by E. T. Jaynes an excellent but controversial and very technical book⨕⨕⨕⨕⨕
- Information Theory, Inference and Learning Algorithms, David MacKay Also excellent and covers many interesting relations between probability, information and learning ⨕⨕⨕⨕⨕
- Introduction to statistical learning (outstanding introduction to statistical learning, including a book, video and course notes) ⨕⨕⨕⨕
- Why would I ever need Bayesian statistics ⨕⨕
- MCMC for dummies ⨕⨕⨕⭐
- Bayesian Linear Regression This is what the example above is based on.⨕⨕⨕⨕
- The Non-parametric Bootstrap as a Bayesian Model How the bootstrap can be seen as a Bayesian model ⨕⨕⨕⨕
- Bayesian methods for Hackers a full "book" on Bayesian methods and inference ⨕⨕⨕⨕⭐
- Sampling, Quantization and Encoding (short introduction to sampling and quantization) ⨕⨕⭐
- DSP for the Braindead (not actually for the braindead, in fact much more advanced than we cover here!) ⨕⨕⨕
- The Scientist and Engineer's Guide to Signal Processing http://dspguide.com/ (free, online book) ⨕⨕⨕
- Digital Signal Processing, A Computer Science Perspective, Jonathan (Y) Stein A great introduction for CS students, but fantastically expensive.