
@lyrl
Forked from johnhw/df_resources.md
Created October 24, 2020 17:42

A complete list of books, articles, blog posts, videos and neat pages that support Data Fundamentals (H), organised by Unit.

Formatting

If the resource is available online (legally) I have included a link to it. Each entry has symbols following it.

  • ⨕⨕⨕ indicates difficulty/depth, from ⨕ (easy to pick up intro, no background required) through ⨕⨕⨕⨕⨕ (graduate level textbook, maths heavy, expect equations)
  • ⭐ indicates a particularly recommended resource; 🌟 is a very strongly recommended resource and you should look at it.

General

Mathematical notation

  • Mathematical Notation: A Guide for Engineers and Scientists by Edward R. Scheinerman covers all of the mathematical notation (and more) that we will use in a very concise form. ⨕ 🌟
  • Deep learning notation covers much of the same terminology and symbols. ⨕⨕
  • Math As Code: A cheatsheet for Mathematical Notation. A really nice explanation of mathematical notation in terms of simple code (in JavaScript, but easily applicable) ⨕⨕ 🌟

Excerpt from Math as Code:


The big Greek Σ (Sigma) is for Summation. In other words: summing up some numbers.

$$\sum_{i=1}^{100}i$$

Here, i=1 says to start at 1 and end at the number above the Sigma, 100. These are the lower and upper bounds, respectively. The i to the right of the Σ tells us what we are summing. In code:

var sum = 0
for (var i = 1; i <= 100; i++) {
  sum += i
}

The result of sum is 5050.
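Since DF(H) uses Python rather than JavaScript, the same summation might be sketched as follows. This is not part of the Math as Code excerpt, and the variable names `total` and `total_builtin` are illustrative choices only.

```python
# Sum the integers 1..100, mirroring the JavaScript loop above.
total = 0
for i in range(1, 101):  # range() excludes its upper bound, hence 101
    total += i

# The built-in sum() over the same range gives the same value in one line.
total_builtin = sum(range(1, 101))

print(total, total_builtin)
```

Both variables hold 5050, matching the JavaScript result.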


Python

If you don't know any Python, you will need to learn some.

You will need to know:

  • basic syntax: expressions and function calls
  • printing
  • lists
  • dictionaries
  • basic iteration (for, while)
  • functions, parameters
  • (maybe) list comprehensions
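As a rough guide to the level expected, the short sketch below touches each item in the list above. The data and the names (`mean`, `marks`, `averages`, `passed`) are made up for illustration and are not taken from the course materials.

```python
# Functions and parameters: a hypothetical helper that averages a list.
def mean(values):
    total = 0
    for v in values:          # basic iteration with for
        total += v
    return total / len(values)

# Dictionaries and lists: illustrative (invented) marks per student.
marks = {"ada": [70, 85, 90], "alan": [60, 75]}

averages = {}
for name, scores in marks.items():
    averages[name] = mean(scores)   # expression and function call
    print(name, averages[name])     # printing

# List comprehension: names whose average is at least 70.
passed = [name for name, avg in averages.items() if avg >= 70]

# Basic iteration with while.
count = 0
while count < 3:
    count += 1
```

If every line here reads comfortably, you have enough Python to get started.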

You will not need to know:

  • classes
  • exceptions
  • file handling
  • or anything more advanced

References

Jupyter

We'll be using Jupyter for everything in DF(H). While it's not hard to learn, there are some guides:

Cheat sheets and API references

Quick references for when you get stuck or need to code something up. These cover NumPy and Matplotlib, the two key software libraries we use in DF(H).

Unit 1: Vectorized computation I

Unit 2: Vectorized computation II

Articles on floating point

Advanced NumPy

Unit 3: Visualisation

Aesthetics

Uncertainty

Matplotlib

Visualisation

Example visualisations

  • Randal Olson's blog has many, many examples of good visualization, mainly using Python for graph preparation. ⨕

Books

  • Layered Grammar of Graphics (long, but detailed) ⨕⨕⨕
  • The Grammar of Graphics, Leland Wilkinson, Second ed. ⨕⨕⨕⨕
  • How to Lie with Statistics, Darrell Huff (short, easy to read, worth reading) ⨕⭐
  • Information Visualization: Perception for Design Colin Ware: a serious book on advanced visualisations.⨕⨕⨕
  • The "Tufte" books
    • The Visual Display of Quantitative Information by Edward Tufte⨕⨕⨕
    • Visual Explanations: Images and Quantities, Evidence and Narrative by Edward Tufte⨕⨕⨕
    • Envisioning Information by Edward Tufte⨕⨕⨕

Unit 4: Computational Linear Algebra I

Primers

High-dimensional spaces

This can be mind-bending. Some further reading and viewing:

Videos

Texts

Books

  • Introduction to Applied Linear Algebra, Stephen Boyd and Lieven Vandenberghe (freely available) ⨕⨕⨕⭐
  • Coding the Matrix, Philip N. Klein. An excellent and thorough introduction to linear algebra through Python programming ⨕⨕⨕
  • Linear Algebra Done Right, Sheldon Axler a more pure mathematics perspective ⨕⨕⨕

Unit 5: Computational Linear Algebra II

Eigenvectors

Beyond the course

The SVD

Books

  • The Matrix Cookbook, Kaare Brandt Petersen and Michael Syskind Pedersen. If you need to do a tricky calculation with matrices, this book will probably tell you how to do it. ⨕⨕⨕⨕⨕
  • Introduction to Linear Algebra, Gilbert Strang. The standard textbook on linear algebra ⨕⨕⨕⨕
  • A First Course in Numerical Methods Uri M. Ascher and Chen Greif⨕⨕⨕⨕

Unit 6: Numerical Optimization I

Books

  • When Least Is Best: How Mathematicians Discovered Many Clever Ways to Make Things as Small (or as Large) as Possible by Paul J. Nahin. An interesting and mathematically thorough history of optimisation. ⨕⨕⨕⨕
  • The Blind Watchmaker Richard Dawkins An excellent popular science book on how evolution (genetic algorithms in the wild) can work, including some early computer simulations.

Unit 7: Numerical Optimization II

Gradient descent

Automatic differentiation

Pareto optimality

Unit 8: Probability & Stochastics I


Probability

Bayesian thinking and Bayes' rule

Beyond the course

These provide a formal basis for probability theory, if you feel more comfortable having a rigorous mathematical basis. These go way beyond the course.

Books

  • Probability and statistics cookbook ⨕⨕⨕ Like the Matrix Cookbook, this provides a dense, quick reference to many problems in statistics and probability.
  • Think Bayes, Allen B. Downey (light, Python-focused) ⨕⨕⭐
  • All of Statistics: A Concise Course in Statistical Inference, Larry Wasserman. Outstanding; the best of these books, but somewhat maths heavy. ⨕⨕⨕⨕⨕⭐
  • Chapters 2 and 3 of Information Theory, Inference, and Learning Algorithms by David MacKay ⨕⨕⨕⨕
  • A First Course in Probability by Sheldon Ross (standard textbook on probability) ⨕⨕⨕

Beyond the course

Unit 9: Probability & Stochastics II

Books

Unit 10: Digital Signals and Time Series

Books

  • The Scientist and Engineer's Guide to Signal Processing http://dspguide.com/ (free, online book) ⨕⨕⨕
  • Digital Signal Processing: A Computer Science Perspective, Jonathan Y. Stein. A great introduction for CS students, but fantastically expensive.