The course will have six weeks of lectures and assignments, followed by a final exam.
- Distributed shortest-path routing -- sending email messages
- DNA sequence alignment, calculate Needleman-Wunsch score [1970]
- The Algorithm Designer’s Mantra -- "Can we do better?"
- Big-O notation, Asymptotic analysis
- Graph search algorithms -- BFS and DFS, Dijkstra’s Algorithm
- Heaps -- supported operations, runtime performance, applications
- No single “silver bullet” for solving problems -- divide & conquer, randomized, greedy, dynamic programming
- Definition -- iteratively make “myopic” decisions, hope everything works out at the end
- Easy running time analysis, hard to establish correctness
- Many greedy algorithms are not correct, e.g.Dijkstra'a algorithm with negative edge lengths
- Proof of correctness -- induction and exchange argument
- The Optimal Caching Algorithm
- Theorem: [Belady 1960s] The “furthest-in-future” algorithm is optimal (i.e., minimizes the number of cache misses)
- One shared resource, many jobs to do -- in what order should we sequence the jobs?
- Each job has a weight w-j and length i-j, Completion Time C-j is sum of all jobs up to and including j
- Goal is to minimize the weighted sum of all completion times, min E(j=1-n) w-j * C-j
- If weights are equal must schedule shorter jobs first, but if lengths are equal must schedule larger jobs first
- Assign scores to jobs that are increasing in weight and decreasing in length
- Order by decreasing ratio w-j / l-j is always correct, runtime is O(n*log(n)) to do the sorting
- Proof by Exchange Argument
- Connect vertices together while minimizing the path cost
- Input is undirected graph G(V,E) (connected) and cost e for each edge, output is minimum cost tree T that spans all vertices
- Applications are clustering and network routing
- Prim's and Kruskal's algorithms, O(m*log(n))
- Proof part 1 -- Prim's algorithm always outputs a spanning tree
- Empty Cut Lemma -- graph is not connected <=> there exists a cut (A,B) with no crossing edges
- Double Crossing Lemma -- if cycle C has an edge that crosses a cut (A,B) => there is another edge that crosses that cut
- Lonely Cut Corollary -- if there is only one edge crossing some cut (A,B) => that edge is not in any cycle
- Proof part 2 -- Prim's algorithm always outputs a MST, minimum-cost spanning tree
- Cut Property -- in a cut (A,B) where there is an edge e with minimum cost, that edge is part of MST
- MST is unique if edge costs are distinct
- Naive implementation has running time O(nm) but with heaps it drops to O(mlog(n))