farkas lemma

First I'll say that whoever is telling you you need to understand farkas lemma to understand loop carried dependencies is completely fucking with you. Even the people that should understand it probably don't (I can think of 5 people in my professional circle that are "polyhedral" people and I guarantee they don't remember/don't care about farkas).

Secondly, farkas clicked for me after thinking about barrier functions and duality. Bear with me because I haven't thought about this in a while:

Ax = b

is a linear system of constraints and you're looking for a solution right? Constrained optimization is "hard" because in principle your search algo should always satisfy the constraints.

What's easier is unconstrained optimization i.e. gradient descent. How can you (sort of) turn Ax = b into an unconstrained optimization problem? Write down a penalty function that "penalizes" for violating each of the constraints and then just use gradient descent.

What does that penalty function look like? Well just do the easy thing and penalize for failing each of the constraints and then add up those penalties. So

p(x) = Σci * (ai * x - bi)

where ai is row i of A and ci is the penalty for failing the ith constraint. If you juggle the parens

p(x) = Σ(ci * ai) * x - ci * bi

where (ci * ai) is now actually a row vector times the columns of A which is the same as A^t * c (do it out on paper if you don't see it).

Okay so now think about when this unconstrained optimization problem has a solution and what does it mean. If for any penalty (any cis) there's an x such that

p(x) = Σ(ci * ai) * x - ci * bi = 0

i.e. you can always reduce the penalty to zero. Stated another way: if there's always an x (for any c) such that

Σ(ci * ai) * x = Σci * bi

then the system is solvable (and x is the solution). But what happens if there's no such x. Then for all x you'll be able to pick ci such that either

Σ(ci * ai) * x < Σci * bi

Σ(ci * ai) * x > Σci * bi

and (I think) you can play a further trick (can't remember) such that (assuming something about x) you can show that A^t * c ≥ 0 and c * b ≤ 0.

If I figure out the rest I'll come back and update. But basically just look up a proof of strong duality and it should look roughly like this.

makslevental/farkas.md