Duality and KKT condition are very important for machine learning, especially in SVM models. I’ll focus more on the high-level idea and the derivation of Lagrange Duality and how to introduce to KKT condition. There are some connceptions should be covered first.
The basic form of optimization problem without restrictions is just like finding the $x \in \Re^d$ makes that
Consider an Optimization Problem with equality restrictions.
Lagrange Multiplier is a method to solve this kind of problem. We can rewrite the objective function as . Then we can prove that the solution of is equal to the solution of previous problem. Here are called the Lagrange Multiplier. The new optimization problem is
And here the new function is called Lagrange function.
If there is any , the minimum will become $-\infty$ due to the unrestricted , so we should add a restriction that which makes the solution finite and converge into .
Let ,then there is a trivial theorem that
Here is the dual problem of .
Assume are continuous functions on , then consider the restricted optimization problem.
We have already known that the primal problem without restrictions can be solved easily by calculating derivatives and testing. So our first step is to translate the primal problem into a problem without restrictions.
We have a enhanced Lagrange Function in form of . Here because the direction of has been restricted.
Define a new function , we can conclude that under all primal constraints.
Obviously, , so is an upper bound of . Then under all constraints, we have , then .
In conclusion, the primal problem has an equivalent form
We have already known the equivalent form of primal problem, but in this form we should still consider the constraints which makes the calculation too complicated. The next step is to find a simpler way of finding the best solution.
Consider the dual problem, a well property should be when is the best solution of primal problem.
Think back the transformation of primal problem, if dual problem is equal to the primal problem on $x = x^*$, the formula should be
Then consider the Lagrange condition of both inner optimizations which are and . This leads to and .
Then consider the parameter . There are two situations about . First is that the minimized point is of and the other is that the minimized point is of .
For the first case, the inequality constraint becomes a equality constraint. That is
For the second case, the inequality constraint disappears, that is
So combine two situations together we have . Then under this constraint, the becomes a regular Lagrange Function, which leads to a Lagrange Multiplier constraint that
So the final constraint becomes
This is the KKT condition.