Sunday, February 25, 2007

Trading Convexity for Scalability


The main point the authors want to make is that a non-convex objective function is not necessarily a dead end. The paper shows how to solve the non-convex case where the objective is a difference of convex functions, using the ConCave Convex Procedure (CCCP, also written C3P). The idea is illustrated in the following picture.
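As a rough illustration (not code from the paper), CCCP minimizes f(x) = u(x) - v(x), with u and v both convex, by repeatedly replacing the concave part -v with its tangent at the current iterate and solving the resulting convex subproblem. A minimal 1-D sketch with hypothetical functions u(x) = x² and v(x) = 2|x - 1|:

```python
import numpy as np

# CCCP sketch: minimize f(x) = u(x) - v(x), u and v convex.
# Hypothetical example: u(x) = x**2, v(x) = 2*|x - 1|.
def v_subgrad(x):
    return 2.0 * np.sign(x - 1.0)   # a subgradient of v at x

def cccp(x0, iters=50):
    x = x0
    for _ in range(iters):
        g = v_subgrad(x)
        # Linearize -v around x, then solve the convex subproblem
        # minimize u(x) - g*x  =>  2x - g = 0  =>  x = g/2.
        x = g / 2.0
    return x

x_star = cccp(x0=3.0)   # each step provably decreases f
```

Here the iterates go 3.0 → 1.0 → 0.0 → -1.0 and then stay, which is the global minimum of this particular f; in general CCCP only guarantees convergence to a local minimum.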

The SVM with Hinge loss is then criticized: the number of SVs grows linearly with the size of the training set, which eventually hinders both training and recognition. Analyzing the objective function shows why. By Fermat's lemma applied to the loss, the derivative of the Hinge loss at a misclassified sample is nonzero, so by the KKT conditions every misclassified sample becomes an SV.
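A tiny numerical sketch of that argument (toy margin values of my own, not from the paper): with the Hinge loss H(z) = max(0, 1 - z) on the margin z = y·f(x), every sample with z < 1, which includes all misclassified samples (z < 0), has a nonzero (sub)derivative and therefore enters the KKT conditions with a nonzero dual coefficient, i.e. becomes a support vector.

```python
import numpy as np

def hinge_dloss(z):
    # (Sub)derivative of the hinge loss w.r.t. the margin z = y*f(x):
    # -1 wherever z < 1, and 0 where z > 1.
    return np.where(z < 1.0, -1.0, 0.0)

# Hypothetical margins y_i * f(x_i); negative values are misclassified.
margins = np.array([-2.0, -0.5, 0.3, 0.9, 1.5, 2.0])
grads = hinge_dloss(margins)

# Every point with margin < 1 (all misclassified points included)
# has a nonzero derivative and so becomes a support vector.
n_sv = int(np.sum(grads != 0.0))
```

With these toy margins, 4 of the 6 points would be support vectors, even the badly misclassified outlier at z = -2; that outlier is exactly the kind of point the Ramp loss below removes from the SV set.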

So the loss function is replaced with the Ramp loss, which is the difference of the Hinge loss and another convex function (a shifted Hinge loss).
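Concretely, with the shifted hinge H_a(z) = max(0, a - z), the Ramp loss is R_s(z) = H_1(z) - H_s(z) for some s < 1. A small numpy check (s = -1 is my choice for illustration; the paper treats s as a parameter):

```python
import numpy as np

def H(z, a):
    # Shifted hinge H_a(z) = max(0, a - z); convex in z.
    return np.maximum(0.0, a - z)

def ramp(z, s=-1.0):
    # Ramp loss: a difference of two convex hinges, hence non-convex,
    # but flat (capped at 1 - s) for margins below s, so outliers stop
    # contributing gradient and stop becoming support vectors.
    return H(z, 1.0) - H(z, s)

z = np.array([-3.0, -1.0, 0.0, 1.0, 2.0])
vals = ramp(z)   # capped at 1 - s = 2 for very negative margins
```

The cap is the whole point: a wildly misclassified sample pays at most 1 - s, whereas under the Hinge loss its penalty (and its pull on the solution) grows without bound.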

The SVM objective then becomes a difference of convex functions and can be solved with C3P.
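The following is a simplified sketch of how that outer loop can look for a linear classifier, not the paper's exact dual formulation: the concave part -H_s is linearized at the current solution, which amounts to fixing a flag β_i = 1 for samples whose margin is below s, and the resulting convex subproblem (hinge loss plus a linear term) is solved here by plain subgradient descent. The toy data, step sizes, and epoch counts are all my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D data: two separable clusters plus one flipped-label outlier.
X = np.vstack([rng.normal(+2, 0.3, (20, 2)),
               rng.normal(-2, 0.3, (20, 2)),
               [[+2.0, +2.0]]])
y = np.array([+1] * 20 + [-1] * 20 + [-1])   # last point is the outlier

def train_convex(X, y, beta, lam=0.01, lr=0.05, epochs=200):
    # Convex CCCP subproblem: hinge loss plus the linear term beta_i * z_i
    # obtained by linearizing the concave part -H_s of the ramp loss.
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        z = y * (X @ w + b)
        active = (z < 1.0).astype(float)   # hinge subgradient mask
        coef = (beta - active) * y         # d(loss_i)/dz_i times y_i
        w -= lr * (lam * w + X.T @ coef / len(y))
        b -= lr * coef.mean()
    return w, b

def ramp_svm(X, y, s=-1.0, outer=5):
    beta = np.zeros(len(y))
    w, b = train_convex(X, y, beta)        # start from the plain hinge SVM
    for _ in range(outer):
        z = y * (X @ w + b)
        beta = (z < s).astype(float)       # flagged points: loss is capped
        w, b = train_convex(X, y, beta)
    return w, b

w, b = ramp_svm(X, y)
pred = np.sign(X[:-1] @ w + b)             # evaluate on the 40 clean points
acc = float(np.mean(pred == y[:-1]))
```

Each outer iteration re-solves an ordinary convex hinge-SVM-like problem, which is why the scheme scales: the flagged outliers drop out of the active gradient and never inflate the SV set.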

This technique also applies to the TSVM, because the difficulty of the TSVM is precisely the non-convexity of its objective, so the algorithm carries over without modification. The original penalty on unlabelled samples, the symmetric Hinge loss, is now replaced by an extended version built from Ramp losses, which is again a difference of convex functions.
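A sketch of the two penalties on an unlabelled point (my own s = -0.5; the paper treats s as a parameter, and constant offsets do not affect the minimizer): the symmetric Hinge max(0, 1 - |f(x)|) pushes unlabelled points away from the decision boundary regardless of which side they fall on, and its Ramp-based replacement R_s(z) + R_s(-z) does the same while staying a difference of convex functions, so CCCP still applies.

```python
import numpy as np

def H(z, a):
    return np.maximum(0.0, a - z)

def ramp(z, s):
    return H(z, 1.0) - H(z, s)

def symmetric_hinge(z):
    # Original TSVM penalty on an unlabelled point: max(0, 1 - |f(x)|).
    return np.maximum(0.0, 1.0 - np.abs(z))

def symmetric_ramp(z, s=-0.5):
    # Capped replacement: a sum of two ramps, each a difference of
    # convex hinges, so the overall objective remains DC. It sits on a
    # constant plateau of 1 - s for confident predictions and peaks at
    # the boundary z = 0; only the difference (peak minus plateau)
    # matters for optimization.
    return ramp(z, s) + ramp(-z, s)

z = np.array([-2.0, 0.0, 2.0])
sh = symmetric_hinge(z)
sr = symmetric_ramp(z)
```

Both penalties are largest at z = 0 and smallest for confident predictions; the Ramp version additionally caps the penalty variation, which is what keeps the CCCP machinery and the SV-count argument intact in the transductive setting.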

Here is some related work to check:
