This paper proposes another version of knowledge-based SVMs. Earlier versions can be regarded as adding constraints to the original SVM: to keep the optimization solvable as a quadratic program, the prior knowledge is encoded as polyhedral constraints, which may introduce non-separability even into cases that were originally separable.
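For context, here is my rough sketch of how the earlier formulation encodes knowledge (my own notation, following what I understand of Fung, Mangasarian and Shavlik; I may be off on signs): a polyhedral knowledge set $\{x : Bx \le b\}$ declared to belong to the positive class must satisfy the implication $Bx \le b \Rightarrow w^\top x \ge \gamma + 1$, and by a theorem of the alternative this implication can be replaced by linear constraints over an auxiliary variable $u$:

$$\exists\, u \ge 0 :\quad B^\top u + w = 0, \qquad b^\top u + \gamma + 1 \le 0,$$

so the whole problem stays a linear or quadratic program, but these extra constraints sit on top of the usual SVM ones.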
The authors of this paper introduce a transformation of the classifier f. The transformation enforces constraints in several settings, e.g. fixing the labels of certain samples in binary classification, imposing upper and lower bounds in regression, enforcing monotonicity, excluding some labels in multi-class classification, and enforcing even or odd symmetry. This seems to be a more general method than the earlier versions. I do not fully follow the analysis part of the paper. Their results seem to depend on the particular transformation adopted: in some cases they are lucky and end up with a convex optimization problem, while in others they have to resort to CCCP (the concave-convex procedure) and similar techniques for non-convex optimization.
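To make the transformation idea concrete for myself, here are toy constructions (my own illustrations, not the transformations actually used in the paper) of how constraints can be baked into the hypothesis class instead of being added as constraints on the optimization:

$$f_{\text{even}}(x) = \tfrac{1}{2}\bigl(g(x) + g(-x)\bigr), \qquad f_{\ge c}(x) = c + g(x)^2, \qquad f_{\text{fix}}(x) = y_0 + \lVert x - x_0 \rVert^2\, g(x),$$

where $g$ ranges over the original function class. The first is even by construction, the second respects the lower bound $c$, and the third satisfies $f(x_0) = y_0$ for every $g$. The squared construction already hints at why the resulting optimization over $g$ can become non-convex.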
Basically, to understand this paper further, I have to explore the following papers and topics:
- G. Fung, O. L. Mangasarian and J. W. Shavlik: Knowledge-based support vector machine classifiers, NIPS 2002.
- O. L. Mangasarian, J. W. Shavlik and E. W. Wild: Knowledge-based kernel approximation, JMLR 2004.
- Rademacher complexity