Thursday, February 22, 2007

Nightmare at Test Time: Robust Learning by Feature Deletion


The setting of this paper originates from a real application: spam filtering. If the classifier is trained as usual, it may put too much weight on a few features, and spammers can then change their style of writing junk email, e.g. abandoning exactly those heavily weighted words, so the classifier suffers at test time. An idea from game theory helps here: train against a worst-case adversary that deletes features. The price is a trade-off between accuracy and robustness, that is, allowing a little more training error in exchange for doing better when deletion actually happens.

The method is not complicated: it modifies the objective function of the standard SVM. The penalty for misclassification is altered so that the maximal possible damage from feature deletion is taken into account. The adjustment is carried out per sample, which might look strange at first. However, if the deletion were instead forced to be uniform across all samples, the optimization would, perhaps surprisingly, be harder, because the objective function would then be non-convex. The dual of the per-sample problem can also be derived, but unfortunately it is not as convenient as the standard SVM dual, which can take advantage of kernel methods.
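To make the per-sample adjustment concrete, here is a minimal sketch of the worst-case hinge loss for a linear classifier when an adversary may zero out up to k features of a single sample. This is my own illustration, not the paper's code; the function name `robust_hinge_loss` and the greedy top-k deletion are assumptions, though for a linear model the adversary's optimal move is indeed to delete the features contributing most to the correct margin.

```python
import numpy as np

def robust_hinge_loss(w, x, y, k):
    """Worst-case hinge loss for a linear classifier when an adversary
    may delete (zero out) up to k features of the sample x.

    w : weight vector, x : feature vector, y : label in {-1, +1},
    k : maximum number of features the adversary can delete.
    """
    # per-feature contributions to the margin y * <w, x>
    contrib = y * w * x
    margin = contrib.sum()
    # the adversary deletes the k features that help the margin most;
    # deleting a negative contribution would only help the classifier,
    # so only positive contributions are worth removing
    top_k = np.sort(contrib)[::-1][:k]
    deleted = top_k[top_k > 0].sum()
    return max(0.0, 1.0 - (margin - deleted))
```

With k = 0 this reduces to the ordinary hinge loss; as k grows, the loss can only increase, which is exactly the extra "punishment" folded into the training objective.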

A comparison between this method and feature selection is also made. If the same technique is used to formulate feature selection, in either the per-sample or the uniform style, the resulting optimization problem is non-convex and too tough to solve in a snap.

The important lesson of this paper for me is that I need to be familiar with the derivation of optimization problems and to know which kinds of problems are tractable and how they are solved. This reminds me of the books on optimization I have been meaning to read. I now have two of them, and I hope to finish them this semester.
