In semi-supervised learning, there is an important technique called co-learning. The basic idea is to train two independent classifiers and let them interact through their training errors so that they eventually agree on the unlabelled examples. Two key phrases: independent classifiers, and agreement on unlabelled data.
Now let's focus on what the authors show us. As is known, the SVM framework already provides a regularization technique for supervised learning. So, in order to incorporate the unlabelled data, an "agreement regularizer" is added.
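To make that concrete, here is a rough sketch of the kind of objective I have in mind (my own notation, with a generic loss $\ell$ and weights $\gamma$, $\lambda$; the paper's exact formulation may differ):

$$
\min_{f^1,\dots,f^M \in \mathcal{H}} \; \sum_{v=1}^{M} \left[ \sum_{i=1}^{n} \ell\big(y_i, f^v(x_i)\big) + \gamma \,\lVert f^v \rVert_{\mathcal{H}}^{2} \right] \;+\; \lambda \sum_{v<w} \sum_{j=n+1}^{n+m} \big( f^v(x_j) - f^w(x_j) \big)^{2}
$$

The bracketed part is just the usual SVM objective for each view (hinge loss plus norm penalty); the last sum is the agreement regularizer, which is small only when all views predict similar values on the unlabelled points $x_{n+1},\dots,x_{n+m}$.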
Then, by the representer theorem, solving the optimization problem requires O(M^3(m+n)^3) time, where M is the number of different views, m is the number of unlabelled data and n the number of labelled data.
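The cubic cost is not mysterious: the representer theorem says each view's minimizer expands over all labelled and unlabelled points,

$$
f^v(\cdot) = \sum_{i=1}^{n+m} \alpha^v_i \, k^v(x_i, \cdot), \qquad v = 1, \dots, M,
$$

so the joint problem has M(m+n) coefficients, and solving a dense linear system in that many unknowns costs on the order of its cube.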
However, m is usually huge because unlabelled data are so easy to obtain, so we will suffer from a cubic algorithm. An alternative simply uses a subspace of the original RKHS, namely the one spanned by the labelled data only, and carries out the optimization over that restricted subspace.
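Roughly (again my own reconstruction, same notation as above), each $f^v$ is now forced to take the form

$$
f^v(\cdot) = \sum_{i=1}^{n} \alpha^v_i \, k^v(x_i, \cdot),
$$

which is substituted back into the co-regularized objective, so only Mn coefficients remain while the agreement term is still evaluated on the m unlabelled points.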
This restricted problem can be solved in O(M^3n^3 + M^2m).
Another result given in this paper is a distributed algorithm for the optimization problems mentioned above. The algorithm, block coordinate descent, is analogous to steepest descent, except that each step only moves along the block of coordinates belonging to one view, using the messages (predictions on the unlabelled data) sent by the classifier sitting at another site.
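To get a feel for the block structure, here is a minimal sketch in Python/NumPy. I swap the hinge loss for a squared loss so each block update has a closed-form solve, and all names (co_regularized_bcd, reg, agree, ...) are mine, not the paper's; treat it as an illustration of block coordinate descent over views, not as the authors' algorithm.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def co_regularized_bcd(X_lab_views, y, X_unl_views,
                       reg=1.0, agree=1.0, n_sweeps=20, kern_gamma=1.0):
    """Block coordinate descent for a co-regularized least-squares sketch.

    X_lab_views / X_unl_views: lists (one entry per view, at least two views)
    of feature matrices for the n labelled and m unlabelled points; y: array
    of labels in {-1, +1}.  Each view keeps a coefficient vector over the
    labelled points only (the restricted-subspace trick); one block update
    fixes the other views' predictions on the unlabelled data and solves an
    n x n linear system.
    """
    M, n = len(X_lab_views), len(y)
    K_ll = [rbf_kernel(Xl, Xl, kern_gamma) for Xl in X_lab_views]        # n x n
    K_ul = [rbf_kernel(Xu, Xl, kern_gamma)
            for Xu, Xl in zip(X_unl_views, X_lab_views)]                 # m x n
    alphas = [np.zeros(n) for _ in range(M)]
    msgs = [K_ul[v] @ alphas[v] for v in range(M)]   # predictions on unlabelled data

    for _ in range(n_sweeps):
        for v in range(M):
            # Sum of the other views' current predictions (the "messages"
            # that would arrive from the other sites in the distributed setting).
            others = sum(msgs[w] for w in range(M) if w != v)
            A = (K_ll[v].T @ K_ll[v] + reg * K_ll[v]
                 + agree * (M - 1) * K_ul[v].T @ K_ul[v])
            b = K_ll[v].T @ y + agree * K_ul[v].T @ others
            alphas[v] = np.linalg.solve(A + 1e-8 * np.eye(n), b)
            msgs[v] = K_ul[v] @ alphas[v]            # refresh this view's message
    return alphas
```

Note how little each block needs from the others: only their current predictions on the unlabelled points, which is exactly the kind of small message that makes a per-site, distributed version natural.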
The paper also lists several references on co-learning for my future study:
- Combining labeled and unlabeled data with co-training, by Blum et al. (the paper that proposed co-training)
- Learning classification with unlabeled data, by de Sa, in NIPS (the first to notice the consensus of multiple hypotheses)
- Unsupervised models for named entity classification, by Collins et al. (a variant of AdaBoost)
- PAC generalization bounds for co-training, by Dasgupta et al., in NIPS
- The value of agreement, a new boosting algorithm, by Leskes, in COLT
- Two view learning: SVM-2K, theory and practice, by Hardoon et al., in NIPS