Wednesday, September 24, 2008

Spectral Clustering with Inconsistent Advice


This paper is about semi-supervised clustering. The two main algorithms are spectral clustering and 2-correlation clustering problem (2CC). The main technique used here is SDP relaxation, which is very interesting.

The basic idea of SDP relaxation: Usually a combinatoric optimization problem results in a minimization of a quadratic form, where the variables are 0-1. The relaxation allows the variables to take real values. Then the optimization could be turned into a SDP problem by exploiting the trace equation: tr XT S X = tr S G, where G = X XT. With this formulation of the original problem G would substitute X. We could derive new optimization in SDP form and later transform it back to the original problem which can be efficiently solved by eigen decomposition.

It is famous that spectral clustering does the relaxation part and simplifies the NP-hard problem into an eigenvalue problem. Somehow, given consistent advice, we could solve the semi-supervised learning problem in a subspace that accords with the advice. But for inconsistent advice, it would be infeasible to apply the technique.

Several solutions are proposed in this paper. One is simple enough. 2CC could resolve the inconsistancies, which reenables the usage of subspace method with spectral clustering.

The other two ideas exploit SDP relaxation. The 2CC optimization could be relaxed in the same way as spectral clustering, resulting in a SDP problem. Combining the two SDP, optimizing over the clustering and putting 2CC in the constraints by limiting the upper bound, we get the SDP formulation of the problem. The imposed constraint in the spectral form turns out to be the subspace constraint of a block of advice graph. Compared with the original subspace method, it's similar (but it's blockwise and the eigen vectors of another matrix.) However this method doesn't yield a flexible model that could vary the importance of the advice. By multiplying a scalar (>1), and searching in a larger space (spanned by eigen vectors of the smallest eigen value multiplied by the scalar), the relaxation is done.

I think SDP might be really something useful in the spectrum related algorithms. Graph-related algorithms and similar ideas might all benefit from the idea.