Monday, May 11, 2009

Measuring Statistical Dependence with Hilbert-Schmidt Norms


This paper uses the Hilbert-Schmidt norm of the cross-covariance operator instead of the operator norm (which corresponds to the COCO independence test) from the previously discussed paper, yielding the so-called HSIC (Hilbert-Schmidt independence criterion), which is in practice much easier to calculate. Whereas COCO uses the square root of the largest eigenvalue of \tilde{K}^{(x)} \tilde{K}^{(y)}, HSIC uses the whole spectrum, i.e. the trace of \tilde{K}^{(x)} \tilde{K}^{(y)}. Since the Hilbert-Schmidt (Frobenius) norm is an upper bound for the operator norm, the criterion vanishes exactly when the cross-covariance operator does, so the same independence characterization is obtained.
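To make the trace formula concrete, here is a minimal NumPy sketch of the (biased) empirical HSIC estimate, (1/m^2) tr of the product of the two centered Gram matrices. The Gaussian kernel, its width, and the toy data are my own illustrative choices, not anything from the paper:

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: (1/m^2) tr(\tilde{K}^{(x)} \tilde{K}^{(y)})."""
    m = x.shape[0]
    H = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    Kc = H @ rbf_gram(x, sigma) @ H          # centered Gram matrix of x
    Lc = H @ rbf_gram(y, sigma) @ H          # centered Gram matrix of y
    return np.trace(Kc @ Lc) / m ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x + 0.1 * rng.normal(size=(200, 1))  # strongly dependent on x
y_ind = rng.normal(size=(200, 1))            # independent of x
print(hsic(x, y_dep), hsic(x, y_ind))        # dependent pair scores higher
```

Note that everything reduces to matrix products of Gram matrices, which is why HSIC is so easy to compute compared with an eigenvalue problem.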

Please notice the relationship among KCC, COCO and HSIC. KCC uses the maximal kernel correlation as the criterion for the independence test, but a regularized version is needed to avoid the common degenerate cases where the unregularized problem does not give the desired result. COCO instead constrains both covariances to the identity, so the problem becomes seeking the largest singular value of the product of the two centered Gram matrices, whereas KCC solves a generalized eigenvalue problem; this also explains the name COCO (COnstrained COvariance). When dealing with multiple r.v.s, we can proceed in the same way CCA does. These three ideas are closely related, just as KMI and KGV are closely related in the KMI paper.
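The COCO/HSIC contrast can be seen numerically: both statistics are functions of the spectrum of the product of centered Gram matrices, with COCO keeping only the top eigenvalue and HSIC summing them all via the trace. A small sketch (kernel, data, and the 1/m scaling of COCO are my own illustrative assumptions):

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gaussian Gram matrix for the rows of x."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(1)
m = 100
x = rng.normal(size=(m, 1))
y = np.sin(x) + 0.2 * rng.normal(size=(m, 1))   # nonlinearly dependent on x

H = np.eye(m) - np.ones((m, m)) / m              # centering matrix
Kc = H @ rbf_gram(x) @ H                         # \tilde{K}^{(x)}
Lc = H @ rbf_gram(y) @ H                         # \tilde{K}^{(y)}

# product of two PSD matrices: eigenvalues are real and nonnegative
eig = np.sort(np.linalg.eigvals(Kc @ Lc).real)[::-1]

coco = np.sqrt(eig[0]) / m        # COCO: only the largest eigenvalue
hsic = eig.sum() / m ** 2         # HSIC: the whole spectrum, = tr(Kc Lc)/m^2

# the trace identity behind HSIC, and the spectral <= Frobenius-type bound
print(np.isclose(eig.sum(), np.trace(Kc @ Lc)))  # True
print(eig[0] <= eig.sum())                       # True: top eigenvalue <= trace
```

Since all eigenvalues are nonnegative, the largest one is bounded by their sum, which is the finite-sample shadow of the operator norm being bounded by the Hilbert-Schmidt norm.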
