Thursday, March 26, 2009

Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces


This paper presents one way of doing SDR (sufficient dimensionality reduction) using KDR (kernel dimensionality reduction), though it takes a somewhat different path to reach that conclusion. I will only skim the earlier nonparametric version.

The model proposed by the authors is semi-parametric in flavor. Sufficiency of a dimensionality reduction for a regression problem means that y is independent of x conditioned on the projection of x onto a subspace (write the projection as u = Px and let v be the residual component). Therefore we either maximize the mutual information between y and u or minimize the mutual information between y and v. In the KDR technique, this mutual information is approximated with kernel CCA or the KGV (kernel generalized variance).
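
To make the setup concrete, here is a tiny synthetic example of my own (not from the paper): y depends on x only through a single direction u = B^T x, so y is conditionally independent of x given u, which is exactly the sufficiency condition above. The choice of B, the sine link and the noise level are arbitrary illustrations.

```python
# Toy illustration of the sufficiency condition (my own sketch, not the
# paper's code): y depends on x only through u = B^T x, so the residual
# directions v carry no extra information about y.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 10

B = np.zeros((d, 1))
B[0, 0] = 1.0                  # assumed "true" one-dimensional subspace

x = rng.normal(size=(n, d))
u = x @ B                      # effective direction, u = B^T x
y = np.sin(u[:, 0]) + 0.1 * rng.normal(size=n)   # y depends on x only through u

# SDR would try to recover the column space of B from (x, y) alone.
```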

The new derivation starts with two definitions: one for the covariance operator and the other for the conditional covariance operator. These operators formulate the idea that ``two random variables are independent when arbitrary functions of them are uncorrelated,'' but in a more abstract and rigorous way. With these concepts we know that the conditional covariance operator ΣYY|U is no less than ΣYY|X (in the operator order), and equality holds when Y is conditionally independent of X given U, i.e. when V carries no further information about Y. Likewise, ΣYX|U = 0 when Y is independent of V.
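
For concreteness, here is how I read the two definitions and the ordering result, written out in LaTeX; the rigorous treatment of the inverse (and of operator ranges) is glossed over in this sketch.

```latex
% Cross-covariance operator \Sigma_{YX} : \mathcal{H}_X \to \mathcal{H}_Y,
% characterized by covariances of arbitrary functions in the two RKHSs:
\langle g, \Sigma_{YX} f \rangle_{\mathcal{H}_Y}
  = \operatorname{Cov}\bigl[ f(X),\, g(Y) \bigr]
  \quad \text{for all } f \in \mathcal{H}_X,\ g \in \mathcal{H}_Y

% Conditional covariance operator (inverse understood formally):
\Sigma_{YY|X} = \Sigma_{YY} - \Sigma_{YX}\, \Sigma_{XX}^{-1}\, \Sigma_{XY}

% Ordering and equality condition, with U = B^{\top} X:
\Sigma_{YY|U} \succeq \Sigma_{YY|X}, \qquad
\Sigma_{YY|U} = \Sigma_{YY|X}
  \;\Longleftrightarrow\; Y \perp\!\!\!\perp X \mid U
```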

When we concretize these operators with Gram matrices on a finite sample, we obtain the same formulation as in KDR, so the optimization over the projection is carried out in the same way. The procedure is analogous to feature selection and can also be applied to variable selection.
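
As a rough sketch of what "concretizing with matrices" looks like, the snippet below evaluates a KDR-style objective: the trace of a regularized conditional-covariance term built from centered Gram matrices of y and u = xB. The Gaussian kernel, the bandwidth sigma, the regularizer eps and the function names are my assumptions for illustration, not the paper's code; in practice this quantity is minimized over projection matrices B with orthonormal columns.

```python
# Hedged sketch of the Gram-matrix form of a KDR-style objective.
import numpy as np

def centered_gram(z, sigma=1.0):
    """Centered Gaussian-kernel Gram matrix of the rows of z."""
    sq = np.sum(z**2, axis=1)
    k = np.exp(-(sq[:, None] + sq[None, :] - 2 * z @ z.T) / (2 * sigma**2))
    n = len(z)
    h = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    return h @ k @ h

def kdr_objective(B, x, y, eps=1e-3, sigma=1.0):
    """Tr[ G_Y (G_U + n*eps*I)^(-1) ] with U = x @ B; smaller is better."""
    n = len(x)
    Gy = centered_gram(y.reshape(n, -1), sigma)
    Gu = centered_gram(x @ B, sigma)
    return np.trace(Gy @ np.linalg.inv(Gu + n * eps * np.eye(n)))
```

On the toy data from the earlier sketch, kdr_objective(B, x, y) evaluated at the true direction should typically come out smaller than at a random direction; the optimization over B (e.g. gradient descent with re-orthonormalization of its columns) exploits exactly this.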
