Sunday, January 4, 2009

Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models

by Neil Lawrence

This might be the first article I have read that covers most of the GP-LVM, a model of great interest to me since Fei Sha recently mentioned its connection to some former research of mine. The idea is not really new, but to me it feels fresh.

The basic model comes from probabilistic PCA (PPCA). PPCA is a latent variable model that interprets PCA in a probabilistic framework. The latent variable z generates our observations through a linear transformation x = Wz + b + e, where e is isotropic Gaussian noise. The bias b can be estimated independently as the sample mean. In PPCA, z has a standard Gaussian prior. W and the noise variance can be estimated with the EM algorithm. So you can see that the EM algorithm actually maximizes the marginal distribution of x, with z marginalized out.
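As a concrete reference, here is a minimal numpy sketch of those EM updates, following Tipping and Bishop's formulation of PPCA; the function name and defaults are my own, not from the paper.

```python
import numpy as np

def ppca_em(X, q, n_iter=100, seed=0):
    """Minimal EM for probabilistic PCA (Tipping & Bishop).
    X: (N, D) data matrix; q: latent dimension."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    b = X.mean(axis=0)                        # bias b: just the sample mean
    Xc = X - b                                # centre the data
    W = rng.standard_normal((D, q))
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables z
        M = W.T @ W + sigma2 * np.eye(q)      # (q, q)
        Minv = np.linalg.inv(M)
        Ez = Xc @ W @ Minv                    # (N, q), rows are E[z_n]
        Ezz = N * sigma2 * Minv + Ez.T @ Ez   # sum_n E[z_n z_n^T]
        # M-step: update W and the isotropic noise variance
        W = Xc.T @ Ez @ np.linalg.inv(Ezz)
        sigma2 = (np.sum(Xc**2)
                  - 2.0 * np.sum((Xc @ W) * Ez)
                  + np.trace(Ezz @ W.T @ W)) / (N * D)
    return W, b, sigma2
```

Given the fit, the latent projection of a point x is E[z | x] = (W^T W + sigma2 I)^{-1} W^T (x - b), which recovers the PCA projection (up to rotation) as sigma2 goes to zero.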

In a linear model, it is easy to obtain both the reduction mapping and the reconstruction mapping. In a nonlinear model, we usually only get a one-way mapping; the inverse mapping is typically difficult to obtain, and the GP-LVM is no exception. We can think of the GP-LVM as follows. The first part comes from marginalizing W instead of z in the PPCA model: endow W with a similar Gaussian prior, and z can then be obtained from an eigendecomposition of X X^T. The solution actually resembles PCA and classical MDS. Unlike kernel PCA, which kernelizes the Gram matrix of the observed data (so the solution is still obtained via eigendecomposition), the GP-LVM has to find the best configuration for all the z's, the one whose kernel matrix best approximates X X^T. Therefore, although we have a quite similar optimization problem, it is now much harder to solve due to the nonlinearity of the kernel function, and a gradient-based search such as conjugate gradients (CG) must be applied.
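A minimal sketch of that optimization, assuming an RBF kernel with fixed hyperparameters (in the paper the kernel parameters are optimized jointly with the latent positions) and using scipy's CG with numerical gradients for brevity; all names here are mine:

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(Z, lengthscale=1.0, variance=1.0, noise=1e-3):
    """RBF kernel matrix over latent points Z (N, q), plus a noise/jitter term."""
    sq = np.sum(Z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2) + noise * np.eye(len(Z))

def neg_log_likelihood(z_flat, X, q):
    """Negative GP log marginal likelihood -log p(X | Z), up to a constant."""
    N, D = X.shape
    Z = z_flat.reshape(N, q)
    K = rbf_kernel(Z)
    _, logdet = np.linalg.slogdet(K)
    # 0.5 * D * log|K| + 0.5 * tr(K^{-1} X X^T)
    return 0.5 * D * logdet + 0.5 * np.sum(X * np.linalg.solve(K, X))

def fit_gplvm(X, q=2):
    N, _ = X.shape
    Xc = X - X.mean(axis=0)                  # the GP prior has zero mean
    # initialize the latent positions with PCA, as the paper does
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z0 = Xc @ Vt[:q].T
    Z0 /= Z0.std()                           # rescale to suit the unit lengthscale
    res = minimize(neg_log_likelihood, Z0.ravel(), args=(Xc, q), method="CG")
    return res.x.reshape(N, q)
```

Note that every objective evaluation costs O(N^3) because of the N-by-N kernel solve, which is exactly the cost the paper's sparse algorithm targets.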

But as we might foresee, the computational cost is intolerable for practical problems. The paper therefore focuses on designing an efficient algorithm for the GP-LVM. The most important technique they employ is the informative vector machine (IVM), which selects a comparatively small active set for the kernel method (similar in spirit to the Nyström method).
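The IVM scores points by how much differential entropy their inclusion removes; for a GP with Gaussian noise, this greedy selection behaves much like a pivoted incomplete Cholesky factorization of the kernel matrix. A rough sketch of that simplified view (my own simplification, not the paper's exact IVM updates):

```python
import numpy as np

def greedy_active_set(K, m):
    """Greedily pick m active points from an (N, N) kernel matrix K by
    largest remaining posterior variance, via pivoted incomplete Cholesky.
    A simplified stand-in for the IVM's entropy-based selection."""
    N = K.shape[0]
    active = []
    d = np.diag(K).astype(float).copy()        # residual (posterior) variances
    L = np.zeros((N, m))
    for j in range(m):
        i = int(np.argmax(d))                  # most informative remaining point
        active.append(i)
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(d[i])
        d = np.maximum(d - L[:, j] ** 2, 0.0)  # downdate the variances
    return active, L                           # rank-m approximation: K ~ L @ L.T
```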
