Friday, January 2, 2009

Exploiting Generative Models in Discriminative Classifiers


This is a classic semi-supervised style of learning. First, a generative model p(x | θ) is trained on all samples, without using labels. From the fitted model one obtains the Fisher score U_x = ∇_θ log p(x | θ) and the Fisher information matrix I = E[U_x U_xᵀ], which is equivalently the negative expected Hessian of the log-likelihood. With these, a kernel can be introduced as a quadratic form of the scores:
k(x, z) = U_xᵀ I⁻¹ U_z
This Fisher kernel can then be exploited by downstream discriminative models, such as GLIMs and SVMs; a minimal end-to-end sketch follows.
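To make the pipeline concrete, here is a minimal sketch assuming a diagonal Gaussian as the generative model; the helper names (fisher_scores, fisher_kernel) and the toy data are illustrative, not from the post. For the Gaussian mean parameter the score is (x − μ)/σ², and the information matrix is estimated empirically as the average outer product of scores.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# 1. Train the generative model (a diagonal Gaussian, assumed here
#    for illustration) on unlabelled data.
X_unlab = rng.normal(loc=1.0, scale=2.0, size=(500, 3))
mu = X_unlab.mean(axis=0)
var = X_unlab.var(axis=0)

def fisher_scores(X):
    # Score w.r.t. the mean: U_x = grad_mu log N(x; mu, diag(var))
    #                            = (x - mu) / var, computed per row.
    return (X - mu) / var

# 2. Estimate the Fisher information I = E[U_x U_x^T] from the scores.
U = fisher_scores(X_unlab)
I_hat = U.T @ U / len(U)
I_inv = np.linalg.inv(I_hat)

def fisher_kernel(A, B):
    # k(x, z) = U_x^T I^{-1} U_z, for all pairs of rows of A and B.
    return fisher_scores(A) @ I_inv @ fisher_scores(B).T

# 3. Plug the kernel into a discriminative classifier (an SVM).
X_lab = rng.normal(loc=0.0, scale=2.0, size=(100, 3))
y_lab = (X_lab.sum(axis=1) > 0).astype(int)   # toy labels
svm = SVC(kernel=fisher_kernel).fit(X_lab, y_lab)
print(svm.score(X_lab, y_lab))
```

In practice the information matrix is often approximated by the identity for convenience; the empirical estimate above keeps the sketch faithful to the definition of the kernel.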

It is easy to prove that the proposed kernel has several nice properties:
  • It is a valid kernel: the Gram matrix is positive semi-definite because I⁻¹ is.
  • Like Jeffreys's prior, it is invariant under any continuously differentiable reparameterization of the model (see the derivation after this list).
  • A kernel classifier employing this kernel, when the generative model treats the label as a latent variable, is asymptotically at least as good as the MAP labelling under that model.
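To make the invariance claim concrete, here is the standard one-line argument (a sketch, not from the post): under a smooth, invertible reparameterization the score picks up one Jacobian factor while the information matrix picks up two, and the quadratic form cancels them.

```latex
% Invariance of k(x, z) under a reparameterization \theta' \mapsto \theta(\theta'),
% with invertible Jacobian J = \partial\theta / \partial\theta'.
\begin{align*}
U'_x &= \nabla_{\theta'} \log p(x \mid \theta(\theta')) = J^\top U_x \\
I'   &= \mathbb{E}\big[ U'_x {U'_x}^{\top} \big] = J^\top I J \\
k'(x, z) &= {U'_x}^{\top} (I')^{-1} U'_z
          = U_x^\top J \, (J^\top I J)^{-1} J^\top U_z
          = U_x^\top I^{-1} U_z = k(x, z)
\end{align*}
```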
