This is a classic semi-supervised style of learning. First a generative model P(x | θ) is trained on all samples, ignoring the labels. From the fitted parameters θ the Fisher information matrix I can be computed (the expected outer product of the score, or equivalently the negative expected Hessian of the log-likelihood). With this, a kernel can be introduced as a quadratic form of the score U_x = ∇_θ log P(x | θ):
k(x, z) = U_x^T I^{-1} U_z
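To make this concrete, here is a minimal sketch for a toy model, a 1-D Gaussian N(mu, sigma^2) fit by maximum likelihood. The choice of model (and hence the closed-form score and Fisher information below) is my own illustration, not anything fixed by the setup:

```python
import numpy as np

# Toy model: 1-D Gaussian N(mu, var) with parameters theta = (mu, var).
# Score: U_x = [ (x - mu) / var,
#                -1/(2 var) + (x - mu)^2 / (2 var^2) ]
# Fisher information (expected outer product of the score):
#   I = diag( 1/var, 1/(2 var^2) )

def fisher_score(x, mu, var):
    """Gradient of log N(x | mu, var) with respect to (mu, var)."""
    return np.array([
        (x - mu) / var,
        -0.5 / var + (x - mu) ** 2 / (2 * var ** 2),
    ])

def fisher_kernel(x, z, mu, var):
    """k(x, z) = U_x^T I^{-1} U_z for the 1-D Gaussian model."""
    I_inv = np.diag([var, 2 * var ** 2])  # inverse of diag(1/var, 1/(2 var^2))
    return fisher_score(x, mu, var) @ I_inv @ fisher_score(z, mu, var)

# Fit the generative model on unlabeled samples by maximum likelihood,
# then evaluate the kernel between two points.
samples = np.random.default_rng(0).normal(1.0, 2.0, size=500)
mu_hat, var_hat = samples.mean(), samples.var()
print(fisher_kernel(0.3, -1.2, mu_hat, var_hat))
```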
The kernel can then be exploited by downstream discriminative models, such as GLIMs (generalized linear models) and SVMs; a usage sketch follows the list below. It can be shown that the proposed kernel has several nice properties:
- It is a valid kernel: the quadratic form is symmetric and positive semi-definite, since I^{-1} is.
- Like Jeffreys's rule, it is invariant under any continuously differentiable reparameterization of the model.
- A kernel classifier employing this kernel, derived from a model in which the label is a latent variable, is asymptotically at least as good as the MAP labelling obtained from that model.
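Continuing the hypothetical Gaussian sketch above (it reuses fisher_kernel, mu_hat, and var_hat from there), the Fisher kernel slots into an off-the-shelf SVM through a precomputed Gram matrix; the labeled points below are made up for illustration:

```python
from sklearn.svm import SVC

# Gram matrix of Fisher-kernel values between two sets of points.
def gram(X, Z, mu, var):
    return np.array([[fisher_kernel(x, z, mu, var) for z in Z] for x in X])

X_train = np.array([-2.0, -1.5, 0.9, 1.4])  # a few labeled points
y_train = np.array([0, 0, 1, 1])
X_test = np.array([-1.0, 1.1])

# SVC with kernel="precomputed" takes the train-train Gram matrix at fit
# time and the test-train kernel values at predict time.
clf = SVC(kernel="precomputed")
clf.fit(gram(X_train, X_train, mu_hat, var_hat), y_train)
print(clf.predict(gram(X_test, X_train, mu_hat, var_hat)))
```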