Friday, April 10, 2009

An Introduction to Variational Methods for Graphical Models


This paper describes in details the variational methods for graphical models. There are basically two important applications of variational methods (here it refers to variational bounds): one for Bayesian models where the posterior probability is hard to evaluate and therefore is a replacement for the slow MCMC methods; the other for graphical models where too many dependencies (edges in the graph) give rise to a difficult inference problem. Here the authors describe two styles of variational bounds. No matter which is employed, the ultimate objective is the same: to simplify the graph by eliminating some edges since the bound usually provides an easier way of computation.

The first bound is to exploit the Fenchel-Legendre transform of a convex/concave function. Generally speaking, a convex (concave) function has a natural upper (lower) bound, which is the tangent hyperplane at each point on the graph of the function. The so-called Fenchel-Legendre transform (or variational transform, as referred to in the paper) of a function is
\hat{f}(z) = \min_x z^\top x - f(x)
which is the intercept of the the function f where the tangent is z. Therefore
f(x) \geq \langle z, x \rangle + \hat{f}(z),
which is a lower bound for the function and z is the variational parameter. There are two examples in the paper:
  • The QMR-DT is a disease-symptom bipartite Bayesian belief net. Due to the ``explaining away'' problem (a``v'' shape structure, when the child is not given the parental nodes are independent but when the child is given, they become dependent and the sum over these variables will be difficult since they are coupled), we have to simplify the probability \Pr(f_i=1 \mid d ) = 1 - \exp\left( -\sum_{j \in \mathrm{pa}(i)} \theta_{i,j} d_j - \theta_{i, 0}\right) with an upper bound.
  • The Boltzmann machine can also be solved with this method.They derive a lower bound as for logistic function, which can delink all neighbours of a single r.v. but an upper bound, though easy to compute and decouple the r.v. with its neighbors, connects all its neighbors.
The second bound is to exploit the Jensen's inequality,
\log \Pr(x) = \log \left( \sum_z \Pr(x, z) \right) = \log \left( \sum_z q(z) \cdot \frac{\Pr(x, z)}{q(z)} \right) \geq \sum_z q(z) \log \frac{\Pr(x, z)}{q(z)},
which resembles the standard derivation of EM. Since x is observed, we usually introduce a variational parameter for q(z \mid \lambda) and maximize the lower bound by selecting a proper \lambda. The paper illustrates the idea with several mode examples:
  • Boltzmann machine. Now we may put all the observed r.v.s to x and all latent r.v.s. The proposed q(z) is a decomposed binomial distribution and the parameter, i.e. the moment isthe variational parameter.
  • Neural network, here referring to sigmoid belief network. Just like Boltzmann machine (the difference is one is MRF while the other is a belief netowrk).
  • Factorial hidden Markov model. Since the posteror distribution of the hidden variables couples, the proposed q(z) is a decoupled prior with adjustable parameters (variational parameters).
  • Hidden Markov decision tree.
This idea has relationships with other known models:
  • Helmholtz machine: it uses two set of parameter for the generative model and recognition model instead of computing the posterior probability with Bayes' theorem.
  • Sampling method: they compare Gibbs sampler and variational methods. Gibbs sampler is said to pass messages of samples (conditioned on all other variables) while variational bounds pass statistics themselves (or the parameters). Not sure, I think the global variational method is closer to this point.
  • Bayesian methods: they are actually referring to the global variational method. Not clear why it is not mentioned here. Maybe it not their work -,-b

16 comments:

Anonymous said...

ambien order ambien dosage pregnancy - ambien sleep apnea

Anonymous said...

buy ativan treatment for ativan overdose - can you buy ativan online

Anonymous said...

buy ativan online ativan nursing implications - ativan withdrawal ringing ears

Anonymous said...

where can i buy xanax online xanax klonopin together - purchase xanax online canada

Anonymous said...

buy zolpidem online zolpidem japan - zolpidem generic not working

Anonymous said...

order ativan ativan no prescription needed - ativan drug overdose

Anonymous said...

ativan pills lorazepam 1mg tab ran - ativan help quit smoking

Anonymous said...

diazepam online what drug category is diazepam - diazepam dosage for dogs with seizures

Anonymous said...

diazepam online buy diazepam 10mg - diazepam veterinary

Anonymous said...

discount ativan ativan withdrawal drug - long do ativan withdrawal last

Anonymous said...

buy lorazepam online ativan side effects urination - ativan half life in urine

Anonymous said...

buy ambien online generic ambien cr 12.5 mg - ambien side effects men

Anonymous said...

ambien 5 ambien dosage 5 or 10 mg - ambien klonopin alcohol

Anonymous said...

valium online no prescription xanax vs valium drug - valium show up drug test

Anonymous said...

soma muscle carisoprodol injection - carisoprodol 350 mg info

Anonymous said...

valium pill better anxiety valium klonopin - valium dosage limit