Friday, February 12, 2010

Generalized Nonnegative Matrix Approximation with Bregman Divergences


This paper introduces a family of objectives in the form of Bregman divergences, and the family is remarkably broad. For any continuously differentiable and strictly convex function \phi(\cdot), the Bregman divergence is defined as
B_\phi(p \parallel q) = \phi(p) - \phi(q) - \langle \nabla \phi(q), p - q\rangle
which can be read as the gap between the actual change in \phi and its linear approximation at q. The Bregman divergence has many interesting properties worth checking out. This paper uses it to measure the goodness of the NMF approximation, which naturally generalizes the original NMF objectives from the paper reviewed in the previous post. Note that either B_\phi(x_i \parallel Wv_i) or B_\phi(Wv_i \parallel x_i) can serve as the objective function, but the former requires less computational work.
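To make the definition concrete, here is a tiny Python sketch (mine, not from the paper) that evaluates a separable Bregman divergence for a user-supplied generator \phi; choosing \phi(t) = t^2/2 recovers half the squared Euclidean distance, and \phi(t) = t \log t recovers the generalized KL divergence of the original NMF objective.

import numpy as np

def bregman(p, q, phi, grad_phi):
    # Separable Bregman divergence: sum_i phi(p_i) - phi(q_i) - phi'(q_i) (p_i - q_i).
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(phi(p) - phi(q) - grad_phi(q) * (p - q))

p = np.array([1.0, 2.0, 3.0])
q = np.array([1.5, 1.5, 2.5])

# phi(t) = t^2/2 -> half the squared Euclidean distance
print(bregman(p, q, lambda t: 0.5 * t**2, lambda t: t))
# phi(t) = t log t -> generalized KL divergence (for positive entries)
print(bregman(p, q, lambda t: t * np.log(t), lambda t: np.log(t) + 1.0))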

The most surprising fact is that the auxiliary function used to derive the multiplicative update rule can also be extended to the general Bregman-divergence case, as long as the auxiliary function satisfies the same conditions
G(x,x) = f(x), \qquad G(x, y) \geq f(x), \forall y.
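(These two conditions are exactly what the usual majorize-minimize argument needs: if v^{(t+1)} = \arg\min_v G(v, v^{(t)}), then
f(v^{(t+1)}) \leq G(v^{(t+1)}, v^{(t)}) \leq G(v^{(t)}, v^{(t)}) = f(v^{(t)}),
so the objective never increases.)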
The authors work with the following f(\cdot),
f(v) = B_\phi(Wv, x)
given the approximation problem
X = (x_1 ,\ldots, x_N) \approx W V = W(v_1, \ldots, v_N).
The function G(\cdot, \cdot) is carefully designed to mirror the original auxiliary function,
G(v, u) = \sum_{i, j} \lambda_{i, j}\, \phi\!\left( \frac{W_{i, j} v_j}{\lambda_{i, j}} \right) - \sum_i \left( \phi(x_i) + \phi'(x_i)\big((W v)_i - x_i\big)\right), \qquad \lambda_{i, j} = \frac{W_{i, j} u_j}{(W u)_i},
where the weights \lambda_{i, j} sum to one over j, so Jensen's inequality gives G(v, u) \geq f(v) and G(u, u) = f(u).
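Setting the derivative of G(v, u) with respect to v_j to zero then gives (if I did the calculus right) the implicit update
\sum_i W_{i, j}\, \phi'\!\left( \frac{v_j}{u_j} (W u)_i \right) = \sum_i W_{i, j}\, \phi'(x_i),
which admits a closed-form solution for v_j for particular choices of \phi, e.g. \phi(t) = t \log t.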
They also derive another multiplicative rule directly from the KKT conditions; a rough sketch of what such a rule looks like is below.
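This is my own reconstruction in Python, not code from the paper: for the B_\phi(x_i \parallel Wv_i) direction, the update reweights the usual numerator and denominator by the curvature \phi'' of the generator evaluated at the current approximation WV. With \phi(t) = t^2/2 it reduces to the Lee-Seung Frobenius-norm rule, and with \phi(t) = t \log t to their KL rule.

import numpy as np

def multiplicative_update_V(X, W, V, phi_second, eps=1e-12):
    # One multiplicative update of V for minimizing sum_i B_phi(x_i || (W V)_i),
    # with phi applied elementwise; phi_second is the second derivative of phi.
    WV = W @ V
    Z = phi_second(WV)              # curvature weights evaluated at the current fit
    numer = W.T @ (Z * X)
    denom = W.T @ (Z * WV) + eps    # small eps guards against division by zero
    return V * numer / denom

# Toy usage with the KL-type generator phi(t) = t log t, so phi''(t) = 1/t.
rng = np.random.default_rng(0)
X = rng.random((20, 30))
W = rng.random((20, 5))
V = rng.random((5, 30))
for _ in range(200):
    V = multiplicative_update_V(X, W, V, phi_second=lambda t: 1.0 / t)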

Overall, this is a really interesting use of the Bregman divergence.
