Wednesday, July 15, 2009

On Primal and Dual Sparsity of Markov Networks


This paper is mainly about the relationships among several variants of M3N (max-margin Markov networks). Primal sparsity comes from an L1-norm penalty, while dual sparsity comes from the margin constraints (via the KKT conditions). Adding an L1-norm penalty to M3N therefore yields both kinds of sparsity, which helps generalization and selects important features.
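
As a rough sketch (notation mine, following the usual max-margin setup, not copied from the paper), the L1-normed M3N primal looks like

\min_{w,\xi}\; \lambda \|w\|_1 + \sum_i \xi_i \quad\text{s.t.}\quad w^\top \Delta f_i(y) \ge \Delta\ell_i(y) - \xi_i,\ \ \xi_i \ge 0,\quad \forall i,\ \forall y \ne y_i,

where \Delta f_i(y) = f(x_i, y_i) - f(x_i, y). The L1 penalty drives many entries of w to zero (primal sparsity), while complementary slackness leaves only a few dual variables, one per margin constraint, nonzero (dual sparsity).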

The LapM3N proposed by the authors earlier is a Bayesian version of M3N. Its MAP estimator recovers M3N with different penalties depending on the prior, e.g. an L2 norm corresponds to a Gaussian prior, and an L1 norm corresponds to a Laplace prior with its parameter going to infinity.
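
For concreteness (a standard fact, not specific to this paper), the MAP estimate is

\hat{w}_{\mathrm{MAP}} = \arg\max_w\; \log p(\mathcal{D} \mid w) + \log p(w),

so a Gaussian prior p(w) \propto \exp(-\|w\|_2^2 / 2\sigma^2) contributes an L2 penalty, while a Laplace prior p(w) \propto \exp(-\lambda \|w\|_1) contributes an L1 penalty.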

There is also a connection between the L1 norm and sparse Bayesian learning: placing an adaptive prior on each parameter of M3N (think of the RVM) yields a sparse solution. The adaptive M3N turns out to give the same result as the L1-normed M3N.
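
The connection goes through the scale-mixture view of the Laplace distribution (again a standard fact, stated in my own notation): give each weight its own Gaussian variance and put an exponential hyperprior on that variance,

p(w_k) = \int_0^\infty \mathcal{N}(w_k \mid 0, \tau_k)\, \frac{\lambda^2}{2}\, e^{-\lambda^2 \tau_k / 2}\, d\tau_k \;\propto\; e^{-\lambda |w_k|},

so the RVM-style adaptive prior marginalizes to exactly the Laplace prior, i.e. the L1 penalty at the MAP.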

The authors propose an EM-like algorithm for training the L1-normed M3N. Naturally, it has a connection to variational Bayesian approximation.
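
To get a feel for how such an EM-like scheme can work, here is a toy sketch of my own (for L1-penalized least squares rather than the M3N QP; the function name and data are made up): the "E-step" updates a per-weight scale from the current estimate, and the "M-step" solves the resulting reweighted L2 problem.

import numpy as np

def l1_em_sketch(X, y, lam=1.0, n_iter=50, eps=1e-8):
    # Toy EM-like / reweighted-L2 scheme for
    #   min_w 0.5*||y - X w||^2 + lam*||w||_1
    # (illustrates the idea only; not the paper's M3N algorithm).
    d = X.shape[1]
    XtX, Xty = X.T @ X, X.T @ y
    w = np.linalg.solve(XtX + lam * np.eye(d), Xty)   # ridge warm start
    for _ in range(n_iter):
        inv_scale = 1.0 / (np.abs(w) + eps)           # "E-step": per-weight scales
        w = np.linalg.solve(XtX + lam * np.diag(inv_scale), Xty)  # "M-step": reweighted L2
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]                         # sparse ground truth
y = X @ w_true + 0.1 * rng.normal(size=100)
print(np.round(l1_em_sketch(X, y), 2))                # small weights shrink toward 0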

Maybe we should write our own structured learning tools.

2 comments:

Anonymous said...

Don't you lose the ability to use RKHS when you switch from the L2-norm to the L1-norm?

Unknown said...

that's true...

but their applications are mainly from NLP, where linear kernels are usually enough

but that's something, L1 norm for RKHS... I haven't thought about it... is there some alternative?