Monday, July 26, 2010

Dirichlet Component Analysis: Feature Extraction for Compositional Data


This paper addresses the problem of extracting features from compositional data (nonnegative features that sum to 1). The problem is interesting, though I am not sure what the major difficulties are. To get a proper projection into a lower-dimensional space, the projection has to satisfy a certain set of constraints (a balanced rearrangement). A regularization operator is devised to preserve the Euclidean geometry. The rearrangement shrinks the data, while the regularization expands them.
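To make the constraint concrete, here is a minimal sketch of a rearrangement. It assumes a rearrangement is a nonnegative L x K matrix whose columns each sum to 1, and that "balanced" means every row sums to K / L; that is my reading, not something checked against the paper. The names `rearrange`, `R`, `x`, and `y` are made up for illustration.

```python
def rearrange(x, R):
    """Map a K-part composition x to an L-part one via an L x K matrix R."""
    return [sum(R[i][k] * x[k] for k in range(len(x))) for i in range(len(R))]

# Hypothetical 2 x 4 matrix (assumption, not taken from the paper):
# entries are nonnegative, every column sums to 1,
# and every row sums to K / L = 4 / 2 = 2.
R = [
    [0.9, 0.6, 0.4, 0.1],
    [0.1, 0.4, 0.6, 0.9],
]

x = [0.1, 0.2, 0.3, 0.4]   # a 4-part composition (sums to 1)
y = rearrange(x, R)        # a 2-part composition

# Column sums of 1 guarantee that y is again nonnegative and sums to 1.
print(y, sum(y))
```

With this particular matrix, every output lies in the convex hull of the columns of R, so the first coordinate of y can never leave [0.1, 0.9]. That seems to be the sense in which the rearrangement shrinks the data.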

The optimization is quite strange (it is solved via a genetic algorithm). The maximization w.r.t. \alpha looks like an MLE, but the minimization w.r.t. the rearrangement matrix doesn't make much sense to me.
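For reference, the quantity a Dirichlet MLE over \alpha would maximize is the standard Dirichlet log-likelihood. The sketch below only evaluates that textbook formula; it does not reproduce the paper's objective or its genetic algorithm, and the function name `dirichlet_loglik` is made up for illustration.

```python
import math

def dirichlet_loglik(alpha, data):
    """Log-likelihood of compositions in `data` under Dirichlet(alpha).

    Each row of `data` must be strictly positive and sum to 1.
    """
    # Log of the normalizing constant: log Gamma(sum alpha) - sum log Gamma(alpha_k)
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    total = 0.0
    for x in data:
        total += log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return total

# With alpha = (1, 1, 1) the density is uniform on the simplex,
# so every point contributes log Gamma(3) = log 2.
print(dirichlet_loglik([1.0, 1.0, 1.0], [[0.2, 0.3, 0.5]]))
```

An MLE would search over \alpha for the value that makes this number largest on the (rearranged) data.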

I am still unsure what the intended application is.
