Sunday, August 22, 2010

Probabilistic Latent Semantic Visualization: Topic Model for Visualizing Documents


This paper proposes a model based on LDA, but the Dirichlet prior over topic proportions is replaced with probabilities generated from the latent coordinates of each document and of the topics. The paper only deals with MAP estimation of the latent coordinates, which can be solved via an EM-like algorithm.
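To make the replacement of the Dirichlet prior concrete, here is a minimal sketch of how topic proportions could be derived from coordinates. It assumes, as I recall the paper doing, that a document's probability for each topic decays with the squared Euclidean distance between the two points and is normalized over topics. The function name `topic_proportions` and the example coordinates are my own.

```python
import math


def topic_proportions(doc_xy, topic_xys):
    """Return P(z | x_d, Phi), taken proportional to exp(-0.5 * ||x_d - phi_z||^2).

    doc_xy    : coordinates of one document in the visualization space
    topic_xys : list of topic coordinates in the same space
    (The distance-based form is an assumption of this sketch.)
    """
    logits = [
        -0.5 * sum((a - b) ** 2 for a, b in zip(doc_xy, phi))
        for phi in topic_xys
    ]
    # Subtract the max logit before exponentiating, for numerical stability.
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 2-D layout: three topics and one document placed near the first.
topics = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
props = topic_proportions((0.5, 0.2), topics)
```

A document that sits close to a topic in the plane gets most of its probability mass from that topic, so the layout itself encodes the topic mixture.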

Learning is simple for the distribution of words conditioned on topics, which has an analytic solution, but harder for the latent coordinates, which have no closed-form update and must be optimized with gradient-based numerical methods.
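The contrast between the two updates can be sketched in one EM-like iteration. This is only an illustration under my own assumptions, not the paper's exact procedure: topic proportions use the distance-based form, `alpha` is a plain smoothing pseudo-count, `gamma` is the weight of a zero-mean Gaussian prior on document coordinates, the coordinate update is a single fixed-step gradient ascent move, and topic coordinates are held fixed for brevity. All names are hypothetical.

```python
import math


def topic_proportions(x, phis):
    # Assumed form: P(z | x) proportional to exp(-0.5 * ||x - phi_z||^2).
    logits = [-0.5 * sum((a - b) ** 2 for a, b in zip(x, phi)) for phi in phis]
    m = max(logits)
    w = [math.exp(l - m) for l in logits]
    s = sum(w)
    return [v / s for v in w]


def em_like_step(counts, xs, phis, theta, lr=0.01, gamma=0.01, alpha=0.01):
    """One iteration. counts[d][w]: word counts, xs[d]: document coordinates,
    phis[z]: topic coordinates (kept fixed here), theta[z][w]: P(w | z)."""
    Z, W = len(theta), len(theta[0])
    new_theta = [[alpha] * W for _ in range(Z)]
    new_xs = []
    for d, x in enumerate(xs):
        p = topic_proportions(x, phis)
        c = [0.0] * Z  # expected number of tokens in document d per topic
        for w in range(W):
            n = counts[d][w]
            if n == 0:
                continue
            # E-step: responsibility q(z | d, w) proportional to P(z | x_d) P(w | z).
            joint = [p[z] * theta[z][w] for z in range(Z)]
            norm = sum(joint)
            for z in range(Z):
                q = joint[z] / norm
                c[z] += n * q
                new_theta[z][w] += n * q  # accumulate for the analytic update
        # Gradient of the expected log posterior with respect to x_d.
        n_d = sum(counts[d])
        grad = [
            sum((c[z] - n_d * p[z]) * (phis[z][k] - x[k]) for z in range(Z))
            - gamma * x[k]
            for k in range(len(x))
        ]
        new_xs.append([x[k] + lr * grad[k] for k in range(len(x))])
    # Analytic M-step for the word distributions: normalize expected counts.
    new_theta = [[v / sum(row) for v in row] for row in new_theta]
    return new_xs, new_theta


# Toy data: two documents, three words, two topics on a line.
counts = [[5, 0, 1], [0, 4, 2]]
xs = [[0.1, 0.0], [-0.1, 0.0]]
phis = [[1.0, 0.0], [-1.0, 0.0]]
theta = [[0.6, 0.1, 0.3], [0.1, 0.6, 0.3]]
new_xs, new_theta = em_like_step(counts, xs, phis, theta)
```

The word distributions are obtained by simply normalizing expected counts, while each document coordinate only moves a small step along its gradient, which is why the coordinate part dominates the cost of learning.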

The idea is interesting, though. Instead of seeking a representation of the documents in the topic space learned by LDA-like models, the paper models the visualization directly via a probabilistic graphical model.
