Thursday, October 7, 2010

PLDA: Parallel Latent Dirichlet Allocation for Large-Scale Applications


This paper discusses two implementations of PLDA, one using MPI and the other using MapReduce. It seems that Google's MapReduce framework does not support iteration natively either. The paper mentions a comparison of LDA inference via variational Bayes, expectation propagation (EP), and Gibbs sampling, but the comparison strikes me as rather strange. I would like to run that comparison myself sometime later.
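
To make the parallel scheme concrete for myself, here is a minimal sketch of the approximate distributed Gibbs sampling pattern that, as far as I understand, this kind of implementation builds on: documents are split into shards, each worker samples against its own copy of the word-topic counts, and the count changes are summed at the end of every iteration. This is not the authors' code. The function names (`init_state`, `sweep_shard`, `run_plda_sketch`), the hyperparameter values, and the round-robin sharding are all my own assumptions, and the "workers" are simulated one after another in a single process, where a real system would run them concurrently and merge with an MPI AllReduce or a reduce step.

```python
import random


def init_state(docs, num_topics, vocab_size, rng):
    """Randomly assign a topic to every token and build the count tables."""
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    word_topic = [[0] * num_topics for _ in range(vocab_size)]
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            word_topic[w][k] += 1
            topic_total[k] += 1
    return z, doc_topic, word_topic, topic_total


def sweep_shard(shard, docs, z, doc_topic, word_topic, topic_total,
                alpha, beta, rng):
    """One collapsed Gibbs sweep over a shard of documents.

    The worker samples against a private copy of the word-topic counts
    and returns only the change (delta) it made to them.
    """
    vocab_size, num_topics = len(word_topic), len(topic_total)
    local_wt = [row[:] for row in word_topic]  # private, possibly stale copy
    local_tt = topic_total[:]
    for d in shard:
        for i, w in enumerate(docs[d]):
            old = z[d][i]
            # Remove the token's current assignment from the counts.
            doc_topic[d][old] -= 1
            local_wt[w][old] -= 1
            local_tt[old] -= 1
            # Collapsed Gibbs conditional, up to a normalizing constant.
            weights = [
                (doc_topic[d][k] + alpha) * (local_wt[w][k] + beta)
                / (local_tt[k] + vocab_size * beta)
                for k in range(num_topics)
            ]
            new = rng.choices(range(num_topics), weights=weights)[0]
            z[d][i] = new
            doc_topic[d][new] += 1
            local_wt[w][new] += 1
            local_tt[new] += 1
    return [[local_wt[w][k] - word_topic[w][k] for k in range(num_topics)]
            for w in range(vocab_size)]


def run_plda_sketch(docs, num_topics, vocab_size, num_workers=2,
                    iterations=5, alpha=0.1, beta=0.01, seed=0):
    """Simulate the shard / sample / merge loop in a single process."""
    rng = random.Random(seed)
    z, doc_topic, word_topic, topic_total = init_state(
        docs, num_topics, vocab_size, rng)
    # Round-robin assignment of documents to workers (an assumption).
    shards = [list(range(p, len(docs), num_workers))
              for p in range(num_workers)]
    for _ in range(iterations):
        # Every worker starts from the same global counts in this iteration.
        deltas = [sweep_shard(s, docs, z, doc_topic, word_topic, topic_total,
                              alpha, beta, rng) for s in shards]
        # Merge step: sum the per-worker deltas into the global counts.
        for delta in deltas:
            for w in range(vocab_size):
                for k in range(num_topics):
                    word_topic[w][k] += delta[w][k]
        topic_total[:] = [sum(word_topic[w][k] for w in range(vocab_size))
                          for k in range(num_topics)]
    return z, doc_topic, word_topic, topic_total
```

Documents are given as lists of integer word ids, for example `run_plda_sketch([[0, 1, 2], [2, 3, 3]], num_topics=2, vocab_size=4)`. The document-topic counts never need to be exchanged because each document lives on exactly one worker; only the word-topic table is shared, which is why the merge step is the natural place for communication in either the MPI or the MapReduce version.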


Some models extend naturally to a parallel version, as many Monte Carlo methods do. The authors' implementation shares many similarities with another version written by my colleagues. Wen-yen is now also one of my colleagues. Small world!
