Wednesday, June 15, 2011

Regression-based Latent Factor Models


This paper discusses another way of exploiting the rating matrix, this time together with features of the users and items.

Given features of users and items, there are some correlations between them. The CCA idea is ``unsupervised,'' since it works on two kinds of data and finds the maximally correlated directions without any further ``supervised'' information. Then if we do have some ``supervised'' information, namely how a user actually rates some items, the problem becomes a ``supervised learning'' version. Actually, most of the literature is in this style. Haha, you might have guessed my point by now: is there any ``semi-supervised'' version? Yeah, I'd like to develop one :-)

OK, let's go back to this paper. With supervised information, the CCA-style problem becomes finding two linear mappings, one of the user features and one of the item features, into a common subspace, so that the inner product of the two mapped vectors approximates the observed rating. This might be problematic, since the raw ratings are obviously not always consistent with their face values (different users may use the rating scale quite differently). But anyway this gives us the basic idea, and it plays the same role that plain PCA does before the probabilistic step.
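To make this concrete, here is a toy sketch of the deterministic version, fit by alternating least squares. This is my own illustration, not code from the paper: the sizes, the ridge term lam, and the names G, D (user-side and item-side linear maps) are all made up, and the ratings are synthesized from hidden ``true'' maps so that a good fit exists.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, p, q, k = 50, 40, 10, 8, 3   # toy sizes, made up
X = rng.normal(size=(n_users, p))              # user features, one row per user
Z = rng.normal(size=(n_items, q))              # item features, one row per item

# Synthetic ratings from hidden "true" maps, so that a good fit exists.
G_true, D_true = rng.normal(size=(k, p)), rng.normal(size=(k, q))
R = (X @ G_true.T) @ (Z @ D_true.T).T + 0.1 * rng.normal(size=(n_users, n_items))

G, D = rng.normal(size=(k, p)), rng.normal(size=(k, q))
lam = 1e-3                                     # tiny ridge term for stability
for sweep in range(20):
    V = Z @ D.T                                # current item factors
    # With D fixed, r_ui ~ x_u^T (G^T) v_i is linear in G^T: solve for it.
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ R @ V)
    G = (A @ np.linalg.inv(V.T @ V + lam * np.eye(k))).T
    U = X @ G.T                                # current user factors
    # Symmetric step: with G fixed, solve for D.
    D = np.linalg.solve(U.T @ U + lam * np.eye(k), U.T @ R @ Z) \
        @ np.linalg.inv(Z.T @ Z + lam * np.eye(q))
    err = np.mean(((X @ G.T) @ (Z @ D.T).T - R) ** 2)
    print(sweep, round(float(err), 4))
```

Each half-sweep solves an ordinary least-squares problem exactly, so the printed squared error decreases monotonically toward the noise floor.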

The second step is to make a probabilistic model out of it (as PCA was turned into PPCA), so that everything gets an interpretation and we have a disciplined way to train the model and make inferences. The paper shows us one interpretation: there is a common latent space, and each user/item factor is drawn around a mean obtained from the original feature space via a linear transformation (plus zero-mean noise with some variance). The observed rating can then be seen as the inner product of the two factors mixed with noise. With this graphical model we may do the learning via EM; here the authors chose MCEM, using Monte Carlo samples to approximate the E-step expectation.
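Here is a rough sketch of how I read the generative story and the MCEM loop. Again this is my own toy code under stated assumptions, not the paper's implementation: every rating is observed (dense Y), the noise scales s, s_u, s_v are kept fixed rather than re-estimated, and the names sample_ratings, gibbs_factors, mcem_step are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Generative story as I read it: u_i ~ N(G x_i, s_u^2 I),
# v_j ~ N(D z_j, s_v^2 I), and rating y_ij ~ N(u_i^T v_j, s^2).
def sample_ratings(X, Z, G, D, s_u=0.5, s_v=0.5, s=0.1):
    U = X @ G.T + s_u * rng.normal(size=(X.shape[0], G.shape[0]))
    V = Z @ D.T + s_v * rng.normal(size=(Z.shape[0], D.shape[0]))
    return U @ V.T + s * rng.normal(size=(X.shape[0], Z.shape[0]))

# One Gibbs sweep for one side's factors: with the other side fixed and
# every rating observed, each factor has a Gaussian posterior, so we
# can sample it exactly.
def gibbs_factors(Y, V, prior_mean, s, s_prior):
    k = V.shape[1]
    cov = np.linalg.inv(V.T @ V / s**2 + np.eye(k) / s_prior**2)
    mean = (Y @ V / s**2 + prior_mean / s_prior**2) @ cov
    return mean + rng.normal(size=mean.shape) @ np.linalg.cholesky(cov).T

# One MCEM iteration (noise variances kept fixed for simplicity; the
# real algorithm would re-estimate them in the M-step too).
def mcem_step(Y, X, Z, G, D, s=0.1, s_u=0.5, s_v=0.5, n_samp=20, lam=1e-3):
    V = Z @ D.T                          # start the chain at the prior mean
    U_acc, V_acc = 0.0, 0.0
    for _ in range(n_samp):              # E-step: Gibbs samples of U and V
        U = gibbs_factors(Y, V, X @ G.T, s, s_u)
        V = gibbs_factors(Y.T, U, Z @ D.T, s, s_v)
        U_acc, V_acc = U_acc + U / n_samp, V_acc + V / n_samp
    # M-step: ridge-regress the averaged factors back onto the features.
    G = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ U_acc).T
    D = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ V_acc).T
    return G, D
```

Iterating mcem_step a few times on ratings drawn from sample_ratings should drive G and D toward (a rotation of) the maps that generated the data.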

Let's try some of the ideas.
