Friday, January 2, 2009
Structured Ranking Learning using Cumulative Distribution Networks
This NIPS paper further shows how the CDN idea can be used to tackle ranking problems. The authors introduce a probability distribution over the preference variables given an order relationship; a natural ranking scheme is therefore to maximize the likelihood, or equivalently to minimize the negative log-likelihood as a loss function. Once the ranking function is parameterized, the optimization problem can be solved; the paper employs a nonparametric model. The CDN here plays the role of defining the CDF of the observed preference variables.
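As a concrete (if simplified) illustration of the likelihood-maximization idea, here is a minimal sketch with a linear scoring function and a Gaussian-CDF link on pairwise score differences. Both choices are my assumptions for the sketch, not the paper's exact parameterization (the paper defines a CDN over the whole order graph):

```python
# Minimal sketch: learning to rank by minimizing the negative
# log-likelihood of observed pairwise preferences.
# Assumptions (mine, not the paper's): a linear scoring function
# f(x) = w . x and a Gaussian-CDF link on score differences.
import numpy as np
from scipy.stats import norm

def neg_log_likelihood(w, X, prefs):
    """prefs is a list of (i, j) pairs meaning item i is preferred to j."""
    scores = X @ w
    nll = 0.0
    for i, j in prefs:
        # P(i preferred to j) = Phi(f(x_i) - f(x_j))
        nll -= norm.logcdf(scores[i] - scores[j])
    return nll

def grad(w, X, prefs):
    scores = X @ w
    g = np.zeros_like(w)
    for i, j in prefs:
        d = scores[i] - scores[j]
        # d/dd [-log Phi(d)] = -phi(d) / Phi(d)
        coef = -np.exp(norm.logpdf(d) - norm.logcdf(d))
        g += coef * (X[i] - X[j])
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))               # five items, three features
prefs = [(0, 1), (1, 2), (2, 3), (3, 4)]  # observed order-graph edges
w = np.zeros(3)
for _ in range(200):                      # plain gradient descent
    w -= 0.1 * grad(w, X, prefs)
print(neg_log_likelihood(w, X, prefs))
```

The per-pair terms multiply into a likelihood over the order graph, and gradient descent on the negative log-likelihood yields a scoring function consistent with the observed preferences.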
With this formulation, several earlier algorithms, such as RankNet, ListNet and ListMLE, can be regarded as special cases. I will take a closer look at those ranking algorithms later.
Cumulative Distribution Networks and the Derivative-Sum-Product Algorithm
The paper first proposes a novel probabilistic graphical model, the CDN. The model differs from MRFs and Bayesian belief networks in several aspects. The basic structure of the network resembles the factor graph of an MRF: a factor graph consists of two kinds of vertices, the original variable vertices of the MRF and one vertex per clique. The Hammersley-Clifford theorem tells us that the joint distribution of an MRF factorizes into a product of nonnegative potential functions (unnormalized PDFs), one per clique, which is why these clique vertices are called function vertices. In a CDN, however, each function vertex corresponds to a CDF-like function (a cumulative function, in the paper's terminology). As long as these cumulative functions satisfy several simple properties (which are easily verified, as in the paper), their product is a valid CDF.
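To make this concrete, here is a tiny numerical sketch of a three-variable CDN whose two factors are bivariate Gaussian CDFs sharing a variable (the particular factors and covariances are my choice for illustration). The product shows the limiting behavior and monotonicity of a proper joint CDF:

```python
# Minimal sketch of a CDN: a joint CDF built as a product of
# cumulative functions, here two bivariate Gaussian CDFs sharing
# the variable y (chain: x -- g1 -- y -- g2 -- z).
import numpy as np
from scipy.stats import multivariate_normal

cov = [[1.0, 0.3], [0.3, 1.0]]
g1 = multivariate_normal(mean=[0, 0], cov=cov)
g2 = multivariate_normal(mean=[0, 0], cov=cov)

def F(x, y, z):
    # Each factor is itself a CDF; their product is again a CDF
    return g1.cdf([x, y]) * g2.cdf([y, z])

big = 8.0
print(F(big, big, big))          # -> ~1.0 (upper limit)
print(F(-big, big, big))         # -> ~0.0 (lower limit in any argument)
print(F(0, 0, 0) <= F(1, 0, 0))  # True: monotone in each argument
```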
To do inference in a CDN, the authors propose an algorithm that resembles the sum-product algorithm for factor graphs (i.e., belief propagation for polytrees). Since the derivative of a CDF is a PDF, it is natural that their inference procedure is called the Derivative-Sum-Product (DSP) algorithm. I will read the details of the algorithm later.
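The identity behind DSP can be checked symbolically on a toy CDN. The sketch below uses Gumbel's bivariate logistic CDF as each cumulative function (my choice, not the paper's): differentiating the product CDF once with respect to each variable yields the joint density, and marginalization reduces to taking limits of the CDF:

```python
# The key identity behind DSP: the joint PDF is the mixed partial
# derivative of the joint CDF. A small symbolic check with sympy.
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
g = lambda a, b: 1 / (1 + sp.exp(-a) + sp.exp(-b))  # a valid bivariate CDF

# CDN over a chain x -- g1 -- y -- g2 -- z
F = g(x, y) * g(y, z)

# Joint density = mixed partial derivative of the CDF
p = sp.simplify(sp.diff(F, x, y, z))
print(p.subs({x: 0.3, y: -0.5, z: 1.2}).evalf())  # a nonnegative density value

# Marginalization in a CDN is just a limit of the CDF:
Fx = sp.limit(sp.limit(F, y, sp.oo), z, sp.oo)
print(sp.simplify(Fx))  # -> 1/(1 + exp(-x)), the logistic marginal CDF of x
```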
This framework is applied to a multiplayer ranking problem. In the paper's example there are three teams, (x1, x2), (x3, x4) and (x5, x6, x7). The function si(xi) gives the skill of player i, and the performance of each team is the sum of its players' skills, yielding team performances R1, R2 and R3. The observed ranks of the teams come as an order graph whose relationships are represented by the cumulative functions h. Prediction is carried out with skill functions learned from previous results. In this model every function vertex is a Gaussian CDF, so the DSP messages can be computed easily. The details are worth a closer look.
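Here is a hedged sketch of the team-performance construction, assuming independent Gaussian player skills with made-up means and variances; the paper's actual inference passes DSP messages over the order graph, so this only illustrates how team performances and pairwise outranking probabilities arise:

```python
# Sketch: player skills are Gaussian, a team's performance is the
# sum of its players' skills, and the probability that one team
# outranks another follows from the Gaussian CDF of the difference.
# All means/variances below are made up for illustration.
from math import sqrt
from scipy.stats import norm

# (mean, variance) of each player's skill
team1 = [(25.0, 8.0), (23.0, 6.0)]               # players x1, x2
team2 = [(27.0, 7.0), (20.0, 9.0)]               # players x3, x4
team3 = [(18.0, 5.0), (22.0, 6.0), (24.0, 8.0)]  # players x5, x6, x7

def performance(team):
    """Team performance R = sum of player skills (again Gaussian)."""
    mu = sum(m for m, _ in team)
    var = sum(v for _, v in team)
    return mu, var

def p_outranks(team_a, team_b):
    """P(R_a > R_b) for independent Gaussian performances."""
    (ma, va), (mb, vb) = performance(team_a), performance(team_b)
    return norm.cdf((ma - mb) / sqrt(va + vb))

print(p_outranks(team1, team2))  # P(R1 > R2)
print(p_outranks(team2, team3))  # P(R2 > R3)
```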
In the end, several applications are pointed out, such as web-page ranking and collaborative filtering. The authors also intend to focus on the learning part of the model in future research, since the DSP algorithm, like belief propagation, is only applicable to trees.
A related topic:
Convolutional factor graphs: Convolutional Factor Graphs as Probabilistic Models, UAI 2004.
