Thursday, December 25, 2008

Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification


This looks like a problem similar to CBIR (content-based image retrieval), and the application might also be approached with MIL (multiple-instance learning). The basic idea here is distance learning. First, several patches are extracted from each image (described with SIFT or geometric blur). The distance between a pair of images is a weighted sum of the distances between their patches. Strictly speaking, this is not a true distance, because symmetry is not guaranteed. We only compute the distances of images i and j to image k and compare the two to decide which is the more likely match. All that remains is to choose the weights.
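
To make the weighted sum concrete, here is a minimal sketch in Python with NumPy. It is my own illustration, not the paper's code: the function names are made up, patches are plain feature vectors, and I assume each patch of the focal image is compared to its nearest patch in the other image using Euclidean distance.

```python
import numpy as np

def patch_distances(focal_patches, other_patches):
    """For each focal patch, distance to its nearest patch in the other image.

    Assumption: Euclidean distance between descriptors, nearest-patch matching.
    """
    diff = focal_patches[:, None, :] - other_patches[None, :, :]
    pairwise = np.linalg.norm(diff, axis=2)  # shape (n_focal, n_other)
    return pairwise.min(axis=1)              # shape (n_focal,)

def image_distance(weights, focal_patches, other_patches):
    """Weighted sum of patch distances, taken from the focal image's side."""
    return float(weights @ patch_distances(focal_patches, other_patches))

# Toy descriptors: image A has two patches, image B has one.
patches_a = np.array([[0.0, 0.0], [3.0, 4.0]])
patches_b = np.array([[0.0, 0.0]])

d_ab = image_distance(np.ones(2), patches_a, patches_b)  # A is the focal image
d_ba = image_distance(np.ones(1), patches_b, patches_a)  # B is the focal image
print(d_ab, d_ba)
```

Because the weights and the patches both belong to the focal image, swapping the two images generally changes the value, which is why the result is not a true distance.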

The paper exploits the maximum-margin idea with a strategy similar to the one used in SVMs: each training triplet is constrained by an inequality of the form D(i, k) - D(j, k) >= 1, where j is the image that should match k better than i does, and the norm of the weights is minimized. What I am never sure about is whether this is really a margin; I am always confused by it. Maybe it is reasonable but not geometrically interpretable?
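
A rough sketch of how such weights could be learned, again in Python with NumPy. This is not the optimizer from the paper: I assume a single focal image, a soft-margin (hinge loss) version of the constraints, non-negative weights, and plain projected subgradient descent. The name `learn_weights` and the parameters `C`, `lr`, and `epochs` are my own.

```python
import numpy as np

def learn_weights(far, near, C=1.0, lr=0.01, epochs=500):
    """Learn non-negative patch weights from triplet constraints.

    far[t]  : patch distances from the focal image to the worse match
    near[t] : patch distances from the focal image to the better match
    Objective (assumed): 0.5 * ||w||^2 + C * sum_t max(0, 1 - w . (far[t] - near[t]))
    """
    x = far - near                        # one difference vector per triplet
    w = np.zeros(x.shape[1])
    for _ in range(epochs):
        violated = (x @ w) < 1.0          # triplets with margin below 1
        grad = w - C * x[violated].sum(axis=0)
        w = np.maximum(w - lr * grad, 0.0)  # project back onto w >= 0
    return w

# Toy data: patch 0 separates good from bad matches, patch 1 does not.
far = np.array([[2.0, 0.5], [3.0, 0.4]])
near = np.array([[0.5, 0.6], [0.2, 0.5]])
w = learn_weights(far, near)
print(w, (far - near) @ w)
```

Read this way, the "margin" is just the gap between the two weighted distances in each triplet, which may be part of why it is harder to picture geometrically than the SVM case.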
