Paper Scanner

Sunday, January 31, 2010

Non-metric Label Propagation

by Yin Zhang and Zhi-hua Zhou

This paper follows the line of work on analyzing non-metric similarity matrices. By decomposing the Gram matrix into two separate graphs (one for the positive eigenvalues and the other for the negative ones), they build two separate Markov chains, which together comprise a mixture of Markov chains for label propagation (the solution is just an explicit solution of linear equations). As usual, their paper contains many experiments, which I think might be what my own research work lacks.
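To see why label propagation ends up as "an explicit solution of linear equations", here is a minimal sketch of the standard closed-form propagation on a single graph with nonnegative weights. This is not the paper's mixture model: the function name `propagate_labels`, the damping factor `alpha` and the toy graph are all assumptions made for illustration.

```python
import numpy as np

def propagate_labels(W, Y, alpha=0.9):
    """Closed-form label propagation on one graph.

    W: (n, n) symmetric nonnegative weight matrix with positive row sums.
    Y: (n, c) initial label matrix (zero rows for unlabeled nodes).
    Solves (I - alpha * S) F = Y with S = D^{-1/2} W D^{-1/2}.
    """
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, Y)

# Toy graph: two tight pairs (0-1 and 2-3) joined by one weak edge (0-2).
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
# Node 0 is labeled class 0, node 3 is labeled class 1, the rest are unlabeled.
Y = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]])

F = propagate_labels(W, Y)
pred = F.argmax(axis=1)  # each unlabeled node takes the label of its tight neighbor
```

In a mixture setting one would presumably run something like this once per graph and combine the results, but how the paper weights the two chains is not reproduced here.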

The idea is not that fancy, but the application to label propagation might be novel, which is typical of Zhou's research style :-p That requires a keen nose.

Feature Discovery in Non-metric Pairwise Data

by Julian Laub and Klaus-Robert Müller

This is a paper about how to analyze pairwise "distance" or similarity matrices. Since not all similarity matrices can be transformed into a Gram matrix (as we do in MDS), it is interesting to look deeper into the details.

Basically, we may imagine there are two metrics, one for similarity and another for dissimilarity (penalizing the similarity in human perception). By applying a spectral transformation, we may use metric methods if the spectrum can be fixed (no negative eigenvalues).

The problem is how we may utilize the negative part of the spectra.
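A minimal sketch of the spectral view above, assuming a symmetric similarity matrix: the eigendecomposition splits it into a positive part and a negative part, and "fixing" the spectrum by clipping simply keeps the positive part. The function name `split_spectrum` and the toy matrix are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def split_spectrum(S):
    """Split a symmetric matrix S into PSD parts with S = S_plus - S_minus."""
    S = 0.5 * (S + S.T)                    # guard against tiny asymmetries
    vals, vecs = np.linalg.eigh(S)
    pos = np.clip(vals, 0.0, None)         # positive eigenvalues, zeros elsewhere
    neg = np.clip(-vals, 0.0, None)        # magnitudes of the negative eigenvalues
    S_plus = (vecs * pos) @ vecs.T
    S_minus = (vecs * neg) @ vecs.T
    return S_plus, S_minus

# A non-metric toy similarity: item 1 is close to both 0 and 2,
# yet 0 and 2 are far apart, which forces a negative eigenvalue.
S = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.9],
              [0.1, 0.9, 1.0]])

S_plus, S_minus = split_spectrum(S)
# S_plus alone is a valid Gram matrix (the "clipped" spectrum);
# S_minus holds whatever structure lives in the negative part.
```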

Monday, November 23, 2009

Two-view Feature Generation Model for Semi-supervised Learning

by Rie Kubota Ando and Tong Zhang

We first take a look at their logic: for semi-supervised learning, a generative model is usually preferred since unlabeled data help estimate the marginal distribution \Pr(x). In a Bayesian MAP formulation, we are actually solving

\min_\alpha - \sum_i \log \Pr(y_i \mid \alpha, x_i) - \log \left[ \Pr(x_u \mid \alpha) \Pr(\alpha) \right]

which is a little different from a direct generative model. Here the first term is actually a discriminative term and the second term is a penalty from the unlabeled part (therefore it is more similar to a ``supervised loss + penalty'' model). This paper is about models of the latter kind, built with auxiliary problems.

The two-view model means that, analogous to co-training, we have two views of the feature vector x, namely z_1(x) and z_2(x), which are independent conditioned on the label. What is different about this model is that, in order to obtain \Pr(y \mid z_1, z_2), we need \Pr(y \mid z_1) and \Pr(y \mid z_2). For now we only consider \Pr(y \mid z_1). One possibility is to make a low-rank decomposition of \Pr(z_2 \mid z_1) = \sum_y \Pr(z_2 \mid y) \Pr(y \mid z_1), but the LHS is sometimes impossible to compute. An approximation is to encode z_2 with a set of binary labels t_1^k(z_2). Then \Pr(t_1^k \mid z_1) = \sum_y \Pr(t_1^k \mid y) \Pr(y \mid z_1) can be computed. By increasing the number of related binary labels t_1^k we can get a good estimate of \Pr(y \mid z_1).
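The factorization above can be checked numerically. The following is a minimal sketch with synthetic discrete distributions: it stacks \Pr(t^k \mid z_1) for many binary labels into a matrix, confirms that its rank is at most the number of classes, and shows that its top right singular vectors span the space in which \Pr(y \mid z_1) lives. All sizes and variable names are assumptions, and this is not the estimator used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_z1, n_classes, n_aux = 50, 3, 20   # values of z_1, classes y, binary labels t^k

# Pr(y | z_1): one column per value of z_1, each column sums to 1.
P_y_given_z1 = rng.dirichlet(np.ones(n_classes), size=n_z1).T   # (n_classes, n_z1)
# Pr(t^k = 1 | y): one row per auxiliary binary label.
P_t_given_y = rng.uniform(size=(n_aux, n_classes))              # (n_aux, n_classes)

# Pr(t^k = 1 | z_1) = sum_y Pr(t^k = 1 | y) Pr(y | z_1), for all k and z_1 at once.
P_t_given_z1 = P_t_given_y @ P_y_given_z1                       # (n_aux, n_z1)

# The observable matrix has rank at most n_classes.
_, s, Vt = np.linalg.svd(P_t_given_z1, full_matrices=False)
rank = int((s > 1e-10 * s[0]).sum())

# Its top right singular vectors span the same space as the rows of Pr(y | z_1),
# so projecting Pr(y | z_1) onto them loses nothing.
basis = Vt[:n_classes]
reconstructed = (P_y_given_z1 @ basis.T) @ basis
```

So the auxiliary labels determine \Pr(y \mid z_1) only up to a linear map within that low-dimensional space; the labeled data still have to pin down the map.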

They proposed two models (one linear and the other log-linear, which resemble linear regression and logistic regression in a way). The linear version coincides with the SVD-ASO model in their JMLR paper. The log-linear model is solved via an EM-like algorithm.

The question is: what kind of binary auxiliary functions would be essential to our semi-supervised problems? This might be a key to understanding their JMLR paper on multi-task learning.

A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data

by Rie Kubota Ando and Tong Zhang

This paper proposes a framework for multi-task learning. Their idea is quite simple. There is a common factor \Theta which is shared among different but related problems. Therefore in each problem P_k, our parameters include w_k, which is problem-specific, and v_k, which depends on the common features controlled by \Theta. To solve the model it is usually desirable to alternately optimize over w_k, v_k and \Theta. Usually a regularizer is also included for better generalization capacity.

Using this idea, the authors proposed a linear model which is solved by ASO, using an SVD in each iteration to find \Theta (SVD-ASO in their terminology). With this idea, they analyzed semi-supervised learning with auxiliary functions, which essentially play the role of the multiple tasks.
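A rough sketch of the SVD step, under stated assumptions: each task is fit independently with ridge regression (a stand-in for whatever loss the paper uses), the per-task weight vectors are stacked as columns, and the top h left singular vectors are taken as the rows of \Theta. This is a single pass rather than the full alternating procedure, and the helper name `shared_structure` and the synthetic tasks are hypothetical.

```python
import numpy as np

def shared_structure(X_list, y_list, h, ridge=1e-2):
    """Estimate an (h, p) shared projection Theta from several linear tasks."""
    p = X_list[0].shape[1]
    # One ridge-regression weight vector per task, stacked as columns: (p, n_tasks).
    U = np.column_stack([
        np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)
        for X, y in zip(X_list, y_list)
    ])
    # The top-h left singular vectors span the directions the tasks have in common.
    V1, _, _ = np.linalg.svd(U, full_matrices=False)
    return V1[:, :h].T

# Synthetic tasks whose true weight vectors all lie in one h-dimensional subspace.
rng = np.random.default_rng(0)
p, h, n_tasks, n = 10, 2, 8, 200
true_basis = np.linalg.qr(rng.normal(size=(p, h)))[0]   # (p, h), orthonormal columns
X_list, y_list = [], []
for _ in range(n_tasks):
    X = rng.normal(size=(n, p))
    u = true_basis @ rng.normal(size=h)
    X_list.append(X)
    y_list.append(X @ u + 0.01 * rng.normal(size=n))

Theta = shared_structure(X_list, y_list, h)
# How much of the true subspace is missed by the estimated one (0 means none).
subspace_error = np.linalg.norm(Theta.T @ Theta @ true_basis - true_basis)
```

In the semi-supervised use, the tasks would be auxiliary problems generated from unlabeled data, and \Theta x would then be added as extra features for the target problem.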

Their extension of this piece of work is scanned here.

Saturday, November 21, 2009

Discriminative Semi-supervised Feature Selection via Manifold Regularization

by Zenglin Xu, Rong Jin, Michael R. Lyu and Irwin King

This paper talks about feature selection via SVM. The semi-supervised part is enabled by adding a manifold regularizer. The method is to multiply the feature vector by a diagonal 0-1 matrix (which selects features). With these selection variables included as well, we get the optimization for this problem. The key idea for solving it is to reformulate it with the dual of the SVM while leaving the feature-selecting variables alone. Then the optimum is a saddle point of the optimization problem. This kind of problem can be found in multiple-kernel learning, which has a standard algorithm (alternating optimization w.r.t. different variables).

The idea of using SVM for feature selection is not new. The contribution might be the semi-supervised setting. In my own research it seems that we still do not have a clear goal of achieving this with other methods. hmm...

Thursday, November 12, 2009

Dappled Photography: Mask Enhanced Cameras for Heterodyned Light Fields and Coded Aperture Refocusing

by A. Veeraraghavan, R. Raskar, A. Agrawal, A. Mohan and J. Tumblin

A light field conveys both the spatial and the angular distribution of light incident on the camera sensor. The pioneering work on capturing a light field in one photographic exposure is the plenoptic camera, a device that uses a microlens array to rearrange a 4D light field and capture it with a 2D sensor. However, the optics of the microlens array defines a fixed resolution tradeoff between spatial and angular sampling of the light field.

In this paper, the authors propose to modulate the light field by shadowing the incoming light with a mask in the optical path. In the Fourier Light Field Space (FLS), the mask creates a train of identical kernels positioned in a slanted slice, and thereby, via convolution, pulls high angular frequencies to the central angular slice, the only slice the camera measures in the FLS. Assuming that the incident light field is band-limited, the captured image is the flattened version of the incident light field in the Fourier domain.

Moreover, the slant of the mask kernel, which decides the spatial-angular resolution tradeoff, is determined by the location of the mask. Consequently, the resolution tradeoff can be adjusted by translating the mask. The minimal and the maximal angular resolution are achieved by placing the mask at the aperture and at the conjugate plane respectively.

However, the mask-enhanced camera seems to trade reconstruction quality for flexibility. First, the optimal pattern of the mask varies with its location, yet in practice the mask pattern is permanent. Second, the mask blocks about half of the incident light and reduces the signal-to-noise ratio of the sensed image. Still, the paper provides a thorough analysis of the principles of mask-enhanced cameras, a major category of computational cameras, which has made it influential in the Computational Photography community.
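The heterodyning idea can be illustrated in one dimension: multiplying a signal by a nonnegative cosine mask creates shifted copies of its spectrum, which is the same mechanism the mask uses to bring otherwise unmeasured frequencies into the measured slice. This is only a 1D analogy under assumed frequencies (a signal at 5 cycles and a mask at 20 cycles), not the paper's 4D light field model.

```python
import numpy as np

n = 256
t = np.arange(n) / n                      # one unit of "space", n samples

signal = np.cos(2 * np.pi * 5 * t)        # band-limited content at 5 cycles
# A physical mask can only attenuate, so its transmission lies in [0, 1].
mask = 0.5 * (1.0 + np.cos(2 * np.pi * 20 * t))

modulated = signal * mask
spectrum = np.abs(np.fft.rfft(modulated))

# The original peak survives at half strength, and two copies appear
# around the mask frequency, at 20 - 5 and 20 + 5 cycles.
peaks = sorted(np.argsort(spectrum)[-3:].tolist())
mean_transmission = mask.mean()           # the mask passes half of the light on average
```

The average transmission of one half matches the remark above that such a mask blocks about half of the incident light.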

4D Frequency Analysis of Computational Cameras for Depth of Field Extension

by A. Levin, S. W. Hasinoff, P. Green, F. Durand and W. T. Freeman

Although many types of cameras have been invented to extend the depth of field (DoF), none of them optimizes the quality of the resulting image or, equivalently, maximizes the modulation transfer function (MTF). In this paper, the authors perform a 4D frequency analysis to estimate the maximal frequency spectra of optical systems.

The key to the analysis lies in the observation of the dimensional gap between the 3D MTF and the 4D ambiguity function that characterizes a camera: the former is a manifold embedded in the latter, called “the focal segments”. To maximize the MTF, therefore, the ambiguity function should distribute all the energy uniformly on these segments. This analysis leads to an upper bound on the MTF.

Unfortunately, most contemporary computational cameras waste energy outside this region. The only exception is the focal sweep camera, but the phase incongruence of its OTF across various focus settings lowers the spectrum magnitude. The authors propose the lattice-focal lens. This lens is composed of a number of sub-squares, each responsible for focusing light rays from a specific depth. This spatial division of the aperture also concentrates energy on the focal region, but achieves a much higher spectrum than the focal sweep camera.

The ambiguity function, defined as the auto-correlation of the 2D scalar field of an optical system, is a redundant representation. This prevents the authors from determining a tight upper bound on the frequency spectrum. Still, the proposed analysis sheds much light on this question. Although there is no explicit analysis, it indicates that the key to maximizing MTFs may lie in the phase incoherence of the optical system.