The paper makes several points:
A single fixed kernel may not be desirable: if we know nothing about the data, the kernel has to be selected empirically from a set of candidate kernels.
Given a set of parameterized kernels, how can one select a convex combination of them? The authors point to the finite representation result in Learning convex combinations of continuously parameterized basic kernels (COLT 2005).
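To make the setup concrete, here is a minimal sketch of a convex combination of parameterized kernels, assuming Gaussian kernels indexed by bandwidth; the function names and the fixed bandwidth grid are illustrative, not the continuously parameterized family of the paper:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Gram matrix of the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def combined_kernel(X, Y, sigmas, weights):
    """Convex combination sum_i w_i * k_{sigma_i}(x, y), with w_i >= 0 and sum_i w_i = 1."""
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0) and np.isclose(weights.sum(), 1.0)
    return sum(w * gaussian_kernel(X, Y, s) for w, s in zip(weights, sigmas))

# Example: three basic kernels with equal weights.
X = np.random.randn(5, 2)
K = combined_kernel(X, X, sigmas=[0.5, 1.0, 2.0], weights=[1/3, 1/3, 1/3])
```

A convex combination of positive semidefinite Gram matrices is again positive semidefinite, which is what makes searching over the weights a sensible kernel-selection problem.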
The kernel selection is computed by a two-step alternating algorithm: the first step is carried out by a convex optimization algorithm, and the second is a DC programming problem. The stopping criterion ensures that each iteration improves the objective function. A toy version of the scheme is sketched below.
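The sketch below is not the paper's algorithm; it only illustrates the alternating structure under several simplifying assumptions: a kernel ridge regression objective y^T (K + lam I)^{-1} y, which is convex in the kernel weights; exponentiated-gradient updates on the simplex for Step 1; and a brute-force search over a finite bandwidth grid in place of the paper's DC program for Step 2. All names and parameters here are illustrative:

```python
import numpy as np

def gaussian_gram(X, sigma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def objective(Ks, w, y, lam):
    """J(w) = y^T (sum_i w_i K_i + lam I)^{-1} y, convex in w over PSD combinations."""
    K = sum(wi * Ki for wi, Ki in zip(w, Ks))
    return y @ np.linalg.solve(K + lam * np.eye(len(y)), y)

def fit_weights(Ks, y, lam, steps=200, eta=0.1):
    """Step 1: minimize J over the simplex by exponentiated gradient."""
    w = np.full(len(Ks), 1.0 / len(Ks))
    for _ in range(steps):
        K = sum(wi * Ki for wi, Ki in zip(w, Ks))
        a = np.linalg.solve(K + lam * np.eye(len(y)), y)
        grad = np.array([-a @ Ki @ a for Ki in Ks])  # dJ/dw_i
        w = w * np.exp(-eta * grad)
        w /= w.sum()                                 # stay on the simplex
    return w

def greedy_kernel_selection(X, y, candidate_sigmas, lam=0.1, tol=1e-6):
    sigmas = [candidate_sigmas[0]]
    while True:
        Ks = [gaussian_gram(X, s) for s in sigmas]
        w = fit_weights(Ks, y, lam)
        best = objective(Ks, w, y, lam)
        # Step 2 (brute force here; a DC program in the paper): pick the
        # candidate bandwidth whose addition most decreases the objective.
        gains = []
        for s in candidate_sigmas:
            Ks2 = Ks + [gaussian_gram(X, s)]
            w2 = fit_weights(Ks2, y, lam)
            gains.append(best - objective(Ks2, w2, y, lam))
        if max(gains) < tol:  # stop once no candidate improves the objective
            return sigmas, w
        sigmas.append(candidate_sigmas[int(np.argmax(gains))])

# Example usage on toy data.
X = np.random.randn(30, 2)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(30)
sigmas, w = greedy_kernel_selection(X, y, candidate_sigmas=[0.25, 0.5, 1.0, 2.0, 4.0])
```

The brute-force Step 2 only works because the grid here is finite; the point of the paper's DC formulation is precisely to search over a continuously parameterized family of basic kernels.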
But at first it seems a little strange that Step 2 is a DC programming problem. The authors cite a result from On functions representable as a difference of convex functions by Hartman, which implies:
- Every twice continuously differentiable function on Omega is DC (a standard decomposition is sketched after this list);
- Every continuous function on Omega is the limit of a sequence of DC functions that converges uniformly on Omega.
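The first claim is easy to see when Omega is compact and convex: a twice continuously differentiable f then has a bounded Hessian, so adding a sufficiently large quadratic makes it convex:

```latex
% Choose \rho \ge \sup_{x \in \Omega} \max\{0, -\lambda_{\min}(\nabla^2 f(x))\}. Then
\[
  f(x) \;=\;
  \underbrace{\Bigl( f(x) + \tfrac{\rho}{2}\lVert x\rVert^2 \Bigr)}_{g(x),\ \text{convex}}
  \;-\;
  \underbrace{\tfrac{\rho}{2}\lVert x\rVert^2}_{h(x),\ \text{convex}},
\]
% since \nabla^2 g = \nabla^2 f + \rho I \succeq 0 on \Omega,
% exhibiting f as a difference of two convex functions.
```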