Sunday, November 21, 2010

Deep Transfer via Second-Order Markov Logic


Transfer learning is a topic I am not familiar with. It is said to deal with the case where the training data differ from the testing data. There are two levels of transfer learning: so-called shallow transfer, which handles different training and testing distributions within the same domain, and so-called deep transfer, which handles training and testing in different domains.

Deep transfer is possible only because different domains share the same underlying regularities, and this is exactly where I think Markov logic should be able to play an important role. The paper explains why the representation must be second-order (so that the transferred knowledge, expressed with predicate variables, is domain-independent) and relational (which is what transfer requires). Their proposed algorithm is DTM (deep transfer via Markov logic).
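
To make the second-order part concrete, here is a toy illustration of my own (not an example taken from the paper). A formula over a predicate variable, say
\displaystyle r(x, y) \wedge r(y, z) \Rightarrow r(x, z)
mentions no domain-specific predicates; substituting a concrete predicate for r (say, a protein-interaction relation in the yeast data, or a link relation in WebKB) turns it back into an ordinary first-order clause, so regularities learned at the level of r can be carried from one domain to another.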

For the experiments, the authors use three domains, which seem not in the least related to each other (yeast protein data, WebKB, and social network data from Facebook). I am still not sure whether these experiments really show how well their transfer works. Maybe I should return to this paper after a more careful study.

Statistical Predicate Invention


This paper talks about how to invent statistical predicates (the SPI problem, statistical predicate invention). According to another talk given by Domingos, this is the analogue of discovering latent variables in traditional probabilistic learning, and one of the ten most important problems in machine learning for the coming decades. The setting of this paper is second-order Markov logic.

Their proposed approach for this problem is MRC (multiple relational clusterings). The idea of multiple relational clusterings is explained via a simple example in the paper: one's technical skills and hobbies should be mined from different groups (clusterings) of people, e.g. coworkers may share similar technical skills while friends share similar hobbies. Relational clustering is used to find the latent relationships between people, and therefore ultimately to find who are coworkers and who are friends (latent predicates, or latent random variables). The resulting algorithm is not easy to understand; it looks like a clustering algorithm expressed in a logical language.
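
As a toy illustration of the predicate-invention idea (my own simplification, not the MRC algorithm itself), one can cluster people by the pattern of relations they participate in and then treat cluster co-membership as an invented predicate; the relation names and data below are made up.

from collections import defaultdict

# Hypothetical observed ground atoms: relation name -> set of (person, person) pairs.
observed = {
    "WorksWith": {("ann", "bob"), ("ann", "carl")},
    "GoesFishingWith": {("ann", "dave"), ("ann", "eve")},
}

people = sorted({p for pairs in observed.values() for pair in pairs for p in pair})
relations = sorted(observed)

# A person's profile: how many observed pairs of each relation they appear in.
def profile(person):
    return tuple(sum(person in pair for pair in observed[r]) for r in relations)

# Group people with identical profiles (a crude stand-in for real relational clustering).
clusters = defaultdict(list)
for p in people:
    clusters[profile(p)].append(p)

# Each group acts like an invented predicate, e.g. SameCluster(bob, carl) holds here
# because bob and carl both relate to others only through WorksWith.
for members in clusters.values():
    print("invented cluster:", members)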

There is another piece of work (the infinite relational model) that uses the CRP for relational modelling, which I think should be very interesting. I will try to look at the details later.

Saturday, November 13, 2010

The PageRank Citation Ranking: Bringing Order to the Web


This might be the famous paper that introduced PageRank to ranking research and the famous search engine Google to the Internet. The key idea behind PageRank that differentiates it from simple backlink counts is that a backlink from an authoritative site should be more valuable; therefore each backlink must be weighted by the rank of the page it comes from. So the recursion in the last scanned paper makes sense.

A more interesting question is how to make a distributed version. I think in a way this is equivalent to solving some linear system, but I haven't really tried to derive it.
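
As a quick sanity check of the recursion (a minimal sketch of my own, on a made-up four-page link graph, using the formula quoted in the Intelligent Surfer entry below), power iteration converges to the ranks without explicitly solving the linear system:

import numpy as np

# Hypothetical toy link graph: page index -> list of pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
N = len(links)
beta = 0.85  # probability of following a link instead of jumping at random

# Column-stochastic matrix: M[j, i] = 1/|F_i| whenever page i links to page j.
M = np.zeros((N, N))
for i, outgoing in links.items():
    for j in outgoing:
        M[j, i] = 1.0 / len(outgoing)

# Power iteration for P(j) = (1 - beta)/N + beta * sum_{i in B(j)} P(i)/|F_i|.
P = np.full(N, 1.0 / N)
for _ in range(100):
    P = (1 - beta) / N + beta * M @ P

print(P, P.sum())  # every page here has outgoing links, so the scores keep summing to 1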

This could be a contributing factor for real ranking algorithms (which use many other features as well).

Multiple Instance Learning for Computer Aided Diagnosis


This paper talks about MIL (multiple instance learning) problems, addressing them with a convex-hull-based algorithm. The key idea comes from a previously proposed relaxation (in another paper). In MIL, the basic assumption is that each positive bag contains at least one positive sample. The relaxation of this assumption is that in the convex hull of each positive bag, there exists a point that can be correctly classified.
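
In symbols (my own sketch, assuming a linear classifier with weights w and bias b, and a positive bag with instances x_1, \dots, x_m), the relaxed requirement is that some convex combination of the bag's instances lies on the positive side of the margin:
\displaystyle \exists\, \lambda_i \ge 0,\ \sum_{i=1}^{m} \lambda_i = 1 \quad \text{such that} \quad w^\top \Big( \sum_{i=1}^{m} \lambda_i x_i \Big) + b \ge 1.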

With this relaxation, the authors propose a general form for MIL, within which many known algorithms can be formulated, e.g. SVM and Fisher discriminant variants for MIL. On the whole this is an application paper.

Wednesday, November 10, 2010

The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank


I started to read these papers by Domingos to follow their ideas on Markov logic. I have already read a few pages of their tutorials and wish to know more about their past research background.

This paper talks about ranking, which I am not very familiar with. It introduces two previous methods, HITS and PageRank, which I'd like to scan later. The HITS model is a little bit complicated and can't be served online (hubs and authorities have to be computed at query time), while PageRank can be computed at an offline stage (i.e. after crawling, the PageRank can be computed from incoming and outgoing links, and it later serves as one factor contributing to the final ranking function).

The graph PageRank works on is built from the hyperlinks:
  • an edge is added for every outgoing/incoming link between pages;
  • if a webpage has no outgoing links, it is treated as linking to all other pages;
So we may put a random walker on the graph and compute the probability that it arrives at page j using the formula
\displaystyle P(j) = \frac{1 - \beta}{N} + \beta \sum_{i \in B(j)} \frac{P(i)}{|F_i|}
where B(j) is the set of pages that link to page j and F_i is the set of pages that page i links to (so |F_i| is the number of its outgoing links).
This paper's main idea is to incorporate the query information into PageRank, which, as you may have already seen from the formula above, contains nothing about the query q. Let
\displaystyle P_q(j) = (1 - \beta) P_q'(j) + \beta \sum_{i \in B(j)} P_q(i) P_q(i \to j),
where the query-related terms are all derived from query relevance scores R_q, e.g.
\displaystyle P_q'(j) = \frac{R_q(j)}{\sum_{k \in W} R_q(k)}, \qquad P_q(i \to j) = \frac{R_q(j)}{\sum_{k \in F_i} R_q(k)},
with W denoting the set of all pages.
To overcome the computation problem, the authors suggest pre-computing the ranking scores for search queries offline.
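
A minimal sketch of my own of the query-dependent recursion above, assuming we already have relevance scores R_q(j) for every page (just made-up numbers here); the uniform fallback for pages whose link targets all have zero relevance is my own assumption, not the paper's:

import numpy as np

# Hypothetical toy data: link graph plus made-up relevance scores R_q for one query.
links = {0: [1, 2], 1: [2], 2: [0, 3], 3: [0]}
R_q = np.array([0.0, 2.0, 1.0, 1.0])  # pages 1-3 match the query, page 0 does not
N, beta = len(links), 0.85

# P_q'(j): a random jump lands on page j in proportion to its relevance.
P_jump = R_q / R_q.sum()

# P_q(i -> j): a followed link goes to j in proportion to j's relevance.
T = np.zeros((N, N))
for i, outgoing in links.items():
    total = sum(R_q[j] for j in outgoing)
    for j in outgoing:
        T[j, i] = R_q[j] / total if total > 0 else 1.0 / len(outgoing)

# Iterate P_q(j) = (1 - beta) P_q'(j) + beta * sum_{i in B(j)} P_q(i) P_q(i -> j).
P = np.full(N, 1.0 / N)
for _ in range(100):
    P = (1 - beta) * P_jump + beta * T @ P

print(P)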