Sunday, November 6, 2011

To Swing or not to Swing: Learning When (not) to Advertise


This paper is about learning when to show ads and when to show none at all. The basic idea is to train a classification model on features of the candidate ad set, such as relevance features and cohesiveness features. The classifier is compared against a thresholding method, which can be obtained trivially from a commonly used ad retrieval system. The classifier proved more effective when the goal is not to maintain a high recall rate (the case where showing more ads is always better).
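To make the two decision rules concrete, here is a minimal sketch. The specific features (maximum and mean retrieval score, mean pairwise Jaccard overlap as a stand-in for cohesiveness), the function names, and the linear decision rule are my own assumptions for illustration, not the feature set or model from the paper.

```python
from itertools import combinations


def relevance_features(scores):
    """Summary statistics of the retrieval scores of the candidate ads."""
    if not scores:
        return {"max_score": 0.0, "mean_score": 0.0}
    return {"max_score": max(scores), "mean_score": sum(scores) / len(scores)}


def cohesiveness(ad_terms):
    """Mean pairwise Jaccard overlap between the ads' term sets.

    This is an assumed proxy for cohesiveness, not the paper's measure.
    """
    pairs = list(combinations(ad_terms, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        total += len(a & b) / len(union) if union else 0.0
    return total / len(pairs)


def threshold_decision(scores, threshold):
    """Baseline: show ads only if the best retrieval score clears a threshold."""
    return bool(scores) and max(scores) >= threshold


def classifier_decision(scores, ad_terms, weights, bias):
    """Linear classifier over set-level features; weights are assumed to be learned elsewhere."""
    feats = relevance_features(scores)
    feats["cohesiveness"] = cohesiveness(ad_terms)
    z = bias + sum(weights[name] * value for name, value in feats.items())
    return z >= 0.0
```

The thresholding rule looks only at the retrieval scores, while the classifier can also use properties of the candidate set as a whole, which is where it may gain its edge.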

Another idea is to train the model on click feedback instead of editorial judgments. The paper mentions using online learning techniques for this. The reason is that most models served online have to account for data that evolves over time, so the model should support incremental training.
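As an illustration of incremental training on click feedback, here is a minimal sketch of logistic regression updated one impression at a time with stochastic gradient descent. The choice of model, the class name, and the learning rate are assumptions on my part; the paper may use a different online learner.

```python
import math


class OnlineLogisticModel:
    """Logistic regression trained incrementally, one impression at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated click probability for one feature vector."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, clicked):
        """One SGD step on the log loss for a single observed impression."""
        error = self.predict(features) - (1.0 if clicked else 0.0)
        self.weights = [
            w - self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]
        self.bias -= self.learning_rate * error
```

Each call to update uses only the latest impression, so the model can follow drifting click behavior without being retrained from scratch.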

Impedance Coupling in Content-Targeted Advertising


This paper is about expanding the keywords of a webpage using terms from similar webpages. The expansion improves the retrieval of related ads. It differs from another approach, in which the expansion set is mined from a search engine.

The basic model is a Bayesian belief network. For each document D0, we take its most similar pages Dk and create links from each document to its related terms. On its own, the document's terms are too sparse for retrieval (the vocabulary impedance problem). Adding similar pages enlarges the candidate term set, but we still need to select which terms are good for the current page. Take Pr(Di) as constant, the conditional probability Pr(R | Di) as some function of the similarity between Di and D0, and Pr(Tj | Di) as a normalized term frequency. We can then evaluate Pr(Tj | R) for each candidate term and use it as a ranking score to choose expansion terms.
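A rough sketch of this scoring follows. If R and Tj are assumed conditionally independent given Di and Pr(Di) is constant, then Pr(Tj | R) is proportional to the sum over i of Pr(R | Di) * Pr(Tj | Di). In the code below, cosine similarity of raw term counts stands in for Pr(R | Di), and term frequency divided by the document's maximum term frequency stands in for Pr(Tj | Di). Both choices and all function names are my assumptions, not the exact formulas from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expansion_scores(page_terms, neighbor_docs):
    """Score each term by sum_i Pr(R | Di) * Pr(Tj | Di), up to a constant factor."""
    page_tf = Counter(page_terms)
    scores = Counter()
    for doc_terms in neighbor_docs:
        doc_tf = Counter(doc_terms)
        if not doc_tf:
            continue
        p_r_given_d = cosine(page_tf, doc_tf)  # assumed form of Pr(R | Di)
        max_tf = max(doc_tf.values())
        for term, tf in doc_tf.items():
            scores[term] += p_r_given_d * (tf / max_tf)  # assumed Pr(Tj | Di)
    return scores


def top_expansion_terms(page_terms, neighbor_docs, k):
    """Return the k highest-scoring candidate terms."""
    ranked = expansion_scores(page_terms, neighbor_docs).most_common(k)
    return [term for term, _ in ranked]
```

Passing the page itself as one of the neighbor documents keeps its own terms in the candidate set, so they compete with the borrowed terms on the same scale.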

Review Spotlight: A User Interface for Summarizing User-Generated Reviews Using Adjective-Noun Word Pairs


This is an interesting HCI paper on presenting reviews as adjective-noun phrases. Sentiment analysis gives the positiveness or negativeness of each mined phrase, which determines the phrase's color. The size of a phrase can be determined by its popularity. The interface helps the user form a better impression of the product of interest. The layout of the phrases is not very important. When the mouse hovers over a phrase, the corresponding reviews are listed (maybe with a smart summary?).
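The size and color mapping could be prototyped roughly as below. The sketch assumes the adjective-noun phrases and a per-mention sentiment score in [-1, 1] have already been extracted; the font size range, the color names, and the 0.1 neutrality cutoff are arbitrary choices of mine, not values from the paper.

```python
from collections import defaultdict


def phrase_styles(mentions, min_size=10, max_size=36):
    """Map (phrase, sentiment) mentions to a font size and color per phrase."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for phrase, sentiment in mentions:
        counts[phrase] += 1
        totals[phrase] += sentiment
    if not counts:
        return {}
    max_count = max(counts.values())
    styles = {}
    for phrase, count in counts.items():
        mean_sentiment = totals[phrase] / count
        # Popularity drives size: the most frequent phrase gets max_size.
        size = min_size + (max_size - min_size) * count / max_count
        # Average sentiment drives color.
        if mean_sentiment > 0.1:
            color = "green"
        elif mean_sentiment < -0.1:
            color = "red"
        else:
            color = "gray"
        styles[phrase] = {"font_size": round(size), "color": color}
    return styles
```

The returned dictionary can be handed to whatever front end renders the cloud, which keeps the styling logic independent of the layout.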

In a way, this interface is similar to word/tag clouds, but it does have some real advantages over them. Shall we implement one for some projects?