Sunday, November 6, 2011

Impedance Coupling in Content-Targeted Advertising


This paper expands the keywords of a web page using terms drawn from similar web pages. The expansion improves the retrieval of related ads. It differs from another approach, in which the expansion set is mined from a search engine.

The basic model is a Bayesian belief network. For each document D0, we take its most similar pages Dk and create links from each document to its related terms. On its own, D0 has too few related terms to be useful in retrieval (the vocabulary impedance problem). Adding the similar pages enlarges the candidate term set, but we still need to select which candidates are good for the current page. Take Pr(Di) as constant, the conditional probability Pr(R | Di) as some function of Di's similarity to D0, and Pr(Tj | Di) as a normalized term frequency. We can then evaluate Pr(Tj | R) for each candidate term Tj and use it as a ranking score to choose the expansion terms.
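To make the scoring step concrete, here is a minimal sketch in Python. It is my own reading of the model, not the paper's exact formulation: I assume R and Tj are conditionally independent given Di, so that with a constant Pr(Di) the score reduces to Pr(Tj | R) = sum_i Pr(R | Di) Pr(Tj | Di) / sum_i Pr(R | Di). I also assume Pr(R | Di) is the cosine similarity between Di and D0, and that Pr(Tj | Di) is the term count divided by the document length. The names `cosine` and `expansion_scores` are made up for this illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expansion_scores(d0_tokens, neighbor_tokens):
    """Score every candidate term Tj by an estimate of Pr(Tj | R).

    d0_tokens: list of terms of the page D0.
    neighbor_tokens: list of term lists, one per similar page Dk.
    Assumptions (not from the paper): Pr(R | Di) is cosine similarity
    to D0, and Pr(Tj | Di) is count / document length.
    """
    docs = [Counter(d0_tokens)] + [Counter(tokens) for tokens in neighbor_tokens]
    d0 = docs[0]

    # Pr(R | Di): assumed to be the similarity of Di to D0.
    relevance = [cosine(d0, doc) for doc in docs]
    total_relevance = sum(relevance)
    if total_relevance == 0:
        return {}

    scores = {}
    for doc, rel in zip(docs, relevance):
        length = sum(doc.values())
        if length == 0:
            continue
        for term, count in doc.items():
            # Pr(Tj | Di): normalized term frequency within Di.
            scores[term] = scores.get(term, 0.0) + rel * (count / length)

    # The constant Pr(Di) cancels between numerator and denominator.
    return {term: s / total_relevance for term, s in scores.items()}


if __name__ == "__main__":
    page = ["cheap", "flights"]
    similar_pages = [
        ["cheap", "flights", "airline", "tickets"],
        ["garden", "tools"],
    ]
    ranked = sorted(expansion_scores(page, similar_pages).items(),
                    key=lambda kv: kv[1], reverse=True)
    for term, score in ranked:
        print(f"{term}\t{score:.3f}")
```

Terms that appear only in pages unrelated to D0 get a score of zero, while terms from close neighbors (here "airline" and "tickets") get a positive score even though D0 never mentions them. In practice one would keep the top few terms by score as the expansion set.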
