This paper introduces a variant of CRF that adds constraints based on prior knowledge. Since we know that some words express a positive/negative sentiment, the corresponding feature f_{(\sigma, w)}, defined by

f_{(\sigma, w)}(y_j, x_j) = \begin{cases} 1 & y_j = \sigma,\ w \in x_j \\ 0 & \text{otherwise} \end{cases},

should have a larger/smaller parameter \mu_{(\sigma, w)}. That is to say, if w is a word indicating positive sentiment, then \mu_{(\sigma, w)} \geq \mu_{(\sigma', w)} whenever \sigma \geq \sigma'.
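To make the setup concrete, here is a minimal sketch in Python (the function names and parameter values are my own, not from the paper): the indicator feature as a closure, and a check that the parameters \mu_{(\sigma, w)} of a positive word are non-decreasing in \sigma.

```python
# A minimal sketch (not the paper's code). Sentiment labels are integers,
# larger meaning more positive; parameter values below are hypothetical.

def make_feature(sigma, w):
    """Indicator feature f_{(sigma, w)}: fires when the label at a position
    is sigma and the word w occurs in the observation at that position."""
    def f(y_j, x_j):
        return 1.0 if y_j == sigma and w in x_j else 0.0
    return f

def is_isotonic(mu, w, labels):
    """Check mu[(sigma', w)] <= mu[(sigma, w)] whenever sigma' <= sigma,
    i.e. the parameters of word w are non-decreasing along the label order."""
    vals = [mu[(sigma, w)] for sigma in sorted(labels)]
    return all(a <= b for a, b in zip(vals, vals[1:]))

labels = [-2, -1, 0, 1, 2]                        # ordered sentiment scale
mu = {(s, "great"): 0.3 * s for s in labels}      # hypothetical parameters
print(is_isotonic(mu, "great", labels))           # True: "great" is positive
```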
This style of constraints still leads to a convex optimization problem, since the constraints are linear in the parameters. The authors prove that, given a sequence x and two labellings s and t, letting x' = (x_1, \ldots, x_j \cup \{ w \}, \ldots, x_n), if \mu_{(t_j, w)} \geq \mu_{(s_j, w)}, then

\frac{\Pr(s \mid x)}{\Pr(s \mid x')} \geq \frac{\Pr(t \mid x)}{\Pr(t \mid x')}.
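The inequality is easy to check numerically on a toy chain CRF: adding w to x_j multiplies the unnormalized score of any labelling y by \exp(\mu_{(y_j, w)}), and the partition functions cancel when comparing s and t. Below is a brute-force sketch (all words, labels, and parameter values are invented for illustration) that enumerates every labelling to compute the probabilities exactly.

```python
# Brute-force check of the ratio inequality on a toy chain CRF.
# Everything here is made up for illustration; only the inequality
# itself comes from the paper.
import itertools, math

labels = [0, 1, 2]                                     # ordered sentiment labels
mu = {(sig, "great"): 0.5 * sig for sig in labels}     # monotone in sig
trans = {(a, b): 0.1 * (a == b) for a in labels for b in labels}

def score(y, x):
    s = 0.0
    for i, (y_i, x_i) in enumerate(zip(y, x)):
        for w in x_i:
            s += mu.get((y_i, w), 0.0)                 # word features
        if i > 0:
            s += trans[(y[i - 1], y_i)]                # transition features
    return s

def prob(y, x):
    Z = sum(math.exp(score(t, x))
            for t in itertools.product(labels, repeat=len(x)))
    return math.exp(score(y, x)) / Z

x = [{"good"}, {"movie"}, {"plot"}]
j = 1
xp = [x_i | {"great"} if i == j else x_i for i, x_i in enumerate(x)]  # add w to x_j

s, t = (0, 1, 0), (0, 2, 0)       # t_j > s_j, so mu[(t_j, w)] >= mu[(s_j, w)]
assert prob(s, x) / prob(s, xp) >= prob(t, x) / prob(t, xp)
print("inequality holds")
```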
This gives a new interpretation of the constraints. The model is then reparameterized with the Möbius inversion theorem (what is that?) and solved.
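As far as I can tell, on a totally ordered label set Möbius inversion reduces to finite differencing: a non-decreasing \mu can be rewritten as a cumulative sum of nonnegative increments \nu, which turns the ordering constraints into plain nonnegativity. A sketch of that idea (the paper's actual construction, over a general poset, may differ):

```python
# Reparameterization sketch: mu as cumulative sums of increments nu.
# Optimizing over nu >= 0 then enforces monotonicity of mu for free.
# This is my reading of the Mobius-inversion step, not the paper's code.

def mu_from_nu(nu):
    """Zeta transform on a chain: mu[k] = sum of nu[j] for j <= k."""
    out, total = [], 0.0
    for v in nu:
        total += v
        out.append(total)
    return out

def nu_from_mu(mu):
    """Mobius inversion on a chain: nu[k] = mu[k] - mu[k-1]."""
    return [mu[0]] + [b - a for a, b in zip(mu, mu[1:])]

nu = [0.25, 0.0, 0.5, 0.125]      # nonnegative increments (hypothetical)
mu = mu_from_nu(nu)               # [0.25, 0.25, 0.75, 0.875] -- non-decreasing
assert nu_from_mu(mu) == nu       # inversion recovers the increments
```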