(no subject)

Pete Whitelock (pete.whitelock@sharp.co.uk)
Fri, 19 Apr 1996 10:45:14 +0100

Can anyone explain to me why, in the standard tagger model, the lexical
probability is defined as the probability of the word given the tag rather
than the probability of the tag given the word? The latter would seem much
more intuitive (as well as easier to estimate), but is reported to give worse
results (e.g. the discussion in Charniak's book, p. 50). Is there a good
reason for this?
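
To make the two quantities concrete, here is a minimal sketch of how each
would be estimated by relative frequency from a tagged corpus. The toy corpus,
the tag names, and the function name lexical_estimates are all made up for
illustration; nothing here is taken from Charniak's book.

```python
from collections import Counter

# Hypothetical toy tagged corpus, invented purely for illustration.
TOY_CORPUS = [
    ("the", "DET"), ("can", "NOUN"), ("can", "VERB"), ("can", "VERB"),
    ("rust", "VERB"), ("the", "DET"), ("rust", "NOUN"), ("spreads", "VERB"),
]


def lexical_estimates(tagged_words):
    """Return relative-frequency estimates of P(word | tag) and P(tag | word).

    Both dictionaries are keyed by (word, tag) pairs seen in the corpus.
    """
    pair_counts = Counter(tagged_words)
    word_counts = Counter(word for word, _ in tagged_words)
    tag_counts = Counter(tag for _, tag in tagged_words)

    # P(word | tag): count(word, tag) / count(tag)
    p_word_given_tag = {
        (word, tag): count / tag_counts[tag]
        for (word, tag), count in pair_counts.items()
    }
    # P(tag | word): count(word, tag) / count(word)
    p_tag_given_word = {
        (word, tag): count / word_counts[word]
        for (word, tag), count in pair_counts.items()
    }
    return p_word_given_tag, p_tag_given_word


if __name__ == "__main__":
    p_wt, p_tw = lexical_estimates(TOY_CORPUS)
    print("P(can | VERB) =", p_wt[("can", "VERB")])
    print("P(VERB | can) =", p_tw[("can", "VERB")])
```

The two estimates share the same numerator and differ only in the normaliser:
P(word | tag) sums to 1 over the words for each tag, while P(tag | word) sums
to 1 over the tags for each word. They are related by Bayes' rule,
P(word | tag) = P(tag | word) P(word) / P(tag).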

Thanks,

Pete