Re: word frequency lists?

Judith Klavans (klavans@cs.columbia.edu)
Sun, 26 Nov 1995 23:39:04 -0500

My understanding of Ted's comment (and my own opinion) is that
he is not denying the usefulness of ``general'' or ``balanced''
corpora, but is simply pointing out some oddities of the data
ht sent out. Indeed, deviation from the norm is a common way
of determining topic, domain, and so on. But the establishment
of ``the norm'' or the baseline is what he was commenting on.
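
To make the "deviation from the norm" idea concrete, here is a minimal
sketch, not anything Ted or ht actually ran. It assumes both the sample
and the baseline are already tokenized, and it scores each word by a
smoothed log-ratio of relative frequencies. The function names and the
add-one smoothing are my own illustrative choices; other measures would
serve equally well.

```python
import math
from collections import Counter


def deviation_scores(sample_tokens, baseline_tokens, smoothing=1.0):
    """Score each word by log(sample rel. freq / baseline rel. freq).

    Hypothetical helper: additive smoothing keeps words that are
    missing from one of the two corpora from producing a zero division.
    """
    sample = Counter(sample_tokens)
    baseline = Counter(baseline_tokens)
    vocab = set(sample) | set(baseline)
    # Smoothed totals, so the relative frequencies still sum to 1.
    n_sample = sum(sample.values()) + smoothing * len(vocab)
    n_baseline = sum(baseline.values()) + smoothing * len(vocab)
    return {
        w: math.log(
            ((sample[w] + smoothing) / n_sample)
            / ((baseline[w] + smoothing) / n_baseline)
        )
        for w in vocab
    }


def top_deviating(sample_tokens, baseline_tokens, k=3):
    """Return the k words most over-represented relative to the baseline."""
    scores = deviation_scores(sample_tokens, baseline_tokens)
    # Sort by descending score, then alphabetically to break ties.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


baseline = "the cat sat on the mat the dog sat".split()
sample = "the parser the parser the grammar".split()
print(top_deviating(sample, baseline, k=2))
```

Every score here is relative to whichever baseline is supplied, so a
skewed baseline skews every judgment of what counts as distinctive.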

It's not as easy a task as it seems; one might have a difficult
time judging when one has attained the ``right balance''.
However, the alternative is not to collect a specific corpus
for each application, but simply to be aware of the deficiencies
and limitations.

Judith Klavans