Re: Corpora: Question about a Brown Corpus tag

From: Andrew Harley (aharley@cambridge.org)
Date: Thu Aug 17 2000 - 18:20:24 MET DST


    >Some tag
    >definitions in Brown were clearly decided by what TAGGIT found computable;
    >I *guess* linguistic inconsistencies in tagging some words may be down to
    >drawing boundaries on grounds of computational tractability rather than
    >purely linguistic reasons (or, to be more fair, when two or more
    >conflicting linguistic criteria were available (eg form v function),
    >computational tractability was a deciding factor)

    This explains how so many taggers can claim 95% or higher success rates!

    I also know of taggers that tagged IN as "preposition or conjunction" on
    the same grounds.
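
    As a rough illustration of why merged tags flatter a tagger's score,
    here is a minimal Python sketch. The token tags are invented toy data,
    not Brown Corpus output, and the helper names (accuracy, merge_tags,
    MERGE) are hypothetical. It assumes CS as the subordinating-conjunction
    tag alongside IN for prepositions, and scores the same predictions
    twice: once with the two tags kept apart, once with them collapsed
    into a single "preposition or conjunction" tag.

```python
def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("tag sequences must have the same length")
    matches = sum(g == p for g, p in zip(gold, predicted))
    return matches / len(gold)


def merge_tags(tags, mapping):
    """Replace each tag by its merged form; unmapped tags pass through."""
    return [mapping.get(tag, tag) for tag in tags]


# Invented toy data: the tagger confuses IN and CS on two of ten tokens.
gold = ["IN", "CS", "IN", "NN", "CS", "VB", "IN", "NN", "CS", "NN"]
predicted = ["IN", "IN", "IN", "NN", "CS", "VB", "CS", "NN", "CS", "NN"]

# Hypothetical coarse tagset: one tag for "preposition or conjunction".
MERGE = {"IN": "IN/CS", "CS": "IN/CS"}

fine = accuracy(gold, predicted)
coarse = accuracy(merge_tags(gold, MERGE), merge_tags(predicted, MERGE))

print(f"fine-grained: {fine:.0%}, merged: {coarse:.0%}")
```

    On this toy data every error is an IN/CS confusion, so merging the two
    tags removes all of them from the count; with real data the gain would
    depend on how much of the error mass sits on the merged boundary.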

    Different users have different needs. For lexicographers, the
    distinctions between these forms for WDT and IN are important, as is the
    less commonly made distinction between VB-transitive and VB-intransitive.

    Andrew Harley
    Systems Development Manager
    English Language Teaching & Dictionaries
    Cambridge University Press
    http://www.cambridge.org/elt

    Direct line: (01223)325880

    Cambridge Dictionaries Online (4 million searches since August 1999):
    http://dictionary.cambridge.org

    Cambridge International Dictionary of English on CD-ROM:
    http://www.cambridge.org/elt/cide


