Re: Corpora: web-search

From: Ken Litkowski (ken@clres.com)
Date: Fri May 05 2000 - 20:45:17 MET DST

    I would hope that this tool may be useful to lexicographers as you have
    configured it. Might I suggest, in addition to the suggestions already
    made, that an output option include a format like that used in Senseval,
    since there are many in the computational linguistics community who have
    used that format for word-sense disambiguation studies.

    The format would be a line with an identifier and then up to three
    sentences of the source text, with the last sentence containing the
    bracketed target word. Coverage need not be all-inclusive: run a simple
    sentence-splitter over each corpus instance, and if it does not yield a
    usable set of sentences, just discard that instance. The result would
    provide great training data.

            Ken

    -- 
    Ken Litkowski                     TEL.: 301-482-0237
    CL Research                       EMAIL: ken@clres.com
    9208 Gue Road
    Damascus, MD 20872-1025 USA       Home Page: http://www.clres.com
    


