Corpora: Seeking dependency-annotated non-English corpora

From: Philip Resnik (resnik@umiacs.umd.edu)
Date: Thu Dec 14 2000 - 02:07:08 MET

    Greetings and happy holidays!

    I'm looking for non-English corpora annotated with (or accompanied by)
    information about syntactic dependencies, also known as grammatical
    relations. For my purposes even collections of only tens or hundreds
    of annotated sentences are potentially helpful, although as always
    more is better. Data with parallel English translations would be
    wonderful, but that's probably too much to hope for.

    French, Spanish, Chinese, and Arabic are rather high on my list,
    though information on other languages would be useful. (Yes, I
    already know about PDT for Czech.) Non-English corpora annotated by
    automatic parsers, rather than by hand, could still be useful -- for
    example, I would be interested in parser output
    consistent with the SPARKLE Level-2 annotation scheme, or in the
    output of algorithms that identify particular predicate-argument
    relations like subject and object.

    Please reply privately, and I'll post a summary if there's interest.

    Best,

      Philip

      ----------------------------------------------------------------
      Philip Resnik, Assistant Professor
      Department of Linguistics and Institute for Advanced Computer Studies

      1401 Marie Mount Hall            UMIACS phone: (301) 405-6760
      University of Maryland           Linguistics phone: (301) 405-8903
      College Park, MD 20742 USA       Fax: (301) 405-7104
      http://umiacs.umd.edu/~resnik    E-mail: resnik@umiacs.umd.edu


