Corpora: Children's or Graded Corpora Query Results

From: T Murphy (tmorpheme@hotmail.com)
Date: Thu Jul 06 2000 - 03:40:29 MET DST


    I believe it's my obligation to report back to the list on the results of my
    query. I asked about the "existence of children's corpora -- in the sense of
    books for children, whether textbooks, novels for children or young adults,
    vocabulary-graded readers and so on".

    The results weren't fantastic, but it looks as though the situation may
    improve over the next few years. Copyright problems seem to be a major
    concern at the moment, slowing progress on the corpora that are being created.

    1. Quentin Allan, who is with the Teachers of English Language Education
    Centre (TELEC) in the Department of Curriculum Studies at the University of
    Hong Kong, provided me with what I thought was the most promising lead.

    He is in the process of developing a TeleCorpus, a computer-based collection
    of writing which is relevant to primary and secondary teachers in Hong Kong.

    TeleCorpus is in two parts: the TeleNex Learner Corpus (primary and
    secondary), and the TeleNex Corpus of Modern English. It's part of the
    TELEC project (Teachers of English Language Education Centre).

    The relevant part may be found here: http://www.telenex.hku.hk

    They're also developing a concordancer for English teachers to use as an
    access tool for the various corpora they've developed.

    The TeleNex Corpus collection consists of:

         a corpus of primary students' writing
         a corpus of secondary students' writing (and transcriptions of oral
           presentations from Form 7 students)
         a corpus of general English (transcriptions of spoken English from the
           UK, feature articles from the SCMP, etc.)
         a corpus of texts relevant to the primary-level English classroom
           (graded readers, fairy tales, etc.)

    2. For classic children's literature, I was reminded by Christopher Tribble
    to check out Project Gutenberg http://www.promo.net/pg/.

    3. Prof. Geoffrey Sampson noted that "some people at Leeds gathered corpora
    of children's own writing at two ages, I think 9 and 12, back in the 1960s
    or 1970s". Prof. Sampson has copies of the published version.

    4. Tony McEnery has constructed some small corpora of the writing of
    children.

    5. David Lee reminded the corpora list that the BNC contains a large number
    of texts by both children and teens.

    6. Finally, Leonel Ruiz of the Center of Applied Linguistics in Santiago de
    Cuba, Cuba, wrote to me that the Center has done a study on the vocabulary of
    Cuban children and that they have an interesting corpus of children's
    vocabulary in Spanish.

    My thanks to all who responded to the query.

    Dr. Terry Murphy
    Department of English,
    Yonsei University
    Seoul, Korea



