Corpora: Czech National Corpus

From: Pavel Kveton (kveton@slivka.ff.cuni.cz)
Date: Thu Dec 21 2000 - 11:49:15 MET

    In November 2000, SYN2000, a representative 100-million-word corpus of
    written Czech, was officially released for non-commercial use.

    It is a major part of the Czech National Corpus project, which also includes
    other, smaller corpora that will gradually be released as well.
    SYN2000 is essentially a corpus of contemporary Czech (its newspaper texts,
    for example, date from 1991-1999) and is intended for a wide range of
    research, for dictionary-makers, etc. Access to it can be arranged free of
    charge, on signing a written statement, through http://ucnk.ff.cuni.cz,
    which also provides some additional information. The same address also
    offers public, somewhat limited access to some 20 million words of the
    large corpus. An accompanying book about the Czech National Corpus,
    containing a manual for using SYN2000, has just come out and is available
    from the Institute of the Czech National Corpus, which is responsible for
    the corpora developed under the project.

    Professor Frantisek Cermak


