In November 2000, a 100-million-word representative corpus of written Czech,
called SYN2000, was officially released for non-commercial use.
It is a major part of the Czech National Corpus project, which also comprises
other, smaller corpora that will be released gradually as well.
SYN2000 is essentially a corpus of contemporary Czech (its newspaper texts,
for example, date from 1991-1999) and is intended for many kinds of
research, for dictionary-makers, etc. Access to it is free of charge and can
be arranged, upon signing a written statement, through
http://ucnk.ff.cuni.cz, which also provides some additional information. In
addition, the same address offers public, somewhat limited access to some
20 million words of the large corpus. An accompanying book about the Czech
National Corpus, containing a manual for using SYN2000, has just come out and
is available from the Institute of the Czech National Corpus, which is
responsible for the corpora developed under the project.
Professor Frantisek Cermak