Corpora: Parallel corpus

From: Tomaz Erjavec (Tomaz.Erjavec@ijs.si)
Date: Wed Dec 20 2000 - 19:38:51 MET


    Yuliya Katsnelson writes:
    > I am looking for a parallel corpus (news, etc.) in English and
    > optimally, Eastern European languages. The second-best scenario would

    If you are happy with Slovene-English, you can download or search one at
    http://nl.ijs.si/elan/#corpus

    Tadeus has already mentioned the (double) TELRI CD-ROM, which has on
    one CD Plato's 'Republic' in over 20 languages (mostly Eastern
    European), and on the other the MULTEXT-East corpus (6 Eastern
    European languages, ca. 300,000 words per language; cf.
    http://nl.ijs.si/ME/). I think the CD-ROM is currently out of print,
    but check with http://www.telri.de/cdrom.html
    Also, have a look at the TELRI Tractor archive, http://www.tractor.de/

    Another possibility, though a far cry from newspapers, is the Linux
    Documentation Project localisation files. They are copyleft and easy
    to get; for example, you can pick up the complete KDE desktop
    localisation from http://i18n.kde.org/translation_archive/

    Good luck,
    Tomaz

    -- 
    Tomaz Erjavec                  | Dept. for Intelligent Systems E-8
    email: tomaz.erjavec@ijs.si    | Jozef Stefan Institute
    www:   http://nl.ijs.si/et/    | Jamova 39
    fax:   (+386 1) 425-1038       | SI-1000 Ljubljana, Slovenia
    


