Re: Corpora: non-alphabetic language databases

From: Mike Maxwell (mike_maxwell@sil.org)
Date: Thu Nov 30 2000 - 14:45:13 MET

    >3.) SIL international are developing a font rendering
    >engine called Graphite which should be able to be
    >embedded in corpus processing systems.

    As an SIL member, but not one directly involved with Graphite (or non-Roman
    font rendering systems in general), I'll just say a couple of things:

    1. For info on the Graphite project, see
    http://www.sil.org/computing/graphite/.

    2. The question of writing systems for signed languages came up a couple
    weeks ago. As I understand it, the de facto standard for writing among
    native "speakers" of American Sign Language (as opposed to researchers) is
    "SignWriting"; see http://www.signwriting.org/. The developers there have
    a (proprietary) system for keying in and rendering (displaying) SignWriting.
    I don't believe that system is "in" Unicode yet, although I could be wrong.

    3. Font rendering problems come up in a variety of "non-Roman" writing
    systems, including alphabetic ones--namely, in any writing system where the
    shape or positioning of glyphs (the displayed form of characters) is context
    sensitive, or the direction of writing is not strictly left-to-right and
    top-down. This includes things like the two forms of lower case sigma in
    Greek (one word-final, the other elsewhere), Arabic-based systems
    (right-to-left), systems in which vowel letters appear above or below the
    consonant letters (Massoretic Hebrew, I think, and many alphabetic systems
    of SE Asia), etc. Even the IPA transcription system has some of
    this--diacritics which otherwise appear below a base character instead
    appear above characters that have a descender (e.g. the voiceless symbol
    when used with the engma). In widely-used writing systems (e.g. Greek and
    Arabic), these problems have been addressed at the operating system level;
    thus, there are Middle Eastern versions of Microsoft Windows. The problem
    is worse with writing systems which are not commercially viable, for which
    there are at present few solutions.

    4. Apple has some solutions for the above issues, and Microsoft is now
    putting more effort into them as well. But at present, I think it's safe to
    say that no one covers all the writing systems.

    5. Treat all the above as non-official, uninformed speculation on my part
    :-). In case of disagreement, there will be a recount.
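
    To make point 3 a bit more concrete, here is a rough Python sketch of
    three of the context-sensitive cases mentioned there: the two forms of
    Greek lower case sigma, right-to-left direction, and the IPA voiceless
    mark moving above a base character that has a descender. The function
    names and the DESCENDERS set are made up for illustration and are not
    taken from Graphite or any other rendering engine; only the calls to the
    standard unicodedata module are real. A real renderer does far more than
    this.

```python
import unicodedata

MEDIAL_SIGMA = "\u03c3"  # GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03c2"   # GREEK SMALL LETTER FINAL SIGMA

RING_BELOW = "\u0325"    # COMBINING RING BELOW (usual IPA voiceless mark)
RING_ABOVE = "\u030a"    # COMBINING RING ABOVE (used when there is a descender)

# Hypothetical, incomplete set of base characters with a descender.
DESCENDERS = {"\u014b", "g", "j", "p", "q", "y"}  # \u014b is the eng


def sigma_form(word: str) -> str:
    """Use the final form of sigma at the end of a word, the medial form elsewhere."""
    if not word:
        return word
    body = word[:-1].replace(FINAL_SIGMA, MEDIAL_SIGMA)
    last = word[-1]
    if last in (MEDIAL_SIGMA, FINAL_SIGMA):
        last = FINAL_SIGMA
    return body + last


def voiceless(base: str) -> str:
    """Attach the voiceless ring below the base, or above it if the base descends."""
    mark = RING_ABOVE if base in DESCENDERS else RING_BELOW
    return base + mark


def is_right_to_left(ch: str) -> bool:
    """True if Unicode gives the character a right-to-left bidirectional class."""
    return unicodedata.bidirectional(ch) in ("R", "AL")


if __name__ == "__main__":
    print(sigma_form("\u03c3\u03bf\u03c6\u03bf\u03c3"))  # Greek word with two sigmas
    print(voiceless("\u014b"), voiceless("n"))
    print(is_right_to_left("\u0627"), is_right_to_left("a"))  # Arabic alef vs. Latin a
```

    Note that the sketch only picks characters; deciding where the ring is
    actually drawn relative to the base glyph is still left to the font and
    the rendering engine, which is the hard part.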

                                             Mike Maxwell
                                             SIL
                                             Mike_Maxwell@sil.org


