Dear list members,
last week I asked for information about computational morphology
systems that deal with word-formation. I received a number of helpful
replies - thank you very much.
I would like to thank the following colleagues:
Antti Arppe
Janne Bondi Johannessen
Rodolfo Delmonte
Sergei A. Koval
Kemal Oflazer
Dan Tufis
Alexander S. Yeh
Below I have summarized, by language, the responses I got plus the
information on word-formation systems that I already had. I have given
the URL where available and commented where I could (I haven't yet had
the time to look at all the links and papers that were provided, but I
will certainly try to do so).
--- English
ALE-RA: http://nl.ijs.si/et/Thesis/ALE-RA/
"Alexander S. Yeh" wrote:
> UMLS (Unified Medical Language System?) is a U.S. Government program that
> provides among other things, a free morphological variation system for
> mainly English medical terms.
---- Finnish (& other languages)
Antti Arppe wrote:
> A Finnish language technology company, Lingsoft <www.lingsoft.fi> has
> used their morphological models (based on the two-level principle and
> model by Koskenniemi) for generating inflected word forms in
> inflecting thesauri, i.e. synonym dictionaries that can handle the
> inflected forms of the synonyms as well. The languages that were
> covered are Finnish, Swedish, Norwegian (bokmål), Danish and German.
>
> There's a short presentation on part of this in the Proceedings of the
> 17th Scandinavian Conference of Linguistics: Arppe, Antti; Voipio,
> Mari; Würtz, Malene 2000. Creating Inflecting Electronic Thesauri. In
> Lindberg, Carl-Erik & Nordahl Lund, Steffen 2000. 17th Scandinavian
> Conference of Linguistics Odense Working Papers in Language and
> Communication, No. 19, Vol. I, Institute of Language and
> Communication, University of Southern Denmark.
>
> In the case of these software tools, the generation was geared for the
> (limited) synonym content. In principle the same models could be
> applied for the language as a whole, but there are a variety of
> factors that have to be considered in such a case, e.g. variant
> inflected forms and errors in the underlying linguistic model which
> become apparent only when generation is applied.
>
> Though I have been talking here mostly about inflection, specifically
> the Finnish model has had a version where both derivations and
> inflections can be generated from root words, e.g.
>
> ympäri+dv-oida+dn-minen+nom+sg > ympäröiminen
> around+verbalize+nominalize+nominative+singular > encirclement
>
> I believe that this could be adapted rather easily to the other
> languages as well, since they're all based on the same theoretical
> principle, i.e. the TWOL model which allows to be used for both
> morphological analysis and generation. Nevertheless, Lingsoft has not
> been otherwise very active regarding these tools, as far as I know.
Comment: I am familiar with GerTWOL, the German version of TWOL. A link is given below.
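To make the structure of the analysis string in Antti Arppe's example concrete, here is a minimal Python sketch that splits such a string into root, derivational steps and inflectional tags, and rebuilds the gloss line from it. It is only an illustration, not Lingsoft's or TWOL's actual format handling: the function names are hypothetical, the assumption that a tag containing a hyphen is a derivational step of the form label-suffix is mine, and the gloss table is taken solely from the one example quoted above.

```python
# Glosses read off the single quoted example; not a complete tag set.
GLOSS = {"dv": "verbalize", "dn": "nominalize", "nom": "nominative", "sg": "singular"}


def parse_analysis(analysis):
    """Split a '+'-separated analysis string into root, derivations, inflections.

    Assumption: tags of the form 'label-suffix' (e.g. 'dv-oida') are
    derivational steps; all other tags are inflectional.
    """
    root, *tags = analysis.split("+")
    derivations = []
    inflections = []
    for tag in tags:
        if "-" in tag:
            label, suffix = tag.split("-", 1)
            derivations.append((label, suffix))
        else:
            inflections.append(tag)
    return {"root": root, "derivations": derivations, "inflections": inflections}


def gloss(analysis, root_gloss):
    """Rebuild the gloss line; the gloss of the root is supplied by the caller."""
    parsed = parse_analysis(analysis)
    parts = [root_gloss]
    parts += [GLOSS[label] for label, _ in parsed["derivations"]]
    parts += [GLOSS[tag] for tag in parsed["inflections"]]
    return "+".join(parts)


parsed = parse_analysis("ympäri+dv-oida+dn-minen+nom+sg")
gloss_line = gloss("ympäri+dv-oida+dn-minen+nom+sg", "around")
```

The sketch only reads the tag string. Producing the surface form ympäröiminen from it is the job of the two-level rules and lexicon, which are not modelled here.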
---- German
DeKo (for Derivation und Komposition, IMS, University of Stuttgart; this is the project I worked in :-) http://www.ims.uni-stuttgart.de/projekte/DeKo
Projekt Deutscher Wortschatz (University of Leipzig): http://wortschatz.uni-leipzig.de
Deutsche Malaga Morphologie (University of Erlangen): http://www.linguistik.uni-erlangen.de/~orlorenz/DMM/DMM.html
CISLEX (University of Munich): http://www.cis.uni-muenchen.de/projects/CISLEX.html
GerTWOL (Lingsoft Inc.): http://www.lingsoft.fi/cgi-bin/gertwol
and there is a German version of WordManager (University of Basel & Canoo) http://www.wordmanager.com
--- Italian
Rodolfo Delmonte wrote:
>
> As to the morphology word formation system, of course we have our
> system for Italian (IMMORTALE) that generates/analyses derivations
> besides inflections. But no compound word, at least not yet. Even
> though we could regard cliticized verbs as a special type of compound
> word,
> - lasciamoglielo / (let's) leave it to him
> it requires clitic stripping and then inflection stripping, perhaps
> with derivation stripping too, in case the verb is not included in
> the dictionary list.
> There's a number of published papers on it, they are listed in my website.
> website: http://project.cgm.unive.it
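The first step Rodolfo Delmonte mentions, clitic stripping, can be sketched in Python roughly as follows. This is a toy illustration under my own assumptions and says nothing about how IMMORTALE is implemented: the clitic inventory is a small hand-picked list, the lexicon of inflected verb forms is a hypothetical one-entry set, and the function name is invented. Clitics are peeled off from the right, with backtracking, until the remainder is a known verb form.

```python
# Toy inventory of enclitic pieces (an assumption, not IMMORTALE's list).
CLITICS = ("glie", "gli", "mi", "ti", "ci", "vi", "si", "lo", "la", "li", "le", "ne")

# Hypothetical mini-lexicon of inflected verb forms, for illustration only.
VERB_FORMS = {"lasciamo"}


def strip_clitics(word, verb_forms=VERB_FORMS):
    """Return (verb_form, [clitics in surface order]) or None if nothing fits."""
    if word in verb_forms:
        return word, []
    for clitic in CLITICS:
        if word.endswith(clitic) and len(word) > len(clitic):
            # Try this clitic; fall through to the next one if the rest fails.
            result = strip_clitics(word[: -len(clitic)], verb_forms)
            if result is not None:
                host, clitics = result
                return host, clitics + [clitic]
    return None


segmented = strip_clitics("lasciamoglielo")
not_cliticized = strip_clitics("parlo")
```

Checking the remainder against a lexicon is what keeps an ordinary form such as parlo from being split into par + lo. The inflection stripping (and possibly derivation stripping) that would follow for verbs missing from the dictionary is not covered by this sketch.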
--- Norwegian
Janne Bondi Johannessen wrote:
> For Norwegian, we have a compound analyser that also analyses
> productive derivation as part of our morphological tagger. It can be
> tested at : http://decentius.hit.uib.no:8005/cl/cgp/test.html
--- Romanian
Dan Tufis wrote:
>
> For Romanian I can give you at least three examples:
> 1) Dan Cristea's morphological analyser/generator in the early 1980's
> 2) my PARADIGM morphology learning system
> (described in the EACL89 proceedings: Tufis D., "It Would Be Much Easier If
> WENT Were GOED",
> in Harry Somers, Mary McGee Wood (eds.), Proceedings of the 4th EACL,
> Manchester, 1989, pp.145-152
> and in EACL91: Tufis D., Popescu O., "A Unified Management and Processing of
> Word-Forms, Idioms and Analytical Compounds", in Jurgen Kunze and Dorothy
> Reinman (eds.), Proceedings of the 5th EACL, Berlin, 1991, pp.95-100)
> 3) Dan Cristea's MICH classification-based system
> (described in Dan Cristea (1994): The Classification Language MICH, Research
> Report, LIMSI-CNRS, Universite Paris-Sud, Orsay.
> Dan Cristea (1993): The generation of Romanian Morphology. Research Report.
> University of Edinburgh).
>
> There is a new C-based PC-implementation of the LISP system 1) due to Stefan
> Andrei of University A.I. Cuza in Iasi
> (described in Andrei, St.: A Morphological Analyser for Romanian Language.
> The First EUROLAN Summer School
> in Natural Language Processing , Iasi - Romania, July 19-29, 1993)
--- Russian
"Sergei A. Koval" wrote:
>
> As for Russian, there is a system called RUSLO (abbreviated from the Russian
> "RUSskoye SLOvoobrazovaniye" = "Russian Derivation") developed by
> N.N.Pertsova, A.V.Cheremkhin, A.V.Rafaeva.
> Some details are available at
> http://194.226.57.46/uvk1838/Sciper/volume1/pertsova.htm
--- Turkish
Kemal Oflazer wrote:
>
> You may want to take a look at the morphological analyzer for Turkish
> reachable from http://www.sabanciuniv.edu/fens/people/oflazer/
Comment: I have tried this one out. It seems to do quite a lot, and it is especially interesting since it treats both word formation and inflection.
--- Multilingual
Word-Manager (German, English, Italian, ...)
---
More general information about morphology systems (dealing mostly with inflection) can be found at:
http://www.sil.org/computing/comp-morph-phon.html
http://www.xrce.xerox.com/competencies/content-analysis/fsnlp/morph.en.html
--
Dr. Anke Lüdeling
Institut für Kognitionswissenschaft, Universität Osnabrück
Katharinenstr. 24, 49069 Osnabrück, Germany
phone: +49-541-9694073
fax: +49-541-9696210
homepage: http://www.cogsci.uni-osnabrueck.de/~aluedeli