Corpora: RE: Swedish taggers & parsers

From: Atro Voutilainen (atro.voutilainen@conexor.fi)
Date: Fri Oct 06 2000 - 12:59:12 MET DST


    A few days ago I posted a query about references to tagging & parsing of Swedish.
    Attached are the results. I wish to thank the following people for their help:

    Daniel Ridings
    Dimitrios Kokkinakis
    Jakub Zavrel
    Joakim Nivre
    Jussi Karlgren
    Nikolaj Lindberg

    I would like to ask a second question: are there any available
    collections of Swedish sentences that exemplify different grammatical
    phenomena in Swedish? In return for pointers, I'll post a summary.

    Thanks,
    Atro Voutilainen

    -- 
    Atro Voutilainen                              mobile: +358 50 5437452
    Conexor oy                                       fax: +358 9 37468502
    Helsinki Science Park                     atro.voutilainen@conexor.fi
    Koetilantie 3, 00710 Helsinki, Finland          http://www.conexor.fi
    
    

    parsing

    Björn Gambäck. 1997. Processing Swedish Sentences: A Unification-Based Grammar and some Applications. Doctor of Engineering Thesis, The Royal Institute of Technology and Stockholm University, Dept. of Computer and Systems Sciences, Stockholm, Sweden, June. Also available as SICS Dissertation Series 21, Swedish Institute of Computer Science, Kista, Sweden.

    Kokkinakis D. and Johansson Kokkinakis S. (1999), A Cascaded Finite-State Parser for Syntactic Analysis of Swedish, In Proceedings of the 9th EACL (European Chapter of the Association for Computational Linguistics), Bergen, Norway.

    morphology, tagging

    Most papers and pointers are related to tagging (morphology, POS, morphosyntactic functions).

    Karlgren & Cutting paper on implementing an HMM tagger for Swedish, Proc. Nodalida '93, Stockholm.

    Martin Eineborg & Nikolaj Lindberg, 1999. Improving Part of Speech Disambiguation Rules by Adding Linguistic Knowledge. In Proceedings of the Ninth International Workshop on Inductive Logic Programming ( ILP'99 ), Bled, Slovenia.

    Brants & Samuelsson 1995. Tagging the Teleman Corpus. Procs. Nodalida'95. Helsinki.

    "SWETWOL: A Comprehensive Morphological Analyzer for Swedish". Nordic Journal of Linguistics 15, 1992, 1-45.

    Juhani Birn, Lingsoft, Inc., 1998. Swedish Constraint Grammar: A Short Presentation. http://www.lingsoft.fi/doc/swecg/intro/.

    Eineborg, Martin and Lindberg, Nikolaj (1998). Induction of Constraint Grammar-Rules Using Progol. In Proceedings of The Eighth International Conference on Inductive Logic Programming (ILP'98), pages 116-124, Madison, Wisconsin.

    Lindberg, Nikolaj and Eineborg, Martin (1998). Learning Constraint Grammar-style disambiguation rules using Inductive Logic Programming. In Proceedings of COLING/ACL'98, volume II, pages 775-779, Montreal, Canada.

    Lindberg, Nikolaj and Eineborg, Martin (1999). Improving Part of Speech Disambiguation Rules by Adding Linguistic Knowledge. In Sašo Džeroski and Peter Flach, editors, Proceedings of the Ninth International Workshop on Inductive Logic Programming (ILP'99), pages 186-197, Bled, Slovenia.

    Eineborg, M. and Lindberg, N. (2000). ILP in Part-of-Speech Tagging - An Overview. In James Cussens and Sašo Džeroski, editors, Learning Language in Logic, volume 1925 of LNAI. Springer.

    Lager, Torbjörn (1999). The µ-TBL System: Logic Programming Tools for Transformation-Based Learning. In Proceedings of CoNLL'99, Bergen, Norway.

    Carlberger, Johan and Kann, Viggo (1999). Implementing an Efficient Part-of-Speech Tagger. To appear. Available at http://www.nada.kth.se/theory/projects/granska/

    Ridings, Daniel (1998). SUC and the Brill tagger. GU-ISS-98-1 (Research Reports from the Department of Swedish, Göteborg University).

    Torbjörn Lager has done some work on learning CG rules using an error-driven, transformation-based learning approach. See http://www.ling.gu.se/~lager/.

    Nivre, J., Grönqvist, L., Gustafsson, M., Lager, T. & Sofkova, S. (1996) Tagging Spoken Language Using Written Language Statistics. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96). Copenhagen: Center for Language Technology.

    Nivre, J. (2000) Sparse Data and Smoothing in Statistical Part-of-Speech Tagging. Journal of Quantitative Linguistics, 7(1), 1-17.

    Nivre, J. & Grönqvist, L. (in press) Tagging a Corpus of Spoken Swedish. To appear in International Journal of Corpus Linguistics.

    On the ILK webpage in Tilburg, we have a demo on-line of a Memory-based Swedish tagger. It's been trained on the SUC corpus... The URL is: http://ilk.kub.nl/~zavrel/tagtest.html

    www.lexilogik.com: Under "Demonstrations" we have a demo of our tagger.

    Kokkinakis D. and Johansson Kokkinakis S. (1997), A Robust and Modularized Lemmatizer/Tagger for Swedish Based on Large Lexical Resources, Research Reports from the Department of Swedish, GU-ISS-97-1, Språkdata.

    www.conexor.fi: Swedish tagger and light parser


