Corpora: Last Call: COLING-2000 Workshop on Toolsets and Architectures

From: Remi Zajac (rzajac@crl.nmsu.edu)
Date: Sun May 14 2000 - 19:42:58 MET DST

    Call for Papers for the

    COLING-2000 Workshop on Using Toolsets and Architectures To Build NLP Systems

    Centre Universitaire, Luxembourg, 5 August 2000

    (see also this call at http://crl.nmsu.edu/Events/COLING00)

    Background

    The purpose of the workshop is to present the state of the art in NLP
    toolsets and workbenches that can be used to develop multilingual
    and/or multi-application NLP components and systems. Although
    technical presentations of particular toolsets are of interest, we
    would like to emphasize methodologies and practical experiences in
    building components or full applications using an NLP
    toolset. Combined demonstrations and paper presentations are strongly
    encouraged.

    Many toolsets have been developed to support the implementation of
    single NLP components (taggers, parsers, generators, dictionaries) or
    complete Natural Language Processing applications (Information
    Extraction systems, Machine Translation systems). These tools aim at
    facilitating and lowering the cost of building NLP systems. Since the
    tools themselves are often complex pieces of software, they require a
    significant amount of effort to be developed and maintained in the
    first place. Is this effort worth the trouble? Note that NLP toolsets
    have often been developed originally to implement a single component
    or application. In that case, why not build the NLP system using a
    general-purpose programming language such as Lisp or Prolog? There
    are at least two answers. First, purely for reasons of efficiency
    (speed and space), it is often preferable to build a parameterized
    algorithm operating on a uniform data structure (e.g., a
    phrase-structure parser). Second, it is harder, and often impossible,
    to develop, debug and maintain a large NLP system written directly in
    a general-purpose programming language.

    It has been the experience of many users that a given toolset is quite
    often unusable outside its environment: the toolset can be too
    restricted in its purpose (e.g. an MT toolset that cannot be used for
    building a grammar checker), too complex to use, or even too difficult
    to install. There have been efforts, particularly in the US under the
    Tipster program, to promote, instead, common architectures for a given
    set of applications (primarily IR and IE in Tipster; see also the
    Galaxy architecture of the DARPA Communicator project). Several
    software environments have been built around this flexible concept,
    which is closer to current trends in mainstream software engineering.

    The workshop aims at providing a picture of the current problems faced
    by developers and users of toolsets, and future directions for the
    development and use of NLP toolsets. We encourage reports of actual
    experiences in the use of toolsets (complexity, training, learning
    curve, cost, benefits, user profiles) as well as presentations of
    toolsets concentrating on user issues (GUIs, methodologies, on-line
    help, etc.) and application development. Demonstrations are also
    welcome.

    Audience

    Researchers and practitioners in Language Engineering, users and
    developers of tools and toolsets.

    Issues

    Although individual tools (such as POS taggers) have their uses, they
    typically need to be integrated into a complete application (e.g. an IR
    system). Language Engineering issues in toolsets and architectures
    include (in no particular order):

      Practical experience in the use of a toolset;
      Methodological issues associated with the use of a toolset;
      Benefits and deficiencies of toolsets;
      User (linguist/programmer) training and support;
      Adaptation of a tool (or toolset) to a new kind of application;
      Adaptation of a tool to a new language;
      Integration of a tool in an application;
      Architectures and support software;
      Reuse of data resources vs. processing components;
      NLP algorithmic libraries.

    Format of the Workshop

    The one-day workshop will include twelve presentation periods, each
    consisting of a 20-minute presentation followed by 10 minutes
    reserved for discussion. We encourage the authors to focus on the
    salient points of their presentation and identify possibly
    controversial positions. There will be ample time set aside for
    informal and panel discussions and audience participation. Please note
    that workshop participants are required to register at
    http://www.coling.org/reg.html.

    Deadlines

       21 May 2000: Submission deadline.
       11 June 2000: Notification to authors.
       24 June 2000: Final camera-ready copy.
       5 August 2000: COLING-2000 Workshop.

    Submission Format

    Send submissions of no more than 6 pages conforming to the COLING
    format to zajac@crl.nmsu.edu. We prefer electronic submissions in
    either PDF or PostScript. Final submissions can extend to 10 pages.

    Organizing Committee

      Rémi Zajac (Chair), CRL, New Mexico State University, USA:
          zajac@crl.nmsu.edu.
      Jan Amtrup, CRL, New Mexico State University, USA:
          jamtrup@crl.nmsu.edu.
      Stephan Busemann, DFKI, Saarbrücken:
          busemann@dfki.de.
      Hamish Cunningham, University of Sheffield:
          hamish@dcs.shef.ac.uk.
      Guenther Goerz, IMMD VIII, University of Erlangen:
          goerz@immd8.informatik.uni-erlangen.de.
      Gertjan van Noord, University of Groningen:
          vannoord@let.rug.nl.
      Fabio Pianesi, IRST, Trento:
          pianesi@irst.itc.it.

    Of Related Interest

      The Natural Language Software Registry at
          http://www.dfki.de/lt/registry/sections.html
      The COLING-2000 Web Site at http://www.coling.org/
