> Well if you can't get the prophet to the mountain, why not just
> move the mountain to the prophet and reformat the corpora into a nice
> format like sgml using the BNC dtd. In this way we could use them
> with SARA. Reformatting corpora is what we have to do to use many
The point is that SARA is far from being an ideal corpus query system! But you
are certainly right in saying that corpus data should be distributed in a
standardized format.
> other corpus access programs, so why not for SARA.
Because there are things that might have been done better. Although I do not want
to end up in a debate about operating systems, there are other platforms in
widespread use, and if I am not mistaken the client software is available for
Windows only (I'd be happy to hear that there is a version that will compile
under HP-UX - and I am not talking about the server).
> benefit sgml/xml-formatted corpora might inspire programmers to write
> "more flexible, more general" software for corpus analysis.
Meta languages are ideal for interchange purposes but I doubt that ANY software
will handle SGML data describing 100 million annotated word forms efficiently.
But that's another story.
Regards
Thomas
---
Thomas Kuenneth M.A.
Universitaet Erlangen-Nuernberg
Institut fuer Germanistik
Abteilung Computerlinguistik
Bismarckstr. 6 * D-91054 Erlangen * Tel.: +49 9131 8529250
http://www.linguistik.uni-erlangen.de/~tommi