Hello,
Does anyone out there have experience working with
corpora that have linguistic annotation (e.g. for part of
speech, syntax, multiword expressions, or word senses) kept in
files separate from the text itself?
This is the system recommended by the CES and XCES corpus
encoding standards, and the TEI guidelines also provide a
mechanism for putting tags in one document that indicate links
to another document.
I'm trying to get my mind around how best to enable software
to access corpus annotation in such a format. Ideally such
access could be provided using standard XML formats and tools,
like XPath and XSLT.
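
To make the kind of access I have in mind concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree, which supports a limited XPath subset. The element and attribute names (tok, ann, id, target, tag) and both sample documents are made up for illustration; they are not the actual XCES or TEI markup, and real standoff files would normally link through XPointer or href-style references rather than bare ids.

```python
import xml.etree.ElementTree as ET

# Hypothetical base text: each token carries an id that other files can point at.
BASE_XML = """<text>
  <tok id="t1">The</tok>
  <tok id="t2">dog</tok>
  <tok id="t3">barks</tok>
</text>"""

# Hypothetical standoff annotation file: each <ann> links to a token by id.
# (Illustrative names only, not the real XCES or TEI schema.)
POS_XML = """<annotations type="pos">
  <ann target="t1" tag="DT"/>
  <ann target="t2" tag="NN"/>
  <ann target="t3" tag="VBZ"/>
</annotations>"""


def merge_standoff(base_xml, ann_xml):
    """Return (word, tag) pairs by resolving each annotation's target id."""
    base = ET.fromstring(base_xml)
    anns = ET.fromstring(ann_xml)
    # Index the annotations by the id of the token they point to.
    tags = {a.get("target"): a.get("tag") for a in anns.iter("ann")}
    # Walk the base text in document order; unannotated tokens get None.
    return [(tok.text, tags.get(tok.get("id"))) for tok in base.iter("tok")]


def tokens_with_tag(base_xml, ann_xml, tag):
    """Find the words whose annotation carries the given tag, via XPath predicates."""
    base = ET.fromstring(base_xml)
    anns = ET.fromstring(ann_xml)
    ids = [a.get("target") for a in anns.findall(f".//ann[@tag='{tag}']")]
    return [base.find(f".//tok[@id='{i}']").text for i in ids]


if __name__ == "__main__":
    print(merge_standoff(BASE_XML, POS_XML))
    print(tokens_with_tag(BASE_XML, POS_XML, "NN"))
```

The sketch resolves the links in application code. With one annotation file per layer (part of speech, syntax, word senses), the same id lookup would have to be repeated for each layer, which is the part I would like standard tools to handle.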
Any suggestions on how best to do this, pointers to software
or APIs that work with modular annotation, etc. would be
invaluable.
Thanks for your help.
Scott
--
Scott Cederberg
Researcher
Infomap Project
Computational Semantics Lab
Center for the Study of Language and Information (CSLI)
Stanford University
This archive was generated by hypermail 2b29 : Fri Apr 18 2003 - 05:21:24 MET DST