I'm looking for:
FREE corpora of text that is formatted one sentence per line or with clear
end-of-sentence markers. Could be just about anything: literature, news,
etc. Preferably not poetry or speech with lots of pauses and
hesitations. (We already have access to the free part of the Penn LDC
database.)
AND/OR
FREE programs that pull individual sentences out of bodies of text using an
algorithm that does more than simply search for punctuation (and check that
it isn't part of an abbreviation).
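
For concreteness, the punctuation-plus-abbreviation baseline mentioned above might look roughly like the sketch below. It is only an illustration of the simple approach, not an existing tool: the function name split_sentences and the tiny abbreviation list are assumptions made up for this example.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (an assumption for this
# sketch); a usable list would be far longer and domain-specific.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "st.", "etc.", "e.g.", "i.e.", "n."}

# A token "ends a sentence" if it finishes with . ! or ?, optionally followed
# by closing quotes or brackets.
END_PUNCT = re.compile(r"[.!?][\"')\]]*$")


def split_sentences(text):
    """Split text after ., ! or ? unless the token is a known abbreviation."""
    sentences = []
    current = []
    for token in text.split():
        current.append(token)
        if END_PUNCT.search(token) and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep any trailing text that lacks final punctuation
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences(
        "Dr. Smith lives at 3401 N. Broad St. in Philadelphia. Is that far? No."
    ):
        print(sentence)
```

A sketch like this cannot tell an abbreviation in mid-sentence from one that also ends a sentence: "He lives on Broad St. It is busy." comes back as a single sentence, which is the kind of case a better algorithm would need to handle.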
Nina B. Silverberg                   Phone: (215) 707-3090
Center for Cognitive Neuroscience    Fax:   (215) 707-7843
Department of Neurology
Temple University School of Medicine
3401 N. Broad St.
Philadelphia, PA 19140