Sylvana,
A problem with "retrieving new words in a corpus" is: "new" with respect
to what? You can easily find all words that occur only once (or twice)
in a corpus, which makes them "rare"; but "new" implies that
your corpus builds on a larger monitor corpus tracking the language over
time. As I understand it, AVIATOR/APRIL is not just software for a
static corpus but infrastructure for processing a (large) monitor corpus.
Is this what you have?
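
To make the rare/new distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not how AVIATOR/APRIL works: the function names (tokenize, rare_words, new_words), the crude regex tokenizer, and the two toy texts are all invented for this example. "Rare" is computed from the corpus alone; "new" needs a second, earlier reference vocabulary to compare against.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def rare_words(corpus_text, max_count=1):
    """Words occurring at most max_count times in the corpus itself."""
    counts = Counter(tokenize(corpus_text))
    return {word for word, count in counts.items() if count <= max_count}


def new_words(corpus_text, reference_text):
    """Words in the corpus that never occur in the (earlier) reference text."""
    reference_vocab = set(tokenize(reference_text))
    return set(tokenize(corpus_text)) - reference_vocab


# Toy data, invented for illustration: 'reference' stands in for the
# monitor corpus so far, 'current' for the newly added material.
reference = "the cat sat on the mat and the dog sat on the rug"
current = "the cat sat on the blog and the dog read the blog on the podcast"

print(sorted(rare_words(current)))
print(sorted(new_words(current, reference)))
```

In this toy data "cat" is rare but not new, while "blog" is new but not rare (it occurs twice), so the two lists differ. With a real monitor corpus the reference vocabulary would be built from all earlier material, and the candidates would still need manual filtering for typos, proper names and the like.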
Eric Atwell
On Thu, 12 Jun 2003, krausse wrote:
> Dear colleagues,
>
> In Lynne Bowker's and Jennifer Pearson's book "Working with Specialized
> Corpora" neologism finder tools like the ones used in the AVIATOR/APRIL
> project are mentioned.
>
> I wonder whether there are any free or commercial programs available or
> how other people go about retrieving new words in a corpus.
>
> Many thanks in advance,
>
> Sylvana Krausse
>
--
Eric Atwell, CVL: Computer Vision and Language research group
Distributed Multimedia Systems MSc Tutor & SOCRATES/JYA Tutor
School of Computing, University of Leeds, LEEDS LS2 9JT
TEL: 0113-3435761 MOBILE: 0775-1039104 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric EMAIL: eric@comp.leeds.ac.uk
Visit http://www.computingLEEDS.ac.uk - our newsletter for industry