[Corpora-List] ngram statistics package version 0.53

From: ted pedersen (tpederse@d.umn.edu)
Date: Wed Jan 15 2003 - 05:49:55 MET


    We are happy to announce an updated version of the Ngram Statistics
    Package (NSP). The current version is 0.53.
    NSP is a suite of Perl programs that allows users to identify interesting
    Ngrams in text using a variety of measures of association. It is free
    software, available at: http://www.d.umn.edu/~tpederse/nsp.html

    There are several new features included, among them:

    1) improved stoplist handling that allows stop listed words to be defined
    via regular expressions,

    2) new utility tools for finding kth order co-occurrences in corpora,

    3) several new 2-dimensional (bigram) tests,

    4) ... and at long last, a 3-dimensional (trigram) test (the log
    likelihood ratio).
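
    For readers unfamiliar with the log likelihood ratio as a measure of
    association, here is a minimal sketch of the 2-dimensional (bigram)
    case. It is written in Python for illustration only and is not taken
    from NSP, which is written in Perl. The function name and the count
    names (n11, n1p, np1, npp) are assumptions chosen for this example:
    n11 is how often the two words occur together as a bigram, n1p how
    often the first word occurs in first position, np1 how often the
    second word occurs in second position, and npp the total number of
    bigrams.

```python
import math


def bigram_log_likelihood(n11, n1p, np1, npp):
    """Log likelihood ratio (G^2) for a 2x2 table of bigram counts.

    Hypothetical helper for illustration; not part of NSP.
    """
    # Derive the remaining observed cells from the joint count and marginals.
    n12 = n1p - n11               # first word present, second word absent
    n21 = np1 - n11               # first word absent, second word present
    n22 = npp - n1p - np1 + n11   # neither word present

    # Complementary marginal totals.
    n2p = npp - n1p
    np2 = npp - np1

    observed = [n11, n12, n21, n22]
    # Expected counts if the two words occurred independently.
    expected = [
        n1p * np1 / npp,
        n1p * np2 / npp,
        n2p * np1 / npp,
        n2p * np2 / npp,
    ]

    total = 0.0
    for obs, exp in zip(observed, expected):
        # A cell with zero observations contributes nothing to the sum.
        if obs > 0:
            total += obs * math.log(obs / exp)
    return 2.0 * total
```

    A score near zero means the observed counts are close to what
    independence would predict; larger scores indicate a stronger
    departure from independence between the two words.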

    We've also revamped and expanded the documentation. A more detailed
    changelog is available at


    and the new README is at:


    We'd be pleased if you'd check it out and let us know what you think.

