Re: Corpora: rewrite rules for speech

From: William M. Fisher (william.fisher@nist.gov)
Date: Mon Oct 23 2000 - 22:33:00 MET DST

    Jim Magnuson wrote:

    > Hi. I am trying to compute estimates of, e.g., diphone transitional
    > probabilities in conversational speech. So far I have worked with the
    > CallHome database from the LDC. What I'm working with are orthographic
    > transcripts of telephone conversations. I've replaced all of the
    > orthographic forms with phonemic citation forms. This gives me very
    > different estimates of diphone probabilities than, e.g., written corpora
    > or frequency-weighted dictionaries.
    >
    > However, citation forms are obviously not ideal. For my purposes, it is
    > not worth investing in retranscribing the corpus phonetically. But I would
    > like to improve my estimates by applying phonological rules to my corpus
    > of phonemic citation forms. Could anyone point me towards a source of such
    > rules for American English? I've started working on my own, but would
    > rather not reinvent anything.
    >
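
    The procedure described above (citation forms, then a rewrite rule,
    then diphone transitional probabilities) can be sketched roughly as
    follows. This is only an illustration under stated assumptions: the
    four-word lexicon, the ARPAbet-like symbols, and the function names are
    hypothetical, and the flapping rule is a simplified stand-in that
    ignores stress. It is not a rule set taken from any published source.

```python
from collections import Counter

# Hypothetical toy lexicon: orthographic word -> phonemic citation form.
# The ARPAbet-like symbols are an assumption made for this illustration.
LEXICON = {
    "what": ["W", "AH", "T"],
    "a": ["AH"],
    "better": ["B", "EH", "T", "ER"],
    "day": ["D", "EY"],
}

# Vowel symbols that occur in the toy lexicon above.
VOWELS = {"AH", "EH", "ER", "EY"}


def to_phonemes(words, lexicon):
    """Concatenate the citation forms of a word sequence into one phone list."""
    phones = []
    for word in words:
        phones.extend(lexicon[word])
    return phones


def apply_flapping(phones):
    """Simplified rewrite rule: T -> DX (flap) when between two vowels.

    Real flapping also depends on stress; that is deliberately ignored here.
    The context is read from the input list, so the rule applies to all
    positions simultaneously rather than feeding itself.
    """
    out = list(phones)
    for i in range(1, len(phones) - 1):
        if (phones[i] == "T"
                and phones[i - 1] in VOWELS
                and phones[i + 1] in VOWELS):
            out[i] = "DX"
    return out


def transitional_probabilities(phones):
    """Estimate P(next phone | current phone) from adjacent-pair counts."""
    pair_counts = Counter(zip(phones, phones[1:]))
    first_counts = Counter(phones[:-1])  # every phone that has a successor
    return {
        (a, b): count / first_counts[a]
        for (a, b), count in pair_counts.items()
    }


if __name__ == "__main__":
    utterance = ["what", "a", "better", "day"]
    citation = to_phonemes(utterance, LEXICON)
    surface = apply_flapping(citation)
    print("citation:", citation)
    print("surface: ", surface)
    for pair, p in sorted(transitional_probabilities(surface).items()):
        print(pair, round(p, 3))
```

    Comparing the output of transitional_probabilities() on the citation
    list and on the rewritten list shows how even a single rule moves
    probability mass from one diphone to another.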

      A couple of years ago Steve Greenberg and colleagues at ICSI
    phonetically transcribed part of the Switchboard corpus, and Joe
    Picone at ISIP has made the transcriptions available for download at:

         http://www.isip.msstate.edu/projects/switchboard/index.html

     - Bill F.


