The Wellington Corpus of Spoken New Zealand English is the result of ten years of demanding but very rewarding work. People from many walks of life have contributed to its completion.
The idea of a corpus was first raised within the New Zealand community of linguists in 1985, and when Victoria University of Wellington offered to begin collecting data in 1987, there was enthusiastic support from linguists elsewhere, many of whom subsequently contributed data.
In the initial stages of the project, a local group of linguists acted as a Corpus Research Advisory Group. They assisted in the preliminary design of the corpus and addressed selection issues. Decisions had to be made on how representative the corpus would be in terms of the types of speech and the types of speakers heard in contemporary New Zealand society. For example, what should the ratio of broadcast material to private informal conversation be, and who should count as a New Zealander? What percentage of speakers should be Maori? What percentage should be male?
In subsequent stages, the Corpus Manager and staff, in consultation with the Project Director, addressed ongoing corpus development and management issues. Two major areas of discussion were data collection and transcription.
The goal of half a million words of informal conversational speech, in addition to the other categories, was a hugely demanding one. It was achieved only through the support of the project team and their friends, relatives and students. Creativity was required to avoid the use of surreptitious recording while maintaining the quality and naturalness of the data collected.
Transcription is the art of making the ephemeral tangible in a consistent and practical manner. The basic principles of the elaborated orthographic transcription system were established by the Corpus Research Advisory Group. The system was refined at the suggestion of the transcribers as they encountered the obstacles of transcribing real data.
A series of dedicated and skilled Corpus Managers have played a key role in the implementation and refinement of the corpus design. They have tracked down contributors to obtain the background information sheets so crucial to the success of the project. They have co-ordinated the work of others and audited the standard of transcription. Later Managers have developed databases for complex and at times apparently intractable materials, devised procedures for the markup and release of the corpus, and investigated the means to best preserve the corpus for posterity.
Student involvement has been critical to the completion of the Wellington Corpus of Spoken New Zealand English, which has provided an unparalleled training ground for corpus research. Students have learned how to collect good quality speech data in a wide range of contexts. Those who have trained as transcribers have acquired valuable skills, with meticulousness and accuracy being crucial to their task.
The Wellington Corpus of Spoken New Zealand English has already proved an invaluable resource for linguistic research, especially for descriptions of New Zealand English in comparison to other varieties. Two doctorates making use of the corpus have been completed and it has provided data for a number of local as well as visiting researchers.
In addition to providing a rich source of New Zealand material, the corpus is a particularly good source of informal conversational material. Many of the world's spoken English corpora are dominated by broadcast material. Seventy-five percent of the Wellington Corpus of Spoken New Zealand English material is informal dialogue, an unusually high proportion for any corpus.
This Guide marks the public release of the Wellington Corpus of Spoken New Zealand English. The corpus is both a unique cultural treasure and a major contribution to our understanding of spoken language. With its release to the national and international linguistic research community, New Zealand voices, as our national anthem proclaims, will indeed be heard afar.