ICAME CORPUS COLLECTION - INFORMATION
Brown Corpus, untagged text format I
A revised version of the Brown Corpus with upper- and lower-case
letters and other features which reduce the need for special codes
and make the material more easily readable. A number of errors found
during the tagging of the corpus have been corrected. Typographical
information is preserved; the same line division is used as in the
original version from Brown University except that words at the end
of the line are never divided.
Example:
A01 0010 The Fulton County Grand Jury said Friday an investigation
A01 0020 of Atlanta's recent primary election produced "no evidence" that
A01 0030 any irregularities took place. The jury further said in term-end
A01 0040 presentments that the City Executive Committee, which had over-all
A01 0050 charge of the election, "deserves the praise and thanks of the
A01 0060 City of Atlanta" for the manner in which the election was conducted.
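The layout of the example lines can be read mechanically. The following is a minimal sketch, assuming that every line consists of a text identifier such as A01, a four-digit line number, and the text, separated by single spaces, as in the example above. The function name parse_format_i is hypothetical and not part of any ICAME software.

```python
import re

# Assumed layout, inferred from the example only:
# text identifier, four-digit line number, then the text itself.
LINE_RE = re.compile(r"^([A-Z]\d{2}) (\d{4}) (.*)$")


def parse_format_i(line):
    """Split one format I line into (text_id, line_number, text)."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    text_id, line_no, text = match.groups()
    return text_id, int(line_no), text


sample = [
    "A01 0010 The Fulton County Grand Jury said Friday an investigation",
    "A01 0030 any irregularities took place. The jury further said in term-end",
]
records = [parse_format_i(line) for line in sample]
```

Each record keeps the original line reference, so a match in the text can be cited by text identifier and line number.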
Brown Corpus, untagged text format II
This version is identical to text format I, but typographical
information is reduced and the line division is new.
Example:
A01 0010 1 The Fulton County Grand Jury said Friday an investigation
A01 0020 1 of Atlanta's recent primary election produced "no evidence"
A01 0020 9 that any irregularities took place.
A01 0030 5 The jury further said in term-end presentments that
A01 0040 3 the City Executive Committee, which had over-all charge
A01 0050 2 of the election, "deserves the praise and thanks of
A01 0050 11 the City of Atlanta" for the manner in which the election
A01 0060 11 was conducted.
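In the example, each format II line carries a third field after the line number. Comparing the two examples suggests that this field gives the word position in the original (format I) line at which the new line begins; this reading is an inference from the sample lines, not a documented specification. The sketch below parses format II lines and checks that reading against a few lines copied from the examples. All names in it are hypothetical.

```python
import re

# Assumed layouts, inferred from the examples only.
FORMAT_I_RE = re.compile(r"^([A-Z]\d{2}) (\d{4}) (.*)$")
FORMAT_II_RE = re.compile(r"^([A-Z]\d{2}) (\d{4}) (\d+) (.*)$")

FORMAT_I = """\
A01 0020 of Atlanta's recent primary election produced "no evidence" that
A01 0030 any irregularities took place. The jury further said in term-end
"""

FORMAT_II = """\
A01 0020 1 of Atlanta's recent primary election produced "no evidence"
A01 0020 9 that any irregularities took place.
A01 0030 5 The jury further said in term-end presentments that
"""


def parse_format_ii(line):
    """Split one format II line into (text_id, line_number, word_pos, text)."""
    match = FORMAT_II_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    text_id, line_no, word_pos, text = match.groups()
    return text_id, int(line_no), int(word_pos), text


# Index the original lines as lists of whitespace-separated words.
original = {}
for line in FORMAT_I.splitlines():
    text_id, line_no, text = FORMAT_I_RE.match(line).groups()
    original[(text_id, int(line_no))] = text.split()

records = [parse_format_ii(line) for line in FORMAT_II.splitlines()]


def starts_at_reference(record):
    """True if the line's first word sits at word_pos in the original line."""
    text_id, line_no, word_pos, text = record
    return original[(text_id, line_no)][word_pos - 1] == text.split()[0]


checks = [starts_at_reference(record) for record in records]
```

If the assumed reading holds, every entry in checks is True, and a format II reference can be mapped back to its place in a format I line.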
Brown Corpus, WordCruncher version
This is an indexed version of the Brown Corpus. It can only be used
with WordCruncher. See the article by Randall Jones, ICAME Journal
11, pp. 44-47.
Example:
|CPress_Reportage
|PA01
|S1 The Fulton County Grand Jury said Friday an investigation of Atlanta's
recent primary election produced "no evidence" that any irregularities took
place.
|S2 The jury further said in term-end presentments that the City Executive
Committee, which had over-all charge of the election, "deserves the praise and
thanks of the City of Atlanta" for the manner in which the election was
conducted.
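The markup in the example can be read with a few lines of code. This is a minimal sketch, assuming that |C introduces a category label, |P a text identifier, and |S a numbered sentence whose text may continue on the following unmarked lines. These meanings are inferred from the example, not taken from the WordCruncher documentation, and the sketch handles only the markup as printed here, not the indexed files themselves. The function name parse_wordcruncher is hypothetical.

```python
SAMPLE = """\
|CPress_Reportage
|PA01
|S1 The Fulton County Grand Jury said Friday an investigation of Atlanta's
recent primary election produced "no evidence" that any irregularities took
place.
|S2 The jury further said in term-end presentments that the City Executive
Committee, which had over-all charge of the election, "deserves the praise and
thanks of the City of Atlanta" for the manner in which the election was
conducted.
"""


def parse_wordcruncher(lines):
    """Collect sentences together with their assumed category and text id."""
    category = None
    text_id = None
    sentences = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("|C"):
            category = line[2:]
        elif line.startswith("|P"):
            text_id = line[2:]
        elif line.startswith("|S"):
            # "|S1 The Fulton ..." -> sentence number, then sentence text.
            number, _, text = line[2:].partition(" ")
            sentences.append({
                "category": category,
                "text_id": text_id,
                "number": int(number),
                "text": text,
            })
        elif sentences:
            # Unmarked line: continuation of the current sentence.
            sentences[-1]["text"] += " " + line
    return sentences


sentences = parse_wordcruncher(SAMPLE.splitlines())
```

Joining continuation lines with a single space restores each sentence as running text, independent of where the lines were broken.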
Conditions on the use of ICAME corpus material
The primary purposes of the International Computer Archive of
Modern English (ICAME) are:
- collecting and distributing information on (i) English language
material available for computer processing; and (ii) linguistic
research completed or in progress on this material;
- compiling an archive of corpora to be located at the University of
Bergen, from where copies of the material can be obtained at cost.
The following conditions govern the use of corpus material
distributed through ICAME:
- No copies of corpora, or parts of corpora, are to be
distributed under any circumstances without the written permission
of ICAME.
- Print-outs of corpora, or parts thereof, are to be used for
bona fide research of a non-profit nature. Holders of copies of
corpora may not reproduce any texts, or parts of texts, for any
purpose other than scholarly research without the written
permission of the individual copyright holders, as listed in the
manual or record sheet accompanying the corpus in question. (For
material where there is no known copyright holder, the person(s)
who originally prepared the material in computerized form will be
regarded as the copyright holder(s).)
- Commercial publishers and other non-academic organizations
wishing to make use of part or all of a corpus or a print-out
thereof must obtain permission from all the individual copyright
holders involved.
- The person(s) who originally prepared the material in
computerized form must be acknowledged in every subsequent use of
it.
Use of ICAME texts within an institution
Though ICAME texts may not be used or distributed outside the
institution making the order, they can be freely used within the
institution (department, faculty, university) for the purposes of
research and teaching. To prevent any use of the material for
commercial and profit-making purposes, it is advisable to limit
access to registered computer users within the institution. The way
this is done may vary depending upon the institution making the
order.