
Newspaper subcorpus of the Balanced Corpus

The newspaper subcorpus covers the years 1990-2001 (about 5 million words) and consists of three parts:

  1. various newspapers from the years 1990-1999, 800 764 words, described in more detail in table 1
  2. the weekly "Eesti Ekspress" 1996-2001, 1 542 433 words, described in more detail in table 2
  3. the daily "Postimees" 1995-2000, 2 671 395 words, described in more detail in table 3

Each line contains exactly one sentence and begins with a reference. For the newspapers from 1990-1999, the reference contains the name of the newspaper and the issue number or date of publication. For "Postimees" and "Eesti Ekspress", the reference gives the name and the year of publication (e.g. PM1995). The reference is followed by four spaces, after which the sentence begins.
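
The line layout described above can be read with a few lines of Python. This is only a minimal sketch: the function name parse_line and the sample sentence are hypothetical, and it assumes that the first run of four spaces on a line is always the separator between the reference and the sentence.

```python
def parse_line(line):
    """Split one corpus line into (reference, sentence).

    Assumption: the first occurrence of four consecutive spaces
    separates the reference from the sentence text.
    """
    reference, separator, sentence = line.rstrip("\n").partition("    ")
    if not separator:
        raise ValueError("no four-space separator found in line")
    return reference, sentence


# Hypothetical sample line in the described format (not taken from the corpus).
reference, sentence = parse_line("PM1995    Tere , maailm !\n")
print(reference)
print(sentence)
```

Splitting on the first separator only keeps any later runs of spaces inside the sentence untouched.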

Special characters (accented letters, umlauts, etc.) are given in SGML entity form; see the table of entities. Punctuation marks have been separated from the words by spaces. All quotation marks are represented as ".
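
Because punctuation is already separated by spaces, a sentence can be tokenized by splitting on whitespace once the entities are decoded. The sketch below is an illustration under one important assumption: that the entity names used in the corpus coincide with the standard named character references that Python's html.unescape understands (e.g. &auml;, &otilde;). The table of entities remains the authoritative list, and the function name decode_tokens and the sample sentence are hypothetical.

```python
import html


def decode_tokens(sentence):
    """Decode entity references and split a sentence into tokens.

    Assumption: the corpus entities match standard named character
    references; names outside that set are left unchanged.
    """
    text = html.unescape(sentence)
    # Punctuation is already space-separated, so whitespace splitting suffices.
    return text.split()


# Hypothetical sample sentence (not taken from the corpus).
print(decode_tokens("&Uuml;ks p&auml;ev , t&otilde;esti ."))
```

If the corpus uses entity names that html.unescape does not recognize, a small lookup dictionary built from the table of entities would be the safer route.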

Last modified: September 05 2008 16:18:22.