Corpus statistics: the 1990s

Text types

The 1990s corpus contains 995,800 tokens in total and consists of the following text types:

Type         File name prefix       Number of tokens   Per cent of corpus
Newspapers   (several, see below)   384,800            39 %
Fiction      ilu                    611,000            61 %
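
The percentages in the table above follow directly from the token counts. A minimal sketch of that arithmetic, assuming the counts are entered by hand (the names `token_counts` and `shares` are illustrative and not part of the corpus tooling):

```python
# Token counts copied from the text-type table above.
token_counts = {"Newspapers": 384_800, "Fiction": 611_000}


def shares(counts):
    """Return each category's share of the total, in per cent."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


corpus_total = sum(token_counts.values())  # 995,800 tokens
corpus_shares = shares(token_counts)

for name, pct in corpus_shares.items():
    # Rounded to whole per cent, as in the table.
    print(f"{name}: {pct:.0f} %")
```

Rounded to whole numbers, this reproduces the 39 % and 61 % given in the table.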

Newspaper texts

Newspaper title             File name prefix   Number of tokens   Per cent of newspapers
Edasi, Postimees            ed, pm             37,700             9.8 %
Eesti Ekspress              ee                 40,000             10.4 %
Kultuurileht, Reede, Sirp   kl, re, si         70,100             18.2 %
Maaleht                     ml                 67,300             17.5 %
Pärnu Postimees             pp                 24,700             6.4 %
Rahva Hääl                  rh                 85,000             22.1 %
Õhtuleht                    ol                 36,200             9.4 %
Äripäev                     ap                 23,500             6.1 %
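
The newspaper rows can be checked the same way. The sketch below is only an illustration: it recomputes each share against the 384,800-token newspaper total and looks up a title from the beginning of a file name. The helper names and the sample file name are hypothetical, since the page does not show the full file-name format. Note that the individual rows add up to 384,500 tokens, 300 fewer than the stated total, presumably because the counts are rounded to the nearest hundred.

```python
# Rows copied from the newspaper table: title, file-name prefixes, tokens.
NEWSPAPERS = [
    ("Edasi, Postimees", ("ed", "pm"), 37_700),
    ("Eesti Ekspress", ("ee",), 40_000),
    ("Kultuurileht, Reede, Sirp", ("kl", "re", "si"), 70_100),
    ("Maaleht", ("ml",), 67_300),
    ("Pärnu Postimees", ("pp",), 24_700),
    ("Rahva Hääl", ("rh",), 85_000),
    ("Õhtuleht", ("ol",), 36_200),
    ("Äripäev", ("ap",), 23_500),
]
NEWSPAPER_TOTAL = 384_800  # total stated in the text-type table


def newspaper_shares(rows, total):
    """Return each title's share of the newspaper total, to one decimal."""
    return {title: round(100 * tokens / total, 1) for title, _, tokens in rows}


def title_for(file_name):
    """Return the newspaper title whose prefix starts file_name, else None.

    The file-name format beyond the prefix is an assumption.
    """
    for title, prefixes, _ in NEWSPAPERS:
        if file_name.startswith(prefixes):
            return title
    return None


row_sum = sum(tokens for _, _, tokens in NEWSPAPERS)
paper_shares = newspaper_shares(NEWSPAPERS, NEWSPAPER_TOTAL)
print(row_sum, paper_shares["Rahva Hääl"], title_for("rh_sample.txt"))
```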

Last modified: December 19 2018 19:17:05.