Corpus statistics: The 1950s

The 1950s corpus, a total of 308,000 tokens, consists of the following text classes:

Text class    File name beginning    Number of tokens    Per cent of corpus
Newspapers    aja                    242,400             79 %
Fiction       ilu                     66,000             21 %

The newspaper texts come from the following titles:

Newspaper        File name beginning    Number of tokens    Per cent of newspapers    Per cent of corpus
Edasi            ed                      31,900             13 %                      10 %
Noorte Hääl      nh                      32,800             14 %                      11 %
Rahva Hääl       rh                     109,200             45 %                      35 %
Sirp ja Vasar    sv                      16,400              7 %                       5 %
Talurahvaleht    tl                      11,400              5 %                       4 %
Õhtuleht         ol                      39,500             16 %                      13 %
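
The two percentage columns above can be recomputed from the token counts. The following is a minimal sketch in Python, not part of the corpus tooling. It assumes the denominators are the newspaper class total (242,400) and the stated corpus total (308,000), and that percentages are rounded to whole numbers. The names percent and newspaper_shares are hypothetical.

```python
# Token counts copied from the tables above.
CORPUS_TOTAL = 308_000      # stated total for the 1950s corpus
NEWSPAPER_TOTAL = 242_400   # stated total for the newspaper class

newspapers = {
    "Edasi": 31_900,
    "Noorte Hääl": 32_800,
    "Rahva Hääl": 109_200,
    "Sirp ja Vasar": 16_400,
    "Talurahvaleht": 11_400,
    "Õhtuleht": 39_500,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to a whole number (assumed rounding)."""
    return round(100 * part / whole)


def newspaper_shares(counts, newspaper_total, corpus_total):
    """Map each title to (per cent of newspapers, per cent of corpus)."""
    return {
        title: (percent(n, newspaper_total), percent(n, corpus_total))
        for title, n in counts.items()
    }


shares = newspaper_shares(newspapers, NEWSPAPER_TOTAL, CORPUS_TOTAL)

# Print one row per newspaper title.
for title, (of_newspapers, of_corpus) in shares.items():
    print(f"{title:<15}{of_newspapers:>4} %{of_corpus:>4} %")
```

The same percent helper applies to the text classes, for example percent(NEWSPAPER_TOTAL, CORPUS_TOTAL) for the newspaper share of the corpus.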

Last modified: December 14 2018 19:43:41.