Corpus of Written Estonian: the 1960s

Corpus statistics: the 1960s

The 1960s corpus contains 333,000 tokens in total and consists of the following text classes:

Text class   File name beginning   Number of tokens   Per cent of corpus
Newspapers   aja                   201,000            60 %
Fiction      ilu                   132,000            40 %

Newspaper texts come from the following titles:

Newspaper       File name beginning   Number of tokens   Per cent of newspaper texts   Per cent of corpus
Edasi           ed                    30,000             15 %                          9 %
Kodumaa         km                    8,100              4 %                           2 %
Noorte Hääl     nh                    24,400             12 %                          7 %
Punane Täht     pt                    4,200              2 %                           1 %
Rahva Hääl      rh                    102,600            51 %                          31 %
Sirp ja Vasar   sv                    14,600             7 %                           4 %
Õhtuleht        ol                    17,100             9 %                           5 %
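
The percentage columns for the newspaper titles can be recomputed from the token counts. The following is a minimal sketch in Python, not part of the corpus distribution: the names NEWSPAPER_TOKENS, CORPUS_TOKENS and shares are made up for illustration, and rounding to the nearest whole per cent is an assumption about how the published figures were derived.

```python
# Token counts per newspaper title, copied from the table above.
NEWSPAPER_TOKENS = {
    "Edasi": 30_000,
    "Kodumaa": 8_100,
    "Noorte Hääl": 24_400,
    "Punane Täht": 4_200,
    "Rahva Hääl": 102_600,
    "Sirp ja Vasar": 14_600,
    "Õhtuleht": 17_100,
}

# Total size of the 1960s corpus as stated on this page.
CORPUS_TOKENS = 333_000


def shares(counts, corpus_total):
    """Return {title: (per cent of newspaper texts, per cent of corpus)}.

    Rounding to whole per cent is an assumption, not a documented rule.
    """
    newspaper_total = sum(counts.values())
    return {
        title: (
            round(100 * n / newspaper_total),
            round(100 * n / corpus_total),
        )
        for title, n in counts.items()
    }


# Print one line per title: tokens, share of newspaper texts, share of corpus.
for title, (of_news, of_corpus) in shares(NEWSPAPER_TOKENS, CORPUS_TOKENS).items():
    print(f"{title:<14} {NEWSPAPER_TOKENS[title]:>8,} {of_news:>3} % {of_corpus:>3} %")
```

Under these assumptions the newspaper token counts sum to 201,000, and the recomputed shares agree with both percentage columns in the table.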

Last modified: December 14 2018 15:54:13.