
Corpus statistics: The 1890s

The 1890s corpus, 348,000 tokens in total, consists of the following text classes:

Text class   File name prefix   Number of tokens   Percent of corpus
Newspapers   aja                193,000            55 %
Fiction      ilu                155,000            45 %

Newspaper texts come from the following titles:

Newspaper                    File name prefix   Number of tokens   Percent of newspapers   Percent of corpus
Eesti Postimees              epo                36,600             19 %                    11 %
Olewik                       ole                33,400             17 %                    10 %
Postimees                    pos                48,000             25 %                    14 %
Ristirahwa pühapäewa leht    rip                2,100              1 %                     1 %
Sakala                       sak                5,300              3 %                     2 %
Walgus                       val                60,500             31 %                    17 %
Wirmaline                    vir                7,100              4 %                     2 %
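
The percentages in both tables can be reproduced from the token counts. The following is a minimal sketch, assuming each share is rounded to the nearest whole percent; the names newspaper_tokens, fiction_tokens and percent are illustrative and are not part of the corpus tools.

```python
# Token counts copied from the tables above, keyed by file name prefix.
# All names in this sketch are illustrative assumptions.
newspaper_tokens = {
    "epo": 36_600,  # Eesti Postimees
    "ole": 33_400,  # Olewik
    "pos": 48_000,  # Postimees
    "rip": 2_100,   # Ristirahwa pühapäewa leht
    "sak": 5_300,   # Sakala
    "val": 60_500,  # Walgus
    "vir": 7_100,   # Wirmaline
}
fiction_tokens = 155_000


def percent(part, whole):
    """Return the share of part in whole, rounded to a whole percent."""
    return round(100 * part / whole)


newspaper_total = sum(newspaper_tokens.values())
corpus_total = newspaper_total + fiction_tokens

# One line per newspaper: prefix, share of newspaper tokens, share of corpus.
for prefix, count in newspaper_tokens.items():
    print(prefix, percent(count, newspaper_total), percent(count, corpus_total))
```

Because each share is rounded separately, a column of percentages need not sum to exactly 100.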

Last modified: December 14 2018 22:43:43.