Corpus of Written Estonian: the 1960s

Corpus statistics: The 1960s

The 1960s corpus contains 333,000 tokens in total and consists of the following text classes:

Text Class	File name beginning	Number of tokens	Per Cent of corpus
Newspapers	aja	201,000	60 %
Fiction	ilu	132,000	40 %

Newspaper texts come from the following titles:

Newspaper	File name beginning	Number of tokens	Per Cent of newspaper texts	Per Cent of corpus
Edasi	ed	30,000	15 %	9 %
Kodumaa	km	8,100	4 %	2 %
Noorte Hääl	nh	24,400	12 %	7 %
Punane Täht	pt	4,200	2 %	1 %
Rahva Hääl	rh	102,600	51 %	31 %
Sirp ja Vasar	sv	14,600	7 %	4 %
Õhtuleht	ol	17,100	9 %	5 %
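
The two percentage columns in the newspaper table can be rechecked from the token counts. The following is a minimal sketch, not part of the corpus tooling: the names NEWSPAPER_TOKENS, CORPUS_TOKENS and share are hypothetical, and rounding to the nearest whole per cent is an assumption about how the published figures were produced.

```python
# Token counts copied from the newspaper table above.
NEWSPAPER_TOKENS = {
    "Edasi": 30_000,
    "Kodumaa": 8_100,
    "Noorte Hääl": 24_400,
    "Punane Täht": 4_200,
    "Rahva Hääl": 102_600,
    "Sirp ja Vasar": 14_600,
    "Õhtuleht": 17_100,
}

# Total size of the 1960s corpus as stated on this page.
CORPUS_TOKENS = 333_000


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage, rounded to the nearest integer.

    Rounding to whole per cent is an assumption, not a documented rule.
    """
    return round(100 * part / whole)


# Sum of all newspaper token counts; used as the base for the first column.
newspaper_total = sum(NEWSPAPER_TOKENS.values())

for title, tokens in NEWSPAPER_TOKENS.items():
    of_newspapers = share(tokens, newspaper_total)
    of_corpus = share(tokens, CORPUS_TOKENS)
    print(f"{title}\t{tokens}\t{of_newspapers} %\t{of_corpus} %")
```

The first column divides by the newspaper subtotal and the second by the whole corpus, which is why the same title has two different shares.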

Last modified: December 14, 2018, 15:54:13.