The 1950s corpus, about 308,000 tokens in total, consists of the following text classes:
| Text class | File name prefix | Number of tokens | Percent of corpus |
|---|---|---|---|
| Newspapers | aja | 242,400 | 79 % |
| Fiction | ilu | 66,000 | 21 % |

The newspaper texts come from the following titles:

| Newspaper | File name prefix | Number of tokens | Percent of newspapers | Percent of corpus |
|---|---|---|---|---|
| Edasi | ed | 31,900 | 13 % | 10 % |
| Noorte Hääl | nh | 32,800 | 14 % | 11 % |
| Rahva Hääl | rh | 109,200 | 45 % | 35 % |
| Sirp ja Vasar | sv | 16,400 | 7 % | 5 % |
| Talurahvaleht | tl | 11,400 | 5 % | 4 % |
| Õhtuleht | ol | 39,500 | 16 % | 13 % |
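
The percentage columns can be recomputed from the token counts. The following is a minimal sketch, not part of the corpus tooling: the dictionary names and the `percent_shares` helper are hypothetical, and it assumes that the percentages are rounded to whole numbers and that the newspaper shares use the stated newspaper total of 242,400 tokens as the denominator.

```python
# Token counts copied from the two tables above.
CLASS_TOKENS = {"Newspapers": 242_400, "Fiction": 66_000}
NEWSPAPER_TOKENS = {
    "Edasi": 31_900,
    "Noorte Hääl": 32_800,
    "Rahva Hääl": 109_200,
    "Sirp ja Vasar": 16_400,
    "Talurahvaleht": 11_400,
    "Õhtuleht": 39_500,
}


def percent_shares(counts, total):
    """Return each count as a whole-number percentage of `total`."""
    return {name: round(100 * n / total) for name, n in counts.items()}


# Corpus total taken as the sum of the two text classes (assumption).
corpus_total = sum(CLASS_TOKENS.values())

class_shares = percent_shares(CLASS_TOKENS, corpus_total)
share_of_newspapers = percent_shares(NEWSPAPER_TOKENS, CLASS_TOKENS["Newspapers"])
share_of_corpus = percent_shares(NEWSPAPER_TOKENS, corpus_total)

for title in NEWSPAPER_TOKENS:
    print(f"{title}: {share_of_newspapers[title]} % of newspapers, "
          f"{share_of_corpus[title]} % of corpus")
```

With these assumptions the computed shares match the tables. The two class counts sum to 308,400, which the text rounds to 308,000, and the six per-title counts sum to 241,200, slightly below the stated newspaper total of 242,400.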