
Corpus of Written Estonian: 1920s: transcripts of the Asutaw Kogu

The corpus was created at the University of Tokyo under the guidance of Prof. Kazuto Matsumura.

It consists of transcripts of the Asutaw Kogu (Constituent Assembly) from 1919-1920 and contains approximately 2 million words.

How to use it?

The corpus may be used freely, but only for non-commercial purposes.

Mark-up

Encoding: UTF-8. Format: XML.

Text is divided into paragraphs <p> and sentences <s>. The sentences are numbered.
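
As a rough illustration of this mark-up, the following Python sketch walks the <p> and <s> elements with the standard library. The sample text, the root element name `text`, and the attribute `n` that holds the sentence number are assumptions made for the example; the actual corpus files may use different names, so the code reports whatever attributes it finds rather than relying on a particular one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the root element <text> and the attribute "n"
# are assumptions, not taken from the corpus files.
SAMPLE = """<text>
  <p><s n="1">Esimene lause.</s> <s n="2">Teine lause.</s></p>
  <p><s n="3">Kolmas lause.</s></p>
</text>"""


def iter_sentences(root):
    """Yield (paragraph index, sentence attributes, sentence text) for each <s> in a <p>."""
    for p_index, p in enumerate(root.iter("p"), start=1):
        for s in p.iter("s"):
            # itertext() also collects text from any nested elements.
            yield p_index, dict(s.attrib), "".join(s.itertext()).strip()


root = ET.fromstring(SAMPLE)
sentences = list(iter_sentences(root))
for p_index, attrs, text in sentences:
    print(p_index, attrs, text)
```

For a real corpus file, `ET.parse(path).getroot()` could replace `ET.fromstring(SAMPLE)`, where `path` is the location of a downloaded file.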


Last modified: December 14 2018 20:53:25.