The collection of these corpora has been supported by two projects: ETF and ELAN. As a preliminary arrangement, the newspapers are grouped by project.
Non-ASCII characters are represented as SGML entities.
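
To read the texts as ordinary Unicode, the entities have to be decoded. The following is a minimal sketch in Python, assuming the corpora use the common ISO Latin entity names (for example `&auml;` or `&otilde;`), which the standard-library function `html.unescape` recognises. The helper name `decode_entities` and the sample string are hypothetical and are not taken from the corpora; entity names outside that set would need a custom mapping.

```python
import html


def decode_entities(text: str) -> str:
    """Replace named and numeric character references with Unicode characters.

    Assumption: the corpus uses entity names that html.unescape knows.
    Unknown entity names are left unchanged.
    """
    return html.unescape(text)


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the corpora.
    print(decode_entities("P&otilde;hja-Eesti &uuml;likool"))
```

Note that `html.unescape` also converts markup-related entities such as `&amp;` and `&lt;`, so it may be safer to apply it to the text content only after any SGML tags have been parsed.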