The Mixed Corpus: Agraarteadus


This subcorpus contains texts from the internet archive of the journal 'Agraarteadus' (ca. 298,000 words altogether). The corpus contains the issues from the years 2001–2006, except for some articles from the years 2002–2003.

How can one use it?

The corpus is free to use for non-commercial purposes only.

Texts and annotation

Mark-up and annotation conform to the TEI Guidelines. Each file contains one year's issues of the journal.

Every file begins with a header, <teiheader>, that contains information about the file size, the tags used, etc. The rest of the file is structured as follows:

By non-textual material we mean pictures (photos, drawings, diagrams, etc.), tables, lists of references, etc. Longer non-Estonian passages, usually the English summaries of the articles, have also been omitted.

In the version of the corpus that can be accessed through our corpus query, all mark-up has been deleted except the <gap> tags that mark the omitted material.
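
For readers working with the TEI files directly, a comparable plain-text view could be produced roughly as follows. This is only a minimal sketch and not the procedure actually used for the corpus query version: the function name strip_markup_except_gap, the sample sentence, and the desc attribute on <gap> are all assumptions made for illustration.

```python
import re

# Matches any opening, closing, or self-closing tag and captures the tag name,
# so that <gap> tags can be told apart from everything else.
TAG = re.compile(r"<(/?)([A-Za-z][\w.:-]*)[^>]*>")


def strip_markup_except_gap(text: str) -> str:
    """Delete every tag in `text` except <gap> tags (hypothetical helper)."""
    def keep_or_drop(match: re.Match) -> str:
        # Keep the tag verbatim if it is a <gap> tag, otherwise remove it.
        return match.group(0) if match.group(2).lower() == "gap" else ""
    return TAG.sub(keep_or_drop, text)


# Hypothetical fragment; the attribute on <gap> is an assumption.
sample = '<p>Esimene lause. <gap desc="tabel"/> Teine <hi>lause</hi>.</p>'
print(strip_markup_except_gap(sample))
# prints: Esimene lause. <gap desc="tabel"/> Teine lause.
```

The sketch only removes tags; it leaves the text inside them untouched, so the header content at the start of a file would have to be cut out separately.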

