This subcorpus contains texts from the internet archive of the journal 'Agraarteadus' (ca 298 000 words altogether). The corpus contains the issues from the years 2001–2006, except for some articles from the years 2002–2003.
The corpus is free for use for non-commercial purposes only.
Mark-up and annotation conform to the TEI guidelines. One file contains one year’s issues of the journal.
Every file begins with a header
<teiheader> that contains information about the file size, the tags used etc.
The rest of the file is structured as follows:
<div0> contains one year’s issues of the journal, e.g.
<div0 type='aasta'><head>Agraarteadus 2001
<div1> is one issue of the journal, e.g.
<div1 type='number'><head>2001 Nr 2
<div2> is an article, e.g.
<div2 type='artikkel'><head>EESTI HOLSTEINI GENEETILISE SELEKTSIOONIEDU MAJANDUSLIK VÄÄRTUS
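
The nesting described above (year, issue, article) can be walked with a few lines of Python. This is only a minimal sketch: it assumes the file content is well-formed XML with every tag closed, which the corpus files may not be, and the sample string, the <p> placeholder and the function name list_articles are hypothetical illustrations rather than part of the corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample built from the example headings above; the closing tags
# and the <p> placeholder are assumptions made so that the snippet parses.
SAMPLE = """<div0 type='aasta'><head>Agraarteadus 2001</head>
<div1 type='number'><head>2001 Nr 2</head>
<div2 type='artikkel'><head>EESTI HOLSTEINI GENEETILISE SELEKTSIOONIEDU MAJANDUSLIK VÄÄRTUS</head>
<p>Article text (placeholder).</p>
</div2>
</div1>
</div0>"""


def list_articles(xml_text):
    """Return (year heading, issue heading, article heading) for every article."""
    root = ET.fromstring(xml_text)  # root is assumed to be the <div0> element
    year = root.findtext("head", default="").strip()
    rows = []
    for issue in root.findall("div1"):  # one <div1> per journal issue
        issue_title = issue.findtext("head", default="").strip()
        for article in issue.findall("div2"):  # one <div2> per article
            title = article.findtext("head", default="").strip()
            rows.append((year, issue_title, title))
    return rows


for row in list_articles(SAMPLE):
    print(" | ".join(row))
```

A real file also starts with the <teiheader> block, so the <div0> element would first have to be located inside the parsed document rather than taken as the root.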
Non-textual material has been omitted from the corpus. By non-textual material we mean pictures (photos, drawings, diagrams etc.), tables, lists of references etc. Longer non-Estonian passages, usually the English summaries of the articles, have also been omitted.
In the corpus version that can be accessed via our corpus query, all mark-up has been deleted except the tags
<gap> used for the omitted material.
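
The effect of this clean-up can be approximated as follows. This is a rough sketch, not the procedure actually used to build the query version: it assumes tags never contain a literal '>' inside attribute values, and the function name strip_markup_except_gap and the sample line are hypothetical.

```python
import re

# Matches any tag except <gap ...>, </gap> and <gap/>.
# Assumption: attribute values never contain a '>' character.
NON_GAP_TAG = re.compile(r"<(?!/?gap\b)[^>]*>", flags=re.IGNORECASE)


def strip_markup_except_gap(text):
    """Delete every tag from the text but leave <gap> tags in place."""
    return NON_GAP_TAG.sub("", text)


# Hypothetical input line, for illustration only.
line = "<p>Text before the table <gap/> and text after it.</p>"
print(strip_markup_except_gap(line))
```

A regular expression is used here instead of an XML parser so that the sketch also works on fragments that are not well-formed.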