Most frequent/significant collocations

Scroll down for more information.

Lists:

1. Lemma-lemma collocations:
lemma-lemma adjective- noun collocations:
lemma- lemma AS Sag
lemma- lemma AS LL
lemma- lemma AS MI
lemma- lemma AS MS
lemma-lemma adverb- adjective collocations:
lemma- lemma DA Sag
lemma- lemma DA LL
lemma- lemma DA MI
lemma- lemma DA MS
lemma-lemma noun- adverb collocations:
lemma- lemma SD Sag
lemma- lemma SD LL
lemma- lemma SD MI
lemma- lemma SD MS
lemma-lemma noun- noun collocations:
lemma- lemma SS Sag
lemma- lemma SS LL
lemma- lemma SS MI
lemma- lemma SS MS
lemma-lemma verb- adjective collocations:
lemma- lemma VA Sag
lemma- lemma VA LL
lemma- lemma VA MI
lemma- lemma VA MS
lemma-lemma verb- adverb collocations:
lemma- lemma VD Sag
lemma- lemma VD LL
lemma- lemma VD MI
lemma- lemma VD MS
lemma-lemma verb- noun collocations:
lemma- lemma VS Sag
lemma- lemma VS LL
lemma- lemma VS MI
lemma- lemma VS MS
lemma-lemma verb- verb collocations:
lemma- lemma VV Sag
lemma- lemma VV LL
lemma- lemma VV MI
lemma- lemma VV MS

2. Lemma-word form collocations:
lemma-word form adverb- adjective collocations:
lemma- word form DA Sag
lemma- word form DA LL
lemma- word form DA MI
lemma- word form DA MS
lemma-word form adjective- adverb collocations:
lemma- word form AD Sag
lemma- word form AD LL
lemma- word form AD MI
lemma- word form AD MS
lemma-word form adjective- noun collocations:
lemma- word form AS Sag
lemma- word form AS LL
lemma- word form AS MI
lemma- word form AS MS
lemma-word form noun- adjective collocations:
lemma- word form SA Sag
lemma- word form SA LL
lemma- word form SA MI
lemma- word form SA MS
lemma-word form verb- adjective collocations:
lemma- word form VA Sag
lemma- word form VA LL
lemma- word form VA MI
lemma- word form VA MS
lemma-word form adjective- verb collocations:
lemma- word form AV Sag
lemma- word form AV LL
lemma- word form AV MI
lemma- word form AV MS
lemma-word form verb- adverb collocations:
lemma- word form VD Sag
lemma- word form VD LL
lemma- word form VD MI
lemma- word form VD MS
lemma-word form adverb- verb collocations:
lemma- word form DV Sag
lemma- word form DV LL
lemma- word form DV MI
lemma- word form DV MS
lemma-word form noun- adverb collocations:
lemma- word form SD Sag
lemma- word form SD LL
lemma- word form SD MI
lemma- word form SD MS
lemma-word form adverb- noun collocations:
lemma- word form DS Sag
lemma- word form DS LL
lemma- word form DS MI
lemma- word form DS MS
lemma-word form noun- noun collocations:
lemma- word form SS Sag
lemma- word form SS LL
lemma- word form SS MI
lemma- word form SS MS
lemma-word form verb- noun collocations:
lemma- word form VS Sag
lemma- word form VS LL
lemma- word form VS MI
lemma- word form VS MS
lemma-word form noun- verb collocations:
lemma- word form SV Sag
lemma- word form SV LL
lemma- word form SV MI
lemma- word form SV MS
lemma-word form verb- verb collocations:
lemma- word form VV Sag
lemma- word form VV LL
lemma- word form VV MI
lemma- word form VV MS

3. Word form-word form collocations:
word form-word form adjective- noun collocations:
word form- word form AS Sag
word form- word form AS LL
word form- word form AS MI
word form- word form AS MS
word form-word form adverb- adjective collocations:
word form- word form DA Sag
word form- word form DA LL
word form- word form DA MI
word form- word form DA MS
word form-word form noun- adverb collocations:
word form- word form SD Sag
word form- word form SD LL
word form- word form SD MI
word form- word form SD MS
word form-word form noun- noun collocations:
word form- word form SS Sag
word form- word form SS LL
word form- word form SS MI
word form- word form SS MS
word form-word form verb- adjective collocations:
word form- word form VA Sag
word form- word form VA LL
word form- word form VA MI
word form- word form VA MS
word form-word form verb- adverb collocations:
word form- word form VD Sag
word form- word form VD LL
word form- word form VD MI
word form- word form VD MS
word form-word form verb- noun collocations:
word form- word form VS Sag
word form- word form VS LL
word form- word form VS MI
word form- word form VS MS
word form-word form verb- verb collocations:
word form- word form VV Sag
word form- word form VV LL
word form- word form VV MI
word form- word form VV MS


What is a collocation?

Lexicographers, linguists and computational linguists have come up with different definitions of the term collocation. Here, following the usual practice in computational linguistics, a collocation is an expression consisting of words that co-occur in natural language texts more frequently than one would expect on the basis of their individual frequencies. Collocations differ in the number of words that make them up and in the syntactic relations between those words. Based on their meaning, collocations can be (1) idioms (for example hambasse puhuma 'lie'), which are comprehensively represented in dictionaries but rarely occur in actual texts; (2) particle verbs and support verb combinations (üle saama 'get over', õppust võtma 'learn a lesson'), of which particle verbs are typically represented in dictionaries; or (3) various noun phrases (for example rohelised mehikesed 'extraterrestrials').

Collocations also include fixed expressions, in which the words are used in their usual meaning, but only one of the theoretically possible word combinations is actually used to express a given meaning. For example, in Estonian firewood is chopped (puid lõhutakse) but not broken (tehakse katki). In English one can make or deliver a speech, whereas in Estonian only the verb pidama 'hold, make' can be used in this context, not esitama 'deliver'. The large number of such fixed expressions makes them a problem for foreign language learners.

Because word order in Estonian is free, the words making up a collocation can be separated from each other by several intervening words. In the following sentence, for example, four other word forms stand between the constituents of the particle verb üle saama 'to overcome': Kass ei saanud priske hiire kaotusest kuidagi üle ('The cat could not get over the loss of the plump mouse').
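To illustrate why a collocation search over Estonian text cannot rely on adjacency alone, here is a minimal sketch that counts word pairs within a window of intervening tokens. The function name `cooccurrences`, the `max_gap` parameter and the hand-lemmatized token list are assumptions made for this illustration; the page does not describe how the actual tool matches separated constituents.

```python
from collections import Counter

def cooccurrences(lemmas, max_gap=4):
    """Count unordered lemma pairs separated by at most max_gap other tokens."""
    counts = Counter()
    for i, left in enumerate(lemmas):
        # A partner may stand up to max_gap + 1 positions to the right.
        for right in lemmas[i + 1 : i + max_gap + 2]:
            counts[tuple(sorted((left, right)))] += 1
    return counts

# Hand-lemmatized version of the example sentence (an assumption for this
# sketch; no lemmatizer is called here).
sentence = ["kass", "ei", "saama", "priske", "hiir", "kaotus", "kuidagi", "üle"]

wide = cooccurrences(sentence, max_gap=4)
narrow = cooccurrences(sentence, max_gap=3)
print(wide[("saama", "üle")], narrow[("saama", "üle")])
```

With a gap of four intervening tokens the pair saama + üle is found; with a gap of three it is missed, which is why the window size matters for a free word order language.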

Collocation Tool

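There is an interface for finding collocations in the Balanced Corpus of Estonian, the Estonian Reference Corpus and its subcorpora. One can use the interface to search for collocations in three different ways, corresponding to the lemma-lemma, lemma-word form and word form-word form pairings used in the lists on this page.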

In the collocate search, both lemmas and word forms can be restricted by their word class. To detect collocations in a corpus, three association measures are used: (1) log-likelihood (LL), (2) Mutual Information (MI) and (3) Minimum Sensitivity (MS). For comparison, one can also search for word pairs ordered by frequency (Sag, from Estonian sagedus 'frequency').
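The page names the three measures but gives no formulas. The sketch below uses their common textbook definitions, computed from a 2x2 contingency table: the log-likelihood ratio, pointwise mutual information as log2 of observed over expected frequency, and minimum sensitivity as the smaller of the two conditional proportions. The function name `association_scores` and its parameters are hypothetical, and the variants implemented in the actual tool may differ.

```python
import math

def association_scores(o11, f1, f2, n):
    """Score a word pair.

    o11 -- joint frequency of the pair (must be > 0)
    f1  -- frequency of the first word, f2 -- frequency of the second word
    n   -- total number of pairs in the sample
    """
    # Observed 2x2 contingency table.
    observed = [o11, f1 - o11, f2 - o11, n - f1 - f2 + o11]
    # Frequencies expected if the two words were independent.
    expected = [
        f1 * f2 / n,
        f1 * (n - f2) / n,
        (n - f1) * f2 / n,
        (n - f1) * (n - f2) / n,
    ]
    # Log-likelihood ratio; empty cells contribute nothing.
    ll = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    # Mutual information: how far the joint frequency exceeds expectation.
    mi = math.log2(o11 / expected[0])
    # Minimum sensitivity: the weaker of the two conditional proportions.
    ms = min(o11 / f1, o11 / f2)
    return {"Sag": o11, "LL": ll, "MI": mi, "MS": ms}

# Invented counts, for illustration only.
attracted = association_scores(o11=30, f1=100, f2=60, n=10_000)
independent = association_scores(o11=10, f1=100, f2=1_000, n=10_000)
print(attracted)
print(independent)
```

In the second call the joint frequency equals the expected frequency, so LL and MI come out at zero: the pair co-occurs exactly as often as its individual frequencies predict and would not count as a collocation under the definition given earlier.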

What are these lists good for?

The collocation finder retrieves the collocations of a single word. To analyse the most frequent collocations in a text corpus as a whole, however, frequency lists compiled from the source material are needed. Here we present lists of the 5000 most significant or most frequent collocations, ordered by an association measure or by frequency.
Like the searches offered by the collocation tool, these lists are organized (1) by the word class of the collocates and (2) by whether each collocate is a lemma or a word form. For each word class pair there are thus three lists: lemma-lemma, lemma-word form and word form-word form.
The following word class pairs are included in the lists: adjective-noun (AS), adverb-adjective (DA), noun-adverb (SD), noun-noun (SS), verb-adjective (VA), verb-adverb (VD), verb-noun (VS) and verb-verb (VV).

Lemma-lemma and word form-word form pairs are symmetric: because the frequencies of the pairs lootma 'hope' V abi 'help' S and abi 'help' S lootma 'hope' V are the same, only one of the two orders is represented in the lists. Lemma-word form pairs, on the other hand, are not symmetric, so their mirror cases are also given (compare juriidiline A isiku S and isik S juriidilise A). The lemma-word form collocations therefore also include the following "opposite pairs": adjective-adverb (AD), noun-adjective (SA), adjective-verb (AV), adverb-verb (DV), adverb-noun (DS) and noun-verb (SV).
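The difference between symmetric and asymmetric pairs can be sketched as a choice of counting key. The helper `pair_key` below is hypothetical and only illustrates the idea; it is not taken from the tool.

```python
def pair_key(first, second, symmetric):
    """Return the key under which a pair of (form, word class) items is counted."""
    if symmetric:
        # lemma-lemma or word form-word form: both orders share one key.
        return tuple(sorted((first, second)))
    # lemma-word form: the first item is the lemma, the second the word form,
    # so swapping the roles yields a different pair.
    return (first, second)

# Symmetric case: one list entry covers both orders.
same = pair_key(("lootma", "V"), ("abi", "S"), True) == pair_key(
    ("abi", "S"), ("lootma", "V"), True
)

# Asymmetric case: the mirror pair is a separate entry.
differ = pair_key(("juriidiline", "A"), ("isiku", "S"), False) != pair_key(
    ("isik", "S"), ("juriidilise", "A"), False
)
print(same, differ)
```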

Of these collocation pairs, the 5000 most frequent or most significant ones that occur in the corpus at least ten times are represented. They are listed according to their frequency (Sag) and the three association measures mentioned above: log-likelihood (LL), Mutual Information (MI) and Minimum Sensitivity (MS).
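The selection just described (a minimum frequency of ten, then the top 5000 by one measure) might look like the following sketch. The function `top_collocations`, the row layout and the numbers in the toy data are all assumptions made for illustration.

```python
def top_collocations(rows, measure, min_freq=10, limit=5000):
    """Keep pairs seen at least min_freq times, rank by measure, cut to limit."""
    frequent = [row for row in rows if row["Sag"] >= min_freq]
    ranked = sorted(frequent, key=lambda row: row[measure], reverse=True)
    return ranked[:limit]

# Toy rows with invented numbers, for illustration only.
rows = [
    {"pair": "pair_a", "Sag": 250, "MI": 2.1},
    {"pair": "pair_b", "Sag": 12, "MI": 9.4},
    {"pair": "pair_c", "Sag": 3, "MI": 14.0},  # dropped: fewer than ten occurrences
]

by_frequency = [row["pair"] for row in top_collocations(rows, "Sag")]
by_mi = [row["pair"] for row in top_collocations(rows, "MI")]
print(by_frequency, by_mi)
```

The frequency threshold is applied before ranking, so a rare pair with a very high association score (such as pair_c here) never reaches a list; this is one reason the Sag and MI orderings of the same data can look quite different.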
Word class labels:
_A_ adjective
_D_ adverb
_S_ noun
_V_ verb



Last modified: October 11 2018 19:28:36.