Estonian Coursebook Corpus 2018 is an Estonian corpus of complete sentences taken from Estonian language textbooks for learners at the A1, A2, B1, B2 and C1 proficiency levels. It is based on Estonian Coursebook Corpus 2017 and was created in cooperation with the Institute of the Estonian Language.

Availability

This Estonian corpus is available on request. To gain access, please contact Jelena Kallas (jelena.kallas@eki.ee) or Kristina Koppel (Kristina.Koppel@eki.ee), then forward their reply to our support email, support@sketchengine.eu, so that we can grant you access to the corpus.

Part-of-speech tagset and lemmatization

The Estonian Coursebook Corpus 2018 is part-of-speech tagged with the EstNLTK analyzer, which marks the part of speech and grammatical categories of each token. The corpus is also lemmatized: each word form is assigned to its base form (lemma).
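To illustrate what this annotation means for each token, here is a minimal Python sketch. It does not call EstNLTK. The hand-made `TOY_LEXICON`, the `Token` type, the `annotate` function and the single-letter tags (S, V, A) are all assumptions made for this example, and they are not the confirmed tagset or pipeline of this corpus.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str   # word form as it appears in the text
    lemma: str  # base form assigned to the word form
    pos: str    # part-of-speech label (illustrative only)


# Hypothetical hand-made lexicon: word form -> (lemma, part of speech).
# A real analyzer such as EstNLTK derives this information itself.
TOY_LEXICON = {
    "loen": ("lugema", "V"),      # verb form, "I read"
    "ilusat": ("ilus", "A"),      # adjective form, "beautiful"
    "raamatut": ("raamat", "S"),  # noun form, "book"
}


def annotate(sentence: str) -> List[Token]:
    """Attach a lemma and a part-of-speech label to each whitespace-separated word."""
    tokens = []
    for raw in sentence.split():
        word = raw.strip(".,!?")
        if not word:
            continue
        # Unknown words keep their lowercased form as lemma and get the tag "?".
        lemma, pos = TOY_LEXICON.get(word.lower(), (word.lower(), "?"))
        tokens.append(Token(word, lemma, pos))
    return tokens


for token in annotate("Loen ilusat raamatut."):
    print(token.word, token.lemma, token.pos, sep="\t")
```

Each printed line pairs a word form with its lemma and tag, which is the kind of information that lets a corpus query match every inflected form of a word through its lemma.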

Reference

KALLAS, J., & KOPPEL, K. (2018). Eesti keele A1-C1 õpikute korpus 2018. Center of Estonian Language Resources. https://doi.org/10.15155/3-00-0000-0000-0000-071E9L

Search the Estonian corpus

Sketch Engine offers a range of tools to work with this Estonian corpus.

Other text corpora

Sketch Engine offers 800+ language corpora.

Use Sketch Engine in minutes

Generate collocations, frequency lists, examples in context and n-grams, or extract terms. Use our Quick Start Guide to learn how in minutes.