Estonian Coursebook Corpus 2018 is an Estonian corpus containing complete sentences from Estonian language textbooks for students at the A1, A2, B1, B2 and C1 proficiency levels. The corpus is based on Estonian Coursebook Corpus 2017 and was created in cooperation with the Institute of the Estonian Language.
Availability
This Estonian corpus is available on request. To gain access, please contact Jelena Kallas (jelena.kallas@eki.ee) or Kristina Koppel (Kristina.Koppel@eki.ee), then forward their reply to our support email, support@sketchengine.eu, so that we can grant you access to the corpus.
Part-of-speech tagset and lemmatization
The Estonian Coursebook Corpus 2018 is part-of-speech tagged using the EstNLTK analyzer, which marks the part of speech and grammatical category of each token. The corpus is also lemmatized: each word form is assigned its base form (lemma).
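To illustrate what tagging and lemmatization add to each token, here is a minimal Python sketch. It does not call EstNLTK and does not reproduce the corpus's actual export format. The tab-separated layout (word form, lemma, part-of-speech label), the sample sentences, the tag labels and the helper names `parse_tokens` and `forms_by_lemma` are all assumptions made for illustration.

```python
from typing import NamedTuple


class Token(NamedTuple):
    word: str   # word form as it appears in the text
    lemma: str  # base form assigned by lemmatization
    pos: str    # part-of-speech label (illustrative values only)


# Hypothetical sample, one token per line: word form, lemma, POS label.
SAMPLE = (
    "Lapsed\tlaps\tS\n"
    "loevad\tlugema\tV\n"
    "raamatuid\traamat\tS\n"
    "Laps\tlaps\tS\n"
    "loeb\tlugema\tV\n"
    "raamatut\traamat\tS\n"
)


def parse_tokens(text: str) -> list[Token]:
    """Turn tab-separated lines into Token records, skipping blank lines."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue
        word, lemma, pos = line.split("\t")
        tokens.append(Token(word, lemma, pos))
    return tokens


def forms_by_lemma(tokens: list[Token]) -> dict[str, list[str]]:
    """Group the observed word forms under their shared lemma."""
    index: dict[str, list[str]] = {}
    for token in tokens:
        index.setdefault(token.lemma, []).append(token.word)
    return index


if __name__ == "__main__":
    for lemma, forms in forms_by_lemma(parse_tokens(SAMPLE)).items():
        print(lemma, forms)
```

Grouping by lemma is what lets a single query for a base form retrieve all of its inflected forms, which matters for a morphologically rich language such as Estonian.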
Reference
KALLAS, J., & KOPPEL, K. (2018). Eesti keele A1-C1 õpikute korpus 2018. Center of Estonian Language Resources. https://doi.org/10.15155/3-00-0000-0000-0000-071E9L
Search the Estonian corpus
Sketch Engine offers a range of tools to work with this Estonian corpus.
Use Sketch Engine in minutes
Generate collocations, frequency lists, examples in context and n-grams, or extract terms. Use our Quick Start Guide to get started in minutes.