Yiddish corpus from Wikipedia
The Yiddish Wikipedia Corpus (yiwiki) is a Yiddish corpus made up of texts collected from the Yiddish edition of the internet encyclopedia Wikipedia in December 2018. The corpus consists of 2 million words.
Tools to work with the Yiddish corpus
A complete set of tools is available for working with this Yiddish Wikipedia corpus. The tools generate:
- word lists – lists of Yiddish words organized by frequency
- n-grams – frequency list of multi-word units
- concordance – examples in context
- keywords – terminology extraction of one-word units
- text type analysis – statistics of metadata in the corpus
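To make the list above more concrete, here is a minimal Python sketch of what a word list, an n-gram list, and a concordance compute. It is not Sketch Engine's implementation or API. The sample sentence, the whitespace tokenizer, and the function names (`tokenize`, `word_list`, `ngram_counts`, `concordance`) are all assumptions made for illustration; a real corpus tool uses a proper tokenizer and indexed storage.

```python
import string
from collections import Counter

# Hypothetical sample sentence standing in for corpus text (assumption).
SAMPLE = "דאס איז א בוך און דאס איז א טיש"


def tokenize(text):
    """Split on whitespace and strip ASCII punctuation from token edges.

    A rough stand-in for a real tokenizer; splitting on whitespace keeps
    Hebrew-script combining marks attached to their base letters.
    """
    stripped = (word.strip(string.punctuation) for word in text.split())
    return [word for word in stripped if word]


def word_list(tokens):
    """Return (word, frequency) pairs, most frequent first."""
    return Counter(tokens).most_common()


def ngram_counts(tokens, n=2):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


if __name__ == "__main__":
    tokens = tokenize(SAMPLE)
    print(word_list(tokens))
    print(ngram_counts(tokens, 2).most_common(3))
    for left, keyword, right in concordance(tokens, tokens[1], width=2):
        print(f"{left} [{keyword}] {right}")
```

Each function takes a plain list of tokens, so the same tokenized text can feed all three views.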
Search the Yiddish corpus
Sketch Engine offers a range of tools to work with this Yiddish corpus from Wikipedia.
Use Sketch Engine in minutes
Generate collocations, frequency lists, examples in context, and n-grams, or extract terms. Use our Quick Start Guide to learn how in minutes.