Corpus of Hebrew translation texts
The Hebrew translation corpus, also known as the Hebrew Comparable Corpus, is a Hebrew language corpus made up of two components: translated texts and non-translated texts. Each component contains about fifteen books (fiction and non-fiction). The two components are matched for topic and genre: for example, there is one biography in each. The corpus is best suited to studying differences between translated and non-translated language, but it can also be used to study language use more generally.
The corpus was compiled as part of a project funded by the Israel Science Foundation and carried out in the Department of Translation and Interpreting Studies at Bar-Ilan University.
Part-of-speech tagset
The Hebrew translation corpus is tagged with the YAP part-of-speech tagset.
Tools to work with the Hebrew Translation Corpus
A complete set of Sketch Engine tools is available for working with the Hebrew translation corpus. They can be used to generate:
- keywords – terminology extraction of one-word units
- word lists – lists of Hebrew nouns, verbs, adjectives etc. organized by frequency
- n-grams – frequency list of multi-word units
- concordance – examples in context
- text type analysis – statistics of metadata in the corpus
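To make the output of these tools more concrete, the sketch below computes a word list, an n-gram frequency list and a simple concordance over a tiny token sequence. It is only an illustration of the general techniques, not Sketch Engine's implementation: the function names (`word_list`, `ngrams`, `concordance`) and the transliterated sample tokens are hypothetical, and the sample is split on whitespace rather than tokenized and tagged as a real corpus would be.

```python
from collections import Counter


def word_list(tokens):
    """Return (word, count) pairs sorted by descending frequency."""
    return Counter(tokens).most_common()


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def concordance(tokens, query, width=2):
    """Return (left context, keyword, right context) for each hit of query."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == query:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Hypothetical, transliterated sample; a real corpus would be far larger.
tokens = "ha-yeled kara sefer ve ha-yeled kara iton".split()

print(word_list(tokens))            # words ordered by frequency
print(ngrams(tokens, 2).most_common(3))  # most frequent two-word units
for left, kw, right in concordance(tokens, "kara"):
    print(f"{left:>15} [{kw}] {right}")  # keyword shown in context
```

Running the same functions separately over the translated and the non-translated component would give two sets of frequencies that can then be compared, which is the kind of analysis the corpus is designed for.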
Search the Hebrew Translation corpus
Sketch Engine offers a range of tools to work with the Hebrew Translation corpus.
Use Sketch Engine in minutes
Generating collocations, frequency lists, examples in context and n-grams, or extracting terms, is easy with Sketch Engine. Use our Quick Start Guide to learn it in minutes.