SiBol corpus of English broadsheets
…modifications. Authors of the SiBol corpus The SiBol corpus was compiled by a small team of linguistics researchers at the…
…process the corpus data so that the complete Sketch Engine functionality is available. It involves the computation of Word Sketches,…
…where otherwise stated, reuse of the EUR-Lex data for commercial or non-commercial purposes is authorized provided the source is acknowledged…
…data produced from 1990 onward. The whole corpus comprises 11 million words. The MASC subcorpus consists of 480k…
…corpus cover the period of a hundred years from 1919 to 2019. The parliamentary data is public domain. The corpus…
Verifying corpus consistency, integrity and completeness The corpcheck program should be run on each corpus; it checks and reports errors…
…Internet. This Hebrew corpus is a domain-independent web corpus consisting of newspaper pages, blog posts, commercial websites, etc. A final…
…definitions in Subcorpus definition Save and Compile. After compilation, the subcorpora will be available in the subcorpus selectors in the…
…for Computational Linguistics: Posters & Demonstrations. Association for Computational Linguistics, 2006, pp. 87–90. Arabic part-of-speech tag set Mona T. Diab…
…Many Languages (Kilgarriff et al. at LREC 2010). Data was crawled by the SpiderLing web spider in 2009 and comprises…
…domains in the respective continents. Detailed information about TenTen corpora is on the separate page Common TenTen corpora attributes. Part-of-speech…
…the corpus September 23, 2012 The Turkic part crawled from the Turkish domain .tr was renamed to trTenTen [2012] initial…
…important aim in creating this corpus was to get a corpus that was comparable to the PAROLE corpus of the…
…sources such as scanned books, transcribed data, internet texts, etc. The corpus is classified according to genre, domain, and source…
…for number (e.g. “sheep”, “cod”, “headquarters”) NN1 -> NN1: singular common noun (e.g. “book”, “girl”) NN2 -> NN2: plural common…