BLaRC: British Law Report Corpus

The British Law Report Corpus is an 8.5-million-word English corpus of judicial decisions. Explore legal English with this corpus.

The British Law Report Corpus (BLaRC) is a British English corpus made up of judicial decisions issued by British courts and tribunals. The corpus consists of 8.5 million words of legal texts published in 2008–2010.

The corpus is owned by the University of Murcia, Spain, and was compiled by Dr. María José Marín Pérez.

Part-of-speech tagset

The BLaRC corpus is tagged using the English Penn Treebank tagset.

Tools to work with the British Law Report Corpus

A complete set of tools is available for working with the BLaRC corpus of legal documents. The tools generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • keywords – terminology extraction of one-word and multi-word units
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • text type analysis – statistics of metadata in the corpus
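
As a rough illustration of what two of these outputs contain, the sketch below builds an n-gram frequency list and a simple concordance using only the Python standard library. It is not Sketch Engine's implementation: the function names (`tokenize`, `ngram_frequencies`, `concordance`) are hypothetical, the tokenizer is deliberately naive, and the sample sentences are made up rather than taken from BLaRC.

```python
from collections import Counter
import re

# Made-up sample text for illustration only; it is not taken from BLaRC.
SAMPLE = (
    "The court of appeal dismissed the appeal. "
    "The court of appeal allowed the claim."
)

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def ngram_frequencies(tokens, n):
    """Count every run of n consecutive tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

def concordance(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits

tokens = tokenize(SAMPLE)
trigrams = ngram_frequencies(tokens, 3)
hits = concordance(tokens, "appeal")

# Print the trigrams from most to least frequent.
for gram, count in trigrams.most_common():
    print(count, gram)

# Print each hit with the keyword aligned in the middle.
for left, kw, right in hits:
    print(f"{left:>25} [{kw}] {right}")
```

A real corpus tool works on tagged, lemmatized text and indexes it for speed; this sketch only shows the shape of the data that an n-gram list and a concordance return.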

Bibliography

Marín, M.J. (2013). Identification and analysis of the specialised vocabulary of British law reports: A corpus-driven study of this legal genre at the core of common law legal systems. Doctoral thesis. Universidad de Murcia.

Rea Rizzo, C., & Marín Pérez, M.J. (2012). Structure and design of the British Law Report Corpus (BLRC): a legal corpus of judicial decisions from the UK. Journal of English Studies, 10, 131-145.

Search the BLaRC corpus

Sketch Engine offers a range of tools to work with the British Law Report Corpus.
