Historical collection of the Text Creation Partnership (TCP)

Early English Books Online (EEBO) – Phase I; Eighteenth Century Collections Online (ECCO); Early American Imprints, Series I: Evans

This Sketch Engine English corpus collection consists of three publicly available parts of larger projects:

  • ProQuest’s EEBO-TCP Phase I – 25 364 books from the EEBO collection covering the years 1473–c.1700
  • Gale Cengage’s ECCO-TCP – 2473 titles printed in the United Kingdom between the years 1701 and 1800
  • Readex’s Evans-TCP – 5007 books published in America during the years 1639–1800

This English historical collection contains more than 826 million words. Corpus searches can be restricted by criteria such as the year or century of publication, key terms in books, etc.

Part-of-speech tagset and lemmatization

The Early English Books Online corpus is part-of-speech tagged by TreeTagger using the Penn TreeBank tagset, in which each tag indicates the part of speech and grammatical category. The corpus is also lemmatized: each word form in the corpus is assigned to its base form (lemma).
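To make the tagging and lemmatization concrete, here is a minimal Python sketch that reads a small tagged sample and counts lemmas by part of speech. It assumes a three-column, tab-separated layout (word form, tag, lemma). The sample tokens, the function names `parse_vertical` and `lemma_counts`, and the layout itself are illustrative assumptions, not the corpus's documented export format.

```python
from collections import Counter

# Hypothetical sample: one token per line, tab-separated as
# word form, Penn Treebank tag, lemma. Invented for illustration.
SAMPLE = (
    "The\tDT\tthe\n"
    "kings\tNNS\tking\n"
    "were\tVBD\tbe\n"
    "crowned\tVBN\tcrown\n"
)


def parse_vertical(text):
    """Return a list of (word, tag, lemma) tuples, skipping malformed lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # ignore blank lines or lines without three columns
        rows.append(tuple(parts))
    return rows


def lemma_counts(rows, tag_prefix):
    """Count lemmas whose tag starts with tag_prefix (e.g. 'NN' for nouns)."""
    return Counter(lemma for _, tag, lemma in rows if tag.startswith(tag_prefix))


rows = parse_vertical(SAMPLE)
noun_lemmas = lemma_counts(rows, "NN")
verb_lemmas = lemma_counts(rows, "VB")
```

Grouping by tag prefix works here because Penn Treebank noun tags start with NN and verb tags start with VB, so the plural form "kings" is counted under the lemma "king".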

Diachronic analysis

The corpus contains time metadata, which enables users to apply the trends feature in Sketch Engine. The trends feature analyses how the frequency of a word changes over time by comparing its frequency across a series of comparable time periods.
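The idea of comparing frequencies across time periods can be sketched in Python as follows. This is a simplified illustration, not Sketch Engine's actual trends algorithm: it normalizes raw hits to a per-million rate for each period and fits an ordinary least-squares slope. All counts and period labels are invented, and the function names are hypothetical.

```python
def per_million(hits, total_tokens):
    """Normalize a raw hit count to occurrences per million tokens."""
    return hits * 1_000_000 / total_tokens


def slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Invented example data: hits of one word and period sizes in tokens.
periods = ["1500s", "1600s", "1700s"]
hits = [50, 200, 900]
tokens = [10_000_000, 20_000_000, 30_000_000]

relative = [per_million(h, t) for h, t in zip(hits, tokens)]
trend = slope(relative)  # positive means the word becomes more frequent
```

Normalizing first matters because the periods differ in size. Raw hit counts would overstate growth in the periods with more text.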

Availability

The corpus is accessible to all users with a subscription plan and to site licence members, but not to trial users.

Tools to work with the historical English corpus comprising Early English Books Online (EEBO), ECCO and Evans

A complete set of tools is available to work with this historical English corpus to generate:

  • word sketch – English collocations categorized by grammatical relations
  • thesaurus – synonyms and similar words for every word
  • word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
  • n-grams – frequency list of multi-word units
  • concordance – examples in context
  • trends – diachronic analysis automatically identifies neologisms and changes in use
  • keywords – extraction of one-word terminology
  • text type analysis – statistics of metadata in the corpus
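As an illustration of what an n-gram frequency list contains, the following Python sketch counts bigrams in a short token sequence. It only shows the concept and makes no claim about how Sketch Engine computes its lists. The sample sentence and the helper name `ngrams` are assumptions for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token sequences as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Invented sample text, split on whitespace for simplicity.
tokens = "to be or not to be".split()

bigram_freq = Counter(ngrams(tokens, 2))
top = bigram_freq.most_common(1)  # the most frequent bigram and its count
```

A real corpus query would run over tokenized corpus text rather than a whitespace split, but the counting step is the same.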

Search this historical English corpus

Study Early Modern English with this historical English corpus, which encompasses sources such as Early English Books Online (EEBO) and ECCO-TCP.
