Susanne corpus: a subset of the Brown Corpus
The Susanne corpus is a subset of the Brown Corpus of American English. It is annotated with a dedicated scheme designed to represent all aspects of English grammar that are definite enough to be annotated formally.
Detailed information can be found at https://www.grsampson.net/SueDoc.html
Part-of-speech tagset
The Susanne corpus was tagged by TreeTagger using the dedicated Susanne part-of-speech (POS) tagset.
Tools to work with the Susanne corpus
A complete set of tools is available for this English corpus. The tools generate:
- word sketch – English collocations categorized by grammatical relations
- thesaurus – synonyms and similar words for every word
- word lists – lists of English nouns, verbs, adjectives etc. organized by frequency
- n-grams – frequency lists of multi-word units
- concordance – examples in context
- keywords – terminology extraction of one-word and multi-word units
- text type analysis – statistics of metadata in the corpus
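To make three of the outputs listed above more concrete (word lists, n-grams and concordance), here is a minimal Python sketch. It is not how Sketch Engine implements these tools. The function names (`word_list`, `ngrams`, `concordance`) and the sample sentence are invented for illustration, and the sentence is not taken from the Susanne corpus.

```python
from collections import Counter


def word_list(tokens):
    """Return (token, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


def ngrams(tokens, n):
    """Return (n-gram, count) pairs for contiguous runs of n tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams).most_common()


def concordance(tokens, keyword, width=3):
    """Return keyword-in-context rows: (left context, keyword, right context)."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Hypothetical toy input, split on whitespace; a real corpus would be
# tokenized and tagged first.
tokens = "the cat sat on the mat and the cat slept on the mat".split()

print(word_list(tokens)[:3])
print(ngrams(tokens, 2)[:3])
for left, kw, right in concordance(tokens, "cat", width=2):
    print(f"{left:>10} [{kw}] {right}")
```

The sketch counts raw word forms only. Word sketches and the thesaurus additionally rely on part-of-speech tags and grammatical relations, which are not modelled here.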
Bibliography
Geoffrey Sampson (Ed.). 1995. Susanne Corpus. School of Cognitive & Computing Sciences, University of Sussex.
Search the Susanne corpus
Sketch Engine offers a range of tools to work with this English corpus.