Keyword and term extraction
Keywords and terms are words and phrases that are typical of your corpus because they appear in it more frequently than they would in general language. They can be used to identify or understand the main topic of the corpus. Sketch Engine combines statistics with linguistic criteria to extract keywords and terms.
Keywords
Keywords are single words (tokens) which appear in the focus corpus more frequently than they would in general language. The general language is represented by the reference corpus.
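The comparison between a focus corpus and a reference corpus can be illustrated with a small script. This is only a sketch of the general idea, not Sketch Engine's actual implementation: the function names, the `smoothing` parameter and the toy corpora are assumptions made for illustration. It normalizes counts to frequency per million tokens and ranks words by a smoothed ratio of the two frequencies.

```python
from collections import Counter


def per_million(count, corpus_size):
    """Normalize a raw count to a frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def keyword_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Rank focus-corpus words by how much more frequent they are
    than in the reference corpus.

    The score is a smoothed ratio of normalized frequencies. The
    smoothing constant (an assumption here) avoids division by zero
    for words that never occur in the reference corpus.
    """
    focus_counts = Counter(focus_tokens)
    ref_counts = Counter(reference_tokens)
    focus_size = len(focus_tokens)
    ref_size = len(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        focus_fpm = per_million(count, focus_size)
        ref_fpm = per_million(ref_counts.get(word, 0), ref_size)
        scores[word] = (focus_fpm + smoothing) / (ref_fpm + smoothing)

    # Highest score first: the most corpus-specific words lead the list.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy corpora, invented for this example only.
focus = "the enzyme binds the substrate and the enzyme releases the product".split()
reference = "the cat sat on the mat and the dog sat on the rug".split()

ranked = keyword_scores(focus, reference)
for word, score in ranked[:3]:
    print(word, round(score, 2))
```

Words such as "the" occur at similar rates in both corpora and score close to 1, while words that are frequent only in the focus corpus rise to the top.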
Terms
Terms are multi-word units (phrases) that fulfil two conditions:
(1) they appear in the focus corpus more frequently than they would in the general language (or in a reference corpus)
(2) they have a structure allowed for terms in the language (set in a term grammar)
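The two conditions can be sketched as a filter followed by a frequency comparison. The code below is a simplified illustration under stated assumptions, not Sketch Engine's term grammar format or scoring: the part-of-speech tags, the `ALLOWED_PATTERNS` set, the function names and the sample data are all hypothetical. It keeps only word sequences whose tag sequence matches an allowed pattern, then ranks them by a smoothed ratio of normalized frequencies.

```python
from collections import Counter

# Hypothetical stand-in for a term grammar: tag sequences that a
# multi-word term is allowed to have.
ALLOWED_PATTERNS = {
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("ADJ", "NOUN", "NOUN"),
}


def candidate_terms(tagged_tokens, patterns=ALLOWED_PATTERNS):
    """Count word sequences whose tag sequence matches an allowed
    pattern (condition 2)."""
    counts = Counter()
    for n in {len(pattern) for pattern in patterns}:
        for start in range(len(tagged_tokens) - n + 1):
            window = tagged_tokens[start:start + n]
            tags = tuple(tag for _, tag in window)
            if tags in patterns:
                counts[" ".join(word for word, _ in window)] += 1
    return counts


def extract_terms(focus_tagged, reference_tagged, smoothing=1.0):
    """Rank candidates by how much more frequent they are in the focus
    corpus than in the reference corpus (condition 1)."""
    focus_counts = candidate_terms(focus_tagged)
    ref_counts = candidate_terms(reference_tagged)
    scores = {}
    for term, count in focus_counts.items():
        focus_fpm = count * 1_000_000 / len(focus_tagged)
        ref_fpm = ref_counts.get(term, 0) * 1_000_000 / len(reference_tagged)
        scores[term] = (focus_fpm + smoothing) / (ref_fpm + smoothing)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy tagged corpora, invented for this example only.
focus_tagged = [
    ("neural", "ADJ"), ("network", "NOUN"), ("learns", "VERB"),
    ("the", "DET"), ("neural", "ADJ"), ("network", "NOUN"),
    ("weights", "NOUN"),
]
reference_tagged = [
    ("the", "DET"), ("big", "ADJ"), ("house", "NOUN"), ("has", "VERB"),
    ("a", "DET"), ("big", "ADJ"), ("garden", "NOUN"),
]

terms = extract_terms(focus_tagged, reference_tagged)
for term, score in terms:
    print(term, round(score, 2))
```

Sequences such as "learns the" are never scored, because their tag sequence is not among the allowed patterns, however often they occur.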
How to use keywords and terms
Term extraction generally only makes sense with user corpora. You can build a corpus from your own texts or, if you do not have any, have Sketch Engine find relevant texts for you.
Log in, then build a corpus or select one built previously. On the dashboard, click KEYWORDS & TERMS or use the same button on the main menu. The extraction will start automatically. It can take several minutes for very large corpora.
go to Keywords
local menu to display:
- keyword in context (concordance)
- keyword in the context of the reference corpus (concordance)
controls on the results screen to:
- change the criteria of the extraction (for expert users only)
- download the results
- change how you view the results and what is displayed
- view extraction criteria
- add this result to your favourites for easy access next time