What is CQL?
The Corpus Query Language (CQL) is a query language used in Sketch Engine to search for complex grammatical or lexical patterns, or to apply search criteria that cannot be set through the standard user interface.
Where is CQL used?
CQL is used in the concordance search with the CQL option selected. It is also used in Word Sketch Grammar and Term Grammar.
CQL? Regular expressions? Wild cards?
All three can be used in Sketch Engine.
- CQL (Corpus Query Language): sets criteria for positions or tokens, i.e. words, lemmas, tags, lempos, lc etc.
- Regular expressions (regex): set criteria for strings of characters. They can be used inside CQL code or for filtering word lists.
- Wild cards: a simple convention that uses the question mark (?) for any single unspecified character and the asterisk (*) for any number of unspecified characters. Wild cards only work in the simple concordance search.
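The wild card convention can be read as shorthand for a small subset of regular expressions. The Python sketch below illustrates that relationship only; it is not how Sketch Engine implements wild cards. The function names `wildcard_to_regex` and `matches` are hypothetical, and the sketch assumes that the asterisk may also stand for zero characters.

```python
import re


def wildcard_to_regex(pattern: str) -> str:
    """Translate a wild card pattern into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")            # exactly one unspecified character
        elif ch == "*":
            parts.append(".*")           # any number of unspecified characters (assumed: zero or more)
        else:
            parts.append(re.escape(ch))  # every other character is taken literally
    return "".join(parts)


def matches(pattern: str, word: str) -> bool:
    """Return True if the whole word fits the wild card pattern."""
    return re.fullmatch(wildcard_to_regex(pattern), word) is not None


words = ["bread", "brown", "bird", "woman", "women"]
print([w for w in words if matches("br*", w)])    # ['bread', 'brown']
print([w for w in words if matches("wom?n", w)])  # ['woman', 'women']
```

The sketch matches the whole word rather than a substring, so `br*` finds words starting with br- and not words that merely contain it.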
| | CQL | regular expressions | wild cards |
|---|---|---|---|
| purpose | to set conditions for tokens (words), e.g. find words which are nouns followed by a preposition | to set conditions for character strings such as words or tags, e.g. find all words starting with the letters br- or find all words whose tag starts with the letter N | simple system with limited options to search for text |
| where to use it | in concordance search with the CQL option (for advanced users); in Word Sketch Grammar; in Term Grammar | in concordance search with these options: lemma, word, phrase, character (not in the simple query); inside CQL code; in word lists and n-grams, to find only the required patterns | only in the simple concordance search |
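The split between token-level conditions (CQL) and character-level conditions (regular expressions) can be illustrated with a small Python sketch. It models a corpus as a list of tokens, each carrying attributes, and checks a sequence of per-token conditions whose values are regular expressions. This is roughly what a CQL query such as `[tag="N.*"] [tag="IN"]` expresses, assuming a tagset in which noun tags start with N and prepositions are tagged IN. The sample sentence, the tags and the helper `find_sequences` are hypothetical, and the sketch is not Sketch Engine's implementation.

```python
import re

# Each token is one position in the corpus with several attributes.
# The tags are illustrative and not taken from any particular corpus.
tokens = [
    {"word": "The", "tag": "DT"},
    {"word": "bridge", "tag": "NN"},
    {"word": "over", "tag": "IN"},
    {"word": "the", "tag": "DT"},
    {"word": "brook", "tag": "NN"},
    {"word": "broke", "tag": "VBD"},
]


def find_sequences(tokens, conditions):
    """Find runs of consecutive tokens where token i satisfies conditions[i].

    Each condition maps an attribute name to a regular expression that
    must match the whole attribute value.
    """
    hits = []
    n = len(conditions)
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if all(
            re.fullmatch(regex, tok.get(attr, ""))
            for tok, cond in zip(window, conditions)
            for attr, regex in cond.items()
        ):
            hits.append([tok["word"] for tok in window])
    return hits


# Token-level condition: a noun followed by a preposition.
noun_then_prep = find_sequences(tokens, [{"tag": "N.*"}, {"tag": "IN"}])
print(noun_then_prep)  # [['bridge', 'over']]

# Character-level condition on a single token: words starting with br-.
br_words = find_sequences(tokens, [{"word": "br.*"}])
print(br_words)  # [['bridge'], ['brook'], ['broke']]
```

The list of conditions plays the role of CQL (one condition per token position), while each value inside a condition is an ordinary regular expression applied to a character string.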
CQL was developed by the Corpora and Lexicons group at IMS, University of Stuttgart, in the early 1990s (see IMS Corpus Workbench). CQL as used in Sketch Engine extends the original language and differs from it in several ways. This documentation describes CQL as implemented in manatee 2.122 (released April 2015).