A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

The English CLAWS part-of-speech tagset, version 5, is used in English corpora annotated with CLAWS (the Constituent Likelihood Automatic Word-tagging System), a tagger developed by the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University.

English tagsets used in Sketch Engine

The Constituent Likelihood Automatic Word-tagging System, abbreviated CLAWS, was developed by UCREL at Lancaster University. The tagset listed here is version 5.

An example of a tag used in the CQL concordance search box: [tag="VBD"] finds all past forms of the verb “be”: was, were. (Note: make sure that you use straight double quotation marks.)
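The query format above can be sketched in a few lines of Python. This is only an illustration: `tag_query` and `straighten_quotes` are hypothetical helper names, not part of Sketch Engine, and the script only builds and cleans query strings that you would then paste into the CQL search box.

```python
# Hypothetical helpers for preparing CQL tag queries; not a Sketch Engine API.

# Map typographic double quotes to the straight quote that CQL expects.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"', "\u201e": '"'})


def tag_query(tag: str) -> str:
    """Build a one-token CQL query that matches the given POS tag."""
    return f'[tag="{tag}"]'


def straighten_quotes(query: str) -> str:
    """Replace curly double quotes (e.g. pasted from a word processor)."""
    return query.translate(CURLY_TO_STRAIGHT)


print(tag_query("VBD"))
print(straighten_quotes("[tag=\u201cVBD\u201d]"))
```

Both calls print the same query, `[tag="VBD"]`; the second shows how a query with curly quotes can be repaired before use.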

TAGSET

POS Tag Description
AJ0 adjective (unmarked) (e.g. GOOD, OLD)
AJC comparative adjective (e.g. BETTER, OLDER)
AJS superlative adjective (e.g. BEST, OLDEST)
AT0 article (e.g. THE, A, AN)
AV0 adverb (unmarked) (e.g. OFTEN, WELL, LONGER, FURTHEST)
AVP adverb particle (e.g. UP, OFF, OUT)
AVQ wh-adverb (e.g. WHEN, HOW, WHY)
CJC coordinating conjunction (e.g. AND, OR)
CJS subordinating conjunction (e.g. ALTHOUGH, WHEN)
CJT the conjunction THAT
CRD cardinal numeral (e.g. 3, FIFTY-FIVE, 6609) (excluding ONE)
DPS possessive determiner form (e.g. YOUR, THEIR)
DT0 general determiner (e.g. THESE, SOME)
DTQ wh-determiner (e.g. WHOSE, WHICH)
EX0 existential THERE
ITJ interjection or other isolate (e.g. OH, YES, MHM)
NN0 noun (neutral for number) (e.g. AIRCRAFT, DATA)
NN1 singular noun (e.g. PENCIL, GOOSE)
NN2 plural noun (e.g. PENCILS, GEESE)
NP0 proper noun (e.g. LONDON, MICHAEL, MARS)
NULL the null tag (for items not to be tagged)
ORD ordinal (e.g. SIXTH, 77TH, LAST)
PNI indefinite pronoun (e.g. NONE, EVERYTHING)
PNP personal pronoun (e.g. YOU, THEM, OURS)
PNQ wh-pronoun (e.g. WHO, WHOEVER)
PNX reflexive pronoun (e.g. ITSELF, OURSELVES)
POS the possessive (or genitive morpheme) ’S or ’
PRF the preposition OF
PRP preposition (except for OF) (e.g. FOR, ABOVE, TO)
PUL punctuation – left bracket (i.e. ( or [ )
PUN punctuation – general mark (i.e. . ! , : ; – ? … )
PUQ punctuation – quotation mark (i.e. ` ' " )
PUR punctuation – right bracket (i.e. ) or ] )
TO0 infinitive marker TO
UNC “unclassified” items which are not words of the English lexicon
VBB the “base forms” of the verb “BE” (except the infinitive), i.e. AM, ARE
VBD past form of the verb “BE”, i.e. WAS, WERE
VBG -ing form of the verb “BE”, i.e. BEING
VBI infinitive of the verb “BE”
VBN past participle of the verb “BE”, i.e. BEEN
VBZ -s form of the verb “BE”, i.e. IS, ’S
VDB base form of the verb “DO” (except the infinitive), i.e. DO
VDD past form of the verb “DO”, i.e. DID
VDG -ing form of the verb “DO”, i.e. DOING
VDI infinitive of the verb “DO”
VDN past participle of the verb “DO”, i.e. DONE
VDZ -s form of the verb “DO”, i.e. DOES
VHB base form of the verb “HAVE” (except the infinitive), i.e. HAVE
VHD past tense form of the verb “HAVE”, i.e. HAD, ’D
VHG -ing form of the verb “HAVE”, i.e. HAVING
VHI infinitive of the verb “HAVE”
VHN past participle of the verb “HAVE”, i.e. HAD
VHZ -s form of the verb “HAVE”, i.e. HAS, ’S
VM0 modal auxiliary verb (e.g. CAN, COULD, WILL, ’LL)
VVB base form of lexical verb (except the infinitive) (e.g. TAKE, LIVE)
VVD past tense form of lexical verb (e.g. TOOK, LIVED)
VVG -ing form of lexical verb (e.g. TAKING, LIVING)
VVI infinitive of lexical verb
VVN past participle form of lexical verb (e.g. TAKEN, LIVED)
VVZ -s form of lexical verb (e.g. TAKES, LIVES)
XX0 the negative NOT or N’T
ZZ0 alphabetical symbol (e.g. A, B, c, d)

Source: http://ucrel.lancs.ac.uk/claws5tags.html
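
Because related tags share a prefix (for example, all forms of “BE” start with VB), a single pattern can cover a whole group. The sketch below assumes that a CQL attribute value is treated as a regular expression that must match the whole tag; check the Sketch Engine CQL documentation before relying on this. The `matching_tags` function is a hypothetical helper that previews, offline, which tags from the list above a pattern would select. Only the noun and verb tags are included to keep the example short.

```python
import re

# Noun and verb tags copied from the CLAWS5 list above.
TAGS = [
    "NN0", "NN1", "NN2", "NP0",
    "VBB", "VBD", "VBG", "VBI", "VBN", "VBZ",
    "VDB", "VDD", "VDG", "VDI", "VDN", "VDZ",
    "VHB", "VHD", "VHG", "VHI", "VHN", "VHZ",
    "VM0",
    "VVB", "VVD", "VVG", "VVI", "VVN", "VVZ",
]


def matching_tags(pattern: str, tags=TAGS) -> list:
    """Return the tags that the pattern matches in full (not as a substring)."""
    regex = re.compile(pattern)
    return [tag for tag in tags if regex.fullmatch(tag)]


print(matching_tags("VB."))  # all forms of BE
print(matching_tags("NN."))  # common nouns, any number
print(matching_tags("V.D"))  # past forms of BE, DO, HAVE and lexical verbs
```

Under the stated assumption, a pattern checked this way could be used in a query such as [tag="VB."] to find every form of “be” in one search.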
