A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The Historical English Penn Treebank part-of-speech tagset is used in corpora of historical English. It is a specialized POS tagset designed to describe the grammatical categories of historical language. See more at http://www.ling.upenn.edu/histcorpora/
An example of a tag in the CQL concordance search box: [tag="ADJ.*"]
finds all adjectives (tags ADJ, ADJR and ADJS), e.g. good, slow (note: make sure that you use straight double quotation marks)
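As a rough illustration of how such a pattern selects tags, the sketch below applies the regular expression from the example to a handful of tags from this tagset. It assumes that the value in a CQL attribute expression has to match the whole tag, which `re.fullmatch` imitates here. The helper name `matching_tags` is hypothetical and not part of any corpus tool.

```python
import re

# A small sample of tags taken from the tagset on this page.
SAMPLE_TAGS = ["ADJ", "ADJR", "ADJS", "ADV", "ADVR", "N", "NS", "VBD"]


def matching_tags(pattern, tags):
    """Return the tags that the pattern matches in full (assumed CQL behaviour)."""
    regex = re.compile(pattern)
    return [tag for tag in tags if regex.fullmatch(tag)]


print(matching_tags("ADJ.*", SAMPLE_TAGS))  # ['ADJ', 'ADJR', 'ADJS']
print(matching_tags("N.*", SAMPLE_TAGS))    # ['N', 'NS']
```

Because the pattern ends in `.*`, it also picks up the comparative and superlative tags; a pattern without the wildcard, such as `ADJ`, would select only the plain adjective tag under the same assumption.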
Tagset
Tag | Description |
---|---|
. | sentence-final punctuation |
, | sentence-internal punctuation |
‘ | single quote |
“ | double quote |
$ | possessive marker |
+ | joins constituent morphemes in compounds Example: (N+N mankind) |
A | |
ADJ | adjective |
ADJR | adjective, comparative |
ADJS | adjective, superlative |
ADV | adverb |
ADVR | adverb, comparative |
ADVS | adverb, superlative |
ALSO | the words ALSO (except when = AS) and EKE (the latter only in Middle English) |
B | |
BAG | BE, present participle |
BE | BE, infinitive |
BED | BE, past (including past subjunctive) |
BEI | BE, imperative |
BEN | BE, perfect participle |
BEP | BE, present (including present subjunctive) |
C | |
C | complementizer |
CODE | non-text material (e.g., page numbers) |
CONJ | coordinating conjunction |
D | |
D | determiner |
DAG | DO, present participle |
DAN | DO, passive participle (verbal or adjectival) |
DO | DO, infinitive |
DOD | DO, past (including past subjunctive) |
DOI | DO, imperative |
DON | DO, perfect participle |
DOP | DO, present (including present subjunctive) |
E | |
ELSE | the word ELSE in the collocation OR ELSE |
EX | existential THERE |
F | |
FOR | infinitival FOR |
FOR+TO | cliticized FOR+TO |
FP | focus particle |
FW | foreign word |
H | |
HAG | HAVE, present participle |
HAN | HAVE, passive participle (verbal or adjectival) |
HV | HAVE, infinitive |
HVD | HAVE, past (including past subjunctive) |
HVI | HAVE, imperative |
HVN | HAVE, perfect participle |
HVP | HAVE, present (including present subjunctive) |
I | |
ID | token identification number |
INTJ | interjection |
L | |
LB | line break |
LS | list marker |
M | |
MAN | indefinite subject pronoun (ME, MAN); only in Middle English |
MD | modal verb |
MD0 | modal verb, untensed |
META | special text material (e.g., stage directions); generally found only in drama |
N | |
N | common noun, singular |
N$ | common noun, singular, possessive |
NEG | negation |
NPR | proper noun, singular |
NPR$ | proper noun, singular, possessive |
NPRS | proper noun, plural |
NPRS$ | proper noun, plural, possessive |
NS | common noun, plural |
NS$ | common noun, plural, possessive |
NUM | cardinal number |
NUM$ | cardinal number, possessive |
O | |
ONE | the word ONE (except as focus particle) |
ONE$ | ONE, possessive |
OTHER | the word OTHER (except as conjunction) |
OTHER$ | OTHER, nominal use, possessive |
OTHERS | OTHER, nominal use, plural |
OTHERS$ | OTHER, nominal use, plural possessive |
P | |
P | preposition or subordinating conjunction |
PRO | personal pronoun |
PRO$ | possessive pronoun |
Q | |
Q | quantifier |
Q$ | quantifier, possessive |
QR | quantifier, comparative (MORE, LESS) |
QS | quantifier, superlative (MOST, LEAST) |
R | |
RP | adverbial particle |
S | |
SUCH | the word SUCH |
T | |
TO | infinitival TO, TIL, and AT |
V | |
VAG | present participle |
VAN | passive participle (verbal or adjectival) |
VB | infinitive, verbs other than BE, DO, HAVE |
VBD | past (including past subjunctive) |
VBI | imperative |
VBN | perfect participle |
VBP | present (including present subjunctive) |
W | |
WADV | wh-adverb |
WARD | the morpheme WARD |
WD | wh-determiner |
WPRO | wh-pronoun |
WPRO$ | possessive wh-pronoun |
WQ | WHETHER introducing indirect questions |
X | |
X | tag for unknown part of speech |
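Since the `$` and `+` markers combine with other tags, a tag such as `NPRS$` or `N+N` can be taken apart mechanically. The following is a minimal sketch, assuming that `+` only ever separates constituent tags and that `$` only appears as a final possessive marker; real corpus data may contain combinations this does not cover. The function name `split_tag` is hypothetical.

```python
def split_tag(tag):
    """Split a POS tag into its constituent tags and a possessive flag."""
    # Assumption: "$" occurs only at the end of a tag.
    possessive = tag.endswith("$")
    core = tag[:-1] if possessive else tag
    # Assumption: "+" only joins constituent tags; drop empty pieces so that
    # the bare markers "$" and "+" yield no constituents.
    parts = [part for part in core.split("+") if part]
    return parts, possessive


print(split_tag("NPRS$"))  # (['NPRS'], True)
print(split_tag("N+N"))    # (['N', 'N'], False)
print(split_tag("ADJ"))    # (['ADJ'], False)
```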
Extended syntactic tags
The basic syntactic tags (phrase and clause labels such as NP, CP and IP) can be modified by the following extended tags (also referred to as dash tags).
Suffix tag | Definition | Example | Meaning of example |
---|---|---|---|
-LFD | left-dislocated constituent | ADVP-LOC-LFD | left-dislocated locative adverb phrase |
-PRN | parenthetical or appositive | NP-PRN | parenthetical or appositive noun phrase |
-RSP | resumptive constituent | NP-SBJ-RSP | resumptive subject |
-SPE | direct speech (only on CP, IP) | CP-REL-SPE | relative clause, direct speech |
-SPE | direct speech (only on CP, IP) | IP-MAT-SPE | matrix clause, direct speech |
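The way dash tags stack on a base tag can be sketched as follows. This is only an illustration, assuming that hyphens in a label always separate the base tag from its dash tags; the names `split_label` and `extended_tags` are hypothetical, and only the four extended tags from the table above are looked up.

```python
# The four extended (dash) tags from the table above.
EXTENDED_TAGS = {
    "LFD": "left-dislocated constituent",
    "PRN": "parenthetical or appositive",
    "RSP": "resumptive constituent",
    "SPE": "direct speech (only on CP, IP)",
}


def split_label(label):
    """Split a label such as 'ADVP-LOC-LFD' into its base tag and dash tags."""
    base, *dash_tags = label.split("-")
    return base, dash_tags


def extended_tags(label):
    """Return the definitions of the extended tags found on a label."""
    _, dash_tags = split_label(label)
    return {tag: EXTENDED_TAGS[tag] for tag in dash_tags if tag in EXTENDED_TAGS}


print(split_label("ADVP-LOC-LFD"))  # ('ADVP', ['LOC', 'LFD'])
print(extended_tags("CP-REL-SPE"))  # {'SPE': 'direct speech (only on CP, IP)'}
```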
“-#” (a hyphen followed by a numeric index) is used to coindex antecedents and their traces, as well as expletives (overt or empty) that are associated with a clause or noun phrase.
“=#” (an equals sign followed by a numeric index) is used to coindex gapped clauses with full clauses. See Gapping, Right-node raising.
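A short sketch of how the two kinds of index could be read off a label. It assumes that the index is always the last element of the label; the labels `NP-SBJ-1` and `IP-MAT=1` are made-up illustrations rather than corpus examples, and the function name `split_index` and the kind names it returns are hypothetical.

```python
import re

# A label, then "-" or "=", then a numeric index at the very end.
INDEXED = re.compile(r"(?P<label>.+?)(?P<sep>[-=])(?P<index>\d+)")


def split_index(label):
    """Return (label, kind, index); kind and index are None without an index."""
    match = INDEXED.fullmatch(label)
    if match is None:
        return label, None, None
    # "-#" coindexes antecedents, traces and expletives; "=#" marks gapping.
    kind = "coindexed" if match["sep"] == "-" else "gapping"
    return match["label"], kind, int(match["index"])


print(split_index("NP-SBJ-1"))  # ('NP-SBJ', 'coindexed', 1)
print(split_index("IP-MAT=1"))  # ('IP-MAT', 'gapping', 1)
print(split_index("NP-SBJ"))    # ('NP-SBJ', None, None)
```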
Empty categories
Symbol | Description |
---|---|
0 | empty operator |
*arb* | arbitrary subject in ECM infinitives (as in I have heard *arb* tell) |
*con* | subject elided under conjunction |
*exp* | empty expletive subject |
*pro* | “small pro” subject |
*ICH* | mnemonic for “insert constituent here”; trace of extraposition, scrambling, or other movement that does not fit neatly into the A/A’ dichotomy |
*T* | trace of A’-movement |
* | trace of A-movement; also default empty category |
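When working with the overt words only, the empty categories listed above have to be filtered out. The sketch below shows one way to do that, under two assumptions that should be checked against the actual corpus format: each empty category appears as a token of its own, and it may carry a trailing numeric coindex such as `-1`. The function name `empty_category` is hypothetical, and a bare `0` would also match a token that is a literal zero.

```python
import re

# Empty categories from the table above, with shortened descriptions.
EMPTY_CATEGORIES = {
    "0": "empty operator",
    "*arb*": "arbitrary subject in ECM infinitives",
    "*con*": "subject elided under conjunction",
    "*exp*": "empty expletive subject",
    "*pro*": '"small pro" subject',
    "*ICH*": "trace of extraposition, scrambling, or other movement",
    "*T*": "trace of A'-movement",
    "*": "trace of A-movement; default empty category",
}


def empty_category(token):
    """Return the description of an empty category, or None for overt tokens."""
    # Assumption: a coindex such as "-1" may follow the symbol, e.g. "*T*-1".
    symbol = re.sub(r"-\d+$", "", token)
    return EMPTY_CATEGORIES.get(symbol)


tokens = ["I", "have", "heard", "*arb*", "tell"]
overt = [t for t in tokens if empty_category(t) is None]
print(overt)  # ['I', 'have', 'heard', 'tell']
```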
Source: http://www.ling.upenn.edu/histcorpora/annotation/labels.htm