A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The FinnTreeBank2 part-of-speech tagset is available in Finnish corpora annotated by the Omorfi morphological analyzer. FinnTreeBank is a subproject of the national FIN-CLARIN project in the European Framework Programme CLARIN. It is a collaboration between the Department of Modern Languages of the University of Helsinki and the Research Institute for the Languages of Finland.
FinnTreeBank2 is the second version of the FinnTreeBank. The annotation scheme remained the same as in FinnTreeBank1.
An example of a tag in the CQL concordance search box: [tag="N_Nom_Sg"]
finds all nouns in the nominative singular, e.g. komissio, päätös (note: make sure that you use straight double quotation marks, not typographic ones)
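Queries pasted from a word processor often contain typographic quotation marks, which the note above warns against. The following is a minimal sketch of how a query string could be cleaned before it is pasted into the search box. The helper names straighten_quotes and tag_query are hypothetical and are not part of any corpus tool.

```python
# Hypothetical helpers for preparing a CQL tag query; illustration only.

# Typographic double quotes that commonly replace straight ones when text
# is copied from a word processor or a web page.
CURLY_QUOTES = "\u201c\u201d\u201e\u201f"


def straighten_quotes(query: str) -> str:
    """Replace typographic double quotes with straight double quotes."""
    return query.translate({ord(ch): '"' for ch in CURLY_QUOTES})


def tag_query(tag: str) -> str:
    """Wrap a tag value in a CQL attribute expression such as [tag="..."]."""
    return f'[tag="{tag}"]'


# A query pasted with curly quotes, then cleaned.
pasted = "[tag=\u201cN_Nom_Sg\u201d]"
print(straighten_quotes(pasted))

# The same query built directly from the tag value.
print(tag_query("N_Nom_Sg"))
```

Both calls produce the query from the example above, written with straight double quotation marks.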
Tagset
Short tag | Omorfi tag | Description |
Adj | [POS=ADJECTIVE] | adjective |
Adp | [POS=ADPOSITION] | adposition |
Adv | [POS=ADVERB] | adverb |
Conj | [POS=CONJUNCTION] | conjunction |
Interj | [POS=INTERJECTION] | interjection |
Noun | [POS=NOUN] | substantive |
Num | [POS=NUMERAL] | numeral |
Pron | [POS=PRONOUN] | pronoun |
Verb | [POS=VERB] | verb |
abe | [CASE=ABE] | abessive |
abl | [CASE=ABL] | ablative |
acc | [CASE=ACC] | accusative |
ade | [CASE=ADE] | adessive |
all | [CASE=ALL] | allative |
cmt | [CASE=CMT] | comitative |
ela | [CASE=ELA] | elative |
ess | [CASE=ESS] | essive |
gen | [CASE=GEN] | genitive |
ill | [CASE=ILL] | illative |
ine | [CASE=INE] | inessive |
ins | [CASE=INS] | instructive |
nom | [CASE=NOM] | nominative |
par | [CASE=PAR] | partitive |
prl | [CASE=PRL] | prolative |
tra | [CASE=TRA] | translative |
han | [CLIT=HAN] | hAn (clitic particle) |
ka | [CLIT=KA] | kA (clitic particle) |
kaan | [CLIT=KAAN] | kAAn (clitic particle) |
kin | [CLIT=KIN] | kin (clitic particle) |
ko | [CLIT=KO] | kO (clitic particle) |
pa | [CLIT=PA] | pA (clitic particle) |
s | [CLIT=S] | s (clitic particle) |
cmp | [CMP=CMP] | comparative |
sup | [CMP=SUP] | superlative |
minen | [DRV=MINEN] | minen (derivative) |
tse | [DRV=TSE] | tse (derivative) |
act | [GEN=ACT] | active |
card | [GEN=CARD] | cardinal |
ord | [GEN=ORD] | ordinal |
pss | [GEN=PSS] | passive |
a | [INF=A] | a-infinitive |
e | [INF=E] | e-infinitive |
ma | [INF=MA] | mA-infinitive |
maisilla | [INF=MAISILLA] | mAisillA-infinitive |
cond | [MOOD=COND] | conditional |
impv | [MOOD=IMPV] | imperative |
indv | [MOOD=INDV] | indicative |
potn | [MOOD=POTN] | potential |
pl | [NUM=PL] | plural |
sg | [NUM=SG] | singular |
nut | [PCP=NUT] | nUt participle |
va | [PCP=VA] | vA participle |
abbr | [POS=ABBREVIATION] | abbreviation |
prep | [POS=PREPOSITION] | preposition |
postp | [POS=POSTPOSITION] | postposition |
prop | [POS=PROPER] | proper name |
poss:pl1 | [POSS=PL1] | possessive number: 1-person plural |
poss:pl2 | [POSS=PL2] | possessive number: 2-person plural |
poss:pl3 | [POSS=PL3] | possessive number: 3-person plural |
poss:sg3,pl3 | [POSS=SG3,PL3] | possessive number: 3-person singular, 3-person plural |
prs:neg | [PRS=NEG] | personal number: negative |
prs:pl1 | [PRS=PL1] | personal number: 1-person plural |
prs:pl2 | [PRS=PL2] | personal number: 2-person plural |
prs:pl3 | [PRS=PL3] | personal number: 3-person plural |
prs:sg1 | [PRS=SG1] | personal number: 1-person singular |
Source: http://www.ling.helsinki.fi/kieliteknologia/tutkimus/treebank/sources/FinnTreeBankManual.pdf
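The Omorfi tags in the table follow a [CATEGORY=VALUE] pattern, so they can be split into their parts mechanically. The sketch below copies a few rows from the table into a dictionary and parses the Omorfi form. The names SHORT_TO_OMORFI and parse_omorfi are assumptions made for illustration, and the regular expression is inferred from the rows listed here rather than taken from the Omorfi documentation.

```python
import re

# A few rows copied from the table (short tag -> Omorfi tag).
# This is a partial, illustrative subset, not the full tagset.
SHORT_TO_OMORFI = {
    "Noun": "[POS=NOUN]",
    "Verb": "[POS=VERB]",
    "nom": "[CASE=NOM]",
    "par": "[CASE=PAR]",
    "sg": "[NUM=SG]",
    "pl": "[NUM=PL]",
    "prs:sg1": "[PRS=SG1]",
    "poss:sg3,pl3": "[POSS=SG3,PL3]",
}

# Pattern inferred from the table rows: an upper-case category name,
# "=", then one or more comma-separated upper-case/digit values.
OMORFI_TAG = re.compile(r"\[([A-Z]+)=([A-Z0-9,]+)\]")


def parse_omorfi(tag: str) -> tuple[str, list[str]]:
    """Split an Omorfi-style tag into its category and list of values."""
    match = OMORFI_TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not an Omorfi-style tag: {tag!r}")
    category, values = match.groups()
    return category, values.split(",")


for short in ("nom", "poss:sg3,pl3"):
    print(short, parse_omorfi(SHORT_TO_OMORFI[short]))
```

Returning the values as a list keeps combined entries such as [POSS=SG3,PL3] usable without a special case.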