A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
Lithuanian part-of-speech tagset
This Lithuanian part-of-speech tagset is used in Lithuanian corpora tagged by the LT PoS tagger. It is based on the MULTEXT-EAST specifications but includes some modifications.
An example of a tag query in the CQL concordance search box:
[tag="N.*"]
finds all nouns, e.g. Lietuvos, metų (note: please make sure that you use straight double quotation marks).
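The way such a pattern selects tags can be sketched in a few lines of Python. This is only an illustration: it assumes that, as in CQL, the regular expression has to match the whole tag rather than a substring, and the sample tags are hypothetical, composed from the tables on this page rather than taken from a real corpus.

```python
import re

def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if the regular expression matches the entire tag."""
    return re.fullmatch(pattern, tag) is not None

# Hypothetical sample tags, composed from the tables on this page.
sample_tags = ["Npmsnnf", "Npfsgns", "Rgs", "Cg", "Ya"]

# "N.*" keeps every tag that starts with the noun category code N.
nouns = [tag for tag in sample_tags if tag_matches("N.*", tag)]
print(nouns)
```

Because the whole tag has to match, the pattern "N" on its own would select nothing here; the trailing ".*" is what lets the remaining positions take any value.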
General part-of-speech tagset classification
Noun | N |
Adjective | A |
Numeral | M |
Pronoun | P |
Verb | V |
Adverb | R |
Interjection | I |
Onomatopoeia | O |
Particle | Q |
Preposition | S |
Conjunction | C |
Residual | X |
Abbreviation | Y |
Punctuation and symbols | T |
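Since the first character of every tag is the category code, the general part of speech can be read off with a simple lookup. The sketch below copies the table above into a Python dictionary; the function name `category_of` is a hypothetical helper, not part of any tagger or corpus tool.

```python
# Category codes, copied from the general classification table.
CATEGORIES = {
    "N": "Noun",
    "A": "Adjective",
    "M": "Numeral",
    "P": "Pronoun",
    "V": "Verb",
    "R": "Adverb",
    "I": "Interjection",
    "O": "Onomatopoeia",
    "Q": "Particle",
    "S": "Preposition",
    "C": "Conjunction",
    "X": "Residual",
    "Y": "Abbreviation",
    "T": "Punctuation and symbols",
}

def category_of(tag: str) -> str:
    """Return the general part of speech named by the first character of a tag."""
    if not tag:
        raise ValueError("empty tag")
    return CATEGORIES.get(tag[0], "unknown")

print(category_of("Rgs"))  # the adverb tag used as an example further down
```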
Noun
[tag="Ncm.*"]
finds all common nouns in masculine gender, e.g. pasirodymų, kreipimųsi
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Noun | N |
1 | Type | common | c |
1 | Type | proper | p |
2 | Gender | common | c |
2 | Gender | feminine | f |
2 | Gender | masculine | m |
2 | Gender | irrelevant | – |
3 | Number | plural | p |
3 | Number | singular | s |
3 | Number | dual | d |
3 | Number | irrelevant | – |
4 | Case | nominative | n |
4 | Case | genitive | g |
4 | Case | accusative | a |
4 | Case | instrumental | i |
4 | Case | locative | l |
4 | Case | dative | d |
4 | Case | vocative | v |
4 | Case | illative | x |
4 | Case | irrelevant | – |
5 | Reflexiveness | yes | y |
5 | Reflexiveness | no | n |
6 | Name | first | f |
6 | Name | surname | s |
6 | Name | geographic | g |
6 | Name | irrelevant | – |
Adjective
[tag="Agcf.*"]
finds all comparative adjectives in feminine gender, e.g. blogesnės, geresnės
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Adjective | A |
1 | Type | general | g |
2 | Degree | positive | p |
2 | Degree | comparative | c |
2 | Degree | superlative | s |
2 | Degree | diminutive | d |
2 | Degree | – | – |
3 | Gender | feminine | f |
3 | Gender | masculine | m |
3 | Gender | neuter | n |
3 | Gender | – | – |
4 | Number | plural | p |
4 | Number | singular | s |
4 | Number | dual | d |
4 | Number | – | – |
5 | Case | nominative | n |
5 | Case | genitive | g |
5 | Case | dative | d |
5 | Case | accusative | a |
5 | Case | instrumental | i |
5 | Case | locative | l |
5 | Case | vocative | v |
5 | Case | illative | x |
5 | Case | – | – |
6 | Definiteness | yes | y |
6 | Definiteness | no | n |
Numeral
[tag="Mo.p.*"]
finds all plural ordinal numerals, e.g. antrąsias, pirmąsias
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Numeral | M |
1 | Type | cardinal | c |
1 | Type | ordinal | o |
1 | Type | collect | l |
1 | Type | multiple | m |
1 | Type | – | – |
2 | Gender | feminine | f |
2 | Gender | masculine | m |
2 | Gender | neuter | n |
2 | Gender | – | – |
3 | Number | plural | p |
3 | Number | singular | s |
3 | Number | – | – |
4 | Case | nominative | n |
4 | Case | genitive | g |
4 | Case | dative | d |
4 | Case | accusative | a |
4 | Case | instrumental | i |
4 | Case | locative | l |
4 | Case | vocative | v |
4 | Case | illative | x |
4 | Case | – | – |
5 | Form | digit | d |
5 | Form | roman | r |
5 | Form | letter | l |
5 | Form | m-form | m |
6 | Definiteness | yes | y |
6 | Definiteness | no | n |
6 | Definiteness | – | – |
Pronoun
[tag="P..sn.*"]
finds all singular pronouns in nominative case, e.g. pats, šis
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Pronoun | P |
1 | Type | general | g |
2 | Gender | feminine | f |
2 | Gender | masculine | m |
2 | Gender | neuter | n |
2 | Gender | – | – |
3 | Number | plural | p |
3 | Number | singular | s |
3 | Number | dual | d |
3 | Number | – | – |
4 | Case | nominative | n |
4 | Case | genitive | g |
4 | Case | dative | d |
4 | Case | accusative | a |
4 | Case | instrumental | i |
4 | Case | locative | l |
4 | Case | vocative | v |
4 | Case | illative | x |
4 | Case | – | – |
5 | Definiteness | yes | y |
5 | Definiteness | no | n |
5 | Definiteness | – | – |
Verb
[tag="Vg.p3.*"]
finds all general verbs in present tense and 3rd person, e.g. atsiduria, gaunasi
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Verb | V |
1 | Type | general | g |
2 | VForm | infinitive | i |
2 | VForm | main | m |
2 | VForm | participle | p |
2 | VForm | adverbial participle | a |
2 | VForm | half participle | h |
2 | VForm | adverbial participle2 | s |
3 | Tense | present | p |
3 | Tense | simple past | a |
3 | Tense | past tense | s |
3 | Tense | past frequentative | q |
3 | Tense | future | f |
4 | Person | first | 1 |
4 | Person | second | 2 |
4 | Person | third | 3 |
4 | Person | irrelevant | – |
5 | Number | plural | p |
5 | Number | singular | s |
5 | Number | dual | d |
6 | Gender | feminine | f |
6 | Gender | masculine | m |
6 | Gender | neuter | n |
6 | Gender | irrelevant | – |
7 | Voice | active | a |
7 | Voice | passive | p |
7 | Voice | necessity | n |
7 | Voice | irrelevant | – |
8 | Negative | yes | y |
8 | Negative | no | n |
9 | Definiteness | yes | y |
9 | Definiteness | no | n |
10 | Case | nominative | n |
10 | Case | genitive | g |
10 | Case | dative | d |
10 | Case | accusative | a |
10 | Case | instrumental | i |
10 | Case | locative | l |
10 | Case | vocative | v |
10 | Case | illative | x |
10 | Case | irrelevant | – |
11 | Reflexiveness | yes | y |
11 | Reflexiveness | no | n |
12 | Mood | indicative | i |
12 | Mood | subjunctive | s |
12 | Mood | imperative | m |
12 | Mood | irrelevant | – |
13 | Degree | positive | p |
13 | Degree | comparative | c |
13 | Degree | superlative | s |
13 | Degree | irrelevant | – |
Adverb
[tag="Rgs"]
finds all superlative adverbs, e.g. dažniausiai, labiausiai
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Adverb | R |
1 | Type | general | g |
2 | Degree | positive | p |
2 | Degree | comparative | c |
2 | Degree | superlative | s |
2 | Degree | diminutive | d |
2 | Degree | – | – |
Interjection
[tag="Ig"]
finds all general interjections, e.g. deja, vaje
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Interjection | I |
1 | Type | general | g |
Onomatopoeia
[tag="Og"]
finds all general onomatopoeia, e.g. dzin, vau
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Onomatopoeia | O |
1 | Type | general | g |
Particle
[tag="Qn"]
finds all indefinite particles, e.g. ane, ale, vat
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Particle | Q |
1 | Type | general | g |
1 | Type | indefinite | n |
Preposition
[tag="Sgg"]
finds all general prepositions that take the genitive case, e.g. po, iš
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Preposition | S |
1 | Type | general | g |
2 | Case | genitive | g |
2 | Case | dative | d |
2 | Case | accusative | a |
2 | Case | instrumental | i |
Conjunction
[tag="Cg"]
finds all general conjunctions, e.g. būtent, ir
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Conjunction | C |
1 | Type | general | g |
Residual
[tag="Xl"]
finds all web links, e.g. http://www.xxxx.com/
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Residual | X |
1 | Type | foreign | f |
1 | Type | typo | t |
1 | Type | segmentation | p |
1 | Type | tag | h |
1 | Type | link | l |
1 | Type | e-mail addresses | e |
Abbreviation
[tag="Ya"]
finds all acronyms, e.g. TV, SIM
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Abbreviation | Y |
1 | Type | shortening | s |
1 | Type | acronym | a |
Punctuation
P | Attribute (en) | Symbol | Value (en) | Code |
0 | CATEGORY | | Punctuation | T |
1 | Type | . | full stop/period | p |
1 | Type | , | comma | c |
1 | Type | ; | semicolon | s |
1 | Type | : | colon | n |
1 | Type | ? | question mark | q |
1 | Type | ! | exclamation mark | e |
1 | Type | … | ellipsis | i |
1 | Type | – – — | dash, minus | h |
1 | Type | ( [ { | opening bracket | l |
1 | Type | ) ] } | closing bracket | r |
1 | Type | ” ‘ „ “ | quotation mark | u |
1 | Type | / | slash, stroke | t |
1 | Type | |\*%^$ | other marks, symbols | x |
source: http://corpus.vdu.lt/en/morph