A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
Lithuanian part-of-speech tagset
This Lithuanian part-of-speech tagset is available in the LithuanianWaC corpus.
An example of a tag in the CQL concordance search box: [tag="N.*"]
finds all nouns, e.g. Lietuvos, metų (note: please make sure that you use straight double quotation marks). The corpus was tagged using dynamic tagging: for each token, the tagger iterates through a predefined sequence of attributes. If a particular attribute cannot be determined, the algorithm skips to the next attribute in the sequence and continues tagging.
General part-of-speech tagset classification
Noun | N |
Adjective | A |
Numeral | M |
Pronoun | P |
Verb | V |
Adverb | R |
Interjection | I |
Onomatopoeia | O |
Particle | Q |
Preposition | S |
Conjunction | C |
Acronym | Z |
Roman numbers | U |
Residual | X |
Abbreviation | H |
Punctuation | SENT |
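The classification above can be turned into a small lookup. The following is a minimal Python sketch, not part of the corpus tooling: the function name `category_of` is hypothetical, and it assumes that every tag starts with its one-letter category code and that punctuation is tagged exactly `SENT`, as the table suggests.

```python
# Category codes copied from the classification table above.
CATEGORIES = {
    "N": "Noun", "A": "Adjective", "M": "Numeral", "P": "Pronoun",
    "V": "Verb", "R": "Adverb", "I": "Interjection", "O": "Onomatopoeia",
    "Q": "Particle", "S": "Preposition", "C": "Conjunction", "Z": "Acronym",
    "U": "Roman numbers", "X": "Residual", "H": "Abbreviation",
    "SENT": "Punctuation",
}


def category_of(tag: str) -> str:
    """Return the part of speech named by a tag (hypothetical helper)."""
    # Assumption: punctuation is tagged exactly "SENT"; checking it first
    # keeps it from being read as a preposition, whose code is "S".
    if tag == "SENT":
        return CATEGORIES["SENT"]
    # Assumption: any other tag begins with its one-letter category code.
    return CATEGORIES.get(tag[:1], "Unknown")
```

For example, `category_of("Nc0msn")` reads the leading `N` and returns `"Noun"`; the tag itself is assembled from the codes in the Noun table for illustration, not taken from the corpus.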
Noun
[tag="Nc.m.*"]
finds all common nouns in masculine gender, e.g. pasirodymų, kreipimųsi (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Noun | N |
1 | Type | common | c |
1 | Type | proper | p |
2 | Reflexiveness | reflexive | r |
2 | Reflexiveness | non-reflexive | 0 |
3 | Gender | feminine | f |
3 | Gender | masculine | m |
3 | Gender | common | c |
3 | Gender | neuter | n |
4 | Number | singular | s |
4 | Number | plural | p |
5 | Case | nominative | n |
5 | Case | genitive | g |
5 | Case | accusative | a |
5 | Case | instrumental | i |
5 | Case | locative | l |
5 | Case | dative | d |
5 | Case | vocative | v |
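Because each attribute occupies one position in the tag, a CQL query can be assembled by writing the code for each known attribute and `.` for each unknown one. The sketch below illustrates this for nouns. The helper `noun_query` and its keyword names are hypothetical, and it assumes a noun tag carries the five positions in the order given in the table, which may not hold for every token given the dynamic tagging described earlier.

```python
# Attribute order for nouns, following positions 1-5 of the Noun table.
NOUN_ATTRIBUTES = ["type", "reflexiveness", "gender", "number", "case"]


def noun_query(**codes: str) -> str:
    """Build a CQL tag query for nouns (hypothetical helper).

    Each keyword names an attribute from the Noun table and gives its code,
    e.g. gender="m". Attributes that are not given become the wildcard ".".
    """
    unknown = set(codes) - set(NOUN_ATTRIBUTES)
    if unknown:
        raise ValueError(f"not a noun attribute: {sorted(unknown)}")

    slots = [codes.get(name, ".") for name in NOUN_ATTRIBUTES]

    # Trailing wildcards are collapsed into a single ".*", which is how the
    # examples in this document are written.
    while slots and slots[-1] == ".":
        slots.pop()
    pattern = "N" + "".join(slots)
    if len(slots) < len(NOUN_ATTRIBUTES):
        pattern += ".*"

    # Straight double quotation marks, as the notes in this document require.
    return f'[tag="{pattern}"]'
```

With these assumptions, `noun_query(type="c", gender="m")` reproduces the query `[tag="Nc.m.*"]` shown above, and `noun_query()` with no arguments gives `[tag="N.*"]`.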
Adjective
[tag="A.cf.*"]
finds all comparative adjectives in feminine gender, e.g. blogesnės, geresnės (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Adjective | A |
1 | Positiveness | positive | p |
1 | Positiveness | negative | n |
2 | Degree | positive | p |
2 | Degree | comparative | c |
2 | Degree | superlative | s |
3 | Gender | masculine | m |
3 | Gender | feminine | f |
3 | Gender | neuter | n |
4 | Number | singular | s |
4 | Number | plural | p |
4 | Number | not applicable | 0 |
5 | Case | nominative | n |
5 | Case | genitive | g |
5 | Case | accusative | a |
5 | Case | instrumental | i |
5 | Case | locative | l |
5 | Case | dative | d |
5 | Case | vocative | v |
Numeral
[tag="Mo.p.*"]
finds all ordinal pronominal numerals, e.g. pirmąją, pirmaisiais (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Numeral | M |
1 | Type | cardinal | c |
1 | Type | ordinal | o |
1 | Type | multiple | m |
1 | Type | collective | l |
2 | Degree | positive | p |
2 | Degree | superlative | s |
2 | Degree | comparative | c |
3 | Definiteness | pronominal | p |
3 | Definiteness | non-pronominal | n |
4 | Gender | masculine | m |
4 | Gender | feminine | f |
4 | Gender | neuter | 0 |
5 | Number | singular | s |
5 | Number | plural | p |
5 | Number | not applicable | 0 |
6 | Case | nominative | n |
6 | Case | genitive | g |
6 | Case | accusative | a |
6 | Case | instrumental | i |
6 | Case | locative | l |
6 | Case | dative | d |
6 | Case | vocative | v |
Pronoun
[tag="P..sn"]
finds all singular pronouns in nominative case, e.g. pats, ši (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Pronoun | P |
1 | Positiveness | negative | n |
2 | Definiteness | pronominal | p |
2 | Definiteness | non-pronominal | n |
3 | Gender | masculine | m |
3 | Gender | feminine | f |
3 | Gender | neuter | n |
4 | Number | singular | s |
4 | Number | plural | p |
4 | Number | dual | d |
5 | Case | nominative | n |
5 | Case | genitive | g |
5 | Case | accusative | a |
5 | Case | instrumental | i |
5 | Case | locative | l |
5 | Case | dative | d |
5 | Case | not applicable | 0 |
Verb
[tag="Vm...p.3"]
finds all main verbs in present tense and 3rd person, e.g. atsiduria, gaunasi (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Verb | V |
1 | Verb Form | infinitive | i |
1 | Verb Form | main | m |
1 | Verb Form | participle (for further features see the Participle table) | p |
1 | Verb Form | adverbial participle | a |
1 | Verb Form | half participle | h |
1 | Verb Form | adverbial participle2 | b |
2 | Positiveness | positive | p |
2 | Positiveness | negative | n |
3 | Reflexiveness | reflexive | r |
3 | Reflexiveness | non-reflexive | n |
4 | Mood | indicative | i |
4 | Mood | imperative | m |
4 | Mood | subjunctive | s |
5 | Tense | present tense | p |
5 | Tense | simple past | s |
5 | Tense | past tense | a |
5 | Tense | future tense | f |
5 | Tense | past frequentative | q |
6 | Number | singular | s |
6 | Number | plural | p |
7 | Person | 1st | 1 |
7 | Person | 2nd | 2 |
7 | Person | 3rd | 3 |
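To see which tags a pattern such as `Vm...p.3` selects, it can be tried against a few tags outside the corpus. The sketch below uses Python's `re.fullmatch` and assumes, as the trailing `.*` in the other examples suggests, that the pattern must cover the whole tag. The helper `tag_matches` is hypothetical, and the sample tags are assembled from the codes in the table above for illustration, not taken from the corpus.

```python
import re


def tag_matches(pattern: str, tag: str) -> bool:
    """Return True when the regular expression covers the entire tag."""
    return re.fullmatch(pattern, tag) is not None


# Hypothetical tags assembled from the codes in the Verb table:
# V + verb form + positiveness + reflexiveness + mood + tense + number + person.
SAMPLE_TAGS = [
    "Vmpnips3",  # main, positive, non-reflexive, indicative, present, singular, 3rd
    "Vmpnipp1",  # same, but plural and 1st person
    "Vmpnias3",  # same as the first, but past tense
    "Vi",        # infinitive with no further attributes
]

# Main verbs in present tense and 3rd person, as in the query above.
present_third = [t for t in SAMPLE_TAGS if tag_matches("Vm...p.3", t)]
print(present_third)
```

Only `Vmpnips3` is selected: the second tag fails on person, the third on tense, and the fourth is too short for the pattern.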
Participle
[tag="Vp..a...f.*"]
finds all participle verbs with active voice in feminine gender, e.g. susitarusios, įsipynusios (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Verb | V |
1 | Verb Form | participle | p |
2 | Positiveness | positive | p |
2 | Positiveness | negative | n |
3 | Reflexiveness | reflexive | r |
3 | Reflexiveness | non-reflexive | n |
4 | Type (Voice) | necessity | n |
4 | Type (Voice) | passive | p |
4 | Type (Voice) | active | a |
5 | Tense | present tense | p |
5 | Tense | simple past | s |
5 | Tense | past tense | a |
5 | Tense | future tense | f |
5 | Tense | past frequentative | q |
6 | Degree | superlative | s |
6 | Degree | positive | p |
7 | Definiteness | pronominal | p |
7 | Definiteness | non-pronominal | n |
8 | Gender | masculine | m |
8 | Gender | feminine | f |
8 | Gender | neuter | 0 |
9 | Number | singular | s |
9 | Number | plural | p |
10 | Case | nominative | n |
10 | Case | genitive | g |
10 | Case | dative | d |
10 | Case | accusative | a |
10 | Case | locative | l |
10 | Case | vocative | v |
10 | Case | instrumental | i |
Adverb
[tag="Rs"]
finds all superlative adverbs, e.g. dažniausiai, labiausiai (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Adverb | R |
1 | Degree | positive | p |
1 | Degree | comparative | c |
1 | Degree | superlative | s |
Particle
[tag="Qn"]
finds all indefinite particles, e.g. nejau, bene (note: please make sure that you use straight double quotation marks)
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Particle | Q |
1 | Determination | indefinite | n |
Punctuation
P | Attribute (en) | Value (en) | Code |
0 | CATEGORY | Punctuation | SENT |
. | full stop/period | SENT |
: | colon | SENT |
? | question mark | SENT |
… | ellipsis | SENT |
! | exclamation mark | SENT |
source: http://corpus.vdu.lt/en/morph
or