A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.

Lithuanian part-of-speech tagset

This Lithuanian part-of-speech tagset is available in the LithuanianWaC corpus.

An example of a tag in the CQL concordance search box: [tag="N.*"] finds all nouns, e.g. Lietuvos, metų (note: please make sure that you use straight double quotation marks).

In addition, dynamic tagging has been used: each linguistic element is tagged by iterating through a predefined sequence of attributes. If a particular attribute cannot be determined, the algorithm skips to the next attribute in the sequence and continues tagging.
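The dynamic tagging idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tagger's actual implementation: the function `build_tag`, the `Resolver` type and the toy `degree` rule are hypothetical names introduced here, and the sketch assumes that a skipped attribute simply contributes no character to the tag (the page does not say whether a placeholder is written instead).

```python
from typing import Callable, Optional, Sequence

# A resolver returns a one-character code for an attribute, or None when
# the attribute cannot be determined for the given token.
Resolver = Callable[[str], Optional[str]]


def build_tag(token: str, category: str, resolvers: Sequence[Resolver]) -> str:
    """Assemble a tag by visiting the attribute resolvers in order."""
    codes = [category]
    for resolve in resolvers:
        code = resolve(token)
        if code is None:
            continue  # attribute undetermined: skip it and move on
        codes.append(code)
    return "".join(codes)


# Toy resolver (hypothetical): treats the ending "-iausiai" as a
# superlative marker. Real morphological analysis is far richer.
def degree(token: str) -> Optional[str]:
    return "s" if token.endswith("iausiai") else None


print(build_tag("dažniausiai", "R", [degree]))  # Rs
```

The loop never aborts on a missing value, so one undetermined attribute does not prevent the remaining attributes from being tagged.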

General part-of-speech tagset classification

Noun N
Adjective A
Numeral M
Pronoun P
Verb V
Adverb R
Interjection I
Onomatopoeia O
Particle Q
Preposition S
Conjunction C
Acronym Z
Roman numbers U
Residual X
Abbreviation H
Punctuation SENT
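The classification above lends itself to a simple lookup. The following is a minimal sketch, assuming that the category is given by the first character of a tag and that SENT is the only whole-word code; `CATEGORIES` and `category_of` are illustrative names, not part of Sketch Engine.

```python
CATEGORIES = {
    "N": "Noun", "A": "Adjective", "M": "Numeral", "P": "Pronoun",
    "V": "Verb", "R": "Adverb", "I": "Interjection", "O": "Onomatopoeia",
    "Q": "Particle", "S": "Preposition", "C": "Conjunction", "Z": "Acronym",
    "U": "Roman numbers", "X": "Residual", "H": "Abbreviation",
}


def category_of(tag: str) -> str:
    """Return the part-of-speech name for a tag, or 'Unknown'."""
    if tag == "SENT":  # punctuation uses a whole-word code, check it first
        return "Punctuation"
    return CATEGORIES.get(tag[:1], "Unknown")


print(category_of("Rs"))    # Adverb
print(category_of("SENT"))  # Punctuation
```

Checking SENT before the first-character lookup keeps punctuation from being mistaken for a preposition (code S).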

Noun

[tag="Nc.m.*"] finds all common nouns in masculine gender, e.g. pasirodymų, kreipimųsi (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)     Code
0         CATEGORY        Noun           N
1         Type            common         c
                          proper         p
2         Reflexiveness   reflexive      r
                          non-reflexive  0
3         Gender          feminine       f
                          masculine      m
                          common         c
                          neuter         n
4         Number          singular       s
                          plural         p
5         Case            nominative     n
                          genitive       g
                          accusative     a
                          instrumental   i
                          locative       l
                          dative         d
                          vocative       v
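To show how the positions in the noun table map onto the characters of a tag, here is a minimal decoding sketch. It assumes that tags are strictly positional, one character per attribute, in the order of the table above. The tag `Nc0mpg` is a hypothetical example assembled from the table, not taken from the corpus, and `NOUN_ATTRIBUTES` and `decode_noun_tag` are illustrative names.

```python
NOUN_ATTRIBUTES = [
    ("Type", {"c": "common", "p": "proper"}),
    ("Reflexiveness", {"r": "reflexive", "0": "non-reflexive"}),
    ("Gender", {"f": "feminine", "m": "masculine", "c": "common", "n": "neuter"}),
    ("Number", {"s": "singular", "p": "plural"}),
    ("Case", {"n": "nominative", "g": "genitive", "a": "accusative",
              "i": "instrumental", "l": "locative", "d": "dative",
              "v": "vocative"}),
]


def decode_noun_tag(tag: str) -> dict:
    """Translate a positional noun tag into attribute/value pairs."""
    if not tag.startswith("N"):
        raise ValueError(f"not a noun tag: {tag!r}")
    decoded = {"CATEGORY": "Noun"}
    # Pair each attribute with the character at its position; zip stops
    # early if the tag is shorter than the table.
    for (name, values), code in zip(NOUN_ATTRIBUTES, tag[1:]):
        decoded[name] = values.get(code, f"unknown ({code})")
    return decoded


print(decode_noun_tag("Nc0mpg"))
```

For the hypothetical tag above, the result reads: common, non-reflexive, masculine, plural, genitive.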

Adjective

[tag="A.cf.*"] finds all comparative adjectives in feminine gender, e.g. blogesnės, geresnės (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)      Code
0         CATEGORY        Adjective       A
1         Positiveness    positive        p
                          negative        n
2         Degree          positive        p
                          comparative     c
                          superlative     s
3         Gender          masculine       m
                          feminine        f
                          neuter          n
4         Number          singular        s
                          plural          p
                          not applicable  0
5         Case            nominative      n
                          genitive        g
                          accusative      a
                          instrumental    i
                          locative        l
                          dative          d
                          vocative        v

Numeral

[tag="Mo.p.*"] finds all ordinal, pronominal numerals, e.g. pirmąją, pirmaisiais (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)      Code
0         CATEGORY        Numeral         M
1         Type            cardinal        c
                          ordinal         o
                          multiple        m
                          collective      l
2         Degree          positive        p
                          superlative     s
                          comparative     c
3         Definiteness    pronominal      p
                          non-pronominal  n
4         Gender          masculine       m
                          feminine        f
                          neuter          0
5         Number          singular        s
                          plural          p
                          not applicable  0
6         Case            nominative      n
                          genitive        g
                          accusative      a
                          instrumental    i
                          locative        l
                          dative          d
                          vocative        v

Pronoun

[tag="P..sn"] finds all singular pronouns in nominative case, e.g. pats, ši (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)      Code
0         CATEGORY        Pronoun         P
1         Positiveness    negative        n
2         Definiteness    pronominal      p
                          non-pronominal  n
3         Gender          masculine       m
                          feminine        f
                          neuter          n
4         Number          singular        s
                          plural          p
                          dual            d
5         Case            nominative      n
                          genitive        g
                          accusative      a
                          instrumental    i
                          locative        l
                          dative          d
                          not applicable  0

Verb

[tag="Vm...p.3"] finds all main verbs in the present tense and 3rd person, e.g. atsiduria, gaunasi (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)             Code
0         CATEGORY        Verb                   V
1         Verb Form       infinitive             i
                          main                   m
                          participle             p
                          adverbial participle   a
                          half participle        h
                          adverbial participle2  b
2         Positiveness    positive               p
                          negative               n
3         Reflexiveness   reflexive              r
                          non-reflexive          n
4         Mood            indicative             i
                          imperative             m
                          subjunctive            s
5         Tense           present tense          p
                          simple past            s
                          past tense             a
                          future tense           f
                          past frequentative     q
6         Number          singular               s
                          plural                 p
7         Person          1st                    1
                          2nd                    2
                          3rd                    3

For further features of the participle verb form (code p), see the Participle table.
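The query patterns on this page all follow the same scheme: the category code, one character per attribute position, a dot for every position left open, and a trailing .* when later positions are not constrained. A minimal sketch of that scheme follows. The helper `build_tag_query` and its parameters are hypothetical, written for this illustration only, and it assumes one character per position as in the tables.

```python
from typing import Dict


def build_tag_query(category: str, positions: int, constraints: Dict[int, str]) -> str:
    """Build a CQL tag query such as [tag="Vm...p.3"].

    category    -- category code at position 0, e.g. "V"
    positions   -- number of attribute positions that follow the category
    constraints -- {position: code} for the attributes to pin down
    """
    if not constraints:
        return f'[tag="{category}.*"]'
    first, last = min(constraints), max(constraints)
    if first < 1 or last > positions:
        raise ValueError("constraint position out of range")
    # Unconstrained positions become ".", which matches any single code.
    body = "".join(constraints.get(i, ".") for i in range(1, last + 1))
    # Leave the remaining positions open unless the last one is already fixed.
    tail = ".*" if last < positions else ""
    return f'[tag="{category}{body}{tail}"]'


print(build_tag_query("V", 7, {1: "m", 5: "p", 7: "3"}))  # [tag="Vm...p.3"]
print(build_tag_query("N", 5, {1: "c", 3: "m"}))          # [tag="Nc.m.*"]
```

The two calls reproduce the verb and noun queries shown on this page. The function emits straight double quotation marks, as the notes on this page require.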

Participle

[tag="Vp..a...f.*"] finds all participles with active voice in feminine gender, e.g. susitarusios, įsipynusios (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)          Code
0         CATEGORY        Verb                V
1         Verb Form       participle          p
2         Positiveness    positive            p
                          negative            n
3         Reflexiveness   reflexive           r
                          non-reflexive       n
4         Type (Voice)    necessity           n
                          passive             p
                          active              a
5         Tense           present tense       p
                          simple past         s
                          past tense          a
                          future tense        f
                          past frequentative  q
6         Degree          superlative         s
                          positive            p
7         Definiteness    pronominal          p
                          non-pronominal      n
8         Gender          masculine           m
                          feminine            f
                          neuter              0
9         Number          singular            s
                          plural              p
10        Case            nominative          n
                          genitive            g
                          dative              d
                          accusative          a
                          locative            l
                          vocative            v
                          instrumental        i
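A tag pattern in CQL is a regular expression that has to cover the whole tag, which is why open-ended queries finish with .* rather than stopping at the last fixed position. The behaviour can be approximated in Python with `re.fullmatch`. This is a rough sketch: the three tags are hypothetical, assembled by hand from the table above for illustration, and `tag_matches` is an illustrative helper.

```python
import re


def tag_matches(pattern: str, tag: str) -> bool:
    """Return True when the regular expression covers the whole tag."""
    return re.fullmatch(pattern, tag) is not None


PATTERN = "Vp..a...f.*"

# Hypothetical tags assembled by hand from the participle table.
active_feminine = "Vppnaapnfpn"    # active voice, feminine
passive_feminine = "Vppnpapnfpn"   # passive voice, feminine
active_masculine = "Vppnaapnmpn"   # active voice, masculine

for tag in (active_feminine, passive_feminine, active_masculine):
    print(tag, tag_matches(PATTERN, tag))
```

Only the first tag matches: the second fails at position 4 (voice) and the third at position 8 (gender). A bare prefix such as "Vp" matches none of them, because the pattern must account for every character of the tag.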

Adverb

[tag="Rs"] finds all superlative adverbs, e.g. dažniausiai, labiausiai (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)   Code
0         CATEGORY        Adverb       R
1         Degree          positive     p
                          comparative  c
                          superlative  s

Particle

[tag="Qn"] finds all indefinite particles, e.g. nejau, bene (note: please make sure that you use straight double quotation marks)

Position  Attribute (en)  Value (en)  Code
0         CATEGORY        Particle    Q
1         Determination   indefinite  n

Punctuation

Position  Attribute (en)  Value (en)        Code
0         CATEGORY        Punctuation       SENT
          .               full stop/period  SENT
          :               colon             SENT
          ?               question mark     SENT
                          ellipsis          SENT
          !               exclamation mark  SENT

source: http://corpus.vdu.lt/en/morph

Lithuanian text corpora in Sketch Engine

Sketch Engine offers dozens of Lithuanian language corpora.
