A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The Slovak part-of-speech tagset of the Araneum Slovacum Maius corpus is a Slovak tagset modified according to the Araneum Universal Tagset (AUT).
Examples of tags in the CQL concordance search box:
[tag="Ss"] finds all nouns (substantives) in the singular, e.g. ryba, muž.
[tag="V.*b"] finds all verbs in the second person, e.g. môžete, vyberte.
Note: make sure that you use straight double quotation marks in the query.
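The two queries above can be tried out offline with a minimal Python sketch. It assumes that a tag pattern in CQL is a regular expression that must match the whole tag, which `re.fullmatch` approximates here; the helper names `cql_tag_query` and `tag_matches` are hypothetical, and the sample tags are assembled by hand from the tables on this page rather than taken from the corpus.

```python
import re


def cql_tag_query(pattern: str) -> str:
    """Wrap a tag pattern in a CQL token query with straight double quotes."""
    return f'[tag="{pattern}"]'


def tag_matches(pattern: str, tag: str) -> bool:
    """Approximate CQL matching: the pattern has to cover the whole tag."""
    return re.fullmatch(pattern, tag) is not None


print(cql_tag_query("Ss"))    # [tag="Ss"]
print(cql_tag_query("V.*b"))  # [tag="V.*b"]

# Hand-built sample tags (assumption: positions are simply concatenated).
print(tag_matches("Ss", "Ss"))      # True: substantive, singular
print(tag_matches("Ss", "Sp"))      # False: substantive, plural
print(tag_matches("V.*b", "VKpb"))  # True: verb ending in second person
print(tag_matches("V.*b", "VKsc"))  # False: verb ending in third person
```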
Tagset
Substantive
Position | Character | Value | Example
1. part of speech | S | substantive | slovo, ryba, ústav, muž
2. number | s | singular | slovo, ryba, ústav, muž
2. number | p | plural | slová, ryby, ústavy, muži/mužovia
Adjective
Position | Character | Value | Example
1. part of speech | A | adjective | milý, svieži, priateľkin, psí
2. number congruence | s | singular | láskavý (otec), láskavý (pohľad), otcova (košeľa), super (vysvedčenie), biele (mláďa)
2. number congruence | p | plural | láskaví (otcovia), láskavé (pohľady), otcove (košele), super (vysvedčenia), biele (mláďatá)
3. degree | x | positive | vzácny, drahá, otcov, psí, strešný
3. degree | y | comparative | vzácnejší, drahší, drevenejší (tanečník)
3. degree | z | superlative | najvzácnejší, najdrahší, najdrevenejší (tanečník)
Pronoun
Position | Character | Value | Example
1. part of speech | P | pronoun | akýkoľvek, onen, jeho, kadiaľ
2. number or congruence | s | singular | ja, ktorý, ten, nikto, nejaký
2. number or congruence | p | plural | my, ktorí, tí, všetci, nijakí
Numeral
Position | Character | Value | Example
1. part of speech | N | numeral | jeden, dva, raz, sto, prvý, dvojmo
2. number or congruence | s | singular | jeden (muž), druhá (osoba), trojaké (víno)
2. number or congruence | p | plural | jedny (osoby), druhé (miesta), trojaké (vína)
Verb
Position | Character | Value | Example
1. part of speech | V | verb | klásť, čítať, vidieť, činiť
2. verb form | I | infinitive | byť, hriať, volať, viesť, hovoriť
2. verb form | K | indicative | je, hreje, volá, vedie, hovorí
2. verb form | M | imperative | buď!, hrej!, volajte!, veďte!, hovor!
2. verb form | H | transgressive | súc, hrejúc, volajúc, vedúc, hovoriac
2. verb form | L | “l” participle | bol, hrialo, volali, viedla, hovorili
2. verb form | B | future “to be” | budem, budeš, bude, budeme, budete, budú, poletím, povedú
3. number | s | singular | je, hrialo, bude, poviem
3. number | p | plural | sú, volali, budeme, hovorili
4. person | a | the first | som, sme, hrejme!, volali sme, budem hovoriť
4. person | b | the second | si, ste, hriali ste, volajte!, budeš viesť, hovoril by si
4. person | c | the third | je, sú, hrejú, volalo, povedie, hovoria
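As an illustration of how the four verb positions combine, the following Python sketch decodes a verb tag into named features. It assumes that a tag is the plain concatenation of the position characters in table order, as the `V.*b` query suggests. It does not cover how the corpus encodes positions that do not apply (for example, person with an infinitive). The function name `decode_verb_tag` and the sample tags are hypothetical.

```python
# Lookup tables copied from the verb table on this page.
VERB_FORM = {
    "I": "infinitive",
    "K": "indicative",
    "M": "imperative",
    "H": "transgressive",
    "L": '"l" participle',
    "B": 'future "to be"',
}
NUMBER = {"s": "singular", "p": "plural"}
PERSON = {"a": "first", "b": "second", "c": "third"}


def decode_verb_tag(tag: str) -> dict:
    """Map positions 2 to 4 of a verb tag to readable feature values."""
    if not tag.startswith("V"):
        raise ValueError(f"not a verb tag: {tag!r}")
    positions = [("form", VERB_FORM), ("number", NUMBER), ("person", PERSON)]
    features = {"pos": "verb"}
    # zip stops at the shorter input, so shorter tags yield fewer features.
    for char, (name, table) in zip(tag[1:], positions):
        # Characters missing from the tables are kept as-is, not guessed.
        features[name] = table.get(char, char)
    return features


print(decode_verb_tag("VKpb"))
# {'pos': 'verb', 'form': 'indicative', 'number': 'plural', 'person': 'second'}
print(decode_verb_tag("VI"))
# {'pos': 'verb', 'form': 'infinitive'}
```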
Participle
Position | Character | Value | Example
1. part of speech | G | participle | robiaci, sediaci, naložený, zohriaty
2. number congruence | s | singular | píšuci (otec), uplakaný (pohľad), vypratá (košeľa)
2. number congruence | p | plural | píšuci (otcovia), vypraté (košele), skáčuce (mačatá)
3. degree | x | positive | kričiaci, hodená, skáčuce
3. degree | y | comparative | uplakanejší, strhujúcejší
3. degree | z | superlative | najuplakanejší, najstrhujúcejší
Adverb
Position | Character | Value | Example
1. part of speech | D | adverb | prísne, milo, pravidelne, prázdno
2. degree | x | positive | draho, vzácne
2. degree | y | comparative | drahšie, vzácnejšie
2. degree | z | superlative | najdrahšie, najvzácnejšie
Preposition
Position | Character | Value | Example
1. part of speech | E | preposition | po, pre, na, do, cez, medzi
Conjunction
Position | Character | Value | Example
1. part of speech | O | conjunction | a, ale, alebo, či, pretože, že
2. conditionality | Y | conditionality | aby, keby, čoby, žeby
Particle
Position | Character | Value | Example
1. part of speech | T | particle | azda, nuž, bodaj, sotva, áno, nie
2. conditionality | Y | conditionality | kiežby, žeby
Interjection
Position | Character | Value | Example
1. part of speech | J | interjection | fíha, bác, bums, dokelu, ahoj, cveng, plesk
Reflexive morpheme
Position | Character | Value | Example
1. part of speech | R | reflexive morpheme | sa, si
Conditional morpheme
Position | Character | Value | Example
1. part of speech | Y | conditional morpheme | by
Abbreviations, symbols
Position | Character | Value | Example
1. part of speech | W | abbreviations, symbols | km, kg, atď., h3O, SND
Punctuation
Position | Character | Value | Example
1. part of speech | Z | punctuation | ., !, (, +
Indefinable part of speech
Position | Character | Value | Example
1. part of speech | Q | undefinable part of speech | bielo(-čierny), New (York)
Non-verbal element
Position | Character | Value | Example
1. part of speech | # | non-verbal element | XXXX, – – – – – – –
Foreign language citation
Position | Character | Value | Example
1. part of speech | % | foreign language citation | šaj pes dovakeras, take it easy!, náměstí
Number
Position | Character | Value | Example
1. part of speech | 0 | number | 8, 14000, 3 (razy)
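Since the first character of every tag identifies the part of speech, the tables on this page can be condensed into a single lookup. The Python sketch below is one possible way to do this. The names `POS_BY_FIRST_CHAR`, `part_of_speech` and `count_parts_of_speech` are hypothetical, and the sample tag sequence is made up for illustration, not drawn from the corpus.

```python
from collections import Counter

# First tag character -> part of speech, as listed in the tables above.
POS_BY_FIRST_CHAR = {
    "S": "substantive",
    "A": "adjective",
    "P": "pronoun",
    "N": "numeral",
    "V": "verb",
    "G": "participle",
    "D": "adverb",
    "E": "preposition",
    "O": "conjunction",
    "T": "particle",
    "J": "interjection",
    "R": "reflexive morpheme",
    "Y": "conditional morpheme",
    "W": "abbreviations, symbols",
    "Z": "punctuation",
    "Q": "undefinable part of speech",
    "#": "non-verbal element",
    "%": "foreign language citation",
    "0": "number",
}


def part_of_speech(tag: str) -> str:
    """Return the part of speech for a tag, based on its first character."""
    return POS_BY_FIRST_CHAR.get(tag[:1], "unknown")


def count_parts_of_speech(tags) -> Counter:
    """Tally parts of speech over a sequence of tags."""
    return Counter(part_of_speech(tag) for tag in tags)


# "OY" is a conjunction: Y marks conditionality only in the second position.
print(part_of_speech("OY"))  # conjunction
print(part_of_speech("Y"))   # conditional morpheme

sample_tags = ["Ss", "VKsc", "E", "Ss", "Z"]  # made-up sequence
print(count_parts_of_speech(sample_tags)["substantive"])  # 2
```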
Source: Slovenský národný korpus and Aranea.