A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
Norwegian Universal Dependencies tagset
This is a list of part-of-speech tags for Norwegian that covers both written variants, Bokmål and Nynorsk.
The tagset is used in Norwegian (Bokmål) corpora annotated by TreeTagger trained on the Norwegian Dependency Treebank.
An example of a tag query in the CQL concordance search box: [tag="(NOUN|PROPN).*"]
finds all nouns and proper nouns, e.g. sak, Oslo. (Note: make sure that you use straight double quotation marks.)
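The pattern in the query above can be tried outside the corpus interface. The Python sketch below assumes, as in Sketch Engine's CQL, that the regular expression has to match the whole attribute value, and imitates this with `re.fullmatch`. The sample tag strings are invented for illustration and are not meant to show the exact tag format stored in the corpus.

```python
import re

# Pattern copied from the CQL example: NOUN or PROPN, then anything.
PATTERN = re.compile(r"(NOUN|PROPN).*")


def tag_matches(tag: str) -> bool:
    """Return True if the entire tag string matches the pattern."""
    return PATTERN.fullmatch(tag) is not None


# Hypothetical tag strings, for illustration only.
samples = ["NOUN", "PROPN", "NOUN.Gen.Fem", "PRON", "VERB"]
matched = [tag for tag in samples if tag_matches(tag)]
print(matched)
```

The trailing `.*` lets the pattern accept tags that carry additional material after the part of speech, while tags such as PRON or VERB are rejected because they do not start with NOUN or PROPN.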
Universal Dependencies tags for Norwegian
POS tag | Description |
ADJ | adjective |
ADP | adposition |
ADV | adverb |
AUX | auxiliary |
CCONJ | coordinating conjunction |
DET | determiner |
INTJ | interjection |
NOUN | noun |
NUM | numeral |
PART | particle |
PRON | pronoun |
PROPN | proper noun |
PUNCT | punctuation |
SCONJ | subordinating conjunction |
SYM | symbol |
VERB | verb |
X | other |
UD Morphological features for Norwegian
For example, to find all feminine nouns, including proper names, in the genitive form, use the following CQL query: [tag="(NOUN|PROPN).*" & gender="Fem" & case="Gen"] or [tag="NOUN.*Gen.*Fem.*"] (the second form matches only tags that begin with NOUN, so it does not cover proper nouns).
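Queries that combine a part-of-speech pattern with feature attributes follow a regular shape, so they can be assembled programmatically. The sketch below is a hypothetical helper, `build_cql`, that is not part of any Sketch Engine API; it only concatenates strings. The attribute names `gender` and `case` are taken from the example query in this section, and it is an assumption that other features are exposed as attributes in the same way.

```python
def build_cql(pos_tags, features=None):
    """Build a one-token CQL query string.

    pos_tags: list of UD POS tags, e.g. ["NOUN", "PROPN"].
    features: optional dict of attribute -> value, e.g. {"gender": "Fem"}.
    """
    alternation = "|".join(pos_tags)
    # The trailing .* allows extra material after the POS tag.
    conditions = [f'tag="({alternation}).*"']
    for attribute, value in (features or {}).items():
        conditions.append(f'{attribute}="{value}"')
    return "[" + " & ".join(conditions) + "]"


query = build_cql(["NOUN", "PROPN"], {"gender": "Fem", "case": "Gen"})
print(query)
```

The helper emits straight double quotation marks, which is what the CQL search box expects.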
Morphological feature | Description | Part of speech | Value | Example |
Abbr | Abbreviation refers to shortened forms of words or phrases. It also includes acronyms. | adjective, adposition, adverb, noun, proper noun, verb | not applicable; Abbr | f.eks. (“for eksempel”); osv. (“og så videre”) |
Animacy | Animacy is a grammatical and semantic feature, existing in some languages, expressing how sentient or alive the referent of a noun is. In Norwegian, the value Hum (human) marks pronouns that refer to human beings. | pronoun | not applicable; Hum | hun |
Case | Case is a grammatical category referring to the syntactic or semantic function that the specific part of speech carries out within the sentence. In modern Norwegian, traditional cases like nominative, genitive, and accusative are largely simplified and rarely distinguished morphologically, except in some pronouns. Some words have combined values of the feature. | adjective, determiner, noun, proper noun, pronoun | not applicable; Nom; Gen; Acc; Gen,Nom | meg (PRON-Acc); partenes (NOUN-Gen); du (PRON-Nom); deres (DET-Gen) |
Definite | Definiteness is a semantic property that indicates whether the referent (referred entity) is identifiable or non-identifiable in a given context. This feature applies to several parts of speech and can have different values. Some words have combined values of the feature. | adjective, determiner, noun, numeral | not applicable; Def; Ind; Def,Ind | “den store bilen” (ADJ-Def); “en bil” (DET-Ind); “Jeg ser bilen” (NOUN-Def); “Jeg har noen epler” (NUM-Ind) |
Degree | Adjectives describe entities and allow them to be compared by means of degrees of comparison. In Norwegian, the feature applies only to adjectives and can take three values: positive (Pos), comparative (Cmp) and superlative (Sup). | adjective | not applicable; Pos; Cmp; Sup | pen (Pos); mer (Cmp); beste (Sup) |
Gender | Gender is typically a lexical property of nouns and an inflectional feature of other parts of speech (e.g., adjectives) that mark agreement with nouns. The feature applies to six parts of speech and has three values: masculine (Masc), feminine (Fem), and neuter (Neut). Some words have a combined value (Fem,Masc). | adjective, determiner, noun, numeral, pronoun, proper noun | not applicable; Fem; Masc; Fem,Masc; Neut | ingen (PRON-Fem,Masc); sånn (DET-Fem); atlanterhavet (NOUN-Neut); han (PRON-Masc) |
Mood | Mood is the verbal feature expressing the attitude of speakers towards what is being conveyed (e.g., assessment, desire, command). Finite verbs in Norwegian take one of two values: indicative (Ind) or imperative (Imp). | auxiliary verb, verb | not applicable; Imp; Ind | kjøp (VERB-Imp); har (AUX-Ind) |
Number | Number is a grammatical category indicating a quantity. The feature occurs with six parts of speech and can take two values: plural (Plur) and singular (Sing). In some cases, words can exhibit a combination of these values, indicating forms that can be interpreted as either singular or plural depending on the context. | adjective, determiner, noun, numeral, pronoun, verb | not applicable; Plur; Sing; Plur,Sing | gutt (NOUN-Sing); hvilke (DET-Plur) |
NumType | Numerals can take different forms according to the language system involved. In Norwegian, the feature occurs with one part of speech (i.e., numerals) and can take two different values: cardinal (Card) and ordinal (Ord). | numeral | not applicable; Card; Ord | tre (Card); tredje (Ord) |
Person | Person refers to the grammatical distinction between participants in the event described by the verb: the speaker (first person), the addressee (second person), and others (third person). The feature occurs with pronouns and has three different values in singular and plural. | pronoun | not applicable; 1; 2; 3 | Singular personal pronouns: jeg (1, I); du (2, you); han (3, he); hun (3, she); den (3, it, masculine or feminine); det (3, it, neuter). Plural personal pronouns: vi (1, we); dere (2, you, plural or formal); de (3, they) |
Polarity | Polarity refers to whether words occur in positive or negative utterances. The feature applies to adverbs, determiners, and pronouns and can take the value of negative. In cases of double negation, where two negative elements are used together, the value would still be marked as negative. | adverb, determiner, pronoun | not applicable; Neg | ingen; ikke; aldri; intet |
Poss | This feature indicates whether an item expresses ownership or possession. It applies to determiners and pronouns in Norwegian that denote relationships of ownership or belonging. The feature takes one value: Poss. | determiner, pronoun | not applicable; Poss | min (DET); hans (PRON) |
PronType | Pronominal type applies to pronouns and determiners. It indicates various types of pronouns and pronominal determiners, capturing distinctions such as articles, demonstratives, indefinites, interrogatives, negatives, personals/possessives, relatives, and totals (collective). Additionally, it can denote a combination of two different types. | determiner, pronoun | not applicable; Art; Dem; Ind; Int; Neg; Prs; Rel; Tot | sånn (DET-Dem,Ind); ingen (PRON-Neg); min (PRON-Prs); hvilken (DET-Int) |
PunctType | Punctuation marks are non-alphabetical characters and character groups used in many languages to delimit linguistic units in printed text. | punctuation | not applicable; Colon; Comma; Hyphen; Paren; Quot; Sent | * … : , ; – – ( ) [ ] ‘ . ? ! |
Reflex | The category indicates whether the item is reflexive or not. It applies to pronouns and takes one value: Reflex. | pronoun | not applicable; Reflex | seg |
Tense | Tense is a grammatical category typical of verbs. It indicates whether the action occurs in the past, present, or future. In Norwegian, it applies to indicative verbs, which are marked for present tense (Pres) and past tense (Past). | auxiliary verb, verb | not applicable; Pres; Past | å smile (infinitive) > jeg smiler (present) > jeg smilte (past) |
VerbForm | This category indicates forms that have features of both verbs and other parts of speech. It applies to auxiliary verbs, adjectives, and verbs, and includes the values infinitive (Inf), participle (Part), and finite (Fin). Non-finite verbs are marked VerbForm=Part or VerbForm=Inf, and finite verbs are marked VerbForm=Fin. | adjective, auxiliary verb, verb | not applicable; Fin; Inf; Part | skrevne (ADJ-Part); vær (AUX-Fin); å smile (VERB-Inf) |
Voice | Voice is typically a feature of verbs. Passive verb forms are marked with Voice=Pass. | verb | not applicable; Pass | spises |
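The table writes features in the Universal Dependencies Name=Value notation (e.g. VerbForm=Part, Voice=Pass) and lists combined values separated by a comma (e.g. Fem,Masc). In CoNLL-U files, a token's features are joined with a vertical bar and an underscore stands for "no features". The Python sketch below parses such a string into a dictionary. It assumes the input follows that CoNLL-U convention, which may differ from how a given corpus stores the features, and the sample string is made up for illustration.

```python
def parse_feats(feats: str) -> dict:
    """Parse a UD-style feature string into {feature: [values]}."""
    result = {}
    # "_" is the CoNLL-U placeholder for an empty feature set.
    if feats in ("", "_"):
        return result
    for pair in feats.split("|"):
        name, _, values = pair.partition("=")
        # Combined values such as "Fem,Masc" become a list.
        result[name] = values.split(",")
    return result


# Hypothetical feature string, for illustration only.
parsed = parse_feats("Gender=Fem,Masc|PronType=Neg")
print(parsed)
```

Keeping every value as a list means that single and combined values can be handled by the same code, for example when checking whether "Fem" is among the genders of a token.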