A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The Amharic part-of-speech tagset is used in Amharic corpora annotated with the TreeTagger tool, which was trained on a manual annotation of 1,065 Amharic news items containing 210,000 prosodic words. The manual annotation was developed by the Ethiopian Languages Research Center of Addis Ababa University.
Available Amharic corpora in Sketch Engine
An example of a tag used in the CQL concordance search box: [tag="PRON.*"]
finds all pronouns, including those tagged PRONP, PRONC and PRONPC, e.g. ይህ, ምን (note: please make sure that you use straight double quotation marks)
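To see which tag codes a pattern such as PRON.* selects, the matching can be sketched in Python. This is only an illustration, assuming that CQL compares the regular expression with the whole tag value (imitated here with `re.fullmatch`); the `TAGS` list is copied from the tagset table on this page, and the helper name `matching_tags` is hypothetical.

```python
import re

# Tag codes copied from the tagset table on this page.
TAGS = [
    "VN", "NP", "NC", "NPC", "N",
    "PRONP", "PRONC", "PRONPC", "PRON",
    "AUX", "VREL", "VP", "VC", "VPC", "V",
    "ADJP", "ADJC", "ADJPC", "ADJ",
    "PREP", "CONJ", "ADV",
    "NUMCR", "NUMOR", "NUMP", "NUMC", "NUMPC",
    "INT", "PUNC", "UNC",
]


def matching_tags(pattern, tags=TAGS):
    """Return the tag codes that the pattern matches in full.

    Assumption: the pattern must cover the whole tag value,
    so "PRON" alone does not select "PRONP".
    """
    regex = re.compile(pattern)
    return [t for t in tags if regex.fullmatch(t)]


print(matching_tags("PRON.*"))  # ['PRONP', 'PRONC', 'PRONPC', 'PRON']
print(matching_tags("PRON"))    # ['PRON']
```

Under that assumption, the trailing `.*` is what extends the query from plain pronouns to pronouns with an attached preposition or conjunction.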
Tagset
Basic class | Definition of the tag | Code of the tag |
Noun | Verbal/infinitival noun, formed from any verb form such as active, passive, and repetitive, by attaching the prefix m(ä)- | VN |
Noun | Any noun, including a verbal noun, attached with a preposition | NP |
Noun | Any noun, including a verbal noun, attached with a conjunction | NC |
Noun | Any noun, including a verbal noun, with a proclitic preposition and an enclitic conjunction | NPC |
Noun | Any other noun, simple or derived | N |
Pronoun | Pronoun attached with a preposition | PRONP |
Pronoun | Pronoun attached with a conjunction | PRONC |
Pronoun | Pronoun with a proclitic preposition and an enclitic conjunction | PRONPC |
Pronoun | Any other pronoun | PRON |
Verb | Auxiliary verb | AUX |
Verb | Relative verb | VREL |
Verb | Any verb, including relative verbs and auxiliaries, attached with a preposition | VP |
Verb | Any verb, including relative verbs and auxiliaries, attached with a conjunction | VC |
Verb | Any verb, including relative verbs and auxiliaries, with a proclitic preposition and an enclitic conjunction | VPC |
Verb | Verb (all other) | V |
Adjective | Adjective attached with a preposition | ADJP |
Adjective | Adjective attached with a conjunction | ADJC |
Adjective | Adjective with a proclitic preposition and an enclitic conjunction | ADJPC |
Adjective | Any other adjective | ADJ |
Preposition | Preposition | PREP |
Conjunction | Conjunction | CONJ |
Adverb | Adverb | ADV |
Numeral | Cardinal | NUMCR |
Numeral | Ordinal | NUMOR |
Numeral | Numeral (cardinal or ordinal) attached with a preposition | NUMP |
Numeral | Numeral (cardinal or ordinal) attached with a conjunction | NUMC |
Numeral | Numeral (cardinal or ordinal) with a proclitic preposition and an enclitic conjunction | NUMPC |
Interjection | Interjection | INT |
Punctuation | Punctuation | PUNC |
Unclassified | Unclassified | UNC |
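For nouns, pronouns, verbs, adjectives and numerals, the codes in the table follow a regular scheme: a final P marks an attached preposition, C an attached conjunction, and PC both. The following Python sketch decodes a tag code on that basis. The names `BASES`, `CLITIC_SUFFIXES`, `PLAIN` and `describe` are hypothetical and not part of Sketch Engine; codes that merely happen to end in P or C (for example PREP, PUNC or UNC) are listed explicitly so that they are not misread as carrying a clitic.

```python
# Base codes that combine with the clitic suffixes P, C and PC in the table.
BASES = {
    "N": "Noun",
    "PRON": "Pronoun",
    "V": "Verb",
    "ADJ": "Adjective",
    "NUM": "Numeral",
}

# Suffix -> (has proclitic preposition, has enclitic conjunction)
CLITIC_SUFFIXES = {"P": (True, False), "C": (False, True), "PC": (True, True)}

# Codes used without a clitic suffix, mapped to their basic class.
PLAIN = {
    "VN": "Noun", "N": "Noun", "PRON": "Pronoun",
    "AUX": "Verb", "VREL": "Verb", "V": "Verb", "ADJ": "Adjective",
    "PREP": "Preposition", "CONJ": "Conjunction", "ADV": "Adverb",
    "NUMCR": "Numeral", "NUMOR": "Numeral",
    "INT": "Interjection", "PUNC": "Punctuation", "UNC": "Unclassified",
}


def describe(tag):
    """Return (basic class, has_preposition, has_conjunction) for a tag code."""
    # Plain codes are checked first so that PREP, PUNC, UNC etc.
    # are never split into a base plus a clitic suffix.
    if tag in PLAIN:
        return (PLAIN[tag], False, False)
    for suffix, (prep, conj) in CLITIC_SUFFIXES.items():
        base = tag[: -len(suffix)]
        if tag.endswith(suffix) and base in BASES:
            return (BASES[base], prep, conj)
    raise ValueError(f"unknown tag: {tag}")


print(describe("NPC"))   # ('Noun', True, True)
print(describe("NUMC"))  # ('Numeral', False, True)
print(describe("PUNC"))  # ('Punctuation', False, False)
```

Following the same scheme, a CQL query such as [tag="(N|PRON|V|ADJ|NUM)PC?"] should select every token tagged as carrying an attached preposition, assuming the regular expression is matched against the whole tag value.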
Reference
Demeke, G. A., & Getachew, M. (2006). Manual annotation of Amharic news items with part-of-speech tags and its challenges. Ethiopian Languages Research Center Working Papers, 2, 1-16.