A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense, etc.) of each token in a text corpus.
CAMeL Arabic part-of-speech tagset
The CAMeL Arabic part-of-speech tagset is used in Arabic corpora annotated with CAMeL Tools, a set of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
The table below lists the tags of the CAMeL Arabic part-of-speech tagset.
An example of a tag used in the CQL concordance search box: [tag="adj"]
finds all adjectives, e.g. متحد, كامل (note: make sure that you use straight double quotation marks, not curly ones)
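Queries like the one above can also be assembled programmatically. The following is a minimal Python sketch, not part of Sketch Engine or CAMeL Tools; the helper names `cql_for_tag` and `cql_sequence` are hypothetical. It builds a token expression for a single tag and joins several expressions to match consecutive tokens.

```python
def cql_for_tag(tag: str) -> str:
    """Return a CQL token expression matching tokens with the given PoS tag."""
    # The value is wrapped in straight double quotes, so it must not contain one.
    if '"' in tag:
        raise ValueError("tag must not contain a double quotation mark")
    return f'[tag="{tag}"]'


def cql_sequence(*tags: str) -> str:
    """Join token expressions so the query matches the tags on consecutive tokens."""
    return " ".join(cql_for_tag(tag) for tag in tags)


print(cql_for_tag("adj"))           # [tag="adj"]
print(cql_sequence("noun", "adj"))  # [tag="noun"] [tag="adj"]
```

The second query matches a noun immediately followed by an adjective.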
PoS tag | Description |
abbrev | abbreviation |
adj | adjective |
adv | adverb |
adv_interrog | interrogative adverb |
adv_rel | relative adverb |
conj | conjunction |
conj_sub | subordinating conjunction |
digit | number written in digits |
foreign | foreign |
interj | interjection |
noun | noun |
noun_prop | proper noun |
noun_quant | quantity noun |
part | particle |
part_det | determiner particle |
part_focus | focus particle |
part_fut | future marker particle |
part_interrog | interrogative particle |
part_neg | negative particle |
part_verb | verbal particle |
part_voc | vocative particle |
prep | preposition |
pron | pronoun |
pron_dem | demonstrative pronoun |
pron_interrog | interrogative pronoun |
pron_rel | relative pronoun |
punc | punctuation |
verb | verb |
verb_pseudo | pseudo verb |
xxx | other |
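Many tags in the table share a base tag followed by an underscore and a subtype (for example pron, pron_dem, pron_rel). The Python sketch below groups the tags by that base and builds one CQL query per family. It assumes that CQL attribute values are regular expressions that must match the whole tag; the helper names `group_by_family`, `family_pattern`, and `cql_for_family` are hypothetical, and `re.fullmatch` is used only to emulate the whole-value matching locally.

```python
import re
from collections import defaultdict

# Tags copied from the table above.
CAMEL_TAGS = [
    "abbrev", "adj", "adv", "adv_interrog", "adv_rel", "conj", "conj_sub",
    "digit", "foreign", "interj", "noun", "noun_prop", "noun_quant",
    "part", "part_det", "part_focus", "part_fut", "part_interrog",
    "part_neg", "part_verb", "part_voc", "prep", "pron", "pron_dem",
    "pron_interrog", "pron_rel", "punc", "verb", "verb_pseudo", "xxx",
]


def group_by_family(tags):
    """Group tags by the part before the first underscore."""
    families = defaultdict(list)
    for tag in tags:
        families[tag.split("_", 1)[0]].append(tag)
    return dict(families)


def family_pattern(family):
    """Regular expression matching a base tag and all of its subtypes."""
    return f"{family}(_.*)?"


def cql_for_family(family):
    """CQL token expression for a whole tag family (assumes regex values)."""
    return f'[tag="{family_pattern(family)}"]'


families = group_by_family(CAMEL_TAGS)
print(families["pron"])        # ['pron', 'pron_dem', 'pron_interrog', 'pron_rel']
print(cql_for_family("pron"))  # [tag="pron(_.*)?"]

# Local check: the "adv" pattern matches adverb tags but not "adj".
print([t for t in CAMEL_TAGS if re.fullmatch(family_pattern("adv"), t)])
# ['adv', 'adv_interrog', 'adv_rel']
```

Requiring the underscore before the subtype keeps a family pattern from matching unrelated tags that merely start with the same letters.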
Source: https://camel-tools.readthedocs.io/en/stable/reference/camel_morphology_features.html
or