A tagset is a list of part-of-speech tags (POS tags for short), i.e. labels used to indicate the part of speech and sometimes also other grammatical categories (case, tense etc.) of each token in a text corpus.
The Chinese NEUCSP part-of-speech tagset is used in Chinese corpora annotated by the NEUCSP tagging tool, developed by the Natural Language Processing Group at Northeastern University, China.
An example of a tag in the CQL concordance search box: [tag="d"] searches for adverbs (副词).
Tag | Description | Chinese name |
n | common noun | 普通名词 |
nt | temporal noun | 时间名词 |
nd | noun of locality (e.g. 下) | 方位名词 |
nl | location noun (e.g. 内地) | 处所名词 |
nh | personal name | 人名 |
ns | place name | 地名 |
ni | organization name | 团体、机构、组织的专名 |
nz | other proper noun | 其它专名 |
v | verb | 动词 |
a | adjective | 形容词 |
b | distinguishing word (e.g. 主要) | 区别词 |
d | adverb | 副词 |
m | numeral | 数词 |
q | measure word | 量词 |
r | pronoun | 代词 |
p | preposition | 介词 |
c | conjunction | 连词 |
e | interjection | 叹词 |
o | onomatopoeic word | 拟声词 |
u | particle (e.g. 的, 了) | 助词 |
h | prefix | 前接成分 |
k | suffix | 后接成分 |
i | idiomatic expression | 习用语 |
j | abbreviation | 简称 |
g | morpheme character | 语素字 |
x | non-morpheme character | 非语素字 |
wp | punctuation | 标点 |
ws | string of symbols | 字符串 |
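The tags in the table can be turned into CQL queries programmatically. The following is a minimal Python sketch, not part of the NEUCSP tool or of any official API: the names `NEUCSP_TAGS`, `cql_for_tag` and `cql_for_sequence` are hypothetical, the dictionary holds only a subset of the table, and the code assumes the corpus exposes the tag under the attribute `tag`, as in the example above.

```python
# Subset of the NEUCSP tagset from the table above (tag -> description).
# Hypothetical helper names; extend the dictionary with the remaining rows as needed.
NEUCSP_TAGS = {
    "n": "common noun",
    "nh": "personal name",
    "ns": "place name",
    "v": "verb",
    "a": "adjective",
    "d": "adverb",
    "u": "particle",
    "wp": "punctuation",
}


def cql_for_tag(tag: str) -> str:
    """Return a CQL token expression matching a single tag, e.g. [tag="d"]."""
    if tag not in NEUCSP_TAGS:
        raise ValueError(f"unknown NEUCSP tag: {tag!r}")
    # CQL requires straight double quotes around the attribute value.
    return f'[tag="{tag}"]'


def cql_for_sequence(tags) -> str:
    """Return a CQL query matching consecutive tokens with the given tags."""
    return " ".join(cql_for_tag(t) for t in tags)


if __name__ == "__main__":
    print(cql_for_tag("d"))                # an adverb
    print(cql_for_sequence(["d", "v"]))    # an adverb immediately followed by a verb
```

Rejecting unknown tags up front avoids submitting a query that silently returns no hits because of a mistyped tag.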
Source: http://www.niutrans.com/niutrans/NiuTrans.html
or