Standard of POS tag of contemporary Chinese for CIP

Basic Information

Standard ID: GB/T 20532-2006

Standard Name: Standard of POS tag of contemporary Chinese for CIP

Chinese Name: 信息处理用现代汉语词类标记规范

Standard category: National Standard (GB)

Status: in force

Date of Release: 2006-09-18

Date of Implementation:2007-03-01

standard classification number

Standard ICS number:General, Terminology, Standardization, Documentation>>Vocabulary>>01.040.01 General, Terminology, Standardization, Documentation (Vocabulary)

Standard Classification Number:General>>Basic Standards>>A22 Terms and Symbols

Publication information

publishing house:China Standards Press

Plan number:20030301-T-360

Publication date:2007-03-01

other information

Release date:2006-09-18

drafter:Jin Guangjin, Xiao Hang, Guo Shulun, Fu Li, Zhang Yunfan, Yu Guiying, Chen Yuquan, Wang Li

Drafting unit: Institute of Applied Linguistics, Ministry of Education

Focal point unit:Ministry of Education (Language)

Proposing unit:Ministry of Education

Publishing department: General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China; Standardization Administration of China

competent authority:Ministry of Education (Languages)

Introduction to standards:

This standard specifies the marking codes for modern Chinese word classes and other segmentation units in information processing. This standard is applicable to Chinese information processing and can also be used as a reference for modern Chinese teaching and research.


Some standard content:

ICS 01.040.01
National Standard of the People's Republic of China
GB/T 20532-2006
Standard of POS tag of contemporary Chinese for CIP
Promulgated 2006-09-18
Implemented 2007-03-01
General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China; Standardization Administration of China
Foreword
2 Terms and Definitions
4 Classification of Parts of Speech and Other Segmentation Units
List of Codes for Parts of Speech and Other Segmentation Units
Foreword
This standard was proposed by, and is under the jurisdiction of, the Language and Information Technology Department of the Ministry of Education.
The drafting unit of this standard is the Institute of Applied Linguistics of the Ministry of Education.
The main drafters of this standard are: Jin Guangjin, Xiao Hang, Guo Shulun, Fu Li, Zhang Yunfan, Yu Guiying, Chen Yuquan, Wang Li.
Standard of POS tag of contemporary Chinese for CIP
1 Scope
This standard specifies the marking codes for modern Chinese parts of speech and other segmentation units in information processing. It is applicable to Chinese information processing and can also serve as a reference for modern Chinese teaching and research.
2 Terms and definitions
The following terms and definitions apply to this standard.
2.1 Chinese information processing
The use of computers to input, sort, store, output, count, retrieve, and otherwise process Chinese words, phrases, meanings, etc.
2.2 Segmentation unit
The basic unit used in Chinese information processing that has a certain grammatical function. It includes words, phrases and other units defined by the rules of this standard.
2.3 Parts of speech; POS
The grammatical classification of words, divided mainly according to grammatical function.
2.4 Marking; tag
Codes that mark the categories of segmentation units in text.
3 General Principles
3.1 Scope of segmentation units
The segmentation units of this standard include words, phrases and other segmentation units, such as idioms, abbreviations, prefixes, suffixes, morphemes, non-morpheme characters, punctuation marks, and other symbols.
3.2 Principles of word classification
The word classification system of this standard draws on the grammatical systems of Lü Shuxiang, Zhu Dexi and Hu Yushu, and on the Outline of the Grammar System for Middle School Teaching.
Based on the characteristics and requirements of Chinese information processing, this standard classifies words mainly by grammatical function.
3.3 Principles for the formulation of marking codes
Following common international practice, marking codes mainly use letters from English terms. For example, for "noun", the first letter "n" of the English term "noun" is used as the marking code; for "numeral", the third letter "m" of the English term "numeral" is used. For classes that are unique to Chinese, or where English term letters are inconvenient, the marking code uses Chinese pinyin letters according to common domestic practice. For example, for "abbreviation", the first pinyin letter "j" of the Chinese character "简" is used as the marking code; for "morpheme", the first pinyin letter "g" of the Chinese character "根" is used.
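The naming rule in 3.3 can be illustrated with a short Python sketch. It is not part of the standard: the function name `derive_code`, the 1-based `letter_position` parameter, and the romanized pinyin strings `jian` and `gen` are assumptions made for illustration, and the sketch covers only the four examples quoted in 3.3.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def derive_code(term: str, letter_position: int = 1) -> str:
    """Return the letter at a 1-based position of a term, lowercased."""
    return term[letter_position - 1].lower()

# The four examples quoted in 3.3, keyed by the source term.
examples = {
    "noun": derive_code("noun"),            # first letter of the English term
    "numeral": derive_code("numeral", 3),   # third letter of the English term
    "简 (jian)": derive_code("jian"),        # first pinyin letter of 简
    "根 (gen)": derive_code("gen"),          # first pinyin letter of 根
}

for term, code in examples.items():
    print(f"{term} -> {code}")
```

Each call yields a single lowercase letter, which matches the one-letter first-level codes quoted in 3.3.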
4 Classification of parts of speech and other segmentation units
This standard divides parts of speech into 13 first-level classes and 16 second-level classes; other segmentation units are divided into 7 first-level classes and 13 second-level classes. Users may add further classes according to their needs.
4.1 Classification of parts of speech and marking codes
4.1.1 Noun (n), indicating the name of a person or thing; mainly acts as the subject and object in a sentence.
4.1.1.1 General noun (ng), indicating the name of a thing. For example: people, horses, books, teachers, airplanes, electric balls, books, A-he, chestnuts, wood, moral theory, history, thought, culture, cause, style, philosophy
4.1.1.2 Time noun (nt), including the so-called time quantifiers. For example: year, month, day, minute, second, now, past, yesterday, last year, future, Song Dynasty, present, period
4.1.1.3 Position noun (nd), indicating relative direction or position. For example: up, down, left, right, front, back, Asia, outside, middle, east, west, south, north, front, left, inside, room, outside
4.1.1.4 Place noun (nl), indicating a place. For example: in the air, away from the place, next door, near the border, beside the wilderness
4.1.1.5 Personal name (nh), a proper noun indicating the name of a person. For example: Hua Luogeng, Afan, Yao, Sima Xiangru, Songzan, Karl, Mark
4.1.1.6 Place name (ns), a proper noun indicating the name of a geographical area. For example: Asia, Atlantic, Mediterranean, Alsace, Gahen, China, Beijing, Zhejiang, Jingdezhen, Hohhot, Ningguan Village
4.1.1.7 Ethnic name (nn), a proper noun indicating the name of an ethnic group or tribe. For example: Hui, Dai, Du, Angu, Uyghur, Kazakh
4.1.1.8 Institution name (ni), a proper noun indicating the name of a group, organization or institution. For example: United Nations, Ministry of Education, Peking University, Chinese Academy of Sciences
4.1.1.9 Other proper nouns (nz). For example: Wuliangye Gongmei Ji Ding Santana
4.1.2 Verb (v), expressing actions, behaviors, psychological activities, physiological states, and the existence and change of things; mainly serves as the predicate in a sentence.
4.1.2.1 Transitive verb (vt), can take objects. For example: eat, beat, wash, feed, store, send, buy, pick up, fill, really happy, tell, accept, steam, consider, investigate, forget, start
4.1.2.2 Intransitive verb (vi), cannot take objects. For example: sick, follow, rest, cough, stupid, swim, sleep
4.1.2.3 Linking verb (vl), expresses a judgment about a relationship. For example: is
4.1.2.4 Volitional verb (vu), expresses possibility and intention. For example: can, should, can, may, pressure, willing, want
4.1.2.5 Directional verb (vd), expresses direction. For example: (walk) up, (just) down, (come) in, (surface) go, (fall) down, (lift) up, (throw) over, (run) up kiss
4.1.3 Adjective (a), indicating properties and states; mainly acts as predicate, attributive, adverbial and complement in a sentence.
4.1.3.1 Property adjective (aq), indicates properties. For example: high, beautiful, big, brave, dangerous, bright, clean, great
4.1.3.2 State adjective (as), indicates state. For example: thunder, white, acid, black, red, cold, green, shiny, grand, white, flowery, cold
4.1.4 Distinguishing word (f), indicating the distinguishing characteristics of things; in a sentence it can only be used as an attributive modifying a noun, or in combination with a particle. For example: 男男公女雎 男型国产军用
4.1.5 Numeral (m), indicates number and order. For example: 季一华百于 一百八 第第十十八
4.1.6 Quantifier (q), indicates the unit of people, things or actions. For example: 个条片皮瓣尺斤两顿支回次遮千威时
4.1.7 Pronoun (r), serves to substitute or refer. For example: I, you, he, this, that, we, you, what, where, everyone, 恋么戀么样
4.1.8 Adverb (d), modifies or limits verbs and adjectives, indicating scope, degree, etc.; acts as an adverbial in a sentence. For example: all, only, total, being, only, very, repeatedly, will not, but, often, again, once, even, actually
4.1.9 Preposition (p), introduces a nominal component and does not act as a sentence component alone. For example: put, be, from, according to, still, to, rely on, since, about
4.1.10 Conjunction (c), connects words, phrases or sentences, indicating a certain relationship between them. For example: and, with, and, and, and, and, or, or, because, so
4.1.11 Particle (u), attached to a word, phrase or sentence to indicate a certain additional meaning. For example: got shame and so on
4.1.12 Interjection (e), expresses exclamation, call or response; can form an independent sentence or act as an independent component in a sentence. For example: um, throat, hum, oh, ah, ah
4.1.13 Onomatopoeia (o), imitates the sound of something in nature, and cannot form a sentence alone. For example: bang, tick, plop, gurgle, ding, ding, dang, dang
4.2 Classification of other segmentation units and marking codes
4.2.1 Idiom (i), a fixed phrase that is commonly used.
4.2.1.1 Noun idiom (in). For example: mirage, frog at the bottom of the well, clues
4.2.1.2 Verb idiom (iv). For example: run a small role, speak in official language, eat the old capital, fear the times and make progress, work hard and govern
4.2.1.3 Adjective idiom (ia). For example: rich and colorful, hard and simple, open and aboveboard
4.2.1.4 Conjunction idiom (ic). For example: in short, it can be seen from this, in summary
4.2.2 Abbreviation (j), the abbreviated form of a proper noun or common expression.
4.2.2.1 Noun abbreviation (jn). For example: NPC, May Fourth, Olympic Games
4.2.2.2 Verb abbreviation (jv). For example: Ma Yan Li Fan Xiu
4.2.2.3 Adjective abbreviation (ja). For example: Short Flat Quick Li Jing Jian
4.2.3 Prefix (h), a word-forming component attached before the root. For example: A Lao Chu Di
4.2.4 Suffix (k), a word-forming component attached after the root. For example: Yu Er Tou Hua Men Shi Xing Zhe
4.2.5 Morpheme (g), Chinese characters in the Chinese character set that are generally not used alone.
4.2.5.1 Noun morpheme (gn). For example: Min Nong Cai
4.2.5.2 Verb morpheme (gv). For example: 姝研究谧
4.2.5.3 Adjective morpheme (ga). For example: 朱遂伟
4.2.6 Non-morpheme character (x), Chinese characters in the Chinese character set that have no meaning when used alone. For example: 拉昆蛛跨鸯靖
4.2.7 Others (w)
4.2.7.1 Punctuation (wp).
4.2.7.2 Non-Chinese character string (ws). For example: office, windows
4.2.7.3 Other unknown symbols (wu).
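As a cross-check on the class counts stated at the start of section 4 (13 and 16 for parts of speech, 7 and 13 for other segmentation units), the two-level scheme can be written down as nested mappings. This is a minimal sketch, not part of the standard: the variable and function names are hypothetical, and several codes (for example `ng`, `nd`, `nn`, `vt`, `f`, `r`, `c`, `i`, `j`, `jn`, `h`) are readings of entries that are garbled or missing in this copy of the text, inferred from the alphabetical ordering of Table 1.

```python
# Hypothetical layout: first-level code -> list of second-level codes.
# An empty list means the class has no second-level subclasses.
PARTS_OF_SPEECH = {
    "n": ["ng", "nt", "nd", "nl", "nh", "ns", "nn", "ni", "nz"],
    "v": ["vt", "vi", "vl", "vu", "vd"],
    "a": ["aq", "as"],
    "f": [], "m": [], "q": [], "r": [], "d": [],
    "p": [], "c": [], "u": [], "e": [], "o": [],
}

OTHER_UNITS = {
    "i": ["in", "iv", "ia", "ic"],
    "j": ["jn", "jv", "ja"],
    "h": [], "k": [],
    "g": ["gn", "gv", "ga"],
    "x": [],
    "w": ["wp", "ws", "wu"],
}


def count_classes(group):
    """Return (first-level class count, second-level class count)."""
    return len(group), sum(len(subclasses) for subclasses in group.values())


print(count_classes(PARTS_OF_SPEECH))  # (13, 16)
print(count_classes(OTHER_UNITS))      # (7, 13)
```

Under these assumed readings the counts agree with the figures given in section 4, which is some evidence that the reconstructed codes are complete.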
List of codes for parts of speech and other segmentation units
The marking codes for parts of speech and other segmentation units are listed in Table 1.
Table 1 List of marking codes for parts of speech and other segmentation units (arranged in alphabetical order of marking code; second-level classes are indented)
Code  Category name                 Code description
a     Adjective                     adjective
aq      Property adjective          adjective-quality
as      State adjective             adjective-state
c     Conjunction                   conjunction
d     Adverb                        adverb
e     Interjection                  exclamation
f     Distinguishing word           difference
g     Morpheme                      first pinyin letter of "根"
ga      Adjective morpheme          first pinyin letter of "根"-adjective
gn      Noun morpheme               first pinyin letter of "根"-noun
gv      Verb morpheme               first pinyin letter of "根"-verb
h     Prefix
i     Idiom                         idiom
ia      Adjective idiom             idiom-adjective
ic      Conjunction idiom           idiom-conjunction
in      Noun idiom                  idiom-noun
iv      Verb idiom                  idiom-verb
j     Abbreviation                  first pinyin letter of "简"
ja      Adjective abbreviation      first pinyin letter of "简"-adjective
jn      Noun abbreviation           first pinyin letter of "简"-noun
jv      Verb abbreviation           first pinyin letter of "简"-verb
k     Suffix                        according to common practice
m     Numeral                       numeral
n     Noun                          noun
nd      Position noun               noun-direction
ng      General noun                noun-general
nh      Personal name               noun-human
ni      Institution name            noun-institution
nl      Place noun                  noun-location
nn      Ethnic name                 noun-nation
ns      Place name                  noun-space
nt      Time noun                   noun-time
nz      Other proper noun           noun-first pinyin letter of "专"
o     Onomatopoeia                  onomatopoeia
p     Preposition                   preposition
q     Quantifier                    quantity
r     Pronoun                       pronoun
u     Particle                      auxiliary
v     Verb                          verb
vd      Directional verb            verb-direction
vi      Intransitive verb           verb-intransitive
vl      Linking verb                verb-linking
vt      Transitive verb             verb-transitive
vu      Volitional verb             verb-auxiliary
w     Others                        according to common practice
wp      Punctuation                 according to common practice
ws      Non-Chinese string          "w"-string
wu      Other unknown symbols       "w"-unknown
x     Non-morpheme character        according to common practice
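Because every second-level code in Table 1 begins with the letter of its first-level class, a tag can be mapped back to its first-level class by its first character. The sketch below shows this on a small annotated sample. It rests on assumptions that are not in the standard: the `word/tag` notation with space-separated tokens is a common corpus convention rather than something this excerpt specifies, the names `FIRST_LEVEL`, `first_level` and `parse_tagged` are hypothetical, and the codes `f`, `h` and `r` are readings of garbled entries.

```python
# Hypothetical lookup table: first-level code -> class name.
FIRST_LEVEL = {
    "a": "adjective", "c": "conjunction", "d": "adverb", "e": "interjection",
    "f": "distinguishing word", "g": "morpheme", "h": "prefix", "i": "idiom",
    "j": "abbreviation", "k": "suffix", "m": "numeral", "n": "noun",
    "o": "onomatopoeia", "p": "preposition", "q": "quantifier", "r": "pronoun",
    "u": "particle", "v": "verb", "w": "others", "x": "non-morpheme character",
}


def first_level(tag):
    """Return the first-level code of a tag; only the first letter is checked."""
    if not tag or tag[0] not in FIRST_LEVEL:
        raise ValueError(f"unknown marking code: {tag!r}")
    return tag[0]


def parse_tagged(text):
    """Split 'word/tag' tokens into (word, tag, first-level class name) tuples."""
    result = []
    for token in text.split():
        # rpartition splits on the LAST slash, so a word that is itself "/" still parses.
        word, _, tag = token.rpartition("/")
        result.append((word, tag, FIRST_LEVEL[first_level(tag)]))
    return result


for row in parse_tagged("他/r 是/vl 教师/ng 。/wp"):
    print(row)
```

Keeping the hierarchy implicit in the code prefix means a coarse-grained consumer can ignore second-level distinctions without a separate mapping table, although a stricter validator would also check the second letter against the full code list.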