GB/T 3860-1995 Rules for Indexing Document Descriptors
This standard specifies how to analyze the subject of a document and how to index the document using the various Chinese thesauri. It applies to manual indexing carried out to build manual search tools and machine-readable search tools for documents.
Excerpt from the standard:
National Standard of the People's Republic of China
GB/T 3860-1995 (replaces GB 3860-83)
Documentation guidelines for determining the subjects and selecting the descriptors

1 Main content and scope of application
1.1 This standard specifies how to analyze the subject of a document and how to index documents with descriptors drawn from the various Chinese thesauri.
1.2 This standard applies to manual indexing carried out to build manual search tools and machine-readable search tools for documents.

2 Terminology
The following terms apply to this standard.
2.1 Subject: the object or problem that a document specifically discusses and studies.
2.2 Descriptor: that is, a formal descriptor. A term that the thesaurus specifies for expressing a concept when documents are indexed and structured.
2.3 Non-descriptor: that is, an informal descriptor. A synonym or quasi-synonym of a descriptor that is included in the thesaurus but is not specified as a document identifier and serves only as a guide.
2.4 Thesaurus: a standardized dictionary composed of semantically related and family-related scientific terms selected from natural language. In document indexing and information retrieval it is a terminology control tool, used to convert the natural language of documents, indexers and users into a unified descriptor retrieval language.
2.5 Descriptor indexing: the process of assigning descriptors to a document as retrieval identifiers, based on the results of subject analysis.
2.6 Descriptor combination: a method, used in indexing and retrieval, of combining two or more descriptors according to certain logical rules to express a specific concept.
2.7 Connection symbol: also called a link for short. A grouping symbol used to show which descriptors belong together when several subjects of one document are described.
2.8 Function symbol: abbreviated as role symbol or role number.
A symbol used to indicate the functional role that a descriptor plays in describing the subject of a document.
2.9 Weighting: a method of assigning "weights" that indicate degree of importance to the descriptors that describe the subject of a document.

3 Subject analysis
3.1 Document review
Reviewing the document is the first step in subject analysis. Its purpose is to understand and identify the specific objects or problems that the document discusses and studies, so as to determine its subject. The review should generally be based on the title (book title or article title), preface, conclusion, table of contents, tables and figures, and any attached abstract, references, and so on. If necessary, browse the full text. Avoid analyzing the subject from the title alone.
3.2 Subject types
A document may contain a single subject or multiple subjects. Subjects can be divided into unit subjects and compound subjects according to the subject content they contain; into professional subjects and related subjects according to the professional attribute they reflect; into central subjects and peripheral subjects according to their importance in the document; into overall subjects and local subjects according to the scope of their content; and into explicit subjects and implicit subjects according to how clearly they are stated in the document. Understanding and mastering these subject types helps the indexer analyze document subjects correctly and appropriately.

Approved by the State Administration of Technical Supervision on August 22, 1995; implemented on April 1, 1996.

3.3 Subject structure
Any subject is composed of certain subject factors.
The subject factors that make up a subject can be summarized as: main factors (research objects, materials, methods, processes, conditions, etc.), general factors, spatial factors, time factors, and document type factors. When analyzing the structure of a document's subject, distinguish the central subject factors from the modifying and limiting subject factors, so that factors can be kept or discarded as needed.
3.4 Requirements for subject analysis
a. Analyze the actual subject of the document objectively, without imposing the indexer's subjective intent;
b. The comprehensiveness and specificity of the analyzed subject should be basically consistent with the comprehensiveness and specificity of the document's actual subject;
c. Fully consider the goals of the retrieval system and the needs of users, and select the subjects in the document that have retrieval significance.

4 Selecting indexing terms
After analyzing the subject of the document, the indexer should convert the subject concepts identified into corresponding terms according to the following rules.
4.1 The terms used to index a document must be descriptors in the vocabulary (that is, formal descriptors), and their written form must match the written form in the vocabulary. Informal descriptors must not be used for indexing; they serve only as guides (see 2.3).
4.2 When choosing terms to index a document, first consider specific descriptors that correspond directly to the subject concepts of the document.
4.3 When the vocabulary has no specific descriptor that corresponds directly to a concept in the document, two or more descriptors should be selected and combined for group indexing (descriptor combination, see 2.6).
4.3.1 Grouping should be conceptual grouping. Conceptual grouping includes the following two types:
a. Cross-grouping: the grouping of two or more descriptors whose concepts intersect; the result expresses a specific concept.
When formulating pre-grouped search headings, the recommended cross-grouping symbol is "".
b. Aspect grouping: the grouping of a descriptor that represents a thing with another descriptor that represents an attribute or aspect of that thing; the result expresses a specific concept. When formulating pre-grouped search headings, the recommended aspect grouping symbol is "-".
4.3.2 When grouping, consider cross-grouping first and aspect grouping second.
4.3.3 The descriptors used in grouping must be the terms most relevant and closest to the subject concept of the document, to avoid grouping across hierarchy levels.
4.3.4 The result of grouping must be conceptually clear and precise, and may express only one concept.
4.3.5 When the vocabulary specifies a certain superordinate term for a concept, the specified superordinate term should be used, and other descriptors should not be selected for group indexing.
4.4 If grouping is unsuitable or impossible, the most direct superordinate term or a related descriptor can be selected for indexing.
4.5 Free words are natural language words outside the thesaurus that have not been standardized. The use of free words for indexing should be strictly controlled.
4.5.1 Free words can be used for indexing in the following situations:
a. Concepts that could be indexed with superordinate or related terms, but whose indexing frequency is high;
b. Concepts that the vocabulary has obviously missed;
c. Newly emerging concepts such as new disciplines, new theories, new technologies and new materials;
d. Names of regions, people, documents, products and so on, and names of important data, that are not included in the vocabulary;
e. Concepts for which grouping produces an ambiguous result.
4.5.2 Free words should, as far as possible, be selected from other vocabularies and from authoritative reference works. The selected free words must be concise in form, clear in concept, and highly practical.
4.5.3 Free words used for indexing should be recorded and reported to the document management department.
4.6 The number of descriptors used to index a document depends on the actual number of subjects in the document, the subject types, the goals of the retrieval system, the degree of pre-grouping of the terms in the vocabulary used, and so on. It is therefore generally stipulated that a manual search system uses an average of 2 to 5 descriptors per document, and a computer search system an average of 4 to 10.
4.7 When a document with multiple subjects is indexed, a manual search system should index each subject as a separate group; a computer search system can add connection symbols to distinguish the subjects. The details are determined by each system.
4.8 If a search system uses function symbols and weighting when indexing documents, the details are determined by that system.
4.9 When manual search tools are compiled, descriptors should appear in this order: descriptors expressing main factors, descriptors expressing general factors, descriptors expressing spatial factors, descriptors expressing time factors, descriptors expressing document type factors.
4.10 If compound descriptor retrieval headings are to be generated automatically from machine-readable records, the symbols should be marked after the relevant descriptors at indexing time.
The recommended symbols are "\M\" (M: Main, indicating the main title), "\Q\" (Q: Qualifier, indicating the qualifier), and "\S\" (S: Subqualifier, indicating titles at the second level and above). If necessary, use "\SA\" (for the third-level title), "\SB\" (for the fourth-level title), and "\SC\" (for the fifth-level title). The composite title format is M-Q-SA-SB-SC.

5 Quality management
5.1 The quality of indexing work depends mainly on the following three factors:
1. The organization and management of the indexing work;
2. The professional quality of the indexing personnel;
3. The quality of the thesaurus itself.
To ensure the quality of indexing work, these factors must be fully considered.
5.2 Auditing should be an indispensable step in indexing work. Full-time personnel should be selected for the task. Where manpower and conditions are limited, indexers may audit one another's work.
5.3 Indexers should, as far as possible, work under a professional division of labor, so that the type of documents each of them handles stays relatively stable.
5.4 Improving the professional quality of indexers is the precondition for ensuring the quality of indexing. Indexers are required to:
a. be familiar with the vocabulary used and with the indexing rules and methods;
b. have professional knowledge of the subject of the documents indexed;
c. have the level of language (Chinese or foreign) that the work requires;
d. keep in contact with users as much as possible and verify the quality of indexing work by analyzing search results.
5.5 Strengthen the day-to-day management of the thesaurus, collect user organizations' opinions on adding, deleting and modifying thesaurus terms at any time, and update the thesaurus regularly.

Appendix A Workflow chart for document indexing using descriptors (reference)
[Flow chart not recoverable from the source. Legible steps: read the document and determine its subject; check whether the thesaurus contains a corresponding term; check whether terms can be combined; select the most accurate terms; check whether a superordinate term can be used for indexing.]

Appendix B Record card for addition, deletion and modification of descriptors (reference)
[Card layout not recoverable from the source. Legible fields: Chinese pronunciation, Chinese name, English translation, number, reference relationship, date of issue, proposing unit, decision, contact person.]

Additional notes:
This standard was proposed by the National Technical Committee for Standardization of Documentation Work.
This standard was drafted by the Fifth Subcommittee of the National Technical Committee for Standardization of Documentation Work.
The main drafters of this standard are Xia Rongshou, Qian Qilin, Sun Boqing, Liu Xiangsheng, and Wang Dongbo.
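
The order of preference in clauses 4.1 to 4.5, and the descriptor order in clause 4.9, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Thesaurus` class, its fields, the function names, the factor labels, and every sample term are hypothetical and do not come from the standard. In real indexing, choosing a combination or a superordinate term is the indexer's judgment, not a table lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Thesaurus:
    """Toy in-memory thesaurus (hypothetical structure, for illustration only)."""
    descriptors: set[str]
    # non-descriptor -> the descriptor it guides the indexer to (2.3)
    use_refs: dict[str, str] = field(default_factory=dict)
    # concept -> descriptors that may be combined to express it (4.3)
    combinations: dict[str, tuple[str, ...]] = field(default_factory=dict)
    # concept -> most direct superordinate descriptor (4.4)
    broader: dict[str, str] = field(default_factory=dict)


def select_terms(concept: str, th: Thesaurus) -> tuple[str, list[str]]:
    """Return (rule applied, indexing terms) for one subject concept."""
    # 4.1: a non-descriptor is never assigned; follow its guide reference.
    concept = th.use_refs.get(concept, concept)
    # 4.2: prefer a specific descriptor that matches the concept directly.
    if concept in th.descriptors:
        return ("descriptor", [concept])
    # 4.3: otherwise combine two or more descriptors.
    parts = th.combinations.get(concept)
    if parts and len(parts) >= 2 and all(p in th.descriptors for p in parts):
        return ("combination", list(parts))
    # 4.4: otherwise use the most direct superordinate descriptor.
    parent = th.broader.get(concept)
    if parent in th.descriptors:
        return ("superordinate", [parent])
    # 4.5: last resort is a free word, to be recorded and reported (4.5.3).
    return ("free word", [concept])


# 4.9: order of descriptors in a manual search tool (labels are assumptions).
FACTOR_ORDER = ["main", "general", "space", "time", "document type"]


def order_for_manual_tool(tagged: list[tuple[str, str]]) -> list[str]:
    """Sort (descriptor, factor) pairs into the 4.9 order (stable sort)."""
    ranked = sorted(tagged, key=lambda pair: FACTOR_ORDER.index(pair[1]))
    return [descriptor for descriptor, _ in ranked]


demo = Thesaurus(
    descriptors={"computers", "libraries", "automation", "equipment"},
    use_refs={"PCs": "computers"},
    combinations={"library automation": ("libraries", "automation")},
    broader={"scanners": "equipment"},
)
for concept in ("PCs", "library automation", "scanners", "blockchain"):
    print(concept, "->", select_terms(concept, demo))
```

The sketch returns the rule that fired alongside the terms, so that free-word cases can be collected for the record card described in Appendix B. It does not model the limits in 4.6 or the connection symbols in 4.7.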