GB/T 14814-1993 Standard Generalized Markup Language for Information Processing Text and Office Systems (SGML)

Basic Information

Standard ID: GB/T 14814-1993

Standard Name: Standard Generalized Markup Language for Information Processing Text and Office Systems (SGML)

Chinese Name: 信息处理文本和办公系统标准通用置标语言(SGML)

Standard category: National Standard (GB)

Status: In force

Date of release: 1993-01-02

Date of implementation: 1994-08-01

Standard classification number

Standard ICS number: Information technology, office machinery and equipment >> Information technology applications >> 35.240.20 Information technology applications in office work

Standard classification number: Electronic Components and Information Technology >> Information Processing Technology >> L74 Programming Language

Associated standards

Adoption status: = ISO 8879-1986

Publication information

Publishing house: China Standards Press

Other information

Release date: 1993-12-30

Review date: 2004-10-14

Drafting unit: Xi'an Jiaotong University

Focal point unit: National Information Technology Standardization Technical Committee

Publishing department: State Bureau of Technical Supervision

Competent authority: National Standardization Administration


Introduction to standards:

This standard specifies a language for document representation, called "Standard Generalized Markup Language" (SGML). In its broadest definition, SGML can be used for typesetting, ranging from traditional single-media data typesetting to multimedia data typesetting. In addition, SGML can also be used for office document processing, to meet the needs of human readers and of document exchange between typesetting systems.

Some standard content:

National Standard of the People's Republic of China
Information processing
Text and office systems
Standard generalized markup language (SGML)
Information processing - Text and office systems - Standard generalized markup language (SGML)
GB/T 14814-93
This standard adopts the international standard ISO 8879-1986 "Information processing - Text and office systems - Standard generalized markup language (SGML)" and Amendment 1-1988.
0 Introduction
This standard specifies a language for document representation, called "Standard Generalized Markup Language" (SGML). In its broadest definition, SGML can be used for typesetting, ranging from traditional single-media data typesetting to multimedia data typesetting. In addition, SGML can also be used for office document processing, to meet the needs of human readers and of document exchange between typesetting systems.
0.1 Background
In the abstract, a document can be viewed as a structure composed of many types of elements. For example, an author may organize several chapters into a book, each of which may contain paragraphs and illustrations with textual annotations. Another example is that an editor may compile a number of articles into a journal, each of which may contain dozens of paragraphs, each of which may contain text, and so on. Processors use different methods to handle these elements. For example, a formatter may print a title in a prominent font, leaving it at the beginning of a paragraph or between paragraphs, thus visually displaying the structure and properties of the document to the reader. When a title dictionary is created for an information retrieval system, the text in the title may be given a special meaning. Although this connection between the properties of a document and its processing seems clear now, it was vague in terms of previous text processing methods. Before the advent of automated typesetting, editors would "mark up" manuscripts with special processing instructions, and then typesetters would follow these instructions to produce the desired format. All connections between instructions and file structure existed entirely in the editor's mind.
Later computer typesetting inherited this approach, adding "processing-specific markup" to the computer-readable file. Although the added markup still consists of specific processing instructions, these instructions use the language of the formatter rather than the language used by typesetters. As a result, such a file is difficult to use for a different purpose or on a different computer system without changing all the markup in it.
As users became more sophisticated and text processing programs more powerful, many solutions to this problem were developed. For example, "macro calls" (or "format calls") are used to identify the places in the file that need processing, while the actual processing instructions are kept in a "procedure" (or "macro definition" or "stored format") outside the file, where they can be modified more easily. Although macro calls can be placed anywhere in a file, users gradually realized that most of them are placed at the beginning or end of a document element. It was therefore natural to name these macro calls with "generic identifiers" that represent the element type, rather than with names tied to specific processing (for example, "heading" instead of "format-17"). This was the beginning of "generic coding" (generalized markup). Generic coding was an important step forward for automated text processing systems, because it reflects the natural relationship between file attributes and processing. In the early 1970s, the emergence of "generalized markup languages" provided a formal language foundation for generic coding, which further promoted this development trend.
Approved by the State Administration of Technical Supervision on December 24, 1993 and implemented on August 1, 1994
A generalized markup language should follow two principles:
a. Descriptive markup plays the major role and is distinguished from processing instructions. Descriptive markup includes the generic identifiers and other attributes of file elements, which can invoke processing instructions. Processing instructions can be expressed in any language and are usually collected in procedures outside the file. As the source file is scanned for markup and the various elements are identified, the processing system executes the procedure associated with each element and attribute. If another processing system is used, the same elements and attributes can be associated with different procedures without changing the markup of the file. When a processing instruction must be placed directly in the file, it is delimited differently from descriptive markup, so that it can easily be found and modified by different processing systems.
b. Each file type is formally defined. A generalized markup language formalizes file markup by adding a "file type definition". A type definition contains a description (similar to a formal grammar) of the elements and attributes that may appear in a document and the order in which they may appear. This information can be used to determine whether a document is correctly marked up (that is, whether it conforms to the type definition) and to supply missing markup that can be inferred unambiguously from other markup.
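The two principles above can be illustrated with a minimal Python sketch. It is not SGML and not part of this standard: the element names, the `TYPE_DEFINITION` table (which uses regular expressions as a stand-in for a formal type definition), and the `FORMATTER` and `INDEXER` procedure tables are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical "type definition": for each element type, a regular expression
# over the sequence of child element names (a stand-in for a formal grammar).
TYPE_DEFINITION = {
    "book": r"(chapter )+",
    "chapter": r"heading (para |figure )*",
}


def conforms(element_type, children):
    """Return True if the child element sequence matches the type definition."""
    model = TYPE_DEFINITION[element_type]
    sequence = "".join(name + " " for name in children)
    return re.fullmatch(model, sequence) is not None


# Two processing systems associate different procedures with the same
# generic identifiers; the markup in the file itself is never changed.
FORMATTER = {
    "heading": lambda text: text.upper(),
    "para": lambda text: "  " + text,
}
INDEXER = {
    "heading": lambda text: "INDEX: " + text,
    "para": lambda text: "",
}


def process(procedures, elements):
    """Run the procedure associated with each element's generic identifier."""
    return [procedures[gi](text) for gi, text in elements]


document = [("heading", "Intro"), ("para", "Text")]
formatted = process(FORMATTER, document)
indexed = process(INDEXER, document)
```

The same `document` list is passed to both procedure tables, which is the point of principle a; `conforms` shows how a formal definition lets a program decide whether a sequence of elements is correctly marked up, as in principle b.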
Note: A more detailed introduction to the concepts of generic coding and the Standard Generalized Markup Language is given in Appendix A (informative).
0.2 Objectives
This standard specifies the use of generic coding and generalized markup concepts in a markup language. It provides a clear and unambiguous syntax for describing whatever the user chooses to identify in a file. The language includes:
- an "abstract syntax" for the descriptive markup of file elements;
- a "base concrete syntax" that binds the abstract syntax to specific delimiter characters and quantities; users can define variant concrete syntaxes to meet their needs;
- markup declarations that allow users to define a specific vocabulary of generic identifiers and attributes for different file types;
- provision for arbitrary data content. In generalized markup, "data" is any content that is not defined by the markup language, including specialized "data content notations" that require interpretation different from that of general text, such as formulas, images, non-Latin alphabets, previously formatted text, or graphics;
- a non-system-specific technique for referencing content outside the data stream of a file (such as separately written chapters, temporarily added words, photos, etc.);
- special delimiters that distinguish processing instructions from descriptive markup. Processing instructions can be inserted when needed for situations that cannot be handled by procedures, and when the file is sent to another processing system they can easily be found and modified.
However, for a generalized markup language to become an acceptable standard, more is needed. To meet the constraints imposed by its use in complex environments, the language must also have metalinguistic properties. The main constraints, and the means by which the Standard Generalized Markup Language satisfies them, are summarized as follows:
a. Files "marked up" with the language must be acceptable to a wide range of text processing and word processing systems. The full language, with all its optional features, provides generality and flexibility that advanced systems can exploit; less powerful systems need not support these features. To facilitate the exchange of files between different systems, an "SGML declaration" describes the markup features and concrete syntax variants used in the file.
b. It must support the existing major text input devices. SGML files that use the base concrete syntax can easily be typed and understood by people without machine assistance. The use of SGML therefore need not await the development of a new generation of hardware; only software that processes files on existing machines is needed. As users become more familiar with SGML, it will be easier to move SGML to new generations of hardware when they become available.
c. It must be independent of any character set, because files can be typed on different devices. The language does not rely on a particular character set: any character set that has bit combinations for letters, digits, space, and delimiters is acceptable.
d. It must be independent of the processor, system, and device. Because the markup is mainly descriptive, it is inherently independent. Because the occasional processing instructions are specially delimited, they can be found and converted for file exchange, or ignored by a different processor to which they are irrelevant.
References to external parts of the file are indirect. The mapping to actual system storage is made by "external entity declarations" that appear at the beginning of the file, where they can easily be modified when the file is exchanged. The concrete syntax can be changed through the SGML declaration to accommodate any reserved system characters.
e. There must be no national language bias. The characters used for names can be extended with characters from any national language. The generic identifiers, attribute names, and other names used in descriptive markup are defined by the user in element and entity declarations. The declaration names and keywords used in markup can also be changed. Multiple character repertoires, such as those used in multilingual files, are supported.
f. The language must accommodate familiar typewriter and word processor conventions. Typewriter text entry conventions are supported by the "convenient reference" and "data tag" features, so that ordinary text containing paragraphs and quotations can be interpreted as SGML even though no visible tags are typed.
g. The language must not depend on any particular data stream or physical file organization. The markup language has a virtual storage model in which a file consists of one or more storage entities, each of which is a sequence of characters. All access to real storage is performed by the processing system, which decides whether to treat a character sequence as continuous or whether it reflects the boundaries of physical records.
h. "Marked up" text must coexist with other data. As long as the beginning and end of the text can be determined, the processing system can allow text that conforms to this standard to appear in a data stream with other content. Similarly, the system can allow data content that is not defined by SGML to appear in files that conform to this standard. To facilitate interchange, the presence of such data is indicated by markup declarations.
i. The markup must be usable by both humans and programs. The Standard Generalized Markup Language is intended to be a suitable interface for keyboard entry and for interchange without the need for a preprocessor. To accommodate the preferences and experience of users entering text, as well as the requirements of various keyboards and display devices, the language allows extensive tailoring.
However, it is recognized that many implementers will want to exploit the strengths of the generalized markup language in capturing information, in order to provide intelligent editing or to create SGML documents from a text processing front-end environment. SGML accommodates such applications by providing the following features:
- element content can be stored separately from markup;
- control characters can be used as delimiters;
- mixed modes of data representation can be allowed in the document;
- multiple logical structures and layout structures can be supported.
0.3 Organization
The content of this standard is organized as follows:
a. The physical organization of an SGML document as an entity structure is described in Chapter 6;
b. The logical organization of an SGML document as an element structure, and its representation with descriptive markup, are described in Chapter 7;
c. Processing instructions are discussed in Chapter 8;
d. Common markup components, such as characters, entity references, and processing instructions, are described in Chapter 9;
e. General-purpose markup declarations (comments, entities, and marked sections) are described in Chapter 10;
f. The markup declarations that are mainly used to describe document type definitions (document types, elements, notations, convenient reference mappings, and convenient reference uses) are defined in Chapter 11;
g. The markup declarations that are mainly used to describe link process definitions (link types, link attributes, link sets, and link set uses) are defined in Chapter 12;
h. The SGML declaration, which describes the file character set, capacity set, concrete syntax, and various features, is defined in Chapter 13;
i. The base concrete syntax is defined in Chapter 14;
j. Conformance of files and applications is defined in Chapter 15.
Finally, there are a number of informative appendices to this standard.
Note: This standard is a formal description of a computer language, and it may be difficult to read for people whose expertise is in producing files rather than in compilers. Appendices A, B and C discuss the main concepts in an informal, tutorial way that should be more accessible to most readers. However, readers should be aware that these appendices do not cover all the constructs of SGML, nor all the details of those they do cover. Some subtle distinctions are ignored in order to present the topic clearly.
1 Contents
This standard:
a. specifies an abstract syntax called Standard Generalized Markup Language (SGML). The language describes how to represent the structure and other aspects of a file, and also provides other information for interpreting markup;
b. specifies a base concrete syntax that binds the abstract syntax to specific characters and quantities, and provides rules for defining variant concrete syntaxes;
c. defines conforming files in terms of the language components they use;
d. defines conforming systems in terms of their ability to process conforming files and to identify markup errors in them;
e. specifies how data not defined by this standard (such as images, graphics, and formatted text) can be included in a conforming file.
Note: This standard does not specify or define "standard" file types, file structures or text structures; it does not specify the implementation, architecture or error handling of conforming systems; it does not specify how to create conforming files; it does not specify data flows, message management systems, file structures, physical representations for storing or exchanging conforming files, or character sets or encoding schemes for making conforming files interchangeable for this purpose; it does not specify the representation of data content or images, graphics, formatted text, etc. contained in conforming files.
2 Scope of application
The Standard Generalized Markup Language can be used in files that can be processed by any text processing or word processing system. It is particularly suitable for:
a. files exchanged between systems that use different text processing languages;
b. files that are processed in more than one way, even when the same text processing language is used.
Files that exist only in the form of final images are not within the scope of application of this standard.
3 Reference standards
GB 1988 Information processing Seven-bit coded character set for information interchange
GB 4880 Language name code
GB 13000.1 Information technology Universal multiple-octet coded character set (UCS) Part 1: Architecture and basic multilingual plane
ISO 9069 Information processing SGML support facilities SGML Document Interchange Format (SDIF)
ISO 9070 Information processing SGML support facilities Public text registration procedures
The following standards are used together with the sample material:
GB 2311 Information processing Code extension techniques for seven-bit and eight-bit coded character sets
GB 2659 Codes for names of countries and regions of the world
GB 8565.1～8565.3 Information processing Coded character sets for text communication
GB 11383 Information processing Structure and encoding rules for eight-bit codes for information interchange
ISO 8632/2 Information processing systems Computer graphics Metafile for the storage and transfer of picture description information Part 2: Character encoding
ISO 8632/4 Information processing systems Computer graphics Metafile for the storage and transfer of picture description information Part 4: Clear text encoding
ISO 8825 Information technology Open Systems Interconnection Specification of basic encoding rules for Abstract Syntax Notation One (ASN.1)
4 Definitions
The following definitions apply to this standard.
4.1 abstract syntax (of SGML)
A set of rules that defines how markup is added to document data, regardless of the specific characters that represent the markup.
4.2 active document type (declaration)
A document type that has been identified as active by the system.
Note: If an SGML entity has a corresponding active document type, the entity is parsed according to its active document type. Otherwise, the entity is parsed according to its base document type and any active link types.
4.3 active link type (declaration)
A link process that has been identified as active by the system.
4.4 ambiguous content model
A content model in which an element or string in a document instance can satisfy more than one primitive content token.
Note: The use of ambiguous content models is prohibited in SGML.
4.5 application
Text processing application.
4.6 application convention
Application-specific rules governing the text of a document within the limits of what SGML allows the user to choose.
Note: There are two types of application conventions: content conventions and markup conventions.
4.7 application-specific information
A parameter of an SGML declaration that specifies information required by an application and/or its architecture.
Note: For example, this information can identify an architecture and/or an application, or enable a system to determine whether it can process the document.
4.8 associated element type
An element type associated with the object of a markup declaration by means of the associated element type parameter.
4.8.1 associated notation (name)
A notation name associated with the object of a markup declaration by means of the associated notation name parameter.
4.9 attribute (of an element)
A characteristic of an element other than its type or content.
4.10 attribute definition
A member of an attribute definition list; it defines the attribute name, allowed values, and default value.
4.11 attribute definition list
A collection of one or more attribute definitions defined by the attribute definition list parameter of an attribute definition list declaration.
4.12 attribute (definition) list declaration
A markup declaration that associates an attribute definition list with one or more element types.
4.13 attribute list
Attribute specification list.
4.14 attribute list declaration
Attribute definition list declaration.
4.15 attribute specification
A member of an attribute specification list that specifies the value of a single attribute.
4.16 attribute (specification) list
A set of one or more attribute specifications.
Note: Attribute specification lists appear in start tags and link sets.
4.17 attribute value literal
A delimited string that is interpreted as an attribute value by replacing references and ignoring or converting function characters.
4.18 available public text
A public text that is available for general public use, although its owner may require payment or other conditions.
4.19 B sequence
A sequence of consecutive uppercase "B" characters in a string used as a convenient reference; it represents a blank sequence whose minimum length is the length of the B sequence.
4.20 base document element
A document element whose document type is the base document type.
4.21 base document type
The document type specified by the first document type declaration in the prolog.
4.22 basic SGML document
A conforming SGML document that uses the base concrete syntax and capacity set, and the markup minimization features SHORTTAG and OMITTAG.
Note: It may also use the SHORTREF feature, in accordance with the base concrete syntax.
4.23 bit
Binary digit, that is, 0 or 1.
4.24 bit combination
An ordered set of bits that can be interpreted as a binary number.
4.25 blank sequence
A contiguous sequence of SPACE and/or SEPCHAR characters.
4.26 capacity
A named limit on the size or complexity of a file, expressed as a number of units accumulated for one class of objects or for all objects.
Note: The set of capacities is defined by the abstract syntax, but the values are assigned to them by individual files and SGML systems.
4.27 capacity set
A set of assignments of numeric values to capacity names.
Note: In an SGML declaration, the capacity set indicates the maximum capacity requirements of the file (the actual requirements may be lower). A capacity set may also be defined by an application, to limit the capacity requirements that implementations of the application must handle, or by a system, to indicate the capacity requirements that it can satisfy.
4.28 CDATA
Character data.
4.29 CDATA entity
Character data entity.
4.30 chain of (link) processes
A sequence of processes that are executed in order and form a chain, in which the source of the first process is an instance of the base file type and the result of each process except the last is the source of the next process. Any part of the chain can be repeated.
Note: For example, a complex paging application system may contain three file types: logical file, galley (long proof) file, and paged file. It may also contain two link processes: "Adjust Layout" and "Estimate Page Count". The "Adjust Layout" process creates a galley file instance from a logical file instance, while the "Estimate Page Count" process in turn creates paged file instances from the galley file instances. Since decisions made in the "Estimate Page Count" process may require further adjustment of the galley file, the two processes can be repeated.
4.31 character
The smallest unit of information with a single meaning, defined by the character repertoire.
Notes:
1. There are two types of characters: graphic characters and control characters.
2. The meaning of a character in context is defined by markup or data content notation, which may override or supplement the meaning of the character in the character repertoire.
4.32 (character) class
A set of characters that have a common role in the abstract syntax, such as non-SGML characters or separator characters.
Note: There are four different ways to assign characters to character classes:
a. explicitly, by the abstract syntax (such as Special, Digit, LC Letter and UC Letter);
b. explicitly, by the concrete syntax (such as LCNMSTRT, FUNCHAR, SEPCHAR, etc.);
c. implicitly, as a result of explicit assignments to delimiter roles or other character classes (such as DELMCHAR and DATACHAR);
d. explicitly, by the document character set (such as NONSGML).
4.33 character data
Zero or more characters that appear in a context in which no markup is recognized, other than the delimiter that ends the character data. Such characters are classified as data characters because they are declared as such.
4.34 character data entity
An entity whose text is treated as character data when referenced, and whose text does not depend on a specific system, device, or application process.
4.35 character entity set
A public entity set consisting of general entities that are graphic characters.
Notes:
1. Character entities are used for characters that have no coded representation in the file character set, for characters that cannot be conveniently typed from the keyboard, or to achieve device independence for characters that cannot be correctly displayed on all output devices.
2. There are two types of character entity sets: definitional and display.
4.36 character number
A decimal integer equivalent to the coded representation of a character, obtained by treating the byte sequence as a single binary integer.
4.37 character reference
A reference that is replaced by a single character.
Note: There are two types of character references: named character references and numeric character references.
4.38 character repertoire
A collection of characters that are used together. It defines the meaning of each character and can also define control sequences of multiple characters.
Note: When a character appears in a control sequence, the meaning of the sequence replaces the meaning of the individual characters.
4.39 character set
A mapping from a character repertoire to a code set, so that each character corresponds to its coded representation.
4.40 character string
A sequence of characters.
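The character number of 4.36 can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the most-significant-byte-first ordering is an assumption made here, and the `&#` and `;` delimiters are those of the ISO 8879 reference concrete syntax, which a variant concrete syntax could change.

```python
def character_number(coded_representation: bytes) -> int:
    """Treat the byte sequence as one binary integer (assumed big-endian)."""
    return int.from_bytes(coded_representation, byteorder="big")


def numeric_character_reference(number: int) -> str:
    """Build a numeric character reference using the assumed '&#' and ';' delimiters."""
    return f"&#{number};"


# The single byte 0x41 (the letter "A" in a seven-bit code) has character number 65.
number_of_a = character_number(b"\x41")
reference_to_a = numeric_character_reference(number_of_a)
```

A two-byte coded representation such as `b"\x01\x00"` is treated as the single integer 256 under the same assumption.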
4.41 class
Character class.
4.42 code extension
The use of a single coded representation for more than one character, without changing the character set of the file.
Note: When multiple national languages appear in a file, extension of the character repertoire may be useful.
4.43 code set
A set of bit combinations of the same length, ordered by their numeric values, which must be consecutive.
Note: For example, a code set whose bit combinations are 8 bits long (an "8-bit code") can consist of as many as 256 bit combinations with values ranging from 00000000 to 11111111 (decimal 255), or it can consist of any consecutive subset of these bit combinations.
4.44 code set position
The numeric value of a bit combination in a code set.
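The 8-bit example in the note to 4.43 can be worked through with a short Python sketch. The function names are hypothetical, and the sketch builds only the full code set, not a consecutive subset.

```python
def code_set(bits: int) -> list:
    """Return every bit combination of the given length, in numeric order."""
    return [format(value, f"0{bits}b") for value in range(2 ** bits)]


def code_set_position(bit_combination: str) -> int:
    """Return the numeric value of a bit combination (its code set position)."""
    return int(bit_combination, 2)


eight_bit_code = code_set(8)
```

With 8 bits the list holds 256 entries, from `00000000` to `11111111`, and the position of the last entry is 255, matching the figures in the note.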
4.45 coded representation
The representation of a character as a sequence of one or more bit combinations of the same length.
4.46 comment
A part of a markup declaration containing explanatory or commentary information to assist the user of the document.
4.47 comment declaration
A markup declaration containing only comments.
4.48 concrete syntax (of SGML)
The combination of an abstract syntax with specified delimiters, quantities, names of markup declarations, etc.
4.49 concrete syntax parameter
A parameter of an SGML declaration that identifies the concrete syntax used in document elements and (usually) in the prolog.
Note: This parameter consists of parameters that identify the syntax-reference character set, function characters, shunned characters, naming rules, use of delimiters, use of reserved names, and quantitative characteristics.
4.50 conforming SGML application
An SGML application that requires documents to be conforming SGML documents and whose documentation meets the requirements of this standard.
4.51 conforming SGML document
An SGML document that conforms to all the requirements of this standard.
4.52 containing element
An element in which child elements appear.
4.53 content
Characters appearing between the start tag and end tag of an element in a document instance. They may be interpreted as data, proper child elements, included child elements, other markup, or a mixture of these.
Note: If an element has an explicit content reference, or its declared content is "EMPTY", its content is empty. In this case, data may be generated by the application itself and is treated the same as content data.
4.54 content convention
An application convention that governs data content, such as length limits, allowed characters, or the use of uppercase and lowercase letters.
Note: Content conventions are essentially informal data content notations, usually limited to a single element type.
4.55 (content) model
A parameter of an element declaration that specifies the model group and exceptions that define the content allowed within the element.
4.56 content model nesting level
The maximum number of consecutive occurrences of grpo or dtgo delimiters in a content model that have no corresponding grpc or dtgc delimiter.
4.57 content reference (attribute)
An attribute that may be implied, whose value is referenced by the application to generate content data.
Note: When an element has an explicit content reference, the content of the element in the document instance is empty.
4.58 contextual sequence
A sequence of one or more markup characters that must follow a delimiter string in the same entity in order for the string to be recognized as a delimiter.
4.59 contextually optional element
An element that:
a. can occur only because it is an inclusion; or
b. has a content token in the currently applicable model group that is a contextually optional token.
4.60 contextually optional token
A content token that:
a. is an inherently optional token; or
b. has a plus occurrence indicator and has been satisfied; or
c. is in a model group that is itself a contextually optional token and none of whose tokens have been satisfied.
4.61 contextually required element
An element that is not contextually optional and that:
a. has a generic identifier that is the document type name; or
b. has a currently applicable model token that is a contextually required token.
NOTE: An element can be neither contextually required nor contextually optional; for example, an element whose currently applicable model token is in an or group that has no inherently optional tokens.
4.62 contextually required token
A content token that:
a. is the only content token in its model group; or
b. is in a seq group
   i) that
      1) is itself a contextually required token; or
      2) contains a token that has been satisfied;
   and
   ii) all preceding tokens of which
      1) have been satisfied; or
      2) are contextually optional.
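The rules in 4.60 and 4.62 can be read as two recursive predicates over a tree of content tokens. The sketch below is one possible reading, not a conforming parser. The `Token` class and all function names are hypothetical. Two simplifications are assumed: a token is treated as inherently optional only when it carries a "?" or "*" occurrence indicator, and the outermost model group is treated as contextually required unless it is optional.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Token:
    """A content token: a primitive token or a nested model group (hypothetical type)."""
    name: str
    occurrence: str = ""               # "", "?", "*", or "+"
    connector: Optional[str] = None    # "seq", "or", "and" for groups; None otherwise
    children: List["Token"] = field(default_factory=list)
    satisfied: bool = False
    parent: Optional["Token"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        for child in self.children:
            child.parent = self


def inherently_optional(tok: Token) -> bool:
    # Simplifying assumption: only the "?" and "*" indicators are considered.
    return tok.occurrence in ("?", "*")


def contextually_optional(tok: Token) -> bool:
    if inherently_optional(tok):                    # 4.60 a
        return True
    if tok.occurrence == "+" and tok.satisfied:     # 4.60 b
        return True
    group = tok.parent                              # 4.60 c
    return (
        group is not None
        and contextually_optional(group)
        and not any(t.satisfied for t in group.children)
    )


def contextually_required(tok: Token) -> bool:
    group = tok.parent
    if group is None:
        # Assumption: the outermost group is required unless it is optional.
        return not contextually_optional(tok)
    if len(group.children) == 1:                    # 4.62 a
        return True
    if group.connector != "seq":                    # 4.62 b covers seq groups only
        return False
    group_ok = contextually_required(group) or any(t.satisfied for t in group.children)
    preceding = group.children[: group.children.index(tok)]
    return group_ok and all(t.satisfied or contextually_optional(t) for t in preceding)


# Model group (a, b?, c): before anything is seen, only "a" is required.
a, b, c = Token("a"), Token("b", "?"), Token("c")
root = Token("model", connector="seq", children=[a, b, c])
print(contextually_required(a), contextually_required(c))
a.satisfied = True
print(contextually_required(c))
```

Once `a` has been satisfied, `c` becomes contextually required because every token before it is either satisfied or contextually optional.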
4.63 control character
A character that controls the interpretation, presentation, or other processing of the characters that follow it; for example, a tab character.
4.64 control sequence
A sequence of characters that begins with a control character and controls the interpretation, presentation, or other processing of the characters that follow it; for example, an escape sequence.
4.65 core concrete syntax
A variant of the reference concrete syntax that has no short reference delimiters.
4.66 corresponding content (of a content token)
The elements and/or data in a document instance that correspond to a content token.
4.67 current attribute
An attribute whose current (that is, most recently specified) value becomes its default value.
NOTE: The start tag cannot be omitted for the first occurrence of an element that has a current attribute.
4.68 current element
The open element whose start tag most recently occurred (or was omitted through markup minimization).
4.69 current link set
The link set associated with the current element by a link set use declaration in the element content or by a link process definition. If the current element has no associated link set, the previous current link set continues to be the current link set.
4.70 current map
The short reference map associated with the current element by a short reference use declaration in the element content or by the document type definition. If the current element has no associated map, the previous current map continues to be the current map.
4.71 current rank
The number that is appended to a rank stem in a tag to derive the generic identifier. For a start tag, it is the rank suffix of the most recent element with the identical rank stem, or with a rank stem in the same ranked group. For an end tag, it is the rank suffix of the most recent open element with the identical rank stem.
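The start-tag half of the current rank definition (4.71) can be sketched as a small lookup that remembers the last rank suffix seen for each rank stem or ranked group. The class `RankTracker` and its method names are hypothetical, and the sketch deliberately ignores end tags, which would also need the stack of open elements.

```python
class RankTracker:
    """Tracks the current rank for start tags only (illustrative sketch)."""

    def __init__(self, ranked_groups=()):
        # Map each rank stem to the ranked group it belongs to, if any.
        self._group_of = {}
        for group in ranked_groups:
            members = frozenset(group)
            for stem in members:
                self._group_of[stem] = members
        self._current = {}  # ranked group (or lone stem) -> most recent suffix

    def start_tag(self, stem, suffix=None):
        """Return the generic identifier derived from a rank stem and suffix."""
        key = self._group_of.get(stem, frozenset([stem]))
        if suffix is None:
            if key not in self._current:
                raise ValueError(f"no current rank for stem {stem!r}")
            suffix = self._current[key]
        self._current[key] = suffix
        return f"{stem}{suffix}"


tracker = RankTracker(ranked_groups=[("h", "p")])
print(tracker.start_tag("h", 2))  # suffix given explicitly
print(tracker.start_tag("p"))     # suffix taken from the current rank of the group
```

Because `h` and `p` are declared here as one ranked group, the second call reuses the suffix set by the first.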
4.72 data
Characters in a document that represent its inherent information content; that is, characters that are not recognized as markup.
4.72.1 data attribute
An attribute of the data that conforms to a particular data content notation.
NOTE: In most cases, the value of a data attribute must be known before the data can be interpreted according to the notation.
4.73 data character
An SGML character that is interpreted as data in the context in which it occurs, either because it was declared to be data or because it was not recognized as markup.
4.74 data content
The portion of an element's content that is data rather than markup or a subelement.
4.75 data content notation
An application-specific interpretation of the data content of an element, or of a data entity, that usually extends or differs from the normal meaning of the document character set.
NOTE: A data content notation is specified for an element's content by a notation attribute, and for a data entity by the notation name parameter of its entity declaration.
4.75.1 data entity
An entity that was declared to be data and is therefore not parsed when referenced.
NOTE 1: There are three kinds of data entity: character data entities, specific character data entities, and non-SGML data entities.
NOTE 2: The interpretation of a data entity is governed by its data content notation, which may be defined by another standard.
4.76 data tag
A string that conforms to the data tag pattern of an open element and that serves both as the end tag of the open element and as character data in the element that contains it.
4.77 data tag group
A model group token that associates a data tag pattern with a target element type.
NOTE: Within an instance of the target element, the data content is scanned for a string that conforms to the pattern (a data tag).
4.78 data tag pattern
A data tag group token that defines the strings that would constitute a data tag if they occurred in the proper context.
4.79 declaration
A markup declaration.
4.80 declaration subset
A delimited portion of a markup declaration in which other declarations can occur.
NOTE: Declaration subsets occur only in document type, link type, and marked section declarations.
4.81 declared concrete syntax
The concrete syntax described by the concrete syntax parameter of the SGML declaration.
4.82 dedicated data characters
The character class consisting of each SGML character that has no possible meaning as markup; its members are always treated as data characters.
4.83 default entity
The entity that is referenced by a general entity reference whose name was not declared.
4.84 default value
The portion of an attribute definition that specifies the attribute value to be used when none is specified.
4.85 definitional (character) entity set
A character entity set whose purpose is to define entity names for graphic characters, but not to actually display them. Its public identifier does not include a public text display version.
NOTE: During processing, the system replaces a definitional entity set with the corresponding display character entity set for the output device.
4.86 delimiter characters
The character class consisting of each SGML character, other than a name character or function character, that occurs in a string assigned as a delimiter by the concrete syntax.
4.87 delimiter-in-context
A character string consisting of a delimiter string followed immediately, in the same entity, by a contextual sequence.
4.88 delimiter role
A role defined by the abstract syntax, and filled by a character string assigned by the concrete syntax, that identifies markup and/or distinguishes it from data.
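The interplay between a delimiter string and its contextual sequence (4.58, 4.87) can be illustrated with a small check: a candidate delimiter counts only if the required context follows it immediately. The sketch below assumes a one-character contextual sequence and treats a single Python string as one entity. The function name, the "<" delimiter, and the use of `str.isalpha` as a stand-in for "name start character" are illustrative assumptions, not definitions from this standard.

```python
def is_delimiter_in_context(text: str, pos: int, delimiter: str, is_context) -> bool:
    """True if `delimiter` occurs at `pos` and is followed by a contextual character."""
    if not text.startswith(delimiter, pos):
        return False
    end = pos + len(delimiter)
    # The contextual sequence must follow immediately within the same entity.
    return end < len(text) and is_context(text[end])


sample = "a < b <p>"
print(is_delimiter_in_context(sample, 2, "<", str.isalpha))  # "<" followed by a space
print(is_delimiter_in_context(sample, 6, "<", str.isalpha))  # "<" followed by "p"
```

Under these assumptions, the first "<" is left as data, while the second would be recognized as markup.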
4.89 delimiter set
A set of delimiter strings assigned to the delimiter roles of the abstract syntax.
4.90 delimiter set parameter
A parameter of the SGML declaration that identifies the delimiter set used in the declared concrete syntax.
4.91 delimiter (string)
A character string assigned to a delimiter role by the concrete syntax.
4.92 descriptive markup
Markup that describes the structure and other attributes of a document in a non-system-specific manner, independently of any processing that may be performed on it. In particular, it uses tags to express the element structure.
4.93 device-dependent version (of public text)
Public text whose formal public identifier differs from that of another public text only by the addition of a public text display version, which identifies the display devices supported or the coding scheme used.
4.94 digits
The character class consisting of the ten digits "0" to "9".
4.95 display (character) entity set
An entity set that has the same entity names as the corresponding definitional character entity set but that causes the characters to be displayed. It is a device-dependent version of the corresponding definitional entity set.
4.96 document
A collection of information that is processed as a unit. A document is classified as being of a particular document type.
NOTE: In this International Standard, the term always means (without loss of accuracy) an SGML document.
4.97 document architecture
Rules for formulating text processing applications.
NOTE: For example, a document architecture may define:
a. attribute semantics for use in a variety of element definitions;
b. element classes, based on the attributes the elements have;
c. structural rules for defining document types in terms of element classes;
d. link processes, and how they are affected by attribute values; and/or
e. information that accompanies a document during interchange (a "document profile").
4.98 document character set
The character set used for all markup in an SGML document and, at least initially, for data.
NOTE: When a document is interchanged between systems, its character set is translated to the character set of the receiving system.
4.99 document element
The outermost element of an instance of a document type; that is, the element whose generic identifier is the document type name.
4.100 document instance
An instance of a document type.
4.101 document instance set
The portion of an SGML document entity or SGML subdocument entity in the entity structure that contains one or more instances of document types. It is co-extensive with the base document element in the element structure.
NOTE: When the concurrent instance feature is used, multiple instances can exist in a document at the same time, and these instances can share data and markup.
4.102 document type
A class of documents having similar characteristics; for example, magazines, articles, technical manuals, and handouts.
4.103 (document) type declaration
A markup declaration that contains the formal specification of a document type definition.
4.104 document type declaration subset
The element, entity, and short reference sets that occur within the declaration subset of a document type declaration.
NOTE: The external entity referenced from the document type declaration is considered part of the declaration subset.
4.105 document (type) definition
Rules, determined by an application, that apply SGML to the markup of documents of a particular type.
NOTE: Part of a document type definition can be specified by a document type declaration. Other parts, such as the semantics of elements and attributes or any application conventions, cannot be expressed formally in SGML; they can be expressed informally in comments.
4.106 document type specification
The portion of a tag that identifies the document type instances within which the tag will be processed.
NOTE: A name group performs the same function in an entity reference.
4.107 ds (separator)
A declaration separator, which occurs in declaration subsets.
4.108 DTD
Document type definition.
4.109 effective status (of a marked section)
The highest priority status keyword specified in a marked section declaration.
4.110 element
A component of the hierarchical structure defined by a document type definition; it is identified in a document instance by descriptive markup, usually a start tag and an end tag.
NOTE: An element is classified as being of a particular element type.
4.111 element declaration
A markup declaration that contains the formal specification of the part of an element type definition that deals with content and markup minimization.
4.112 element set
A set of element declarations that are used together.
NOTE: An element set may be public text.
4.113 element structure
The organization of a document into hierarchies of elements, each hierarchy conforming to a different document type definition.
4.114 element type
A class of elements having similar characteristics; for example, paragraph, chapter, abstract, footnote, or bibliography.
4.115 element (type) definition
Application-specific rules that apply SGML to the markup of elements of a particular type. An element type definition includes a formal specification, expressed in element and attribute definition list declarations, of the content, markup minimization, and attributes allowed for a specific element type.
NOTE: An element type definition is normally part of a document type definition.
4.116 element type parameter
A parameter of an element declaration that identifies the type of element to which the definition applies.
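The effective status of a marked section (4.109) is a "highest priority wins" selection over the status keywords given in the declaration. The sketch below assumes the priority order IGNORE, CDATA, RCDATA, INCLUDE and a default of INCLUDE when no status keyword is given. That ordering and default come from the marked section clause of ISO 8879, not from this excerpt, so they should be checked against the full standard. The function name is hypothetical.

```python
# Assumed priority order, highest first (taken from ISO 8879, not from this excerpt).
STATUS_PRIORITY = ("IGNORE", "CDATA", "RCDATA", "INCLUDE")


def effective_status(keywords):
    """Pick the highest priority status keyword; other keywords are ignored."""
    specified = {keyword.upper() for keyword in keywords}
    for status in STATUS_PRIORITY:
        if status in specified:
            return status
    return "INCLUDE"  # assumed default when no status keyword is specified


print(effective_status(["INCLUDE", "IGNORE"]))
print(effective_status(["rcdata", "include"]))
```

Keywords that are not status keywords, such as TEMP, simply fall through and leave the default in place.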