ICS 37.080
National Standardization Guidance Technical Document of the People's Republic of China GB/Z 23283-2009/ISO/TR 18492:2005 Long-term preservation of electronic document-based information (ISO/TR 18492:2005, IDT)
Issued on 13 March 2009
General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China Administration of Standardization of the People's Republic of China
This guidance technical document is equivalent to ISO/TR 18492:2005 Long-term preservation of electronic document-based information (English version).
This guidance technical document is identical in content to the technical report ISO/TR 18492:2005, with the following editorial changes:
- the foreword of ISO/TR 18492:2005 has been deleted, and the foreword of this guidance technical document has been abbreviated;
- the self-introduction of ISO/TR 18492:2005 has been deleted;
- GB/T 20225-2006 has been added to the terms and definitions;
- the error in the publication introduction in Appendix A.3 has been corrected.
Appendix A of this guidance technical document is an informative appendix.
This guidance technical document was proposed and coordinated by the National Technical Committee for Standardization of Document Imaging Technology (SAC/TC 86).
This guidance technical document was drafted by the Seventh Branch of the National Technical Committee for Standardization of Document Imaging Technology.
The main drafters of this guidance technical document are Zhang Meifang, Sun Jingrong, Li Du, and Jiang Zhiwei.
Introduction
Ensuring the long-term preservation of authentic file-based electronic information is a prominent issue in many fields such as archival science, record management, e-commerce, e-government and technology development. To address this issue, individuals and institutions have adopted a variety of methods and measures to preserve file-based electronic information for a long time.
The long-term preservation of authentic file-based electronic information clearly needs to be addressed, but there is currently no unified international standard for doing so. As a result, inconsistent or even incompatible methods have been adopted, which may directly affect the availability and authenticity of file-based electronic information. Given the widespread technological obsolescence of computer hardware and software and the limited lifespan of digital storage media, this guidance technical document provides guidance for preservation agencies on using and preserving file-based electronic information. It provides a clear framework for developing long-term preservation strategies and best practices for file-based electronic information. It can be widely used by the public sector and by individuals to ensure the long-term availability and authenticity of file-based electronic information.
1 Scope
This guidance technical document provides practical and methodological guidance for the long-term preservation and retrieval of authentic file-based electronic information when the preservation period of that information exceeds the expected life of the technology (hardware and software) used to generate and maintain it. It considers the role of technology-neutral information technology standards in the long-term use of information. It also recognizes that ensuring the long-term preservation and retrieval of file-based electronic information requires the joint efforts of information technology experts, records managers and archivists. This guidance technical document does not cover the entire process of generating, acquiring and classifying authentic file-based electronic information.
This guidance technical document applies to all forms of information generated by information systems as evidence of business activities.
Document-based electronic information constitutes the "business memory" of daily business activities or events, allowing institutions to review, analyze or prove these activities and events. It is evidence of business activities and can help institutions make management decisions now or in the future, meet user requirements and respond to adverse litigation. For this reason, document-based electronic information should be properly retained and preserved.
2 Normative referenced documents
The provisions in the following documents become provisions of this guidance technical document through reference in it. For dated referenced documents, subsequent amendments (excluding errata) or revisions do not apply to this guidance technical document. For undated referenced documents, the latest version (including any revisions) applies to this guidance technical document.
GB/T 20225-2006 Electronic imaging: Vocabulary (ISO 12651:1999, IDT)
ISO 15489-1 Information and documentation: Records management, Part 1: General
ISO/TR 15489-2 Information and documentation: Records management, Part 2: Guidelines
ISO/TS 23081-1 Information and documentation: Records management processes, Metadata for records, Part 1: Principles
3 Terms and definitions
The terms and definitions established in GB/T 20225-2006, ISO 15489-1 and ISO/TR 15489-2 and the following terms and definitions apply to this guidance technical document.
3.1
Authentic document-based information
Document-based electronic information whose authenticity, reliability and integrity are maintained over time.
3.2
Document-based information
Independent information that can be processed as a whole (e.g. an image, subject, spreadsheet, database view, etc.).
Note: Document-based information includes, but is not necessarily limited to, text, images, tabular data (e.g. spreadsheets), or a combination thereof.
3.3
Document-based information content
The substantive content contained in document-based information.
3.4
Document-based information context
Information about the generation, control, use, storage and management of electronic documents, and information about similar materials.
3.5
Document-based information structure
The logical properties and physical properties of the information in a file.
Note: Logical properties include logical order, such as the hierarchical structure of distinguishable parts, while physical properties include various factors such as size and spacing.
3.6
Electronic archiving
The storage of electronic information in a separate physical or logical space to protect it from loss, alteration, or destruction.
Note: If information is protected in this way, it may be considered reliable in the future.
3.7
Long-term preservation
The period of time during which electronic record-based information remains usable and authentic. This period may range from a few years to several decades, depending on the needs and requirements of the organization. For some organizations, the preservation period is determined by the needs of management, legal requirements, and business needs. For organizations such as archives that preserve public records, the period of preservation of electronic record-based information is generally several hundred years.
3.8
Metadata
Data describing the content (including searchable index terms), context and structure of electronic record-based information, and its management over the long term.
3.9
Migration
The process of transferring information from one hardware or software environment or storage medium to another without substantially changing the structure, content, or context of electronic record-based information.
3.10
Storage repository
The storage organization or entity that is responsible for the storage and preservation of authentic electronic record-based information.
Note: This definition is different from the technical definition of "storage organization".
3.11
Technological obsolescence
The replacement, within the industry, of existing technical methods as a result of technological development and progress.
4 Codes and Abbreviations
PDF/A-1
ASCII American Standard Code for Information Interchange
CRC Cyclic Redundancy Check
HTML Hypertext Markup Language
JPEG Joint Photographic Experts Group (a compression standard)
OCR Optical Character Recognition
PDF Portable Document Format
SHA Secure Hash Algorithm
TIFF Tagged Image File Format
WORM Write Once Read Many
XML Extensible Markup Language
5 Long-term Preservation
5.1 Overview
With the rapid development of technology for generating, using, storing and preserving computer information, the private and public sectors increasingly rely on file-based electronic information as the official evidence of their business activities. Preservation agencies face the challenge of ensuring the long-term preservation of authentic file-based electronic information. Information is generated in secure and reliable information systems and stored on electronic media, and both the systems and the media are at risk of technological obsolescence. If the effects of technological obsolescence are left uncorrected, the file-based electronic information may become irrecoverable.
The activities and business processes of institutions are increasingly conducted in a paperless environment, which makes ensuring the long-term usability of authentic file-based electronic information all the more important. It is therefore necessary to formulate and apply clear strategies for the long-term preservation and protection of authentic file-based electronic information. Clause 5.2 covers the content of such a strategy.
5.2 Objectives of the long-term preservation strategy
5.2.1 Overview
This clause defines six key requirements that the preservation agency should consider when formulating a preservation strategy.
5.2.2 Readable file-based electronic information
The long-term preservation strategy should ensure that file-based electronic information remains readable in the future. To achieve this, the bit stream that makes up the file-based electronic information must be processable by a computer system or device at each of the following stages:
- when the file is created;
- when it is currently stored;
- when it is currently used;
- when it is stored in the future.
At any of these four stages, the file-based electronic information stored on the medium can become unreadable. There are two main causes.
The first is an unsuitable storage environment. All media currently used to store file-based electronic information are susceptible to damage from unsuitable environments, such as fluctuations in temperature and humidity. These adverse conditions can damage the media or accelerate their aging. Different types of digital storage media require different storage environments to achieve their maximum lifespan. Some storage media (magnetic storage media) are susceptible to information corruption caused by magnetic interference, dust and environmental contaminants, while others (optical storage media) are less susceptible to external factors and are not easily damaged provided the storage environment is reasonably controlled. Regardless of the type of storage media used, adverse environments can damage or age all forms of storage media.
The second is media obsolescence. Obsolescence of media can also render information unreadable. Physical incompatibility between storage media (e.g., tape or optical disk) and current computer hardware (e.g., tape or optical disk devices) can render information unreadable.
Because information technology continues to develop, future obsolescence of media is inevitable. Advances in storage technology will continue to change the physical storage of file-based electronic information (e.g., recording technology, disk drive hardware/software interfaces), the physical form of the storage medium, and the bit stream that represents the recorded information (e.g., error correction codes). Over time, storage media will therefore become incompatible with later media, and long-term preservation strategies should specifically provide for the periodic transfer of file-based electronic information from older media to newer media as media become obsolete. The format of the data is as important as its readability, and consideration should be given to ensuring that the data format (a technology-neutral format) can be processed by users in the future.
5.2.3 Intelligible file-based electronic information
Long-term preservation strategies should provide intelligible file-based electronic information. A computer can make sense of digital information only if it is told how to interpret the bit stream. The intelligibility of file-based information is therefore a function of the bit stream's ability to represent information and of the ability to take appropriate action based on that information. For example, the binary code that forms a tagged image format file is not inherently intelligible. However, the header of the image file, which declares details such as byte order and compression, enables a computer (through a combination of an operating system and image processing software) to display and print the image. Similarly, a word processing file carries metadata that makes its contents intelligible to text processing software.
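The role of header information in making a bit stream interpretable can be illustrated with a minimal Python sketch. It assumes the published TIFF 6.0 header layout (a two-byte byte-order mark, the number 42, and a four-byte offset to the first image file directory). The function name `describe_tiff_header` is hypothetical and is not part of this guidance technical document.

```python
import struct


def describe_tiff_header(data: bytes) -> dict:
    """Interpret the first 8 bytes of a TIFF bit stream (illustrative sketch)."""
    if len(data) < 8:
        raise ValueError("a TIFF header is 8 bytes long")
    order_mark = data[:2]
    if order_mark == b"II":
        prefix, byte_order = "<", "little-endian"
    elif order_mark == b"MM":
        prefix, byte_order = ">", "big-endian"
    else:
        raise ValueError("no TIFF byte-order mark found")
    # Without the byte order, the next six bytes cannot be interpreted.
    magic, first_ifd_offset = struct.unpack(prefix + "HI", data[2:8])
    if magic != 42:
        raise ValueError("TIFF identification number 42 not found")
    return {"byte_order": byte_order, "first_ifd_offset": first_ifd_offset}


# The same six bytes after the mark mean different things in each byte order.
little = describe_tiff_header(b"II*\x00\x08\x00\x00\x00")
big = describe_tiff_header(b"MM\x00*\x00\x00\x00\x08")
```

Without the two-byte mark, software cannot tell which of the two interpretations applies, which is the sense in which the bit stream alone is not intelligible.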
5.2.4 Identifiable file-based electronic information
Long-term preservation strategies should provide identifiable file-based electronic information. Identifiable file-based electronic information should be organized, categorized, and described in a manner that enables users and information systems to distinguish information objects by unique attributes such as names or ID numbers, and to group file-based electronic information into categories based on shared attributes for easy search and retrieval. Failure to provide such identification will severely limit search and retrieval.
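The two kinds of identification described above, a unique attribute per object and grouping by a shared attribute, can be sketched as follows. This is a minimal Python illustration only. The class names `InformationObject` and `Catalogue` and the field names are hypothetical and do not come from this guidance technical document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationObject:
    object_id: str   # unique attribute (assumed to be an ID number)
    name: str
    category: str    # shared attribute used for grouping


class Catalogue:
    """Index objects by unique ID and by shared category (illustrative)."""

    def __init__(self) -> None:
        self._by_id: dict[str, InformationObject] = {}
        self._by_category: dict[str, list[InformationObject]] = defaultdict(list)

    def register(self, obj: InformationObject) -> None:
        # Reject duplicates so that every ID identifies exactly one object.
        if obj.object_id in self._by_id:
            raise ValueError(f"duplicate identifier: {obj.object_id}")
        self._by_id[obj.object_id] = obj
        self._by_category[obj.category].append(obj)

    def get(self, object_id: str) -> InformationObject:
        return self._by_id[object_id]

    def in_category(self, category: str) -> list[InformationObject]:
        return list(self._by_category.get(category, []))
```

Rejecting duplicate identifiers at registration is one possible design choice. An institution could equally generate the identifiers itself.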
5.2.5 Retrievable file-based electronic information
Long-term preservation strategies should provide for the retrieval of file-based electronic information, meaning that discrete information objects (or portions thereof) can be retrieved and displayed. Retrievability is generally software dependent and requires specific keys or pointers that link the logical structure of information objects (e.g., data fields or strings) to the physical storage location.
Generally, this link is found in database records, file system directory structures, file allocation tables, file headers, or tags. It includes information such as the location of the beginning of an object, the number of bytes in a file or data element, and their physical location on the storage medium. The logical structure of file-based electronic information is typically maintained by the operating system or by a device driver that is integrated with a specific application system used to store, manage, or use digital information. The retrievability of information objects is therefore inevitably linked to the device driver, application software, file system, or operating system. Newer file formats that are compatible with older file formats help ensure the retrievability of file-based electronic information. However, backward compatibility may be limited because many software developers support only certain file formats, while others support all versions of various data formats. For example, Hypertext Markup Language (HTML), still image compression standards, and tagged image formats are backward compatible.
5.2.6 Understandable file-based electronic information
Long-term preservation strategies should ensure that file-based electronic information is understandable. Understandable document-based electronic information should communicate information to both computers and humans. However, the meaning conveyed by document-based electronic information is determined not only by its content, but also by the context of its creation and use (e.g., metadata). Preservation agencies should therefore be aware that ensuring the understandability of document-based electronic information is very different from ensuring the understandability of paper documents.
The physical characteristics of paper documents typically reflect the context of their creation and use, while the context of the creation and use of document-based electronic information is often logical rather than physical. For example, paper documents related to a particular activity may be bound together or placed in a folder. File-based electronic information relating to one activity may exist on different media in different locations, so it needs to be linked together in electronic form. These logical connections may include evidence of business processes and participants, the context in which the file-based electronic information is generated and used, and the relationships between items of file-based electronic information. Such relationships can be expressed in various ways, including reference codes in document descriptions that point to other materials addressing the same issue, or classification codes that link each piece of file-based electronic information related to similar processes. The successful retrieval of file-based electronic information, regardless of how long it is preserved, depends in part on the preservation of these logical relationships.
5.2.7 Authentic file-based electronic information
5.2.7.1 Overview
A key element of a long-term preservation strategy is to ensure the authenticity of the file-based electronic information. Authenticity, that is, the assurance that the information has not been altered, modified, or destroyed over time, is the purpose of preserving file-based electronic information. Organizations seeking to provide long-term availability of authentic file-based electronic information should consider the following three key measures in their strategies:
a) migration and preservation;
b) storage environment;
c) access and protection.
5.2.7.2 Migration and preservation of file-based electronic information
As long as electronic records remain in the environment in which they were created and are not stored on an unalterable write-once medium, it is difficult to protect them from modification. The long-term preservation strategy should provide measures for migrating file-based electronic information out of the environment in which it was created, and away from its producer and recipient, to a preservation system or storage repository. For example, an independently operated third party could be made responsible for preserving the file-based electronic information in accordance with documented policy and practice.
5.2.7.3 Storage environment
The long-term preservation strategy should clearly state the need for a stable storage environment for the media on which the file-based electronic information is stored, because an unfavorable or inappropriate storage environment puts the file-based electronic information at risk.
5.2.7.4 Use and protection of file-based electronic information
The long-term preservation strategy should provide methods to restrict the use of file-based electronic information and to protect it from deliberate and accidental modification and destruction. File-based electronic information stored on removable media can be altered without leaving any physical evidence. File-based electronic information is also susceptible to accidental corruption when it is transferred between media and information systems. Organizations that ensure the authenticity of file-based electronic information over the long term should therefore develop appropriate policies, practices, and control technologies. Common control technology measures include:
- using WORM (i.e. non-rewritable) magnetic or optical media;
- using secure servers that prevent direct access to file-based electronic information and provide "read-only" access over the connection;
- using cyclic redundancy check technology to check the reliability of electronic transmission;
- using a standard hash algorithm (such as SHA-1), in particular to verify that the file-based electronic information has not been altered since it was generated. Such an algorithm compresses file-based electronic information into a fixed-length bit string, which effectively becomes a unique "fingerprint" of the file-based electronic information and can be used to verify that it has not been changed.
6 Long-term preservation strategy requirements
6.1 Overview
Preserving accurate, reliable, and authentic file-based electronic information means preserving information that:
- can be read and correctly interpreted by computers;
- can be presented in a format that people understand;
- has the logical and physical structure, substantive content and background information that were evident when the information was created or received.
The limited durability of electronic storage media and inevitable technological obsolescence will force preservation agencies to make critical choices for the long-term preservation and utilization of authentic, processable, file-based electronic information. Faced with the challenges of media durability and technological obsolescence, preservation agencies need to adopt different preservation strategies and use the latest tools. These strategies and tools can be conceptually divided into the following three measures, which together form the basis of long-term preservation strategies.
a) Preservation agencies should adopt media renewal to ensure the durability of the media;
b) If automated tools are available, migrating file-based electronic information from one technology platform to another is an effective way to deal with technological obsolescence;
c) When no automated migration tools are available and digital information and images are stored in obsolete systems, other measures are needed. In today's technological environment, emulation of obsolete information systems can be used. Although this method has been proposed, it faces several operational limitations for the purpose of long-term utilization of authentic file-based electronic information. Therefore, emulation technology is not further discussed in this technical report.
6.2 Media Updates
6.2.1 Overview
Limited media durability and technological obsolescence mean that periodic media updates are inevitable. Media updates are also fundamental to keeping the original bit stream "alive" so that authentic and processable file-based electronic information is preserved over the long term. Media updates require that file-based electronic information be reformatted or copied, as detailed in 6.2.2 and 6.2.3.
6.2.2 Reformatting file-based electronic information
6.2.2.1 Overview
When file-based electronic information is reformatted, it undergoes a change in the bit stream (e.g. from an 18-track medium to a 36-track medium) or a change in the character code (e.g. from 7-bit ASCII to 8-bit ASCII) as it is transferred to a different physical medium, but its presented form and actual content are not changed. Reformatting is independent of the application that generated the file-based electronic information.
6.2.2.2 Reasons for reformatting
Preservation institutions should consider reformatting file-based electronic information in the following three situations:
a) Reformatting during transfer: When file-based electronic information is transferred to a preservation institution, it should be reformatted into a standard code form and stored on standard media.
b) Reformatting at the time of upgrading: Reformatting is necessary when the storage organization upgrades equipment or replaces old storage devices with new ones.
c) Periodic reformatting: The reformatting interval should be consistent with the expected life of the media in use, the expected life of the equipment and the life of the drivers running the media.
6.2.2.3 Storage media for reformatting
The storage organization should reconsider its choice of storage media whenever it reformats file-based electronic information. A wide range of magnetic and optical technologies is available to choose from. The factors worth considering are:
- Large storage capacity and high data transfer rate;
- Minimum expected life of 20 years;
- Established solid market share;
- Economic affordability;
- Suitability.
Among these, large storage capacity and high data transfer rate are the key factors, because together they ultimately determine how long it takes to transfer file-based electronic information when the media is reformatted or copied. This may become an issue when the electronic information held by the storage organization is in the terabyte or petabyte range.
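The dependence of transfer time on volume and data transfer rate is simple arithmetic and can be sketched in Python. The function name `transfer_hours` and the figures of 1 TB and 100 MB/s are illustrative assumptions, not values taken from this guidance technical document. Real transfers would also include verification passes and media handling time.

```python
def transfer_hours(volume_bytes: float, rate_bytes_per_second: float) -> float:
    """Estimate the raw transfer time in hours as volume divided by sustained rate."""
    if rate_bytes_per_second <= 0:
        raise ValueError("transfer rate must be positive")
    return volume_bytes / rate_bytes_per_second / 3600


TERABYTE = 10**12   # decimal terabyte, an assumed unit convention
MEGABYTE = 10**6

# Assumed example: 1 TB moved at a sustained 100 MB/s.
hours_for_one_terabyte = transfer_hours(1 * TERABYTE, 100 * MEGABYTE)

# The same rate applied to 1000 TB scales the estimate linearly.
hours_for_one_petabyte = transfer_hours(1000 * TERABYTE, 100 * MEGABYTE)
```

Because the estimate scales linearly with volume, a transfer rate that is acceptable for a terabyte-sized holding may translate into months of continuous transfer for a holding a thousand times larger.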
6.2.2.4 Reformatting and Authenticity
Reformatting may compromise the authenticity of file-based electronic information, especially when the information is reformatted several times. To provide satisfactory authenticity, the storage system and storage organization should have a written quality control policy to ensure the accuracy of all reformatted file-based electronic information.
The procedures for implementing this policy should include comprehensive and complete documentation of all steps in the reformatting, including:
- identity of the person performing the operation;
- date of operation;
- data format;
- comparison of the cyclic redundancy check and hash digest values generated before and after reformatting to verify that no changes have been made;
- visual comparison of several items of reformatted file-based electronic information with the old format version.
Best practice is to identify inaccuracies or irrecoverable errors and to record them in the accompanying documentation. The physical location (e.g., block, sector, track) of any irrecoverable errors should be determined. In addition, a third party should review these operations to confirm that the procedures performed are consistent with established procedures. Such documentation should be explicitly linked to the file-based electronic information and receive the same attention as the metadata of that information.
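The before-and-after comparison of cyclic redundancy check and hash digest values can be sketched with the Python standard library. This is a minimal illustration under stated assumptions. It assumes the content bytes being compared are expected to be identical before and after the operation. It uses CRC-32 from `zlib` and defaults to SHA-256, swapped in here for the SHA-1 example mentioned in 5.2.7.4. The function name `fixity_record` and the record fields are hypothetical.

```python
import hashlib
import zlib
from typing import Iterable


def fixity_record(chunks: Iterable[bytes], algorithm: str = "sha256") -> dict:
    """Compute size, CRC-32 and hash digest over a stream of byte chunks."""
    digest = hashlib.new(algorithm)
    crc = 0
    size = 0
    for chunk in chunks:
        digest.update(chunk)
        crc = zlib.crc32(chunk, crc)   # running CRC across chunks
        size += len(chunk)
    return {
        "size_bytes": size,
        "crc32": crc & 0xFFFFFFFF,
        "algorithm": algorithm,
        "digest": digest.hexdigest(),
    }


def unchanged(before: dict, after: dict) -> bool:
    """True only if size, CRC and digest all match."""
    return before == after


# Assumed usage: record values before the operation, recompute afterwards.
before = fixity_record([b"file-based ", b"electronic information"])
after = fixity_record([b"file-based electronic information"])
is_intact = unchanged(before, after)
```

Reading in chunks keeps memory use constant for large holdings. The resulting record could be stored alongside the operator identity and date called for in the procedure above.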
6.2.2.5 Reformatting Security
The preservation institution should protect the file-based electronic information from alteration and loss during the reformatting process. Electronic storage media are susceptible to human damage, catastrophic loss and natural disasters. The preservation institution should therefore take the following measures to minimize the risk:
- install "firewalls" or one-way links (such as "air gaps") so that only authorized users have access;
- place electronic storage media in locked secure areas or controlled safes;
- make backup copies of storage media and store them in a place different from the originals;
- use two different storage media for the original and the backup to minimize the risk of unexpected technological obsolescence.
6.2.3 Copying file-based electronic information
6.2.3.1 Overview
The purpose of copying file-based electronic information is to move it from an old storage medium to a new storage medium in the same format, without changing its structure, content or context, so as to maintain its authenticity and processability. When file-based electronic information is copied to the target storage medium, its underlying bit stream remains unchanged.
6.2.3.2 Reasons for copying
Preservation agencies should consider copying file-based electronic information in the following three situations:
a) Copying during migration: When file-based electronic information is transferred to another preservation agency, it should be copied using the same storage medium format as before the transfer;
b) Copying when the media is faulty: When the readability of file-based electronic information is checked annually and a large number of samples are found to have temporary or uncorrectable read errors, it is recommended to copy the file-based electronic information without upgrading the media or equipment;
c) Periodic copying: When the storage media has aged but no media or equipment upgrade is required, file-based electronic information should be copied while the current version is still widely supported and meets the agency's needs. The agency should set a point in time (such as half of the expected media life) at which to begin copying file-based electronic information to a new version of acceptable storage media.
6.2.3.3 Copy authenticity
Although the bit stream of file-based electronic information is not changed during copying, it is still possible that it may be damaged during the process. In order to provide satisfactory authenticity, the preservation agency should have a written quality control policy to ensure the authenticity of all copies of file-based electronic information.
The procedures for implementing this policy should include a complete description of all steps in the copy, including:
- the identity of the person performing the copy;
- the time of the copy;
- the number of bits or bytes involved;
- a comparison of the cyclic redundancy check and hash digest values generated before and after the copy to verify that no changes have been made; and
- a visual comparison of the information recorded on the copy with the information recorded on the old media.
Best practice is to identify inaccuracies or unrecoverable errors and to record how they were handled in the accompanying documentation. The physical location (e.g., block, sector or track) of any unrecoverable errors should be determined. In addition, a third party should review these operations to confirm that they are consistent with established procedures. Finally, detailed information on confirmed problems should be clearly identified, explicitly linked to the file-based information, and given the same attention as the metadata of that information.
6.2.3.4 Copy security
The storage organization should protect electronic records from alteration or loss during the copy process. Electronic storage media are vulnerable to human destruction, catastrophic damage and natural disasters. Preservation institutions should take the following measures to minimize the risk:
- install a "firewall" or one-way link (such as an "air gap") so that only authorized users have read-only access;
- place electronic storage media in a locked secure area or a controlled safe;
- make backup copies of storage media and store them in a different place from the original;
- use two different storage media for the original and the backup to minimize the risk of unexpected technological obsolescence.
6.3 Metadata
6.3.1 Overview
Metadata (data about data) consists of information about the background, processing and use of information, and helps to ensure the identification, retrieval and preservation of authentic file-based electronic information. In some cases, application software can create metadata automatically, such as file size, file format, date, hash digest value and other similar attributes (such as the ownership and rental characteristics of file-based electronic information). In other cases, metadata such as classification, preservation period, fonds and keywords has to be registered manually. Both these data and the file-based electronic information are searchable. As institutions are likely to use enterprise content management systems more widely in the future, metadata elements that support preservation strategies will be used far more than they are now. In addition, they will increasingly be generated automatically, so that manual recording is no longer required. Accordingly, preservation institutions should ensure that metadata capture and use tools are flexible enough to scale to a growing number of metadata elements and to facilitate their use.
6.3.2 Interoperable metadata
In the future, metadata stored in enterprise content management systems will be interoperable. The design of metadata capture and use should take into account that the metadata may in future also be used in an interoperable environment; the designing organization should therefore consider ISO/TS 23081-1.
6.4 Migration of file-based electronic information
6.4.1 Overview
The long-term utilization strategy should include measures that provide for the migration of file-based electronic information. In order to obtain and protect authentic file-based electronic information and ensure its long-term utilization, the preservation organization holding the information faces four challenges:
a) In the foreseeable future, organizations and individuals will continue to use a variety of software packages and storage formats when generating and using file-based electronic information. As a result, it is very difficult for preservation agencies to obtain usable file-based electronic information or to support all of these software packages and storage formats;
b) Some file-based electronic information may be software-dependent and can therefore only be used in a specific software environment;
c) Operating systems and application software will inevitably be replaced by newer, faster and more versatile systems and software, which means that preservation agencies should regularly transfer file-based electronic information from the current software environment to the new environment;
d) In the absence of automatic migration tools, some file-based electronic information can only be retrieved within the legacy information system.
Migration of file-based electronic information can meet all four of these challenges. Accordingly, preservation agencies must ensure that authentic file-based electronic information is migrated from one application environment to a new one with minimal loss of structure, content and context.
6.4.2 Software Dependencies
Long-term preservation strategies should address software dependency. When file-based electronic information can only be used with specific application software, its long-term reuse is difficult, especially if the software vendor no longer supports the software or provides new versions. In many cases, software dependency can be eliminated at the cost of losing some structure. For example, text in the native format of a word processing application may be migrated to plain text (e.g., simple ASCII text) by automatically removing the embedded word processing instructions or codes. These instructions or codes control physical properties such as fonts and footnotes, and removing them reduces software dependency. Preservation agencies should carefully consider the impact of such a migration on the authenticity of the recorded information. The resulting file-based electronic information can no longer be considered an original copy, because it does not reproduce the structure of the original file-based information. It can even be argued that it should be considered a "new file" whose authenticity has to be re-established by documenting the migration activity and by demonstrating that the actual content of the file-based information has not been altered.
An alternative to migrating text-based electronic file information to plain text is to print it on paper or output it to microfilm. This process protects the authenticity of document-based electronic information and is particularly suitable for information that is to be kept for the long term and can later be made usable again through optical character recognition (OCR).
Structured and relational databases can also be migrated to flat table structures to reduce dependence on special software. When the relational links are removed, the primary key and foreign key values in each table should be retained.
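As an illustration of migrating a relational database to flat tables, the following minimal Python sketch exports each table to delimited text while keeping the columns that identify rows and link tables (primary and foreign keys). The schema (`fonds`, `record`), the column names and the helper `flatten_tables` are hypothetical and are not part of this guidance; SQLite and CSV are used only because both ship with Python.

```python
import csv
import io
import sqlite3


def flatten_tables(conn, tables):
    """Export each table to CSV text, keeping every column, key columns included."""
    flat = {}
    for table in tables:
        # Table names come from a fixed list in this sketch, not from user input.
        cursor = conn.execute(f"SELECT * FROM {table} ORDER BY 1")
        header = [column[0] for column in cursor.description]
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(cursor.fetchall())
        flat[table] = buffer.getvalue()
    return flat


# Hypothetical two-table database: each record belongs to one fonds.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fonds (fonds_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE record (
        record_id INTEGER PRIMARY KEY,
        fonds_id INTEGER REFERENCES fonds(fonds_id),
        subject TEXT
    );
    INSERT INTO fonds VALUES (1, 'Finance');
    INSERT INTO record VALUES (10, 1, 'Budget 2004');
    INSERT INTO record VALUES (11, 1, 'Audit 2004');
    """
)

flat = flatten_tables(conn, ["fonds", "record"])
print(flat["record"])
```

In this sketch the flat `record` table still carries both `record_id` and `fonds_id`, so the link to `fonds` could be rebuilt later without the original database software.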
Whenever possible, metadata should be established to identify whether these relationships are one-to-one, one-to-many, many-to-one, or many-to-many, so that the relationships can be re-established in the future.
6.4.3 Software Upgrades and Installations
For preservation organizations that provide long-term access to authentic document-based electronic information, software upgrades and the installation of new software are inevitable. Long-term access strategies should provide policies and procedures for dealing with this. When software is upgraded (for example, from version 1 to version 2) and the developer provides upward compatibility, file-based electronic information should be migrated to the new environment with its physical attributes, actual content and context preserved. When new software replaces existing software, or as part of an upgrade to a stand-alone application or to a general information system, file-based information should be migrated using the export features of the old system and the import features of the new system. In addition, some environments support migration (e.g., from one word processing program to another) by providing export/import paths designed for specific application formats.
6.4.4 Migration to Standard Formats
The depository should consider migrating the many formats of file-based electronic information encountered when information is produced or received to a small number of standardized formats. Standardized formats are uniform formats that are widely used and cover most types of file-based electronic information. Proprietary file formats should be avoided. Technology-neutral formats worth considering are PDF/A-1, XML, TIFF and JPEG.
6.4.5 Migration of File-Based Electronic Information from Obsolete Information Systems
6.4.5.1 Overview
When there is neither upward compatibility nor an export/import mechanism between the obsolete system in which file-based electronic information is held and the target information system, the long-term preservation strategy is to migrate the information in a way that ensures its authenticity and processability. In the future, the need to migrate file-based electronic information from obsolete information systems may decline as systems that support technology-neutral structures and formats become widespread. In the meantime, however, preservation agencies will have to migrate file-based electronic information from obsolete information systems to fulfill their responsibilities.
In repeated migrations, information loss is inevitable due to the basic incompatibility between old and new generations of systems. Therefore, rather than trying to ensure that there is no loss of information, preservation agencies should consider developing migration strategies and quality control procedures to reduce information loss during the migration process. It is important to record information loss and quality control results during migration. Whenever possible, such records should be retained with the media.
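One possible way to record information loss and quality control results is a structured log entry for each migrated object. The sketch below is illustrative only: the `MigrationRecord` fields, the `record_migration` helper and the sample values are assumptions, not requirements of this guidance. SHA-256 digests are included to identify exactly which source and target objects an entry describes; the two digests are expected to differ whenever the format changes.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MigrationRecord:
    """One log entry per migrated object (hypothetical field names)."""
    object_name: str
    source_format: str
    target_format: str
    source_sha256: str
    target_sha256: str
    losses: list = field(default_factory=list)      # known information loss
    qc_results: dict = field(default_factory=dict)  # quality control outcomes


def record_migration(object_name, source_bytes, target_bytes,
                     source_format, target_format, losses, qc_results):
    """Build a log entry that ties the losses and checks to specific objects."""
    return MigrationRecord(
        object_name=object_name,
        source_format=source_format,
        target_format=target_format,
        source_sha256=hashlib.sha256(source_bytes).hexdigest(),
        target_sha256=hashlib.sha256(target_bytes).hexdigest(),
        losses=list(losses),
        qc_results=dict(qc_results),
    )


# Sample data: markup is dropped when migrating to plain text.
source = b"<b>Annual report</b> See note 1."
target = b"Annual report See note 1."

entry = record_migration(
    "report-2004", source, target,
    source_format="marked-up text",
    target_format="plain text",
    losses=["bold formatting removed"],
    qc_results={"text_content_preserved": b"Annual report" in target},
)

# Serialize so the entry can be stored alongside the migrated media.
log_line = json.dumps(asdict(entry), sort_keys=True)
print(log_line)
```

Keeping each entry as a single serialized line makes it straightforward to store the log with the media it describes.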
6.4.5.2 Migration Steps
6.4.5.2.1 Overview
The preservation agency should carry out the migration incrementally, in ten steps. Because every migration environment is different, the ten steps outlined below may not all be appropriate for a specific migration environment.
6.4.5.2.2 Analysis of Obsolete Information Systems (Step 1)
The preservation agency should analyze the obsolete information systems to understand their capabilities and the record-based information they contain. This includes:
- the basis for their various functions;
- how metadata is obtained and how metadata relates to record-based information.
The information products produced during this phase should be described in detail so that the functionality, metadata and record-based information can be carried over to the new system in the next phase.
6.4.5.2.3 Decompose the legacy information system structure (Step 2)
The preservation organization should decompose the legacy information system structure so that its interfaces, applications and database services become separate components. This is not possible for every information system, as described below.
- If the system and user interfaces, application modules, database services and the database itself are separate and independent parts, the legacy system is decomposable.
- If the interface and database are independent, but the application system and database services form a single module, the legacy system is semi-decomposable.
- If the interface, application system and database services are combined in a single module, the legacy system is not decomposable.
In any case, to facilitate migration, external dependencies of the system structure should be removed.
6.4.5.2.4 Design the target interface (Step 3)
The target interface should provide a link to the legacy interface.
6.4.5.2.5 Design the target application (Step 4)
The target application should provide a link to the legacy application.
6.4.5.2.6 Design the target database (Step 5)
The target database should provide a link to the legacy database.
6.4.5.2.7 Install and fully test the target environment (Step 6)
An open target environment with appropriate installation tools should be identified, selected, installed and fully tested.
6.4.5.2.8 Generate and install necessary gateways (Step 7)
Gateways should be designed, generated and installed to ensure that the functionality of the obsolete system is replicated consistently and accurately in the target system and that file-based electronic information can be migrated. Gateways usually have two functions: one is to isolate selected parts from being affected by other changes; the other is to form a