Machine readable records: procedures and practices to date
1.0 The procedures and practices for archiving machine readable or electronic records have a very short history compared with those for other forms of information such as documents and manuscripts. Only a handful of national repositories have begun to acquire and preserve such records, despite the fact that computers have manipulated records since the early fifties. Before exploring the impact that today's electronic records may have on traditional archival practice, it is important to review briefly what has happened to date.
1.1 Machine readable records were first created when machines took over the processing of large amounts of data that had previously been handled manually. Census processing is perhaps the prime example: U.S. census records were processed using Herman Hollerith's tabulating machine as early as 1890, yielding a simple count of the census returns in six weeks and a full scale analysis in two and a half years. The end of the Second World War, however, marks the first major impetus to develop computer technology. The use of the computer by governments (outside of the military), business and industry took off slowly and within some specific boundaries. Much of the information processed in the fifties and sixties remained administrative and housekeeping in nature. Processing of pay cheques, inventory control and personnel information were the primary purposes to which the machines were dedicated. In 1952, the CBS television network used a UNIVAC computer to predict the outcome of the U.S. presidential election.
1.2 Although computers collected and manipulated information, their use remained in the hands of computer specialists because of the technical knowledge required to program the machines. The user of the information was not consulted as to his or her requirements, and the electronic data processing (EDP) or automated data processing (ADP) areas developed quite apart from the administrative and records keeping operations of an agency. New jargon and work procedures developed within the EDP areas which bore little relationship to the records keeping systems and terminology used elsewhere in the agency. The role of government officials and business executives in this new field of development was limited to approval of funds for new systems. An EDP or ADP culture, responsible for the creation and maintenance of enormous amounts of information, developed quite separately from other record keeping units of the organization.
1.3 This development led to a mystique which surrounded the ADP areas for a very long time. Traditional methods of capturing information were not applicable to electronic records. The need for technical expertise to use the computer continued to keep archivists and records managers away from the operation. The information being created by computers was not viewed as records but as data, the separation being very distinct.
1.4 The increasing use and importance of computers was gradually recognized by the large national archival repositories in a number of countries through the establishment of machine readable records programs. Over the years, standards were developed for the appraisal, processing, conservation and servicing of machine readable records. Because so few archives and archivists were involved in such programs, the procedures for handling the records were developed through a great deal of cooperation and sharing of information.
1.5 Appraisal guidelines developed for machine readable records recognized both the similarities to and the differences from other forms of archival material. The guidelines recommended the evaluation of machine readable records along the traditional lines of evidential value, informational value, and legal value. The form of the records permitted the use of the information in many more ways than paper. The technical analysis of the records posed the more difficult problems for the archivist appraising machine readable records. An understanding of computer systems was required in order to determine the ease with which the records could be acquired. The direction was to acquire the records in a software independent form. For many of the systems developed during the sixties and early seventies, this was a reasonable expectation. Machine readable records which depended upon specific software posed major conservation problems. Software dependent records would require archives to become collectors not only of the records but also of the software and the hardware, an impossible situation.
1.6 The objective of the technical analysis and the processing of the records was to ensure that the machine readable records of archival value could be conserved and disseminated to researchers in a software and hardware independent form. With many of the administrative and housekeeping systems which existed at this time, this was not a major difficulty to overcome.
1.7 The verification and description of machine readable records required more immediate attention than those of their paper counterparts. On receipt of the records, the archivist was required to verify their content using statistical programs which indicated whether the documentation (record layout or codebook) provided with the data was accurate. In most cases, and depending on the level of detail of the verification completed, it was impossible to ensure that all of the data were accurate.
The documentation accompanying the records was organized and augmented to ensure that the data could be used with as little help as possible from the archives. The description of the records required a series of levels: the codebook being the most detailed description of the records and equivalent to the traditional finding aid, from which more general descriptions were created.
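The verification step described in 1.7 can be illustrated with a minimal sketch. It assumes a fixed-width data file and a record layout listing column positions and permitted codes. The field names, column positions, code lists and sample records below are hypothetical and are not taken from any actual archival program. The sketch produces frequency counts for each coded field and lists records which do not agree with the layout.

```python
from collections import Counter

# Hypothetical record layout: field name -> (start column, end column, allowed codes).
# Columns are 1-based and inclusive, as in a printed codebook.
# A code list of None marks a numeric field with no code list.
LAYOUT = {
    "region": (1, 2, {"01", "02", "03"}),
    "sex": (3, 3, {"1", "2"}),
    "age": (4, 6, None),
}
RECORD_LENGTH = 6  # assumed fixed record length


def verify(records, layout=LAYOUT, record_length=RECORD_LENGTH):
    """Return frequency counts per coded field and a list of discrepancies."""
    counts = {name: Counter() for name, (_, _, codes) in layout.items() if codes}
    problems = []
    for number, line in enumerate(records, start=1):
        # A record of the wrong length cannot be matched to the layout at all.
        if len(line) != record_length:
            problems.append((number, "length", len(line)))
            continue
        for name, (start, end, codes) in layout.items():
            value = line[start - 1:end]
            if codes is None:
                # Numeric field: allow padding blanks, reject anything else.
                if not value.strip().isdigit():
                    problems.append((number, name, value))
            else:
                counts[name][value] += 1
                if value not in codes:
                    problems.append((number, name, value))
    return counts, problems


# Four hypothetical records; the third and fourth disagree with the layout.
sample = ["011034", "022 27", "09104A", "0310"]
counts, problems = verify(sample)
for problem in problems:
    print(problem)
```

A check of this kind shows only whether the data agree with the documentation. It cannot show that the values themselves are correct, which is the limitation noted in 1.7.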
1.8 Anglo-American Cataloguing Rules were used to provide standardization for descriptive entries. Adaptations of these rules were developed in a number of archival institutions to incorporate specific archival requirements such as provenance and record group.
1.9 The conservation of machine readable records has always required a proactive approach. The medium most commonly used to date has been magnetic tape, which has proved the least expensive and the most stable of the magnetic media. Magnetic tapes require an environmentally controlled, dust free storage area away from magnetic fields. Machine readable data files stored on magnetic tape must be rewound and recopied periodically. At the time of recopying, the new file is written according to whatever new technical standards have been imposed by changes in hardware. This preservation work is labour intensive and costly. For small tape libraries this kind of work is manageable, but as new tapes are added, both storage space and the personnel needed to clean and rewind the tapes must be increased.
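The periodic rewinding and recopying described in 1.9 amounts to a maintenance schedule, which can be sketched as follows. The intervals below are assumptions chosen only for illustration, since the text gives no figures, and the tape identifiers and dates are hypothetical.

```python
from datetime import date, timedelta

# Assumed intervals for illustration only; the text does not specify any.
REWIND_EVERY = timedelta(days=365)
RECOPY_EVERY = timedelta(days=365 * 10)


def tasks_due(tapes, today):
    """List (tape id, task) pairs whose interval has elapsed by `today`."""
    due = []
    for tape_id, last_rewound, last_copied in tapes:
        if today - last_copied >= RECOPY_EVERY:
            # In this sketch a due recopy takes precedence over a rewind.
            due.append((tape_id, "recopy"))
        elif today - last_rewound >= REWIND_EVERY:
            due.append((tape_id, "rewind"))
    return due


# Hypothetical tape library: (tape id, date last rewound, date last copied).
library = [
    ("T001", date(1988, 3, 1), date(1978, 3, 1)),
    ("T002", date(1987, 1, 15), date(1985, 1, 15)),
    ("T003", date(1988, 11, 1), date(1986, 11, 1)),
]
due = tasks_due(library, today=date(1989, 6, 1))
for tape_id, task in due:
    print(tape_id, task)
```

Every tape added to the library adds an entry which must be checked and eventually acted upon, which is why the workload grows with the size of the holdings.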
1.10 The servicing of the records has to date been limited to researchers who have access to computing facilities. Most archives make copies of the records and provide supporting documentation to the user who then processes the records offsite. Reference work involves access by subject more frequently than access by provenance. As most descriptions of files are indexed by subject, the researcher is directed to the most appropriate files containing the desired information. A certain level of technical knowledge is required by the researcher in order to use machine readable records in secondary analysis.
1.11 This chapter has given a very general review of the procedures developed to date for the archiving of machine readable records. The procedures developed have followed as closely as possible the procedures for the preservation of traditional archival records. The types of data being created at the time of development lent themselves to these procedures. In order to understand fully the changes which machine readable or electronic records will create, it is important to know how the records have been handled over the past twenty years.
1.12 In most archival repositories only minimal success has been achieved in the identification and subsequent transfer of machine readable records of archival value. Because traditional recordkeeping practices exercised little control over the creation and storage of the records, it has been difficult to identify machine readable records for archival evaluation. This has been the most difficult problem to resolve.