

3. Advantages and problems of computerisation

Advantages
Problems

 

Advantages

To describe the problems associated with the computerisation of archival finding aids without first pausing to consider some of the advantages would be to misrepresent the predominant mood among archivists today. There is no doubt that real benefits are being widely felt from computerisation. This is fully borne out by current professional and technical literature as well as by the responses to the questionnaire for this survey.

The major benefits include:

Enhanced control by archivists over the growing mountains of information of all kinds at their disposal, whether concerning management and physical control of the archives on the one hand or intellectual control over their arrangement and content on the other.

A wider vision of what 'information' is and how archival information relates to other aspects of national and international information policy and resources. This is resulting in more dialogue both among archivists themselves and between archivists and other 'information professionals', especially librarians and museum curators, and a greater desire on the part of archivists to be represented in the technical discussions which will affect their future.

The breaking down of other conceptual barriers, such as the distinction between paper and non-paper, or 'traditional' and 'new' media. This is resulting increasingly in moves towards the integration of information management irrespective of medium, and affording scholars working on the archives, as well as archivists themselves, opportunities for better-targeted searches throughout a wider field of information.

The incentive to overhaul traditional practices of archival description, to achieve greater consistency and precision, and sometimes a better quality of output for finding aids, whether within a single archives or at national or international levels. This is resulting in discussions of agreed national and international standards in this field, and the development of the means of exchanging data.

Many individual practical benefits are described at greater length below. These were encapsulated by one respondent as 'savings in energy, cost and time', the energy in this case being that of the staff rather than of the power required to drive the machines! In many applications the computer has obviated whole processes including form-filling, drafting and re-drafting in manuscript and typescript, correction and re-typing and proof-reading, although a number of archives still appear to prefer to make their initial data capture in manuscript on paper or forms, in order to facilitate checking, rather than to have the work done directly at computer terminals.

Savings in cost continue to defy quantification. Only a small minority of respondents admitted to having undertaken any cost-benefit analysis in this specific field, although many could point to 'reduced unit costs' or 'reduced handling time' per transaction.

Australian Archives, for example, estimates that the RINSE application (see above, p.12) has resulted in a 25 per cent saving in time for processing data, and makes the information available in the 8 offices of the Archives within 24 hours where the previous average might have been 6 months.

A corollary of the computer's tackling the old jobs at greater speed is that it allows time for more jobs to be done. It has been a special source of satisfaction that backlogs of work have been seen to dwindle; indeed some have felt this to be the principal benefit from computerisation.

The third practical benefit from computerisation has been in opening up entirely new possibilities in information management. Most significantly, as explained in the previous chapter, information can now be dynamic, growing continuously as data is accrued. Users can be offered completely up-to-date information through interactive, online searches. They can have answers more precisely matching their questions, or even customised guides to their research material.

These options are not of course open to all archives, but almost all those using computers have derived other new benefits. With the right kind of hardware and software, as one respondent put it, you pay once for the data input, but then you have a multiplicity of options for using it.

Reported advantages include automatically-generated indexes for finding aids which would previously have had none; cumulative indexes combining data from a number of different sources; more precise searches and a greater rate of 'hits'; the re-ordering of disordered or poorly catalogued items within an inventory; and amendment facilities enabling a word or phrase to be changed wherever it occurs throughout a finding aid or database, from a single input.
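Two of the facilities just listed, the automatically generated index and the re-ordering of entries, can be sketched in a few lines of Python. This is an illustration only, not a description of any system reported to the survey: the entries, the field names `ref` and `title`, and the short stop-word list are all invented for the purpose.

```python
from collections import defaultdict

# Hypothetical finding-aid entries, in the order in which they were described.
entries = [
    {"ref": "ADM 3", "title": "Log of the ship Victory"},
    {"ref": "ADM 1", "title": "Muster roll of the ship Ajax"},
    {"ref": "ADM 2", "title": "Log of the ship Ajax"},
]

STOP_WORDS = {"of", "the"}  # assumed list of words too common to index


def build_index(entries):
    """Map each significant word to the sorted references where it occurs."""
    index = defaultdict(set)
    for entry in entries:
        for word in entry["title"].lower().split():
            if word not in STOP_WORDS:
                index[word].add(entry["ref"])
    return {word: sorted(refs) for word, refs in sorted(index.items())}


def reorder(entries):
    """Return the entries re-ordered by reference, leaving the input untouched."""
    return sorted(entries, key=lambda entry: entry["ref"])


index = build_index(entries)
ordered_refs = [entry["ref"] for entry in reorder(entries)]
```

The same keyed data serves both purposes, which is the point made by the respondent quoted above: the input is paid for once and reused. A real application would need a more careful sort key, since a plain text sort places a reference such as "ADM 10" before "ADM 2".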

To take just a few examples of operations which were not practical before computerisation but have now become so:

The French National Archives' EGERIE database for searching the contents of the Etat general des fonds has been found to have a number of advantages over conventional finding aids. First, of course, it gives ready access to summary information covering a great number of fonds, but more particularly it does so at speed, with the facility for searches by key-word, and by combinations of key-words using Boolean operators, across several fields (Cloulas 1985).

In the United Kingdom, the Public Record Office has used a simple database application to re-list in box order a class of Admiralty records formerly arranged alphabetically by name of ship. At a very early stage in its explorations of computerisation the PRO discovered the potential of this kind of program for describing, in the order in which they were inspected, badly damaged records from a single class which could not at that stage be arranged without further damage, and re-ordering the information into a logical class list by automated means (Post 1988).

In Italy, the telegrams of the Coding Bureau (Ufficio Cifra) which were badly decayed were transcribed into a machine-readable format for consultation, in order to save wear and tear on the originals. The system simultaneously provided an index of sender, addressee and subject (Mariani-Rinaldi 1981).

The general archives of Belgium compiled by automated means an improved analysis of the correspondence of the Count of Mercy-Argenteau (now held in Austria), using a simple database application (Pieyns-Rigo 1985).

The British Library Department of Western Manuscripts has developed a program using REVELATION software to supply appropriate MARC tags for all the automated indexing data for its Summary catalogues of Manuscripts: the indexers require no knowledge of MARC.
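The kind of search mentioned in the first of these examples, by combinations of key-words using Boolean operators across several fields, can be illustrated with Python sets. The sketch below is not the EGERIE implementation, whose internals are not described here; the records, the field names `title` and `scope`, and the function name are assumptions made for illustration.

```python
# Hypothetical summary records; the field names are assumptions.
records = [
    {"id": 1, "title": "Ministry of Marine correspondence",
     "scope": "naval dockyards and ports"},
    {"id": 2, "title": "Royal household accounts",
     "scope": "expenditure on ports and roads"},
    {"id": 3, "title": "Ministry of Finance registers",
     "scope": "taxation"},
]

FIELDS = ("title", "scope")


def ids_matching(keyword, records, fields=FIELDS):
    """Return the set of record ids whose chosen fields contain the key-word."""
    keyword = keyword.lower()
    return {
        record["id"]
        for record in records
        if any(keyword in record[field].lower().split() for field in fields)
    }


ministry = ids_matching("ministry", records)
ports = ids_matching("ports", records)

both = ministry & ports                # AND: records containing both key-words
either = ministry | ports              # OR: records containing at least one
ministry_not_ports = ministry - ports  # NOT: the first key-word without the second
```

Each key-word yields a set of matching records, and the Boolean operators then reduce to set intersection, union and difference.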

The ability of an archives to derive benefit from computer applications to its finding aids, however, varies according to local circumstances. Any list of the factors determining success would have to include the following:

A robust attitude of mind, determined wherever possible to use the computer to serve the needs of the archives rather than the other way round. This calls for careful systems analysis in advance of computerisation to determine not only the ways in which information is currently handled but also how this will be equalled or improved in a computerised environment, at what cost, and with what consequences for working practices and for the structure of the organisation (O'Neill 1986).

The ability, or in some cases the freedom, to identify appropriate software and hardware to fulfil the requirements for the system.

Access to the financial and manpower resources, as well as the technical skills, to implement, maintain and develop the system and to derive the greatest benefits which it is capable of delivering.

Consultation with, and appropriate training for, the staff involved in implementing, maintaining and exploiting the system.

The development of consistency in the methods of archival description.

The opportunity to exchange ideas and discuss problems as widely as possible with other archivists engaged in computerisation, particularly those using the same or similar systems.

The lack of some or all of these preconditions has caused many of the problems reported during this survey that are discussed below.

Problems

Many of the problems associated with archival computerisation have already been identified in the earlier RAMP study by Michael Cook (1986 pp. 32-40). But it is evident from returns to the questionnaire and discussions with individual archivists that the same problems continue to be experienced.

Once again it should not be overlooked that the needs and scale of requirements of any two archives can be very different. The ideal computer system for one may not at all suit the next, and the most important considerations are that the system chosen should as nearly as possible match the needs of the institution and be sustainable within its present or expected resources and skills. If for the time being this means limiting horizons to a word processor or a small database application on a microcomputer, that option should be explored.

Even the humblest first-time user, however, must be aware that aspirations and opportunities, as well as the technology, change as one is in the very course of familiarisation with a system and its advantages or limitations. If there is a short- to medium-term goal of a higher kind, such as the creation of larger, relational databases or the exchange of information with other institutions, care should be taken from the start if possible to do nothing that would make this more difficult. The kind of practical problems recorded below, however, may determine that decisions and progress have to be made in the light of existing constraints, setting aside some of the higher aspirations.

Software and hardware

The identification of software and hardware appropriate to the task was one of the problems most commonly cited in response to the questionnaire.

Where the archives has freedom of choice in the matter, the selection of the right software is often the higher priority, in the form either of a commercial package which will satisfy all or most of the operational requirement arising from the systems analysis, or of software which can be developed and adapted as necessary.

The archivist must be alert, however, to the difference between, on the one hand, the minor compromise to meet constraints intrinsic to a particular system (such as, for example, a simplified form of alphabetisation in indexing, as reported in the Directory of archives and manuscript repositories in the United States (1988, p. XII)), and, on the other hand, compromises of such magnitude that they result in major inconvenience, as where a system can work only in structured fields where free text is required, or only in controlled language where natural language would be preferable. Whilst it may be true that even an imperfect system may be used as a means of familiarisation with computers (MacDermaid 1990), it is better if possible to avoid such frustrations.

The notorious tendency to obsolescence of both hardware and software has given many small repositories in particular problems in knowing how and when to leap on the computer band-wagon. Much hardware has a lifespan of five years or less. Commercial software is being developed almost continuously, and the ability to switch to the most recent releases/editions can bring considerable advantages. Viewed as a purely financial problem, this has been somewhat mitigated in recent years by the steep relative fall in prices of microcomputers and their software, and the great increase in their power and capabilities which in turn is leading to much less reliance than in the past upon mainframe computers, and to the rapid development of distributed systems and local networks.

The Society of American Archivists has appointed an automation program officer to advise members on computer applications, a practice which could be widely commended at national level elsewhere unless there is already an appropriate advisory infrastructure through the central direction of archives.

In countries where computer applications in archives are numerous, it is an increasingly common practice, as in the United Kingdom, for archivists to form 'user groups', in which those having the same software application can exchange advice, and even make representations to manufacturers.

Commercially available software applications have been found entirely satisfactory by many archives for word processing applications and for creating simple databases. But few respondents to the questionnaire were so enthusiastic about their current software as to wish to recommend it to others internationally.

TEXTO was commended by the Centre des Archives Contemporaines, Fontainebleau, France as being user-friendly, requiring no previous knowledge of computers, and offering good interrogation and output facilities.

ASK SAM was recommended by the Archives Centre at Maribor, Yugoslavia, for its ability to handle both free text and structured information.

Unesco's CDS/ISIS and its derivatives (with users as far afield as Hungary and Zimbabwe, Canada and China, the Soviet Union and Portugal), have a growing and evidently enthusiastic following. Originally designed with libraries in mind, they have been found very suitable also for archival applications.

Many national archives, sometimes in addition to using standard packages, have had to have software developed to meet their specific needs, either by their own in-house specialists or in collaboration with, for example, government or university computing centres or with regional or local archives (in Belgium, Liege university; in Sweden, Lund regional archives). Applications thus tailored for use in one archives should in theory have wide potential in others, but in relatively few countries, it seems, has there yet been a significant move towards standardising the software in use throughout the various archives (e.g. MAIS in the Netherlands, ARQBASE in Portugal, with similar developments under way in Norway and Sweden). Still less have such tailor-made applications yet been known to cross national boundaries. Among examples reported to the survey was MAIS, which is being translated into French for experimental use in Belgium.

Choice of hardware and software may be artificially restricted by considerations outside the archivist's control. The desired application, if foreign, may not be readily available owing to cost or current trading practices. The archivist based in, say, a library environment may be obliged to use the same system as the parent authority, even if its primary use was for a method of bibliographical control which is not in every respect suited to archives. At the level of local or central government, the parent authority will commonly have existing computer applications designed to meet its own administrative requirements, with the result that the archives may have to adopt the same or a compatible system if it is to make any cost-effective and properly supported advance in computerisation, regardless of whether that offers the best solutions for the archives. Worse, these imposed systems may be dependent on a particular kind of hardware or software which might not be compatible with the potential electronic transfer of information between archives. It is in order to overcome the last problem that the MARC: AMC format, which is not system-dependent, has been devised in the USA (see p.43). To quote one authority, 'the archivist needs to be adaptable and pragmatic in order to make reasonable use of what actually is available' (Post 1988).

Incompatibility between one system and another continues to cause inconvenience, to put it no more strongly.

The Direction des Archives de France, for example, is one apparently centralised authority which does not in practice control the purchase of hardware and software by the regional (departmental) administrations (see, for example, Gazette des Archives, 141 (1988), p.133).

In the United Kingdom, the Public Record Office has no central control over the archives of local government. As microcomputers have burgeoned around the country the prospect of any unified national approach to software applications has vanished (Roper 1989). Nor does the PRO control the software used in government departments by those drafting class lists of records for transfer to the archives. It has, however, recently standardised most of its own in-house software applications to simplify management and control.

Choosing software is only a preliminary to making it satisfy the user's requirements. But that too has given rise to its own particular problems: lack of senior staff time to plan and supervise the application; lack of understanding of archival concepts by outside developers and of computer concepts by senior staff in the archives; lack of staff training to develop the software properly and recognise any snags; strong resistance from professional colleagues trained in a pre-computer age or on earlier computer systems; lack of technical support from the manufacturer or developer.

Regrettably, it is also the case that commercially available software is often furnished with poorly written manuals and may need expert intervention before it becomes fully user-friendly. This depends, however, to some extent on the technical knowledge or training of the user. The number of purely technical problems that have not been solved by someone is relatively small, but most archival users operate in a restricted computer environment, some in total isolation, and this has the effect of magnifying the inconvenience of problems experienced.

In all these matters, the low profile of archives in relation to other services at both local and national levels continues to attract comment. To make only the comparison with libraries and museums, archives are fewer in number and the profession much smaller. Apart from the obvious difficulties which this presents in attracting resources in the first place, there are other problems specific to the present discussion. As a specialised market for computer applications, archives has been too small to attract real interest from most software manufacturers. On another plane, archivists have found it difficult to achieve adequate representation on some national Standards committees or to make their voice heard alongside the stronger lobby of librarians. It is noticeable, particularly in the field of standardisation, that many achievements have come about largely through the determined efforts of individuals or groups of archivists, persistently working through complex problems with at best unofficial backing from their professional bodies (which generally lack the funds to make a greater commitment on any one front at a given moment, and have in any case many other issues to tackle simultaneously). But the responsibility cannot indefinitely rest with individuals: sooner or later the initiative has to pass to some form of national organisation.

Training

The complexity of computer technology and its rapid development have given rise to another set of problems. The number of archivists who are fully cognisant in this field is tiny. There is a great need for more training, both at national and international levels, nowadays not only basic familiarisation with computers but increasingly in-depth training to equip at least some specialists in each country, or in groups of countries such as the regional branches of ICA, with the vision to lead archives sensibly through what might otherwise be a minefield.

Finance

Finance remains a problem everywhere, and for all the cost benefits that can be marshalled in its defence there is no contesting that computerisation can be a very expensive process.

Virtually every respondent to the questionnaire reported that the entire charge of computer development had fallen upon the budget of the institution itself. Commercial and private sponsorship is extremely rare at national level, though it has been achieved in Spain in the Archive of the Indies optical disk project (see p.55), and special subventions from other public funds have been made available to support research and development as in Canada, the United States and the United Kingdom (see p.47). A number of universities and colleges, as in Canada (MacDermaid 1990), have obtained grant aid for the purchase of computers.

Finding the money even to take the most tentative first steps is a hurdle for some of the developing countries, where the millions of dollars made available (we might now almost say required) in North America for computer developments in archives can only be looked on with envy if not disbelief. The ICA with Unesco might wish to explore further financial as well as technical aid for appropriate computerisation projects in these countries.

Data capture and back-loading

The costs and problems associated with the capture of data from those finding aids which were completed before the advent of computerisation and from finding aids produced by word processing rather than in a structured form, are widely recognised.

Archives faced with a formidable quantity of back data have often chosen, at least for the present, to ignore the problem and computerise only newly accruing data. Some, on the other hand, like the Archives Nationales in France, Australian Archives and the National Register of Archives (UK), have embarked upon major programmes of backloading in order to establish comprehensive databases. This can be a massive manpower commitment and may involve the employment of additional, sometimes agency, staff. But the end results have been found to be worthwhile.

The larger archives have generally chosen to computerise mainly selected information at the fonds level, geared to the production of repository or topic guides. Some, however, as in Norway, are moving towards a new description format for all information even down to item level, and for smaller (eg local and specialist) archives elsewhere this may be a realistic option. It is provided for in the exchange formats discussed in the next chapter.

The methods of data capture vary from rekeying the information from the original means of reference to scanning it digitally or by optical character recognition (OCR) techniques. Several respondents including those from Norway and the United States reported major concern over capturing word-processed text for computer databases, whilst Hungary reported solving this problem. With some commercial software it is now feasible to tag an already word-processed file and download it into a relational database for merger with other data. The problem is rather one of scale, both in terms of the volume of work to be done and the size of the resulting files of data as the process is repeated frequently.
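As a rough illustration of tagging word-processed text and loading it into a relational database, the following Python sketch parses tagged lines into fields and stores them in an in-memory SQLite table. The `@REF`, `@TITLE` and `@DATES` markers, the sample text and the table layout are assumptions made for this example; they do not reproduce the conventions of any commercial package mentioned by respondents.

```python
import re
import sqlite3

# Hypothetical word-processed text after tagging; the @TAG convention is assumed.
tagged_text = """\
@REF A/1 @TITLE Minute book of the council @DATES 1850-1862
@REF A/2 @TITLE Letter book @DATES 1863
"""

# A tag, then its value, running up to the next tag or the end of the line.
TAG_PATTERN = re.compile(r"@(REF|TITLE|DATES)\s+(.*?)(?=\s+@|$)")


def parse_line(line):
    """Split one tagged line into a dict of lower-case field names to values."""
    return {tag.lower(): value.strip() for tag, value in TAG_PATTERN.findall(line)}


rows = [parse_line(line) for line in tagged_text.splitlines() if line.strip()]

# Load the structured rows into a relational table held in memory.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE entries (ref TEXT PRIMARY KEY, title TEXT, dates TEXT)"
)
connection.executemany(
    "INSERT INTO entries (ref, title, dates) VALUES (:ref, :title, :dates)", rows
)
titles = [row[0] for row in connection.execute("SELECT title FROM entries ORDER BY ref")]
```

The parsing step is trivial for a few lines. The difficulty noted above lies in repeating it reliably over a large volume of existing finding aids whose layout was never designed with fields in mind.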

The traditional methods of marking the original finding aids, or a photocopy of them, with the symbols and comments necessary to indicate to a keyboarder the beginnings and endings of each field of data, or to supply information previously assumed or left blank, have been fully described (Arad 1981 (2), 1987; Cloulas 1985). But these are labour-intensive and do not suit the needs of every archives. Data such as geographical locations or country names which could be left implicit in a manual system may have to be supplied to enable the computer to make a comprehensive and successful search. Associated and linked words may need to be indicated, and if the system is to be exploited to the full it may be necessary to indicate or supply key-words or explanations of context to assist searches and indexes.

Access by topic, by personal or place name, and by medium or format of the records described are all being developed alongside the traditional access by provenance and by call number. There is in particular a growing demand for topic-based access which requires more subject indexing than has traditionally been provided for in national archives but which seems increasingly necessary where information on archives is to be merged in databases with, for example, bibliographical information. But this cannot be achieved without a commitment of resources.

Those just beginning may be able to avoid entirely some of the problems rehearsed above, by using a completely free-text (inverted file) system such as STATUS (Woolgar 1988) or one of the new systems now appearing commercially which claim to be able to merge free text and structured file data.
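The principle of an inverted file can be shown in miniature: every word in the free text is recorded against the descriptions in which it occurs, so that no fields need be marked in advance. The Python sketch below is a general illustration of that principle and makes no claim about how STATUS or any other product is built; the sample descriptions and function names are invented.

```python
from collections import defaultdict

# Hypothetical free-text descriptions, numbered as they were entered.
descriptions = {
    1: "Letters from the harbour master concerning repairs",
    2: "Accounts for repairs to the town hall",
    3: "Letters patent granting the market",
}


def build_inverted_file(descriptions):
    """Map every word to the set of description numbers that contain it."""
    inverted = defaultdict(set)
    for number, text in descriptions.items():
        for word in text.lower().split():
            inverted[word].add(number)
    return inverted


def search(inverted, *words):
    """Return the numbers of descriptions containing all of the given words."""
    postings = [inverted.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*postings)) if postings else []


inverted = build_inverted_file(descriptions)
hits = search(inverted, "letters", "repairs")
```

Because the file is built from the text as it stands, the quality of searching depends entirely on the wording of the descriptions themselves.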

Only a few respondents reported real success with OCR as a means of data capture. First of all, it requires additional equipment which may not be available. The Soviet Union has found it unsuitable for its earlier finding aids in manuscript, and the Public Record Office (UK) has found it unsuitable because the earlier data is not adequately structured. Norway also reports that its early experiments have not been promising. But the capabilities of this technology are advancing rapidly (Allen 1986, Gillett 1988).

As data from a number of different sources accumulates, some of it may well be found to duplicate existing information ('data redundancy') or to express it in a slightly different way ('data anomaly') - problems highlighted by the National Archives of Canada. Provided that there is time to resolve these matters, the opportunity created for 'data clean-up' is generally welcome. Clean-up may also be required on freshly keyed information. Australian Archives employed contractors to key in data which was then checked by archives staff before being run on to a mainframe computer overnight.
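The distinction between data redundancy and data anomaly can be made concrete with a small Python sketch. It assumes, purely for illustration, that entries from different sources share a reference (`ref`) and that two titles count as the same content when they differ only in capitalisation, punctuation or spacing; real clean-up rules would be a matter for each archives to decide.

```python
from collections import defaultdict

# Hypothetical entries merged from two sources; names and values are invented.
entries = [
    {"source": "A", "ref": "X/1", "title": "Parish Register, 1700-1750"},
    {"source": "B", "ref": "X/1", "title": "parish register 1700-1750"},
    {"source": "B", "ref": "X/2", "title": "Vestry minutes"},
    {"source": "A", "ref": "X/2", "title": "Vestry minutes"},
]


def normalise(title):
    """Reduce a title to lower-case words, ignoring punctuation and spacing."""
    kept = "".join(ch if ch.isalnum() or ch in "- " else " " for ch in title.lower())
    return " ".join(kept.split())


def classify(entries):
    """Group entries by reference and report exact and near duplicates."""
    by_ref = defaultdict(list)
    for entry in entries:
        by_ref[entry["ref"]].append(entry["title"])
    redundant, anomalous = [], []
    for ref, titles in by_ref.items():
        if len(titles) < 2:
            continue
        if len(set(titles)) == 1:
            redundant.append(ref)   # identical wording repeated
        elif len({normalise(title) for title in titles}) == 1:
            anomalous.append(ref)   # same content in a slightly different form
    return redundant, anomalous


redundant, anomalous = classify(entries)
```

A report of this kind only identifies candidates for clean-up. Deciding which version to keep remains an archival judgement.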

Errors of even a single keystroke, which would be recognised as such in a typescript finding aid, can lead to the effective loss of data to searchers using a computer. In these cases even the old adage 'garbage in, garbage out' becomes too generous because searches may fail to yield any result at all, when in fact the information is there, but in garbled form. These problems may to some extent be resolved where the system is capable of truncated or 'fuzzy' searches to identify terms similar to the one being sought.
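The effect of a single mis-keyed letter, and the way truncated or 'fuzzy' searching can recover from it, is easily demonstrated. The Python sketch below uses the standard library's `difflib.get_close_matches`; the index terms, the deliberately mis-keyed 'admirlty' and the similarity cutoff of 0.8 are assumptions chosen for illustration.

```python
import difflib

# Hypothetical index terms as keyed, including one single-keystroke error.
index_terms = ["admiralty", "admirlty", "exchequer", "treasury"]


def truncated_search(stem, terms):
    """Return the terms that begin with the given stem (right truncation)."""
    return [term for term in terms if term.startswith(stem.lower())]


def fuzzy_search(query, terms, cutoff=0.8):
    """Return terms whose similarity to the query meets the assumed cutoff."""
    return difflib.get_close_matches(query.lower(), terms, n=len(terms), cutoff=cutoff)


# An exact search misses the mis-keyed entry altogether.
exact = [term for term in index_terms if term == "admiralty"]
# Truncated and fuzzy searches both retrieve it alongside the correct form.
truncated = truncated_search("admir", index_terms)
fuzzy = fuzzy_search("admiralty", index_terms)
```

The exact search finds only the correctly keyed term, whereas the other two also return the garbled one. A lower cutoff would catch more errors at the price of more false matches.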

Identifying the user

Searches - but by whom? One of the most fundamental issues to be considered is the user community. For whom is the computer system designed? For archivists? For record-originators in government departments etc? For the public? Can they all be served by the same system or are their needs sufficiently different to justify different approaches for different user groups? Do we know sufficiently well who the users are and what is the nature of their demands upon the computerised information? Should they be consulted? How practical is it to unleash any given category of user on-line on a live database, or even off-line by means of disks or CD-ROM? These questions are being widely addressed in Europe and North America.

Most respondents to the questionnaire said that their computerised finding aids were designed with all three kinds of user in mind, or at least the archivist and the public. In practice they have usually been designed by or for archivists, and make assumptions as to the user's familiarity with archival conventions respecting provenance. The public may need, and indeed are sometimes given, fairly detailed guidance to get the best out of such systems. They may need separate screen-prompts and 'menus' from those designed for the archivist.

There are good reasons for allowing the public (with suitable safeguards) on-line access to computer databases: freedom of information, the provision of the very latest information. But there are often compelling reasons for not doing so. In some archives 70 to 80 per cent of users are genealogists or local or family historians, characteristically making a single visit to discover or check one piece of information. It is impractical to explain the use of the computer to each one.

In the Public Record Office (UK) the Current Guide to holdings, although fully automated, is made available to searchers in print-out or microfiche form. The Soviet Union reported that it was normally more economical for staff rather than the public to sit at a keyboard.

In smaller repositories particularly there may be too few computer terminals to allow public access, or their dedication to this use may reduce response times and other facilities for staff.

Charges?

Many archives have traditionally prided themselves on the supply of information free of charge to users, except perhaps where any appreciable research is undertaken to obtain the information.

A question now being asked, especially in the USA, is whether and for how long such a free service can continue in respect of computerised information. This is especially pertinent in the case of information obtained from a proprietary computer network, for membership of which the archives may itself have to pay substantial subscription fees and telephone connection charges. It may be practical by means of a modem link and the national telecommunications network for the remote user to obtain the information at home and be charged pro rata for the use of the service. In France there are plans to allow public access to certain databases in the user's home through the MINITEL computerised telecommunications system. Should the service be free if instead he presents himself at the archives?

Security

Any kind of user-access to live data, even by the archives' own staff, must be properly controlled to ensure that only authorised persons may enter and amend or delete information. To a large extent provision for this kind of security is written into the operating system itself, and is no more difficult to arrange than the 'menus' or questions designed to help any user to reach the desired information, but it must be competently set up and maintained.

Quite commonly a computer system will include whole areas of management or confidential data to which there should be no public access. The Centre des Archives Contemporaines, Fontainebleau, is deferring direct public access to its database until the ability to protect confidential data is guaranteed. Similar reservations apply in other national archival institutions, although no respondent mentioned any particular security problems of the kind regularly reported in the press: computer 'hackers' using telephone modem links to break their way into other people's data, or sowing 'viruses', 'worms' or 'Trojan horses' which gradually corrupt or destroy the data. It would be imprudent to say that these problems cannot affect archives, but with vigilance they may perhaps be avoided.

Computerised data is vulnerable in quite different ways from that contained in earlier means of reference. A determined vandal with a magnet may be capable of wiping clean a database. Power surges or an earthquake may do the same. Computer fires are by no means unknown. It is therefore essential to maintain adequate back-up facilities such as up-to-date duplicate disks or tapes stored away from the location of the main data.

Back-up

If the mention of some of these hazards seems a little alarmist, more common practical problems arise when for any reason the computer is 'down' or 'crashes', and it is important to have a strategy for recovery and for servicing user demand in the meanwhile. Australian Archives keeps complete paper and fiche back-up copies of its computerised data. But many would find this an unworkable solution on account of the bulk or form of data and the expense involved. The maintenance of anything like a complete dual system (computerised and manual) cannot, however, be recommended as an alternative to a determination to secure the successful operation of the new technology.

Updating

A problem somewhat related to back-up is that of the maintenance or updating of information once entered on the system. For many archival applications, data will be entered once for all, describing a fonds (etc) which is finite and will not be altered. But this is by no means always the case, and the problem looms especially large in relation to the networks described below and to the specialist surveys by academics etc mentioned briefly above.

The data is collected, processed and entered into the database, but then what? Does anyone maintain it, or does it go progressively out of date, as would have been the case with a printed volume? How, if at all, may it be made publicly available? And if it is out of date has some indication been given of the date at which it was compiled? The onus must rest on the original data supplier, and in the case of surveys with a finite duration these issues should ideally be resolved before the database is compiled.

Documentation

It has already been pointed out that commercial software does not always come complete with a readily intelligible manual. For the many national archives which have to develop their own applications rather than using standard software packages, there is a risk of the system's becoming complex, and difficult for new generations of programmers to understand. The need to simplify a complex system was identified as a priority in the Norwegian reply to the questionnaire and has also been a concern of the National Register of Archives (UK). Everywhere it is essential for programmers to ensure that their steps are clearly documented for their successors. In this respect the computer systems which produce, or serve as, archival finding aids are not dissimilar from electronic records which may be received as archival holdings, whose adequate documentation is crucial to their proper use and interpretation.

Electronic records

While the survey was in preparation a far-ranging report was published by the United Nations' Advisory Committee for the Coordination of Information Systems, under the title Management of electronic records: issues and guidelines (United Nations 1990). Although addressed first to the constituent organisations of the United Nations, the report raises issues which are being or will be faced by archivists and records managers wherever electronic records are created.

It points out the urgent need for archivists to be trained so that they can be fully involved in the design of electronic information systems and their documentation from the moment of creation. It also suggests that at present metadata systems (for description of electronic records) are not consonant with the formats being evolved for describing other media, as discussed below.

 

The complexity of the issues involved requires the lead in this field to be taken by concerned national archives, as is already being done in the USA, in close collaboration with the professional associations of archivists which must seek full representation on the committees of both the national and the international Standards organisations addressing these problems. The national archives of Norway and Sweden have already agreed on a joint approach to such questions.

A number of additional problems of a more local nature were reported during the survey. Attention here has been focussed mainly on those affecting the widest constituency.

Some of these have been seen to be of a practical or managerial nature rather than strictly of the methodological or technical kind to which the survey was primarily directed, but they are too closely interrelated to be ignored. As will readily be judged from the examples given, the degree of urgency which a problem may assume locally or nationally depends upon many variable factors. Broadly speaking, the problems judged to be the most serious internationally are those on which most research is already being done. Some of these are described in the next chapter.

