The use of sampling techniques in the retention of records
Felix Hull
INTRODUCTION
Among the various techniques adopted for reducing the bulk of certain classes of records is 'sampling', a method which can vary from a purely subjective choice of examples through a variety of procedures to an exact statistical process, providing an ideal objective answer for the student involved in the quantitative analysis of data. Insofar as this is a special procedure appropriate only in special circumstances and requiring careful assessment of method to be employed, size of sample required and precise evaluation for the purposes of research, it has been considered desirable to treat it as the subject for a special study. This study, therefore, is directed to one particular aspect only of records disposal and to one technique of appraisal, which should only be applied when circumstances indicate that a particular need is present.
Nevertheless, there is a certain misunderstanding prevalent regarding sampling. Terminology has tended, in the past, to be less than precise and the whole question of the use of sampling has given rise to much uncertainty and some misgivings among archivists. Here is a process which by definition leads to the destruction of a high proportion of the total documentation involved: can we be sure that the right material is preserved? Is the proper statistical sample required for accurate analysis by computer necessarily what the archivist looks for as part of his records, or indeed, what the historian or sociologist really needs? To what extent, if selection is inevitable, must purely objective criteria dominate; do we require the ordinary or the extraordinary in our archival sample? It is because of these many uncertainties, of questions of principle which disturb those concerned with sampling as it affects archives, and of the desire for some assessment of procedures and for suggestions for suitable basic guidelines, that this study has been prepared for UNESCO in co-operation with the International Council on Archives. It approaches the subject in two ways:
(a) by consideration of theoretical principles and methods of sampling; and
(b) by an examination of the experience of a number of national and other repositories where sampling has been practiced.
On the basis of these assessments an attempt has been made to draw up some essential principles for the application of different methods, even if it has not proved wholly possible to state categorically what should or should not be done. In this study, too, it has not been overlooked that many repositories are increasingly concerned with non-conventional archives and that pictures (still and motion), sound archives and machine-readable records all play a part in the contemporary scene. Although still more difficult to assess within the terms of this study, these newer forms of records have not been ignored.
6. GUIDELINES
6.1 In his book Archives Administration, published in 1977, Michael Cook briefly examined sampling as a technique as part of his chapter on appraisal. Having identified three types - random, selective and representative (1) - he completed his examination by claiming that there is 'no such thing as a statistical sample of general utility' and then stated that in any event, sampling was inadvisable without some specialist advice (2). After considering the conflicting evidence and views of archivists in Europe and America one comes reluctantly to much the same conclusion and feels sympathy with one correspondent who wrote that 'the first thing about archival sampling ... is that one should only do it if one really has to.'
6.2 Nevertheless the purpose of this study has been to try to
identify the cases when sampling is an appropriate exercise and
under what terms it can properly be carried out and it is,
therefore, desirable to attempt some statement of general
principles and to offer guidelines for those faced by sampling
problems. In this respect, it is necessary to return to the basic
distinction between appraisal and sampling and to attempt to
define the terms in use. Appraisal, therefore, is the fundamental
selection of documentation leading to preservation or to
destruction, under whatever criteria are considered valid at the
time the choice is made. Every archive class, at least in theory,
has passed through an appraisal process, for a decision,
conscious or unconscious, has been taken in respect of its
retention. In contrast to this process, sampling can only occur
in those cases where, despite the appraisal decision, some
uncertainty remains because of the positive but limited
informational content of the records and also because of their
bulk.
For good or ill the archivist is influenced by costs, for
storage, maintenance and care are expensive commodities and
potential, perhaps unknown, research values (that possibility of
usage for secondary purposes), have to be set against the costs
of retaining the records in their original state.
6.3 It seems abundantly clear that sampling is a technique which, in whatever form, is subject to some criticism and uncertainty by custodian and searcher alike. The archivist will be well advised to adopt sampling methods as infrequently as possible, and then only after the most careful consideration of the methodology to be used and, perhaps - for the final decision may well depend upon circumstances beyond the archivist's control - with the advice of an expert in the field of study involved, in statistical method, or in both.
6.4 Appraisal, therefore, may determine the fate of classes, series or even unit pieces; sampling will only apply when some further reduction in bulk is deemed necessary over and above that first decision, or if, for some reason, it is decided to keep an example of a type of record otherwise destroyed.
6.5 A second consideration will relate to those types of record where such a decision may be appropriate. Bulk has already been mentioned and, except for the 'example', there are few occasions when the factor of bulk coupled with that of continuing growth will not provide the key element in the final decision making. Yet not all types of record are equally suitable for sampling and it has been necessary to stress repeatedly that a series of records which is suitable for this technique will be homogeneous. If the documentation is wholly individual in content and if it provides significant evidential detail, then it is improbable that sampling can apply. One manual of advice which refers to statistical sampling states that it can only be 'applied to documents which contain mathematically quantifiable information in standard form and in sufficient depth, either because they cover a long period of time, or because they are complete for the particular subject with which they deal', (3) a ruling which goes some way towards explaining why statistical methods are seldom wholly applicable to non-conventional archives.
6.6 Opposed to the somewhat cautious and qualified approach of European archivists and their tardy acceptance of fully random statistical methods, is the enthusiasm evinced by the American paper on 'Statistical Sampling of Archives' (4) which argues forcibly that since 'the record' is never complete we should 'be able to sample without undue anguish about the integrity of the records'. Certainly, as suggested earlier, the trans-Atlantic attitude is much less overburdened with fears that the 'significant' item will be lost and, although NARS still employs both selective (purposive) and systematic sampling on occasions, the movement of thought in the case of large homogeneous series of files and case papers is very much in line with that proclaimed in Canada.
6.7 A related matter which must be briefly mentioned is the size of the sample if, in fact, one is to be taken, though here again, it is difficult to find completely common ground. Eleanor McKay considered that in order to obtain a satisfactory and authoritative sample of Congressional papers a twenty per cent sample was required (5). McReynolds argues from the statistical point of view that 'the greater the reduction the greater the imprecision of the resulting sample' and he, like Cook, recommends the seeking of advice. He also suggests that if a record group has a number of series of varying significance, 'the archivist can create a stratified sample of the whole record group'. (6) This is, in fact, essentially what is being attempted in Kent with the sampling of Social Services records, where the size of the sample taken from various series of records is dependent upon the relative size, importance or nature of the individual series. This concept suggests that the size of sample will vary according to circumstances, though it will remain true that the larger the sample the more truly representative it is likely to be. Once again the overriding factor is likely to be cost, for if a very large sample is to be retained, may it not still be preferable and possible to retain the whole, which after all is the ideal solution?
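The stratified approach mentioned in 6.7 might be sketched as follows. This is only an illustration under stated assumptions: the series names, their sizes and the retention percentages are hypothetical and are not those used in Kent, and each series is assumed to consist of individually numbered files.

```python
import random


def stratified_sample(series_files, percent_kept, seed=0):
    """Draw a separate random sample from each series (stratum).

    series_files: dict mapping series name -> list of file identifiers
    percent_kept: dict mapping series name -> whole-number percentage to retain
    """
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    retained = {}
    for name, files in series_files.items():
        # Integer arithmetic, rounded up, so a small series keeps at least one file.
        k = min(len(files), (len(files) * percent_kept[name] + 99) // 100)
        retained[name] = rng.sample(files, k)
    return retained


# Hypothetical record group: three series of differing size and importance.
series_files = {
    "case_files": [f"CF/{i}" for i in range(1, 1001)],
    "committee_minutes": [f"CM/{i}" for i in range(1, 51)],
    "routine_returns": [f"RR/{i}" for i in range(1, 5001)],
}
percent_kept = {"case_files": 10, "committee_minutes": 100, "routine_returns": 2}

retained = stratified_sample(series_files, percent_kept)
for name, files in retained.items():
    print(name, len(files), "of", len(series_files[name]))
```

The percentage for each series is the archivist's decision and reflects relative size, importance or nature; the code merely applies it consistently and reproducibly.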
6.8 Guidelines
We can conclude so far:
(1) Sampling should only take place (a) when there is some doubt about the validity of retaining the whole class or series of conventional (paper or textual) records, but when automatic destruction is regarded as too drastic a course of action, or (b) when it is felt proper to retain some examples from an otherwise destructible category of records.
(2) Where the material itself is appropriate for this kind of technique, i.e. where classes or series of files are homogeneous ('if the individual files contain similar records in each file, the variability will be small and the statistical significance or precision of the sample will be high') (7). Heterogeneous or highly variable records will produce a serious bias and should not be sampled for that reason unless there is a sub-series within the main series which has special characteristics indicating that retention of the sub-series is desirable.
(3) In most instances these criteria indicate that sampling is not applicable for the selection of cartographic, audio-visual or machine-readable records.
(4) While the size of sample will vary according to the nature of the documentation and the circumstances under which decisions have been made, a larger sample will provide a more satisfactory coverage of the whole and will therefore be more likely to provide researchers with their special requirements.
(5) In all cases the methodology used and the reasons which led to sampling must be indicated in any finding aid which is prepared.
6.9 It now becomes necessary to consider the various types of sampling in greater detail and to attempt to define areas of usage and significant variations in value in terms of methodology. In this section, therefore, it is most important that exact terminology should be adopted if at all possible. This is particularly so with words such as 'random' and 'statistical', for it is clearly apparent that not only has there been much confusion in the past, but that both these terms are still being used to describe significantly different processes. Although 'random' is still regarded as suitable as a description of various patterns of systematic sampling, its use here, so far as possible, will be limited to the precise statistical process described on pp. 24-5. The terminology, therefore, will be that adopted in Chapter 2 but, wherever applicable, known variants will also be given in the heading to any particular method.
6.10 (a) The Example
Although it would appear that examples are taken fairly widely
in appropriate circumstances, it must be stressed that within the
terms of definition 'an example' cannot be regarded as a true
sample. It neither illustrates the qualities of the whole mass,
nor does it provide a representative experience of the series. On
the other hand, it does indicate that a certain class or series
existed or even that a particular type of individual document was
once in use.
From time to time circumstances may arise when it seems desirable
to retain an example of what would otherwise be destroyed.
6.11 Guidelines
This is a valid appraisal decision, but
(6) Any description must indicate the provenance of the example and why the residue was destroyed.
(7) The significance of the example rests in its nature as an example and in nothing else; it has virtually no research potential except as an indicator of what was formerly in existence, even though it may have value as a precedent for the agency concerned.
6.12 (b) Purposive Sampling (Qualitative or Selective Sampling)
The most dramatic example of this kind of sampling and the one which led to considerable dispute was the French attempt some years ago at producing 'models'. This was attacked in 1953 by R H Bautier (8) and was further discussed in 1967 by Pierre Boisard in La Gazette des Archives. (9) As a method it is superficially very tempting to the archivist, who considers that a class or series must contain material of special or particular merit in respect of individual topics, areas or personalities. As a method it attempts to answer the criticism that sampling removes the exceptional and thus loses what is of special significance. In considering this argument, one author has referred to this form of sampling being carried out under 'criteria of significance', but has then pointed out that such criteria are atypical simply because they are of special significance and that 'the resulting sample does not, in any way, reflect the whole group of records'. (10) While he does, somewhat grudgingly, admit that selective sampling 'within series or record groups is a valid technique for archivists in some circumstances', it is also plain that this system is not very far from the taking of examples. It is nevertheless a method which has been used by many archivists and, almost as often, it has given rise to criticism because of its inevitable built-in bias and its unsuitability for statistical purposes. It must be admitted, however, that so long as doubts remain about the wholly satisfactory nature of systematic or true random sampling, there will be a tendency to continue to use this method. It is of interest to note that when the sampling of the papers of Congressmen in Wisconsin was attempted, the person involved still considered it desirable to take a purposive sample from the eighty per cent residue in case the statistical sample did not prove to cover all contingencies. (11) Indeed, if purposive sampling is to be accepted at all, it is probably most appropriate in this kind of context where, after the taking of a statistically acceptable sample, it is used as a secondary system - a kind of safety-net. In other words, once a statistical sample has been taken, it is permissible to extract from the residue other files or papers on a qualitative basis.
6.13 Guidelines
In view of the above facts and of the continued use of this method, even if there remain many reservations, it is well to establish certain basic rules for its adoption and usage:
(8) Purposive sampling, because of its 'exception' character and built-in bias, is the less appropriate the more homogeneous the original series of records.
(9) No purposive sample should be taken in the place of a statistically valid sample - it may be a supplementary process, but is not really acceptable as the primary method to be employed.
(10) Criteria of selection must either be very specific (e g. the case files of conscientious objectors referred to on pp.20-1) or must be as comprehensive as they can reasonably be made. Once again the problem arises that the attempt to be comprehensive may only be achieved by the retention of the whole series.
(11) A very clear description of any criteria must preface any finding aid, so that the user is made fully aware of what was done, why it was done, how it was done and where he can find the other element of the sample which can be used for quantitative analysis.
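The order of operations implied by these rules (statistical sample first, purposive selection from the residue afterwards, the two elements kept distinct) can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the files are hypothetical (number, subject) pairs, the twenty per cent figure simply echoes the Congressional papers example, and the `is_significant` test stands in for whatever specific criteria the archivist has defined.

```python
import random


def sample_with_safety_net(files, percent, is_significant, seed=0):
    """Take a random sample first, then a purposive one from the residue.

    files:          list of (file_number, subject) tuples
    percent:        whole-number percentage for the statistical sample
    is_significant: hypothetical predicate encoding the selection criteria
    """
    rng = random.Random(seed)                 # fixed seed so the draw is repeatable
    k = (len(files) * percent + 99) // 100    # round up to a whole file
    statistical = sorted(rng.sample(files, k))
    chosen = set(statistical)
    residue = [f for f in files if f not in chosen]
    purposive = [f for f in residue if is_significant(f)]
    return statistical, purposive


# Hypothetical series of 500 files, ten of which meet the criteria.
files = [(n, "conscientious objector" if n % 50 == 0 else "routine")
         for n in range(1, 501)]
statistical, purposive = sample_with_safety_net(
    files, 20, lambda f: f[1] == "conscientious objector")
print(len(statistical), "files in the statistical sample")
print(len(purposive), "further files kept on qualitative grounds")
```

Returning the two lists separately reflects rule (11): only the first element is suitable for quantitative analysis, and the finding aid should say which files belong to which.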
6.14 (c) Systematic Sampling (Representative, Quantitative or Statistical other than 'random', Time-Series or Chronological, Numerical)
There is no doubt that of all sampling methods, systematic sampling is the method with the greatest number of devotees at the present time. The preference for this method is usually expressed in terms of the simplicity with which it can be carried out. Nevertheless, there has been some criticism of methods in the past, and indeed, systematic sampling covers such a very wide spectrum of techniques, varying from ones bordering on purposive methods to systems designed to provide a near 'random' sample that it is difficult to define in any one simple manner. It is worthy of note, however, that whereas European archivists still tend to regard systematic methods as the most suitable for general purposes, the pressure in North America is towards the greater use of truly random methodology. Each of the varieties of systematic sampling must be briefly considered and for each some rules of action must be established.
6.15 Only one remove from purposive sampling and still far from being what one usually regards as a systematic sample is the representative, topographical sample. This occurs when, faced with a large number of similar groups (fonds) within the archives of an agency, each based on a particular topographical area or regional office, it is decided to retain whole archive groups for a selected number of offices. It is immediately clear that this is not true sampling, rather it is more closely associated with basic appraisal in that it is not series which are being reduced in size but rather a determination to retain some units of archives as opposed to others. Moreover, there is no homogeneity within the records chosen for retention - the group may contain many disparate classes - but each group will approximate to its fellows in the character of the records created and those selected for retention will therefore present a representative sample of the records of the central agency as they existed at the local level. This approach has been strongly criticised in France as creating a false sense of uniformity and for damaging the resources of local history within those areas where destruction took place. There is a sense, of course, in which this process is of the kind of arbitrary selection which the ravages of time have created and, indeed, such selected groups of records are only valid for research within the parameters of their own topographical area and cannot be cited as authoritative evidence for what took place elsewhere in areas for which the records have been destroyed. Finally, this is a method which underlines the archivist's dilemma very forcibly - the archives of the unit, office or area can only be regarded as marginally worth preservation; can the cost of keeping all such records and, therefore, all such units be justified? If it cannot, then possibly a topographical sample is to be considered as a comprehensive example of what once existed and took place in one area only.
6.16 Guidelines:
It should be understood therefore:
(12) That topographical samples are only acceptable where a large central agency has many local offices and where the central agency's records are already retained in an adequate form.
(13) The records of chosen units within the system should be retained in their entirety as a group - this is a very special kind of example and its validity rests in the completeness of the archive groups of which it is composed.
6.17 The second form of systematic sampling frequently used is the retention of files based on letters of the alphabet. Alphabetical sampling is regarded as having a certain statistical justification and, in Canada, is combined with a still more 'random' numerical selection. The weakness of this system lies in the national and local variability of letter usage, so that although the choice of initial letter may appear representative of the whole series, sub-categories of individuals may be entirely missed. For example, the use of this method in the German Federal Republic has concentrated on the letter H, but while this is satisfactory for names of Germanic origin, it omits those of Romance origin; and a similar use in the United Kingdom would result in the exclusion from the sample of certain immigrant groups in the population. Nevertheless, with large series found in alphabetical order it is a form of sampling very easy to put into practice.
6.18 Guidelines
It may be said to be appropriate when:
(14) There is a large homogeneous series of personal files arranged alphabetically, so that the sample will be of a large enough size to provide reasonably accurate information (e.g. at Koln the use of the initial letter H provides an 8.5 per cent sample);
(15) Where an analysis has been carried out to establish which initial letter will be satisfactory according to the purpose of the exercise as a whole and will also effectively represent the records in question (e.g. it is useless to select a letter of very infrequent usage like Q or Z; equally to adopt M in Scotland might lead to an undesirably high percentage sample).
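The analysis called for in guideline (15) amounts to counting how the series divides by initial letter before any letter is chosen. A minimal sketch, assuming the surnames are available as a simple list (the names below are invented for illustration):

```python
from collections import Counter


def initial_letter_shares(surnames):
    """Return the percentage of files falling under each initial letter."""
    counts = Counter(name.strip()[0].upper() for name in surnames if name.strip())
    total = sum(counts.values())
    return {letter: 100.0 * n / total for letter, n in sorted(counts.items())}


def alphabetical_sample(surnames, letter):
    """Retain every file whose surname begins with the chosen letter."""
    return [n for n in surnames if n.strip().upper().startswith(letter.upper())]


# Hypothetical index of surnames; a real series would run to many thousands.
surnames = ["Hoffmann", "Huber", "Martin", "Moreau", "Hartmann", "Rossi",
            "Schmidt", "Hahn", "Quast", "Meyer"]

shares = initial_letter_shares(surnames)
for letter, share in shares.items():
    print(letter, f"{share:.1f} per cent")
print(alphabetical_sample(surnames, "H"))
```

A table of this kind shows at once whether a candidate letter yields too small or too large a sample. It does not show which sub-groups of the population a letter under-represents; that still requires knowledge of naming patterns in the area concerned.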
6.19 A more nearly 'random' and statistically sound sample is achieved by numerical selection. Numerical or Serial sampling can be simple, i.e. every tenth or twentieth box, file, etc., according to the format of the records; or it can be based on far more complex criteria, e.g. the Social Insurance Number selection in Canada, where files with terminal digit 5 alone are selected. (12) It must be indicated that the 'tenth box' method can lead to some problems if the make-up of the records results in files which overlap the box arrangement. It is said that this method used in the Public Record Office for the records of the Registrar General of Shipping has resulted in a sample, which, while it provides a statistical base, is not particularly suitable for other research purposes. (13) Nevertheless numerical sampling is one of the three most widely accepted methods which, while not wholly random in the statistician's sense, can provide an acceptable base for most purposes. One of the difficulties illustrated in the evidence submitted for this study rests once more in the loose use of terminology and one cannot always be certain when 'random' methods are mentioned whether that is really so or whether some form of numerical selection is not being practiced. It is essential, however, that if the intention is to produce a sample which is valid for statistical purposes, then bias must be avoided and not all records will be equally suitable for this kind of sampling.
6.20 Guidelines
It can, therefore, be stated:
(16) 'A serial sample may be acceptable for statistical study if the existing order of the whole body of the records is random (e.g. a series of returns filed in no systematic order)'. (14)
(17) 'A serial sample is the only practicable method of sampling if the individual items cannot be separated and the assemblage has to be taken as the unit'. (14)
(18) This method should not be used if there is an undoubted alphabetical, topographical or chronological arrangement to the records.
(19) The degree of acceptability depends upon every unit in the series having its unique individual number, a vital element if statistics are to be meaningful. (15)
(20) The numerical series to be adopted must be established in advance and must be adhered to rigidly.
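Both forms of numerical selection described in 6.19 reduce to a fixed rule applied to uniquely numbered units. The sketch below is illustrative only; the file numbers are hypothetical, and the terminal-digit function merely imitates the principle of the Canadian selection rather than reproducing that scheme.

```python
def every_nth(units, interval, start=0):
    """Simple serial sample: keep every `interval`-th unit, beginning at `start`."""
    if interval < 1 or not 0 <= start < interval:
        raise ValueError("interval must be >= 1 and 0 <= start < interval")
    return units[start::interval]


def terminal_digit(numbers, digit):
    """Keep units whose unique number ends in the given digit."""
    return [n for n in numbers if n % 10 == digit]


file_numbers = list(range(1001, 1101))        # 100 hypothetical file numbers
tenth_files = every_nth(file_numbers, 10, start=9)
digit_five = terminal_digit(file_numbers, 5)
print(tenth_files)
print(digit_five)
```

In keeping with guideline (20), the interval, the starting point or the digit would be fixed and recorded before selection begins and not altered part-way through the series.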
6.21 Chronological or Time-Series samples
This form of systematic sampling, which finds much favour, depends upon the chronological arrangement of the papers to be sampled and, in most cases, results in the survival of records for every fifth or tenth year, often using census years because of their association with other demographic material. The weakness of this form of sampling rests primarily in the fluctuations of human society and that the years thus selected may avoid vital changes of a political or economic or legislative character. This is a cause for concern and has made searchers suspicious of a method which tends to concentrate on the short term facts rather than the long term trends. It has been this consideration which has led in France to the somewhat complex pattern of sampling adopted for the records of Sante-Travail, where the papers for one year in thirteen are retained and those for one month, in rotation, kept for the intervening years. This system is further refined, however, by permitting the year for which there is a total retention to be determined not by an arbitrary series, but by the significance of events of that year. In the end, therefore, one is met in essence with a statistical sample based on the monthly series with its built-in variable in order to obtain a representative cover over a period of years, and then superimposed on that, what is in effect a purposive sample dependent upon 'criteria of significance'. This complication must be recognized in any statistical work carried out, for while some valid comparisons are possible with the monthly series, the chosen years will not be similarly comparable.
6.22 Guidelines
For a satisfactory time-series sample therefore:
(21) the records must be homogeneous and arranged chronologically;
(22) the time series should be selected irrespective of political or other changes happening in between the retention years and this time-series should be adhered to at all times if the result is to provide statistical information;
(23) the closer together the selected years are, the more likely it will be that sudden aberrations in society will be picked up, but since it is only a sample one cannot and should not regard special circumstances as reasons for special variation; if there is cause for doubt, then it may be that a selective (purposive) sample should be taken in addition to and after the chronological sample.
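A retention schedule of the kind described in 6.21, with one year in thirteen kept whole and a single month kept for each intervening year, can be set out in advance as a simple table. The sketch below follows guideline (22) in using a fixed interval rather than choosing the full year by the significance of events. The starting year and the order in which the months rotate are assumptions made for illustration; the source does not specify them.

```python
def retention_plan(first_year, last_year, interval=13):
    """Map each year to what is kept: the whole year or a single month.

    Every `interval`-th year, counted from `first_year`, is kept in full.
    For the years in between, one month is kept, moving through the
    calendar in order (an assumed rotation; the source does not give one).
    """
    plan = {}
    for year in range(first_year, last_year + 1):
        offset = (year - first_year) % interval
        if offset == 0:
            plan[year] = "whole year"
        else:
            plan[year] = f"month {(offset - 1) % 12 + 1}"
    return plan


plan = retention_plan(1950, 1976)   # hypothetical span of years
for year, kept in plan.items():
    print(year, kept)
```

With an interval of thirteen there are twelve intervening years, so each calendar month is represented once in every cycle.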
6.23 It must be stressed again that it is the relative ease with which systematic samples can be taken which is their principal attraction. In the R.A.D. paper from the Public Record Office, quoted above, the comment is made that 'in practice it may be too expensive or time consuming to take a random sample', and that therefore 'the alternative of a "serial" or "systematic" sample' may have to be adopted. This is now very much the preferred method in European repositories, but it is gradually giving way to the true random sample in the United States and Canada.
6.24 (d) Random Sampling
The essential problem in this technique is to establish the fully random nature of the sample; to apply a statistically sound method of selection with no element of bias; and to be satisfied that the needs of traditional research are as adequately covered as those of quantitative analysis. In the view of the Canadian Public Archives these criteria are all met and this is also the opinion of R M McReynolds of the National Archives and Records Service of the U.S.A. On this side of the Atlantic, there are still doubts and the much slower adoption of computer techniques in archives and lack of resources has limited the use of random methods. One comment received reads that 'calculations based on the (random) sample will not provide historical accuracy in the sense of tying the creating authority's operations to particular cases, but they should give an accurate overall view of the effect of policies or the extent of problems'. (16) Material to be processed in this way must be essentially homogeneous, i.e. with a very low variability of content, and should 'contain mathematically quantifiable information in standard form or in sufficient depth, either because they cover a long period of time, or because they are complete for the particular subject with which they deal'. (17)
6.25 Guidelines
The choice and practice of this methodology depends upon:
(24) a suitable series of homogeneous records;
(25) the use of a random number table (18) or, possibly, of a highly sophisticated numerical series; (19)
(26) the numerical individuality of all the pieces (units) in the file series so that bias is eliminated;
(27) the careful determination of an appropriate size for the sample, bearing in mind that 'the greater the reduction the greater the imprecision of the resulting sample'. (20) It should likewise be remembered that 'to double the accuracy of a sample it is necessary to quadruple its size'; (21)
(28) that in this area in particular, the advice of a statistician and expert in historical quantitative research can be invaluable and can prevent serious error.
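Where a computer is available, a pseudo-random number generator can play the part of the random number table. The sketch below is a minimal illustration, assuming a series of 10,000 uniquely numbered files and an arbitrary seed; neither figure comes from the source. The second function uses the usual textbook approximation for the standard error of a proportion to show why doubling the accuracy requires quadrupling the sample.

```python
import math
import random


def simple_random_sample(unit_numbers, size, seed):
    """Select `size` units without replacement, each equally likely."""
    rng = random.Random(seed)   # record the seed so the draw can be repeated
    return sorted(rng.sample(unit_numbers, size))


def standard_error(proportion, size):
    """Approximate standard error of a proportion estimated from a simple
    random sample, ignoring the finite-population correction."""
    return math.sqrt(proportion * (1 - proportion) / size)


units = list(range(1, 10001))                  # hypothetical numbered files
sample = simple_random_sample(units, 400, seed=1983)

se_400 = standard_error(0.5, 400)
se_1600 = standard_error(0.5, 1600)            # four times the sample size
print(len(sample), "files selected")
print("error with 400 files:", se_400)
print("error with 1600 files:", se_1600)       # about half the error above
```

The calculation is a guide only; as guideline (28) says, the choice of sample size for a real series is a matter on which a statistician should be consulted.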
6.26 The random sample is taken by a precise scientific process; all other samples only approximate to a greater or lesser degree to that objective ideal. Since they are easier to adopt, and are in some ways more natural in methodology to traditional archival thinking, they will tend to be used, especially where the technology associated with true random sampling is still difficult to acquire and the skills of the persons who must carry out the work are limited. Nevertheless, as computer technology expands and becomes less expensive, it would appear that the random sample based on the random number table, or perhaps on some essentially random system like the Canadian S.I.N. numerals, will become increasingly the standard adopted for long homogeneous series of paper files. It will never be appropriate for records with a high variable factor, but it is questionable whether sampling of any kind should be advised in those circumstances.
6.27 This chapter has not considered at length the question of non-traditional archives, but it may be recalled that the essential argument of Chapter 4 was that, in most instances, sampling was not a technique suitable for material which must be selected on a unit basis. There were a few exceptions, depending upon the provenance of the material and in such cases it was almost invariable that time-series or chronological samples were taken, unless indeed the basic record had been microfilmed and a purposive sample appeared more appropriate. In all these instances the rules which would apply are those which have already been set out in the appropriate section of this chapter. There seems to be no room for quantitative, statistical samples in these areas, though of course the argument is confused by the availability of almost unlimited samples for research purposes, where the records themselves form a data base available for investigation.
6.28 In conclusion, therefore, sampling is a methodology forced on archivists by the sheer bulk of documentation and the cost of preservation. It should not be adopted unless there is no alternative solution, for it can seldom be wholly satisfactory. Random statistical sampling is appropriate for homogeneous series of paper files and can form a satisfactory base for quantitative research and, dependent upon that homogeneity, a reasonable base for traditional research also. The archivist, however, faced with costs, staff problems and many classes which are somewhat more variable than should ideally be the case, will often tend to adopt simpler methods, which can still be statistically based even though less completely satisfactory than random sampling and which provide more scope for the retention of the exceptional as well as the normal. In all cases, however, any sample which is intended to be statistically valid must be taken first and any other type of selection made subsequently. Full notes must always be retained of every action taken and of the various elements of the sample if more than one has been taken. The records themselves, too, must be stored in such a way that the distinction between what is acceptable for quantitative analysis and what is not, is clearly apparent.
6.29 In one sense, sampling is the worst of all worlds, but there is a growing opinion which sees in random sampling a set of criteria acceptable for all purposes provided the basic record is suitable for that kind of technology. Microphotography and the preparation of data bases must help to overcome the dilemma presented by archival sampling, but even apart from cost, the application of modern techniques depends upon the availability of such technology and it will be many years before the less sophisticated methods are rejected entirely. Unfortunately sampling will always leave the archivist and perhaps the scholar in a state of uncertainty, even though the methods used may be totally acceptable. In all cases, the residue must be destroyed and something may be lost thereby. Accepting that irreducible factor, forms of sampling will continue so long as very bulky series of essentially similar records have to be appraised and cost of storage prevents the retention of the whole.