Coastal management sourcebooks
Part 1 Remote Sensing for Coastal Managers An Introduction
4 Field Survey: Building the Link between Image and Reality
Summary Field survey is essential to identify the habitats present in a study area, to record the locations of habitats for multispectral image classification (i.e. creation of habitat maps) and to obtain independent reference data to test the accuracy of resulting habitat maps. Field surveys should be planned to represent the range of physical environments within a study area and it is recommended that the locations of approximately 80 sites be recorded for each habitat type (30 for guiding multispectral classification and an additional 50 for accuracy assessment). The importance of accuracy assessment should not be underestimated: a carefully produced habitat map does not guarantee high accuracy and inaccurate information can mislead decision makers.
The costs of field survey may be divided into fixed and variable cost categories; an overview of these cost considerations is provided. Global Positioning Systems (GPS) are a vital component of field survey and the selection of an appropriate system will depend on the mapping objectives and type of remotely sensed imagery being used.
Several complementary methods of assessing the accuracy of habitat maps are available. These are described and their relative advantages listed.
Almost every remote sensing exercise will require field surveys at some stage. For example, field surveys may be needed to define habitats, calibrate remotely sensed imagery (e.g. provide quantitative measurements of suspended sediments in surface waters), or for testing the accuracy of remote sensing outputs. This chapter aims to describe some of the key generic issues that must be borne in mind when planning a field survey. Specifically, the chapter sets out the general considerations involved in surveying coastal habitats, describes the importance of recording the positions of survey sites using Global Positioning Systems (GPS), and gives an introduction to the costs of field survey (costs are explored further in Chapter 19). The importance of assessing the accuracy of remote sensing outputs is stressed and guidance given on appropriate statistical methods for calculating the accuracy of habitat maps. Specific coral reef, seagrass and mangrove field survey methods (Plate 5) are too varied to include here and are discussed in Chapters 11, 12 and 13 respectively.
The need for field survey
Before the need for field survey is discussed, it is worth briefly reviewing the concept of remote sensing. Remote sensing provides a synoptic portrait of the Earth’s surface by recording numerical information on the radiance measured in each pixel in each spectral band of the image being studied. To create a habitat map, the operator must instruct the computer to treat certain reference pixels as belonging to specific habitats. The computer then creates a ‘spectral signature’ for each habitat and proceeds to code every other pixel in the image accordingly, thus creating a thematic map.
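The pixel-labelling step described above can be sketched with a minimal minimum-distance-to-means classifier. This is only an illustration of the principle (the classifiers actually used are discussed in Chapter 10); the habitat names, band values and signatures are invented for the example.

```python
import numpy as np

# Hypothetical mean radiances ("spectral signatures") for two habitats,
# one value per spectral band (illustrative numbers only).
signatures = {
    "reef":     np.array([52.0, 48.0, 30.0]),
    "seagrass": np.array([35.0, 40.0, 22.0]),
}

def classify_pixel(pixel):
    """Assign a pixel to the habitat whose signature is nearest
    (minimum Euclidean distance in spectral space)."""
    return min(signatures, key=lambda h: np.linalg.norm(pixel - signatures[h]))

pixel = np.array([50.0, 46.0, 29.0])
print(classify_pixel(pixel))  # nearest signature is "reef"
```

Applied to every pixel in the image, this kind of rule produces the thematic map; the quality of the result depends entirely on how well the reference pixels (and hence the signatures) represent the true habitats.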
Historically, some researchers have looked upon remote sensing as a means of mapping without the need to conduct field work. Whether this is an appropriate tenet depends on the objective of the study and the familiarity of the operator with the study site. Generally, most people can view a satellite image or aerial photograph and easily distinguish different features according to their colour, contrast, pattern, texture and context. In some instances, this may be all that is required to make use of the imagery. For example, visual interpretation is usually sufficient to delineate the shape of coastlines. In the majority of studies, however, the objective is more sophisticated (e.g. mapping submerged habitats) and the thematician may not be able to draw on visual interpretation and background knowledge to identify each habitat type. In fact, the thematician is unlikely to be aware of the variety of habitat types in the image. Our own experience supports this view (see Chapter 9): even when moderately familiar with an area (the Caicos Bank), the overall accuracy of the final map was low if field surveys were not conducted (e.g. 15–30%).
The aims of field survey are three-fold. Firstly, to identify each feature of interest (e.g. each habitat type). Secondly, to locate representative areas of each feature in order to generate spectral signatures (spectra) from the imagery. Thirdly, to generate adequate additional data to test the quality or accuracy of the image classification (i.e. habitat map). This latter consideration is extremely important for any mapping exercise. In a coastal management context, imagine the legal problems in suggesting that a developer had cleared a particular mangrove area if the accuracy of mangrove maps were unknown. Taken a step further, where do decision makers stand legally if offenders are fined according to the extent of habitat that they have illegally destroyed? Legal problems may not be the only consequence. In biological terms, management initiatives based on a habitat map of unknown accuracy could lead to unnecessary or inappropriate action, although it is difficult to predict or generalise specific problems arising from such circumstances. Surprisingly though, accuracy assessments are fairly scarce in the context of mapping tropical coastal resources. Green et al. (1996) found that only a quarter of papers reviewed included an assessment of accuracy. The apparent scarcity of such assessments is understandable, although hardly acceptable. To test a classification rigorously, further field data are required which must be independent of the field data used to classify the imagery in the first place. It is often suggested that an adequate accuracy assessment is not possible on financial grounds. Such arguments may be countered by asking what the value is of habitat maps of unknown accuracy. The extra expenditure will clarify the degree to which the map can be ‘trusted’ for planning activities and should avert inappropriate management action on the basis of poor information.
For example, the map might have to be disregarded for planning some areas whereas other sites might be well-represented. Methods of accuracy assessment will be discussed later in this chapter.
Planning field surveys
Field surveys must be planned carefully and due consideration must be given to the objectives of the study and the nature of habitats being surveyed. These issues will dictate most aspects of survey design, such as the sampling strategy, sampling technique, sampling unit, amount of replication, time to survey (i.e. weather conditions, date of image acquisition), ancillary data (e.g. depth, water turbidity) and the means of geographically referencing data. Specific considerations on methods, sampling units and ancillary data are described in the relevant chapters of this handbook (i.e. for mapping coral reefs, seagrass beds and mangroves) but more general comments are made here.
Most habitat mapping projects aim to represent the full range of relevant habitats. A helpful starting point is to conduct an unsupervised classification (Chapter 10) of imagery prior to conducting any field work. The unsupervised classification will provide a general guide to the location, size and concentration of habitat types (although it will not necessarily identify them correctly!). There are four main considerations when planning a field survey.
1. Represent all habitats
The physical environment (e.g. depth, wave exposure, aspect) will, to a large extent, control the distribution and nature of habitats. Relative depths can be inferred from most optical imagery (deeper areas are darker) and if the prevailing direction of winds and waves are known, the area can be stratified according to the major physical environments (e.g. Plate 12). If possible, each of these should be included in the survey.
2. Represent the range of environmental conditions for each habitat
Coastal areas will often possess gradients of water quality and suspended sediment concentration. Changes in these parameters across an image can lead to spectral confusion during image classification and misassignment of habitat categories. To mitigate this effect, surveys should represent a wide cross-section of each physical environment. This will provide further field data to train the image classification and provide data for accuracy assessment (to highlight the extent of the inaccuracies where they occur). As an example, an unsupervised classification of Landsat TM imagery of the Caicos Bank identified a specific habitat type on both sides of the Bank (some 40 km apart). Surveys near the field base identified this habitat as seagrass and it would have been easy to assume that all similar-looking habitats in the imagery fell into this class. However, field surveys at the opposite side of the Bank identified a very different habitat type (organic deposits on sand), thus reinforcing the need for extensive field work.
3. Choose a sampling strategy
To ensure that all habitats are adequately represented by field data, a stratified random sampling strategy should be adopted (Congalton 1991). The unsupervised image classification and map of main physical environments can be used to stratify sampling effort. A similar number of sites should be obtained in each area. Truly random sampling within each stratum (area) is likely to be prohibitively time-consuming because the boat would have to visit randomly selected pairs of coordinates, thus incurring wasteful navigation time. In practice, driving the boat throughout each area with periodic stops for sampling is likely to be adequate. The main limitation to any field survey is cost/time. While every attempt is made to obtain the maximum amount of data, Congalton (1991) recommends that at least 50 sites of each habitat be surveyed for accuracy assessment purposes. We feel that an additional 30 sites should be visited for use in image classification (Chapter 19).
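The site numbers recommended above translate into a simple survey budget. The sketch below assumes nothing beyond the 30 training sites and 50 accuracy-assessment sites per habitat suggested in the text; the habitat list is illustrative.

```python
# Recommended site numbers per habitat class:
# 30 for guiding image classification + 50 for accuracy assessment.
TRAINING_SITES = 30
ACCURACY_SITES = 50

def site_budget(habitats):
    """Return the per-habitat site requirement and the overall total."""
    per_habitat = TRAINING_SITES + ACCURACY_SITES
    return {h: per_habitat for h in habitats}, per_habitat * len(habitats)

budget, total = site_budget(["reef", "seagrass", "mangrove", "sand"])
print(budget["reef"], total)  # 80 sites per habitat, 320 in all
```

A budget like this, multiplied by the expected rate of site survey (sites per day, discussed below), gives a first estimate of the person-days of fieldwork required.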
4. Estimate costs of field survey
Field surveys are expensive and not all of the costs incurred in gathering field data and relating it to remotely sensed data are immediately obvious. However, a full analysis of field costs is vital when designing a remote sensing campaign to ensure that realistic budgets and work schedules are planned. A generalised discussion of costs is presented here. Detailed advice on planning a remote sensing field campaign in terms of cost and the actual costs incurred in mapping the habitats of the Turks and Caicos Islands are given in Chapter 19.
Fixed costs are defined as costs which are independent of the type of imagery used, the duration of fieldwork and the number and distribution of sites. As such they represent single payments, usually for equipment which can be used repeatedly for the field component of any remote sensing campaign. Of the costs being considered here, computing equipment constitutes the largest fixed cost. The following practical guidance points may be helpful:
Several image processing packages are available as versions for high-specification PCs but a UNIX workstation is still the most flexible, reliable and powerful platform on which to carry out image processing.
A full Landsat TM scene is around 270 Mbyte in size. Image processing often generates a number of intermediate images which cannot be deleted before completion. It is also frequently useful to be able to display, compare and analyse more than one image simultaneously. Therefore, a large amount of free disk space (e.g. 2 Gbyte) is necessary to store original images and to allow the generation of intermediates. A large-capacity external hard drive is recommended.
Remotely sensed images are delivered on EXABYTE tapes, Computer Compatible Tapes (CCTs) or Compact Disk (CD). A tape drive or CD-ROM drive is necessary to import these files. Data may also be written to tape or CD so a tape drive or CD writer will also allow work to be saved and retrieved (one 8 mm EXABYTE tape can store up to 14 Gbyte of data; one CD can store 650 Mbyte of data).
Visual examination of raw and classified images is an essential part of image processing and therefore, it is advisable to purchase the best quality monitor that budgets allow. For example, we purchased a 20 inch (51 cm) colour monitor capable of displaying 24-bit graphics (approximately 16 million colours).
A good-quality colour printer is necessary for hard copy output, preferably one capable of large format printing.
Image processing software varies greatly in price from simple systems costing a few hundred pounds (e.g. IDRISI) to sophisticated packages costing several thousands of pounds (e.g. ERDAS Imagine, PCI). Universities and other educational institutions usually receive a discount on licences through software sharing schemes. Government departments normally pay a higher price but this is usually lower than the cost to commercial organisations. In addition to image processing software, a mapping package is useful for producing presentation-quality hard copy (e.g. MapInfo and Vertical Mapper), a spreadsheet package is useful for data management (e.g. Excel, Lotus) and a statistical package (e.g. Minitab, PRIMER) for analysis of field data.
For field work, a Global Positioning System (GPS) is an essential piece of equipment to estimate the surveyor’s location on the Earth’s surface. There are two principal types of GPS, which differ in their cost and positional accuracy. The cheaper, less accurate unit is known as a ‘stand-alone GPS’; the more expensive option, ‘differential GPS’ (DGPS), provides accuracy roughly an order of magnitude better (see section on GPS later in this chapter). The positional accuracy of a stand-alone GPS is perfectly adequate for use with imagery of coarse spatial resolution such as Landsat MSS. However, imagery with greater spatial resolution (e.g. SPOT XS and Pan) justifies the use of a higher-specification DGPS.
Variable costs are defined as those which vary with the type of imagery used, the duration of fieldwork and the number and distribution of sites. Personnel salaries are the major cost in this category; they will be directly related to time costs and for this reason all time costs are expressed in person-days (see next section). Fuel and oil for boats is another cost which will depend on the amount of fieldwork undertaken and the distances covered. The last main variable cost is that of the imagery itself; this is discussed in detail in the following chapter.
The time required to undertake a field survey will depend on the following considerations:
The amount of fieldwork undertaken: fieldwork time will be related to the number of survey and accuracy sites visited
The area over which survey sites are spread: are survey and accuracy sites concentrated in a small area of an image or do they cover the entire scene?
The survey methods employed: these determine the rate of site survey (number of sites per day). SCUBA surveys take much longer than snorkel surveys which, in turn, are considerably more time intensive than shipboard surveys using glass-bottom buckets. The time costs of SCUBA surveys are further increased for safety reasons (e.g. the need to have pairs of divers in the water, limited number of dives in a working day). Rapid visual assessment methods can offer considerable savings in time for larger surveys (see Chapter 11).
The data being collected: species-level data take more time to collect than phylum-level data, for example. Survey data typically take longer to collect than accuracy data (when a site is usually being assigned to a particular category on an already well-defined habitat classification scheme).
Accessibility of survey areas: (i) depth: deeper areas may require SCUBA surveys, very shallow areas can be inaccessible by motorised boats and alternative transport may be necessary, (ii) exposure: areas open to prevailing winds may be impossible to survey except on calm days, (iii) ease of access: areas with high concentrations of natural hazards, such as patch reefs or sand banks, may be difficult to navigate through. The interior of mangrove forests is difficult to penetrate.
The habitats themselves: complex habitats like coral reefs take considerably longer to survey than simpler habitats like bare sand.
Global Positioning Systems (GPS)
In a remote sensing context, GPS has two major applications:
to measure the position of prominent features on an image in situ which can be used to provide ground control points for geometric correction (see Chapter 6),
to assign positions to field data. These field data can then be correlated with spectral information at the same point on a geometrically corrected image. Conversely, a group of image pixels of particular interest can be surveyed in the field by using a GPS to navigate to that location.
GPS was developed by the US Department of Defense as a global navigation system for military and civilian use. It is based on a constellation of 24 Navstar satellites which orbit the Earth at an altitude of approximately 20,200 km with a period of about 12 hours. These satellites act as reference points of known position whose distance from any point on the Earth’s surface can be measured accurately. Each transmits a unique radio signal which serves to identify the satellite. The time between transmission and reception of the radio signal enables the distance between the satellite and any receiving point on the Earth’s surface to be calculated. The signal is received by the GPS, which is typically either battery-powered and hand-held (or carried in a back-pack) or mounted on a boat and powered from the onboard supply. The receivers are highly mobile and positions can be taken at any location which allows radio reception (for example, there may be problems receiving satellite signals under dense forest canopies).
In order to compute a position, the GPS receiver must know the distance between itself and the Navstar satellites and the exact position of those satellites. Position fixing with GPS then utilises trigonometric theory which states that if an observer knows the distance from three points of known position (the satellites), then that observer must be at one of two points in space. Usually the observer has a rough idea, to within a few degrees of latitude and longitude, of his position and so one of these positions is ridiculous and can be discounted by the computer. Theoretically, if the GPS is being used at sea level, or a known height above it, then only three satellites are needed to calculate position because the receiver is a known distance from a fixed point in space (the centre of the Earth). However, in practice a fourth satellite range is needed in order to correct for timing differences (offsets) between the highly accurate atomic clocks on the satellites, and the less accurate internal clocks of the receiver units (for further details, see Trimble Navigation Ltd, 1993).
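The position-fixing geometry can be illustrated with a simplified two-dimensional analogue: ranges from two fixed points of known position yield two candidate locations, and a rough prior position discounts one of them. This is a hypothetical sketch only; real receivers solve the three-dimensional problem with a fourth satellite range for the clock correction.

```python
import math

def two_circle_fix(p1, r1, p2, r2, rough):
    """Intersect two range circles (2D analogue of satellite ranging)
    and keep the candidate position nearest a rough prior position."""
    (x1, y1), (x2, y2) = p1, p2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1**2 - r2**2 + d**2) / (2 * d)   # distance along the baseline
    h = math.sqrt(r1**2 - a**2)            # offset from the baseline
    xm, ym = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    candidates = [(xm + h * (y2 - y1) / d, ym - h * (x2 - x1) / d),
                  (xm - h * (y2 - y1) / d, ym + h * (x2 - x1) / d)]
    # The observer's rough idea of position discounts the "ridiculous" fix.
    return min(candidates,
               key=lambda c: math.hypot(c[0] - rough[0], c[1] - rough[1]))

# Two "satellites" at known positions, measured ranges, and a rough prior.
fix = two_circle_fix((0, 0), 5, (6, 0), 5, rough=(4, 3))
print(fix)  # → (3.0, 4.0)
```

The two intersection points here are (3, 4) and (3, −4); the rough prior near (4, 3) selects the former, exactly as the receiver's computer discards the implausible solution.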
For a wealth of information on GPS, a good starting point is Peter Dana’s web pages at the Department of Geography, University of Texas at Austin at the following URL: http://www.utexas.edu/depts/grg/gcraft/notes/gps/gps.html.
GPS costs and accuracy
Any position fix obtained from a GPS contains a degree of uncertainty. The simplest and cheapest units are typically targeted at the outdoor leisure market (mountaineers, hikers etc.), cost approximately £200 and give positions with an error of ± 100 m. The most sophisticated GPS systems are used by surveyors who require sub-centimetre accuracy and cost many thousands of pounds. The ultimate accuracy of GPS is determined by the sum of several sources of error, the contribution of any source depending on specific ionospheric, atmospheric and equipment conditions. Sources of error include variations in the speed of radio transmission owing to the Earth’s constantly changing ionosphere and atmosphere, drift in the atomic clocks, electrical interference within a receiver and multipath error where the radio signal does not travel directly to the receiver but is refracted en route. These errors are all unavoidable physical and engineering facts of life. However, the accuracy of GPS is deliberately degraded by the US Department of Defense, which operates a policy of ‘selective availability’. Selective availability is designed to deny hostile military forces the advantages of accurate GPS positional information, can be varied and is by far the largest component of GPS error (Table 4.1).
|Table 4.1 Average errors for a good-quality GPS|
|Typical error source||Error (m)|
|Selective availability error||30.0|
There has been some speculation recently that selective availability may be removed altogether. However, resistance from the US military is likely to delay this for several years, and in the event of political tension selective availability would be rapidly reinstated.
A further factor in the accuracy of GPS systems is the principle of geometric dilution of precision (GDOP). The errors listed above can vary according to the positions of the satellites in the sky and their angles relative to one another: the wider the angles between the satellites, the smaller the effects of these errors and the better the positional measurement. The computer in a good receiver will have routines to analyse the relative positions of all the satellites within the field of view of the receiver and will select those satellites best positioned to reduce error. Thus, GDOP is minimised. Values for position dilution of precision (PDOP) can vary from 4 to 6 under reasonable conditions; under good conditions values < 3 can be obtained. The predicted accuracy of a GPS can be calculated by multiplying the total error in Table 4.1 by the PDOP, giving typical errors of 18–30 m for a good receiver and, in the worst case, about 100 m.
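The predicted-accuracy calculation (summed per-source error multiplied by PDOP) can be sketched as follows. The per-source figures below are illustrative placeholders chosen to reproduce the 18–30 m range quoted above; they are not the values of Table 4.1.

```python
# Illustrative error budget (metres); real figures come from tables such
# as Table 4.1 and vary with ionospheric, atmospheric and equipment conditions.
error_sources_m = {
    "ionosphere/atmosphere": 4.0,
    "satellite clock drift": 1.0,
    "receiver noise": 0.5,
    "multipath": 0.5,
}

def predicted_error(errors, pdop):
    """Predicted positional error = sum of per-source errors * PDOP."""
    return sum(errors.values()) * pdop

print(predicted_error(error_sources_m, pdop=3))  # 18.0 m under good conditions
print(predicted_error(error_sources_m, pdop=5))  # 30.0 m under reasonable conditions
```

The same arithmetic shows why GDOP matters: with an identical error budget, a poor satellite geometry (high PDOP) can triple or quadruple the positional error.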
Differential GPS (DGPS)
We have already seen that inaccuracies in GPS signals originate from a variety of sources, vary in magnitude and are difficult to predict. For many uses of GPS, positional errors in the range of 30–100 m are unacceptable. Fortunately, by using a system which measures errors as they occur, and at the same time as positional information is being collected, it is possible to correct much of the inaccuracy. This is achieved by using a second, reference receiver, which remains stationary at a location whose position is known because it has been surveyed to a very high degree of accuracy (if positional accuracy of 2–3 m is required then it would be necessary to survey this point to an accuracy of < 0.5 m). The Navstar satellites’ orbit is so high that, if the two receivers on the ground remain only a few hundred kilometres apart, then the signals which reach both will have travelled through virtually the same ionospheric and atmospheric conditions. Therefore, both signals will have virtually the same delays. Selective availability will also delay the signals to both receivers by the same amount. However, receiver noise and multipath refractions will be different for each signal and cannot be corrected.
Instead of using radio signals to calculate its position, the reference receiver uses its position to calculate the actual time taken for the signal to travel between it and the satellite. This is possible because:
The distance between the reference receiver and the satellite can be calculated at any time. The positions of the satellite can be calculated from orbital details and the position of the reference receiver is, of course, known and stationary.
The theoretical time taken for the signal to cover this distance is calculated from the speed of transmission of radio waves.
Any difference (hence the term ‘differential’) between the two times is the delay or error in the satellite’s signal.
The reference receiver continuously monitors the errors and these can be compensated for in one of two ways:
The reference receiver transmits the corrections to the ‘roving unit’ (the other receiver, which is in the field gathering positional data) for real-time correction. One reference receiver can provide corrections for a series of roving units. In real-time correction systems, the instantaneous errors for each satellite are encoded by the receiver and transmitted to all roving units by a VHF radio link. Roving receivers receive the complete list of errors and apply the corrections for the particular satellites they are using. The advantage of a real-time system is that, as the name suggests, positional information of differential accuracy is made available to the operator of the roving unit during the survey. This is especially useful if the roving unit is being used to navigate to a specific location.
The reference receiver records and stores corrections for later processing (post-processed GPS). In post-processed systems, the roving units need only to record all their measured positions and the exact time at which they were taken. At some later stage, the corrections are merged with this data in a post-collection differential correction. The advantages of post-processed systems are the ability to operate in areas where radio reception may not be good, greater simplicity and reduced cost. However, navigation to a particular location can only be performed at stand-alone accuracy with a post-processed system.
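The differential principle underlying both real-time and post-processed correction can be sketched as follows. The satellite identifiers and range values are invented for illustration; real corrections are applied to pseudoranges before the position solution, not to finished coordinates.

```python
# Reference receiver: its surveyed position lets it compute the true range
# to each satellite, so (measured - true) isolates the per-satellite error.
def satellite_errors(measured_ranges, true_ranges):
    return {sat: measured_ranges[sat] - true_ranges[sat]
            for sat in measured_ranges}

# Roving receiver: subtract the broadcast (or post-processed) corrections
# from its own measured ranges to the same satellites.
def correct_rover(rover_ranges, errors):
    return {sat: rover_ranges[sat] - errors[sat] for sat in rover_ranges}

# Illustrative range measurements in metres (satellite IDs are hypothetical).
errors = satellite_errors({"sv01": 20030.0, "sv02": 21055.0},
                          {"sv01": 20000.0, "sv02": 21020.0})
corrected = correct_rover({"sv01": 19580.0, "sv02": 22140.0}, errors)
print(corrected)  # {'sv01': 19550.0, 'sv02': 22105.0}
```

Because the two receivers are close enough to share essentially the same ionospheric, atmospheric and selective-availability delays, the subtraction removes those common errors, leaving only receiver noise and multipath, which differ between units and cannot be corrected this way.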
Accuracy of DGPS
The published accuracy of good-quality DGPS systems is less than 2 m (Table 4.2). Operational accuracy achieved in fieldwork in the Turks and Caicos was 2–4 m.
|Table 4.2 Average errors for a good-quality Trimble DGPS unit (Trimble Navigation Ltd, 1993)|
|Typical error source||Error (m)|
|Selective availability error||0.0|
Cost of DGPS systems
The cost of establishing a DGPS will depend on many variables such as model, computer specifications, the availability of a suitable previously surveyed point to act as the reference base station, and location. The cost of a DGPS to the Department of Environment and Coastal Resources of the Turks and Caicos Islands government was £11,935 in 1993 but would be closer to £2000 today.
Replacement parts can be expensive. For example, the cables connecting the batteries and receiver of the roving unit to the data logger, which typically suffer from high rates of wear and tear in the field, cost US$300 each. A full set of lithium/nickel long-life batteries for the data-logger (necessary if the roving unit is to be operated for several days at a time) costs about US$100. The staff of the Department of Environment and Coastal Resources surveyed their own reference position, which required a high level of technical expertise and took one person twelve days. If such expertise were not available then a technician would have to be imported and paid at commercial rates.
Useful information on costs of current GPS and DGPS receivers can be found at the following URLs: http://www.navtechgps.com/ and http://www.trimble.com/ among others.
The need for accuracy assessment
What is accuracy?
Accuracy is referred to in many different contexts throughout this book. The accuracy of a GPS position fix is a measure of the absolute closeness of that fix to the ‘correct’ coordinates, whereas positional accuracy refers to the accuracy of a geometrically corrected image and is measured with the root mean square error (Chapter 6). This section is concerned with thematic accuracy, that is, the non-positional characteristics of spatial data. If data have been subjected to multispectral classification then thematic accuracy is also known as classification accuracy (Stehman 1997). This accuracy refers to the correspondence between the class label and the ‘true’ class, which is generally defined as what is observed on the ground during field surveys: in other words, how much of the area labelled as seagrass on a classified image is actually seagrass in situ.
There has been a tendency in remote sensing to accept the accuracy of photointerpretation as correct without confirmation. As a result, digital classifications have often been assessed with reference to aerial photographs. While there is nothing wrong with using aerial photographs as surrogate field data, it is important to realise that the assumption that photo-interpretation is without error is rarely valid and serious misclassifications can arise as a consequence (Biging and Congalton 1989).
How accurate should a habitat map be?
It might seem surprising that few guidelines exist on the accuracy requirements of habitat maps for particular coastal management applications. The absence of guidelines may be partly attributable to a widespread paucity of accuracy analyses in habitat mapping projects and may also reflect the unsophisticated manner in which remote sensing outputs have been adopted for coastal management. For example, where habitat maps are used to provide a general inventory of resources as background to a management plan, a thematic accuracy of 60% is probably as useful as 80%. However, more sophisticated applications, such as estimating the loss of seagrass cover due to development of a marina, would require the highest accuracies possible (currently about 90%).
It is unfortunate that many coastal habitat maps have no accuracy assessment (Green et al. 1996), particularly when the accuracies from satellite sensors tend to be low. However, provided that adequate field survey, image processing and accuracy assessments are undertaken, planning activities that depend on coastal habitat maps derived from high-resolution digital airborne scanners such as CASI or Daedalus are likely to be based on more accurate information. The precise advantages of better habitat information are unclear because the biological and economic consequences of making poor management decisions based on misleading information have not been studied. However, managers will have greater confidence in, say, locating representative habitats and nursery habitats for fish and shellfish if more accurate data are available. Examples of thematic accuracy per remote sensing instrument and habitat type are given throughout this book (see Chapters 11–13, 19) but the selection of accuracy requirements remains the user’s dilemma. Due consideration must be given to the final use of output maps (e.g. will they have legal ramifications such as prosecution, land cover statistics, etc.) and the consequences of making mistakes in the map.
Calculation of classification accuracy
Imagine an image that has been classified into just two classes, coral reef and seagrass. It would be a serious mistake to accept this image as 100% accurate because it will contain error from a variety of sources. The similarity of reef and seagrass spectra may have caused some reef to be classified as seagrass and vice versa. Error in the geometric correction applied to the image and in GPS positioning may have resulted in some correctly classified reef pixels being mapped to locations which are actually seagrass in situ (this would be particularly prevalent along the boundaries between the two habitats). A method is needed which quantifies these classification errors by estimating how many reef pixels are in reality seagrass, how many seagrass pixels are reef; hence the reliability, or accuracy, of the classification. There are several complementary methods of conducting this assessment.
Error matrices, user and producer accuracies
An error matrix is a square array of rows and columns in which each row and column represents one habitat category in the classification (in this hypothetical case, reef and seagrass). Each cell contains the number of sampling sites (pixels or groups of pixels) assigned to a particular category. In the example below, two hundred accuracy sites have been collected: one hundred reef and one hundred seagrass. Conventionally the columns represent the reference data and the rows indicate the classification generated from the remotely sensed data.
Table 4.3 Hypothetical error matrix for reef and seagrass (columns: reference data; rows: classification)

| Classification data | Reef | Seagrass | Row total | User accuracy |
|---|---|---|---|---|
| Reef | 85 | 25 | 110 | 85/110 = 77% |
| Seagrass | 15 | 75 | 90 | 75/90 = 83% |
| Column total | 100 | 100 | 200 | |
In this simplified example, 85 of the 100 reef sites have been classified as reef and 15 as seagrass. Similarly, 75 of the 100 seagrass sites have been classified as seagrass and 25 as reef. Of the 110 sites which were classified as reef (the sum of the reef row), only 85 were actually reef. Extrapolating this to the whole image, the probability of a pixel labelled as reef on the classified image actually being reef in situ is 85/110, or 77%. Likewise, the reliability of the seagrass classification is 75/90, or 83%. The probability that a pixel classified on the image actually represents that category in situ is termed the ‘user accuracy’ for that category. The classification however ‘missed’ 15 of the reef sites and 25 of the seagrass. The omission errors for the reef and seagrass classes were therefore 15% and 25% respectively.
The ‘producer accuracy’ is the probability that any pixel in that category has been correctly classified and in this case is 85% for reef pixels and 75% for seagrass. Producer accuracy is of greatest interest to the thematician carrying out the classification, who can claim that 85% of the time an area that was coral reef was classified as such. However, user accuracy is arguably the more pertinent in a management context, because a user of this map will find that, each time an area labelled as coral reef on the map is visited, there is only a 77% probability that it is actually coral reef. In practice, remotely sensed data will usually be classified into more than just two classes; user and producer accuracies may be calculated for each class by using an error matrix.
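The user and producer accuracy calculations above can be sketched in a few lines of Python. This is an illustrative sketch (not part of the original text), using the matrix values from Table 4.3:

```python
# Error matrix from Table 4.3: rows = classification, columns = reference data.
matrix = [
    [85, 25],   # sites classified as reef
    [15, 75],   # sites classified as seagrass
]
labels = ["reef", "seagrass"]

results = {}
for i, label in enumerate(labels):
    row_total = sum(matrix[i])                  # all sites the map labels as this habitat
    col_total = sum(row[i] for row in matrix)   # all reference sites of this habitat
    user = matrix[i][i] / row_total             # probability a mapped pixel is correct in situ
    producer = matrix[i][i] / col_total         # proportion of real sites correctly mapped
    results[label] = (user, producer)
    print(f"{label}: user accuracy = {user:.0%}, producer accuracy = {producer:.0%}")
```

For reef this reproduces the figures in the text: a user accuracy of 85/110 (77%) and a producer accuracy of 85/100 (85%).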
Confusion may arise when reading the remote sensing literature because different authors have used a wide variety of terms for the error matrix and error types. For example, an error matrix has been known as a confusion matrix, a contingency table, an evaluation matrix and a misclassification matrix. Producer accuracy is sometimes called omission error (although, strictly speaking, omission error is 100% minus the producer accuracy). Similarly, user accuracy is sometimes called commission error (again, strictly speaking, commission error is 100% minus the user accuracy). Janssen and van der Wel (1994) provide a useful clarification of these terms in their discussion of accuracy assessment.
It is also desirable to calculate a measure of accuracy for the whole image across all classes, however many there are. The simplest method is to calculate the proportion of pixels correctly classified. This statistic is called the ‘overall accuracy’ and is computed by dividing the sum of the major diagonal of the error matrix by the total number of accuracy sites ((85+75)/200 = 80%). However, the overall accuracy ignores the off-diagonal elements (the omission and commission errors), and different values of overall accuracy cannot be compared easily if a different number of accuracy sites were used to test each classification. Off-diagonal elements are incorporated as a product of the row and column marginal totals in a Kappa analysis (Box 4.1), which can be computed in a standard spreadsheet package. The Kappa coefficient (K) will be less than the overall accuracy unless the classification is exceptionally good (i.e. the number of off-diagonal elements is very low).
Calculation of the Kappa coefficient
Kappa analysis is a discrete multivariate technique used to assess classification accuracy from an error matrix. Kappa analysis generates a Khat statistic or Kappa coefficient that has a possible range from 0 to 1.
$$\hat{K} = \frac{N\sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}{N^{2} - \sum_{i=1}^{r} x_{i+}\,x_{+i}}$$

where r = number of rows in the matrix, xii is the number of observations in row i and column i, xi+ and x+i are the marginal totals of row i and column i respectively, and N is the total number of observations (accuracy sites). For more details see Bishop et al. (1975). K expresses the proportionate reduction in error generated by a classification process compared with the error of a completely random classification. For example, a Kappa coefficient of 0.89 implies that the classification process was avoiding 89% of the errors that a completely random classification would generate (Congalton 1991).
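As an illustrative sketch (not from the source), the overall accuracy and Kappa coefficient of Box 4.1 can be computed for the Table 4.3 matrix as follows:

```python
# Error matrix from Table 4.3: rows = classification, columns = reference data.
matrix = [[85, 25], [15, 75]]
r = len(matrix)                                       # number of habitat classes
N = sum(sum(row) for row in matrix)                   # total accuracy sites (200)
diagonal = sum(matrix[i][i] for i in range(r))        # correctly classified sites
overall = diagonal / N                                # overall accuracy

# Chance-agreement term: sum over classes of (row marginal x column marginal)
chance = sum(sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(r))
kappa = (N * diagonal - chance) / (N**2 - chance)     # Kappa coefficient
print(f"overall accuracy = {overall:.0%}, Kappa = {kappa:.2f}")
```

For this matrix Kappa (0.60) is lower than the overall accuracy (80%), as the text notes it will be unless the off-diagonal errors are very low.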
Ma and Redmond (1995) recommended use of the Tau coefficient (T – Box 4.2) in preference to the Kappa coefficient. The main advantage of Tau is that the coefficient is readily interpretable. For example, a Tau coefficient of 0.80 indicates that 80% more pixels were classified correctly than would be expected by chance alone. The coefficient’s distribution approximates to normality and Z-tests (Box 4.3) can be performed to examine differences between matrices (see Ma and Redmond 1995).
Calculation of the Tau coefficient
$$T = \frac{P_o - P_r}{1 - P_r} \qquad \text{where} \qquad P_r = \frac{1}{N^{2}}\sum_{i=1}^{M} n_i\,x_i$$

Po is the overall accuracy; M is the number of habitats; i is the ith habitat; N is the total number of sites; ni is the row total for habitat i and xi is the diagonal value for habitat i (i.e. number of correct assignments for habitat i).
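A minimal sketch of the Box 4.2 calculation applied to the Table 4.3 matrix (illustrative only; the variable names follow the definitions above):

```python
# Error matrix from Table 4.3: rows = classification, columns = reference data.
matrix = [[85, 25], [15, 75]]
M = len(matrix)                                   # number of habitats
N = sum(sum(row) for row in matrix)               # total number of sites
Po = sum(matrix[i][i] for i in range(M)) / N      # overall accuracy

# Pr: chance agreement weighted by the a priori (row marginal) proportions
Pr = sum(sum(matrix[i]) * matrix[i][i] for i in range(M)) / N**2
tau = (Po - Pr) / (1 - Pr)
print(f"Tau = {tau:.2f}")
```

For this matrix Tau works out at about 0.67, i.e. roughly 67% more pixels were classified correctly than would be expected by chance alone.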
Comparing error matrices with the Tau coefficient
Z-tests between Tau coefficients 1 and 2 (T1 and T2) are conducted using the following equations:

$$Z = \frac{\left| T_1 - T_2 \right|}{\sqrt{\sigma_1^{2} + \sigma_2^{2}}}$$

where σ² is the variance of the Tau coefficient, calculated from:

$$\sigma^{2} = \frac{P_o\,(1 - P_o)}{N\,(1 - P_r)^{2}}$$

[See Box 4.2 for definitions of Po, N and Pr.] For a two-sample comparison, Tau coefficients are significantly different at the 95% confidence level if Z ≥ 1.96.
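The Box 4.3 comparison can be sketched as follows; the first matrix is that of Table 4.3, while the second error matrix is invented purely for illustration:

```python
import math

def tau_and_variance(matrix):
    """Tau coefficient and its variance (Ma and Redmond 1995) for an error matrix."""
    M = len(matrix)
    N = sum(sum(row) for row in matrix)
    Po = sum(matrix[i][i] for i in range(M)) / N              # overall accuracy
    Pr = sum(sum(matrix[i]) * matrix[i][i] for i in range(M)) / N**2
    tau = (Po - Pr) / (1 - Pr)
    var = Po * (1 - Po) / (N * (1 - Pr) ** 2)                 # variance of Tau
    return tau, var

t1, v1 = tau_and_variance([[85, 25], [15, 75]])   # matrix from Table 4.3
t2, v2 = tau_and_variance([[70, 35], [30, 65]])   # hypothetical second classification
z = abs(t1 - t2) / math.sqrt(v1 + v2)
significant = z >= 1.96                           # two-tailed test at the 95% level
print(f"Z = {z:.2f}, significantly different: {significant}")
```

In a study comparing several image processing methods, the matrix with the significantly higher Tau coefficient would indicate the more accurate habitat map.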
In addition to the error matrix, the quality of an image classification can be quantified via an image-based approach which measures the statistical separability of digital numbers that comprise each habitat class mapped (e.g. the difference between mean digital number (DN) values for forereef, seagrass, mangrove etc.). For example, several authors have examined the separability of classes using canonical variates analyses on the image data (Jupp et al. 1986, Kuchler et al. 1989, Ahmad and Neil 1994). Others have used analysis of variance on the DN values that comprise each class (Luczkovich et al. 1993). There is nothing inherently wrong in this, provided that a high separability of image classes is not assumed to be indicative of an accurate habitat map. The advantage of the error matrix approach is the quantification of the accuracy to which each habitat class produced in a classified image is actually found in situ. An accuracy assessment of this type allows the user to make statements such as, ‘I can be 70% confident that the area classified as seagrass on the image is actually seagrass in real life’.
Accuracy assessment is an essential component of a habitat mapping exercise and should be planned at the outset of the study. Map accuracy can be determined using several complementary statistical measures:
- The collective accuracy of the map (i.e. for all habitats) can be described using either overall accuracy, the Kappa coefficient or the Tau coefficient. Of these, the Tau coefficient is arguably the most meaningful but many remote sensing studies use overall accuracy and, therefore, since overall accuracy has become a ‘common currency’ of accuracy assessment, its use is also recommended.
- User and producer accuracies should be calculated for individual habitats.
- The Tau coefficient should be used to test for significant differences between error matrices. For example, if habitat maps have been created using several image processing methods, Z-tests can be performed on the Tau coefficients to determine which map is most accurate.
Ahmad, W., and Neil, D.T., 1994, An evaluation of Landsat Thematic Mapper (TM) digital data for discriminating coral reef zonation: Heron Reef (GBR). International Journal of Remote Sensing, 15, 2583–2597.
Biging, G., and Congalton, R.G., 1989, Advances in forest inventory using advanced digital imagery. Proceedings of Global Natural Research Monitoring and Assessments: Preparing for the 21st Century. Venice, Italy, September 1989, 3, 1241–1249.
Bishop, Y., Fienberg, S., and Holland, P., 1975, Discrete Multivariate Analysis – Theory and Practice (Cambridge, Massachusetts: MIT Press).
Congalton, R.G., 1991, A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment, 37, 35–46.
Green, E.P., Mumby, P.J., Edwards, A.J., and Clark, C.D., 1996, A review of remote sensing for the assessment and management of tropical coastal resources. Coastal Management, 24, 1–40.
Janssen, L.L.F., and van der Wel, F.J.M., 1994, Accuracy assessment of satellite derived land-cover data: a review. Photogrammetric Engineering and Remote Sensing, 60, 419–426.
Jupp, D.L.B., Mayo, K.K., Kuchler, D.A., Heggen, S.J., Kendall, S.W., Radke, B.M., and Ayling, T., 1986, Landsat based interpretation of the Cairns section of the Great Barrier Reef Marine Park. Natural Resources Series No. 4. (Canberra: CSIRO Division of Water and Land Resources).
Kuchler, D., Biña, R.T., and Claasen, D.R., 1989, Status of high-technology remote sensing for mapping and monitoring coral reef environments. Proceedings of the 6th International Coral Reef Symposium, Townsville, 1, 97–101.
Luczkovich, J.J., Wagner, T.W., Michalek, J.L., and Stoffle, R.W., 1993, Discrimination of coral reefs, seagrass meadows, and sand bottom types from space: a Dominican Republic case study. Photogrammetric Engineering and Remote Sensing, 59, 385–389.
Ma, Z., and Redmond, R.L., 1995, Tau coefficients for accuracy assessment of classification of remote sensing data. Photogrammetric Engineering and Remote Sensing, 61, 435–439.
Stehman, S.V., 1997, Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment, 62, 77–89.
Trimble Navigation Limited, 1993, Differential GPS Explained. (Sunnyvale: Trimble Navigation Ltd.).