Introduction

Over the past several years, researchers have increasingly turned to remotely sensed data to improve the accuracy of data sets that describe the geographic distribution of land cover at regional and global scales. To develop improved methodologies for global land cover classifications as well as to provide global land cover products for immediate use in global change research, we have employed the NASA/NOAA Pathfinder Land (PAL) data set with a spatial resolution of 8 km. This data set has a length of record of 14 years (1981-1994), providing the ability to test the stability of classification algorithms. Furthermore, this data set includes red, infrared, and thermal bands in addition to the Normalized Difference Vegetation Index (NDVI). Inclusion of these additional bands improves discrimination between cover types.
We aim through this study to 1) develop methodologies for global land cover classifications that are objective, reproducible, and feasible to implement on data from additional years and 2) produce a global land cover classification at 8 km spatial resolution accessible to the global change research community.


Data

To identify training pixels in the 8 km AVHRR Pathfinder data, we collected a total of over 200 high resolution scenes in which we were confident of the cover type present. Most of the scenes were acquired by the Landsat Multispectral Scanner System (MSS), and a few by the Landsat Thematic Mapper and the Linear Imaging Self-Scanning Sensor (LISS).

Scenes were selected based on the following criteria:

  • The scenes must have minimal cloud cover.
  • The scenes should have been acquired at a time of year when the cover types can be best distinguished, e.g. at the end of the rainy season or during the growing season.
  • The scene should occur within the training area used for our coarse resolution one by one degree land cover classification. The one by one degree training areas were identified as those locations where three coarse resolution global land cover data sets agree that the cover type is present. This criterion determined which cover types should be identified from each scene and provided a degree of confidence in the interpretation of the scene.
  • At least one or two scenes should be located in an area where we do not expect a significant amount of change in land cover to occur since time of acquisition. Scenes near urban areas, for example, were not selected.
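
The agreement rule in the third criterion can be illustrated with a minimal sketch. The grids, class codes, and the function name `agreement_mask` are hypothetical and chosen only for illustration; the three global data sets actually used are not reproduced here.

```python
def agreement_mask(map_a, map_b, map_c, cover_type):
    """Return a grid of booleans: True where all three maps report cover_type."""
    return [
        [a == b == c == cover_type for a, b, c in zip(row_a, row_b, row_c)]
        for row_a, row_b, row_c in zip(map_a, map_b, map_c)
    ]


# Hypothetical 2 x 3 one-degree grids of class codes (1 = forest, 2 = cropland).
map_a = [[1, 1, 2], [2, 1, 1]]
map_b = [[1, 2, 2], [2, 1, 1]]
map_c = [[1, 1, 2], [1, 1, 2]]

# Cells where all three sources agree on forest are candidate training areas.
forest_training = agreement_mask(map_a, map_b, map_c, cover_type=1)
print(forest_training)
```

A cell is kept only under unanimous agreement, which trades coverage for confidence in the label.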

    Of the more than 200 scenes initially collected, we considered 156 to be suitable for interpretation. Scenes were considered unsuitable if haze or poor quality data obscured the scene or if the cover types in the scene could not be visually distinguished.
    For most scenes, we aimed to identify only one cover type within the scene. It was possible, however, to identify more than one cover type in some scenes if croplands were visually identifiable based on the spatial patterns of fields or if vegetation maps showed the presence of clearly identifiable cover types. Appendix 1 lists the cover types identified from each scene. Table 1 lists the definitions of land cover types used in this study.
    Additional information on the use of MSS and 8 km data will be available in the International Journal of Remote Sensing (submitted).


    Methods

    Figure 1 shows a chart of our methods.
    Step 1: rectify the Landsat scene into Goode's projection- This step is described in detail in DeFries et al. (in press). Briefly, it involves reprojecting the scene to a 12 meridian Goode's Interrupted Homolosine equal area projection, the projection of the AVHRR Pathfinder data, using ground control points identified from physical features on 1:250,000 scale maps from the Joint Operations Graphic series produced by the U.S. Defense Mapping Agency. We used ground control points rather than the corner points supplied with the Landsat images because of known inaccuracies in the corner points.
    Step 2: coregister AVHRR 8 km and 1 km data- The second step uses a cross correlation technique to compensate for inaccurate navigation of the 8 km AVHRR data. The technique, also described in more detail in DeFries et al. (in press), is based on the presumption that the optimum coregistration occurs where the correlation between the NDVI values from the two data sources is highest. To avoid the excessive computation time that would be required to correlate the 8 km AVHRR data directly with the Landsat data, we found the highest correlation in NDVI values between the 8 km AVHRR data and the 1 km AVHRR data. Using the offset between the 1 km and 8 km AVHRR data required for the highest correlation, the locational information of the 1 km data, and the locational information of the Landsat data from step 1, we found the offset required to coregister the 8 km AVHRR data with the Landsat data. The offsets obtained by this procedure were then used in step 4. Inaccuracies in the locational information of the 1 km data could account for some of the noise in the training data.
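
The idea behind step 2, choosing the shift that maximizes the NDVI correlation, can be sketched as a brute-force search. This is an illustration under stated assumptions, not the procedure of DeFries et al.: it assumes a coarse pixel can be modelled as the mean of the fine pixels it covers, searches non-negative shifts only, and uses a block size of 2 instead of 8 to keep the synthetic example small. All function and variable names are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def block_means(fine, factor, dy, dx, rows, cols):
    """Average factor x factor blocks of `fine`, starting at fine pixel (dy, dx)."""
    out = []
    for r in range(rows):
        for c in range(cols):
            block = [
                fine[dy + r * factor + i][dx + c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out.append(sum(block) / len(block))
    return out


def best_offset(fine, coarse, factor, max_shift):
    """Return (correlation, dy, dx) for the shift with the highest correlation."""
    rows, cols = len(coarse), len(coarse[0])
    coarse_flat = [v for row in coarse for v in row]
    best = None
    for dy in range(max_shift + 1):
        for dx in range(max_shift + 1):
            r = pearson(block_means(fine, factor, dy, dx, rows, cols), coarse_flat)
            if best is None or r > best[0]:
                best = (r, dy, dx)
    return best


# Synthetic check: build a coarse grid from a known shift, then recover it.
rng = random.Random(0)
FACTOR, ROWS, COLS, MAX_SHIFT = 2, 4, 4, 3
size = ROWS * FACTOR + MAX_SHIFT
fine = [[rng.random() for _ in range(size)] for _ in range(size)]
flat = block_means(fine, FACTOR, 1, 2, ROWS, COLS)
coarse = [flat[r * COLS:(r + 1) * COLS] for r in range(ROWS)]
r, dy, dx = best_offset(fine, coarse, FACTOR, MAX_SHIFT)
print(dy, dx)
```

With real data the correlation peak would be lower and broader than in this noise-free example, so the recovered offset carries some uncertainty.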
    Step 3: delimit areas occupied by the cover type of interest- We first made a decision based on visual analysis as to whether the land cover in the scene is homogeneous or heterogeneous. For example, a large area of agricultural fields or a large patch of continuous forest would be considered homogeneous, whereas small, discontinuous forest patches or clouds would be considered heterogeneous. We consulted local and regional maps, as well as available field knowledge, to aid interpretation of the scene.
    If we considered the scene to be homogeneous, we simply delimited a polygon over the area to be included in the training data set (step 3a in figure 1) for the respective cover type using visual analysis of the suitably contrast-stretched image. For heterogeneous scenes, we carried out a supervised classification with a decision tree classifier on the entire scene (step 3b in figure 1) using the four MSS bands as well as NDVI calculated from MSS bands 2 and 4, as described in DeFries et al. (in press). We visually identified pixels in the scene to be used for training the classifier, and consultation with the ancillary sources gave us a high degree of confidence in the results. For example, we would identify a scene as forest, and then use our ancillary sources to determine which type.
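
As a minimal sketch of the inputs to the step 3b classifier, the following builds a five-element feature vector from the four MSS bands plus NDVI. It assumes the usual definition NDVI = (NIR - red) / (NIR + red), and it assumes that MSS band 2 is the red band and band 4 the near-infrared band (the Landsat 4 and 5 band numbering). The function names and pixel values are hypothetical.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index; 0.0 where both bands are zero."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def features(mss_pixel):
    """Return [band1, band2, band3, band4, NDVI] for one MSS pixel.

    Assumption: band 2 is red and band 4 is near-infrared.
    """
    band1, band2, band3, band4 = mss_pixel
    return [band1, band2, band3, band4, ndvi(red=band2, nir=band4)]


# Hypothetical digital numbers for a vegetated pixel.
print(features([20, 30, 60, 90]))
```

Feature vectors of this form, paired with the visually identified training labels, are what a decision tree classifier would be trained on.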
    Step 4: identify pixels in 8 km AVHRR data- For those scenes where we had traced the cover type, we then overlaid the AVHRR 8 km data and the Landsat data using the geographic offsets determined in step 2. If 100 percent of the pixels from the Landsat data were identified as the respective cover type in the 8 km grid cell, then the 8 km pixel was included in the training data for that cover type (figure 2). For the heterogeneous scenes, the 100 percent criterion was relaxed to 90 percent (figure 3). This meant that we included some local variability, but without this relaxation many cover types would have been represented by a very small number of 8 km pixels. In a few scenes, the criterion was further relaxed to 90 percent in the case of homogeneous scenes (5 out of 61 scenes) and 80 percent in the case of heterogeneous scenes (17 out of 95 scenes) if no pixels could be obtained with the 100 and 90 percent criteria.
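
The threshold rule of step 4, including the fallback to a relaxed criterion when no pixel qualifies, can be sketched as follows. This is an illustration only: the cell identifiers, labels, and the function name `select_training_cells` are hypothetical, and each 8 km cell is represented simply as a non-empty list of the Landsat labels that fall inside it.

```python
def select_training_cells(cell_labels, cover_type, thresholds):
    """Return (threshold_used, cells) for the first threshold that yields any cell.

    cell_labels maps a coarse cell id to the non-empty list of Landsat pixel
    labels inside that cell. thresholds are tried in order, strictest first.
    """
    fractions = {
        cell: labels.count(cover_type) / len(labels)
        for cell, labels in cell_labels.items()
    }
    for threshold in thresholds:
        selected = sorted(c for c, f in fractions.items() if f >= threshold)
        if selected:
            return threshold, selected
    return None, []


# Hypothetical cells with ten Landsat pixels each.
cells = {
    "A": ["forest"] * 10,
    "B": ["forest"] * 9 + ["crop"],
    "C": ["forest"] * 7 + ["crop"] * 3,
}

# Homogeneous (traced) scene: 100 percent, relaxed to 90 percent if needed.
print(select_training_cells(cells, "forest", (1.0, 0.9)))

# Heterogeneous (classified) scene: 90 percent, relaxed to 80 percent if needed.
print(select_training_cells(cells, "forest", (0.9, 0.8)))
```

The relaxed threshold is used only when the strict one returns nothing, mirroring the condition stated above.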
    Step 5: create bitmaps for training areas in the 8 km AVHRR data- Those pixels identified from the procedure outlined in steps 1 through 4 were then included in the training data set.


    Results and Conclusions

    The training data set includes a total of 9,306 pixels in the 8 km AVHRR data (Table 2). The land area covered by these pixels is only about 10 percent of the land area covered by the 156 Landsat scenes, reflecting the heterogeneous nature of much of the earth's land surface at an 8 km spatial resolution. We also applied the same Landsat scenes and procedure to obtain training and validation data for use with the 1 km data set. The 1 km pixels that contained 100 percent of Landsat pixels identified as the respective cover type (for both traced and classified scenes) were included in the training data set. In this case, over 600,000 pixels were identified, indicating the manyfold increase in pixels with homogeneous land cover at this spatial resolution. A similar procedure will be applied to pixels at 250 or 500 m resolution for future use with data from MODIS, to be launched in 1998.
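
The "about 10 percent" figure can be checked roughly with back-of-envelope arithmetic. The sketch below assumes a nominal 8 km x 8 km pixel and a nominal 185 km x 185 km footprint for every scene; both values are assumptions made here, not figures from this study, and the footprint ignores scene overlap, ocean, and the smaller non-MSS scenes.

```python
TRAINING_PIXELS = 9306          # from the text
SCENES = 156                    # from the text
PIXEL_SIDE_KM = 8               # assumed nominal pixel size
SCENE_SIDE_KM = 185             # assumed nominal MSS scene footprint

training_area_km2 = TRAINING_PIXELS * PIXEL_SIDE_KM ** 2
scene_area_km2 = SCENES * SCENE_SIDE_KM ** 2
fraction = training_area_km2 / scene_area_km2

print(training_area_km2, scene_area_km2, round(fraction, 3))
```

Under these assumptions the training pixels cover roughly 11 percent of the scene area, which is consistent with the figure quoted above.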

