Introduction
Over the past several years, researchers have
increasingly turned to remotely sensed data to improve the accuracy of
data sets that describe the geographic distribution of land cover at
regional and global scales. To develop improved methodologies for global
land cover classifications as well as to provide global land cover
products for immediate use in global change research, we have employed the
NASA/NOAA Pathfinder Land (PAL) data set with a spatial resolution of 8
km. This data set has a length of record of 14 years (1981-1994),
providing the ability to test the stability of classification algorithms.
Furthermore, this data set includes red, infrared, and thermal bands in
addition to the Normalized Difference Vegetation Index (NDVI). Inclusion
of these additional bands improves discrimination between cover
types.
We aim through this study to 1) develop methodologies for global land
cover classifications that are objective, reproducible, and feasible to
implement on data from additional years and 2) produce a global land
cover classification at 8 km spatial resolution accessible to the global
change research community.
Data
To identify the pixels to be used for training with the 8 km AVHRR
Pathfinder data, we collected over 200 high resolution scenes for which
we were confident of the cover type present.
Most of the scenes used were acquired by the Landsat Multispectral Scanner
System (MSS), and a few by the Landsat Thematic Mapper and the LISS
(Linear Imaging Self-Scanning Sensor).
Scenes were selected based on the following criteria:
1) The scenes must have minimal cloud cover.
2) The scenes should have been acquired at a time of year when the cover
types can be best distinguished, e.g. at the end of the rainy season or
during the growing season.
3) The scene should occur within the training area used for our coarse
resolution one by one degree land cover classification. The one by one
degree training areas were identified as those locations where three
coarse resolution global land cover data sets agree that the cover type is
present. This criterion determined which cover types should be identified
from each scene and provided a degree of confidence in the interpretation
of the scene.
4) At least one or two scenes should be located in an area where we do
not expect a significant amount of change in land cover to have occurred
since the time of acquisition. Scenes near urban areas, for example, were
not selected.
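The agreement test behind the one by one degree training areas can be sketched in a few lines. This is only an illustration, assuming the three coarse resolution data sets have already been brought onto a common grid and a common legend; the function name, grids, and class labels are hypothetical and not taken from the data sets themselves.

```python
def agreement_mask(map_a, map_b, map_c):
    """Keep a cell's class where all three maps agree; otherwise store None."""
    agreed = []
    for row_a, row_b, row_c in zip(map_a, map_b, map_c):
        agreed.append([
            a if a == b == c else None
            for a, b, c in zip(row_a, row_b, row_c)
        ])
    return agreed


# Hypothetical 2 x 3 grids of one-degree cells with made-up labels.
map_a = [["forest", "crop", "grass"], ["forest", "shrub", "crop"]]
map_b = [["forest", "crop", "shrub"], ["forest", "shrub", "grass"]]
map_c = [["forest", "grass", "grass"], ["forest", "shrub", "crop"]]

training_area = agreement_mask(map_a, map_b, map_c)
```

Cells left as None are places where at least one data set disagrees, so a scene falling there would not satisfy the criterion.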
Of an initial 200 scenes, we considered 156 to be
suitable for interpretation. Scenes
were considered unsuitable if haze or poor quality data obscured the scene
or if the cover types in the
scene could not be visually distinguished.
For most scenes, we aimed to identify only one cover type within the
scene. It was possible, however,
to identify more than one cover type in some scenes if croplands were
visually identifiable based on the
spatial patterns of fields or if vegetation maps showed the presence of
clearly identifiable cover types.
Appendix 1 lists the cover types identified from each scene. Table 1 lists
the definitions of land cover types used in this study.
Additional information on the use of MSS and 8 km data will be available
in the International Journal of Remote Sensing (submitted).
Methods
Figure 1 charts the steps of the method described below.
Step 1: rectify the Landsat scene into Goode's projection- This step
is described in detail in DeFries et al. (in press). Briefly, it involves
reprojecting the scene to a 12 meridian Goode's Interrupted Homolosine
equal area projection, the projection of the AVHRR Pathfinder data, using
ground control points identified from physical features on 1:250,000 scale
maps from the Joint Operations Graphic series produced by the U.S. Defense
Mapping Agency. We used ground control points rather than corner points
supplied with the Landsat images because of known inaccuracies with the
corner points.
Step 2: coregister AVHRR 8 km and 1 km data- The second step involves
a cross correlation technique to compensate for inaccurate navigation of
the 8 km AVHRR data. The technique, also described in more detail in
DeFries et al. (in press), is based on the presumption that the optimum
coregistration occurs where there is the highest correlation between the
NDVI values from the two data sources. To avoid the excessive computation
time that would be required to find the highest correlation in NDVI values
between the 8 km AVHRR data and the Landsat data directly, we instead
correlated the 8 km AVHRR data with 1 km AVHRR data. Using the offset
between the 1 km and 8 km AVHRR data required for the highest correlation,
the locational information of the 1 km data, and the locational
information of the Landsat data from step 1, we found the offset required
to coregister the 8 km AVHRR with the Landsat data. The offsets obtained
by this procedure were then used in step 4. Inaccuracies in the locational
information of the 1 km data could account for some of the noise in the
training data.
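The idea of choosing the offset with the highest NDVI correlation can be illustrated with a brute-force search. The sketch below is not the procedure of DeFries et al. (in press). It assumes both NDVI grids are plain nested lists, that the fine grid is averaged into coarse cells before comparison, and that only non-negative shifts are tried. All function names, the synthetic data, and the block factor of 2 in the demonstration (rather than 8) are hypothetical.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def aggregate(fine, factor, dy, dx, rows, cols):
    """Average the fine grid into rows x cols blocks, starting at shift (dy, dx)."""
    out = []
    for r in range(rows):
        for c in range(cols):
            block = [fine[dy + r * factor + i][dx + c * factor + j]
                     for i in range(factor) for j in range(factor)]
            out.append(mean(block))
    return out


def best_offset(coarse, fine, factor=8, max_shift=8):
    """Return (correlation, dy, dx) for the shift with the highest correlation."""
    rows, cols = len(coarse), len(coarse[0])
    flat = [v for row in coarse for v in row]
    best = None
    for dy in range(max_shift + 1):
        for dx in range(max_shift + 1):
            # Skip shifts that would read past the edge of the fine grid.
            if dy + rows * factor > len(fine) or dx + cols * factor > len(fine[0]):
                continue
            r = pearson(flat, aggregate(fine, factor, dy, dx, rows, cols))
            if best is None or r > best[0]:
                best = (r, dy, dx)
    return best


# Synthetic check: build a coarse grid from a known shift, then recover it.
rng = random.Random(0)
fine = [[rng.random() for _ in range(8)] for _ in range(8)]
true_dy, true_dx = 1, 2
flat = aggregate(fine, 2, true_dy, true_dx, 3, 3)
coarse = [flat[i * 3:(i + 1) * 3] for i in range(3)]

score, dy, dx = best_offset(coarse, fine, factor=2, max_shift=2)
```

With real data the correlation peak would be less sharp than in this synthetic case, which is one reason the offsets can carry some noise into the training data.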
Step 3: delimit areas occupied by the cover type of interest- We first
made a decision based on visual analysis as to whether the land cover in
the scene is homogeneous or heterogeneous. For example, a large area of
agricultural fields or a large patch of continuous forest would be
considered homogeneous, whereas small, discontinuous forest patches or
clouds would be considered heterogeneous. We consulted local and regional
maps, as well as available field knowledge, to aid interpretation of the
scene.
If we considered the scene to be homogeneous, we simply delimited a
polygon over the area to be included in the training data set (step 3a in
Figure 1) for the respective cover type using visual analysis of the
suitably contrast-stretched image. For heterogeneous scenes, we carried
out a supervised classification with a decision tree classifier on the
entire scene (step 3b in Figure 1) using the four MSS bands as well as
NDVI calculated from MSS bands 2 and 4 as described in DeFries et al.
We visually identified pixels in the scene to be used for training the
classifier, and consultation with the ancillary sources gave us a high
degree of confidence in the results. For example, we would identify a
scene as forest, and then use our ancillary sources to determine
which type of forest.
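For readers unfamiliar with the index, the NDVI input to the classifier can be computed as below. This is a minimal sketch, assuming that MSS band 2 is the red band and band 4 the near-infrared band (the band numbering used for Landsat 4 and 5) and that band values are already comparable quantities. The function names and sample values are hypothetical.

```python
def ndvi(red, nir):
    """NDVI = (NIR - red) / (NIR + red); None when both bands are zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def pixel_features(band1, band2, band3, band4):
    """Five classifier inputs: the four MSS bands plus NDVI from bands 2 and 4."""
    return [band1, band2, band3, band4, ndvi(red=band2, nir=band4)]


# Made-up band values for a single vegetated pixel.
features = pixel_features(30, 20, 40, 60)
```

Here the NDVI is (60 - 20) / (60 + 20) = 0.5, a value typical of green vegetation, while bare or sparsely vegetated pixels fall closer to zero.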
Step 4: identify pixels in 8 km AVHRR data- For those scenes where we
had traced the cover type, we then overlaid the AVHRR 8 km data and the
Landsat data using the geographic offsets determined in step 2. If 100
percent of the Landsat pixels within an 8 km grid cell were identified as
the respective cover type, then the 8 km pixel was included in the
training data for that cover type (figure 2). For the heterogeneous
scenes, the 100 percent criterion was relaxed to 90 percent (figure 3).
This relaxation admitted some local variability, but without it many cover
types would have been represented by a very small number of 8 km pixels.
In a few scenes, the criterion was further relaxed, to 90 percent for
homogeneous scenes (5 out of 61 scenes) and to 80 percent for
heterogeneous scenes (17 out of 95 scenes), if no pixels could be obtained
with the 100 and 90 percent criteria.
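The percentage criterion and its fallback can be written down compactly. The sketch below assumes the Landsat interpretation has been reduced to a boolean grid (True where a pixel was identified as the cover type) that is already aligned with the coarse cells. The function names and the tiny 4 x 4 cell size are hypothetical; a real 8 km cell contains far more Landsat pixels.

```python
def pure_cells(mask, cell_size, percent):
    """Return (row, col) of coarse cells where at least `percent` of pixels are True."""
    rows = len(mask) // cell_size
    cols = len(mask[0]) // cell_size
    total = cell_size * cell_size
    selected = []
    for r in range(rows):
        for c in range(cols):
            hits = sum(mask[r * cell_size + i][c * cell_size + j]
                       for i in range(cell_size) for j in range(cell_size))
            # Integer comparison avoids floating point rounding at the threshold.
            if hits * 100 >= percent * total:
                selected.append((r, c))
    return selected


def select_with_relaxation(mask, cell_size, percents):
    """Try each threshold in order and stop at the first that yields any cell."""
    for percent in percents:
        cells = pure_cells(mask, cell_size, percent)
        if cells:
            return percent, cells
    return None, []


# Traced (homogeneous) scene: left cell is 16/16 pure, right cell is 15/16.
traced = [[True] * 8 for _ in range(4)]
traced[0][4] = False
strict = pure_cells(traced, 4, 100)
relaxed = pure_cells(traced, 4, 90)

# Classified (heterogeneous) scene: left cell is 14/16 pure, right cell is empty.
classified = [[True] * 4 + [False] * 4 for _ in range(4)]
classified[0][0] = False
classified[0][1] = False
used_percent, cells = select_with_relaxation(classified, 4, (90, 80))
```

In the second case no cell reaches 90 percent, so the search falls back to 80 percent, mirroring the relaxation applied to the few scenes that otherwise yielded no pixels.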
Step 5: create bitmaps for training areas in the 8 km AVHRR data-
Those pixels identified from the procedure outlined in steps 1 through 4
were then included in the training data set.
Results and Conclusions
The training data set includes a total of 9,306 pixels in the 8 km AVHRR
data (Table 2). The land area covered by these pixels is only about 10
percent of the land
area covered by the 156 Landsat scenes, reflecting the heterogeneous
nature of much of the earth's land surface at an 8 km spatial
resolution. We also applied the same Landsat scenes and procedure to
obtain training and validation data for use with the 1 km data set.
The 1 km pixels in which 100 percent of the Landsat pixels were identified
as the respective cover type (for both traced and classified scenes)
were included in the training data set. In this case, over 600,000
pixels were identified, indicating the manyfold increase in pixels
with homogeneous land cover at this spatial resolution. A similar
procedure will be applied to pixels at 250 or 500 m resolution for
future use with data from MODIS to be launched in 1998.