Ecosystem Model-Data Intercomparison Workshop

Outlier Detection and Flagging for Class B Sites

R.J. Olson, J.M.O. Scurlock, K.J. Johnson, and EMDI Workshop Participants

January 27, 2000


Assigned consistent biome class to Class B sites (2363 records, 1271 sites)
Calculated new NPP ensemble values (8 models)
Calculated new AET ensemble values (4 models)
Performed outlier analysis on NPP and driver data (18 flags)
Excluded managed sites (crops, pasture, plantation and wetlands) (169 sites)
Identified a set of 373 of the 2194 (2363 - 169) sites to be excluded, including:
Recalculated NPP for 959 sites from the 1821 NPP records
Compared revised NPP with model ensemble NPP; the model ensemble averaged 91 gC/m2 higher than observed NPP


The Ecosystem Model-Data Intercomparison (EMDI) Workshop was held December 5-8, 1999 in Durham, New Hampshire with 12 modeling groups participating (see agenda). The EMDI Workshop included a variety of models, including biogeochemical, satellite-driven, detailed process, and DGVM types. Initial results showed general agreement between models and data, but with obvious differences that indicate areas for potential data and model improvement. Much of the workshop was devoted to looking at potential outliers and harmonizing some of the driver data, especially land cover or vegetation types.

Goal - The goal is to produce a consistent set of NPP measurements with associated environmental driver data that can be used for regional model development and validation.

QA Issues - The ORNL group had reviewed the data extensively prior to EMDI by looking at scatter plots and data outside of reasonable limits. However, the Workshop provided the initial model results to compare with the observed NPP data. An ensemble NPP value was calculated for each site as the average of the 12 Class A (old class 1-2) and 8 Class B (old Class 3) models, including AVIM, CARAIB, CENTURY, GLO-PEM, IBIS, PnET, STOMATE, and VECODE. In addition, an ensemble AET (actual evapotranspiration) value was calculated based on the average of AET provided by four of the models. The EMDI workshop also provided an opportunity to review the classification of the sites, compare observed to predicted NPP, and look at relationships between variables. The specific issues that we hope to address are:

Land cover / Biome Class Consistency - This review was prompted in part by the realization of problems in using the satellite-derived land cover for each site (i.e., often this represented the dominant land cover for a 1x1 km area, not the typical 1x1 m to 1 hectare NPP measurement site).
Managed sites - In addition, modelers decided to flag and exclude likely heavily managed sites and wetlands from the EMDI comparison.
Multiple NPP values for a site - Some sites have up to 35 observed NPP values, often from several vegetation types. We assume this is often a result of reporting imprecise latitude/longitude coordinates.
Coordinates - Coordinates for sites are reported in a mix of formats, from whole degrees for a set of Russian sites to 4 decimal places for GPS registered sites. Some of the climate data and model outputs were provided with coordinates rounded off to 2 decimal places. Although we had distributed data with coordinates up to 4 decimal places, we rounded all sites to 2 decimal places for consistency.
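
The effect of rounding coordinates on site identity can be illustrated with a short Python sketch. This is not the ORNL processing code; the record layout, the example values, and the names site_key and group_by_site are assumptions made only for this illustration.

```python
from collections import defaultdict

def site_key(lat, lon, places=2):
    """Return a site identifier built from coordinates rounded to `places` decimals."""
    return (round(lat, places), round(lon, places))

def group_by_site(records):
    """Group (lat, lon, biome, npp) records by their rounded coordinates.

    Returns a dict mapping each rounded (lat, lon) pair to the list of
    (biome, npp) observations that share it.
    """
    groups = defaultdict(list)
    for lat, lon, biome, npp in records:
        groups[site_key(lat, lon)].append((biome, npp))
    return dict(groups)

# Hypothetical records: two plots reported to 4 decimal places that merge
# after rounding, and one site reported only to whole degrees.
records = [
    (43.1342, -70.9338, "mixed forest", 610.0),
    (43.1311, -70.9302, "DBL forest / temperate", 580.0),
    (56.0, 92.0, "ENL forest / boreal", 320.0),
]
sites = group_by_site(records)
```

In this made-up example the first two records collapse into a single site carrying two vegetation types, which is the kind of case described under "Multiple NPP values for a site".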

The power of the statistical-empirical approach is that we can look for patterns within similar groups (i.e., biomes) and look for relationships between variables. Even if we could review the original literature reference associated with each study, we would not pick up the potential outliers that were found in this process.

Biomes - The outlier process started by reviewing the biome designation for all sites. Twenty-one classes were defined at the EMDI workshop to represent the data and needs of the models. They were assigned based on initial biome class, subbiome, species, vegetation type and evaluation by Jonathan Scurlock, Peter Thornton, Mac Post, Bill Parton, Steve DeGrosa and others. Four classes were assigned to heavily managed sites or sites typically not addressed by regional models: crops, pasture, plantations, and wetlands. These sites will be flagged and generally excluded from the EMDI exercises. The 21 biome classes were grouped into 12 classes based on the number of sites to ensure there were enough data (at least 30-40 sites) within each biome to conduct the outlier detection described below.


BIOMENEW                  Total   Aggregated (BIOME2)
*crops                       14   managed
*pasture                     17   managed
*plantation                  27   managed
*wetland                     46   managed
DBL forest / boreal          43   boreal
DBL forest / temperate      233   DBL forest / temperate
DBL forest / tropical        17   DBL forest / tropical
desert                       26   desert
DNL forest / boreal          29   boreal
EBL forest / temperate      250   EBL forest / temperate
EBL forest / tropical       102   EBL forest / tropical
ENL forest / boreal         117   ENL forest / boreal
ENL forest / temperate      210   ENL forest / temperate
grassland / C3               41   grassland
grassland / C4 temperate     18   grassland
grassland / C4 tropical      32   grassland
mediterranean                12   Savanna
mixed forest                 49   mixed forest
Savanna / temperate           1   Savanna
Savanna / tropical            8   Savanna
tundra                       24   tundra
Grand Total                1317


At the EMDI Workshop, we decided to rename the groups to Class A (old Class 1 & 2), Class B (old Class 3) and Class C (old Regional cells). This analysis covers Class B sites. A similar approach will be used with Class A and Class C sites. We may also use the biome-level information derived from Class B sites for comparison with the sites in the other classes.

Class B Sites

The EMDI NPP and driver data were reviewed to flag potential outliers based on criteria to identify unrepresentative sites or potential errors. Each variable was reviewed independently and then in combination with other variables. At the EMDI Workshop, the QA checking was restricted to average NPP at 918 Class B (old Class 3) sites, those for which a complete set of predictions from 7 models was available. In this phase, we reviewed all 2363 individual NPP data values that had model predictions from at least 3 models. The final step will be to calculate average NPP for all unique site-biome combinations using the subset of NPP records that pass our flagging criteria. Initially there were 1663 Class B sites; however, this set was reduced to 1271 unique sites with valid driver data. We expect that approximately 1000 sites will remain after the exclusion of outliers.
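
As an illustration of how an ensemble value can be formed when not every model reports a prediction for a record, a minimal Python sketch follows. It assumes the ensemble is a simple unweighted mean of whatever predictions are available, with the "at least 3 models" requirement applied as a cutoff; the function name ensemble_npp and the example values are hypothetical.

```python
from statistics import mean

MIN_MODELS = 3  # records were reviewed when at least 3 models gave predictions

def ensemble_npp(predictions, min_models=MIN_MODELS):
    """Average the available model NPP predictions (gC/m2) for one record.

    `predictions` maps model name -> predicted NPP, with None for a model
    that gave no value. Returns None when fewer than `min_models` remain.
    """
    available = [v for v in predictions.values() if v is not None]
    if len(available) < min_models:
        return None
    return mean(available)

# Hypothetical predictions for a single record (values are made up).
example = {"CENTURY": 520.0, "IBIS": 610.0, "PnET": None, "VECODE": 580.0}
ensemble_value = ensemble_npp(example)

# A record with only two predictions does not get an ensemble value.
too_few = ensemble_npp({"CENTURY": 520.0, "IBIS": 610.0, "PnET": None})
```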

The approach included the following tests to set flags:

  1. Flags based on values outside of reasonable limits:

    ANPP_C > 2000 gC/m2, BNPP_C > 2000 gC/m2, TNPP_C > 3000 gC/m2

    ELEV > 2500 m (high elevation sites were expected to comprise unrepresentative sub-biomes or present problems with the climate extrapolation algorithm)

    ANPP > 0.95 TNPP, BNPP > 0.95 TNPP (for a site with both above- and below-ground components)

  2. Flags based on questionable values. <We didn’t include this in our set of flags but are aware that the Russian sites should be flagged and there may be other sites.>


  3. Flags based on NPP values outside of the .05-.95 percentiles for each biome calculated assuming a normal distribution of variables (this replaces our initial rule of using 2 standard deviations):


  4. Flags based on climate values outside of the .01-.99 percentiles for each biome calculated assuming a normal distribution of variables. In addition, we set some specific limits for some biomes:

    Temp, Precip

    Temp > 6 C for boreal forests

    Precip > 1000 mm for desert and tundra

    Precip < 1000 mm for tropical forest

  5. Flags based on inconsistencies at a site, such as the precipitation reported for the site differing from the precipitation derived from global climate data. These flags were based on the Normalized Error (NE), calculated as the difference (predicted - observed) divided by the average (predicted + observed)/2. Based on the frequency distribution of the ratios, ratios greater than 1.0 were flagged.

    Elevation, Precipitation, Temperature

  6. Flags based on the comparison of measured NPP versus modeled NPP using the average or ensemble value for all available models. The comparison was based on bias (predicted - observed), Normalized Error (NE), calculated as (predicted - observed) divided by the average (predicted + observed)/2, and Mean Absolute Error (MAE), calculated as (predicted - observed) divided by the observed. Based on the frequency distributions, bias greater than ±1000 gC/m2, NE ratios greater than ±1.0, and MAE ratios greater than ±5 were flagged.

    NPP vs MODEL ensemble

  7. Flags based on relationships between variables. Linear regression analysis was performed between NPP and average AET (from 4 models), NPP and precipitation, and NPP and temperature. Points falling outside of the .95 Confidence Interval about the regression line were flagged.
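
Two of the tests above, the percentile limits under an assumed normal distribution (tests 3 and 4) and the Normalized Error (tests 5 and 6), can be sketched in Python as follows. This is an illustration under stated assumptions, not the SAS code used for the analysis: the function names are hypothetical, the sample standard deviation is used, and the NE threshold is applied to the magnitude of the ratio.

```python
from statistics import NormalDist, mean, stdev

def normal_limits(values, lower=0.05, upper=0.95):
    """Return (low, high) limits from a normal distribution fitted to `values`.

    Needs at least two values with nonzero spread; otherwise the statistics
    module raises StatisticsError.
    """
    dist = NormalDist(mean(values), stdev(values))
    return dist.inv_cdf(lower), dist.inv_cdf(upper)

def outside_limits(value, limits):
    """True when `value` falls below the low limit or above the high limit."""
    low, high = limits
    return value < low or value > high

def normalized_error(predicted, observed):
    """(predicted - observed) divided by the average of the two values."""
    average = (predicted + observed) / 2
    if average == 0:
        raise ValueError("normalized error is undefined when the average is zero")
    return (predicted - observed) / average

def ne_flag(predicted, observed, threshold=1.0):
    """Flag when the magnitude of the normalized error exceeds `threshold`."""
    return abs(normalized_error(predicted, observed)) > threshold

# Hypothetical NPP values (gC/m2) for one biome.
biome_npp = [100.0, 200.0, 300.0, 400.0, 500.0]
npp_limits = normal_limits(biome_npp)            # .05-.95 limits, as in test 3
climate_limits = normal_limits(biome_npp, 0.01, 0.99)  # .01-.99, as in test 4
```

Under the normality assumption, the .05-.95 limits amount to roughly the mean plus or minus 1.645 standard deviations, somewhat narrower than the 2 standard deviation rule they replace. The NE ratio becomes unstable when the average of the two values is near zero, which may matter for temperature in degrees C.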

Critical flags

We assume that sites that have multiple flags have inconsistencies between NPP, driver data, and model predictions, and are therefore more likely to be outliers. This seemed to hold true in plots (see below). We also know that some flags deserve more weight as indicators of potential problems. We designated these critical flags by assigning a value of 10 to the corresponding checks, including NPP inconsistent with the model NPP ensemble value, elevation > 2500 m or < -100 m, and either site temperature or precipitation inconsistent with that assigned based on the global climate data. We assigned a flag of 100 to the heavily managed sites for easy identification.

The overall flag value was calculated as the sum of the individual flags. Sites with a sum of 100 or greater (managed biomes) were dropped, and those with a sum of 5 or greater (a critical flag or at least five noncritical flags) were considered candidates for exclusion.

  1. Drop managed sites (169 or 7% of the total)
  2. Select sites with no flags set or less than 5 noncritical flags (or exclude those with a critical flag set or five or more noncritical flags)
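
The selection rule can be sketched in Python using the flag values listed in the SAS flag table in this report. The function names and the disposition labels are hypothetical, and the thresholds are read as "100 or more" for managed sites and "5 or more" for potential outliers, which is our interpretation of the two steps above.

```python
# Flag values as listed in the flag table: 100 for managed biomes,
# 10 for critical checks, 1 for all other checks.
FLAG_WEIGHTS = {
    "BIOME_F": 100,
    "ELEV_MXF": 10, "PREC_F": 10, "TAVE_F": 10, "MOD_F": 10,
    "ANPP_P5F": 1, "BNPP_P5F": 1, "TNPP_P5F": 1, "NPP_P5F": 1,
    "ANPPBADF": 1, "BNPPBADF": 1, "TNPPBADF": 1,
    "AET_RGF": 1, "PREC_RGF": 1, "TAVE_RGF": 1,
    "ELEV_F": 1, "PRECBADF": 1, "TAVEBADF": 1,
}

def flag_sum(flags_set):
    """Sum the values of the flags that are set for one record."""
    return sum(FLAG_WEIGHTS[name] for name in flags_set)

def disposition(flags_set):
    """Classify a record from its flag sum (labels are illustrative only)."""
    total = flag_sum(flags_set)
    if total >= 100:
        return "managed"            # step 1: drop managed sites
    if total >= 5:
        return "potential outlier"  # a critical flag, or 5+ noncritical flags
    return "keep"                   # no flags, or fewer than 5 noncritical flags

# Hypothetical records.
print(disposition({"BIOME_F", "ANPP_P5F"}))
print(disposition({"MOD_F"}))
print(disposition({"ANPP_P5F", "AET_RGF"}))
```

Because critical flags carry a value of 10, a single critical flag is enough to push the sum past the threshold of 5, while noncritical flags have to accumulate.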


A total of 169 of the 2363 records were dropped because they are managed sites. A further 373 appeared to be outliers. We are in the process of reviewing this list. Virtually all the sites identified at the EMDI workshop as outliers were included in this set. If the EMDI group as a whole accepts these as outliers and we recalculate site and biome averages, we would have a total of 1821 records for 951 sites.

Estimating Total NPP

For those sites lacking a value for TNPP, we calculated biome-specific ratios to estimate NPP_EST from ANPP or BNPP. We will recalculate these ratios based on the new biome classification and recalculate NPP_EST using data with outliers excluded.
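
One way such a ratio-based estimate could work is sketched below in Python for the ANPP case. The report does not state how the ratios were formed, so the use of the mean of per-site TNPP/ANPP ratios is an assumption, as are the function names, the record layout, and the example values. An estimate from BNPP would follow the same pattern.

```python
from collections import defaultdict
from statistics import mean

def biome_ratios(sites):
    """Mean TNPP/ANPP ratio per biome, from sites that report both values."""
    ratios = defaultdict(list)
    for site in sites:
        if site.get("tnpp") is not None and site.get("anpp"):
            ratios[site["biome"]].append(site["tnpp"] / site["anpp"])
    return {biome: mean(values) for biome, values in ratios.items()}

def estimate_npp(site, ratios):
    """Return measured TNPP when present, else ANPP scaled by the biome ratio."""
    if site.get("tnpp") is not None:
        return site["tnpp"]
    ratio = ratios.get(site["biome"])
    if site.get("anpp") is not None and ratio is not None:
        return site["anpp"] * ratio
    return None  # no basis for an estimate

# Hypothetical sites (gC/m2); the last one lacks TNPP.
sites = [
    {"biome": "grassland", "anpp": 300.0, "tnpp": 600.0},
    {"biome": "grassland", "anpp": 200.0, "tnpp": 500.0},
    {"biome": "grassland", "anpp": 400.0, "tnpp": None},
]
ratios = biome_ratios(sites)
npp_est = estimate_npp(sites[2], ratios)
```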

Plots and Charts

We plotted observed NPP against the modeled NPP ensemble, the AET ensemble, and latitude, with the plot symbols indicating the magnitude of the flags. In general, the points that had high flag values appeared to be on the fringe of the cluster of points. As shown below, the differences between the old and revised NPP included both increases and decreases, and generally grew larger at higher NPP levels. There was a strong tendency for the recalculated model ensemble to be higher than observed NPP at low NPP and lower than observed NPP at high NPP. Bar charts are presented for the new biome groups.

The SAS variable names and labels for the flags are:

Flag name   Value   Flag label
BIOME_F       100   crops, pasture, plantations, wetlands**
ELEV_MXF       10   elev > 2500, elev < -100*
PREC_F         10   (prec - prec_ann) / ave(prec) > 1*
TAVE_F         10   (tave - temp_ann) / ave(tave) > 1*
MOD_F          10   (npp_est - modcb_av) / ave(npp) > 1*
ANPP_P5F        1   anpp outside .05-.95 percentile by biome2
BNPP_P5F        1   bnpp outside .05-.95 percentile by biome2
TNPP_P5F        1   tnpp outside .05-.95 percentile by biome2
NPP_P5F         1   npp outside .05-.95 percentile by biome2
ANPPBADF        1   anpp > 2000 gC/m2, anpp > 95% tnpp
BNPPBADF        1   bnpp > 2000 gC/m2, bnpp > 95% tnpp
TNPPBADF        1   tnpp > 3000 gC/m2, anppbadf or bnppbadf
AET_RGF         1   NPP outside .95 CI of NPP=a+b*AET regression
PREC_RGF        1   NPP outside .95 CI of NPP=a+b*PREC regression
TAVE_RGF        1   NPP outside .95 CI of NPP=a+b*TAVE regression
ELEV_F          1   (elev_giv - elev_dem) / ave(elev) > 1
PRECBADF        1   prec outside .01-.99 percentile by biome2 (or > 1000 mm in desert)
TAVEBADF        1   temp outside .01-.99 percentile by biome2 (> 6 C in boreal forests)