

Application of the El Niño-Southern Oscillation CLImatology and PERsistence (CLIPER) Forecasting Scheme

contributed by John Knaff1 and Christopher Landsea2

1Department of Atmospheric Science, Colorado State University, Fort Collins, Colorado

2NOAA/AOML/Hurricane Research Division, Miami, Florida



INTRODUCTION

To provide a more stringent test for skill in seasonal ENSO forecasting, a multiple regression technique has been fashioned that takes best advantage of CLImatology, PERsistence and trend of initial conditions -- the ENSO-CLIPER (see Knaff and Landsea [1997]). This new model is presented as a replacement for pure persistence for determining a skill threshold for ENSO forecast models. We then redefine "skill" in ENSO prediction as the ability to show significant improvements over the forecast capability of ENSO-CLIPER, rather than just persistence.

This statistical prediction method is based entirely on the optimal combination of persistence, month-to-month trend of initial conditions, and climatology. The selection of predictors is by design intended to avoid any pretense of predictive ability based on "model physics" and the like, but rather to specify the optimal "no-skill" forecast as a baseline comparison for more sophisticated forecast methods. Multiple least squares regression is employed to test a total of 14 possible predictors for the selection of the best predictors, based on 1950-1994 developmental data. Zero to four predictors were chosen for each of 12 regression models--one for each initial calendar month. The predictands to be forecast include the Southern Oscillation (pressure) Index (SOI) and the Niño 1+2, Niño 3, Niño 4 and Niño 3.4 SST indices for the equatorial eastern and central Pacific at lead times ranging from zero seasons (0-2 months) through seven seasons (21-23 months). Though hindcast ability is strongly seasonally dependent, substantial improvement is achieved over simple persistence wherein largest gains occur for two to seven season (6 to 23 months) lead times. The ENSO-CLIPER model thus not only offers a baseline "no-skill" forecast of ENSO variability, but a practical forecast based upon the CLIPER premise.

It is hoped that ENSO-CLIPER will replace, or at least supplement, the current use of persistence as a baseline of skill against which other ENSO forecast models can be judged. Using only persistence as a baseline of skill is overly simplistic. By optimally utilizing the available climatology and persistence information as is, we are able to construct a more stringent "no-skill" test for comparison. As recognized in Barnston et al. (1994), true forecast skill cannot be judged until an adequate sample of real-time predictions has been run. A copy of Knaff and Landsea (1997) as well as future monthly ENSO-CLIPER forecasts are available at the Web site: http://tropical.atmos.colostate.edu/~knaff. In addition, the program to run ENSO-CLIPER is available upon request.

METHODOLOGY

The ENSO-CLIPER predictive model uses multiple linear least-squares regression, with predictors selected by the method of leaps and bounds (IMSL 1987). This routine considers every possible combination of the predictors, eventually finding the best multiple regression equation having one, two, three, ... predictors. Prospective predictors were retained only if they achieved a significance level beyond 95% using a t-test and increased the total variance explained by at least 2.5%. When no predictor met these two criteria, no ENSO-CLIPER forecast equation was obtained and a zero anomaly (climatology) forecast was made. Other restrictions on the regression predictors, designed to avoid overfitting, are discussed below.
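The screening described above can be sketched roughly in Python. This is not the ENSO-CLIPER program: it uses a brute-force search over all subsets as a stand-in for the IMSL leaps-and-bounds routine, and it assumes one particular reading of the two retention criteria (every coefficient in the best subset of a given size must pass the t-test, and that subset must raise R-squared by at least 0.025 over the best smaller subset). The function names, the use of NumPy, and the fixed critical value of 2.02 (roughly the two-sided 95% t value for about 40 degrees of freedom) are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np

T_CRIT = 2.02       # assumed: ~two-sided 95% t value for ~40 degrees of freedom
MIN_GAIN = 0.025    # an added predictor must raise R^2 by at least 2.5%
MAX_PREDICTORS = 4  # the article reports at most four predictors per equation


def fit_ols(X, y):
    """Least-squares fit with intercept; returns (R^2, t statistics of slopes)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    cov = (sse / (n - k - 1)) * np.linalg.inv(A.T @ A)
    t_stats = coef[1:] / np.sqrt(np.diag(cov)[1:])
    return 1.0 - sse / sst, t_stats


def select_predictors(X, y):
    """Return column indices of retained predictors; () means forecast climatology."""
    chosen, best_r2 = (), 0.0
    for k in range(1, min(MAX_PREDICTORS, X.shape[1]) + 1):
        # Best subset of size k by R^2, found by exhaustive search.
        r2, subset = max(
            (fit_ols(X[:, list(s)], y)[0], s)
            for s in combinations(range(X.shape[1]), k)
        )
        _, t_stats = fit_ols(X[:, list(subset)], y)
        # Stop growing the equation once either criterion fails.
        if r2 - best_r2 < MIN_GAIN or np.any(np.abs(t_stats) < T_CRIT):
            break
        chosen, best_r2 = subset, r2
    return chosen
```

With 14 candidate predictors and at most four retained, exhaustive search is cheap (fewer than 1500 subsets per equation), so the brute-force version is adequate for illustration even though leaps and bounds prunes the search more cleverly.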

The SST indices and SOI are forecast at leads of 0 to 7 seasons. All forecasts are made for three-month target prediction intervals, and a separate forecast is made for each individual monthly initiation time. Here we follow the nomenclature of Barnston and Ropelewski (1992), wherein zero lead indicates predictions for the next immediately upcoming month (their Fig. 5). A limit of two years lead time (i.e., 7 seasons) reflects the fact that hindcast ability becomes negligible beyond 7 seasons lead.
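To make the lead convention concrete, the short Python sketch below maps an initialization month and a lead (in seasons) to the three-month target period. It assumes, as stated elsewhere in this article, that the 0-season lead target begins in the initialization month and that each additional season of lead shifts the target by three months. The function name and return format are illustrative only.

```python
MONTH_LETTERS = "JFMAMJJASOND"


def target_season(init_year, init_month, lead):
    """Return (label, start_year) of the three-month target period.

    init_month is 1-12 for a forecast issued on the first of that month;
    lead is 0-7 seasons, each lead step shifting the target by three months.
    """
    if not 1 <= init_month <= 12:
        raise ValueError("init_month must be between 1 and 12")
    if not 0 <= lead <= 7:
        raise ValueError("lead must be between 0 and 7 seasons")
    start = (init_month - 1) + 3 * lead  # months since January of init_year
    year = init_year + start // 12
    label = "".join(MONTH_LETTERS[(start + i) % 12] for i in range(3))
    return label, year
```

For a 1 June 1997 initialization, target_season(1997, 6, 0) gives JJA 1997 and target_season(1997, 6, 7) gives MAM 1999. For a season that spans the new year, such as DJF, the returned year is that of the first month.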

A pool of 14 predictors was available to the regression scheme. Each regression had the choice of 1, 3 or 5 month averages of initial predictor anomalies for each predictand, and similar choices for the trend of the initial conditions (1, 3 or 5 month differences between average anomalies). Similarly, the regression considered the three month initial conditions and trend of the other four predictands.
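The count of 14 follows from the description above: six candidates built from the predictand itself (three averaging windows and three trend windows) plus two three-month candidates from each of the other four indices. A small Python sketch of that bookkeeping, with purely illustrative names and ASCII spellings of the index names:

```python
INDICES = ["Nino 1+2", "Nino 3", "Nino 4", "Nino 3.4", "SOI"]
WINDOWS = (1, 3, 5)  # averaging / differencing lengths in months


def predictor_pool(predictand):
    """List the 14 candidate predictors as (index, kind, window_months) tuples."""
    if predictand not in INDICES:
        raise ValueError(f"unknown predictand: {predictand}")
    pool = []
    # Six candidates built from the predictand itself.
    for kind in ("initial", "trend"):
        pool += [(predictand, kind, w) for w in WINDOWS]
    # Eight candidates from the other four indices (three-month window only).
    for other in INDICES:
        if other != predictand:
            pool += [(other, "initial", 3), (other, "trend", 3)]
    return pool
```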

The following additional criterion was imposed on the regression to inhibit predictor selection beyond meaningful significance: For any given predictand, the regression was permitted to retain only one initial condition predictor and only one initial trend predictor, where these predictors consist of the predictand itself at an earlier time (e.g. exploiting persistence, antipersistence, etc.). This restriction minimizes multicollinearity among the predictors, which can create apparent hindcast ability (Aczel 1989) that often does not hold up on independent data. The variety of initial conditions and trends of the predictand allows flexibility in handling a strong annual cycle of persistence. Rather than manually selecting the highest persistence and trend time periods, we allowed the regression model to perform the selection, subject to the above criterion. If no predictors are found, which is occasionally the case at longer leads, climatology is forecast.

All skill results from the dependent sample multiple regression are degraded to reflect what should be expected in independent (future) forecasts. This alteration of both the variance explained and the RMSE is performed following the methodology of Davis (1979) and Shapiro (1984).
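The specific adjustment of Davis (1979) and Shapiro (1984) is not reproduced in this article. As a generic stand-in only, the textbook adjusted R-squared shows the kind of penalty involved: the variance explained on the dependent sample is reduced according to the number of fitted predictors relative to the sample size. The Python sketch below is that textbook formula, not necessarily the one used for ENSO-CLIPER, and the floor at zero is an added assumption.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Textbook adjusted R^2, floored at zero (no skill)."""
    if n_samples - n_predictors - 1 <= 0:
        raise ValueError("need more samples than fitted parameters")
    adj = 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
    return max(adj, 0.0)
```

For example, a dependent-sample R-squared of 0.50 from two predictors and 43 cases becomes 1 - 0.50 x 42/40 = 0.475 under this formula.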

Five separate predictands (Niño 1+2, Niño 3, Niño 4 and Niño 3.4 SST indices and the SOI), 8 different forecast periods (zero to seven seasons lead) and 12 initial starting times (1 January, 1 February, ..., 1 December) yield a total of 480 (5 x 8 x 12) regression relationships which were examined. An equation for each was developed using the 1950-1994 data, which provided a sample of 43 hindcast data points.



SKILL RESULTS

Of the total 480 possible regression equations, 411 met the first two criteria of having a predictor significant at the 95 percent confidence level and a 2.5 percent increase in hindcast ability, providing non-negligible forecast ability (i.e. significantly greater than a linear correlation coefficient of zero). These were based on one to four predictors, with most equations containing two to three predictors.

The equations were tested in the hindcast mode on 1950-1994 data. Comparisons of the five predictands reveal that the Niño 3.4 and Niño 4 regions have, in general, the most proficient hindcasts at all lead times. Figure 1 shows the independent forecast results for the Niño 3.4 region from 1993 to the present for leads zero through four seasons. The skill results through the lead 2 time period are rather encouraging, while forecasts at longer leads damp toward climatological values to minimize squared errors. Also interesting are the persistent forecasts at longer leads for warm conditions to develop during the Northern Hemisphere summer of 1997 through winter 1997/1998.

Table 1 shows the adjusted r-squared skills estimated for this model from the complete set of 1950-94 historical data, specifically for forecasts with a start time of 1 June. The model does well through 2 seasons lead for all regions, with skill falling off to zero in and around the highly transient spring target period. Minimal skill returns again at the longest leads.











Table 1. Skill in terms of R-squared (x100) for the various Niño SST regions and the SOI at leads of 0 through 7 seasons. The 0-season lead refers to a target (predictand) period that begins at the time of the forecast.

Lead (seasons)    0    1    2    3    4    5    6    7
Niño 4           83   72   58   42    0    0   15   16
Niño 3.4         74   61   60   27    0   17   21   12
Niño 3           72   52   53   11   12   13   15    0
Niño 1+2         82   48   30    0    0    0   10    0
SOI              58   58   31   22    0    8   10    8
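For readers who want to work with these numbers, the Python sketch below stores Table 1 in a dictionary and provides two small helpers: one converts an entry to the equivalent correlation magnitude, and one finds the first lead at which the tabulated skill drops to zero. The values are copied from the table; the names and the ASCII spellings of the index names are illustrative.

```python
import math

TABLE_1 = {  # R^2 x 100 at leads 0-7 seasons, copied from Table 1 (1 June start)
    "Nino 4":   [83, 72, 58, 42, 0, 0, 15, 16],
    "Nino 3.4": [74, 61, 60, 27, 0, 17, 21, 12],
    "Nino 3":   [72, 52, 53, 11, 12, 13, 15, 0],
    "Nino 1+2": [82, 48, 30, 0, 0, 0, 10, 0],
    "SOI":      [58, 58, 31, 22, 0, 8, 10, 8],
}


def correlation(r2_x100):
    """Convert a Table 1 entry (R^2 x 100) to a correlation magnitude."""
    return math.sqrt(r2_x100 / 100.0)


def first_zero_lead(index):
    """First lead (in seasons) at which the tabulated skill is zero, or None."""
    for lead, value in enumerate(TABLE_1[index]):
        if value == 0:
            return lead
    return None
```

For a 1 June start, a lead of 3 seasons targets Mar-May and a lead of 4 targets Jun-Aug, so the zero entries at leads 3 and 4 correspond to the spring target period mentioned in the text.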








FORECASTS FOR JJA 1997 TO MAM 1999

Employing the chosen predictors on a 1 June 1997 initialization date yields forecasts for Jun-Aug 1997 (0 season lead) out through Mar-May 1999 (7 season lead), shown in Fig. 2. The adjusted RMSE values are also plotted as vertical bars, indicating the degree of uncertainty associated with each forecast. The RMSE varies both as a function of the ENSO-CLIPER model forecast ability and the annual cycle of the variability of the predictand. All predictands are suggested to be heading toward a moderate to strong ENSO warm phase (an El Niño event) in the upcoming 1 to 2 seasons, with a peak of around 2.5°C for Niño 3.4 in Dec 1997 to Feb 1998. The warm phase is suggested to end by Jun-Aug 1998, remaining near neutral out to early 1999. This 1997-98 El Niño event has been consistently predicted by ENSO-CLIPER all the way back to forecasts initialized in early 1996.

The major components of the model at the 1 June forecast time for the key regions of the central Pacific (i.e. Niño 3 and Niño 3.4) are the initial conditions in Niño 3.4 and Niño 4 as well as the trend since winter 1996-97 in the Niño 4 region. If Niño 4, Niño 3.4 and Niño 3 are warm and Niño 4 has been trending up since winter, as is the case this year, warm conditions are predicted for leads 0 through 3 seasons. At longer leads these same factors are weighted in an opposite fashion (i.e. cooling is expected at longer leads). For full details, the code as well as the coefficients for this simple model are available upon request.

If this El Niño event occurs, as it appears to be doing, then other statistical and numerical ENSO forecast models must outperform ENSO-CLIPER (i.e. be closer to the observed anomalies than ENSO-CLIPER's forecast) in order to claim to have a "skillful" prediction of the event, rather than just surpassing a forecast based upon persistence of the SST anomalies.
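One common way to express "closer to the observed anomalies than the baseline" over a set of forecasts is a mean-square-error skill score measured against the baseline. The Python sketch below is a generic version of that idea, with ENSO-CLIPER forecasts playing the role of the baseline; the function names are illustrative, and this particular score is an assumption rather than a measure prescribed by this article.

```python
import math


def rmse(forecasts, observations):
    """Root-mean-square error of paired forecasts and observations."""
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecasts, observations)) / len(forecasts)
    )


def skill_vs_baseline(model, baseline, observations):
    """MSE skill score: 1 is perfect, 0 matches the baseline, below 0 is worse."""
    mse_model = rmse(model, observations) ** 2
    mse_base = rmse(baseline, observations) ** 2
    if mse_base == 0:
        raise ValueError("baseline is perfect; skill score undefined")
    return 1.0 - mse_model / mse_base
```

Under this score a model is "skillful" in the sense used here only if skill_vs_baseline(model, cliper, observed) is positive, where cliper holds the ENSO-CLIPER forecasts for the same targets.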

Acknowledgments: The authors wish to thank William Gray, Tony Barnston, John Sheaffer, Dave Enfield, Dennis Mayer, Barb Brumit, Amie Hedstrom, Bill Thorson and Rick Taft for all their help and comments concerning this work. The lead author is being supported by NOAA under contract NA37RJ0202 (William Gray, PI) with supplemental support given by NSF under contract ATM-9417563 (William Gray, PI). The second author was funded through the 1995-96 NOAA Postdoctoral Program in Climate and Global Change.

Aczel, A. D., 1989: Complete Business Statistics. Richard D. Irwin, Inc., 1056 pp.

Barnston, A. G. and C. F. Ropelewski, 1992: Prediction of ENSO episodes using canonical correlation analysis. J. Climate, 5, 1316-1345.

Barnston, A. G., H. M. Van den Dool, S. E. Zebiak, T. P. Barnett, M. Ji, D. R. Rodenhuis, M. A. Cane, A. Leetmaa, N. E. Graham, C. F. Ropelewski, V. E. Kousky, E. A. O'Lenic, and R. E. Livezey, 1994: Long-lead seasonal forecasts - Where do we stand? Bull. Amer. Meteor. Soc., 75, 2097-2114.

Davis, R. E., 1979: A search for short range climate predictability. Dyn. Atmos. Oceans, 3, 485-497.

IMSL, 1987: FORTRAN subroutines for statistical analysis. International Mathematical & Statistical FORTRAN Library, 1232 pp.

Knaff, J. A. and C. W. Landsea, 1997: An El Niño-Southern Oscillation CLImatology and PERsistence (CLIPER) Forecasting Scheme. Wea. Forecasting, 12, in press.

Shapiro, L. J., 1984: Sampling errors in statistical models of tropical cyclone motion: A comparison of predictor screening and EOF techniques. Mon. Wea. Rev., 112, 1378-1388.

Fig. 1a, b, c, d (four separate links). Time series of independent forecasts of Niño 3.4 SST anomalies (base period 1950-1979) at leads zero through four seasons, beginning in 1993 and ending 1 May of 1997.

Fig. 2a, b, c, d, e (five separate links). Forecast of Niño 4, Niño 3.4, Niño 3, Niño 1+2, and SOI using data available through the end of May 1997. Forecasts are valid for Jun-Aug (JJA) 1997, Sep-Nov (SON) 1997, Dec-Feb (DJF) 1997-98, Mar-May (MAM) 1998, JJA 1998, SON 1998, DJF 1998-99, and MAM 1999. Actual numerical forecast values for these times are shown on each plot along with estimated RMSE bars. These anomalies are based on a 1950-1979 mean.


