Beginning in 1999, CPC's long-lead temperature and precipitation probability forecasts
were issued in a more explicit format as "probability of exceedance" graphs that
map the probability forecasts onto actual temperature or precipitation amounts. The
regional units used for this product have been 102 climate divisions, of approximately
equal area, that cover the lower 48 U.S. states. The climate division data were derived
from the smaller divisions that NCDC (in Asheville, NC) maintains, and are considered
to be relatively "cleaner" than the constituent individual station data. The large
geographic size of the 102 divisions is thought not to be a problem for long-range 3-month
mean climate forecasts, because anomalies on this time scale usually appear on a broad
spatial scale, such that only 7 to 15 anomaly pockets appear across the U.S.

However, the divisional forecasts have a weakness: they do not accurately represent individual localities within the divisions. The temperature and precipitation amounts forecast for a climate division represent a summary, or average, over the whole division, and do not take the final step of indicating the implication for the stations within the division. Station amounts may deviate significantly from the divisional average because of geographical factors (elevation, proximity to a water body, or terrain features) or anthropogenic factors (urban heating). An additional extrapolation is needed to fill this gap, and has now been done for temperature for a set of 2114 stations. The implications for degree days, the ultimate concern for energy users, have also been provided. This requires a determination of local temperatures first, followed by a look-up of the best-guess degree day equivalents for the given 3-month period. The degree day implications are related to the temperature extrapolations in a very simple way (i.e., linearly) over a considerable temperature range, becoming a less sensitive function of temperature when the average temperature comes within 18 degrees of 65 F from below (for heating degree days, HDDs), or within 12 degrees of 65 F from above (for cooling degree days, CDDs).
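
The linear range and its flattening near 65 F can be illustrated with a minimal Python sketch. It assumes the usual degree-day base of 65 F and a purely hypothetical set of daily departures from the seasonal mean (up to 18 degrees either side). It is not the look-up procedure used for the product; the function names and the departures are illustrative only.

```python
BASE_F = 65.0  # degree-day base temperature (degrees F), as in the text

# Hypothetical daily departures from the seasonal mean (degrees F).
# These are illustrative only and are not taken from any station record.
DAILY_DEPARTURES = [-18.0, -9.0, 0.0, 9.0, 18.0]


def heating_degree_days(mean_temp_f, departures=DAILY_DEPARTURES):
    """Sum the daily heating degree days for a given seasonal mean."""
    return sum(max(0.0, BASE_F - (mean_temp_f + d)) for d in departures)


def cooling_degree_days(mean_temp_f, departures=DAILY_DEPARTURES):
    """Sum the daily cooling degree days for a given seasonal mean."""
    return sum(max(0.0, (mean_temp_f + d) - BASE_F) for d in departures)


# Well below 65 F every day contributes, so the total is linear in the mean.
hdd_40 = heating_degree_days(40.0)
hdd_45 = heating_degree_days(45.0)

# Close to 65 F some days drop out, so the total responds less steeply.
hdd_60 = heating_degree_days(60.0)
```

With these assumed departures, every day contributes while the mean stays at least 18 degrees below 65 F, so the total changes by a fixed amount per degree. Closer to 65 F the warmest days contribute nothing, and the total becomes less sensitive to the mean.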

The extrapolation from climate division temperature forecasts to station temperature forecasts uses simple linear regression equations. Such an approach is well suited to the problem, because the correlations between station and divisional temperature are found to be high (usually over 0.90, and often over 0.95). This is the case even for very large urban stations such as Chicago O'Hare, LaGuardia or Baltimore City (Custom House), where the bias may be substantial but the relationship with the climate division is stable and reliable. Stations having weaker relationships with their embedding climate division (correlations below 0.85) are those located in unique geographic settings, such as San Francisco or Key West, where major factors determining the station climate do not enter into the climate of the larger surrounding region as strongly. Correlations are generally higher in winter than in summer because the large-scale weather systems of winter usually affect all portions of a climate division approximately equally, while in summer the weather is less broadly determined, and thus more locally determined ("noisy").

The regression equations were derived using station and climate division data from 1983 to 1997. This somewhat short period was used in order to capture a recent relationship, given that stations and urban areas may change over time. It is expected that this period will be updated to a more recent 15-year period every few years. The equations include both an adjustment for the difference in the means (the constant term) and the ratio of the variabilities (contributing to the coefficient). The correlation between the division and the station also contributes to the coefficient, but only weakly: the square root of the correlation is used instead of the correlation itself. This is done because there is no question about the physical reason for the correlation between them, so that statistical guesswork plays less of a role than would be the case if the reason for the correlation were unknown. A user may omit the correlation-based contribution to the coefficient (which is shown explicitly) if desired. The regression equations can be accessed at xxx.xxx.xxx.xxx
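
Written out symbolically, the equations described above appear to take the following form. The symbols are introduced here for illustration and do not appear in the product itself: F_d is the forecast temperature for the climate division, \bar{T}_d and \bar{T}_s are the 1983-97 mean temperatures of the division and the station, \sigma_s / \sigma_d is the ratio of station to division variability, and r is the correlation between the two.

```latex
\hat{T}_s \;=\; \sqrt{r}\,\bigl(F_d - \bar{T}_d\bigr)\,\frac{\sigma_s}{\sigma_d} \;+\; \bar{T}_s
```

Omitting the correlation-based contribution corresponds to replacing \sqrt{r} with 1.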

How to Use the Regression Equations

The equations are fairly simple to use. There is a set of individual stations for each of the 48 states of the conterminous U.S. (Alaska and Hawaii are not included here because at CPC they are not presently viewed in terms of the large climate divisions.) Each station's extrapolation is determined from either of two climate divisions; users may choose which they prefer, or may use both and combine the results as they wish. Perhaps illustration by example is the best way to proceed. Let us look at Chicago O'Hare for the 3-month period of Dec-Jan-Feb. The state of Illinois should be selected, and "CHICAGO O'HARE" found. The first three lines shown for O'Hare are:

Station# 522 CHICAGO O'HARE AP , il 1154900 lat/lon 87.90 41.98 94M

seasDJF 0.98(F 25 - 22.99)(0.99) + 26.00 0.97(F 24 - 28.76)(1.19) + 26.00

seasJFM 0.98(F 25 - 26.87)(0.95) + 29.45 0.96(F 24 - 32.37)(1.05) + 29.45

The first line gives the station information: the station's number out of the total of 2114, the station name and state abbreviation (in lower case), the coop station number as assigned by NCDC, the latitude and longitude in degrees N and degrees W (in the O'Hare example the longitude, 87.90, is printed ahead of the latitude, 41.98), and the number of months missing between 1951 and 1997. In the case of Chicago O'Hare all months up to October 1958 (94 months) are missing because the station did not open for observations until November 1958. Stations other than Chicago O'Hare were included in this product only if they had 42 or fewer months missing; O'Hare was included despite exceeding this limit because it is so critical for the weather derivatives community. Missing months were filled in by estimating their values based on the observations of neighboring stations.
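
For readers who want to process the station files programmatically, the header line can be split into its fields along the following lines. This is a minimal Python sketch whose pattern is inferred from the single O'Hare line shown above. Other station lines may be spaced or formatted differently, and the function and field names are hypothetical.

```python
import re

# Pattern inferred from the one example line in the text (an assumption);
# it may need adjusting for other stations.
HEADER_RE = re.compile(
    r"Station#\s*(?P<index>\d+)\s+(?P<name>.+?)\s*,\s*(?P<state>[a-z]{2})\s+"
    r"(?P<coop>\d+)\s+lat/lon\s+(?P<coord1>[\d.]+)\s+(?P<coord2>[\d.]+)\s+"
    r"(?P<missing>\d+)M"
)


def parse_station_header(line):
    """Return the fields of a station header line as a dictionary."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a station header: %r" % line)
    return {
        "index": int(match.group("index")),        # number out of 2114
        "name": match.group("name"),
        "state": match.group("state"),
        "coop_id": match.group("coop"),            # kept as text
        # In the O'Hare example the first value is the longitude (deg W)
        # and the second is the latitude (deg N), despite the label order.
        "lon_w": float(match.group("coord1")),
        "lat_n": float(match.group("coord2")),
        "months_missing": int(match.group("missing")),
    }


header = parse_station_header(
    "Station# 522 CHICAGO O'HARE AP , il 1154900 lat/lon 87.90 41.98 94M"
)
```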

Following the first line, 12 lines appear. Each line pertains to one of the 12 overlapping 3-month seasons being forecast. The first two of the 12 are shown above. Let us consider the DJF season. After the season label, two mathematical expressions appear. The one on the left side uses the most closely related climate division, and the one on the right side uses the second most closely related division. These divisions usually, but not always, are located nearest to the station. Their identity is shown following the "F" in the equations. For DJF, climate division 25 is most closely related to O'Hare, and division 24 is the next best division for O'Hare. Suppose we concentrate on division 25. The "F 25" refers to the forecast temperature for DJF for climate division 25. One might first want to use the "point forecast" for division 25 shown in the upper left corner of its probability of exceedance graph. Suppose, just as an example, that the point forecast is 23.50. The corresponding temperature for O'Hare would then be computed as 0.98(23.50-22.99)(0.99) + 26.00. This turns out to give 0.49 + 26.00, or 26.49 degrees F for O'Hare. As additional computations, one might want to use the end points of the 50% confidence interval, or the 90% confidence interval, for climate division 25's forecast, shown on the upper right side of the probability of exceedance graph. Or perhaps the tercile boundaries for O'Hare for 1961-90 would be desired, in which case the boundaries for division 25 could be used as inputs to the equation.
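
The calculation just described can also be done mechanically. The Python sketch below reads a season line, extracts the two expressions, and evaluates one of them for a given divisional forecast. The pattern is inferred from the two example lines shown for O'Hare and may not cover every line in the files; the function and field names are hypothetical.

```python
import re

# One expression of the form 0.98(F 25 - 22.99)(0.99) + 26.00,
# inferred from the example lines (an assumption).
TERM_RE = re.compile(
    r"(?P<sqrt_corr>\d*\.\d+)\(F\s*(?P<division>\d+)\s*-\s*"
    r"(?P<div_mean>-?\d+\.\d+)\)\((?P<sd_ratio>\d*\.\d+)\)\s*\+\s*"
    r"(?P<station_mean>-?\d+\.\d+)"
)


def parse_season_line(line):
    """Return the season label and the list of expressions on the line."""
    season = line.split()[0].replace("seas", "", 1)
    terms = []
    for m in TERM_RE.finditer(line):
        terms.append({
            "sqrt_corr": float(m.group("sqrt_corr")),
            "division": int(m.group("division")),
            "div_mean": float(m.group("div_mean")),
            "sd_ratio": float(m.group("sd_ratio")),
            "station_mean": float(m.group("station_mean")),
        })
    return season, terms


def station_forecast(term, division_forecast):
    """Evaluate one expression for a given divisional forecast (deg F)."""
    anomaly = division_forecast - term["div_mean"]
    return term["sqrt_corr"] * anomaly * term["sd_ratio"] + term["station_mean"]


season, terms = parse_season_line(
    "seasDJF 0.98(F 25 - 22.99)(0.99) + 26.00 0.97(F 24 - 28.76)(1.19) + 26.00"
)
# Division 25 with the example point forecast of 23.50 used in the text.
ohare_djf = station_forecast(terms[0], 23.50)
```

The same function can be applied to the end points of a confidence interval or to tercile boundaries by passing those divisional values in place of the point forecast.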

Let us look at the mechanics of the regression equation more closely for a better understanding of how it works. In the calculation for O'Hare just given as an example, 22.99 is the average temperature over the 1983-97 period for climate division 25, and 26.00 is the corresponding average for Chicago O'Hare Airport. In other words, O'Hare averaged 3.01 degrees warmer than climate division 25, and this is simply added onto the forecast for the climate division when the climate division is forecast to be at its mean value of 22.99. When the division is forecast to be away from its mean, the difference from its mean is multiplied by 0.99 before the difference is added to O'Hare's mean. This is because O'Hare's deviations from its 15-year average have been very slightly smaller than those of climate division 25. (Note: In estimating the year-to-year variability, or standard deviation, more than 15 years of data were used in order to get a better estimate; and changes in the 15-year mean were not permitted to contribute to the result.) The remaining factor, 0.98, is the square root of the correlation between the climate division and the station. It damps the anomaly of the forecast for the station because of the slight uncertainty of the correspondence. Users may wish to disregard this factor.
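
The pieces of the equation can be separated as in the following minimal Python sketch, which also shows the effect of disregarding the damping factor. The function name, argument names and the `use_damping` switch are hypothetical; the numbers are those of the O'Hare DJF example, with the illustrative point forecast of 23.50.

```python
def breakdown(division_forecast, div_mean, station_mean, sd_ratio,
              sqrt_corr, use_damping=True):
    """Split a station forecast into the components described in the text."""
    mean_offset = station_mean - div_mean           # station minus division mean
    division_anomaly = division_forecast - div_mean
    station_anomaly = division_anomaly * sd_ratio   # rescaled by variability
    if use_damping:
        station_anomaly *= sqrt_corr                # damped by sqrt(correlation)
    return {
        "mean_offset": mean_offset,
        "division_anomaly": division_anomaly,
        "station_anomaly": station_anomaly,
        "forecast": station_mean + station_anomaly,
    }


# O'Hare, DJF, climate division 25, example point forecast of 23.50.
damped = breakdown(23.50, 22.99, 26.00, 0.99, 0.98)
undamped = breakdown(23.50, 22.99, 26.00, 0.99, 0.98, use_damping=False)
```

In this example the damping factor changes the station forecast by only about a hundredth of a degree, which reflects how close the correlation is to 1.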