Data Vintages and Measuring Forecast Model Performance
John C. Robertson and Ellis W. Tallman
Economic Review, Vol. 83, No. 4, 1998
The data on economic variables are usually estimates, and these estimates may be revised many times after their initial publication. Most historical forecast evaluation exercises use the "latest available" or most recently revised vintage of historical data when constructing the forecasts—that is, they use estimates that may well have been unavailable to a forecaster in real time. Evaluations using such data could thus give a misleading picture of the forecast performance that can be expected in real-time situations. This fact is particularly relevant if a forecasting model's performance is to be compared with that of published real-time forecasts. One practical question is whether actually using the data set available to a forecaster in real time would lead to inferences that are substantially different from those made using the latest available vintage of data. A related question is whether it matters which vintage of data the forecasts are evaluated against.
The authors argue that the choice of data vintage can have both a quantitative and a qualitative influence on forecast and model comparisons, at least over short horizons. This influence is illustrated by examining the performance of the composite index of leading indicators as a predictor of alternative measures of real output. However, more research is required to determine whether the results generalize to forecasts of other series that are subject to revision, such as the various monetary aggregates.
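The question of which vintage forecasts are evaluated against can be made concrete with a small sketch. The Python example below is not taken from the article: every number in it is invented for illustration, and the names `rmse`, `rank_models`, `first_release`, `latest_vintage`, `model_a`, and `model_b` are hypothetical. It scores two sets of forecasts by root mean squared error, once against initially published estimates and once against revised estimates, and shows that the ranking of the two models can reverse. It does not address the separate question of which vintage is used to construct the forecasts.

```python
import math


def rmse(forecasts, actuals):
    """Root mean squared error between two equal-length sequences."""
    if len(forecasts) != len(actuals):
        raise ValueError("sequences must have the same length")
    squared_errors = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rank_models(models, actuals):
    """Return model names ordered from lowest to highest RMSE, plus the scores."""
    scores = {name: rmse(values, actuals) for name, values in models.items()}
    return sorted(scores, key=scores.get), scores


# Hypothetical quarterly output growth rates (percent); invented numbers,
# not data from the article.
first_release = [2.0, 1.5, 3.0, 2.5]    # estimates as initially published
latest_vintage = [2.6, 1.1, 3.4, 2.0]   # the same quarters after revision

# Hypothetical forecasts from two competing models for those quarters.
models = {
    "model_a": [2.1, 1.6, 2.9, 2.4],
    "model_b": [2.5, 1.2, 3.3, 2.1],
}

order_first, scores_first = rank_models(models, first_release)
order_latest, scores_latest = rank_models(models, latest_vintage)

print("Ranked against first release: ", order_first, scores_first)
print("Ranked against latest vintage:", order_latest, scores_latest)
```

With these made-up numbers, `model_a` has the lower error against the first-release figures while `model_b` has the lower error against the revised figures, so the conclusion about which model forecasts better depends on the vintage chosen as the benchmark.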