Macroeconomic model comparisons and forecast competitions
by Volker Wieland and Maik Wolters
via Mark Thoma
We use two small micro-founded New Keynesian models, two medium-size state-of-the-art New Keynesian business-cycle models – often referred to as DSGE models – and for comparison purposes an earlier-generation New Keynesian model (also with rational expectations and nominal rigidities but less strict microeconomic foundations) and a Bayesian VAR model.
Note that one of the models has nothing to do with micro foundations, and three are not DSGE models. To jump ahead, I note there is no comparison of the performance of these totally different approaches in the rest of their post.
Given this failure to predict the recession and its length and depth, the widespread criticism of the state of economic forecasting before and during the financial crisis applies to business forecasting experts as well as modern and older macroeconomic models. ... over purely model-based forecasts, were not able to predict the Great Recession either. Thus, there is no reason to single out DSGE models, and favour more traditional Keynesian-style models that may still be more popular among business experts. In particular, Paul Krugman’s proposal to rely on such models for policy analysis in the financial crisis and disregard three decades of economic research is misplaced.
I think Wieland and Wolters are totally, completely, utterly unfair to Krugman when they hold him responsible for the forecasts of professional forecasters. Krugman is responsible for Krugman's forecasts. He didn't give a numerical prediction for GDP, but he did predict in fall 2008 that there wouldn't be a quick recovery. I think Krugman outperformed all of the models and all of the professional forecasters. Concluding that Krugman was wrong based on their data is bizarre.
Note the gross category error of saying that Krugman's recommendation "is" misplaced (an unqualified statement in the indicative) because of something which professional forecasters "may" do. What is the chance that each professional forecaster does what W and W guess they may do? If some do and some don't, then the average professional forecast cannot be used to evaluate the forecasting performance of more traditional Keynesian approaches.
Finally, see below that W and W must concede that professional forecasters do better on average than their models, some of which have nothing to do with DSGE.
The model forecasts are on average less accurate than the mean SPF forecasts (see Wieland and Wolters 2011 for detailed results). ...
Computing the mean forecast of all models we obtain a robust forecast that is close to the accuracy of the forecast from the best model.
Note that they do not explain which model performs best. Since the models are as different as models can be, this is a shocking omission.
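The comparison I am asking for is not hard to report. Here is a minimal sketch of it, assuming a root mean squared error criterion. The numbers are made up purely for illustration and are not taken from Wieland and Wolters, and the names `model_a`, `model_b` and `model_c` are placeholders rather than their models.

```python
from math import sqrt


def rmse(forecasts, actuals):
    """Root mean squared error of a forecast series against realised values."""
    return sqrt(sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals))


# Made-up numbers for illustration only; NOT data from Wieland and Wolters.
actuals = [2.0, 1.0, -1.0, -3.0]
forecasts = {
    "model_a": [2.5, 1.5, 0.5, -1.0],
    "model_b": [1.0, 0.0, -0.5, -2.0],
    "model_c": [3.0, 2.5, 1.5, 0.5],
}

# Equal-weight mean forecast across models, period by period.
mean_forecast = [sum(vals) / len(vals) for vals in zip(*forecasts.values())]

# One accuracy number per model, plus one for the mean forecast.
per_model_rmse = {name: rmse(f, actuals) for name, f in forecasts.items()}
best_model = min(per_model_rmse, key=per_model_rmse.get)
mean_rmse = rmse(mean_forecast, actuals)

for name, err in sorted(per_model_rmse.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {err:.2f}")
print(f"mean forecast: RMSE = {mean_rmse:.2f} (best single model: {best_model})")
```

A table like this, with one row per model, would show at a glance whether the best model is a DSGE model, the earlier-generation model or the Bayesian VAR.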
Conditioning the model forecasts on the nowcast of professional forecasters (reported in the paper) can further increase the accuracy of model-based forecasts. Overall, model-based forecasts still exhibit somewhat greater errors than expert forecasts, but this difference is surprisingly small considering that the models only take into account few economic variables and incorporate theoretical restrictions that are essential for evaluations of the impact of alternative policies but often considered a hindrance for effective forecasting.
Professional forecasters do not set a very high standard. It is very easy to improve the forecasts of most professional forecasters using no theory and almost no data (see Ehrbeck and Waldmann (1996), Quarterly Journal of Economics, vol. 111, pp. 21–40; also note Solferino and Waldmann (2010), "Predicting the signs of forecast errors," Journal of Forecasting, vol. 29(5), pp. 476-485).
I am shocked that W and W treat DSGE models and Bayesian VARs as part of a uniform class of "models" and argue that the fact that they perform only slightly worse than professional forecasters is evidence that DSGE models are better than old Keynesian models. There is no similarity between Bayesian VARs and DSGE models. I do not believe that this gross conflation is the result of carelessness.
Does any reader of this post believe they would have presented "models" as a homogeneous group if DSGE models outperformed the less rigorously micro-founded new Keynesian models or if the theory-influenced models outperformed Bayesian VARs?
It is very odd that they consider a narrow set of conditioning variables and a small set of estimated parameters to be an unambiguous handicap. It is well known that richly parametrised models tend to have poor out-of-sample forecasting performance. The original idea of restricting models with theory was that it would give better forecasts, not that rigor, of course, hampers forecasting.
Finally, I think that W and W propose ignoring the past few decades of macroeconomic empirical research, which has shown again and again that theory-based macro models can only fit patterns in the data if they are massaged ex post. The pattern they attempt to fit is a simple hump-shaped impulse response function. The pathetic failure of the models is shocking -- only to someone who hasn't been paying attention for the past few decades.
There is something else which I type with some reluctance.
"For each forecast we re-estimate all five models using exactly the data as they were available for professional forecasters when they submitted their forecasts to the SPF. Using these historical data vintages is crucial to ensure comparability to historical forecasts by professionals."
Now the legend for figure 1: "Solid black line shows annualised quarterly output growth (real-time data vintage until forecast starting point and revised data afterwards), grey lines show forecasts from the SPF, green line shows mean forecast from the SPF, red lines show model forecasts conditional on the mean nowcast from the SPF."
Click the link and look at figure 1. The numbers for 2008Q4 and 2009Q1 in figure 1 should be "revised."
On July 29, 2011, the BEA released revised estimates for GDP, including GDP in 2008 and 2009.
The revisions did not change the timing of the contraction. The overall pattern of quarterly changes during the downturn was similar in both the revised and previously published estimates, though the revised estimates show larger decreases for 2008:Q4 (-8.9 percent compared with -6.8 percent) and for 2009:Q1 (-6.7 percent compared with -4.9 percent). The contributions of specific GDP components to the contraction were similar in both the revised and previously published estimates. (See the briefing on results of the 2011 NIPA annual revision.)
Figure 1 should show revised GDP growth for 2008Q4. It shows a contraction at an annualized rate on the order of 6%. It should show a contraction at an annualized rate of 8.9 percent. The data in the figure do not correspond to the legend -- they are incorrect.
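The arithmetic behind this complaint can be checked in a few lines. This is only a sketch: the two vintages are the numbers quoted from the BEA above, while the -6.0 is my eyeball reading of figure 1 ("on the order of 6%"), not a number reported by W and W.

```python
# Annualised quarterly GDP growth (percent), as quoted from the BEA text above.
previously_published = {"2008Q4": -6.8, "2009Q1": -4.9}
revised_2011 = {"2008Q4": -8.9, "2009Q1": -6.7}

# Approximate value read off figure 1 for 2008Q4; an assumption, not a reported number.
figure_reading = {"2008Q4": -6.0}

# Size of the July 2011 revision in percentage points (revised minus previously published).
revisions = {q: round(revised_2011[q] - previously_published[q], 1) for q in revised_2011}


def closest_vintage(quarter, value):
    """Return which vintage a plotted value is closer to for the given quarter."""
    gaps = {
        "previously published": abs(value - previously_published[quarter]),
        "revised (July 2011)": abs(value - revised_2011[quarter]),
    }
    return min(gaps, key=gaps.get)


for quarter, change in revisions.items():
    print(f"{quarter}: revision of {change} percentage points")
print("Figure 1 value for 2008Q4 is closest to:",
      closest_vintage("2008Q4", figure_reading["2008Q4"]))
```

Under that reading, the plotted 2008Q4 value sits much closer to the previously published estimate than to the revised one.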
Now look at figure 2. The contraction rate for 2008Q4 shown in the second panel of Figure 2 should have been available in 2009Q2, as the "nowcast" corresponds to 2009Q2. The number is very similar to those shown in figure 1 and the first panel of figure 2, which should be revised. There was a massive revision of this number made in 2011. Again, the figures do not fulfill the promise made in the legend to figure 1.
The data presented in the figures are not the current official estimates of the GDP growth to be forecast. They give no hint of revisions long after the fact. The analysis is incorrect.