JOURNAL OF MONETARY ECONOMICS Aims and Scope: The Journal of Monetary Economics publishes important research contributions to a wide range of modern macroeconomic topics including work along empirical, methodological and theoretical lines. In recent years, these topics have been: asset pricing; banking, credit and financial markets; behavioral macroeconomics; business cycle analysis; consumption, labor supply, and saving; dynamic equilibria (theory and computational methods); economic growth and development; expectation formation, information and aggregate economic activity; fiscal shocks and fiscal policies; expectation formation; forecasting, macroeconometrics, and time series analysis; information and aggregate economic activity; international trade, exchange rates, and open economy macroeconomics; labor markets; macroeconomic data and history; monetary policy; monetary theory; money demand and money supply behavior; optimal contracting and economic activity; productivity measurement and theory; pricing in product markets and labor markets; and real investment (inventories, fixed, human capital). The Journal has eight regular issues per year, with the Carnegie-Rochester Conference Series on Public Policy as the January and July issues. Founding Editors: KARL BRUNNER and CHARLES I. PLOSSER Editor: ROBERT G. KING, Department of Economics, Boston University, 270 Bay State Road, Boston, MA 02215, USA Co-Editors: Urban Jermann, University of Pennsylvania; Ricardo Reis, Columbia University Senior Associate Editors: MARIANNE BAXTER, Boston University; JANICE EBERLY, Northwestern University; SERGIO REBELO, Northwestern University; STEPHEN WILLIAMSON, University of Washington Associate Editors: KLAUS ADAM, University of Mannheim; GEORGE ALESSANDRIA, Research department of Federal Reserve Bank, Philadelphia; YONGSUNG CHANG, University of Rochester; MARIO CRUCINI, Vanderbilt University; HUBERTO ENNIS, Federal Reserve Bank of Richmond; KRISTOPHER GERARDI, Federal Reserve Bank of Atlanta; FRANCOIS GOURIO, Boston University; REFET GURKAYNAK, Bilkent University, Turkey; JONATHAN HEATHCOTE, Federal Reserve Bank of Minneapolis; ERIK HURST, University of Chicago; RICARDO LAGOS, New York University; EDWARD NELSON, Federal Reserve Board; GIORGIO PRIMICERI, Northwestern University; ESTEBAN ROSSI-HANSBERG, Princeton University; PIERRE-DANIEL SARTE, Federal Reserve Bank of Richmond; FRANK SCHORFHEIDE, University of Pennsylvania; CHRISTOPHER SLEET, Carnegie-Mellon University; SILVANA TENREYRO, London School of Economics; ANTONELLA TRIGARI, Università Bocconi; ADRIEN VERDELHAN, Massachusetts Institute of Technology; ALEXANDER WOLMAN, Federal Reserve Bank of Richmond; JONATHAN WRIGHT, Johns Hopkins University, Baltimore CRC Editors: THOMAS F. COOLEY, New York University; MARVIN GOODFRIEND, Carnegie Mellon University CRC Advisory Board: ANDREW ABEL, University of Pennsylvania; MARK AGUIAR, University of Rochester; MARK BILS, University of Rochester; YONGSUNG CHANG, University of Rochester; HAROLD COLE, University of Pennsylvania; JANICE EBERLY, Northwestern University; BURTON HOLLIFIELD, Carnegie Mellon University; BENNETT T. McCALLUM, Carnegie Mellon University; THOMAS PHILIPPON, New York University; CHARLES I. PLOSSER, Federal Reserve Bank of Philadelphia; CHRISTOPHER SLEET, Carnegie Mellon University; GIANLUCA VIOLANTE, New York University; TONI WHITED, University of Rochester; STANLEY E. 
ZIN, New York University Submission fee: There is a submission fee of US$250 for all unsolicited manuscripts submitted for publication. There is a reduced fee for full-time students (US$150). To encourage quicker response referees will be paid a nominal fee and the submission fee will be used to cover these refereeing expenses. There are no page charges. Cheques should be made payable to the Journal of Monetary Economics. When a paper is accepted the fee will be reimbursed. Publication information: Journal of Monetary Economics (ISSN 0304-3932). For 2011, volume 58 (8issues) is scheduled for publication by Elsevier (Radarweg 29, 1043 NX Amsterdam, the Netherlands). Further information on this journal is available from the Publisher or from the Elsevier Customer Service Department nearest you or from this journal’s website (http://www.elsevier.com/locate/jme). Information on other Elsevier products is available through Elsevier’s website (http://www.elsevier.com). Periodicals Postage Paid at Rahway, NJ, and at additional mailing offices. USA mailing notice: Journal of Monetary Economics (ISSN 0304-3932) is published 8 times per year by Elsevier (Radarweg 29, 1043 NX Amsterdam, The Netherlands). Periodical postage rate paid at Rahway NJ and additional mailing offices. USA Postmaster: Send change of address to Journal of Monetary Economics, Elsevier Customer Service Department, 3251 Riverport Lane, Maryland Heights, MO 63043, USA. Airfreight and mailing in USA by Mercury International Limited, 365 Blair Road, Avenel, NJ 07001.
The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).
Printed by Henry Ling Ltd., The Dorset Press, Dorchester, UK
Journal of Monetary Economics 58 (2011) 83
Contents lists available at ScienceDirect
Journal of Monetary Economics journal homepage: www.elsevier.com/locate/jme
Editorial Announcement The Journal of Monetary Economics is pleased to announce that George Alessandria, Refet Gurkaynak, Erik Hurst, Silvana Tenreyro, and Jonathan Wright will be joining the editorial board as associate editors. Nick Bloom, Mike Golosov, Julia Thomas, and Laura Veldkamp have completed terms as associate editors. I thank them for their contributions to the Journal. Senior associate editor Martin Eichenbaum has accepted a co-editorship at the American Economic Review and will be departing from the Journal. I thank Marty for his contributions to the Journal of Monetary Economics, beginning with his first two publications and extending through his many years of service on the editorial board. Robert G. King, Editor Boston University, Department of Economics, 270 Bay State Road, Boston, MA 02215, United States E-mail address:
[email protected] 0304-3932/$ - see front matter doi:10.1016/j.jmoneco.2011.03.007
Journal of Monetary Economics 58 (2011) 84–97
Contents lists available at ScienceDirect
Journal of Monetary Economics journal homepage: www.elsevier.com/locate/jme
Gold rush fever in business cycles Paul Beaudry a, Fabrice Collard b, Franck Portier c, a
University of British Columbia and NBER, Canada The University of Adelaide, Australia c ´e de Brienne, Toulouse School of Economics Aile Jean-Jacques Laffont, Manufacture des Tabacs, Universite´ Toulouse 1 Capitole, 21 alle 31000 Toulouse, France b
a r t i c l e in f o
abstract
Article history: Received 31 January 2009 Received in revised form 30 December 2010 Accepted 19 January 2011 Available online 1 February 2011
A flexible price model of the business cycle is proposed, in which fluctuations are driven primarily by inefficient movements in investment around a stochastic trend. A boom in the model arises when investors rush to exploit new market opportunities even though the resulting investments simply crowd out the value of previous investments. A metaphor for such profit driven fluctuations are gold rushes, as they are periods of economic boom associated with expenditures aimed at securing claims near new found veins of gold. An attractive feature of the model is its capacity to provide a simple structural interpretation to the properties of a standard consumption and output Vector Autoregression. & 2011 Elsevier B.V. All rights reserved.
1. Introduction There is a large literature aimed at decomposing business cycles into temporary and permanent components. A common finding in this literature is that there is a significant temporary component in business cycle fluctuations; that is, an important fraction of business cycles appears to be driven by impulses that have no long run impact. While technology shocks have arisen as a leading candidate explanation to the permanent component (whether these shocks be surprise increases in technological capacities, or news about future possibilities), there remains substantial debate regarding the driving forces behind the temporary component of macroeconomic fluctuations. Several potential explanations to the temporary component have been advanced and explored in the literature; the most notable being monetary shocks and government spending shocks. While such disturbances can create temporary business cycle movements, quantitative evaluation of their effects have generally found that they account for a very small fraction of macroeconomic fluctuations.1 Hence, the puzzle regarding the driving force behind temporary fluctuations persists. Since the most obvious – and most easily measured – candidates have not been convincingly shown to adequately explain temporary fluctuations, part of the literature has turned to explore the potential role of shocks that are conceptually more difficult to measure. A prominent example of this alternative line of research is the literature related to sunspot shocks.
Corresponding author. Tel.: + 33 5 61 12 88 40; fax: + 33 5 61 12 86 37.
E-mail address:
[email protected] (F. Portier). Following a Bayesian likelihood approach to estimate a dynamic stochastic general equilibrium model for the US economy using seven macroeconomic time series (Smets and Wouters, 2007) found that no more than 10% of the variance of output can be explained by monetary shocks between one quarter and one year, while an ‘‘exogenous spending’’ shock can explain 35% of the variance of output at a one quarter horizon, but only 15% after one year. In an estimated new neoclassical synthesis model of the U.S. economy, Justiniano et al. (2010) find that government spending shocks explain no more than 2% of output variance at business cycle frequencies (from 6 to 32 quarters). Using a Vector Autoregression, Uhlig (2005) finds that monetary policy shocks account for 10% of the variations in real GDP at all horizons. 1
0304-3932/$ - see front matter & 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2011.01.001
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
85
While several papers have argued that sunspot shocks offer a good explanation to temporary business cycle fluctuations (see Benhabib and Farmer, 1999 for a survey), much of the profession has remained skeptical.2 The present research proposes and evaluates a theory of temporary business cycle fluctuations which has some similarities with sunspot shocks, in that expectation changes are the initial driving force. However, our approach is fundamentally different since it does not rely on indeterminacy of equilibrium nor on increasing returns to scale. Instead our model builds on the intuition derived from gold rushes, where expectations play an important role but are nevertheless based on fundamentals. Furthermore, like in a gold rush, the individual level gains from investment are clear while the social gains may be small or nil. To help motivate our approach, let us briefly discuss the properties of a gold rush. For example, consider the case of Sutter’s Mill near Coloma, California. On January 24, 1848, James W. Marshall, a carpenter from New Jersey, found a gold nugget in a sawmill ditch. This was the starting point of one of the most famous Gold Rushes in history, the California Gold Rush of 1848–1858. More than 90,000 people made their way to California in the two years following Marshall’s discovery, and more than 300,000 by 1854—or one of about every 90 people then living in the United States. The population of San Francisco exploded from a mere 1000 in 1848 to 20,000 full-time residents by 1850. More than a century later, the San Francisco 49ers NFL team is still named for the prospectors of the California Gold Rush. Another famous episode, which inspired Charlie Chaplin’s movie ‘‘The Gold Rush’’ and Jack London’s book the ‘‘Call of the Wild’’, is the Klondike Gold Rush of 1896–1904. Gold prospecting took place along the Klondike River near Dawson City in the Yukon Territory, Canada. An estimated 100,000 people participated in the gold rush and about 30,000 made it to Dawson City in 1898. By 1910, when the first census was taken, the population had declined to 9000. As these examples make clear, gold rushes are periods of economic boom, generally associated with large increases in expenditures aimed at securing claims near new found veins of gold. We are aware that gold rush episodes do not occur at business cycle frequency, but they will serve here as a useful metaphorical example. This paper explores whether business cycle fluctuations may sometimes be driven by a phenomenon akin to a gold rush. In particular, an analytic dynamic general equilibrium model is constructed, in which the opening of new market opportunities causes an economic expansion by favoring competition for market share. Those episodes are called market rushes. To capture the idea of a market rush, the model is an expanding varieties one, in which agents compete to secure monopoly positions in new markets, as often done in the growth literature (see for example Romer, 1987, 1990) and in some business cycle models (see for example Devereux et al., 1993), although the growth in the potential set of varieties is technologically driven and exogenous. In this setting, when agents perceive an increase in the set of technologically feasible products, they invest to set up a prototype firm (or product) with the hope of securing a monopoly position in the new market. It is therefore the perception of these new market opportunities that causes the onset of a market rush and the associated economic expansion. 
After the initial rush, there is a shake out period where one of the prototypes secures the dominant position in the market.3 The long term effect of such a market rush depends on whether the expansion in variety has an external effect on productivity. In the case where it does not have an external effect, the induced cycle is socially wasteful as it only contributes to the redistribution of market rents. In contrast, when the expansion of variety does exert positive external effects, the induced cycle can have social value but will generally induce output fluctuations that are excessively large.4 In the case where the market expansion has no external effect, the model is capable of explaining the salient qualitative features obtained from a permanent–temporary decomposition of a consumption–output vector autoregression (VAR). Section 2 presents a set of properties of the data that models of fluctuations should aim to explain. Several of these features are well known and extensively discussed in Cochrane (1994). In a bivariate output–consumption vector error correction model (VECM) of the U.S. postwar economy, consumption is, at all horizons, almost solely accounted for by a permanent shock recovered using a long run restriction. In contrast, the associated temporary shock of the system is found to explain an important part of the short run volatility of output—i.e. the business cycle. This temporary shock also explains much of the fluctuations in hours worked and investment. These robust features of the data are quite challenging for business cycle models since even temporary shocks generally imply some reaction of consumption. Furthermore, the literature remains divided as to a structural interpretation for the temporary shock. As we think that a market rush is a potential candidate, Section 3 builds a model5 which can be solved analytically and whose properties can therefore be clearly stated. In this model, the current economic activity depends positively on the expectation of next period’s activity and on the perceived opening of new markets. Hence, when agents believe that the economy is starting a prolonged period of market expansion, this induces an immediate increase in investment and an associated economic expansion. In contrast,
2 There are at least two reasons for why the profession has remained skeptical about the importance of sunspot shocks in business cycles. First, the empirical evidence has not provided great support for the theoretical features of the economy needed to allow for sunspot shocks. Second, the coordination of beliefs implicit in the underlying mechanism is hard to understand. 3 The assumption that all markets are monopolistically competitive is made for analytical convenience. A richer model would make the degree of competition on the market a function of the number of startups. Such a model is presented in the online technical appendix to this paper. 4 A potential example of such a process is the ‘‘dot com’’ frenzy of the late 1990s,where large investments were made by firms trying to secure a position in the expanding internet market. At the end of this process, there was a large shake out as many firms went bankrupt and only a small percentage survived and obtained a substantial market position. The long run productivity gains and social value associated with this process are still debated. 5 The model presented belongs to the class of models in which nominal rigidities play no role. Our interpretation of such models is that they can correspond to models with sticky prices in which monetary authorities follow rules that implement the flexible price outcomes.
86
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
when there are no newly perceived market opportunities, the economy experiences a slump. Section 4 highlights the properties of this simple model in relation to the empirical properties of a consumption–output VECM. In particular, our market rush model is shown to display several of the qualitative properties of consumption–output VECM: consumption does not respond at all to the temporary (but persistent) shock, while this shock contributes to the short run dynamics of output, investment and worked hours. These patterns are often interpreted as providing evidence in favor of the permanent income hypothesis. However, it must be emphasized that these properties are aggregate properties and not partial equilibrium ones, which implies that a coherent explanation to these patterns requires a general equilibrium model that gives rise to permanent–temporary decomposition with no temporary component in consumption. As shown in Section 4, such patterns are not consistent with the standard analytical RBC model. 2. A target set of observations The set of observations presented here provides a rich, though concise, description of fluctuations in output, consumption, investment and hours worked. Some of these observations are well-known, and some are not. The set of observations presented is meant to capture important features of fluctuations that business cycle theory should aim at explaining. These observations will be used to evaluate the potential role of market rushes in explaining macroeconomic fluctuations. 2.1. An output–consumption VECM and two identifications Let us begin by reviewing properties of the bi-variate process for consumption and output in a VAR with one cointegrating relation. The main properties of this system were originally discussed in Cochrane (1994). As in this paper, two schemes are used to orthogonalize the innovations of the process: a long run orthogonalization scheme a la Blanchard and Quah (1989), and a short run or impact scheme a la Sims (1980). At this point, these two schemes should be viewed as devices for presenting properties of the data. There is no claim that these schemes identify structural shocks, nor that these data should be explained by a model with only two shocks. Our empirical analysis is based on quarterly data for the U.S. economy. The sample spans the period 1947Q1 to 2004Q4. Consumption, C, is defined as real personal consumption expenditures on nondurable goods and services and output, Y, is real gross domestic product. Both series are first deflated by the 15–64 U.S. population and expressed in logarithms.6 Standard Dickey–Fuller likelihood ratio and cointegration tests indicate that C and Y are I(1) processes and do cointegrate. The joint behavior of those variables is therefore modeled with a VECM, where the cointegrating relation coefficients are [1; 1] (meaning that the (log) consumption to output ratio is stationary).7 Likelihood ratio tests suggest that the VECM should include three lags. Omitting constants, the joint behavior of (C,Y) admits the following Wold representation: ! ! m1,t DCt ¼ AðLÞ ð1Þ m2,t , DYt P i where L is the lag operator, AðLÞ ¼ I þ 1 i ¼ 1 Ai L , and where the covariance matrix of m is given by O. As the system possesses one common stochastic trend, A(1) is not full rank. Given A(1), it is possible to derive a representation of the data in terms of permanent and transitory components of the form ! ! ePt DCt ¼ GðLÞ T , ð2Þ et DYt where the covariance matrix of ðeP , eT Þ is the identity matrix and GðLÞ ¼ ( G0 Gu0 ¼ O,
Gi ¼ Ai G0
for i 4 0:
P1
i¼0
Gi Li . The G matrices solve ð3Þ
Note that once G0 is known, all Gi are pinned down by the second set of relations. But, due to the symmetry of the covariance matrix O, the first part of the system only pins down three parameters of G0 . One remains to be set. This is P achieved by imposing an additional restriction. The [1,2] element of the long run matrix Gð1Þ ¼ 1 i ¼ 0 Gi is set to zero, meaning that the orthogonalization chosen is such that the disturbance eT has no long run impact on C and Y (the use of 6 Consumption is defined as the sum of services and nondurable goods, while output is real gross domestic product. Each variable is expressed in per capita terms by dividing the 15–64 population. The series are obtained from the following links. Real personal consumption expenditures: nondurable goods: http://research.stlouisfed.org/fred2/series/PCNDGC96, Real personal consumption expenditures: services: http://research.stlouisfed.org/fred2/ series/PCESVC96, Real gross domestic product, 3 decimal: http://research.stlouisfed.org/fred2/series/GDPC96, Population: 15–64, annual: downloaded from http://www.economy.com/freelunch/default.asp, Investment: real gross private domestic investment, 3 decimal: http://research.stlouisfed.org/ fred2/series/GPDIC96/downloaddata. The hours worked refer to the non-farm private, business sector of the economy, and are taken from Citibase. 7 Recent work by Whelan (2003) has shown that real consumption and real output have different long-run trends as they are measured in the latest set of chain-weighted NIPA data. In the online technical appendix to this paper, it is shown that results are unchanged when the cointegrating relation is estimated rather than imposed to be [1; 1].
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
87
this type of orthogonalization was first proposed by Blanchard and Quah, 1989). Hence, eT is labeled as a temporary shock, while eP is a permanent one. This orthogonalization is called the ‘‘long run’’ one. Let us now consider an alternative orthogonalization that uses short run restrictions: ! ! eC DCt ~ ðLÞ t , ¼G ð4Þ eYt DYt ~ matrices are solution to a system ~ i Li and the covariance matrix of ðeC , eY Þ is the identity matrix. The G ~ ðLÞ ¼ P1 G where G i¼0 ~ 0 is equal to zero. of equations similar to (3). The system however departs from (3) and imposes that the 1,2 element of G Therefore, eY can be called an output innovation, and by construction the contemporaneous response of C to eY is zero. This orthogonalization is called the ‘‘short run’’ one. 2.2. Results Consider first the long run identification. Fig. 1 graphs the impulse response functions of C and Y to both shocks as well as their associated 95% confidence bands, obtained by bootstrapping the VECM. Table 1 reports the corresponding variance decomposition of the process. These results provide an interesting decomposition of macroeconomic fluctuations. The lower left panel of Fig. 1 clearly shows that consumption virtually does not respond to the transitory shock. This is confirmed by Table 1 which shows that the transitory shock accounts for less than 4% of consumption volatility at any horizon. Conversely, consumption is very responsive to the permanent shock and most of the adjustment dynamics take place in less than one year. In other words, consumption is almost a pure random walk that responds only to permanent shocks and has very little dynamics. On the contrary, short run fluctuations in output are mainly associated with the temporary shocks, which explain more than 60% of output volatility on impact. These patterns are often interpreted as simply reflecting the permanent income hypothesis. If the data corresponded to the consumption and investment decision of an individual facing a fixed interest rate, such interpretation would be correct. However, it must be emphasized that these properties are aggregate properties and not individual level properties, which implies that a coherent explanation to these patterns requires a general equilibrium model that exhibits a permanent–temporary decomposition with no temporary component in consumption. For example, in a standard real business cycle model, a temporary change in technology that generates a persistent increase in investment will also generate – because of general equilibrium constraints – a temporary rise in consumption. Output − εP
1.5
1.5
1
1 %
%
Consumption − εP
0.5 0 −0.5
0.5 0
5
10 Quarters
15
−0.5
20
5
15
20
15
20
Output − εT
1.5
1.5
1
1 %
%
Consumption − εT
10 Quarters
0.5 0
0.5 0
−0.5
−0.5 5
10 Quarters
15
20
5
10 Quarters
Fig. 1. Responses of output and consumption to eP and eT . This figure shows the responses of consumption and output to temporary eT and permanent eP one percent shocks. These impulse response functions are computed from a VECM (C,Y) estimated with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4. The shaded area depicts the 95% confidence intervals obtained from 1000 bootstraps of the VECM.
88
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
Table 1 The contribution of the shocks to the volatility of output and consumption. Horizon
1 4 8 20 1
Output
Consumption
eT
eY
eT
eY
62 28 17 10 0
80 46 33 22 4
4 1 1 0 0
0 1 1 2 4
This table shows the k-period ahead share (in percentage points) of the forecast error variance of consumption and output that is attributable to the temporary shock eT in the long run orthogonalization and to the output shock eY in the short run one, for k=1, 4, 8, 20 quarters and for k!1. Those shares are computed from a VECM (C,Y) estimated with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4.
Output − εC
1.5
1.5
1
1 %
%
Consumption − εC
0.5 0 −0.5
0.5 0
5
10 Quarters
15
−0.5
20
5
15
20
15
20
Output −εY
1.5
1.5
1
1 %
%
Consumption −εY
10 Quarters
0.5 0
0.5 0
−0.5
−0.5 5
10 Quarters
15
20
5
10 Quarters
Fig. 2. Responses of output and consumption to eC and eY . This figure shows the responses of consumption and output to consumption eC and output eY one percent shocks obtained from a short run orthogonalization scheme. Those impulse response functions are computed from a VECM (C,Y) estimated with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4. The shaded area depicts the 95% confidence intervals obtained from 1000 bootstraps of the VECM.
Fig. 2 graphs the impulse responses of C and Y associated with the second orthogonalization scheme. The associated variance decompositions are displayed in Table 1. The striking result from these estimations is that the consumption shock eC is almost identical to the permanent shock to consumption (eP in the long run orthogonalization scheme), so that the responses and variance decompositions are very similar to those obtained using the long run orthogonalization scheme. This observation is further confirmed by Fig. 3, which plots eP against eC and eT against eY . It is striking to observe that both shocks align along the 451 line, indicating that the consumption innovation is essentially identical to the permanent component. 2.3. The movements of investment and hours worked Let us now link the behavior of investment and hours worked to the above description of output and consumption. In particular, how much of the variance of those variables is associated with the temporary shock (or quasi-equivalently the output shock) versus the permanent shock recovered from the consumption–output VECM? To answer this question, the following approach is taken. Once the innovations eP and eT are recovered from the bivariate C–Y VECM, investment in
4
4
2
2 εT
εP
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
0
89
0 −2
−2 −4 −4
−2
0
2
−4 −4
4
−2
εC
0
2
4
εY
Fig. 3. Plots of eC against eP and eY against eT . The left panel plots the estimated permanent innovation eP (from the long run orthogonalization scheme) against the consumption innovation eC (from the short run orthogonalization scheme). The right panel plots the estimated temporary innovation eT (from the long run orthogonalization scheme) against the output innovation eY (from the short run orthogonalization scheme). In both panels, the straight line is the 451 line. These shocks are computed from a VECM (C,Y) estimated with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4.
Table 2 Variance decomposition of investment and hours worked. Horizon
1 4 8 20 40
Investment
Hours in level
Hours in difference
eP
eT
eI
eP
eT
eH
eP
eT
eH
1 37 50 44 23
97 62 48 49 60
2 1 2 7 17
19 37 61 60 54
75 56 32 21 20
6 7 7 19 26
21 46 66 69 57
74 52 32 28 38
5 2 2 3 5
This table shows the k-period ahead share (in percentage points) of the forecast error variance of hours and investment that is attributable to the temporary and permanent shocks eT and eP and to the residual shock, for k= 1, 4, 8, 20 and 40 quarters. Those shares are computed from the estimation of (5). The shocks eT and eP are obtained in a first stage from the VECM (C,Y) estimated with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4.
difference and hours worked (in levels or differences) are regressed on current and lagged values of these two shocks plus a moving-average error term denoted eI or eH , which is called an investment or hours specific shock.8 An attractive feature of this approach (compared to estimating a tri-variate VAR) is that it delivers results that are robust to the specification of hours worked (level or difference).9 More precisely, the regression estimated is xt ¼ c þ
K X
ðak ePtk þ bk eTtk þ gk eXtk Þ,
ð5Þ
k¼0
where xt denotes either the (log) hours per capita in levels or the (log) difference of hours and investment. This model is estimated by maximum likelihood, choosing an arbitrarily large number of lags (K = 40). For each horizon k is computed the share of the overall volatility of investment or hours worked accounted for by eP , eT , and by the specific shock eI or eH . Results are reported in Table 2. The numbers reported in the table clearly indicate that investment and hours worked are primarily explained by the transitory component for 1–4 quarter horizons. This transitory component still explains one-half of the variance of investment and one-third of the variance of hours at a 8-quarter horizon. This is also illustrated in Fig. 4 that displays the estimated impulse response function of investment and hours worked to temporary and permanent shocks, as estimated from Eq. (5). The method we use to estimate the response of investment has the disadvantage of working with investment 8
Such a two step strategy amounts to the estimation of the following restricted tri-variate moving-average process: 0
1 Ct BY C @ t A¼ Xt
RðLÞ
02,1
SðLÞ
TðLÞ
!
0
ePt
1
B eT C @ t A,
eXt
where R(L) is a 2 2 polynomial matrix, 02,1 is a 2 1 vector of zeros, S(L) is a 1 2 polynomial matrix and T(L) is a polynomial in lag operator. R(L), eP and eT are recovered from the first step bivariate VECM, while S(L), T(L), and eH are estimated using a truncated approximation of the third line of the above MA process (which is Eq. (5)). In the case of an estimation in difference, X has to be replaced by (1 L)X. 9 It is well known (see for instance the discussions in Gali, 1999; Gali and Rabanal, 2004; Chari et al., 2004; Christiano et al., 2004) that specification choice (levels versus first differences) matters a lot for VARs with hours worked. Results show that our procedure is robust to this specification choice.
90
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
Investment − εP
0.04
Investment − εT
0.04
0.03
0.03
0.02 0.02 0.01 0.01
0 −0.01
5
10 Quarters
15
20
Hours Worked − εP
1.5
0
5
10 Quarters
15
20
Hours Worked − εT
1 0.5
1 0 0.5 −0.5 0
5
10 Quarters
15
20
Hours Worked − εP
1.5
−1
5
10 Quarters
15
20
Hours Worked − εT
1.5 1
1
0.5 0.5 0 0 −0.5
−0.5 5
10 Quarters
15
20
−1
5
10 Quarters
15
20
Fig. 4. Responses of investment and hours to eP and eT . This figure shows the response of investment and hours worked to the temporary eT and permanent eP shocks. Those impulse responses are computed using a two-step procedure. First eT and eP are derived from the estimation of a VECM (C,Y) with one cointegrating relation [1; 1], three lags, using quarterly per capita U.S. data over the period 1947Q1–2004Q4. Then investment in difference or hours worked (in levels or difference depending on the specification) are projected on current and past values of those innovations plus a moving-average term in eI or eH . Confidence bands are obtained by a delta method.
first differences, and therefore not taking care of long run relations between investment, consumption and output. As a consequence, the temporary shock eT happens to still explain quite a large share of investment after 10 years (60%). An alternative method amounts to estimate a trivariate (Y,C,I) VECM with cointegrating relations between I, C and Y, and impose that the long run impact of the temporary shock is zero. This method is presented in the online technical appendix, and is shown to give very similar short run responses of investment to the temporary shock. To summarize, four properties of the data are worth highlighting: (i) the permanent shock eP , as recovered from a long run restriction in a consumption–output VECM, is essentially the same shock as that corresponding to a consumption shock eC , as obtained from an impact restriction, (ii) the response of consumption to a temporary shock is extremely close to zero at all horizons, and there are almost no dynamics in the response of consumption to a permanent shock, as it jumps almost instantaneously to its long run level, (iii) the temporary shock (or the output shock in the short run orthogonalization) is
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
91
responsible for a significant share of output volatility at business cycle frequencies and (iv) investment and hours are largely explained by the transitory shock at business cycle frequencies. These facts emphasize that a substantial fraction of the business cycle action seems to be related to changes in investment and hours worked, without any short or long run implications for consumption. It is shown in the online technical appendix that these findings are robust both against changes in the specification of the VECM – by estimating rather than imposing the cointegration relation, adding additional lags, or estimating the VECM in levels – and against the data used to estimate the VECM – taking total consumption rather than the consumption of nondurables and services, measuring output as consumption plus investment only. In all these cases, no major changes in patterns are found. Since some emphasis has been put on the quasi-equivalence between the shocks recovered using a long run restriction, and shocks recovered using an impact restrictions, a formal test10 for the equality between eY and eT is conducted. At a 5% significance level, the hypothesis that the consumption shock is identical to the permanent shock cannot be rejected. 3. An analytical model of market rushes In this section we present a simple analytical model of market rushes. The main element of the model is that agents receive, each period, information about potential new varieties of goods that could become profitable to produce. In response to these expectations of profits, agents invest in putting on the market a prototype of the new good. Since many agents may invest in such startups, they engage in a winner-takes-all competition for securing the market of a newly created variety. The winning firm becomes a monopolist on the market, but may randomly lose this position at an exogenous rate. Expansion in variety may or may not have a long run impact on productivity, so that the market rush is not forced a priori to satisfy the gold rush analogy. 3.1. The model Firms: There exists a raw final good, denoted Qt, produced by a representative firm using labor ht and a set of intermediate goods Xt(j) with mass Nt. The constant returns to scale technology is represented by the production function Qt ¼ ðYt ht Þa Xt1a ,
ð6Þ
where a 2 ð0,1Þ. Yt is an index of disembodied exogenous technological progress and Xt is an aggregate of intermediate goods: Z Nt 1=w Xt ðjÞw dj , ð7Þ Xt ¼ Ntx 0
where w r 1 determines the elasticity of substitution between intermediate goods and x is a parameter that determines the long run effect of variety expansion. Since this final good will also serve to produce intermediate goods, Qt will be referred to as the gross amount of final good. Also note that the raw final good will serve as the nume´raire. The representative firm is price taker on the markets. Each existing intermediate good is produced by a monopolist. It is assumed that the production of one unit of intermediate good requires the use of one unit of the raw final good as input. Since the final good serves as a nume´raire, this leads to a situation where the price of each intermediate good is given by Pt ðjÞ ¼ 1=w. Therefore, the quantity of intermediate good j, Xt(j), produced in equilibrium, is given by f1
Xt ðjÞ ¼ ðwð1aÞÞ1=a Yt Nt
ht ,
ð8Þ
where f ¼ ðð1aÞ=aÞðx þ ð1wÞ=wÞ. The profits, Pt ðjÞ, generated by intermediate firm j are given by
Pt ðjÞ ¼ p0 Yt Ntf1 ht ,
ð9Þ 1=a
where p0 ¼ ðð1wÞ=wÞðwð1aÞÞ
. Equalization of the real wage with marginal product of labor implies
f
Wt ¼ AYt Nt ,
ð10Þ ð1aÞ=a
. where A ¼ aðwð1aÞÞ Value added, Yt, is then given by the quantity of raw final good, Qt, net of that quantity used to produce the intermediate goods, Xt(j). Substituting out for Xt(j), and taking away the amount of Qt used in the production of Xt(t), one obtains Z Nt f Xt ðjÞ dj ¼ BYt Nt ht , ð11Þ Yt Qt 0
where B ¼ ð1wð1aÞÞðwð1aÞÞð1aÞ=a . 10 The online technical appendix shows that such a test amounts to testing the nullity of a12, the [1,2] element of the long run matrix of the Wold decomposition. The confidence intervals for the estimate of a12 are obtained from 1000 bootstraps of the long run matrix. The coefficient a^ 12 takes an average value of 0.2024 with a 95% confidence interval [ 0.2,0.8].
92
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
The net amount of raw final good Yt can be used for consumption Ct and startup expenditures St: Yt ¼ Ct þ St :
ð12Þ
Variety dynamics: Let N t denote the number of potential varieties in period t, and Nt denote the number of active varieties—i.e. those which are effectively produced, with Nt r N t . In each period, new potential varieties are created at the stochastic growth rate Zt . The N t existing potential varieties of the period become obsolete at an exogenous rate m 2 ð0,1Þ. Therefore, the dynamics for the number of potential products is given by N t þ 1 ¼ ð1 þ Zt mÞN t :
ð13Þ
Note that Z brings information about future potentially profitable varieties but does not immediately affect the production function. In the following, there is no drift in N as we assume EðZt Þ ¼ m. The law of motion of the number of effectively produced goods is driven by an endogenous adoption decision. Any f entrepreneur, who desires to produce a potential new variety, has to pay a fixed cost of kt kYt Nt 4 0 units of the final good to setup the startup. She does so if the expected discounted sum of profits of a startup exceeds kt . Let NS,t denote the number of startups and St kt NS,t denote total expenditures on setup costs. A time t + 1, a startup will become a functioning new firm with a product monopoly with an endogenous probability rt , and existing monopolies will disappear at rate m. Therefore, the dynamics for the number of effectively produced goods is given by Nt þ 1 ¼ ð1mÞNt þ rt NS,t :
ð14Þ
The NS,t startups of period t compete to secure the Zt N t new monopoly positions. The successful startups are uniformly drawn among the NS,t existing ones. Therefore, the probability that a startup at time t will become a functioning firm at t + 1 is given by rt ¼ minð1, Zt N t =NS,t Þ, and the number of new goods created will be minðNS,t , Zt N t Þ. If it turns out that startups are not profitable enough, so that NS,t o Zt N t , not all existing varieties will be exploited and therefore Nt oN t . In order to obtain a tractable solution, parameters are chosen to rule out this case of partial adoption. Allocations will have the property that it is always optimal for entrepreneurs to exploit the whole range of intermediate goods.11 In other words, it amounts to assuming that the adoption cost kt is sufficiently small. This implies that there will be no difference in the model between the potential and the actual number of varieties in equilibrium, so that Nt ¼ N t 8t. Households: The preferences of the representative household are represented by the utility function
Et
1 X t
b ðlogðCt þ t Þ þ cðhht þ t ÞÞ,
ð15Þ
t¼0
where 0 o b o 1 is a constant discount factor, Ct denotes consumption in period t and ht is the quantity of labor the household supplies. The household chooses how much to consume, supply labor, and hold equities in existing firms (E t ) and in startups (E St ) by maximizing (15) subject to the following sequence of budget constraints: Ct þ PtM E t þPtS E St ¼ Wt ht þE t Pt þ ð1mÞPtM E t1 þ rt1 PtM E St1 ,
ð16Þ
where PtM is the beginning of period (prior to dividend payments Pt ) price of an existing monopoly equity, PtS is the price of startups and Wt is the wage rate. 3.2. Equilibrium allocations The decision to invest in a startup is obtained by combining the first order conditions associated with the household’s program and is given by 1 X Ct PtS ¼ brt Et bt ð1mÞt Pt þ t þ 1 : ð17Þ Ct þ t þ 1 t¼0 This condition states that the price of a startup is equal to the expected discounted sum of future profits. Free entry of startups drives to zero the expected discounted sum of profits (the right hand side of Eq. (17)) net of the setup cost. Therefore, one has in equilibrium PtS ¼ kt . Using this last equation, the labor demand condition (10), the profit equation (9), the resource constraint (12), and the startup equity market equilibrium condition E St ¼ NtS , the asset pricing equation (17) becomes p0 1 1 1 ðht c Þ ¼ bdt Et ht þ 1 þ bdt Et 1 ðht þ 1 c Þ , ð18Þ A dt þ 1 where dt ¼ Zt =ð1m þ Zt Þ is an increasing function of the fraction of newly opened markets Zt . Eq. (18) is a key equation of the model. It shows that current employment ht depends on ht + 1, dt and dt þ 1 , and therefore indirectly depends on all the future expected d. As dt brings information about the future, employment is purely forward looking. The reason why future employment favors current employment is that higher future employment reflects higher 11 Such an assumption would be definitively not appealing in a growth perspective, or to account for cross-country income differences (see for Comin and Hobijn, 2004), but seems to us acceptable from a business cycle perspective.
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
93
expected profits, which therefore stimulates new entries today. Note that the model exhibits certain salient neutrality properties, as the determination of employment does not depend on either current or future changes in disembodied technological change Yt .12 Iterating forward, the above equation can be written as a function of current and future values of d only. Given the nonlinearity of Eq. (18), it is useful to compute a log-linear approximation around the deterministic steady-state value of hours worked, ht13 ! 1 hc h^ t ¼ gEt h^ t þ 1 þ Et ½d^ t bd^ t þ 1 , h 1 where h^ t and d^ t now represent relative deviations from the steady state, h ¼ c ð1bð1dÞÞ=ð1bdp0 =Abð1dÞÞ and 14 g ¼ bdðp0 =AÞ þ bð1dÞ with g 2 ð0,1Þ. Solving forward, this can be written as #! ! "X 1 1 i^ ^h t ¼ hc ^d t bd Ap0 Et ð19Þ g dt þ 1 þ i : h A i¼0
Note that, as g 2 ð0,1Þ, the model possesses a unique determinate equilibrium path. Eq. (19) reveals that a positive d^ t – i.e. an acceleration of variety expansion – causes an instantaneous increase in hours worked, output and investment in startups S. This boom arises as the result of the prospects of future profits derived from securing those new monopoly positions. This occurs irrespective of any current change in the technology or in the number of varieties. Such an expansion is therefore akin to a ‘‘demand driven’’ or ‘‘investment driven’’ boom. Once the equilibrium path of ht is computed, output is directly obtained from Eq. (11). Finally, combining labor demand (10) and the household’s labor supply decision, one obtains an expression for aggregate consumption: Ct ¼
A
c
Yt Ntf :
ð20Þ
4. Equilibrium allocation properties This section first derives the VECM representation of the model solution, and shows the similarity between some orthogonalized representations of the model and of the data. The optimal properties of equilibrium allocations are then discussed. Finally, our results are contrasted with the ones obtained from a baseline RBC model, and we discuss the empirical counterpart to our new markets metaphor. 4.1. A VECM representation of the model solution As mentioned in the Introduction, it is attractive to represent macroeconomic fluctuations as responses to permanent and transitory shocks in a consumption and output autoregressive vector. We therefore begin by deriving a consumption– output VECM representation of the model solution. This representation will then be compared to the estimated VECM. It is assumed that disembodied technical change Yt follows (in log) a random walk without drift: logYt ¼ logYt1 þ sY eY t , where eY t are i.i.d. with zero mean and unit variance. The variety expansion shock Zt follows an AR(1) process of the form 15 N logðZt Þ ¼ rZ logðZt1 Þ þð1rZ Þlogðm~ Þ þ sN eN The t , where et are i.i.d. with zero mean and unit variance, and with 0 o rZ o1. 1 solution for hours worked is given by h^t ¼ oZ^t , with o ððhc Þ=hÞð1dÞð1brÞ=ð1grÞ. The logs of consumption and output are therefore given by logðYt Þ ¼ ky þ logðYt Þ þ flogðNt Þ þlogðht Þ,
ð21Þ
logðCt Þ ¼ kc þ logðYt Þ þ flogðNt Þ,
ð22Þ
where kc and ky are constant terms. Using Eq. (18) to replace ht with its approximate solution, it is straightforward to derive the following MAð1Þ representation of the system: 1 ! ! ! 0 fL sY DlogðCt Þ eY eY 1rL sN t t @ A ¼ CðLÞ : ð23Þ ¼ DlogðYt Þ eNt eNt sY ðoð1LÞ þ fLÞ sN 1rL
12 This result is due to the functional forms chosen for preferences and technology. It is related to (i) the separability between consumption and hours in the utility function; (ii) logarithmic preferences for consumption and (iii) Cobb–Douglas production function. 13 In the online technical appendix, an exact analytical solution to the model is derived in the case of i.i.d. shocks. 14 This follows from the restriction bdð1wÞð1aÞ=a þ bð1dÞ o 1 imposed on parameters to guarantee positive hours worked in the non-stochastic steady state. 15 Note that m~ takes the value m2ð1r2Z Þ=s2N so that E½Z ¼ m.
94
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
4.2. Orthogonalized representations of equilibrium allocations If the model of Section 3 is a data generating process, what would it imply for the orthogonalizations performed in Section 2? One way to answer this question would be to simulate data using the model, and then to estimate and orthogonalize a VECM on those simulated data. As our simple model has a tractable analytical solution, it is possible to derive exactly the VECM representation of equilibrium allocations. The impact matrix, C(0), and long run matrix, C(1), can be obtained from the system (23) as 0 1 ! sY 1fr sN sY 0 A: Cð0Þ ¼ and Cð1Þ ¼ @
sY osN
sY 1fr sN
The VECM permanent and transitory shocks are then given by 8 !1=2 > > f 2 2 > Y þ f s eN , > ePt ¼ s2Y þ s s e > N Y t t N > 1r 1r < !1=2 > > f 2 2 f > Y þ s eN : > eT ¼ s2 þ s s e > N t Y t > N Y : t 1r 1r Similarly, short run orthogonalization yields ( Y et ¼ eY t ,
ð24Þ
ð25Þ
eCt ¼ eNt :
This simple model shares the importance of dynamic properties with the data when the parameter f is set to zero. This corresponds to the case where x ¼ ðw1Þ=w, meaning that an expansion in variety exerts no effect on labor productivity. First of all the system (23) clearly shows that consumption and output do cointegrate (C(1) is not full rank) with cointegrating vector [1; 1]. Second, it shows that consumption is a random walk, that is only affected – in the short run as well as in the long run – by technology shocks, eY . Output is also affected by the temporary shock, eN , in the short run. Hence, computing sequentially our short-run and long-run orthogonalization with this model would imply eP ¼ eC ¼ eY and eT ¼ eY ¼ eN , as it can been seen from (24) and (25) in the case f ¼ 0. Finally, it is the temporary shock eT (which is indeed eN ) that explains all of hours worked volatility at any horizon, as h^t ¼ oZ^t . Such a model, therefore, allows for a structural interpretation of the results obtained in Section 2. Permanent shocks to C and Y are now interpretable as technology shocks. Consumption does not respond to variety expansion shocks, which however account for a lot of output fluctuations and all the fluctuations in hours worked. Variety expansion shocks create market rushes that are indeed gold rushes, generating inefficient business cycles as the social planner would choose not to respond to them (as shown below). In effect, these shocks only trigger rent seeking activities, as startups are means of appropriating a part of the economy pure profits. Although simple, this model illustrates how the market mechanism we have put forward has the potential to account for some intriguing properties of the data; in particular, the equivalence of the short and long run identification schemes, and the complete absence of a temporary component in consumption. 4.3. Comparison between equilibrium and optimal allocations Optimality properties of those allocations are worth discussing, and it is useful to compute the socially optimal allocations as a benchmark. The social planner problem is given by 8 f ^ > > > Ct r A Yt Nt ht kt Zt NS,t , > N > t þ 1 ¼ ð1mÞNt þ rt NS,t , i¼0 > > : N rN , t
t
with A^ ¼ að1aÞa=ð1aÞ and where one has already solved for the optimal use in intermediate goods. Note that parameters are again assumed to be such that it is always socially optimal to invest in a new variety, so that Nt ¼ N t . One necessary condition for full adoption to be socially optimal is that the long run effect of variety expansion is positive—i.e f 4 0()x 4ð1aÞð1wÞ=w. The first order condition of the social planner program is given by ðx þ ð1aÞð1=w1ÞÞ=a A^ Yt Nt ðx þ ð1aÞð1=w1ÞÞ=a ht Zt Nt kt A^ Yt Nt
¼ c:
ð26Þ
There are many sources of inefficiency in the decentralized allocations. One obvious source is the presence of imperfect competition: ceteris paribus, the social planner will produce more of each intermediate good. Another one is the congestion
P. Beaudry et al. / Journal of Monetary Economics 58 (2011) 84–97
95
effect associated with investment in startups, because only a fraction rt of startups are successful. The social planner internalizes this congestion effect, and does not duplicate the fixed cost of startups, as the number of startups created is equal to the number of available slots for optimal allocations.16 Because of these imperfections, the decentralized allocation differs from the optimal allocation along a balanced growth path. The difference between the market and the socially optimal allocations that we want to highlight regards the response to expected future market shocks. It is remarkable that the socially optimal allocation decision for employment (26) is static, and only depends on Zt (positively). This stands in sharp contrast with the market outcome, as summarized by Eq. (18), in which all future values of Z appear. To understand this difference, let us consider an increase in period t in the expected level of Zt þ 1 . In the decentralized economy, larger Zt þ 1 means more startup investment in t+1 and more firms in t+2. Those firms will affect other firms’ profits from period t+2 onward. Therefore, a period t startup will face more competitors in t+2, which reduces its current value, and therefore decreases startup investment and output.17 Such an expectation is not relevant for the social planner, which does not respond to changes in the future values of Z. Therefore, in that simple analytical model, part of economic fluctuations are driven by investors (rational) forecasts about future profitability that are inefficient from a social point of view.18 A stark result is obtained in the case when the returns to variety are nil, so that an expansion in the number for varieties has no long run impact on productivity. This case corresponds to f ¼ ðx þð1aÞð1wÞ=wÞ=a ¼ 0. In this particular case, investment in startups occurs in the decentralized equilibrium in response to market shocks, whereas the social planner would choose not to adopt any new good (Nt ¼ N0 8t), as implementing new goods costs kt and has no productive effect. In this very case, optimal allocations are invariant to market shocks Z, while equilibrium allocations react suboptimally to those shocks. In particular, as hours are only affected by market shocks in equilibrium, all equilibrium fluctuations in hours are suboptimal. This case echoes with an interesting aspect of gold rushes. In effect, from a social point of view, part of the increased activity was wasteful since historically it mainly just contributed to the expansion of the stock of money. 4.4. Properties of an extended analytical RBC model Let us now contrast the positive properties of our model with those obtained in analytical RBC model that is extended to have both TFP and investment specific shocks. In order to be fully analytical, logarithmic consumption utility, Cobb–Douglas technology and full depreciation is assumed. The representative household has the same preferences as in the preceding model
Et
1 X t
b ½logðCt þ t Þ þ cðhht þ t Þ:
ð27Þ
t¼0
The final good, Y, is produced according to Yt ¼ Kta ðYt ht Þ1a ,
ð28Þ
where Yt is an exogenous TFP shock. Capital accumulates as Kt þ 1 ¼ Qt It ,
ð29Þ
where Qt is an investment specific shock and It denotes investment. Equilibrium allocations of such a model are given by19 ht ¼ h ¼
1a
%
cð1abÞ
,
ð30Þ
a Yt ¼ Gy ðQt1 Yt1 Þa Y1 , t
ð31Þ
a , Ct ¼ Gc ðQt1 Yt1 Þa Y1 t
ð32Þ
a
1a
and Gc ¼ ð1abÞGy . with Gy ðabÞ h Note that the saving rate is constant in this analytical model (Ct ¼ ð1abÞYt ), so that any shock that does affect output proportionally affects consumption. As such, the model cannot replicate the facts, as the temporary shock increases consumption as much as investment (in percentage points). This rather extreme result is due to the very specific assumptions that were made in order to obtain an analytical solution, but we show in Beaudry et al. (2009) that the impossibility of such a model to replicate the VARs facts highlighted here extends to non-analytical models of that type. %
16 Note that it has been assumed here that parameter values are such that it is optimal to adopt all the new varieties. Another potential source of suboptimality would be an over or under adoption of new goods by the market. As shown in Benassy (1998) in a somewhat different setup with endogenous growth, the parameter x is then crucial in determining whether the decentralized allocations show too much or too little of new goods adoption. 17 This is due to the typical ‘‘business stealing’’ effect found in the endogenous growth literature, for example in Aghion and Howitt (1992), and originally discussed in Spence (1976a, 1976b). 18 The very result that it is socially optimal not to respond to such future shocks is of course not general, and depends on the utility and production function specification. The general result is not that it is socially optimal not to respond to shocks on future Z, but that the decentralized allocations are inefficient in responding to those shocks. 19 See the online technical appendix for a derivation of the model solution.
4.5. Discussion

An important question not yet discussed is the interpretation of ''a new market'' and the associated empirical observations regarding its cyclical properties.20 Our metaphor of new markets describes all new ways of introducing new products, given existing technology or using new technologies.21 Broadly speaking, a new market ranges from producing a newly invented product (say cellular phones) to producing old goods with newly developed uses (fiber-optic cable networks once use of the internet exploded) or new ways of designing old products (say producing shirts of a fashionable new color). Given this broad interpretation, it is difficult to obtain a comprehensive measure of our new market margin.

In a very narrow sense, one could associate new markets with new firms, and therefore look at Net Business Formation. Net Business Formation is unambiguously procyclical in the U.S., which is also one of our model predictions if one literally associates N with the number of firms. The problem is that the evidence suggests that smaller firms typically make up the majority of entries and exits, which is insufficient to account for a large share of hours worked and output variance at short horizons. A less restrictive interpretation is to look at variations in the number of establishments and franchises as an additional channel affecting the number of ''operating units''. The Business Employment Dynamics database documents job gains and job losses at the establishment level at a quarterly frequency for the period between the third quarter of 1992 and the second quarter of 2005. Using these observations, Jaimovich (2004) finds that more than 20% of the cyclical fluctuations in job creation are accounted for by opening establishments, which is already a sizable number. Another dimension that could be associated with the new market margin is variation in the number of franchises. As Lafontaine and Blair (2005) show, numerous firms in a variety of industries have adopted franchising as a method of operation. Sales of goods and services through the franchising format amounted to more than 13% of real Gross Domestic Product in the 1980s and 34% of retail sales in 1986. Jaimovich (2004) documents that variations in the number of franchises are procyclical at the business cycle frequency, which is again in line with the ideas put forward by the model. We take this empirical evidence as supporting the notion that agents' expectations about the possibility of new markets are potentially an important driving force of the business cycle.

5. Conclusion

This paper presented theory and evidence in support of the idea that expectations of new market openings may be a key element in explaining the temporary component in output fluctuations. In particular, we proposed a model where the opening of new market opportunities causes an economic expansion by favoring competition for market share. Such an episode was called a market rush, in analogy to a gold rush. A simple analytical model of market rushes has been developed, and it has been shown how it can replicate an important qualitative feature of the data, namely that the temporary component extracted from an output–consumption VECM is associated with virtually no movement in consumption at any frequency. It has been demonstrated that such a pattern arises in our model when most of the investment in new varieties is socially inefficient.
While such an interpretation of business cycles is certainly controversial, it is worth noting that the properties of the consumption–output VECM suggest that the data can be generated by only two large classes of models. Either the data is generated by a model that does not admit a structural temporary–permanent decomposition, which would be the case if all shocks have permanent effects. Or the data is generated by a model that does admit a structural temporary–permanent decomposition, in which case the induced temporary fluctuations should be explained in terms of socially inefficient investment, as there are no associated gains in terms of consumption even though more work is exerted.22 The contribution of this paper is to provide a candidate explanation for the second possibility.

A natural follow-up question to this paper is whether the market rush phenomenon can be a quantitatively important source of fluctuations. Such an exploration requires extending the model in several directions to make it more realistic. In a companion paper (Beaudry et al., 2009), we pursue this goal by introducing into the model capital accumulation, two types of intermediate goods and habit persistence in consumption. The extent to which the model is quantitatively capable of replicating the impulse responses presented here is investigated in this extended version. This ongoing work suggests that the market rush phenomenon with socially wasteful variety expansion may be a significant contributor to business cycle fluctuations.
Acknowledgments

The authors thank Nir Jaimovich, Evi Pappa, and Oreste Tristani for their discussion of the paper, as well as participants in the various seminars and conferences where this work was presented. The comments of an anonymous referee and of the associate editor Klaus Adam are also gratefully acknowledged. Franck Portier is also affiliated with the CEPR.

20 We have here benefited from comments and discussion with Nir Jaimovich.
21 The new goods margin has recently been shown to be important in understanding the pattern of international trade (see Ghironi and Melitz, 2005; Kehoe and Ruhl, 2006).
22 For example, the properties of the consumption–output VECM should be viewed as challenging for sticky price theories of business cycles driven by one permanent shock and one temporary shock. In such models, the temporary shock induces temporary movements in consumption, while such a predicted outcome is not apparent in the data.
Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2011.01.001.

References

Aghion, P., Howitt, P., 1992. A model of growth through creative destruction. Econometrica 60 (2), 323–351.
Beaudry, P., Collard, F., Portier, F., 2009. A theory of transitory macroeconomic shocks. Mimeo, Toulouse School of Economics.
Benassy, J., 1998. Is there always too little research in endogenous growth with expanding product variety? European Economic Review 42 (1), 61–69.
Benhabib, J., Farmer, R., 1999. Indeterminacy and sunspots in macroeconomics. In: Taylor, J., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1A. North-Holland, New York, pp. 387–448.
Blanchard, O., Quah, D., 1989. The dynamic effects of aggregate demand and supply disturbances. The American Economic Review 79 (4), 655–673.
Chari, V., Kehoe, P., McGrattan, E., 2004. A critique of structural VARs using real business cycle theory. Working Paper 631, Federal Reserve Bank of Minneapolis.
Christiano, L., Eichenbaum, M., Vigfusson, R., 2004. What happens after a technology shock? Working Paper 9819, National Bureau of Economic Research.
Cochrane, J., 1994. Permanent and transitory components of GNP and stock prices. The Quarterly Journal of Economics 109 (1), 241–265.
Comin, D., Hobijn, B., 2004. Cross-country technology adoption: making the theories face the facts. Journal of Monetary Economics 51 (1), 39–83.
Devereux, M.B., Head, A.C., Lapham, B.J., 1993. Monopolistic competition, technology shocks, and aggregate fluctuations. Economics Letters 41 (1), 57–61.
Gali, J., 1999. Technology, employment and the business cycle: do technology shocks explain aggregate fluctuations? The American Economic Review 89 (1), 249–271.
Gali, J., Rabanal, P., 2004. Technology shocks and aggregate fluctuations: how well does the RBC model fit postwar U.S. data? In: Gertler, M., Rogoff, K. (Eds.), NBER Macroeconomics Annual 2004. The MIT Press, Cambridge, MA, pp. 225–288.
Ghironi, F., Melitz, M.J., 2005. International trade and macroeconomic dynamics with heterogeneous firms. The Quarterly Journal of Economics 120 (3), 865–915.
Jaimovich, N., 2004. Firm dynamics, markup variations and the business cycle. Mimeo, Stanford University.
Justiniano, A., Primiceri, G.E., Tambalotti, A., 2010. Investment shocks and business cycles. Journal of Monetary Economics 57 (2), 132–145.
Kehoe, T.J., Ruhl, K.J., 2006. How important is the new goods margin in international trade? 2006 Meeting Papers 733, Society for Economic Dynamics.
Lafontaine, F., Blair, R., 2005. The Economics of Franchising. Cambridge University Press, Cambridge, UK.
Romer, P.M., 1987. Growth based on increasing returns due to specialization. The American Economic Review 77 (2), 56–62.
Romer, P.M., 1990. Endogenous technological change. The Journal of Political Economy 98 (5), S71–S102.
Sims, C., 1980. Macroeconomics and reality. Econometrica 48 (1), 1–48.
Smets, F., Wouters, R., 2007. Shocks and frictions in US business cycles: a Bayesian DSGE approach. American Economic Review 97 (3), 586–606.
Spence, M., 1976a. Product differentiation and welfare. The American Economic Review Papers and Proceedings 76, 407–414.
Spence, M., 1976b. Product selection, fixed costs, and monopolistic competition. Review of Economic Studies 43, 217–235.
Uhlig, H., 2005. What are the effects of monetary policy on output? Results from an agnostic identification procedure. Journal of Monetary Economics 52 (2), 381–419.
Whelan, K., 2003. A two-sector approach to modeling U.S. NIPA data. Journal of Money, Credit and Banking 35 (4), 627–656.
Journal of Monetary Economics 58 (2011) 98–116
Money and capital☆

S. Borağan Aruoba (University of Maryland, United States), Christopher J. Waller (Federal Reserve Bank of St. Louis and University of Notre Dame, United States), Randall Wright (University of Wisconsin—Madison and Federal Reserve Bank of Minneapolis, United States)
Article history: Received 11 September 2010; received in revised form 2 March 2011; accepted 3 March 2011; available online 15 March 2011.

Abstract

The effect of money (anticipated inflation) on capital formation is a classic issue in macroeconomics. Previous papers adopt reduced-form approaches, putting money in the utility function or imposing cash in advance, but using otherwise frictionless models. We follow instead a literature that tries to be explicit about the frictions making money essential. This introduces new elements, including a two-sector structure with centralized and decentralized markets, stochastic trading opportunities, and bargaining. These elements matter quantitatively, and numerical results differ from findings in the reduced-form literature. The analysis also reduces a gap between microfounded monetary economics and mainstream macro.

© 2011 Elsevier B.V. All rights reserved.
1. Introduction

The relation between anticipated inflation and capital formation is a classic issue in macroeconomics, going back at least to Tobin (1965), Sidrauski (1967a,b), Stockman (1981), Cooley and Hansen (1989, 1991), Gomme (1993), Ireland (1994) and many others. All these contributors adopt reduced-form approaches: they put money in the utility function, or impose cash in advance, in an attempt to capture implicitly the role of money in the exchange process, but use otherwise frictionless models. An alternative literature on money, going back to Kiyotaki and Wright (1989, 1993), Aiyagari and Wallace (1991), Shi (1995), Trejos and Wright (1995), Kocherlakota (1998), Wallace (2001) and others, strives to be more explicit about the frictions that make a medium of exchange essential.1 In doing so, these papers introduce new elements into monetary economics, including detailed descriptions of specialization, information, matching, alternative pricing mechanisms, etc. Many papers in the area show how these ingredients matter in theory. This paper shows they also matter for quantitative analysis.

We use the two-sector model in Lagos and Wright (2005), where some economic activity takes place in centralized markets and some in decentralized markets. In addition to providing microfoundations for money, the use of decentralized markets allows the introduction of ingredients like stochastic trading opportunities and bargaining, while centralized
☆ This paper previously circulated with the title ''Money and Capital: A Quantitative Analysis''. The authors thank David Andolfatto, Paul Beaudry, Gabriele Camera, Miguel Faig, Paul Gomme, Marcus Hagedorn, James Kahn, Ricardo Lagos, Iourii Manovskii, Ellen McGrattan, Lee Ohanian, Guillaume Rocheteau, Richard Rogerson, Chris Sims, Michael Woodford, and many other participants in seminars and conferences for comments. The NSF, the Bank of Canada, the Federal Reserve Bank of Cleveland, and the University of Maryland General Research Board provided research support. Wright also thanks the Ray Zemon Chair in Liquid Assets at the Wisconsin School of Business. Corresponding author. Tel.: +1 301 405 3508; fax: +1 301 405 3542. E-mail address: [email protected] (S.B. Aruoba).
1 This literature has recently been dubbed New Monetarist Economics. See Williamson and Wright (2010a,b) and Nosal and Rocheteau (2010) for extended discussions and surveys.
markets are useful to incorporate capital as in standard growth theory. This is a further step toward integrating theories with decentralized trade, on the one hand, and mainstream macro, on the other, which has been a challenge for some time.2

The framework in this paper combines components from both standard models in macro and models in monetary theory that strive for better microfoundations. To explain how this works, relative to reduced-form models, here are the ingredients that matter most for the results. First, stochastic trading opportunities, like those in search models, are critical for matching some observations, including observations on velocity. These observations are notoriously hard to capture in cash-in-advance models, especially when calibrating to a shorter period length (see Telyukova and Visschers, 2009 for a recent discussion). Previous models of the reduced-form variety, such as those mentioned above, did not incorporate stochastic trading opportunities—which is not to say they could not, but simply to say that they did not—and hence miss this point.

Second, the two-sector structure highlights a channel not in the models mentioned above. Because capital produced in the centralized market is used in decentralized production, and inflation is a tax on decentralized trade, monetary policy affects centralized market investment. The transmission of inflation effects from the sector where goods are traded using money to a sector where inputs for these goods are produced is new compared to reduced-form models.

Third, as explained in more detail below, the results depend in interesting ways on what one assumes about price formation in decentralized trade. If one uses bargaining, inflation has little impact on investment, although it has a sizable impact on consumption and welfare: going from 10% inflation to the Friedman rule barely changes capital, but the welfare gain is still worth around 3% of consumption. Alternatively, with price taking, the same experiment increases long-run capital by as much as 7%, and has a welfare effect between 1% and 1.5%. There is nothing in the reduced-form literature that considers bargaining, and hence this comparison has been missed.

Fourth, fiscal policy also matters: due to tax distortions, the first best outcome cannot be obtained even at the optimal monetary policy, which increases the cost of inflation under either bargaining or price taking. Although this is certainly not the first time this has been pointed out, and some of the papers mentioned above also include taxation, the interaction between taxation and the other key ingredients (stochastic trade, the two-sector structure, and alternative pricing mechanisms) has not been studied.

The intuition for why it matters whether one assumes price taking or bargaining is the following. When agents invest in capital they not only earn income in the centralized market, they also lower their production cost in the decentralized market. But there is a holdup problem, well known to practitioners of bargaining theory, if somewhat neglected in macro and growth theory (but see Caballero, 1999 for some discussion). Suppose the buyer gets a big share of the surplus in bilateral trade. Then the seller does not reap much of a return on his investment above what he gets in standard models, so the demand for capital does not depend much on what happens in decentralized trade, and inflation does not affect investment much. Now suppose the buyer has low bargaining power.
Then the seller does get a big share of the surplus, but the surplus is small, due to a holdup problem on money demand. So whether buyer bargaining power is high or low, inflation has a small impact on investment. This depends on calibration, of course, but the impact is quite small for a wide range of parameters. Nonetheless, due to these holdup problems, decentralized market consumption is very low, so even though inflation does not have a huge effect on decentralized trade, due to concavity of the utility function it does have a sizable impact on welfare.

With price taking these holdup problems vanish. This means investment demand depends much more on what happens in the decentralized market. Since inflation is a tax on decentralized trade, it acts as a tax on investment. Thus monetary policy can have a big impact on capital formation. However, without the holdup problems, decentralized market consumption is not nearly so low, and thus when it decreases with inflation the net effect is less painful with competitive price taking than with bargaining. The costs of bargaining inefficiencies are sizable. This is true even though bargaining is used only in the decentralized market, which accounts for a small share of aggregate output for the calibrated parameter values.

These results, on bargaining versus price taking, in models that are otherwise similar to standard macro, are novel. However, to be clear, the goal is not to determine whether bargaining or price taking better fits the data—the goal is to lay out models with each and report how it matters. This is part of ongoing research trying to better understand how the details of the micro structure matter for macro and monetary economics.

In terms of the most closely related work in the area, a previous attempt to put capital into a similar monetary model by Aruoba and Wright (2003) led to some undesirable implications, including the following dichotomy: one can solve independently for allocations in the centralized and decentralized markets. This implies monetary policy has no impact on investment, employment or consumption in the centralized market. This is not the case here. Other attempts to study money and capital in models with frictions include Shi (1999), Shi and Wang (2006), and Menner (2006), who build on Shi (1997), and Molico and Zhang (2005), who build on Molico (2006). Those models have only decentralized markets. It is much easier to connect with mainstream macro in a model with some centralized trade. Thus, as a special case, in nonmonetary equilibrium the model developed in this paper reduces to the textbook growth model, while those models
2 As Azariadis (1993) put it, ‘‘Capturing the transactions motive for holding money balances in a compact and logically appealing manner has turned out to be an enormously complicated task. Logically coherent models such as those proposed by Diamond (1984) and Kiyotaki and Wright (1989) tend to be so removed from neoclassical growth theory as to seriously hinder the job of integrating rigorous monetary theory with the rest of macroeconomics’’. And as Kiyotaki and Moore (2001) put it, ‘‘The matching models are without doubt ingenious and beautiful. But it is quite hard to integrate them with the rest of macroeconomic theory—not least because they jettison the basic tool of our trade, competitive markets’’.
reduce to something quite different. It is also worth emphasizing that with either bargaining or price taking the numerical results obtained using this model differ from the reduced-form literature. Here is a short survey: Cooley and Hansen (1989, 1991) find much smaller effects, with welfare numbers substantially below 1%. Gomme (1993) gets even smaller effects in an endogenous growth version of the model. Ireland (1994) gets welfare numbers around 0.67%. Lucas (2000), without capital, gets welfare numbers below 1%; earlier efforts at this approach by Lucas (1981) and Fischer (1981) get 0.3–0.45%. Imrohoroglu and Prescott (1991) also get less than 1%. Quantitatively, inflation matters a lot more in the framework developed in this paper.3 This paper makes a contribution to policy-relevant quantitative economics as well as to theory, in the sense that the model brings modern monetary economics much closer to the mainstream. At the very least this should facilitate communication between different camps in macro. The rest of the paper proceeds to make the case as follows. Section 2 describes the model. Section 3 lays out the calibration strategy. Section 4 presents the quantitative results. Section 5 concludes.4
2. Model

This section presents the model. After a description of the environment, the agents' centralized market and decentralized market problems are laid out; equilibrium for the bargaining and price-taking versions of the model and a brief digression on banking conclude the section.
2.1. General assumptions

A [0,1] continuum of agents live forever in discrete time. To combine elements of standard macro and search theory, we adopt the sectoral structure in Lagos and Wright (2005), hereafter LW. Each period agents engage in two types of economic activity. Some activity takes place in a frictionless centralized market, called the CM, and some takes place in a decentralized market, called the DM, with two main frictions: a double coincidence problem and anonymity, which combine to make a medium of exchange essential.5 Given that a medium of exchange is essential, one issue in monetary theory is to determine endogenously which objects serve this function (e.g. Kiyotaki and Wright, 1989). In order to focus on other questions, however, other papers avoid this by assuming there is a unique storable asset that qualifies for the role. Since one obviously cannot assume a unique storable asset in a paper called ''Money and Capital'', a few words about the issue are in order. A story along the lines of the ''worker–shopper pair'' used to motivate cash-in-advance constraints by Lucas (1980), extended based on time-honored ideas about currency having advantages in terms of portability and recognizability, is useful. First, in terms of portability, in the DM the agents have their capital physically fixed in place at production sites. Thus, when you want to buy something from someone you must visit their location, and since you cannot bring your capital, it cannot be used in payment. This use of spatial separation is in the spirit of the ''worker–shopper'' idea, but one really should go beyond this, in any model, and ask why claims to capital, or claims more generally, cannot overcome this problem of spatial separation. That is to say, the ''worker–shopper'' idea may rule out barter, but it is logically irrelevant for ruling out credit or other more sophisticated trading arrangements, and hence cash-in-advance models are not really well motivated by this story at all. A logically coherent theory needs some additional frictions—recognizability, in this case. A stark version of the assumption that works is that agents can costlessly counterfeit claims, other than currency, say, because the monetary authority has a monopoly on the technology for producing hard-to-counterfeit notes. Given this, sellers no more accept claims to capital from anonymous buyers in the DM than they accept personal IOU's. Therefore, money has a role to play in payments and exchange, even if capital is a storable factor of production. While this is not the place to go into all of the details concerning explicit information frictions and the notion of recognizability, there is ongoing research attempting to make this more rigorous (see Lester et al., 2010 and the references therein). And while this is by no means the last word on the coexistence of money and other assets, the story is at least logically coherent. It is also important to emphasize that it is by now well understood that, even if one allows capital to be used as a medium of exchange, money is still essential when the efficient stock of (portable and recognizable) capital is low, since otherwise agents overinvest (see Lagos and Rocheteau, 2008 and the references therein). The assumptions here guarantee that capital is not used as a medium of exchange, for simplicity, and because it is the more relevant case during the period in question for the quantitative analysis in this paper; but there is certainly more to be done on this topic.
3 To understand why most previous models generate such low costs of inflation, it helps to remember the envelope theorem. In a typical cash-in-advance setup, for instance, at the Friedman rule the first best is obtained, and so a relatively small inflation has only a second-order effect on welfare. That is not the case in this paper since, due to holdup problems, even the Friedman rule (and even with no tax distortions) does not achieve the first best. A few papers do find larger effects, such as Dotsey and Ireland (1996), because even though inflation does not affect capital very much it affects the amount of resources used in intermediation. 4 An appendix with alternative models and detailed information about the data used is available online at http://www.boraganaruoba.com and on the journal website. 5 For formal discussions of essentiality and anonymity see Kocherlakota (1998), Wallace (2001) or Aliprantis et al. (2007).
Moving on to the details of the specification, as in standard growth theory, in the CM there is a general good that can be used for consumption or investment, produced using labor H and capital K hired by firms in competitive markets. Profit maximization implies $r = F_K(K,H)$ and $w = F_H(K,H)$, where F is the technology, r the rental rate, and w the real wage. Constant returns implies equilibrium profits are 0. In the DM these firms do not operate, but an agent's own effort e and capital k can be used with technology f(e,k) to produce a different good. Note that k appears as an input in the DM because, when you go to a seller's location, he has access to his capital, even though you do not have access to yours. This is important—it is the fact that capital produced in the CM is productive in the DM that breaks the dichotomy mentioned in the Introduction, and this allows money to have interesting effects in the CM.

In the DM, each period with probability $\sigma$ an agent discovers he is a buyer, which means he wants to consume but cannot produce, so he visits the location of someone that can produce; with probability $\sigma$ he is a seller, which means he can produce but does not want to consume, so he waits at his location for someone to visit him; and with probability $1-2\sigma$ he is a nontrader, and he neither produces nor consumes. This taste-and-technology-shock specification is equivalent to bilateral random matching, where there is a probability $\sigma$ of meeting someone that produces a good that you like, but the interpretation here fits better with the idea of spatial separation, where buyers visit sellers' locations.

In some buyer–seller meetings, the buyer is able to pay with credit due in the next CM. As in many of the models surveyed in Williamson and Wright (2010a,b) and Nosal and Rocheteau (2010), these meetings are monitored, as opposed to anonymous. Let $\ell$ (for loan) be the payment made in the CM, measured in dollars without loss of generality, and assume that it is costlessly enforced. But credit is only available in meetings with probability $1-\omega$. With probability $\omega$, the meeting is anonymous, or not monitored, and the seller requires cash.

Instantaneous utility for everyone in the CM is $U(x) - Ah$, where x is consumption and h labor. As in most applications of LW-style models, linearity in h reduces the complexity of the analysis considerably, although Rocheteau et al. (2008) show how to get the same simplification with general preferences by assuming indivisible labor and lotteries a la Rogerson (1988). Moreover, one can dispense with quasi-linear utility, or indivisible labor and lotteries, altogether as long as one is willing to use more sophisticated computational methods, as in Molico (2006) or Chiu and Molico (2010). In the DM, buyers enjoy utility u(q), and sellers get disutility e, where q is consumption and e labor (normalizing the disutility of DM labor to be linear is a choice of units with no implications for the results). Assume u and U have the usual monotonicity and curvature properties. Solving $q = f(e,k)$ for $e = c(q,k)$, the function $c(\cdot)$ denotes the utility cost of producing q given k. One can show that $c_q > 0$, $c_k < 0$, $c_{qq} > 0$ and $c_{kk} > 0$ under the usual assumptions on f, and $c_{qk} < 0$ if k is a normal input.

Government sets the money supply so that $M_{+1} = (1+\tau)M$, where the subscript +1 denotes next period. The policy instrument in this paper is $\tau$.
In steady state, inflation equals $\tau$ and the nominal rate is defined by the Fisher equation $1+i = (1+\tau)/\beta$, and hence either i or $\tau$ can be used as the policy instrument. Government also consumes G, levies a lump-sum tax T, a labor income tax $\tau_h$, a capital income tax $\tau_k$, and a sales tax $\tau_x$ in the CM (sales taxes in the DM are omitted to ease the presentation, but it makes little difference for the results). Letting $\delta$ be the depreciation rate of capital, which is tax deductible, and p the CM price level, the government budget constraint is

$$G = T + \tau_h wH + (r-\delta)\tau_k K + \tau_x X + \tau M/p$$

if M is interpreted as M0. Alternatively, if M1 is used as the relevant money stock, one must make an appropriate adjustment to the real revenue the government earns from printing money (something that seems to have gone unnoticed in at least some of the previous literature).

2.2. Household's problem

Let $W(m,k,\ell)$ be the value function for an agent in the CM holding m dollars and k units of capital and owing $\ell$ from the previous DM. Let $V(m,k)$ be the DM value function. Assuming agents discount between the CM and DM at rate $\beta \in (0,1)$, but not between the DM and CM, the problem in the CM is

$$W(m,k,\ell) = \max_{x,h,m_{+1},k_{+1}}\{U(x) - Ah + \beta V_{+1}(m_{+1},k_{+1})\} \tag{1}$$

subject to

$$(1+\tau_x)x = w(1-\tau_h)h + [1+(r-\delta)(1-\tau_k)]k - k_{+1} - T + \frac{m - m_{+1} - \ell}{p}. \tag{2}$$

One can adapt the discussion in LW to guarantee the concavity of the problem and interiority of the solution (or, in quantitative analysis, one can check it directly). Then, eliminating h using the budget, the first order conditions are

$$x:\ U'(x) = \frac{A(1+\tau_x)}{w(1-\tau_h)}, \tag{3}$$

$$m_{+1}:\ \frac{A}{pw(1-\tau_h)} = \beta V_{+1,m}(m_{+1},k_{+1}), \tag{4}$$

$$k_{+1}:\ \frac{A}{w(1-\tau_h)} = \beta V_{+1,k}(m_{+1},k_{+1}). \tag{5}$$
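As a quick check on this algebra, the following sketch (with symbol names of our own choosing) eliminates h from the budget constraint (2) and differentiates the CM objective with respect to x, recovering (3).

```python
# A minimal symbolic check of FOC (3): substitute h from the budget
# constraint (2) into the CM objective and differentiate with respect to x.
import sympy as sp

x, A, w, p, m, m1, k, k1, T, ell, r, delta = sp.symbols(
    'x A w p m m1 k k1 T ell r delta', positive=True)
tx, th, tk = sp.symbols('tau_x tau_h tau_k', positive=True)
U = sp.Function('U')

# Solve the budget constraint (2) for hours h.
h = ((1 + tx)*x - (1 + (r - delta)*(1 - tk))*k + k1 + T
     - (m - m1 - ell)/p) / (w*(1 - th))

# CM objective, dropping the continuation value (it does not depend on x).
objective = U(x) - A*h

foc = sp.diff(objective, x)
print(sp.simplify(foc))   # U'(x) - A*(1 + tau_x)/(w*(1 - tau_h)), i.e. Eq. (3)
```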
Since $(m,k,\ell)$ does not appear in (4), for any distribution of $(m,k,\ell)$ across agents entering the CM, the distribution of $(m_{+1},k_{+1})$ exiting the CM is degenerate. Also, it is immediate that W is linear:

$$W_m(m,k,\ell) = \frac{A}{pw(1-\tau_h)}, \tag{6}$$

$$W_k(m,k,\ell) = \frac{A[1+(r-\delta)(1-\tau_k)]}{w(1-\tau_h)}, \tag{7}$$

$$W_\ell(m,k,\ell) = -\frac{A}{pw(1-\tau_h)}. \tag{8}$$
Moving to the DM, the value of entering the DM is given by

$$V(m,k) = \sigma V^b(m,k) + \sigma V^s(m,k) + (1-2\sigma)W(m,k,0), \tag{9}$$

where $V^b(m,k)$ and $V^s(m,k)$ denote the values to being a buyer and to being a seller, which are given by

$$V^b(m,k) = \omega[u(q_b) + W(m-d_b,k,0)] + (1-\omega)[u(\hat q_b) + W(m,k,\ell_b)], \tag{10}$$

$$V^s(m,k) = \omega[-c(q_s,k) + W(m+d_s,k,0)] + (1-\omega)[-c(\hat q_s,k) + W(m,k,-\ell_s)]. \tag{11}$$
In these expressions, $q_b$ and $d_b$ ($q_s$ and $d_s$) denote the quantity of goods and dollars exchanged when buying (selling) for money, while $\hat q_b$ and $\ell_b$ ($\hat q_s$ and $\ell_s$) denote the quantity and the value of the loan for the buyer (seller) when trading on credit. Given all this, it is now straightforward to derive

$$V_m(m,k) = \frac{A}{pw(1-\tau_h)} + \sigma\omega\left[u'(q_b)\frac{\partial q_b}{\partial m} - \frac{A}{pw(1-\tau_h)}\frac{\partial d_b}{\partial m}\right] + \sigma\omega\left[\frac{A}{pw(1-\tau_h)}\frac{\partial d_s}{\partial m} - c_q\frac{\partial q_s}{\partial m}\right], \tag{12}$$

$$\begin{aligned}
V_k(m,k) ={}& \frac{A[1+(r-\delta)(1-\tau_k)]}{w(1-\tau_h)} + \sigma\omega\left[u'(q_b)\frac{\partial q_b}{\partial k} - \frac{A}{pw(1-\tau_h)}\frac{\partial d_b}{\partial k}\right] + \sigma\omega\left[\frac{A}{pw(1-\tau_h)}\frac{\partial d_s}{\partial k} - c_q\frac{\partial q_s}{\partial k} - c_k\right]\\
&+ \sigma(1-\omega)\left[u'(\hat q_b)\frac{\partial \hat q_b}{\partial k} - \frac{A}{pw(1-\tau_h)}\frac{\partial \ell_b}{\partial k}\right] + \sigma(1-\omega)\left[\frac{A}{pw(1-\tau_h)}\frac{\partial \ell_s}{\partial k} - c_q(\hat q_s,k)\frac{\partial \hat q_s}{\partial k} - c_k(\hat q_s,k)\right]. \tag{13}
\end{aligned}$$
To complete the analysis, the terms of trade (q, d, $\hat q$ and $\ell$) need to be specified, which in turn will determine the derivatives in (12) and (13). Before doing so, however, consider as a benchmark the planner's problem when money is not essential:

$$J(K) = \max_{X,H,K_{+1},q}\{U(X) - AH + \sigma[u(q) - c(q,K)] + \beta J_{+1}(K_{+1})\} \tag{14}$$

subject to

$$X = F(K,H) + (1-\delta)K - K_{+1} - G. \tag{15}$$

Eliminating X, and again assuming interiority, the first order conditions are

$$q:\ u'(q) = c_q(q,K), \tag{16}$$

$$H:\ A = U'(X)F_H(K,H), \tag{17}$$

$$K_{+1}:\ U'(X) = \beta J'_{+1}(K_{+1}). \tag{18}$$

The envelope condition $J'(K) = U'(X)[F_K(K,H)+1-\delta] - \sigma c_k(q,K)$ implies

$$U'(X) = \beta U'(X_{+1})[F_K(K_{+1},H_{+1})+1-\delta] - \beta\sigma c_k(q_{+1},K_{+1}). \tag{19}$$

From (16), $q = q^*(K)$, where $q^*(K)$ solves $u'(q) = c_q(q,K)$. Then the paths for $(K_{+1},H,X)$ satisfy (15), (17) and (19). This characterizes the first best, or FB for short.6 Note the term $-\beta\sigma c_k(q_{+1},K_{+1}) > 0$ in (19), which reflects the fact that investment affects DM as well as CM productivity, because K is used in both sectors. If K did not appear in c(q) the system would dichotomize: one could first set $q = q^*$, where $q^*$ solves $u'(q) = c'(q)$, and then solve the other conditions independently for $(K_{+1},H,X)$. The fact that K is used in the DM and produced in the CM breaks this dichotomy. Here the assumption is that the same K is used in both sectors, but the online appendix contains a version with two distinct capital goods in the CM and DM, as well as a version where K is used only in the CM but is produced and traded in the DM. As discussed in Section 4.3, these variations do not affect the main results much.

6 As is standard, one can characterize the solution by the FOC and envelope condition, or replace the FOC for $K_{+1}$ and envelope condition by the Euler equation and transversality. One can check when there is a unique steady state to which the planner's solution converges under the usual kind of assumptions. This is less straightforward for equilibria with distortions. In the working paper we show there is a unique steady state under price taking.
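Before turning to bargaining, here is a rough numerical sketch of the FB steady state implied by (15)–(19), using the functional forms introduced later in Section 3.2 with $\varepsilon = \eta = 1$; the parameter values and helper names are ours, for illustration only.

```python
# A rough sketch of the first-best steady state, Eqs. (15)-(19), with
# U(X) = B*ln(X), F = K^alpha H^(1-alpha), u(q) = ln(q+b) - ln(b),
# c(q,k) = q^psi k^(1-psi). Parameter values are illustrative.
import numpy as np
from scipy.optimize import fsolve

alpha, beta, delta, sigma = 0.3, 0.97, 0.07, 0.3
A, B, psi, b, G = 5.4, 2.0, 1.2, 1e-4, 0.0

u_p = lambda q: 1.0 / (q + b)                       # u'(q), eta = 1 case
c_q = lambda q, K: psi * q**(psi-1) * K**(1-psi)
c_k = lambda q, K: (1-psi) * (q/K)**psi             # psi > 1 implies c_k < 0

def q_star(K):
    # Eq. (16): u'(q) = c_q(q,K)
    return fsolve(lambda q: u_p(q) - c_q(q, K), 0.5)[0]

def system(z):
    K, H = z
    q = q_star(K)
    X = K**alpha * H**(1-alpha) - delta*K - G       # (15) in steady state
    Up = B / X                                      # U'(X), eps = 1 case
    foc_H = A - Up * (1-alpha) * (K/H)**alpha       # Eq. (17)
    euler = Up * (1/beta - alpha*(H/K)**(1-alpha) - 1 + delta) \
            + sigma * c_k(q, K)                     # steady-state Eq. (19)
    return [foc_H, euler]

K, H = fsolve(system, [1.6, 0.33])
print(f"FB steady state: K = {K:.3f}, H = {H:.3f}, q = {q_star(K):.3f}")
```

The $\sigma c_k$ term in the Euler equation is the extra DM return on capital discussed in the text: it pushes the FB capital stock above what the one-sector growth model would deliver.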
2.3. Bargaining

Assume the DM terms of trade are determined by generalized Nash bargaining.7 Consider first a nonmonitored meeting where trade requires cash. If the buyer's and seller's states are $(m_b,k_b)$ and $(m_s,k_s)$, (q,d) solves the generalized Nash bargaining problem with the bargaining power of the buyer given by $\theta$ and threat points given by continuation values. Since the buyer's payoff from trade is $u(q)+W(m_b-d,k_b,0)$ and his threat point is $W(m_b,k_b,0)$, by the linearity of W, his surplus is $u(q)-Ad/[pw(1-\tau_h)]$. Similarly, the seller's surplus is $Ad/[pw(1-\tau_h)]-c(q,k_s)$. Hence the bargaining solution is

$$\max_{q,d}\left[u(q)-\frac{Ad}{pw(1-\tau_h)}\right]^{\theta}\left[\frac{Ad}{pw(1-\tau_h)}-c(q,k_s)\right]^{1-\theta} \quad \text{s.t. } d \le m_b. \tag{20}$$

As in LW, it is easy to show that in equilibrium $d = m_b$. Inserting this and taking the first order condition with respect to q,

$$\frac{m_b}{p} = \frac{g(q,k_s)w(1-\tau_h)}{A}, \tag{21}$$

where

$$g(q,k_s) \equiv \frac{\theta c(q,k_s)u'(q)+(1-\theta)u(q)c_q(q,k_s)}{\theta u'(q)+(1-\theta)c_q(q,k_s)}. \tag{22}$$

Writing $q = q(m_b,k_s)$, where $q(\cdot)$ is given by (21), the relevant derivatives in (12) and (13) are $\partial d/\partial m_b = 1$, $\partial q/\partial m_b = A/[pw(1-\tau_h)g_q] > 0$ and $\partial q/\partial k_s = -g_k/g_q > 0$, where

$$g_q = \frac{u'c_q[\theta u'+(1-\theta)c_q]+\theta(1-\theta)(u-c)[u'c_{qq}-c_q u'']}{[\theta u'+(1-\theta)c_q]^2} > 0, \tag{23}$$

$$g_k = \frac{\theta u'c_k[\theta u'+(1-\theta)c_q]+\theta(1-\theta)(u-c)u'c_{qk}}{[\theta u'+(1-\theta)c_q]^2} < 0. \tag{24}$$

Now consider a meeting where credit is available, assuming the buyer has the same bargaining power $\theta$. Then $(\hat q,\ell)$ is determined just like (q,d) above, except that there is no constraint on $\ell$, the way $d \le m_b$ needed to hold in monetary trades. Hence, the solution is given by

$$u'(\hat q) = c_q(\hat q,k_s), \tag{25}$$

$$\frac{A\ell}{pw(1-\tau_h)} = (1-\theta)u(\hat q)+\theta c(\hat q,k_s). \tag{26}$$

Given $k_s = K$, notice that $\hat q(K)$ is the same as the solution to the planner's problem $q^*(K)$ (bilateral credit transactions are efficient, conditional on K). It is now easy to take the relevant derivatives and insert them into (12) and (13). After imposing $(m,k)=(M,K)$, this delivers

$$V_m(M,K) = \frac{(1-\sigma\omega)A}{pw(1-\tau_h)}+\frac{\sigma\omega A u'(q)}{pw(1-\tau_h)g_q(q,K)}, \tag{27}$$

$$V_k(M,K) = \frac{A[1+(r-\delta)(1-\tau_k)]}{w(1-\tau_h)}-\sigma\omega\tilde g(q,K)-\sigma(1-\omega)(1-\theta)c_k(\hat q,K), \tag{28}$$

where it is understood that $q = q(M,K)$ and $\hat q = \hat q(K)$, while

$$\tilde g(q,K) \equiv c_k(q,K)+c_q(q,K)\frac{\partial q}{\partial K} = c_k(q,K)-c_q(q,K)\frac{g_k(q,K)}{g_q(q,K)} < 0. \tag{29}$$

The last two terms in (28) capture the idea that if a seller has an extra unit of capital it affects marginal cost in the DM, which augments the value of investment in the CM.8 Substituting (27) and (28), as well as prices $p = AM/[w(1-\tau_h)g(q,K)]$, $r = F_K(K,H)$, and $w = F_H(K,H)$, into the first order conditions for $m_{+1}$ and $k_{+1}$, the two equilibrium conditions are

$$\frac{g(q,K)}{M} = \frac{\beta g(q_{+1},K_{+1})}{M_{+1}}\left[1-\sigma\omega+\sigma\omega\frac{u'(q_{+1})}{g_q(q_{+1},K_{+1})}\right], \tag{30}$$

$$U'(X) = \beta U'(X_{+1})\{1+[F_K(K_{+1},H_{+1})-\delta](1-\tau_k)\}-\beta(1+\tau_x)\sigma[\omega\tilde g(q_{+1},K_{+1})+(1-\omega)(1-\theta)c_k(\hat q_{+1},K_{+1})]. \tag{31}$$
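To illustrate how (30) pins down DM output, note that in steady state, with $M_{+1}=(1+\tau)M$ and (q,K) constant, (30) reduces to $u'(q)/g_q(q,K) = 1+i/(\sigma\omega)$, with $1+i = (1+\tau)/\beta$. The following minimal sketch solves this condition; the functional forms anticipate Section 3.2 and the parameter values are illustrative, not the calibrated ones.

```python
# A sketch of the steady-state version of Eq. (30) under bargaining:
#   u'(q)/g_q(q,K) = 1 + i/(sigma*omega),  with 1+i = (1+tau)/beta.
# Illustrative parameters; eta = 1 so u(q) = ln(q+b) - ln(b).
import numpy as np
from scipy.optimize import brentq

sigma, omega, theta, psi, b = 0.3, 0.85, 0.9, 1.2, 1e-4

u    = lambda q: np.log(q + b) - np.log(b)
u_p  = lambda q: 1.0/(q + b)
u_pp = lambda q: -1.0/(q + b)**2
c    = lambda q, K: q**psi * K**(1-psi)
c_q  = lambda q, K: psi * q**(psi-1) * K**(1-psi)
c_qq = lambda q, K: psi*(psi-1) * q**(psi-2) * K**(1-psi)

def g_q(q, K):
    # Eq. (23): derivative of the bargaining solution g(q,K)
    den = theta*u_p(q) + (1-theta)*c_q(q, K)
    num = (u_p(q)*c_q(q, K)*den
           + theta*(1-theta)*(u(q)-c(q, K))*(u_p(q)*c_qq(q, K) - c_q(q, K)*u_pp(q)))
    return num / den**2

def q_bargain(i, K):
    # Solve u'(q) = g_q(q,K)*(1 + i/(sigma*omega)) for q
    f = lambda q: u_p(q) - g_q(q, K)*(1 + i/(sigma*omega))
    return brentq(f, 1e-8, 10.0)

K = 1.0
for i in (0.0, 0.02, 0.10):         # i = 0 is the Friedman rule
    print(f"i = {i:4.2f}:  q = {q_bargain(i, K):.4f}")
```

Raising i lowers q, which is the sense in which inflation taxes decentralized trade; the holdup distortion shows up in the wedge between $u'(q)$ and $c_q(q,K)$ even at i = 0.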
7 At the suggestion of a referee we mention the following: Often Nash bargaining is motivated by arguing that it can be considered the reduced-form for an underlying strategic bargaining game (see e.g. Osborne and Rubinstein, 1990). But those strategic foundations do not generally apply in nonstationary situations like the one in this paper (see Coles and Wright, 1998 or Ennis, 2001). This means the Nash solution is a primitive here—there is no claim here that it can be derived from a strategic bargaining game in the usual manner. 8 The expression in (29) captures nonprice-taking behavior in the bargaining model: the first term reflects the cost reduction due to extra capital, and the second reflects the change in cost due to the change in the terms of trade when sellers have more capital.
Two other conditions come from the first order condition for X and the resource constraint,

$$U'(X) = \frac{A(1+\tau_x)}{(1-\tau_h)F_H(K,H)}, \tag{32}$$

$$X+G = F(K,H)+(1-\delta)K-K_{+1}. \tag{33}$$
An equilibrium with bargaining is defined as (positive, bounded) paths for $(q,K_{+1},H,X)$ satisfying (30)–(33), given policy and the initial condition $K_0$. We are mostly interested in monetary equilibrium, with q > 0 at every date. But consider for a moment nonmonetary equilibrium, with q = 0 at all dates. In this case, $(K_{+1},H,X)$ solves (31) and (33) with $\tilde g = 0$, which is exactly the equilibrium for a standard neoclassical growth model, as mentioned in the Introduction. Also, notice that if capital is not used in the DM, then c(q,K) = c(q) and $\tilde g(q,K) = c_k(q,K) = 0$. This version dichotomizes: since M appears in (30) but not in (31) and (33), monetary policy affects q but not $(K_{+1},H,X)$ or $\hat q$. Equilibrium does not dichotomize when K enters c(q,K). Notice, however, that if $\theta = 1$ then, although K enters c(q,K), (31)–(33) can be solved for $(K_{+1},H,X)$, and then (30) determines q, since $\tilde g(q,K) = 0$. So if $\theta = 1$ money still does not influence CM variables, even though anything that affects the CM (e.g. taxes) influences q. Intuitively, when $\theta = 1$ sellers do not get any of the surplus from DM trade, and so investment decisions are based solely on returns to K that accrue in the CM. Looking at (29), when $\theta = 1$, the cost reduction due to having more capital is exactly matched by the increase in cost due to higher production. This is an extreme version of a holdup problem in the demand for capital. More generally, for any $\theta > 0$, sellers do not get the full return on capital from DM trade, and hence they underinvest. This holdup problem is not present in most standard macro models, and constitutes a distortion over and above those from taxes and monetary inefficiencies. Even under the Friedman rule (FR), where i = 0, and with only lump-sum taxes, the holdup problem on capital and a related problem on money emphasized in LW remain. In some models all holdup problems can be resolved if one sets the bargaining power $\theta$ correctly. This is not possible here: $\theta = 1$ resolves the problem in the demand for money, but this is the worst case for investment; and $\theta = 0$ resolves the problem in the demand for capital, but this is the worst case for money. There is no $\theta$ that can eliminate the double holdup problem, which has implications for both the empirical performance of bargaining models and their welfare implications.

2.4. Price taking

While the holdup problems cannot simultaneously be solved by bargaining, some other solution concepts work much better. For example, it is by now well known that competitive search equilibrium, based on directed search and price posting, rather than random matching and bargaining, resolves multiple holdup problems (see e.g. Acemoglu and Shimer, 1999 or Mortensen and Wright, 2002). And competitive equilibrium with Walrasian price taking also does the job here, even though this is not true in general (for example, Rocheteau and Wright, 2005 show that competitive search equilibrium can do better than competitive equilibrium in environments with search externalities, but there are no such externalities in the model). Since it is easier to present, relative to price posting with directed search, in this section we consider price taking.9 For simplicity assume that there are two distinct trading locations in the DM, one for anonymous traders where cash is needed, and one where credit is available. Agents do not get to choose, but are randomly assigned to one location.
The DM value function then has the same form as (9), but now, in the location with anonymous meetings,

$$V^s(m,k) = \max_q\{-c(q,k)+W(m+\tilde p q,k,0)\}, \tag{34}$$

$$V^b(m,k) = \max_q\{u(q)+W(m-\tilde p q,k,0)\} \quad \text{s.t. } \tilde p q \le m, \tag{35}$$

where $\tilde p$ is the price (which generally differs from the CM price p), and in the location with monitored meetings

$$\hat V^s(m,k) = \max_{\hat q}\{-c(\hat q,k)+W(m,k,-\hat p\hat q)\}, \tag{36}$$

$$\hat V^b(m,k) = \max_{\hat q}\{u(\hat q)+W(m,k,\hat p\hat q)\}. \tag{37}$$

The first order conditions for the sellers in the two DM locations are

$$c_q(q,k) = \tilde p W_m = \tilde p A/[pw(1-\tau_h)], \tag{38}$$
$$c_q(\hat q,k) = -\hat p W_\ell = \hat p A/[pw(1-\tau_h)]. \tag{39}$$

9 At the suggestion of a referee we emphasize the following: one can think of the two models—the one with bargaining and the one with price taking—as representing two alternative environments, one with bilateral meetings and one with multilateral meetings. In the former case it makes sense to let agents bargain, while in the latter it makes sense to assume they take as given the price that clears the market, as is standard in competitive equilibrium. By analogy, one could think about the labor market search models of Mortensen and Pissarides (1994), with bilateral meetings and bargaining, and Lucas and Prescott (1974), with multilateral meetings and price taking. However, it is important to emphasize that the only impact of assuming multilateral matching here is to motivate (Walrasian) price taking, and it has no implications for the set of allocations that are feasible, since all agents with the same trading status (buyer versus seller) in the DM are identical. This would not be the case in some search models of exchange (e.g. Kiyotaki and Wright, 1989), where the switch from multilateral to bilateral meetings would make a big difference for feasible allocations. Alternatively, one can simply interpret the analysis of the price-taking model in this paper as an analytic shortcut to deriving the allocation that obtains in competitive search equilibrium, which does not require multilateral meetings.
Market clearing implies buyers and sellers choose the same q, and the same $\hat q$. As in the previous model, in the anonymous market buyers spend all their money, so $q = M/\tilde p$. Inserting $\tilde p = M/q$, the analog to (21) from the bargaining model is given by

$$\frac{M}{p} = \frac{qc_q(q,k)w(1-\tau_h)}{A}. \tag{40}$$

Similarly, when credit is available, $\hat q = \hat q(K)$, as in the bargaining model, but now $\ell = pw(1-\tau_h)u'(\hat q)\hat q/A$. Then the analogs to (27) and (28) are

$$V_m(M,K) = \frac{(1-\sigma\omega)A}{pw(1-\tau_h)}+\frac{\sigma\omega u'(q)}{\tilde p}, \tag{41}$$

$$V_k(M,K) = \frac{A[1+(r-\delta)(1-\tau_k)]}{w(1-\tau_h)}-\sigma\omega c_k(q,K)-\sigma(1-\omega)c_k(\hat q,K). \tag{42}$$

Inserting these into (4) and (5) yields the analogs to (30) and (31):

$$\frac{c_q(q,K)q}{M} = \frac{\beta c_q(q_{+1},K_{+1})q_{+1}}{M_{+1}}\left[1-\sigma\omega+\sigma\omega\frac{u'(q_{+1})}{c_q(q_{+1},K_{+1})}\right], \tag{43}$$

$$U'(X) = \beta U'(X_{+1})\{1+[F_K(K_{+1},H_{+1})-\delta](1-\tau_k)\}-\beta(1+\tau_x)\sigma[\omega c_k(q_{+1},K_{+1})+(1-\omega)c_k(\hat q_{+1},K_{+1})]. \tag{44}$$
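It may help to record the steady-state implication of (43); the following display is our rearrangement under a constant money growth rate $\tau$, not an equation appearing as such in the text. With $M_{+1}=(1+\tau)M$ and (q,K) constant, (43) reduces to

$$\frac{1+\tau}{\beta} = 1+i = 1-\sigma\omega+\sigma\omega\,\frac{u'(q)}{c_q(q,K)} \quad\Longleftrightarrow\quad \frac{u'(q)}{c_q(q,K)} = 1+\frac{i}{\sigma\omega},$$

so the efficient margin $u'(q) = c_q(q,K)$ from (16) obtains exactly when i = 0, anticipating the Friedman rule result in footnote 10, while any i > 0 acts as a tax on DM consumption.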
The other equilibrium conditions, (32) and (33), are the same as above. An equilibrium with price taking is given by (positive, bounded) paths for $(q,K_{+1},H,X)$ satisfying (43)–(44) and (32)–(33), given policy and $K_0$. The difference between bargaining and price taking is the difference between (30)–(31) and (43)–(44). The equilibrium condition for q here looks like the one from the bargaining model when $\theta = 1$, and the condition for K looks like the one from the bargaining model when $\theta = 0$, indicating that price taking avoids both holdup problems.10

10 To show this formally, set $\tau_k = \tau_h = \tau_x = 0$. Then under price taking the equilibrium conditions for $(K_{+1},H,X)$ are the same as those for the planner problem. Hence the equilibrium coincides with the FB iff $u'(q) = c_q(q,K)$. From (43), this means $c_q(q,K)q/M = \beta c_q(q_{+1},K_{+1})q_{+1}/M_{+1}$. Using (40) this reduces to $1/(pw) = \beta/(p_{+1}w_{+1})$. Since $w = A/U'(X)$, it further reduces to $p/p_{+1} = U'(X)/[\beta U'(X_{+1})]$. Since in any equilibrium the slope of the indifference curve $U'(X)/[\beta U'(X_{+1})]$ equals the slope of the budget line 1+r, with r equal to the real interest rate, the relation in question finally reduces to $p_{+1}/p = 1/(1+r)$. Using the Fisher equation, this holds, and hence $q = q^*(K)$ solves (43), iff the nominal rate is set to zero. This implies that under price taking with lump-sum taxes, setting i = 0 yields efficiency.

2.5. A digression on the relevant concept of money

At first blush, it might seem the relevant notion of money here is M0, but that is not the only interpretation. Although it has not yet been done in a fully satisfactory way, one can imagine introducing banks into the model following the approach in Berentsen et al. (2007) (see also He et al., 2008 and Chiu and Meh, in press). Assume that after production and exchange stop in the CM, at which point agents have decided their $m_{+1}$, it is revealed which ones want to consume and which ones are able to produce while banks are still open but before the DM convenes. As the sellers have no use for money, they deposit it in banks, who then lend it to buyers, at interest. One can think of banks either lending out the same physical currency, or perhaps keeping it in the vault and issuing bank-backed securities usable for payments (assuming these are not easily counterfeitable). However, in neither case does one get anything like the ''money multiplier'' from undergraduate monetary economics that would allow one to take seriously the relationship between M0 and M1. Some of the models discussed in Williamson and Wright (2010a,b) or Nosal and Rocheteau (2010) are better in this regard, but still provide nothing like a definitive banking setup that can be inserted seamlessly into the environment here. The model in He et al. (2005) actually does generate an explicit ''money multiplier'' very much like the one in undergraduate economics, but only for second-generation search models of monetary exchange. Second-generation models assume for technical reasons that assets, including currency, are indivisible, making them ill suited for quantitative analyses like the one in this paper. It is not hard to see why a model with indivisible assets might generate a role for inside (bank) money—simply put, there may not be enough outside money to go around—and to see why this can lead to a ''money multiplier''. This is more difficult to capture formally when money is divisible. Evidently, much more work remains to be done in order to address issues related to financial intermediation in these kinds of models.

Having said that, in the quantitative work, we do not necessarily want to take M to be currency per se. Results for several measures of money, including M0, M1, and so on, are presented below and the reader can pick and choose as desired. But M1—actually, M1S, the so-called sweep-adjusted version of the series—is perhaps most relevant, for several reasons. First, M1 is the measure used by most of the previous studies cited above, and so its use here facilitates comparisons. Second, when the fraction of DM trades where credit is available is calibrated using micro data, monetary trade is interpreted to include all transactions that use cash, check and debit card, but not credit card, purchases. This is based on two criteria: (a) checks and debit cards can be thought of (simplistically?)
as convenient ways to access deposits,
which like cash have the property that they are very liquid and pay 0 or close to 0 interest; and (b) the relevant feature of credit cards, like credit in general, is that they allow you to consume now and work later, while with either cash or demand deposits you have to raise the funds before you spend them (as discussed by Dong, 2008). There is a tension here, and indeed there is a tension whenever one tries to implement monetary theory empirically, irrespective of the extent to which the theory has any claim to microfoundations. To quote Lucas (2000, p. 270):

Another set of questions about the time series estimates concerns the fact that M1—the measure of money that I have used—is a sum of currency holdings that do not pay interest and demand deposits that (in some circumstances) do. Moreover, other interest bearing assets beside these may serve as means of payment. One response to these observations is to formulate a model of the banking system in which currency, reserves, and deposits play distinct roles. Such a model seems essential if one wants to consider policies like reserve requirements, interest on deposits, and other measures that affect different components of the money stock differently. A second response to the arbitrariness of M1 … is to replace M1 with an aggregate in which different monetary assets are given different weights. The basic idea, as proposed in Barnett (1978, 1980), and Poterba and Rotemberg (1987), is that if a treasury bill yielding 6% is assumed to yield no monetary services, then a bank deposit yielding 3% can be thought of as yielding half the monetary services of a zero-interest currency holding of equal dollar value. Implementing this idea avoids the awkward necessity of classifying financial assets as either entirely money or not monetary at all, and lets the data do most of the work in deciding how monetary aggregates should be revised over time as interest rates change and new instruments are introduced.

Having understood this, Lucas still uses M1, and suggests that it may not be too bad an approximation for the issues at hand and the period under consideration (although perhaps not for all issues or all periods). The use of the sweep-adjusted data M1S in this paper is somewhat parallel to the second approach he mentions, but this does not solve all of the issues in terms of measurement.11 And we very much agree with the first idea, to model banking and payments more seriously, although obviously this is far from a trivial exercise in terms of theory.

11 By way of example, Lucas in the quotation takes it for granted that ''a treasury bill yielding 6% is assumed to yield no monetary services'', while recent work by Krishnamurthy and Vissing-Jorgensen (2009) puts these self-same T Bills into households' utility functions in order to capture in a reduced-form way their ''convenience yields''. If it is meant to stand in for anything at all, presumably it stands for an asset's liquidity or monetary services.

3. Quantitative analysis

We now turn to describing the quantitative methodology. Since the model is essentially a two-sector model, a careful accounting of aggregate variables is necessary, as presented in the next subsection, followed by a description of the calibration strategy.

3.1. Preliminaries

The price levels in the CM and DM are p and $\tilde p = M/q$, respectively, where p satisfies
$$p = \frac{AM}{(1-\tau_h)g(q,K)F_H(K,H)} \tag{45}$$

in the bargaining version of the model, by (21), and

$$p = \frac{AM}{(1-\tau_h)qc_q(q,K)F_H(K,H)} \tag{46}$$
in the price-taking version, by (40). Nominal output is pF(K,H) in the CM, and $\sigma\omega M+\sigma(1-\omega)\ell$ in the DM. Using p as the unit of account, real output in the CM is $Y_C = F(K,H)$ and in the DM is $Y_D = \sigma\omega M/p+\sigma(1-\omega)\ell/p$. Total real output is $Y = Y_C+Y_D$. Define the share of output produced in the DM by $s_D = Y_D/Y$, the share of output where money is essential by $s_M = Y_M/Y$ where $Y_M = \sigma\omega M/p$, and the share where loans are used by $s_\ell = Y_\ell/Y$ where $Y_\ell = \sigma(1-\omega)\ell/p$. These shares are not calibrated, but they are indirectly computed from other variables. To see how, note that velocity is $v = pY/M = \sigma\omega Y/Y_M$. Hence, $s_M = Y_M/Y = \sigma\omega/v$. The maximum $\sigma$ can be is 1/2, and the maximum $\omega$ can be is 1, so given that M1 velocity is around 5, $s_M$ is bounded above by 10%. In fact, given the calibrated parameters, it is actually even smaller. There are two points to emphasize. First, to think about the size of the different sectors, one does not have to take a stand on which goods are traded in each. Second, the results presented below do not depend on having an excessive amount of monetary trade—at least 90% of economic activity looks just like what one sees in nonmonetary models.

The markup $\mu$ is an aggregate of the markups (price over marginal cost) in the two markets. The markup in the CM is 0, since it is competitive. The markup in the DM under price taking is also 0. With bargaining, the markup in the DM is derived as follows. First consider monetary trades. Marginal cost in terms of utility is $c_q(q,K)$.
Since a dollar is worth $A/[pw(1-\tau_h)]$ utils, marginal cost in dollars is $c_q(q,K)pw(1-\tau_h)/A$. Since the price is M/q, the markup in a monetary trade is given by

$$1+\mu_M = \frac{M/q}{c_q(q,K)pw(1-\tau_h)/A} = \frac{g(q,K)}{qc_q(q,K)}, \tag{47}$$

after eliminating M using (45). Similarly, the markup in a credit trade in the DM is

$$1+\mu_\ell = \frac{\ell/\hat q}{c_q(\hat q,K)pw(1-\tau_h)/A} = \frac{(1-\theta)u(\hat q)+\theta c(\hat q,K)}{\hat q c_q(\hat q,K)}. \tag{48}$$
The average markup in the DM is then $\mu_D = \omega\mu_M+(1-\omega)\mu_\ell$, while the average markup for the whole economy is $\mu = s_D\mu_D$.

3.2. Calibration

Consider the following functional forms for preferences and technology: in the CM, $U(x) = B[x^{1-\varepsilon}-1]/(1-\varepsilon)$ and $F(K,H) = K^{\alpha}H^{1-\alpha}$; in the DM, $u(q) = C[(q+b)^{1-\eta}-b^{1-\eta}]/(1-\eta)$ and $c(q,k) = q^{\psi}k^{1-\psi}$. The cost function $c(\cdot)$ comes from the technology $q = e^{1/\psi}k^{1-1/\psi}$; if $\psi = 1$ then the model dichotomizes. The parameter b in u(q) is introduced merely so that u(0) = 0, which is useful for technical reasons, and is set to b = 0.0001. This means relative risk aversion is not constant, but if $b \approx 0$, it is approximately constant at $\eta q/(q+b) \approx \eta$. Risk aversion parameters are set to $\varepsilon = \eta = 1$ as a benchmark, to facilitate comparison with the literature, and also because one can show these choices are consistent with a balanced growth path in an extended version of the model with long run technical change (see Waller, in press). In any case, the results are robust to these choices, as discussed below. C is normalized at C = 1, with no loss of generality.

In terms of calibrating the remaining parameters, we begin with a heuristic description, and then provide details. It is useful to point out that the approach here is a natural extension of standard methods. To pick a typical application, Christiano and Eichenbaum (1992) study the one-sector growth model, parameterized by $U = \log(x)+A(1-h)$ and $Y = K^{\alpha}h^{1-\alpha}$ for their indivisible-labor version; for their divisible-labor version, replace $A(1-h)$ by $A\log(1-h)$. One calibrates the parameters as follows. Set the discount factor $\beta = 1/(1+r)$, where r is some observed average interest rate. Then set depreciation $\delta = I/K$ to match the investment–capital ratio. Then set $\alpha$ to match either labor's share of income LS or the capital–output ratio K/Y, since these yield the same result given there are no taxes (see below). Finally, set A to match observed average hours worked h.

This method can be adapted to many scenarios. For example, Greenwood et al. (1995) calibrate a two-sector model, with home production, as follows. Consider $U = \log(x)+A(1-h_m-h_n)$, $Y_m = K_m^{\alpha_m}h_m^{1-\alpha_m}$ and $Y_n = K_n^{\alpha_n}h_n^{1-\alpha_n}$, where $x = [Dx_m^{\kappa}+(1-D)x_n^{\kappa}]^{1/\kappa}$, and $x_m$, $h_m$ and $k_m$ are consumption, hours and capital in the market, while $x_n$, $h_n$ and $k_n$ are consumption, hours and capital in the nonmarket or home sector. The two-sector version of the standard method is this: again set $\beta = 1/(1+r)$; set $\delta_m$ and $\delta_n$ to match $I_m/K_m$ and $I_n/K_n$; set $\alpha_m$ and $\alpha_n$ to match $K_m/Y_m$ and $K_n/Y_m$; and set A and D to match $h_m$ and $h_n$. This leaves $\kappa$, which is hard to pin down based on steady state observations, and is therefore typically set based on direct estimates of the relevant elasticities.

Since the model in this paper is also a two-sector model, a variant of the home-production method is appropriate. Thus, first set $\beta$, $\delta$ and A as above. Then set $\alpha$ and $\psi$ to match both K/Y and LS. In the standard one-sector model, without taxes, it does not matter if one calibrates $\alpha$ to LS or K/Y, but with taxes calibrating $\alpha$ to LS yields a value for K/Y that is too low (Greenwood et al., 1995; Gomme and Rupert, 2007). The idea here is to set $\alpha$ to match LS, then try to use $\psi$ to match K/Y, since DM production provides an extra kick to the return on K. Given this, the utility parameter B and the probabilities $\sigma$ and $\omega$ are set to match some money demand observations, as discussed below, which is the analog of picking $\kappa$ in the home production framework, and is similar to what is done in any calibrated monetary model.
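The 'simple' steps of this procedure can be summarized in a few lines; the following sketch uses the Table 1 target values, with helper names that are ours.

```python
# A stylized sketch of the directly pinned-down calibration steps,
# using target values from Table 1; variable names are ours.
r_real   = 0.028          # after-tax real interest rate
I_over_K = 0.070          # investment-capital ratio
LS       = 0.707          # labor share of income

beta  = 1.0 / (1.0 + r_real)   # discounting: beta = 1/(1+r)
delta = I_over_K               # steady state: depreciation matches I/K
alpha = 1.0 - LS               # Cobb-Douglas: labor share = 1 - alpha

print(f"beta = {beta:.3f}, delta = {delta:.3f}, alpha = {alpha:.3f}")
# -> beta = 0.973, delta = 0.070, alpha = 0.293, as in Table 1, panel (a).
# The remaining parameters (B, psi, sigma, and theta under bargaining)
# are chosen jointly to match H, velocity, K/Y, the money demand
# semi-elasticity and the DM markup, which requires solving the model.
```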
This completes the heuristic description. An online appendix describes in more detail the data used to obtain the calibration targets; a summary is provided here. The benchmark model is annual but, as discussed below, the results are basically the same for quarterly and monthly calibrations (which is a big advantage over the typical cash-in-advance model, as mentioned in the introduction). The benchmark calibration period is 1959–2004, and some alternatives are considered below. Model-consistent measures from the data are used where available. For example, the definition of GDP excludes consumption expenditures on durables and net exports, since these are not explicitly modeled. Also, the measure of money is the sweep-adjusted measure of M1, which has some distinct advantages, as discussed in Cynamon et al. (2006).

Table 1 lists the calibration targets and parameters. Some parameters can be directly pinned down: β = 1/(1+r) with r = 0.028; τ_h = 0.251 and τ_k = 0.533; τ_x = 0.069; G/Y = 0.241; δ = I/K = 0.070; α = 0.293 to get LS = 0.707.^12 In order to pin down the fraction ω of DM trades where credit is not available, two sources are used. First, Klee (2008) finds that shoppers use credit cards in 12% of total transactions in the supermarket scanner data. The remaining transactions use cash, checks and debit cards, which, recall from the digression on banking, fit with the notion of money in the model. The DM does not literally correspond to supermarket shopping, but since this is the best available data, it is nevertheless informative. Second, using earlier consumer survey data, Cooley and

12 r is the annual after-tax real interest rate based on an average pre-tax nominal rate on Aaa-rated corporate bonds of 7.8%, an inflation rate from the GDP deflator of 3.7%, and a tax on real bond returns of 30% from the NBER TAXSIM model. As is standard, a bond market is not included in the definition of equilibrium, but bonds can be priced in the usual way. τ_h and τ_k are the average effective marginal tax rates in McGrattan et al. (1997) (Gomme and Rupert, 2007 report similar numbers); τ_x is the average of excise plus sales tax revenue divided by consumption; LS is obtained using the method in Prescott (1986).
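As a quick consistency check on footnote 12 (assuming, as the footnote's wording suggests, that the 30% tax applies to the real return):

$$ r \approx \left(\frac{1.078}{1.037}-1\right)(1-0.30) \approx 0.0395 \times 0.70 \approx 0.028, $$

which matches the value used to set β = 1/(1+r) = 0.973.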
Table 1
Benchmark calibration.

(a) 'Simple' parameters
Parameter   Value     Target
b           0.0001    (preset; ensures u(0) = 0)
ε = η       1         (benchmark choice)
β           0.973     r = 0.028
τ_h         0.251     data
τ_k         0.533     data
τ_x         0.069     data
G/Y         0.241     data
δ           0.070     I/K = 0.070
α           0.293     LS = 0.707
ω           0.85      credit used in 15% of DM trades

(b) Remaining parameters
Parameter   Target    Target value
A           H         0.33
B           v         5.381
c           K/Y       2.337
σ           ξ         0.064
θ           μ_D       0.3

Notes: Panel (a) shows the values and the calibration targets for the parameters that can be directly calibrated. Panel (b) shows the parameters that are jointly calibrated using the calibration targets shown.
Hansen (1991) come up with a similar measure of around 16%. Thus ω is calibrated using 1−ω = 0.15 (i.e., ω = 0.85), which seems to be a good compromise, but it turns out that over a reasonable range ω does not matter much.

This determines all the parameters in panel (a) of Table 1. The remaining ones, in panel (b), are: A and B from utility, the cost parameter c, the probability of being a buyer σ, and, in the bargaining model, θ. These parameters are determined simultaneously to match the following targets. First, the standard measure of work as a fraction of discretionary time, H = 1/3. Second, average velocity, v = 5.381. Third, K/Y = 2.337. Fourth, a money demand semi-elasticity of ξ = 0.064.^13 Fifth, in models with bargaining, a DM markup of 0.3 is targeted.^14 These parameters are calibrated simultaneously to minimize the squared percentage distance between the targets in the data and in the model, with equal weights on each target. All of the targets except the money demand semi-elasticity can be obtained directly using straightforward formulas. The semi-elasticity is computed using the change in money demand when the interest rate changes from i+0.5 to i−0.5, where i is the benchmark value (the 0.5 is in percentage points). This concludes the baseline calibration strategy. In Section 4.3, two variations on this baseline are discussed; while the details differ, the results are very similar.
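A minimal sketch of this joint calibration step, in Python. The function steady_state below is a placeholder for the model solution (not provided in the paper's text) and i_bench stands for the benchmark nominal rate, so this only illustrates the loss function and the finite-difference semi-elasticity described above, not the paper's actual code.

import numpy as np
from scipy.optimize import minimize

targets = {"H": 1/3, "v": 5.381, "KY": 2.337, "xi": 0.064, "muD": 0.30}

def semi_elasticity(params, i):
    # Finite difference used in the text: log change in money demand when
    # the nominal rate moves from i + 0.5 to i - 0.5 percentage points.
    m_hi = steady_state(params, i + 0.005)["money_demand"]  # placeholder solver
    m_lo = steady_state(params, i - 0.005)["money_demand"]
    return np.log(m_lo) - np.log(m_hi)   # per percentage point

def loss(x, i_bench):
    params = dict(zip(["A", "B", "c", "sigma", "theta"], x))
    mom = steady_state(params, i_bench)   # placeholder solver
    model = {"H": mom["H"], "v": mom["v"], "KY": mom["KY"],
             "xi": semi_elasticity(params, i_bench), "muD": mom["muD"]}
    # equal-weighted squared percentage distance between model and data
    return sum(((model[k] - targets[k]) / targets[k]) ** 2 for k in targets)

# x0 = initial guess for (A, B, c, sigma, theta)
# result = minimize(loss, x0, args=(i_bench,), method="Nelder-Mead")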
3.3. Decision rules

All nominal variables are scaled by M, so that m̂ = m/M, p̂ = p/M, etc. Then the individual state becomes (m̂,k,K), where in equilibrium m̂ = 1 and k = K. Although the above presentation was more general, the recursive equilibrium is given by time-invariant decision rules [q(K), K_{+1}(K), H(K), X(K)] and value functions [W(K), V(K)] solving the relevant equations. These equations are solved numerically using a nonlinear global approximation, which is important for accurate welfare computations.

4. Results

This section presents the quantitative results obtained using the methodology outlined above.

4.1. Calibration results

In Table 2, one column lists the relevant moments in the data, while the others list moments from three specifications of the model. Model 1 uses bargaining in the DM with bargaining power θ = 1, giving up on the DM markup μ_D as a target; it is presented mainly as a benchmark since, as proved earlier, when θ = 1 money cannot affect the CM variables at all. Model 2 uses bargaining with θ calibrated along with the other parameters. Model 3 uses price taking in the DM, so there is no θ, and calibrates the rest of the parameters to match the targets other than the markup.

The targets are matched with two exceptions. First, the DM markup μ_D can be matched only under bargaining when θ is calibrated, rather than fixed at 1 or replaced by price taking, for obvious reasons. Second, K/Y can be matched only in the price-taking model, for reasons that we now explain. Intuitively, the calibration sets the CM technology parameter α to match LS and then tries to hit K/Y using the technology parameter c (although this way of looking at things is instructive, it is meant only to be suggestive, since in fact all parameters are calibrated simultaneously). When c = 1, K is not used in the DM, and K/Y is too low, as in the standard model once taxes are introduced. As c increases above 1, the return on K from its use in the DM increases and hence so does K/Y. But, in practice, with bargaining, this effect is tiny because the holdup problem eats up most of the DM return on K. Of course, this depends on bargaining power, but even if θ is picked to maximize K/Y, this still is not enough. Intuitively, if θ is big then buyers have all the bargaining power, which makes q big, other things being equal, but gives little return from DM trade to sellers; and if θ is small then sellers have all the bargaining power, which gives them a big share of the return, but only on a very small q.

13 As explained in the Appendix, the money demand relationship is estimated using the cointegration methods in Stock and Watson (1993). The resulting semi-elasticity estimate of 0.064 is perfectly in line with other estimates: Stock and Watson (1993) get semi-elasticities between 0.05 and 0.10; Ball (2001) gets 0.05; and Lucas (2000) argues that 0.07 fits the data best.

14 Since this is somewhat novel, our markup target uses the evidence discussed by Faig and Jerez (2005) from the Annual Retail Trade Survey of retail establishments. At the low end, Warehouse Clubs and Superstores come in around 17%, Automotive Dealers 18%, and Gas Stations 21%. At the high end, Specialty Foods come in at 42%, Clothing and Footwear 43%, and Furniture 44%. The value used, μ_D = 0.3, is in the middle of the data, but the robustness discussion shows that this does not matter much.
Table 2
Calibration results.

                      Data      Model 1     Model 2        Model 3
                                (θ = 1)     (calibrate θ)  (price taking)
Calibrated parameters
σ                     –         0.10        0.10           0.10
B                     –         1.09        1.23           1.87
c                     –         1.73        1.52           1.65
A                     –         2.75        3.12           4.80
θ                     –         –           0.92           –

Calibration targets
μ_D                   30.00     −42.08 (n)  30.00          0.00 (n)
K/Y                   2.34      2.19        2.20           2.34
H                     0.33      0.33        0.33           0.33
v                     5.38      5.39        5.39           5.38
ξ                     0.06      0.06        0.06           0.06

Miscellaneous
s_D                   –         3.09        2.65           2.47
s_M                   –         1.56        1.64           1.63
μ                     –         −1.30       0.80           0.00
q/q̂                  –         0.72        0.44           0.71
z                     0.0059    0           0.0001         0.004
Squared error         –         0.0031      0.0034         0.0000

Note: Model 1 refers to the version with buyer-take-all bargaining (θ = 1), Model 2 to the version with generalized Nash bargaining, and Model 3 to the version with price taking. Squared error is the sum of the squared differences between the calibration targets and the model-implied values. Entries marked (n) are not targeted in the corresponding model and are not included in the computation of the squared error.
There is no way around it with bargaining. With price taking, the holdup problems vanish and c can be picked to match K/Y exactly.

All models deliver a DM share s_D of only around 3%. At first this may seem to imply that the DM is very small, and hence that it cannot have a significant impact on welfare. As the results below show, however, this is not true, for reasons explained below. Also, because s_D is relatively small, the aggregate markup for Model 2 is only around 1%.^15 This is lower than the numbers some macroeconomists use, but remember that the CM has no markups.^16 In any case, the robustness discussion shows that the results do not hinge much on μ_D. For example, an aggregate markup target of 10% yields similar results.

Finally, Table 2 also reports the semi-elasticity of investment (or capital) with respect to the interest rate (or inflation), denoted z. This can be used as a testable implication of the model, since it is not directly targeted. It is especially useful as such since the main focus in this paper is on the effect of inflation on capital accumulation. As discussed in the online appendix, the empirical counterpart of this is 0.006, which means that a one percentage point increase in inflation reduces investment by around 0.6%. Model 1 delivers exactly z = 0, since as a matter of theory this specification implies inflation has no impact on capital accumulation. Model 2 delivers z = 0.0001, due to the holdup problem explained above. Model 3 delivers z = 0.004, which is fairly close to the empirical counterpart. Section 4.3 also shows results where z is added to the list of calibration targets. To conclude, the model, especially the price-taking version, is very much in line with the data in terms of its implications for the effect of inflation on capital accumulation.

4.2. Policy experiments

In the experiments considered in this section, starting in a steady state, there is a once-and-for-all change in the growth rate of money τ, and the behavior of the economy is tracked over time. Since inflation in steady state equals τ, with a slight abuse of language the experiments are presented as a change in inflation, but note that inflation does not actually jump to the new steady state level in the short run (i.e., inflation may not equal τ during the transition).

4.2.1. 10% inflation to the Friedman rule

Table 3 contains results for a common experiment in the literature where τ changes from τ_1 = 0.1 to the FR, which is τ_2 = −0.027 for the baseline calibration. For now, any change in government revenue is made up using the lump-sum tax T, but other fiscal options are considered below. Table 3 presents ratios of equilibrium values of several variables at the two inflation rates.

15 For Model 1, i.e. when θ = 1, the markup is actually negative in Table 2, because take-it-or-leave-it offers by buyers mean price equals AC, which is below MC. Hence, just to get μ_D > 0, θ needs to be significantly below 1.

16 Aruoba and Schorfheide (forthcoming) introduce markups in the CM by incorporating monopolistic competition, calibrating to around 15% in each sector.
Table 3
Comparing 10% inflation and the Friedman rule.

                          Model 1   Model 2   Model 3
Allocation
q1/q2                     0.58      0.54      0.56
q̂1/q̂2                   1.00      1.00      0.98
K1/K2                     1.00      1.00      0.93
H1/H2                     1.00      1.00      1.00
X1/X2                     1.00      1.00      0.98
Y_C1/Y_C2                 1.00      1.00      0.98
Y1/Y2                     0.98      0.98      0.96

Welfare gains
Steady state              1.36      2.78      2.55
Transition                0.00      −0.02     −0.80
Net                       1.36      2.76      1.75
Net from τ = 0.1 to 0     1.20      2.03      1.31

Note: This table reports the differences in allocations and welfare between the models with 10% steady state inflation and the Friedman rule. A superscript 1 refers to the model with 10% inflation and 2 to the model under the Friedman rule. Model 1 refers to the version with buyer-take-all bargaining (θ = 1), Model 2 to the version with generalized Nash bargaining, and Model 3 to the version with price taking. Italics in the original denote numbers that are close to but not exactly equal to unity. The welfare results report the welfare gain of changing inflation from 10% to the Friedman rule, as a percentage of consumption.
The first thing to note is that q1/q2 is considerably less than 1. Looking across models, q is considerably lower in Model 2 than in the other specifications, reflecting the impact of the money holdup problem. Intuitively, inflation is a tax on DM activity, and these results show that this tax is quantitatively very important for q. In Model 1 this is the only effect, since θ = 1 implies monetary policy has no impact on the CM. In Model 2, monetary policy does affect the CM in principle, but the impact is tiny, as one should expect from the discussion in Section 4.1. Models 1 and 2 predict that going to the FR increases aggregate output Y by 2%, essentially all due to the change in q. In Model 3 the effects are very different. Now K changes, and by a lot, around 7%. This makes CM consumption X change by about 2%, and the net impact on Y is 4%.

Now consider welfare. As is standard, welfare is measured by the required percentage increase in the agents' consumption under the high-τ regime that makes the agents indifferent between the two τ regimes. The table shows the answer comparing across steady states (jumping instantly from τ_1 and K^1 to τ_2 and K^2), as well as the cost of the transition from K^1 to K^2 and the net gain to changing τ starting at K^1. This net gain is the true benefit of the policy change, although the steady state comparison is also interesting (it shows how much an agent facing τ_1 and K^1 would pay to trade places with someone facing τ_2 and K^2).
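Stated compactly: writing lifetime utility as 𝒲(x, q; τ), a label used only for this illustration (it is not the CM value function W(K)), the steady-state measure is the Δ that solves

$$ \mathcal{W}\big((1+\Delta)x^{1},\,(1+\Delta)q^{1};\,\tau_{1}\big) \;=\; \mathcal{W}\big(x^{2},\,q^{2};\,\tau_{2}\big), $$

while the net measure replaces the right-hand side with lifetime utility along the equilibrium path that starts at K^1 when the policy changes to τ_2.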
In Model 1 there is no transition, since τ does not affect K; in Model 2 it is unimportant, since τ does not affect K much; but in Model 3 the transition is significant. The table also reports the net gain to reducing τ to 0, instead of all the way to the FR, to check how much of the gain comes from eliminating inflation and how much comes from deflation (most comes from the former).

In Model 1, with θ = 1, going from 10% inflation to the FR is worth 1.4% of consumption. This is larger than the findings from the reduced-form models discussed in the Introduction, but not a lot larger. In Model 2, with θ ≈ 0.9, this same policy is worth just under 3% of consumption. Intuitively, at θ ≈ 0.9 the money holdup problem makes q very low, so any additional reduction is very costly. In Model 3 the steady state gain is close to the one in Model 2. Inflation has a sizable impact on K and X in Model 3, but since much of the gain accrues in the long run, and agents work more and consume less during the transition, the net gain is only 1.8%.

Fig. 1 shows the transitions for Models 2 and 3. In Model 3, for example, in the short run H increases over 2% and X falls slightly before settling down to the new steady state, while DM output jumps on impact over 50% and quickly settles down. The difference between the two panels of Fig. 1 is the size of the adjustment in the CM variables: with bargaining, K changes only about 0.2% in the long run, while with price taking K changes over 6%.

The size of the DM is roughly 3%, yet, as Table 3 indicates, the welfare results are sizable. One may think that since DM activity is a small part of the economy, changes in inflation cannot have a large impact on welfare, but this logic is flawed. As hinted above, the matching parameter σ is key to determining the size of the DM, and it is almost directly pinned down by the semi-elasticity of money demand.^17 As Bailey (1956) and many others since have emphasized, the slope of the money demand curve is key for calculating the welfare cost of inflation, and this semi-elasticity is intimately related to that slope. As a result, since the model displays a realistic semi-elasticity, when inflation changes from 10% to the FR, real money balances rise by about 45%, leading to a similar-sized change in the quantity of goods produced in the DM. Even with the relatively small σ (and the size of the DM), this yields a welfare gain of 2–3%.
17 In a log-linearized environment, Aruoba and Schorfheide (2011) show that the interest semi-elasticity of money demand is approximately equal to (1+i)/(i+σ).
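As an illustrative check of this approximation (not a calculation from the paper): with the calibrated σ = 0.10 and a benchmark nominal rate in the neighborhood of 7%,

$$ \frac{1+i}{i+\sigma} \approx \frac{1.07}{0.07+0.10} \approx 6.3, $$

or about 0.063 per percentage point, essentially the money demand target ξ = 0.064.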
[Fig. 1 appears here in the original. Panels (x-axis: Periods), for Model 2 and Model 3: Path of Capital, Path of Hours, Path of DM Output/Consumption, Path of CM Output, Path of CM Consumption, Path of GDP.]

Fig. 1. 10% to FR: transitions. Note: Each panel of this figure shows the path of key variables during the transition after policy changes from 10% inflation to the Friedman rule.
Table 4
Welfare: Friedman rule versus first best.

                              Model 1   Model 2   Model 3
Benchmark calibration
Steady state                  23.06     22.67     17.46
Transition                    −10.87    −10.30    −8.33
Net                           12.18     12.37     9.13

No taxes (recalibrated)
Steady state                  1.52      1.78      0.00
Transition                    −1.39     −0.99     0.00
Net                           0.13      0.79      0.00

No taxes (not recalibrated)
Steady state                  1.85      2.16      0.00
Transition                    −1.65     −1.20     0.00
Net                           0.20      0.96      0.00

Note: This table reports the welfare gain of going from the equilibrium under the Friedman rule to the first best under different assumptions about calibration and taxes, as a percentage of consumption. Model 1 refers to the version with buyer-take-all bargaining (θ = 1), Model 2 to the version with generalized Nash bargaining, and Model 3 to the version with price taking.
4.2.2. Friedman rule to the first best

Table 4 reports the welfare gain of going from the FR to the FB under three different assumptions: the benchmark calibration; a version with no distorting taxes (i.e., τ_h = τ_k = τ_x = 0) and parameters recalibrated; and a version with no distorting taxes but the original parameter values. The differences in the first panel are big, mainly due to taxation (McGrattan et al., 1997 find similar results in standard nonmonetary models). Once taxes are shut down, Model 3 shows a gain of 0 because the FR implements the FB. In Model 1, with capital holdup but no money holdup, the steady state gain is around 2%, although much is lost in transition. In Model 2, with both holdup problems, the steady state gain is around 2% and about half remains after incorporating the transition. These calculations provide measures of the impact of the holdup problems: based on the steady state comparisons from the third panel, say, 1.85% of consumption is the cost of capital holdup and an additional 0.30% is the cost of money holdup. Although there is no single 'correct' way to decompose these effects, this suggests holdup can be quantitatively important, even though bargaining occurs only in the DM and the DM is small.

4.2.3. Using proportional taxes to make up lost revenue

One can also consider lowering τ and making up the revenue with proportional taxes. The first panel of Table 5 reports results when lump-sum taxes are used to make up revenue, reproducing Table 3. The second and third panels use labor and consumption taxes, respectively.^18 In the data, the monetary base is on average 32.9% of M1, and as such government seigniorage revenue is a third of τ times the change in M. Going to the FR and making up the revenue with labor taxes requires raising τ_h from 25.1% to between 26.2% and 26.9%.

Now there are two effects of this policy change. On the one hand, there is the channel from before: the reduction in inflation increases q and K (and therefore all other CM variables), which leads to a welfare gain that is partially offset by the extra work and reduced consumption needed to accumulate the extra capital. On the other hand, the increase in the labor income tax reduces K and all related variables. The effect of the first channel on welfare can be read from the first panel, which reproduces Table 3. The second channel reduces welfare in the long run, and thus all steady state numbers in the second and third panels are lower than those in the first panel. Turning to the transition, in Models 1 and 2 the net effect of the change in policy on K is a decrease, since the change in inflation barely increases K due to the holdup problems while the increase in taxes reduces K significantly. In Model 3, the effect of inflation on K is large enough that the net effect is still an increase. Thus on the transition path there is a welfare gain for Models 1 and 2 and a loss for Model 3. On net, the overall impact of lower inflation, however it is financed, is positive in all models.

The last rows of each panel report results for the extreme assumption that the government is able to collect seigniorage revenue on all of M1 (as in Cooley and Hansen, 1991). This makes the lost revenue of the government much larger than before, forcing the labor income tax to increase to around 31% or the consumption tax to increase to around 13%. As a result, the welfare losses due to these tax increases are sufficiently large to offset the gains from lower inflation in all models when the labor income tax is used.
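A back-of-envelope calculation shows the revenue at stake (treating steady-state seigniorage as τ times the relevant money stock, an approximation not spelled out in the text, and using the benchmark velocity v = 5.381):

$$ \frac{\text{seigniorage}}{Y} \approx 0.329 \times \tau \times \frac{M1}{PY} = \frac{0.329 \times 0.10}{5.381} \approx 0.006, $$

about 0.6% of GDP at 10% inflation, rising to roughly 1.9% of GDP under the full-seigniorage assumption (replace 0.329 with 1), which is why the required tax increases are so much larger in that case.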
This result makes it clear that how the lost seigniorage revenue is financed makes a difference for the welfare results.^19

18 The case where the lost revenue is made up with capital taxes cannot be solved, since increasing τ_k lowers K by so much that sufficient revenue is not forthcoming.

19 Note that the analysis here is not about optimal monetary policy when fiscal policy is also set to maximize utility, since the existing tax rates are taken as given from the data; see Aruoba and Chugh (2010).
Table 5
10% inflation versus the Friedman rule: alternative fiscal policies.

                                     Model 1   Model 2   Model 3
Making up revenue by T
Steady state gain                    1.36      2.78      2.55
Transition                           –         −0.02     −0.80
Net gain                             1.39      2.75      1.75
Net gain (full seigniorage)          1.36      2.75      1.75

Making up revenue by τ_h (old τ_h = 0.251)
New τ_h                              0.269     0.269     0.262
Steady state gain                    0.33      1.76      1.93
Transition                           0.19      0.17      −0.67
Net gain                             0.52      1.92      1.26
Net gain (full seigniorage)          −1.49     −0.10     −0.58

Making up revenue by τ_x (old τ_x = 0.069)
New τ_x                              0.088     0.088     0.081
Steady state gain                    0.62      2.04      2.12
Transition                           0.14      0.11      −0.72
Net gain                             0.75      2.15      1.40
Net gain (full seigniorage)          0.55      0.84      0.20

Note: This table reports the results of the policy experiment of reducing inflation from 10% to the Friedman rule under various fiscal arrangements (all assuming that seigniorage revenue is one third of the change in the money supply): making up the lost revenue by increasing the lump-sum tax, the labor income tax, or the consumption tax. The rows labelled 'Net gain (full seigniorage)' show the results when all of the change in the money supply can be considered seigniorage revenue of the government. Model 1 refers to the version with buyer-take-all bargaining (θ = 1), Model 2 to the version with generalized Nash bargaining, and Model 3 to the version with price taking.
4.3. Robustness

We redid all the calculations for many alternative specifications; in the interest of space, Table 6 reports the results in terms of one statistic: the net welfare gain of going from 10% inflation to the FR. Detailed results for each specification are available upon request. The first row is the benchmark model.

The first robustness check involves shutting down the distorting taxes, both for the case where the other parameters are kept at benchmark values and where they are recalibrated. Most of the results are similar to the benchmark calibration, although the cost of inflation is somewhat lower, especially under price taking. This is because the FR achieves the FB under price taking without distortionary taxes, and hence the cost of moderate inflation is low, by the envelope theorem. It is no surprise that some results depend on what one assumes about taxation, and since taxes are a fact of life, the benchmark calibration should be trusted.^20

Next are the preference parameters b, ε and η. Clearly, the results are not overly sensitive, although lowering η generally does increase the cost of inflation somewhat. One can also vary β, δ, etc. over reasonable ranges without affecting things too much (not reported). Similarly, changing the target for the DM markup does not change the results much. When the aggregate markup μ is targeted instead of the DM markup, however, the welfare cost increases substantially, since matching this markup requires a very different θ, and this decrease in θ increases the money holdup problem.

The table also shows that the results are not very sensitive to using the so-called Great Moderation period (1985–2004) for the calibration, and not at all sensitive to assuming a different length for a period (quarterly, monthly and annual models deliver very similar predictions). This is easy to understand: to go from an annual to a quarterly or monthly model, inflation, velocity, interest rates, K/Y and I/K are simply adjusted by the relevant factor, as sketched below. The calibrated σ declines, because a shorter period reduces the probability of consuming in any given DM, but the welfare conclusions do not change. This is important because changing the frequency typically does change the results in some models, including standard cash-in-advance models, where agents spend all their money every period.

Perhaps surprisingly, the results are robust to changes in the payment parameter ω within a wide range. Even when only 25% of DM trades require cash, the welfare costs are similar. To understand this, first note that it is certainly true that a reduction in ω reduces the cost of inflation when the other parameters are fixed. But when the parameters are recalibrated as ω changes, in order to match the calibration targets, σ increases and B falls. On net, this renders DM activity just about as important for welfare as before. Obviously, ω = 0 means money is not valued and hence inflation is irrelevant, but if ω = 0 then the calibration targets cannot be matched. For values of ω in a reasonable range, as long as the same targets are matched, the net effects are very similar.

What does matter is the empirical measure of money: we repeat the calibration using the currency component of M1, and using M2. These alternatives imply different values for average velocity and the interest elasticity, and given the calibration method,

20 Throughout the table, Model 3 provides the clearest picture of how a certain change affects the results since, as in the benchmark calibration, all calibration targets can be matched exactly.
In Models 1 and 2, due to the trade-off between competing targets and because K/Y cannot be matched exactly, the results are more sensitive.
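The frequency adjustment mentioned above can be sketched in a few lines. This assumes rates are compounded and that stock-flow ratios scale with the period length, a natural reading of "adjusted by the relevant factor" rather than the paper's exact procedure:

def convert_targets(annual, n):
    # n = sub-periods per year (4 for quarterly, 12 for monthly)
    return {
        "r": (1 + annual["r"]) ** (1 / n) - 1,                  # compounded rate
        "inflation": (1 + annual["inflation"]) ** (1 / n) - 1,  # same for inflation
        "v": annual["v"] / n,        # velocity is per period
        "K/Y": annual["K/Y"] * n,    # Y is a per-period flow
        "I/K": annual["I/K"] / n,    # so is I
    }

annual = {"r": 0.028, "inflation": 0.037, "v": 5.381, "K/Y": 2.337, "I/K": 0.070}
print(convert_targets(annual, 4))   # quarterly counterparts of the annual targets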
Table 6
Robustness.

                                               Model 1   Model 2   Model 3
Benchmark                                      1.36      2.76      1.75

Only lump-sum tax
Recalibrated                                   1.57      3.01      1.62
Not recalibrated                               1.36      2.75      0.87

CM (ε) and DM (η) risk aversion (benchmark ε = η = 1)
ε = 0.5, η = 0.5                               1.26      3.84      2.82
ε = 2, η = 2                                   1.57      2.31      1.14
ε = 1, η = 0.5                                 1.27      3.84      2.19
ε = 1, η = 2                                   1.36      2.85      1.35

Utility parameter b (benchmark b = 0.0001)
b = 0.00001                                    1.36      2.91      1.75
b = 0.001                                      1.36      2.61      1.75
b = 0.1                                        1.37      2.66      1.86

Markup target (benchmark μ_D = 30%)
μ_D = 10%                                      –         2.94      –
μ_D = 100%                                     –         2.87      –
μ = 10%                                        –         3.75      –

Measures of money (benchmark M1)
Currency                                       0.26      0.74      0.54
M2                                             0.91      1.23      1.26

Frequency (benchmark annual)
Quarterly                                      1.31      2.28      1.61
Monthly                                        1.28      1.93      1.59

Period (benchmark 1959–2004)
1985–2004                                      1.79      2.88      1.71

Payment parameter ω (benchmark ω = 0.85)
ω = 1                                          1.36      2.74      1.81
ω = 0.25                                       1.29      2.08      1.59

Alternative calibration strategies
#1: Add z                                      –         3.19      1.87
#2: c = 1/(1−α)                                1.39      2.74      1.68

Alternative model
Two-capital                                    –         4.81      1.33

Note: This table reports the net welfare gain of going from 10% inflation to the Friedman rule under various changes in the calibration strategy. Model 1 refers to the version with buyer-take-all bargaining (θ = 1), Model 2 to the version with generalized Nash bargaining, and Model 3 to the version with price taking.
this changes the cost of inflation. Intuitively, consider the traditional method of computing the cost of inflation as the area under the money demand curve. With a broader definition of M (i.e. lower velocity), the curve shifts up, which increases the estimated cost. At the same time, remember that the slope of this curve is linked to the interest elasticity of money demand. When currency is used for the calibration, the elasticity is slightly lower and velocity is much larger relative to M1, both of which reduce the welfare cost of inflation. For M2, even though velocity is lower, so is the interest elasticity, and the end result is again a decrease in the welfare cost. In any case, these results indicate that the measure of money does matter, as it should, and as it will in any monetary theory.

Two alternative calibration strategies are also considered. In Strategy 1, the investment semi-elasticity, z = 0.0059, is added to the list of calibration targets in Models 2 and 3.^21 As expected, Model 2 has a very hard time improving on the benchmark calibration due to the holdup problems, and z is only 0.0002. Model 3, on the other hand, is able to match this target exactly with very little sacrifice on the other targets; essentially, the only substantive change is a slight increase in K/Y. Since inflation creates a bit more reduction in investment, it is slightly more costly: 1.87 versus 1.75 in the benchmark calibration.

Strategy 2 reverts to the benchmark calibration but no longer uses α to target the labor share of income; instead, α is calibrated along with the other parameters, and c is restricted to be 1/(1−α). This is a natural restriction: it follows from the assumption that production in the DM and the CM uses the same technology. With this strategy, α adjusts to match K/Y exactly in all models (not just Model 3), and it does so with only a slightly larger α than the benchmark calibration: the calibrated values of α are 0.310, 0.311 and 0.297 for the three models. There are also small changes in the remaining parameters, but the bottom line is that the welfare results are virtually identical to the benchmark results.
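The restriction in Strategy 2 can be verified directly from the DM technology introduced in Section 3.2:

$$ c = \frac{1}{1-\alpha} \;\Longrightarrow\; 1-\frac{1}{c} = \alpha, \qquad\text{so}\qquad q = e^{1/c}k^{1-1/c} = e^{1-\alpha}k^{\alpha}, $$

which gives DM production the same capital exponent α as the CM technology F(K,H) = K^α H^{1−α}, with effort e playing the role of hours.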
21 As argued above, in Model 1 z = 0 by definition, independent of parameter values.
Finally, robustness with respect to some larger modeling choices is also considered. One can study a version of the model with two capital stocks, K_C and K_D (see the online appendix). This version can be calibrated exactly as the benchmark version, paying attention to the definition of capital, since now there are two types. The last row of Table 6 shows the welfare results for this version; by and large, the quantitative results of the benchmark model extend to it.

5. Conclusion

The main contribution of this paper is to show that one can integrate elements from models with explicit trading frictions into capital theory in a way that generates interesting effects of money on investment. One can also use standard methods to calibrate the model, even though it contains some parameters, like σ or θ, that are not in standard models. This strategy performs fairly well, doing a good job matching most targets, although with price taking the markup cannot be matched, and with bargaining K/Y cannot be matched very well. Backing out the size of the two sectors from observables, the DM accounts for around 3% of total output.

There are a number of policy implications, some similar to the previous literature but some not. Inflation is a tax on DM consumption q, and its impact is big. Qualitatively, given that K is useful for producing q, inflation reduces investment; quantitatively, this effect is tiny under bargaining but big (around 7%) under price taking. In terms of welfare, under price taking, reducing inflation from 10% to the FR is worth 2.5% of consumption across steady states, and 1.7% taking the transition into account; it is worth around 3% under bargaining. With either price taking or bargaining, much of the gain is achieved by reducing inflation to 0 rather than going all the way to the FR. Not surprisingly, the costs of fiscal distortions are big. The holdup problems for both money and investment are important. Most of these results are robust, but the empirical measure of M does matter. Finally, a key element of the framework is the explicit two-sector structure, although it does not matter much if the same or different capital stocks are used in the two sectors, or whether capital is traded in one sector or the other.

Perhaps the most surprising result is that the impact of inflation on output and investment can be so large. The model predicts that going from the FR to 10% inflation decreases output by up to 4%, and decreases investment by up to 7%, depending on the specification. How plausible are these findings? As discussed above, the model, especially the price-taking version, roughly matches the U.S. data, both as an independent testable implication and as a calibration target. Further work, perhaps using cross-country variation, is certainly warranted.^22 Related to the response of output, it is also true that the theory predicts an upward-sloping long-run Phillips curve, or a positive relation between inflation and unemployment (at least a negative correlation between inflation and employment, since there is no notion of unemployment per se in the model). This is as it should be: whatever one believes about the short-run Phillips curve, it is documented in Berentsen et al. (2011) and Haug and King (2009) that, after filtering out business cycle frequencies, the US data display a clear positive correlation between inflation and unemployment, and a negative correlation between inflation and employment.

22 As simple cross-country stylized facts, inflation has a significantly negative correlation with output and a weakly negative correlation with the investment/output ratio, which are qualitatively in line with the model.
Again, while more work is necessary on this, there is nothing in these data obviously inconsistent with the model in this paper. Our overall conclusion is that it is quantitatively relevant for capital formation to incorporate elements from the microfoundations literature, including bargaining, alternating centralized and decentralized markets, and stochastic trading opportunities. In terms of future work, it may be interesting to consider more general preferences, perhaps still quasi-linear but nonseparable between x and q. This would allow one to parameterize the substitutability between CM and DM goods more flexibly, and it breaks the dichotomy even if K is not used in the DM. In terms of other ideas, one really should try to take financial intermediation more seriously, or study optimal fiscal and monetary policy, or examine the business cycle properties of the model. All of this is left to other research.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2011.03.003.
References

Acemoglu, D., Shimer, R., 1999. Holdups and efficiency with search frictions. International Economic Review 40, 827–850.
Aiyagari, S.R., Wallace, N., 1991. Existence of steady states with positive consumption in the Kiyotaki–Wright model. Review of Economic Studies 58, 901–916.
Aliprantis, C., Camera, G., Puzzello, D., 2007. Anonymous markets and monetary trading. Journal of Monetary Economics 54, 1905–1928.
Aruoba, S.B., Chugh, S.K., 2010. Optimal fiscal and monetary policy when money is essential. Journal of Economic Theory 145, 1618–1647.
Aruoba, S.B., Schorfheide, F., 2011. Insights from an estimated search model of money with nominal rigidities. American Economic Journal: Macroeconomics 3, 60–90.
Aruoba, S.B., Wright, R., 2003. Search, money and capital: a neoclassical dichotomy. Journal of Money, Credit and Banking 35, 1086–1105.
Azariadis, C., 1993. Intertemporal Macroeconomics. Blackwell, Oxford.
Bailey, M.J., 1956. The welfare cost of inflationary finance. Journal of Political Economy 64, 93–110.
Ball, L., 2001. Another look at money demand. Journal of Monetary Economics 47, 31–44.
Barnett, W., 1978. The user cost of money. Economics Letters 1, 145–149.
Barnett, W., 1980. Economic monetary aggregates: an application of index number and aggregation theory. Journal of Econometrics 14, 11–48.
Berentsen, A., Camera, G., Waller, C.J., 2007. Money, credit and banking. Journal of Economic Theory 135, 171–195.
Berentsen, A., Menzio, G., Wright, R., 2011. Inflation and unemployment in the long run. American Economic Review 101, 371–398.
Caballero, R., 1999. Aggregate investment. In: Taylor, J., Woodford, M. (Eds.), Handbook of Macroeconomics, vol. 1B. North-Holland, Amsterdam.
Christiano, L., Eichenbaum, M., 1992. Current real business cycle theory and aggregate labor market fluctuations. American Economic Review 82, 430–450.
Chiu, J., Meh, C. Banking, liquidity and the market for ideas. Macroeconomic Dynamics, in press.
Chiu, J., Molico, M., 2010. Liquidity, redistribution, and the welfare cost of inflation. Journal of Monetary Economics 57, 428–438.
Coles, M., Wright, R., 1998. A dynamic equilibrium model of search, bargaining and money. Journal of Economic Theory 78, 32–54.
Cooley, T., Hansen, G., 1989. The inflation tax in a real business cycle model. American Economic Review 79, 733–748.
Cooley, T., Hansen, G., 1991. The welfare cost of moderate inflations. Journal of Money, Credit and Banking 23, 483–503.
Cynamon, B.Z., Dutkowsky, D.H., Jones, B.E., 2006. Redefining the monetary aggregates: a clean sweep. Eastern Economic Journal 32, 661–673.
Diamond, P., 1984. Money in search equilibrium. Econometrica 52, 1–20.
Dong, M., 2008. Money and costly credit. Mimeo, Bank of Canada.
Dotsey, M., Ireland, P., 1996. The welfare cost of inflation in general equilibrium. Journal of Monetary Economics 37, 29–47.
Ennis, H., 2001. On random matching, monetary equilibria, and sunspots. Macroeconomic Dynamics 5, 132–141.
Faig, M., Jerez, B., 2005. A theory of commerce. Journal of Economic Theory 122, 60–99.
Fischer, S., 1981. Towards an understanding of the costs of inflation: II. Carnegie-Rochester Conference Series on Public Policy 15, 5–42.
Gomme, P., 1993. Money and growth revisited: measuring the costs of inflation in an endogenous growth model. Journal of Monetary Economics 32, 51–77.
Gomme, P., Rupert, P., 2007. Theory, measurement and calibration of macroeconomic models. Journal of Monetary Economics 54, 460–497.
Greenwood, J., Rogerson, R., Wright, R., 1995. Household production in real business cycle theory. In: Cooley, T. (Ed.), Frontiers of Business Cycle Research. Princeton University Press, Princeton.
Haug, A., King, I., 2009. Inflation and unemployment in the long run: an econometric analysis. Mimeo.
He, P., Huang, L., Wright, R., 2005. Money and banking in search equilibrium. International Economic Review 46, 637–670.
He, P., Huang, L., Wright, R., 2008. Money, banking and inflation. Journal of Monetary Economics 55, 1013–1024.
Imrohoroglu, A., Prescott, E.C., 1991. Seigniorage as a tax. Journal of Money, Credit and Banking 23, 462–475.
Ireland, P., 1994. Money and growth: an alternative approach. American Economic Review 84, 47–65.
Kiyotaki, N., Moore, J., 2001. Liquidity, business cycles and monetary policy. Lecture 2, Clarendon Lectures.
Kiyotaki, N., Wright, R., 1989. On money as a medium of exchange. Journal of Political Economy 97, 927–954.
Kiyotaki, N., Wright, R., 1993. A search-theoretic approach to monetary economics. American Economic Review 83, 63–77.
Klee, E., 2008. How people pay: evidence from grocery store data. Journal of Monetary Economics 55, 526–541.
Kocherlakota, N., 1998. Money is memory. Journal of Economic Theory 81, 232–251.
Krishnamurthy, A., Vissing-Jorgensen, A., 2009. The aggregate demand for treasury debt. Mimeo.
Lagos, R., Wright, R., 2005. A unified framework for monetary theory and policy evaluation. Journal of Political Economy 113, 463–484.
Lagos, R., Rocheteau, G., 2008. Money and capital as competing media of exchange. Journal of Economic Theory 142, 247–258.
Lester, B., Postlewaite, A., Wright, R., 2010. Information, liquidity, asset prices and monetary policy. Mimeo.
Lucas, R., 1980. Equilibrium in a pure currency economy. In: Kareken, J.H., Wallace, N. (Eds.), Models of Monetary Economies. Federal Reserve Bank of Minneapolis, pp. 131–145.
Lucas, R., 1981. Discussion of: Stanley Fischer, towards an understanding of the costs of inflation: II. Carnegie-Rochester Conference Series on Public Policy 15, 43–52.
Lucas, R., 2000. Inflation and welfare. Econometrica 68, 247–274.
Lucas, R., Prescott, E.C., 1974. Equilibrium search and unemployment. Journal of Economic Theory 7, 188–209.
McGrattan, E., Rogerson, R., Wright, R., 1997. An equilibrium model of the business cycle with household production and fiscal policy. International Economic Review 38, 267–290.
Menner, M., 2006. A search-theoretic monetary business cycle with capital formation. Contributions to Macroeconomics 6 (11).
Molico, M., 2006. The distribution of money and prices in search equilibrium. International Economic Review 47, 701–722.
Molico, M., Zhang, Y., 2005. The distribution of money and capital. Mimeo, Bank of Canada.
Mortensen, D.T., Pissarides, C., 1994. Job creation and job destruction in the theory of unemployment. Review of Economic Studies 61, 397–415.
Mortensen, D.T., Wright, R., 2002. Competitive pricing and efficiency in search equilibrium. International Economic Review 43, 1–20.
Nosal, E., Rocheteau, G., 2010. Money, Payments, and Liquidity. MIT Press.
Osborne, M.J., Rubinstein, A., 1990. Bargaining and Markets. Academic Press.
Poterba, J.M., Rotemberg, J.J., 1987. Money in the utility function: an empirical implementation. In: Barnett, W.A., Singleton, K.J. (Eds.), New Approaches to Monetary Dynamics. Cambridge University Press.
Prescott, E.C., 1986. Theory ahead of business cycle measurement. Federal Reserve Bank of Minneapolis Quarterly Review, Fall, 9–22.
Rocheteau, G., Rupert, P., Shell, K., Wright, R., 2008. General equilibrium with nonconvexities and money. Journal of Economic Theory 142, 294–317.
Rocheteau, G., Wright, R., 2005. Money in search equilibrium, in competitive equilibrium and in competitive search equilibrium. Econometrica 73, 175–202.
Rogerson, R., 1988. Indivisible labor, lotteries and equilibrium. Journal of Monetary Economics 21, 3–16.
Sidrauski, M., 1967a. Rational choice and patterns of growth in a monetary economy. American Economic Review 57, 534–544.
Sidrauski, M., 1967b. Inflation and economic growth. Journal of Political Economy 75, 796–810.
Shi, S., 1995. Money and prices: a model of search and bargaining. Journal of Economic Theory 67, 467–496.
Shi, S., 1997. A divisible search model of fiat money. Econometrica 65, 75–102.
Shi, S., 1999. Search, inflation and capital accumulation. Journal of Monetary Economics 44, 81–104.
Shi, S., Wang, W., 2006. The variability of the velocity of money in a search model. Journal of Monetary Economics 53, 537–571.
Stock, J.H., Watson, M.W., 1993. A simple estimator of cointegrating vectors in higher order integrated systems. Econometrica 61, 783–820.
Stockman, A., 1981. Anticipated inflation and the capital stock in a cash-in-advance economy. Journal of Monetary Economics 8, 387–393.
Telyukova, I., Visschers, L., 2009. Precautionary demand for money in a monetary search business cycle model. Mimeo.
Tobin, J., 1965. Money and economic growth. Econometrica 33, 671–684.
Trejos, A., Wright, R., 1995. Search, bargaining, money and prices. Journal of Political Economy 103, 118–141.
Wallace, N., 2001. Whither monetary economics? International Economic Review 42, 847–869.
Waller, C.J. Random matching and money in the neoclassical growth model: some analytical results. Macroeconomic Dynamics, in press.
Williamson, S., Wright, R., 2010a. New Monetarist economics: methods. Federal Reserve Bank of St. Louis Review 92, 265–302.
Williamson, S., Wright, R., 2010b. New Monetarist economics: models. In: Friedman, B., Woodford, M. (Eds.), Handbook of Monetary Economics. Elsevier, pp. 25–96.
Journal of Monetary Economics 58 (2011) 117–131
Putty–clay technology and stock market volatility☆

François Gourio a,b

a Boston University, Department of Economics, 270 Bay State Road, Boston, MA 02215, USA
b National Bureau of Economic Research, USA

Article history: Received 5 August 2010; received in revised form 18 February 2011; accepted 20 February 2011; available online 2 March 2011.

Abstract: Firms' first-order conditions imply that stock returns equal investment returns from the production technology. Much applied work uses the adjustment cost technology, which implies that the realized return is high when the investment-capital ratio is high. This paper derives, for an arbitrary stochastic discount factor, the investment return implied by the putty–clay technology. The combination of capital heterogeneity and irreversibility creates a novel channel for return volatility. The investment return is high when the ratio of investment to gross job creation is low. Empirically, the putty–clay feature helps account for U.S. stock market data. © 2011 Elsevier B.V. All rights reserved.
1. Introduction

While much interest in macroeconomics has been devoted to asset pricing puzzles, most of this work is limited to endowment economies. Production economies present additional challenges for macroeconomics. In the standard stochastic growth model, the price of capital is constant, and equal to one, because at the margin consumption goods can be transformed into capital goods one-for-one. Hence, the value of a firm equals its stock of physical capital. Equivalently, the stock return equals the current net marginal product of capital (and does not depend on expectations of the marginal product of capital in the future). The model is at odds with the data, as it produces a stock return that is two orders of magnitude less volatile than the empirical stock return (Rouwenhorst, 1995).

The standard response to this empirical challenge is to introduce adjustment costs, which generate a wedge between the value of installed capital and the value of consumption goods. If adjustment costs are large enough, the empirically observed variation in investment implies large variations in the value of installed capital. However, the adjustment cost model has several shortcomings. First, the curvature of the adjustment cost function needs to be implausibly high to match the volatility of the stock market. Second, the correlation of investment and Tobin's Q is far from perfect in the data. Third, large adjustment costs, when embedded in a general equilibrium business cycle model, typically lead to counterfactual business cycle implications: investment is too smooth, and employment is countercyclical (Boldrin et al., 2001). Finally, it is unclear how to reconcile the high curvature of adjustment costs with microeconomic evidence (e.g., Cooper and Haltiwanger, 2006), which does not point towards large, smooth adjustment costs. Overall, Hall (2004) concludes that
☆ This paper is a revised version of a chapter of my 2005 University of Chicago dissertation. I am grateful to Lars Hansen for his supervision and to Fernando Alvarez, John Cochrane and Anil Kashyap for their advice. I also thank Jeff Campbell, Jonas Fisher, Simon Gilchrist, Boyan Jovanovic, Robert King, Pierre-Alexandre Noual, Monika Piazzesi, Adrien Verdelhan and Pierre-Olivier Weill, anonymous referees, as well as seminar and conference participants for their comments. Correspondence address: Boston University, Department of Economics, 270 Bay State Road, Boston, MA 02215, USA. Tel.: +1 617 353 4534; fax: +1 617 353 4449. E-mail address: [email protected]
0304-3932/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.jmoneco.2011.02.002
"rents arising from adjustment costs are relatively small and are not an important part of the explanation of the large movements of the values of corporations".

The contribution of this paper is to study, theoretically and empirically, the implications of the putty–clay technology for the valuation of the aggregate stock of capital. The putty–clay technology provides an appealing alternative to the adjustment cost technology, by recognizing that capital goods are heterogeneous in capital intensity (or, more generally, productivity). Moreover, capital intensity is chosen once and for all at the time of investment. For instance, an aircraft requires exactly two pilots, and a computer workstation is used by one person. Ex-ante, firms can choose a high or low capital intensity, but ex-post it is fixed: airlines do not substitute planes for pilots when the wage of pilots rises. Hence, the production function of each existing capital good is Leontief: substitution of capital and labor occurs only through new equipment investment. At any given point in time, capital goods of different capital intensities coexist. In particular, because capital intensity on average rises over time (one of Kaldor's stylized facts), old capital goods are less capital intensive than recent capital goods. The value of a firm now is a function not only of the quantity of capital that it owns, but also of the type of this capital (i.e. of its capital intensity).

This leads to a new channel which creates volatility in the aggregate stock market value. The intuition goes as follows. Consider a positive shock to the economy that leads to an increase in the present discounted value of future productivity. The shock also leads to an increase in wages, but under reasonable conditions the increase in the present discounted value of future wages is less than that of productivity. Low capital-intensity machines, which rely more on labor, benefit from this change in factor prices. As a result, the value of low capital-intensity capital rises. On the other hand, the value of high capital-intensity capital falls, by a symmetric argument. One might think that these effects offset. However, because of growth, the existing capital stock is predominantly of low capital intensity (that is, of lower capital intensity than the capital being built today, which is the marginal investment, for which Tobin's q equals one). Overall, because low capital-intensity capital is procyclical, the aggregate stock market is procyclical. The putty–clay feature thus implies some volatility in the value of capital, even if there is no adjustment cost.

Based on this intuition, the theoretical analysis derives a simple production-based asset pricing formula for the value of the stock market, which holds for any stochastic discount factor. Furthermore, the analysis shows that a single variable acts as a sufficient statistic for asset prices: the investment per new job, k_t. Investment per new job is the capital intensity of new investment, and hence measures what type of capital is built today. Because investors today are free to build any type of capital, the type that they pick must be the optimal one, and hence the most valuable. For instance, when k_t falls, it must be that low capital-intensity machines have become relatively more valuable. In turn, k_t falls because the present value of future wages is small relative to the present value of future productivity, making it advantageous to use less capital-intensive capital.
The variable k_t encodes these present discounted values and reveals them through the choices made by investors in new machines. Following the intuition above, the stock market rises when investment per new job k_t falls. This provides a clear contrast with the adjustment cost model: in that model, the key variable is the investment-capital ratio, and the model predicts that the stock market rises when the investment-capital ratio rises.

These results imply that it is possible to test the model's implications without having to measure the present discounted values of wages and productivity, or to specify a full general equilibrium model (including a utility function and a shock process): it is enough to measure the variable k_t directly in the data. The only assumption is that firms behave competitively and maximize value, subject to the technology constraints, taking prices (the wage rate and the stochastic discount factor) as given.

Section 4 turns to the data and measures the investment per new job k_t. This variable has a strong positive trend that reflects the standard process of capital deepening: capital per worker rises as the economy grows. However, the variable k_t also exhibits significant countercyclical fluctuations around this trend. This countercyclicality implies significant volatility in prices. To assess the model, I follow Cochrane (1991) and use the asset pricing formula to construct the investment return implied by the putty–clay model, and compare this return both to the return in the data and to that implied by the adjustment cost model. The putty–clay model outperforms the adjustment cost model by a wide margin for realistic values of adjustment costs. When adjustment costs are very large, the putty–clay model does as well as the adjustment cost model. Interestingly, the two models do not generate similar time series of returns, which implies that a combination of the two models does better than either model separately. Overall, the putty–clay mechanism appears to be empirically relevant.

Besides theoretically characterizing and empirically evaluating this novel channel for price variation, a contribution of the paper is to provide a simple framework to study putty–clay technology. There has been relatively little work in this area due to the intractability of the model: the endogenous, time-varying cross-sectional distribution of production units must be computed as part of solving for the equilibrium, which makes the state space infinite dimensional. In contrast, my paper adds a single additional state variable to the standard neoclassical growth model. Importantly, the mechanism relies on the heterogeneity of productivity among capital goods. For simplicity, the model considers only heterogeneity in capital intensity, but vintage effects or idiosyncratic shocks would lead to similar effects. Productivity heterogeneity is large in many industries (Bartelsman and Doms, 2001), and recent research in trade, industrial organization, development, and macroeconomics highlights the role of this heterogeneity.

An important simplifying assumption of the analysis is that no capital goods are idle, i.e. there is full utilization. This assumption is necessary to derive the asset pricing formula, which is both the central analytical result of the paper and the basis of the empirical implementation. Without full utilization, the model must be solved numerically, which is quite
feasible but requires a large number of extraneous assumptions on the utility function, the shock process, etc. Moreover, it is impossible, without full utilization, to derive the analytical results that are proved in the paper under the assumption of full utilization.1 Finally, the asset pricing formula used for the empirical work does not have a simple counterpart with variable utilization. Hence, relaxing the full utilization assumption would require a different strategy to study and evaluate the model.

The paper is organized as follows. The rest of the introduction discusses the related literature. Section 2 describes the model, Section 3 derives its implications, and Section 4 examines these implications empirically. An online appendix studies the sensitivity of the results to the assumption of full utilization.

1.1. Relation to the literature

The paper builds directly on the work of Gilchrist and Williams (2000, 2004, 2005), who develop a modern business cycle version of the putty–clay theory of Johansen (1959) and Solow (1960). The asset pricing implications of their model have remained largely unexplored, with the exception of the study by Wei (2003), which extends the Gilchrist and Williams model to include energy demand and uses it to study the effect of the 1974 oil price shock on the stock market. The main result there is that the effect is relatively small, because oil expenditure is a small share of output, and because of a reduction of the real wage in general equilibrium. Rather than capital, labor, and energy, the current paper applies the model to capital and labor. Because the labor share is much larger than the energy share, the putty–clay assumption has more bite than in Wei's analysis. (However, the putty–clay assumption may be less realistic for labor than for energy.) The methodological approach is also different: rather than solving the full DSGE model, this paper derives a formula for the stock market value and implements it directly in the data. On the technical side, the simplification of the putty–clay model that occurs with full utilization was also noted by Atkeson and Kehoe (1999), who study energy demand, but they do not derive its asset pricing implications.

The paper also relates to the large literature studying the link between investment and firm value. In this literature, the adjustment cost model remains the dominant framework.2 However, several studies have explored alternative mechanisms. In particular, several studies explore the role of slow diffusion of ideas, innovation, and ensuing market power.3 Another line of research quantifies the quasi-rents accruing to inputs other than capital: notably, Merz and Yashiv (2007) study the role of labor adjustment costs, while McGrattan and Prescott (2005) introduce intangible capital. Finally, the paper is also related to the recent "production-based asset pricing" literature (e.g., Gomes et al., 2003; Zhang, 2005; Bazdresch et al., 2009). One important difference between the current study and this literature is that the implications of technology for asset prices are derived without making any assumption on the stochastic discount factor. In this sense this is truly a production-based approach rather than a general equilibrium approach.4

2. Model

This section first describes the microeconomic technology, then defines macroeconomic quantities, and finally derives an aggregation result.

2.1. Technology

Production takes place in a continuum of production units called machines.
Machines have an ex-ante Cobb–Douglas production function, and they all have the same total factor productivity: output is y = A k^α n^{1−α}, where A is current aggregate productivity, k is machine capital, and n is machine labor. Both k and n are fixed at the machine level: they are decided when the machine is built, and remain constant thereafter. Given the constant returns to scale assumption, the size of each machine can be normalized to one worker, i.e. set n = 1, and write output as y = A k^α, where k is the capital of the machine, which is also its capital–labor ratio. Once a machine is built it has no scrap value: investment is irreversible. Machines have a constant probability δ of breaking down each period: hence depreciation here is a matter of machines disappearing, not of machines shrinking in size. For clarity, think of a representative firm as operating the entire stock of machines, paying wages to workers, rebating dividends to households, and investing in new machines each period, which are added to the existing stock next period. To build a machine, the representative firm chooses its capital intensity, or capital per worker, k. Since a machine has one worker, k is also the total amount of capital sunk into the machine, i.e. its construction cost. Once k is chosen, it remains fixed for the lifetime of the machine. Hence, the firm faces a trade-off between quality and quantity: for the same

1. An online appendix solves numerically the full utilization model and the variable utilization model under a standard parametrization, and finds that their quantitative predictions are close.
2. Among many others, see Cochrane (1991), Jermann (1998), Hall (2001, 2004), and Liu et al. (2009); early studies of the adjustment cost model include Abel (1983), Hayashi (1982), Mussa (1977), and Summers (1981).
3. See Greenwood and Yorukoglu (1997), Greenwood and Jovanovic (1999), Laitner and Stolyarov (2003), and Garleanu et al. (2009).
4. Exceptions include Belo (2010) and Jermann (2010), who also derive implications for asset prices without making assumptions on the stochastic discount factor. These papers differ from my paper in the technologies that they study.
investment expense, it can have either a small number of highly capital-intensive machines or a large number of low capital-intensity machines. The irreversibility of investment is largely consistent with plant-level studies such as Cooper and Haltiwanger (2006), which find very little negative investment.5 On the other hand, there is, to my knowledge, little systematic evidence on the flexibility of the capital–labor ratio. Certainly one should not think of a firm, or a plant, as a single machine. Rather, a plant is a collection of machines, and variation over time in employment and investment is understood as the addition of new machines to the existing stock.6 The model also accords with the lumpiness of investment documented by Cooper and Haltiwanger (2006). Gilchrist and Williams (2000) consider a more general case, where the number of workers operating the machine is free to vary, and output is given by the formula y = A k^α min(n,1)^{1−α}. As this equation shows, a machine can operate at full capacity with one worker (n = 1), or it can operate at less than full capacity (n < 1). The key difference with Gilchrist and Williams is that I require machines to operate at full capacity, i.e. to always set n = 1. One can think of this as a technological or regulatory constraint on machines. But this assumption would actually be a result if, along the equilibrium path, the operating profit of any existing machine were positive: π(A, k, w) = A k^α − w ≥ 0 for all existing k's, where w denotes the aggregate wage, so that owners choose to operate machines at full capacity. This turns out to be almost the case for some parameter values, in the following sense: in any period, some low capital-intensity machines (i.e. those built with a low k, generally the older ones) may not satisfy this condition; however, because of depreciation there are few such machines. The full utilization assumption is a highly convenient shortcut for both theoretical and numerical analysis. The online appendix compares the results from this model with the results from the model with variable utilization (i.e. the Gilchrist–Williams model). The main difference is that aggregate employment is less volatile in my model.

2.2. Cross-sectional distribution and macroeconomic quantities

The equilibrium of this economy features a cross-sectional distribution of machines. Macroeconomic aggregates are obtained by integrating over the cross-sectional distribution. This section derives the law of motion of the cross-sectional distribution, which then allows us to recover the evolution of macroeconomic aggregates. Let h_t be the number of new machines built at time t, which is also gross job creation at time t, since each machine has one job. Let k_t be the capital intensity chosen at time t. (Since all machines are identical ex-ante, and investors have the same information, they choose to build machines with the same capital intensity at time t.) The law of motion for the measure G_t of machines with capital intensity less than k is

∀k ≥ 0:  G_{t+1}(k) = (1−δ) G_t(k) + h_t 1{k ≥ k_t},   (1)
so that the number of machines of capital intensity less than k at time t+1 is the number of machines of capital intensity less than k at time t that survived, plus the new machines: these were designed in quantity h_t with capital intensity k_t, whence the step function 1{k ≥ k_t}. Total output in this economy is obtained by summing output across machines:

Y_t = A_t ∫_0^∞ k^α dG_t(k).   (2)

Define the "productive capacity" of the economy

Ȳ_t = ∫_0^∞ k^α dG_t(k),   (3)

so that Y_t = A_t Ȳ_t, and let

N_t = ∫_0^∞ dG_t(k)   (4)

be the total number of machines. Because of the normalization that each machine has one worker, N_t is simply employment. Both productive capacity Ȳ_t and employment N_t are predetermined. This is because machines operate at full capacity, and the cross-sectional distribution of machines G_t(k) is a stock variable. (Note also that these equations imply that productivity A_t is not the standard Solow residual.) Aggregate investment I_t is the product of the number of new machines h_t and the units of capital per machine k_t:

I_t = k_t h_t.   (5)

Finally, the resource constraint of the economy reads

C_t + I_t = Y_t.   (6)
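To make the accounting in Eqs. (1)–(6) concrete, here is a minimal simulation sketch. It is only an illustration: all parameter values and the paths for k_t and h_t are arbitrary assumptions, not quantities from the paper.

```python
# Illustrative simulation of the machine distribution, Eqs. (1)-(6).
# Parameters and the k_t, h_t paths are arbitrary assumptions.
import numpy as np

alpha, delta, T = 0.3, 0.08, 50
rng = np.random.default_rng(0)
k_path = np.exp(0.1 * rng.standard_normal(T))  # capital intensity of new machines
h_path = np.full(T, 0.1)                       # number of new machines per period
A_path = np.ones(T)                            # aggregate productivity

# Represent G_t by the surviving mass of each vintage.
vintage_k, vintage_mass = [1.0], [1.0]         # arbitrary initial stock
for t in range(T):
    Y_bar = sum(m * k**alpha for k, m in zip(vintage_k, vintage_mass))  # Eq. (3)
    N = sum(vintage_mass)                                               # Eq. (4)
    Y = A_path[t] * Y_bar                                               # Eq. (2)
    I = k_path[t] * h_path[t]                                           # Eq. (5)
    C = Y - I                                                           # Eq. (6)
    # Update to G_{t+1}: exogenous exit plus entry at intensity k_t, Eq. (1).
    vintage_mass = [(1 - delta) * m for m in vintage_mass]
    vintage_k.append(k_path[t]); vintage_mass.append(h_path[t])

print(f"Y_bar = {Y_bar:.3f}, N = {N:.3f}, C = {C:.3f}")
```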
5. Furthermore, negative investment almost always takes the form of asset sales, not of physical scrapping. In a setting where the identity of the owner of the machine is irrelevant, this capital reallocation does not count as negative investment.
6. The putty–clay model with full utilization implies that investment cannot take place without job creation, and vice versa. Relaxing the full utilization assumption is likely important to allow the model to match the richness of micro data.
Since the results of the paper do not directly assume anything about preferences, this resource constraint is not exploited in the analysis, as discussed in more detail in Section 2.4.

2.3. Aggregation result

In the standard neoclassical growth model, the endogenous state variable is the quantity of capital. This section shows that in my model there are two endogenous state variables: employment N_t (i.e. the number of production units) and the productive capacity Ȳ_t. These variables, which are moments of the cross-sectional distribution G_t(k), capture the quantity as well as the average productivity of the existing stock of machines. This result is a substantial simplification compared to the setup of Gilchrist and Williams (2000), where the state space includes the infinite-dimensional cross-sectional distribution of machines. The simplification works because of the following two facts. First, the full utilization assumption implies that only these two moments of the cross-sectional distribution play a role in the static equilibrium determination. Second, the two moments can be expressed as functions of their own lags and some control variables. Hence, forecasting the evolution of the two moments does not require any additional information on the cross-sectional distribution, which implies that the moments are sufficient state variables for the dynamic equilibrium. This result dramatically simplifies the equilibrium computation, and makes it possible to derive some analytical results. The simplification would also apply to various model extensions, as long as the above two facts still hold. Formally, direct inspection of the model equations reveals that the distribution G_t(k) is relevant for aggregates only through the two moments Ȳ_t = ∫_0^∞ k^α dG_t(k) and N_t = ∫_0^∞ dG_t(k). Using the law of motion for G_t(k) (Eq. (1)) leads to two simple recursions for these moments:

N_{t+1} = ∫_0^∞ dG_{t+1}(k) = (1−δ) ∫_0^∞ dG_t(k) + h_t ∫_0^∞ dε_{k_t}(k) = (1−δ) N_t + h_t,   (7)

where ε_{k_0} denotes a Dirac distribution (a unit mass) at point k_0. Similarly,

Ȳ_{t+1} = ∫_0^∞ k^α dG_{t+1}(k) = (1−δ) ∫_0^∞ k^α dG_t(k) + h_t ∫_0^∞ k^α dε_{k_t}(k) = (1−δ) Ȳ_t + h_t k_t^α.   (8)
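As a quick check on the aggregation result, the following sketch updates both the full vintage distribution and the two moments, and verifies that the recursions (7)–(8) reproduce the integrals exactly; all inputs are illustrative assumptions.

```python
# Check that the moment recursions (7)-(8) match direct aggregation over
# the vintage distribution. Inputs are illustrative assumptions.
import numpy as np

alpha, delta, T = 0.3, 0.08, 40
rng = np.random.default_rng(1)
k_path = np.exp(0.1 * rng.standard_normal(T))
h_path = np.full(T, 0.1)

ks, ms = [1.0], [1.0]          # full distribution: vintages (k, mass)
N, Y_bar = 1.0, 1.0**alpha     # the two moments only

for t in range(T):
    ms = [(1 - delta) * m for m in ms]
    ks.append(k_path[t]); ms.append(h_path[t])
    N = (1 - delta) * N + h_path[t]                             # Eq. (7)
    Y_bar = (1 - delta) * Y_bar + h_path[t] * k_path[t]**alpha  # Eq. (8)

assert np.isclose(N, sum(ms))
assert np.isclose(Y_bar, sum(m * k**alpha for k, m in zip(ks, ms)))
print("the two moments are sufficient: recursions match the integrals")
```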
These equations are easy to interpret: Eq. (7) says that employment (the number of jobs) at time t+1 equals the number of jobs N_t at time t, minus the jobs destroyed δN_t, plus the new jobs h_t created at time t, which come on line at time t+1. Eq. (8) similarly says that the new productive capacity Ȳ_{t+1} equals undepreciated capacity (1−δ)Ȳ_t plus the capacity of new machines, i.e. the number of new machines times the productivity of each new machine. The key implication of Eqs. (7) and (8) is that the two endogenous state variables Ȳ_{t+1} and N_{t+1} are expressed solely as functions of their own lags Ȳ_t and N_t and the controls k_t and h_t. This is precisely the reason why no "curse of dimensionality" arises in my model, in contrast to the Gilchrist and Williams (2000) model.

2.4. A partial specification approach

The standard approach to analyzing such a general equilibrium model is to specify a utility function and a stochastic process for the productivity shocks A_t. For instance, one may assume that there is a representative household with an expected discounted time-separable utility function over consumption and leisure, and that productivity follows an AR(1) process. It is then straightforward to solve and simulate the model. The competitive equilibrium can be computed through the social planner problem: i.e. choose contingent plans for {c_t, k_t, h_t} to maximize utility subject to the constraints (6)–(8) and an initial condition (A_0, Ȳ_0, N_0). The paper follows a different approach: the next section derives a prediction of the model without specifying the utility function (or, more generally, the stochastic discount factor) and the shock process. It is only assumed that firms maximize value, subject to the technology constraints, taking prices (the wage rate and the stochastic discount factor) as given. This is an attractive approach given that there is little consensus in the empirical finance literature regarding the correct specification of the stochastic discount factor or utility function. Similarly, in the labor economics literature there is little agreement on how best to model wages. Finally, the macroeconometrics literature still debates the exact contribution of various shocks to macroeconomic fluctuations. My approach circumvents these difficulties. This approach distinguishes the paper from many important recent production-based studies, which need to make specific assumptions regarding the stochastic discount factor to derive or test model implications. However, this methodology is not new: it has been extensively used to assess the empirical relevance of the adjustment cost model, whose key implications can similarly be derived without assumptions on preferences or shocks. Section 4 draws out the comparison with the adjustment cost model.

3. Theoretical implications

This section derives the production-based formula for the value of the capital stock in the model economy, as a function of some macroeconomic variables: V_t = g(k_t, Ȳ_t, N_t). The first step to obtain this formula is to analyze the investment
problem of the representative firm and to calculate the value of a machine. The second step is to aggregate across the existing stock of machines to find the total stock market value. Finally, we analyze the implications of the formula.

3.1. Investment decision

Given the assumptions of full utilization and exogenous exit, there is no decision to make regarding existing machines: they simply generate cash flows, which depend on productivity and wages. The only decisions regard new machines: how many to build, and which capital intensity to choose for each machine. These decisions are driven by value maximization. The value of an installed production unit is the present discounted value of its cash flows, given expectations of future total factor productivity A_{t+s} and wages w_{t+s}, as well as discount rates m_{t,t+s} (i.e. the stochastic discount factor). Mathematically, the ex-dividend value of a machine of capital intensity k at time t is

P_t(k) = E_t Σ_{s=1}^∞ m_{t,t+s} (1−δ)^{s−1} (A_{t+s} k^α − w_{t+s}).   (9)
This present value can be broken down into two terms, the present value of revenues and the present value of costs:

P_t(k) = q_{A,t} k^α − q_{w,t},   (10)

where q_{A,t} = E_t Σ_{s=1}^∞ m_{t,t+s} (1−δ)^{s−1} A_{t+s} and q_{w,t} = E_t Σ_{s=1}^∞ m_{t,t+s} (1−δ)^{s−1} w_{t+s} are the present discounted values of productivity and wages, respectively. Using Eq. (10), we can now analyze the choice of capital intensity k and the choice of the number of new machines produced each period. First, consider the choice of capital intensity for a given machine. The firm chooses the machine's value net of its real construction cost k: the optimal choice of k at time t is the solution to the program

max_{k ≥ 0} { P_t(k) − k }.   (11)

Given Eq. (10), this program is strictly concave and its solution is characterized by the first-order condition

α k_t^{α−1} = 1 / q_{A,t}.   (12)
Hence, given the expected realizations of productivity and discount rates, there is a unique optimal capital intensity k_t at time t, and the economy produces new machines of only this capital intensity. The formula is analogous to the familiar Cobb–Douglas user cost rule α A k^{α−1} = r + δ, but because capital is fixed over the lifetime of the machine, the decision now takes into account the full present discounted value of productivity, rather than just current productivity. The free-entry condition implies (assuming positive investment in every period) that the value of a new machine equals the cost of building it:

P_t(k_t) = k_t,   (13)

which, given the value formula (10), leads to

k_t^α q_{A,t} − q_{w,t} = k_t.   (14)
This equation reflects the absence of adjustment costs: Tobin's q is equal to one for new machines. (However, at the machine level adjustment costs are infinite, since capital is fixed.) For machines with a capital intensity different from k_t, Tobin's q is less than one, since the irreversibility constraint binds. Mathematically, k_t maximizes the expression P_t(k) − k and sets it equal to zero; hence, for all k ≠ k_t, we have P_t(k) < k, or P_t(k)/k < 1.7 Through the conditions (12) and (13), the variables q_{A,t} and q_{w,t} can be expressed as functions of the single endogenous variable k_t:

q_{A,t} = k_t^{1−α} / α,   (15)

q_{w,t} = ((1−α)/α) k_t.   (16)
Combining these yields

q_{w,t} / q_{A,t} = (1−α) k_t^α.   (17)
7. As a result, properly measured aggregate Q is always less than unity:

∫_0^∞ P_t(k) dG_t(k) < ∫_0^∞ k dG_t(k).

It is possible to add elements to the model to generate a higher average level of Q. Two examples are idiosyncratic productivity shocks and selection (Hopenhayn, 1992), or external adjustment costs (Mussa, 1977).
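A quick numerical check of conditions (12)–(17) may be helpful: given an arbitrary k_t and α (illustrative assumptions), the sketch below computes q_{A,t} and q_{w,t} from (15)–(16) and verifies the free-entry condition (13) and that Tobin's q is below one away from k_t.

```python
# Numerical check of Eqs. (10), (13), (15)-(17); alpha and k_t are
# illustrative assumptions.
import numpy as np

alpha, k_t = 0.3, 1.5
q_A = k_t**(1 - alpha) / alpha           # Eq. (15)
q_w = (1 - alpha) / alpha * k_t          # Eq. (16)

assert np.isclose(q_w / q_A, (1 - alpha) * k_t**alpha)  # Eq. (17)
assert np.isclose(q_A * k_t**alpha - q_w, k_t)          # free entry, Eq. (13)

k_grid = np.linspace(0.2, 3.0, 15)
P = q_A * k_grid**alpha - q_w            # machine value, Eq. (10)
assert np.all(P - k_grid <= 1e-12)       # P_t(k) <= k, with equality only at k_t
print("Tobin's q on the grid:", np.round(P / k_grid, 3))
```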
Hence the capital intensity of new investment k_t is higher when the present value of wages rises relative to the present value of productivity. This simply reflects standard capital–labor substitution, as in the Cobb–Douglas case w/A = (1−α)(K/N)^α; but here too, because the capital–labor ratio is fixed over the lifetime of the machine, the decision is based on the present discounted values of productivity and the wage, rather than just the current values of these variables. The analysis finally leads to the following result, which gives the value of any given machine as a function of its capital intensity k and the capital intensity of new machines k_t.

Proposition 1. The price of a machine with capital intensity k is

P_t(k) = (k_t / α) [ (k/k_t)^α − (1−α) ].   (18)
Proof. According to Eq. (10), P_t(k) = q_{A,t} k^α − q_{w,t}. Using Eqs. (15) and (16) to substitute out q_{A,t} and q_{w,t} yields the result. □

Hence k_t is a sufficient statistic for asset values in this economy: while the asset value depends on two present values of expected future productivity and wages, these objects are summarized by the variable k_t. Economically, the only margin of adjustment in this economy is investment in new machines. Hence, the choices made by investors building new machines reveal these present values through k_t. In equilibrium, the capital intensity of new machines k_t is endogenously determined by expectations of wages, interest rates and productivity, themselves determined by the shock process and the goods and labor market clearing conditions. The key idea of the partial specification approach is that, rather than specifying a full DSGE model, simulating k_t, and then deducing asset prices, it is possible to measure k_t directly and assess the fit of the model.

3.2. The value of the aggregate stock market

The main result of the paper is the following.

Proposition 2. The ex-dividend value of the stock market is

V_t = (k_t / α) [ Ȳ_t / k_t^α − (1−α) N_t ].   (19)
Proof. The ex-dividend value of the aggregate stock market V_t is found by summing the values of all the machines in the economy:

V_t = ∫_0^∞ P_t(k) dG_t(k) = ∫_0^∞ (k_t/α) [ (k/k_t)^α − (1−α) ] dG_t(k) = (1/α) ( k_t^{1−α} Ȳ_t − (1−α) k_t N_t ),   (20)

where the second equality uses Proposition 1 and the third uses the definitions Ȳ_t = ∫_0^∞ k^α dG_t(k) and N_t = ∫_0^∞ dG_t(k). □

Eq. (19) links the value of the stock market V_t with macroeconomic variables: the two endogenous state variables (employment N_t and productive capacity Ȳ_t) and the investment per new job k_t. The intuition for this formula is as follows: the value of existing machines is the present discounted value of their future cash flows, i.e. the present value of the future output that these machines will produce, minus the cost of the labor they will require. These present values in turn depend on the present values of productivity and wages, which in equilibrium are summarized by the variable k_t, as discussed in the previous section. The volatility of the stock market stems from the volatility of these two present values. It is interesting to compare this formula with its counterpart in the standard neoclassical growth model. If there are no adjustment costs, V_t = K_t: the value of the stock market is predetermined, and it moves only with changes in the real quantity of capital. This is clearly at odds with the empirical volatility of the aggregate stock market. The standard answer to this limitation of the model is to introduce adjustment costs. As shown by Hayashi (1982), under homogeneity assumptions marginal q equals average q, i.e. V_t = q_t K_t, where q_t is an increasing function of the investment-capital ratio. Since investment is a control (or jump) variable, i.e. is not predetermined, any shock – say, a productivity shock – affects q_t instantaneously and hence leads to some price variation. As a result, the stock market value is no longer predetermined. The conditional volatility of the stock market value is directly related to the conditional volatility of q_t, and ultimately depends on the volatility of the investment-capital ratio and the curvature of the adjustment cost function. Eq. (19) has a similar implication. The variables Ȳ_t and N_t are predetermined, but the variable k_t is a control; hence it moves on impact when a shock hits, and the value of the capital stock is not predetermined, even though the quantity of capital is (i.e. the distribution G_t(k) is predetermined). Hence, this theory generates some volatility in the aggregate price of capital, even though there are no adjustment costs in the creation of new machines. The following result clarifies the mechanism and explains the volatility of the stock market, taking as given the movements in k_t. We then turn to the sources of the movements in k_t.

Condition 3. The growth rate of aggregate productivity is positive, g_A > 0, and the growth rate of population is greater than the opposite of the depreciation rate, g_n > −δ.
Proposition 4. Assume Condition 3 is satisfied. Then, in equilibrium, along the balanced growth path, the stock market value falls when investment per new job rises.

Proof. Define the function g as g(k_t, Ȳ_t, N_t) = (k_t/α)(Ȳ_t/k_t^α − (1−α)N_t) = V_t. Because Ȳ_t and N_t are predetermined, the impact response to a shock is given by the response of k_t. Hence, it is sufficient to evaluate ∂g/∂k at the nonstochastic steady state of the balanced growth path. Differentiating yields

∂g(k, Ȳ, N)/∂k = ((1−α)/α) ( Ȳ/k^α − N ).   (21)

Using Eqs. (8) and (7), the nonstochastic steady state satisfies (g_Ȳ + δ) Ȳ = k^α h and (g_N + δ) N = h, where g_Ȳ = (α/(1−α)) g_A + g_n denotes the rate of growth of Ȳ, and g_n (resp. g_A) is the growth rate of population (resp. productivity). Simple algebra yields the slope of the function g at the steady state:

∂g(k, Ȳ, N)/∂k = − g_A h / [ ( (α/(1−α)) g_A + g_n + δ ) ( g_n + δ ) ].   (22)

Under Condition 3, g_n + δ > 0 and g_A > 0, so this expression is negative. □
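The steady-state computation in the proof is easy to verify numerically. The sketch below evaluates (21) at the balanced-growth steady state, checks it against the closed form (22), and computes the elasticity discussed next; g_n, g_A, and δ follow the "reasonable" values quoted in the text below, while α, h, and k are illustrative normalizations of my own.

```python
# Numerical check of Eqs. (21)-(23). g_n, g_A, delta follow the values quoted
# in the text; alpha, h and k are illustrative normalizations.
alpha, g_n, g_A, delta, h, k = 0.3, 0.01, 0.01, 0.06, 1.0, 1.0

g_Ybar = alpha / (1 - alpha) * g_A + g_n      # growth rate of Y_bar
Y_bar = k**alpha * h / (g_Ybar + delta)       # (g_Ybar + delta) Y_bar = k^a h
N = h / (g_n + delta)                         # (g_n + delta) N = h

dg_dk = (1 - alpha) / alpha * (Y_bar / k**alpha - N)    # Eq. (21)
closed = -g_A * h / ((g_Ybar + delta) * (g_n + delta))  # Eq. (22)
assert abs(dg_dk - closed) < 1e-12

V = (k / alpha) * (Y_bar / k**alpha - (1 - alpha) * N)  # Eq. (19)
print("elasticity:", round(k / V * dg_dk, 3),           # ~ -0.167
      "approximation, Eq. (23):", round(-g_A / (g_n + delta - g_A), 3))
```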
Hence, when there is growth, the model predicts a negative relation between the stock market value and investment per new job. To understand the intuition for this result, it is useful to note three elements. First, since capital is heterogeneous, the aggregate response of the stock market to a shock is an average of the responses of each capital good, weighted by the cross-sectional distribution G_t(k). Hence the response depends on the cross-sectional distribution of machines' capital intensity G_t(k). Second, the shape of the cross-sectional distribution G_t(k) is affected by the trend growth rate. With growth, the distribution shifts to the right over time because of capital deepening: newer units are more capital intensive than older ones.8 Hence, most machines have a capital intensity lower than the machines built today, i.e. the marginal machines, for which Tobin's q is one. Third, if investors building new machines today choose a low capital intensity k_t, it must be that, given today's information (i.e. expectations of future wages, productivity and discount rates), low capital-intensity machines are more profitable. Hence the value of low capital-intensity machines rises when k_t falls. Combining these three points leads to the conclusion: if there is growth, most capital goods have a lower capital intensity than the marginal ones, hence a lower k_t leads to an increase in the overall stock market value. To get a sense of the magnitude of the effect involved, note that the impact response of the stock market to a shock that increases the capital intensity of new machines by 1% is

∂log V / ∂log k ≃ − g_A / ( g_n + δ − g_A ),   (23)
where g_A is the productivity growth rate. The parameters g_n, δ and g_A together determine the shape of the cross-sectional distribution of machines G_t(k). For instance, a higher g_n or δ implies that old, low capital-intensity machines make up a smaller share of existing capital; the effect of an increase in k is thus smaller. Conversely, if g_A is large, the distribution of k has a higher variance (there are some old, very low-k machines), and the effect of an increase in k is larger. For reasonable parameter values (g_n = 0.01, g_A = 0.01, δ = 0.06) the elasticity is roughly −0.17. The model hence requires substantial volatility in k to generate significant stock market volatility.

3.3. The impact of a productivity shock

The preceding analysis has explained the link between k_t and V_t, taking as given the movements in k_t. In a full general equilibrium model, productivity shocks A_t drive the changes in k_t. (Other shocks could of course be relevant, and the argument below applies to those shocks as well.) The impact response of the stock market to a productivity shock is

∂V_t / ∂A_t = (∂g/∂k) (∂k_t/∂A_t),   (24)
since the other variables Ȳ_t, N_t are predetermined. Under Condition 3, ∂g/∂k < 0, and hence the stock market is procyclical if and only if ∂k_t/∂A_t < 0. Whether this condition is satisfied in the model depends on the process for wages and the stochastic discount factor.9 Intuitively, this condition appears realistic for the empirical reason that wages are rather

8. Mathematically, along a balanced growth path, k_t = k̄(1+g_k)^t, and the number of new machines is h_t = h̄(1+g_n)^t; the cross-sectional distribution satisfies, for any k > 0:

G_t(k) = Σ_{s=−∞}^{t−1} (1−δ)^{t−1−s} h̄ (1+g_n)^s 1{ k̄(1+g_k)^s ≤ k }.

9. In the DSGE extension of this model, with AR(1) productivity shocks and a representative agent, numerical simulations suggest that to obtain the condition ∂k/∂A < 0, it is necessary to have a highly elastic labor supply (so that the wage is not very cyclical) and a low intertemporal elasticity of substitution of consumption (making interest rates procyclical).
smooth: higher productivity is not fully matched by higher wages. This implies that a positive shock increases the present discounted value of productivity q_{A,t} relative to the present discounted value of wages q_{w,t}, which leads firms to invest in low capital-intensity machines, so that k_t falls (Eq. (17)). Section 4 measures k_t and indeed finds that it is countercyclical. This theory also predicts a specific pattern of entry over the business cycle. Because I_t = k_t h_t, the condition ∂k_t/∂A_t < 0 implies that an expansion of investment following a positive productivity shock occurs with h_t increasing markedly and k_t falling. That is, the economy responds to a shock by expanding capacity (the number of new machines) while lowering the average capital intensity of each new machine. Bresnahan and Raff (1991) present evidence from the Great Depression, when aggregate investment was low but some significant investment was taking place in new and more modern plants. Using detailed plant-level data, Lee and Mukoyama (2009) also find that only highly productive plants enter during recessions. These empirical results appear consistent with ∂k_t/∂A_t < 0.

3.4. Cross-sectional implications

The mechanism that generates variation in the value of the stock market has a natural cross-sectional implication: the values of machines with low capital intensity (or, more generally, low productivity) are more sensitive to an aggregate productivity shock. The proof is simple: under Condition 3, and assuming that ∂k_t/∂A_t < 0, Eq. (17) implies that a 1% increase in productivity leads q_{A,t} to rise relative to q_{w,t}, since

∂log q_{A,t}/∂log A_t − ∂log q_{w,t}/∂log A_t = −α ∂log k_t/∂log A_t > 0.   (25)

Eq. (15) then implies that the value of a machine, P_t(k) = q_{A,t} k^α − q_{w,t}, goes up by less if k is larger:

∂²P_t(k) / ∂A_t ∂k < 0.   (26)

Because P_t(k) is increasing in k, this also implies that

∂²log P_t(k) / ∂A_t ∂k < 0,   (27)
i.e. the percentage change in value is larger for low-k machines. Summarizing, the values of low productivity machines increase by more than those of high productivity machines when a positive shock hits.10 This is because the present value of labor costs q_{w,t} increases by less than the present value of productivity q_{A,t}. Low productivity machines are more labor-intensive, and hence benefit disproportionately from a relatively low wage. Gourio (2008) studies these empirical implications using Compustat data, and finds that firms with low productivity have more procyclical profits, and are also more negatively affected by an increase in the aggregate wage.

4. Empirical evaluation

This section evaluates the empirical relevance of the putty–clay and adjustment cost models in accounting for U.S. stock returns. Following Cochrane (1991), I construct the investment returns implied by each technology, using macroeconomic time series (the investment rate for the adjustment cost model, and the investment per new job for the putty–clay model), but without making any assumption on the stochastic discount factor, and in particular without using consumption data. Before constructing investment returns, this section quantifies the behavior of the key variable introduced by the putty–clay model: the investment per new job.

4.1. The volatility and countercyclicality of investment per new job

The previous section demonstrated that the stock market value implied by the putty–clay model is volatile and procyclical only if k_t is volatile and countercyclical. This section constructs an empirical counterpart for k_t and documents its properties. In the model, the investment per new job is aggregate investment I_t divided by the number of new jobs h_t. Gross job creation is obtained as h_t = N_{t+1} − (1−δ) N_t, given data on employment N_t. The investment per new job is then measured as aggregate non-residential investment I_t divided by h_t.11 This construction requires choosing a value for δ.

10. This cross-sectional implication has the "Schumpeterian" flavor that old, low productivity firms are more procyclical, and hence is related to the theoretical work of Caballero and Hammour (1994). However, the mechanism is fairly different, since these authors emphasize the role of entry and exit, while this model concentrates on factor cost differences.
11. This footnote details all the data sources used in this section. The sample is 1950–2010. Aggregate investment is real non-residential investment from the BEA. For the putty–clay model, the employment series is the establishment survey series ([PAYEMS] on the FRED website). Similar results are obtained when the series is instead either the household survey of employment [CE16OV] or a BLS index of total hours worked [HOABNS]. For the adjustment cost model, the capital series is the non-residential capital stock from the fixed asset tables. The "dividend" is computed as net interest plus corporate profits (lines 15 plus 18) from table 1.12 of the National Income and Product Accounts. The stock market return is from CRSP. For the equity return computations (Panel B of Table 2), leverage is calculated using the Flow of Funds data as the ratio of "Total Credit Market Instruments—Liabilities" [TCMILBSNNCB] over "Total Assets" [TABSNNCB]. It makes very little difference if we use as the denominator the sum of "Total Credit Market Instruments" and "Total Net Worth" [TNWMVBSNNCB]. The bond return is proxied using the BAA yield [BAA], assuming a 10-year maturity. Following the usual
Fig. 1. Time series of quasi-change in employment (i.e. gross job creation) and investment per new job. Notes: The figure plots log gross job creation, log h_t (dashed line, left scale), and log investment per new job, log k_t (full line, right scale). Gross job creation is obtained by quasi-differencing total employment (CES data, δ = 0.08). Investment per new job is constructed as the ratio of non-residential investment (BEA) over gross job creation h_t. Annual data, 1950–2009. NBER recessions are indicated as shaded areas.
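The measurement just described is mechanical; the sketch below shows it on hypothetical placeholder arrays (the actual inputs are the BLS employment and BEA investment series detailed in footnote 11).

```python
# Measurement of gross job creation and investment per new job (Section 4.1):
# h_t = N_{t+1} - (1 - delta) N_t and k_t = I_t / h_t. The arrays are
# hypothetical placeholders for the BLS/BEA series.
import numpy as np

delta = 0.08
N = np.array([100.0, 102.0, 103.5, 102.8, 104.0])  # employment
I = np.array([9.5, 10.2, 9.8, 9.0])                # non-residential investment

h = N[1:] - (1 - delta) * N[:-1]   # gross job creation
k = I / h                          # investment per new job
print(np.round(h, 2), np.round(k, 3))
```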
Table 1
Summary statistics.

                                   h_t        k_t        I_t = k_t h_t   y_t        R_t
Standard deviation                 0.148      0.127      0.066           0.021      0.179
                                   (0.017)    (0.012)    (0.006)         (0.002)    (0.015)
Correlation with GDP               0.68       -0.39      0.80            1.00       0.34
                                   (0.06)     (0.12)     (0.07)          (0.00)     (0.06)
Correlation with stock return      0.36       -0.23      0.28            0.34       1.00
                                   (0.09)     (0.11)     (0.08)          (0.06)     (0.00)

Notes: annual data, 1950–2009. The variables are h_t = gross job creation, k_t = investment over gross job creation, I_t = non-residential investment, y_t = real GDP, and R_t = stock market return. All series except the stock return are logged and HP-filtered (smoothing parameter = 100). Standard errors are computed using the Newey–West formula (3 lags) and the delta method.
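For concreteness, the Table 1 statistics can be computed along the following lines; the series here are hypothetical placeholders, and the Newey–West standard errors are omitted.

```python
# Sketch of the Table 1 statistics: log, HP-filter (smoothing parameter 100),
# then standard deviations and correlations. Series are hypothetical.
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

rng = np.random.default_rng(3)
gdp = np.exp(np.cumsum(0.02 + 0.02 * rng.standard_normal(60)))
k_t = np.exp(np.cumsum(0.02 + 0.10 * rng.standard_normal(60)))

gdp_cyc, _ = hpfilter(np.log(gdp), lamb=100)
k_cyc, _ = hpfilter(np.log(k_t), lamb=100)

print("sd(k):", round(float(np.std(k_cyc)), 3),
      "corr(k, GDP):", round(float(np.corrcoef(k_cyc, gdp_cyc)[0, 1]), 2))
```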
In the model, capital depreciation and job destruction occur at the same rate; in the data they do not, which makes the choice of δ more complicated. Capital depreciates at a rate of about 6% per year, while gross job destruction is about 10% per year (see Table 1 of Foster et al., 2006). As a compromise, δ is set to 8%. Section 4.3.2 presents some sensitivity analysis.12 Fig. 1 displays the logged time series of gross job creation, log h_t, and investment per new job, log k_t, together with the NBER recessions. Gross job creation trends up due to population growth, but it is quite volatile: consistent with Davis and Haltiwanger (1992), job creation is strongly procyclical. The investment per new job k_t trends up, reflecting capital deepening, consistent with the Kaldor stylized facts. Moreover, the investment per new job exhibits large countercyclical fluctuations. While k_t is the ratio of two procyclical time series – investment and job creation – the denominator is significantly more volatile than the numerator, making k_t countercyclical. Table 1 summarizes the business cycle properties of the variables h_t and k_t along with investment, GDP, and the U.S. stock market return. Gross job creation h_t is strongly correlated with GDP (0.68) and more than twice as volatile as investment. The investment per new job k_t is about twice as volatile as physical investment, and is negatively correlated with GDP (−0.39). Hence, k_t is volatile and countercyclical, which suggests that the model has the potential to generate some return volatility. One possible concern is that capacity utilization affects the measurement of k_t: in the early stages of an expansion, newly hired workers are matched not with new machines but with previously idle equipment. The online appendix presents a simple method to adjust the employment data, using capacity utilization data from the Federal Reserve, so as to measure only the jobs created without variation in utilization. This has only a limited effect on the measurement of the variable k_t: the volatility of k_t goes from 0.127 to 0.138, and its correlation with GDP remains negative (moving from −0.39 to −0.38).
(Footnote 11 continued) convention, I attribute the stock market value to the beginning of the period. Note that since the computation of the stock value for the putty–clay model requires knowledge of h_t, and thus of N_{t+1}, I can infer values only up to the start of 2009, and so my last return is from January to December 2008.
12. An alternative would be to use direct measures of gross job creation, e.g., as in Davis and Haltiwanger (1992). Unfortunately, there are no long samples of gross job creation which cover the entire economy (as opposed to the manufacturing sector).
Hence, the results are apparently not driven by capacity utilization. This is because the construction of the series k_t is based on annual data, for which variation in capacity utilization is less important than at higher frequencies.

4.2. Constructing investment returns

Given a formula for the stock market value V_t, constructing the investment return simply involves applying the definition of a return:

R_{t+1} = ( V_{t+1} + D_{t+1} ) / ( V_t + I_t ),   (28)
where D_{t+1} is the dividend, and I_t is physical investment. The term I_t in the denominator reflects the timing convention: V_{t+1} is the value of the entire stock of capital at the beginning of time t+1, and this capital is obtained by buying the capital that exists at the beginning of date t at price V_t, plus the new investment projects at their cost I_t. Empirically, the terms D_{t+1} and I_t turn out to be relatively unimportant, since they are smooth compared to the stock market value. We now discuss the details of the construction of R_{t+1} for each model. First, for all models, dividends D_{t+1} and investment I_t are measured directly from the data. Second, in the case of the putty–clay model, V_t = g(k_t, Ȳ_t, N_t) = (k_t/α)(Ȳ_t/k_t^α − (1−α)N_t). Evaluating this expression requires time series for Ȳ_t and N_t. (The construction of k_t was discussed in Section 4.1.) N_t is employment, and hence can be directly measured using the Bureau of Labor Statistics establishment survey. The variable Ȳ_t is obtained from the law of motion of the model's second state variable, Ȳ_{t+1} = (1−δ)Ȳ_t + k_t^α h_t, given the series constructed in the previous section for k_t and h_t. Since δ = 0.08, this construction requires only two inputs: a value for α, and an initial condition Ȳ_0. Because α is the long-run average capital share, it is set to 0.3; this value has little impact on the results. Ȳ_0 is assumed to be equal to the steady-state value of Ȳ_t; the impact of this initial condition disappears after a few years. In the case of the adjustment cost model, V_t = Q_t K_t, and to construct marginal Q, a standard quadratic adjustment cost is assumed:

Q_t = 1 + κ I_t / K_t,   (29)
where κ is the adjustment cost parameter. This formulation assumes that adjustment costs are a function of gross investment, but the results are similar when adjustment costs are a function of net investment, i.e. Q_t = 1 + κ(I_t/K_t − δ). There is a wide range of adjustment cost estimates, leading me to consider two values for κ: first, κ = 2, which reflects realistic estimates (see the discussion in Hall, 2001, 2004), and second, κ = 25.7, which is the value that maximizes the fit of the adjustment cost model, as defined below. This second value implies very large adjustment costs, and a very slow speed of adjustment of investment to shocks, as noted at least since Summers (1981). Finally, it is useful to consider a formulation that incorporates both the putty–clay and adjustment cost features. A simple model extension implies that V_t = Q_t g(k_t, Ȳ_t, N_t).13 This formula naturally mixes the adjustment cost model, through the marginal Q term, driven by variation in the investment rate I_t/K_t, and the putty–clay model, through the function g, driven by variation in the investment per new job k_t = I_t/h_t. I refer to this as the "combined model".

4.3. Empirical results

Table 2 evaluates the fit of the different models: the model with only the putty–clay technology, with only the adjustment cost technology (either high or low adjustment costs), or with both features (again, either high or low adjustment costs). The table reports the volatility of each model's investment return, the correlation of the investment return with the data return, and a measure of fit (pseudo-R²):

R² = 1 − (1/T) Σ_{t=1}^T ( R_t^{model} − R_t^{data} )² / Var(R_t^{data}),   (31)
which is less than one, and is greater than 0 only if the model fits the data better than a constant.14

13. More precisely, suppose that when the representative firm builds new machines, it has to buy capital at price Q_t. This price reflects the rising supply curve of a capital-good producing sector (i.e. the adjustment costs are external, as in Mussa, 1977). Using the standard formulation yields Q_t = 1 + φ(I_t/K_t), where φ is the marginal cost of adjustment. The firm's program is then a simple variation on the program studied in Sections 2 and 3:

max_{i_t ≥ 0} { P_t(i_t) − Q_t i_t },

and the free-entry condition reads P_t(i_t) = Q_t i_t. The same steps as in Section 3 yield a modified version of (19):

V_t = Q_t g(i_t, Ȳ_t, N_t).   (30)

14. Using a standard R² is inappropriate: the theory says that the model return should equal the data return, not that the two should be linked by a linear relationship with a flexible intercept and slope.
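To make the construction concrete, a minimal sketch of the return definition (28) and the pseudo-R² (31) follows; the value, dividend, investment, and data-return arrays are hypothetical placeholders for the series described above.

```python
# Sketch of the investment return (28) and the pseudo-R2 (31).
# All arrays are hypothetical placeholders.
import numpy as np

V = np.array([10.0, 11.5, 10.8, 12.4, 13.0])   # model stock-market value
D = np.array([0.50, 0.55, 0.52, 0.60])         # dividends D_{t+1}
I = np.array([1.00, 1.10, 1.05, 1.20])         # investment I_t

R_model = (V[1:] + D) / (V[:-1] + I) - 1.0     # net return, Eq. (28)
R_data = np.array([0.12, -0.04, 0.18, 0.06])   # placeholder data returns

pseudo_R2 = 1.0 - np.mean((R_model - R_data)**2) / np.var(R_data)  # Eq. (31)
print(np.round(R_model, 3), round(float(pseudo_R2), 3))
```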
Table 2
Model fit.

                                       σ(R^model)      Corr(R^model, R^data)   Pseudo-R²
Panel A: unlevered returns
Putty–clay model                       8.68 (1.54)     0.58 (0.12)             0.29 (0.12)
Adjustment cost model, κ = 2           2.35 (0.22)     0.44 (0.10)             0.05 (0.09)
Adjustment cost model, κ = 25.7        10.43 (1.04)    0.55 (0.09)             0.25 (0.14)
Combined model, κ = 2                  8.10 (1.61)     0.55 (0.11)             0.26 (0.11)
Combined model, κ = 25.7               14.59 (1.99)    0.68 (0.08)             0.42 (0.12)
Panel B: levered returns
Putty–clay model                       13.47 (2.25)    0.59 (0.11)             0.33 (0.12)
Adjustment cost model, κ = 2           3.38 (0.34)     0.43 (0.11)             0.12 (0.07)
Adjustment cost model, κ = 25.7        13.48 (1.47)    0.56 (0.10)             0.27 (0.15)
Combined model, κ = 2                  11.23 (1.71)    0.62 (0.10)             0.37 (0.11)
Combined model, κ = 25.7               16.44 (2.06)    0.70 (0.08)             0.44 (0.12)
Panel C: robustness
Putty–clay model, δ = 0.07             16.32 (3.69)    0.60 (0.10)             0.23 (0.14)
Putty–clay model, δ = 0.09             5.60 (0.80)     0.53 (0.12)             0.20 (0.10)
Putty–clay model, δ = 0.10             4.14 (0.47)     0.46 (0.13)             0.13 (0.08)
Putty–clay model, δ = 0.12             3.14 (0.30)     0.27 (0.11)             0.03 (0.07)

Notes: The table reports, for each model, the model return volatility, the correlation between model return and data return, and the pseudo-R². Standard errors (in parentheses) are obtained by a bootstrap with 1000 replications.
Table 2 reveals that the putty–clay technology generates a return volatility of about 8.7% per year. This is a significant number, but still only half that of the data. The model return correlates well with the data return (0.58), and the pseudo-R² is 0.29. These results can be compared to those of the adjustment cost model. For low values of κ, the adjustment cost model does not fit the data well, generating little return volatility (2.3%), a lower correlation (0.44), and a low pseudo-R² (0.05). Hence, the putty–clay model fits the data much better than the low adjustment cost model. On the other hand, the high adjustment cost model does roughly as well as the putty–clay model: the volatility is slightly higher at 10.4%, but the correlation between model and data is lower (0.55), and the pseudo-R² is also lower (0.25). Interestingly, the table reveals a clear improvement of fit when both features are present: the pseudo-R² increases to 0.42, and the correlation between the model return and the data return rises from 0.58 to 0.68. The volatility of the return is also higher with both adjustment costs and putty–clay. Table 2 reports, for each statistic, standard errors obtained from a bootstrap. Despite the relatively short and volatile time series, the key results are statistically significant; however, the pseudo-R² is fairly imprecise. Fig. 2 depicts the data return, together with either the adjustment cost investment return (for high adjustment costs), the putty–clay investment return, or the combined model return. The figure shows that the two models appear to complement each other to increase the fit. Perhaps the biggest limitation of the putty–clay model is its performance during the 1990s boom. It is possible that the putty–clay feature is less relevant for these more recent observations; intuitively, the putty–clay model is a good description of the manufacturing sector, but the importance of the manufacturing sector has declined in the U.S. economy, and it now accounts for less than 10% of total employment.
4.3.1. Decomposing return volatility

How much of the return volatility is due to the putty–clay feature, and how much is due to the adjustment cost feature? This section provides two different measures to answer this question. Both measures turn out to yield similar results.
Fig. 2. Model and data returns. Notes: The top panel plots the stock return from the data (full line) and from the high adjustment cost model (dashed line); the middle panel plots the data and the putty–clay model; the bottom panel plots the data and the model with both putty–clay and high adjustment costs. See the text for the construction of the model returns. Annual data, 1950–2009.
First, Eq. (30) can be used to decompose the stock return of the combined model into an "adjustment cost return" and a "putty–clay return", using the following fairly accurate approximation:

R_{t+1} − 1 ≃ ( Q_{t+1}/Q_t − 1 ) + ( g(k_{t+1}, Ȳ_{t+1}, N_{t+1}) / g(k_t, Ȳ_t, N_t) − 1 ),   (32)

where the first term on the right defines the adjustment cost return R^{ac}_{t+1} and the second defines the putty–clay return R^{pc}_{t+1}.
This leads to defining the share of return volatility due to the putty–clay feature as

ω = Cov(R^{pc}_{t+1}, R_{t+1}) / Var(R_{t+1}).   (33)
The second measure is derived as follows. Assume that a share θ of the economy behaves as in the putty–clay model and the remaining share 1−θ as in the adjustment cost model, so that the total stock return is a weighted average of the two model returns. Formally, define R_t(θ) as

R_t(θ) = θ R^{pc}_t + (1−θ) R^{ac}_t,   (34)

and choose θ as a nonlinear least squares estimator:

θ ∈ argmin_θ (1/T) Σ_{t=1}^T ( R_t(θ) − R_t^{data} )².   (35)
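Both measures are easy to compute. The sketch below evaluates ω from (33) and, since R_t(θ) is linear in θ, obtains the unconstrained least-squares θ of (35) in closed form; the return series are hypothetical placeholders.

```python
# Sketch of the decomposition measures omega, Eq. (33), and theta, Eq. (35).
# The return series are hypothetical placeholders.
import numpy as np

rng = np.random.default_rng(2)
R_pc = 0.08 * rng.standard_normal(60)        # "putty-clay return"
R_ac = 0.04 * rng.standard_normal(60)        # "adjustment cost return"
R = R_pc + R_ac                              # combined-model return, Eq. (32)
R_data = R + 0.05 * rng.standard_normal(60)  # placeholder data return

omega = np.mean((R_pc - R_pc.mean()) * (R - R.mean())) / np.var(R)  # Eq. (33)

# R_t(theta) is linear in theta, so the minimizer of Eq. (35) is closed-form.
x = R_pc - R_ac
theta = x @ (R_data - R_ac) / (x @ x)
print(round(float(omega), 3), round(float(theta), 3))
```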
The procedure in Eq. (35) yields an estimate of θ, the share of the economy that behaves according to the putty–clay model. The two measures are presented in Table 3, and they lead to similar answers. Assuming low adjustment costs, essentially all of the stock return volatility stems from the putty–clay feature (ω = 0.95, θ = 1.00). If one were to follow Hall (2001, 2004) and rule out the high adjustment cost case as empirically implausible, then the putty–clay model accounts for all of the return volatility. On the other hand, if one is willing to assume very high adjustment costs, around half of total return variability is attributable to the putty–clay feature (ω = 0.49, θ = 0.56). In any event, the putty–clay feature is at least as important as the adjustment cost feature in accounting for stock returns.

4.3.2. Robustness

This section discusses some possible variations on the calculations of the previous section. First, the investment returns above are calculated assuming an all-equity financed firm. Panel B of Table 2 instead reports equity returns. Specifically, given measured leverage λ_t, the equity return is

R^E_{t+1} = λ_t R^I_{t+1} + (1−λ_t) R^b_{t+1},   (36)
Table 3
Share of return volatility accounted for by the putty–clay model.

                        Low adjustment costs          High adjustment costs
                        ω             θ               ω             θ
Unlevered returns       0.95 (0.02)   1.00 (0.15)     0.49 (0.07)   0.56 (0.21)
Levered returns         1.18 (0.05)   0.73 (0.19)     0.71 (0.17)   0.55 (0.10)
δ = 0.07                0.98 (0.01)   0.62 (0.12)     0.76 (0.07)   0.49 (0.14)
δ = 0.09                0.92 (0.02)   1.00 (0.15)     0.27 (0.05)   0.44 (0.18)
δ = 0.10                0.89 (0.03)   1.00 (0.27)     0.15 (0.03)   0.45 (0.12)
δ = 0.12                0.84 (0.04)   0.15 (0.40)     0.03 (0.01)   0.73 (0.04)

Notes: the table reports, for each version of the putty–clay model, and for either low or high adjustment costs, the two measures (ω and θ) of the share of return volatility accounted for by the putty–clay model. See the text for the definitions of ω and θ. Standard errors (in parentheses) are obtained by a bootstrap with 1000 replications.
where R^I_{t+1} is the investment return (i.e. the return on assets) and R^b_{t+1} is the return on corporate bonds. Footnote 11 details the construction of λ_t and R^b_{t+1}. Panel B reveals that taking leverage into account leads to higher volatility of the model returns, but does not change the comparison of the putty–clay and adjustment cost models: the relative fit of the two models is largely unchanged. Second, the construction of the key variable k_t uses a job separation rate of 8% per year. If δ is lower, measured gross job creation becomes more volatile, and hence k_t becomes more volatile as well. As a result, the stock return predicted by the model becomes more volatile, and the fit improves slightly. For instance, if δ = 7%, the correlation of the model and data returns rises from 0.58 to 0.60. On the other hand, if δ = 9%, the correlation of the model and data returns falls to 0.53. Higher values of δ lead to lower correlations (e.g., 0.46 if δ = 10%) and lower predicted return volatilities. Hence, the results are sensitive to this parameter. In particular, if δ is too small, h_t becomes mechanically very close to zero or even negative, leading k_t to become infinitely volatile.15 Third, an online appendix implements a correction to take into account variable utilization in the data, using standard measures of capacity utilization from the Federal Reserve. The results suggest that variable utilization weakens the results somewhat, but they remain significant. For instance, the correlation between the data return and the putty–clay return falls from 0.58 to 0.42, and the return volatility falls slightly from 8.68% to 8.19%. The fit of the model with both putty–clay and adjustment costs remains significantly higher than that of the model with only adjustment costs or only putty–clay: the pseudo-R² is 0.21, versus 0.17 for the adjustment cost model and 0.13 for the putty–clay model. Fourth, the implementation in the paper uses annual data, but broadly similar results are obtained with quarterly data; capacity utilization may be more of an issue at the quarterly frequency, however. Finally, note that alternative tests of the model are possible. Following Cochrane (1991), Liu et al. (2009), and others, this empirical section focuses on investment returns. An alternative is to test the model in levels, i.e. to try to match the stock market value – or Tobin's Q – implied by the model with the one in the data. The key issue is which frequencies the model is supposed to match. Looking at business cycle frequencies in levels (e.g., by comparing detrended values in the model and the data) yields results similar to those in this section. But looking at levels without filtering leads to a focus on medium- to long-term dynamics in Q. The putty–clay model shares the same long-run implications as the standard neoclassical model, and hence has little to say about these medium-run fluctuations, which may be better explained by models that emphasize breaks in technology (e.g., Greenwood and Yorukoglu, 1997, and other papers cited in the literature review). This motivates my focus on business cycle frequencies and hence on returns.16

5. Conclusion

Ever since Hansen and Singleton (1982), much work has been devoted to evaluating whether stock returns fit consumption data and preferences.
15. However, while the volatility of the return mechanically goes to infinity as δ falls, there is no mechanical reason why the model fit (e.g., the correlation) should improve as δ falls.
16. There are also empirical reasons to favor an implementation in returns. First, it is more difficult to measure the stock of capital than it is to measure the flow of investment or employment. Second, the macroeconomic data cover both listed firms and non-listed firms, while the financial data apply only to listed firms. This creates a problem for matching the levels of values; but if the returns on the two types of firms are the same (a natural benchmark), this is not an issue for my analysis.

But in general equilibrium, stock returns should also fit investment data and
technology. This paper shows that the putty–clay model makes some progress toward reconciling the production side of the economy with stock returns. This work complements Gilchrist and Williams (2000), who found that the putty–clay technology also delivers interesting business cycle implications. Many of these business cycle implications are preserved in the simple variant of their model developed in this paper, which may prove useful for other applications. More broadly, the empirical results of the paper suggest that technologies that give a more prominent role to labor are worth exploring.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2011.02.002.
References

Abel, A., 1983. Optimal investment under uncertainty. American Economic Review 73 (1), 228–233.
Atkeson, A., Kehoe, P., 1999. Models of energy use: putty–putty versus putty–clay. American Economic Review 89 (4), 1028–1043.
Bartelsman, E., Doms, M., 2001. Understanding productivity: lessons from longitudinal microdata. Journal of Economic Literature 38 (3), 569–594.
Bazdresch, S., Belo, F., Lin, X., 2009. Labor hiring, investment and stock return predictability in the cross section. University of Minnesota and LSE, Mimeo.
Belo, F., 2010. Production-based measures of risk for asset pricing. Journal of Monetary Economics 57 (2), 146–163.
Boldrin, M., Christiano, L., Fisher, J., 2001. Habit persistence, asset returns, and the business cycle. American Economic Review 91 (1), 149–166.
Bresnahan, T., Raff, D., 1991. Intra-industry heterogeneity and the Great Depression: the American motor vehicles industry, 1929–1935. Journal of Economic History 51 (2), 317–331.
Caballero, R., Hammour, M., 1994. The cleansing effect of recessions. American Economic Review 84 (5), 1350–1368.
Cochrane, J., 1991. Production-based asset pricing and the link between stock returns and economic fluctuations. Journal of Finance 46 (1), 209–237.
Cooper, R., Haltiwanger, J., 2006. On the nature of capital adjustment costs. Review of Economic Studies 73 (3), 611–633.
Davis, S., Haltiwanger, J., 1992. Gross job creation, gross job destruction, and employment reallocation. Quarterly Journal of Economics 107 (3), 819–863.
Foster, L., Haltiwanger, J., Kim, N., 2006. Gross job flows for the U.S. manufacturing sector: measurement from the Longitudinal Research Database. University of Maryland, Mimeo.
Garleanu, N., Panageas, S., Yu, J., 2009. Technological progress and asset prices. Berkeley, Mimeo.
Gilchrist, S., Williams, J., 2000. Putty–clay and investment: a business cycle analysis. Journal of Political Economy 108 (5), 928–960.
Gilchrist, S., Williams, J., 2004. Transition dynamics in vintage capital models: explaining the post-war experience of Germany and Japan. Boston University, Mimeo.
Gilchrist, S., Williams, J., 2005. Investment, capacity, and uncertainty: a putty–clay approach. Review of Economic Dynamics 8 (1), 1–27.
Gomes, J., Kogan, L., Zhang, L., 2003. Equilibrium cross section of returns. Journal of Political Economy 111 (4), 693–732.
Gourio, F., 2008. Labor leverage, firms' heterogeneous sensitivities to the business cycle, and the cross-section of returns. Boston University, Mimeo.
Greenwood, J., Jovanovic, B., 1999. The IT revolution and the stock market. American Economic Review 89 (2), 116–122.
Greenwood, J., Yorukoglu, M., 1997. 1974. Carnegie–Rochester Conference Series on Public Policy 46, 49–95.
Hall, R., 2001. The stock market and capital accumulation. American Economic Review 91 (5), 1185–1202.
Hall, R., 2004. Measuring factor adjustment costs. Quarterly Journal of Economics 119 (3), 899–927.
Hansen, L., Singleton, K., 1982. Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica 50 (5), 1269–1286.
Hayashi, F., 1982. Tobin's marginal q and average q: a neoclassical interpretation. Econometrica 50 (1), 213–224.
Hopenhayn, H., 1992. Exit, selection, and the value of firms. Journal of Economic Dynamics and Control 16 (3), 621–653.
Jermann, U., 1998. Asset pricing in production economies. Journal of Monetary Economics 41 (2), 257–275.
Jermann, U., 2010. The equity premium implied by production. Journal of Financial Economics 98 (2), 279–296.
Johansen, L., 1959. Substitution versus fixed production coefficients in the theory of economic growth: a synthesis. Econometrica 27 (2), 157–176.
Laitner, J., Stolyarov, D., 2003. Technological change and the stock market. American Economic Review 93 (4), 1240–1267.
Lee, Y., Mukoyama, T., 2009. Entry, exit, and plant-level dynamics over the business cycle. University of Virginia, Mimeo.
Liu, L., Whited, T., Zhang, L., 2009. Investment-based expected stock returns. Journal of Political Economy 117 (6), 1105–1139.
McGrattan, E., Prescott, E., 2005. Taxes, regulations, and the value of U.S. and U.K. corporations. Review of Economic Studies 72, 767–796.
Merz, M., Yashiv, E., 2007. Labor and the market value of the firm. American Economic Review 97 (4), 1419–1431.
Mussa, M., 1977. External and internal adjustment costs and the theory of aggregate and firm investment. Economica 44 (174), 163–178.
Rouwenhorst, K., 1995. Asset pricing implications of equilibrium business cycle models. In: Cooley, T. (Ed.), Frontiers of Business Cycle Research. Princeton University Press, Princeton, NJ, pp. 294–330 (Chapter 10).
Solow, R., 1960. Investment and technical progress. In: Arrow, K., Karlin, S., Suppes, P. (Eds.), Mathematical Methods in the Social Sciences. Stanford University Press, Stanford, CA, pp. 89–104.
Summers, L., 1981. Taxation and corporate investment: a q-theory approach. Brookings Papers on Economic Activity 1, 67–140.
Wei, C., 2003. Energy, the stock market, and the putty–clay investment model. American Economic Review 93 (1), 311–323.
Zhang, L., 2005. The value premium. Journal of Finance 60 (1), 67–103.
On the joint determination of fiscal and monetary policy

Fernando M. Martin, Department of Economics, Simon Fraser University, Burnaby, B.C., Canada V5A 1S6
Article history: Received 17 September 2009; received in revised form 31 January 2011; accepted 9 February 2011; available online 19 February 2011.

Abstract. The conduct of fiscal and monetary policy absent commitment depends on the interaction between the objective of smoothing distortions intertemporally and a time-consistency problem. When net nominal government obligations are positive, both fiscal and monetary policies are distortionary and the choice of debt depends on how the anticipated response in future monetary policy affects the current demand for money and bonds. There exists a unique steady state with positive net nominal government obligations, which is stable and time-consistent. For any initial level of debt, the welfare loss due to lack of commitment is small.
1. Introduction

A standard view in macroeconomics, in the tradition of Barro (1979) and Lucas and Stokey (1983), is that the role of government debt is to smooth tax distortions over time. In this context, the goal of monetary policy is to keep the nominal interest rate low (possibly zero) and stable.1 These results, which are obtained under the assumption that the government can commit to future policy choices, are useful as normative benchmarks, but problematic as positive theory. Typically, the policy prescription with commitment is time-inconsistent, any level of debt can be supported in the long run with a suitable choice of initial debt, and the predicted behavior of taxes and nominal interest rates is counterfactually smooth. This paper relaxes the commitment assumption and derives determinate implications for government policy in a micro-founded model of money. The analysis builds on Martin (2009), which, in the context of a cash-in-advance economy, shows that lack of commitment provides a mechanism that explains the level of debt as a function of economic fundamentals.2 The treatment in Martin (2009) is not as general as one would hope, since the theoretical results refer exclusively to the long run and the model becomes analytically intractable when taxation is added. This paper shows that adopting a micro-founded model of agents' portfolio choice, based on the framework proposed by Lagos and Wright (2005), enhances and refines our understanding of the trade-offs that shape fiscal and monetary policy in the absence of commitment. In addition, it addresses the concern that policy prescriptions could be sensitive to precisely how money is modeled.3
Tel.: +1 778 782 5462. E-mail address: [email protected].
1 Lucas (1986) provides a classic articulation of these principles. See also Chari et al. (1991) and, more recently, Aruoba and Chugh (2010), among many others.
2 See also Díaz-Giménez et al. (2008). For alternative mechanisms, see Diamond (1965), Aiyagari and McGrattan (1998), Shin (2006) and Battaglini and Coate (2008). Bohn (1998) shows that U.S. public debt over GDP displays mean-reversion, which suggests the existence of a fundamental long-run level of debt and underscores the need for a theory that explains it.
3 Wallace (1998, 2001), Shi (2006) and Williamson and Wright (2010), among others, argue that monetary policy should be studied in the context of models that provide explicit micro-foundations for the role of money. With the notable exception of Aruoba and Chugh (2010), optimal policy has been analyzed in reduced-form models of money, such as cash-in-advance.
Without commitment, the objective of smoothing distortions intertemporally is weighed against a time-consistency problem created by the interaction between debt and monetary policy. On the one hand, the level of debt inherited by the government influences its monetary policy choice, since inflation reduces the real value of nominal liabilities. On the other hand, the anticipated response of future monetary policy affects the current demand for money and bonds, and thereby the trade-offs faced by the government today when deciding how much debt to leave to its successor.

To expand on the mechanism described above, consider the effects of increasing the debt. This action will induce the government tomorrow to raise the money growth rate to alleviate the added financial burden. For standard assumptions on preferences, a faster expansion of the money supply would make it more valuable for agents tomorrow to have started the period with more money. Hence, the current demand for money increases, which slackens the government budget constraint today. However, a higher money growth rate tomorrow decreases the future value of nominal bonds, and therefore their current demand. Thus, agents will ask today for a higher return on debt, which tightens the government budget constraint due to the added financial cost. The government increases the debt if the overall effect on the demand for money and bonds results in a slackening of its budget constraint, and reduces it in the opposite case. The gains from varying the level of debt are traded off against the costs of less distortion smoothing.

Assuming the government does not start with sufficiently large claims on the private sector (i.e., negative debt), debt converges to a unique steady state, with positive net nominal obligations, and where both fiscal and monetary policies are distortionary. This steady state is time-consistent: endowing the government at this point with a commitment technology would have no effect on policy. The size of long-run debt depends on the determinants of the money demand, which confirms the results derived in cash-in-advance models. Within the Lagos–Wright framework, the critical determinants are the intertemporal elasticity of substitution and the measure of buyers in the market where money is the only medium of exchange.

In contrast to the equilibrium described above, if the government were allowed to commit, then distortions would be smoothed perfectly over time after the initial period. Thus, any level of debt can be supported in the long run by appropriately choosing initial debt. Despite the stark differences in policy prescription, the welfare cost arising from lack of commitment is small. For example, for debt levels between zero and twice the U.S. average, the welfare cost is at most equivalent to a one-time fee of 0.04% of a year's consumption. The low welfare numbers associated with lack of commitment suggest that the search for institutional arrangements that would induce the government to behave closer to the optimal (commitment) policy should not be a primary concern.

The rest of the paper is organized as follows. Section 2 describes the environment and characterizes the monetary equilibrium for a given government policy. Section 3 formulates the government's problem, characterizes equilibrium policy and derives the main theoretical results. Section 4 compares government policy with and without commitment. Section 5 concludes.4
2. The economy

This section describes the environment and characterizes the monetary equilibrium for a given government policy.

2.1. Environment

The environment is a variant of Lagos and Wright (2005). There is a continuum of infinitely lived agents. Each period, two perfectly competitive markets open in sequence: a day and a night market. In each stage a perishable good is produced and consumed.

Before each day market opens, agents receive an idiosyncratic shock that determines whether they can produce or consume the day-good, $x$. With probability $\eta \in (0,1)$ an agent wants to consume but cannot produce, while with probability $1-\eta$ an agent can produce but does not want to consume. A consumer derives utility $u(x)$, where $u$ is twice continuously differentiable, satisfies Inada conditions and $u_{xx} < 0 < u_x$. A producer incurs a utility cost $f(x)$, where $f$ is twice continuously differentiable, $f_x > 0$ and $f_{xx} \geq 0$. Given the assumptions on $u$ and $f$, there exists a unique $\hat{x} \in (0,\infty)$ such that $u_x(\hat{x}) = f_x(\eta\hat{x}/(1-\eta))$. Agents are anonymous, in the sense that private trading histories are unobservable, and lack commitment. Thus, private credit arrangements are not feasible. Since agents in the day market face a double-coincidence-of-wants problem, some medium of exchange is essential for trade to occur.5

At night, all agents can produce and consume the night-good, $c$. Utility from consumption is given by $U(c)$, where $U$ is twice continuously differentiable, satisfies Inada conditions and $U_{cc} < 0 < U_c$. The production technology is linear in hours worked. Disutility from labor is given by $\alpha n$, where $n$ is hours worked and $\alpha > 0$. Let $\hat{c} \in (0,\infty)$ be such that $U_c(\hat{c}) = \alpha$.

There is a benevolent government that supplies a valued public good $g$. To finance its expenditure, the government may use proportional labor taxes $\tau$, print fiat money at rate $\mu$ and issue one-period nominal bonds, which are redeemable in fiat money. The public good is transformed one-to-one from the night-good. Agents derive utility from the public good according to $v(g)$, where $v$ is twice continuously differentiable, satisfies Inada conditions and $v_{gg} < 0 < v_g$. Let $\hat{g} \in (0,\infty)$ be such that $v_g(\hat{g}) = \alpha$.

4 In addition, the supplemental Appendix contains the proofs of all theoretical results and provides an analytical characterization of the equilibrium, under suitable assumptions.
5 See Kocherlakota (1998), Wallace (2001), Shi (2006) and Williamson and Wright (2010).
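To fix ideas, the first-best allocations $(\hat{x}, \hat{c}, \hat{g})$ can be computed directly from the three marginal conditions above. The sketch below does this for the illustrative functional forms used later in Section 3.3 ($u(x) = (x^{1-\sigma}-1)/(1-\sigma)$, $f(x) = x$, $U(c) = \ln c$, $v(g) = \ln g$, with $\alpha = 2$, $\eta = 0.5$, $\sigma = 2$); it is a minimal numerical check, not part of the paper's own code.

```python
# Minimal sketch: first-best allocations for the Section 3.3 parameterization.
# Conditions: u_x(x) = f_x(eta*x/(1-eta)), U_c(c) = alpha, v_g(g) = alpha.
from scipy.optimize import brentq

alpha, eta, sigma = 2.0, 0.5, 2.0

u_x = lambda x: x**(-sigma)     # u(x) = (x^(1-sigma) - 1)/(1 - sigma)
f_x = lambda y: 1.0             # f(x) = x, so marginal cost is constant

# x_hat solves u_x(x) = f_x(eta*x/(1-eta)); here simply x^(-2) = 1.
x_hat = brentq(lambda x: u_x(x) - f_x(eta*x/(1.0 - eta)), 1e-3, 10.0)
c_hat = 1.0/alpha               # U_c(c) = 1/c = alpha
g_hat = 1.0/alpha               # v_g(g) = 1/g = alpha

print(x_hat, c_hat, g_hat)      # 1.0, 0.5, 0.5
```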
The government can commit to policies within the period, but lacks the ability to commit to future policy choices. To characterize government behavior, let us adopt the notion of Markov-perfect equilibrium, in which policies are a function of fundamentals only.6 Assume the government announces its policy for the period at the beginning of the day, before agents' idiosyncratic shocks are realized. The government only actively participates in the night market: taxes are levied on hours worked at night and open market operations are conducted in the night market. As in Berentsen and Waller (2008) and Aruoba and Chugh (2010), public bonds are book-entries in the government's record. Since bonds are not physical objects and the government does not participate in the day market (i.e., cannot intermediate or provide third-party verification), bonds are not used as a medium of exchange in the day market and thus, money is essential.7

All nominal variables (except for bond prices) are normalized by the aggregate money stock. Hence, today's aggregate money supply is equal to 1 and tomorrow's is $1+\mu$. The government budget constraint is

$$1 + B + pg = p\tau n + (1+\mu)(1 + qB'), \qquad (1)$$

where $B$ is the current aggregate bond–money ratio, $p$ is the (normalized) market price of the night-good and $q$ is the price of a bond that earns one unit of fiat money in the following night market. "Primes" denote variables evaluated in the following period. Thus, $B'$ is tomorrow's aggregate bond–money ratio.

2.2. The night market

An agent arrives at the night market with individual money balances $m$ and government bonds $b$. Since bonds are redeemed in fiat money at par, the composition of an agent's nominal portfolio at the beginning of the night is irrelevant. Let $z \equiv m + b$ be total (normalized) nominal holdings. The budget constraint of an agent at night is

$$pc + (1+\mu)(m' + qb') = p(1-\tau)n + z. \qquad (2)$$
Notice that the composition of the nominal portfolio the agent decides to carry over to the next period matters, as only fiat money is used to buy goods in the day market. Let $V(m,b)$ be the value of entering the day market with money balances $m$ and bond balances $b$, and let $W(z)$ be the value of entering the night market with total nominal balances $z$. After solving for $n$ from (2), the agent's problem at night is

$$W(z) = \max_{c,\, m',\, b'} \; U(c) - \frac{\alpha c}{1-\tau} + \frac{\alpha (z - (1+\mu)(m' + qb'))}{p(1-\tau)} + v(g) + \beta V(m', b').$$

The first-order conditions are

$$U_c - \frac{\alpha}{1-\tau} = 0, \qquad (3)$$

$$-\frac{\alpha(1+\mu)}{p(1-\tau)} + \beta V'_m = 0, \qquad (4)$$

$$-\frac{\alpha q(1+\mu)}{p(1-\tau)} + \beta V'_b = 0. \qquad (5)$$
Focusing on symmetric equilibria, one can follow Lagos and Wright (2005) to show that (3)–(5) imply all agents exit the night market with the same money and bond balances.8 Furthermore, the value function $W$ is linear, $W_z = \alpha/(p(1-\tau))$, and so $W(z) = W(0) + \alpha z/(p(1-\tau))$. Conditions (4) and (5) imply

$$q = \frac{V'_b}{V'_m}. \qquad (6)$$

Thus, if $V'_b < V'_m$, i.e., the marginal value of entering tomorrow's day market with a unit of bonds is less than with a unit of money, then agents ask to be compensated to hold bonds, i.e., $q < 1$.

2.3. The day market

The ex-ante value for an agent that enters the day market is $V(m,b) = \eta V^c(m,b) + (1-\eta)V^p(m,b)$, where $V^c$ and $V^p$ are the values of being a consumer and a producer in the day market, respectively.

6 See Maskin and Tirole (2001) for a definition and justification of this solution concept. For applications to dynamic policy games see Klein et al. (2008), Díaz-Giménez et al. (2008) and Martin (2009, 2010b), among others.
7 Alternatively, one could assume that bonds (but not money) can be costlessly counterfeited. Again, bonds will only be traded in markets where the government participates. See Telyukova and Wright (2008) and Aruoba et al. (forthcoming) for examples where fiat money is assumed to be the only recognizable asset in all trades.
8 One minor caveat is that $V$ is linear in $b$ and hence, equilibria with a non-degenerate distribution of bonds may also exist; hence the focus on symmetric equilibria. Nevertheless, note that the agent's day-problem is unaffected by bond holdings, whereas at night, the agent only cares about total nominal holdings.
A consumer faces a day-budget constraint, $\tilde{p}x \leq m$, where $\tilde{p}$ is the (normalized) market price of good $x$. Using $\xi$ as the Lagrange multiplier associated with this constraint, the problem of a consumer can be written as

$$V^c(m,b) = \max_x \; u(x) + W(0) + \frac{\alpha(m + b - \tilde{p}x)}{p(1-\tau)} + \xi(m - \tilde{p}x).$$

The first-order condition is

$$u_x - \frac{\alpha\tilde{p}}{p(1-\tau)} - \xi\tilde{p} = 0. \qquad (7)$$

From the envelope condition and (7), $V^c_m = u_x/\tilde{p}$ and $V^c_b = \alpha/(p(1-\tau))$.

In general, the individual quantities consumed and produced are different. Let $y$ be the day-output of an individual producer. The problem of a producer is

$$V^p(m,b) = \max_y \; -f(y) + W(0) + \frac{\alpha(m + b + \tilde{p}y)}{p(1-\tau)}.$$

The first-order condition is

$$-f_y + \frac{\alpha\tilde{p}}{p(1-\tau)} = 0. \qquad (8)$$
Note that because of quasi-linear preferences in the night stage, a producer's actions during the day are unaffected by the amount of money or bonds he brings into the period. The envelope condition implies $V^p_m = V^p_b = \alpha/(p(1-\tau))$. From the envelope conditions of the consumer's and producer's problems, it follows that

$$V_m = \frac{\eta u_x}{\tilde{p}} + \frac{(1-\eta)\alpha}{p(1-\tau)}, \qquad (9)$$

$$V_b = \frac{\alpha}{p(1-\tau)}. \qquad (10)$$
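Conditions (6), (9) and (10) pin down the bond price from allocations alone once $\tilde{p} = 1/x$ and $\alpha/(p(1-\tau)) = f_y x$ are substituted (see Section 2.4 below). The fragment that follows is a sketch under the Section 3.3 functional forms; it evaluates $V_m$, $V_b$ and the implied $q$ and nominal interest rate $(1/q)-1$ for a hypothetical future allocation $x'$, which is a placeholder value.

```python
# Sketch: marginal portfolio values and the bond price implied by eqs. (6), (9), (10).
# With p~ = 1/x and alpha/(p(1-tau)) = f_y x (eq. 12), V_m = x(eta u_x + (1-eta) f_y)
# and V_b = f_y x. Functional forms follow the Section 3.3 illustration.
eta, sigma = 0.5, 2.0
u_x = lambda x: x**(-sigma)
f_y = 1.0                                  # f(y) = y

def marginal_values(x):
    Vm = x*(eta*u_x(x) + (1.0 - eta)*f_y)  # eq. (9), evaluated in equilibrium
    Vb = f_y*x                             # eq. (10), evaluated in equilibrium
    return Vm, Vb

x_next = 0.5                               # hypothetical future day-good allocation
Vm, Vb = marginal_values(x_next)
q = Vb/Vm                                  # eq. (6)
print(q, 1.0/q - 1.0)                      # bond price and nominal interest rate
```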
2.4. Monetary equilibrium

The aggregate resource constraint in the day implies

$$y = \frac{\eta x}{1-\eta}. \qquad (11)$$

A standard result is that consumers spend all their money in the day market, i.e., $m = \tilde{p}x$. The market clearing condition is then $\eta = (1-\eta)\tilde{p}y$, which implies $\tilde{p} = 1/x$. Substituting this expression into the first-order condition of the producer (8) yields

$$f_y x = \frac{\alpha}{p(1-\tau)}, \qquad (12)$$

where $y$ satisfies (11). Next, (7) can be written as $\xi = x(u_x - f_y)$. In a monetary equilibrium $\xi \geq 0$, which implies $u_x - f_y \geq 0$.

All agents choose the same $c$, $m'$ and $b'$ at night. Thus, in equilibrium, $m' = 1$ and $b' = B'$. Agents only differ in their role in the day market and their corresponding labor effort in the night market. It is straightforward to verify that the aggregate resource constraint in the night market is satisfied, i.e.,

$$c + g = n. \qquad (13)$$
We can now collect the remaining equations that summarize agents' behavior in any given period. After some rearrangement, equations (3), (4), (6), (7) and (12) can be written as

$$\mu = \frac{\beta x'(\eta u'_x + (1-\eta)f'_y)}{f_y x} - 1, \qquad (14)$$

$$\tau = 1 - \frac{\alpha}{U_c}, \qquad (15)$$

$$p = \frac{U_c}{f_y x}, \qquad (16)$$

$$q = \frac{f'_y}{\eta u'_x + (1-\eta)f'_y}, \qquad (17)$$

$$u_x - f_y \geq 0, \qquad (18)$$

where $y$, $y'$ satisfy (11).
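Conditions (14)–(17) make the within-period mapping from allocations to policy instruments explicit: given the current allocation $(x, c)$ and the anticipated future day-good allocation $x'$, the money growth rate, tax rate, price level and bond price follow directly. The sketch below evaluates this mapping under the Section 3.3 functional forms; the particular values of $x$, $x'$ and $c$ fed in are placeholders, not results taken from the paper.

```python
# Sketch: policy instruments implied by allocations via eqs. (14)-(17).
alpha, beta, eta, sigma = 2.0, 0.75, 0.5, 2.0
u_x = lambda x: x**(-sigma)
f_y = lambda y: 1.0                        # f(y) = y
U_c = lambda c: 1.0/c                      # U(c) = ln c

def instruments(x, x_next, c):
    y, y_next = eta*x/(1 - eta), eta*x_next/(1 - eta)
    assert u_x(x) >= f_y(y)                # monetary equilibrium requires (18)
    mu  = beta*x_next*(eta*u_x(x_next) + (1 - eta)*f_y(y_next))/(f_y(y)*x) - 1  # (14)
    tau = 1.0 - alpha/U_c(c)                                                    # (15)
    p   = U_c(c)/(f_y(y)*x)                                                     # (16)
    q   = f_y(y_next)/(eta*u_x(x_next) + (1 - eta)*f_y(y_next))                 # (17)
    return mu, tau, p, q

mu, tau, p, q = instruments(x=0.5, x_next=0.5, c=0.3125)   # placeholder allocations
print(mu, tau, p, q)   # note q <= 1, so the nominal rate (1/q)-1 is non-negative
```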
As derived above, the inequality (18) comes from the requirement that the Lagrange multiplier in the day-consumer's problem, $\xi$, be non-negative. Using (9), (10) and (12), this requirement can be written as $V_m - V_b \geq 0$. In other words, for money to be valued in equilibrium, it has to provide some liquidity services over bonds. In this case, money allows consumers to buy goods in the day market. One could thus interpret the wedge in (18) as a measure of the shadow value of liquidity in the day market. Note that since (18) is satisfied in every period, it implies $q \leq 1$, i.e., the nominal interest rate (defined as $(1/q)-1$) cannot be negative in a monetary equilibrium.

3. Government policy

This section presents the main results of the paper. First, the problem of the government is formulated. Second, the various policy trade-offs faced by the government are discussed, followed by an analysis of how policy is determined. Third, the Markov-perfect equilibrium is characterized. Finally, long-run policy is analyzed, with an emphasis on the theory's applicability to actual economies and how it compares to the cash-in-advance economy studied in Martin (2009). For clarity of exposition, all proofs of theoretical results are derived in the supplemental Appendix.

3.1. Problem of the government

The literature on optimal policy with commitment typically adopts what is known as the primal approach, which consists of using the first-order conditions of the agent's problem to substitute prices and policy instruments for allocations in the government budget constraint. Then, the sequence of period budget constraints can be summarized into a single implementability constraint, and the problem of the government is thus formulated in terms of allocations and initial debt only. Absent commitment, one can also remove prices and policy instruments from the government's problem, except for debt, as noted below. Using equilibrium conditions (13) and (17), the government budget constraint (1) can be written as

$$(U_c - \alpha)c - \alpha g + \beta\eta x'(u'_x - f'_y) + \beta f'_y x'(1+B') - f_y x(1+B) = 0. \qquad (19)$$
The equation above is a function of $B$, $B'$, $x$, $x'$, $c$ and $g$ (recall that from condition (11), the expressions for $y$ and $y'$ are determined by $x$ and $x'$, respectively). All these variables are chosen in the current period, except for $x'$, which depends on the policy implemented by the government tomorrow. In turn, future government policy is a function of the level of inherited debt. Given that the current government anticipates this response, the choice of debt is a fundamental component of its problem. Thus, instead of a single implementability constraint, the problem of the government without commitment needs to be formulated subject to the period budget constraint (19).

Since the problem of the government will be formulated in terms of allocations instead of policy instruments (except for debt), some further observations are in order. First, from condition (14), and for a given anticipated realization of $x'$ (which depends on the choice of $B'$), an increase (decrease) in $\mu$ implies a decrease (increase) in day-good consumption, $x$. Thus, variations in day-good consumption are associated with changes in the money growth rate in the opposite direction. Also note, from the expressions for $\tilde{p}$ and $p$, that an increase in $\mu$ (decrease in $x$) is inflationary, in the sense that current prices increase. Second, from condition (15) it follows that an increase (decrease) in the tax rate $\tau$ implies a decrease (increase) in night-good consumption, $c$. Thus, variations in night-good consumption reflect changes in the tax rate in the opposite direction. Throughout the paper, variations in $x$ and $c$ are interpreted as changes in the money growth rate and the tax rate, respectively.

The analysis that follows will use the partial derivatives of the government budget constraint extensively. Hence, it is convenient to write (19) compactly as

$$e(B, B', x, x', c, g) = 0,$$

with partial derivatives: $e_B = -f_y x$; $e_{B'} = \beta f'_y x'$; $e_x = -(1+B)(f_y + f_{yy}y)$; $e_{x'} = \beta\{\eta(u'_x + u'_{xx}x') + (1-\eta)(f'_y + f'_{yy}y') + B'(f'_y + f'_{yy}y')\}$; $e_c = U_c - \alpha + U_{cc}c$; and $e_g = -\alpha$.

These expressions provide useful insights into the trade-offs faced by the government. Consider first the effects of increasing the level of debt. This action relaxes the government budget constraint, $e_{B'} > 0$, as it reduces the need for taxation. The direct counter-effect is a tightening of the future budget constraint, $e'_B < 0$, resulting from the higher financial burden. There is an additional effect, due to changes in future policy, which is accounted for by $e_{x'}$. Specifically, from equilibrium conditions (14) and (17), it follows that changes in tomorrow's allocation of the day-good affect the current demand for money and bonds, and therefore have an effect on the current government budget constraint. The demand for money affects how distortionary a given change in the money supply is, while the demand for bonds determines the cost of servicing the debt. The sign of $e_{x'}$ will be critical for determining the government's incentives to increase or decrease debt.

The expression for $e_x$ shows how the level of debt affects monetary policy. If $1+B > 0$, then $e_x < 0$, i.e., the government has an incentive to reduce day-good consumption to relax its budget constraint. As mentioned above, for a given choice of $B'$, a lower $x$ is implemented through a higher $\mu$. In other words, if inherited net nominal obligations are positive, then the government has an incentive to use inflation to reduce the real value of its financial burden. On the other hand, if $1+B < 0$, then the incentive goes in the opposite direction.
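As a consistency check on the signs discussed above, the sketch below implements the budget-constraint residual $e(\cdot)$ and its analytic partial derivatives for the Section 3.3 functional forms and compares them against central finite differences; the evaluation point is arbitrary.

```python
# Sketch: the residual e(B, B', x, x', c, g) of eq. (19) and its analytic partials.
beta, eta, alpha, sigma = 0.75, 0.5, 2.0, 2.0
u_x  = lambda x: x**(-sigma)
u_xx = lambda x: -sigma*x**(-sigma - 1.0)
# Here f(y) = y (so f_y = 1, f_yy = 0) and U(c) = ln c (so U_c = 1/c, U_cc = -1/c^2).

def e(B, Bp, x, xp, c, g):
    return ((1.0/c - alpha)*c - alpha*g
            + beta*eta*xp*(u_x(xp) - 1.0)
            + beta*xp*(1.0 + Bp)
            - x*(1.0 + B))

def partials(B, Bp, x, xp, c, g):
    return {"B":  -x,
            "Bp": beta*xp,
            "x":  -(1.0 + B),
            "xp": beta*(eta*(u_x(xp) + u_xx(xp)*xp) + (1.0 - eta) + Bp),
            "c":  -alpha,      # U_c - alpha + U_cc*c = 1/c - alpha - 1/c here
            "g":  -alpha}

pt = dict(B=1.0, Bp=1.5, x=0.6, xp=0.5, c=0.35, g=0.35)   # arbitrary point
h, grad = 1e-6, partials(**pt)
for k in pt:
    up = dict(pt); up[k] += h
    dn = dict(pt); dn[k] -= h
    num = (e(**up) - e(**dn))/(2.0*h)
    assert abs(num - grad[k]) < 1e-5, k
print("analytic partials of e match finite differences")
```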
The expressions for $e_c$ and $e_g$ have a standard interpretation. In equilibrium, $e_c < 0$, i.e., a lower tax rate (higher $c$) tightens the government budget constraint. The expression for $e_g$ simply establishes that an increase in government expenditure tightens the budget constraint, $e_g < 0$, due to the larger financial burden.

The problem of the current government is to choose $B'$, $x$, $c$ and $g$ in order to maximize agents' present value utility. The government is constrained to satisfy its period budget constraint in a monetary equilibrium, as determined by (18) and (19), anticipating that policy tomorrow will react to the level of inherited nominal liabilities. Specifically, the government today understands that tomorrow's allocation of the day-good, $x'$, is a function of the current debt choice, $B'$. Let $\mathcal{X}(B')$ be the allocation of the day-good that the current government anticipates its future-self will implement as a function of inherited debt. The problem of the government is

$$V(B) = \max_{B', x, c, g} \; \eta u(x) - (1-\eta)f(y) + U(c) - \alpha(c+g) + v(g) + \beta V(B')$$

subject to $e(B, B', x, \mathcal{X}(B'), c, g) = 0$ and $u_x - f_y \geq 0$.

A Markov-perfect monetary equilibrium in this economy is defined as follows.

Definition 1. Let $\Gamma \equiv [B_L, B_H]$, $-\infty < B_L < B_H < \infty$. A Markov-perfect monetary equilibrium (MPME) is a set of functions $\{\mathcal{B}, \mathcal{X}, \mathcal{C}, \mathcal{G}, V\} : \Gamma \to \Gamma \times \mathbb{R}^3_+ \times \mathbb{R}$, such that

$$\{\mathcal{B}(B), \mathcal{X}(B), \mathcal{C}(B), \mathcal{G}(B)\} = \arg\max_{B' \in \Gamma,\, x,\, c,\, g} \; \eta u(x) - (1-\eta)f(y) + U(c) - \alpha(c+g) + v(g) + \beta V(B')$$

subject to $e(B, B', x, \mathcal{X}(B'), c, g) = 0$ and $u_x - f_y \geq 0$; $y = \eta x/(1-\eta)$ and $\mathcal{Y}(B) = \eta\mathcal{X}(B)/(1-\eta)$; and

$$V(B) = \eta u(\mathcal{X}(B)) - (1-\eta)f(\mathcal{Y}(B)) + U(\mathcal{C}(B)) - \alpha(\mathcal{C}(B) + \mathcal{G}(B)) + v(\mathcal{G}(B)) + \beta V(\mathcal{B}(B)).$$

The bounds on debt specified above are necessary for a proper definition of the equilibrium at this point, since the constraints in the government's problem are not sufficient to characterize a monetary equilibrium. To be specific, (18) and (19) do not rule out the possibility of the government running a Ponzi scheme. Due to the lack-of-commitment friction, there is no single government on which to impose an additional present-value "pay-back" (i.e., no-Ponzi-games) constraint. This issue will be addressed below with a more general property of the MPME, which in turn rules out Ponzi schemes (see Proposition 1). The characterization of the equilibrium that follows will not rely on specific bounds on debt.

Solving for a MPME involves finding the fixed point of both $V(B)$ and $\mathcal{X}(B)$. Typically, one would impose differentiability on anticipated policy functions as a refinement to rule out equilibria where policy functions are non-differentiable for non-fundamental reasons.9 In this case, however, since the non-negativity constraint (18) will bind for some debt levels, the restriction needs to be weaker.

Assumption 1. The current government anticipates future governments to implement a policy $\mathcal{X}(B)$ which is continuous and differentiable almost everywhere.

3.2. Determination of government policy

Let $\lambda$ and $\zeta$ be the Lagrange multipliers associated with the constraints of the government's problem. Assuming $\mathcal{X}'_B$ and $V'_B$ exist, the first-order conditions are
$$\lambda(e_{B'} + e_{x'}\mathcal{X}'_B) + \beta\lambda' e'_B = 0, \qquad (20)$$

$$\eta(u_x - f_y) + \lambda e_x + \zeta\left(u_{xx} - \frac{\eta f_{yy}}{1-\eta}\right) = 0, \qquad (21)$$

$$U_c - \alpha + \lambda e_c = 0, \qquad (22)$$

$$-\alpha + v_g + \lambda e_g = 0, \qquad (23)$$

where the envelope condition, $V_B = \lambda e_B$, is used to simplify (20).

From Eqs. (21)–(23) it follows that the value of the Lagrange multiplier $\lambda$ is proportional to the wedges in allocations introduced by government policy: $u_x - f_y$ (if $\zeta = 0$), $U_c - \alpha$ and $v_g - \alpha$. Thus, $\lambda$ is a direct measure of the intratemporal distortions created by current policy. The government uses debt to substitute these distortions intertemporally. The wedge between $\lambda$ and $\lambda'$ is an inverse measure of the degree by which these distortions are smoothed over time.

As mentioned in the introduction, the interaction between monetary policy and government debt is critical for the determination of government policy. Let us inspect the first-order conditions of the government's problem to further understand this relationship.

9 If $\mathcal{X}(B)$ were anticipated to be discontinuous, the discontinuity could be preserved in the policy response of the current government, since "small" changes in debt would sometimes trigger "large" changes in future policy. These types of non-differentiable equilibria are an artifact of the infinite horizon, as they would typically not exist in versions of the economy with a finite horizon (and appropriate terminal conditions for the value of money). See Krusell and Smith (2003) and Martin (2009) for further analysis.
Consider first the intratemporal trade-off created by monetary policy. As explained above, when $B > -1$, the government has an incentive to inflate away its nominal liabilities, at the cost of distorting the allocation of the day-good. Eq. (21) shows this trade-off. When $B \geq -1$, the non-negativity constraint does not bind (i.e., $\zeta = 0$; see Proposition 2 below) and (21) can be written as $\eta(u_x - f_y) = \lambda(1+B)(f_y + f_{yy}y)$. Thus, an increase in beginning-of-period debt, $B$, implies a decrease in day-good consumption, $x$. In other words, the incentive to use inflation increases with the level of debt. This is the channel through which debt affects monetary policy.

Consider now the intertemporal trade-off faced by the government. Since $e_{B'} = -\beta e'_B = \beta f'_y x'$, condition (20) can be written as

$$\beta f'_y x'(\lambda - \lambda') + \lambda e_{x'}\mathcal{X}'_B = 0. \qquad (24)$$

The term $\beta f'_y x'(\lambda - \lambda')$ in (24) is the standard trade-off between current and future distortions. Absent any other margins, this term implies that the government would perfectly smooth distortions over time, i.e., set $\lambda = \lambda'$. This is a typical prescription in models with commitment, where the optimal policy is to keep distortions constant after the initial period. Without commitment, the additional term $\lambda e_{x'}\mathcal{X}'_B$ appears, which establishes how a change in debt choice affects the current government budget constraint, through its effect on future monetary policy. Debt increases or decreases depending on the direction of this effect, i.e., the sign of $e_{x'}\mathcal{X}'_B$.

The government has an incentive to push distortions to the future ($\lambda < \lambda'$), through an increase in debt, when by doing so it relaxes its budget constraint ($e_{x'}\mathcal{X}'_B > 0$). Similarly, there is an incentive to decrease debt when $e_{x'}\mathcal{X}'_B < 0$. Note that if the government expects future monetary policy to remain unaffected by a change in debt ($\mathcal{X}'_B = 0$), then distortions are held constant ($\lambda = \lambda'$); in this case, debt may increase or remain constant.

As argued above, a higher debt choice today induces a faster expansion of the money supply tomorrow (lower $x'$). Thus, to fully understand the incentives to vary debt levels, we need to further inspect the term $e_{x'}$. Consider the expression $V'_m + V'_b B'$, which measures tomorrow's marginal value of (normalized) nominal balances. In a monetary equilibrium, Eqs. (9) and (10) imply $V_m = x(\eta u_x + (1-\eta)f_y)$ and $V_b = f_y x$. One can verify that $e_{x'} = \beta\, d(V'_m + V'_b B')/dx'$. In other words, $e_{x'}$ represents the change in tomorrow's marginal value of nominal balances for a given variation in future monetary policy. Note that $dV'_m/dx' = \eta\, dV^{c\prime}_m/dx' + (1-\eta)\, dV^{p\prime}_m/dx' = \eta(u'_{xx}x' + u'_x) + (1-\eta)(f'_{yy}y' + f'_y)$, which can be positive, negative or zero. Therefore, a higher money growth rate tomorrow (equivalently, lower $x'$) may decrease, increase or have no effect on the future value of fiat money. The interesting case is when $dV^{c\prime}_m/dx' = u'_{xx}x' + u'_x < 0$: a faster expansion of the money supply tomorrow would make it more valuable for a future consumer to have arrived at the day market with more cash; if this effect is sufficiently large to overcome the negative effect on future producers ($dV^{p\prime}_m/dx' = f'_{yy}y' + f'_y > 0$), then the current demand for money would increase, $dV'_m/dx' < 0$, which relaxes the budget constraint.10 However, since $dV'_b/dx' = f'_{yy}y' + f'_y > 0$, a higher money growth rate tomorrow (lower $x'$) decreases the future value of bonds, and thus their current demand. Agents will then ask for a higher return on bonds, i.e., a lower $q$, which (assuming $B' > 0$) tightens the current government budget constraint due to the added financial burden.

How a government that lacks commitment trades off distortions intertemporally depends crucially on how the anticipated reaction of future policy affects the current demand for money and bonds. Thus, the choice of debt is fundamentally determined by the interplay between the objective to smooth distortions over time and a time-consistency problem. This is the channel through which monetary policy affects debt policy.
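The sign of $dV^{c\prime}_m/dx' = u'_{xx}x' + u'_x$ depends only on the curvature of $u$: for $u_x = x^{-\sigma}$ it equals $(1-\sigma)x'^{-\sigma}$, which is negative exactly when $\sigma > 1$. The following sketch of this sign analysis assumes the constant-elasticity form and $f(y) = y$ of the illustration below, and is not the paper's code.

```python
# Sketch: decomposing e_x' = beta * d(V'_m + V'_b B')/dx' for u_x = x^(-sigma), f_y = 1.
beta, eta = 0.75, 0.5

def ex_prime(x_next, B_next, sigma):
    dVm_cons = (1.0 - sigma)*x_next**(-sigma)   # consumer part: u_xx x + u_x
    dVm_prod = 1.0                              # producer part: f_yy y + f_y
    dVb      = 1.0                              # bond part:     f_yy y + f_y
    return beta*(eta*dVm_cons + (1.0 - eta)*dVm_prod + B_next*dVb)

for sigma in (0.5, 1.0, 2.0):                   # log-utility is the knife-edge case
    print(sigma, ex_prime(x_next=0.5, B_next=1.5, sigma=sigma))
# Only sigma > 1 makes the consumer term negative, which is necessary for
# e_x' = 0 to admit positive debt (Proposition 4 below); for sigma = 2 the
# term vanishes exactly at (x', B') = (0.5, 1.5), the steady state reported
# in Section 3.3.
```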
3.3. Equilibrium characterization

The following auxiliary result will be useful in the characterization of the equilibrium.

Lemma 1. In a MPME, $\lambda = 0$ if and only if $\{\hat{x}, \hat{c}, \hat{g}\}$ is implemented in all periods.

In other words, $\lambda = 0$ if and only if the first-best is implemented. Given the incentives to smooth distortions over time, the government will not implement the first-best allocation $\{\hat{x}, \hat{c}, \hat{g}\}$ in the current period if there are distortions in the future. As a reference, from (19), the steady state debt level that implements the first-best is $\hat{B} = -1 - \alpha\hat{g}/((1-\beta)\hat{u}_x\hat{x})$.

It seems plausible that there exists a MPME with non-distortionary monetary policy, i.e., where $\mathcal{X}(B) = \hat{x}$ for all $B$. Suppose this is the case. Hence, $\mathcal{X}'_B = 0$ for all $B$ and thus, from (24), $\lambda = \lambda'$, i.e., perfect distortion smoothing at all levels of debt. Eqs. (22) and (23) imply $c' = c$ and $g' = g$. Note that $e_x < 0$ for $B > -1$, which from (21) implies $\lambda = \zeta = 0$ for all $B > -1$. Clearly, given this policy, it is also feasible to implement $\{\hat{x}, \hat{c}, \hat{g}\}$ for $B \leq -1$. Thus, our candidate equilibrium is $\{\mathcal{X}(B) = \hat{x}, \mathcal{C}(B) = \hat{c}, \mathcal{G}(B) = \hat{g}\}$ and, from (19), $\mathcal{B}(B) = \alpha\hat{g}/(\beta\hat{u}_x\hat{x}) + (1+B)/\beta - 1$. In other words, the government implements the first-best in every period through a Ponzi scheme of ever-increasing debt. The proposition below shows that this policy cannot be an equilibrium, while establishing a more general property of the MPME in this environment.

Proposition 1. There does not exist a MPME with $\mathcal{X}'_B = 0$ for all $B > \hat{B}$.

10 In this respect, the mechanism here resembles the "interest rate manipulation" argument from Lucas and Stokey (1983).
It follows, then, that in any MPME, the allocation of the day-good is necessarily distorted for some levels of debt. By Lemma 1 and conditions (22) and (23), the allocation of the night-good and the provision of the public good are also distorted for some levels of debt.

Some results below are intractable for general specifications, since they depend on the properties of the third derivatives of some functions. The following assumption provides sufficient conditions that prevent second-order effects from dominating and ensure the equilibrium is well-behaved.

Assumption 2. The following conditions hold: (i) $2f_{xx} + f_{xxx}x \geq 0$ for $x \in (0, \hat{x}]$; and (ii) $U_{cc}c - (U_c - \alpha)(1 + U_{ccc}c/U_{cc}) < 0$ for $c \in (0, \hat{c}]$.

Part (i) of Assumption 2 ensures that $\mathcal{X}(B)$ is strictly decreasing for $B > -1$, while part (ii) implies $\mathcal{C}(B)$ and $\mathcal{G}(B)$ move in the same direction (see the proof of Proposition 2 in the supplemental Appendix). Both conditions also ensure uniqueness of the steady state with positive net nominal obligations. Note that Assumption 2 is consistent with the functional forms typically adopted in calibrations; e.g., $f(x) = x^\gamma/\gamma$, $\gamma \geq 1$ and $U(c) = c^{1-\rho}/(1-\rho)$, $\rho > 0$ satisfy the above requirements.

The following proposition characterizes the MPME in this economy.

Proposition 2. A MPME is characterized by critical debt levels $\hat{B} < \tilde{B}^\infty < \cdots < \tilde{B}^0 < -1 < B^*$ and the following properties: (i) $\mathcal{X}(B) = \hat{x}$ for all $B \leq -1$ and $\mathcal{X}(B) < \hat{x}$ for all $B > -1$; (ii) $\mathcal{C}(B) = \hat{c}$, $\mathcal{G}(B) = \hat{g}$ for all $B \leq \hat{B}$, and $\mathcal{C}(B) < \hat{c}$, $\mathcal{G}(B) < \hat{g}$ for all $B > \hat{B}$; (iii) $\mathcal{B}(B) = \alpha\hat{g}/(\beta\hat{u}_x\hat{x}) + (1+B)/\beta - 1$ for all $B \leq \hat{B}$; (iv) $\mathcal{B}(B) = B$ for all $B \in [\hat{B}, \tilde{B}^\infty]$; (v) $\tilde{B}^j = (1+\tilde{B}^0)\left((1-\beta^{j+1})/(1-\beta)\right) - 1$, $j = 0, \ldots, \infty$, such that $\mathcal{B}(B)$, $\mathcal{C}(B)$ and $\mathcal{G}(B)$ are discontinuous at $B = \tilde{B}^j$ for all $j = 0, \ldots, \infty$; $\mathcal{B}(\tilde{B}^0) = -1$; $\mathcal{B}(\tilde{B}^j) = \tilde{B}^{j-1}$, $\mathcal{C}(\tilde{B}^j) = \mathcal{C}(\tilde{B}^0)$ and $\mathcal{G}(\tilde{B}^j) = \mathcal{G}(\tilde{B}^0)$ for all $j = 1, \ldots, \infty$; (vi) $\mathcal{B}(B) > -1$ and $\mathcal{X}'_B < 0$ for all $B \geq -1$; and (vii) $\mathcal{B}(B^*) = B^*$.
The supplemental Appendix solves the MPME analytically, using suitable assumptions on functional forms and parameter values. Although not a general proof, the example establishes existence of the equilibrium for a limiting case. Figs. 1 and 2 provide a numerical characterization of the MPME, to complement the results in Proposition 2.11 The numerical solution uses the following functional forms: $u(x) = (x^{1-\sigma}-1)/(1-\sigma)$; $f(x) = x$; $U(c) = \ln c$; $v(g) = \ln g$. Parameter values are $\alpha = 2$, $\beta = 0.75$, $\eta = 0.5$ and $\sigma = 2$. These assumptions imply $\hat{B} = -5$, $B^* = 1.5$ and $\mathcal{C}(B) = \mathcal{G}(B)$. This parameterization is for illustration purposes only (see Section 4.3 for a calibration to the U.S. economy).

The MPME can be analyzed using the critical debt levels identified in Proposition 2, $\hat{B} < \tilde{B}^\infty < \cdots < \tilde{B}^0 < -1 < B^*$, as reference points. These critical levels are highlighted in all figures.

For $B \leq \hat{B}$ the government has enough claims on the private sector to implement the efficient allocation $\{\hat{x}, \hat{c}, \hat{g}\}$ in every period. The first-best policy consists of contracting the money supply at the Friedman rule, $\mu = \beta - 1$, imposing zero labor taxes and supplying the first-best level of government expenditure.

For $B \in [\hat{B}, \tilde{B}^\infty]$ debt is kept constant, since the government has enough claims on the private sector to perfectly smooth distortions across time.

For $B \in (\tilde{B}^\infty, \tilde{B}^0]$, the government cannot afford to smooth distortions intertemporally without increasing the debt. Thus, distortions are perfectly smoothed across only a finite number of periods, at the cost of increasing debt. The debt function in this range is increasing and features an infinite but countable number of discontinuities at $\tilde{B}^j$, $j = 0, \ldots, \infty$. These discontinuities are generated by the "kink" in $\mathcal{X}(B)$ at $B = -1$, which in turn is caused by the non-negativity constraint (18).

When $B > \tilde{B}^0$, $\mathcal{B}(B) > -1$ and thus, the current government anticipates that future governments will distort the allocation of the day-good, i.e., it faces $\mathcal{X}'_B < 0$.12 Debt policy in this range is driven by a time-consistency problem, as explained above. For $B \in [\tilde{B}^0, B^*)$, we have $e_{x'}\mathcal{X}'_B > 0$, i.e., an increase in debt relaxes the current government budget constraint (through the effects of anticipated changes in future monetary policy on the current demand for money and bonds) and thus allows for lower distortions today relative to tomorrow. Similarly, for $B > B^*$, we have $e_{x'}\mathcal{X}'_B < 0$, which has the opposite effect and thus implies a reduction in debt.

For all debt levels $B > \tilde{B}^0$, the MPME is well-behaved (in the sense that there are no discontinuities in policy stemming from fundamentals) and features a unique steady state, $B^* > -1$. Uniqueness of this steady state follows from $\mathcal{X}'_B < 0$ for all $B \geq -1$ and $e_{x'} = 0$ having a unique solution at $B^*$ (see the proof of Proposition 2, part (vii), in the supplemental Appendix).

Let us now revisit the mechanism that drives the change in debt. Given $\mathcal{X}'_B < 0$, an increase in debt today implies a decrease in day-good consumption tomorrow, since the government will have a larger incentive to inflate. The anticipated response in future monetary policy affects agents' decisions today, and thus the current government budget constraint. This effect is accounted for by $e_{x'}$, which, as explained above, measures the changes in the current demand for money and bonds due to changes in future policy.
For simplicity of exposition, suppose the government is considering an increase in debt above zero; the resulting reduction in day-good consumption tomorrow lowers the marginal value of money if the agent becomes a producer ($(dV^{p\prime}_m/dx')\mathcal{X}'_B < 0$) and decreases the demand for bonds today ($(dV'_b/dx')\mathcal{X}'_B < 0$). Both effects tighten the current government budget constraint. However, if $(dV^{c\prime}_m/dx')\mathcal{X}'_B > 0$ (which, given $\mathcal{X}'_B < 0$, occurs generically if, for example, $u(x) = (x^{1-\sigma}-1)/(1-\sigma)$ with $\sigma > 1$), the lower day-good consumption tomorrow leads to an increase in the marginal value of money for tomorrow's consumers. If this effect is sufficiently large to offset the decrease in value for producers (a calculation which also depends critically on $\eta$), then there is an increase in the total demand for money today. In turn, if the increase in the overall demand for money more than compensates for the higher interest paid on bonds, then $e_{x'}\mathcal{X}'_B > 0$, i.e., issuing more debt today allows the government to lower the current distortions necessary to finance its expenditure. The cost associated with this higher debt is an increase in the wedge between distortions today and tomorrow.

[Figure 1 here. Panel: end-of-period debt $B'$ against beginning-of-period debt $B$, with the 45° line.]
Fig. 1. Markov-perfect monetary equilibrium: debt policy. Note: This figure displays end-of-period debt, $B'$, as a function of beginning-of-period debt, $B$. If initial debt is in $[\hat{B}, \tilde{B}^\infty]$, then debt is held constant in every period. If initial debt is larger than $\tilde{B}^\infty$, then debt converges to $B^*$ in the long run. Debt is normalized by the aggregate money stock.

[Figure 2 here. Panels: day-good consumption; night-good consumption and government expenditure; money growth and tax rates; Lagrange multipliers.]
Fig. 2. Markov-perfect monetary equilibrium: allocations and policy. Note: This figure displays equilibrium allocations and policy variables as functions of beginning-of-period debt, $B$. Critical debt levels $\hat{B} < \tilde{B}^\infty < \tilde{B}^0 < -1 < B^*$, as characterized in Proposition 2 and Fig. 1, are highlighted. The Lagrange multipliers are associated with the constraints in the government's problem. Debt is normalized by the aggregate money stock.

11 The numerical computation of Markov-perfect equilibria for dynamic policy games has been described extensively elsewhere. The method used here follows the projection algorithm described in Martin (2009, 2010a), with the caveat that one needs to account for the (countable) discontinuities in policy. The code is available upon request.
12 Note that Proposition 2 only establishes $\mathcal{B}(B) > -1$ for $B \geq -1$. Assuming $\mathcal{B}(B)$ is increasing, it follows from parts (v) and (vi), plus Lemma 2 in the appendix, that $\mathcal{B}(B) > -1$ for $B \in (\tilde{B}^0, -1)$ as well. This property is verified by the numerical solution.
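Footnote 11 references a projection algorithm; the paper's own code is not reproduced here. As a rough illustration of the fixed-point logic only (iterating jointly on the anticipated policy $\mathcal{X}(B)$ and the value $V(B)$), the sketch below solves a deliberately simplified version of the problem on a debt grid around $B^*$ for the Section 3.3 parameterization, exploiting the closed forms $x = (1 + 2\lambda(1+B))^{-1/2}$ and $c = g = 1/(2+2\lambda)$ implied by (21)–(23). The grid bounds, tolerance and loop limits are arbitrary choices, and the discrete grid search is far cruder than the projection method used in the paper.

```python
# Rough sketch of the MPME fixed point for the Section 3.3 parameterization.
# Not the paper's algorithm: grid search over B', root-finding on the multiplier.
import numpy as np
from scipy.optimize import brentq

alpha, beta, eta = 2.0, 0.75, 0.5
u  = lambda x: 1.0 - 1.0/x            # u(x) = (x^(1-sigma)-1)/(1-sigma), sigma = 2
ux = lambda x: x**-2.0

def static_choices(lam, B):
    # FOCs (21)-(23) with zeta = 0: eta(u_x - 1) = lam(1+B), 1/c - 2 = 2 lam, g = c.
    x = (1.0 + 2.0*lam*(1.0 + B))**-0.5
    c = 1.0/(2.0 + 2.0*lam)
    return x, c

def budget_gap(lam, B, Bp, xp):
    # Residual of eq. (19) given the anticipated future allocation xp = X(B').
    x, c = static_choices(lam, B)
    return (1.0 - 4.0*c
            + beta*(eta*xp*(ux(xp) - 1.0) + xp*(1.0 + Bp))
            - x*(1.0 + B))

grid = np.linspace(1.0, 2.0, 21)       # debt grid bracketing B* = 1.5
X, V = np.full(21, 0.5), np.zeros(21)  # initial guesses for policy and value

for _ in range(300):
    Xn, Vn, Bn = X.copy(), V.copy(), grid.copy()
    for i, B in enumerate(grid):
        best, argbest = -np.inf, (X[i], B)
        for j, Bp in enumerate(grid):
            try:
                lam = brentq(budget_gap, 0.0, 50.0, args=(B, Bp, X[j]))
            except ValueError:
                continue               # no feasible multiplier for this candidate B'
            x, c = static_choices(lam, B)
            # Period utility eta u(x) - (1-eta) f(y) + U(c) - alpha(c+g) + v(g),
            # with y = x when eta = 0.5, f(y) = y, U = v = ln, and g = c.
            val = eta*u(x) - (1.0 - eta)*x + 2.0*np.log(c) - 2.0*alpha*c + beta*V[j]
            if val > best:
                best, argbest = val, (x, Bp)
        Vn[i], (Xn[i], Bn[i]) = best, argbest
    done = np.max(np.abs(Vn - V)) < 1e-10 and np.max(np.abs(Xn - X)) < 1e-10
    X, V = Xn, Vn
    if done:
        break

k = np.argmin(np.abs(grid - 1.5))
print(Bn[k], X[k])   # expect approximately B(1.5) = 1.5 and X(1.5) = 0.5
```

On this restricted grid the non-negativity constraint (18) is slack (Proposition 2 implies $\zeta = 0$ for $B > -1$), which is what licenses the closed-form static choices above; the binding region below $-1$ would require a separate branch.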
Monetary and tax policies in the MPME depend on the level of debt as follows. First, monetary policy creates an intratemporal distortion ($x < \hat{x}$) for all $B > -1$, due to the incentives to inflate away nominal liabilities. In turn, the anticipated distortion in future monetary policy implies an intertemporal distortion for all $B > \tilde{B}^0$ (see equilibrium conditions (14) and (17), and footnote 12). The Friedman rule, $\mu = \beta - 1$, is thus implemented only for $B \leq \tilde{B}^0$. Second, tax rates are zero for all $B \leq \hat{B}$ and positive otherwise. In the range where monetary policy is anticipated not to be distortionary in the following period (i.e., $B \leq \tilde{B}^0$), taxes are constant over time. For $B > \tilde{B}^0$, taxes are increasing in debt.

3.4. Long-run debt

The steady state at $B^*$ satisfies $e_{x'} = 0$ (see (24) and Proposition 2) and is characterized by $\{x^*, c^*, g^*\}$ solving (19), (21) and (22), where $\lambda^* = (v^*_g/\alpha) - 1$, $\zeta^* = 0$ and

$$B^* = -\frac{\eta(u^*_x + u^*_{xx}x^*)}{f^*_y + f^*_{yy}y^*} - 1 + \eta. \qquad (25)$$
The condition $e_{x'} = 0$ means that the anticipated effects of variations in debt on the current demand for money and bonds are exactly offset at $B^*$. In other words, the government has no incentive to substitute distortions intertemporally and thus keeps its debt constant. Fig. 1 suggests that for a large set of initial debt levels, government debt converges to $B^*$. For a limiting case, convergence of debt to $B^*$ can be established formally.

Proposition 3. Assume $v_g = \psi g^{-\nu}$, $\psi > \alpha$, $\nu > 0$. Then, as $\nu \to 0$, $\mathcal{B}(B) \to B^*$ for all $B \geq -1$.

This result, which allows us to focus on $B^*$ for positive analysis, has not been previously shown with such generality.13 Numerical simulations, including the example used above, show that the stability of $B^*$ holds for more general specifications of $v(g)$.

One pervasive feature of cross-country data is that governments typically issue significant amounts of debt. Proposition 2 only establishes that $B^* > -1$. Thus, there may exist parameterizations for which long-run debt is negative. The proposition below relates long-run debt to the model's fundamentals and states conditions for positive long-run debt. In the supplemental Appendix, these results are extended for a case where the MPME can be solved analytically.

Proposition 4. Properties of $B^*$: (i) $B^* > 0$ only if $-x^*u^*_{xx}/u^*_x > 1$; (ii) as $\eta \to 1$, $B^* > 0$ if and only if $-x^*u^*_{xx}/u^*_x > 1$; (iii) as $\eta \to 0$, $B^* \to -1$; and (iv) if $u_x = x^{-\sigma}$, $\sigma > 0$, and $f_x = \varphi > 0$, then $\eta\sigma > 1$ is a sufficient condition for $B^* > 0$.

The size of long-run debt depends crucially on the curvature of the utility function for the day-good, $u(x)$, and the measure of buyers in the day market, $\eta$. This result follows from the discussion in the previous section: the curvature of $u(x)$ and the value of $\eta$ determine how changes in monetary policy tomorrow (induced by changes in debt today) affect the current demand for money and thus, the incentives to increase or decrease debt. Part (i) of Proposition 4 implies that if $u(x)$ has a constant elasticity of substitution, then its curvature needs to be higher than logarithmic for debt to be positive in the long run; part (ii) shows this condition is also sufficient as the measure of buyers approaches one.14 Part (iii) shows that as the measure of buyers converges to zero (which implies expansionary monetary policy tomorrow always leads to a lower demand for money today), long-run net nominal obligations converge to zero as well. Part (iv) provides a sufficient condition for positive long-run debt, for typically used functional forms.15

The results in Proposition 4 are closely related to those obtained by Martin (2009) for economies with a cash-in-advance constraint. In those models, the curvature of the cash-good utility plays the same role as the day-good utility here for the determination of long-run debt. A new result in this case is that the measure of buyers (an integral feature of the Lagos–Wright environment) also has a critical effect on the determination of debt. It is worth noting, however, that a similar result can be obtained in a reduced-form cash-credit model.16 Consider a cash-in-advance constraint of the form $px \leq \eta m$, where $x$ is the cash-good, $p$ is the (normalized) price level and $\eta \geq 0$ is a measure of the real balances necessary to purchase a given quantity of cash-goods. Following an analysis similar to Martin (2009), one can show that, if the marginal utility with respect to the credit-good is constant, then $B^*$ depends on $u(x)$ and $\eta$ in the way prescribed by Proposition 4.
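For the Section 3.3 illustration, the steady state can be computed directly: substitute (25) into (21) and (22) and root-find on $x^*$ using (19). The sketch below does this and recovers $B^* = 1.5$; it is a verification of the reported value under those functional forms, not the paper's code.

```python
# Sketch: steady state (x*, c*, g*, B*) for the Section 3.3 parameterization.
from scipy.optimize import brentq

alpha, beta, eta = 2.0, 0.75, 0.5      # u_x = x^(-2), f_y = 1, U_c = 1/c, v_g = 1/g

def B_star_of(x):
    ux, uxx = x**-2.0, -2.0*x**-3.0
    return -eta*(ux + uxx*x) - (1.0 - eta)          # eq. (25) with f_yy = 0

def steady_gap(x):
    ux = x**-2.0
    B = B_star_of(x)
    lam = eta*(ux - 1.0)/(1.0 + B)                  # eq. (21) with zeta = 0
    c = 1.0/(2.0 + 2.0*lam)                         # eq. (22); eq. (23) gives g = c
    # eq. (19) evaluated at a steady state: x' = x, B' = B, g = c
    return (1.0 - 2.0*c) - alpha*c + beta*(eta*x*(ux - 1.0) + x*(1.0 + B)) - x*(1.0 + B)

x_star = brentq(steady_gap, 0.2, 0.99)
print(x_star, B_star_of(x_star))                    # 0.5 and B* = 1.5
```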
Note that for equivalence between the micro-founded and reduced-form environments, we need $\eta \in (0,1)$. One way to justify this assumption in a cash-in-advance model is to assume that money balances are distributed among household members before their role as shoppers or producers is revealed. Then, only shoppers would use accumulated cash balances to purchase goods.

The findings from reduced-form monetary models, regarding the determination of long-run debt in the absence of commitment, are largely confirmed here with the adoption of a micro-founded approach, since in both cases the critical element is how future changes in policy affect the incentives to increase debt today. What we gain by explicitly modeling money micro-foundations is a better understanding of the role played by the money demand in determining these incentives.

13 Martin (2009) shows stability of the distortionary steady state in a cash-in-advance economy, assuming no labor taxes and specific functional forms for preferences.
14 Waller (forthcoming) shows that for the Lagos–Wright framework to be consistent with a balanced growth path, we need to assume log-utility in the day-good. If we introduce capital as in Aruoba et al. (forthcoming), then inflation also taxes capital holdings; thus, monetary policy interacts with capital accumulation. The question remains whether this added effect is enough to allow the case with log-utility to feature a realistic level of long-run debt.
15 It is easy to show that the same condition ($\eta\sigma > 1$) also guarantees $dV_m/dx < 0$ for all $x \leq \hat{x}$. Thus, for typical applications of the theory, an increase in debt today, which triggers a faster expansion of the money supply tomorrow, will always lead to an increase in current money demand.
16 I thank the referee for suggesting this possibility.

4. Government policy with and without commitment

This section compares the MPME with the optimal policy under commitment. First, it is established that long-run policy in the MPME is time-consistent. Second, the policies with and without commitment are compared qualitatively. Third, the welfare loss due to lack of commitment is estimated for a calibrated economy.

4.1. Time-consistency of long-run policy

As shown in the previous section, the steady state at $B^*$ satisfies $e_{x'} = 0$ and $\mathcal{X}'_B < 0$. That is, even though small changes in debt choice at $B^*$ would alter future policy, the positive and negative effects of these changes on the current government budget constraint are balanced out. In other words, the time-consistency problem, which is driving the change in debt, cancels out at the steady state. It follows, then, that endowing a government at $B^*$ with a commitment technology would have no effect on policy.17 The statement below formalizes this argument.

Proposition 5. Suppose initial debt is equal to $B^*$. Then, a government with commitment and a government without commitment will both implement the allocation $\{x^*, c^*, g^*\}$ and choose debt level $B^*$ in every period.

The key property that drives this result is that the time-consistency problem is internalized through a single good (the term $e_{x'}\mathcal{X}'_B$ in (24)), and so opposing effects are not only balanced, but eliminated at $B^*$. It follows that time-consistency of long-run debt should also hold in other monetary environments, under suitable assumptions. For example, the cash-credit model analyzed in Martin (2009) generically features time-consistency problems through both cash and credit goods; however, if utility is linear in the credit good, then the time-consistency problem is internalized through a single good and it is straightforward to derive a result analogous to Proposition 5. Similarly, Díaz-Giménez et al. (2008) report that, in a cash-in-advance economy with no credit goods and no taxes, policy under lack of commitment converges to a time-consistent steady state.

Another implication of Proposition 5 is that time-consistency of the optimal (commitment) policy is not necessarily linked to the optimality of the Friedman rule, as previously suggested by the results in Alvarez et al. (2004). At $B^*$ there is no time-consistency problem, even though the government is inflating away its nominal liabilities.

4.2. Markov-perfect and Ramsey policies

With commitment, the government fully internalizes how changes in monetary policy affect decisions in preceding periods. After the initial period, there are no incentives to shift distortions intertemporally and the optimal (Ramsey) policy is to smooth them perfectly.18 This prescription contrasts sharply with the conduct of policy absent commitment. Fig. 3 compares debt and monetary policies with and without commitment, using the parameterization from Section 3.3.
Graphs with tax rates and government expenditure are omitted to save space. As shown above, without commitment, debt converges to $B^*$ in the long run if initial debt is greater than $\tilde{B}^\infty$. For $B \geq \tilde{B}^0$ the money growth and tax rates are increasing in debt, whereas government expenditure is decreasing in debt.

With commitment, there are two debt policies, one for the initial period and one for all other periods (the 45° line). The government increases or decreases debt in the first period, depending on the initial level $B_0$, and holds it constant thereafter. Note that any level of debt can be supported in the long run, with an appropriate choice of initial debt. The government also distinguishes the initial period from all other periods for its monetary policy. One distinct property of the commitment solution is that $\hat{x}$ is never implemented after the initial period for all $B_0 > \hat{B}$, a result also highlighted in Aruoba and Chugh (2010). Thus, even though $x_0 = \hat{x}$ for all $B_0 \leq -1$, the Friedman rule is never implemented for any $B_0 > \hat{B}$.19 Taxes and expenditure are constant in all periods and a function of initial debt.

Regardless of the government's ability to commit, policy in this environment creates intertemporal distortions in the long run, for a large set of initial debt levels. This result contrasts with the findings in Albanesi and Armenter (2009), who show zero intertemporal distortions in the long run for a wide class of second-best non-monetary economies.
17 Note, however, that a commitment technology would still significantly affect how policy responds to aggregate shocks.
18 The derivation of the Ramsey policy is well understood and thus, omitted here. See the proof of Proposition 5 in the supplemental Appendix for a formulation and characterization of the Ramsey problem.
19 Aruoba and Chugh (2010) further explain why the structure of preferences in the Lagos–Wright framework does not satisfy the conditions established by Chari et al. (1991) for optimality of the Friedman rule.
[Figure 3 here. Panels: debt policy ($B'$ against $B$, with the 45° line) and monetary policy ($\mu$ against $B$).]
Fig. 3. Markov-perfect and Ramsey policies. Note: This figure compares government policy with and without commitment. $\mathcal{B}(B)$ and $\mu(B)$ (solid lines) correspond to the Markov-perfect (lack of commitment) policies, which are functions of beginning-of-period debt. $B_t(B_0)$ and $\mu_t(B_0)$ (dashed lines) correspond to the Ramsey (commitment) policies in period $t$, which are functions of debt in the initial period, where $B_t(B_0) = B_1(B_0)$ and $\mu_t(B_0) = \mu_1(B_0)$ for all $t \geq 1$. Debt is normalized by the aggregate money stock.
4.3. Welfare

The policies with and without commitment differ most notably in how fast they converge to the long run. With commitment, debt changes at most once, in the initial period, and government policy is kept fixed afterwards. Without commitment, policy gradually converges to a unique steady state with positive nominal liabilities, assuming initial debt is sufficiently large. To evaluate whether the differences in policy implementation across regimes have a significant impact on welfare, one can calibrate the MPME at $B^*$ and then evaluate the welfare loss due to lack of commitment for all levels of debt.

The model has definite predictions for the following policy variables: debt, inflation, taxes, interest rate, expenditure and velocity of circulation. Following Martin (2009), the tax rate is determined residually by all other variables, to satisfy the government budget constraint. All calibration targets are taken from U.S. annual data for the period 1962–2006.

Consider the functional forms: $u(x) = (x^{1-\sigma}-1)/(1-\sigma)$, $f(x) = \varphi x$, $U(c) = (c^{1-\rho}-1)/(1-\rho)$ and $v(g) = \ln g$. Set $\eta = 0.5$, which leaves the following parameters to calibrate: $\alpha$, $\beta$, $\varphi$, $\rho$ and $\sigma$.

Define nominal GDP as the sum of nominal output in the day and night markets. Abusing notation slightly, let $Y$ be nominal GDP normalized by the aggregate money stock, i.e., $Y = \eta\tilde{p}x + p(c+g)$. Let $C \equiv pc$ and $G \equiv pg$ and recall that in equilibrium, $\tilde{p}x = 1$. Given $\eta = 0.5$, we get $Y = 0.5 + C + G$. Note that by the equation of exchange, velocity of circulation is defined as nominal GDP divided by the aggregate money stock. Thus, the first target is to set $Y^*$ equal to the velocity of circulation in the data. Following the literature, take M1 as the measure of money, which implies a velocity equal to 6.3.

In steady state, the inflation rate is equal to the money growth rate, $\mu^*$. Using the CPI as the measure of the price level, the inflation rate for the period averaged 4.4% annually. The third target is debt over GDP. In the data, government debt is measured at the end of the period. Thus, the relevant measure in the model is $B^*(1+\mu^*)/Y^*$. In the U.S., debt over GDP, excluding holdings by federal agencies and the Federal Reserve Banks, averaged 30.8% between 1962 and 2006. Evaluating (14) and (17) in steady state implies $q^*(1+\mu^*) = \beta$. Note that $q^*$ is the inverse of the gross nominal interest rate. Take the 1-year treasury constant maturity rate, which averaged 6.3% annually in the period considered. Thus, $\beta = 1.044/1.063 \approx 0.982$. The last target is government expenditure over GDP, $G^*/Y^*$. Federal government outlays, net of debt interest payments (which the model accounts for in the discounted price of bonds, $q$), averaged 18.2% per year in terms of GDP. Table 1 summarizes the parameter choice and calibration targets.

Using this parameterization, we can calculate the one-time fee that agents are willing to pay to switch from the MPME to the Ramsey policy, expressed in terms of period-consumption. That is, the function $\Delta(B)$ that satisfies

$$\eta u(\mathcal{X}(B)(1+\Delta(B))) - (1-\eta)f\left(\frac{\eta\mathcal{X}(B)}{1-\eta}\right) + U(\mathcal{C}(B)(1+\Delta(B))) - \alpha(\mathcal{C}(B) + \mathcal{G}(B)) + v(\mathcal{G}(B)) + \beta V(\mathcal{B}(B)) = V^R(B),$$

where $V^R(B)$ is the present value utility under the Ramsey policy, for any given initial debt.

Table 2 shows the equivalent compensation measure $\Delta$, evaluated at selected beginning-of-period debt levels. For the benchmark calibration, $\tilde{B}^\infty$ features the highest $\Delta$: 0.44%. Note that this is not a general result; e.g., using the simple parameterization of Sections 3.3 and 4.2, $\Delta$ peaks somewhere in between $\tilde{B}^\infty$ and $\tilde{B}^0$.
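The calibration arithmetic above can be reproduced in a few lines. The sketch below computes $\beta$ from the inflation and nominal-rate targets and backs out the steady-state model quantities implied by the remaining targets; solving for the deep parameters $(\alpha, \varphi, \rho, \sigma)$ reported in Table 1 additionally requires solving the model itself, which is omitted here.

```python
# Sketch: calibration targets for the U.S., 1962-2006 (Section 4.3).
velocity  = 6.3      # M1 velocity: Y* = nominal GDP / money stock
inflation = 0.044    # CPI inflation = steady-state money growth mu*
nominal_i = 0.063    # 1-year treasury constant maturity rate
debt_gdp  = 0.308    # end-of-period debt over GDP: B*(1+mu*)/Y*
gov_gdp   = 0.182    # federal outlays net of interest over GDP: G*/Y*

beta = (1.0 + inflation)/(1.0 + nominal_i)   # from q*(1+mu*) = beta, q* = 1/(1+i)
Y    = velocity
B    = debt_gdp*Y/(1.0 + inflation)          # steady-state bond-money ratio B*
G    = gov_gdp*Y                             # normalized government spending G*
C    = Y - 0.5 - G                           # Y = 0.5 + C + G when eta = 0.5

print(round(beta, 3), round(B, 3), round(G, 3), round(C, 3))
# beta = 0.982, matching Table 1; B*, G*, C* are implied steady-state quantities
```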
For more empirically relevant debt levels, say between 0 and $2B^*$, $\Delta$ is at most 0.04%, which is quite low when compared to typical welfare measures in macroeconomics (e.g., the cost of business cycles or the cost of 10% annual inflation).

Consider two alternative calibrations to help evaluate the generality of the above results: one which targets twice the U.S. debt over GDP ratio and one which targets zero steady state debt. All other targets remain at benchmark. The economies are recalibrated following the procedure described above. As we can see in Table 2, even though these alternative economies are significantly different from the benchmark case, the welfare loss due to lack of commitment remains low for all levels of debt. The highest value for $\Delta$ is about 1.24%, at $\tilde{B}^\infty$ for the economy featuring $B^* = 0$.
Table 1
Calibration.

Target statistics                 Parameter values
B*(1 + m*)/Y*  = 0.308            α = 4.836
m*             = 0.044            β = 0.982
G*/Y*          = 0.182            φ = 1.107
(1/q*) − 1     = 0.063            ρ = 9.108
Y*             = 6.300            σ = 5.191

Note: Steady state variables: B*(1 + m*)/Y* is the end-of-period debt to GDP ratio; m* is the money growth rate, which is equal to the inflation rate; G*/Y* is government expenditure over GDP; (1/q*) − 1 is the nominal interest rate, which depends inversely on the price of bonds; and Y* is nominal GDP normalized by the aggregate money stock, which is equal to the velocity of circulation. Parameters: α is the disutility of night labor; β is the discount factor; φ is the utility cost of producing the day-good; ρ and σ determine the curvature of the utility function for the night and day goods, respectively. The calibration assumes an equal measure of consumers and producers in the day market, i.e., η = 0.5.
Table 2
Welfare loss due to lack of commitment.

        Benchmark              Double long-run debt    Zero long-run debt
        B/Y      Δ(B) (%)      B/Y      Δ(B) (%)       B/Y      Δ(B) (%)
B̂      −10.10    0.00         −10.10    0.00          −10.18    0.00
B̃¹      −2.33    0.44          −2.18    0.27           −2.58    1.24
B̃⁰      −0.21    0.12          −0.20    0.10           −0.22    0.17
−1       −0.17    0.10          −0.17    0.10           −0.17    0.11
0         0.00    0.04           0.00    0.06            0.00    0.00
B*        0.29    0.00           0.57    0.00            0.00    0.00
2B*       0.56    0.04           1.09    0.06            0.00    0.00

Note: B/Y is the beginning-of-period debt to output ratio. Δ(B) is the one-time fee that an agent would be willing to pay to switch from the Markov-perfect to the Ramsey policy, expressed in terms of period-consumption, as a function of debt. B̂ < B̃¹ < B̃⁰ < −1 < B* are critical debt levels, as characterized in Proposition 2 and Fig. 1. "Benchmark" uses the parameters from Table 1. "Double long-run debt" targets B*(1 + m*)/Y* = 0.6 and features: α = 4.493, φ = 1.106, ρ = 15.003 and σ = 8.322. "Zero long-run debt" targets B*(1 + m*)/Y* = 0 and features: α = 7.684, φ = 1.175, ρ = 3.218 and σ = 1.888. The discount factor β is set at benchmark in all cases.
For more empirically relevant values of debt (i.e., between zero and twice steady state debt), Δ is at most 0.06%, as obtained for the case with B*(1 + m*)/Y* = 0.6. To further verify the robustness of the welfare results, consider perturbing parameter values around benchmark. The welfare loss due to lack of commitment can be increased by alternatively decreasing α or ρ, or increasing η, σ or φ. Consider increasing/decreasing one parameter at a time by 25%, which is a significant departure from benchmark. The largest effect on welfare arises from varying ρ. E.g., decreasing ρ to 6.831 yields Δ(B̃¹) ≈ 0.81% and Δ(0) ≈ Δ(2B*) ≈ 0.08%. All these figures are significantly larger than at benchmark, but still small.

The reason for the low welfare numbers is that the governments with and without commitment share the objective of smoothing distortions across periods and, as it turns out from the numerical work above, are similarly effective at this task. For example, consider starting at zero debt using the benchmark calibration. The Ramsey government sets a permanent labor tax rate of 19.0%; the money growth rate is 2.7% in the initial period and a fixed 4.2% thereafter. The Markov government starts with a 17.0% tax rate and a 0.3% money growth rate; both rates increase as debt builds up, converging towards 19.6% and 4.4%, respectively. Compared to the optimal policy, the government without commitment trades off (significantly) lower distortions in the short run for (slightly) larger distortions in the long run.

Although the welfare loss due to lack of commitment is typically small and even vanishes completely at the MPME steady state, this friction still plays an important conceptual role, as it provides a mechanism that explains the level of debt. In contrast, the commitment (Ramsey) solution, although useful as a normative benchmark, is time-inconsistent and has no meaningful prediction for the level of debt and thus, for government policy in general. The welfare results suggest that the search for institutional arrangements that would induce the government to behave closer to the optimal Ramsey policy should not be a primary concern. When commitment is assumed, many parameterizations are consistent with observed debt to output ratios; initial debt is a free parameter. Thus, if the Ramsey problem is formulated to conduct normative analysis, one could use the Markov-perfect steady state to determine the primitives of the environment. This would provide a justification for the normative analysis.
5. Concluding remarks

The model presented in this paper offers several attractive properties for policy analysis from a positive perspective. There exists a unique steady state with positive net nominal government liabilities, which is stable, time-consistent and features positive taxes and inflation above the Friedman rule. It is straightforward to verify that these properties would survive a number of extensions to the basic environment, e.g., financial intermediation and trading frictions. Martin (2010a) analyzes these extensions in detail and further studies the empirical plausibility of this framework.
Acknowledgments

I thank David Andolfatto, Stephen Easton, an anonymous referee, the Associate Editor, Christopher Sleet, and the Editor, Robert King, for helpful comments and suggestions.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2011.02.001.
References

Aiyagari, S.R., McGrattan, E.R., 1998. The optimum quantity of debt. Journal of Monetary Economics 42 (3), 447–469.
Albanesi, S., Armenter, R., 2009. Intertemporal distortions in the second best. Mimeo.
Alvarez, F., Kehoe, P.J., Neumeyer, P.A., 2004. The time consistency of optimal monetary and fiscal policies. Econometrica 72 (2), 541–567.
Aruoba, S.B., Chugh, S.K., 2010. Optimal fiscal and monetary policy when money is essential. Journal of Economic Theory 145 (5), 1618–1647.
Aruoba, S.B., Waller, C., Wright, R. Money and capital. Journal of Monetary Economics, forthcoming.
Barro, R.J., 1979. On the determination of the public debt. The Journal of Political Economy 87 (5), 940–971.
Battaglini, M., Coate, S., 2008. A dynamic theory of public spending, taxation and debt. American Economic Review 98 (1), 201–236.
Berentsen, A., Waller, C., 2008. Outside versus inside bonds. Working Paper No. 372, Institute for Empirical Research in Economics, University of Zurich.
Bohn, H., 1998. The behavior of U.S. public debt and deficits. The Quarterly Journal of Economics 113 (2), 949–963.
Chari, V.V., Christiano, L.J., Kehoe, P.J., 1991. Optimal fiscal and monetary policy: some recent results. Journal of Money, Credit and Banking 23 (3), 519–539.
Diamond, P.A., 1965. National debt in a neoclassical growth model. American Economic Review 55 (5), 1126–1150.
Díaz-Giménez, J., Giovannetti, G., Marimón, R., Teles, P., 2008. Nominal debt as a burden on monetary policy. Review of Economic Dynamics 11 (3), 493–514.
Klein, P., Krusell, P., Ríos-Rull, J.-V., 2008. Time-consistent public policy. The Review of Economic Studies 75 (3), 789–808.
Kocherlakota, N., 1998. Money is memory. Journal of Economic Theory 81, 232–251.
Krusell, P., Smith, A., 2003. Consumption-savings decisions with quasi-geometric discounting. Econometrica 71 (1), 365–375.
Lagos, R., Wright, R., 2005. A unified framework for monetary theory and policy analysis. The Journal of Political Economy 113 (3), 463–484.
Lucas, R.E., 1986. Principles of fiscal and monetary policy. Journal of Monetary Economics 17, 117–134.
Lucas, R.E., Stokey, N.L., 1983. Optimal fiscal and monetary policy in an economy without capital. Journal of Monetary Economics 12 (1), 55–93.
Martin, F.M., 2009. A positive theory of government debt. Review of Economic Dynamics 12 (4), 608–631.
Martin, F.M., 2010a. Government policy in monetary economies. SFU Discussion Papers 10-01.
Martin, F.M., 2010b. Markov-perfect capital and labor taxes. Journal of Economic Dynamics and Control 34 (3), 503–521.
Maskin, E., Tirole, J., 2001. Markov perfect equilibrium. Journal of Economic Theory 100 (2), 191–219.
Shi, S., 2006. Viewpoint: a microfoundation of monetary economics. Canadian Journal of Economics 39 (3), 643–688.
Shin, Y., 2006. Ramsey meets Bewley: optimal government financing with incomplete markets. Mimeo.
Telyukova, I., Wright, R., 2008. A model of money and credit, with application to the credit card debt puzzle. The Review of Economic Studies 75 (2), 629–647.
Wallace, N., 1998. A dictum for monetary theory. Federal Reserve Bank of Minneapolis Quarterly Review 22 (1), 20–26.
Wallace, N., 2001. Whither monetary economics? International Economic Review 42 (4), 847–869.
Waller, C.J. Random matching and money in the neoclassical growth model: some analytical results. Macroeconomic Dynamics, forthcoming.
Williamson, S., Wright, R., 2010. New monetarist economics: models. In: Friedman, B.M., Woodford, M. (Eds.), Handbook of Monetary Economics, vol. 3. North-Holland, Amsterdam.
Inattention, wealth inequality and equilibrium asset prices
Daria Finocchiaro
Sveriges Riksbank, Research Division, SE-103 37 Stockholm, Sweden
Article history: Received 19 December 2008; Received in revised form 2 March 2011; Accepted 3 March 2011; Available online 15 March 2011.

Abstract: Heterogeneity in planning propensity affects wealth inequality and asset prices. This paper presents an economy where attentive agents plan their consumption period by period, while inattentive agents plan every other period. Inattentive consumers face more uncertainty and trade at unfavorable prices. If the only source of uncertainty is future income, inattentive consumers accumulate more wealth. In contrast, with uncertain asset returns inattentive investors accumulate less wealth. Asset prices must induce attentive consumers to voluntarily bear the burden of adjusting to aggregate shocks and, as a result, are much more volatile than in a representative agent model with full attention.
1. Introduction

Survey evidence suggests that there are costs of gathering and processing information and that these costs lead to infrequent planning or even a complete lack of planning, with great heterogeneity among households.¹ This paper proposes a model where heterogeneous infrequent planning leads to wealth heterogeneity and excess volatility of asset prices in general equilibrium. Lusardi (2003) and Ameriks et al. (2003) empirically find that infrequent planning leads to lower saving and considerable wealth heterogeneity among households with otherwise similar characteristics. However, this is at odds with the existing literature on infrequent planning: in a partial equilibrium model with fixed interest rates, Reis (2006) shows that consumers who plan infrequently face more uncertainty and save more for precautionary reasons.

I assume that agents are heterogeneous only in their propensity to plan: attentive agents plan their consumption period by period, while inattentive agents plan every other period. In general equilibrium, the inattentive group suffers from an adverse correlation between asset prices and savings. By setting a plan for consumption, an inattentive consumer will let her savings automatically adjust to the aggregate shock. In general, she will accumulate more assets when asset prices are high and reduce her asset holding when prices are low. This adverse "terms-of-trade" effect could, in theory, lead to lower wealth. Since the main channel through which infrequent planning affects the wealth distribution is an increase in uncertainty, it becomes crucial to distinguish between income and return uncertainty. It is shown that, when the only source of uncertainty is future income, inattentive consumers still accumulate more wealth, despite the adverse terms-of-trade effect. In contrast, when asset returns are uncertain, inattentive agents choose to take less financial risk and accumulate less wealth in the long run. In this case, attentive agents bear the cost of living in a society where half of the population is inattentive. They experience a more volatile consumption profile and attain a lower level of welfare than in a representative agent model. In a stylized general equilibrium portfolio choice model that combines both sources of uncertainty, inattentive agents invest more in bonds and less in equities. Thus, inattentiveness can account for the finding in Lusardi and Mitchell (2007) and Lusardi (2003), i.e. infrequent planners take less financial risk.
¹ See Lusardi (2003) for a review of the recent empirical evidence on planning and saving behavior. Heterogeneity in planning behavior might arise if planning depends on other people's experience, as individuals learn how to plan from their siblings or their parents, or if planning is related to financial literacy.
Turning to asset price implications, inattention increases asset return volatility by 30%. Asset prices must induce attentive consumers to voluntarily bear the burden of adjusting to aggregate shocks, since inattentive agents are unable to do so at non-planning periods. Both LeRoy and Porter (1981) and Shiller (1981) note that the stock price volatility observed in the data is hardly justifiable by fundamentals; this is the well-known "excess volatility puzzle". A standard consumption-based asset pricing model fails in reconciling the low volatility of consumption growth with the high volatility of stock returns observed in the data. This finding has led to a number of potential explanations. To name a few, the literature has proposed models with uninsurable labor risk (Heaton and Lucas, 1996), habit formation (Campbell and Cochrane, 1999), transitory income shocks (Rodriguez, 2005) or behavioral factors (Dumas et al., 2009). This paper proposes a novel channel to generate asset price volatility: inattentiveness.²

Reis (2006) studies a partial equilibrium consumption/saving model where costly information acquisition leads to infrequent planning (inattentiveness). Both Gabaix and Laibson (2002) and Abel et al. (2007) compute the optimal degree of inattention to the stock market. The latter find that even small observational costs can imply a substantial degree of inattention. A different branch of literature has focused on the implications of infrequent planning without specifying the rationale behind it (e.g. Caballero, 1995; Lynch, 1996). The present paper is most closely connected to this strand of the literature since it postulates that a fraction of the population plans infrequently. However, it goes further by considering a general equilibrium model with endogenous asset prices. Close to the spirit of this paper, the work of McKay (2010) predicts a low degree of stock market participation among households who put minimal effort into managing their portfolio.

An outline of the paper follows. Section 2 presents a model with inattentiveness and income uncertainty. Section 3 analyzes a model with only return uncertainty. Section 4 studies a portfolio problem with inattentiveness. Section 5 concludes.

2. Inattentive consumers

In a model with bond trading and income risk, inattentiveness affects wealth inequality through two different channels working in opposite directions. It increases the wealth accumulation of the inattentive group via precautionary saving motives and decreases it via negative "terms-of-trade" effects. The first channel is also present in partial equilibrium (see Reis, 2006); the second is a pure general equilibrium effect. On the one hand, by sporadically planning their consumption, inattentive consumers face greater uncertainty. This implies an increase in wealth accumulation for precautionary reasons. On the other hand, in general equilibrium, bond prices must clear the market. At non-planning dates, a positive (negative) income shock pushes up (down) the demand for savings of both groups. However, inattentive consumers automatically adjust their savings to the income shock since their consumption is predetermined. It follows that market-clearing prices must be pro-cyclical and that inattentive consumers face an adverse terms-of-trade effect, i.e. they buy bonds when their price is high and vice versa. As the results in the next subsection will show, the first effect dominates and the inattentive group accumulates on average more wealth than the attentive one.
Interestingly, through this last channel, inattention magnifies asset price volatility.

2.1. The model

Consider an incomplete markets economy with an infinite horizon and aggregate uncertainty as in Den Haan (2001), where consumers differ only in the frequency of their consumption plans.³ Attentive consumers (A) choose consumption and saving plans at all points in time. Inattentive consumers (I) plan consumption every other period and let savings absorb income shocks.⁴ The attentive group has mass α and the total population size is normalized to one. Each household is endowed with income y, which follows an AR(1) process, and can smooth its consumption by trading a risk-free one-period bond b, in zero net supply, at price q. Agents can go short in bonds only up to an exogenous limit, b̄.⁵ All agents are price takers in the bond market. The set of relevant state variables (z) will differ between planning (P) and non-planning (NP) dates. At non-planning dates, consumption (c) of the inattentive group is predetermined and affects utility so that it will enter the policy functions as a state variable. Attentive consumers plan period by period, solving the following problem:

$$V^{P,A}(z_{P,t}) = \max_{c^A_t,\, b^A_{t+1}} \left\{ U(c^A_t) + \beta E\, V^{NP,A}(z_{NP,t+1}) \right\}$$
$$\text{s.t.}\quad c^A_t + q_t b^A_{t+1} = y_t + b^A_t,\qquad b^A_{t+1} \ge -\bar{b}, \tag{1}$$

where V^{P,A} is the value function in planning periods, V^{NP,A} is the value function in non-planning periods and utility is CRRA (U(c) = c^{1−μ}/(1−μ)).

² Recently, Chien et al. (2010) connect stock market volatility to infrequent portfolio rebalancing.
³ Den Haan (2001) compares an economy with two types to one with a continuum of different types of agents. Here, I refer to the first framework. With only two groups of agents, the cross-sectional distribution of wealth is characterized by the average bond holdings of one of the two groups. A random distribution of agents across planning dates would generate a continuous wealth distribution, substantially complicating the problem.
⁴ With inattentiveness, choosing consumption or saving is no longer equivalent. If individuals plan their savings every other period, differences in planning times do not lead to wealth heterogeneity in general equilibrium with only aggregate shocks.
⁵ The constraint is hardly ever binding. To make sure that the constraint on b_{t+2} is satisfied, in the numerical solution it is imposed that it should be satisfied in the worst possible case.
Inattentive consumers plan every other period, according to the following problem:

$$V^{P,I}(z_{P,t}) = \max_{c^I_t,\, c^I_{t+1}} \left\{ U(c^I_t) + \beta U(c^I_{t+1}) + \beta^2 E\, V^{P,I}(z_{P,t+2}) \right\}$$
$$\text{s.t.}\quad c^I_t + q_t b^I_{t+1} = y_t + b^I_t,\qquad c^I_{t+1} + q_{t+1} b^I_{t+2} = y_{t+1} + b^I_{t+1},\qquad b^I_{t+1} \ge -\bar{b},\quad b^I_{t+2} \ge -\bar{b}. \tag{2}$$
The problem in (2) yields two different sets of Euler equations, in planning and non-planning periods. As in Reis (2006), the solution implies that the consumption of inattentive consumers follows a deterministic path between t and t+1, but a stochastic Euler equation between t and t+2, i.e. between the planning dates. Finally, the model is closed with the usual market-clearing conditions: α b^A_t + (1−α) b^I_t = 0 and α c^A_t + (1−α) c^I_t = y_t.⁶

2.2. Numerical solution

The income shock is calibrated using data from the NIPA to estimate the process log y_t = ρ_y log y_{t−1} + ε^y_t.⁷ The discount factor is calibrated at 0.98 to match an average return of 2%. The degree of risk aversion is equal to 1.5 and the dimension of the attentive group, α, is equal to 1/2.⁸ One period in the model corresponds to one year in the data. The first column in Table 1 summarizes the parameters in the baseline case.

Inattentiveness introduces a computational challenge. In non-planning periods (t+1), the state space includes two continuous state variables, bond holdings of inattentive consumers (b^I_{t+1}) and their predetermined consumption (c^I_{t+1}), as well as the discrete variable y_{t+1}. However, in equilibrium, c^I_{t+1} is a function of last period's bond holdings b^I_t and income shock y_t, and b^I_{t+1} is also determined by y_t and b^I_t. Therefore, y_t, b^I_t and y_{t+1} constitute a sufficient state (z_{NP,t+1}) at non-planning dates along the equilibrium path. In contrast, at a planning date (t), the state space z_{P,t} is described by y_t and b^I_t.

2.3. Inattentiveness and income risk in general equilibrium

To highlight the effects of inattentiveness, in Table 2 all the results are compared to an economy without inattention, i.e. populated by a single attentive representative agent (RA). In such an economy everybody consumes her income period by period. The second column in the table reports the results for the model with inattention (I).⁹ When income is the only source of uncertainty, inattentive consumers save more (b^I > 0) despite the general equilibrium terms-of-trade effect (corr(Δb^I, q) > 0). A change in the discount factor, the level of risk aversion or the stochastic properties of the dividend shock (ρ_d, σ_d) does not affect this conclusion.¹⁰ Increasing the inattention span could potentially strengthen the terms-of-trade effect. At the same time, it would also increase uncertainty and generate more precautionary savings.

Figs. 1 and 2 illustrate the saving behavior of an inattentive consumer by graphing her bond accumulation as a function of initial bond holdings in planning (t) and non-planning periods (t+1).¹¹ Consider a planning date when there is no cross-sectional dispersion in wealth, so that both agents hold zero assets: b^I_t = 0 in Fig. 1. Inattentive consumers face more uncertainty, since they predetermine future consumption, and save more for precautionary reasons for every realization of the shock (b^I_{t+1} − b^I_t > 0, ∀y, if b^I_t = 0). For increased wealth dispersion, inattentive consumers' behavior resembles the model without inattention: they increase their bond holdings pro-cyclically toward the lower end of wealth and vice versa.

At a non-planning date (Fig. 2), one must distinguish between high and low realizations of the income shock. For a good realization of the income shock (right panel), both agents would like to save in anticipation of future declines of income. However, inattentive consumers fixed their consumption one period in advance.
Hence, their savings increase to satisfy the budget constraint and bond prices rise to keep the market in equilibrium. The opposite is true for a bad realization of the income shock (left panel). In that case, inattentive agents save less than attentive ones and bond prices decrease to clear the market. A similar argument holds toward the lower end of wealth. Above a certain threshold for initial bond holdings, an inattentive consumer is so rich that the prudence motive for wealth accumulation fades out. At the same time, prudence motives are strong for an attentive agent because b^A = −b^I.
⁶ See the working paper version of this article for a derivation of the Euler equations and other computational details.
⁷ The series "total compensation per employees" for the years 1952–2009 was detrended using a linear trend and divided by the total U.S. population and the CPI in each year to obtain real per capita income. The income process is approximated by a three-state Markov chain.
⁸ Lusardi (2003) finds that one third of respondents in her sample has not made any financial plan about retirement. Ameriks and Zeldes (2004) report that the majority of respondents in their samples made few or no changes over time to their portfolio allocations. Here, to take a conservative stand it is assumed that one half of the population is inattentive.
⁹ Although the existence of a stationary stochastic distribution cannot theoretically be proven, the numerical simulations indicate that the distribution is not degenerate. In the simulations agents are seldom close to the constraint. Moreover, whenever the constraint is hit, the economy quickly moves away from it.
¹⁰ To study the robustness of the results, the inattentiveness model is simulated under different parameter configurations. The second column in Table 1 reports the parameter values used.
¹¹ For illustration purposes, these figures are plotted over a smaller grid for b^I_t.
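Footnote 7 states only that the income process is approximated by a three-state Markov chain; the particular discretization scheme is not reported. A minimal sketch using Tauchen's method (an assumption of this illustration, not necessarily the author's choice) with the benchmark values ρ_y = 0.94 and σ_εy = 0.01 from Table 1:

```python
import numpy as np
from scipy.stats import norm

def tauchen(rho, sigma_eps, n=3, m=2.0):
    """Discretize log y_t = rho*log y_{t-1} + eps_t, eps ~ N(0, sigma_eps^2),
    into an n-state Markov chain. The span parameter m (grid width in
    unconditional standard deviations) is an assumption of this sketch."""
    std_y = sigma_eps / np.sqrt(1.0 - rho**2)   # unconditional std of log y
    grid = np.linspace(-m * std_y, m * std_y, n)
    step = grid[1] - grid[0]
    P = np.zeros((n, n))
    for i in range(n):
        z = grid - rho * grid[i]                # distance to conditional mean
        P[i, 0] = norm.cdf((z[0] + step / 2) / sigma_eps)
        P[i, -1] = 1.0 - norm.cdf((z[-1] - step / 2) / sigma_eps)
        for j in range(1, n - 1):
            P[i, j] = (norm.cdf((z[j] + step / 2) / sigma_eps)
                       - norm.cdf((z[j] - step / 2) / sigma_eps))
    return np.exp(grid), P                      # income levels and transition matrix

# Benchmark income process from Table 1: rho_y = 0.94, sigma_ey = 0.01
y_grid, P = tauchen(0.94, 0.01)
```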
Table 1
Calibration.

Variable   Description                    Benchmark      Robustness
β          Discount factor                0.98, 0.925    0.94
μ          Risk aversion                  1.5            3
α          Dimension attentive group      1/2            –
d/(d+y)    Dividend income share          15%            –
σ_εy       Std. income shock              0.01           0.005
σ_εd       Std. dividend shock            0.03           0.06
ρ_y        Income autocorrelation         0.94           0.50
ρ_d        Dividend autocorrelation       0.87           0.50
Note: Calibration used to solve the models in Sections 2 and 3.

Table 2
Inattentiveness in general equilibrium.

Inattentive consumers (Section 2)              RA        I
  Inattentive group bond holdings b^I:
    Mean                                       0         0.288
    Std. dev.                                  0         0.119
  Bond return R:
    Mean                                       1.020     1.020
    Std. dev.                                  0.003     0.012
  Corr(Δb^I, q)                                –         0.873
  Inattentive group consumption c^I:
    Mean                                       1.000     1.006
    Std. dev.                                  0.029     0.029
  Attentive group consumption c^A:
    Mean                                       1.000     0.995
    Std. dev.                                  0.029     0.032
  Cost of inattentiveness:
    (C^I − C^A)/C^A (%)                        –         −0.016
    (C^I − C^RA)/C^RA (%)                      –         −0.012
    (C^A − C^RA)/C^RA (%)                      –         0.004

Inattentive investors (Section 3)              RA        I
  Inattentive group stock holdings s^I:
    Mean                                       1         0.889
    Std. dev.                                  0         0.09
  Stock return R′:
    Mean                                       1.081     1.081
    Std. dev.                                  0.017     0.022
  Corr(Δs^I, p)                                –         0.314
  Inattentive group consumption c^I:
    Mean                                       6.678     6.568
    Std. dev.                                  0.0607    0.106
  Attentive group consumption c^A:
    Mean                                       6.678     6.790
    Std. dev.                                  0.0607    0.116
  Cost of inattentiveness:
    (C^I − C^A)/C^A (%)                        –         −0.005
    (C^I − C^RA)/C^RA (%)                      –         −0.006
    (C^A − C^RA)/C^RA (%)                      –         −0.001

Note: Simulation results for the models presented in Sections 2 and 3. The bond and stock returns are defined as 1/q_t and (p_{t+1} + d_{t+1})/p_t, respectively, where q and p denote the bond and the stock price. RA stands for "Representative Agent model", I for "Inattention model".
This implies that the attentive group is now willing to pay a high price in order to save in good periods and decrease savings in bad periods. Bond prices must induce attentive agents to voluntarily bear the entire adjustment burden since the inattentive ones are unable to react to income shocks. This last channel increases asset price volatility. Translating this result into returns, inattentiveness makes the risk-free return four times more volatile than in a model with full attention. However, the predicted standard deviation of the bond return is still half of what is observed in the data (0.024).¹²

2.4. The costs of inattentiveness

Being inattentive obviously alters the ability to smooth consumption. The first column in Table 2 reports some sample moments from the simulated series.¹³ Consumption of the inattentive group is 10% less volatile than consumption of the attentive group. By accumulating more wealth, an inattentive consumer improves her ability to smooth consumption fluctuations. This implies that, despite being fully rational and planning period by period, the attentive group bears part of the cost of living in an environment where half the population plans infrequently.

¹² This figure was generated by using the long term stock, bond, interest rates and consumption data available from Robert Shiller's webpage. Heaton and Lucas (1996) report a similar number.
¹³ Table 2 reports the average across five parallel chains of 1,500,000 periods. The first 500,000 observations were disregarded.
Fig. 1. Inattentive consumers' savings in planning periods, Δb^I_{t+1} = b^I_{t+1} − b^I_t, as a function of initial bond holdings, b^I_t, and the three realizations of the income shock: high (solid line), medium (dotted line) and low (dashed line).
Fig. 2. Inattentive consumers' savings in non-planning periods, Δb^I_{t+2} = b^I_{t+2} − b^I_{t+1}, for two realizations of the income shock in t+1: low (left panel) and high (right panel). Each panel reports three different saving rules that depend on the initial income state in t. For example, in the left panel, 11 corresponds to y_t = low and y_{t+1} = low, 21 corresponds to y_t = medium and y_{t+1} = low and 31 corresponds to y_t = high and y_{t+1} = low.
Specifically, attentive consumers' consumption is more volatile compared to what they would experience in a world with full attention. However, since their consumption profile is optimally chosen, they are also fully compensated for this utility cost by trading at more favorable prices. The net externality on the attentive consumers' welfare turns out to be positive.
Welfare is evaluated by deriving the level of expected lifetime utility for the two groups of agents (V^J, for J = A, I) and for a representative agent who consumes her income period by period (V^{RA}).¹⁴ Table 2 reports losses translated into consumption units, namely the certainty equivalent level of consumption necessary to attain the same level of expected lifetime utility (V^J = (1/(1−β)) (C^J)^{1−μ}/(1−μ) for J = A, I, RA). The welfare costs of inattentiveness are very small. The differences between the certainty equivalent consumption level of an attentive and an inattentive agent are about 0.02%. This confirms previous findings that welfare gains from eliminating aggregate fluctuations are small (Lucas, 1987) and that losses due to small deviations from rationality are trivial (e.g. Cochrane, 1989; Pischke, 1995).

3. Inattentive investors

With uncertain labor income, the empirical link between the propensity to plan and wealth accumulation mentioned in the introduction still appears to be a puzzle. However, if the source of uncertainty is asset returns rather than income, infrequent planning leads to the opposite result, in line with the empirical evidence. In this case, more uncertainty pushes the inattentive group to invest less in the risky asset and accumulate less wealth in general equilibrium. This mechanism is illustrated in the model described in the next subsection.

3.1. The model

As in the previous section, the economy is populated by attentive and inattentive agents. Each household is endowed with a non-stochastic income stream y. Moreover, agents can trade a share of stock s, with price p, that provides a flow of stochastic dividends, d. Stocks are in positive fixed supply, normalized to one. A short-sales constraint, s ≥ 0, is imposed. Attentive investors plan period by period, solving the following problem:

$$V^{P,A}(z_{P,t}) = \max_{c^A_t,\, s^A_{t+1}} \left\{ U(c^A_t) + \beta E\, V^{NP,A}(z_{NP,t+1}) \right\}$$
$$\text{s.t.}\quad c^A_t + p_t s^A_{t+1} = (d_t + p_t) s^A_t + y,\qquad s^A_{t+1} \ge 0, \tag{3}$$
where the notation follows the previous section. Inattentive consumers plan their consumption only every other period:

$$V^{P,I}(z_{P,t}) = \max_{c^I_t,\, c^I_{t+1}} \left\{ U(c^I_t) + \beta U(c^I_{t+1}) + \beta^2 E_t\, V^{P,I}(z_{P,t+2}) \right\}$$
$$\text{s.t.}\quad c^I_t + p_t s^I_{t+1} = (d_t + p_t) s^I_t + y,\qquad c^I_{t+1} + p_{t+1} s^I_{t+2} = (d_{t+1} + p_{t+1}) s^I_{t+1} + y,\qquad s^I_{t+1} \ge 0,\quad s^I_{t+2} \ge 0. \tag{4}$$
Market clearing requires (1−α) s^I_t + α s^A_t = 1 and (1−α) c^I_t + α c^A_t = d_t + y.

3.2. Numerical solution

The dividend shock is calibrated using data from the NIPA to estimate the process log d_t = ρ_d log d_{t−1} + ε^d_t.¹⁵ To also capture other sources of wealth stemming from tradable assets, I follow Heaton and Lucas (1996) and target a steady state ratio of non-labor income over total income (d/(d+y)) of approximately 15% when calibrating the non-stochastic income endowment (y). The discount factor is calibrated at 0.925 to target an average stock return of approximately 8%. The other parameters follow the calibration in Section 2 (see Table 1).

3.3. Inattentiveness and investment risk in general equilibrium

As shown in Table 2, when the only source of uncertainty is return risk, inattentive investors choose to invest less in the risky asset and accumulate less wealth (s^I < 1).¹⁶ Also in this case, their savings are positively correlated with asset prices (corr(Δs^I, p) > 0).¹⁷ With investment risk, the risk borne by each agent is endogenously determined and increasing in their stock positions. In this case, facing more uncertainty, inattentive investors choose to invest less in the risky asset. As a result, in the long run they will consume less and accumulate less wealth, but experience a less volatile consumption path.
¹⁴ To generate these results, 1000 parallel series of 1000 periods were simulated, assuming that agents start out with zero bond holdings.
¹⁵ The "dividends" series was detrended using a linear trend and divided by the total U.S. population and the CPI in each year to obtain real per capita dividends. The shock is approximated by a three-state Markov chain.
¹⁶ Recall that s^A = 2 − s^I.
¹⁷ This result is robust to different parameter specifications.
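As a concrete reading of footnote 15, the construction of the dividend process could look as follows. This is a hedged sketch: the input arrays (nominal dividends, population, CPI) and the no-intercept OLS step are illustrative assumptions, not the paper's replication code.

```python
import numpy as np

def fit_ar1(series, population, cpi):
    """Build real per-capita log dividends, remove a linear trend, and
    estimate log d_t = rho_d * log d_{t-1} + eps_t by OLS (footnote 15).
    series, population and cpi are user-supplied annual arrays."""
    d = np.log(np.asarray(series) / (np.asarray(population) * np.asarray(cpi)))
    t = np.arange(len(d))
    x = d - np.polyval(np.polyfit(t, d, 1), t)   # deviation from linear trend
    rho = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])   # no-intercept OLS slope
    sigma = (x[1:] - rho * x[:-1]).std(ddof=1)   # innovation std. dev.
    return rho, sigma
```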
To understand the properties of this economy, it is useful to recall first that with full attention and no initial wealth dispersion (s = 1) the model is recast into the representative agent framework of Lucas (1978). In equilibrium the total stock positions of all agents are equal to the aggregate number of shares and the share price is an expected discounted sum of future dividends. Conversely, with initial wealth dispersion, the richer half of the population is more exposed to risk, saves pro-cyclically and the stock price moves accordingly.¹⁸

Figs. 3 and 4 illustrate how inattentiveness modifies the representative agent framework described above. They depict the saving behavior of an inattentive consumer at planning and non-planning periods as a function of her initial stock holding. Consider a planning date when there is no cross-sectional dispersion in wealth, so that both agents hold one share each: s^I_t = 1 in Fig. 3. Again, by predetermining their consumption, inattentive investors face more uncertainty. However, in contrast to the economy with income uncertainty, this induces them to save less in planning periods for every realization of the shock (s^I_{t+1} − s^I_t < 0, ∀d, if s^I_t = 1). Even with full attention, an increase in return risk can lead to a decrease in savings. On one hand, more uncertainty increases savings for precautionary reasons. On the other hand, it decreases demand for the risky asset among risk-averse investors (investment risk).¹⁹ Because they infrequently plan their consumption profile, inattentive investors perceive the stock as more volatile and choose to invest less in this asset. For increased wealth dispersion, inattentive investors behave as in a model without inattention: they invest in stocks pro-cyclically toward the higher end of wealth and vice versa. As in the income-risk model, asset prices are pro-cyclical. Overall, the difference between this economy and the economy with full attention is not quantitatively large at planning periods.²⁰

It is useful to turn to non-planning periods to understand the full picture. Fig. 4 describes an inattentive investor's saving behavior in non-planning periods and plots her stock accumulation as a function of initial stock holdings. At non-planning dates, with low wealth dispersion, inattentive investors' stock holdings accommodate dividend movements to satisfy the budget constraint and prices move accordingly to clear the market. Thus, also in this case, infrequent planners save pro-cyclically and trade at unfavorable prices. They will buy more stocks when prices are high and sell them when prices are low. The terms-of-trade effect described in the previous section is also in force in this case.

Precautionary saving, investment risk and terms-of-trade effects are mechanisms that affect wealth accumulation in different directions. Simulation results show that the last two effects prevail and that inattentive investors accumulate less wealth than attentive ones in the long run. By accumulating less wealth, on average they consume less but they also smooth their consumption better. Inattention increases return volatility by 30%, from 0.017 to 0.022. This increase represents a sizeable improvement with respect to a representative agent model, even though asset returns are still much more volatile in the data (16%).
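For reference, the full-attention benchmark invoked at the start of this subsection, the Lucas (1978) economy in which the share price is an expected discounted sum of dividends, can be computed exactly on a discretized dividend chain. The sketch below is illustrative only: it assumes consumption equals dividends (abstracting from the endowment y) and takes a dividend grid and transition matrix as given; the discretization itself is an assumption.

```python
import numpy as np

def lucas_price(d_grid, P, beta=0.925, mu=1.5):
    """Representative-agent stock price on a Markov dividend chain:
    p_i = beta * sum_j P[i, j] * (d_j / d_i)**(-mu) * (p_j + d_j),
    a linear system (I - A) p = A d solved exactly."""
    d_grid = np.asarray(d_grid, dtype=float)
    mrs = (d_grid[None, :] / d_grid[:, None]) ** (-mu)  # u'(d_j) / u'(d_i)
    A = beta * np.asarray(P) * mrs
    return np.linalg.solve(np.eye(len(d_grid)) - A, A @ d_grid)
```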
3.4. The cost of inattentiveness

As in the income-risk model, inattentive investors are worse off compared to both attentive investors and the representative agents.²¹ However, as opposed to the income-risk case, attentive investors now experience a much more volatile consumption path and a lower level of welfare than in the representative agent case. In this case, the trading behavior of the inattentive group generates a "pecuniary externality" on asset prices which reduces attentive agents' welfare.²² Specifically, by holding more wealth, attentive agents are more exposed to fluctuations in asset returns, and inattention increases asset price volatility. Interestingly, with return risk the cost of inattentiveness is one order of magnitude lower than in the model with income risk; thus even minor planning costs can rationalize infrequent planning.

4. A portfolio problem with inattention

With inattention, income and return risk have opposite effects on wealth accumulation, while both increase asset price volatility. This section analyzes a two-period model with both sources of risk.

4.1. The model

There are two assets: b is a risk-free bond with price q, while s is a risky asset with dividend d and price, net of dividend, p. In each period, the dividend can take only two values: d_H or d_L, with d_H > d_L, each with probability 1/2. The risk-free bond is
¹⁸ In the long run, even this economy with initial wealth dispersion will converge to a representative agent economy.
¹⁹ Reis (2009) makes this point clear in a representative agent model where the only source of income is investment in a risky asset. Gollier and Kimball (1996) show that with CRRA utility, the opportunity to invest in a risky asset increases savings only if risk aversion is lower than one.
²⁰ Without inattentiveness, Fig. 3 would look similar but the three curves would cross at zero for s = 1.
²¹ The welfare costs of inattention with return risk are evaluated as described in the previous section, assuming that agents start out holding one share each. The last two rows in Table 2 show the results.
²² I thank an anonymous referee for pointing this out.
Fig. 3. Inattentive investors' savings in planning periods, Δs^I_{t+1} = s^I_{t+1} − s^I_t, as a function of initial stock holding, s^I_t, and the three realizations of the dividend shock: high (solid line), medium (dotted line) and low (dashed line).
Fig. 4. Inattentive investors' savings in non-planning periods, Δs^I_{t+2} = s^I_{t+2} − s^I_{t+1}, for two realizations of the dividend shock in t+1: low (left panel) and high (right panel). Each panel reports three different saving rules that depend on the initial dividend state in t. For example, in the left panel, 11 corresponds to d_t = low and d_{t+1} = low, 21 corresponds to d_t = medium and d_{t+1} = low and 31 corresponds to d_t = high and d_{t+1} = low.
in zero net supply, while the share is in unitary net supply. Besides investment income, agents in the model also receive a stochastic labor income that can take two values, y_H > y_L, each with probability 1/2. Attentive investors choose their portfolio once the shocks of period 1 are realized. The inattentive ones choose consumption and the risky asset s before the shocks are realized. Agents are homogeneous ex ante and both groups are of equal size.
Table 3
A portfolio problem with inattention.

Return risk
                          μ = 1.5           μ = 3             μ = 10
                          RA      I         RA      I         RA      I
Inattentive agents
  Wealth w^I              1.002   1.006     1.002   1.006     1.002   0.999
  Bond holdings b^I       0       0.005     0       0.007     0       0.004
  Stock holdings s^I      1       0.999     1       0.997     1       0.994
Attentive agents
  Wealth w^A              1.002   0.998     1.002   0.998     1.002   1.004
Risk-free rate r^f
  Mean^a (%)              4.17    4.23      4.17    5.06      4.17    18.02
  Std. dev.               0.093   0.269     0.186   0.483     0.559   1.029
Risky rate r′
  Mean^a (%)              4.73    4.79      5.29    6.18      7.63    21.90
  Std. dev.               0.113   0.278     0.198   0.492     0.583   1.068
Equity premium^a (%)      0.56    0.56      1.12    1.12      3.46    3.88

Return and labor income risk
                          μ = 1.5           μ = 3             μ = 10
                          RA      I         RA      I         RA      I
Inattentive agents
  Wealth w^I              1.002   1.003     1.002   1.004     1.002   1.006
  Bond holdings b^I       0       0.002     0       0.003     0       0.006
  Stock holdings s^I      1       0.9999    1       0.9997    1       0.999
Attentive agents
  Wealth w^A              1.002   1.000     1.002   0.999     1.002   0.997
Risk-free rate r^f
  Mean^a (%)              4.17    4.16      4.17    4.16      4.17    4.47
  Std. dev.               0.021   0.064     0.042   0.127     0.140   0.386
Risky rate r′
  Mean^a (%)              4.30    4.30      4.42    4.42      5.00    5.32
  Std. dev.               0.066   0.089     0.076   0.142     0.154   0.395
Equity premium^a (%)      0.13    0.13      0.26    0.26      0.84    0.84

Note: Numerical results for the models described in Section 4. The parameter μ denotes risk aversion.
^a Average value across bad and good times.
Attentive investors face the following maximization problem:

$$\max_{c^A_1,\, s^A_1}\; U(c^A_1) + \beta E_1 U(c^A_2)$$
$$\text{s.t.}\quad c^A_1 + q_1 b^A_1 + p_1 s^A_1 = b^A_0 + (p_1 + d_1) s^A_0 + y_1,\qquad c^A_2 + q_2 b^A_2 + p_2 s^A_2 = b^A_1 + (p_2 + d_2) s^A_1 + y_2. \tag{5}$$

The maximization problem for an inattentive investor is

$$\max_{c^I_1,\, s^I_1}\; U(c^I_1) + \beta E_0 U(c^I_2)$$
$$\text{s.t.}\quad c^I_1 + q_1 b^I_1 + p_1 s^I_1 = b^I_0 + (p_1 + d_1) s^I_0 + y_1,\qquad c^I_2 + q_2 b^I_2 + p_2 s^I_2 = b^I_1 + (p_2 + d_2) s^I_1 + y_2. \tag{6}$$
The market-clearing conditions read α b^A_t + (1−α) b^I_t = 0 and α s^A_t + (1−α) s^I_t = 1, t = 1, 2.

4.2. Results

Table 3 summarizes the numerical solutions for different values of risk aversion, μ. The discount factor, β, is calibrated at 0.96. For simplicity, it is assumed that dividend and labor income are perfectly correlated. The standard deviation of both shocks and the dividend income share follow the calibration of the previous two sections. All results are compared with the representative agent model (RA). The model in this section differs in two respects: agents can now invest in two assets and there are two sources of risk. In what follows, these two differences are introduced sequentially to separate the effects of these channels.

First, consider the model described above but where the only source of uncertainty is asset returns, y_H = y_L = 0. I refer to this model as "Return Risk" in Table 3. Facing higher uncertainty in asset returns, the inattentive group saves more in bonds and less in equities compared to the attentive group. Moreover, inattention increases both the level and the volatility of both the risky (r′) and the risk-free (r^f) return. Inattentiveness increases the risk premium because attentive agents now demand a higher premium for bearing the macroeconomic risk. When risk aversion is high (μ = 10), the increase in the risk premium induced by inattentiveness is large enough to decrease inattentive investors' wealth, given their portfolio composition. In this case attentive investors accumulate the most wealth.²³

Now, introduce income risk. As before, regardless of the degree of risk aversion, inattention induces inattentive agents to sell equities and buy bonds, and it increases asset price volatility. However, the effects of inattention on the risk premium are more muted. It follows that, even for a high degree of risk aversion, the equity premium is small and inattentive agents accumulate more wealth. As in Polkovnichenko (2004), the implications of limited stock market participation²⁴ for the equity premium are marginal if shareholders are endowed not only with capital income, but also with labor income.
²³ Wealth is defined as E(w_2) = E(s_1 d_2 + b_1), since in equilibrium p_2 = 0.
²⁴ In the inattentiveness model, inattentive agents take less financial risk. In this sense, they participate "less" in the stock market.
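To illustrate the structure of problem (5), the period-1 first-order conditions of the attentive investor can be solved directly for the portfolio (b_1, s_1) at given prices. This is a partial-equilibrium sketch: the prices, endowments and dividend states below are placeholders rather than the equilibrium values of Table 3, and in the full model (q_1, p_1) would be determined jointly with market clearing.

```python
import numpy as np
from scipy.optimize import fsolve

beta, mu = 0.96, 3.0
up = lambda c: c**(-mu)  # CRRA marginal utility

def attentive_foc(x, q1, p1, w1, d2, y2, prob):
    """First-order conditions of problem (5) at given prices, using p2 = 0
    in the terminal period (footnote 23): q1 u'(c1) = beta E[u'(c2)] for
    bonds and p1 u'(c1) = beta E[d2 u'(c2)] for stocks."""
    b1, s1 = x
    c1 = w1 - q1 * b1 - p1 * s1          # period-1 budget
    c2 = b1 + s1 * d2 + y2               # period-2 consumption, state by state
    return [q1 * up(c1) - beta * prob @ up(c2),
            p1 * up(c1) - beta * prob @ (d2 * up(c2))]

# Placeholder states: dividend and labor income perfectly correlated
d2 = np.array([1.03, 0.97]); y2 = np.array([5.8, 5.5]); prob = np.array([0.5, 0.5])
b1, s1 = fsolve(attentive_foc, x0=[0.0, 1.0], args=(0.96, 0.9, 7.0, d2, y2, prob))
```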
5. Conclusions

There is a link between the propensity to plan, wealth inequality and asset prices in general equilibrium. In a simple endowment economy where agents receive equal income or dividend streams, differences in the propensity to plan generate wealth heterogeneity and volatile asset prices. Attentive agents plan their consumption pattern period by period, while inattentive ones plan every other period. In a partial equilibrium model with fixed interest rates, Reis (2006) shows that inattentive consumers face more uncertainty and save more for precautionary reasons. Here, it is shown that in general equilibrium, inattentive consumers buy assets when asset prices are high and vice versa. This negative terms-of-trade effect might potentially lead to lower wealth. If the only source of risk is income, inattentive consumers still accumulate claims on attentive ones. In contrast, they accumulate less wealth when they can trade only in a risky asset. In this respect, inattentiveness can explain why infrequent planners invest less in the stock market. Asset returns are much more volatile than in a representative agent model with full attention because prices must induce attentive agents to voluntarily bear the whole burden of adjusting to aggregate income shocks. Thus, the model suggests a natural link between infrequent planning and the high volatility of stock returns observed in the data.
Acknowledgments

For invaluable advice and continuous support, I am deeply indebted to John Hassler, Per Krusell and Torsten Persson. I am grateful to Ricardo Reis and two anonymous referees for their constructive comments and suggestions. I also benefited from comments from Mikael Carlsson, Wouter den Haan, Eric Young and seminar participants at the IIES and the Riksbank. The views expressed in this paper are solely the responsibility of the author and should not be interpreted as reflecting the views of the Executive Board of Sveriges Riksbank. Any errors or omissions are the responsibility of the author.

References

Abel, A.B., Eberly, J.C., Panageas, S., 2007. Optimal inattention to the stock market. American Economic Review: Papers and Proceedings 97 (2), 244–249.
Ameriks, J., Caplin, A., Leahy, J., 2003. Wealth accumulation and the propensity to plan. Quarterly Journal of Economics 118 (3), 1007–1047.
Ameriks, J., Zeldes, S.P., 2004. How do household portfolio shares vary with age? Mimeo.
Caballero, R.J., 1995. Near rationality, heterogeneity, and aggregate consumption. Journal of Money, Credit, and Banking 27 (1), 29–48.
Campbell, J.Y., Cochrane, J.H., 1999. By force of habit: a consumption-based explanation of aggregate stock market behavior. Journal of Political Economy 107 (2), 205–251.
Chien, Y., Cole, H., Lustig, H., 2010. Is the volatility of the market price of risk due to intermittent portfolio rebalancing? Mimeo.
Cochrane, J.H., 1989. The sensitivity of tests of the intertemporal allocation of consumption to near-rational alternatives. American Economic Review 79 (3), 319–337.
Den Haan, W.J., 2001. The importance of the number of different agents in a heterogeneous asset-pricing model. Journal of Economic Dynamics and Control 25 (5), 721–746.
Dumas, B., Kurshev, A., Uppal, R., 2009. Equilibrium portfolio strategies in the presence of sentiment risk and excess volatility. Journal of Finance 64 (2), 579–629.
Gabaix, X., Laibson, D., 2002. The 6D bias and the equity-premium puzzle. In: Bernanke, B., Rogoff, K. (Eds.), NBER Macroeconomics Annual 2001, vol. 16. MIT Press, Cumberland, pp. 257–330.
Gollier, C., Kimball, M., 1996. Toward a systematic approach to the economic effects of uncertainty: characterizing utility functions. Discussion Paper, University of Michigan.
Heaton, J., Lucas, D.J., 1996. Evaluating the effects of incomplete markets on risk sharing and asset pricing. Journal of Political Economy 104 (3), 443–487.
LeRoy, S.F., Porter, R.D., 1981. The present value relation: tests based on implied variance bounds. Econometrica 49 (3), 555–574.
Lucas, R.E., 1978. Asset prices in an exchange economy. Econometrica 46 (6), 1426–1445.
Lucas, R.E., 1987. Models of Business Cycles. Basil Blackwell, New York.
Lusardi, A., 2003. Planning and savings for retirement. Mimeo.
Lusardi, A., Mitchell, O.S., 2007. Financial literacy and retirement preparedness: evidence and implications for financial education. Business Economics 42 (1), 35–44.
Lynch, A.W., 1996. Decision frequency and synchronization across agents: implications for aggregate consumption and equity return. Journal of Finance 51 (4), 1470–1497.
McKay, A., 2010. Household saving behavior, wealth accumulation and social security privatization. Mimeo.
Pischke, J.S., 1995. Individual income, incomplete information, and aggregate consumption. Econometrica 63 (4), 805–840.
Polkovnichenko, V., 2004. Limited stock market participation and the equity premium. Finance Research Letters 1 (1), 24–34.
Reis, R., 2006. Inattentive consumers. Journal of Monetary Economics 53 (8), 1761–1800.
Reis, R., 2009. The time-series properties of aggregate consumption: implications for the costs of fluctuations. Journal of the European Economic Association 7 (4), 722–753.
Rodriguez, J.C., 2005. Consumption, the persistence of shocks and asset price volatility. Journal of Monetary Economics 53 (8), 1741–1760.
Shiller, R.J., 1981. Do stock prices move too much to be justified by subsequent changes in dividends? American Economic Review 71 (3), 421–436.
Cointegrated TFP processes and international business cycles
Pau Rabanal (a), Juan F. Rubio-Ramírez (b,c,d,e), Vicente Tuesta (f)
(a) Research Department, International Monetary Fund, United States; (b) Duke University, United States; (c) Federal Reserve Bank of Atlanta, United States; (d) CEPR, United Kingdom; (e) FEDEA, Spain; (f) Centrum Catolica and Prima AFP, Peru
Article history: Received 5 October 2009; Received in revised form 3 March 2011; Accepted 7 March 2011; Available online 17 March 2011.

Abstract: A puzzle in international macroeconomics is that real exchange rates are highly volatile. Standard international real business cycle (IRBC) models cannot reproduce this fact. This paper provides evidence that TFP processes for the U.S. and the "rest of the world" are characterized by a vector error correction model (VECM) and that adding cointegrated technology shocks to the standard IRBC model helps to explain the observed high real exchange rate volatility. Also, the model can explain the observed increase in real exchange rate volatility with respect to output in the last 20 years by changes in the parameters of the VECM.
1. Introduction

A central puzzle in international macroeconomics is that observed real exchange rates (RERs) are highly volatile. Standard international real business cycle (IRBC) models cannot reproduce this fact when calibrated using conventional parameterizations. For instance, Heathcote and Perri (2002) simulate a two-country, two-good economy with total factor productivity (TFP) shocks and find that the model can explain less than a fourth of the observed relative volatility of the RER with respect to output for United States (U.S.) data. An important feature of their model, following the seminal work of Backus et al. (1992) and Baxter and Crucini (1993), is that it considers stationary TFP shocks that follow a vector autoregression (VAR) process in levels.¹,²

This paper provides evidence that TFP processes for the U.S. and the "rest of the world" (R.W.) have a unit root and are cointegrated. Motivated by this empirical finding, technology shocks that follow a vector error correction model (VECM) process are introduced into an otherwise standard two-country, two-good model. Engle and Granger (1987) indicate that if the system under study includes integrated variables and cointegrating relationships, then this system will be more appropriately specified as a VECM rather than a VAR in levels. As Engle and Granger (1987) note, estimating a VAR in levels for cointegrated systems ignores important constraints on the coefficient matrices. Although these constraints are satisfied asymptotically, small sample improvements are likely to result from imposing them on the cointegrating relationships.
¹ We provide an online appendix with the full set of normalized equilibrium conditions and the necessary replication files.
² Other studies that consider a VAR in levels are Kehoe and Perri (2002), Dotsey and Duarte (2009), Corsetti et al. (2008a,b), and Heathcote and Perri (2009).
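As a concrete illustration of the specification argument above, a Johansen cointegration test and a VECM for the two log-TFP series can be run with standard tools. This is a minimal sketch, assuming the statsmodels library and user-supplied series tfp_us and tfp_rw; the lag order, deterministic terms and sample are assumptions, not the paper's exact estimation choices.

```python
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

def estimate_tfp_vecm(tfp_us, tfp_rw, lags=1):
    """Test for cointegration between two log-TFP series and fit a VECM.
    Under the balanced-growth restriction discussed below, the estimated
    cointegrating vector beta should be close to (1, -1)."""
    data = np.column_stack([tfp_us, tfp_rw])
    joh = coint_johansen(data, det_order=0, k_ar_diff=lags)
    print("trace statistics:", joh.lr1)
    print("5% critical values:", joh.cvt[:, 1])
    res = VECM(data, k_ar_diff=lags, coint_rank=1).fit()
    return res.beta, res.alpha  # cointegrating vector and loading coefficients
```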
The presence of cointegrated TFP shocks requires restrictions on preferences, production functions, and the law of motion of the shocks for the balanced growth path to be chosen by optimizing agents. The restrictions on preferences and technology of King et al. (1988) are sufficient for the existence of balanced growth in a closed economy. However, in a two-country model, an additional restriction on the cointegrating vector related to the TFP processes is needed. In particular, the cointegrating vector must be (1, −1), which means the ratio of TFP processes (or, equivalently, the log difference of TFP processes) across countries is stationary. After presenting evidence supporting this additional restriction, the simulated model with a VECM specification for TFP processes solves a large part of the RER volatility puzzle without affecting the good match for other moments of domestic and international variables. In particular, the model can generate a relative volatility of the RER more than two times larger than an equivalent model with stationary shocks calibrated as in Heathcote and Perri (2002).

Why does a model with cointegrated TFP shocks generate higher relative volatility of the RER than a model with stationary shocks? The reason is that the VECM parameter estimates imply higher persistence and lower spillovers than the traditional stationary calibrations, which translates into higher persistence of the TFP differential across countries. This is the crucial feature of the model that helps explain the relative volatility of the RER with respect to output. The mechanism works as follows. When a positive TFP shock hits the home economy, output, consumption, investment and hours worked increase in the home economy, and the RER depreciates as home prices decrease. In the foreign country, output, investment and hours decrease while consumption increases, as foreign households anticipate the arrival of the technology improvement in the future and face a wealth effect. What happens as the persistence of the TFP differential increases, or, put differently, the speed of transmission of the shocks decreases? It delays the arrival of the TFP improvement in the foreign country and it increases the persistence of the TFP shock in the home country. This delay attenuates the wealth effect for the foreign economy and strengthens it for the home country. As a result, labor, investment and output respond less strongly, reducing output volatility. The increased persistence in the TFP differential also reduces the supply of the home good and increases the demand for foreign intermediate goods (which are produced at a higher cost, because the foreign country has not received the productivity shock yet), which leads to an even larger RER depreciation, increasing RER volatility.

Another very well-documented empirical fact is the substantial decline in the volatility of most U.S. macroeconomic variables during the last 20 years, a phenomenon known as the "Great Moderation".³ The Great Moderation has not affected the RER as strongly as it has affected output. As a result, the ratio of RER volatility to output volatility has increased. The increase in the relative volatility of the RER of the U.S. dollar coincides in time with a weakening of the cointegrating relationship of TFP shocks between the U.S.
and the R.W.4 More importantly, this paper confirms that by allowing the cointegrating relationship to change as it does in the data, the model can jointly account for the observed increase in the relative volatility of the RER and the substantial decline in the volatility of output. Baxter and Stockman (1989) showed that changes in the nominal exchange rate regime greatly affected volatility of the RER but had almost no effect on output volatility and other macroeconomic variables. This empirical fact is an important challenge for IRBC models, since they assume a tight relationship between RER and output volatilities. Since the model is based on the standard IRBC framework, it suffers from the same limitation. To minimize the effect of this valid criticism, and since the impact of changes in the nominal exchange rate regime cannot be studied, the focus of the analysis is the post Bretton-Woods period of flexible nominal exchange rates. Our paper relates to two important strands of the literature. On the one hand, it connects with the literature stressing the importance of stochastic trends to explain economic fluctuations. King et al. (1991) find that a common stochastic trend explains the co-movements of the main U.S. real macroeconomic variables. Lastrapes (1992) reports that fluctuations in real and nominal exchanges rates are primarily due to permanent real shocks. Engel and West (2005) show that RERs manifest near-random-walk behavior if TFP processes are random walks and the discount factor is near one, while Nason and Rogers (2008) generalize this hypothesis to a larger class of models. Aguiar and Gopinath (2007) show that trend shocks are the primary source of fluctuations in emerging economies. Alvarez and Jermann (2005) and Corsetti et al. (2008a) highlight the importance of persistent disturbances to explain asset prices and RER fluctuations, respectively. Also, Lubik and Schorfheide (2005) and Rabanal and Tuesta (2010) introduce random walk TFP shocks to explain international fluctuations, and Justiniano and Preston (2010) suggest that it is important to introduce correlations between the innovations of several structural shocks in order to explain the co-movement between Canadian and U.S. macroeconomic variables. However, these papers do not formalize a VECM, test for cointegration, or estimate the cointegrating vector. On the other hand, this paper also links to the literature analyzing different mechanisms to understand RER fluctuations. Some recent papers study the effects of monetary shocks and nominal rigidities. Chari et al. (2002) are able to explain RER volatility in a monetary model with sticky prices and a high degree of risk aversion. However, their model achieves success by increasing the variance of monetary shocks beyond what the data indicate. Benigno (2004) focuses on the role of interest rate inertia and asymmetric nominal rigidities across countries. Other papers use either non-traded goods, pricing to market, or some form of distribution costs (see Corsetti et al., 2008a,b; Benigno and Thoenissen, 2008; Dotsey and Duarte, 2009). Our model includes only tradable goods with home bias, which is the only
3 Some early discussion of the Great Moderation can be found in Kim and Nelson (1999). A discussion of different interpretations for this phenomenon and some international evidence can be found in Stock and Watson (2003) and Stock and Watson (2005), respectively. 4 In Section 4, we describe the set of countries that compose our definition of R.W.
158
P. Rabanal et al. / Journal of Monetary Economics 58 (2011) 156–171
source of RER fluctuations. Our choice is guided by evidence that the relative price of tradable goods has large and persistent fluctuations that explain most of the RER volatility (see Engel, 1993, 1999). Fluctuations of the relative price of non-traded goods accounts for, at most, one-third of RER volatility (see Betts and Kehoe, 2006; Burstein et al., 2006; Rabanal and Tuesta, 2007). In any case, this choice causes an empirical problem. Our measure of RER is based on the consumer price indices (CPIs) that include non-traded goods, while the model does not have a non-traded goods sector. To reduce the gap between model and data, two alternative measures of RER are considered. The first measure is constructed using producer price indices (PPIs) and the second from export deflators. Of course, the two series still maintain some gap between theory and measurement, but the role of non-traded goods is reduced. Using these two other measures, the results do not change. The rest of the paper is organized as follows. Section 2 documents the increase in the RER volatility with respect to output volatility for the U.S. Section 3 presents the model with cointegrated TFP shocks. Section 4 reports estimates for the law of motion of TFP processes for the U.S. and the R.W. Section 5 discusses the main findings from simulating the model, leaving Section 6 for concluding remarks.
2. RER volatility and the Great Moderation

This section presents evidence that, in the period known as "the Great Moderation", the relative volatility of the RER (measured as the real effective exchange rate) with respect to output (measured as real GDP) has increased in the U.S.5 The real effective exchange rate is constructed as a geometric average of bilateral CPI-based RERs with respect to the Euro area, Japan, Canada, the United Kingdom, and Australia, with the same weights used by the Federal Reserve to construct its real effective exchange rate series for the U.S. dollar. These countries represented 69 percent of the aggregate weight in 1973 and 46 percent in 2009, and were chosen to be consistent with the definition of the R.W. later in the paper.

The RER series is constructed from 1957:1 to 2010:1 in two steps. Between 1973:4 and 2010:1, series for nominal exchange rates against the U.S. dollar were obtained from the Federal Reserve, and each country's CPI series was obtained from the IMF's International Financial Statistics (IFS). The only exception is the Euro area, where the source for the CPI is the area wide model (AWM) of the European Central Bank (ECB). The Federal Reserve does not publish bilateral exchange rate data prior to 1973. In addition, the Federal Reserve weights start in 1973. Moreover, the AWM from which the Euro area CPI is obtained starts in 1970. The real effective exchange rate of the U.S. dollar between 1957:1 and 1973:3 is therefore extended backwards as follows. A Euro area RER aggregate was constructed using U.S. dollar nominal exchange rates and CPI data for West Germany, France, Italy, Spain, and the Netherlands from the IMF's IFS, applying the Federal Reserve's weights for 1973 (the first year these weights are available). These five countries represented about 86 percent of Euro area trade with the U.S. in 1973. Then, the Euro area RER was averaged with the bilateral RERs of Japan, Canada, the United Kingdom, and Australia, again using 1973 weights. Since the two series overlap in 1973:4, this date was chosen to normalize and splice them together.

Fig. 1 shows the standard deviation of the HP-filtered output series, the standard deviation of the HP-filtered RER series, and the ratio of the two.6 Standard deviations are computed using rolling windows of 40 quarters and data from 1957:1 to 2010:1, so Fig. 1 displays volatilities between 1966:4 and 2010:1. The figure shows a substantial decline in the volatility of output, from a standard deviation of around 2 percent until the mid 1990s to 1 percent after that date. This decline in output volatility is what is typically referred to as "the Great Moderation". The volatility of the RER follows a different path: the standard deviation was at about 4 percent until the mid 1980s, increased to around 7 percent in the 1990s, and then declined again to around 4 percent. What is the behavior of the ratio of volatilities between the RER and output? The ratio has increased, in a non-monotonic way, from around 1 to around 4 over the period of study. Hence, the volatility of the RER has risen by a factor of four relative to that of output.

But this is not a formal test of a structural break. To perform such a test, the ratio of RER to output volatility is modeled as an autoregressive (AR) process of order one with mean. The Quandt–Andrews unknown breakpoint test is applied to the estimated mean parameter (trimming 15 percent of the data), using the sample 1966:4 to 2010:1.
The results are reported in Table 1 and clearly reject the null hypothesis of no breakpoint in the data: the exponential likelihood ratio is well above the 5 percent critical value. In addition, using the maximum likelihood ratio F-statistic, the date with the highest probability of a break is 1993:4.

The rest of the paper builds a two-country, two-good model that is calibrated using standard parameters of the IRBC literature and estimated parameters of a VECM for the TFP processes of the U.S. and the R.W. After seeing the evidence presented in Fig. 1, the reader may anticipate that the goal of the paper is to match time-varying targets. That is not the case. Section 5.4 below shows that by estimating two separate VECMs with a breakpoint in 1993:4, the resulting parameter estimates explain a large fraction of the observed increase in the relative volatility of the RER with respect to output.
5 Similar behavior can be observed for the United Kingdom, Canada, and Australia. We do not present those graphs because of space considerations. Using the Federal Reserve's index for the real effective exchange rate of the U.S. dollar delivers similar results. The Fed's measure of the RER of the U.S. dollar is available from January 1973. The correlation between the HP-filtered RER constructed by the Fed and the measure based on the five countries is 0.94.
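As a concrete illustration of the computation behind Fig. 1, the following sketch reproduces the rolling-volatility calculation in Python. It is not the authors' replication code; `gdp` and `rer` are hypothetical quarterly pandas Series of log real GDP and the log real effective exchange rate.

    import pandas as pd
    from statsmodels.tsa.filters.hp_filter import hpfilter

    def rolling_relative_volatility(gdp, rer, window=40, lamb=1600):
        # HP-filter each series; hpfilter returns (cycle, trend).
        y_cycle, _ = hpfilter(gdp, lamb=lamb)
        q_cycle, _ = hpfilter(rer, lamb=lamb)
        # Standard deviations over rolling windows of 40 quarters.
        sd_y = y_cycle.rolling(window).std()
        sd_q = q_cycle.rolling(window).std()
        return pd.DataFrame({"sd_output": sd_y, "sd_rer": sd_q,
                             "ratio": sd_q / sd_y}).dropna()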
Fig. 1. Standard deviation of HP-filtered output and RER and the ratio of standard deviations.

Table 1
Quandt–Andrews test. Stability of relative volatility of the RER.

Method                              F-statistic   Prob.
Maximum likelihood ratio (1993:4)   15.36773      0.0021
Exponential likelihood ratio        4.344542      0.0018

Notes: The null hypothesis is "No breakpoints within trimmed data". Equation sample: 1967:1–2010:1. Test sample: 1973:3–2003:3 (given 15 percent trimming). Number of breaks compared: 121.
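A rough sketch of the breakpoint test just described, assuming `ratio` is the series of relative volatilities: the ratio is modeled as an AR(1) with mean, the mean is allowed to shift at each candidate date inside the trimmed sample, and the maximum and exponentially averaged F-statistics are collected. Critical values must come from Andrews (1993)-type tables, which are not reproduced here.

    import numpy as np
    import statsmodels.api as sm

    def quandt_andrews_mean_break(ratio, trim=0.15):
        y = np.asarray(ratio, dtype=float)
        dep, lag = y[1:], y[:-1]
        n = len(dep)
        lo, hi = int(trim * n), int((1 - trim) * n)
        restricted = sm.OLS(dep, sm.add_constant(lag)).fit()
        f_stats = []
        for b in range(lo, hi):
            shift = np.zeros(n)
            shift[b:] = 1.0                      # post-break intercept dummy
            X = sm.add_constant(np.column_stack([lag, shift]))
            unrestricted = sm.OLS(dep, X).fit()
            # F-statistic for the single restriction of no mean shift.
            f = (restricted.ssr - unrestricted.ssr) / (unrestricted.ssr / unrestricted.df_resid)
            f_stats.append(f)
        f_stats = np.array(f_stats)
        exp_f = np.log(np.mean(np.exp(f_stats / 2)))   # exponential statistic
        return f_stats.max(), exp_f, lo + int(f_stats.argmax())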
3. The model

This section presents a standard two-country, two-good IRBC model similar to that in Heathcote and Perri (2002). The main difference with respect to the standard IRBC literature is the definition of the stochastic processes for the log of TFP.7 In that literature, the TFP processes of the two countries are assumed to be stationary or trend stationary in logs and are modeled as a VAR process. Baxter and Crucini (1995) were the first to consider permanent shocks and the possibility of cointegration in the context of this class of models, but they did not pursue the VECM specification because the evidence of cointegration was mixed for the bilateral pairs they studied. In the present model, TFP processes that are cointegrated of order C(1,1) are introduced, which implies that the processes are integrated of order one but a linear combination of them is stationary. According to the Granger representation theorem, the C(1,1) assumption is equivalent to defining a VECM for the law of motion of the differences of the TFP processes.8 The VECM is defined in more detail in Section 3.2.3. The cointegration assumption has strong and testable implications for the data; the empirical evidence supporting it is presented in Section 4.

In each country, a single final good is produced by a representative competitive firm that uses intermediate goods in the production process. These intermediate goods are imperfect substitutes for each other and can be purchased from representative competitive producers of intermediate goods in both countries. Intermediate goods producers use domestic capital and labor in the production process. The final good can only be used for consumption or investment in the domestic economy. The stock of domestic capital can therefore only be increased by combining domestic and foreign goods.
7 To avoid bothersome repetition, in the remainder of the paper the concept "TFP" will actually mean "the log of TFP".
8 See Engle and Granger (1987).
Thus, all trade between countries occurs at the intermediate goods level. In addition, consumers trade non-contingent international riskless bonds denominated in units of domestic intermediate goods. No other financial asset is available.

In each period t, the economy experiences one of finitely many events s_t, with s^t = (s_0, ..., s_t) the history of events up through period t. The probability, as of period 0, of any particular history s^t is \pi(s^t), and s_0 is given. Only the problem of the home country is presented; the problem faced by foreign-country households and firms is symmetric.

3.1. Households

The representative household of the home country solves

\max_{\{C(s^t),L(s^t),X(s^t),K(s^t),D(s^t)\}} \sum_{t=0}^{\infty} \beta^t \sum_{s^t} \pi(s^t) \frac{\{C(s^t)^{\tau}[1-L(s^t)]^{1-\tau}\}^{1-\sigma}}{1-\sigma},   (1)

subject to the budget constraint

C(s^t) + X(s^t) + \frac{P_H(s^t)}{P(s^t)} Q(s^t) D(s^t) \le W(s^t)L(s^t) + R(s^t)K(s^{t-1}) + \frac{P_H(s^t)}{P(s^t)} \{D(s^{t-1}) - \Phi[D(s^t)]\},   (2)

and the law of motion for capital

K(s^t) = (1-\delta)K(s^{t-1}) + X(s^t).   (3)
The following notation is used: β ∈ (0,1) is the discount factor, L(s^t) ∈ (0,1) is the fraction of time allocated to work in the home country, C(s^t) ≥ 0 are units of consumption of the final good, X(s^t) ≥ 0 are units of investment, and K(s^t) ≥ 0 is the capital stock in the home country at the beginning of period t+1. P(s^t) is the price of the home final good, which will be defined below; W(s^t) is the hourly wage in the home country; and R(s^t) is the home-country rental rate of capital, where the prices of both factor inputs are measured in units of the final good. P_H(s^t) is the price of the home intermediate good in the home country. D(s^t) denotes holdings of the internationally traded riskless bond, which pays one unit of the home intermediate good (minus a small cost of holding bonds, Φ(·)) in period t+1 regardless of the state of nature, and Q(s^t) is its price, measured in units of the home intermediate good. The function Φ(·) is an arbitrarily small cost of holding bonds measured in units of the home intermediate good.9 Following the existing literature, Φ(·) takes the functional form

\Phi[D(s^t)] = \frac{\phi}{2} A(s^{t-1}) \left[\frac{D(s^t)}{A(s^{t-1})}\right]^2,   (4)

where the adjustment cost function has been modified to ensure balanced growth.

3.2. Firms

We now describe the problems of the final and intermediate goods producers. This subsection also describes the TFP process, which is the main departure from the standard literature.

3.2.1. Final goods producers

The final good in the home country, Y(s^t), is produced using home intermediate goods, Y_H(s^t), and foreign intermediate goods, Y_F(s^t), with the following technology:

Y(s^t) = [\omega^{1/\theta} Y_H(s^t)^{(\theta-1)/\theta} + (1-\omega)^{1/\theta} Y_F(s^t)^{(\theta-1)/\theta}]^{\theta/(\theta-1)},   (5)
where ω denotes the fraction of home intermediate goods used in the production of the home final good and θ controls the elasticity of substitution between home and foreign intermediate goods. Therefore, the representative final goods producer in the home country solves the following problem:

\max_{Y(s^t) \ge 0,\, Y_H(s^t) \ge 0,\, Y_F(s^t) \ge 0} P(s^t)Y(s^t) - P_H(s^t)Y_H(s^t) - P_F(s^t)Y_F(s^t),   (6)
subject to the production function (5), where P_F(s^t) is the price of the foreign intermediate good in the home country.

3.2.2. Intermediate goods producers

The representative intermediate goods producer in the home country uses home labor and capital to produce home intermediate goods and sells her product to both the home and foreign final goods producers. Taking the prices of all goods and factor inputs as given, she maximizes profits. Hence, she solves

\max_{L(s^t) \ge 0,\, K(s^{t-1}) \ge 0} P_H(s^t)Y_H(s^t) + P_H^*(s^t)Y_H^*(s^t) - P(s^t)[W(s^t)L(s^t) + R(s^t)K(s^{t-1})],   (7)

9 The Φ(·) cost is introduced to ensure stationarity of the level of D(s^t) in IRBC models with incomplete markets, as discussed by Heathcote and Perri (2002). We choose the cost to be numerically small, so it does not affect the dynamics of the rest of the variables.
subject to the production function

Y_H(s^t) + Y_H^*(s^t) = A(s^t)^{1-\alpha} K(s^{t-1})^{\alpha} L(s^t)^{1-\alpha},   (8)

where Y_H(s^t) is the amount of home intermediate goods sold to the home final goods producers, Y_H^*(s^t) is the amount of home intermediate goods sold to the foreign final goods producers, A(s^t) is a stochastic process describing the TFP of home intermediate goods producers, which is characterized below, and P_H^*(s^t) is the price of the home intermediate good in the foreign country.

3.2.3. The processes for TFP

As mentioned above, the main departure from the standard model in the IRBC literature is the assumption that log A(s^t) and log A*(s^t) are cointegrated of order C(1,1). This assumption involves specifying the following VECM for the law of motion of the log first-differences of the TFP processes in the home and the foreign country:

\begin{pmatrix} \Delta \log A(s^t) \\ \Delta \log A^*(s^t) \end{pmatrix} = \begin{pmatrix} c \\ c^* \end{pmatrix} + \begin{pmatrix} \kappa \\ \kappa^* \end{pmatrix} [\log A(s^{t-1}) - \gamma \log A^*(s^{t-1}) - \log \xi] + \begin{pmatrix} \varepsilon(s^t) \\ \varepsilon^*(s^t) \end{pmatrix},   (9)

where (1, -γ) is called the cointegrating vector, ξ is the constant in the cointegrating relationship, ε(s^t) ~ N(0, σ) and ε*(s^t) ~ N(0, σ*), ε(s^t) and ε*(s^t) can be correlated, and Δ is the first-difference operator.10 This VECM representation implies that deviations of today's differences of TFP from their mean values depend not only on lags of home and foreign differences of TFP but also on a function of the ratio of lagged home and foreign TFP, A(s^{t-1})/[ξ A*(s^{t-1})^γ]. Thus, if the ratio A(s^{t-1})/[ξ A*(s^{t-1})^γ] is larger than its long-run value, then κ < 0 and κ* > 0 imply that Δlog A(s^t) falls and Δlog A*(s^t) rises, driving both series toward a new common level. The VECM representation also implies that Δlog A(s^t), Δlog A*(s^t), and log A(s^{t-1}) - γ log A*(s^{t-1}) - log ξ are stationary processes.

3.3. Market clearing

The model is closed with the following market clearing conditions in the final goods markets,

C(s^t) + X(s^t) = Y(s^t),   (10)

and in the international bond market,

D(s^t) + D^*(s^t) = 0.   (11)

3.4. Equilibrium

This subsection first defines equilibrium and then describes the set of conditions that characterize it.

3.4.1. Equilibrium definition

Given the law of motion for TFP shocks defined by (9), an equilibrium for this economy is a set of allocations for home consumers, C(s^t), L(s^t), X(s^t), K(s^t), and D(s^t), and foreign consumers, C*(s^t), L*(s^t), X*(s^t), K*(s^t), and D*(s^t); allocations for home and foreign intermediate goods producers, Y_H(s^t), Y_H*(s^t), Y_F(s^t), and Y_F*(s^t); allocations for home and foreign final goods producers, Y(s^t) and Y*(s^t); intermediate goods prices P_H(s^t), P_H*(s^t), P_F(s^t), and P_F*(s^t); final goods prices P(s^t) and P*(s^t); rental prices of labor and capital in the home and foreign country, W(s^t), R(s^t), W*(s^t), and R*(s^t); and the price of the bond, Q(s^t), such that: (i) given prices, households' allocations solve the households' problem; (ii) given prices, intermediate goods producers' allocations solve the intermediate goods producers' problem; (iii) given prices, final goods producers' allocations solve the final goods producers' problem; and (iv) markets clear.

3.4.2. Equilibrium conditions

At this point, it is useful to define the following relative prices: \tilde{P}_H(s^t) = P_H(s^t)/P(s^t), \tilde{P}_F(s^t) = P_F(s^t)/P^*(s^t), and RER(s^t) = P^*(s^t)/P(s^t). Note that \tilde{P}_H(s^t) is the price of home intermediate goods in terms of home final goods, \tilde{P}_F(s^t) is the price of foreign intermediate goods in terms of foreign final goods, which appears in the foreign country's budget constraint, and RER(s^t) is the RER between the home and foreign countries. The law of one price (LOP) holds; hence, P_H(s^t) = P_H*(s^t) and P_F(s^t) = P_F*(s^t). The equilibrium conditions include the first order conditions of households and of intermediate and final goods producers in both countries, as well as the relevant laws of motion, production functions, and market clearing conditions. Only the first order conditions for the home country are presented, because of symmetry.

10 Here we restrict ourselves to a VECM with zero lags. This assumption is motivated by the empirical results presented in Section 4, where no lags are significant.
The marginal utility of consumption and the labor supply are given by

U_C(s^t) = \lambda(s^t),   (12)

-\frac{U_L(s^t)}{U_C(s^t)} = W(s^t),   (13)

where U_x denotes the partial derivative of the utility function U with respect to variable x. The first order condition with respect to capital delivers an intertemporal condition that relates the marginal rate of consumption to the rental rate of capital and the depreciation rate:

\lambda(s^t) = \beta \sum_{s^{t+1}} \pi(s^{t+1}|s^t) \lambda(s^{t+1}) [R(s^{t+1}) + 1 - \delta],   (14)

where \pi(s^{t+1}|s^t) = \pi(s^{t+1})/\pi(s^t) is the conditional probability of s^{t+1} given s^t. The law of motion of home capital is

K(s^t) = (1-\delta)K(s^{t-1}) + X(s^t);   (15)

the analogous expressions for the foreign country are omitted. The optimal choice by households of the home country delivers the following expression for the price of the riskless bond:

Q(s^t) = \beta \sum_{s^{t+1}} \pi(s^{t+1}|s^t) \frac{\lambda(s^{t+1})}{\lambda(s^t)} \frac{\tilde{P}_H(s^{t+1})}{\tilde{P}_H(s^t)} - \Phi'[D(s^t)].   (16)

The next condition equates the price of the riskless bond to the cost of adjusting bonds:

\sum_{s^{t+1}} \pi(s^{t+1}|s^t) \left[ \frac{\lambda^*(s^{t+1})}{\lambda^*(s^t)} \frac{\tilde{P}_H(s^{t+1})}{\tilde{P}_H(s^t)} \frac{RER(s^t)}{RER(s^{t+1})} - \frac{\lambda(s^{t+1})}{\lambda(s^t)} \frac{\tilde{P}_H(s^{t+1})}{\tilde{P}_H(s^t)} \right] = \frac{\Phi'[D(s^t)]}{\beta}.   (17)
From the intermediate goods producers' maximization problems, labor and capital are paid their marginal products, where the rental rate of capital and the real wage are expressed in terms of the final good in each country:

W(s^t) = (1-\alpha) \tilde{P}_H(s^t) A(s^t)^{1-\alpha} K(s^{t-1})^{\alpha} L(s^t)^{-\alpha},   (18)

R(s^t) = \alpha \tilde{P}_H(s^t) A(s^t)^{1-\alpha} K(s^{t-1})^{\alpha-1} L(s^t)^{1-\alpha}.   (19)

From the final goods producers' maximization problem, the demands for intermediate goods depend on their relative prices:

Y_H(s^t) = \omega \tilde{P}_H(s^t)^{-\theta} Y(s^t),   (20)

Y_F(s^t) = (1-\omega) [\tilde{P}_F(s^t) RER(s^t)]^{-\theta} Y(s^t).   (21)

Using the production function of the final good,

Y(s^t) = [\omega^{1/\theta} Y_H(s^t)^{(\theta-1)/\theta} + (1-\omega)^{1/\theta} Y_F(s^t)^{(\theta-1)/\theta}]^{\theta/(\theta-1)},   (22)

and the demand equations for intermediate goods just described, the final goods deflator in the home country is

P(s^t) = [\omega P_H(s^t)^{1-\theta} + (1-\omega) P_F(s^t)^{1-\theta}]^{1/(1-\theta)}.   (23)

Hence, given that the LOP holds, the RER is equal to

RER(s^t) = \frac{P^*(s^t)}{P(s^t)} = \frac{[\omega P_F(s^t)^{1-\theta} + (1-\omega) P_H(s^t)^{1-\theta}]^{1/(1-\theta)}}{[\omega P_H(s^t)^{1-\theta} + (1-\omega) P_F(s^t)^{1-\theta}]^{1/(1-\theta)}}.   (24)
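A quick numerical check of equations (23) and (24), with purely illustrative prices: when ω = 1/2 the home and foreign deflators coincide and the RER is identically one, which previews the point made next about home bias being the only source of RER fluctuations.

    import numpy as np

    def price_index(p_own, p_other, omega, theta):
        # Final-goods deflator, equation (23), with weight omega on the own-country good.
        return (omega * p_own**(1 - theta) + (1 - omega) * p_other**(1 - theta))**(1 / (1 - theta))

    def rer(p_h, p_f, omega, theta):
        # Equation (24): the foreign deflator mirrors the home one with the weights swapped.
        return price_index(p_f, p_h, omega, theta) / price_index(p_h, p_f, omega, theta)

    p_h, p_f, theta = 1.0, 1.2, 0.85
    print(rer(p_h, p_f, 0.5, theta))   # exactly 1: no home bias, no RER movement
    print(rer(p_h, p_f, 0.9, theta))   # above 1: relative intermediate prices move the RER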
Note that the only source of RER fluctuations is the presence of home bias (ω > 1/2). Also, intermediate goods, final goods, and bond markets clear as in Eqs. (8), (10), and (11). Finally, the law of motion of the level of debt,

\tilde{P}_H(s^t) Q(s^t) D(s^t) = \tilde{P}_H(s^t) Y_H^*(s^t) - \tilde{P}_F(s^t) RER(s^t) Y_F(s^t) + \tilde{P}_H(s^t) D(s^{t-1}) - \tilde{P}_H(s^t) \Phi[D(s^t)],   (25)
is obtained using (2) and the fact that intermediate and final goods producers at home make zero profits. Finally, the productivity shocks follow the VECM described in Section 3.2.3.

3.5. Balanced growth and the restriction on the cointegrating vector

Eqs. (8), (10), and (11), together with (12)–(25), their foreign-country counterparts, and the VECM process for TFP, characterize the equilibrium of this model. Since both log A(s^t) and log A*(s^t) are integrated processes, it is necessary to normalize the equilibrium conditions in order to obtain a stationary system that is more amenable to study. Following King et al. (1988), home-country variables that have a trend are normalized by the lagged domestic level of TFP, A(s^{t-1}), and foreign-country variables that have a trend are normalized by the lagged foreign level of TFP, A*(s^{t-1}). The online appendix presents the full set of normalized equilibrium conditions.

The model requires some restrictions on preferences, production functions, and the law of motion of productivity shocks. The restrictions on preferences and technology of King et al. (1988) are sufficient for the existence of balanced growth in a closed economy real business cycle (RBC) model. However, a two-country model requires an additional restriction on the cointegrating vector to ensure balanced growth. In particular, the ratio A(s^{t-1})/A*(s^{t-1}) must be stationary. To understand why the international dimension of the model requires this additional restriction, consider the normalized demand for imported foreign-produced intermediate goods by the home country:

\hat{Y}_F(s^t) = (1-\omega) [\tilde{P}_F(s^t) RER(s^t)]^{-\theta} \hat{Y}(s^t) \frac{A(s^{t-1})}{A^*(s^{t-1})},   (26)

where \hat{Y}_F(s^t) = Y_F(s^t)/A^*(s^{t-1}) while \hat{Y}(s^t) = Y(s^t)/A(s^{t-1}). Since \tilde{P}_F(s^t) and RER(s^t) are stationary, if the ratio between A(s^{t-1}) and A*(s^{t-1}) were non-stationary, the ratio between \hat{Y}_F(s^t) and \hat{Y}(s^t) would also be non-stationary and balanced growth would not exist. A similar argument holds for the other normalized equilibrium conditions. The VECM implies that the ratio between A(s^{t-1}) and A*(s^{t-1})^γ is stationary. Therefore, a sufficient condition for balanced growth is that the parameter γ equals one or, equivalently, that the cointegrating vector equals (1, -1).
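The balanced-growth logic can be illustrated with a minimal simulation of the law of motion (9) under γ = 1 and symmetric speeds of adjustment. The drift value is a stand-in, in the spirit of the estimates reported in Section 4; the point is only that each log TFP series wanders while their difference mean-reverts.

    import numpy as np

    rng = np.random.default_rng(0)
    T, kappa, c = 20000, -0.007, 0.0036     # kappa: speed of adjustment; c: common drift (assumed)
    sig, sig_star = 0.0105, 0.0088
    a = np.zeros(T)
    a_star = np.zeros(T)
    for t in range(1, T):
        gap = a[t - 1] - a_star[t - 1]      # log A - log A*, with log xi normalized to zero
        a[t] = a[t - 1] + c + kappa * gap + sig * rng.standard_normal()
        a_star[t] = a_star[t - 1] + c - kappa * gap + sig_star * rng.standard_normal()

    print(np.std(np.diff(a)))               # growth rates are stationary
    print(np.std(a - a_star))               # the TFP differential stays bounded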
4. Estimation of the VECM

This section describes the constructed TFP series for the U.S. and the R.W. and presents two important results. First, the assumption that the TFP processes are cointegrated of order C(1,1) cannot be rejected in the data. By the Granger representation theorem, this implies that the VECM specification is valid. Second, the restriction imposed by balanced growth, i.e., that the parameter γ is equal to one, also cannot be rejected in the data. To conclude, the VECM is estimated, and the parameter values are used to simulate the model in Section 5.

4.1. Data

For the U.S., quarterly real GDP data are obtained from the Bureau of Economic Analysis, and hours and employment data from the Organization for Economic Cooperation and Development (OECD). Real capital stock data are also obtained from the OECD database. The R.W. aggregate is the Euro area plus the United Kingdom, Canada, Japan, and Australia. This group accounts for about 50 percent of the basket of currencies that the Federal Reserve uses to construct the RER for the U.S. dollar. For all countries except the Euro area, nominal GDP, hours, employment, and the real capital stock are obtained from the OECD. For the Euro area, the data source for nominal GDP, employment, and the real capital stock is the area wide model (AWM). Hours are as reported in Christoffel et al. (2009). The sample period begins in 1973:1 and ends in 2006:4, which is when the hours series for the Euro area ends. Ideally, one would want to include additional countries that represent an important and increasing share of trade with the U.S., such as China and other emerging countries, but long quarterly series are not available.

Nominal GDPs of the R.W. are aggregated using PPP nominal exchange rates to convert each national nominal output to current U.S. dollars, and then deflated using the GDP deflator of the U.S. (base year 2000) to obtain the aggregate R.W. real GDP. The aggregate R.W. hours series is constructed by aggregating the number of employees times hours per employee for each country. Real capital stock series in domestic currency (base year 2000) are aggregated using the base year 2000 PPP RERs. Then, the TFP processes are constructed as follows:

\log A(s^t) = \frac{\log Y(s^t) - (1-\alpha)\log L(s^t) - \alpha \log K(s^{t-1})}{1-\alpha},   (27)

\log A^*(s^t) = \frac{\log Y^*(s^t) - (1-\alpha)\log L^*(s^t) - \alpha \log K^*(s^{t-1})}{1-\alpha},   (28)

where α is the capital share of output and takes a value of 0.36. Backus et al. (1992) and Heathcote and Perri (2002, 2009) use a similar approach when constructing TFP series for the U.S. and a R.W. aggregate but ignore capital dynamics. Given the focus on the long-run properties of the model, the capital stock is important for the analysis.

4.2. Integration and cointegration properties

This section presents evidence supporting the assumption that the TFP processes for the U.S. and the R.W. are cointegrated of order C(1,1). After providing empirical support for the presence of one unit root in each univariate process, the Johansen (1991) procedure is applied to test for cointegration. Both the trace and the maximum eigenvalue methods support the existence of a cointegrating vector.

Univariate analysis of the TFP processes for the U.S. and the R.W. strongly indicates that both series can be characterized by unit root processes with drift.
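A sketch of the TFP construction (27)–(28) and of the level/first-difference unit root checks, using only the ADF test from statsmodels; `y`, `l`, and `k_lag` are hypothetical quarterly series of log real GDP, log hours, and lagged log capital.

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    ALPHA = 0.36  # capital share of output

    def log_tfp(y, l, k_lag, alpha=ALPHA):
        # Equation (27): log A = [log Y - (1 - alpha) log L - alpha log K(-1)] / (1 - alpha).
        return (y - (1 - alpha) * l - alpha * k_lag) / (1 - alpha)

    def unit_root_report(series):
        # Constant and trend in levels, constant only in first differences.
        stat_level, p_level, *_ = adfuller(series, regression="ct")
        stat_diff, p_diff, *_ = adfuller(np.diff(series), regression="c")
        return {"level": (stat_level, p_level), "first_diff": (stat_diff, p_diff)}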
Table 2
Unit root tests.

Method   log TFP U.S., level   log TFP U.S., first diff.   log TFP R.W., level   log TFP R.W., first diff.
ADF      -2.96*                -11.57                      -1.25*                -9.35
DF-GLS   -1.94*                -11.18                      -0.21*                -5.05
PT-GLS   23.74*                1.61                        123.18*               3.05
MZa      -4.96*                -84.20                      -0.37*                -14.70
MZt      -1.50*                -6.48                       -0.23*                -2.48
MSB      0.30*                 0.07                        0.63*                 0.16**

Notes: ADF stands for the augmented Dickey–Fuller test. DF-GLS stands for the Elliott–Rothenberg–Stock detrended-residuals test statistic. PT-GLS stands for the Elliott–Rothenberg–Stock point-optimal test statistic. MZa, MZt, and MSB stand for the class of modified tests analyzed in Ng and Perron (2001). For the ADF and DF-GLS tests, t-statistics are shown. For PT-GLS, P-statistics are presented, while for MZa, MZt, and MSB the Ng–Perron test statistics are shown. * denotes that the null hypothesis of a unit root is not rejected at the 5 percent level. ** denotes that the null hypothesis of a unit root is not rejected at the 5 percent level but rejected at the 10 percent level.
Table 3
Cointegration statistics II: Johansen's test.

Number of vectors   Eigenvalue   Trace   p-Value   Max-Eigenvalue   p-Value
0                   0.14         24.93   0.001     21.52            0.003
1                   0.02         3.86    0.07      3.84             0.07

Notes: This table reports cointegration tests based on the eigenvalue and trace statistics of the Johansen maximum likelihood procedure.
Table 4
Likelihood ratio tests.

Restriction   Likelihood value   Degrees of freedom   p-Value
None          992.88             –                    –
γ = 1         992.88             1                    0.96
κ = -κ*       992.3              2                    0.57

Notes: This table reports likelihood ratio tests for different restrictions on the parameters of the VECM. The first restriction relates to the value of the cointegrating vector and the second to the symmetry of the speeds of convergence.
Table 2 presents results for the U.S. TFP process using the following commonly applied unit root tests: the augmented Dickey–Fuller test; the DF-GLS and the optimal point statistic (PT-GLS), both of Elliott et al. (1996); and the modified MZa, MZt, and MSB of Ng and Perron (2001). The lag length is chosen using the Schwarz information criterion. In each case a constant and a trend are included in the specification. Table 2 also presents the same unit root test results for the R.W. TFP process. None of the test statistics are close to rejecting the null hypothesis of a unit root at the 5 percent critical value. Using the same statistics, unit root tests indicate that the first differences of the TFP processes for the U.S. and the R.W. are stationary. For the U.S., all the tests reject the null hypothesis of a unit root at the 5 percent critical value. For the R.W., all the tests reject the null hypothesis of a unit root at the 5 percent critical value except the MSB test, which rejects it at the 10 percent value.

Turning to the question of cointegration, if log A(s^t) and log A*(s^t) share one common stochastic trend (balanced growth), an estimated VAR must possess a single eigenvalue equal to one while all other eigenvalues are less than one. To check this possibility, an unrestricted VAR with one lag and a deterministic trend is estimated for the two-variable system [log A(s^t), log A*(s^t)], where the number of lags was chosen using the Schwarz information criterion. The highest eigenvalue equals 0.99, while the second highest is 0.95. Table 3 reports results from the unrestricted cointegration rank test using the trace and the maximum eigenvalue methods as defined by Johansen (1991). The cointegration test assumes a linear trend and a constant in the cointegrating vector. The data strongly support a single eigenvalue.

4.3. The VECM model

To conclude the empirical analysis of the joint behavior of TFP across countries, two additional important empirical results are discussed. First, the null hypothesis that γ = 1 cannot be rejected by the data using a likelihood ratio test. This result is very important because a cointegrating vector of (1, -1) implies that the balanced growth path hypothesis cannot be rejected. Next, a likelihood ratio test provides evidence supporting the null hypothesis that the speed-of-adjustment coefficients on the cointegrating relationship are equal and of opposite sign, i.e., κ = -κ*. Table 4 reports the outcome of the two likelihood ratio tests. The last row presents the joint test of the two restrictions.
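For readers who want to reproduce the flavor of these tests, the sketch below runs the Johansen procedure and fits a zero-lag VECM with statsmodels on a toy cointegrated pair. It is not the paper's estimation code; the likelihood ratio tests of γ = 1 and κ = -κ* are not implemented, and the deterministic specification is simplified to a constant outside the cointegrating relation.

    import numpy as np
    from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

    # Toy cointegrated pair standing in for [log A_US, log A_RW].
    rng = np.random.default_rng(1)
    common = np.cumsum(0.01 * rng.standard_normal(200))
    tfp = np.column_stack([common + 0.005 * rng.standard_normal(200),
                           common + 0.005 * rng.standard_normal(200)])

    # Johansen trace and maximum-eigenvalue tests (constant and linear trend).
    joh = coint_johansen(tfp, det_order=1, k_ar_diff=0)
    print(joh.lr1, joh.cvt)    # trace statistics and critical values
    print(joh.lr2, joh.cvm)    # max-eigenvalue statistics and critical values

    # Zero-lag VECM with one cointegrating vector, as selected in the text.
    res = VECM(tfp, k_ar_diff=0, coint_rank=1, deterministic="co").fit()
    print(res.alpha)           # speeds of adjustment (kappa, kappa*)
    print(res.beta)            # cointegrating vector; balanced growth implies (1, -1)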
Table 5
VECM model.

κ         c        c*
-0.007*   0.001    0.006
(-4.19)   (1.76)   (12.46)

Notes: t-statistics in parentheses. * denotes significance at the 5 percent level and ** denotes significance at the 10 percent level.
The estimated restricted model delivers the parameter estimates reported in Table 5. The restricted VECM includes zero lags. It is worth noting that the coefficient on the speed of adjustment, while significant, is quantitatively small, so the TFP processes converge slowly over time. A low speed-of-adjustment parameter (κ) implies slow spillover of TFP shocks across countries. This feature is key to explaining the results of the present paper. The constant terms c and c* have different point estimates. However, this difference does not imply that the growth rates of the two TFP processes are different. Indeed, because the cointegrating vector is (1, -1), they must grow at the same rate along the balanced growth path. Given these parameter estimates, the implied long-run growth rate of the TFP processes is 1.44 percent (in annualized terms). The estimated standard deviations of the innovations, σ and σ*, are 0.0105 and 0.0088, respectively. In the model simulations, the correlation between ε(s^t) and ε*(s^t) is set to zero, since the null hypothesis of no correlation could not be rejected in the data.

5. Results

This section describes the results, starting with the parametrization. It then analyzes how the model matches the volatility of the RER and the intuition behind the results, and finally illustrates how the model explains the increase in RER volatility observed in the data during the last two decades.

5.1. Parameterization

The baseline parameterization closely follows that in Heathcote and Perri (2002). The discount factor β is set equal to 0.99, which implies an annual rate of return on capital of 4 percent. The consumption share, τ, is set to 0.34; the coefficient of risk aversion, σ, is set to 2 as in Backus et al. (1992). The cost of bond holdings, φ, is set to one basis point. Parameters on technology are standard in the literature: the depreciation rate, δ, is set to a quarterly value of 0.025; the capital share of output is set to α = 0.36; and home bias for domestic intermediate goods is set to ω = 0.9, which implies the observed import/output ratio in the steady state. Two possible values for the elasticity of substitution between intermediate goods are considered, θ = 0.85 and 0.62. The first value is based on Heathcote and Perri (2002); the second value is a bit higher than the lower bound of 0.5 considered by Corsetti et al. (2008b).

The VECM is calibrated as described in Table 5. In most cases, the results of this calibration are compared to the ones obtained when using stationary TFP shocks. For the stationary case, the parameters of the VAR(1) process for TFP shocks are calibrated as in Heathcote and Perri (2002):

a_t = \rho a_{t-1} + \nu a^*_{t-1} + \varepsilon_t   (29)

and

a^*_t = \rho a^*_{t-1} + \nu a_{t-1} + \varepsilon^*_t,   (30)

where a_t = log A_t, a*_t = log A*_t, ρ = 0.97, ν = 0.025, Var(ε_t) = Var(ε*_t) = (0.0073)^2, and corr(ε_t, ε*_t) = 0.29.
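A back-of-the-envelope comparison of the two calibrations helps anticipate the results. Using the mapping introduced in Section 5.3.2 (ρ = 1 + κ, ν = -κ), the VECM estimates imply a more persistent TFP differential and smaller spillovers than the stationary calibration:

    kappa = -0.007                       # estimated speed of adjustment (Table 5)
    rho_vecm, nu_vecm = 1 + kappa, -kappa
    rho_hp, nu_hp = 0.97, 0.025          # stationary calibration of Heathcote and Perri (2002)

    print("world persistence:   ", rho_vecm + nu_vecm, "vs", rho_hp + nu_hp)   # 1.000 vs 0.995
    print("relative persistence:", rho_vecm - nu_vecm, "vs", rho_hp - nu_hp)   # 0.986 vs 0.945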
5.2. Matching RER volatility

Since the model is non-stationary, it is convenient to compute HP-filtered moments by stochastic simulation. Hence, series of TFP shocks are drawn based on the empirical estimates and then used to simulate the model. The normalized model is solved by taking a log-linear approximation around the steady state. To avoid dependence on initial values, the initial 1000 periods of each simulation are discarded, and the remaining 125 periods are used to compute statistics. The HP filter is applied to the relevant series from the model (output, consumption, investment, employment, and the RER), and second moments are computed from the filtered series. This procedure is repeated 5000 times. Table 6 reports the average across simulations.

The first and second rows of Table 6 report the results of the economy with cointegrated TFP and high and low values for the trade elasticity, θ, respectively. The next two rows show the results for the stationary model. Overall, models with cointegrated shocks generate higher relative volatility of the RER with respect to output than models with stationary shocks. Note that with high trade elasticity and cointegrated shocks, the relative volatility of the RER more than doubles with respect to the model with stationary shocks (1.31 versus 0.75). Hence the model improves from explaining less than 25 percent of the observed relative volatility of the RER to explaining more than 40 percent.
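The bookkeeping of this moment computation can be summarized in a short sketch. `simulate_model` is a hypothetical stand-in for the solved log-linear model returning simulated series; only the burn-in, HP-filtering, and averaging steps are shown.

    import numpy as np
    from statsmodels.tsa.filters.hp_filter import hpfilter

    def simulated_relative_rer_volatility(simulate_model, n_rep=5000, burn=1000, keep=125, lamb=1600):
        ratios = []
        for _ in range(n_rep):
            sim = simulate_model(burn + keep)       # e.g., a dict {"y": ..., "rer": ...}
            y_cycle = hpfilter(sim["y"][burn:], lamb=lamb)[0]
            q_cycle = hpfilter(sim["rer"][burn:], lamb=lamb)[0]
            ratios.append(np.std(q_cycle) / np.std(y_cycle))
        return np.mean(ratios)                      # average across replications, as in Table 6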
Table 6
Full sample results.

                   SD(Y)   RSD(C)   RSD(X)   RSD(N)   RSD(RER)   ρ(RER)
Data               1.58    0.76     4.55     0.75     3.06       0.82
Coint., θ = 0.85   0.93    0.65     2.31     0.29     1.31       0.72
Coint., θ = 0.62   0.85    0.66     2.46     0.27     3.13       0.70
Stat., θ = 0.85    1.19    0.52     2.53     0.32     0.75       0.77
Stat., θ = 0.62    1.12    0.54     2.51     0.31     1.41       0.75

Correlations
                   (Y,N)   (Y,C)    (Y,X)    (RER,C/C*)
Data               0.87    0.84     0.91     -0.04
Coint., θ = 0.85   0.94    0.95     0.97     0.95
Coint., θ = 0.62   0.97    0.98     0.98     0.97
Stat., θ = 0.85    0.97    0.93     0.97     0.99
Stat., θ = 0.62    0.97    0.93     0.97     0.99

                   (Y,Y*)  (C,C*)   (X,X*)   (N,N*)
Data               0.44    0.36     0.28     0.40
Coint., θ = 0.85   0.11    0.45     0.21     0.29
Coint., θ = 0.62   0.48    0.69     0.15     0.13
Stat., θ = 0.85    0.18    0.70     0.18     0.21
Stat., θ = 0.62    0.33    0.81     0.05     0.05

Notes: SD denotes the standard deviation of the HP-filtered series. RSD denotes the standard deviation of the HP-filtered series relative to that of HP-filtered output. ρ denotes the first autocorrelation. * denotes the R.W.
As expected, for lower values of the trade elasticity, the relative volatility of the RER increases under both the stationary and the cointegrated models. The striking finding is that the model with cointegrated shocks and an elasticity equal to 0.62 is able to closely match the relative volatility of the RER (3.13 in the model versus 3.06 in the data), while the model with stationary shocks and the same elasticity produces a relative RER volatility of just 1.41 (which represents only about 40 percent of the fluctuations in the data).

Interestingly, even though the model with cointegrated shocks improves significantly in matching RER volatility, it does not affect the fit of other unconditional moments. Both the stationary and the cointegrated shocks models display very similar volatilities of consumption, hours, and investment relative to output. Also, both models display similar cross-correlations of consumption, hours, and investment with output and similar autocorrelations of the RER, and neither of the models can explain the Backus and Smith (1993) puzzle. All models produce similar cross-correlations between the U.S. and the R.W. for output and consumption, but the model with cointegrated shocks and θ = 0.62 can better explain the international co-movement of investment and hours. For the case of θ = 0.62, the stationary model produces a cross-correlation between domestic and foreign output of 0.33, while with cointegrated shocks it is 0.48 and the observed correlation is 0.44. For the case of consumption, the stationary and cointegrated models produce cross-country correlations of 0.81 and 0.69, respectively, while the observed correlation is 0.36. For investment, the numbers are 0.05, 0.15, and 0.28, respectively. Finally, for hours, the stationary and cointegrated shock models produce correlations of 0.05 and 0.13, respectively, while the observed correlation is 0.40. Unfortunately, the model with cointegrated shocks, like the model with stationary shocks, cannot solve the "quantity puzzle": in the data, output is more correlated than consumption across countries, while, in the model, consumption is more correlated than output. In any case, the cointegrated model with θ = 0.62 does better than the other versions.

Although not reported in the tables, the model was also simulated with two alternative asset market structures: complete markets and financial autarky. In the first case, agents have access to a full set of state-contingent bonds that pay one unit of the domestic intermediate good in every state of the world. In the second case, the cost of holding bonds, Φ'[D(s^t)], is calibrated to a very large number such that intertemporal trade never occurs. As expected (see Heathcote and Perri, 2002), the version of the model with complete markets generates lower relative volatility of the RER (falling from 1.31 to 0.90 when θ = 0.85, and from 3.13 to 1.01 when θ = 0.62), while the version with financial autarky delivers a larger relative volatility of the RER (increasing from 1.31 to 1.51 when θ = 0.85, and from 3.13 to 4.14 when θ = 0.62). Therefore, the presence of incomplete markets helps the model with cointegrated shocks increase the relative volatility of the RER (at least with respect to the complete markets case).

Finally, it is important to recognize that RERs in the data differ in two important ways from the theory. First, existing empirical evidence shows deviations from the LOP, which the model ignores. With respect to this point, Crucini and Shintani (2008) argue that LOP deviations lack persistence at the microeconomic level, which may help justify abstracting from LOP deviations in the context of the present study. As a rough check, a version of the model with Calvo-type sticky prices and domestic currency pricing was simulated, and the results do not change significantly. In any case, it is important to mention that ignoring the deviations makes it more difficult for the model to match the data: deviations from the LOP are one important source of the fluctuations in the data.
Second, the measure of the RER constructed in this paper uses trade weights applied to CPIs. CPIs include non-traded goods, which are excluded from the model. Thus, there is the issue of whether the CPI-based RER is the right empirical counterpart to the model-based RER: there is a gap between theory and measurement. To try to fill the gap, two alternative measures of the RER were constructed. The first measure uses PPIs instead of CPIs. The idea, going back to Engel (1999), is that CPIs have a larger share of non-traded goods in the basket than PPIs, and hence the PPI is a better measure of the tradable goods price index. The second measure uses export deflators as the relevant tradable goods price indices, since they measure the prices of goods that are actually shipped internationally.11 The volatility of the PPI- and CPI-based RERs is similar, which reinforces Engel's (1999) result. In particular, for the period 1973:1 to 2006:4, the standard deviation of the PPI-based RER is 2.96 times larger than the standard deviation of output, while the standard deviation of the CPI-based RER is 3.06 times larger than the standard deviation of output (as shown in Table 6). For the same period, the standard deviation of the export deflator-based RER is 2.34 times larger than the standard deviation of output. Hence, using any of these alternative measures does not change the main message of the paper.

11 We construct the export deflator-based real exchange rate series between 1973 and 2009. We obtain export deflator series for Australia, Canada, Japan, the United Kingdom, and the U.S. from national data sources, while data for the Euro area come from the AWM.

5.3. Intuition

In the model, four key parameters drive the behavior of the volatility of the RER with respect to output: (i) the elasticity of substitution between home and foreign goods, (ii) the fraction of intermediate goods in the production of the final good (or "home bias"), (iii) the persistence of the TFP process, and (iv) the persistence of the differential of TFP processes across countries. This subsection studies how these four elements shape the results. The first two components are well known in the IRBC literature, so only a brief discussion is presented here. The second two components are new to this paper, so more space is devoted to them.

5.3.1. The role of the elasticity of substitution and home bias

It is important to note that since the model only includes tradable intermediate goods, the RER relates to input prices as follows:

RER(s^t) = \frac{[\omega P_F(s^t)^{1-\theta} + (1-\omega) P_H(s^t)^{1-\theta}]^{1/(1-\theta)}}{[\omega P_H(s^t)^{1-\theta} + (1-\omega) P_F(s^t)^{1-\theta}]^{1/(1-\theta)}}.   (31)
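A small numerical look at equation (31): for small relative-price movements the log RER is approximately (2ω - 1)(log P_F - log P_H), so the pass-through from relative intermediate-goods prices to the RER scales with home bias. The values below are purely illustrative.

    import numpy as np

    def log_rer(p_h, p_f, omega, theta):
        num = (omega * p_f**(1 - theta) + (1 - omega) * p_h**(1 - theta))**(1 / (1 - theta))
        den = (omega * p_h**(1 - theta) + (1 - omega) * p_f**(1 - theta))**(1 / (1 - theta))
        return np.log(num / den)

    gap = 0.02                                       # a 2 percent relative-price gap
    for omega in (0.6, 0.75, 0.9):
        exact = log_rer(1.0, np.exp(gap), omega, theta=0.85)
        print(omega, exact, (2 * omega - 1) * gap)   # exact value vs. linear approximation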
Thus, for a given volatility of the relative prices of intermediate goods, the model needs a low elasticity of substitution (low θ) and/or high home bias (high ω) in order to match the observed RER volatility. However, this is largely an artifact of ignoring non-traded intermediate goods. An alternative would be to introduce a non-traded intermediate good as a substitute for the asymmetric aggregate among traded goods. However, there are challenges associated with measuring TFP in the tradable and non-traded sectors of each of the two countries at quarterly frequency. Since one of the key ingredients of this paper is, indeed, to properly measure TFP, it is not feasible to follow this route here.

In the model, it is difficult to obtain one analytical expression linking the volatility of the RER to output. However, by combining the two demand equations for home and foreign intermediate goods, (20) and (21), normalizing by TFP in each country, and log-linearizing, the following expression arises:

\hat{y}_{H,t} - \hat{y}_{F,t} = \frac{\theta}{2\omega-1}\, rer_t - (a_{t-1} - a^*_{t-1}).   (32)

In words: for a given volatility of relative quantities and relative TFP, lowering θ or increasing ω implies higher RER fluctuations (see also Backus et al., 1994). Of course, this is not a full argument, since changes in θ and/or ω will affect all variables in general equilibrium. The full general equilibrium effects are elaborated in Fig. 2, which computes the ratio of the standard deviation of the RER to the standard deviation of output for different combinations of θ and ω. As expected, increasing home bias (ω) or decreasing the elasticity of substitution (θ) leads to higher relative volatility of the RER.

[Fig. 2 about here: standard deviation of the RER with respect to the standard deviation of output when θ and ω change.]

5.3.2. The role of the estimated coefficients of TFP processes

The key fact about productivity in Baxter and Crucini (1995), at least in terms of the relevance of the asset market restrictions they explored, is not the persistence of the TFP processes but the persistence of the productivity gap. The reason is clear from their Hicksian decompositions: home and foreign wealth effects, due to incomplete markets, arise when productivity diverges across countries in a persistent fashion. The joint process of TFP shocks across countries can be written as follows:

a_t = \rho a_{t-1} + \nu a^*_{t-1} + \varepsilon_t,   (33)

a^*_t = \rho a^*_{t-1} + \nu a_{t-1} + \varepsilon^*_t,   (34)
as in Backus et al. (1994) and Heathcote and Perri (2002). The level of TFP in a country is related to its own lag with coefficient ρ, and to the lag of the other country's TFP with coefficient ν. The coefficient ν is also known as the TFP spillover from one country to the other. Note that when ρ = (1 + κ) and ν = -κ, this VAR equals the VECM with cointegrating vector (1, -1) and symmetric convergence speed to the cointegrating relationship that is considered throughout this paper.12 Given that the two countries are of the same size, define world TFP as a^w_t = a_t + a*_t, and relative TFP as a^r_t = a_t - a*_t. These processes evolve as

a^w_t = (\rho + \nu) a^w_{t-1} + \varepsilon^w_t,   (35)

a^r_t = (\rho - \nu) a^r_{t-1} + \varepsilon^r_t,   (36)

where ε^w_t = ε_t + ε*_t and ε^r_t = ε_t - ε*_t. The key parameters in the model are (ρ + ν), the persistence of the world TFP shock, and (ρ - ν), the persistence of relative TFP. In the VECM, (ρ + ν) = 1 and (ρ - ν) = 1 + 2κ. Therefore, given a common unit root in the system, a slower convergence to the cointegrating relationship (a smaller κ in absolute value) implies a higher persistence of relative TFP (a higher ρ - ν). Fig. 3 shows the relative standard deviation of the RER as a function of (ρ + ν) and (ρ - ν).

12 We are thankful to associate editor Mario Crucini for suggesting this presentation.

[Fig. 3 about here: standard deviation of the RER with respect to the standard deviation of output when ρ and ν change.]

The mechanism works as follows. When a positive TFP shock hits the home economy, output, consumption, investment, and hours worked increase in the home economy, and the real exchange rate depreciates. In the foreign country, output, investment, and hours decrease and consumption increases, because foreign households anticipate the arrival of the technology improvement in the future and react to the wealth effect. Increasing the persistence of the world TFP shock increases the volatility of output, but not to a large extent. However, for a given level of persistence of the world TFP shock, an increase in the persistence of the TFP differential reduces the standard deviation of output. Why? On the one hand, it delays the arrival of the TFP improvement in the foreign country and eases the wealth effect for the foreign economy. On the other hand, it increases the persistence of the TFP shock in the home country, intensifying the wealth effect. Hence, labor, investment, and output respond less strongly in both countries, reducing output volatility.

What happens to the volatility of the RER? The intuition is clear when changes in ρ and ν are studied in isolation. When ρ increases but ν remains fixed, the wealth effect leads home households to demand more consumption goods. In order to produce more final goods, the home country demands more intermediate goods from the foreign country, which, provided that the elasticity of substitution is low enough, leads to larger RER depreciations. Hence, the volatility of the RER increases. When ρ is fixed but ν increases, the exact opposite occurs. In this case the wealth effect hits the foreign-country
households. They know that productivity will arrive in their country sooner, and they demand more consumption goods than they would if spillovers were slower. Thus, the demand for home intermediate goods increases because foreign final goods producers substitute away from their own intermediate goods. As a consequence, the price of home goods increases relative to that of foreign goods, and the RER depreciates less than in a model with slower (or no) spillovers.

Therefore, an increase in the persistence of world TFP (ρ + ν), holding constant the persistence of relative TFP (ρ - ν), means that both ρ and ν increase. As discussed, these two coefficients have opposite effects on the volatility of the RER and, on net, increasing (ρ + ν) leads to a decline in the relative volatility of the RER. On the other hand, an increase in the persistence of the relative TFP differential (ρ - ν), holding constant the persistence of the world TFP shock (ρ + ν), means that ρ increases by as much as ν decreases, which unequivocally increases the relative RER volatility. In Fig. 3 it is possible to analyze the case examined throughout the paper, that is, a VECM with cointegrating vector (1, -1) and symmetric speed of convergence κ. This case involves horizontal movements along the line where ρ + ν = 1. Decreasing the speed of adjustment (i.e., increasing ρ - ν while keeping ρ + ν = 1) leads to a decline in the volatility of output and an increase in the relative volatility of the RER that matches the one observed in the data.13

5.4. Matching the increase in RER volatility

As described in Section 2, the volatility of the RER with respect to the volatility of output has increased in the last two decades in the U.S. As shown by the Quandt–Andrews test, the increase seems to occur at about 1993:4. As Table 7 shows, the volatility of the RER has gone from less than three times the volatility of output during the period 1973:1 to 1993:4 to more than five times during the period 1994:1 to 2006:4. Using U.S. and R.W. data, this section presents evidence relating the decrease in the speed of convergence to the cointegrating relationship, i.e., a lower κ in absolute value, to the increase in the relative volatility of the RER.

To formalize this, the VECM is estimated for two non-overlapping sub-samples.14 The first sub-sample goes from 1973:1 to 1993:4, while the second goes from 1994:1 to 2006:4. We split the sample to match the results of the Quandt–Andrews test. For the first sub-sample, the estimated value of the speed-of-adjustment term is larger in absolute value than the value estimated for the entire sample: κ moves from 0.007 to 0.008 in absolute value.
5.4. Matching the increase in RER volatility As described in Section 2, the volatility of the RER with respect to the volatility of output has increased in the last two decades in the U.S. As shown by the Quandt–Andrews test, the increase seems to occur at about 1993:4. As Table 6 shows the volatility of the RER has gone from less than three times the volatility of output during the period 1973:1 to 1993:4 to more than five times during the period 1994:1 to 2006:4. Using U.S. and R.W. data, this section presents evidence that relates the decrease in the speed of convergence to the cointegrating relationship, i.e., lower k, with the increase in the relative volatility of the RER. To formalize this, the VECM is estimated for two non-overlapping sub-samples.14 The first sample goes from 1973:1 to 1993:4, while the second sub-sample goes from 1994:1 to 2006:4. We split the sample to match the results of the Quandt– Andrews test. For the first sub-sample the estimated value of the speed of adjustment term is larger in absolute value than the value estimated with the entire sample. In particular, k moves from 0.007 to 0.008. Also, the standard deviation of 13 Note that, theoretically, it would be possible to calibrate a stationary model with low persistence in world productivity and high persistence in the relative productivity that implies high RER volatility. However, this would require that rn 4 r þ n and hence that n is negative. None of the available estimates suggest that spillovers are negative, indicating that this theoretical result is not supported by empirical evidence. 14 We assume that the cointegrating relationship is the same across samples.
Table 7
Sub-sample results.

                   SD(Y)   RSD(C)   RSD(X)   RSD(N)   RSD(RER)   ρ(RER)
1973–1993
Data               1.89    0.78     4.47     0.79     2.72       0.82
Coint., θ = 0.85   1.12    0.59     2.25     0.27     1.32       0.72
Coint., θ = 0.62   1.01    0.62     2.17     0.25     2.85       0.71

1994–2006
Data               0.88    0.78     4.82     0.90     5.01       0.82
Coint., θ = 0.85   0.79    0.55     2.74     0.38     1.45       0.71
Coint., θ = 0.62   0.73    0.66     2.01     0.42     4.31       0.72

Notes: SD denotes the standard deviation of the HP-filtered series. RSD denotes the standard deviation of the HP-filtered series relative to that of HP-filtered output. ρ denotes the first autocorrelation.
Also, the standard deviation of the stochastic process for the U.S., σ, is estimated to be 0.012, while the standard deviation for the R.W., σ*, is estimated to be 0.011. Both values are larger than the ones obtained when the whole sample is used. In the second sub-sample, 1994:1 to 2006:4, the estimated speed-of-adjustment coefficient decreases dramatically with respect to both the full sample and the first sub-sample: the point estimate is 0.002 in absolute value. This means that the catching-up process is much slower in the second part of the sample and indicates that the co-movement between TFPs in the post-1994 period is characterized by a very slow return to the long-run level. Finally, the standard deviations σ and σ* are estimated to be 0.009 and 0.007, respectively. The sub-sample estimates of σ and σ* reflect both the sample period and the countries in the R.W. aggregate. The large drop in σ and σ* across sub-samples reflects the reduction in output volatility that the U.S. and other countries experienced after the 1980s (see Kim and Nelson, 1999; McConnell and Perez-Quiros, 2000).

Table 7 reports the results. It is important to point out that the change in VECM parameters is entirely unexpected by the agents in the model and, once it occurs, fully understood by them. The results indicate that the change in the estimates of the VECM across samples is an important force behind the increase in the relative volatility of the RER. In the data, the relative volatility of the RER increases by 80 percent across samples. Our simulations show that the model generates increases in relative volatility of around 50 percent for both low and high values of θ: the model can explain more than 60 percent of the increase in RER volatility.

6. Concluding remarks

This paper documents two empirical facts: first, that the TFP processes of the U.S. and the R.W. are cointegrated with cointegrating vector (1, -1) and, second, that the relative volatility of the RER with respect to output has increased in the U.S. during the last 20 years. The paper then shows that introducing cointegrated TFP processes in an otherwise standard IRBC model increases the model's ability to explain RER volatility, without affecting the fit to other second moments of the data (and sometimes providing small improvements in fit). If one allows the speed of convergence to the cointegrating vector to change as it does in the data, the model can also explain the observed increase in the relative volatility of the RER.

For future research, it would be interesting to introduce cointegrated TFP processes in medium-scale open economy macroeconomic models, which typically include more frictions, and try to match a larger set of domestic and international variables (see Adolfson et al., 2007). Also, instead of analyzing the volatility of the RER between the U.S. and a synthetically constructed R.W., one could compute bilateral RERs and relate them to bilateral trade flows and output co-movements. This exercise is beyond the scope of this paper, but it could be an interesting line of research given that the equilibrium trade literature consistently finds that RERs and gross bilateral trade flows are closely related.
Acknowledgments

We thank Larry Christiano, Martin Eichenbaum, Jesús Gonzalo, Dirk Krueger, Jim Nason, Fabrizio Perri, Gabriel Rodríguez, Barbara Rossi, and especially Mario Crucini for very useful comments. We also thank seminar participants at La Caixa, Universitat Autònoma de Barcelona, Toulouse School of Economics, Universidad de Navarra, Singapore Management University, Hong Kong Monetary Authority, University of Hong Kong, Hong Kong Science and Technology University, Banco de España, Ghent University, and the Federal Reserve Banks of Atlanta and Philadelphia for useful comments. NSF and La Caixa support are acknowledged by Juan F. Rubio-Ramírez. Vicente Tuesta is a Professor at CENTRUM Católica, Pontificia Universidad Católica del Perú.

Appendix A. Supplementary data

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.jmoneco.2011.03.005.
References

Adolfson, M., Laseen, S., Lindé, J., Villani, M., 2007. Bayesian estimation of an open economy DSGE model with incomplete pass-through. Journal of International Economics 72, 481–511.
Aguiar, M., Gopinath, G., 2007. Emerging market business cycles: the cycle is the trend. Journal of Political Economy 115, 69–102.
Alvarez, F., Jermann, U., 2005. Using asset prices to measure the persistence of the marginal utility of wealth. Econometrica 73, 1977–2016.
Backus, D., Kehoe, P., Kydland, F., 1992. International business cycles. Journal of Political Economy 100, 745–775.
Backus, D., Kehoe, P., Kydland, F., 1994. Relative price movements in dynamic general equilibrium models of international trade. In: van der Ploeg, R. (Ed.), Handbook of International Macroeconomics. Wiley-Blackwell, Indianapolis, pp. 62–96.
Backus, D., Smith, G., 1993. Consumption and real exchange rates in dynamic economies with non-traded goods. Journal of International Economics 35, 297–316.
Baxter, M., Crucini, M., 1993. Explaining saving-investment correlations. American Economic Review 83, 416–436.
Baxter, M., Crucini, M., 1995. Business cycles and the asset structure of foreign trade. International Economic Review 36, 821–854.
Baxter, M., Stockman, A., 1989. Business cycles and the exchange rate system: some international evidence. Journal of Monetary Economics 23, 377–401.
Benigno, G., 2004. Real exchange rate persistence and monetary policy rules. Journal of Monetary Economics 51, 473–502.
Benigno, G., Thoenissen, C., 2008. Consumption and real exchange rates with incomplete markets and non-traded goods. Journal of International Money and Finance 27, 926–948.
Betts, C., Kehoe, T., 2006. Real exchange rate movements and the relative price of nontradable goods. Journal of Monetary Economics 53, 1297–1326.
Burstein, A., Eichenbaum, M., Rebelo, S., 2006. The importance of nontradable goods' prices in cyclical real exchange rate fluctuations. Japan and the World Economy 18, 247–253.
Chari, V., Kehoe, P., McGrattan, E., 2002. Can sticky price models generate volatile and persistent real exchange rates? Review of Economic Studies 69, 533–563.
Christoffel, K., Kuester, K., Linzert, T., 2009. The role of labor markets for euro area monetary policy. European Economic Review 53, 908–936.
Corsetti, G., Dedola, L., Leduc, S., 2008a. International risk sharing and the transmission of productivity shocks. Review of Economic Studies 75, 443–473.
Corsetti, G., Dedola, L., Leduc, S., 2008b. High exchange rate volatility and low pass-through. Journal of Monetary Economics 55, 1113–1128.
Crucini, M., Shintani, M., 2008. Persistence in law of one price deviations: evidence from micro-data. Journal of Monetary Economics 55, 629–644.
Dotsey, M., Duarte, M., 2009. Non-traded goods, market segmentation and exchange rates. Journal of Monetary Economics 55, 1129–1142.
Elliott, G., Rothenberg, T., Stock, J., 1996. Efficient tests for an autoregressive unit root. Econometrica 64, 813–836.
Engel, C., 1993. Real exchange rates and relative prices: an empirical investigation. Journal of Monetary Economics 32, 35–50.
Engel, C., 1999. Accounting for U.S. real exchange rate changes. Journal of Political Economy 107, 507–538.
Engel, C., West, K., 2005. Exchange rates and fundamentals. Journal of Political Economy 113, 485–517.
Engle, R., Granger, C., 1987. Co-integration and error correction: representation, estimation, and testing. Econometrica 55, 251–276.
Heathcote, J., Perri, F., 2002. Financial autarky and international business cycles. Journal of Monetary Economics 49, 601–627.
Heathcote, J., Perri, F., 2009. The international diversification puzzle is not as bad as you think. NBER Working Paper 13483.
Johansen, S., 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian vector autoregressive models. Econometrica 59, 1551–1580.
Justiniano, A., Preston, B., 2010. Can structural small open economy models account for the influence of foreign disturbances? Journal of International Economics 81, 61–74.
Kehoe, P., Perri, F., 2002. International business cycles with endogenous incomplete markets. Econometrica 70, 907–928.
Kim, C., Nelson, C., 1999. Has the U.S. economy become more stable? A Bayesian approach based on a Markov switching model of the business cycle. Review of Economics and Statistics 81, 608–616.
King, R., Plosser, C., Rebelo, S., 1988. Production, growth and the business cycle. Journal of Monetary Economics 21, 195–232.
King, R., Plosser, C., Stock, J., Watson, M., 1991. Stochastic trends and economic fluctuations. American Economic Review 81, 819–840.
Lastrapes, W., 1992. Source of fluctuations in real and nominal exchange rates. Review of Economics and Statistics 74, 530–539.
Lubik, T., Schorfheide, F., 2005. A Bayesian look at new open economy macroeconomics. In: Gertler, M., Rogoff, K. (Eds.), NBER Macroeconomics Annual 2005. MIT Press, Cambridge, pp. 313–366.
McConnell, M., Perez-Quiros, G., 2000. Output fluctuations in the United States: what has changed since the early 1980s? American Economic Review 90, 1464–1476.
Nason, J., Rogers, J., 2008. Exchange rates and fundamentals: a generalization. Federal Reserve Bank of Atlanta Working Paper 2008-16.
Ng, S., Perron, P., 2001. Lag length selection and the construction of unit root tests with good size and power. Econometrica 69, 1519–1554.
Rabanal, P., Tuesta, V., 2007. Non-tradable goods and the real exchange rate. La Caixa Working Paper 03/2007.
Rabanal, P., Tuesta, V., 2010. Euro-dollar real exchange rate dynamics in an estimated two-country model: an assessment. Journal of Economic Dynamics and Control 34, 780–797.
Stock, J., Watson, M., 2003. Has the business cycle changed and why? In: Gertler, M., Rogoff, K. (Eds.), NBER Macroeconomics Annual 2002. MIT Press, Cambridge, pp. 159–230.
Stock, J., Watson, M., 2005. Understanding changes in international business cycle dynamics. Journal of the European Economic Association 3, 968–1006.
Journal of Monetary Economics 58 (2011) 172–182
Production, hidden action, and the payment system
Chao Gu a, Mark Guzman b, Joseph Haslag c,*
a University of Missouri, United States
b University of Reading, United Kingdom
c Department of Economics, University of Missouri, Columbia, MO 65211, United States
ARTICLE INFO
Article history: Received 10 November 2008; received in revised form 3 March 2011; accepted 7 March 2011; available online 15 March 2011.
ABSTRACT
Using a modified version of Freeman's (1996) payment system model, the optimal intraday rate is examined. The production set is modified to account for a nondegenerate distribution of settlements within a day. In addition to the modified production set, the consumption set is modified. A positive intraday interest rate may be able to implement the planner's allocation. © 2011 Elsevier B.V. All rights reserved.
1. Introduction

Settlement risk seemingly intensifies during financial crises like the one experienced in 2008. The existing literature treats settlement risk as being completely under the purview of monetary policy and offers theoretical justification for a zero intraday rate policy. The underpinnings are developed in Freeman's (1996) seminal paper. In his work, settlement risk arises because requests for payments and arrivals of payments are not synchronized. Liquidity plays a critical role: if, for example, agents have not accumulated enough money to settle all requests for payments at par, then some creditors will suffer. Because the problem is purely about liquidity, the best central bank policy is to provide liquidity at no cost. Thus, under such a Friedman-rule-like policy, full consumption insurance is achieved and the efficient allocation is implemented.1

The purpose of this paper is twofold. First, debtors decide when to settle in this model economy. This decision has important implications for the within-day distribution of settlements.2 More specifically, debtors borrow production inputs from creditors by issuing IOUs. Debtors settle IOUs – either in the morning or in the afternoon – according to when their production occurs. (Hereafter, debtors are identified as morning producers or afternoon producers according to their production decision.) For a given input, afternoon production generates greater output than morning production. By choosing morning production, however, debtors settle in the morning and can participate in the IOU resale market. In equilibrium, a no-arbitrage condition is satisfied: the marginal debtor is indifferent between morning and afternoon production because the sum of the capital gains from IOUs purchased in the resale market and the return from morning production equals the return from afternoon production. Consequently, the intraday return on IOUs is positive and the distribution of settlements is non-degenerate.
* Corresponding author. Tel.: +1 573 882 3483; fax: +1 573 882 2697. E-mail address: [email protected] (J. Haslag).
1 Green (1997) showed that the efficient allocation can be implemented by using private money instead of central bank money. Mills (2004) solved a mechanism design problem, demonstrating the equivalence between Freeman's economy with an active discount window and an economy in which perfect record keeping exists. Thus, the liquidity problem is a by-product of imperfect record keeping.
2 Note that the average daily volume settled through Fedwire and CHIPS is $7.3 trillion. The Federal Reserve asserted that exposure to credit risk is heightened when the late-day settlement volume is large. On average, 20 percent of payments are settled before 1 pm, with the remaining 80 percent settled after 1 pm (see Armantier et al., 2008). Angelini (1998) considers a partial equilibrium model in which a network externality induces banks to delay settlement.
Second, we examine optimal central bank policy. There are conditions under which the optimal intraday rate is positive. Our findings rest squarely on the changes just described to the production set, and on changes to the consumption set. Indeed, the relationship between consumption timing and settlement timing affects central bank policy. If creditors' consumption needs are independent of their settlement requests, it is efficient to produce only in the afternoon, because afternoon goods are perfect substitutes for morning goods and afternoon production strictly dominates morning production. By adopting a zero-intraday-rate policy, capital gains are eliminated in the IOU resale market and the efficient allocation is implemented. It follows that under the optimal policy the equilibrium distribution of settlements is degenerate, with all settlements occurring in the afternoon.

Next, consider a case in which the creditor's timing of consumption needs coincides with the timing of settlement requests. In this case, creditors who need to consume in the morning will not view afternoon goods as substitutes for morning goods. The bottom line is that the optimal intraday rate can be positive in these versions of the model economy. The efficient production plan depends on whether the economy's endowment is sufficient to meet early consumption needs. If endowments can meet morning creditors' needs, then there is no morning production in the efficient allocation, and in the decentralized economy a zero intraday interest rate implements the efficient allocation. If, however, endowments cannot meet morning creditors' needs, then a zero-intraday-rate policy is not optimal. Under this condition, the intraday rate is the relative price of morning consumption in terms of afternoon consumption, and it should be positive for creditors who need to consume early. In addition, morning producers' intertemporal decisions also depend on the intraday rate; for morning producers it should be zero, as their consumption is independent of settlement or production timing. Therein lies the problem: if the planner's allocation consists of both morning and afternoon production, then in the decentralized economy one policy tool cannot, in general, simultaneously satisfy two efficiency conditions.

The appropriate intraday rate policy was clearly an important topic during the 2008 financial crisis. In contrast to the existing literature, there are conditions in which monetary policy is not sufficient to both address the liquidity problem and solve the production allocation problem efficiently. Thus, two questions motivate this study: (i) what if there are gains to waiting that create an incentive for debtors to settle later in the day? and (ii) what if creditors' consumption opportunities are time sensitive? With these seemingly minor modifications to the payment system model, the optimal intraday rate is not necessarily zero.

The paper is organized as follows. Section 2 describes the economic environment. Section 3 defines and characterizes the equilibrium in the decentralized market. The planner's allocation is characterized in Section 4. Government policies that implement the planner's allocation are derived in Section 5. Section 6 considers a modified economy in which morning goods and afternoon goods are imperfect substitutes. A brief summary and conclusions are presented in Section 7.

2. The physical environment

A modified version of the payment system model developed by Freeman (1996) is considered here.
Debtors acquire production inputs from creditors, employing the inputs in one of two technologies. The technology choice also determines when the debtor settles.

Time, location and goods: There is an infinite sequence of time periods. Dates are indexed by t = 1, 2, …. At each date t, there are two subperiods, called morning and afternoon. There are I pairs of islands distributed around a central island, where I is a large number. For each island pair, one is called the creditor island and the other the debtor island, in anticipation of the agents' equilibrium trading behavior on these islands. All debts are settled on the central island. There is an authority on the central island that costlessly enforces all contracts. There is an island-specific, perishable consumption good. Goods originating from a particular creditor island are denoted by y_i for i = 1, 2, …, I and can move only between the island of origin and the paired-neighbor island. Goods originating from a particular debtor island are denoted by x_i for i = 1, 2, …, I and can either stay on the island of origin or move to the central island. In addition, each debtor island has its own capital, denoted k_i for i = 1, 2, …, I. Goods only take on non-negative values.

Agents: At the beginning of each period, a continuum of measure one of two-period-lived agents is born on each island. Agents born on a creditor (debtor) island are called creditors (debtors). At date t = 1, there is a continuum of both creditors and debtors who live only one period, hereafter referred to as the initial old.

Endowments: At the beginning of each date t ≥ 1, debtors and creditors receive endowments of their island-specific goods. The location subscripts are dropped because quantities are equal across islands. Let x denote the quantity received by each young debtor; each initial old debtor also receives an endowment equal to what a young debtor receives. Let y denote the quantity received by each young creditor. The initial old creditors are endowed with m_0 units of fiat money.

Production technology: There are two technologies available to young debtors. Each technology converts good y into island-specific capital at a one-to-one rate. With k_t units of date-t capital, the morning production technology produces f(k_t) units of date-(t+1) island-specific goods available in the morning. Alternatively, the afternoon production technology transforms k_t units of date-t capital into F(k_t) units of date-(t+1) island-specific goods available in the afternoon. Note that for a given finite input k_t, afternoon production strictly dominates morning production; that is, F(k) > f(k) for any k > 0. The functions f(k) and F(k) are strictly increasing and strictly concave.3

3 Freeman (2002) specifies a model in which production is present. There is no distinction between morning and afternoon production in Freeman's model.
Preferences: A creditor born at date t derives utility from consuming his own endowment good at date t and a yet-to-be-determined island-specific debtor good at date t+1. When old, creditors learn which island-specific debtor good they want. Let u(y^c_{1t}, x^c_{2,t+1}) denote a creditor's utility function. Throughout the paper, the superscript of a variable denotes the identity of an agent (c for creditors and d for debtors), the first subscript denotes the age of the agent, and the second one is the date. A debtor born at date t derives utility from consuming his own island-specific good at dates t and t+1. A debtor's utility function is denoted by v(x^d_{1t}, x^d_{2,t+1}). Both utility functions are separable, strictly increasing, strictly concave, and satisfy the Inada condition in both arguments.

Travel: At each date t ≥ 1, a young debtor travels to the paired creditor island. Upon returning, the young debtor chooses a production technology. The technology selected is private information. The choice of the production technology determines when the debtor travels to the central island when old. After the production decision, the young debtor meets with randomly relocated old creditors and old returning debtors. In the next period, the old debtor travels to the central island; morning (afternoon) producers arrive in the morning (afternoon). The old debtor returns to his home island at the end of the period. At each date t ≥ 1, a young creditor stays on her home island. When old, the creditor first learns what island-specific good gives utility and is relocated to the appropriate debtor island before the end of the period. The relocation is drawn from a uniform distribution; thus the probability that any one old creditor is relocated to a particular debtor island is the same. After learning which island good provides old-age utility, each creditor travels to the central island in the morning. At the end of the morning subperiod, a measure 1−a of old creditors are informed that they must leave the central island (morning-leaving creditors) and are relocated to the debtor islands. The rest of the old creditors leave in the afternoon (afternoon-leaving creditors).

3. The decentralized economy

We begin with an overview of the trade patterns in the decentralized economy. When young, a debtor wants to purchase good y from the creditors. Because a young creditor does not derive utility from the debtor's good, and because the young debtor does not have money, IOUs are offered to the young creditor in exchange for the y good; the IOUs will be settled next period on the central island. After obtaining good y, the debtor returns to his home island and starts to produce using either the morning or the afternoon technology. Let λ be the measure of debtors who choose morning production. After production is started, the young debtor sells some of his endowment good x to either old creditors or old debtors in exchange for fiat money. When old, the debtor uses fiat money to settle IOUs or to buy goods from the young generation. Based on his production decision, the old debtor travels to the central island to settle IOUs; morning (afternoon) producers arrive in the morning (afternoon). After settling IOUs, the old debtor returns to his home island in the afternoon. Note that old debtors can bring produced goods to the central island to settle IOUs. Remember that the young debtor's production choice is private information and is not publicly observed until the next period, when the debtor arrives to settle the IOU.

A young creditor stays on her home island. She sells some of her endowed good to the young debtors, receiving IOUs. When old, the creditor immediately learns which debtor good provides utility and travels to the central island in the morning to settle the IOUs. An IOU contract specifies the nominal value of the debt. It is expected to be settled at par in the next period (either morning or afternoon) on the central island.4 IOUs can be redeemed with either fiat money or real goods. At the time the IOU is issued, creditors know the morning departure rate for old creditors, 1−a, and take the believed value of λ as given. When the morning-leaving creditors leave the central island, any IOUs issued by the afternoon producers are unredeemed. The morning-leaving creditors can sell the unredeemed IOUs in a secondary, or resale, market. Potential buyers are those arriving at the central island in the morning and leaving during the afternoon subperiod; that is, afternoon-leaving creditors and morning producers.

4 Gu et al. (2010) consider an economy in which the creditor can observe the debtor's production decision. In this case, creditors will charge a different price for morning production than for afternoon production. In other words, the IOU contract is contingent on the timing of payments. All the results reported in this paper carry over to the environment in which state-contingent IOUs are offered.

Note that the preference and travel patterns account for the existence of valued IOU contracts and valued fiat money. When young debtors and young creditors meet, barter is impossible and young creditors do not have fiat money. Instead, young debtors offer an IOU as payment for the creditor's goods. IOUs do not circulate among the creditor/debtor island pairs because the creditors value only island-specific goods. Fiat money is valued as the means of settling IOUs and as the means of intergenerational exchange. IOUs have to be redeemed on the central island because the central island is the only place where the issuers and holders of the IOUs meet and where such contracts can be enforced.

3.1. Debtor's problem

All markets in our model are competitive. A young debtor compares the lifetime utilities from producing the morning good and the afternoon good, taking goods prices, the resale price of the loan, and the measures of morning and afternoon
producers as given. Let the lifetime utilities of a morning producer and an afternoon producer be represented by v(x^d_{1t}, x^d_{2,t+1}) and v(x^{d*}_{1t}, x^{d*}_{2,t+1}), respectively. Throughout our analysis, the superscript "*" denotes quantities related to afternoon producers and afternoon-leaving creditors. The debtor's decision rule is straightforward: choose the technology that yields the higher lifetime utility. If the two technologies yield the same lifetime utility, a debtor is indifferent between becoming a morning or an afternoon producer; in this case, a particular debtor chooses to become a morning producer with probability λ. By the law of large numbers, λ is also the measure of debtors who are morning producers. In what follows, the debtor's problem is solved separately for each technology.

3.1.1. Morning producer

A morning producer faces the following budget constraints when young:

  p_{yt} k_t = h_t    (1)

  p_{xt} x = p_{xt} x^d_{1t} + m_t    (2)

where p_{xt} and p_{yt} are the prices of goods x and y in period t, respectively, h_t is the nominal value of IOUs issued by the morning producer in exchange for good y, and m_t is the quantity of money acquired by selling the endowed goods to old creditors and old debtors. The morning producer transforms the y good into capital, denoted k_t, at a one-for-one rate. When old, a morning producer arrives on the central island in the morning and faces the following budget constraint:

  p_{x,t+1} f(k_t) + m_t − h_t + (1 − r_{t+1}) b_{t+1} = p_{x,t+1} x^d_{2,t+1}    (3)

where b_{t+1} is the par value of the IOUs he purchases in the date-(t+1) resale market, and r_{t+1} is the date-(t+1) price of those IOUs. Note that under this constraint, debtors are capable of meeting their old-age needs – consumption and settlement – through a combination of production, outside money, and gains from IOU purchases. Money holdings by the old debtors need not be equal to IOU values. The morning producer's lifetime budget constraints are

  p_{x,t+1} f(k_t) + p_{xt}(x − x^d_{1t}) − p_{yt} k_t + (1 − r_{t+1}) b_{t+1} = p_{x,t+1} x^d_{2,t+1}    (4)

  x − x^d_{1t} ≥ 0    (5)

The morning producer faces a liquidity constraint in the loan resale market:

  p_{x,t+1} f(k_t) + p_{xt}(x − x^d_{1t}) − p_{yt} k_t − r_{t+1} b_{t+1} ≥ 0    (6)

which says the morning producer cannot borrow to purchase the IOUs.

A morning producer maximizes lifetime utility v(x^d_{1t}, x^d_{2,t+1}) subject to (4)–(6). Throughout this paper, the notation g_j denotes the partial derivative of function g with respect to the j-th argument. The first-order conditions for the morning producer are

  (x − x^d_{1t}) [v_1 − v_2 p_{xt}/(r_{t+1} p_{x,t+1})] = 0    (7)

  f'(k_t) − p_{yt}/p_{x,t+1} = 0    (8)

  v_2 (1 − r_{t+1})/p_{x,t+1} − μ_1 r_{t+1} = 0    (9)

where μ_1 is the Lagrange multiplier associated with the morning producer's liquidity constraint. From Eq. (9), it follows that the liquidity constraint is non-binding if and only if r_{t+1} = 1.5

5 Because the IOU contracts are redeemed at par by the issuers, no one will purchase IOUs at a price higher than 1. Thus, the equilibrium price of an IOU in the resale market is at most 1.

3.1.2. Afternoon producer

If a debtor chooses to be an afternoon producer, his budget constraints when young are

  p_{yt} k^*_t = h^*_t    (10)

  p_{xt} x = p_{xt} x^{d*}_{1t} + m^*_t    (11)

When old, the debtor cannot trade in the loan resale market because he arrives late. The old afternoon producer's budget constraint is

  p_{x,t+1} F(k^*_t) + m^*_t − h^*_t = p_{x,t+1} x^{d*}_{2,t+1}    (12)
The afternoon producer's lifetime budget constraints are

  p_{x,t+1} F(k^*_t) + p_{xt}(x − x^{d*}_{1t}) − p_{yt} k^*_t = p_{x,t+1} x^{d*}_{2,t+1}    (13)

  x − x^{d*}_{1t} ≥ 0    (14)

An afternoon producer maximizes his lifetime utility v(x^{d*}_{1t}, x^{d*}_{2,t+1}) subject to his lifetime budget constraints (13)–(14). The first-order conditions for the afternoon producer's problem are

  (x − x^{d*}_{1t}) [v^*_1 − v^*_2 p_{xt}/p_{x,t+1}] = 0    (15)

  F'(k^*_t) − p_{yt}/p_{x,t+1} = 0    (16)
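Conditions (8) and (16) pin down the capital input for each technology. As a small numerical sketch, take Cobb–Douglas forms f(k) = A k^α and F(k) = B k^α with B > A; this parameterization is ours, since the paper only assumes f and F strictly increasing and concave with F > f:

```python
def optimal_capital(scale, alpha, p_y, p_x_next):
    """Capital solving f'(k) = p_y / p_x_next for f(k) = scale * k**alpha."""
    # alpha * scale * k**(alpha - 1) = p_y / p_x_next  =>  invert for k
    return (alpha * scale * p_x_next / p_y) ** (1.0 / (1.0 - alpha))

# With B > A, the afternoon technology supports a larger capital stock.
k_morning = optimal_capital(scale=1.0, alpha=0.5, p_y=1.0, p_x_next=1.0)
k_afternoon = optimal_capital(scale=1.2, alpha=0.5, p_y=1.0, p_x_next=1.0)
```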
3.2. Creditor's problem

A young creditor divides her endowment between her own consumption and trades with young debtors. Her budget constraint is

  l_t + p_{yt} y^c_{1t} = p_{yt} y    (17)

where l_t denotes the nominal value of the IOUs accepted by the creditor at date t and carried over to the date-(t+1) settlement location. A young creditor faces uncertainty regarding the timing of her departure from the central island when old. In the event that the creditor is a morning-leaver, settlement may not occur because some of her debtors may not arrive in the morning. If unsettled, the morning-leaver will offer the IOUs in the resale market. The budget constraint for an old morning-leaving creditor is

  r_{t+1}(1 − a_t) l_t + a_t l_t = p_{x,t+1} x^c_{2,t+1}    (18)

where a_t is the proportion of the IOUs that are issued by the morning producers at date t. It follows that 1 − a_t is the proportion of IOUs sold in the resale market. With these resources, the old creditor can purchase the consumption good from debtors (young, old, or both). We assume that creditors can costlessly trade debtor island-specific goods with each other on the central island; if IOUs are paid in goods, a creditor can convert the goods into her preferred debtor island-specific goods.

If, instead, the old creditor leaves in the afternoon, she settles all of her IOUs on the central island. She can also make arbitrage profits in the resale market by purchasing IOUs at a discount from morning-leavers and settling with the afternoon-arriving debtors. This creditor's old-age budget constraint is

  l_t + (1 − r_{t+1}) q_{t+1} = p_{x,t+1} x^{c*}_{2,t+1}    (19)

where q_{t+1} denotes the par value of the IOUs that an old afternoon-leaving creditor purchases in the date-(t+1) resale market. The sum of the creditor's old-age resources is used to purchase units of the consumption good. The liquidity constraint for an afternoon-leaving creditor is

  a_t l_t − r_{t+1} q_{t+1} ≥ 0    (20)

Thus, a creditor maximizes expected lifetime utility, (1 − a) u(y^c_{1t}, x^c_{2,t+1}) + a u(y^c_{1t}, x^{c*}_{2,t+1}), subject to the budget and liquidity constraints (17)–(20). Note that an old creditor can consume at any time, but the quantity depends on the pattern of travel. In other words, consumption by an old creditor is contingent on her departure time when liquidity is scarce. Substituting for y^c_{1t}, x^c_{2,t+1}, and x^{c*}_{2,t+1} using l_t and q_{t+1}, the first-order conditions with respect to l_t and q_{t+1} for this problem are

  −u_1 + {(1 − a) u_2 [a_t + r_{t+1}(1 − a_t)] + a u^*_2} p_{yt}/p_{x,t+1} + p_{yt} μ_2 a_t = 0    (21)

  a u^*_2 (1 − r_{t+1})/p_{x,t+1} − μ_2 r_{t+1} = 0    (22)
where μ_2 is the Lagrange multiplier associated with the creditor's liquidity constraint. Again, from Eq. (22), it follows that the liquidity constraint is non-binding if and only if r_{t+1} = 1. Also note that with r_{t+1} = 1, Eqs. (18) and (19) imply that x^c_{2,t+1} = x^{c*}_{2,t+1}. Combining the two first-order conditions and eliminating μ_2 yields

  −u_1/p_{yt} + [a_t + r_{t+1}(1 − a_t)] [(1 − a) u_2 + a u^*_2/r_{t+1}] / p_{x,t+1} = 0    (23)

3.3. Equilibrium

A competitive rational expectations equilibrium is defined by three requirements: (i) debtors and creditors maximize expected lifetime utility, taking prices as given; (ii) all markets clear; and (iii) the subjective distribution of debtors' arrival rates at the central island equals the objective distribution of debtors' arrival rates at the central island.
The goods market clearing conditions for goods x and y, respectively, are

  y = y^c_{1t} + λ k_t + (1 − λ) k^*_t    (24)

  x + λ f(k_{t−1}) + (1 − λ) F(k^*_{t−1}) = λ (x^d_{1t} + x^d_{2t}) + (1 − λ)(x^{d*}_{1t} + x^{d*}_{2t}) + (1 − a) x^c_{2t} + a x^{c*}_{2t}    (25)

In addition, the market for good x in the morning satisfies

  x + λ f(k_{t−1}) ≥ λ x^d_{1t} + (1 − λ) x^{d*}_{1t} + (1 − a) x^c_{2t}    (26)

The inequality does not have to bind because good x can be sold to the old debtors and afternoon-leaving creditors in the afternoon. The money market clearing condition is

  λ m_t + (1 − λ) m^*_t = m_0    (27)

The loan market clearing condition is

  l_t = λ h_t + (1 − λ) h^*_t    (28)

The equilibrium proportion of IOUs issued by morning producers is

  a_t = λ h_t / [λ h_t + (1 − λ) h^*_t]    (29)

The IOU resale market clearing condition is

  λ b_t + a q_t = (1 − a)(1 − a_{t−1}) l_{t−1}    (30)
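Before turning to how these conditions pin down the equilibrium, a stylized numerical illustration may help. Assume log utility, f(k) = A k^{1/2} and F(k) = B k^{1/2} with B > A, and normalize all goods prices and the endowment to constants; this is a partial-equilibrium simplification of our own construction, not the paper's full system. Conditions (8) and (16) then give f(k) − k = A²/4 and F(k^*) − k^* = B²/4; with the liquidity constraint (6) binding, a morning producer's old-age consumption equals his settlement-date wealth divided by r, and the resale price that leaves the marginal debtor indifferent lies strictly inside (0, 1):

```python
import numpy as np

def v_morning(r, A=1.0, x=1.0):
    # Morning producer: k = A**2/4 from f'(k) = 1, wealth W = x + A**2/4 - x1,
    # and old-age consumption W/r because the liquidity constraint binds.
    x1 = (x + A**2 / 4.0) / 2.0          # optimal young consumption (x1 = W)
    return np.log(x1) + np.log(x1 / r)   # lifetime utility with x2 = W/r

def v_afternoon(B=1.2, x=1.0):
    x1 = (x + B**2 / 4.0) / 2.0          # x1 = x2 from condition (15)
    return 2.0 * np.log(x1)

def resale_price(A=1.0, B=1.2, x=1.0):
    """Bisect on r so the marginal debtor is indifferent (v_morning = v_afternoon)."""
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        # v_morning is decreasing in r: a deeper discount raises resale profits
        lo, hi = (mid, hi) if v_morning(mid, A, x) > v_afternoon(B, x) else (lo, mid)
    return 0.5 * (lo + hi)  # equals ((x + A**2/4)/(x + B**2/4))**2 here, about 0.84
```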
The first-order conditions and the market clearing conditions pin down the equilibrium quantities of consumption goods for both agents, the relative prices of goods, the equilibrium fraction of morning producers, and the equilibrium price of loans in the resale market. Proposition 1 shows that in equilibrium there is a non-degenerate distribution of morning and afternoon producers, and that the resale price of loans on the central island is always less than 1.6 That is, the liquidity constraints are always binding in the resale market.

Proposition 1. In equilibrium, the fraction of morning producers is strictly between 0 and 1, and the IOU price in the resale market is strictly between 0 and 1. That is, λ ∈ (0, 1) and r ∈ (0, 1).

Proof. By contradiction. Suppose λ = 1, which implies that all debtors are morning producers. All IOUs are redeemed in the morning. There is no demand for liquidity in the IOU resale market, which implies that the liquidity constraint is not binding for morning producers, resulting in an equilibrium with r = 1. The morning producers then gain nothing from the resale market. As the return on the afternoon good strictly dominates that on the morning good, a debtor would deviate and become an afternoon producer if r = 1: a contradiction. Suppose alternatively that all debtors are afternoon producers; that is, λ = 0. Then there is no supply of liquidity in the IOU resale market, which drives the price of the IOU in the resale market, r, to 0. Here, there is an incentive for a debtor to deviate and choose to be a morning producer, making profits in the resale market. Again, a contradiction. □

The implication of Proposition 1 is that there exists a non-degenerate equilibrium distribution of settlements. In this equilibrium, the debtor is indifferent between morning and afternoon production/settlement. Intuitively, a no-arbitrage condition governs the debtor's choice: once at the settlement site, the morning producers can find creditors leaving in the morning and offer payment at less than par for the IOUs. Thus, the return on morning goods plus the intra-period gains from the resale market equals the return on afternoon goods.

The following sections focus on the stationary equilibrium, so the time subscripts are dropped.7

6 Here fiat money bears a liquidity premium in the IOU resale market. The literature on liquidity premia is vast. See Williamson and Wright (forthcoming, 2010) for liquidity premia on real assets as well as on fiat money in money search models.
7 Uniqueness of the stationary equilibrium is not guaranteed, but it holds for some classes of utility and production functions, for example log utility with Cobb–Douglas production functions.

4. Planner's allocation

Let us turn to the planner's allocation. The planner selects one technology for each individual debtor; thus, in the planner's allocation, λ is the measure of debtors whom the planner chooses to be morning producers. In this section, old creditors' consumption demand is not tied to their departure from the central island, the same assumption as in Freeman (1996). A different consumption pattern will be considered in Section 6.
Following Mills (2004), the social planner's problem is formalized as follows:

  max over {y^c_1, x^c_2, x^{c*}_2, x^d_1, x^d_2, x^{d*}_1, x^{d*}_2, k, k^*, λ} of
  β [λ v(x^d_1, x^d_2) + (1 − λ) v(x^{d*}_1, x^{d*}_2)] + (1 − β) [(1 − a) u(y^c_1, x^c_2) + a u(y^c_1, x^{c*}_2)]

subject to

  x + λ f(k) + (1 − λ) F(k^*) = (1 − a) x^c_2 + a x^{c*}_2 + λ (x^d_1 + x^d_2) + (1 − λ)(x^{d*}_1 + x^{d*}_2)

  y = y^c_1 + λ k + (1 − λ) k^*

  0 ≤ λ ≤ 1

where β is the weight of debtors in the planner's welfare function. The first two constraints are the resource constraints for goods x and y, respectively. The solution to the planner's problem, denoted by a hat on the variables, satisfies the following first-order conditions that describe the consumption of each type of agent in both periods:

  v̂_1 = v̂_2    (31)

  v̂_2 = [(1 − β)/β] û_2    (32)

  û_1 = û_2 F'(k̂^*)    (33)

  x̂^c_2 = x̂^{c*}_2    (34)
Because the old creditors are indifferent about when they consume in the second period of life, and because afternoon production dominates morning production, the planner's allocation consists only of afternoon production; that is, k̂ = 0 and λ̂ = 0. The planner provides full risk sharing to creditors. The planner can always redistribute goods to achieve the desired wealth distribution that satisfies Eq. (32).

It is straightforward to see that the equilibrium in the decentralized economy does not achieve the planner's allocation. In a decentralized economy, IOUs must be discounted to attract some of the debtors to arrive early. While the resale market provides liquidity for morning-leaving creditors, it shifts production to a low-return technology, reducing total consumption and resulting in incomplete risk sharing among creditors.

5. Government policies

In this section, we examine government policies – specifically, central bank actions – that implement the efficient allocation. Suppose there is a central bank on the central island. The central bank's discount window supplies unlimited loans at the intraday interest rate ρ in the morning of each period. Discount window loans are intraday; that is, loans made in the morning must be repaid in the afternoon. In this economy, the afternoon-leaving creditors and morning producers intermediate between the central bank and morning-leaving creditors as "commercial banks." The competitive market results in a no-arbitrage condition in which r = 1/(1 + ρ). Because IOUs can be settled with the debtor's goods, the central bank can accept debtor goods as settlement for any IOUs it acquires. The central bank can trade the goods it receives with old creditors. The upshot is that the total money supply is constant; that is, throughout our analysis, m_t = m_0 for all t ≥ 1.8

8 Central bank loans can be considered outside money. The constant stock of outside money referred to here is then "unbacked" outside money. Any outside money created through the discount window is "backed" outside money, where backing refers to the loan itself.

In addition to the central bank, a government operates a lump-sum tax and transfer process. Let T^d, T^{d*}, T^c, and T^{c*} denote the net lifetime transfers to the morning producer, the afternoon producer, the morning-leaving creditor, and the afternoon-leaving creditor, respectively.9 The government runs a balanced budget in each period. The following proposition is derived with these policy tools.

9 The lump-sum tax and transfer scheme is implicit in Freeman (1996) and other works (see Mills, 2004). The balanced-budget constraint is λ T^d + (1 − λ) T^{d*} + (1 − a) T^c + a T^{c*} = (1 − r)[(1 − a)(1 − a_t) l_t − λ b_t − a q_t].

Proposition 2. In the decentralized economy, the optimal intraday rate is zero.

Proof. Let r = 1, T^{d*} > T^d, and T^{c*} = T^c. Because there is no profit in the resale market and afternoon producers receive larger transfers, there are only afternoon producers in equilibrium. In a stationary equilibrium, the debtor's first-order condition (15) is identical to the first-order condition in the planner's problem (see Eq. (31)). Because there is no profit in the resale market and all creditors receive the same transfer/tax, morning-leaving and afternoon-leaving creditors consume the same amount. With x^c_2 = x^{c*}_2, the creditor's first-order condition (23) is identical to (33). Lastly, the values of T^d and T^c are chosen in such a way that the marginal utilities of the two types of agents satisfy (32). □
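The proof's claim that the creditor's condition (23) collapses to the planner's condition (33) when r = 1 can be written out explicitly (a step the text leaves implicit):

\[
r_{t+1}=1:\qquad \frac{u_1}{p_{yt}}
 = \bigl[a_t + (1-a_t)\bigr]\,\frac{(1-a)\,u_2 + a\,u_2^{*}}{p_{x,t+1}}
 = \frac{u_2}{p_{x,t+1}},
\]

using \(x^{c}_{2}=x^{c*}_{2}\) (so \(u_2=u_2^{*}\)); combined with the stationary version of (16), \(F'(k^{*}) = p_{yt}/p_{x,t+1}\), this yields \(u_1 = u_2\,F'(k^{*})\), which is Eq. (33).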
As in Freeman, Proposition 2 indicates that morning-leaving creditors do not care about when they obtain consumption goods during the day; morning departure imposes a pure liquidity problem. The central bank can counteract this friction by injecting money temporarily into the economy.

6. A Diamond–Dybvig creditor

The consumption set is modified in this section. Specifically, morning-leaving creditors now value only morning consumption. In other words, a creditor whose settlement takes place in the morning is also seeking the consumption good in the morning subperiod. These creditors are referred to as Diamond and Dybvig (1983) creditors.10 With Diamond–Dybvig creditors, afternoon goods are not substitutes for morning goods for morning-leaving creditors. Creditors whose consumption needs and settlement timing are uncorrelated, as in the previous model economy, are referred to as "Freeman creditors." In this setup, our question is how modifying the consumption set affects the optimal intraday policy.

10 See Martin (2008) for a Diamond–Dybvig economy with commodity money. Because of the resources associated with the commodity money, the loan interest rate should be positive in Martin's economy.

Banks face a variety of deadlines for making payments during the day (see Martin and McAndrews, 2008). By assuming that morning and afternoon goods are imperfect substitutes, the creditor's settlement deadline is treated as being consistent with the timing of consumption. Insofar as the timing of consumption matters, production may also be affected: shifting production to morning goods results in fewer total goods for the afternoon leavers for the sake of the morning leavers.

6.1. Planner's allocation

With the different commodity space, the planner faces an additional resource constraint: the quantity of morning goods available must cover the consumption of the morning-leaving creditors. Formally,

  x + λ f(k) ≥ (1 − a) x^c_2    (35)

Note that endowments are available for morning consumption and thus are perfect substitutes for the morning good. There are three cases.

Case 1: The resource constraint (35) is not binding, implying that (1 − a) x̂^c_2 < x and λ̂ = 0. The planner's solution is then identical to the allocation with Freeman creditors.

Case 2: The resource constraint (35) is binding, and the planner's solution requires λ̂ = 0 and (1 − a) x̂^c_2 = x. In this case, the planner still invests all resources in the production of afternoon goods. He rations the endowments of the young debtors among the morning-leaving creditors. The planner's solution is described by Eqs. (31)–(33) and the resource constraint. Because of the binding resource constraint and the concavity of the utility function, it follows that x̂^c_2 < x̂^{c*}_2.

Case 3: The resource constraint (35) is binding, and the planner's solution requires λ̂ > 0 and (1 − a) x̂^c_2 > x. The planner invests some resources in producing morning goods, and allocates the endowments of the young debtors and the morning goods to the morning-leaving creditors. The first-order conditions are

  x̂^d_1 = x̂^{d*}_1    (36)

  x̂^d_2 = x̂^{d*}_2    (37)

  û_1 = û_2 f'(k̂)    (38)

  û_2 f(k̂) − û_1 k̂ = û^*_2 F(k̂^*) − û_1 k̂^*    (39)

and Eqs. (31)–(33). Note that x̂^c_2 < x̂^{c*}_2 holds in this case. Eq. (39) describes the optimal choice of λ: the left-hand side is the net marginal value (marginal gain minus marginal cost, in terms of utility) of having an additional producer engaged in morning production, while the right-hand side is the net marginal value of having an additional producer engaged in afternoon production. The optimal λ equates the net marginal values of the two types of production. Also note that, from Eqs. (31), (36), and (37), the allocation satisfies the morning producer's intertemporal marginal condition v̂_1 = v̂_2.

Overall, the planner's allocation depends on the morning creditor's intertemporal marginal rate of substitution relative to the difference between the marginal products of afternoon and morning production. Because morning creditors treat the endowments of young debtors and morning-produced goods as perfect substitutes, the planner first applies the endowment to the consumption of morning-leaving creditors. Case 1 verifies that if the endowment is large enough to satisfy old morning creditors' demand, then the planner's allocation achieves full risk sharing. In contrast, if the endowment is not large enough, the planner finds the efficient trade-off between the morning-leaving creditor's intertemporal marginal rate of substitution (IMRS) and the marginal product (MP) of the morning
production. If, when evaluated at the endowment quantity, the IMRS is higher than the MP, the planner simply rations the endowment of the young to the morning-leaving creditors (Case 2). In contrast, if the IMRS is less than the MP when evaluated at the endowment quantity, the planner's allocation consists of both morning and afternoon production; in the efficient allocation, morning good production is set to equate the IMRS and the MP (Case 3). Both Cases 2 and 3 are characterized by the planner's allocation achieving only partial risk sharing, in the sense that the morning-leaving creditors consume less than the afternoon-leaving ones. In those two cases, the planner compensates the creditors by increasing their consumption when young, and investment declines. Thus, compared with Case 1, it follows immediately that there is less afternoon production and less total production in Cases 2 and 3.

6.2. Equilibrium in the decentralized economy revisited

Consider the decentralized economy with no discount window. It is easy to show that the equilibrium with Diamond–Dybvig creditors is identical to the one with Freeman creditors. With λ > 0, the equilibrium in the decentralized economy does not achieve the planner's allocation for Cases 1 and 2. Though less obvious, the Case 3 planner's allocation is not achieved in the decentralized economy either. To see why, note that the planner's allocation calls for v_1 = v_2; that is, the morning producer's IMRS is equal to one. However, in the decentralized economy with 0 < r < 1, the morning producer's first-order condition is satisfied at v_1 > v_2. In other words, the morning producer's IMRS is less than one.

6.3. Government policies

We discuss optimal central bank policies for each of the three cases in the economy with Diamond–Dybvig creditors. Note that these three cases correspond to the three planner's solutions discussed above. The results are summarized in the following proposition.

Proposition 3. In the economy with Diamond–Dybvig creditors, (i) if the planner's resource constraint in the morning is not binding (Case 1), then setting r = 1 (or ρ = 0) implements the planner's allocation; (ii) if the planner's resource constraint in the morning is binding and the resource equals the consumption of morning-leaving creditors (Case 2), then setting r = û^*_2/û_2 (or ρ = û_2/û^*_2 − 1) implements the planner's allocation; (iii) if endowments are strictly less than the consumption of morning-leaving creditors (Case 3), then no intraday rate can implement the planner's allocation.11

The intuition behind Proposition 3 is as follows. In the planner's allocation, the morning-leaving (afternoon-leaving) creditor's IMRS is equal to the marginal product of morning (afternoon) production. Compared with the afternoon-leaving creditors, the morning-leavers are treated less favorably (at most equally) because morning goods are more costly. The intraday rate in the decentralized economy, which measures the price of morning consumption relative to afternoon consumption, should therefore be positive to match the planner's consumption allocation among the creditors. However, this positive intraday rate also measures the IMRS of the morning producers, who trade the higher return on production for profits in the resale market. From the planner's point of view, the debtor's IMRS should be one, because the consumption needs of the debtors (both morning and afternoon producers) are time-insensitive within a period.
In other words, the intraday rate needs to be set at 0 to match the planner's consumption allocation for the morning producers. So, in general, the single intraday rate cannot achieve the efficient IMRS of both the morning producers (which requires ρ = 0) and the creditors (which requires ρ ≥ 0). The exception occurs in Cases 1 and 2. In Case 1, the endowments of the young debtors are large enough to satisfy the consumption demand of the morning-leaving creditors. Morning consumption does not come at the cost of afternoon consumption. Instead, the timing issue becomes a pure liquidity problem, as in the economy with Freeman creditors, and providing free liquidity solves the problem. In Case 2, endowments are not large enough, but the morning-leaving creditor's IMRS, evaluated at the endowment point, is still large relative to the marginal product of morning production. By imposing a large tax on morning producers – that is, making T^d negative and large enough in absolute value – all debtors are induced to opt for afternoon production. The intraday rate, therefore, can be used exclusively to target the creditors' intertemporal efficiency condition.

Alternatively, let us expand on this intuition by focusing attention on the planner's production scheme. Three variables fully characterize the production scheme in the planner's allocation: (i) the fraction of debtors engaged in morning good production (λ), (ii) the quantity of capital invested in morning good production (k), and (iii) the quantity of capital invested in afternoon good production (k^*). In Case 3, all three variables are non-trivial and must be matched for the decentralized economy to implement the planner's allocation. Yet the government has two policy tools: the intraday rate and the tax-transfer scheme. Policy tools are outnumbered by the variables that fully characterize the planner's production scheme, so the planner's allocation cannot be achieved in general.12 Again, Cases 1 and 2 are special. In these two cases,
11 See Gu et al. (2010) for a complete derivation of the results in the three cases.
12 Note that the nature of the IOU contract does not drive this result. Consider a state-contingent IOU contract under which a debtor pays the face value if he redeems the IOUs in the morning and pays at the interest rate γ if he redeems in the afternoon. Market competition for borrowers drives the interest rate to γ = 1/r − 1. That is, in equilibrium creditors are indifferent between lending to morning and afternoon producers. This additional feature of the contract does not affect the no-arbitrage condition under which debtors are indifferent between becoming a morning or an afternoon producer, and the results in this paper survive under a state-contingent IOU contract. See Gu et al. (2010) for details.
there is no morning production in the planner's production scheme. The government can impose a tax on morning producers to encourage all debtors to choose afternoon production. With λ = 0, the choice of k is trivial; hence, there is only one variable, k^*, in the planner's production scheme that must be matched. A properly chosen positive intraday interest rate can adjust the creditors' IMRS such that k^* = k̂^*.

A secondary, and subtle, reason accounts for why central bank policies cannot implement the planner's allocation in Case 3. With hidden action, debtors' production choices are not observable at the time of purchase. Insofar as creditors cannot distinguish between morning and afternoon producers, they cannot sell capital at different prices to debtors who invest in different production. Thus, there is one price of capital despite two types of production, and the choices of k and k^* are both distorted.13

In the economy with Freeman creditors, the tax-transfer scheme merely redistributes consumption among all agents to achieve the desired welfare weights in the planner's allocation. In the economy with Diamond–Dybvig creditors, the tax-transfer scheme also affects the capital gains in the IOU resale market. In Case 2, for example, the optimal intraday rate is positive. It follows that capital gains are earned by purchasing IOUs in the morning and settling them in the afternoon. With no tax-transfer scheme, some debtors would choose to be morning producers if the intraday rate were high enough. To implement the efficient allocation with a degenerate distribution of settlements, the government taxes morning producers to induce debtors to opt for afternoon production: the tax cancels the capital gains, and only afternoon production takes place. In Case 3, the tax-transfer scheme adjusts the size of the capital gains. The planner's allocation cannot be achieved, but the tax-transfer scheme, together with a positive intraday rate, is the second best.

6.4. Related literature

Cases 2 and 3 follow a theme in the literature on the Friedman rule. In separate papers, Martin and McAndrews (2008) and Bhattacharya et al. (2009) demonstrate that in an economy with production externalities, the optimal policy deviates from the Friedman rule.14 There is a related literature in which moral hazard is present, finding that the optimal intraday rate is positive. Chapman and Martin (2007) develop a model in Freeman's (1999) framework to show that in an economy with a moral hazard problem, a positive intraday rate encourages creditors to monitor the quality of the loans, thus reducing default risk.15 Our model provides an alternative explanation, which relies on the opportunity cost of settling payments early, for adopting a positive intraday rate.

The literature on the endogenous distribution of settlements is thin. Bech and Garratt (2003) apply a partial equilibrium approach to examine the distribution of settlements. Compared with their work, our chief contribution is that our model can address the role that intraday rate policy plays in shaping the equilibrium distribution of settlements. In addition, Martin (2004) analyzes a model in which settlement timing is endogenous. He starts with a model in which payments arrive randomly, then modifies it so that agents can pay a cost to guarantee the early arrival of payments. Both Martin's model and ours use similar approaches to endogenizing settlement timing.
13 Hidden action is not a necessary condition for the results in this paper. In the working paper version of our study (Gu et al., 2010), we consider an economy with no hidden action. Creditors can sell capital at different prices to different types of producers. In equilibrium, the capital prices must satisfy a no-arbitrage condition so that a creditor is indifferent between morning and afternoon production. The debtors' no-arbitrage condition on the production return and profit-making in the IOU resale market is not affected. Propositions 1–3 in the present paper survive in such an economy. Moreover, such an economy is equivalent to an economy with hidden action and contingent IOU contracts.
14 See also Millard et al. (2006). Our paper also overlaps substantially with Lacker (1997) in addressing settlement risk with monetary policy tools. In Lacker (1997), paying the market rate of interest on reserves is analogous to the Friedman rule.
15 See also Martin and McAndrews (2008) for a survey of the literature on intraday money markets.

However, our two approaches yield radically different predictions about the distribution of settlements. To illustrate, consider the case in which the intraday rate is set to zero. In Martin's setup, the resulting distribution of payments is non-degenerate because it is solely determined by the exogenous random arrival rate; in other words, no agent will incur the cost to guarantee early arrival. In contrast, our analysis focuses on the distribution of settlements being determined endogenously: with a zero intraday rate, the equilibrium distribution is degenerate, with the mass of settlements occurring in the afternoon. Our results suggest that a mismatch between consumption timing and production timing must also play a role. With such a consumption–production mismatch, there are intraday frictions that can result in the optimal intraday rate being positive.

7. Conclusion

Introducing two production technologies that differ in output returns and maturities endogenizes the timing of settlements. However, altering the production set in this way is not sufficient to alter optimal policy. Because consumers are indifferent about when production occurs, the fundamental issue in this version of the payment system remains a liquidity problem, and providing liquidity at zero cost implements the efficient allocation.

Our model is then extended to an environment in which the timing of consumption needs coincides with the timing of settlement requests. In this setup, the optimal intraday rate can be positive. With time-sensitive consumption opportunities, consumption and production are explicitly linked by the timing of settlement. Because the intraday rate
affects outputs and consumption allocations, it no longer solves a simple liquidity problem. There are conditions under which the optimal intraday rate is positive. More interestingly, optimal central bank policies do not always implement the efficient allocation. In this more general setup, there are conditions under which the distribution of settlements is nondegenerate and the optimal intraday policy is not a version of the Friedman rule. Overall, our findings indicate that the links between production, consumption and settlement timing combine the liquidity problem and the Fisherian intertemporal problem. Both attributes are potentially important factors when examining optimal intraday interest rate policy. The question regarding the appropriate policy structure is left for future research.
Acknowledgments

We thank Marco Bassetto, Morten Bech, Philip Dybvig, Paula Hernandez, Todd Keister, Antoine Martin, Jamie McAndrews, David Mills, Ed Nosal, Erwan Quintin, two referees and the editor, and seminar participants at Texas A&M University, Southern Methodist University, the Federal Reserve Bank of New York, the Federal Reserve Bank of Dallas, the Federal Reserve Bank of Chicago and the Missouri Economics Conference for helpful comments on earlier drafts of this paper.

References

Angelini, P., 1998. An analysis of competitive externalities in gross settlement systems. Journal of Banking and Finance 22, 1–18.
Armantier, O., Arnold, J., McAndrews, J., 2008. Changes in the timing distribution of Fedwire funds transfers. Federal Reserve Bank of New York Economic Policy Review 14, 83–112.
Bech, M., Garratt, R., 2003. The intraday liquidity management game. Journal of Economic Theory 109, 198–219.
Bhattacharya, J., Haslag, J.H., Martin, A., 2009. Why does overnight liquidity cost more than intraday liquidity? Journal of Economic Dynamics and Control 33, 1236–1246.
Chapman, J., Martin, A., 2007. Rediscounting under aggregate risk with moral hazard. Federal Reserve Bank of New York Staff Reports No. 296.
Diamond, D., Dybvig, P., 1983. Bank runs, deposit insurance, and liquidity. Journal of Political Economy 91, 401–419.
Freeman, S., 1996. The payments system, liquidity, and rediscounting. American Economic Review 86, 1126–1138.
Freeman, S., 1999. Rediscounting under aggregate risk. Journal of Monetary Economics 43, 197–216.
Freeman, S., 2002. Payments and output. Review of Economic Dynamics 5, 602–617.
Green, E., 1997. Money and debt in the structure of payments. Monetary and Economic Studies 15, 63–87.
Gu, C., Guzman, M., Haslag, J.H., 2010. Production, hidden action, and the payment system. Working Paper No. 10-4, University of Missouri.
Lacker, J., 1997. Clearing, settlement and monetary policy. Journal of Monetary Economics 40, 347–381.
Martin, A., 2004. Optimal pricing of intraday liquidity. Journal of Monetary Economics 51, 401–424.
Martin, A., 2008. Reconciling Bagehot and the Fed's response to September 11. Journal of Money, Credit, and Banking 41, 397–415.
Martin, A., McAndrews, J., 2008. Should there be intraday money markets? Staff Reports No. 337, Federal Reserve Bank of New York.
Millard, S., Speight, G., Willison, M., 2006. Why do central banks observe a distinction between intraday and overnight interest rates? Manuscript, Bank of England.
Mills Jr., D.C., 2004. Mechanism design and the role of enforcement in Freeman's model of payments. Review of Economic Dynamics 7, 219–236.
Williamson, S., Wright, R., 2010. New monetarist economics: models. In: Friedman, B., Woodford, M. (Eds.), Handbook of Monetary Economics, vol. 3. Elsevier, pp. 25–96.
Williamson, S., Wright, R., 2010. New monetarist economics: methods. Federal Reserve Bank of St. Louis Review 92 (4), 265–302.
Review of Allan H. Meltzer's A History of the Federal Reserve, Volume 2, University of Chicago Press, 2009

John B. Taylor
Stanford University, Stanford, CA 94305, USA
Article history: Received 5 October 2010; accepted 6 October 2010; available online 22 February 2011.

Abstract: This is a review of Allan Meltzer's "A History of the Federal Reserve, Volume 2." By carefully reviewing thousands of transcripts and records, Meltzer's history lets policy makers explain their decisions in their own words, and creatively weaves historical events into a single exceptionally clear story as he did in Volume 1. In this review I first examine the book's main theme—that discretionary monetary policy failed in the Great Depression (1929–1933), in the Great Inflation (1965–1980), and in the recent Great Recession (2007–2009)—and then consider its main conclusion—that monetary policy should be based on less discretion and more rule-like behavior.

© 2011 Elsevier B.V. All rights reserved.
When Allan Meltzer published Volume 1 of the history of the Federal Reserve (Meltzer, 2003), it was received with wide acclaim. Bordo (2006), reviewing it in the Journal of Monetary Economics, praised it as a "monumental accomplishment" researched with "painstaking detail". Laidler (2003), writing in the Journal of Economic Literature, called it "an exceptionally clear story". Both emphasized that the history complemented Friedman and Schwartz's (1963) monetary history since Meltzer provides a penetrating biography of an institution rather than a broader history of monetary trends. Volume 2, published six years later, is all this and much more. It picks up the story in 1951 where Volume 1 left off. Reflecting the increased volume of relevant source material, it is substantially longer than Volume 1—up from 800 to 1312 pages, which required that it be split into two books. The year 1951 is a logical break point because it coincides with the start of William McChesney Martin's long term as Chairman and also with the year of the famous Accord, which gave the Federal Reserve freedom to set interest rates in a way that would "minimize monetization of the public debt" rather than simply peg the Treasury borrowing rate. To write this history, Meltzer digested thousands of documents—transcripts of meetings of the Federal Open Market Committee, notes and interviews with Fed officials, records at the New York Fed and other regional banks, papers from presidents of the United States and their assistants, not to mention records of the Congress, the Treasury, the Council of Economic Advisers, foreign monetary officials, academics and journalists. He selected key passages and quotes, insisting that policymakers express their views and explain their decisions, good or bad, in their own words; his frequent quotes of policymakers in memos, media appearances, and private interviews are a wonderful trademark of Meltzer's approach to monetary history. He then creatively weaves together the complex series of events into a single narrative, transforming the painstaking details into an exceptionally clear story, just as he did in Volume 1. This is no easy task and no one is more qualified to carry it out than Allan Meltzer, with his long experience as a monetary scholar active in public affairs. Policymakers, monetary economists, and historians should be forever grateful to him for making so much rarely seen but crucial information accessible in a readable and manageable form. Volume 2 can be read independently of Volume 1. Indeed, with a little knowledge of monetary economics and history, it is possible to read and learn a lot from any chapter in either volume independently of other chapters. In fact, this is the
most sensible way to approach such a long history in which many chapters are longer than typical articles and even some books. Choose a chapter or a section you are interested in, read it, then go on to others. If you want to find out more about how to avoid bailouts, read the section "The Penn Central Failure" and you will see that Federal Reserve Board Chair Arthur Burns wanted a bailout while OMB Director George Shultz and the CEA did not, and you will also find out how and why the bailout was avoided and how the Fed prevented contagion anyway. Or you can read why Chairman Martin let inflation rise at the end of his long term, or how decisions were made that would eventually end Bretton Woods with the famous Camp David announcement on August 15, 1971. The excellent index facilitates using Meltzer's history as a reference in this way, though it only appears at the end of Book 2, and scholars might want to consider making a copy and inserting it at the end of Book 1 as well. But Meltzer's history is more than a reference work. It tells a consistent story of policy successes and policy failures with an explanation for both and a lesson for the future. This story is found in Volume 2, but it is also found in Volume 1, and even in what I see as a preliminary-Volume 3, found within Volume 2. The main failures are the Great Depression of the 1930s, the Great Inflation of the 1970s, and the financial crisis which began in 2007 and resulted in the Great Recession. The main successes are the Great Disinflation of the early 1980s and the Great Moderation which succeeded it. To explore this theme of success and failure, I start with a brief review of Volume 1.

1. The Great Depression

Volume 1 begins with the antecedents of the Fed and its founding in 1914. It tracks the Fed's early operations into the 1920s, when policymakers were searching for some kind of guide for actions and settled on a real bills doctrine. It also covers the 1940s, when monetary policy was subservient to the borrowing needs of the Treasury in order to finance government spending in World War II. But the main event in Volume 1 is the Great Depression of the 1930s. Chapter 5 of that volume addresses the key question, "Why Did Monetary Policy Fail in the Thirties?" Meltzer's overall assessment is that the proximate cause of the deep contraction in output and employment was a monetary policy mistake in the form of a massive decline in the money supply. "From the peak of the cycle in the summer of 1929 to the bottom of the depression in March 1933, the stock of money—currency and demand deposits—fell by 28 percent and industrial production fell by 50 percent" (Volume 1, p. 271). Meltzer shows that this decline could have been prevented, and carefully reviews the records looking for the most likely reason for that tragic mistake. He concludes that a flawed view of how monetary policy works was the main problem. Federal Reserve officials viewed the nominal interest rate and member bank borrowings as measures of monetary tightness, rather than tracking the real interest rate and broader monetary aggregates—such as currency and deposits. His search through the Fed's records documents, for example, that "The minutes of the period, statements by Federal Reserve officials, and outside commentary by economists and others do not distinguish between real and nominal interest rates" (Volume 1, pp. 412–413).
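The distinction Meltzer stresses is easiest to see through the Fisher relation. As a stylized illustration (my notation and numbers, not Meltzer's):

\[ r_t = i_t - \pi_t^e, \]

where $i_t$ is the nominal interest rate and $\pi_t^e$ is expected inflation. With prices falling on the order of 10 percent per year in 1930–1932, even a nominal rate as low as 2 percent implied a real rate near 12 percent, so the low nominal rates and low borrowings that officials read as monetary ease were in fact consistent with extreme tightness.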
Meltzer argues that this faulty view, more than anything else—including a lack of leadership stressed by Friedman and Schwartz (1963)—led to harmful discretionary actions as well as to a reluctance to engage in open market purchases, which would have prevented the decline in the money stock. Open market purchases were not thought to be necessary because low member bank borrowings and low interest rates were interpreted as a sign of monetary ease. Hence, an incorrect view of the monetary transmission mechanism took attention away from money supply measures. While the Fed had the data showing the volume of currency hoarded by the public, it did not let the monetary base rise in order to accommodate this increased demand for money. Meltzer shows that the alternative, correct, view that the central bank should control deposits and currency was known at the time—referring, for example, to Keynes' writings—but it was not reflected in the Fed's decisions. The bottom line is that the Fed did not have a sensible strategy or rule to guide its decisions. Meltzer writes that "So certain was the System about the correctness of its actions and its lack of responsibility for the collapse that I have found no evidence the Board undertook an official study of the reasons for the policy failure" (Volume 1, p. 413).

2. The Great Inflation

Next on the list of huge policy failures is the Great Inflation. In much the same way that the Great Depression of the 1930s is the central event of Volume 1, the Great Inflation from 1965 to 1980 is the central event of Volume 2. In fact, Chapter 7 of Volume 2 is an eerie parallel to Chapter 5 of Volume 1, and is entitled "Why Monetary Policy Failed Again in the 1970s." In total, Meltzer devotes three chapters of Volume 2 to the Great Inflation: Chapter 4, which covers the original rise of inflation; Chapter 6, which covers the wage and price controls and their aftermath; and Chapter 7, which addresses the overall "why the failure" question. Moreover, these chapters are preceded by a logical prelude to the Great Inflation—the introduction of Keynesian economics to Washington in the 1960s (Chapter 3)—and they are followed by the Volcker Disinflation (Chapter 8) from 1979 to 1983 and the efforts to restore stability after the legacy of 1970s inflation (Chapter 9) from 1983 to 1986. Meltzer emphasizes that the inflation of the 1970s actually began in 1965 under Martin's chairmanship. The proximate reason for the rise of inflation was the increase in money growth and the corresponding decision to keep interest rates too low to prevent the rise in inflation. Why did the Martin Fed take these actions? Meltzer finds two main explanations. A first reason is that the Fed "accepted its role as a junior partner by agreeing to coordinate its actions with the administration's fiscal policy. Coordination permitted the chairman to discuss the administration's fiscal policy with the
president, but he had little effect on decisions. In practice, coordination meant that the Federal Reserve would not raise interest rates much, if at all" (p. 485). Why did Martin succumb to this junior partnership role? Meltzer argues that it was due to Martin's very narrow view of central bank independence, and he provides evidence in Martin's own words. According to Martin, the Fed should be "independent within the government" (p. 84), elaborating that "I do not believe it is consistent to have an agent so independent that it can undertake, if it chooses, to defeat the financing of a large deficit, which is a policy of the Congress" (p. 85). As Meltzer summarizes, Martin made it clear that "He could not prevent inflation if the deficit remained large, so he could not meet the primary responsibility of an independent central bank—to maintain money's purchasing power" (p. 85). Meltzer offers two other reasons for Martin's going along with the administration's wishes: Martin was committed to consensus-building between the Fed and the administration, and he did not distinguish between real and nominal interest rates, just as the Fed had failed to do in the 1930s. A second reason given by Meltzer for Martin's decision not to allow interest rates to rise was that he succumbed to a new but incorrect theory put forth by administration economists and others, namely a belief in the Phillips curve tradeoff, which lessened the resolve to keep inflation from rising because higher inflation was believed to reduce unemployment permanently. As Meltzer puts it, "Administration economists believed that a little more inflation was the price of permanently lower unemployment. This called for keeping interest rates from rising…" (p. 485). Meltzer refers to documents showing references to the Phillips curve tradeoff from the members of the President's Council of Economic Advisers in the 1960s. This second reason for the Fed's low interest rate and high money growth decisions is similar to one that I have favored (Taylor, 1997), and it differs from many other explanations, such as Brad De Long's, which stresses memories of high unemployment in the Great Depression; Michael Parkin's, which is based on time inconsistency; or Alan Blinder's, which blames the oil price shocks. As argued in Levin and Taylor (forthcoming), Meltzer's first reason for the rise in inflation—the political reason stemming from lack of sufficient Fed independence—is a crucial part of the explanation, because the Fed continued to let inflation rise in the 1970s, long after Milton Friedman and Edmund Phelps convincingly showed the flaws in the concept of a Phillips curve tradeoff. It was not only a flawed theory; it was also a reluctance, for political reasons, to follow a strategy. If you look just a few years later into the Great Inflation, to the period after Richard Nixon was elected president, you see a different view put forth by the Council of Economic Advisers, one consistent with that of Friedman and Phelps. As Meltzer puts it, "the members of the Council of Economic Advisers in the Nixon administration brought different ideas but no less inflation" (p. 485).
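To fix ideas for the discussion that follows, the two views can be stated in a textbook form (my rendering, not the Council's or Meltzer's). The tradeoff view treated the Phillips curve as a stable menu linking inflation and unemployment; the Friedman–Phelps critique inserts expected inflation:

\[ \pi_t = \pi_t^e - \beta (u_t - u^*), \qquad \beta > 0, \]

where $u^*$ is the natural rate of unemployment. Once expectations catch up with actual inflation ($\pi_t^e = \pi_t$), unemployment returns to $u^*$ at any steady inflation rate, so the apparent tradeoff is temporary rather than permanent.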
By then the analytical changes implied by the Friedman–Phelps view were largely accepted, and "These analytic changes could have served as the basis for a successful anti-inflation policy, but they were not used for that purpose… As the 1972 election approached, [concerns about unemployment] became overwhelming… No less important, Arthur Burns…became convinced that the unemployment rate required to reduce inflation would be politically unacceptable. He became the leading proponent in the administration of wage–price guidelines and later wage and price controls" (p. 486). In other words, the theory changed but the policy did not. Meltzer's emphasis on political influence and the lack of independence is the more important explanation. Another piece of evidence which supports Meltzer's stress on political factors relative to changes in economic theory is found in the last few years of the Great Inflation. One might argue—in fact I made that argument in Taylor (1997)—that the reason for the delay in reducing inflation in the late 1970s was the view embedded in the adaptive-expectations-augmented, or accelerationist, version of the Friedman–Phelps theory. Such versions had the implication that reducing inflation would be costly and thus gave a theoretical argument for why policymakers were reluctant to take actions to reduce inflation earlier. But the problem with the argument is that inflation actually got higher, rather than remaining steady, in the late 1970s, as stressed in Levin and Taylor (forthcoming) and shown in Fig. 1, which is drawn from that paper.
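The costliness argument can be made concrete by imposing adaptive expectations, $\pi_t^e = \pi_{t-1}$, in the curve above (again my illustration, with hypothetical numbers):

\[ \pi_t - \pi_{t-1} = -\beta (u_t - u^*), \]

so inflation falls only while unemployment is held above the natural rate, and lowering inflation by $\Delta$ percentage points requires a cumulative unemployment gap of $\Delta/\beta$; with $\beta = 0.5$, for example, a five-point disinflation costs ten point-years of excess unemployment. Note, however, that the same equation implies inflation should stay roughly constant, not accelerate, when unemployment is near $u^*$, which is why the rising inflation in Fig. 1 points to politics rather than theory.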
[Figure] Fig. 1. Actual inflation and short-run inflation expectations, 1955–1985. Series plotted: CPI inflation and the Livingston Survey. Source: Levin and Taylor (2009).
Political factors, or insufficient independence, are thus needed to explain why the Fed deviated from a policy strategy that even a pessimistic short-run accelerationist Phillips curve would support.

3. The Great Disinflation

Meltzer has many positive things to say about monetary policy during the disinflation period which began with Paul Volcker's appointment to the Fed. He emphasizes how Volcker recaptured much of the independence the Fed lost during the Great Inflation and thereby was able to take the necessary steps required to bring about the disinflation and the restoration of stability afterwards. In many ways, the story of the disinflation is the reverse of the stories of the Great Depression and the Great Inflation, which were about either insufficient independence or inadequate monetary theory, and it illustrates how following sound policies consistently can bring impressive results. Meltzer stresses Volcker's reliance on basic economic theory: that inflation is a monetary phenomenon and that there is no long-run Phillips curve tradeoff. In my view, the chapter on the Disinflation is one of the best parts of Meltzer's entire history. It is upbeat and even inspirational. It provides a model for how to make big changes in government. It does this by explaining in detail how Volcker accomplished this seemingly impossible mission. As Meltzer puts it: "President Carter gets credit for appointing [Volcker] and President Reagan for supporting him through a deep recession. But Paul Volcker's major contribution stands out. Unlike 1966, 1969, 1973, and other times, he persisted in an anti-inflation policy long enough to bring the inflation rate down permanently" (p. 1011). There is much to learn from reading about how Volcker pulled this off. Meltzer notes how Volcker was ideally suited for the job. He "had the background and experience to be a successful chairman… Foreign central bankers and New York bankers knew him and had confidence in him. He was knowledgeable and strong-willed, and he recognized the importance of reducing inflation. He was also determined and committed to the task" (p. 1012). He was clear from the start that he would be independent and that he would be willing to go against the wishes of the White House if necessary, and at times he did go against such wishes. Meltzer quotes a memo from Charles Schultze, Chairman of the Council of Economic Advisers, to President Carter in which he urges the president to probe when Volcker could "begin easing a bit" (p. 1020), even though inflation had not yet started coming down. Volcker also kept it simple: "People don't need an advanced course in economics to understand that inflation has something to do with too much money," he said. And he was firm in his intentions. When asked on Face the Nation when he would change from "fighting inflation to fighting unemployment," he answered "I don't think we can stop fighting inflation… I think we've got to keep our eye on that inflationary ball" (p. 1023). And, most famously, he became deeply involved in technical matters, relying on a small group of economic advisers such as Steve Axilrod. Indeed, from a technical monetary economics viewpoint, his change in operating procedures is the most interesting action he took. Meltzer tells this fascinating story in detail. It begins with a meeting of the Federal Reserve Board on September 18, 1979, a watershed event occurring soon after Volcker was appointed.
On that day the Board approved a proposal to increase the discount rate by 50 basis points, but the vote was very close, four to three, which implied to the markets a low chance of additional tightening. With excessive money growth and high inflation continuing, the market had already begun to lose confidence in Volcker's ability to reduce inflation. It was then that Volcker knew he had to try something different if he was to get broad support for what had to be done. Working with the Fed staff, he assembled a three-part proposal: first, an increase in the discount rate of one percentage point to 12%; second, an increase in marginal reserve requirements on managed liabilities of large banks; third, a new operating procedure, which would focus on reserves rather than on the federal funds rate. By targeting reserves, the new procedures would result in more sizable interest rate responses to inflation and would prevent the Fed from falling behind the curve. They would also allow Volcker to say that the market was setting the interest rate, rather than the Fed, and they even offered the possibility of a quick downward movement in the federal funds rate if the economy fell into a deep recession. But the main reason for the change, according to Meltzer, was to give a "psychological push to the idea that inflation would slow or end" (pp. 1030–1031).
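A minimal sketch of why a reserves target makes the funds rate respond automatically (standard money-market algebra, not Meltzer's or the Fed staff's derivation): suppose money demand is

\[ \ln M_t - \ln P_t = \alpha - \beta i_t + \gamma \ln y_t. \]

Under a funds-rate target the Fed fixes $i_t$ and passively supplies whatever money is demanded. Under the new procedure it instead fixes (nonborrowed) reserves, and hence approximately $M_t$, so the funds rate must adjust to clear the market:

\[ i_t = \frac{\alpha + \gamma \ln y_t + \ln P_t - \ln M_t}{\beta}. \]

Higher inflation raises $\ln P_t$ and so pushes $i_t$ up without any discrete FOMC decision, which is also why Volcker could say that the market, not the Fed, was setting the rate.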
When the FOMC and the Board met on October 6, 1979, the vote for the proposal was unanimous, which represented a huge change in less than a month from the September 18 meeting. Those who had earlier dissented from or disagreed with raising the interest rate by half a percentage point now agreed to increase it by a full percentage point. Why the change? Meltzer's answer is that the crisis was more apparent. However, as I argued in Taylor (2005), I think an additional, perhaps even more important reason was that Volcker put the package together in a way that was purposely designed to get wide support; it had something for everyone, including a cut in rates if things became dire. In any case, the October 6 votes provide a clear example of good implementation of economic policy in practice. Both the knowledge of what to do and the leadership to get it done were essential. Simply knowing the economic theory or proposing the economic reform is not enough. Implementing the solution required leadership and skillful coalition-building. Unfortunately, the immediate impact of the policy change was not good. Long-term interest rates rose and the stock market fell, and the administration expressed its concerns. Charles Schultze wrote a memo to the president emphasizing the need for an alternative policy of wage and price controls. The disinflation was not going to be easy. In Meltzer's view the October 6 decisions were a watershed in another, more lasting sense. The Fed "implicitly changed the weights on unemployment and inflation" (pp. 1033–1034). In another appearance on Face the Nation Volcker was asked: "How high an unemployment rate are you prepared to accept in order to break inflation?" His answer included the important line "over time we have no choice but to deal with the inflationary situation because over time inflation and unemployment go together" (p. 1034). An important part of this history is how the disinflation effort was almost aborted, and certainly delayed, by the request by President Carter to impose credit controls on March 14, 1980. Most of Carter's advisers and those at the Fed opposed the idea, but the decision was made for political reasons "to provide an alternative to high interest rates" (p. 1050), according to Meltzer's research. While the Federal Reserve Board was reluctant, it went along and voted five to one to adopt the controls. The controls led to a reduction in money growth and worsened the recession; the federal funds rate was allowed to decline. By summer the negative impact of the controls was clear and the Fed voted to eliminate them on July 2. Meltzer shows how the overall experience with the controls was counterproductive because the Fed "appeared to abandon its anti-inflation policy under administration pressure… The episode increased skepticism and further weakened the Federal Reserve's credibility" (p. 1057). In effect, the disinflation effort was delayed by almost a year. "Although the Federal Reserve began its anti-inflation program in October 1979, it had to start over again in the fall of 1980" (p. 1094). As Goodfriend and King (forthcoming) emphasize, long-term interest rates were higher after this episode than they were in October 1979. This time, however, was different. Volcker and the Fed stuck with it. The federal funds rate went as high as 20% in January 1981 and again in July 1981, and the tighter monetary policy brought the inflation rate down. Happily, the focus on inflation remained, and inflation continued to decline and then stayed down for the rest of the 1980s and 1990s.

4. The Great Moderation

Meltzer states in the first sentence of Volume 2 that he has no plans for another volume that would, when combined with Volumes 1 and 2, complete the history of the Fed from its founding to the present. But in several ways Meltzer does bring the history up to date, and indeed there is a "preliminary-Volume 3" hidden in the concluding Chapter 10, in the Epilogue, and in the many references to research in monetary economics completed after the span of time covered in most of the chapters, even if simply to look back on the history from a more modern vantage point. This preliminary-Volume 3, covering the period from 1986 through the financial crisis in 2008, is largely about the Great Moderation and the Great Recession. Meltzer's assessment of monetary policy during the Great Moderation is as favorable as it is of policy during the Volcker Disinflation. Of course the actual economic performance was excellent. The volatility of output and inflation was low, expansions were long and recessions were infrequent and relatively mild. And he gives monetary policy and monetary policymakers credit. "From the end of the Great Inflation to 2007, the United States experienced three of its longest expansions interrupted by relatively mild, brief recessions. Federal Reserve policy contributed to the change sometimes called 'the great moderation'" (p. 1217). He notes that Volcker's successor Alan Greenspan also "put greater weight on inflation control" (p. 1206), and that Greenspan and other FOMC members "said repeatedly that low inflation encouraged and even facilitated high employment and economic growth. The facts supported them" (p. 1207).
Going on, he emphasizes that, although the Fed did not formally adopt a policy rule, it recognized that "policy should not be made one meeting at a time" (p. 1238), and that "In 1994 it began to announce the current interest rate target and some clues to what it planned for the next meeting" (pp. 1238–1239). I think Meltzer's critique of the rational expectations/sticky price models which became the analytical framework during the Great Moderation paints too much of a caricature by stressing that they have only one interest rate, or do not have money in them. We should not confuse simplified textbook versions of these models, which frequently boil down to only three equations, with more detailed models used for policy. For practical monetary policy work, those simplifying assumptions are usually relaxed, as I think the collection of models by Wieland et al. (2009) illustrates. Some of the more complex models have time-varying risk premia in the term structure of interest rates, an exchange rate channel, and more than one country. Nevertheless, we can always improve monetary models, and central banks should try to be at the forefront of these improvements. In any case, Meltzer argues, correctly in my view, that during the Great Moderation period the Fed did not fall prey to political pressure, nor was it trapped by a faulty theory. Its policy framework was one that emphasized predictable rule-like behavior.

5. The financial crisis and Great Recession

Meltzer is as critical of monetary policy leading up to and during the financial crisis that began in 2007 as he is of policy during the Great Depression and the Great Inflation. Reading the material presented in the Epilogue, one sees a remarkable consistency in Meltzer's analysis of the policy mistakes which led to the crisis that began in 2007. Leading up to the crisis, in the period from 2003 to 2005, "Federal Reserve policy was too expansive as judged by the Taylor rule or the 1 percent Federal funds rate that held the real short-term interest rate negative in an expanding economy" (p. 1248). Why did monetary policy keep rates too low? Meltzer gives several explanations but emphasizes that "Chairman Alan Greenspan believed and said that the country faced risk of deflation," to which Meltzer adds, "That was a mistake" (p. 1248).
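For reference, the rule invoked in Meltzer's quote, in its standard form, sets the federal funds rate as

\[ i_t = \pi_t + r^* + 0.5(\pi_t - \pi^*) + 0.5\, y_t, \]

with an equilibrium real rate $r^*$ and inflation target $\pi^*$ of 2 percent each, and $y_t$ the percentage output gap. As a rough illustrative calculation (mine, not a figure from the book): with inflation running near 2.5 percent and output close to potential in 2003–2005, the rule prescribes a funds rate around 4.75 percent, far above the actual 1 percent, which is the sense in which policy held the real short-term rate negative.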
Then, during the crisis, Meltzer faults the Fed's lender-of-last-resort actions because "There was no clear pattern, no consistency, in the decisions" (p. 1249). And he comes back to the political pressures and the loss of independence. He notes that "Chairman Bernanke seemed willing to sacrifice most of the independence that Paul Volcker restored in the 1980s. He worked closely with the Treasury and yielded to the chairs of the House and Senate Banking Committees and others in Congress" (p. 1243). Thus, in sharp contrast to the Disinflation and the Great Moderation, we see a reappearance of both political pressures and a faulty model.

6. The lesson

In sum, Meltzer's history of the Federal Reserve is a history of either making or avoiding two principal sources of policy errors: "political interferences or pressure and mistaken beliefs" (p. 1217). The Great Depression was more the latter source of error: mistaken beliefs about the real bills doctrine. The Great Inflation was a combination of both types of errors, but failure to resist short-term political pressures dominated, because when beliefs changed in the 1970s, policies did not. In contrast, the Disinflation period is marked by an absence of both types of mistakes, as Volcker regained independence and restored basic monetary fundamentals about the impact of changes in the money supply and interest rates. Similarly, the Great Moderation was a period in which rules-based policy, grounded in fundamentals, was followed, and independence was solidified, especially in the 1990s when independence of the Fed from the Treasury was discussed and accepted by the administration. The financial crisis and the Great Recession were a return to a combination of both kinds of errors: a departure from the rules-based policies that worked in the Great Moderation, and a loss of independence as the Fed engaged in fiscal and credit allocation policy, either at the urging of the administration or because it felt such discretionary actions were good policy. Meltzer's historical research leads him to a clear and convincing policy conclusion, as he wrote in the final pages of the book as it went to press in 2009: "Discretionary policy failed in 1929–33, in 1965–80, and now" (p. 1255). And equally clear and convincing is that "The lesson should be less discretion and more rule-like behavior" (p. 1255). While I registered some disagreements with parts of Meltzer's history in my summary, I think his overall conclusion about monetary policy is largely correct. But most important, the facts and the references are there for anyone to inspect and debate.

7. Other stories and subplots

In emphasizing this broad policy theme, I do not want to give the impression that this is all there is in Meltzer's history. In particular, there is much important and useful material on specific institutional reform efforts. Parts of Chapter 2 concern reform and reorganization of the Federal Reserve System in the 1950s by William McChesney Martin. Martin wanted to centralize more decision-making power at the Board in Washington. While he was not successful in having the manager of the open market desk report directly to the Board, rather than to the president of the New York Fed, he was successful in shifting more control of FOMC decisions to the Board in Washington. For example, the Board usually decided in advance whether it would approve discount rate changes requested by the district banks.
There is also much in Chapter 2 about Martin's idea of Federal Reserve independence, discussed above, and an interesting episode was Martin's offer to resign when Eisenhower was elected in 1952. Of course, the resignation was not accepted, but Meltzer notes that Martin did not offer his resignation after any other presidential election. Chapter 5 is a detailed study of the international side of monetary policy and can be read almost as a separate history, running in parallel to the Great Inflation episode discussed above. The story begins with the growing U.S. trade deficits in the 1960s and the reluctance to deal with them in a way which might risk slowing economic growth. The result was failed policies of capital controls and efforts to twist the yield curve by buying longer-term Treasuries in "operation twist." These policy inconsistencies naturally led to the decision to go off the gold standard, announced in 1971 by Richard Nixon, and then to set up a procedure which would lead to a system of flexible exchange rates. This part of the story extends into Chapter 6. I found it especially noteworthy that most of these momentous decisions were made by a very small group of people, including George Shultz, John Connally, and Peter Peterson, meeting in early August 1971. Later, Paul Volcker, as Under Secretary of the Treasury, would chair a group and devise a negotiating strategy that was successful in getting other countries to change their own exchange rates. It was a prelude to the combination of consensus-building and a tough negotiating strategy that would serve Volcker well a decade later as he led the Disinflation.

8. Conclusion

In sum, there is much to like in Meltzer's A History of the Federal Reserve, Volume 2, just as there was in Volume 1. Not everyone will agree as much as I do with the key policy conclusion calling for less discretion and a more rule-like policy, but one cannot disagree that the history is comprehensive, thorough, and serious. Moreover, the facts presented in the book offer an opportunity for protagonists on either side of the rules versus discretion debate—whether inside or outside of central banks—to come closer together or else explain their differences. In this regard let me emphasize that while Meltzer does not hold back either his strong criticism or his strong praise of Federal Reserve decisions, he applauds the
integrity and purposefulness of the people who work and who have worked at the Federal Reserve. As Meltzer writes in the preface to Volume 2, "Although I find many reasons to criticize decisions, I praise the standards and integrity of the principals."

References

Bordo, Michael, 2006. Review of A History of the Federal Reserve, Volume 1 (2003) by Allan H. Meltzer. Journal of Monetary Economics 53, 633–657.
Friedman, Milton, Schwartz, Anna J., 1963. A Monetary History of the United States, 1867–1960. Princeton University Press, Princeton, NJ.
Goodfriend, Marvin, King, Robert G. The great inflation drift. In: Bordo, Michael D., Orphanides, Athanasios (Eds.), The Great Inflation. University of Chicago Press, forthcoming. http://www.nber.org/chapters/c9168.pdf
Laidler, David, 2003. Meltzer's history of the Federal Reserve. Journal of Economic Literature 41 (4), 1256–1271.
Levin, Andrew, Taylor, John B. Falling behind the curve: a positive analysis of stop–start monetary policies and the great inflation. In: Bordo, Michael D., Orphanides, Athanasios (Eds.), The Great Inflation. University of Chicago Press, forthcoming. http://www.nber.org/chapters/c9170.pdf
Meltzer, Allan H., 2003. A History of the Federal Reserve, Volume 1. University of Chicago Press, Chicago.
Taylor, John B., 1997. Comment on 'America's peacetime inflation: the 1970s' by J. Bradford De Long. In: Romer, Christina, Romer, David (Eds.), Reducing Inflation. University of Chicago Press, Chicago.
Taylor, John B., 2005. The international implications of October 1979: toward a long boom on a global scale. Federal Reserve Bank of St. Louis Review, March/April, Part 2.
Wieland, Volker, Cwik, Tobias, Mueller, Gernot, Schmidt, Sebastian, Wolters, Maik, 2009. A New Comparative Approach to Macroeconomic Modelling and Policy Analysis. Manuscript, Center for Financial Studies, Frankfurt.