INTRODUCTION TO THE SERIES
The aim of the Handbooks in Economics series is to produce Handbooks for various branches of economics, each of which is a definitive source, reference, and teaching supplement for use by professional researchers and advanced graduate students. Each Handbook provides self-contained surveys of the current state of a branch of economics in the form of chapters prepared by leading specialists on various aspects of this branch of economics. These surveys summarize not only received results but also newer developments, from recent journal articles and discussion papers. Some original material is also included, but the main goal is to provide comprehensive and accessible surveys. The Handbooks are intended to provide not only useful reference volumes for professional collections but also possible supplementary readings for advanced courses for graduate students in economics.

KENNETH J. ARROW and MICHAEL D. INTRILIGATOR
PUBLISHER'S NOTE
For a complete overview of the Handbooks in Economics Series, please refer to the listing at the end of this volume.
CONTENTS OF THE HANDBOOK
VOLUME 1A
PART 1 - EMPIRICAL AND HISTORICAL PERFORMANCE
Chapter 1
Business Cycle Fluctuations in US Macroeconomic Time Series
JAMES H. STOCK and MARK W. WATSON
Chapter 2
Monetary Policy Shocks: What Have We Learned and to What End?
LAWRENCE J. CHRISTIANO, MARTIN EICHENBAUM and CHARLES L. EVANS
Chapter 3
Monetary Policy Regimes and Economic Performance: The Historical Record
MICHAEL D. BORDO and ANNA J. SCHWARTZ
Chapter 4
The New Empirics of Economic Growth
STEVEN N. DURLAUF and DANNY T. QUAH
PART 2 - METHODS OF DYNAMIC ANALYSIS
Chapter 5
Numerical Solution of Dynamic Economic Models
MANUEL S. SANTOS
Chapter 6
Indeterminacy and Sunspots in Macroeconomics
JESS BENHABIB and ROGER E.A. FARMER
Chapter 7
Learning Dynamics
GEORGE W. EVANS and SEPPO HONKAPOHJA
Chapter 8
Micro Data and General Equilibrium Models
MARTIN BROWNING, LARS PETER HANSEN and JAMES J. HECKMAN
PART 3 - MODELS OF ECONOMIC GROWTH
Chapter 9
Neoclassical Growth Theory
ROBERT M. SOLOW
Chapter 10
Explaining Cross-Country Income Differences
ELLEN R. McGRATTAN and JAMES A. SCHMITZ, Jr.
VOLUME 1B
PART 4 - CONSUMPTION AND INVESTMENT
Chapter 11
Consumption
ORAZIO P. ATTANASIO
Chapter 12
Aggregate Investment
RICARDO J. CABALLERO
Chapter 13
Inventories
VALERIE A. RAMEY and KENNETH D. WEST
PART 5 - MODELS OF ECONOMIC FLUCTUATIONS
Chapter 14
Resuscitating Real Business Cycles
ROBERT G. KING and SERGIO T. REBELO
Chapter 15
Staggered Price and Wage Setting in Macroeconomics
JOHN B. TAYLOR
Chapter 16
The Cyclical Behavior of Prices and Costs
JULIO J. ROTEMBERG and MICHAEL WOODFORD
Chapter 17
Labor-Market Frictions and Employment Fluctuations
ROBERT E. HALL
Chapter 18
Job Reallocation, Employment Fluctuations and Unemployment
DALE T. MORTENSEN and CHRISTOPHER A. PISSARIDES
VOLUME 1C
PART 6 - FINANCIAL MARKETS AND THE MACROECONOMY
Chapter 19
Asset Prices, Consumption, and the Business Cycle
JOHN Y. CAMPBELL
Chapter 20
Human Behavior and the Efficiency of the Financial System
ROBERT J. SHILLER
Chapter 21
The Financial Accelerator in a Quantitative Business Cycle Framework
BEN S. BERNANKE, MARK GERTLER and SIMON GILCHRIST
PART 7 - MONETARY AND FISCAL POLICY
Chapter 22
Political Economics and Macroeconomic Policy
TORSTEN PERSSON and GUIDO TABELLINI
Chapter 23
Issues in the Design of Monetary Policy Rules
BENNETT T. McCALLUM
Chapter 24
Inflation Stabilization and BOP Crises in Developing Countries
GUILLERMO A. CALVO and CARLOS A. VEGH
Chapter 25
Government Debt
DOUGLAS W. ELMENDORF and N. GREGORY MANKIW
Chapter 26
Optimal Fiscal and Monetary Policy
V.V. CHARI and PATRICK J. KEHOE
PREFACE TO THE HANDBOOK
Purpose
The Handbook of Macroeconomics aims to provide a survey of the state of knowledge in the broad area that includes the theories and facts of economic growth and economic fluctuations, as well as the consequences of monetary and fiscal policies for general economic conditions.

Progress in Macroeconomics
Macroeconomic issues are central concerns in economics. Hence it is surprising that (with the exception of the subset of these topics addressed in the Handbook of Monetary Economics) no review of this area has been undertaken in the Handbooks in Economics series until now. Surprising or not, we find that now is an especially auspicious time to present such a review of the field. Macroeconomics underwent a revolution in the 1970's and 1980's, due to the introduction of the methods of rational expectations, dynamic optimization, and general equilibrium analysis into macroeconomic models, to the development of new theories of economic fluctuations, and to the introduction of sophisticated methods for the analysis of economic time series. These developments were both important and exciting. However, the rapid change in methods and theories led to considerable disagreement, especially in the 1980's, as to whether there was any core of common beliefs, even about the defining problems of the subject, that united macroeconomists any longer. The 1990's have also been exciting, but for a different reason. In our view, the modern methods of analysis have progressed to the point where they are now much better able to address practical or substantive macroeconomic questions - whether traditional, new, empirical, or policy-related. Indeed, we find that it is no longer necessary to choose between more powerful methods and practical policy concerns. We believe that both the progress and the focus on substantive problems have led to a situation in macroeconomics where the area of common ground is considerable, though we cannot yet announce a "new synthesis" that could be endorsed by most scholars working in the field. For this reason, we have organized this Handbook around substantive macroeconomic problems, and not around alternative methodological approaches or schools of thought.
The extent to which the field has changed over the past decade is considerable, and we think that there is a great need for the survey of the current state of macroeconomics that we and the other contributors to this book have attempted here. We hope that the Handbook of Macroeconomics will be useful as a teaching supplement in graduate courses in the field, and also as a reference that will assist researchers in one area of macroeconomics to become better acquainted with developments in other branches of the field.

Overview
The Handbook of Macroeconomics includes 26 chapters, arranged into seven parts. Part 1 reviews evidence on the Empirical and Historical Performance of the aggregate economy, to provide factual background for the modeling efforts and policy discussion of the remaining chapters. It includes evidence on the character of business fluctuations, on long-run economic growth and the persistence of cross-country differences in income levels, and on economic performance under alternative policy regimes. Part 2 on Methods of Dynamic Analysis treats several technical issues that arise in the study of economic models which are dynamic and in which agents' expectations about the future are critical to equilibrium determination. These include methods for the calibration and computation of models with intertemporal equilibria, the analysis of the determinacy of equilibria, and the use of "learning" dynamics to consider the stability of such equilibria. These topics are important for economic theory in general, and some are also treated in the Handbook of Mathematical Economics, the Handbook of Econometrics, and the Handbook of Computational Economics, for example, from a somewhat different perspective. Here we emphasize results - such as the problems associated with the calibration of general equilibrium models using microeconomic studies - that have particular application to macroeconomic models. The Handbook then turns to a review of theoretical models of macroeconomic phenomena. Part 3 reviews Models of Economic Growth, including both the determinants of long-run levels of income per capita and the sources of cross-country income differences. Both "neoclassical" and "endogenous" theories of growth are discussed. Part 4 treats models of Consumption and Investment demand, from the point of view of intertemporal optimization. Part 5 covers Models of Economic Fluctuations.
In the chapters in this part we see a common approach to model formulation and testing, emphasizing intertemporal optimization, quantitative general equilibrium modeling, and the systematic comparison of model predictions with economic time series. This common approach allows for consideration of a variety of views about the ultimate sources of economic fluctuations and of the efficiency of the market mechanisms that amplify and propagate them. Part 6 treats Financial Markets and the Macroeconomy. The chapters in this part consider the relation between financial market developments and aggregate economic
activity, both from the point of view of how business fluctuations affect financial markets, and how financial market disturbances affect overall economic activity. These chapters also delve into the question of whether financial market behavior can be understood in terms of the postulates of rational expectations and intertemporal optimization that are used so extensively in modern macroeconomics - an issue of fundamental importance to our subject that can be, and has been, subject to special scrutiny in the area of financial economics because of the unusual quality of available data. Finally, Part 7 reviews a number of Monetary and Fiscal Policy issues. Here we consider both the positive theory (or political economics) of government policymaking and the normative theory. Both the nature of ideal (or second-best) outcomes according to economic theory and the choice of simple rules that may offer practical guidance for policymakers are discussed. Lessons from economic theory and from experience with alternative policy regimes are reviewed. None of the chapters in this part focuses entirely on international, or open economy, macroeconomic policies, because many such issues are addressed in the Handbook of International Economics. Nevertheless, open-economy issues cannot be separated from closed-economy issues, as the analysis of disinflation policies and currency crises in this part of the Handbook of Macroeconomics, or the analysis of policy regimes in Part 1 of the Handbook of Macroeconomics, makes clear.

Acknowledgements
Our use of the pronoun "we" in this preface should not, of course, be taken to suggest that much, if any, of the credit for what is useful in these volumes is due to the Handbook's editors. We wish to acknowledge the tremendous debt we owe to the authors of the chapters in this Handbook, who not only prepared the individual chapters, but also provided us with much useful advice about the organization of the overall project. We are grateful for their efforts and for their patience with our slow progress toward completion of the Handbook. We hope that they will find that the final product justifies their efforts. We also wish to thank the Federal Reserve Bank of New York, the Federal Reserve Bank of San Francisco, and the Center for Economic Policy Research at Stanford University for financial support for two conferences on "Recent Developments in Macroeconomics" at which drafts of the Handbook chapters were presented and discussed; we are especially grateful to Jack Beebe and Rick Mishkin, who made these two useful conferences happen. The deadlines, feedback, and commentary at these conferences were essential to the successful completion of the Handbook. We also would like to thank Jean Koentop for managing the manuscript as it neared completion.

Stanford, California
Princeton, New Jersey

John B. Taylor
Michael Woodford
Chapter 11

CONSUMPTION*

ORAZIO P. ATTANASIO
University College London, Institute for Fiscal Studies and NBER

Contents
Abstract 742
Keywords 742
1. Introduction 743
2. Stylised facts 745
 2.1. Aggregate time series data 746
 2.2. Household consumption expenditure 750
  2.2.1. Nature of the data sets and their comparability with the National Account data 751
  2.2.2. Life cycle profiles 752
3. The life cycle model 760
 3.1. The simple textbook model 761
 3.2. Quadratic preferences, certainty equivalence and the permanent income model 762
 3.3. The Euler equation approach 765
 3.4. Precautionary motives for saving 770
 3.5. Borrowing restrictions 772
 3.6. Taking into account demographics, labour supply and unobserved heterogeneity 777
 3.7. Bequest motives 780
4. Aggregation issues 781
 4.1. Aggregation across consumers 781
 4.2. Aggregation across commodities 782
5. Econometric issues and empirical evidence 783
 5.1. Aggregate time series studies 784
 5.2. Micro data: some econometric problems 785
  5.2.1. Consistency of estimators derived from Euler equations 785
  5.2.2. Average cohort techniques 787
* A preliminary draft of this chapter was presented at a conference at the New York Fed, February 27-28, 1997, where I received useful comments from my discussant, Chris Carroll, and several participants. Tullio Jappelli provided many careful and insightful comments for which I am very grateful. I would like to thank Margherita Borella for research assistance and James Sefton for providing me with the UK National Accounts data. Material from the FES made available by the ONS through the ESRC Data Archive has been used by permission of the Controller of HMSO. Neither the ONS nor the ESRC Data Archive bear any responsibility for the analysis or interpretation of the data reported here.
Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
  5.2.3. Conditional second (and higher) moments 788
 5.3. Micro data: some evidence 789
6. Where does the life cycle model stand? 791
7. Insurance and inequality 795
8. Intertemporal non-separability 798
 8.1. Durables 799
 8.2. Habit formation 802
9. Conclusions 804
References 805
Abstract
Consumption is the largest component of GDP. Since the 1950s, the life cycle and the permanent income models have constituted the main analytical tools for the study of consumption behaviour, both at the micro and at the aggregate level. Since the late 1970s the literature has focused on versions of the model that incorporate the hypothesis of Rational Expectations and a rigorous treatment of uncertainty. In this chapter, I survey the most recent contributions and assess where the life cycle model stands. My reading of the evidence and of recent developments leads me to stress two points: (i) the model can only be tested and estimated using a flexible specification of preferences and individual-level data; (ii) it is possible to construct versions of the model that are not rejected by the data. One of the main problems of the approach used in the literature to estimate preferences is the lack of a 'consumption function'. A challenge for future research is to use preference parameter estimates to construct such functions.
Keywords
consumption, life cycle model, household behaviour
JEL classification: E2
1. Introduction
In most developed economies, consumption accounts for about two thirds of GDP. Moreover, it is from consumption that, in all likelihood, utility and welfare are in large part determined. It is therefore natural that macroeconomists have devoted a considerable amount of research effort to its study. In modern macroeconomics, consumption is typically viewed as part of a dynamic decision problem. There is therefore another sense in which an understanding of consumption is central for macroeconomics. Consumption decisions are also saving decisions, from which the funds available for capital accumulation and investment arise. Therefore, consumers' attitudes to saving, risk bearing and uncertainty are crucial to understanding the behaviour of capital markets, the process of investment, and growth and development. It is not by chance that modern consumption theory is also used to characterise asset price equilibrium conditions. The desire consumers might have to smooth fluctuations over time determines the need for particular financial instruments or institutions. Understanding recent trends in consumption and saving is crucial to the study, both positive and normative, of the development of financial markets, of the institutions that provide social safety nets, of the systems through which retirement income is provided and so on. One of the main themes of this chapter is that consumption decisions cannot be studied in isolation. Exactly because consumption and saving decisions are part of a dynamic optimisation problem, they are determined jointly with a number of other choices, ranging from labour supply to household formation and fertility decisions, to planned bequests. While modelling all aspects of human economic behaviour simultaneously is probably impossible, it is important to recognise that choices are taken simultaneously and to control for the effects that various aspects of the economic environment in which consumers live might have on any particular choice.
This is particularly true if one wants to estimate the parameters that characterise individual preferences. Implicit in this argument is another of the main themes of this chapter: consumption decisions should be modelled within a well-specified and coherent optimisation model. Such a model should be flexible and allow for a variety of factors. Indeed, I think it is crucial that the model should be interpreted as an approximation of reality and should allow for a component of behaviour that we are not able to explain. However, such a model is crucial to organise our thinking and our understanding of the data. Without a structural model it is not possible to make any statement about observed behaviour or to evaluate the effect of any proposed change in economic policy. This, however, is not a call for blind faith in structural models. Inferences should always be conditional on the particular identification restrictions and structural model used. Such models should also be as flexible as possible and incorporate as much information about individual behaviour as is available. It should be recognised, however, that without such models we cannot provide more than a statistical description of the data.
The other main theme of the analysis in this chapter is that to understand aggregate trends it is necessary to conduct, in most situations, a detailed analysis of individual behaviour. In other words, aggregation problems are too important to be ignored. This obviously does not mean that the analysis of aggregate time series data is not useful. Indeed, I start the chapter with a brief summary of the main time series properties of consumption. Estimation of structural models of economic behaviour, however, cannot be performed using aggregate data only. This chapter is not an exhaustive survey of the literature on consumption: such a literature has grown so much that it would be hard even to list it, let alone summarise all the contributions. What I offer, instead, is a discussion of the current status of our knowledge, with an eye to what I think are the most interesting directions for future research. In the process of doing so, however, I discuss several of the most important and influential contributions. Omissions and exclusions are unavoidable and should not be read as indicating a negative judgement on a particular contribution. At times, I simply chose, among several contributions, those that most suited my arguments and helped me the most to make a given point. Moreover, notwithstanding the length of the chapter, not every sub-field and interesting topic has been covered. But a line had to be drawn at some point. There are four fields that I did not include in the chapter and over which I have agonised considerably. The first is asset pricing: while much of the theoretical material I present has direct implications for asset prices, I decided to omit a discussion of these implications as there is an entire chapter of this Handbook devoted to these issues. The second is the axiomatisations of behaviour under uncertainty alternative to expected utility.
There are several interesting developments, including some that have been used in consumption and asset pricing theory, such as the Kreps-Porteus axiomatisation used by Epstein and Zin (1989, 1991) in some fascinating papers. The third is the consideration of within-household allocation of resources. There is some exciting research being developed in this area, but I decided to draw the line of 'macro' at the level of the individual household. Finally, I do not discuss theories of consumption and saving behaviour that do not assume optimising and fully rational behaviour. Again, there is some exciting work in the area of social norms, mental accounting, time-varying preferences, herd behaviour and so on. In the end, however, I decided that it would not fit with the rest of the chapter and, rather than giving just a nod to this growing part of the literature, I decided to leave it out completely. The chapter is organised as follows. In Section 2, I start with a brief description of some stylised facts about consumption. These include both facts derived from aggregate time series data and from household level data. Throughout the section, I use in parallel data from two countries: the USA and the UK. In Section 3, I discuss at length what I think is the most important model of consumption behaviour we have, the life cycle model. In that section, I take a wide view of what I mean by the life cycle model: definitely not the simple textbook version according to which the main motivation for saving is the accumulation of resources to provide for retirement. Instead, I favour a flexible version of the model where demographics, labour supply, uncertainty and precautionary saving and possibly
bequests play an important role. In other words, I consider the life cycle model as a model in which consumption decisions are determined within an intertemporal optimisation framework. What elements of this model turn out to be more important is largely an empirical matter. Indeed, even the presence of liquidity constraints, or borrowing restrictions, can and should be incorporated within this framework. In Section 4, I discuss aggregation problems. In particular, I focus on two different kinds of aggregation: that across consumers and that across commodities. The aim of this section is not just to give lip service to the aggregation issues and proceed to sweep them under the carpet. With the development of computing and storage capability and with the availability of increasingly large numbers of micro data sets, it is important to stress that scientific research on consumption behaviour cannot afford to ignore aggregation issues. In Section 5, I consider the empirical evidence on the life cycle model and discuss both evidence from aggregate time series data and evidence from micro data. In this section I also address a number of econometric problems with the analysis of Euler equations for consumption. In Section 6, I take stock of what I think is the status of the life cycle model, given the evidence presented in Section 5. In Section 7, I address the issues of insurance and inequality. In particular, I present some of the tests of the presence of perfect insurance and discuss the little evidence there is on the evolution of consumption inequality and its relationship to earning inequality. Most of the models considered up to this point assume time separability of preferences. While such a hypothesis is greatly convenient from an analytical point of view, it is easy to think of situations in which it is violated. In Section 8, I discuss two forms of time dependence: that induced by the durability of commodities and that induced by habit formation.
Section 9 concludes the chapter.
2. Stylised facts
In this section, I document the main stylised facts about consumption behaviour using both aggregate and individual data. I consider two components of consumption expenditure: expenditure on non-durables and services, and expenditure on durables. In addition, I also consider disposable income. While most of the facts presented here are quite well established, the evidence in this section constitutes the background against which one should set the theoretical models considered in the rest of the chapter. The data used come from two western countries: the United States and the United Kingdom. I have deliberately excluded from the analysis developing or less developed countries, as they involve an additional set of issues which are not part of the present discussion. Among the developed countries I have chosen the USA and the UK both because data from these two countries have been among the most widely studied and because the two countries have the best micro data on household consumption. For
Fig. 1. Disposable income (top curve) and consumption, divided into durables (bottom curve) and non-durables (middle curve); left panel: USA, right panel: UK.
the UK, in particular, the Family Expenditure Survey runs for 25 consecutive years, giving the possibility of performing interesting exercises.

2.1. Aggregate time series data
In this section, I present some of the time series properties of consumption expenditure and of disposable income. While the models considered in the following sections refer to household behaviour, typically the consumption aggregates considered in the National Account statistics include outlays of a sector that, together with households, includes other entities, such as charities, whose behaviour is unlikely to be determined by utility maximisation. While this issue is certainly important, especially for structural tests of theoretical models of household behaviour, in the analysis that follows I ignore it and, instead of isolating the part of total expenditure to be attributed to households, I present the time series properties of National Account consumption. Slesnick (1994) has recently stressed the importance of these issues. In Figure 1, I plot household (log) disposable income along with consumption divided into durables and non-durables and services for the UK and the USA. The series have quarterly frequency and run from 1959:1 to 1996:3 for the USA and from 1965:1 to 1996:2 for the UK. The data are at constant prices and are seasonally adjusted. From the figure, it is evident that non-durable consumption is smoother than disposable income. Durable consumption, on the other hand, which over the sample accounts, on average, for 13% of total consumption in the USA and around 14% in the UK, is by far the most volatile of the three time series. This is even more evident in Figure 2, where I plot the annual rates of change of the three variables. In Table 1, I report the mean and standard deviation of the three variables. These figures confirm and quantify the differences in the variability of the three variables considered. In Tables 2 and 3, I consider two alternative ways of summarising the time series properties of the three series I analyse for both countries.
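The statistics of the kind reported in Table 1 are straightforward to compute from a quarterly series: the annual rate of growth is the fourth difference of the log series. A minimal sketch in Python, using a synthetic series rather than the chapter's National Account data (the drift and noise parameters below are purely hypothetical):

```python
# Sketch: mean and standard deviation of annual growth rates from a quarterly
# log-level series. The series is simulated; only the procedure matches the
# text (annual rate of change = log difference over four quarters).
import math
import random

random.seed(0)

# Synthetic quarterly log-level series: trend growth plus noise (hypothetical).
log_c = []
level = 0.0
for t in range(152):  # 38 years of quarterly observations
    level += 0.006 + random.gauss(0.0, 0.005)  # roughly 2.4% mean annual growth
    log_c.append(level)

# Annual rates of growth: fourth differences of the log series.
annual_growth = [log_c[t] - log_c[t - 4] for t in range(4, len(log_c))]

mean = sum(annual_growth) / len(annual_growth)
var = sum((g - mean) ** 2 for g in annual_growth) / (len(annual_growth) - 1)
st_dev = math.sqrt(var)

print(round(mean, 3), round(st_dev, 3))
```

Applied to the actual quarterly consumption and income series, the same fourth-difference-of-logs computation would produce the corresponding entries of Table 1.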
In Table 2, I report the estimates of the coefficients of an MA(12) model for the same series. The advantage of such an un-parsimonious model is that it avoids the sometimes difficult choice among competing ARMA representations. Furthermore, its impulse response function
Fig. 2. Annual rates of change for the variables of Figure 1.

Table 1
Mean and standard deviations (annual rates of growth)

                            US                    UK
                        Mean    St. dev.      Mean    St. dev.
Disposable income       0.032   0.025         0.026   0.026
Nondurable consumption  0.023   0.018         0.017   0.021
Durable expenditure     0.048   0.069         0.043   0.112
can be easily read from the estimated coefficients. I also purposely decided to be agnostic about the presence of random walks in the time series of consumption or income, even though this has implications for the so-called 'excess smoothness' puzzle briefly discussed below. In Table 3, instead, I report the Maximum Likelihood estimates of a parsimonious ARMA model for the first differences of the log of the three variables. While in some cases there were alternative specifications that fitted the data as well as those reported in the table, the latter all pass several diagnostic tests. The Q-statistics reported in the table indicate that the representations chosen capture adequately the dynamic behaviour of the series over the period considered. The time series properties of the rates of growth of the three variables are remarkably different. Notice, in particular, the fact that both in the UK and in the USA, the sum of the MA coefficients for non-durable consumption is positive, while that for durables is negative. The time series properties of non-durable consumption differ remarkably: in Table 2 the sum of the first 12 MA coefficients is much larger in the UK than in the USA. Furthermore, while the US data are well represented by an MA(3) (with the first and third lag large and very strongly significant), the UK data require an AR(2) model.¹
¹ The presence of an MA(3) effect in the non-durable series for the USA is evident even in the MA(12) representation, but it is not very robust. If one truncates the sample to 1990 or dummies out the few quarters corresponding to the 1990-91 recession, θ3 is estimated non-significantly different from zero
Table 2
MA(12) representation a

       US                                              UK
       Non-durable     Durable         Disposable      Non-durable     Durable         Disposable
       consumption     consumption     income          consumption     consumption     income
θ1     0.41 (0.096)    -0.30 (0.091)   -0.092 (0.088)  0.005 (0.094)   -0.10 (0.094)   -0.29 (0.094)
θ2     0.18 (0.103)    0.15 (0.094)    -0.035 (0.089)  0.20 (0.093)    0.12 (0.095)    -0.14 (0.096)
θ3     0.43 (0.104)    0.092 (0.094)   0.063 (0.089)   0.004 (0.093)   -0.06 (0.092)   0.24 (0.097)
θ4     0.12 (0.110)    -0.092 (0.088)  0.084 (0.086)   0.28 (0.092)    -0.18 (0.088)   -0.45 (0.099)
θ5     -0.057 (0.108)  -0.15 (0.087)   -0.16 (0.085)   0.19 (0.093)    -0.19 (0.089)   0.15 (0.106)
θ6     0.100 (0.108)   0.11 (0.088)    0.15 (0.077)    0.19 (0.094)    0.22 (0.088)    0.05 (0.107)
θ7     0.11 (0.107)    -0.13 (0.087)   -0.45 (0.077)   0.09 (0.094)    0.21 (0.088)    -0.07 (0.106)
θ8     -0.20 (0.107)   -0.17 (0.088)   -0.021 (0.085)  0.22 (0.092)    0.14 (0.087)    -0.18 (0.104)
θ9     0.05 (0.109)    0.38 (0.088)    -0.23 (0.085)   -0.11 (0.090)   -0.14 (0.086)   -0.08 (0.098)
θ10    -0.03 (0.100)   0.20 (0.095)    -0.03 (0.088)   0.23 (0.091)    -0.20 (0.087)   0.02 (0.095)
θ11    0.05 (0.099)    -0.06 (0.096)   0.005 (0.087)   0.18 (0.091)    0.05 (0.088)    -0.20 (0.094)
θ12    0.08 (0.092)    -0.27 (0.091)   -0.23 (0.086)   0.03 (0.093)    -0.05 (0.086)   -0.02 (0.089)
Σθi    1.23            -0.25           -0.95           1.51            -0.18           -0.97

a Standard errors are given in parentheses.
The sum of the MA coefficients for disposable income in both countries is quite small in absolute value, but is positive in the USA and negative for the UK. As far as a 'parsimonious' specification is concerned, in the USA I chose an MA(1) for the first differences, even though its coefficient is not very large and is statistically insignificant. This model was almost indistinguishable from an AR(1) model. In the UK, the best model for disposable income is an ARMA(1,1). The richer dynamics of the UK series are also evident in the pattern of the MA coefficients in Table 2, both in the MA(12) and in the MA(3) model. The same result is obtained if one excludes services from this series.
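The kind of model selection described here can be illustrated with a small simulation. The sketch below, on synthetic data with hypothetical coefficients (not the chapter's estimates), fits an AR(2) — the representation chosen for the UK non-durable series — by solving the Yule-Walker equations from the first two sample autocorrelations:

```python
# Illustrative sketch (synthetic data): estimating an AR(2) model via the
# Yule-Walker equations, phi1 = r1*(1 - r2)/(1 - r1**2) and
# phi2 = (r2 - r1**2)/(1 - r1**2), where r1, r2 are sample autocorrelations.
import random

random.seed(1)

# Simulate an AR(2) process with known (hypothetical) coefficients.
phi1_true, phi2_true = 0.5, -0.3
x = [0.0, 0.0]
for _ in range(20000):
    x.append(phi1_true * x[-1] + phi2_true * x[-2] + random.gauss(0.0, 1.0))
x = x[1000:]  # drop burn-in so the simulated series is roughly stationary

def autocorr(series, lag):
    """Sample autocorrelation at the given lag."""
    n = len(series)
    m = sum(series) / n
    var = sum((v - m) ** 2 for v in series)
    cov = sum((series[t] - m) * (series[t - lag] - m) for t in range(lag, n))
    return cov / var

r1, r2 = autocorr(x, 1), autocorr(x, 2)
phi1_hat = r1 * (1 - r2) / (1 - r1 ** 2)
phi2_hat = (r2 - r1 ** 2) / (1 - r1 ** 2)
print(round(phi1_hat, 2), round(phi2_hat, 2))
```

In practice one would compare such candidate representations against alternatives using diagnostic statistics such as the Q-statistics reported in Table 3.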
Ch. 11: Consumption  749

Table 3
ARMA representation a

AR coefficients (ψ1, ψ2), MA coefficients (θ1, θ2, θ3) and Q-statistics (with p-values) for disposable income, non-durable consumption and durable consumption in the USA and the UK.

a Sample 1965:3-1996:3 (125 observations). Standard errors are given in parentheses.
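The Q-statistics reported with Table 3 test whether residuals are white noise. The chapter does not spell out the exact form used, so the Ljung-Box version below is an assumption on my part, and the data are simulated:

```python
import random

def ljung_box_q(x, max_lag):
    """Ljung-Box portmanteau statistic: Q = n(n+2) * sum_k r_k^2 / (n-k)."""
    n = len(x)
    m = sum(x) / n
    denom = sum((xi - m) ** 2 for xi in x)
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q

rng = random.Random(1)
white = [rng.gauss(0, 1) for _ in range(125)]  # same length as the 1965:3-1996:3 sample
q = ljung_box_q(white, 7)  # under whiteness, roughly chi-squared with 7 df
```

A large Q (small p-value) indicates remaining serial correlation, i.e. a misspecified ARMA model.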
The properties of durable consumption are particularly interesting. The fact that their time series properties are inconsistent with a simple model which adds durability to the standard random walk property, derived from some version of the permanent income hypothesis, was noticed by Mankiw (1982). Such a model would imply an MA(1) for the changes in expenditure, with a coefficient that differs from minus one by an amount equal to the depreciation rate. As can be seen from Table 2, the best representation for the US series is indeed an MA(1) with a negative coefficient; but that coefficient is far from minus one 2. Caballero (1990b) has interpreted this, and the fact that, as reported in Table 2 for both countries, the sum of the 12 MA coefficients is negative and much larger in absolute value, as an indication of the presence of inertial behaviour that 'slows down' the process of adjustment of durables.
Having characterised the main time series properties of consumption and income, the next step would be the estimation of a multivariate time series model that would stress the correlations among the variables considered at various leads and lags. Indeed, some of the studies I cite below, such as Flavin (1981), do exactly this with the purpose of testing some of the implications of the life cycle-permanent income hypothesis. For the sake of brevity, I omit the characterisation of the multivariate time series process of consumption and other macro variables. One of the reasons for this omission is the belief, discussed below, that aggregation problems make it very difficult to give

2 For durable consumption in the UK, the best model is an ARMA(2,1), by far the most complex model I fitted to these data.
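Mankiw's implication can be checked numerically: if the stock of durables follows a random walk and expenditure covers depreciation plus the change in the stock, the first difference of expenditure is an MA(1) with coefficient -(1-δ). The depreciation rate and simulation set-up below are made-up illustrations, not the chapter's data.

```python
import random

delta = 0.05  # hypothetical quarterly depreciation rate (my choice)
rng = random.Random(7)
n = 60000
eps = [rng.gauss(0, 1) for _ in range(n)]

# Stock follows a random walk: S_t = S_{t-1} + eps_t.
# Expenditure replaces depreciation plus the change in the stock:
#   E_t = S_t - (1 - delta) * S_{t-1},
# so Delta E_t = eps_t - (1 - delta) * eps_{t-1}: an MA(1) with theta = -(1 - delta).
S = [0.0]
for e in eps:
    S.append(S[-1] + e)
E = [S[t] - (1 - delta) * S[t - 1] for t in range(1, n + 1)]
dE = [E[t] - E[t - 1] for t in range(1, n)]

m = sum(dE) / len(dE)
var = sum((x - m) ** 2 for x in dE) / len(dE)
cov1 = sum((dE[t] - m) * (dE[t - 1] - m) for t in range(1, len(dE))) / (len(dE) - 1)
r1 = cov1 / var

theta = -(1 - delta)
r1_theory = theta / (1 + theta ** 2)  # implied lag-1 autocorrelation, near -0.5
```

The sample autocorrelation of the simulated expenditure changes matches the MA(1) prediction; deviations of the data from this pattern are what motivate Mankiw's and Caballero's discussions.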
O.P Attanasio
750
structural interpretation to this type of result. This does not mean, however, that aggregate time series studies are not useful. The careful specification of a flexible time series model for consumption and other variables can be quite informative, especially if the dynamic specification allows for the type of dynamic effects implied by microeconomic behaviour. Several of the studies by David Hendry and his collaborators are in this spirit; one of the most widely cited examples of this literature is the paper by Davidson et al. (1978). The approach taken in these papers, which received further motivation from the development of cointegration techniques, is to estimate a stable error correction model which relates consumption to other variables. The statistical model then allows one to identify both short-run and long-run relationships between consumption and its determinants. While the theory can be informative on the choice of the relevant variables and even on the construction of the data series, it does not provide explicit and tight restrictions on the parameters of the model. A good example of a creative and informative use of this type of technique is Blinder and Deaton (1985). While it is difficult to relate these models to structural models, so that they cannot be directly used for evaluating economic policy, they constitute useful instruments for summarising the main features of the data and, if used carefully, for forecasting. Often the lack of microeconomic data makes the use of aggregate time series data a necessity. The only caveat is that these studies cannot be used to identify structural parameters.
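As a schematic illustration of the error correction approach just described (simulated data and made-up coefficients; this is not the Davidson et al. (1978) specification), one can regress consumption growth on income growth and the lagged consumption-income gap:

```python
import random

# Simulate log income y as a random walk and log consumption c adjusting
# toward it, then estimate the error correction equation
#   dc_t = a + b * dy_t + g * (c_{t-1} - y_{t-1}) + u_t
# by OLS (normal equations solved by Gaussian elimination).
rng = random.Random(3)
n = 2000
y, c = [0.0], [0.0]
for _ in range(n):
    y.append(y[-1] + rng.gauss(0, 1))
    # consumption passes through half of income growth and closes
    # 30% of the lagged gap (made-up "true" coefficients)
    dc = 0.5 * (y[-1] - y[-2]) - 0.3 * (c[-1] - y[-2]) + rng.gauss(0, 0.1)
    c.append(c[-1] + dc)

rows, targ = [], []
for t in range(1, n + 1):
    rows.append([1.0, y[t] - y[t - 1], c[t - 1] - y[t - 1]])
    targ.append(c[t] - c[t - 1])

# Solve X'X beta = X'y
k = 3
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * v for r, v in zip(rows, targ)) for i in range(k)]
for i in range(k):                       # forward elimination
    for j in range(i + 1, k):
        f = xtx[j][i] / xtx[i][i]
        for col in range(i, k):
            xtx[j][col] -= f * xtx[i][col]
        xty[j] -= f * xty[i]
beta = [0.0] * k
for i in range(k - 1, -1, -1):           # back substitution
    s = xty[i] - sum(xtx[i][j] * beta[j] for j in range(i + 1, k))
    beta[i] = s / xtx[i][i]
# beta[1] is the short-run pass-through, beta[2] the error correction speed
```

The short-run effect (coefficient on dy) and the long-run adjustment speed (coefficient on the lagged gap) are exactly the two objects such models are designed to separate.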
2.2. Household consumption expenditure
In this section, I use two large microeconomic data sets to document the main stylised facts about consumption. The two data sets used are the US Consumer Expenditure Survey (CEX) and the UK Family Expenditure Survey (FES). Both are run on a continuous basis to gather information for the construction of the weights for the CPI (RPI in the UK). They have, however, been extensively used by researchers and have become an essential tool for studying household consumption and saving behaviour. The focus of the analysis is the household. No attempt is made to attribute consumption to individual household members, even though some (limited) information on this does exist 3. Most of the descriptive analysis presented below attempts to describe the main features of the life cycle profiles of consumption expenditure and some other variables.
3 Both data sets contain very detailed information on expenditure on individual commodities. Some of this information can be used to attribute some items to particular household members. For many items, however, such attribution is difficult both in practice and conceptually. Browning (1987) has imputed expenditure on alcohol and tobacco to the adults to check whether predicted changes in household income and composition (such as the arrival of children, with the consequent - at least temporary - withdrawal of the wife from the labour force) cause changes in consumption. Gokhale, Kotlikoff and Sabelhaus (1996), in their study of saving behaviour, have attempted to impute all of consumption to the individual household members.
This approach reflects the fact that the theoretical discussion in the next sections will be focused around the life cycle model.
2.2.1. Nature of the data sets and their comparability with the National Accounts data
The FES is now available for 25 consecutive years. Each year around 7000 households are interviewed and supply information on their consumption patterns as well as their demographic characteristics and several other economic variables, such as employment status, income and education. Each household stays in the sample for two weeks, during which it fills in a diary in which all expenditure items are reported. At the end of the two-week period an interviewer collects the diaries and asks for additional information on durables acquired during the previous three months, on all major expenditure items reported in the diary, and on periodic expenditures such as utilities.
The CEX has been available on a continuous and roughly homogeneous basis since 1980. Each year about 7000 different households are interviewed in 4 subsequent interviews, at quarterly frequency 4. Each month new households enter the survey to replace those that have completed their cycle of interviews. During each interview the household is asked to report expenditure on about 500 consumption categories during each of the three months preceding the interview 5. The panel dimension of the CEX is unfortunately very short: because each household is only interviewed four times, seasonal variability is likely to dominate life cycle and business cycle movements. In what follows, I do not exploit the panel dimension of the survey.
There have been several discussions about the quality of survey data, the importance of measurement error, and the ability of these surveys to reproduce movements in aggregate consumption. Several studies, both in the USA and the UK, have addressed the issue 6. It should be stressed that the aggregated individual data and the National Accounts aggregate should be expected to differ for several reasons. First of all, for many consumption categories, the definitions used in the surveys and in the National Accounts are quite different.
Housing, for instance, includes imputed rents in the National Accounts data but not in the surveys. In the CEX, health expenditure measures only out-of-pocket expenditures, while the National Accounts definition includes all health expenditures regardless of the payee. Furthermore, the populations of reference are quite different. Surveys, for instance, do not include institutionalised individuals, while the National Accounts do. Finally, National Accounts data are not exempt from measurement error which, for some items, can be quite substantial. Should major differences emerge, it is not obvious that the National Accounts data should be considered as being closer to the 'truth'.
The issues that arise are different for the two data sets. Overall, the degree of correspondence between the aggregated individual data and the National Accounts data seems to be higher in the UK. For most consumption components, aggregating the FES data one obtains about 90% of the corresponding National Accounts figure, while the same ratio is about 65% for the CEX in the 1980s. This is probably due to the use of diaries rather than recall interviews; the latter, perhaps not surprisingly, tend to underestimate consumption. In both surveys, however, because of the consistent methodology used over time, there is no major trend in the ratio of the aggregated individual data to the corresponding National Accounts aggregates 7. Furthermore, the dynamics of consumption and income growth and of saving in both the aggregated CEX and FES data do not track the corresponding macroeconomic aggregates badly. The data are therefore not only useful to characterise individual behaviour and its shifts over time, but also to make inferences, based on individual behaviour, about possible explanations of the observed macroeconomic trends.

4 In total there are data for over 20000 interviews per year. Each household is in fact interviewed five times; however, the Bureau of Labor Statistics does not release information on the first (contact) interview. The Bureau of Labor Statistics also runs a separate survey, based on diaries, which collects information on food consumption and 'frequently purchased items'.
5 Unfortunately, the monthly decomposition of the quarterly expenditure is not very reliable. For several commodities and for many households, the quarterly figure is simply divided by three. Given the rotating nature of the sample, the 'quarters' of expenditure do not coincide perfectly. For instance, somebody interviewed in December will report consumption in September, October and November, while somebody interviewed in November will report consumption in August, September and October.
6 See, for instance, Slesnick (1992) and Paulin et al. (1990) for comparisons between aggregate Personal Consumption Expenditure and the CEX in the USA, and the papers in Banks and Johnson (1997) for comparisons between the FES and the UK National Accounts.
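The rotating reference periods described in footnote 5 can be made concrete with a small helper (the function name and interface are mine, not the BLS's):

```python
# Hypothetical helper: a CEX interview covers the three calendar months
# preceding the interview month.
def cex_reference_months(interview_year, interview_month):
    months = []
    y, m = interview_year, interview_month
    for _ in range(3):
        m -= 1
        if m == 0:
            y, m = y - 1, 12
        months.append((y, m))
    return list(reversed(months))  # chronological order
```

A December interview covers September-November while a November interview covers August-October, so the 'quarters' of expenditure overlap but do not coincide across households.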
2.2.2. Life cycle profiles
In the second part of the chapter, in which I discuss the main theoretical model of consumption behaviour, a substantial amount of attention is devoted to the life cycle model in its several incarnations. In this section, I present life cycle profiles for consumption, its components and various other variables in the USA and the UK. In this sense, the life cycle model is the conceptual framework that I use to organise the presentation of the microeconomic data.
As the data sets I use are not panels, to estimate age profiles I am forced to use grouping techniques. These techniques were first used within life cycle models by Browning, Deaton and Irish (1985) 8. The idea is quite simple. Rather than following the same individual over time, one can follow the average behaviour of a group of
7 There are substantial differences in this ratio between the early CEX surveys (1960-61 and 1972-73) and those of the 1980s, probably due to the differences in the methodology employed. In the FES, the one commodity for which a (downward) trend in the ratio is apparent is tobacco.
8 Ghez and Becker (1975) use observations on individuals of different ages to study life cycle behaviour. However, as they use a single cross section, they do not control for cohort effects as Browning et al. (1985) do. Deaton (1985) and, more recently, Moffitt (1993) have studied some of the econometric problems connected with the use of average cohort techniques. Heckman and Robb (1987), MaCurdy and Mroz (1989) and Attanasio (1994) discuss identification issues.
individuals as they age. Groups can be defined in different ways, as long as the membership of the group is constant over time 9. Within the life cycle framework, the natural group to consider is a 'cohort', that is, individuals (household heads) born in the same period. Therefore, to compute the life cycle profile of a given variable, say log consumption, one splits the households interviewed in each individual cross section into groups defined on the basis of the household head's year of birth. This involves, for instance, considering all the individuals aged between 20 and 24 in 1980, those aged between 21 and 25 in 1981 and so on to form the first cohort; those aged between 25 and 29 in 1980, between 26 and 30 in 1981 and so on to form the second cohort; etc. Having formed these groups in each year in which the survey is available, one can average log consumption and therefore form pseudo panels: the resulting data will have dimension Q x T, where Q is the number of groups (cohorts) formed and T is the number of time periods 10. Even if the individuals used to compute the means in each year are not the same, they belong to the same group (however defined) and one can therefore study the dynamic behaviour of the average variables. Notice that non-linear transformations of the variables do not constitute a problem, as they can be computed before averaging.
The resulting age profiles will not cover the entire life cycle of a given cohort, unless the available sample period is longer than any of the micro data sets commonly used. Each cohort will be observed over a (different) portion of its life cycle. These techniques can be and have been used both for descriptive analysis and for estimating structural models. Their big advantage is that they allow one to study the dynamic behaviour of the variables of interest even in the absence of panel data. Indeed, in many respects, their use might be superior to that of panel data 11.
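The grouping procedure just described can be sketched as follows (toy data and function names of my own; actual cell construction from the CEX or FES involves many more variables):

```python
import math
from collections import defaultdict

# Toy pseudo-panel: average log consumption by (cohort, year) cell.
# Each record is (head's birth year, survey year, consumption); cohorts are
# five-year bands of the head's year of birth, as in Table 4.
records = [
    (1942, 1980, 21000.0), (1944, 1980, 19500.0),   # cohort 1940-44, 1980
    (1941, 1981, 22000.0), (1943, 1981, 20500.0),   # cohort 1940-44, 1981
    (1957, 1980, 15000.0), (1956, 1980, 16000.0),   # cohort 1955-59, 1980
]

def cohort_of(birth_year, width=5, base=1895):
    start = base + ((birth_year - base) // width) * width
    return (start, start + width - 1)

cells = defaultdict(list)
for birth, year, cons in records:
    # take logs BEFORE averaging: non-linear transformations are applied
    # at the household level, then averaged within the cell
    cells[(cohort_of(birth), year)].append(math.log(cons))

pseudo_panel = {key: sum(v) / len(v) for key, v in cells.items()}
# keys are ((cohort_start, cohort_end), year); values are cell means of log consumption
```

Following a key such as ((1940, 1944), 1980), ((1940, 1944), 1981), ... across years traces the average behaviour of the cohort as it ages, even though different households are sampled each year.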
Furthermore, as non-linear transformations of the data can be handled directly when forming the group means, they allow one to solve various aggregation problems that plague the study of structural models with aggregate time series data.
In what follows, I define groups on the basis of the year of birth and educational attainment of the household head. The length of the interval that defines a birth

9 Group membership should be fixed over time, so that the sample is drawn from the same population and the sample mean is a consistent estimator of the mean of that population. Attanasio and Hoynes (1995) discuss the implications of differential mortality for the use of average cohort techniques. Other possible problems arise, at the beginning of the life cycle, from the possible endogeneity of household formation and, more generally, from migration.
10 Here I am implicitly assuming that the pseudo panel is balanced. This is not always the case, as each group might be observed for a different number of time periods. Suppose, for instance, that one has data from 1968 to 1994. One might want to follow the cohort born between 1965 and 1970 only from the late 1980s or the early 1990s. On the other hand, at some point during the 1980s one might want to drop the cohort born between 1906 and 1910.
11 Time series of cross sections are probably less affected by non-random attrition than panel data. Furthermore, in many situations, averaging across the individuals belonging to a group can eliminate measurement error and purely idiosyncratic factors which are not necessarily of interest. As with most grouping techniques, average cohort analysis has an Instrumental Variable interpretation.
Table 4
Cohort definition and cell size

                               USA                        UK
Cohort   Year of birth   Average     Years in      Average     Years in
                         cell size   sample        cell size   sample
1        1895-1899          -            -            338      1968-1977
2        1900-1904          -            -            459      1968-1982
3        1905-1909          -            -            526      1968-1987
4        1910-1914         232       1980-1992        560      1968-1992
5        1915-1919         390       1980-1992        519      1968-1992
6        1920-1924         333       1980-1992        653      1968-1992
7        1925-1929         325       1980-1992        572      1968-1992
8        1930-1934         317       1980-1992        546      1968-1992
9        1935-1939         345       1980-1992        562      1968-1992
10       1940-1944         420       1980-1992        594      1968-1992
11       1945-1949         566       1980-1992        652      1968-1992
12       1950-1954         657       1980-1992        547      1971-1992
13       1955-1959         734       1980-1992        508      1976-1992
14       1960-1964         463       1981-1992         -           -
15       1965-1969         334       1986-1992         -           -
cohort is chosen taking into account the trade-off between cell size and within-cell homogeneity. Table 4 contains the definition of the cohorts and the average sample size for both surveys.
We start, in Figures 3 and 4, with the life cycle profiles of (log) consumption and disposable income at constant prices for both countries. The units of measurement for income and consumption are chosen so that the two graphs are roughly on the same scale, making it easier to stress the differences in the shape of the age profiles. In the figures, I plot average cohort (log) consumption at each point in time against the median age of the household head. Each connected segment represents the behaviour of a cohort, observed as it ages, at different points in time. As each cohort is defined by a five-year interval, and both surveys cover a period longer than five years, at most ages we observe more than one cohort, obviously in different years. It might be tempting to attribute the differences between adjacent cohorts observed at the same age to 'cohort effects'. It should be remembered, however, that these observations refer to different time periods and might therefore be reflecting business cycle effects. The plotted profiles reflect age, time and cohort effects 12 that, without
12 As well as measurement error and small-sample variability.
Fig. 3. Life cycle profiles of disposable income. Panels: 'USA - disposable income' and 'UK - disposable income' (horizontal axis: age of household head, 20-70).

Fig. 4. Life cycle profiles of total consumption. Panels: 'USA - total consumption' and 'UK - total consumption' (horizontal axis: age of household head, 20-70).
an arbitrary normalisation or additional information from a structural model, cannot be disentangled.
Several considerations are in order. First of all, both the consumption and the income age profiles present a characteristic 'hump'. They both peak in the mid-40s and decline afterwards. The picture seems, at first glance, to contradict the implications of the life cycle model as stressed in the typical textbook picture, which draws a 'hump-shaped' income profile and a flat consumption profile. For total disposable income, the decline around retirement age is faster in the UK than in the USA, but approximately of the same magnitude. This probably reflects the more synchronised retirement of British individuals. The consumption profiles, however, present some strong differences. The most notable is the fact that UK consumption declines much more at retirement than US consumption. Total consumption at age 70 is roughly 35% of the peak in the UK and above 50% in the USA. I discuss the decline of consumption at retirement below.
In the UK consumption profile, the consumption boom of the late 1980s, followed by the bust of the early 1990s, is quite apparent. Notice, in particular, the fact that the aggregate consumption boom is accounted for mainly by the youngest cohorts. I have discussed elsewhere how to interpret that episode. It is worth stressing, however, that the analysis of the cross-sectional variability of consumption can be useful to shed some light on the nature of episodes that the analysis of the time series data cannot
Table 5
Variability of consumption and income

                                                 Standard error (%)
Variable                                  USA (CEX)   UK (FES),   UK (FES), 10 cohorts,
                                                      age < 81    year < 86
Total consumption                            2.94       2.46        2.65
Total consumption per adult equivalent       2.39       2.62        2.64
Non-durable consumption                      2.60       2.30        1.88
Non-durable consumption per adult
  equivalent                                 1.95       2.49        2.05
Durable consumption                         15.79       9.54        8.54
Non-durable consumption (from levels)        2.58       2.31        1.86
Income                                       3.68       3.05        3.60
explain. Information about which groups in the population were mainly responsible for a particular episode can be informative about alternative hypotheses 13.
It is not obvious how to assess the time series volatility of (log) consumption and income. The main reason for this is that a large part of the variation of consumption over the life cycle is very predictable and can be explained by age and cohort effects. Furthermore, given the limited size of our samples, the year-to-year variation in the average cohort data reflects both genuine time series variation and the measurement error induced by sampling variation. As Deaton (1985) has stressed, some information about the size of the measurement error can be gained using the within-cell variability of the variables used. Using this information, one might correct for that part of variability accounted for by sampling variation and attempt to isolate the genuine time variation. In an attempt to isolate this component, I run a regression of log consumption and income on a fifth-order polynomial in age and on cohort dummies, and consider the deviations of the observed profiles from the fitted profile. The standard deviation of the changes in these deviations, corrected for the part which can be attributed to sampling error, is my measure of time variability 14. These estimates of volatility for (log) income and consumption are reported in Table 5, along with those for the other variables considered. The first column refers to the USA, while the second and third columns are computed using the UK data. The former includes the whole sample,
13 See Attanasio and Weber (1994). Groups do not need to be formed on the basis of age. In Attanasio and Banks (1997) that analysis is extended by considering not only the variability across cohorts but also across regions.
14 The sample mean is distributed around the population mean as a random variable with variance σ²/N, where N is the cell size and σ² is the within-cell variance. The latter can be estimated from the available micro data. These estimates can be used to correct our estimates of volatility.
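Footnote 14's correction can be sketched as follows. The factor of 2 reflects an assumption of mine for this illustration - that independent samples are drawn in adjacent years, so the sampling variances of the two cell means add when differencing - and all numbers are made up:

```python
# Correct the observed variance of the CHANGE in a cohort's sample mean for
# sampling error: var_observed = var_true + 2 * sigma^2 / N, where sigma^2 is
# the within-cell variance and N the cell size (independent adjacent samples).
def corrected_volatility(var_observed_change, within_cell_var, cell_size):
    sampling = 2.0 * within_cell_var / cell_size
    var_true = max(var_observed_change - sampling, 0.0)  # truncate at zero
    return var_true ** 0.5

# Made-up numbers: observed sd of 3% on cells of 400 households,
# within-cell variance 0.09 (a cross-household sd of 30%).
sd = corrected_volatility(0.03 ** 2, 0.09, 400)
```

When the observed variation is no larger than what sampling error alone would produce, the corrected estimate is zero: none of the measured year-to-year movement need be genuine.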
Fig. 5. Panels: 'USA - non-durable consumption per household and per adult equivalent' and 'UK - non-durable consumption per household and per adult equivalent' (horizontal axis: age of household head, 20-70).
while the latter truncates it to 1986 to remove the effect of the consumption 'boom and bust' of the last part of the sample.
As in the case of the aggregate time series, total consumption appears less volatile than disposable income, both in the UK and in the USA. In particular, the standard deviation of changes in total disposable income at the cohort level is above 3% in both countries; that of total consumption is between 0.6% and 0.95% less.
It may be argued that the differences in the consumption profiles for the two countries are due to the differences in the definitions used in the two surveys. For this reason, I next focus on a narrower definition of consumption which excludes a number of items that might be recorded in different fashions in the two countries. In particular, in Figure 5 I plot (log) expenditure on non-durables and services against age. This definition excludes from total consumption durables, housing, health and education expenditure. The other advantage of considering consumption of non-durables and services is that I avoid the issue of durability and the more complicated dynamics linked to durables. The main features of the two profiles, however, including the larger decline observed in the UK, are largely unaffected. In Table 5, the volatility of non-durable consumption is considerably less than that of total consumption, especially in the UK when data up to 1986 are used.
An important possible explanation for the variation of consumption over the life cycle (and between the two countries considered) is the variation in needs linked to changes in family size and composition. To control for this possibility, I have deflated total household expenditure by the number of adult equivalents in the household. For such a purpose, I use the OECD adult equivalence scale 15. The most evident result is that the life cycle profile of consumption looks much flatter now.
In this sense, we can say that a large proportion of the variability of consumption over the life cycle is accounted for by changes in needs. This result is perhaps not surprising
15 No adult equivalence scale is perfect. Different alternatives, however, do not make much difference for the point I want to make here. The OECD scale gives weight 1 to the first adult, 0.67 to the following adults and weight 0.43 to each child below 19.
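Using the weights given in footnote 15, the deflation by adult equivalents amounts to the following (function names are mine):

```python
# OECD scale as described in footnote 15: weight 1 for the first adult,
# 0.67 for each further adult, 0.43 for each child.
def oecd_equivalents(n_adults, n_children):
    if n_adults < 1:
        raise ValueError("a household needs at least one adult")
    return 1.0 + 0.67 * (n_adults - 1) + 0.43 * n_children

def per_equivalent(expenditure, n_adults, n_children):
    return expenditure / oecd_equivalents(n_adults, n_children)
```

A couple with two children thus counts as 2.53 equivalent adults, so the mid-life 'hump' in household expenditure largely disappears once expenditure is divided by household needs.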
This is essentially what Abel and Eberly (1994) do 28. Absent the advantage of lumping adjustments brought about by the presence of fixed costs, standard q-theory is recovered whenever the firm invests. Provided adjustment takes place, the firm equalizes the marginal benefit of adjustment with the marginal cost of investing, which is now an increasing function of the rate of adjustment η, for η ≠ 0. By setting η to zero, we can obtain the boundaries of the inaction range in q-space: investment will not occur while q remains inside these boundaries.
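A stylised version of such an investment rule with an inaction range can be sketched as follows. This is only in the spirit of, not the actual functional form or calibration of, Abel and Eberly (1994); all parameter values are invented:

```python
# Marginal cost of adjusting at rate eta is p_plus + b*eta for eta > 0 and
# p_minus + b*eta for eta < 0, with p_minus < p_plus (e.g. a wedge between
# buying and selling prices).  Equating marginal benefit q to marginal cost
# gives an inaction range [p_minus, p_plus] and a smooth rule outside it.
def adjustment_rate(q, p_plus=1.05, p_minus=0.95, b=2.0):
    if q > p_plus:            # marginal benefit exceeds the cost of the first unit
        return (q - p_plus) / b
    if q < p_minus:           # disinvest
        return (q - p_minus) / b
    return 0.0                # inaction: p_minus <= q <= p_plus
```

Inside the range the firm does nothing; outside it, investment responds continuously to q, which is exactly why this set-up unifies q-theory with irreversibility and regulation problems but not with lumpy (S, s) adjustment.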
Abel and Eberly (1994) go further, and show that their insight is robust to the presence of flow fixed costs, that is, fixed costs which are multiplied by dt: if adjustment occurred instantaneously, the firm effectively would pay no fixed cost. Because of the convex adjustment component, the firm chooses not to adjust instantaneously and pays the fixed costs instead. In a sense, the endogenous adjustment decisions and the fact that the fixed cost goes to zero as adjustment speeds up ensure that the fixed cost remains relatively "small," and so do investment projects 29.
It is important to realize that their paper "unifies" q-theory with irreversible investment and regulation (i.e. infrequent but infinitesimal adjustments) problems, but it does not unify it with the standard (S, s) literature on lumpy adjustment, which is, unfortunately, the way many have interpreted their results.
Barnett and Sakellaris (1995) study a panel of US firms searching for evidence of a reduced sensitivity of investment to changes in q when the latter is close to one (the "inaction" range). They find the opposite; in their panel, a firm's investment seems to be more rather than less responsive to q when q is close to one. Abel and Eberly (1996a), however, show that allowing for unobserved heterogeneity in the inaction

27 Which, at the same time, makes transitions outside the inaction range less rare.
28 Needless to say, it is trivial to add asymmetries to the adjustment cost function. But that is beside the point of this section.
29 Alternatively, if one assumes perfect competition and constant returns to scale, the profit function becomes linear with respect to capital (if the other factors of production can be adjusted at will), so changes in investment do not feed back into q. In this extreme case, the modified (i.e. with an inaction range) q-theory works well even in the presence of traditional fixed costs.
832
R.J. Caballero
range relevant for different types of investments could explain the negative Barnett-Sakellaris finding.

3.1.3.3. Taking stock. One may be inclined to conclude from this section that, before going ahead with q-theory, one should check whether investment literally exhibits jumps or not. This is not the lesson I draw, however. For one, this is not right: it is not difficult to add a time-to-build mechanism such that a lumpy project is decomposed into a fairly smooth flow, without altering the argument of why marginal q fails in the presence of fixed costs. But more importantly, I suspect the main lesson is one of modesty. I doubt that researchers will often find the required data and/or patience to determine whether one scenario or the other holds. In this case, we might as well acknowledge that the relationship between marginal q and investment is not robust, and that average q is unlikely to be a sufficient statistic for investment. Of course it is important to include variables that capture knowledge of the future on the right-hand side of investment equations, but we should avoid reading "too much" from these regressions.

3.1.4. Another detour: Several misconceptions about irreversible investment

As I mentioned before, when describing the special case of irreversible investment, the regulation barrier, L, is to the left of one. That is, investment occurs only when the stock of capital is substantially below the frictionless stock of capital. Alternatively, investment occurs when the marginal profitability of capital is substantially above the cost of capital. This is the famous "reluctance to invest" result.
There are several misconceptions about the implications of this "reluctance" result. I will mention three of them.
It is often said that (a) reluctance implies that, in the presence of irreversibility, the firm accumulates less capital; (b) since reluctance rises with uncertainty (the regulation point moves further to the left), more uncertainty implies less capital; and (c) standard present value techniques are inappropriate because reluctance reflects the value of the "option to wait" for more information before irreversibly sinking resources, and this is not taken into account by the standard formulae.
In order to show the fallacious nature of the first statement, it is useful to go back to our canonical problem and simulate the path of the (log of the) stock of capital of a firm facing no irreversibility constraint. Panel (a) in Figure 3.4 does so for a random realization of the path of the underlying shock. Panel (b) in the figure shows the corresponding path of the marginal profitability of capital, which is equal to the constant - frictionless - cost of capital, r 30. Imagine now imposing an irreversibility constraint on the firm, but assume that the firm does not modify its "frictionless" investment rule whenever it can invest. This is

30 These figures are from Caballero (1993a).
Ch. 12: Aggregate Investment  833

Fig. 3.4. Reluctance and its counterpart. Six panels, (a)-(f), plotting simulated time paths of the (log) stock of capital and of the marginal profitability of capital.
portrayed in panel (c) of the figure. The solid and dashed lines represent the actual and frictionless stocks of capital, respectively. It is apparent that the firm would, on average, have too much capital, for it would have the same stock of capital in good times, but too much in bad times. The counterpart of this is in panel (d), which shows that on average the marginal profitability of capital is below the cost of capital. Reluctance to invest in good times is an optimal response attempting to offset the natural tendency to over-accumulate capital induced by the irreversibility constraint. Panel (e) illustrates this point. The solid and dashed lines represent the same variables as in panel (c), while the dotted line illustrates the target stock of capital when the firm behaves optimally. The counterpart of the negative value of ln(Kd/Kf) is a positive constant h in the marginal profitability of capital required for investment to take place (panel f). It is apparent that whether the stock of capital is on average higher or lower than without the irreversibility constraint is unclear; the firm has too little capital during good times but too much during very bad times. A precise answer depends on things about which we know little, and which may turn out to yield only second-order effects 31.
It is now easy to see the fallacious nature of the second statement. More uncertainty raises reluctance precisely because it raises the need to reduce the extent of excessive capital during the now deeper recessions. Without raising reluctance, an increase in uncertainty would raise the average stock of capital in the presence of irreversibility constraints. This occurs because there would now be greater capital accumulation during extremely good times which would not be offset by large disinvestment during extremely bad times 32.
The third misunderstanding is of a different nature. In my view, it is the result of insightful but, unfortunately, abused language.
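Returning to the first misconception: the thought experiment of panels (c) and (d) - a firm that keeps its frictionless rule but cannot disinvest - can be reproduced in a few lines (toy parameters and notation of my own):

```python
import random

# Frictionless log capital target follows a random walk.  A firm that keeps
# the frictionless rule but cannot disinvest holds k_t = max(k_{t-1}, k*_t):
# it matches the target in good times and holds too much capital in bad times.
rng = random.Random(11)
n = 50000
k_star = [0.0]
for _ in range(n):
    k_star.append(k_star[-1] + rng.gauss(0, 0.1))

k_naive = [k_star[0]]
for t in range(1, n + 1):
    k_naive.append(max(k_naive[-1], k_star[t]))

excess = [k_naive[t] - k_star[t] for t in range(n + 1)]
avg_excess = sum(excess) / len(excess)  # strictly positive on average
```

By construction the naive firm never holds less capital than the frictionless target, so its average excess capital is positive - the counterpart of the marginal profitability of capital lying, on average, below the cost of capital. Optimal reluctance offsets exactly this tendency.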
First, what is right: there is nothing mysterious about irreversibility constraints as a mathematical problem. Dynamic programming works, in the same way it does with other, more traditional, adjustment frictions. This means that present value formulae, using the correct calculation of the future marginal profitability of capital, also work. Of course, such calculations must be performed along the optimal investment path, constraints included! What is wrong: the claim that the standard analysis must be modified to consider the value of the "option to wait".
31 See Bertola (1992), Caballero (1993a), and Bertola and Caballero (1994) for early discussions of this issue and of the related uncertainty-investment misconception. More recently, Abel and Eberly (1996b) have formalized these claims and made them more precise.
32 This does not mean that one cannot construct scenarios where an increase in uncertainty reduces investment. For example, if there is an increase in perceived future uncertainty, the investment threshold may jump today - i.e., before the variance of shocks does - resulting in an unambiguous decline in investment. Also, one should not confuse changes in uncertainty with changes in the probability of a bad event. The latter links increases in uncertainty to a reduction in expected value, an entirely different and more straightforward effect on investment. One can find traces of this confusion in the (informal) credibility literature.
Ch. 12: Aggregate Investment
As we have seen, there is no need to do so. However, one may choose to follow an alternative path, in which one starts by evaluating the future marginal profitability of capital without considering the effect of future optimal investment decisions on marginal profitability. This "mistake" can then be "corrected" with a term that has an option representation. This alternative way of doing things is akin to the arbitrage approach in finance, and it was nicely portrayed in Pindyck (1988). The confusion arises, in my view, from mixing the language of the two approaches^33.

A related claim exists for a once-and-for-all project (as opposed to incremental investment). It is said that the simple positive net present value rule used in business schools to decide whether a project should be implemented does not hold because it does not consider the option to wait and decide tomorrow, when more information is available. Since I have never taught at a business school I cannot argue directly against that claim. However, if the issue is whether to invest today or tomorrow, the right criterion has never been "invest if the NPV is positive" - at least that is what we teach economics undergraduates. This is a case of mutually exclusive projects, thus the right criterion has always been to compare their net present values and take the project with the highest NPV, provided it is positive. If investment is irreversible, the project "invest tomorrow" has a lower bound at zero (because investment will not occur if the NPV looks negative tomorrow), which the project "invest today" does not. Thus, other things equal, irreversibility necessarily raises the appeal of investing tomorrow relative to investing today.

3.1.5. Adjustment hazard

At a qualitative level, the (L, l, u, U) models described above capture well the nonlinear nature of microeconomic adjustment.
Maintenance expenditures aside, investment is mostly sporadic and often lumpy: it scarcely reacts to small changes in the environment, but abruptly undoes accumulated imbalances when they become sufficiently large, with possibly significant asymmetries between investment and disinvestment. At an empirical level, however, these characterizations are too stark. For reasons, some of which we understand and most of which we do not, firms respond differently to similar imbalances over time and across firms. Caballero and Engel (1999) propose a probabilistic instead of a deterministic adjustment rule. Rather than having a clear demarcation between regions of adjustment and inaction, they model a situation where large imbalances are more likely to trigger adjustment than small ones^34.
33 See Bertola (1988) for one of the first discussions of this issue in the economics literature. There is also a related discussion in applied mathematics; see, for example, El Karoui and Karatzas (1991). Abel et al. (1996) have recently revisited and expanded the discussion on the relation between the two approaches.
34 Another advantage of this approach is that it nests linear models as the probability of adjustment becomes independent of Z.
R.J. Caballero
836
There are many formal motivations for such an assumption. A particularly simple one, pursued by Caballero and Engel (1999), is to assume that c_t in the adjustment cost function (3.4) is an i.i.d. random variable, both across firms and over time. Although technically more complex, the nature of the problem is not too different from that of the simpler (L, l, u, U) model. Let w denote the random fixed cost, and G(w) its time-invariant distribution. It is possible to characterize the problem of the firm in terms of two functions similar to those used before: V(Z, K*) and V(Z, K*, w), the value of a firm with imbalance Z, desired capital K*, and realization of the fixed adjustment cost w. In particular, V(Z, K*) is the value of the firm provided it does not adjust, while V(Z, K*, w) is the value of the firm when it is left free to choose whether or not to adjust. Thus,

V(Z_t, K_t*) = Π(Z_t, K_t*) Δt + (1 − rΔt) E_t[V(Z_{t+Δt}, K*_{t+Δt}, w_{t+Δt})],   (3.15)

V(Z_t, K_t*, w) = max { V(Z_t, K_t*), max_η { V(Z_t + η, K_t*) − C(η, K_t*, w_t) } }.   (3.16)
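A discretized version of Equations (3.15)-(3.16) can be solved by value iteration. The sketch below is only illustrative - the flow profit −x², the two-point cost distribution, the drift, and the discount factor are hypothetical choices, not the model's calibration - but it shows how a random fixed cost turns a deterministic threshold rule into a probabilistic one:

```python
beta = 0.95                # discount factor (hypothetical)
costs = [0.2, 1.0]         # two equally likely fixed-cost draws w (hypothetical)
n = 41
xs = [(i - 20) * 0.1 for i in range(n)]   # imbalance grid, x in [-2, 2]; x = 0 at i = 20

V = [0.0] * n
for _ in range(500):
    newV = [0.0] * n
    for i, x in enumerate(xs):
        j = min(n - 1, i + 1)   # depreciation drifts the imbalance one grid step
        # Eq. (3.16): once w is drawn, keep the imbalance or pay w and reset to x = 0.
        ev = sum(max(V[j], V[20] - w) for w in costs) / len(costs)
        # Eq. (3.15): flow profit (here a hypothetical -x^2) plus discounted continuation.
        newV[i] = -x * x + beta * ev
    V = newV

# Conditional on x, adjustment is now probabilistic: only the cost draws w with
# V[20] - w > V[i] trigger it, so the adjustment probability rises with the imbalance.
adj = [sum(V[20] - w > V[i] for w in costs) / len(costs) for i in range(n)]
print(f"adjustment probability at x=0: {adj[20]}, at x=2: {adj[-1]}")
```

Conditional on the cost draw the policy is still of the threshold type; averaging over draws produces an increasing adjustment probability of the kind discussed next.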
Not surprisingly, the nature of the solution is not too different from that of the (L, l, u, U) case. Indeed, conditional on w it is an (L, l, u, U) rule, although there are additional intertemporal considerations, since the firm weighs the likelihood of drawing higher or lower adjustment costs in the future. Without conditioning on w, it is a probabilistic rule in the space of imbalances. In order to simplify the exposition, I will suppress the proportional costs. Thus, conditional on adjustment, the target point is the same regardless of whether the firm is adding to or subtracting from its stock of capital (i.e. l = u = c). Moreover, let me define a new imbalance index centered around zero:

x ≡ ln(Z/c).
The probability of adjustment rises with the absolute value of x because there are more realizations of adjustment costs which justify adjustment. This is the sense in which the (S, s) nature of the simpler models is preserved. Let A(x) denote the function describing the probability of adjustment given x, and call it the adjustment hazard function [see Caballero and Engel (1999)]. Given an imbalance x, it is no longer possible to say with certainty whether or not the firm will adjust, but the expected investment rate of the firm is given by

E[I_t/K_t | x] = (e^{-x} − 1) A(x) ≈ −x A(x),   (3.17)
which is simply the product of the adjustment if it occurs and the probability that adjustment occurs^35. Aggregation is now only a step away.

35 Caballero and Engel (1999) refer to A(x) as the "effective hazard" to capture the idea that, through a normalization, it also captures scenarios where adjustment, if it occurs, is only a fraction of the imbalance x.
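In code, relation (3.17) is a one-liner. The sketch below uses a purely illustrative quadratic hazard (not an estimate from the paper) to compare the exact expression (e^{-x} − 1)A(x) with the approximation −xA(x):

```python
import math

def hazard(x, lam0=0.05, lam2=0.4):
    # Illustrative increasing hazard A(x): the adjustment probability rises
    # with the size of the imbalance and is capped at one.
    return min(1.0, lam0 + lam2 * x * x)

def expected_investment_rate(x):
    # Eq. (3.17): E[I/K | x] = (e^{-x} - 1) A(x); a negative x is a capital
    # shortage, so expected investment is positive there.
    return (math.exp(-x) - 1.0) * hazard(x)

def approx_investment_rate(x):
    # The approximation -x A(x), accurate for small |x|.
    return -x * hazard(x)

for x in (-0.5, -0.1, 0.1, 0.5):
    print(f"x={x:+.1f}  exact={expected_investment_rate(x):+.4f}  "
          f"approx={approx_investment_rate(x):+.4f}")
```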
3.2. Aggregation

Unlike microeconomic data, aggregate investment series look fairly smooth. Large microeconomic adjustments are far from being perfectly synchronized. The question arises, and this was the maintained hypothesis during the 1980s, as to whether aggregation eliminates all traces of lumpy microeconomic adjustment. The answer is a clear no. Doms and Dunne's evidence on the role of synchronization of primary spikes in accounting for aggregate investment, and on the high time series correlation between aggregate investment and a Herfindahl index of microeconomic investments, as well as the more structural empirical evidence reviewed in the next section, support this conclusion.

With the setup at hand, aggregation proceeds in two easy steps. To simplify things further, I will define the aggregate as the behavior of the average, rather than the weighted average^36. Both steps rely on having a large number of establishments, so that laws of large numbers can be applied. In the first step, one takes as the average investment rate (i.e. the ratio of investment to capital) of establishments with more or less the same capital imbalance, x, the conditional expectation of this ratio given in Equation (3.17):

(I_t/K_t)^x = −x A(x),   (3.18)

where the superscript x denotes the aggregate for plants with imbalance x. The second step just requires averaging across all x. Let f(x, t) denote the cross-sectional density of establishments' capital imbalances just before investment takes place at time t. Then the aggregate investment rate at time t, (I_t/K_t)^A, is

(I_t/K_t)^A = − ∫ x A(x) f(x, t) dx.   (3.19)
This is an interesting equation, with macroeconomic data on the left-hand side and microeconomic data on the right-hand side. An example serves to illustrate this aspect of the investment equation: if the adjustment hazard is quadratic,

A(x) = λ0 + λ1 x + λ2 x²,

Equation (3.19) reduces to

(I_t/K_t)^A = −λ0 μ^(1) − λ1 μ^(2) − λ2 μ^(3),   (3.20)

where μ^(1), μ^(2) and μ^(3) denote, respectively, the first, second and third moments of the distribution of establishments' imbalances.

36 Using microeconomic data, Caballero, Engel and Haltiwanger (1995) show that in US manufacturing this approximation is a good one. See Caballero and Engel (1999) for a detailed discussion of the issue.
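The moment representation in Equation (3.20) is easy to verify by simulation. In the sketch below both the hazard coefficients and the cross-sectional distribution of imbalances are hypothetical; direct aggregation of Equation (3.19) over a simulated cross section is compared with the three-moment formula:

```python
import random

random.seed(0)

# Hypothetical quadratic hazard A(x) = lam0 + lam1*x + lam2*x^2; over the
# support of the draws below it stays strictly between 0 and 1.
lam0, lam1, lam2 = 0.1, 0.05, 0.3

def hazard(x):
    return lam0 + lam1 * x + lam2 * x * x

# A cross section of imbalances drawn from an arbitrary f(x, t).
xs = [random.gauss(-0.1, 0.3) for _ in range(100_000)]

# Eq. (3.19): aggregate I/K as the average of -x * A(x) across establishments.
direct = -sum(x * hazard(x) for x in xs) / len(xs)

# Eq. (3.20): the same object computed from the first three moments.
mu1 = sum(xs) / len(xs)
mu2 = sum(x * x for x in xs) / len(xs)
mu3 = sum(x ** 3 for x in xs) / len(xs)
via_moments = -(lam0 * mu1 + lam1 * mu2 + lam2 * mu3)

print(f"direct = {direct:.5f}, via moments = {via_moments:.5f}")
```

Because the hazard is exactly quadratic over the support of the draws, the two numbers agree to rounding error.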
If λ1 = λ2 = 0, the model only has aggregate variables, on both the right- and left-hand sides. Indeed, this case corresponds to the celebrated partial adjustment model, and it also coincides with the equation obtained from a quadratic adjustment cost model with a representative agent [e.g. Rotemberg (1987) and Caballero and Engel (1999)]. If either λ1 or λ2 is different from zero, however, information about the cross-sectional distribution of imbalances is needed on the right-hand side. All the microeconomic models discussed in this section yield situations where higher moments of the cross-sectional distribution play a role.

3.3. Empirical evidence

There are two polar empirical strategies used to estimate Equation (3.19), with a continuum of possibilities in between. At one extreme, one can use microeconomic data to construct all the elements on the right-hand side; in particular, one can construct the path of the cross-sectional distribution and estimate the adjustment hazard as an accounting identity, or estimate a parametric version of it. At the other extreme, one can attempt to learn about the adjustment hazard from aggregate data only, by putting enough structure on the stochastic processes faced by firms and by starting with a guess on the initial cross-sectional distribution. Both avenues have been explored, with similar results along the dimensions in which they can be compared.

3.3.1. Microeconomic data

Caballero, Engel and Haltiwanger (1995) use information on approximately seven thousand US manufacturing plants from 1972 to 1988 to empirically recreate the steps described in the previous section^37. The figures below were constructed with data from that paper^38. The procedure used by Caballero, Engel and Haltiwanger is essentially accounting, except for the first step, which requires estimating a series of frictionless capital for each establishment, and, from this, a measure of x_it (an index of the capital imbalance of firm i at date t).
The series of frictionless capital were constructed using a procedure similar to that described in Section 2, but with cointegration regressions run at the individual establishment level^39. The average estimate of the long-run elasticity of capital with

37 As in Doms and Dunne (1993), we used data from the Longitudinal Research Datafile (LRD). The LRD was created by longitudinally linking the establishment-level data from the Annual Survey of Manufacturing. The data used in that paper is a subset of the LRD, representing all large, continuously operating plants over the sample. The data sets include information on both investment and retirement of equipment (i.e. the gross value of assets sold, retired, scrapped, etc.).
38 Warning: x in that paper corresponds to -x in this survey.
39 The results reported there constrained the coefficient on the elasticity of capital with respect to its cost to be equal across two-digit sectors, but all principal results were robust to different constraints and specifications.
Fig. 3.5. Adjustment hazard (the estimated hazard A(x), on a 0-0.4 scale, plotted against the imbalance x, from -2 to 2).
respect to its cost was close to minus one, with substantial heterogeneity across sectors. The measures of x_it, up to a constant, correspond to the difference between actual and estimated frictionless capital^40.

There are two results from that paper which seem particularly relevant for this section of the survey: one on the shape of the adjustment hazard, and the other on the consequences of this shape for aggregate dynamics. I discuss the former here and the latter after the next subsection. Figure 3.5 reports the average adjustment hazard, constructed by simply averaging the investment rates of establishments in a small neighborhood of each x and dividing by minus the corresponding x. The hazard is clearly increasing for positive adjustment (i.e. expected investment rises more than proportionally with the shortage of capital), as one would expect from the nonlinearities implied by (L, l, u, U) type models, and unlike the linear models, which imply a constant hazard. The estimated hazard is also very low for negative changes, suggesting irreversibility^41.

Following a similar procedure, Goolsbee and Gross (1997) have studied very detailed and high quality microeconomic data on capital stock decisions in the US airline industry. They found clear evidence of behavior consistent with non-convex adjustment costs.

3.3.2. Aggregate data

If only aggregate data are available, one needs to make some inference about the path of the cross-sectional distribution of capital imbalances, f(x, t), from these data. This is possible if enough structure is placed on the stochastic processes faced by firms.

40 The establishment-specific constants were estimated as the average gap between their respective k_it and frictionless k*_it for the five points with investment closest to their median (broadly interpreted as maintenance investment).
41 Retirements include assets sold, scrapped or retired. It is possible that observations are very noisy on this side. The right-hand side of the figure should therefore be viewed with some caution.
The basic operations affecting the evolution of f(x, t) are quite simple. Given the density, or histogram, at time t − 1, there are three basic operations in its transformation into f(x, t). First, aggregate shocks and common depreciation shift everybody's x in the same direction; second, given the adjustment hazard, the density at each x is split into those that stay there and those that adjust and move to some other position in the state space (in the simplest case, they move to x = 0, but this is not necessary); and third, idiosyncratic shocks hit, which amounts to a convolution of the density resulting from the second step with that of the idiosyncratic shocks. Making distributional assumptions about idiosyncratic shocks and the initial cross-sectional distribution is enough, therefore, to keep track of the evolution of the cross-sectional density, conditional on aggregate shocks and for a given adjustment hazard.

In continuous time, and assuming Brownian motions for aggregate and idiosyncratic shocks, Bertola and Caballero (1994) estimated the irreversible investment model, and Caballero (1993b) did so for the (L, l, u, U) model^42. In discrete time but continuous state space, Caballero and Engel (1999) estimated the more general adjustment hazard model described in the previous sections, assuming that both idiosyncratic and aggregate shocks were generated by log-normal processes. We did so for US manufacturing investment in equipment and structures (separately) for the 1947-1992 period^43. The results were largely consistent with those found with microeconomic data by Caballero, Engel and Haltiwanger (1995). There is clear evidence of an increasing hazard; that is, the expected adjustment of a firm grows more than proportionally with its imbalance^44. An important point to note is that since only aggregate data were used, these microeconomic nonlinearities must matter at the aggregate level, for otherwise they would not be identified. The improvement in the likelihood function from estimating this non-linear model rather than a simple linear model (including the quadratic adjustment cost model) was highly significant, and so was the improvement in out-of-sample forecasting accuracy^45.

42 See Bertola and Caballero (1990) for a discrete time and space model and estimation procedure.
43 Another important difference between this and the previous papers is that estimation was done by a single-step maximum likelihood procedure, which did not require estimating frictionless capital separately.
44 We did not allow for asymmetries between ups and downs, but this turned out not to matter much: given the strong drift induced by depreciation and the small value we found for the hazard in an interval around zero, the model effectively behaves as if investment is irreversible (i.e. it is very asymmetric around the median value of x, with a very small hazard for values of x much higher than that).
45 For within-sample criteria, we ran Vuong's [Rivers and Vuong (1991)] test for non-nested models, and we strongly rejected the hypothesis that both models (linear and non-linear) are equally close to the true model against the hypothesis that the structural (non-linear) model is better. For out-of-sample criteria, we dropped the last ten percent of the observations and evaluated the mean squared error of the one-step-ahead forecasts for these observations [see Caballero and Engel (1999)].
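The three operations on f(x, t) can be sketched directly on a simulated cross section. Everything in the sketch below is hypothetical (the hazard, the shock sizes, and the reset to x = 0); it serves only to illustrate the mechanics of moving the cross section forward one period:

```python
import random

random.seed(1)

def step(xs, aggregate_shock, hazard, idio_sigma):
    """One period of the evolution of the cross section of imbalances."""
    out = []
    for x in xs:
        # 1. Aggregate shocks and common depreciation shift every x equally.
        x += aggregate_shock
        # 2. The hazard splits the density: adjusters move (here, to x = 0).
        if random.random() < hazard(x):
            x = 0.0
        # 3. Idiosyncratic shocks: a convolution with their distribution.
        x += random.gauss(0.0, idio_sigma)
        out.append(x)
    return out

hazard = lambda x: min(1.0, 0.1 + 0.5 * x * x)   # hypothetical increasing hazard
xs = [random.gauss(0.0, 0.4) for _ in range(20_000)]
for _ in range(50):
    xs = step(xs, aggregate_shock=-0.02, hazard=hazard, idio_sigma=0.1)

print(f"mean imbalance after 50 periods: {sum(xs) / len(xs):+.3f}")
```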
3.3.3. Pent-up demand
What is the aspect of the data that makes these models better than linear ones at explaining aggregate investment dynamics? The simplest answer comes from an example. Suppose that a history of mostly positive aggregate shocks displaces the cross-sectional distribution of imbalances toward the high part of the hazard. Such a sequence of events will not only lead to more investment along the path but also to more pent-up investment demand; indeed, the cross-sectional distribution represents unfulfilled investment plans. But as unfulfilled demand "climbs" the hazard, more units are involved in responding to new shocks; incremental investment demand is more easily boosted by further positive aggregate shocks, or depressed by a turnabout of events.

This time-varying, history-dependent aggregate elasticity plays a very important role in aggregate investment dynamics. It captures the aggregate impact of changes in the degree of synchronization of large adjustments, already an important explanatory variable in Doms and Dunne's less structural study. In particular, their observation that the Herfindahl index of investment rises during episodes of large aggregate investment matches this mechanism well.

Using the path of cross-sectional distributions and hazards described at the beginning of this subsection, Caballero, Engel and Haltiwanger (1995) found an important role for the mechanism described above. Figure 3.6 depicts the relative contribution of the time-varying aggregate elasticity to aggregate investment dynamics. A positive value reflects an amplification effect (micro-nonlinearities exacerbate the economy's response to aggregate shocks), while a negative value reflects an offsetting effect. The impact of the time-varying elasticity appears to be especially large after the tax reform of 1986 (when tax incentives for investment were removed): the decline in investment was 20 percent greater than it would have been under a linear model.

Fig. 3.6. Relative contribution of time-varying marginal response, 1974-1988 (plotted as a fraction against time).
Fig. 3.7. Equipment-mean-difference.
The importance of the time-varying elasticity is confirmed by Caballero and Engel (1999), this time using only aggregate data. As before, it is the flexible cyclical elasticity of the increasing hazard model which allows it to better capture the high skewness and kurtosis imprinted on aggregate data by brisk investment recoveries^46. The solid line in Figure 3.7 plots the difference between the path of the US manufacturing equipment investment-capital ratio and the predictions of a linear (partial adjustment) model fed with the shocks estimated for the increasing-hazard model; the dashed line portrays the path of the aggregate investment-capital ratio around its mean. It is apparent from these figures that the linear model makes its largest errors at times of large investment changes.
3.4. Equilibrium

The literature described in the previous section only considers exogenous aggregate shocks. What the econometric procedures identified as aggregate shocks are in all likelihood a combination of "deep" aggregate shocks and the feedback and constraints brought about by factor markets, goods markets, and intertemporal preferences, among other things. Bottlenecks may certainly limit the extent of synchronized investment. Equilibrium constraints not only affect the response of aggregate investment to deep aggregate shocks, but also affect the nature of the stochastic processes faced

46 Note that just allowing for skewness and kurtosis in shocks, although it improves the performance of linear models, is not nearly enough to make the linear model as good as the non-linear one. In Caballero and Engel (1999) we compared the structural model with normal shocks (to the rate of growth of desired capital) with a linear model which flexibly combined normal and log-normal shocks (which allows for skewness and kurtosis). We found that Vuong's test still favored the non-linear model very clearly. Moreover, in Caballero, Engel and Haltiwanger (1995) we found no evidence that would allow us to reject the hypothesis that shocks have a normal distribution.
by firms and the dimension of the state space. It is this last observation which has inhibited progress in constructing general equilibrium versions of these models. In principle, the entire cross-sectional distribution is needed to forecast future prices faced by any particular firm, which means that actions today, and therefore equilibrium determination, depend on these complex forecasts, and so on.

We are, however, beginning to see progress along this dimension. Much of it has occurred in models with active extensive margins, and will be discussed in the next sections, together with the reasons why the presence of an extensive margin (entry and/or exit) may facilitate rather than complicate the solution of the model. However, there has also been recent progress along the lines of the intensive margin models discussed up to now. Krieger (1997) embeds the heterogeneous-agents irreversible investment model of Bertola and Caballero (1994) into a more or less standard Real Business Cycle model. He deals with the curse of dimensionality by arguing that, except for very high frequency aspects of the data, expectations can be well approximated by keeping track of a finite (and not too large) number of statistics of the Fourier representation of the cross-sectional distribution. I suspect that the quality of this approximation is facilitated by the fact that, in Krieger's model, aggregate shocks occur only infrequently. Nonetheless, I view his work as an important step forward.

At this stage, the primary effect of general equilibrium is not surprising. It brings important sources of aggregate convexity into the problem, further smoothing the response of aggregate investment to aggregate shocks. How important are aggregate sources of convexity? I suspect that, together with time-to-build considerations, they are among the main sources of convexity in the short run.
On one hand, we have already presented substantial evidence on microeconomic lumpiness, which is largely inconsistent with a dominant role for generalized convexity at the microeconomic level. On the other, not only is it well known that estimated partial adjustment coefficients grow with the degree of disaggregation of the data, but we also have direct evidence on the importance of bottlenecks. Goolsbee (1995a) provides interesting evidence on the latter. He exploits the variation across time and assets (capital) in investment tax incentives as instruments for short-run investment demand. He shows that the price of assets is highly responsive to ITCs: a 10 percent increase in ITCs leads to an average increase in the price of capital goods of about 6 percent. This price effect slowly vanishes over the following three years^47.

Equilibrium considerations will play a central role in the sections that follow. In particular, the issue of the elasticity of the supply of capital, generally interpreted, as well as that of other bottlenecks, will be revisited often.

47 In further work, Goolsbee (1997) concludes that an important fraction of the increase in short-run marginal cost is due to an increase in the wages of workers who produce capital goods. In the last part of Section 5 I will discuss the connection between sunk investment and payments to complementary factors. Questioning the robustness of Goolsbee's (1995a) findings, Hassett and Hubbard (1996a) find evidence of a positive effect of tax credits on prices of capital goods before 1975 but not after that.
4. Entry, exit and scrapping
Changes in the aggregate stock of capital are not only due to the expansion of existing establishments and projects; they also result from the entry (creation) decisions of existing and new entrepreneurs, the exit decisions of some incumbents, and the restructuring of possibly outdated forms of production. There is a very extensive and interesting industrial organization literature on these issues which I will not discuss here. Instead, I will focus on issues that directly relate to our current discussion: the impact of sunk costs on aggregate investment, and the feedback of equilibrium considerations into individual decisions about lumpy actions.

This section contains three main messages. First, by truncating the distribution of perceived future returns, free entry acts as if each competitive investor internalized the negative effect of its entry decision on expected future industry prices. Second, equilibrium scrapping and creation are closely connected: if industry-wide creation costs are linear, scrapping will be less responsive to aggregate shocks than if these costs are convex (i.e. if there is an upward-sloping short-run supply of (newly) installed capital). Among other things, this is important for capital accumulation and the patterns of its mismeasurement. And third, in equilibrium, shocks to the scrapping margin can lead to investment booms, and to double-counting problems in the measurement of capital^48.
4.1. Competitive entry and irreversibility

Dixit (1989), Leahy (1993), and Caballero and Pindyck (1996), among others, have provided simple models of competitive equilibrium investment in which the only meaningful investment decision of firms is whether or not to enter into and, in some cases, exit from the industry^49. Below, I sketch a representative model of this type.

Investment is sunk upon entry in the sense that selling the firm's capital does not change its productivity. The flow accruing to a firm i at time t is summarized by the product of an idiosyncratic productivity level, S_it > 0, and the industry price, P_t. The idiosyncratic productivity level is such that industry output, Y_t, is

Y_t = ∫_0^{N_t} S_it di,   (4.1)

where N_t is the measure of firms at time t. Given N_t, the industry price is determined from the demand equation:

P_t = V_t Y_t^{-1/η} = Ṽ_t N_t^{-1/η},   (4.2)
48 See Greenspan and Cohen (1996) for a discussion of the importance of considering endogenous scrappage in forecasting sales of new motor vehicles in the USA.
49 See Hopenhayn (1992) for an elegant characterization of the steady-state properties of a competitive equilibrium model of entry and exit.
where V_t is an aggregate demand shock that follows a geometric Brownian motion with drift μ > 0 and standard deviation σ, and η is the elasticity of demand with respect to price^50.

Let there be an infinite supply of potential entrants, whose initial productivity upon entry is drawn from the distribution of productivities of existing firms. There is an entry cost F and no depreciation or higher-productivity alternative use (issues of exit will be discussed in the next subsection). Free entry implies:

F ≥ E_t [ ∫_t^∞ S_is P_s e^{-r(s-t)} ds ].   (4.3)

Using Fubini's Theorem (i.e. moving the expectation with respect to the idiosyncratic shocks inside the integral) allows us to remove the idiosyncratic component from Equation (4.3), yielding

F ≥ E_t [ ∫_t^∞ P_s e^{-r(s-t)} ds ].   (4.4)
Given N_t, the industry price is exclusively driven by the aggregate demand shock. Thus, absent entry, the right-hand side of Equation (4.4) is an increasing function of P_t; call it f₀(P). Entry, however, cannot always be absent, for that would occasionally violate the free entry condition. Indeed, as soon as f₀(P) > F, there would be infinite entry which, in turn, would lower the equilibrium price instantly. There is only one price, call it P̄₀, such that the free entry condition holds with equality. Once this price is reached, enough entry will occur to ensure that the price does not cross this upper bound; but, to be justified, entry must not occur below that bound either. Entry, therefore, changes the stochastic process of the equilibrium price from a Brownian motion to a regulated Brownian motion. This change in the price process, however, means that f₀ is no longer the right description of the expression on the right-hand side of Equation (4.4). There is a new function, f₁(P), which is still monotonic in the price, but which satisfies f₁(P) < f₀(P) for all P because of the role of entry in preventing the realization of high prices. This, in turn, implies a new reservation/entry price P̄₁ > P̄₀, which leads to a new function f₂(P), such that f₀ > f₂ > f₁, which leads to a new regulation point in between the previous ones, and so on until convergence to some equilibrium (f(P), P̄)^51.
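The resulting regulated price process is easy to simulate. In the sketch below all parameter values are hypothetical, and the entry-triggering price P̄ is simply imposed rather than solved for from the free-entry condition; entry raises the measure of firms just enough to keep the price from crossing the barrier:

```python
import math
import random

random.seed(2)

eta = 2.0            # demand elasticity (hypothetical)
p_bar = 1.2          # entry-triggering price; in the model it makes free entry bind
mu, sigma = 0.02, 0.10
V, N = 1.0, 1.0

prices, entry_periods = [], 0
for _ in range(500):
    # Geometric-Brownian-motion demand shock, simulated at unit time steps.
    V *= math.exp(mu - 0.5 * sigma ** 2 + sigma * random.gauss(0.0, 1.0))
    P = V * N ** (-1.0 / eta)
    if P > p_bar:
        # Free entry: the measure of firms jumps so the price is regulated
        # back to the barrier (N never falls: investment is sunk).
        N = (V / p_bar) ** eta
        P = p_bar
        entry_periods += 1
    prices.append(P)

print(f"max price = {max(prices):.3f}, periods with entry = {entry_periods}")
```

The price drifts freely below the barrier (periods of inaction) and is flattened against it whenever entry occurs, which is exactly the regulated process described in the text.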
Thus, through competitive equilibrium, we have arrived at a solution like that of the irreversible investment problem at the individual level, but now for the industry as a whole. Periods of inaction are followed by regulated investment (through entry) during favorable times. The constructive argument used to illustrate the solution isolates

50 Adding an aggregate productivity shock is straightforward. The Brownian motion assumption is not needed, but it simplifies the calculations.
51 Needless to say, this iterative procedure is not needed to obtain the solution of this problem.
the feedback of equilibrium on individual decisions. Potential entrants (investors) know that if market conditions worsen they will have to absorb losses (this is where irreversibility kicks in), while if market conditions improve, entry will occur, limiting the potential gains (since the price will never be higher than P̄). As a result, they delay entry, because the expected value of future market prices is necessarily lower than the current/entry price.

There is a methodological angle in this literature. Entry (and exit) is a very powerful mechanism. With the "appropriate" assumptions about potential entrants, entry often simplifies the computation of equilibrium in models with heterogeneity and sunk costs. Essentially, the methodological "trick" is that the degree of complexity of the computational problem in cases where both extensive and intensive margins are present is often largely determined by the nature of the distribution of potential entrants, which can be made much simpler than the endogenous evolution of the cross-sectional distributions discussed in the previous section. Of course, in reality there is substantial inbreeding, so the distribution of potential entrants is in all likelihood related to that of incumbents. Nonetheless, the current set of models are convenient machines that allow us to cut the chain of endogeneity before it gets too forbidding, but after the first stage, where there are no endogenous interactions.

This methodological advantage has allowed researchers to explore some of the equilibrium issues left open in Section 3. Caballero and Hammour (1994) have explored in more detail the consequences of different assumptions on the supply of capital for the pattern of aggregate investment (job creation) and scrapping (job destruction). The latter is a very important, and often disregarded, aspect of the timing of capital accumulation.
Ch. 12: Aggregate Investment

I will return to the scrapping issue in the next sections, but for now I just want to interpret it as an incumbent's decision (as opposed to a potential entrant's decision). The issue at hand is how the entry pattern affects the response of incumbents to aggregate shocks. A scrapping margin can easily be added to the entry model discussed above by, for example, allowing S_i to take negative values (e.g. due to the increase in the price of an intermediate input). Imagine, however, that the drift in the aggregate shock (and/or the failure rate of incumbents) is strong enough so that there is continuous entry. Since the supply of capital faced by the industry is fully elastic (the entry cost is constant), continuous entry implies that the industry price is constant and equal to P (corrected for the exit possibility). That is, aggregate shocks are accommodated by the flow of investment by new entrants, fully insulating insiders from aggregate shocks. Insiders go about their scrapping decisions considering only their idiosyncratic shocks; adding a standard intensive margin does not change the basic insight [see Campbell and Fisher (1996)]. Caballero and Hammour (1994) refer to this result as perfect insulation.

From a technical point of view, the simplicity of the computation of equilibrium in the perfect insulation case carries through to situations where the cost of investment fluctuates exogenously, although in that case perfect insulation breaks down. If the industry faces an upward sloping supply of capital, a sensible assumption at least in the short run (remember Goolsbee's evidence), we return to a scenario in which the "curse of dimensionality" appears. Caballero and Hammour (1994, 1996a) have dealt with this case in scenarios where aggregate shocks follow deterministic cycles 52. Besides the specific issues addressed in those papers, the main implication for the purpose of this survey is that investment by potential entrants becomes less responsive to aggregate shocks, which also means a breakdown of perfect insulation and therefore a more volatile response of the scrapping and intensive margins.

Krieger (1997) also discusses equilibrium interactions between creation and destruction margins, although he obtains positive rather than negative comovement between investment and scrapping. In his model, a permanent technology shock leads to a short-term increase in interest rates which squeezes low-productivity units relative to high-productivity ones. The ensuing increase in scrapping frees resources for new, higher-productivity investment. Similarly, Campbell (1997) studies the equilibrium response of entry and exit to technology shocks embodied in new production units. He argues that the increase in exit generated by positive technological shocks is an important source of resources for the creation of new production sites.
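The perfect-insulation benchmark, in which free entry keeps the industry price at its ceiling, can be illustrated with a few lines of simulation. The sketch below is not from the chapter; the ceiling, drift, and volatility values are arbitrary assumptions chosen only to display inaction punctuated by entry episodes:

```python
import random

# Free entry pins down a ceiling price P_bar: whenever a favorable demand
# shock pushes the (log) price above the ceiling, entry expands capacity
# until the price is back at P_bar; below the ceiling nobody invests.
# All parameter values here are illustrative assumptions.
random.seed(0)
log_P_bar = 0.0
log_p = -0.25                    # start below the ceiling
entry, prices = [], []
for _ in range(5000):
    log_p += 0.002 + 0.01 * random.gauss(0.0, 1.0)  # drifting demand shock
    excess = max(log_p - log_P_bar, 0.0)
    entry.append(excess)         # capacity added by new entrants
    log_p -= excess              # entry regulates the price at the ceiling
    prices.append(log_p)
# Long stretches of inaction (entry == 0) alternate with bursts of entry,
# and the price never rises above the free-entry ceiling.
```

The regulated price process mimics the insulation result: insiders see a (weakly) capped price, while all the aggregate action is absorbed by the lumpy entry flow.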
4.2. Technological heterogeneity and scrapping

Scrapping is an important aspect of the process of capital accumulation. Understanding it is essential for constructing informative measures of the quantity and quality of capital at each point in time. Nonetheless, the scrapping margin is seldom emphasized, I suspect, mostly because of the difficulties associated with obtaining reliable data 53. As a result, many time series comparisons of capital accumulation and productivity growth (especially across countries) are polluted by inadequate accounting of scrapping. Effective capital depreciation must surely be higher in countries undergoing rapid modernization processes.

Partly to address these issues, vintage capital and putty-clay models have regained popularity lately. Benhabib and Rustichini (1993), for example, describe the investment cycles that follow scrapping cycles in a vintage capital model, while Atkeson and Kehoe (1997) argue that putty-clay models outperform standard putty-putty models with adjustment costs in describing the cross-sectional response of investment and output to energy shocks. Gilchrist and Williams (1996), on the other hand, embody the putty-clay model in an otherwise standard RBC model and document a substantial gain over the standard RBC model in accounting for the forecastable comovements of economic aggregates. And Cooley et al. (1997) describe the medium/low-frequency aspects of a multisectoral vintage capital economy, and show how tax policy can have significant effects on the age distribution of the capital stock 54.

The technological embodiment aspect of these models captures well the creative destruction component of capital accumulation and technological progress 55. Salter's (1960) careful documentation of the technological status of narrowly defined US and UK industries is very revealing with respect to the simultaneous use of different techniques of production and the negative correlation between productivity ranking and the technological age of the plant 56. For example, his table 5 shows the evolution of methods in use in the US blast furnace industry from 1911 to 1926. At the beginning of the sample, the "best practice" plants produced 0.32 gross tons of pig iron per man-hour, while the industry average was 0.14. By the end of the sample, best-practice plants' productivity was 0.57, while the industry average was 0.30. And while at the beginning of the sample about half of the plants used hand-charged methods of production, only six percent did at the end of the sample.

As mentioned above, obsolescence and scrapping are driven not only by slowly moving technological trends, but also by sudden changes in the economic environment. Goolsbee (1995b) documents the large impact of oil shocks on the scrapping of old and fuel-inefficient planes. For example, he estimates that the probability of retirement of a Boeing 707 (relatively inefficient in terms of fuel) more than doubled after the second oil shock. This increase was more pronounced among older planes. Once more, the endogenous nature of the scrapping dimension must be an important omitted factor in our accounting of capital accumulation and microeconomic as well as macroeconomic performance.

The sunk nature of technological embodiment is a source of lumpy and discontinuous actions at the microeconomic level. The (S, s) apparatus, with its implications for aggregates, is well suited for studying many aspects of vintage and putty-clay models: in particular, episodes of large investment which leave their technological fingerprints and remain in the economy, reverberating over time.

52 In work in progress [Caballero and Hammour (1997b)], we have obtained an approximate solution for the stochastic case, in a context where the sources of convexity are malfunctioning labor and credit markets.
53 See Greenspan and Cohen (1996) for sources of scrapping data for US motor vehicles.
5. Inefficient investment
Fixed costs, irreversibilities, and their implied pattern of action/inaction have microeconomic and aggregate implications beyond the mostly technological (and neoclassical) ones emphasized above. Indeed, they seed the ground for powerful inefficiencies. This section describes new research on the consequences of two of the most important sources of inefficiency in aggregate investment: informational and contractual problems.

54 Jovanovic (1997) studies the equilibrium interaction of the cross-sectional heterogeneity implied by vintage capital and putty-clay models with heterogeneity in labor skills.
55 Besides obsolescence and scrapping, these models are also useful for studying the issues of "mothballing" and capital utilization.
56 This correlation is less clear in modern data, perhaps because retooling occurs within given structures.
5.1. Informational problems

Information seldom arrives uniformly and comprehensively to every potential investor. Each investor probably holds part of a truth which would be more easily seen if all investors could (or would) pool their information. Actions by others are a partial substitute for information pooling, for they reveal, perhaps noisily, the information of those that have taken actions. If, however, investment is irreversible, it may pay to wait for others to act and reveal their information before investing. Moreover, if lumpiness leads to periods of little or no action, information may remain trapped for extended periods of time, and when agents finally act, an avalanche may occur because accumulated private information is suddenly aggregated. These issues form the crux of a very interesting new literature, summarized in Gale (1995) under the heading of "social learning." There are two themes emerging from this literature which are of particular importance for this survey. The first is the existence of episodes of gradualism, during which industry investment can occur at an excessively slow pace, or even collapse altogether. The second is an exacerbation of the aggregate nonlinearities implied by the presence of fixed costs; aggregation of information coincides with the synchronization of actions, further synchronizing actions.

Caplin and Leahy (1993, 1994) cleanly isolate the issues I have chosen to stress here. Caplin and Leahy (1993) describe a model very similar to the free entry model reviewed in Section 4.1, except that their model has neither aggregate nor idiosyncratic shocks. Instead, there is a flow marginal cost of producing which is known only to industry insiders. Insiders have the option to produce one unit of output or none, and they will produce if price is above marginal cost. This generates an information externality.
If all incumbents are producing, potential investors know that marginal cost is below the current equilibrium price; if not, the industry's marginal cost is revealed to be equal to the current price. Whenever a new establishment is created, the equilibrium price either declines or stays constant, improving the precision of potential investors' assessment of the industry's marginal cost. In a second-best solution, investment occurs very quickly up to a point at which, even if marginal cost has not yet been reached, no further investment takes place because it is very unlikely that the present value of future social surpluses is enough to cover the investment costs. The industry equilibrium outcome has the same critical point at which investment stops, but unlike the second-best outcome, it yields a much slower pace of industry investment. A potential entrant must weigh the value of coming early into the industry (expected profits are higher than they will be later), not only against the cost of capital (as in the second-best solution) but also against the probability of learning in the next second, from the investment decisions of others, that it was not worth entering the industry. Caplin and Leahy show that the price process x(t) obeys the following differential equation:

    (1/2) x(t) = rF - F (ẋ(t)/x(t)),    (5.1)
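The indifference condition (5.1), (1/2) x(t) = rF - F ẋ(t)/x(t), rearranges to ẋ = x (r - x/(2F)) and can be integrated forward. The sketch below is illustrative only; the values of r and F are assumptions, while x(0) = 1 follows the uniform prior on [0, 1]:

```python
# Euler integration of the competitive-entry indifference condition (5.1),
#   (1/2) x = r F - F xdot / x   =>   xdot = x (r - x / (2 F)).
# Parameter values are illustrative assumptions, not from the text.
r, F = 0.05, 1.0       # interest rate and fixed entry cost
x, dt = 1.0, 0.01      # initial price (prior upper bound) and time step
path = [x]
for _ in range(200_000):
    x += x * (r - x / (2.0 * F)) * dt
    path.append(x)
# The price declines monotonically and levels off near x = 2 r F, where
# the gains from delay (postponed entry cost plus the chance of learning
# the true marginal cost) just offset the revenue lost by waiting.
```

The slow, monotone glide of x(t) toward its resting point is the "gradualism" result: entry trickles in only as fast as the compensation for early investors' informational risk allows.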
where F is the fixed entry cost paid by the firm and r is the real interest rate. This equation has a natural interpretation which captures the idea that competitive firms are indifferent between entry today and entry tomorrow. The left-hand side represents the loss in current revenue incurred by a firm which delays entry for a brief instant beyond t 57. The right-hand side captures the expected gain from this delay. The term rF reflects the gain due to the postponement of the entry cost, while the last term represents the saving due to the possibility that delay will reveal the industry's true marginal cost, aborting a wasteful investment 58. In equilibrium, entry is delayed and price declines slowly; "gradualism" maintains prices high enough for sufficiently long so as to offset (in expectation) the risk incurred by investors who act early rather than wait and free-ride off of others' actions 59.

Caplin and Leahy (1994) characterize the opposite extreme, one of delayed exit. The key connection with the previous sections is that the problem of information revelation arises from the fact that, as we have seen, fixed costs of actions make it optimal not to act most of the time. Thus, information that could be revealed by actions remains trapped. Their model is one of time-to-build. Many identical firms simultaneously start projects which have an uncertain common return several periods later (e.g. a real estate boom). Along the investment path, firms must continue their investment and receive private signals on the expected return. The nature of technology is such that required investment is always the same if the firm chooses to continue in the project. The firm has the option to continue investing ("business as usual"), to terminate the project, or to suspend momentarily, but the cost of restarting the project after a suspension is very large. Project suspension reveals (to others) negative idiosyncratic information; if nobody suspends, it is good news. However, the costly nature of suspension delays it, and therefore information revelation is also delayed. Bad news may be accumulating but nobody suspends, because everybody is waiting for a confirmation of their bad signals by the suspension of other people. Eventually, some firms will receive enough bad signals to suspend in spite of the potential cost of doing so (i.e., the cost incurred if they are wrong in their negative assessment of market conditions). Since the number of firms in their model is large, the number of firms that suspend for the first time fully reveals future demand: if demand is low, everybody exits; if it is high, all those that suspended restart. If it were not for the interplay between inaction (investment as usual) and private information, the fate of the market would be decided correctly after the first round of signals. Information aggregation does not take place until much later, however. Thus, substantial investment may turn out to be wasted because the discrete nature of actions inhibits information transmission. The title of their paper beautifully captures the ex post feeling: Wisdom after the fact.

The "classic" paper from the literature on information and investment is due to Chamley and Gale (1994). In their model all (private) information arrives at time zero; the multiple agent game that ensues may yield many different aggregate investment paths, including suboptimal investment collapses. In reviewing the literature, Gale (1995) illustrates the robustness of the possibility of an inefficient investment collapse (or substantial slowdown and delay). He notices that in order for there to be any value to waiting to see what others do before taking an action (investing, for example), it must be the case that the actions of others are meaningful. That is, the action taken in the second period by somebody who chose to wait in the first period must depend in a non-trivial way on the actions of others at the first date. If a firm chooses to wait this period, possibly despite having a positive signal, it will only invest next period if enough other firms invest this period. It must therefore be possible for every firm to decide not to invest next period because no one has invested this period, even though each firm may have received a positive signal this period; in which case, investment collapses.

57 At the time when the industry starts, potential investors' priors are that marginal cost is distributed uniformly on [0, 1]. As entry occurs and the price declines, the priors are updated. If convergence has not happened at time t, marginal cost is assumed uniformly distributed on [0, x(t)]. The expected cost of waiting is, therefore, equal to the price minus the expected marginal cost, (1/2) x(t).
58 Here -(ẋ(t)/x(t)) dt is the probability that price hits marginal cost during the next dt units of time.
59 Even though entrants make zero profits in expectation, ex post, early entrants earn positive profits, while late entrants lose money.
This is a very interesting area of research for those concerned with investment issues, and one that is still wanting in empirical development.
5.2. Specificity and opportunism

The quintessential problem of investment is that it is almost always sunk, possibly along many dimensions. That is, the number of possible uses of resources is reduced dramatically once they have been committed or tailored to a specific project or use. Every model I discussed in the previous sections hinges, at some stage, in a fundamental way on this feature of investment. To invest often means opening a vulnerable flank. Funds which were ex-ante protected against certain realizations of firm or industry specific shocks, for example, are no longer so. In equilibrium, investment must also allow the investor to exploit opportunities which would not be available without the investment. If the project is well conceived, the weight of good and bad scenarios is such that the expected return is reasonable. Indeed, this is precisely the way I characterized the standard irreversible investment problem early on. The problem is far more serious, and more harmful for investment, when the probability of occurrence of the bad events along the exposed flanks is largely controlled by economic agents with the will and freedom to behave opportunistically. In a sense, this is a property rights problem, and as such it must have a first-order effect in explaining the amount and type of capital accumulation and, especially, differences in these variables across countries.

Thus, the window for opportunism arises when part of the investment is specific to an economic relationship, in the sense that if the relationship breaks up, the potential rewards to that investment are irreversibly lost. Further, such opportunism is almost unavoidable when this "fundamental transformation" from uncommitted to specialized capital is not fully protected by contract [Williamson (1979, 1985)] 60. Specificity, that is, the fact that factors of production and assets may be worth more inside a specific relationship than outside of it, may have a technological or an institutional origin. Transactions in labor, capital and goods markets are frequently characterized by some degree of specificity. The creation of a job often involves specific investment by the firm and the worker. Institutional factors, such as labor regulations or unionization, also build specificities. There is a very extensive and interesting microeconomic literature on the impact of unprotected specificity on the design of institutions, organizations and control rights. Hart (1995) reviews many of the arguments and insights. For the purpose of this survey, however, the fundamental insight is in Simons (1944), who clearly understood that hold-up problems lead to underinvestment:

. . . the bias against new investment inherent in labor organizations is important . . . . Investors now face . . . the prospect that labor organizations will appropriate most or all of the earnings . . . . Indeed, every new, long-term commitment of capital is now a matter of giving hostages to organized sellers of complementary services.

More recently, Grout (1984) formalized and generalized Simons' insight, and Caballero and Hammour (1998a) studied, at a general level, the aggregate consequences of opportunism 61. Here, I borrow the basic model and arguments from that paper to discuss those aspects of the problem which are most relevant for aggregate investment.

Everything happens in a single period 62. There is one consumption good, used as a numeraire, and two active factors of production, 1 and 2 63. Ownership of factors 1 and 2 is specialized in the sense that nobody owns more than one type of factor.

60 This is known as the hold-up problem.
61 For specific applications which relate to investment, see Kiyotaki and Moore (1997) [credit constraints]; Caballero and Hammour (1996a, 1998b) and Ramey and Watson (1996) [turnover and unemployment]; Caballero and Hammour (1996b) and Blanchard and Kremer (1996) [transition economies and structural adjustments]; Caballero and Hammour (1997b) [interactions between labor market and credit market opportunism]; and Acemoglu (1996) [human capital investment].
62 Many of the insights discussed here can and have been made in dynamic, but more specialized, contexts. I am confident, therefore, that this section's discussion is fairly robust to generalizations along this dimension.
63 Also, there is a passive third factor which earns the rents of decreasing returns sectors.
There are two modes of production. The first is joint production, which requires, in fixed proportions, x_1 and x_2 units of factors 1 and 2, respectively, to produce y units of output. Let E denote the number of joint production units, so E_i = x_i E represents employment of factor i in joint production. The other form of production is autarky, where each factor produces separately, with decreasing returns technologies F_i(U_i), and where U_i denotes the employment of factor i in autarky, such that E_i + U_i = 1. The autarky sectors are competitive, with factor payments p_i:
    p_i = F_i'(U_i).    (5.2)
For now, there are no existing units. At the beginning of the period there is mass one of each factor of production. There are no matching frictions, so that, in the efficient/complete contracts economy, units move into joint production (assuming corners away) until

    y = x_1 p_1* + x_2 p_2*,    (5.3)

where asterisks are used to denote efficient quantities and prices.

Specificity is captured by assuming that a fraction φ_i of each factor of production cannot be retrieved from a relationship once the factors have agreed to work together. If the relationship breaks up, (1 - φ_i) x_i units of factor i can return to autarky, where they produce for the period, while φ_i x_i units are irreversibly wasted. In the simple deterministic single-period model discussed here, specificity plays no role in the efficient economy, where there are no separations.

Contracts are needed because investment occurs before actual production and factor participation. There are myriad reasons why contracts are seldom complete. An extreme assumption, which takes us to the main issues most directly, is that there are no enforceable contracts. It turns out that, in equilibrium, the incomplete contracts economy has no separations either; but unlike in the efficient economy, the mere possibility of separations alters equilibrium in many ways.

Generically, equilibrium rewards in joint production will have ex-post opportunity cost and rent-sharing components. For simplicity, let us assume that factors split their joint surplus 50/50. Thus, the total payment to the x_i units of factor i in a unit of joint production is 64

    w_i x_i = (1 - φ_i) x_i p_i + (1/2) s,    (5.4)

where s denotes the (ex-post) quasi-rents of a production unit:

    s = y - (1 - φ_1) x_1 p_1 - (1 - φ_2) x_2 p_2.    (5.5)
64 Factors bargain as coalitions within the production unit.
For a factor of production to willingly participate in joint production, it must hold that

    w_i x_i ≥ p_i x_i.    (5.6)

Substituting Equations (5.4) and (5.5) into Equation (5.6) transforms factor i's participation condition into

    y ≥ x_1 p_1 + x_2 p_2 + Δ_i,    (5.7)

with

    Δ_i = φ_i x_i p_i - φ_j x_j p_j,    (5.8)

which measures the net sunk component of the relationship for factor i. In other words, it is a measure of the "exposure" of factor i to factor j. When Δ_i is positive, part of factor i's contribution to production is being appropriated by factor j 65.
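The efficient entry condition and the binding participation condition can be solved numerically for the number of joint production units. The sketch below is illustrative only: it assumes specific functional forms for autarky production, F_i(U) = U**alpha_i (so p_i = alpha_i U**(alpha_i - 1)), and arbitrary parameter values, and is meant solely to show how the appropriability wedge Δ_i shrinks equilibrium entry:

```python
# Equilibrium number of joint production units E, efficient vs. incomplete
# contracts.  Functional forms and parameters are illustrative assumptions:
# autarky output F_i(U) = U**alpha_i, hence p_i = alpha_i * U**(alpha_i - 1).
x = (1.0, 1.0)        # input requirements x_1, x_2
phi = (0.8, 0.2)      # specificity (sunk share) of each factor
alpha = (0.3, 0.6)    # autarky curvature parameters
y = 2.0               # output of a joint production unit

def p(i, E):
    """Autarky price of factor i when E joint units are active."""
    U = 1.0 - x[i] * E                 # employment remaining in autarky
    return alpha[i] * U ** (alpha[i] - 1.0)

def rhs_efficient(E):                  # sum of ex ante opportunity costs
    return x[0] * p(0, E) + x[1] * p(1, E)

def rhs_incomplete(E):                 # binding participation condition
    delta_1 = phi[0] * x[0] * p(0, E) - phi[1] * x[1] * p(1, E)
    return rhs_efficient(E) + abs(delta_1)   # Delta_2 = -Delta_1

def solve(rhs):
    """Bisection for y = rhs(E); rhs is increasing in E on (0, 1)."""
    lo, hi = 0.0, 1.0 - 1e-9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if rhs(mid) < y else (lo, mid)
    return 0.5 * (lo + hi)

E_eff = solve(rhs_efficient)
E_inc = solve(rhs_incomplete)
# E_inc < E_eff: a positive net exposure raises the effective entry hurdle,
# so fewer joint production units form, i.e. there is underinvestment.
```

Under any parameterization with nonzero net appropriation, the wedge |Δ_1| shifts the binding entry condition up and E_inc falls short of E_eff.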
5.2.1. Generic implications

Figure 5.1 characterizes equilibrium in both the efficient and the incomplete contracts economies. The two dashed curves represent the right-hand side of condition (5.7) for factors 1 and 2. They are increasing in the number of production units because the opportunity cost of factors of production (the p_i's) rises as resources are attracted away from autarky. The thick dashed curve corresponds to that factor of production (here factor 1) whose return in autarky is less responsive to quantity changes 66. If one thinks of capital and labor, arguably capital is this factor, which is a maintained assumption through most of this section. The horizontal solid line is a constant equal to y, which corresponds to the left-hand side of condition (5.7). Equilibrium in the incomplete contracts economy corresponds to the intersection of this line with the highest (at the point of intersection) of the two dashed lines. In the figure, the binding constraint is that of capital. An efficient equilibrium, on the other hand, corresponds to the intersection of the horizontal solid line with the solid line labeled Eff. The latter is just the sum of the ex ante opportunity costs of factors of production [the right-hand side of Equation (5.3)]. This equilibrium coincides with that of the incomplete contracts economy only when both dashed lines intersect; that is, when net appropriation is zero (Δ_1 = -Δ_2 = 0).

Fig. 5.1. Opportunism in general equilibrium.

There are several features of equilibrium which are important for investment (or capital accumulation). First, there is underinvestment; equilibrium point A is to the left of the efficient point A*. Because it is being appropriated, capital withdraws into autarky (e.g. consumption, investment abroad, or investment in less socially-valuable activities) 67. Second, the withdrawal of capital constrains the availability of jobs and segments the labor market 68. In equilibrium, not only are there fewer joint production units, but also the right-hand side of condition (5.7) for labor is less than y, reflecting the net appropriation of capital; outside labor cannot arbitrage away this gap because its promises are not enforceable. Third, investment is more volatile than it would be in the efficient economy 69. Changes in y translate into changes in the number of joint production units through capital's entry condition (thick dashes), which is clearly more elastic (at their respective equilibria) than the efficient entry condition (the "Eff" line).

If profitability in joint production is high enough, equilibrium is to the right of the balanced-specificity point, B. In that region, it is the labor entry condition which binds. In principle, problems are more easily solved in this region through contracts and bonding. If not solved completely, however, there are a few additional conclusions of interest for an investment survey. First, there is underinvestment, since the complementary factor, labor, withdraws (relative to the first best outcome) from joint production. Second, capital is now rationed, so privately profitable investment projects do not materialize. Third, investment is now less volatile than in the efficient economy. Changes in y translate into changes in the number of joint production units through labor's entry condition (thin dashes), which is clearly less elastic than the efficient entry condition (the "Eff" line).

65 It should be apparent that Δ_i = -Δ_j.
66 That is, autarky exhibits relatively less decreasing returns for this factor.
67 See Fallick and Hassett (1996) for evidence on the negative effect of union certification on firm-level investment.
68 This holds even in the extreme case where capital and labor are perfect substitutes in production. See Caballero and Hammour (1998a).
69 In a dynamic model, this translates into a statement about net capital accumulation rather than, necessarily, investment. The reason for the distinction is that the excessive response of the scrapping margins and intertemporal substitution effects on the creation side may end up dampening actual investment. See Caballero and Hammour (1996a).
The equilibrium implications of incomplete contracts also affect the scrapping decisions of firms. The easiest way to see this is to examine an existing production unit and ask how low its profitability would have to be for it to scrap itself and seek other opportunities. Moreover, assume that neither factor suffers from specificity in this production unit, so that the efficient rule is to scrap whenever profitability is less than y. Two apparently contradictory features characterize the incomplete contracts economy. First, because the opportunity cost of factors of production is depressed by the excessive allocation to autarky, there is sclerosis; that is, there are units with profitability below y which are not scrapped because the opportunities in autarky are depressed. Second, given the depressed level of investment, there is excessive destruction. Since the appropriating factor earns rents in joint production, some of them leave socially valuable production units in order to improve their chances of earning these excess returns.

Caballero and Hammour (1998a,b) argue that, over the long run, capital/labor substitution takes place. If capital is being appropriated, it will seek to exclude labor from joint production by choosing a capital-intensive technology. This effect goes beyond purely neoclassical substitution, as it also seeks to reduce the appropriability problem 70.

At a general level, of course, unenforceability of contracts results from the absence of well-defined property rights. There is plenty of evidence on the deleterious consequences of such problems for investment. Two recent examples in the literature are Besley (1995) and Hall and Jones (1996). The former provides a careful description of land rights in different regions of Ghana. He documents that an "extra right" over a piece of land increases investment in that land by up to 9 percent in Anloga and up to 28 percent in Wassa 71.
Hall and Jones (1996) use a large cross-section of countries to show, among other things, that capital/labor ratios are strongly negatively related to "divertment activities."
5.2.2. Credit constraints

There is by now a large body of evidence supporting the view that credit constraints have substantial effects on firm-level investment. Although there are a number of qualifications to specific papers in the literature, the cumulative evidence seems overwhelmingly in favor of the claim that investment is more easily financed with internal than with external funds 72. I will not review this important literature here because there are already several good surveys 73.

70 We argue that this is a plausible factor behind the large increase in capital/labor ratios in Europe relative to the USA.
71 Rights to sell, to rent, to bequeath, to pledge, to mortgage, etc.
72 For a dissenting view, see e.g. Kaplan and Zingales (1997) and Cummins, Hassett and Oliner (1996b).
73 See e.g. Bernanke et al. (1996, 1999) and Hubbard (1995) for recent ones.
While there are extensive empirical and theoretical microeconomic literatures, the macroeconomics literature on credit constraints is less developed. Notable exceptions are Bernanke and Gertler (1989, 1990), Kiyotaki and Moore (1997) and Greenwald and Stiglitz (1993) 74. Although the exact mechanisms are not always the same, many of the aggregate insights of this literature can be described in terms of the results in the preceding subsections. Changing slightly the interpretation of factor 2, from labor to entrepreneurs, allows us to use Figure 5.1 to characterize credit constraints. Rationing in the labor market becomes rationing of the credit available to projects. To the left of point B, which is the region analyzed in the literature, net investment is too responsive to shocks; there is more credit rationing as the state of the economy declines; and there is underinvestment in general. Internal funds and collateralizable assets reduce the extent of the appropriability problem by playing the role of a bond, and introduce heterogeneity and therefore a ranking of entrepreneurs. Since the value of collateral is likely to decline during a recession, there is an additional amplification effect due to the decline in the feasibility of remedial "bonding" 75.
6. Conclusion and outlook
This survey started by arguing that the long-run relationship between aggregate capital, output and the cost of capital is not very far from what is implied by the basic neoclassical model: in the US, the elasticity of the capital-output ratio with respect to permanent changes in the cost of capital is close to minus one. In the short run things are more complex. Natural experiments have shown that, in the cross section, the elasticity of investment with respect to changes in investment tax credits is much larger than we once suspected. How to go from these microeconomic estimates to aggregates, and to the response of investment to other types of shocks, is not fully resolved. We do know, however, that these estimates represent expected values of what seems to be a very skewed distribution of adjustments. A substantial fraction of a firm's investment is bunched into infrequent and lumpy episodes. Aggregate investment is heavily influenced by the degree of synchronization of microeconomic investment spikes. For US manufacturing, the short-run (annual) elasticity of investment with respect to changes in the cost of capital is less than one tenth of the long-run response when the economy has had a depressed immediate history, while this elasticity can rise by over 50 percent when the economy is undergoing a sustained expansion.
74 Also see Gross (1994) for empirical evidence and a model integrating financial constraints and irreversibility.
75 See e.g. Kiyotaki and Moore (1997).
858
R.J. Caballero
Still, the mapping from microeconomics to aggregate investment dynamics - especially equilibrium aggregate investment dynamics - is probably more complex than the direct aggregation of very non-linear investment patterns. Informational problems lead to a series of strategic delays which feed into, and feed off of, the natural inaction of lumpy adjustment models. This process has the potential to significantly exacerbate the time-varying nature of the elasticity of aggregate investment with respect to aggregate shocks. Moreover, sunk costs provide fertile ground for opportunistic behavior. In the absence of complete contracts, aggregate net investment is likely to become excessively volatile. The lack of response of equilibrium payments to complementary - and otherwise inelastic - factors (e.g. workers) exacerbates the effects of shocks experienced by firms. Also, the withdrawal of financiers' support during recessions further reduces investment. Thus, capital investment seems to be hurt at both ends: by workers who do not share fairly in downturns, and by financiers who want to limit their exposure to potential appropriations by entrepreneurs who cannot credibly commit not to appropriate during the recovery. The last two themes - equilibrium outcomes with informational problems, and opportunism - still await systematic empirical work. I therefore suspect that we will see plenty of research filling this void in the near future.
References
Abel, A.B. (1979), Investment and the Value of Capital (Garland, New York).
Abel, A.B., and J.C. Eberly (1994), "A unified model of investment under uncertainty", American Economic Review 84(December):1369-1384.
Abel, A.B., and J.C. Eberly (1996a), "Investment and q with fixed costs: an empirical analysis", mimeograph (Wharton School, University of Pennsylvania, January).
Abel, A.B., and J.C. Eberly (1996b), "The effects of irreversibility and uncertainty on capital accumulation", mimeograph (Wharton School, University of Pennsylvania, May).
Abel, A.B., A.K. Dixit, J.C. Eberly and R.S. Pindyck (1996), "Options, the value of capital, and investment", Quarterly Journal of Economics 111(3, August):753-777.
Acemoglu, D. (1996), "A microfoundation for social increasing returns in human capital accumulation", Quarterly Journal of Economics 111(3, August):779-804.
Atkeson, A., and P.J. Kehoe (1997), "Models of energy use: putty-putty vs. putty-clay", Staff Report 230 (Federal Reserve Bank of Minneapolis, Research Department, March).
Auerbach, A.J., and K.A. Hassett (1992), "Tax policy and business fixed investment in the United States", Journal of Public Economics 47(2):141-170.
Barnett, S., and P. Sakellaris (1995), "Non-linear response of firm investment to Q: testing a model of convex and non-convex adjustment costs", mimeograph (University of Maryland, August).
Benhabib, J., and A. Rustichini (1993), "A vintage model of investment and growth: theory and evidence", in: R. Becker, ed., General Equilibrium, Growth and Trade, vol. II (Academic Press, New York).
Bernanke, B.S., and M. Gertler (1989), "Agency costs, net worth, and business fluctuations", American Economic Review 79:14-31.
Bernanke, B.S., and M. Gertler (1990), "Financial fragility and economic performance", Quarterly Journal of Economics 105:87-114.
Bernanke, B.S., M. Gertler and S. Gilchrist (1996), "The financial accelerator and the flight to quality", Review of Economics and Statistics 78:1-15.
Bernanke, B.S., M. Gertler and S. Gilchrist (1999), "The financial accelerator in a quantitative business cycle framework", ch. 21, this Handbook.
Bertola, G. (1988), "Adjustment costs and dynamic factor demands: investment and employment under uncertainty", Ph.D. Dissertation (MIT, Cambridge, MA).
Bertola, G. (1992), "Labor turnover costs and average labor demand", Journal of Labor Economics 10(4):389-411.
Bertola, G., and R.J. Caballero (1990), "Kinked adjustment costs and aggregate dynamics", in: O.J. Blanchard and S. Fischer, eds., NBER Macroeconomics Annual (MIT Press, Cambridge, MA) 237-288.
Bertola, G., and R.J. Caballero (1994), "Irreversibility and aggregate investment", Review of Economic Studies 61(2, April):223-246.
Besley, T. (1995), "Property rights and investment incentives: theory and evidence from Ghana", Journal of Political Economy 103(5, October):903-937.
Blanchard, O.J. (1986), "Investment, output, and the cost of capital: a comment", Brookings Papers on Economic Activity 1986(1):153-158.
Blanchard, O.J., and M. Kremer (1996), "Disorganization", mimeograph (MIT, October).
Brainard, W.C., and J. Tobin (1968), "Pitfalls in financial model building", American Economic Review 58(May):99-122.
Caballero, R.J. (1993a), "On the dynamics of aggregate investment", in: L. Serven and A. Solimano, eds., Striving for Growth After Adjustment, The Role of Capital Formation (The World Bank, Washington, DC) 81-106.
Caballero, R.J. (1993b), "Durable goods: an explanation for their slow adjustment", Journal of Political Economy 101(2, April):351-384.
Caballero, R.J. (1994a), "Small sample bias and adjustment costs", Review of Economics and Statistics 76(1, February):52-58.
Caballero, R.J. (1994b), "A reconsideration of investment behavior using tax reforms as natural experiments: a comment", Brookings Papers on Economic Activity 1994(2):62-68.
Caballero, R.J., and E. Engel (1999), "Explaining investment dynamics in U.S. manufacturing: a generalized (S, s) approach", Econometrica 67(4, July).
Caballero, R.J., and M.L. Hammour (1994), "The cleansing effect of recessions", American Economic Review 84(5, December):1350-1368.
Caballero, R.J., and M.L. Hammour (1996a), "On the timing and efficiency of creative destruction", Quarterly Journal of Economics 111(3, August):805-852.
Caballero, R.J., and M.L. Hammour (1996b), "On the ills of adjustment", Journal of Development Economics 51(1, October):161-192.
Caballero, R.J., and M.L. Hammour (1997b), "Improper churn: financial constraints and factor markets", mimeograph (MIT, May).
Caballero, R.J., and M.L. Hammour (1998a), "The macroeconomics of specificity", Journal of Political Economy 106(4, August):724-767.
Caballero, R.J., and M.L. Hammour (1998b), "Jobless growth: appropriability, factor substitution, and unemployment", Carnegie-Rochester Conference Series on Public Policy 48(June):51-94.
Caballero, R.J., and J. Leahy (1996), "Fixed costs: the demise of marginal q", Working paper No. 5508 (NBER, March).
Caballero, R.J., and R.S. Pindyck (1996), "Uncertainty, investment, and industry evolution", International Economic Review 37(3, August):641-662.
Caballero, R.J., E. Engel and J. Haltiwanger (1995), "Plant-level adjustment and aggregate investment dynamics", Brookings Papers on Economic Activity 1995(2):1-54.
Campbell, J.R. (1997), "Entry, exit, embodied technology, and business cycles", Working Paper No. 5955 (NBER, March).
Campbell, J.R., and J.D.M. Fisher (1996), "Aggregate employment fluctuations with microeconomic asymmetries", mimeograph (University of Rochester, August).
Caplin, A., and J. Leahy (1993), "Sectoral shocks, learning, and aggregate fluctuations", Review of Economic Studies 60(4, October):777-794.
Caplin, A., and J. Leahy (1994), "Business as usual, market crashes and wisdom after the fact", American Economic Review 84(3, June):549-565.
Chamley, C., and D. Gale (1994), "Information revelation and strategic delay", Econometrica 62:1065-1085.
Chirinko, R.S. (1993), "Business fixed investment spending: a critical survey of modelling strategies, empirical results, and policy implications", Journal of Economic Literature 31(December):1875-1911.
Clark, J.M. (1917), "Business acceleration and the law of demand: a technical factor in economic cycles", Journal of Political Economy 25(March):217-235.
Clark, J.M. (1944), "Additional note on business acceleration and the law of demand", in: American Economic Association, Readings in Business Cycle Theory (Blakiston Company, Philadelphia, PA).
Cooley, T.F., J. Greenwood and M. Yorukoglu (1997), "The replacement problem", Working paper No. 444 (Rochester Center for Economic Research, August).
Cooper, R., J. Haltiwanger and L. Power (1994), "Machine replacement and the business cycle: lumps and bumps", mimeograph (Boston University).
Cummins, J.G., K.A. Hassett and R.G. Hubbard (1994), "A reconsideration of investment behavior using tax reforms as natural experiments", Brookings Papers on Economic Activity 1994(2):1-59.
Cummins, J.G., K.A. Hassett and R.G. Hubbard (1996a), "Tax reforms and investment: a cross country comparison", Journal of Public Economics 62(1/2, October):237-273.
Cummins, J.G., K.A. Hassett and S.D. Oliner (1996b), "Investment behavior, internal funds, and observable expectations", mimeograph (New York University, October).
Dixit, A. (1989), "Entry and exit decisions under uncertainty", Journal of Political Economy 97:620-638.
Dixit, A. (1993), The Art of Smooth Pasting (Harwood Academic Publishers, Langhorne, PA).
Doms, M., and T. Dunne (1993), "An investigation into capital and labor adjustment at the plant level", mimeograph (Center for Economic Studies, Census Bureau).
Eisner, R. (1969), "Tax policy and investment behavior: a comment", American Economic Review 59(June):379-388.
El Karoui, N., and I. Karatzas (1991), "A new approach to the Skorohod problem, and its applications", Stochastics 34:57-82.
Fallick, B.C., and K.A. Hassett (1996), "Investment and union certification", Discussion paper No. 1996-43 (FED, November).
Fazzari, S.M., R.G. Hubbard and B.C. Petersen (1988), "Financing constraints and corporate investment", Brookings Papers on Economic Activity 1988(1):141-195.
Gale, D. (1995), "What have we learned from social learning?", mimeograph (Boston University, August).
Gilchrist, S., and J.C. Williams (1996), "Putty-clay and investment: a business cycle analysis", mimeograph (Boston University, May).
Goolsbee, A. (1995a), "Investment tax incentives and the price of capital goods", mimeograph (Chicago GSB).
Goolsbee, A. (1995b), "Factor prices and the retirement of capital goods", mimeograph (Chicago GSB, July).
Goolsbee, A. (1997), "The incidence of investment tax subsidies: to the workers go the spoils?", mimeograph (Chicago GSB, February).
Goolsbee, A., and D.B. Gross (1997), "Estimating adjustment costs with data on heterogeneous capital goods", mimeograph (Chicago GSB, September).
Greenspan, A., and D. Cohen (1996), "Motor vehicles stocks, scrappage, and sales", Working paper No. 1996-40 (FED, October).
Greenwald, B., and J. Stiglitz (1993), "Financial market imperfections and business cycles", Quarterly Journal of Economics 108(1, February):77-114.
Gross, D. (1994), "The investment and financing decisions of liquidity constrained firms", mimeograph (Chicago GSB).
Grout, P.A. (1984), "Investment and wages in the absence of binding contracts: a Nash bargaining approach", Econometrica 52(2, March):449-460.
Hall, R.E., and C.I. Jones (1996), "The productivity of nations", mimeograph (Stanford University, August).
Hall, R.E., and D.W. Jorgenson (1967), "Tax policy and investment behavior", American Economic Review 57(3, June):391-414.
Hart, O. (1995), Firms, Contracts and Financial Structure, Clarendon Lectures in Economics (Oxford University Press, Oxford).
Hassett, K.A., and R.G. Hubbard (1996a), "New evidence concerning the openness of the world market for capital goods", mimeograph (Board of Governors of the Federal Reserve System, June).
Hassett, K.A., and R.G. Hubbard (1996b), "Tax policy and investment", Working Paper No. 5683 (NBER, July).
Hayashi, F. (1982), "Tobin's marginal Q and average Q: a neoclassical interpretation", Econometrica 50(1, January):213-224.
Hopenhayn, H.A. (1992), "Entry, exit, and firm dynamics in long run equilibrium", Econometrica 60:1127-1150.
Hubbard, R.G. (1995), "Capital-market imperfections and investment", mimeograph (Columbia University).
Jorgenson, D.W. (1963), "Capital theory and investment behavior", American Economic Review 53(2, May):247-259.
Jovanovic, B. (1997), "Obsolescence of capital", mimeograph (New York University, February).
Kaplan, S.N., and L. Zingales (1997), "Do investment-cash flow sensitivities provide useful measures of financing constraints?", Quarterly Journal of Economics 112(1, February):169-216.
Kiyotaki, N., and J. Moore (1997), "Credit cycles", Journal of Political Economy 105(2, April):211-248.
Koyck, L.M. (1954), Distributed Lags and Investment Analysis (North-Holland, Amsterdam).
Krieger, S. (1997), "The general equilibrium dynamics of investment, scrapping and reorganization in an economy with firm level uncertainty", mimeograph (Chicago, July).
Leahy, J. (1993), "Investment in competitive equilibrium: the optimality of myopic behavior", Quarterly Journal of Economics 108:1105-1133.
Meyer, J.R., and E. Kuh (1957), The Investment Decision: An Empirical Study (Harvard University Press, Cambridge, MA).
Nickell, S.J. (1978), The Investment Decisions of Firms (Cambridge University Press, Oxford).
Oliner, S.D., G.D. Rudebusch and D. Sichel (1995), "New and old models of business investment: a comparison of forecasting performance", Journal of Money, Credit and Banking 27:806-826.
Pindyck, R.S. (1988), "Irreversible investment, capacity choice, and the value of the firm", American Economic Review 78(5, December):969-985.
Ramey, G., and J. Watson (1996), "Contractual fragility, job destruction and business cycles", mimeograph (University of California at San Diego, June).
Rivers, D., and Q.H. Vuong (1991), "Model selection tests for nonlinear dynamic models", Working paper No. 91-08 (University of Toulouse).
Rotemberg, J.J. (1987), "The new Keynesian microfoundations", in: O.J. Blanchard and S. Fischer, eds., NBER Macroeconomics Annual 1987 (MIT Press, Cambridge, MA) 69-104.
Rothschild, M. (1971), "On the cost of adjustment", Quarterly Journal of Economics 85(November):605-622.
Salter, W.E.G. (1960), Productivity and Technical Change (Cambridge University Press, Cambridge).
Shapiro, M.D. (1986), "Investment, output, and the cost of capital", Brookings Papers on Economic Activity 1986(1):111-152.
Simons, H.C. (1944), "Some reflections on syndicalism", Journal of Political Economy 52:1-25.
Stock, J.H., and M.W. Watson (1993), "A simple MLE of cointegrating vectors in higher order integrated systems", Econometrica 61(4, July):783-820.
Tinbergen, J. (1939), "A method and its application to investment activity", in: Statistical Testing of Business Cycle Theories, vol. 1 (Economic Intelligence Service, Agathon Press, New York).
Tobin, J. (1969), "A general equilibrium approach to monetary theory", Journal of Money, Credit and Banking 1:15-29.
Williamson, O.E. (1979), "Transaction-cost economics: the governance of contractual relations", Journal of Law and Economics 22(2, October):233-261.
Williamson, O.E. (1985), The Economic Institutions of Capitalism (Free Press, New York).
Chapter 13
INVENTORIES*

VALERIE A. RAMEY
University of California - San Diego

KENNETH D. WEST
University of Wisconsin

Contents

Abstract 864
Keywords 864
Introduction 865
1. Sectoral and secular behavior of inventories 868
2. Two stylized facts about inventory behavior 872
  2.1. Procyclical inventory movements 873
    2.1.1. Illustrative evidence 873
    2.1.2. A survey of results 875
  2.2. Persistent movements in the inventory-sales relationship 877
    2.2.1. Illustrative evidence 877
    2.2.2. A survey of results 880
3. Linear quadratic models 882
  3.1. Introduction 882
  3.2. A model 882
  3.3. A first-order condition 885
  3.4. Whose inventories? 887
4. Decision rule 887
  4.1. Introduction 887
  4.2. Derivation of decision rule 888
  4.3. Persistence in the inventory-sales relationship 891
  4.4. Summary on persistence in the inventory-sales relationship 892
5. The flexible accelerator model 893
6. Dynamic responses 894
7. Empirical evidence 902
  7.1. Introduction 902
  7.2. Magnitude of cost parameters 903
  7.3. Shocks 906
  7.4. Interpretation 906
8. Directions for future research 909
  8.1. Introduction 909
  8.2. Inventories in production and revenue functions 909
  8.3. Models with fixed costs 910
  8.4. The value of more data 911
9. Conclusions 912
Appendix A. Data Appendix 913
Appendix B. Technical Appendix 914
  B.1. Solution of the model 914
  B.2. Computation of E(Q2 - S2) 919
  B.3. Estimation of e 919
  B.4. Social planning derivation of the model's first-order conditions 919
References 920

* We thank the National Science Foundation and the Abe Foundation for financial support; Clive Granger, Donald Hester, James Kahn, Anil Kashyap, Linda Kale, Spencer Krane, Scott Schuh, Michael Woodford and a seminar audience at the University of Wisconsin for helpful comments and discussions; and James Hueng and especially Stanislav Anatolyev for excellent research assistance. Email to: kdwest@facstaff.wisc.edu; vramey@weber.ucsd.edu.

Handbook of Macroeconomics, Volume 1, Edited by J.B. Taylor and M. Woodford
© 1999 Elsevier Science B.V. All rights reserved
863
Abstract
We review and interpret recent work on inventories, emphasizing empirical and business cycle aspects. We begin by documenting two empirical regularities about inventories. The first is the well-known one that inventories move procyclically. The second is that inventory movements are quite persistent, even conditional on sales. To consider explanations for the two facts, we present a linear-quadratic model. The model can rationalize the two facts in a number of ways, but two stylized explanations have the virtue of relative simplicity and support from a number of papers. Both assume that there are persistent shocks to demand for the good in question, and that marginal production cost slopes up. The first explanation assumes as well that there are highly persistent shocks to the cost of production. The second assumes that there are strong costs of adjusting production and a strong accelerator motive. Research to date, however, has not reached a consensus on whether one of these two, or some third, alternative provides a satisfactory explanation of inventory behavior. We suggest several directions for future research that promise to improve our understanding of inventory behavior and thus of business cycles.

Keywords
JEL classification: E22, E32
Ch. 13:
Inventories
865
Introduction
In developed countries, inventory investment typically averages less than one-half of one percent of GDP, whereas fixed investment averages 15% of GDP and consumption two-thirds. Perhaps with these fractions in mind, macroeconomists have concentrated more on the study of consumption and fixed investment than on inventories. Inventories generally do not appear as separate variables in dynamic general equilibrium models, nor in exactly identified vector autoregressive models. It has long been known, however, that other ways of measuring the importance of inventories suggest that inventories should receive more attention, especially in business cycle research. Half a century ago, Abramowitz (1950) established that US recessions prior to World War II tended to be periods of inventory liquidations. Recent experience in the G7 countries indicates this regularity continues to hold, and not just for the USA. In six of the seven G7 countries (Japan is the exception), real GDP fell in at least one recent year. Line 2 of Table 1 shows that in five of those six countries (the United Kingdom is now the exception), inventory investment also declined during the period of declining GDP, accounting in an arithmetical sense for anywhere from 12% to 71% of the fall in GDP. And Table 1's use of annual data may understate the inventory contribution: Table 2 indicates that for quarterly US data, the share is 49% rather than 12% for the 1990-1991 recession, with 49% a typical figure for a post-war US recession. Such arithmetical accounting of course does not imply a causal relationship. But it does suggest that inventory movements contain valuable information about cyclical fluctuations. In this chapter, we survey and interpret recent research on inventories, emphasizing empirical and business cycle aspects. Among other points, we hope to convince the reader that inventories are a useful resource in business cycle analysis.
They may be effective in identifying both the mechanisms of business cycle propagation and the sources of business cycle shocks. Our chapter begins by documenting two facts about inventories. The first is the well-known one that inventories move procyclically. They tend to be built up in expansions, drawn down in contractions. The second, and not as widely appreciated, fact is that inventory movements are quite persistent, even conditional on sales. In many data sets, inventories and sales do not appear to be cointegrated, and the first-order autocorrelations of supposedly stationary linear combinations of inventories and sales are often around 0.9, even in annual data. To consider explanations for the two facts, we use a linear quadratic/flexible accelerator model, which is the workhorse for empirical research on inventories. In our model, one source of persistence is from shocks to demand for the good being put in inventory - "demand" shocks. ("Demand" is in quotes because we, and the literature more generally, do not attempt to trace the ultimate source of such shocks; for example, for an intermediate good, the shocks might be driven mainly by shocks to the technology of the industry that uses the good in production.) But even if this shock has a unit root, our model yields a stationary linear combination of inventories
Table 1
Arithmetical importance of inventory change in recessions of the 1990s (annual data) a

Country                         Canada  France  Italy  West Germany  Japan  UK    USA
(1) Peak year                   1989    1992    1992   1992          n.a.   1990  1990
    Trough year b               1991    1993    1993   1993          n.a.   1992  1991
(2) Peak-to-trough change in
    inventory change as
    percentage of peak-to-
    trough fall in GDP c        50      71      19     30            n.a.   -0.   12

a The figures are based on annual real data. The inventory change series is computed by deflating the annual nominal change in inventories in the National Income and Product Accounts by the GDP deflator; see the Data Appendix.
b The trough year was found by working backwards from the present to the last year of negative real GDP growth in the 1990s. There were no such years in Japan. The peak year is the last preceding year of positive real GDP growth.
c Computed by multiplying the following ratio by 100:
  (inventory change in trough year - inventory change in peak year) / (GDP in trough year - GDP in peak year)
By construction, the denominator of this ratio is negative. A positive entry indicates that the numerator (the change in the inventory change) was also negative. The negative entry for the United Kingdom indicates that the change in the inventory change was positive.
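The arithmetic in note c can be sketched in a few lines of code. The function mirrors the formula above; the numbers in the example are made up for illustration and are not the series underlying the table.

```python
def inventory_share_of_decline(di_peak, di_trough, gdp_peak, gdp_trough):
    """Note c of Table 1: 100 * (change in the inventory change) divided by
    (change in GDP), peak to trough. In a recession the denominator is
    negative, so a positive result means inventory investment fell too."""
    return 100.0 * (di_trough - di_peak) / (gdp_trough - gdp_peak)

# Hypothetical economy: GDP falls from 1000 to 980 while inventory
# investment swings from +5 to -5, so the inventory swing "accounts for"
# half of the fall in GDP.
print(inventory_share_of_decline(5.0, -5.0, 1000.0, 980.0))  # 50.0

# If inventory investment had instead risen while GDP fell (as for the
# UK entry in Table 1), the computed share would be negative:
print(inventory_share_of_decline(5.0, 6.0, 1000.0, 980.0))   # -5.0
```

As the text stresses, this is purely an accounting decomposition, not a causal statement.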
Table 2
Arithmetical importance of inventory changes in post-war US recessions (quarterly data) a

Peak quarter-trough quarter    Peak-to-trough inventory change as a percentage
                               of peak-to-trough fall in GDP
1948:4-1949:2                  130
1953:2-1954:2                  41
1957:1-1958:1                  21
1960:1-1960:4                  122
1969:3-1970:1                  127
1973:4-1975:1                  59
1980:1-1980:3                  45
1981:3-1982:3                  29
1990:2-1991:1 b                49

a The figures are based on quarterly real data. See the notes to Table 1 for additional discussion.
b The figure for the 1990-1991 recession differs from that for the USA in Table 1 mainly because quarterly data were used. It also differs because in this table the inventory change is measured in chain-weighted 1992 dollars, whereas Table 1 uses the nominal inventory change deflated by the GDP deflator.
and sales. This stationary linear combination can be considered a linear version of the inventory-sales ratio. We call it the inventory-sales relationship. And our second inventory fact is that there is persistence in this relationship. While the model is rich enough that there are many ways to make it explain the two facts, we focus on two stylized explanations that have the virtue of relative simplicity, as well as empirical support from a number of papers. Both explanations assume an upward-sloping marginal production cost (a convex cost function). The first explanation also assumes that fluctuations are substantially affected by highly persistent shocks to the cost of production. Cost shocks will cause procyclical movement because times of low cost are good times to produce and build up inventory, and conversely for times of high cost. As well, when these shocks are highly persistent, a cost shock that perturbs the inventory-sales relationship will take many periods to die off, and its persistence will be transmitted to the inventory-sales relationship. The second explanation assumes that there are strong costs of adjusting production and a strong accelerator motive. The accelerator motive links today's inventories to tomorrow's expected sales, perhaps because of concerns about stockouts. Since sales are positively serially correlated, this will tend to cause inventories to grow and shrink with sales and the cycle, a point first recognized by Metzler (1941). As well, with strong costs of adjusting production, if a shock perturbs the inventory-sales relationship, return to equilibrium will be slow because firms will adjust production only very gradually. Both explanations have some empirical support. But as is often the case in empirical work, the evidence is mixed and ambiguous.
For example, the cost shock explanation works best when the shocks are modelled as unobservable; observable cost shifters, such as real wages and interest rates, seem not to affect inventories. And the literature is not unanimous on the magnitude of adjustment costs. While the literature has not reached a consensus, it has identified mechanisms and forces that can explain basic characteristics of inventory behavior and thus of the business cycle. We are optimistic that progress can continue to be made by building on results to date. Suggested directions for future research include alternative ways of capturing the revenue effects of inventories (replacements for the accelerator), alternative cost structures and the use of price and disaggregate data. The chapter is organized as follows. Section 1 presents some overview information on the level and distribution of inventories, using data from the G7 countries, and focussing on the USA. We supply this information largely for completeness and to provide a frame of reference; the results in this section are referenced only briefly in the sequel. Section 2 introduces the main theme of our chapter (business cycle behavior of inventories) by discussing empirical evidence on our two facts about inventories. Procyclical movement is considered in Section 2.1, persistence in the inventory-sales relationship in Section 2.2. In these sections, we use annual data from the G7 countries and quarterly US data for illustration, and also summarize results from the literature.
Sections 3-7 develop and apply our linear quadratic/flexible accelerator model. Sections 3-5 present the model. Much of the analysis in these three sections relates to the process followed by the inventory-sales relationship, because this process has not received much direct attention in existing literature. The discussion focuses on analytical derivations, for the most part deferring intuitive discussion about how the model works to Section 6. That section aims to develop intuition by presenting impulse responses for various configurations of parameters. Section 7 reviews empirical evidence from studies using the model. In Section 8, we discuss extensions and alternative approaches, including models that put inventories directly in production and profit functions, models with fixed costs, and the use of different data. Section 9 concludes. A Data Appendix describes data sources, and a Technical Appendix contains some technical details.
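Before turning to the data, the second stylized explanation above (an accelerator motive plus costly production adjustment) can be illustrated with a minimal simulation. This is only a sketch: the parameter values below are arbitrary round numbers, not estimates from the literature, and the partial-adjustment rule is a stand-in for the full linear quadratic model developed in Sections 3-5. The point is that the mechanism delivers both stylized facts: inventories comove with sales, and the inventory-sales relationship is persistent.

```python
import random

random.seed(0)

# Sales follow a persistent AR(1); the inventory target is proportional to
# expected next-period sales (the accelerator motive); adjustment costs mean
# only a fraction lam of the gap is closed each period. All values illustrative.
rho, b, lam, T = 0.9, 0.5, 0.2, 20000

s = n = 0.0
sales, stocks = [], []
for _ in range(T):
    s = rho * s + random.gauss(0.0, 1.0)  # "demand" shock
    target = b * rho * s                  # b * E_t[S_{t+1}], since E_t[S_{t+1}] = rho*s
    n += lam * (target - n)               # partial adjustment of inventories
    sales.append(s)
    stocks.append(n)

def corr(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (c - my) for a, c in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((c - my) ** 2 for c in y)
    return cov / (vx * vy) ** 0.5

# Fact 1: procyclical inventories (positive comovement with sales).
print(corr(stocks, sales) > 0)  # True

# Fact 2: the inventory-sales relationship n - b*s is persistent.
gap = [n_ - b * s_ for n_, s_ in zip(stocks, sales)]
print(corr(gap[1:], gap[:-1]) > 0.5)  # True: lag-1 autocorrelation well above zero
```

Slower adjustment (smaller `lam`) or more persistent sales (larger `rho`) raises the persistence of the gap, which is the qualitative pattern the chapter emphasizes.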
1. Sectoral and secular behavior of inventories
In this section we use basic national income and product account data from the G7 countries, and some detailed additional data from the USA, to provide a frame of reference for the discussion to come. As just noted, for the most part this is background information that will not loom large in the sequel.

Lines 1(a) and 1(b) of Table 3 present the mean and standard deviation of the real annual change in economy-wide inventory stocks in the G7 countries, over the last 40 years. These were computed from the national income and product account data on the change in aggregate inventories.

Table 3
Basic inventory statistics

                               Canada  France  West Germany  Italy  Japan  UK    USA
(1) Annual NIPA change in
    inventories, 1956-1995 a,b
    (a) Mean                   2.32    37.4    12.3          12.3   2.41   1.81  23.6
    (b) Standard deviation     3.91    40.1    12.7          9.8    1.44   3.04  21.6
(2) Reference: 1995 GDP c      721     6882    2608          1351   453    584   6066
(3) 1995 Inventory level d     131     n.a.    411           n.a.   71     104   971

a The inventory change series is computed by deflating the annual nominal change in inventories in the National Income and Product accounts by the GDP deflator; see the Data Appendix. Units for all entries are billions (trillions, for Italy and Japan) of units of own currency, in 1990 prices.
b Sample periods are 1957-1994 for West Germany and 1960-1994 for Italy, not 1956-1995.
c GDP entries for Italy and Germany are for 1994, not 1995.
d The "level" entries for Canada, West Germany, Japan and the UK are computed by deflating the nominal end-of-year value by the GDP deflator; see the Data Appendix. The entry for the US is the Department of Commerce constant (chained 1992) dollar value for non-farm inventories, rescaled to a 1990 from a 1992 base with the GDP deflator.
Table 4
Sectoral distribution of US non-farm inventories a,b

                     (1) Percent of       (2) Mean (s.d.)  (3) Mean (s.d.)
                     total level, 1995    of change        of growth
Total                100                  21.4 (22.5)      3.5 (3.5)
Manufacturing        37                   7.0 (11.6)       2.8 (4.2)
  Finished goods     13                   2.5 (4.4)        3.0 (4.8)
  Work in process    12                   2.3 (5.9)        2.8 (6.0)
  Raw materials      12                   2.2 (5.4)        2.6 (6.2)
Trade                52                   12.2 (13.4)      4.4 (4.5)
  Retail             26                   5.9 (10.3)       4.2 (6.7)
  Wholesale          26                   6.2 (7.3)        4.5 (4.8)
Other                11                   2.2 (5.1)        3.1 (5.8)

a Data are in billions of chained 1992 dollars, 1959:I-1996:IV.
b The inventory change differs from the US data on changes in Tables 1-3 in coverage (Tables 1-3 include changes in farm inventories), in sample period (1959-1996 here, 1956-1995 in Table 3) and in base year (1992 here and Table 2, 1990 in Tables 1 and 3).
the change in aggregate inventories. See the notes to the table and the Data Appendix for details. Upon comparing line 1(a) to line 2, we see that in all seven countries the average change in inventories is small: about one percent of recent GDP in Italy, and well less than that in the other countries. Inventory changes are, however, reasonably volatile, with the standard deviation roughly as large as the mean in all seven countries. We have less complete data on the level (as opposed to the change) of inventory stocks. Line 3 of Table 3 indicates that in the countries for which we have been able to obtain data, total inventories are about one-sixth of GDP. This implies a monthly inventory-sales ratio of about 2, a value that will be familiar to users of monthly US data.

Table 4 has a breakdown of US non-farm inventories by sector. We see in column 1 that about half of non-farm inventories are held by retailers and wholesalers (including non-merchant wholesalers who are associated with particular manufacturers), whereas somewhat over a third are held by manufacturers. The remaining "other" category reflects holdings by a number of industries, including utilities, construction, and service companies. Like the aggregates in Table 3, investment in each of the components is positive on average, and has standard deviations about the same size as means. This applies whether one looks at arithmetic changes (column 2) or growth rates (column 3). For future reference, it will be useful to note that manufacturers' inventories of finished goods, which have received a fair amount of attention in the inventory literature, are only 13% of total inventories, and are not particularly volatile.

V.A. Ramey and K.D. West

Fig. 1. Quarterly ratio of nonfarm inventories to final sales. [Solid line: nominal (current dollar) ratio; dashed line: real ratio; 1947-1995.]

Figure 1 plots the ratio of total non-farm inventories to final sales of domestic product. The dashed line uses real data (the ratio of real inventories to real sales), the solid line nominal data. In the real data, the inventory series matches that in line 1 of Table 4, but over the longer sample 1947:I-1996:IV. (Table 4 uses the 1959-1996 subsample because the disaggregate breakdown is not available for 1947-1958.) The real ratio shows a run-up in the late 1960s and early 1970s, followed by a period of slight secular decline. At present, the ratio is modestly above its value at the start of our sample (0.63 vs. 0.56). It will be useful to note another fact for future reference. The figure suggests considerable persistence in the inventory-sales ratio, an impression borne out by estimates of first-order autocorrelations. These are 0.98 for the sample as a whole, and 0.93 if the autocorrelation is computed allowing for a different mean inventory-sales ratio in the 1947:I-1973:IV and 1974:I-1996:IV subsamples.
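The mean-break autocorrelation just described can be sketched as follows. The series here is illustrative (a noisy ratio with a 1974-style level shift), not the chapter's data; the point is the mechanics of demeaning by subsample before computing the first-order autocorrelation:

```python
import random
import statistics

def lag1_autocorr(x, break_at=None):
    # Demean with a single mean, or with separate means before/after
    # break_at (the "different mean inventory-sales ratio" adjustment).
    if break_at is None:
        means = [statistics.mean(x)] * len(x)
    else:
        m1 = statistics.mean(x[:break_at])
        m2 = statistics.mean(x[break_at:])
        means = [m1 if i < break_at else m2 for i in range(len(x))]
    d = [v - m for v, m in zip(x, means)]
    return sum(d[t] * d[t - 1] for t in range(1, len(d))) / sum(v * v for v in d)

random.seed(0)
# Hypothetical quarterly "ratio": 0.56 before a level shift, 0.70 after.
ratio = [0.56 + random.gauss(0, 0.01) for _ in range(108)] + \
        [0.70 + random.gauss(0, 0.01) for _ in range(92)]

print(round(lag1_autocorr(ratio), 2))                # high: shift reads as persistence
print(round(lag1_autocorr(ratio, break_at=108), 2))  # near 0 once means differ by subsample
```

In the chapter's data the root stays high (0.93) even with the break, so the persistence there is genuine rather than an artifact of a one-time shift in the mean.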
Fig. 2. Quarterly inventories and sales, 1947:1-1996:4, in billions of chained 1992 dollars. [Scatterplot of non-farm inventories against final sales of domestic business.]
Readers familiar with the monthly inventory-sales ratios commonly reported in the US business press may be surprised at the absence of a downward secular movement. Such monthly ratios typically rely on nominal data. The solid line in Figure 1 shows that the ratio of nominal non-farm inventories to nominal sales of domestic product indeed shows a secular decline. Evidently, the implied deflator for inventories has not been rising as fast as that for final sales. We do not attempt to explain the differences between the nominal and real series. We do note, however, that the nominal ratio shows persistence comparable to that of the real ratio: the estimate of the first-order autocorrelation of the ratio is 0.97 whether or not we allow a different mean inventory-sales ratio for the 1947:I-1973:IV and 1974:I-1996:IV subsamples.

To return to the secular behavior of the real series: we see from column 3 in Table 4 that the rough constancy of the overall ratio hides some heterogeneity in underlying components. In particular, raw materials, and to a lesser extent work in process, have been growing more slowly than the aggregate, implying a declining ratio to final sales. This fact was earlier documented by Hester (1994), who noted that possible explanations include just-in-time inventory management, outsourcing of early stages of manufacturing to foreign countries, and a transitory response to transitory movements in costs. In the sequel we do not attempt to explain secular patterns in inventory-sales ratios; see Hester (1994) for a discussion of US data, for retail as well as manufacturing, and West (1992a) and Allen (1995) for discussions of Japanese data. Instead we hope that the reader will take away from these tables the message that inventories and sales are positively related in the long run: they tend to rise together. This is illustrated quite
strikingly in Figure 2, which is a scatterplot of the inventory and sales data. A second message in the tables and the autocorrelations reported above is that while inventory movements are small relative to GDP, they are volatile and persistent. Characterizing and explaining the stochastic, and especially business cycle, behavior of inventories is the subject of the rest of this chapter.
2. Two stylized facts about inventory behavior
Our essay focuses on the business cycle aspects of inventory behavior, and is oriented around two stylized facts: (1) inventory movements are procyclical; (2) the inventory-sales relationship is highly persistent (the "inventory-sales relationship" is our term for a linear version of the inventory-sales ratio). These facts serve two purposes. First, they demonstrate the potential role of inventories in understanding economic fluctuations. Second, they serve as a measure by which we judge inventory models and, more generally, theories of the business cycle.

For each of the two "facts", we present illustrative evidence from annual, post-World War II data for the G7 countries, as well as from quarterly post-War US data. We then review estimates from the literature. For the first of our stylized facts (procyclical movements), Section 2.1.1 below presents estimates and Section 2.1.2 presents the review. Sections 2.2.1 and 2.2.2 do the same for the second of our facts (persistence in the inventory-sales relationship). The remainder of this introductory subsection describes the data used in both 2.1 and 2.2.

For the G7 countries, we continue to use the aggregate (nation-wide) change in inventory stocks used in previous sections, and construct a time series of inventory levels by summing the change 1. We measure production as GDP and sales as final sales. The quarterly US inventory data are those used in the previous section: total non-farm inventories and final sales of domestic product in chained 1992 dollars, with sales measured at quarterly rates 2.
1 When we summed the ΔH_t series, we initialized with H_0 = 0. Given the linearity of our procedures, the results would be identical if we instead used the ΔH_t series to work forwards and backwards from the 1995 levels reported in Table 3. The reader should be aware that when prices are not constant, a series constructed by our procedure of summing changes typically will differ from one that values the entire level of the stock at current prices. Those with access to US sources can get a feel for the differences by comparing the inventory change that figures into GDP (used in the G7 data, and in NIPA Tables 5.10 and 5.11) and the one implied by differencing the series for the level of the stock (used in our quarterly US data and NIPA Tables 5.12 and 5.13).
2 We repeated some of our quarterly calculations using final sales of goods and structures, which differs from total final sales because it excludes final sales of services. There were no substantive changes in results.
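The level-construction procedure of footnote 1 can be sketched as follows; the changes and the terminal level below are hypothetical stand-ins:

```python
# Build a level series by cumulating annual changes with H_0 = 0.
# Because the procedure is linear, re-basing to a known terminal
# level is just a matter of adding a constant.

def cumulate(changes, h0=0.0):
    levels, h = [], h0
    for dh in changes:
        h += dh
        levels.append(h)
    return levels

dH = [2.0, -1.0, 3.0, 0.5]          # hypothetical annual inventory changes
levels = cumulate(dH)                # initialized at H_0 = 0
print(levels)                        # [2.0, 1.0, 4.0, 4.5]

# Working backwards from a known terminal level instead
terminal = 131.0                     # e.g., the Canadian 1995 level in Table 3
shift = terminal - levels[-1]
rebased = [h + shift for h in levels]
print(rebased[-1])                   # 131.0
```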
All of these measures are linked by the identity production = sales + inventory investment, or

Q_t = S_t + ΔH_t,   (2.1)

where Q_t is production, S_t is sales, and H_t is end-of-period t inventories. This relationship holds by construction, with S_t being final sales.
2.1. Procyclical inventory movements

2.1.1. Illustrative evidence

Procyclicality of inventory movements can be documented in several ways. A simple indication that inventories move procyclically is a positive correlation between inventory investment and final sales. Consider the evidence in Table 5. In column 1 we see that all the point estimates of the correlation are positive, with a typical value being 0.1-0.2.

The correlation between sales and inventory investment is related to the relative variances of production and sales. As in Table 5, let "var" denote variance and "cov" covariance. Since (2.1) implies var(Q) = var(S) + var(ΔH) + 2 cov(S, ΔH), it follows from the positive correlation in column 1 that var(Q) > var(S) (column 2). Other indications of procyclical behavior include two variance ratios robust to the possible presence of unit autoregressive roots. The column 3 estimates indicate that var(ΔQ_t)/var(ΔS_t) > 1, the column 4 estimates that E(Q_t² − S_t²) > 0. [E(Q_t² − S_t²) is essentially an estimate of var(Q) − var(S) robust to the presence of unit autoregressive roots; see the Technical Appendix.]

To illustrate the pattern of correlation over different short-run horizons, we present impulse response functions. The responses are based on a bivariate VAR in the levels of inventories and sales for the quarterly US data, including eight quarterly lags, a time trend, and breaks in the constant and trend at 1974. In accordance with this section's aim of presenting relatively unstructured evidence, we present responses to a one standard deviation shock to the VAR disturbances themselves, and not to orthogonalized shocks. Figure 3 shows the responses of inventories and sales to a disturbance to the sales equation, Figure 4 the responses to a disturbance to the inventory equation. To prevent confusion, we note that on the horizontal axis we plot the pre-shock (period −1) values of the variables; the shock occurs in period 0.
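The variance decomposition implied by identity (2.1) can be verified numerically. The series below are simulated stand-ins with a built-in positive correlation between sales and inventory investment, not the Table 5 data:

```python
import random

random.seed(1)
n = 4000
mean = lambda x: sum(x) / len(x)

def var(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def cov(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Detrended "sales" and "inventory investment" with cov(S, dH) > 0
S  = [random.gauss(0, 1) for _ in range(n)]
dH = [0.3 * s + random.gauss(0, 0.5) for s in S]
Q  = [s + d for s, d in zip(S, dH)]      # the identity Q = S + dH

# var(Q) = var(S) + var(dH) + 2 cov(S, dH) holds exactly, so a positive
# covariance forces var(Q) > var(S): output varies more than sales.
assert abs(var(Q) - (var(S) + var(dH) + 2 * cov(S, dH))) < 1e-6
print(var(Q) / var(S))                   # a ratio above 1, as in column 2
```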
Figures 3 and 4 both show a positive comovement of inventories and sales. In Figure 3, by construction the contemporaneous (period 0) response of inventories is zero. But the (approximately) 7 billion dollar rise in sales in period 0 is followed in the next quarter by a 1.5 billion dollar increase in inventories. Inventories continue to rise for the next five quarters, even after sales turn down. Both series then smoothly decline together. Figure 4 shows that after a 3 billion dollar shock to inventories, sales rise by nearly 2 billion dollars. Both inventories and sales subsequently show some wiggles.
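A stripped-down version of the impulse-response experiment can be sketched with a bivariate VAR(1). The chapter's VAR has eight lags, a trend and 1974 breaks; the coefficient matrix here is hypothetical, chosen only to display the delayed inventory response to a sales shock:

```python
A = [[0.90, 0.10],    # H_t = 0.90 H_{t-1} + 0.10 S_{t-1} + e_Ht
     [0.05, 0.80]]    # S_t = 0.05 H_{t-1} + 0.80 S_{t-1} + e_St

def impulse_response(A, shock, horizon=12):
    """Responses to a one-time shock to the VAR disturbances themselves
    (no orthogonalization), starting from the zero steady state."""
    path, x = [], shock[:]
    for _ in range(horizon):
        path.append(x[:])
        x = [A[0][0] * x[0] + A[0][1] * x[1],
             A[1][0] * x[0] + A[1][1] * x[1]]
    return path

# Shock the sales equation only: inventories respond with a one-period
# lag and then build up -- the comovement pattern described for Figure 3.
path = impulse_response(A, shock=[0.0, 1.0])
print([round(h, 3) for h, s in path][:4])   # H response: 0 on impact, then positive
```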
Table 5
Relative variability of output and final sales a-e

Country        Period            (1)           (2)            (3)               (4)
                                 corr(S, ΔH)   var(Q)/var(S)  var(ΔQ)/var(ΔS)   1 + [E(Q² − S²)/var(ΔS)]
Canada         1956-1995         0.14          1.16           1.53              1.41
               1974-1995         0.17          1.21           1.55              1.24
France         1956-1995         0.17          1.36           1.65              1.68
               1974-1995         0.32          1.63           2.09              1.41
West Germany   1957-1994         0.12          1.10           1.36              1.01
               1974-1994         0.13          1.08           1.27              1.03
Italy          1960-1994         0.13          1.30           1.81              1.12
               1974-1994         0.11          1.27           1.83              1.08
Japan          1956-1995         0.23          1.07           1.10              1.30
               1974-1995         0.51          1.15           1.08              1.12
UK             1956-1995         0.28          1.21           1.52              1.10
               1974-1995         0.26          1.17           1.38              1.04
USA            1956-1995         0.26          1.19           1.48              1.12
               1974-1995         0.25          1.21           1.50              0.98
USA            1947:I-1996:IV    0.30          1.26           1.39              1.41
               1974:I-1996:IV    0.14          1.13           1.40              1.48

a "var" denotes variance, "corr" correlation; Q = output, S = final sales, ΔH = change in inventories. The variables are linked by the identity Q = S + ΔH.
b In all but the last two rows, data are annual and real (1990 prices), with Q = real GDP, S = real final sales, ΔH = real change in aggregate inventories. In the last two rows the data are quarterly and real (1992 prices), with S = final sales of domestic business goods and structures, ΔH = change in non-farm inventories, and Q = S + ΔH. See the text and Data Appendix for sources.
c In columns 1 and 2, Q and S were linearly detrended, with the full sample estimates allowing a shift in the constant and trend term in 1974 (1974:I in the last two rows); ΔH was defined as the difference between detrended Q and S. In columns 3 and 4, ΔQ and ΔS were simply demeaned, again with the full sample estimates allowing a shift in the mean in 1974 (1974:I).
d In column 4, the term E(Q² − S²) essentially is the difference between the variance of Q and the variance of S, computed in a fashion that allows for unit autoregressive roots in Q and S. See the Technical Appendix for further details.
e The post-1973 sample, as well as the post-1973 shifts in the full sample estimates, were included to allow for the general slowdown in economic activity.
This shock appears to have more persistent effects than does the sales shock, with inventories still over 2 billion dollars above their initial level after six years. The important point is that both sets of impulse response functions offer the same picture of procyclical inventories as the statistics in Table 5. Thus, inventories seem to amplify rather than mute movements in production.
Fig. 3. Responses of inventories and sales to a disturbance to the sales equation.
... g → ∞ and specifying U_dt as exogenous; with g → ∞, S_t = U_dt. In practice, one needs to allow for serial correlation in U_dt. In principle one might want to rationalize such serial correlation with (say) costs of adjustment on the part of purchasers, or with observable shifters of the demand curve [West (1992b)]. But since the model focuses on production, and, moreover, is typically not used to study the effects of a hypothetical intervention or change in regime, taking such serial correlation as exogenous is a useful simplification that will be maintained here. Finally, Christiano and Eichenbaum (1989) and West (1990b), building on Sargent (1979, ch. XVI), derive the linear demand curve (4.1) in general equilibrium. Both papers assume a representative consumer whose per-period utility is quadratic in S_t and linear in leisure. The disturbance U_dt is a shock to the consumer's utility. There is no capital; the only means of storage is inventories. See the cited papers for detail.
14 To prevent confusion, we note that the first-order condition (3.3) also results if one assumes cost minimization. So if one aims to use condition (3.3) to estimate model parameters, one can motivate the equation by reference to cost minimization without taking a stand on how revenue is determined (apart from the caveat stated in Footnote 11).
Here, we do not derive Equation (4.1) in general equilibrium but take (4.1) as given. We do not attempt to trace the demand shock back to preferences or other primitive sources. We therefore caution the reader that despite the label "demand", U_dt should not be thought of as literally a nominal or monetary shock, since it (like all our variables) is real. As well, one can imagine scenarios in which U_dt reflects forces typically thought of as supply side. If the good in question is an intermediate one, for example, one can imagine that shocks to the technology of the industry that produces the good dominate the movement of U_dt.

Whatever the interpretation of U_dt, we derive an industry equilibrium assuming a representative firm. Even so, to obtain a decision rule, we must be specific about market structure and the structure of the demand and cost shocks. We assume here that the market is perfectly competitive, and normalize the number of firms to one. If the firm is a monopolist, the reduced form is identical, but with a certain parameter being the slope of the marginal revenue curve rather than the slope of the demand curve. We assume that W_t, U_dt and U_ct follow exogenous AR processes, possibly with unit autoregressive roots. By "exogenous" we mean "predictions of W_t, U_dt and U_ct conditional on lagged Ws, U_ds and U_cs are identical to those conditional on lagged Ws, U_ds, U_cs, and industry-wide Hs and Ss: W_t, U_dt and U_ct are not Granger-caused by industry-wide H_t or S_t". (Of course in general equilibrium, such exogeneity of W_t is doubtful.) Finally, for notational simplicity, and to make contact with the literature on the "speed of adjustment" (see the next section), we tentatively assume that

a_0 = 0.   (4.2)
This assumption is arguably not a good one empirically, and we will relax it below.

Under the assumption of perfect competition, there are two equivalent methods for deriving the decision rules for inventories and sales. The first method, which studies the decentralized optimization problem, derives the individual firm's first-order conditions and then incorporates those into the industry equilibrium. The second method, which uses a social planning approach, derives the first-order conditions for the social planner problem and obtains decision rules from those. Both methods yield identical answers. We exposit here the decentralized method, and present in the Technical Appendix the social planner approach.

With a_0 = 0, the first-order condition for sales S_t for the representative firm is

P_t = a_1 Q_t + U_ct − a_2 a_3 (H_{t-1} − a_3 S_t).   (4.3)

In the absence of inventories, this would simply tell our competitive firm to set marginal revenue P_t equal to marginal cost a_1 Q_t + U_ct. The additional term a_2 a_3 (H_{t-1} − a_3 S_t) is the effect on inventory holding costs of an additional unit produced for sale. Upon using P_t = −g S_t + U_dt in Equation (4.3) and Q_t = S_t + ΔH_t in Equation (4.3) and the inventory first-order condition (3.3), we obtain a pair of linear stochastic
difference equations in H_t and S_t. This two-equation system is solved in the Technical Appendix. The resulting decision rule is

H_t = π_H H_{t-1} + distributed lag on U_ct, U_dt and W_t,   (4.4a)
S_t = π_S H_{t-1} + a different distributed lag on U_ct, U_dt and W_t.   (4.4b)
In (4.4a), π_H is the root of a certain quadratic equation, with |π_H| < 1. Both π_H and π_S depend on b, g, a_1, a_2 and a_3. (Note two differences from the relatively well-understood case of exogenous sales. Even when a_1, a_2 > 0, if g < ∞: (a) it is in principle possible to have π_H < 0, and (b) the accelerator coefficient a_3 affects π_H.) The distributed lag coefficients on U_ct, U_dt and W_t depend on b, g, a_1, a_2 and a_3 as well as the autoregressive parameters governing the evolution of U_ct, U_dt and W_t. In the empirically relevant case of π_H > 0, π_H increases with marginal production costs a_1 and decreases with marginal inventory holding costs a_2. The signs of ∂π_H/∂a_3 and ∂π_H/∂g are ambiguous.

The solution when revenue is exogenous (g → ∞) is obtained by replacing U_dt with g·U_dt [see Equation (4.1)] and letting g → ∞. In this case, π_S = 0, S_t = U_dt, and the solution (4.4) may be written in the familiar form

H_t = π_H H_{t-1} + distributed lag on S_t and on measures of cost,   (4.5a)
S_t ~ exogenous autoregressive process.   (4.5b)
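The stable-root calculation for the exogenous-sales case with a_0 = 0 can be sketched numerically. The characteristic quadratic used below, b a_1 x² − [(1+b)a_1 + b a_2] x + a_1 = 0, is our reconstruction from a standard linear quadratic Euler equation (the chapter's condition (3.3) is not reproduced in this excerpt); the comparative statics it delivers nonetheless match those stated above:

```python
import math

def pi_H(b, a1, a2):
    # Stable root (|x| < 1) of b*a1*x^2 - ((1+b)*a1 + b*a2)*x + a1 = 0,
    # a reconstructed characteristic equation, not the chapter's own.
    A = b * a1
    B = -((1 + b) * a1 + b * a2)
    C = a1
    disc = math.sqrt(B * B - 4 * A * C)
    roots = [(-B + disc) / (2 * A), (-B - disc) / (2 * A)]
    return min(roots, key=abs)

b = 0.99
print(pi_H(b, a1=1.0, a2=1.0))   # baseline
print(pi_H(b, a1=5.0, a2=1.0))   # larger marginal production cost -> larger root
print(pi_H(b, a1=1.0, a2=5.0))   # larger holding cost -> smaller root
```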
On the other hand, when revenue is endogenous, π_S ≠ 0, and we see in Equation (4.4b) that inventories Granger-cause sales. The intuition is that forward-looking firms adjust inventories in part in response to expected future conditions. Thus industry-wide stocks signal future market conditions, including sales. This signalling ability is reflected in Equation (4.4b). These same results can be obtained directly from the social planner problem that maximizes consumer surplus plus producer surplus, which is equal to the area between the inverse demand and supply curves. See the Technical Appendix.

In empirical application, matching the data might require allowing shocks with rich dynamics. Such dynamics may even be required to identify all the parameters of the model. Blanchard (1983), for example, assumes that the demand shock follows an AR(4). For expositional ease, however, we assume through the remainder of this section that all exogenous variables U_dt, U_ct, W_t follow first-order autoregressive processes (possibly with unit roots). Specifically, assume that
[...] it follows from Equation (4.8) that

(1 − π_H L)(1 − ψ_c L)(H_t − H*_t) = v_t,
v_t = m_0c e_ct + m_1c e_ct-1 + m_0d e_dt − ψ_c m_0d e_dt-1 ~ MA(1).   (4.10)

Thus, H_t − H*_t ~ ARMA(2, 1), with autoregressive roots π_H and ψ_c. (This presumes that the moving average root in v_t does not cancel an autoregressive root in H_t − H*_t, which generally will not happen.) Note that the innovation e_dt, rather than the shock U_dt, appears in Equation (4.9) and thus in Equation (4.10). With ψ_d ≠ 1, however, the right-hand side of Equation (4.10) would include a linear combination of U_dt and U_dt-1 that would not reduce to a linear function of e_dt, and ψ_d would also be one of the autoregressive roots of H_t − H*_t. In this case, if ψ_d ≈ 1, then H_t − H*_t would also have a moving average root that would approximately cancel the autoregressive root of ψ_d.

Similarly, when there are observable cost shifters (a ≠ 0), it may be shown that Equations (4.6) and (4.7) imply

H_t − H*_t = H_t − θS_t − a'W_t = π_H (H_{t-1} − θS_{t-1} − a'W_{t-1}) + disturbance,
disturbance = m'_0w e_wt + m_0c U_ct + m_1c U_ct-1 + m_0d e_dt.   (4.11)

Once again, persistence in H_t − H*_t is induced by π_H and ψ_c.
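The persistence claim can be illustrated by simulating an ARMA(2,1) process with the stated AR roots; all parameter values below are hypothetical:

```python
import random

random.seed(2)
pi_H, psi_c = 0.6, 0.7    # hypothetical AR roots of H_t - H*_t
theta_ma = 0.3            # hypothetical MA(1) coefficient of v_t

# (1 - pi_H L)(1 - psi_c L) x_t = e_t + theta_ma e_{t-1}, i.e.
# x_t = (pi_H + psi_c) x_{t-1} - pi_H psi_c x_{t-2} + e_t + theta_ma e_{t-1}
x, e_prev = [0.0, 0.0], 0.0
for _ in range(20000):
    e = random.gauss(0, 1)
    x.append((pi_H + psi_c) * x[-1] - pi_H * psi_c * x[-2] + e + theta_ma * e_prev)
    e_prev = e
x = x[2:]

m = sum(x) / len(x)
d = [v - m for v in x]
rho1 = sum(d[t] * d[t - 1] for t in range(1, len(d))) / sum(v * v for v in d)
print(round(rho1, 2))   # high first-order autocorrelation, though neither AR root is near one
```

Two moderate roots compound: the implied AR(2) polynomial has coefficients 1.3 and −0.42, which is why the measured first-order autocorrelation comes out well above either root alone.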
We close this subsection by re-introducing costs of adjusting production a_0. Suppose

a_0 ≠ 0.   (4.12)
It is well known that when revenue is exogenous (g → ∞), costs of adjusting production put additional persistence in inventories [Belsley (1969), Blanchard (1983)]: in this case Equation (4.5a) becomes

H_t = π_H1 H_{t-1} + π_H2 H_{t-2} + distributed lag on S_t and on measures of cost,   (4.13)

with π_H2 ≠ 0. Unsurprisingly, inventory decisions now depend on Q_{t-1} = S_{t-1} + H_{t-1} − H_{t-2} and thus on H_{t-2}, even after taking into account H_{t-1} and the sales process. As one might expect, the presence of costs of adjusting production has a similar effect even when sales and revenue are endogenous, and on the inventory-sales relationship as well as inventories. The Technical Appendix shows that a_0 ≠ 0 puts an additional autoregressive root in H_t − H*_t, which now follows an ARMA(3, 2) process. One autoregressive root is ψ_c. We let π_1 and π_2 denote the two additional (possibly complex) roots. These are functions of b, a_0, a_1, a_2, a_3 and g. Intuition, which is supported by the simulation results reported below, suggests that increases in a_0 increase the magnitude of these roots.

4.4. Summary on persistence in the inventory-sales relationship

We summarize the preceding subsection as follows: assume the shocks follow the AR(1) processes given in Equation (4.6), with the additional restriction (4.8) that the demand shock and observable cost shifters follow random walks. Then

a_0 = 0 ⇒ H_t − H*_t = H_t − θS_t − a'W_t ~ ARMA(2, 1), with AR roots π_H and ψ_c.   (4.14)
The root π_H is a function of b, g and the a_i, but not of the autoregressive parameters of the shocks, and is increasing in the marginal production cost a_1. In addition,

a_0 ≠ 0 ⇒ H_t − H*_t = H_t − θS_t − a'W_t ~ ARMA(3, 2), with AR roots π_1, π_2 and ψ_c; if ψ_c = 0, H_t − H*_t ~ ARMA(2, 1) with AR roots π_1 and π_2.   (4.15)
The roots π_1 and π_2 are functions of b, g and the a_i, but not of the autoregressive parameters of the shocks; both analytical manipulations of the formulas in the
Technical Appendix and simulations reported in Section 6 indicate that the modulus of the larger of the roots increases with a_0 and a_1 15. Thus the persistence documented in Section 2.2 above follows if there are sharply increasing production costs (a_0 and/or a_1 sufficiently large) and/or serially correlated cost shocks. In addition, it is important to observe that qualitatively similar reduced forms are implied by the following two scenarios: (1) serially correlated cost shocks with no costs of adjusting production, and (2) serially uncorrelated cost shocks and sharply increasing costs of adjusting production. We shall return to this point below. Of course persistence may also follow if we put different dynamics into the shocks W_t, U_ct and U_dt.
5. The flexible accelerator model
We now derive (4.10)-(4.11) from another optimization problem, the one that underlies empirical work motivated by the flexible accelerator model. In this model, pioneered by Lovell (1961), firms solve a static one-period problem, balancing costs of adjusting inventories against costs of having inventories deviate from their frictionless target level H*_t. Specifically, the firm chooses H_t to minimize
(1/2)(H_t − H*_t)² + (v/2)(H_t − H_{t-1})² + u_t H_t.   (5.1)

In (5.1), v > 0 is the weight of the second cost relative to the first, and u_t is an exogenous unobservable disturbance 16. The first-order condition is then

H_t − H_{t-1} = [1/(1 + v)](H*_t − H_{t-1}) − [1/(1 + v)] u_t.   (5.2)
The coefficient 1/(1 + v) is the fraction of the gap between target and initial inventories closed within a period. If v is big (the cost of adjusting inventories is big), the fraction of the gap expected to be closed is, on average, small. To make this equation operational, target inventories H*_t must be specified. Let

H*_t = θ E_{t-1}S_t + a'E_{t-1}W_t.   (5.3)

Here, W_t is a vector of observable cost shifters [as in Section 2.2.2 and Equation (3.2)]. The notation has been chosen because of the link about to be established with θ and a'W_t as defined earlier. Suppose

S_t = S_{t-1} + e_dt,  W_t = W_{t-1} + e_wt,  E_{t-1}e_dt = 0,  E_{t-1}e_wt = 0.   (5.4)

(In practice, E_{t-1}S_t is usually approximated as a linear function of a number of lags of S, the actual number dependent on the data, and similarly for W_t [e.g., Maccini and Rossana (1984)]. The single lag assumed here is again for simplicity.) Then with straightforward algebra, the first-order condition (5.2) implies

H_t − θS_t − a'W_t = π_H (H_{t-1} − θS_{t-1} − a'W_{t-1}) + disturbance,
π_H = v/(1 + v),  disturbance = [1/(1 + v)](θe_dt + a'e_wt − u_t),   (5.5)

which is in the same form as Equation (4.11). We have thus established that in the simple parameterization of this section, in which sales follow an exogenous random walk, high serial correlation in a stationary linear combination of inventories and sales is the same phenomenon as slow speed of adjustment of inventories towards a target level.

15 Under the present set of assumptions, then, the parameter called "ρ" in Section 2.2 is max{π_H, ψ_c} if a_0 = 0, and max{|π_1|, |π_2|, ψ_c} if a_0 ≠ 0.
16 H_t and S_t are sometimes measured in logs [e.g., Maccini and Rossana (1981, 1984)], and the variable u_t is sometimes split into a component linearly dependent on the period t surprise in sales and a component unobservable to the economist [e.g., Lovell (1961), Blinder (1986b)]. We slur over differences between regressions in levels and logs, which in practice are small (see Footnote 8), and omit a sales surprise term in the inventory regression, which in practice has little effect on the coefficients that are central to our discussion.

6. Dynamic responses
To develop intuition about how the model works, and about what the two stylized facts suggest about model parameters and sources of shocks, this section presents some impulse responses. Specifically, we present the industry equilibrium response of (1) H_t, S_t and Q_t, or (2) H_t, S_t and H_t − θS_t, to a shock to U_dt or U_ct, for various parameter sets, with no observable cost shifters (a = W_t = 0). While the parameter values we use are at least broadly consistent with one or another study, we choose them not because we view one or more of them as particularly compelling, but because they are useful in expositing the model.

Table 8 lists the parameter sets. It may be shown that the solution depends only on relative values of g, a_0, a_1 and a_2; multiplying these four parameters by any nonzero constant leaves the solution unchanged. [This is evident from the first-order conditions (3.3), (B.4) and (B.5): doubling all these parameters leaves the first-order conditions unchanged, apart from a rescaling of the shocks.] Our choice of a_2 = 1 is simply a normalization. We fix g = 1 in part because some of the properties documented below can be shown to be invariant to g [see West (1986, 1990b) on procyclicality of inventories], in part because a small amount of
Table 8
Parameter sets a
[The table's entries - columns (1)-(8), giving a mnemonic and the values of g, a_0 and the remaining parameters for each set - are not legible in the source.]
[...] > 0, inventory movements may be procyclical if a_1 < 0.
7. Empirical evidence
7.1. Introduction

The analytical results in West (1986, 1990b) and Section 4 and the simulations in Section 6 suggest at least two different ways of rationalizing the procyclicality of inventory movements and the persistence of the inventory-sales relationship. One is a demand-driven model with rapidly increasing marginal production costs (marginal production costs a_0 and/or a_1 are large relative to marginal inventory holding costs a_2), together with a strong accelerator motive (a_2 a_3 large relative to a_0 and a_1). The second is a cost-driven model, with increasing marginal production costs; such a model may or may not have a role for the accelerator. For simplicity we somewhat loosely refer to these as our demand-driven and our cost-driven explanations. We do so with some reservations: please recall that our demand shock U_dt may in some data basically reflect supply-side forces.

These two do not exhaust the possibilities, and many economists (including us) would expect both cost and demand shocks to be important over samples of reasonable length. Our own work, for example, has emphasized the possibility of declining marginal production costs [Ramey (1991)]. In combination with highly persistent cost shocks, both procyclicality of inventories and persistence of the inventory-sales relationship may result. And West (1990b) finds both stylized facts explicable with a model with strong costs of adjusting production and a substantial role for both cost and demand shocks, but with no accelerator.
But our demand-driven and cost-driven explanations have the virtue of simplicity, and both have support from a number of papers: as summarized in this section and the next, most aggregate studies, and the limited microeconomic evidence available, do not point to declining marginal cost, and do find a role for the accelerator. In citing such support we do not cast a wide net but instead selectively cite representative papers. In addition, after some introductory remarks on papers using the flexible accelerator model (Section 7.2), we focus on papers that explicitly use the linear quadratic model, for ease of exposition. Section 7.2 reviews parameter estimates from the linear quadratic literature, Section 7.3 discusses sources of shocks, and Section 7.4 provides an interpretation.

We remind the reader that the behavior of inventories depends only on the relative values of g, a_0, a_1 and a_2. All statements referencing "large" values of one of these parameters should be understood to mean "large relative to another parameter or linear combination of parameters". The normalization involved will be clear from the context.

7.2. Magnitude of cost parameters
Our discussion will focus on estimates of the linear quadratic model. We begin, however, with a brief discussion of results from less structured studies, including those using the flexible accelerator. We record two results.

The first is that in flexible accelerator studies, actual or expected sales is generally found to be an important determinant of inventory movements, with a positive relationship between the two series. See, for example, Maccini and Rossana (1981, 1984) or Blinder (1986b). In terms of the model in Section 5, a positive relationship may be interpreted as θ > 0, where θ is the coefficient on sales in the expression for target inventories [see Equation (5.3)]. As well, in direct estimation of a cointegrating parameter, Granger and Lee (1989) do obtain θ̂ > 0 in all 27 of their US two-digit manufacturing and trade series. To interpret this with the linear quadratic model, recall that under certain conditions the decision rule from the flexible accelerator model (5.1) can be mapped into that of the linear quadratic model (3.1). Under those conditions, θ = a_3 − [a_1(1 − b)/(b a_2)] [see Equations (3.5), (4.11) and (5.5)]. Thus a_3 > 0 is necessary for the cointegrating parameter θ to be positive, as noted by Kashyap and Wilcox (1993). [This holds even when U_ct is present (although cointegration requires U_ct ~ I(0)).] Thus here and in linear quadratic studies (see below) there is support for a nontrivial role for the accelerator motive - a result that may be unsurprising or reassuring to some, but in any event is not particularly helpful in discriminating between our two candidate explanations.

The second result from the flexible accelerator literature concerns the structure of production costs. As discussed in Section 2, this literature has found large autoregressive roots in H_t − H*_t, which implies slow adjustment of H_t towards H*_t. In quarterly data, a typical estimate of the root is around 0.8-0.9, implying that about
V.A. Ramey and K.D. West
10-20% of the gap between actual and target inventories is closed in a quarter. Ever since Carlson and Wehrs (1974) and Feldstein and Auerbach (1976), many observers have found such estimated speeds puzzling and perhaps not well-rationalized by the flexible accelerator model. One reason is that even the largest quarterly movements in inventories amount to only a few days' production. This suggests to Feldstein and Auerbach (1976, p. 376) and others that costs of adjusting inventories [v, in the notation of Equation (5.1)] cannot be very large 21. To interpret this second result with the linear quadratic model, recall that we set a0 = 0 when we established a mapping from the flexible accelerator to the linear quadratic model (3.1). With a0 = 0, an arbitrarily slow speed of adjustment results when a1 is arbitrarily large. It is not clear to us how large a value of a1 is implausibly large. But we take from the flexible accelerator literature the message that many find this simplest version of the model unappealing [see Blinder and Maccini (1991) for a recent statement]. Accordingly we consider the other sources of persistence isolated above: costs of adjustment (discussed in this subsection), and serially correlated cost variables (discussed in the next subsection).

To focus the discussion of costs of adjustment, we highlight estimates from some recent linear quadratic studies using two-digit manufacturing data from the USA. Different studies present estimates of a0, a1 and a2 relative to different parameters or linear combinations of parameters. To display results from various studies in consistent form, we restate published estimates of a0, a1 and a2 relative to a common linear combination of the published estimates of those parameters. This linear combination is

    c ≡ (1 + 4b + b²)a0 + (1 + b)a1 + ba2,    (7.1)

with b = 0.995. Here, c is the second derivative of the objective function (3.1) with respect to H_t; the Legendre-Clebsch condition states that c > 0 is a necessary condition for an optimal solution. [See Stengel (1986, p. 213) or Kollintzas (1989, p. 11).] Note that the estimates we discuss will therefore not be comparable to those used in the simulations in the previous section and in Table 8, and often are not as easily interpreted as those expressed relative to a single parameter. We nonetheless use this normalization since studies sometimes report negative estimates of a0, a1 or a2, which can make interpretation of estimates relative to one of those parameters problematic. Most authors examine more than one specification. Table 10 presents results for a specification that seemed to be preferred by the author(s). For the preferred specification, columns 2-6 present the median point estimate of a0/c, a1/c, [(1 + b)a0 + a1]/c
21 The logic apparently is that it should be easy to make inventory movements rapid if firms are beginning from a starting point in which current movements are small relative to production. But small inventory movements seem to be exactly what one would associate with slow adjustment speeds, if costs of adjustment determine both the size of movements and the adjustment speeds; if, instead, the slow adjustment speeds were accompanied by large movements in inventories, there would be a puzzling contrast between regression results and basic data characteristics.
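The adjustment-speed arithmetic cited above is easy to verify; a minimal sketch, using the typical quarterly root estimates of 0.8 and 0.9 from the text (the half-life calculation is our illustration, not the authors'):

```python
# An AR root of rho in H_t - H*_t means a fraction (1 - rho) of the gap
# between actual and target inventories is closed each quarter.
import math

for rho in (0.8, 0.9):
    speed = 1.0 - rho                          # fraction of gap closed per quarter
    half_life = math.log(0.5) / math.log(rho)  # quarters until half the gap is gone
    print(f"root {rho}: {speed:.0%} closed per quarter, half-life {half_life:.1f} quarters")
```

A root of 0.8 closes 20% of the gap per quarter (half-life about 3 quarters); a root of 0.9 closes 10% (half-life about 6.6 quarters).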
Ch. 13:
Inventories
Table 10
Median point estimates of model parameters a-d

(1) Reference e                      (2) a0/c   (3) a1/c   (4) [(1+b)a0+a1]/c   (5) a2/c   (6) a3   (7) Number of industries

Models with serially correlated cost variables:
(1) Durlauf and Maccini (1995)        0.         0.43       0.43                 0.15       0.55     5
(2) Eichenbaum (1989)                 0.         0.21       0.21                 0.58       1.15     7
(3) Kollintzas (1995)                -0.16       0.83       0.64                -0.09       1.14     6
(4) Ramey (1991)                      0.15      -0.63      -0.43                 1.69       0.40     6

Models without serially correlated cost variables:
(5) Fuhrer, Moore and Schuh (1995)    0.13       0.12       0.38                 0.00       0.67     1
(6) West (1986)                       0.05       0.34       0.44                 0.01       1.12     10

a In the column definitions, c = (1 + 4b + b²)a0 + (1 + b)a1 + ba2, b = 0.995. Note that the magnitudes in columns 2-4 are therefore not comparable to those in columns 4-6 in Table 8.
b Different papers expressed point estimates relative to different linear combinations of parameters. For each paper, the reported point estimates were restated relative to c. The Legendre-Clebsch condition states that c > 0 is a necessary condition for an interior solution of the optimization problem. The table reports the median of the restated estimates. When a0 = 0 (lines 1 and 2), or when there is only one industry (line 5), the entry for marginal production cost is by construction equal to (1 + b) times column 2, plus column 3.
c All the studies used two-digit manufacturing data from the USA. The exact data, sample period, specification and estimation technique vary from paper to paper.
d Most papers present more than one set of results. We chose the specification that seemed to be favored by the author(s).
e Sources by reference: (1) Table 7 (p. 85), entries labelled "Table 3"; (2) Table 2 (p. 861); (3) Tables 1-6 (pp. 77-80), columns labelled "random walk"; (4) Table 1 (p. 323), excluding autos; (5) Table 4 (p. 128), entry labelled "FIML-endogenous sales"; (6) Table 4 (p. 391).
(= marginal production cost, taking into account costs of adjusting production), a2/c and a3. The median is computed across the datasets considered by the author; the number of datasets is given in column 7. A skim of the table suggests a broad consensus on a3 (column 6). As well, there is relatively little disagreement on the sign of the slope of marginal production costs (column 4); with the exception of Ramey (1991), the studies find an upward slope to marginal production cost. There is, however, some variation in the extent to which the cost of adjustment a0 contributes to this upward slope. Consistent with the demand-driven explanation, Fuhrer et al. (1995) (line 5) and to a lesser extent West (1986) (line 6) find that a0 contributes to the upward slope. Some studies with other datasets have found an even stronger role for the cost of adjustment a0, with a0 positive and significant but with estimates of the production cost a1 negative [consistent with Ramey (1991)], or economically or statistically
indistinguishable from zero. For example, Kashyap and Wilcox's (1993) study of the automobile industry in the 1920s and 1930s yielded median estimates of parameters as follows:

    [(1 + b)a0 + a1]/c    a0/c    a3
    0.20                  0.29    0.11    (7.2)

Similar results are reported for the modern automobile industry by Blanchard (1983) and Ramey (1991), and for US aggregate inventories by West (1990b). On the other hand, we see in lines 1 and 2 that the preferred specifications in Eichenbaum (1989) and Durlauf and Maccini (1995) set the cost of adjustment to zero. In these two papers, the estimates of a1 tended to be positive but perhaps not so large as to imply a speed of adjustment that Feldstein and Auerbach (1976) would find implausibly slow. In part these papers set a0 to zero because in a setup similar to that of Kollintzas (1995) in line 3, negative and insignificant point estimates of a0 tended to result. Rounding out the cost-driven story requires finding substantial persistence from stochastic variation in costs. This is discussed in the next subsection.
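The renormalization behind Table 10 is mechanical enough to sketch; the parameter values below are hypothetical, for illustration only, and are not drawn from any of the studies:

```python
# Restate cost parameters relative to c = (1+4b+b^2)a0 + (1+b)a1 + b*a2,
# the second derivative of the objective function with respect to H_t.
# a0, a1, a2 here are hypothetical values, not estimates from the literature.
b = 0.995
a0, a1, a2 = 0.1, 0.3, 0.5

c = (1 + 4*b + b**2) * a0 + (1 + b) * a1 + b * a2
assert c > 0  # Legendre-Clebsch condition: c > 0 is necessary for an optimum

slope_mc = ((1 + b) * a0 + a1) / c  # marginal production cost, column 4 of Table 10
print(a0 / c, a1 / c, slope_mc, a2 / c)
```

Because every entry is divided by the same positive c, the restated estimates from different papers can be compared even when the papers normalized by different individual parameters.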
7.3. Shocks

There is much circumstantial evidence that serially correlated cost shifters have important effects on inventory behavior. In particular, the data often seem happy with specifications in which the unobservable disturbance U_ct is highly autocorrelated [e.g., Eichenbaum (1989), West (1990b), Ramey (1991)]. One's confidence that this unobservable disturbance really reflects stochastic variation in production costs would be increased if inventories could be shown to respond aggressively to observable measures of costs. Unfortunately, this appears not to be so. In practice, factor prices and interest rates usually are insignificant (in both economic and statistical terms), and sometimes have effects opposite of the theoretical predictions. For statistical significance, Table 11 shows a selection of results using cost variables, from studies of two-digit manufacturing in the USA, and now including flexible accelerator as well as linear quadratic studies. It may be seen in columns 1-4 that a finding of a statistically significant effect of observable measures of costs is rare: only 2 entries are "y"s, indicating that in only two of the 21 studies did significance at the 5% level characterize at least three-fourths of the coefficients estimated in a given study. 11 entries are "n"s, indicating that in these 11 studies fewer than one-fourth of the coefficients were significant. On the other hand, in column 5 it may be seen that for the unobservable disturbance, three of the 6 entries are "y"s, and that two of these "y"s are for studies that also included some observable measures of costs (lines 6 and 8); none of the 6 entries are "n"s.
7.4. Interpretation

We showed in Section 4 that the demand-driven and cost-driven explanations put two large autoregressive roots in the inventory-sales relationship H_t − θS_t; in fact,
Table 11
Statistical significance of cost variables a-c

Reference d                        Wage   Materials   Energy   Interest   Unobservable
                                          prices      prices   rate       shock
(1) Blinder (1986b)                n      ?                    ?          ?
(2) Durlauf and Maccini (1995)     ?      n           n
(3) Eichenbaum (1989)                                                     y
(4) Kollintzas (1995)                                                     ?
(5) Maccini and Rossana (1981)     y      ?           n        ?
(6) Maccini and Rossana (1984)     n      y                    n          y
(7) Miron and Zeldes (1988)        n      ?           n        n
(8) Ramey (1991)                   ?      n           ?        ?          y
(9) Rossana (1990)                                             n          ?

a This table is an updated version of a table in West (1995).
b All the studies used two-digit manufacturing data from the USA. The exact data, sample period, specification and estimation technique vary from paper to paper.
c A "y" entry indicates that the coefficient on the variable in a given column was significantly different from zero at the 5% level in at least three-fourths of the datasets in a given study, an "n" that it was significant in at most one-fourth of the datasets, a "?" that it was significant in more than one-fourth but fewer than three-fourths of the datasets. A blank indicates that the variable was not examined.
d Sources by reference: (1) Table 1 (pp. 360-61); (2) Table 3, inst. set 4; (3) Table 2 (p. 861); (4) Tables 1-6 (pp. 77-80), columns labelled "HP filter" and "quadratic trend"; (5) Table 1 (p. 20); (6) Table 3 (p. 231) and discussion on p. 227; (7) Table II (p. 892); (8) Table 1 (p. 323); (9) Tables 3 and 4 (pp. 26-27), with the cost of capital variables "ce" and "cp" interpreted as interest rate variables.
under certain conditions, both imply ARMA(2, 1) processes for H_t − θS_t. Specifically, this happens when φ_c = 0 in the demand-driven explanation [see Equations (4.14) and (4.15)]. That the similar ARMA structures might allow both models to fit a given body of data is illustrated by Kollintzas (1995). Kollintzas' results in line 3 of Table 10 were for a specification with a random walk (φ_c = 1) cost shock (i.e., Kollintzas differenced the first-order condition before estimating). Among other specifications, Kollintzas allowed for an i.i.d. unobservable cost shock. In the specification with i.i.d. cost shocks, the median estimates of the parameters were:

    a0/c    [(1 + b)a0 + a1]/c    a3
    0.03    0.47                  2.51    (7.3)
While the estimate of marginal production costs was not wildly different (0.47 with i.i.d. shocks vs. 0.64 with random walk shocks [line 3, column 4 of Table 10]), the
median estimate of a0 was higher. In fact, the estimate of a0 was higher in 5 of the 6 datasets 22. An interpretation is that a model that fits his data would imply two autoregressive roots. When serial correlation in the cost shock was suppressed, the positive values for a0 rationalized a second autoregressive root; when serial correlation in the cost shock was imposed, large (or even positive) values of a0 would imply an autoregressive structure too elaborate for the data, and accordingly the regression yielded diminished values of a0. Discriminating between the two explanations thus means distinguishing between costs of adjustment [when a0 ≠ 0 and the serial correlation of the cost shock is zero (φ_c = 0)] and exogenous serial correlation (when a0 = 0 and φ_c is near one). In principle this may be done, using either cross-equation restrictions, or additional variables such as those in the W_t vector. But in both inventory and non-inventory contexts this has proved difficult [e.g., Blinder (1986b), McManus et al. (1994), Surekha and Ghali (1997)]. And in any case, our discussion so far perhaps has understated the extent of conflict across empirical results. There is a range of estimates of most parameters (including some wrong-signed or otherwise implausible ones), and we have pushed papers into one of just two camps in the interest of summarizing a complex set of results: while in principle it may be possible to pin down important macroeconomic parameters and sources of shocks by simply estimating linear inventory models with aggregate data, this tantalizing idea has not proved true in practice so far. The conflict across papers, or the range of estimates, may be no worse than in empirical work in other areas. For example, those familiar with the real business cycle literature will probably not be surprised that it is difficult to find observable counterparts to unobservable cost shocks.
And Lovell (1994) shows that the estimated speed of adjustment of H_t towards H*_t is in fact no slower than those of some other variables. As well, part of the conflict across papers no doubt results from econometric problems related to sample size or estimation technique [West and Wilcox (1994, 1996), Fuhrer et al. (1995)]. Finally, it may be that careful analysis would reveal that seemingly disparate conclusions in fact result mainly from the use of different sample periods, datasets, and observable cost shifters (W_t, in the notation of the previous sections). But pointing out (perhaps unfairly!) that other literatures have similar problems will not advance our knowledge about inventories. Nor, most likely, will sharp estimates be produced by even the most refined econometric technique, at least when applied to familiar data. We therefore suggest some alternative approaches.
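The identification problem just described can be seen in miniature: the data pin down only the reduced-form autoregressive lag polynomial for H_t − θS_t. In this hypothetical sketch, a second root of 0.7 factors identically whether it is attributed to adjustment costs (a0 > 0, φ_c = 0) or to cost-shock persistence (a0 = 0, φ_c = 0.7); the numbers are illustrative, not estimates:

```python
# Two "structural" stories, one reduced form: (1 - pi_H L)(1 - r L) is the
# same polynomial whether r is an adjustment-cost root or phi_c.
import numpy as np

pi_H, second_root = 0.9, 0.7

# demand-driven labelling: expand the quadratic directly
ar_demand_driven = np.array([1.0, -(pi_H + second_root), pi_H * second_root])
# cost-driven labelling: multiply the two first-order factors
ar_cost_driven = np.convolve([1.0, -pi_H], [1.0, -second_root])

print(np.allclose(ar_demand_driven, ar_cost_driven))  # True
```

Since the two labelings imply the same lag polynomial, they imply the same autocovariances, which is why cross-equation restrictions or additional observables are needed to tell them apart.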
22 Such statements potentially are sensitive to how the parameters are expressed (relative to c, as in Table 10, or some other linear combination of parameters). But in this case the statement applies not only with respect to the normalization we have used, but also with respect to the normalization used by Kollintzas, which was relative to a0(1 + b) + a1.
8. Directions for future research

8.1. Introduction

In this section we offer what we believe to be fruitful directions for future research. Sections 8.2 and 8.3 describe alternative modelling strategies. Some of our suggestions are based on alternatives to the basic linear quadratic production smoothing model; others extend the basic model. All build on the insights delivered by the basic model: that procyclical movements result when inventories facilitate sales (a force captured in the basic model with the accelerator term), and that the shape of production costs influences both the character of cyclical movements and the persistence of the inventory-sales relationship. In addition, all seem intuitively capable of helping explain either or both of our stylized facts (procyclicality of inventories, persistence of the inventory-sales relationship), although, of course, research to date involving these suggestions has its share of blemishes (e.g., wrong-signed parameter estimates). Finally, Section 8.4 describes how the use of different data may help understand inventory behavior.

8.2. Inventories in production and revenue functions
The potential importance of the accelerator term (a3) in explaining both the business cycle and long-run behavior of inventories suggests that the relationship between inventories and sales deserves more study. Consider first Holt et al.'s (1960) original motivation for this formulation as a2(H_t − a3S_{t+1})². As discussed on pp. 56-57 of their book, their initial model of optimal inventory holdings used lot-size formulas, where the optimal batch size, the number of batches and optimal inventory levels all increase with the square root of the sales rate. They used two approximations to capture the costs and benefits associated with inventory holdings. First, they approximated the square-root relationship with a linear relationship between inventories and sales (e.g., H*_t = a3S_{t+1}). Second, they approximated all costs and benefits associated with inventories with a quadratic in which costs rise with the square of the deviation of inventories from the optimal level. This generates the accelerator term a2(H_t − H*_t)². While this tractable formulation provides a plausible mechanism for procyclical inventory movements, there are two potential problems with it. First, the approximations may be inadequate. As we will discuss below, the approximations used imply that the cost of a marginal reduction in inventories is linear in the stock of inventories, whereas at least one paper found significant convexities. Second, inventories may directly affect revenue in a way that is not well captured by including the accelerator term in the cost function. One alternative strand of the literature has modelled inventories as factors of production, or considered interrelationships between inventories and other factors of production. Christiano (1988), Ramey (1989), Galeotti et al. (1997) and Humphreys et al. (1997) are examples. Ramey (1989) argues that since inventories at all stages
of production facilitate production and shipments, they should be considered factors of production. She includes three stages of inventories in the production function, and estimates factor demand relationships. This approach obviously has the potential to make inventories move procyclically, since factor usage fluctuates with output. The results from the linear quadratic model suggest that if costs of adjustment are allowed, persistence would result as well. A second line of research has considered the revenue role of inventories. Kahn (1987, 1992) develops a theory of a stockout avoidance motive for holding inventories and tests some of its implications using automobile industry data. Kahn argues that demand uncertainty and a nonnegativity constraint on inventories can explain several important patterns in the data. Bils and Kahn (1996) extend this line of research by assuming that the demand function is a (nonlinear) function of the stock of goods available for sale. They apply the model to two-digit manufacturing data with mixed success. Rotemberg and Saloner (1989) offer another potential role for inventories in revenue, arguing that inventories may be accumulated to deter deviations from an implicitly collusive arrangement between firms. The nature of the equilibrium implies that inventories will be high when demand is high. They show empirically that the correlation between inventories and sales is higher in concentrated industries, as predicted by their model. A third line of research uses more general functional forms for the relationship between inventories and sales. [The work by Kahn (1992) and Bils and Kahn (1996) also fits into this category.] Pindyck (1994) studies the convenience yield of inventories for three commodities. Augmenting the usual production, sales and inventory data with futures prices, he provides evidence that the marginal convenience yield is very convex, increasing sharply as inventories approach zero. This indicates that the approximation embodied in the basic Holt et al. (1960) model may miss some important aspects of inventory behavior.
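The gap between the lot-size square-root rule and its linearization is easy to see numerically; a minimal sketch, with a hypothetical lot-size constant and calibration point (none of these numbers come from Holt et al.):

```python
# Lot-size logic makes target inventories proportional to sqrt(sales);
# Holt et al. approximate this with the linear rule H* = a3 * S.
# k and S0 below are hypothetical, chosen only to illustrate the fit.
import math

k = 2.0                       # hypothetical lot-size constant: H* = k * sqrt(S)
S0 = 100.0                    # sales level at which the linear rule is calibrated
a3 = k * math.sqrt(S0) / S0   # linear rule matching the sqrt rule at S0

for S in (25.0, 100.0, 400.0):
    exact = k * math.sqrt(S)
    approx = a3 * S
    print(f"S={S:6.0f}: sqrt rule {exact:6.1f}, linear rule {approx:6.1f}")
```

The two rules agree at the calibration point but diverge quickly away from it: the linear rule understates target inventories at low sales and overstates them at high sales, which is one sense in which "the approximations may be inadequate."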
8.3. Models with fixed costs
We next consider arguments and evidence that a key shortcoming of the linear quadratic model is that it fails to account for fixed costs facing firms. Blinder (1981), Caplin (1985), Mosser (1988), Blinder and Maccini (1991), and Fisher and Hornstein (1996) all argue that fixed costs of ordering may be very important for understanding the behavior of retail and wholesale inventories as well as manufacturers' materials and supplies. In some environments the aggregation argument presented in Section 3.4 will not apply, and research to date has shown that under certain conditions such fixed costs may lead to (S, s) types of decision rules. In their review article, Blinder and Maccini (1991) recommend that future inventory research concentrate on the (S, s) model. This will require resolution of difficult problems of aggregation, perhaps partly through the use of simulations [Lovell (1996)]. While the results look suggestive at the level of a single-product firm, the implications for a multi-product firm, let alone for an
industry or economy, have been harder to obtain because of difficulties of aggregating (S, s) rules. The studies by Bresnahan and Ramey (1994) and Hall (1996) show that fixed costs are an important determinant of production costs in the automobile industry. Bresnahan and Ramey follow some fifty assembly plants on a weekly basis from 1972 to 1983, and uncover important lumpiness in the margins for varying production. Hall studies fourteen assembly plants from 1990 to 1994. Both studies isolate two important methods for varying production, which appear to involve some sort of nonconvex costs. First, they find that complete shut-down of a plant for a week at a time is an important method for temporarily decreasing output rates. Second, the adding and dropping of extra shifts (each of which doubles or halves production) are an important source of output variability, and appear to involve fixed costs and lumpiness of production levels. Thus, costs in the automobile industry deviate from the linear quadratic production smoothing model in two important ways. First, there appear to be fixed costs of adjusting production, not convex costs as postulated in the production smoothing model. Second, the lumpiness of the margins, accompanied by the fixed costs, leads to a nonconvex cost function. It is important to point out that the nonconvexity is due not to declining marginal costs [as Ramey (1991) originally posited], but rather to the existence of large fixed costs at key points in the cost curve. Thus, both the (S, s) literature and the limited amount of factory evidence available suggest that fixed costs may be very important. Furthermore, the types of fixed costs highlighted can potentially explain both the procyclicality of inventories and the persistence of the inventory-sales relationship. For example, the lumpiness of the shifts margin in the automobile industry can explain why an increase in sales might lead to a more than proportional increase in production. Also, the importance of fixed adjustment costs can explain why significant deviations in the inventory-sales relationship are allowed to occur before production responds. It is not yet clear, however, how general the results from the automobile industry are. And more generally it is not well understood how and whether fixed costs at the plant or firm level translate into industry- or economy-wide behavior. Thus, the role of these types of fixed costs in explaining aggregate inventory fluctuations remains an important topic for future study.
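A minimal (S, s) sketch can illustrate the behavior these models generate; the thresholds and demand process below are hypothetical, chosen only to show how a fixed ordering cost produces lumpy, intermittent orders rather than smooth adjustment:

```python
# One-product (S, s) policy: when the stock falls below s, order up to S.
# With a fixed cost per order, a few large orders beat many small ones.
import random

random.seed(0)
little_s, big_S = 20.0, 100.0  # hypothetical (s, S) thresholds
stock, orders = big_S, []

for week in range(52):
    stock -= random.uniform(5.0, 15.0)   # hypothetical weekly demand
    if stock < little_s:
        orders.append(big_S - stock)     # lumpy order, restoring the stock to S
        stock = big_S

print(f"{len(orders)} orders in 52 weeks, average size {sum(orders) / len(orders):.0f}")
```

Even though demand arrives every week, orders arrive only every several weeks and are always large, which is why aggregating such rules across products and firms is the hard part of taking (S, s) models to industry data.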
8.4. The value of more data

Finally, we discuss how the addition of more data may help narrow the estimates obtained from the linear quadratic model, as well as shed light on the unobserved cost shocks. We will argue that there are several available sources of data that have the potential to clear up ambiguities. One possible explanation for the range of estimates obtained from the production smoothing model is that the data are not sufficient for distinguishing the relative values of the parameters. One way to glean more information from macroeconomic data is to use information contained in prices, something done in a handful of papers
including Eichenbaum (1984), Blanchard and Melino (1986) and Bils and Kahn (1996). Pindyck's (1994) results using futures prices provide additional evidence of the information contained in prices. A second possible use of new data is to measure the stochastic variation in cost. As Table 11 indicates, a number of authors have experimented with several observable cost shifters, but generally do not find effects. Another possible source of cost shocks that has been studied in a few papers is credit conditions. We remarked in Section 2.2 that Kashyap et al. (1994) and Carpenter et al. (1994, 1998) still find persistence in the inventory-sales relationship after including measures of credit conditions. But they also regularly find that credit conditions affect the inventory holding behavior of small firms, across various specifications. If these credit conditions are serially correlated (which they are likely to be), and if small firms are important enough to substantially affect industry- and economy-wide aggregates, credit conditions may ultimately help explain our two stylized facts. Finally, we advocate more plant- and firm-level studies, although gathering such data requires substantial work. Schuh (1996), for example, uses panel data from the M3LRD to calibrate biases from aggregation. And consider Holt et al.'s (1960) study of six firms ranging from a paint producer to an ice cream maker, and Kashyap and Wilcox's (1993) and Bresnahan and Ramey's (1994) studies of the automobile industry. They use not only firm-level data on production, inventories and sales, but also company reports and the industry press, which provide valuable insights into the cost structure facing firms. For example, Bresnahan and Ramey (1994) were able to categorize the cause of every plant shutdown using information from Automotive News, which chronicled drops in demand and cost shocks such as strikes and model year change-overs.
9. Conclusions
We conclude by briefly reiterating several points we have made in this chapter. We began by asserting that inventories are a useful resource in business cycle research. The theoretical dependence of the comovements of sales, production, and inventories on important parameters such as the slope of marginal costs, and on the nature of the underlying shocks, indicates that inventory models can in principle be used to identify these important macroeconomic features. The two stylized facts we highlight - the procyclicality of inventories and the persistence of the inventory-sales relationship - are intimately linked to other aspects of business cycle fluctuations. Thus, inventory movements have valuable business cycle information. To consider explanations for the two facts, we presented a linear quadratic model. We showed that the model can rationalize the two facts in a number of ways, but focused on two stylized explanations that have the virtue of relative simplicity and support from a number of papers. Both assume that there are persistent shocks to demand for the good in question, and that marginal production cost slopes up. The first explanation assumes as well that there are highly persistent shocks to the cost of production. The second
assumes that there are strong costs of adjusting production and a strong accelerator motive. Our review of the empirical evidence, however, indicates that the range of estimates of key parameters and of the relative importance of cost versus demand shocks is too wide to allow us to endorse one of the two or some third explanation. But while the literature has not reached a consensus, it has identified mechanisms and forces that can explain basic characteristics of inventory behavior. We believe that several research strategies, and use of different data, promise to continue to improve our understanding of inventory movements and therefore of business cycle fluctuations.
Appendix A. Data Appendix
Data sources for annual G7 data: all data on inventory changes were obtained from International Financial Statistics (IFS), mostly from the 1996 CD-ROM. From the CD-ROM, we obtained nominal and real GDP and the nominal change in aggregate inventories. The GDP deflator was used to convert the inventory change from nominal to real. For Canada, France, the UK and the USA, 1955 data were available to compute ΔQ and ΔS in 1956. For all other countries an observation was lost in computing the initial ΔQ and ΔS. Additional sources were used for West Germany and Italy. West Germany: (a) 1957-1978: the IFS data used in West (1990a), rebenchmarked to a 1990 from a 1980 base, and output measured with GNP instead of GDP. (b) 1979-1994: in both the CD-ROM and in recent hardcopy versions of IFS, the figures on the annual change looked suspicious: they were uniformly positive and large relative to 1957-1958, bore no obvious connection to the figures on the levels reported in the Statistisches Bundesamt publication cited below, and in recent years bore no obvious connection to the average of the reported quarterly figures. So for 1979-1990, we used the annual change reported in the hardcopy IFS, obtaining a given year's data from the April issue three years later (e.g., the 1990 figure came from the April 1993 issue of IFS). (For 1980 we used the May 1983 issue, because the April 1983 issue was missing.) For 1991-1994, we used the average of the quarterly figures from the April 1995 hardcopy version of IFS. Italy: 1993 and 1994 real GDP came from OECD Economic Surveys, Italy, 1996, rebenchmarked to a 1990 from a 1985 base. We checked the US data against the Department of Commerce's 1992 chain-weighted NIPA data, and while there were notable differences, overall the two perhaps were tolerably close: the correlation between inventory investment as constructed here and the Department of Commerce measure was 0.96.
Data sources for non-US data on inventory levels: Canada: private communication from Statistics Canada gave a nominal 1995:IV inventory figure for all nonfinancial industries of 140.8 billion Canadian dollars, which we deflated with the GDP deflator. West Germany: Statistisches Bundesamt, Volkswirtschaftliche Gesamtrechnungen,
Table 3.2.9. Agriculture (line 2) was subtracted from total (line 1), and the result was deflated by the GDP deflator. Japan: Economic Planning Agency, Annual Report on National Accounts, 1997, table on "Closing Stocks". The nominal figure for total stocks was deflated by the GDP deflator. United Kingdom: Office for National Statistics, United Kingdom National Accounts: The Blue Book, 1996, Table 15.1. Agriculture (series DHIE) and government (AAAD) were subtracted from total (DHHY), and the result was deflated by the GDP deflator. Data sources for sectoral distribution of US inventories: broad sectoral categories were obtained from Citibase, and manufacturing inventories by stage of processing were obtained from the BEA. The stage-of-processing inventories were converted from monthly to quarterly data by sampling the last month of the quarter.
Appendix B. Technical Appendix
This appendix discusses the following: (1) solution of the model 23; (2) computation of E(Q_t² − S_t²) in Table 4; (3) estimation of θ in Table 5; (4) the social planning approach to derivation of the first-order conditions.
B.1. Solution of the model

We assume throughout that a1, a2, g > 0 and a0, a3 ≥ 0. See Ramey (1991) for solutions when a1 < 0. We begin by working through in detail the solution discussed in Section 4, when a0 = 0 and the forcing variables follow first-order autoregressions. For simplicity, for the most part we set a = 0 as well. Thus U*_ct = U_ct [see Equation (3.2)], E_{t−1}U_ct = φ_cU_{ct−1} and E_{t−1}U_dt = φ_dU_{dt−1}. To insure a unique stable solution, we assume that either (B.1a) or (B.1b) holds:

    (B.1a)
    (B.1b)

Note that the right-hand inequality in (B.1b) follows if a3 falls outside (b(1 + b)^{-1}, 0.5), a narrow range when b ≈ 1. There will also be a stable solution when a2a3(1 − a3) = g. But to allow us to divide by g − a2a3(1 − a3) at certain stages in the derivation, we rule this out for conciseness.
23 We thank Stanislav Anatolyev for assistance in the preparation of this part of the Technical Appendix.
Ch. 13: Inventories
When $a_0 = 0$, differentiating the objective function (3.1) with respect to $S_t$ gives

$$P_t - a_1 Q_t + a_2 a_3 (H_{t-1} - a_3 S_t) - U^*_{ct} = 0. \tag{B.2}$$

Use $P_t = -gS_t + U_{dt}$, $Q_t = S_t + \Delta H_t$, and our tentative assumption that $U^*_{ct} = U_{ct}$. (B.2) becomes

$$-a_1 H_t - (a_1 + a_2 a_3^2 + g)S_t + (a_1 + a_2 a_3)H_{t-1} - U_{ct} + U_{dt} = 0 \tag{B.3}$$

$$\Rightarrow \quad S_t = (a_1 + a_2 a_3^2 + g)^{-1}\left[-a_1 H_t + (a_1 + a_2 a_3)H_{t-1} - U_{ct} + U_{dt}\right]. \tag{B.4}$$

Use (B.4) and (B.4) led one period to substitute out for $S_t$ and $S_{t+1}$ in $H_t$'s first-order condition (3.3) (with $a_0 = 0$). After some rearrangement, the result may be written

$$0 = bE_tH_{t+1} - (1 + b + m)H_t + H_{t-1} + g_{Hc}U_{ct} + g_{Hd}U_{dt} \equiv bE_tH_{t+1} - \eta H_t + H_{t-1} + g_{Hc}U_{ct} + g_{Hd}U_{dt}, \tag{B.5}$$

where

$$m \equiv \frac{a_2\left[b(a_1+g) + a_1 a_3(1-b)\right]}{a_1\left[g + a_2 a_3(a_3-1)\right]}, \qquad
g_{Hc} \equiv \frac{g + a_2 a_3^2 - b\phi_c\left[g + a_2 a_3(a_3-1)\right]}{a_1\left[g + a_2 a_3(a_3-1)\right]}, \qquad
g_{Hd} \equiv \frac{a_1 - b\phi_d(a_1 + a_2 a_3)}{a_1\left[g + a_2 a_3(a_3-1)\right]}.$$

It can be shown that inequality (B.1) guarantees that there is exactly one root less than one to the polynomial

$$bx^2 - \eta x + 1 = 0. \tag{B.6}$$

Call this root $\pi_H$, where

$$\pi_H = \begin{cases} 0.5\,b^{-1}\left[\eta - (\eta^2 - 4b)^{1/2}\right] & \text{if } \eta > 0, \\ 0.5\,b^{-1}\left[\eta + (\eta^2 - 4b)^{1/2}\right] & \text{if } \eta < 0. \end{cases} \tag{B.7}$$

Using techniques from Hansen and Sargent (1980), it follows that the solution to problem (B.5) is

$$H_t = \pi_H H_{t-1} + f_{Hc}U_{ct} + f_{Hd}U_{dt}, \qquad f_{Hc} \equiv \frac{\pi_H}{1 - b\pi_H\phi_c}\,g_{Hc}, \qquad f_{Hd} \equiv \frac{\pi_H}{1 - b\pi_H\phi_d}\,g_{Hd}. \tag{B.8}$$

Upon substituting Equation (B.8) into Equation (B.4) and rearranging, we obtain

$$S_t = \pi_S H_{t-1} + f_{Sc}U_{ct} + f_{Sd}U_{dt}. \tag{B.9}$$
Let $L$ be the lag operator, "adj" the adjoint of a matrix. From Equations (B.8) and (B.9), a representation for the bivariate $Y_t \equiv (H_t, H_t - \theta S_t)'$ process is

$$Y_t = AY_{t-1} + BU_t, \qquad A = \begin{pmatrix} \pi_H & 0 \\ \pi_H - \theta\pi_S & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} f_{Hc} & f_{Hd} \\ f_{Hc} - \theta f_{Sc} & f_{Hd} - \theta f_{Sd} \end{pmatrix}, \qquad U_t = (U_{ct}, U_{dt})'$$

$$\Rightarrow \quad (I - AL)Y_t = BU_t \quad \Rightarrow \quad Y_t = (I - AL)^{-1}BU_t = \frac{\mathrm{adj}(I - AL)}{|I - AL|}BU_t \quad \Rightarrow \quad |I - AL|\,Y_t = \mathrm{adj}(I - AL)\,BU_t. \tag{B.10}$$

This may be used to solve for the univariate process for $H_t - \theta S_t$, which is

$$(1 - \pi_H L)(H_t - \theta S_t) = (f_{Hc} - \theta f_{Sc})U_{ct} + \theta(\pi_H f_{Sc} - \pi_S f_{Hc})U_{ct-1} + (f_{Hd} - \theta f_{Sd})U_{dt} + \theta(\pi_H f_{Sd} - \pi_S f_{Hd})U_{dt-1}. \tag{B.11}$$
Suppose that $U_{dt}$ follows a random walk, so that $\phi_d = 1$ and $(f_{Hd} - \theta f_{Sd})U_{dt} = (f_{Hd} - \theta f_{Sd})(U_{dt-1} + e_{dt})$. Upon using the definition of $\theta$ in Equation (3.5), and in light of the quadratic equation used to obtain $\pi_H$, tedious manipulations reveal that $(f_{Hd} - \theta f_{Sd}) + \theta(\pi_H f_{Sd} - \pi_S f_{Hd}) = 0$. It follows that

$$\phi_d = 1 \quad \Rightarrow \quad (1 - \pi_H L)(H_t - \theta S_t) = (f_{Hc} - \theta f_{Sc})U_{ct} + \theta(\pi_H f_{Sc} - \pi_S f_{Hc})U_{ct-1} + (f_{Hd} - \theta f_{Sd})e_{dt}$$

$$\Rightarrow \quad (1 - \pi_H L)(1 - \phi_c L)(H_t - \theta S_t) = v_t \sim \mathrm{MA}(1), \qquad v_t = m_{0c}e_{ct} + m_{1c}e_{ct-1} + m_{0d}e_{dt} - \phi_c m_{0d}e_{dt-1}. \tag{B.12}$$

Thus, when $\phi_d = 1$, $H_t - \theta S_t \sim \mathrm{ARMA}(2,1)$ with autoregressive roots $\pi_H$ and $\phi_c$. Now suppose that $a \neq 0$, so that $U^*_{ct} = a'W_t + U_{ct}$.
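The step from the bivariate representation (B.10) to the univariate equation (B.11) can be verified symbolically. A minimal sketch using sympy (assumed available), checking only the $U_{ct}$ column of $B$; the $U_{dt}$ column is analogous:

```python
import sympy as sp

piH, piS, th, fHc, fSc, L = sp.symbols('pi_H pi_S theta f_Hc f_Sc L')
A = sp.Matrix([[piH, 0], [piH - th * piS, 0]])   # transition matrix in (B.10)
B = sp.Matrix([[fHc], [fHc - th * fSc]])         # U_ct column of the impact matrix
M = sp.eye(2) - A * L                            # I - AL
rhs = (M.adjugate() * B)[1]                      # second row of adj(I - AL) B
# Coefficients on U_ct and U_{ct-1} in (B.11):
coeff_now = sp.expand(rhs).coeff(L, 0)
coeff_lag = sp.expand(rhs).coeff(L, 1)
assert sp.simplify(coeff_now - (fHc - th * fSc)) == 0
assert sp.simplify(coeff_lag - th * (piH * fSc - piS * fHc)) == 0
```

The determinant $|I - AL| = 1 - \pi_H L$, which is why only one autoregressive root appears on the left of (B.11).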
[Figure 12 appears here; its four panels plot, against quarters, the wage, wealth, interest rate, and total effects on consumption and on leisure.]
Fig. 12. Substitution and wealth effects of persistent and permanent shocks: stars, permanent shocks; squares, persistent shocks.
Ch. 14: Resuscitating Real Business Cycles
determine the consequences for consumption and leisure at date 0, which we call the
wage effect. This effect is analogous to the Hicksian effect of the wage on consumption and leisure, in that it holds utility fixed, tracing out a substitution response. However, in our general equilibrium model, the productivity shock implies that wages change in all periods, $\{w_t\}_{t=0}^{\infty}$. Thus, the wage effect in Figure 12 takes into account the entire change in the time path of wages, combining static and intertemporal substitutions. When ρ = 0.979, the representative household correctly understands that the productivity shock will raise the path of wages at date 0 and in many future periods, but that the long-run level of the wage will be unchanged. Accordingly, the household plans to consume more at date 0. Leisure hardly changes at all because the current period is about "average"; this conclusion depends on the particular ρ value. However, this pattern is sharply altered when ρ = 1, for then the household recognizes that the current wage is below the long-run wage, and leisure rises due to the wage effects that stem from a positive productivity shock 42. In the general equilibrium of our RBC model, there is one additional channel: interest rate effects that induce intertemporal substitutions of consumption and leisure. In general, these intertemporal price effects are a powerful influence, but one that is not much discussed in informal expositions of the comparative dynamics of RBC models. In particular, permanent increases in productivity lead to high real interest rates, and these induce individuals to substitute away from date-0 consumption and leisure as shown in Figure 12. We are now in a position to describe why a permanent shift in productivity (arising when ρ = 1) has a smaller effect on labor than a persistent but ultimately temporary shock (ρ = 0.979). When the shock is temporary, there is a small wealth effect that depresses labor supply, but temporarily high wages and real interest rates induce individuals to work hard.
When the shock is permanent, there are much larger wealth effects and the pattern of intertemporal substitution in response to wages is reversed since future wages are high relative to current wages. However, labor still rises in this case in response to productivity shocks due to very large intertemporal substitution effects of interest rates.
5.3. Why not other shocks?

We have just seen that the basic real business cycle model driven by persistent technology shocks can produce realistic business cycle variation in real quantities. Do these same patterns emerge when the economy is buffeted by other disturbances? Shocks to fiscal and monetary policy have been long-standing suspects in the search
42 The wage effect on consumption is constant across time in each case because the separable momentary utility function implies that efficient consumption plans do not depend on the amount of work. Equivalently, with this utility function, there is a general substitution effect on consumption at all dates that works much like a wealth effect.
R. G. King and S. T. Rebelo
for the causes of business cycles. It is thus natural to ask what are the effects of these shocks in the standard RBC model. Shocks to government spending cannot, by themselves, produce realistic patterns of comovement among macroeconomic variables 43. This result stems from the fact that an increase in government expenditures (financed with lump-sum taxes) gives rise to a negative wealth effect that induces consumption to fall at the same time that labor and output rise. Thus, if government spending were the only shock in the model, consumption would be countercyclical 44. Changes in labor and capital income taxes have effects that are similar to productivity shocks. However, these taxes change infrequently, making them poor candidates for sources of business cycle fluctuations. Monetary policy shocks have small effects in this class of models, both in versions in which money is introduced via a cash-in-advance constraint [Cooley and Hansen (1989)] and in models that stress limited participation [Fuerst (1992), Christiano and Eichenbaum (1992b)]. Many researchers are also currently investigating the nature of business cycles in models that start with the core structure of an RBC framework but also incorporate nominal rigidities of various forms 45. This research has not yet produced a business cycle model that performs at the same level as the RBC workhorse described in Section 4.
6. Extensions of the basic neoclassical model
Since the basic RBC model contains explicit microeconomic foundations, part of the literature has tried to improve its predictions for individual behavior. Other researchers have sought to improve the fit between model and data, focusing on moments and sample paths of macroeconomic time series. In this section, we discuss two strands of this research: work on labor supply and on capital utilization.
6.1. The supply of labor

There is a substantial body of work that focuses on the labor supply and, more generally, on the labor market in RBC models. This research is motivated by four difficulties encountered by the basic model on micro and macro dimensions. In most
43 There is a large literature that investigates the effects of fiscal policy in an RBC context. References include Wynne (1987), Christiano and Eichenbaum (1992a), Rotemberg and Woodford (1992), Baxter and King (1993), Braun (1994), McGrattan (1994), and Cooley and Ohanian (1997). 44 For an early discussion of this difficulty, see Barro and King (1984). There is actually some evidence that in historical periods dominated by large shocks to government expenditures consumption was countercyclical; see Correia, Neves and Rebelo (1992) and Wynne (1987). 45 Examples include Cho and Cooley (1995), Dotsey, King and Wolman (1996), and Chari, Kehoe and McGrattan (1996).
RBC models the implied labor supply elasticity to wage changes is very large, relative to micro studies. All of the variation in aggregate hours in the model arises from movements in hours-per-worker, while the US experience is that most of the action comes from movements of individuals in and out of employment. Labor in the model lacks a close correspondence to labor in the data (see Figure 6). Finally, labor input and its average product are very highly correlated in the model, but not in the data.
6.1.1. Estimated and assumed labor supply elasticities

Labor economists have long been interested in estimating the response of the labor supply to a change in the real wage rate. In the standard static model, an increase in the real wage produces a substitution effect, which leads to an increase in N and C, as well as a wealth effect, which leads to a decline in N and an increase in C. While the effect of a wage increase on consumption is unambiguous, the effect on the labor supply involves conflicting substitution and wealth effects. In a dynamic model, the effect of a wage change is complicated by the fact that the size of the wealth effect depends on the anticipated duration of the wage change: temporary wage changes have a small wealth effect and permanent ones have a large wealth effect. In a dynamic setting, the key equation that determines the supply of labor is the requirement that the marginal utility of leisure equal its cost along the intertemporal budget constraint. Many empirical studies of dynamic labor supply [e.g., MaCurdy (1981)] suppose that the utility function has the separable form (4.2) that we introduced in our discussion of the approximation of the RBC model in Section 5 above and for which we showed that

$$\hat{N} = \frac{1-N}{\eta N}\,(\hat{w} + \hat{\lambda}). \tag{6.1}$$
In this expression, the term $(1-N)/(\eta N)$ is the λ-constant elasticity of labor supply. To isolate the substitution effect, labor economists often estimate a λ-constant elasticity of labor supply, and we organize our discussion of labor supply issues around this elasticity. In the basic RBC model, with its assumption of log utility (η = 1) and a steady-state fraction of time spent working of N = 0.2, it follows that the implied labor supply elasticity is four: a one percent change in the wage rate calls forth a four percent change in hours worked if there is little wealth effect (λ constant), as with a temporary wage change. Yet the microeconomic evidence on variations in hours worked is sharply at odds with the elasticity built into the RBC model. While estimates of this elasticity vary across different gender and race groups, they are typically much lower than unity [e.g. Pencavel (1986)].
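The calibration arithmetic in this paragraph can be checked in a few lines; N and η below are the values quoted in the text:

```python
def lambda_constant_elasticity(N, eta):
    """Lambda-constant labor supply elasticity (1 - N)/(eta * N) from Equation (6.1)."""
    return (1 - N) / (eta * N)

# Log utility (eta = 1) and steady-state work time N = 0.2 imply an elasticity of 4.
assert abs(lambda_constant_elasticity(0.2, 1.0) - 4.0) < 1e-12
# An elasticity of 1 (the Pencavel-style upper bound used below) obtains, e.g., with eta = 4 at N = 0.2.
assert abs(lambda_constant_elasticity(0.2, 4.0) - 1.0) < 1e-12
```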
6.1.2. Implications of varying the aggregate labor supply elasticity

To show the consequences of adopting a labor supply elasticity in line with microeconomic estimates, the third and fourth panels of Figure 8 show the effect of
choosing $(1-N)/(\eta N) = 1$, which is the upper bound suggested by Pencavel's estimates, rather than $(1-N)/(\eta N) = 4$ as in the model of Section 4. There is an important reduction in the volatility of output in the third panel of Figure 8. However, the model loses most of its ability to produce fluctuations in labor (see the fourth panel of Figure 8). In terms of moments, the standard deviation of output falls from 1.39 to 1.16 with the smaller labor supply elasticity, and the standard deviation of labor falls from 0.67 to 0.33.
6.1.3. Modeling the extensive margin

RBC researchers have investigated ways of enhancing the aggregate labor supply response by focusing on the extensive margin. Figure 4 shows that most fluctuations in total labor input occur as households substitute between employment and nonemployment (the extensive margin) rather than between a greater or smaller number of per capita hours worked (the intensive margin). Explaining these facts seems to require that there are fixed costs of going to work or other attributes of the technology that lead to nonconvexities in the individual's opportunity set. There are two strategies for incorporating the extensive margin into business cycle analysis. The first is to assume that households are heterogeneous with respect to their reservation values of work, probably due to differences in fixed costs of working such as travel time to the job. This is a conventional approach in labor economics [see, e.g., Rosen (1986)] that has been introduced into a business cycle model by Cho and Rogerson (1988) and Cho and Cooley (1994). In order to make such a model tractable, it is necessary to view individual agents as efficiently sharing the resulting employment risks 46. An alternative approach, developed by Rogerson (1988) and applied to business cycles by G.D. Hansen (1985), assumes that households are identical but agree on an efficient contract which allocates some individuals to work in each period while leaving the rest idle. A remarkable feature of both approaches is that there is a stand-in representative agent whose preferences generally involve more intertemporal substitution in work than displayed by the underlying individual agents. For simplicity and congruence with the literature, we focus our discussion on the economies with indivisible labor and lotteries, following Rogerson (1988). Each individual in the economy has to choose between working a fixed shift of H hours and not working at all.
Suppose that preferences are such that individuals would ideally like to supply a number of hours N < H. This arrangement is not possible because the choice set is not convex: it includes N = 0 (with zero labor income) and N = H (with labor income wH) but no linear combinations of these two points. In this setup agents can be made better off by the introduction of lotteries which convexify their

46 In actual economies, variations in aggregate hours reflect changes at both the intensive and extensive margins. In a model where workers have different fixed costs of going to work, Cho and Cooley (1994) have captured both of these responses. Such a framework appears necessary to explain the differing cyclical patterns of employment and hours-per-worker in the USA and Europe that are documented by Hansen and Wright (1992).
choice set. By entering a lottery, an agent can choose to work a fraction p of his days, remaining unemployed a fraction (1 − p) of the time. Let us use the subscript 1 to denote those agents who are assigned to work by the random lottery draw and the subscript 2 to refer to the unemployed agents. The expected utility of an individual prior to the lottery draw is

$$p\,u(c_1, 1 - H) + (1-p)\,u(c_2, 1), \tag{6.2}$$

where p is the fraction of the population assigned to work. Feasible allocations of consumption across the employed and unemployed agents must obey

$$p c_1 + (1-p) c_2 = c, \tag{6.3}$$
where c is per-capita consumption. Maximizing Equation (6.2) subject to condition (6.3), we find that the marginal utility of consumption must be equated across types, i.e.,

$$D_1 u(c_1, 1 - H) = D_1 u(c_2, 1), \tag{6.4}$$

which is an efficient risk-sharing condition in this situation of employment lotteries, as in many other contexts.

The standard indivisible labor model. The typical treatment of the indivisible labor model, as in Rogerson (1988) and Hansen (1985), involves assuming separable utility. Within the general class of utility functions (3.8), this corresponds to σ = 1, so that u(c, L) = log(c) + log(v(L)). In this case, efficient risk-sharing implies that the employed and the unemployed share the same level of consumption (c₁ = c₂). Using this fact, expected utility can be written as

$$u(c, L) = \log(c) + (1 - L)\,\frac{1}{H}\log\!\left(\frac{v_1}{v_2}\right) + \log(v_2), \tag{6.5}$$

where L = 1 − pH is the average number of hours of leisure in the economy and where v₁ = v(1 − H) and v₂ = v(1). There are three notable features of this economy. First, even though each individual agent has a finite elasticity of labor supply, the macroeconomy acts as if it were populated by agents with a more elastic supply of labor. In particular, the stand-in representative agent for this economy has preferences that are linear in leisure, implying an infinite λ-constant elasticity of labor supply [see Equation (6.1) with η = 0], a feature whose consequences we explore further below. Second, contrary to conventional wisdom, this is an economy in which it is optimal to have unemployment. Finally, agents actually choose to bear uncertainty by entering the lottery arrangement instead of working a fixed number of hours in every period. It is interesting to explore further why the individual elasticity of labor supply differs from that of the economy as a whole and the consequences of this difference for the
determination of output and labor. The individual elasticity of labor supply answers the question "how many more hours would you work in response to a 1% raise in salary?". But the answer to this question is irrelevant because the number of hours worked is not flexible: it is either H or zero. In other words, the intensive margin is not operative and hence its elasticity of response is irrelevant. Proceeding to the consequences for the determination of labor, the preferences of the stand-in representative agent (6.5) imply that small changes in wages and prices can lead to very large effects on quantities. To see this, consider an isolated individual maximizing
$\sum_{t=0}^{\infty}\beta^t u(c_t, L_t)$, with momentary utility given by Equation (6.5). Along the relevant intertemporal budget constraint, suppose that the discounted cost of a unit of leisure is $\beta^t \lambda_t w_t$. Then, for the individual to work part of the time ($0 < L_t < 1$) in each period, it must be the case that $\lambda_t w_t = \frac{1}{H}\log(v_2/v_1)$ 47. But, if this condition is satisfied, the individual is indifferent across all sequences of leisure which imply the same level of $\sum_{t=0}^{\infty}\beta^t\left[(1-L_t)\frac{1}{H}\log(v_1/v_2)\right]$: there is an infinite intertemporal elasticity of substitution in work. One implication of this labor supply behavior is that it is the demand side of the labor market which determines the quantity of employment and work effort in the equilibrium of the indivisible labor model. From this perspective, firms choose the quantity of labor that equates its marginal product to the real wage, with the position of the demand schedule being shifted by the level of productivity and the capital stock. Since the capital stock and the multiplier $\lambda_t$ are endogenously determined, this labor market equilibrium picture is incomplete, but it is a useful partial equilibrium description.

The indivisible labor model with more general preferences: When the indivisible labor model is generalized, as in Rogerson and Wright (1988), there are interesting new conclusions. To develop these, we use the utility function (3.8), with σ ≠ 1. The efficient risk-sharing condition implies that consumption allocations must satisfy

$$\frac{c_1}{c_2} = \left(\frac{v_1}{v_2}\right)^{(1-\sigma)/\sigma}. \tag{6.6}$$

According to this specification, if σ > 1 there will be more consumption allocated to the employed (group 1) than to the unemployed (group 2) 48. Thus, as more individuals are allocated to the market (higher p) aggregate consumption will rise even

47 If $\lambda_t w_t < \frac{1}{H}\log(v_2/v_1)$, our agent spends all available time at t in leisure ($L_t = 1$). If $\lambda_t w_t > \frac{1}{H}\log(v_2/v_1)$, our agent devotes no time to leisure ($L_t = 0$).
48 This conclusion makes use of the fact that $v_2 = v(1) > v_1 = v(1-H)$, which follows from the fact that v is an increasing function.
if consumption of employed individuals and unemployed individuals stays relatively constant. Further, using this consumption rule along with the expected utility objective, there is a stand-in representative agent whose preferences are 49 (6.7). There are two points about this expression. First, the stand-in's utility function inherits the long-run invariance of hours to trend changes in productivity from the underlying utility function (3.8). Second, the stand-in's utility function inherits the original utility function's properties with respect to effects of changes in leisure on the marginal utility of consumption. In particular, when σ > 1, the marginal utility of consumption is decreasing in leisure. Let us again think about an isolated individual maximizing lifetime utility, $\sum_{t=0}^{\infty}\beta^t u(c_t, L_t)$, but with the new momentary utility function (6.7). As with our discussion of the representative worker in Section 4 and as with our previous discussion in this section, the stand-in agent equates the marginal utility of consumption and the marginal utility of leisure to the shadow values along the economy's resource constraint ($D_1u(c_t, L_t) = \lambda_t$ and $D_2u(c_t, L_t) = \lambda_t w_t = \lambda_t A_t D_2F(k_t, N_t)$). These conditions must always hold if there is an interior optimum for work effort, i.e., $0 < L_t < 1$ in each period. Taking loglinear approximations to this pair of conditions, we find (6.8) (6.9) where $\kappa$ is pinned down by information on the steady state of the economy 50.
49 There are two steps to this demonstration. First, one shows what efficient risk-sharing implies for expected utility, and then one substitutes in for leisure using L = 1 − pH.
50 That is,

$$\kappa = \frac{L\,Dv^*(L)}{v^*(L)} = \frac{L\,D_2u(c, L)}{c\,D_1u(c, L)} = \frac{Lw}{c} = \frac{L}{N}\,\frac{(wN/y)}{(c/y)}.$$
This set of equations reveals that there is infinitely elastic labor supply even when the preference specification is not separable. That is, the pair of equations implies that the stand-in will supply any amount of work at a particular real wage. But because preferences are nonseparable, variations in work require variations in consumption. When σ > 1, in particular, workers require more consumption than nonworkers, and aggregate consumption is negatively related to leisure. Thus, this model involves a modified form of the permanent income hypothesis, which includes the effects of changes in work effort on the marginal utility of consumption. Baxter and Jermann (1999) have argued that this type of preference nonseparability will arise in any model with household production; they have also stressed that this specification can make consumption more cyclically volatile.
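The efficient risk-sharing allocation can be illustrated numerically. Equalizing the marginal utility of consumption across employed and unemployed agents under the power specification $u(c,L) = (c\,v(L))^{1-\sigma}/(1-\sigma)$ gives the closed-form ratio $c_1/c_2 = (v_1/v_2)^{(1-\sigma)/\sigma}$. In the sketch below, the functional form v(L) = L**gamma and all parameter values are illustrative assumptions, not the chapter's calibration:

```python
import numpy as np

sigma, gamma = 2.0, 0.6        # illustrative curvature and leisure weight
H, p, c_bar = 0.45, 0.7, 1.0   # shift length, employment fraction, per-capita consumption
v = lambda L: L**gamma
u = lambda c, L: (c * v(L))**(1 - sigma) / (1 - sigma)

# Brute-force the efficient allocation: choose c1, with c2 implied by the resource
# constraint p*c1 + (1-p)*c2 = c_bar (Equation (6.3)).
c1_grid = np.linspace(0.01, c_bar / p - 0.01, 200001)
c2_grid = (c_bar - p * c1_grid) / (1 - p)
welfare = p * u(c1_grid, 1 - H) + (1 - p) * u(c2_grid, 1.0)
c1, c2 = c1_grid[np.argmax(welfare)], c2_grid[np.argmax(welfare)]

# The maximizer matches the closed-form ratio, and c1 > c2 when sigma > 1.
ratio_theory = (v(1 - H) / v(1.0))**((1 - sigma) / sigma)
assert c1 > c2 and abs(c1 / c2 - ratio_theory) < 1e-2
```

With σ = 1 (log utility) the exponent is zero and the code recovers c₁ = c₂, the standard Rogerson-Hansen case.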
6.2. Capacity utilization

In the standard version of the neoclassical model, there is a dramatic contrast between the short-run and long-run elasticities of capital supply. The short-run elasticity of capital supply is zero: there is no way for the economy to increase the capital stock inherited from the previous period. In contrast, the long-run elasticity of capital supply is infinite: there is only one real interest rate consistent with the steady state of the economy. This difference between short-run and long-run elasticities stems from the assumption that capital services are proportional to the stock of capital. This is an assumption we make every time we write a production function as Y = F(K, N). While this assumption may be suitable for some purposes, it is clearly problematic for business cycle analysis. The third panel of Figure 3 suggests that capacity utilization displays pronounced cyclical variability. The fact that equipment and machinery are used more intensively in booms than in recessions is corroborated by the procyclical character of electricity consumption in manufacturing industries [Burnside, Eichenbaum and Rebelo (1995)] and by the fact that expansions are accompanied by the use of two and three shifts in manufacturing industries [Shapiro (1993)]. All this evidence suggests that the flow of capital services is high in expansions. In contrast, recessions are times when capital tends to lie idle, thus producing a small service flow. Several authors have extended the basic RBC model to incorporate variable capital utilization. Kydland and Prescott (1988) showed that introducing time-varying capital utilization enhanced the amplification capability of their 1982 model. Greenwood, Hercowitz and Huffman (1988) introduced variable utilization in a model that features
shocks to the productivity of new investment goods 51. Finn (1991) used a similar framework to study the interaction between capital utilization and energy costs. In her model, more intensive capital use accelerates the depreciation of capital and raises marginal electricity consumption. Burnside and Eichenbaum (1996) explored a model with both capital utilization and labor hoarding. They showed that these two features significantly enhance the ability of the model to propagate shocks through time 52.

Modeling variable utilization. Most studies of variable utilization assume that depreciation is an increasing function of the utilization rate 53. The benefits from variable capital utilization can be incorporated into the production function as follows:

$$Y_t = A_t F(z_t K_t, N_t X_t),$$

where $z_t$ denotes the utilization rate 54. The costs of variable capital utilization are imbedded in the following law of motion for the capital stock:

$$K_{t+1} = I_t + \left[1 - \delta(z_t)\right]K_t,$$

where $\delta(\cdot)$ is a convex, increasing function of the utilization rate 55. To determine its optimal rate of utilization, a firm maximizes its profits holding fixed its future capital stock. The marginal benefit of a higher utilization rate is additional output ($A_t D_1F(z_tK_t, N_tX_t)K_t$) and the marginal cost is higher (replacement) investment ($dI_t = \delta'(z_t)K_t$). Equating these and using the Cobb-Douglas production function, we find that efficient utilization implies

$$(1-a)\,\frac{Y_t}{z_t} = \delta'(z_t)\,K_t, \tag{6.10}$$

which is the requirement that the marginal benefit in terms of additional output produced be equated to the marginal cost in terms of additional units of capital being worn out.

The consequences of variable utilization. To explore how efficient variation in the utilization rate affects the linkages in the economy, we linearize Equation (6.10) to
51 These shocks tend to make consumption and investment move in opposite directions. Introducing capital utilization eliminates this counterfactual correlation between consumption and investment. 52 Their model is also capable of producing a hump-shaped response of investment to technology shocks - a feature that is common in empirical impulse response functions estimated using VAR techniques. 53 An exception is Kydland and Prescott (1988). 54 For simplicity, we use the Cobb-Douglas form throughout our discussion of capital utilization. 55 Thus, it has a positive first derivative $\delta'(\cdot)$ and a positive second derivative $\delta''(\cdot)$.
obtain an expression for $\hat{z}_t$ and substitute this result into the linearized production function:

$$\hat{y}_t = \hat{A}_t + a\hat{N}_t + (1-a)(\hat{k}_t + \hat{z}_t) = \hat{A}_t + (1-a)\hat{k}_t + a\hat{N}_t + \frac{1-a}{a+\xi}\left(\hat{A}_t - a\hat{k}_t + a\hat{N}_t\right). \tag{6.11}$$

In this expression $\xi$ represents the elasticity of $\delta'(z_t)$, which is positive if there is increasing marginal depreciation cost of higher utilization 56. The model without utilization occurs as a special case in which $\xi$ is driven to infinity, since in that case the quantity of capital services does not respond to changes in the marginal product of these services. At the other extreme, as $\xi$ is driven toward zero, the response of output becomes $\hat{y}_t = \frac{1}{a}\hat{A}_t + \hat{N}_t$. For this reason, time-varying capital utilization is sometimes described as leading to a short-run production function that is nearly linear in labor. Variable utilization makes the marginal product of labor - the real wage rate - less responsive to changes in labor input. The comparable log-linear expression for the real wage rate is

$$\hat{w}_t = \hat{y}_t - \hat{N}_t = \hat{A}_t + (1-a)(\hat{k}_t - \hat{N}_t) + \frac{1-a}{a+\xi}\left(\hat{A}_t - a\hat{k}_t + a\hat{N}_t\right), \tag{6.12}$$

and, as $\xi$ is driven toward zero, the response of the real wage approaches $\hat{w}_t = \frac{1}{a}\hat{A}_t$. In other words, the labor demand schedule drawn in (w, N) space "flattens" as depreciation becomes less costly on the margin ($\xi$ falls). When $\xi$ is driven to zero, the labor demand curve becomes completely flat.
7. Remeasuring productivity shocks
We have seen that productivity shocks are an essential ingredient of real business cycle models. In the absence of measurement error in labor and capital services, these shocks coincide with the Solow residual. Prescott (1986) used the Solow residual as a measure of technology shocks to conclude that these shocks "account for more than half the fluctuations in the postwar period with a best point estimate near 75%". There are three reasons to distrust the standard Solow residual as a measure of technology shocks. First, Hall (1988) has shown that the Solow residual can be forecasted using variables such as military spending, which are unlikely to cause changes in total factor productivity. Similarly, Evans (1992) showed that lagged values of various monetary aggregates also help forecast the Solow residual. Second, the conventional Solow residual implies probabilities of technological regress that are implausibly large. Burnside, Eichenbaum and Rebelo (1996) estimate that the

56 It can be shown that $\xi = z\,\delta''(z)/\delta'(z) > 0$.
probability of technological regress associated with the conventional Solow residual is 37% in US manufacturing. Finally, cyclical variations in labor effort ("labor hoarding") and capital utilization can significantly contaminate the Solow residual. There are two strategies for dealing with these extra, hard-to-measure sources of factor variation. The first strategy is to use an observable indicator to proxy for the unobserved margin. For example, since individuals working harder may have more accidents in an industrial setting, the frequency of worker accidents could be used as an indicator of unobserved effort 57. More commonly, electricity consumption in manufacturing industries is taken as an indicator of capacity utilization. The second strategy is to use implications of the model to solve out for the unobserved factor variation and then to examine other implications of the model economy. We discuss application of each of these strategies to measuring capacity utilization in the remainder of this section.

Capital utilization proxies: Burnside, Eichenbaum and Rebelo (1996) employ electricity use as a proxy for capacity utilization. In particular, assuming that the utilization rate is proportional to electricity utilization, they can use the Solow decomposition in modified form,

$$\log SR_t = \log Y_t - a \log N_t - (1-a)\left[\log K_t + \log(z_t)\right], \tag{7.1}$$

where $\log(z_t)$ is the log of electricity use. They find that when electricity use is employed as a proxy for capital services the character of the Solow residual associated with the manufacturing sector changes dramatically: (i) there is a 70% drop in the volatility of the growth rate of productivity shocks relative to output, implying that a successful model must display much stronger amplification than the basic RBC model; (ii) the hypothesis that the growth rate of productivity is uncorrelated with the growth rate of output cannot be rejected; and (iii) the probability of technological regress assumes much more plausible values, dropping to 10% in quarterly data and to 0% in annual data. These corrections to the Solow residual significantly reduce the fraction of output variability that can be explained as emanating from shocks to technology 58.

Using the model to measure capacity utilization: An alternative strategy is to use the model's implications for efficient utilization to solve for the unobserved
57 Several variants of this proxy strategy have been used to shed indirect light on the presence of labor hoarding. Bils and Cho (1994) use time and motion studies to document the presence of variability in effort. Shea (1992) uses data on on-the-job accidents to construct an indirect measure of labor hoarding. Burnside, Eichenbaum and Rebelo (1993), Sbordone (1997), and Basu and Kimball (1997) postulated a model of labor hoarding that they proceeded to use to purge the Solow residual of variations in the level of effort.
58 Aiyagari (1994) proposed a method to compute a lower bound on the contribution of technology shocks to output volatility. His procedure relies on knowledge of two moments in the data: the variability of hours relative to the variability of output and the correlation between hours worked and labor productivity (which is essentially zero in the data). Unfortunately, his method is not robust to the presence of labor hoarding or capacity utilization.
R. G. King and S. T. Rebelo
utilization decision, i.e., z_t. In essence, this empirical strategy corresponds to our theoretical method in the previous section, when we solved out for z_t in order to derive Equation (6.11), which describes how output responds to changes in labor, capital and productivity when utilization is efficiently varied. One possibility would be to exactly follow this strategy, substituting observed variations in labor and capital into Equation (6.11) to compute the productivity residual, but we use a more "reduced form" approach that we describe more fully in the next section.
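As a concrete illustration of the modified decomposition (7.1), the corrected residual can be computed directly from logged series. The data below are synthetic stand-ins generated for the purpose of the sketch, not the Burnside-Eichenbaum-Rebelo series, and the parameter value is the labor share used in this chapter:

```python
import numpy as np

# Sketch of the modified Solow decomposition (7.1). All series are in logs;
# alpha is labor's share. The series below are synthetic stand-ins.
alpha = 0.667

rng = np.random.default_rng(0)
T = 200
log_a = rng.normal(0.0, 0.005, T)      # "true" technology (unobserved in practice)
log_n = rng.normal(0.0, 0.010, T)      # hours
log_k = np.linspace(2.00, 2.05, T)     # capital stock
log_z = rng.normal(0.0, 0.020, T)      # electricity-based utilization proxy
log_y = log_a + alpha * log_n + (1 - alpha) * (log_k + log_z)

def modified_solow_residual(log_y, log_n, log_k, log_z, alpha):
    """log SR_t = log Y_t - alpha log N_t - (1 - alpha)[log K_t + log z_t]."""
    return log_y - alpha * log_n - (1 - alpha) * (log_k + log_z)

sr = modified_solow_residual(log_y, log_n, log_k, log_z, alpha)
conventional = log_y - alpha * log_n - (1 - alpha) * log_k

# By construction, the corrected residual recovers technology exactly here,
# while the conventional residual is contaminated by utilization:
print(np.allclose(sr, log_a))             # True
print(np.std(conventional) > np.std(sr))  # True
```

The point of the comparison is the one made in the text: once capital services are measured as z·K rather than K, the volatility of the residual falls.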
8. Business cycles in a high substitution economy
Motivated by the vanishing productivity shock, we now construct an economy in which small variations in productivity can have large effects on macroeconomic activity, i.e., an RBC model in which there is substantial amplification of shocks. There are two central ingredients to this model. First, as in Section 6.1, we assume that there is indivisible labor. This makes the supply of aggregate hours strongly responsive to changes in wages and intertemporal prices. Second, as in Section 6.2, we assume that there is variable capacity utilization. This makes the supply of capital services strongly responsive to changes in the level of aggregate hours. Taken together, these ingredients mean that the economy has high substitution in all factors of production. Further, the Solow residual is a very poor measure of technology shocks in our model economy. However, the very same structural feature that makes the Solow residual a bad measure of technology shocks (unmeasured variation in capital services) also provides a powerful amplification mechanism that allows our model to account for the observed output variation with much smaller shocks. Finally, our model provides a means of implicitly measuring the smaller shocks that occur, which can be viewed as a variant of Solow's approach 59.
8.1. Specification and calibration

The specification and calibration of the model follows the same general approach that we used in Section 4, but with some relatively minor modifications.
Restrictions on the steady state: First, we know that the production side of the basic model determines most aspects of the steady state and that continues to be true with variable capital utilization. The efficiency condition for utilization in the steady state determines a steady-state utilization rate such that r + δ(z) = zDδ(z), with the remainder of the steady-state relative prices and great ratios then adjusted to reflect the fact that the flow of capital services is zK rather than K.
59 The approach was suggested by Mario Crucini in unpublished research many years ago, so perhaps we should call these "Crucini residuals". Another application is contained in Burnside, Eichenbaum and Rebelo's (1993) study of unobserved effort (labor hoarding). Ingram, Kocherlakota and Savin (1997) use a similar procedure to infer information on unobserved shocks to the home production sector.
Ch. 14: Resuscitating Real Business Cycles
Table 4
Calibration of high substitution economy

σ      b       γ       α       δ       ξ     ρ        σ_ε
3      0.984   1.004   0.667   0.025   0.1   0.9892   0.0012
Second, since we are assuming an indivisible labor model, there is a different calibration of the preference side of the model. Evidence from asset pricing studies suggests that σ is larger than the unit value used in the basic model; this means that our model will have the realistic implication that more consumption is allocated to working individuals than to nonworking individuals. Drawing on Kocherlakota's (1996) review, we use σ = 3. We assume that 60% of the population is employed in the steady state and that employed individuals work 40 hours. This implies that an average individual's hours are N = 0.214, i.e., 24 hours out of a weekly 112 hours of nonsleeping time. Then, this information (including the assumed value of σ) determines the ratio v(1)/v(1 - H) which dictates the ratio of consumptions of the two types of individuals. It turns out that the ratio c_1/c_2 is 3.31, so that workers have substantially higher consumption than nonworkers. Table 4 summarizes our parameter assumptions. Unless otherwise discussed, the parameters are the same as in Table 2.
Measuring technology shocks: We use the implications of our model as discussed in the last section to produce a series on technology shocks which is consistent with unobserved variation in capacity utilization 60. In particular, we start by assuming a value for the persistence and volatility of technology shocks and solve the model. The decision rule for output can be written as

y_t = π_yk k_t + π_yA A_t.
Using this decision rule together with data for output and capital (which we logged and linearly detrended), we can compute an initial guess about the time series for technology shocks 61:

A_t = (1/π_yA) y_t - (π_yk/π_yA) k_t.
60 There is no unique way of computing this shock process, but rather any of the model's decision rules could be used in this way or these rules could be combined with other relationships in the model. For example, one could exploit the decision rule for utilization as in Burnside, Eichenbaum and Rebelo's (1993) analysis of labor hoarding, z_t = π_zk k_t + π_zA A_t, and combine this with the modified Solow decomposition (7.1). This alternative method would produce a different shock process, which leads to broadly similar, but somewhat less dramatic results. The difference between these two productivity measures lies in whether labor in Equation (7.1) is taken from the data or from the model.
61 We should not use the empirical capital stock series since these are flawed in the eyes of the model: they are computed assuming constant rates of depreciation. This can be circumvented by using a second decision rule to compute the "true" capital stock series. In practice this has little impact on the results.
This guess is not exactly right because the serial correlation coefficient (ρ) for this A_t series need not match that used to solve the model and to construct the π coefficients. Therefore, once we obtain a time series for A_t, we compute its persistence (ρ) and use this new value to solve the model again. Using the new decision rule, we recompute A_t and once again calculate its persistence. We continue this process until the new and old estimates for the serial correlation of A_t are the same. This iterative procedure yielded an estimate of 0.9892 for the first-order serial correlation and 0.0012 for the standard deviation of the innovation ε_t.
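The iterative procedure just described is a fixed-point loop. In the sketch below, `solve_model` is a hypothetical stand-in for the actual linearized model solution, which would map the assumed persistence ρ into the decision-rule coefficients π_yk and π_yA; here it simply returns fixed illustrative values, and the data are synthetic:

```python
import numpy as np

def solve_model(rho):
    # Hypothetical stand-in: solving the linearized model for an assumed
    # persistence rho would yield the decision-rule coefficients.
    pi_yk, pi_yA = 0.4, 1.2
    return pi_yk, pi_yA

def ar1_coefficient(a):
    # OLS estimate of the first-order serial correlation of the series a.
    return float(np.dot(a[1:], a[:-1]) / np.dot(a[:-1], a[:-1]))

def estimate_shock_process(log_y, log_k, rho0=0.9, tol=1e-8, max_iter=200):
    rho = rho0
    for _ in range(max_iter):
        pi_yk, pi_yA = solve_model(rho)
        a_hat = (log_y - pi_yk * log_k) / pi_yA   # invert the output decision rule
        rho_new = ar1_coefficient(a_hat)
        if abs(rho_new - rho) < tol:              # old and new estimates agree
            break
        rho = rho_new
    sigma_eps = float(np.std(a_hat[1:] - rho_new * a_hat[:-1]))
    return rho_new, sigma_eps

# Synthetic check: generate data from this structure and recover the persistence.
rng = np.random.default_rng(1)
T, rho_true = 2000, 0.95
a = np.zeros(T)
for t in range(1, T):
    a[t] = rho_true * a[t - 1] + rng.normal(0.0, 0.007)
log_k = rng.normal(0.0, 0.01, T)
log_y = 0.4 * log_k + 1.2 * a
rho_hat, sigma_hat = estimate_shock_process(log_y, log_k)
```

In the chapter's application the loop converged to ρ = 0.9892 and σ_ε = 0.0012; in this synthetic example it recovers the persistence used to generate the data.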
8.2. Simulating the high substitution economy

With a series of productivity shocks in hand, we simulated our model economy's response to these shocks just as we previously did for the standard RBC model. Figure 13 displays the results, which we think are dramatic. Panel 1 shows the model and actual paths for output, which are virtually identical. In part, this is an artifact of our procedure for constructing the technology shock, which is a weighted average of output and capital as we just discussed. For this reason, we think that the performance of the model should not be evaluated along this dimension. Instead, the model has to be judged by its predictions for other variables of interest. The remaining panels of Figure 13 display the model's implications for total hours worked, consumption and investment, with all of these series detrended with the HP filter. The correlation between the empirical and the simulated series is 0.89 for labor, 0.74 for consumption and 0.79 for investment! This remarkable correspondence leads to three sets of questions, similar to those which arose in the analysis of the standard RBC model. First, how do small variations in productivity have such dramatic effects? Second, what are the properties of the technology shocks? Third, how sensitive are the results?
8.3. How does the high substitution economy work?

The high substitution economy contains four mechanisms that substantially amplify productivity shocks and lead to strong comovements of output, labor, consumption and investment. To begin, variable capacity utilization makes output respond more elastically to productivity shocks in Equation (6.11), which we repeat here for the reader's convenience:

y_t = [ξ(1 - α)/(ξ + α)] k_t + [(1 + ξ)α/(ξ + α)] N_t + [(1 + ξ)/(ξ + α)] A_t.

Since utilization of capital increases when there is a positive productivity shock, there is a direct effect which is part of the amplification mechanism. In the limiting case of ξ = 0, for example, a labor's share of α = 2/3 implies that the productivity shock raises output by 1/α = 3/2 times its direct effect. We use a value of ξ = 0.1 in constructing our simulations, so that the effect with α = 2/3 is 1 + (1 - α)/(ξ + α) = 1 + 0.33/0.77 = 1.43. Thus, variable utilization helps create amplification, but only in a modest manner.
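The amplification numbers quoted in this paragraph follow from the coefficient on the shock in Equation (6.11), (1 + ξ)/(ξ + α) = 1 + (1 - α)/(ξ + α); the two cases in the text can be checked directly:

```python
alpha = 2.0 / 3.0

def amplification(xi, alpha):
    # Elasticity of output with respect to the technology shock when
    # utilization is varied efficiently: (1 + xi)/(xi + alpha).
    return (1.0 + xi) / (xi + alpha)

print(round(amplification(0.0, alpha), 2))   # 1.5: the 1/alpha limiting case
print(round(amplification(0.1, alpha), 2))   # 1.43: the calibrated case
```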
[Figure 13: panels for Output, Investment, and Consumption; dashed line: Model, solid line: Data.]
Fig. 13. Capacity utilization model: simulated business cycles. Sample period is 1947:2-1996:4. All variables are detrended using the Hodrick-Prescott filter.
Relative to the standard RBC model that we discussed in Section 4, most of the increased amplification in the model of this section comes from greater elasticity of the labor demand and labor supply schedules. Highly elastic labor supply is due to indivisible labor: work effort is highly responsive to small changes in its rewards. In fact, we have previously argued that it is the demand side which approximately determines this quantity in indivisible labor economies. Variable capacity utilization makes the labor demand more elastic. As discussed above, labor demand is implicit in the equation:

w_t = A_t + (α - 1)N_t + (1 - α)(k_t + z_t).

In the model without variable utilization (or with ξ = ∞), a one percent increase in labor quantity causes the real wage to fall by 0.333 percent when α = 2/3, since the coefficient on N_t is (α - 1). At the other extreme, as ξ is driven toward zero, the response of the real wage to a productivity shock approaches w_t = (1/α)A_t, i.e., the labor demand schedule becomes more elastic until it is completely elastic in the limit. With variable utilization, the combined coefficient on labor is (α - 1) + [(1 - α)/(ξ + α)]α. Using α = 2/3 and ξ = 0.10, as in our simulations, we find that the combined coefficient is (0.67 - 1) + (0.33/0.77)0.67 = -0.043: a one percent change in labor requires a decline in the wage that is an order of magnitude smaller than in the standard model.
With indivisible labor and variable utilization, a small productivity shock shifts up labor demand and calls forth a large increase in labor supply. In order to determine the exact size of this change, however, it is essential that we simultaneously determine the path of capital (k_t) and the multiplier (λ_t).
The final structural feature that is important for the simulated time series is the nonseparable form of the utility function. In the standard Hansen-Rogerson case of log utility, most of the model's change in output goes into investment rather than consumption. However, since the efficient plan calls for the allocation of more consumption to employed individuals when σ > 1, the high substitution economy displayed in Figure 13 involves more volatile consumption that corresponds more closely to the data. We return to a discussion of this feature in the context of impulse responses later in this section.
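The labor demand elasticities discussed above reduce to a small calculation, using the combined-coefficient formula quoted in the text with the calibrated values α = 0.67 and ξ = 0.10:

```python
alpha, xi = 0.67, 0.10

# Coefficient on labor in the log-linear labor demand schedule:
coeff_fixed_util = alpha - 1.0   # no utilization margin (xi -> infinity)
coeff_var_util = (alpha - 1.0) + (1.0 - alpha) / (xi + alpha) * alpha

print(round(coeff_fixed_util, 2))   # -0.33
print(round(coeff_var_util, 3))     # -0.043
```

The required wage decline per one percent change in labor is roughly an order of magnitude smaller once utilization varies, which is the source of the near-flat labor demand schedule.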
8.4. What are the properties of the shocks?

Is this remarkable coherence between data and model achieved by using an empirically unpalatable productivity shock as a driving force? Figure 14 answers this question. The first panel depicts the level of productivity, which involves a combination of the deterministic trend and stochastic component (i.e., A_t X_t). It increases through time smoothly in the manner that many economists believe is appropriate for the level of technology. The second panel of Figure 14 shows the growth rate of productivity in our economy. This graph shows that the average rate of technical progress is large
[Figure 14, panel A: Model Productivity Level.]

u(c_t, L_t) = (1/(1 - σ))[c_t v(L_t)]^{1-σ},   σ > 0, σ ≠ 1,
u(c_t, L_t) = log(c_t) + log v(L_t),   σ = 1.   (A.1)
It is easy to verify two properties of these specifications. First, if agents have a budget constraint for goods and leisure of the form c + wL ≤ w, where w is the real wage rate and 1 is the time endowment, then there is invariance of L to the level of w 68.
67 Another possibility is that utility depends on leisure in efficiency units, i.e., on leisure augmented by technological progress (L_t X_t). In this case it is sufficient to assume that u(C, LX) is homogeneous, of class C², and concave. The dependency of utility on leisure measured in efficiency units can be justified by introducing home production into the model. See Greenwood, Rogerson and Wright (1995, pp. 161-162) for a discussion.
68 This invariance extends to a setting where the budget constraint includes nonwage income which grows at the same rate as the real wage.
Second, using L'Hôpital's rule, the second case is the limiting expression of the first as σ → 1. We require that utility be sufficiently differentiable as well as concave and increasing in consumption and leisure; this implies restrictions that must be placed on v which depend on the value of σ 69. Differentiability allows us to characterize efficient allocations using variational methods. When combined with convexity of the constraint set, concavity of preferences ensures that the solution to the planner's problem is unique, whenever lifetime utility (U) is finite 70. Since, as we will see shortly, the competitive equilibrium under rational expectations coincides with the solution to the planner's problem, this guarantees that the competitive equilibrium is also unique.
Technology: The production function F() is also twice continuously differentiable, concave and homogeneous of degree one. Constant returns to scale implies that the number of firms in the competitive equilibrium is undetermined. With increasing returns to scale a competitive equilibrium does not exist because it would entail negative profits for all firms 71. In contrast, with decreasing returns to scale we would see an infinite number of infinitesimal firms whose total output would be infinite 72. Alternatively, firms would earn economic profits if, for some reason, entry were limited. We assume that F() satisfies the following limiting conditions, often referred to as Inada conditions 73:

lim_{K→∞} D_1 F(K, N) = 0,    lim_{K→0} D_1 F(K, N) = ∞.

These conditions ensure the existence of a steady state in which the level of capital is strictly positive. One can also show that they imply that labor is essential in production: F(K, 0) = 0.
69 More specifically, we assume that the function v is twice continuously differentiable. If σ = 1, then concavity requires that the function log(v) must be increasing and concave. If σ is not equal to 1, then v^{1-σ} must be increasing and concave if σ < 1 and decreasing and convex if σ > 1. In addition we need -σv(L)v''(L) > (1 - 2σ)[v'(L)]² to assure the overall concavity of u.
70 Whenever there is one path that yields infinite utility it is always possible to construct other paths (in fact a continuum of paths) that also yield infinite utility. Thus, to ensure that there is only one solution to the planner's problem we need to constrain the discount factor so that life-time utility, U, is finite. The requirement (bγ^{1-σ} < 1) involves the interaction of preferences and technology. See Alvarez and Stokey (1999) for a discussion of this type of conditions.
71 See Hornstein (1993), Rotemberg and Woodford (1995), and Chatterjee and Cooper (1993) for a discussion of models that move away from perfect competition and incorporate increasing returns to scale.
72 Suppose, for example, that the production function is Cobb-Douglas and that there is a stock of capital K and a number of labor hours N which will be divided equally among n firms. Total production will be given by Y = nA(K/n)^{α₁}(N/n)^{α₂} = AK^{α₁}N^{α₂}n^{1-α₁-α₂}. With decreasing returns to scale α₁ + α₂ < 1 and lim_{n→∞} Y = ∞.
73 As in the main text we use the notation D_iF() to refer to the partial derivative of F() with respect to its ith argument. We use DF() to refer to the total derivative of a function of a single variable.
A.2. The dynamic social planning problem

Let us consider first the case in which allocation decisions are made by a benevolent planner who maximizes the welfare of the representative agent. The solution to this problem will be a symmetric Pareto-optimum in which all agents receive the same consumption and leisure allocations.
The stationary economy: In the steady state of a deterministic version of this economy Y, C, I, and K all grow at rate γ, i.e., the model captures the Kaldor growth facts. This suggests that it is useful to write the planner's problem for this economy in terms of variables that are constant in the steady state: y = Y/X, c = C/X, i = I/X, k = K/X. Using these stationary variables the planner's problem is given by

max E_0 Σ_{t=0}^∞ β^t u(c_t, 1 - N_t)   (A.2)

subject to:

y_t = A_t F(k_t, N_t),   (A.3)
y_t = c_t + i_t,   (A.4)
γ k_{t+1} = i_t + (1 - δ)k_t,   (A.5)
k_0 > 0,   (A.6)

where β = bγ^{1-σ}.
In a deterministic environment the solution to the problem of maximizing Equation (A.2) subject to conditions (A.3)-(A.5) would be a sequence of consumption, labor supply and capital accumulation decisions: {c_t}_{t=0}^∞, {N_t}_{t=0}^∞, and {k_t}_{t=1}^∞. These decisions could be made at time zero, since no relevant information is revealed later on. In contrast, in a stochastic economy agents learn over time the realizations of the random shocks that affect their environment. It would be inefficient to ignore this information that will be available later on and cast in stone the consumption and leisure decisions at time zero. For this reason, the solution to the utility maximization problem is a set of contingency rules, which specify how much to consume and work at each point in time as a function of the state of the economy in that period. The state of the economy can, at any point in time, be summarized by two variables: the value of A_t, which influences current output and helps predict future productivity, and the value of the stock of capital. Thus contingency rules take the form c = c(k, A) and
N = N(k, A).
Dynamic programming: To use this approach, we write the planner's problem in recursive form as

V(k, A) = max_{c, N, k'} {u(c, 1 - N) + βEV(k', A')},   (A.7)

subject to:

c + γk' - (1 - δ)k = AF(k, N),   (A.8)

where we use primes (') to denote the value of a variable in the next period. The value function V(k, A) represents the expected life-time utility of the representative
agent of an economy with a capital stock equal to k and a level of productivity equal to A. Equation (A.7) decomposes this life-time utility into two parts: the utility flow that accrues in the current period, u(c, L), and the expected utility that results from starting tomorrow with a stock of capital k' and a shock A' and proceeding optimally from then on. The planner will decide today on the value of k', so this variable is known with certainty at time t. However, the value of A' will only be known in the next period, so we have to compute the expectation of βV(k', A') with respect to A': βEV(k', A') = β ∫ V(k', A') H(dA', A). Bellman's Principle of Optimality guarantees that the solution to the problem (A.2)-(A.5) coincides with the solution to the recursive problem (A.7)-(A.8) [see Stokey, Lucas and Prescott (1989), Section 9.1].
The efficiency conditions for the planning problem can be computed by forming a Lagrangian in which Equation (A.7) is the objective and Equation (A.8) the constraint. The optimal value of c is dictated by

D_1 u(c, 1 - N) = λ,   (A.9)

where λ is the multiplier associated with the constraint (A.8). The optimal value of N, which we assume has an interior solution (0 < N < 1), is given by
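Although the chapter solves the model with log-linear methods, a recursive problem of the form (A.7)-(A.8) can also be illustrated numerically by value function iteration on a grid. The sketch below strips the problem down to a textbook special case with assumed parameter values (fixed labor, no growth, no uncertainty, log utility, Cobb-Douglas production); it is purely illustrative and is not the authors' procedure:

```python
import numpy as np

# V(k) = max_{k'} { log(A k^theta + (1 - delta) k - k') + beta V(k') }
theta, beta, delta, A = 0.33, 0.96, 0.10, 1.0
grid = np.linspace(0.5, 8.0, 200)            # capital grid

resources = A * grid**theta + (1.0 - delta) * grid
c = resources[:, None] - grid[None, :]       # consumption for each (k, k') pair
util = np.where(c > 0.0, np.log(np.maximum(c, 1e-12)), -1e10)

V = np.zeros_like(grid)
for _ in range(1000):                        # Bellman operator is a contraction
    V_new = (util + beta * V[None, :]).max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

policy = grid[(util + beta * V[None, :]).argmax(axis=1)]   # decision rule k'(k)
```

The converged value function is increasing in k, and the decision rule has a fixed point at the steady-state capital stock, the grid analogue of the contingency rules k' = k'(k, A) discussed above.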
ε > 0 denotes φ''(1) times the steady-state value of κ/w, and β denotes the steady-state value of R^{-1}γ_κ, the discount factor for income streams measured in units of the adjustment-cost input. This can then be substituted into the log-linear approximation to Equation (2.12) to obtain a formula, Equation (2.15), to be used in computing markup variations. Equation (2.14) makes it clear that the cyclical variations in the labor input are the main determinant of the cyclical variations in Q. The factor Q will tend to be high when hours are temporarily high (both because they have risen relative to the past and because they are expected
28 In this equation, s_H refers to wH/PY as before. In order for this to correspond to labor compensation as a share of value added, one must assume that the adjustment-cost inputs are not purchased from outside the sector of the economy to which the labor-share data apply. However, to a first-order approximation, it does not matter whether the adjustment costs are internal or external, as discussed below.
29 More generally, we shall use the notation γ_xt to denote the growth rate x_t/x_{t-1}, for any state variable x.
J.J. Rotemberg and M. Woodford
to fall in the future), and correspondingly low when they are temporarily low. Thus, it tends to increase the degree to which implied markups are countercyclical 30.
More precisely, the factor Q tends to introduce a greater negative correlation between measured markups and future hours. Consider, as a simple example, the case in which hours follow a stationary AR(1) process given by

H_t = ρ H_{t-1} + ε_t,

where 0 < ρ < 1, and ε is a white-noise process. Then Q_t is a positive multiple of H_t - λH_{t-1}, where λ = (1 + β(1 - ρ))^{-1}, and cov(Q_t, H_{t+j}) is of the form C(1 - λρ)ρ^j for all j ≥ 0, where C > 0, while it is of the form C(1 - λρ^{-1})ρ^{-j} for all j < 0. One observes (since ρ < λ < 1/ρ) that the correlation is positive for all leads j ≥ 0, but negative for all lags j < 0. Thus this correction would make the implied markup series more negatively correlated with leads of hours, but less negatively correlated with lags of hours. The intuition for this result is that high lagged levels of hours imply that the current cost of producing an additional unit is relatively low (because adjustment costs are low) so that current markups must be relatively high. Since, as we showed earlier, the labor share is more positively correlated with lags of hours (and more negatively correlated with leads of hours) this correction tends to make computed markup fluctuations more nearly coincident with fluctuations in hours. To put this differently, consider the peak of the business cycle where hours are still rising but expected future hours are low. This correction suggests that marginal cost is particularly high at this time because there is little future benefit from the hours that are currently being added.
The last two columns of Table 2 show the effect of this correction for ε equal to 4 and 8 while β is equal to 0.99. To carry out this analysis, we need an estimate of E_t γ_{H,t+1}. We obtained this estimate by using one of the regressions used to compute expected output growth in Rotemberg and Woodford (1996a). In particular, the expectation at t of H_{t+1} is the fitted value of a regression of H_{t+1} on the values at t and t - 1 of H, the rate of growth of private value added and the ratio of consumption of nondurables and services to GDP.
Subtracting the actual value of H_t from this fitted value, we obtain E_t γ_{H,t+1}. This correction makes the markup strongly countercyclical and ensures that the correlation of the markup with the contemporaneous value of the cyclical indicator is larger in absolute value than the correlation with lagged values of this indicator. On the other hand, the correlation with leads of the indicator is both negative and larger still in absolute value, particularly when ε is equal to 8.
The same calculations apply, to a log-linear approximation, in the case that the adjustment costs take the form of less output from a given quantity of labor inputs.
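The sign pattern claimed in the AR(1) example above can be verified directly from the closed-form covariances, normalizing var(H_t) to one:

```python
beta, rho = 0.99, 0.9
lam = 1.0 / (1.0 + beta * (1.0 - rho))
assert rho < lam < 1.0 / rho    # the case discussed in the text

def cov_QH(j):
    # cov(H_t - lam H_{t-1}, H_{t+j}) for an AR(1) H with unit variance:
    if j >= 0:
        return rho**j - lam * rho**(j + 1)      # = (1 - lam*rho) * rho**j
    return rho**(-j) - lam * rho**(-j - 1)      # = (1 - lam/rho) * rho**(-j)

print(all(cov_QH(j) > 0 for j in range(0, 6)))     # True: positive at leads
print(all(cov_QH(j) < 0 for j in range(-6, 0)))    # True: negative at lags
```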
30 Even though they allow for costs of changing employment, Askildsen and Nilsen (1997) do not find any industries with countercyclical markups in their study of Norwegian manufacturing industries. However, their adjustment-cost parameter is often estimated to have the wrong sign and one would expect the markups computed on the basis of these estimates to be procyclical.
Ch. 16: The Cyclical Behavior of Prices and Costs
Suppose that in the above description of production costs, H refers to the hours that are used for production purposes in a given period, while Hφ(γ_H) indicates the number of hours that employees must work on tasks that are created by a firm's variation of its labor input over time. (In this case, κ = w.) Equations (2.12) and (2.13) continue to apply, as long as one recalls that H and s_H now refer solely to hours used directly in production. Total hours worked equal ΔH instead, and the total labor share equals Δs_H, where Δ = 1 + φ(γ_H). But in the log-linear approximation, Δ has no first-order variation, and so Equations (2.14) and (2.15) still apply, even if γ_H and s_H refer to fluctuations in the total labor inputs hired by firms.
A more realistic specification of adjustment costs would assume costs of adjusting employment, rather than costs of adjusting the total labor input as above 31. Indeed, theoretical discussions that assume convex costs of adjusting the labor input, as above, generally motivate such a model by assuming that the hours worked per employee cannot be varied, so that the adjustment costs are in fact costs of varying employment. In the data, however, employment variations and variations in total person-hours are not the same, even if they are highly correlated at business-cycle frequencies. This leads us to suppose that firms can vary both employment N and hours per employee h, with output given by F(K, zhN), and that costs of adjusting employment in period t are given by κ_t N_t ψ(N_t/N_{t-1}). If, however, there are no costs of adjusting hours, and wage costs are linear in the number of person-hours hired Nh, firms will have no need ever to change their number of employees (which is clearly not the case). If, then, one is not to assume costs of adjusting hours per employee 32, one needs to assume some other motive for smoothing hours per employee, such as the sort of non-linear wage schedule discussed above.
We thus assume that a firm's wage costs are equal to W(h)N, where W(h) is an increasing, convex function as above. One can then again compute the marginal cost of increased output at some date, assuming that it is achieved through an increase in employment at that date only, holding fixed the number of hours per employee h at all dates, as well as other inputs. One again obtains Equation (2.12), except that the definition of Q in Equation (2.13) must be modified to replace γ_H by γ_N, the growth rate of employment, throughout. [In the modified Equation (2.13), w now refers to the average wage, W(h)/h.] Correspondingly, Equation (2.15) is unchanged, while Equation (2.14) becomes

Q_t = ε(γ_{Nt} - βE_t γ_{N,t+1}).   (2.16)
31 Bils and Cho (1994) assume a convex cost of adjusting the employee-to-capital ratio, interpreting this as a cost of changing the organization of production, rather than a cost of hiring and firing employees. Because most variations in the employment-to-capital ratio at business-cycle frequencies are due to variations in employment, the consequences of such a specification are similar to those of the more familiar assumption of convex costs of changing the number of employees.
32 Studies that estimate separate adjustment costs for variations in employment and in the number of hours worked per employee, such as Shapiro (1986), tend to find insignificant adjustment costs for hours.
Thus one obtains, as in the simpler case above, a correction to Equation (2.5) that results in the implied markup series being more countercyclical (since employment is strongly procyclical, just as with the total labor input).
Alternatively, one could compute the marginal cost of increased output, assuming that it is achieved solely through an increase in hours per employee, with no change in employment or in other inputs. In this case, one obtains again Equation (2.11), but with H everywhere replaced by h in the first factor on the right-hand side. There is no contradiction between these two conclusions. For the right-hand sides of Equations (2.11) and (2.12) should be equal at all times; cost-minimization requires that

W'(h_t) = Q_t w_t,   (2.17)

which implies that Q_t equals the ratio of the marginal wage to the average wage. Condition (2.17) is in fact the Euler equation that Bils (1987) estimates in his "second method" of determining the cyclicality of the marginal wage; he uses data on employment and hours variations to estimate the parameters of this equation, including the parameters of the wage schedule W(h) 33. An equivalent method for determining the cyclicality of markups would thus be to determine the importance of employment adjustment costs from estimation of Equation (2.17), and compute the implied markup variations using Equations (2.15) and (2.16). Insofar as the specification (2.17) is consistent with the data, both approaches should yield the same implied markup series. It follows that Bils' results using his second method give an indication of the size of the correction that would result from taking account of adjustment costs for employment, if these are of the size that he estimated. His estimate of these adjustment costs implies an elasticity of Q even greater than the value of 1.4 discussed above.

2.2.5. Labor hoarding

Suppose now that not all employees on a firm's payroll are used to produce current output at each point in time.
For example, suppose that of the H hours paid for by the firm at a given time, H_m are used in some other way (let us say, maintenance of the firm's capital), while the remaining H - H_m are used to produce the firm's product. Output is then given by Y = F(K, z(H - H_m)) rather than Equation (2.1). We can again
33 Bils is able to estimate this equation by assuming parametric functional forms for the functions W'(h) and ψ(γ_N), and assuming that κ_t is a constant multiple of the straight-time wage. He also notes that the term w_t should refer not simply to the average hourly wage, but to total per-employee costs divided by hours per employee; the numerator thus includes the costs of other expenses proportional to employment but independent of the number of hours worked per employee, such as payments for unemployment insurance. In fact, identification of the parameters in Equation (2.17) is possible only because w_t is assumed not to be given by a time-invariant function W(h_t)/h_t, but rather by (W(h_t) + F_t)/h_t, where the shift term F_t representing additional per-employment costs is time-varying in a way that is not a function of h_t.
Ch. 16: The Cyclical Behavior of Prices and Costs
1077
compute the marginal cost of increasing output by hiring additional hours, holding Hm fixed (along with other inputs). One then obtains, instead of Equation (2.5),

(2.18)
where uH = (H − Hm)/H is the fraction of the labor input that is utilized in production. Note that this conclusion is quite independent of how we specify the value to the firm of the alternative use to which the hours Hm may be put. It suffices that we believe that the firm is profit-maximizing in its decision to allocate the hours that it purchases in this way, as in its other input decisions, so that the marginal cost of increasing production by shifting labor inputs away from maintenance work is the same as the cost of increasing production by hiring additional labor. The fraction uH is often argued to be procyclical, insofar as firms are said to "hoard labor" during downturns in production, failing to reduce payrolls to the extent of the decline in the labor needed to produce their output, so as not to have to increase employment by as much as the firms' labor needs increase when output increases again. For example, the survey by Fay and Medoff (1985) finds that when output falls by 1%, labor hours used in production actually fall by 1.17%, but hours paid for fall only by 0.82% 34.
Insofar as this is true, it provides a further reason why markups are more countercyclical than would be indicated by Equation (2.5) alone 35. If the Fay and Medoff numbers are correct, and we assume furthermore that nearly all hours paid for are used in production except during business downturns, they suggest that uH falls when output falls, with an elasticity of 0.35 (or an elasticity of about 0.4 with respect to declines in reported hours). Thus this factor alone would justify setting b = −0.4 in Equation (2.9). A related idea is the hypothesis that effective labor inputs vary procyclically more than do reported hours because of procyclical variation in work effort. We may suppose in this case that output is given by Y = F(K, zeH), where e denotes the level of effort exerted. If, however, the cost of a marginal hour (which would represent e units of effective labor) is given by the reported hourly wage W, then Equation (2.5) continues to apply. Here the presence of time-variation in the factor e has effects that are no different from those of time-variation in the factor z, both of which represent changes in the productivity of hours worked; the fact that e may be a choice variable of the

34 Of the remaining hours paid for, according to survey respondents, about two-thirds represent an increase in employee time devoted to non-production tasks, while the other third represents an increase in employee time that is not used at all. Fair (1985) offers corroborating evidence.

35 Models in which output fluctuations result from changes in firms' desired markups can also explain why labor hoarding should be counter-cyclical, as is discussed further in Section 2.3. At least some models in which fluctuations in output result from shifts in the real marginal cost schedule have the opposite implication: periods of low labor costs should induce increases both in the labor force employed in current production and in the labor force employed in maintenance tasks.
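The elasticity quoted above follows from simple arithmetic on the Fay and Medoff point estimates; a minimal sketch (the variable names are ours, not from the survey):

```python
# Fay-Medoff (1985) point estimates for a 1% fall in output:
d_output = -1.0       # % change in output
d_hours_used = -1.17  # % change in hours actually used in production
d_hours_paid = -0.82  # % change in hours paid for

# u_H = (hours used)/(hours paid), so in percentage terms the change in u_H
# is the difference between the two hours changes.
d_utilization = d_hours_used - d_hours_paid  # about -0.35

# Elasticities of u_H with respect to output and to reported (paid) hours:
elasticity_wrt_output = d_utilization / d_output  # about 0.35
elasticity_wrt_hours = d_utilization / d_hours_paid  # about 0.43, i.e. roughly 0.4

print(round(elasticity_wrt_output, 2), round(elasticity_wrt_hours, 2))
```

The second number is the elasticity with respect to declines in reported hours mentioned in the text, which rounds to roughly 0.4 and motivates the choice b = −0.4.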
1078
J.J. Rotemberg and M. Woodford
firm while z is not has no effect upon this calculation. Note that this result implies that variations in the relation between measured hours of work and the true labor input to production due to "labor hoarding" are not equivalent in all respects to variations in effort, despite the fact that the two phenomena are sometimes treated as interchangeable 36. If we allow for variation in the degree to which the measured labor input provides inputs to current production (whether due to labor hoarding or to effort variations), one could also, in principle, measure marginal cost by considering the cost of increasing output along that margin, holding fixed the measured labor input. Consideration of this issue would require modeling the cost of higher utilization of the labor input for production purposes. One case in which this does not involve factors other than those already considered here is if higher effort requires that labor be better compensated, owing to the existence of an effort-wage schedule w(e) of the kind assumed by Sbordone (1996). In this case the marginal cost of increasing output by demanding increased effort results in an expression of the form (2.11), where now ω = e w′(e)/w(e). If, at least in the steady state, the number of hours hired is such that the required level of effort is cost-minimizing, and that cost-minimizing effort level is unique, then (just as in our discussion above of a schedule specifying the wage as a function of hours per employee) the elasticity ω will be an increasing function of e, at least near the steady-state level of effort. The existence of procyclical effort variations would then, under this theory, mean that implied markup variations are more countercyclical than one would conclude if the effort variations were not taken into account. This does not contradict the conclusion of the paragraph before last.
For in a model like Sbordone's, effort variations should never be used by a firm in the absence of adjustment costs for hours or employment (or some other reason for increasing marginal costs associated with increases in the measured labor input, such as monopsony power). In the presence, say, of adjustment costs, consideration of the marginal cost of increasing output through an increase in the labor input leads to Equation (2.12), rather than to Equation (2.5); this is consistent with the above analysis, since a cost-minimizing choice of the level of effort to demand requires that

ω(e) = Ω (2.19)
at all times. It is true (as argued two paragraphs ago) that variable effort requires no change in the derivation of Equation (2.12). But observation of procyclical effort variations could provide indirect evidence of the existence of adjustment costs, and hence of procyclical variation in the factor Ω. A further complication arises if the cost to the firm of demanding greater effort does not consist of higher current wages. Bils and Kahn (1996), for example, assume

36 For example, models of variable effort are sometimes referred to as models of "labor hoarding", as in Burnside et al. (1993).
that there exists a schedule w(e) indicating the effective cost to the firm of demanding different possible effort levels, but that the wage that is actually paid is independent of the current choice of e, due to the existence of an implicit contract between firm and worker of the form considered in Hall (1980). They thus suppose that the current wage equals w(e*), where e* is the "normal" (or steady-state) level of effort. In this case, Equation (2.12) should actually be (2.20) If effort variations are procyclical, the factor w(e)/w(e*) is procyclical, and so this additional correction makes implied real marginal costs even more procyclical. In their empirical work Bils and Kahn (1996) relate w(e)/w(e*) to variations in energy consumption per unit of capital, and show that this correction makes marginal cost significantly procyclical in four of the six industries they study. Interestingly, these four industries have countercyclical marginal costs when they ignore variations in the cost of labor that result from variations in effort.

2.2.6. Variable utilization of capital

It is sometimes argued that the degree of utilization of firms' capital stock is procyclical as well, and that the production function is therefore properly a function of "effective" capital inputs that do not coincide with the measured value of firms' capital stocks. If by this one means that firms can produce more from given machines when more labor is used along with them, then it is not clear that "variable utilization" means anything that is not already reflected in a production function of the form (2.1). Suppose, however, that it is possible for a firm to vary the degree of utilization of its capital stock other than by simply increasing its labor-to-capital ratio, and that the production function is actually of the form Y = F(uK K, zH), where uK measures the degree of utilization of the capital stock K.
Even so, the derivation of Equation (2.5) is unaffected [and the same is true of subsequent variations on that equation, such as (2.8), (2.11), (2.12) and (2.18)]. The reason is that variation in capital utilization has no consequences for those calculations different from the consequences of time variation in the capital stock itself. It is simply necessary to replace y in Equation (2.6) by y/uK. In the case of an isoelastic production function (2.3), the methods of calculating implied markup variations discussed above do not need to be modified at all. Variable capital utilization matters in a more subtle way if one assumes that capital utilization depends upon aspects of the firm's labor input decisions other than the total labor input H. For example, Bils and Cho (1994) argue that capital utilization should be an increasing function of the number of hours worked per employee, the idea being that if workers remain on the shop floor for a longer number of hours each week, the capital stock is used for more hours as well (increasing the effective capital inputs
used in production), whereas a mere increase in the number of employees, with no change in the length of their work-week, does not change the effective capital inputs used in production 37. Under this hypothesis, the aggregate production function is given by Y = F(uK(h)K, zhN). This modification again has no effect upon the validity of the derivation of Equation (2.12) from a consideration of the cost of increasing output by varying employment, holding hours per employee fixed [except, again, for the modification of Equation (2.6)]. Thus Equation (2.15) becomes
μ̂ = aŷ − aλĥ − ŝH − Ω̂, (2.21)
where λ is the elasticity of uK with respect to h, while Equation (2.16) is unchanged. If one assumes a = 0 [as Bils (1987) does], this would mean no change in the implied markup variations obtained using this method (which, as we have argued, is equivalent to Bils' "second method") 38. Assuming that uK depends upon h does affect our calculation of the cost of increasing output by increasing hours per employee. In particular, Equation (2.11) must instead be replaced by (2.22) where ηK is the elasticity of output with respect to the effective capital input. However, while the presence of λ > 0 in Equation (2.22) is of considerable importance for one's estimate of the average level of the markup (it increases it), it has less dramatic consequences for implied markup fluctuations. In the Cobb-Douglas case, ηH and ηK are both constants, and implied percentage variations in markups are independent of the assumed size of λ. Thus this issue has no effect upon the computations of Bils (1987). If we maintain the assumption of constant returns but depart from the Cobb-Douglas case by supposing that ηH is countercyclical (because εKH < 1), then allowance for 0 < λ ≤ 1 makes the factor ηH + ληK less countercyclical. This occurs for two reasons: first, the factor ηH + ληK decreases less with decreases in ηH (and in the limit of λ = 1, it becomes a constant), and second, the factor y/uK (upon which ηH depends) is again less procyclical. Nonetheless, even if we assume that all countercyclical
37 They provide evidence of a statistical correlation between hours per worker and other proxies for capital utilization. Their econometric results are consistent with an assumption that capital utilization is proportional to hours per employee, a result that also has a simple interpretation in terms of a common work-week for all inputs. On the other hand, as Basu and Kimball (1997) note, this correlation need not indicate that firms are forced to vary the two quantities together.

38 More generally, belief that λ should take a significant positive value, perhaps on the order of 1, reduces the significance of variations in ηH as a contribution to implied markup variations, since both y and h are strongly procyclical. It is not plausible, however, to suppose that λ should be large enough to make y − λh a significantly countercyclical factor.
variation in this factor is eliminated, implied markup variations will still be as strongly countercyclical as they would be with a Cobb-Douglas production function. To sum up, there are a number of reasons why the simple ratio of price to unit labor cost is likely to give an imprecise measure of cyclical variations in the markup. As it happens, many of the more obvious corrections to this measure tend to make implied markups more countercyclical than is that simple measure. Once at least some of these corrections are taken account of, one may easily conclude that markups vary countercyclically, as is found by Bils (1987) and Rotemberg and Woodford (1991).
2.3. Alternative measures of real marginal cost

Our discussion in Sections 2.1 and 2.2 has considered for the most part a single approach to measuring real marginal cost (or equivalently, the markup), which considers the cost of increasing output through an increase in the labor input. However, as we have noted, if firms are minimizing cost, the measures of real marginal cost that one would obtain from consideration of each of the margins along which it is possible to increase output should move together; thus each may provide, at least in principle, an independent measure of cyclical variations in markups. While cyclical variation in the labor input is clearly important, cyclical variations in other aspects of firms' production processes are observed as well. We turn now to the implications of some of these for the behavior of real marginal cost.
2.3.1. Intermediate inputs

Intermediate input use (energy and materials) is also highly cyclical. Insofar as the production technology does not require these to be used in fixed proportions with primary inputs [and Basu (1995) presents evidence that in US manufacturing industries it does not], this margin may be used to compute an alternative measure of real marginal cost. Consideration of this margin is especially attractive insofar as these inputs are not plausibly subject to the kind of adjustment costs involved in varying the labor input [Basu and Kimball (1997)], so that at least some of the measurement problems taken up in Section 2.2 can be avoided. Suppose again that gross output Q is given by a production function Q(V, M), where V is an aggregate of primary inputs, and M represents materials inputs. Then, considering the marginal cost of increasing output by increasing materials inputs alone yields the measure

μ = QM / pM, (2.23)

by analogy with Equation (2.2). [Note that in Equation (2.23), μ refers to the "gross output" markup which we called μG in Equation (2.10). Also note that P now refers to the price of the firm's product, and not a value-added price index as before.] Under
the assumption that Q exhibits constant returns to scale 39, QM is a decreasing function of M/V, or equivalently of the materials ratio m = M/Q. In this case, log-linearization of Equation (2.23) yields

μ̂ = ε m̂ − p̂M, (2.24)

where ε < 0 is the elasticity of QM with respect to m, and p̂M indicates percentage fluctuations in the relative price of materials pM = PM/P. Both terms on the right-hand side of Equation (2.24) provide evidence that markups vary counter-cyclically. Basu (1995) shows that intermediate inputs (energy and materials) rise relative to the value of output in expansions, at least when these are not due to technology shocks 40. Basu furthermore assumes that pM is equal to one because he views materials inputs as indistinguishable from final output. Under this assumption, the increase of m in booms immediately implies that markups are countercyclical. In fact, however, goods can be ranked to some extent by "stage of processing"; all goods are not used as both final goods and intermediate inputs of other sectors to the same extent. And it has long been observed that the prices of raw materials rise relative to those of finished goods in business expansions, and fall relative to those of finished goods in contractions [e.g., Mills (1936), Means et al. (1939)]. Murphy, Shleifer and Vishny (1989) show that this pattern holds up consistently both when they consider broad categories of goods grouped by stage of processing, and when they consider particular commodities that are important inputs in the production of other particular goods. Hence it would seem that for the typical industry, pM is a procyclical variable. Because of Equation (2.24), this would itself be evidence of countercyclical markup variation, even if one regarded QM as acyclical. The combination of these two facts clearly supports the view that real marginal costs are procyclical, and hence that markups are countercyclical.
Note that in the case that the production function Q(V, M) is isoelastic in M, Equation (2.23) implies that μ should be inversely proportional to the share of materials costs in the value of gross output, sM = pM m. Thus in this case the materials share would directly provide a suitable proxy for variations in real marginal cost, just as in our previous discussion of the labor share. However, this specification (implying a unit elasticity of substitution between intermediate and primary inputs) is hardly plausible. Rotemberg and Woodford (1996b) estimate elasticities of substitution for 20 two-digit manufacturing sectors, and find an average elasticity of less than 0.7. Basu's
39 This assumption allows for increasing returns, but requires that they take the form of increasing returns in the value-added production function V(K, zH).

40 This is shown in the fourth row of his Table 5. He regresses the percentage change in m on the percentage change in Q, for each of 21 two-digit US manufacturing industries. He instruments output growth using the Ramey-Hall instruments for non-technological aggregate disturbances. He also shows that intermediate inputs rise more than does a cost-weighted average of primary inputs (labor and capital), using the same instruments; as one should expect, the regression coefficient in this case is much larger.
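The mechanics of Equation (2.24) can be illustrated with a small numerical sketch. The percentage deviations and elasticity values below are our illustrative assumptions, not estimates from the chapter, except that ε = −1 corresponds to the inverse-materials-share proxy and a value below −1 corresponds to an elasticity of substitution below unity:

```python
# Log-linearized markup from the materials margin, Equation (2.24):
#   mu_hat = eps * m_hat - pM_hat,   with eps < 0.
# Illustrative cyclical movements (percent deviations in a boom):
m_hat = 1.0   # materials ratio m = M/Q rises in booms
pM_hat = 0.5  # relative price of materials rises in booms

def markup_deviation(eps, m_hat, pM_hat):
    """Percent deviation of the markup implied by Equation (2.24)."""
    return eps * m_hat - pM_hat

share_proxy = markup_deviation(-1.0, m_hat, pM_hat)  # eps = -1: inverse materials share
lower_subst = markup_deviation(-2.0, m_hat, pM_hat)  # eps < -1: substitution elasticity below one

# Both imply the markup falls in booms (countercyclical markups), and the
# fall is larger (real marginal cost more procyclical) when eps < -1.
print(share_proxy, lower_subst)  # -1.5 -2.5
```

With a procyclical materials ratio and a procyclical relative materials price, any ε < 0 delivers a countercyclical markup, and the materials-share proxy (ε = −1) understates the effect whenever ε < −1.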
(1995) estimate of the response of m to changes in the relative price of primary and intermediate inputs suggests an elasticity of substitution half that size 41. Thus it seems most likely that instead ε < −1 in Equation (2.24). If the materials ratio m is procyclical, as found by Basu, it follows that real marginal costs are actually more procyclical than is indicated by the materials share alone. A related measure is used by Domowitz, Hubbard and Petersen (1986), who measure "price-cost margins" defined as the ratio of price to "average variable cost". They measure this as the ratio of industry revenues to the sum of labor and materials costs, which is to say, as the reciprocal of the sum of the labor and materials shares. This should correspond to the markup as we have defined it only under relatively special circumstances. If the production function is isoelastic in both labor inputs and materials inputs, then real marginal cost is proportional to the labor share (as explained in Section 2.1), and also proportional to the materials share (as explained in the previous paragraph). It then follows that these two shares should move in exact proportion to one another, and hence that their sum is a multiple of real marginal cost as well. Domowitz et al. report that this sum is somewhat countercyclical for most industries, and as a result they conclude that price-cost margins are generally procyclical. However, the conditions under which this measure should correspond to variations in the markup of price over marginal cost are quite restrictive, since they include all of the conditions required for the labor share to be a valid measure of real marginal cost, and all of those required for the materials share to be a valid measure. We have reviewed in Section 2.2 a number of reasons why the labor share is probably less procyclical than is real marginal cost.
Similar considerations apply in the case of the materials share, although the likely quantitative importance of the various corrections is different in the two cases; in the case of materials, an elasticity of substitution below unity is probably the more important correction, while adjustment costs are probably much less important. Nonetheless, one must conclude, as with our previous discussion of the labor share alone, that real marginal cost is likely to be significantly more procyclical than is indicated by the Domowitz et al. measure of "average variable cost" 42.
41 The last line of his Table 5 indicates an increase in m of only 0.12% for each 1% increase in the relative price of primary and intermediate inputs. His estimates of the cyclicality of materials input use indicate three times as large an elasticity for M/V as for M/Q (comparing lines 2 and 4 of that table), though the estimated elasticity of M/V is reduced when labor hoarding is controlled for. This would suggest an increase in M/V of at most 0.36% for each percent increase in the relative price of inputs.

42 Similar issues arise with the study of Felli and Tria (1996), who use the price divided by overall average cost as a measure of the markup. They compute this by dividing total revenue by total cost, including an imputed cost of capital (which depends on a measure of the real interest rate). Leaving aside the difficulties involved in measuring the cost of capital, it is hard to imagine that adding together the shares of labor, materials and capital is appropriate for computing markups unless each share in isolation is appropriate as well. In addition, the existence of adjustment costs of capital probably makes the marginal cost that results from producing an additional unit by adding capital considerably more procyclical than average capital cost. These adjustment costs may also rationalize the dynamic relation they find between their ratio of average cost to output and output itself.
2.3.2. Inventory fluctuations

Another margin along which firms may increase the quantity of goods available for sale in a given period is by drawing down inventories of finished goods. For a cost-minimizing firm, the marginal cost of drawing down inventories must at all times equal the marginal cost of additional production, and thus measurement of the costs of reduced inventories provides another potential (indirect) measure of the behavior of marginal cost. The following simple framework will clarify what is involved in such an analysis. Inventories at the end of period t, I_{t+1}, equal I_t + Q_t − S_t, where Q_t is production at t and S_t are sales at t. It is thus possible for a firm to keep its path of sales (and hence revenues) unchanged, increasing production and inventories at time t by one unit while reducing production by one unit at time t + 1. If the firm's production and inventory holding plan is optimal, such a marginal deviation should not affect the present value of its profits. For the typical firm, the proposed deviation raises nominal costs by the marginal cost of production at t, c_t, while lowering them by the present value of the marginal cost of production at t + 1, and also raising profits by the marginal benefit of having an additional unit of inventory at the end of t. Denoting the real value of this latter marginal benefit by b(I_t, Z_t), where Z_t denotes other state variables at date t that may affect this benefit, we have

c_t = E_t [R_{t,t+1} c_{t+1}] + P_t b(I_t, Z_t)

as a first-order condition for optimal inventory accumulation by the firm, where P_t is the general price level at date t (not necessarily the price of the firm's output), and R_{t,t+1} is a stochastic discount factor for nominal income streams. This may equivalently be written

c_t / P_t = E_t [ρ_{t,t+1} (c_{t+1} / P_{t+1})] + b(I_t, Z_t), (2.25)
where now ρ_{t,t+1} is the corresponding discount factor for real income streams. Given an assumption about the form of the marginal benefit function b(I, Z), observed inventory accumulation then provides evidence about real marginal costs in an industry; more precisely, about the expected rate of change in real marginal costs. The early studies in this literature [e.g., Eichenbaum (1989), Ramey (1991)] have tended to conclude that real marginal cost is countercyclical. The reason is that they assume that the marginal benefit of additional inventories should be decreasing in the level of inventories (or equivalently, that the marginal cost of holding additional inventories is increasing); the finding that inventories are relatively high in booms then implies that b is low, from which the authors conclude that real marginal costs
must be temporarily low 43. Eichenbaum interprets the countercyclical variation in real marginal costs as indicating that output fluctuations are driven by cost shocks, while Ramey stresses the possibility that increasing returns to scale could be so pervasive that marginal cost could actually be lower in booms. Regardless of the explanation, if the finding of countercyclical real marginal costs is true for the typical sector, it would follow that markups in the typical sector must be procyclical. This is indeed the conclusion reached by Kollman (1996). Bils and Kahn (1996) argue, instead, that real marginal cost is procyclical in each of the six production-for-stock industries that they investigate. The differing conclusion hinges upon a different conclusion about cyclical variation in the marginal benefits of additional inventories. They begin by observing that inventory-to-sales ratios do not vary secularly. This suggests that the function b is homogeneous of degree zero in inventories and sales; specifically, they propose that b is a decreasing function, not of I alone, but of I/S 44. A similar conclusion follows from noticing that inventory-to-sales ratios are fairly constant across different models of automobiles at a given point in time, even though these models differ dramatically in the volume of their sales. But this implies that b is actually higher in booms. The reason is that, as Bils and Kahn show, the ratio of inventories to sales is strongly countercyclical; while inventories rise in booms, they rise by less than do sales. Thus the marginal value of inventories must be high in booms, and, as a result, booms are periods in which real marginal costs are temporarily high. This conclusion is consistent both with the traditional view that diminishing returns result in increasing marginal costs, and with the view that business cycles are not primarily due to shifts in industry cost curves.
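The direction of this inference can be checked with a toy calculation. Assuming, purely for illustration, a marginal-benefit function that is decreasing in the inventory-to-sales ratio (the functional form and the numbers are ours, not Bils and Kahn's):

```python
# Toy marginal benefit of inventories, decreasing in I/S (illustrative form):
def b(inv, sales, b0=1.0, theta=1.0):
    """Marginal benefit of an extra unit of inventory, b0 * (I/S)**(-theta)."""
    return b0 * (inv / sales) ** (-theta)

# Normal times vs. a boom in which inventories rise, but by less than sales,
# so that I/S falls (the countercyclical pattern Bils and Kahn document):
b_normal = b(inv=100.0, sales=100.0)  # I/S = 1.0
b_boom = b(inv=105.0, sales=110.0)    # I/S is about 0.95

# By Equation (2.25), a higher marginal benefit b means current real marginal
# cost is high relative to its discounted expected future value: booms are
# periods of temporarily high real marginal cost.
print(b_boom > b_normal)  # True
```

The same numbers read through the earlier assumption (b decreasing in the *level* I) would point the opposite way, since inventories themselves rise in the boom; the homogeneity assumption is what reverses the sign of the inference.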
As noted earlier, Bils and Kahn also show that their inventory-based measures of real marginal cost covary reasonably closely with a wage-based measure of the kind discussed above, once one corrects the labor cost measure for the existence of procyclical work effort as in Equation (2.20). If their conclusion holds for the typical industry, and not just the six that they consider, it would have to imply countercyclical markup variations 45.
43 This aspect of inventory behavior has been much discussed as an embarrassment to the "production smoothing" model of inventory demand, which implies that inventories should be drawn down in booms [e.g., Blinder (1986)]. That prediction is obtained by adjoining to Equation (2.25) the assumptions that b is decreasing in I and that real marginal cost is increasing in the level of production Q.

44 A theoretical rationale for this is provided in terms of a model of the stockout-avoidance demand for inventories.

45 The price data for the particular industries considered by Bils and Kahn are ambiguous in this regard; they find that (given their measures of variations in marginal cost) markups are countercyclical in some industries but procyclical in others. This means that certain of their sectors have strongly procyclical relative prices for their products, something that cannot be true of industries in general.
2.3.3. Variation in the capital stock

A final way in which output can be increased is by increasing the stock of capital 46. Thus

μ = P FK(K, zH) / E(r), (2.26)

where E(r) is the expected cost of increasing the capital stock at t by one unit while leaving future levels of the capital stock unchanged. Assuming that the capital stock at t can actually be changed at t, but also allowing for adjustment costs, r_t equals

r_t = P_{K,t} + c_{I,t} − (1 − δ) E_t [R_{t,t+1} (P_{K,t+1} + c_{I,t+1})],

where P_{K,t} is the purchase price of capital at t, c_{I,t} is the adjustment cost associated with increasing the capital stock at t by one unit, and δ is the depreciation rate. It then becomes possible to measure changes in μ by differentiating Equation (2.26). This is somewhat more complicated than the computation of marginal cost using either labor or materials, because the rental rate of capital r cannot be observed directly; it must be inferred from a parametric specification for c_I. A related exercise is carried out by Galeotti and Schiantarelli (1998). After specifying a functional form for c_I and making a homogeneity assumption regarding F, they estimate Equation (2.26) by allowing μ to be a linear function of both the level of output and of expected changes in output. Their conclusion is that markups fall when the level of output is unusually high and when the expected change in output is unusually low. As we discuss further in Section 3, this second implication is consistent with certain models of implicit collusion.
2.4. The response of factor prices to aggregate shocks

Thus far we have discussed only the overall pattern of cyclical fluctuations in markups. Here we take up instead the degree to which markup variations play a role in the observed response of the economy to particular categories of aggregate shocks. We are especially interested in shocks that can be identified in the data, that are known to be non-technological in character, and that are thus presumptively statistically independent of variations in the rate of technical progress 47. These cases are of particular interest

46 We have considered separately each of these different ways in which firms can increase their output and their associated marginal cost. An alternative is to postulate a relatively general production (or cost) function, estimate its parameters by assuming that firms minimize costs, and thereby obtain estimates of marginal cost that relate to many inputs at once. One could then compare this "average" estimate of marginal cost to the price that is actually charged. Morrison (1992) and Chirinko and Fazzari (1997) follow a related approach.

47 In taking this view, of course, we assume that variations in technical progress are essentially exogenous, at least at business-cycle frequencies.
because we can then exclude the hypothesis of shifts in supply costs due to changes in technology as an explanation for the observed response of output and employment. This allows us to make judgments about the nature of markup variations in response to such shocks that are less dependent upon special assumptions about the form of the production function than has been true above (where such assumptions were necessary in order to control for variable growth in technology). In particular, in the case of a variation in economic activity as a result of a non-technological disturbance, if markups do not vary, then real wages should move countercyclically. In our basic model, this is a direct implication of Equation (2.2), under the assumption of a diminishing marginal product of labor 48. For in the short run the capital stock is a predetermined state variable, and so increases in output can occur if and only if hours worked increase, as a result of which the marginal product of labor must decrease; this then requires a corresponding decrease in the real wage, in order to satisfy Equation (2.2). In the case of such a shock, then, the absence of countercyclical real wage movement is itself evidence of countercyclical markup variation. Before turning to the evidence, it is worth noting that the inference that procyclical (or even acyclical) real wages in response to these shocks imply countercyclical markups is robust to a number of extensions of the simple model that leads to Equation (2.2). For example, the presence of overhead labor makes no (qualitative) difference for our conclusion, since the marginal product of labor should still be decreasing in the number of hours worked. A marginal wage not equal to the average wage also leads to essentially the same conclusion.
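The way a marginal wage above the average wage strengthens the inference can be illustrated with a minimal sketch. We assume a hypothetical wage schedule with a smoothly increasing overtime premium; the 40-hour threshold and the quadratic premium coefficient are our illustrative choices, not parameters from the text:

```python
# Wage bill shape v(H): straight time up to 40 hours, then a smoothly
# increasing overtime premium (illustrative quadratic form).
def v(H, threshold=40.0, coef=0.02):
    if H <= threshold:
        return H
    return H + coef * (H - threshold) ** 2

def v_prime(H, threshold=40.0, coef=0.02):
    """Marginal hours cost: derivative of v with respect to H."""
    if H <= threshold:
        return 1.0
    return 1.0 + 2.0 * coef * (H - threshold)

def omega(H):
    """Ratio of the marginal wage to the average wage: H * v'(H) / v(H)."""
    return H * v_prime(H) / v(H)

# omega equals 1 with no overtime and rises with hours beyond the threshold,
# so each extra hour is increasingly expensive relative to the average hour.
print(omega(40.0), omega(45.0), omega(50.0))
```

Because omega(H) rises with hours, the markup must fall by even more than the ratio of the marginal product of labor to the average wage in a boom, which is the sense in which this correction reinforces, rather than weakens, the countercyclical-markup inference.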
If, in particular, we assume that the firm's wage bill is a nonlinear function of the form W(H) = w0v(H), where the function v(H) is time-invariant though the scale factor w0 may be time-varying 49, then ω(H), the ratio of the marginal to the average wage, is a time-invariant function. Since the denominator of Equation (2.2) should actually be the marginal wage, when the two differ, our reasoning above actually implies that μω must be countercyclical. But as we have explained above, ω(H) is likely to be an increasing function (if it is not constant), so that μ should vary even more countercyclically than does the product μω (which equals the ratio of the marginal product of labor to the average wage). If there are convex costs of adjusting the labor input, one similarly concludes that μφ must be countercyclical. But since the factor φ [defined in Equation (2.13)] will generally
48 Note that the latter assumption is necessary for equilibrium, if we assume that markups do not vary because product markets are perfectly competitive. In the case of market power but a constant markup (as in a model of monopolistic competition with Dixit-Stiglitz preferences and perfectly flexible prices; see below), a mildly increasing marginal product of labor schedule is theoretically possible, but does not seem to us appealing as an empirical hypothesis.
49 For example, Bils (1987) assumes a relationship of this kind, where w0 represents the time-varying straight-time wage, while the function v(H) reflects the nature of the overtime premium, which is time-invariant in percentage terms.
J.J. Rotemberg and M. Woodford
1088
vary procyclically, this is again simply a reason to infer an even stronger degree of countercyclical variation in markups than is suggested by Equation (2.2). If there is labor hoarding, it can still be inferred in the case of an increase in output due to a non-technological disturbance that H - Hm must have increased; and then, if real wages do not fall, Equation (3.8) implies that markups must have declined. In the case of variable capital utilization, the situation is more complicated. Condition (2.2) generalizes to
μ = PzFH(uKK, zH)/w.  (2.27)
If we assume as above that F is homogeneous of degree one, FH is a decreasing function of zH/uKK. But the mere fact that output and the labor input increase will not settle the question whether the ratio of labor inputs to effective capital inputs, zH/uKK, has increased or not. Hence it may not be clear that the marginal product of labor must decline in booms. Suppose, however, that the cost of higher capital utilization consists of a faster rate of depreciation of the capital stock. Let the rate of depreciation be given by D(uK), and let V(K') denote the value to the firm of having an undepreciated capital stock of K' at the end of the period. The usual assumption of diminishing returns makes it natural to suppose that D should be an increasing, convex function, while V should be an increasing, concave function 50. Then if we consider the marginal cost of increasing output solely by increasing the rate of utilization of the capital stock, we obtain the additional relation

μ = PFK(uKK, zH)/[V'(K')D'(uK)].  (2.28)

Now if zH/uKK decreases when output expands, it follows that FK declines. Furthermore, this requires an increase in uK, so that, under our convexity assumptions, both V' and D' must increase. Thus Equation (2.28) unambiguously requires the markup to decrease. Alternatively, if zH/uKK increases, FH declines, and then, if there is no decline in the real wage, Equation (2.27) requires a decline in the markup. Thus under either hypothesis, markup variations must be countercyclical, if real wages are not 51. We turn now to the question of whether expansions in economic activity associated with non-technological disturbances are accompanied by declines in real wages. There are three important examples of identified non-technological disturbances that are often used in the literature. These are variations in military purchases, variations in the world
50 See Appendix 2 in Rotemberg and Woodford (1991).
51 Which case is actually correct will depend upon the relative degrees of curvature of the various schedules that enter into the right-hand sides of Equations (2.27) and (2.28).
Ch. 1 6:
The Cyclical Behavior of Prices and Costs
1 089
oil price, and monetary policy shocks identified using "structural VAR" methods. At least in the USA, the level of real military purchases has exhibited noticeable variation over the post-World War II period (as a result of the Korean conflict, Vietnam, and the Reagan-era military build-up). The causes of these variations are known to have had to do with political events that have no obvious connection with technical progress. (We consider military purchases rather than a broader category of government purchases exactly because this claim of exogeneity is more easily defended in the case of military purchases.) Similarly, the world oil price has been far from stable over that period (the two major "oil shocks" of the 1970s being only the most dramatic examples of variation in the rate of increase in oil prices), and again the reasons for these variations, at least through the 1970s, are known to have been largely external to the US economy (and to have had much to do with political dynamics within the OPEC cartel) 52. In the case of monetary policy shocks, the identification of a time series for exogenous disturbances is much less straightforward (since the Federal funds rate obviously responds to changes in economic conditions, including real activity and employment, as a result of the kind of policies that the Federal Reserve implements). However, an extensive literature has addressed the issue of the econometric identification of exogenous changes in monetary policy 53, and we may therefore consider the estimated responses to these identified disturbances. In each of the three cases, the variable in question is found to be associated with variations in real activity, and these effects are (at least qualitatively) consistent with economic theory, so that it is not incredible to suppose that the observed correlation represents a genuine causal relation.
We turn now to econometric studies of the responses to such shocks, using relatively unrestricted VAR models of the aggregate time series in question. Rotemberg and Woodford (1992) show that increases in real military purchases raise private value added, hours worked in private establishments, and wages deflated by the relevant value-added deflator. Ramey and Shapiro (1998) show that the effect on this real wage is different when revised NIPA data are used and that, with revised data, this real wage actually falls slightly. They argue that this response can be reconciled with a two-sector constant-markup model. Whether a one-sector competitive model can be reconciled with their evidence remains an open question. Christiano, Eichenbaum and Evans (1996) show, using a structural VAR model to identify monetary policy shocks, that output and real wages both decline in response to the increases in interest rates that are associated with monetary tightening. This again suggests that the contraction in output is associated with an increase in markups. A one percent increase in the federal funds rate that leads to a 0.4% reduction in output reduces real wages by about 0.1%. If one supposes that hours fall by about the same percent as output, the effective increase in the markup is about 0.2%.
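The arithmetic behind this last figure can be sketched as follows, using Equation (2.2) in log changes; the capital elasticity of 0.25 is an assumed illustrative value, not a calibration taken from the chapter:

```python
# Back-of-the-envelope markup response to a monetary tightening, using
# equation (2.2) in log changes. The capital elasticity of 0.25 is an
# assumed illustrative value.
capital_share = 0.25   # alpha in an assumed Cobb-Douglas F = K**a * (zH)**(1-a)
d_hours = -0.004       # hours fall 0.4%, matching the output decline
d_real_wage = -0.001   # real wage falls about 0.1%

# With K predetermined, d log MPL = -alpha * d log H (MPL is decreasing in H)
d_mpl = -capital_share * d_hours

# Equation (2.2) in logs: d log mu = d log MPL - d log (w/P)
d_markup = d_mpl - d_real_wage
print(round(100 * d_markup, 2))  # -> 0.2 (percent)
```

With these assumed numbers the marginal product of labor rises by about 0.1% while the real wage falls by about 0.1%, so the implied markup rises by roughly 0.2%.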
52 These first two series have been widely used as instruments for non-technological sources of variation in US economic activity, following the precedent of Hall (1988, 1990).
53 For a recent survey, see Leeper, Sims and Zha (1996).
Rotemberg and Woodford (1996b) look instead at the response of the US economy to oil price increases. They show that during the pre-1980 OPEC period, such increases lowered private value added together with real wages. Specifically, a one percent unexpected increase in oil prices is shown to lead to a reduction of private value added by about a quarter of a percent after a year and a half, and to a reduction of the real wage (hourly earnings in manufacturing deflated by the private value-added deflator) by about 0.1%, with a similar time lag. This combination of responses again suggests that markups increase, especially during the second year following the shock. The inference is, however, less straightforward in this case; for one might think that an increase in oil prices should have an effect similar to that of a negative technology shock, even if it does not represent an actual change in technology. In fact, Rotemberg and Woodford show that this is not so. Let us assume again the sort of separable production function used to derive Equation (2.23), but now interpret the intermediate input "M" as energy. In this case, consideration of the marginal cost of increasing output by increasing labor inputs yields
μ = PQV(V, M)zFH(K, zH)/w.  (2.29)
Comparison of Equation (2.29) with (2.23) allows us to write a relation similar in form to Equation (2.2),

μ = P̃zFH(K, zH)/w,  (2.30)

where the price index P̃ is defined by

P̃ = PQV(V, M).  (2.31)

Thus if we deflate the wage by the proper price index P̃, it is equally true of an energy price change that a decrease in labor demand must be associated with an increase in the real wage, unless the markup rises. [Note that the situation is quite different in the case of a true technology shock, since the relation (2.30) is shifted by a change in z.] Under the assumption of perfect competition (μ = 1), the price index defined in Equation (2.31) is just the ideal (Divisia) value-added deflator. Thus a competitive model would require the value-added-deflated real wage to rise following an oil shock, if employment declines 54; and the observation that real wages (in this sense) decline would suffice to contradict the hypothesis of perfect competition. The results of Rotemberg and Woodford do not quite establish this; first, because their private value-added deflator is not precisely the ideal deflator, but more importantly, because

54 This result is discussed extensively by Bruno and Sachs (1985), who use it to assert that the unemployment following the oil shocks was due to real wage demands being too high.
their measure of private value added includes the US energy sector, whereas the above calculations refer to the output of non-energy producers (that use energy as an input). Still, because the energy sector is small, even the latter correction is not too important quantitatively; and Rotemberg and Woodford show, by numerical solution of a calibrated model under the assumption of perfect competition, that while small simultaneous declines in their measure of output and of the real wage would be possible under competition, the implied declines are much smaller than the observed ones 55. Similar reasoning allows us to consider as well the consequences of changes in the relative price of intermediate inputs other than energy. We ignored materials inputs in our discussion above of the inferences that may be drawn from the response of real wages to identified shocks. As before, however, Equation (2.2) [and similarly (2.27)] can be interpreted as referring equally to a production technology in which materials inputs are used in fixed proportions with an aggregate of primary inputs, under the further assumption that the relative price of materials is always one, because materials and final goods are the same goods. But the relative prices of goods differing by "stage of processing" do vary, and so a more adequate analysis must take account of this. When one does so, however, one obtains Equation (2.30) instead of (2.2). It is still the case that the failure of real wages to rise in the case of a non-technological disturbance that contracts labor demand indicates that markups must rise, as long as the real wage in question is w/P̃. What, instead, if one observes only the behavior of w/P? Then the failure of this real wage to rise might, in principle, be explained by a decline in P̃/P, consistent with a hypothesis of constant (or even procyclical) markups. However [referring again to Equation (2.29)], this would require a decline in QV(V, M).
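The homogeneity properties invoked here can be verified numerically under an assumed Cobb-Douglas aggregator Q(V, M) = V^(1−γ)M^γ (an illustrative functional form of ours, not the chapter's specification):

```python
# Numerical check of the chain: a decline in Q_V requires a decline in M/V,
# hence a rise in Q_M, hence (by equation (2.23), mu = P*Q_M/P_M, with a
# constant markup) a rise in the relative materials price P_M/P.
# The Cobb-Douglas form Q = V**(1-g) * M**g and all values are assumptions.
g = 0.4  # materials elasticity of gross output (assumed)

def Q_V(m):
    # dQ/dV depends only on m = M/V when Q is homogeneous of degree one
    return (1 - g) * m**g

def Q_M(m):
    # dQ/dM likewise depends only on m = M/V
    return g * m**(g - 1)

mu = 1.2                    # constant markup (assumed)
m_old, m_new = 0.50, 0.45   # a decline in the materials/value-added ratio

assert Q_V(m_new) < Q_V(m_old)            # lower M/V <=> lower Q_V
assert Q_M(m_new) > Q_M(m_old)            # ... and a higher Q_M
assert Q_M(m_new) / mu > Q_M(m_old) / mu  # so P_M/P = Q_M/mu must rise
```

With Q homogeneous of degree one, QV depends only on M/V, so a decline in QV forces M/V down and QM up, and a constant markup then requires the relative materials price to rise.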
Under the assumption that Q is homogeneous of degree one, this in turn would require a decline in M/V, hence an increase in QM(V, M). If markups are constant or actually decreasing, this would then require an increase in the relative price of materials, PM/P, by Equation (2.23). Thus we can extend our previous argument to state that if one observes that neither w/P nor PM/P increases in the case of a non-technological disturbance that leads to reduced labor demand, one can infer that markups must increase. In fact, Clark (1996) shows, in the case of a structural VAR identification of monetary policy disturbances similar to that of Christiano et al., that a monetary tightening is followed by increases in the price of final goods relative to intermediate goods and raw materials. This, combined with the evidence of Christiano et al. regarding real wage responses, suggests that a monetary tightening involves an increase in markups. A possible alternative explanation of declines in real wages and the relative price of materials inputs at the same time as a contraction of output and employment is an increase in some other component of firms' marginal supply cost. Christiano et al.
55 Finn (1999), however, finds larger declines in the case of a competitive model that allows for variable utilization of the capital stock.
propose that an increase in financing costs may be the explanation of their findings 56. As they show, in a model where firms require bank credit to finance their wage bill, the interest rate that must be paid on such loans also contributes to the marginal cost of production; and it is possible to explain the effects of a monetary tightening, without the hypothesis of markup variation, as being due to an increase in marginal cost due to an increase in the cost of credit. But while this is a theoretical possibility, it is unclear how large a contribution financing costs make to marginal costs of production in reality 57. This matter deserves empirical study in order to allow a proper quantitative evaluation of this hypothesis.
2.5. Cross-sectional differences in markup variation

In this subsection we survey the relatively scant literature that investigates whether markups are more countercyclical in industries where it is more plausible a priori that competition is imperfect. This issue is of some importance because countercyclical markups are less plausible in industries where there is little market power. For a markup below one implies that the firm can increase its current profits by rationing consumers to the point at which marginal cost is no higher than the firm's price. But if markups never fall below one, there is little room for markup variation unless average markups are somewhat above one. In addition, the theoretical explanations we present for countercyclical markups in section 3 all involve imperfect competition. A consideration of whether the measures of markup variation that we have proposed imply that markup variation is associated with industries with market power is thus a check on the plausibility of our interpretation of these statistics. Quite apart from this, evidence on comparative markup variability across industries can shed light upon the adequacy of alternative models of the sources of markup variation. The most straightforward way of addressing this issue is to compute markups for each sector using the methods discussed in section 2, and compare the resulting markup movements to output movements. In Rotemberg and Woodford (1991), we carry out this exercise for two-digit US data, treating each of these sectors as having a different level of average markups and using Hall's (1988) method for measuring the average markup in each sector 58. We show that the resulting markups are more negatively

56 The same explanation is offered by Clark for the behavior of the relative prices of goods at different stages of processing.
57 Interruptions of the supply of bank credit certainly can significantly affect the level of economic activity, but the most obvious channel through which this occurs is through the effects of financing costs upon aggregate demand. Financing costs are obviously important determinants of investment demand, the demand for consumer durables, and inventory accumulation; but a contraction of these components of aggregate demand can easily cause a reduction of equilibrium output, without the hypothesis of an increase in supply costs.
58 For a more elaborate analysis of the evolution of cyclical markups in four relatively narrowly defined (four-digit) industries, see Binder (1995). He finds that these four industries do not have a common pattern of markup movements, though none of them has strongly countercyclical markups.
correlated with GNP in sectors whose eight-digit SIC industries have a higher average four-firm concentration ratio. Thus, assuming this concentration is a good measure of market power, these results suggest that sectors with more imperfect competition tend to have more countercyclical markups. One source of this result is that, as shown earlier by Rotemberg and Saloner (1986), real product wages W/Pi are more positively correlated with GNP, and even with industry employment, in more concentrated industries. By itself, this is not sufficient to demonstrate that markups are more countercyclical, since zFH could be more procyclical in these sectors. However, the analysis of Rotemberg and Woodford (1991) suggests that this is not the explanation for the more procyclical real product wages in more concentrated sectors. As we discussed earlier, Domowitz, Hubbard and Petersen (1986) measure markup changes by the ratio of the industry price to a measure of "average variable cost". They show that this ratio is more procyclical in industries where the average ratio of revenues to materials and labor costs is larger, and see this as suggesting that markups are actually more procyclical in less competitive industries. As we already mentioned, this method for measuring markup variation imparts a procyclical bias for a variety of reasons. This bias should be greater in industries with larger fixed (or overhead) costs [because of Equation (2.8)], and these are likely to be the more concentrated industries. In addition, the ratio of revenues to labor and materials costs is a poor proxy for the extent to which a sector departs from perfect competition, because this indicator is high in industries that are capital-intensive, regardless of the existence of market power in their product markets. Domowitz, Hubbard and Petersen (1987) use a different method for measuring industry markup variations and obtain rather different results.
In particular, they run regressions of changes in an industry's price on changes in labor and materials costs as well as a measure of capacity utilization. Using this technique, they show that prices are more countercyclical, i.e., fall more when capacity utilization becomes low, in industries with higher average ratios of revenues to materials and labor costs. If the relation between capacity utilization and marginal cost were the same across industries, and if one accepted their method for deciding which industries are less competitive, their study would thus show that markups are more countercyclical in less competitive industries.

3. Implications of markup variations for business fluctuations
In this section, we study whether it is empirically plausible to assign a large role to markup fluctuations in explaining business fluctuations. We first take up two related aspects of the observed cyclical variation in the relation between input costs and the value of output that are sometimes taken to provide prima facie evidence for the importance of cost shifts (as opposed to markup changes) as the source of fluctuations in activity. These are the well-known procyclical variation in productivity and in
profits. We show that these procyclical variations contain very little information on the importance of markup changes, because markup variations induce such procyclical responses. We next take up a more ambitious attempt at gauging the role of markup fluctuations in inducing cyclical fluctuations in economic activity. In particular, we study the extent to which the markup changes that we measured in Sections 2.1 and 2.2 lead to output fluctuations. Any change in output that differs from that which is being induced by changes in markups ought naturally to be viewed as being due to a shift in real marginal costs (for a given level of output). Thus, this approach allows us to decompose output changes into those due to markup changes and those due to shifts in the marginal cost curve. What makes this decomposition particularly revealing is that, under the hypothesis that markups are constant, all output fluctuations are due to shifts in real
marginal costs.
3.1. Explaining cyclical variation in productivity and profits

3.1.1. Cyclical productivity

Standard measures of growth in total factor productivity (the "Solow residual" and variants) are highly positively correlated with growth in output, and this fact is cited in the real business cycle literature [e.g., Plosser (1989)] as an important piece of evidence in favor of the hypothesis that business cycles are largely due to exogenous variations in the rate of technical progress. It might seem that variations in economic activity due to changes in firms' markups (in the absence of any shift in the determinants of the real marginal cost schedule) should not be associated with such variations in productivity growth, and that the close association of output variations with variations in productivity growth therefore leaves little role for markup variations in the explanation of aggregate fluctuations - or at least, little role for disturbances that affect economic activity primarily through their effect upon markups rather than through their effect on production costs. In fact, however, there are a number of reasons why variations in markups should be expected to produce fluctuations in measured total factor productivity growth that are strongly positively correlated with the associated fluctuations in output growth. Thus observation of procyclical productivity growth does not in itself provide any evidence that markup variations do not play a central role in accounting for observed aggregate fluctuations. (Of course, procyclical productivity is not in itself conclusive evidence of markup variation either, since other explanations remain possible. For this reason productivity variations are a less crucial statistic than those discussed in Sections 2.1 and 2.2.) One reason is simply that standard measures of total factor productivity growth may use incorrect measures of the elasticities of the production function with respect to factor inputs.
If these elasticities are assigned values that are too small (in particular, the elasticity ηH with respect to the labor input), then spurious procyclical variation in
total factor productivity growth will be found. As Hall (1988) notes, the Solow residual involves a biased estimate of just this kind, if firms have market power. Consider a production function of the form (2.1), where F is not necessarily homogeneous of degree 1. Differentiation yields

Δy = ηH Δh + ηK Δk + ηH Δz,  (3.1)

where Δx denotes the growth rate of the variable x. As noted before, Equation (2.2) implies that ηH = μsH; similar reasoning (but considering the marginal cost of increasing output by increasing the quantity of capital used) implies that ηK = μsK. Thus under perfect competition (so that μ = 1), the elasticities correspond simply to the factor shares, and a natural measure of technical progress is given by the Solow residual

SR = Δy - sH Δh - sK Δk.
More generally, however, substitution of Equation (3.1) (with the elasticities replaced by μ times the corresponding factor income share) yields

SR = (μ - 1)(sH Δh + sK Δk) + μsH Δz.  (3.2)

In the case of perfect competition, only the second term is present in Equation (3.2), and the Solow residual measures growth in the technology factor z. But in the presence of market power (μ > 1), increases in output will result in positive Solow residuals (and decreases in output, negative Solow residuals), even in the absence of any change in technology. In particular, output fluctuations due to changes in the markup will result in fluctuations in the Solow residual, closely correlated with output growth. Hall (1990) points out that in the case that the production function exhibits constant returns to scale, this problem with the Solow residual can be eliminated by replacing the weights sK, sH by the shares of these factor costs in total costs, rather than their share in revenues. Thus he proposes a "cost-based productivity residual"

SRc = Δy - s̄H Δh - s̄K Δk,
where s̄H = sH/(sK + sH) and s̄K = 1 - s̄H. In terms of these factor shares, the production function elasticities are given by ηH = ρs̄H, ηK = ρs̄K, where ρ = ηK + ηH is the index of returns to scale defined earlier. Similar manipulations as are used to derive Equation (3.2) then yield
SRc = (ρ - 1)(s̄H Δh + s̄K Δk) + ρs̄H Δz.  (3.3)

Even if μ > 1, as long as ρ = 1, Hall's "cost-based" residual will measure the growth in z. One can show, in fact, that this measure of productivity growth is procyclical
to essentially the same degree as is the Solow residual 59. But again this need not indicate true technical change. For if there are increasing returns to scale (ρ > 1), due for instance to the existence of overhead labor as discussed above, then increases in output will result in positive Solow residuals even without any change in technology. This explanation for the existence of procyclical productivity in the absence of cyclical changes in technology is closely related to the previous one, since we have already indicated that (given the absence of significant pure profits) it is plausible to assume that μ and ρ are similar in magnitude. The quantitative significance of either of these mechanisms depends upon how large a value one believes it is plausible to assign to μ or ρ. Hall (1988, 1990) argues that many US industries are characterized by quite large values of these parameters. He obtains estimates of μ that exceed 1.5 for 20 of the 26 industries for which he estimates this parameter. Within his 23 manufacturing industries, 17 have estimates of μ above 1.5, while 16 have estimates of ρ that are in excess of 1.5. His evidence is simply that both productivity residuals are positively correlated with output movements, even those output movements that are associated with non-technological disturbances. In effect, he estimates the coefficients on the first terms on the right-hand sides of Equations (3.2) and (3.3) by instrumental-variables regression, using military purchases, a dummy for the party of the US President, and the price of oil as instruments for non-technological disturbances that affect output growth.
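The bias in Equation (3.2) is easy to see numerically. The following sketch uses assumed, illustrative values for the markup and the revenue shares, and holds technology fixed:

```python
# Spurious Solow residual under market power, equation (3.2), with no change
# in technology (d_z = 0). All parameter values are assumed for illustration.
mu = 1.5             # markup
s_H, s_K = 0.6, 0.3  # labor and capital shares of revenue (sum < 1 when mu > 1)
d_h, d_k, d_z = 0.02, 0.0, 0.0  # a 2% rise in hours, nothing else moves

# Equation (3.1) with eta_H = mu*s_H and eta_K = mu*s_K gives output growth
d_y = mu * s_H * (d_h + d_z) + mu * s_K * d_k

# Conventional Solow residual: output growth less share-weighted input growth
solow_residual = d_y - s_H * d_h - s_K * d_k

# It equals the first term of equation (3.2), though technology is fixed
assert abs(solow_residual - (mu - 1) * (s_H * d_h + s_K * d_k)) < 1e-9
print(round(solow_residual, 4))  # -> 0.006
```

A markup-driven rise in hours thus generates a positive measured residual of (μ − 1) times the share-weighted input growth, with no technical change at all.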
However, even assuming that the correlations with these instruments are not accidental, this merely establishes that some part of the procyclical productivity variations that are observed are not due to fluctuations in true technical progress; since explanations exist that do not depend upon large degrees of market power or increasing returns, one cannot regard this as proving that μ and ρ are large. A second possible mechanism is substitution of intermediate for primary inputs, as discussed by Basu (1995). Suppose that materials inputs are not used in fixed proportions, but instead that each firm's gross output Q is given by a production function Q = Q(V, M), where M represents materials inputs and V is an index of primary input use (which we may call "economic value added"), and the function Q is differentiable, increasing, concave, and homogeneous of degree 1. As before, economic value added is given by a value-added production function V = F(K, zH). Now consider a symmetric equilibrium in which the price of each firm's product is the same, and this common price is also the price of each firm's materials inputs (which are the products of other firms). Consideration of the marginal cost of increasing output by increasing materials inputs alone then yields

QM(V, M) = μ.  (3.4)
59 Because, as Hall notes, pure profits are near zero for US industries, sK + sH has a value near one for a typical industry; hence the two types of factor shares, and the two types of productivity residuals, are quantitatively similar in most cases.
Because of our homogeneity assumption, (3.4) can be solved for M/V = m(μ), where m is a decreasing function. Then defining accounting value added as Y = Q - M, one obtains

Y/V = Q(1, m(μ)) - m(μ).  (3.5)
Furthermore, as long as firms have some degree of market power (μ > 1), Equation (3.4) implies that QM > 1. Hence Q(1, m) - m will be increasing in m, and Equation (3.5) implies that Y/V, the ratio of measured value added to our index of "economic value added", will be a decreasing function of μ. This implies that a decline in markups would result in an increase in measured value added Y even without any change in primary input use (and hence any change in V). This occurs due to the reduction of an inefficiency in which the existence of market power in firms' input markets leads to an insufficiently indirect pattern of production (too great a reliance upon primary as opposed to intermediate inputs). If one's measure of total factor productivity growth is based upon the growth in Y instead of V, then markup variations will result in variations in measured productivity growth that are unrelated to any change in technology. Since a markup decline should also increase the demand for primary factors of production such as labor, it will be associated with increases in employment, output, and total factor productivity - where the latter quantity increases because of the increase in Y/V even if the measurement problems stressed by Hall (relating to the accuracy of one's measure of the increase in V that can be attributed to the increase in primary factor use) are set aside. The quantitative importance of such an effect depends upon two factors, the elasticity of the function m and the elasticity of the function Q(1, m) - m. The first depends upon the degree to which intermediate inputs are substitutable for primary inputs. Basu (1995) establishes that materials inputs do not vary in exact proportion with an industry's gross output; in fact, he shows that output growth is associated with an increase in the relative use of intermediate inputs, just as Equation (3.4) would predict in the case of an output increase due to a reduction in markups.
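Basu's mechanism can likewise be illustrated under an assumed Cobb-Douglas aggregator Q(V, M) = V^(1−γ)M^γ (our illustrative choice; Basu's own specification is more general):

```python
# Equation (3.5): Y/V = Q(1, m(mu)) - m(mu), with m(mu) solving equation
# (3.4), Q_M(V, M) = mu. The Cobb-Douglas form and the parameter value
# are assumptions for illustration.
g = 0.4  # materials elasticity of gross output (assumed)

def m(mu):
    # Q_M = g * (M/V)**(g - 1) = mu  =>  M/V = (mu/g)**(1/(g - 1))
    return (mu / g) ** (1.0 / (g - 1.0))

def y_over_v(mu):
    # Measured value added per unit of economic value added, equation (3.5);
    # Q(1, m) = 1**(1 - g) * m**g = m**g
    return m(mu) ** g - m(mu)

# m is decreasing in the markup, and so is Y/V; a markup *cut* therefore
# raises measured value added even with primary inputs (and hence V) fixed.
assert m(1.5) < m(1.1)
assert y_over_v(1.1) > y_over_v(1.3) > y_over_v(1.5)
```

The monotonicity holds here because QM = μ > 1 makes Q(1, m) − m increasing in m, exactly as the text argues.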
The second elasticity depends upon the degree of market power in the steady state (i.e., the value of μ around which we consider perturbations), because as noted above, the derivative of Q(1, m) - m equals μ - 1. Thus while Basu's mechanism is quite independent of Hall's, it too can only be significant insofar as the typical industry possesses a non-trivial degree of market power. An alternative mechanism is "labor hoarding"; indeed, this is probably the most conventional explanation for procyclical productivity variations. If only H - Hm hours are used for current production, but productivity growth is computed using total payroll hours H as a measure of the labor input, then a tendency of Hm to decline when H - Hm increases will result in spurious procyclical variations in measured productivity
JJ Rotemberg and M. Woodford
1098
growth. Furthermore, this is exactly what one should expect to happen in the case of fluctuations in activity due to markup variations. Suppose that the value to a firm (in units of current real profits that it is willing to forego) of employing H_m hours on maintenance (or other non-production) tasks is given by a function v(H_m). It is natural to assume that this function is increasing but strictly concave. Then if the firm is a wage-taker, and there are no adjustment costs for varying total payroll hours H, the firm should choose to use labor for non-production tasks to the point at which

v′(H_m) = w/P.   (3.6)
Let us suppose furthermore that the real wage faced by each firm depends upon aggregate labor demand, according to a wage-setting locus of the form

w/P = ν(H),   (3.7)

where ν is an increasing function 60. Since v′ is a decreasing function while ν is increasing, Equations (3.6) and (3.7) imply that H and H_m should move inversely with one another, assuming that the source of their changes is not a shift in either of the two schedules. Finally, allowing for labor allocated to non-production tasks requires us to rewrite Equation (2.2) as
μ = PzF_H(K, z(H - H_m))/w.   (3.8)
Substituting for H_m in the numerator the decreasing function of H just derived, and substituting for w in the denominator using Equation (3.7), the right-hand side of Equation (3.8) may be written as a decreasing function of H. It follows that a reduction in the markup (not associated with any change in the state of technology, the value of non-production work, or the wage-setting locus) will increase equilibrium H and reduce equilibrium H_m. The result will be an increase in output accompanied by an increase in measured total factor productivity. If the firm faces a wage that increases with the total number of hours that it hires (due to monopsony power in the labor market, the overtime premium, or the like), then the resulting procyclical movements in measured productivity will be even greater. In this case, Equation (3.6) becomes instead

v′(H_m) = ω(H) w/P,   (3.9)
where ω(H) is the ratio of the marginal to the average wage, as in Equation (2.11). We have earlier given several reasons why ω(H) would likely be an increasing function,

60 If we imagine a competitive auction market for labor, then Equation (3.7) is just the inverse of the labor supply curve. But a schedule of the form (3.7) is also implied by a variety of non-Walrasian models of the labor market, including efficiency wage models, union bargaining models, and so on. See, e.g., Layard et al. (1991), Lindbeck (1993), and Phelps (1994) for examples of discussions of equilibrium employment determination using such a schedule.
Ch. 16:
The Cyclical Behavior of Prices and Costs
1099
at least near the steady-state level of hours. Hence the specification (3.9) makes the right-hand side an even more sharply increasing function of H than in the case of (3.6). Similarly, if there are convex costs of adjusting the total number of hours hired by the firm, Equation (3.6) becomes instead
v′(H_m) = Ω w/P,   (3.10)
where Ω is again the factor defined in Equation (2.13). Again, this alternative specification makes the right-hand side an even more procyclical quantity than in the case of (3.6). Thus either modification of the basic model with labor hoarding implies even more strongly countercyclical movements in H_m, and as a result even more procyclical variation in measured productivity. A related explanation for cyclical variation in measured productivity that results from markup variations is unmeasured variation in labor effort. If, as in the model of Sbordone (1996), the cost of increased effort is an increase in the wage w(e) that must be paid, and there are convex costs of varying hours, then the cost-minimizing level of effort for the firm is given by Equation (2.19). As discussed earlier, this implies that effort should co-vary positively with fluctuations in hours (albeit with a lead), since the factor Ω will be procyclical with a lead, while the function w(e) will be increasing in e. Furthermore, consideration of the marginal cost of increasing output by demanding increased effort implies that 61
μ = PzF_H(K, zeH)/w′(e).   (3.11)
Since w′(e) must be increasing in e (at least near the steady-state effort level, as a consequence of the second-order condition for minimization of the cost w/e of effective labor inputs), Equation (3.11) requires that a reduction in markups result in an increase in eH (to lower the numerator), an increase in e (to increase the denominator), or both. Since e and H should co-vary positively as a consequence of Equation (2.19), it follows that a temporary reduction of markups should be associated with temporary increases in effort, hours, and output. Countercyclical markup fluctuations would therefore give rise to procyclical variations in measured productivity. Another related explanation is unmeasured variation in the degree of utilization of the capital stock. The argument in this case follows the same general lines. If markups fall, firms must choose production plans that result in their operating at a point of higher real marginal costs (which quite generally means more output). Cost minimization implies that real marginal costs increase apace along each of the margins available to the firm. Thus if it is possible to independently vary capital utilization, the real marginal cost of increasing output along this margin must increase; under standard
61 As noted earlier, this implies that Equation (2.11) holds with w replaced by w(e).
assumptions, this will mean more intensive utilization of the firm's capital. But the resulting procyclical variation in capital utilization will result in procyclical variation in measured productivity, even if there is no change in the rate of technical progress. Similar conclusions are obtained when capital utilization is a function of hours worked per employee. Consider again the case in which there is an interior solution for hours because the wage schedule W(h) is nonlinear in hours per employee, and in which hours per employee nonetheless vary because of convex adjustment costs for employment. Then the cost-minimizing decision regarding hours per employee satisfies the first-order condition 62
Ω = ω(h) (η_H + λη_K)/η_H.   (3.12)
If we assume both a Cobb-Douglas production function Y = (u_K K)^(1-a) (zhN)^a and an isoelastic capital utilization function u_K = h^λ with 0 < λ ≤ 1, the expression in parentheses is a constant, and Equation (3.12) implies that hours per employee h must co-vary positively with Ω. This means that fluctuations in hours will accompany temporary fluctuations in employment (but with a lead). Furthermore, Equation (2.22) implies that
μ = [a + λ(1 - a)] P z^a K^(1-a) / [W′(h) h^((1-a)(1-λ)) N^(1-a)].

Thus a decline in μ must be accompanied by an increase in W′(h)h^((1-a)(1-λ)) (hence an increase in h), an increase in N^(1-a) (hence an increase in N), or both. Since employment and hours must co-vary positively, there will be an increase in both. As a result, capital utilization will increase along with output and employment, again resulting in procyclical variation in measured productivity.
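The exponent algebra behind this expression can be checked numerically. The sketch below is ours, not the authors': it uses arbitrary illustrative values for K, N, z and h, and verifies that under these functional forms the marginal product of hours per employee is [a + λ(1 - a)]Y/h, which is the numerator of the markup formula above.

```python
# Numerical check (ours) that with Y = (u_K K)^(1-a) (z h N)^a and u_K = h^lam,
# the marginal product of hours per employee equals [a + lam*(1-a)] * Y / h.
a, lam = 0.7, 0.6          # illustrative parameter values (assumptions)
K, N, z = 2.0, 3.0, 1.5    # arbitrary levels of capital, employment, technology

def output(h):
    uK = h ** lam          # capital utilization rises with hours per employee
    return (uK * K) ** (1 - a) * (z * h * N) ** a

h0, eps = 1.3, 1e-6
dY_dh = (output(h0 + eps) - output(h0 - eps)) / (2 * eps)  # central difference
analytic = (a + lam * (1 - a)) * output(h0) / h0
print(abs(dY_dh - analytic) < 1e-6)  # True
```

Because Y is proportional to h^(a + λ(1 - a)) once u_K = h^λ is substituted in, the check holds exactly up to finite-difference error.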
62 Note that this follows from the fact that both Equations (2.12) and (2.22) apply in this case.

3.1.2. Cyclical profits

Business profits are also well known to vary procyclically [e.g., Hultgren (1965)]; corporate profits after taxes have long been a component of the NBER's index of coincident business cycle indicators. This is sometimes thought to make it implausible that business expansions are associated with declines in markups, since reduced markups should lower profits. Indeed, Christiano, Eichenbaum and Evans (1996) report calculations intended to show that a model in which expansions are due to markup declines will almost inevitably make the counterfactual prediction that profits must decline when output expands 63.

63 They present their analysis as a criticism of sticky-price models of the effects of monetary policy; but in fact their criticism relates simply to the fact that the model is one in which output increases due to a reduction in markups.
This implication is, however, less direct than it might at first seem. There are a number of reasons why profits might well rise when markups fall. Many of these have been introduced above as reasons why the inverse of the labor share need not move countercyclically to the same extent as the markup. The connection between these two issues is simple. The cyclical variation in (real) profits is essentially determined by the cyclical variation in the amount by which the value of output exceeds the wage bill, Y - (w/P)H. (This is because the remaining deductions involved in the calculation of accounting profits, such as interest payments and depreciation allowances, are relatively less cyclical.) Now if the labor share wH/PY is not procyclical, it follows that when output increases, wH/P increases no more than proportionally to output, which surely means less in absolute magnitude, since labor compensation is on average only three-quarters of the value of output. Hence Y - (w/P)H will increase. Thus any model that does not predict a procyclical labor share will a fortiori not predict countercyclical profits. And indeed, parameter values that imply procyclical variation in profits in response to markup variations are not hard to find. Consider first our simplest model, in which firms pay the same wage regardless of the number of hours they hire, there are no adjustment costs, and the measured capital and labor inputs are all that matter for a firm's output. Then equilibrium output Y, hours H, and real wage w/P are determined by Equations (2.1), (2.2), and (3.7), given the capital stock K, the state of technology z, and the markup μ. Let us consider the effects of markup variations, holding fixed the other two parameters (and the functions F and ν). If we neglect changes in interest and depreciation, the change in profits is given by

dΠ = d(Y - vH) = (zF_H - v) dH - H dv = (μ - 1 - ε_ν) v dH,   (3.13)
where v = w/P is the real wage, and ε_ν = Hν′/ν is the elasticity of the wage-setting locus in Equation (3.7). It follows that profits increase along with employment and output if and only if
μ > 1 + ε_ν.   (3.14)
Now this is certainly possible; under the hypothesis of market power in the product market (which we require in order to suppose that markup variations are possible), μ > 1, so it is simply necessary that ε_ν be small enough. This may not, however, seem empirically plausible; essentially, Christiano et al. argue that it would require a greater degree of market power than is plausible for most US industries. Their proposed value for ε_ν, however (their "baseline" calculation assumes ε_ν = 1), is based not upon the observed degree of cyclicality of wages, but upon what they regard as a plausible specification of household preferences, given an interpretation of Equation (3.7) as the labor supply schedule of a representative
household. In fact, the average wage is observed to be relatively acyclical, and even if this is a puzzle for the theory of labor supply, there is no reason to assume a stronger real wage response to increases in labor demand in calculating the effect on profits of an increase in output associated with a decline in markups. For example, Solon, Barsky and Parker (1994) find an elasticity of the average real wage with respect to hours worked of about 0.3 64; thus an average markup in excess of 1.3 would suffice to account for procyclical profit variations. And again, this is the "value-added markup" that must exceed 1.3; for this, the typical supplier's markup need not be much more than 10 percent. In any event, procyclical profits do not require even as large an average markup as this, if we make the model more realistic, in any of the several ways discussed above. Consider first the possibility that the marginal wage paid by a firm varies with the number of hours that it hires, and not only with aggregate labor demand (as assumed above), due, for example, to monopsony power in the labor market. Let us write the firm's wage bill as W(H_i; H), where H_i represents hours hired by firm i, and H represents aggregate hours hired. Then in a symmetric equilibrium, the average wage v is given by W(H; H)/H, and the ratio ω of the marginal wage to the average wage is given by HW_1(H; H)/W(H; H). In this case, Equation (3.13) generalizes to

dΠ = d(Y - W(H; H)) = (zF_H - W_1 - W_2) dH = (μ - 1 - W_2/W_1) W_1 dH = (ωμ - 1 - ε_ν) v dH,

so that Equation (3.14) becomes

ωμ > 1 + ε_ν.   (3.15)
Since, as explained earlier, there are a number of reasons for w to be larger than one, the markup need not be as large as is required by Equation (3. 14) in order for profits to be procyclical. If, for example, we assume that w = 1 .2, as Bils ( 1987) estimates6S, and Ev = 0.3, it suffices that fJ. = 1. 1 (which means a gross-output markup of 4%).
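As a small arithmetic check (ours, not the authors'), conditions (3.14) and (3.15) can be evaluated at the parameter values just quoted; the function name is our own illustrative shorthand.

```python
# Check of the profit-cyclicality conditions (3.14) and (3.15).
# omega = 1.2 is Bils' (1987) marginal/average wage ratio; eps_nu = 0.3 is the
# Solon-Barsky-Parker (1994) wage elasticity quoted in the text.

def profits_procyclical(mu, eps_nu, omega=1.0):
    """Condition (3.15): omega * mu > 1 + eps_nu; (3.14) is the omega = 1 case."""
    return omega * mu > 1 + eps_nu

# Without a marginal/average wage wedge, the markup itself must exceed 1.3:
print(profits_procyclical(mu=1.3, eps_nu=0.3))             # boundary case: False
print(profits_procyclical(mu=1.35, eps_nu=0.3))            # True

# With omega = 1.2, a markup of only 1.1 suffices: 1.2 * 1.1 = 1.32 > 1.3.
print(profits_procyclical(mu=1.1, eps_nu=0.3, omega=1.2))  # True
```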
64 Solon et al. find a considerably larger elasticity for the wage of individuals, once one controls for cyclical changes in the composition of the workforce. However, for purposes of the cyclical profits calculation, it is the elasticity of the average wage that matters; the fact that more hours are low-wage hours in booms helps to make profits more procyclical.
65 This is what Bils' estimates imply for the ratio of marginal wage to average wage when the margin in question is an increase in weekly hours per employee, and the derivative is evaluated at a baseline of 40 hours per week. (As noted above, Bils finds that this ratio rises as hours per employee increase.) In applying this ratio to Equation (3.15), we assume that the marginal cost of additional hours is the same whether they are obtained by increasing hours per employee or by increasing the number of employees, as must be true if firms are cost-minimizing.
Alternatively, suppose that some labor is used for non-production purposes, as in our above discussion of "labor hoarding". Then Equation (3.13) becomes instead

dΠ = d(Y - vH) = zF_H(dH - dH_m) - v dH - H dv = (θμ - 1 - ε_ν) v dH,

where θ denotes the derivative of labor used in production, H - H_m, with respect to total labor H. Thus Equation (3.14) becomes

θμ > 1 + ε_ν.   (3.16)
If labor hoarding is countercyclical, θ > 1, and Equation (3.16) also requires a smaller markup than does Equation (3.14). The findings of Fay and Medoff (1985), discussed above, would suggest a value of θ on the order of 1.4. This would be enough to satisfy Equation (3.16) regardless of the size of the markup. Similar results are obtained in the case of variable labor effort or variable capital utilization. The implied modification of Equation (3.14) is largest if the costs of higher effort or capital utilization do not show up in accounting measures of current profits. For example, suppose that effective capital inputs are given by u_K K, where the utilization rate u_K is an independent margin upon which the firm can vary its production process, and suppose that the cost of higher utilization is faster depreciation of the capital stock (but that this is not reflected in the depreciation allowance used to compute accounting profits). As explained above, we should expect a decline in markups to be associated with a simultaneous increase in real marginal costs along each margin, so that firms choose to increase u_K at the same time that they choose to increase labor inputs per unit of capital. Let λ denote the elasticity of u_K with respect to H as a result of this cost-minimization on the part of firms 66. Then Equation (3.13) becomes instead

dΠ = d(Y - vH) = zF_H dH + KF_K du_K - v dH - H dv = [((η_H + λη_K)/η_H) μ - 1 - ε_ν] v dH,
and Equation (3.14) again takes the form (3.16), where now θ ≡ (η_H + λη_K)/η_H. If capital utilization and hours co-vary positively (as we have argued, and as is needed in order to interpret procyclical productivity variations as due to cyclical variation in capital utilization), then θ > 1, and again a smaller markup than is indicated by Equation (3.14) will suffice for procyclical profits. If, for example, λ = 1, as argued
66 Note that we do not here assume a structural relation between the two variables.
by Bils and Cho (1994), then θ > 1.3, and Equation (3.16) is satisfied no matter how small the average markup may be.
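The same kind of arithmetic check (again ours) applies to condition (3.16): with θ above 1.3 and ε_ν = 0.3, the inequality holds for any markup μ ≥ 1.

```python
# Condition (3.16): theta * mu > 1 + eps_nu.
# With theta = 1.3 (the lower bound implied by lambda = 1) and eps_nu = 0.3,
# theta * mu exceeds 1.3 whenever mu > 1, i.e. for any positive net markup.
def condition_316(theta, mu, eps_nu):
    return theta * mu > 1 + eps_nu

print(all(condition_316(1.3, mu, 0.3) for mu in [1.01, 1.1, 1.5, 2.0]))  # True
# With theta = 1.4 (the Fay-Medoff figure), even mu = 1.0 satisfies it:
print(condition_316(1.4, 1.0, 0.3))  # True
```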
3.2. Identifying the output fluctuations due to markup variation

We now describe the consequences of alternative measures of marginal costs for one's view of the sources of aggregate fluctuations. We propose to decompose the log of real GDP y_t as

y_t = y_t* + y_t^μ,   (3.17)
where the first term represents the level of output that is warranted by shifts in the real marginal cost curve introduced in Section 1 (for a constant markup), while the second is the effect on output of deviations of markups from their steady-state value, and hence represents a movement along the real marginal cost schedule. We then use this decomposition to investigate the extent to which changes in y are attributable to either term. Because there is no reason to suppose that changes in markups are independent of shifts in the real marginal cost curve, there is more than one way in which this question can be posed. First, one could ask how much of the fluctuations in aggregate activity can be attributed to the fact that markups vary, i.e., would not occur if technology and labor supply varied to the same extent but markups were constant (as would, for example, be true under perfect competition). Alternatively, one might ask how much of these fluctuations are due to markup variations that are not caused by shifts in the real marginal cost schedule, and thus cannot be attributed to shifts in technology or labor supply, either directly or indirectly (through the effects of such shocks on markups). The first way of posing the question is obviously the one that will attribute the greatest importance to markup variations. On the other hand, the second question is of particular interest, since, as we argued in Section 1, we cannot attribute much importance to "aggregate demand" shocks as sources of business fluctuations unless there is a significant component of output variation at business-cycle frequencies that can be attributed to markup variations in the more restrictive sense. Mere measurement of the extent to which markup variations are correlated with the cycle - the focus of our discussion in Section 2, and the focus of most of the existing literature - does not provide very direct evidence on either question.
If we pose the first question, it is obviously necessary that significant markup variations exist, if they are to be responsible for significant variation in economic activity. But the relevant sense in which markup variations must be large is in terms of the size of variations in output that they imply. The size of the correlation of markup variations with output is thus of no direct relevance for this question. Moreover, markup variations could remain important for aggregate activity in this first sense even if markups were procyclical as a result of increasing whenever real marginal costs decline. In this case, markup variations would dampen the effects of shifts in real marginal costs.
If, instead, we ask about the extent to which markup variations contribute to output movements that are independent of changes in real marginal cost, the correlation of markups with output plays a more important role. The reason is that these orthogonal markup fluctuations lead output and markups to move in opposite directions and thus induce a negative correlation between output and markups. However, markups could be very important even without a perfect inverse correlation since, as we show below, the dynamic relationship between markup variations and the employment and output variations that they induce is fairly complex in the presence of adjustment costs. Furthermore, even neglecting this, a strong negative correlation between markups and activity would be neither necessary nor sufficient to establish the hypothesis that orthogonal movements in markups contribute a great deal to output fluctuations. The negative correlation might exist even though the business cycle is mainly caused by technology shocks, if those shocks induce countercyclical markup variations that further amplify their effects upon output and employment. And the negative correlation might be weak or non-existent even though shocks other than changes in real marginal cost are important, if some significant part of aggregate fluctuations is nonetheless due to these cost shocks, and these shocks induce procyclical markup variations (that damp, but do not entirely eliminate or reverse, their effects upon output). In this section, we try to settle these questions by carrying out decompositions of the sort specified in Equation (3.17) and analyzing the extent to which y*, y^μ and the part of y^μ that is orthogonal to y* contribute to movements in y. We do this for two different measurements of μ, which imply different movements in y^μ. The first measurement of μ we consider is based on Equation (2.9), while the second is based on the existence of a cost of changing the level of hours worked.
Because of space constraints, we are able to give only a cursory and illustrative analysis of these two cases. We start with the case where markups are given by Equation (2.9), for which we gave several interpretations above. To compute how much output rises when markups fall, we must make an assumption about the extent to which workers demand a higher wage when output rises. We thus assume that, in response to changes in markups, wages are given by

ŵ_t = η_w Ĥ_t.   (3.18)

Thus, η_w represents the slope of the labor supply curve along which the economy moves when markups change. Obviously, this simple static representation is just a simplification. We again let η_H represent the elasticity of output with respect to hours when hours are being changed by markup movements. Using Equation (3.18) in (2.9), together with the assumption that changes in output induced by markup changes equal η_H times Ĥ, it follows that

μ̂ = -[(1 - b - η_H(1 - a))/η_H + η_w/η_H] ŷ^μ,   (3.19)
where the term in parentheses is positive because η_H is smaller than one and a and b are nonpositive. This formula allows us to compute y^μ once we have measured μ̂ as
above. In other words, it allows us to go from the measurement of markups to the measurement of output movements implied by markups. Once we have obtained y^μ in this manner, we subtract this from y to obtain y*, as required by Equation (3.17). To do all this, we need three parameters, namely a and b (to construct the markup) and the expression in parentheses in Equation (3.19). Our illustrative calculation is based on setting a equal to zero, b equal to -0.4 (which we saw guarantees that the markup is quite countercyclical), and setting the expression in parentheses equal to 1/0.7. Given these values for a and b, this last parameter can be rationalized by supposing that η_H = 0.7 and η_w = 0.3. This elasticity of labor supply is broadly consistent with the estimates of Solon, Barsky and Parker (1994). If we use these parameters and compute μ̂ in the way that we did in Section 2, however, the variance of μ̂ and, in particular, the movements in μ̂ that are orthogonal to movements in y are rather large. These orthogonal movements in y^μ must then be matched by equal and opposite movements in y*. One interpretation of this is that shifts in the marginal cost curve would lead to much larger output swings than those we actually observe if it weren't for procyclical markup movements that dampen these shifts. Another interpretation is that there are large errors in the measurement of the wage that lead the labor share to be measured with error. These random movements in the labor share then lead to offsetting movements in the two terms of Equation (3.17), y^μ and y*. To deal with this possibility, we modify the analysis somewhat. Instead of using actual wages in computing μ̂, we use the projection of the ratio of per capita compensation to per capita output, (ŵ - ŷ), onto the cyclical variables that we used in Rotemberg and Woodford (1996a).
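The arithmetic behind the illustrative coefficient in Equation (3.19) can be verified directly; the check below is ours, with the parameter values just quoted.

```python
# Verify (ours) that the coefficient in (3.19),
# (1 - b - eta_H*(1 - a))/eta_H + eta_w/eta_H,
# equals 1/0.7 at the illustrative calibration a = 0, b = -0.4,
# eta_H = 0.7, eta_w = 0.3.
a, b = 0.0, -0.4
eta_H, eta_w = 0.7, 0.3
coef = (1 - b - eta_H * (1 - a)) / eta_H + eta_w / eta_H
print(abs(coef - 1 / 0.7) < 1e-9)  # True
```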
In other words, we make use of the regression equation (3.20), where Z_t now represents the current and lagged values of the change in private value added, the ratio of nondurables and services consumption to output, and detrended hours worked in the private sector. To obtain the ratio of per capita compensation to per capita output that we use in Equation (3.20), we divided the labor share by the deviation of hours from their linear trend. Since this same deviation of hours is an element of the Z_t vector, we would have obtained the same results if we had simply projected the labor share itself. For this included level of hours (and output) to be comparable to the labor share we use to construct (ŵ - ŷ), this labor share must refer to the private sector as a whole. We thus use only this particular labor share in this section. Because of the possibility that this labor share does not follow a single stationary process throughout our sample, we estimated Equation (3.20) only over the sample 1969:1 to 1993:1. Equation (3.20) allows us to express (ŵ - ŷ) as a linear function of Z. Given that a is zero, the only other determinant of the markup in Equation (2.9) is the level of hours Ĥ, which is also an element of Z. Thus, our estimate of μ̂ is now a linear function of Z. Equation (3.19) then implies that y^μ is a linear function of Z_t as well.
It is not the case, however, that y_t* is a linear function of Z_t. The reason for this is that Z includes only stationary variables and therefore does not include y. On the other hand, the change in private value added, Δy, is an element of Z. This means that, armed with the stochastic process for Z that we estimated in Rotemberg and Woodford (1996a),

Z_t = AZ_{t-1} + ε_t,   (3.21)

we can construct the innovations in y^μ and in y*. These are linear functions of the vector ε_t which, given Equation (3.21), equals (Z_t - AZ_{t-1}), so that these innovations depend only on the history of the Z's. Similarly, the vector (Z_t - AZ_{t-1}) together with the matrix A in Equation (3.21) determines how the expectation of future values of Z is revised at t. This means that we can use Equation (3.21) to write down the revisions at t in the expectations of y_{t+k}, y^μ_{t+k} and y*_{t+k} as linear functions of the history of the Z's. Finally, the variance-covariance matrix of the ε's (which can be obtained from A and the variance-covariance matrix of the Z's) then implies variances and covariances for both the innovations and revisions in the y's, the y^μ's and the y*'s. Table 3 focuses on some interesting aspects of these induced variances and covariances. Its first row focuses on innovations, so that it shows both the variance of the innovation in y* and in y^μ as ratios of the innovation variance in y. The subsequent rows focus on revisions at various horizons. The second row, for example, gives the population variances of the revisions at t of y*_{t+5} and y^μ_{t+5} as ratios to the variance of the revision of y_{t+5}. All these revisions correspond to output changes one year after the effect of the corresponding ε_t's is first felt. The next row looks at revisions two years after the innovations first affect output, and so on. We see from Table 3 that this measure of the markup has only a very modest effect on one's account of the source of aggregate fluctuations in output.
The variances of revisions in y* are almost equal to the corresponding variances of y for all the horizons we consider. The innovation variance of y* is actually bigger, which implies that innovations in y^μ that are negatively correlated with y* dampen the effect of these short-run movements of y* on y. The last column in Table 3 looks at the variances of the component of y^μ that is orthogonal to y*. This variance is equal to the variance of y^μ times (1 - ρ²), where ρ represents the correlation between y^μ and y*, and where this correlation can easily be computed from Equation (3.21). To make the results clearer, we again present the variance of this orthogonal component of y^μ as a fraction of the corresponding variance of y. It is apparent from this column that this orthogonal component explains very little of the variance of y at any of the horizons we consider. Thus, even though this measure of the markup is negatively correlated with our cyclical indicators, it induces movements in output that are much smaller than the actual ones. Overturning this finding appears to require implausible parameters. To make output more responsive to markup changes requires that the term in parentheses in Equation (3.19) be smaller. We could achieve this by making η_H smaller or η_w bigger but, given the values that we have chosen, large changes of this sort would be unreasonable. An alternative way of lowering this coefficient is to make b smaller
Table 3
Fractions of the variance of y accounted for by y^μ and y* a

                               VarΔy*/VarΔy    VarΔy^μ/VarΔy    Var(Δy^μ orthogonal to Δy*)/VarΔy

b = -0.4; a, c = 0
Innovation variances               1.43            0.05             0.01
Revisions over 5 quarters          0.88            0.06             0.06
Revisions over 9 quarters          0.86            0.08             0.08
Revisions over 13 quarters         0.90            0.07             0.07
Revisions over 17 quarters         0.90            0.05             0.05
Revisions over 21 quarters         0.91            0.05             0.05
Revisions over 25 quarters         0.91            0.04             0.04

c = 8; a, b = 0
Innovation variances               2.38            2.89             0.97
Revisions over 5 quarters          0.55            1.28             0.97
Revisions over 9 quarters          0.65            1.13             0.89
Revisions over 13 quarters         0.66            1.13             0.90
Revisions over 17 quarters         0.61            1.03             0.86
Revisions over 21 quarters         0.59            0.91             0.81
Revisions over 25 quarters         0.58            0.81             0.75
Revisions over 81 quarters         0.84            0.21             0.21

a Calculations based on projecting (ŵ - ŷ) on Z for the period 1969:1-1993:1 and using the properties of the stochastic process in Equation (3.21), where this stochastic process is estimated from 1948:3 to 1993:1.
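As a small illustration (our arithmetic, not the authors'), the first row of the table pins down the implied correlation between the innovations in y^μ and y*, using the identity that the orthogonal component's variance is Var(y^μ)(1 - ρ²):

```python
# From Table 3, panel b = -0.4: innovation variances relative to Var(y).
var_mu, var_orth = 0.05, 0.01
# Var(orthogonal) = Var(y_mu) * (1 - rho^2)  =>  rho^2 = 1 - var_orth/var_mu.
rho_sq = 1 - var_orth / var_mu
print(round(rho_sq, 2))          # 0.8
print(round(rho_sq ** 0.5, 2))   # |rho| is about 0.89
```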
in absolute value. The problem is that, as we saw before, this makes the markup less cyclical. Thus, it does not help in making y^μ track more of the cyclical movements in output. By the same token, setting a equal to a large negative number makes the markup more countercyclical but raises the term in parentheses in Equation (3.19), thereby reducing the effect of the markup on y^μ. We now turn to the case where adjustment costs imply that deviations of the markup from the steady state are given by Equation (2.15). We follow Sbordone (1996) in that we also let output vary with employee effort and this, as we saw, is consistent with Equation (2.15). Letting a be zero and using Equation (2.14), Equation (2.15) can be rewritten as

μ̂_t = ŷ_t - Ĥ_t - ŵ_t - c[(Ĥ_t - Ĥ_{t-1}) - βE_t(Ĥ_{t+1} - Ĥ_t)].   (3.22)
in hours. Thus, as in our earlier discussion of her model, we suppose that output is given by Y = F(K, zeH). As a result, we have

ŷ_t = η_H(Ĥ_t + ê_t).   (3.23)

We suppose as before that the wage bill is given by Hwg(e), where w captures all the determinants of the wage that are external to the firm and g is an increasing function. This leads once again to Equation (2.19) which, once linearized, can be written as

ê_t = (c/ε_w)[(Ĥ_t - Ĥ_{t-1}) - βE_t(Ĥ_{t+1} - Ĥ_t)].   (3.24)
Finally, we assume that average wages are given by

ŵ_t = ŵ_{0t} + η_w Ĥ_t + ψ ê_t.   (3.25)
It is important to stress that the parameters η_w and ψ do not correspond to the elasticities of the average wage paid by an individual firm with respect to the individual firm's hours and effort level. Rather, they are the elasticities of the economy-wide average wage with respect to aggregate changes in hours and average work effort. Note also that ŵ_{0t} is the exogenous component of the wage, i.e., the one that is not affected by changes in markups. Using Equations (3.23) and (3.25) to substitute for ŷ_t and ŵ_t respectively in Equation (3.22), and then using Equation (3.24) to substitute for ê_t in the resulting expression, we obtain

μ̂_t + ŵ_{0t} = (η_H − 1 − η_w)Ĥ_t + (η_H − ψ − ε_w)(c/ε_w)[(Ĥ_t − Ĥ_{t−1}) − βE_t(Ĥ_{t+1} − Ĥ_t)].
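As a cross-check on this substitution, the identity can be verified numerically. The script below is our own sketch: it writes out the linearized relations (3.22)–(3.25) as we read them, treats the expectation as realized, and uses purely illustrative parameter values.

```python
# Consistency check (a sketch): substituting the linearized relations
# (3.23), (3.24), (3.25) into (3.22) should reproduce the combined
# equation for mu_t + w0_t. Expectations are treated as realized values,
# and all parameter values are illustrative assumptions.
import random

random.seed(0)
beta, c, eps_w = 0.99, 8.0, 3.0
eta_H, eta_w, psi = 0.7, 0.25, 0.1

H_prev, H, H_next, w0 = (random.uniform(-1, 1) for _ in range(4))

dH = (H - H_prev) - beta * (H_next - H)   # the bracketed quasi-difference term
e = (c / eps_w) * dH                      # Equation (3.24)
y = eta_H * (H + e)                       # Equation (3.23)
w = w0 + eta_w * H + psi * e              # Equation (3.25)
mu = y - H - w - c * dH                   # Equation (3.22)

combined = (eta_H - 1 - eta_w) * H + (eta_H - psi - eps_w) * (c / eps_w) * dH
assert abs((mu + w0) - combined) < 1e-12
```

The check works for any draw of (Ĥ_{t−1}, Ĥ_t, Ĥ_{t+1}, ŵ_{0t}), since the identity is algebraic rather than statistical.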
This difference equation in Ĥ can be written as

βE_t[(1 − λ₁L)(1 − λ₂L)Ĥ_{t+1}] = ξ(μ̂_t + ŵ_{0t}),

where L is the lag operator while λ₁ and λ₂ are the roots of

βλ² − [1 + β + δ]λ + 1 = 0
and

δ ≡ [(1 + η_w − η_H)/(ψ + ε_w − η_H)](ε_w/c),    ξ = [1/(ψ + ε_w − η_H)](ε_w/c).
Noting that βλ₁ is equal to 1/λ₂ and letting λ be the smaller root (which is also smaller than one as long as 1 + η_w > η_H and ψ + ε_w > η_H), the solution of this difference equation is

Ĥ_t = −ξλ Σ_{k=0}^∞ Σ_{j=0}^∞ λ^k (βλ)^j E_{t−k}[μ̂_{t+j−k} + ŵ_{0,t+j−k}].   (3.26)
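The two roots can be computed directly from the quadratic above. The sketch below assumes β = 0.99 (our assumption) and builds δ and ξ from the illustrative elasticities used later in the text (η_H = 0.7, η_w = 0.25, ψ = 0.1, ε_w = 3, c = 8); with these assumed values the stable root comes out near 0.75 and ξ near 0.16, in the neighborhood of, though not identical to, the (λ, ξ) = (0.79, 0.13) used in the text, since those values also depend on the choice of β.

```python
import math

# Illustrative parameters (assumed for this sketch): a discount factor and the
# elasticities eta_H, eta_w, psi, eps_w, c discussed in the text.
beta, eta_H, eta_w, psi, eps_w, c = 0.99, 0.7, 0.25, 0.1, 3.0, 8.0

# delta and xi as defined below the difference equation
delta = (1 + eta_w - eta_H) / (psi + eps_w - eta_H) * (eps_w / c)
xi = 1.0 / (psi + eps_w - eta_H) * (eps_w / c)

# Roots of beta*lam**2 - (1 + beta + delta)*lam + 1 = 0
b = 1 + beta + delta
disc = math.sqrt(b * b - 4 * beta)
lam1 = (b - disc) / (2 * beta)   # stable root, below 1
lam2 = (b + disc) / (2 * beta)   # unstable root, above 1

assert lam1 < 1 < lam2                        # lambda = 1 lies between the roots
assert abs(beta * lam1 * lam2 - 1) < 1e-12    # product of roots equals 1/beta
print(round(lam1, 3), round(xi, 3))           # prints 0.75 0.156
```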
The deviations of hours from trend that are due to changes in markups, Ĥ^μ_t, can then be obtained by simply ignoring the movements in ŵ₀ in Equation (3.26). We can
J.J. Rotemberg and M. Woodford 1110
then compute the deviations of output from trend that are due to markup variations, ŷ^μ, by combining Equations (3.23) and (3.24) to yield

ŷ^μ_t = η_H{Ĥ^μ_t + (c/ε_w)[(Ĥ^μ_t − Ĥ^μ_{t−1}) − βE_t(Ĥ^μ_{t+1} − Ĥ^μ_t)]}.   (3.27)

This implies that, as before, ŷ^μ is a linear function of current and past values of Z_t. To see this, note first that Equation (3.22) implies that we can write μ̂_t as a function of Z_t. The reason for this is that (w − y) is a function of Z_t, H_t is part of Z_t and, as a result of Equation (3.21), E_tĤ_{t+1} is the corresponding element of AZ_t. Therefore, using Equation (3.21) once again, the expectation at t of future values of μ̂ must be a function of Z_t. Past expectations of markups which were, at that point, in the future are therefore functions of past Z's. The result is that we can use Equation (3.26) to write Ĥ^μ_t as a function of the history of the Z's 67. Finally, we use Equation (3.27) to write the component of output that is due to markup changes as a function of the Z's.
We require several parameter values to carry out this calculation. First, we set c equal to 8 to calculate μ̂_t in Equation (3.22). To compute the connection between ŷ^μ and the Z's we need three more parameters. It is apparent from Equations (3.26) and (3.27) that this calculation is possible if one knows λ, ξ and ε_w in addition to c (which is needed to compute markups anyway). For illustrative purposes, we set these three parameters equal to 0.79, 0.13 and 3 respectively. The parameters λ and ξ are not as easy to interpret as the underlying economic parameters we have used to develop the model. In addition to c and ε_w these include η_H, η_w and ψ. Because the number of these parameters is larger than the number of parameters we need to compute ŷ^μ, there is a one-dimensional set of these economically meaningful parameters that rationalizes our construction of ŷ^μ. In particular, while this construction is consistent with supposing that η_H, η_w, and ψ equal 0.7, 0.25 and 0.1, respectively, it is also consistent with different values for these parameters 68.
We use our knowledge of the relationship between ŷ^μ and the Z's for two purposes. As before, we compute the variances of the innovations and revisions in ŷ^μ as well as of y*. Second, we look at the resulting sample paths of ŷ^μ and y*. The second part of Table 3 contains the variances, which correspond to the ones we computed before. The results are quite different, however. In particular, the variance of the component of ŷ^μ that is orthogonal to y* now accounts for about 90% of the variance of the revisions in output growth over the next two years. Thus, independent markup movements are very important in explaining output fluctuations over "cyclical" horizons. Moreover, if one
67 Because we later want to compute the sample values of ŷ^μ we truncate k so that it runs only between zero and eighteen. Given that our λ equals 0.79, this truncation should not have a large effect on our results.
68 Note that we have made η_w, the elasticity of the wage with respect to hours along the aggregate labor supply curve, somewhat smaller than before because our use of a positive ψ implies that wages rise with output not only because hours rise but also because effort rises.
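The truncation at k = 18 mentioned in footnote 67 can be illustrated with a hypothetical forecasting rule. If markup forecasts followed an AR(1) rule E_{t−k}[μ̂_{t+j−k}] = ρ^j μ̂_{t−k} (our assumption for this sketch, not the chapter's projection on Z), the inner sum over j in Equation (3.26) becomes geometric, and the neglected tail beyond k = 18 is of order λ^19, roughly one percent:

```python
# Sketch: truncation error in the double sum of Equation (3.26) under a
# hypothetical AR(1) forecasting rule E_{t-k} mu_{t+j-k} = rho**j * mu_{t-k}.
# The constant in front of (3.26) cancels in the relative error below.
beta, lam, rho = 0.99, 0.79, 0.9      # assumed values; lam = 0.79 as in the text

def weighted_sum(mu_history, K):
    """Outer sum over k = 0..K; mu_history[k] is the markup observed k periods
    ago. Under the AR(1) rule the inner sum over j is geometric and equals
    1/(1 - beta*lam*rho)."""
    inner = 1.0 / (1.0 - beta * lam * rho)
    return sum(lam ** k * mu_history[k] * inner for k in range(K + 1))

mu_history = [1.0] * 200              # a flat markup path, purely for illustration
full = weighted_sum(mu_history, 199)
truncated = weighted_sum(mu_history, 18)   # the truncation used in footnote 67
rel_err = abs(full - truncated) / abs(full)
assert rel_err < 0.02                 # neglected tail is about lam**19, ~1%
```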
[Figure: two quarterly series, 1974–1992, plotted on a log scale running from 8.1 to 8.6.]
Fig. 3. Constant-markup and actual output.
takes the view that the movements of y that are genuinely attributable to y* are those which are not due to the component of ŷ^μ that is orthogonal to y*, the movements in y* account for only about 10% of the movements in y. Movements in y* have essentially no cyclical consequences. It is not that the revisions in the expectations of y* are constant. Rather, upwards revisions in y* over medium-term horizons are matched by increases in markups that essentially eliminate the effect of these revisions on y. This cannot be true over long horizons since the markup is assumed to be stationary, so that ŷ^μ is stationary as well. Thus, changes in y* that are predicted 20 years in advance account for about 80% of the revisions in output that are predicted 20 years in advance.
An analysis of the sample path of ŷ^μ (and the corresponding path of y*) delivers similar results. Such sample paths can be constructed since ŷ^μ depends on the Z's, which are observable. Admittedly, Equation (3.26) requires that the entire history of Z's be used. Data limitations thus force us to truncate k at 18, as explained in footnote 67. The result is that ŷ^μ depends on 18 lags of Z. To make sure that the lagged expectations of markups which enter Equation (3.26) are computed within the period where the labor share remains a constant stationary function of Z_t, we construct this sample path starting in 1973:2. The resulting values of y* and the log of output y are plotted in Figure 3. It is apparent from this figure that the episodes that are usually regarded as recessions (which show up in the figure as periods where y is relatively low) are not reflected in movements of y*. Figure 4 plots instead ŷ^μ against the predicted declines of output over the next 12 quarters. These series are nearly identical, so that, according to this measure of the markup, almost all cyclical movements in output since 1973
[Figure: two quarterly series, 1974–1992, on a scale from −0.08 to 0.06; legend: "Predicted decline".]
Fig. 4. Markup-induced output gap and predicted output declines.
are attributable to markup variations. This second measure of markups is thus much more successful in accounting for cyclical output movements. This result is probably partly due to the fact that this method of estimating ŷ^μ recognizes the possibility that, in booms, output expands more than is suggested by the labor input as a result of increases in effort 69.

4. Models of variable markups
We now briefly review theoretical models of markup variation. We give particular attention to models in which markups vary endogenously, and thus affect the way the economy responds to shocks. The shocks of interest include both disturbances that shift the marginal cost schedule and other sorts of shocks, where these other shocks would not have any effect on equilibrium output in the absence of an effect upon equilibrium markups. Before reviewing specific models, it is perhaps worth commenting upon the kind of theoretical relations between markups and other variables that are of interest to us. It is important to note that an explanation for countercyclical markups need not depend upon a theory that predicts that desired markups should be a decreasing function of the level of economic activity. If the real marginal cost schedule c(Y) is upward-sloping,
69 For another setting where inferences regarding markups are significantly affected by supposing that there are costs of adjusting labor, see Blanchard (1997).
then any source of variations in the markup that are independent of variations in the marginal cost schedule itself will result in inverse variations in the level of output, and so a negative correlation between the markup and economic activity. Thus theories of why markups should vary as functions of interest rates or inflation (rather than of the current level of economic activity) might well be successful explanations of the cyclical correlations discussed in Section 2. In fact, a theory according to which the markup should be a function of the level of economic activity is, in some respects, the least interesting kind of theory of endogenous markup variation. This is because substitution of μ = μ(Y) into Equation (1.1) still gives no reason for equilibrium output Y to vary, in the absence of shifts in the marginal cost schedule. (Such a theory, with μ a decreasing function of Y, could however serve to amplify the output effects of shifts in that schedule.) Care is also required in relating theories of pricing by a particular firm or industry, as a function of conditions specific to that firm or industry, to their implications for aggregate output determination. For example, a theory according to which a firm's desired markup is an increasing function of its relative output, μ_i = μ(Y_i/Y) with μ′ > 0, might be considered a theory of "procyclical markups". But in a symmetric equilibrium, in which all firms price according to this rule, relative prices and outputs never vary, and there will be no cyclical markup variation at all. If instead (as discussed in Section 4.3 below) not all firms continuously adjust their prices, the fact that adjusting firms determine their desired markup in this way can reduce the speed of overall price adjustment; and this increase in price stickiness can increase the size of the countercyclical markup variations caused by disturbances such as changes in monetary policy. The models we look at fall into two broad categories.
In the first class are models where firms are unable to charge the price (markup) that they would like to charge because prices are sticky in nominal terms. Monetary shocks are then prime sources of discrepancies between the prices firms charge and the prices they would like to charge. This leads to changes in markups that change output even if desired markups do not change. In the second class of models, real factors determine variations in desired markups, even in the case of complete price flexibility. Finally, we briefly consider interactions between these two types of mechanisms.

4.1. Sticky prices

We do not provide a thorough survey of sticky price models since that is taken up in Taylor (1999). Rather, our aim is threefold. First, we want to show how price stickiness implies markup variations, and so may explain some of the findings summarized in our previous sections. Second, we want to argue that markup variations are the crucial link through which models with sticky prices lead to output variations as a result of monetary disturbances. In particular, such models imply a link between inflation and markups which is much more robust than the much-discussed link between inflation and output. Thus viewing these models as models of endogenous markup
variation may help both in understanding their consequences and in measuring the empirical significance of the mechanisms they incorporate. Finally, we discuss why sticky prices alone do not suffice to explain all of the evidence, so that other reasons for countercyclical markups also seem to be needed.
It might seem at first peculiar to consider output variations as being determined by markup variations in a model where prices are sticky. For it might be supposed that if prices are rigid, output is simply equal to the quantity demanded at the predetermined prices, so that aggregate demand determines output directly. However, this is true only in a model where prices are absolutely fixed. It is more reasonable to suppose that some prices adjust, even over the time periods relevant for business cycle analysis. The issue then becomes the extent to which prices and output adjust, and, as we shall see, this is usefully understood in terms of the determinants of markup variation.
We illustrate the nature of markup variations in sticky-price models by presenting a simple but canonical example, which represents a discrete-time variant of the model of staggered pricing of Calvo (1983), the implications of which are the same (up to our log-linear approximation) as those of the Rotemberg (1982) model of convex costs of price adjustment. First, we introduce a price-setting decision by assuming monopolistic competition among a large number of suppliers of differentiated goods. Each firm i faces a downward-sloping demand curve for its product of the form

Y_t^i = D(P_t^i/P_t) Y_t,   (4.1)

where P_t^i is the price of firm i at time t, P_t is an aggregate price index, Y_t is an index of aggregate sales at t, and D is a decreasing function. We suppose that each firm faces the same level of (nominal) marginal costs C_t in a given period 70. Then neglecting fixed costs, profits of firm i at time t are given by
Π_t^i = (P_t^i − C_t) D(P_t^i/P_t) Y_t.
Following Calvo, we assume that in each period t, a fraction (1 − α) of firms are able to change their prices while the rest must keep their prices constant. A firm that changes its price chooses it in order to maximize

E_t Σ_{j=0}^∞ α^j R_{t,t+j} (P_t^i − C_{t+j}) D(P_t^i/P_{t+j}) Y_{t+j},
where R_{t,t+j} is the stochastic discount factor for computing the present values at t of a random level of real income at date t + j. (The factor α^j represents the probability
70 Note that we have discussed above reasons why this need not be so, for example when a firm's marginal wage differs from its average wage. As Kimball (1995) shows, deviations from this assumption may matter a great deal for the speed of aggregate price adjustment, but we confine our presentation to the simplest case here.
that this price will still apply j periods later.) Denoting by X_t the new price chosen at date t by any firms that choose then, the first-order condition for this optimization problem is

E_t Σ_{j=0}^∞ α^j R_{t,t+j} Y_{t+j} D(X_t/P_{t+j}) [1 − ε_D(X_t/P_{t+j}) (X_t − C_{t+j})/X_t] = 0,   (4.2)

where ε_D(x) = −xD′(x)/D(x) is the elasticity of the demand curve (4.1). For now, we further simplify by assuming that the elasticity of demand is a positive constant [as would result from the kind of preferences over differentiated goods assumed by Dixit and Stiglitz (1977)]. This means that a firm's desired markup, in the case of flexible prices, would be a constant, μ* = ε_D/(ε_D − 1). In this way we restrict attention to markup variations due purely to delays in price adjustment. It is useful to take a log-linear approximation of the first-order condition (4.2) around a steady state in which all prices are constant over time and equal to one another, marginal cost is similarly constant, and the constant ratio of price to marginal cost equals μ*. Letting x_t, π_t and c_t denote percentage deviations of the variables X_t/P_t, P_t/P_{t−1} and C_t/P_t, respectively, from their steady-state values, Equation (4.2) becomes (4.3)
where β
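The constant-elasticity benchmark can be made concrete: with fully flexible prices (α = 0), only the j = 0 term of the first-order condition survives, and the reset price is a markup ε_D/(ε_D − 1) over marginal cost. The sketch below assumes Dixit–Stiglitz demand D(x) = x^{−ε} (our illustrative choice) and checks this by brute force:

```python
# Sketch: with flexible prices (alpha = 0) the first-order condition reduces
# to maximizing single-period profit (X - C) * D(X/P) * Y. Assuming
# Dixit-Stiglitz demand D(x) = x**(-eps), the optimum is the constant
# desired markup over marginal cost: X = eps/(eps - 1) * C.
eps, C, P, Y = 6.0, 1.0, 1.0, 1.0

def profit(X):
    return (X - C) * (X / P) ** (-eps) * Y

# Crude grid search for the profit-maximizing reset price
grid = [0.5 + 0.0001 * i for i in range(20000)]
X_star = max(grid, key=profit)

mu_star = eps / (eps - 1)          # desired markup: 1.2 when eps = 6
assert abs(X_star - mu_star * C) < 1e-3
```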
, such long term progress increases the number of firms and thereby reduces markups. One way of avoiding this difficulty is to assume that entry simply leads to an increased number of goods being produced by monopolistically competitive firms, as in Devereux, Head and Lapham (1996) or Heijdra (1995). These authors assume that the monopolistically competitive firms produce intermediate goods that are purchased by firms which combine them into final goods by using a Dixit–Stiglitz (1977) aggregator. The result is that increased entry does not change the ratio of price to marginal cost. It does, however, reduce the price of final goods relative to the price of intermediate goods, because final goods can be produced more efficiently when there are more intermediate goods. This reduction in the price of final goods effectively raises real wages and, particularly if it is temporary, leads to an increase in labor supply. Thus, Devereux, Head and Lapham (1996) show that, in their model, an increase in government purchases raises output together with real wages. The increase in output comes about because the increased government purchases make people feel poorer and this promotes labor supply; this results in a shift out of the real marginal cost schedule. The real wage in terms of final goods then rises because of the increase in the number of intermediate goods firms.
4.3. Interactions between nominal rigidities and desired markup variations

Finally, we briefly consider the possibility that markups vary both because of delays in price adjustment and because of variations in desired markups, for one or another of the reasons just sketched. The possibility that both sources matter is worth mentioning, since interactions between the two mechanisms sometimes lead to effects that might not be anticipated from the analysis of either in isolation. For example, variations in desired markups may amplify the effects of nominal rigidities, making aggregate price adjustment even slower, and hence increasing the
81 Portier (1995) also considers a model where markups fall not only because entry occurs in booms but also because the threat of entry leads incumbent firms to lower their prices.
output effects of monetary policy, even if the model of desired markup variation would not imply any interesting effects of monetary policy in the case of flexible prices. To illustrate this, let us consider again the discrete-time Calvo model of Section 4.1, but now drop the assumption that the function D has a constant elasticity with respect to the relative price. In this case, log-linearization of Equation (4.2) yields (4.15) as a generalization of Equation (4.3), where μ̂_t^des denotes the percentage deviation of the desired markup

μ_t^des = ε_{D,t}/(ε_{D,t} − 1)

from its steady-state value. The elasticity ε_D, and hence the desired markup, is a function of the relative price of the given firm i, or equivalently of the firm's relative sales Y_i/Y. Letting the elasticity of the desired markup with respect to relative sales be denoted ξ, we obtain (4.16), as a consequence of which Equation (4.15) implies an equation of the same form as (4.3), but with the variable c_t replaced by (1 + ξε_D)^{−1}c_t each period. This in turn allows us to derive again an equation of the form (4.5), except that now

κ = [1/(1 + ξε_D)] (1 − αβ)(1 − α)/α.   (4.17)

Slower aggregate price adjustment thus obtains when ξ > 0, i.e., an elasticity of demand decreasing in the firm's relative output. Kimball (1995) shows how to construct aggregator functions that lead to arbitrary values of ξ. Thus, this model can rationalize extreme price stickiness even when the fraction of firms that change prices is relatively high. Woglom (1982) and Ball and Romer (1990) suggest that search costs provide an alternative rationale for a positive ξ. The idea is that search costs imply that relatively small price increases lead many customers to depart while small price reductions only attract relatively few customers. A smoothed version of this kinked demand curve gives the variable elasticity just hypothesized. Equation (4.17) implies that ξ > 0 makes κ smaller
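The effect of ξ on κ in Equation (4.17) is easy to verify numerically; the values of α, β and ε_D below are our own illustrative assumptions:

```python
# Sketch: the coefficient kappa of Equation (4.17),
#   kappa = 1/(1 + xi*eps_D) * (1 - alpha*beta)*(1 - alpha)/alpha,
# for assumed values of alpha, beta, eps_D. A positive xi shrinks kappa,
# i.e. it slows aggregate price adjustment.
def kappa(xi, alpha=0.75, beta=0.99, eps_D=6.0):
    return (1.0 / (1.0 + xi * eps_D)) * (1 - alpha * beta) * (1 - alpha) / alpha

# xi = 0 recovers the constant-elasticity value of kappa
assert abs(kappa(0.0) - (1 - 0.75 * 0.99) * 0.25 / 0.75) < 1e-12
# and kappa falls monotonically as xi rises
assert kappa(1.0) < kappa(0.1) < kappa(0.0)
```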
[Figure: scatter of twelve OECD countries (USA, CA, SW, UK, JA, AU, FR, DE, IT, NE, GE, NO); horizontal axis: increase in inequality (%), from −20 to 30; vertical axis: rise in unemployment.]
Fig. 5. Rise in unemployment and wage inequality (late 1970s to late 1980s).
that should be explainable within the search and matching framework, though to our knowledge there are as yet no models that claim to explain it fully. We make an attempt to explain it in Section 5.1 [see also Mortensen and Pissarides (1999)]. Thus, for twelve OECD countries with comparable data on wage inequality, there appears to be a close correlation between the percentage change in wage inequality during the 1980s and the percentage rise in unemployment. Wage inequality is measured by the ratio of the earnings of the most educated group in the population to the least educated [usually, university graduates versus early school leavers; see OECD (1994), pp. 160–1]. Other measures of inequality, however, give similar results [e.g. OECD (1994), p. 3; the results in Galbraith (1996) are also consistent with our claim, despite his claim to the contrary, if one measures the change in inequality by the change in the Gini coefficient of the wage distribution]. Figure 5 shows that the USA, Canada and Sweden experienced the biggest rises in inequality and the smallest rises in unemployment (a fall in the USA). Japan and Australia come next, with moderate rises in both, and the European countries follow, with small rises or falls in inequality but big rises in unemployment. The only country that does not conform to this rule is the United Kingdom, which experienced a North American-style increase in inequality and a European-style increase in unemployment over the sample of the chart. Recently, however, unemployment in the UK has fallen substantially, giving support to the view that the reforms of the 1980s moved the United
Ch. 18: Job Reallocation, Employment Fluctuations and Unemployment 1183
Kingdom closer to a US-style economy but had their impact first on inequality and only more recently on unemployment.
2. The equilibrium rate of unemployment
Here we introduce the formalities of the search and matching approach and derive the equations that express the dynamics of the stock of unemployment (or employment). This analysis will point to the variables that need to be explained in order to arrive at an equilibrium characterization of employment flows and unemployment levels. We shall talk explicitly about unemployment, with the solution for employment implied by the assumption of an exogenous labor force. The search and matching approach to aggregate labor market analysis is based on Pissarides' (1990) model of equilibrium unemployment as extended by Mortensen and Pissarides (1994) to allow for job destruction. The approach interprets unemployment as the consequence of the need to reallocate workers across activities and the fact that the process takes time. The model is founded on two constructs, a matching function that characterizes the search and recruiting process by which new job-worker matches are created and an idiosyncratic productivity shock that captures the reason for resource reallocation across alternative activities. Given these concepts, decisions about the creation of new jobs, about recruiting and search effort, and about the conditions that induce job-worker separations can be formalized 1. The job-worker matching process is similar to a production process, in which "employment" is produced as an intermediary production input. The output, the flow of new matches, is produced with search and recruiting efforts supplied by workers and employers respectively. As a simple description, the existence of a market matching function is invoked, an aggregate relation between matching output and the inputs. Under the simplifying assumption that all employers with a vacancy recruit with equal intensity and that only unemployed workers search, also at a given intensity, aggregate matching inputs can be represented simply by the numbers of job vacancies v and of unemployed workers u.
Let the function m(v, u) represent the matching rate associated with every possible vacancy and unemployment pair. As in production theory, it is reasonable to suppose
1 Of course, that at least some unemployment is due to "frictional" factors has always been recognized. Lilien (1982) was among the first to claim that even "cyclical" unemployment was of this kind. Although his results have been criticized, e.g. by Abraham and Katz (1986) and Blanchard and Diamond (1989), the modern approach to unemployment groups all kinds of unemployment into one, as we do here.
D.T. Mortensen and C.A. Pissarides 1184
that this function is increasing in both arguments but exhibits decreasing marginal products to each input. Constant returns, in the sense that

m(v, u) = m(1, u/v) v ≡ q(θ) v,  where  θ ≡ v/u,   (2.1)
is a convenient additional assumption, one that is consistent with available evidence 2. The ratio of vacancies to unemployment, θ, market tightness, is an endogenous variable to be determined. On average, a job is filled by a worker at the rate m(v, u)/v = q(θ) and workers find jobs at rate m(v, u)/u = θq(θ). By the assumption of a constant returns matching function, q(θ) is decreasing and θq(θ) increasing in θ. θq(θ) represents what labor economists call the unemployment spell duration hazard 3. The duration of unemployment spells is a random exponential variable with expectation equal to the inverse of the hazard, 1/θq(θ), a decreasing function of market tightness. Analogously, q(θ) is the vacancy duration hazard and its inverse, 1/q(θ), is the mean duration of vacancies. As noted above, the most important source of job-worker separations is job destruction attributable to an idiosyncratic shock to match productivity. Because initial decisions regarding location, technology, and/or product line choices embodied in a particular match are irreversible, subsequent innovations and taste changes, not known with certainty at the time of match formation, shock the market value of the product or service provided. For example, the initial decision might involve the choice of locating a productive activity on one of many "islands". In future, island-specific conditions that affect total match productivity, say the weather, may change. If the news about future profitability implicit in the shock is bad enough, then continuation of the activity on that particular island is no longer profitable. In this case, the worker loses the job. To model this idea, we assume that the productivity of each job is the mathematical product of two components, p, which is common to all jobs, and x, which is idiosyncratic. The idiosyncratic component takes values in the range [0, 1], it is distributed according to the c.d.f. F(x), and new shocks arrive at the Poisson rate λ.
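A standard constant-returns example (our illustrative choice, not one imposed by the text) is the Cobb–Douglas matching function m(v, u) = A v^η u^{1−η}; the sketch below checks the homogeneity property of Equation (2.1) and the monotonicity of the two hazards:

```python
# Sketch: Cobb-Douglas matching m(v,u) = A * v**eta * u**(1-eta), an assumed
# functional form exhibiting the constant returns of Equation (2.1).
A, eta = 0.6, 0.5

def m(v, u):
    return A * v ** eta * u ** (1 - eta)

def q(theta):
    # vacancy-filling rate: m(v,u)/v = m(1, u/v) = A * theta**(eta - 1)
    return A * theta ** (eta - 1)

# Constant returns to scale: m(c*v, c*u) = c * m(v, u)
assert abs(m(2.0, 4.0) - 2 * m(1.0, 2.0)) < 1e-12
# q(theta) is decreasing, while theta*q(theta) (the job-finding rate) increases
thetas = [0.5, 1.0, 2.0]
qs = [q(t) for t in thetas]
assert qs[0] > qs[1] > qs[2]
assert thetas[0] * qs[0] < thetas[1] * qs[1] < thetas[2] * qs[2]
```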
Note that these assumptions satisfy the empirical properties of idiosyncratic job destruction, i.e. the shocks have persistence and they appear to hit the job independently of the aggregate state of the economy (here represented by p). Entrepreneurs are unconstrained with respect to initial location, technology and product choice and also have the same information about market conditions. Under the assumption that they know the product that commands the highest productivity, all will create jobs at the highest idiosyncratic productivity, x = 1. Given this property
2 See Pissarides (1986) and Blanchard and Diamond (1989).
3 As workers are generally happy when an unemployment spell ends, the unemployment hazard seems an ironic label. This unfortunate term is borrowed from statistical duration analysis where the typical spell is that of a "life" that ends as a consequence of some "hazard", e.g. a heart attack or a failure.
of the model and the assumption that future match product evolves according to a Markov process with persistence, all matches are equally productive initially, until a shock arrives 4. Under these assumptions, an existing match starts life with x = 1 but is eventually destroyed when a new value of x arrives below some reservation threshold, another endogenous variable denoted as R. Unemployment incidence λF(R), the average rate of transition from employment to unemployment, increases with the reservation threshold. As all workers are assumed to participate, the unemployed fraction evolves over time in response to the difference between the flow of workers who transit from employment to unemployment and the flow that transits in the opposite direction, i.e.,

u̇ = λF(R)(1 − u) − θq(θ)u,   (2.2)

where 1 − u represents both employment and the employment rate. The steady-state equilibrium unemployment rate is

u = λF(R)/(λF(R) + θq(θ)).   (2.3)
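Iterating a discretized version of the flow equation (2.2) forward confirms convergence to the steady state (2.3); the transition rates below are assumed purely for illustration:

```python
# Sketch: simulate u' = lamF*(1 - u) - theta_q*u, a discretized Equation (2.2),
# with assumed hazards, and compare with the steady state of Equation (2.3).
lamF = 0.05      # employment-to-unemployment hazard lambda*F(R) (assumed)
theta_q = 0.45   # unemployment-to-employment hazard theta*q(theta) (assumed)

u = 0.20         # arbitrary initial unemployment rate
dt = 0.1
for _ in range(5000):
    u += dt * (lamF * (1 - u) - theta_q * u)

u_star = lamF / (lamF + theta_q)   # Equation (2.3): 0.1 with these rates
assert abs(u - u_star) < 1e-8
```

The fixed point is globally stable here because the flow out of unemployment rises, and the flow in falls, as u increases.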
Equivalently, individual unemployment histories are described by a simple two-state Markov chain where the steady-state unemployment rate is also the fraction of time over the long run that the representative participant spends unemployed. It decreases with market tightness and increases with the reservation product, because the unemployment hazard θq(θ) and the employment hazard λF(R) are both increasing functions.

2.1. Job destruction and job creation conditions

A formal equilibrium model of unemployment requires specification of preferences, expectations, and a wage determination mechanism. We assume that both workers and employers maximize wealth, defined as the expected present value of future net income streams conditional on current information. Forward looking rational expectations are imposed. Several wage determination mechanisms are consistent with the matching approach. Following much of the literature, we shall assume bilateral bargaining as the baseline model. Given this specification, equilibrium market tightness satisfies the following job creation condition: the expected present value of the future return to hiring a worker equals the expected cost. The hiring decision is implicit in the act of posting a
job vacancy and is taken by an employer. In contrast, the equilibrium reservation product, R, reflects the decisions of both parties to continue an existing employment relationship. Individual rationality implies that separation occurs when the forward looking capital value of continuing the match to either party is less than the capital value of separation. For joint rationality, the sum of the values of continuing the match must be less than the sum of the values of separating, otherwise a redistribution of the pair's future incomes can make both better off. Whether these job destruction conditions also satisfy the requirements of joint optimality depends on the wage mechanism assumed. For a given wage determination mechanism, a search equilibrium is a pair (R, θ) that simultaneously solves these job creation and job destruction conditions. For expositional purposes, we invoke the existence of a wage mechanism general enough to accommodate the special cases of interest. A wage contract, formally a pair (w₀, w(x)), is composed of a starting wage w₀ ∈ ℝ and a continuing wage function w : X → ℝ that obtains after any future shock to match-specific productivity. Implicit in this specification is the idea that a worker and an employer negotiate an initial wage when they meet and then subsequently renegotiate in response to new information about the future value of their match 5. A continuing match has specific productivity x and the worker is paid a wage w(x). Given that the match ends in the future if a new match-specific shock z arrives which is less than some reservation threshold R, its capital value to an employer, J(x), solves the following asset pricing equation
rJ(x) = px − w(x) + λ ∫_R^1 [J(z) − J(x)] dF(z) + λF(R)[V − pT − J(x)],   (2.4)
where r represents the risk-free interest rate, V is the value of a vacancy, and pT denotes a firing cost borne by the employer, represented as forgone output. We multiply the termination cost by p to show that it is generally more expensive to fire a more skilled worker than a less skilled one. The termination cost is assumed to be a pure tax, not a transfer payment to the worker, and to be policy-determined. For example, it may represent the administrative cost of applying for permission to fire, as is the case in many European countries. Of course, T ≥ 0 and none of the fundamental results are due to a strictly positive T. Condition (2.4), that the return on the capital value of an existing job-worker match to the employer is equal to current profit plus the expected capital gain or loss associated with the possible arrival of a productivity shock, is a continuous-time

5 Note that contracts of this form are instantly "renegotiated" on the arrival of a new idiosyncratic shock. MacLeod and Malcomson (1993) persuasively argue that the initial wage need not be adjusted until an event occurs that would otherwise yield an inefficient separation. Contracts of this form may well generate more realistic wage dynamics, but job creation and job destruction decisions are the same under their specification and ours. Hence, for the purpose at hand, there is no relevant difference.
Ch. 18:
Job Reallocation, Employment Fluctuations and Unemployment
Bellman equation. An analogous relationship implicitly defines the asset value of the same match to the worker involved, W(x). Namely,
rW(x) = w(x) + λ ∫_R^1 [W(z) − W(x)] dF(z) + λF(R)[U − W(x)],   (2.5)
where U is the capital value of unemployment. Given a match product shock z, the employer prefers separation if and only if the value of separation, V − pT, exceeds the value of continuation J(z). Similarly, the worker will opt for unemployment if and only if its value, U, exceeds W(z). Given that both J(z) and W(z) are increasing, separation occurs when a new value of the shock arrives that falls below the reservation threshold
R = max{Re, Rw},   (2.6)

where J(Re) = V − pT and W(Rw) = U. Because in the bilateral bargain wealth is transferable between worker and employer, the separation rule should be jointly optimal in the sense that it maximizes their total wealth. The necessary and sufficient condition for joint optimization is that R = Re = Rw, where J(R) + W(R) = V − pT + U, a condition that holds only for an appropriately designed wage contract 6.

Although the idiosyncratic component of a new job match is x = 1, the expected profit from a new match will generally be different from J(1), as defined in Equation (2.4), because of the existence of a job creation cost. We therefore introduce the notation J0 for the expected profit of a new match to the employer and write the asset pricing equation for the present value of an unfilled vacancy, V, as

rV = −pc + q(θ)[J0 − V − pC],   (2.7)
where pc is the recruiting cost flow per vacancy held, and pC is a fixed cost of hiring and training a new worker plus any other match-specific investment required. Here these costs are indexed by the aggregate productivity parameter to reflect the fact that the forgone output that these costs represent is larger when labor is more productive. The value of unemployment solves
rU = b + θq(θ)[W0 − U],   (2.8)
where b represents unemployment-contingent income. Crucially for many of the results that hold in matching equilibrium, unemployment-contingent income is independent of employment income and of the aggregate state of the economy.

6 See Mortensen (1978) for an early analysis of this issue within the search equilibrium framework. For alternative approaches to the modeling of the job destruction flow, see Bertola and Caballero (1994), who model a firm with many employees moving between a high-employment and a low-employment state, and Caballero and Hammour (1994), who analyze the implications of sunk costs and appropriation.
Given an initial wage equal to w0, the by now familiar asset pricing relations imply that the initial values of a match to employer and worker respectively satisfy

rJ0 = p − w0 + λ ∫_R^1 [J(z) − J0] dF(z) + λF(R)[V − pT − J0]   (2.9)

and

rW0 = w0 + λ ∫_R^1 [W(z) − W0] dF(z) + λF(R)[U − W0],   (2.10)
where J(x) and W(x) represent the values of match continuation defined above.

The job creation condition that we defined earlier is equivalent to a free entry condition for new vacancies. The exploitation of all profitable opportunities from job creation requires that new vacancies be created until the capital value of holding one open is driven to zero, i.e.,

V = 0  ⟺  c/q(θ) + C = J0/p.   (2.11)
As the expected number of periods required to fill a vacancy is 1/q(θ), the condition equates the cost of recruiting and hiring a worker to the anticipated discounted future profit stream. The fact that vacancy duration is increasing in market tightness guarantees that free entry will act to equate the two.
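As a numerical illustration of condition (2.11), the sketch below inverts the free-entry condition for θ under an assumed Cobb-Douglas matching technology, q(θ) = mθ^(−α); the functional form and all parameter values are illustrative, not part of the model above.

```python
# Hedged sketch: invert the free-entry condition c/q(theta) + C = J0/p for
# market tightness theta, assuming q(theta) = m * theta**(-alpha)
# (an illustrative Cobb-Douglas matching technology; m, alpha, c, C are made up).

def tightness_from_free_entry(J0_over_p, c=0.3, C=0.2, m=0.7, alpha=0.5):
    """Return the theta that drives the value of an open vacancy to zero."""
    net = J0_over_p - C        # discounted profit net of the fixed hiring cost C
    if net <= 0:
        return 0.0             # no vacancy is worth posting
    # c/q(theta) = net  =>  c * theta**alpha / m = net
    return (m * net / c) ** (1.0 / alpha)

theta = tightness_from_free_entry(J0_over_p=1.0)
```

A higher expected profit J0/p raises θ; the implied lengthening of expected vacancy duration 1/q(θ) is what restores zero profit after entry.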
2.2. Generalized Nash bargaining

The generalized axiomatic Nash bilateral bargaining outcome with "threat point" equal to the option of looking for an alternative match partner is the baseline wage specification assumption found in the literature on search equilibrium 7. Given that the existence of market friction creates quasi-rents for any matched pair, bilateral bargaining after worker and employer meet is the natural starting point for an analysis 8.

7 See Diamond (1982b), Mortensen (1978, 1982a,b), and Pissarides (1985, 1990).
8 Binmore, Rubinstein and Wolinsky (1986), Rubinstein and Wolinsky (1985) and Wolinsky (1987) applied Rubinstein's strategic model in the search equilibrium framework. The analyses in these papers imply the following: If the worker searches and the employer recruits at the same intensities and if β is interpreted as the probability that the worker makes the wage demand (1 − β is the probability that the employer makes an offer) in each round of the bargaining, then the unique Markov perfect solution to the strategic wage bargaining is the assumed generalized Nash solution. If neither searches but there is a positive probability of an exogenous job destruction shock during negotiations, the solution is again the one assumed but with β = ½. However, if neither seeks an alternative partner while bargaining and there is zero probability of job destruction, the strategic solution divides the joint product of the match J0 − pC + W0 subject to the constraint that both receive at least the option value of searching and recruiting, U and V, rather than the net surplus, as we assumed. As these bargaining outcomes generate the same job creation and job destruction decisions, we consider only the former case with a β between 0 and 1.
Given the notation introduced above, the starting wage determined by the generalized Nash bargain over the future joint income stream foreseen by worker and employer supports the outcome

w0 = arg max { [W0 − U]^β [S0 − (W0 − U)]^(1−β) },

subject to the following definition of initial match surplus,

S0 = J0 − pC − V + W0 − U.   (2.12)
In the language of axiomatic bargaining theory, the parameter β represents the worker's relative "bargaining power." Analogously, the continuing wage contract supports the outcome
w(x) = arg max { [W(x) − U]^β [S(x) − (W(x) − U)]^(1−β) },

where continuing match surplus is defined by

S(x) = W(x) − U + J(x) − V + pT.   (2.13)
The difference between the initial wage bargain and subsequent renegotiation arises for two reasons. First, hiring costs are "sunk" in the latter case but "on-the-table" in the former. Second, termination costs are not incurred if no match is formed initially but must be paid if an existing match is destroyed. The solutions to these two different optimization problems satisfy the following first-order conditions
β(J0 − V − pC) = (1 − β)(W0 − U).

p_h > p_l implies θ_h > θ_l and R_h < R_l in this case. This fact generalizes, but only if the aggregate shock frequency η is not too large. For η > 0, the equilibrium relationships are more complicated because forward-looking agents, knowing themselves to be in state i, anticipate the effects and likelihood of transiting to state j in the future. Indeed, under generalized Nash bargaining in which the initial wage is set contingent on the aggregate state and the continuing wage is renegotiated in the event of either an aggregate or a match-specific shock, the surplus value of a continuing match with idiosyncratic productivity x in aggregate state i, S_i(x), and the surplus value of a new match in state i, S_0i, satisfy the following generalization of Equations (2.18) and (2.21):
rS_i(x) = p_i x − r(U_i + V_i − p_i T) + λ ∫_{R_i}^1 [S_i(z) − S_i(x)] dF(z) + η[S_j(x) − S_i(x)],

rS_0i = p_i x0 − r(U_i + V_i − p_i T) − (r + λ + η) p_i (C + T) + λ ∫_{R_i}^1 [S_i(z) − S_0i] dF(z) + η[S_0j − S_0i],   (3.1)

11 For example, see Kydland and Prescott (1982) and Lucas (1987).
where the aggregate state contingent values of a vacancy and unemployment solve

rV_i = q(θ_i)(1 − β) S_0i + η(V_j − V_i) − p_i c,
rU_i = b + θ_i q(θ_i) β S_0i + η(U_j − U_i).   (3.2)
An equilibrium now is a state-contingent reservation threshold and market tightness pair (R_i, θ_i), one for i = l and another for i = h, that satisfies the free entry job creation condition and the job destruction condition in both states, i.e.,

V_i = 0  and  S_i(R_i) = 0,  i ∈ {l, h}.

Market tightness is procyclical, and market tightness and the reservation product threshold move in opposite directions in response to aggregate shocks, if the shock is sufficiently persistent. Formally, a unique equilibrium exists with the property that p_h > p_l ⟹ R_h < R_l for all η, while a critical value ∞ > η̄ > 0 exists such that θ_h > θ_l if and only if η < η̄ 12.
Aggregate state contingent equilibrium market tightness is actually lower in the higher aggregate product state for sufficiently large values of the shock frequency, because investment in job creation is relatively cheaper when productivity is low and because the present value of the returns to job creation investments is independent of the current aggregate state in the limit as η becomes large. In other words, job creation investment is larger when aggregate productivity is higher only if the expected return given high current productivity offsets the cost advantage of investment in the low productivity state, a condition that requires sufficient persistence in the productivity shock.
3.2. The Beveridge curve

As just demonstrated, "boom" and "bust" in this simple model are synonymous with the prevalence of the "high" and "low" average labor productivity when the aggregate shock is persistent. Unemployment dynamics in each aggregate state are determined by the law of motion

u̇ = λF(R_i)(1 − u) − θ_i q(θ_i) u.   (3.3)

Hence, the unemployment rate tends toward the lower of the two aggregate state contingent values, represented by
u*_i = λF(R_i) / [λF(R_i) + θ_i q(θ_i)],  i ∈ {l, h},   (3.4)
during a boom and tends toward the higher value in a bust.

12 Formal derivations of the value equations, those of (3.1) and (3.2), and proofs can be found in Burdett, Mortensen and Wright (1996).
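The steady state (3.4), and convergence of unemployment toward it, can be checked with a short numerical sketch. The matching function, the uniform shock distribution, and all parameter values are illustrative assumptions, not calibrated magnitudes.

```python
# Sketch: state-contingent steady-state unemployment, Equation (3.4), plus a
# check that the law of motion (3.3) converges to it. The matching function
# q(theta) = m*theta**(-alpha) and the uniform F are illustrative assumptions.

def q(theta, m=0.7, alpha=0.5):
    return m * theta ** (-alpha)            # meeting rate per vacancy

def F(x):
    return min(max(x, 0.0), 1.0)            # c.d.f. of idiosyncratic shocks

def u_star(R, theta, lam=0.1):
    sep = lam * F(R)                        # employment-exit hazard lambda*F(R)
    return sep / (sep + theta * q(theta))   # Equation (3.4)

def simulate_u(R, theta, u0=0.05, dt=0.01, T=500.0, lam=0.1):
    u = u0
    for _ in range(int(T / dt)):            # Euler steps on Equation (3.3)
        u += dt * (lam * F(R) * (1.0 - u) - theta * q(theta) * u)
    return u

# The "high" state (lower R, higher theta) has lower steady-state unemployment:
u_low_state = u_star(R=0.6, theta=0.5)
u_high_state = u_star(R=0.4, theta=1.0)
```

Simulating (3.3) from any initial condition converges to the corresponding u*_i, which is the sense in which unemployment "tends toward" the state-contingent value.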
Fig. 7. The Beveridge curve.
The observation that actual vacancy and unemployment time series are negatively correlated is consistent with this model under appropriate conditions, a fact illustrated in Figure 7. In the figure, the two rays from the origin, labeled θ_l and θ_h, represent the vacancy-unemployment ratios in the two aggregate states when θ_h > θ_l. The negatively sloped curves represent the locus of points along which there is no change over time in the unemployment rate, one for each of the two states. Because the curve for aggregate state i is defined by

v q(v/u_i) / (1 − u_i) = λF(R_i),

R_h < R_l implies that u_h < u_l for every v, as drawn in Figure 7. Finally, the two steady-state vacancy-unemployment pairs lie at the respective intersections of the appropriate curves, labeled L and H in the figure. Provided that the curve along which u̇ = 0 does not shift in too much when aggregate productivity increases, v*_h > v*_l as well as u*_h < u*_l. However, sufficient persistence, in the form of a low transition frequency, is necessary here. Indeed, the points L and H lie on a common ray when persistence is at the critical value η = η̄, since θ_l = θ_h by definition.
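The downward-sloping locus just described can be traced numerically: for each vacancy rate v, solve vq(v/u) = λF(R)(1 − u) for u. The Cobb-Douglas matching function, uniform F, and all parameters below are illustrative assumptions.

```python
# Sketch of the no-change (u-dot = 0) locus behind Figure 7: given a vacancy
# rate v, solve v*q(v/u) = lambda*F(R)*(1 - u) for u by bisection, with
# q(theta) = m*theta**(-alpha) and F uniform on [0, 1] as illustrative choices.

def beveridge_u(v, R, lam=0.1, m=0.7, alpha=0.5):
    lamF = lam * min(max(R, 0.0), 1.0)
    def excess(u):
        # hires minus separations; increasing in u, so bisection applies
        return m * v ** (1.0 - alpha) * u ** alpha - lamF * (1.0 - u)
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The locus slopes down, and a lower reservation product shifts it inward:
u_at_low_v = beveridge_u(v=0.05, R=0.6)
u_at_high_v = beveridge_u(v=0.10, R=0.6)
u_lower_R = beveridge_u(v=0.05, R=0.4)
```

The two comparisons mirror the figure: moving up a locus lowers u, and the curve for the lower reservation product (state h) lies everywhere to the left.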
3.3. Job creation and job destruction flows

In our simple model, the notion of a job is equivalent to that of an establishment, plant, or firm, given the linear technology assumption. Consequently, the job creation flow, the employment changes summed across all new and expanding plants over a given period of observation, can be associated with the flow of new matches in the model. Analogously, job destruction, the absolute sum of employment reductions across contracting and dying establishments, is equal to all matches that either experience an idiosyncratic shock that falls below the reservation threshold or were above the
threshold last period but are below it this period. The fact that market tightness and the reservation product move in opposite directions in response to an aggregate productivity shock implies negative co-movements in the two series, as observed. Furthermore, a negative productivity shock induces immediate job destruction while a positive shock results in new job creation only with a lag. This property of the model is consistent with the fact that job destruction "spikes" are observed in the job destruction series for US manufacturing which are not matched by job creation "spurts" 13. As in the OECD data, cyclical job destruction at the onset of recession is completed faster than cyclical job creation at the onset of a boom.
3.4. Quits and worker flows

As the model is constructed so far, aggregate hires are equivalent to job creation and separations equal job destruction. These identities no longer hold when some employed workers quit to take other jobs without intervening unemployment spells. As these so-called job-to-job flows constitute a significant component of both hires and separations, are procyclical, and represent a worker reallocation process across jobs, their incorporation in the model represents an important extension.

Job-to-job worker flows can be viewed as the outcome of a decision by some workers to search for vacancies while employed, as in Mortensen (1994b). Given that θq(θ) represents the rate at which employed as well as unemployed workers find a vacant job, the quit flow representing job-to-job movement in aggregate state i ∈ {l, h} is

Q_i = θ_i q(θ_i)(1 − u_i) s_i,

where s_i is the fraction of the employed who search and θ_i is now the ratio of vacancies to searching workers, i.e.,

θ_i = v_i / [u_i + s_i(1 − u_i)].
Once employed, workers have an incentive to move from lower to higher paying jobs. Suppose that employed workers can search only at an extra cost, a, interpreted as forgone leisure, a reduction in b. As search is jointly optimal for the pair if and only if the expected return, equal to the product of the job-finding rate and the gain in
13 These points are discussed in more detail in Mortensen and Pissarides (1994) and Mortensen (1994b).
match surplus realized, exceeds the cost, all workers employed at x equal to or less than some critical value, denoted Q_i, will search, where

θ_i q(θ_i)[S_i(1) − S_i(Q_i)] = a,  i ∈ {l, h}.   (3.5)
Because idiosyncratic productivity is distributed F(x) − F(R) across jobs, it follows that the fraction of the employed workers who search in aggregate state i is given by

s_i = F(Q_i) − F(R_i).   (3.6)
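As an illustration of Equations (3.5) and (3.6), suppose the continuing surplus schedule is linear in x and F is uniform on [0, 1]; both the functional forms and the parameter values below are purely hypothetical.

```python
# Sketch: on-the-job search threshold from Equation (3.5) and searching share
# from (3.6), assuming a linear surplus schedule S(x) = s0 + s1*x and F uniform
# on [0, 1]. The surplus slope s1 and every other number here are hypothetical.

def search_threshold(theta, a, s1, m=0.7, alpha=0.5):
    """Solve theta*q(theta)*[S(1) - S(Q)] = a for Q, using S(1)-S(Q) = s1*(1-Q)."""
    finding_rate = theta * m * theta ** (-alpha)   # theta*q(theta)
    return 1.0 - a / (finding_rate * s1)

Q = search_threshold(theta=1.0, a=0.05, s1=0.5)
s = Q - 0.4                            # F(Q) - F(R) with R = 0.4, F uniform
quit_flow = 0.7 * (1.0 - 0.08) * s     # theta*q(theta)*(1 - u)*s with u = 0.08
```

A higher search cost a lowers Q, and hence the searching share and the quit flow; a tighter market raises all three.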
Because a quit represents an employment transition for the worker and the loss of a filled job for the employer, the surplus value equation under joint wealth maximization is
rS_i(x) = p_i x − a − r(U_i + V_i − p_i T) + λ ∫_{R_i}^1 [S_i(z) − S_i(x)] dF(z)
   + η[S_j(x) − S_i(x)] + θ_i q(θ_i)[S_i(1) − S_i(x)],  ∀ x < Q_i.   (3.7)
Because the worker does not search when x ≥ Q_i, and this condition always holds when x = 1, Equations (3.1) continue to hold in this range. To the extent that market tightness is procyclical, Equation (3.5) implies Q_h > Q_l. Hence, the quit flow is procyclical for two separate reasons. First, because Q is higher and R is lower in the high aggregate productivity state, the fraction of employed workers who search is procyclical, i.e., s_h > s_l. Second, because θ_h > θ_l when the aggregate shock is sufficiently persistent, the rate at which searching workers meet vacancies, θq(θ), is also larger in the high aggregate product state.

Worker reallocation across different activities is represented both by the direct movement from one job to another via quits and by movements through unemployment induced by job destruction and subsequent new job creation. Davis, Haltiwanger and Schuh (1996) estimate that between 30% and 50% of worker reallocation is attributable to the job destruction and creation process. Given the procyclicality of the quit flow and the flow of hires, the sum of job creation and quits is highly procyclical, while the separation flow, the sum of job destruction and quits, is acyclical. Hence, the reallocation of workers across activities is procyclical relative to the more countercyclical reallocation of jobs across activities, both in fact and according to the model.

The quit process also interacts with job creation and job destruction in more complicated ways that are not explicitly modeled here. For example, when a worker

14 Although the decision to maximize the sum of the pair's expected future discounted income by the appropriate choice of the worker's search effort is individually rational under an appropriate contract, both costless monitoring and enforcement of the contract are generally necessary to overcome problems of dynamic inconsistency. Indeed, otherwise the worker will search if and only if the personal gain exceeds the cost, i.e., iff W_i(1) − W_i(x) = β[S_i(1) − S_i(x)] > a, which would imply too few quits.
quits an existing job to take a new one, the employer can choose to search for a replacement. If the decision is not to replace the worker, the quit has induced the destruction of a job with no net change in either the number of jobs or unemployment. If the decision is to declare the job vacant, a new job was created by the original match but there will be no net reduction in unemployment unless the old job vacated is filled by an unemployed worker. Of course, if it is filled by an employed worker, the employer left by that worker must decide whether or not to seek a replacement. This sequential replacement process, by which a new vacancy leads to an eventual hire from the unemployment pool, known in the literature as a vacancy chain, propagates the effects of job creation shocks on unemployment [see Contini and Revelli (1997) and Akerlof, Rose and Yellen (1998)]. Also, quit rates are high in the first several months after the formation of new matches and then decline significantly with match tenure, presumably as a consequence of learning about the initially unknown "quality" of the fit between worker and job 15. This source of quits is of significant magnitude and it represents the primary form of quits to unemployment. Because this "job shopping" process implies that an unemployed worker typically tries out a sequence of jobs before finding satisfaction, a job destruction shock is likely to be followed by a drawn-out period of higher than normal flow into and out of unemployment 16. Were the job shopping process incorporated in the model, job reallocation shock effects on worker flows would be prolonged and amplified, features that should also improve the model's fit to the data.
4. Explaining the data
Besides the attempts to use the models that we have described to match the stylized facts of job and worker flows 17, there have recently been some attempts to calibrate stochastic versions of the models to explain the cyclical behavior of the US economy. These attempts are partly motivated by the emergence of the new data on job flows that need to be explained and partly by the apparent failure of competitive labor market models to match the business cycle facts in the data. In order to explain the business cycle facts the models need to be extended to include capital, an exercise that has attracted some attention recently 18.

15 There is an extensive labor economics literature on this point initiated by the seminal theoretical development by Jovanovic (1979). See Farber (1994) for a recent analysis of the micro-data evidence on tenure effects on quit rates and the extent to which these are explained by the job shopping hypothesis. Pissarides (1994) explains these facts within a search model with learning on the job.
16 Hall (1995) argues that this effect is apparent in the lag relationships between the time series aggregates.
17 For attempts to estimate structural forms of the matching model see Pissarides (1986) and Yashiv (1997).
18 When used to calibrate the business cycle facts the models are often offered as alternatives and compared with Hansen's (1985) indivisible labor model.
4.1. Explaining job flows data
Cole and Rogerson (1996) conduct an analysis of the extent to which the rudimentary Mortensen-Pissarides model can explain characteristics of the time series observations on employment and job flows in US manufacturing. For this purpose, they construct the following stylized approximation to the continuous-time formulation sketched above: Job creation in period t, c_t, is equal to the matches that form during the observation period and survive to its end. As one can ignore the possibility that a job is both created and destroyed when the observation period is sufficiently short, approximate job creation in period t is

c_t = a_{s_t}(1 − n_{t−1}),  with  a_i = 1 − e^{−θ_i q(θ_i)},   (4.1)

where n_{t−1} = 1 − u_{t−1} is employment at the beginning of the period, 1 − a_i is the probability that the representative worker who is unemployed at the beginning of the period is not matched with a job during the period given that aggregate state i prevails, θ_i q(θ_i) is the aggregate state contingent unemployment hazard rate, and s_t ∈ {l, h} is the aggregate state that prevails during period t.

Job destruction in period t has two components, as already noted. First, the fraction of filled jobs that experience a shock less than the prevailing reservation threshold, which equals 1 − e^{−λF(R_i)} given that aggregate state i prevails, are destroyed. Second, the fraction of existing jobs that do not experience a shock but have match productivity less than the current reservation threshold are also destroyed. The latter is G_{t−1}(R_t), where G_{t−1}(x) is the fraction of jobs at the beginning of the period that have match productivity less than or equal to x. Although this distribution of jobs over productivity is not stationary but instead evolves in response to the history of aggregate shocks, between shock arrivals it converges toward an aggregate state contingent distribution equal to 0 for all x ≤ R_i, obviously, and [F(x) − F(R_i)]/[1 − F(R_i)] for all values of R_i < x < 1.
Given sufficient persistence in the aggregate shock (i.e., η small enough), Cole and Rogerson argue that these steady-state distributions can be used to approximate G_{t−1}. Because R_h < R_l implies that job destruction attributable to a change in the aggregate state only occurs when the transition is from high to low productivity, the following characterization of the job destruction flow holds as an approximation:
d_t = (D_{s_t} + φ_t D_0) n_{t−1},   (4.2)

where D_i = 1 − e^{−λF(R_i)}, φ_t = 1 if s_{t−1} = h and s_t = l and φ_t = 0 otherwise, D_0 = [F(R_l) − F(R_h)]/[1 − F(R_h)], and

π_{lh} = π_{hl} = π = 1 − e^{−η}

is the probability of an aggregate state transition. Finally, the aggregate employment process {n_t} is generated by the following stochastic difference equation defined by the employment adjustment identity
n_{t+1} = n_t + c_{t+1} − d_{t+1} = a_{s_{t+1}} + (1 − a_{s_{t+1}} − D_{s_{t+1}} − φ_{t+1} D_0) n_t
given the Markov forcing process {s_t} defined on the state space {l, h} and characterized by the symmetric probability of transition π.

Obviously, the employment, job creation, and job destruction processes are interrelated and fully characterized by the set of reduced-form parameters {a_l, a_h, D_l, D_h, D_0, π}. The question asked by Cole and Rogerson (1996) is whether an appropriate choice of these parameters will replicate the salient features of the Davis-Haltiwanger-Schuh observations, which they summarize in the following useful way:
(1) Volatility: Job creation is roughly four times as volatile as employment, and job destruction is more than six times as volatile.
(2) Persistence: The series for job creation, job destruction and employment display strong positive autocorrelation, but the autocorrelation for employment, which is 0.9, is nearly twice that for the other two series.
(3) Contemporaneous correlations: Creation and destruction have a fairly large negative correlation. Destruction is (weakly) negatively correlated with employment, whereas creation is virtually uncorrelated with employment.
(4) Dynamics: Creation is negatively correlated with lagged employment, and positively correlated with future employment. The opposite pattern holds for destruction.
To answer their question, Cole and Rogerson simulate the model above for trial parameter values, compute the associated simulation statistics, and then adjust the parameter values to obtain a better match. They conclude that the model can replicate the observations in their sense when the probability of finding a job is not too large. Specifically, the model simulation for the parameter set
{a_l, a_h, D_l, D_h, D_0, π} = {0.21, 0.30, 0.069, 0.044, 0.01, 0.2}
generates their preferred results, which are not only consistent with their qualitative characterization of the data but are quite close in quantitative terms as well. Given that the two job destruction rates D_l and D_h are set to match the average of 0.055 reported in the Davis-Haltiwanger-Schuh data, one potential problem, which Cole and Rogerson emphasize and discuss, is the low values of the probabilities of finding employment. To see the significance of the point, simply note that the two state-contingent steady-state unemployment rates associated with this parameter set are
u*_l = D_l / (D_l + a_l) = 0.25  and  u*_h = D_h / (D_h + a_h) = 0.13,

two numbers that yield an average unemployment rate of 19%. Nonetheless, the authors argue that these numbers are reasonable given the following observations reported by Blanchard and Diamond (1990): First, although the simple model ignores nonparticipants, in fact the flow to employment from this stock is roughly equal to the flow from those officially categorized as unemployed. Second, the number of workers classified as out-of-the-labor-force who report they want jobs is also roughly equal to
the number classified as unemployed. Including these individuals in the pool of the unemployed would rationalize the low average value of a, especially if these workers search at lower intensities.
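The reduced-form system is easy to simulate directly. The sketch below uses the Cole-Rogerson parameter values quoted above and treats the extra destruction D_0 as occurring only on a high-to-low transition, as in the approximation in the text; the initial condition and horizon are arbitrary.

```python
# Sketch: simulate the Cole-Rogerson reduced form with their reported
# parameters. A two-state Markov chain for s_t drives job creation (4.1),
# job destruction (4.2), and employment via the adjustment identity.
import random

a = {'l': 0.21, 'h': 0.30}      # job-finding probabilities a_l, a_h
D = {'l': 0.069, 'h': 0.044}    # destruction rates D_l, D_h
D0, pi = 0.01, 0.2              # extra h->l destruction; transition probability

def simulate(T=100000, seed=1):
    rng = random.Random(seed)
    s, n = 'h', 0.9
    path = []
    for _ in range(T):
        s_next = ('l' if s == 'h' else 'h') if rng.random() < pi else s
        phi = 1.0 if (s == 'h' and s_next == 'l') else 0.0
        c = a[s_next] * (1.0 - n)           # job creation, Equation (4.1)
        d = (D[s_next] + phi * D0) * n      # job destruction, Equation (4.2)
        n, s = n + c - d, s_next
        path.append(n)
    return path

n_path = simulate()
mean_u = 1.0 - sum(n_path) / len(n_path)    # long-run average unemployment
```

With these parameters the simulated average unemployment rate comes out near the 19% figure discussed above, which is exactly the difficulty Cole and Rogerson address.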
4.2. Capital accumulation and shock propagation

Merz (1995) and Andolfatto (1996) each construct different but related syntheses of the neoclassical stochastic growth model and the Pissarides (1990) model of frictional unemployment. The contributions of these authors include a demonstration that the "technology shocks" responsible for business cycles in the real business cycle (RBC) model will also induce a negative correlation between vacancies and unemployment, the Beveridge curve, and a positive correlation between flows into and out of unemployment in a version of the model with a labor market characterized by a matching process. However, like the earlier simpler RBC models, the amended models fail to propagate productivity shocks in the manner suggested by the observed persistence in actual output growth rates.

Recently, den Haan, Ramey and Watson (1997) have constructed, calibrated, and simulated a synthesis of the Mortensen and Pissarides (1994) model of job creation and job destruction with the neoclassical stochastic capital accumulation model. As in the Merz and Andolfatto models, job creation is governed by a matching function whose inputs include vacancies and unemployed workers. In addition, a job destruction margin is introduced by supposing that existing job matches experience idiosyncratic productivity shocks orthogonal to the aggregate shock to match productivity, as described above. They find that the interaction between the household saving decision and the job destruction decision plays a key role in propagating aggregate productivity shocks. As a consequence, their synthesis provides an explanation for the observed autocorrelation in output growth rates as well as the correlation patterns observed in job flows with themselves and employment, those matched by Cole and Rogerson (1996). Den Haan et al. (1997) explicitly formulated the model in discrete time with each period equal to one quarter.
Following Merz (1995) and Andolfatto (1996), idiosyncratic variation in labor income attributable to unemployment spells is fully insured through income pooling. Hence, the existence of a representative household can be invoked, one assumed to have additively separable preferences over future consumption streams represented by Σ_t γ^t u(C_t), where t is the time period index, γ is the time discount factor, and u(C) is one-period utility expressed as a concave function of consumption. A single consumable and durable asset, capital, exists which also serves as a productive input. The sequence of future market returns for holding the asset, denoted {r_t}, is an endogenous stochastic process. Hence, the optimal consumption plan must satisfy the usual Euler equation

u′(C_t) = γ E_t { u′(C_{t+1})(1 − D + r_{t+1}) },   (4.3)

where the expectation is taken with respect to information available in period t and D is the rate of physical capital depreciation.
The surplus value of a new match is another endogenous stochastic process, denoted {S^0_{t+1}}. When an unemployed worker and a job vacancy meet at the beginning of period t + 1, Nash bargaining takes place. The outcome allocates the share βS^0_{t+1} to the worker and the remainder (1 − β)S^0_{t+1} to the employer, where as above β represents worker market power. The anticipated bargaining outcome motivates search and recruiting effort by unemployed workers and employers with vacancies during period t. The flow return to unemployed search is the sum of home production while unemployed, b, and the expected gain attributable to finding a match:

b + θ_t q(θ_t) β E_t { [γ u′(C_{t+1}) / u′(C_t)] S^0_{t+1} }.   (4.4)
The expected capital gain, the second term, is the product of the probability of finding a job and the expected value of the worker's share of match surplus given information available in period t, appropriately discounted back to the present by a factor which accounts for any difference in the marginal utility of consumption between the next and the current period. Similarly, free entry of vacancies requires zero profit in the sense that the recruiting cost per vacancy posted, p_t c, equals the expected return, the product of the probability that the employer finds a match and the employer's share of its expected discounted surplus value:

p_t c = q(θ_t)(1 − β) E_t { [γ u′(C_{t+1}) / u′(C_t)] S^0_{t+1} }.   (4.5)
The aggregate productivity shock, the process {p1 }, is Markov with the transition probability kernel assumed to be common knowledge. For simplicity, den Haan et al. ( 1 997) assume that the match-specific process, represented by {x1}, is i.i.d. with 1 c.d.f. F(x) 9. Still, the idiosyncratic shock is expected to persist for the duration of the current period. The output of an existing match in period t is p1xtf(k1 ) where k1 is the amount of capital per worker rented during the period at rate r1, andf(k), normalized output per worker, is an increasing concave function. Because the option value of continuing the match is zero for the employer and equal to the flow value of search for the worker, b + f3p1c8/(1 - f3) from Equations (4.4) and (4.5), the joint match surplus conditional on idiosyncratic productivity x1 is
$$
S_t(x_t) = \max_k\{p_t x_t f(k) - r_t k\} - b - \frac{\beta p_t c\,\theta_t}{1-\beta} + E_t\left\{\frac{\gamma u'(C_{t+1})}{u'(C_t)}\,\max\{S_{t+1}(x_{t+1}), 0\}\right\},
\tag{4.6}
$$

where the last term reflects appropriate discounting of next-period surplus and the option to destroy the match next period if need be.

19 Otherwise, the distribution of idiosyncratic productivity across existing matches is a decision-relevant state variable. They claim that the model loses no essential property as a consequence of this abstraction.
Ch. 18: Job Reallocation, Employment Fluctuations and Unemployment
By implication of the optimal capital choice, the current-period demand for rented capital by an existing match characterized by idiosyncratic productivity x_t = x is

$$
k_t^*(x) = \begin{cases} d\!\left(\dfrac{r_t}{x p_t}\right) & \text{if } x \geq R_t, \\[4pt] 0 & \text{otherwise,} \end{cases}
\tag{4.7}
$$

where d ≡ (f')^{−1} is a decreasing function and R_t is the reservation value of the idiosyncratic shock. Obviously, the representation reflects the fact that an existing job-worker match is destroyed and no capital is rented if an idiosyncratic shock is realized below the reservation value. The capital rental rate r_t is determined by the capital market clearing condition, which can be written as

$$
K_t = \left[\int_{R_t}^{1} d\!\left(\frac{r_t}{x p_t}\right) dF(x)\right] N_t,
\tag{4.8}
$$
where (K_t, N_t) is the given aggregate capital stock and employment pair as of the beginning of period t. As the current reservation value R_t solves S_t(R_t) = 0, Equation (4.6) implies

$$
\max_k\{p_t R_t f(k) - r_t k\} + E_t\left\{\frac{\gamma u'(C_{t+1})}{u'(C_t)}\,\max\{S_{t+1}(x_{t+1}), 0\}\right\} = b + \frac{\beta p_t c\,\theta_t}{1-\beta}.
\tag{4.9}
$$
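To make the capital-market block concrete, the sketch below solves the clearing condition (4.8) for the rental rate by bisection. The functional forms are illustrative assumptions, not taken from the chapter: f(k) = k^α, so the capital demand function is d(z) = (α/z)^{1/(1−α)}, and the idiosyncratic shock is uniform on [0, 1]; all parameter values are hypothetical.

```python
# Sketch of the capital-market block, Equations (4.7)-(4.8), under assumed
# functional forms (not from the chapter): f(k) = k**alpha gives capital
# demand d(z) = (alpha/z)**(1/(1-alpha)), and x ~ Uniform[0, 1] so that
# dF(x) = dx on [R, 1].

alpha = 0.3  # hypothetical capital share

def d(z):
    """Capital demand per worker: solves x*p*f'(k) = r, with z = r/(x*p)."""
    return (alpha / z) ** (1.0 / (1.0 - alpha))

def capital_demand(r, p, R, N, n_grid=2000):
    """Right-hand side of (4.8): N times the integral of d(r/(x*p)) over [R, 1]."""
    dx = (1.0 - R) / n_grid
    return N * dx * sum(d(r / ((R + (i + 0.5) * dx) * p)) for i in range(n_grid))

def clearing_rate(K, N, p, R, lo=1e-6, hi=10.0):
    """Bisect for the rental rate r_t that clears (4.8); demand falls in r."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if capital_demand(mid, p, R, N) > K:
            lo = mid  # excess demand for capital: raise the rental rate
        else:
            hi = mid
    return 0.5 * (lo + hi)

r_star = clearing_rate(K=1.0, N=0.9, p=1.0, R=0.4)
assert abs(capital_demand(r_star, 1.0, 0.4, 0.9) - 1.0) < 1e-6
```

Raising R_t (more job destruction) lowers the demand integral, so the clearing rental rate falls on impact — the mechanism stressed later in the discussion of negative aggregate shocks.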
Given that x_t ∼ F(x), it follows that the expected ex ante match surplus conditional on knowledge of (p_t, R_t) is

$$
\int \max\{S_t(x), 0\}\,dF(x) = \int_{R_t}^{1} \left\{\max_k\{p_t x f(k) - r_t k\} - \max_k\{p_t R_t f(k) - r_t k\}\right\} dF(x)
\tag{4.10}
$$

by Equation (4.6). The fact that x_{t+1} ∼ F(x) as well, together with Equations (4.9) and (4.10), implies

$$
\max_k\{p_t R_t f(k) - r_t k\} = b + \frac{\beta p_t c\,\theta_t}{1-\beta} - E_t\left\{\frac{\gamma u'(C_{t+1})}{u'(C_t)} \int_{R_{t+1}}^{1} \left(\max_k\{p_{t+1} x f(k) - r_{t+1} k\} - \max_k\{p_{t+1} R_{t+1} f(k) - r_{t+1} k\}\right) dF(x)\right\}.
\tag{4.11}
$$
Finally, because x = 1 for a new match, S^0_t = S_t(1). Hence, Equations (4.6) and (4.10) imply that Equation (4.5) can be written as

$$
p_t c = q(\theta_t)(1-\beta)\,E_t\left\{\frac{\gamma u'(C_{t+1})}{u'(C_t)} \left(\max_k\{p_{t+1} f(k) - r_{t+1} k\} - \max_k\{p_{t+1} R_{t+1} f(k) - r_{t+1} k\}\right)\right\}.
\tag{4.12}
$$
D.T. Mortensen and C.A. Pissarides
Note that Equations (4.11) and (4.12) are generalizations of the job destruction and job creation conditions. Indeed, in the non-stochastic case with linear utility and no capital, these equations are equivalent to Equations (2.26) and (2.22), since Equation (4.3) implies γ = 1/(1 + r_t) for all t, and the discrete-time specification together with the assumption that the idiosyncratic shock persists for one period implies that the duration of any shock is unity, i.e., λ = 1. However, a complete characterization of general equilibrium also requires that the equilibrium conditions of the neoclassical stochastic growth model, Equations (4.3) and (4.8), and the laws of motion hold. The laws of motion for capital and employment are

$$
K_{t+1} = (1-\delta)K_t + p_t\left[\int_{R_t}^{1} x f\!\left(d\!\left(\frac{r_t}{x p_t}\right)\right) dF(x)\right] N_t - c\,p_t\,\theta_t(1-N_t) - C_t
\tag{4.13}
$$

and

$$
N_{t+1} = [1 - F(R_t)]\,N_t + \theta_t q(\theta_t)(1-N_t),
\tag{4.14}
$$

respectively 20. The first equation reflects the effects of job destruction and capital demand decisions made at the beginning of the period on output and the consumption decision, while the second reflects the outcomes of current-period job creation and destruction decisions. As the information-relevant state of the economy is a triple composed of the capital stock, the employment level, and the aggregate shock, a dynamic stationary general equilibrium is a vector function that maps the state variable triple (N, K, p) to the four endogenous variables (C, r, R, θ); one that solves the Euler equation (4.3), the capital market clearing condition (4.8), the job destruction condition (4.11), and the job creation condition (4.12) under the laws of motion (4.13) and (4.14).

Den Haan et al. (1997) derive the properties of the equilibrium by solving and simulating a particular parameterization of the model numerically. The qualitative properties they report are intuitively suggested by the known implications of the two models married in this synthesis. For example, a positive aggregate shock stimulates current investment in both job creation and physical capital, which augments employment and productive capacity in the next period. In the short run, these investments must be financed with an output increase induced by a lower than normal reservation productivity choice and by a reduction in consumption. However, because of the consumption smoothing motive, the limited ability to increase output by increasing utilization through reductions in job destruction, and the complementarity of physical capital and labor, more investment of both types is made in subsequent periods as well, i.e., the shock is propagated.

20 Following the literature, home production b cannot be used to create capital by assumption. It is simply consumed.
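The mechanics of the employment law of motion (4.14), N_{t+1} = [1 − F(R_t)]N_t + θ_t q(θ_t)(1 − N_t), can be sketched by holding the policy variables fixed. In the model they are equilibrium functions of the state (N, K, p), computed numerically by den Haan et al. (1997); the constants and functional forms below are therefore purely hypothetical.

```python
# Sketch: iterating the employment law of motion (4.14) with the policy
# pair (theta, R) frozen at hypothetical values. The matching function
# q(theta) = min(1, A*theta**(-eta)) and Uniform[0, 1] shocks are assumptions.

def q(theta, A=0.5, eta=0.5):
    """Probability a vacancy is filled (assumed Cobb-Douglas matching)."""
    return min(1.0, A * theta ** (-eta))

def F(x):
    """Assumed Uniform[0, 1] c.d.f. of the idiosyncratic shock."""
    return min(max(x, 0.0), 1.0)

def step(N, theta=1.0, R=0.4):
    survivors = (1.0 - F(R)) * N              # matches drawing x >= R continue
    new_hires = theta * q(theta) * (1.0 - N)  # unemployed who met a vacancy
    return survivors + new_hires

N = 0.5
for _ in range(200):
    N = step(N)

# With constant policies the map is a contraction; its fixed point is
#   N* = theta*q(theta) / (F(R) + theta*q(theta)).
assert abs(N - 0.5 / (0.4 + 0.5)) < 1e-12
```

A higher reservation productivity R raises F(R) and so lowers steady-state employment, while higher tightness θ raises it — the two margins whose interaction the simulated model propagates over time.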
A negative shock has an immediate and sharp negative effect on output along the job destruction margin. Although the effect is cushioned by the reallocation of existing capital to those jobs that continue, rental rates fall on impact in response to the decrease in demand for capital induced by job destruction, and will be expected to fall further in the future as a consequence of the persistence in the shock. The result is a reduction in capital formation and job creation which has the effect of reducing output further in the future. Again the consumption smoothing motive, interacting with the job creation and destruction process, propagates the shock into the future. As a consequence of the adjustment mechanisms described above, the simulated model implies strong first- and second-order autocorrelation in output growth rates, substantial persistence in the response of physical capital to negative productivity shocks, and a substantial magnification of the effects of productivity shocks on aggregate output. Neither the RBC model nor the augmented model featuring job matching but exogenous job destruction, like those of Merz (1995) and Andolfatto (1996), explains these features of the aggregate time-series data. As in Cole and Rogerson's (1996) reduced-form analysis of the Mortensen and Pissarides job creation and destruction model, the calibrated version of the extended model studied by den Haan et al. (1997) also reproduces all the job flow time series stylized facts.
5. Technological progress and job reallocation
Search and matching models have been used to address the old "Luddite" question of the influence of technological progress on job flows and unemployment levels. The common view is that new technology destroys jobs. Of course, innovations also generate new job creation. But the resulting reallocation of workers from the old to new jobs may require an intervening unemployment spell. In this section, we explore the relation between the exogenous rate of technological progress and steady-state employment. The analysis that follows suggests that the extent to which technical progress is "embodied" is critical. The distinction between embodied and disembodied technology is Solow's. In his original growth model [Solow (1956)], any improvement in technology instantaneously affected the productivity of all factors of production currently employed. But later he introduced the vintage model of embodied technical change, in which productive improvement is a property of new capital investment only [Solow (1959)]. In the latter case, to capture the productivity benefits of technical change, older capital vintages must be replaced with the most recent equipment. Our analysis begins by making the original assumption of disembodied technology. We show that if the rate of interest is independent of the rate of technological progress, faster technological progress leads to more job creation in the steady state. The dominant effect in this case is one of "capitalization". Because the costs of job creation are paid initially, faster technological progress implies a lower effective discount rate on future profits, leading to a higher present discounted value for profits [see Pissarides (1990), Chapter 2]. The effect of faster growth on job destruction is, however, of indeterminate sign. We then consider the vintage model in the sense that "new capital" is assumed to be embodied only in newly created jobs. We show that, under the assumption that the same worker cannot be moved from an old job to a new one without intervening unemployment, steady-state unemployment is higher at faster rates of technological progress [as in Aghion and Howitt (1994)] 21.
5.1. Disembodied technology

Let p(t) represent the aggregate productivity parameter, but now expressed as a function of time t. We assume that the rate of technological progress g is constant, exogenous, and less than the rate of time discount, i.e.,

$$
\frac{\dot p}{p} = g < r.
\tag{5.1}
$$

We treat r as a constant independent of g 22. The other restrictions made are the same as in the basic model of Section 2.1, with the additional assumption that unemployment income is also a function of time. We assume for simplicity that b(t) = bp(t). This assumption is needed to ensure the existence of a steady-state growth equilibrium and is plausible in a long-run equilibrium when p(t) is an aggregate productivity parameter 23.

21 Mortensen and Pissarides (1998) consider a more general case of adoption of the new technology at a cost and show that the two cases that we consider here are two limiting cases, the first approached when the adoption cost tends to zero and the second when the adoption cost tends to infinity. The main result of the paper is that there is a critical level of the adoption cost below which the dominant influences on job creation and job destruction are those described here under disembodied technology, and above which the dominant influences are those described under embodied technology. See also Aghion and Howitt (1998, Chapter 4) for more analysis of this issue.
22 Eriksson (1997) embeds the model in an optimizing (Ramsey) growth model and shows that the restriction that the effective discount rate decline with the rate of growth can be violated by feasible parameter values. He also considers the effects of growth on unemployment in an endogenous growth framework.
23 Making b(t) a proportional function of the equilibrium wage rate would not change the results.

The job creation and job destruction conditions of Section 2.1 change in an obvious way. Because all parameters in the value expressions (2.4), (2.5), (2.7) and (2.8) are multiplied by p(t), and the wage equation still satisfies either (2.20) or (2.23), there is an equilibrium where all value expressions grow at constant rate g. Intuitively, the firm that has a job with value J(x, t) at time t expects to make a capital gain of dJ(x, t)/dt = gJ(x, t) on it. The same holds true for the value of a job to the worker, W(x, t), and the value of unemployment, U(t), where the capital gain is, respectively, gW(x, t) and gU(t). But the value of a vacant job, V(t), because it is zero by the free entry condition, does not change. It is this asymmetry between V(t) on the one hand and the other asset values on the other that creates the capitalization effect of faster growth. We do not reproduce all the value expressions with growth but show instead the value of a continuing job to the firm, Equation (2.4):
$$
rJ(x,t) = p(t)x - w(x,t) + \lambda \int_{R}^{1} [J(z,t) - J(x,t)]\,dF(z) + \lambda F(R)\,[V(t) - p(t)T - J(x,t)] + \dot J(x,t).
\tag{5.2}
$$

The capital gain to the firm is shown as an addition to revenues from continuing the job. Replacing the capital gain by its steady-state value, we get

$$
(r-g)J(x,t) = p(t)x - w(x,t) + \lambda \int_{R}^{1} [J(z,t) - J(x,t)]\,dF(z) + \lambda F(R)\,[V(t) - p(t)T - J(x,t)].
\tag{5.3}
$$
The main result of the introduction of growth can be seen from Equation (5.3). Because all value expressions grow at the constant rate g, wages will also grow at the constant rate g, and so all time-dependent variables in Equation (5.3) can be written as proportional functions of p(t). Letting then J(x, t) = p(t)J(x), and using similar notation for the other time-dependent variables, we can re-write Equation (5.3) in the same form as Equation (2.4), except that the discount rate r is replaced by r − g. It is straightforward to work through the model of Section 2 with the assumption that all time-dependent variables are proportional functions of aggregate productivity and show that there is a solution for the job creation and job destruction flows that replicates the solution shown in Figure 6, but with r replaced by r − g. Hence, under the assumption that r − g falls monotonically in g, we find that faster disembodied technological progress increases market tightness θ but has ambiguous effects on the reservation productivity R. Therefore, faster growth increases job creation and decreases the duration of unemployment, but has ambiguous effects on job destruction and the incidence of unemployment in general. However, much of the literature on the effects of growth on unemployment concentrates on the obsolescence effects of new technology on job destruction (see the next section) and ignores the idiosyncratic reasons for job destruction. This assumption, also adopted in Pissarides (1990, Chapter 2), is justified in the long-run context by the fact that most variations in the job destruction rate in the data are high-frequency, with, at least in the European context where there have been substantial changes in the unemployment rate, a virtually constant job destruction flow across business cycles. This fact justifies a two-point {0, 1} restriction on the support of the distribution of idiosyncratic shocks. In this case, variations in R do not influence the job destruction rate, which is equal to λ, and so the effect of faster growth is to increase job creation and reduce unemployment.
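The capitalization effect can be illustrated with a hypothetical numerical example. A job destroyed at an exogenous rate λ yields flow profit growing at rate g, so its value discounted at r is π₀/(r + λ − g): faster disembodied growth acts exactly like the lower effective discount rate r − g noted above, raising the return to posting vacancies. All numbers below are illustrative.

```python
# Numerical sketch of the capitalization effect (all values hypothetical).
# Flow profit pi(t) = pi0 * exp(g*t); jobs die at rate lam; discount rate r.
# Present value: integral of pi0*exp(g*t)*exp(-(r+lam)*t) dt = pi0/(r + lam - g).

def job_value(pi0, r, lam, g):
    assert r + lam > g, "effective discount rate must be positive"
    return pi0 / (r + lam - g)

v_no_growth = job_value(pi0=1.0, r=0.05, lam=0.1, g=0.00)
v_growth = job_value(pi0=1.0, r=0.05, lam=0.1, g=0.02)
assert v_growth > v_no_growth  # higher g -> higher job value -> more job creation
```

With these numbers the job value rises from 1/0.15 ≈ 6.67 to 1/0.13 ≈ 7.69, a sizeable increase in the incentive to create jobs for a two-point rise in the growth rate.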
5.2. Adoption through "creative destruction"

New technology cannot always be adopted by existing jobs. Much public discussion and a large body of literature deal with the situation where the adoption of new technology requires the creation of new jobs with new capital equipment. This process of implementation is referred to in the literature as "creative destruction", because old jobs have to be destroyed to release the resources for the creation of new jobs [see Aghion and Howitt (1992, 1994), Grossman and Helpman (1991), and Caballero and Hammour (1994)]. In this section we assume that the process of creative destruction induces a transition of the worker to unemployment and search for a new job. We demonstrate that more rapid technological progress under these assumptions induces more labor reallocation, and so higher unemployment, because of both a lower job creation rate and a higher job destruction rate. In order to emphasize the new element of the model we abstract from idiosyncratic productivity shocks. Instead, heterogeneity in productivity arises because older jobs embody less productive technology, and a job is destroyed when the technology it embodies becomes obsolete. Given that current technological improvements affect only productivity in newly created jobs, we need to distinguish between the date at which a job is created, its vintage u, and the current date, denoted t. The expected present value of both future profit J and wage income W for a given job-worker match depends on the job's vintage and the current date. These value functions solve the following asset pricing equations:

$$
rJ(u,t) = p(u)x - w(u,t) - \delta J(u,t) + \dot J(u,t),
$$
$$
rW(u,t) = w(u,t) - \delta[W(u,t) - U(t)] + \dot W(u,t),
\tag{5.4}
$$
where x represents job match productivity, w(u, t) is the wage paid on a job of vintage u at date t, δ > 0 represents an exogenous job separation rate, and U(t) is the value of unemployed search at t. The fixed cost of investment in a new job, denoted p(t)C, is incurred when the match forms. The investment is specific to a job, i.e., it is "irreversible" with no outside option value once the match forms. The recruiting costs, p(t)c, are modelled as a cost per vacancy posted. New vacancies enter at every date until market tightness is such that the value of creating a vacancy, V(t), is zero, i.e.,

$$
rV(t) = q(\theta)\,[J(t,t) - p(t)C] - c\,p(t) = 0,
\tag{5.5}
$$

where q(θ) is the rate at which vacancies are filled. Similarly, the value of unemployment solves the asset pricing equation

$$
rU(t) = p(t)b + \theta q(\theta)\,[W(t,t) - U(t)] + \dot U(t),
\tag{5.6}
$$
where p(t)b represents the opportunity cost of employment and where θq(θ) is the rate at which workers find vacancies. As before, recruiting costs, the investment required to
create a match, and the opportunity cost of employment grow at rate g by assumption, to assure the existence of a balanced growth path equilibrium solution to the model. We assume that the wage bargain divides the surplus value of a continuing match in fixed proportion, i.e.,

$$
\beta J(u,t) = (1-\beta)\,[W(u,t) - U(t)],
\tag{5.7}
$$

where β represents the worker's share 24. Because Equations (5.4) and (5.6) imply

$$
(r+\delta)J(u,t) = p(u)x - w(u,t) + \dot J(u,t),
$$
$$
(r+\delta)[W(u,t) - U(t)] = w(u,t) - rU(t) + \dot W(u,t),
$$

the wage contract that supports the assumed bargaining outcome (5.7) is

$$
w(u,t) = \beta p(u)x + p(t)\left((1-\beta)b + \beta\,(c\theta + \theta q(\theta) C)\right)
\tag{5.8}
$$

by virtue of the free entry condition (5.5). The first term on the right reflects the worker's productivity, while the second captures the worker's option value outside the firm. Because the latter grows at the rate of technological progress but the former is stationary, every job becomes obsolete eventually. By substituting from the wage equation into the first of Equations (5.4), we obtain

$$
(r+\delta)J(u,t) = (1-\beta)p(u)x - p(t)\left((1-\beta)b + \beta\,(c\theta + \theta q(\theta) C)\right) + \dot J(u,t).
\tag{5.9}
$$
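The step from the sharing rule (5.7) to the wage equation (5.8) can be spelled out. Free entry (5.5) gives J(t, t) = p(t)(C + c/q(θ)), so the worker's surplus from a new match is [β/(1 − β)]J(t, t), and Equation (5.6) yields a flow value of unemployment

```latex
rU(t) - \dot U(t)
  = p(t)b + \theta q(\theta)\,\frac{\beta}{1-\beta}\,J(t,t)
  = p(t)\left[b + \frac{\beta}{1-\beta}\,\bigl(c\theta + \theta q(\theta) C\bigr)\right].
```

Substituting this into the two asset-difference equations above, and noting that the sharing rule holds at every date so that βJ̇ = (1 − β)(Ẇ − U̇), gives w(u, t) = βp(u)x + (1 − β)[rU(t) − U̇(t)], which is Equation (5.8).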
Indeed, Equation (5.9) holds only for t − u ≤ τ, where τ is the optimal economic life span of a job. The employer's choice of a job's economic life maximizes its value, i.e.,

$$
J(u,t) = \max_{\tau}\left\{\int_t^{\tau+u} \left[(1-\beta)p(u)x - p(s)\left((1-\beta)b + \beta\,(c\theta + \theta q(\theta) C)\right)\right] e^{-(r+\delta)(s-t)}\, ds\right\}.
\tag{5.10}
$$
The maximal value of a new job at time t is the special solution to this equation satisfying the balanced growth equation J(t, t) = J^0(θ)p(t), where, given the normalization p(0) = 1,

$$
J^0(\theta) = J(0,0) = \max_{\tau}\left\{\int_0^{\tau}\left[(1-\beta)x - e^{gs}\left((1-\beta)b + \beta\,(c\theta + \theta q(\theta) C)\right)\right] e^{-(r+\delta)s}\, ds\right\}.
\tag{5.11}
$$

24 Here workers do not share the cost of initial investment by accepting a lower starting wage for an initial period of employment, as assumed in Section 2. Instead, the initial wage is equal to the continuing wage at initial productivity. Although equilibrium market tightness will be too low relative to a social optimum initially, the qualitative behavior of the model under a jointly efficient wage bargain would be much the same. See Caballero and Hammour (1994, 1996) for more discussion of this issue.
The first-order condition for a positive optimal choice of the economic life of a job equates the stationary match product with the rising opportunity cost of continuing an existing match, i.e.,

$$
(1-\beta)x - \left[(1-\beta)b + \beta\,(c\theta + \theta q(\theta) C)\right] e^{g\tau} = 0.
\tag{5.12}
$$

Since J(t, t) = J^0(θ)p(t), the free entry condition (5.5) can be written as

$$
c = q(\theta)\left[J^0(\theta) - C\right].
\tag{5.13}
$$

A search equilibrium is characterized by any market tightness and age at job destruction pair (θ*, τ*) that solves Equations (5.12) and (5.13). Because the right-hand side of Equation (5.13) is strictly decreasing in θ, equilibrium market tightness is unique. Of course, given θ*, the associated equilibrium value of the optimal job age at destruction, τ*, is the unique solution to Equation (5.12). Since Equations (5.12), (5.13) and (5.11) imply

$$
\frac{c}{q(\theta^*)} + C = J^0(\theta^*) = (1-\beta)x \int_0^{\tau^*}\left[1 - e^{g(s-\tau^*)}\right] e^{-(r+\delta)s}\, ds,
\tag{5.14}
$$
a necessary but hardly sufficient condition for the existence of a positive equilibrium pair (θ*, τ*) is that match productivity x exceed the opportunity cost of employment b. Indeed, given this condition, an economically meaningful equilibrium exists only if both recruiting and creation costs, c and C, are sufficiently small. Because the surplus value of a match decreases with the rate of technological progress, g, for every value of market tightness, by virtue of Equation (5.11) and the envelope theorem, namely
$$
\frac{\partial J^0}{\partial g} = -\int_0^{\tau^*}\left[s\,e^{gs}\left((1-\beta)b + \beta\left(c\theta^* + \theta^* q(\theta^*) C\right)\right)\right] e^{-(r+\delta)s}\, ds < 0,
$$
and because both the value of a job and the rate at which vacancies are filled decrease with market tightness, the free entry condition (5.13) implies that market tightness falls with the growth rate, i.e.,

$$
\frac{\partial \theta^*}{\partial g} = -\,\frac{q(\theta^*)^2\,\partial J^0/\partial g}{c\,q'(\theta^*) + q(\theta^*)^2\,\partial J^0/\partial \theta} < 0.
$$

Because the left-hand side of Equation (5.14) is decreasing in g and the right-hand side is increasing in both g and τ*, it follows that the economic life of a new job also falls with the rate of growth, i.e.,

$$
\frac{\partial \tau^*}{\partial g} = \frac{-\dfrac{c\,q'(\theta^*)}{q(\theta^*)^2}\,\dfrac{\partial \theta^*}{\partial g} - (1-\beta)x\, e^{-g\tau^*} \displaystyle\int_0^{\tau^*} (\tau^* - s)\, e^{-(r+\delta-g)s}\, ds}{(1-\beta)x\, g\, e^{-g\tau^*} \displaystyle\int_0^{\tau^*} e^{-(r+\delta-g)s}\, ds} < 0.
$$
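The comparative statics just derived can be checked numerically. The sketch below solves Equations (5.12) and (5.13) for (θ*, τ*) under an assumed Cobb-Douglas matching function q(θ) = Aθ^{−η}; all parameter values are hypothetical, chosen only so that a positive equilibrium exists (x > b, with c and C small). Both θ* and τ* fall as g rises.

```python
import math

# Sketch: computing the creative-destruction equilibrium (theta*, tau*) from
# Equations (5.12) and (5.13). The matching function q(theta) = A*theta**(-eta)
# and every parameter value below are hypothetical, chosen only so that a
# positive equilibrium exists (x > b, with recruiting and creation costs small).

A, eta = 0.6, 0.5
x, b = 1.0, 0.4
beta, r, delta = 0.5, 0.05, 0.1
c, C = 0.1, 0.2

def q(theta):
    return A * theta ** (-eta)

def omega(theta):
    """Rising opportunity-cost term (1-beta)b + beta(c*theta + theta*q(theta)*C)."""
    return (1 - beta) * b + beta * (c * theta + theta * q(theta) * C)

def tau_opt(theta, g):
    """Economic life of a job from the first-order condition (5.12)."""
    return math.log((1 - beta) * x / omega(theta)) / g

def J0(theta, g, n=4000):
    """Value of a new job, Equation (5.11), by midpoint-rule integration."""
    tau = tau_opt(theta, g)
    h, om = tau / n, omega(theta)
    return h * sum(((1 - beta) * x - math.exp(g * (i + 0.5) * h) * om)
                   * math.exp(-(r + delta) * (i + 0.5) * h) for i in range(n))

def theta_star(g, lo=1e-3, hi=3.0):
    """Bisect the free entry condition (5.13): c = q(theta)*(J0(theta) - C)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if q(mid) * (J0(mid, g) - C) > c:
            lo = mid  # entry still profitable: tightness rises
        else:
            hi = mid
    return 0.5 * (lo + hi)

th_slow, th_fast = theta_star(0.02), theta_star(0.03)
assert th_fast < th_slow                                 # theta* falls with g
assert tau_opt(th_fast, 0.03) < tau_opt(th_slow, 0.02)   # jobs die younger
```

Under this parameterization, faster embodied progress both deters vacancy creation (lower θ*) and shortens the economic life of a job (lower τ*), consistent with the signs of ∂θ*/∂g and ∂τ*/∂g derived above.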