T. Asada, T. Ishikawa (Eds.)
Time and Space in Economics With 48 Figures
Toichiro Asada, Ph.D. Professor, Faculty of Economics Chuo University 742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan Toshiharu Ishikawa, Ph.D. Professor, Faculty of Economics Chuo University 742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan
ISBN-10 4-431-45977-4 Springer Tokyo Berlin Heidelberg New York ISBN-13 978-4-431-45977-4 Springer Tokyo Berlin Heidelberg New York Library of Congress Control Number: 2006935369 This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. The use of registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Springer is a part of Springer Science+Business Media springer.com © Springer 2007 Printed in Japan Typesetting: SNP Best-set Typesetter Ltd., Hong Kong Printing and binding: Shinano Inc., Japan Printed on acid-free paper
Preface
This book is published in celebration of the 100th anniversary of the Faculty of Economics of Chuo University. The Faculty of Economics of Chuo University in Tokyo, Japan, was founded in 1905. Since the beginning of the twentieth century, it has helped educate many brilliant students who have had an influence throughout the world. This faculty has worked not only as a pathfinder in the field of economics but also as a compass that indicates the directions of various movements in our society. To mark the auspicious occasion of the 100th anniversary, the academic committee, organized by the faculty and assisted by overseas academic partners as well as staff of Chuo University, launched a small but important international conference to contribute to the development of economics. The conference aimed to enrich the respective disciplines of the economics of time (dynamic economics) and the economics of space (spatial economics) and to expand their applicability in the real world. The conference, the Chuo Meeting on Economics of Time and Space 2005 (Chuo METS 05), was held at Chuo University August 29–30, 2005. Most of the chapters in this book, all of which were refereed by the editors, are based on papers presented at that conference. It is the hope of the committee that this book will play a role in providing scholars and experts with new ideas with regard to time and space in economics, and that it will help build a foundation for developing a frontier of economics as we stand at the beginning of the twenty-first century. Part I, “Economics of Time: Keynesian Macrodynamics,” is a collection of five chapters on macroeconomic dynamics in the Keynesian tradition, which allows for the existence of involuntary unemployment. Chapter 1, “A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve” (R. 
Franke), presents an alternative microeconomic foundation to the new-Keynesian Phillips curve by using a model with heterogeneous and boundedly rational firms, and attempts to reconcile the rational expectation
hypothesis theoretically with the adaptive expectation hypothesis. Chapter 2, “Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model” (H. Yoshida), reconsiders the Harrodian instability problem by using a model of imperfect competition with putty-clay technology, and shows the existence of endogenous fluctuations by using the three-dimensional Hopf bifurcation theorem. Chapter 3, “Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation” (G. Gong), introduces product innovation and process innovation to the nonlinear discrete time version of the Harrodian dynamic model of economic growth, and investigates the complex dynamic behavior and occurrence of bifurcation in such a system. Chapter 4, “Tobin’s q and Investment in a Model with Multiple Steady States” (M. Kato, W. Semmler, and M. Ofori), considers the investment theory based on Tobin’s q under adjustment costs with size effect, and investigates the implication of the existence of discontinuous policy function with multiple steady states by means of the Hamilton-Jacobi-Bellman (HJB) method. Chapter 5, “Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach” (T. Asada), presents a high-dimensional Keynesian macrodynamic model with debt accumulation, which is designed to interpret the performance of the Japanese economy in the 1990s and the 2000s. Part II, “Economics of Time: Nonlinear Dynamics,” contains four chapters on the classical, neoclassical, and other traditions of nonlinear economic dynamics. The common feature of these chapters is the complexity that is due to the nonlinearity of the system. Chapter 6, “Instability Problems and Policy Issues in Perfectly Open Economies” (P. Flaschel), considers the classical model of a small open economy with perfectly flexible wages and prices to study the dynamic instability of the system and the policy rules that stabilize the system. 
Chapter 7, “Corridor Stability of the Neoclassical Steady State” (A. Dohtani, T. Inaba, and H. Osaka), introduces the permanent income hypothesis to the Solow–Swan-type neoclassical model of economic growth, and shows the existence of the corridor-stability property, which means that the path inside the corridor converges to the steady state while the path outside the corridor diverges. Chapter 8, “Time-Delayed Dynamic Model of Renewable Resource and Population” (A. Matsumoto, M. Suzuki, and Y. Saito), investigates a system of mixed differential and difference equations (delay differential equations) introducing time delays in production to the Ricardo–Malthus dynamic model of population and renewable resources, and provides a comparative dynamic analysis with respect to delay in production. Chapter 9, “A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics” (J. Minagawa), provides a new mathematical criterion for the occurrence of Hopf bifurcations in the general n-dimensional system of differential equations, and presents an example of the
application of this criterion to the three-dimensional dynamic macroeconomic model. Part III, “Economics of Space: Empirical Analysis,” selects four fields that have direct geographical relationships with economic activity in different ways. Chapter 10, “Public- and Private-School Competition: The Spatial Education Production Function” (D. Brasington), using spatial statistics instead of non-spatial ones, reveals the very interesting fact that poverty appears to depress reading and writing pass rates in non-spatial regressions, but this effect disappears in the spatial models. Chapter 11, “Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition” (J. Simonen and P. McCann), investigates the role played by geography in the promotion of innovation. One of the important suggestions in this chapter is that local, face-to-face contact is an essential feature of the innovation process. Chapter 12, “Taxation of Car Commuters’ Employer-Subsidized Parking” (R. Wall), deals with economic policy instruments to mitigate traffic jams in dense urban areas. It proposes a feasible instrument (taxation of car commuters’ free or partly employer-subsidized parking) and estimates its effectiveness in reducing rush-hour vehicle traffic into the inner city. Chapter 13, “Forest Protection System and Optimal Land-Use Management Policy in Japan” (M. Yabuta and Y. Yamanishi), considers Japanese forest policy and the protection forest system, which is based on the public benefit of forests, and proves that the environment policy of each prefecture in Japan has a positive effect on the choice of protection forests. Part IV, “Economics of Space: Theoretical Analysis,” analyzes the effects of spatial competition between economic organizations or agents on economic performance in a region. Chapter 14, “Railway Competition in a Park-and-Ride System” (T. Kuroda and K. 
Miyazawa), examines the scale effect of city size and the cost advantage of railways over automobiles in a park-and-ride commuter system. It presents reasons the railway sector should be subsidized to balance the budget for improved social configuration. Chapter 15, “An Analysis of the Relationship Between Manufacturer’s Profit and Spatial Economic Structure in the Retail Market” (T. Ishikawa), describes the circumference model and shows the influences of spatial competition among retailers on manufacturer’s profit and the retail market structure. Chapter 16, “Redistribution Through Local Competition” (F. Guy), develops a model with different kinds of shops and consumers, and suggests that to the extent that local shops serve a poorer clientele, a rise in prices at out-of-town shops has a progressive distributive effect. Toichiro Asada Toshiharu Ishikawa Editors
Acknowledgements We are grateful to Chuo University for its financial support with the publication of this book. The title of this volume is from the name of the book Economics of Space and Time (Springer-Verlag, Berlin, 1997) by T. Puu, a pioneer of spatial-dynamic economics.
Contents

Preface

I. Economics of Time: Keynesian Macrodynamics
1. A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve (Reiner Franke)
2. Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model (Hiroyuki Yoshida)
3. Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation (Gang Gong)
4. Tobin’s q and Investment in a Model with Multiple Steady States (Mika Kato, Willi Semmler, and Marvin Ofori)
5. Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach (Toichiro Asada)

II. Economics of Time: Nonlinear Dynamics
6. Instability Problems and Policy Issues in Perfectly Open Economies (Peter Flaschel)
7. Corridor Stability of the Neoclassical Steady State (Akitaka Dohtani, Toshio Inaba, and Hiroshi Osaka)
8. Time-Delayed Dynamic Model of Renewable Resource and Population (Akio Matsumoto, Mami Suzuki, and Yasuhisa Saito)
9. A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics (Junichi Minagawa)

III. Economics of Space: Empirical Analysis
10. Public- and Private-School Competition: The Spatial Education Production Function (David M. Brasington)
11. Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition (Jaakko Simonen and Philip McCann)
12. Taxation of Car Commuters’ Employer-Subsidized Parking (Rickard E. Wall)
13. Forest Protection System and Optimal Land-Use Management Policy in Japan (Masahiro Yabuta and Yasuhito Yamanishi)

IV. Economics of Space: Theoretical Analysis
14. Railway Competition in a Park-and-Ride System (Tatsuaki Kuroda and Kazutoshi Miyazawa)
15. An Analysis of the Relationship Between Manufacturer’s Profit and Spatial Economic Structure in the Retail Market (Toshiharu Ishikawa)
16. Redistribution Through Local Competition (Frederick Guy)

Subject Index
Part I Economics of Time: Keynesian Macrodynamics
1. A Sophisticatedly Simple Alternative to the New-Keynesian Phillips Curve
Reiner Franke
Institute for Monetary Economics, Technical University, Vienna, Austria

Summary. This chapter reconsiders the standard microfoundations of the new-Keynesian Phillips curve. It argues that with heterogeneous and boundedly rational firms, the expectational variable in the Phillips curve refers to a longer time horizon than the next period, and is updated in a gradual manner. However, firms are less naive than pure adaptive expectations would imply; they also take output data into account, as well as the credibility of the central bank. At the macroeconomic level, the ideas about such an adjustment process are specified in the concept of an “adaptive inflation climate.” Exploratory estimations of the process, in which the inflation climate is proxied by a survey measure of expectations, yield reasonable results that are supportive of this approach.

Key words. New-Keynesian Phillips curve, Bounded rationality, Adaptive inflation climate, Survey of Professional Forecasters

1 Introduction

As a theory of nominal rigidity, the new-Keynesian Phillips curve explicitly incorporates forward-looking elements in economic behavior, and accounts for the importance of expectations. Because it does this in an elegant way, it is widely used in teaching macroeconomics and in policy analysis, specifically in evaluating the properties of alternative policy rules. However, there is a growing awareness that the model is hard to square with the facts. In particular, the canonical sticky-price model allows for the possibility of disinflationary booms, whereas shocks to monetary policy are well known to have delayed and gradual effects on inflation, such that in the first stages a reduction of inflation is typically accompanied by output losses. Mankiw
(2001, p. C59) even concludes that the new-Keynesian Phillips curve “is appealing from a theoretical standpoint, but it is ultimately a failure.”

To give a larger role to inflation inertia, present versions of the new-Keynesian Phillips curve have often built in some backward-looking behavior. The rate of inflation expected for the next period, which shows up on the right-hand side of the Phillips curve relationship, is here generalized to a weighted average of expected inflation and inflation in the previous period(s). It also appeared that these hybrid expectations could be successfully estimated, and that the forward-looking component receives a significant weight; Galí and Gertler (1999) is only one example among many.[1]

In a profound econometric study, these positive results (i.e., their new-Keynesian interpretation) have recently been severely challenged by Rudd and Whelan (2001). Under conditions that are very likely to be met in practice, the paper first points out that the weight on the forward-looking expectations is biased upward. It can even be significant when the true model of the inflation dynamics is purely backward-looking.[2] This holds when, in the estimation equation, the realized rates of inflation are substituted for the expected values, as well as when expected inflation is proxied by a survey measure (cf. Rudd and Whelan 2001, p. 9, fn 12). While this kind of criticism might not apply to full-information estimation techniques, these estimates run the great risk of being inconsistent if any part of the model is misspecified, even if it does not belong to the theoretical core.

[1] In addition, we will indicate below that although this extension of the Phillips curve is straightforward, its microfoundations are presently less rigorous than in the baseline case with purely forward-looking expectations.
[2] The basic supposition involved is that a dynamic variable that belongs in the true model is erroneously omitted from the test specification.

The tests just mentioned are based on only the most elementary property of the model’s rational expectations hypothesis, according to which the period $t+1$ expectational error should be unforecastable by variables dated $t$ or earlier. Noting this, the authors subsequently turn to the stronger property that rational expectations should be model-consistent, that is, consistent with the process for inflation as described by the model. Given the forcing variable $x_t$ in the Phillips curve, a measure of economic activity or real marginal cost, this additional prediction yields testable implications for the influence that the infinite discounted sum of the expected future values of $x_t$ exerts on the current rate of inflation. In the corresponding reduced-form estimation, the coefficient on this present-value term, if it is at all significant, is extremely small (0.012 at the most) and absolutely dominated by the coefficients on lagged inflation (0.795 as their minimum sum). The authors also check the possibility that lags of inflation are given this large
weight because they already contain so much information about the future of the forcing variable. However, they find no statistically significant effect of this kind, which means that the significant role played by lags of inflation in reduced-form Phillips curves cannot be attributed to their proxying for rational expectations of future output gaps or real marginal costs. In sum, the new-Keynesian Phillips curve, in its baseline version as well as in its extension with hybrid expectations, “cannot serve as an adequate approximation to the empirical inflation process” (Rudd and Whelan 2001, p. 18).[3]

Apparently, some modification of the new-Keynesian sticky-price model is needed to explain the role played by lagged dependent variables in inflation regressions. The alternative usually put forward at this stage of the discussion is “adaptive expectations.”[4] Mankiw (2001, p. C59), for example, notes that this assumption is far from satisfying, and that it would be odd to assert that expectations about inflation are formed without incorporating the detailed news from the permanent media coverage of monetary policy. “Yet,” he continues, “the assumption of adaptive expectations is, in essence, what the data are crying out for.”

[3] On p. 17, the authors also mention that their results are perfectly consistent with the work of Fuhrer (1997), who arrived at a very similar conclusion by using a different methodology.
[4] While originally adaptive expectations was a dynamic process of partial adjustments, the expression is now often reduced to merely pointing out the influence of lagged dependent variables.

This chapter proposes an alternative formulation of a Phillips curve relationship whose expectations may be viewed as some compromise between rational expectations and adaptive expectations.[5] Our price-setting firms are not assumed to be rational in the sense of economic theory, but rational in the sense of using their limited capabilities consciously and intelligently. In contrast to rational expectations, the expectation formation process of these firms is modelled in an explicit way.

[5] Mankiw and Reis (2002) take a different route to obtain a dynamic response to monetary policy that resembles Phillips curves with lagged inflation rates. While their firms are still supposed to form expectations rationally, they do not do so in every period, but only infrequently. The model is thus a model of sticky information.

The stylized assumptions on the reasoning of human beings facing a complex decision problem seek to follow the famous KISS principle coined by Zellner (1992, 2002), “keep it sophisticatedly simple.” In forming expectations, the firms are at least more sophisticated than plain adaptive expectations of the inflation rate would suggest, and take additional variables into account. Since we want to remain within the bounds of the output–inflation nexus, this set of variables is limited to the central bank’s target rate of inflation, and the level and change of the output gap. The approach
is in the spirit of vector autoregressions, although for reasons of parsimony our specifications will be much simpler. We also add that by considering target inflation, there can be a role for the credibility of monetary policy, a feature that is lacking in the usual so-called backward-looking models.

An important point is the time horizon of expected rates of inflation. Like most of the new-Keynesian microfoundations, we employ the Calvo framework of time-contingent price setting, where the single firm, which does not know when it will next be “allowed” to reset its price, minimizes the expected losses from committing itself to a fixed price over that time interval. These losses depend not only on next period’s inflation, but on the entire path of future inflation, which the firm cannot possibly forecast and does not even try to.[6] Instead, in period $t$ a firm $f$ seeks to characterize the future time path by a single number $\pi_t^f$. This is the rate of inflation that the firm expects to prevail on average over the whole future, where the averaging procedure is based on Calvo’s probability weights for discounting. More accurately, the firm is aware that its estimation is more or less imprecise, but efforts to reduce the degree of imprecision appear too costly and too unreliable. So the firm contents itself with informed guesses about average future inflation. The guesses are predetermined in a given period, and updated between periods as new information arrives. Moreover, the expectations of firms are not homogeneous, so that the prices that are reset in a given period are dispersed.

[6] Without rational expectations, it is also of no use to the firm to reformulate the solution of its optimization problem as a one-period-ahead forward-looking equation for the reset price.

It is clear that the changes in the aggregate price level depend on the firm-specific rates $\pi_t^f$. In fact, the other components of the firms’ decision problem and, in particular, their expectations regarding future output are treated in such a way that in the end the determination of the economy’s rate of aggregate inflation looks very similar to the new-Keynesian Phillips curve. However, the expected inflation rate for the next period in the latter relationship is replaced with the aggregate of the firms’ guesses $\pi_t^f$. To accentuate this, we will from then on avoid the expression “expectations” and call this aggregate the general inflation climate in the economy.

Accepting the modelling philosophy, everything else hinges on the updating of the inflation climate, which will again be discussed at the macroeconomic level. Although the inflation climate is an unobservable variable, it may be proxied by survey measures of inflation expectations. From studies of their main characteristics, we infer the following guidelines for designing an adjustment process. (1) Firms do not incorporate innovations in inflation
and other variables instantaneously and to their full extent, but the inflation climate responds only partially to this news. (2) There are systematic over- and underpredictions in the surveys, which suggest a (partial) role for an adaptive expectations component. (3) On the other hand, the survey data exhibit mean-reverting tendencies; this motivates us to employ the central bank’s target rate of inflation as a second benchmark toward which the inflation climate gradually seeks to adjust (in addition to the current inflation benchmark). (4) Inflation surveys also account for information other than inflation, which, as already indicated, leads us to presume reactions of the inflation climate to the output gap as well as to its changes.

We understand these adjustments to be carried out by adaptive firms in an uncertain and constantly changing environment.[7] The concept of the inflation climate in combination with its updating process is therefore called the adaptive inflation climate, also referred to by the acronym AIC. This module is the theoretical core of this chapter. As a first check on whether the AIC concept can make economic sense, we proxy the inflation climate by a survey measure of inflation expectations. In this way, the adjustment process can easily be estimated. Although the coefficients are not perfectly robust, the results are reasonable. They are, in any case, sufficiently encouraging to try more elaborate methods in future research.

The remainder of the chapter is organized as follows. Section 2 reconsiders the optimization problem of price-setting firms with heterogeneous expectations, and presents our reinterpretation of the expectational variable in the Phillips curve. Section 3 develops the concept of the adaptive inflation climate. In particular, it makes explicit the parameter that reflects the credibility of monetary policy. The adjustment process is subsequently justified by the exploratory estimation just mentioned. Section 5 concludes.
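To make guidelines (1) to (4) concrete, the following is a minimal sketch of what a one-period update of the inflation climate could look like. The linear partial-adjustment form, the function name `update_climate`, and all coefficient values are assumptions made purely for illustration; the chapter specifies its own adjustment process only in Section 3, and nothing below should be read as that specification.

```python
def update_climate(climate, inflation, target, gap, gap_change,
                   a_pi=0.2, a_star=0.1, a_y=0.05, a_dy=0.05):
    """Return next period's inflation climate under an ASSUMED linear rule.

    All coefficients are hypothetical illustration values, not estimates.
    """
    return (climate
            # (1)+(2): only partial adjustment toward current inflation
            + a_pi * (inflation - climate)
            # (3): mean reversion toward the central bank's target rate
            + a_star * (target - climate)
            # (4): reactions to the level and the change of the output gap
            + a_y * gap
            + a_dy * gap_change)


# Usage: inflation jumps to 4% while the climate and the target sit at 2%.
new_climate = update_climate(climate=2.0, inflation=4.0, target=2.0,
                             gap=0.0, gap_change=0.0)
```

In this sketch the climate moves only part of the way toward the new inflation rate, which is the sluggishness that guideline (1) asks for, while a larger `a_star` relative to `a_pi` would correspond to a more credible inflation target.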
[7] To emphasize the expression “adaptive” beyond the serious scientific paths, we may mention a one-line advertising campaign of hp-invent in Germany in autumn 2004, which was directly formulated in English and read, “Solutions for the adaptive enterprise.” In contrast to ruling theory, the profane business world was apparently less attracted by a slogan like, “The solution for the rational firm.”

2 The New-Keynesian Phillips Curve Revisited

2.1 Homogeneous and Heterogeneous Expectations

We begin with a recapitulation of the standard derivation of the new-Keynesian Phillips curve, as based on the Calvo model with its time-contingent price adjustment rules for firms in monopolistic competition. Each firm must precommit to a price until it receives a random signal that
it can change the price. Accordingly, let $(1-\theta)$ be the random fraction of firms that are able to reset their price in any particular time period $t$. Denoting the reset price of a representative firm in that period by $z_t$ and the overall price level by $p_t$ (both expressed in logs), these staggered price changes give rise to

$$p_t = \theta p_{t-1} + (1-\theta) z_t \tag{1}$$

Designate the frictionless price of the firm as $p_t^*$, which is the price the firm would set in $t$ if there were no price rigidities. It deviates from the prices set by the firm’s competitors according to the current state of demand. Under an upward-sloping supply curve with elasticity $\eta$, the frictionless price may be given by

$$p_t^* = p_t + \eta y_t \tag{2}$$

where $y_t$ is the output gap, i.e., the percentage deviation of actual output from potential output. In an inflationary environment, however, the reset price should exceed $p_t^*$ since the firm will take into account that its price may be fixed for multiple periods. The firm therefore chooses $z_t$ such that it minimizes an expected “loss function,” a convenient specification of which is the quadratic expression $L(z_t) = \sum_{k=0}^{\infty} \theta^k E_t (z_t - p_{t+k}^*)^2$. Here $E_t (z_t - p_{t+k}^*)^2$ is the firm’s expected loss in period $t+k$ if it had no opportunity to set a frictionless optimal price either then or previously, and this loss is weighted by the probability $\theta^k$ that its price will remain unchanged until this period. Then the optimality condition from differentiating the loss function with respect to $z_t$, $dL(z_t)/dz_t = 0$, and some elementary rearrangements determine the reset price as

$$z_t = (1-\theta) \sum_{k=0}^{\infty} \theta^k E_t p_{t+k}^* \tag{3}$$

Even without rigorously solving the optimization problem, it is immediately recognized that $z_t$ is the average desired price until the next price adjustment. Furthermore, Eq. 3 is equivalent to

$$z_t = \theta E_t z_{t+1} + (1-\theta) p_t^* \tag{4}$$

All relationships entering the derivation of the new-Keynesian Phillips curve are made explicit if we lastly apply the expectation operator to the price level Eq. 1 for the next period $t+1$ and solve it for $E_t z_{t+1}$, from which one gets

$$E_t z_{t+1} = \frac{1}{1-\theta} \left( E_t p_{t+1} - \theta p_t \right) \tag{5}$$
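The reset-price formulas can be checked numerically. The sketch below assumes a perfectly foreseen path of the frictionless price (so the expectation operator drops out), a hypothetical constant drift, an illustrative value $\theta = 0.75$, and a long finite horizon in place of the infinite sum; none of these choices come from the chapter. It evaluates Eq. 3, confirms the recursion in Eq. 4, and verifies that the resulting price minimizes the quadratic loss.

```python
theta = 0.75   # probability that a price stays fixed for another period (assumed)
drift = 0.01   # assumed per-period drift of the frictionless log price
K = 400        # truncation horizon standing in for the infinite sum

# Hypothetical, perfectly foreseen path p*_{t+k} = drift * k.
p_star = [drift * k for k in range(K + 1)]


def reset_price(start=0):
    """Eq. 3, truncated: (1 - theta) * sum_k theta^k * p*_{start+k}."""
    horizon = K - start
    return (1 - theta) * sum(theta**k * p_star[start + k]
                             for k in range(horizon + 1))


def loss(z):
    """Quadratic loss sum_k theta^k * (z - p*_{t+k})^2 over the same horizon."""
    return sum(theta**k * (z - p_star[k]) ** 2 for k in range(K + 1))


z0 = reset_price(0)                      # reset price chosen in period t
z1 = reset_price(1)                      # reset price chosen in period t + 1
eq4_rhs = theta * z1 + (1 - theta) * p_star[0]
expected_duration = 1 / (1 - theta)      # mean number of periods between resets
```

Under these assumptions `z0` lies above the current frictionless price `p_star[0]`, which illustrates why the reset price should exceed $p_t^*$ in an inflationary environment, and `z0` agrees with `eq4_rhs` up to rounding.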
Then, substitute Eq. 2 into Eq. 4, solve Eq. 1 for $z_t$, and equate the two resulting expressions for $z_t$, which gives

$$\frac{1}{1-\theta} \left( p_t - \theta p_{t-1} \right) = \theta E_t z_{t+1} + (1-\theta)(p_t + \eta y_t) \tag{6}$$

Substituting Eq. 5 for $E_t z_{t+1}$ leads to

$$\frac{1}{1-\theta} \left( p_t - \theta p_{t-1} \right) = \frac{\theta}{1-\theta} \left( E_t p_{t+1} - \theta p_t \right) + (1-\theta)(p_t + \eta y_t) \tag{7}$$

It remains to denote the rate of inflation by $\pi_t = \Delta p_t = p_t - p_{t-1}$, and after a few algebraic manipulations of Eq. 7 one arrives at

$$\pi_t = E_t \pi_{t+1} + \frac{(1-\theta)^2 \eta}{\theta}\, y_t \tag{8}$$
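For readers who want the intermediate steps, the algebraic manipulations leading from Eq. 7 to Eq. 8 can be spelled out as follows. This is only a worked restatement using the symbols already defined, with $E_t \pi_{t+1} = E_t p_{t+1} - p_t$.

```latex
\begin{align*}
p_t - \theta p_{t-1}
  &= \theta \left( E_t p_{t+1} - \theta p_t \right) + (1-\theta)^2 (p_t + \eta y_t)
  && \text{Eq.~7 multiplied by } (1-\theta) \\
2\theta p_t - \theta p_{t-1}
  &= \theta E_t p_{t+1} + (1-\theta)^2 \eta y_t
  && \text{collect } p_t:\; 1 + \theta^2 - (1-\theta)^2 = 2\theta \\
\theta \left( p_t - p_{t-1} \right)
  &= \theta \left( E_t p_{t+1} - p_t \right) + (1-\theta)^2 \eta y_t
  && \text{move } \theta p_t \text{ to the right-hand side} \\
\pi_t
  &= E_t \pi_{t+1} + \frac{(1-\theta)^2 \eta}{\theta}\, y_t
  && \text{divide by } \theta
\end{align*}
```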
It should be noted that the expectations in Eq. 8 need not necessarily be rational. The key assumption in deriving the relationship was to treat all firms alike, so that we have just spoken of “the” firm as a representative of all price-changing firms. As hinted at in the introduction, when employing rational expectations in Eq. 8, this forward-looking Phillips curve is well known for several undesirable features such as lack of inertia or disinflationary booms. A straightforward extension that seeks to overcome these deficiencies is to allow for some form of lagged dependence in inflation. With one lag, for simplicity, the usual specification is

$$\pi_t = \mu E_t \pi_{t+1} + (1-\mu)\, \pi_{t-1} + \beta_y y_t \tag{9}$$

where $\beta_y > 0$ and $0 \le \mu \le 1$. This relationship, perhaps with some lags of the rate of inflation added, represents the contemporary consensus on a theoretical new-Keynesian model of inflation. While the primary justification for Eq. 9 is empirical, its theoretical background has largely been left to the fringes of the discussion. One proposal to derive Eq. 9 in the present Calvo framework, although with a more elaborated firm sector, has been advanced by Christiano et al. (1999). They maintain the hypothesis of homogeneous firms with rational expectations, but relax the assumption that the firms that are not allowed to reoptimize commit themselves to their old price; the latter rather increase their price automatically at the most recent rate of inflation.[8] A similar device is Woodford’s (2003, sect. 3.2) backward-looking indexation in the firms’ setting of optimal prices.

[8] With a discount factor of unity, then $\mu = 1/2$ results.

Often, however, equations like Eq. 9 are introduced as reflecting forward-looking and backward-looking “behavior.” This characterization suggests that the economy is populated by two (or more) types of firm, which form
their expectations about inflation in different ways. Generally, a continuum of firms that are distributed over the unit interval may be assumed, with density function $s = s(f)$. A single firm is identified as $f \in [0, 1]$, and firm-specific variables are labelled by a superscript $f$. In particular, let $E_t^f$ be the operator that incorporates the rules according to which this firm forms its expectations. Then the following reformulation of Eq. 8 suggests itself:

$$\pi_t = \int_0^1 E_t^f \pi_{t+1}\, ds(f) + \frac{(1-\theta)^2 \eta}{\theta}\, y_t \tag{10}$$
Specifically, a fraction m of the population may be told to have rational expectations, which may here be expressed as E tf p t+1 = Etp t+1 for a firm f, if the symbol Et is reserved for model-consistent expectations. The rest of the firms, labelled f′, are supposed to entertain backward-looking expectations in the simple form Etf′p t+1 = p t−1. Thus,
$$\int_0^1 E_t^f \pi_{t+1} \, ds(f) = \mu E_t \pi_{t+1} + (1-\mu) \pi_{t-1} \tag{11}$$
and combining Eqs. 10 and 11 obviously gives rise to the hybrid Phillips curve in Eq. 9. But precisely how, then, is Eq. 10 supposed to come about? It might be expected that Eq. 9 or Eq. 10 can be derived in a similar way to the baseline case, Eq. 8.$^{9}$ For a careful investigation of this conjecture, we have to go back to the current and future reset price and make the aggregation procedure explicit. Let $z_t^f$ be the reset price of firm $f$ in period $t$, and $I_t \subset [0, 1]$ the set of firms (with mass $1 - \theta$) changing their price in that period. Suitably scaled, the aggregate reset price is therefore $(1 - \theta)^{-1} \int_{I_t} z_t^f \, ds(f)$. Assume furthermore that the distribution of the heterogeneous expectations in the set $I_t$ does not differ from their distribution in the total population of firms.$^{10}$ This makes the aggregate reset price independent of the outcome of the random process selecting the firms that can charge a new price, and we have $z_t = \int_0^1 z_t^f \, ds(f)$. Denoting the frictionless price of firm $f$ by $p_t^f$, the firm-specific version of Eq. 4 is $z_t^f = \theta E_t^f z_{t+1}^f + (1 - \theta) p_t^f$, and the aggregate version reads
$^{9}$ Galí and Gertler (1999, pp. 210f) is an alternative proposal to Christiano et al. (1999) and Woodford (2003) to derive Eq. 9, which has explicit recourse to forward- and backward-looking firms. Here the backward-looking firms' reset price is specified as the average reset price of the two types of firm in the previous period plus a correction for the rate of inflation it has brought about. Interestingly, the authors themselves call this rule “admittedly ad hoc.” The resulting coefficient $\beta_y$, by the way, is similar, but not identical, to $(1 - \theta)^2 \eta / \theta$ in Eq. 8.

$^{10}$ The assumption is acceptable if $1 - \theta$ is not too small, a feature that is corroborated by Blinder et al. (1998). They report that the median US firm adjusts its price once per year, which for a quarterly model gives us $1 - \theta = 1/4$.
$$z_t = \int_0^1 z_t^f \, ds(f) = \theta \int_0^1 E_t^f z_{t+1}^f \, ds(f) + (1-\theta) \int_0^1 p_t^f \, ds(f)$$
On the other hand, with $\int_0^1 y_t^f \, ds(f) = y_t$, the counterpart of Eq. 6 is given by

$$\frac{1}{1-\theta} (p_t - \theta p_{t-1}) = \theta \int_0^1 E_t^f z_{t+1}^f \, ds(f) + (1-\theta)(p_t + \eta y_t) \tag{12}$$
If we seek to proceed with this equation analogously to Eq. 6, a fundamental difference from the economy with homogeneous expectations becomes apparent. Recall that at this stage of the argument, $E_t z_{t+1}$ in Eq. 6 was replaced with Eq. 5. In the homogeneous case, the latter equation could be thought of as follows. The single firm not only knows that the price level is updated every period according to Eq. 1, but also that it shares its expectations with the other firms. Hence Eq. 5 equally holds for its own reset price, and the firm can use this relationship for a determination of $E_t z_{t+1}$ that directly refers to expectations about aggregate prices. In contrast, in the present framework with heterogeneous expectations, Eq. 5 for the future aggregate reset price becomes (after an obvious rearrangement)
$$\int_0^1 E_t^f p_{t+1} \, ds(f) = \theta p_t + (1-\theta) \int_0^1 E_t^f z_{t+1}^f \, ds(f) \tag{13}$$
If, on the analogy of Eq. 5, this equation were solved for the integral on the right-hand side and this term substituted in Eq. 12, we could indeed continue as before.$^{11}$ But what does Eq. 13 mean? Of course, it would be true if each firm $f$ believed that its future reset price and the aggregate price level are related by

$$E_t^f p_{t+1} = \theta p_t + (1-\theta) E_t^f z_{t+1}^f \tag{14}$$
However, the individual firm is negligibly small, and if it reckons on the other firms having different expectations about their future reset prices, it will see no reason for this relationship to hold. More precisely, while the single firm may well have expectations about prices in the next period, it will not conceive the general price level $p_{t+1}$ as having anything essential to do at all with its own reset price in $t + 1$. Equation 14, or any modification of it that maintains the term $E_t^f z_{t+1}^f$, will simply be meaningless for an individual firm. As a consequence, there is no conceptual basis for the aggregate version Eq. 13 to come into being either.

$^{11}$ In finer detail, this procedure would involve an additional assumption of the firms' expectations about other firms' expectations, for which, in short, no predictable movements are expected. This feature is spelled out in Adam and Padula (2002). On the other hand, the authors use an equivalent formulation of Eq. 13 without any further comment (see Eq. 10 in their Appendix).
We conclude that when heterogeneous expectations of firms are allowed for, the extended Phillips curve relationship in Eq. 10 cannot be established along the lines of the standard argument. In particular, substituting $\mu E_t \pi_{t+1} + (1 - \mu) \pi_{t-1}$ for $E_t \pi_{t+1}$ in Eq. 8 has so far no rigorous microeconomic underpinnings that are comparable to those of the baseline case, where $\mu = 1$. In this sense, hybrid expectations appear to contain some “ad hoc” element. This observation motivates us to take one more step and reconsider the kind of reasoning by which single firms determine their reset price. In this way, we will, in particular, face the question of the time horizon of expectations in an alternative interpretation of the Phillips curve.
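The purely mechanical aggregation behind Eqs. 10 and 11, whose microeconomic basis is questioned above, can be illustrated numerically. The following is only a sketch: the function names, the discretization of the unit interval into `n_firms` equally weighted firms, and the parameter values in the usage note are assumptions made for illustration and are not part of the original text.

```python
def aggregate_expectation(mu, rational_forecast, lagged_inflation, n_firms=1000):
    """Average E_t^f pi_{t+1} over a discretized unit interval of firms (Eq. 11).

    Assumption: a share mu of equally weighted firms holds the model-consistent
    forecast, the rest simply expect last period's inflation.
    """
    n_rational = round(mu * n_firms)
    forecasts = ([rational_forecast] * n_rational
                 + [lagged_inflation] * (n_firms - n_rational))
    return sum(forecasts) / n_firms


def hybrid_inflation(mu, rational_forecast, lagged_inflation, theta, eta, output_gap):
    """Insert the averaged expectation into Eq. 10."""
    slope = (1 - theta) ** 2 * eta / theta
    return aggregate_expectation(mu, rational_forecast, lagged_inflation) + slope * output_gap
```

For instance, with a quarter of the firms forecasting 4% and the remainder expecting last period's 2%, `aggregate_expectation(0.25, 0.04, 0.02)` returns the weighted average $\mu E_t \pi_{t+1} + (1 - \mu) \pi_{t-1}$. The sketch reproduces the arithmetic only; it says nothing about whether Eq. 10 can be derived from the firms' reset-price decisions.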
2.2 Reconsidering the Reset Price and Expectations About Inflation

To re-examine the firm's decision problem about its reset price, we return to the solution formula of Eq. 3 that minimizes the expected losses for the periods in which the firm will not be capable of changing the price. Assuming that the reset probability $(1 - \theta)$ is common knowledge, and taking account of the desired frictionless price (Eq. 2), the reset price $z_t^f$ of firm $f$ is given by

$$z_t^f = (1-\theta) \sum_{k=0}^{\infty} \theta^k E_t^f (p_{t+k} + \eta y_{t+k}^f) \tag{15}$$
To be clear, we henceforth abjure rational expectations for any firm. As has been pointed out in the previous section, because of its negligible size it is not meaningful for a single firm to relate its future reset price to the expected aggregate price level. Each firm thus has to cope with Eq. 15 in another, more direct, way. It would be an enormous task for a firm to predict an entire time profile of future price levels $p_{t+k}$ and the demand directed to itself, from which its output gaps $y_{t+k}^f$ derive. This would ask too much of the major economic research institutes of a country, and firm $f$ is not supposed to be more able in this respect. Therefore, the firm decomposes Eq. 15 into a few elements which it then treats in an approximate manner. The central idea is that firm $f$ captures future inflation inherent in the price series $p_{t+k}$ by a single number $\pi_t^f$. As discussed below, $\pi_t^f$ will change over time; hence the time index $t$. In a first description, $\pi_t^f$ could be conceived as the firm's expected average rate of inflation for the whole future from $t$ on, where the terms regarding the period $t + k$ are to be discounted by the probability weight $\theta$. In detail, consider the expected price level for $t + k$ and, invoking the rates of inflation until that time, write it as $E_t^f p_{t+k} = p_t + \sum_{j=1}^{k} E_t^f \pi_{t+j}$. When the firm seeks to grasp future inflation by its summarizing rate $\pi_t^f$, we have

$$E_t^f p_{t+k} = p_t + \sum_{j=1}^{k} \left[ \pi_t^f + E_t^f \pi_{t+j} - \pi_t^f \right] = p_t + k \pi_t^f + \sum_{j=1}^{k} \left( E_t^f \pi_{t+j} - \pi_t^f \right)$$
Denoting the accumulated residuals for period $t + k$ by $e_{t+k}^f = e_{t+k}^f(\pi_t^f)$, the expected price level is split up into $E_t^f p_{t+k} = p_t + k \pi_t^f + e_{t+k}^f(\pi_t^f)$, where

$$e_{t+k}^f(\pi_t^f) = \sum_{j=1}^{k} \left( E_t^f \pi_{t+j} - \pi_t^f \right) \tag{16}$$
What the number $\pi_t^f$ should accomplish is that these residuals average out on the whole. Realistically, however, the firm sees itself in an uncertain inflationary environment. It has no reliable basis on which to build up a probability distribution of future inflation rates and on which it could really expect the residuals to cancel out by a suitable choice of $\pi_t^f$. On the other hand, the firm has to settle on a definite reset price in the end. So it just proceeds as if $\pi_t^f$ were able to satisfy this condition. The firm is aware that this will not be exactly the case, but for lack of a better (and not more costly) procedure it is willing to accept the possible errors. Accordingly, in using Eq. 16 to determine its reset price $z_t^f$, the firm sets the discounted errors equal to zero:

$$\sum_{k=1}^{\infty} \theta^k e_{t+k}^f(\pi_t^f) = 0 \tag{17}$$
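To make the condition in Eq. 17 concrete, the sketch below computes the summarizing rate for a hypothetical firm that did hold a definite expected inflation path, which the text stresses is not realistically the case. It is an illustration under stated assumptions: the infinite sum is truncated at a finite horizon $K$ (the length of the supplied path), and the function names and numbers are invented for the example.

```python
def summarizing_rate(expected_inflation, theta):
    """Rate pi_f that zeroes the theta-discounted accumulated residuals (Eq. 17).

    Assumption: the sum over k is truncated at K = len(expected_inflation).
    expected_inflation[j-1] stands for E_t^f pi_{t+j}.
    """
    K = len(expected_inflation)
    # E pi_{t+j} enters every accumulated residual e_{t+k} with k >= j,
    # so its total weight is theta**j + ... + theta**K.
    weights = [sum(theta ** k for k in range(j, K + 1)) for j in range(1, K + 1)]
    return sum(w * x for w, x in zip(weights, expected_inflation)) / sum(weights)


def discounted_residuals(expected_inflation, pi_f, theta):
    """Left-hand side of Eq. 17, truncated at the same horizon."""
    total, accumulated = 0.0, 0.0
    for k, x in enumerate(expected_inflation, start=1):
        accumulated += x - pi_f           # e_{t+k} from Eq. 16
        total += theta ** k * accumulated
    return total
```

Calling `summarizing_rate([0.03, 0.025, 0.02, 0.02], 0.75)` yields a discounted average of the path that gives more weight to near-term inflation, and `discounted_residuals` evaluated at that rate is zero up to rounding.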
This rule completes the treatment of future prices in Eq. 15. Turning to the second component in this equation, the firm still has to form expectations about its output:

$$\sum_{k=0}^{\infty} \theta^k E_t^f y_{t+k}^f = y_t^f + \sum_{k=1}^{\infty} \theta^k E_t^f y_{t+k}^f \tag{18}$$
Since the output gap fluctuates around zero in the long run, in this equation no secular growth rate has to be assessed. The values $y_{t+k}^f$ for the next few periods will nevertheless be relevant. In a similar spirit as before, the firm could work here with a single growth rate $g_t^f$ that it expects, or rather presumes, to prevail on average over the near future. We abstain from this approach, however, since we want to stay close to the conventional Phillips curve where output expectations have no explicit role to play. Instead, let the firm assume that, in the absence of foreseen shocks, output in the near future (or demand, for that matter) is still essentially determined by its present level and growth rate. Thus, firm $f$ approximates the infinite sum on the right-hand side in Eq. 18 by

$$\sum_{k=1}^{\infty} \theta^k E_t^f y_{t+k}^f = \xi_1 y_t^f + \xi_2 \Delta y_t^f \tag{19}$$
Generally, the coefficients $\xi_1, \xi_2 \ge 0$ may vary across firms, but to simplify the notation we treat them directly as homogeneous. The coefficients $\xi_1$ and $\xi_2$ might also be supposed to be zero, by arguing that the firm has already sought to eliminate Eq. 19 by its choice of $\pi_t^f$ (again, by and large at least). This case would not be too unreasonable since, when updating their current value of $\pi_t^f$, firms will take account of the output gap and its change (this is explained in greater detail below). Combining the rules and approximations in Eqs. 15–19, the reset price $z_t^f$ on which firm $f$ decides is given by

$$z_t^f = (1-\theta) \sum_{k=0}^{\infty} \theta^k p_t + \pi_t^f \sum_{k=0}^{\infty} (1-\theta) \theta^k k + (1-\theta) \eta \left[ (1+\xi_1) y_t^f + \xi_2 \Delta y_t^f \right] \tag{20}$$
The remaining two infinite series in Eq. 20 can easily be resolved. Since the $\theta^k$ sum to $1/(1 - \theta)$, the first term on the right-hand side is just $p_t$. The second infinite series can be written as $\theta \sum_{k=1}^{\infty} (1 - \theta) \theta^{k-1} k$ or, putting $q = 1 - \theta$, as $\theta \sum_{k=1}^{\infty} (1 - q)^{k-1} q k$. The latter series is precisely the mean value of the geometric distribution, which can be looked up in a formulary as $1/q$ (in addition, this is also the mean waiting time for a firm until it can change its price). Hence, the second term on the right-hand side of Eq. 20 equals $\pi_t^f \theta / q = \theta \pi_t^f / (1 - \theta)$. To sum up, we arrive at the following determination of the reset price of firm $f$:

$$z_t^f = p_t + \frac{\theta}{1-\theta} \pi_t^f + (1-\theta) \eta \left[ (1+\xi_1) y_t^f + \xi_2 \Delta y_t^f \right] \tag{21}$$
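The step from Eq. 20 to Eq. 21 rests on the series $\sum_{k \ge 0} (1 - \theta) \theta^k k = \theta / (1 - \theta)$, which is easy to confirm numerically. The sketch below does so and then evaluates Eq. 21; the function names, the truncation length, and the default values $\xi_1 = \xi_2 = 0$ are choices made for this illustration only.

```python
def expected_waiting_term(theta, n_terms=2000):
    """Truncated value of sum_{k>=0} (1 - theta) * theta**k * k."""
    return sum((1 - theta) * theta ** k * k for k in range(n_terms))


def reset_price(p, pi_f, y_f, dy_f, theta, eta, xi1=0.0, xi2=0.0):
    """Reset price of firm f according to Eq. 21 (prices in logs)."""
    inflation_term = theta / (1 - theta) * pi_f
    output_term = (1 - theta) * eta * ((1 + xi1) * y_f + xi2 * dy_f)
    return p + inflation_term + output_term
```

With $\theta = 0.75$, that is, $1 - \theta = 1/4$ as in the quarterly calibration mentioned in footnote 10, the series evaluates to 3. A firm with a zero output gap and $\pi_t^f = 0.02$ per quarter would then set its reset price 0.06 above the current price level.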
2.3 A Reinterpretation of Expectations in the Phillips Curve

Having available the firms' reset prices in Eq. 21, it is now straightforward to derive a Phillips curve. To this end, for the aggregation of the firm-specific rates of inflation $\pi_t^f$, we define

$$\pi_t^c = \int_0^1 \pi_t^f \, ds(f) \tag{22}$$
The aggregate reset price is already known to be given by $z_t = \int_0^1 z_t^f \, ds(f)$. Thus, Eq. 21 gives rise to the aggregate equation $z_t = p_t + \theta \pi_t^c / (1 - \theta) + A$, with $A := (1 - \theta) \eta [(1 + \xi_1) y_t + \xi_2 \Delta y_t]$. Substituting this expression for $z_t$ in Eq. 1 for the staggered price changes yields $p_t = (1 - \theta) p_t + (1 - \theta) \theta \pi_t^c / (1 - \theta) + (1 - \theta) A + \theta p_{t-1}$, or

$$\theta \pi_t = \theta (p_t - p_{t-1}) = \theta \pi_t^c + (1-\theta) A$$
The Phillips curve relationship we thereby obtain reads

$$\pi_t = \pi_t^c + \frac{(1-\theta)^2 \eta}{\theta} \left[ (1+\xi_1) y_t + \xi_2 \Delta y_t \right] \tag{23}$$
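As a cross-check on this derivation, Eq. 23 can be compared with the inflation rate obtained by solving the staggered price update directly. The sketch below is an illustration under assumptions: it takes Eq. 1 to be $p_t = (1 - \theta) z_t + \theta p_{t-1}$, as implied by the substitution described in the text, solves it by a simple fixed-point iteration, and uses invented function names and parameter values.

```python
def climate_phillips_curve(pi_c, y, dy, theta, eta, xi1=0.0, xi2=0.0):
    """Inflation according to Eq. 23."""
    slope = (1 - theta) ** 2 * eta / theta
    return pi_c + slope * ((1 + xi1) * y + xi2 * dy)


def inflation_by_fixed_point(p_prev, pi_c, y, dy, theta, eta,
                             xi1=0.0, xi2=0.0, n_iter=200):
    """Solve p_t = (1 - theta) * z_t + theta * p_{t-1} by iteration.

    The aggregate reset price z_t depends on p_t itself; the iteration
    contracts with factor (1 - theta), so it converges for 0 < theta < 1.
    """
    A = (1 - theta) * eta * ((1 + xi1) * y + xi2 * dy)
    p = p_prev
    for _ in range(n_iter):
        z = p + theta * pi_c / (1 - theta) + A   # aggregate reset price
        p = (1 - theta) * z + theta * p_prev     # staggered price update
    return p - p_prev
```

Both routes give the same inflation rate up to rounding, for example with `pi_c=0.02, y=0.01, dy=0.002, theta=0.75, eta=0.5, xi1=0.5, xi2=0.3`.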
While the analogy to Eq. 8 is obvious, two important distinctions should be drawn. The first one is immediate, as in Eq. 23 not only the output gap but also its rate of change enters (which is the difference between the two growth rates of actual total output and potential output). This could perhaps be an attractive feature for empirical estimations. Conceptually more significant, however, is the second distinction, that $E_t \pi_{t+1}$ in Eq. 8 is here replaced with $\pi_t^c$. From what has been said above, it is clear that this variable is no longer meant to capture expectations about next period's inflation. For each single firm, the $\pi_t^f$ that constitute $\pi_t^c$ are rather the firms' time averages of all future inflation rates, where the averaging procedure uses the probability $\theta$ for discounting. Moreover, the firms' choice of $\pi_t^f$ cannot be based on consistent mathematical expectations such that the expected deviations of $\pi_t^f$ from the future $\pi_{t+k}$ offset each other in the long run, in the sense of Eq. 17; hence, this choice has to be made in more informal ways, or it includes more intuitive elements. For this reason, we completely avoid speaking of expectations in connection with Eq. 23. As an alternative, we prefer to call the aggregate variable $\pi_t^c$ a general inflation climate. Of course, to breathe life into this notion it has to be made clear how firms come to set up their rates $\pi_t^f$, or how the general inflation climate is supposed to change. These dynamic adjustments, studied directly in the aggregate, are the subject of the next section.
3 Dynamic Adjustments of the Inflation Climate

3.1 Motivational Background

As the notion of the inflation climate $\pi_t^c$ has been introduced, there is no direct empirical counterpart to which this variable could be related. However, survey measures of expected inflation are data that come relatively close to it. These types of data appear attractive insofar as they are obtained from real-world agents who, like our firms, are not blessed with rational expectations, but derive their forecasts from a mix of formal methods, sophisticated reasoning, and intuition.$^{12}$ Hence, although we have emphasized that $\pi_t^c$ is not a rate of inflation that firms expect to prevail in the next period (or four quarters ahead, for that matter), survey data may be stimulating in designing or evaluating the mechanisms that determine the movements of the inflation climate. Since $\pi_t^c$ is an average rate across the population of firms, our discussion of the survey expectations will only be concerned with the mean of the forecasted values.$^{13}$

$^{12}$ However, intuition can later be rationalized by referring to more acknowledged arguments while ignoring others.

The first question about survey expectations is, of course, their rationality; that is, whether these subjects process the information available to them as if, in the mean, they were perfectly rational. Several studies have examined the statistical properties of such inflation measures, where departures from rationality are usually identified as rejections of the hypotheses of unbiasedness and efficiency. While there are a number of interesting details concerning results over subperiods and for single measures, it seems fair to say that on balance the evidence points to rejections of these null hypotheses (see Thomas 1999; Andolfatto et al. 2002, sect. 2; Mehra 2002). On the other hand, the literature on surveys of inflation expectations has also revealed that they are not as unsophisticated as simple autoregressive models (“adaptive expectations”) suggest. Mullineaux (1980) and Gramlich (1983), for example, already found that additional information like the money supply helps explain movements in the survey data, even when lagged inflation is controlled for (see also Croushore 1998, for more recent evidence). In view of this evidence, Roberts (1997, 1998) has developed the idea that survey expectations are an average of rational expectations and adaptive expectations, as captured by lagged inflation. As he stresses himself, however, this approach represents only an empirical regularity, and does not correspond to structural models derived from underlying economic behavior (Roberts 1998, p. 9).
Furthermore, even if this work could make a case for rational expectations as one component of a structural model, this does not necessarily rule out that alternative concepts also have comparable explanatory power.$^{14}$

$^{13}$ Branch (2004) has recently advanced, and successfully estimated, an ambitious model that seeks to explain the dynamics of the total population of forecasts. It assumes that each forecaster employs one of three available predictors (corresponding to static, adaptive, and vector autoregressive expectations), which he may change from period to period according to their costs and current performance. As a side remark regarding the population share of fully rational forecasters that Roberts (1998) estimates in his simpler framework (as we mention in the next footnote), it might be interesting to note that Branch (2004, pp. 617–619), when he replaces the vector autoregression (VAR) expectations with rational expectations, obtains an estimate of 11% for this proportion, which is much smaller than that of Roberts.
Another approach in the literature explicitly denies rational expectations to the agents, but views them as smarter than simply sticking to adaptive expectations. As a fairly modest deviation from rational expectations, the agents are treated as econometricians; they have estimated a dynamic reduced-form structure of the economy in the form of a vector autoregression (VAR), and use this model to predict inflation in the next, or next few, periods. This type of behavior has been called “near-rational,” since it produces forecast errors that are only a little worse than those of fully rational expectations. When the VAR module is incorporated into a complete model economy, the regression coefficients may be exogenous and invariant over the simulated sample runs, as, for example, in Ball (2000) and Huh and Lansing (2000), or they may even be endogenous, where in a learning process agents update them in each period, as in Orphanides and Williams (2002, 2003) or (for an AR inflation process) in Tetlow and von zur Muehlen (2003). In our opinion, this approach to modelling expectations is a more appealing compromise between rational expectations and adaptive expectations than the one proposed by Roberts (1997, 1998). The main reason why we do not adopt such a VAR to determine our inflation climate is, again, the argument that $\pi_t^c$ is not the inflation to be expected for a particular period in the future. Also, we strive for a simpler specification than a complete VAR. Nevertheless, the adjustment mechanism that we will put forward for $\pi_t^c$ can be viewed as being in the same spirit; it could even be interpreted as a particularly simple (or degenerate) VAR, which, especially regarding the number of lags, has been subjected to a great number of a priori restrictions. Whether, or in what sense, our simplified specification is compatible with the data has to be investigated at a later stage.
$^{14}$ Actually, Roberts's estimations seem to be fraught with econometric subtleties. While in Roberts (1997, p. 190) the results “indicate that the departure of survey expectations from rationality is not adequately captured by the simple weighted-average model,” and they suggest “that some other model of the deviation of the surveys from rationality should be preferred” (Roberts 1997, p. 191), the estimations presented in Roberts (1998) are more favorable, and here the relevant parameters are significant at a meaningful order of magnitude. The latter results lead him to the conclusion “of an ‘intermediate’ degree of rationality, with perhaps 20 to 40 percent of the population using simple, backward-looking expectations” (Roberts 1998, p. 15).

When looking at time series of surveys of expectations, three features emerge that a model determining the inflation climate $\pi_t^c$ should reflect. First, at least at a quarterly frequency, the expectation series are typically smoother than the inflation rates to be predicted (cf. Figs. 1–3 in Thomas 1999). This observation suggests that new information is not immediately, or not to its full extent, taken into account; instead, expected inflation appears to adjust to new information in a partial way, or with some delay. Incidentally, for imperfect agents in an uncertain world this behavior is by no means “irrational,” but can very well be shown to be optimal (Heiner 1988). Expressing our inflation climate $\pi_t^c$ directly as a function of the other endogenous variables in a model would imply that $\pi_t^c$ exhibits the same volatility. We therefore take the partial adjustments explanation of the observed smoothness in the surveys as an inspiration to treat $\pi_t^c$ as a predetermined variable, which in the light of incoming information is then updated for the next period in a gradual manner. A second feature of the survey expectations is that they tend to underestimate inflation during expansions and to overestimate it during contractions (Thomas 1999, p. 134). This lagging behind the actual series is typical of an adaptive expectations rule. A more detailed examination of this phenomenon reveals a third feature. Underestimations of inflation also occurred during periods in which inflation exceeded its average. Conversely, overestimations occurred during periods of below-average inflation. The interpretation of this finding is that the forecasts exhibit a propensity to expect inflation to revert toward its longer-run average value (Thomas 1999, p. 134). The latter two features lead us to incorporate an adaptive expectation and a mean-reverting component into the adjustment mechanism for $\pi_t^c$. In addition, as a reflection of the VAR method, the inflation climate should respond to movements of variables other than inflation. Since we want to remain in the framework of elementary price Phillips curves, the only variable that we will consider here is the output gap.
3.2 Specification of the Adaptive Inflation Climate (AIC)

Having sketched this motivational background, we now turn to a detailed description of the dynamic adjustments of the inflation climate. To begin with, the inflation climate $\pi_t^c$ is predetermined in a given period $t$, and modified by the firms at the beginning of the next period as new information arrives. The updating procedure is based on the notion of a general benchmark rate of inflation, toward which the current value of $\pi_t^c$ is adjusted in a gradual manner. The benchmark itself is a combination of four single benchmark components. Needless to say, the composite benchmark is not fixed, but generally varies over time. A first requirement of a benchmark concept is, of course, that the inflation climate should not be persistently above or below the current rate of inflation $\pi_t$. Hence, $\pi_t$ is one of the four single benchmarks toward which $\pi_t^c$ is to adjust. Formally, this amounts to the above-mentioned adaptive expectations. In this respect, however, we emphasize the following three points. (i) $\pi_t^c$ is not the expected inflation in the next quarter $t + 1$, but is held to be some sort of average over a longer period of time. (ii) Firms are aware that the benchmark $\pi_t$ is a moving target, which is also quite volatile in the presence of random shocks. In particular, since they do not have perfect knowledge of the probability laws of the shocks, the firms are reasonably cautious and do not adjust $\pi_t^c$ too strongly in the direction of $\pi_t$; this is basically Heiner's (1988) argument. (iii) Current inflation is only one guideline among several others for updating the inflation climate. For all these reasons we will avoid the term “adaptive expectations” altogether. Nevertheless, on many occasions $\pi_t$ would not be very useful as an overall benchmark. For example, this becomes clear in a boom phase of the business cycle. When firms expect an additional upward pressure on prices in this situation, $\pi_t$ as a benchmark may be considered to be corrected upward in proportion to the degree of capacity utilization, which in the present setting is represented by the output gap $y_t$. On second thoughts, the idea of rising inflation in a stage of overutilization may not be entirely convincing if the economy has reached a turning point and a contraction is already underway. Recognizing such a pattern is often difficult without the data from the next several quarters, so in this respect firms may just have a look at the recent change in capacity utilization. This means that, in a third concept, firms correct the benchmark $\pi_t$ for a growth component. Accordingly, we suppose that they correct $\pi_t$ for this purpose in proportion to the change in the output gap, $\Delta y_t$. Whereas the latter two concepts primarily seek to extrapolate the current trend in the motion of the rate of inflation, a longer-run time perspective is also relevant.
In accordance with the above-mentioned mean-reverting component in the survey expectations, we think of firms imagining that in a not too distant future the central bank will succeed in bringing inflation back to normal, where an obvious measure of “normal” inflation is given by the central bank's target rate of inflation. If monetary policy is assumed to be sufficiently transparent, we can work directly with a definite target rate of inflation $\pi^*$ that is known to all firms. This constitutes the fourth benchmark toward which the inflation climate may partially adjust. The four concepts of sensible benchmark rates of inflation are summarized in Table 1. This table also contains our notation for the proportionality factors ($z$) by which the benchmark $\pi_t$ is corrected in the second and third concept; for the speeds of adjustment ($\delta$) at which $\pi_t^c$ is, respectively, updated in the direction of the four single benchmarks; and for the weights ($w$) that each of these concepts carries when they are combined into a closed formula. Thus, the changes in the general inflation climate are described as
Table 1. The four updating concepts for the inflation climate $\pi_t^c$

| Provisional benchmarks for $\pi_t^c$ to adjust | Benchmark | Adjustment speed | Weight of concept |
|---|---|---|---|
| Current inflation | $\pi_t$ | $\delta_\pi$ | $w_\pi$ |
| Output-adjusted inflation | $\pi_t + z_y y_t$ | $\delta_y$ | $w_y$ |
| Growth-adjusted inflation | $\pi_t + z_g \Delta y_t$ | $\delta_g$ | $w_g$ |
| Target inflation | $\pi^*$ | $\delta_s$ | $w_s$ |

As well as $z_y, z_g > 0$, the adjustment speeds and weights of the four concepts satisfy $0 \le \delta_a, w_a \le 1$ ($a = \pi, y, g, s$) and $\sum_a w_a = 1$.
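$$\pi_{t+1}^c = \pi_t^c + w_\pi \Delta\pi^{c,\pi} + w_y \Delta\pi^{c,y} + w_g \Delta\pi^{c,g} + w_s \Delta\pi^{c,s} \tag{24}$$

where

$$\begin{aligned}
\Delta\pi^{c,\pi} &:= \delta_\pi (\pi_t - \pi_t^c) & \Delta\pi^{c,y} &:= \delta_y (\pi_t + z_y y_t - \pi_t^c) \\
\Delta\pi^{c,s} &:= \delta_s (\pi^* - \pi_t^c) & \Delta\pi^{c,g} &:= \delta_g (\pi_t + z_g \Delta y_t - \pi_t^c)
\end{aligned}$$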
(the index “g” alludes to the growth component that is captured by $\Delta y_t$, and “s” alludes to the star symbol in the target rate of inflation). It goes without saying that with $\pi^c = \pi^*$, Eq. 24 supports the steady state $\pi = \pi^*$, $y = 0$ of the economy. The structural representation of the dynamics of the inflation climate contains ten behavioral coefficients. While they are useful in order to distinguish the four single benchmark concepts from each other, they are not all needed in the further analysis. We reduce them to the following four parameters:

$$\begin{aligned}
\alpha_c &:= w_\pi \delta_\pi + w_y \delta_y + w_g \delta_g + w_s \delta_s & \alpha_y &:= w_y \delta_y z_y / (w_\pi \delta_\pi + w_y \delta_y + w_g \delta_g) \\
\gamma &:= w_s \delta_s / \alpha_c & \alpha_g &:= w_g \delta_g z_g / (w_\pi \delta_\pi + w_y \delta_y + w_g \delta_g)
\end{aligned}$$

and write Eq. 24 equivalently as

$$\pi_{t+1}^c = \pi_t^c + \alpha_c \left[ \gamma \pi^* + (1-\gamma)(\pi_t + \alpha_y y_t + \alpha_g \Delta y_t) - \pi_t^c \right] \tag{25}$$
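The equivalence of Eqs. 24 and 25 can be checked numerically by mapping one set of structural coefficients to the four reduced parameters and comparing a single updating step. The sketch below is illustrative only: the function names, the dictionary keys `"pi"`, `"y"`, `"g"`, `"s"` for the four concepts, and the coefficient values in the usage note are assumptions, not estimates from the text.

```python
CONCEPTS = ("pi", "y", "g", "s")


def reduce_parameters(delta, w, z_y, z_g):
    """Map the ten structural coefficients of Eq. 24 to (alpha_c, gamma, alpha_y, alpha_g)."""
    alpha_c = sum(w[a] * delta[a] for a in CONCEPTS)
    gamma = w["s"] * delta["s"] / alpha_c
    # Combined speed of the three inflation-based concepts; must be positive.
    non_target = sum(w[a] * delta[a] for a in ("pi", "y", "g"))
    alpha_y = w["y"] * delta["y"] * z_y / non_target
    alpha_g = w["g"] * delta["g"] * z_g / non_target
    return alpha_c, gamma, alpha_y, alpha_g


def update_structural(pi_c, pi, y, dy, pi_star, delta, w, z_y, z_g):
    """One step of Eq. 24."""
    benchmarks = {"pi": pi, "y": pi + z_y * y, "g": pi + z_g * dy, "s": pi_star}
    return pi_c + sum(w[a] * delta[a] * (benchmarks[a] - pi_c) for a in CONCEPTS)


def update_reduced(pi_c, pi, y, dy, pi_star, alpha_c, gamma, alpha_y, alpha_g):
    """One step of Eq. 25."""
    target = gamma * pi_star + (1 - gamma) * (pi + alpha_y * y + alpha_g * dy)
    return pi_c + alpha_c * (target - pi_c)
```

For example, with `delta = {"pi": 0.2, "y": 0.3, "g": 0.1, "s": 0.05}`, `w = {"pi": 0.4, "y": 0.3, "g": 0.2, "s": 0.1}`, `z_y = 0.5` and `z_g = 0.8`, both updating functions return the same next-period climate for any state of the economy.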
Once the four parameters in Eq. 25 are given, the values of the original ten structural parameters in Eq. 24 can be constructed from them. Clearly, this determination is not unique, but there exists a whole continuum of such parameter combinations. Especially regarding the coefficient $\alpha_c$, however, one may no longer wish to decompose it any further since, as the terms are collected in Eq. 25, it is seen to represent the average adjustment speed in the four updating procedures ($\alpha_c \le 1$). Equation 25 summarizes how the single firms' views about future inflation cause the general inflation climate to change in an adaptive way, where, as usual in the learning literature on heterogeneous agents, the expression “adaptive” is used in a broader sense than just an “adaptive expectations” rule. The equation can thus be said to describe the concept of an adaptive inflation climate. In short, the updating rule in Eq. 25 may be referred to as the AIC module. If it is introduced into our Phillips curve Eq. 23 from Sect. 2.3, we may simply speak of the PC + AIC module.
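A minimal simulation of the PC + AIC module may help to see how the two equations interact. The sketch below iterates Eqs. 23 and 25 for an exogenously given output-gap path. Several elements are assumptions made for the illustration: the composite slopes `beta_y` and `beta_g` stand for $(1-\theta)^2 \eta (1+\xi_1)/\theta$ and $(1-\theta)^2 \eta \xi_2/\theta$, the change in the output gap is set to zero in the first period, and all names and numbers are hypothetical.

```python
def simulate_pc_aic(pi_c0, output_gaps, pi_star, alpha_c, gamma,
                    alpha_y, alpha_g, beta_y, beta_g=0.0):
    """Iterate the Phillips curve (Eq. 23) and the AIC update (Eq. 25).

    Returns the paths of inflation and of the (predetermined) inflation climate.
    """
    pi_c, y_prev = pi_c0, output_gaps[0]
    inflation, climate = [], []
    for y in output_gaps:
        dy = y - y_prev
        pi = pi_c + beta_y * y + beta_g * dy       # Eq. 23 with composite slopes
        inflation.append(pi)
        climate.append(pi_c)
        target = gamma * pi_star + (1 - gamma) * (pi + alpha_y * y + alpha_g * dy)
        pi_c = pi_c + alpha_c * (target - pi_c)    # Eq. 25
        y_prev = y
    return inflation, climate
```

With a permanently closed output gap, inflation coincides with the climate, and the climate's deviation from $\pi^*$ shrinks by the factor $1 - \alpha_c \gamma$ each period. This follows directly from Eq. 25 and illustrates why the target benchmark ($\gamma > 0$) is what anchors the steady state in this setting.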
3.3 Central Bank Credibility and Inflation Persistence in the Phillips Curve

In this short subsection, we point out the economic significance of the parameter $\gamma$ in Eq. 25 and how it relates to more familiar concepts in the literature. We start out from an elementary specification of expectations in the Phillips curve, which can reflect the faith firms have in the conduct of monetary policy. Following Freedman (1996, pp. 253f), such a Phillips curve reads

$$\pi_t = \mu \pi^* + (1-\mu) A(L) \pi_{t-1} + \beta y_t \tag{26}$$
where $A(L)$ is a polynomial lag function, whose coefficients add up to unity, indicating that expected inflation is tied to current and past rates of inflation. The parameter we are interested in is the weight $\mu$, which expresses the degree to which inflation expectations are anchored on the target rate of inflation. In this sense the coefficient can be viewed as measuring the credibility of the central bank.$^{15}$ While $\mu$ will depend on the success of monetary policy in the past, it is intuitively clear that increases in credibility are also favorable for maintaining price stability, although this will generally require the central bank to appropriately adjust (the coefficients in) its reaction function.$^{16}$ Hence the coefficient $\mu$ in equations like Eq. 26 deserves particular attention. Of course, this amounts to the same as other discussions of the Phillips curve that, in terms of Eq. 26, concentrate on the coefficient $1 - \mu$ as a measure of inflation persistence (as mentioned in the previous footnote, the role of $\pi^*$ here might not be made explicit). In the formulation of Eq. 26, central bank credibility ($\mu$) and inflation persistence ($1 - \mu$) are obviously complements.
$^{15}$ Note that this concept is inherent in Phillips curve estimations with demeaned or detrended inflation rates, where the sum of the coefficients on lagged inflation is significantly less than one. This is easily seen by adding target inflation on both sides of such a regression equation, when it is assumed that $\pi^*$ approximately equals the trend inflation in the data.

$^{16}$ This is an important outcome in Amano et al. (1999), which they obtain in an elaborate macroeconometric model of the Canadian economy (see especially pp. 11–13, 24f).
Now, in combination with Eq. 25, our Phillips curve Eq. 23 from Sect. 2.3 also makes reference to the central bank’s target rate of inflation p *, the strength of its influence being governed by the coefficient g. Note, however, that p * in Eq. 26 directly impacts on period-t inflation, while in Eqs. 23 and 25 its effect on inflation is delayed. If for the moment the expectational term mp * + (1 − m)A(L) in Eq. 26 is identified with our general inflation climate, we can say that in Eq. 26 the target rate determines the level of the climate, and in Eq. 25 it determines the change of the climate. The PC + AIC module in Eqs. 23 and 25 can nevertheless be made comparable to Eq. 26 by an elementary rearrangement of terms. Dating Eq. 25 one period backward and substituting it in Eq. 23, we get p*, p t−1, and p ct−1 on the right-hand side (as well as the output terms, of course). Then dating Eq. 25 two periods backward and substituting it for p ct−1, an additional term with p * appears as well as p t−2 and p ct−2 . If we proceed in this way and simplify the rest by neglecting ∆yt in Eq. 23 and denoting the composed coefficient on yt as by, our Phillips curve Eq. 23 becomes ∞
π t = α c ∑ (1 − α c )k[γπ *+ (1 − γ )(π t − k −1 + α y yt − k −1 + α g ∆yt − k −1 )] + β y yt k=0
Since the infinite series Σ_{k=0}^{∞} (1 − α_c)^k has limit 1/α_c, the equation can be rewritten as

\[ \pi_t = \gamma\pi^* + (1-\gamma)\sum_{k=0}^{\infty} \alpha_c (1-\alpha_c)^k \pi_{t-k-1} + \beta_y y_t + \alpha_c (1-\gamma) \sum_{k=0}^{\infty} (1-\alpha_c)^k \left(\alpha_y y_{t-k-1} + \alpha_g \Delta y_{t-k-1}\right) \tag{27} \]
which includes the inflation terms in the same form as Eq. 26. As inflation persistence in a Phillips curve context is usually specified as the sum of the coefficients on the lagged rates of inflation, the role of the parameter γ in the PC + AIC module can be summarized as follows:

    γ        credibility of the central bank
    1 − γ    inflation persistence in the Phillips curve        (28)
In addition to highlighting the role of the parameter γ, Eq. 27 shows the main difference of the PC + AIC module from the usual backward-looking Phillips curves, even if they include a target inflation rate: as well as inflation, our approach also implicitly invokes the entire history of output evolution. This observation is useful for comparisons with the literature. The inflation dynamics themselves, however, are better studied by keeping track of the explicit adjustments of the inflation climate in Eq. 25.
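As an illustrative check of Eq. 27, the following Python sketch computes the geometric weights (1 − γ)α_c(1 − α_c)^k on lagged inflation and sums them. The function name and the parameter values are assumptions made for illustration only, and the truncation at 400 lags stands in for the infinite sum.

```python
# Sketch: lag weights on past inflation implied by Eq. 27.
# alpha_c and gamma below are illustrative values, not estimates.

def lag_weights(alpha_c, gamma, n_lags):
    """Coefficients on pi_{t-k-1}, k = 0..n_lags-1, in Eq. 27."""
    return [(1.0 - gamma) * alpha_c * (1.0 - alpha_c) ** k
            for k in range(n_lags)]

alpha_c, gamma = 0.21, 0.70
weights = lag_weights(alpha_c, gamma, 400)

# Sum of the coefficients on lagged inflation (inflation persistence).
persistence = sum(weights)

# Weight on the target rate plus persistence should be close to one.
print(persistence, gamma + persistence)
```

With these inputs the truncated sum is numerically indistinguishable from 1 − γ, so the weight γ on the target rate and the persistence measure add up to one, in line with the summary in (28).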
Alternative to the New-Keynesian Phillips Curve
3.4 Econometric Exploration with the Survey of Professional Forecasters (SPF)

It has been pointed out above that while the concept of the adaptive inflation climate has no direct empirical counterpart, survey expectations should be relatively close to it. This suggests a first econometric test of the AIC module, which substitutes such a measure for π^c in Eq. 25 and tries to estimate the coefficients in a regression equation. If they come out with the correct sign and in a reasonable order of magnitude at some decent level of significance, and also if the fit to the data is not too bad, we can take this result as support that it is indeed worthwhile to introduce the AIC module into the Phillips curve and to inquire into the inflation dynamics that are thus generated, although on the basis of additional criteria the numerical parameter values from the regression might still be modified.

Three survey measures for the US are commonly used in the Phillips curve literature (Thomas 1999): the Michigan Survey of Households, the Livingston Survey, and the Survey of Professional Forecasters (SPF). For the present purpose, it seems appropriate that the participants are businessmen or professional forecasters, which is the case for the latter two surveys. Since the Livingston Survey is only semiannual, we concentrate on the SPF forecasts. Regarding inflation, quarterly predictions are here provided for the GDP implicit price deflator and the Consumer Price Index (CPI). The first measure is available from 1968 on, but has the disadvantage that the underlying deflator has been periodically redefined. For this reason we consider the CPI inflation forecasts which, however, were not initiated until the third quarter of 1981. As for the forecast horizon, we choose the mean forecasts of the (annualized) quarterly rate of change four quarters ahead.
To reduce the influence of the last high inflation rates at the beginning of the 1980s, we let the sample period begin with 1981:4 and end with 2000:4. The other variables entering the regression are the (annualized) quarterly CPI inflation rates, the output gap, and target inflation. The output gap y_t is conveniently represented by the percentage deviations of real GDP from its Hodrick–Prescott trend line.17 Because inflation rates have also declined to some extent in the first half of the 1990s, it may be argued that target inflation was subject to (mild) variations, so that it may better be proxied by trend inflation. We try both prespecified fixed values π* = const and a Hodrick–Prescott trend π*_t of CPI inflation.18 Denoting the SPF survey expectations by SPF_t, the regression equation thus reads

\[ SPF_{t+1} = (1-\alpha_c)\,SPF_t + \alpha_c\left[\gamma\pi^*_t + (1-\gamma)\left(\pi_t + \alpha_y y_t + \alpha_g \Delta y_t\right)\right] + \varepsilon_t \tag{29} \]

17 Using the standard smoothing parameter for quarterly data, λ = 1600. The trend was computed over a longer time span than the sample period, so there are no problems with end-of-period distortions.
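To make the structure of Eq. 29 concrete, the sketch below generates an artificial series from the equation (with α_y = 0) and recovers the coefficients by least squares. Everything here is an assumption for illustration: the data are synthetic rather than the SPF series, the function names are hypothetical, and the rearrangement used for estimation is only one possible way to impose the restriction that the coefficients on SPF_t, π*_t, and π_t sum to one.

```python
import numpy as np

def simulate_spf(alpha_c, gamma, alpha_g, pi_star, pi, dy, spf0):
    """Generate SPF_{t+1} from Eq. 29 with alpha_y = 0 and no error term."""
    spf = np.empty(len(pi) + 1)
    spf[0] = spf0
    for t in range(len(pi)):
        anchor = gamma * pi_star[t] + (1 - gamma) * (pi[t] + alpha_g * dy[t])
        spf[t + 1] = (1 - alpha_c) * spf[t] + alpha_c * anchor
    return spf

def estimate_aic(spf, pi_star, pi, dy):
    """OLS on the rearranged equation
    SPF_{t+1} - SPF_t = a (pi*_t - SPF_t) + b (pi_t - SPF_t) + c dy_t,
    where a = alpha_c*gamma, b = alpha_c*(1-gamma), c = b*alpha_g."""
    lhs = spf[1:] - spf[:-1]
    X = np.column_stack([pi_star - spf[:-1], pi - spf[:-1], dy])
    (a, b, c), *_ = np.linalg.lstsq(X, lhs, rcond=None)
    alpha_c = a + b
    return alpha_c, a / alpha_c, c / b   # alpha_c, gamma, alpha_g

# Synthetic inputs (hypothetical, chosen only to exercise the code).
rng = np.random.default_rng(0)
n = 80
pi_star = np.linspace(4.0, 2.5, n)          # slowly declining trend inflation
pi = pi_star + rng.normal(0.0, 1.0, n)      # noisy actual inflation
dy = rng.normal(0.0, 0.5, n)                # change in the output gap

spf = simulate_spf(0.21, 0.70, 2.92, pi_star, pi, dy, spf0=4.0)
est = estimate_aic(spf, pi_star, pi, dy)
print(est)
```

Because the artificial series contains no error term, the three coefficients are recovered essentially exactly; with noisy data the same regression would return estimates with sampling error.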
With respect to a fixed value of target inflation, the three rates π* = 2.50%, 3.00%, and 3.50% are considered. The fit is then only slightly worse than in the case in which trend inflation is employed for π*. In all four cases, the coefficient α_y on the level of the output gap turns out to be negative, so we re-estimate Eq. 29 by putting α_y = 0. The results of these regressions are collected in Table 2. All coefficients reported in the table are significant at the conventional levels. In all four estimations there are no serious signs of serial correlation in the residuals either. Corresponding to the R² values are standard errors of regression of around 0.30%. To get a visible impression of the goodness of fit, Fig. 1 juxtaposes the actual SPF_t series (solid line) and the series resulting from the first estimation in Table 2 (bold line). A main factor explaining the good fit is the relatively high degree of smoothness in the forecast series: it looks even more pronounced if the forecasts are contrasted with the actual rates of inflation. Plotting the survey forecast SPF_t together with the current rates of inflation π_t also shows that the forecasts do not deviate very much from what one would draw free-hand as a trend line of π_t (perhaps with some upward bias). This feature suggests that already the component π*_t in Eq. 29 explains a good deal of the change in SPF_t, so that the coefficient γ might confuse two effects: expectations of the forecasters that inflation will return to trend inflation, and, because of the low variation in the forecasts themselves, the simple fact that SPF_t remains quite close to its own trend, especially in the 1990s. In this respect the high value γ = 0.70 in the first row of Table 2 could be a deceptive measure of central bank credibility. Actually, the suspicion
Table 2. Estimations of regression, Eq. 29

No.   π*_t       α_c    γ      α_y   α_g    R²
1     HP-trend   0.21   0.70   –     2.92   0.908
2     2.50       0.13   0.37   –     2.71   0.898
3     3.00       0.15   0.47   –     2.67   0.900
4     3.50       0.17   0.54   –     2.43   0.903
18 To correct for the possibly spurious effects at the end of the sample period, after inspection of the time series graph we have frozen π*_t at a value of 2.60% from t = 1999:3 until t = 2001:2.
Fig. 1. Actual and estimated survey forecasts SPFt (solid and bold lines, respectively)
seems to be confirmed by the estimations with the constant values of π*_t, which lead to a sizeable reduction of γ. Besides the good fit of Eq. 29, there are other properties of the regression that appear less delightful. If the sample period is shortened and the subperiods are varied, the estimates of α_g easily become insignificant, while α_c and γ change in nonnegligible ways. This lack of robustness is unsatisfactory from an econometric point of view. In the light of the smoothness of the dependent variable, it may indicate that there are too many explanatory variables on the right-hand side of Eq. 29 (although they are not collinear). This means there would be a whole continuum of combinations of the parameters α_c, γ, α_g, and possibly also α_y that yield a similar goodness of fit. In this intuitive sense the approach may be said to be over-identified. On the other hand, this observation is not necessarily an argument against Eq. 29 as a theoretical construct. First, there do exist meaningful values of the coefficients that give rise to a satisfactory fit. Second, if the module is combined with other model building blocks there may be additional criteria beyond the regression Eq. 29 that offer some guidance to selecting among the many parameter combinations. Here, we take the results in Table 2 as evidence that our updating procedure of the inflation climate has passed the basic econometric test. The AIC module Eq. 25 makes economic sense, not only as a theoretical device, but also when it comes to concrete numerical values.
4 Conclusion

The baseline case of the new-Keynesian Phillips curve, as well as its hybrid variants with forward-looking expectations, has come under severe econometric criticism, the crucial argument being that this approach cannot explain the role played by lagged dependent variables in inflation regressions. It is thus time to consider alternative, backward-looking versions which, however, should not fall back on simple adaptive expectations as in the traditional interpretation. Abjuring rational expectations and taking heterogeneous expectations of firms seriously, this chapter has proposed a reinterpretation of the expectational variable in the Phillips curve as a general inflation climate. Rather than refer to inflation in the next period, this concept seeks to summarize in a single number the expectations about suitably discounted inflation over the entire future. Updating of the climate in response to new information proceeds in a gradual manner. The adjustments are not only oriented toward current inflation, but also take the level and change of the output gap into account as well as the central bank’s target rate of inflation. These are in our view the basic ingredients of any reasonable expectation formation process about inflation. Our aim was to model such a process in a sophisticatedly simple way. The four-parameter specification at which we arrived was then called the adaptive inflation climate (AIC).

Obtaining specific numerical coefficients for the adjustment process is a separate matter. Here, we have limited ourselves to a preliminary test on whether beginning such an endeavor would be at all worthwhile. For this purpose the inflation climate was proxied by a survey measure of four-quarter inflation expectations. Although this does not exactly correspond to our idea of the climate, it is good enough for a first exploratory estimation.
While the coefficients, especially the output gap coefficients, should not be taken too literally, the estimation results are sufficiently encouraging to approach a numerical analysis of AIC in greater depth, which is left to subsequent research. The natural field of application for our inflation module consists of small models for studying monetary policy. While the use of the new-Keynesian Phillips curve in these models has an understandable theoretical appeal, its implications may not be innocuous. The fact that here current inflation summarizes the entire sequence of expected future output gaps for the economy is a strong prediction that may well have a bearing on the kind of optimal policy. To quote Rudd and Whelan (2001, p. 20), “given that this prediction is soundly rejected by the data, the use of these models for policy analysis strikes us as questionable at best.” Even if one does not fully share this harsh assessment, substituting our alternative model with the adaptive inflation climate for the new-Keynesian Phillips curve should be worth investigating.
References

Adam K, Padula M (2002) Inflation dynamics and subjective expectations in the United States. European Central Bank, Working Paper Series 222
Amano R, Coletti D, Macklem T (1999) Monetary rules when economic behaviour changes. Bank of Canada, Working Paper 1999–8
Andolfatto D, Hendry S, Moran K (2002) Inflation expectations and learning about monetary policy. Bank of Canada, Working Paper 2002–30
Ball L (2000) Near-rationality and inflation in two monetary regimes. NBER Working Paper 7988
Blinder AS (1998) Asking about prices: a new approach to understanding price stickiness. Russell Sage Foundation, New York
Branch WA (2004) The theory of rationally heterogeneous expectations: evidence from survey data on inflation expectations. Econ J 114:592–621
Christiano LJ, Eichenbaum M, Evans C (1999) Nominal rigidities and the dynamic effects of a shock to monetary policy. NBER Working Paper 8403
Croushore D (1998) Inflation forecasts: how good are they? Federal Reserve Bank of Philadelphia, Working Paper 98-14
Freedman C (1996) What operating procedures should be adopted to maintain price stability? Practical issues. In: The Federal Reserve Bank of Kansas City, Achieving price stability: a symposium, pp 241–285 (www.kc.frb.org/publica/sympos/1996/pdf/s96freed.pdf)
Fuhrer J (1997) The (un)importance of forward-looking behavior in price specifications. J Money Credit Banking 29:338–350
Galí J, Gertler M (1999) Inflation dynamics: a structural econometric analysis. J Monetary Econ 44:195–222
Gramlich EM (1983) Models of inflation expectations formation. J Money Credit Banking 15:155–173
Heiner RA (1988) The necessity of delaying economic adjustment. J Econ Behav Organ 10:255–286
Huh CG, Lansing KJ (2000) Expectations, credibility, and disinflation in a small macroeconomic model. J Econ Bus 51:51–86
Mankiw G (2001) The inexorable and mysterious tradeoff between inflation and unemployment. Econ J 111:C45–C61
Mankiw NG, Reis R (2002) Sticky information versus sticky prices: a proposal to replace the new-Keynesian–Phillips curve. Q J Econ 117:1295–1328
Mehra YP (2002) Survey measures of expected inflation: revisiting the issues of predictive content and rationality. Federal Reserve Bank of Richmond, Econ Q 88:3, 16–36
Mullineaux DJ (1980) Inflation expectations and money growth in the United States. Am Econ Rev 70:149–161
Orphanides A, Williams JG (2002) Imperfect knowledge, inflation expectations, and monetary policy. In: Bernanke B, Woodford M (eds) Inflation targeting. Chicago University Press, Chicago, in press
Orphanides A, Williams JG (2003) The decline of activist stabilization policy: natural rate misperceptions, learning, and expectations. Federal Reserve Bank of San Francisco, FRBSF Working Paper 2003–24
Roberts JM (1997) Is inflation sticky? J Monetary Econ 39:173–196
Roberts JM (1998) Inflation expectations and the transmission of monetary policy. Board of Governors of the Federal Reserve System, mimeo
Rudd J, Whelan K (2001) New tests of the new-Keynesian–Phillips curve. Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series 2001–30; published in J Monetary Econ 52(2005):1167–1181
Tetlow R, von zur Muehlen P (2003) Avoiding Nash inflation: Bayesian and robust responses to model uncertainty. Federal Reserve Board, Washington DC, mimeo
Thomas LB Jr (1999) Survey measures of expected US inflation. J Econ Perspect 13:4, 125–144
Woodford M (2003) Interest and prices: foundations of a theory of monetary policy. Princeton University Press, Princeton, Oxford
Zellner A (1992) Statistics, science and public policy. J Am Stat Assoc 87:1–6
Zellner A (2002) My experience with nonlinear dynamics models in economics. Stud Nonlinear Dynamics Econometrics 6:2, 1–18
2. Harrodian Dynamics Under Imperfect Competition: A Growth-Cycle Model

Hiroyuki Yoshida
Summary. This chapter considers Harrod’s knife-edge instability, which implies severe and extreme business cycles restricted by a full employment ceiling and a zero-gross-investment floor. We construct a dynamic model with imperfect competition in the output market by using the subjective-demand curve approach. In addition, we consider technical choice by taking account of the putty–clay technology. The chapter shows the occurrence of endogenous and moderate fluctuations through the Hopf bifurcation.

Key words. Harrod’s knife-edge, Growth cycle, Imperfect competition, Putty–clay technology, Hopf bifurcation
1 Introduction

Harrod (1939) set up the foundation of “dynamic theory” and analyzed the relation between the actual and the warranted rates of growth. His conclusion was remarkable: the actual growth path is “highly unstable,” and thus any departure from the warranted growth path stimulates a further divergence. Later, Harrod’s theory was developed by Hicks (1950), who presented a constrained business cycle model with a full employment ceiling and a zero-gross-investment floor. Although Hicks’ theory is truly essential and complete from the theoretical point of view, the fluctuation observed in his model is so severe that we rarely experience it.

The purpose of this chapter is to investigate the possibility of perpetual growth cycles in the Harrodian framework.1 Here we emphasize the microeconomic behavior of firms. In particular, we introduce imperfect competition in the output market. Furthermore, we consider technical choice by taking account of the putty–clay technology: firms choose the optimal level of technology for new capital equipment by solving a unit-cost minimization problem. These elements will reveal the mechanism of the cyclical growth process.

The rest of the chapter is organized as follows. Section 2 explains the distinction between the short-run and long-run production functions. Section 3 sets up the formal model. In Section 4, we investigate the dynamic properties of the Harrodian dynamic model. The final section summarizes our results.

College of Economics, Nihon University, Misaki 1-3-2, Chiyoda-ku, Tokyo 101-8360, Japan
1 There is a body of literature that discusses Harrod’s growth theory by using the mathematical methods of nonlinear dynamics. See, for example, Nikaido (1990), Flaschel (1994), Yoshida (1999), and Sportelli (2000).
2 Long-Run and Short-Run Production Functions

We start this section by explaining the long-run and short-run production functions. The distinction between the two functions is due to the presumption of putty–clay technology, which implies ex ante substitution possibilities between capital and labor, but no such possibilities ex post. While the long-run production function represents the best suited equipment and manufacturing technique, the short-run production function prescribes current production decisions with the existing capital stock of the firm. Let k^[v](t) stand for the amount of equipment of vintage v surviving at time t, and let n_e^[v](t) be the labor required to operate that equipment fully. Since we assume that the capital stock depreciates at the same constant rate δ, k^[v](t) = e^{−δ(t−v)} k^[v](v). Thus we obtain the total stock of capital at time t:

\[ K(t) = \int_{-\infty}^{t} k^{[v]}(t)\,dv = \int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \tag{1} \]

where we define k^[t](t) = I(t), which represents the rate of gross investment at time t. By differentiating Eq. 1 with respect to time, we obtain

\[ \dot{K}(t) = I(t) - \delta K(t) \tag{2} \]

We now introduce x_e^[v](t) as the labor intensity required to operate equipment of vintage v at full capacity, or the normal labor intensity for newly installed equipment at time v. Note that for v < t, x_e^[v](t) is fixed, and for t = v it is variable because of the putty–clay technology. Moreover, let us denote by n_e^[v](t) the amount of labor required to operate equipment of vintage v at full capacity. Then we have n_e^[v](t) = x_e^[v](t) k^[v](t) = x_e^[v](t) e^{−δ(t−v)} I(v). In total we write

\[ N_n(t) = \int_{-\infty}^{t} n_e^{[v]}(t)\,dv = \int_{-\infty}^{t} x_e^{[v]}(t)\, e^{-\delta(t-v)} I(v)\,dv \]

where N_n(t) is the necessary amount of employment when all the existing capital stock is fully utilized, or the normal level of employment to produce capacity output.
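The step from Eq. 1 to Eq. 2 is an application of the Leibniz rule for differentiating an integral whose upper limit and integrand both depend on t. A short worked version, using only the definitions given above and assuming the integral converges, reads:

```latex
\begin{aligned}
\dot{K}(t)
  &= \frac{d}{dt}\int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \\
  &= e^{-\delta(t-t)} I(t)
     + \int_{-\infty}^{t} \frac{\partial}{\partial t}\, e^{-\delta(t-v)} I(v)\,dv \\
  &= I(t) - \delta \int_{-\infty}^{t} e^{-\delta(t-v)} I(v)\,dv \\
  &= I(t) - \delta K(t).
\end{aligned}
```

The first term is the newly installed vintage, and the second collects the depreciation of all surviving vintages.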
The long-run production function is assumed to be represented by

\[ Y_n(t) = F(N_n(t), K(t)) \tag{3} \]

where Y_n(t) is the level of capacity output at time t. What is more, we assume that the long-run production function is homogeneous of degree one, and then we can write

\[ Y_n(t)/K(t) = f(x_n(t)), \qquad x_n(t) = N_n(t)/K(t) \tag{4} \]
which satisfies f(0) = 0, f(∞) = ∞, f′(·) > 0, and f″(·) < 0. In reality, the firm controls the degree of capital utilization by adjusting the level of employment according to the state of the economy; consequently, the capital stock is not always fully utilized. To incorporate the above relationship between the rate of capital utilization u and the actual level of employment N, we introduce the following function:2

\[ u = u(N/N_n), \qquad u(0) = 0, \qquad u(1) = 1, \qquad \lim_{N \to +\infty} u(N/N_n) = \bar{u} > 1 \tag{5} \]
where ū represents a physical maximum rate of capacity utilization. The above function is referred to as a “utilization rate function.” Then, using Eq. 5 we can now express the short-run production function as

\[ \frac{Y}{K} = \frac{Y}{Y_n}\,\frac{Y_n}{K} = u(x/x_n)\, f(x_n), \qquad x = N/K \tag{6} \]
where Y is the actual level of output. The difference between the two functions can be explained by a geometric example. We depict the long-run and short-run production functions in Fig. 1. The figure shows two short-run production functions corresponding to two different techniques, x_n^1 and x_n^2, together with the long-run production function (LPF). For example, consider the curve SPF1, which represents the short-run production function corresponding to a particular technique x_n^1. Obviously, we can observe that SPF1 lies below LPF for all x, and SPF1 equals LPF at the point x_n^1. Accordingly, SPF1 and LPF must be tangential to each other at x_n^1. What is more, it is clear that the same properties are realized for all x_n. We can therefore conclude that the long-run production function is the upper envelope of the family of the short-run production functions. Such a relationship between the two production functions is essentially identical to that between the long-run and short-run cost functions that is explained in microeconomic theory. The envelope property of the long-run production function brings us to one further point that we must not ignore. The following condition holds:

2 Here and henceforth we omit the time index so long as no confusion arises.
[Figure: curves LPF, SPF1, and SPF2 plotted against x, with the techniques x_n^1 and x_n^2 marked]

Fig. 1. Long-run and short-run production functions
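The envelope relationship depicted in Fig. 1 can be checked numerically. The sketch below is not taken from the chapter: it assumes a power-form long-run function f(x) = A x^θ and a hypothetical utilization rate function u(z) = ū(1 − (1 − 1/ū)^z), which satisfies u(0) = 0, u(1) = 1, and u → ū as in Eq. 5. The exponent θ is set equal to u′(1) so that the two curves are tangent at x = x_n.

```python
import math

A = 1.0        # hypothetical scale parameter of the long-run function
U_BAR = 2.0    # hypothetical physical maximum rate of utilization (> 1)

# u'(1) for the utilization function below; used as the exponent theta.
THETA = -(U_BAR - 1.0) * math.log(1.0 - 1.0 / U_BAR)

def u(z):
    """Hypothetical utilization rate function: u(0)=0, u(1)=1, u -> U_BAR."""
    return U_BAR * (1.0 - (1.0 - 1.0 / U_BAR) ** z)

def long_run(x):
    """Assumed long-run production function per unit of capital."""
    return A * x ** THETA

def short_run(x, x_n):
    """Short-run production function of Eq. 6 for a given technique x_n."""
    return u(x / x_n) * long_run(x_n)

grid = [0.05 * i for i in range(1, 101)]     # x from 0.05 to 5.0
techniques = [0.5, 1.0, 2.0]

# Smallest vertical distance between LPF and any SPF on the grid.
min_gap = min(long_run(x) - short_run(x, x_n)
              for x_n in techniques for x in grid)

# Distance at the point of tangency x = x_n.
touch = max(abs(long_run(x_n) - short_run(x_n, x_n)) for x_n in techniques)

print(THETA, min_gap, touch)
```

Under these assumed functional forms, every short-run curve stays on or below the long-run curve and touches it at its own x_n, which is the upper-envelope property described in the text.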
\[ f'(x_n) = u'(1)\,\frac{f(x_n)}{x_n} \qquad \text{for } \forall x_n \tag{7} \]
This condition means that the long-run and short-run production functions are tangential to each other at the normal employment–capital ratio x_n. By replacing u′(1) with θ and integrating Eq. 7 with respect to x_n, we obtain

\[ f(x_n) = A\,(x_n)^{\theta}, \qquad A > 0, \]

[...] (30)
where the additional variable μ is the expected growth rate of aggregate demand. We posit the following expectation formulation with respect to μ:

\[ \mu = (\dot{Y}/Y)^{*} \tag{31} \]

where (Ẏ/Y)* is the steady-state growth rate of output. Before closing this section, we must draw attention to the following two points. First, the community’s propensity to save s is assumed to be constant. Thus, saving S per unit of capital can be expressed as

\[ \frac{S}{K} = s\,u\,f(x_n), \qquad 0 < s \]

[...]

\[ b_1 > 0, \qquad b_3 > 0, \qquad b_1 b_2 - b_3 > 0 \tag{41} \]
For high values of the adjustment speed of B (ω > 1), the instability of the steady state can be verified immediately on the grounds that b3 < 0. For the remainder of this section, we assume that 0 < ω < 1. Now we can prove the following two important propositions. Proposition 1 deals with the case where Φ(u) > 0, while Proposition 2 investigates the system for the intermediate negative values of Φ(u).

Proposition 1: If the adjustment speed of B is sufficiently small (0 < ω < 1) and the elasticity of labor intensity for new capital equipment with respect to capital utilization Φ(u) is positive, then the steady state is locally stable.

Proof of Proposition 1: It is easily checked that b1 > 0 and b3 > 0. Moreover, we must verify the sign of b1b2 − b3. Using Eqs. 38–40, we can obtain

C := b1b2 − b3 1 {[1 − θ (1 − ε )]Φ (u) + εω }[ −uk (1 − ω ) k ] β = g ( ε + G ′′ / G ′ ) − b1ω sgkuk (1 + Φ (u))θ f        (42)
3 The asterisk is suppressed, since all evaluations in this section refer to the steady-state point.
Clearly the sign of C is positive. This completes the proof. ∎

Proposition 1 assures us of the stability of the warranted growth path. The local stability can be ensured in the case of 0 < ω < 1 and Φ(u) > 0. What happens in the case where φ_u < 0? We can obtain the following:

Proposition 2: Suppose that 0 < ω < 1 and that max{−[1 − θ(1 − ε)], −θ(ε + G″/G′)} < [1 − θ(1 − ε)]Φ(u) < −ωε. Then there exists a positive critical value β_H. The steady state is locally stable if β < β_H, while it is unstable if β > β_H. Furthermore, the system undergoes a Hopf bifurcation at β = β_H. In the case of a supercritical bifurcation, the economy experiences perpetual and self-sustained fluctuations, whereas in the case of a subcritical bifurcation the economy exhibits corridor stability, as proposed by Leijonhufvud (1973).

Proof of Proposition 2: Again, we can be fairly certain that b1 > 0 and b3 > 0 for all β > 0. Since 0 < ω < 1 and [1 − θ(1 − ε)]Φ(u) < −ωε, we can see that the function C(β) is monotonically decreasing with respect to β. Furthermore, the function C(β) has a positive vertical intercept since b1 > 0 and −1 < Φ(u). We can therefore recognize that there exists a critical value β_C such that C(β) = 0; C(β) > 0 for β < β_C and C(β) < 0 for β > β_C. It follows from what has been said that the Routh–Hurwitz conditions are satisfied for β < β_C. Let us now turn to the issue of Hopf bifurcation. For our purpose it is sufficient to examine the following conditions:

(i) b1 > 0, b3 > 0, C(β_H) = 0
(ii) dC(β)/dβ |_{β = β_H} ≠ 0

where β_H is a bifurcation value of β. These conditions enable us to apply the Hopf bifurcation theorem.4 From the above discussion, it is clear that the Hopf bifurcation criterion is satisfied at β = β_C. This proves the proposition. ∎
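The role of the quantity C = b1b2 − b3 in the two proofs can be illustrated numerically. The sketch below assumes that b1, b2, b3 are the coefficients of a cubic characteristic equation λ³ + b1λ² + b2λ + b3 = 0, which is the usual setting for conditions of the form of Eq. 41; the numerical coefficient values and the function names are hypothetical and are not derived from the model.

```python
import numpy as np

def routh_hurwitz_stable(b1, b2, b3):
    """Routh-Hurwitz conditions for lambda^3 + b1*lambda^2 + b2*lambda + b3."""
    return b1 > 0 and b3 > 0 and b1 * b2 - b3 > 0

def max_real_part(b1, b2, b3):
    """Largest real part among the three characteristic roots."""
    roots = np.roots([1.0, b1, b2, b3])
    return float(max(roots.real))

# Hypothetical coefficients: b1 and b3 fixed, b2 varied so that
# C = b1*b2 - b3 moves from positive through zero to negative.
b1, b3 = 2.0, 6.0
cases = {"stable": 3.5, "hopf": 3.0, "unstable": 2.5}

results = {name: (routh_hurwitz_stable(b1, b2, b3), max_real_part(b1, b2, b3))
           for name, b2 in cases.items()}

for name, (ok, re_max) in results.items():
    print(name, ok, re_max)
```

When C is positive all roots have negative real parts; when C is exactly zero a pair of purely imaginary roots appears alongside one negative real root, which is the configuration at a Hopf bifurcation point; when C is negative the complex pair has moved into the right half-plane.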
5 Conclusion

This chapter has developed and analyzed a Harrodian growth model with imperfect competition. In the steady-state position, the economy grows along the warranted growth path where all the existing capital stock is fully utilized. We showed in Proposition 1 that the local stability of the growth path can be guaranteed by the slow reaction of expected demand (0 < ω < 1) and a positive elasticity of labor intensity for new capital equipment with respect to utilization rate (Φ(u) > 0).

4 On this subject, see Asada and Semmler (1995, pp. 634–635).
An outline of the stabilizing mechanism can be explained as follows. Suppose that the economy is in a prosperous state with high levels of capacity utilization. At that stage the firm chooses a labor-using direction of technical change (φ_u > 0), which induces an increase in x_n. This phenomenon causes the firm to reduce the rate of capital utilization (u_x < 0), and hence cut down the rate of capital accumulation through the Harrod-type investment function. As can be seen in the above process, the warranted growth path is surrounded by centripetal forces. The stabilizing process mentioned above would be described by the following feedback chain:

u ⇑ → [φ_u > 0] → x_e ⇑ → x_n ⇑ → [u_x < 0] → u ⇓ → g ⇓

Judging from the above discussion, we conjecture that the warranted growth path is unstable if the firm chooses a labor-saving direction of technical change (φ_u < 0) in prosperity. In fact, this observation is true, as mentioned in Proposition 2. In addition, the application of the Hopf bifurcation theorem established the existence of periodic fluctuations around the warranted growth path if the destabilizing force of technical change is weak, that is, the value of Φ(u) lies in a certain range of negative values. This final point is the most important part of our argument.
References

Adachi H (1991) A growth model with imperfect competition (Hukanzen kyosou no seichoumoderu) (in Japanese). J Polit Econ Commer Sc (Kobe University) 164:53–77
Alexander SS (1950) Mr. Harrod’s dynamic model. Econ J 60:724–739
Asada T, Semmler W (1995) Growth and finance: an intertemporal model. J Macroecon 17:623–649
Flaschel P (1994) A Harrodian knife-edge theorem for the wage–price sector. Metroeconomica 45:266–278
Flaschel P, Franke R, Semmler W (1997) Dynamic macroeconomics: instability, fluctuations, and growth in monetary economics. MIT Press, Cambridge
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Harrod RF (1973) Economic dynamics. Macmillan, London
Hicks JR (1950) A contribution to the theory of the trade cycle. Clarendon Press, Oxford
Leijonhufvud A (1973) Effective demand failures. Swedish Econ J 75:27–48
Nikaido H (1975) Factor substitution and Harrod’s knife-edge. Z Nationalökonomie 35:149–154
Nikaido H (1980) Harrodian pathology of neoclassical growth. Z Nationalökonomie 40:111–134
Nikaido H (1990) A model of cyclical growth. J Tokyo Int Univ Dept Econ 3:1–9
Okishio N (1964) Instability of Harrod–Domar’s steady growth. Kobe Univ Econ Rev 10:19–27
Silverberg G, Dosi G, Orsenigo L (1988) Innovation, diversity and diffusion: a self-organization model. Econ J 98:1032–1054
Sportelli MC (2000) Dynamic complexity in a Keynesian growth-cycle model involving Harrod’s instability. J Econ 71:167–198
Yoshida H (1999) Harrod’s “knife-edge” reconsidered: an application of the Hopf bifurcation theorem and numerical simulations. J Macroecon 21:537–562
3. Endogenous Technical Change: The Evolution from Process Innovation to Product Innovation

Gang Gong
Summary. A nonlinear growth model is presented that introduces technical innovations within a Harrodian growth framework. We first introduce product innovation into a standard Harrodian model. This resolves the knife-edge problem, whereas irregular growth cycles could occur with excess capacity. We then introduce process innovation. Bifurcation analysis shows that the model permits different structurally stable dynamics when the parameters go through some critical levels. Such a structural change in dynamics reflects an essential feature of endogenous technical change in the long run: the evolution from process innovation to product innovation.

Key words. Process innovation, Product innovation, Knife-edge and bifurcation
1 Introduction The last 15 years have seen the development of new growth theory inspired by the seminar papers of Romer (1986, 1990) and Lucas (1988). The central argument of new growth theory which makes it different from the old growth theory1 is that technical innovation is endogenous. Therefore, their models are often considered to be the endogenous growth model.2 A special contribution to new growth theory was made by Romer (1990), in which technical innovation is explicitly formulated as product innovation. Product innovation refers to the type of innovation that introduces new (including upgraded) products. This is different from the other type of innovation, process innovation, by which new and improved production methods School of Economics and Management, Tsinghua University, Beijing 100084, PR China. 1 See Harrod (1939), Solow (1956), and Kaldor (1957). 2 For the most recent extensive review of endogenous growth literature, see Greiner et al. (2005).
41
42
G. Gong
(often through specialization and mechanization) are introduced. The economic impacts of these two innovations are also different. Process innovation increases the factor productivity, whereas the major effect of product innovation is the emergence of new markets and sector re-organization. Although the idea of such distinction can be found as early as in Schumpeter (1934), 3 it was only with Romer (1990) that product innovation was formally introduced into a model. However, Romer’s 1990 formulation with regard to product innovation cannot be considered as satisfying: it simply gives a new interpretation to the Solow residual, i.e., an expansion of product horizon instead of total factor productivity as expressed in other models. Therefore, we essentially cannot see the economic impact brought about by product innovation compared to process innovation. Also, the new growth theory cannot express the stylized factor of productivity growth slowdown, as discussed intensively in the mid-1990s. One of the difficulties arising from both old and new growth theories is that they do not distinguish between innovation and invention, the two apparently different concepts that had been emphasized by Schumpeter almost 70 years earlier. Inventions are new knowledge, new patents, and new findings. They provide the technological resource for innovations, which are supposed to be carried out by an entrepreneur. However, “as long as they are not carried into practice, inventions are economically irrelevant.” (Schumpeter 1934, p. 88) Yet in the current literature, all inventions are assumed to be automatically carried into innovations. This chapter tries to contribute to the research in technical innovation by distinguishing innovation from invention. We shall still follow the idea of a “cumulative cause of knowledge,” as proposed by new growth theorists, so that the technological resources for innovation are not scare. 
This will allow us to focus on the incentive for an entrepreneur to carry out innovation. We assume that the incentives differ depending on the current stage of the business cycle, in order to introduce the two different types of innovation (product vs. process). In particular, when the economy is in excess demand, the entrepreneur has an incentive to carry out process innovation to speed up production, but not product innovation, because the existing product is selling well. The reverse will hold when the economy is in recession (or excess supply).
³ According to Schumpeter (1934, p. 66), technical innovation can be organized into five categories: the new product, the new method of production, the opening up of a new market, the utilization of new raw materials, and the sector reorganization of the economy. Modern economists have grouped these into two types: process innovation and product innovation (see, e.g., Scherer 1984; Pavitt 1984; Nelson and Winter 1982).
Endogenous Technical Change
This consideration about innovation incentives means that the model we construct here cannot be an equilibrium model. Therefore, the model will be built on a Harrodian framework. One of the interesting findings of this chapter is that the knife-edge problem that is supposed to occur in a Harrodian type of model can be resolved when we introduce technical innovation. The rest of this chapter is organized as follows: Section 2 formulates a prototype model where technical innovations are not introduced, and hence the knife-edge problem can be revealed. Section 3 introduces product innovation into the prototype model. Section 4 deals with process innovation by a bifurcation analysis of the model. Finally, Sect. 5 concludes the argument. The mathematical proof of the propositions in the text is provided in the appendix.
2 A Prototype Model
The prototype model consists of the following structural equations:
$$I_t = g(u_{t-1})K_{t-1}, \qquad g' > 0 \tag{1}$$
$$u_t = Y_t / Y^p_t \tag{2}$$
$$Y^p_t = K_t / v, \qquad v > 0 \tag{3}$$
$$Y_t = I_t / s, \qquad s > 0 \tag{4}$$
$$K_t = (1 - d)K_{t-1} + I_{t-1}, \qquad 1 > d > 0 \tag{5}$$
Here, $I_t$ is investment, $K_t$ is capital stock, $u_t$ is capacity utilization, $Y_t$ is actual output, $Y^p_t$ is potential output, $s$ is the marginal and average propensity to save, $v$ is the capital–output ratio, and $d$ is the depreciation rate. The meanings of these equations are all straightforward. Equation 1 is an investment function, Eqs. 2 and 3 are the definitions of capacity utilization and potential output, respectively, Eq. 4 explains the determination of output by a multiplier, and Eq. 5 describes the dynamics of capital stock. For the investment function, we assume $g(\cdot)$ to be nonlinear, which may take the form shown in Fig. 1. This nonlinear form resembles the investment functions in Kaldor (1940) and Goodwin (1951). If we express $u_{t-1}$ in terms of Eq. 2, and further $Y^p_{t-1}$ in terms of Eq. 3, Eq. 1 can also be written as $I_t = G(K_{t-1}, Y_{t-1})$. This is Kaldor’s investment function as defined in Chang and Smyth (1971). Let $k_t$ denote the growth rate of capital stock, so that $K_t = (1 + k_t)K_{t-1}$. Substituting this into Eq. 3, we obtain $Y^p_t = (1 + k_t)K_{t-1}/v$. Meanwhile, expressing $Y_t$ in terms of Eq. 4, we obtain from Eq. 2
Fig. 1. Standard investment function (axis label: $I_t/K_{t-1}$)
$$u_t = \frac{I_t / s}{(1 + k_t)K_{t-1} / v} \tag{6}$$
From Eq. 5,
$$k_t = -d + I_{t-1}/K_{t-1} = -d + (s/v)u_{t-1} \tag{7}$$
The warranted growth rate in this model is equal to $-d + s/v$ when $u_{t-1}$ is set to 1.⁴ Now expressing $I_t/K_{t-1}$ in terms of Eq. 1 and $k_t$ in terms of Eq. 7, Eq. 6 can be rewritten as
$$u_t = F(u_{t-1}) \tag{8}$$
with
$$F(u) = \frac{g(u)}{(s/v)(1 - d) + (s/v)^2 u} \tag{9}$$
By the restriction of nonnegative investment, Eqs. 8 and 9 should be subject to $g(u_{t-1}) > 0$. Otherwise, $I_t$ will be 0 and so will $u_t$. The trajectory $u_t$ can thus be derived by iterating the map $F$. The feasible interval of this trajectory is $[0, +\infty)$. Once $u_t$ is determined, the trajectory $k_t$ is also determined through Eq. 7. Therefore, understanding $u_t$ is sufficient to understand the dynamics of $k_t$.
⁴ We remark that Harrod did not consider depreciation, and therefore his warranted growth rate is $s/v$.
The fixed points of $F$ can be found as follows. Let $u_t = u_{t-1} = \bar{u}$, where $\bar{u}$ is regarded as a steady state. Thus, from Eqs. 8 and 9, we obtain $y(\bar{u}) = g(\bar{u})$, where
$$y(u) = u(1 - d)s/v + u^2 (s/v)^2 \tag{10}$$
Figure 2 provides a graphical analysis of the possible solutions of $\bar{u}$. The shape of the curve $y(u)$ can be justified by observing that $u \ge 0$, $y(u) \ge 0$, $y'(u) > 0$, and $y''(u) > 0$. One finds that there are two possible equilibria (or steady states): $\bar{u}_1$ and $\bar{u}_2$, with $\bar{u}_1 > \bar{u}_2$. Proposition 1 concerns the stability of these two equilibria. Figure 3 provides a graphical analysis of the trajectory $u_t$.
Proposition 1: Let $F$ be defined as in Eq. 9. Then,
$$F'(\bar{u}_2) > 1, \qquad 1 > F'(\bar{u}_1) > -1$$
Clearly, Harrod’s unstable equilibrium corresponds to $\bar{u}_2$ in this model. This steady state is close to, and possibly equal to, 1, indicating demand–supply equilibrium. If $u_t$ starts in $(\bar{u}_2, +\infty)$, it will move asymptotically to $\bar{u}_1$. If it starts in $[0, \bar{u}_2)$, the trajectory will move toward its lower bound 0. However, if the initial $u_t$ happens to be at the knife-edge $\bar{u}_2$, the trajectory, as Harrod expected, will stay on $\bar{u}_2$.
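The knife-edge behavior described above can be illustrated numerically. The following is only a minimal sketch: the logistic form of g and all parameter values are assumptions chosen so that the unstable steady state sits at u = 1, and none of them comes from the chapter.

```python
import math

# Illustrative parameters (assumptions, not taken from the chapter).
S, V, D = 0.2, 2.0, 0.05   # saving propensity s, capital-output ratio v, depreciation d
LAM = S / V                # s/v


def g(u):
    """Hypothetical S-shaped investment function with g' > 0 (logistic form)."""
    return -0.045 + 0.3 / (1.0 + math.exp(-4.0 * (u - 1.0)))


def F(u):
    """Map of Eq. 9, with investment truncated at zero when g(u) <= 0."""
    return max(g(u), 0.0) / (LAM * (1.0 - D) + LAM ** 2 * u)


def orbit(u0, steps=200):
    """Iterate u_t = F(u_{t-1}) starting from u0 and return the whole path."""
    path = [u0]
    for _ in range(steps):
        path.append(F(path[-1]))
    return path


if __name__ == "__main__":
    # Start slightly below and slightly above the knife-edge steady state u = 1.
    for u0 in (0.9, 1.1):
        print(u0, "->", round(orbit(u0)[-1], 4))
```

Under these assumed values, an orbit started just above 1 settles at the upper steady state, while one started just below 1 collapses to the lower bound 0, which is the pattern Proposition 1 implies.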
3 Introducing Product Innovation We have presented our prototype model where investment follows the conventional principle and is considered to fill only the expected capacity shortage. Such a conventional principle indicates that investment must be positively related to capacity utilization. Apparently it is this principle that causes the knife-edge problem. However, investment not only creates the capacity of existing products, but also the capacity of new products. Next, we shall introduce a model of product innovation. The model is built on the
Fig. 2. The steady states of the prototype model
Fig. 3. The orbits of ut in the prototype model
following considerations. First, in each period, some technological resources for new products are always available to firms. This might be due to the idea of a “cumulative cause of knowledge,” as proposed by new-growth theorists. Second, although the technological resources for product innovation are always available, in a given period the resources are limited, despite the fact that they accumulate over time. Third, for each firm in the economy, there exists a threshold (in terms of capacity utilization) below which it will consider investing in a new product. When demand for an existing product is strong, or capacity utilization is still high, there is not much incentive for the entrepreneur to promote the new product. However, when the entrepreneur finds that his existing capacity is excessive, i.e., that utilization is below some threshold, he then considers introducing the new product in order to maintain his leadership in the economy. We now begin presenting our model. Assume that there exist $n$ firms in the economy, among which $m$ firms, $m \le n$, have a technological resource that can either upgrade an existing product or produce a completely new product. To simplify our analysis, we shall further assume that each of these $m$ firms has one, and only one, such technology, which requires a certain amount of investment. However, the firms may have different considerations about when they should invest in their new products. For instance, firm A may consider investing when its capacity utilization is less than 87%, while firm B will invest when its capacity utilization is less than 85%. Let $\theta_i$ be the amount of investment required for firm $i$ to carry out its product innovation. Then its investment in the new product, denoted as $I^n_i$, may be written as
$$I^n_i = \begin{cases} \theta_i, & \mu_i < \mu_i^* \\ 0, & \text{otherwise} \end{cases} \tag{11}$$
Here, $\mu_i$ is the capacity utilization for firm $i$, and $\mu_i^*$ is the threshold below which $i$ will invest in its new product. Given Eq. 11, the aggregate investment in new products, denoted as $I^n$, can be written as
$$I^n = \sum_{i=1}^{m} I^n_i \tag{12}$$
Equations 11 and 12 form a map from $R^n$ to $R$. This gives rise to the derivation of $I^n \in R$ from $\mu \in R^n$, where $\mu$ is the vector whose $i$-th coordinate is $\mu_i$. We shall define this map as $Q: R^n \to R$. However, our objective is to derive a function $H: R \to R$ that relates the aggregate capacity utilization $u \in R$ to the aggregate investment in new products $I^n \in R$. It is reasonable to assume that a measure of aggregate capacity utilization is a function (most likely a weighted average) of the $n$ individual firms’ capacity utilizations. We denote this function by $U: R^n \to R$. It has the property $\partial U/\partial \mu_i \ge 0$. We shall now make a somewhat strong assumption. We shall assume that
$$\mu_i = \mu_j \quad \text{for all } i \ne j, \qquad i, j = 1, 2, \ldots, n \tag{13}$$
In other words, the firms in the economy have the same level of capacity utilization. The economic meaning of this assumption can be roughly expressed as follows: the firms in the economy share the impact of a shock (either positive or negative) to approximately the same extent. Although this is a strong assumption, it will allow us to exclude some intractable situations. Suppose there are two firms A and B in the economy. Without our assumption, it is quite possible that an aggregate capacity utilization of 85% may correspond to a combination of either 80% and 90% or 87% and 83% capacity utilization for A and B, respectively. Since each combination, or a point in $R^2$, may indicate a certain amount of investment via $Q$ formed by Eqs. 11 and 12, our aggregate investment function could be multivalued at $u$ = 85%. The assumption therefore helps us to exclude this type of intractable situation. Finally, we remark that this assumption is not a necessary condition for the proposition below to hold. However, it makes our derivation of the proposition much simpler.
Proposition 2: Given $U$ and $Q$, there exists an interval $[u_2, u_1]$ such that $H$ has the following properties:
- $I^n = 0$ when $u > u_1$;
- $\partial I^n/\partial u \le 0$ and $I^n \le \sum_{i=1}^{m} \theta_i$ when $u_2 \le u \le u_1$;
- $I^n \le \sum_{i=1}^{m} \theta_i$ when $u < u_2$.
It should be noted that an investment function $H$ derived in this way is not smooth and continuous. However, if $n$ and $m$ are large, we could treat $H$ as smooth and continuous within the interval $[u_2, u_1]$.
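The step-function character of H under the equal-utilization assumption of Eq. 13 can be sketched as follows. The thresholds and investment amounts are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
# Hypothetical firm data (assumptions for illustration only).
thresholds = [0.87, 0.85, 0.80, 0.78]   # mu_i^*: utilization below which firm i invests
theta = [1.0, 2.0, 1.5, 0.5]            # theta_i: investment required by firm i


def new_product_investment(u):
    """Aggregate I^n of Eqs. 11-12 when every firm has the same utilization u (Eq. 13)."""
    return sum(t for t, mu_star in zip(theta, thresholds) if u < mu_star)


if __name__ == "__main__":
    # Walk utilization downward: investment switches on firm by firm.
    for u in (0.90, 0.86, 0.82, 0.79, 0.70):
        print(u, new_product_investment(u))
```

With these assumed numbers, H is zero above the largest threshold, non-increasing in u, and capped by the sum of the theta values, in line with the properties listed in Proposition 2.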
Given our discussion on $H$, we now introduce product innovation into our conventional investment function. The new investment function (Fig. 4) can be written as
$$I_t = [g(u_{t-1}) + h(u_{t-1})]K_{t-1}, \qquad h' \le 0 \tag{14}$$
Above, we let $h(u_{t-1}) \equiv H(u_{t-1})/K_{t-1}$. The nonlinear form of $h(\cdot)$ shown in Fig. 4 reflects the properties of the product innovation function $H$ that we have just discussed. Given $h(\cdot)$, $F$ can be rewritten as
$$F(u) = \frac{g(u) + h(u)}{(1 - d)(s/v) + u(s/v)^2} \tag{15}$$
Figure 5 provides a graphical analysis of the fixed points of this new $F$. As shown in Fig. 5, a new fixed point $\bar{u}_3$ emerges with the introduction of $h$.
Fig. 4. The new investment function
Fig. 5. The steady states of the extended model
Fig. 6. The orbits of ut in the extended model
Moreover, the map $F$ (Fig. 6) has a new hump between $\bar{u}_2$ and $\bar{u}_3$. As a result, orbits starting below $\bar{u}_2$ no longer move toward the lower bound 0. The properties of $\bar{u}_1$ and $\bar{u}_2$ that were proved in Sect. 2 should still be valid. Thus we only need to consider $\bar{u}_3$. Elsewhere (Gong 2001), I have proved that for some probable functional forms of $g(\cdot)$ and $h(\cdot)$, irregular cycles can occur in the neighborhood of $\bar{u}_3$.
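A small numerical sketch of the extended map in Eq. 15 follows. The functional forms of g and h and every parameter value are assumptions made for illustration; they are not the forms used in Gong (2001).

```python
import math

LAM, D = 0.1, 0.05   # illustrative s/v and depreciation rate (assumptions)


def g(u):
    """Hypothetical S-shaped conventional investment function, g' > 0."""
    return -0.045 + 0.3 / (1.0 + math.exp(-4.0 * (u - 1.0)))


def h(u, h_max=0.08, u_low=0.6, u_high=0.85):
    """Hypothetical product-innovation term: zero above u_high, flat at h_max below u_low."""
    share = (u_high - u) / (u_high - u_low)
    return h_max * min(max(share, 0.0), 1.0)


def F(u):
    """Map of Eq. 15, with total investment truncated at zero."""
    return max(g(u) + h(u), 0.0) / ((1.0 - D) * LAM + u * LAM ** 2)


def orbit(u0, steps=300):
    """Iterate u_t = F(u_{t-1}) starting from u0 and return the whole path."""
    path = [u0]
    for _ in range(steps):
        path.append(F(path[-1]))
    return path


if __name__ == "__main__":
    tail = orbit(0.9)[-5:]
    print([round(u, 4) for u in tail])
```

With these assumed forms, an orbit started just below the knife-edge steady state no longer falls to 0 but keeps moving within a band around the new fixed point, while orbits started above the knife-edge still approach the upper steady state.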
4 Bifurcation Analysis
In this section, we shall assume that $d$ and the parameters in $g(\cdot)$ and $h(\cdot)$ are all fixed, and therefore we concern ourselves only with the variation in $s/v$, denoted as $\lambda$. The bifurcation analysis in this section will provide a mathematical preparation for the major argument developed in the next section. Given the parameter $d$, Eq. 10 indicates that $\partial y/\partial \lambda > 0$. Thus, with the variation of $\lambda$, the motion of $y(u)$ and hence of $F(u)$ can be illustrated as in Figs. 7 and 8, respectively. When $\lambda$ is sufficiently small, $F$ has only one fixed point $\bar{u}_1$. With the increase in $\lambda$, new fixed points are born, starting with $\bar{u}_4$, followed by $\bar{u}_2$, $\bar{u}_3$, and then $\bar{u}_5$. When $\lambda$ is sufficiently large, only $\bar{u}_3$ remains.
Fig. 7. The motion of $y(u)$ with the increase in $\lambda$
Fig. 8. The motion of $F$ with the increase in $\lambda$
Apparently, $\bar{u}_4$ and $\bar{u}_5$ are not structurally stable because they will vanish under an arbitrarily small perturbation in $\lambda$. Denoting by $\lambda_1$ and $\lambda_2$, with $\lambda_1 < \lambda_2$, the values of $\lambda$ corresponding to the occurrence of $\bar{u}_4$ and $\bar{u}_5$, respectively, bifurcation occurs at $\lambda_1$ and $\lambda_2$.⁵ When $\lambda < \lambda_1$, all orbits are asymptotic to $\bar{u}_1$, since $\bar{u}_1$ is attracting and is the only steady state. When $\lambda > \lambda_2$, all orbits will fluctuate around $\bar{u}_3$ (if $\bar{u}_3$ is repelling) or be asymptotic to $\bar{u}_3$ (if $\bar{u}_3$ is attracting). When $\lambda_2 > \lambda > \lambda_1$, $F$ has three fixed points, $\bar{u}_1$, $\bar{u}_2$, and $\bar{u}_3$. Whether an orbit moves towards $\bar{u}_1$ or $\bar{u}_3$ depends on the initial $u$.⁶ The analysis above allows us to describe the locus of equilibrium points, as in Fig. 9. As $\lambda$ increases from very small values (less than $\lambda_1$), the growth path around which the economy fluctuates gradually moves along the equilibrium locus from the upper leaf and jumps downward at $\lambda_2$.⁷
Fig. 9. The equilibrium locus
⁵ Since $F'(\bar{u}_5) = 1$, as indicated by Fig. 9, $\lambda_2$ is a saddle-node bifurcation, as defined in Devaney (1989) and Guckenheimer and Holmes (1983), or fold bifurcation, as defined in Lorenz (1989). $\lambda_1$, however, is formally not a saddle-node bifurcation because $F$ may not be smooth at $\bar{u}_4$, and hence $F'(\bar{u}_4)$ is not necessarily equal to 1. Nevertheless, it plays the same role as a saddle-node bifurcation.
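The change in the number of steady states as s/v varies can be checked numerically by counting the zeros of g(u) + h(u) − y(u) on a grid. This is a rough sketch under assumed functional forms and parameter values (none of them from the chapter). It locates only transversal crossings and therefore cannot detect the tangency points that arise exactly at the bifurcation values.

```python
import math

D = 0.05   # illustrative depreciation rate (assumption)


def g(u):
    """Hypothetical S-shaped conventional investment function."""
    return -0.045 + 0.3 / (1.0 + math.exp(-4.0 * (u - 1.0)))


def h(u, h_max=0.08, u_low=0.6, u_high=0.85):
    """Hypothetical product-innovation term, non-increasing in u."""
    share = (u_high - u) / (u_high - u_low)
    return h_max * min(max(share, 0.0), 1.0)


def phi(u, lam):
    """g(u) + h(u) - y(u), with y from Eq. 10; zeros of phi are steady states.

    For these assumed forms g + h stays positive, so truncation at zero is ignored.
    """
    y = u * (1.0 - D) * lam + (u * lam) ** 2
    return g(u) + h(u) - y


def count_fixed_points(lam, u_max=20.0, n=20000):
    """Count sign changes of phi on a grid offset by half a step from round numbers."""
    step = u_max / n
    values = [phi(0.5 * step + i * step, lam) for i in range(n)]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


if __name__ == "__main__":
    for lam in (0.02, 0.1, 0.3):
        print(lam, count_fixed_points(lam))
```

For these assumed values the count moves from one steady state to three and back to one as s/v rises, which is the qualitative pattern of the equilibrium locus described above.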
5 Conclusion: The Evolution from Process Innovation to Product Innovation
The previous section has shown that our extended model permits different structurally stable dynamics when $\lambda$ goes through different bifurcations. Consider a situation in which the economy persistently remains in excess demand. This is possible when $\lambda$ is small (i.e., less than $\lambda_1$). In this case, there is an incentive for the economy to change $\lambda$ in the following directions. First, the saving ratio $s$ will increase because entrepreneurs want to accumulate
⁶ More bifurcations are permitted when $\lambda > \lambda_1$. One possible bifurcation is the period-doubling bifurcation, which concerns the attractability of $\bar{u}_3$. Another possibility, which plays an important role in the system of chaotic behavior, is the homoclinic bifurcation (see Devaney, 1989, and Guckenheimer and Holmes, 1983, for the definitions of these two bifurcations). These two bifurcations, especially the homoclinic bifurcation, are not easily detectable. Their existence depends on $d$ and the parameters in $g(\cdot)$ and $h(\cdot)$. However, the major argument developed in the following section does not necessarily rely on the detection of these two bifurcations. Thus, for conservation of space, we do not provide a formal detection of their existence.
⁷ This somewhat resembles a fold catastrophe in a continuous time-differential system. See the definition of catastrophe for a continuous time system in Varian (1979) and Lorenz (1989).
more capital to create more capacity. Second, $v$ will decline, since the persistent capacity shortage provides an incentive for entrepreneurs to carry out technical change in the form of process innovation. As expected, this will increase productivity and hence reduce $v$ (recall that $v$ is an inverse measure of capital productivity). Both variations indicate an increase in $\lambda$ and thus drive the economy to move downwards along the equilibrium locus, as in Fig. 9. On the basis of the previous bifurcation analysis, we suggest the following evolution of a capitalist economy in the long run. Historically, the capitalist economy experienced a period of excess demand. This was the early stage of the capitalist economy, where the production process was in the form of craft production (Nell 1992). Nevertheless, this period produced a strong incentive for the capitalist to carry out technical innovation in the form of process innovation, and therefore increased the productivities of both labor and capital. The increase in capital productivity caused a decline in the capital–output ratio $v$ and hence an increase in $\lambda$. With the increase in $\lambda$, the economy gradually moved away from the state of excess demand into one characterized by excess capacity. In between, there is a period in which the direction of movement is determined by the initial condition. This implies that exogenous shocks could be important in determining the state of the economy. However, this period cannot last, since as long as the economy remains in persistent excess demand, the capital–output ratio will decline. When the capital–output ratio has moved downward to the level where $\lambda > \lambda_2$, the economy will finally evolve to the state where persistent excess capacity is dominant. Given such a state of excess capacity, the form of technical innovation is also transformed. The emergence of new products and new markets becomes the main form of technical innovation.
Accordingly, the direction of movement of productivity becomes unclear, owing to the properties of the new products that the economy introduces. This also explains the slowdown of productivity growth that occurred in the US and other Western countries.
Mathematical Appendix
Proof of Proposition 1
From Fig. 2, both $\bar{u}_1$ and $\bar{u}_2$ belong to the interval satisfying $g(u) > 0$. Therefore, in the neighborhood of $\bar{u}_1$ and $\bar{u}_2$,
$$F'(\bar{u}) = \frac{g'(\bar{u}) - \bar{u}(s/v)^2}{(1 - d)(s/v) + \bar{u}(s/v)^2} \tag{A1}$$
where $\bar{u}$ stands for either $\bar{u}_1$ or $\bar{u}_2$. Let
$$\Delta \equiv g'(\bar{u}) - y'(\bar{u}) \tag{A2}$$
where $y(u)$ is given by Eq. 10 and hence
$$y'(\bar{u}) = (s/v)(1 - d) + 2\bar{u}(s/v)^2 \tag{A3}$$
Substitute Eq. A2 into Eq. A1 and then use Eq. A3 to interpret $y'(\bar{u})$. We obtain
$$F'(\bar{u}) = \frac{\Delta}{(1 - d)(s/v) + \bar{u}(s/v)^2} + 1 \tag{A4}$$
From Fig. 2, $y'(\bar{u}_2) < g'(\bar{u}_2)$ and hence $\Delta > 0$ at $\bar{u}_2$. We thus prove that $F'(\bar{u}_2) > 1$. Also from Fig. 2, $y'(\bar{u}_1) > g'(\bar{u}_1)$, and thus $\Delta < 0$ at $\bar{u}_1$. It follows that $F'(\bar{u}_1) < 1$. Next we prove $F'(\bar{u}_1) > -1$.
Consider $\bar{u}_1 \to +\infty$. In this case, $g'(\bar{u}_1) \to 0$ and hence $\Delta \to -y'(\bar{u}_1)$. From Eq. A4, this further indicates that $F'(\bar{u}_1)$ tends to its lower bound. We thus obtain
$$\inf F'(\bar{u}_1) = \lim_{\bar{u}_1 \to +\infty} \left[ \frac{-y'(\bar{u}_1)}{(1 - d)(s/v) + \bar{u}_1 (s/v)^2} + 1 \right] = \lim_{\bar{u}_1 \to +\infty} \left[ -\frac{\bar{u}_1 (s/v)^2}{(1 - d)(s/v) + \bar{u}_1 (s/v)^2} \right] = -1$$
∎
Proof of Proposition 2
Define $u_1$ to be the aggregate capacity utilization that is equal to the maximum threshold $\max_i \mu_i^*$. Then when $u > u_1$, no firm will invest in new products, indicating that $I^n = 0$. Define $u_2$ to be the aggregate capacity utilization that is equal to the minimum threshold $\min_i \mu_i^*$. Then when $u < u_2$, all $m$ firms will invest. However, due to the limitation of technological resources, there is a maximum amount of investment in the economy, which is equal to $\sum_{i=1}^{m} \theta_i$. When $u_2 \le u \le u_1$, a small reduction in $u$ may not induce a firm to invest, indicating that $\partial I^n/\partial u = 0$. However, when the reduction is large, passing some threshold, some firms will invest, indicating that $\partial I^n/\partial u < 0$. Again the total amount of investment will not exceed $\sum_{i=1}^{m} \theta_i$. ∎
Acknowledgments. The author would like to thank Robert Solow, Edward Nell, Peter Flaschel, and Peter Scott for their comments on various earlier versions of the manuscript. Their critique and suggestions brought many important improvements. Of course, the usual disclaimer applies.
References
Chang WW, Smyth DJ (1971) The existence and persistence of cycles in a non-linear model: Kaldor’s 1940 model re-examined. Rev Econ Stud 38:37–46
Devaney RL (1989) An introduction to chaotic dynamic systems, 2nd edn. Addison-Wesley, Redwood
Gong G (2001) Product innovation and irregular growth cycles with excess capacity. Metroeconomica 52:428–448
Goodwin RM (1951) The nonlinear accelerator and the persistence of business cycles. Econometrica 19:1–17
Greiner A, Semmler W, Gong G (2005) The forces of economic growth: theory and time series evidence. Princeton University Press, Princeton
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamic systems and bifurcations of vector fields. Springer, New York
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Kaldor N (1940) A model of the trade cycle. Econ J 50:78–92
Kaldor N (1957) A model of economic growth. Econ J 66:591–624
Lorenz HW (1989) Nonlinear dynamical economics and chaotic motion. Springer, New York
Lucas RE (1988) On the mechanics of economic growth. J Monetary Econ 22:3–42
Nell E (1992) Transformational growth: from Say’s law to the multiplier. In: Nell E (ed) Transformational growth and effective demand. New York University Press, New York
Nelson RR, Winter SG (1982) Evolutionary theory of economic change. Harvard University Press, Cambridge
Pavitt P (1984) Sector pattern of change. Res Policy 13:343–373
Romer P (1986) Increasing return and long-run growth. J Polit Econ 94:1002–1037
Romer P (1990) Endogenous technical change. J Polit Econ 98:71–105
Scherer FM (1984) Innovation and growth: Schumpeterian perspectives. MIT Press, Cambridge
Schumpeter JA (1934) The theory of economic development: an inquiry into profits, capital, credit, interest, and business cycle. Harvard University Press, Cambridge
Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94
Varian HR (1979) Catastrophe theory and business cycle. Econ Inquiry 17:14–28
4. Tobin’s q and Investment in a Model with Multiple Steady States
Mika Kato¹, Willi Semmler², and Marvin Ofori³
Summary. This chapter considers a simple dynamic investment decision problem of a firm where adjustment costs have capital size effects. This type of setting possibly results in multiple steady states, thresholds, and a discontinuous policy function. We study the global dynamic properties of the model by employing the Hamilton–Jacobi–Bellman method and dynamic programming that help us in the numerical detection of multiple steady states and thresholds. We also explore the model’s implications concerning the effects of aggregate demand, interest rates, and tax rates. Finally, an empirical study on the firm size distribution is provided using US firm-size data. We utilize two different approaches, Kernel density estimation and Markov chain transition matrix, to study an ergodic distribution. Our results suggest a twin-peak distribution of firm size in the long run, which empirically supports the theoretical conjecture of the existence of multiple steady states. Key words. Adjustment costs, Multiple steady states, Global dynamics
1 Introduction
In the theory of investment, a firm maximizes the discounted value of its future net cash flows. What the firm wants to know is the optimal investment schedule over time. Generally speaking, by solving a firm’s optimization problem, we want to obtain a policy function which tells us the optimal
¹ Department of Economics, Howard University, ASB-B-319, Washington, DC 20059, USA
² Center for Empirical Macroeconomics, Bielefeld and New School for Social Research, New York, 65 Fifth Ave, New York, NY 10003, USA
³ University of Bielefeld, 33502 Bielefeld, Germany
M. Kato et al.
investment policy corresponding to each level of capital stock. This policy function has been perceived to be continuous. Here, however, we demonstrate the possibility of a discontinuous policy function. The key factor is to introduce adjustment costs with capital size effects, whereby the growth rate of capital is the source of adjustment costs. Interestingly, a recent work by Feichtinger et al. (2000) provides a very simple investment model of this type. The purpose of our study is to solve a firm’s dynamic investment decision problem where multiple steady states and a discontinuous investment strategy may arise. We study the global dynamic properties of the model, and pursue a numerical detection of the threshold. It should be noted that economists have long been interested in studying the implications of adjustment costs for the dynamics of investment. An earlier version of such an investment study was presented by Jorgenson (1963). In his study, the optimal path of the capital stock for an exogenously given output is derived. The optimal path of the investment rate is determined only when assuming a distributed lag function. The need for such ad hoc assumptions was recognized by Lucas (1967), Gould (1968), Uzawa (1968, 1969), and Treadway (1969). Their solution was to introduce adjustment costs. Generally speaking, there are two types of adjustment cost: those with and without capital size effects. Lucas (1967), Gould (1968), and Treadway (1969) developed models with adjustment costs without size effects, whereas models with adjustment costs with size effects have been studied by Uzawa (1968, 1969) and Hayashi (1982). Their work was stimulated by the earlier work on adjustment costs by Eisner and Strotz (1963). Our model focuses on the role of adjustment costs with capital size effects. As it turns out, the capital size effects are central to the generation of multiple steady states and the associated interesting dynamic properties.
In addition to history-dependence, both continuous and discontinuous policy functions can arise. Recently, Feichtinger et al. (2000), using a simple framework with adjustment costs with size effects, discussed the case of multiple steady states. Our model here is crucially based on their model, but focuses more on the study of global dynamics using the Hamilton–Jacobi–Bellman (HJB) method, stresses the analysis of comparative dynamics and the policy implications, and explores the empirical support for the model’s prediction on the firm size distribution. The most important economic implication of multiple steady states is history-dependence. History-dependence arises if solution paths converge toward distinct attractors depending on the initial conditions. The existence of two or more stable steady states implies the existence of thresholds. At a threshold, the firm will not be able to decide between investment strategies which converge toward one or the other stable steady state.
Model with Multiple Steady States
Feichtinger et al. (2000) solve their model by using Pontryagin’s maximum principle and studying the local properties of each steady state. Instead, we focus on comparing the values of the candidate paths and derive the global solution. We use the HJB method to derive a global value function and pave the way to detecting thresholds numerically. The remainder of the chapter is organized as follows. Section 2 gives a brief review of the literature on adjustment costs in the investment theory. Section 3 presents a traditional model with adjustment costs with no capital size effects, where we will obtain a unique and continuous policy function. In other words, the optimal investment rate is continuous in Tobin’s q. In Sect. 3, we present the model with capital size effects. Using the HJB method, the global value function and the threshold will be derived. The policy function can be both continuous and discontinuous. Numerical simulations are attached at the end of Sect. 3. Section 4 explores the model’s implications concerning the effects of aggregate output (booms and recessions), interest rates, and tax rates. Sections 5 and 6 provide an empirical study on firm size distribution using the US firm size data. We utilize two different approaches, Kernel density estimation and Markov chain transition matrix, to obtain an ergodic distribution. Our results suggest the twin-peak distribution of firm size in the long run, which can be viewed as an empirical support of our theoretical model.
2 The Benchmark Model Without Size Effects
We first present a traditional model without size effects. The model used here has been developed by Abel (1982), Hayashi (1982), and Summers (1981), and is known as the q theory model of investment. Generally speaking, this type of model has a unique equilibrium and a continuous policy function. The model is represented in Eqs. 1–3.
$$\max_{I} \int_0^{\infty} e^{-rt}\{R(K) - I - A(I)\}\,dt \tag{1}$$
$$\text{s.t.} \quad \dot{K} = I - \delta K; \qquad K_0 = \text{given} \tag{2}$$
where
$$R(0) = 0,\ R''(K) < 0,\ A(0) = 0,\ A'(I) > 0 \text{ for } I > 0,\ A'(I) < 0 \text{ for } I < 0,\ A''(I) > 0 \tag{3}$$
$R(K)$ is a representative firm’s revenue function and $K(t)$ is a representative firm’s capital stock. $I(t)$ is the firm’s investment. We also assume that the purchase price of a unit of investment goods is 1, and thus the cost of purchasing investment goods is $I$. $A(I)$ represents adjustment costs, which depend only on the firm’s investment $I$. The depreciation rate of capital stock is $\delta$.
For more specific results, we use the following specific functions which satisfy Eq. 3:
$$R(K) = aK - bK^2 \tag{4}$$
$$A(I) = cI^2 \tag{5}$$
We employ the HJB method to study the analytical solution of this problem. The HJB equation for the present model has the form
$$rV(K) = \max_{I}\,[R(K) - I - A(I) + V'(K)(I - \delta K)] \tag{6}$$
Solving $\partial[\cdot]/\partial I = 0$ gives the first-order condition
$$-1 - A'(I) + V'(K) = 0 \tag{7}$$
Note also that $V'(K)$ is equivalent to Tobin’s q, Tobin (1969). Therefore, we have the following rule for optimal investment:
$$I(t) = A'^{-1}(q - 1); \qquad \begin{cases} I > 0 \\ I = 0 \\ I < 0 \end{cases} \text{ for } \begin{cases} q(t) > 1 \\ q(t) = 1 \\ q(t) < 1 \end{cases} \tag{8}$$
We can interpret $q$ as the market value of a unit of capital. Since we assume that the purchase price of a unit of capital is 1, the economic interpretation of Eq. 8 is that a firm invests if the market value of capital exceeds the purchase price of capital, and disinvests if the market value of capital is less than the purchase price of capital. With specific functions, the optimal investment policy reads
$$I = \frac{1}{2c}\,(V'(K) - 1) \tag{9}$$
If $e$ is a steady-state level of capital stock, then from Eq. 2
$$I - \delta e = 0 \tag{10}$$
and, at the steady state, we obtain
$$rV(e) = R(e) - \delta e - A(\delta e) \tag{11}$$
$$V'(e) = \frac{R'(e)}{r + \delta} \tag{12}$$
Thus, from the first-order condition in Eq. 7 and from Eq. 2, the steady state $e$ satisfies
$$-1 - A'(\delta e) + \frac{R'(e)}{r + \delta} = 0 \tag{13}$$
Using specific functions, Eq. 13 becomes
$$-1 - 2c\delta e + \frac{a - 2be}{r + \delta} = 0 \tag{14}$$
Therefore, we have a unique steady state
$$e = \frac{a - (r + \delta)}{2c\delta(r + \delta) + 2b} \tag{15}$$
To establish a policy function, we construct a phase diagram in {I, K} space. There exists a unique global stable manifold associated with a unique steady state. The global stable manifold in {I, K} space is equivalent to the policy function, and thus the policy function is continuous, as Fig. 1 shows.
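As a quick numerical check of the benchmark steady state, the sketch below evaluates Eq. 15 and confirms that it satisfies Eq. 14 and the investment rule of Eq. 9. The parameter values and function names are hypothetical, chosen only so that a > r + δ.

```python
def benchmark_steady_state(a, b, c, r, delta):
    """Unique steady-state capital stock from Eq. 15."""
    return (a - (r + delta)) / (2.0 * c * delta * (r + delta) + 2.0 * b)


def foc_residual(e, a, b, c, r, delta):
    """Left-hand side of Eq. 14; it should vanish at the steady state."""
    return -1.0 - 2.0 * c * delta * e + (a - 2.0 * b * e) / (r + delta)


def tobin_q(e, a, b, r, delta):
    """Steady-state V'(e) from Eq. 12 with R'(e) = a - 2be."""
    return (a - 2.0 * b * e) / (r + delta)


if __name__ == "__main__":
    params = dict(a=1.0, b=0.05, c=1.0, r=0.05, delta=0.1)   # hypothetical values
    e = benchmark_steady_state(**params)
    q = tobin_q(e, params["a"], params["b"], params["r"], params["delta"])
    investment = (q - 1.0) / (2.0 * params["c"])             # Eq. 9
    print(e, q, investment, params["delta"] * e)
```

At the steady state, q exceeds 1 by exactly the marginal adjustment cost of replacement investment, so the investment implied by Eq. 9 equals δe.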
3 The Model with Adjustment Cost and Size Effects Here we follow the model developed by Feichtinger et al. (2000). The model is a simple dynamic investment model with relative adjustment costs with capital size effects. The simplest example is an adjustment cost function of investment to capital ratio, as in Feichtinger et al. (2000). Multiple steady states and a discontinuous policy function may arise depending on parameters. We employ the HJB method again to solve the problem with the aim of obtaining a global value function. We want to detect the threshold
Fig. 1. Continuous policy function (phase diagram in {I, K} space; steady state at e)
numerically when we construct a policy function. The policy function can be both continuous and discontinuous depending on the parameters. Consider a firm acting to maximize the present value of the sum of future net cash flows.
$$\max_{I} \int_0^{\infty} e^{-rt}\left\{R(K) - I - A\!\left(\frac{I}{K}\right)\right\}dt \tag{16}$$
$$\text{s.t.} \quad \dot{K} = I - \delta K; \qquad K_0 = \text{given} \tag{17}$$
where $R(K)$ is a revenue function, $K$ is the capital stock, $I$ is investment, $A(I/K)$ is adjustment costs with size effects, i.e., the size of the capital stock $K$ affects adjustment costs, $\delta$ is the depreciation rate, and $r$ is the discount rate. Replacing the expression $I/K$ by $u$, our new control variable is $u$:
$$\max_{u} \int_0^{\infty} e^{-rt}\{R(K) - uK - A(u)\}\,dt \tag{18}$$
$$\text{s.t.} \quad \dot{K} = uK - \delta K; \qquad K_0 = \text{given} \tag{19}$$
We assume that R(K) and A(u) are twice continuously differentiable, and
\[ R(0) = 0, \quad R'(0) > 0, \quad R''(K) < 0 \text{ for all } K \tag{20} \]
\[ A(0) = 0, \quad A'(0) = 0, \quad A'(u) > 0 \text{ for } u > 0, \quad A'(u) < 0 \text{ for } u < 0, \quad A''(u) > 0 \text{ for all } u \tag{21} \]
We also assume that
\[ R'(0) > r + \delta \tag{22} \]
so that the trivial solution u = 0 for all t is excluded. For more specific results, let us assume that the revenue and adjustment cost functions are quadratic:
\[ R(K) = aK - bK^2 \tag{23} \]
\[ A(u) = cu^2 \tag{24} \]
where a, b, c > 0. The HJB equation for the present model has the form
\[ rV(K) = \max_{u} \left[ R(K) - uK - A(u) + V'(K)(uK - \delta K) \right] \tag{25} \]
Step 1. Compute the steady states for the stationary HJB equation. Solving ∂[•]/∂u = 0 gives the first-order condition
\[ -K - A'(u) + V'(K)K = 0 \tag{26} \]
Again, it is important to notice that V′(K) is equivalent to Tobin’s q. The following rule of optimal investment is derived:
\[ u(t) = (A')^{-1}\left[ (q - 1)K \right]; \quad \begin{cases} u > 0 & \text{for } q(t) > 1 \\ u = 0 & \text{for } q(t) = 1 \\ u < 0 & \text{for } q(t) < 1 \end{cases} \tag{27} \]
Comparing Eq. 27, the investment rule under the relative adjustment cost with size effects, with Eq. 8, the rule under the adjustment cost with no size effects, the only difference is that the capital stock K enters the function u multiplicatively in Eq. 27. This implies that a firm with a larger capital stock has a higher incentive to invest, so that increasing returns to scale exist locally. This can be understood as the source of multiple steady states. With the specific functions in Eqs. 23 and 24, Eq. 27 reads
\[ u = \frac{1}{2c}\left( V'(K) - 1 \right) K \tag{28} \]
If e is a steady state, then from Eq. 19
\[ ue - \delta e = 0 \tag{29} \]
Since u = δ for any positive steady state e > 0,
\[ rV(e) = R(e) - \delta e - A(\delta) \tag{30} \]
\[ V'(e) = \frac{1}{r}\left[ R'(e) - \delta \right] \tag{31} \]
Thus, the first-order condition Eq. 26 at the positive steady state e > 0 becomes
\[ -e - A'(\delta) + \frac{1}{r}\left[ R'(e) - \delta \right] e = 0 \tag{32} \]
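From the specific functions
\[ R'(K) = a - 2bK \tag{33} \]
\[ A'(u) = 2cu \tag{34} \]
Therefore, we obtain positive steady states from the condition Eq. 32
\[ -e - 2c\delta + \frac{1}{r}\left[ a - 2be - \delta \right] e = 0 \tag{35} \]
Thus,
\[ \begin{cases} u = 0 & \text{for } e = 0 \\ u = \delta & \text{for } e > 0 \end{cases} \tag{36} \]
and we have three steady states
\[ e = \begin{cases} 0 \\ \dfrac{a - r - \delta \pm \sqrt{(a - r - \delta)^2 - 16bc\delta r}}{4b} \end{cases} \tag{37} \]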
Note that two positive steady states exist if $a - r - \delta > 4\sqrt{bc\delta r}$.

Step 2. Solve the dynamic HJB equation starting from the equilibrium candidates. From the optimal investment rule Eq. 27, the HJB equation to be satisfied is
\[ rV(K) = R(K) - (A')^{-1}\big((V'(K) - 1)K\big)K - A\big((A')^{-1}((V'(K) - 1)K)\big) + V'(K)\big((A')^{-1}((V'(K) - 1)K) - \delta\big)K \tag{38} \]
For specific results, we substitute the optimal investment rule Eq. 28, and the HJB equation becomes
\[ V'(K)^2 - \frac{2K + 4\delta c}{K} V'(K) - \frac{4crV(K)}{K^2} + \frac{4ac}{K} - 4bc + 1 = 0 \tag{39} \]
Then we obtain an ordinary differential equation in V:
\[ V'(K) = \frac{K + 2\delta c}{K} - \sqrt{\frac{(K + 2\delta c)^2}{K^2} + \frac{4crV(K)}{K^2} - \frac{4ac}{K} + 4bc - 1} \quad \text{for } K \ge e \tag{40} \]
\[ V'(K) = \frac{K + 2\delta c}{K} + \sqrt{\frac{(K + 2\delta c)^2}{K^2} + \frac{4crV(K)}{K^2} - \frac{4ac}{K} + 4bc - 1} \quad \text{for } K < e \tag{41} \]
with
\[ V(e) = \frac{1}{r}\left[ ae - be^2 - \delta e - c\delta^2 \right] \quad \text{for each } e > 0 \tag{42} \]
as initial conditions.¹

Step 3. Solve the global value function. We can compute the global value function for the original problem by
\[ V(K) = \max_i V_i(K) \tag{43} \]
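The algebra of Step 2 can be checked numerically. The Python sketch below computes both roots of the quadratic Eq. 39 (the branches used in Eqs. 40 and 41) and verifies that each root, combined with the investment rule Eq. 28, satisfies the HJB equation Eq. 25 with the quadratic functions of Eqs. 23 and 24. The parameter values and the test point (K, V) are assumptions chosen only so that the discriminant is positive; the function names are hypothetical.

```python
import math


def v_prime_roots(K, V, a, b, c, delta, r):
    """Both roots of the quadratic Eq. 39 in V'(K).

    Returns (minus branch as in Eq. 40, plus branch as in Eq. 41).
    """
    m = (K + 2 * delta * c) / K
    disc = m * m + 4 * c * r * V / K**2 - 4 * a * c / K + 4 * b * c - 1
    if disc < 0:
        raise ValueError("no real root of Eq. 39 at this (K, V)")
    root = math.sqrt(disc)
    return m - root, m + root


def hjb_residual(K, V, v_prime, a, b, c, delta, r):
    """rV minus the right-hand side of Eq. 25, with u taken from Eq. 28."""
    u = (v_prime - 1) * K / (2 * c)
    rhs = a * K - b * K**2 - u * K - c * u**2 + v_prime * (u - delta) * K
    return r * V - rhs


params = dict(a=1.0, b=0.6, c=0.3, delta=0.1, r=0.2)  # assumed values
K, V = 0.3, 2.0  # arbitrary test point with a positive discriminant
lower, upper = v_prime_roots(K, V, **params)
residuals = [hjb_residual(K, V, vp, **params) for vp in (lower, upper)]
print(lower, upper, residuals)
```

Both residuals should be zero up to rounding error. Integrating Eqs. 40 and 41 away from a steady state needs more care than this check, because the discriminant vanishes exactly at the initial condition Eq. 42.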
The local value functions V_i are generally computed numerically. The global value function is shown in Fig. 2 corresponding to Fig. 3, the phase diagram in {u, K} space. This example shows the case when the middle unstable steady state has a focus property. The point where the local value function changes is called the threshold or Skiba point. The phase diagram in {u, K} space can also be expressed in {I, K} space.

¹ As we see in Eqs. 36 and 39, u = 0 for e = 0 and u = δ for e > 0. Thus, the value of the value function at each steady state is expressed as V(e) = (1/r)[ae − be² − δe − cδ²] for e > 0 and V(0) = 0 for e = 0.

Fig. 2. Global value function (V(K) against K, with the threshold and the steady states e2 and e3 marked on the K axis)

Fig. 3. Discontinuous policy function in {u, K} space (u against K, showing the $\dot{K} = 0$ and $\dot{u} = 0$ loci, the level u = δ, the threshold, and the steady states e2 and e3)

Using some numerical examples presented by Feichtinger et al. (2000), we carry out some simulations.

Case 1: Continuous policy function. The Skiba point coincides with the middle equilibrium candidate e2. Example: r = 0.3, δ = 0.1, b = 0.6, c = 0.3, a = 1, e2 = 0.07047, e3 = 0.2129.

Case 2: Discontinuous policy function. The Skiba point does not coincide with the middle equilibrium candidate e2. Example: r = 0.2, δ = 0.1, b = 0.6, c = 0.3, a = 1, e2 = 0.017679, e3 = 0.5665, Skiba = 0.01.

When we have multiple steady states, there generally exist multiple local stable manifolds satisfying the system and the transversality condition. Our model has three steady states, and there are two local stable manifolds corresponding to the two stable steady states. Moreover, depending on the local dynamic property of the middle unstable steady state, these two stable manifolds overlap near the middle unstable steady state, as Fig. 3 shows. Our strategy for comparing the values of those candidate paths is to use the global value function Eq. 43. Using the global value function shown in Fig. 2, we can uniquely choose the optimal global stable manifold, which generates the maximum value among the multiple candidate stable manifolds. The global stable manifold in {u, K} space (or in {I, K} space) is equivalent to the policy function, and thus the policy function can be discontinuous, as in Fig. 3. Whether a discontinuity arises depends on the parameters, and we cannot rule out this case a priori. Note that the discontinuity occurs at the threshold(s) where the local value function switches. For our parameter set above, the threshold is found to be at 0.01,² and thus is to the left of the middle equilibrium candidate e2.
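The positive steady states of Eq. 37 are easy to evaluate directly. The Python sketch below does so for the Case 2 example, taking δ = 0.1; this value is an assumption made here because it is the depreciation rate for which Eq. 37 returns the reported e2 and e3. The function name is hypothetical.

```python
import math


def positive_steady_states(a, b, c, delta, r):
    """Positive roots e2 < e3 of Eq. 37, or None if they do not exist."""
    y = a - r - delta
    disc = y * y - 16 * b * c * delta * r
    # Existence condition: a - r - delta > 4 * sqrt(b * c * delta * r).
    if y <= 0 or disc <= 0:
        return None
    root = math.sqrt(disc)
    return (y - root) / (4 * b), (y + root) / (4 * b)


# Case 2 example; delta = 0.1 is the assumed depreciation rate.
case2 = dict(a=1.0, b=0.6, c=0.3, delta=0.1, r=0.2)
e2, e3 = positive_steady_states(**case2)
print(f"e2 = {e2:.6f}, e3 = {e3:.4f}")

# Each root should satisfy the steady-state condition Eq. 35.
for e in (e2, e3):
    lhs = -e - 2 * case2["c"] * case2["delta"] + (
        case2["a"] - 2 * case2["b"] * e - case2["delta"]) * e / case2["r"]
    print(f"residual of Eq. 35 at e = {e:.6f}: {lhs:.2e}")
```

The function returns None when the existence condition fails, in which case the origin is the only steady state.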
² That threshold for the above parameter set is computed in Grüne and Semmler (2004).

4 The Effects of Demand, Interest Rate, and Tax Rate

In this section, we analyze the effects of exogenous changes in demand, interest rate, and taxes on the steady states and the equilibrium path. An exogenous increase in demand raises the output of the firm, and thus revenue will increase for the same amount of capital, entailing an upward shift of the revenue function R(K). This corresponds to an increase in a in our specific revenue function. From Eq. 35, we have³
\[ \frac{\partial e_2}{\partial a} = -\frac{e_2}{(a - r - \delta) - 4be_2} < 0 \tag{44} \]
\[ \frac{\partial e_3}{\partial a} = -\frac{e_3}{(a - r - \delta) - 4be_3} > 0 \tag{45} \]
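An increase in output shifts the middle steady state down and the upper steady state up. In the same way, the effects of an increase in the interest rate can be analyzed as
\[ \frac{\partial e_2}{\partial r} = \frac{e_2 + 2c\delta}{(a - r - \delta) - 4be_2} > 0 \tag{46} \]
\[ \frac{\partial e_3}{\partial r} = \frac{e_3 + 2c\delta}{(a - r - \delta) - 4be_3} < 0 \tag{47} \]
With a tax credit at rate χ, the rule of optimal investment becomes
\[ u(t) = (A')^{-1}\left[ (q + \chi - 1)K \right]; \quad \begin{cases} u > 0 & \text{for } q(t) + \chi > 1 \\ u = 0 & \text{for } q(t) + \chi = 1 \\ u < 0 & \text{for } q(t) + \chi < 1 \end{cases} \tag{48} \]
and the three steady states become
\[ e = \begin{cases} 0 \\ \dfrac{a - r - \delta + \chi r \pm \sqrt{(a - r - \delta + \chi r)^2 - 16bc\delta r}}{4b} \end{cases} \tag{49} \]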
Therefore, the effects of an increase in the rate of tax credit on the positive steady states are
\[ \frac{\partial e_2}{\partial \chi} = -\frac{re_2}{(a - r - \delta + \chi r) - 4be_2} < 0 \tag{50} \]
\[ \frac{\partial e_3}{\partial \chi} = -\frac{re_3}{(a - r - \delta + \chi r) - 4be_3} > 0 \tag{51} \]

³ From Eq. 37, e2 = (y − x^{1/2})/(4b) and e3 = (y + x^{1/2})/(4b), where y ≡ a − r − δ > 0 and x ≡ (a − r − δ)² − 16bcδr > 0. Therefore, e2 < y/(4b) and e3 > y/(4b) hold. This implies the signs of Eqs. 44 and 45.
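The signs of the comparative statics can be cross-checked numerically. The Python sketch below compares the implicit-function derivatives of the positive steady states with central finite differences of the closed-form roots, evaluated at χ = 0. The parameter values are assumptions (the Case 2 example with δ = 0.1), and the function names are hypothetical.

```python
import math


def steady_states(a, b, c, delta, r, chi=0.0):
    """Positive steady states (e2, e3) from Eq. 49; chi = 0 gives Eq. 37."""
    y = a - r - delta + chi * r
    root = math.sqrt(y * y - 16 * b * c * delta * r)
    return (y - root) / (4 * b), (y + root) / (4 * b)


def analytic_derivatives(e, a, b, c, delta, r):
    """Implicit-function derivatives of a steady state e, at chi = 0."""
    denom = (a - r - delta) - 4 * b * e
    return {
        "a": -e / denom,                   # demand shift
        "r": (e + 2 * c * delta) / denom,  # interest rate
        "chi": -r * e / denom,             # tax credit
    }


def numeric_derivative(index, name, base, step=1e-6):
    """Central finite difference of steady state `index` w.r.t. `name`."""
    up = dict(base, **{name: base[name] + step})
    down = dict(base, **{name: base[name] - step})
    return (steady_states(**up)[index] - steady_states(**down)[index]) / (2 * step)


base = dict(a=1.0, b=0.6, c=0.3, delta=0.1, r=0.2, chi=0.0)  # assumed values
model = {k: v for k, v in base.items() if k != "chi"}
results = {}
for index, label in enumerate(("e2", "e3")):
    e = steady_states(**base)[index]
    exact = analytic_derivatives(e, **model)
    for name in ("a", "r", "chi"):
        approx = numeric_derivative(index, name, base)
        results[(label, name)] = (exact[name], approx)
        print(f"d{label}/d{name}: analytic {exact[name]:+.5f}, numeric {approx:+.5f}")
```

In this sketch the middle steady state falls and the upper one rises with a and χ, and the opposite holds for r.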
An increase in the tax credit pushes the middle steady state down and the upper steady state up, and thus the domain of attraction associated with the upper steady state is enlarged. Now our interest is how those changes in exogenous variables affect the firm’s investment policy and its equilibrium path. It is convenient to construct a phase diagram for this purpose. So far, we have relied on the HJB method, where the discussion can be boiled down to a one-dimensional state space. Here, we employ the maximum principle to see explicitly the equilibrium path in the control and state space, which is the equilibrium relationship between the investment decision and the level of capital stock. We follow the model in Eqs. 18–22 with the specific functions Eqs. 23 and 24. The current value Hamiltonian⁴ is
\[ H = R(K) - uK - A(u) + q\dot{K} = aK - bK^2 - uK - cu^2 + q(uK - \delta K) \tag{52} \]
The first-order necessary conditions are
\[ H_u = -K - 2cu + qK = 0 \tag{53} \]
\[ \dot{q} = (r + \delta)q - (a - 2bK) + (1 - q)u \tag{54} \]
\[ \dot{K} = (u - \delta)K \tag{55} \]
Therefore, the system in {u, K} space is summarized as
\[ \dot{u} = ru + \frac{1}{2c}(r + \delta - a)K + \frac{b}{c}K^2 \tag{56} \]
\[ \dot{K} = (u - \delta)K \tag{57} \]
Solving $\dot{u} = \dot{K} = 0$ gives the steady states, which are the same as in Eq. 37. The phase diagram can be constructed by using the information from Eq. 52. We know from Eqs. 44–47 and 50 and 51 that an increase in demand, a decrease in the interest rate, and an increase of the tax credit shift up the $\dot{u} = 0$ curve.

⁴ Introducing the tax credit χ will modify the current value Hamiltonian as: H = aK − bK² − uK − cu² + (q + χ)(uK − δK). The first-order conditions are: H_u = −K − 2cu + (q + χ)K = 0, dq/dt = (r + δ)q + χδ − (a − 2bK) + (1 − q − χ)u, and dK/dt = (u − δ)K. The system is rewritten as du/dt = ru + (1/2c)(r + δ − a − χr)K + (b/c)K² and dK/dt = (u − δ)K. Solving du/dt = dK/dt = 0 gives the steady states in Eq. 49.
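The local behavior of the system in Eqs. 56 and 57 around each positive steady state can be inspected with a short Python sketch. It evaluates the vector field at (u, K) = (δ, e) and classifies the steady state by the trace and determinant of the Jacobian. This classification is a standard local stability check added here for illustration, not a step taken from the chapter; the parameter values (Case 2 with δ = 0.1) and the function names are assumptions.

```python
import math


def vector_field(u, K, a, b, c, delta, r):
    """Right-hand sides of Eqs. 56 and 57: (du/dt, dK/dt)."""
    u_dot = r * u + (r + delta - a) * K / (2 * c) + b * K**2 / c
    k_dot = (u - delta) * K
    return u_dot, k_dot


def classify(e, a, b, c, delta, r):
    """Classify the steady state (u, K) = (delta, e) from the Jacobian."""
    # Jacobian of Eqs. 56-57 at u = delta: [[r, j12], [e, 0]].
    j12 = (r + delta - a) / (2 * c) + 2 * b * e / c
    trace, det = r, -e * j12
    if det < 0:
        return "saddle"
    return "unstable focus" if trace**2 < 4 * det else "unstable node"


params = dict(a=1.0, b=0.6, c=0.3, delta=0.1, r=0.2)  # assumed values
y = params["a"] - params["r"] - params["delta"]
root = math.sqrt(y * y - 16 * params["b"] * params["c"] * params["delta"] * params["r"])
e2, e3 = (y - root) / (4 * params["b"]), (y + root) / (4 * params["b"])

kinds = {}
for label, e in (("e2", e2), ("e3", e3)):
    kinds[label] = classify(e, **params)
    print(label, vector_field(params["delta"], e, **params), kinds[label])
```

For these assumed values the middle steady state comes out as an unstable focus and the upper one as a saddle, which is the configuration in which the two candidate manifolds overlap around e2.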
The effects are shown in Fig. 4. After the events occur, the $\dot{u} = 0$ curve moves up and the $\dot{K} = 0$ curve stays put. This changes the steady-state values of the middle and upper steady states and, moreover, the threshold level. The threshold moves to the left and therefore the domain of attraction changes, i.e., the domain of attraction of the upper steady state is enlarged. One of the interesting phenomena in a system with multiple steady states is that such an exogenous change may move the firm from one trajectory to another, each of which is associated with a different steady state. For example, as Fig. 4 shows, even though the firm’s capital stock is on the shrinking trajectory at the beginning, the firm may switch to the growing trajectory after any of the suggested events (an increase in demand, a decrease in the interest rate, or an increase of the tax credit) happens, because of the leftward movement of the threshold. This is likely to happen when such an event occurs before the capital stock shrinks below the new threshold. If the market takes a turn for the better (an increase in demand) or the government takes quick action to recover from a recession (through a decrease in the interest rate or an increase of the tax credit) before things really get worse (before the firm’s capital stock falls below the new threshold), there will be a chance for firms to come back to the growing trend. It is also reasonable to think that firms will ride on the upswing of the market in a boom time, thus raising their investment. On the other hand, a recession may cause firms to change direction, tightening their investment, and their capital stock may even shrink. When firms’ investment is shrinking, not only the scale of policy intervention but also a quick reaction by the government is desired. When government action comes too late, a large policy intervention is required to alter the situation, since the government has to bring the threshold down to a very low level. The quicker the policy reaction is, the smaller and less costly is the intervention required to save the situation.

Fig. 4. Comparative dynamic results (u against K, showing the $\dot{K} = 0$ and $\dot{u} = 0$ loci, the level u = δ, the threshold, the jump, and the steady states e2′, e2, e3, and e3′)
5 Kernel Density and Bootstrap Test

Next we want to present empirical evidence that may confirm the long-run twin-peak distribution of firm size which is implied by the dynamic investment model studied above. To address this question, we concentrate on firm size distribution in the US manufacturing industry. The data of the following empirical study are taken from the pstar dataset used in Hall and Hall (1993). The data set contains 23 variables which quantify certain characteristics such as investment and stock price or assets’ value of US firms in the manufacturing industry for the time period from 1960 to 1991. We use the variable netcap, which is defined as the book value of assets, adjusted for the effects of inflation, to represent capital stock. In the following, net capital, which is normalized by the average net capital of all firms, will be used as a measure of size.
\[ k(i)_t = \frac{\text{Net capital of firm } i \text{ in year } t}{\text{Average net capital in year } t} \tag{58} \]
Thus, k(i)_t = 1 indicates that firm i has a net capital that equals the average net capital, and k(i)_t = 0.5 means that firm i has half of the average net capital. The pstar data set gives us a set of observed data points or capital stocks which can be interpreted as a sample of an unknown probability density function for several years. To analyze certain characteristics of this density, such as the number of modes (which are defined as local maxima of the density), one has to determine the unknown density. If, for example, the density function has changed from a unimodal to a bimodal one, this can be regarded as a hint in favor of the theoretical ideas stated above. Accordingly, the aim in this section is to estimate the unknown density function from the observed data, especially at the end of the sample period, to see if it is bimodal. To address this aim a nonparametric approach is used. Whereas the parametric approach makes strong assumptions about the distribution of the data (e.g., normal distribution), the nonparametric approach is characterized by less rigid assumptions on the distribution of the observed data.⁵ The only necessary assumption which will be made is that a probability density function exists, thus “letting the data itself determine the density function”. In the literature there are various kinds of nonparametric density estimators, such as kernels, splines, orthogonal series, or histograms.⁶ We will concentrate on one of the most common, namely kernel density estimation, which has recently become a standard method in explorative data analysis.⁷ The idea of kernel density estimators is based on Rosenblatt (1956) and Parzen (1962), and the simplest form is defined as
\[ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} W\!\left( \frac{x - X_i}{h} \right) \tag{59} \]
where X_i (i = 1, . . . , n) are the observations of the data set, h is the window width, equivalently called the smoothing parameter or bandwidth, and W = W(u) represents the kernel. Let $\hat{f}(x; h)$ be a kernel density estimate which uses data x and h as window width. Then Fig. 5 displays the number of modes in the Gaussian kernel density estimation in 1990 as a step function of the window width h, where the critical window widths are the points of discontinuity. From Fig. 5, it is obvious that inferences which can be drawn from the kernel estimator strongly depend on the choice of the window width h. Unfortunately, there is still no generally accepted approach to determine the “correct” window width. Recent approaches range from subjective choice to rather automatic ones, which try to define optimal criteria.⁸ Terrell (1990) assumes that most density estimates are based on subjective choice; thus the researcher chooses the window width that best fits his aims. On this basis, a lot of effort has been devoted to identifying appropriate, data-driven window-width selection methods.⁹ The most popular choice for the window width is the so-called optimal bandwidth introduced by Silverman (1986), which is defined as
\[ h = 1.06\, \sigma\, n^{-1/5} \tag{60} \]
where σ is the standard deviation of the sample.

⁵ See Silverman (1986), p. 1.
⁶ For a detailed overview, see for example Bean and Tsokos (1980) and Tapia and Thompson (1978).
⁷ See Turlach (1993), p. 1.
⁸ The reader who is interested in a brief summary of these methods is referred to Turlach (1993).
⁹ See, for example, Jones (1991), Silverman (1986), and many others.
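The estimator in Eq. 59, the rule-of-thumb window width in Eq. 60, and the dependence of the number of modes on h can be illustrated with a short Python sketch. It uses only the standard library and a small synthetic two-cluster sample, which is an assumption made for illustration and not the pstar data; counting modes on a fixed grid is likewise a simplification, and the function names are hypothetical.

```python
import math


def gaussian_kde(x, data, h):
    """Kernel density estimate of Eq. 59 with a standard normal kernel W."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
    return total / (n * h * math.sqrt(2 * math.pi))


def silverman_bandwidth(data):
    """Rule-of-thumb window width of Eq. 60: 1.06 * s * n**(-1/5)."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((xi - mean) ** 2 for xi in data) / (n - 1))
    return 1.06 * s * n ** (-0.2)


def count_modes(data, h, grid_size=400):
    """Count strict local maxima of the estimate on an evenly spaced grid."""
    lo, hi = min(data) - 3 * h, max(data) + 3 * h
    step = (hi - lo) / (grid_size - 1)
    f = [gaussian_kde(lo + i * step, data, h) for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1) if f[i - 1] < f[i] > f[i + 1])


# Synthetic two-cluster sample (an assumption; not the pstar data).
sample = [0.1 * k for k in range(-5, 6)] + [4 + 0.1 * k for k in range(-3, 4)]
h_rule = silverman_bandwidth(sample)
modes_small_h = count_modes(sample, 0.5)
modes_large_h = count_modes(sample, 3.0)
print(f"rule-of-thumb h = {h_rule:.3f}")
print(f"modes at h = 0.5: {modes_small_h}, modes at h = 3.0: {modes_large_h}")
```

A small window width keeps the two clusters apart, while a large one smooths them into a single peak, which is the step-function behavior described for Fig. 5.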
Fig. 5. Number of modes in the Gaussian kernel estimation in 1990

Fig. 6. Kernel density estimate in 1990

The use of Silverman’s optimal bandwidth for a kernel estimate for the year 1990 results in an optimal bandwidth h = 0.3183, which implies a kernel density estimate with one mode, therefore indicating unimodality. This is shown in Fig. 6. Another way of addressing the problem of the number of modes is to perform hypothesis testing, and therefore it is appropriate to construct a bootstrap test for multimodality, which tests the hypothesis that a Gaussian density estimate has one, two, or in general m modes. The approach of a bootstrap multimodality test is used because, for example, normal tests or permutation tests of multimodality are not efficient in this case. However, a bootstrap test is an effective way to test for multimodality. In the context of this work, suppose that the data on firms’ capital stocks for every year is an i.i.d. (independent and identically distributed) sample from an unknown distribution F with continuous density p(x). To obtain certain properties of the density, like the number of modes, one can apply the bootstrap multimodality test, which requires drawing bootstraps¹⁰ from the empirical distribution F_n.¹¹ Accordingly, the sample of firms’ capital stocks can be used to represent the empirical distribution. The problem considered in this section is testing the null hypothesis H0 that the density has one single mode, versus the alternative hypothesis H1 that it has two or more modes.

H0: f(x) has 1 mode versus H1: f(x) has more than 1 mode

To carry out this test, it is necessary to specify a test statistic. A reasonable choice is the critical window width $\hat{h}_1$, which is the smallest window width needed to obtain a kernel density estimate with one mode. A large value of $\hat{h}_1$ implies that a lot of smoothing has to be done to obtain a kernel estimate with one mode, and thus is evidence against H0. Next, one has to define the estimated distribution under the null hypothesis, which is an estimated null distribution for the test of H0. A reasonable choice may be $\hat{f}_0(x; \hat{h}_1)$, which is intuitively the density estimate that uses the smallest amount of smoothing among all estimates with one mode. The problem of using $\hat{f}_0(\cdot; \hat{h}_1)$ is that it artificially increases the variance of the bootstrap sample relative to the variance of the actual data set.
To avoid this problem, $\hat{f}_0(\cdot; \hat{h}_1)$ will be adjusted or rescaled to have the same variance as the bootstrap sample variance, and will be denoted as the smoothed density estimate $\hat{g}_0(\cdot; \hat{h}_1)$. Finally, it remains to assess the significance level of the observed value of $\hat{h}_1$. The bootstrap multimodality hypothesis test is based on the so-called achieved significance level ASL_boot,
\[ ASL_{boot} = P_{\hat{g}_0(\cdot; \hat{h}_1)}\left\{ \hat{h}_1^* > \hat{h}_1 \right\} \tag{61} \]
where $\hat{h}_1$ is fixed at its observed value from the data set, and $\hat{h}_1^*$ is the critical window width producing one mode of the bootstrap sample $\hat{x}_i^*$ (i = 1, . . . , n). To approximate the ASL_boot, it is necessary to draw the bootstrap from the smoothed density estimate $\hat{g}_0(\cdot; \hat{h}_1)$. The smoothed bootstrap data $\hat{x}_i^*$ (i = 1, . . . , n) will be obtained by drawing with replacement a sample $y_i^*$ (i = 1, . . . , n) from the actual data set, and then setting
\[ \hat{x}_i^* = \bar{y}^* + \left( 1 + \hat{h}_1^2 / \hat{\sigma}^2 \right)^{-1/2} \left( y_i^* - \bar{y}^* + \hat{h}_1 \varepsilon_i \right), \quad (i = 1, \ldots, n) \tag{62} \]
where $\bar{y}^*$ denotes the mean of $y_i^*$ (i = 1, 2, . . . , n), $\hat{\sigma}^2$ denotes the sample variance of $\hat{x}_i^*$ (i = 1, 2, . . . , n), and $\varepsilon_i$ are i.i.d. standard normal random variables. Now the ASL_boot is approximated by
\[ \widehat{ASL}_{boot} = \frac{\#\left\{ \hat{h}_1^*(b) \ge \hat{h}_1 \right\}}{B} \tag{63} \]

¹⁰ Formally, a bootstrap sample is constructed by randomly sampling n times with replacement from a data sample. The data sample, or the distribution from which the bootstraps are drawn, is denoted by the empirical distribution F_n, and the replacement of F by F_n is called the “plug-in” principle.
¹¹ It can be shown that F_n is a nonparametric maximum likelihood estimator of F, and therefore it is justified to estimate the true unknown distribution by the empirical distribution if no other information on F is available (for instance, that F belongs to a specific parametric class).
where B is the number of bootstrap replicates, #{·} denotes the number of elements of the set, and b = 1, 2, . . . , B. If, for instance, a significance level of 10% is used, then H0 will be rejected if $\widehat{ASL}_{boot}$ < 10%. Having described the basic ideas of the bootstrap multimodality test, we now apply it.

Applying the bootstrap multimodality test. H0: f(x) has 1 mode vs. H1: f(x) has two or more modes. Computing
\[ \widehat{ASL}_{boot} = \frac{\#\left\{ \hat{h}_1^*(b) \ge \hat{h}_1 \right\}}{B} \]
with 500 smoothed bootstraps of the original data on the logarithm of netcap in 1990, a significance level of 10%, and $\hat{h}_1$ = 0.35032 results in 435 smoothed bootstrap samples that have $\hat{h}_1^*(b) \ge \hat{h}_1$, and therefore
\[ \widehat{ASL}_{boot} = \frac{\#\left\{ \hat{h}_1^*(b) \ge \hat{h}_1 \right\}}{B} = \frac{435}{500} = 87\% \]
In line with the kernel density estimate, the test indicates that the null hypothesis cannot be rejected in favor of the hypothesis of a bimodal density. The $\widehat{ASL}_{boot}$ is about 87%, which supports the hypothesis of just one single peak in the true density. The main problem with using a (nonparametric) density estimator to study the long-run dynamics of firms, however, is that it does not consider the potential dynamics of firm distributions over time. In other words, although the true long-run distribution may indeed be bimodal, a density estimate in the year 1990, or even 2003, does not necessarily have to be bimodal because it may take some time for firms to group around two stable steady states. Therefore, a density estimate that is unimodal for a certain time period does not contradict the hypothesis of a future (long-run) bimodal firm size distribution. The next section will model the dynamics of firm size distribution and determine long-run distributions that may be stationary (with two steady states). The Markov chain approach will be used to remedy the latter disadvantage.
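The bootstrap multimodality test of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used for the reported results: the sample is synthetic, modes are counted on a fixed grid, the critical window width is located by bisection, and the variance in Eq. 62 is taken to be the variance of the data. The sketch also relies on the property that, for a Gaussian kernel, the number of modes does not increase with the window width, so a bootstrap sample has a critical width above ĥ1 exactly when it still shows more than one mode at ĥ1. All function names are hypothetical.

```python
import math
import random


def count_modes(data, h, grid_size=200):
    """Strict local maxima of a Gaussian kernel estimate on a grid.

    The normalizing constant is omitted because it does not affect modes.
    """
    lo, hi = min(data) - 3 * h, max(data) + 3 * h
    step = (hi - lo) / (grid_size - 1)
    f = [sum(math.exp(-0.5 * ((lo + i * step - x) / h) ** 2) for x in data)
         for i in range(grid_size)]
    return sum(1 for i in range(1, grid_size - 1) if f[i - 1] < f[i] > f[i + 1])


def critical_bandwidth(data, iterations=30):
    """Smallest window width giving one mode, located by bisection."""
    spread = max(data) - min(data)
    lo, hi = 1e-3 * spread, 2 * spread
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_modes(data, mid) > 1:
            lo = mid
        else:
            hi = mid
    return hi


def smoothed_bootstrap_sample(data, h1, rng):
    """One smoothed bootstrap sample of size n following Eq. 62."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # assumed: variance of the data
    y = [rng.choice(data) for _ in range(n)]
    y_bar = sum(y) / n
    shrink = (1 + h1 ** 2 / var) ** -0.5
    return [y_bar + shrink * (yi - y_bar + h1 * rng.gauss(0, 1)) for yi in y]


def bootstrap_asl(data, replicates=200, seed=0):
    """Approximate achieved significance level (Eq. 63) for H0: one mode."""
    rng = random.Random(seed)
    h1 = critical_bandwidth(data)
    exceed = sum(count_modes(smoothed_bootstrap_sample(data, h1, rng), h1) > 1
                 for _ in range(replicates))
    return h1, exceed / replicates


# Synthetic two-cluster sample (an assumption; not the pstar data).
sample = [0.1 * k for k in range(-5, 6)] + [4 + 0.1 * k for k in range(-3, 4)]
h1, asl = bootstrap_asl(sample)
print(f"critical window width = {h1:.3f}, approximate ASL = {asl:.3f}")
```

H0 would be rejected at the 10% level if the returned ASL were below 0.10. With real data the number of replicates and the grid size would need to be chosen with more care.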
6 Markov Chain Approach

Assume that the population of US manufacturing firms is classified into several discrete size classes (states) according to some criteria that represent capital stock. In this work, the value of a firm’s book assets (netcap) in relation to the average value of all firms will be used, although other variables such as investment could also be used, because in general they can be expected to be strongly correlated. The relative netcap is now defined as
\[ k_i(n) = \frac{K_i(n)}{\sum_i K_i(n)} \tag{64} \]
where k_i(n) denotes the relative netcap in year n, K_i(n) is the absolute value of netcap in year n, and the state space is considered as follows:
\[ 0 \le k \le \tfrac{1}{4}, \quad \tfrac{1}{4} < k \le \tfrac{1}{2}, \quad \tfrac{1}{2} < k \le 1, \quad 1 < k \le 2, \quad k > 2 \]
The main problem in applying the Markov chain approach is the determination of the corresponding transition probabilities. The transition probabilities and the probability distribution are not known, and accordingly they have to be estimated from the observed data. To address this issue, one considers the variable “relative netcap,” and interprets it as given realizations of a Markov chain. With the help of these realizations, it is possible to determine the maximum likelihood estimator of the true transition probabilities. The procedure of obtaining the maximum likelihood estimates of the true transition probabilities begins with the determination of so-called transition numbers. Transition numbers indicate the number of firms’ transitions from state i to state j between two successive time points. The realizations of transition numbers can be summarized in the form of a matrix, called a fluctuation matrix.
\[ F(n - 1, n) = \left[ f_{ij}(n - 1, n) \right]_{i,j \in S,\; n \in N} \]
For the purpose of illustration, the fluctuation matrix for the time period 1973–1974 is presented in Table 1.
Table 1. Example of a fluctuation matrix (rows: size class in 1973; columns: size class in 1974)

                  k ≤ 1/4   1/4 < k ≤ 1/2   1/2 < k ≤ 1   1 < k ≤ 2   k > 2
k ≤ 1/4               837              12             0           0       0
1/4 < k ≤ 1/2          12             177            12           0       0
1/2 < k ≤ 1             1               8            99          11       0
1 < k ≤ 2               0               0             5          78       7
k > 2                   0               0             0           2     129
To interpret these fluctuation matrices one might look, for example, at the first row of the fluctuation matrix F(1973, 1974). During the time period 1973 to 1974, 837 firms that had a relative netcap in the interval k ≤ 1/4 in 1973 remained there after 1 year. Twelve firms that had a relative netcap in the interval k ≤ 1/4 in 1973 moved to the interval 1/4 < k ≤ 1/2, and so on. As in Onatski (2003), we used a sample period of 18 years (1973–1990) to allow the sample to be a balanced panel. Now it is possible to assess the maximum likelihood estimator. The maximum likelihood estimator for the true transition probabilities on the basis of N realizations of a stationary Markov chain is given by
\[ \hat{p}_{ij} = \frac{\sum_{n=1}^{N} f_{ij}(n - 1, n)}{\sum_{n=1}^{N} f_{i}(n - 1, n)} = \frac{f_{ij}}{f_i} \tag{65} \]
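Eq. 65 amounts to pooling the transition counts over all years and dividing each row by its total. The Python sketch below applies this to the single fluctuation matrix for 1973–1974, with the counts as read from Table 1 (rows: size class in 1973, columns: size class in 1974). Because only one year of counts is used, the output is a one-year estimate and will differ from the pooled matrix P reported in the text. The function name is hypothetical.

```python
def estimate_transition_matrix(fluctuation_matrices):
    """Maximum likelihood estimate of Eq. 65 from yearly transition counts."""
    size = len(fluctuation_matrices[0])
    # Pool the counts f_ij(n - 1, n) over all available years n.
    pooled = [[sum(f[i][j] for f in fluctuation_matrices) for j in range(size)]
              for i in range(size)]
    # Divide each count by its row total f_i.
    return [[count / sum(row) for count in row] for row in pooled]


# Fluctuation matrix for 1973-1974 as read from Table 1.
f_1973_1974 = [
    [837, 12, 0, 0, 0],
    [12, 177, 12, 0, 0],
    [1, 8, 99, 11, 0],
    [0, 0, 5, 78, 7],
    [0, 0, 0, 2, 129],
]
p_hat = estimate_transition_matrix([f_1973_1974])
for row in p_hat:
    print(" ".join(f"{p:.3f}" for p in row))
```

Passing a list with one matrix per pair of successive years would give the pooled estimate over the whole sample period.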
It is worth noting that $\hat{p}_{ij}$ is the empirically derived relative frequency, and therefore is only an approximation of the true transition probabilities. However, Anderson and Goodman (1957) have shown that this approximation is indeed a maximum likelihood estimate for a first-order Markov chain.¹² Applying the maximum likelihood estimator results in the following transition matrix:
\[ P = \begin{pmatrix} 0.951 & 0.023 & 0.008 & 0.005 & 0.013 \\ 0.228 & 0.692 & 0.067 & 0.005 & 0.007 \\ 0.162 & 0.069 & 0.700 & 0.062 & 0.010 \\ 0.155 & 0.008 & 0.085 & 0.674 & 0.078 \\ 0.158 & 0.006 & 0.0006 & 0.032 & 0.799 \end{pmatrix} \]
The estimated transition matrix P makes it possible to determine the equilibrium distribution or limiting distribution of firms in the US manufacturing industry. To guarantee that an equilibrium exists, the transition matrix has to satisfy some properties which are presented in the following lemma.

¹² See Anderson and Goodman (1957), p. 90.
Lemma: An irreducible and aperiodic Markov chain with finite state space always has a limiting distribution which is a unique stationary distribution, and furthermore is independent of the initial distribution of the Markov chain.

On the basis of this lemma, it first has to be shown that the transition matrix is irreducible and aperiodic.

1. A Markov chain is called irreducible if all states of the chain communicate. Thus, it must be possible to get from every state to every other state in a finite number of steps. In other words, for each state pair (i, j) ∈ S × S there exists an integer n = n(i, j) ∈ N with $p^n_{ij} > 0$. Since every probability in the transition matrix P is strictly positive ($p^1_{ij} > 0$), it is obvious that the chain is irreducible.
2. Let the set $U_i$ contain the numbers of steps n > 0 after which state i returns to itself with positive probability, i.e., $p^n_{ii} > 0$. If the expression
\[ d(i) = \begin{cases} \infty & \text{if } U_i = \emptyset \\ \gcd \text{ of the set } U_i & \text{if } U_i \ne \emptyset \end{cases} \]
equals 1 or ∞, the state is denoted as aperiodic. It is apparent that in this case the Markov chain is aperiodic.

Since the transition matrix, i.e., the Markov chain, is irreducible and aperiodic, there must be a limiting distribution and a stationary distribution that are independent of the initial distribution. To determine the limiting distribution, one has to solve the following equations:
\[ \pi = \pi P \tag{66} \]
\[ \pi_1 + \pi_2 + \pi_3 + \pi_4 + \pi_5 = 1 \tag{67} \]
The solution of the equations results in a unique stationary distribution given by

π1 = 0.787, π2 = 0.071, π3 = 0.046, π4 = 0.030, π5 = 0.066

This equilibrium distribution is bimodal or, in the words of Quah,¹³ “twin-peaked,” since the probabilities of being in the extreme size groups in the equilibrium state distribution are larger than that of being in the middle states (which is sometimes referred to as thinning in the middle).¹⁴ In contrast, the results of the kernel density estimate indicated a unimodal distribution of US manufacturing firms. However, the application of the Markov chain approach shows that in the long run multiple steady states exist, with a large group at the low tail and a smaller group at the high tail. Thus, the empirical study shows that in the long run the unimodal distribution ceteris paribus tends to change to a bimodal one. This implies an increase in the degree of concentration, because there is a relatively small group (state 5: 7%) of firms having a capital stock that is at least twice as big as the average capital stock in the US manufacturing industry, and a relatively large group of firms (state 1: 79%) that only have a capital stock corresponding to less than a quarter of the average capital stock in the industry.

¹³ See, e.g., Quah (1997), p. 28.
¹⁴ See Quah (1993), p. 430.

With reference to the discrete Markov chain approach, researchers sometimes argue that the choice of the class-size intervals has an important influence on the corresponding ergodic distribution. For instance, if one group is by accident divided into two pieces by using different definitions or numbers of class-size intervals, then the bimodal ergodic distribution may turn into a trimodal distribution, thus leading to a totally different interpretation of the equilibrium state. A second problem associated with discrete Markov chains is that discretization can have great distorting effects on dynamics if the underlying variables (which in our case are firm asset sizes) are continuous.¹⁵ However, an approach called the stochastic kernel approach has recently been developed to overcome the shortcomings of discrete and arbitrarily defined class-size intervals. Informally, the idea of the stochastic kernel is to avoid the division into a discrete countable number of states, and to allow the number of states to tend to infinity and then to a continuum. Thus, the resulting transition matrix becomes a matrix with a continuum of rows and columns, or a stochastic kernel.
The dynamics of firms’ asset sizes are assumed to be governed by this stochastic kernel in a similar way to a first-order autoregression process. To get an adequate understanding of the stochastic kernel, consider the following description. Let R denote the real line and B the Borel σ-algebra on it. Furthermore, let μ and ν be probability measures on B. Then a stochastic kernel relating μ and ν is a mapping $M_{(\mu,\nu)}: (R, B) \to [0, 1]$ satisfying the following three conditions.
1. For all x ∈ R, the restriction $M_{(\mu,\nu)}(x, \cdot)$ is a probability measure.
2. For any A ∈ B, the restriction $M_{(\mu,\nu)}(\cdot, A)$ is B-measurable.
3. For any A ∈ B, $\mu(A) = \int M_{(\mu,\nu)}(x, A)\, d\nu(x)$.

¹⁵ See, e.g., Chung (1974).
Although the stochastic kernel is apparently the uncountable generalization of a matrix or, in our case, of the transition matrix, it is completely sufficient to describe the transitions from state x to any other portion of the underlying state space.¹⁶ Next, we will show how the stochastic kernel can be used to model the dynamics of firms’ asset sizes. Denoting the distribution of observations of firms’ assets at time t by λ_t, the stochastic kernel describes the law of motion, as stated above, through an operator M which maps the Cartesian product of the number of firms and the Borel set which can be measured on the space [0,1]. The simplest form of the law of motion is:¹⁷
\[ \lambda_{t+1}(A) = \int M_t(x, A)\, d\lambda_t(x) \tag{68} \]
Accordingly, if one interprets dν(x) as the density of λ_t and dμ(y) as the density of λ_{t+s}, then M_{t+s}(x, dy)dλ_t(x) has to be regarded as the conditional density of λ_{t+s} given λ_t. On the basis of this, the stochastic kernel at time t + s is equal to the transition matrix at t + s with an uncountably infinite number of rows and columns. Thus, each firm’s asset size value represents its own class-size group. The problem of arbitrary class-size intervals is solved because it is no longer necessary to construct artificial and subjective class-size intervals. Again, the stochastic kernel is estimated using the balanced panel of the pstar data set for the period 1973–1990. The stochastic kernel is estimated nonparametrically by applying a Gaussian kernel and the optimal window width, as suggested in Silverman (1986). To estimate the stochastic kernel, the joint distribution of the logarithm of the relative netcap is first derived. Subsequently, the implied marginal distribution in 1973 is determined by numerically integrating under this joint distribution. Finally, the stochastic kernel is obtained by dividing the joint distribution by the marginal distribution. The resulting stochastic kernel, which relates the distribution of log firm assets in 1973 to the distribution of log firm assets in 1990, is shown in Fig. 7. The stochastic kernel estimate confirms the results of the transition matrix and ergodic distribution obtained previously. Figure 7 clearly indicates that the probability mass has two peaks at the two ends of the mass. As stated above, this result gives empirical support to divergence and polarization, with a tendency of the diverging states to cluster at a high firm asset level or at a low firm asset level, respectively.

¹⁶ See Quah (1997), p. 46.
¹⁷ See Quah (1997), p. 47.
M. Kato et al.
Fig. 7. Stochastic kernel of log firm assets
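The three estimation steps described above (joint density, marginal by numerical integration, division) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins rather than the pstar panel, the bandwidth is the univariate rule of thumb from Silverman (1986) applied to each dimension separately, and the function names `silverman_bandwidth`, `joint_density`, `stochastic_kernel`, and `step` are hypothetical.

```python
import numpy as np

def silverman_bandwidth(z):
    """Univariate rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    sd = z.std(ddof=1)
    q75, q25 = np.percentile(z, [75, 25])
    return 0.9 * min(sd, (q75 - q25) / 1.34) * z.size ** (-0.2)

def gaussian_weights(grid, data, h):
    """Gaussian kernel evaluated at every (grid point, observation) pair."""
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2) / (h * np.sqrt(2.0 * np.pi))

def joint_density(x_start, x_end, grid):
    """Product-Gaussian estimate of the joint density on grid x grid.

    Rows index the starting value x, columns index the end value y.
    """
    kx = gaussian_weights(grid, x_start, silverman_bandwidth(x_start))
    ky = gaussian_weights(grid, x_end, silverman_bandwidth(x_end))
    return kx @ ky.T / x_start.size

def stochastic_kernel(joint, dx):
    """Divide the joint density by the start-year marginal (integrated over y)."""
    marginal = joint.sum(axis=1) * dx
    return joint / marginal[:, None], marginal

def step(kernel, density, dx):
    """Discretised law of motion: next(y) = sum over x of M(x, y) * density(x) * dx."""
    return (kernel * density[:, None]).sum(axis=0) * dx

# Synthetic stand-in for log relative firm assets in two years (NOT the chapter's data).
rng = np.random.default_rng(0)
x_start = rng.normal(size=300)
x_end = 0.8 * x_start + rng.normal(scale=0.5, size=300)

grid = np.linspace(-4.0, 4.0, 81)
dx = grid[1] - grid[0]
joint = joint_density(x_start, x_end, grid)
kernel, marginal_start = stochastic_kernel(joint, dx)
marginal_end = step(kernel, marginal_start, dx)
```

Each row of `kernel` integrates to one by construction, so it plays the role of one row of a transition matrix, and applying `step` to the start-year marginal returns the end-year marginal implied by the joint estimate.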
7 Conclusion
We have studied a simple dynamic investment decision problem for a firm where adjustment costs have capital size effects. We have shown that with this type of model, one can easily obtain multiple steady states and thresholds as well as a discontinuous policy function, giving rise to discontinuous behavior of investment. We studied the global dynamic properties of such a model by employing the Hamilton–Jacobi–Bellman method and dynamic programming, which helped us in the numerical detection of multiple equilibria, the thresholds, and the jump in investment. We also explored the model’s implications for the effects of aggregate demand, interest rates, and tax rates. Finally, an empirical study of the firm size distribution was provided using US firm-size data. We utilized two different approaches, kernel density estimation and a Markov chain transition matrix, to study an ergodic distribution. Our results suggest a twin-peak distribution of firm size in the long run, which can be viewed as empirical support of the existence of multiple steady states, as predicted in the analytical part of the chapter.
Model with Multiple Steady States
References
Abel AB (1982) Dynamic effects of permanent and temporary tax policies in a q model of investment. J Monetary Econ 9:353–373
Anderson TW, Goodman LA (1957) Statistical inference about Markov chains. Ann Math Stat 28:89–110
Bean SJ, Tsokos CP (1980) Developments in nonparametric density estimation. Int Stat Rev 48:267–287
Chung KL (1974) Elementary probability theory with stochastic processes. Springer-Verlag, New York
Eisner R, Strotz R (1963) Determinants of business investment. Impacts on monetary policy. Prentice Hall, Englewood Cliffs
Feichtinger G, Hartl FH, Kort P, Wirl F (2000) The dynamics of a simple relative adjustment–cost framework. Ger Econ Rev 2:255–268
Gould JP (1968) Adjustment costs in the theory of investment of the firm. Rev Econ Stud 35:47–55
Grüne L, Semmler W (2004) Using dynamic programming with adaptive grid scheme for optimal control problems in economics. J Econ Dyn Control 30:2427–2456
Hall BW, Hall RE (1993) The value and performance of US corporations. Brookings Pap Econ Activity 1:1–49
Hayashi F (1982) Tobin’s marginal q and average q: a neoclassical interpretation. Econometrica 50:213–224
Jones MC (1991) The roles of ISE and MISE in density estimation. Stat Prob Lett 12:51–56
Jorgenson DW (1963) Capital theory and investment behavior. Am Econ Rev 53:247–259
Lucas RE (1967) Adjustment costs and the theory of supply. J Polit Econ 75:321–334
Onatski A (2003) Is the firm size distribution twin-peaked? Economics Department, Columbia University, Mimeo
Parzen E (1962) On estimation of a probability density and mode. Ann Math Stat 33:1065–1076
Quah DT (1993) Empirical cross-section dynamics in economic growth. Eur Econ Rev 37:426–434
Quah DT (1997) Empirics for growth and distribution: stratification, polarization, and convergence clubs. J Econ Growth 2:27–59
Rosenblatt M (1956) Remarks on some nonparametric estimates of a density function. Ann Math Stat 27:823–837
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Summers LH (1981) Taxation and corporate investment: a q-theory approach. Brookings Pap Econ Activity 1:67–127
Tapia RA, Thompson JR (1978) Nonparametric density estimation. Johns Hopkins University Press, Baltimore
Terrell GR (1990) The maximal smoothing principle in density estimation. J Am Stat Assoc 85:470–477
Tobin J (1969) A general equilibrium approach to monetary theory. J Money Credit Banking 1:15–29
Treadway AB (1969) On rational entrepreneurial behavior and the demand for investment. Rev Econ Stud 36:227–239
Turlach BA (1993) Bandwidth selection in kernel density estimation: a review. Working Papers in Statistic and Oekonometrie, Humboldt University, Berlin
Uzawa H (1968) The Penrose effect and optimum growth. Econ Stud Q XIX:1–14
Uzawa H (1969) Time preference and the Penrose effect in a two-class model of economic growth. J Polit Econ 77:628–652
5. Significance of the Keynesian Legacy from a Theoretical Viewpoint: A High-Dimensional Macrodynamic Approach
Toichiro Asada
Summary. In this chapter we consider the significance of the “Keynesian Legacy” in modern macroeconomics from a theoretical point of view. We present the outline of a variant of the “high-dimensional dynamic Keynesian model” with debt accumulation without committing to the mathematical details, and investigate the economic implication of the model. Our model is designed to shed some light on the understanding of the Japanese economy in the 1990s and 2000s, and provide some hints on the policy prescriptions. Key words. Keynesian legacy, High-dimensional dynamic Keynesian model, Debt accumulation, Expectation formation, Stabilization policy
The principal view, which drives me most, is that the working of the real world capitalist economy can better be elucidated along the lines suggested by Keynes (1936) in his criticism of the Classical school rather than along the lines currently pursued by the rational expectationists and new Classicals as descendants of the old Classicals. It is especially his grand vision that the economy is unstable and can be ill-behaved that seems most important. (Hukukane Nikaido)1
1 Alternative Legacies of Modern Macroeconomics
It is well known that there are two alternative legacies in modern macroeconomics. One of them is the legacy of the “classical school” in a broad sense, which is now considered to be mainstream in the so-called standard textbooks of “advanced” macroeconomics.2 Another legacy is the “Keynesian” legacy in a broad sense, which is often considered to be out of fashion.
In the classical legacy, it is taken for granted that the full employment equilibrium will automatically be attained if wages and prices are flexible. According to this view, the cause of persistent unemployment must be wage or price rigidities, and a sufficient reduction of nominal wages and prices will contribute to increasing employment (cf. Asada 2004). In the classical tradition of thinking, investment is automatically adjusted to full employment saving, so that “effective demand” does not become the constraint on production and employment as long as wages and prices are flexible, which is nothing but a version of “Say’s law.” Needless to say, this vision of the working of the market economy is shared by the old classical economists such as M. Friedman, the new classical economists such as Lucas, Sargent, and Barro, and the real business cycle theorists such as Kydland and Prescott.3 Ironically enough, however, the same vision is shared by the textbook version of the Keynesian IS–LM model and the new Keynesian economists such as Mankiw and D. Romer, who concentrate on the microeconomic interpretation of the wage and price rigidities.4
On the other hand, there is a less fashionable Keynesian legacy in modern economic thinking. Without doubt, the origin of such a tradition is Keynes (1936). Some of the important works in this tradition are Harrod (1948), Steindl (1952), Robinson (1956), Kaldor (1960), Kalecki (1971), Tobin (1980), Minsky (1986), Okishio (1993), and Nikaido (1996).5 According to this tradition of thinking, the modern capitalist economy will be characterized as follows.
1. Production and employment are determined by the level of “effective demand,” and the most important determinant of effective demand is the investment expenditure of the firms.
2. Firms’ investment activities are independent of households’ saving activities.
Faculty of Economics, Chuo University, 742-1 Higashinakano, Hachioji, Tokyo 192-0393, Japan
1 Nikaido (1996), Introduction, p. xiv.
2 See Romer (1996) as a typical example of such textbooks.
3 See Blanchard and Fischer (1989) and Romer (1996) for a detailed exposition of such theories.
4 For the new Keynesian theories, see Mankiw and Romer (eds) (1991), Blanchard and Fischer (1989), and Romer (1996).
5 Here we included the important contributions to Keynesian macrodynamics by two distinguished Japanese mathematical economists, Nobuo Okishio and Hukukane Nikaido. Okishio, a pioneering mathematical Marxian economist, reformulated the Harrodian instability principle rigorously in the 1960s (for the collection of his papers, see Okishio 1993). On the other hand, after publishing pioneering mathematical works on Walrasian general equilibrium theory in the 1950s, Nikaido shifted his attention to the general equilibrium models of imperfect competition and the Keynesian, Harrodian, and Marxian disequilibrium macrodynamics in the 1970s and the 1980s. His original contributions in this area of research are collected in Nikaido (1996).
3. Fluctuations of investment cause fluctuations of the effective demand, which feed back to the investment fluctuations through the investment function and money market. These interactions between investment and effective demand cause the endogenous business cycle.
4. At almost every phase of the business cycle, labor is under-employed (there exists involuntary unemployment).
5. Flexibility of wages and prices may destabilize rather than stabilize the economy, so that wage and price flexibilities are not sufficient conditions for the attainment of full employment equilibrium.6
6. Monetary factors such as money supply, finance, and debt are very important aspects which affect the characteristics of the economic performance in the long run as well as in the short run. This means that money is not neutral even in the long run.
7. Subjective factors such as the expectation formation by the public and the credibility of monetary/fiscal policies are often crucial for the effectiveness of the economic policy.
We assert that this Keynesian tradition of thinking is important even if “it is not the fashion of the day” (Flaschel et al. 2003), because it still provides us with some important insight for realistic modeling of the working of the modern capitalist economy, such as the Japanese economy in the 1990s and 2000s, and it also provides us with some important hints on the policy prescription. In this chapter, we present the outline of a theoretical model in this Keynesian legacy, and consider the significance of such an approach to macroeconomic theorizing. The model presented here is based on the model in Asada (2006a), which is a variant of the “high-dimensional dynamic Keynesian model” developed by Chiarella and Flaschel (2000), Chiarella et al. (2001), Asada et al. (2003), Asada (2006b), etc.
This approach is based on mathematical modeling by means of a system of nonlinear differential equations with many variables, in order to capture the several important destabilizing positive feedback mechanisms as well as the stabilizing negative feedback mechanisms which are embedded in the economy.7 In particular, our formulation stresses the influence of financial factors such as debt accumulation and the expectation formation on the performance of the macroeconomic system and the effectiveness of the monetary/fiscal stabilization policies by the government and the central bank. We hope that this model can shed some light on the understanding of the recent Japanese and other economies.
6 Keynes (1936) wrote as follows. “The depressing influence on entrepreneurs of their greater burden of debt may partly offset any cheerful reactions from the reduction of wages. Indeed if the fall of wages and prices goes far, the embarrassment of those entrepreneurs who are heavily indebted may soon reach the point of insolvency,—with severely adverse effect on investment.” (Keynes 1936, p. 264) “It would be much better that wages should be rigidly fixed and deemed incapable of material changes, than that depressions should be accompanied by a gradual downward tendency of money wages.” (Keynes 1936, p. 265) “There is, therefore, no ground for the belief that a flexible wage policy is capable of maintaining a state of continuous full employment; . . . The economic system cannot be made self-adjusting along these lines.” (Keynes 1936, p. 267)
2 A High-Dimensional Dynamic Keynesian Model with Debt Accumulation
The model formulated by Asada (2006a) consists of the following system of equations, which is organized into seven subsystems.8
1. Subsystem of capital and debt-accumulation dynamics
$$\dot d = \phi(g) - s_f(r - id) - (g + \pi)d \tag{1}$$
$$i = \rho + \xi(d); \quad \xi \geq 0,\ \ \xi'(d) > 0 \ \text{for } d > 0,\ \ \xi'(d) < 0 \ \text{for } d < 0 \tag{2}$$
$$g = g(r, \rho - \pi^e, d); \quad g_r = \partial g/\partial r > 0,\ \ g_{\rho - \pi} = \partial g/\partial(\rho - \pi^e) < 0,\ \ g_d = \partial g/\partial d < 0 \tag{3}$$
2. Subsystem of effective demand and goods-market dynamics
$$\dot y = \alpha(c + \phi(g) + v - y); \quad \alpha > 0;\ \ c = C/K = (C_w + C_r)/K,\ \ \phi(g) = I/K,\ \ v = G/K \tag{4}$$
$$C_w = W - T_w = Y - P - T_w \tag{5}$$
$$C_r = (1 - s_r)\{(1 - s_f)P + \rho(B/p) + i(D/p) - T_r\}; \quad 0 < s_r \leq 1,\ \ 0 < s_f \leq 1 \tag{6}$$
3. Subsystem of labor market dynamics
$$\dot e/e = \dot y/y + g - (n_1 + n_2) = \dot y/y + g - n \tag{7}$$
[...]; $\varepsilon > 0$, $0 < \bar e < 1$ (8)
4. [...]
[...] (9)
$$r = P/K = \beta Y/K = \beta y; \quad 0 < \beta = 1 - (1/z) < 1 \tag{10}$$
5. Subsystem of monetary sector dynamics
$$\rho = \rho(y, m) = \begin{cases} \rho_0 + (h_1 y - m)/h_2 & \text{if } h_1 y - m \geq 0 \\ \rho_0 & \text{if } h_1 y - m < 0 \end{cases} \tag{11}$$
$$\dot m/m = \mu - \pi - g \tag{12}$$
6. Subsystem of monetary and fiscal policies
$$b = B/pK = \text{constant} \tag{13}$$
$$t_r = T_r/K = \text{constant} \tag{14}$$
$$v = t_w + t_r + \mu m + (g + \pi - \rho)b \tag{15}$$
$$\mu = \mu_0 + \delta(\mu_0 - n - \pi); \quad \mu_0 > 0,\ \ \delta \geq 0 \tag{16}$$
7. Subsystem of price expectation formation dynamics
$$\dot\pi^e = \gamma\{\theta(\mu_0 - n - \pi^e) + (1 - \theta)(\pi - \pi^e)\}; \quad \gamma > 0,\ \ 0 \leq \theta \leq 1 \tag{17}$$
The meanings of the symbols are as follows. $Y$ = real output (real national income). $K$ = real private capital stock. $y = Y/K$ = output–capital ratio, which is supposed to be proportional to the “rate of capacity utilization.” $D$ = nominal stock of firms’ private debt. $B$ = nominal stock of public debt (public bond). $p$ = price level. $w$ = nominal wage rate. $\pi = \dot p/p$ = rate of price inflation. $\pi^e$ = expected rate of price inflation. $d = D/pK$ = private debt–capital ratio. $b = B/pK$ = public debt–capital ratio. $M$ = nominal money supply. $\mu = \dot M/M$ = growth rate of nominal money supply. $m = M/pK$ = money–capital ratio. $g = \dot K/K$ = rate of capital accumulation. $I = \phi(g)K$ = real investment expenditure, including adjustment cost. $\phi(g)$ = adjustment cost function of investment, introduced by Uzawa (1969), with the properties $\phi'(g) \geq 1$, $\phi''(g) \geq 0$. $G$ = real government expenditure. $C_w$ = workers’ real consumption expenditure. $C_r$ = capitalists’ real consumption expenditure. $\rho$ = nominal rate of interest of public bond. $i$ = nominal rate of interest that is applied to firms’ private debt. $W$ = pretax real-wage income. $P$ = pretax real profit. $r = P/K$ = pretax rate of profit. $T_w$ = real income tax on workers. $T_r$ = real income tax on capitalists (we neglect corporate tax for simplicity). $t_w = T_w/K$. $t_r = T_r/K$. $N$ = labor employment. $N^s$ = labor supply. $n_1$ = growth rate of labor supply = constant > 0. $a$ = average labor productivity. $n_2 = \dot a/a$ = growth rate of average labor productivity (rate of technical progress) = constant > 0.9 $n = n_1 + n_2$ = natural rate of growth or “potential” rate of growth, which is assumed to be a positive constant.10
Next, we shall briefly interpret how to derive the above equations. We assume that there are no issues of new shares, and we neglect the repayment of the principal of the private debt for simplicity. In this case, the budget constraint of the private firms becomes
$$\dot D = \phi(g)pK - s_f(rpK - iD) \tag{18}$$
where $s_f \in (0,1]$ is the rate of internal retention of firms, which is assumed to be a constant parameter, for simplicity. Next, differentiating the definitional equation $d = D/pK$ with respect to time, we have
$$\dot d/d = \dot D/D - \dot p/p - \dot K/K = \dot D/D - \pi - g \tag{19}$$
Substituting Eq. 18 into Eq. 19, we obtain Eq. 1. Equation 2 captures the fact that private debt and public debt are imperfect substitutes, and the interest rate differential between these assets reflects the difference in the degree of risk between them. Equation 3 is the investment function with a “Fisher debt effect.” We can derive this type of investment function from the optimizing behavior of firms by assuming both Uzawa’s (1969) hypothesis of increasing cost (the Penrose effect) and Kalecki’s (1937) hypothesis of increasing risk of investment (cf. Asada 2001). Equation 4 is the Keynesian/Kaldorian quantity adjustment process of the goods market disequilibrium. In this formulation, we neglect international trade for simplicity. Equations 5 and 6 are Kaleckian consumption functions of workers and capitalists, respectively (cf. Kalecki 1971). In these formulations, it is assumed that workers spend all their income, and capitalists save part of their income, where $s_r \in (0,1]$ is the capitalists’ average propensity to save, which is assumed to be a constant. Capitalists buy public and private bonds out of their savings. By definition, we have
$$e = N/N^s = yK/(aN^s) \tag{20}$$
9 This means that we are assuming “Harrodian neutral” exogenous technical progress.
10 It is likely that the natural rate of growth is an increasing function of the rate of employment, as the experience of the Japanese and other economies suggests. In this case, we have a Keynesian endogenous growth model with variable rate of employment, in contrast to the neoclassical full employment endogenous growth models represented by Barro and Sala-i-Martin (1995). We can easily see, however, that the qualitative conclusion is not much affected as long as the sensitivity of the natural rate of growth with respect to the changes of the rate of employment is not extremely large, so that we do not introduce this kind of complication in this chapter. For another formulation of the Keynesian endogenous growth theory, see Pally (1996).
Differentiating this equation with respect to time, we obtain Eq. 7. Equation 8 is nothing but a standard expectation-augmented wage Phillips curve. Equation 9 is the Kaleckian postulate of the mark-up pricing rule of the imperfectly competitive economy, where $z$ is the mark-up that reflects the “degree of monopoly” of the economy (cf. Kalecki 1971). In this case, we can derive Eq. 10, which implies that the rate of profit is proportional to the rate of capacity utilization. Incidentally, we can derive the following expectation-augmented price Phillips curve from Eqs. 8 and 9.
$$\pi = \varepsilon(e - \bar e) + \pi^e \tag{21}$$
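The derivations mentioned so far can be written out as a short sketch. The first two lines use only Eqs. 18–20 and the definitions of $g$, $n_1$, and $n_2$. The third line rests on two assumptions made here purely for illustration, since the explicit forms are not reproduced above: that Eq. 8 is a wage Phillips curve of the form $\dot w/w = \varepsilon(e - \bar e) + n_2 + \pi^e$, and that Eq. 9 is the mark-up rule $p = zw/a$ with constant $z$.

```latex
\begin{align*}
\dot d &= d\left(\frac{\dot D}{D} - \pi - g\right)
        = \frac{\dot D}{pK} - (g + \pi)d
        = \phi(g) - s_f(r - id) - (g + \pi)d, \\
\frac{\dot e}{e} &= \frac{\dot y}{y} + \frac{\dot K}{K} - \frac{\dot a}{a} - \frac{\dot N^s}{N^s}
        = \frac{\dot y}{y} + g - n_2 - n_1, \\
\pi &= \frac{\dot p}{p} = \frac{\dot w}{w} - \frac{\dot a}{a}
        = \{\varepsilon(e - \bar e) + n_2 + \pi^e\} - n_2
        = \varepsilon(e - \bar e) + \pi^e .
\end{align*}
```

Under the same assumed mark-up rule, the wage share is $wN/(pY) = 1/z$, so the profit share is $1 - 1/z$, which matches the definition of $\beta$ in Eq. 10.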
Equation 11 is the standard “LM equation” describing the equilibrium condition for the money market, which can be derived as follows. Following Asada et al. (2003), we specify the equilibrium condition for the money market as
$$M = h_1 pY + (\rho_0 - \rho)h_2 pK; \quad h_1 > 0,\ \ h_2 > 0,\ \ \rho_0 \geq 0 \tag{22}$$
The left-hand side of this equation is the nominal money supply, and the right-hand side is a type of Keynesian nominal money demand function, where $\rho_0$ is the lower bound of the nominal interest rate of the government bond. Solving this equation with respect to $\rho$, we obtain Eq. 11. Formally, the case of $\rho = \rho_0$ corresponds to the case of $h_2 \to +\infty$, which is nothing but the case of the so-called “liquidity trap.” By differentiating the definitional equation $m = M/pK$ with respect to time, we obtain Eq. 12. Equations 13 and 14 are simplifying assumptions about the government’s attitude on fiscal policy.11 We can derive Eq. 15 as follows. The budget constraint of the “consolidated government,” including the central bank, can be written as
$$\dot M + \dot B = pG + \rho B - pT = pG + \rho B - p(T_w + T_r) \tag{23}$$
which means that the government deficit must be financed through the issue of new money or new bonds. From this equation, we have
$$\mu m + \mu_B b = v + \rho b - (t_w + t_r) \tag{24}$$
where $\mu_B = \dot B/B$. From Eq. 13, we have
$$\dot b/b = \dot B/B - \dot p/p - \dot K/K = \mu_B - \pi - g = 0 \tag{25}$$
Substituting Eq. 25 into Eq. 24, we obtain Eq. 15. Equation 16 formalizes the monetary policy rule of the central bank. If $\delta > 0$, we can consider this monetary policy rule as a type of “inflation targeting rule” (cf. Krugman 1998; Asada 2006b). The parameter $\mu_0$ is the long-run target growth rate of nominal money supply, and $\mu_0 - n$ is the long-run target rate of price inflation. It is supposed that the central bank announces this target rate to the public, and adjusts the growth rate of nominal money supply towards the realization of this target. It is worth noting that in our formulation, the monetary policy of the central bank and the fiscal policy of the government are closely related to each other through Eq. 15, which is derived from the budget constraint of the “consolidated government,” so that it is impossible to separate monetary and fiscal policies clearly. In this case, it is meaningless to argue which policy is more effective in stabilizing the economy.
Equation 17 formalizes the price expectation formation hypothesis, which is a mixture of a sort of “forward looking” and “backward looking” (adaptive) expectation formation (cf. Asada et al. 2003; Asada 2006b). The case where $\theta = 0$ corresponds to the case of purely adaptive or backward-looking expectation. In the case where $\theta = 1$, the expected rate of inflation is adjusted toward the target rate of inflation that is announced by the central bank. This means that the public believes that the monetary policy commitment by the central bank is highly credible. It may be said that in this case the price expectation formation by the public is forward-looking in a sense.12 The specification of price expectation formation closes our model.
We can reduce the system of Equations 1–17 to the following five-dimensional system of nonlinear differential equations, which we call the “fundamental dynamical equations” of our system.
$$\begin{aligned}
\text{(i)}\quad \dot d &= \phi(g(\beta y, \rho(y,m) - \pi^e, d)) - s_f\{\beta y - i(\rho(y,m), d)d\} \\
&\quad - \{g(\beta y, \rho(y,m) - \pi^e, d) + \varepsilon(e - \bar e) + \pi^e\}d = F_1(d, y, e, \pi^e, m) \\
\text{(ii)}\quad \dot y &= \alpha[\phi(g(\beta y, \rho(y,m) - \pi^e, d)) + \{\mu_0 + \delta(\mu_0 - n - \varepsilon(e - \bar e) - \pi^e)\}m \\
&\quad + \{g(\beta y, \rho(y,m) - \pi^e, d) + \varepsilon(e - \bar e) + \pi^e - \rho(y,m)\}b \\
&\quad + (1 - s_r)\{\rho(y,m)b + i(\rho(y,m), d)d\} - \{s_f + (1 - s_f)s_r\}\beta y + s_r t_r] = F_2(d, y, e, \pi^e, m; \mu_0, \delta) \\
\text{(iii)}\quad \dot e &= e[F_2(d, y, e, \pi^e, m; \mu_0, \delta)/y + g(\beta y, \rho(y,m) - \pi^e, d) - n] = F_3(d, y, e, \pi^e, m; \mu_0, \delta) \\
\text{(iv)}\quad \dot\pi^e &= \gamma\{\theta(\mu_0 - n - \pi^e) + (1 - \theta)\varepsilon(e - \bar e)\} = F_4(e, \pi^e; \mu_0, \theta, \gamma) \\
\text{(v)}\quad \dot m &= m[\mu_0 + \delta\{\mu_0 - n - \varepsilon(e - \bar e) - \pi^e\} - \varepsilon(e - \bar e) - \pi^e \\
&\quad - g(\beta y, \rho(y,m) - \pi^e, d)] = F_5(d, y, e, \pi^e, m; \mu_0, \delta)
\end{aligned} \tag{26}$$
11 A more complicated, but realistic, assumption on the taxation will be $t_r = \tau_r\{(1 - s_r)\beta y + \rho b + id\}$, where $0 < \tau_r < 1$, which implies that $t_r$ becomes an increasing function of the capitalists’ income per capital stock. We can show that this reformulation has a destabilizing rather than a stabilizing effect on the system.
12 However, the use of the term “forward-looking” in this context may be somewhat misleading, because in our model perfect foresight is not necessarily postulated even if $\theta = 1$, while the term forward-looking is usually used in the context of perfect foresight or rational expectation models. Reiner Franke pointed out this fact to the author.
3 Summary of the Analysis
In this section, we summarize the results of a mathematical analysis by Asada (2006a) without committing to the mathematical details. The “long-run equilibrium” solution $(d^*, y^*, e^*, \pi^{e*}, m^*)$ of Eq. 26 that satisfies the relationship
$$\dot d = \dot y = \dot e = \dot\pi^e = \dot m = 0 \tag{27}$$
is determined by the following system of equations corresponding to the given parameter value of monetary policy $\mu_0$:
$$\begin{aligned}
\text{(i)}\quad & \phi(n) - s_f[\beta y - \{\rho(y,m) + \xi(d)\}d] - \mu_0 d = 0 \\
\text{(ii)}\quad & \phi(n) + \mu_0 m + \{\mu_0 - \rho(y,m)\}b + (1 - s_r)[\rho(y,m)b + \{\rho(y,m) + \xi(d)\}d] \\
& \quad - \{s_f + (1 - s_f)s_r\}\beta y + s_r t_r = 0 \\
\text{(iii)}\quad & g(\beta y, \rho(y,m) - \mu_0 + n, d) = n \\
\text{(iv)}\quad & e = \bar e \\
\text{(v)}\quad & \pi = \pi^e = \mu_0 - n
\end{aligned} \tag{28}$$
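As a sketch of why Eq. 28 iii–v hold, one can set the right-hand sides of Eq. 26 iii–v to zero and use $\pi = \varepsilon(e - \bar e) + \pi^e$ from Eq. 21. The steps below assume $m \neq 0$, $e \neq 0$, $\gamma > 0$, and $\varepsilon > 0$; they are meant as an illustration and do not reproduce the proof in Asada (2006a).

```latex
\begin{align*}
\dot y = \dot e = 0 \ &\Rightarrow\ g = n, \\
\dot m = 0 \ &\Rightarrow\ \mu_0 + \delta(\mu_0 - n - \pi) - \pi - n
             = (1 + \delta)(\mu_0 - n - \pi) = 0
             \ \Rightarrow\ \pi = \mu_0 - n, \\
\dot\pi^e = 0 \ &\Rightarrow\ \theta(\mu_0 - n - \pi^e) + (1 - \theta)(\pi - \pi^e)
             = \mu_0 - n - \pi^e = 0
             \ \Rightarrow\ \pi^e = \mu_0 - n, \\
\pi = \pi^e \ &\Rightarrow\ \varepsilon(e - \bar e) = 0 \ \Rightarrow\ e = \bar e .
\end{align*}
```

Substituting $g = n$ and $\pi = \pi^e = \mu_0 - n$ into Eq. 26 i and ii then gives Eq. 28 i and ii, since $g + \pi = \mu_0$ and $g + \pi - \rho = \mu_0 - \rho$ at this point.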
It follows from Eq. 28 iii and iv that at the long-run equilibrium point, the rate of capital accumulation is equal to the “natural rate of growth” ($n$), and the rate of employment is equal to the “natural rate of employment” ($\bar e$), both of which are assumed to be exogenously given. Equation 28 v means that the equilibrium rate of price inflation is equal to the difference between the target growth rate of the nominal money supply and the natural rate of growth. Therefore, it seems at first glance that the classical postulate of the “neutrality of money” applies to the long-run equilibrium of our model. However, we can argue that this first impression is wrong for the following reasons. First, the equilibrium values $(d^*, y^*, m^*)$ depend on the value of the monetary policy parameter $\mu_0$. Second, the choice of $\mu_0$ is not irrelevant to the existence of the long-run equilibrium solution. The equilibrium real rate of interest $(\rho - \pi^e)^*$ must satisfy the following inequality, since $\rho$ cannot be lower than its lower bound $\rho_0$.
$$(\rho - \pi^e)^* = \rho(y^*, m^*) + n - \mu_0 \geq \rho_0 + n - \mu_0 \tag{29}$$
It is quite likely that in the long-run equilibrium a relatively low real rate of interest is required to support the realization of the natural rate of growth, because the economically feasible ranges of the variables $y$ and $d$ are restricted. This means that the long-run equilibrium will not exist if the target growth rate of nominal money supply $\mu_0$ is too small to support the
natural rate of growth. Therefore, money is not neutral even in the long run in our model.
In our model, the parameter values $\delta$, $\theta$, $\gamma$, $\alpha$, and $\varepsilon$ do not affect the long-run equilibrium values of the main endogenous variables. However, these parameter values do affect the dynamic stability/instability of the long-run equilibrium point. In fact, Asada (2006a) proved the following proposition mathematically under some reasonable assumptions in the case of the “liquidity trap” $\rho(y, m) = \rho_0$ ($h_2 \to +\infty$).13
Proposition (i): Suppose that the parameter value $\delta$ is relatively small, which means that the monetary/fiscal policy is relatively inactive. Then the high speed of the quantity adjustment in the goods market (quantity flexibility reflected by a large value of $\alpha$) and the high speed of the wage adjustment in the labor market (wage/price flexibility reflected by a large value of $\varepsilon$) tend to destabilize the equilibrium point of Eq. 26.
Proposition (ii): Suppose that the monetary/fiscal policy is relatively inactive (the parameter value $\delta$ is relatively small) and the price expectation formation by the public is strongly backward-looking (the parameter value $\theta$ is sufficiently close to zero). Then, the high speed of expectation adjustment (expectation adjustment flexibility reflected by a large value of $\gamma$) tends to destabilize the equilibrium point of Eq. 26.
Proposition (iii): Suppose that the monetary policy commitment by the central bank is very credible (the parameter value $\theta$ is sufficiently close to 1). Then, the equilibrium point of Eq. 26 is unstable for all sufficiently small values of the monetary policy parameter $\delta \geq 0$, and it is locally stable for all sufficiently large values of $\delta > 0$.
Proposition (iv): Suppose that the parameter value $\theta$ is sufficiently close to 1. Then at the intermediate range of the monetary policy parameter $\delta > 0$, endogenous cyclical fluctuations occur.
Proposition (iii) means that the central bank can stabilize the intrinsically unstable economy by adopting a sufficiently active monetary policy rule if the inflation targeting by the central bank is highly “credible.” Proposition (iv) means that at some range of the parameter values, an endogenous business cycle occurs.
13 We shall try to present the economic interpretation of this proposition in Sect. 5.
4 A Numerical Example
Next, we shall present a numerical example to illustrate the analytical results of the previous section. We assume the following parameter values and functional forms.14
$$\begin{gathered}
s_f = s_r = 1,\ \ \alpha = \beta = \gamma = b = t_r = 0.2,\ \ \varepsilon = 0.1,\ \ i = \rho + d^2,\ \ \rho = \rho_0 = 0.01,\ \ \phi(g) = g,\ \ \bar e = 0.97, \\
n_1 = 0.03,\ \ n_2 = 0.02,\ \ n = n_1 + n_2 = 0.05,\ \ \mu_0 = 0.09,\ \ \mu_0 - n = 0.04
\end{gathered} \tag{30}$$
$$g = 0.1\{1.8y^5 - (\rho - \pi^e) - 0.9d - 0.19\} + n = 0.18y^5 + 0.1\pi^e - 0.09d + 0.03 \tag{31}$$
We interpret $100\rho$, $100\pi^e$, $100n$, $100\mu_0$, and $100(\mu_0 - n)$ as the annual percentages of the nominal interest rate of government bond, the expected rate of price inflation, the natural rate of growth, the target growth rate of the nominal money supply, and the target rate (equilibrium rate) of price inflation, respectively. We also assume the following initial conditions of the main variables:
$$d(0) = 0.21,\ \ y(0) = 0.18,\ \ e(0) = 0.92,\ \ \pi^e(0) = 0.01,\ \ m(0) = 0.23 \tag{32}$$
In addition, we modify the model by introducing the following natural nonlinearities, which reflect the fact that the capacity utilization and the rate of employment cannot exceed their exogenously given upper bounds $y_{\max} = 0.32$ and $e_{\max} = 1$.
$$\dot y = \begin{cases} F_2(d, y, e, \pi^e, m; \mu_0, \delta) & \text{if } 0 < y < y_{\max} = 0.32 \\ \min[F_2(d, y, e, \pi^e, m; \mu_0, \delta), 0] & \text{if } y = y_{\max} = 0.32 \end{cases} \tag{33}$$
$$\dot e = \begin{cases} F_3(d, y, e, \pi^e, m; \mu_0, \delta) & \text{if } 0 < e < e_{\max} = 1 \\ \min[F_3(d, y, e, \pi^e, m; \mu_0, \delta), 0] & \text{if } e = e_{\max} = 1 \end{cases} \tag{34}$$
We then obtain Figs. 1 and 2, which illustrate some alternative time-paths of the variables $e$, $d$, and $\pi$. In these figures, we consider the following three alternative cases, where $t$ denotes the “time period,” and a unit of time is supposed to be a year.15
Case A: $\theta = \delta = 0$ for all $t \geq 0$
Case B: $\theta = 0.8$, $\delta = 0$ for all $t \geq 0$
Case C: $\theta = 0.8$, $\delta = 0$ for $0 \leq t < 5$, and $\theta = 0.8$, $\delta = 0.3$ for $t \geq 5$
14 This numerical example is based on Asada (2006a). The purpose of our numerical simulation is not to calibrate the model, but to illustrate the qualitative conclusion which can be obtained analytically.
15 We adopted Euler’s algorithm with the time interval $\Delta t = 0.1$ (years) for computer simulations.
Fig. 1. Alternative time paths of e (Cases A, B, and C; horizontal axis: time, 0 to 20; vertical axis: e)
Fig. 2. Alternative time paths of d and π (Cases A, B, and C; horizontal axis: time, 0 to 20; vertical axis: d, π)
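The simulation set-up described in this section (the five equations of Eq. 26 in the liquidity-trap case, the upper bounds of Eqs. 33 and 34, and Euler steps of 0.1 years) can be sketched as below. This is a hedged illustration only: the parameter values are copied from Eqs. 30 to 32 as printed here, the names `rhs`, `simulate`, and `P` are hypothetical, the clipping of Euler overshoot at the bounds is an added discretisation choice, and no attempt is made to reproduce the paths in Figs. 1 and 2.

```python
import numpy as np

# Values as printed in Eq. 30 (treated as illustrative, not as a calibration).
P = dict(s_f=1.0, s_r=1.0, alpha=0.2, beta=0.2, gamma=0.2, b=0.2, t_r=0.2,
         eps=0.1, rho0=0.01, e_bar=0.97, n=0.05, mu0=0.09)
Y_MAX, E_MAX = 0.32, 1.0
X0 = (0.21, 0.18, 0.92, 0.01, 0.23)   # d, y, e, pi_e, m from Eq. 32

def rhs(state, theta, delta, p=P):
    """F1..F5 of Eq. 26 with rho fixed at rho0 (liquidity trap) and phi(g) = g."""
    d, y, e, pi_e, m = state
    rho = p["rho0"]
    i = rho + d ** 2                                   # Eq. 2 with xi(d) = d^2
    pi = p["eps"] * (e - p["e_bar"]) + pi_e            # Eq. 21
    g = 0.18 * y ** 5 + 0.1 * pi_e - 0.09 * d + 0.03   # Eq. 31
    mu = p["mu0"] + delta * (p["mu0"] - p["n"] - pi)   # Eq. 16
    r = p["beta"] * y                                  # Eq. 10
    f1 = g - p["s_f"] * (r - i * d) - (g + pi) * d
    f2 = p["alpha"] * (g + mu * m + (g + pi - rho) * p["b"]
                       + (1 - p["s_r"]) * (rho * p["b"] + i * d)
                       - (p["s_f"] + (1 - p["s_f"]) * p["s_r"]) * r
                       + p["s_r"] * p["t_r"])
    f3 = e * (f2 / y + g - p["n"])
    f4 = p["gamma"] * (theta * (p["mu0"] - p["n"] - pi_e) + (1 - theta) * (pi - pi_e))
    f5 = m * (mu - pi - g)
    return np.array([f1, f2, f3, f4, f5])

def simulate(theta, delta_path, horizon=20.0, dt=0.1, x0=X0):
    """Explicit Euler integration; delta_path(t) returns the policy parameter at time t."""
    steps = int(round(horizon / dt))
    path = np.empty((steps + 1, 5))
    path[0] = x0
    for k in range(steps):
        dx = rhs(path[k], theta, delta_path(k * dt))
        # Eqs. 33-34: at an upper bound only non-positive changes are allowed.
        if path[k, 1] >= Y_MAX:
            dx[1] = min(dx[1], 0.0)
        if path[k, 2] >= E_MAX:
            dx[2] = min(dx[2], 0.0)
        nxt = path[k] + dt * dx
        nxt[1] = min(nxt[1], Y_MAX)   # clip Euler overshoot (added assumption)
        nxt[2] = min(nxt[2], E_MAX)
        path[k + 1] = nxt
    return path

case_a = simulate(0.0, lambda t: 0.0)
case_b = simulate(0.8, lambda t: 0.0)
case_c = simulate(0.8, lambda t: 0.3 if t >= 5 else 0.0)
```

Each returned array has one row per time step and columns ordered as d, y, e, π^e, m; inflation itself can be recovered from the e and π^e columns through Eq. 21. Passing the policy parameter as a function of time keeps the regime switch of Case C outside the integrator.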
In Case A, the price expectation formation is purely backward-looking and the monetary/fiscal policy is inactive. In this case, the initial high debt–capital ratio causes serious debt deflation. Even in this case, the economy begins to recover endogenously as the real burden of debt decreases because of the sharp decline of debt-financed investment expenditure. However, the lower turning point comes too late because the serious deflation prevents the real debt burden from reducing sufficiently to induce a sufficient increase of investment expenditure. In addition, the capacity utilization reaches its upper bound $y_{\max} = 0.32$ at the time period $t = 19$, and the economic recovery fails at that time.
In Case B, the price expectation formation is considerably forward-looking in a sense, but the monetary/fiscal policy is inactive. Even if the monetary/fiscal stabilization policy is inactive, in this case the speed of economic recovery is much higher than that in Case A because of the movement of price inflation towards the target rate, which helps to reduce the real debt burden sufficiently to induce a sufficient increase of investment expenditure. In this case, at the time period $t = 21$, the capacity utilization reaches its upper bound and then an economic recession is triggered.
In Case C, the price expectation formation is the same as in Case B, and the monetary/fiscal policy becomes very active from the time period $t = 5$. In this case, the economic recovery is further accelerated through the active monetary/fiscal stabilization policy that is combined with the considerable credibility of inflation targeting by the central bank. At the time period $t = 19$, the full employment of labor is attained, and at $t = 19.5$ the capacity utilization reaches its upper bound. Unlike the previous two cases, the full employment of labor and excess demand in the goods market continue in this case.
In these examples, it is assumed that the nominal rate of interest is fixed at its lower bound $\rho_0 = 0.01$ all the time. In the real economy, however, the nominal rate of interest will begin to rise at the later stage of economic recovery. This will have the effect of somewhat slowing down the speed of economic recovery, but the qualitative nature of the economic recovery will not seriously change even if we consider the effect of the rising nominal rate of interest at the late stage of economic recovery. It is also worth noting that in the three cases above, the structure of the economy is the same except for the parameter $\theta$, representing the credibility of the policy commitment of the central bank, and the monetary/fiscal policy parameter $\delta$.
5 Economic Implications of the Analysis
Finally, we shall try to interpret the economic implications of our analysis. With the lack of an active counter-cyclical monetary/fiscal policy, the following destabilizing positive feedback mechanism, which is called the “Fisher debt effect,” may dominate the economy (cf. Fisher 1933).
$$(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow d = (D/pK) \uparrow \Rightarrow g \downarrow \Rightarrow y \downarrow \Rightarrow (e \downarrow) \tag{FDE}$$
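A short worked step, added here for illustration, may make explicit the link from falling inflation to a rising debt ratio in (FDE). It uses only the definition d = D/(pK) given in the chain and reads the inflation rate as the growth rate of the price level p. Holding the growth rates of nominal debt D and of the capital stock K fixed is a simplifying assumption of this sketch, not a property of the model, in which both are endogenous.

```latex
\frac{\dot d}{d}
  = \frac{\dot D}{D} - \frac{\dot p}{p} - \frac{\dot K}{K}
  = \frac{\dot D}{D} - \pi - \frac{\dot K}{K}
```

For given growth rates of D and K, a lower inflation rate therefore raises the growth rate of d, and under deflation the real debt ratio rises even when nominal debt grows no faster than the capital stock.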
T. Asada
The increase of the adjustment speed in the goods market (quantity flexibility) and the increase of the adjustment speed in the labor market (wage/price flexibility) will reinforce this destabilizing mechanism. If the monetary/fiscal policy is inactive and the price expectation formation by the public is strongly backward-looking (adaptive), the following destabilizing positive feedback mechanism, which is called the “Mundell effect,” will also work through the effect on the expected real interest rate.
$$(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow \pi^e \downarrow \Rightarrow (\rho - \pi^e) \uparrow \Rightarrow g \downarrow \Rightarrow y \downarrow \Rightarrow (e \downarrow) \tag{ME}$$
The increase of the speed of expectation adaptation will reinforce this destabilizing effect by reinforcing the part $\pi \downarrow \Rightarrow \pi^e \downarrow$. The increase of the quantity flexibility and the wage/price flexibility will also reinforce the Mundell effect as well as the Fisher debt effect. On the other hand, the following stabilizing negative feedback mechanism, which works through changes of the nominal rate of interest and is called the “Keynes effect,” will work if the counteracting Mundell effect is relatively weak.
$$(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow m = (M/pK) \uparrow \Rightarrow \rho \downarrow \Rightarrow (\rho - \pi^e) \downarrow \Rightarrow g \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \tag{KE}$$
However, this stabilizing Keynes effect is almost negligible in the case where the nominal rate of interest has already fallen to nearly its lower bound, as in the Japanese economy in the late 1990s and 2000s. Needless to say, this case corresponds to the case of a “liquidity trap.” Even in the case of a liquidity trap, however, a sufficiently active countercyclical monetary/fiscal policy, combined with sufficient credibility of the monetary policy commitment by the central bank, can stabilize the intrinsically unstable economy. A large value of the monetary/fiscal policy parameter d has the following direct stabilizing effect, which may be called the “money-financed fiscal policy effect.”
$$(e \downarrow) \Rightarrow \pi \downarrow \Rightarrow v \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \tag{MFE}$$
Even if monetary/fiscal policy is not strongly active, the sufficiently high credibility of the inflation targeting commitment by the central bank will have a stabilizing effect through the following mechanism, which may be called the “inflation targeting effect.”
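$$
\begin{aligned}
&\mu_0 - n > \pi^e \Rightarrow \pi^e \uparrow \Rightarrow (\rho - \pi^e) \downarrow \Rightarrow g \uparrow \Rightarrow y \uparrow \Rightarrow (e \uparrow) \\
&\Downarrow \\
&\pi \uparrow \Rightarrow d = (D/pK) \downarrow \Rightarrow (e \uparrow)
\end{aligned}
\tag{ITE}
$$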
This result is consistent with the Keynesian view which stresses the importance of subtle subjective factors such as the expectation of the public and the credibility of the attitude of the policy maker.
Acknowledgments. This is a revised version of the paper which was presented at Chuo METS 05 (Chuo Meeting on Economics of Time and Space 2005, Chuo University, Tokyo, August 30, 2005) and the Sophia Symposium on Keynesian Legacy and Modern Economics (Sophia University, Tokyo, September 24, 2005). Thanks are due to the participants of the conferences for their helpful comments. Needless to say, however, only the author is responsible for possible remaining errors. This research was financially supported by the Japan Ministry of Education, Culture, Sports, Science and Technology (Grant-in-Aid for Scientific Research (B) 15330037) and Chuo University Joint Research Grant 0382.
References
Agliardi E (1998) Positive feedback economics. Macmillan, London
Asada T (2001) Nonlinear dynamics of debt and capital: a post-Keynesian analysis. In: Aruka Y, Japan Association for Evolutionary Economics (eds) Evolutionary controversies in economics. Springer, Tokyo, p 73–87
Asada T (2004) Price flexibility and instability in a macrodynamic model with a debt effect. J Int Econ Stud 18:41–60
Asada T (2006a) Stabilization policy in a Keynes–Goodwin model with debt accumulation. Structural Change and Economic Dynamics, forthcoming
Asada T (2006b) Inflation targeting policy in a dynamic Keynesian model with debt accumulation: a Japanese perspective. In: Chiarella C, Flaschel P, Franke R, Semmler W (eds) Quantitative and empirical analysis of nonlinear dynamic macromodels. Elsevier, Amsterdam
Asada T, Chiarella C, Flaschel P, Franke R (2003) Open economy macrodynamics: an integrated disequilibrium approach. Springer, Berlin
Barro RJ, Sala-i-Martin X (1995) Economic growth. McGraw-Hill, New York
Blanchard O, Fischer S (1989) Lectures on macroeconomics. MIT Press, Cambridge
Chiarella C, Flaschel P (2000) The dynamics of Keynesian monetary growth. Cambridge University Press, Cambridge
Chiarella C, Flaschel P, Semmler W (2001) Price flexibility and debt dynamics in high order AS–AD model. Central Eur J Oper Res 9:119–145
Fisher I (1933) The debt–deflation theory of great depressions. Econometrica 1:337–357
Flaschel P, Krüger M, Asada T (2003) From growth and employment cycles to KMG model building. In: Fuhmann N, Schmoley E, Sud RSS (eds) Gegen den strich: Ökonomische theorie und politische regulierung. Reiner Hampp, Munich, p 75–100
Franke R, Asada T (1994) A Keynes–Goodwin model of the business cycle. J Econ Behav Organ 24:273–295
Goodwin R (1967) A growth cycle. In: Feinstein CH (ed) Socialism, capitalism and economic growth. Cambridge University Press, Cambridge, p 54–58
Harrod RF (1948) Towards a dynamic economics. Macmillan, London
Kaldor N (1960) Essays on economic stability and growth. Duckworth, London
Kalecki M (1937) The principle of increasing risk. Economica 4:440–447
Kalecki M (1971) Selected essays on the dynamics of the capitalist economy. Cambridge University Press, Cambridge
Keynes JM (1936) The general theory of employment, interest and money. Macmillan, London
Krugman P (1998) It’s baaack: Japan’s slump and the return of the liquidity trap. Brookings Pap Econ Act 2:137–205
Mankiw G, Romer D (eds) (1991) New Keynesian economics. 2 vols, MIT Press, Cambridge
Minsky H (1986) Stabilizing an unstable economy. Yale University Press, New Haven
Nikaido H (1996) Prices, cycles, and growth. MIT Press, Cambridge
Okishio N (1993) Essays on political economy. Flaschel P, Krüger M (eds) Peter Lang, Frankfurt am Main
Palley TI (1996) Growth theory in a Keynesian mode: some Keynesian foundations for new endogenous growth theory. J Post-Keynesian Econ 19:113–135
Robinson J (1956) The accumulation of capital. Macmillan, London
Romer D (1996) Advanced macroeconomics. McGraw-Hill, New York
Steindl J (1952) Maturity and stagnation in American capitalism. Oxford University Press, Oxford
Tobin J (1980) Asset accumulation and economic activity. Basil Blackwell, Oxford
Uzawa H (1969) Time preference and the Penrose effect in a two-class model of economic growth. J Polit Econ 77:628–652
Part II Economics of Time: Nonlinear Dynamics
6. Instability Problems and Policy Issues in Perfectly Open Economies
Peter Flaschel
Summary. In this chapter, we consider an advanced classical model of a small, perfectly open economy with, in particular, perfectly flexible wages and prices, and the purchasing power parity (PPP) and uncovered interest parity (UIP) conditions holding at each moment in time. From the budget equations of the model we derive the law of motion for the real value of foreign bonds held domestically, and for the evolution of aggregate real government debt. Moreover, the PPP and the UIP conditions taken together imply a law of motion for the price level in such an economy. Taking these three laws of motion together, instability of the steady state of the dynamics is a likely result if all policy parameters are kept fixed. Specific policy rules with specific anchors for the private part of the economy are therefore proposed and used to make the overall dynamics stable, at least from the global point of view.
Key words. Perfectly open economies, Government debt, Foreign debt, Instability, Monetary and fiscal policy
1 Introduction
In this chapter we extend the classical analysis of the specie-flow mechanism in the one-good world, considered in Chap. 2 of Asada et al. (2003), toward a modern representation of such a classical approach, now also including international trade in bonds alongside commodity trade, and the uncovered interest parity (UIP) condition as a representation of perfect international capital mobility. The small open economy¹ considered is also a perfect one in all other respects (from a neoclassical perspective), since the law of one price, or the absolute form of the purchasing power parity (PPP) condition, is assumed to hold, since flexible money wages are assumed to clear the labor market, since domestic prices are also perfectly flexible, and since all excess supply or demand in the domestic economy is always absorbed by, or served by, the world goods market. We are thus considering an ideal small open economy with no demand constraints, perfectly flexible wages and prices, full employment output, and moreover with the PPP as well as the UIP conditions holding on the goods and the financial markets at each moment in time. We provide budget equations for the three sectors of the model, and on this basis flow consistency conditions that imply by themselves that the balance of payments is always balanced without any need for central bank intervention, and in fact also independently of the monetary regimes that are assumed to hold on the domestic money market and the market for foreign exchange. Depending on the fiscal policy regime that characterizes such an economy, we have, however, two stock-flow relationships in this economy which can create stability problems even for such a perfectly open economy. On the one hand, there is the evolution of the capital account that is driven by the surpluses or deficits in the current account. The resulting dynamic of the stock of foreign bonds (in real terms) held by domestic residents (if positive) can be isolated from the rest of the dynamics under certain simplifying conditions, and can be shown to be an unstable one if wealth effects in the assumed consumption behavior of the households of the economy are sufficiently weak. On the other hand, there is the dynamic of the government budget restriction (GBR), which is necessarily explosive or subject to centrifugal forces if all fiscal policy instruments of the government are given magnitudes.
Department of Economics and Business Administration, Bielefeld University, PO Box 10 01 31, 33501 Bielefeld, Germany
1 The two-country extension of the model considered is briefly described in Appendix A.
The twin deficits (or surpluses) of such an economy, the government deficit as well as the current account deficit, may thus both be subject to cumulative forces under certain assumptions, and may thus represent a problem for such an economy in the long run. The primary objective of this chapter is to consider the evolution of these deficits and the (in-)stability conditions that characterize this evolution. Moreover, the perfectly open economy considered in this chapter is also subject to explosive price level adjustments, caused by the joint working of the UIP and the PPP conditions. The private sector of this economy is therefore plagued by various types of instability, which can be overcome only in a very limited way by the jump-variable technique (JVT) of the rational expectations school, here applicable solely to the price-level dynamics, and indeed of the type discussed in the original proposal for such a technique in Sargent and Wallace (1973). We do not follow the JVT here, however, but design appropriate, not always conventionally plausible, policy rules which may tame the explosive nature of the dynamics of the private sector under certain conditions. These policies concern an anti-monetarist type of money supply adjustment rule, a tax policy that adjusts debt to a certain debt-to-GDP ratio, and a government expenditure policy that attempts to counteract the cumulative forces in the dynamic of the current account or its mirror image, the capital account. In this way the perfectly open economy may be made a stable one, though the dynamics that results from all these additions is difficult to analyze from an integrated point of view in place of the partial ones adopted in the final section of this chapter.
Section 2 presents our modeling of the perfectly open economy, including the law of motion for the external debt or creditor position of this economy. Section 3 derives and investigates the dynamics of the GBR in isolation from the rest of the dynamics. In Sec. 4, we consider the joint evolution of internal and external debt (or surpluses) and also the unstable dynamic of the price level as it is implied by the UIP and the PPP conditions. Section 5 finally considers monetary and fiscal policy rules that may overcome the various cumulative forces that the perfectly open economy considered is subject to. The appendices provide a two-country extension of the model of this chapter and the notation employed.
2 Perfectly Open Economies: An Advanced Formulation
In this section, we provide an advanced formulation of the classical model type envisaged, which integrates all perfectness assumptions that can exist in such a framework from a modern perspective, and which may therefore be considered the modern counterpart of the classical approach to the specie-flow mechanism. Here we follow Rødseth (2000, ch. 5) and present a brief summary of this ideal model type for a small open economy. The model therefore collects in an ideal fashion the assumptions of a perfect working of a small open economy in a classical environment, including internationally traded interest-bearing financial assets. This model type, of an extremely or perfectly open classical economy, will be reviewed only briefly here with respect to its structural equations and their fundamental implications. In particular, we will find that the introduction of interest-bearing assets and their international trading gives rise to the possibility that the specie-flow mechanism no longer works in the stable fashion of a classical economy without international trade in bonds. In addition, such destabilizing forces are not only working through cumulative effects or accelerating capital account adjustments, but are also present in the evolution of government debt and driven by government budget constraints. The general model to be considered below thus allows for the possibility that twin deficits (or surpluses or mixed combinations) can show explosive or implosive tendencies to be counteracted by active policy intervention.
In the following, we consider a one-commodity world augmented by the trade in domestic as well as foreign bonds, B, F. We furthermore assume that domestic bonds are perfect substitutes for foreign bonds, but do not consider the inflow and outflow of bonds explicitly, since the model of this chapter does not consider stock constraints for asset demand functions explicitly. Instead the (balanced) exchange of domestic against foreign bonds is hidden in our presentation of the balance of payments (operating in the background of it), and thus only explicitly present in the uncovered interest rate parity condition to be considered below. Let us first consider the flow budget equations of the three sectors of Rødseth’s (2000, ch. 5) extremely open economy: households, the government, and the central bank. (Firms only produce output Y by means of labor N; they do not invest, and they transfer all profit income to the household sector. They therefore need not be considered from the perspective of budget equations.) These equations, read in the order of the named sectors, are
$$p(Y - T) + rB_p + e\bar r^* F_p \equiv pC + \dot M + \dot B_p + e\dot F_p \tag{1}$$
$$pT + e\bar r^* F_c + \dot B \equiv pG + rB_p \tag{2}$$
$$\dot M \equiv \dot B_c \tag{3}$$
Equation (1) states that wage, profit, and interest incomes (after taxes) of private households are spent on consumption and on additions to money holdings and to domestic and foreign fixprice bond holdings (with prices set equal to one in the respective currency). We here denote by $r$ and $\bar r^*$ the domestic and the foreign rate of interest, and by $B_p$ and $F_p$ the current stocks of domestic and foreign bonds held by the private sector. In Eq. 2 we state that government finances its expenditures and its interest rate payments to households through taxes, new issues of bonds, and also the domestic and foreign interest proceeds of the central bank (which are transferred to the government sector). The latter uses open market operations concerning domestic bonds² in order to change continuously the quantity of money in the form of time derivatives (as shown). We furthermore allow for discontinuous policy changes in the financial stocks held in the domestic economy, through the open market or foreign exchange market transactions of the Central Bank (or briefly CB) of the following form.
2 Of course, the central bank could also buy foreign bonds from domestic residents.
2.1 Central Bank Stock Policy Constraint
$$\text{CB-PC:} \quad -M + B_c + eF_c: \qquad -dM + dB_c + e\,dF_c = 0$$
We therefore distinguish between rule-determined continuous changes in the money supply and isolated policy interventions in the financial markets of the economy. In the above flow identities, we assume consistency (equality) between the flow supplies $\dot M$ and $\dot B - \dot B_c$ provided by the government and the central bank, and the flow demands $\dot M$ and $\dot B_p$ of households. Furthermore, the central bank has some holdings of government bonds, $B_c$, due to past open market operations (which concern both the stock and the flow constraints). However, since it transfers (by assumption) interest on these bonds back into the government sector, we need not consider these bond holdings explicitly in the budget Eqs. 1–3. Open market operations thus can also change stocks instantaneously, and thus have to be considered as wealth reallocations of households with respect to their wealth constraint later on. Note that the resulting budget equation of the central bank is formulated in a way that allows for consistent steady state calculations in the case of steady state inflation. The consistency assumption made with respect to Eqs. 1–3 implies for the aggregate of these three equations that
$$p(Y - C - G) + e\bar r^*(F_p + F_c) \equiv e\dot F_p \tag{4}$$
that is, the balance of payments is always balanced in the assumed situation without the need for central bank intervention and independent of the exchange rate and money supply regime the economy is subject to. We assume with respect to this identity that the economy considered produces a single commodity that it can either export (if Y − C − G > 0 holds) or import (Y − C − G < 0), and that there are no restrictions on the world market for doing so. This is some sort of Say’s Law, since there is then no demand constraint (or supply constraint) operating on the domestic goods market. Interpreting Eq. 4 in this way implies that the balance of payments of the domestic economy is always balanced, since the left-hand side represents the current account and the right-hand side (the negative of) the capital account of this economy. This also holds if the capital flows underlying the assumed perfect asset substitutability are made explicit. In fact, in this chapter we go one step further by assuming that only infinitesimal capital flows are needed to ensure the UIP condition of the model considered below, so that the explicitly treated law of motion for the evolution of the foreign bond holdings of domestic residents is solely determined by the current account. This is a significant disadvantage of a model type that does not treat financial asset demands explicitly (as in the Mundell–Fleming–Tobin model of the literature; see Rødseth (2000, ch. 6) for example).
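As a check on Eq. 4, the aggregation step can be written out. The sketch below adds the left-hand and right-hand sides of Eqs. 1–3 and then uses the flow consistency condition stated in the text, namely that the flow supply of bonds net of central bank purchases equals the flow demand of households. No further assumptions are introduced.

```latex
\begin{aligned}
&p(Y - T) + rB_p + e\bar r^* F_p + pT + e\bar r^* F_c + \dot B + \dot M \\
&\qquad \equiv pC + \dot M + \dot B_p + e\dot F_p + pG + rB_p + \dot B_c \\
&\Longrightarrow\quad pY + e\bar r^*(F_p + F_c) + (\dot B - \dot B_p - \dot B_c) \equiv pC + pG + e\dot F_p \\
&\Longrightarrow\quad p(Y - C - G) + e\bar r^*(F_p + F_c) \equiv e\dot F_p
\qquad \text{since } \dot B - \dot B_c = \dot B_p
\end{aligned}
```

Taxes, domestic interest payments, and money creation all cancel in the aggregate, which is why the identity holds independently of the monetary regime.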
We define the real wealth of the economy by the following aggregate:
$$W = W_p + W_g + W_c = \frac{M + B_p + eF_p}{p} + \frac{-M - B_p}{p} + \frac{eF_c}{p} \tag{5}$$
i.e.,
$$W = \frac{e(F_p + F_c)}{p} = \frac{eF}{p} \tag{6}$$
which shows that the real wealth of all sectors in the economy (private households, government, and the central bank) adds up exactly to the real value of the foreign bonds held in the domestic economy (the amount of government bonds $B_c$ held by the central bank has already been canceled in these equations³). In terms of growth rates, Eq. 6 implies that
$$\hat W = \hat e + \hat F - \hat p \quad \text{or} \quad \dot W = \hat e W + e\dot F/p - \hat p W$$
Using Eq. 4, this yields
$$\dot W = (\hat e - \hat p)W + [p(Y - C - G) + e\bar r^* F]/p = (\bar r^* + \hat e - \hat p)W + Y - C - G \tag{7}$$
due to what was shown for the balance of payments in nominal (domestic) terms. The change in real domestic wealth, given by the real value of foreign bonds held in the domestic economy, therefore reflects the capital account in real terms if the real rate of return $\bar\rho^* = \bar r^* - \bar\pi^* = \bar r^* + \hat e - \hat p$ (see the assumptions of the model presented below) is applied in the interest income balance to calculate real interest flows (as shown above). Equation 7 thus reflects the balance of payments (Eq. 4) in real (domestic) terms. In fact it provides the first equation, indeed the core equation, of this model of an extremely open economy as formulated in Rødseth (2000, ch. 5), if a certain taxation policy is implemented (see footnote 4). The remaining equations of the model are (with $\bar N$ the full employment level on the labor market)⁴
3 Since government bonds held by the central bank are irrelevant for the trajectories followed by the economy, we completely ignore them from now on and thus use the letter $B$ in place of $B_p$ for notational simplicity. Note also that we always have $\dot F = \dot F_p$ by assumption.
4 The assumption that $W_g^a$ = constant holds is justified by a corresponding (this-ensuring) rule for the collection of government taxes (see Rødseth 2000, p. 117). We also assume that government expenditures G are of a given magnitude in the following equations.
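$$M/p = m^d(Y, r), \quad m^d_Y > 0, \quad m^d_r < 0 \tag{8}$$
$$p = ep^*, \quad \hat p^* = \bar\pi^* = \text{constant} \tag{9}$$
$$r = \bar r^* + \hat e \tag{10}$$
$$w = pf'(\bar N) \tag{11}$$
$$Y = f(\bar N) \tag{12}$$
$$C = C(Y_p, W_p, \bar\rho^*), \quad 1 > C_1 > 0, \quad C_2 > 0, \quad C_3 < 0 \tag{13}$$
$$Y_p = Y + \bar\rho^* W - G, \quad W_p = W - \bar W_g^a \tag{14}$$
$$W = eF/p \tag{15}$$
$$\bar W_g^a = W_g + W_c = -(M + B)/p + eF_c/p = \text{constant} \tag{16}$$
$$\bar\rho^* = \bar r^* - \bar\pi^* = \bar r^* + \hat e - \hat p = r - \hat p = \rho = \text{constant} \tag{17}$$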
We here consider an economy that is small compared to the world market, and that can satisfy all excess demand C + G − Y via imports and can export all excess supply Y − C − G in the opposite case. The world market nominal rate of interest $\bar r^*$ is assumed to have a given magnitude, as is the world inflation rate $\bar\pi^*$. In the above model, we assume a standard LM curve of the Keynesian variety, the absolute form of the PPP condition, the UIP condition as a characterization of international capital mobility, labor market equilibrium, and on this basis the usual description of full employment output, and we supplement this model type by a Keynesian consumption function, the Hicksian definition of disposable income, and the definitions of private and government wealth. A list of the notation used in these equations is provided in Appendix B. Note that the second part of Eq. 17 follows directly from Eq. 9 when transformed to rates of growth. Furthermore, the assumption in Eq. 16 is made for the time being only (and later on discarded in Rødseth 2000, ch. 5, as well as in this chapter). It is based on the implicit assumption of a certain taxation policy, as we shall show below, and it allows us to substitute $W_p$ in Eq. 13 by $W - \bar W_g^a$ and thus reduce this term to the one that governs the dynamics of the capital account in the balance of payments (minus a constant). Equation 14 for (Hicksian) disposable income of households is, finally, also a consequence of assuming that $W_g^a = (eF_c - M - B)/p$ is a constant (see below). Making use of the government budget restriction and the budget restriction of the central bank ($W_c = eF_c/p$), we first get
$$\dot W_g^a = \frac{\dot e F_c + e\dot F_c - \dot M - \dot B}{p} - \hat p\,\frac{eF_c - M - B}{p} = \hat e W_c + \bar r^* W_c + T - G - r\frac{B}{p} - \hat p W_c + \hat p\left(\frac{M}{p} + \frac{B}{p}\right)$$
This expression can be fixed to zero if the tax variable T is adjusted accordingly. On the basis of this assumption, we then get
$$\frac{B}{p}(r - \hat p) - \hat p\,\frac{M}{p} = T + (\bar r^* + \hat e - \hat p)W_c - G$$
The Hicksian concept of household disposable income, i.e., the income which when consumed would keep a household’s real wealth on a constant level, is defined by
$$Y_p = Y - T + (\bar r^* + \hat e - \hat p)\frac{eF_p}{p} + (r - \hat p)\frac{B}{p} - \hat p\,\frac{M}{p}$$
This implies the expression
$$Y_p = Y - T + \bar\rho^*\frac{eF_p}{p} + T + \bar\rho^* W_c - G = Y - G + \bar\rho^*\frac{eF}{p} = Y - G + \bar\rho^* W$$
The disposable income $Y_p$, like $W_p$, therefore depends only on $W = W_p + \bar W_g^a = eF/p$ as far as wealth effects are concerned (with a given real rate of return $\bar\rho^*$ in addition). Let us now investigate the model in Eqs. 8–17 for an exogenously given money supply (with $\bar\pi^* = 0$, $p^*$ = constant in addition) and (as was assumed above) with endogenous taxation that keeps the value of $W_g^a$ constant. Let us also ignore wealth effects in the consumption function for the moment. On the basis of the rational expectations solution for price level and exchange rate adjustments, and in the case of an unanticipated shock in the money supply (dM = −dB), we must then have $\hat p = \hat e = 0$, and thus know that r is given by $\bar r^*$. The LM curve in Eq. 8 then determines the perfectly flexible after-shock price level p, whilst the PPP assumption in Eq. 9 determines the after-shock level of the exchange rate e. If perfectly flexible nominal variables are thus adjusting, as has been assumed to be the case since Sargent and Wallace’s (1973) introduction of the jump variable technique, we get in the case of an unanticipated monetary shock a strict neutrality result for the economy considered as far as the price level, the nominal exchange rate, and the nominal rate of interest are concerned. In addition, the value of W = eF/p remains unchanged in such a situation, i.e., the economy just remains in its steady state position in such an event. In the case of $C_2 > 0$, however, we get a jump in the wealth variables $W_p$, $W_g^a$, and thus a change on the right-hand side of Eq. 18, i.e., we will in general have real effects of unanticipated jumps in the money supply.
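The neutrality argument can be illustrated with a small numerical sketch. The money demand function used below and all parameter values are hypothetical choices made only for this example; the chapter specifies only the signs of the partial derivatives in Eq. 8. With zero inflation and zero depreciation the domestic interest rate equals the foreign rate, the LM curve is solved for the price level, PPP then gives the exchange rate, and the real value of foreign bonds is recomputed.

```python
import math

def money_demand(Y, r, k=0.5, a=2.0):
    """Hypothetical real money demand m^d(Y, r): rising in Y, falling in r."""
    return k * Y * math.exp(-a * r)

def after_shock(M, Y, r_foreign, p_foreign, F):
    """Price level, exchange rate and real foreign wealth after a money shock.

    Assumes p-hat = e-hat = 0, so that UIP (Eq. 10) gives r = r_foreign.
    """
    p = M / money_demand(Y, r_foreign)  # LM curve (Eq. 8) solved for p
    e = p / p_foreign                   # absolute PPP (Eq. 9): p = e * p_foreign
    W = e * F / p                       # real value of foreign bonds (Eq. 15)
    return p, e, W

# Illustrative values only; none of them comes from the chapter.
base = after_shock(M=1.0, Y=1.0, r_foreign=0.03, p_foreign=1.0, F=2.0)
doubled = after_shock(M=2.0, Y=1.0, r_foreign=0.03, p_foreign=1.0, F=2.0)

print("p ratio:", doubled[0] / base[0])
print("e ratio:", doubled[1] / base[1])
print("change in W:", doubled[2] - base[2])
```

In this sketch a doubling of M doubles both p and e and leaves W unchanged, because under PPP the ratio e/p is pinned down by the foreign price level alone.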
In the case of real shocks,⁵ for example an unanticipated government expenditure shock, the stock magnitudes F, W, i.e., the nominal and real holdings of foreign bonds in the economy considered, start moving continuously in time after the shock, which therefore sets in motion the dynamic we have derived above for the real variable W.
$$\dot W = \bar\rho^* W + Y - C(Y + \bar\rho^* W - G,\; W - \bar W_g^a,\; \bar\rho^*) - G \tag{18}$$
This dynamic is described by a single autonomous differential equation for the evolution of the domestic holdings of foreign bonds (or domestic indebtedness to foreigners) W = eF/p, since $\bar\rho^* = \rho$, $Y = Y(\bar N)$, and now also $\bar G$, are given magnitudes. The whole model in Eqs. 8–17 can therefore be reduced to, and analyzed by, a single law of motion for the real value of foreign bonds held in the domestic economy, as is implied by the balance of payments of the economy. Yet the perfect world assumed by Eqs. 7–17, with all prices flexible, the permanent fulfillment of the PPP and the UIP, and with myopic perfect foresight in the working of the latter, then faces one significant problem, namely the possibility of an explosive evolution of the real value of foreign assets held domestically if and only if
$$(1 - C_1)\bar\rho^* - C_2 = (1 - C_{Y_p}(\cdot))\bar\rho^* - C_{W_p}(\cdot) > 0$$
holds true, i.e., when wealth effects in consumption are sufficiently weak. The steady state level of W behind this stability condition is given by⁶
$$\bar\rho^* W_0 = C(Y - \bar G + \bar\rho^* W_0,\; W_0 - \bar W_g^a,\; \bar\rho^*) + \bar G - Y$$
and is then surrounded by centrifugal forces which drive W towards +∞ for $W(0) > W_0$, or to −∞ for $W(0) < W_0$. The specie-flow mechanism of the classical open economy model (Asada et al. 2003, ch. 2) thus can now be unstable in a global way if interest rate effects are present in the model considered, here due to a Keynesian consumption function. However, this occurrence of an unstable balance of payments adjustment process with respect to the accumulation of foreign bonds or dollar-denominated debt in the domestic economy undermines the application of the jump variable technique, since then there does not exist a bounded trajectory onto which the jump variables p, e can jump, and that brings the economy back to its steady state when thrown out of this position by an appropriate real shock.
5 Where W(0) remains constant due to the PPP condition.
6 We again assume that there is a single (positive or negative) solution $W_0$ to this equation (describing a creditor or debtor position for the private sector of the economy considered).
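The instability condition can be made concrete with a minimal simulation sketch of Eq. 18. The linear consumption function, the class name `Params`, and every parameter value below are assumptions introduced for illustration only; the chapter leaves the form of C unspecified. With a linear C, the slope of the right-hand side of Eq. 18 with respect to W is the constant (1 − C_1) times the real rate of return minus C_2, so its sign decides whether a small deviation from the steady state grows or shrinks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Params:
    rho: float = 0.03   # given real rate of return (illustrative value)
    Y: float = 1.0      # full employment output
    G: float = 0.2      # government expenditure
    Wg: float = -0.5    # aggregate government wealth, held constant
    c0: float = 0.25    # autonomous consumption (hypothetical)
    c1: float = 0.7     # C_1: propensity to consume out of disposable income
    c2: float = 0.005   # C_2: wealth effect in consumption

def w_dot(W, q):
    """Right-hand side of Eq. 18 with a hypothetical linear consumption function."""
    Yp = q.Y + q.rho * W - q.G            # disposable income (Eq. 14)
    Wp = W - q.Wg                         # private wealth (Eq. 14)
    C = q.c0 + q.c1 * Yp + q.c2 * Wp      # linear C(Y_p, W_p), an assumption
    return q.rho * W + q.Y - C - q.G

def slope(q):
    """d(W-dot)/dW = (1 - C_1) * rho - C_2; a positive value means instability."""
    return (1.0 - q.c1) * q.rho - q.c2

def steady_state(q):
    """W_0 solving w_dot(W_0) = 0 (requires slope(q) != 0)."""
    return (q.c0 - (1.0 - q.c1) * (q.Y - q.G) - q.c2 * q.Wg) / slope(q)

def final_gap(q, start_gap=0.1, dt=0.1, steps=1000):
    """Explicit Euler integration from W_0 + start_gap; returns |W(T) - W_0|."""
    W0 = steady_state(q)
    W = W0 + start_gap
    for _ in range(steps):
        W += dt * w_dot(W, q)
    return abs(W - W0)

weak = Params()             # weak wealth effect: slope > 0
strong = Params(c2=0.02)    # strong wealth effect: slope < 0
for name, q in (("weak", weak), ("strong", strong)):
    print(name, "slope:", slope(q), "gap after simulation:", final_gap(q))
```

With the weak wealth effect the gap to the steady state widens over the simulated horizon, while with the stronger wealth effect it narrows, in line with the sign condition stated above.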
Therefore, this case is excluded or blocked out from consideration in Rødseth (2000, p. 121). However, it is possible to investigate such an unstable adjustment process further within the balance of payments if it is assumed that wealth effects become dominant far off the steady state, to the extent that not only the slope of the right-hand side of Eq. 18 with respect to W becomes negative, but also that two further equilibria of the $\dot W$ dynamics are created, as shown in Fig. 1. This shows two stable and, in between, one unstable stationary point for balance of payments adjustment. Flukes apart, the economy will therefore converge over time to either $W_1$ or $W_3$. Let us assume the latter situation, i.e., the economy then approaches a high creditor position with respect to the rest of the world (due to its consumption behavior along this path). Assume now that government increases its expenditure in such a situation as long as there is a surplus in the current account of the economy. The $\dot W$ curve shown in Fig. 1 is thereby shifted downward, accompanied by decreases in the actual as well as the stationary level of real foreign wealth held domestically. This process may reach a point where the upper equilibrium $W_3$ gets lost (immediately after the situation where $W_2 = W_3$ is established). There is then only one stable equilibrium left, $W_1$, to which the economy will converge over time. Along this path, the creditor position of the domestic economy will sooner or later change into a debtor position (in the situation shown in Fig. 1). This process is accompanied by current account deficits until the new stationary point is reached where the current account has become balanced again. Appropriate nonlinearities in consumption behavior may therefore be used to explain long-lasting regime switches away from a strong economy towards a regime of increasing foreign debt of the domestic economy (and vice versa).
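The configuration with three stationary points, and the loss of the upper two when the curve is shifted downward, can be sketched numerically. The cubic function below is a purely hypothetical stand-in for the right-hand side of Eq. 18: it is not derived from any consumption function in the chapter and is chosen only because it has a positive slope near the middle stationary point and a negative slope far from it. The parameter `shift` is likewise an assumed device for lowering the whole curve, as a rise in government expenditure would.

```python
def w_dot(W, shift=0.0, s=0.004, k=0.0001):
    """Hypothetical cubic stand-in for the right-hand side of Eq. 18.

    s > 0 gives the unstable slope near the middle stationary point; the cubic
    term lets wealth effects dominate far from it. `shift` lowers the curve.
    """
    return s * W - k * W**3 - shift

def stationary_points(f, lo=-20.0, hi=20.0, n=4000):
    """Find zeros of f on [lo, hi] by a grid scan refined with bisection."""
    roots = []
    step = (hi - lo) / n
    a, fa = lo, f(lo)
    for i in range(1, n + 1):
        b = lo + i * step
        fb = f(b)
        if fa == 0.0:
            roots.append(a)               # grid point is an exact zero
        elif fa * fb < 0.0:
            left, right, f_left = a, b, fa
            for _ in range(60):           # bisection on the bracketing interval
                mid = 0.5 * (left + right)
                f_mid = f(mid)
                if f_left * f_mid <= 0.0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            roots.append(0.5 * (left + right))
        a, fa = b, fb
    return roots

def is_stable(f, W, h=1e-5):
    """A stationary point is locally stable if the slope of f there is negative."""
    return (f(W + h) - f(W - h)) / (2.0 * h) < 0.0

def describe(shift):
    f = lambda W: w_dot(W, shift=shift)
    return [(round(W, 3), "stable" if is_stable(f, W) else "unstable")
            for W in stationary_points(f)]

print(describe(0.0))    # three stationary points: stable, unstable, stable
print(describe(0.02))   # after a large enough downward shift only one remains
```

In this illustration the remaining stationary point after the shift lies at a negative W, that is, at a debtor position, which mirrors the regime switch described in the text.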
[Figure 1: the $\dot W$ curve plotted against $W$, with stationary points $W_1$, $W_2$, and $W_3$]
Fig. 1. Multiple stationary points for the balance of payments adjustment process
The basic result of these considerations is that the steady state of this extreme monetary approach to open economies that are small relative to the world market may be unstable in the larger world (if wealth effects in consumption are so small that they do not turn exports into imports as wealth is accumulated in terms of foreign bonds). This result has been shown here in a perfect flexprice environment with a Keynesian LM curve and Keynesian consumption behavior, but no IS-restriction on the economy (I ≡ 0 still). Rødseth (2000, ch. 5) goes beyond this analysis by also considering imperfect capital mobility with fixed or flexible exchange rates, a government deficit and changing $W_g^a$, and nominal (real) wage rigidity by way of an expectation-augmented money-wage Phillips curve (with myopic perfect foresight), again with fixed or flexible exchange rates. In this chapter we will now dispense with the assumption made on the tax rate policy and assume instead that $T = \bar T$ (in addition to given government expenditure $\bar G$) is determined exogenously.⁷ This implies that aggregate government wealth $W_g^a$ is changing, and will lead us to a situation where this magnitude interacts dynamically with total wealth W, and in fact now also with the price level p.
3 Integrating the Dynamics of the Government Budget Constraint

We start again from the budget equations of the three relevant sectors, households, the government, and central bank. With respect to firms, we again assume that all their income is transferred to the household sector. They do not invest in the present framework, and thus are only organizing the production of the full employment output Y by means of the given labor supply $\bar N$. Note also that the CB holds both types of government bonds and may change these holdings by way of instantaneous open market or foreign exchange market policies,8 but that this does not influence the shown form of the budget equations, since all interest income from these bond holdings is transferred to the government sector, which therefore only has to pay interest on the bonds B held by the private sector. The relevant flow budget restrictions of the small open economy are thus (when government and the Central Bank are considered in integrated form)

$$p(Y - \bar T) + rB + e\bar r^* F_p = pC + \dot M + \dot B + e\dot F_p$$
$$p\bar T + e\bar r^* F_c + \dot B + \dot M = p\bar G + rB$$
7 Note that this rigidity in government behavior is one root for the instability found to exist below.
8 To be represented by dM, dB, and dFc in a flexible exchange rate regime (where Fc = constant otherwise).
and their implications for the evolution of domestic and foreign bonds held by the domestic economy are (describing, if positive, the public debt of the government and of the foreign economy)

$$\dot B = rB + p(\bar G - \bar T) - e\bar r^* F_c - \dot M$$
$$e\dot F = p(Y - C - \bar G) + e\bar r^* F, \qquad F = F_p + F_c$$

Considering the same situation from the viewpoint of savings, we can write

$$pS_p = p(Y - \bar T) + rB + e\bar r^* F_p - pC = \dot M + \dot B + e\dot F_p$$
$$pS_g = p\bar T + e\bar r^* F_c - p\bar G - rB = -\dot B - \dot M$$

which, as expected, gives for total savings pS

$$pS = pY - pC - p\bar G + e\bar r^*(F_p + F_c) = e\dot F_p = e\dot F$$

This is simply a reformulation of the fact that the balance of payments must be balanced in the assumed situation without any further adjustment processes (solely based on the assumption that the issue of new money $\dot M$ and new government bonds $\dot B$ is accepted by the household sector, as is implicitly assumed in the above formulation of the two budget equations of our economy). In analogy to the Hicksian definition of private disposable income, we now define and rearrange this concept for the aggregate government sector (including the foreign interest income of the central bank), and show on this basis in particular that the aggregate real wealth of this sector $W_g^a$ is, in its time-rate of change (as in the case of the private sector), determined by deducting from its real disposable income the consumption of this sector. This also provides us with a law of motion for the real aggregate wealth of the government sector, in addition to the one we have already determined for the total wealth of the economy. These two laws describe the evolution of surpluses or deficits in the government sector and the evolution of current account surpluses or deficits, and thus in particular allow the joint treatment of the issue of twin deficits in an open economy with a government sector (yet to be extended towards capital accumulation and economic growth).
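$$Y_g^a = T - \frac{rB}{p} + \frac{e r^* F_c}{p} + \hat p\,\frac{M+B}{p} + (\hat e - \hat p)\frac{eF_c}{p}$$
$$= T - (r - \hat p)\frac{M+B}{p} + r\frac{M}{p} + (r^* + \hat e - \hat p)\frac{eF_c}{p}$$
$$= T + r\frac{M}{p} - (r - r^* - \hat e)\frac{M+B}{p} + (r^* + \hat e - \hat p)\frac{-(M+B) + eF_c}{p}$$
$$= T + r\frac{M}{p} + \xi\,\frac{M+B}{p} + \rho^*\,\frac{-(M+B) + eF_c}{p}, \qquad \xi = r^* + \hat e - r$$

with

$$W_g^a = \frac{-(M+B) + eF_c}{p} = -\frac{M+B}{p} + \frac{eF_c}{p} = W_g + W_c$$

i.e.,

$$\dot W_g^a = \frac{-(\dot M + \dot B) + \dot e F_c}{p} - \hat p\,\frac{-(M+B) + eF_c}{p}$$
$$= \frac{-(pG + rB - pT - r^* eF_c) + \hat e\, eF_c}{p} - \hat p\,\frac{-(M+B) + eF_c}{p}$$
$$= T - G - r\frac{B}{p} + \hat p\,\frac{M+B}{p} + (r^* + \hat e - \hat p)\frac{eF_c}{p}$$

which finally gives

$$\dot W_g^a = Y_g^a - G = \rho^* W_g^a + \xi\,\frac{M+B}{p} + r\frac{M}{p} + T - G$$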
The above calculations concern the sources of income, and consider as disposable in this regard that part of accounting income that, when consumed, just preserves the current level of real wealth of the sector considered, which here is the aggregated government sector. On the basis of this income concept and the definition of the aggregate wealth of the government sector (where central bank wealth is included), we then derived the law of motion for aggregate government wealth, and showed in particular that the time-rate of change of this wealth magnitude is just the difference between disposable real government income and real government expenditure. In the perfectly open economy (where $\xi = 0$ and $\hat e = \hat p - \bar{\hat p}^*$ hold), this then gives for a Cagan-type real money demand function $m^d(Y, r) = kY\exp(\alpha(\bar r^* - r))$, i.e., for the reduced form of the LM curve,

$$r = \bar r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}$$

as the law of motion for aggregate government wealth (or better debt)

$$\dot W_g^a = \bar\rho^* W_g^a + r\,m^d(Y, r) + \bar T - \bar G$$

i.e.,

$$\dot W_g^a = \bar\rho^* W_g^a + \left(\bar r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} + \bar T - \bar G$$
This shows that the evolution of this debt is governed (in a destabilizing way) by its level, but is also dependent on the price level and its evolution in addition to the budget surplus (or deficit). Since this debt position of the government is now no longer constant, we next repeat the equations for the evolution of total wealth W = eF/p of the economy for this case, and then again consider private disposable income $Y_p$ and private wealth $W_p$ in its interaction with the evolution of aggregate government debt.

$$W = W_p + W_g + W_c = \frac{eF}{p}, \qquad F = F_p + F_c$$
$$\hat W = \hat e + \hat F - \hat p$$
$$\dot W = \hat e W + \frac{e\dot F}{p} - \hat p W$$
$$= \hat e W + \frac{p(Y - C - G) + e r^* F}{p} - \hat p W$$
$$= (\hat e - \hat p)W + r^*\frac{eF}{p} + Y - C - G$$
$$= (r^* + \hat e - \hat p)W + Y - C(Y_p, W_p, \rho^*) - G$$
$$\dot W = \bar\rho^* W + Y - C(Y_p, W_p, \bar\rho^*) - \bar G$$

where we have for the definition of private wealth and disposable income:

$$W_p = \frac{M + B + eF_p}{p} = W - W_g^a$$
$$Y_p = Y - T + (r^* + \hat e - \hat p)\frac{eF_p}{p} + (r - \hat p)\frac{B}{p} - \hat p\frac{M}{p}$$
$$= Y - T + (r^* + \hat e - \hat p)W_p - (r^* + \hat e - \hat p)\frac{M+B}{p} + (r - \hat p)\frac{B}{p} - \hat p\frac{M}{p}$$
$$= Y - T + \rho^* W_p - \xi\,\frac{M+B}{p} - r\frac{M}{p}$$
$$= Y - T + \rho^*(W - W_g^a) - \xi\,\frac{M+B}{p} - r\frac{M}{p}$$

with $\xi = 0$ in the case of a perfectly open economy. From the results on the disposable income of households and the government, we finally also get

$$Y_p = Y - Y_g^a + \bar\rho^* W$$

or

$$Y_p + Y_g^a = Y + \bar\rho^* W$$
as the relationship between total disposable income, domestic product, and real interest on domestically held foreign bonds. In the case where the price level p, money supply M, government expenditure $\bar G$, and taxes $\bar T$ are all held constant, we then get that the implied evolution of government debt $W_g^a$ is always divergent on both sides of its steady-state level

$$W_{go}^a = -\frac{\bar T - \bar G + \left(\bar r^* + \dfrac{\ln p_o + \ln Y + \ln k - \ln M}{\alpha}\right)\dfrac{M}{p_o}}{\bar\rho^*}$$

There is thus a need for an active fiscal or monetary policy (not necessarily supported by price-level adjustments according to a standard expectations-augmented Phillips curve) in order to stabilize the government budget. Without such a policy, both the budget of the government and the behavior of the capital account will be unstable, since the government deficit feeds back into the capital account via the consumption behavior of households (but is, in the present situation, completely independent from the rest of the economy). Using the steady-state value of $W_g^a$ in the law of motion for W then implies the same stability characterization for the adjustment of domestic foreign bond holdings as before, which therefore is now also driven by the unstable behavior of government debt, but otherwise behaves as was analyzed in the case of a constant government debt. It follows that we have to integrate active fiscal policy rules, price-level adjustments, and an active monetary policy rule here in order to get less one-sided results on the evolution of the twin deficits considered, the stability of the evolution of government debt, and the balance of payments (BOP) adjustment processes involved.
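The divergence of government debt under passive policy can be made concrete with a short sketch. With p, M, G, and T held constant, the law of motion is linear, $\dot W_g^a = \rho^* W_g^a + s$, where `surplus` below stands for the constant term $rM/p + T - G$; the numerical values are illustrative assumptions only.

```python
import math


def debt_path(w_init, rho, surplus, t):
    """Closed-form solution of dW/dt = rho * W + surplus at time t.

    The stationary level is -surplus / rho; any initial deviation from it
    is scaled by exp(rho * t), i.e. it grows whenever rho > 0.
    """
    w_ss = -surplus / rho
    return w_ss + (w_init - w_ss) * math.exp(rho * t)


if __name__ == "__main__":
    rho, surplus = 0.03, 0.06          # assumed real rate and constant term
    for w0 in (-2.1, -2.0, -1.9):      # below, at, and above the stationary level
        print(w0, [round(debt_path(w0, rho, surplus, t), 3) for t in (0, 25, 50)])
```

Starting exactly at the stationary level keeps the debt position constant; starting on either side of it produces a deviation that grows at the rate $\rho^*$.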
4 Twin Deficits and PPP/UIP-Driven Price-level Dynamics

At the end of the preceding section, we assumed that the price level is a given constant in order to study the evolution of deficits or surpluses in the GBR and in the BOP in isolation. Yet prices (and the exchange rate) are moving if the price level is shocked out of its steady-state position $p_0$, where the nominal rate of interest implied by the LM curve is no longer equal to the given world interest rate. In such a case, the resulting interest rate differential determines, via the UIP condition, the growth rate of the nominal exchange rate, and consequently, via the PPP, also the growth rate of domestic prices. Changes in the domestic price level feed back into the laws of motion for W as well as $W_g^a$ (but not vice versa), and thus make the dynamics of these two stock variables more complicated. In this subsection, we study these integrated dynamics of the private sector in their still fairly recursive format before we come, in the next section, to the question of what fiscal and monetary policy can do in view of such extreme formulations of the dynamics of the private sector.
4.1 Price-Level Dynamics in the Perfectly Open Economy

The full set of three laws of motion of the private sector of the perfectly open economy considered in this chapter, with foreign inflation set equal to zero ($\bar{\hat p}^* = 0$) and with fixed policy parameters ($\bar M, \bar T, \bar G$), reads, on the basis of the Cagan money demand function of this section, as

$$\dot W = \rho^* W + Y - C\!\left(Y + \rho^* W - \rho^* W_g^a - r\frac{M}{p} - T,\; W - W_g^a,\; \rho^*\right) - G \tag{19}$$
$$\dot W_g^a = \rho^* W_g^a + r\frac{M}{p} + T - G \tag{20}$$

now coupled with the law of motion for the price level p

$$\hat p = [\hat e =]\ \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} \tag{21}$$

This is based on the relationships

$$W = \frac{eF}{p}, \qquad r = r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}, \qquad W_g^a = \frac{-(M+B) + eF_c}{p}$$
The steady state of this dynamic system is determined by

$$p_0 = \frac{M}{Y} \qquad [\ln p_0 = \ln M - \ln Y]$$
$$\left(r\frac{M}{p}\right)_0 = r^*\frac{M}{p_0} = r^* Y$$
$$(W_g^a)_0 = -\frac{r^* Y \exp(-\alpha r^*) + T - G}{\rho^*}$$
$$(Y_p)_0 = Y - G + \rho^* W_0$$
$$(W_p)_0 = W_0 - (W_g^a)_0$$
$$W_0 = -\frac{Y - C((Y_p)_0, (W_p)_0, \rho^*) - G}{\rho^*}$$
Note that the equations for private disposable income and wealth have to be inserted into the last equation in order to get a single equation in the only unknown W0 , that must then be solved for this term.
The three laws of motion shown above, which describe the dynamics of the private sector of our perfectly open economy as a result of all the assumptions made on its ideal performance, can be structured hierarchically in the following way.

$$\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} = \hat p(p) \tag{22}$$
$$\dot W_g^a = \rho^* W_g^a + \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} + T - G = \dot W_g^a(p, W_g^a) \tag{23}$$
$$\dot W = \rho^* W + Y - C\!\left(Y - \rho^* W_g^a - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T + \rho^* W,\; W - W_g^a,\; \rho^*\right) - G = \dot W(p, W_g^a, W) \tag{24}$$
Inserting the steady-state values $p_0$, $(W_g^a)_0$ into the law of motion for W would lead us back to the situation considered in the preceding section, since neither p nor $W_g^a$ depends on the evolution of total domestic wealth, due to the hierarchy in the laws of motion of the 3D dynamics that characterize the private sector of this perfectly open economy. Now, however, the price level p and the representation of aggregate government debt $W_g^a$ are moving in general, and their stability properties must be investigated. The law of motion for the price level is of the type already considered in Sargent and Wallace (1973) in their formulation of the jump-variable technique (JVT) of the rational expectations approach. It could, in principle, be treated as in that paper, but would then be confronted with the problem of how the real magnitudes $W_g^a$, W have to be treated in view of such jumps in the price level. In view of the many deficiencies of the JVT (see, for example, Asada et al. 2003), we do not follow such an approach any more, but now investigate the 3D dynamics from the perspective that all three of its variables are predetermined, and that an implied instability of the system must be overcome by conscious policy actions and not by the imposition of appropriate jumps on a set of variables that are assumed to be nonpredetermined (a situation that is hard to believe to apply to the general level of prices of an applicable macromodel).
4.2 Stability Analysis

The Jacobian of the given 3D dynamics, considered at the steady state, is characterized by

$$J = \begin{pmatrix} + & 0 & 0 \\ \mp & + & 0 \\ \mp & + & \mp \end{pmatrix}$$

where the sign of $J_{21}$, $J_{31}$ is given by the slope of the function

$$f(p) = \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\cdot\frac{M}{p}$$

with respect to p, and the sign of $J_{33}$ is determined as we have discussed in the original one-dimensional setting of this chapter. Reformulating f(p) as a function of the rate of interest r, we get $\tilde f(r) = f(p(r)) = r\,m^d(Y, r)$ and thus

$$\tilde f'(r) = r\,m_r^d + m^d = m^d\left(\frac{r\,m_r^d}{m^d} + 1\right)$$

i.e., the sign of this function is determined by the interest-rate elasticity of the money demand function

$$\tilde f'(r) \gtrless 0 \iff f'(p) \gtrless 0 \iff \frac{dm^d}{dr}\Big/\frac{m^d}{r} \gtrless -1$$

The coefficients of the Routh–Hurwitz polynomial corresponding to the matrix J are, in the given situation (where there are three zero entries in this matrix J),

$$a_1 = -\operatorname{trace} J = -J_{11} - J_{22} - J_{33}$$
$$a_2 = J_{11}J_{22} + J_{11}J_{33} + J_{22}J_{33}$$
$$a_3 = -\det J = -J_{11}J_{22}J_{33}$$

The stability of the dynamics considered thus only depends on the terms in the diagonal of the matrix J, and thus, in particular, not on the considered interest-rate elasticity condition for the money demand function. Furthermore, if $a_3 > 0$ holds (as a necessary stability condition), we basically need to consider only whether the signs of $a_1$ or $a_2$ can hurt stability, since the remaining stability condition $a_1a_2 - a_3 > 0$ is not a questionable one due to the fact that the terms in $a_1a_2$ completely dominate the single one in $a_3$. Ignoring this, the remaining necessary conditions for local asymptotic stability are then

$$J_{33} < 0, \qquad J_{11} + J_{22} < -J_{33}$$

and if this holds,

$$-(J_{11}J_{33} + J_{22}J_{33}) < J_{11}J_{22}$$

This implies that

$$-J_{33}(J_{11} + J_{22}) < J_{11} + J_{22} < -J_{33}$$

or

$$-J_{33} < 1 < -\frac{J_{33}}{J_{11} + J_{22}}$$
As a result of the preceding section, the assumption −J33 < 1 is not a crucial one, while the other inequality states that J11 + J22 must be sufficiently small (relative to the size of J33) in order to get local asymptotic stability for the 3D dynamics considered. We thus see that the only potentially stabilizing term J33 in the 3D dynamics considered bears a big burden should it be able to make the overall dynamics of the private sector (with given policy parameters of the government and the Central Bank) an asymptotically stable one. It seems likely that active policy is needed for the proper working of such an economy. We conclude that given fiscal and monetary policy parameters represent too narrow a situation in order to safely exclude accelerating processes from the dynamics implied by the model. This is also fairly obvious from the nominal representation of these dynamics as provided in the digression that follows.
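The Routh–Hurwitz conditions for this Jacobian can be checked numerically with a small sketch. Because J is lower triangular, only its diagonal entries enter the coefficients $a_1$, $a_2$, $a_3$ (and its eigenvalues coincide with those entries). The numerical values used in the example call are hypothetical and only respect the sign pattern $J_{11} > 0$, $J_{22} > 0$.

```python
def routh_hurwitz_3d(j11, j22, j33):
    """Routh-Hurwitz coefficients for a lower-triangular 3x3 Jacobian.

    Returns (a1, a2, a3, stable), where stable is True iff
    a1 > 0, a2 > 0, a3 > 0 and a1 * a2 - a3 > 0.
    """
    a1 = -(j11 + j22 + j33)
    a2 = j11 * j22 + j11 * j33 + j22 * j33
    a3 = -(j11 * j22 * j33)
    stable = a1 > 0 and a2 > 0 and a3 > 0 and a1 * a2 - a3 > 0
    return a1, a2, a3, stable


if __name__ == "__main__":
    # Assumed values: J11 = 1/alpha with alpha = 2, J22 = rho* = 0.03, J33 = -1.
    print(routh_hurwitz_3d(0.5, 0.03, -1.0))
    # Reference case with an all-negative diagonal, for comparison.
    print(routh_hurwitz_3d(-1.0, -2.0, -3.0))
```

For the assumed values with positive $J_{11}$ and $J_{22}$, the coefficient $a_2$ turns negative, so the conditions fail; the all-negative reference case passes them.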
4.3 Digression: The Dynamics in Nominal Terms

In a world where the steady state is inflation-free ($M = \bar M$, $\bar{\hat p}^* = 0$), we can also discuss the dynamics of Eqs. 22–24 in nominal terms and get in this case for its three laws of motion

$$\hat p = r - r^* = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} = \hat p(p)$$
$$\dot B = rB + p(G - T) - r^* p W_c = \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)B + p(G - T) - r^* p W_c = \dot B(p, B)$$
$$\dot W = r^* W + Y - G - C\!\left(Y + r^*(W - W_c) + r^*\frac{M+B}{p} - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T,\; W - W_c + \frac{M+B}{p},\; r^*\right)$$

or indeed translated back into nominal terms

$$\dot F = r^* F + p^*(Y - G) - p^* C\!\left(Y + r^*\!\left(\frac{F}{p^*} - W_c\right) + r^*\frac{M+B}{p} - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T,\; \frac{F}{p^*} - W_c + \frac{M+B}{p},\; r^*\right)$$

where we have again made use of $W = eF/p = F/\bar p^*$, $\bar W_c = eF_c/p = F_c/\bar p^*$, $\bar p^*$ = constant. The stability analysis for the system in Eqs. 22–24 also applies to this special case of course, now with respect to the dynamic variables p, B, F. For an application of the jump-variable technique, we then find that only p can be a candidate for its application, i.e., for its application, we must construct a case where there is exactly one unstable root of the Jacobian J of the dynamics at the steady state. This can be the case if and only if the determinant of J (the product of the three eigenvalues of the matrix J) is positive, in which case, however, these three eigenvalues must be real and positive in fact, i.e., there is no case within these dynamics where the jump variable can find application. From the perspective of current new Keynesian theory, one may call such a situation an indeterminate one, which should be blocked from consideration. From this perspective, the task would then be to find policy rules that bring determinacy to the system, i.e., that establish the existence of one unstable root for the extended dynamics. This would bring jump-variable stability to the system by the very existence of these rules, so that, for example, an unanticipated jump in the money supply would cause the price level to jump in the same proportion without any reaction in the other variables of the system (with the exception of the nominal exchange rate that also has to jump by the percentage change in the money supply). In this situation, therefore, latent rules cause the economy to return to its real steady state immediately by purely nominal adjustments, and thus provide stability without any need for policy intervention.
However, our approach to such a situation is to assume that if the system is unstable, it behaves in a predetermined way according to the three laws of motion above, and thus, in the case of instability (which is necessarily monotonic, as we have just seen), with increasing or decreasing price levels and increasing imbalances in the budget of the government and in the current and the capital account. We could then, for example, assume policy reactions (fiscal or monetary) when these imbalances pass certain thresholds, which attempt to reduce single or twin deficits in the budget and the current account by tailored policy reactions. To consider such behavior in more detail is the objective of the following section.
5 Fiscal and Monetary Policy in the Perfectly Open Economy

We have seen in the preceding section that constant fiscal and monetary expressions (passive policies) are not a good scenario for obtaining stability of the stationary solution of the dynamic system considered. In particular, prices and the evolution of government debt (but also the evolution of the foreign position) will then be subject to centrifugal forces. In this section, we formulate some appropriate (but not necessarily plausible) monetary and fiscal policy rules that, at least from a partial perspective, are capable of taming the cumulative forces in the price level, the budget deficit, and the current account deficit, but due to space limitations only separately. Later we combine these policy rules in a general setup of the model in order to allow, at least numerically (in future research), an analysis of their potential for stabilizing the perfectly open economy considered either locally around the steady state or, if the policy is still inactive, at least globally if the economy has departed significantly from its steady-state position. This section leads to the conclusion that policy anchors in the form of given policy targets and policy rules that attempt to realize these targets are needed for a proper working of the ideal type of economy considered here.
5.1 Stabilizing Monetary Policy

In view of the special type of price dynamics characterizing our perfectly open economy (where the law of motion of the price level is a direct consequence of the UIP and the PPP conditions), we now assume as the monetary policy the rule shown below, which due to its formulation implies an independent 2D dynamics for the state variables p, M.

$$\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}$$
$$\hat M = \beta_{m_1}(M_0/M - 1) + \beta_{m_2}(p/p_0 - 1)$$

with steady state $p_0 = 1 = M_0$ by way of an appropriate choice of the measurement of output Y. Due to the isolated nature of price-level changes, this dynamic system is an autonomous one that can be studied in isolation from the remaining laws of motion of the economy considered. The Jacobian of these dynamics at the steady state is given by

$$J = \begin{pmatrix} 1/\alpha & -1/\alpha \\ \beta_{m_2} & -\beta_{m_1} \end{pmatrix}$$

The steady state is therefore locally asymptotically stable iff

$$1/\alpha < \beta_{m_1} < \beta_{m_2}$$

holds true. Commenting on this result, it is a bit strange to see that the ideal monetarist world of this chapter needs, in its advanced form, money supply increases in order to fight inflation. This is due to the destabilizing feedback channel that leads from rising prices to rising interest rates, from there to rising growth rates of the exchange rate (due to the UIP assumption), and then via the PPP to rising inflation rates in the domestic economy. Increasing the money supply in such a feedback chain then means that the increase in interest rates is counteracted, so that under the above conditions, the positive feedback mechanism between the price level and its rate of change is neutralized, such that the economy can indeed converge back to the steady state after the occurrence of isolated shocks. We thus have a stabilizing antimonetarist monetary policy rule in a model which is extremely monetarist in nature.
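The stabilizing effect of the money supply rule can be illustrated by integrating the 2D dynamics numerically. The sketch below works in logarithms, normalizes $\ln Y + \ln k = 0$ so that $p_0 = M_0 = 1$, and uses an explicit Euler scheme; the step size, horizon, and parameter values are assumptions chosen for illustration, not values taken from the chapter.

```python
import math


def simulate_price_money(alpha, beta_m1, beta_m2, ln_p0=0.05, ln_m0=0.0,
                         dt=0.01, steps=10000):
    """Euler integration of the (ln p, ln M) dynamics around p_0 = M_0 = 1."""
    ln_p, ln_m = ln_p0, ln_m0
    for _ in range(steps):
        # Inflation from UIP and PPP: p_hat = (ln p + ln Y + ln k - ln M) / alpha,
        # with ln Y + ln k normalized to zero (assumption).
        p_hat = (ln_p - ln_m) / alpha
        # Money growth rule: M_hat = b_m1 (M_0 / M - 1) + b_m2 (p / p_0 - 1).
        m_hat = (beta_m1 * (math.exp(-ln_m) - 1.0)
                 + beta_m2 * (math.exp(ln_p) - 1.0))
        ln_p += dt * p_hat
        ln_m += dt * m_hat
    return ln_p, ln_m


if __name__ == "__main__":
    # Active rule satisfying 1/alpha < beta_m1 < beta_m2 (0.5 < 1 < 2).
    print(simulate_price_money(alpha=2.0, beta_m1=1.0, beta_m2=2.0))
    # Passive money supply over a shorter horizon: the price shock is amplified.
    print(simulate_price_money(alpha=2.0, beta_m1=0.0, beta_m2=0.0, steps=1000))
```

With the active rule the small initial price shock dies out in damped oscillations, whereas with a constant money supply the same shock grows at the rate $1/\alpha$.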
5.2 Stabilizing Fiscal Policy I

We next consider a fiscal policy rule that attempts to suppress the cumulative forces in the evolution of government debt by way of a debt-to-GDP target, as in the Maastricht Treaty of the European Community. We again formulate an isolated 2D dynamics that only shows the interaction of this rule with the dynamics of the government budget constraint, and thus ignores all influences of the dynamics of the price level (and of the money supply). For this purpose, we assume that the above steady-state values $p_o$, $M_o = 1$ underlie the discussion of the next policy regime ($e_o = 1$, $\bar p^* = 1$, $r_o = \bar r^*$ in addition).

$$\dot B = r_o B + \bar G - T - \bar r^* \bar W_c$$
$$\hat T = \beta_{t_1}(T_o/T - 1) + \beta_{t_2}(B/Y - \bar d)$$

We have assumed in the second law of motion that government wants to achieve a certain debt-to-GDP ratio $\bar d$ by adjusting taxes with a certain speed parameter $\beta_{t_2}$ accordingly. In addition, there is lump-sum tax smoothing with a speed parameter $\beta_{t_1}$. The steady-state values of these partial dynamics (where government expenditure is also held constant) are $B_o = \bar d Y$, $T_o$, where the latter is determined such that $\dot B = 0$ is established. For the Jacobian at the steady state, in this case we get

$$J = \begin{pmatrix} r_o & -1 \\ \beta_{t_2}T_o/Y & -\beta_{t_1} \end{pmatrix}$$

The steady state is therefore locally asymptotically stable iff $\beta_{t_1} > r_o$ and $r_o\beta_{t_1}Y < \beta_{t_2}T_o$ holds true. A taxation rule with some tax smoothing and a term that attempts to steer lump-sum taxes such that the debt-to-GDP ratio approaches a given percentage (the 60% of the Maastricht Treaty, for example) is thus successful in stabilizing the government deficit if government expenditure is held constant during this adjustment process, and if the price-level dynamics is fixed at its steady-state position. Note here that the limiting case $\beta_{t_1} = \infty$ has already been considered earlier as the case where taxation is determined such that government debt $W_g^a$ is kept constant.
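The stability condition for this tax rule follows from the trace and determinant of the 2D Jacobian. The short derivation below is a sketch that assumes the Jacobian entries $r_o$, $-1$, $\beta_{t_2}T_o/Y$, $-\beta_{t_1}$ implied by the two laws of motion at the steady state; a planar system is locally asymptotically stable iff the trace is negative and the determinant positive.

```latex
\begin{aligned}
\operatorname{trace} J &= r_o - \beta_{t_1} < 0
  &&\Longleftrightarrow\quad \beta_{t_1} > r_o, \\
\det J &= r_o(-\beta_{t_1}) - (-1)\,\beta_{t_2}\frac{T_o}{Y}
        = \beta_{t_2}\frac{T_o}{Y} - r_o\beta_{t_1} > 0
  &&\Longleftrightarrow\quad r_o\,\beta_{t_1}\,Y < \beta_{t_2}\,T_o .
\end{aligned}
```

Read this way, tax smoothing must be faster than the rate at which interest payments compound the debt, and the response of taxes to the debt ratio must be strong enough relative to that smoothing.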
5.3 Stabilizing Fiscal Policy II

We have just seen that a debt-to-GDP target of the government may stabilize the originally cumulative forces in the GBR if taxes are varied in the direction of obtaining such a debt ratio, and if this variation occurs with sufficient strength. Regarding the second, possibly cumulative, instability in the balance of payments adjustment process, we assume now as a further fiscal policy rule that government expenditure varies such that current account surpluses or deficits are reduced intentionally, again accompanied by a second component in this fiscal policy rule that adds some expenditure smoothing. We now assume (for a partial investigation of the implications of such a policy rule) that the steady-state values $p_o$, $M_o$, $B_o$ underlie the discussion of this policy regime (again plus $e_o = 1$, $\bar p^* = 1$, $r_o = \bar r^*$), and that lump-sum taxes are a fixed magnitude.

$$\dot W = r^* W + Y - G - C\!\left(Y + r^*(W - \bar W_c) + r^*\frac{M_o + B_o}{p_o} - \left(r^* + \frac{\ln p_o + \ln Y + \ln k - \ln M_o}{\alpha}\right)\frac{M_o}{p_o} - T,\; W - \bar W_c + \frac{M_o + B_o}{p_o},\; r^*\right)$$
$$\dot G = \beta_{g_1}(G_o - G) + \beta_{g_2}\dot W$$

Improvements in the current account thus lead to increases in government spending from this partial perspective. In the steady state we have $G = G_o$, which can here be given from the outside (within certain limits). We then obtain the value of $W_o$ by setting the first equation equal to zero. The Jacobian of these dynamics at the steady state reads

$$J = \begin{pmatrix} \dot W_W & \dot W_G \\ \beta_{g_2}\dot W_W & \beta_{g_2}\dot W_G - \beta_{g_1} \end{pmatrix}$$

This implies local asymptotic stability iff

$$\dot W_W = r_o(1 - C_1) - C_2 < \beta_{g_1} + \beta_{g_2}, \qquad -\dot W_W\beta_{g_1} > 0$$

holds true. In the case when $\dot W_W < 0$, i.e., for a stable balance of payments adjustment process, this demands that $\beta_{g_1} > 0$ holds, i.e., indeed the occurrence of government expenditure smoothing. In the opposite case, however, we need $\beta_{g_1} < 0$, i.e., the government should then change its smoothing rule into a rule with a centrifugal in the place of a centripetal adjustment pattern. Stimulating government expenditure above its steady-state level, as well as responding positively to current account improvements, then induces an adjustment in the current account and the level of government expenditure that implies convergence back to the steady state should the economy be shocked out of this steady-state position.
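The two cases can be checked with a short sketch of the 2D Jacobian of the (W, G) dynamics. It assumes $\dot W_G = -1$ (government expenditure enters the law of motion for W only through the term $-G$ when lump-sum taxes are fixed); the numerical values of $\dot W_W$, $\beta_{g_1}$, and $\beta_{g_2}$ are hypothetical.

```python
def fiscal_rule_ii_jacobian(w_w, beta_g1, beta_g2, w_g=-1.0):
    """Jacobian of the (W, G) dynamics; w_w and w_g are the partials of W-dot."""
    return [[w_w, w_g],
            [beta_g2 * w_w, beta_g2 * w_g - beta_g1]]


def trace_det(j):
    """Trace and determinant of a 2x2 matrix given as nested lists."""
    trace = j[0][0] + j[1][1]
    det = j[0][0] * j[1][1] - j[0][1] * j[1][0]
    return trace, det


def is_locally_stable(j):
    """A planar steady state is locally asymptotically stable iff trace < 0 < det."""
    trace, det = trace_det(j)
    return trace < 0 and det > 0


if __name__ == "__main__":
    # Stable BOP adjustment (W_W < 0) with ordinary expenditure smoothing.
    print(is_locally_stable(fiscal_rule_ii_jacobian(-0.2, 0.5, 0.3)))
    # Unstable BOP adjustment (W_W > 0) with ordinary smoothing.
    print(is_locally_stable(fiscal_rule_ii_jacobian(0.2, 0.5, 0.3)))
    # Unstable BOP adjustment with a negative smoothing parameter.
    print(is_locally_stable(fiscal_rule_ii_jacobian(0.2, -0.5, 1.0)))
```

Under the assumption $\dot W_G = -1$ the determinant reduces to $-\beta_{g_1}\dot W_W$, which is why the sign of the smoothing parameter has to be opposite to the sign of $\dot W_W$.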
5.4 Outlook: The Integrated Dynamics

Looking at the system as a whole, we can state that here monetary policy works independently of fiscal policy, while the last two dynamic systems depend on each other, the first through changing G and the second through changing T. Of course they also depend on what is going on in the nominal part of the economy. It is therefore necessary to consider the integrated interaction of the monetary policy rule and the two fiscal rules. The resulting 6D dynamics read

$$\hat p = \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}$$
$$\hat M = \beta_{m_1}(M_o/M - 1) + \beta_{m_2}(p/p_o - 1)$$
$$\dot B = \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)B + pG - pT - r^* p \bar W_c - \dot M$$
$$\hat T = \beta_{t_1}(T_o/T - 1) + \beta_{t_2}(B/(pY) - \bar d)$$
$$\dot W = r^* W + Y - G - C\!\left(Y + r^*(W - \bar W_c) + r^*\frac{M+B}{p} - \left(r^* + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T,\; W - \bar W_c + \frac{M+B}{p},\; r^*\right)$$
$$\dot G = \beta_{g_1}(G_o - G) + \beta_{g_2}\dot W$$

where $W = F/\bar p^*$, $\bar W_c = F_c/\bar p^*$. The steady state of these dynamics is characterized by ($G_o = \bar G$ again given exogenously)

$$p_o = 1, \quad M_o = 1, \quad B_o = \bar d Y, \quad W_{co} = M_o/p_o = 1, \quad (T - G)_o = r_o(B_o - M_o), \quad T_o = G_o + (T - G)_o$$

while $W_o$ can then be determined from the $\dot W = 0$ equation on the basis of the steady-state values for fiscal and monetary policy.9 We note that the laws of motion for the price level p and the money supply M are again independent from the rest of the system, while the remaining laws of motion depend on them and are now fully independent of each other. Furthermore, we note that the only link from the capital account block to the block describing the budget dynamics is given by the term G in the government budget constraint, while the laws of motion for W, G depend on all the state variables of the model. It is not possible here to provide a proof of local asymptotic stability due to the complexity of the dynamics considered. In fact, one may furthermore assume that the policy rules considered only come into effect outside a certain neighborhood of the steady state. Numerical tools therefore have to be used for the resulting nonlinear system to check whether the proposed fiscal and monetary policy rules bound the dynamics to an economically meaningful domain. The result (yet to be achieved) would then imply that even the ideal world of the perfectly open economy under consideration (due to its centrifugal forces) needs the visible hand of economic policy to limit these forces to an economically feasible domain (not to speak of convergence back to the steady state). Instability due to myopic perfect foresight and the UIP condition, transferred to price-level dynamics by the PPP, and augmented by eventually cumulative processes in the accumulation of deficits or surpluses in the budget and the current account, thus needs policy intervention even under the ideal conditions of this advanced classical model of a small open economy.
6 Conclusions

The primary objective of this chapter has been to consider the evolution of the twin deficits (or surpluses) concerning the government budget and the current account, and the (in)stability conditions that characterize this evolution. Moreover, the perfectly open economy considered here was also subject to explosive price-level adjustments caused by the joint working of the UIP and the PPP conditions. The private sector of this economy was therefore plagued by various types of instability, which can only be overcome in a very limited way by the jump-variable technique of the rational expectations school. We did not follow the JVT here, however, but designed appropriate, not always plausible, policy rules which could tame the explosive nature of the dynamics of the private sector under certain conditions. These policies concerned an antimonetarist type of money supply adjustment rule, a tax policy that was adjusting debt to a certain debt-to-GDP ratio, and a government expenditure policy that attempted to counteract the cumulative forces in the dynamics of the current account, or its mirror image the capital account. In this way, the perfectly open economy could be made a stable one, although the dynamics that result from all these additions may be difficult to analyze from an integrated point of view in the place of the partial ones adopted in the final section of this chapter. Such an integration must be left for future research.

9 If one uses as expenditure rule the law of motion $\dot G = \beta_{g_1}(G_0 - G) + \beta_{g_2}(Y - C - G - X_0)$, which focuses on the trade account instead of the current account and which assumes a given government target $X_0$ for exports X, one can determine the steady-state value of W by $W_o = X_o/\bar r^*$, and on this basis a steady-state value for G from $\dot W = 0$ by making use of the given steady-state value for G − T. We then have the anchors $\bar d$, $X_0$ in the place of $\bar d$, $\bar G$ for the long-run position of the economy.
Appendix A The Two-Country One-Commodity Case and Other Extensions of the Model

We return to consideration of the given fiscal and monetary policy parameters, but now consider data from the rest of the world as being endogenously determined in interaction with the dynamics that was so far considered. The situation in country 1, the domestic economy, is then described by

$$\hat p = r - \rho = r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha} - \rho \tag{A1}$$
$$\dot W_g^a = \rho W_g^a + \left(r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} + T - G \tag{A2}$$
$$\dot W = \rho W + Y - C\!\left(Y + \rho W - \left(r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho\right) - G \tag{A3}$$

since $r^* \neq r$ and $\rho = r - \hat p = \rho^* = r^* - \hat p^*$ are now endogenously determined variables. In the same way, it holds for the foreign economy that on the basis of the assumptions for perfectly open economies,

$$\hat p^* = r^* - \rho = r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*} - \rho \tag{A4}$$
$$\dot W_g^{a*} = \rho W_g^{a*} + \left(r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*}\right)\frac{M^*}{p^*} + T^* - G^* \tag{A5}$$
$$-\dot W^* = \dot W = \rho W + Y - C\!\left(Y + \rho W - \left(r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho\right) - G \tag{A6}$$

The uniquely determined real rate of interest, the world real rate of interest $\rho$, is then to be determined as the "natural rate" from the world goods market equilibrium relationship

$$Y + Y^* = C\!\left(Y + \rho W - \left(r_o + \frac{\ln p + \ln Y + \ln k - \ln M}{\alpha}\right)\frac{M}{p} - T - \rho W_g^a,\; W - W_g^a,\; \rho\right) + G$$
$$\qquad + C^*\!\left(Y^* - \rho W - \left(r_o + \frac{\ln p^* + \ln Y^* + \ln k^* - \ln M^*}{\alpha^*}\right)\frac{M^*}{p^*} - T^* - \rho W_g^{a*},\; -W - W_g^{a*},\; \rho\right) + G^* \tag{A7}$$

This has to be inserted into the five laws of motion above for p, p*, $W_g^a$, $W_g^{a*}$, and W in order to arrive at an autonomous system of five differential equations describing the dynamic interactions in our two-country world economy. We note that the rate $\rho$ depends, according to the world goods market equilibrium, on all five state variables of the dynamics considered. Finally, the dynamics of the nominal exchange rate is obtained as an appended law of motion, via the PPP, through $\hat e = \hat p - \hat p^*$. Due to the form of this law of motion, we see that it depends linearly on the laws of motion for p, p*, which implies that zero-root hysteresis is present in this two-country model as far as the evolution of the nominal exchange rate is concerned. The model considered shows how the real world rate of interest (which was so far considered with given magnitude) is determined in a world of two perfectly open economies, and how, on this basis, the dynamics of goods prices can be obtained together with the laws of motion for aggregate government debts and the debt–creditor relationship between the two economies. The dynamics that result from these interacting two-country macrodynamics are, of course, already fairly involved. In our view, they should be solved on the basis of gradually adjusting price levels and of course gradually adjusting wealth expressions. If they are locally unstable, mechanisms in the private sector or fiscal and monetary policy reactions must again be found that limit the local explosiveness at least away from the steady state such that the world economy remains economically viable.
Further extensions may concern the assumption of a two-commodity two-country world, the removal of the full employment condition and its replacement by a conventional type of Phillips curve, the existence of Keynesian demand restrictions on the world market for the domestic commodity, i.e., the assumption of an export demand function, the inclusion of investment behavior, imperfect asset substitution, and more.
Appendix B Notation
The following list of symbols contains only domestic variables and parameters. Magnitudes referring to the foreign country are defined analogously and are indicated by an asterisk (*). Superscript d characterizes demand expressions, while the corresponding supply expressions do not have any index (in order to save notation). A “dot” is used to characterize time-derivatives, and a “hat” for corresponding rates of growth. Furthermore, we use an index o to denote steady-state expressions. Finally, we characterize exogenous variables by means of a bar over the variable considered. The statically or dynamically endogenous variables of the model are listed below.

p        price level
w        wage level
r        nominal interest rate
ρ        real interest rate
ξ        risk premium
e        exchange rate
Y        output
T        lump sum taxes
G        government expenditure
Y^p      private disposable income
Y^a_g    aggregate government disposable income
f(N)     production function
N        labor supply
C        private consumption
S^p      private savings
S        total savings
M        money supply
B        domestic bonds
F        foreign bonds
W^p      real private wealth
W^a_g    aggregate real government wealth
W        foreign bonds (in real terms)
References
Asada T, Chiarella C, Flaschel P, Franke R (2003) Open economy macrodynamics. An integrated disequilibrium approach. Springer, Heidelberg
Rødseth A (2000) Open economy macroeconomics. Cambridge University Press, Cambridge
Sargent T, Wallace N (1973) The stability of models of money and growth with perfect foresight. Econometrica 41:1043–1048
7. Corridor Stability of the Neoclassical Steady State
Akitaka Dohtani¹, Toshio Inaba², and Hiroshi Osaka¹
Summary. Combining the permanent income hypothesis with the capital accumulation equation, we construct a growth model that gives a modified version of the neoclassical growth model. Consumption at any one time is determined by the expected permanent income at that time. The modified growth model possesses a unique steady state, which is the same as that of the neoclassical growth model. Unlike the neoclassical growth model, however, the modified growth model yields corridor stability. That is, there exists a corridor of stability around the steady state such that any path inside the corridor converges to the steady state and any path outside the corridor diverges. Thus, the neoclassical view of the economy holds inside the corridor but not outside it, and hysteresis emerges near the corridor. Although the modified growth model generates fluctuating dynamics, it explains how the stably observed, nearly proportional relation between consumption and income is compatible with such fluctuations. We also clarify the effects of the parameters on the aggregate and on the stability of the steady state.
Key words. Solow–Swan growth model, Permanent income, Neoclassical steady state, Corridor stability, Hopf bifurcation
¹ Faculty of Economics, University of Toyama, Gofuku 3190, Toyama, Japan
² School of Education, Waseda University, Nishiwaseda 1-6-1, Shinjuku-ku, Tokyo, Japan

1 Introduction
After the pioneering growth model of the Keynesian type by Harrod (1939) and Domar (1946), Solow (1956) and Swan (1956) independently constructed
the neoclassical aggregate model of growth (from now on, the S–S model). For the S–S model, see also Barro and Sala-i-Martin (1995). Harrod and Domar tried to prove the inherent instability of the capitalist economy, namely, the instability of the steady state of growth. Conversely, Solow and Swan proved the global stability of the steady state. The growth analyses of the Harrod–Domar type play little role in today’s thinking because the instability is too strong. On the other hand, the S–S model has laid the foundation for the advance of growth theory. Some recent statistical investigations show that the S–S model can describe the actual economy relatively well. See, for example, Mankiw et al. (1992). However, is the capitalist economy globally stable in the manner that the S–S model describes? Leijonhufvud (1973) stated that there exist two types of economic view. The first one is the neoclassical type: the market equilibrium is globally stable. The second one is the Keynesian type: the market equilibrium is completely unstable. In addition to these views, Leijonhufvud offers a new type of economic view: corridor stability. That is, there exists a corridor of stability around the market equilibrium such that any path inside the corridor converges to equilibrium, and any path outside the corridor moves far away from equilibrium. Thus, the emergence of corridor stability implies the coexistence of neoclassical and Keynesian views. Such a coexistence yields hysteresis. From the viewpoint of economic policy, the emergence of corridor stability is important, because in such a case even a small shock may change the economic situation drastically. We can easily extend the notion of corridor stability on market equilibrium to that on the steady state of growth. 
Consequently, the emergence of corridor stability implies that there exists a corridor around the steady state such that the neoclassical view holds true for the inside of the corridor, but does not do so for the outside of the corridor. This chapter tries to construct a growth model that generates a corridor of stability around the neoclassical steady state. In the sense that the markets in the growth model are always cleared, the growth model is an equilibrium model. The growth model combines the consumption function of the permanent income hypothesis (from now on, the PI hypothesis) with the capital accumulation equation. The steady state of the growth model is the same as that of the S–S model, and is locally asymptotically stable. Thus, the growth model gives a modified version of the S–S model, which is seemingly a commonplace model. Unlike the S–S model, however, the modified version is not globally stable, and generates a corridor of stability that consists of an unstable limit cycle. Therefore, in the modified version, the validity of the neoclassical steady state is limited to the inside of the corridor. This chapter is organized as follows. Section 2 constructs the growth model. Section 3 analyses the dynamics of the model. Section 4 shows that
our model explains a nearly proportional relation between consumption and income. Section 5 considers the effects of the parameters on the aggregate and on the stability of the steady state. Section 6 concludes the chapter. The appendix proves some results.
2 The Model
In this section, we construct a continuous-time growth model that possesses the same steady state as the S–S model. In the same way as the S–S model, the capital accumulation equation is given by

$$\dot{k} = f(k) - (n + \delta)k - c$$

where $k$ is the capital stock per capita, $c$ is per-capita consumption, $n$ is the growth rate of the population, $\delta$ is the depreciation rate, and $f$ is the production function. Throughout this chapter, all functions are assumed to be smooth. The smoothness assumption is necessary in order to use the Hopf bifurcation theorem that proves the emergence of Hopf cycles (namely, periodic paths).

We assume a representative household. Unlike the S–S model, the consumption of the representative household is dynamically determined by a differential equation. Such an equation is derived by combining the adjustment equation of the expected per-capita permanent income (from now on, EPCP income) with the consumption function that depends on the EPCP income. Our consideration of expected permanent income is based on Friedman (1957, p. 143, (5.15)). We assume that the EPCP income of the representative household is determined by the following distributed lag of $y$:

$$y_p(t) = \int_{(-\infty,\,t]} \beta\, y(\tau) \exp\big(\beta(\tau - t)\big)\, d\tau \tag{1}$$

where $y$ is income per capita and $y_p$ is EPCP income per capita. To transform Eq. 1 into a differential equation, we consider the time derivative of Eq. 1.

$$
\begin{aligned}
\dot{y}_p(t) &= \frac{d}{dt}\left\{\exp(-\beta t) \int_{(-\infty,\,t]} \beta\, y(\tau) \exp(\beta\tau)\, d\tau\right\} \\
&= -\beta \exp(-\beta t) \int_{(-\infty,\,t]} \beta\, y(\tau) \exp(\beta\tau)\, d\tau + \exp(-\beta t)\, \beta\, y(t) \exp(\beta t) \\
&= \beta\{y(t) - y_p(t)\}
\end{aligned}
$$

We call this equation the adjustment equation of EPCP income.
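The step from Eq. 1 to the adjustment equation can be checked numerically. The following is a minimal sketch, not part of the original argument: the income path $y(\tau) = 2 + \sin\tau$, the value of $\beta$, the date $t$, and the helper name `epcp_income` are all hypothetical choices, made only because Eq. 1 then has the closed form $y_p(t) = 2 + \beta(\beta\sin t - \cos t)/(\beta^2 + 1)$. The integral over $(-\infty, t]$ is truncated at a distant lower limit, which is harmless because the weights decay exponentially.

```python
import math

def epcp_income(y, t, beta, horizon=60.0, steps=60000):
    """Trapezoidal approximation of Eq. 1 on the truncated range [t - horizon, t]."""
    h = horizon / steps
    total = 0.0
    for i in range(steps + 1):
        tau = t - horizon + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * beta * y(tau) * math.exp(beta * (tau - t))
    return total * h

beta = 0.8                       # hypothetical adjustment speed
t = 1.3                          # hypothetical evaluation date

def y(tau):
    """Hypothetical income path, chosen because Eq. 1 has a closed form for it."""
    return 2.0 + math.sin(tau)

yp = epcp_income(y, t, beta)
yp_exact = 2.0 + beta * (beta * math.sin(t) - math.cos(t)) / (beta ** 2 + 1)

# Central difference of the distributed lag versus the right-hand side
# beta * (y - y_p) of the adjustment equation.
eps = 1e-3
yp_dot = (epcp_income(y, t + eps, beta) - epcp_income(y, t - eps, beta)) / (2 * eps)
rhs = beta * (y(t) - yp)

print(f"y_p(t):  numeric {yp:.6f}, closed form {yp_exact:.6f}")
print(f"dy_p/dt: {yp_dot:.6f}, beta*(y - y_p): {rhs:.6f}")
```

Under these assumptions the two printed pairs should agree up to discretization error, which is what the differentiation above asserts for any smooth income path.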
Remark 1: We consider a model economy in which the household expects that per-capita income does not grow. It can be expected that analogous results will be obtained in the case where per-capita income grows. In fact, Friedman (1957, p. 144, (5.17)) treated this case by extending Eq. 1 to

$$y_p(t) = \int_{(-\infty,\,t]} \beta\, y(\tau) \exp\big((\beta - \gamma)(\tau - t)\big)\, d\tau$$

where $\gamma$ is the estimated growth rate of $y_p$. In the context of this chapter, the growth of per-capita income is caused by exogenous Harrod-neutral or labor-augmenting technological progress. However, for further generalizations along this line, we will need to analyze higher-dimensional and/or nonautonomous differential equations. Such generalizations are therefore left for future research. ■

We next assume the following consumption decision depending on the EPCP income:

$$c = \alpha y_p$$

We call this equation the consumption function.

Remark 2: In the PI hypothesis, consumption consists of permanent and transitory components. The permanent component of consumption is closely related to growth dynamics. On the other hand, the transitory component of consumption fluctuates stochastically in the PI hypothesis, and is therefore related to business fluctuations. Thus, through transitory consumption, the PI hypothesis can be linked to stochastic business fluctuations. Since our concern here is the deterministic features in the long run, the permanent component rather than the transitory component plays an important role. So for simplicity, we will disregard transitory consumption. ■

In the S–S model, the consumption function relates the consumption at any one time to the income at that time. Our consumption function, however, relates the consumption at any one time to the EPCP income at that time. Since the EPCP income is dynamically determined, the consumption decision is inevitably dynamic, as follows. From the consumption function and the adjustment equation, we obtain

$$\dot{c} = \alpha\beta(y - y_p)$$

We call this equation the consumption equation. Combining the capital accumulation equation with the consumption equation, and noting that $y = f(k)$ and $\alpha y_p = c$, we have the following system:

$$
\begin{cases}
\dot{k} = f(k) - (n + \delta)k - c \\
\dot{c} = \beta\{\alpha f(k) - c\}
\end{cases} \tag{2}
$$
We call a solution of System 2 a path, and the equilibrium point of System 2 the steady state, which is equivalent to the neoclassical steady state. Although System 2 is seemingly a commonplace model, in the next section we prove that System 2 can generate a novel dynamic phenomenon around the steady state.
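To make System 2 concrete, the following minimal sketch integrates it numerically. It is an illustration under assumptions rather than part of the chapter: the Cobb–Douglas form $f(k) = \sigma k^{\mu}$, every parameter value, and the helper names are hypothetical, and the classical Runge–Kutta scheme is simply one convenient integrator.

```python
import math

# Hypothetical parameter values (not from the chapter).
sigma, mu = 1.0, 0.3            # production function f(k) = sigma * k**mu
alpha, beta = 0.8, 0.1          # propensity to consume, adjustment speed
n, delta = 0.01, 0.05           # population growth rate, depreciation rate

def f(k):
    return sigma * k ** mu

def system2(k, c):
    """Right-hand side of System 2: (dk/dt, dc/dt)."""
    return f(k) - (n + delta) * k - c, beta * (alpha * f(k) - c)

def rk4_step(k, c, dt):
    """One classical fourth-order Runge-Kutta step."""
    dk1, dc1 = system2(k, c)
    dk2, dc2 = system2(k + 0.5 * dt * dk1, c + 0.5 * dt * dc1)
    dk3, dc3 = system2(k + 0.5 * dt * dk2, c + 0.5 * dt * dc2)
    dk4, dc4 = system2(k + dt * dk3, c + dt * dc3)
    return (k + dt * (dk1 + 2 * dk2 + 2 * dk3 + dk4) / 6,
            c + dt * (dc1 + 2 * dc2 + 2 * dc3 + dc4) / 6)

# Steady state: (1 - alpha) f(k*) = (n + delta) k*  and  c* = alpha f(k*).
k_star = (sigma * (1 - alpha) / (n + delta)) ** (1 / (1 - mu))
c_star = alpha * f(k_star)

def distance(k, c):
    return math.hypot(k - k_star, c - c_star)

k, c = 1.01 * k_star, c_star    # small displacement of the capital stock
d0 = distance(k, c)
for _ in range(4000):           # integrate up to t = 400 with dt = 0.1
    k, c = rk4_step(k, c, 0.1)
d_end = distance(k, c)

print(f"steady state: k* = {k_star:.4f}, c* = {c_star:.4f}")
print(f"distance from the steady state: {d0:.2e} -> {d_end:.2e}")
```

For these particular values the path started close to the steady state returns to it. This only illustrates local behavior for one parameter set and says nothing about paths that start far from the steady state.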
3 Growth Dynamics of the Model
To analyze the dynamic behavior of System 2, we employ the following standard assumptions on the production function:

Assumption 1: $\lim_{k \to 0} f'(k) = \infty$, $\lim_{k \to \infty} f'(k) = 0$. ■

Assumption 2: $f'(k) > 0$ and $f''(k) < 0$ for any $k > 0$. ■

We start with the next result on the existence and uniqueness of the steady state.

Theorem 1: Suppose Assumptions 1 and 2 are satisfied. Then System 2 has a unique steady state, say $(k^*, c^*)$, in the positive quadrant. ■

Proof: See Appendix. ■

For simplicity, we denote the elasticity of the production function by

$$\varepsilon(k) = k f'(k)/f(k)$$

Moreover, to use the Hopf bifurcation theorem, we introduce the following condition on the elasticity:

Assumption 3: $1 - \alpha < \varepsilon(k^*) < 1$. ■

For example, let $f(k) = \sigma k^{\mu}$. Then $\varepsilon(k) = \mu$ for every $k$, so Assumption 3 reduces to $1 - \alpha < \mu < 1$. Thus, Assumption 3 is plausible. Before describing the phase diagram of System 2, we establish the following lemma.

Lemma 1: Under Assumption 3, the slopes of the curves $\dot{k} = 0$ and $\dot{c} = 0$ are positive in the neighborhood of the steady state. ■

Proof: See Appendix. ■

Noting Lemma 1, the stylized phase diagram of System 2 is shown in Fig. 1. We will now prove the emergence of a Hopf bifurcation in System 2. As a bifurcation parameter, we choose $\beta$. We start with the next lemma.

Lemma 2: Suppose Assumptions 1–3 are all satisfied. Define

$$\beta^{\#} = (n + \delta)\{\varepsilon(k^*) + \alpha - 1\}/(1 - \alpha)$$
Fig. 1. Stylized phase diagram of System 2 in the neighborhood of the steady state
Then, $\beta^{\#} > 0$, and any eigenvalue of the Jacobian matrix evaluated at $\beta = \beta^{\#}$ is purely imaginary. ■

Proof: It follows directly from Assumption 3 that $\beta^{\#} > 0$. For the remainder of the proof, see Appendix. ■

The emergence of a Hopf bifurcation follows immediately from this lemma. For the Hopf bifurcation, see Guckenheimer and Holmes (1983, Sect. 3.4).

Theorem 2: Suppose Assumptions 1–3 are all satisfied. If $\beta > \beta^{\#}$ (or $\beta < \beta^{\#}$), then the steady state of System 2 is stable (or unstable). Moreover, System 2 generates a Hopf bifurcation at the value $\beta = \beta^{\#}$. ■

Proof: See Appendix. ■
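Lemma 2 and Theorem 2 can be illustrated numerically. The sketch below is a hedged example, not the authors' computation: it assumes the Cobb–Douglas form $f(k) = \sigma k^{\mu}$ (so that $\varepsilon(k^*) = \mu$) with hypothetical parameter values, builds the Jacobian of System 2 at the steady state, and reports its eigenvalues below, at, and above $\beta^{\#}$.

```python
import cmath

# Hypothetical parameter values (not from the chapter); they satisfy 1 - alpha < mu < 1.
sigma, mu = 1.0, 0.3
alpha = 0.8
n, delta = 0.01, 0.05

k_star = (sigma * (1 - alpha) / (n + delta)) ** (1 / (1 - mu))
fp = sigma * mu * k_star ** (mu - 1)            # f'(k*)
beta_sharp = (n + delta) * (mu + alpha - 1) / (1 - alpha)

def eigenvalues(beta):
    """Eigenvalues of J(beta) = [[f'(k*) - (n + delta), -1], [alpha*beta*f'(k*), -beta]]."""
    trace = fp - (n + delta) - beta
    det = -beta * (fp - (n + delta)) + alpha * beta * fp
    root = cmath.sqrt(trace ** 2 - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

for beta in (0.5 * beta_sharp, beta_sharp, 2.0 * beta_sharp):
    lam1, lam2 = eigenvalues(beta)
    print(f"beta = {beta:.4f}: eigenvalues {lam1:.5f}, {lam2:.5f}")
```

With these assumed values the real part of the eigenvalues is positive below $\beta^{\#}$, zero (up to rounding) at $\beta^{\#}$, and negative above it, in line with the statement of Theorem 2.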
We next consider the stability of the Hopf cycle of Theorem 2. When a stable cycle emerges, the cycle attracts any path with the initial point near the cycle. On the other hand, when an unstable cycle emerges, any path inside the unstable cycle converges to the steady state, and any path outside the unstable cycle moves away from the cycle and, therefore, from the steady state (Fig. 2). Benhabib and Miyao (1981) stated that such a situation may be related to the notion of corridor stability by Leijonhufvud (1973). (See also Gabisch and Lorenz (1987, Sect. 6.1), Owase (1987, Sect. 20), Rosser (1991,
Fig. 2. Corridor stability
Sect. 6.1), Owase (1991), and Lorenz (1993, Sect. 3.2).) Our concern here is what condition guarantees an unstable Hopf cycle in System 2. One remark should be made here. System 2 satisfies the capital accumulation equation, so that aggregate demand equals aggregate supply. In this sense, the instability outside the corridor of System 2 is not related to the instability of market equilibrium, but to the instability of the steady state. As stated in the Introduction, Leijonhufvud’s notion of corridor stability is closely related to market equilibrium, and therefore is different from ours. However, his notion can easily be modified to a notion concerning the steady state.

In general, it is not easy to determine the stability of a Hopf cycle. The reason is that the condition guaranteeing the stability of a Hopf cycle depends on up to third-order derivatives of a vector field. In many cases, such a condition is too complicated to allow any economic meaning. We can, however, see that the Hopf cycle in Theorem 2 is unstable under a standard production function and standard assumptions. In fact, we have:

Theorem 3: Suppose Assumptions 1–3 are all satisfied. In addition, we assume that $f'''(k^*) > 0$. Then there is a set included in $\{\beta : \beta > \beta^{\#}\}$, say $\Omega$, such that for any $\beta \in \Omega$ System 2 yields an unstable Hopf cycle. ■

Proof: See Appendix. ■

Example 1: As typical production functions that satisfy $f'''(k^*) > 0$, we have $f(k) = \sigma k^{\mu}$ and $f(k) = \sigma \log(k + 1)$, where $0 < \sigma$ and $0 < \mu < 1$. Indeed, $f'''(k) = \sigma\mu(\mu - 1)(\mu - 2)k^{\mu - 3} > 0$ in the first case and $f'''(k) = 2\sigma/(k + 1)^3 > 0$ in the second. ■

Theorem 3 is important in the sense that it proves the emergence of corridor stability. In our model, the corridor of stability is the Hopf cycle, so that any path inside the unstable Hopf cycle converges to the steady state, and any path outside the unstable Hopf cycle diverges. Accordingly, a sort
of hysteresis emerges near the unstable Hopf cycle, that is, a slight difference in starting points yields a drastic historical difference. Theorem 3 shows that, under a standard production function and standard assumptions, such an interesting phenomenon can emerge in a modified version of the S–S model that is seemingly commonplace.

Remark 3: There exist various notions concerning the idea of corridor stability. The notion of the corridor of stability mathematically implies a sort of basin boundary (see Robinson 1995, Sect. 5.5), and the inside of the corridor is a basin of attraction. Since the corridor of stability in our model consists of an unstable Hopf cycle, the corridor is also called a stability threshold (see Medio and Lines 2001, Sect. 5.3). Moreover, hysteresis yielded by corridor stability is closely related to the notion of the bifurcation of behavior in the biological context (see Smale and Hirsch 1974, Sect. 12.3). Each of these terminologies, from its own viewpoint, grasps the important features of the dynamics concerning corridor stability. ■

In the case where hysteresis emerges, a policy that is intended to cause small economic changes may yield drastic economic changes, that is, may lead us to outcomes that are contrary to the intention of policy makers. Thus, in the case where corridor stability emerges, policy makers need to be well prepared for unintended cases.

The consumption equation constructed above can be extended slightly. Below we discuss it briefly. Consider the smooth function $Q(x)$ that satisfies

$$Q(0) = 0 \tag{3}$$

$$Q'(x) > 0 \text{ for any } x > 0 \tag{4}$$

$$Q(x) \text{ is linear in a neighborhood of } x = 0 \tag{5}$$

Moreover, we consider the following adjustment equation of EPCP income:

$$\dot{y}_p = Q(y - y_p) \tag{6}$$

From Eq. 6 and $c = \alpha y_p$, we obtain

$$\dot{c} = \alpha Q\big(f(k) - c/\alpha\big) = \alpha Q\big((\alpha f(k) - c)/\alpha\big) \equiv F(\alpha f(k) - c) \tag{7}$$

Equation 7 gives a generalization of the consumption equation above. Moreover, the local linearity condition Eq. 5 shows that the F-function is also linear in a neighborhood of $x = 0$. Then the same argument is true of the more general growth model with the generalized consumption Eq. 7. The local linearity condition Eq. 5 yields $F''(0) = 0$, $F'''(0) = 0$. Moreover we have
$$F'(0) = Q'(0) = \beta$$

We can now easily see that the same results hold for the general growth model because, as far as the F-function is concerned, only the value of $\beta = F'(0)$ is required to calculate the stability index of System 2. It should be noted here that if the local linearity condition Eq. 5 is removed, complicated conditions on the derivatives of the Q-function or the F-function will be necessary to determine the sign of the stability index.

Remark 4: In general, it is not always clear what type of distributed lag equation corresponds to Eq. 6, but if the Q-function is piecewise-linear, Eq. 6 with the Q-function follows from a simple switch corresponding to the values of $(y - y_p)$. ■

The above generalization of the consumption equation has an important economic implication, which we explain briefly here. Since the bifurcation value is given by $\beta^{\#} = (n + \delta)\{\varepsilon(k^*) + \alpha - 1\}/(1 - \alpha)$ and $n + \delta$ is very small, the bifurcation value may be too small. Therefore, the argument in such a parameter domain does not seem to be plausible. However, even if the derivative $Q'(x)$ is not small for large values of $|x|$, the value of $\beta = Q'(0)$, which is closely related to the emergence of the Hopf bifurcation, can be sufficiently small. The reason is that in the case where the difference between actual income and EPCP income is sufficiently small, a sufficiently small adjustment coefficient can be plausible. Thus, this problem with the bifurcation value $\beta^{\#}$ can easily be mitigated by incorporating the nonlinear adjustment function of EPCP income above into our growth model. Allowing for this point, we assumed, for simplicity, the linear adjustment function.

Finally we make one more remark. We paid attention to the possible emergence of an unstable Hopf cycle. If Condition (5) is not satisfied (that is, the Q-function is not locally linear), then there exists a possibility that the Hopf cycle of Theorem 2 is stable.
From an economic viewpoint, however, it will be difficult to derive a significant condition for the emergence of a stable Hopf cycle. Moreover, it seems that the plausible parameter domain in which stable Hopf cycles emerge is extremely narrow. However, we do not discuss that here.
4 Nearly Constant Propensity to Consume
By assuming a completely proportional relation between consumption and income, Solow (1956) and Swan (1956) succeeded in constructing a growth model in which any path converges monotonically to the steady state. Such a proportional relation between consumption and income (not permanent income) is stably observed in long-term time-series. For this point, see
Kuznets (1942) and also Romer (2001, Sect. 7.1). On the other hand, we assumed $c = \alpha y_p$, i.e., a proportional relation between consumption and permanent income (not income). Since $c = \alpha y_p \neq \alpha y$ on any path other than the steady state, the dynamic behavior yielded under such an assumption does not guarantee a completely proportional relation between consumption and income. Moreover, System 2 possesses a corridor that consists of an unstable Hopf cycle. Paths inside the cycle fluctuate around the steady state while converging slowly to it. Can our modified growth model with such fluctuating dynamics explain the important observation by Kuznets? Below, we make clear that a nearly constant propensity to consume, i.e., a nearly proportional relation between consumption and income, is compatible with cyclical but convergent fluctuations around the steady state. That is, any fluctuating path maintains a nearly constant propensity to consume.

Although we described Fig. 1 as being an easy-to-see phase diagram, a more exact and typical phase diagram is shown in Fig. 3. As Fig. 3 shows, typical paths other than the steady state fluctuate around the steady state. Such typical fluctuating paths cling to the graph of $c = \alpha f(k)$. Hence, for any $t > 0$, $c(t)/y(t) \approx \alpha f(k(t))/y(t) = \alpha$ on the fluctuating paths, so that $\alpha \approx c(t)/y(t) = C(t)/Y(t)$, where $Y$ is income and $C$ is consumption (both aggregate, not per capita). This shows that although typical paths of System 2 fluctuate, a nearly proportional relation between consumption and income is stably observed on these paths. Figure 4 plots $C(t)/Y(t)$. Thus, System 2 explains why a nearly proportional relation is compatible with fluctuations around the steady state.
Fig. 3. Phase diagram of System 2 in the neighborhood of the steady state
Fig. 4. A nearly constant propensity to consume
5 Effects of Parameters on Aggregate and Stability
In this section, to clarify our argument, we consider the production function of the form $\sigma k^{\mu}$ ($0 < \sigma$, $0 < \mu < 1$). In the same way as the S–S model, the growth rate of the steady state in System 2 equals the growth rate of labor. Therefore, the growth rate of the steady state does not depend on any other parameter of System 2, but the aggregate of the steady state does depend on the parameters of System 2. Moreover, we will see that the stability of the steady state depends on the parameters of System 2.

First, we consider the effects of the parameters on the aggregate of the steady state. We can easily see that the capital stock of the steady state is given by

$$K = kL = L_0 \exp(nt)\{\sigma(1 - \alpha)/(n + \delta)\}^{1/(1 - \mu)}$$

where $K$ is the capital stock (not per capita) and $L$ is the population. Thus we obtain the same results as the S–S model, that is, an increase in $\sigma$ or $\mu$ leads to an increase in the aggregate of the steady state. However, an increase in $\alpha$ or $n$ leads to a decrease in the aggregate of the steady state.

Next we consider the effects on the stability of the steady state. Denote the Jacobian matrix of System 2 by $J$, and the trace of $J$ by $\mathrm{Tr}J$. We here consider the case where the eigenvalues of $J$ are complex conjugates with negative real parts. In this case, any path near the steady state of System 2 approaches the steady state exponentially. For the notion of an exponential approach, see Smale and Hirsch (1974, Sect. 7.1). Under these assumptions, the convergence speed depends on the magnitude of $\mathrm{Tr}J$, which is given by

$$\mathrm{Tr}J = (n + \delta)(\mu + \alpha - 1)/(1 - \alpha) - \beta$$
See the proof of Lemma 2 in the Appendix. Any change in $\sigma$ has no effect on the stability. An increase in $\beta$ makes the steady state more stable, but an increase in $\mu$ or $n$ makes it more unstable. Since

$$d\,\mathrm{Tr}J/d\alpha = \mu(n + \delta)/(1 - \alpha)^2 > 0$$

an increase in $\alpha$ also makes the steady state more unstable.
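The comparative statics of this section can be tabulated with a short script. The sketch below is illustrative only: the baseline values and the helper names `k_star`, `trace_J`, and `bumped` are hypothetical, and each parameter is raised by a small step to read off the sign of the response of the per-capita steady state $k^* = \{\sigma(1 - \alpha)/(n + \delta)\}^{1/(1 - \mu)}$ and of $\mathrm{Tr}J$.

```python
# Hypothetical baseline parameter values (not from the chapter).
BASE = dict(sigma=1.0, mu=0.3, alpha=0.8, beta=0.1, n=0.01, delta=0.05)

def k_star(sigma, mu, alpha, beta, n, delta):
    """Per-capita steady-state capital for f(k) = sigma * k**mu (beta plays no role)."""
    return (sigma * (1 - alpha) / (n + delta)) ** (1 / (1 - mu))

def trace_J(sigma, mu, alpha, beta, n, delta):
    """Tr J = (n + delta)(mu + alpha - 1)/(1 - alpha) - beta (sigma plays no role)."""
    return (n + delta) * (mu + alpha - 1) / (1 - alpha) - beta

def bumped(name, step=1e-6):
    """Copy of BASE with one parameter changed by `step`."""
    params = dict(BASE)
    params[name] += step
    return params

for name in BASE:
    dk = k_star(**bumped(name)) - k_star(**BASE)
    dtr = trace_J(**bumped(name)) - trace_J(**BASE)
    print(f"raise {name:5s}: change in k* {dk:+.2e}, change in Tr J {dtr:+.2e}")

# Central finite difference of Tr J with respect to alpha versus the closed form.
h = 1e-6
slope = (trace_J(**bumped("alpha", h)) - trace_J(**bumped("alpha", -h))) / (2 * h)
exact = BASE["mu"] * (BASE["n"] + BASE["delta"]) / (1 - BASE["alpha"]) ** 2
print(f"dTrJ/dalpha: finite difference {slope:.6f}, closed form {exact:.6f}")
```

The signs printed for $\sigma$, $\alpha$, $n$, and $\beta$ follow directly from the formulas above. The positive response of $k^*$ to $\mu$ is specific to baselines with $\sigma(1 - \alpha)/(n + \delta) > 1$, as is the case for the values assumed here.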
6 Conclusions and Final Remarks
In this chapter, combining the permanent income hypothesis with the capital accumulation equation, we constructed a growth model that gives a modified version of the well-known neoclassical growth model (S–S model). The modified version has a unique steady state, which is the same as that of the S–S model. Using the Hopf bifurcation theorem, we showed that under standard production functions and standard conditions, the modified version can generate unstable periodic paths around the steady state. In other words, the model can generate corridor stability, that is, there exists a neighborhood of the steady state such that the neoclassical view of the economy collapses outside the neighborhood. This also implies that hysteresis occurs near the boundary of the neighborhood, that is, a slight difference in starting points can yield a drastic historical difference. Thus, we see that in the case where corridor stability emerges, even an economic policy intended to result in small economic changes may cause drastic economic changes.

The S–S model assumed a completely proportional relation between consumption and income. Consequently, the S–S model is globally stable. The actual economy, however, yields small fluctuations around the steady state while maintaining a nearly proportional relation. The modified version explains why a nearly proportional relation is compatible with such fluctuations around the steady state.

By almost the same argument as in the S–S model, we can easily see that the growth rate of the steady state equals the growth rate of the population, so that the growth rate does not depend on parameters other than the growth rate of the population. On the other hand, for the effects of parameters on the aggregate of the steady state, we obtained the same results as the S–S model. Moreover, for the effects of parameters on the stability of the steady state, we proved that any change in $\sigma$ has no effect on the stability. An increase in $\beta$ makes the steady state more stable, but an increase in $\alpha$, $\mu$, or $n$ makes it more unstable.

Our results on the emergence of a Hopf cycle were obtained by the so-called local bifurcation analysis. For the local bifurcation analysis, see, for example, Guckenheimer and Holmes (1983, chap. 3). We cannot say anything
about whether or not a similar result on the emergence of corridor stability can be obtained by global analysis or phase analysis. Other mathematical or geometric methods will be required to attack such a problem. Global analysis of our growth model is left for future research.

Acknowledgment. We are grateful to participants in the Rokko Forum (May 9, 2004) at Kobe University and to participants in the Chuo Meeting on Economics of Time and Space 2005 (August 29–30, 2005) at Chuo University for helpful and valuable comments. Particular thanks go to T. Asada, T. Hagiwara, T. Haruyama, T. Nakamura, and T. Nakatani. Remaining errors are ours.
Appendix
The appendix gives the proofs of Lemmas 1 and 2 and Theorems 1–3.

Proof of Theorem 1: From System 2, we see that any steady state of System 2 is given as a solution of the equation $c = \alpha f(k) = f(k) - (n + \delta)k$. Now define

$$g(k) = (1 - \alpha)f(k) - (n + \delta)k$$

If System 2 possesses two or more steady states in the positive quadrant, then there are at least two positive values of $k$ that satisfy $g(k) = 0$. Since $g(0) = 0$, it follows from the mean value theorem that there are at least two positive numbers that satisfy $g'(k) = 0$. Let $k_1$ and $k_2$ be such points. Then

$$g'(k_1) = g'(k_2) = 0 \tag{A1}$$

Without loss of generality, we assume that $k_1 < k_2$. Assumption 2 yields

$$g''(k) = (1 - \alpha)f''(k) < 0 \text{ for any } k > 0$$

Therefore, the function $g'(k)$ is strictly decreasing, so that $g'(k_1) > g'(k_2)$. This contradicts Eq. A1. The contradiction proves Theorem 1. ■

Proof of Lemma 1: From the definition of $k^*$ and System 2, we have $(1 - \alpha)f(k^*) = (n + \delta)k^*$, so that

$$f'(k^*) = \{k^* f'(k^*)/f(k^*)\}\, f(k^*)/k^* = \varepsilon(k^*)(n + \delta)/(1 - \alpha) \tag{A2}$$

Therefore, by Assumption 3, we have

$$f'(k^*) - (n + \delta) = \varepsilon(k^*)(n + \delta)/(1 - \alpha) - (n + \delta) = \{\varepsilon(k^*) - (1 - \alpha)\}(n + \delta)/(1 - \alpha) > 0 \tag{A3}$$

The curve $\dot{k} = 0$ is the graph of $c = f(k) - (n + \delta)k$, whose slope is $f'(k) - (n + \delta)$. Equation A3 and the continuity of $f'$ show that this slope is positive in the neighborhood of the steady state. On the other hand, the curve $\dot{c} = 0$ is the graph of $c = \alpha f(k)$, whose slope $\alpha f'(k)$ is positive in the domain $k > 0$. This completes the proof. ■

Proof of Lemma 2: The former part of Lemma 2 is a direct consequence of Assumption 3. So we now prove the latter part. Denote the determinant of a matrix by $\mathrm{Det}$, and the Jacobian matrix of System 2 evaluated at the steady state by $J(\beta)$. Then Eq. A2 yields

$$\mathrm{Tr}J(\beta) = f'(k^*) - (n + \delta) - \beta = (n + \delta)\varepsilon(k^*)/(1 - \alpha) - (n + \delta) - \beta = (n + \delta)\{\varepsilon(k^*) + \alpha - 1\}/(1 - \alpha) - \beta \tag{A4}$$

$$\mathrm{Det}J(\beta) = -\beta\{(1 - \alpha)f'(k^*) - (n + \delta)\} = -\beta(n + \delta)\{\varepsilon(k^*) - 1\} > 0 \tag{A5}$$

where the inequality follows from Assumption 3. Since Eq. A4 yields $\mathrm{Tr}J(\beta^{\#}) = 0$, the proof follows directly from Eq. A5. ■

Proof of Theorem 2: Denote the real part of the eigenvalues of $J(\beta)$ by $R(\beta)$. Since in Lemma 2 we showed that the eigenvalues of $J(\beta^{\#})$ are purely imaginary, the eigenvalues of $J(\beta)$ are complex conjugates in a neighborhood of $\beta^{\#}$. Hence, Eq. A4 yields that in the neighborhood of $\beta^{\#}$

$$R(\beta) = \mathrm{Tr}J(\beta)/2 = \big[(n + \delta)\{\varepsilon(k^*) + \alpha - 1\}/(1 - \alpha) - \beta\big]/2$$

Therefore, $R(\beta^{\#}) = 0$ and $R'(\beta^{\#}) = -1/2 < 0$. The proof follows directly from the Hopf bifurcation theorem. ■

Proof of Theorem 3: In Theorem 2, we proved the existence of a Hopf cycle. Theorem 3 is now proved by calculating the stability index. For the stability index, see Guckenheimer and Holmes (1983, pp. 151–153). Define

$$F(c, k) = \beta\{\alpha f(k) - c\}, \quad G(c, k) = f(k) - (n + \delta)k - c$$

We see that

$$F_{cc} = F_{ck} = F_{ccc} = F_{cck} = F_{ckk} = G_{ck} = G_{cc} = G_{cck} = G_{ccc} = 0$$

$$F_{kk} = \alpha\beta f''(k^*), \quad F_{kkk} = \alpha\beta f'''(k^*), \quad G_{kk} = f''(k^*), \quad G_{kkk} = f'''(k^*)$$

Using these equations, the stability index evaluated at the steady state is calculated as follows:

$$r = (F_{ccc} + F_{ckk} + G_{cck} + G_{kkk})/16 + \{F_{ck}(F_{cc} + F_{kk}) - G_{ck}(G_{cc} + G_{kk}) - F_{cc}G_{cc} + F_{kk}G_{kk}\}/(16\xi)$$

where $\xi > 0$. For the parameter $\xi$, see Guckenheimer and Holmes (1983, pp. 151–153). The definition of $\xi$ is not, however, required for the proof of Theorem 3. The assumption that $f'''(k^*) > 0$ yields

$$r = G_{kkk}/16 + F_{kk}G_{kk}/(16\xi) = f'''(k^*)/16 + \alpha\beta\{f''(k^*)\}^2/(16\xi) > 0$$

Thus, the stability index is positive. This implies that the Hopf cycle in Theorem 2 is unstable. For this point, see Guckenheimer and Holmes (1983, pp. 151–153). ■
References
Barro RJ, Sala-i-Martin X (1995) Economic growth. McGraw-Hill, New York
Benhabib J, Miyao T (1981) Some new results on the dynamics of the generalized Tobin model. Int Econ Rev 22:589–596
Domar ED (1946) Capital expansion, rate of growth, and employment. Econometrica 14:137–147
Friedman M (1957) A theory of the consumption function. Princeton University Press, Princeton
Gabisch G, Lorenz HW (1987) Business cycle theory: survey of methods and concepts. Springer, Berlin–Heidelberg–New York
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, New York–Berlin–Heidelberg
Harrod RF (1939) An essay in dynamic theory. Econ J 49:14–33
Kuznets S (1942) Uses of national income in peace and war. Occasional Paper No. 6, NBER, New York
Leijonhufvud A (1973) Effective demand failures. Swed J Econ 75:27–48
Lorenz HW (1993) Nonlinear dynamical economics and chaotic motion. Springer, Berlin–Heidelberg–New York
Mankiw NG, Romer D, Weil DN (1992) A contribution to the empirics of economic growth. Q J Econ 107:407–437
Medio A, Lines M (2001) Nonlinear dynamics: a primer. Cambridge University Press, Cambridge
Owase T (1987) Dynamical system theory in economics. Zeimukeirikyoukai, Tokyo
Owase T (1991) Nonlinear dynamical systems and economic fluctuations: a brief historical survey. IEICE Trans E74:1393–1400
Robinson C (1995) Dynamical systems: stability, symbolic dynamics, and chaos. CRC Press, Boca Raton
Romer D (2001) Advanced macroeconomics. 2nd ed. McGraw-Hill, New York
Rosser JB Jr (1991) From catastrophe to chaos: a general theory of economic discontinuities. Kluwer, Boston/Dordrecht/London
Smale S, Hirsch MW (1974) Differential equations, dynamical systems, and linear algebra. Academic Press, New York
Solow RM (1956) A contribution to the theory of economic growth. Q J Econ 70:65–94
Swan TW (1956) Economic growth and capital accumulation. Econ Rec 32:334–361
8. Time-Delayed Dynamic Model of Renewable Resource and Population

Akio Matsumoto¹, Mami Suzuki², and Yasuhisa Saito³
Summary. This chapter reconsiders the “Ricardo–Malthus” dynamic model of population and renewable resource developed by Brander and Taylor. To this end, it sheds light on a delay in production, which is a key feature of long-run evolution in a preindustrial economic society. It is clear that the delay has a considerable effect on the long-run dynamics of natural resources and population. This notwithstanding, much previous work relates to a case in which production is instantaneous. As such, the purpose of this chapter is to investigate the effect caused by the delay in production. Our analysis shows that there is a critical value of the delay with which a delayed version of an otherwise stable system becomes unstable. In addition, it shows numerically that the critical value is negatively related to the size of the carrying capacity and the exogenous net birth rate.

Key words. Delayed differential equation, Easter Island, Economic dynamics model, Environmental degradation, Population growth
¹ Department of Economics, Chuo University, Tokyo, Japan
² Department of Management, Aichi Gakusen University, Aichi, Japan
³ Department of Systems Engineering, Shizuoka University, Shizuoka, Japan

1 Introduction

This study reconstructs an economic dynamic model for a small island based on the Easter Island study by Brander and Taylor (1998) (BT henceforth). Introducing a delay in production into the BT model, we deal with analytical as well as numerical representations of the evolution of a small island economy, evolution which is characterized by a two-fold process of transformation: environmental degradation and population growth. Its main purpose is to investigate the effects caused by the delay in production on an adjustment process in population and the stock of natural resources.

Archeological studies have suggested that many Pacific islands followed similar evolutionary patterns of natural resources and population dynamics, that is, rapid population growth, resource degradation, economic decline, and then population collapse. BT reconsider the archeological and anthropological evidence from Easter Island as an example from an economic point of view. In particular, BT present a general equilibrium model of renewable resources and population dynamics, and seek to explain the rise and fall of Easter Island for 1400 years between the 4th century and the middle of the 18th century. They bring to light the economic conditions under which a small island economy can survive or perish. Their findings indicate that an economic model linking resources and population dynamics may explain not only the sources of past historical evolution discovered in these small islands, but also the possibility of sustainable growth for our world economy, in which a rapidly increasing population and a rapidly degrading environment become serious problems.

The analysis in the BT model has been extended in various directions. Dalton and Coats (2000) examine the impact of market institutions and different property-rights structures. Reuveny and Decker (2000) consider numerically how technological progress and population management reform affected the long-run dynamics of Easter Island. Matsumoto (2002) reconstructs the BT continuous model in discrete steps, and shows that the modified model can generate various dynamics ranging from simple dynamics to complex dynamics involving chaos. In the existing literature, however, not much has been revealed with respect to the “history” of Easter Island.
It has been believed that a small group of Polynesians arrived on the island around AD 400, deforestation occurred around AD 1000, most of the statues were carved during AD 1000–1400, and so forth.¹ Based on this “conventional wisdom,” BT as well as other researchers attempt to reproduce a dynamic pattern of natural resources and population. However, the “wisdom” is still one of several possible hypotheses and is not yet fully confirmed. In particular, according to Intoh (2000), recent reconsiderations of archeological evidence on the island imply indeterminacy in the arrival time of the Polynesians. It can only be an estimate that a small group settled between AD 410 and 1270. This new finding is inconsistent with the traditional wisdom. In other words, we may have a different history (as well as a different evolutionary pattern) of Easter Island, even though the available historical evidence is the same. It is thus imperative to construct a model for small islands that can generate various patterns of dynamics in order to deal with such ambiguous characteristics of archeological evidence.

¹ See Sect. 1 of Brander and Taylor (1998) for more details.
In this study, we extend the BT model to include a delay or lag in production for the following reasons. Agricultural production may play an essential part in economic activity in a preindustrial economic society. It is well known that an important characteristic of such agricultural production is the significant time lag between the time at which producers make their decisions to plant seeds in the fields and the time when they actually gather crops from the fields. It is thus natural to raise the question: how did the delayed production affect the evolutionary pattern of a small island economy?

This study is organized as follows. Section 2 constructs a simple economic dynamics model of a small island based on BT’s Easter Island analysis. Section 3 analytically as well as numerically examines the effects caused by a delay in production on the evolution of natural resources and the population. Section 4 provides a summary and concluding remarks.
2 Basic Model for Small Islands

Because our analysis is based on BT’s dynamic model of renewable resources and population, we recapitulate the basic part of their model in this section (see BT’s paper for more details). The model describes the dynamics of an economy with two types of goods and three types of economic agent (two producers and one consumer). The harvest of the renewable resource is called the agricultural good, and some other good is called the manufactured good.

The model functions as follows. At time $t$, the stock of natural resources $S(t)$ and the size of the population $L(t)$ are given.² Producers determine their demands for labor and supplies of goods so as to maximize their profits. A manufacturing producer supplies the manufactured good, produced with constant returns to scale using only labor. Since, by choice of units, one unit of the manufactured good can be produced by one unit of labor, the total supply of the manufactured good $M^S$ is determined by the demand for labor $L_M^D$:

$$ M^S = L_M^D \tag{1} $$

An agricultural producer supplies the agricultural good, harvested according to the Schaefer harvesting production function

$$ H^S = \alpha S L_H^D \tag{2} $$

where $H^S$ is the harvest supplied, $\alpha$ is a positive constant indicating the harvesting efficiency, and $L_H^D$ is the labor demand in resource harvesting. A representative consumer is endowed with one unit of labor, and is assumed to have a Cobb–Douglas utility function

$$ u(h, m) = h^{\beta} m^{1-\beta} \tag{3} $$

where $h$ and $m$ are individual consumptions of the agricultural good and the manufactured good, respectively, and $\beta \in (0, 1)$ is a positive constant reflecting the preference for the agricultural good. Each consumer supplies one unit of labor and demands both goods so as to maximize utility subject to the budget constraint $ph + m = w$, where $p$ is the price of the agricultural good, $w$ is the wage rate, and the price of the manufactured good is normalized to 1, as it is treated as the numeraire. The usual utility maximization procedure yields the optimal demands for both goods, $h^d = \beta w/p$ and $m^d = (1 - \beta)w$, where the superscript $d$ indicates individual demand. The total population is $L$, so that the total demands are

$$ H^D = L h^d \quad \text{and} \quad M^D = L m^d \tag{4} $$

² For the time being, t is suppressed for notational simplicity.
Prices are adjusted to establish temporary equilibrium in each of three markets: the agricultural goods market ($H^D = H^S$), the manufactured goods market ($M^D = M^S$), and the labor market ($L_M^D + L_H^D = L$), in which the labor force is assumed to be equal to the population. It can be verified that a fixed proportion of the total population is employed in the agricultural sector, $L_H^D = \beta L$, and thus the resource harvest is $H^S = \alpha\beta S L$ at the temporary equilibrium state. After finishing transactions in each market, new values of natural resources and the size of the population are determined at the next instant of time. With these new values, the process repeats until a stationary state is attained.

The dynamics of temporary equilibrium are described as follows. A change in the stock at time $t$ is determined by the natural growth rate $G(S)$ minus the harvest rate:

$$ \frac{dS(t)}{dt} = G(S(t)) - H^S(t) \tag{5} $$
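The equilibrium employment share stated above can be checked with a short calculation. The sketch below assumes, as is usual under constant returns and profit maximization, that both producers earn zero profit, so that $w = 1$ in manufacturing and $p\alpha S = w$ in harvesting; these two conditions are assumptions made here for illustration and are not written out in the text.

```latex
\begin{aligned}
&\text{Manufacturing, zero profit at price 1:} && w = 1 \\
&\text{Harvesting, zero profit:} && p\,\alpha S L_H^D = w L_H^D
   \;\Rightarrow\; p = \frac{w}{\alpha S} \\
&\text{Total demand for the agricultural good:} && H^D = L h^d = \frac{\beta w L}{p} = \alpha\beta S L \\
&\text{Market clearing } H^D = H^S = \alpha S L_H^D: && L_H^D = \beta L, \qquad L_M^D = (1-\beta) L
\end{aligned}
```

Under these assumptions the wage cancels out of the demand for the agricultural good, which is why the harvest depends only on $\alpha$, $\beta$, $S$, and $L$.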
For analytical simplicity, the logistic functional form for $G$ is assumed, $G(S) = r(1 - S/K)S$, where $K$ is the maximum possible size of the resource stock, $r$ is the intrinsic growth rate of the natural resource, and both are positive constants. A change of population depends on the difference between an underlying birth rate and death rate, where $b$ denotes the exogenously determined birth rate and $d$ the exogenously determined death rate. It is assumed that the net rate, denoted as $c = b - d$, is negative. Following the formulation of Malthusian population dynamics, it is further assumed that the endogenously determined birth rate depends on economic activities, that is, higher per capita consumption of the agricultural good increases fertility and/or decreases mortality. Let $\phi H^S/L$ be a fertility function, where $\phi$ is a positive constant. Then the population growth rate is

$$ \frac{1}{L(t)}\frac{dL(t)}{dt} = c + \phi\,\frac{H^S(t)}{L(t)} \tag{6} $$

where the first term is the exogenous net birth rate, and the second is the endogenous birth rate. Substituting the logistic function into the natural growth rate and the optimal harvest into the fertility function yields the dynamic process of the natural resources and the population:

$$ \begin{cases} \dfrac{dS(t)}{dt} = \left( r\left(1 - \dfrac{S(t)}{K}\right) - \alpha\beta L(t) \right) S(t) \\[2ex] \dfrac{dL(t)}{dt} = \left( c + \alpha\beta\phi S(t) \right) L(t) \end{cases} \tag{7} $$
This is a two-dimensional dynamic system of differential equations, and is a variant of the Lotka–Volterra predator–prey model in which the human is the predator and the resource stock is the prey. The system has a steady state if $dS/dt = 0$ and $dL/dt = 0$ hold. We can solve the last simultaneous system for stocks of the natural resource and the population size.³ The coordinates of an interior solution are

$$ S^e = -\frac{c}{\alpha\beta\phi} \quad \text{and} \quad L^e = \frac{r}{\alpha\beta}\left(1 + \frac{c}{\alpha\beta\phi K}\right) \tag{8} $$
BT find the following result for the stability of the interior steady point.

Theorem 1: (Proposition 4 (iii) of Brander and Taylor) An interior steady state ($L^e > 0$ and $S^e > 0$) is a spiral node with cyclical convergence if

$$ \frac{-cr}{\alpha\beta\phi K} + 4(-c - \alpha\beta\phi K) < 0 $$

and an improper node allowing monotonic convergence if not.

Using the parameterization for the basic model, for which BT provide a detailed justification,⁴ we perform two simulations, which are reproductions of BT’s numerical examples. Figure 1 illustrates the time series for the population size (i.e., a mountain-shaped curve) and resource stocks (i.e., a downward sloping concave–convex curve) when the intrinsic growth rate of the renewable resource is low, $r = 0.04$. Figure 2 illustrates the time series for the same variables when the rate is roughly nine times higher, $r = 0.35$. In these figures, one period represents one decade, and the horizontal axis shows 140 periods. The initial period corresponds to the year AD 400, when the first indigenous people are said to have arrived on the island, and the last period corresponds to some time in the 18th century when the first Europeans arrived on the island, after which substantial changes in the environment mean that the dynamic model in Eq. 7 no longer holds.

[Fig. 1. Slow growth rate of natural resource (r = 0.04). Vertical axis: S, L; horizontal axis: time, 4th to 18th century.]

[Fig. 2. Rapid growth rate of natural resource (r = 0.35). Vertical axis: S, L; horizontal axis: time, 4th to 18th century.]

³ The system can have a corner solution depending on the values of the parameters. However, the corner solution, with L = 0, means an extinction of humans, which is not interesting. Thus, in this study, we focus only on the interior solution.

⁴ BT use the following parameter values for their simulations: $L_0 = 40$ (initial human population), $S_0 = 12\,000$ (initial stock of the renewable resource), $K = 12\,000$ (carrying capacity), $\alpha = 0.00001$ (harvesting efficiency), $\beta = 0.4$ (preference for the agricultural good), $r = 0.04$ (intrinsic growth rate of the renewable resource), $\phi = 4$ (fertility rate), $c = -0.1$ (intrinsic net birth rate).

Figure 1 shows oscillatory dynamics, and appears to replicate what is known of Easter Island history. The island was settled by a Polynesian group about AD 400, and was covered with large palm trees at this time. For the first 300 years, the population size was small, and the resources degraded very little. However, soon after, the population size began to increase rapidly, and the resources began to decline correspondingly. The heyday of Easter Island is supposed to be between 1100 and 1300: the population reached its maximum (about 10 000) and the statue carving was intensive. After reaching its peak of population size, the island entered a period of decline, and then disappeared from history; the palm forest was entirely gone by 1400, carving ceased by 1500, and violent internecine conflict appeared.

According to Theorem 1, the model can generate monotonic behavior for some combinations of parameter values, which may explain the monotonic evolution observed on some Polynesian islands. In the second simulation, we change the growth rate of the natural resources from the lower value ($r = 0.04$) to the higher value ($r = 0.35$), and use the same values of all other parameters as in the first simulation. As illustrated in Figure 2, the simulation shows entirely different dynamics: a smooth adjustment converging to the stationary state. Comparing these two figures, we observe that an island with a slow-growing resource base will exhibit overshooting and collapse, while an island with rapidly growing resources exhibits a near-monotonic adjustment of population and resource stocks towards a steady state.
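The steady state in Eq. 8 and the inequality in Theorem 1 are easy to evaluate numerically. The following is a minimal sketch, not the authors' own code: the function names are hypothetical, the parameter values are those reported in the footnote on BT's parameterization, and the Jacobian entries are derived here from Eq. 7 (trace $-rS^e/K$ and determinant $(\alpha\beta)^2\phi S^e L^e$).

```python
import cmath


def steady_state(r, K, alpha, beta, phi, c):
    """Interior steady state (S_e, L_e) of Eq. 7, as given in Eq. 8."""
    ab = alpha * beta
    S_e = -c / (ab * phi)
    L_e = (r / ab) * (1.0 + c / (ab * phi * K))
    return S_e, L_e


def theorem1_lhs(r, K, alpha, beta, phi, c):
    """Left-hand side of the inequality in Theorem 1 (negative: cyclical convergence)."""
    abpk = alpha * beta * phi * K
    return -c * r / abpk + 4.0 * (-c - abpk)


def jacobian_eigenvalues(r, K, alpha, beta, phi, c):
    """Eigenvalues of the Jacobian of Eq. 7 at the interior steady state."""
    S_e, L_e = steady_state(r, K, alpha, beta, phi, c)
    ab = alpha * beta
    trace = -r * S_e / K                 # the population equation has a zero diagonal entry
    det = ab ** 2 * phi * S_e * L_e
    root = cmath.sqrt(trace ** 2 - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


# Baseline values from BT's parameterization; r is varied as in Figs. 1 and 2.
BASE = dict(K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)

if __name__ == "__main__":
    for r in (0.04, 0.35):
        S_e, L_e = steady_state(r=r, **BASE)
        lhs = theorem1_lhs(r=r, **BASE)
        lam1, lam2 = jacobian_eigenvalues(r=r, **BASE)
        print(f"r={r}: S_e={S_e:.1f}, L_e={L_e:.1f}, "
              f"Theorem 1 LHS={lhs:.4f}, eigenvalues={lam1:.4f}, {lam2:.4f}")
```

Comparing the real and imaginary parts of the eigenvalues for the two values of r gives a rough sense of how strongly the oscillation is damped in each case.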
3 Application Model with Time Delay

In this section, we investigate the hypothetical role that a delay might play in disturbing the monotonic dynamics and further disturbing the oscillatory dynamics. In order to see this disturbing or destabilizing effect, we introduce a delay in production into the basic model, taking account of the fact that it takes some time for agricultural goods to progress from seeding to harvesting:

$$ H(t) = \alpha\beta L(t) S(t - \tau) \tag{9} $$

where $\tau$ is a delay in production. The current harvesting depends on the current amount of labor and the stock of natural resources at time $t - \tau$. Substituting Eq. 9 into the basic system Eq. 7 generates a dynamic system with a delay in production $\tau$.
$$ \begin{cases} \dfrac{dS(t)}{dt} = rS(t)\left(1 - \dfrac{S(t)}{K}\right) - \alpha\beta L(t) S(t - \tau) \\[2ex] \dfrac{dL(t)}{dt} = L(t)\left(b - d + \phi\alpha\beta S(t - \tau)\right) \end{cases} \tag{10} $$
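A delayed system of this form can be integrated with a simple fixed-step scheme that keeps the past values of S in memory. The sketch below is only illustrative: the explicit Euler method, the step size, the constant pre-history S(t) = S0 for t ≤ 0, and the function name are all assumptions made here, since the chapter does not say how its simulations were computed.

```python
def simulate_delayed(r, K, alpha, beta, phi, c, tau, S0, L0, t_end, dt=0.01):
    """Explicit Euler integration of Eq. 10 with c = b - d.

    Assumes a constant pre-history S(t) = S0 for t <= 0.
    Returns the lists of S and L values on the grid 0, dt, 2*dt, ...
    """
    ab = alpha * beta
    lag = int(round(tau / dt))          # delay measured in time steps
    n_steps = int(round(t_end / dt))
    S = [S0]
    L = [L0]
    for k in range(n_steps):
        # Delayed resource stock S(t - tau); fall back on the pre-history early on.
        S_lag = S[k - lag] if k >= lag else S0
        dS = r * S[k] * (1.0 - S[k] / K) - ab * L[k] * S_lag
        dL = L[k] * (c + phi * ab * S_lag)
        S.append(S[k] + dt * dS)
        L.append(L[k] + dt * dL)
    return S, L


if __name__ == "__main__":
    # BT's baseline values; one time unit is one decade. tau = 5.0 is a
    # hypothetical delay chosen only to exercise the code.
    params = dict(r=0.04, K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)
    for tau in (0.0, 5.0):
        S, L = simulate_delayed(tau=tau, S0=12000.0, L0=40.0, t_end=140.0, **params)
        print(f"tau={tau}: peak population {max(L):.0f}, minimum resource stock {min(S):.0f}")
```

With tau = 0 the scheme reduces to an Euler integration of the basic system Eq. 7. A smaller step or a higher-order method would be preferable for quantitative work.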
It can be verified that the delayed system has the same equilibrium point as the basic system. To investigate the stability of the delayed system, we first make a coordinate transformation such that a new system is centered at the equilibrium point $(S^e, L^e)$, and then linearize the resultant system at the origin to derive its characteristic equation. Let $\bar{S} = S - S^e$ and $\bar{L} = L - L^e$. Then the centered system is reduced to

$$ \begin{cases} \dfrac{d\bar{S}(t)}{dt} = r\left(1 - \dfrac{2S^e}{K}\right)\bar{S}(t) - \alpha\beta S^e \bar{L}(t) - \dfrac{r}{K}\bar{S}(t)^2 - \alpha\beta L^e \bar{S}(t - \tau) - \alpha\beta \bar{L}(t)\bar{S}(t - \tau) \\[2ex] \dfrac{d\bar{L}(t)}{dt} = \left(L^e + \bar{L}(t)\right)\phi\alpha\beta \bar{S}(t - \tau) \end{cases} \tag{11} $$
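Put $(\bar{S}(t), \bar{L}(t))^{\mathsf{T}} = Ce^{\lambda t}$, where $C \in \mathbb{C}^2$ and $\lambda \in \mathbb{C}$. Comparing the linear terms, we have

$$ C\lambda e^{\lambda t} = \begin{pmatrix} r\left(1 - \dfrac{2S^e}{K}\right) & -\alpha\beta S^e \\ 0 & 0 \end{pmatrix} Ce^{\lambda t} + \begin{pmatrix} -\alpha\beta L^e & 0 \\ \phi\alpha\beta L^e & 0 \end{pmatrix} Ce^{\lambda(t - \tau)} $$

In consequence, for Eq. 10, we get the characteristic equation of the linearized system around $(S^e, L^e)$,

$$ \lambda^2 + \left\{ -r\left(1 - \frac{2S^e}{K}\right) + \alpha\beta L^e e^{-\lambda\tau} \right\}\lambda + (\alpha\beta)^2 \phi S^e L^e e^{-\lambda\tau} = 0 \tag{12} $$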
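and substituting Eq. 8 into this gives

$$ \lambda^2 + \left\{ -\left(r + \frac{2cr}{\phi\alpha\beta K}\right) + \left(r + \frac{rc}{\phi\alpha\beta K}\right)e^{-\lambda\tau} \right\}\lambda + (\alpha\beta)^2 \phi S^e L^e e^{-\lambda\tau} = 0 \tag{13} $$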
When $\tau = 0$, Eq. 13 becomes

$$ \lambda^2 - \frac{rc}{\alpha\beta\phi K}\lambda + (\alpha\beta)^2 \phi S^e L^e = 0 \tag{14} $$
Since $c < 0$ and $(\alpha\beta)^2 \phi S^e L^e > 0$, both coefficients of Eq. 14 are positive, so all the characteristic roots of Eq. 14 have negative real parts. In this case, the equilibrium point $(S^e, L^e)$ is locally asymptotically stable for Eq. 10. For the sake of notational simplicity, we introduce new variables, $p$ and $q$, defined as

$$ p = r + \frac{rc}{\alpha\beta\phi K} \quad \text{and} \quad q = (\alpha\beta)^2 \phi S^e L^e > 0 $$
Substituting the new variables into Eq. 13, we can rewrite the characteristic equation with delay as

$$ \lambda^2 + \left(-(2p - r) + pe^{-\lambda\tau}\right)\lambda + qe^{-\lambda\tau} = 0 \tag{15} $$

Substituting $\lambda = iy$ into Eq. 15 and separating the real and imaginary parts gives

$$ py\sin y\tau + q\cos y\tau = y^2 \tag{16} $$

$$ py\cos y\tau - q\sin y\tau = (2p - r)y \tag{17} $$

Squaring and adding Eqs. 16 and 17 yields

$$ y^4 + (3p^2 - 4pr + r^2)y^2 - q^2 = 0 \tag{18} $$
Let $Y = y^2 \geq 0$, where the direction of the inequality is due to $y \in \mathbb{R}$. If $Y = 0$, Eq. 18 gives $q = 0$, which is a contradiction. Thus, we are concerned with $Y > 0$ and the roots $y_0$ of Eq. 18:

$$ y_0 = \pm\left( \frac{-(3p^2 - 4pr + r^2) + \sqrt{(3p^2 - 4pr + r^2)^2 + 4q^2}}{2} \right)^{1/2} \tag{19} $$
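Solving Eqs. 16 and 17 for the cosine and the sine gives

$$ \cos(y_0\tau) = \frac{y_0^2(q + 2p^2 - pr)}{p^2y_0^2 + q^2} \quad \text{and} \quad \sin(y_0\tau) = \frac{y_0\{py_0^2 - q(2p - r)\}}{p^2y_0^2 + q^2} $$

which imply that there is a $\tau_0$ such that

$$ \tau_0 = \frac{1}{y_0}\arccos\frac{y_0^2(q + 2p^2 - pr)}{p^2y_0^2 + q^2} = \frac{1}{y_0}\arcsin\frac{y_0\{py_0^2 - q(2p - r)\}}{p^2y_0^2 + q^2} \tag{20} $$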
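Equations 19 and 20 can be evaluated directly, and the result can be checked by substituting $\lambda = iy_0$ and $\tau = \tau_0$ back into Eq. 15. The following is a minimal sketch under BT's baseline parameter values; the function names are hypothetical, and the angle $y_0\tau_0$ is recovered with `atan2` from the sine and cosine so that both conditions are satisfied at once (this choice is an assumption of the sketch, not something stated in the text).

```python
import cmath
import math


def critical_delay(r, K, alpha, beta, phi, c):
    """Return (p, q, y0, tau0) following Eqs. 8, 19 and 20 (positive root y0)."""
    ab = alpha * beta
    p = r + r * c / (ab * phi * K)
    S_e = -c / (ab * phi)
    L_e = p / ab                          # Eq. 8: alpha*beta*L_e equals p
    q = ab ** 2 * phi * S_e * L_e
    a = 3.0 * p ** 2 - 4.0 * p * r + r ** 2
    Y = (-a + math.sqrt(a * a + 4.0 * q * q)) / 2.0      # Y = y0^2 > 0
    y0 = math.sqrt(Y)
    denom = p * p * Y + q * q
    cos_v = Y * (q + 2.0 * p * p - p * r) / denom
    sin_v = y0 * (p * Y - q * (2.0 * p - r)) / denom
    theta = math.atan2(sin_v, cos_v) % (2.0 * math.pi)   # smallest non-negative angle
    return p, q, y0, theta / y0


def char_poly(lam, tau, p, q, r):
    """Left-hand side of the characteristic equation, Eq. 15."""
    e = cmath.exp(-lam * tau)
    return lam ** 2 + (-(2.0 * p - r) + p * e) * lam + q * e


if __name__ == "__main__":
    r = 0.04
    p, q, y0, tau0 = critical_delay(r=r, K=12000.0, alpha=0.00001,
                                    beta=0.4, phi=4.0, c=-0.1)
    print(f"p={p:.7f}, q={q:.8f}, y0={y0:.7f}, tau0={tau0:.4f}")
    print("residual of Eq. 15 at (i*y0, tau0):", abs(char_poly(1j * y0, tau0, p, q, r)))
```

A residual close to zero confirms that a pair of purely imaginary roots sits on the imaginary axis at the computed delay.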
Then we have the following theorem.

Theorem 2: The delayed dynamic system Eq. 10 is unstable if $\tau > \tau_0$, where

$$ \tau_0 = \frac{1}{y_0}\arccos\frac{y_0^2(q + 2p^2 - pr)}{q^2 + p^2y_0^2} = \frac{1}{y_0}\arcsin\frac{y_0\{py_0^2 - q(2p - r)\}}{q^2 + p^2y_0^2} $$

$$ p = r + \frac{rc}{\alpha\beta\phi K} > 0, \qquad q = (\alpha\beta)^2 \phi S^e L^e > 0 $$

and

$$ y_0 = \pm\left( \frac{-(3p^2 - 4pr + r^2) + \sqrt{(3p^2 - 4pr + r^2)^2 + 4q^2}}{2} \right)^{1/2} $$

Proof: See Appendix.

To the best of our knowledge, a nonlinear dynamic system with a time delay does not have an analytical solution.⁵ Nevertheless, it is possible to examine its dynamic behavior by simulating the system numerically. Since we have performed two simulations without delay in the last section, we now conduct simulations with a production delay, and then compare the results with a time delay with those without one. By doing so, we can detect the effects on the long-run dynamics of a population and natural stocks caused by the delay in production.

⁵ It is possible to construct an analytical solution when a nonlinear dynamic system is discrete (see Suzuki 1996, 2000).

Figure 3 presents simulation results that occur when the delay in production is introduced into the first example, ceteris paribus. The solid lines show the time series of the population and the natural resources obtained in the current simulation, while the dotted lines are reproductions of the simulation results depicted in Fig. 1. With a delay in production, the population and the resource stocks generate more volatile fluctuations relative to those without a delay. The population reaches a much higher peak and a much lower trough, jumping to 17 500 and falling quickly down to near zero, while the natural resources decline more rapidly and get closer to zero stock. These numerical simulations indicate the destabilizing effect caused by a delay in production in the oscillatory case.

[Fig. 3. Production delay in a fluctuating economy. Vertical axis: S_t, L_t; horizontal axis: time, 4th to 18th century.]

Figure 4 presents simulations that occur when a delay in production is introduced into the second example, ceteris paribus. As in Fig. 3, the solid lines show the simulation results with the delay in production, and the dotted lines those without it, which reproduce the results depicted in Fig. 2. Comparing these results, we observe firstly the earlier and much more volatile fluctuations in population as well as in natural resources, and secondly the near exhaustion of natural resources and the much more severe falls in population at the beginning of the 18th century. These numerical simulations again indicate the destabilizing effect of a delay even on otherwise monotonic dynamics.

[Fig. 4. Production delay in a stable economy. Vertical axis: S, L; horizontal axis: time, 4th to 18th century.]

In these numerical simulations, we adopt the parameterization used by BT, for which the critical value of the delay is $\tau_0 = 10.1652$.⁶ This is indeed a large delay, but we should not be too concerned. As seen in Eq. 20, the critical value depends on many other parameters. We should thus be content with the existence of the critical value, trusting that it might be considerably smaller under slightly different parameter specifications. Although it may be possible to analytically derive its dependency on the parameters, we can expect the computations to become messy, and thus confirm it numerically. Figure 5A illustrates the relationship between the delay and the maximum size of the natural resource, ceteris paribus. It displays the downward sloping borderline between the stable region and the unstable region. It can be seen that the critical value of $\tau$ gets smaller as the size of $K$ becomes larger. Figure 5B illustrates the relationship between the delay and the exogenous net birth rate. We find the same property that $\tau_0$ is negatively related to $b - d$. A combination of larger $K$ and smaller $b - d$ in absolute value can lead to a smaller $\tau_0$.

⁶ See Appendix for calculations.
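The dependence of the critical delay on $K$ and on $c = b - d$ can be explored by evaluating Eq. 20 on a small grid. The sketch below is illustrative only: the grid values are hypothetical, all other parameters are held at BT's baseline, and the shortcut $q = -cp$ follows from substituting Eq. 8 into the definition of $q$. It makes no attempt to reproduce the exact curves of Fig. 5.

```python
import math


def critical_delay(r, K, alpha, beta, phi, c):
    """Critical delay tau_0 from Eqs. 19 and 20; requires p > 0."""
    p = r + r * c / (alpha * beta * phi * K)
    if p <= 0.0:
        raise ValueError("p must be positive (interior steady state with L_e > 0)")
    q = -c * p                      # (alpha*beta)^2 * phi * S_e * L_e after using Eq. 8
    a = 3.0 * p ** 2 - 4.0 * p * r + r ** 2
    Y = (-a + math.sqrt(a * a + 4.0 * q * q)) / 2.0
    y0 = math.sqrt(Y)
    denom = p * p * Y + q * q
    cos_v = Y * (q + 2.0 * p * p - p * r) / denom
    sin_v = y0 * (p * Y - q * (2.0 * p - r)) / denom
    return (math.atan2(sin_v, cos_v) % (2.0 * math.pi)) / y0


BASE = dict(r=0.04, K=12000.0, alpha=0.00001, beta=0.4, phi=4.0, c=-0.1)


def sweep(name, values):
    """Evaluate tau_0 while varying one parameter and holding the others at BASE."""
    return [critical_delay(**{**BASE, name: v}) for v in values]


if __name__ == "__main__":
    K_grid = [12000.0, 16000.0, 20000.0, 24000.0]     # hypothetical grid
    c_grid = [-0.15, -0.10, -0.05]                    # hypothetical grid
    for K, tau0 in zip(K_grid, sweep("K", K_grid)):
        print(f"K={K:.0f}: tau0={tau0:.3f}")
    for c, tau0 in zip(c_grid, sweep("c", c_grid)):
        print(f"c={c:+.2f}: tau0={tau0:.3f}")
```

On these grids the computed critical delay falls as K rises and as c moves toward zero, which is the qualitative pattern described in the text.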
4 Concluding Remarks

This study has investigated the long-run dynamics of renewable natural resources and the population by introducing a delay in agricultural production activities. It first confirms that the basic model without a delay exhibits stable dynamics. Then it analytically derives a critical value of the delay in production for which a loss of stability occurs. Subsequently, two numerical simulations show the destabilizing effect caused by the delay in production on the evolution of a small island economy. The resource stocks and the population fluctuate much more widely over time when the growth rate of the natural resources is low (see Fig. 3), and fluctuations are also generated even in a constantly growing economy with a higher growth rate (see Fig. 4).

[Fig. 5. Downward sloping τ0 curves. A The size of the resource stock (horizontal axis: K; vertical axis: τ). B The net birth rate (horizontal axis: b − d; vertical axis: τ). In each panel the curve separates the stable region (below) from the unstable region (above).]
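Appendix

To prove Theorem 2, we apply the method used by Saito (2002, pp. 116–122), which makes it possible to check the simplicity of a characteristic root on the imaginary axis without tedious calculations. Let

$$ P(\lambda, \tau) = \lambda^2 + \left(-(2p - r) + pe^{-\lambda\tau}\right)\lambda + qe^{-\lambda\tau} $$

and $\tau_n = \tau_0 + 2\pi n/y_0$ $(n = 0, 1, 2, \ldots)$. Then $P(iy_0, \tau_n) = 0$ and we obtain the following equations from Eq. 15:

$$ \frac{\partial P}{\partial\tau}(iy_0, \tau_n) = iy_0\left[-y_0^2 - (2p - r)iy_0\right] $$

$$ \frac{\partial P}{\partial\lambda}(iy_0, \tau_n) = 2iy_0 - (2p - r) + \left[p - \tau_n(q + piy_0)\right]e^{-iy_0\tau_n} $$

Clearly, $\dfrac{\partial P}{\partial\tau}(iy_0, \tau_n) \neq 0$. We now consider the value (here $K$ denotes an auxiliary quantity, not the carrying capacity)

$$ K = \frac{p^2y_0^4 + 2q^2y_0^2 + (2p - r)^2q^2}{\left[((2p - r)y_0)^2 + y_0^4\right]\left[q^2 + (py_0)^2\right]} $$

We get $K > 0$. Furthermore, from Eq. 15,

$$ \begin{aligned} \operatorname{sign} K &= \operatorname{sign}\left[\operatorname{Re}\left( \frac{2iy_0 - (2p - r)}{-iy_0(-y_0^2 - (2p - r)iy_0)} + \frac{p}{iy_0(q + piy_0)} \right)\right] \\ &= \operatorname{sign}\left[\operatorname{Re}\left( \frac{2iy_0 - (2p - r)}{-iy_0(-y_0^2 - (2p - r)iy_0)} + \frac{pe^{-iy_0\tau_n}}{iy_0(q + piy_0)e^{-iy_0\tau_n}} - \frac{\tau_n}{iy_0} \right)\right] \\ &= \operatorname{sign}\left[\operatorname{Re}\left( \frac{2iy_0 - (2p - r) + pe^{-iy_0\tau_n}}{-iy_0(-y_0^2 - (2p - r)iy_0)} - \frac{\tau_n}{iy_0} \right)\right] \\ &= \operatorname{sign}\left[\operatorname{Re}\left( -\frac{\partial P(iy_0, \tau_n)/\partial\lambda}{\partial P(iy_0, \tau_n)/\partial\tau} \right)\right] \end{aligned} $$

Hence, we can obtain $\dfrac{\partial P}{\partial\lambda}(iy_0, \tau_n) \neq 0$ and, by the well-known implicit function theorem, we have

$$ \begin{aligned} \operatorname{sign}\left[\operatorname{Re}\left( \left.\frac{d\lambda}{d\tau}\right|_{\lambda = iy_0,\, \tau = \tau_n} \right)\right] &= \operatorname{sign}\left[\operatorname{Re}\left( -\frac{\partial P(iy_0, \tau_n)/\partial\tau}{\partial P(iy_0, \tau_n)/\partial\lambda} \right)\right] \\ &= \operatorname{sign}\left[\operatorname{Re}\left\{ \left( -\frac{\partial P(iy_0, \tau_n)/\partial\lambda}{\partial P(iy_0, \tau_n)/\partial\tau} \right)^{-1} \right\}\right] \\ &= \operatorname{sign} K > 0 \end{aligned} $$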
This implies that $(S^e, L^e)$ becomes unstable if $\tau > \tau_0$ holds (see, for example, Kuang 1993). Under the current parameter specification in which $K = 12\,000$, $c = -0.1$, $r = 0.04$, $\phi = 4$, $\alpha = 0.00001$, and $\beta = 0.4$, we have $p = 0.0191667$, $q = 0.00191667$, and $y_0 = 0.0459087$. Then $\sin(\tau_0 y_0) = 0.449917$, so that $\tau_0 y_0 = \arcsin(0.449917) = 0.466673$, and solving for $\tau_0$ yields $\tau_0 = 10.1652$. If we take $\tau > 10.1652$, then the dynamic system in Eq. 10 becomes unstable, as shown in Figs. 3 and 4.

Acknowledgments. This is a revised version of the paper with the same title that was presented at the international conference on Dynamical Systems Theory and its Applications to Biology and Environmental Science, held at Shizuoka University, Shizuoka, Japan, March 15–17, 2004. We are grateful for comments from Bob Sacker and participants of the conference. We appreciate the financial support from the Japan Ministry of Education, Culture, Sports, Science and Technology (Grant-in-Aid for Scientific Research (B) 15330037 for the first two authors, and Grant-in-Aid for JSPS Fellows 00000472 for the third author). The first two authors also appreciate financial support from Chuo University (Joint Research Grant 0382). Needless to say, any remaining errors are our responsibility.
References

Brander JA, Taylor NS (1998) The simple economics of Easter Island: a Ricardo–Malthus model of renewable resource use. Am Econ Rev 88:119–138
Dalton TR, Coats RM (2000) Could institutional reform have saved Easter Island? J Evolut Econ 10:489–505
Intoh M (2000) Prehistoric Oceania. In: Yamamoto M (ed) Oceania history. Yamakawa Publishing Co., Tokyo, pp 17–45
Kuang Y (1993) Delay differential equations with application in population dynamics. Academic Press, New York
Matsumoto A (2002) Economic dynamic model for small islands. Discrete Dyn Nature Soc 7:121–132
Reuveny R, Decker CS (2000) Easter Island: historical anecdote or warning for the future. Ecol Econ 35:271–287
Saito Y (2002) The necessary and sufficient condition for global stability of a Lotka–Volterra cooperative or competition system with delays. J Math Anal Appl 268:109–124
Suzuki M (1996) On some differential equations in economic models. Math J 43:129–134
Suzuki M (2000) Difference equation for a population model. Discrete Dyn Nature Soc 5:9–18
9. A Determinantal Criterion of Hopf Bifurcations and Its Application to Economic Dynamics

Junichi Minagawa
Summary. We present a criterion for a class of Hopf bifurcations using the properties of bialternate products of matrices, and apply it to a certain economic system.

Key words. System of differential equations, Stability criterion, Bialternate products, Criterion of Hopf bifurcations, Economic dynamics
1 Introduction

Given a linear system of differential equations, the system is asymptotically stable if and only if all the roots of the characteristic equation have negative real parts. As such a stability criterion, the Routh–Hurwitz criterion is well known (e.g., Gantmacher 1959). Given a parametric system of differential equations, the system has a simple Hopf bifurcation when a pair of complex conjugate roots of the characteristic equation passes through the imaginary axis, while all other roots have negative real parts. Liu (1994) showed a criterion of simple Hopf bifurcations based on the Routh–Hurwitz criterion.

Fuller (1968) gave an alternative determinantal stability criterion. The Fuller criterion has an advantage over the Routh–Hurwitz criterion in the sense that, while the Hurwitz determinants involved in the latter have elements which are themselves sums of determinants, the elements of the determinants involved in the former are simpler. In this study, a criterion of simple Hopf bifurcations based on the Fuller criterion will be shown, and then applied to a certain economic system.
Graduate School of Economics, Chuo University, 742-1 Higashi-Nakano, Hachioji, Tokyo 192-0393, Japan
2 The Routh–Hurwitz and Fuller Criteria

Consider a system

$$ \frac{dz_i}{dt} = \sum_{j=1}^{n} a_{ij} z_j, \qquad (i = 1, 2, \ldots, n) \tag{1} $$

where $a_{ij}$ are real constant coefficients. In vector–matrix notation, the system is

$$ \frac{dz}{dt} = Az \tag{2} $$
For the system to be asymptotically stable, it is necessary and sufficient that all the roots of the characteristic equation of $A$, namely,

$$ |\lambda I_n - A| = 0 \tag{3} $$

have negative real parts (e.g., Gantmacher 1959). Let us denote Eq. 3 as a polynomial equation

$$ p_n\lambda^n + p_{n-1}\lambda^{n-1} + \cdots + p_0 = 0, \qquad p_n > 0 \tag{4} $$
Routh (1877) gave the following stability criterion.

Theorem 1: (Routh 1877). Consider Eq. 4. Let

$$ q_m\mu^m + q_{m-1}\mu^{m-1} + \cdots + q_0 = 0 \tag{5} $$

be the equation of root–pair sums of Eq. 4, i.e., let the roots of Eq. 5 be the $m = n(n - 1)/2$ values

$$ \mu = \lambda_i + \lambda_j, \qquad (i = 2, 3, \ldots, n;\ j = 1, 2, \ldots, i - 1) \tag{6} $$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the roots of Eq. 4. Let $q_m > 0$. Then for the roots of Eq. 4 to have all their real parts negative, it is necessary and sufficient that $p_0, p_1, \ldots, p_{n-1} > 0$ and $q_0, q_1, \ldots, q_{m-1} > 0$.

The conditions involved are disadvantageous from the following two points of view: “First, they are not all independent, being n(n + 1)/2 conditions in number, whereas only n are necessary. Secondly, despite several ingenious methods devised by Routh, it is not easy to compute them in the general case” (Samuelson 1941). The following Routh–Hurwitz criterion improves on Theorem 1, since only $n$ conditions are required.

Theorem 2: (e.g., Gantmacher 1959). All the roots of Eq. 4 have negative real parts if and only if

$$ H_1 > 0, \quad H_2 > 0, \quad \ldots, \quad H_n > 0 \tag{7} $$
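where $H_i$ are the Hurwitz determinants:

$$ H_1 = p_{n-1}, \quad H_2 = \begin{vmatrix} p_{n-1} & p_{n-3} \\ p_n & p_{n-2} \end{vmatrix}, \quad H_3 = \begin{vmatrix} p_{n-1} & p_{n-3} & p_{n-5} \\ p_n & p_{n-2} & p_{n-4} \\ 0 & p_{n-1} & p_{n-3} \end{vmatrix}, \quad \ldots, \quad H_n = \begin{vmatrix} p_{n-1} & p_{n-3} & p_{n-5} & \cdots & 0 \\ p_n & p_{n-2} & p_{n-4} & \cdots & 0 \\ 0 & p_{n-1} & p_{n-3} & \cdots & 0 \\ 0 & p_n & p_{n-2} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & p_0 \end{vmatrix} \tag{8} $$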
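The Hurwitz determinants in Eq. 8 can be computed mechanically from the coefficients of Eq. 4. The sketch below is one possible implementation, with hypothetical function names: it fills the n × n Hurwitz matrix using the rule that the entry in row i and column j is $p_{n-2j+i}$ (taken as zero when the index falls outside 0, …, n), and evaluates the leading principal minors with exact rational arithmetic.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((row for row in range(col, n) if a[row][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def hurwitz_determinants(coeffs):
    """coeffs = [p_n, p_{n-1}, ..., p_0] with p_n > 0; returns [H_1, ..., H_n]."""
    n = len(coeffs) - 1
    p = {n - k: Fraction(c) for k, c in enumerate(coeffs)}   # p[i]: coefficient of lambda^i

    def entry(i, j):
        # Row i, column j (1-based) of the Hurwitz matrix in Eq. 8.
        return p.get(n - 2 * j + i, Fraction(0))

    H = [[entry(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    return [determinant([row[:k] for row in H[:k]]) for k in range(1, n + 1)]


def routh_hurwitz_stable(coeffs):
    """Theorem 2: all roots have negative real parts iff every H_i is positive."""
    return all(h > 0 for h in hurwitz_determinants(coeffs))


if __name__ == "__main__":
    # (lambda + 1)(lambda + 2)(lambda + 3) = lambda^3 + 6 lambda^2 + 11 lambda + 6
    print(hurwitz_determinants([1, 6, 11, 6]), routh_hurwitz_stable([1, 6, 11, 6]))
    # lambda^3 + lambda^2 + lambda + 6 has positive coefficients but is not stable
    print(hurwitz_determinants([1, 1, 1, 6]), routh_hurwitz_stable([1, 1, 1, 6]))
```

The second example illustrates why positivity of the coefficients $p_i$ alone is not sufficient for n ≥ 3.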
However, as noted in Fuller (1968), the Hurwitz determinants have elements which are themselves sums of determinants, and for $n > 2$ the $H_i$ become extremely cumbersome. Then Fuller posed the question of whether one can obtain alternative determinantal stability criteria in which the elements of the determinants are simpler functions of the $a_{ij}$ of matrix $A$, and presented such a criterion based on Theorem 1. In what follows, the stability criterion will be stated according to Fuller. We begin by introducing the bialternate product studied by Stéphanos (1900) (cited by Fuller), and consider a matrix whose characteristic equation is Eq. 5.

Definition 1: (Stéphanos 1900). Let $A$ be an $n$-dimensional matrix $(a_{ij})$ and $B$ be an $n$-dimensional matrix $(b_{ij})$. Let $F$ be an $m = n(n - 1)/2$-dimensional matrix $(f_{pq,rs})$ whose rows are labeled $pq$ $(p = 2, 3, \ldots, n;\ q = 1, 2, \ldots, p - 1)$, whose columns are labeled $rs$ $(r = 2, 3, \ldots, n;\ s = 1, 2, \ldots, r - 1)$, and whose elements are

$$ f_{pq,rs} = \frac{1}{2}\left( \begin{vmatrix} a_{pr} & a_{ps} \\ b_{qr} & b_{qs} \end{vmatrix} + \begin{vmatrix} b_{pr} & b_{ps} \\ a_{qr} & a_{qs} \end{vmatrix} \right) \tag{9} $$
Then F is the bialternate product of A and B, and is written as A ⋅ B. Theorem 3: (Stéphanos 1900). The characteristic roots of the matrix G = 2A ⋅ In
(10)
where A is an n-dimensional matrix (aij) and In is an n-dimensional identity matrix, are the n(n − 1)/2 values
164
J. Minagawa
li + lj,
(i = 2, 3, . . . , n; j = 1, 2, . . . , i − 1)
(11)
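where $\lambda_i$ are the eigenvalues of A. The elements of G are
$$g_{pq,rs} = \begin{vmatrix} a_{pr} & a_{ps} \\ \delta_{qr} & \delta_{qs} \end{vmatrix} + \begin{vmatrix} \delta_{pr} & \delta_{ps} \\ a_{qr} & a_{qs} \end{vmatrix} \tag{12}$$
where $\delta_{ij}$ is the Kronecker delta, $\delta_{ij} = 0$ for i ≠ j, and $\delta_{ij} = 1$ for i = j. With p > q and r > s, Eq. 12 is
$$g_{pq,rs} = \begin{cases}
-a_{ps} & \text{if } r = q \\
a_{pr} & \text{if } r \neq p \text{ and } s = q \\
a_{pp} + a_{qq} & \text{if } r = p \text{ and } s = q \\
a_{qs} & \text{if } r = p \text{ and } s \neq q \\
-a_{qr} & \text{if } s = p \\
0 & \text{otherwise}
\end{cases} \tag{13}$$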
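To make the index bookkeeping in Eqs. 12 and 13 concrete, the following is a small sketch that builds $G = 2A \cdot I_n$ entry by entry by expanding the two determinants in Eq. 12. The function name `bialternate_2a_identity` is hypothetical and the code is an illustration, not the author's implementation; rows and columns are labeled pq with p > q in the order (2,1), (3,1), (3,2), and so on.

```python
def bialternate_2a_identity(A):
    """Return (labels, G), where G = 2A . I_n is built from Eq. 12."""
    n = len(A)
    # 1-based labels pq with p > q: (2,1), (3,1), (3,2), ...
    labels = [(p, q) for p in range(2, n + 1) for q in range(1, p)]

    def a(i, j):
        # 1-based access to the entries of A.
        return A[i - 1][j - 1]

    def delta(i, j):
        # Kronecker delta.
        return 1 if i == j else 0

    # Each entry is the sum of the two 2 x 2 determinants in Eq. 12.
    G = [[a(p, r) * delta(q, s) - a(p, s) * delta(q, r)
          + delta(p, r) * a(q, s) - delta(p, s) * a(q, r)
          for (r, s) in labels]
         for (p, q) in labels]
    return labels, G
```

For a 3 × 3 matrix the result has diagonal entries $a_{22}+a_{11}$, $a_{33}+a_{11}$, $a_{33}+a_{22}$, so its trace is twice the trace of A, in line with the eigenvalues $\lambda_i + \lambda_j$ of Theorem 3.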
Theorem 4: (Fuller 1968). Let $A = (a_{ij})$ be a real square matrix of dimension n > 1. Let $G = (g_{pq,rs})$ be the square matrix of dimension m = n(n − 1)/2 defined by Eq. 10, with elements given by Eq. 12, or equivalently by Eq. 13. Then for the characteristic roots of A to have all their real parts negative, it is necessary and sufficient that in the characteristic polynomial of A, namely,
$$|\lambda I_n - A| \tag{14}$$
and in the characteristic polynomial of G, namely
$$|\mu I_m - G| \tag{15}$$
the coefficients of $\lambda^i$ (i = 0, 1, . . . , n − 1) and $\mu^i$ (i = 0, 1, . . . , m − 1) should all be positive.
The Fuller criterion can be considered as an alternative solution to the second disadvantage of Theorem 1. Jury (1982, p. 109) mentioned that it is advisable, for computational purposes, to use the Fuller criterion even though some redundancy may exist. We note that, while the Routh–Hurwitz criterion is widely applied to economic dynamics, the Fuller criterion is rarely used in the economics literature. For applications of the latter, see, for example, Murata (1977) and Hadjimichalakis and Okuguchi (1979). Regarding the first disadvantage of Theorem 1, Araposthathis and Jury (1979) showed that the conditions in Theorem 1 can be reduced to 1 + n(n − 1)/2 conditions.
Theorem 5: (Araposthathis and Jury 1979). Consider Eqs. 4–6. Then for the roots of Eq. 4 to have their real parts negative, it is necessary and sufficient that $p_0 > 0$ and $q_0, q_1, \ldots, q_{m-1} > 0$.
Remark 1: (Jury 1982). The problem of obtaining the minimum number of conditions as a function of n is still an open problem.
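The coefficient tests in Theorems 4 and 5 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the author's code: the names `char_poly`, `two_a_bialternate_identity`, `fuller_stable`, and `reduced_stable` are hypothetical, the characteristic polynomials are computed with the Faddeev–LeVerrier recursion (a technique swapped in here for convenience), and the leading coefficients are normalized to 1 so that $q_m > 0$ holds automatically.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(x I - A) (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A M_{k-1} + c_{n-k+1} I
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        # c_{n-k} = -tr(A M_k) / k
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return coeffs


def two_a_bialternate_identity(A):
    """G = 2A . I_n from Eq. 12, rows/columns ordered (2,1), (3,1), (3,2), ..."""
    n = len(A)
    pairs = [(p, q) for p in range(1, n) for q in range(p)]  # 0-based, p > q

    def d(i, j):
        return 1 if i == j else 0

    return [[A[p][r] * d(q, s) - A[p][s] * d(q, r)
             + d(p, r) * A[q][s] - d(p, s) * A[q][r]
             for (r, s) in pairs] for (p, q) in pairs]


def fuller_stable(A):
    """Theorem 4: all lower-order coefficients of both polynomials positive (n > 1)."""
    p = char_poly(A)
    q = char_poly(two_a_bialternate_identity(A))
    return all(c > 0 for c in p[1:]) and all(c > 0 for c in q[1:])


def reduced_stable(A):
    """Theorem 5: only p_0 > 0 and q_0, ..., q_{m-1} > 0 are checked (n > 1)."""
    p = char_poly(A)
    q = char_poly(two_a_bialternate_identity(A))
    return p[-1] > 0 and all(c > 0 for c in q[1:])
```

Exact `Fraction` arithmetic is used so that sign tests are not blurred by rounding; for floating-point input a tolerance would be needed instead.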
3 Hopf Bifurcations and the Liu Criterion
Following Guckenheimer and Holmes (1983) and Liu (1994), we state simple Hopf bifurcations and the Liu criterion. Consider a system
$$\frac{dz}{dt} = f_\alpha(z), \quad z \in \mathbb{R}^n, \quad \alpha \in \mathbb{R}^1 \tag{16}$$
with an equilibrium $(z^*, \alpha^*)$, and $f \in C^\infty$. Assume that
(i) The Jacobian matrix $D_z f_{\alpha^*}(z^*)$ has a simple pair of purely imaginary eigenvalues, and all other eigenvalues have negative real parts.
Then there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$. The eigenvalues $\lambda(\alpha)$, $\bar{\lambda}(\alpha)$ of $J(\alpha) = D_z f_\alpha(z(\alpha))$ which are purely imaginary at $\alpha = \alpha^*$ vary smoothly with $\alpha$. Moreover, if
(ii) $\dfrac{d}{d\alpha}(\operatorname{Re} \lambda(\alpha^*)) \neq 0$
then there is a simple Hopf bifurcation. The term “simple” distinguishes this case from Hopf bifurcations in which the other eigenvalues merely have nonzero real parts. Let us denote the characteristic equation of the Jacobian matrix $J(\alpha)$, namely,
$$|\lambda I_n - J(\alpha)| = 0 \tag{17}$$
as a polynomial equation
$$p(\lambda; \alpha) = p_n(\alpha)\lambda^n + p_{n-1}(\alpha)\lambda^{n-1} + \cdots + p_0(\alpha) = 0, \quad p_n(\alpha) > 0 \tag{18}$$
where every $p_i(\alpha)$ is a smooth function of $\alpha$.
Theorem 6: (Liu 1994). Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions on the coefficients of the characteristic polynomial $p(\lambda; \alpha)$:
(i′) $p_0(\alpha^*) > 0$, $H_1(\alpha^*) > 0$, $H_2(\alpha^*) > 0$, . . . , $H_{n-2}(\alpha^*) > 0$, $H_{n-1}(\alpha^*) = 0$
(ii′) $\dfrac{d}{d\alpha}(H_{n-1}(\alpha^*)) \neq 0$
where $H_i$ are the Hurwitz determinants.
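As a quick illustration of conditions (i′) and (ii′), consider a hypothetical cubic characteristic polynomial chosen only for this example (it does not come from the chapter). With n = 3 the relevant determinants are $H_1$ and $H_2$, and the conditions can be checked by hand:

```latex
\begin{align*}
p(\lambda;\alpha) &= \lambda^3 + \lambda^2 + \lambda + \alpha,
  \qquad p_3 = p_2 = p_1 = 1,\quad p_0 = \alpha, \\
H_1(\alpha) &= p_2 = 1 > 0, \\
H_2(\alpha) &= \begin{vmatrix} p_2 & p_0 \\ p_3 & p_1 \end{vmatrix} = 1 - \alpha, \\
H_2(\alpha^*) &= 0 \iff \alpha^* = 1, \qquad p_0(\alpha^*) = 1 > 0, \\
\frac{d}{d\alpha} H_2(\alpha^*) &= -1 \neq 0, \\
p(\lambda;1) &= (\lambda + 1)(\lambda^2 + 1)
  \;\Longrightarrow\; \lambda = -1,\ \pm i .
\end{align*}
```

The factorization in the last line confirms what the coefficient conditions predict: at $\alpha^* = 1$ there is one pair of purely imaginary roots and the remaining root is negative.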
The Liénard–Chipard criterion is well known as another stability criterion (e.g., Gantmacher 1959). As Gantmacher (1959, vol. II, p. 173) stated, the Liénard–Chipard criterion has an advantage over the Routh–Hurwitz criterion in the sense that the number of determinantal inequalities in the former is roughly half that in the latter. From the practical point of view, Manfredi and Fanti (2004) reformulated the Liu criterion by replacing the conditions based on the Routh–Hurwitz criterion with the corresponding Liénard–Chipard conditions. It is worth pointing out that a coefficient criterion for Hopf bifurcations in which some other eigenvalues have nonzero real parts has been established for small values of n (= 2, 3, 4) (see Asada and Yoshida (2003) for n = 4).
4 Alternative Criterion
An alternative criterion of simple Hopf bifurcations based on the Fuller criterion will now be given. The following theorem is an immediate consequence of Theorem 1, and the idea of the proof is due to Routh’s (1877) proof of Theorem 1 and Liu’s (1994) proof of Theorem 6.
Theorem 7: Given Eqs. 4–6 with $q_m > 0$, Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts if and only if $p_0, p_1, \ldots, p_{n-2} > 0$, $p_{n-1} \geq 0$, $q_0 = 0$, and $q_1, q_2, \ldots, q_{m-1} > 0$. Here, $p_{n-1} = 0$ when n = 2 and $p_{n-1} > 0$ when $n \geq 3$.
Proof: Necessity: If Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts, then the left-hand side of Eq. 4 can be written as
$$p(\lambda) = r(\lambda)(\lambda^2 + s_1\lambda + s_0) \tag{19}$$
where $s_0 > 0$, $s_1 = 0$, and then $s_1^2 - 4s_0 < 0$,
$$r(\lambda) = r_{n-2}\lambda^{n-2} + r_{n-3}\lambda^{n-3} + \cdots + r_0, \quad r_{n-2} > 0 \tag{20}$$
and all (n − 2) roots of $r(\lambda)$ have negative real parts. Thus, from Theorem 1, the coefficients of $r(\lambda)$, $r_0, r_1, \ldots, r_{n-2}$, are all positive. Hence, all the coefficients of $p(\lambda)$ are positive, i.e., $p_0, p_1, \ldots, p_{n-1} > 0$, except for $p_1 = 0$ when n = 2. On the other hand, the left-hand side of Eq. 5 can be expressed as
$$q(\mu) = t(\mu)(\mu + u_0) \tag{21}$$
where $u_0 = 0$,
$$t(\mu) = t_{m-1}\mu^{m-1} + t_{m-2}\mu^{m-2} + \cdots + t_0, \quad t_{m-1} > 0 \tag{22}$$
and all (m − 1) roots of $t(\mu)$ have negative real parts. By the same reasoning, the coefficients of $t(\mu)$, $t_0, t_1, \ldots, t_{m-1}$, are all positive. Hence we have $q_0 = 0$, $q_1, q_2, \ldots, q_{m-1} > 0$.
Sufficiency: The case n = 2 is clear. Let $n \geq 3$ and assume that $p_0, p_1, \ldots, p_{n-1} > 0$. Then $p(\lambda) > 0$ for $\lambda \geq 0$, i.e., Eq. 4 has neither positive real roots nor zero real roots. On the other hand, if $q_0 = 0$ and $q_1, q_2, \ldots, q_{m-1} > 0$, from the factorization Eq. 21, Eq. 5 has all its real roots in the open left half plane except for one zero real root, but from Eq. 6, the real roots of Eq. 5 include twice the real parts of the complex roots of Eq. 4. Hence, Eq. 4 has a simple pair of purely imaginary roots and all other roots have negative real parts. ■
Suppose that the roots of Eq. 17 are $\lambda_1(\alpha), \lambda_2(\alpha), \ldots, \lambda_n(\alpha)$. Then there is the equation
$$q(\mu; \alpha) = q_m(\alpha)\mu^m + q_{m-1}(\alpha)\mu^{m-1} + \cdots + q_0(\alpha) = 0, \quad q_m(\alpha) > 0 \tag{23}$$
whose roots are given by
$$\mu(\alpha) = \lambda_i(\alpha) + \lambda_j(\alpha) \quad (i = 2, 3, \ldots, n; \; j = 1, 2, \ldots, i - 1) \tag{24}$$
where every $q_i(\alpha)$ is a smooth function of $\alpha$ and m = n(n − 1)/2. Let us define
$$G(\alpha) = 2J(\alpha) \cdot I_n \tag{25}$$
where $J(\alpha)$ is the Jacobian matrix of the system Eq. 16, $I_n$ is an n-dimensional identity matrix, and $\cdot$ denotes the bialternate product. Then $G(\alpha)$ is the matrix whose characteristic equation is Eq. 23.
Theorem 8: Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions on the characteristic polynomial $p(\lambda; \alpha)$ and the associated polynomial $q(\mu; \alpha)$:
(i″) $p_0(\alpha^*), p_1(\alpha^*), \ldots, p_{n-2}(\alpha^*) > 0$, $p_{n-1}(\alpha^*) \geq 0$, $q_0(\alpha^*) = 0$, and $q_1(\alpha^*), q_2(\alpha^*), \ldots, q_{m-1}(\alpha^*) > 0$. Here $p_{n-1}(\alpha^*) = 0$ when n = 2 and $p_{n-1}(\alpha^*) > 0$ when $n \geq 3$
(ii″) $\dfrac{d}{d\alpha}(q_0(\alpha^*)) \neq 0$
Proof: The equivalence of (i) and (i″) follows from Theorem 7. Hence, given (i), which is equivalent to (i″), in a sufficiently small neighborhood of $\alpha^*$, $q(\mu; \alpha)$ can be expressed as
$$q(\mu; \alpha) = t(\mu; \alpha)(\mu + u_0(\alpha)) \tag{26}$$
where $u_0(\alpha^*) = 0$,
$$t(\mu; \alpha) = t_{m-1}(\alpha)\mu^{m-1} + t_{m-2}(\alpha)\mu^{m-2} + \cdots + t_0(\alpha), \quad t_{m-1}(\alpha) > 0 \tag{27}$$
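$t_0(\alpha) > 0$, and $u_0(\alpha)$ and $t_0(\alpha)$ are smooth functions of $\alpha$. Let the eigenvalues of $J(\alpha)$ which are purely imaginary at $\alpha = \alpha^*$ be $\lambda^0(\alpha)$ and $\bar{\lambda}^0(\alpha)$. Then, from Eqs. 24 and 26, $\operatorname{Re}\lambda^0(\alpha) = (\lambda^0(\alpha) + \bar{\lambda}^0(\alpha))/2 = \mu^0(\alpha)/2 = -u_0(\alpha)/2$. Therefore
$$\frac{d}{d\alpha}(\operatorname{Re}\lambda^0(\alpha^*)) \neq 0 \;\Leftrightarrow\; -\frac{1}{2}\left[\frac{d}{d\alpha}(u_0(\alpha^*))\right] \neq 0 \;\Leftrightarrow\; -\frac{1}{2}\left[\frac{d}{d\alpha}(t_0(\alpha^*)u_0(\alpha^*))\right] \neq 0 \;\Leftrightarrow\; -\frac{1}{2}\left[\frac{d}{d\alpha}(q_0(\alpha^*))\right] \neq 0 \tag{28}$$
The second equivalence holds because $u_0(\alpha^*) = 0$ and $t_0(\alpha^*) > 0$, and the third because $q_0(\alpha) = t_0(\alpha)u_0(\alpha)$ by Eq. 26.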
■
This theorem is an immediate consequence of Theorems 2, 4, and 6. In addition, Fuller mentioned the necessary condition for Eq. 14 to have at least one pair of purely imaginary roots, which is expressed as |G| = 0. In Guckenheimer et al. (1997), this condition is used for the numerical detection of candidates of Hopf bifurcation points. On this point, see also Kuznetsov (1998) and Govaerts (2000).
It must be noted that Theorem 7 involves redundant conditions. The following theorem is an immediate consequence of Theorem 5, and the idea of the proof is due to Araposthathis and Jury’s (1979) proof of Theorem 5.
Theorem 9: Given Eqs. 4–6 with $q_m > 0$, Eq. 4 has a simple pair of purely imaginary roots, and all other roots have negative real parts if and only if $p_0 > 0$, $q_0 = 0$, and $q_1, q_2, \ldots, q_{m-1} > 0$.
Proof: Necessity: Necessity follows from Theorem 7. Sufficiency: From $p_0 > 0$, Eq. 4 has no zero real roots. Then it is deduced that Eq. 4 can either have an even number of positive real roots or no positive real roots. However, if $q_0 = 0$ and $q_1, q_2, \ldots, q_{m-1} > 0$, Eq. 5 has all its real roots in the open left half plane except for one zero real root. Thus, from Eq. 6, the real roots of Eq. 4 must be negative. Also, the real roots of Eq. 5 include twice the real parts of the complex roots of Eq. 4. Hence, Eq. 4 has a simple pair of purely imaginary roots and all other roots have negative real parts. ■
From Theorem 9, (i″) in Theorem 8 can also be replaced with (i′′′) below.
Theorem 10: Assume there is a smooth curve of equilibria $(z(\alpha), \alpha)$ with $z(\alpha^*) = z^*$ for the system Eq. 16. Conditions (i) and (ii) for a simple Hopf bifurcation are equivalent to the following conditions on the characteristic polynomial $p(\lambda; \alpha)$ and the associated polynomial $q(\mu; \alpha)$:
(i′′′) $p_0(\alpha^*) > 0$, $q_0(\alpha^*) = 0$, $q_1(\alpha^*), q_2(\alpha^*), \ldots, q_{m-1}(\alpha^*) > 0$
(ii″) $\dfrac{d}{d\alpha}(q_0(\alpha^*)) \neq 0$
Remark 2: The problem of obtaining similar criteria of simple Hopf bifurcations without redundant conditions is still an open problem.
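Conditions (i′′′) and (ii″) of Theorem 10 lend themselves to a direct numerical check. The sketch below is only an illustration under stated assumptions: `hopf_conditions` and `example_jacobian` are hypothetical names, the example Jacobian is a companion matrix of the made-up polynomial $\lambda^3 + \lambda^2 + \lambda + \alpha$, and the derivative of $q_0$ is approximated by a central difference (exact here only because $q_0$ is linear in $\alpha$ for this example).

```python
from fractions import Fraction


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(x I - A) (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return coeffs


def two_j_bialternate_identity(J):
    """G = 2J . I_n from Eq. 12, rows/columns ordered (2,1), (3,1), (3,2), ..."""
    n = len(J)
    pairs = [(p, q) for p in range(1, n) for q in range(p)]  # 0-based, p > q

    def d(i, j):
        return 1 if i == j else 0

    return [[J[p][r] * d(q, s) - J[p][s] * d(q, r)
             + d(p, r) * J[q][s] - d(p, s) * J[q][r]
             for (r, s) in pairs] for (p, q) in pairs]


def hopf_conditions(jacobian, alpha_star, h=Fraction(1, 1000)):
    """Return (condition (i''') holds?, finite-difference estimate of dq0/dalpha)."""
    def q(alpha):
        return char_poly(two_j_bialternate_identity(jacobian(alpha)))

    p0 = char_poly(jacobian(alpha_star))[-1]
    q_star = q(alpha_star)
    cond_i = p0 > 0 and q_star[-1] == 0 and all(c > 0 for c in q_star[1:-1])
    dq0 = (q(alpha_star + h)[-1] - q(alpha_star - h)[-1]) / (2 * h)
    return cond_i, dq0


def example_jacobian(alpha):
    # Hypothetical companion matrix of lambda^3 + lambda^2 + lambda + alpha.
    return [[0, 1, 0], [0, 0, 1], [-alpha, -1, -1]]
```

Calling `hopf_conditions(example_jacobian, Fraction(1))` checks the candidate value $\alpha^* = 1$; a nonzero second return value then corresponds to condition (ii″). With floating-point Jacobians the equality test on $q_0$ would have to be replaced by a root-finding step or a tolerance.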
5 Examples
Example 1: For the system Eq. 16 with n = 2, the Jacobian matrix is
$$J(\alpha) = \begin{bmatrix} a_{11}(\alpha) & a_{12}(\alpha) \\ a_{21}(\alpha) & a_{22}(\alpha) \end{bmatrix} \tag{29}$$
and the matrix defined by Eq. 25 is
$$G(\alpha) = [a_{22}(\alpha) + a_{11}(\alpha)] \tag{30}$$
In this case, (i′′′) in Theorem 10 is $p_0(\alpha^*) \equiv |J(\alpha^*)| > 0$ and $q_0(\alpha^*) \equiv -G(\alpha^*) = 0$. Thus, the conditions for a simple Hopf bifurcation are
$$|J(\alpha^*)| > 0, \quad -G(\alpha^*) = 0 \tag{31}$$
and
$$\frac{d}{d\alpha}(-G(\alpha^*)) \neq 0 \tag{32}$$
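Example 2: For the system Eq. 16 with n = 3, the Jacobian matrix is
$$J(\alpha) = \begin{bmatrix} a_{11}(\alpha) & a_{12}(\alpha) & a_{13}(\alpha) \\ a_{21}(\alpha) & a_{22}(\alpha) & a_{23}(\alpha) \\ a_{31}(\alpha) & a_{32}(\alpha) & a_{33}(\alpha) \end{bmatrix} \tag{33}$$
and the matrix defined by Eq. 25 is
$$G(\alpha) = \begin{bmatrix} a_{22}(\alpha) + a_{11}(\alpha) & a_{23}(\alpha) & -a_{13}(\alpha) \\ a_{32}(\alpha) & a_{33}(\alpha) + a_{11}(\alpha) & a_{12}(\alpha) \\ -a_{31}(\alpha) & a_{21}(\alpha) & a_{33}(\alpha) + a_{22}(\alpha) \end{bmatrix} \tag{34}$$
In this case, (i′′′) in Theorem 10 is $p_0(\alpha^*) \equiv -|J(\alpha^*)| > 0$, $q_0(\alpha^*) \equiv -|G(\alpha^*)| = 0$, $q_1(\alpha^*) \equiv |G(\alpha^*)_1| + |G(\alpha^*)_2| + |G(\alpha^*)_3| > 0$, and $q_2(\alpha^*) \equiv -\operatorname{tr}G(\alpha^*) > 0$, where $G(\alpha^*)_i$ stands for the submatrix of $G(\alpha^*)$ after deleting its i-th row and column. Then $q_1(\alpha^*) > 0$ is redundant, since $q_1(\alpha^*) = (p_2(\alpha^*))^2 + p_1(\alpha^*)$. Thus, the conditions for a simple Hopf bifurcation are
$$-|J(\alpha^*)| > 0, \quad -|G(\alpha^*)| = 0, \quad -\operatorname{tr}G(\alpha^*) > 0 \tag{35}$$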
and
$$\frac{d}{d\alpha}(-|G(\alpha^*)|) \neq 0 \tag{36}$$
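The relations used in Example 2 between the coefficients of $q(\mu)$ and those of $p(\lambda)$ can be verified with Vieta's formulas. The derivation below is a sketch that assumes the normalization $p_3 = q_3 = 1$ and writes $e_1, e_2, e_3$ (symbols introduced here for convenience) for the elementary symmetric functions of the eigenvalues $\lambda_1, \lambda_2, \lambda_3$:

```latex
\begin{align*}
p(\lambda) &= \lambda^3 + p_2\lambda^2 + p_1\lambda + p_0
  = (\lambda-\lambda_1)(\lambda-\lambda_2)(\lambda-\lambda_3), \\
e_1 &= \lambda_1+\lambda_2+\lambda_3 = -p_2, \quad
e_2 = \lambda_1\lambda_2+\lambda_1\lambda_3+\lambda_2\lambda_3 = p_1, \quad
e_3 = \lambda_1\lambda_2\lambda_3 = -p_0, \\
\mu_1 &= \lambda_1+\lambda_2 = e_1-\lambda_3, \quad
\mu_2 = \lambda_1+\lambda_3 = e_1-\lambda_2, \quad
\mu_3 = \lambda_2+\lambda_3 = e_1-\lambda_1, \\
q_2 &= -(\mu_1+\mu_2+\mu_3) = -2e_1 = 2p_2 = -\operatorname{tr} G, \\
q_1 &= \mu_1\mu_2+\mu_1\mu_3+\mu_2\mu_3 = e_1^2 + e_2 = p_2^2 + p_1, \\
q_0 &= -\mu_1\mu_2\mu_3 = -(e_1 e_2 - e_3) = p_2 p_1 - p_0 = H_2 = -|G| .
\end{align*}
```

This also shows why $q_1 > 0$ is redundant: $q_2 > 0$ gives $p_2 > 0$, and $q_0 = 0$ with $p_0 > 0$ gives $p_1 = p_0/p_2 > 0$, so $q_1 = p_2^2 + p_1 > 0$ follows automatically.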
6 Economic Application
Consider the generalized Tobin model formulated by Hadjimichalakis (1971) and studied by Benhabib and Miyao (1981). The system is as follows:
$$\dot{k} = sf(k) - (1 - s)(\theta - q)m - nk \tag{37}$$
$$\dot{m} = m(\theta - \hat{p} - n) \tag{38}$$
$$\dot{q} = \gamma(\hat{p} - q) \tag{39}$$
$$\hat{p} = \varepsilon[m - L(k, q)] + q \tag{40}$$
where k, m, q, and $\hat{p}$ are the capital–labor ratio, the money stock per capita, and the expected and actual rates of inflation, respectively, and s, $\theta$, n, $\gamma$, and $\varepsilon$ are the saving ratio, the rate of monetary expansion, the natural rate of growth, and the speeds of adjustment of expectations and of the price level, respectively, and the over-dot denotes the time derivative. Furthermore, L(k, q) and f(k) are assumed to be smooth functions. To cope with the instability of the original Tobin model, several generalizations of the model have been proposed, together with stability analyses (see the above references and those cited therein). From the viewpoint of this study, it is noted that for the system Eqs. 37–39 with $\hat{p} = \varepsilon[m - L(k, q)]$ instead of Eq. 40, Hadjimichalakis and Okuguchi (1979) examined the stability of its equilibrium by utilizing the Fuller criterion, while Benhabib and Miyao (1981) showed that the system Eqs. 37–40 has a simple Hopf bifurcation for a certain value $\gamma = \gamma^*$ by implicitly using the Liu criterion. In what follows, the Liu criterion and the alternative criterion in this study will be applied to the system Eqs. 37–40 to show their differences. Let us denote the Jacobian matrix of the system as $J(\gamma) = (a_{ij}(\gamma))$, i, j = 1, 2, 3. Then the Liu criterion becomes $-|J(\gamma^*)| > 0$, $-\operatorname{tr}J(\gamma^*) > 0$, $H_2(\gamma^*) = 0$, and $dH_2(\gamma^*)/d\gamma \neq 0$. In the above conditions,
$$H_2(\gamma^*) = \begin{vmatrix} p_2(\gamma^*) & p_0(\gamma^*) \\ 1 & p_1(\gamma^*) \end{vmatrix} \tag{41}$$
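where
$$
\begin{aligned}
p_2(\gamma^*) &= -(sf' - n - \varepsilon m - \gamma^*\varepsilon L_2) \\
p_1(\gamma^*) &= \begin{vmatrix} -\varepsilon m & m(\varepsilon L_2 - 1) \\ \gamma^*\varepsilon & -\gamma^*\varepsilon L_2 \end{vmatrix}
 + \begin{vmatrix} sf' - n & (1 - s)m \\ -\gamma^*\varepsilon L_1 & -\gamma^*\varepsilon L_2 \end{vmatrix}
 + \begin{vmatrix} sf' - n & -(1 - s)n \\ \varepsilon m L_1 & -\varepsilon m \end{vmatrix} \\
p_0(\gamma^*) &= -\begin{vmatrix} sf' - n & -(1 - s)n & (1 - s)m \\ \varepsilon m L_1 & -\varepsilon m & m(\varepsilon L_2 - 1) \\ -\gamma^*\varepsilon L_1 & \gamma^*\varepsilon & -\gamma^*\varepsilon L_2 \end{vmatrix}
\end{aligned} \tag{42}
$$
and $f' = df(k^*)/dk$, $L_1 = \partial L(k^*, q^*)/\partial k$, and $L_2 = \partial L(k^*, q^*)/\partial q$. On the other hand, Eqs. 35 and 36 give $-|J(\gamma^*)| > 0$, $-|G(\gamma^*)| = 0$, $-\operatorname{tr}G(\gamma^*) > 0$, and $d(-|G(\gamma^*)|)/d\gamma \neq 0$. Here,
$$-|G(\gamma^*)| = -\begin{vmatrix} -\varepsilon m + sf' - n & m(\varepsilon L_2 - 1) & -(1 - s)m \\ \gamma^*\varepsilon & -\gamma^*\varepsilon L_2 + sf' - n & -(1 - s)n \\ \gamma^*\varepsilon L_1 & \varepsilon m L_1 & -\gamma^*\varepsilon L_2 - \varepsilon m \end{vmatrix} \tag{43}$$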
These expressions show that the determinant $H_2(\gamma^*)$ has elements which are themselves sums of determinants, while the elements of the determinant of the matrix $G(\gamma^*)$ are simple combinations of the $a_{ij}(\gamma^*)$. Note that $-\operatorname{tr}G(\gamma^*) = -2\operatorname{tr}J(\gamma^*)$ and $-|G(\gamma^*)| = H_2(\gamma^*)$.
7 Conclusion
The determinantal criterion of simple Hopf bifurcations is provided in Sect. 4 and applied to the generalized Tobin model in Sect. 6.
References
Araposthathis A, Jury EI (1979) Remarks on redundance in stability criteria and a counter-example to Fuller’s conjecture. Int J Control 29:1027–1034
Asada T, Yoshida H (2003) Coefficient criterion for four-dimensional Hopf bifurcations: a complete mathematical characterization and applications to economic dynamics. Chaos, Solitons Fractals 18:525–536
Benhabib J, Miyao T (1981) Some new results on the dynamics of the generalized Tobin model. Int Econ Rev 22:589–596
Fuller AT (1968) Conditions for a matrix to have only characteristic roots with negative real parts. J Math Anal Appl 23:71–98
Gantmacher FR (1959) The theory of matrices. Chelsea, New York
Govaerts WJF (2000) Numerical methods for bifurcations of dynamical equilibria. SIAM, Philadelphia
Guckenheimer J, Holmes P (1983) Nonlinear oscillations, dynamical systems, and bifurcations of vector fields. Springer, Berlin
Guckenheimer J, Myers M, Sturmfels B (1997) Computing Hopf bifurcations. I. SIAM J Numer Anal 34:1–21
Hadjimichalakis MG (1971) Money, expectations, and dynamics: an alternative view. Int Econ Rev 12:381–402
Hadjimichalakis MG, Okuguchi K (1979) The stability of a generalized Tobin model. Rev Econ Stud 46:175–178
Jury EI (1982) Inners and stability of dynamic systems, 2nd edn. Krieger, Malabar, FL
Kuznetsov YA (1998) Elements of applied bifurcation theory, 2nd edn. Springer, Berlin
Liu WM (1994) Criterion of Hopf bifurcations without using eigenvalues. J Math Anal Appl 182:250–256
Manfredi P, Fanti L (2004) Cycles in dynamic economic modeling. Econ Model 21:573–594
Murata Y (1977) Mathematics for stability and optimization of economic systems. Academic Press, New York
Routh EJ (1877) Stability of a given state of motion. Macmillan, London
Samuelson PA (1941) Conditions that the roots of a polynomial be less than unity in absolute value. Ann Math Stat 12:360–364
Stéphanos C (1900) Sur une extension du calcul des substitutions linéaires. J Math Pures Appl 6:73–128
Part III Economics of Space: Empirical Analysis
10. Public- and Private-School Competition: The Spatial Education Production Function
David M. Brasington
Summary. School vouchers may increase the competition that public-school districts face. Greater competition may spur public schools to improve student outcomes, which reliably predict labor market productivity and earnings. Previous school competition studies did not use spatial statistics, so they failed to incorporate spillovers and the effect of omitted variables into their education production functions. Significant spatial effects are found in all regressions, and spatial statistics improve adjusted R² values. There seems to be no consistent association between private-school attendance rates and public-school achievement, or between the number of public-school districts in a county and public-school performance. Competitive effects, which seem plausible in nonspatial regressions, dissipate when spatial statistics are used. When school inputs appeared to be statistically significant in nonspatial regressions, the spatial regressions generally made the significance disappear. Poverty appeared to depress reading and writing rates, but this effect disappeared in the spatial models.
Key words. Spillovers, Human capital formation, Spatial statistics, School vouchers, Private- and public-school competition
1 Introduction
Economists care about schooling because school quality, as measured by proficiency test scores, is an important predictor of labor market productivity and earnings (Sander 1996; Bishop 1989; Loury and Garman 1995; Murnane et al. 1995). Whatever improves school quality might also improve labor market productivity and earnings.
Economics Department, Louisiana State University, Baton Rouge, LA 70803, USA
The private sector and the government both provide primary and secondary schooling. The dual provision of schooling has inevitably led to comparisons between private and public schools, with many people stating that private schools do a better job with less money. Some economists speculate that private schools must compete for students, and must therefore deliver a high-quality product at low cost to survive. Public-school systems, which have local monopoly power, are under less competitive pressure to deliver high-quality and efficient services (Couch et al. 1993). School vouchers are often proposed to increase competition. Under the current system, students are assigned to tax-funded (public) schools based on their residence. If students attend the schools they are assigned to, they pay no extra tuition, but if students want to attend private schools or a tax-funded school other than the one they are assigned to, parents must continue to pay taxes for their assigned public school, and they also have to pay full tuition rates at the school they attend. A voucher program would lower the cost of attending another school. Under a voucher program, parents receive a voucher (a check from the government) that can be redeemed at the assigned school or another school. Some voucher plans restrict voucher use to public schools only, while other voucher plans allow recipients to use the voucher at any public or private school. A flurry of recent research uses education production functions to investigate the extent to which public schools respond to competition from other public and private schools. By knowing whether public schools respond more to competition from other public schools or from private schools, the most effective voucher policy may be designed. If vouchers increase school competition, student outcomes may improve, and the labor force of the future may be more productive. Previous research has failed to address some important econometric issues.
Public schooling is often thought to have positive spillovers over some spatial subgroup. Traditional education production functions ignore the possibility of spillovers. Furthermore, even the most careful education production functions are subject to omitted variable bias. Because of the public-policy importance of the voucher issue, estimates of school competition must be as unbiased and efficient as possible. This study introduces the spatial education production function. The spatial education production function uses spatial statistics, i.e., an estimation that accounts for the spatial layout of the data, to address spillovers and omitted variable bias. The study found spatial effects in all sections of the Ohio proficiency test and in all 15 spatial regressions. Therefore, education production functions that do not account for the spatial configuration of the data are vulnerable to biased, inefficient, and inconsistent
parameter estimates (Anselin 1988). The specific spatial models used are designed to be insensitive to outliers, and for this reason are expected to have lower adjusted R² than nonspatial models. Nevertheless, the spatial education production functions have higher adjusted R² values than their nonspatial counterparts, suggesting that spatial statistics add to explanatory power. Most of the nonspatial regressions show a positive relationship between public school outcomes and the number of public-school districts in the county. Once spatial statistics are used to address spillovers and omitted variables, only 13% of the regressions show a relationship. The magnitude of the effect is trivial: the elasticity of a test pass rate with respect to the number of public-school districts is 0.026.¹ Overall, this study provides weak evidence that public schools respond a little to competition from other public schools. The results are more similar to those of Rothstein (2000), who finds no effect, than to those of Hoxby (2000), who finds large competitive effects. A comparison of spatial and nonspatial estimation methods fails to support the assertion that traditional regressions bias the parameter estimate of private-school competition downward (Hoxby 1998). Neither the traditional nor the spatial education production functions indicate a consistent public-school response to competition from private schools. The response is more prevalent in the nonspatial regressions (40%) than the spatial regressions (20%), suggesting that once spatial effects are considered, the correlation between private-school attendance and public-school outcomes is weakened. In fact, when the result is significant, it is almost always negative. The implications for allowing vouchers at private schools are discussed.
¹ The number quoted is from the %PASS MATH results, which is the set of results with the highest proportion of statistically significant results. The median of the three significant parameter estimates is used (0.16), and the values of #DISTRICTS and %PASS MATH are evaluated at the sample means. All elasticities throughout the paper are evaluated at sample means.
2 Theoretical Model of School Competition
The main contribution of this chapter is its novel application of an empirical technique, and the results it yields about school competition. However, some researchers find it useful to couch empirical work in a theoretical framework that motivates the statistical test. A theoretical model of the production of education with spillovers is now presented. It is based on the work of Murdoch et al. (1993), but deviates in several respects.² The model abstracts from many of the complex issues involved in education; its purpose is to motivate the use of spatial statistics in the empirical section by showing that there may be spillovers between school districts. Each school district i undertakes an activity $\alpha$, the provision of schooling, which is measured by proficiency test pass rates. Production takes the following form:
$$\alpha_i = \alpha_i(d_i, Y_i, b_i) \tag{1}$$
where d is a vector of student and parent demographic characteristics, Y is a vector of community demographic characteristics, and b is school board administrative efficiency and effectiveness. In turn,
$$b_i = b_i(k_i, e_i(k_i), s_i) \tag{2}$$
where k is a vector of variables related to competition in the schooling market, e is enrollment, and s is a vector of inputs chosen by the school board such as teacher quality and the pupil/teacher ratio. Vector s is chosen in a framework outside the current model, but it is flexible enough to incorporate either rent-seeking or output-maximizing behavior by the administration. Education is assumed to exhibit positive externalities (Wyckoff 1984); therefore, although the median voter chooses $\alpha$, the total amount of schooling consumption g is
$$g_i = \alpha_i + \omega_i \tag{3}$$
where $\omega$ represents spill-ins from public-schooling provision in neighboring communities. The median voter’s utility is a function of the total amount of schooling consumption, regardless of its source. Therefore,
$$U_i = U_i(n_i, g_i) \tag{4}$$
Equation 4 portrays the median voter’s utility as a function of consumption of a numeraire good n as well as public schooling. The utility function is assumed to be strictly quasiconcave, twice continuously differentiable, and a monotonically increasing function of its arguments.
² Among the main differences are (1) converting from expenditures to proficiency tests as an outcome measure, (2) including variables for student and parent demographic characteristics, community demographic characteristics, variables related to competition, school characteristics, enrollment, and school board administrative efficiency and effectiveness, and (3) deriving first-order conditions from the model.
School district i not only receives spill-ins from neighboring communities, it generates them as well. When the median voter chooses $\alpha$, part of that provision is privately consumed. Another part is the pure public portion of schooling consumed by the community in question as well as by its neighbors. Formally,
$$\alpha_i = t_i + p_i \tag{5}$$
where t is the community’s own provision of the public aspect of schooling, and p is the consumption of the purely private portion of schooling. The proportion of a community’s provision of public schooling that stays in the community and the proportion that spills over into neighboring school districts are described by the following joint product technology:
$$p_i = \theta_i \alpha_i \tag{6}$$
$$t_i = \phi_i \alpha_i \tag{7}$$
where $\theta$ is the proportion of own provision of schooling that is privately consumed, and $\phi$ is the fraction that is a pure public good. Equation 7 holds for all school districts, so
$$\omega_i = \sum_{r \neq i} \phi_r \alpha_r \tag{8}$$
where $\phi_r$ is the fraction of activity in school district r that spills into jurisdiction i as a public good. Because $g_i = t_i + p_i + \omega_i$, Eq. 4 may be rewritten as
$$U_i = U_i(n_i, t_i + p_i + \omega_i) \tag{9}$$
The median voter maximizes Eq. 9 subject to the following budget constraint:
$$y_i = n_i + \tau_i \alpha_i \tag{10}$$
In Eq. 10, y is income and $\tau$ is the per-unit cost of education faced by the median voter. The median voter chooses $\alpha$ taking other districts’ schooling decisions as given. Constrained maximization proceeds by setting up the Lagrangian
$$L = U_i\left(n_i, \; \phi_i \alpha_i + \theta_i \alpha_i + \sum_{r \neq i} \phi_r \alpha_r\right) + \lambda(y_i - n_i - \tau_i \alpha_i) \tag{11}$$
where $\lambda$ is the Lagrange multiplier. Partial differentiation yields the following first-order conditions, along with the budget constraint:
$$\frac{\partial U_i}{\partial n_i} - \lambda = 0 \tag{12}$$
$$\frac{\partial U_i}{\partial g_i}(\theta_i + \phi_i) - \tau_i \lambda = 0 \tag{13}$$
so that the marginal rate of substitution between own provision of schooling and the numeraire good is equated to the tax price:
$$\frac{\partial U_i / \partial \alpha_i}{\partial U_i / \partial n_i} = \tau_i \tag{14}$$
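The step from the first-order conditions to Eq. 14 can be spelled out as follows. This is a sketch of the algebra under the assumption that $g_i$ denotes the second argument of $U_i$ (total schooling consumption) and that the spill-in $\omega_i$ is taken as given by the median voter, so that $g_i = (\theta_i + \phi_i)\alpha_i + \omega_i$:

```latex
\begin{align*}
\frac{\partial L}{\partial n_i} &= \frac{\partial U_i}{\partial n_i} - \lambda = 0
  \;\Longrightarrow\; \lambda = \frac{\partial U_i}{\partial n_i}, \\
\frac{\partial U_i}{\partial \alpha_i}
  &= \frac{\partial U_i}{\partial g_i}\,\frac{\partial g_i}{\partial \alpha_i}
  = \frac{\partial U_i}{\partial g_i}\,(\theta_i + \phi_i), \\
\frac{\partial L}{\partial \alpha_i} &= \frac{\partial U_i}{\partial \alpha_i} - \tau_i \lambda = 0
  \;\Longrightarrow\; \frac{\partial U_i}{\partial \alpha_i} = \tau_i\,\frac{\partial U_i}{\partial n_i}, \\
\frac{\partial U_i / \partial \alpha_i}{\partial U_i / \partial n_i} &= \tau_i .
\end{align*}
```

The spill-in term affects the level of utility but not this marginal condition directly; it enters only through the point at which the partial derivatives of $U_i$ are evaluated.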
The theoretical model has shown how spillovers may be involved in the production of education, and how the spillovers enter communities’ choice of schooling levels. The theoretical model contains variables representing student and parent characteristics, community characteristics, and school-specific inputs. The following section discusses the inputs to education in greater detail in order to justify the choice of variables in the empirical section.
3 Literature Review and Choice of Education Production Function Variables
Education is produced using the following inputs: students and parents, the community, and school-specific factors. Each input group is discussed in turn. Every attempt is made to include the typical set of education production function variables as well as variables related to competition. Student and parent characteristics are typically strongly related to public-school outcomes. One such characteristic is the presence of a two-parent household. A single-parent household may not devote as much attention to its children; therefore, %BOTH PARENTS is expected to be positively related to student outcomes. Another theoretically important household characteristic is income. PARENT INCOME is expected to be positively related to student achievement. On the other hand, low parent education levels may imply that parents are less able to help their children with their homework, or that they do not value education highly. Therefore, %NO DIPLOMA and %HS DIPLOMA ONLY are likely to be negatively related to student outcomes compared with higher levels of parent education. The proportion of nonwhite students, %MINORITY, is also included; this has been found to be negatively related to school outcomes in many education production functions. The characteristics of the community include demographic factors and variables related to competition. The proportion of school-district residents who are living in poverty, %POVERTY, is expected to be negatively related to school outcomes. Competitive pressures may also influence the supply of public-school quality. Competition may come from private schools as well as from other public schools. Each of these possibilities is discussed in turn. Theoretical models have found positive (Falkinger 1994), negative (Epple and Romano 1998), and ambiguous (Ireland 1990) effects of increased
private-school enrollment on public-school performance. Empirical studies have shown more consistent results. Couch et al. (1993) found that the percentage of students in a public-school district who attended private schools had a positive relationship with standardized algebra test scores in North Carolina’s public schools; public schools seem to respond to competition from private schools. Borland and Howsen (1996) extended Couch et al.’s analysis to include the effect of both public- and private-school competition on public-school performance. This competition from both sources is captured in a single Herfindahl index, which is positively related to public-school performance. Hoxby (1998) also found that competition from private schools was positively related to higher public-school achievement. Dee (1998) showed that competition from private schools had a positive and statistically significant impact on the high-school graduation rates of neighboring public schools. In contrast, Zanzig (1997) found a negative relationship between public-school math test scores and the percentage of students in the county who attend private schools. %PRIVATE is included in the current study in order to capture the relationship between private-school competition and public-school performance. There may also be competition between public schools. Hoxby (1996, 1998) stated that such competition should exist because efficient and high-quality education providers are rewarded with higher budgets. An increase in quality causes an increase in house prices, which increases the tax base, tax collections, and the size of the school budget. In addition, Hoxby suggests that schools are more responsive to parents’ desires, as opposed to school staff desires, when there are many public-school districts in the area. Hoxby further asserts that when parents have more choice among districts, they are more involved in their children’s schooling.
Therefore, one would expect a positive relationship between the number of school districts and public-school outcomes. Indeed, Hoxby (1998, 2000) finds such a relationship. Using a Herfindahl index to measure competition, she found that an increase in competition is associated with a small but statistically significant increase in achievement. In contrast, when Rothstein (2000) uses Hoxby’s (2000) data with different instruments, he finds no statistically significant results. Rothstein further suggests that Hoxby’s failure to control for private-school attendance led to her findings about public-school competition. Zanzig (1997) generally found that the number of public-school districts in a county had a positive effect on public-school performance until a threshold of three or four districts is reached. Beyond this threshold, additional public schools depress performance. Based on the literature, the number of public school districts in each of Ohio’s 88 counties, #DISTRICTS, is included to capture public-school competition.
D.M. Brasington
Finally, the characteristics of the schools themselves may influence student achievement. There are dissenters (Ornstein 1993; Fowler and Walberg 1991; Jewell 1989), but considerable evidence suggests that larger school districts are consistently negatively related to student outcomes, although the magnitude is small (Haller 1992; Stern 1989; Friedkin and Necochea 1988; Brasington 1997). Therefore, COHORT SIZE is included in the education production function; a negative relationship with student performance is anticipated. Other school-specific characteristics include TEACHER SALARY, TEACHER EXPERIENCE, and TEACHER EDUCATION levels, as well as the PUPIL/TEACHER RATIO. Although the debate continues as to the significance of these factors, no education production function is complete without them. Data on per-pupil expenditure are also available, but this variable is highly correlated with TEACHER SALARY (0.81) and PUPIL/TEACHER RATIO (−0.89), and seems to cause multicollinearity problems. The student outcomes are now described. Ohio’s high-school students must pass a ninth-grade proficiency test to receive a full high-school diploma. Students who do not pass this test by the end of twelfth grade, but pass their coursework, receive a certificate of attendance instead. All students must take the test, so sample selection bias is not an issue (Hanushek and Taylor 1990).3 The 1993 proficiency test contains four sections: writing, reading, math, and citizenship. The measures of school outcome employed are the proportion of ninth-graders in each district who passed each portion of the ninth-grade proficiency test in 1993. Of the 611 Ohio school districts, 605 reported their 1993 proficiency test results. The recent education production literature includes studies that use levels of outcomes (Hoxby 1996; Sander 1996; Kennedy and Siegfried 1997; Zanzig 1997; Dee 1998; Couch et al. 
1993; Borland and Howsen 1996) as well as the value-added approach (Meyer 1996; Hanushek 1992; Gomes-Neto et al. 1997; Figlio 1999). Two ways of implementing value-added were tried. The value-added approach of Meyer (1996) yielded no additional insights; the approach of Hayes and Taylor (1996) yielded an adjusted $R^2$ of 0.01. This study therefore restricts its attention to variations in the level of outcomes, rather than value-added. The results of this study complement those of Brasington and Haurin (2006), who found that levels of proficiency are valued by the housing market, while measures of the value-added of a school are not. Definitions of the variables, as well as sources and means, are shown in Table 1.

3 Only students assessed to have a learning disability are exempt. Only if a student’s team leader determines that the student has a learning disability each year and exempts that student every year of his or her high school career will that student not be required to take the proficiency test.
Table 1. Variable definitions, sources, and means

| Variable | Definition and source | Mean (SD) |
| --- | --- | --- |
| %PASS ALL | Proportion of ninth-grade students passing all sections of the ninth-grade proficiency test (math, citizenship, reading, and writing) in each school district in 1993 (1) | 0.51 (0.14) |
| %PASS MATH | Proportion of ninth-grade students passing the math section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.61 (0.14) |
| %PASS CITIZENSHIP | Proportion of ninth-grade students passing the citizenship section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.72 (0.12) |
| %PASS READING | Proportion of ninth-grade students passing the reading section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.87 (0.07) |
| %PASS WRITING | Proportion of ninth-grade students passing the writing section of the ninth-grade proficiency test in each school district in 1993 (1) | 0.87 (0.08) |
| %BOTH PARENTS | Proportion of students in each school district living with two parents in 1990 (2) | 0.81 (0.09) |
| PARENT INCOME | Average parental income of students in each school district, in hundreds of thousands of dollars in 1990 (2) | 0.33 (0.11) |
| %PRIVATE | Proportion of students in grades nine through twelve living in each public school district who attended nonpublic schools in 1990 (2) | 0.06 (0.07) |
| %MINORITY | Proportion of students in each school district who were nonwhite in 1993 (1) | 0.06 (0.12) |
| COHORT SIZE | Average number of students in each grade in each high school in 1993 in thousands; school district fall average daily membership divided by the number of grade levels the school district serves, divided by the number of high schools the district has, divided by 1000 (1,3) | 0.19 (0.14) |
| TEACHER SALARY | Average teacher salary in each school district in hundreds of thousands of dollars in 1993 (1) | 0.33 (0.05) |
| TEACHER EXPERIENCE | Average teacher experience in each school district in hundreds of years in 1993 (1) | 0.15 (0.02) |
| PUPIL/TEACHER RATIO | Total average daily membership in each school district divided by the total number of classroom teachers, in hundreds of students per teacher in 1993 (1) | 0.19 (0.02) |
| TEACHER EDUCATION | Number of teachers with a Masters degree divided by the total number of regular teachers in each school district in 1993 (1) | 0.33 (0.05) |
| %POVERTY | Proportion of persons in each school district living under the official poverty income level in 1990 (2) | 0.10 (0.07) |
| #DISTRICTS | Number of school districts in each county in 1993, in hundreds of districts (4) | 0.10 (0.07) |
| %NO DIPLOMA | Proportion of parents of school-age children in each school district in 1990 who do not have a high school diploma (2) | 0.16 (0.08) |
| %HS DIPLOMA ONLY | Proportion of parents of school-age children in each school district in 1990 who have a high school diploma but have not attended college (2) | 0.45 (0.11) |
| LAG | Spatial lag Wx of the variable in question for the spatial Durbin model of Eq. 18 | — |

Means are shown with the standard deviation in parentheses. The number of observations was 605. Sources: (1) Ohio Department of Education, Information Management Services (1999); (2) School District Data Book (1994); (3) Ohio Department of Education (1993); (4) Ohio Department of Education (1985)
4 Traditional, Nonspatial Estimation

Theory provides no guide as to the functional form the education production function should take (Figlio 1999). Furthermore, a Davidson and MacKinnon (1981) test for linearity versus log-linearity yields inconclusive results; the linear functional form is adopted by default. It is only in the last few years that the literature on the economics of education has addressed endogeneity in education production functions (Akerhielm 1995; Hoxby 1998). From the school district point of view, TEACHER SALARY and PUPIL/TEACHER RATIO are choice variables and should be treated endogenously. TEACHER EXPERIENCE and TEACHER EDUCATION are related to how long a teacher has been in the school district. These
variables are probably determined simultaneously with student achievement, and are therefore treated endogenously. Hoxby (1998) suggests that #DISTRICTS is endogenous to school quality, and the number of streams in the area should be used as an instrument. Hoxby also suggests that %PRIVATE is endogenous to public-school quality. However, there is some evidence that school-district consolidation is not a function of public-school quality (Brasington 1999), and therefore the number of school districts in an area is not a function of school quality. In addition, a Hausman test reveals that both #DISTRICTS and %PRIVATE may be treated exogenously in the education production function.4 Rothstein (2000) also argues in favor of treating #DISTRICTS as an exogenous variable.
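The Hausman test mentioned above, in which the production function is run with and without the predicted values of the suspect regressors, can be sketched as an augmented regression. The example below is a minimal illustration on simulated data, not the chapter's actual computation: the function names `ols_rss` and `hausman_f`, the instruments `Z`, and all coefficients are hypothetical, and whether this matches every step of the procedure in Maddala (1992) is an assumption.

```python
import numpy as np

def ols_rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def hausman_f(y, X_exog, X_suspect, Z):
    """F statistic for adding first-stage fitted values of the suspect regressors."""
    first_stage = np.hstack([X_exog, Z])
    # First stage: project the suspect regressors on exogenous controls and instruments.
    coef, *_ = np.linalg.lstsq(first_stage, X_suspect, rcond=None)
    fitted = first_stage @ coef
    restricted = np.hstack([X_exog, X_suspect])      # regression without fitted values
    unrestricted = np.hstack([restricted, fitted])   # regression with fitted values
    q = X_suspect.shape[1]
    dof = len(y) - unrestricted.shape[1]
    rss_r, rss_u = ols_rss(y, restricted), ols_rss(y, unrestricted)
    return ((rss_r - rss_u) / q) / (rss_u / dof)

# Simulated data in which the two suspect regressors really are exogenous.
rng = np.random.default_rng(42)
n = 605
X_exog = np.column_stack([np.ones(n), rng.normal(size=n)])
Z = rng.normal(size=(n, 3))                          # hypothetical instruments
X_suspect = Z @ rng.normal(size=(3, 2)) + rng.normal(size=(n, 2))
y = (X_exog @ np.array([0.5, 0.3])
     + X_suspect @ np.array([0.2, -0.2])
     + rng.normal(size=n))

f_stat = hausman_f(y, X_exog, X_suspect, Z)
print(f"F = {f_stat:.2f} (compare with the 0.10 critical value of 2.30)")
```

An F statistic below the critical value means the fitted values add nothing to the regression, which is the sense in which the suspect regressors "may be treated exogenously".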
5 The Spatial Education Production Function

The theoretical model of school-district competition includes educational spillovers. In fact, a good deal of research has investigated spillovers in public good provision.5 A natural way to account for externalities is to incorporate spatial autocorrelation in the statistical estimation: rarely do economic theory and statistical technique complement each other so naturally.6 When each school district affects the performance of neighboring school districts, spatial autocorrelation may exist (LeSage 1997a). The ordinary least-squares method does not account for the interplay between spatially close observations, which may lead to biased, inefficient, and inconsistent parameter estimates (Anselin 1988). Neighboring school districts affect each other more than school districts far away from each other (Wyckoff 1984); consequently, a spatial weight matrix must be constructed to summarize the
4 The form of the Hausman test is that detailed by Maddala (1992) and Ramanathan (1998). Specifically, education production functions are run with and without the predicted values of #DISTRICTS and %PRIVATE. The calculated F statistic is 1.89 compared to a critical value of 2.30 at the 0.10 level of significance, suggesting that #DISTRICTS and %PRIVATE may be treated exogenously. In unreported regressions, #DISTRICTS and %PRIVATE are treated endogenously, yielding somewhat larger parameter estimates for the two variables but no change in statistical significance.
5 See Bradford and Oates (1974), Voß (1991), and Reiter and Weichenrieder (1997) for surveys. See also Murdoch et al. (1993), Wyckoff (1984), and Edwards (1986).
6 Murdoch et al. (1993) use a spatial autoregressive model to capture spillovers in recreation expenditures in Los Angeles. Brasington and Hite (2005) use a spatial Durbin model to capture spillovers in environmental quality, and D.L. Millimet and V. Rangaprasad (2005) use spatial statistics to model spillovers in educational input hiring.
spatial configuration of the observations. The traditional education production function takes the form

$$t_{ij} = \beta x_i + \varepsilon_i \tag{15}$$

where $t$ represents the test passes, $i$ is the school district, $j$ is the test section, $x$ is the set of student-, parent-, community-, and school-specific factors related to the test passes, and $\varepsilon$ is the error term. However, the traditional education production function ignores spillovers and other forms of spatial dependence. We first employ the Bayesian spatial error model of LeSage (1997b) to address heteroskedasticity, outliers, and omitted variables. This is the same as the traditional model in Eq. 15, but with a more complex error term.

$$
\begin{aligned}
t_{ij} &= \beta x_i + \varepsilon_i \\
\varepsilon &= \lambda W \varepsilon + \mu, \qquad \mu \sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
1/\sigma^2 &\sim \Gamma(h, d)
\end{aligned}
\tag{16}
$$

In Eq. 16, $\varepsilon$ is a spatial autoregressive error term that says the error term for each observation is related to the error terms of the neighboring observations. Unlike a time series, where an autoregressive lag represents nearby time periods, $\lambda W \varepsilon$ is an autoregressive lag based on nearby observations in space. The $W$ in the autoregressive term is a spatial weight matrix (LeSage 1997a) that summarizes the spatial layout of the data on a map. It identifies the nearest school districts for each observation. The spatial weight matrix $W$ in this study is constructed so that the five nearest neighbors are allowed to influence each school district. With 605 school districts in the sample, $W$ is a 605 × 605 matrix. Row 1 represents school district 1. For each column y in row 1, we must ask, “Is school district y one of the five nearest to school district 1?” The column gets a 0 if the answer is no and a 1 if the answer is yes.7 Each row then has five 1’s. A common procedure in the spatial statistics literature is to make the sum of each row equal to unity, so with five neighbors, each neighbor is assigned a weight of 1/5 = 0.2.

7 The spatial statistics literature commonly assumes that an observation cannot be its own neighbor, so that the spatial weights matrix has a zero diagonal. Also, just as there is no consensus on functional form in traditional regressions, there is no consensus on the number of neighbors to use in spatial statistics, and the choice of the number of neighbors often affects estimation results. Five neighbors are chosen in the current study because, looking at a map of Ohio’s school districts, it seems that on average each school district shares a border with five other school districts.
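The construction of the spatial weight matrix described above can be sketched in a few lines. The example below is a minimal illustration, not the author's code. It assumes that each district is represented by a centroid coordinate and that nearness is measured by Euclidean distance, neither of which is stated in the text. The function name `knn_weight_matrix` and the random `centroids` are hypothetical.

```python
import numpy as np

def knn_weight_matrix(coords, k=5):
    """Row-standardized k-nearest-neighbor spatial weight matrix."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances between all observations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # An observation cannot be its own neighbor: push the diagonal to infinity.
    np.fill_diagonal(dist, np.inf)
    # Column indices of the k smallest distances in each row.
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, nearest.ravel()] = 1.0
    # Row-standardize so that each row sums to one (each neighbor gets 1/k).
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
centroids = rng.uniform(size=(605, 2))   # hypothetical district centroids
W = knn_weight_matrix(centroids, k=5)
print(W.shape, W[0].sum(), np.count_nonzero(W[0]))
```

Each row of `W` holds five entries equal to 0.2 and zeros elsewhere, including the diagonal, so multiplying `W` by any district-level vector returns the average of that variable over each district's five nearest neighbors.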
The $\lambda$ in Eq. 16 is the spatial autoregressive parameter to be estimated. It gives the degree to which the error terms of our observation and its neighbors are related. Finally, $\mu$ is a white-noise error term. The first two terms in Eq. 16 describe a spatial error model.8 The remaining terms distinguish the Bayesian model from the non-Bayesian spatial error model. The additional characterization of the error term helps correct for heteroskedasticity and outliers, and is discussed in greater detail in J.P. LeSage (1999), but a brief discussion is provided here. A major difference between Bayesian and non-Bayesian estimation is the use of prior information. We allow diffuse priors for $\lambda$, $\beta$, and $\sigma^2$. Ordinarily, the term $r$ in Eq. 16 would be gamma distributed with two parameters. Instead, following J.P. LeSage (1999, p. 121), we use an informative prior on $v_i$ of $r = 4$. This particular prior yields relatively constant estimates of $v_i$ in the presence of homoskedasticity, while at the same time accommodating nonconstant error variances in the presence of heteroskedasticity and outliers.9 Computational tricks by Barry and Pace (1999) and Pace and Barry (1998) are used to allow the sample to run in a reasonable amount of time.10

The spatial error model captures the influence of omitted variables that vary across space. Any omitted influence that varies across space will be subsumed in the error term, and the spatial autoregressive term will capture these influences. The income of parents and the race of the students are included, but the education production functions do not measure the income and racial characteristics of the neighborhood. These community characteristics may affect student performance above and beyond the influence from the parents, perhaps through volunteering efforts and tax-levy pass rates. Neighborhood crime rates are also omitted, and the accompanying feeling of security could affect student performance.
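To see why an autoregressive error of the kind in Eq. 16 can absorb omitted influences that vary smoothly over space, it may help to write the error process in reduced form. The derivation below is a sketch under the assumption that $I - \lambda W$ is invertible, which holds for a row-standardized $W$ when $|\lambda| < 1$.

```latex
\begin{aligned}
\varepsilon &= \lambda W \varepsilon + \mu
\;\Longrightarrow\;
\varepsilon = (I - \lambda W)^{-1}\mu
= \left(I + \lambda W + \lambda^{2} W^{2} + \cdots\right)\mu, \\
\operatorname{Var}(\varepsilon) &= \sigma^{2}\,(I - \lambda W)^{-1}\, V \,\bigl(I - \lambda W^{\top}\bigr)^{-1}.
\end{aligned}
```

Each district's error is therefore a mixture of its own shock, its neighbors' shocks, its neighbors' neighbors' shocks, and so on, with weights that shrink geometrically in $\lambda$. The off-diagonal entries of the covariance matrix are what ordinary least squares ignores.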
These omitted influences will be subsumed in the error term, and normally might adversely influence parameter estimates. However, the second term in Eq. 16 recognizes the correlation between the error terms of neighboring school districts. If people live in areas with other similar people, then income, race, and crime rates will be similar across space, and change in character gradually from school district to school district. If the error term for school district A is affected by income, race, and crime rates, the error term for nearby school district B is affected to a similar degree. The correlation between the error terms is
8 If $V$ is a matrix of ones, the first two terms of Eq. 16 fully characterize a non-Bayesian spatial error model.
9 An alternative is to set the two parameters of the gamma distribution for $r$ to the informative priors of 8 and 2. Nearly identical estimates are achieved either way.
10 For example, the Bayesian spatial error model with 300 draws and 30 burn-in draws, using %PASS MATH as the dependent variable, took 629 seconds.
captured by the spatial model. In this manner, the spatial error model addresses the influence of omitted variables in the education production function. A more complete intuitive explanation of how spatial statistics addresses omitted variables is found in Brasington and Hite (2005). A mathematical proof is available in Griffith (1988).

The Bayesian spatial error model in Eq. 16 depends on having a large number of draws to converge to the true joint posterior distribution of the parameters. With insufficient draws, parameter estimates cannot be trusted. Although convergence diagnostics are available, the true test of convergence is when the estimates do not change with added draws. A model with 300 draws (with 30 additional burn-in draws) achieves similar results to a model with 1000 draws (with 100 additional burn-in draws), suggesting that 300 draws is sufficient.

There are other ways in which spatial statistics can capture the influence of omitted variables. One of these is a Bayesian spatial autoregressive model (LeSage 1997b):

$$
\begin{aligned}
t_{ij} &= \rho W t_{ij} + \beta x_i + \varepsilon_i \\
\varepsilon &\sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
\sigma^2 &\sim \Gamma(h, d) \\
\rho &\sim \mathrm{uniform}(-1, 1)
\end{aligned}
\tag{17}
$$

In the above equation, $\rho W t_{ij}$ is the spatial autoregressive term. This is the term that captures the essence of public good spillovers, allowing the educational production of each school district to depend on the educational production of its neighboring school districts. The theoretical section modeled spillovers as part of a joint product technology, but there may be an additional justification for the $\rho W t_{ij}$ term. School district administrators may keep their eyes on neighboring school districts and try to compete with them academically, so the outcomes of neighboring school districts may affect the administrative effort of a school district, which in turn may affect school district outcomes. Administrators care less about the performance of far-away school districts. Compared with the Bayesian spatial error model of Eq. 16, the Bayesian spatial autoregressive model of Eq. 17 takes the influence of unobserved variables out of the error term $\varepsilon$ and controls for them with the $\rho W t_{ij}$ term. While the spatial error model guards against inefficient parameter estimates, the spatial autoregressive model guards against biased parameter estimates. If the $\rho W t_{ij}$ term is significant, and if it were omitted as in a traditional education production function, the matrix of parameter estimates $\beta$ would suffer from bias, and hypothesis testing would be invalid.

A final way in which spatial dependence is captured in the education production functions is through a spatial Durbin model (LeSage 1997b):

$$
\begin{aligned}
(I - \rho W)\, t_{ij} &= \alpha + \beta_1 x_i + W \beta_2 x_i + \varepsilon_i \\
\varepsilon &\sim N(0, \sigma^2 V) \\
V &= \mathrm{diag}(v_1, v_2, \ldots, v_n) \\
r / v_i &\sim \chi^2(r) / r \\
\sigma^2 &\sim \Gamma(h, d) \\
\rho &\sim \mathrm{uniform}(-1, 1)
\end{aligned}
\tag{18}
$$

As in the spatial autoregressive model of Eq. 17, the spatial Durbin model of Eq. 18 also contains a spatial autoregressive term $\rho W t_{ij}$ and the education production function controls $x_i$. Unlike the spatial autoregressive model, the spatial Durbin model contains the education production function controls for our neighbors $W x_i$ as well. While the $\beta$ terms are different for $x_i$ and $W x_i$, the same $W$ is used in two places in Eq. 18. Some research (Anselin 1988) has claimed that the $\rho$ and $\beta_2$ are not identified, but H.H. Kelejian and I.R. Prucha (2004) have proved that they are.11

The $W x_i$ term captures spillovers in a different way than the $\rho W t_{ij}$ term. School quality may be more than a function of its own inputs and neighboring districts’ school quality. It may also be a function of demographic influences in neighboring districts. Children who live on the border of two school districts may play with each other, so that peer-group effects may spill across school district boundaries. Churches, Rotary Clubs, and country clubs may draw members from different (but probably nearby) school districts; parents at these social organizations may discuss schooling with each other and form attitudes about schooling quality, dress codes, and homework levels which may influence the schooling decisions they make. $W x_i$ allows the characteristics of neighboring school districts to affect outcomes in each school district. These spillover effects may be stronger across schools than across
11 H.H. Kelejian and I.R. Prucha (2004) prove this for a model with both spatial autoregressive lag terms $\rho W t$ and $\lambda W \varepsilon$, but the results should follow for the spatial Durbin model as well.
school districts, but if they are present across school districts then they are likely to be present across schools within a district.12
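The way a spatial autoregressive term spreads a change in one district to its neighbors can be seen from the reduced form of Eq. 17, t = (I − ρW)⁻¹(xβ + ε). The sketch below is illustrative only: the six-district ring and the covariates are made up, and only the value 0.26 for ρ is borrowed from the spatial autoregressive estimate reported in Section 6. It also builds the regressor matrix of the spatial Durbin model in Eq. 18, which adds the neighbors' averages Wx to each district's own x.

```python
import numpy as np

n, rho = 6, 0.26   # rho: spatial autoregressive estimate reported for %PASS ALL

# Hypothetical row-standardized W: each district's neighbors are the two
# adjacent districts on a ring (a stand-in for the five-nearest-neighbor W).
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

# Spatial multiplier from the reduced form t = (I - rho W)^{-1} (x beta + e).
multiplier = np.linalg.inv(np.eye(n) - rho * W)

# A unit increase in x*beta in district 0 only: the effect reaches every district.
shock = np.zeros(n)
shock[0] = 1.0
response = multiplier @ shock

# Regressors of the spatial Durbin model: constant, own x, and neighbors' average W x.
rng = np.random.default_rng(1)
x = rng.normal(size=(n, 2))   # made-up covariates
durbin_design = np.hstack([np.ones((n, 1)), x, W @ x])

print(np.round(response, 3))
print(durbin_design.shape)
```

The response is largest in the shocked district, smaller in its immediate neighbors, and smaller still in districts farther around the ring. Because `W` is row-standardized, every row of `multiplier` sums to 1/(1 − ρ), so a uniform change in all districts is amplified by about 35% when ρ = 0.26.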
6 Estimation Results

The first set of regression results uses %PASS ALL, i.e., the percentage of students passing all four sections of the proficiency test, as the dependent variable. The first set of results in Table 2 is the nonspatial instrumental variables model. The results of the instrumental variables model are generally consistent with expectations. If all else is constant, the proficiency test pass rate is higher in school districts where children come from two-parent families with high incomes, where there is a low proportion of minority students, and in communities with low poverty rates and higher-educated citizens. The sign of COHORT SIZE is negative, but fails to achieve statistical significance at the 0.10 level. Most of the school-specific inputs show a significant relationship with student achievement, but two of these are of theoretically inconsistent sign. While school districts with better-paid teachers have higher pass rates, the results suggest that school districts with higher teacher education levels have lower pass rates, and that a higher student/teacher ratio raises pass rates. A correlation matrix suggests that these results are not an artifact of multicollinearity. Consistent with competitive effects, #DISTRICTS is positively related to pass rates. However, %PRIVATE is negatively related to pass rates in public-school districts, a result which is in contrast with most of the literature, but is consistent with Zanzig (1997).

The preceding analysis does not address spillovers or omitted variables. The results of the first spatial education production functions are reported in the remaining columns of Table 2.13 Testing for spatial effects is of supreme importance. The spatial error lag has a parameter estimate of 0.36 and is statistically significant. In fact, the spatial parameter estimates are all positive and statistically significant throughout the paper, suggesting that the use of spatial statistics is warranted.
The 0.36 estimate means that the error terms have, on average, a 0.36
12 School outcomes are not available in Ohio at the school level, but only at the school district level. For high schools, this may not matter much, as most school districts have one high school. For example, even among the urban school districts, only 24 of the 140 have more than one high school.
13 D.L. Millimet and V. Rangaprasad (2005) also have a claim to being the first, since neither of our papers has been published. The first draft of this paper was completed August 5, 1999, which may precede theirs.
Table 2. Regression results using %PASS ALL

| Variable | Instrumental variables model | Spatial error model | Spatial autoregressive model | Spatial Durbin model |
| --- | --- | --- | --- | --- |
| %BOTH PARENTS | 0.29** (4.09) | 0.27** (2.68) | 0.34** (3.56) | 0.23* (2.55) |
| PARENT INCOME | 0.40** (4.53) | 0.15 (1.61) | 0.17* (2.19) | 0.15* (2.05) |
| %PRIVATE | −0.19* (2.47) | −0.15 (1.57) | −0.14* (1.65) | −0.12 (1.43) |
| %MINORITY | −0.15** (2.91) | −0.19** (3.26) | −0.19** (3.65) | −0.21** (3.71) |
| COHORT SIZE | −0.053 (1.18) | −0.070 (1.50) | −0.080* (1.92) | −0.077* (1.86) |
| TEACHER SALARY | 0.58* (2.42) | 0.32 (1.19) | 0.23 (0.99) | 0.20 (0.83) |
| TEACHER EXPERIENCE | −0.30 (0.94) | −0.05 (0.13) | −0.11 (0.35) | −0.16 (0.48) |
| PUPIL/TEACHER RATIO | 0.74* (2.13) | 0.56 (1.41) | 0.48 (1.35) | 0.56 (1.48) |
| TEACHER EDUCATION | −0.49* (2.00) | −0.20 (0.73) | −0.16 (0.71) | −0.13 (0.55) |
| %POVERTY | −0.24* (2.29) | −0.14 (1.08) | −0.06 (0.53) | −0.09 (0.74) |
| #DISTRICTS | 0.20* (2.46) | 0.13 (1.10) | 0.14* (2.00) | 0.26 (1.55) |
| %NO DIPLOMA | −0.49** (5.58) | −0.56** (5.44) | −0.51** (6.25) | −0.61** (6.59) |
| %HS DIPLOMA ONLY | −0.25** (3.95) | −0.34** (4.16) | −0.28** (4.47) | −0.38** (5.90) |
| LAG %BOTH PARENTS | — | — | — | 0.25 (1.48) |
| LAG PARENT INCOME | — | — | — | 0.17 (0.92) |
| LAG %PRIVATE | — | — | — | 0.23* (1.71) |
| LAG %MINORITY | — | — | — | 0.06 (0.65) |
| LAG COHORT SIZE | — | — | — | 0.041 (0.38) |
| LAG TEACHER SALARY | — | — | — | 0.26 (0.47) |
| LAG TEACHER EXPERIENCE | — | — | — | −1.20* (1.82) |
| LAG PUPIL/TEACHER RATIO | — | — | — | 0.54 (0.73) |
| LAG TEACHER EDUCATION | — | — | — | −0.60 (0.94) |
| LAG %POVERTY | — | — | — | 0.08 (0.40) |
| LAG #DISTRICTS | — | — | — | 0.028 (0.15) |
| LAG %NO DIPLOMA | — | — | — | 0.29 (1.64) |
| LAG %HS DIPLOMA ONLY | — | — | — | 0.36** (2.75) |
| CONSTANT | 0.22* (1.73) | 0.37** (2.75) | 0.17 (1.43) | −0.00 (0.02) |
| Spatial parameter estimate $\rho$ or $\lambda$ | — | 0.36** (6.87) | 0.26** (5.73) | 0.28** (4.76) |
| Adjusted $R^2$ | 0.49 | 0.54 | 0.54 | 0.56 |

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is $\rho$ for both the spatial autoregressive model and the spatial Durbin model, and $\lambda$ for the spatial error model. LAG is the estimate of the spatial lag parameter for the Wx of each variable
spatial correlation with each other: the unmeasured things in our education production function are somewhat similar to the unmeasured things of our neighbors. The next model, the spatial autoregressive model, suggests that the average correlation between one school district’s pass rate and its neighbors’ pass rates is 0.26, and the spatial Durbin model suggests this correlation is 0.28.14 The spatial Durbin model also finds that three of the explanatory variables of our neighbors help explain the proficiency pass rate of our school district. It is difficult to compare these results with those of D.L. Millimet and V. Rangaprasad (2005) because (1) they do not use proficiency tests as measures of school quality, and (2) they do not use the same spatial models and so have no estimates of $\lambda$ or $\beta_2$. D.L. Millimet and V. Rangaprasad (2005) reported the estimates of $\rho$ when the measures of school quality are the pupil/teacher ratio, expenditure per pupil, capital expenditure per pupil, teacher salary, and school size, but the results are not comparable to the estimate of $\rho$ for the proficiency test pass rate variables used in this study. Their estimates also vary from specification to specification, ranging from small and negative to large and positive.

14 This is much larger than the spatial effects Murdoch et al. (1993) found for recreation expenditures. They found a very small (0.01) spatial autoregressive parameter estimate and statistically significant spillovers in only one of their two regressions.
The explanatory power is higher in the spatial models. Adjusted $R^2$ is 0.49 in the nonspatial model, while it is 0.54, 0.54, and 0.56 in the spatial models. The Bayesian component of the spatial models purposefully tries not to fit outliers, which depresses the adjusted $R^2$. If the spatial models had fit outliers like the nonspatial model, the difference in the adjusted $R^2$ might be even larger. Even if the improvement is modest, the larger adjusted $R^2$ is found in all spatial models throughout the study.

Perhaps the most significant finding in Table 2 is that the spatial models eliminate the statistical significance of all the school-specific inputs. Failure to incorporate spatial dependence seems to have attributed too much explanatory power to teacher salary, the pupil/teacher ratio, and teacher education levels. The lack of significance is consistent with Hanushek (1986) and much of the recent literature.

The other major differences concern the school competition variables. Hoxby (1998) argued that the parameter estimate of %PRIVATE is biased downward, but the spatial models that address the influence of omitted variables suggest otherwise. The nonspatial model’s parameter estimate was −0.19, but the parameter estimates of the spatial models ranged from −0.15 to −0.12, suggesting (if anything) an upward bias to least squares. While the nonspatial model showed a significant relationship for %PRIVATE, only one of the three spatial models showed a statistically significant relationship, and that barely reached the 0.10 level of significance. In any case, even the largest parameter estimate in absolute value, −0.19, yields an elasticity of −0.02, so regardless of statistical significance, the economic significance of competitive effects from private schools is negligible. The results for public-school competition are similar to those for private-school competition.
While the instrumental variables regression shows a positive, significant relationship, the spatial models generally find no relationship. Again, the only spatial model showing a significant relationship is the spatial autoregressive model, and again the magnitude of the parameter estimate, even for the largest, shows a trivial elasticity of test passes of 0.04 with respect to the number of school districts. The enrollment of each grade is negatively related to test passage in two of the three spatial models, and approaches significance in the third; it was insignificant in the nonspatial model. The elasticity is −0.03, and is almost exclusively found when %PASS ALL is used. These findings are consistent with those of Brasington (1997). Table 3 shows the results of regressions when the percentage of students passing the math section is the dependent variable. Pass rates on the math section are lower than those for other sections, so it is not surprising that the results are similar to the %PASS ALL results: if a student failed any
Table 3. Regression results using %PASS MATH

| Variable | Instrumental variables model | Spatial error model | Spatial autoregressive model | Spatial Durbin model |
| --- | --- | --- | --- | --- |
| %BOTH PARENTS | 0.38** (4.30) | 0.27** (2.72) | 0.34** (3.93) | 0.24* (2.50) |
| PARENT INCOME | 0.09 (1.12) | 0.05 (0.60) | 0.10 (1.25) | 0.05 (0.67) |
| %PRIVATE | −0.21** (2.81) | −0.16 (1.62) | −0.14* (1.70) | −0.11 (1.32) |
| %MINORITY | −0.22** (4.44) | −0.27** (4.20) | −0.26** (5.36) | −0.30** (5.15) |
| COHORT SIZE | −0.031 (0.70) | −0.046 (0.95) | −0.052 (1.17) | −0.06 (1.29) |
| TEACHER SALARY | 0.79** (3.34) | 0.58* (2.18) | 0.49* (2.04) | 0.37 (1.40) |
| TEACHER EXPERIENCE | −0.37 (1.17) | −0.04 (0.11) | −0.10 (0.30) | −0.04 (0.14) |
| PUPIL/TEACHER RATIO | 0.88* (2.56) | 0.65 (1.52) | 0.57 (1.58) | 0.49 (1.38) |
| TEACHER EDUCATION | −0.63** (2.60) | −0.38 (1.41) | −0.35 (1.42) | −0.23 (0.87) |
| %POVERTY | −0.26* (2.50) | −0.14 (1.04) | −0.08 (0.75) | −0.06 (0.48) |
| #DISTRICTS | 0.16* (2.11) | 0.15 (1.44) | 0.15* (2.11) | 0.31* (1.97) |
| %NO DIPLOMA | −0.56** (6.53) | −0.67** (6.95) | −0.59** (7.02) | −0.70** (7.93) |
| %HS DIPLOMA ONLY | −0.18** (2.87) | −0.22** (2.96) | −0.19** (2.94) | −0.29** (4.57) |
| CONSTANT | 0.33** (2.62) | 0.43** (3.08) | 0.23* (1.74) | 0.02 (0.05) |
| Spatial parameter estimate $\rho$ or $\lambda$ | — | 0.31** (5.07) | 0.23** (5.48) | 0.23** (4.04) |
| Adjusted $R^2$ | 0.49 | 0.53 | 0.53 | 0.55 |

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is $\rho$ for both the spatial autoregressive model and the spatial Durbin model, and $\lambda$ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG PARENT INCOME, LAG %NO DIPLOMA, and LAG %HS DIPLOMA ONLY are positive
section, it was most likely to be the math section. However, while parent income was related to %PASS ALL in three of four regressions, it was not related to %PASS MATH in any regression. The only role for parent income was found in the spatial Durbin model, where LAG PARENT INCOME was positively related to %PASS MATH. The parameter estimate (not shown) is 0.29, suggesting that the elasticity of our own math pass rate with respect to
our neighbors’ income levels is 0.16. If neighboring school districts’ incomes are 10% higher, our math pass rate is 1.6% higher.

The evidence for competitive effects is stronger for the math model than for any other test section. Five out of eight parameter estimates are statistically significant, but, as before, the magnitude of the effects is trivial. Also as before, the parameter estimate of %PRIVATE is weaker in the spatial models than in the nonspatial model. As before, the nonspatial model shows significant effects for three of four school inputs, and two of these have an unexpected sign. As before, incorporating spatial dependence eliminates the statistical significance of TEACHER EDUCATION and PUPIL/TEACHER RATIO. However, TEACHER SALARY remains positive and statistically significant in two of the three spatial models. The median elasticity from the spatial models is 0.29, suggesting that a 10% rise in teacher salary is associated with a 2.9% rise in math pass rates. In other words, a 10% rise in teacher salaries might raise the average district’s pass rate from 0.61 to 0.628. COHORT SIZE is not significant in any regression, spatial or not, and %POVERTY, while significant in the nonspatial model, is insignificant in all spatial models. The magnitude of the spatial parameters ranges from 0.23 to 0.31.

The results of the %PASS CITIZENSHIP regression are easily summarized (Table 4). Competitive effects are almost completely absent. As in the math section, the incomes of neighbors seem to spill over and positively affect citizenship pass rates, but no direct relationship between parental incomes and pass rates is found. No school input is statistically significant in either the spatial or the nonspatial models. The magnitude of the spatial parameters ranges from 0.17 to 0.23. The adjusted $R^2$ shows its largest leap by going from 0.44 in the nonspatial model to 0.55 in the spatial Durbin model.
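The elasticities quoted in this section appear to be evaluated at the sample means, that is, the coefficient multiplied by the mean of the explanatory variable and divided by the mean pass rate. The sketch below reproduces several of the reported figures on that assumption. The means come from Table 1 and the coefficients from Tables 2 and 3 or the text. Which coefficient the chapter paired with each elasticity is not stated, so the pairings here are guesses that happen to match the reported values, and the helper `elasticity_at_means` is hypothetical.

```python
def elasticity_at_means(coef, mean_x, mean_y):
    """Point elasticity of y with respect to x, evaluated at the sample means."""
    return coef * mean_x / mean_y

# Mean pass rates from Table 1.
pass_all, pass_math = 0.51, 0.61

e_private = elasticity_at_means(-0.19, 0.06, pass_all)      # %PRIVATE, nonspatial model
e_districts = elasticity_at_means(0.20, 0.10, pass_all)     # #DISTRICTS, nonspatial model
e_cohort = elasticity_at_means(-0.080, 0.19, pass_all)      # COHORT SIZE, spatial autoregressive model
e_lag_income = elasticity_at_means(0.29, 0.33, pass_math)   # LAG PARENT INCOME, spatial Durbin model

# A 10% rise in teacher salary combined with the reported elasticity of 0.29.
new_math_rate = pass_math * (1 + 0.29 * 0.10)

print(round(e_private, 2), round(e_districts, 2), round(e_cohort, 2), round(e_lag_income, 2))
print(round(new_math_rate, 3))
```

The first line prints −0.02, 0.04, −0.03, and 0.16, and the second prints 0.628, in line with the values given in the text.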
The results of the %PASS READING model are similar to those of the %PASS CITIZENSHIP model (Table 5). There is no evidence of competitive effects or of the importance of any school input to test pass rates. Parent income is insignificant in the nonspatial model, but becomes statistically significant in all spatial models. Conversely, %POVERTY is statistically significant in the nonspatial model but loses its significance once spatial dependence is addressed. The spatial parameters range from 0.17 to 0.26.
The model with %PASS WRITING shows almost nothing of statistical significance (Table 6). The adjusted R² ranges from 0.21 in the nonspatial model to 0.25 in the spatial Durbin model. The lack of significance and low adjusted R² may stem from the way the writing section is graded. Unlike the other test sections, the writing section is not a multiple-choice test. The tests are sent out of state and hand-graded. The pass rates are high: the average pass rate of this section is 0.87, and the standard deviation is 0.08. The lack of variation contributes to the lack of statistically significant findings. Still,
Table 4. Regression results using %PASS CITIZENSHIP

                            Instrumental      Spatial           Spatial           Spatial
                            variables         error             autoregressive    Durbin
Variable                    model             model             model             model
%BOTH PARENTS               0.28** (3.60)     0.27** (3.01)     0.28** (3.36)     0.24* (2.50)
PARENT INCOME               0.06 (0.87)       0.05 (0.70)       0.07 (0.98)       0.05 (0.67)
%PRIVATE                    0.08 (1.22)       0.03 (0.38)       0.02 (0.26)       −0.11 (1.32)
%MINORITY                   −0.16** (3.72)    −0.18** (3.41)    −0.18** (4.17)    −0.30** (5.15)
COHORT SIZE                 −0.060 (1.54)     −0.051 (1.17)     −0.057 (1.57)     −0.06 (1.29)
TEACHER SALARY              0.25 (1.21)       0.22 (0.88)       0.22 (1.02)       0.37 (1.40)
TEACHER EXPERIENCE          0.07 (0.26)       0.26 (0.73)       0.18 (0.59)       −0.04 (0.14)
PUPIL/TEACHER RATIO         0.06 (0.19)       −0.05 (0.15)      −0.02 (0.06)      0.49 (1.38)
TEACHER EDUCATION           −0.14 (0.67)      −0.13 (0.53)      −0.14 (0.65)      −0.23 (0.87)
%POVERTY                    −0.31** (3.30)    −0.27* (2.21)     −0.22* (2.57)     −0.06 (0.48)
#DISTRICTS                  0.09 (1.37)       0.08 (0.78)       0.09 (1.31)       0.31* (1.97)
%NO DIPLOMA                 −0.29** (3.84)    −0.33** (3.83)    −0.32** (3.84)    −0.70** (7.93)
%HS DIPLOMA ONLY            −0.17** (3.06)    −0.20** (3.07)    −0.18** (3.37)    −0.29** (4.57)
CONSTANT                    0.58** (5.23)     0.61** (4.73)     0.46** (3.95)     0.02 (0.05)
Spatial parameter (ρ or λ)  —                 0.22** (3.86)     0.17** (3.53)     0.23** (4.04)
Adjusted R²                 0.44              0.45              0.45              0.55

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG PARENT INCOME, LAG %NO DIPLOMA, and LAG %HS DIPLOMA ONLY are positive.
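The table note distinguishes the spatial parameter of the autoregressive and Durbin models from that of the error model. In their standard textbook forms (see, e.g., Anselin 1988), the autoregressive model is y = ρWy + Xβ + ε, the Durbin model adds spatially lagged regressors, y = ρWy + Xβ + WXθ + ε, and the error model is y = Xβ + u with u = λWu + ε, where W is a spatial weight matrix. The sketch below is a toy illustration under stated assumptions, not the chapter's estimation code: the four-district neighbor map, the numerical values, and the function names are all hypothetical. It row-standardizes W, forms a spatial lag such as LAG PARENT INCOME, and solves the noise-free autoregressive equation for y by fixed-point iteration, which converges when |ρ| < 1 and W is row-standardized.

```python
def row_standardize(adjacency):
    """Scale each row of a 0/1 neighbor matrix so that it sums to one."""
    weights = []
    for row in adjacency:
        total = sum(row)
        weights.append([v / total if total else 0.0 for v in row])
    return weights


def spatial_lag(weights, values):
    """Return W @ values: each district's weighted average of its neighbors."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def solve_sar(weights, x_beta, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + x_beta by fixed-point iteration (noise omitted)."""
    y = list(x_beta)
    for _ in range(max_iter):
        lagged = spatial_lag(weights, y)
        y_next = [rho * lag + xb for lag, xb in zip(lagged, x_beta)]
        if max(abs(a - b) for a, b in zip(y_next, y)) < tol:
            return y_next
        y = y_next
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical map: four districts in a row (1-2-3-4); neighbors share a border.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_standardize(adjacency)

parent_income = [40.0, 50.0, 60.0, 70.0]           # made-up values
lag_parent_income = spatial_lag(W, parent_income)  # neighbors' average income

x_beta = [0.5, 0.6, 0.6, 0.7]  # made-up nonspatial component of the pass rate
y = solve_sar(W, x_beta, rho=0.2)

print(lag_parent_income)  # [50.0, 50.0, 60.0, 60.0]
```

With a positive ρ, each district's solved pass rate exceeds its nonspatial component, because part of every neighbor's outcome feeds back into it; this is the spillover that the spatial parameter estimates in the tables measure.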
Table 5. Regression results using %PASS READING

                            Instrumental      Spatial           Spatial           Spatial
                            variables         error             autoregressive    Durbin
Variable                    model             model             model             model
%BOTH PARENTS               0.19** (4.34)     0.13** (2.60)     0.14** (3.07)     0.10* (2.42)
PARENT INCOME               0.04 (1.07)       0.07* (1.79)      0.08* (2.54)      0.07* (2.08)
%PRIVATE                    0.05 (1.38)       0.04 (0.97)       0.03 (0.63)       0.05 (1.50)
%MINORITY                   −0.10** (4.14)    −0.13** (4.60)    −0.14** (5.76)    −0.15** (5.40)
COHORT SIZE                 −0.038* (1.79)    −0.040 (1.63)     −0.030 (1.49)     −0.032* (1.70)
TEACHER SALARY              0.10 (0.89)       −0.01 (0.09)      −0.02 (0.19)      −0.03 (0.27)
TEACHER EXPERIENCE          0.09 (0.60)       0.16 (0.75)       0.17 (0.91)       0.13 (0.78)
PUPIL/TEACHER RATIO         0.10 (0.62)       0.07 (0.33)       0.05 (0.25)       0.06 (0.37)
TEACHER EDUCATION           −0.13 (1.08)      −0.03 (0.25)      −0.05 (0.37)      −0.04 (0.33)
%POVERTY                    −0.12* (2.37)     −0.10 (1.50)      −0.08 (1.62)      −0.06 (0.95)
#DISTRICTS                  0.09* (2.39)      0.05 (0.93)       0.06* (1.83)      0.05 (0.89)
%NO DIPLOMA                 −0.25** (5.95)    −0.28** (5.30)    −0.28** (6.95)    −0.33** (6.94)
%HS DIPLOMA ONLY            −0.05 (1.46)      −0.04 (1.08)      −0.02 (0.70)      −0.06* (1.86)
CONSTANT                    0.75** (12.40)    0.80** (10.86)    0.64** (9.49)     0.43** (3.36)
Spatial parameter (ρ or λ)  —                 0.26** (4.55)     0.17** (4.02)     0.23** (4.48)
Adjusted R²                 0.47              0.50              0.49              0.51

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG %BOTH PARENTS, LAG %MINORITY, LAG COHORT SIZE, LAG TEACHER SALARY, and LAG %HS DIPLOMA ONLY are positive; LAG TEACHER EDUCATION is negative.
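The asterisks in Tables 4 to 6 appear consistent with two-sided, large-sample normal critical values (about 1.645 for the 0.10 level and 2.576 for the 0.01 level), although the chapter does not state the reference distribution, so this should be read as an assumption. Under that assumption, the helper below (a hypothetical name, written only for illustration) maps an absolute t-statistic to the tables' notation.

```python
from statistics import NormalDist


def significance_stars(abs_t):
    """Map |t| to the tables' markers, assuming two-sided normal critical values."""
    z = NormalDist()
    crit_10 = z.inv_cdf(1 - 0.10 / 2)  # about 1.645
    crit_01 = z.inv_cdf(1 - 0.01 / 2)  # about 2.576
    if abs_t >= crit_01:
        return "**"
    if abs_t >= crit_10:
        return "*"
    return ""


# Spot checks against Table 5 entries (t-statistics copied from the table).
print(repr(significance_stars(2.60)))  # '**' (%BOTH PARENTS, spatial error model)
print(repr(significance_stars(2.54)))  # '*'  (PARENT INCOME, spatial autoregressive model)
print(repr(significance_stars(1.63)))  # ''   (COHORT SIZE, spatial error model)
```

The second check is the informative one: a t-statistic of 2.54 falls just short of the 0.01 threshold, which matches the single asterisk printed in the table.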
Table 6. Regression results using %PASS WRITING

                            Instrumental      Spatial           Spatial           Spatial
                            variables         error             autoregressive    Durbin
Variable                    model             model             model             model
%BOTH PARENTS               0.21** (3.30)     0.18* (2.38)      0.20** (3.28)     0.15* (2.45)
PARENT INCOME               0.05 (0.77)       0.03 (0.51)       0.04 (0.66)       0.04 (0.69)
%PRIVATE                    0.08 (1.43)       0.07 (1.23)       0.06 (1.20)       0.10* (1.74)
%MINORITY                   −0.04 (1.09)      −0.06 (1.29)      0.05 (1.35)       −0.09* (2.24)
COHORT SIZE                 −0.036 (1.14)     −0.037 (0.98)     −0.037 (1.22)     −0.039 (1.28)
TEACHER SALARY              0.01 (0.04)       0.05 (0.26)       0.05 (0.30)       0.02 (0.11)
TEACHER EXPERIENCE          −0.03 (0.15)      0.02 (0.06)       0.00 (0.02)       −0.07 (0.30)
PUPIL/TEACHER RATIO         0.15 (0.61)       0.16 (0.56)       0.14 (0.55)       0.23 (0.94)
TEACHER EDUCATION           0.00 (0.02)       −0.02 (0.09)      −0.03 (0.17)      −0.02 (0.12)
%POVERTY                    −0.15* (2.02)     −0.15 (1.59)      −0.12 (1.57)      −0.12 (1.13)
#DISTRICTS                  0.05 (0.87)       0.01 (0.18)       0.03 (0.53)       0.01 (0.12)
%NO DIPLOMA                 −0.05 (0.75)      −0.06 (0.79)      −0.06 (0.94)      −0.07 (0.98)
%HS DIPLOMA ONLY            −0.11* (2.36)     −0.13** (2.69)    −0.11* (2.38)     −0.18** (3.89)
CONSTANT                    0.73** (8.16)     0.77** (7.66)     0.59** (5.35)     0.50* (2.32)
Spatial parameter (ρ or λ)  —                 0.20** (3.13)     0.17** (2.99)     0.18** (3.11)
Adjusted R²                 0.21              0.23              0.22              0.25

The number of observations was 605. Parameter estimates are shown with the absolute value of the t-statistic in parentheses. ** Statistically significant at the 0.01 level; * statistically significant at the 0.10 level. The spatial parameter is ρ for both the spatial autoregressive model and the spatial Durbin model, and λ for the spatial error model. Spatial lags are included for the spatial Durbin model, but suppressed in output to save space: LAG COHORT SIZE and LAG %HS DIPLOMA ONLY are positive; LAG TEACHER EDUCATION is negative.
the spatial parameters range from 0.17 to 0.20, which are similar to those of the other test sections, and three spatial lags are significant in the spatial Durbin model. As in the reading section, %POVERTY appears to be significant until spatial dependence is addressed.
7 Conclusion
Educational spillovers have been discussed theoretically for a long time, but no one has accounted for them in empirical estimations of local public-schooling provision.¹⁵ Significant spatial effects are present in all 15 spatial education production functions, with spatial parameter estimates that range from 0.17 to 0.36. Researchers who ignore the spatial nature of the data run the risk of having biased, inefficient, and inconsistent parameter estimates (Anselin 1988). Researchers should account for spillovers in the production of education and omitted variable bias by incorporating spatial statistics. In addition to more accurate parameter estimates, the spatial models add to the explanatory power of education production functions. Despite the use of spatial Bayesian techniques that mitigate the fitting of outliers, the average adjusted R² improves from 0.42 to 0.46.
The influence of competition on public-school performance is the focus of the education production functions. Most recent education production function studies find that public schools respond to competition from private schools (Hoxby 1998; Borland and Howsen 1996; Dee 1998; Couch et al. 1993). This study disagrees. There seems to be no consistent association between private-school attendance rates and public-school achievement. Furthermore, although there is always a positive relationship between the number of public-school districts in a county and public-school test scores, it is only statistically significant in 8 of 20 regressions. The magnitude of its effect is also paltry, and never exceeds 0.05.¹⁶ At first glance, such slight competitive effects suggest that school vouchers will not markedly improve public-school performance; consequently, vouchers will probably not improve labor force productivity or earnings either. If so, vouchers may only serve to subsidize rich parents.
Willms and Echols (1992) found that Scottish parents with a high socioeconomic status were the ones who took advantage of Britain’s school voucher system. If only rich parents use the vouchers, they will be used to make tuition at other schools more affordable for those most able to afford it.
¹⁵ Except for the working paper by D.L. Millimet and V. Rangaprasad (2005).
¹⁶ At the mean, the largest it reaches is 0.043 in the spatial Durbin model of the %PASS CITIZENSHIP regression.
However, these findings must be interpreted carefully. The data are real. As such, they represent the current competitive environment in education, and as the results of this study confirm, the current environment can hardly be described as competitive. To the extent that they can afford it, parents choose where to live in part based on the quality of school their house is assigned to. However, other considerations are involved: taxes, convenience to work, pollution, availability of parks, and quality of housing stock may all be more important to parents than school quality. Also, the quality of schools may change over time, but high moving costs inhibit relocation. Residents whose children graduate may stop volunteering at the schools and stop approving school tax levies, and the tax laws make it costly to attend private schools or schools outside the attendance zone, since a parent must forgo his tax contribution and pay full tuition at the new school as well. What is more, private-school competition consists predominantly of Catholic schools. School vouchers would make private schools more affordable, which would elicit a supply response that may open up a wide variety of attractive choices to parents (Merrifield 2002). So parents who are not particularly interested in sending their children to Catholic schools may be more interested in sending them to a Montessori school that emphasizes African-American culture.
Public economists sometimes think of education as a merit good: a pure private good that is funded by taxation and provided by the local government. Being private, its benefits are confined to the area in which it is produced. The significant spatial parameter estimates of the current study imply that some of the benefits of schooling spill over into neighboring areas. Therefore, schooling is not a merit good, and the presence of externalities means that a nonoptimal amount may be consumed.
Failure to incorporate spatial statistics may lead to misleading parameter estimates. Six of the eight parameter estimates for school inputs were statistically significant in the %PASS ALL and %PASS MATH regressions, but five of the six effects disappeared in the spatial models. Poverty appeared to depress reading and writing pass rates, but this effect disappeared in the spatial models. Spatial statistics also uncovered other provocative findings, like the discovery that having richer neighbors increases math and citizenship pass rates in our own school district. Having found evidence of spillovers in education production functions, labor and education economists are exhorted to use spatial statistics in their estimations. Public and urban economists who deal with schooling in estimations are encouraged to do the same. Furthermore, significant spatial effects may be present in the demand for, and supply of, other publicly provided commodities such as crime prevention, pollution abatement, garbage collection, libraries, and parks. Additional research is required. Pending
these investigations, much of the empirical economic literature of the last 30 years may need to be rewritten to address the spatial nature of the data.

Acknowledgments. Bob Martin, Jim LeSage, John H.Y. Edwards, Kelley Pace, John Vandermosten, David Rodgers, Toshiharu Ishikawa, and other participants at the 2005 Chuo University Meeting on the Economics of Time and Space are thanked for their comments. The Tulane University Committee on Research is thanked for a $4000 Summer Fellowship in 1999. The first draft was completed June 22, 1999.
References
Akerhielm K (1995) Does class size matter? Econ Educ Rev 14:229–241
Anselin L (1988) Spatial econometrics: methods and models. Kluwer Academic, London, Dordrecht, pp 35, 58, 59, 103, 181
Barry RP, Pace RK (1999) A Monte Carlo estimator of the log determinant of large sparse matrices. Linear Algebra Appl 289:41–54
Bishop J (1989) Is the test score decline responsible for the productivity growth decline? Am Econ Rev 79:178–197
Borland MV, Howsen RM (1996) Competition, expenditures and student performance in mathematics: a comment on Couch et al. Public Choice 87:395–400
Bradford DF, Oates WE (1974) Suburban exploitation of central cities and governmental structure. In: Hochman HM, Peterson GE (eds) Redistributions through public choice. Columbia University Press, New York, London, pp 44–90
Brasington DM (1997) School district consolidation, student performance, and housing values. J Reg Anal Policy 27:43–54
Brasington DM (1999) Joint provision of public schooling: the consolidation of school districts. J Public Econ 73:373–393
Brasington DM, Haurin DR (2006) Educational outcomes and house values: a test of the value-added approach. J Reg Sci 46:245–268
Brasington DM, Hite D (2005) Demand for environmental quality: a spatial hedonic analysis. Reg Sci Urban Econ 35:57–82
Couch JF, Shughart WF, Williams AL (1993) Private school enrollment and public school performance. Public Choice 76:301–312
Davidson R, MacKinnon JG (1981) Several tests for model specification in the presence of multiple alternatives. Econometrica 49:781–793
Dee TS (1998) Competition and the quality of public schools. Econ Educ Rev 17:419–427
Edwards JHY (1986) A note on the publicness of local goods: evidence from New York State municipalities. Can J Econ 19:568–573
Epple D, Romano RE (1998) Competition between private and public schools, vouchers, and peer-group effects. Am Econ Rev 88:33–62
Falkinger J (1994) The private provision of public goods when the relative size of contributions matters. Finanzarchiv 51:358–371
Figlio DN (1999) Functional form and the estimated effects of school resources. Econ Educ Rev 18:241–252
Fowler WJ Jr, Walberg HJ (1991) School size, characteristics, and outcomes. Educ Eval Policy Anal 13:189–202
Friedkin NE, Necochea J (1988) School system size and performance: a contingency perspective. Educ Eval Policy Anal 10:237–249
Gomes-Neto JB, Hanushek EA, Leite RH, et al. (1997) Health and schooling: evidence and policy implications for developing countries. Econ Educ Rev 16:271–282
Griffith DA (1988) Advanced spatial statistics: special topics in the exploration of quantitative spatial data series. Kluwer, Dordrecht, pp 94–107
Haller EJ (1992) High-school size and student indiscipline: another aspect of the school consolidation issue? Educ Eval Policy Anal 14:145–156
Hanushek EA (1986) The economics of schooling: production and efficiency in public schools. J Econ Lit 24:1141–1177
Hanushek EA (1992) The trade-off between child quantity and quality. J Polit Econ 100:84–117
Hanushek EA, Taylor LL (1990) Alternative assessments of the performance of schools: measurement of state variations in achievement. J Hum Resour 25:179–201
Hayes KJ, Taylor LL (1996) Neighborhood school characteristics: what signals quality to homebuyers? Fed Reserve Bank Dallas Econ Rev 3:2–9
Hoxby CM (1996) Are efficiency and equity in school finance substitutes or complements? J Econ Perspect 10:51–72
Hoxby CM (1998) What do America’s “traditional” forms of school choice teach us about school choice reforms? Fed Reserve Bank New York Econ Policy Rev 4:47–59
Hoxby CM (2000) Does competition among public schools benefit students and taxpayers? Am Econ Rev 90:1209–1238
Ireland NJ (1990) The mix of social and private provision of goods and services. J Public Econ 43:201–219
Jewell RW (1989) School and school district size relationships: cost, results, minorities, and private school enrollments. Educ Urban Soc 21:140–153
Kennedy PE, Siegfried JJ (1997) Class size and achievement in introductory economics: evidence from the TUCE III data. Econ Educ Rev 16:385–394
LeSage JP (1997a) Regression analysis of spatial data. J Reg Anal Policy 27:83–94
LeSage JP (1997b) Bayesian estimation of spatial autoregressive models. Int Reg Sci Rev 20:113–129
Loury LD, Garman D (1995) College selectivity and earnings. J Labor Econ 13:289–308
Maddala GS (1992) Introduction to econometrics. Prentice-Hall, Englewood Cliffs, p 395
Merrifield J (2002) School choices: true and false. The Independent Institute, Oakland
MESA Group (1994) School district data book. National Center for Education Statistics, US Department of Education, Washington
Meyer RH (1996) Value-added indicators of school performance. In: Hanushek EA, Jorgenson DW (eds) Improving America’s schools: the role of incentives. National Academy Press, Washington, pp 97–223
Murdoch JC, Rahmatian M, Thayer MA (1993) A spatially autoregressive median voter model of recreational expenditures. Public Financ Q 21:334–350
Murnane RJ, Willett JB, Levy F (1995) The growing importance of cognitive skills in wage determination. Rev Econ Stat 77:251–266
Ohio Department of Education (1985) Maps of Ohio school districts: city, exempted, local. State of Ohio, Columbus
Ohio Department of Education (1993) Ohio educational directory: 1992–1993 school year. State of Ohio, Columbus
Ohio Department of Education, Division of Information Management Services (1999) http://www.ode.ohio.gov/www/ims/pregen_rept.html
Ornstein AC (1993) School consolidation vs. decentralization: trends, issues, and questions. Urban Rev 25:167–174
Pace RK, Barry RP (1998) Simulating mixed regressive spatially autoregressive estimators. Comput Stat 13:397–418
Ramanathan R (1998) Introductory econometrics with applications. Dryden Press, Fort Worth, Texas, USA, p 170
Reiter M, Weichenrieder A (1997) Are public goods public? A critical survey of the demand estimates for local public services. Finanzarchiv 54:374–408
Rothstein J (2005) Does competition among public schools benefit students and taxpayers? A comment on Hoxby (2000). National Bureau of Economic Research, Inc, NBER Working Paper No. 11215
Sander W (1996) Catholic grade schools and academic achievement. J Hum Resour 31:540–548
Stern D (1989) Educational cost factors and student achievement in grades three and six: some new evidence. Econ Educ Rev 8:149–158
Voß W (1991) Nutzen-Spillover-Effekte als Problem des kommunalen Finanzausgleichs: ein Beitrag zur ökonomischen Rationalität des Ausgleichs zentralitätsbedingten Finanzbedarfs, Frankfurt a.M.
Willms JD, Echols F (1992) Alert and inert clients: the Scottish experience of parental choice of schools. Econ Educ Rev 11:339–350
Wyckoff JH (1984) The nonexcludable publicness of primary and secondary public education. J Public Econ 24:331–351
Zanzig BR (1997) Measuring the impact of competition in local government education markets on the cognitive achievement of students. Econ Educ Rev 16:431–444
11. Innovation, R&D Cooperation, and the Geography of Regional Labor Acquisition
Jaakko Simonen¹ and Philip McCann²
Summary. This chapter considers the role played by geography in the promotion of innovation. In order to examine these issues, we employ empirical data from Finland to test the extent to which the variety and nature of face-to-face contacts affect the innovation performance of the firm. We also control for the geographical mobility of the labor employed by the firm. This allows us to identify the different roles which the geography of knowledge spillovers and exchanges and the geography of labor markets play in the innovation process. Our findings suggest that local face-to-face contact is an essential feature of the innovation process. Moreover, our results concerning the importance of the variety of R&D relations also provide support for this argument. This is not to say, however, that the evidence unambiguously supports agglomeration–innovation arguments. The reason is that our labor market results point to a rather different conclusion. In particular, having controlled for the types of R&D cooperation relations which a firm exhibits, the finding that neither local labor acquisition nor local population density is related to innovation casts doubt on some of the agglomeration–innovation theories stressing the importance of local labor-market skills.
Key words. Innovation, R&D Cooperation, Labor acquisition, High technology
¹ Department of Economics, University of Oulu, PO Box 4600 FIN-90014, University of Oulu, Finland.
² Department of Economics, University of Waikato, Private Bag 3105, Hamilton, New Zealand, and Department of Economics, University of Reading, Reading RG6 6AW, UK.
1 Introduction
The role played by knowledge spillovers has become a major focus of research interest amongst economists analyzing the nature and processes of innovation and growth. Since the original work of Schumpeter (1934), the wide-ranging literature on the determinants of innovation has tended to focus on two key lines of enquiry. Firstly, research aims to identify the structural and sectoral characteristics of the firms and entrepreneurs which promote innovation. Here, some of the key issues which have emerged as playing a significant role in promoting firm innovation are the levels of R&D expenditure, the stock of human capital inputs, the sector of activity, the mode of organization, and also the size of the firm. More recently, over the last two decades, there has also emerged widespread interest in the role which geography and spatial industrial organization may play in the innovation process. This research interest has arisen as a response to the groundbreaking work of Krugman (1991), Porter (1990), and Scott (1988) on issues related to industrial clustering and agglomeration.
The mechanisms by which knowledge is transferred are seen as a crucial driving force behind the spatial agglomeration of activities. In all three of the original Krugman, Porter, and Scott approaches, local knowledge flows either between firms, or between firms and organizations, are assumed to take place via either local tacit knowledge spillovers between individual people, or via the local movement of embodied human capital. These knowledge flows are assumed to be mediated primarily by face-to-face contact, and the requirements of proximity to facilitate this process are assumed to give rise to spatial concentration of activities. At the same time, knowledge transfer is also seen as a crucial part of the process of innovation, and there is also much empirical evidence which points not only to the localization tendencies of innovation (Jaffe et al.
1993), but also to the links between innovation and the spatial clustering of activities (Arita and McCann 2000; Acs 2002; Cantwell and Iammarino 2003). However, none of this empirical evidence provides fundamental insights into the nature of the face-to-face knowledge exchanges associated with innovation and their links with spatial clustering. The reason is that while the literature dealing with innovation and growth generally assumes that face-to-face contact between individuals and firms is positively related to the levels of innovation, the same literature also adopts “rather diffuse and vague notions that knowledge and innovation reside ‘in the air’ or in the ‘buzz’ of urban life” (Power and Lundmark 2004, p. 1025). These vague notions limit our ability to empirically test these arguments relating agglomeration and clustering to innovation, and these limitations are compounded by
the general paucity of appropriate data. The result is that the actual role played by face-to-face contact is still largely “a missing aspect of (the) mechanisms that are considered to generate agglomeration” (Storper and Venables 2004, p. 353). This chapter will explicitly examine these questions by distinguishing the effects on innovation of tacit knowledge transfers via inter-firm and inter-organizational contacts from the effects on innovation of labor mobility. In order to do this, we employ a unique dataset from Finland on firm innovation behavior, which combines firm-specific information on the nature of inter-firm and inter-organizational contacts with information on the geographical and sectoral patterns of firm labor recruitment. By relating each of these features to innovation performance, we are able to distinguish the effects of knowledge spillovers arising from face-to-face contact from those of labor mobility.
2 Innovation and Economic Geography
The hypothetical links between innovation and economic geography have arisen from a variety of different empirical and analytical sources. Various hypotheses have been developed to account for the widely observed uneven spatial distribution of innovative behavior (Gordon and McCann 2005a,b), and these can be explained individually.
Hypothesis 1: The contemporary geography of innovation is essentially a geography of the currently more innovative sectors of the economy.
This hypothesis is based on the observation that in any period there are some sectors of economic activity which will be more heavily involved in the innovation of products or processes than others. This may be because of the particular phase which has been reached in the lifecycle of an industry’s product set, or because some activities with very short product cycles are more or less permanently locked into the innovative phase. If each of these industries is subject to rather different location factors, because of the nature of their production technologies or their marketing and consumption processes, the geography of innovation may then be reducible simply to a geography of industrial location. With activities remaining in the same broad locations through all of the phases of the product cycle, places which they dominate will also appear to move through that cycle, except for the homes of the permanent innovators, which would remain continuing sites of innovation.
Hypothesis 2: The contemporary geography of innovation is essentially a result of sectoral differences in the phases of product or profit cycles.
This alternative interpretation of the product cycle geographies emphasizes significant and typical shifts in the locational requirements between the phases of a (primarily oligopolistic) industry’s product or profit cycle. From this perspective, during the early innovative phases, neither the scale of production nor the certainty of growth is sufficient for firms to attempt self-sufficiency either in production or training, while design uncertainties also militate against reliance on either distant suppliers or semiskilled labor. In this early phase, access to appropriate skills and subcontractors is a crucial condition for successful innovation and the management of uncertainties. Later on, in the mature phases of the cycle, when output scale has been achieved, production methods have become routine, and cost factors are increasingly important, both simple geographical dispersal to lower-cost locations and the spatial division of labor will become increasingly relevant (Markusen 1985). From this perspective, therefore, what is generally significant about the geography of innovative activities is not the distribution of creative or inventive potential, but the production conditions which allow infant firms and industries to survive and thrive in a nursery environment until they acquire the scale and experience to strike out on their own.
Hypothesis 3: The contemporary geography of innovation is essentially the outcome of variations in the characteristics between different places which lead to differences in the geography of creativity and entrepreneurship.
This third approach to understanding the distribution of innovative activity focuses on the geography of creativity and entrepreneurship, in the sense of place characteristics favoring the development and commercial launching of potentially successful new or improved products, either through established or new business organizations.
The emphasis here is on the factors which stimulate and enable novel developments while also facilitating the selection of those with real competitive potential. The three key sets of factors involve: a rich “soup” of skills, ideas, technologies, and cultures within which new compounds and forms of life can emerge; a permissive environment enabling unconventional initiatives to be brought to the marketplace; and vigorously competitive and critical arenas operating selection criteria which anticipate and shape those of wider future markets. The arguments underlying this third hypothesis are currently dominated by the literatures on agglomeration, creativity, and knowledge spillovers.
Hypothesis 4: The contemporary geography of innovation is essentially a result of the fact that innovation is most likely to occur in small and medium-sized enterprises, whose spatial patterns happen to be uneven.
This fourth approach to explaining the uneven geographical patterns of innovation involves another type of “milieu” argument, which is focused on
the geography of cooperation, and rests on the perception that innovation is most likely to occur in small and medium-sized enterprises (SMEs) which have neither the scale nor the risk-bearing capacity to provide all of the key inputs on their own account. Observations from so-called “new industrial districts” (Scott 1988) such as Silicon Valley (Saxenian 1994) and the Emilia-Romagna region of Italy (Scott 1988; Castells and Hall 1994) have suggested that the geographical proximity of SMEs is a necessary criterion for the development of mutual trust relations based on a shared experience of interaction with decision-making agents in different firms.
The latter two hypotheses assume that innovation is directly related to industrial clustering and agglomeration, and as a result of the work of Krugman (1991), Porter (1990), and Scott (1988), these clustering–innovation hypotheses are now the most popular analytical approaches. Indeed, the nature of the links between spatial clustering and innovation has now become a key feature of the research into innovation systems and economic growth (Caniels 2000; Cantwell and Iammarino 2003; Acs 2002; Breschi and Lissoni 2001a,b). It is specific aspects of these clustering–innovation arguments that we will test in this chapter. In particular, we focus on the role which face-to-face knowledge exchanges play in promoting innovation. In order to do this, we must first acknowledge that there are in reality four quite distinct mechanisms by which face-to-face contact and innovation can be linked.
Firstly, face-to-face contact may promote innovation by increasing the possibility of informal knowledge spillovers between firms and individuals (Krugman 1991); secondly, face-to-face contact may promote innovation by increasing the mutual transparency of competitor behavior and thereby competitor responses (Porter 1990); thirdly, face-to-face contact may promote innovation by increasing the levels of cooperation as well as competition between firms (Scott 1988); fourthly, face-to-face contact may promote innovation by increasing the inter-firm mobility of labor (Almeida and Kogut 1999). Unfortunately, a lack of appropriate data has meant that it has previously not been possible to identify these different mechanisms empirically, and therefore it is very difficult to distinguish between, and evaluate the effects of, tacit knowledge spillovers which accrue because of inter-firm or inter-organizational relations, and embodied human-capital knowledge transfers which take place due to the mobility of labor between firms. In principle at least, it is possible to treat the first three mechanisms by which face-to-face contact and innovation can be linked, namely informal knowledge spillovers, mutual transparency, and cooperation, as being qualitatively quite similar to each other. At the same time, these three mechanisms can be regarded as a group, and as being quite different in nature from the fourth mechanism, namely the inter-firm mobility of labor. The reason for this becomes clear if we adopt the analogy of a simple
J. Simonen and P. McCann
stock-inventory model. The first three types of mechanisms generally involve highly frequent short-term transactions of relatively small quantities of knowledge or information in comparison to the total knowledge of the person(s) undertaking the transaction, whereas the inter-firm mobility and acquisition of labor involves relatively less frequent transactions in which the whole human capital of the individual is transferred for a significant period. In terms of agglomeration theory, any of these types of transaction could be related to innovation. However, while empirical data on patents are readily accessible, along with aggregate measures of labor skills and education, additional microeconometric data on labor mobility and innovation are relatively difficult to find. For this reason, although there is some evidence supporting the role played by the mobility of local human capital in promoting innovation (Angel 1991; Audretsch and Stephan 1996; Almeida and Kogut 1999; Breschi and Lissoni 2003; Franco and Filson 2000; Persson 2002; Power and Lundmark 2004), these papers are very much in the minority. In most recent studies of the geography of innovation, analyses which emphasize the role played by local informal knowledge spillovers (Acs 2002; Kaiser 2002) have tended to predominate over human capital and labor mobility explanations. The prevalence of such papers therefore leads to the impression that informal knowledge spillovers tend to dominate over labor market effects, although as yet there is no actual evidence for this. The objective of this chapter is to partially fill this research gap. Using a unique dataset, our approach is both to identify and distinguish between these different potential forms of knowledge transfer, and then to empirically relate the importance of these alternative knowledge transfer mechanisms to the innovation process. 
The findings of this research should contribute to our knowledge of both the economics of growth and also the economics of space.
3 Data and Methodology
R&D cooperative relationships usually begin as a result of informal knowledge spillovers, although they subsequently develop into much more complex and formal arrangements. Of all the possible types of inter-firm relations, R&D cooperative relationships require the most intense face-to-face contact in order to be both established and maintained, and for high-technology firms at least, have been demonstrated to be the type of inter-firm relations which are most associated with geographical proximity (Arita and McCann 2000). This is because the required level of trust embedded in the mutual commitments made by the firms tends to be the highest of all types of inter-firm and inter-organizational relations. Therefore, testing whether the
Regional Labor Acquisition
existence, nature, and variety of R&D cooperation relationships is related to innovation provides an indirect test of whether the existence, nature, and variety of intense and continuing face-to-face contact is related to innovation. In this chapter, we aim to identify the role which labor mobility plays in promoting innovation, after having controlled for other firm-specific characteristics, including face-to-face contact with other firms and organizations embodied in these R&D cooperation relationships. In order to do this, we employ microeconometric data on the innovation behavior and performance of Finnish high-technology firms. As well as information about the innovation behavior of the firms, these data provide us with detailed information about the structural characteristics of the firms, and also information about the R&D cooperation behavior between individual Finnish firms. After controlling for the impact on innovation of a firm’s structural characteristics, and also for the nature and variety of a firm’s cooperation R&D relationships, we are then also able to extend the argument to investigate the effects on a firm’s innovation performance of the geographical and sectoral origins of the firm’s labor acquisitions. This will allow us to isolate the independent role on innovation performance which is played by knowledge transfers associated with local and nonlocal human–capital mobility from the role associated with inter-firm and inter-organizational knowledge spillovers associated with R&D cooperation relationships. The establishment-level data used for our research come from innovation surveys conducted by Statistics Finland in 1996, 2000, and 2002. The innovation surveys undertaken by Statistics Finland are conducted using the subject approach, which is in line with the European Union Community Innovation Survey (CIS) framework. 
These surveys collect information about an individual establishment’s innovation performance, defined as the launching of any new or substantially improved products or processes during the previous 2 years. In addition to the innovation outputs, the innovation surveys also provide information about the firm’s innovation cooperation with other enterprises or institutions. For the purposes of this research, these innovation databases have also been further expanded with information from the Finnish Business Register for 1990–2002 and the Finnish Longitudinal Employer-Employee Database (FLEED) database maintained by Statistics Finland, which provide us with information about the geographical and sectoral origins of labor recently acquired by the establishment. Our combined database now includes establishment-level information regarding each of the types of innovation exhibited by an establishment, the employment size of each establishment, the turnover of each establishment, the nature and variety of the R&D cooperation relations of the firm, and the geographical and sectoral origins of the labor recently acquired by the establishment.
Our dataset consists of all the Finnish high-technology firms (defined according to the 1995 SIC and listed in Table 1) which were included in the three innovation surveys of 1996, 2000, and 2002. The basic establishment-level unit of analysis we adopt is that of the “local unit,” and the innovation behavior we ascribe to the local unit comes from the firm-level innovation data. Assigning innovation behavior to the local unit is straightforward in the case of a single-plant or single-site firm because the firm is the local unit. In the case of a multiplant firm which has establishments in different areas, each individual establishment is the unit of analysis, and in the absence of any additional information, the innovation behavior we ascribe to each establishment is that of the multiplant firm as a whole. In the case of Finnish data, this technique has previously been adopted elsewhere (Alanen et al. 2000). There are two major justifications for this technique. Firstly, if we assume that information is transferred effectively between different local units within the same firm (Orlando 2000), then it is appropriate to give the same innovation status to all the individual local units of the same firm. Secondly, the majority of firms surveyed had fewer than three establishments, and almost half had only one establishment. In the few cases where a firm had several establishments in the same local area, the unit of analysis adopted is an aggregate of all the local plants combined. In addition, the surveys also provide information about the R&D interaction between individual firms and other firms or institutions, specifically in order to promote innovation, and this information is provided at the level of the firm. Once again, with single-plant firms or with multiplant firms whose establishments are all in the same locality, assigning interaction and cooperation behavior to the local unit is straightforward. 
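The assignment rules described above can be sketched in Python. This is only an illustration under stated assumptions: the record layout, the function name `build_local_units`, and the example firms and subregion labels are hypothetical and are not drawn from the Statistics Finland data. The sketch applies the two rules given in the text: plants of one firm in the same local area are aggregated into a single local unit, and each local unit inherits the firm-level innovation status.

```python
from collections import defaultdict


def build_local_units(establishments, firm_innovates):
    """Build local units from establishment records (hypothetical layout).

    establishments: iterable of (firm_id, subregion, employees) tuples.
    firm_innovates: dict mapping firm_id to the firm-level survey answer.
    Returns a dict keyed by (firm_id, subregion).
    """
    units = defaultdict(lambda: {"employees": 0, "innovates": False})
    for firm_id, subregion, employees in establishments:
        unit = units[(firm_id, subregion)]
        # Plants of the same firm in the same local area are combined.
        unit["employees"] += employees
        # Without plant-level information, the unit takes the firm's status.
        unit["innovates"] = firm_innovates[firm_id]
    return dict(units)


# Made-up example: firm B has two plants in one subregion and one elsewhere.
establishments = [
    ("A", "subregion 1", 8),
    ("B", "subregion 1", 40),
    ("B", "subregion 1", 15),
    ("B", "subregion 2", 120),
]
firm_innovates = {"A": False, "B": True}
local_units = build_local_units(establishments, firm_innovates)
```

In this made-up example firm B contributes two local units, one of which aggregates two plants, and both carry the firm's innovation status.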
However, in the case of multiplant firms with establishments in different localities, it is possible that all the individual establishments do not interact with other firms or organizations to the same
Table 1. High-technology industries (SIC 1995)

High-technology industry                                                        SIC 95
Manufacture of pharmaceuticals, medicinal chemicals, and botanical products    244
Manufacture of office machines and computers                                   30
Manufacture of radio, television, communications equipment, and apparatus      32
Manufacture of aircraft and spacecraft                                         353
Telecommunications                                                             642
Computers and related activities                                               72
Research & development                                                         73
Architectural and engineering activities and related technical consultancy     742
extent. However, once again, in the absence of any additional information, we assign the same levels of cooperation and interaction to each establishment within the same firm. The information on R&D cooperation is treated here as our major proxy for face-to-face contact. The raw nonadjusted data show that between the different size-bands, the proportion of firms innovating is relatively high among both the groups of very small local establishments employing fewer than 10 people, and also the larger establishments employing more than 50 people, whereas the probability of innovation appears to be lower among the establishments employing between 10 and 49 people. This gives rise to a J-shaped relationship between establishment size and innovation intensity, as observed by Tether et al. (1997), who found that small enterprises did not introduce a disproportionately large share of innovations relative to their employment share, and only the largest enterprises introduced more innovations than their size would suggest. A similar result also holds in terms of the probability of cooperation for R&D activities. However, for econometric analysis, given the types of data we have at our disposal, the analysis of these data is most appropriately undertaken by employing a series of probit models, in which we estimate the probability of a particular type of innovation taking place as a function of the size characteristics of the firm, the cooperation behavior of the firm, and the labor acquisition behavior of the firm. In order to test the relationship between innovation behavior and the importance of these various features, we estimate three different types of probit model. In each of these three models, the binary dependent variable is defined as being equal to 1 if the local establishment has managed to introduce new innovations during the previous years, and equal to 0 if it has not. 
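As a rough illustration of the probit setup just described, the following Python sketch computes the probability that an establishment innovates as the standard normal CDF of a linear index, together with the corresponding log-likelihood. The regressor layout and all coefficient values are hypothetical and chosen only for illustration; they are not the chapter's estimates, and the sketch does not perform the estimation itself.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed with math.erf from the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def innovation_probability(x, beta):
    """Probit probability P(y = 1 | x) = Phi(x . beta)."""
    index = sum(b * v for b, v in zip(beta, x))
    return normal_cdf(index)


def log_likelihood(rows, outcomes, beta):
    """Probit log-likelihood for binary outcomes (1 = innovated, 0 = did not)."""
    total = 0.0
    for x, y in zip(rows, outcomes):
        p = innovation_probability(x, beta)
        total += math.log(p if y == 1 else 1.0 - p)
    return total


# Hypothetical regressors: constant, very-small-establishment dummy,
# log turnover, R&D cooperation dummy.
beta = [-1.5, 0.4, 0.1, 1.2]  # illustrative values only, not estimates
without_coop = innovation_probability([1, 1, 5.0, 0], beta)
with_coop = innovation_probability([1, 1, 5.0, 1], beta)
```

With a positive cooperation coefficient, `with_coop` exceeds `without_coop`, which is the kind of comparison the estimated models are used for.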
The three separate probit models cover all three response categories of innovations, i.e., product innovations, process innovations, and new products introduced to the market. Product innovation represents the situation where a firm has introduced a product to the market which is new for the particular firm, but not for the industry. Process innovation represents a situation where a firm has introduced a new production process or a new service delivery process. New products to market represents a situation where a firm introduces a product which is new to the market as a whole, and can be considered as being the most advanced form of innovation. The set of independent variables we employ includes dummy variables for the employment size-band categories of the local establishments, a continuous variable for the establishment turnover, plus a dummy variable indicating whether the establishment has cooperated with other firms and institutions on R&D issues during the previous 2 years. Our cooperation variable takes a value of 1 if the local establishment cooperates with at least one type of partner, and 0 if it does not have any kind of cooperation
relationship with external partners. As we have seen at the beginning of this section, the importance of the cooperation dummy variable employed here lies in the fact that inter-firm cooperation is generally regarded as being positively related to the level of face-to-face interaction between firms, because of the need to build up mutual trust and confidence. This variable is therefore employed as a proxy for face-to-face contact. In terms of local labor market variables, our data are calculated with respect to local labor market regions. There are about 80 official subregions in Finland, with an average population of some 65 000. We employ a continuous variable representing the population density of the region in order to capture any local external agglomeration spillover effects which are external to the individual firm (Ciccone and Hall 1996), and we also employ detailed data about the proportions, and also the respective geographical and sectoral origins, of the firm’s total employment which has been acquired recently. We run the set of three probit models, Model 1, Model 2, and Model 3, in three different types of circumstance, giving a total of nine different probit models. Model 1 estimates the likelihood of a firm producing a product innovation, Model 2 estimates the likelihood of a firm producing a process innovation, and Model 3 estimates the likelihood of a firm producing a new product to market innovation. In the first case, we test whether innovation is related to either cooperation relations, or the pattern of local and nonlocal labor mobility. In the second case, we test whether the geographical and sectoral origins of the labor acquired are related to innovation, after controlling for the number and variety of R&D cooperation relations exhibited. 
In the third case, having controlled for the geographical and sectoral origins of the labor acquired, we test whether the nature and types of R&D cooperation relations entered into are significantly related to innovation. As such, our models become progressively more detailed as we move from the three first-stage models to the three third-stage models. The models estimated for the first set of circumstances are estimated on the basis of a pooled dataset from all three survey years of 1996, 2000, and 2002, while the second and third groups of models are estimated on pooled data from the 1996 and 2000 surveys. The reason for this is that the 2002 survey did not contain any questions regarding the number, variety, and types of R&D cooperation relations a firm exhibited, whereas both the 1996 and 2000 surveys did so.
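The estimation design described above (three innovation outcomes crossed with three sets of circumstances) can be laid out as a small grid. The Python sketch below is only an organizing aid: the dictionary keys and the short labels for how cooperation is measured are paraphrases introduced here, not variable names from the study.

```python
from itertools import product

# Dependent variables of Model 1, Model 2, and Model 3.
OUTCOMES = ("product innovation", "process innovation", "new product to market")

# The three sets of circumstances, with the surveys pooled in each.
CASES = {
    "case 1": {"surveys": (1996, 2000, 2002), "cooperation": "single dummy"},
    "case 2": {"surveys": (1996, 2000), "cooperation": "number of partner types"},
    "case 3": {"surveys": (1996, 2000), "cooperation": "type of partner"},
}

# One specification per (case, outcome) pair.
specifications = [
    {"outcome": outcome, "case": case, **settings}
    for (case, settings), outcome in product(CASES.items(), OUTCOMES)
]
```

The grid contains nine specifications, and only the three first-case models draw on the 2002 survey.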
4 Results
The nine estimated probit models reported in Tables 2–4 have a goodness-of-fit of between 0.194 and 0.499, with eight models having a goodness-of-fit
of over 0.2, and with all models exhibiting an increasing goodness-of-fit with the inclusion of additional explanatory variables. These model results are very good indeed.3 Table 2 shows the results of the models which test whether innovation is related to either cooperation relations, or the pattern of local and nonlocal labor mobility. In terms of the various size characteristics of the establishment, employment size appears to be positively and significantly related to the introduction of new products to market for very small establishments, and negatively related to the probability of producing process innovations by medium-sized establishments, while turnover is positively and significantly related to both product innovations and new products to market. The one variable which is consistently and positively related to innovation performance is R&D cooperation. Cooperation with other firms or organizations for R&D is always positively and significantly related to all three types of innovation. As we have already mentioned, R&D cooperation relationships require the most intense face-to-face contact in order to be both established and maintained, and for high-technology firms, at least, have been demonstrated to be the type of inter-firm relations which are most associated with geographical proximity (Arita and McCann 2000). Therefore, this result supports the argument that face-to-face contact is always essential for all types of innovation. In terms of the proportion of labor acquired from within the local region as against outside the region, our initial results in Table 2 suggest that local labor acquisition is an essential feature of both product innovation and new products to market. 
These initial findings appear to lend support to agglomeration arguments, although the lack of any positive significance for the population density variable, and even negative significance in the case of process innovations, suggests that simple urbanization assumptions may not be tenable. In order to further investigate this point, in Table 3 we present the results of similar probit models to those reported in Table 2, but in these cases, the dichotomous R&D cooperation variables of Table 2 are now broken down into four categories in order to account for the different number of types of cooperation relation entered into. The seven types of cooperation partner relationships are: other establishments within the same firm; customers and clients; suppliers; competitors; higher education institutions; consultants; and government research institutes or nonprofit private research institutes.
3 In a choice model such as this, a pseudo R2 value of 0.2 is regarded as being very good indeed (McFadden 1979).
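For readers unfamiliar with the pseudo R2 referred to in the footnote, the following Python sketch shows how such a measure is commonly computed from log-likelihoods. It assumes the measure in question is McFadden's likelihood-ratio index, 1 − LL(model)/LL(null); the footnote cites McFadden but the text does not state the formula, so this is an assumption. The numerical inputs are made up and are not the chapter's log-likelihoods.

```python
import math


def null_log_likelihood(outcomes):
    """Log-likelihood of an intercept-only binary model.

    outcomes: list of 0/1 values containing at least one 0 and one 1.
    """
    n = len(outcomes)
    n1 = sum(outcomes)
    n0 = n - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("outcomes must contain both 0 and 1")
    share = n1 / n
    return n1 * math.log(share) + n0 * math.log(1.0 - share)


def mcfadden_pseudo_r2(ll_model, ll_null):
    """McFadden's likelihood-ratio index: 1 - LL(model) / LL(null)."""
    return 1.0 - ll_model / ll_null


# Made-up log-likelihoods, for illustration only.
example = mcfadden_pseudo_r2(ll_model=-80.0, ll_null=-100.0)
```

With these made-up inputs the index is approximately 0.2, the threshold mentioned in the footnote.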
Name of the explanatory variables                            Coefficients of Model 1 (P values in parentheses)
Intercept/constant                                           −2.0055*** (0.0075)
Categories of LUs
  Very small LUs (1–9)                                       0.7910*** (0.0062)
  Small LUs (10–19)                                          0.5061** (0.0514)
  SMLUs (20–49)                                              0.3330 (0.1651)
  SMLUs (50–99)                                              0.1406 (0.5545)
  SMLUs (100–249)                                            −0.0557 (0.8067)
  Large LUs (250→)                                           0.0000 (Reference category)
Log turnover                                                 0.1948***
Cooperation                                                  1.9509***
Mobility of labor from the same subregion (2-year lag)       0.3663***
Mobility of labor from the other subregions (2-year lag)
Population density
Number of observations
Log–likelihood
(P_L − P_O > d, further rises in the cost of out-of-town shopping would lead to further reductions in local prices. Unlike the step-change illustrated above, however, here we will see incremental price reductions in response to incremental cost increases.
Fig. 3. Entry of new local shops shifts the demand curve for typical local shops to the left. New monopolistic competition equilibrium at lower price (P_L2) and higher capacity utilization (Q_L2) for the typical local shop. (Labels in the figure: P_L, P_L2, Q_L, Q_L2, D, D*, D'', D**, LRAC.)
F. Guy
4 Policy Implications
Our model assumes that among the consumers in an area served by one set of competing local shops, some have cars and some do not. The distributional implications which we will draw from the model rest on the further assumption that those who have cars are, on average, richer than those who do not. This entails an assumption of income heterogeneity within a local urban market, in the sense that within that market high- and low-income groups co-exist. It thus differs from models of the Alonso (1964) type, in which a city’s residents are sorted and segregated spatially into internally homogeneous wealth or income groups, with the poorest in the center and so on. Our model is a reasonable approximation of many European, East Asian, and Australasian cities, if not of many North American ones. Bearing these assumptions in mind, we see that in the case where the local equilibrium tips from the first segment of the demand curve to the middle one, our model has an important distributional implication: poor consumers become better off in the market where the tipping occurs, because prices in the local shops they depend on fall. Various policies, discussed below, may cause this tipping (or the reverse) to occur. To draw any practical conclusions from the model, however, we need to recognize that no urban area has a single local market: each urban area has more than one, and some have a great many. Even if we assume homogeneity in the out-of-town market and within each local market in some urban area, there is no reason to expect that the demand curves and cost curves, together with travel costs for mobile consumers from that market, will all be the same in each local market. Therefore, a positive shock to travel costs and/or out-of-town prices which affects the entire urban area may tip some local markets from a high-price to a low-price equilibrium, while leaving others unaffected. 
In any that are not affected, immobile consumers experience no change and mobile consumers bear higher costs. In this light, it might seem that securing redistribution by raising the net cost of out-of-town shopping would be quite costly in terms of allocative efficiency, and that if redistribution were our goal, we would be better off turning to taxes and cash transfers. This may, in fact, be the case, but before concluding that it is, we should revisit the following points. First, the tipping is discontinuous: at some point, a small change in the net cost of shopping out of town triggers a step change in local prices. An implication of this discontinuity is that the rise in net out-of-town costs which is necessary to gain the benefits of tipping might not be very large. Second, in markets which tip to the low-price equilibrium, excess capacity is reduced. In the short run this is presumably mirrored by increased excess
Redistribution Through Local Competition
capacity in out-of-town shops; the reduction in local excess capacity is, however, a long-run equilibrium condition. In cases where the model presented here is relevant, a substantial proportion of the difference between out-of-town and local prices will be due, not to differences in logistical efficiency inherent in the two modes of delivery, but to high excess capacity in local shops. Third, there are policies which affect the relative cost of shopping out of town and which can be calibrated to a particular local area, such as parking controls. Finally, the progressive redistribution occurs as a result of a process which also reduces the externalities from motor vehicle traffic, notably carbon emissions (Ruth 2006), and which produces positive social externalities by re-invigorating local shopping districts, one of the elusive desiderata of generations of urban policy (Evans 1999). Indeed, an explicit policy of this nature, which restricts the development of out-of-town retail centers for exactly these reasons (Policy Planning Guidance 6, or PPG6) has recently been introduced in the UK (Office of the Deputy Prime Minister 1999). Our model provides a justification for, and explanation of, such a policy, by showing that it has distributional consequences. It may be useful to review a range of transport and land-use policies in terms of which equilibrium they favor for local shops. In general, policies which reduce the cost of automobile transport to out-of-town shops favor the high-price, low-volume local equilibrium. Such policies include measures which reserve parking spaces for residents, whether on-street, or by allowing (or requiring) the provision of off-street resident parking: ease of coming and going in a car is a reduction in the cost of travel by automobile. Moreover, priority parking for residents may come at the expense of parking for local shops. 
Although driving to local shops is not addressed explicitly in the model presented here, some provision for customer parking at local shops can be a factor in competing for mobile consumers, to the benefit of immobile consumers in the area. These issues have long been understood by those concerned with the survival of neighborhood and town-center retail businesses. The point made here is that in addition to the survival of such businesses, the prices charged by these businesses are at issue. Similar issues arise in connection with the control of traffic congestion. Measures to manage traffic congestion often focus on reducing automobile trips to town centers (Green et al. 1977; Quinet and Vickerman 2004; Atzema et al. 2005). The points to be aware of here are: (i) to the extent that what we are calling local shops are located in areas targeted for congestion reduction, reducing congestion can have the effect of reducing (for mobile consumers) the relative cost of travel to out-of-town shops; (ii) even outside of the areas subject to congestion-reduction policies, if these policies succeed in reducing
overall road traffic, the effect may be a reduction (again, only for mobile consumers) of the cost of traveling out of town. On the other hand, traffic management policies which focus not on congested areas per se, but on reducing the overall level of automobile traffic within a given region, would in many cases raise the relative cost of shopping out of town. Again, this issue is well understood by retailers, such as those in and around the congestion-charging zone of central London. Policy interventions on behalf of such businesses have, understandably, focused on the problem of their survival, on their availability to immobile consumers, and on their employment levels. Again, the contribution of this chapter is to show that the same factors that threaten the survival of such businesses can raise the prices they charge.
5 Limitations and Future Research
The model presented here is highly stylized: there are two types of consumer and two types of shop; for mobile consumers, an all-or-nothing choice between local and out-of-town shopping; a uniform distribution of travel costs across mobile consumers; no choice beyond the local shops for immobile consumers; and a parametric out-of-town price. We believe that this simple model does the job of supporting our two central arguments. The first is that in a local monopolistically competitive market with one class of consumers who have an outside option and another class who do not, a change in the net costs of using the outside option can bring about a discontinuous adjustment in the local price/capacity-utilization equilibrium. The second is that if the consumers without the outside option are also poorer than those with the outside option, then the change in the net costs of using the outside option has distributional consequences. For an assessment of the empirical relevance of the model, and for an estimated magnitude of the welfare effects (including effects on the wealthier/mobile consumers, not addressed here), we would need a more complete and flexible specification. The binary mobile–immobile distinction made in the present model seems the least problematic, since it is based on the fact of access (or not) to an automobile. On the other hand, the distribution of travel costs within the mobile group could be modeled in a number of different ways: are the results affected if costs are distributed logistically, say, or in a Poisson distribution? Is there an empirical basis for choosing a distribution of costs? The linearity of demand might also be relaxed. An explicit model of the out-of-town retail oligopoly and conditions under which its pricing might respond to those of local shops is also needed, along with a formalization of the welfare model.
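The stylized assumptions listed above can be illustrated with a short Python sketch of the demand facing local shops. It is a reconstruction from those assumptions alone, not the chapter's own specification: the functional form, the parameter names (`d_min`, `d_max`, `a`, `b`), and all numbers are hypothetical. A mobile consumer with travel cost d is assumed to shop out of town when the price gap P_L − P_O exceeds d; travel costs are uniform on [d_min, d_max]; immobile consumers always shop locally; and per-consumer demand is linear in the local price.

```python
def share_shopping_locally(p_local, p_out, d_min, d_max):
    """Share of mobile consumers whose travel cost d is at least the price gap.

    Assumes d is uniformly distributed on [d_min, d_max] with d_min < d_max.
    """
    gap = p_local - p_out
    share = (d_max - gap) / (d_max - d_min)
    # Clamp to [0, 1]: below d_min everyone stays local, above d_max no one does.
    return min(1.0, max(0.0, share))


def local_demand(p_local, p_out, n_immobile, n_mobile, d_min, d_max, a=10.0, b=1.0):
    """Total local demand with linear per-consumer demand a - b * p_local."""
    per_consumer = max(0.0, a - b * p_local)
    mobile_local = n_mobile * share_shopping_locally(p_local, p_out, d_min, d_max)
    return per_consumer * (n_immobile + mobile_local)


# Hypothetical numbers: out-of-town price 4, travel costs uniform on [2, 4].
all_local = share_shopping_locally(5.0, 4.0, 2.0, 4.0)    # gap below d_min
half_local = share_shopping_locally(7.0, 4.0, 2.0, 4.0)   # gap inside [d_min, d_max]
none_local = share_shopping_locally(9.0, 4.0, 2.0, 4.0)   # gap above d_max
demand_mid = local_demand(7.0, 4.0, 100, 100, 2.0, 4.0)
```

Under these assumptions the local demand curve has three segments, depending on whether the price gap lies below, within, or above the range of travel costs. A rise in the cost of shopping out of town corresponds to raising `d_min` and `d_max`. Replacing the uniform distribution with another one, as the text suggests, would only change `share_shopping_locally`.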
The present model is silent on the question of product variety; intuitively, tipping to the low-price local equilibrium, involving as it does both higher per-retailer sales volumes and the entry of new retailers, should lead to increased variety. Would this be borne out in a Dixit–Stiglitz (1977)-type variant of the present model? The model should be empirically relevant where two conditions apply: first, where non-automobile-oriented shops are located in a neighborhood in which both car-owning and non-car-owning households are a significant presence; second, where these retailers are close enough to the tipping point hypothesized here that small changes in the real cost of shopping out of town could cause a step-change in equilibrium prices and capacity utilization. For the hypothesized distributional effect to be important, the neighborhoods in question would have to be significantly heterogeneous as regards household wealth, and have wealth positively correlated with car ownership. While the model appears to be applicable to many European, East Asian, and Australasian cities, the effects predicted in the model need empirical verification and measurement.
Acknowledgments. I am grateful to Philip McCann for his invaluable advice and encouragement, and to Maureen Kilkenny for her comments on an early version of this chapter.
References
Alonso W (1964) Location and land use. Harvard University Press, Cambridge
Atzema O, Rietveld D, Shefer D (2005) Regions, land consumption, and sustainable growth. Edward Elgar, Cheltenham
Dixit A, Stiglitz J (1977) Monopolistic competition and optimum product diversity. Am Econ Rev 67:297–308
Evans AW (1999) The land market and government intervention. In: Cheshire P, Mills ES (eds) Handbook of regional and urban economics. North-Holland, Amsterdam
Green DL, Jones DW, Delucci MA (1977) The full costs and benefits of transportation. Springer, Heidelberg
Office of the Deputy Prime Minister, UK (1999) Policy Planning Guidance 6
Quinet E, Vickerman RW (2004) Principles of transport economics. Edward Elgar, Cheltenham
Ruth M (2006) Smart growth and climate change. Edward Elgar, Cheltenham
Subject Index
a accumulation of deficits 123 active fiscal policy rules 113 active monetary policy rule 113 Adaptive inflation climate 3 Adjustment costs 55 agglomeration 220 aggregate government debt 112 aggregate wealth 111 area-base model 249
b balance of payments adjustment process 107, 121 balance of payments 102, 110 Barro 130 Benhabib 134 bialternate product 163, 167 bifurcation value 137 bounded rationality 3 Bayesian spatial autoregressive model 188 Bayesian spatial Durbin model 189 Bayesian spatial error model 186, 187 budget dynamics 123 budget equations 109 business purposes 231
c Cagan money demand function 114 Cagan-type real-money demand function 111
capital account 104 capital accumulation equation 130, 131, 135, 140 capital flows 103 capital mobility 105 car commuter proportion 235 car-commuter free-parking taxation 238 central bank 103 centrifugal forces 123 characteristic equation 152 charge-collection costs 228 city size 265, 267 clustering 209 Cobb–Douglas utility function 147 coefficients 116 commercial parking market 229 common pool resources (CPRs) 240 commuter railway 265 commuters’ free or employer-subsidized parking 237 commuting 229 competition 286 competitive market 295 competitive retail market 294 congestion charge systems 238 constant returns to scale 267 consumers 284 cooperation 209, 214 corridor stability 130, 135, 136, 141 cost of replanting and maintenance 243 creativity 208 criteria of simple Hopf bifurcations 169
criterion of simple Hopf bifurcations 161, 166, 171 critical value of the delay 156 cumulative instability 121 current account 103 current account surpluses 110
d 3D dynamics 115, 117 data 213 database 211 debt 112 debt accumulation 83 debt–creditor relationship 125 delay in production 145 delayed system 152 demand curve for car commuting 235 dependent variables 254 destabilizing effect 151 destabilizing feedback channel 120 differentiation coefficient of probability 257 discontinuous policy function 55 disposable income 105, 113 distance elasticity of the cost 271 distribution, real income 298, 304 diversity 220 Domar 129, 130 domestic economy 124 domestic wealth 104 dynamic model of renewable resources and population 147 dynamics of temporary equilibrium 148
e economic efficiency 294, 295 economic externality 252 economic view 130 economy of scale 279 education production function 176, 180–182, 184, 186, 189, 190, 193, 195, 200 efficient Pareto conditions 253 emissions in urban transportation 265 employee demand 234
employee parking spaces 232 employees 236 employer 236 endogeneity 184, 185 energy conservation 265 entrepreneurship 208 environmental costs 297 environmental degradation 145 environmental externalities 298, 305 environmental issues 265 EPCP income 132, 136, 137 establishment 211 expectation formation 83 expected per-capita permanent income see also EPCP income 131 explanatory variables 254 explicit costs 234 external debt position 101 externalities 178, 179
f face-to-face 207 fertility function 148 finance 228 financial stocks 102 fiscal and monetary policy 117, 123 flow budget equations 102 foreign economy 124 foreign exchange market policies 109 Forest Law in Japan 240 forest policy in Japan 245 forest protection system 239 free-entry competition 286 free-entry equilibrium 286 Friedman 131, 132 frontier price 288, 289 Fuller criterion 161, 164, 166, 170
g Gabisch 134 generalized Tobin model 170, 171 geographical 211 global dynamics 55 government budget constraint 105, 120, 123 government budget restriction 100 government debt 121
government expenditure 122 government expenditure shock 107 Greenhut-Ohta equilibrium 289 growth cycle 29 Guckenheimer 134, 140, 142
h Harrod 129, 130 Harrod’s knife-edge 29 harvesting rotation age 250 high-dimensional dynamic Keynesian model 83 high-technology 210 Hirsch 136, 139 Holmes 134, 140, 142 Hopf bifurcation 29, 133, 134, 137, 168 Hopf bifurcation theorem 133, 140, 142 Hopf bifurcations 165, 166 Hopf cycle 135–138, 140 Hopf cycles 131 Hotelling-Smithies equilibrium 289, 295 hysteresis 136, 140
i imperfect competition 29 independence from irrelevant alternatives 263 inner city of Stockholm 235 innovation 206, 223 instability 115, 124 integrated dynamics 114 interest-rate elasticity 116 inter-firm 214 IS-restriction 109
j Jansson and Wall (2002) Stockholm study 227 Jansson and Wall study 229 jump variable technique 100, 106, 107, 115, 118
k Keynesian legacy 83 kinked equilibrium 265, 267
knife-edge and bifurcation 40 knowledge spillovers 206 Kuznets 138
l labor market 233 labor mobility 210 land-use decision model 255 land-use policies 305 Leijonhufvud 130, 134 Leijonhufvud’s 135 Liénard–Chipard criterion 166 Lines 136 Liu criterion 165, 166, 170 LM curve 109, 111 local 215 local asymptotic stability 117 local referendum 228 local unit 212 locally asymptotically stable 119, 152 location 207 logistic function 149 long-run dynamics 154 Lorenz 134, 135 Löschian equilibrium 289, 292, 295 loss of stability 156 Lotka–Volterra predator–prey model 149
m Maastricht Treaty 121 Malthusian population dynamics 148 Mankiw 130 man-made forest 241 manufacturer 284 manufacturer’s mill price 285 manufacturer’s profit 291, 292 market size 287, 288 market structure 288 maximum reservation price 284, 290 median voter 178, 179 Medio 136 mill price 293, 294 mill price negotiations 294 Miyao 134
model for small island 146 monetary shock 106 money supply 120 money supply adjustment rule 101 monopolistic competition equilibrium 267 monopolistic competition 267, 298, 299, 301, 303 monopoly equilibrium 267 motorization 227 multinomial logit model 254, 257 multiple steady states 55 Mundell–Fleming–Tobin model 103
n Nash equilibrium 251 nation-wide forest plan 246 natural forest 241 nearly constant propensity 138 Nearly Constant Propensity to Consume 137 negative externality 253 negotiating powers 291 neoclassical aggregate model of growth see also “S–S model” 130 neoclassical growth model 140 neoclassical steady state 130 neoclassical view of the economy see also “economic view” 140 nested logit model 261 net salary 230 new Keynesian theory 118 New-Keynesian Phillips curve 3 nominal adjustments 118 nonexcludability 240 nonlinear dynamic system 154 nonlinearities 108 nonprofit organizations 230 nonrandom 232 number of retailers 285, 293 numerical simulations 155
o oligopolistic 208 oligopolistic market 292, 294, 295 oligopolistic situation 289 OLS methods 254 omitted variable bias 187, 188, 199
open economies 108 open market operations 103 optimal market size 289 oscillatory dynamics 151 outside good 267 Owase 134, 135 ownership of land 277
p Pareto equilibrium 253 park and ride 265 parking 305 parking cost 235 parking policies 237 parking taxation 228 perfectly open economies 101, 119 periodic paths see also “Hopf cycle” 131 permanent component of consumption 132 permanent implementation of congestion charges 228 permanent income hypothesis see also “PI hypothesis” 130 permanent income hypothesis 140 Phillips curve 113 PI hypothesis 130 policy rules 124 politicians 236 pooled data 214 population growth 145 positive externality 253 price conjectural variation 286, 287 price conjecture 288 private and government wealth 105 private disposable income 110, 114 private firms 230 private schools 180, 181, 185, 190, 199, 200 probability of choice 255 probit 213 process 213 process innovation 40 product 213 product cycle 208 product innovation 40 proficiency test 182, 190, 192, 195 profit-maximizing mill price 287, 288 profit-maximizing retail price 293
public and private schools 177 public authorities 230 public benefits of forest 240 public debt 110 public regulation 279 public schools 180, 181, 185, 190, 199, 200 purchasing power parity condition 100 Putty–clay technology 29
q quality competition 242 quantity demanded 284
r R&D 210 R&D cooperation 223 real interest 113 real interest flows 104 real rate of return 106 real wealth 106 redistribution 304 regional forest plan 246 relations 220 rest of the world 124 restriction on property rights 247 restrictions on the protected forest 248 retail market formation 294 retail market structure 283, 291 retail price 287 retailer’s profit 294 retailers 284 revitalization of the forestry 244 rise and fall of Easter Island 146 rivalry 240 road pricing 229, 238 Robinson 136 Romer 138 Rosser 134 Routh–Hurwitz criterion 161, 162, 164, 166 rush-hour traffic 237
s Sala-i-Martin 130 scale economies 182 scale economy 267
school 233 school competition 176, 177, 180, 181, 190, 193, 195, 199, 200 school inputs 180, 182, 192, 193, 195 school vouchers 176, 199, 200 schooling production 178, 179 Schaefer harvesting production 147 simple Hopf bifurcation 161, 167–170 simple Hopf bifurcations 165 simulations with a production delay 154 size 215 Smale 136, 139 social optimum 267, 277 social planner 252 Solow 129, 137 spatial efficiency 294, 295 spatial retail market 283 spatial statistics 177, 185–189, 192, 193, 199 specie-flow mechanism 101, 107 spillovers 178, 179, 185, 188, 189, 192, 194, 195, 199, 200 S–S model 131 S–S model see also “neoclassical growth model” 130 stability 119 stability criteria 163 stability criterion 161, 162, 166 stability index 137, 142 stability properties 115 stabilization policy 83 stable dynamics 156 stationary points 108 steady state 118, 119, 121 stock-flow relationships 100 strictly enforced 231 stumpage value 240 subsidies 265 subsidize the railway sector 281 subsidy 267 suburbs 230 Survey of Professional Forecasters 3 surveys 221 Swan 129, 137
t tax policy 101 temporary equilibrium 148
Temporary Measures Law for the Development of Protective Forest 245 tolling 228 total wealth 112 traffic congestion 237 transitory component of consumption 132 transparency 209 transport cost per mile 284 transport infrastructure investment 237 travel cost 299 travel costs 298, 302, 304, 306 turnover 215 twin deficits 102, 116 Two-Country Case 124 two-country world 125
u uncovered interest parity 99 unstable adjustment process 108 unstable mechanisms 125 urban transportation 265
v value added 182 variety 207
w wealth 114 weight matrix 186 working day 231 workplace 233