Spatial Economic Analysis, Vol. 1, No. 1, June 2006
Editorial

There has been a decisive shift over the past 15 years. Economic geographers have renewed their interest in quantitative analysis and the application of economic analysis to geographical problems. Economists have rediscovered space. To only a limited extent can one attribute this radical change to the major changes in the external world over recent years. Yes, trade has increased relative to world GDP, as have international flows of capital and people, while information technology has reduced the friction of space, and institutional trade barriers are being continuously reduced with the growth of free trade areas. It is arguable, therefore, that the importance of cities and regions has increased as the role of nation-states has been eroded. The real change, however, has been in our ability to analyse these essentially spatial phenomena. We see the role of this new journal, Spatial Economic Analysis, as contributing to yet further improvement in this ability and providing a lively forum to stimulate yet more geographical economists, economic geographers, regional scientists and mainstream economists to engage with spatial economics and spatial modelling in economics.

This is not intended to be a journal devoted exclusively to the ‘New Economic Geography’ (NEG). Major insights into economic geography and spatial economics have arisen from the work of authors with different analytical backgrounds, such as Allen Scott, Michael Porter and Edward Glaeser. But our judgement is that the convenient marker of the shift we refer to was the publication of Paul Krugman’s 1991 classic ‘Increasing returns and economic geography’. Why can this article be called a ‘classic’? After all, in many ways it said nothing new. The foundation of Kaldor’s (1970) analysis of regional growth was the existence of increasing returns to scale, leading to economies of agglomeration, and this general idea can be traced back at least to Myrdal (1957). But Krugman approached the issue in an entirely different way. His concern was to have a theory of regional growth processes that built in an entirely consistent way on modern economic analysis, resting on the rigorous Dixit & Stiglitz (1977) model of monopolistic competition with increasing returns and explicit microeconomic foundations. Conceptually, the behaviour of Krugman’s regional economies (pretty spaceless places, it must be acknowledged) was built up from individual consumers and firms. Despite its limitations, this laid the foundations for reuniting urban and regional economics with the economics mainstream. Economic geographers, regional scientists and applied urban and regional economists had always known that assuming away transport costs, economies of scale and agglomeration meant losing the essence of the world in which they were interested. But they lacked the intellectual rigour of modern economic theory.

The step change in our ability to include space in our analysis does not result just from the fact that the NEG had microeconomic foundations and so was recognizably part of the modern mainstream of economics; it is also because modern computing power allowed one to analyse microdata sets. The article by Anselin & le Gallo (2006) in this issue uses a data set with 115,000 observations, and the use of even bigger data sets is now common. Before 1991 the models and the analysis of geographical economists were mainly formulated with ‘regions’ as actors.
DOI: 10.1080/17421770600734001
We can now at least envisage the prospect not only of building our regional models on the basis of individual actors, whether firms or households, with their particular constraints, institutions and circumstances, but also of testing such models against data sets built up of such actors. This is so different from work undertaken in the past that we can almost think of it as a new field, spatial economics, with its counterpart spatial econometrics. At the present time these are developing symbiotically to provide the intellectual framework and quantitative tools needed to understand and analyse the long-neglected spatial dimension of economic life.

We emphatically do not wish to disown our intellectual roots. Much can be learned from the work of economic geographers who, for example, modelled flows between areas, whether of traffic or shopping trips, trade or migration, formalized in the entropy-maximizing models developed by Alan Wilson (1970). Additionally, spatial interaction is embodied within cross-sectional regression analysis, for example of employment level or growth rate variations between locations, as ‘spillover’ effects between locations. There have been many contributions to quantitative geography that have exploited these themes. Spatial econometrics, with its stronger ties to econometrics and regional science, emerged from the work of Paelinck & Klaassen (1979) and Anselin (1988), but, as the paper in this issue by Pinkse et al. (2006) shows, spatial econometrics has developed rapidly and powerfully.

The NEG has begun to generate applied work, such as Davis & Weinstein (1999) and Rice & Venables (2003) on regional specialization, or Venables & Rice (2004) attempting to estimate agglomeration economies. But another strand of the NEG literature has taken on board the notion of spatial interaction modelling to analyse trade flows between countries, since trade costs are an all-important aspect of the typical NEG model. However, the theory has yet to mature, with various spin-offs and alternatives on offer, and with the full array of real effects, for example technological externalities (Gordon & McCann, 2000), not yet fully integrated. We hope that we will be publishing papers in Spatial Economic Analysis that will assist this progress towards a more useful NEG theory, in particular with respect to policy implications.

Adopting a different perspective, the literature on Industrial Organization (IO) has been developing economic spatial models since the classic paper by Hotelling (1929). Departing from the standard location problems, this literature involved generalizing the application of spatial models through the concepts of economic space or economic distance, with the space of product characteristics being its typical example. In so doing the IO literature enables spatial models to be used to frame a much wider range of problems in which proximity effects are relevant in many different ways (e.g. political space, regulatory space, commercial space, industrial structure space, trade openness space). These developments have been surveyed in the book by Greenhut et al. (1987), and recent applications are starting to mount. Spatial Economic Analysis also welcomes research following this tradition.

This wider conceptualization of economic space has been adopted in the estimation of cross-sectional (and panel data) models. Prominent in this strand of the literature has been the work of Conley (1999), Pinkse et al. (2002) and Slade (2006). Pinkse et al. (2002), for example, develop models of price competition among firms in which different assumptions about the spatial extent of price competition are embodied within different conceptualizations of economic distance. At one extreme one has markets where competition is very local, while at the other competition may be global, so that all firms compete with all others.
The aim of Spatial Economic Analysis is to provide a focal point for the emerging field of spatial economics, ranging from economic geography and spatial econometrics to regional science, and also including spatial modelling in IO. It is in essence an economics journal, in that each of the Editors has one or more degrees in economics and undertakes research in an economics-led grouping; however, all of us are spatially inclined economists, and several of us have also been trained as geographers, so we are equally open to high-quality empirical and theoretical contributions from geographers and other social scientists. Moreover, our diversity ensures that no one type of theory or methodology will dominate the journal; it is open to all who can advance the frontiers of knowledge of spatial economic phenomena.

A good indication of the type and quality of papers we intend to publish is provided below. Most of the papers published in this inaugural edition were commissioned. We plan to publish refereed academic papers selected according to their intellectual quality in future issues. Our experience thus far has been that research of the highest quality is being carried out by academics in the fields we are interested in. We hope that this trend continues and that we can make this journal a leader in its field. We will certainly ensure that excellent and helpful referees are appointed, and that we achieve fast turnaround of articles.

It is within this context of an emerging field of spatial economics that we consider the first paper in the new journal (Patuelli et al., 2006), which develops a set of neural network models to obtain short-term forecasts of employment in German regions. Neural networks, which are becoming increasingly popular in regional science and economics, provide a means of avoiding some of the problems associated with the application of standard econometric methods, such as the choice of functional form and the appropriate set of regressors. What neural networks do is to allow learning from the data, and they therefore have an inherent and commendable flexibility. They are also open to criticism, however, as the paper points out, because they are essentially data, rather than theory, driven. The paper usefully reviews the application of neural networks, and combines it with shift-share analysis, bringing us right up to date with recent developments related to this standard technique, which has stood the test of time over half a century. The paper uses these state-of-the-art methods to achieve its aim, namely employment forecasts for 439 NUTS 3 regions of Germany.

The second paper, by Anselin & le Gallo (2006), gives an authoritative analysis of spatial house price variation in Southern California via spatial econometric modelling. The data set is large, by any standards, amounting to a sample of over 115,000 houses, and one of the novel aspects of the paper is the use of spatially interpolated air quality measures as a covariate. One of the issues in this type of hedonic modelling is how well the model captures all of the covariates that one might reasonably consider to be important, and to what extent the presence of spatially autocorrelated residuals reflects omitted spatially autocorrelated variables, such as amenities and local public goods (Cheshire & Sheppard, 2004). The authors capture this residual dependence using an endogenous spatial lag, in which prices are affected by prices in nearby locations.
Inevitably, with spatial data, strong residual autocorrelation persists regardless of the complexity of the structural model; hence the need for spatial econometric analysis. State-of-the-art estimation techniques are used to handle the very large data set, including the spatial heteroscedasticity and autocorrelation consistent (HAC) estimator of Kelejian & Prucha (2005).
The third paper, by Pinkse et al. (2006), is at the forefront of recent developments in spatial economics involving economic space, discrete-choice analysis and spatially and temporally dependent data. Discrete choices introduce heterogeneity, and there are usually problems associated with endogeneity and measurement error. When these are present in a panel data analysis, consistent estimation is a major computational challenge. In the paper a new GMM estimator is used to overcome these problems. In the application, the discrete choices are whether a copper mine should be operational or lie idle, expressed as a function of prices, costs, mineral reserves, capacity, output, technology, and a temporally lagged dependent variable, for a panel of 21 Canadian mines. This provides evidence in favour of a mean/variance utility model, in which the decision maker is risk averse and there is a trade-off between the mean and the variance of returns, rather than the theory of real options, which supposes that the effects of volatility vary with the prior state.

The fourth paper, by Robert-Nicoud (2006), is an example of the burgeoning theory of New Economic Geography. In the paper Robert-Nicoud proposes a ‘New Trade, New Economic Geography’ model, which nests an NEG model and a new trade model as special cases. The hybrid combines agglomeration mechanisms due to input-output linkages with centripetal and centrifugal forces due to the presence of trade costs. The paper demonstrates the welfare implications of the NEG model, showing that agglomeration, with all manufacturing concentrated in a single region, can Pareto-dominate dispersion because strong input-output linkages are passed on to consumers everywhere in the form of low prices. With such theoretical advances there is an intensifying challenge related to empirically testing, and not simply calibrating, the NEG models, so that the wider and different perspectives offered by alternative or competing theories can be more fully accommodated (Head & Mayer, 2004; Fingleton, 2005a, b, 2006).

The fifth paper, by Ballas et al. (2006), illustrates the insights that can be obtained using spatial microsimulation, which builds on the concepts and achievements of the Leeds school as embodied in the work of Alan Wilson, highlighted at the outset. In their paper they develop a spatial microsimulation model for the Leeds local labour market in order to estimate the effects of a shock to the local economy. The advance shown in this paper on the previous literature is the capturing of multiplier effects as they cascade through local economic space. The paper shows the diversity of impacts that ensue, initially to jobs and incomes, but also to the retail sector and local taxes. These impacts in turn affect employment and incomes further, and so on. The approach is a highly practical, policy-oriented exercise in applied social science, which considers the important welfare consequences of sudden plant closure in a local area. It makes use of readily available data, such as small area statistics from the census, the British Household Panel Survey, the National On-line Manpower Information System (NOMIS), etc. They conclude by looking at the future potential for this simulation methodology: at how it can answer counterfactual or ‘what if’ questions, and how it might integrate more closely with input-output approaches and with the methodologies more closely allied to spatial econometrics.
The question for this approach is this: can it be enhanced by some of the new economic theory that is currently being developed, and if so, how?

The Editors
References

Anselin, L. (1988) Spatial Econometrics: Methods and Models, Dordrecht, Kluwer.
Anselin, L. & le Gallo, J. (2006) Interpolation of air quality measures in hedonic house price models: spatial aspects, Spatial Economic Analysis, 1, 31-52.
Ballas, D., Clarke, G. & Dewhurst, J. (2006) Modelling the socio-economic impacts of major job loss or gain at the local level: a spatial microsimulation framework, Spatial Economic Analysis, 1, 127-146.
Cheshire, P. & Sheppard, S. (2004) Capitalising the value of free schools: the impact of supply characteristics and uncertainty, Economic Journal, 114, F397-F424.
Conley, T. (1999) GMM estimation with cross sectional dependencies, Journal of Econometrics, 92, 1-45.
Davis, D. R. & Weinstein, D. E. (1999) Economic geography and regional production structure: an empirical investigation, European Economic Review, 43, 379-407.
Dixit, A. K. & Stiglitz, J. E. (1977) Monopolistic competition and optimum product diversity, American Economic Review, 67, 297-308.
Fingleton, B. (2005a) Towards applied geographical economics: modelling relative wage rates, incomes and prices for the regions of Great Britain, Applied Economics, 37, 2417-2428.
Fingleton, B. (2005b) Beyond neoclassical orthodoxy: a view based on the new economic geography and UK regional wage data, Papers in Regional Science, 84, 351-375.
Fingleton, B. (2006) The new economic geography versus urban economics: an evaluation using local wage rates in Great Britain, Oxford Economic Papers (in press), Advance Access published 3 April 2006.
Gordon, I. & McCann, P. (2000) Industrial clusters: complexes, agglomeration and/or social networks?, Urban Studies, 37, 513-532.
Greenhut, M., Norman, G. & Hung, C. (1987) The Economics of Imperfect Competition: A Spatial Approach, Cambridge, Cambridge University Press.
Head, K. & Mayer, T. (2004) The empirics of agglomeration and trade, in: V. Henderson & J.-F. Thisse (eds) The Handbook of Regional and Urban Economics, Vol. IV, pp. 2609-2665, Amsterdam, North-Holland.
Hotelling, H. (1929) Stability in competition, Economic Journal, 39, 41-57.
Kaldor, N. (1970) The case for regional policies, Scottish Journal of Political Economy, 17, 337-348.
Kelejian, H. H. & Prucha, I. R. (2005) HAC estimation in a spatial framework, Journal of Econometrics (forthcoming).
Krugman, P. R. (1991) Increasing returns and economic geography, Journal of Political Economy, 99, 483-499.
Myrdal, G. (1957) Economic Development and Underdeveloped Regions, London, Methuen.
Paelinck, J. & Klaassen, L. (1979) Spatial Econometrics, Farnborough, Saxon House.
Patuelli, R., Reggiani, A., Nijkamp, P. & Blien, U. (2006) New neural network methods for forecasting regional employment: an analysis of German labour markets, Spatial Economic Analysis, 1, 7-30.
Pinkse, J., Slade, M. E. & Brett, C. (2002) Spatial price competition: a semiparametric approach, Econometrica, 70, 1111-1153.
Pinkse, J., Slade, M. & Shen, L. (2006) Dynamic spatial discrete choice using one-step GMM: an application to mine operating decisions, Spatial Economic Analysis, 1, 53-99.
Rice, P. & Venables, A. J. (2003) Equilibrium regional disparities: theory and British evidence, Regional Studies, 37, 675-686.
Robert-Nicoud, F. (2006) Agglomeration and trade with input-output linkages and capital mobility, Spatial Economic Analysis, 1, 101-126.
Slade, M. E. (2006) The role of economic space in decision making, Annales d'Economie et de Statistique (forthcoming).
Venables, T. & Rice, P. (2004) Spatial Determinants of Productivity: Analysis for the Regions of Great Britain, Discussion Paper No. 4527, CEPR, London.
Wilson, A. G. (1970) Entropy in Urban and Regional Modelling, London, Pion.
Spatial Economic Analysis, Vol. 1, No. 1, June 2006
New Neural Network Methods for Forecasting Regional Employment: an Analysis of German Labour Markets
ROBERTO PATUELLI, AURA REGGIANI, PETER NIJKAMP & UWE BLIEN (Received December 2005; revised January 2006)
ABSTRACT In this paper, a set of neural network (NN) models is developed to compute short-term forecasts of regional employment patterns in Germany. Neural networks are modern statistical tools based on learning algorithms that are able to process large amounts of data. Neural networks are enjoying increasing interest in several fields because of their effectiveness in handling complex data sets when the functional relationship between dependent and independent variables is not specified explicitly. The present paper compares two NN methodologies. First, it uses NNs to forecast regional employment in both the former West and East Germany. Each model implemented computes single estimates of employment growth rates for each German district, with a 2-year forecasting range. Next, additional forecasts are computed, by combining the NN methodology with shift-share analysis (SSA). Since SSA aims to identify variations observed among the labour districts, its results are used as further explanatory variables in the NN models. The data set used in our experiments consists of a panel of 439 German (NUTS 3) districts. Because of differences in the size and time horizons of the data, the forecasts for West and East Germany are computed separately. The out-of-sample forecasting ability of the models is evaluated by means of several appropriate statistical indicators.
Roberto Patuelli (to whom correspondence should be sent) and Peter Nijkamp, Department of Spatial Economics, Free University of Amsterdam, The Netherlands. Aura Reggiani, Department of Economics, University of Bologna, Italy. Uwe Blien, Institut für Arbeitsmarkt- und Berufsforschung (IAB), Nuremberg, Germany. The authors wish to thank Professor Günter Haag (STASA, Frankfurt) for kindly providing data on commuting flows. The first author also thanks Professor Kingsley Haynes for a useful discussion of SSA.
DOI: 10.1080/17421770600661568
Nouvelles Méthodes de Prévisions Fondées sur les Réseaux Neuronaux Appliquées à l'Emploi Régional: Une Analyse des Marchés du Travail dans l'Allemagne Réunifiée

RÉSUMÉ Dans cet article, les auteurs ont développé une série de modèles utilisant les réseaux neuronaux (RN) pour calculer des prévisions à court terme des paramètres de l'emploi, par région allemande. Les RN sont des outils statistiques modernes fondés sur des algorithmes d'apprentissage, capables de traiter de grandes quantités de données. On s'intéresse de plus en plus aux RN car ils permettent de gérer efficacement des séries de données complexes, lorsque la relation fonctionnelle entre les variables dépendantes et indépendantes n'est pas définie explicitement. Cet article compare deux méthodologies fondées sur les RN. D'abord, il utilise les RN pour prévoir l'emploi régional dans les deux régions anciennement appelées Allemagne de l'Ouest et Allemagne de l'Est. Chaque modèle réalisé calcule de simples estimations des taux de croissance d'emploi pour chaque district allemand, sur une durée de 2 ans. Puis, il calcule des prévisions complémentaires, en combinant la méthodologie RN avec une analyse shift-share (ASS). Comme l'ASS a pour but d'identifier les variations relevées sur le marché local du travail, on emploie les résultats obtenus comme variables indépendantes complémentaires dans les modèles RN. Notre échantillon de données utilisé dans nos expériences se compose de 439 districts allemands. Comme les districts composant l'échantillon présentent de grandes différences en matière de taille et d'horizon temporel, les prévisions pour l'Allemagne de l'Ouest et l'Allemagne de l'Est sont calculées séparément. La capacité des modèles à établir des prévisions hors échantillon est évaluée avec différents indicateurs statistiques appropriés.

Nuevos métodos de redes neurales para la previsión de empleo regional: un análisis para los mercados laborales de Alemania

RESUMEN En este documento desarrollamos una serie de modelos de redes neurales (RN) para calcular las previsiones a corto plazo de los modelos de empleo regional en Alemania. Las RN son modernas herramientas de estadísticas basadas en algoritmos de aprendizaje capaces de procesar un gran número de datos. Las RN se están popularizando cada vez más en diferentes campos porque son capaces de manejar grupos de datos complejos cuando la relación funcional entre las variables dependientes e independientes no está explícitamente especificada. En este artículo comparamos dos metodologías de RN. Primero, utilizamos las RN para pronosticar el empleo regional en Alemania del oeste y del este. Cada modelo aplicado computa por separado los cálculos de las tasas de crecimiento de empleo para cada distrito alemán, con un intervalo de previsión de 2 años. Luego se calculan las previsiones adicionales combinando la metodología de las RN con el análisis shift-share. Dado que los análisis shift-share identifican las variaciones observadas entre los distritos laborales, sus resultados se utilizan como otras variables explicativas en los modelos de RN. El grupo de datos utilizado en nuestros experimentos abarca un panel de 439 distritos alemanes. Las previsiones para Alemania del oeste y este se computan por separado debido a las diferencias en los horizontes de tamaño y tiempo de los datos. La capacidad de previsión a partir de las muestras en los modelos es evaluada mediante varios indicadores adecuados de estadísticas.

KEYWORDS: Neural networks; forecasts; regional employment; shift-share analysis; shift-share regression
JEL CLASSIFICATION: C23, E27, R12
1. Introduction

The need for accurate forecasts of modern socio-economic (regional and national) systems has been growing in recent years. Most economic interventions, such as the distribution of federal or EU funds, require adequate policy preparation and analysis, usually made well in advance, and, often, at a disaggregated level. In this context, an emerging problem is the increasing level of disaggregation for which economic data are collected, and, hence, the imbalance between the number of disaggregated (regional) figures to be forecasted and the quantity of observations (usually years) available.

Although conventional econometric techniques can be useful in this respect (see, for example, Bade, 2006), it is well known that, in addition to the many constraints and hypotheses that these econometric models have to cope with, such as the use of fixed regressors, the choice of model specification, and, most importantly, of the explanatory variables to use, is crucial. An alternative approach, able to overcome some of these limitations, such as the choice of model and functional variables, especially in the framework of short-term forecasts, is provided by neural networks (NNs), a family of non-linear statistical optimization methods, which can provide a means of overriding some
such restrictions (see, for example, Cheng & Titterington, 1994). The NNs' capacity to learn from the data, and to find functional relationships among variables, makes it possible to forgo strict statistical assumptions and specification problems, and to process data by means of a flexible statistical tool.

The present paper is concerned with the use of NNs in order to forecast regional employment change. Employment data are necessary in economic and regional policy analysis. Pension systems, social security reforms and annual policy-making tasks, such as the establishment of budget allocations, require detailed employment forecasts. The case study under analysis is the evolution of labour markets in Germany. In particular, our NN experiments focus on short-term employment forecasts, that is, forecasts for 2 years ahead. The paper describes a set of NN models developed with this aim in mind, and reviews the validation process and the statistical results of the NN models, which are evaluated for various test years.

The aim of our experiments is not the use of NNs in itself, since nowadays NNs are used widely in different research fields, but the exploration of the NNs' ability to forecast changes in economic variables in a panel data framework. While applications of NNs to time series, or to other pattern-recognition settings, are rather frequent, contributions on NNs dealing with panel data are limited (see, for example, Lin, 1992). The high number of cross-sections in the data under analysis and the limited number of years for which information is available are a problematic issue for conventional econometric techniques. Herein lies the rationale underlying our methodological choice of NN techniques.

A novel part of this paper is the incorporation of shift-share analysis (SSA). We will introduce several variants of SSA, including some modern specifications, known as spatial shift-share and shift-share regression (SSR). This class of methods will be integrated with the NN methodology employed in our paper. This may provide an interesting balance between a data-driven technique and a solid, well-known research method.

The paper is organized as follows. Section 2 briefly illustrates NN theory, as well as the criteria to be used in the validation of its results. Then, Section 3 introduces various classes of shift-share techniques. Section 4 describes the data used in our experiments. Section 5 first explains the practical steps in the implementation of the NN models, and, subsequently, reviews the statistical results of the empirical application, which aims to estimate employment variations in the former West and East Germany for the year 2003. A new contribution to NN analysis is offered by embedding SSA components. The results of the NN models, comprising the NN models embedding SSA components, are evaluated by means of appropriate statistical indicators and map visualizations. Finally, Section 6 offers some conclusions and sets future research directions.

2. Neural Network Models for Analysis and Forecasting

2.1. Neural Networks as a Statistical Optimization Tool

Neural networks, sometimes also called 'artificial neural networks' in order to differentiate them from actual biological networks, are optimization algorithms whose main characteristic is the ability to find optimal goodness-of-fit solutions when the relationships between the variables are not fully or explicitly known, or when only a limited knowledge of the phenomenon examined is available.
While traditional statistical models require an identification process for the set of regressors
employed, as well as a specification of the relationship between dependent and independent variables, these steps are not necessary in NNs. Their no-modelling hypothesis could be considered a drawback in this regard because of the lack of theoretical economic (or behavioural) interpretation, which forces the analyst to accept the data-driven results of the NN models 'as they are'. On the other hand, the limited possibilities of interpretation of the results are less relevant when the aim is, as in our case, to produce forecasts rather than to explain the relationships between the driving factors. In addition, NNs are also more robust against statistical noise, since they store redundant information. In contrast with conventional statistical techniques, NNs do not efficiently process categorical variables when these have many 'values', while there is no set of unifying and optimal NN models. As a consequence, the performance of NNs is dependent on the implementation carried out by the analyst.

Because of their relatively simple application, NNs are attractive in various fields of socio-economic application. Reviews of NNs used in several fields can easily be found in the literature. Many examples could be listed, as well as academic journals entirely dedicated to NN-related studies. A very concise and non-exhaustive selection of these is shown in Table 1. Generally, it should be underlined that NNs enjoy great scalability properties, as they can be applied to problem solving in practically any area of application.

Although NNs have sometimes been referred to as a 'black box' approach, they are definitely not such an obscure tool. The internal functions that process the different information inputs are, of course, selected by the analyst, as well as the algorithms that determine the direction and the degree of interaction of the factors during the computation process. As a matter of fact, NNs are often compared with conventional statistical methods, such as generalized linear models or simple regressions, in the light of an integrated utilization of all these methodologies. This kind of literature is now extensive and diverse (see, among others, Cheng & Titterington, 1994; Swanson & White, 1997a, b; Baker & Richards, 1999; Sargent, 2001), covering different fields. For example, Nijkamp et al. (2004) compared NNs with logit and probit models in an analysis of multimodal freight transport choice. In the labour market field, previous works by Longhi et al. (Longhi, 2005; Longhi et al., 2005a, b) should be cited, particularly for their use of panel and cross-sectional data instead of time series. Neural networks have also been shown to be equivalent, in the case of binary choice, to a logit model (Schintler & Olurotimi, 1998).

Table 1. Some illustrative reviews of neural network (NN) applications in different fields; NN journals

Field: Authors
Atmospheric sciences: Gardner & Dorling (1998)
Business and finance: Wong et al. (1997); Wong & Selvi (1998); Chatterjee et al. (2000)
Classification of medical data: Dreiseitl & Ohno-Machado (2002)
Environmental modelling: Maier & Dandy (2000); Shiva Nagendra & Khare (2002)
Medical imaging and signal processing: Miller et al. (1992)
Transportation: Himanen et al. (1998)

Journals: Neural Computing & Applications; Neural Computing Surveys; Neural Networks; Neural Processing Letters
Generally, we can define a set of rules for the evaluation and comparison of NNs, which we derive from Collopy et al. (1994):

- Comparison with widely accepted 'conventional' models. Forecasts from the NN models should be at least as accurate as those generated by a naïve extrapolation, such as a random walk.
- Testing of the models' out-of-sample performance. The results of out-of-sample forecasts should be used in comparing different methodologies.
- Use of a satisfying sample size. The size of the sample has to allow for statistical inference.

As can be seen later on in the presentation of an empirical application, these three rules are respected in our experiments. In addition to these general validation guidelines, additional rules may also apply with regard to the actual implementation of NN models. These rules are important in that they define the correct execution of NN modelling experiments, and the presentation of their results. We refer here to Adya & Collopy (1998):

- Provision of the in-sample performance of the models. Sample data provide the basis for the learning process (see next subsection), and are a benchmark for the evaluation of the generalization properties of the NN models.
- Generalization. The level of similarity between in- and out-of-sample performance provides an indication of the generalization potential of the models. In this regard, a generalization estimator was computed by the authors (see Patuelli et al., 2003).
- Stability. A similar performance over different data sets allows the stability of the forecasting tool, and its reliability, to be assessed.
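To illustrate the first two rules above, a minimal sketch is given below; it compares the out-of-sample error of NN forecasts with that of a naïve benchmark that simply carries the last observed growth rate forward (a random-walk-type extrapolation). All variable names and figures are hypothetical and purely illustrative, not taken from the paper's data.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error over districts (one possible accuracy indicator)."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(predicted)))

# Hypothetical out-of-sample data: employment growth rates for four districts.
actual = np.array([0.012, -0.034, 0.005, -0.010])           # observed growth
nn_forecast = np.array([0.010, -0.028, 0.001, -0.012])      # NN output
naive_forecast = np.array([0.015, -0.040, 0.002, -0.008])   # last observed growth carried forward

print("NN MAE:   ", mae(actual, nn_forecast))
print("Naive MAE:", mae(actual, naive_forecast))
# The first rule is satisfied only if the NN error is no larger than the naive error.
```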
Several attempts have been made to assess the usefulness or effectiveness of NNs. Some authors (see, for example, Swanson & White, 1997a, b; Stock & Watson, 1998) have compared NNs with linear and non-linear methods as forecasting tools for variables such as employment, industrial production, or corporate profits, and have come to various conclusions. Stock & Watson (1998) conclude that NNs, and non-linear methods in general, mainly perform worse than linear methods. On the other hand, Swanson & White (1997b, p. 459) suggest that it could be possible to improve macroeconomic forecasts 'using flexible specification econometric models', whose specification 'is allowed to vary over time, as new information becomes available'. Finally, Adya & Collopy (1998) have found that, most of the time, NNs seem to provide better forecasts than the models with which they are compared. Examining a string of studies that developed NNs for business forecasting, they have found that, of the studies correctly validating and implementing the NN models, 88% show that NNs have a superior performance.

In order to fully understand the implications of the above-mentioned rules and methodological comparisons, we first need to describe the functioning of NNs. The next subsection will give a very brief discussion of the main components and interactions of a NN.

2.2. Background of Neural Networks

Scientists have long been interested in the use of artificial NNs that could replicate the type of simultaneous information processing and data-driven learning seen in
biological networks. Since Rosenblatt's first introduction of an artificial NN (Rosenblatt, 1958) and the works of Werbos (1974), who provided a proper mathematical framework, and those of Rumelhart & McClelland (1986), who developed the most commonly used error-correction algorithm (back-propagation), many developments have been made in the NN framework.

Neural networks can be defined as systems of units (or neurons) that are distributed in layers and are connected internally. The layers comprise units that can refer either to input variables (first layer) or to output variables (last layer). Intermediate layers composed of hidden units can also be used. When counting the number of layers of a NN, the input layer is usually not considered, since it does not take part in the data computation. Therefore, a NN with no hidden units has a one-layer structure, while, accordingly, a NN with one layer of hidden units has a two-layer structure. In feedforward NNs, every unit from each layer is connected, and transfers information, to every unit of the next layer, while connections between pairs of units go in only one direction (there are no cycles, as in other types of NNs, such as recurrent NNs). Consequently, the input units are connected only to the units of the first hidden layer (if employed), while the output units are connected only to the neurons belonging to the preceding (hidden) layer. It follows that, in the case of a single hidden layer, this is the only intermediate level between input and output units, while, when a hidden layer is not employed in the NN, input and output units are directly linked. Figure 1 provides a graphic illustration of the structure of a NN.

Figure 1. A graphical illustration of a feedforward neural network. Source: image licence held by Creative Commons (http://creativecommons.org/licenses/by/1.0/).

Fischer (2001b, p. 23) defines the generic processing unit $u_i$, belonging to $u = \{u_1, \ldots, u_k\}$, as:

$$u_i = \varphi_i(u) = J_i(f_i(u)), \qquad (1)$$
where the function $\varphi_i$ can be decomposed into two separate functions: $J_i$ is the activation function, and $f_i$ is the integrator function. The activation function computes each unit's output, and is usually constant over the same NN.1 The integrator function is used for aggregating the information processed by the units of the preceding layer. This is achieved by combining the inputs by means of a set of weights, contained in vector $w_i$. The function commonly used for this task is a weighted sum:

$$f_i(u) = \sum_j w_{ij} u_j, \qquad (2)$$

where $u_j$ is the jth unit connected to unit $u_i$, and $w_{ij}$ is the connection weight associated with the two units (Fischer, 2001a).
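As a minimal sketch of equations (1) and (2), the forward pass of a two-layer feedforward network might look as follows. The logistic activation function, the dimensions and the random weights are illustrative assumptions (the paper does not prescribe them), and bias terms are omitted for brevity.

```python
import numpy as np

def logistic(x):
    # One possible activation function J_i; the paper does not fix a particular form.
    return 1.0 / (1.0 + np.exp(-x))

def forward(inputs, w_hidden, w_output):
    """Forward pass of a two-layer feedforward NN.

    Each unit computes u_i = J_i(f_i(u)), with the integrator
    f_i(u) = sum_j w_ij * u_j of equation (2); bias terms are omitted.
    """
    hidden = logistic(w_hidden @ inputs)   # hidden-layer unit outputs
    output = logistic(w_output @ hidden)   # output-unit value
    return output

# Illustrative dimensions: 9 sectoral growth rates in, 5 hidden units, 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=9)               # one district's input vector (hypothetical)
w_hidden = rng.normal(size=(5, 9))   # weights into the hidden layer
w_output = rng.normal(size=(1, 5))   # weights into the output unit
print(forward(x, w_hidden, w_output))
```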
The 'learning process' of a NN is guaranteed by the recursive modification of the aforementioned weights, through which the NN can identify significant rules in data occurrence (see, for example, Rumelhart & McClelland, 1986). The 'knowledge' generated by the NN is therefore contained in the set of weights that are computed. A learning algorithm is needed in order to find the optimal values for the NN weights, which normally involves iterative computations. The back-propagation algorithm (BPA) is the one most commonly used for this task. The BPA requires the analyst to provide input examples and their correct, and known, outputs. Neural network models that follow this kind of process are called supervised NNs. The sample data used allow the models to identify the behaviour underlying the data and to replicate it. The actual learning process is given by comparison of the output generated from the current weight configuration2 with the correct output, by means of a back-propagation of the obtained error3 through the network. This process is repeated for each record of the sample, with a consequent readjustment of the weights. The cycle's stopping condition can be decided by the analyst on the basis of, for example, computing time, error level, or the number of iterations. It should be noted that the algorithm 'will never exactly learn the ideal function, but rather it will asymptotically approach the ideal function' (McCollum, 1998), in addition to the local-minima problems that can arise.4

After this brief description of NN methods, we now offer a brief overview of shift-share methods as a complement to NN approaches, which can be integrated in a meaningful way to improve the statistical results of our experiments.

3. Shift-share Analysis for Regional Growth Analysis

3.1. The Conventional Shift-share Analysis Identities

Since its inception in the 1960s, SSA has been a popular analytical tool among regional scientists, and not only for improving the understanding of changes in economic variables, such as employment or GDP, at the regional level. Usually, SSA can be employed in four ways: (a) in forecasting; (b) in strategic planning (that is, observing the weight of the effects); (c) in policy evaluation (before-and-after analysis); and (d) in decision making (Dinc et al., 1998; Loveridge & Selting, 1998).

Shift-share analysis was first introduced by Dunn (1960), and subsequently formalized by Fuchs (1962) and Ashby (1964). In SSA, the growth shown by economic variables is decomposed into several components. Using employment as an example, the conventional shift-share decomposition can be written as:

$$\Delta e_{ir} = [g + (g_i - g) + (g_{ir} - g_i)]\, e_{ir}, \qquad (3)$$
where $e_{ir}$ is the employment observed in region r for sector i; g is the overall national employment growth rate; $g_i$ is the national growth rate of sector i; and $g_{ir}$ is the growth rate of region r in sector i. The employment change $\Delta e_{ir}$ is therefore decomposed into three components:
(i) the national effect, g;
(ii) the sectoral effect, given by the difference between the sectoral and overall national growth rates, $g_i$ and g;
(iii) the competitive effect, given by the difference between the local and nationwide sectoral growth rates, $g_{ir}$ and $g_i$.

Each of the three components can be calculated for each region, over all the sectors, and nationwide. In particular, when summed nationwide, the sectoral and competitive effects sum to zero. This property is usually referred to as the 'zero national deviation' (ZND) property.

The above identity has been studied and modified by several authors over the years. Alternative formulations of SSA also include an industry-structure approach where, in place of growth rates, industrial structures are compared (Ray, 1990). However, perhaps the most popular SSA extension is that developed by Esteban-Marquillas (E-M) (1972):

$$\Delta e_{ir} = g e_{ir} + (g_i - g) e_{ir} + (g_{ir} - g_i) e^{h}_{ir} + (g_{ir} - g_i)(e_{ir} - e^{h}_{ir}). \qquad (4)$$
In this SSA formulation, $e^{h}_{ir}$ is the homothetic employment of sector i in region r. Homothetic employment is calculated as $e^{h}_{ir} = e_r e_i / e$; that is, region r's employment in sector i as it would be if the sector had the same structure as the nation. The homothetic competitive effect (third component) measures 'a region's comparative advantage/disadvantage in [sector] i relative to the nation' (Esteban-Marquillas, 1972, p. 43). The fourth and last component is called the allocation effect, as it is the product of the expected employment and the differential, which measures a region's competitive advantage in sector i. The claim of this model is that it isolates the competitive effect from its relationship with the sectoral effect. Critiques of the E-M model can be found in Stokes (1974) and in Haynes & Machunda (1987). The E-M extension is not considered in our experiments, since the competitive effect is computed in the same way as in conventional SSA, the only difference being that it is multiplied by the homothetic employment.

More generally, the main criticisms of SSA, according to Loveridge & Selting (1998), concern the following:

- Its lack of theoretical content. In order to fill this gap, there have been attempts to link SSA to neoclassical microeconomics and factor demand for labour.
- Aggregation problems. Finer categories increase the weight of the sectoral effect and shrink the competitive effect. However, it has to be remembered that other techniques are also sensitive to aggregation issues.
- Weighting bias. It is not clear whether it is more convenient to use the base or the terminal year. Alternatively, the average of the two or a middle year could be used, or a 'dynamic shift-share' formulation (see Wilson, 2000).
- Instability of the competitive effect. This instability makes employment projections by means of SSA somewhat precarious. On the other hand, this issue does not exclude the use of SSA in forecasting, particularly in the framework of NNs.
- Interdependence of the sectoral and competitive effects.
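Before turning to the newer specifications described next, a minimal numerical sketch of the conventional decomposition in equation (3) may help fix ideas. The employment figures below are hypothetical and the function is a simplified illustration, not the paper's implementation.

```python
def shift_share(e_start, e_end, nat_start, nat_end):
    """Conventional shift-share for one region; inputs are dicts keyed by sector.

    Returns, per sector, the national, sectoral and competitive components of
    equation (3): delta_e_ir = [g + (g_i - g) + (g_ir - g_i)] * e_ir.
    """
    g = sum(nat_end.values()) / sum(nat_start.values()) - 1.0   # national growth rate
    out = {}
    for i in e_start:
        g_i = nat_end[i] / nat_start[i] - 1.0                   # national sectoral rate
        g_ir = e_end[i] / e_start[i] - 1.0                      # regional sectoral rate
        out[i] = {"national": g * e_start[i],
                  "sectoral": (g_i - g) * e_start[i],
                  "competitive": (g_ir - g_i) * e_start[i]}
    return out

# Hypothetical two-sector example (regional and national employment at t and t+n).
region_t  = {"manufacturing": 1000, "services": 2000}
region_tn = {"manufacturing":  980, "services": 2100}
nation_t  = {"manufacturing": 50000, "services": 80000}
nation_tn = {"manufacturing": 49000, "services": 85000}
print(shift_share(region_t, region_tn, nation_t, nation_tn))
# For each sector, the three components sum to the observed employment change.
```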
A number of new SSA specifications have been developed over the years5 on the basis of the first technical advances described above, often focusing on the
elimination of dependence among shift-share components or trying to solve other deficiencies of SSA. However, the application of newer methodologies has often deprived the models of their contribution to the understanding of local phenomena (Loveridge & Selting, 1998). While all types of decomposition can be obtained by adding and subtracting variables, all of them can be shown to be rooted in the simple SSA decomposition (Nazara & Hewings, 2004). Consequently, the basic models and a few other modifications, widely accepted as standards, are still preferred by most analysts because of their intuitive and simple specifications. Despite the above considerations, the development of new SSA extensions still goes on. One of the most recent developments in this matter is the extension proposed by Nazara & Hewings (2004), also called 'spatial shift-share' by the authors, and described in the next subsection.

3.2. Spatial Shift-share

The development of the recent shift-share extension termed 'spatial shift-share' is justified by the fact that spatial issues, such as spillovers, spatial competition, and so on, have not been considered in the application of SSA. There is therefore a need for the introduction of an element that accounts for the spatial structure surrounding a particular region. If we consider that regions are, as seems logical, interdependent and influence each other, we note, in fact, that horizontal influence relationships (region to region) are not captured in the traditional SSA formulation, while only hierarchical ones (that is, nation to region) are accounted for. Starting from this consideration, Nazara and Hewings modified the conventional shift-share identity into:

$$\Delta e_{ir} = [g + (\tilde{g}_{ir} - g) + (g_{ir} - \tilde{g}_{ir})]\, e_{ir}, \qquad (5)$$
where $\tilde{g}_{ir}$ is sector i's growth rate in the regions that are neighbours to region r. The neighbours' growth rate $\tilde{g}_{ir}$ is formulated, for a generic (t, t+n) period, as:

$$\tilde{g}_{ir} = \frac{\sum_{s} \tilde{w}_{rs}\, e^{t+n}_{is} - \sum_{s} \tilde{w}_{rs}\, e^{t}_{is}}{\sum_{s} \tilde{w}_{rs}\, e^{t}_{is}}, \qquad (6)$$

where the employment levels of neighbouring regions s are weighted according to a row-standardized weight matrix $\tilde{W}$, which defines the intensity of the neighbours' interaction with region r. This interaction can be defined in many ways: for instance, on the basis of geographical contiguity or economic flows. A simplified version of the weight matrix is employed in this paper, where the neighbours of a given region are defined empirically as the three regions that provide the highest number of individuals commuting towards the region being considered.6 In practical terms, the weight matrix employed here is an asymmetrical matrix with only three identical values differing from 0 for each region. The overall employment growth rate of the neighbours is subsequently computed. As a consequence of the new variable presented in equation (5), the sectoral and the competitive components change in meaning. In detail:
- the sectoral component now identifies the difference between the growth rate of region r's neighbours in sector i, and the national all-sector growth;
- the competitive component is the difference between sector i's growth rate in region r and in its neighbouring regions.
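A minimal sketch of the neighbour growth rate in equation (6) is given below. It assumes a small, hypothetical row-standardized weight matrix in which each region has three equally weighted neighbours, mimicking the commuting-based definition used in the paper but with made-up numbers.

```python
import numpy as np

def neighbour_growth(W, emp_t, emp_tn):
    """Weighted neighbour growth rate of equation (6) for one sector.

    W is a row-standardized weight matrix (regions x regions); emp_t and emp_tn
    are vectors of sectoral employment at the start and end of the period.
    """
    return (W @ emp_tn - W @ emp_t) / (W @ emp_t)

# Hypothetical 4-region example; each region's three neighbours receive weight 1/3.
W = np.array([[0, 1/3, 1/3, 1/3],
              [1/3, 0, 1/3, 1/3],
              [1/3, 1/3, 0, 1/3],
              [1/3, 1/3, 1/3, 0]])
emp_t  = np.array([1000., 800., 1200., 600.])   # sectoral employment at t
emp_tn = np.array([1050., 780., 1260., 610.])   # sectoral employment at t+n
print(neighbour_growth(W, emp_t, emp_tn))       # one g-tilde value per region r
```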
This recent decomposition is already the subject of further study and expansion. Fernández & López Menéndez (2005) have developed a mixed Nazara-Hewings/E-M model that employs both homothetic employment and the spatial connotation given by a geographical connectivity matrix. The interest in the SSA framework also goes beyond its deterministic nature. The next subsection describes a stochastic shift-share approach termed 'shift-share regression'.

3.3. Shift-share Regression

One of the main critiques of SSA is the lack of hypothesis testing, which is due to shift-share's deterministic nature. A stochastic approach, based on regression techniques equivalent to shift-share, has been developed by Patterson (1991), and subsequently used by, among others, Möller & Tassinopoulos (2000), and by Blien & Wolf (2002) in the analysis of employment patterns in Eastern Germany. The model proposed by Patterson is rather simple, and is strictly related to the conventional SSA approach:

$$\Delta e_{irt} = \alpha_i + \lambda_t + \kappa_r + \omega_{irt}, \qquad (7)$$
where $\Delta e_{irt}$ is the regional employment growth rate in sector i during period (t, t+1); $\alpha_i$ is the effect of sector i; $\lambda_t$ incorporates time period t (period effect); $\kappa_r$ is a locational effect specific to region r; and $\omega_{irt}$ is stochastic noise. Möller and Tassinopoulos, as well as Blien and Wolf, propose extensions of this specification, incorporating additional variables, such as structural adjustment or region-type indicators, and the qualification level of employees. Equation (7) suffers from perfect multicollinearity, and is therefore estimated by introducing a set of constraints (see Blien & Wolf, 2002). A weighted least squares (WLS) estimation procedure is suggested in order to reduce the impact of outliers.

This shift-share regression (SSR) approach has been replicated, in this paper, in a simplified version. We are interested in introducing shift-share components in NNs in order to forecast overall regional employment. Therefore, we only employ the locational effects regressors, which are region specific, as explanatory variables in NN models. In our case, the dependent variable is $\Delta e_r$; that is, the overall employment growth rate of region r. Equation (7) is therefore simplified as follows:

$$\Delta e_r = \alpha + \kappa_r + \omega_r. \qquad (8)$$
In equation (8), $\alpha$ is the intercept, while $\omega_r$ is the stochastic noise for region r. In this case, the locational effects variable is computed as the competitive effects used in conventional SSA. Consequently, there is a set of locational effects regressors: one for each sector. The model was estimated, by means of WLS,7 for each 2-year period. We found most of the locational effects variables to be statistically significant (for details, see Tables A1 and A2 in Appendix A). The multiple per-year estimations seem logical in the NN forecasting framework. The estimation of a single regression coefficient per sector would only change the scale of the independent variables introduced in a NN model, as they are multiplied by the corresponding regression coefficients. Computing a regression for each 2-year period enables what could be seen as 'fine tuning' of the locational/competitive effect variables, the regression coefficient being different for each year. Certainly, the correctness of this procedure, from a methodological viewpoint, will have to be looked into in more depth.
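A minimal sketch of the simplified regression in equation (8), estimated by weighted least squares, is given below. The single locational-effect regressor, the weights and all figures are hypothetical; the paper itself uses one locational-effect regressor per sector and its own weighting scheme.

```python
import numpy as np

def wls(y, X, weights):
    """Weighted least squares: solves (X' W X) b = X' W y for the coefficients b."""
    W = np.diag(weights)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Hypothetical data: overall employment growth for five regions regressed on an
# intercept and one locational-effect regressor (a sectoral competitive effect).
growth      = np.array([0.010, -0.020, 0.015, 0.002, -0.005])   # delta-e_r
competitive = np.array([0.004, -0.012, 0.009, 0.001, -0.003])   # kappa_r proxy for one sector
weights     = np.array([1200., 800., 1500., 600., 900.])        # e.g. base-year employment

X = np.column_stack([np.ones_like(growth), competitive])
alpha, beta = wls(growth, X, weights)
print(alpha, beta)
```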
On the basis of the considerations of this and of the preceding sections, several NN-SS models were developed, using conventional and 'spatial' SSA formulations as well as SSR. The next section illustrates the data employed for our analyses, and then Section 5 provides details of the NN models developed and their results.

4. The Data Set on German Regional Labour Markets

The data available for our experiments concern district units in the former West Germany and East Germany. The data on West Germany cover 17 years (1987-2003), while the data on East Germany are available for 11 years only (1993-2003). The number of districts is 326 for West Germany and 113 for East Germany, giving a total of 439 districts. The data sets have been provided by the German Institute for Employment Research (Institut für Arbeitsmarkt- und Berufsforschung, IAB), and include information on the number of full-time workers employed every year on 30 June. A graphical visualization of recent regional trends in the data (for the period 2001-2003) is provided by Figure C1 in Appendix C.

The above-mentioned regional data are also classified according to nine economic sectors.8 In addition to these variables, average regional daily wages earned by full-time workers are also available. Furthermore, in an effort to identify labour market patterns in similar regions, the 'type of economic region' variable was adopted. This variable, which is an index ranging from 1 to 9, follows the classification adopted by BfLR/BBR (Bundesforschungsanstalt für Raumordnung und Landeskunde/Bundesanstalt für Bauwesen und Raumordnung, Bonn). In fact, our West and East German districts may be grouped into the following nine economic regions (Bellmann & Blien, 2001):

(1) central cities in regions with urban agglomerations;
(2) highly urbanized districts in regions with urban agglomerations;
(3) urbanized districts in regions with urban agglomerations;
(4) rural districts in regions with urban agglomerations;
(5) central cities in regions with tendencies towards agglomeration;
(6) highly urbanized districts in regions with tendencies towards agglomeration;
(7) rural districts in regions with tendencies towards agglomeration;
(8) urbanized districts in regions with rural features;
(9) rural districts in regions with rural features.
The data set illustrated above will be the basis for our forecasting experiments, which are described below.

5. Forecasting Regional Employment in West and East Germany

5.1. Forecasting Employment by Means of Neural Networks

This section will illustrate the series of NN models that we developed for our forecasting purposes. The main inputs of our models are the growth rates of the
number of workers regionally employed in the nine economic sectors. To exploit the panel structure of our data and, more specifically, the correlation across observations of the same regions over time, we introduced in our models what we describe as the 'time' variable. This variable was identified in the models in two different ways: (1) as a 'time fixed effect' in panel models (Longhi et al., 2005b); and (2) as a set of dummy variables. On the basis of these considerations, 12 NN models in total have been adopted, which start from two basic models: (a) Model A, which employs time by means of dummy variables; and (b) Model B, which employs a fixed effects time variable.

In addition to the time variable, further variables were employed in the NN models. Seven additional NN models have been applied (see Tables B1 and B2 in Appendix B). Model AC has the same inputs as Model A, plus a qualitative variable able to distinguish between the districts. As in the case of the time fixed effects variable, this can be seen as corresponding to cross-sectional fixed effects in a panel model (Longhi et al., 2002). Models AD and AE have the same inputs as Model A, plus the 'type of economic region' variable, which was introduced in the two NN models as a qualitative variable (Model AD) and as a set of dummies (Model AE). Also, Model B was enhanced with the 'type of economic region' qualitative variable, thereby obtaining Model BD. Finally, information about daily wages was introduced as a new input variable: (a) in Model A, obtaining Model AW; (b) in Model AD, obtaining Model ADW; and (c) in Model B, obtaining Model BW.

Additional models were developed by employing SSA-computed variables. We refer to these models as NN-SS models. As in some of the models presented above, the NN-SS models use Model B as a basis:

- Model BSS presents nine additional variables, which are the competitive effect coefficients calculated, for each sector, in the framework of conventional SSA. As a result, for each German district and each year, we have nine coefficients expressing regional competitiveness.
- Similarly, Model BSSN employs the competitive effect coefficients derived from the Nazara and Hewings SSA extension.
- Finally, Model BSSR embeds variables computed in the SSR framework. The variables employed in this model are the product of the multiplication of the competitive effect variables used in Model BSS and their regression coefficients, found in the analysis explained in Section 3 (for details on the coefficient values, see Tables A1 and A2 in Appendix A).
The characteristics of the various models presented are summarized in Appendix B. All the models adopted use the growth rates of sectoral employment as input variables. Since, for each year, the NNs were trained on the basis of the 2-year lagged employment variations, the data used in our NN models start from 1991 (1989–1991) for West Germany and from 1997 (1995–1997) for East Germany.9 The data set available for West Germany is six years longer and allows for larger training and testing periods.

The first test phase (referred to as the validation phase), which is summarized in Table 2, concerned the validation of a number of network configurations (see, for example, Fischer, 1998). For all NN models, we employed data up to the year 2000. Neural network models for the case study of West Germany were trained from 1991 to 1998, while NN models for East Germany were trained from 1997 to 1999. For validating the models, two 2-year test sets were used in the case of West Germany (1999–2000), while one 2-year test set was chosen for East Germany (2000). The use of two test sets in the choice of the NN structure is justified by the fact that the performance of the NNs is not uniform across test sets. Statistical indicators calculated on a two-period basis may lead to choices that are less influenced by shocks affecting a particular year. However, the experiments on East Germany had to be carried out on just one test period since, because of the limited coverage of the data, only a few years would have been available for the NN learning process.

Table 2. Data utilization for validating the network configuration

Models          Training     Validating
West Germany    1991–1998    1999–2000
East Germany    1997–1999    2000

For every NN model, we experimented with five structures in the initial stage. First, a one-layer structure (see Section 2.2) was tried out, followed by three two-layer models containing 5, 10 and 15 neurons, respectively, in one hidden layer. Finally, a three-layer model was attempted, using five neurons in each of the two hidden layers.10 The models trained as described above were subsequently evaluated by means of several statistical indicators.11 The best-performing settings were then chosen for further development of the NNs.

In the subsequent test phase, the evaluation of the chosen structures was provided by ex post tests carried out for the year 2003, for which actual data were available. Table 3 summarizes which data were used at this stage. In this phase, the weights were reset and the models were retrained from their respective initial year up to the year 2002. The objective of this procedure was to obtain ex post, out-of-sample forecasts for the year 2003 that could be compared with the actual data, in order to evaluate the models' generalization properties.12 The next sections explain and discuss the empirical findings from our experiments. First, the results obtained for West Germany are shown and examined (Section 5.2), followed by those found for East Germany (Section 5.3).

Table 3. Data utilization for the test phase

Models          Training     Testing
West Germany    1991–2002    2003
East Germany    1997–2002    2003

5.2. Estimation of West German Employment

As indicated in the previous section, 12 different models were developed and tested for each data set. The first step was the choice of the NN structure (in terms of the number of layers and hidden neurons). The models were compared across several configurations, using the years from 1991 to 1998 as the training period and the years 1999 and 2000 (growth rates for 1997–1999 and 1998–2000) as the validation period (see Table 2). The indicators for the years 1999 and 2000 were computed on percentage employment variations. Further details on the structures of the NN models that were finally chosen can be found in Appendix B (Table B1). The models were then retrained up to the year 2002,
while the year 2003 acted as a test set (see Table 3). The statistical indicators emerging from these experiments are presented in Table 4. These results assess the statistical performance of the NN models, and will be the basis for the choice of a reduced set of models to be adopted for actual employment forecasts (in this case, for the year 2005). It is clear from Table 4 that the models which use Model B as a base (we will call them B-type models) and, in particular, Model BW, perform better than the others (which we call A-type models). Specifically, Models BSS and BSSN, embedding SSA, seem to provide promising results, improving on the performance of the simpler Model B. Also, the B-type models mostly outperform a naïve no-change random walk (see Theil's U statistic), while the A-type models do not. Finally, it is important, in the evaluation of the NN and NN-SS models, to note that the B-type models exceed, in the ex post forecasts, their own statistical performance in the training set, while, again, the A-type models do not. Similarly to what has been presented above, the next section illustrates the statistical results for the NN models forecasting employment in East Germany.
Table 4. Statistical performances of the ex post forecasts for the year 2003: the case of West Germany

                        Training                          Testing
Model           MSE          MAE       MAPE      MSE          MAE       MAPE     Theil's U
A-type models
Model A         6,272,983    1,329.03  2.1312    19,924,131   2,612.97  5.0166   1.3622
Model AC        9,978,277    1,620.98  2.4899    20,653,281   2,708.09  5.2581   1.4120
Model AD        5,738,632    1,284.71  2.0652    48,389,433   4,283.15  8.1696   3.3083
Model ADW       6,070,688    1,364.94  2.1650    38,130,097   3,924.97  7.7009   2.6069
Model AE        5,620,431    1,361.20  2.1963    30,658,822   3,484.53  6.7716   2.0961
Model AW        7,191,179    1,410.64  2.2195    45,534,811   4,114.05  7.8912   3.1132
B-type models
Model B         19,924,559   2,292.79  3.4124    8,464,111    1,661.20  3.3038   0.5787
Model BD        19,701,038   2,344.16  3.6046    9,769,103    1,717.71  3.1697   0.6679
Model BW        25,194,368   2,586.07  3.8323    6,887,958    1,415.10  2.8078   0.4709
NN-SS models
Model BSS       22,340,774   2,446.76  3.6286    7,190,785    1,520.36  3.0592   0.4916
Model BSSN      22,810,078   2,447.85  3.6272    7,902,621    1,584.99  3.1406   0.5403
Model BSSR      21,735,874   2,373.33  3.5273    22,116,964   2,428.30  3.6179   1.5121

5.3. Estimation of East German Employment

The data set for East German employment contains information on the number of employees for 113 districts. Data are available for the period between 1993 and 2003. The data set is therefore smaller than that for West Germany (which comprises 326 districts from 1987 to 2003) and six years shorter. Consequently, only five years could be used for training, validating, and testing the models (see Table 2). The NN models were selected, structure-wise, by training the models from the year 1997 to the year 1999, and tested on the year 2000 (growth rate for 1998–2000). Appendix B (Table B2) provides the details on the structure and parameters of each NN model. The aforementioned models were subsequently trained up to the year 2002, employing the year 2003 as a test period (see Table 3). The statistical results of the East German NN models for the 2003 ex post forecasts are presented in Table 5.

Table 5 shows results that seem to be consistent with those obtained for the West German NN models, presented in Table 4. As in the West German case, the B-type models, based on time as a fixed effect, display most of the lowest errors for all the indicators. The NN-SS models and, in particular, Model BSSR, employing SSA/SSR components, suggest an enhanced generalization power compared with the base model (Model B). The NN-SS models provide most of the best estimates, ranking among the top models in every statistical indicator. The consistent results between the West and East German NN models make for interesting considerations, which will be illustrated in the next, concluding, section.
Table 5. Statistical performances of the ex post forecasts for the year 2003: the case of East Germany

                        Training                          Testing
Model           MSE          MAE       MAPE      MSE          MAE       MAPE     Theil's U
A-type models
Model A         22,158,313   1,679.79  3.4297    9,614,821    1,130.16  3.1412   0.2459
Model AC        19,952,364   1,544.25  3.1888    11,553,786   1,536.84  4.5443   0.2955
Model AD        8,596,534    1,527.65  3.5745    34,344,579   2,026.88  5.1718   0.8784
Model ADW       21,268,095   1,727.03  3.5303    24,497,503   1,697.88  4.4177   0.6266
Model AE        9,034,762    1,492.37  3.4504    18,994,772   1,387.79  3.4920   0.4858
Model AW        21,858,611   1,697.19  3.4643    14,620,784   1,371.51  3.7587   0.3740
B-type models
Model B         37,252,966   2,011.37  3.8528    1,016,209    618.41    2.1493   0.0260
Model BD        33,940,993   1,902.13  3.6353    1,381,412    714.47    2.4442   0.0353
Model BW        33,799,007   1,825.46  3.4996    5,556,952    1,023.88  2.8824   0.1421
NN-SS models
Model BSS       32,600,242   1,901.07  3.7358    1,194,348    633.85    2.1511   0.0305
Model BSSN      38,854,476   2,036.55  3.8583    1,400,856    645.65    2.1044   0.0358
Model BSSR      31,626,312   1,868.62  3.6847    916,426      595.45    2.0957   0.0234
6. Conclusions

The aim of this paper was to make forecasts, at time (t + 2), of the number of individuals employed in 439 NUTS 3 districts in Germany. For this purpose, several models, based on NN techniques, were developed. In particular, the districts were divided into West German and East German district data sets. Separate NN models were subsequently developed for the two zones.

The results of ex post forecasts for the year 2003 were evaluated by means of several statistical indicators (see Tables 4 and 5). In particular, we were interested in observing the results of NN models employing SSA/SSR variables. Our results led to the following considerations:

(a) The models' performance shows different error levels for both the West and East data sets. From a preliminary observation of Tables 4 and 5, the models utilizing the 'time fixed effect' variable (B-type models and NN-SS models) seem to forecast better than the remaining models (A-type models). In fact, they provide the lowest error levels for both the West and East Germany models.

(b) Throughout our experiments, we searched for a NN model that could be considered the most consistent and reliable. While previous work by Patuelli et al. (2004) found Model B to meet these criteria (shift-share NN models were not included), the NN-SS models (SSA/SSR-enhanced) presented here seem to improve on the performance of Model B. For both West and East Germany they displayed errors that were among the lowest found, competing only with the other B-type models. The A-type models, as stated above, do not seem to be competitive.

In conclusion, our aim was to experiment with and test NN models that could provide reliable forecasts for German employment at a district level. In doing so, we found different levels of forecast reliability, depending on the data set and its socio-economic background. It has to be said that most of our empirical analysis has been based on only a few main variables (such as employment, type of district, and wages), and thus it cannot be comprehensive with regard to the many variables that come into play when employment and social conditions are at stake. A step in this direction was the introduction of the SSA/SSR-enhanced NN models. By embedding shift-share components in the NNs, we move in the direction of integrating linear and non-linear methods. In addition, as in the case of Model BSSN, we also incorporate spatial information. The incorporation in the NNs of information on the performance of 'neighbours' allows us to fill one of the gaps of conventional SSA, and maybe of NNs, that is, that they do not include the spatial characteristics of the data.

Further directions for research, from an empirical viewpoint, concern the need for a longer data span enriched with more variables (e.g. unemployment or migration). Also, a comparison of the accuracy of forecasts for the (t + 1) and (t + 2) periods might help in evaluating the usefulness of neural computing for labour markets. On the methodological side, it might be desirable to carry out a multi-criteria analysis that could, if based on several appropriate criteria, objectively evaluate the models on the basis of the final user's information needs. In addition, actual integration of linear methods with NNs should be a main objective. Fulfilling such a task would make it possible to combine the benefits of both families of methods in a more complete approach to labour market analysis, which could then be exploited in NN forecasting. Also, a more in-depth analysis of the spatial linkages among districts in terms of (un)employment growth might help to achieve a better understanding of regional phenomena. In this framework, the utilization of methods such as spatial filtering (Griffith, 2003), possibly in a joint NN approach, seems to be desirable, in particular from a policy perspective.
Notes

1. Sigmoid functions are most commonly used as activation functions. For example, Adya & Collopy (1998) found that, for all the studies they collected on the business application of NNs, the activation function, when specified, is always a sigmoid. The sigmoid function is often used because it introduces non-linearity, mapping the activation level of the computing units into the [0, 1] interval. Another advantage of the sigmoid function is its simple derivative.
2. The starting set of weights is usually defined randomly, so that a large error is generated at first (Cooper, 1999). On the other hand, Ripley (1993, p. 50) points out that the initial values 'should be chosen close to the optimal values'. Consequently, since the optimal values of the weights are unknown, small random values are used, within the (−0.1, 0.1) interval.
3. The error term is often computed as the mean of the single units' squared errors. In our experiments, the error is computed as Ej = Yj (1 − Yj)(Dj − Yj), where the error term Ej is a function of the actual output Yj and of the difference between the expected output Dj and the actual output of the model.
4. A shortcoming of the BPA is that the algorithm is only expected to reach a stationary error, which can indeed be a non-global minimum (Ripley, 1993). Fahlmann (1992, as reported by Ripley) stresses that, although NNs do fall into local minima, these are often the ones that the analyst wants to reach. He also points out that, in some cases, local minima are blamed for problems that are in fact the result of other causes.
5. For a review of SSA identities, see Dinc et al. (1998) and Loveridge & Selting (1998).
6. Data on the commuting flows were kindly provided by Gunther Haag (STASA, Stuttgart, Germany), and refer to the year 2002. Future research would ideally also look at changes in commuting patterns, so as to have a 'dynamic' definition of 'neighbours' as well.
7. The weights are computed, in our case, as the ratio between regional and national overall employment levels in a base year.
8. The nine economic sectors are: (1) the primary sector; (2) industry goods; (3) consumer goods; (4) food manufacturing; (5) construction; (6) distribution services; (7) financial services; (8) household services; and (9) services for society.
9. Our models employ the employment variation between years (t − 2, t) in order to forecast the variation for the period (t, t + 2). Consequently, if the data start from 1987, the first forecasted interval is 1989–1991. We refer to this forecast as a forecast for 1991.
10. Future research should explore the behaviour of intermediate structures (e.g. four or seven neurons). However, in the future, we will focus on two- and three-layer NN configurations, as empirical evidence has shown that a NN with one hidden layer can approximate nearly every type of function (Cheng & Titterington, 1994; Kuan & White, 1994).
11. The models are compared using the following statistical indicators: mean absolute error, MAE = (1/N) Σi |yi − yfi|; mean square error, MSE = (1/N) Σi (yi − yfi)²; mean absolute percentage error, MAPE = (1/N) Σi (|yi − yfi| × 100/yi); and Theil's U = MSE(model)/MSE(random walk), where yi is the observed value (target), yfi is the forecast of the model adopted (NN), and N is the number of observations/examples. The common interpretation of these indicators is that the estimation is better the closer the value is to zero. The MAPE indicator was not used in the testing phase of the NN models, but only for ex post forecast evaluation.
12. For the final step, and ultimate aim of the experiments, of making forecasts at the district level for the year 2005, all of the available data will be employed, training the NNs up to the year 2003. The results for this part of the experiment are not reported here, since at present no real data for 2005 are available for comparison.
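As an illustration of the evaluation indicators defined in note 11, the following minimal Python sketch (assuming numpy; the example numbers are placeholders, not the paper's data) computes MAE, MSE, MAPE and Theil's U for a set of forecasts against a no-change benchmark.

```python
import numpy as np

def forecast_indicators(y, yf, y_naive):
    """Compute the indicators of note 11: MAE, MSE, MAPE and Theil's U.

    y       : observed values (targets)
    yf      : model forecasts
    y_naive : no-change (random walk) forecasts used as the Theil's U benchmark
    """
    y, yf, y_naive = map(np.asarray, (y, yf, y_naive))
    mae = np.mean(np.abs(y - yf))
    mse = np.mean((y - yf) ** 2)
    mape = np.mean(np.abs(y - yf) * 100.0 / y)
    theil_u = mse / np.mean((y - y_naive) ** 2)
    return {"MAE": mae, "MSE": mse, "MAPE": mape, "Theil_U": theil_u}

# Hypothetical usage: district employment in 2003, NN forecasts, and the
# no-change forecast (earlier levels carried forward).
y_2003 = np.array([12000.0, 8500.0, 43000.0])
nn_forecast = np.array([11800.0, 8700.0, 42500.0])
no_change = np.array([11500.0, 8900.0, 42000.0])
print(forecast_indicators(y_2003, nn_forecast, no_change))
```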
References

Adya, M. & Collopy, F. (1998) How effective are neural networks at forecasting and prediction? A review and evaluation, Journal of Forecasting, 17(5–6), 481–495.
Ashby, L. D. (1964) The geographical redistribution of employment: an examination of the elements of change, Survey of Current Business, 44(10), 13–20.
Bade, F.-J. (2006) Evolution of regional employment in Germany: forecast 2001 to 2010, in: A. Reggiani & P. Nijkamp (eds) Spatial Dynamics, Networks and Modelling, pp. 297–323, Cheltenham and Northampton, Edward Elgar.
Baker, B. D. & Richards, C. E. (1999) A comparison of conventional linear regression methods and neural networks for forecasting educational spending, Economics of Education, 18, 405–415.
Bellmann, L. & Blien, U. (2001) Wage curve analyses of establishment data from Western Germany, Industrial and Labor Relations Review, 54, 851–863.
Blien, U. & Wolf, K. (2002) Regional development of employment in Eastern Germany: an analysis with an econometric analogue to shift-share techniques, Papers in Regional Science, 81(3), 391–414.
Chatterjee, A., Ayadi, O. F. & Boone, B. E. (2000) Artificial neural network and the financial markets: a survey, Managerial Finance, 26(12), 32–45.
Cheng, B. & Titterington, D. M. (1994) Neural networks: a review from a statistical perspective, Statistical Science, 9(1), 2–30.
Collopy, F., Adya, M. & Armstrong, J. S. (1994) Principles for examining predictive validity: the case of information systems spending forecasts, Information Systems Research, 5(2), 170–179.
Cooper, J. C. B. (1999) Artificial neural networks versus multivariate statistics: an application from economics, Journal of Applied Statistics, 26, 909–921.
Dinc, M., Haynes, K. E. & Qiangsheng, L. (1998) A comparative evaluation of shift-share models and their extensions, Australasian Journal of Regional Studies, 4(2), 275–302.
Dreiseitl, S. & Ohno-Machado, L. (2002) Logistic regression and artificial neural network classification models: a methodology review, Journal of Biomedical Informatics, 35(5/6), 352–359.
Dunn, E. S. (1960) A statistical and analytical technique for regional analysis, Papers and Proceedings of the Regional Science Association, 6, 97–112.
Esteban-Marquillas, J. M. (1972) A reinterpretation of shift-share analysis, Regional and Urban Economics, 2(3), 249–255.
Fahlmann, S. E. (1992) Comments on comp.ai.neural.nets, item 2198.
Fernández, M. M. & López Menéndez, A. J. (2005) Spatial shift-share analysis: new developments and some findings for the Spanish case, Paper presented at the 45th Congress of the European Regional Science Association, August, Amsterdam, The Netherlands.
Fischer, M. M. (1998) Computational neural networks: an attractive class of mathematical models for transportation research, in: V. Himanen, P. Nijkamp & A. Reggiani (eds) Neural Networks in Transport Applications, pp. 3–20, Aldershot, Ashgate.
Fischer, M. M. (2001a) Central issues in neural spatial interaction modeling: the model selection and the parameter estimation problem, in: M. Gastaldi & A. Reggiani (eds) New Analytical Advances in Transportation and Spatial Dynamics, pp. 3–19, Aldershot, Ashgate.
Fischer, M. M. (2001b) Computational neural networks – tools for spatial data analysis, in: M. M. Fischer & Y. Leung (eds) GeoComputational Modelling. Techniques and Applications, pp. 15–34, Berlin, Springer.
Fuchs, V. R. (1962) Statistical explanations of the relative shift of manufacturing among regions of the United States, Papers of the Regional Science Association, 8, 1–5.
Gardner, M. W. & Dorling, S. R. (1998) Artificial neural networks (the multilayer perceptron): a review of applications in the atmospheric sciences, Atmospheric Environment, 32(14–15), 2627–2636.
Griffith, D. A. (2003) Spatial Autocorrelation and Spatial Filtering: Gaining Understanding through Theory and Scientific Visualization, Berlin and New York, Springer.
Haynes, K. E. & Machunda, Z. B. (1987) Considerations in extending shift-share analysis: note, Growth and Change, 18 (Spring), 69–78.
Himanen, V., Nijkamp, P. & Reggiani, A. (eds) (1998) Neural Networks in Transport Applications, Aldershot, Ashgate.
Kuan, C.-M. & White, H. (1994) ANNs: an econometric perspective, Econometric Reviews, 13, 1–91.
Lin, C.-F. J. (1992) The econometrics of structural change, neural network and panel data analysis, PhD thesis, University of California, San Diego.
Longhi, S. (2005) Open regional labour markets and socio-economic developments, PhD thesis, Vrije Universiteit, Amsterdam.
Longhi, S., Nijkamp, P., Reggiani, A. & Blien, U. (2002) Forecasting regional labour markets in Germany: an evaluation of the performance of neural network analysis, Paper presented at the 42nd Congress of the European Regional Science Association, Dortmund, Germany.
Longhi, S., Nijkamp, P., Reggiani, A. & Blien, U. (2005a) Developments in regional labour markets in Germany: a comparative analysis of the forecasting performance of competing statistical models, Australasian Journal of Regional Studies, 11(2), 175–196.
Longhi, S., Nijkamp, P., Reggiani, A. & Maierhofer, E. (2005b) Neural network modeling as a tool for forecasting regional employment patterns, International Regional Science Review, 28(3), 330–346.
Loveridge, S. & Selting, A. C. (1998) A review and comparison of shift-share identities, International Regional Science Review, 21(1), 37–58.
Maier, H. R. & Dandy, G. C. (2000) Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications, Environmental Modelling & Software, 15, 101–124.
McCollum, P. (1998) An Introduction to Back-propagation Neural Networks, Encoder. Internet site: http://www.seattlerobotics.org/encoder/nov98/neural.html
Miller, A. S., Blott, B. H. & Hames, T. K. (1992) Review of neural network applications in medical imaging and signal processing, Medical & Biological Engineering & Computing, 30(5), 449–464.
Möller, J. & Tassinopoulos, A. (2000) Zunehmende Spezialisierung oder Strukturkonvergenz? Eine Analyse der sektoralen Beschäftigungsentwicklung auf regionaler Ebene, Jahrbuch für Regionalwissenschaft, 20(1), 1–38.
Nazara, S. & Hewings, G. J. D. (2004) Spatial structure and taxonomy of decomposition in shift-share analysis, Growth and Change, 35(4), 476–490.
Nijkamp, P., Reggiani, A. & Tsang, W. F. (2004) Comparative modelling of interregional transport flows: applications to multimodal European freight transport, European Journal of Operational Research, 155(3), 584–602.
Patterson, M. G. (1991) A note on the formulation of the full-analogue regression model of the shift-share method, Journal of Regional Science, 31(2), 211–216.
Patuelli, R., Longhi, S., Reggiani, A. & Nijkamp, P. (2003) A comparative assessment of neural network performance by means of multicriteria analysis: an application to German regional labour markets, Studies in Regional Science, 33(3), 205–229.
Patuelli, R., Longhi, S., Reggiani, A., Nijkamp, P. & Blien, U. (2004) New experiments with learning models for regional labour market forecasting, Paper presented at the 51st Annual North American Meeting of the Regional Science Association International, November, Seattle, WA.
Ray, D. M. (1990) Standardizing Employment Growth Rates of Foreign Multinationals and Domestic Firms in Canada from Shift-share to Multifactor Partitioning, Working paper, International Labour Organisation, International Labour Office, Geneva.
Ripley, B. D. (1993) Statistical aspects of neural networks, in: O. E. Barndorff-Nielsen, J. L. Jensen & W. S. Kendall (eds) Networks and Chaos: Statistical and Probabilistic Aspects, pp. 40–123, London, Chapman & Hall.
Rosenblatt, F. (1958) The perceptron: a probabilistic model for information storage and organization in the brain, Psychological Review, 65, 386–408.
Rumelhart, D. E. & McClelland, J. L. (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Cambridge, MA, MIT Press.
Sargent, D. J. (2001) Comparison of artificial neural networks with other statistical approaches, Cancer, 91(S8), 1636–1642.
Schintler, L. A. & Olurotimi, O. (1998) Neural networks as adaptive logit models, in: V. Himanen, P. Nijkamp & A. Reggiani (eds) Neural Networks in Transport Applications, pp. 131–150, Aldershot, Ashgate.
Shiva Nagendra, S. M. & Khare, M. (2002) Artificial neural network based line source emission modelling: a review, Paper presented at ACE 2002: International Conference on Advances in Civil Engineering, January, Kharagpur, India.
Stock, J. H. & Watson, M. W. (1998) A Comparison of Linear and Nonlinear Univariate Models for Forecasting Macroeconomic Time Series, NBER Working Paper 6607.
Stokes, H. K. (1974) Shift-share once again, Regional and Urban Economics, 4(1), 57–60.
Swanson, N. R. & White, H. (1997a) A model selection approach to real-time macroeconomic forecasting using linear models and artificial neural networks, Review of Economics and Statistics, 79, 540–550.
Swanson, N. R. & White, H. (1997b) Forecasting economic time series using flexible versus fixed specification and linear versus nonlinear econometric models, International Journal of Forecasting, 13, 439–461.
Werbos, P. (1974) Beyond regression: new tools for prediction and analysis in the behavioral sciences, PhD thesis (reprinted by Wiley, 1995), Harvard University.
Wilson, P. (2000) The export competitiveness of dynamic Asian economies 1983–1995, Journal of Economic Studies, 27(6), 541–565.
Wong, B. K., Bodnovich, T. A. & Selvi, Y. (1997) Neural network applications in business: a review and analysis of the literature (1988–1995), Decision Support Systems, 19(4), 301–320.
Wong, B. K. & Selvi, Y. (1998) Neural network applications in finance: a review and analysis of literature (1990–1996), Information & Management, 34(3), 129–139.
Appendix A: Details of Shift-share Regression Parameter Estimates

Tables A1 and A2 present the regression coefficients found when regressing the districts' overall growth rates on the competitive effect variable seen in equation (3), for West and East Germany, respectively. A competitive effect variable was used for each of the nine industry sectors. WLS regressions were carried out for each year (that is, for each 2-year period).
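The year-by-year estimation can be illustrated with a minimal weighted least squares sketch in Python; the data below are random placeholders and the variable names are ours (the actual regressions use the districts' overall growth rates, the nine sectoral competitive effect variables, and employment-share weights, cf. note 7).

```python
import numpy as np

# Hypothetical data for one 2-year period: n districts, 9 sectoral
# competitive effect variables, district growth rates, and WLS weights.
rng = np.random.default_rng(1)
n, k = 326, 9
X = rng.normal(size=(n, k))           # competitive effect variables
y = rng.normal(size=n)                # overall district growth rates
w = rng.uniform(0.001, 0.02, size=n)  # employment-share weights (base year)

# Weighted least squares via the usual rescaling by sqrt(weights).
sw = np.sqrt(w)
Xw = np.column_stack([np.ones(n), X]) * sw[:, None]  # intercept + regressors
yw = y * sw
beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

print(beta[1:])  # the nine sector coefficients for this period (cf. Table A1)
```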
Table A1. Shift-share regression parameters for the competitive effect variables: the case of West Germany

Sector                  87–89     88–90     89–91     90–92     91–93     92–94     93–95     94–96
Primary sector          0.060***  0.109***  0.087***  0.051***  0.042***  0.061***  0.022***  0.012***
Industry goods          0.246***  0.195***  0.195***  0.269***  0.295***  0.244***  0.231***  0.211***
Consumer goods          0.038***  0.049***  0.053***  0.074***  0.085***  0.072***  0.058***  0.053***
Food manufacturing      0.030**   0.019     0.061***  0.033***  0.031***  0.021     0.025***  0.024***
Construction            0.044**   0.073***  0.039**   0.043**   0.038*    0.099***  0.096***  0.067***
Distribution services   0.156***  0.146***  0.109***  0.090***  0.135***  0.140***  0.107***  0.137***
Financial services      0.060***  0.075***  0.056***  0.066***  0.052***  0.033*    0.068***  0.099***
Household services      0.029***  0.058***  0.116***  0.057***  0.052***  0.042**   0.045***  0.043***
Services for society    0.161***  0.106***  0.139***  0.188***  0.080***  0.110***  0.127***  0.092***

Sector                  95–97     96–98     97–99     98–00     99–01     00–02     01–03
Primary sector          0.012*    0.015***  0.028***  0.028***  0.018***  0.021***  0.035***
Industry goods          0.221***  0.197***  0.242***  0.256***  0.265***  0.195***  0.183***
Consumer goods          0.032***  0.054***  0.053***  0.057***  0.038***  0.036***  0.044***
Food manufacturing      0.015**   0.018***  0.000     0.020*    0.017     0.001     0.015*
Construction            0.046***  0.004     0.002     0.022     0.001     0.058***  0.062***
Distribution services   0.115***  0.152***  0.167***  0.093***  0.197***  0.186***  0.158***
Financial services      0.100***  0.105***  0.097***  0.075***  0.117***  0.112***  0.118***
Household services      0.060***  0.084***  0.074***  0.090***  0.058***  0.077***  0.053***
Services for society    0.164***  0.181***  0.093***  0.097***  0.155***  0.209***  0.201***

***Significant at the 99% level; **significant at the 95% level; *significant at the 90% level.
Table A2. Shift-share regression parameters for the competitive effect variables: the case of East Germany

Sector                  93–95     94–96     95–97     96–98     97–99     98–00     99–01     00–02     01–03
Primary sector          0.077***  0.097***  0.073***  0.056***  0.056***  0.054***  0.011     0.035***  0.040***
Industry goods          0.150***  0.103***  0.096***  0.135***  0.135***  0.114***  0.157***  0.139***  0.104***
Consumer goods          0.008     0.002     0.011     0.011     0.011     0.035**   0.040***  0.035**   0.026**
Food manufacturing      0.035**   0.017     0.009     0.009     0.009     0.013     0.035***  0.015     0.001
Construction            0.151***  0.144***  0.187***  0.210***  0.210***  0.158***  0.172***  0.076***  0.102***
Distribution services   0.181***  0.211***  0.123***  0.139***  0.139***  0.115***  0.191***  0.195***  0.141***
Financial services      0.043***  0.089***  0.091***  0.101***  0.101***  0.126***  0.176***  0.166***  0.097***
Household services      0.004     0.027     0.055**   0.002     0.002     0.140***  0.098***  0.086***  0.031
Services for society    0.208***  0.175***  0.306***  0.288***  0.288***  0.267***  0.302***  0.275***  0.252***

***Significant at the 99% level; **significant at the 95% level.
Appendix B: Details of Model Experiments

The NN models used in the present paper were computed using the network parameters shown in the tables below. In addition, the following parameters were used: learning rate: 0.9; momentum: 1; input noise: 0; training tolerance: 0.1; testing tolerance: 0.3.

Table B1. Parameter values of the neural network models adopted: the case of West Germany

Model        Inputs                                                                 IU   HU                  Epochs
Model A      Employment (GR), time (dummies)                                        22   10                  900
Model AC     Employment (GR), time (dummies), district (fixed effects)              23   5                   600
Model AD     Employment (GR), time (dummies), district (qualitative)                23   10                  600
Model ADW    Employment (GR), time (dummies), district (fixed effects), wage (GR)   24   15                  900
Model AE     Employment (GR), time (dummies), district (dummies)                    31   10                  200
Model AW     Employment (GR), time (dummies), wage (GR)                             23   5                   750
Model B      Employment (GR), time (qualitative)                                    10   5 (1stL), 5 (2ndL)  650
Model BD     Employment (GR), time (qualitative), district (fixed effects)          11   10                  300
Model BW     Employment (GR), time (qualitative), wage (GR)                         11   5 (1stL), 5 (2ndL)  1,600
Model BSS    Employment (GR), time (qualitative), SSA regional component            19   15                  100
Model BSSN   Employment (GR), time (qualitative), SSA spatial regional component    19   5                   400
Model BSSR   Employment (GR), time (qualitative), SSA modified competitive effect   19   5                   900

Notes: IU = input units; HU = hidden units; GR = growth rates; 1stL = first hidden layer; 2ndL = second hidden layer. All models have only one output unit; the activation function is always a sigmoid.
Table B2. Parameter values of the neural network models adopted: the case of East Germany

Model        Inputs                                                                 IU   HU                  Epochs
Model A      Employment (GR), time (dummies)                                        16   10                  100
Model AC     Employment (GR), time (dummies), district (fixed effects)              17   10                  300
Model AD     Employment (GR), time (dummies), district (qualitative)                17   5                   300
Model ADW    Employment (GR), time (dummies), district (fixed effects), wage (GR)   18   5 (1stL), 5 (2ndL)  200
Model AE     Employment (GR), time (dummies), district (dummies)                    25   15                  300
Model AW     Employment (GR), time (dummies), wage (GR)                             17   5 (1stL), 5 (2ndL)  200
Model B      Employment (GR), time (qualitative)                                    10   5 (1stL), 5 (2ndL)  900
Model BD     Employment (GR), time (qualitative), district (fixed effects)          11   15                  1,100
Model BW     Employment (GR), time (qualitative), wage (GR)                         11   5                   1,000
Model BSS    Employment (GR), time (qualitative), SSA regional component            19   5 (1stL), 5 (2ndL)  200
Model BSSN   Employment (GR), time (qualitative), SSA spatial regional component    19   5 (1stL), 5 (2ndL)  300
Model BSSR   Employment (GR), time (qualitative), SSA modified competitive effect   19   5 (1stL), 5 (2ndL)  300

Notes: IU = input units; HU = hidden units; GR = growth rates; 1stL = first hidden layer; 2ndL = second hidden layer. All models have only one output unit; the activation function is always a sigmoid.
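For concreteness, the following is a minimal, self-contained Python sketch (ours, not the authors' code) of a one-hidden-layer feed-forward network with sigmoid units trained by error back-propagation, using small random initial weights (note 2), the sigmoid-derivative error term of note 3, and a learning-rate/momentum scheme in the spirit of the parameters listed above. The data, dimensions and the momentum value (set below 1 here to keep the toy example stable) are placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder data: 10 inputs (e.g. a Model B-like setting), 1 output unit.
n_obs, n_in, n_hid = 200, 10, 5
X = rng.normal(size=(n_obs, n_in))
y = sigmoid(X @ rng.normal(size=(n_in, 1)))  # synthetic targets in (0, 1)

# Small random initial weights (cf. note 2), with an extra bias row.
W1 = rng.uniform(-0.1, 0.1, size=(n_in + 1, n_hid))
W2 = rng.uniform(-0.1, 0.1, size=(n_hid + 1, 1))
dW1_prev = np.zeros_like(W1)
dW2_prev = np.zeros_like(W2)

lr, momentum, epochs = 0.9, 0.8, 300  # illustrative values only

Xb = np.hstack([X, np.ones((n_obs, 1))])
for epoch in range(epochs):
    # Forward pass through the hidden and output layers.
    H = sigmoid(Xb @ W1)
    Hb = np.hstack([H, np.ones((n_obs, 1))])
    out = sigmoid(Hb @ W2)

    # Backward pass with the sigmoid-derivative error term of note 3.
    delta_out = out * (1.0 - out) * (y - out)
    delta_hid = H * (1.0 - H) * (delta_out @ W2[:-1].T)

    # Weight updates with a momentum term on the previous update.
    dW2 = lr * Hb.T @ delta_out / n_obs + momentum * dW2_prev
    dW1 = lr * Xb.T @ delta_hid / n_obs + momentum * dW1_prev
    W2 += dW2
    W1 += dW1
    dW1_prev, dW2_prev = dW1, dW2

print(float(np.mean((y - out) ** 2)))  # training MSE after the final epoch
```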
Appendix C: Map of Observed Growth Rates (Years 2001–2003) in Germany

Figure C1. Observed full-time employment growth rates in Germany, 2001–2003.
Spatial Economic Analysis, Vol. 1, No. 1, June 2006
Interpolation of Air Quality Measures in Hedonic House Price Models: Spatial Aspects
LUC ANSELIN & JULIE LE GALLO (Received January 2006; revised February 2006)
ABSTRACT This paper investigates the sensitivity of hedonic models of house prices to the spatial interpolation of measures of air quality. We consider three aspects of this question: the interpolation technique used, the inclusion of air quality as a continuous vs discrete variable in the model, and the estimation method. Using a sample of 115,732 individual house sales for 1999 in the South Coast Air Quality Management District of Southern California, we compare Thiessen polygons, inverse distance weighting, Kriging and splines to carry out spatial interpolation of point measures of ozone obtained at 27 air quality monitoring stations to the locations of the houses. We take a spatial econometric perspective and employ both maximum-likelihood and general method of moments techniques in the estimation of the hedonic. A high degree of residual spatial autocorrelation warrants the inclusion of a spatially lagged dependent variable in the regression model. We find significant differences across interpolators in the coefficients of ozone, as well as in the estimates of willingness to pay. Overall, the Kriging technique provides the best results in terms of estimates (signs), model fit and interpretation. There is some indication that the use of a categorical measure for ozone is superior to a continuous one.
Luc Anselin (to whom correspondence should be sent), Spatial Analysis Laboratory (SAL), University of Illinois, Urbana-Champaign, Urbana, IL 61801, USA. Email: [email protected]. Julie Le Gallo, IERSO (IFReDE-GRES), Université Montesquieu-Bordeaux IV, 33608 Pessac Cedex, France. Email: [email protected]. This paper is part of a joint research effort with James Murdoch (University of Texas, Dallas) and Mark Thayer (San Diego State University). Their valuable input is gratefully acknowledged. The research was supported in part by NSF Grant BCS-9978058 to the Center for Spatially Integrated Social Science (CSISS), and by NSF/EPA Grant SES-0084213. Julie Le Gallo also gratefully acknowledges financial support from Programme APR S3E 2002, directed by H. Jayet, entitled 'The economic value of landscapes in periurban cities' (Ministère de l'Ecologie et du Développement Durable, France). Earlier versions were presented at the 51st North American Meeting of the Regional Science Association International, Seattle, WA, November 2004, the Spatial Econometrics Workshop, Kiel, Germany, April 2005, and at departmental seminars at the University of Illinois, Ohio State University, the University of California, Davis, and the University of Pennsylvania. Comments by participants are greatly appreciated. The usual disclaimer holds. ISSN 1742-1772 print; 1742-1780 online/06/010031-22 © 2006 Regional Studies Association
DOI: 10.1080/17421770600661337
KEYWORDS: Spatial econometrics; hedonics; spatial interpolation; air quality valuation; real estate

JEL CLASSIFICATION: C21, Q51, Q53, R31
1. Introduction

The valuation of the economic benefits of improvements in environmental quality is a well-studied topic in economics and policy analysis (e.g. Freeman III, 2003). In this context, the estimation of a hedonic model of house prices that includes a measure of ambient air quality has become an established methodology (e.g. Palmquist, 1991). The rationale behind this approach is that, ceteris paribus, houses in areas with less pollution will have this benefit capitalized into their value, which should be reflected in a higher sales price. The theoretical, methodological and empirical literature dealing with this topic is extensive, going back to the classic studies of Ridker & Henning (1967) and Harrison & Rubinfeld (1978). Extensive recent reviews are provided in Smith & Huang (1993, 1995), Boyle & Kiel (2001), and Chay & Greenstone (2005), among others. In the empirical literature, an explicit accounting for spatial effects (spatial
autocorrelation and spatial heterogeneity) using the methodology of spatial econometrics has only recently become evident, e.g. in Kim et al. (2003), Beron et al. (2004), and Brasington & Hite (2005). This coincides with a greater acceptance of spatial econometrics in empirical studies of housing and real estate in general, e.g. as reviewed in Anselin (1998), Basu & Thibodeau (1998), Pace et al. (1998), Dubin et al. (1999), Gillen et al. (2001), and Pace & LeSage (2004), among others.

In this paper, we focus on a methodological aspect pertaining to the inclusion of an ambient air quality variable in hedonic house price models that has received little attention to date: the interpolation of pollution values to the location of the house sales transaction. Since measurement of pollution is based on regular sampling at a few monitoring stations, but house sales transactions are spatially distributed throughout the region, there is a mismatch between the spatial 'support' of the explanatory variable (e.g. ozone) and the support for the dependent variable (sales price). This change of support problem (Gotway & Young, 2002), or misaligned regression problem (Banerjee et al., 2004, Ch. 6), has been considered extensively in the spatial statistical literature. In hedonic house price models that include air quality, however, this is typically treated in a rather ad hoc manner, and one of several procedures is used that are readily available in commercial GIS software packages. We consider the extent to which the selection of a particular interpolation method affects the parameter estimates in the hedonic function and the derived economic valuation of willingness to pay for improved air quality. Specifically, we compare Thiessen polygons, inverse distance weighting (IDW), Kriging and splines, techniques that are easy to implement and that have seen application in hedonic house price studies to varying degrees. For example, Thiessen polygons were used by Chattopadhyay (1999), Palmquist & Israngkura (1999), and Zabel & Kiel (2000), Kriging in Beron et al. (1999, 2001, 2004), and spline interpolation in Kim et al. (2003).1

We are also interested in the sensitivity of the results to the way in which the pollution variable is quantified, either as a continuous measure of ambient air quality or as a set of discrete categories. It is often argued that the latter conforms more closely to the perception of the buyers and sellers in a sales transaction, who may not be aware of subtle continuous changes in the concentration of a given pollutant. We pursue this assessment by means of an empirical investigation of a sample of 115,732 house sales in the South Coast Air Quality Management District of Southern California, for which we have detailed characteristics, as well as neighbourhood measures and observations on ozone.2 We take an explicit spatial econometric approach to this problem, and, in the process, apply specialized methods for the estimation of spatial regression specifications by means of maximum likelihood (ML) that can be implemented for very large data sets. To our knowledge, ours is the largest actual house sales data set to date for which both ML estimation of the parameters in a spatial regression and inference by means of asymptotic t-values have been obtained.

In the remainder of the paper, we first provide a brief discussion of data sources and methods and give some methodological background on the four interpolators we consider.
We next review three aspects of the empirical results: the spatial distribution of the interpolated ozone measures and their conversion to spatial regimes; the parameter estimates in the hedonic house price model; and the
valuation of air quality in the form of marginal willingness to pay. We close with some concluding remarks.

2. Data and Estimation Methods
2.1. Data Sources

The data used in this paper come from three different sources: Experian Company (formerly TRW) for the individual house sales prices and characteristics, the 2000 US Census of Population and Housing for the neighbourhood characteristics (at the census tract level), and the South Coast Air Quality Management District for the ozone measures. The house prices and characteristics are from 115,732 sales transactions of owner-occupied single family homes that occurred during 1999 in the region, which covers four counties, namely Los Angeles, Riverside, San Bernardino and Orange. The data were geocoded, which allows for the assignment of each house to any spatially aggregate administrative district (such as a census tract or zip code zone). Geocoding is also needed for the computation of an interpolated ozone value at the location of each transaction. These ozone values are taken for the year preceding the transaction, rather than simultaneous with the transaction. In order to obtain sufficient variability (ozone measures are highly seasonal as well as spatially heterogeneous), we chose the average of the daily maximum for the worst quarter in 1998, derived from the hourly readings for 27 stations.3 Apart from the interpolated ozone values, the variables used in the hedonic specification are essentially the same as in the earlier work of Beron et al. For a detailed discussion of sources and measurement issues we therefore refer the reader to Beron et al. (2004, pp. 279–281). A list and brief description of the socioeconomic explanatory variables used in the analysis (house characteristics and census variables) are given in Table 1.

Table 1. Variable names and description

Variable name   Description
Elevation       Elevation of the house
Livarea         Interior living space
Baths           Indicator variable for more than two bathrooms
Fireplace       Number of fireplaces
Pool            Indicator variable for pool
Age             Age of the house
Beach           Indicator variable for home less than 5 miles from beach
AC              Indicator variable for central air conditioning
Heat            Indicator variable for central heating
Landarea        Lot size
Traveltime      Average time to work in census tract
Poverty         Percentage of population with income below the poverty level
White           Percentage of the population that is white
Over65          Percentage of the population older than 65 years
College         Percentage of population with 4 or more years of college education
Income          Median household income
Riverside       Indicator variable for Riverside County
San Bern.       Indicator variable for San Bernardino County
Orange          Indicator variable for Orange County
The 115,732 house sales are made up of 70,357 transactions in Los Angeles County (61% of the total), 12,523 in Riverside County (11%), 14,409 in San Bernardino County (12%), and 18,443 in Orange County (16%). The observed sales prices range from $20,000 to $5,345,455, with an overall mean of $239,518. This overall mean hides considerable variability across counties, with the values for Orange ($270,924) and Los Angeles ($267,455) Counties considerably higher than for San Bernardino ($151,249) and Riverside ($137,867) Counties. A general impression of the spatial distribution of prices (in $/m2) can be gained from Figure 1, which also shows the county boundaries and the locations of the air quality monitoring stations. Note the reasonable coverage of the spatial range of sales transactions by 25 of the monitoring stations. Two stations are somewhat to the east. They will essentially be ignored in the Thiessen and IDW interpolators. Since the inclusion of these stations may provide a better fit for the Kriging and spline procedures, they have been retained in the sample. The spatial distribution of house prices (with darker colours representing higher prices) shows some concentration of higher values in the coastal area of Los Angeles and Orange Counties, as well as in the north-west edge of the basin. However, the distribution is quite heterogeneous, with small groupings of high values in both of the other counties as well.

The average of the daily maxima of the ozone values during the worst quarter of 1998, observed at the 27 monitoring stations, ranged from a low of 4.7 ppb to a high of 13.5 ppb, with an average of 8.9 ppb. We interpolate these values from the point locations of the stations to the point locations of the house transactions. In the empirical analysis, we use both the interpolated value as such, as well as indicator variables that result from a transformation of the continuous value into four discrete categories, which we refer to as 'regimes'. The categories were inspired by the breakpoints for O3 used by the US Environmental Protection Agency to establish national ambient air quality standards (NAAQS) in effect in 1999. We label the four resulting indicators as Good (0.0–6.4 ppb), Moderate (6.5–8.4 ppb), Unhealthy1 (8.5–10.4 ppb), and Unhealthy2 (above 10.4 ppb). We evaluate each interpolation method for both the continuous ozone value and the discrete categories.
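As a simple illustration of the conversion from the continuous interpolated ozone measure to the four regimes, the following Python sketch (ours; the breakpoints are those stated above, while the example ozone values are placeholders) assigns each location to a category and builds the corresponding indicator variables.

```python
import numpy as np

# Hypothetical interpolated ozone values at a few house locations.
ozone = np.array([5.2, 7.9, 9.6, 12.1, 6.4])

# Upper bounds of Good, Moderate and Unhealthy1; anything above falls in Unhealthy2.
bins = [6.4, 8.4, 10.4]
labels = ["Good", "Moderate", "Unhealthy1", "Unhealthy2"]
regime = np.digitize(ozone, bins, right=True)

# Indicator (dummy) variables, one column per regime, for use in the hedonic.
regime_dummies = (regime[:, None] == np.arange(4)[None, :]).astype(int)

for value, r in zip(ozone, regime):
    print(value, labels[r])
print(regime_dummies)
```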
Figure 1. Spatial distribution of price ($/m2) and location of monitoring stations.
2.2. Econometric Issues

We estimate a hedonic function in log-linear form, with three types of explanatory variables: house-specific characteristics, neighbourhood characteristics (measured at the census tract level), and air quality in the form of ozone (O3). We take an explicit spatial econometric approach, which includes testing for the presence of spatial autocorrelation and estimating specifications that incorporate spatial dependence. For a general overview of methodological issues involved in the specification, estimation and diagnostic testing of spatial econometric models, we refer the reader to Anselin (1988), Anselin & Bera (1998), and, more recently, Anselin (2006). In this section, we limit our remarks to the specific test statistics and estimation methods employed in the empirical exercise. We refer the reader to the literature for detailed technical treatments.4

We follow Anselin (1988) and distinguish between spatial dependence in a specification that incorporates a spatially lagged dependent variable, and a model with a spatial autoregressive error term. We refer to these as spatial lag and spatial error models. Formally, a spatial lag model is expressed as:

y = ρWy + Xβ + u,    (1)
where y is an n × 1 vector of observations on the dependent variable, X is an n × k matrix of observations on explanatory variables, W is an n × n spatial weights matrix, u an n × 1 vector of i.i.d. error terms, ρ is the spatial autoregressive coefficient, and β a k × 1 vector of regression coefficients. A spatial error model is:

y = Xβ + ε,    (2)

ε = λWε + u,    (3)
where ε is an n × 1 vector of spatial autoregressive error terms, with λ as the autoregressive parameter, and the other notation is as in equation (1).

By means of the spatial weights matrix W, a neighbour set is specified for each location. The positive elements wij of W are non-zero when observations i and j are neighbours, and zero otherwise. By convention, self-neighbours are excluded, such that the diagonal elements of W are zero. In addition, in practice, the weights matrix is typically row-standardized, such that Σj wij = 1. Many different definitions of the neighbour relation are possible, and there is little formal guidance in the choice of the 'correct' spatial weights.5 The term Wy in equation (1) is referred to as a spatially lagged dependent variable, or spatial lag. For a row-standardized weights matrix, it consists of a weighted average of the values of y in neighbouring locations, with weights wij.

In our application, we obtain the spatial weights matrix by first constructing a Thiessen polygon tessellation for the house locations, which turns the spatial representation of the sample from points into polygons. We next use simple contiguity (common boundaries) as the criterion to define neighbours. The resulting weights matrix is extremely sparse (0.005% non-zero weights) and contains on average six neighbours for each location (ranging from a minimum of 3 neighbours to a maximum of 35 neighbours for one observation). The weights are used in row-standardized form.

For each model specification, we first obtain ordinary least squares (OLS) estimates and assess the presence of spatial autocorrelation using the Lagrange Multiplier test statistics for error and lag dependence (Anselin, 1988), as well as their
robust forms (Anselin et al., 1996).6 The results consistently show very strong evidence of positive residual spatial autocorrelation, with a slight edge in favour of the spatial lag alternative (see Section 5). In addition to considering the estimates for this specification, we also estimated the spatial error model, to further assess the sensitivity of the results to the way spatial effects are incorporated in the regression.

We use two types of estimation approaches. First, we apply the classical ML method (Ord, 1975; Anselin, 1988), but use the characteristic polynomial technique to allow the estimation in very large data sets (Smirnov & Anselin, 2001). We also exploit a sparse conjugate gradient method to obtain the inverse of the asymptotic information matrix (Smirnov, 2005). These estimation techniques and the regression diagnostics are carried out using GeoDa statistical software (Anselin et al., 2006). For the spatial lag model, to avoid reliance on the assumption of Gaussian errors, we also use a robust estimation technique in the form of instrumental variables (IV) estimation, or spatial two-stage least squares (Anselin, 1988; Kelejian & Robinson, 1993; Kelejian & Prucha, 1998). In addition, to account for the considerable remaining heteroskedasticity, we implement a heteroskedastically robust form of spatial 2SLS, which is a special case of the recently suggested HAC estimator of Kelejian & Prucha (2005). Finally, for the spatial error model, we apply the generalized moments (GM) estimator of Kelejian & Prucha (1999), which does not require an assumption of Gaussian error terms. The robust estimation methods were programmed as custom functions in R statistical software.

One final methodological note pertains to the assessment of model fit. In spatial models, the use of the standard R2 measure is no longer appropriate (see Anselin, 1988, Ch. 14). When ML is used as the estimation method, a useful alternative measure is the value of the maximized log-likelihood, possibly adjusted for the number of parameters in the model in an Akaike Information Criterion (AIC) or other information criterion. However, for the models estimated by IV or GM, there is no corresponding measure. In order to provide for an informal comparison of the fit of the various specifications, we also report a pseudo-R2, in the form of the squared correlation between observed and predicted values of the dependent variable. In the classical linear regression model, this is equivalent to the R2, but in the spatial models the use of this measure is purely informal and should be interpreted with caution. For the spatial error model, the pseudo-R2 is simply the squared correlation between y and ŷ = Xβ̂, where β̂ is the estimated coefficient vector. However, in the spatial lag model the situation is slightly more complex. Since the spatially lagged dependent variable Wy is endogenous to the model, we obtain the predicted value from the expression for the conditional expectation of the reduced form:

ŷ = E[y|X] = (I − ρ̂W)⁻¹Xβ̂.    (4)
This operation requires the inverse of a matrix of dimension n × n, which is clearly impractical in the current situation. We therefore approximate the inverse by means of a power method, which is accurate up to 6 decimals of precision.7
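A minimal Python sketch of this computation is given below (ours, not the GeoDa or R code used by the authors): it builds a toy row-standardized sparse weights matrix and approximates (I − ρ̂W)⁻¹Xβ̂ by the truncated power series Σk (ρ̂W)^k Xβ̂. The contiguity structure, data and estimates are all placeholders.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(3)

# Toy contiguity structure: n observations, symmetric 0/1 neighbour matrix.
n = 500
rows = rng.integers(0, n, size=3 * n)
cols = rng.integers(0, n, size=3 * n)
mask = rows != cols
A = sp.coo_matrix((np.ones(mask.sum()), (rows[mask], cols[mask])), shape=(n, n)).tocsr()
A = A + A.T
A.data[:] = 1.0                              # binary symmetric contiguity

# Row-standardize so that each row of W sums to one.
row_sums = np.asarray(A.sum(axis=1)).ravel()
W = sp.diags(1.0 / np.maximum(row_sums, 1.0)) @ A

# Placeholder regressors, coefficient estimates and spatial parameter.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_hat = np.array([1.0, 0.5, -0.3, 0.2])
rho_hat = 0.6

# Power-series approximation of (I - rho*W)^{-1} X beta_hat.
xb = X @ beta_hat
y_hat, term = xb.copy(), xb.copy()
for _ in range(200):
    term = rho_hat * (W @ term)              # next term of the series
    y_hat += term
    if np.max(np.abs(term)) < 1e-6:          # roughly 6-decimal accuracy
        break

print(y_hat[:5])
```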
2.3. Remaining Issues

Our main focus is on the sensitivity of estimation results to the spatial interpolation method for the ozone measure. In order to keep this investigation tractable, there are several methodological aspects that we control for and do not pursue at this
stage. These include a range of issues traditionally raised in the context of hedonic model estimation, such as the sensitivity of the results to functional form, variable selection, measurement error, identification, distributional assumptions, etc. We include ozone as a single pollutant, given its visibility (as a major cause of smog) and extensive reporting in the popular media. We do not consider other potentially relevant criteria pollutants, such as particulate matter (PM2.5 and PM10). We use the same functional specification in all analyses, taking the dependent variable in log-linear form and including the site-specific and neighbourhood variables used in the studies by Beron et al. (2001, 2004). We dropped neighbourhood variables that were consistently not significant (such as a crime indicator). Arguably, more refined specifications could be considered, but this is relevant in the current context only to the extent that these would affect the estimates of the spatially interpolated ozone values differentially, which is doubtful.

We also consider only one spatial weights matrix, which implies that any interaction effect between the properties of the spatial interpolation methods and the specification of spatial weights has been ruled out. Since these are two very different approaches to dealing with spatial effects (one based on discrete locations and the other on a continuous surface), this seems a reasonable assumption. More important is the potential effect of spatial heterogeneity, in the form of coefficient instability and the presence of spatial market segmentation. We leave this as a topic for further investigation.8

Other potentially important methodological aspects that we do not consider at this time are the possible joint determination of location choice, house purchases and environmental quality. Apart from this source of endogeneity, there is also a potential errors-in-variables problem in the interpolated values. Since these values are treated as 'observations', any error associated with them is ignored. To the extent that such error patterns may be correlated with the regression error, this may result in biased estimates (Anselin, 2001b). We maintain that while these issues are important in and of themselves, they are less relevant in the current context, where the sensitivity to the different interpolation methods is our main concern. Our implicit assumption is therefore that the relative performance of the interpolation methods will not be affected by ignoring these other methodological aspects. We intend to investigate this further in future work.

3. Spatial Interpolation of Point Measures of Air Quality

In our empirical analysis, we need to allocate ozone measures obtained at the location of 27 monitoring stations to the locations of 115,732 sales transactions. This 'point-to-point interpolation' is the simplest among the change-of-support problems, and is well understood. The four techniques that we consider are readily available in commercial GIS software, such as ESRI's ArcGIS and its Spatial Analyst and Geostatistical Analyst extensions.

Thiessen or proximity polygons (also known as Voronoi diagrams, the dual of the Delaunay triangulation) are obtained by assigning to each house the value measured at the nearest monitoring station.
This results in the partitioning of space into a tessellation, which corresponds to the simple notion of a spatial market area in the situation where only transportation costs matter.9 Consequently, the value for ozone follows a step function, taking on only as many different values as observed at the monitoring stations.
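For concreteness, the nearest-station assignment can be written in a few lines. This is a hedged sketch with illustrative array names, assuming planar coordinates; the authors carried out all four interpolations in ArcGIS rather than in code of this form.

```python
import numpy as np
from scipy.spatial import cKDTree

# station_xy: (27, 2) array of monitor coordinates; station_ozone: (27,) readings
# house_xy:   (n, 2) array of house-sale coordinates
def thiessen_interpolate(station_xy, station_ozone, house_xy):
    """Assign to each house the ozone value of its nearest monitoring station,
    i.e. the value of the Thiessen (Voronoi) polygon the house falls in."""
    tree = cKDTree(station_xy)
    _, nearest = tree.query(house_xy, k=1)   # index of the nearest station
    return np.asarray(station_ozone)[nearest]
```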
Inverse distance weighting is a weighted average of the values observed at the different monitors in the sample, with greater weight assigned to closer stations. In practice, due to the distance decay effect, the average only includes values observed for a few nearest neighbours. Formally, the interpolated value at j is obtained as: P wz zj Pi i i ; (5) i wi where the weights wi 1=f (dji ) and f (dji ) is a power of the distance between j and i. In our study, we set f (dji ) 1=dji2 : 10 Kriging is an optimal linear predictor based on a variogram model of spatial autocorrelation. This is grounded in geostatistical theory and has an established tradition in natural resource modelling. A detailed discussion of the statistical principles behind Kriging is beyond the scope of this paper, and we refer the reader to extensive treatments in Cressie (1993, Ch. 3), Burrough & McDonnell (1998, Ch. 6), and Schabenberger & Gotway (2005, Ch. 5), among others. In our study, we used ordinary Kriging (i.e. the interpolation was based on the ozone value itself, without additional explanatory variables in the model) and allowed for directional effects in a spherical model.11 Spline interpolators are based on fitting a surface through a set of points while minimizing a smoothness functional, i.e. a function of the coordinates that represents a continuous measure of fit subject to constraints on the curvature of the surface. Parameters can be set to specify the ‘stiffness’ of the surface through a tension parameter, which can be interpreted as a measure of the extent to which any given point influences the fitted surface.12 We applied a regularized spline with the weight set at 0.1. 4. Spatial Interpolation and Air Quality Regimes We begin with a comparison of descriptive statistics for the ozone values assigned to the house locations using each of the four interpolation procedures. Table 2 summarizes the main results. While the overall averages for the four methods are very similar, the large number of observations means that standard tests on equality of means or medians strongly reject these null hypotheses.13 It is important to note the difference in variance between the interpolated measures, as well as the different range. By design, both Thiessen and IDW methods respect the range of the original observations (for the 27 monitoring stations), whereas the Kriging and spline methods do not. While the results for Kriging stay within the observed range, the spline method yields interpolated values Table 2. Descriptive statistics: interpolated ozone values
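Equation (5) translates directly into code. The sketch below is illustrative only: it assumes planar coordinates and restricts the average to an arbitrary cut-off of the five nearest monitors (the paper states only that a few nearest neighbours effectively receive weight).

```python
import numpy as np
from scipy.spatial import cKDTree

def idw_interpolate(station_xy, station_ozone, house_xy, k=5, power=2):
    """Inverse distance weighting as in equation (5): a weighted average of the k
    nearest monitors with weights w_i = 1 / d_ji**power."""
    tree = cKDTree(station_xy)
    dist, idx = tree.query(house_xy, k=k)      # distances and indices, shape (n, k)
    dist = np.maximum(dist, 1e-12)             # guard against a house on top of a station
    w = 1.0 / dist**power
    z = np.asarray(station_ozone)[idx]
    return (w * z).sum(axis=1) / w.sum(axis=1)
```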
4. Spatial Interpolation and Air Quality Regimes

We begin with a comparison of descriptive statistics for the ozone values assigned to the house locations using each of the four interpolation procedures. Table 2 summarizes the main results. While the overall averages for the four methods are very similar, the large number of observations means that standard tests on equality of means or medians strongly reject these null hypotheses.13 It is important to note the difference in variance between the interpolated measures, as well as the different range. By design, both the Thiessen and IDW methods respect the range of the original observations (for the 27 monitoring stations), whereas the Kriging and spline methods do not. While the results for Kriging stay within the observed range, the spline method yields interpolated values both below as well as above the observed range.

Table 2. Descriptive statistics: interpolated ozone values

                                                Correlation
            Mean     SD      Range              Thiessen   IDW      Kriging
Thiessen    8.280    2.033   4.707-13.467       1.0
IDW         8.276    1.906   4.707-13.467       0.980      1.0
Kriging     8.233    1.912   4.718-13.464       0.933      0.965    1.0
Spline      8.246    1.967   4.543-15.307       0.939      0.960    0.967
Table 3. Observations by air quality regime

            Good              Moderate          Unhealthy1        Unhealthy2
Thiessen    20,191  (17.5%)   41,761  (36.1%)   30,070  (26.0%)   23,710  (20.5%)
IDW         19,363  (16.7%)   43,825  (37.9%)   30,926  (26.7%)   21,618  (18.7%)
Kriging     27,368  (23.6%)   32,094  (27.7%)   37,242  (32.2%)   19,028  (16.4%)
Spline      19,649  (17.0%)   44,410  (38.4%)   31,149  (26.9%)   20,524  (17.7%)
Of the four methods, Thiessen has the highest overall average (8.28) and Kriging the lowest (8.23), while Thiessen also has the highest standard deviation (2.03), due to its being a step function rather than a continuous smoother. The non-spatial correlations between the four interpolated values are extremely high, with the lowest observed between Thiessen and Kriging (0.933) and the highest between Thiessen and IDW (0.980).

The allocation of the interpolated values to the four categories of Good, Moderate, Unhealthy1 and Unhealthy2 is shown in Table 3. Interesting differences occur both at the low end and at the high end. In the best category, the largest share is obtained for the Kriging method, with 23.6%, compared to values below 20% for the others. The Thiessen method yields the largest share of houses in the worst group (20.5%), but when the two worst categories are taken together, the greatest share is for Kriging (48.6%). The resulting spatial distributions are quite distinct as well, as illustrated in Figures 2-5. Note in particular the qualitative difference between the edges of the regimes for the Thiessen and IDW interpolations, which are centred on the monitoring stations, and the much smoother patterns for Kriging and spline. Both of these show roughly parallel zones of decreasing air quality moving away from the coast. Also note the peculiar elliptical shape of the Good zone for the spline interpolator, in contrast to a region that includes most of the coastal properties in Los Angeles County and north-west Orange County for Kriging.

Figure 2. Spatial regimes for Thiessen interpolation.
Figure 3. Spatial regimes for IDW interpolation.
Figure 4. Spatial regimes for Kriging interpolation.
Figure 5. Spatial regimes for spline interpolation.

5. Spatial Interpolation and Parameter Estimates in Hedonic Models

We start with a broad overview of the results before focusing more specifically on the estimates of the spatial coefficients and the parameters of the ozone variable. We estimated the hedonic model using six different methods, with both a continuous and a discrete (regimes) ozone variable, and for each of the four interpolators, for a total of 48 specifications. The detailed results are not listed here, and only the salient characteristics are summarized.14
The point of departure is the OLS estimation of the familiar log-linear hedonic model, which achieves a reasonable fit, with an adjusted R2 ranging from 0.769 (Thiessen, continuous) to 0.774 (Kriging, regimes). This fit is comparable to, for example, the results reported in Beron et al. (2001), where the R2 values are around 0.70, although ours is a considerably larger data set. There is also strong evidence of very significant positive residual spatial autocorrelation, supported by both LM-Error and LM-Lag test statistics, with a slight edge in favour of the latter alternative.15 This is not surprising, given the fine spatial grain at which we have observations on the sales transactions and the lack of such spatial detail for the neighbourhood characteristics. If we maintain the spatial lag model as the proper alternative, the OLS estimates are biased and should be interpreted with caution.

For each interpolator, and in both the continuous and discrete instances, the spatial lag specification obtains the best fit, and all spatial models fit the data considerably better than the non-spatial OLS. This is a further indication that the latter may yield biased estimates. To illustrate the improvement in fit, consider the best interpolator, Kriging, for which the log-likelihood improves in the continuous case from -16,927 in the standard regression model to -7,119 in the spatial lag model (the R2 value goes from 0.772 to a pseudo-R2 of 0.814). Similar improvements are obtained for the other specifications. Interestingly, the relative fit of the four interpolators is consistent across all estimation methods and for both the continuous and regimes ozone variable. In each case, Kriging is best, followed by spline and IDW, with Thiessen the worst. Also, in all but one instance (Thiessen, Lag-ML), the regimes model fits the data better than the continuous one.

For OLS, the coefficient estimates for the house and neighbourhood characteristics are significant and with the expected sign, except for Elevation and AC, which were both found to be negative. The Elevation coefficient may in part capture an interaction effect with air quality, but the negative value for AC does not have an obvious explanation. The base case for the counties is
Los Angeles, with negative dummies in increasing order of absolute value for Orange, San Bernardino and Riverside. The main difference between OLS and the spatial lag models lies in the absolute magnitude of the estimates, with consistently much smaller values in the spatial lag specification, as is to be expected. However, note that the OLS estimates may be suspect, given the strong indication in favour of the lag specification.16 The signs and significance are maintained for all but the coefficient of Income, which becomes negative in Lag-ML. However, this is only significant for Thiessen and IDW under ML estimation, but not for the other two methods. Also, the significance disappears for the IV and IV-Robust estimates (in the latter the coefficient is positive, but not significant).

A closer look at the estimates for the spatial autoregressive parameter is provided in Table 4. All estimated coefficients are highly significant, with slightly higher magnitudes for the spatial autoregressive error parameter (but note that the spatial error model is inferior in terms of fit relative to the spatial lag model). The spatial autoregressive lag coefficient ranges from 0.376 (Kriging, Lag-IV) to 0.446 (Thiessen, Lag-ML). The largest estimates are consistently for the Thiessen interpolator, and the smallest for Kriging. The ranking between estimation methods is consistent as well, with the estimate for Lag-IVR between the higher Lag-ML and the lower Lag-IV. As is to be expected, the estimated standard error is largest for the robust estimator. Relative to the continuous results, the Lag-ML estimates are smaller for the regimes models, but slightly larger for both IV estimators. However, when taking into account the standard errors of the estimates, there is little indication of a significant effect of the interpolator on the estimate of the spatial parameter. For example, consider an interval of two standard errors around the Lag-IVR estimates for Kriging: $0.3988 \pm 0.0148$, or up to 0.4136, for continuous ozone, and $0.4039 \pm 0.0148$, or up to 0.4187, for the regimes. In each case, the point estimates for the other interpolators are included in this interval, suggesting they do not differ significantly.
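The overlap argument can be verified directly from the entries of Table 4. The small calculation below, using the Lag-IVR estimates for the continuous model, is purely illustrative and not part of the original analysis.

```python
# Lag-IVR estimates for the continuous model from Table 4: (point estimate, asymptotic SE)
estimates = {"Thiessen": (0.4054, 0.0074), "IDW": (0.4028, 0.0074),
             "Kriging": (0.3988, 0.0074), "Spline": (0.4028, 0.0074)}

for name, (rho, se) in estimates.items():
    print(f"{name:9s} {rho - 2 * se:.4f} to {rho + 2 * se:.4f}")
# The Kriging interval (0.3840 to 0.4136) contains the other three point estimates.
```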
Table 4. Estimates for spatial autoregressive parameter (a)

Continuous model
Model      Thiessen           IDW                Kriging            Spline
Lag-ML     0.4457 (0.0030)    0.4438 (0.0030)    0.4399 (0.0030)    0.4428 (0.0030)
Lag-IV     0.3804 (0.0052)    0.3791 (0.0052)    0.3758 (0.0052)    0.3786 (0.0052)
Lag-IVR    0.4054 (0.0074)    0.4028 (0.0074)    0.3988 (0.0074)    0.4028 (0.0074)
Err-ML     0.5165 (0.0036)    0.5139 (0.0036)    0.5090 (0.0036)    0.5130 (0.0036)
Err-GM     0.4653             0.4634             0.4599             0.4626

Regimes model
Model      Thiessen           IDW                Kriging            Spline
Lag-ML     0.4440 (0.0031)    0.4404 (0.0031)    0.4329 (0.0031)    0.4348 (0.0031)
Lag-IV     0.3831 (0.0052)    0.3808 (0.0052)    0.3756 (0.0052)    0.3786 (0.0052)
Lag-IVR    0.4142 (0.0074)    0.4112 (0.0074)    0.4039 (0.0074)    0.4058 (0.0074)
Err-ML     0.5147 (0.0036)    0.5102 (0.0036)    0.4999 (0.0037)    0.5035 (0.0037)
Err-GM     0.4632             0.4595             0.4524             0.4540

(a) Asymptotic standard errors are given in parentheses, except for the generalized moments method (λ is a nuisance parameter).
The situation is quite different for the ozone parameters, where we find a distinct and significant effect of the interpolator. The details are summarized in Table 5 for the continuous measure, and in Table 6 for the regimes. As shown in Table 5, all the estimates for the continuous ozone variable are highly significant and have the expected negative sign. Relative to OLS and the spatial error models, the absolute values are considerably smaller in the spatial lag model, for example going from 0.0270 for OLS Kriging to 0.0179 for Lag-ML Kriging. Interestingly, this is less the case for the Thiessen interpolator. The Kriging value is consistently the largest in absolute value, and exceeds the others by more than 2 standard errors. The Thiessen value is consistently the smallest in absolute value. IDW and spline are not significantly different from each other and lie in between these two extremes.

The differences between the interpolators are accentuated in the regimes results (Table 6). For OLS, the estimates for the Moderate category are counterintuitive, being positive and significant. This is also the case for the Unhealthy1 category using the Thiessen and IDW interpolators. In contrast, the corresponding estimates for Kriging and spline are significant and negative. Only for the worst category (Unhealthy2) are the estimates negative across all interpolators, with the value for Kriging significantly larger in absolute value than the others (again, with Thiessen yielding the smallest value). These results are essentially the same in the spatial error models, only with larger standard errors.

The main difference occurs for the spatial lag specifications. Here, the Kriging interpolator yields results consistent with expectations. Even though the estimate for Moderate is positive, it is not significant, and both Unhealthy categories are highly significant and negative, with a larger absolute value for the worst category. The three other interpolators maintain a positive and significant value for Moderate. For Thiessen and IDW, Unhealthy1 is positive as well, although no longer significant for the latter. Spline has negative and significant values for Unhealthy1. Overall, these results suggest that the Kriging interpolator in a spatial lag specification is the only one that yields estimates for a categorical air quality variable consistent with expectations. This confirms earlier indications that this model also obtained the best fit.
Table 5. Estimates for ozone parameter (continuous model) (a)

Model      Thiessen            IDW                 Kriging             Spline
OLS        -0.0126 (0.0007)    -0.0204 (0.0008)    -0.0270 (0.0007)    -0.0206 (0.0007)
Lag-ML     -0.0101 (0.0006)    -0.0148 (0.0007)    -0.0179 (0.0006)    -0.0139 (0.0006)
Lag-IV     -0.0105 (0.0006)    -0.0156 (0.0007)    -0.0192 (0.0007)    -0.0149 (0.0006)
Lag-IVR    -0.0101 (0.0006)    -0.0150 (0.0007)    -0.0187 (0.0007)    -0.0147 (0.0006)
Err-ML     -0.0120 (0.0012)    -0.0207 (0.0014)    -0.0277 (0.0013)    -0.0206 (0.0013)
Err-GM     -0.0121 (0.0011)    -0.0207 (0.0013)    -0.0276 (0.0012)    -0.0206 (0.0011)

(a) Asymptotic standard errors are given in parentheses.
Table 6. Estimates for ozone regime parameters (a)

Model      Variable      Thiessen            IDW                 Kriging             Spline
OLS        Moderate      0.0528 (0.0030)     0.0540 (0.0029)     0.0310 (0.0027)     0.0449 (0.0030)
           Unhealthy1    0.0357 (0.0029)     0.0230 (0.0029)    -0.0309 (0.0026)    -0.0051 (0.0029)
           Unhealthy2   -0.0365 (0.0048)    -0.0945 (0.0048)    -0.1761 (0.0042)    -0.1397 (0.0043)
Lag-ML     Moderate      0.0140 (0.0027)     0.0148 (0.0026)     0.0006 (0.0024)     0.0096 (0.0027)
           Unhealthy1    0.0081 (0.0026)     0.0007 (0.0026)    -0.0300 (0.0024)    -0.0152 (0.0026)
           Unhealthy2   -0.0432 (0.0043)    -0.0691 (0.0043)    -0.1161 (0.0038)    -0.0969 (0.0039)
Lag-IV     Moderate      0.0194 (0.0027)     0.0201 (0.0027)     0.0046 (0.0025)     0.0141 (0.0027)
           Unhealthy1    0.0119 (0.0026)     0.0037 (0.0027)    -0.0301 (0.0024)    -0.0139 (0.0026)
           Unhealthy2   -0.0423 (0.0043)    -0.0725 (0.0043)    -0.1241 (0.0039)    -0.1024 (0.0040)
Lag-IVR    Moderate      0.0127 (0.0030)     0.0142 (0.0030)     0.0011 (0.0026)     0.0094 (0.0028)
           Unhealthy1    0.0096 (0.0026)     0.0016 (0.0027)    -0.0292 (0.0024)    -0.0155 (0.0026)
           Unhealthy2   -0.0432 (0.0039)    -0.0727 (0.0040)    -0.1183 (0.0037)    -0.0995 (0.0037)
Err-ML     Moderate      0.0544 (0.0053)     0.0558 (0.0052)     0.0342 (0.0047)     0.0499 (0.0052)
           Unhealthy1    0.0365 (0.0052)     0.0229 (0.0053)    -0.0313 (0.0047)    -0.0044 (0.0051)
           Unhealthy2   -0.0251 (0.0086)    -0.0900 (0.0085)    -0.1764 (0.0073)    -0.1322 (0.0076)
Err-GM     Moderate      0.0541 (0.0048)     0.0556 (0.0048)     0.0339 (0.0044)     0.0488 (0.0048)
           Unhealthy1    0.0363 (0.0047)     0.0228 (0.0048)    -0.0314 (0.0043)    -0.0047 (0.0047)
           Unhealthy2   -0.0268 (0.0079)    -0.0907 (0.0078)    -0.1768 (0.0068)    -0.1339 (0.0070)

(a) Standard errors are given in parentheses.
6. The Valuation of Air Quality

We conclude this empirical exercise by comparing the valuation of air quality as computed from the parameter estimates for the different interpolators. Theory suggests that the partial derivative of the hedonic price equation with respect to each explanatory variable yields its implicit price. Assuming that the housing market is in equilibrium, this can be interpreted as the marginal willingness to pay (MWTP) for a non-traded good such as air quality.17 Since our specification is log-linear, this yields:

$$\mathrm{MWTP}_z = \frac{\partial e^{\ln P}}{\partial z} = \hat{\beta}_z P. \qquad (6)$$
In practice, this can be computed by using the average price for P. As shown in Kim et al. (2003, p. 35), the effect of the spatial multiplier in a spatial lag specification is to change the MWTP to
$$\mathrm{MWTP}_z = \frac{\partial e^{\ln P}}{\partial z} = \frac{1}{1-\hat{\rho}}\,\hat{\beta}_z P, \qquad (7)$$
assuming a spatially uniform unit change, and with $\hat{\rho}$ as the estimate for the spatial autoregressive parameter.

We begin by comparing the 'analytical' MWTP estimates for each of the interpolators between OLS and the spatial lag model that result from a 1 ppb decrease in the value of the ozone variable. This change is assumed to apply uniformly throughout the sample and amounts, on average, to a 12% decrease. For the standard case (OLS), we apply equation (6), with a value of $239,518 for the average house price in the sample. The results are reported in Table 7, where both the dollar amounts and the corresponding percentage of the house price are listed. Also, an approximate measure of the precision of the point estimate is given, obtained by computing the value for the parameter estimate plus or minus 2 standard errors. For the spatial lag model we use equation (7), with the estimates for $\hat{\rho}$, $\hat{\beta}$ and the corresponding standard errors from the IVR method. Note that our reported 'standard errors' in the spatial lag case are an underestimate of uncertainty, since the spatial parameter is assumed fixed (only the parameter values for ozone are changed). This provides a reasonable approximation of the relative precision, but does not correspond to an analytical estimate of the overall standard error (e.g. as yielded by the delta method).

Before comparing the MWTP estimates between the non-spatial OLS results and the spatial lag model, note that the absolute value of the parameter estimate for ozone in the latter is considerably smaller than for OLS. As illustrated in Table 7, this is more than compensated for by the spatial multiplier effect. In all instances, the estimated MWTP for the spatial lag model is considerably larger than for the matching OLS case. There are also considerable differences between interpolators. The largest MWTP estimate is for Kriging in the spatial lag model. This value of $7,444 exceeds that of all the other interpolators by some $1,500. In the OLS case also, the Kriging estimate of $6,468 is much higher than the others. The smallest estimate is for Thiessen, as low as $3,028 for OLS and $4,087 for the spatial lag model. In percentage terms, this ranges from 1.26% for Thiessen OLS to 3.11% for Kriging spatial lag.
Table 7. Analytical marginal willingness to pay, by interpolator (a)

Model      Thiessen                 IDW                      Kriging                  Spline
OLS        $3,028 ($2,699-3,357)    $4,889 ($4,519-5,241)    $6,468 ($6,127-6,808)    $4,925 ($4,592-5,258)
           1.26% (1.13-1.40%)       2.04% (1.89-2.19%)       2.70% (2.56-2.84%)       2.06% (1.92-2.20%)
Lag-IVR    $4,087 ($3,609-4,566)    $6,031 ($5,496-6,567)    $7,444 ($6,920-7,969)    $5,899 ($5,394-6,404)
           1.71% (1.51-1.91%)       2.52% (2.29-2.74%)       3.11% (2.89-3.33%)       2.46% (2.25-2.67%)

(a) Uniform 1 ppb O3 improvement, assuming average house price. Two standard error bounds are given in parentheses.
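The dollar figures in Table 7 can be reproduced, up to rounding of the published coefficients, from equations (6) and (7). The snippet below is purely illustrative: it plugs in the mean house price and the Kriging estimates reported in Tables 4 and 5.

```python
mean_price = 239_518.0                                    # average house price in the sample
beta_ols, beta_lag, rho_lag = -0.0270, -0.0187, 0.3988    # Kriging: OLS and Lag-IVR estimates

mwtp_ols = -beta_ols * mean_price                     # equation (6), 1 ppb decrease in ozone
mwtp_lag = -beta_lag * mean_price / (1.0 - rho_lag)   # equation (7), with the spatial multiplier
print(round(mwtp_ols), round(mwtp_lag))
# roughly 6,467 and 7,450, close to the $6,468 and $7,444 in Table 7;
# the small differences reflect rounding of the published estimates.
```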
The analytical approach breaks down for the categorical measures of air quality. Also, the uniform decrease of the ozone value throughout the sample does not fully account for a possible differential effect of the interpolation methods. To assess this more closely, we introduce a simulation approach, based on re-interpolating values from the locations of the monitoring stations to the house locations. We lower the value observed at each station by 1 ppb and obtain new measures for each house location by interpolating. Note that, except for the Thiessen method, this does not result in a uniform decrease for each house, since the interpolators are non-linear. Finally, we compute the predicted price for each house using the new ozone value (holding the other parameters and observed characteristics constant) and compare this to the original sales price. In this process, we need to take account of the spatial multiplier to obtain the predicted value in the spatial lag model. Since the change in the variable is not uniform across space, the simplifying result used in equation (7) no longer holds. Instead, we must use the reduced form explicitly, as in equation (4). As before, we obtain an approximate measure of precision by carrying out the calculation for the parameter value plus or minus 2 standard errors.

In contrast to the analytical approach, this method can be used for both the continuous and categorical ozone models, since the newly interpolated ozone value can be reallocated to one of the four regimes. Also, since the predicted price is computed for each individual house, the results can be presented for any degree of spatial aggregation. The new relative distribution of observations that results from the allocation of the interpolated values to the four regimes is given in Table 8. This should be compared with the percentages in Table 3. The new interpolation results in a drastic shift of observations out of the Unhealthy2 category.

The simulated MWTP values for the continuous ozone model are reported in Table 9 as an average for the full sample. Relative to the analytical results (Table 7), the estimates are similar in magnitude, although uniformly somewhat smaller, ranging from a low of $2,895 for Thiessen OLS to a high of $6,961 for Kriging spatial lag. As before, the values differ greatly across interpolators, with Kriging yielding the highest estimates and Thiessen the lowest. Also, again the values are greater for the spatial lag model relative to OLS, although to a lesser extent than in the analytical approach.

A final assessment is presented in Table 10, where the estimated MWTP is given for both the continuous and regimes models, and reported for the complete sample as well as for each county. Two major features stand out. First, considering the totals only, the values for the regime models are clearly deficient in the OLS case, a direct result of the wrong signs obtained for the parameter estimates. Only for the Kriging interpolator are they comparable to previous results.
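For the continuous ozone variable, the simulation exercise described above can be sketched as follows, reusing the lag_model_prediction and interpolation helpers sketched earlier. The function and argument names are illustrative assumptions rather than the authors' code, and exponentiating the predicted log-price ignores the retransformation of the log-scale error.

```python
import numpy as np

def simulate_improvement(W, X, beta_hat, rho_hat, ozone_col,
                         station_xy, station_ozone, house_xy, interpolate):
    """Lower each monitor reading by 1 ppb, re-interpolate to the house locations,
    and recompute predicted prices through the spatial-lag reduced form (equation 4)."""
    X_new = np.asarray(X).copy()
    X_new[:, ozone_col] = interpolate(station_xy, np.asarray(station_ozone) - 1.0, house_xy)
    lnp_base = lag_model_prediction(W, X, beta_hat, rho_hat)      # baseline prediction
    lnp_new = lag_model_prediction(W, X_new, beta_hat, rho_hat)   # post-improvement prediction
    return np.exp(lnp_new) - np.exp(lnp_base)   # implied dollar change in predicted price per house
```

Because the change is computed house by house, the results can then be averaged over any spatial aggregation, as in Tables 9 and 10.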
Table 8. Reallocation of observations by air quality regime

                 Good    Moderate   Unhealthy1   Unhealthy2
Thiessen (%)     32.8    41.3       22.0         3.9
IDW (%)          33.3    40.7       23.9         2.0
Kriging (%)      35.5    38.6       23.6         2.3
Spline (%)       37.7    37.2       22.6         2.4
Table 9. Simulated marginal willingness to pay, continuous model (a)

Model      Thiessen                 IDW                      Kriging                  Spline
OLS        $2,895 ($2,609-3,175)    $4,686 ($4,391-4,974)    $6,213 ($5,952-6,469)    $4,727 ($4,455-4,991)
           1.21% (1.08-1.34%)       1.96% (1.82-2.11%)       2.60% (2.47-2.74%)       1.98% (1.85-2.11%)
Lag-IVR    $3,808 ($3,415-4,187)    $5,640 ($5,231-6,035)    $6,961 ($6,583-7,326)    $5,511 ($5,122-5,884)
           1.65% (1.46-1.84%)       2.44% (2.23-2.66%)       3.01% (2.80-3.23%)       2.39% (2.18-2.59%)

(a) 1 ppb O3 improvement at each monitoring station. Two standard error bounds are given in parentheses.
Second, the aggregate values mask considerable spatial heterogeneity, especially for the regime models. For example, taking the Lag-IVR results for Kriging (i.e. using the estimates with the best fit), the impact ranges from a low of $266 for Orange County to a high of $14,013 (9.7%) for San Bernardino County. This pattern contrasts with the results for the continuous measure, where the highest dollar impact (for Kriging) is for Orange County ($7,413), although the highest percentage impact remains for San Bernardino County (4.15%). In addition, even with the Lag-IVR results, negative impacts are obtained in Orange County for the three other interpolators. This suggests that spatial heterogeneity may need to be taken into account by more than a county indicator variable. It also highlights the fact that a sole focus on spatially aggregate indicators of valuation (such as the average across the region) may be misleading.

Table 10. Simulated marginal willingness to pay, by county (a)

Continuous model
Model      Region   Thiessen            IDW                 Kriging             Spline
OLS        All      $2,895    1.21%     $4,686    1.96%     $6,213    2.60%     $4,727    1.98%
           LA       $2,980    1.17%     $4,826    1.89%     $6,422    2.51%     $4,903    1.92%
           RI       $2,211    1.69%     $3,468    2.65%     $4,401    3.35%     $3,445    2.63%
           SB       $2,399    1.67%     $3,892    2.71%     $5,178    3.60%     $3,884    2.70%
           OR       $3,429    1.06%     $5,599    1.74%     $7,457    2.32%     $5,580    1.73%
Lag-IVR    All      $3,808    1.65%     $5,640    2.44%     $6,961    3.01%     $5,511    2.39%
           LA       $4,012    1.58%     $5,952    2.34%     $7,375    2.90%     $5,858    2.30%
           RI       $3,018    2.29%     $4,332    3.28%     $5,112    3.87%     $4,166    3.16%
           SB       $3,246    2.26%     $4,823    3.36%     $5,972    4.15%     $4,660    3.25%
           OR       $4,000    1.45%     $5,977    2.16%     $7,413    2.69%     $5,761    2.08%

Regimes model
Model      Region   Thiessen            IDW                 Kriging             Spline
OLS        All      -$311    -0.12%     $1,215    0.51%     $5,103    2.14%     $2,011    0.84%
           LA       $18       0.01%     $676      0.26%     $4,604    1.80%     $1,987    0.78%
           RI       $7,669    5.26%     $11,577   8.83%     $11,129   8.47%     $13,322   10.16%
           SB       $8,335    5.20%     $11,711   8.15%     $13,879   9.65%     $10,519   7.31%
           OR       -$13,735 -3.80%     -$11,964 -3.71%     -$3,943  -1.23%     -$12,222 -3.78%
Lag-IVR    All      $1,532    0.66%     $2,858    1.24%     $5,972    2.59%     $4,010    1.74%
           LA       $260      0.10%     $1,074    0.42%     $4,910    1.93%     $2,822    1.11%
           RI       $8,354    6.34%     $12,382   9.36%     $11,089   8.37%     $13,861   10.49%
           SB       $9,032    6.28%     $12,465   8.66%     $14,013   9.74%     $10,997   7.65%
           OR       -$4,110  -1.49%     -$4,314  -1.57%     $266      0.10%     -$3,610  -1.31%

(a) 1 ppb O3 improvement at each monitoring station, point estimates. Percentages are relative to the average house price in each region. LA = Los Angeles, RI = Riverside, SB = San Bernardino, OR = Orange.

7. Conclusion

Our empirical analysis re-emphasizes the need to explicitly account for spatial autocorrelation and spatial heterogeneity in the estimation of hedonic house price models: space matters. There was very strong evidence of the presence of positive spatial autocorrelation, even after controlling for the same house characteristics and neighbourhood variables used in previous empirical analyses of this housing market. In our (dense) sample of transactions, a spatial lag model yielded the best results. Consequently, ignoring this aspect, as is the case in a traditional OLS estimation, would yield estimates that are most likely biased. This is important in the current context, since the parameter estimates are directly linked to an economic interpretation, such as the valuation of air quality. In addition to spatial autocorrelation, a high degree of heteroskedasticity warranted the use of a heteroskedastically robust estimator. There is some indication that simply including indicator variables for the counties (as submarkets) may not be sufficient to address spatial heterogeneity.

More importantly, we found that the manner in which ozone measures are spatially interpolated to the locations of house sales transactions has a significant effect on the estimate of the air quality parameter in the hedonic equation and on the associated estimate of marginal willingness to pay. Simple solutions, such as Thiessen polygons, may lead to nonsensical results for the economic implications of the model. While the coefficients of the other variables did not change much across interpolators, this was not the case for the ozone parameter.
Of the four methods, the Kriging interpolator consistently yielded the best fit, as well as the most reasonable parameter signs and magnitudes, and related measures of marginal willingness to pay. In addition, there was some indication that the use of categorical variables rather than a continuous ozone measure was superior. In order to deal with the lack of continuity of such variables, we employed a simulation method to estimate the change in house value associated with a decrease in ozone levels. This revealed the importance of spatial scale, with results at the county level that were vastly different from the regional aggregate.
While several methodological issues remain to be addressed, our findings suggest that the quality of the spatial interpolation deserves the same type of attention in the specification and estimation of hedonic house price models as more traditional concerns. In future work, we intend to further investigate the role of spatial heterogeneity and the potential endogeneity of the air quality measure.
Notes
1. For an extensive empirical assessment of spatial interpolation methods applied to ozone mapping, see, for example, Phillips et al. (1997) and Diem (2003).
2. Other studies of the relation between house prices and air quality in this region can be found in Graves et al. (1988) and Beron et al. (1999, 2001, 2004), although only Beron et al. (2004) takes an explicitly spatial econometric approach. Also of interest is a general equilibrium analysis of ozone abatement in the same region, using a hierarchical locational equilibrium model, outlined in Smith et al. (2004).
3. Owing to missing values, some stations had to be dropped from the complete set of stations available in the region during that time period.
4. For recent collections reviewing the state of the art, see also Florax & van der Vlist (2003), Anselin et al. (2004), Getis et al. (2004), LeSage et al. (2004), LeSage & Pace (2004) and Pace & LeSage (2004).
5. For a more extensive discussion, see Anselin (2002, pp. 256-260) and Anselin (2006, pp. 909-910).
6. See Anselin (2001a) for an extensive review of statistical issues.
7. This is implemented in the Python language-based PySAL library of spatial analytical routines; see http://sal.uiuc.edu/projects_pysal.php
8. This is in addition to potential problems caused by the use of aggregate (census-tract level) variables in the explanation of individual house prices (see Moulton, 1990).
9. For an extensive technical treatment of tessellations, see Okabe et al. (1992).
10. For further discussion of IDW, see, for example, Longley et al. (2001, pp. 296-297).
11. The estimated parameter values were 302 and 7 for the direction (angle), 6 and 192 for the partial sill, 199,490 for the major range and 67,334 for the minor range. All Kriging interpolations were carried out with the ESRI ArcGIS Geostatistical Analyst extension.
12. For a technical discussion, see, for example, Mitasova & Mitas (1993) and Mitas & Mitasova (1999).
13. The detailed results are not reported here, but are available from the authors.
14. The detailed results are available from the authors and are included in an earlier Working Paper version.
15. The detailed test statistics are not reported, but are available from the authors. All test statistics are significant with a p-value of less than 0.0000001 (the greatest precision reported by the software).
16. Since the spatial error model is consistently inferior in fit relative to the lag specification, we will not discuss it in detail here. The main distinguishing characteristic of the findings is the difference in estimated standard errors between OLS and the spatial error model. As a result, the coefficients of AC and Poverty are no longer significant in the spatial error model.
17. In addition to the equilibrium assumption, this interpretation is further complicated by the fact that the estimated marginal benefits represent the capitalized rather than the annual value of the benefits of air quality improvement. Therefore, other considerations, such as the length of time the buyer expects to reside in the house, the discount rate and the projected time path for air quality, should all be taken into account (see also Kim et al., 2003, pp. 34-37, for further discussion).
References
Anselin, L. (1988) Spatial Econometrics: Methods and Models, Dordrecht, Kluwer.
Anselin, L. (1998) GIS research infrastructure for spatial analysis of real estate markets, Journal of Housing Research, 9(1), 113-133.
Anselin, L. (2001a) Rao's score test in spatial econometrics, Journal of Statistical Planning and Inference, 97, 113-139.
Anselin, L. (2001b) Spatial effects in econometric practice in environmental and resource economics, American Journal of Agricultural Economics, 83(3), 705-710.
Anselin, L. (2002) Under the hood. Issues in the specification and interpretation of spatial regression models, Agricultural Economics, 27(3), 247-267.
Anselin, L. (2006) Spatial econometrics, in: T. Mills & K. Patterson (eds) Palgrave Handbook of Econometrics. Vol. 1: Econometric Theory, pp. 901-969, Basingstoke, Palgrave Macmillan.
Anselin, L. & Bera, A. (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics, in: A. Ullah & D. E. Giles (eds) Handbook of Applied Economic Statistics, pp. 237-289, New York, Marcel Dekker.
Anselin, L., Bera, A., Florax, R. J. & Yoon, M. (1996) Simple diagnostic tests for spatial dependence, Regional Science and Urban Economics, 26, 77-104.
Anselin, L., Florax, R. J. & Rey, S. J. (2004) Advances in Spatial Econometrics. Methodology, Tools and Applications, Berlin, Springer.
Anselin, L., Syabri, I. & Kho, Y. (2006) GeoDa, an introduction to spatial data analysis, Geographical Analysis, 38, 5-22.
Banerjee, S., Carlin, B. P. & Gelfand, A. E. (2004) Hierarchical Modeling and Analysis for Spatial Data, Boca Raton, FL, Chapman & Hall/CRC.
Basu, S. & Thibodeau, T. G. (1998) Analysis of spatial autocorrelation in housing prices, Journal of Real Estate Finance and Economics, 17, 61-85.
Beron, K. J., Hanson, Y., Murdoch, J. C. & Thayer, M. A. (2004) Hedonic price functions and spatial dependence: implications for the demand for urban air quality, in: L. Anselin, R. J. Florax & S. J. Rey (eds) Advances in Spatial Econometrics: Methodology, Tools and Applications, pp. 267-281, Berlin, Springer.
Beron, K. J., Murdoch, J. C. & Thayer, M. A. (1999) Hierarchical linear models with application to air pollution in the South Coast Air Basin, American Journal of Agricultural Economics, 81, 1123-1127.
Beron, K., Murdoch, J. & Thayer, M. (2001) The benefits of visibility improvement: new evidence from the Los Angeles metropolitan area, Journal of Real Estate Finance and Economics, 22(2-3), 319-337.
Boyle, M. A. & Kiel, K. A. (2001) A survey of house price hedonic studies of the impact of environmental externalities, Journal of Real Estate Literature, 9, 117-144.
Brasington, D. M. & Hite, D. (2005) Demand for environmental quality: a spatial hedonic analysis, Regional Science and Urban Economics, 35, 57-82.
Burrough, P. A. & McDonnell, R. A. (1998) Principles of Geographical Information Systems, Oxford, Oxford University Press.
Chattopadhyay, S. (1999) Estimating the demand for air quality: new evidence based on the Chicago housing market, Land Economics, 75, 22-38.
Chay, K. Y. & Greenstone, M. (2005) Does air quality matter? Evidence from the housing market, Journal of Political Economy, 113(2), 376-424.
Cressie, N. (1993) Statistics for Spatial Data, New York, John Wiley.
Diem, J. E. (2003) A critical examination of ozone mapping from a spatial-scale perspective, Environmental Pollution, 125, 369-383.
Dubin, R., Pace, R. K. & Thibodeau, T. G. (1999) Spatial autoregression techniques for real estate data, Journal of Real Estate Literature, 7, 79-95.
Florax, R. J. G. M. & van der Vlist, A. (2003) Spatial econometric data analysis: moving beyond traditional models, International Regional Science Review, 26(3), 223-243.
Freeman III, A. M. (2003) The Measurement of Environmental and Resource Values, Theory and Methods, 2nd edn, Washington, DC, Resources for the Future Press.
Getis, A., Mur, J. & Zoller, H. G. (2004) Spatial Econometrics and Spatial Statistics, London, Palgrave Macmillan.
Gillen, K., Thibodeau, T. G. & Wachter, S. (2001) Anisotropic autocorrelation in house prices, Journal of Real Estate Finance and Economics, 23(1), 5-30.
Gotway, C. A. & Young, L. J. (2002) Combining incompatible spatial data, Journal of the American Statistical Association, 97, 632-648.
Graves, P., Murdoch, J. C., Thayer, M. A. & Waldman, D. (1988) The robustness of hedonic price estimation: urban air quality, Land Economics, 64, 220-233.
Harrison, D. & Rubinfeld, D. L. (1978) Hedonic housing prices and the demand for clean air, Journal of Environmental Economics and Management, 5, 81-102.
Kelejian, H. H. & Prucha, I. (1998) A generalized spatial two stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances, Journal of Real Estate Finance and Economics, 17, 99-121.
Kelejian, H. H. & Prucha, I. (1999) A generalized moments estimator for the autoregressive parameter in a spatial model, International Economic Review, 40, 509-533.
Kelejian, H. H. & Prucha, I. R. (2005) HAC Estimation in a Spatial Framework, Working paper, Department of Economics, University of Maryland, College Park, MD.
Kelejian, H. H. & Robinson, D. P. (1993) A suggested method of estimation for spatial interdependent models with autocorrelated errors, and an application to a county expenditure model, Papers in Regional Science, 72, 297-312.
Kim, C.-W., Phipps, T. T. & Anselin, L. (2003) Measuring the benefits of air quality improvement: a spatial hedonic approach, Journal of Environmental Economics and Management, 45, 24-39.
LeSage, J. P. & Pace, R. K. (2004) Advances in Econometrics: Spatial and Spatiotemporal Econometrics, Oxford, Elsevier Science.
LeSage, J. P., Pace, R. K. & Tiefelsdorf, M. (2004) Methodological developments in spatial econometrics and statistics, Geographical Analysis, 36, 87-89.
Longley, P. A., Goodchild, M. F., Maguire, D. J. & Rhind, D. W. (2001) Geographic Information Systems and Science, Chichester, John Wiley.
Mitas, L. & Mitasova, H. (1999) Spatial interpolation, in: P. A. Longley, M. F. Goodchild, D. J. Maguire & D. W. Rhind (eds) Geographical Information Systems: Principles, Techniques, Management and Applications, pp. 481-492, New York, Wiley.
Mitasova, H. & Mitas, L. (1993) Interpolation by regularized spline with tension: I, theory and implementation, Mathematical Geology, 25, 641-655.
Moulton, B. R. (1990) An illustration of a pitfall in estimating the effects of aggregate variables on micro units, Review of Economics and Statistics, 72, 334-338.
Okabe, A., Boots, B. & Sugihara, K. (1992) Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, Chichester, John Wiley.
Ord, J. K. (1975) Estimation methods for models of spatial interaction, Journal of the American Statistical Association, 70, 120-126.
Pace, R. K., Barry, R. & Sirmans, C. (1998) Spatial statistics and real estate, Journal of Real Estate Finance and Economics, 17, 5-13.
Pace, R. K. & LeSage, J. P. (2004) Spatial statistics and real estate, Journal of Real Estate Finance and Economics, 29, 147-148.
Palmquist, R. B. (1991) Hedonic methods, in: J. B. Braden & C. D. Kolstad (eds) Measuring the Demand for Environmental Quality, pp. 77-120, Amsterdam, North-Holland.
Palmquist, R. B. & Israngkura, A. (1999) Valuing air quality with hedonic and discrete choice models, American Journal of Agricultural Economics, 81, 1128-1133.
Phillips, D. L., Lee, E. H., Herstrom, A. A., Hogsett, W. E. & Tingey, D. T. (1997) Use of auxiliary data for spatial interpolation of ozone exposure in southeastern forests, Environmetrics, 8, 43-61.
Ridker, R. & Henning, J. (1967) The determinants of residential property values with special reference to air pollution, Review of Economics and Statistics, 49, 246-257.
Schabenberger, O. & Gotway, C. A. (2005) Statistical Methods for Spatial Data Analysis, Boca Raton, FL, Chapman & Hall/CRC.
Smirnov, O. (2005) Computation of the information matrix for models with spatial interaction on a lattice, Journal of Computational and Graphical Statistics, 14, 910-927.
Smirnov, O. & Anselin, L. (2001) Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach, Computational Statistics and Data Analysis, 35, 301-319.
Smith, V. K. & Huang, J.-C. (1993) Hedonic models and air pollution: 25 years and counting, Environmental and Resource Economics, 3, 381-394.
Smith, V. K. & Huang, J.-C. (1995) Can markets value air quality? A meta-analysis of hedonic property value models, Journal of Political Economy, 103, 209-227.
Smith, V. K., Sieg, H., Banzhaf, H. S. & Walsh, R. P. (2004) General equilibrium benefits for environmental improvements: projected ozone reductions under EPA's Prospective Analysis for the Los Angeles air basin, Journal of Environmental Economics and Management, 47, 559-584.
Zabel, J. & Kiel, K. (2000) Estimating the demand for air quality in four U.S. cities, Land Economics, 76, 174-194.
Spatial Economic Analysis, Vol. 1, No. 1, June 2006
Dynamic Spatial Discrete Choice Using One-step GMM: An Application to Mine Operating Decisions
JORIS PINKSE, MARGARET SLADE & LIHONG SHEN
(Received December 2005; revised January 2006)
ABSTRACT  In many spatial applications, agents make discrete choices (e.g. operating or product-line decisions), and applied researchers need econometric techniques that enable them to model such situations. Unfortunately, however, most discrete-choice estimators are invalid when variables and/or errors are spatially dependent. More generally, discrete-choice estimators have difficulty dealing with many common problems, such as heteroskedasticity, endogeneity, and measurement error, which render them inconsistent, as well as the inclusion of fixed effects in short panels, which renders them computationally burdensome if not infeasible. In this paper, we introduce a new estimator that can be used to overcome many of the above-mentioned problems. In particular, we show that the one-step ('continuous updating') GMM estimator is consistent and asymptotically normal under weak conditions that allow for generic spatial and time series dependence. We use our estimator to study mine operating decisions in a real-options context. To anticipate, we find little support for the real-options model. Instead, the data are found to be more consistent with a conventional mean/variance utility model.
Joris Pinkse (to whom correspondence should be sent), Department of Economics, Pennsylvania State University, 608 Kern Graduate Building, University Park, PA 16802, USA. Email: [email protected]. Margaret Slade, Department of Economics, University of Warwick, Coventry CV4 7AL, U.K. Email: [email protected]. Lihong Shen, Department of Economics, Pennsylvania State University, 608 Kern Graduate Building, University Park, PA 16802, USA. Email: [email protected]. We thank Tim Conley and seminar participants at Cemmap, the universities of Tilburg and Warwick, and the Tinbergen Institute for their valuable comments. Margaret Slade would like to acknowledge financial support from the ESRC and the Leverhulme Foundation.

ISSN 1742-1772 print; 1742-1780 online/06/010053-47 © 2006 Regional Studies Association
DOI: 10.1080/17421770600661741
KEYWORDS: Spatial econometrics; continuous updating; generalized empirical likelihood; GMM

JEL CLASSIFICATION: C21, C31
1. Introduction

Spatial processes are ubiquitous in economics, particularly when one considers that space can be interpreted broadly to cover both geographic and characteristic space. Furthermore, in many applications, agents make discrete choices. For example, firms can choose which countries to enter and which regional markets to serve within each country. Moreover, once they are established, they are often faced with a choice among a discrete set of contracts that will govern their relations with their suppliers or retailers (see Pinkse & Slade, 1998). Finally, they might have to decide which transport modes to use to get products from factories to markets.

In the absence of spatial dependence, applied researchers have a rich set of discrete-choice econometric techniques that they can use to test hypotheses and to discriminate among theoretical models. Those techniques (e.g. logit and probit, as well as nested, ordered, and multinomial versions of those estimators) are well known and need no further discussion. Unfortunately, however, most discrete-choice estimators are invalid when variables and/or errors are spatially dependent. For example, heteroskedasticity is often introduced by spatial dependence, and heteroskedasticity renders most discrete-choice estimators inconsistent. In addition, when problems with endogeneity and measurement error surface in discrete-choice models, standard instrumental-variable remedies cannot be applied. Finally, with
linear models it is routine practice to difference short panels to remove the influence of time-invariant cross-sectional factors (fixed effects). When choices are discrete, in contrast, differencing is not usually a viable option.

In this paper, we introduce an estimator that can be used to overcome many of the above-mentioned econometric problems. In particular, we propose a discrete-choice model that the applied researcher can use in the presence of spatial (and time series) dependence of a very general sort. To illustrate its applicability, we apply our estimator to study mine operating decisions in a real-options context. Our spatial econometric model is a dynamic discrete-choice panel-data model with fixed effects.

It has become common to note that there are obvious differences between spatial and time-series data. The differences that are most often noted are that: (i) time is one-dimensional whereas space is of higher dimension, (ii) time is unidirectional whereas space has no natural direction, (iii) time-series observations are usually evenly spaced whereas spatial observations are rarely located on a regular grid, and (iv) time-series observations are drawn from a continuous process whereas, with spatial data, it is common for the sample and the population to be the same (e.g. the set of all firms in a market).

Our paper makes use of a new central-limit theorem (CLT) (see Pinkse et al., 2005) that allows us to deal with differences between time-series and spatial data that have received less attention in the literature. Indeed, the theoretical literature has thus far, implicitly or explicitly, treated spatial dependence as a simple multivariate extension of time-series dependence: observations are regarded as draws from a stationary underlying process.1 In many interesting economic applications, however, spatial dependence is non-stationary2 (e.g. competition among firms depends not only on the distance between them but also on the locations of other firms in the neighbourhood). More problematic than non-stationarity is the fact that the characteristics of the spatial process can depend on the number of observations (e.g. the nature of competition among firms changes as new firms enter the market). Finally, both the location of economic observations and the total number of observations can be endogenous (e.g. firms choose to enter profitable markets). Our CLT deals with all of these eventualities.

In what follows we first sketch our estimator and then discuss how we apply it to the problem at hand. Those readers who are uninterested in technical details can move directly from the sketches to the application.

1.1. A Sketch of the Estimator

Our first step is to prove the consistency and asymptotic normality of the one-step GMM or continuous-updating (CU) estimator of Hansen et al. (1996) under assumptions that are more plausible in many economic applications than those that are made in the existing spatial literature. Conley (1999) established generic convergence results for the standard two-step GMM estimator. We establish the asymptotic properties of a different GMM-type estimator, the CU estimator, which is a member of the class of generalized empirical likelihood (GEL) estimators. GEL estimators, like standard GMM estimators, use moment conditions. Moreover, in exactly identified models, the two classes of estimators are identical.
In over-identified models, however, even though their asymptotic distributions are identical, the statistical properties of GEL estimators tend (or can
be made) to be superior in small and moderate-sized samples (see, for example, Newey & Smith, 2003).

Our spatial CU procedure is formally stated in a spatial cross-sections context. The results, however, carry over to two different types of panel-data models. In the first model, the number of 'products' (mines in our application) increases while the number of time periods is assumed fixed. This allows for a completely general time-series-dependence structure. With a fixed number of time periods, a panel-data model is equivalent to a (spatial) cross-sections model with a larger number of moment conditions. Furthermore, it is comparatively easy to find suitable instruments for that model. An alternative possibility is that both the time-series and cross-sections dimensions grow. In that case, if one assumes weak dependence in the time-series dimension, the temporal dimension can be treated as an additional spatial dimension. Since we have potentially non-stationary data, we introduce a Newey-West (1987) style covariance-matrix estimator for non-stationary spatial data. That estimator simplifies to the Newey-West estimator in the case of a stationary time series.

It is common for panel-data sets to have a large number of cross-sections and a few time periods. With linear models, researchers usually difference the estimating equation to remove the influence of time-invariant cross-sectional effects. With discrete-choice models, however, the situation is more complex. For this reason, attention is often limited to static conditional-logit models with independent errors and strictly exogenous regressors (e.g. Chamberlain, 1984). Honoré & Kyriazidou (2000) generalize that model to include a lagged dependent variable but maintain the strict-exogeneity and independence assumptions, whereas Magnac (2004) considers dependent errors with arbitrary known marginal distributions but maintains the strict-exogeneity and static assumptions. We, in contrast, consider a dynamic discrete-choice model with endogenous regressors and arbitrary patterns of spatial and time-series dependence.

Unlike the above-mentioned papers, our fixed effect is not included in the latent-variable equation. Instead, it enters linearly in the observed-choice equation. As noted above, estimating dynamic discrete-choice models with fixed effects in the latent-variable equation is problematic and normally requires strong assumptions. Furthermore, sometimes the approach taken is non-parametric, which imposes practical limitations (e.g. data requirements, continuous regressors) in small and moderate-sized samples, even if not all of those limitations are borne out by the asymptotic distribution. In our model, as in linear panel-data models,3 the fixed effects enter linearly and can hence be removed by differencing. The interpretation of our fixed effects, however, is different. Indeed, the fixed effects in our model affect the probability of choosing a particular option directly instead of doing so indirectly via the latent variable.4

1.2. A Sketch of the Application

We apply our procedure to the estimation of flexible operating rules for mine openings and closings, which we model in a real-options context. This is a two-state optimal-switching decision problem in which a mine can be either active or inactive, and the operator must decide whether to operate the mine or to let it lie idle. We estimate a reduced-form discrete-choice equation that embodies many of the predictions of the theory of real options.
The equation that we specify is similar to the one that is used in Moel & Tufano (2002), which is itself based on the theoretical model of Brennan & Schwartz (1985). In particular, we impose a Markov structure on the estimation. In other words, instead of estimating the probability that a mine is open (closed), we estimate transition probabilities (i.e. the probability of being in state k in period t, conditional on having been in state j in period t-1). We use data on prices, costs, reserves, capacity, output, and technology for a panel of 21 copper mines, projects that are both irreversible and uncertain. The panel includes all Canadian mines that operated during some portion of the period between 1980 and 1993 in which copper was the primary commodity. About two-thirds of the observations pertain to periods in which the mine was active, whereas the remainder are inactive observations.

Since we model decision rules in a state-space context, the mine status at the beginning of the period is an important determinant of the current-period operating decision. This means that our estimating equations contain a temporally lagged dependent variable. Furthermore, the coefficients of some of the explanatory variables are predicted to differ in both magnitude and sign, depending on the prior state (i.e. on the lagged dependent variable). To illustrate, the theory of real options predicts that high price volatility tends to delay decisions. This means that high volatility causes the probability that a mine will be active to increase (decrease) if it was active (inactive) in the previous period. For this reason, we estimate decision rules in which the coefficients can be state dependent.

Our discrete-choice equations are not structural. Indeed, our intent is to test the predictions of more than one theory in a unified framework. Therefore, rather than imposing restrictions that are implied by theories that might not be valid, we attempt to distinguish among theories by examining whether the data are consistent with their predictions. To anticipate, we find little support for the real-options model. In particular, the signs of coefficients (e.g. the effects of volatility) do not vary with the prior state. Instead, a more conventional mean/variance utility model receives more support. We also find that, although our spatial state-dependent models have greater predictive power, they are associated with reductions in significance vis-à-vis an ordinary probit.

The paper is organized as follows. The next section deals with estimation. In particular, it presents our non-linear dynamic panel-data model, describes our central limit theorem, and discusses our CU GMM estimation technique. Section 3 deals with the application. That section briefly discusses the testable predictions that can be derived from the theory of real options, and it describes the industry and the data. Section 4 presents estimates of static and dynamic ordinary-probit and spatial discrete-choice models, and Section 5 concludes. Proofs are contained in the Appendix.
2. Econometric Methodology

2.1. Our Panel Data Model

Our model is a dynamic space-time discrete-choice equation with fixed effects, i.e.
\[
y_{it} = I(x_{it1}'\theta_0 + \varepsilon_{it1} \ge 0)\, y_{i,t-1} + I(x_{it0}'\theta_0 + \varepsilon_{it0} \ge 0)\,(1 - y_{i,t-1}) + \eta_i + u^*_{it}, \qquad i = 1,\dots,N;\ t = 1,\dots,T, \tag{1}
\]
where $y_{it}$ is the binary choice of firm $i$ at time $t$, $\eta_i$ is a fixed effect, the $\varepsilon_{itj}$'s and $u^*_{it}$'s are errors, $\theta_0$ is an unknown vector of regression coefficients, $x_{it1}$ and $x_{it0}$ are regressor vectors, and $I$ is the indicator function.

Model (1) allows for various regressor configurations. If $x_{it1} = x_{it0}$ then the model reduces to a static one. If $x_{it1} = [x_{it}', 0']'$ and $x_{it0} = [0', x_{it}']'$ then the regressors in both components of equation (1) are the same but the regression coefficients are allowed to be different. Finally, any combination of the two extremes is possible. We use $x_{it}$ to denote all regressors that are in at least one of $x_{it1}$, $x_{it0}$.

We assume that (i) the $\varepsilon_{itj}$'s have standard normal distributions; (ii) $\varepsilon_{itj}$ is independent of $y_{i,t-1}$;5 (iii) the $\varepsilon_{itj}$'s are independent of current and past $x_{itj}$'s;6 (iv) a vector of instruments $z_{it}$ exists that are independent of the $\varepsilon_{itj}$'s and for which $E(u^*_{it} \mid z_{it}) = E(u^*_{i,t-1} \mid z_{it}) = 0$ a.s.7 Typically $z_{it}$ would consist of regressors lagged at least one period. Now,
\[
E(y_{it} \mid z_{it}) = E\bigl(I(x_{it1}'\theta_0 + \varepsilon_{it1} \ge 0)\, y_{i,t-1} \mid z_{it}\bigr) + E\bigl(I(x_{it0}'\theta_0 + \varepsilon_{it0} \ge 0)(1 - y_{i,t-1}) \mid z_{it}\bigr) + E(\eta_i \mid z_{it}). \tag{2}
\]
But
\[
E\bigl(I(x_{it1}'\theta_0 + \varepsilon_{it1} \ge 0)\, y_{i,t-1} \mid z_{it}\bigr) = E\bigl( E\bigl( I(x_{it1}'\theta_0 + \varepsilon_{it1} \ge 0)\, y_{i,t-1} \mid z_{it}, y_{i,t-1}, x_{it}\bigr) \mid z_{it}\bigr) = E\bigl(\Phi(x_{it1}'\theta_0)\, y_{i,t-1} \mid z_{it}\bigr) \quad \text{a.s.}
\]
Repeat the same steps for the second right-hand-side term in equation (2) to obtain
\[
E(y_{it} \mid z_{it}) = E\bigl(\Phi(x_{it1}'\theta_0)\, y_{i,t-1} \mid z_{it}\bigr) + E\bigl(\Phi(x_{it0}'\theta_0)(1 - y_{i,t-1}) \mid z_{it}\bigr) + E(\eta_i \mid z_{it}) \quad \text{a.s.}, \tag{3}
\]
where $\Phi$ is the standard normal distribution function. Take first differences to obtain
\[
E\bigl( y_{it} - y_{i,t-1} - \Phi(x_{it1}'\theta_0)\, y_{i,t-1} - \Phi(x_{it0}'\theta_0)(1 - y_{i,t-1}) + \Phi(x_{i,t-1,1}'\theta_0)\, y_{i,t-2} + \Phi(x_{i,t-1,0}'\theta_0)(1 - y_{i,t-2}) \,\big|\, z_{it}\bigr) = 0 \quad \text{a.s.}
\]
As in linear panel-data models, the nature of the dependence between the fixed effects and the other model variables is irrelevant. However, unlike in linear panel-data models, time-invariant regressors are not differenced out with the fixed effects. Let
\[
g_{it}(\theta) = z_{it}\bigl( y_{it} - y_{i,t-1} - \Phi(x_{it1}'\theta)\, y_{i,t-1} - \Phi(x_{it0}'\theta)(1 - y_{i,t-1}) + \Phi(x_{i,t-1,1}'\theta)\, y_{i,t-2} + \Phi(x_{i,t-1,0}'\theta)(1 - y_{i,t-2}) \bigr). \tag{4}
\]
Then
\[
\forall i, t:\quad E\, g_{it}(\theta_0) = 0. \tag{5}
\]
We thus have the main prerequisite for application of a GMM-style procedure: a set of moment conditions. Although not explicit in the notation above, the $g_{it}$'s will be allowed to vary with $N$, $T$, and can also vary across $i$, $t$, provided that equation (5) is satisfied. We now state our generic theoretical results. In the technical sections that follow, $n$ is either $N$ or $NT$, depending on whether $T$ is fixed or increases.
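To make the construction of the moment conditions in equations (4)-(5) concrete, the following is a minimal sketch (our own illustration, not the authors' code; the array names and shapes are assumptions) that evaluates the differenced moment vector for one candidate value of $\theta$ and averages it over the usable observations.

```python
import numpy as np
from scipy.stats import norm

def moment_it(theta, y, x1, x0, z, i, t):
    """Differenced moment g_it(theta) of equation (4) for unit i at time t (t >= 2).

    y  : (N, T) binary choices
    x1 : (N, T, K) regressors in the 'previously active' component
    x0 : (N, T, K) regressors in the 'previously inactive' component
    z  : (N, T, L) instruments (e.g. lagged regressors)
    """
    phi = norm.cdf
    resid = (y[i, t] - y[i, t - 1]
             - phi(x1[i, t] @ theta) * y[i, t - 1]
             - phi(x0[i, t] @ theta) * (1 - y[i, t - 1])
             + phi(x1[i, t - 1] @ theta) * y[i, t - 2]
             + phi(x0[i, t - 1] @ theta) * (1 - y[i, t - 2]))
    return z[i, t] * resid          # L-vector of moment contributions

def gbar(theta, y, x1, x0, z):
    """Sample average of the moments over all usable (i, t) pairs."""
    N, T = y.shape
    g = [moment_it(theta, y, x1, x0, z, i, t) for i in range(N) for t in range(2, T)]
    return np.mean(g, axis=0)
```

At the true parameter value the averaged moments should be close to zero, which is the property the GMM procedure described below exploits.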
2.2. A Suitable CLT

In Pinkse et al. (2005) (PSS), we develop a new CLT that is designed to address shortcomings in previously available CLTs. We now summarize the assumptions and results of that paper, which are used to establish the properties of the continuous-updating estimator (CUE) in this paper.

CLTs are, in essence, results about sums of zero-mean random variables. Because the statistical properties, including the strength and nature of dependence between observations, should be allowed to vary with the sample size, we index them by the sample size (i.e. our observations are $\xi_{n1},\dots,\xi_{nn}$ and their sum is $S_n$). The idea in PSS, which is based on an idea by Bernstein (1927), is to divide the observations into non-overlapping groups $G_{n1},\dots,G_{nJ}$, $1 \le J < \infty$, which are divided up into mutually exclusive subgroups $G_{nj1},\dots,G_{njm_{nj}}$, $j = 1,\dots,J$. Group membership of each observation can vary with the sample size $n$ and so can the number of subgroups $m_{nj}$ in group $j = 1,\dots,J$. Partial sums over elements in groups and subgroups are denoted by $S_{nj}$ and $S_{njt}$, $j = 1,\dots,J$ and $t = 1,\dots,m_{nj}$, respectively. Thus,
\[
S_n = \sum_{j=1}^{J} S_{nj} = \sum_{j=1}^{J} \sum_{t=1}^{m_{nj}} S_{njt} = \sum_{i=1}^{n} \xi_{ni}.
\]
The only role of groups 2 through $J$ is to reduce the strength of the dependence across subgroups in group 1. To illustrate, suppose that there are a number of gasoline stations in a city. The idea is to partition stations into 'sets' that compete intensely with one another, e.g. ones located nearby, ones located along the same thoroughfare, or ones offering similar additional services (see, for example, Pinkse & Slade, 1998). No matter how one chooses the sets, however, there will often be stations at the 'boundary' of one set that face strong competition from a station in another set. However, if set 2 is located between sets 1 and 3, then competition between stations in set 1 and those in set 3 is likely to be small.

As the city grows, the number of stations will also grow, and new stations will appear both at the periphery and, owing to increased population density, also in established areas. Indeed, wherever entry is deemed profitable, be it because competition is weak or because the market is large, expansion will occur. Furthermore, other stations will shut down because, for example, the land is more valuable in alternative uses or because they have become unprofitable. This means that, as the city grows, the choice of sets will change.

The idea, then, is that each of the sets is a subgroup and that subgroups are allocated to groups in such a way that dependence between observations in different subgroups of the same group is small. In the example, sets 1 and 3 could be subgroups of the same group, whereas set 2 would be in a different group. As the city grows, it is possible to allocate ever more stations to each subgroup. Moreover, as this process continues, the level of competition between stations in different subgroups of the same group will dissipate owing to increased competition from stations of another group that are located between them. In the limit, dependence will disappear altogether and we will be back in the familiar situation of independent random variates.

In PSS, we combine this grouping idea with a weak-dependence assumption that is due to Doukhan & Louhichi (1999). That assumption is weaker than strong mixing (Rosenblatt, 1956) and easier to work with than near-epoch dependence (Ibragimov, 1962). We now state the assumptions and results of PSS without further elaboration.8

Definition 1. Let $\mathcal{F}$ be the collection of functions $\{ f : \forall t \in \mathbb{R}: f(t) = t,\ \text{or}\ \exists u \in \mathbb{R}\ \forall t: f(t) = e^{iut} \}$, where $i$ is the imaginary number.
Assumption A. For any $j = 1,\dots,J$, let $G^*_n, G^{**}_n \subseteq G_{nj}$ be any sets for which $\nexists\, t = 1,\dots,m_{nj}: G_{njt} \cap G^*_n \neq \emptyset \ \wedge\ G_{njt} \cap G^{**}_n \neq \emptyset$. Then for any function $f \in \mathcal{F}$,
\[
\Bigl| \mathrm{Cov}\Bigl( f\Bigl(\sum_{s \in G^*_n} \xi_{ns}\Bigr),\ f\Bigl(\sum_{s \in G^{**}_n} \xi_{ns}\Bigr) \Bigr) \Bigr| \le \sqrt{ V f\Bigl(\sum_{s \in G^*_n} \xi_{ns}\Bigr) }\ \sqrt{ V f\Bigl(\sum_{s \in G^{**}_n} \xi_{ns}\Bigr) }\ \alpha_{nj}, \tag{6}
\]
for some 'mixing' numbers $\alpha_{nj}$ with
\[
\lim_{n \to \infty} \sum_{j=1}^{J} m_{nj}^2\, \alpha_{nj} = 0. \tag{7}
\]
Let $\sigma_n^2 = E S_n^2$, $0 < \sigma_{nj}^2 = E S_{nj}^2 < \infty$, and $0 < \sigma_{njt}^2 = E S_{njt}^2 < \infty$. Finally, $w_{nj}^2 = \sum_{t=1}^{m_{nj}} \sigma_{njt}^2$.

Assumption B.
\[
\lim_{n \to \infty} \max_{t \le m_{nj}} \sigma_{njt} / w_{nj} = 0, \quad j = 1,\dots,J; \qquad
\lim_{n \to \infty} w_{nj} / w_{n1} = 0, \quad j = 2,\dots,J. \tag{8}
\]

Assumption C. For some sequence $\{h_n\}$ for which $\lim_{n \to \infty} h_n = 0$ and for all $j = 1,\dots,J$,
\[
\lim_{n \to \infty} \max_{t \le m_{nj}} E\Bigl[ \frac{S_{njt}^2}{\sigma_{njt}^2}\, I\Bigl( \frac{|S_{njt}|}{w_{nj}} \ge h_n \Bigr) \Bigr] = 0. \tag{9}
\]

A sufficient condition for Assumption C is that for some $p > 1$,
\[
E |S_{njt}|^{2p} = o\bigl( \sigma_{njt}^2\, w_{nj}^{2p-2} \bigr), \qquad j = 1,\dots,J;\ t = 1,\dots,m_{nj}\ \text{(note 9)}. \tag{10}
\]
We can now state our theorems.

Theorem 1. If Assumptions A-C hold, then
\[
S_n / \sigma_n \xrightarrow{\ D\ } N(0, 1). \tag{11}
\]
A vector-valued version of Theorem 1 is also available. Let $\xi_{ni}$ be vector-valued and $S_n = \sum_{i=1}^{n} \xi_{ni}$. Furthermore, let $\Sigma_n = V S_n$.

Theorem 2. If for any vector $v$ with $\|v\| = 1$, Assumptions A-C are satisfied for $\tilde{\xi}_{ni} = v' \Sigma_n^{-1/2} \xi_{ni}$, then
\[
\Sigma_n^{-1/2} S_n \xrightarrow{\ D\ } N(0, I). \tag{12}
\]
2.3. Continuous Updating: One-step GMM

The CUE is similar to the regular two-step GMM estimator, albeit that the weight matrix is parametrized immediately. Our moment condition is $\forall i, n: E g_{ni}(\theta_0) = 0$, where $\theta_0 \in \Theta \subseteq \mathbb{R}^d$ is the vector of parameters of interest, and $g_{ni}$ is some vector-valued function. The CUE is $\hat\theta = \arg\min_{\theta \in \Theta} \hat{Q}_n(\theta)$, where the CUE objective function $\hat{Q}_n$ has the form $\hat{Q}_n(\theta) = \bar{g}_n'(\theta)\, \hat{W}_n(\theta)\, \bar{g}_n(\theta)$, with
\[
\bar{g}_n(\theta) = n^{-1} \sum_{i=1}^{n} g_{ni}(\theta), \tag{13}
\]
\[
\hat{W}_n(\theta) = C_n\, \hat{V}_n^{-1}(\theta), \tag{14}
\]
where $\{C_n\}$ is a sequence of numbers to be defined below and
\[
\hat{V}_n(\theta) = n^{-1} \sum_{i,j=1}^{n} \lambda_{nij}\, \bigl(g_{ni}(\theta) - \bar{g}_n(\theta)\bigr)\bigl(g_{nj}(\theta) - \bar{g}_n(\theta)\bigr)' \tag{15}
\]
is such that $\hat{V}_n(\theta_0)$ is an estimator of the asymptotic variance of $\sqrt{n}\, \bar{g}_n(\theta_0)$. Since the $C_n$'s in equation (14) are scalars, their inclusion does not affect the estimates, but they facilitate the proofs. The numbers $\lambda_{nij}$ in equation (15) are weights; if observations are known to be independent, only the $\lambda_{nii}$'s need to be non-zero; their choice is discussed below. We follow Hansen et al. (1996) in including the $\bar{g}_n$'s in equation (15), but our results also go through if they are omitted. Their main purpose is practical, i.e. to avoid having $\hat{V}_n$ very large (and $\hat{W}_n$ very small) when $\theta$ is far from $\theta_0$.
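The continuous-updating objective is simple to compute once the moment contributions and the $\lambda$-weights are available. The following is a minimal sketch (our own illustration, under the notation used above; the ridge term and function names are assumptions, not part of the authors' procedure).

```python
import numpy as np

def cue_objective(theta, g_fn, L, C_n=1.0, ridge=1e-10):
    """Continuous-updating objective: gbar' W_n(theta) gbar, with
    W_n(theta) = C_n * Vhat_n(theta)^{-1} and Vhat_n(theta) as in equation (15).

    g_fn(theta) must return an (n, d_g) array of moment contributions;
    L is the n x n matrix of lambda weights (the identity under independence).
    """
    G = g_fn(theta)                        # (n, d_g)
    n = G.shape[0]
    gbar = G.mean(axis=0)
    Gc = G - gbar                          # centred moments, as in equation (15)
    Vhat = Gc.T @ L @ Gc / n               # quadratic-form version of (15)
    Vhat += ridge * np.eye(Vhat.shape[0])  # numerical safeguard only
    W = C_n * np.linalg.inv(Vhat)
    return gbar @ W @ gbar
```

The estimator itself would then minimize this function over $\theta$, for example with a generic numerical optimizer such as `scipy.optimize.minimize`.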
In large samples, the objective function $\hat{Q}_n$ is close to $Q_n$ defined by $Q_n(\theta) = g_n'(\theta)\, W_n(\theta)\, g_n(\theta)$, where $g_n(\theta) = E \bar{g}_n(\theta)$, $W_n(\theta) = C_n V_n^{-1}(\theta)$ and
\[
V_n(\theta) = n^{-1} \sum_{i,j=1}^{n} E\bigl( \lambda_{nij}\, (g_{ni}(\theta) - g_n(\theta))(g_{nj}(\theta) - g_n(\theta))' \bigr).
\]
So, provided that $g_n(\theta) = 0$ if and only if $\theta = \theta_0$ and that $W_n$ is a positive definite matrix, $Q_n(\theta) = 0 \Leftrightarrow \theta = \theta_0$.

2.3.1. Consistency. A number of conditions are necessary for consistency of our CUE, which we now explain. Recall that we do not assume stationarity (i.e. we neither assume that observations are located on a regular grid nor that dependence is equally strong between all pairs of observations that are equally far apart) and that we allow for the dependence structure to change with the sample size. For these reasons, our conditions are more difficult to express (i.e. they are more technical) than most. However, at the end of the subsection we include a discussion of the implications of our assumptions in the context of a simpler spatial setting as well as for our dynamic discrete-choice model. We now state the conditions necessary for consistency, followed by a discussion of each.

Assumption D. $\theta_0$ is an interior point of $\Theta$, which is convex and compact.

The compactness portion of Assumption D is standard. Convexity is somewhat unusual, but is reasonable in most applications. Let $L_n$ be the $n \times n$ matrix with $(i,j)$ element $\lambda_{nij}$.
Assumption E. For some deterministic sequence $\{\chi_n\}$ with $\chi_n = O(1)$,
\[
\operatorname{ess\,sup} \max_{i \le n} E\bigl( \max_{\theta \in \Theta} \| g_{ni}(\theta) \|^4 \,\big|\, L_n \bigr) \le \chi_n\ \text{(note 10)}, \tag{16}
\]
\[
\max_{i \le n} E \max_{\theta \in \Theta} \Bigl\| \frac{\partial g_{ni}}{\partial \theta_s}(\theta) \Bigr\|^2 \le \chi_n, \qquad s = 1,\dots,d. \tag{17}
\]
We condition on $L_n$ in equation (16) since the weights $\lambda_{nij}$ can be random. We need to ensure that $\theta_0$ is the unique solution to $g_n(\theta) = 0$ for any sufficiently large $n$, which is accomplished by Assumption F.

Assumption F. For some continuous function $g^*: \Theta \to \mathbb{R}$, $\exists n^*: \forall n \ge n^*, \forall \theta \in \Theta: \|g_n(\theta)\| \ge g^*(\theta)$ and $g^*(\theta) = 0 \Leftrightarrow \theta = \theta_0$.

We will also put some restrictions on the strength of the dependence. Let $\{\alpha_{nij}\}$ be some numbers that satisfy
\[
\begin{aligned}
|\mathrm{Cov}(g_{ni,t}(\theta), g_{nj,s}(\theta))| &\le \alpha_{nij} \sqrt{V g_{ni,t}(\theta)} \sqrt{V g_{nj,s}(\theta)}, \\
|\mathrm{Cov}(\gamma_{nij}(\theta), \gamma_{ni^*j^*}(\theta))| &\le (\alpha_{nii^*} + \alpha_{nij^*} + \alpha_{nji^*} + \alpha_{njj^*}) \sqrt{V \gamma_{nij}(\theta)} \sqrt{V \gamma_{ni^*j^*}(\theta)},
\end{aligned} \tag{18}
\]
for all $i, j, i^*, j^*$, $\theta \in \Theta$, where $\gamma_{nij}$ is one of $\lambda_{nij}$, $\lambda_{nij} g_{ni,t}$, $\lambda_{nij} g_{ni,t} g_{nj,s}$, for any $t, s = 1,\dots,d_g$, where $d_g$ is the dimension of the $g_{ni}$ vector. Without conditions on the $\alpha$-coefficients, equation (18) is not restrictive. Restrictions will be imposed by means of the rate of increase of the sequence $\{A_n\}$ defined by
\[
A_n = \max_{i \le n} \sum_{j=1}^{n} \alpha_{nij}. \tag{19}
\]
In most applications, assuming that $\limsup_{n \to \infty} A_n$ is finite is reasonable, but for the consistency proof we allow $A_n$ to grow with $n$. The conditions imposed on $A_n$ are stated in Assumption I.

We also need to impose some conditions to ensure that $\hat{V}_n(\theta) - V_n(\theta)$ vanishes and that $V_n(\theta)$ is always invertible. To achieve this, we have to make assumptions on the choice of weights $\lambda_{nij}$ and on the variability in the $g_{ni}$'s.

Assumption G. For some deterministic sequence $\{\zeta_n\}$ with $\zeta_n = O(1)$,
\[
\min_{\theta \in \Theta} E_{\min}\Bigl( n^{-1} \sum_{i=1}^{n} V g_{ni}(\theta) \Bigr) \ge \zeta_n^{-1}, \qquad
\max_{\theta \in \Theta} E_{\max}\Bigl( n^{-1} \sum_{i=1}^{n} V g_{ni}(\theta) \Bigr) \le \zeta_n,
\]
where $E_{\min}$, $E_{\max}$ denote the minimum and maximum eigenvalue of a matrix.

The non-negative weights $\lambda_{nij}$ should ideally be chosen such that $\lambda_{nij}$ is equal or close to one when observation $i$'s location is close to $j$'s. We suggest a choice for these weights at the end of this section. Without loss of generality, we will assume the matrix $L_n$ with elements $\lambda_{nij}$ to be symmetric.

Assumption H. Deterministic sequences of non-negative numbers $\{\rho_n\}$, $\{C_n\}$ exist such that
\[
C_n \ge \max\Bigl( \max_{i \le n} \sum_{j=1}^{n} \sqrt{E \lambda_{nij}^2},\ \operatorname{ess\,sup}\bigl(E_{\max}(L_n)\bigr) \Bigr), \tag{20}
\]
\[
\rho_n \ge \operatorname{ess\,sup}\Bigl( \frac{1}{E_{\min}(L_n)} \Bigr), \tag{21}
\]
and $\rho_n = O(1)$, $C_n^{-1} = O(1)$.

If $L_n$ is singular, then $\hat{V}_n$ may also not be invertible. Note that the assumptions on $L_n$ required for consistency are weak; the shape of $L_n$ is not important.
Assumption I. $A_n$, $C_n$ are such that
\[
\lim_{n \to \infty} n^{-1} A_n\, C_n^{2(2+d)} L_n^{2d} = 0, \tag{22}
\]
for some scalar sequence $L_n$ with $\lim_{n \to \infty} L_n = \infty$.

Even if $\lim_{n \to \infty} A_n < \infty$, $C_n$ can increase only slowly with $n$; certainly more slowly than is necessary for the 'optimal' rate of increase of the cut-off parameter of the Newey-West estimator. It is possible to weaken the rate-of-increase limitations on $C_n$ by strengthening other conditions. The current limitation arises from an apparently new generic uniform convergence result (see Lemma 1 in the Appendix).
Theorem 3. If Assumptions D-I hold, then $\hat\theta \xrightarrow{\ P\ } \theta_0$.

Note that Theorem 3 requires no weak-dependence conditions over and above those implied by the requirements imposed on $A_n$ (defined in equation (19)) in Assumption I. So no groups and such need to be chosen; the only restriction is on the covariances.

2.3.2. Asymptotic normality. The conditions required for asymptotic normality are stronger than those needed for consistency. The objective is to show that
\[
\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{\ D\ } N\bigl(0,\ (T_0' V_0^{-1} T_0)^{-1}\bigr), \tag{23}
\]
where
\[
T_0 = \lim_{n \to \infty} \frac{\partial g_n}{\partial \theta'}(\theta_0), \qquad
V_0 = \lim_{n \to \infty} V_n(\theta_0), \quad \text{with} \quad V_n(\theta_0) = n^{-1} \sum_{i,j=1}^{n} E\bigl( g_{ni}(\theta_0)\, g_{nj}'(\theta_0) \bigr). \tag{24}
\]
We need to make assumptions to ensure that the limits in equation (24) exist, that $(\partial \bar{g}_n / \partial \theta')(\hat\theta^*)$ and $\hat{V}_n(\hat\theta^*)$ converge to $T_0$ and $V_0$ whenever $\hat\theta^*$ is a consistent estimator of $\theta_0$, that the asymptotic variance matrix in equation (23) is well defined, and that the conditions of Theorem 2 are satisfied.

Assumption J. The matrices $T_0$, $V_0$ defined in equation (24) are finite and have maximum (column) rank.

Assumption K. $\{g_{ni}(\theta_0)\}$ satisfies the assumptions of Theorem 2.

Now let, for all $i, j, t, s, t^*, s^*$ and all $\theta \in \Theta$,
\[
\Bigl| \mathrm{Cov}\Bigl( \frac{\partial g_{ni,t}}{\partial \theta_{t^*}}(\theta),\ \frac{\partial g_{nj,s}}{\partial \theta_{s^*}}(\theta) \Bigr) \Bigr| \le \alpha_{nij}\, \sqrt{ V \frac{\partial g_{ni,t}}{\partial \theta_{t^*}}(\theta) }\ \sqrt{ V \frac{\partial g_{nj,s}}{\partial \theta_{s^*}}(\theta) }. \tag{25}
\]
$A_n$ remains defined as in equation (19), but note that with the new requirement on the $\alpha$-coefficients, the requirements imposed on $A_n$ in Assumption I are now somewhat stronger. Note further that the strength of the dependence is now also controlled indirectly through Assumption K.
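Given consistent estimates of $T_0$ and $V_0$, the asymptotic covariance matrix in equation (23) can be estimated by the usual plug-in sandwich. The following is a minimal numerical sketch (our own illustration; the finite-difference Jacobian and the function names are assumptions, not the authors' implementation).

```python
import numpy as np

def cue_std_errors(theta_hat, g_fn, L):
    """Plug-in estimate of (T0' V0^{-1} T0)^{-1} / n and the implied standard errors.

    g_fn(theta) returns an (n, d_g) array of moment contributions;
    L is the n x n matrix of lambda weights used in Vhat_n.
    """
    G = g_fn(theta_hat)                       # (n, d_g)
    n, d_g = G.shape
    gbar = G.mean(axis=0)

    # Vhat_n(theta_hat): weighted covariance of the moments, as in equation (15)
    Gc = G - gbar
    Vhat = Gc.T @ L @ Gc / n

    # That_n: numerical Jacobian of gbar(theta) at theta_hat (central differences)
    d = theta_hat.size
    That = np.zeros((d_g, d))
    h = 1e-5
    for s in range(d):
        e = np.zeros(d); e[s] = h
        That[:, s] = (g_fn(theta_hat + e).mean(axis=0)
                      - g_fn(theta_hat - e).mean(axis=0)) / (2 * h)

    avar = np.linalg.inv(That.T @ np.linalg.solve(Vhat, That))  # (T' V^{-1} T)^{-1}
    return np.sqrt(np.diag(avar) / n)
```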
The assumptions for Theorem 2, imposed by Assumption K, apply only to (sums of) the $g_{ni}$'s evaluated at $\theta_0$, but they apply both to the sums of the $g_{ni}$'s themselves and to complex exponentials thereof. The conditions imposed on the rate of increase of $A_n$ control the covariance sums of the $g_{ni}$'s and $\partial g_{ni} / \partial \theta'$, uniformly in $\theta$. Both sets of assumptions are weak.

The next assumption imposes some additional constraints on the choice of weights $\lambda_{nij}$.

Assumption L. The weights $\{\lambda_{nij}\}$ are such that for $t, s = 1,\dots,d_g$,
\[
\lim_{n \to \infty} n^{-1} \sum_{i,j=1}^{n} E\bigl| (1 - \lambda_{nij})\, g_{ni,t}(\theta_0)\, g_{nj,s}(\theta_0) \bigr| = 0.
\]
Assumption L suggests that the practitioner must have some information on the dependence. The information need not be detailed, but she must have some idea for which $(i,j)$-pairs $g_{ni}$ and $g_{nj}$ are highly correlated. Note that Assumption L applies only to the $g_{ni}$'s at $\theta_0$, not at arbitrary values.

Assumption M. $\{\partial \bar{g}_n / \partial \theta'\}$ is uniformly stochastically equicontinuous on $\Theta$.

Uniform stochastic equicontinuity is a technical assumption that is implied by all partial second derivatives of $g_{ni}$ being bounded in probability, uniformly in $\theta \in \Theta$.

Theorem 4. If $V_0 > 0$ and Assumptions D-M are satisfied, then equation (23) holds.

2.3.3. Bias correction. Unlike the two-step GMM estimator, the CUE is a GEL estimator. Consequently, a second-order bias correction is possible, where such a correction is not feasible for the two-step estimator. The bias correction does not affect the asymptotic distribution of the CUE, so the CUE generally has the same asymptotic distribution as the two-step GMM estimator. However, a bias correction can improve performance in moderate-sized samples.

To see how a second-order bias correction can help, note the following. The difference $\hat\theta - \theta_0$ can be expanded into terms converging at rates $n^{-1/2}$, $n^{-1}$, $n^{-3/2}$, etc., i.e. the first, second, third, ... terms in an asymptotic expansion. The first-order term is responsible for the asymptotic distribution; all other terms are of no consequence in the limit. However, the other expansion terms can make a difference in samples of finite size. By reducing the bias in the second-order term, the precision of the estimator is improved. We do not provide such a bias correction ourselves. However, in work motivated by our current paper, Iglesias & Phillips (2005) provide this bias correction for models with the non-stationarity problem we address here, not only for the CUE but also for the empirical likelihood estimator.

2.3.4. Conditions in a simple spatial model. We now offer a further explanation of our assumptions in a non-linear regression model with the simplest possible spatial dependence structure: one with stationary data that are equally spaced on a line and where the locations equal the observation indices. An example of such a process can be found in Whittle (1954); Cliff & Ord (1973) describe similar models. This model does not do justice to the generality of our results, but it does help clarify the assumptions. Although the assumptions on the dependence structure are somewhat milder here than in, for example, Conley (1999), the moment conditions implied by our assumptions are probably somewhat stronger than necessary, but hardly unreasonable, in this simple case. Since the dependence structure is independent of the sample size in this model, we drop the dependence on $n$ in our notation here.

Consider the non-linear regression model
\[
y_i = h(x_i; \theta_0) + u_i \equiv h_i(\theta_0) + u_i, \qquad i = 1,\dots,n,
\]
where the $y$'s, $x$'s and $u$'s used here are unrelated to those used in our dynamic spatial discrete-choice model. Suppose that the dimensions of $x_i$ and $\theta_0$ are both $d$. If the regressors $x_i$ are used as instruments, then $g$ would be given by
\[
g_i(\theta) = x_i\bigl( y_i - h(x_i; \theta) \bigr), \qquad i = 1,\dots,n.
\]
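As a concrete illustration of this simple case, the sketch below (our own, with the arbitrary choice $h(x;\theta) = \exp(x'\theta)$ purely for concreteness) builds the moment contributions $g_i(\theta) = x_i(y_i - h(x_i;\theta))$ in the form expected by the generic `g_fn` argument used in the earlier sketches.

```python
import numpy as np

def make_g_fn(y, X, h=lambda xb: np.exp(xb)):
    """Return g_fn(theta) -> (n, d) array of moments x_i * (y_i - h(x_i' theta)).

    y : (n,) responses, X : (n, d) regressors used as their own instruments,
    h : scalar link applied to the index x_i' theta (exp chosen only as an example).
    """
    def g_fn(theta):
        resid = y - h(X @ theta)           # (n,)
        return X * resid[:, None]          # (n, d): x_i scaled by its residual
    return g_fn

# Example: simulate a small data set and evaluate the averaged moments at the truth.
rng = np.random.default_rng(0)
n, d = 200, 2
X = rng.normal(size=(n, d))
theta0 = np.array([0.5, -0.3])
y = np.exp(X @ theta0) + rng.normal(scale=0.1, size=n)
g_fn = make_g_fn(y, X)
print(g_fn(theta0).mean(axis=0))   # should be close to zero at theta0
```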
The locations are non-random here, so the $\lambda_{ij}$'s drop out of any expectation. For instance, Assumption E simplifies to
\[
E \max_{\theta \in \Theta} \| x_1 (y_1 - h_1(\theta)) \|^4 < \infty, \qquad
E \max_{\theta \in \Theta} \Bigl\| x_1 \frac{\partial h_1}{\partial \theta_s}(\theta) \Bigr\|^2 < \infty,
\]
for all $s = 1,\dots,d$. Further, Assumption G becomes $0 < V g_1(\theta) < \infty$, and Assumption M would be implied by (see, for example, Davidson, 1994, theorem 21.10)
\[
E \max_{\theta \in \Theta} \Bigl\| \frac{\partial^2 g_{1t}}{\partial \theta\, \partial \theta'}(\theta) \Bigr\| < \infty, \qquad t = 1,\dots,d.
\]
Assumption F is implied by the usual identification condition on $\theta_0$ in the non-linear regression model specified, and Assumption J requires that $E(u_1^2 x_1 x_1') > 0$ and that $E\bigl(x_1\, \partial h_1 / \partial \theta'(\theta_0)\bigr)$ is invertible. If, moreover, the $x_i$'s were non-random, then equation (18) would simplify to
\[
|\mathrm{Corr}(u_i, u_j)| \le \alpha_{ij}, \qquad
|\mathrm{Corr}(u_i u_j,\ u_{i^*} u_{j^*})| \le \alpha_{ii^*} + \alpha_{ij^*} + \alpha_{ji^*} + \alpha_{jj^*},
\]
for all $i, j, i^*, j^*$. We find it difficult, however, to think of an obvious application for fixed regressors in economics. With random regressors, equation (18) applies to a combination of errors and regressors, i.e. equation (18) would be equivalent to
\[
|\mathrm{Corr}(g_{it}(\theta), g_{js}(\theta))| \le \alpha_{ij}, \qquad
|\mathrm{Corr}(g_{it}(\theta) g_{js}(\theta),\ g_{i^*t}(\theta) g_{j^*s}(\theta))| \le \alpha_{ii^*} + \alpha_{ij^*} + \alpha_{ji^*} + \alpha_{jj^*},
\]
for all $i, j, i^*, j^*$ and all $\theta \in \Theta$. Condition (25) only adds the requirement that
\[
\Bigl| \mathrm{Cov}\Bigl( x_{i t^*} \frac{\partial h_i}{\partial \theta_t}(\theta),\ x_{j s^*} \frac{\partial h_j}{\partial \theta_s}(\theta) \Bigr) \Bigr| \le \alpha_{ij}\, \sqrt{ V\Bigl( x_{i t^*} \frac{\partial h_i}{\partial \theta_t}(\theta) \Bigr) }\ \sqrt{ V\Bigl( x_{j s^*} \frac{\partial h_j}{\partial \theta_s}(\theta) \Bigr) },
\]
for all $i, j, t, s, t^*, s^*$. By stationarity, $\alpha_{ij} = \alpha_{i^*j^*}$ for any $i, j, i^*, j^*$ for which $|i - j| = |i^* - j^*|$. Thus, we could replace the notation $\alpha_{ij}$ by $\alpha_{|i-j|}$. Equations (18) and (25) are implied by a
strong mixing assumption with mixing numbers $\alpha_t$. If, moreover, the (strong) mixing numbers are summable, then $A_n = O(1)$.

Now, with respect to the choice of $\lambda$-weights, equation (20) now defines $C_n$ as the maximum row sum of the $L_n$ matrix and requires its smallest eigenvalue to be bounded away from zero in the limit. Given the dependence structure imposed, it would be natural to select $\lambda_{ij} = \lambda_{i-j}$, possibly as $\lambda_t = (1 - |t|/(2k_n + 1))\, I(|t| \le k_n)$, where $k_n$ increases as a small fractional power of $n$. Clearly, then, $C_n = O(k_n)$. For this choice of $\lambda$'s, Assumption L is implied by
\[
n^{-1} \sum_{i,j=1}^{n} (1 - \lambda_{ij})\, \alpha_{ij} = o(1),
\]
which after some routine algebra can in turn be shown to be implied by
\[
\sum_{i \ge 2k_n} \alpha_i + k_n^{-1} \sum_{j=0}^{2k_n} j\, \alpha_j = o(1),
\]
which can be seen to hold if $A_n = O(1)$. Assumption I then requires that $n^{-1} k_n^{2(2+d)} L_n^{2d} = o(1)$, which can be made to hold by letting $k_n$ increase suitably slowly.

Finally, we discuss Assumption K, which requires Assumptions A-C. We can use the grouping scheme of Figure 1. We choose $J = 2$ groups, each with $m_{n1} = m_{n2}$ subgroups. The subgroups of group $j$ have $l_{nj}$, $j = 1, 2$, observations each. So observations $1,\dots,l_{n1}$ belong to subgroup 1 of group 1, the next $l_{n2}$ belong to subgroup 1 of group 2, the following $l_{n1}$ to subgroup 2 of group 1, and so forth. We assume here that the mixing numbers decline at a rate of $\alpha_t = t^{-w}$ for some $w > 0$. First, noting that $l_{n2} = o(l_{n1})$ by Assumption B, equation (7) requires that $m_{n2}^2 \alpha_{l_{n2}} = o(1)$, i.e. $m_{n2}^2 l_{n2}^{-w} = o(1)$. If $w$ is large, then we can afford to choose many subgroups with few observations each. For instance, if $w > 4$, then we can have $m_{n1} = m_{n2} = n^{2/3}$, $l_{n2} = n^{(1 + 4/w)/6}$ and $l_{n1} = n^{1/3} - l_{n2}$ (note 11). Because $S_{njt}$ is a summation over $l_{nj}$ observations, here $\sigma_{njt}^2$, $w_{nj}^2$ will be of order no less than $l_{nj}$ and $m_{nj} l_{nj}$, respectively (note 12). Although $E|S_{njt}|^{2p}$, $p > 2$, will often be $O(l_{nj}^{p})$, it is certainly $O(l_{nj}^{2p})$, and since $l_{nj}^{2p} / (l_{nj}^{p} m_{nj}^{p-1}) = l_{nj}^{p} m_{nj}^{1-p} = O(n^{p/3} \times n^{2(1-p)/3}) = O(n^{(2-p)/3}) = o(1)$, equation (10) is satisfied. Similarly, if the number of existing moments is large, then we choose fewer subgroups with more elements each, and $w$ needs to be only slightly larger than 2. If the weak-dependence conditions are strengthened and weak dependence within each subgroup is exploited, then values of $w < 1$ are achievable; we do not show this here.
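The alternating allocation just described is easy to operationalize. The following minimal sketch (our own illustration, not PSS code; the block lengths are taken as given and the rate choices are only indicative) assigns observations on a line to the two groups and their subgroups and returns the subgroup partial sums.

```python
import numpy as np

def bernstein_blocks(x, l1, l2):
    """Split a 1-D array x into alternating 'big' (group 1, length l1) and
    'small' (group 2, length l2) blocks and return the partial sums per group.

    Group 2 blocks sit between the group 1 blocks so that observations in
    different subgroups of group 1 are far apart.
    """
    sums = {1: [], 2: []}
    i, group = 0, 1
    while i < len(x):
        length = l1 if group == 1 else l2
        block = x[i:i + length]
        if len(block) > 0:
            sums[group].append(block.sum())
        i += length
        group = 2 if group == 1 else 1
    return sums

# Example with the rates suggested above for w > 4 (here w = 8, purely illustrative).
n, w = 1_000_000, 8
l2 = max(1, int(round(n ** ((1 + 4 / w) / 6))))
l1 = max(1, int(round(n ** (1 / 3))) - l2)
partial_sums = bernstein_blocks(np.random.default_rng(1).normal(size=n), l1, l2)
print(len(partial_sums[1]), len(partial_sums[2]))   # numbers of subgroups per group
```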
Figure 1. Bernstein blocks.

2.3.5. Conditions in the dynamic discrete-choice model. It is instructive to discuss the meaning of our conditions in the context of our dynamic discrete-choice panel-data model. Several assumptions, including those relating to weak dependence and the choice of weights, are largely independent of the particular case of our dynamic discrete-choice model, and condition (18) is satisfied if a sufficiently large number of instruments is available. We will therefore focus our attention on Assumptions E, G, and M. Recall that in our dynamic discrete-choice model the dependence of the variables on $n$ was not evident in the notation; here we continue with that notation and use $y_{it}$ instead of $y_{nit}$, etc. As in the example above, we again assume that locations are non-random and hence so are the $\lambda$'s.

First consider Assumption E, and note that $\Phi$, $\phi$ and $\phi'$ are all bounded, i.e. for some $C_\phi < \infty$,
\[
\max_{t \in \mathbb{R}} \bigl( \Phi(t) + \phi(t) + |\phi'(t)| \bigr) \le C_\phi.
\]
Now, $|y_{it}| \le 1$, and hence
\[
\max_{\theta \in \Theta} \| g_{it}(\theta) \| \le \| z_{it} \| \max_{\theta \in \Theta}\Bigl( y_{it} + \bigl(1 + |\Phi(x_{it1}'\theta)| + |\Phi(x_{it0}'\theta)|\bigr) y_{i,t-1} + \bigl(|\Phi(x_{i,t-1,1}'\theta)| + |\Phi(x_{i,t-1,0}'\theta)|\bigr) y_{i,t-2} + \bigl(|\Phi(x_{it0}'\theta)| + |\Phi(x_{i,t-1,0}'\theta)|\bigr) \Bigr) \le \| z_{it} \| (2 + 6 C_\phi).
\]
Hence condition (16) of Assumption E is satisfied when $E\|z_{it}\|^4$ is bounded uniformly in $i$, $t$, $n$. The uniformity condition is only needed because the expectations are potentially different and can change with the number of observations. For condition (17) the derivation is similar, albeit that the $\Phi(x'\theta)$'s in the last-displayed equation are replaced with $x_s\, \phi(x'\theta)$'s. Thus,
\[
\max_{\theta \in \Theta} \Bigl\| \frac{\partial g_{it}}{\partial \theta_s}(\theta) \Bigr\| \le \| z_{it} \|\, 2 C_\phi \bigl( |x_{it1s}| + 2|x_{it0s}| + |x_{i,t-1,1,s}| + 2|x_{i,t-1,0,s}| \bigr),
\]
and equation (17) holds if $E\|z_{it}\|^4$, $E\|x_{it1}\|^4$, and $E\|x_{it0}\|^4$ are bounded uniformly in $i$, $t$, $n$. The second part of Assumption G is implied by equation (16), and the first half holds for stationary data unless the instruments are linearly dependent, as noted above. Finally, using the comment following Assumption M, Assumption M follows from the fact that
\[
\max_{\theta \in \Theta} \Bigl\| \frac{\partial^2 g_{it}}{\partial \theta_s\, \partial \theta_{s^*}}(\theta) \Bigr\| \le \| z_{it} \|\, 2 C_\phi \bigl( |x_{it1s}| + 2|x_{it0s}| + |x_{i,t-1,1,s}| + 2|x_{i,t-1,0,s}| \bigr)\bigl( |x_{it1s^*}| + 2|x_{it0s^*}| + |x_{i,t-1,1,s^*}| + 2|x_{i,t-1,0,s^*}| \bigr),
\]
provided that $E\|z_{it}\|^6$, $E\|x_{it1}\|^6$ and $E\|x_{it0}\|^6$ are bounded uniformly in $i$, $t$, $n$.

2.4. Choice of Weights

We propose a scheme for choosing the $\lambda$-weights, which is in the spirit of Newey & West (1987). But the Newey-West weights depend on the stationarity assumption, as do other weighting schemes for spatial estimators proposed by Conley (1999) and Kelejian & Prucha (2004). Our weights are
\[
\lambda_{nij} = \frac{ \sum_{t=1}^{n} \tau_{nti}\, \tau_{ntj} }{ \bigl( \sum_{t=1}^{n} \tau_{nti}^2\ \sum_{t=1}^{n} \tau_{ntj}^2 \bigr)^{1/2} }, \tag{26}
\]
where the $\tau_{nti}$ are numbers that are large when observation $i$ is near $t$. The matrix $L_n$ with $(i,j)$ element $\lambda_{nij}$ as defined in equation (26) is necessarily positive semi-definite, since it is $L_n = \sum_{t=1}^{n} \tilde{\tau}_{nt} \tilde{\tau}_{nt}'$, where $\tilde{\tau}_{nt}$ is a vector with $i$th element $\tau_{nti} / \sqrt{ \sum_{t=1}^{n} \tau_{nti}^2 }$. If $L_n$ is positive semi-definite, then so is $\hat{V}_n$ (note 13), which is necessary for the one-step GMM procedure to work.

Expression (26) is generic. In a stationary time series, $\tau_{nti}$ can be taken to be $I(|i - t| \le k_n)$, in which case $\lambda_{nij} = (1 - |i - j|/(2k_n + 1))\, I(|i - j| \le 2k_n)$ if $k_n < i, j < n - k_n$ (note 14), which are effectively the Newey-West weights (note 15). A similar scheme for spatial data that are located on a two-dimensional lattice yields a weight, for observations located at $(i, t)$ and $(j, s)$ respectively, equal to
\[
\lambda_{nit,js} = \frac{ 2\bigl( \min(|i - j|, |t - s|) + 1 \bigr)\bigl( \nu_{nitjs} + 1 \bigr) + \nu_{nitjs}^2 - I(\nu_{nitjs}\ \text{odd}) }{ (2k_n + 1)^2 + 1 }\ I(\nu_{nitjs} \ge 0), \tag{27}
\]
where $\nu_{nitjs} = 2k_n - (|i - j| + |t - s|)$,
provided that both observations are sufficiently far from the edge of the lattice. From equation (27) it follows that, for fixed $i, j, t, s$, $\lambda_{nit,js} \to 1$ as $k_n \to \infty$, since $\nu_{nitjs}/(2k_n) \to 1$ as $k_n \to \infty$.

Many schemes for the choice of the $\tau_{nti}$-weights are possible. The simplest schemes have all weights equal to either 0 or 1; we use such a scheme in our application. But many other configurations are possible, including ones with negative weights.
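Equation (26) is straightforward to compute once the $\tau_{nti}$ have been chosen. The sketch below (our own illustration; the simple 0/1 time-series choice of $\tau$ is just one example) builds $L_n$ and confirms that it is positive semi-definite by construction.

```python
import numpy as np

def lambda_weights(tau):
    """Equation (26): lambda_{ij} = sum_t tau[t,i]*tau[t,j] /
    sqrt(sum_t tau[t,i]**2 * sum_t tau[t,j]**2).

    tau is an (n, n) array whose (t, i) entry is large when observation i is 'near' t.
    Returns the n x n matrix L_n, positive semi-definite by construction.
    """
    norms = np.sqrt((tau ** 2).sum(axis=0))        # sqrt(sum_t tau_{ti}^2) for each i
    tau_tilde = tau / norms                        # normalized columns
    return tau_tilde.T @ tau_tilde                 # L_n = sum_t tau_tilde_t tau_tilde_t'

# Example: stationary time-series choice tau_{ti} = I(|i - t| <= k_n),
# which reproduces Bartlett/Newey-West-type weights away from the edges.
n, k = 50, 3
idx = np.arange(n)
tau = (np.abs(idx[:, None] - idx[None, :]) <= k).astype(float)   # (t, i)
L = lambda_weights(tau)
print(L[25, 25], L[25, 26], L[25, 25 + 2 * k], L[25, 25 + 2 * k + 1])
print(np.linalg.eigvalsh(L).min() >= -1e-10)       # PSD check
```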
3. The Application

We apply our technique to a panel of Canadian copper mines. The discrete choice in that application is to operate a property or to let it lie idle. In this section, we discuss the predictions from a theoretical real-options model, and we describe the copper mining industry and the data.

3.1. The Theory

Since the early 1980s, financial economists have realized that ownership of real assets has much in common with ownership of financial assets, and that the techniques that were developed to price and manage the latter could be used to price and manage the former.16 In this subsection we develop a reduced-form discrete-choice equation that embodies many of the predictions of the theory of real options. The equation that we specify is similar to the one that is used in Moel & Tufano (2002), which is itself based on the theoretical model of Brennan & Schwartz (1985). However, although we focus on a real-options model, our empirical specification allows us to discriminate among competing theories of operating decisions.

We consider a competitive industry, copper mining, and analyse flexible operating rules. Each mining property can be in one of two states, active or inactive, at the beginning of each period, and the manager must decide whether or not to operate the property. Since there are fixed costs associated with opening and closing, mines do not open (close) as soon as profits are expected to be positive (negative). Instead, there are price thresholds that must be crossed before an action is taken. Furthermore, the threshold that triggers opening is strictly higher than the one that triggers closing. In the dynamic discrete-choice literature, the triggers are called (S, s) thresholds, with S > s. In our application, both thresholds are functions of the current state.

Under simple assumptions on the data-generating process for price,17 it is possible to solve a real-options model to obtain a number of predictions concerning the determinants of the (S, s) thresholds and the optimal decision rules. Predictions that cause the thresholds to move in opposite directions or with different magnitudes are state dependent, whereas those that cause them to move in a similar fashion are not. The following is a list of predictions:
(i) Price: high prices make it more likely that mines will be active, regardless of the prior state. In other words, high prices lower both thresholds.
(ii) Operating costs: mines with higher average variable costs are less likely to be active, regardless of the prior state. Furthermore, if fixed costs are not sunk (i.e. if they are not incurred when not operating), mines with higher fixed costs are also less likely to be active. In other words, high costs raise both thresholds.
(iii) Reserves: large reserves make it more likely that a mine will be active, regardless of the prior state. In other words, large reserves lower both thresholds.
(iv) Prior state: a mine that is active (inactive) is more likely to remain active (inactive).
(v) Capacity: the effect of capacity is less clear. However, if size is a reasonable proxy for opening and closing costs, a large capacity will delay actions. In other words, large mines that are active (inactive) will be more likely to remain active (inactive). A large capacity then raises the upper threshold, lowers the lower threshold, and widens the region of no change.
(vi) Price volatility: high volatility delays actions. In particular, a mine that is active (inactive) is more likely to remain active (inactive) when prices are volatile. This means that high volatility widens the region between the thresholds (the region of inactivity).
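To fix ideas about how a state-dependent two-threshold rule generates hysteresis of the sort described above, the following toy simulation (our own illustration; the threshold levels and the random-walk price path are arbitrary assumptions, not part of the model estimated in this paper) traces the active/inactive state of a single mine under an (S, s) rule.

```python
import numpy as np

def simulate_two_threshold_rule(prices, s_open, s_close, active0=True):
    """Apply an (S, s) operating rule: open only when price rises above s_open,
    close only when it falls below s_close, with s_open > s_close.
    Returns the sequence of active/inactive states."""
    assert s_open > s_close
    active, states = active0, []
    for p in prices:
        if active and p < s_close:
            active = False          # closing threshold crossed
        elif (not active) and p > s_open:
            active = True           # opening threshold crossed
        states.append(active)
    return np.array(states)

rng = np.random.default_rng(2)
prices = 100 + np.cumsum(rng.normal(scale=5, size=60))   # illustrative price path
states = simulate_two_threshold_rule(prices, s_open=110, s_close=90)
print(f"share of active periods: {states.mean():.2f}, "
      f"number of switches: {(np.diff(states.astype(int)) != 0).sum()}")
```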
Most of the predictions that are on our list would emerge from many other models of optimal operating decisions. For example, it is difficult to think of a model that does not yield the first three predictions. Furthermore, any model with fixed opening and closing costs is apt to yield predictions (iv) and (v). Only the final prediction distinguishes the real-options model from other models of optimal operating decisions. We therefore discuss that prediction in greater depth. With a real-options model, an increase in uncertainty increases the option value of delay. Managers are therefore less likely to open a currently inactive mine or close one that is active. With some competing investment models, such as those based on discounted cash-flow (DCF), an increase in uncertainty has no effect. With others, an increase in volatility lowers the value of a project and makes operation less likely, regardless of the prior state. This is the case, for example, when investors are risk averse and face a trade-off between the
mean and the variance of returns. Prediction (vi) therefore allows us to distinguish among theories.
3.2. The Industry

We apply our technique to a panel of Canadian copper mines. There is considerable variation in the size of Canadian mines. Nevertheless, we assume that all decision makers are price takers. Indeed, the copper market is worldwide, and even the largest Canadian companies account for only a relatively small fraction of world copper production.18

There are a number of phases in the production of copper. In particular, ore is mined and concentrated at the mine site and then sent to smelters and refineries that produce metal. Most mining firms are vertically integrated into all phases of production. However, some smelting and refining of Canadian ores is performed by independent companies. When this occurs, however, it is customary for the mining company to pay a fixed charge per unit of metal and to continue to be the residual claimant (in other words, the mining company bears the price risk). This means that it is possible to know how much metal a mine produces each year and at what cost. Furthermore, it means that the price of refined copper is the relevant price, and that a measure of variation in that price is an appropriate measure of volatility.

We consider only developed properties. In other words, there are no mines that opened de novo in our sample. In fact, no Canadian copper mines opened de novo during the sample period. The copper industry, however, is highly cyclical, and mines frequently close when times are bad and reopen when conditions improve. Furthermore, most of the mines in our sample were inactive in some sample years, but no mine was inactive in all but one year. Some mines are surface deposits whereas others are underground veins. There are therefore two mining technologies, namely open pit and underground. Furthermore, most open-pit mines are in the province of British Columbia, whereas most underground mines are in Quebec.

Spatial dependence can be due to a number of factors. Mines that are located close to one another are apt to have similar unobserved attributes that can lead to spatial error dependence. For example, although many supply and demand shocks are aggregate (e.g. those associated with energy prices and industrial production), some cost shocks are local. Regional cost variation can be due to, for example, common geological features and local input prices. When those factors are not observed, spatial dependence is incorporated into the errors.

3.3. The Data

The data that are used in the application pertain to a panel of 21 Canadian copper mines that were observed over a 14-year period, 1980-1993. The 21 mines were chosen to satisfy two criteria. First, their principal commodity had to be copper, and, second, they had to be active during some portion of the 1980-1993 period. All mines that satisfy those criteria are included in the sample. Figures 2 and 3 contain maps of mine locations. Most of the data, which are annual, are described in Slade (2001). A detailed description of the variables that are used in the current application can be found in Appendix B. A shorter description of each variable follows.
Figure 2. Mine locations in Quebec.

Figure 3. Mine locations in British Columbia.
3.3.1. Mine data. The variables that vary by mine and year are:

ACTIVE: a binary variable that equals 1 if the mine is active in the current period and 0 otherwise.
RES: reserves of ore remaining in the mine.
CAP: mine capacity (in units of ore).
QCU: output of metal.
COST: real cost per unit of metal (available only when the mine is active).
DOPEN: a binary variable that equals 1 if the technology is open pit and 0 otherwise.19
The variables that vary only by mine are:

LAT and LONG: the spatial coordinates of the mine.
FCOST and MCOST: fixed and marginal costs.

The variable COST measures unit or average total cost. We generated a fixed and a marginal cost for each mine, FCOST_i and MCOST_i, using the following equation and the observations for which the mine was active,
\[
TCOST_{it} \equiv COST_{it} \times QCU_{it} = FCOST_i + MCOST_i \times QCU_{it} + e_{it}, \tag{28}
\]
where $e$ is a zero-mean random variable. Estimated fixed and marginal costs for mine $i$, which are the parameters in equation (28), are denoted FCOˆST_i and MCOˆST_i.

3.3.2. Price data. The price and volatility data are common to all mines.

CPR: real copper price (London Metal Exchange cash-settlement price).
SIGRR: yearly volatility of real monthly returns,
\[
SIGRR = SD\Bigl( \frac{CPR - CPR_{-1}}{CPR_{-1}} \Bigr), \tag{29}
\]
where $SD(\cdot)$ denotes the standard deviation, and $-1$ indicates the previous month.

There are 294 observations, 185 or 63% of which are active and 109 or 37% of which are inactive. Table 1 contains summary statistics for the dependent variable as well as for each variable that is used as a regressor. The first two columns show means and standard deviations for the entire sample, whereas subsequent columns contain separate summary statistics for active and inactive mines.
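As a concrete illustration of how the two constructed variables could be computed from raw series, consider the following sketch (our own; the data layout and function names are assumptions, not the study's actual files).

```python
import numpy as np

def yearly_volatility(monthly_real_price):
    """SIGRR of equation (29): standard deviation of real monthly returns within a year.
    monthly_real_price is a length-12 array of real copper prices for one year."""
    p = np.asarray(monthly_real_price, dtype=float)
    returns = (p[1:] - p[:-1]) / p[:-1]
    return returns.std(ddof=1)

def fixed_and_marginal_cost(cost, qcu):
    """Equation (28): regress total cost COST*QCU on QCU (with intercept) using the
    active-mine observations; returns (FCOST_hat, MCOST_hat) for one mine."""
    tcost = np.asarray(cost) * np.asarray(qcu)
    X = np.column_stack([np.ones(len(qcu)), qcu])
    fcost_hat, mcost_hat = np.linalg.lstsq(X, tcost, rcond=None)[0]
    return fcost_hat, mcost_hat
```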
Table 1. Summary statistics

                                    All              Active           Inactive
Variable                         Mean      SD     Mean      SD     Mean      SD
Mine status (ACTIVE)             0.63    0.48
Price (CPR)                     142.3    32.3    144.9    34.2    137.8    28.2
Volatility of returns (SIGRR)    5.54    2.13     5.48    2.13     5.64    2.14
Reserves remaining (RES)         84.5   166.6    105.3   181.0     49.3   132.2
Capacity (CAP)                   5.64    8.70     7.35   10.38     2.74    2.90
Technology (DOPEN)               0.43    0.50     0.49    0.50     0.33    0.47
Marginal cost (MCOˆST)           97.6    42.1     91.6    37.0    107.7    48.1
Fixed cost (FCOˆST)              31.2    58.2     37.8    58.2     20.1    56.6
This table suggests that mines are more apt to be active if prices, reserves, and capacity are high and volatility is low. Furthermore, on average, active mines have lower marginal costs. However, they have higher fixed costs, which is due to the fact that large mines are more apt to be active. Finally, open-pit mines are less likely to close than underground mines, which could be an indication of lower operating costs.

4. Results

The discrete-choice equations explain operating decisions for each mine in each year. It is assumed that those decisions are made at the beginning of the period using the information that is available at that time, $Y_t$, which in practice is a set of excluded variables plus the set of $t-j$ regressors with $j \ge 1$.20 Since expectations are formed rationally, expected and realized values of period-$t$ explanatory variables differ by an error that is orthogonal to $Y_t$. Lagged regressors are therefore appropriate instruments for the period-$t$ variables that appear in the discrete-choice equations.

The discrete-choice models that we estimate are all special cases of equation (1), which, as discussed in Section 2.1, nests several models. The $x$ vector that appears in that equation contains the variables that are shown in Table 1. All equations are estimated by one-step GMM with lagged (and/or current for the static models without fixed effects) regressors used as instruments; the first three lags are used for all years except for the first two, for which fewer lags are available. Specifications differ according to whether the equations are static or dynamic and whether they include fixed effects (denoted FE in the tables).

Tables 2 and 3 contain ordinary probit models, albeit that they were estimated using GMM with the regressors as instruments.
Table 2. Ordinary probits, GMM, no state dependence No. 1 2 3 4 5 6 7 8 9
CPR
SIGRR
0.0095 (3.2) 0.0096 (3.2) 0.010 (3.3) 0.0098 (3.3) 0.010 (3.3) 0.0097 (3.2) 0.016 (3.4) 0.016 (3.3) 0.016 (3.4)
/0.115 (/2.5) /0.118 (/2.5) /0.127 (/2.7) /0.120 (/2.6) /0.122 (/2.6) /0.120 /(2.6) /0.201 (/3.2) /0.201 (/3.1) /0.197 (/3.2)
RES
0.0015 (2.4) /0.0013 (/1.6) 0.0011 (1.6) 0.0014 (2.1) 0.0012 (2.0)
0.0005 (0.1) /0.0021 (/0.3)
CAP
DOPEN
MCOˆST
FCOˆST
0.102 (5.0) 0.237 (1.3) /0.0054 (/2.8) 0.0028 (1.8)
0.109 (1.0)
Notes: Dependent variable: ACTIVE. t-statistics are in parentheses. FE denotes specifications with mine fixed effects. NSI denotes normalized success index.
FE
NSI
No
0.03
No
0.06
No
0.12
No
0.06
No
0.09
No
0.07
Yes
0.47
Yes
0.48
Yes
0.49
Table 3. Ordinary probits, GMM, state-dependent estimates No.
CPR
SIGRR1
1
0.019 (2.8) 0.019 (2.8) 0.019 (2.8) 0.022 (3.3) 0.023 (3.4) 0.024 (3.4)
/0.085 (/1.1) /0.093 (/1.2) /0.093 (/1.1) /0.162 (/1.8) /0.169 (/1.9) /0.175 (/1.9)
2 3 4 5
6
RES
0.0012 (1.4) /0.0007 (/0.6)
0.0025 (0.9) /0.0010 (/0.2)
CAP1
SIGRR0 /0.685 (/6.0) /0.689 (/6.0) /0.670 (/5.3) /0.674 (/5.3) /0.686 (/5.3) /0.718 (/4.9)
0.075 (1.9)
0.175 (1.0)
CAP0
0.070 (1.0)
0.207 (1.2)
FE
NSI
No
0.67
No
0.67
No
0.68
Yes
0.74
Yes
0.74
Yes
0.74
Notes : Dependent variable: ACTIVE . t -statistics are in parentheses. FE denotes specifications with mine fixed effects. Variables ending in 1 (0) correspond to observations with the mine previously open (closed). NSI denotes normalized success index.
Since the number of restrictions is then equal to the number of unknowns, no weighting matrix needs to be chosen. For Tables 4 and 5, excluded variables and lagged regressors were used as instruments, the number of instruments exceeds the number of unknown coefficients, and the $L_n$-matrix used in generating the GMM weighting matrix is the identity matrix. This procedure, which corrects for heteroskedasticity but not for spatial dependence, generates consistent estimates and, in the absence of dependence, also appropriate standard errors. For Tables 6 and 7, the same instruments were chosen as for Tables 4 and 5, but the procedure of Section 2.4 was used to create $L_n$. We set $\tau_{nit,js} = 1$ if mine $j$ is among mine $i$'s two closest neighbours21 and $|t - s| \le 1$; otherwise it is set to 0.
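The $\tau$ choice just described is a simple 0/1 scheme over space and time. A minimal sketch of one way to construct it (our own; the coordinates, names and distance metric are assumptions) is given below; the resulting $\tau$ can be fed into the equation-(26) routine sketched earlier.

```python
import numpy as np

def tau_nearest_neighbours(coords, T, n_neighbours=2, max_time_gap=1):
    """Build tau for mine-year observations: tau[(j,s),(i,t)] = 1 when mine j is one of
    mine i's n_neighbours closest mines (or j == i) and |t - s| <= max_time_gap.

    coords : (M, 2) mine coordinates (e.g. latitude/longitude); T : number of years.
    Observations are ordered (mine 0, year 0), ..., (mine M-1, year T-1).
    """
    M = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbour = np.zeros((M, M), dtype=bool)
    for i in range(M):
        neighbour[i, np.argsort(dist[i])[:n_neighbours]] = True
    neighbour |= np.eye(M, dtype=bool)              # every mine is 'near' itself

    n = M * T
    tau = np.zeros((n, n))
    for i in range(M):
        for t in range(T):
            for j in range(M):
                for s in range(T):
                    if neighbour[i, j] and abs(t - s) <= max_time_gap:
                        tau[j * T + s, i * T + t] = 1.0
    return tau
```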
Table 4. Heteroskedasticity-corrected models: no state dependence No. 1 2 3 4 5 6
CPR
SIGRR
0.0075 (1.7) 0.0087 (1.9) 0.0088 (1.9) 0.0082 (0.6) 0.014 (1.0) 0.0037 (0.5)
/0.084 (/1.4) /0.112 (/1.8) /0.099 (/1.5) /0.182 (/0.9) /0.238 (/1.1) /0.089 (/0.5)
Notes : Dependent variable: ACTIVE . t -statistics are in parentheses. FE denotes specifications with mine fixed effects. NSI denotes normalized success index.
RES
0.0014 (2.6) /0.0015 (/1.7)
CAP
0.117 (5.0)
FE
NSI
No
0.02
No
0.05
No
0.12
Yes 0.0017 (1.1) /0.010 (/0.9)
Yes 0.232 (1.1)
Yes
Table 5. Heteroskedasticity-corrected models: state dependence No.
CPR
SIGRR1
1
0.039 (1.3) 0.058 (1.6) 0.057 (1.3) 0.075 (0.5) 0.076 (0.7) 0.140 (0.2)
/0.342 (/1.5) /0.468 (/1.8) /0.329 (/1.2) /0.857 (/0.7) /0.854 (/0.8) /1.26 (/0.2)
2 3 4 5
6
RES
0.0013 (1.6) 0.0053 (0.8)
0.0007 (0.7) /0.0063 (/0.3)
CAP1
0.076 (0.9)
2.94 (0.3)
SIGRR0 /2.53 (/1.3) /3.55 (/1.5) /3.88 (/1.3) /4.73 (/0.5) /4.89 (/0.7) /7.48 (/0.2)
CAP0
0.062 (0.1)
FE
NSI
No
0.74
No
0.74
No
0.79
Yes Yes 0.941 (0.4)
Yes
Notes : Dependent variable: ACTIVE . t -statistics are in parentheses. FE denotes specifications with mine fixed effects. Variables ending in 1 (0) correspond to observations with the mine previously open (closed). NSI denotes normalized success index.
We have also experimented with varying the number of neighbours, but qualitatively the results do not change. If the errors are heteroskedastic, not all specifications are consistent. However, we report models that range from the simplest static probit to the full specification so that sensitivity can be assessed.

4.1. Ordinary Probit Estimates

Table 2 contains static ordinary probit estimates of the operating rule. The dependent variable equals 1 when the mine is active and 0 otherwise. The GMM

Table 6. Spatial models: no state dependence No. 1 2 3 4 5 6
CPR
SIGRR
0.0033 (0.8) 0.0058 (1.4) 0.0028 (0.6) /0.0007 (/0.06) 0.0009 (0.06) 0.0051 (0.7)
/0.042 (/0.8) /0.071 (/1.3) /0.018 (/0.3) /0.579 (/0.7) /0.671 (/0.8) /0.123 (/0.8)
Notes : Dependent variable: ACTIVE . t -statistics are in parentheses. FE denotes specifications with mine fixed effects. NSI denotes normalized success index.
RES
0.0013 (2.0) /0.0018 (/1.5)
CAP
0.115 (4.1)
FE
NSI
No
0.01
No
0.04
No
0.10
Yes 0.0014 (1.0) /0.016 (/1.2)
Yes 0.340 (1.3)
Yes
Table 7. Spatial models: state dependence No.
CPR
SIGRR1
1
0.045 (1.5) 0.056 (1.7) 0.057 (1.5) 0.051 (1.0) 0.058 (1.0) 0.053 (1.9)
/0.399 (/1.8) /0.472 (/2.0) /0.338 (/1.2) /0.523 (/1.2) /0.620 (/1.2) /0.615 (/1.9)
2 3 4 5
6
RES
0.0012 (2.0) /0.0024 (/0.4)
0.0005 (0.5) /0.031 (/1.6)
CAP1
0.101 (1.3)
1.98 (1.5)
SIGRR0 /2.81 (/1.5) /3.45 (/1.6) /4.02 (/1.6) /3.48 (/1.0) /4.01 (/1.0) /2.57 (/1.3)
CAP0
0.163 (0.4)
FE
NSI
No
0.74
No
0.74
No
0.79
Yes Yes 0.614 (1.3)
Yes
Notes : Dependent variable: ACTIVE . t -statistics are in parentheses. FE denotes specifications with mine fixed effects. Variables ending in 1 (0) correspond to observations with the mine previously open (closed). NSI denotes normalized success index.
moment conditions make use of the fact that the generalized errors from the probit should be orthogonal to the instruments (see Pinkse & Slade, 1998).22

Consider first the specifications without fixed effects (numbers 1-6). They imply that mines are more apt to operate when the price is high, volatility is low, the mine is large, and the marginal cost is small. The effect of the fixed cost, in contrast, is not significant at conventional levels. In addition, mines with larger reserves are more likely to be active, except in specifications that include capacity, in which case the effect of reserves is not significant. This finding is probably due to the fact that RES and CAP are highly correlated. Finally, the effect of the type of mining technology, open pit or underground, is not significant.

With the ordinary probits, the mine fixed effects are included in the latent-variable equation rather than in the observed binary-choice equation, and no differencing is required. Since the number of time-series observations per mine is 14, doing so may be reasonable in this instance. Specifications 7-9 in Table 2 show that the inclusion of fixed effects increases the magnitude and the significance of the coefficients of the price and volatility variables. However, the coefficients of the other explanatory variables become insignificant. The fall in significance is due to the fact that, with fixed effects, identification is achieved through time-series variation, and there is much less variation in the time series for reserves and capacity than in the cross-section.

The last column in Table 2 contains the normalized success index (NSI), which is a measure of goodness of fit.23 This index shows that performance is poor for the basic model (No. 2) and that the addition of capacity has the greatest effect on performance. In what follows, we therefore emphasize variants of specification No. 3, which includes capacity. The table also shows that the addition of fixed effects improves predictive power very substantially. We therefore also report variants of specification No. 9.

The specifications in Table 2 explain the probability that a mine will operate (not operate) in a given period. Equation (1), in contrast, which is a form of
switching regression with known switch points, explains the transition probabilities. In other words, it determines the probability of being in state k in period t, conditional on having been in state j in period t-1. The specifications in Table 3 are more flexible in that some of the coefficients are allowed to depend on the regime (i.e. on the lagged dependent variable).

Recall that the effects of both volatility and capacity were hypothesized to depend on the previous state. Indeed, the real-options model predicts that increased uncertainty (higher volatility) causes the option value of a property to rise and delays investment. Furthermore, if a large capacity is a reasonable proxy for opening and closing costs, a large capacity should also delay closings and reopenings. This means that the coefficients of SIGRR and CAP should be positive when the mine was previously active but negative when it was previously inactive.

Table 3 contains specifications in which the effects of volatility and capacity are allowed to vary by state. It shows, however, that in no case does the sign of an estimated coefficient depend on the previous state, and that this conclusion is independent of whether the specification includes mine fixed effects. Both the magnitude and the significance of the coefficients of SIGRR, however, are state dependent. In particular, the effect of volatility is much stronger when the mine was previously inactive. This means that increased volatility delays openings significantly but has a much smaller negative impact on closings.

The finding that the sign of the coefficient of SIGRR is negative regardless of the previous state is inconsistent with the real-options model. Instead, the dampening effect of volatility on operations is consistent with a more conventional mean/variance-utility model in which investors demand a higher mean when returns are more volatile. This result can be contrasted with that of Moel & Tufano (2002), who find evidence in favour of the real-options model in the gold-mining industry. In particular, their estimated coefficients are positive and negative, respectively, as the real-options model predicts. Nevertheless, they find that the effect of volatility is not significant when the mine was previously closed.

The coefficients of the non-price variables in Table 3 are not significant at conventional levels.24 Moreover, comparing the estimates in Tables 2 and 3, it is clear that significance tends to be lower overall when the model is dynamic. This is due to the fact that although, for example, there are 195 observations where a mine is active, a transition to a new state (e.g. to inactive from active) occurs for only 17, or 9%, of them. Furthermore, there is little time-series variation in the mine-specific variables such as capacity and reserves. Finally, the last column in Table 3 shows that the NSI rises significantly when state dependence is considered (from, for example, 12% to 68% for specification No. 3). Moreover, although the inclusion of fixed effects improves predictive power in Table 3, in contrast to the results in Table 2, the improvement is not dramatic. This difference is due to the fact that predictive power was already high.
4.2. Corrected Estimates

4.2.1. Correction for heteroskedasticity. Tables 4 and 5 contain the principal specifications from Tables 2 and 3, respectively. However, the GMM estimation now uses the procedure of Section 2. Specifically, we correct for heteroskedasticity but not for spatial dependence.

Consider Table 4, which contains specifications that are not state dependent. One can see that the specifications without fixed effects have estimated coefficients
that are typically less significant than those in Table 2. Nevertheless, the signs and magnitudes of the coefficients, as well as the normalized success indices, are not very different across tables.25 When mine fixed effects are introduced, significance declines once again. The lack of significance with fixed effects is not surprising. Indeed, both the differencing procedure and the heteroskedasticity correction tend to reduce the number of significant coefficients. When they are combined, the reductions are compounded.

Table 5, like Table 3, allows the coefficients of volatility and capacity to depend on the previous state. That table shows that, in general, significance is reduced compared to Table 3. However, the magnitudes of the coefficients of volatility are considerably larger than those in Table 3. Indeed, increased magnitude is observed with all specifications in Table 5. It appears that modelling mine heterogeneity in the form of heteroskedastic errors allows one to uncover stronger volatility effects. Finally, correcting for heteroskedasticity also increases the predictive power of the model (the NSI rises from, for example, 68% to 79% for specification No. 3).

4.2.2. Spatial correction. Tables 6 and 7 are comparable to Tables 4 and 5 except that a correction for spatial and time-series dependence has been added. Specifically, we use equation (26) with weights determined as discussed above. It is clear from the tables that, when both time-series and spatial dependence are modelled, the signs and magnitudes of the coefficients are similar to those obtained when only heteroskedasticity is corrected for. However, the coefficients are measured with greater accuracy in the full model (e.g. compare specification No. 6 in Tables 5 and 7). Nevertheless, the predictive power of the spatial models is no better.

4.3. Transition Probabilities

The results from our state-dependent discrete-choice model are not supportive of the real-options model. A simple examination of the signs of the estimated coefficients in Tables 3, 5, and 7 reveals this fact. The magnitudes of the coefficients in a probit switching regression, however, are more difficult to interpret directly. Here we examine magnitudes indirectly.

Table 8 illustrates how an increase in volatility affects operating decisions. Each part of the table is a 2 × 2 matrix of transition probabilities: the probability of being in state i conditional on having been in state j. The labels on the rows indicate the prior state, whereas those on the columns indicate the operating decision. The first half of the table is evaluated at the mean of the explanatory variables. It shows that 9% (7%) of active (inactive) mines close (open) in a typical period.26 The second half of the table differs from the first only in the value of our measure of price volatility, SIGRR. Instead of using mean volatility, we chose the mean plus one standard deviation, which we denote 'high volatility'. The table shows that the probability of operation declines in both states. However, the effect on mines that were previously closed is more dramatic, since, for those mines, the probability of opening when volatility is moderately high is virtually zero.
Table 8. Predicted transition probabilities

At mean:
                         Operating decision
  Prior state            Open        Closed
  Open                   0.91        0.09
  Closed                 0.07        0.93

High volatility:
                         Operating decision
  Prior state            Open        Closed
  Open                   0.88        0.12
  Closed                 0.004       0.994

Notes: Top half: all explanatory variables at mean. Bottom half: high volatility equals mean plus one standard deviation.
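The calculation behind Table 8 is a routine probit evaluation, so a brief sketch may help readers who wish to replicate it with their own estimates. The coefficient vectors and covariate values below are placeholders, not the estimates reported in Tables 3-7; only the mechanics (a state-dependent probit evaluated at the covariate means and at 'high volatility') follow the text.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical coefficients for a state-dependent probit: one vector applies
# when the mine was open in the previous period, the other when it was closed.
beta_open = np.array([1.3, -0.6, 0.4])      # placeholder values
beta_closed = np.array([-1.5, -1.0, 0.3])   # placeholder values

def transition_matrix(x):
    """2 x 2 matrix of P(operating decision | prior state) at covariates x."""
    p_stay_open = norm.cdf(x @ beta_open)    # P(open | previously open)
    p_reopen = norm.cdf(x @ beta_closed)     # P(open | previously closed)
    return np.array([[p_stay_open, 1.0 - p_stay_open],
                     [p_reopen, 1.0 - p_reopen]])

x_mean = np.array([1.0, 0.2, 0.6])           # covariates at their means (placeholder)
x_high = x_mean.copy()
x_high[1] += 0.15                            # SIGRR at mean plus one std. dev. (placeholder)

print(transition_matrix(x_mean))             # analogue of the top panel of Table 8
print(transition_matrix(x_high))             # analogue of the bottom panel
```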
5. Conclusions

Many economic decisions are both spatial and discrete. For example, economic agents must decide which markets to enter, which products to produce, whether to promote those products by launching advertising campaigns, and whether to engage in R&D activities to improve product quality. In spite of the importance of discrete choices in spatial applications, however, most of the econometric techniques that are available to applied researchers to model those choices suffer from a number of problems that limit their applicability. In this paper we develop an estimator that overcomes some, if not all, of the problems that are apt to surface. In particular, we develop a consistent one-step GMM (or continuous updating) estimator for dynamic discrete-choice models with endogenous regressors, fixed effects, and arbitrary patterns of spatial and time-series dependence. We then use our dynamic discrete-choice estimator to investigate closing and reopening decisions for a panel of Canadian copper mines.27 This exercise, which is cast in a real-options framework, is a two-state optimal-switching decision problem in which a mine can be either active or inactive and the decision maker must decide whether to operate the mine or to let it lie idle. Our spatial discrete-choice framework allows us to use a Markov structure and to estimate Markov transition probabilities or state-dependent operating rules. The important prediction that distinguishes the real-options model from more conventional rules pertains to the effect of uncertainty on operating decisions. Specifically, with a standard real-options model, increased uncertainty delays decisions and causes the region of price space in which no action is chosen to widen. Among other things, this means that operating rules are state dependent. Indeed, both the direction and the magnitude of the effect of uncertainty should depend on the state of the mine in the previous period. We find little support for the real-options model. In particular, the data are more supportive of a conventional model of investment or operating decisions in which the decision maker is risk averse and faces a trade-off between the mean and the variance of returns.
Our one-step GMM estimator works well on simple models, both static and dynamic. However, when the model becomes more complex, for example through the introduction of a combination of fixed effects and a weighting scheme with a large number of parameters, the estimated coefficients lose significance. Nevertheless, although significance tends to drop when more flexible specifications are estimated, the basic conclusions remain intact. Moreover, modelling state dependence improves the predictive performance of the model dramatically. Finally, modelling unobserved mine heterogeneity in the form of heteroskedastic errors not only improves performance but also uncovers larger effects of uncertainty or volatility.
Notes
1. There is a vast literature in which the spatial dependence structure is assumed known up to a finite-dimensional parameter vector. Asymptotic results for such processes can be established under the assumption of independence after an appropriate transformation. See, for example, Anselin (1988) for an early treatise. See Lee (2004) for an extensive discussion of the estimation of quasi-maximum-likelihood estimators for spatial processes. The results presented here do not presume that the spatial dependence structure is known.
2. By non-stationarity we mean that the joint distribution can depend on location, not just on the distance between locations, and not that a unit root is present.
3. See, for example, Arellano & Honoré (2001) for a survey of linear panel-data models.
4. The interpretation of the fixed effects varies with the exogeneity assumptions.
5. This can be done without loss of generality since $\epsilon_{itj}$ can be written as $\epsilon_{itj} = \epsilon_{itj,1}\, y_{i,t-1} + \epsilon_{itj,0}\,(1 - y_{i,t-1})$, with $\epsilon_{itj,s}$ independent of $y_{i,t-1}$ for all $i, j, t, s$. Then $\epsilon_{itj}$ in equation (1) can simply be replaced with $\epsilon_{itj,j}$, which is independent of $y_{i,t-1}$.
6. Please note that regressors are random here, not deterministic. There is hence no implicit strict exogeneity assumption, and condition (iii) is not implied by condition (i), since condition (i) relates to the unconditional error distribution, not the error distribution conditional on the regressors.
7. This means that endogeneity is due to the fact that regressors can be correlated with $u^*$, not with $\epsilon_j$, $j = 0, 1$.
8. A detailed description can be found in PSS, section 2.3.
9. By the Hölder and Markov inequalities, $E\bigl(S_{njt}^2 I(|S_{njt}| > h_n w_{nj})\bigr) \le (E|S_{njt}|^{2p})^{1/p}\bigl(P(|S_{njt}| > h_n w_{nj})\bigr)^{1-1/p} \le E|S_{njt}|^{2p}\, h_n^{2-2p} w_{nj}^{2-2p}$; the two steps are written out in the display following these notes.
10. The essential supremum of a random variable $v$ is $\min\{t : P(|v| > t) = 0\}$ (see, for example, (8.1) in Davidson, 1994).
11. We ignore the possibility that these numbers may not be integers here.
12. As they would be in the i.i.d. case.
13. Note that $\hat V_n = \tilde c_n'\,\Lambda_n\,\tilde c_n$ with $\tilde c_n$ a vector with $i$th element $g_{ni} - \bar g_n$.
14. For $i, j$ closer to 1 or $n$ the formula is somewhat more complex.
15. The Newey-West weights would be $(1 - |i-j|/(2k_n + 1))\, I(|i-j| \le 2k_n)$ instead of $(1 - |i-j|/n)$.
16. See, for example, Tourinho (1979), Brennan & Schwartz (1985), and McDonald & Siegel (1985) for early applications of the theory of financial options to real assets. For a more recent and comprehensive treatment, see Dixit & Pindyck (1994).
17. The standard assumption is that price follows a geometric Brownian motion.
18. The Hirschman-Herfindahl index for Western-world copper mining averages 400 (see Slade & Thille, 2006).
19. Some mines have both a shaft and a pit, and the section of the mine that is active can vary.
20. Appendix B discusses the excluded as well as the included variables.
21. Distances are Euclidean; see Pinkse et al. (2002) for other notions of distance.
22. When the generated variables, FCÔST and MCÔST, are included, the standard errors are apt to be underestimated.
23. See Hensher & Johnson (1981, p. 54) for this index, which is discussed in Appendix C.
24. This is also true in specifications that include the other explanatory variables that are shown in Table 2.
25. The table does not show the NSI for the specifications that have been differenced. Indeed, calculation of the NSI requires estimates of the fixed effects, which have been removed by differencing.
26. Specification No. 3 in Table 3 was used to produce Table 8.
27. See Pinkse & Slade (2005) for an application of this estimator to discrete advertising decisions in a strategic context.
28. Pollard's result does not require the existence of derivatives.
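For completeness, the two inequalities invoked in note 9 can be written out as follows; this is only the algebra behind the note, using the note's own symbols:
$$E\bigl(S_{njt}^{2}\,I(|S_{njt}|>h_n w_{nj})\bigr)\;\le\;\bigl(E|S_{njt}|^{2p}\bigr)^{1/p}\,\bigl(P(|S_{njt}|>h_n w_{nj})\bigr)^{1-1/p}\;\le\;\bigl(E|S_{njt}|^{2p}\bigr)^{1/p}\Bigl(\frac{E|S_{njt}|^{2p}}{(h_n w_{nj})^{2p}}\Bigr)^{1-1/p}\;=\;E|S_{njt}|^{2p}\,h_n^{2-2p}w_{nj}^{2-2p},$$
where the first step is Hölder's inequality and the second is Markov's inequality.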
References
Andrews, D. W. K. (1987) Consistency in nonlinear econometric models: a generic uniform law of large numbers, Econometrica, 55, 1465-1472.
Andrews, D. W. K. (1992) Generic uniform convergence, Econometric Theory, 8, 241-257.
Anselin, L. (1988) Spatial Econometrics: Methods and Models, Dordrecht, Kluwer.
Arellano, M. & Honoré, B. (2001) Panel data models: some recent developments, in: J. J. Heckman & E. Leamer (eds) Handbook of Econometrics, Vol. 5, pp. 3229-3296, Amsterdam, North-Holland.
Bernstein, S. (1927) Sur l'extension du théorème du calcul des probabilités aux sommes de quantités dépendantes, Mathematische Annalen, 97, 1-59.
Brennan, M. J. & Schwartz, E. S. (1985) Evaluating natural-resource assets, Journal of Business, 58, 135-157.
Chamberlain, G. (1984) Panel data, in: Z. Griliches & M. Intriligator (eds) Handbook of Econometrics, Vol. 2, pp. 1247-1318, Amsterdam, North-Holland.
Cliff, A. D. & Ord, J. K. (1973) Spatial Autocorrelation, London, Pion.
Conley, T. G. (1999) GMM estimation with cross sectional dependence, Journal of Econometrics, 92, 1-45.
Davidson, J. (1994) Stochastic Limit Theory, Oxford, Oxford University Press.
Dixit, A. K. & Pindyck, R. S. (1994) Investment under Uncertainty, Princeton, NJ, Princeton University Press.
Doukhan, P. & Louhichi, S. (1999) A new weak dependence condition and applications to moment inequalities, Stochastic Processes and their Applications, 84, 312-342.
Hansen, L. P., Heaton, J. & Yaron, A. (1996) Finite-sample properties of some alternative GMM estimators, Journal of Business and Economic Statistics, 14, 262-280.
Hensher, D. A. & Johnson, L. W. (1981) Applied Discrete Choice Modeling, New York, Wiley.
Honoré, B. E. & Kyriazidou, E. (2000) Panel data discrete choice models with lagged dependent variables, Econometrica, 68, 839-874.
Ibragimov, I. A. (1962) Some limit theorems for stationary processes, Theory of Probability and Its Applications, 7, 349-382.
Iglesias, E. M. & Phillips, G. D. A. (2005) Asymptotic Bias of GMM and GEL under Possible Nonstationary Spatial Dependence, Mimeo, Michigan State University.
Kelejian, H. & Prucha, I. (2004) HAC estimation in a spatial framework, Journal of Econometrics (forthcoming).
Lee, L. (2004) Asymptotic distributions of quasi maximum likelihood estimators for spatial econometric models, Econometrica, 72, 1899-1926.
Magnac, T. (2004) Panel binary variables and sufficiency: generalizing conditional logit, Econometrica, 72, 1859-1876.
McDonald, R. & Siegel, D. (1985) Investment and the valuation of firms when there is an option to shut down, International Economic Review, 26, 331-349.
Moel, A. & Tufano, P. (2002) When are real options exercised? An empirical study of mine closings, Review of Financial Studies, 15, 33-64.
Newey, W. K. & Smith, R. J. (2003) Higher order properties of GMM and generalized empirical likelihood estimators, Econometrica, 72, 219-255.
Newey, W. K. & West, K. D. (1987) A simple, positive definite, heteroscedasticity and autocorrelation consistent covariance matrix, Econometrica, 55, 703-708.
Pinkse, J., Shen, L. & Slade, M. E. (2005) A central limit theorem for endogenous locations and complex spatial interactions, Journal of Econometrics (forthcoming).
Pinkse, J. & Slade, M. E. (1998) Contracting in space: an application of spatial statistics to discrete choice models, Journal of Econometrics, 85, 125-154.
Pinkse, J. & Slade, M. E. (2005) Semi-structural models of advertising competition, Journal of Applied Econometrics (forthcoming).
Pinkse, J., Slade, M. E. & Brett, C. (2002) Spatial price competition: a semiparametric approach, Econometrica, 70, 1111-1153.
Pollard, D. (1984) Convergence of Stochastic Processes, Berlin, Springer.
Pötscher, B. & Prucha, I. (1989) A uniform law of large numbers for dependent and heterogeneous data processes, Econometrica, 57, 675-683.
Rosenblatt, M. (1956) A central limit theorem and a strong mixing condition, Proceedings of the National Academy of Sciences, 42, 43-47.
Slade, M. E. (2001) Managing projects flexibly: an application of real-option theory to mining investments, Journal of Environmental Economics and Management, 41, 193-233.
Slade, M. E. & Thille, H. (2006) Commodity spot prices: an exploratory assessment of market structure and forward trading effects, Economica, 73, 229-256.
Tourinho, O. A. (1979) The valuation of reserves of natural resources: an option pricing approach, PhD dissertation, University of California, Berkeley.
Whittle, P. (1954) On stationary processes in the plane, Biometrika, 41, 434-449.
Appendix A: Continuous Updating

PLEASE NOTE: throughout the Appendix we indicate, when convenient, the assumption, lemma, equation or well-known theorem on which an (in)equality is based above that (in)equality; an assumption is indicated by its letter, a common theorem by its name, a lemma by the letter L followed by its number, and an equation by its equation number in parentheses.

A.1. Generic Uniform Convergence
Lemma 1 is a self-contained generic uniform convergence result. There are many similar results available (see, for example, Andrews, 1987, 1992; Pötscher & Prucha, 1989); we use this one simply because it is more convenient.

Lemma 1. Let $\Theta$ be some compact convex set and $Q_n$ some sequence of random functions which are differentiable with respect to $\theta \in \mathbb{R}^d$. Let, for any $p>0$,
$$m_{np} = \max_{\theta\in\Theta}\bigl(E|Q_n(\theta)|^p\bigr)^{1/p}, \tag{A1}$$
and let $D_n$ be such that
$$\max_{\theta\in\Theta}\Bigl\|\frac{\partial Q_n}{\partial\theta}(\theta)\Bigr\| = O_p(D_n). \tag{A2}$$
Then
$$\max_{\theta\in\Theta}|Q_n(\theta)| = o_p\bigl(m_{np}^{p/(p+d)}\, D_n^{d/(p+d)}\, L_n\bigr). \tag{A3}$$

Proof: Denote the convergence rate in equation (A3) by $e_n$. Let $\lceil t\rceil$ denote the smallest integer greater than or equal to $t$. Divide the parameter space $\Theta$ up into $\gamma_n = \lceil C/\delta_n^d\rceil$ (for some $0<C<\infty$) possibly overlapping regions $\Theta_i$ and let $\tilde\theta_i\in\Theta_i$ be such that for all $\theta\in\Theta_i$, $\|\theta-\tilde\theta_i\| < \delta_n$. This can be done by the convexity and compactness of $\Theta$. Note that
$$P\Bigl(\max_{\theta\in\Theta}|Q_n(\theta)|\ge 2e_n\Bigr) = P\Bigl(\max_{i\le\gamma_n}\max_{\theta\in\Theta_i}|Q_n(\theta)|\ge 2e_n\Bigr)$$
$$\le P\Bigl(\max_{i\le\gamma_n}\max_{\theta\in\Theta_i}|Q_n(\theta)-Q_n(\tilde\theta_i)|\ge e_n\Bigr) + P\Bigl(\max_{i\le\gamma_n}|Q_n(\tilde\theta_i)|\ge e_n\Bigr)$$
$$\le P\Bigl(\max_{\theta\in\Theta}\Bigl\|\frac{\partial Q_n}{\partial\theta}(\theta)\Bigr\|\,\delta_n \ge e_n\Bigr) + \sum_{i=1}^{\gamma_n} P\bigl(|Q_n(\tilde\theta_i)|\ge e_n\bigr)$$
$$\le P\Bigl(D_n^{-1}\max_{\theta\in\Theta}\Bigl\|\frac{\partial Q_n}{\partial\theta}(\theta)\Bigr\| \ge e_n\,\delta_n^{-1}D_n^{-1}\Bigr) + \gamma_n\, e_n^{-p}\, m_{np}^p, \tag{A4}$$
by the Markov inequality. Now choose $\delta_n = (m_{np}/D_n)^{p/(p+d)}$, so that
$$e_n^{-1}\delta_n D_n = L_n^{-1}\to 0, \qquad \gamma_n\, e_n^{-p}\, m_{np}^p = C/L_n^p\to 0,$$
both as $n\to\infty$. The first right-hand-side (RHS) term in equation (A4) is then $o(1)$ by equation (A2). ∎

If, for example, $Q_n$ is an average of i.i.d. mean-zero random functions of $\theta$, then $m_{np}\sim n^{-1/2}$, $D_n\sim 1$ and the convergence rate is $n^{-p/(2(p+d))} L_n$, where $L_n$ can be slowly varying. If an infinite number of moments exists, then the uniform convergence rate is close to root $n$; see, for example, Pollard (1984), theorem 37, for a result in this case.28 Lemma 1 can be iterated if more derivatives exist. In the i.i.d. average case a convergence rate of
$$m_{np}^{1-(d/(p+d))^{D}}\, L_n,$$
with $D$ the number of derivatives, is then achievable. In our paper, $Q_n$ will not always be an average, unlike in the results of, for example, Andrews (1987) and Pötscher & Prucha (1989), and certainly not an average of stationary objects. What lemma 1 achieves is that it directly expresses the uniform convergence rate of $Q_n$ in terms of convergence and/or divergence rates that are more easily determined. The uniform convergence rate is not necessarily sharp, but it suffices for our purposes. Finally, the notation $L_n$ in lemma 1 intentionally coincides with that of equation (22); since any sequence $\{L_n\}$ in lemma 1 will do provided that it tends to infinity, it can always be taken identical to the one in assumption I.

A.2. Consistency

Here the conditions of theorem 3 apply. Let $\tilde g_{ni}(\theta) = g_{ni}(\theta) - E g_{ni}(\theta)$, $\tilde g_n(\theta) = \bar g_n(\theta) - g_n(\theta)$ and $c_n = n^{-1/(2+d)} A_n^{1/(2+d)} L_n$.

Lemma 2.
$$\max_{\theta\in\Theta}\|\tilde g_n(\theta)\| = o_p(c_n). \tag{A5}$$
Proof: We use lemma 1 with $Q_n = \tilde g_n$ and $p=2$. We first determine $m_{n2}$:
$$\max_{\theta\in\Theta} E\|\tilde g_n\|^2 = \max_{\theta\in\Theta} n^{-2}\sum_{i,j=1}^n E(\tilde g_{ni}'\,\tilde g_{nj}) = \max_{\theta\in\Theta} n^{-2}\sum_{i,j=1}^n\sum_{t=1}^{d_g} E(\tilde g_{nit}\tilde g_{njt})$$
$$\overset{(18)}{\le} n^{-2}\sum_{i,j=1}^n\sum_{t=1}^{d_g}\sqrt{\max_\theta E\tilde g_{nit}^2}\,\sqrt{\max_\theta E\tilde g_{njt}^2}\; a_{nij} \le n^{-2}\sum_{i,j=1}^n a_{nij}\, d_g\,\sqrt{\max_\theta E\|\tilde g_{ni}\|^2}\,\sqrt{\max_\theta E\|\tilde g_{nj}\|^2}$$
$$\overset{\text{Liapunov}}{\le} n^{-2}\sum_{i,j=1}^n a_{nij}\, d_g\,\bigl(\max_\theta E\|\tilde g_{ni}\|^4\bigr)^{1/4}\bigl(\max_\theta E\|\tilde g_{nj}\|^4\bigr)^{1/4} \overset{\text{E}}{\le} d_g\,\xi_n^{1/2}\, n^{-2}\sum_{i,j=1}^n a_{nij} \overset{(19)}{\le} d_g\,\xi_n^{1/2}\, n^{-1} A_n.$$
So $m_{n2} = (d_g\,\xi_n^{1/2}\, n^{-1} A_n)^{1/2}$. Now $D_n$. Note that
$$E\max_{\theta\in\Theta}\Bigl\| n^{-1}\sum_{i=1}^n\frac{\partial\tilde g_{ni}}{\partial\theta}(\theta)\Bigr\| \le n^{-1}\sum_{i=1}^n E\max_{\theta\in\Theta}\Bigl\|\frac{\partial\tilde g_{ni}}{\partial\theta}(\theta)\Bigr\| \le\frac{2}{n}\sum_{i=1}^n E\max_{\theta\in\Theta}\Bigl\|\frac{\partial g_{ni}}{\partial\theta}(\theta)\Bigr\| \overset{\text{Liapunov}}{\le}\frac{2}{n}\sum_{i=1}^n\Bigl(E\max_{\theta\in\Theta}\Bigl\|\frac{\partial g_{ni}}{\partial\theta}(\theta)\Bigr\|^2\Bigr)^{1/2} \overset{\text{E}}{\le} 2\xi_n^{1/2}.$$
So $D_n = \xi_n^{1/2}$. Hence the uniform convergence rate is
$$o_p\bigl(m_{n2}^{2/(2+d)} D_n^{d/(2+d)} L_n\bigr) = o_p\bigl((\xi_n^{1/2} n^{-1} A_n)^{1/(2+d)}\,\xi_n^{d/(2(2+d))}\, L_n\bigr) \overset{\text{E}}{=} o_p\bigl((A_n/n)^{1/(2+d)} L_n\bigr) = o_p(c_n),$$
as stated. ∎
Let
$$S_0(\theta) = n^{-1}\sum_{i,j=1}^n l_{nij}, \qquad S_1(\theta) = n^{-1}\sum_{i,j=1}^n l_{nij}\, g_{ni}(\theta), \qquad S_2(\theta) = n^{-1}\sum_{i,j=1}^n l_{nij}\, g_{ni}(\theta)\, g_{nj}'(\theta),$$
and $T_r(\theta) = E S_r(\theta)$ for $r = 0, 1, 2$.

Lemma 3. For $r = 0, 1, 2$, $\max_{\theta\in\Theta}\|S_r(\theta) - T_r(\theta)\| = o_p(c_n C_n)$.
Proof: We establish the proof for $r=2$; the other two cases are similar and easier. For any $t, s = 1, \dots, d_g$, let $b_{nij}(\theta) = l_{nij}\, g_{nit}(\theta) g_{njs}(\theta) - E\bigl(l_{nij}\, g_{nit}(\theta) g_{njs}(\theta)\bigr)$. We show convergence at the stated rate for each $t, s$, which implies that the established rate also applies to the norm. We use lemma 1 with $p=2$. We first determine the value of $m_{n2}$:
$$\max_\theta E\Bigl(n^{-1}\sum_{i,j=1}^n b_{nij}(\theta)\Bigr)^2 = \max_\theta n^{-2}\sum_{i,j,i^*,j^*=1}^n E\bigl(b_{nij}(\theta)\, b_{ni^*j^*}(\theta)\bigr) \overset{(18)}{\le}\max_\theta n^{-2}\sum_{i,j,i^*,j^*=1}^n\sqrt{E b_{nij}^2(\theta)\, E b_{ni^*j^*}^2(\theta)}\,\bigl(a_{nii^*}+a_{nij^*}+a_{nji^*}+a_{njj^*}\bigr). \tag{A6}$$
Now, by the law of iterated expectations,
$$\max_\theta E b_{nij}^2(\theta) \le\max_\theta E\bigl(E(g_{nit}^2(\theta) g_{njs}^2(\theta)\mid\mathcal{L}_n)\, l_{nij}^2\bigr) \overset{\text{Schwarz}}{\le}\max_\theta E\bigl((E(g_{nit}^4(\theta)\mid\mathcal{L}_n))^{1/2}(E(g_{njs}^4(\theta)\mid\mathcal{L}_n))^{1/2}\, l_{nij}^2\bigr) \overset{\text{E}}{\le}\xi_n\, E l_{nij}^2. \tag{A7}$$
Thus,
$$\max_\theta n^{-2}\sum_{i,j,i^*,j^*=1}^n\sqrt{E b_{nij}^2(\theta)\, E b_{ni^*j^*}^2(\theta)}\, a_{nii^*} \overset{\text{(A7)}}{\le} n^{-2}\xi_n\sum_{i,i^*=1}^n a_{nii^*}\sum_{j=1}^n\sqrt{E l_{nij}^2}\sum_{j^*=1}^n\sqrt{E l_{ni^*j^*}^2} \overset{(20)}{\le} n^{-2}\xi_n\, C_n^2\sum_{i,i^*=1}^n a_{nii^*} \overset{(19)}{\le} n^{-1}\xi_n\, C_n^2 A_n. \tag{A8}$$
Repeat the steps in equation (A8) with $a_{nij^*}$, $a_{nji^*}$, $a_{njj^*}$ in lieu of $a_{nii^*}$ to establish that equation (A6) is $O(n^{-1}\xi_n C_n^2 A_n)$. Since $\xi_n = O(1)$ by assumption E, $m_{n2}$ in lemma 1 is $O(n^{-1/2} C_n A_n^{1/2})$.

We now determine the value of $D_n$ in lemma 1. Note first that
$$E\max_\theta\Bigl\|\frac{\partial b_{nij}}{\partial\theta}(\theta)\Bigr\| = E\max_\theta\Bigl\| l_{nij}\frac{\partial g_{nit}}{\partial\theta}(\theta) g_{njs}(\theta) - E\Bigl(l_{nij}\frac{\partial g_{nit}}{\partial\theta}(\theta) g_{njs}(\theta)\Bigr) + l_{nij}\, g_{nit}(\theta)\frac{\partial g_{njs}}{\partial\theta}(\theta) - E\Bigl(l_{nij}\, g_{nit}(\theta)\frac{\partial g_{njs}}{\partial\theta}(\theta)\Bigr)\Bigr\|$$
$$\overset{\text{triangle}}{\le} 2\, E\max_\theta\Bigl\| l_{nij}\frac{\partial g_{nit}}{\partial\theta}(\theta) g_{njs}(\theta)\Bigr\| + 2\, E\max_\theta\Bigl\| l_{nij}\, g_{nit}(\theta)\frac{\partial g_{njs}}{\partial\theta}(\theta)\Bigr\|. \tag{A9}$$
We only deal with the first RHS term in (A9); the second RHS term follows similarly. We have
$$E\max_\theta\Bigl\| l_{nij}\frac{\partial g_{nit}}{\partial\theta}(\theta) g_{njs}(\theta)\Bigr\| \overset{\text{Schwarz}}{\le}\sqrt{E\max_\theta\Bigl\|\frac{\partial g_{nit}}{\partial\theta}(\theta)\Bigr\|^2}\,\sqrt{E\max_\theta l_{nij}^2\, g_{njs}^2(\theta)} \overset{(17)}{\le}\xi_n^{1/2}\Bigl(E\bigl(E(\max_\theta g_{njs}^2(\theta)\mid\mathcal{L}_n)\, l_{nij}^2\bigr)\Bigr)^{1/2}$$
$$\overset{\text{Liapunov}}{\le}\xi_n^{1/2}\Bigl(E\bigl((E(\max_\theta g_{njs}^4(\theta)\mid\mathcal{L}_n))^{1/2}\, l_{nij}^2\bigr)\Bigr)^{1/2} \overset{(16)}{\le}\xi_n^{3/4}\sqrt{E l_{nij}^2}. \tag{A10}$$
Use equation (A10) in (A9) to obtain
$$E\max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n\frac{\partial b_{nij}}{\partial\theta}(\theta)\Bigr\| \le n^{-1}\sum_{i,j=1}^n E\max_\theta\Bigl\|\frac{\partial b_{nij}}{\partial\theta}(\theta)\Bigr\| \le 4\xi_n^{3/4}\, n^{-1}\sum_{i,j=1}^n\sqrt{E l_{nij}^2} \overset{(20)}{\le} 4\xi_n^{3/4} C_n.$$
Since $\xi_n = O(1)$ by assumption E, we can choose $D_n = C_n$ in lemma 1, and hence
$$\max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n b_{nij}(\theta)\Bigr\| = o_p\bigl(m_{n2}^{2/(2+d)} D_n^{d/(2+d)} L_n\bigr) = o_p\bigl((n^{-1/2} C_n A_n^{1/2})^{2/(2+d)}\, C_n^{d/(2+d)}\, L_n\bigr) = o_p(c_n C_n),$$
as stated. ∎
Lemma 4. $\max_{\theta\in\Theta}\|\hat V_n(\theta) - V_n(\theta)\| = o_p(c_n C_n)$.

Proof: Note that (omitting the $\theta$-arguments and recalling that $\tilde g_n = \bar g_n - g_n$)
$$V_n = T_2 - g_n T_1' - T_1 g_n' + T_0\, g_n g_n', \qquad \hat V_n = S_2 - \bar g_n S_1' - S_1\bar g_n' + S_0\,\bar g_n\bar g_n',$$
such that
$$\hat V_n - V_n = (S_2-T_2) - \tilde g_n(S_1-T_1)' - g_n(S_1-T_1)' - \tilde g_n T_1' - (S_1-T_1)\tilde g_n' - (S_1-T_1)g_n' - T_1\tilde g_n' + (S_0-T_0)(\bar g_n\bar g_n' - g_n g_n') + T_0(\bar g_n\bar g_n' - g_n g_n') + (S_0-T_0)\, g_n g_n'. \tag{A11}$$
So if
$$c_n = O(1), \tag{A12}$$
$$\max_\theta\|\tilde g_n(\theta)\| = o_p(c_n), \tag{A13}$$
$$\max_\theta\|S_r(\theta) - T_r(\theta)\| = o_p(c_n C_n),\quad r=0,1,2, \tag{A14}$$
$$\max_\theta\|g_n(\theta)\| = O(1), \tag{A15}$$
$$\max_\theta\|\bar g_n(\theta)\bar g_n'(\theta) - g_n(\theta) g_n'(\theta)\| = o_p(c_n), \tag{A16}$$
$$\max_\theta\|T_r(\theta)\| = O(C_n),\quad r=0,1, \tag{A17}$$
then by equation (A11), $\max_\theta\|\hat V_n(\theta) - V_n(\theta)\| = o_p(c_n C_n)$.

First, equation (A12) follows from assumption I since $c_n^{2+d} = n^{-1} A_n L_n^{2+d} = o(C_n^{-2(2+d)}) \overset{\text{H}}{=} o(1)$. Further, equations (A13) and (A14) follow from lemmas 2 and 3, respectively, and equation (A15) holds since
$$\max_\theta\|g_n(\theta)\| = \max_\theta\Bigl\| E\, n^{-1}\sum_{i=1}^n g_{ni}(\theta)\Bigr\| \le n^{-1}\sum_{i=1}^n E\max_\theta\|g_{ni}(\theta)\| \overset{\text{Liapunov}}{\le}\max_{i\le n}\bigl(E\max_\theta\|g_{ni}(\theta)\|^4\bigr)^{1/4} \overset{\text{E}}{\le}\xi_n^{1/4} = O(1).$$
Now equation (A16). Using the expansion
$$\bar g_n\bar g_n' - g_n g_n' = (\bar g_n - g_n)(\bar g_n - g_n)' + g_n(\bar g_n - g_n)' + (\bar g_n - g_n) g_n' = \tilde g_n\tilde g_n' + g_n\tilde g_n' + \tilde g_n g_n'$$
and the triangle inequality,
$$\max_\theta\|\bar g_n(\theta)\bar g_n'(\theta) - g_n(\theta) g_n'(\theta)\| \le\max_\theta\|\tilde g_n(\theta)\|^2 + 2\max_\theta\|g_n(\theta)\|\max_\theta\|\tilde g_n(\theta)\| \overset{\text{(A13),(A15)}}{=} o_p(c_n).$$
Finally equation (A17). We show the case $r=1$; the case $r=0$ follows similarly:
$$\max_\theta\|T_1(\theta)\| \le n^{-1}\sum_{i,j=1}^n\max_\theta E\| l_{nij}\, g_{ni}(\theta)\| = n^{-1}\sum_{i,j=1}^n\max_\theta E\bigl(E(\|g_{ni}(\theta)\|\mid\mathcal{L}_n)\, l_{nij}\bigr)$$
$$\overset{\text{Liapunov}}{\le} n^{-1}\xi_n^{1/4}\sum_{i,j=1}^n E\, l_{nij} \le n^{-1}\xi_n^{1/4}\sum_{i,j=1}^n\sqrt{E l_{nij}^2} \overset{(20)}{\le}\xi_n^{1/4} C_n = O(C_n). \quad\text{∎}$$
Lemma 5. (i) $\min_{\theta\in\Theta}\lambda_{\min}(V_n(\theta)) \ge \rho_n^{-1}\zeta_n^{-1}$ and (ii) $\min_{\theta\in\Theta}\lambda_{\max}(V_n(\theta)) \le C_n\zeta_n$.

Proof: First (i). Let $G_n(\theta)$ be a matrix with $i$th row $n^{-1/2}(g_{ni}(\theta) - g_n(\theta))'$. Then for all $v\neq 0$ and all $\theta\in\Theta$,
$$\frac{v' V_n(\theta) v}{v'v} = E\,\frac{v' G_n'(\theta)\Lambda_n G_n(\theta) v}{v'v} = E\Bigl(\frac{v' G_n'(\theta)\Lambda_n G_n(\theta) v}{v' G_n'(\theta) G_n(\theta) v}\cdot\frac{v' G_n'(\theta) G_n(\theta) v}{v'v}\Bigr) \ge E\Bigl(\lambda_{\min}(\Lambda_n)\,\frac{v' G_n'(\theta) G_n(\theta) v}{v'v}\Bigr) \overset{\text{H}}{\ge}\rho_n^{-1}\, E\,\frac{v' G_n'(\theta) G_n(\theta) v}{v'v}$$
$$= \rho_n^{-1}\,\frac{v' E(G_n'(\theta) G_n(\theta)) v}{v'v} = \rho_n^{-1}\, n^{-1}\sum_{i=1}^n\frac{v' E\bigl((g_{ni}(\theta) - g_n(\theta))(g_{ni}(\theta) - g_n(\theta))'\bigr) v}{v'v}$$
$$= \rho_n^{-1}\, n^{-1}\sum_{i=1}^n\frac{v'\bigl(Vg_{ni}(\theta) + (E g_{ni}(\theta) - g_n(\theta))(E g_{ni}(\theta) - g_n(\theta))'\bigr) v}{v'v} \ge \rho_n^{-1}\,\frac{v'\bigl(n^{-1}\sum_{i=1}^n Vg_{ni}(\theta)\bigr) v}{v'v} \ge \rho_n^{-1}\,\lambda_{\min}\Bigl(n^{-1}\sum_{i=1}^n Vg_{ni}(\theta)\Bigr) \overset{\text{G}}{\ge}\rho_n^{-1}\zeta_n^{-1}.$$
Take the minimum over $\theta$ and note that the RHS does not depend on $\theta$. Now (ii). The proof is very similar to that of (i), albeit that we now use equation (20) to obtain an upper bound on $\lambda_{\max}(\Lambda_n)$. ∎

Lemma 6.
$$\max_{\theta\in\Theta}|\hat{\mathcal V}_n(\theta) - \mathcal V_n(\theta)| = o_p(1). \tag{A18}$$

Proof: Expand $\hat{\mathcal V}_n - \mathcal V_n = \bar g_n'\hat W_n\bar g_n - g_n' W_n g_n$ as
$$(\bar g_n - g_n + g_n)'(\hat W_n - W_n + W_n)(\bar g_n - g_n + g_n) - g_n' W_n g_n = (\bar g_n - g_n)'(\hat W_n - W_n)(\bar g_n - g_n) + 2(\bar g_n - g_n)'(\hat W_n - W_n) g_n + g_n'(\hat W_n - W_n) g_n + (\bar g_n - g_n)' W_n(\bar g_n - g_n) + 2(\bar g_n - g_n)' W_n g_n.$$
Since $C_n^{-1} = O(1)$ by assumption H it hence suffices to establish that
$$\max_\theta\|\bar g_n(\theta) - g_n(\theta)\| = o_p(C_n^{-1}), \tag{A19}$$
$$\max_\theta\|\hat W_n(\theta) - W_n(\theta)\| = o_p(1), \tag{A20}$$
$$\max_\theta\|g_n(\theta)\| = O(1), \tag{A21}$$
$$\max_\theta\|W_n(\theta)\| = O(C_n). \tag{A22}$$
Equation (A19) follows from equation (A13) and assumption I; equation (A21) follows from equation (A15). Now equation (A22). Note that since $W_n$ is positive definite,
$$\max_\theta\|W_n(\theta)\| = \max_\theta\lambda_{\max}(W_n(\theta)) = \frac{C_n}{\min_\theta\lambda_{\min}(V_n(\theta))} \overset{\text{L5}}{\le} C_n\,\rho_n\zeta_n = O(C_n), \tag{A23}$$
by assumptions G and H. Finally equation (A20). Note that
$$\max_\theta\|\hat W_n - W_n\| = \max_\theta\|(\hat W_n - W_n)(W_n^{-1} - \hat W_n^{-1}) W_n + W_n(W_n^{-1} - \hat W_n^{-1}) W_n\| \overset{\text{triangle}}{\le}\max_\theta\| C_n^{-1}(\hat W_n - W_n)(V_n - \hat V_n) W_n\| + \max_\theta\| C_n^{-1} W_n(V_n - \hat V_n) W_n\|. \tag{A24}$$
First, the second RHS term in equation (A24). By equation (A22), lemma 4 and assumption I,
$$\max_\theta\| C_n^{-1} W_n(\theta)(V_n(\theta) - \hat V_n(\theta)) W_n(\theta)\| = O_p(C_n^{-1}\cdot C_n\cdot c_n C_n\cdot C_n) = O_p(c_n C_n^2) = o_p(1).$$
The first RHS term in equation (A24) is of smaller order than the LHS since
$$\max_\theta\| C_n^{-1}(\hat W_n(\theta) - W_n(\theta))(V_n(\theta) - \hat V_n(\theta)) W_n(\theta)\| \le C_n^{-1}\max_\theta\|V_n(\theta) - \hat V_n(\theta)\|\,\max_\theta\|W_n(\theta)\|\,\max_\theta\|\hat W_n(\theta) - W_n(\theta)\| = O_p(c_n C_n)\max_\theta\|\hat W_n(\theta) - W_n(\theta)\| = o_p(1)\max_\theta\|\hat W_n(\theta) - W_n(\theta)\|,$$
again by equation (A22), lemma 4 and assumption I. ∎

Proof of theorem 3: We show that $g^*(\hat\theta) = o_p(1)$, which by assumption F (continuity and $g^*(\theta) = 0 \Leftrightarrow \theta = \theta_0$) implies that $\hat\theta\overset{P}{\to}\theta_0$. Now,
$$(g^*(\hat\theta))^2 \overset{\text{F}}{\le}\|g_n(\hat\theta)\|^2 \le\lambda_{\max}(V_n(\hat\theta))\, g_n'(\hat\theta) V_n^{-1}(\hat\theta) g_n(\hat\theta) = C_n^{-1}\mathcal V_n(\hat\theta)\,\lambda_{\max}(V_n(\hat\theta)) \overset{\text{L5}}{\le}\zeta_n\,\mathcal V_n(\hat\theta).$$
Since $\zeta_n = O(1)$ by assumption G, $\mathcal V_n(\hat\theta)\overset{P}{\to} 0$ implies $g^*(\hat\theta)\overset{P}{\to} 0$. So it suffices to show that $\mathcal V_n(\hat\theta)\overset{P}{\to} 0$. Finally, noting that $\mathcal V_n(\theta_0) = 0$ and $\mathcal V_n(\hat\theta)\ge 0$,
$$0\le\mathcal V_n(\hat\theta) = \bigl(\mathcal V_n(\hat\theta) - \hat{\mathcal V}_n(\hat\theta)\bigr) + \bigl(\hat{\mathcal V}_n(\hat\theta) - \hat{\mathcal V}_n(\theta_0)\bigr) + \bigl(\hat{\mathcal V}_n(\theta_0) - \mathcal V_n(\theta_0)\bigr) \le o_p(1) + 0 + o_p(1) = o_p(1). \quad\text{∎}$$
A.3. Asymptotic Normality

Here the assumptions of theorem 4 apply.

Lemma 7.
$$\sqrt n\,\bar g_n(\theta_0)\overset{D}{\to} N(0, V_0). \tag{A25}$$

Proof: By assumption K, the conditions of theorem 2 are satisfied and hence
$$(V_n^*(\theta_0))^{-1/2}\sqrt n\,\bar g_n(\theta_0)\overset{D}{\to} N(0, I).$$
Since $V_0 = \lim_{n\to\infty} V_n^*(\theta_0)$, the stated result holds. ∎

Lemma 8. For any consistent estimator $\hat\theta^*$ of $\theta_0$,
$$\frac{\partial\bar g_n}{\partial\theta'}(\hat\theta^*) - T_0 = o_p(1). \tag{A26}$$

Proof: Write the LHS in equation (A26) as
$$\Bigl(\frac{\partial\bar g_n}{\partial\theta'}(\hat\theta^*) - \frac{\partial\bar g_n}{\partial\theta'}(\theta_0)\Bigr) + \Bigl(\frac{\partial\bar g_n}{\partial\theta'}(\theta_0) - \frac{\partial g_n}{\partial\theta'}(\theta_0)\Bigr) + \Bigl(\frac{\partial g_n}{\partial\theta'}(\theta_0) - T_0\Bigr). \tag{A27}$$
The third term in (A27) is $o(1)$ by equation (24) and the first term is $o_p(1)$ by assumption M. Finally, the second term is also $o_p(1)$ since for any $t=1,\dots,d$,
$$E\Bigl\|\frac{\partial\bar g_n}{\partial\theta_t}(\theta_0) - \frac{\partial g_n}{\partial\theta_t}(\theta_0)\Bigr\|^2 \le n^{-2}\sum_{i,j=1}^n\sum_{t=1}^{d_g} E\Bigl(\frac{\partial\tilde g_{nit}}{\partial\theta_t}(\theta_0)\frac{\partial\tilde g_{njt}}{\partial\theta_t}(\theta_0)\Bigr) \overset{(25)}{\le} n^{-2}\sum_{i,j=1}^n\sum_{t=1}^{d_g}\sqrt{E\Bigl\|\frac{\partial\tilde g_{ni}}{\partial\theta_t}(\theta_0)\Bigr\|^2}\,\sqrt{E\Bigl\|\frac{\partial\tilde g_{nj}}{\partial\theta_t}(\theta_0)\Bigr\|^2}\; a_{nij} \overset{\text{E}}{\le} d_g\,\xi_n\, n^{-2}\sum_{i,j=1}^n a_{nij} \overset{(19)}{\le} d_g\,\xi_n\, n^{-1} A_n = o(1). \quad\text{∎}$$

In lemma 9 we establish a bound for the convergence rate of $\hat\theta$, which can then be used in subsequent results. The true convergence rate is later established to be root $n$, as the theorem suggests.
Lemma 9. $\|\hat\theta - \theta_0\| = O_p(n^{-1/2} C_n^{1/2})$.

Proof: By the mean value theorem, for some $\hat\theta^*$ between $\theta_0$ and $\hat\theta$,
$$\bar g_n(\hat\theta) = \bar g_n(\hat\theta) - \bar g_n(\theta_0) + \bar g_n(\theta_0) \overset{\text{L7}}{=}\bar g_n(\hat\theta) - \bar g_n(\theta_0) + O_p(n^{-1/2}) = \frac{\partial\bar g_n}{\partial\theta'}(\hat\theta^*)(\hat\theta - \theta_0) + O_p(n^{-1/2})$$
$$= \Bigl(\frac{\partial\bar g_n}{\partial\theta'}(\hat\theta^*) - T_0\Bigr)(\hat\theta - \theta_0) + T_0(\hat\theta - \theta_0) + O_p(n^{-1/2}) \overset{\text{L8}}{=}\bigl(o_p(1) + T_0\bigr)(\hat\theta - \theta_0) + O_p(n^{-1/2}).$$
Thus,
$$\hat\theta - \theta_0 = \bigl((T_0' + o_p(1))(T_0 + o_p(1))\bigr)^{-1}(T_0' + o_p(1))\,\bar g_n(\hat\theta) + O_p(n^{-1/2}) \overset{\text{J}}{=}\bigl((T_0' T_0)^{-1} T_0' + o_p(1)\bigr)\bar g_n(\hat\theta) + O_p(n^{-1/2}) \overset{\text{J}}{=} O_p(1)\,\bar g_n(\hat\theta) + O_p(n^{-1/2}).$$
So we need to show that $\|\bar g_n(\hat\theta)\|^2 = O_p(n^{-1} C_n)$. Thus,
$$\|\bar g_n(\hat\theta)\|^2 \le\frac{\bar g_n'(\hat\theta)\hat W_n(\hat\theta)\bar g_n(\hat\theta)}{\lambda_{\min}(\hat W_n(\hat\theta))} = \frac{\hat{\mathcal V}_n(\hat\theta)}{\lambda_{\min}(\hat W_n(\hat\theta))}. \tag{A28}$$
We now work on each of the components of equation (A28). Since $\hat\theta$ is a global minimizer of $\hat{\mathcal V}_n$ we have
$$0\le\hat{\mathcal V}_n(\hat\theta)\le\hat{\mathcal V}_n(\theta_0)\le\|\bar g_n(\theta_0)\|^2\,\lambda_{\max}(\hat W_n(\theta_0)) \overset{\text{L7}}{=} O_p(n^{-1})\,\lambda_{\max}(\hat W_n(\theta_0)). \tag{A29}$$
Further,
$$\lambda_{\max}(\hat W_n(\theta_0)) = \|\hat W_n(\theta_0)\|\overset{\text{triangle}}{\le}\|\hat W_n(\theta_0) - W_n(\theta_0)\| + \|W_n(\theta_0)\| \overset{\text{(A20),(A22)}}{=} O_p(C_n). \tag{A30}$$
Finally,
$$\frac{1}{\lambda_{\min}(\hat W_n(\hat\theta))} = \lambda_{\max}(\hat W_n^{-1}(\hat\theta)) \overset{\text{H}}{=} C_n^{-1}\lambda_{\max}(\hat V_n(\hat\theta)) \overset{\text{L5}}{\le}\zeta_n\,(1 + o_p(1)) = O_p(1). \tag{A31}$$
Use equations (A29)-(A31) in (A28). ∎

Lemma 10. $V_n(\theta_0) - V_n^*(\theta_0) = o_p(1)$.

Proof: Since $g_n(\theta_0) = 0$,
$$V_n(\theta_0) = n^{-1}\sum_{i,j=1}^n E\bigl(l_{nij}\, g_{ni}(\theta_0) g_{nj}'(\theta_0)\bigr).$$
Hence
$$V_n^*(\theta_0) - V_n(\theta_0) = n^{-1}\sum_{i,j=1}^n E\bigl((1 - l_{nij})\, g_{ni}(\theta_0) g_{nj}'(\theta_0)\bigr).$$
Use assumption L. ∎
Lemma 11. $V_n(\hat\theta) - V_n(\theta_0) = o_p(1)$.

Proof: By the mean value theorem, for some $\hat\theta^*$ between $\hat\theta$ and $\theta_0$,
$$V_n(\hat\theta) - V_n(\theta_0) = \sum_{s=1}^d\frac{\partial V_n}{\partial\theta_s}(\hat\theta^*)(\hat\theta_s - \theta_{0s}).$$
Since $\hat\theta - \theta_0 = O_p(n^{-1/2} C_n^{1/2})$ by lemma 9, it suffices to show that for every $s=1,\dots,d$,
$$\max_{\theta\in\Theta}\Bigl\|\frac{\partial V_n}{\partial\theta_s}(\theta)\Bigr\| = o\bigl(n^{1/2} C_n^{-1/2}\bigr). \tag{A32}$$
Take any one such $s$. Note that
$$\max_\theta\Bigl\|\frac{\partial V_n}{\partial\theta_s}(\theta)\Bigr\| = \max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\,\frac{\partial}{\partial\theta_s}\bigl((g_{ni}(\theta) - g_n(\theta))(g_{nj}(\theta) - g_n(\theta))'\bigr)\Bigr)\Bigr\|$$
$$\overset{\text{triangle}}{\le}\max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\, g_{nj}'(\theta)\Bigr)\Bigr\| + \max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\, g_{ni}(\theta)\frac{\partial g_{nj}'}{\partial\theta_s}(\theta)\Bigr)\Bigr\| \tag{A33}$$
$$+ \max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\Bigr)\, g_n'(\theta)\Bigr\| + \max_\theta\Bigl\| g_n(\theta)\, n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\frac{\partial g_{nj}'}{\partial\theta_s}(\theta)\Bigr)\Bigr\| \tag{A34}$$
$$+ \max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\bigl(l_{nij}\, g_{ni}(\theta)\bigr)\,\frac{\partial g_n'}{\partial\theta_s}(\theta)\Bigr\| + \max_\theta\Bigl\|\frac{\partial g_n}{\partial\theta_s}(\theta)\, n^{-1}\sum_{i,j=1}^n E\bigl(l_{nij}\, g_{nj}'(\theta)\bigr)\Bigr\| \tag{A35}$$
$$+ \max_\theta\Bigl\|\frac{\partial g_n}{\partial\theta_s}(\theta)\, g_n'(\theta) + g_n(\theta)\frac{\partial g_n'}{\partial\theta_s}(\theta)\Bigr\|\, n^{-1}\sum_{i,j=1}^n E\, l_{nij}. \tag{A36}$$
Consider the first term in (A33). We have
$$\max_\theta\Bigl\| n^{-1}\sum_{i,j=1}^n E\Bigl(l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\, g_{nj}'(\theta)\Bigr)\Bigr\| \overset{\text{Schwarz}}{\le}\max_\theta n^{-1}\sum_{i,j=1}^n\sqrt{E\Bigl\|\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\Bigr\|^2}\,\sqrt{E\bigl(l_{nij}^2\, E(\| g_{nj}(\theta)\|^2\mid\mathcal{L}_n)\bigr)}$$
$$\overset{\text{Liapunov}}{\le}\max_\theta n^{-1}\sum_{i,j=1}^n\sqrt{E\Bigl\|\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\Bigr\|^2}\,\sqrt{E\bigl(l_{nij}^2\sqrt{E(\| g_{nj}(\theta)\|^4\mid\mathcal{L}_n)}\bigr)} \overset{\text{E}}{\le}\xi_n^{3/4}\, n^{-1}\sum_{i,j=1}^n\sqrt{E l_{nij}^2} \overset{(20)}{\le}\xi_n^{3/4} C_n = O(C_n) \overset{\text{I}}{=} o\bigl(n^{1/2} C_n^{-1/2}\bigr), \tag{A37}$$
as required by equation (A32). By symmetry the second term in (A33) is of the same order. The remaining terms, i.e. those in (A34)-(A36), can be dealt with by a tedious repetition of steps similar to those in equation (A37). ∎
Lemma 12.
$$C_n^{-1}\hat W_n(\hat\theta)\overset{P}{\to} V_0^{-1}, \qquad C_n^{-1}\hat W_n(\theta_0)\overset{P}{\to} V_0^{-1}. \tag{A38}$$

Proof: Note that
$$C_n^{-1}\hat W_n(\hat\theta) - V_0^{-1} = C_n^{-1}\bigl(\hat W_n(\hat\theta) - W_n(\hat\theta)\bigr) + C_n^{-1}\bigl(W_n(\hat\theta) - W_n(\theta_0)\bigr) + \bigl(C_n^{-1} W_n(\theta_0) - V_0^{-1}\bigr), \tag{A39}$$
$$C_n^{-1}\hat W_n(\theta_0) - V_0^{-1} = C_n^{-1}\bigl(\hat W_n(\theta_0) - W_n(\theta_0)\bigr) + \bigl(C_n^{-1} W_n(\theta_0) - V_0^{-1}\bigr). \tag{A40}$$
We now show that all RHS terms in equations (A39) and (A40) are $o_p(1)$ or $o(1)$. The first RHS terms in both equations are $o_p(C_n^{-1}) = o_p(1)$ by (A20) and assumption J. Now, each of the last RHS terms is
$$V_n^{-1}(\theta_0) - V_0^{-1} = o(1), \tag{A41}$$
since inversion is a continuous operation, $V_0 > 0$ by assumption J, and because
$$V_n(\theta_0) - V_0 = \bigl(V_n(\theta_0) - V_n^*(\theta_0)\bigr) + \bigl(V_n^*(\theta_0) - V_0\bigr) \overset{\text{L10, (24)}}{=} o(1).$$
Finally, the second RHS term in equation (A39). Note that
$$\| C_n^{-1}\bigl(W_n(\hat\theta) - W_n(\theta_0)\bigr)\| = \| V_n^{-1}(\hat\theta) - V_n^{-1}(\theta_0)\| = \| V_n^{-1}(\hat\theta)\bigl(V_n(\theta_0) - V_n(\hat\theta)\bigr) V_n^{-1}(\theta_0)\|$$
$$\overset{\text{triangle}}{\le}\|\bigl(V_n^{-1}(\hat\theta) - V_n^{-1}(\theta_0)\bigr)\bigl(V_n(\theta_0) - V_n(\hat\theta)\bigr) V_n^{-1}(\theta_0)\| + \| V_n^{-1}(\theta_0)\bigl(V_n(\theta_0) - V_n(\hat\theta)\bigr) V_n^{-1}(\theta_0)\|$$
$$\le\| V_n^{-1}(\hat\theta) - V_n^{-1}(\theta_0)\|\cdot\| V_n(\theta_0) - V_n(\hat\theta)\|\cdot\| V_n^{-1}(\theta_0)\| + \| V_n^{-1}(\theta_0)\|^2\cdot\| V_n(\theta_0) - V_n(\hat\theta)\|$$
$$\overset{\text{L11, (A41)}}{=}\| C_n^{-1}\bigl(W_n(\hat\theta) - W_n(\theta_0)\bigr)\|\, o_p(1)\, O(1) + O(1)\, o_p(1) = \| C_n^{-1}\bigl(W_n(\hat\theta) - W_n(\theta_0)\bigr)\|\, o_p(1) + o_p(1).$$
Hence $\| C_n^{-1}(W_n(\hat\theta) - W_n(\theta_0))\| = o_p(1)$. ∎

Lemma 13.
$$\bar g_n(\hat\theta) = O_p(n^{-1/2}). \tag{A42}$$

Proof: Note that
$$\lambda_{\min}(V_0^{-1})\,\|\bar g_n(\hat\theta)\|^2 \le\bar g_n'(\hat\theta) V_0^{-1}\bar g_n(\hat\theta) = \bar g_n'(\hat\theta)\bigl(V_0^{-1} - C_n^{-1}\hat W_n(\hat\theta)\bigr)\bar g_n(\hat\theta) + C_n^{-1}\hat{\mathcal V}_n(\hat\theta) \overset{\text{L12}}{=} o_p(1)\,\|\bar g_n(\hat\theta)\|^2 + C_n^{-1}\hat{\mathcal V}_n(\hat\theta). \tag{A43}$$
Since $V_0$ is positive definite and finite by assumption J, equation (A43) implies that the convergence rate of $\|\bar g_n(\hat\theta)\|^2$ is no slower than that of $C_n^{-1}\hat{\mathcal V}_n(\hat\theta)$. Thus,
$$C_n^{-1}\hat{\mathcal V}_n(\hat\theta)\le C_n^{-1}\hat{\mathcal V}_n(\theta_0) = \bar g_n'(\theta_0)\bigl(C_n^{-1}\hat W_n(\theta_0) - V_0^{-1}\bigr)\bar g_n(\theta_0) + \bar g_n'(\theta_0) V_0^{-1}\bar g_n(\theta_0) \overset{\text{L12, J}}{\le}\bigl(o_p(1) + \lambda_{\max}(V_0^{-1})\bigr)\|\bar g_n(\theta_0)\|^2 \overset{\text{L7}}{=} O_p(n^{-1}).$$
Hence $\|\bar g_n(\hat\theta)\| = O_p(n^{-1/2})$, as claimed. ∎

Lemma 14.
$$\frac{\partial\hat W_n}{\partial\theta_s}(\hat\theta) = O_p(C_n^2), \qquad s = 1,\dots,d. \tag{A44}$$

Proof: Choose any $s = 1,\dots,d$. Then
$$\frac{\partial\hat W_n}{\partial\theta_s}(\hat\theta) = -\hat W_n(\hat\theta)\,\frac{\partial\hat W_n^{-1}}{\partial\theta_s}(\hat\theta)\,\hat W_n(\hat\theta) = -C_n^{-1}\hat W_n(\hat\theta)\,\frac{\partial\hat V_n}{\partial\theta_s}(\hat\theta)\,\hat W_n(\hat\theta) \overset{\text{L12}}{=} O_p(C_n)\,\frac{\partial\hat V_n}{\partial\theta_s}(\hat\theta).$$
So it suffices to show that
$$\frac{\partial\hat V_n}{\partial\theta_s}(\hat\theta) = O_p(C_n).$$
Now, applying the product rule to
$$\frac{\partial\hat V_n}{\partial\theta_s}(\hat\theta) = \frac{\partial}{\partial\theta_s}\Bigl(n^{-1}\sum_{i,j=1}^n l_{nij}\bigl(g_{ni}(\hat\theta) - \bar g_n(\hat\theta)\bigr)\bigl(g_{nj}(\hat\theta) - \bar g_n(\hat\theta)\bigr)'\Bigr)$$
and using lemmas 8 and 13, which give $\partial\bar g_n/\partial\theta_s(\hat\theta) = O_p(1)$ and $\bar g_n(\hat\theta) = O_p(n^{-1/2})$, the derivative equals
$$n^{-1}\sum_{i,j=1}^n l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\hat\theta)\, g_{nj}'(\hat\theta) + n^{-1}\sum_{i,j=1}^n l_{nij}\, g_{ni}(\hat\theta)\frac{\partial g_{nj}'}{\partial\theta_s}(\hat\theta)$$
plus terms that are products of $O_p(1)$ or $O_p(n^{-1/2})$ factors with sums of the form $n^{-1}\sum_{i,j} l_{nij}\, g_{nj}'(\hat\theta)$, $n^{-1}\sum_{i,j} l_{nij}\,\partial g_{nj}'/\partial\theta_s(\hat\theta)$ and $n^{-1}\sum_{i,j} l_{nij}$. Call this expansion (A45). Now,
$$E\Bigl\| n^{-1}\sum_{i,j=1}^n l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\hat\theta)\, g_{nj}'(\hat\theta)\Bigr\| \le n^{-1}\sum_{i,j=1}^n E\max_\theta\Bigl\| l_{nij}\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\, g_{nj}'(\theta)\Bigr\|$$
$$\overset{\text{Schwarz}}{\le} n^{-1}\sum_{i,j=1}^n\sqrt{E\max_\theta\Bigl\|\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\Bigr\|^2}\,\sqrt{E\bigl(l_{nij}^2\, E(\max_\theta\| g_{nj}(\theta)\|^2\mid\mathcal{L}_n)\bigr)}$$
$$\overset{\text{Liapunov}}{\le} n^{-1}\sum_{i,j=1}^n\sqrt{E\max_\theta\Bigl\|\frac{\partial g_{ni}}{\partial\theta_s}(\theta)\Bigr\|^2}\,\sqrt{E\bigl(l_{nij}^2\sqrt{E(\max_\theta\| g_{nj}(\theta)\|^4\mid\mathcal{L}_n)}\bigr)} \overset{\text{E}}{\le}\xi_n^{3/4}\, n^{-1}\sum_{i,j=1}^n\sqrt{E l_{nij}^2} \overset{(20)}{\le}\xi_n^{3/4} C_n = O(C_n). \tag{A46}$$
Repeat the steps of equation (A46) for the remaining RHS summations in equation (A45). ∎

Proof of theorem 4: For $s = 1,\dots,d$,
$$0 = \frac{\sqrt n}{2 C_n}\frac{\partial\hat{\mathcal V}_n}{\partial\theta_s}(\hat\theta) = \sqrt n\,\frac{\partial\bar g_n'}{\partial\theta_s}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\bar g_n(\hat\theta) + \frac{\sqrt n}{2 C_n}\,\bar g_n'(\hat\theta)\,\frac{\partial\hat W_n}{\partial\theta_s}(\hat\theta)\,\bar g_n(\hat\theta)$$
$$\overset{\text{L13, L14}}{=}\sqrt n\,\frac{\partial\bar g_n'}{\partial\theta_s}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\bar g_n(\hat\theta) + O_p\bigl(n^{1/2} C_n^{-1}\, n^{-1/2}\, C_n^2\, n^{-1/2}\bigr) \overset{\text{I}}{=}\sqrt n\,\frac{\partial\bar g_n'}{\partial\theta_s}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\bar g_n(\hat\theta) + o_p(1).$$
Hence by the mean value theorem, for some $\hat\theta^*$ between $\hat\theta$ and $\theta_0$,
$$o_p(1) = \sqrt n\,\frac{\partial\bar g_n'}{\partial\theta}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\bar g_n(\hat\theta) \overset{\text{L7}}{=}\sqrt n\,\frac{\partial\bar g_n'}{\partial\theta}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\bar g_n(\theta_0) + \sqrt n\,\frac{\partial\bar g_n'}{\partial\theta}(\hat\theta)\, C_n^{-1}\hat W_n(\hat\theta)\,\frac{\partial\bar g_n}{\partial\theta'}(\hat\theta^*)(\hat\theta - \theta_0)$$
$$\overset{\text{L8, L12, L13}}{=}\sqrt n\, T_0' V_0^{-1}\bar g_n(\theta_0) + \sqrt n\, T_0' V_0^{-1} T_0\,(\hat\theta - \theta_0) + o_p(1),$$
which, with assumption J, implies that
$$\sqrt n\,(\hat\theta - \theta_0) = -(T_0' V_0^{-1} T_0)^{-1} T_0' V_0^{-1}\sqrt n\,\bar g_n(\theta_0) + o_p(1)\overset{D}{\to} N\bigl(0, (T_0' V_0^{-1} T_0)^{-1}\bigr)$$
by lemma 7. ∎
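Since the appendix analyses the continuously updated objective only in the abstract, a minimal numerical sketch of such an objective may be useful. This is not the authors' code; the moment function `g` and the spatial weight matrix `lam` are placeholders that the user must supply, and the sketch only illustrates the mechanics of re-estimating the moment variance at every trial parameter value.

```python
import numpy as np

def cue_objective(theta, g, lam):
    """Continuously updated GMM criterion: the variance estimate of the moments
    is recomputed at every trial value of theta, using the n x n weight matrix lam."""
    gi = g(theta)                      # n x d_g matrix of moment contributions
    gbar = gi.mean(axis=0)             # sample moment vector
    gc = gi - gbar                     # centred contributions
    n = gi.shape[0]
    vhat = gc.T @ lam @ gc / n         # weighted variance estimate of the moments
    return float(n * gbar @ np.linalg.solve(vhat, gbar))

# Usage (schematic): minimise over theta with any numerical optimiser, e.g.
# scipy.optimize.minimize(cue_objective, theta0, args=(g, lam)).
```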
Appendix B: Data Appendix

All variables are yearly and span the period 1980-1993. All monetary variables are in real Canadian dollars, 1993 = 1.00. Mining-industry data, however, are usually reported in US dollars, and people familiar with the industry are accustomed to such numbers. It is therefore helpful to compare the two units. In 1993, a price of US$1.00 per pound was equivalent to CAN$1.35, and costs of US$0.75 were roughly equal to CAN$1.00.

CPR: the London Metal Exchange (LME) is the most important exchange for copper trading. Although a copper contract also trades on the Commodity Exchange of New York (COMEX), that market is considerably thinner. For this reason, we use the LME copper price. Yearly prices are averages of the daily grade-A, cash-settlement price published in Nonferrous Metal Data. The Canadian consumer price index, which was obtained from DataStream, and the US/Canadian exchange rate, which was obtained from Citibase, were used to convert nominal US to real Canadian cents per pound.

SIGRR: our measure of volatility is constructed from CPR using equation (29).

These two variables are common to all mines. Other variables vary by mine as well as over time. All mine data are reported at a yearly frequency. The variable COST, however, is available only for years in which the mine operated.

RES: reserve data, in millions of tonnes of ore, are from the Canadian Mines Handbook. All other mine data were collected from the Canadian Minerals Yearbook.

CAP: mine capacity is measured in millions of tonnes of ore per year.

QCU: metal refined is measured in thousands of tonnes of copper per year.

COST: average total costs, which include the costs of mining, milling, smelting, refining, shipping, and marketing, are published by Brook Hunt, a consulting firm that specializes in the mining industry. According to industry sources, these costs are the most reliable available and are used extensively by firms in the industry. Fixed and marginal costs, FCÔST_i and MCÔST_i, for each mine are generated from COST_it and QCU_it using regression equation (28) applied to active observations.

LAT and LONG: data on location were taken from Infomine (http://www.infomine.com) and Canadian Geographic Names, Department of Natural Resources, Canada (http://geonames.nrcan.gc.ca). These data are used to calculate
the Euclidean distances between mines that we use to perform the spatial corrections. The data to produce the outlines of the provinces in Figures 2 and 3 were taken from the Geo Community website (http://www.geocomm.com) and are in the public domain.

Additional supply and demand variables were used as instruments. These are:

WAGE: a provincial mining wage rate was constructed by dividing the total wage and salary bill for copper/nickel/zinc mines in each province, in thousands of dollars per year, by the number of employees of such mines in that province. The raw wage variables are found in Statistics Canada Catalogue 26-223, table 2.

ENERGY PRICE: a provincial mining energy-price index was constructed as a share-weighted average of the prices of nine classes of fuels that were purchased by copper/nickel/zinc mines in the province. The raw data consist of two variables for each fuel: the value and quantity of provincial mining-industry purchases. Individual provincial energy prices were obtained by dividing the value by the quantity. These data were then aggregated to form the index. The raw data are found in Statistics Canada Catalogue 26-223, table 6.

INPROD: Canadian industrial-production data, in millions of dollars per year, were obtained from Cansim, Statistics Canada's computerized database.

Appendix C: The Normalized Success Index

Suppose that $y_i$, $i = 1, \dots, n$, is the observed choice and $\hat p_i$ is the predicted probability. Let
$$n_0 = \sum_i (1 - y_i) \tag{C1}$$
and
$$n_1 = \sum_i y_i \tag{C2}$$
be the observed counts. Define
$$n_{00} = \sum_i (1 - y_i)(1 - \hat p_i), \tag{C3}$$
$$n_{01} = \sum_i (1 - y_i)\,\hat p_i, \tag{C4}$$
$$n_{10} = \sum_i y_i (1 - \hat p_i), \tag{C5}$$
and
$$n_{11} = \sum_i y_i\,\hat p_i. \tag{C6}$$
Then
$$\hat n_0 = n_{00} + n_{10} \tag{C7}$$
and
$$\hat n_1 = n_{01} + n_{11} \tag{C8}$$
are the predicted counts. The NSI is then
$$S = S_0 + S_1 = \Bigl(\frac{n_{00}}{n_0} - \frac{\hat n_0}{n}\Bigr) + \Bigl(\frac{n_{11}}{n_1} - \frac{\hat n_1}{n}\Bigr). \tag{C9}$$
This is a better measure of goodness of fit than the proportion of successful predictions, which is often used, for the following reason. Suppose that 90% of the observations are 0 and 10% are 1. A model that simply predicts 0 with probability 0.9 and 1 with probability 0.1 will have a large proportion of successful predictions (approximately 81%). However, it has no predictive power, and its NSI is approximately 0.
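A compact implementation of the index, following the formulas as reconstructed above (the cell labels and the exact form of (C9) are taken from this appendix and should be treated as a reconstruction rather than a definitive statement of Hensher & Johnson's notation):

```python
import numpy as np

def normalized_success_index(y, p_hat):
    """NSI of Appendix C: S = S0 + S1 with S_j = n_jj / n_j - nhat_j / n."""
    y, p_hat = np.asarray(y, float), np.asarray(p_hat, float)
    n = y.size
    n0, n1 = np.sum(1 - y), np.sum(y)            # observed counts, (C1)-(C2)
    n00 = np.sum((1 - y) * (1 - p_hat))          # actual 0, predicted 0, (C3)
    n01 = np.sum((1 - y) * p_hat)                # actual 0, predicted 1, (C4)
    n10 = np.sum(y * (1 - p_hat))                # actual 1, predicted 0, (C5)
    n11 = np.sum(y * p_hat)                      # actual 1, predicted 1, (C6)
    nhat0, nhat1 = n00 + n10, n01 + n11          # predicted counts, (C7)-(C8)
    return (n00 / n0 - nhat0 / n) + (n11 / n1 - nhat1 / n)   # (C9)

# The example from the text: 10% ones and a constant predicted probability of 0.1
# gives roughly 82% "successful predictions" but an NSI of (approximately) zero.
y = np.r_[np.ones(10), np.zeros(90)]
print(normalized_success_index(y, np.full(100, 0.1)))        # 0.0
```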
Spatial Economic Analysis, Vol. 1, No. 1, June 2006
Agglomeration and Trade with Input-Output Linkages and Capital Mobility
FRÉDÉRIC ROBERT-NICOUD
(Received February 2006; revised March 2006)
ABSTRACT  This paper proposes a nesting 'New Trade, New Economic Geography' model in which agglomeration is driven by input-output linkages among firms, trade in goods and capital mobility. The New Economic Geography sub-model exhibits the same positive and dynamic properties as a wide class of models based on other agglomeration mechanisms. Its normative implications are nuanced: equity and efficiency do not necessarily conflict. When input-output linkages are strong, agglomeration might Pareto-dominate dispersion because agglomeration lowers producer prices. When vertical linkages are weak, the market is biased in favour of agglomeration if the planner has a strong aversion to inequalities.
Accumulation et commerce avec intégration amont-aval et mobilité du capital
RÉSUMÉ  Cet article décrit un modèle, qui a donné naissance au modèle commercial de Flam et Helpman (1987), et de Martin et Rogers (1995) et à un modèle original à la Krugman « Nouvelle Géographie Economique » (1991). L'accumulation se produit par l'intégration amont-aval des sociétés entre elles et par la mobilité du capital. L'auteur étudie les conséquences positives puis normatives du modèle. Dans le domaine des conséquences positives, le modèle NGE montre les mêmes propriétés dynamiques que les autres modèles fondés sur d'autres mécanismes d'accumulation (migration du travail, accumulation de capital humain). Donc, ce modèle est bien adapté pour étudier les questions de localisation des industries, du commerce des biens et de la mobilité du capital. En ce qui concerne les conséquences normatives, lorsque l'intégration amont-aval est forte, l'accumulation peut l'emporter sur la dispersion de Pareto, parce que l'accumulation conduit à une diminution des prix du producteur : l'efficacité et la valeur n'entrent pas forcément en conflit dans ce modèle. Quand l'intégration verticale est faible, le marché est orienté en faveur de l'accumulation si le décideur montre une grande aversion aux inégalités.
Frédéric Robert-Nicoud, Department of Geography and Environment, London School of Economics, Houghton Street, London WC2A 2AE, UK. Email:
[email protected]. I am grateful to Gilles Duranton, Rod Falvey, Marco Fugazza, Steve Redding and Tony Venables for feedback. The original contributions of the present paper appear as 'Propositions' and 'Corollaries'. The NEG model in Sections 2 and 3 is sketched in Chapter 8 of Baldwin et al. (2003). The welfare analysis in Section 5 uses tools developed by Charlot et al. (2006). I am grateful to my co-authors for fruitful interactions and I am happy to acknowledge their input. Interacting with them clearly benefited this paper, since all errors, omissions and loose ends are mine. ISSN 1742-1772 print; 1742-1780 online/06/010101-26 © 2006 Regional Studies Association
DOI: 10.1080/17421770600662459
Aglomeración y comercio con enlaces de entrada-salida y movilidad de capital
RESUMEN  En este artículo expongo un modelo que atrapa el modelo comercial de Flam y Helpman (1987), de Martin y Rogers (1995) y de un modelo original según la teoría de la 'Nueva Geografía Económica' de Krugman (1991). La aglomeración está impulsada por enlaces de entrada-salida entre las sociedades y por la movilidad de capital. Aquí analizo las implicaciones positivas y normativas del modelo. En términos de implicaciones positivas, el modelo NEG expone las mismas propiedades dinámicas como una amplia clase de modelos basados en otros mecanismos de aglomeración (migración laboral, acumulación de capital humano). De este modo, el modelo encaja bien para estudiar cuestiones en cuanto a la ubicación de la industria, el comercio de mercancías y la movilidad de capital. Con respecto a las implicaciones normativas, cuando son sólidos los enlaces de entrada-salida, la aglomeración podría dominar la dispersión en el diagrama de Pareto debido a que la aglomeración hace disminuir los precios de los productores: en este modelo la eficiencia y la equidad no necesariamente están en conflicto. Cuando los enlaces verticales son débiles, el mercado es sesgado a favor de la aglomeración si el planificador tiene una fuerte aversión a las desigualdades.

KEYWORDS: New Economic Geography; capital mobility; international trade; welfare

JEL CLASSIFICATION: F02, F12, F20, R12
1. Introduction

This paper works out, in an integrated way, a 'New Trade, New Economic Geography' (NT, NEG hereafter) model that nests, on the one hand, the NT model of Lawrence & Spiller (1983) and Flam & Helpman (1987), as extended to a geography framework by Martin & Rogers (1995), and, on the other, an NEG model that shares similarities with Venables' (1996) model. The former will be referred to as the FHMR model, after the initials of its authors, and the latter will be designated the FCVL model, for 'Footloose Capital, Vertical Linkages'.1 Specifically, the all-encompassing model is an NT model in which countries are engaged in intra-industry trade à la Dixit, Stiglitz and Krugman (DSK hereafter) as well as in inter-industry trade and in which capital is internationally mobile. In addition, firms buy each other's output as intermediate inputs, giving rise to so-called forward and backward linkages; as a result of these linkages, market sizes are endogenous, certainly the distinctive feature of NEG models. On the positive side, the paper describes the equilibrium properties of the model and shows how to generate the FCVL and the FHMR models as special cases. This is convenient because an attractive feature of the FHMR model is its full analytical tractability, whilst the FCVL model is useful because its dynamic properties are as rich as in typical NEG models, but it is (marginally) simpler than most of them. On the normative side, the main contribution of the paper is to study the welfare analysis of the FCVL model; as we will see, these properties are distinguishable from those of Krugman's (1991) model, as studied by Charlot et al. (2006), as well as from those of Krugman & Venables' (1995) model, as studied by Ottaviano & Robert-Nicoud (2006). The main result from the welfare section is perhaps that agglomeration, whereby all manufacturing activities are concentrated in a single region, can Pareto-dominate dispersion because strong input-output linkages are passed on to consumers everywhere in the form of low prices. The remainder of the paper is organized as follows. In the remaining part of the introduction I first provide some statistics to show that the two key ingredients of
the model, capital mobility and input-output linkages, are important current features of the world economy. I then briefly and selectively review the relevant literature, explaining along the way where and how the current paper contributes. Section 2 sets up the structure of the model. Section 3 solves the equilibria, mostly in the special case of ex ante symmetric regions. Sections 4 and 5 convey the welfare analysis in that case. Section 6 derives the FHMR model as a special case of the nesting model. Section 7 concludes.
1.1. Two Important Features of 'Globalization'

Two of the main ingredients described in the current paper, capital mobility and input-output linkages, are noticeable features of the current international environment. Moreover, these features are closely related (in both the model and the data). In the model, the form of international capital mobility that is closest to the real world is perhaps foreign direct investment (FDI), that is, long-term investments in the real economy; in recent years, FDI flows have grown enormously, albeit from a low base. Using UNCTAD and World Bank data, Barba-Navaretti et al. (2004) report that worldwide real inflows of FDI grew at an annual rate of 17.7% over the period 1985-1999 (that is, by a factor of 11), faster than worldwide exports (annual rate 5.6%, a factor of 2), and faster still than worldwide real GDP (annual rate 3.1%, a 50% increase). Almost as striking perhaps is the volume of employment or turnover of multinational companies. Since the FCVL model explicitly features firm-to-firm sales, let me provide some figures for export volumes and sales of multinational corporations. As an illustration, take the volume of manufacturing sales of US-owned EU affiliates in the EU economy: these are nearly four times as large as US exports of manufactures into the EU. More globally, UNCTAD (various years) estimates that a third of world trade is conducted within the boundaries of multinational corporations (between foreign affiliates or between the parent company and affiliates). Of the remaining two thirds, a good share of world trade takes place between unrelated firms. A recent descriptive paper by Bernard & Jensen (2005) provides a fresh look at this. They report that the 'most globally engaged' (MGE) firms of their almost exhaustive sample of US enterprises are responsible for 80% of US trade. These MGE firms are defined as those that both import and export, and at least some of these imports and exports are to and from 'related parties' (as opposed to arm's length). Thus, firm-to-firm international business represents perhaps the most important share of world trade. The model in this paper does not model the boundaries of the firm explicitly, and thus the FCVL model is consistent with both forms of firm-to-firm trade (whether they are distinct entities or whether they belong to the same multinational corporation).2 Policies are also becoming ever friendlier towards FDI flows. In its 2004 World Investment Report, UNCTAD reports that among the 70 countries that changed their FDI regulations between 1992 and 2002, more than 94% of the regulatory changes are classified as being more favourable to FDI, that is, impediments to such capital movements have decreased over the period. Likewise, the number of bilateral investment treaties increased five-fold between 1990 and 2002. In short, capital is increasingly 'footloose', certainly more than workers, and vertical linkages are ubiquitous. Thus, it is important to have an NEG model that incorporates these facts in its own right.
1.2. Literature Review, Terminology, and New Results

The main insight of the NEG comes from its formalization of agglomeration mechanisms based on endogenous market size. Various trade models predict that sectors characterized by increasing returns to scale, imperfect competition and transportation costs will be disproportionately active in locations with good market access (Krugman, 1980). In a simple two-country model, this 'home market effect' implies that the country with the larger demand for the good produced by such a sector will end up exporting that good (this result sharply contrasts with the predictions of the Ricardian and Heckscher-Ohlin-Viner models of trade; but see Behrens et al., 2005; Ottaviano & Thisse, 2005). NEG models add cumulative causation to this effect: hosting a larger share of increasing-returns activities increases local demand and profitability. If there is factor mobility of sorts, then more of these factors will in turn locate themselves in the already large market, and the cycle repeats under certain conditions. Endogenous agglomeration results from this mechanism, that is, the manufacturing sector clusters in a single region, usually called the 'core' as opposed to the 'periphery'. Therefore, initially symmetric regions might end up hosting very different sectors as increasing-returns activities have a tendency to locate in a small number of places (Fujita & Thisse, 2002). In relation to the transmission mechanism whereby agglomeration occurs, the NEG exploits in a spatial setting the kind of circular causality recurrent in models of monopolistic competition (Matsuyama, 1995). Typically, these models predict that agglomeration (respectively dispersion, whereby the manufacturing sector is active in both regions or countries) is the unique stable equilibrium when international trade and transaction costs are low (respectively large) and that both agglomeration and dispersion are stable equilibria for intermediate values of trade costs. For syntheses of this literature, see Fujita et al. (1999), Baldwin et al. (2003) and Combes et al. (2005). This brings us back to the first contribution of this paper. The model here emphasizes, as the agglomeration mechanism, the interplay between capital mobility and input-output linkages. Other mechanisms have been proposed in the literature, such as skilled labour migration (Krugman, 1991; Forslid & Ottaviano, 2003) and human capital accumulation (Baldwin, 1999). The mechanism described thus far that is closest to that described in the current paper was originally proposed by Venables (1996), extended in Krugman & Venables (1995) and simplified by Ottaviano & Robert-Nicoud (2006).3 There, agglomeration stems jointly from inter-sector labour mobility and input-output linkages (a.k.a. vertical linkages) among firms. That is, workers can move between sectors within the same spatial entity (a region or a country) and firms buy each other's output as intermediate inputs in the spirit of Ethier (1982).4 As Robert-Nicoud (2005) and Ottaviano & Robert-Nicoud (2006) show, all these NEG models are isomorphic and thus exhibit the same dynamic properties (for a comprehensive survey and a synthesis, see also Baldwin et al., 2003).5 As it turns out, this is also the case for the model presented in the current paper, and thus the NEG model described in Section 3 belongs to the same family of models. This paper also studies the normative properties of the model.
Unlike most previous contributions, it does so mostly using the Pareto criterion and compensation criteria, as social welfare functions are problematic when people face different price indices (Wildasin, 1986).6 As we shall see, the market delivers the ‘desirable’ outcome when trade is quite free: this desirable outcome is
agglomeration. (An equilibrium outcome is said to be 'desirable' if everyone can be made better off, using compensation schemes if necessary, relative to another equilibrium outcome.) As it turns out, if the vertical linkages that bind firms to one another are strong, agglomeration might even Pareto-dominate dispersion, that is, agglomeration might be preferable to dispersion even for the population left behind at the periphery. To understand this seemingly counterintuitive result, remember that firms trade with each other. Thus, when all firms are clustered in a single location, intermediate inputs are cheapest because firms do not have to pay for transportation or trade costs when they purchase those inputs. This cost-saving aspect of agglomeration, which benefits all firms, is passed on to mill prices at equilibrium. When trade costs are low and vertical linkages are sufficiently strong, these lower mill prices might even translate into a lower consumer price index (which includes trade costs) for the residents at the periphery. The market also delivers a socially optimal outcome in the opposite case, that is, when vertical linkages are modest and when trade costs are near prohibitive. A third contribution of this paper is to underline the distinct roles of two parameters of the model that are taken to be the same, for the sake of simplicity, in the aforementioned papers on vertical linkages. As I shall show, the share of manufactured goods in consumers' expenditure exclusively influences the normative properties of the model in an intuitive way, which I shall explain in detail. By contrast, the share of intermediates in firms' cost function affects both the positive and the normative features of the model. When this share increases, it makes agglomeration both more likely and more desirable. In addition, I show that the elasticity of substitution plays a modest role (in a well-defined sense) in the welfare analysis but is crucial for the positive properties of the model. Finally, there is a good technical reason to integrate NEG forces into a model in which countries trade goods and capital alike, as the model does. To reiterate, the model in this paper nests both an NEG model with all the usual properties described in detail in Fujita et al. (1999) and Baldwin et al. (2003), among others, and the NT model of Flam, Helpman, Martin and Rogers, as special cases. This is useful because applications to, say, the analysis of the location effects of preferential trade agreements in such a setting emphasize how and where the departure from the tractable FHMR trade model, by incorporating vertical linkages, changes the resulting picture. This is the route undertaken by Baldwin et al. (2003, Ch. 14). Finally, it is worth stressing that these extensions are facilitated by the fact that the current model is easier to manipulate than the model proposed by Krugman & Venables (1995). Minor results and results that appear elsewhere in the literature are cast as 'Lemmas'. All new results appear as 'Propositions' or 'Corollaries'; they appear in bold characters and number six in total.

2. The Basic Model

The model developed in this section is an extension of the FHMR trade and geography model and builds on Dixit & Stiglitz's (1977) framework of monopolistic competition. The novelty here is to add agglomeration forces in the form of vertical linkages, as in Venables (1996). As I proceed, I make choices of units and of the numéraire that are standard in the NEG literature.
106
F. Robert-Nicoud
2.1. Tastes and Production Consider an economy consisting of two regions or countries (I use the two terms interchangeably), j 1, 2. The typical individual is assumed to supply one unit of labour L (the reward of which is w) and k units of capital K (the reward of which is p) inelastically. There is a measure L of workers and a unit measure of capital in this economy, so the typical worker owns k 1/L units of capital and, as a consequence, her income is wkp. Tastes for a typical individual in j take a Cobb Douglas form in which j spends a share m of her income yj on a composite good M (for ‘manufacturing’ or ‘monopolistically competitive’) and a share 1 m on the homogeneous good A (for ‘agriculture’, say). The composite good M comes in N different varieties. Tastes for the different varieties are captured by a Constant Elasticity of Substitution (CES), ‘love-for-variety’, functional form, with an elasticity of substitution s 1 between any pair of varieties. The dual of this, the indirect utility function of region j’s representative consumers, can therefore be written as: /
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
/
Vj
yj m p1m A Gj
N ; Gj
g p(i)
1s
di
1 1s
; 0 B m B 1;
(1)
s 1;
i0
where p(i) is the consumer price of variety i, Gj is the true CES price index over the N varieties of the manufactured good, and pA is the price of good A (the reason why pA and p are not indexed by j will become clear shortly). Each firm i [0, N] produces a different variety, the buyer price of which is p(i); this brings us to production. Each producer enjoys monopoly power over its own variety. No producer has any incentive to produce a variety already being produced by another producer, for it would then directly compete for the market of that variety with the incumbent producer and, as a result, its profits would be lower. Hence, N is also the number (mass) of firms operating in sector M. The manufacturing sector M is the usual monopolistic competition sector a` la Dixit & Stiglitz (1977). It produces a CES aggregate under increasing returns. Specifically, each firm needs a fixed amount aKX of capital K to start producing and a constant amount b of a Cobb Douglas composite input made up of labour (with share 1 a) and intermediates produced by sector M itself (with share a) for each unit of output it produces. Mathematically, the cost function for the typical firm located in j is given by: /
/
Cj (xj ) aKX pj bxj wj1a Gja ;
0 B a B 1;
(2)
where Gj is the same CES price index as in equation (1), pj is the cost of one unit of K prevailing in j, and xj is a typical firm output. Observe that the same Gj enters equations (2) and (1); this means that the elasticity of substitution among varieties of manufacture is the same for consumers and for firms. Qualitatively, this is an innocuous assumption, but this is required to keep the analysis manageable (see also Fujita et al., 1999, Ch. 14). In FHMR, a is equal to zero. Shutting down vertical linkages, as this assumption does, makes the analysis entirely tractable, but this is at the cost of no longer being an NEG model (when a 0, market sizes are exogenous; see below). The background sector A produces a homogeneous good under constant returns using labour only: by choice of units, the per-unit output labour /
Agglomeration and Trade
107
requirement is set to unity. A is assumed to be freely traded (hence pA is the same in both regions) and parameter values are chosen so that no region ever specializes in M (I make the ‘no-specialization’ condition more precise below); further, A is chosen as the nume´raire. These imply that wj pA 1, j {1, 2}. As a consequence, we can rewrite equation (1) as Vj yj Gjm . In contrast to A, inter-regional trade in M is subject to Samuelson-type iceberg transportation costs t ] 1. That is, in order to sell one unit abroad a firm has to ship t units. The difference t 1 melts in transit (hence the name). Monopolistic pricing yields the usual relation pj (1 1/s) bw1a Gaj for the producer price of a j typical firm in j. The term on the right-hand side is the marginal cost, and s is the perceived elasticity of demand; this requires us to impose s 1 as a regularity condition. We choose units so that b 1 1/s (this is without loss of generality; see Baldwin et al. 2003), hence pj Gaj . In Dixit Stiglitz monopolistic competition, transportation costs are fully passed on to consumers so that the producer price pj holds irrespective of the market served. Would-be entrepreneurs bid for units of capital. Free entry and exit in M ensure that these entrepreneurs make no pure profits, so the operating profits (which are a share 1/s of sales in Dixit Stiglitz) of a typical firm active in j just cover the capital reward pj : /
/
/
/
/
/
/
/
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
/
pj
pj xj s
pj Gja :
;
(3)
Now normalize aKX to 1. By symmetry among all varieties and full employment of capital this implies that N 1; define n as the share of N operating in j 1. As a result, we can rewrite the price indices in equation (1) as: /
D1 nDa1 f(1n)Da2 ;
/
0 5 Dj Gj1s 5 1;
0 5 f t1s 5 1
(4)
and D2 is defined analogously. Note that the definitions of D1 and D2 are implicit and simultaneous. The variable Gj and the primary parameter t are usually raised to the power 1 s, so it is more convenient to use Dj and f instead. The parameter f measures the degree of freedom of inter-regional trade in manufactures M (it is zero when trade is prohibited and equals unity when trade is perfectly free). Like f, Dj lies in the unit interval because n [0, 1] and a B 1. (This claim is easily demonstrated by contradiction.) /
/
/
2.2. Endowments and Factor Mobility Potentially, the two regions differ in size: region 1 is endowed with a share s of world labour and world capital stock alike; assume s ] ½ without loss of generality.7 Labour is embodied and immobile; capital is disembodied and perfectly mobile in the long run (consequently, n " s is possible).8 Both workers and capital owners are themselves immobile. We further assume that the capitalists own a perfectly diversified portfolio, that is, each of them owns the same share of each firm.9 Hence, their portfolio return is p np1 (1 n)p2. Also, remember that w1 w2 1 holds by free trade in A and by the choice of nume´raire. All these imply that aggregate income Y in region 1, say, is equal to Y1 s(L p). Location 1 expenditure on M is given by E1 mY1 anp1bx1. The term mY1 in the previous expression is the share of final demand (and, since there are no /
/
/
/
/
/
/
/
/
/
/
F. Robert-Nicoud
108
savings, income) spent on M; it follows by applying Roy’s identity to equation (1). The second term in the expression for E1, anp1bx1, is the share of intermediate demand spent on M that emanates from other manufacturing firms. It can be inferred from equation (2) using Shepard’s lemma. By analogy, location 2 expenditure on M is given by E2 mY2a(1n)p2bx2. To close the model, note that the value of total output in sector M at producer prices must equal the value of global M-sector private expenditure, namely np1x1 (1 n)p2x2 ab[np1x1(1 n)p2x2] m[L p]. Making use of the pricing rules and free-entry condition (3), we get p mL/[(1 a)s a m].10 Observe that the equilibrium p is a function of parameters and exogenous endowments only; in particular, this expression holds for any n. Importantly, it does not depend upon f or t. As an aside, we now have everything at hand to make the no-specialization condition more precise: if all firms cluster in a single location, we require the labour supply of this region to be larger than the labour these firms demand so that sector A is active in both regions and nominal wages are the same worldwide. Mathematically, this requires min{L1, L2} (1 a)bsp. Using the equilibrium expression for p, the closed-form condition is (1 s) m(1a)(s 1)/(s(1 a) a m). We assume that it holds throughout. /
/
/
/
/
/
/
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
/
/
/
/
/
/
/
/
/
/
/
/
2.3. Short-run and Long-run Equilibria In the short run capital is immobile and becomes mobile only in the long run. Thus, in a short-run equilibrium consumers maximize utility, firms maximize profits and all markets clear. Define qj as the ratio of the actual operating profit in region j to the equilibrium value of p, that is, q pj /p, and ej as the share of expenditure that emanates from region j, namely ej Ej /sp.11 Together with the expressions for p and Ej above, we obtain the following closed-form solutions for ej : /
/
e1 sab(ns)abn(q1 1); e2 (1s)ab(ns)ab(1n)(q2 1): (5) This expression makes clear that a region’s share of expenditure is larger the more populated it is, the larger its number of firms per capita, and the more profitable the firms that have located there. We then use Sheppard’s lemma and Roy’s identity to obtain the demand for a typical variety, pjs mEj /Dj . The equilibrium operating q-ratios can be derived as follows: q1
p1
p1 x1
E1 p1s 1
ps D |{z} 1 E =E 1 e1 e2 f Da1 : D1 D2 p
ps
E2 (p1 t)1s 1 ps |{z}
E2 =E
D2 (6)
The equilibrium expression for q2 is symmetric. The first equality after the identity derives from equation (3); the second one derives from Cobb Douglas preferences and costs; the final step follows from the fact that equilibrium profits are
Agglomeration and Trade
proportional to sales and expenditure (p E/s), the equilibrium price pj Daj , the equilibrium value for Dj in equation (4), and from the definition of ej in equation (5). It is obvious from the definition of p and the qs that q1 q2 1 always holds. Thus, n (0, 1) implies that, first, q1 B 1 if an only if q2 1 (and conversely) and, second, q1 1 if and only if q2 1. In words, the firms located in a given region make above-normal operating profits whenever the firms located in the other one make below-normal operating profits. The system (3) (6) completely characterizes the so-called instantaneous equilibrium. (In an instantaneous equilibrium, n is an exogenous variable and q1 and q2 are functions of n.) In the long run, capital owners seek the higher nominal returns, so that n adjusts so that pj p mL/[(1 a)s a m] (and hence qj 1) for any active firm. Following standard practice, assume that capital owners allocate their capital according to current nominal differences in rewards according to the following ad hoc law of motion for n:12 /
/
/
/
/
/
/
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
109
/
/
/
/
/
n˙ gn(1n)(p1 p2 ) gn(1n)(q1 q2 )p;
(7)
where g is a strictly positive parameter and the second equality follows from the definition of p. The long-run equilibrium is attained whenever n˙ is zero. Three cases can occur: n 0 (in which case q2 1), n 1 (in which case q1 1), and 0 B n B 1 (and hence q1 q2 1). The first two cases are usually referred to as ‘core periphery’ equilibria and the third as interior or ‘dispersed’ equilibria. By the symmetry of the model, the symmetric equilibrium n ½ always exists. More generally denote an interior long-run equilibrium as n0. To assess the stability of these equilibria the NEG typically resorts to the following informal methods.13 Consider that, starting from any long-run equilibrium, the spatial allocation of capital n is hit by an exogenous, epsilonsmall, perturbation. For the interior equilibria (in particular the symmetric equilibrium in the symmetric model n s ½), one evaluates the sign of the change in the nominal profit gap, namely p1 p2. If the displaced unit of capital increases the profit in the receiving region, then the symmetric equilibrium is unstable. For the agglomerated (or core periphery) equilibrium, one checks whether the perturbation creates a nominal profit in the periphery that is higher than the nominal profit in the core. If this is the case then this equilibrium is unstable. Mathematically, these two tests can be written as: /
/
/
/
/
/
/
/
/
/
/
/
d(p1 p2 ) dn
j
B 0;
(p1 p2 )jn1 0:
(8)
nn0
The equilibrium under consideration is stable when the relevant inequality holds, otherwise it is unstable. The description of the model is now complete: the long-run equilibria consist of the values of n in the interval [0, 1] that solve (3) (7) for n˙ 0: 3. Symmetric Regions and Vertical Linkages: the FCVL Model Following the enduring tradition of the NEG, my primary interest here is to discuss how regions that share identical tastes, endowments and technology might endogenously diverge in terms of production structure and real incomes.
F. Robert-Nicoud
110
3.1. Sustainability of the Concentrated Equilibrium Here, the question is: is a core periphery pattern sustainable in region 1? To answer this question, we check under which conditions n 1 and q2 51 hold simultaneously. No firm wants to leave 1 if the shadow profit in 2 is inferior to p. Substituting n 1 into equations (4) (6), we find that this condition holds whenever /
/
/
S1 (f) 2f1a [(1ab)(1ab)(2s1)]f2 [(1ab)(1ab)(2s1)] (9)
Downloaded by [Tehran University] at 04:23 21 August 2011
] 0:
As explained in various papers and monographs mentioned above, this inequality sust holds for all f that are sufficiently large, i.e. for f [fsust is 1 , 1], where f1 implicitly defined as the smallest root of S1(f) (1 is the unique other root). This root is always strictly in the (0, 1) interval because the so-called no-black-hole condition 1 ab always holds, which I assume throughout. In words, whenever trade costs are sufficiently low, if it is already the case that all firms have agglomerated in either region, then none has any incentive to leave the core and start producing in the periphery. It turns out that the clustering of all manufactures in the south can also be sustainable if f is sufficiently large; the resulting condition can be denoted by S2(f) 0, with S2(f) being the perfect symmetric of S1(f). Intuitively, one would expect fsust to be larger than fsust 2 1 , that is, if agglomeration is sustainable in the small country then it must also be sustainable in the large market. This intuition turns out to be correct. To see this, note that S1(f) is always larger than S2(f), since s ½: /
/
/
/
f [0; 1): S1 (f)S2 (f) (1ab)(2s1)](1f2 ) 0:
(10)
It turns out that working with the general, asymmetric case would be a substantial task that is beyond the scope of this paper (one would have to use numerical simulations, which is something I aim to avoid in this paper). Therefore, I impose s ½ in the remainder of the paper, unless specified otherwise. The resulting model is referred to as the FCVL model in Baldwin et al. (2003).14 Denote the sustain point in the symmetric case by fsust ; from equation (9), fsust satisfies: /
fsust B 1:
2(fsust )1a [1ab](fsust )2 [1ab] 0:
(11)
To sum up, we have: Lemma 1. The core periphery equilibrium is said to be sustainable if f is larger than fsust . This result is reminiscent of Krugman (1991) and his followers. In addition, an increase in input output linkages reinforces the forces of agglomeration and thus makes agglomeration more likely: @fsust B @a B 0. Note that fsust 0 if and only if a B 1 (more on this below). /
/
/
/
Agglomeration and Trade
111
3.2. (In)stability of the Dispersed Equilibrium Define a dispersed equilibrium as the configuration in which n ½. With symmetric regions such an equilibrium always exists, as can be seen from equations (4) (7). It might not always be stable in the sense of equation (8), though. Here, the thought experiment is, if a firm moves from region 2 into region 1, would the gap p1 p2 thus created be positive and hence, by equation (7), widening? In which case would that gap be negative and hence the perturbation be selfcorrecting? Formally, answering this question is equivalent to signing dq1/dn evaluated at n ½ (see Puga, 1999).15 As we know from the symmetry of the model, n ½ implies that e1 e2, D1 D2, and q1 q2. We therefore denote common variables with the subscript zero. From equations (4) (6) we find that q0 1, e0 ½, and ^1a (1 f)/2. A 0 small perturbation of the model (in the form of dn 0) around the symmetric equilibrium has an impact on the values of D, q and e. As an illustration, totally differentiate D1 in equation (4) to get: /
/
/
/
/
/
/
/
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
/
dD1 jn1=2 (Da1 f Da2 )dn(a Da1 dD1 f a Da1 dD2 ) 1 1 (1f)Da0 dna Da0 (1f)dD;
(12)
where the second line derives from the fact that the effect of dn on the variables pertaining to region 2 is symmetric to the effect of dn on those pertaining to region 1 around the symmetric equilibrium; thus de0 de1 de2, dD0 dD1 dD2 and dq0 dq1 dq2. Working out the derivatives of equations (5) and (6) in the same way, we obtain: 3 2 32 de0 0 1a(1f)=(1f) 0 4 2 0 ab54dD0 =D0 5 dq0 2(1f)=(1f) (1f)=(1f)a 1 2 3 (1f)=(1f) 5dn: 24 ab (13) 0 /
/
/
/
/
/
/
/
/
Using Cramer’s rule, it is easy to see that dq0/dn ] 0 if and only if /
B(f) ffbreak 0;
fbreak
(1 a)(1 ab) (1 a)(1 ab)
;
(14)
where fbreak is the so-called ‘break point’, which, by inspection, is strictly smaller than unity. A non-empty combination of parameters always exists such that the symmetric equilibrium is stable provided that a B 1, that is, provided that firms use at least some labour. I assume that this condition holds throughout (this is the socalled no-black-hole condition). To sum up we can write: /
Lemma 2. The symmetric equilibrium is stable for all f below fbreak . This result is familiar to readers of Krugman (1991) and the ensuing literature (see especially Puga, 1999, who was the first to derive an analytical solution to the break point). It can be shown that fsust B fbreak (for details see Robert-Nicoud, 2005), thus there is a range of parameters for which both the symmetric and core periphery long-run equilibria are stable. /
F. Robert-Nicoud
112
The symmetric equilibrium is unstable for lower values of f if agglomeration forces are stronger. Ceteris paribus, agglomeration forces are increasing in a and b. a captures the strength of vertical linkages among firms (when a is large, firms buy a lot of each other’s output as intermediate inputs). To study the effect of an increase in the elasticity of substitution on the break point, it is important to recall that b is equal to 1 1/s and that f is defined as t to the power (1 s). Thus, let tbreak (fbreak )1s . Standard algebra reveals that @tbreak /@s B 0, that is, when goods become less close substitutes, agglomeration is the unique stable equilibrium over a larger range of trade costs. The intuition for this result is the following: when s decreases, firms enjoy a higher mark-up, which reinforces the strength of pecuniary externalities or, in other words, of backward and forward linkages. /
/
/
/
Downloaded by [Tehran University] at 04:23 21 August 2011
3.3. Other Equilibria and Global Stability Thus far we have only shown the existence of long-run equilibria for which n {0, ½, 1} exist and the conditions under which they are stable. Now the following natural question may arise. Do other equilibria exist, and what are their dynamic properties if they do? Simulations undertaken by several researchers and a formal method derived in Robert-Nicoud (2005) provide the following answer to this question. Generically, there are five equilibria: the dispersed equilibrium (n ½), the concentrated equilibria (n 0, 1), and two asymmetric, interior equilibria n? and n?? with n? n?? 1. Moreover, whenever they exist, the latter equilibria are unstable in the sense that dq1/dn 0 and dq2/dn B 0 at n n?, n?’. Regarding global stability, Baldwin (2001) has shown that the informal methods developed in Krugman (1991) and Puga (1999), designed to assess the local stability of the model, give the same answer as formal methods, that is, the break and sustain points describe the global stability of the model, too. /
/
/
/
/
/
/
/
3.4. Comparison with the Break and Sustain Points of the CPVL Model The reader familiar with the NEG literature already knows that break and sustain points in Krugman’s (1991) core periphery (CP) model and Fujita et al.’s (1999, p 245) core periphery vertical linkage (CPVL) model are isomorphic, as are the corresponding points of the FCVL model in this section.16 In particular, the break and sustain points of the CPVL model solve: fbreak CPVL
(1 a)(b a) (1 a)(b a)
;
2(fsust CPVL )
1
a s1
2 [1a](fsust CPVL ) [1a] 0:
(15) The model requires the so-called ‘no-black-hole conditions’ b a to hold. Without these, the break points would be negative, implying that the symmetric equilibrium is never stable. The similarity between the CPVL and the FCVL models is striking as both the functional forms for the cost functions and the mechanism driving agglomeration are different. In particular, agglomeration stems from labour mobility between the manufacturing and the background sectors within each region in the CPVL model. By contrast, international capital mobility is the driving mechanism in the FCVL model. /
Agglomeration and Trade
113
4. Equity and Efficiency in the FCVL Model
Downloaded by [Tehran University] at 04:23 21 August 2011
This section studies the normative properties of the FCVL model. Since this model produces two radically different kinds of equilibria, the goal here is to establish a welfare comparison between the two market outcomes.17 From an equity perspective one may ask: who are the gainers and the losers from agglomeration? Which factor owners benefit and which suffer? Which regions are advantaged and which are disadvantaged? From an efficiency perspective one may wonder: can the gainers compensate the losers? Does the free working of market forces deliver too much or too little agglomeration? This section follows the methodology developed by Charlot et al. (2006) and adopted by Ottaviano & Robert-Nicoud (2006). Thus, I will also point to the similarities and differences between the models that these papers study (the Krugman 1991 model and a cousin of the Krugman Venables 1995 model, respectively) and the current one. 4.1. Pareto Welfare Analysis The first thing to note is that nominal rewards are invariant in the FCVL model (thus the conflict between factor owners is unaffected by the equilibrium spatial configuration). Thus we write: Lemma 3. In all cases, capital owners and workers of region 1 (region 2) alike are best off when all firms are clustered in region 1 (region 2). This implies that all the welfare effects occur via the price index. Specifically, welfare in region j, as a function of n and f, reads Vj (n,f) [D(n,f)]mj . As is clear from equation (4), D1 is maximized when n 1. However, in the current setting it is not necessarily the case that increasing the share of industry n necessarily hurts residents of region 2. This is because an increase in n lowers the production costs of firms located in region 1, a reduction that is passed on fully to consumers. Therefore, from region 2’s residents’ point of view, two competing effects arise when n increases. On the one hand, they have to import more varieties, that is, they have to pay transportation costs on a wider range of varieties. This clearly hurts them. On the other hand, the producer price of these varieties is lower and hence the consumer price net of transportation costs is also lower. This is good news to them. The net effect is thus a priori ambiguous, as we shall see. This has an important consequence: the market might select an outcome that is dominated by another possible equilibrium. Returning to Lemma 3; this lemma states that the preferred outcome for residents in the north is n 1 but is silent on which outcome is their second-best option: are these people better off if they are in the periphery (n 0) or if firms are evenly spread between the two regions (n ½)? As it turns out, this is not a trivial question. Take the case a 0 as a benchmark, that is, there are no forward or backward linkages* and hence no agglomeration economies. In such a case, equation (4) reveals that D1 is strictly increasing in n. This implies: /
/
/
/
/
/
Lemma 4. When the magnitude of the vertical linkages is small (a :0), then residents in 1 rank the possible equilibria as follows: they are best off under the core periphery pattern n 1; their second-best option is the dispersed equilibrium n ½; they are least well off under the core periphery pattern n 0. /
/
/
/
114
F. Robert-Nicoud
Now turn to the case a 0. A larger a corresponds to greater agglomeration economies. Agglomeration economies ensure that producer prices are lower in the core periphery outcome than in the dispersed outcome. Low transportation costs ensure that consumer prices are closer to producer prices. Thus, consumer prices in the periphery can be lower under agglomeration than under dispersion if transportation costs are sufficiently low and if vertical linkages are substantial. To see this formally, assume that firms are fully agglomerated in region 1. Then, manipulating equation (4) reveals that in the dispersed outcome n ½ we have: 1 f 1=(1a) D1 D2 B 1: (16) 2 /
/
By contrast, D1 1 and D2 f if firms are clustered in the north. Therefore, residents in 1 always prefer the core periphery pattern and residents in 2 agree with this ranking if and only if
Downloaded by [Tehran University] at 04:23 21 August 2011
/
/
m
P(f) V2 (1=2;f)V2 (1; f) f
1f
m=(1-a)
(17)
2
is positive (P stands for ‘Pareto’). This expression is negative for low values of f and zero if f 1 (in the latter case location is irrelevant). However, standard algebra reveals that this expression is increasing at f 0 and everywhere concave. Moreover, it is increasing in f at the limit f 1, and thus everywhere on [0, 1], if and only if a B½. In this case, equation (17) is negative for all admissible values of f. If a ½ then the expression in equation (17) is decreasing in f at the limit f 1 and thus equation (17) is positive for any f in (fP , 1), where fP is the unique real root of this polynomial in (0, 1). Hence, this analysis has shown: /
/
/
/
/
/
Proposition 5. If a ½ then there exists a fP in (0, 1) such that residents in the periphery are better off under the core periphery outcome than under the dispersed outcome n ½ if and only if transportation costs are sufficiently low (f fP ). Agglomeration is said to be ‘efficient’ in this case. If a ] ½ then residents in the periphery are worse off than under n ½ for all f. Since residents in the core are always better off under agglomeration than under dispersion in this model, dispersion is not Pareto dominated by agglomeration only if fB fP . /
/
/
/
/
/
Moreover, it is easy to see that people in the periphery are more likely to benefit from agglomeration, the larger the agglomeration economies. Indeed, at the limit a 1, D2 0 D1 1, so consumer prices in the periphery are the same as in the core, for any value of f. More formally, P(f) 0 if and only if 2f1a (1 f) 0 and /
/
/
/
@ @a
(2f1a (1f)) 2f1a ln(f) 0;
/
/
/
(18)
which implies that fP is decreasing in a. In other words, Corollary 6. The range of f over which everyone benefits from the clustering of industry in either region vis-a`-vis the dispersed equilibrium is increasing in the magnitude of agglomeration economies. This, too, is rather intuitive.
Agglomeration and Trade
115
We can finally address the question of whether the market provides too much or too little agglomeration from the periphery’s residents’ point of view. To answer this question we rank fP , fbreak , and fsust . We already know that the sustain point comes before the break point, namely fsust B fbreak , which implies that for all f (fsust , fbreak ) both the core periphery and the dispersed outcomes are stable longrun equilibria. Incorporating equation (14) into equation (17) we get: /
/
signfP(fbreak )g signf2(fbreak )1-a [1(fbreak )]g:
The sign of this expression is ambiguous and depends upon the magnitude of a and s. As is to be expected, this expression is most likely to be positive (and hence agglomeration Pareto-dominated dispersion) at the break point when the magnitude of the vertical linkages a is large. Also, numerical comparisons show that the sign of fP fbreak is ambiguous but suggest that fsust B fP holds for all parameter values. The latter inequality means that dispersion is not Pareto dominated by agglomeration at the sustain point. Together, these facts imply that fsust B fbreak , fsust B fP . As a consequence, no fewer than five cases can occur (see the Appendix). Therefore, we can write: /
Downloaded by [Tehran University] at 04:23 21 August 2011
(19)
/
/
/
Lemma 7. If fB fsust then the market delivers dispersion (n ½) and this outcome is Pareto efficient. If f max{fbreak , fP } then the market delivers agglomeration and this outcome is Pareto efficient. If fbreak B fP , then the market delivers too much agglomeration if f (fbreak , fP ). In the remaining instances in which both agglomeration and dispersion may arise at the decentralized equilibrium, the actual outcome might or might not Pareto dominate the other one, and which does in fact dominate the other depends on the parameters values of the model. /
/
/
/
/
The really interesting question is the likelihood of these cases for different parameter configurations. This is answered at the end of the next section, in which all positive and normative analysis is summarized in Table 1 and Figure 1. 5. Potential Pareto Improvements When assessing the global welfare properties of the new economic geography model, Baldwin et al. (2003) use the utilitarian welfare function and, therefore, make interpersonal comparisons.18 This is notoriously problematic, especially when different people face different price indices, as in the present model (Wildasin, 1986).19 For this reason, here I use compensation criteria based on the prevailing equilibrium prices and wages because they do not suffer from this caveat. Table 1. Pareto criterion and Kaldor compensation in various models Model CP model: migration driven (Charlot et al ., 2006) VL models (Ottaviano & Robert-Nicoud, 2006) FCVL model (current paper)
Agglomeration might Pareto-dominate dispersion No: N is fixed and the price of each variety is fixed (Yes:) N is larger under agglomeration (Yes:) the price of each variety is lower under agglomeration
The mechanics of Kaldor compensation Winners are more numerous than losers
Downloaded by [Tehran University] at 04:23 21 August 2011
116
F. Robert-Nicoud
Figure 1. The positive and normative properties of the model.
Specifically, let us ask under which conditions the winners from agglomeration could compensate the losers (in the form of a transfer) from it and still be better off; answering this question involves making use of Kaldor’s (1939) compensation criterion. Alternatively, we could follow Hicks (1940) and evaluate the maximum transfer from periphery to core that under dispersion would let the core reach the same welfare level as under agglomeration. In both cases, we check the feasibility of transfers at the corresponding equilibrium prices, that is, we ask whether the location pattern with transfers is still part of an equilibrium, as this is not necessarily granted (Little, 1949). 5.1. Kaldor’s Approach Let us start by assessing the conditions under which agglomeration corresponds to a potential Pareto improvement with respect to dispersion. Let TK be the per capita transfer from region 1 residents to region 2 residents. People in the periphery are exactly compensated if they get additional income TK such that their real income in equation (1) is the same in both instances:
(1TK )V2 (1; f) V2 (1=2;f)
U
m
(1TK )f
1f 2
m 1a
:
(20)
Since core and periphery host the same number of people, each resident in the core pays TK . Clearly, TK is a positive number unless agglomeration Pareto-dominates dispersion (in which case no compensation is required). It is straightforward to check that Kaldor’s compensation scheme is feasible in the sense that the material balance conditions still hold at the consumption pattern corresponding to the compensated incomes, that is, that market-clearing conditions still hold after TK has been transferred.
Agglomeration and Trade
117
The last step requires us to determine under which conditions region 1 residents prefer agglomeration with compensation to dispersion. Given preferences in (1), this is the case if and only if: (1TK )V1 (1; f) V1 (1=2;f)
U
(1TK )
1f 2
m 1a
(21)
:
Using the value of TK from equation (20), the condition above can be rewritten as:
Downloaded by [Tehran University] at 04:23 21 August 2011
K(f) 2(1f
m
m 1 f 1a ) 0: 2
(22)
Using similar reasoning as for P(f), one can show that a threshold value fKaldor exists such that equation (22) holds for f fKaldor and is violated otherwise. Clearly, if agglomeration Pareto-dominates dispersion, then Kaldor’s criterion favours agglomeration, too. Mathematically it is easily verified that K(f) P(f) if and only if 1 fm 0, which is true for all f B 1 since m B 1. This result is quite tautological: if people in the periphery are better off under agglomeration than under dispersion, then of course the ‘winners’ in the core are able to compensate the ‘losers’ in the periphery* because even the losers gain and hence do not require being compensated. The converse, of course, is not true: there is an intermediate range of trade costs such that Kaldor’s criterion selects agglomeration even though this outcome does not Pareto-dominate dispersion.20 Four further points deserve mention here. First, fKaldor is always positive and less than unity, even if a B 1/2. Second, when vertical linkages are stronger (a increases), agglomeration is more likely to dominate dispersion (fKaldor decreases) in the sense of Kaldor. Third, since P(f) is larger than K(f), it must be that fKaldor B fPareto holds: when a compensation scheme is available agglomeration is sometimes preferable to dispersion when the former does not dominate the latter. Finally, when m increases, it has an ambiguous effect on fKaldor . On the one hand, at given f and prices, residents of the core benefit more from lower consumer prices the higher the value of m because manufactured goods represent a more important item in their expenses, and thus they are willing to forgo a larger transfer TK to compensate people in the periphery; this tends to make @fKaldor /@m negative. On the other hand, the effect of a higher m on the welfare of people in the periphery depends on whether or not they gain from agglomeration. In the former case, the Kaldor criterion is even less binding (as explained in the previous paragraph) and thus @fKaldor /@m tends to be negative because of this effect; since fKaldor B fPareto always holds; however, this case is immaterial. In the latter case, when people in the periphery are worse off under agglomeration than under dispersion, when m increases they require a larger nominal compensation, given higher c.i.f. manufacturing prices; if that were the only effect then @fKaldor /@m would be positive. The net effect is thus ambiguous (a higher m requires larger compensations and, simultaneously, enables winner to afford giving away a larger TK ). Unreported numerical simulations suggest that fKaldor is increasing in m and that fPareto is surprisingly insensitive to the value of m.21 Together, these two facts are consistent with the fact, outlined in the previous paragraph, that the gap between fPareto and fKaldor is increasing in m.We have thus established: /
/
/
/
/
/
/
/
/
118
F. Robert-Nicoud
Proposition 8 (Kaldor improvements). Kaldor’s criterion prefers agglomeration to dispersion when the former is Pareto dominant (fKaldor B fPareto ). When the two configurations cannot be Pareto ranked (fB fPareto ), it still favours agglomeration provided vertical linkages are strong and trade costs are sufficiently low (f fKaldor , fKaldor B fPareto . /
/
Downloaded by [Tehran University] at 04:23 21 August 2011
/
/
In other words, agglomeration with compensation from core to periphery can make both regions better off when trade costs are sufficiently low. The intuition behind this result is straightforward. Agglomeration enhances product variety. This has a positive impact on consumer surplus both in the core and the periphery. If linkages are strong, the impact is large, which makes it easier to compensate the periphery* the more so the lower the transport costs. It is interesting to contrast this result with its counterpart derived by Charlot et al. (2006) and Ottaviano & Robert-Nicoud (2006) for CP models and another class of VL models, respectively. Both papers show that compensation from core to periphery is possible provided that transport costs are sufficiently low and that vertical linkages are sufficiently strong, as here. The underlying reason is, nonetheless, quite different. In Ottaviano & Robert-Nicoud (2006), like here, people do not cross borders and firms buy each other’s output as intermediates. But winners can sometimes compensate the losers and still be better off under agglomeration because product variety is larger under agglomeration than under dispersion (the larger this difference the easier it is to compensate the losers resulting from agglomeration). The price of each variety is constant in their model. In stark contrast, in the current model the total mass of variety is constant (N 1) and the effect works by making each variety cheaper under agglomeration than under dispersion. Interestingly, in either case, the effect of enhanced product variety or of lower product price results in a lower price index in agglomeration. In Charlot et al. (2006) compensation from core to periphery is possible, provided that transport costs are sufficiently low, for very different reasons than those in both of the aforementioned frameworks. First, in CP models product variety is independent from the spatial distribution of economic activities (N 1). Hence, agglomeration cannot be a Pareto improvement with respect to dispersion. Second, migration-driven agglomeration implies that the number of residents in the core is larger than the number of residents in the periphery. Hence, compensation is possible for a given product variety. This discussion is summarized in Table 1. Three models, as well as the papers that discuss their properties, are listed in the first column. In some of these models, agglomeration might dominate dispersion in the sense of Pareto: the second column shows why each model has this property, or why it does not. Finally, the third column provides the reason why agglomeration dominates dispersion in the sense of Kaldor for some (but not all) parameter values. /
/
5.2. Hicks’s Approach Following Hicks, under dispersion we have to check for the existence of an appropriate redistribution of income from periphery to core that would let the latter region reach the same welfare as under agglomeration. Clearly, for the problem to be non-trivial, P(f) B 0. It is easy to see that a compensation scheme a` la Hicks is not compatible with the location equilibria on which it is built: the payment of any compensation makes /
Agglomeration and Trade
119
the spatial distribution of expenditures uneven (ej " ½). Under agglomeration, sales and operating profits do not depend on ej , ensuring in particular that the material balances hold for any transfer. Under dispersion, by contrast, qj does depend on ej and hence, ultimately, on n. This implies that with any transfer supply no longer matches demand at the dispersed market prices, hence dispersion is no longer in equilibrium once transfers are realized. Accordingly, the would-be periphery is unable to compensate the would-be core at prevailing market prices without destroying the dispersed equilibrium. Therefore we have: /
Downloaded by [Tehran University] at 04:23 21 August 2011
Proposition 9 (Hicks’s improvements). Hicks’s criterion always prefers agglomeration to dispersion. This is reminiscent of Charlot et al. (2006) and Ottaviano & Robert-Nicoud (2006), who obtained the same result for CP models and in another class of VL models, respectively. 5.3. Scitovsky’s Indetermination The foregoing analysis has shown that, if trade costs are small and/or linkages are strong, then agglomeration is preferred under both Kaldor’s and Hicks’s criteria. In this sense, agglomeration is the ‘desirable’ outcome when K(f) 0. Otherwise, we are in a typical case of indetermination in the sense of Scitovsky (1941): neither outcome is preferred to the other with respect to both criteria simultaneously. If f B fKaldor , Kaldor’s criterion favours dispersion while Hicks’s criterion supports agglomeration. To summarize, we write: /
/
Proposition 10 (potential Pareto improvements). Agglomeration is always socially desirable, according to Hicks. It is socially desirable also according to Kaldor only when trade costs are low, and forward and backward linkages are strong. In this case, the market might deliver inefficient dispersion. Otherwise, the market outcome cannot be improved upon. This result is reminiscent of Charlot et al. (2006), who found the same sort of indeterminacy in CP models and consider it ‘as the synthesis of the very contrasted views that prevail in a domain in which the two tenets [those who think agglomeration is efficient against those who do not] have many good reasons to be right’ (p. 4). Figure 1 plots the freedom of trade f against the strength of vertical linkages a. KK and PP describe normative features of the model and will be considered first. Curve PP maps fPareto as a function of a. The condition in equation (17) is violated for all f if a B ½; fPareto 1 if a ½; and fPareto 0 if a 1. More generally, PP is decreasing in a (given f, consumer prices are lower if a is high) and agglomeration Pareto dominates dispersion for any combination of a and f to the north-east of PP* the shaded area labelled ‘A’. Turning next to the curve KK, which plots fKaldor as a decreasing function of a (as vertical linkages are tighter, consumer prices are relatively lower in the agglomerated outcome and it becomes easier to compensate people in the periphery): even when a is very low, it is feasible to engage in win-win compensation from the core to the periphery provided that f is sufficiently large; to the north-east of KK all combinations of a and f make agglomeration desirable (I say that agglomeration is ‘desirable’ when it /
/
/
/
/
F. Robert-Nicoud
Downloaded by [Tehran University] at 04:23 21 August 2011
120
is favoured by both the Kaldor and the Hicks criteria). Between KK and PP, agglomeration is desirable even if it does not Pareto-dominate dispersion. To the south-west of KK, agglomeration is desirable in the sense of Hicks only; it is not so in the sense of Kaldor. We now turn to curves BB and SS (curve SS is not shown on the graph for the sake of readability), which summarize the positive properties of the model. Consider first the dashed line denoted by BB, which plots the break point in equation (14) as a decreasing function of a; to the north-east of BB parameters of the model are such that agglomeration is the unique equilibrium. In the configuration shown in the diagram, which holds for low values of s, curve BB starts below curve KK for low values of a and, as a increases, crosses curve KK first and then curve PP (not obvious in the figure) from below. Before addressing the generality of this configuration, let me first observes that this divides the parameter space into two cut-off values for a, denoted by aBK and aBP, respectively; clearly, aBK B ½B aBP is always true. When vertical linkages are weak (a B aBK) then agglomeration does not dominate dispersion in the sense of Pareto and is not desirable in the sense of Kaldor (though it is so in the sense of Hicks). As it turns out, dispersion is also a market outcome in this case. For intermediate values of f, the market is biased towards agglomeration: agglomeration is the decentralized outcome, even though it is not desirable in the sense of Kaldor and does not dominate dispersion. For yet larger values of trade freedom f, the market delivers the desirable outcome (in the sense of both Hicks and Kaldor). When firms are bound strongly to each other (a aBP) then agglomeration is simultaneously the market outcome* efficient and desirable if f is sufficiently large. In all other cases it is useful to bring in the sustain point in order to avoid a profusion of cases and topology. Curve SS would plot fS from a decreasing function of a; for parameter combinations to the south-east of SS, dispersion is the unique market outcome, from equation (11). SS has the same shape as BB (except that it does not cross PP in any circumstances) and everywhere it is situated below it (and to the left of it), except at the endpoints (where they are confounded). Define aSK as the value of a such that fS fK (graphically, this point is to the left of aBK but would look similar to it if the dashed line were to describe SS rather than BB). When a aSK, then for high transportation costs (low f), the market outcome* dispersion* agrees with the Pareto criterion and is desirable, too. When f takes intermediate values, then the market is biased towards dispersion: the equilibrium is Pareto efficient; however, agglomeration is desirable. A third general case arises when SS and KK never intercept: for parameter combinations to the south-west of SS (itself to the south-west of KK), dispersion is the market outcome; also, it is dominated neither in the sense of Pareto nor of Kaldor (but it is in the sense of Hicks). In this restricted sense, the market delivers a socially optimal outcome. To sum up: the market delivers the desirable outcome when trade is quite free (graphically: this is the case whenever (a, f) is to the north-west of KK): this desirable outcome is agglomeration; in addition, provided that a is sufficiently large, it Pareto-dominates dispersion. 
The market also delivers a socially optimal outcome in the opposite case (when vertical linkages are tiny and when trade is near prohibitive). Briefly, let me turn to the generality of the situation described here. Let us turn first to the role of b (which is a function of the elasticity of substitution only, namely b 1 1/s). Conveniently, in the (a, f) space b affects curves describing /
/
/
/
/
/
/
/
Agglomeration and Trade
the positive features of the model only, that is, BB and SS. Specifically, fS and fB are both decreasing in b (and thus in s, the elasticity of substitution); s has no impact on KK or PP. When s increases towards infinity, aSK 0 aBK 0 aBP 0 1, that is, SS and BB are both positioned to the south-west of KK for all a in (0, 1). In words, as goods become closer substitutes and thus when mark-ups shrink, vertical linkages are weakened; this makes it less likely that the market delivers agglomeration when this outcome is neither desirable nor dominates dispersion. Second, and lastly, consider the role of m. Conveniently, m has no influence on the positive threshold values of f, that is, BB and SS are unaffected by a change in the value of this parameter. By contrast, I had stressed that an increase in consumer expenditure on manufactured goods, m, expands the set of (a, f) for which agglomeration is desirable; more precisely, KK shifts outwards (but at the endpoints). A change in m has no meaningful impact on PP, as we have seen already. Thus, an increase in m makes it less likely that the market delivers agglomeration when this outcome is neither desirable nor dominates dispersion, that is, it makes it more likely that KK crosses SS or BB or both. /
Downloaded by [Tehran University] at 04:23 21 August 2011
121
/
/
6. Asymmetric Regions in the Absence of Linkages: the Positive and Normative Properties of the FHMR Model Owing to limitations of space, I do not provide a comprehensive description of the FHMR model here. Fortunately, the properties of this model* both positive and normative* can be found in several sources.22 This model is a special case of the model developed in Sections 2 and 3 and holds for a 0. The advantage of this model is that it is fully tractable, even if s " ½; as a result, we can relax the assumption s ½ in this section. Its main positive features are the following. First, the spatial equilibrium is unique because market sizes are exogenous: from equation (5) it is easy to see that e1 s and e2 1 s, irrespective of n. This property allows us to introduce all sorts of exogenous asymmetries, like different endowment sizes (s " ½), asymmetric trade costs (f2 " f) and even asymmetric relative endowments (whereby sL " sK). Then, solving the model for the spatial equilibrium n, we find: /
/
/
/
/
/
/
/
/
n s(s1=2)
2f 1 f2
f2 f (1 f)(1 f2 )
(12f)
(s 1=2) (f2 f) (1 f)(1 f2 )
(23) or n 0 or n 1 in an obvious manner if the solution to the expression above does not belong to [0, 1]. This expression shows that region 1’s manufacturing share is more likely to be larger than its income share s if (a) its GNP is larger than region 2’s (this is a manifestation of Krugman’s (1980) celebrated home market effect), and (b) its foreign partner is more open: in DSK monopolistic competition, unilateral protection always lowers the price index because the elasticity of de-location is high (see Baldwin et al., 2003). Multilateral trade liberalization (simultaneous reduction of f2 f) increases the gap between n and s, and hence n 1 for all f larger than fsust 1 (1 s)/s B 1; see equation (9). In words, the larger country attracts the whole of the manufacturing sector when trade barriers are sufficiently low. This benefits the large country’s residents more than the small country’s. In this model, the nominal rewards are invariant in the spatial equilibrium of the model, hence all welfare effects operate through the price index, or D. As it turns out, the reduced /
/
/
/
/
/
/
122
F. Robert-Nicoud
form of D shows that both countries benefit from a reciprocal increase in f, even though 1 n decreases. Baldwin et al. also study the normative properties of the model using the Pareto and utilitarian criteria. In particular, they show (this lemma is related to Lemma 4): /
Downloaded by [Tehran University] at 04:23 21 August 2011
Lemma 11. In the FC model, conflicts of interest arise in the spatial dimension only. Indeed, any spatial reallocation of capital (industry) benefits one region at the expense of the other. Moreover, there is no conflict between capital owners and workers who live in the same region. This implies that no Pareto improvement is feasible but it is legitimate to ask whether a planner with a utilitarian social welfare function can improve on the decentralized equilibrium. Baldwin et al. show that there is a ‘social home market effect’ whereby the planner would provide the larger region with a more than proportional share of manufactures; however, the planner would chose a more even spatial equilibrium than the market. In this sense: Lemma 12 (utilitarian Social Welfare Function (SWF)). When regions are asymmetric, the market outcome has too many firms in the region that has the highest per capita income. In other words, the market is biased in favour of agglomeration; the utilitarian planner implements the market outcome only when per capita incomes are equal. Note that this result depends crucially on the absence of vertical linkages. More generally, using a CES-type welfare function that encompasses the utilitarian criterion as a special case, we find that as the degree of aversion towards inequality increases (which is equal to one over the elasticity of substitution between individuals’ welfare; the case of the utilitarian criterion corresponds to a unit elasticity), the gap between the market outcome and the social optimum widens. More generally:23 Proposition 13 (social welfare function). Assume that all agents have the same income per capita. Then the market is biased in favour of agglomeration if and only if the SWF planner is more averse to inequalities than the utilitarian planner. Conversely, the market is biased against agglomeration if and only if the SWF planner is less averse to inequalities than the utilitarian planner. 7. Summary and Concluding Remarks This paper proposes a ‘New Trade, New Economic Geography’ model in which agglomeration is driven by the interaction of increasing returns at the level of the firm, trade/transportation costs, free capital mobility and vertical linkages among firms* that is, firms use each other’s output as an input. I have sketched the ways in which the present model behaves in a very similar fashion to already wellestablished economic geography models. In particular, it shares the features of the original model developed by Venables (1996). However, it is more tractable and hence allows for easier extensions and less reliance on simulations. I have also shown that this model nests the ‘footloose capital’ trade model of Flam & Helpman (1987) and Martin & Rogers (1995). The main contribution of the paper is to study the normative implications of the model. First, Proposition 5 established that agglomeration might benefit
Downloaded by [Tehran University] at 04:23 21 August 2011
Agglomeration and Trade
123
everyone, including residents of the periphery, provided that vertical linkages are sufficiently strong (so that producer prices are low) and trade barriers are sufficiently low (so that consumer prices are close to producer prices). Together, Propositions 8 and 9 convey a related message: they show that, under qualitatively similar but less stringent conditions, agglomeration is the most desirable market outcome when compensations are allowed for. Second, when vertical linkages are absent and when international per capita incomes are equalized, Proposition 13 claims that the market delivers the social optimum only if the planner uses the utilitarian criterion. The market is biased in favour of (against) agglomeration if the planner has an aversion to inequalities that is larger than unity, which corresponds to the utilitarian case. The paper also provides a thorough description of the effects of the respective normative and positive role of each parameter of the model. The distinction between the share of income that consumers spend on manufactures and the share of variable costs that firms spend on other firms’ output is usually blurred in the literature because people make the working assumption that they are the same. I show here that the former has no positive implications for the model but has a crucial impact on its normative ones, whereas the value of the latter is relatively more important for the positive implications of the model. Likewise, I show that the main role of the elasticity of substitution is to change the range of parameter values for which agglomeration is desirable but not a market equilibrium (and vice versa). A final remark is in order here. One central prediction of the model is that agglomeration is more likely when trade costs fall. In this model, agglomeration also means that one country specializes in manufacturing, thereby eliminating all intraindustry and firm-to-firm international trade, which is clearly counterfactual. However, it is well known that when NEG models are enriched to allow for different intensities in immobile primary factors across sectors (Epifani, 2005), decreasing returns to labour in the nume´raire sector, or when the no-specialization condition is violated, all predict that the industrial structure of the two regions or countries converges when trade is sufficiently free (Fujita et al., 1999, Ch. 14). This model is no exception: an increase in trade freedom, for large values of f to start with, increases the volume of intra-industry and firm-to-firm trade. The welfare properties of the model in this case are outside the scope of the present paper (see Gaigne´, 2006, for a contribution along these lines).
Notes
1. This NEG model is outlined in Baldwin et al. (2003), who attributed to it the acronym 'FCVL' model.
2. As it turns out, one possible equilibrium of the FCVL model, agglomeration, is counterfactual in this respect. I will mention in the concluding remarks how to enrich the model to reconcile it with these stylized facts.
3. See also Englmann & Walz (1995) (on geography and growth), Faini (1984) (on geography and vertical linkages), and Ottaviano et al. (2002) and Pflüger (2004) (on geography and embodied factor mobility).
4. In Ethier (1982), the upstream sector is monopolistically competitive à la Dixit & Stiglitz (1977) and the downstream sector is perfectly competitive and operates under constant returns to scale. In Venables (1996), both sectors are monopolistically competitive and operate under increasing returns to scale. In Krugman & Venables (1995) and Ottaviano & Robert-Nicoud (2006), as well as in the current paper, the downstream and upstream sectors are merged into a single one.
5. We had to wait 15 years after Krugman's (1991) seminal contribution to have an exhaustive characterization of its properties. Mossay (2006) formally shows that the short-run (or instantaneous) equilibrium exists in the Krugman model; in Robert-Nicoud (2005), I provide an analytical characterization of the long-run properties of the model.
6. See Charlot et al. (2006) for a detailed discussion of this issue in CP models.
7. Baldwin & Robert-Nicoud (2000) relax the assumption of identical relative endowments.
8. What 'long run' means in the context of this model will be made clear in Section 4. For more on this see Note 10 below.
9. Economic consistency imposes π > 0, which requires (1 − a)σ + a − μ > 0. This always holds because σ > 1 and μ < 1.
10. If K has not been normalized to unity, the expression for π in the text should be divided by K.
11. Total operating profits are equal to total expenditure over σ. Since there is a unit mass of firms, total operating profits are equal to π, hence the expression in the text. Also, the use of the 'q' notation in this static model is deliberate: in a straightforward dynamic extension of the model, π would play the role of the replacement cost of capital.
12. This reduced form can be obtained as the result of a well-specified dynamic program, as I did in my thesis (calculations available upon request). The original source is Baldwin (2001), in the context of the Krugman (1991) model.
13. Baldwin (2001) formally assesses the validity of these methods.
14. See Baldwin et al. (2003) for the analysis of an NEG model with s ≠ ½.
15. Or, which is equivalent by the symmetry of the model when s = ½, to signing d(q1 − q2)/dn at n = ½.
16. This terminology and these acronyms are those used in Baldwin et al. (2003).
17. That is, I follow Charlot et al. (2006) in that I compare market equilibria with one another: n ∈ {0, ½, 1}. There are several justifications for this, one of them being that in a decentralized economy the planner cannot force firms to locate in region 2, say, if it would be more profitable for them to locate in region 1.
18. They write: 'We believe that this approach, which does not involve any interpersonal comparison and rests on market prices and incomes determined in a general equilibrium context, is superior to many others' (p. 4).
19. This is also the route I took in the thesis version of this paper. Details are available from the author upon request.
20. See Charlot et al. (2006) for a detailed discussion of this issue in CP models.
21. Note that the gap between the two criteria is increasing in μ (and P(φ) − K(φ) → 0 when μ → 0): when people in the periphery benefit from agglomeration because manufactured c.i.f. prices are low, they benefit more the larger the value of μ (the share of income they spend on manufactured goods), and the less Kaldor's criterion is binding.
22. For any given value of a in {0.11, 0.51, 0.91}, varying μ from 0.11 to 0.91 changes φPareto at the ninth decimal or beyond only.
23. See, for example, Baldwin et al. (2003, Chs 3 and 11) for the positive and normative properties, respectively. In order to save space, I am not including this material here. Details are available from the author upon request.
References
Baldwin, R. (1999) Agglomeration and endogenous capital, European Economic Review, 43(2), 253–280.
Baldwin, R. (2001) The core–periphery model with forward-looking expectations, Regional Science and Urban Economics, 31, 21–49.
Baldwin, R., Forslid, R., Martin, P., Ottaviano, G. & Robert-Nicoud, F. (2003) Economic Geography and Public Policy, Princeton, NJ, Princeton University Press.
Baldwin, R. & Robert-Nicoud, F. (2000) Free trade agreements without delocation, Canadian Journal of Economics, 33(3), 766–786.
Barba-Navaretti, G., Venables, A., Barry, F., Ekholm, K., Falzoni, A., Haaland, J., Midelfart, K.-H. & Turrini, A. (2004) Multinational Firms in the World Economy, Princeton, NJ, Princeton University Press.
Behrens, K., Lamorgese, A. R., Ottaviano, G. I. P. & Tabuchi, T. (2005) Changes in infrastructure and tariff barriers: local vs global impacts, CEPR Discussion Paper No. 5103.
Bernard, A. & Jensen, B. (2005) Importers, Exporters and Multinationals: a Portrait of Firms in International Trade, NBER Working Paper No. 11404.
Charlot, S., Gaigné, C., Robert-Nicoud, F. & Thisse, J.-F. (2006) Agglomeration and welfare: the core–periphery model in the light of Bentham, Kaldor, and Rawls, Journal of Public Economics, 90(1), 325–347.
Combes, P. P., Mayer, T. & Thisse, J. F. (2005) Economic Geography, Princeton, NJ, Princeton University Press (forthcoming).
Dixit, A. & Stiglitz, J. (1977) Monopolistic competition and optimum product diversity, American Economic Review, 67, 297–308.
Englmann, F. & Walz, U. (1995) Industrial centres and regional growth in the presence of local inputs, Journal of Regional Science, 35, 3–27.
Epifani, P. (2005) Heckscher–Ohlin and agglomeration, Regional Science and Urban Economics, 35(6), 645–657.
Ethier, W. (1982) National and international returns to scale in the modern theory of international trade, American Economic Review, 72(3), 389–405.
Faini, R. (1984) Increasing returns, non-traded inputs and regional development, Economic Journal, 94, 308–323.
Flam, H. & Helpman, E. (1987) Industrial policy under monopolistic competition, Journal of International Economics, 22, 79–102.
Forslid, R. & Ottaviano, G. I. P. (2003) Trade and location: two analytically solvable cases, Journal of Economic Geography, 3, 229–240.
Fujita, M., Krugman, P. & Venables, A. (1999) The Spatial Economy, Cambridge, MA, MIT Press.
Fujita, M. & Thisse, J. (2002) Economics of Agglomeration, Cambridge, Cambridge University Press.
Gaigné, C. (2006) The 'genome' of NEG models with vertical linkages: a positive and normative synthesis: a comment on the welfare analysis, Journal of Economic Geography, 6, 141–149.
Hicks, J. (1940) The valuation of social income, Economica, 7, 105–124.
Kaldor, N. (1939) Welfare propositions in economics and interpersonal comparisons of utility, Economic Journal, 49, 549–551.
Krugman, P. (1980) Scale economies, product differentiation and the pattern of trade, American Economic Review, 70, 950–959.
Krugman, P. (1991) Increasing returns and economic geography, Journal of Political Economy, 99, 483–499.
Krugman, P. & Venables, A. (1995) Globalization and the inequality of nations, Quarterly Journal of Economics, 110(4), 857–880.
Lawrence, C. & Spiller, P. T. (1983) Product diversity, economies of scale, and international trade, Quarterly Journal of Economics, 98, 63–83.
Little, I. (1949) The foundations of welfare economics, Oxford Economic Papers, 1, 227–246.
Martin, P. & Rogers, C. (1995) Industrial location and public infrastructure, Journal of International Economics, 39, 335–351.
Matsuyama, K. (1995) Complementarities and cumulative processes in models of monopolistic competition, Journal of Economic Literature, 33, 701–729.
Mossay, P. (2006) The core–periphery model: a note on the existence and uniqueness of short-run equilibrium, Journal of Urban Economics, 59(3), 389–393.
Ottaviano, G. & Robert-Nicoud, F. (2006) The 'genome' of NEG models with vertical linkages: a positive and normative synthesis, Journal of Economic Geography, 6, 113–139.
Ottaviano, G. I. P., Tabuchi, T. & Thisse, J. F. (2002) Agglomeration and trade revisited, International Economic Review, 43(4), 409–436.
Ottaviano, G. & Thisse, J. (2002) Integration, agglomeration, and the political economics of factor mobility, Journal of Public Economics, 83(3), 429–456.
Ottaviano, G. P. & Thisse, J. F. (2005) Agglomeration and economic geography, in: J. V. Henderson & J. F. Thisse (eds) Handbook of Regional and Urban Economics (Vol. IV), Amsterdam, Elsevier.
Pflüger, M. (2004) A simple, analytically solvable, Chamberlinian agglomeration model, Regional Science and Urban Economics, 34, 565–573.
Puga, D. (1999) The rise and fall of regional inequalities, European Economic Review, 43(2), 303–334.
Robert-Nicoud, F. (2005) The structure of simple 'New Economic Geography' models (or, on identical twins), Journal of Economic Geography, 5(2), 201–234.
Scitovsky, T. (1941) A note on welfare propositions in economics, Review of Economic Studies, 9, 77–88.
Venables, A. J. (1996) Equilibrium location of vertically linked industries, International Economic Review, 37, 341–359.
Wildasin, D. (1986) Spatial variation in marginal utility of income and unequal treatment of equals, Journal of Urban Economics, 19, 125–129.
Appendix and Guide to Calculations

Lemma 7. As mentioned in the text, five cases can occur. From the point of view of region 2's residents: (1) if φ < φ_sust the market outcome (n = ½) is 'second best'; (2) if φ > φ_P, φ_break the market outcome (n = 1) is 'second best'; (3) if φ_break < φ < φ_P the market outcome (n = 1) provides too much agglomeration; (4) if φ_P < φ < φ_break the market outcome is second best if n = 1 but provides too little agglomeration if n = ½; (5) if φ_sust < φ < φ_P the market outcome is second best if n = 1 but provides too little agglomeration if n = ½.

By contrast, anyone's welfare is maximized when firms cluster in one's own region. Hence, from the point of view of region 1's residents, three cases can occur: (1) if φ > φ_break then the decentralized outcome (n = 1) delivers the best result; (2) if φ < φ_sust then the decentralized outcome (n = ½) delivers too little agglomeration; (3) if φ_sust < φ < φ_break then the decentralized outcome delivers the best result if n = 1 but provides too little agglomeration if n = ½.
Spatial Economic Analysis, Vol. 1, No. 1, June 2006
Modelling the Socio-economic Impacts of Major Job Loss or Gain at the Local Level: a Spatial Microsimulation Framework
DIMITRIS BALLAS, GRAHAM CLARKE & JOHN DEWHURST
(Received February 2006; revised March 2006)
ABSTRACT It has long been argued that spatial microsimulation models can be used to estimate the impact of major changes in the local labour market through job losses or gains, including local multiplier effects. In a previous paper we used SimLeeds, a spatial microsimulation model for the Leeds local labour market, to estimate the initial employment and income effects of a hypothetical closure of an engineering plant on different surrounding localities. This paper builds on that work and presents an extension of SimLeeds that provides estimates of the multiplier effects of such major changes in a local economy. In particular, we focus on the spatial distribution of the multiplier effects, that is, the event changes that are triggered by the initial job and income effects. The disposable income gain or loss for each individual or household eventually leads to an increase/decrease in the consumption of goods and services and to possible changes of preferred retail location (i.e. moving to more/less expensive stores). There are also net monetary gains or losses for the government, from the increase/decrease of income tax revenue and from the decrease/increase of benefit claims by the households affected. In addition, the initial income and employment impacts would have second- and third-round multiplier effects, which could include the opening/closure of local convenience grocery stores as a result of the rise/fall of local demand for their goods. These openings or closures would in turn generate further job creation or loss, which would have further multiplier effects in different localities within the city. This paper addresses all these multiplier effects in a spatial microsimulation context and provides a new framework for multiplier-effect micro-spatial analysis.
Dimitris Ballas (to whom correspondence should be sent), Department of Geography, University of Sheffield, Winter Street, Sheffield S10 2TN, UK. Email:
[email protected]. Graham Clarke, School of Geography, University of Leeds, Leeds LS2 9JT, UK. Email:
[email protected]. John Dewhurst, Department of Economic Studies, The University, Dundee DD1 4HN, Scotland, UK. Email:
[email protected]. The work reported on in this paper was part funded by the Greek State Scholarships Foundation (IKY). The Census Small-area Statistics are provided through the Census Dissemination Unit of the University of Manchester, with the support of the ESRC/JISC/DENI 1991 Census of Population Programme. All census data reported in this paper are Crown Copyright. The BHPS data were obtained from the UK Data Archive (University of Essex). All digitized boundary data are Crown and ED-LINE copyright. ISSN 1742-1772 print; 1742-1780 online/06/010127-20 # 2006 Regional Studies Association
DOI: 10.1080/17421770600697729
KEYWORDS: Spatial microsimulation; small-area microdata; small-area income data; multiplier effects; socio-economic impact assessment

JEL CLASSIFICATION: R, R2, C61
1. Introduction
Simulation is a critical concept in the future development of modelling because it provides a way of handling complexity that cannot be handled analytically. Microsimulation is a valuable example of a technique that may have increasing prominence in future research. (Wilson, 2000, p. 98)

Simulation-based spatial modelling is an expanding area of research, which has great potential for the evaluation of the socio-economic and spatial effects of major developments in the regional or local economy. Among the most crucial issues that concern policy makers is the prediction of the effects of major job loss or gain in a region (in the short term as well as in the long term). There is a long history of modelling work in regional science that focuses on the assessment of the various short-term and long-term effects of major economic change. Much of this work looks at the multiplier effects of major job loss/gain on other areas of the local or regional economy. Regional economic models such as input–output models make up the majority of such applications. It has been argued elsewhere (see, for instance, Birkin et al., 1996; Ballas & Clarke, 2000, 2001) that there is less work on the estimation of changes relating to the supply side of the economy: changes regarding the incomes, expenditure and job prospects of different types of individuals or households in the catchment areas of the major economic change being analysed. So often in the field of regional science, for example, multipliers (social or economic) are estimated for entire cities or regions which, whilst useful for the aggregate performance indicators required to be produced by regional development agencies, are not so useful for local economic or social policy appraisal and planning.

In this paper we build on previous work to develop a spatial microsimulation framework for the analysis of the social multiplier effects of major economic change by focusing attention at the household, individual and small-area level. First, we briefly review the potential of microsimulation for socio-economic impact assessment. Then, we describe SimLeeds, which is a spatial microsimulation model aimed at modelling the Leeds urban system (Section 2). This has already been used to model the first-order (or direct) impacts of a major plant closure in Leeds (Ballas & Clarke, 2001). The results from that exercise are first summarized here. In Section 3 we review the major approaches to socio-economic impact assessment and, at each step, offer an alternative framework based on microsimulation. This includes three types of multiplier effect on households within a city or region: first, the immediate impacts caused by the loss of household income in the areas surrounding the plant that has closed. Second, the impacts elsewhere in the city caused by inter-industry linkages (that is, job losses at companies that supplied the plant that has closed). Finally, we speculate on the long-term fortunes of the labour force that has been made redundant. The ability of the labour force to find alternative employment, or be retrained, will have a marked impact on the economic fortunes of the area in the long term. Some concluding comments are offered in Section 4.

2. Microsimulation for Impact Assessment

In this section we first describe the microsimulation model which has been built to replicate household structures in Leeds (SimLeeds). Then, we report progress to
date on using the model for impact assessment. In Section 3 we build on the existing work to date to explore new and additional aspects of the economic multiplier process.

The main aim of microsimulation models is first to compile large-scale data sets on the attributes of individuals or households (and/or on the attributes of individual firms or organizations). Once such data sets have been built, the analyst can examine the impacts of changing economic or social policies on these micro-units (Orcutt et al., 1986; Birkin & Clarke, 1995; Clarke, 1996). Since these models allow the production of analyses at the level of the individual, family or household, they provide the means of assessing variations in the distributional effects of different policies (Mertz, 1991). In addition, microsimulation modelling frameworks provide the possibility of defining the goals of economic and social policy, the instruments employed and also the structural changes of those affected by socio-economic policy measures (Krupp, 1986). To date, therefore, microsimulation methodologies have become accepted tools in the evaluation of economic and social policy, in the analysis of tax-benefit options as well as in other areas of public policy.

The microsimulation procedure typically involves four major stages (Ballas et al., 2005b): (1) the construction of a micro data set (when this is not available); (2) Monte Carlo sampling from this data set to 'create' a micro-level population; (3) what-if simulations, in which the impacts of alternative policy scenarios on the population are estimated; (4) dynamic modelling to update a basic micro data set.

Although microsimulation models can be constructed entirely from aggregate statistics (see Birkin & Clarke, 1988, for a good example), it helps if there are micro data sets to replicate or re-weight. In the case of the UK, the only official government-published census micro data set is the Samples of Anonymised Records (SARs). These are samples of individual census records. However, in order to protect confidentiality, the geographical code or tag is, at best, given at the urban or district level only. It is not possible to identify the individual address or postcode. There are, however, many techniques available for using such micro data files to produce household estimates for an entire population of a city or region. These range from iterative proportional fitting methods to linear programming and complex combinatorial optimization methods (Williamson et al., 1998; Ballas & Clarke, 2000; Ballas et al., 2005b). These techniques make up stages 1 and 2 of the microsimulation procedure outlined above. Once a spatially disaggregated micro data set is built it is then possible to move to stage 3 and perform what-if policy analysis by changing the micro-unit attributes accordingly. In particular, once a spatial microsimulation database is built, the user can change any of the variables to test the impacts of various policy initiatives (Birkin et al., 1996).

The above analysis can become even more sophisticated if dynamic procedures are incorporated into a spatial microsimulation model. In particular, the next stage in spatial microsimulation modelling is to update the micro-database and perform policy analysis in a dynamic fashion. This would involve the prediction of the micro-unit attributes and their behavioural responses over a period of time under different policy and demographic scenarios (see Ballas et al., 2005b, for an example).
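To make stages 1 and 2 concrete, the sketch below re-weights a toy survey micro data set to the census marginals of a single small area using iterative proportional fitting, one of the techniques cited above. The attribute names, the constraint counts and the choice of IPF itself (SimLeeds relies mainly on simulated annealing re-weighting) are illustrative assumptions rather than the actual SimLeeds implementation.

```python
# Illustrative iterative proportional fitting (IPF) re-weighting of survey
# households to the census marginals of one small area (stages 1-2 above).
# Attribute names and counts are hypothetical, not SimLeeds inputs.
import numpy as np

def ipf_weights(households, constraints, n_iter=100, tol=1e-6):
    """Return one weight per survey household so that weighted category
    totals match the target counts in `constraints`."""
    w = np.ones(len(households))
    for _ in range(n_iter):
        max_shift = 0.0
        for attr, targets in constraints.items():
            for category, target in targets.items():
                mask = np.array([h[attr] == category for h in households])
                current = w[mask].sum()
                if current > 0:
                    factor = target / current
                    w[mask] *= factor
                    max_shift = max(max_shift, abs(factor - 1.0))
        if max_shift < tol:          # stop once all margins are (nearly) matched
            break
    return w

# A toy survey and hypothetical census counts for one enumeration district.
survey = [{"tenure": "owned", "cars": "1+"}, {"tenure": "owned", "cars": "0"},
          {"tenure": "rented", "cars": "0"}, {"tenure": "rented", "cars": "1+"}]
census = {"tenure": {"owned": 120, "rented": 80}, "cars": {"0": 90, "1+": 110}}

weights = ipf_weights(survey, census)
print(weights, weights.sum())        # weights sum to the 200 households in the ED
```

Integerizing the fractional weights, for example by Monte Carlo sampling households with probability proportional to their weight, then produces the synthetic household list required for the what-if simulations of stage 3.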
SimLeeds is a product of ongoing research and uses different approaches to conditional probability analysis for microsimulation modelling, especially simulated annealing re-weighting approaches (for more details on the different methodologies employed by SimLeeds see Ballas, 2001). The nature and number of the variables of the households or individuals modelled by SimLeeds are shown in Table 1. The model has been calibrated by checking estimates against the known small-area statistics (SAS) and SARs. That is, the outputs of the models (when summed to census enumeration district level) must match the known census data for each major variable simulated, even if the interdependencies themselves cannot be directly calibrated. In addition, we have been able to add results from the British Household Panel Survey (BHPS), which is an annual survey of the adult population of the UK, drawn from a representative sample of over 5,000 households (Berthoud & Geshuny, 2000). Table 2 lists some of the additional variables available from the BHPS. Thus, we have created a synthetic micro-database at the small-area level, which now additionally comprises all the variables contained in the BHPS. Using this database it is possible to explore the interdependencies of any household attributes at the micro scale. It should be noted that one of the particular strengths of our synthetic database is that it contains a wide range of potentially policy-relevant variables which are not covered by the census. As pointed out above, one of the most important non-census attributes is household income. Figure 1 depicts the estimated spatial distribution of average income at the census enumeration district (ED) level for Leeds, compiled from a combination of census data relating to job status and occupation, and BHPS data relating to wages.

Having described SimLeeds, the next stage is to report on progress to date with impact analysis of job loss/gain. This section first builds on the work presented in a previous paper (Ballas & Clarke, 2001), which is briefly summarized before we add new analysis. In that paper we simulated the hypothetical closure of an engineering plant located in the East Leeds ward of Seacroft (see Figure 2). Figure 2 also depicts the observed travel-to-work flows to Seacroft. As can be noticed, Seacroft, Whinmoor and Halton are the wards with the largest numbers of individuals who work in Seacroft. In particular, 20% of the people who work in Seacroft live in Seacroft. In addition, 17.8% of the people who commute to Seacroft come from Whinmoor and 8.7% come from Halton. Therefore, it can reasonably be expected that these wards will be the most affected in the event of a plant closure in Seacroft (rather than all wards in the city). Using spatial microsimulation models such as SimLeeds it is thus possible to model the impacts of the major engineering plant closure in East Leeds on households in different localities of the city.

To look at the impacts on different socio-economic groups we need information on the size and employment structure of the plant's workforce. The plant to be closed had 3,000 employees and the plant's workforce structure was as depicted in Figure 3. As can be seen, most of the plant's workforce belonged to the skilled manual category (45% of the total workforce) and to the managerial and technical category (25% of the total workforce). The next stage in the development of SimLeeds to examine the impacts of this closure is to build a fully disaggregated journey-to-work model.
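As a toy illustration of how such observed flow shares can drive a Monte Carlo microsimulation, the snippet below draws a ward of residence for each of the plant's simulated employees using the percentages quoted above; the residual 'rest of Leeds' category and the sampling rule are assumptions made for this sketch, not part of SimLeeds.

```python
# Illustrative Monte Carlo draw of the home ward of each simulated Seacroft
# worker, using the observed travel-to-work shares quoted in the text.
# The residual "rest of Leeds" category is an assumption of this sketch.
import random

shares = {"Seacroft": 0.200, "Whinmoor": 0.178, "Halton": 0.087}
shares["rest of Leeds"] = 1.0 - sum(shares.values())   # remaining wards lumped together

def draw_home_ward(rng=random):
    """Sample the ward of residence of one worker employed in Seacroft."""
    r, cumulative = rng.random(), 0.0
    for ward, share in shares.items():
        cumulative += share
        if r <= cumulative:
            return ward
    return "rest of Leeds"

# Simulate where the plant's 3,000 employees live.
homes = [draw_home_ward() for _ in range(3000)]
print({ward: homes.count(ward) for ward in shares})
```

The fully disaggregated journey-to-work model described next works in the opposite direction, assigning a work destination to every simulated resident in work, but it is calibrated on the same census flow data.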
This allows every household in Leeds (deemed to have a job) to be assigned a work destination location. The model can be calibrated on existing journey-to-work data available from the Census of Population. Thus, the patterns shown in Figure 2 can be easily
Table 1. SimLeeds attributes (variable name: label)

Location: Place of residence (ED level)
Bath: Availability of bath/shower
Cenheat: Availability of central heating
Insidewc: Availability of inside WC
Cars: No. of cars
Hhsptype: Household space type
Hhspindw: No. of household spaces in dwelling
Roomsnum: No. of rooms in household space
Tenure: Tenure of household space
Persinhh: No. of persons in household
Age: Age
Cobirth: Country of birth
Econprim: Economic position (primary)
Econsec: Economic position (secondary)
Empstat: Employment status
Ethgroup: Ethnic group
LTILL: Limiting long-term illness
Mstatus: Marital status
Migorgn: Migrant – area of former usual residence
Occpatn: Occupation
Qualnum: No. of higher educational qualifications
Qualevel: Level of highest qualification
Qsubgrp: Subject group of highest qualification
Residsta: Resident status
Sex: Sex
Soclass: Social class (based on occupation)
Segroup: Socio-economic group
Tranwork: Mode of transport to work
Indusdiv: Industry (SIC divisions)
Industry: Industry
Hours: Hours worked weekly
Occmajor: Occupation: SOC major groups
Occsubmj: Occupation: SOC sub-major groups
Occminor: Occupation: SOC minor groups
Hhdcomp: Household composition type
Hdeptype: Household dependant type
Famnum: Family number
Famtype: Family type
Dfhage: Age of head of family
Dfhecpos: Economic position of head of family
Dfhsex: Sex of head of family
Dfhclass: Social class of head of family
Dfresid: No. of residents in family
Dfdepch: No. of dependent children in family
Dfolddc: Age of oldest dependent child in family
Dfyngdc: Age of youngest dependent child in family
Dfadult: No. of adults resident in family
Dfchild: No. of under-16s resident in family
Dfpensr: No. of pensioners resident in family
Dfltill: No. of persons with LTILL resident in family
Dfemp: No. of persons in employment resident in family
Dfecact: No. of economically active residents in family
Dfunemp: No. of unemployed residents in family
Dfretire: No. of retired residents in family
Dfpsick: No. of permanently sick residents in family
Dfinact: No. of economically inactive residents in family
Dfother: No. of residents other inactive in family
Dfstuds: No. of students in family enumerated at term-time address
Dfdeps: No. of dependants resident in family
Dfolddep: Age of oldest resident dependant in family
Dfyngdep: Age of youngest resident dependant in family
Dhdecpos: Economic position of household head
Dhdage: Age of household head
Dhdsex: Sex of household head
Dhdclass: Social class of household head
Dhresid: No. of residents in household
Dhdepch: No. of dependent children in household
Dholddc: Age of oldest dependent child in household
Dhyngdc: Age of youngest dependent child in household
Dhadult: No. of adults in household
Dhchild: No. of under-16s in household
Dhpensr: No. of pensioners in household
Dhltill: No. of LTILL (limiting long-term illness) persons in household
Dhstuds: No. of students enumerated at term-time address in household
Dhdeps: No. of dependants in household
Dholddep: Age of oldest resident dependant in household
Dhyngdep: Age of youngest dependant in household
Dhemp: No. of persons in employment in household
Dhecact: No. of persons economically active in household
Dhunemp: No. of unemployed persons in household
Dhretire: No. of retired persons in household
Dhpsick: No. of permanently sick persons in household
Dhinact: No. of economically inactive persons in household
Dhother: No. of persons other inactive in household
Dallstud: All-student household
Dallpens: All-pensioner household
Dalladlt: All-adult household
EarnedIncome: Earned income (annual)
Tax: Income tax paid
RetirementPension: Retirement pension (per week)
WFTC: Working Families Tax Credit (per week)
ChildBenefit: Amount of child benefit (per week)
replicated in the simulation model, providing us with a full set of characteristics or attributes for each household linked to the engineering plant to be closed. Under the plant closure scenario, the first impact is that the households assigned to work at this plant become unemployed and there is an immediate loss of earned income. Figure 4 depicts the estimated spatial distribution and size of income loss resulting from the plant closure in the immediate areas surrounding the plant. Having estimated the loss of each household's income we can partially offset that loss by in turn estimating the amount of Job Seeker's Allowance (JSA) that could be claimed by the plant's redundant workforce (which will depend upon the age of the former employee, number of children, etc.). Those JSA payments represent the major monetary costs of the plant closure for the government. Figure 5 depicts the estimated spatial distribution and size of the weekly JSA that would be paid out to those workers (again in the vicinity of the plant itself) as a result of the plant closure. In the next section, we build on this foundation to consider how microsimulation can aid in modelling further impacts of this plant closure on the local economy.

Table 2. The non-census SimLeeds attributes
Rent and mortgage, loan and hire purchase details
Local authority service charges
Allowances/rebates
Difficulties with rent/mortgage payments
Household composition
Consumer durables, cars, telephones, food
Heating/fuel types, costs, payment methods
Non-monetary poverty indicators
Crime
Employment status
Not working/seeking work
Self-employed
Sector private/public
Standard Industrial Classification/Standard Occupational Classification/ISCO (International Standard Classification of Occupations)
Nature of business/duties
Workplace/size of firm
Travelling time
Means of travel
Length of tenure
Hours worked/overtime
Union membership
Prospects/training/ambitions
Superannuation/pensions
Attitudes to work/incentives
Wages/salary/deductions
Childcare provisions
Job search activity
Career opportunities
Bonuses
Performance-related pay
Income from interest/dividends
Savings and investments
Income from benefits

Figure 1. Estimated spatial distribution of average earned income in Leeds, 2000.
Figure 2. Travel-to-work flows to Seacroft. Source: Ballas & Clarke (2001, p. 303).
Figure 3. The plant's workforce structure.
Figure 4. Estimated spatial distribution of annual income loss in Seacroft, Halton and Whinmoor. Source: Ballas & Clarke (2001, p. 306).
Figure 5. Estimated spatial distribution of weekly JSA in Seacroft, Halton and Whinmoor. Source: Ballas & Clarke (2001, p. 306).
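A minimal sketch of the first-round accounting described above, in which each simulated worker attached to the closed plant loses their earned income and part of that loss is offset by an estimated Job Seeker's Allowance entitlement. The record fields, the flat weekly JSA rate and the aggregation by ward are simplifying assumptions of this sketch; in SimLeeds the entitlement depends on the simulated worker's age, number of children and other attributes.

```python
# Illustrative first-round impact accounting for the plant-closure scenario:
# earned income is lost and partially offset by an assumed flat-rate weekly JSA.
# The records, the JSA rate and the ward grouping are assumptions of this sketch.
from collections import defaultdict

JSA_WEEKLY = 55.65   # hypothetical flat adult rate, GBP per week

plant_workers = [    # simulated employees of the closed plant
    {"ward": "Seacroft", "annual_income": 18500.0},
    {"ward": "Whinmoor", "annual_income": 21000.0},
    {"ward": "Halton",   "annual_income": 16250.0},
]

annual_income_loss = defaultdict(float)   # lost earned income, by ward of residence
weekly_jsa_cost = defaultdict(float)      # JSA paid out by government, by ward

for worker in plant_workers:
    annual_income_loss[worker["ward"]] += worker["annual_income"]
    weekly_jsa_cost[worker["ward"]] += JSA_WEEKLY   # real entitlement varies by household

for ward in annual_income_loss:
    print(ward, round(annual_income_loss[ward]), round(weekly_jsa_cost[ward], 2))
```

Mapping these ward totals is, in effect, what Figures 4 and 5 show for the simulated Seacroft closure.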
3. Extended Regional Multipliers of Economic and Social Change

3.1. Introduction

As noted in the introduction, there has long been an interest in the impacts of major job gains or losses on the fortunes of the regional or local economy. Similar impact and multiplier assessments have been studied extensively using a modelling perspective in the fields of regional science and economic geography. The most common methodological approaches have involved the use of Keynesian multiplier models and/or input–output models. The major distinction between these two approaches lies, essentially, in the industrial disaggregation that is used in input–output analysis. Although very strong on economic indicators, such models are less useful for monitoring the impact on socio-economic variables such as household income, expenditure and re-employment prospects. The problem here is the spatial scale. There are few good examples of such impact models applied below the regional scale (i.e. for individual cities, although see Batey & Madden, 2001; Hewings et al., 2001; Jin & Wilson, 1993). In this exposition of SimLeeds we choose to adopt a methodology that provides an alternative way of examining social or household change following a major plant closure (or opening).

The work of Batey & Madden (1983, 1999, 2001) is perhaps one of the best illustrations so far of disaggregated input–output models which include demographic or socio-economic components. However, there are still some interesting spatial issues to explore further. First, Batey & Madden (2001), for example, examine the impacts of extending the airport in Liverpool to allow more air traffic in the future. The results obtained pertain to the combined area they label Greater
Merseyside. Local planners may well be as interested in the variations within this area. There is no way of doing this in their methodology without introducing a labour market journey-to-work model. We would argue that it would also be interesting to look in more detail at the distribution and income effects of the job market changes at the small-area level. Then it may be possible to, in turn, disaggregate the local multiplier effects spatially as well as sectorally. Jun (1999) offers a useful methodology to extend the input–output framework geographically. He argues for the inclusion of a journey-to-work model to allocate job gains/losses from the input–output model spatially. He also includes a shopping model to allocate the additional expenditure that comes from the induced consumption fuelled by job growth. Some of these issues are explored in more detail below.

3.2. Expenditure Changes

In Section 2 we showed the impacts on household income which would result from the plant closure. The next logical step is to estimate the change in aggregate household expenditure consequent on those changes in household disposable income. In a spatial microsimulation framework it is possible to estimate this at the household or individual level using the vector of household and individual attributes of the SimLeeds model (which was described in the previous section). For simplicity we assume a simple linear consumption function of the form C_k = a + b·Y_k (though note that more complicated functional forms could be adopted if the evidence warranted it). Change in this model is apportioned between different household goods, given that there are different marginal propensities to consume different types of goods for different socio-economic groups. Thus, certain commodities or services (e.g. some food and drink, and many personal services) will continue to be consumed largely as before (with some reductions) whilst other types of purchases will be put off until some later date. On the supply side, the impacts of revenue loss will also vary by retailer. Some goods and services may be provided by local businesses that will not be able to absorb revenue losses so easily. National chains, however, may be able to trade for longer periods with reduced revenues since they can be supported by profits made elsewhere. Revenue losses for local traders may thus result in further job losses through store closures. Data from the Office for National Statistics (ONS) Family Spending report (Gibbins & Julian, 2006) can be utilized to apportion aggregate household expenditure into expenditure on individual goods and services. Note that the apportionment ratios will vary as household income varies, because poorer households will tend to consume proportionately more basic items and richer households proportionately more of what one might term luxury items.

In Section 2 we noted that SimLeeds could be extended to include a journey-to-work model. We can now link the microsimulation models to flow or interaction models to estimate the loss of revenue to local retailers. The benefits more generally of linking microsimulation models to more meso-scale interaction models for retail analysis are explored in detail by Nakaya et al. (2003). The retail interaction model can be stated as:

S_ij^m = O_i^m A_i^m W_j^(α^m) exp(−β^m d_ij),

where S_ij^m is the expenditure by household type m in residence zone i at destination j; O_i^m is the level of consumer expenditure of household type m in residence zone i; and A_i^m is a balancing factor to ensure that

Σ_j S_ij^m = O_i^m,

which is calculated as

A_i^m = 1 / [ Σ_j W_j^(α^m) exp(−β^m d_ij) ],

where W_j is the attractiveness of destination j as perceived by household type m; α^m is a parameter reflecting scale economies for household type m; d_ij is the distance between origin i and destination j; and β^m is the distance decay parameter for household type m. Summing over the expenditure flows for each store provides an estimate of store turnover.

The change in the income circumstances of the households affected by the plant closure would trigger various event changes, such as the reduction of expenditure on groceries (a reduction in the O_i^m term), change of retail destination (moving to less expensive stores, that is, a change in W_j for certain stores), etc. Figure 6 depicts the predicted weekly loss of revenue from the grocery outlets in the area affected by the plant closure after running this model in Leeds (the models were run prior to a new Tesco store opening in Seacroft). Thus Figure 6 shows the revenue loss when demand (O_i^m) in the Seacroft census tracts has been reduced and the model re-run with these lower expenditure values.

Figure 6. Grocery revenue losses at stores in the areas affected by the plant closure.

The important issue to address then is how low revenues will be allowed to fall before particular stores close and further jobs are lost in the area. As noted above, it is likely that smaller independent stores will be most directly affected, since most of their revenue comes from local residents. Of the four main stores identified here we estimate that most pressure will occur on the locally owned Co-op store. Revenue losses here are estimated to be of the order of 20–30% of weekly income. Such reductions for the larger retailers are more likely to be absorbed by those chains. Clearly, such analysis could be undertaken for all other types of retail activity. The induced impacts may also concern the reduction in spending on
locally made retail goods. In measuring further effects on the Leeds economy (and indeed various sub-regions of Leeds), the reduction in locally produced items may be an additional significant impact.

3.3. Inter-industry Links and Socio-economic Impact Assessment

Despite their lack of small-area spatial detail, input–output models have an excellent history of modelling the multiplier effects of industrial change on other sectors of the economy, especially those that supply goods and services directly. Given tables of inter-industry links, the models will endogenously estimate the impacts on other major firms or industrial sectors. In the future, embedding such a model into the microsimulation framework would be a major enhancement. For now we draw on known (rather than modelled) impacts in the Leeds economy following the plant closure example described in Section 2.

The engineering plant in Seacroft that closed had two major suppliers from elsewhere in Leeds (plus many outside the city, which we ignore here). The first was located in the ward of City and Holbeck, which, according to data from the National On-line Manpower Information System (NOMIS), has the highest concentration of manufacturing companies in Leeds. This plant subsequently downsized and we model the impact of 335 additional job losses from this plant. Although we do not have information on the types of jobs lost, we assume an average redundancy rate across all plant workers. A second firm in Seacroft, in the 'financial intermediation' sector, has also downsized as a result of the original plant closure. One hundred jobs are estimated to have been lost here also. Tables 3 and 4 show the occupational breakdown of the employees who were laid off in the two firms.

Table 3. Workforce occupational structure of the plant supplier located at City and Holbeck (SIC 3: metal goods, engineering and vehicles)
Occupational group: lay-offs
Managers: 15
Professional: 60
Intermediate non-manual: 30
Junior non-manual: 30
Skilled manual: 100
Unskilled manual: 100
Total: 335

Table 4. Workforce occupational structure of the financial intermediation firm located at Seacroft (SIC 8)
Occupational group: lay-offs
Managerial: 5
Professional: 50
Intermediate non-manual: 22
Junior non-manual: 23
Total: 100

Figure 7 shows the travel-to-work flows to City and Holbeck that will be important in looking at the spatial impacts of the loss of jobs at the supplier firm in this ward. On the basis of these flows, we used SimLeeds to estimate the spatial distribution of the supplying firms' workforce. Figure 8 depicts the spatial distribution of the laid-off employees of the City and Holbeck supplying plant. The pattern shows a concentration of skilled and unskilled manual workers living in the immediate locality, and professional (management) workers commuting in from the suburbs. Figure 9 shows the spatial distribution of the income loss throughout the city as a result of these job losses. Using the same methodology we estimated the spatial distribution of the laid-off workforce of the financial firm in Seacroft. We also calculated the subsequent income effect. It is interesting now to look at the overall picture of income loss
around Leeds from all these job losses. Figure 10 depicts the spatial distribution of the total annual income loss which resulted from the initial plant closure. As would be expected, the areas around the plant location were affected the most. Nevertheless, other localities around Leeds experienced second-order effects, caused by the job lay-offs at the plant suppliers. It can be argued that the analysis presented here has the advantage of identifying the employment and income effects for localities within a region or city, whereas, as pointed out above, traditional regional multiplier models just give the overall impact at the city or regional level. As can be seen from the map, in contrast to many aggregate studies, the analysis shows that large parts of the city (and hence the region) will be largely unaffected by the original plant closure.

Figure 7. Travel-to-work flows to City and Holbeck.
Figure 8. Estimated spatial distribution of the laid-off workforce of the City and Holbeck supplier.
Figure 9. Income loss resulting from the City and Holbeck supplier job losses.
Figure 10. Total income loss around Leeds.

3.4. Long-term Labour Market Dynamics

The next stage of modelling the multiplier effects of our plant closure is to assess, or speculate on, the longer-term dynamics of the local labour force. In particular, given the different socio-economic groups in a locality, we are interested in how many former employees of the Seacroft plant will be able to find jobs instantly, how many will be re-employed within (say) 6 months, and how many will remain long-term unemployed. To help us 'calibrate' this type of analysis we can draw on many more qualitative analyses undertaken elsewhere. In this type of
work, interviews are often carried out to investigate how workers made redundant following plant closures fare in adapting to the changing labour market, and how long-term unemployment increases for those unable to retrain. This kind of analysis can provide useful insights into the behaviour of different types of households under the new circumstances, which can also be used in order to formulate hypotheses in quantitative modelling frameworks.

For example, Tomaney et al. (1999) describe the impact of the Swan Hunter shipyard closure on the local economy and community of north-east England. They present results of a survey carried out after the closure and point out that the former workers of the shipyard had different post-closure fortunes, which clearly depended on their socio-economic and demographic characteristics. In other words, there were some groups of the shipyard's workforce that had no difficulty finding new employment, whereas others remained unemployed for a long time or had to retire, take up lower-paid jobs or enter training or education. As Tomaney et al. (1999) point out:

Notably, one of the firm's key assets, the design team, discovered substantial demand for its skills, as the campaigners against the yard's closure had predicted. Other groups of workers found it more difficult to find work. The workforce is revealed, therefore, not as an undifferentiated mass of 'redundants' but as a collection of groups, each with different skill profiles and labour market prospects. (Tomaney et al., 1999, p. 410)

There are many examples of similar studies in the literature. Recently, Shuttleworth et al. (2005) examined the fortunes of employees made redundant in the Belfast region following the closure of the Harland and Wolff shipbuilding yards.
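One way to turn survey evidence of this kind into microsimulation 'rules' is sketched below: each simulated redundant worker is flagged as long-term unemployed with a probability that depends on their socio-economic group, mirroring the percentages reported by Tomaney et al. (1999) and reproduced in Table 6 below. The record fields and the default probability for unlisted groups are assumptions of this sketch, not SimLeeds parameters.

```python
# Illustrative Monte Carlo application of group-specific long-term unemployment
# probabilities to a simulated redundant workforce. The probabilities mirror the
# Tomaney et al. (1999) percentages; the record fields and the default value for
# other groups are assumptions of this sketch.
import random

p_still_unemployed = {          # probability of no job two years after redundancy
    "unskilled": 0.47,
    "skilled manual": 0.42,
    "managers": 0.19,
    "clerical workers": 0.26,
}

def flag_long_term_unemployed(workers, rng=random):
    """Attach a long_term_unemployed flag to each simulated worker."""
    flagged = []
    for worker in workers:
        p = p_still_unemployed.get(worker["seg"], 0.35)   # assumed default
        flagged.append({**worker, "long_term_unemployed": rng.random() < p})
    return flagged

redundant = [{"id": 1, "ward": "Seacroft", "seg": "skilled manual"},
             {"id": 2, "ward": "Whinmoor", "seg": "managers"}]
print(flag_long_term_unemployed(redundant))
```

Aggregating the resulting flags by ward of residence gives the kind of small-area long-term unemployment estimates mapped in Figure 11.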
Danson (2005) provides a more general discussion of models and analysis to examine the employability of those made redundant in old industrial regions. The findings of qualitative studies such as these can provide useful insights when formulating the 'rules' that determine the likely behaviour of households in a spatial microsimulation what-if policy analysis framework. Further, it is possible to use panel data from surveys such as the BHPS to model the life paths of particular individuals and households who have moved into and out of work. Table 5 shows the life histories of a sample of simulated individuals who worked in the SIC 3 industrial category and became unemployed in year t.

Table 5. Labour market status dynamics of simulated individuals (based on the British Household Panel Survey)
Person ID | Job status t | Job status t+1 | Job status t+2 | Job status t+3
12212423 | Unemployed | Employed, junior non-manual, industry SIC6 | Employed, junior non-manual, industry SIC6 | Employed, intermediate non-manual, industry SIC3
12634441 | Unemployed | Employed, junior non-manual, industry SIC3 | Employed, skilled manual, industry SIC3 | Employed, skilled manual, industry SIC3
13151703 | Unemployed | Unemployed | Unemployed | Employed, professional, industry SIC3
14426951 | Unemployed | Unemployed | Employed, skilled manual, industry SIC3 | Unemployed
10014608 | Unemployed | Employed, manager, industry SIC3 | Employed, manager, industry SIC3 | Employed, manager, industry SIC3

These data can be used in combination with data from studies such as that of Tomaney et al. (1999) to extend models such as SimLeeds in order to simulate long-term labour market dynamics. Table 6 gives the percentages of different sub-groups of the workforce in the Tomaney et al. (1999) study that were still unemployed 2 years after the plant closure.

Table 6. Percentages of unemployed individuals who could not find a job 2 years after the plant closure (after Tomaney et al., 1999)
Age group: 20–29: 29%; 40–49: 37%; 50–64: 48%
Socio-economic group: unskilled: 47%; skilled manual: 42%; managers: 19%; clerical workers: 26%

As can be seen, socio-economic status and age group are crucial in determining the probability of finding another job quickly. It is possible to use these probabilities in the context of spatial microsimulation models such as SimLeeds in order to estimate the numbers of simulated individuals who would be out of work 2 years after our original plant closure in Seacroft. Figure 11 shows the spatial distribution of the simulated long-term unemployed individuals, assuming that the probabilities described in Table 6 apply to the simulated workforce of the hypothetical plant and the businesses linked to the plant (this assumption can be relaxed by factoring those probabilities by the number of vacancies in the surrounding areas).

4. Concluding Comments

In this paper we have presented a new framework for the analysis of the impacts of job losses on households within the catchment area of particular establishments that have been shut down. We have presented SimLeeds, which is a spatial microsimulation model for the local labour market of Leeds, and we have argued that models like SimLeeds can be used to develop a new framework for impact analysis based on local rather than regional multipliers of socio-economic change. The potential of such a framework has been illustrated by reference to modelling changes in household incomes and expenditure, changes in subsequent revenues for food retailers in the area, multiplier effects on other supplier plants in the region and
forecasts of future spatial patterns of both the long-term unemployment and the employability of the redundant workers.

There are many further challenges in expanding this framework. It would be an exciting challenge to merge microsimulation and input–output techniques to model the multiplier effects of job losses more directly on other sectors of the economy. It would also be useful to build more dynamics of lifestyle behaviours into the household information set. A start has been made here (see Ballas et al., 2005a) to generate more detailed life history information. The extended version of the model will include 'household formation' procedures. Further, SimLeeds could be developed to take into account the demand side of the local labour market. There are examples of using microsimulation to model the behaviour of firms and organizations (e.g. Danson & Lavercombe, 1996; Van Wissen, 2000). Our future work will use this as a basis for a spatial microsimulation model of firms in Leeds. This will then be linked to the household model in order to provide an even more powerful tool for urban and regional policy analysis.

Figure 11. Spatial distribution of simulated 'long-term' unemployed, as a result of the plant closure.

This paper has demonstrated that spatial microsimulation is an important methodology in economic geography and regional science for socio-economic impact assessment. However, it can be argued that just a fraction of the possible uses of spatial microsimulation in a labour market context has been shown here. Ongoing computational advances offer an enabling environment for large-scale microsimulations of all the subsystems that make up the economies of cities and regions. New developments in hardware and software, as well as increased data availability, offer the potential for modelling the interactions between all the entities and urban subsystems that make up a regional economy. In particular, models such as SimLeeds can be further developed in order to include more household variables and improved estimates of income and wealth, and to incorporate all the urban subsystems that make up the local socio-economic structure (i.e. education, health provision, etc.). Moreover, dynamic modelling procedures can be incorporated in order to render models such as SimLeeds capable of performing sophisticated what-if local multiplier analysis of different social policy initiatives.

In addition, the use of spatial microsimulation methodologies in economic geography and regional science contexts can significantly complement both analytical regional economics and spatial econometrics. In particular, it is possible to link spatial microsimulation models to regional input–output models and regional econometric models. For instance, the predictions of input–output models for different sectors of the local economy can be spatially disaggregated with the use of models such as SimLeeds. The link of spatial microsimulation to input–output models offers an exciting research agenda for the future. Likewise, predictions of regional econometric models for the whole region can be disaggregated at the individual and household level with the use of spatial microsimulation.

References
Ballas, D. (2001) A spatial microsimulation approach to local labour market policy analysis, unpublished PhD thesis, School of Geography, University of Leeds.
Ballas, D. & Clarke, G. P. (2000) GIS and microsimulation for local labour market analysis, Computers, Environment and Urban Systems, 24, 305–330.
Ballas, D. & Clarke, G. P. (2001) Towards local implications of major job transformations in the city: a spatial microsimulation approach, Geographical Analysis, 33, 291–311.
Ballas, D., Clarke, G. P., Feldman, O., Gibson, P., Jianhui, J., Simmonds, D. & Stillwell, J. (2005a) A spatial microsimulation approach to land-use modelling, CUPUM 2005 (Computers in Urban Planning and Urban Management) Conference Proceedings, UCL, London, 29 June–1 July 2005. Internet site: http://128.40.59.163/cupum/searchPapers/papers/paper276.pdf
Ballas, D., Rossiter, D., Thomas, B., Clarke, G. P. & Dorling, D. (2005b) Geography Matters: Simulating the Local Impacts of National Social Policies, York, Joseph Rowntree Foundation.
Batey, P. W. & Madden, M. (1983) The modelling of demographic–economic change within the context of regional decline, Socio-economic Planning Sciences, 17, 315–328.
Batey, P. W. & Madden, M. (1999) The employment impact of demographic change: a regional analysis, Papers in Regional Science, 78(1), 69–88.
Batey, P. W. J. & Madden, M. (2001) Socio-economic impact assessment: meeting client requirements, in: G. P. Clarke & M. Madden (eds) Regional Science in Business, pp. 37–60, Berlin, Springer.
Berthoud, R. & Geshuny, J. (eds) (2000) Seven Years in the Lives of British Families, Bristol, Public Policy Press.
Birkin, M. & Clarke, G. P. (1995) Using microsimulation methods to synthesize census data, in: S. Openshaw (ed.) Census Users' Handbook, pp. 363–387, London, GeoInformation International.
Birkin, M., Clarke, G. P. & Clarke, M. (1996) Urban and regional modelling at the microscale, in: G. P. Clarke (ed.) Microsimulation for Urban and Regional Policy Analysis, pp. 10–27, London, Pion.
Birkin, M. & Clarke, M. (1988) SYNTHESIS: a synthetic spatial information system for urban and regional analysis: methods and examples, Environment and Planning A, 20, 1645–1671.
Clarke, G. P. (1996) Microsimulation: an introduction, in: G. P. Clarke (ed.) Microsimulation for Urban and Regional Policy Analysis, pp. 1–9, London, Pion.
Danson, M. (2005) Old industrial regions and employability, Urban Studies, 42(2), 285–300.
Danson, M. & Lavercombe, A. (1996) Simulating changes in the production process of a firm, in: G. P. Clarke (ed.) Microsimulation for Urban and Regional Policy Analysis, pp. 187–200, London, Pion.
Gibbins, C. & Julian, G. (2006) Family Spending: a Report on the 2004–05 Expenditure and Food Survey, UK Office for National Statistics Report, Basingstoke, Palgrave Macmillan.
Hewings, G., Okuyama, Y. & Sonis, M. (2001) Creating and expanding trade partnerships within the Chicago Metropolitan area: applications using a Miyazawa Accounting System, in: G. P. Clarke & M. Madden (eds) Regional Science in Business, pp. 11–36, Berlin, Springer.
Jin, Yu-xian & Wilson, A. G. (1993) Generation of integrated multispatial input–output models of cities (GIMIMoC): 1: initial stage, Papers in Regional Science, 72(4), 351–368.
Jun, Myung-Jin (1999) An integrated Metropolitan model incorporating demographic–economic, land-use and transport models, Urban Studies, 36(8), 1399–1408.
Krupp, H. (1986) Potential and limitations of microsimulation models, in: G. H. Orcutt, J. Mertz & H. Quinke (eds) Microanalytic Simulation Models to Support Social and Financial Policy, Amsterdam, North-Holland.
Mertz, J. (1991) Microsimulation: a survey of principles, developments and applications, International Journal of Forecasting, 7, 77–104.
Nakaya, T., Yano, K., Fotheringham, A. S., Ballas, D. & Clarke, G. P. (2003) Retail interaction modelling using meso and micro approaches, paper presented at the 33rd Regional Science Association, RSAI British and Irish Section conference, St Andrews, Scotland, 20–22 August 2003.
Orcutt, G. H., Mertz, J. & Quinke, H. (eds) (1986) Microanalytic Simulation Models to Support Social and Financial Policy, Amsterdam, North-Holland.
Shuttleworth, I., Tyler, P. & Mckinstry, D. (2005) Redundancy, readjustment and employability: what can we learn from the 2000 Harland and Wolff redundancy?, Environment and Planning A, 37, 1651–1668.
Tomaney, J., Pike, A. & Cornford, J. (1999) Plant closures and the local economy: the case of Swan Hunter on Tyneside, Regional Studies, 33(5), 401–411.
Van Wissen, L. (2000) A micro-simulation model of firms: applications of concepts of the demography of the firm, Papers in Regional Science, 79, 111–134.
Williamson, P., Birkin, M. & Rees, P. (1998) The estimation of population microdata by using data from small area statistics and samples of anonymized records, Environment and Planning A, 30, 785–816.
Wilson, A. G. (2000) Complex Spatial Systems: the Modelling Foundations of Urban and Regional Analysis, London, Prentice Hall.