Geographic Information Research
Geographic Information Research: Trans-Atlantic Perspectives
EDITED BY
MASSIMO CRAGLIA University of Sheffield, UK
HARLAN ONSRUD University of Maine, USA
ORGANISING COMMITTEE
HANS-PETER BÄHR, KEITH CLARKE, HELEN COUCLELIS, MASSIMO CRAGLIA, HARLAN ONSRUD, FRANÇOIS SALGÉ, GEIR-HARALD STRAND
UK Taylor & Francis Ltd, 1 Gunpowder Square, London, EC4A 3DE
USA Taylor & Francis Inc., 1900 Frost Road, Suite 101, Bristol, PA 19007
This edition published in the Taylor & Francis e-Library, 2005.
“To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”
Copyright © Taylor & Francis 1999
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, electrostatic, magnetic tape, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owner.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library.
ISBN 0-203-21139-1 Master e-book ISBN
ISBN 0-203-26900-4 (Adobe eReader Format)
ISBN 0-7484-0801-0 (paper)
Library of Congress Cataloguing-in-Publication Data are available
Cover design by Hybert Design and Type, Waltham St Lawrence, Berkshire
Cover printed by Flexiprint, Lancing, West Sussex
Contents

Preface
European Science Foundation and National Science Foundation
Contributors

1 Introduction and Overview
  Massimo Craglia and Harlan Onsrud

PART ONE GI and Society: Infrastructural, Ethical and Social Issues
2 Spatial Information Technologies and Societal Problems
  Helen Couclelis
3 Information Ethics, Law and Policy for Spatial Databases: Roles for the Research Community
  Harlan Onsrud
4 Building the European Geographic Information Resource Base: Towards a Policy-Driven Research Agenda
  Massimo Craglia and Ian Masser
5 GIS, Environmental Equity Analysis and the Modifiable Areal Unit Problem (MAUP)
  Daniel Sui
6 National Cultural Influences on GIS Design: A Study of County GIS in King County, WA, USA and Kreis Osnabrück, Germany
  Francis Harvey
7 The Commodification of Geographic Information: Some Findings from British Local Government
  Steve Capes
8 Nurturing Community Empowerment: Participatory Decision-Making and Community-Based Problem Solving Using GIS
  Laxmi Romasubramaian
9 Climbing Out of the Trenches: Issues in Successful Implementation of Spatial Decision-Support System
  Paul Patterson

PART TWO GI for Analysis and Modelling
10 Spatial Models and GIS
  Michael Wegener
11 Progress in Spatial Decision Making Using Geographic Information Systems
  Tim Nyerges
12 GIS and Health: From Spatial Analysis to Spatial Decision Support
  Anthony Gatrell
13 The Use of Neural Nets in Modelling Health Variations – The Case of Västerbotten, Sweden
  Örjan Pettersson
14 Interpolation of Severely Non-Linear Spatial Systems with Missing Data: Using Kriging and Neural Networks to Model Precipitation in Upland Areas
  Joanne Cheesman and James Petch
15 Time and Space in Network Data Structures for Hydrological Modelling
  Vit Vozenilek
16 Approach to Dynamic Wetland Modelling in GIS
  Carsten Bjornsson
17 Use of GIS for Earthquake Hazard and Loss Estimation
  Stephanie King and Anne Kiremidjian
18 An Evaluation of the Effects of Changes in Field Size and Land Use on Soil Erosion Using a GIS-Based USLE Approach
  Philippe Desmet, W. Ketsman, G. Gowers
19 GIS for the Analysis of Structure and Change in Mountain Environments
  Anna Kurnatowska

PART THREE GIS and Remote Sensing
21 Multiple Roles for GIS in Global Change Research
  Mike Goodchild
22 Remote Sensing and Urban Analysis
  Hans-Peter Bähr
23 From Measurement to Analysis: A GIS/RS Approach to Monitoring Changes in Urban Density
  Victor Mesev
24 Landscape Zones Based on Satellite Data for the Analysis of Agrarian Systems in Fouta Djallon (Guinea) using GIS
  Eléonore Wolff
25 Spatial-Time Data Analysis: The Case of Desertification
  Julia Maria Seixas
26 The Potential Role of GIS in Integrated Assessment of Global Change
  Milind Kandlikar

PART FOUR Methodological Issues
27 Spatial and Temporal Change in Spatial Socio-Economic Units
  Jonathan Raper
28 Spatial-Temporal Geostatistical Kriging
  Eric Miller
29 Using Extended Exploratory Data Analysis for the Selection of an Appropriate Interpolation Model
  Felix Bucher
30 Semantic Modelling for Oceanographic Data
  Dawn Wright
31 Hierarchical Wayfinding – A Model and its Formalisation
  Adrijana Car
32 Integrated Topological and Directional Reasoning in Geographic Information Systems
  Jayant Sharma
33 Distances for Uncertain Topological Relations
  Stephan Winter

PART FIVE Data Quality
34 Spatial Data Quality for GIS
  Henri Aalders and Joel Morrison
35 Assessing the Impact of Data Quality on Forest Management Decisions using Geographical Sensitivity Analysis
  Susanna McMaster
36 Probability Assessment for the Use of Geometrical Metadata
  François Vauglin
37 Assessing the Quality of Geodata by Testing Consistency with respect to the Conceptual Data Schema
  Gerhard Joos
38 Data Quality Assessment and Perspectives for Future Spatial-Temporal Analysis Exemplified through Erosion Studies in Ghana
  Anita Folly

PART SIX Visualisation and Interfaces
39 Visual Reasoning: The Link Between User Interfaces and Map Visualisation for Geographic Information Systems
  Keith Clarke
40 Visualisation of Multivariate Geographic Data for Exploration
  Aileen Buckley
41 Metaphors and User Interface Design
  David Howard
42 Universal Analytical GIS Operations – A Task-Oriented Systematisation of Data Structure-Independent GIS Functionality
  Jochen Albrecht

Postscript
  Ian Masser
Index
Preface
Massimo Craglia and Harlan Onsrud

This volume is the outcome of the Second International Summer Institute in Geographic Information held in Berlin in the summer of 1996. The meeting was sponsored jointly by the European Science Foundation (ESF) GISDATA programme and the US National Science Foundation (NSF) through the National Center for Geographic Information and Analysis (NCGIA). Like the Summer Institute held the year before in the US, this event was extraordinary in a number of ways. First, the participants came in equal numbers from both sides of the Atlantic, a very unusual feature compared to most international geographic information meetings, which tend to be dominated by either European or American participants. Second, the duration of the Institute, which included six full days of meetings and one day for technical excursions, allowed for considerable breadth and depth of interaction among the participants. Third, the majority of participants were at the early stages of their scientific career, as they were completing or had recently completed their doctoral research, and were all selected on the basis of high quality and originality of work in open competitions, one in Europe and one in the USA. Fresh from research experience in their own fields, the early career scientists could capitalise on the extensive feedback and close interaction with colleagues from other countries doing research in the same area, as well as with senior instructors recognised as leaders in their field.
This volume includes most of the papers presented at the Institute. The papers were revised following peer review and direct feedback from colleagues both during and after the meeting. In many instances a revised paper reflects not only the comments of the specialists selected to review the papers but also the knowledge gained from many hours of face-to-face discussions with participants coming from very different disciplinary perspectives.
The two Summer Institutes were the flagships of the collaboration between the GISDATA scientific programme of the European Science Foundation (Arnaud et al., 1993) and the NCGIA (Abler, 1987). The topics of the Institutes reflected the twelve GISDATA specialist meetings held in 1993–96 and current or recent NCGIA research initiatives in closely related areas. In many respects, the similarity between the areas identified by GISDATA and the NCGIA as being of highest research priority is indicative of the current concerns of the field as a whole. Intense collaboration between the two programmes has resulted in a critical mass of researchers across the Atlantic addressing the issues with input from a wider range of disciplines and perspectives and less duplication of effort. The goals of the two Summer Institutes were to:
• promote basic research in geographic information
• develop human resources, particularly among young researchers, and
• develop international co-operation between US and European scientists.
This volume, and the one that preceded it (Craglia and Couclelis, 1997), are the tangible evidence that the first goal was met. While the papers speak for themselves, their significance as contributions to the development of the field as a whole is further discussed in the Introduction. The achievement of the other two goals is better evaluated with hindsight. Here we can only outline the means by which the organising committee strove to maximise the value of the Summer Institute for the select international group of young (and not-so-young) scientists who participated.
A critical aspect of the success of both Institutes was in bringing together early-career scientists with a substantial subset of the best known senior researchers in geographic information science. The First Summer Institute in the US included 52 participants, of whom 31 were young scholars selected on the basis of the two parallel open competitions and 21 were internationally known researchers. The Second Summer Institute in Berlin in 1996 included 46 participants, of whom 32 were young scholars and 14 were senior scientists. While there was some overlap among the senior scientists between the two Institutes, this was not the case among the young researchers. Therefore, a total of 63 early career scientists had the opportunity to benefit from this unique programme. In their evaluations of the Institutes, the young researchers gave their most glowing ratings to the active presence and day-long availability of several of the “living legends” in the field. These senior scientists gave keynote presentations, taught mini-workshops, led or participated in the ad hoc research project teams that were formed, gave constructive comments to the junior paper presenters, argued vigorously among themselves, co-judged the team projects submitted, guided the field trips, and were enthusiastic participants in the jolly “happy hours” that closed each hard day’s work.
The 1996 Institute in Berlin was organised by a scientific panel including Dr. Harlan Onsrud, Professor Helen Couclelis, and Dr. Keith Clarke from the US and Professor Hans-Peter Bähr, Dr. Massimo Craglia, Mr. François Salgé, and Dr. Geir-Harald Strand from Europe. The programme was set up so as to balance plenary and small-group sessions, lectures and hands-on challenges, academic papers and practical workshops, structured and unstructured events. Above all, it sought to encourage a thorough mix of nationalities, research perspectives, and levels of academic experience, for the purpose of helping prepare tomorrow’s experts for the increasingly international and multidisciplinary working environments they are likely to function in. The topics covered by the keynotes and paper presentations included:
• GIS and societal issues including aspects of ethics and law;
• spatial models and decision-support systems;
• methodological issues in the representation of space and time,
• data quality,
• visualisation and interfaces,
• GIS and remote sensing for urban analysis, and
• applications areas spanning from global change to health.
The key device used at the Institute to foster collaborative work was the formation of six research teams. Each team worked on a hypothetical research proposal in response to a mock solicitation for proposals similar in form to those used by NSF. The enthusiasm with which the groups worked until late hours to develop their proposals was one of the hallmarks of the Institute, and a prize for the best proposal was awarded by an international panel of experts following the written submissions and oral presentations. What was particularly noteworthy was not only the extent to which young researchers developed valuable skills by drawing on the experience of seasoned academics in writing these proposals, but also the way in which people coming from very different backgrounds were able to join together and work effectively as a team.
The Institute was held in the superb Villa Borsig on the shores of Lake Tegel, and we are indebted to Professor Hans-Peter Bähr for organising it, to Dr. Sigrid Roessner for her excellent tour of Potsdam, and to all those individuals who made this Institute such a resounding success. A special thanks to Michael Goodchild and Ian Masser, who spearheaded the NCGIA-GISDATA collaboration of which the two Summer Institutes are the most visible products, and to the two sponsoring organisations, the European Science Foundation and the National Science Foundation, which were represented at Villa Borsig by Professor Max Kaase of the ESF Standing Committee for the Social Sciences, and Professor Mike Goodchild, Director of the NCGIA, respectively. The importance of this initiative for future researchers and teachers in geographic information science cannot be overemphasised. Finally, a particular debt of gratitude is owed to all the manuscript referees, who have at times been put under substantial pressure to provide feedback within a very short time, and to Christine Goacher at Sheffield, who has undertaken the gruelling task of standardising the format of all the chapters while never losing her good sense of humour.

REFERENCES
ABLER, R. 1987. The National Science Foundation National Center for Geographic Information and Spatial Analysis, International Journal of GIS, 1(4), pp. 303–326.
ARNAUD, A., CRAGLIA, M., MASSER, I., SALGÉ, F. and SCHOLTEN, H. 1993. The research agenda of the European Science Foundation’s GISDATA scientific programme, International Journal of GIS, 7(5), pp. 463–470.
CRAGLIA, M. and COUCLELIS, H. (Eds) 1997. Geographic Information Research Bridging the Atlantic. London: Taylor & Francis.
European Science Foundation and National Science Foundation

The European Science Foundation is an association of more than 60 major national funding agencies devoted to basic scientific research in over 20 countries. The ESF assists its Member Organisations in two main ways: by bringing scientists together in its Scientific Programmes, Networks and European Research Conferences, to work on topics of common concern; and through the joint study of issues of strategic importance in European science policy. The scientific work sponsored by ESF includes basic research in the natural and technical sciences, the medical and biosciences, the humanities and social sciences. The ESF maintains close relations with other scientific institutions within and outside Europe. By its activities, ESF adds value by co-operation and coordination across national frontiers and endeavours, offers expert scientific advice on strategic issues, and provides the European forum for fundamental science.
GISDATA is one of the ESF Social Science scientific programmes and focuses on issues relating to Data Integration, Database Design, and Socio-Economic and Environmental applications of GIS technology. This four year programme was launched in January 1993 and is sponsored by ESF member councils in 14 countries. Through its activities the programme has stimulated a number of successful collaborations among GIS researchers across Europe.
The US National Science Foundation (NSF) is an independent federal agency of the US Government. Its aim is to promote and advance scientific progress in the United States. In contrast to other federal agencies which support research focused on specific missions (such as health or defense), the NSF supports basic research in science and engineering across all disciplines. The Foundation accounts for about 25 percent of Federal support to academic institutions for basic research. The agency operates no laboratories itself but does support National Research Centers and certain other activities. The Foundation also supports cooperative research between universities and industry and US participation in international scientific efforts.
The National Center for Geographic Information and Analysis (NCGIA) is a consortium comprised of the University of California at Santa Barbara, the State University of New York at Buffalo, and the University of Maine. The Center serves as a focus for basic research activities relating to geographic information science and technology. It is a shared resource fostering collaborative and multidisciplinary research with scientists across the United States and the world. Support for this collaborative work is currently funded by NSF through the Varenius Project.
Contributors
Henri Aalders, Delft University of Technology, Faculty of Geodetic Engineering, Thijsseweg 11, POB 5030, NL-2600 GA Delft, NETHERLANDS
Jochen Albrecht, Department of Geography, University of Auckland, Private Bag 92019, Auckland, NEW ZEALAND
Hans-Peter Bähr, Universität Karlsruhe (TH), Englerstrasse 7, Postfach 69 80(W), 76128 Karlsruhe 1, GERMANY
Carsten Bjornsson, GISLAB, GISPLAN, Unit of Landscape, Royal Agricultural & Veterinary University, Rolighedsvej 23, 2, 1958 Frederiksberg C, DENMARK
Felix Bucher, University of Zurich, Department of Geography, Winterthurerstrasse 190, 8057 Zurich, SWITZERLAND
Aileen M. Buckley, Department of Geosciences, Oregon State University, Corvallis, OR 97331, USA
Stephen A. Capes, 8 Duncan Road, Sheffield S10 1SN, UK
Adrijana Car, Department of Geomatics, University of Newcastle-upon-Tyne, Newcastle-upon-Tyne NE1 7RU, UK
Joanne E. Cheesman, Department of Geography, University of Manchester, Mansfield Cooper Building, Oxford Road, Manchester M13 9PL, UK
Keith C. Clarke, Department of Geography, University of California, Santa Barbara, CA 93106, USA
Helen Couclelis, NCGIA, University of California, 350 Phelps Hall, Santa Barbara, CA 93106, USA
Massimo Craglia, Department of Town and Regional Planning, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
Philippe Desmet, Laboratory for Experimental Geomorphology, Catholic University of Leuven, Redingenstraat 16, B-3000 Leuven, BELGIUM
Anita Folly, School of Agriculture, Food and Environment, Cranfield University, Silsoe, Bedfordshire MK45 4DT, UK
Anthony C. Gatrell, Institute for Health Research, Lancaster University, Lancaster LA1 4YB, UK
Michael F. Goodchild, National Center for Geographic Information and Analysis, and Department of Geography, University of California, Santa Barbara, CA 93106–4060, USA
G. Gowers, Laboratory for Experimental Geomorphology, Catholic University of Leuven, Redingenstraat 16, B-3000 Leuven, BELGIUM
Francis Harvey, Department of Geography, University of Kentucky, Lexington, KY 40506–0027, USA
David Howard, The Pennsylvania State University, 710 S. Atherton Street, Apt. 300, State College, PA 16801, USA
Gerhard Joos, Institute for Geodesy, University of the Federal Armed Forces Munich, D-85577 Neubiberg, GERMANY
Milind Kandlikar, National Center for Human Dimensions of Global Change, Carnegie Mellon University, Pittsburgh, PA 15213, USA
W. Ketsman, Laboratory for Experimental Geomorphology, Catholic University of Leuven, Redingenstraat 16, B-3000 Leuven, BELGIUM
Stephanie A. King, John A. Blume Earthquake Engineering Center, Department of Civil Engineering, Stanford University, California 94305–4020, USA
Anne S. Kiremidjian, John A. Blume Earthquake Engineering Center, Department of Civil Engineering, Stanford University, California 94305–4020, USA
Anna Kurnatowska, University of Warsaw, Department of Geography and Regional Studies, ul. Krakowskie Przedmiescie 30, 00–927 Warsaw, POLAND
Ian Masser, Division of Urban Planning and Management, ITC, P.O. Box 6, 7500 AA Enschede, NETHERLANDS
Susan McMaster, ACSM Department of Geography, University of Minnesota, 414 Social Sciences Building, Minneapolis, MN 55455, USA
Victor Mesev, ESRC Research Fellow, Department of Geography, University of Bristol, University Road, Bristol BS8 1SS, UK
Joel Morrison, Chief, Geography Division, US Bureau of the Census, Washington, D.C. 20233–7400, USA
Eric J. Miller, Office of Research, OCLC Online Computer Library Center, Dublin, Ohio, USA
Tim Nyerges, Department of Geography, University of Washington, Seattle, WA 98195, USA
Yelena Ogneva-Himmelberger, Graduate School of Geography, Clark University, Worcester, MA, USA
Harlan Onsrud, Department of Spatial Information Science and Engineering and National Center for Geographic Information & Analysis, University of Maine, Orono, Maine 04469, USA
Paul Patterson, Onkel-Tom St. #112, 14169 Berlin, GERMANY
James Petch, Manchester Metropolitan University, Department of Environmental and Geographical Sciences, John Dalton Building, Chester Street, Manchester M1 5GD, UK
Örjan Pettersson, Department of Social and Economic Geography, Umeå University, S-901 87 Umeå, SWEDEN
Jonathan Raper, Department of Geography, Birkbeck College, 7–15 Gresse Street, London W1P 1PA, UK
Laxmi Romasubramaian, Department of Geography, University of Auckland, Private Bag 92019, Auckland, NEW ZEALAND
Julia Maria Seixas, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Quinta da Torre, 2825 Monte de Caparica, PORTUGAL
Jayant Sharma, Oracle Corporation, One Oracle Drive, Nashua, NH 03062, USA
Daniel Z. Sui, Department of Geography, Texas A&M University, College Station, TX 77845–3147, USA
François Vauglin, IGN/COGNIT, 2 Av. Pasteur, 94160 Saint Mande, FRANCE
Vit Vozenilek, Department of Geography, Palacky University, tr. Svobody 26, 771 46 Olomouc, CZECH REPUBLIC
Michael Wegener, Institut für Raumplanung, University of Dortmund, Postfach 500500, D-44221 Dortmund, GERMANY
Stephan Winter, Dept. of Geoinformation, Technical University of Vienna, Gusshausstrasse 27–29, 1040 Vienna, AUSTRIA
Eléonore Wolff, Institut de Gestion de l’Environnement et d’Aménagement du Territoire, CP246, Bd. du Triomphe, B-1050 Bruxelles, BELGIUM
Dawn Wright, Department of Geosciences, Oregon State University, Corvallis, OR 97331–5506, USA
Chapter One
Introduction and Overview
Massimo Craglia and Harlan Onsrud
This book brings together some of the latest research in areas of strategic importance for the development of geographic information science across both sides of the Atlantic. With its predecessor, Geographic Information Research Bridging the Atlantic, also published by Taylor & Francis (Craglia and Couclelis, 1997), it spans the entire range of research topics identified by the European Science Foundation’s GISDATA programme (Arnaud et al., 1993) and by the NCGIA (Abler, 1987). These two programmes have been crucial in Europe and the USA in shaping and developing the geographic information (GI) research agenda during the 1990s. As we move towards the year 2000 we see new topics emerging, such as a greater focus on the societal implications of geographic information science advancement, alongside some of the traditional ones such as spatial analysis and modelling. We are also starting to see geographic information research moving away from the core disciplines of geography, surveying, and computer science to a wider set of disciplines in the environmental and social sciences. This creates new challenges but also offers opportunities to address together some of the long-standing problems identified in this area. Hence we see a greater contribution of philosophers and cognitive scientists to fundamental issues related to the perception and representation of space and the integration of time, the emergence of new specialities such as digital information law and economics, and of new areas of application such as health.
These broad trends are well represented in this book, which is divided into six sections. The first on “GI and Society” addresses important topics like societal impacts, empowerment, equity, commodification, and ethics. These provide a powerful signal of one of the new directions for GI research which is set to grow even more in importance over the next few years. The second section on “GI for Spatial Analysis and Modelling” could be considered as a more traditional concern of GI research, and yet we see not only an increased sophistication in the approaches being developed but also new important areas of application such as health. Section Three on “GIS and Remote Sensing” indicates the increasing importance that new data sources will have for many disciplines. Whilst remote sensing has been mainly the domain of the environmental sciences in the past, the future increased availability of high resolution imagery is opening up new opportunities for the social sciences including urban planning and analysis. Section Four on “Methodological Issues” gives evidence of the multidisciplinary effort under way to address key outstanding issues such as the inclusion of time into spatial analysis and the formalisation of human cognition of space. The chapters included under this heading illustrate the extent to which the boundaries of GI science are being extended to encompass other disciplines and address more effectively the complexity of the real world. Section Five addresses another topic of growing importance, that of “Data Quality”. Of course data quality has always been important, but the much increased availability of digital data from a variety of sources including the Internet brings it much more to the fore and requires further research to develop models that are not only comprehensive but also operational for most users. Finally, the last section of the book addresses “Visualisation and Interfaces”. These are areas where current geographic information systems are still relatively primitive and the need for research is pressing if the opportunities of data rich environments are to be exploited for scientific inquiry. Many challenges and opportunities are therefore presented in these six sections as detailed below.

1.1 PART ONE: GI AND SOCIETY: INFRASTRUCTURAL, ETHICAL AND SOCIAL ISSUES
The opening chapter of this first section of the book by Helen Couclelis aptly sets the scene for what has been a thorny issue for GI researchers over the last few years, particularly since the publication of Ground Truth (Pickles, 1995). Helen captures the essence of the debate in typical Greek philosophical fashion by way of a thesis (does GIS cause societal problems?) and its antithesis (does GIS help alleviate societal problems?). The chapter builds on the wide ranging discussions held in the US between social theorists and techno-enthusiasts in the framework of a special initiative on this topic organised by the NCGIA (I-19). The synthesis is of course that both propositions may be true as GIS, like information itself, is thoroughly embedded in the social context in which it operates. Hence, geographic information scientists have a special responsibility to ensure that all the aspects of their discipline are carefully investigated, and the results are widely disseminated. Ignorance is the worst enemy of society.
The role of the geographic information science academic sector in helping define an ethical framework for GI use is explored by Harlan Onsrud in Chapter 3, thus directly building on the discussion by Couclelis. Harlan makes the important point that rapid technological progress and increased availability of digital data related to individuals are moving the whole field into an ethically grey zone with fuzzy boundaries. What to some is “smart business practice” giving a competitive advantage, for others is plain unethical behaviour.
This is largely due to the inertia with which recognised legal and deontological frameworks respond to the new challenges posed by technology and societal change. In this grey zone, there is often the tendency either to do nothing or to do too much on the basis of emotional responses. Hence what is needed is close scrutiny of current practices to gather the evidence necessary for an informed debate on information policy and professional conduct, a role well suited to socially-aware GI researchers.
A European perspective on some of these same issues is provided by Massimo Craglia and Ian Masser in Chapter 4. The authors give an overview of the recent developments towards a European Geographic Information Infrastructure (EGII) by the European Commission. This infrastructure is similar in concept to the American National Spatial Data Infrastructure, and is all the more necessary in Europe given the enormous variations that exist across national boundaries which inhibit social and economic analysis. The key difference between the European and USA experience is, however, in the extent of high level political support that a strategic initiative of this kind receives. Hence, the authors argue that in spite of the recent flurry of activity, the future of an EGII is still unclear and they put forward a policy-driven research agenda which gives greater emphasis to social and methodological issues than technical ones. As many of the topics identified lie outside the traditional strengths of the European GI research community and require increased inter-disciplinary efforts, they represent a formidable challenge and opportunity for European GI research.
A practical case-study of the social and ethical issues involved in the use of GIS-based analyses is presented in Chapter 5 by Daniel Sui, who investigates effects of the Modifiable Areal Unit Problem on the results of GIS-based environmental equity analysis.
As he clearly demonstrates in his study of the relationship between the extent of toxic waste and the socio-economic characteristics of neighbourhoods in Houston, Texas, by varying the scale of analysis and areal framework it is possible to arrive at any desired result, with the added bonus of having a techno-gloss on it via GIS. On this basis Daniel convincingly argues that GIS, like any other information handling technology, can shape our understanding of social reality so that the effects are due not to the phenomena being measured but to the tools measuring them. This adds to the mounting evidence that technology is not value neutral but a social construct to be critically challenged and evaluated.
Given the social dimension of technology, and of GIS, are there any cultural differences in the way GIS is implemented and used? Francis Harvey addresses this issue in Chapter 6 through two case studies of GIS implementation in public agencies in the USA and Germany. Building on the conceptual framework developed by Hofstede (1980) in his cross-national study of IBM employees, Francis considers the apparently very different approaches adopted in the two case-studies, which reflect the administrative and organisational cultures and traditions as much as they speak about national traits. In this study, the German approach appears strongly encased in hierarchy and adherence to procedures as much as the American appears flexible to the point of becoming chaotic. Yet beneath these first impressions, it is clear that both case-studies share the need for all the actors involved to negotiate, whether openly or covertly, thus reinforcing the perspective on the social dimension of GIS, a dimension mediated by national, professional and other forms of cultural identity.
Stephen Capes in Chapter 7 appears to break from this “social” stream to address the issue of the increasing treatment of digital (geographic) information as a commodity.
The break is only apparent though because much of recent discussion on access to data has ended up focusing on the price of data, contrasting the US (federal) model of free data with the British one of highly charged data. As Stephen argues, this is a very simplistic and not very helpful view. In the first place, being a commodity is more than just commercialisation, i.e. vigorous charging. It also includes dissemination, exchange, and creating value-added information services. In the second place, he shows that the well publicised practice of the British Ordnance Survey to recover a high proportion of its costs from the sale of data is not shared by other public agencies such as local government. Using evidence from local government in Britain, Stephen demonstrates that all four facets of commodification are present, largely based on organisational mandates and cultures. Finally, he argues that on this evidence the position in Britain and the US is not dissimilar, if one cares to look to the practices of USA state and local governments as well. Given the increasing pressures on all governments to reduce their spending and look for new forms of revenue, this chapter is a valuable contribution to a debate often based on cliché rather than facts. In Chapter 8, Laxmi Romasubramaian addresses some of the points raised in Chapter 2 by Helen Couclelis and provides evidence that GIS can also help empower local communities in making choices about their future. On the basis of a broad review of the literature on GIS adoption and implementation, the social dimension of GIS and documented case-studies of GIS use in local communities, Laxmi acknowledges that GIS offers opportunities but is also affected by numerous constraints. These include technical, organisational, data-related, and skill issues, particularly for local communities.
The most important point the author makes though is that the critics of GIS in local communities, who argue that GIS centralises decision-making and alienates non-technical users, are not so much making a point about GIS but giving an indictment of the decision-making process. Therefore, what is required is a much wider use of participatory approaches in GIS implementation which build upon the experiences of participatory urban planning. The final chapter of this section, by Paul Patterson, looks at spatial decision support systems (SDSS) in an organisational context. Using the example of routing software that has been implemented across many different organisational settings, Paul draws lessons for implementation which centre on the appropriateness
of data resolution, user feedback and interaction, ability to include extensions and customisation, and broader organisational issues. As he argues, if SDSS are ever to be usefully implemented it is important that implementation experiences are disseminated and discussed, trying to extract generally valid approaches from the specific experiences.

1.2 PART TWO: GI FOR ANALYSIS AND MODELLING

Part 2 of the book focuses on GIS and spatial analysis and modelling, topics that continue to be at the core of the GI research community. The opening chapter by Michael Wegener focuses on GIS and spatial models, as these are critical to any forecasting activity in both social and environmental sciences. Michael provides a very useful classification of models used in both these scientific fields based on their comprehensiveness, structure, theoretical foundations, and techniques, and investigates the extent to which GIS offers real opportunities for the development of new models which were not previously possible or thought of. On the basis of his thorough review, he concludes that the potential for GIS to offer new data structures for spatial models represents the most promising challenge of GIS for spatial modelling. He also makes the important point that the increasing complexity of environmental and social problems requires the use of integrated spatial information systems cutting across disciplines which have traditionally developed separately. In the following Chapter 11, Timothy Nyerges provides a comprehensive review of the state of development of GIS for spatial decision support. Using a theoretical framework called Enhanced Adaptive Structuration Theory, Timothy explores the inputs necessary to decision-making, the process of interaction involved in spatial decision making, and the outcomes of this process.
The progress made to date in the use of GIS for spatial decision-making is explored by way of seven research questions, which should be of extreme interest to any PhD student looking for a significant thesis. Timothy also identifies appropriate methodologies to investigate them, and urges researchers to take up the challenge to move the field of GIS for spatial decision-making forward. Anthony Gatrell, in Chapter 12, continues on the theme of spatial analysis and decision support from the specific angle of health research, one of the fastest growing fields for GIS application. Anthony makes the useful distinction between two streams of research: epidemiological studies (the geography of disease), which build on the natural sciences traditions, and healthcare planning research (the geography of health), which typically takes a more social science approach. Each of these streams has tended in the past to operate in parallel and if the streams used GIS at all they required different functionalities and methodologies. Now, as argued by Anthony, we are really starting to see the two streams coming together and a real opportunity emerging to link their requirements into dedicated spatial decision support systems based on GIS and spatial statistical tools. This building of bridges across disciplines and scientific traditions echoes the views expressed in Chapter 10 by Michael Wegener and indicates that whatever the field of application, the sheer complexity of our society requires wider awareness and combined approaches. The health theme is developed further by Orjan Pettersson in Chapter 13, with a very interesting case-study of public health variations in Västerbotten county, Sweden, using a neural networks approach. Orjan uses an index of ill-health based on the number of sick-leave days and explores the geographical variations in relation to socio-economic and labour market characteristics of the county.
The comparison between neural nets and linear regression analysis indicates that neural nets are good at identifying non-linear relationships and at providing a predictive model at the microregional level. However, there are also a number of important constraints which make this approach far from perfect and which require further work.
Joanne Cheesman and James Petch address a subset of these issues in Chapter 14 by comparing the use of neural networks and kriging to develop areal precipitation models for upland catchment areas where precipitation gauges are few, unevenly distributed, and often located in lowlands. Joanne and James provide a good review of both neural networks and kriging, highlighting the advantages and disadvantages of each. In their specific study in the North of England, they conclude that kriging provides the more accurate results where data are more abundant and neural networks perform better in the higher altitude areas where data points are scarcer. This chapter nicely complements the previous one in building the empirical evidence on the suitability of different methodologies to solve different types of problems, including data density and distribution. In Chapter 15, Vit Vozenilek gives an overview of the many problems to be faced in trying to model hydrological systems with off-the-shelf GIS. He first addresses the general concepts for including space and time in network analysis and then develops a practical approach to modelling river systems with PC Arc/Info. His approach is put to the test through three case-studies of increasing complexity. He demonstrates that useful results for a great many applications are possible, but that many approximations are needed to do so. A number of methodological issues relating to data structures, data requirements and scale effects are raised by Vit whose approach is essentially suited to mono-directional flows in a network. The limitation identified in the previous chapter of mono-directional flows in a network is addressed in Chapter 16 by Carsten Bjorsson in the context of modelling wetland areas. These areas are very sensitive to pollution and need very careful handling for their remediation.
The model proposed tries to handle both space and time in a GIS raster structure using focal analysis which calculates for each cell a value as a function of its immediate neighbours and therefore can handle movement of flows in more than one direction. This individual cell addressing enables Carsten to calculate for each cell in the grid the rate of change of the stream of water as a function of precipitation, flow from adjacent cells, groundwater contribution, flow out of the cell and evaporation. The model is at the early stages of development but by comparison to the traditional network models it promises to allow back flows, partial area flows and flooding which are crucial in wetlands and that network models have difficulty in handling. Chapter 17 by Stephanie King and Anne Kiremidijan presents a useful application of GIS for earthquake hazard and loss estimation. There has been a growing interest in this field over the last few years which have witnessed large-scale disasters such as those in Kobe and Los Angeles and a major international conference on “Earthquakes in Mega-Cities”, held in September 1997 in Seeheim, Germany. Against this background, Stephanie and Anne show how GIS can usefully integrate the geological, structural, and economic data to produce micro-zones of risk. Moreover with an intelligent use of buffering, look-up tables, and ad-hoc routines it is possible to model the effects of an earthquake and start to put in place the necessary mitigation measures. Philippe Desmet and his colleagues discuss in Chapter 18 an algorithm implemented in GIS to utilise the Universal Soil Loss Equation in a two-dimensional terrain. The authors clearly demonstrate the validity of their approach in an IDRISI environment by investigating the effect through four test-sites near Leuven, Belgium. 
The comparison of their method with manual calculation shows that the latter underestimates erosion risk because the effect of flow convergence cannot be taken into account, particularly in concave areas where overland flow tends to concentrate. Hence the proposed method is able to deal with more complex land units, extending its applicability in land resource management. Anna Kurnatowska in Chapter 19 combines GIS functionalities with statistical methods to analyse and compare the environmental structure of two mountain areas placed in different climatic zones: one in Scotland, the other in Poland. GIS is used to delineate homogeneous typological units called geocomplexes and describe their morphology. The area, perimeter and number of distinctive land units in a particular type
of geocomplex are then statistically analysed to identify spatial patterns, and spatial relationships between different units. This analysis enables Anna to develop sensitivity maps which are then an important input to environmental impact assessment and conservation strategies. It is a feature of this chapter to integrate well established methodologies in Central Europe for the analysis of geocomplexes with Western approaches using GIS to arrive at the desired results. The final chapter of this section addresses an important issue already raised by Michael Wegener and Anthony Gatrell in their respective chapters: the need for cross-disciplinary efforts to deal with the increasing complexity of our society. In this respect, Yelena Ogneva-Himmelberger argues that Markov chain analysis is a useful approach to integrate in GIS socio-economic and ecological processes in modelling land-cover change. Following an overview of Markov chain models, Yelena applies them to a study area in Mexico to link land-cover change maps to maps identifying socio-economic and ecological factors responsible for such change. The results indicate the opportunities opened up by Markov chain probability models coupled with logistic regression models for an improved understanding of land-cover change processes, provided that the relevant variables and decision rules are carefully formulated.

1.3 PART THREE: GIS AND REMOTE SENSING

This section of the book looks at the contribution that the increasing integration of remote sensing with GIS can make for the analysis of both environmental and socio-economic processes. The opening chapter by Michael Goodchild sets the scene in the context of global change research. This broad ranging review and analysis of the opportunities and limitations of GIS for the understanding and modelling of local-global processes touches on many themes.
They include data issues, such as the difficulty of collecting the requisite data at a global scale; conceptual issues such as understanding and modelling the effects of human activity across a range of scales, local, regional, and global; and methodological issues in integrating different types of models with complex feedback mechanisms over time and three dimensions. As the chapter points out, the areas where further research is needed are many, yet there is little doubt that the convergence of different media and analytical tools offers real opportunities for new insights. Hans-Peter Bähr in Chapter 22 focuses on a social-science application of remote sensing and GIS, urban analysis. Hans-Peter acknowledges at the very beginning that the combination of remote sensing and GIS has considerable merits for urban analysis, particularly with the arrival of high resolution satellite data but also by way of the more traditional airborne imagery. In particular remote sensing offers detailed up-to-date information which overcomes a traditional problem of urban analysis, obsolete data. Other advantages are that remote sensing data may be taken upon request according to the need of the user who can specify parameters and variables, and that it is possible to handle large amounts of data. The opportunities for exploiting these data sources and techniques are therefore very significant. Hans-Peter also identifies a number of research challenges and gives some examples to illustrate both opportunities and constraints. Victor Mesev in Chapter 23 reinforces the message of the previous chapter by pointing out that according to the Official Journal of the European Communities, nearly 90 percent of the European population will soon be living in urban areas. Hence the importance of being able to analyse and model urban areas accurately is critical.
To do so, Victor proposes a cohesive two-part strategy that links the measurement of urban structures with the analysis of urban growth and density change. To obtain reliable measurements of urban structures Victor links conventional image classifications with contextual GIS-based urban land use and socio-economic information. The spatial analysis part of the strategy builds on the work by Batty and Longley (1994) who showed that fractals are a convenient way to represent the shape and growth of
urban settlements across space and time. The examples provided demonstrate the value of developing integrated urban models using RS and GIS, an area in which we will certainly see much more work in the future. Moving from the urban to the rural environment, Eléonore Wolff uses a combination of remote sensing techniques and GIS to delineate landscape zones and analyse agrarian systems in developing countries. Agrarian systems characterise space through the interplay of production factors (labour and investment) and products such as crops and livestock. Because of their complexity they tend to be studied through extensive ground surveys which then makes it difficult to generalise patterns at the regional scale. To overcome these limitations, Eléonore utilises raw high resolution remotely sensed data to delineate numerically homogeneous landscape zones which can then be used to generalise the results of local household surveys. Her method involves standard remote sensing techniques which are however applied to complete high resolution scenes to achieve useful analytical results. The starting point for Julia Seixas in Chapter 25 is that we are still a long way from being able to identify and assess the processes that take place in the environment which we attempt to capture by remote sensing. This is due to the lack of knowledge of environmental processes at the spatial resolution of the sensors (30 units or less), and the huge quantity of data associated with temporal series. To address these issues, Julia proposes a methodology inspired by exploratory data analysis to deal with spectral data and identifies in her study the spatial-temporal patterns of a desertification process. The methodology proposed characterises the statistical landscape of remotely sensed images using a kernel-based approach to measure the spatial association of values in the image and their variability over space (variance). 
From here time is integrated in the analysis by measuring the temporal average process and the temporal variability, hence developing an integrated spatial-temporal assessment of the association and variances of the spectral values. Although further extensions to this method are needed, the good results achieved in the presented case-study indicate that this is a highly promising route to follow. Chapter 26 by Milind Kandlikar returns to the role of GIS in the integrated assessment of global change discussed by Goodchild in the opening chapter of this section. His particular interest is in the opportunity that GIS offers to handle multiple scales and sectoral aggregations, thus enabling different stakeholders to have a say in integrated assessment models of climatic change. These models, he argues, are a useful framework to synthesise knowledge from various disciplines. However, because they traditionally have been used at the global scale, they have tended to ignore local concerns and different viewpoints. To help overcome these limitations, Milind argues that GIS can contribute to an integrated assessment by providing geographical specificity, improving data quality and model validation, and coupling data capture and visualisation with the integrated models.

1.4 PART FOUR: METHODOLOGICAL ISSUES

This section is opened by Jonathan Raper who gives a thoughtful overview of the many research challenges still to be faced in the handling of space and time within GIS. The chapter focuses on the efforts needed to develop richer representations for the analysis of spatial socio-economic units, and the discussion provides an opportunity to address some very fundamental philosophical, cognitive, social, and analytical issues which are central to making progress in this field.
Spatial units are many and functional to many uses, but they can be broadly identified as being along a continuum with symbolic, transient and diffuse units at one end (such as the concept of neighbourhood or vista), and instrumental units at the other end which are largely permanent and concrete (such as property parcels). Most GIS tend to handle a sub-set of the latter
type, i.e. spatial units which are non-overlapping and filling all the available space. They also focus only on their geometry rather than the social and political processes that govern their existence and behaviour. Other types of units have yet to be properly addressed both in terms of their conceptualisation and formalisation in a computer environment. This is a limitation which can no longer be sustained if GIS is to contribute further to our understanding of real life phenomena in space and time. Eric Miller in Chapter 28 echoes these views and suggests that an additional difficulty in understanding real life phenomena lies in the collection of a sufficient number of spatial and temporal observations at different scales. Whilst simple assumptions about isotropic space enable the use of variograms to handle sparse data points, the reality of natural processes is characterised by multi-dimensional data, sparse data sampling, statistically biased sampled data and irregular or anisotropic locations. To handle these difficulties, Eric suggests extending geostatistical kriging to the spatio-temporal domain to estimate unsampled values and variances in both space and time, and clearly explains how this approach may operate. This is a very useful contribution which definitely deserves further focused treatment. In Chapter 29 Felix Bucher reports on research that he and his colleagues at the University of Zurich have been carrying out on the extension of exploratory data analysis to assist in the identification of an appropriate interpolation model when sampling phenomena are described by continuous fields. The proposed strategy aims at formalising and structuring the whole decision-making process to overcome the traditional problems encountered in the selection of an interpolation model.
In particular, attention is given to the magnitude and manner of first-order and second-order spatial variations in the data being interpolated, the suitability of secondary data to assist the modelling of either form of spatial variation, and the assessment of data properties. The integrated nature of the approach proposed is its most significant characteristic which also provides a basis for possible implementation. Thus far this strategy has focused on spatial data, and the extent to which it can also be extended in the spatio-temporal domain is an open research question. Dawn Wright in Chapter 30 illustrates an application of GIS to support research on the impacts of submarine hydrothermal venting. The Vents Programme discussed in this chapter collects a very wide range of data from a variety of sources spanning temporal and spatial scales from decades to milliseconds and from millimetres to hundreds of kilometres. Hence a key issue is the design of a semantic model able to capture and link the information content of these different data sources independent of their internal data structures. Moreover, the model must offer simplicity of use and communication capabilities among many scientists of diverse backgrounds. GIS offers the core implementation platform for the proposed model which is clearly described by Dawn. As she acknowledges, the existing framework has already provided a useful base but now needs to be extended from an essentially 2-D environment to 3- and 4-D, hence addressing in part many of the challenges identified in the introductory chapters of this section. In Chapter 31, Adrijana Car focuses on spatial information theory and specifically on the use of hierarchies in the cognitive process to structure space. The understanding of how spatial hierarchies are formed and used, she claims, is one of the most important questions in spatial reasoning research today. The research undertaken by Adrijana uses wayfinding on a network as the case-study. 
Two examples are described, a hierarchical network and a flat one, and the appropriate ontologies made explicit. The comparison of the results indicates the value of the approach as the hierarchical algorithm often produces longer paths but is still preferred by the driver as it is perceived to be “faster”. The strength of this chapter lies in its development of a conceptual model for the hierarchy of space, an efficient hierarchical algorithm and an understanding of the underlying heuristic. All are necessary to formulate executable specifications in a GIS and expand the range of GIS applications.
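The contrast between a flat and a hierarchical search can be illustrated with a minimal sketch. It is not the algorithm of Chapter 31: the two-level network, the node names, the edge lengths and the rule "join the nearest main road, travel along main roads, then leave for the destination" are all assumptions made here for illustration. Under those assumptions the hierarchical route is never shorter than the flat one, and on this toy network it is longer.

```python
import heapq

INF = float("inf")

# Hypothetical network: (node, node, length, level); level 1 = main road, 0 = local street.
EDGES = [
    ("A", "H1", 1, 0),
    ("H1", "H2", 6, 1),
    ("H2", "B", 1, 0),
    ("A", "C", 3, 0),
    ("C", "B", 3, 0),
]


def build_graph(edges, min_level=0):
    """Undirected adjacency map keeping only edges at or above min_level."""
    graph = {}
    for a, b, length, level in edges:
        if level >= min_level:
            graph.setdefault(a, {})[b] = length
            graph.setdefault(b, {})[a] = length
    return graph


def dijkstra(graph, source):
    """Shortest distance from source to every reachable node."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, INF):
            continue  # stale queue entry
        for nxt, length in graph.get(node, {}).items():
            nd = d + length
            if nd < dist.get(nxt, INF):
                dist[nxt] = nd
                heapq.heappush(queue, (nd, nxt))
    return dist


def flat_length(edges, start, goal):
    """Shortest path length when every edge is treated alike."""
    return dijkstra(build_graph(edges), start)[goal]


def hierarchical_length(edges, start, goal):
    """Path length when the middle leg is restricted to main roads."""
    whole = build_graph(edges)
    main = build_graph(edges, min_level=1)
    from_start = dijkstra(whole, start)
    from_goal = dijkstra(whole, goal)
    # Nearest main-road nodes to the start and to the goal.
    entry = min(main, key=lambda n: from_start.get(n, INF))
    leave = min(main, key=lambda n: from_goal.get(n, INF))
    return from_start[entry] + dijkstra(main, entry)[leave] + from_goal[leave]


if __name__ == "__main__":
    print("flat:", flat_length(EDGES, "A", "B"))
    print("hierarchical:", hierarchical_length(EDGES, "A", "B"))
```

The sketch assumes a connected network with at least one main road. Replacing lengths with travel times that favour main roads would be one way to explore why the longer route may still be perceived as faster.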
In Chapter 32, Jayant Sharma continues the research theme of spatial reasoning and focuses on the problem of formalising and implementing spatial inferences based on qualitative information. As the author explains, inference is the process of combining facts and rules to deduce new facts. Whilst this process may appear trivial to humans, it is difficult to formalise in automated systems and yet may be extremely useful for searching large spatial databases. Jayant reviews the key concepts underlying spatial cognition and the key role of qualitative spatial information and then develops formalisms for this type of information, distinguishing between heterogeneous spatial reasoning in which single spatial relations of different types are considered at each step of the process, and integrated spatial reasoning in which all the relations, such as distance or orientation of objects, are considered simultaneously. The inference of spatial relations is also addressed in Chapter 33 by Stephan Winter. He considers the additional complexity deriving from the uncertain position of objects in space. Rather than handling uncertainty using bands or fuzzy sets, Stephan builds on the 9-intersection model for expressing topological relations and adds metric information of the distance between regions, distance being defined as a function which can be statistically modelled, taking into account both the uncertainty in position and that of relation between objects. The work presented in this chapter is extremely promising and its relevance to GIS data input, management, analysis and visualisation is clearly identified.

1.5 PART FIVE: DATA QUALITY

The section on Data Quality opens with an authoritative overview by Henri Aalders and Joel Morrison who have both been directly involved in various international efforts to define quality standards for digital data.
Their opening remarks set the scene in respect to the challenge being faced by traditional map producers as well as by users in the new world of digital spatial data. Whilst in the past users knew what they were getting from their established suppliers, and the latter had a certain degree of control and confidence over the quality of their product, this model is no longer working in the age of Internet surfing and spatial data downloading and integrating. Producers themselves are losing track of what is out there and the extent to which their products are fit for the purpose for which they are used. With this in mind, the authors report on efforts on both sides of the Atlantic to develop a comprehensive and yet workable data quality model based on seven aspects: lineage, accuracy, ability for abstraction, completeness, logical consistency, currency and reliability. As the authors argue, whilst progress has been made in conceptualising the data quality model, its operationalisation still poses many questions in terms of database complexity and requirements for visualisation. Hence, we are likely to see a major effort in this area of increasingly important research over the coming years. Given the importance of “fitness for purpose”, Susanna McMaster in Chapter 35 presents exploratory research on the impact of attribute data quality on the results of spatial modelling for a specific application, forest management. Sensitivity analysis is the underlying method used by Susanna who reviews the few studies which have used this technique and then details the methodology adopted for her research. Once the spatial model was developed to assess the appropriate management practice in the forest area under study, errors of ±5 percent and ±25 percent were deliberately introduced in three key attribute variables to assess their impact on the resulting suitability maps.
The chapter describes in detail the sequential approach taken and the differential impact of the errors introduced in each variable, thus providing a useful benchmark for other researchers wishing to extend this important area of research into other application domains.
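The general idea of such a sensitivity analysis can be sketched briefly. The example below is not the model of Chapter 35: it assumes a hypothetical weighted-sum suitability score with invented attribute names, weights and threshold, perturbs one attribute at a time by ±5 and ±25 percent, and reports the share of cells whose suitability class changes as a result.

```python
import random

# Hypothetical weights for three attribute variables (illustrative only).
WEIGHTS = {"stand_age": 0.5, "basal_area": 0.3, "site_index": 0.2}
THRESHOLD = 0.5  # hypothetical cut-off between "suitable" and "unsuitable"


def suitability(cell):
    """Weighted sum of attribute values, each assumed to be scaled to 0..1."""
    return sum(WEIGHTS[name] * cell[name] for name in WEIGHTS)


def classify(cells):
    """Return one boolean per cell: True where the cell is 'suitable'."""
    return [suitability(cell) >= THRESHOLD for cell in cells]


def perturb(cells, attribute, relative_error):
    """Copy the cells with one attribute scaled by (1 + relative_error)."""
    factor = 1 + relative_error
    return [dict(cell, **{attribute: cell[attribute] * factor}) for cell in cells]


def changed_share(cells, attribute, relative_error):
    """Fraction of cells whose class differs from the unperturbed baseline."""
    baseline = classify(cells)
    perturbed = classify(perturb(cells, attribute, relative_error))
    changed = sum(b != p for b, p in zip(baseline, perturbed))
    return changed / len(cells)


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic cells, not real forest data
    cells = [{name: rng.random() for name in WEIGHTS} for _ in range(1000)]
    for attribute in WEIGHTS:
        for error in (-0.25, -0.05, 0.05, 0.25):
            share = changed_share(cells, attribute, error)
            print(f"{attribute:>10} {error:+.0%}: {share:.1%} of cells change class")
```

Comparing the shares across attributes gives a rough ranking of which variable the classification is most sensitive to, which is the kind of benchmark the chapter offers for a real model.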
The experience developed at the French Institut Géographique National in producing digital cartographic products is the context from which François Vauglin in Chapter 36 describes his research on the probability assessment of geometric metadata. Metadata on positional accuracy are created by associating to each coordinate some probabilistic information related to its positional uncertainty. The chapter clearly shows the potential of assessing the probability density function of positional accuracy in GIS and its handling using probabilistic reasoning to have a complete description of the statistical behaviour of positional uncertainty. Although the procedure illustrated is as yet limited to points and lines and needs extending to surfaces, it is a clear contribution to this particular field of research and practice. In Chapter 37, Gerhard Joos focuses on another important aspect of data quality, the consistency of a given data set to its conceptual data schema. As Gerhard argues, it is essential that the rules of the conceptual data schema are understood and tested. This may be done within a GIS if the system is powerful enough to formulate the necessary queries and the rules of the conceptual schema are expressed in a formal language. As this may not be the case in many instances, Gerhard introduces a system independent language which any GIS user can utilise and edit. Having described the general features of the proposed language, Gerhard also gives specific examples of how to handle overlapping objects, or create new objects from old ones with evidence of the flexibility this new set of tools offers GIS users. The final chapter of this section discusses data quality in the context of assembling different data sets in developing countries, the result of which may be a pretty map but with highly debatable meaning. 
The focus of this chapter is on attribute data quality as information on the other aspects of data quality is often not available in developing countries. Using as case-study a project on soil erosion in Ghana, Anita Folly clearly highlights the many difficulties faced in operating with data for which the currency and quality are either unknown or very poor. This applies to physical and topographical data but even more so to socio-economic data. One of the key features of the discussion in this chapter is the extent to which the use of GIS with multiple sources of data enabled Anita to improve the quality of the data used as well as documenting it, thus providing an additional service over and above the analytical output of the GIS operations.

1.6 PART SIX: VISUALISATION AND INTERFACES

Keith Clarke in Chapter 39 opens the last section of the book on visualisation and user interface issues. Appropriately he reviews recent developments in these two converging fields and argues that much remains to be done to enhance the capabilities of current GIS which by all accounts are still very primitive both in respect to visualisation and user interface. Whilst having different origins, computer graphics and software engineering respectively, visualisation and user interfaces also share some common characteristics as both depend on visual reasoning. In this chapter, Keith identifies some of the key research issues relating to visual reasoning and suggests that binary visual processing may be a fruitful route to explore them by focusing on the existence of information flows rather than attempting the quantification of individual and collective human cognition. In the concluding section, Keith argues that in the future age of Internet Multimedia there will be many opportunities to use more than just vision to analyse and interpret data and that maybe it is time that GI science focused more on the technical barriers to these futures than the deep issues of spatial cognition.
In Chapter 40, Aileen Buckley gives an excellent overview of the field of computer visualisation and the opportunities that it increasingly affords for “new ways” of looking at the world. As she argues, visualisation might be the foundation for a second computer revolution by providing scientists with new methods of inquiry and new insights into data rich environments. This chapter stands out for the clarity with
which it introduces the key terminology and concepts in the field of computer visualisation drawing on the essential literature. What makes it even more topical is the extent to which the geographic metaphor is increasingly becoming a key searching, retrieving, and viewing mechanism for non geographic data, opening new fields of application in the era of digital libraries. The use of metaphors for interface design is discussed in depth in Chapter 41 by David Howard who has been working on a hierarchical approach to user interfaces for geographic visualisation. Metaphors can be extremely helpful devices for both users and system developers to interact with computer based systems, and the example of the Macintosh and Windows95/NT desktop metaphor is often referred to as a model most users have become familiar with. Whilst offering many opportunities, metaphors have certain problems as well, in particular for their lack of precision. They are by necessity incomplete representations of the system they explain and therefore exact semantics are difficult to convey. Moreover, it is difficult to find metaphors that can be used for multiple types of applications. Spatial metaphors have additional problems of their own as it may occur that users confuse the metaphor with the spatial data itself. In spite of these issues, David argues that the combination of a hierarchical approach to interface design that clearly distinguishes between the conceptual, operational, and implementation levels, together with the judicious use of metaphors, is the way forward to simplifying the relationship between users and increasingly sophisticated systems. A much more radical approach is advocated by Jochen Albrecht in the concluding Chapter 42. Rather than incremental improvements, Jochen argues that GIS should be rethought from the bottom up focusing on what distinguishes them from other software such as automated mapping, namely their ability to perform data analysis. 
By contrast, Jochen argues that most GIS are still overwhelming users with functions that deal with map making, thus obscuring the core analytical functions. With this in mind, Jochen surveyed over 100 GIS users to identify the key “universal” analytical functions that should be at the basis of a new form of analytical GIS independent of data structures. Twenty sets of key operations emerged from this user requirement analysis, which Jochen clusters into six groups: search, locational analysis, terrain analysis, distributions/neighbourhood, spatial analysis, and measurement. Of course, some of these groupings could be debated as the author acknowledges, but the strength of this approach is that it is then possible to build a user interface, directly addressing the core needs of the user, thus facilitating their analytical tasks. The progress made in this direction is well documented in this chapter which may represent one of the most promising research focuses in this field. 1.7 SUMMARY As the brief overview above indicates, this book covers a wide range of core research issues across many disciplines. In some instances, such as the relationships between GIS and society, the issues raised are not specific to geographical information but are more deeply embedded into the current power structures in society or relate to the broader transition from the industrial society to one increasingly based on information processing. This blurring of distinction between GI and information in general is however not necessarily negative as it demonstrates that GI researchers are also moving out of their niche and addressing broader concerns in society bringing their specific expertise to the fore. 
As argued by Ian Masser in his Postscript to this volume, GI research is coming into mainstream social and environmental science research in the same way as GI and related technologies are becoming more pervasive in society, moving from dedicated research machines to the desk-top, and increasingly being diffused among individuals rather than being confined to organisations.
At the same time as GI loses its special status, we are also starting to see an increasing use of the geographic metaphor to search, retrieve, and visualise non-spatial data. Digital libraries were not specifically addressed in this volume but there is little doubt that they will dominate future developments and that many of the research topics addressed in this volume, which are still within the realm of specialists in the field, will underpin the applications that may become commonplace to non-specialists in the next century. Internet-based spatial querying and analysis may be one of the ways in which individuals will discover geography in the future. To them, issues of data quality, visualisation and interfaces based on a closer understanding of human cognition, or the ability to search through space and time and integrate different types of data sources and models, may be something to be taken for granted. To get there, however, there is still much to be done, not only from a technical perspective but also from methodological and data-related perspectives. Moreover, the social and spatial impacts of these developments will need close evaluation to ensure that not only the opportunities but also the costs are in full view of public scrutiny. For the time being, we can take heart as editors of this volume that so many bright researchers from so many different backgrounds are applying their minds and talent to address the core questions identified in this book and are contributing to the further development of this important field.
REFERENCES
ABLER, R. 1987. The National Science Foundation National Center for Geographic Information and Analysis, International Journal of GIS 1(4), pp. 303–326.
ARNAUD, A., CRAGLIA, M., MASSER, I., SALGÉ, F. and SCHOLTEN, H. 1993. The research agenda of the European Science Foundation’s GISDATA scientific programme, International Journal of GIS 7(5), pp. 463–470.
BATTY, M. and LONGLEY, P.A. 1994. Fractal Cities: A Geometry of Form and Function. London: Taylor & Francis.
CRAGLIA, M. and COUCLELIS, H. (Eds.) 1997. Geographic Information Research: Bridging the Atlantic. London: Taylor & Francis.
HOFSTEDE, G. 1980. Culture’s Consequences: International Differences in Work-Related Values. Beverly Hills: Sage Publications.
PICKLES, J. (Ed.) 1995. Ground Truth: the Social Implications of Geographic Information Systems. New York: Guilford Press.
Part One GI AND SOCIETY: INFRASTRUCTURAL, ETHICAL AND SOCIAL ISSUES
Chapter Two Spatial Information Technologies and Societal Problems Helen Couclelis
2.1 INTRODUCTION
The “and” in the title is tantalising: it establishes a connection between the phrases “spatial information technologies” and “societal problems” on either side of it, but what kind of connection this might be is anybody’s guess. Is it a complementary one, as in “bread and butter”? Does it indicate co-occurrence, as in “wet and cold”, or necessary succession, as in “night and day”? Is it a causal relationship, as in “fall and injury”, a normative one, as in “crime and punishment”, a confrontational one, as in “David and Goliath”, or is it more like “doctor and patient”, “question and answer”? Does it matter which of the two phrases comes before “and”? Linguists can have fun digging into the semantics of this simple-looking title. Maybe all these possible meanings make sense. For the purposes of our discussion here I will single out just two contrasting interpretations, as these are at the centre of most debates on the issue of the societal dimensions of spatial information technology:
• Thesis 1: Spatial information technology causes societal problems.
• Thesis 2 (Antithesis): Spatial information technology helps alleviate societal problems.
When a thesis and its antithesis can both be rationally defended we know that we have a highly complex issue in our hands. In cases like this any stance that does not give credit to its opposite is bound to be simplistic. Here, it is clearly equally naïve to claim that spatial information technology is plain evil, as it is to see it as the panacea that will help lay all kinds of societal problems to rest. This elaborate debate was recently taken up by Initiative 19 of the US National Center for Geographic Information and Analysis (NCGIA), entitled “GIS and Society: The Social Implications of How People, Space, and the Environment Are Represented in GIS”. Many of the ideas in this chapter originated through my involvement with I19 and my discussions with participants at the specialist meeting in March 1996.
After giving a quick overview of the goals of I19, I will present my own typology of issues arising at the interface of spatial information technologies and society. I will then propose a framework to help see these very diverse issues against the background of a small number of domains inviting further research and reflection. What I am aiming at is another interpretation of the “and” in the above title, one that views spatial information technologies and society, despite the many problems and tensions, as inextricably linked.
2.2 I19: GIS AND SOCIETY The idea for I19 was conceived in late 1993 during an NCGIA-sponsored meeting held in Friday Harbor, WA, USA. Stringent critiques of GIS by geographers working from the social theory perspective had started appearing in the literature, and there were fears that the two fields would become increasingly alienated (Taylor, 1990; Pickles, 1995). The purpose of the meeting was to bring together a number of GIS researchers and critical theorists, and let them sort out their differences and begin talking with each other. To almost everyone’s surprise, the meeting was a great success, thoroughly constructive, and there was widespread agreement that substantial follow-up efforts needed to be undertaken that would allow the two sides to continue working together. The proposal for I19 was the most tangible outcome of the Friday Harbor meeting, and it has already produced a meeting of its own. There was also a special issue of the journal Cartography and GIS, devoted to a number of expanded papers from the workshop (Sheppard and Poiker, 1995). As described in the proposal, the focus of I19 is on the following conceptual issues (see Harris and Weiner, 1996): 1. In what ways have particular logic and visualisation techniques, value systems, forms of reasoning, and ways of understanding the world been incorporated into existing GIS techniques, and in what ways have alternative forms of representation been filtered out? 2. How has the proliferation and dissemination of databases associated with GIS, as well as differential access to spatial databases, influenced the ability of different social groups to utilise information for their own empowerment? 3. How can the knowledge, needs, desires, and hopes of marginalised social groups be adequately represented in GIS-based decision-making processes? 4. What possibilities and limitations are associated with using GIS as a participatory tool for more democratic resolution of social and environmental conflicts? 5. 
What ethical and regulatory issues are raised in the context of GIS and Society research and debate?
These conceptual issues are addressed in the context of three research themes:
• the administration and control of populations;
• location conflict involving disadvantaged populations;
• the political ecology of natural resource access and use.
As was to be expected, the discussion at the I19 specialist meeting ranged well beyond the themes outlined in the proposal. By the end of the three days, four new research proposal outlines had emerged, which were subsequently expanded, approved and funded:
1. The social history of GIS.
2. The ethics of spatio-visual representation: towards a new mode.
3. A regional and community GIS-based risk analysis.
4. Local knowledge, multiple realities and the production of geographic information: a case-study of the Kanawha Valley, West Virginia.
2.3 GIS: FROM NUMBER-CRUNCHING TO ONTOLOGY
The success of I19 thus far promises that interest in research at the interface of GIS and society will continue to grow and that the technical and societal perspectives on spatial information will continue to enrich each other. Clearly, however, I19 cannot account for all the work already going on in that general area, nor for all the theoretical and applied questions that could be investigated. In the following I will present a piece of my own perspective on these issues. Going back to the thesis and antithesis put forward at the beginning of this chapter, I will argue that the positive and negative effects of spatial information technologies on society are two sides of the same coin, and that both promoters and users of these technologies need to be constantly alert to the ethical issues always lurking just below the surface. This is not the same as saying that spatial information technologies are value-neutral tools that may be used for either good or bad purposes: rather, it is a recognition that, being thoroughly embedded in the values and practices of society, they share in the ethical dimensions of corresponding debates. To focus the discussion somewhat I will reduce “spatial information technologies” to “geographic information systems”. GIS broadly understood can encompass technologies such as remote sensing and image interpretation, global positioning systems (GPS), and the spatial databases available over the Internet, so that there is no significant loss of generality. Still, we have hardly broached the complexity of the issue, since GIS from a societal viewpoint can mean any or all of the following:
• A technology for storing, combining, analysing, and retrieving large amounts of spatial data.
• A set of technical tools for the representation and solution of diverse spatial problems.
• A commodity that is produced, sold and bought.
• A man-machine system.
• A community of researchers and workers forming around that technology.
• A set of institutional and social practices.
• A set of conventions for representing the geographic world.
• A way of defining the geographic world and the entities within it.
Two things are notable about this list. First, ranging from the technically mundane to the philosophical, it is probably more involved than most such lists would be for other technologies. Second, most of the items on it, up to the last two, would also appear on a corresponding list for information technology in general. I personally believe that the societal issues that distinctly characterise GIS are to be found primarily in the last two aspects, the representation of the geographic world, and the associated ontology. This is also why representation was the key concept in the title of I19, even though the discussions ranged over a much broader spectrum of questions. The other items on the list do of course also have significant societal dimensions that are specific to GIS and spatial information, but these tend to be variants of similar issues (access, copyright, skills, power, empowerment, democracy, privacy, surveillance, etc.) arising from the broader problematic of modern information technology. I thus personally consider these last two aspects the most challenging—not necessarily because the associated problems are the most significant, but because unless we geographers and GIS researchers and practitioners grapple with them, no-one else will.
2.4 ISSUES IN THE RELATION OF GIS AND SOCIETY I will now go through the list of eight aspects identified above, trying to highlight for each one of them both the positive and the problematic dimensions. My goal is to shake any complacent belief that GIS is either pure blessing or unmitigated calamity for society. 1. The simplest, most straightforward aspect of GIS from a societal viewpoint is its data handling capability. In both research and application the possibility to manipulate and analyse vast amounts of spatial data relatively easily, quickly, and cheaply, has not only greatly facilitated traditional activities such as map-making and cadastral maintenance, but has permitted scores of new useful data-intensive applications to be developed. On the other hand, running a GIS requires considerable skills that can only be acquired through substantial specialised training: is the necessary training really open to anyone who might have been employable under the old modes of operation, or does it exclude otherwise competent people who may not have social or geographic access to such training? Questions also arise regarding the displacement of workers lacking these skills by new, usually younger ones, which may create both human problems of employment and problems of lost institutional memory and experience. Further: are those performing the skilled data manipulations also the ones who best understand what the manipulations are for? What are the risks of separating the technical from the substantive expertise on a subject? A last question is whether the strong new emphasis on spatial data brought about by the introduction of GIS may not obscure other valuable ways of looking at problems under some circumstances. As with other information technologies (perhaps more so, because it is colourful and eye-catching), GIS is often seen as “signal and symbol” of efficiency in organisations (Feldman and March, 1981). 
To what extent that perception corresponds to reality is a question that can only be answered case by case. 2. Perhaps the most widely promoted view of GIS is that of a problem-solving technology applicable to a wide variety of spatial problems in both real-world situations and in research. Unquestionably GIS is that, judging from the myriad of applications world-wide in areas as diverse as planning, transportation, environmental conservation, forestry, medical geography, marketing, utilities management, the military, and so on. Still, there are questions: Whose problems are being solved? Who defines these problems? Who is involved in the generation of solutions? Who evaluates these solutions, and by what criteria? For whom are these solutions valid and good? How do alternative approaches to problem solution (or resolution) fit in? It is difficult to accept that, in a pluralistic and unequal society, a single problem-solving perspective based on the electronic manipulation and display of spatial data may give substantively good results in all the problem situations to which it is technically applicable. From a societal viewpoint, knowing when to use GIS and when to leave it alone may be the basis of good problem-solving. 3. There are several heart-warming examples of GIS being used by native peoples in the middle of the tundra or in disadvantaged neighbourhoods in the middle of inner-city jungles. These examples speak for the wide accessibility of GIS resulting from the phenomenal growth and diffusion of the technology in the past 15 years or so, and the intelligent, responsible, and dedicated work by GIS researchers and practitioners alike. This does not change the fact that GIS is a commodity made up of costly software, hardware, and databases, produced for the most part by a lucrative private industry operating under market constraints. As with any other commodity, however wide-spread, there are those who have it, and those who do not. 
Even free access to data, a thorny, disputed issue, would not make GIS a free good, even without taking into account the substantial investments needed to train skilled operators.
Access to GIS is unlikely ever to be evenly distributed across the population, or the globe, and just as the rich tend to get richer, the knowledgeable tend to get more knowledgeable, leaving in their wake a growing proletariat of ignorance. These problems of access are compounded by issues of intellectual property, copyright, questions of data quality and appropriate system selection, so that even those who can afford GIS cannot be sure that they got the product they think they are paying for. The cacophony of vendors promoting their own wares in a competitive market has landed many a good system on the wrong desk, resulting in a waste of scarce resources and a substantial opportunity cost for agencies or businesses hoping to jump on the GIS bandwagon. 4. Perhaps the greatest advantage of GIS over traditional methods of spatial data manipulation and analysis lies in the interactive coupling of a human intelligence with the computational power of a machine, and in the intuition-boosting cognitive appeal of the visual representations produced through that interaction. On the surface this efficient man-machine system seems devoid of societal implications, until we notice that this whole interactive process is channelled through a user interface replete with metaphor and convention. Does the desktop metaphor of files and folders mean much to those who have never used a desk? Are the colours conventionally used for land, water, and vegetation just as intuitively obvious to the Tutsi as to the Inuit? Is a world made up of static polygons as comprehensible to the nomadic herder as to the urban dweller? There are also things that a man-man (sorry! person-person) system can do that a man-machine system cannot. GIS interfaces are typically designed with the (isolated) user in mind, whereas answers to spatial problems are most often arrived at by large numbers of different people interacting over time within relationships that may be in turn collaborative or adversarial. 
Attempts to develop GIS for collaborative decision support are on the right track, provided they take into account that there is much more to real-world decision making than amicable collaboration among technically-minded peers. 5. I mentioned earlier the potential for exclusion inherent in the GIS requirement for specially trained, technically proficient personnel. The other side of the exclusion issue is the formation of a closed subculture of GIS practitioners and researchers, with its own journals, meetings, e-mail lists, social networks, associations, and agendas. The existence of such a distinct GIS subculture is not in itself a bad thing, and is probably necessary for enhancing the professional identity and profile of the speciality as well as for allowing its members to remain on the cutting edge of intellectual and technical developments in the area. However, all such professional subcultures run a risk of internal homogenisation that can stifle true innovation, all the more so when they are relatively young and have not had the time to branch out towards diverse directions. (Just think of the number of GIS practitioners around the globe whose education has been based on the NCGIA core curriculum!) The premature domination of an orthodoxy within the GIS community would be particularly damaging considering the unusually wide range of actual and possible applications of the technology, and the ensuing need for a healthy variety of substantially different perspectives and practices. Other aspects of the GIS subculture that have caused some concern among critical theorists are its purported male-dominated character, and what some people have perceived as the arrogant messages of global oversight and control inherent in the vendors’ advertising slogans and imagery (Roberts and Schein, 1995). 6. 
The societal aspect of GIS that has attracted the most attention to date is the fusion of that technology with contemporary social practices and institutions, to the point of becoming a set of social practices itself. Within very few years GIS has infiltrated government and business, industry and academia, the environmental movement as well as the military, has been adopted by grassroots movements and native population groups, and has profoundly affected both the practice and self-definition of geography and the perception of the discipline by the public at large. “Geography’s piece of the information
revolution” raises a host of important issues similar to those raised by the information revolution in general, but distinguished by “the spatial twist”: issues of access, power, empowerment, democracy, political decision-making, consumer sovereignty, social justice, equity, privacy, surveillance, control; questions about who gains and who loses, what is the opportunity cost, how can we avoid confusing what is desirable with what is possible with the new technology, what should the role of research be in fostering an enlightened practice. These issues dominated the discussions at the I19 meeting as well as the literature on GIS and society that has appeared to date, and will clearly continue to do so for some time. They involve truly “wicked” problems that can never be solved once and for all, even though specific instances are always amenable to thoughtful, creative handling. 7. How the geographic world is represented in GIS is a question that has no counterpart in other forms of information technology. It is an issue specific to GIS and one that concerns all those working in the field, whether their interest is mainly technical or applied to either environmental or social-science problems. The issue is that GIS represents the world in a particular way, and that this is by no means the only way possible. Questions thus arise regarding likely errors of both commission and omission: What are the biases in this abstract digital model of the world underlying GIS, based on information technology, geocoded measurement, and the cartographic tradition? Whose view of reality is reflected in the resulting computer images and possible modes of their manipulation? What interpretations and solutions are supported by these representations? And conversely: what alternative forms of knowledge and ways of understanding the world are excluded from this view, what questions cannot be asked within it, what realities cannot be seen, whose voices are being silenced? 
Critics have stressed in particular the strong GIS bias towards the visual, leading, in Gregory’s (1994) words, to a view of the “world as exhibition”: a world reduced to what can be displayed, in this case displayed in the form of either fields or objects. 8. The step from representation to ontology is a small but crucial one. It is the step from seeing the world as if it were as represented, to thinking it actually is as represented. It is the map becoming the territory, the representation determining what exists, how it all works, and what is important. In the case of GIS, it is not just the world represented in a particular kind of interactive visualisation: it is the world where what is visualisable in these terms is all that exists: the ultimate WYSIWYG (What You See Is What You Get: if you don’t see it, it’s not there). I have commented elsewhere on the power of GIS, along with other electronic media, to generate what Mitroff and Bennis (1989) call “unrealities”: Unreality One, where the unreal is made to look so much like the real that you can no longer tell the difference; and Unreality Two, where the unreal becomes so seductive that you no longer care about the difference (Couclelis, 1996). While the choice of any particular mode of representation necessarily constrains what questions can possibly be asked and answered, unwitting or conscious adoption of the corresponding ontology moulds permanently restrictive habits of mind. It would be a great loss, for our speciality as well as for society, if “the world as exhibition” actually became the only GIS world there is.
2.5 THE CONTRIBUTION OF GEOGRAPHIC INFORMATION SCIENCE
As GIS researchers and practitioners working on specific technical and scientific problems, we may sometimes wonder how our efforts might relate to the above very general concerns. 
As responsible human beings and citizens we would like our work to be socially useful or at the very least not to harm people, even though we know that what we do as conscientious professionals often takes on a life of its own once it leaves our hands. The bravest among us have taken their GIS to the streets and had it applied to whatever
Figure 2.1. The geographic triangle
societal cause they considered most deserving. Others take comfort from the fact that the technical problems they are solving are sufficiently removed from the social arena so as not to pose a threat to anyone. Yet it should have become clear from the preceding discussion that the societal implications of GIS, whether positive or negative, do not derive from any particular kind of work, any particular individual choices, but from the whole nexus of technological, historical, commercial, disciplinary, institutional, intellectual and ideological conditions surrounding the development of the field. The effects of GIS on society are thus emergent effects of the aggregate, not of any of its individual parts, and as such we all share modestly but equally in the responsibility for whatever good or bad results. We thus cannot say that we will leave the worrying to just those interested in the social-science and policy applications of GIS, and to the critics. But how can we relate the particular data model, spatial query language, or generalisation technique we may be working on to a laundry-list of societal issues such as the ones discussed earlier? In my view, if such a thing as geographic information science exists, it would not be worthy of the name unless it can help us make that connection. I recently proposed a conceptual framework for geographic information science that would pass that test (Couclelis, 1998). Here is the idea in a nutshell: As members of the human species living on the earth we all share a fundamental knowledge of the geographic world, a knowledge that has, at the very least, empirical, experiential, and formal components (the difference between the former two being roughly that between explicit and implicit or intuitive). These three perspectives on geographical knowledge form the vertices of what I call the geographic triangle (Figure 2.1). 
Connecting the empirical and experiential are geographic concepts; connecting the empirical and formal are geographic measurements; and the fusion of the experiential and formal perspectives gives rise to spatial formalisms, that is, geometry and topology. The geographic triangle itself is best represented by the quintessential geographic instrument, the map. GIS, by continuing the cartographic tradition, is thus well grounded in the geographic triangle. In recent decades alternative approaches to geography have emerged out of the wider critical theory and political economy movements that have been flourishing in the social sciences. These have strongly contested the value of the formal perspective for human geography in particular and have developed their
Figure 2.2. The ‘social’ vertex: a competing geographic triangle?
own non-quantitative discourses heavily critical of the limitations of, and even threats to society posed by, formalist, quantitative, technology-based methodologies. The critique of GIS as reflected in Pickles (1995) and similar writings, which prompted the development of I19, derives almost entirely from these alternative perspectives on geography. The resulting tension between the mainstream and the “new” geographies may be illustrated in Figure 2.2, where a “social” vertex now defines a very different kind of geographic triangle. Confirming what we know from decades’ worth of literature, there seems to be no common ground between the two kinds of geography. But free that scheme from the flatness of the plane, and see what happens when the two kinds of approaches are allowed to enrich each other! (Figure 2.3). A new dimension is added to both, and a number of critical connections suddenly become evident. The resulting solid gives rise to three more edges and three more triangles. Here is what I suggested the new edges mean. Connecting the social and the empirical are the geographic constructs overlaid on the surface of the earth: the boundaries and the territories, the functional regions and protected areas, the private and public spaces, the neighbourhoods and natural ecosystems. Connecting the social and the experiential are the alternative perspectives on the geographic world born out of the diverse kinds of social and cultural experiences. And I propose to you that what connects the social and the formal in this context is none other than geographic information: the formal end makes possible the GIS outputs (the maps, the TINs, the animations); the social provides the intentional stance which alone can give meaning to what otherwise would have been nothing but a bunch of fleeting colour pictures. 
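The counting behind this construction can be checked mechanically. The following is only a minimal sketch in Python: the four vertex names and the six edge meanings are taken from the text, while the data layout and the variable names (VERTICES, EDGE_LABELS, new_edges, new_faces) are illustrative assumptions, not part of the framework itself.

```python
from itertools import combinations

# The three vertices of the geographic triangle plus the added "social" vertex.
VERTICES = ["empirical", "experiential", "formal", "social"]

# Edge meanings as given in the text (the dictionary layout is an assumption).
EDGE_LABELS = {
    frozenset({"empirical", "experiential"}): "geographic concepts",
    frozenset({"empirical", "formal"}): "geographic measurements",
    frozenset({"experiential", "formal"}): "spatial formalisms",
    frozenset({"social", "empirical"}): "geographic constructs",
    frozenset({"social", "experiential"}): "alternative perspectives",
    frozenset({"social", "formal"}): "geographic information",
}

# Every pair of vertices is an edge; every triple is a triangular face.
edges = [frozenset(pair) for pair in combinations(VERTICES, 2)]
faces = [frozenset(triple) for triple in combinations(VERTICES, 3)]

# The edges and faces contributed by the social vertex.
new_edges = [e for e in edges if "social" in e]
new_faces = [f for f in faces if "social" in f]

for edge in new_edges:
    print(" - ".join(sorted(edge)), "->", EDGE_LABELS[edge])
```

Enumerating pairs and triples of the four vertices yields six edges and four faces in total, of which three edges and three faces involve the social vertex, in line with the "three more edges and three more triangles" of the tetrahedron.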
Even those of us working on natural-science applications of GIS should recognise the extent to which the questions we find worth asking, the answers we deem acceptable, and the interpretation of what we do are socially conditioned. Science too is a societal enterprise: the social vertex is part of what we all do! I am thus arguing for a thorough integration of the societal perspective into geographic information science. This certainly does not mean that every one of us should or could start directly exploring societal
Figure 2.3. The tetrahedron of geographic information science
questions in their research. But it does mean that all of us should be aware that these multiple connections between GIS and society exist, and be prepared to occasionally engage in an earnest dialogue with those systematically working to explore them. It is the notion of information in GIS that makes the social dimension inescapable. Information presupposes an intelligence that is attuned to what it may signify, and that intelligence is always socially conditioned. There is no such thing as a lone mind dealing with the world outside of a social context. Robinson Crusoe was a single white Anglo-Saxon male!
REFERENCES
COUCLELIS, H. 1996. Geographic Illusion Systems: towards a (very partial) research agenda for GIS in the information age, in Harris, T. and Weiner, D. (Eds.), Report for the Initiative 19 Specialist Meeting on GIS and Society, NCGIA Technical Report 96–8. Santa Barbara, CA: National Center for Geographic Information and Analysis.
COUCLELIS, H. 1998. GIS without computers: building geographic information science from the ground up, in Kemp, Z. (Ed.) Innovations in GIS 4: Selected Papers from the Fourth National Conference GIS Research UK. London: Taylor & Francis, pp. 219–226.
FELDMAN, M.S. and MARCH, J.G. 1981. Information in organizations as sign and symbol, Administrative Science Quarterly, 26, pp. 171–186.
GREGORY, D. 1994. Geographical Imaginations. Cambridge: Blackwell.
HARRIS, T. and WEINER, D. (Eds.). 1996. GIS and Society: the Social Implications of How People, Space, and Environment Are Represented in GIS, Report for the NCGIA Initiative 19 Specialist Meeting, NCGIA Technical Report. Santa Barbara, CA: NCGIA.
MITROFF, I. and BENNIS, W. 1989. The Unreality Industry: the Deliberate Manufacturing of Falsehood and What It Is Doing to Our Lives. New York: Oxford University Press.
PICKLES, J. (Ed.) 1995. Ground Truth: The Social Implications of Geographic Information Systems. New York: Guilford Press.
ROBERTS, S.M. and SCHEIN, R.H. 1995. Earth shattering: global imagery and GIS, in Pickles, J. (Ed.), Ground Truth: The Social Implications of Geographic Information Systems. New York: Guilford Press, pp. 171–195.
SHEPPARD, E. and POIKER, T. (Eds.) 1995. Special issue on “GIS & Society”, Cartography and Geographic Information Systems, 22(1), pp. 3–103.
TAYLOR, P.J. 1990. GKS, Political Geography Quarterly, 9(3), pp. 211–212.
Chapter Three
Information Ethics, Law, and Policy for Spatial Databases: Roles for the Research Community
Harlan Onsrud
3.1 INTRODUCTION
Ethical conduct may be defined from a practical perspective as conduct that we wish all members in society to aspire to but which is unenforceable by law or undesirable to enforce by law. Legal conduct is typically defined by the documented and recorded findings of our legislatures and courts. Ethical conduct is involved in the choices users make in applying geographic information system technologies on a day-to-day basis. However, ethical conduct is also involved in the choices scientists and researchers make in determining which aspects of the knowledge base they help advance. For instance, should researchers put their time and effort into expanding the knowledge base that will help advance systems for allowing stricter control over digital information or should we put our efforts into expanding the knowledge base for systems that will allow greater access to information by larger segments of society? Moral stances may be taken in support of either of these as well as many other propositions. The science of ethics helps us sort out which moral arguments have greater validity than others. No person may reliably predict how basic and even applied research advancements will ultimately affect society. However, as developers of new tools and techniques we should at least be aware of the potential social ramifications of our work so we can make informed and, hopefully, ethically supportable choices when the opportunity to make choices arises. The research outlined in this chapter is concerned with discovering the effects of geographic information technologies on society, observing effects when different choices are made for institutionalising or controlling location technologies and datasets, and developing information policy and legal models not yet tried in practice that might lead to greater beneficial results for society. Let us assume that the large rounded rectangle in Figure 3.1 encloses all societal conduct.
Conduct to a philosopher is typically defined as behaviour that involves choices on the part of the actor. For most conduct, such as whether you choose to wear black or green socks, society makes no public judgement as to your choice of behaviour. For the most part we can assume that the behavioural choices we make in daily living are legal unless specifically defined as illegal by law. The subset of conduct enclosed by the circle in Figure 3.1 represents conduct that society has deemed to be illegal. If one makes a choice defined by society’s laws as being illegal, one is subject to the sanctions prescribed by society for that conduct. From a practical perspective, we may also classify a subset of conduct as being unethical and another set as being ethical. For these two classes of conduct, as with illegal conduct, society does make a judgement as to rightness or wrongness of your choice of behaviour. Ethical conduct to the non-philosopher is generally
Figure 3.1: Societal conduct (Onsrud, 1995b).
understood to be “positive” or “laudatory” conduct while unethical conduct is understood to involve “wrong” or “bad” choices. Most ethical conduct is also legal. However, certain conduct that might be considered laudatory and ethical by an individual or even a large proportion of society, such as speeding to prevent a murder or helping a person with a terminal and extremely painful illness commit suicide, may be defined as illegal by society. Actions falling in this class are represented by the light grey area in Figure 3.1. Much unethical conduct is also deemed to be illegal by society. However, a significant body of conduct exists that is unethical yet legal. Even though the vast majority of society may agree that certain volitional actions of individuals are unethical, society may also agree that such actions should not be banned. For instance, disallowing certain conduct might overly restrict other conduct that we highly value or punishing certain non-desired conduct might be too burdensome on society’s resources. It is this body of unethical conduct as represented by the dark shaded area in Figure 3.1 that the remainder of this chapter is primarily concerned with.
3.2 UNETHICAL CONDUCT
In determining whether a proposed action in the use of a geographic information system is considered unethical, one might first resort to philosophical theories as set forth in the philosophy literature for guidance. However, the bounds defining unethical behaviour vary substantially depending on the philosophical arguments one accepts. Even though the rules developed by the great philosophers have many areas of agreement, no one philosophical line of reasoning or single rule seems to have stood the test of time in determining the rightness or wrongness of actions (Johnson, 1984). As such, it is this author’s contention that the bounds between unethical conduct and other conduct should be defined first through real-world practical experiences and methods.
These results then should be checked against the major philosophical lines of reasoning for their extent of conformance or non-conformance and adapted if necessary. In a practical world, in determining whether a proposed action in the use of geographic information systems might be considered unethical, individuals often try to anticipate whether the consensus of a group
or a majority of a group would consider the action to be “inappropriate” or “bad” if the group was made aware of the action. For a specific proposed action, it is seldom possible to take an opinion poll of the general public or of the group that may be affected prior to taking the action. For this reason, many professional groups develop explicit codes of conduct to which they hope all of their members will aspire and against which members of their community may assess proposed actions. Because of its recent emergence, the geographic information science community has no well established professional codes of conduct. However, even if one borrows principles from the codes of similar professional groups, many of the professional codes of conduct constructed in the past have been developed by consulting members of the specific discipline without consulting consumers or, for instance, consulting those who may be subjects in the data sets bought and sold within the discipline. Most codes of professional conduct focus on fair dealings among members of the discipline. This results in biases towards those involved in constructing the codes. What is agreed to be “smart business practices” by a large majority of practising professionals may be considered wholly unethical by data subjects or by the consumers of professional products and services. A highly appropriate role for the geographic information science research community would be to determine those common situations in day to day practice in the discipline that give rise to ethical dilemmas and to assess the responses of those affected by the various alternatives proposed for resolving the dilemmas. Parties such as producers, users, vendors, and data subjects all should be consulted. Any consensus lines arrived at from the responses should be assessed in the light of the leading philosophical theories addressing the appropriateness of human conduct. 
Practical ethical principles and guidelines such as those drawn from broad experiences across many segments of society might also be considered in this process (Kidder, 1995). In this way the geographic information science research community might arrive at recommended actions, if not recommended codes of conduct, that could better benefit those broad segments of society involved in the use of geographic information.
3.3 LEGAL CONDUCT
As stated earlier, legal conduct is typically defined in most modern societies by assuming that all behaviour choices are legal unless specifically banned by law. Thus it is primarily the defining of illegal behaviour that allows us to define legal behaviour. As technologies such as geographic information systems, global positioning systems, other location technologies, and computer networks have advanced and are embedded in everyday life, new actions and choices in behaviour arise that were never envisioned or contemplated by past lawmakers. Opportunities arise for some parties to take advantage of others. Much of the opportunist behaviour promotes competition and economic activity and thus the behaviour is considered positive. Other opportunist behaviour is considered so unfair that past laws are construed to cover the changed circumstances and remedies may be had under principles of equity in the courts. In yet other instances, legislators react by passing new laws to cover those unfair opportunist behaviours that new technologies have given rise to but that cannot be dealt with through the application of current laws. Thus, the law adapts over time. We are at a time in history when rapid changes in technology are causing rapid changes in societal relationships. Because of this, the parties to information policy debates often have very little evidence of the actual effects that proposed information policies or laws will have on various groups or individuals in society.
Arguments that proposed information policies or laws will or will not create fair, just, and equitable balances among members of society are often highly speculative since little or no evidence typically has been gathered to support the truth or falsity of the competing claims. The need for information on the
effects of alternative information policies and laws has created an important role for academics and researchers within the geographic information science community.
3.3.1 Roles for the Geographic Information Science Academic Sector
For important social disputes in which geographic data is involved, the academic sector should help articulate and constructively criticise the arguments presented by or for the various stakeholders in the social disputes. If the private sector information industry is in a pitched battle with local governments over how spatial data should be distributed, the academic sector can fill an important role by listening to all stakeholders in the active dispute, identifying the strongest arguments for each set of voices in the debate, and helping to better construct the logic and persuasiveness of those arguments. The academic sector also needs to expose and articulate arguments in support of disenfranchised groups in society. The academic sector should purposely broaden debates to include voices not actively being heard but that also have a stake in the outcome of information policy debates. Government agencies and the private business sector have greater resources at hand to promote their interests than many others in society. For this reason the academic sector has often taken on the role of citizen advocacy or minority advocacy as a social obligation when those voices otherwise would not be heard. For instance, it is largely the academic sector rather than government agencies or the private business sector that is raising concerns over the adverse impacts on personal information privacy caused by the pervasiveness of geographic data sets and the use of those datasets as tools for the massive integration of data about individuals. The academic community has an extremely important role to play in continually questioning the logic and validity of all arguments presented in information policy debates.
This community needs to evidence the truth or falseness of claims by collecting evidence on the ramifications of following one information policy or legal choice over another. The research community can inform important social debates by going out and observing information policy and law in action. If the claim is made that sale of GIS data by government increases the overall economic well being of a community or adversely impacts the overall economic well being, which of the claims actually holds up in practice and under what circumstances? The academic sector is particularly well suited to evidence the truth and falseness of claims and to observe information policies and laws in action. This is because their work is typically reviewed by peers with an expectation of full disclosure of study and survey methods. Considered reflection and analysis is the expected norm. If study methods are biased, they are expected to be exposed as such through the peer review process. In many social problem areas, scholars have little personal economic interest or other vested interests in how a social issue might be resolved. Therefore they are often able to judge the evidence, draw conclusions, and make recommendations more dispassionately than those to whom the outcomes matter. In the social arena of information policy debates, however, the academic community as a whole often does have a stake in the outcomes. Thus the review process by peers and scrutiny for bias needs to be more vigilant and scrupulous when vested interests of the academic community are evident. Finally, one of the most important roles for the academic community is to advance new knowledge. The research community should purposely set out to construct and recommend new or unexplored models of action or regulation that may better achieve the goals of all stakeholders in important social disputes. 
This of course is a long-term and iterative process since any new information policy models or approaches that may be suggested must also be tested and challenged in practice over time.
3.3.2 Needed Legal and Information Policy Knowledge Advancements
There are many unresolved legal and information policy issues germane to geographic data that could be enlightened by the process of aiding articulation of arguments, gathering data to test the veracity of arguments, and developing new models to guide policy making and law making. Some of the more pressing areas in which long-term iterative research should be accomplished within the geographic information policy and legal domains include intellectual property rights in geographic information (copyright, trademark, unfair competition, trade secret, patent), incursions on personal information privacy as a result of the availability or use of geographic datasets and processing, access to geographic data housed by government, liability for harmful geographic data and communication of such data, electronic commerce in geographic data (authentication, electronic contracting, admissibility of evidence), anti-trust implications of monopoly-like control over some geographic datasets, free trade in geographic data, and the implications of differences in law and information policies among nations on trade in geographic data, software, and services. Specific detailed research questions in each of the needs areas are readily available in the literature (Onsrud, 1995a). Similarly, a wide range of important unanswered information policy research questions have been raised relating to the sharing of geographic information (Onsrud and Rushton, 1995) and to the development of geographic information technologies that might better allow such technologies to be used for positive change in society (Harris and Weiner, 1996).
3.4 CONCLUSION
For as long as modern societies continue to experience rapidly changing technological environments, legal and government policy arrangements for managing and protecting geographic data will remain unclear.
Geographic data has been and will continue to be a test bed for developing new information practices and theory. The resolution of legal and information policy conflicts in this arena will have ramifications for other information technologies just as resolution of information policy issues in other arenas is affecting the handling of geographic data. The need for policy makers to reconcile competing social, economic, and political interests in geographic data will become more pressing over time, not less. To respond to these social needs, the geographic information science research community will need to increase fundamental research efforts to address ethical, information policy and legal issues in the context of geographic data.
ACKNOWLEDGEMENTS
This chapter is based upon work partially supported by the National Center for Geographic Information and Analysis (NCGIA) under NSF grant No. SBR 88–10917. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
REFERENCES
HARRIS, T. and WEINER, D. (Eds.) 1996. GIS and Society: The Social Implications of How People, Space and Environment are Represented in GIS, Technical Report 96–7. Santa Barbara: NCGIA, UCSB.
JOHNSON, O.A. 1984. Ethics: Selections from Classical and Contemporary Writers, 5th ed. New York: Holt, Rinehart and Winston.
KIDDER, R.M. 1995. How Good People Make Tough Choices: Resolving the Dilemmas of Ethical Living. New York: Simon & Schuster.
ONSRUD, H.J. (Ed.) 1995a. Proceedings of the Conference on Law and Information Policy for Spatial Databases. Orono, Maine: NCGIA, University of Maine.
ONSRUD, H.J. 1995b. Identifying Unethical Conduct in the Use of GIS, Cartography and Geographic Information Systems, 22(1), pp. 90–97.
ONSRUD, H.J. and RUSHTON, G. (Eds.) 1995. Sharing Geographic Information. Rutgers: CUPR Press.
Chapter Four
Building the European Geographic Information Resource Base: Towards a Policy-Driven Research Agenda
Massimo Craglia and Ian Masser
4.1 INTRODUCTION
The chapter reviews recent developments relating to the formulation of a policy framework for geographic information in Europe. It is divided into two parts which respectively examine the terms of the European debate with particular reference to the initiatives taken by the European Commission, and discuss some of the most urgent issues on which to focus future research efforts. What clearly emerges is that considerable efforts are being made to address the many and complex issues related to the development of a geographic information resource base in Europe and the need for the geographic information research community to engage in the debate and shape the research agenda. The nature of this agenda poses considerable challenges as it lies outside many of the traditional strengths of the European GIS research community and requires a much stronger inter-disciplinary dialogue than has hitherto been the case.
4.2 THE EMERGENCE OF A EUROPEAN POLICY FRAMEWORK FOR GEOGRAPHIC INFORMATION
4.2.1 Introduction
This section considers some factors which have led to the emergence of a European policy framework for geographic information. The first part discusses the political context of these developments and the second summarises the main proposals contained in the discussion document entitled GI 2000: Towards a European Policy Framework for Geographic Information, published by DGXIII/E in May 1996 (DGXIII/E, 1996a). To complete the picture the last part of this section describes a number of related developments that have been taking place in Europe over the last few years. Most of the projects discussed relate to the work of DGXIII/E. It should be noted however that DGXIII/E is not the only Directorate within the European Commission to express an interest in geographic information and GIS technology. These interests are shared by many other Directorates and also by Eurostat which has set up its own GIS (GISCO) to meet the needs of Community administrators.
A number of these Directorates have also commissioned GIS related projects under the Fourth Framework for Research and Development which runs from 1994–98. To give an indication of the wide range of GI related activities within the Commission, a search on the ECHO database of the Commission found over 100 “hits” for
projects dealing with or using GI(S). Most of these are primarily concerned with the application of GIS technology in the field of the environment, transport and spatial planning rather than geographic information policy itself. Consequently DGXIII/E, whose general responsibilities include telecommunications, the information market and the exploitation of the findings of research, has played a vital role in the development of European-wide geographic information policy.
4.2.2 The Political Context
The starting point for much of the current discussion is the vision for Europe that was presented to the European Council in Brussels in December 1993 by the then President Jacques Delors in the White Paper entitled Growth, Competitiveness and Employment: The Challenges and Ways Forward into the 21st Century (CEC, 1993). An important component of this vision is the development of the information society essentially within the triad of the European Union, the United States and Japan. One result of this initiative was the formation of a high level group of senior representatives from the industries involved under the chairmanship of Commissioner Martin Bangemann. This group prepared an action plan for “Europe and the global information society” which was presented to the European Council at the Corfu Summit in June 1994 (Bangemann, 1994). In this plan the group argued that recent developments in information and communications technology represent a new industrial revolution that is likely to have profound implications for European society. In order to take advantage of these developments it will be necessary to complete the liberalisation of the telecommunications sector and create the information superhighways that are needed for this purpose. With this in mind the group proposed ten specific actions.
These included far-reaching proposals for the application of new technology in fields such as road traffic management, trans-European public administration networks and city information highways. These proposals were subsequently largely incorporated into the Commission’s own plan, Europe’s Way to the Information Society, which was published in July 1994 (CEC, 1994).
4.2.3 GI 2000
Parallel to these developments a number of important steps have been taken towards the creation of a European policy framework for geographic information. In April 1994 a meeting of the heads of national geographical institutes was held in Luxembourg which concluded that “it is clear from the strong interest at this meeting that the time is right to begin discussions on the creation and supply of harmonised topographic data across Europe” (DGXIII/E, 1994). This view was reinforced by a letter sent to President Delors by the French minister M. Bosson which urged the Commission to set up a coordinated approach to geographic information in Europe, and the correspondence on this topic between the German and Spanish ministers and Commissioner Bangemann during the summer of 1994. As a result of these developments, a meeting of key people representing geographic information interests in each of the Member States was held in Luxembourg in February 1995. The basic objective of this meeting was to discuss a draft document entitled GI2000: Towards the European Geographic Information Infrastructure (DGXIII/E, 1995a) and identify what actions were needed in this respect. The main conclusion of this meeting was that “it is clear from the debate that the Commission has succeeded in identifying and bringing together the necessary national representative departments that can play a role in developing a Community Action Plan in Geographic Information” (DGXIII/E, 1995b, p. 12). As a result it
was agreed that DGXIII/E should initiate wide-ranging consultations within the European geographic information community with a view to the preparation of a policy document for the Council of Ministers (Masser and Salgé, 1996). Since 1995 the original document has been redrafted at least seven times to take account of the views expressed during different rounds of consultations. In the process its status has changed from a “discussion document” to a “policy document” and back to a “discussion document” again. In the most recent of these drafts, which is dated 15 May 1996 (DGXIII/E, 1996a), its title has also been changed from GI2000: Towards the European Geographic Information Infrastructure to GI2000: Towards a European Policy Framework for Geographic Information, and the term “infrastructure” has largely disappeared from the document to make it more palatable to policy makers who, it is argued, identify “infrastructure” with physical artefacts like pipes and roads and may get confused by the broader connotation of this term. Despite these changes in emphasis, the basic reasoning behind the argument that is presented remains essentially unchanged.
There is a discernible trend towards ad hoc harmonisation of geographic information in Europe… However, progress is being hampered by political and institutional considerations that need to be addressed at the highest levels if the opportunities provided by geographic information technology are to be fully exploited. To remove bottlenecks, reduce unnecessary costs and provide new market opportunities, a coherent European policy framework is needed in which the industry and market can prosper (DGXIII/E, 1996a, p. 11).
Consequently, what is required is a “policy framework to set up and maintain a stable, European wide set of agreed rules, standards, procedures, guidelines and incentives for creating, collecting, exchanging and using geographic information” (DGXIII/E, 1996a, p. 11).
The main practical objectives for the policy framework are:
1. To provide a permanent and formal, but open and flexible, framework for organising the provision, distribution and standardisation of geographic information for the benefit of all suppliers and users, both public and private.
2. To achieve a European wide meta data system for information exchanged that conforms to accepted world-wide practices.
3. As far as possible, to harmonise the objectives of national geographic information policies and to learn from experience at national level to ensure that EU-wide objectives can be met as well, at little additional cost and without further delay or waste of prior work already completed.
4. To lay the foundations for rapid growth in the market place by supporting the initiatives and structures needed to guarantee ready access to the wealth of geographic information that already exists in Europe, and to ensure that major tasks in data capture are cost effective, resulting in products and services usable at national and pan European scales.
5. To develop policies which aid European businesses in effective and efficient development of their home markets in a wide range of sectors by encouraging informed and innovative use of geographic information in all its many forms, including new tools and applications which can be used by non experts.
6. To facilitate the development of European policies in a wide range of fields by encouraging the promotion of new and sophisticated analysis, visualisation and presentation tools (including the relevant datasets) and the ability to monitor the efficacy of such policies.
7. To help realise the business opportunities for the intrinsic European geographic information industry in a global and competitive market place (DGXIII/E, 1996a, p. 13).
The current draft also contains a list of practical actions that are required in connection with this framework. These include:
• Actions to stimulate the creation of base data given that the lack of European base data is seen as “the single most important barrier to the development of the market for geographic information”. In this respect the role of the European Union is essentially to stimulate closer cooperation between national organisations and encourage the participation of the private sector in this task.
• Actions to stimulate the creation of meta data services to make it easier to locate existing information and promote data sharing across different applications.
• Actions to overcome legal barriers to the use of information while at the same time reducing the potential risks to society from the unrestricted application of modern information technology.
It is recognised that the involvement of many organisations and institutions within Europe will be required to create such a policy framework and that strong leadership and political support will be needed to carry the process forward. As no organisation exists with the political mandate to create geographic information policy at the European level, it is intended that the European Commission will seek such a mandate from the European Council. Once this is obtained it is envisaged that a high level task force will be set up to implement the policy outlined in the document.
4.2.4 Related R&D Developments
A number of related R&D projects have been commissioned by DGXIII/E within the context of the European policy framework, including the IMPACT-2 projects, the three GI studies and INFO2000.
The projects developed as part of the IMPACT-2 programme (1993–95) addressed a number of GI-based applications including tourism, education, and socio-economic analysis (see www2.echo.lu/gi/projects/en/impact2/impact.html). They were particularly useful in building operational experience with respect to the institutional, cultural and legal barriers that need to be resolved for the creation of transnational databases within Europe (Longhorn, 1998).

The three GI studies commissioned by DGXIII in 1995 related specifically to issues arising out of the discussions regarding the European policy framework. Their basic objectives are reflected in their titles (see also www2.echo.lu/gi/docarchive/ref_doc.html):
• Study on policy issues relating to geographic information in Europe (GI-POLICY)
• Study on demand and supply for geographic information in Europe, including base data (GI-BASE)
• Feasibility study for establishing European-wide metadata services for geographic information in Europe (GI-META)

Work on these projects began at the start of 1996 and was completed in 1997.
4.2.4.1 INFO2000

In 1997 DGXIII/E launched its new four-year research programme, which has been allocated a provisional budget of 65 million ECU. The basic objectives of this programme are to stimulate the multimedia content industry and to encourage the use of multimedia content in the emerging information society. Nearly half the budget for this programme is allocated to action line three for projects that will accelerate the development of the European multimedia industry in four key areas: cultural heritage; business services; geographic information; and science, technology and medical information. To receive support from the geographic information component of this programme, projects will have to satisfy at least one of the following objectives:
- to demonstrate through innovative pilot applications the advances made in integrated base data and thematic information content…
- to provide pan-European information about GI, what is held, in what format and how it is accessed (metadata services and their linking)
- to demonstrate integration or interlinking of base data of a pan-European, or trans-border nature that may form the building block of future commercial applications, especially where such projects will help to establish common specifications for pan-European GI datasets
- to demonstrate methodologies for collecting, exchanging and using pan-European or trans-border GI, including provision for networked access to other services (DGXIII/E, 1996b, p. 13).

4.2.5 Evaluation

From the discussion above it can be seen that geographic information occupies a prominent position in European policy debates at the present time. The initial stimulus for these debates arose out of the growing concerns among leading politicians and policy makers about the need to maintain Europe’s economic competitiveness in an emerging information society.
One result of these debates is a wide-ranging round of discussions that is currently taking place on the subject of the European policy framework for geographic information, which aims to stimulate action by the European Commission itself. Parallel to these discussions, DGXIII/E has commissioned a number of R&D projects whose findings are likely to inform this debate further.

4.3 TOWARDS A POLICY-DRIVEN RESEARCH AGENDA FOR GI

4.3.1 Introduction

Having reviewed some of the main developments in the debate on the European Policy Framework, this section discusses some of the activities that are currently taking place to define a research agenda for GI in the context of the GI2000 initiative. Before discussing the research agenda, it is useful to outline the current thinking at EU Commission level in relation to the Fifth R&D Framework (1999–2002), and the role that GI may play in it, as well as some of the research topics being canvassed at the present time within the GI
research community. Reference is therefore made to the discussion documents circulated at an R&D meeting hosted by DGXIII/E in Luxembourg on 20 June 1996. This meeting was part of a broad consultation process on GI2000 which had already involved representatives of the GI user and GI producer communities. Its objectives were primarily to gather feedback on GI2000, but also to canvass ideas on the key issues relating to GI that ought to be included in the forthcoming Fifth R&D Framework. Although the ideas presented in the following sections were preliminary in nature, it is felt, nevertheless, that they provide the starting point for the development of the future policy-driven research agenda.

4.3.2 The Fifth Framework for R&D

The general aims of the Fifth Framework are likely to broaden those of the Fourth Framework (1994–1998), i.e. “to strengthen the scientific and technical base of the EU”, to include issues related to growth, competitiveness, employment and the information society (see CEC 1993, 1994). Hence there is an even stronger connection to industry and job creation as well as to developing the opportunities for greater communication and participation within the European Union. Nevertheless there is a growing recognition that long-term research is also a strategic area for the EU, and that there is a need to support European scientists in partnership with European industry. This need, however, must be balanced against the overall aims of the Fifth R&D programme.

Underpinning the themes of the Fifth Framework are some broad expected trends, namely:
• The continuing reduction in cost and increased performance of Information and Communication Technologies (ICT), which will spread usage more and more to non-specialists.
• The pervasiveness of computer networking, which requires measures to encourage European companies to move to networked applications.
• The growth of electronic commerce and the development of interactive mass media.
• The increased competition of generic ICT services in areas of traditional media, with significant impacts as e-mail becomes the dominant form of written communication and conventional printing becomes completely digital (DGXIII/E, 1996c).

Against this background the main areas for GI research in the Fifth Framework being canvassed at present are:
• GI data generalisation: i.e. the need to develop the technology further so that high resolution data collected for one application, such as land management, can be used for another application with lower resolution needs such as tourism, fleet management or environmental monitoring. For this purpose further research on scale-less digital data is needed.
• GI data visualisation, including virtual reality, to satisfy the presentation needs of both existing and new GI users.
• Geospatial image and text integration for presentation and analysis. Here the key issues relate to indexing very large datasets, data quality, accuracy and precision, the combined effects of which have yet to be fully addressed by the industry.
• GI in model building: this includes new algorithms and methodologies to promote greater and better use of GI for citizens, government, and industry and further research on handling temporal data and on error propagation.
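As a purely illustrative sketch of what generalisation involves at its simplest, the following Python fragment reduces a high resolution line to a coarser one using the classical Douglas-Peucker line simplification algorithm. The algorithm is not named in the discussion documents and is chosen here only as a familiar example of cartographic generalisation; the function names, coordinates and tolerance are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of a list of (x, y) vertices.

    Vertices lying within `tolerance` of the line joining the end
    points are dropped; the end points are always kept.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical high resolution boundary with small deviations.
boundary = [(0, 0), (1, 0.05), (2, 0), (3, 0.04), (4, 0)]
coarse = simplify(boundary, tolerance=0.1)
print(coarse)
```

With a tolerance of 0.1 the small deviations are discarded and only the two end points remain. The sketch addresses geometry only and says nothing about how attribute data should be generalised when resolution changes.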
Attention is also being given to Geospatial Publishing (collection, creation, packaging, labelling, certification, marketing, distribution), Intelligent Maps (transforming spatial data into knowledge, automated feature recognition from images, virtual reality for visualisation), and Geospatial Data on the Net (new methodologies, standards for data storing, indexing, accessing, and delivering, seamless data integration, and data clustering using object-oriented techniques).

From the research topics outlined above it can be seen that the focus is primarily on the technical aspects of GI handling and the opportunities opened up by technological developments. This may be due to the partnership with industry which is one of the hallmarks of the EU research and development programmes. However, there is also considerable evidence which suggests that the barriers to exploiting these opportunities to the full are less technical than conceptual, legal, institutional, organisational, and related to education and awareness (Burrough et al., 1996; Masser et al., 1996; Nansen et al., 1996). With this in mind, the following section puts forward a policy-driven research agenda to support the development of the GI Policy Framework based on the experience of the European Science Foundation GISDATA scientific programme.

4.3.3 A Policy-Driven Research Agenda

The European Science Foundation GISDATA scientific programme was built around a research agenda which identified three main clusters of topics (Arnaud et al., 1993). These were geographic databases, data integration and socio-economic applications (see www.shef.ac.uk/uni/academic/D-H/gis/gisdata.html). The programme has involved over 300 GI researchers from 20 European countries and the USA during the period 1993–97, becoming a recognised voice of the European GI research community as a whole.
In the light of the experience of this programme the following topics for a policy-driven research agenda can be identified with respect to the needs of the GI2000 initiative. These can also be grouped in the three main clusters of the original GISDATA research agenda.

4.3.3.1 Geographic Databases

Model generalisation: this is a key area which continues to be critical for the development of seamless scale-less databases. Both GISDATA and the NCGIA have essentially addressed issues of cartographic generalisation and have not gone very far on the much thornier issue of how to generalise attribute data, and hence the conceptual data model, when moving from one level of resolution to another or integrating data from different sources having different resolution levels.

Spatial data handling on the Web: it is clear that GI handling technologies will be based less and less on proprietary software on single installations such as a workstation, and more and more on distributed network environments which are platform-independent and use a whole suite of software tools and applets. Similarly the development of large digital libraries and metadata services needs further research on spatial agents and appropriate methodologies for data storage, indexing, retrieval and integration. Data quality issues will also increase in importance with the use of third party data rather than data internal to the organisation.

Handling qualitative GI: this is a topic that has consistently emerged as requiring special attention to broaden the involvement of the public and non-GI expert users in using GI and related handling technologies. The geography and concerns of individuals and of disciplines in the social sciences outside geography are
often expressed in qualitative terms rather than cartesian/quantitative ones. How to represent and incorporate these other geographies into our current technologies? How to develop databases and conceptual models that can integrate qualitative and quantitative data types? How to develop the social dimension of GIS? (see also Helen Couclelis, in Chapter 2 of this Volume).

4.3.3.2 Data Integration

Liability in the digital age: the liability of data producers and vendors for erroneous digital GI is an important issue that may currently be hampering the development of the market. Yet real evidence of cases is patchy and often not disseminated. There is an urgent need to collect the evidence that is emerging worldwide and develop standard quality assurance mechanisms involving technologists, legal expertise, and data quality experts so that costly mistakes may be avoided in the future whilst providing a framework for the development of the GI market.

Protecting Confidentiality: the opportunities for integrating different data sets are at the core of what is special about GI. Nevertheless there is a need to ensure that exploiting these opportunities is not at the expense of individuals’ rights to privacy. The statistical community has addressed this issue over the years and identified ways to reduce the risks of disclosure, typically by aggregating data at census tract level and/or anonymising detailed records. The recent developments in GI handling technologies offer both opportunities and threats. On the one hand it is now possible to design census areas that are more homogeneous, and hence more valuable for analysis, whilst still protecting confidentiality. On the other, it is equally possible to combine different data sets and arrive at almost individual profiles, by-passing to a large extent the restrictions put in place in the past.
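The aggregation approach to disclosure control mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a description of any statistical office's practice: the record layout, the function name and the minimum cell size of 5 are all hypothetical, and the suppression of small cells is added here as one common companion to aggregation.

```python
from collections import Counter


def aggregate_with_suppression(records, min_cell=5):
    """Aggregate individual records to tract-level counts.

    Each record is a dict with hypothetical keys "tract" and
    "category". Cells with fewer than `min_cell` records are
    replaced by None so that they cannot be used to single out
    individuals. The threshold of 5 is an assumption.
    """
    counts = Counter((r["tract"], r["category"]) for r in records)
    return {cell: (n if n >= min_cell else None) for cell, n in counts.items()}


# Hypothetical individual-level records.
records = (
    [{"tract": "T1", "category": "low income"}] * 12
    + [{"tract": "T1", "category": "high income"}] * 2
    + [{"tract": "T2", "category": "low income"}] * 7
)

table = aggregate_with_suppression(records)
for cell, n in sorted(table.items()):
    print(cell, "suppressed" if n is None else n)
```

Suppression of this kind protects a single table only. It does not address the risk noted above that several data sets, once combined, may still yield near-individual profiles.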
Confidentiality is a topic that cannot be left to the market to develop, as the solutions are likely to be very unsatisfactory from a civil rights point of view. Hence the involvement of the research community and government is needed.

The economics of the digital information market: research is needed in this area as there is general agreement that the market for digital GI is still immature and that its characteristics are poorly understood. This results in a great many assumptions being made about the potential value of integrating very many data sets for market development, which are reminiscent of the many claims made ten years ago about the potential created by GIS for increasing efficiency and reducing costs. The latter have not materialised and there is also a strong case for a realistic assessment of the market for GI. Recent studies on the economics of GI (Coopers and Lybrand, 1996) provide a useful starting point for research in this field, which also needs to consider both the multiplier effects of digital GI and the implications for job creation, loss and displacement. For the latter, evidence from other industries such as telecommunications and banking should also be taken into account.

4.3.3.3 Applications

Risk Management in Europe: the experience of the last 18 months of discussion on the EGII clearly indicates the lack of awareness of policy makers at EU, national, and local level of the specific nature and importance of GI. For this reason there is a need to bring home forcefully the message that GI is essential for effective service delivery and governance. Given potentially catastrophic events such as the floods in The Netherlands in 1995 and Italy in 1996, and trans-border issues like pollution and contamination, research on emergency services and the role of GI in their planning and delivery will highlight the issues in relation to
analytical and modelling requirements, technical infrastructure for data exchange, minimum data and common definitions needs, as well as contributing to raising awareness at senior level.

Geographic monitoring systems: an increasing number of applications require real-time monitoring of events such as traffic control, air and water quality, emissions, and temperatures. Yet current GI technology is relatively poor at handling dynamic and temporal data. A number of research-led initiatives in this field have been developed recently but further progress is needed, building on the experiences of different industrial sectors and disciplines, particularly those more used to handling time/flows (e.g. traffic control, utilities) and those traditionally focusing on spatial analysis (geography). Research in this area needs to identify what can be done better with current technology and methods, and the priorities of future applied projects in addressing the needs of industry and governance.

Integrating spatial models: policy at local, national and European levels tends to be formulated by sector (e.g. agriculture, transport, environment, regional development, health). Yet each set of policies has profound spatial implications, the cumulative effects of which are little understood. Efforts in the late 1960s to develop comprehensive planning largely failed due to lack of political commitment but also because the technology of the time could not cope with the complexity of the systems and their interrelationships. Are the combined pressures for reduced government involvement, more effective targeting of resources, and enhanced technology changing the picture? Is it possible to evaluate the combined effects of different policies and their underpinning theories and models? What is the impact on resource allocation and service provision? Research is clearly needed to look at both technological and methodological issues in this area.

4.4 CONCLUSIONS

This chapter reviewed recent developments in Europe with respect to the creation of the European geographic information resource base and put forward a policy-driven research agenda to support these developments. This agenda does not focus on those topics on which progress in one form or another is to be expected relatively soon, such as the development of metadata services and the definition of core base data. Almost all parties agree that these are of primary importance and at least some funding on these issues is already in place in the INFO2000 programme. The agenda therefore addresses specifically those issues where research is needed to inform policy formulation and where important ethical issues are at stake, such as in the area of confidentiality.

What clearly emerges from the discussion is that there has been a sustained effort over the last two years to make progress in developing the European GI resource base. However, a great deal of work remains to be done as there are still significant barriers to overcome. They largely stem from the limited awareness of policy makers at all levels (European, national, and local) as to the strategic value of developing the GI resource base, the difficulty of achieving coordination among so many stakeholders in this field, and the immaturity of the market. It should also be noted that many of the topics defined above lie outside the traditional strengths of the European GIS research community and will require a much increased interdisciplinary effort. For this reason in particular the agenda set above represents both a formidable challenge and an opportunity for European GIS research.

ACKNOWLEDGEMENT

The views expressed in this chapter are those of the authors alone and do not necessarily reflect those of the European agencies referred to in the text. Prof. Masser’s research for this chapter is part of that undertaken
for an ESRC Senior Research Fellowship award H51427501895 on Building the European Information Resource Base.

REFERENCES

ARNAUD, A., CRAGLIA, M., MASSER, I., SALGÉ, F. and SCHOLTEN, H. 1993. The research agenda of the European Science Foundation’s GISDATA scientific programme, International Journal of GIS, 7(5), pp. 463–470.
BANGEMANN, M. 1994. Europe and the Global Information Society: Recommendations to the European Council. Brussels: Commission of the European Communities.
BURROUGH, P., CRAGLIA, M., MASSER, I. and SALGÉ, F. 1996. Geographic Information: the European Dimension. http://www.shef.ac.uk/uni/academic/D-H/gis/policy_l.html.
CEC 1993. Growth, Competitiveness and Employment: the Challenges and Ways Forward into the 21st Century. Brussels: Commission of the European Communities.
CEC 1994. Europe’s Way to the Information Society: An Action Plan, COM(94) 347 Final. Brussels: Commission of the European Communities.
COOPERS AND LYBRAND 1996. Economic Aspects of the Collection, Dissemination and Integration of Government’s Geospatial Information. Southampton: Ordnance Survey.
DGXIII/E 1994. Heads of National Geographic Institutes: Report on Meeting Held on 8 April 1994. Luxembourg: Commission of the European Communities, DGXIII/E.
DGXIII/E 1995a. GI 2000: Towards a European Geographic Information Infrastructure. Luxembourg: Commission of the European Communities, DGXIII/E.
DGXIII/E 1995b. Minutes of the GI 2000 meeting, 8 February 1995. Luxembourg: Commission of the European Communities, DGXIII/E.
DGXIII/E 1996a. GI2000: Towards a European Policy Framework for Geographic Information: A Discussion Document. Luxembourg: Commission of the European Communities, DGXIII/E.
DGXIII/E 1996b. INFO 2000: Stimulating the Development and Use of Multimedia Information Content. Luxembourg: Commission of the European Communities, DGXIII/E.
DGXIII/E 1996c. Fifth Framework Programme: DG XIII/E-E Information Content: Geographic Information. Paper for discussion at the R&D meeting, 20 June 1996, Luxembourg. Luxembourg: Commission of the European Communities, DGXIII/E.
LONGHORN, R. 1998. An evaluation of the experience of the IMPACT-2 programme, in Burrough, P. and Masser, I. (Eds.), European Geographic Information Infrastructures: Opportunities and Pitfalls. London: Taylor & Francis.
MASSER, I. and SALGÉ, F. 1996. The European geographic information infrastructure debate, in Craglia, M. and Couclelis, H. (Eds.), Geographic Information Research: Bridging the Atlantic. London: Taylor & Francis, pp. 28–36.
MASSER, I., CAMPBELL, H. and CRAGLIA, M. (Eds.) 1996. GIS Diffusion: the Adoption and Use of Geographical Information Systems in Local Government in Europe. London: Taylor & Francis.
NANSEN, B., SMITH, N. and DAVEY, A. 1996. A British national geospatial data base, Mapping Awareness, 10(3), pp. 18–20 and 10(4), pp. 38–40.
Chapter Five
GIS, Environmental Equity Analysis, and the Modifiable Areal Unit Problem (MAUP)
Daniel Sui
When the search for truth is confused with political advocacy, the pursuit of knowledge is reduced to the quest for power (Chase, 1995).

5.1 INTRODUCTION

The issue of environmental equity, namely whether minorities and low-income communities across the United States share a disproportionate burden of environmental hazards, has attracted intensive interdisciplinary research efforts in recent years (Bowen et al., 1995; Cutter, 1995). Because of the increasing availability of, and easy access to, several national spatial databases, such as the US EPA’s toxic release inventory (TRI) database and the Census Bureau’s TIGER files, GIS technology has been widely used in environmental equity analysis during the past five years (Burke, 1993; Chakraborty and Armstrong, 1994; Lowry et al., 1995). However, most previous studies were based upon only a single set of analytical units, such as census tracts, zip code areas, or counties. To date, environmental equity analysis has been conducted using a variety of areal unit boundaries at different geographical scales without considering the effects of the modifiable areal unit problem (MAUP). The MAUP refers to the fact that conclusions in geographic studies are highly sensitive to the scale and the zoning scheme (areal boundaries) used in the analysis (Openshaw, 1983). Numerous empirical studies have revealed that the inclusion of scale and areal boundary factors can alter the conclusions of a theory dramatically (Openshaw, 1984; Fotheringham and Wong, 1991; Amrhein, 1995). Despite, or perhaps because of, their critical importance, the scale and areal unit boundaries chosen in previous environmental equity analyses were often dictated more by expediency than by rational justification. Not surprisingly, diametrically opposing conclusions have been reported in the literature even though basically the same data set was used in the analysis (Bullard, 1990; Goldman and Fitton, 1994).
Among the conflicting evidence provided in the literature, we still do not know to what extent the scale and unit of analysis may have over- or under-estimated the relationship between the distribution of toxic facilities and the characteristics of the affected population. The goal of this research is to take a systematic approach to addressing the MAUP in environmental equity analysis using GIS and to discuss the ethical ramifications of GIS as a social technology.

This chapter is organised into seven sections. After a brief introduction, the research background and a literature review are presented in the second section. The third section describes specific research objectives and hypotheses. The fourth section introduces the methodology, followed by empirical results in the fifth
section and further discussions from a critical social theoretic perspective in the sixth section. The last section contains concluding remarks and future research plans.

5.2 RESEARCH BACKGROUND AND REVIEW OF LITERATURE

This research is framed by three sets of extensive literature: recent debates on environmental equity analysis and environmental racism; previous studies on ecological fallacies and the MAUP; and current efforts to explore the social implications of GIS technology via the NCGIA Initiative 19 (I-19).

5.2.1 The Social Problem: Environmental Equity Analysis and the Debate on Environmental Racism

The current literature on the existence and extent of environmental inequity or racism has taken two general approaches: ecological studies that examine the geographical collocation of racial and ethnic minorities and potentially hazardous facilities (Bryant and Mohai, 1992); and case studies of specific instances of environmental injustice in particular areas associated with specific facilities (Edelstein, 1988). Whereas the ecological studies attempt to determine the extent of environmental injustice, they are susceptible to constraints brought about by the fundamental ecological units of analysis. Case studies have the advantage over the ecological approach of being better able to trace the processes involved and can often conduct detailed and specific analysis aimed at establishing relationships and causes associated with the circumstances of the case. Unfortunately, case studies alone are ineffective in determining the spatial extent of the problem. The inherent difficulties in both the ecological and case study approaches demand that the macro-level ecological study be integrated with micro-level case studies. Unfortunately, such an integration is rare in the literature.
Most previous studies have used only one set of units of analysis, such as census block groups (von Braun, 1993), census tracts (Burke, 1993), and zip codes (United Church of Christ, 1987). Only a few authors have mentioned the potential impacts of geographic scales (Bowen et al., 1995; Marr and Morris, 1995). Depending on the type and volume of toxic materials released and the medium into which they are released (air, water, or soil), these toxic materials will affect populations at different distances (Hallman and Wanderman, 1989). A clear understanding of the impact of geographic scales and units will be the first step towards identifying the most appropriate unit and scale for environmental equity analysis and policy implementation.

5.2.2 The Conceptual Problem: Ecological Fallacies and the Modifiable Areal Unit Problem

The effects of scales and areal unit boundaries on the results of geographical studies have been reported extensively in the literature. Comprehensive reviews on ecological fallacies and the MAUP have been provided by Langbein and Lichtman (1978), Openshaw (1983), and Fotheringham and Wong (1991). Ecological fallacy and MAUP studies can be traced at least to Gehlke and Biehl (1934) and Neprash (1934). Robinson (1950) showed empirical evidence on how effects of scales and aggregation methods can dramatically change results in illiteracy studies. These pioneering works have stimulated a wide range of studies on the impact of scale and unit of analysis in environment/ecological modelling and socio-economic
studies. Although the results vary according to the problems being examined, general conclusions from this wide range of studies indicate that parameters and processes important at one scale or unit are frequently not important or predictive at another scale or unit. Information is often lost as spatial data are aggregated to coarser scales or resolutions. Significant changes may occur when we move from one scale to another or from one zoning system of analysis to another. Each level has its own unique properties that cannot be derived by mere summation of the disaggregated parts. So far no ideal solutions have been developed to solve this stubborn problem in spatial analysis. Depending on the substantive issues being examined, the following methods for tackling the MAUP have been reported (Wong, 1996):
1. To identify the basic units and derive optimal scales and zonal configurations for the phenomena being studied (Openshaw, 1983).
2. To conduct sensitivity analysis and shift the emphasis of spatial analysis towards the rates of change across different scales and areal unit boundaries (Fotheringham and Wong, 1991).
3. To abandon traditional statistical analysis and develop scale-independent or frame-independent analytical techniques (Tobler, 1989).

5.2.3 The Philosophical Problem: GIS and Society

The massive proliferation of GIS in society has sparked a new research effort focusing on the dialectical relationship between GIS and society, namely how social contexts have shaped the production and use of GIS and how GIS technology has shaped the outcomes and solutions of societal problems (Sheppard, 1995). Current research has transcended the initial polarising debate (Sui, 1994) and an ongoing collaboration between GISers and social theorists is being developed via NCGIA Initiative 19 (I-19).
Specifically, I-19 is designed to address five philosophical issues with regard to GIS and society (Curry et al., 1995):
1) In what ways have particular ontologies and epistemologies been incorporated into existing GIS techniques, and in what ways have alternative forms of representation and reasoning been filtered out?
2) How have the commodification, the proliferation, and dissemination of databases associated with GIS, as well as differential access to spatial databases, influenced the ability of different social groups to utilise information for their own empowerment?
3) How can the local knowledge, needs, desires, and hopes of marginalised social groups be adequately represented in GIS-based decision-making processes?
4) What are the possibilities and limitations of using GIS as a participatory tool for more democratic resolution of social and environmental conflicts?
5) What kind of ethical codes and regulatory frameworks should be developed for GIS applications in society?

I-19 addresses these conceptual issues in the context of three research themes:
1) the administration and control of populations by both public and private institutions;
2) the locational conflict involving disadvantaged populations;
3) the political ecology of natural resource access and use.
This research is situated within these broader questions concerning the social implications of how people, space, and environment are represented in GIS. The empirical results will shed light on many of the issues posed in I-19 (see also Couclelis, Chapter 2 in this volume).

5.3 RESEARCH OBJECTIVES AND HYPOTHESES

The primary objective of this research is to investigate empirically the effects of the MAUP on the results of GIS-based environmental equity analysis. By tying this research to previous studies on the MAUP and ecological fallacies, two sets of testable hypotheses are proposed: the scale-dependency hypothesis and the areal unit boundary-dependency hypothesis.

The scale-dependency hypothesis posits that in environmental equity analysis, the results concerning the relationship between the racial (or socio-economic) status of particular neighbourhoods and the distribution of noxious facilities depend on the geographical scales used in the analysis. Specifically, it was expected that a strong correlation of race (or income) with environmental inequity at one geographical scale is not necessarily significant at either higher or lower geographic levels of analysis. The number of important variables generally decreases towards broader scales.

The areal unit boundary-dependency hypothesis contends that different areal unit configurations (zoning schemes) at the same scale, which usually result in different aggregation (grouping) of the demographic and environmental data, will produce different results in environmental equity analysis. Similar to the scale hypothesis, it was expected that a strong correlation of race (or income) with environmental inequity at a particular areal unit configuration does not necessarily imply that the relationship will be significant for other areal unit boundaries. Likewise, changes in areal unit boundaries may dramatically alter the results of environmental equity analysis.
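The logic of the two hypotheses can be illustrated with a small synthetic example. The following Python sketch is not the procedure used in this study: the eight base units, their values and the three zoning schemes are invented for illustration, and zone values are taken as simple means of the member units, which assumes units of equal population.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def aggregate(values, zones):
    """Mean of `values` within each zone label, ordered by label."""
    groups = defaultdict(list)
    for value, zone in zip(values, zones):
        groups[zone].append(value)
    return [sum(g) / len(g) for _, g in sorted(groups.items())]


def zonal_correlation(x, y, zones):
    """Correlation between x and y after aggregation to the zones."""
    return pearson(aggregate(x, zones), aggregate(y, zones))


# Hypothetical values for eight adjacent base units along a transect.
minority_share = [1, 3, 2, 4, 3, 5, 4, 6]
facility_density = [3, 1, 4, 2, 5, 3, 6, 4]

# Zone label of each base unit under three hypothetical schemes.
zonings = {
    "base units": [0, 1, 2, 3, 4, 5, 6, 7],     # no aggregation
    "pairs": [0, 0, 1, 1, 2, 2, 3, 3],          # coarser scale
    "shifted pairs": [0, 1, 1, 2, 2, 3, 3, 4],  # similar scale, other boundaries
}

results = {
    name: zonal_correlation(minority_share, facility_density, zones)
    for name, zones in zonings.items()
}
for name, r in results.items():
    print(f"{name}: r = {r:.3f}")
```

In this constructed case the correlation is about 0.11 for the base units, exactly 1.0 when adjacent units are paired, and about 0.75 when the pair boundaries are shifted by one unit. The same underlying data therefore support a negligible, a perfect or a moderately strong association depending solely on scale and zoning.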
Unlike the effects of scale, the impacts of areal unit boundaries may be less predictable. Haphazard selection of areal unit boundaries may create haphazard results.

A secondary objective of this research is to contextualise GIS in society conceptually via the empirical environmental equity analysis. By tying the empirical results to Heidegger’s enframing theory of technology (Heidegger, 1972) and Habermas’s communicative theory of society (Habermas, 1984, 1987), this chapter calls for a shift from an instrumental rationality to a critical rationality for GIS applications in the social arena.

5.4 DATA AND METHODOLOGY

5.4.1 Study Area and Data

In this project, the city of Houston, Texas, serves as the study area because of its diverse ethnic groups, its noted environmental problems related to the petrochemical industry, and its lack of zoning laws. Because the city of Houston proper lies predominantly within Harris County, we use the Harris County boundary as a substitute for Houston’s city limits. The primary data source for this study is the US Environmental Protection Agency’s (EPA) National Toxic Release Inventory (TRI) database from 1987–90. The TRI database contains a complete inventory of toxic release sites in all major US cities. For each toxic release site, this database provides detailed information about the type of chemicals released at each site and precise locational information in the
latitude/longitude format. The demographic and socio-economic data come from the 1990 Census Summary Tape Files (STF), which were subsequently merged with the census block group, census tract, and zip code boundaries in the 1992 TIGER files.

5.4.2 Analytical Procedures

This project was conducted in the following three stages:

STAGE ONE: DATA COLLECTION.
We first downloaded Houston’s TRI site data from CD-ROM. Based upon the latitude/longitude information for each site, the TRI data were converted into ARC/INFO coverage format and merged with the 1990 TIGER files and 1990 census demographic and socio-economic data. STAGE TWO: GIS ANALYSIS.
During the second stage of this project, GIS analyses were conducted to derive all the data needed for testing the two hypotheses. Two different methods have been used to aggregate data using different geographical scales and areal unit boundaries: the deterministic approach using the predefined scales and boundaries and the stochastic approach using a double random procedure. The deterministic approach: for the scale-dependency hypothesis, all the TRI sites, demographic, social, and economic data were aggregated at three geographical scales: census block group, census tract, and zip code (Figure 5.1). For the zoning-dependency hypothesis, the census tract level data were re-aggregated to three zoning schemes. ARC/INFO was used to create three new sets of spatial units: buffer zones along major highways (1.5-mile); concentric rings from major population centres (1.5-mile, 3-mile, and 4.5-mile); and sectoral radiating patterns from Houston’s three major ethnic enclaves (45-degree sectoral patterns on four concentric rings with 1.5-mile intervals) (Figure 5.2). The TRI sites, demographic, and socio-economic data were also re-aggregated according to these three new sets of spatial units. The stochastic approach: this approach aggregates data to different scales and areal unit boundaries using a double random procedure with contiguity constraints. The algorithm was originally developed by Openshaw (1977) and essentially works in the following way: if a coverage consisting of N zones is required from the initial M zone system (M>N), N seed polygons are randomly selected. One of the remaining (M-N) polygons is then randomly selected and tested for adjacency to each of the N seed polygons. According to its location, it is either added to one of the seed polygons or replaced by another randomly selected polygon. The process iterates until all the (M-N) polygons have been allocated. 
The centroid of an aggregate is usually defined as the centroid of its constituent zones. An important feature of this aggregation procedure is that it preserves the basic structure of the underlying zones. However, because of the double randomness inherent in the planting of seed polygons and the allocation of the remaining polygons, each run produces a different set of areal unit boundaries. The SAM (Spatial Aggregation Machine) program developed by Yichun Xie at Eastern Michigan University was used to carry out the random aggregation. Sixteen hundred census block groups were used as the initial zones, which were successively aggregated to 1000 units, 800 units, 600 units, 400 units, and 200 units to test the scale hypothesis. For each scale, ten different areal boundary configurations were randomly formed to test the areal unit boundary-dependency hypothesis. The attribute data were also aggregated according to each scale and areal unit boundary configuration.
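The double random procedure described above can be sketched in a few lines of Python. This is only an illustration of the idea as the text describes it (random seeds, then random allocation of the remaining zones subject to contiguity); it is not the SAM program or Openshaw's published code. The function names `random_aggregate` and `grid_adjacency`, and the use of a square lattice as the zone system, are assumptions made for the example.

```python
import random


def grid_adjacency(rows, cols):
    """Rook adjacency for a rows x cols lattice of hypothetical zones."""
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            neighbours = set()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbours.add((rr, cc))
            adjacency[(r, c)] = neighbours
    return adjacency


def random_aggregate(adjacency, n_regions, seed=None):
    """Group zones into n_regions contiguous regions by random seeded growth.

    adjacency maps each zone id to the set of zone ids it touches.
    Returns a dict mapping zone id -> region index (0 .. n_regions - 1).
    """
    rng = random.Random(seed)
    zones = sorted(adjacency)
    if not 1 <= n_regions <= len(zones):
        raise ValueError("n_regions must be between 1 and the number of zones")

    # First source of randomness: plant the seed zones.
    region_of = {zone: i for i, zone in enumerate(rng.sample(zones, n_regions))}
    unassigned = [zone for zone in zones if zone not in region_of]

    while unassigned:
        # Second source of randomness: try the remaining zones in random order
        # and allocate the first one that touches an existing region.
        rng.shuffle(unassigned)
        for idx, zone in enumerate(unassigned):
            touching = sorted({region_of[n] for n in adjacency[zone] if n in region_of})
            if touching:
                region_of[zone] = rng.choice(touching)
                unassigned.pop(idx)
                break
        else:
            raise ValueError("some zones are not connected to any seed zone")
    return region_of


if __name__ == "__main__":
    adjacency = grid_adjacency(10, 10)
    # Two runs at the same scale give two different zoning schemes.
    zoning_a = random_aggregate(adjacency, 8, seed=1)
    zoning_b = random_aggregate(adjacency, 8, seed=2)
    print(sum(zoning_a[z] != zoning_b[z] for z in adjacency), "zones labelled differently")
```

Because every zone is attached to a region it already touches, each region stays contiguous, and changing the seed changes the boundaries while the number of regions (the scale) stays fixed.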
Figure 5.1: Aggregation of TRI site data to three pre-defined scales.
After these aggregations using the deterministic and stochastic approaches, we obtained the following 56 derived data sets:

1. Six data sets derived from the deterministic approach: three geographical scales (block groups, census tracts, and zip code areas) and three areal unit configurations (buffer zones, concentric rings, and sectoral radiations).
2. Fifty data sets from the stochastic approach: five geographical scales (200-unit, 400-unit, 600-unit, 800-unit, and 1000-unit) and ten random areal unit boundary configurations for each scale.

Statistical analyses were conducted to examine the changes in the relationship between race, class, and the distribution of toxic materials.

STAGE THREE: STATISTICAL ANALYSIS.
For each of the above derived data sets, statistical analyses were conducted to examine how the relationship between environmental hazards and the characteristics of the surrounding population will change under different geographical scales and zoning schemes. The following two models were estimated using SAS 6.02 for Windows:
where YTRI# is the total number of TRI facilities; PMinority is the percentage of minority population; IPerIncome is per capita income; PDensity is population density; PBlack is the percentage of black population; PHispanic is the percentage of Hispanic population; and PAsian is the percentage of Asian population.
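The two model equations themselves are not reproduced in the text above. Judging only from the variable definitions and from the three coefficients b1, b2, and b3 reported for each model in Tables 5.1 and 5.2, one plausible reading is a pair of linear regressions. The intercept a, the error term ε, and the linear functional form are assumptions here, not something the source confirms:

```latex
\begin{align}
Y_{TRI\#} &= a + b_1 P_{Minority} + b_2 I_{PerIncome} + b_3 P_{Density} + \varepsilon && \text{(Model 1)} \\
Y_{TRI\#} &= a + b_1 P_{Black} + b_2 P_{Hispanic} + b_3 P_{Asian} + \varepsilon && \text{(Model 2)}
\end{align}
```

Under this reading, b1, b2, and b3 in Model 1 attach to minority share, per capita income, and population density respectively, which matches how the coefficients are discussed in Section 5.5.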
Figure 5.2: Aggregation of TRI site data to three pre-defined areal zoning schemes.
5.5 EMPIRICAL RESULTS

5.5.1 Results for the deterministic aggregation

Results of these two models for data sets derived from the deterministic approach are shown in Tables 5.1 and 5.2. Table 5.1 contains the results of Models 1 and 2 using three geographic scales: block groups, census tracts, and zip code areas. Using Model 1, the relationship between the number of TRI sites and the characteristics of the surrounding population in terms of racial and socio-economic status was investigated. As shown in Table 5.1, the R2 of the same model increased significantly as the analysis scale moved from block groups to census tracts to zip code areas. The same independent variables explained 69 percent of the variance of the dependent variable at the zip code level, but only 63 percent and 41 percent at the census tract and the block group level respectively. This indicates that at a more disaggregate level, more independent variables need to be incorporated into the model to better explain the variance of the dependent variable. Of greater interest are the dramatic changes in the coefficients of the three independent variables from one scale to the next. At the block group level, per capita income is the most important independent variable in explaining the changes in the total number of TRI facilities, whereas the minority population and population density play a subordinate role. As we move the scale from block groups to census tracts to zip code areas, we observe clearly that the minority population becomes more important for explaining the changes in the total number of TRI facilities whereas the per capita income and population density become less
significant. In Model 2, the minority population is separated into three subgroups (Blacks, Hispanics, and Asians) to examine further which minority group shares disproportionate environmental burdens. A similar change is observed for the R2 as in Model 1. The Black population appears to be the most significant at the block group level, with the Hispanic population being the most significant at the census tract level. Asians are inversely related to TRI sites at the block group and census tract level, and positively related to TRI sites at the zip code level, although far less significantly than Blacks and Hispanics.

Table 5.1: Results of GIS-based environmental equity analysis using different geographical scales.

Variables     Block groups   Census tracts   Zip codes
Model 1:
  N               2015            583            140
  R2              0.41           0.63           0.69
  b1              2.29           3.27           4.45
  b2             −3.19          −2.22          −1.39
  b3             −2.31          −1.91          −0.92
Model 2:
  N               2015            583            140
  R2              0.11           0.18           0.24
  b1             0.071          0.062           0.27
  b2             0.035          0.049           0.11
  b3            −0.038         −0.067         −0.033
Results significant at 0.95 confidence level
Table 5.2 contains the results of Models 1 and 2 using three zoning schemes: buffer zones along major highways; concentric rings from major population centres; and sectoral radiations from three ethnic enclaves. As shown in Table 5.2, the R2 of the same model decreased significantly as the zoning scheme changed from buffer zones to concentric rings to sectoral areas. This suggests that models with quite different predictive power may be produced if the data are aggregated according to different areal unit boundaries.

Table 5.2: Results of GIS-based environmental equity analysis using different areal unit boundaries.

Variables     Buffer zones   Concentric rings   Sectoral radiations
Model 1:
  N                291             398                 343
  R2              0.54            0.49                0.31
  b1              7.31            4.58                3.91
  b2             −1.31           −1.36               −4.95
  b3             −1.53            0.69                1.43
Model 2:
  N                291             398                 343
  R2              0.24            0.21                0.37
  b1             0.029           0.013               0.094
  b2             0.025          −0.045               0.041
  b3             −0.14           0.026               −0.29
Results significant at 0.95 confidence level
The importance of minority population dropped substantially from buffer zones to concentric rings to sectoral radiations. The importance of per capita income is similar for buffer zones and concentric rings, with a dramatic increase for sectoral aggregation. The population density factor is slightly more complicated than the minority population and per capita income: it is negatively related to the number of TRI sites for the buffer aggregation and positively related to the number of TRI sites, but less important than the other two variables, for both the concentric ring and sectoral aggregations. As for Model 2, using data from the buffer zone aggregation, Asian population is inversely related to the number of TRI facilities, with Black and Hispanic population being almost equally important. Hispanic population is inversely related to the dependent variable under the concentric ring scheme. Sectoral aggregation produced the highest R2, with Black population being the most important independent variable. Overall, the changes in results across zoning schemes are less predictable than those across scales.

5.5.2 Results from the stochastic aggregation

Results from random scale and areal boundary changes are shown in Figure 5.3(a)–(c) for Model 1 and Figure 5.3(d)–(f) for Model 2. All results are significant at the 95 percent confidence level. If the MAUP issue did not exist, there would be little variation in each of the parameter estimates for Models 1 and 2. However, it is quite evident from Figure 5.3(a)–(c) that the change of scale and the modification of the areal unit boundaries within which the data are aggregated do create substantial variation in parameter estimation and reliability. Both models display a great degree of sensitivity to scale and zoning scheme variations.
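The kind of sensitivity reported here can be reproduced on purely synthetic data. The sketch below is a hedged illustration, not the chapter's analysis: it uses a one-dimensional row of hypothetical zones, two made-up variables that share a weak spatial gradient, and random contiguous groupings, then reports the range of the correlation between group means at each scale. All names (`random_zoning`, `maup_experiment`) and parameter values are assumptions chosen for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def random_zoning(n_zones, n_groups, rng):
    """Cut a row of n_zones zones into n_groups contiguous runs at random breakpoints."""
    cuts = sorted(rng.sample(range(1, n_zones), n_groups - 1))
    labels, group = [], 0
    for i in range(n_zones):
        if group < len(cuts) and i == cuts[group]:
            group += 1
        labels.append(group)
    return labels


def aggregate_means(values, labels):
    """Average values within each group identified by labels."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return [sums[label] / counts[label] for label in sorted(sums)]


def maup_experiment(n_zones=400, scales=(200, 100, 50, 25), n_zonings=10, seed=0):
    """Return {scale: (min r, max r)} over n_zonings random zonings per scale."""
    rng = random.Random(seed)
    trend = [i / n_zones for i in range(n_zones)]   # shared spatial gradient
    x = [t + rng.gauss(0, 0.5) for t in trend]      # synthetic variable 1
    y = [t + rng.gauss(0, 0.5) for t in trend]      # synthetic variable 2
    results = {}
    for n_groups in scales:
        rs = []
        for _ in range(n_zonings):
            labels = random_zoning(n_zones, n_groups, rng)
            rs.append(pearson(aggregate_means(x, labels), aggregate_means(y, labels)))
        results[n_groups] = (min(rs), max(rs))
    return results


if __name__ == "__main__":
    for n_groups, (low, high) in maup_experiment().items():
        print(f"{n_groups:4d} units: r ranges from {low:.2f} to {high:.2f}")
```

Comparing the ranges across scales shows the scale effect, while the width of each range shows the zoning effect at a fixed scale. The same loop could wrap a regression fit instead of a correlation.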
For Model 1, the variations in the estimates of b1 and b2 seem to be very systematic: b1 increases systematically and the absolute value of b2 decreases (b2 becomes less negative) as the scale shifts from a more disaggregated (1000 units) to a more aggregated (200 units) one. This result is consistent with the results of the deterministic aggregation. These findings indicate that income (class) tends to become the most important variable in explaining the distribution of TRI sites if more disaggregate data are used in environmental equity analysis, and percentage of minorities (race) becomes the most important variable when more aggregate data are employed. Compared to b1 and b2, the estimates for b3 are stable at different aggregations and all remain negative, indicating a consistent inverse relationship between the distribution of TRI sites and population density. For each scale, the random variation in areal unit boundaries (zoning system) has created substantial fluctuations for all three parameters in the model. However, it seems that the variations are greater at the more disaggregated level than at the disaggregate level.

For Model 2, as shown in Figure 5.3(d)–(f), similar variations are observed, although they are a little less systematic than those of Model 1. The importance of both PBlack (percentage of black population) and PHispanic (percentage of Hispanic population), especially PHispanic, has increased as the data have been incrementally aggregated. The magnitude of the increases is smaller than in Model 1. The estimates of b3 fluctuate between positive and negative values, which means that PAsian could be either positively or negatively related to the distribution of TRI sites, depending on how the data have been aggregated. With
Figure 5.3: Variations in parameter estimates with random scale and areal boundary changes.
regard to the changes of parameter estimates according to areal unit boundaries at each different scale, variations are still discernible but the magnitude is far smaller than that of Model 1.

These results clearly indicate that the results of environmental equity analysis as currently conducted using GIS are highly sensitive to scales and areal units. What is more troubling is the fact that it is possible to find almost any desired results simply by re-aggregating the data to different scales and areal-unit boundaries. Although we should consider the technical solutions for the MAUP issue, we must go beyond the technicalities to view this issue from a broader social and intellectual perspective.

5.6 FURTHER DISCUSSIONS: GIS AS A SOCIAL TECHNOLOGY

These findings have profound implications for applying GIS to address controversial social issues. This research has demonstrated that, on the one hand, GIS has greatly facilitated the integration of a variety of spatial and non-spatial information at different scales and areal unit boundary configurations. It would be extremely difficult, if not entirely impractical, to conduct a multiple-scale/multiple-zoning-scheme environmental equity analysis without the aid of GIS. On the other hand, if cautionary steps are not taken to address the MAUP issue, GIS technology can easily be abused to generate whatever results are desired, presumably with unquestionable hi-tech-based objectivity, to advance the political and social agendas of various interest groups. These uncertainties in GIS-based environmental equity analysis have perpetuated the ethical dilemmas facing researchers in this controversial area. I believe that mere technical solutions will not suffice for these dilemmas. We must reconceive and redefine the nature of GIS technology, from viewing it as a value-free tool to viewing it as a socially constructed technology.
To achieve this shift of our philosophy, two critical theories are extremely useful in illuminating the implications of this study:
Heidegger’s enframing theory of technology (Heidegger, 1972) and Habermas’s communicative theory of society (Habermas, 1984, 1987). According to Heidegger (1972), whenever we apply a piece of technology to solve a problem, we are enframed by the implicit assumptions of the technology. Technology is a mode of revealing by concealing. Similar to what quantum physicists have told us, whenever an instrument is applied to measure the phenomena being studied, the instrument inadvertently alters the physical conditions of the system being measured, which usually leads to unavoidable measurement errors. Heidegger further argues that the enframing nature of technology does not come from the potentially lethal machines or apparatus of technology. The real danger of the enframing nature of technology is that we are increasingly becoming blind to alternative ways of looking at things when we turn to technology for solutions to social problems. To Heidegger, technologies are not mere exterior aids but interior transformations of consciousness. Heidegger (1972) prescribed that we must strive to let things reveal their “thingness” instead of relying on a particular technology to do that for us.

In the case of GIS-based environmental equity analysis, I believe that the research reported in the literature so far is enframed by the use of secondary data (both TRI facilities and characteristics of the population) and by the scale and areal unit boundaries to which these data are aggregated. It is beyond the scope of this chapter to discuss the data problems, not just in the technical sense of data error but in the political sense of data appropriateness. Even for the MAUP issue alone, as shown above, a single scale and a single set of areal unit boundaries do not warrant reliable conclusions. These findings reveal as much about the systems in which we conduct our research as about the environmental problems they are supposed to address.
If the public were not being informed about the effects of scales and zoning systems used in environmental equity analysis, as Zimmerman (1994) described in so many court decisions, they would be easily led to believe in a haphazard conclusion drawn at a particular scale or zoning system. To contextualise further the enframing nature of technology, Habermas’ communicative theory of society is also enlightening (Habermas, 1984, 1987). Central to Habermas’ theory is the analysis of “how individuals and organisations systematically manipulate communications to conceal possible problems and solutions, manipulate consent and trust, and misrepresent facts and expectations” (Aitken and Michel, 1995). Habermas (1984) argued that any form of knowledge is a product of human wishes, including the will to power, as well as the human practices of negotiation and communication. To Habermas and many others, technology not only enframes us into a particular mode of thinking, but also, perhaps more troublesome, manufactures fictions that can capture and trap public opinion into illusions. Because all things, including space, people, and environment, have become digital in GIS, they can be more easily manipulated in environmental equity analysis than before. From the results presented above, it can be seen that GIS technology, like all other communication tools, can be (ab)used by individuals and organisations to manufacture results to legitimate and impose political, economic, and social agendas. Far from being a neutral, value-free scientific tool, GIS is actually being used more as a communication and persuasion tool in the studies of many controversial social issues. In order to make GIS fulfil our democratic ideals in society, a shift of our philosophy from viewing GIS as an instrument for problem-solving to viewing it as a socially embedded process for communication is long overdue. 
Such a critical perspective of GIS entails an ontological as well as an epistemological position that views the subjects of research and representation as situated in complex webs of power relations that construct and shape those very subjects. This philosophical shift demands us to be both critically objective and objectively critical about applications of GIS in society. To be critically objective means to limit one’s conclusions as essentially partial and selective among all the possible conclusions rather than making radical claims about their universal applicability. To be objectively critical means to make one’s position vis-à-vis assumptions and limitations of research methodology explicitly known rather than invisible, because, to a
great extent, how we see determines what we see. GIS-based environmental equity analysis can serve as an engaging example to apply such a critical perspective. As indicated by the empirical results of this study, computer systems can shape our understanding of social reality so that effects are due, not to the phenomena measured, but to the systems measuring them. The social studies of GIS are a journey upstream towards the sources of everyday facts. This shift from instrumental to critical rationality will enable us to examine more vigorously how space, people, and environment have been represented, manipulated, and visualised in GIS, and thus promote more ethical GIS practice in the social arena.

5.7 CONCLUSIONS

How big is your backyard, or what is the appropriate geographic scale or zoning system for environmental equity analysis, has always been a contentious issue in environmental justice research. Most previous studies on environmental equity analysis were based upon an ad hoc selection of geographic scales and areal unit boundaries without a rational justification. The perplexing MAUP and ecological fallacies in environmental equity analysis have not been adequately addressed in the literature. The primary purpose of this chapter was to develop a GIS approach to conducting environmental equity analysis using multiple scales and zoning schemes, in an attempt to examine the effects of scale and areal unit boundaries on the results of environmental equity analysis and the implications of GIS technology in addressing controversial social issues. The preliminary results clearly indicate that the findings of environmental equity analysis are highly sensitive to the geographical scales and areal-unit boundaries used. Environmental equity analyses based upon a single scale or zoning scheme cannot warrant a reliable conclusion about the actual processes of environmental equity.
If the effects of geographic scales and zoning schemes are not considered, it has been shown that it is possible to find almost any desired results simply by re-aggregating the data to different scales and areal-unit boundaries. The empirical results have confirmed both the scale-dependency and the areal unit boundary-dependency hypotheses. The findings of this research provide some engaging examples for GIS as a social technology. In order to overcome the enframing nature of GIS technology, GIS practices must be contextualised into their social dimensions as essentially a communication process. Viewed from a critical social theory perspective, GIS discloses the multifarious practices of various social groups with conflicting political agendas, which must be interrogated critically. Otherwise we might be deceived into thinking that the model in the database corresponds primarily to the essence of reality.

ACKNOWLEDGEMENTS

Part of this research was financially supported by the Creative and Scholarly Research Program, sponsored by the Office of Vice President for Research at Texas A&M University. The author would like to thank Yichun Xie for providing the SAM program; Carl G. Amrhein and David W.S. Wong for the latest literature on the MAUP; and Michael Kullman, Daniel Overton, Thomas H. Meyer, and Ran Tao for their assistance in data preparation and aggregation.

REFERENCES

AITKEN, S.C. and MICHEL, S.M. 1995. Who contrives the ‘real’ in GIS? Geographic information, planning and critical theory, Cartography and Geographic Information Systems, 22(1), pp. 17–29.
AMRHEIN, C.G. 1995. Searching for the elusive aggregation effect: Evidence from statistical simulations, Environment and Planning A, 27(1), pp. 105–119.
BOWEN, W.M., SALLING, M.J., HAYNES, K.E., and CYRAN, E.J. 1995. Towards environmental justice: Spatial equity in Ohio and Cleveland, Annals of the Association of American Geographers, 85(4), pp. 641–663.
BRYANT, B. and MOHAI, P. (Eds.) 1992. Race and the Incidence of Environmental Hazards: A Time for Discourse. Boulder, CO: Westview Press.
BULLARD, R.D. 1990. Dumping in Dixie: Race, Class, and Environmental Quality. Boulder, CO: Westview Press.
BURKE, L.M. 1993. Race and environment equity: A geographic analysis in Los Angeles, Geo Info Systems, 4(6), pp. 44–50.
CHAKRABORTY, J. and ARMSTRONG, M.P. 1994. Estimating the population characteristics of areas affected by hazardous materials accidents, in Proceedings of GIS/LIS’94, Phoenix, 25–27 October. Bethesda, MD: ASPRS, pp. 154–163.
CHASE, A. 1995. In a Dark Wood: The Fight over Forests and the Rising Tyranny of Ecology. Boston: Houghton Mifflin Co.
CURRY, M., HARRIS, T., MARK, D. and WEINER, D. 1995. Social implications of how people, environment and society are represented in GIS, NCGIA New Initiative Proposal (Available on-line at: http://www.geo.wvu.edu/www/i19/proposal).
CUTTER, S.L. 1995. Race, class, and environmental justice, Progress in Human Geography, 19(1), pp. 111–122.
EDELSTEIN, M.R. 1988. Contaminated Communities: The Social and Psychological Impacts of Residential Toxic Exposure. Boulder, CO: Westview Press.
FOTHERINGHAM, A.S. and WONG, D.W.S. 1991. The modifiable areal unit problem in multivariate statistical analysis, Environment and Planning A, 23(6), pp. 1025–1044.
GEHLKE, C.E. and BIEHL, K.K. 1934. Certain effects of grouping upon the size of the correlation coefficient in census tract material, Journal of the American Statistical Association Supplement, 29(1), pp. 169–170.
GOLDMAN, B.A. and FITTON, L. 1994. Toxic Wastes and Race Revisited: An Update of the 1987 Report on the Racial and Socio-economic Characteristics of Communities with Hazardous Waste Sites. Washington, DC: Center for Policy Alternatives.
HABERMAS, J. 1984. The Theory of Communicative Action, Vol. 1. Boston: Beacon Press.
HABERMAS, J. 1987. The Theory of Communicative Action, Vol. 2. Boston: Beacon Press.
HALLMAN, W. and WANDERMAN, A. 1989. Perception of risk and waste hazards, in Peck, D.L. (Ed.), Psychological Effects of Hazardous Waste Disposal on Communities. Springfield, IL: Charles C. Thomas, pp. 31–56.
HEIDEGGER, M. 1972. The question concerning technology, in Lovitt, W. (Ed.), The Question Concerning Technology and Other Essays. New York: Harper Colophon, pp. 3–35.
LANGBEIN, L.I. and LICHTMAN, A.J. 1978. Ecological Inference. Beverly Hills, CA: SAGE Publications.
LOWRY, J.H., MILLER, H.J., and HEPNER, G.F. 1995. A GIS-based sensitivity analysis of community vulnerability to hazardous contaminants on the Mexico/US border, Photogrammetric Engineering and Remote Sensing, 61(11), pp. 1347–1359.
MARR, P. and MORRIS, Q. 1995. People, poisons, and pathways: a case study of ecological fallacy. Paper presented at the International Conference on Applications of Computer Mapping in Epidemiological Studies, Tampa, FL, 14–19 February.
NEPRASH, J.A. 1934. Some problems in the correlation of spatially distributed variables, Journal of the American Statistical Association, 29(supplement), pp. 167–168.
OPENSHAW, S. 1977. Algorithm 3: a procedure to generate pseudo-random aggregations of N zones into M zones, where M is less than N, Environment and Planning A, 9(6), pp. 1423–1428.
OPENSHAW, S. 1983. The Modifiable Areal Unit Problem. CATMOG Series, No. 38, London: Institute of British Geographers.
OPENSHAW, S. 1984. Ecological fallacies and the analysis of areal census data, Environment and Planning A, 15(1), pp. 74–92.
ROBINSON, W.S. 1950. Ecological correlation and the behavior of individuals, American Sociological Review, 15(2), pp. 351–357.
SHEPPARD, E. 1995. GIS and society: an overview, Cartography and Geographical Information Systems, 22(1), pp. 5–16.
SUI, D.Z. 1994. GIS and urban studies: positivism, post-positivism, and beyond, Urban Geography, 15(3), pp. 258–78.
TOBLER, W. 1989. Frame independent spatial analysis, in Goodchild, M.F. and Gopal, S. (Eds.), Accuracy of Spatial Database. New York: Taylor & Francis, pp. 115–122.
UNITED CHURCH OF CHRIST 1987. Toxic Wastes and Race in the United States: A National Report on The Racial and Socio-Economic Characteristics of Communities with Hazardous Waste Sites. New York: Commission of Racial Justice, United Church of Christ.
VON BRAUN, M. 1993. The use of GIS in assessing exposure and remedial alternatives at Superfund sites, in Goodchild, M.F., Parks, B.O., and Steyaert, L.T. (Eds.), Environmental Modeling with GIS. New York: Oxford University Press, pp. 339–347.
WONG, D.W.S. 1996. Aggregation effects in geo-referenced data, in Arlinghaus, S.L. (Ed.), Practical Handbook of Spatial Statistics. Boca Raton, FL: CRC Press, pp. 83–106.
ZIMMERMAN, R. 1994. Issues of classification in environmental equity: how we manage is how we measure, Fordham Urban Law Journal, 29(3), pp. 633–669.
Chapter Six
National Cultural Influences on GIS Design: a Study of County GIS in King County, WA, USA and Kreis Osnabrück, Germany
Francis Harvey
6.1 NATIONAL CULTURE AND GEOGRAPHIC INFORMATION SYSTEMS

This chapter sets out to discern the influence national culture can have on GIS design. Culture plays a crucial role in all human activity, but is entangled with institutions, disciplines, and our daily lives in perplexing ways. Studies of cultural influence on GIS technology stand to benefit the GIS research and practitioner communities through insights into this frequently down-played area. This research focuses specifically on national culture influences in GIS design, building on prior work in geography, cartography, GIS, information science, and sociology. It employs ethnographic research methods to examine the embedded relationships between national culture and GIS design, comparing the GIS designs of a county in the USA with a county in Germany.

In sociology the importance of culture is perfectly obvious, but geography (Hettner, 1927; Pickles, 1986), cartography (Wood and Fels, 1986; Harley, 1989), and information systems (Hofstede, 1980; Boisot, 1987; Jordan, 1994) also emphasise the importance of culture. GIS researchers also consider cultural aspects. Some GIS research focuses on differences in the cultural mediation of GIS operations and corresponding cultural concepts (Campari and Frank, 1993; Campari, 1994). Other GIS researchers have studied cultural differences in spatial cognition (Mark and Egenhofer, 1994a, 1994b). This research specifically utilises frameworks for examining the influence of national culture on information systems, particularly Hofstede’s cultural dimensions (Hofstede, 1980; Jordan, 1994). Like these works, this research builds on the sociological work of Max Weber. Culture is commonly understood in this sociology to be the shared set of beliefs that influence what we consider to be meaningful and valuable. Disciplines, professions, and institutions in modern bureaucratic society nurture and transmit cultural values and meanings (Weber, 1946).
In this vein, Obermeyer and Pinto recently discussed the role of professions in GIS in Weber’s framework (Obermeyer and Pinto, 1994). Chrisman, writing earlier about the involvement of different disciplines and guilds in spatial data handling, also identifies disciplines as carriers and transmitters of cultural values (Chrisman, 1987). This research focuses solely on national culture and employs the national culture dimensions described by Hofstede (1980). His framework describes four dimensions of national culture (uncertainty avoidance, power distance, individualism, and masculinity) together with their influence on thinking and social action. This research examines the two pertinent dimensions (uncertainty avoidance and power distance). Following the presentation of the theoretical background for this work in the next section, the methodology employed is described in the third section. The two research questions are:
NATIONAL CULTURAL INFLUENCES ON GIS DESIGN
1. How well do Hofstede’s national cultural dimensions reflect differences in the GIS designs as indicated through the analysis of the design documents?
2. Do these dimensions also help explain the actual practice of GIS design?

The fourth section presents the GIS of the two counties, Kreis Osnabrück in Germany and King County in the United States, and the fifth section evaluates their differences in terms of Hofstede’s national cultural dimensions. The final section turns to a general review of the research findings and presents an explanation for the differences found between GIS design practice and Hofstede’s formal framework.

6.2 THEORETICAL BACKGROUND

Because of its ubiquity, studying culture, even just national culture, is an extremely complex and introspective activity (Clifford and Marcus, 1986; Emerson et al., 1995). Moving beyond the limits of our own cultural understanding and comprehending another easily becomes a very subjective undertaking. Fortunately, Hofstede, in a study of the influences of national culture on information systems, evaluated 117,000 questionnaires from 84,000 people in 66 countries. Out of this mass of empirical data he developed four dimensions of national cultural influence on information system design. Although vast in scale, the focus on information systems was limited enough to provide an empirically validated framework that can be employed in evaluating GIS design.

Hofstede specifically examined the role of national culture in work-related values and information system design (Hofstede, 1980). Applying theories of culture and organisational structure from Weber (1946) to the research findings, Hofstede (1980) establishes four dimensions of national culture:
• uncertainty avoidance: the extent to which future possibilities are defended against or accepted
• power distance: the degree of inequality of power between a person at a higher level and a person at a lower level
• individualism: the relative importance of individual goals compared with group or collective goals
• masculinity: the extent to which the goals of men dominate those of women.

Uncertainty avoidance is the focus of information systems, decision support systems and so on (Jordan, 1994). It is considered here together with power distance because of interaction effects (Hofstede, 1980). The other two dimensions, individualism and masculinity, have little relevance to the comparison of German and US cultures and lie outside the focus of this research. Germanic and Anglo-American cultures are strongly differentiated in terms of uncertainty avoidance; the power distance dimension is quite similar. It is important to note that Hofstede’s findings ascribe ideal typical qualities to each culture in a Weberian sense: they are the forms striven for, not individual characteristics. In other words, research can only find distinctions between social group behaviour in terms of these dimensions.

Uncertainty avoidance and power distance form critical interactions affecting organisations. In Germany and the USA, characterised by low power distance, there are two possible ways to keep organisations together and reduce uncertainty. In Germanic cultures, with high uncertainty avoidance, “people have an inner need for living up to rules,…the leading principle which keeps the organisations together can be formal rules” (Hofstede 1980, p. 319). With low uncertainty avoidance (Anglo-American cultures), “…the organisation has to be kept together by more ad hoc negotiation, a situation that calls for a larger tolerance
Figure 6.1: Dimensions of National Culture for low power distance and different uncertainty avoidances (After Hofstede, 1980 p. 319)
for uncertainty from everyone” (Hofstede 1980, p. 319). Figure 6.1 shows important organisational characteristics based on uncertainty avoidance and power distance dimensions.

Hofstede makes detailed comments about these differences. The “Anglo” cultures “would tend more toward creating ‘implicitly structured’ organisations” (Hofstede 1980, p. 319). In contrast, German speaking cultures establish “workflow” bureaucracies that prescribe the work process in much greater detail (Hofstede 1980, p. 319). Hofstede argues that problem solving strategies and implicit organisation forms follow: Germans establish regulations, Anglo-Americans have negotiations. Germans conceive of the ideal organisation as a “well-oiled machine”, whereas Anglo-Americans conceive of a “village market” (Hofstede, 1980).

Information transaction cost theory (Williamson, 1975) provides additional insight into cultural influence on organisational structure and approaches to problem solving. In this theory, all business activity is transaction between individuals and groups. Information serves as the controlling resource (Jordan 1994). In this form the theory is overly reductionist and simplistic. Boisot (1987) extended this transaction cost theory to include cultural issues, distinguishing two characteristics of information that affect transactions:

• codification: the degree of formal representation
• diffusion: the degree of spread throughout the population (Jordan, 1994).

Internalising the transaction in the organisation reduces the diffusion of information (Jordan, 1994). Centralised information requires a bureaucracy, whereas diffuse information is distributed in a market. These differences correspond to Hofstede’s national cultural characteristics (Jordan, 1994). How GIS design codifies or diffuses information will depend on the importance of uncertainty avoidance and ideal organisation type.
Multi-disciplinary and multiple goal orientations (Radford, 1988) will create additional hurdles to face in information system design. Nominally, highly integrated industries and commerce apply the information transaction approach. GIS design approaches often begin with a similar structured systems approach (Gould, 1994). When considering heterogeneous public administrations, a different, highly diversified organisational structure is possible. In county governments the multi-disciplinary interests, missions, goals, and perspectives require special consideration of the cultural values.
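The contrast Hofstede draws for low power distance cultures can be written down as a small lookup, which may help keep the two ideal types apart when reading the case studies. The sketch below is only an illustration: the class name, field names, and the two-level "high"/"low" coding are assumptions made here, not part of Hofstede's framework, and the entries paraphrase the characterisations quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrganisationIdeal:
    """Hypothetical record of one ideal type (field names are illustrative)."""
    metaphor: str       # Hofstede's image of the ideal organisation
    coordination: str   # what keeps the organisation together
    structure: str      # resulting organisational form


# Low power distance only (the case for both Germany and the USA in this chapter).
# Keys are a simplified two-level coding of uncertainty avoidance (an assumption).
LOW_POWER_DISTANCE_IDEALS = {
    "high": OrganisationIdeal(
        metaphor="well-oiled machine",
        coordination="regulations",
        structure="workflow bureaucracy",
    ),
    "low": OrganisationIdeal(
        metaphor="village market",
        coordination="negotiations",
        structure="implicitly structured organisation",
    ),
}


def organisation_ideal(uncertainty_avoidance: str) -> OrganisationIdeal:
    """Return the ideal type for a given level of uncertainty avoidance."""
    return LOW_POWER_DISTANCE_IDEALS[uncertainty_avoidance]


germanic = organisation_ideal("high")
anglo = organisation_ideal("low")
print(germanic.metaphor, "/", germanic.coordination)
print(anglo.metaphor, "/", anglo.coordination)
```

These are ideal types in the Weberian sense, so the lookup describes the form a culture strives for, not the behaviour of any individual or agency.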
6.3 METHODOLOGY AND RESEARCH DESIGN

This research compares two ethnographic case studies of the GIS designs and implementations in King County, Washington, USA and Kreis (County) Osnabrück, Lower-Saxony, Germany. The research design is conceptually divided into two phases. In the first phase design documents were examined and compared (see Harvey, 1995, for the first report of these results). During the second phase, I participated as an observer in the actual design process to validate my findings from the first phase and test Hofstede’s framework.

A case study methodology was chosen for the detailed insight it provides into the distinct cultural and institutional context of each GIS (Onsrud and Pinto, 1992). In the case of King County I followed a strategy of contextual inquiry, compared to naturalistic observation used during a shorter visit to Kreis Osnabrück (Wixon and Ramey, 1996). A framework for the case studies was prepared following Hofstede’s framework with a focus on uncertainty avoidance and the role of regulations and negotiations. Ethnographic approaches to differences in scientific practice (Hayek, 1952, 1979; Latour and Woolgar, 1979; Hirschhorn, 1984; Anderson 1994; Nelson, 1994) influenced the choice of participant observation to collect data. The actual issues raised during document evaluation, open-ended interviews, written correspondence, and telephone communications focused on GIS design and the construction of organisational, institutional, and physical components.

The case study in King County occurred over a longer period (six months) during which I participated in the system conceptualisation. This was followed by several phone interviews and written communication. Due to the distance to Kreis Osnabrück, key questions were posed in written format, several months before the site visit. During an intensive one week visit, open-ended interviews were held with six project participants and analysed.
My training in German planning and administrative law plus experiences with GIS applications in Germany enabled me to get to the key research questions rapidly. The preliminary evaluation of documents and an ongoing exchange of discussions and/or e-mail, allowed a gradual entry into the design and implementation practices of each county. The design documents for each county were examined and evaluated in terms of Hofstede’s framework. Flood protection planning was chosen as a case study to examine in more detail because of the fundamental similarity of this mandate and the availability of digital data in both counties.

The preparation of the visit to Kreis Osnabrück involved formulating specific questions and issues about design practice, uncertainty avoidance, and the role of regulations and negotiations. Questions focused on filling gaps in the recent history of the county GIS, understanding the role of different administrative agencies in the design process, and examining the practice of GIS design and implementation. During the visit, I visited several agencies and had discussions with county staff.

Because of the far longer duration of observation in King County and my more direct involvement with the project, the case study in King County followed a different plan. The comparison was formulated parallel to my work there, so this case study involved clear retrospective and inquiry phases. After my six months project participation at King County, I had several meetings, telephone calls, and written correspondence with project staff to discuss specific questions related to project history, design, and implementation.

6.4 DESIGNS

Both King County and Kreis Osnabrück started the GIS projects examined here in 1989. King County’s design occurred after several failed attempts, and was characterised by ongoing negotiations. Kreis
Osnabrück’s design involved a detailed examination of two departments’ tasks (regional planning and environmental protection) that focused on the identification of data objects and tasks required to fulfil legislated mandates.

The essential difference in the GIS design approaches goes back to Kreis Osnabrück’s reliance on standards and regulations (ATKIS, ALK, MERKIS), whereas King County developed its GIS from the ground up. ATKIS, the Automatisierte Topographisch-Kartographische Informationssystem (automated topographic and cartographic information system), is the most important standard for Kreis Osnabrück. It is the object orientated data model for provision of vectorised topographic data at three scales: 1:25,000, 1:200,000 and 1:1,000,000. ALK, the Automatisierte Liegenschaftskataster (automated property cadastre), is the automation of the Grundbuch (Property Book), the registry of property ownership. MERKIS, the Maßstaborientierte Einheitliche Raumbezugsbasis für Kommunale Informationssysteme (map scale orientated uniform spatial coordination basis for communal information systems), describes GIS at the communal level as a “…geographic data base for agency specific, spatial communal information systems based on the national coordinate system, a unified data model for all topographic and agency specific spatial data.” (Der Oberkreisdirektor Landkreis Osnabrück, 1990).

Kreis Osnabrück’s GIS design approach involves essentially three phases. In the first phase, questions regarding administrative functions (following the respective legal mandates) and problems with the available cartographic products were raised. The results were the basis for the detailed breakdown of administrative functions into tasks and objects. These tasks and objects are finally implemented during the last stage of design, when all issues and conflicts are to be worked out.

King County’s (KC) GIS design process is far more complicated.
Although it followed the accepted procedure (needs assessment, conceptual design, pilot study), the autonomy of participating agencies and county politics led to a very convoluted development. The final design involves a project that constructs the core data layers and infrastructure, but then finishes. This leaves many issues open to further negotiation. The central group in King County is basically a steering committee. There is no regulation or standardisation of what the county GIS is based on or should provide. The design of KC-GIS was, not surprisingly, difficult. After an internal proposal for a GIS fell apart due to internal strife, PlanGraphics was called in to carry out the design. This began with a needs assessment. The basic tenet of the PlanGraphics needs assessment report points to the requirement for coordination and a centralised organisation. They are the presumed basis for effectively using GIS technology that provides information and services to fulfil county administrative and governmental functions. The design paradigm follows the line that because departmental functions and information are dependent and related to other departments, a centralisation of the functions and information in a county GIS would improve the effectiveness of King County’s administration. The needs assessment report (PlanGraphics, 1992d), adopting a strategy of limited centralisation, focused mostly on elaborating county needs for a GIS in terms of common, shared, and agency specific applications. The intent was to determine which elements of a single department’s applications are common with other departments’ elements. The PlanGraphics GIS design proposal left a great many issues unresolved. These gaps required an exhaustive study of the conceptual design document and discussions with the various agencies to design a project that would fulfil objectives: in other words establishing the playing field and negotiation. 
Starting with the PlanGraphics documents, a special group in the Information Systems Department of Metro prepared a scoping report (Municipality of Seattle, 1993) with a more exhaustive overview of design, but left the implementation to inter-agency negotiation, and maintenance for even later negotiation.
Many GIS applications identified in the PlanGraphics reports were eliminated, because the budget for the project was reduced from US$ 20 million to US$ 6.8 million. The project’s focus was limited to creating the infrastructure and essential layers for a county GIS. Afterwards, responsibility for the layers would return to the “stakeholders”. From the PlanGraphics proposal only the names of the essential layers remained. The contents of the layers were left open to negotiation. The reduction in funding without a corresponding redefinition of mission and vague descriptions of mandates meant the design stage carried on into implementation, accompanied by ongoing negotiations. Based on examinations of design documents, Table 6.1 summarises key design features of the two counties.

Table 6.1: Comparison of Kreis Osnabrück and King County GIS design documents

Organisation
Kreis Osnabrück: Lead agency is the information system department of the county government. Various working groups are coordinated by a newly created position. GIS data base design is carried out in the responsible agency together with a central coordinating group following ATKIS.
King County: The information system department of county transit agency (recently merged into the county government) is the lead agency. Two committees accompany the project. GIS data base design is coordinated with other agencies, municipalities, and corporations.

Purpose
Kreis Osnabrück: Provision of data and information for more efficient administration and planning at the communal level.
King County: The core project aims to provide capabilities that are vaguely defined, i.e. “better management”. The basic project goal is the development of a county GIS database.

Budget overview
Kreis Osnabrück: DM 2.89 million (approx. US$ 1.94 million).
King County: US$ 6.8 million.

Data model (Base layers)
Kreis Osnabrück: Provided and defined largely by the national standards ATKIS, ALK, and MERKIS. Extensions are for county purposes and already listed in the object catalogue. Agencies can extend the data model when needed in a given scheme.
King County: No explicit data modeling in the conceptual design documents. In all there are 72 layers. The most important are: Survey Control, Public Land Survey System, Street Network, Property, Political.

Information collection, analysis, and display
Kreis Osnabrück: Documents describe administrative procedures and source maps in detail, but not which GIS operations are required.
King County: Documents sometimes identify rough costs (Municipality of Seattle 1993), but no detailed requirements, sources, procedures of any kind are identified.

Sources: Municipality of Seattle 1993; PlanGraphics, 1992b; Der Oberkreisdirektor Landkreis Osnabrück 1990, 1992a, 1993b.
6.5 DIFFERENCES

Hofstede’s national cultural dimensions of information system design are clearly recognisable in each county’s GIS design documents. Kreis Osnabrück describes its GIS in terms of a clear and concise framework of laws, regulations, and accepted standard operating procedures. Before any product or GIS function is implemented in Germany, it is first formalised and codified. This usually involves negotiations (for example the modified ATKIS used in Kreis Osnabrück), but these negotiations are completed before the design stage and accepted as a regulation by other bodies. This reliance on regulations slows down the development of the county GIS to the rate at which regulations can be put in place. Several agencies are not satisfied with the slow rate of progress the county GIS is making. However, so far the county director has stayed with a step-by-step process of formalisation followed by implementation.

King County, on the other hand, continually negotiates the design and implementation of the county GIS. The loose ends in the design documents reflect the “village market” approach. Piece-by-piece, portions of the county GIS are agreed to and implemented. This leaves design issues and, in particular, maintenance issues open or simply unresolved until implementation, reflecting the national cultural characteristics that lean towards negotiation as a design strategy. Design documents are the basis for negotiations between actors. Agreement is only established for a particular portion of KC-GIS with no guarantee of how long it will be maintained. Additionally, the design documents also leave many courses of action open, requiring extensive negotiations before any work is done. Due to this complexity, the agencies involved in the process still operate independently with limited synchronisation.
6.5.1 Designing and Implementing Regulations

Each county’s design stops short of identifying specific GIS operations or functions required to prepare a layer or carry out parts of a task. It was clear that in King County the preparation of design documents and the negotiation of implementation are inextricable. However, there was no exact indication before the naturalistic observation in Kreis Osnabrück of how design documents were actually utilised during implementation. The evidence from the design documents supported Hofstede’s work, but the process of getting the design to work remained obscure.

The practice of GIS design in Kreis Osnabrück differed considerably from Hofstede’s characterisation of Germanic national culture and the suggested procedures described in the design documents. The case study research indicates that the transformation of regulations into design and implementation occurs through negotiation. This was related to me using several examples. A good example is the case of database software. It turned out that part of the design would, for various reasons, not be developed. This was a crucial component of the software system. Lacking it, the entire software design had to be reworked around an off-the-shelf product. This change was worked out through negotiations between participating agencies over the new course of action and between technical staff over design and implementation details. Problems arose nearly every day during implementation; some were large and others were small, requiring quick action and alterations.

In contrast to Kreis Osnabrück, and with only an implicit framework of regulations and guidelines for GIS design and implementation, the design of King County’s GIS project relies heavily on negotiations between departments. Since design work concludes only by pointing out the many loose ends to be dealt with by the respective departments (PlanGraphics, 1992b), negotiation will always be the crucial step in project design.
Figure 6.2: Design for KC-GIS (from Municipality of Seattle, 1993)
In Figure 6.2, the puzzle pieces, illustrating how different parts of the county GIS should “fall into place”, graphically suggest the importance negotiations have even at the end of formal design.

6.5.2 Flood Protection Planning

The detailed examination of the use of GIS for flood protection planning, a mandate similar in both counties, illustrates the influence of national culture on GIS design. In Kreis Osnabrück flood protection planning is based on the Prussian Water Legislation which states that land use permission must be granted by the County Office of Hydrology before the permit application can be further processed. This permission is the first step in acquiring the permit to build in a flood zone. The county first establishes whether the project lies in a flood zone. If it does, the county is required by law to establish whether the project is capable of approval. Currently this is done by overlaying a transparent plastic map of the flood zones on a map of the area in question.
The GIS implementation foreseen to support this mandate does not alter the application procedure. Flood protection zones are a legally defined regulation that the GIS implementation must support. GIS overlay will be used in the same way as the overlay of map transparencies is now.

Flood protection planning is administratively different, but relies equally on overlay in King County. As in the rest of the USA, flood zones in King County are defined by the Federal Emergency Management Agency (FEMA). The establishment of flood zones and the partial regulation of land use is set out in the Federal Flood Insurance Protection Act of 1968. It establishes that maps indicating the flood zones are to be prepared for every community in the USA. King County’s flood protection planning is implemented through the permit application process, following the same overlay approach as in Kreis Osnabrück. It is not possible to obtain a building permit in a FEMA flood zone. A county ordinance lays this out and identifies the executing agency responsible for verification. The ordinance identifies the environmental agency as the clear “stakeholder” who is autonomous (in the context of the county ordinance) in fulfilling this function. However, the autonomy of county agencies leads to a very loosely defined bundle of technology, methods, and procedures for each individual agency. Every agency is distinctly separate and the county GIS design skirts these issues.

The organisation of flood protection planning in each county fits Hofstede’s national cultural characteristics. Kreis Osnabrück develops the GIS operations around established regulations and King County employs GIS overlay in a manner consistent with the agency’s established practices following agreements negotiated with the other county agencies.
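The overlay step that both counties rely on, establishing whether a project lies in a flood zone, can be illustrated with a minimal sketch. It assumes a project site reduced to a single point and flood zones given as simple polygons in a planar coordinate system, and it uses a standard ray-casting point-in-polygon test. The function names and sample coordinates are hypothetical; neither county's design documents specify the GIS operations actually used.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray running
    from the point in the +x direction; an odd count means inside."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def lies_in_flood_zone(site: Point, flood_zones: List[Polygon]) -> bool:
    """Overlay a project site on the flood zone layer (illustrative only)."""
    return any(point_in_polygon(site, zone) for zone in flood_zones)


# Hypothetical data: one rectangular flood zone along a river.
zones = [[(0.0, 0.0), (10.0, 0.0), (10.0, 5.0), (0.0, 5.0)]]
print(lies_in_flood_zone((3.0, 2.0), zones))
print(lies_in_flood_zone((12.0, 2.0), zones))
```

The geometric test is the same in both administrative settings. What differs, on the chapter's account, is what happens next: a legally defined approval procedure in Kreis Osnabrück, and a permit refusal under a county ordinance executed by an autonomous agency in King County.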
6.6 REGULATIONS AND NEGOTIATIONS

Regardless of national culture, the diversity of perspectives and purposes in any public administration means the social construction of a GIS will always require negotiation. Regulations shift the focal points and lend a strong structure, but even regulations are negotiated. In King County negotiations and renegotiations of the GIS are ongoing. Compared to Kreis Osnabrück, the county GIS is not as stable, but agencies are extremely flexible in their response to institutional, legislative, and political contingencies. In Germany issues are negotiated and then codified as regulations or laws. The results are robust institutional solutions that offer an explicit framework, but bind the agencies involved to already established approaches, possibly leading to idiosyncratic solutions. New applications, consequences, and new actors’ roles must be addressed and formalised in existing institutional structures before action is taken. This takes up many institutional resources and delays the response of institutions to new technological opportunities. Much time is spent making the technology fit the institution.

In King County, mandates are fulfilled through the flexibility left to individuals in their respective agencies. In Kreis Osnabrück, individuals also propel the future developments in the county. These may take years, or never come about, but it is this resource that gives the institutionalised, bureaucratic government some flexibility. Though these are different frameworks for individual agencies, in each case they provide the necessary work and creativity to develop and provide spatial information technologies that assist in decision making.

These findings point to significant differences between the actual practice of GIS design and what Hofstede ascribes.
On one hand, his assessment of negotiation as dominant fits the actual approach to design in King County, but the reliance on regulations he asserts for Germanic cultures only fits the design documents, not the actual practice of design in Kreis Osnabrück. Hofstede’s cultural dimensions only seem to apply at the abstract organisational level to the influence of national culture in both counties. In other words, the actual
Figure 6.3: Example of task analysis used in the design of KRIS (from Der Oberkreisdirektor Kreis Osnabrück, 1993a)
practice of design involves negotiations in both cultures. Behind the formal design of the Kreis Osnabrück GIS that explicitly relies on standards and regulations (see for example Figure 6.3), the practice of design is the result of negotiations. Hofstede’s finding that uncertainty avoidance is so high in Germanic culture explains why negotiations are codified as regulations and standards in a very hierarchical, institutionalised system. However, in both counties, the detailed design and implementation of the GIS (getting the design to work) is left completely in the hands of the people creating the system.

Key differences between King County and Kreis Osnabrück lie in the organisational orientation of the design work. In Kreis Osnabrück the GIS is implemented by fulfilling standards. In King County work on the design aims to fulfil negotiated requirements and retain institutional and disciplinary positions. The work practice of the individuals on the Kreis Osnabrück project was in fact strongly compartmentalised according to the regulations they needed to implement. However, in spite of this compartmentalisation, much work was carried out resolving discrepancies between different regulations and constructing working
GIS software. Even in the compartmentalised world of German public administration, individuals rely on a web of contacts with co-workers and knowledgeable outsiders in the practice of design. This non-formalised part of GIS design is a tacit component of their work lives and is scarcely mentioned in discussions. Regulations remain important in the strong hierarchy. Informal meetings and arrangements with co-workers and outsiders are only the backdrops for design activities. The dominant view is that if these practices do not culminate in regulations, they are not important to the project, nor worth reporting.

The project manager was aware of these issues and apparent conundrums. He indicated the difficulties of implementing broad standards and resolving all the problems of implementation. It is necessary to find a pragmatic solution. In his words, “Kreis Osnabrück strives for an 80 percent solution” (Remme, 1995). For all the regulations and detailed task descriptions, they are learning by doing. Some of the large problems in the past proved unsolvable (for multiple reasons) and the project design was altered. For instance, the unavailability of the database software required a new concept that led to the acquisition of commercial database software and a fundamental change in the project design.

In the strong hierarchy of German public administration, emphasis on regulations and fulfilling legal mandates dominates the participants’ representation of design activities. This attests to the high uncertainty avoidance in Germanic cultures that Hofstede identified. In a culture so engrossed with regulations, it is no wonder that an outside observer, employing quantitative research techniques, only turns up the aspects emphasised by the culture. Going beneath the veneer of regulations and standards to the practice of design reveals a complex practice of negotiation and ad hoc problem solving in both counties.
In spite of the emphasis on regulations and standards, the actual work constructing the GIS in Kreis Osnabrück involves negotiations as much as regulations.

6.7 CONCLUSIONS

In terms of methodology and further studies, it is apparent that ethnographic research is crucial to unravelling national culture influences on GIS design. Case studies offer substantial advantages for examining the intricacies of GIS work in their cultural, organisational, and disciplinary context. Participant observation and other ethnographic research techniques can aid examinations of other national cultures’ influences, disciplinary and institutional roles, and the actual practice of GIS design and work. Qualitative research can lead to valuable insights to move beyond the tacit recognition that culture is entangled with GIS technology.

The ethnographic case studies of King County and Kreis Osnabrück show that, in spite of similarities, national cultural factors help explain substantial differences in the design, practice, and organisation of GIS. The GIS techniques used (overlay of flood protection zones) may well be similar, but national cultural values lead to completely different constructions of GIS technology. Furthermore, in both counties, it is plain that the practice of GIS design involves negotiation. This finding suggests Hofstede’s characterisations of national culture are limited to an abstract organisational level, not necessarily the actual practices of design and implementation.

ACKNOWLEDGEMENTS

I would like to thank the following individuals for their assistance in carrying out this research: Martin Balikov, Karl Johanson, and Thomas Remme. The University of Washington Graduate School provided financial support for completing the case study in Germany.
REFERENCES

ANDERSON, R.J. 1994. Representations and requirements: the value of ethnography in system design, Human-Computer Interaction, 9, pp. 151–182.
BOISOT, M. 1987. Information and Organizations: The Manager as Anthropologist. London: Fontana.
CAMPARI, I. 1994. GIS commands as small scale space terms: cross-cultural conflict of their spatial content, in Waugh T. and Healey R. (Eds.), Advances in GIS Research, The Sixth International Symposium on Spatial Data Handling, Vol. 1. London: Taylor & Francis, pp. 554–571.
CAMPARI, I. and FRANK, A. 1993. Cultural differences in GIS: a basic approach. Proceedings of Fourth European Conference and Exhibition on Geographical Information Systems (EGIS ’93), Genoa, Italy, Vol. 1. Utrecht: EGIS Foundation, pp. 10–16.
CHRISMAN, N.R. 1987. Design of geographic information systems based on social and cultural goals, Photogrammetric Engineering and Remote Sensing, 53(10), pp. 1367–1370.
CLIFFORD, J. and MARCUS, G.D. (Eds.) 1986. Writing Culture. The Poetics and Politics of Ethnography. Berkeley, CA: University of California Press.
DER OBERKREISDIREKTOR LANDKREIS OSNABRÜCK. 1990. Das Kommunale Raumbezogene Informationssystem (KRIS) Eine Arbeitspapier zur Realisierung. Osnabrück: Referat A.
DER OBERKREISDIREKTOR LANDKREIS OSNABRÜCK. 1992a. Situationsbericht 12/2/92. Osnabrück: Der Oberkreisdirektor.
DER OBERKREISDIREKTOR LANDKREIS OSNABRÜCK. 1992. Das Kommunale Raumbezogene Informationssystem Osnabrück (KRIS) Gemeinsamer Abschlußbericht der Projekt- und Entwicklergruppe (Final Report) 20 May 1992. Osnabrück: Der Oberkreisdirektor.
DER OBERKREISDIREKTOR LANDKREIS OSNABRÜCK. 1993a. Lösungsvorschlag. Osnabrück: Der Oberkreisdirektor.
DER OBERKREISDIREKTOR LANDKREIS OSNABRÜCK. 1993b. Systemkonzept. Osnabrück: Landkreis Osnabrück.
EMERSON, R.M., FRETZ, R.I. and SHAW, L.L. 1995. Writing Ethnographic Fieldnotes. Chicago: University of Chicago Press.
GOULD, M. 1994. GIS design: a hermeneutic view, Photogrammetric Engineering and Remote Sensing, 60(9), pp. 1105–1115.
HARLEY, J.B. 1989. Deconstructing the map, Cartographica, 26(2), pp. 1–29.
HARVEY, F. 1995. National and organizational cultures in geographic information system design: a tale of two counties, in Peuquet D. (Ed.), Proceedings of the Twelfth International Symposium on Computer-Assisted Cartography (AutoCarto 12). Charlotte, NC: ACSM/ASPRS, pp. 197–206.
HAYEK, F. 1952, 1979. The Counter-Revolution of Science. Indianapolis: Liberty Press.
HETTNER, A. 1927. Die Geographie. Ihre Geschichte, Ihr Wesen und Ihre Methoden. Breslau: Ferdinand Hirt.
HIRSCHHORN, L. 1984. Beyond Mechanization. Work and Technology in a Postindustrial Age. Cambridge, MA: The MIT Press.
HOFSTEDE, G. 1980. Culture’s Consequences. International Differences in Work-Related Values. Beverly Hills: Sage Publications.
JORDAN, E. 1994. National and Organisational Culture Their Use in Information Systems Design, Faculty of Business, Report. Hong Kong: City Polytechnic of Hong Kong.
LATOUR, B. and WOOLGAR, S. 1979. Laboratory Life: The Social Construction of Scientific Facts. Beverly Hills: Sage.
MARK, D.M. and EGENHOFER, M.J. 1994a. Calibrating the meanings of spatial predicates from natural language: line-region relations, in Waugh T. and Healey R. (Eds.), Advances in GIS Research, The Sixth International Symposium on Spatial Data Handling, Edinburgh, Scotland, Vol. 1. London: Taylor & Francis, pp. 538–553.
MARK, D.M. and EGENHOFER, M.J. 1994b. Modeling spatial relationships between lines and regions: combining formal mathematical models and human subjects testing, Cartography and Geographic Information Systems, 21(4), pp. 195–212.
MUNICIPALITY OF SEATTLE. 1993. King County GIS Scoping Project. Seattle: Municipality of Metropolitan Seattle.
NELSON, A. 1994. How could scientific facts be socially constructed? Studies in History and Philosophy of Science, 25(4), pp. 535–547.
OBERMEYER, N.J. and PINTO, J.K. 1994. Managing Geographic Information Systems. New York: Guilford Press.
ONSRUD, H.J. and PINTO, J.K. 1992. Case study research methods for geographic information systems, URISA Journal, 4(1), pp. 32–44.
PICKLES, J. 1986. Geography and Humanism. Norwich: Geo Books.
PLANGRAPHICS. 1992a. King County GIS Benefit/Cost Report. Frankfort, KY: PlanGraphics.
PLANGRAPHICS. 1992b. King County GIS Conceptual Design. Frankfort, KY: PlanGraphics.
PLANGRAPHICS. 1992c. King County GIS Implementation and Funding Plan. Frankfort, KY: PlanGraphics.
PLANGRAPHICS. 1992d. King County GIS Needs Assessment/Applications, Working Paper. Frankfort, KY: PlanGraphics.
RADFORD, K.J. 1988. Strategic and Tactical Decisions, 2nd edition. New York: Springer-Verlag.
REMME, T. 1995. The GIS of Kreis Osnabrück (Interview) 17–19 August 1995.
SCHULZE, T. and REMME, T. 1995. Kommunale Anwendungen beim Landkreis Osnabrück, in Kopfstahl, E. and Sellge, H. (Eds.), Das Geoinformationssystem ATKIS und seine Nutzung in Wirtschaft und Verwaltung. Hannover: Niedersächsisches Landesvermessungsamt, pp. 193–198.
WEBER, M. 1946. Bureaucracy, in Gerth, H.H. and Mills, C.W. (Eds.), From Max Weber. Essays in Sociology. New York: Oxford University Press, pp. 196–244.
WILLIAMSON, O.E. 1975. Markets and Hierarchies: Analysis and Antitrust Implications. New York: Free Press.
WIXON, D. and RAMEY, J. 1996. Field Methods for Software and Systems Design. New York: John Wiley & Sons.
WOOD, D. and FELS, J. 1986. Designs on signs/myth and meaning in maps, Cartographica, 23(3), pp. 54–103.
Chapter Seven

The Commodification of Geographic Information: Some Findings from British Local Government

Stephen Capes
7.1 INTRODUCTION

The commodification of geographic information (Openshaw and Goddard, 1987) is of rapidly increasing importance with implications for government, data users and producers, academics, the GIS industry and society at large. This chapter considers the exploitation of the geographic information commodity in British local government with reference to a four-fold model of commodification. In comparing commodification locally with that at the national and international levels, the discussion draws a crucial distinction between commercialisation and commodification. Whereas the former merely involves selling information, commodification encompasses a somewhat broader conceptualisation of the use and value of geographic information, involving its exploitation for strategic as well as commercial purposes.

The geographic information market is a growing industry; government spending on geographic information in the European Union has been estimated at some six billion ECU, or 0.1 percent of EU gross national product (Masser and Craglia, 1996). Users of geographic information systems (GIS) require access to data to put onto computers in order that they might store it, integrate it spatially, perform analyses upon it, and present it graphically.

An ever-increasing number of studies (for example Blakemore and Singh, 1992; Capes, 1995; Gartner, 1995; Johnson and Onsrud, 1995; Sherwood, 1992) highlight the practice of charging for the geographic information products and services being produced in both the private and public sectors, and the implications this poses for the information society. Rhind (1992a) reviews charging practices in a selection of countries and highlights some of the problems they raise, whilst Beaumont (1992) discusses issues relating to the availability and pricing of government data.
Barr (1993) paints a vivid, if imaginary, portrait of the dangers associated with information monopolies, and Maffini (1990) calls for geographic information to remain cheap or free in order that the GIS industry might grow. Both Shepherd (1993) and Onsrud (1992a, 1992b) present balanced reviews of the pros and cons of charging for information. These papers cover views which range from those in favour of greater data sales to those arguing that geographic information should be free to users.

In addition, GIS users who have already collected or bought data, analysed it and produced useful outputs, have themselves developed an interest in disseminating and selling these products and services. Bamberger and Sherwood (1993) have published case studies of practice and discussions of issues associated with marketing government geographic information in the USA. Rhind (1992b) explains that the Ordnance Survey, Britain’s national mapping agency, is required to sell its geographic information products to raise the income it needs to meet government targets and pay for an expensive digitisation programme.
Governments at national and international levels have recently begun to take a close interest in making geographic information available to a wider audience using information and communication technology. They are coming to believe that geographic information is of key strategic importance, since geography is a common base which enables the linking of many other data sets. Digital data and GIS are crucial tools in this information integration: in the UK, for instance, it has been suggested that between 60 percent and 80 percent of government information is in some way geographic, a figure used by the Ordnance Survey to support the building of a National Geospatial Database (Nanson et al., 1996). The ongoing creation of a European Geographic Information Infrastructure (Masser and Salgé, 1995) and, in America, the National Spatial Data Infrastructure (Clinton, 1994; Federal Geographic Data Committee, 1993), testify to the importance currently placed on exploiting geographic information as fully as possible. It is therefore evident that, partly as a result of the advent of GIS technology, and partly owing to strategic policy initiatives based around the integrative properties of geographic information, the commodification of geographic information is currently an important focus of research interest. This chapter examines the development of geographic information commodification in Britain, using the example of metropolitan information bureaux and the information products and services they provide, to amplify a model of commodification. The next section gives background information on the British local government sector, particularly its metropolitan information bureaux. The third section introduces a model of geographic information commodification in local government, based on a four-fold typology. Evidence from British information bureaux is presented in the fourth section to support this model, and the fifth section contains an evaluation and conclusions. 
7.2 BACKGROUND

Local government is an important locus of change in geographic information exploitation. Local authorities are major users of information and information technology (Hepworth et al., 1989). Much of this information is geographic in nature, and increasingly geographic information systems are used to manage it (Campbell and Masser, 1992). In addition, local government is a significant socio-economic entity. In Britain, local government represents a substantial segment of the economy. It accounts for the employment of nearly three million people (around 10 percent of the economically active population), spends around £46 billion each year (approximately 10 percent of gross domestic product), and is a major supplier of and influence on essential public services such as education, policing and public transport (Stoker, 1991). As well as being a provider of products and services, local government is an important consumer in the economy, purchasing construction, electricity, labour, stationery, geographic information, information technology, and a wide variety of consulting and other contracts. At the same time, local government performs a vital social role, being an important element of both the welfare state (through its provision of housing and social services) and representative democracy (Wilson and Game, 1994).

A number of processes might be considered to be interacting to promote the commodification of geographic information in British local government. Changes to the nature, culture and structure of local government are affecting how and for what purposes geographic information is exploited. However, technological change is particularly influential. Applying GIS often has the effect of focusing attention on the usefulness of geographic information. Since so much time and effort is spent getting data into and out of the system there is a natural tendency to wish to exploit the value of this data.
The mapping, analytical and presentational capabilities of a GIS are tools by which this can be achieved.
Other developments in information technology have had an impact on the exploitation of geographic information. The almost universal spread of the PC, floppy disk and CD-ROM has enhanced data storage and access capabilities, at low cost and with easy access for the user. Remote access to information over networks has improved data availability and distribution. Local networks have enhanced information access within local government organisations. Recent global scale developments with the Internet, notably the World Wide Web, have progressed at a remarkable pace. Local authorities have been influenced by all these developments. As well as being at the forefront of GIS adoption, they have invested widely in other computer technologies such as PCs, local area networks and Internet connections. Moreover, British local authorities have since 1993 made extensive purchases of digital map data from the Ordnance Survey, following the conclusion of an agreement designed to make digital mapping more accessible to the local government sector.

In five metropolitan areas in Britain, there are information units charged with providing geographic information services to all the municipal authorities in the locality. These metropolitan information bureaux were established in the mid 1980s, and are jointly funded by the municipal authorities in five of the major British conurbations (see Figure 7.1). However, they are not local authorities themselves: they are independent units within the public sector. The bureaux each serve local authorities in conurbations covering populations between one and six million. Comparative data on the five British metropolitan information bureaux is contained in Table 7.1.

Table 7.1: Metropolitan Information Bureaux

| Bureau Name | Major City Served | Population of Area Served (Millions) | Gross Expenditure (£ Million) | Funding as % of Expenditure |
|---|---|---|---|---|
| London Research Centre | London | 6.2 | 5.8 | 56 |
| West Midlands Joint Data Team | Birmingham | 2.5 | 1.3 | 98 |
| Merseyside Information Service | Liverpool | 1.4 | 0.8 | 95 |
| Greater Manchester Research | Manchester | 2.5 | 0.4 | 73 |
| Tyne and Wear Research and Intelligence Unit | Newcastle-upon-Tyne | 1.0 | 0.3 | 88 |

Source: Capes (1995)
Each metropolitan information bureau has the specific purpose of providing research and intelligence on social, economic, land use and, in some cases, transport issues to the consortium of municipalities it serves; but it also provides information services to a variety of other, mainly public sector, organisations. Since the sponsoring municipal authorities provide only part of an information bureau’s running costs (see Table 7.1), the provision of these services to other organisations for a fee is often an important part of a bureau’s business strategy. Their distinct focus on geographic information provision, their leading position with regard to GIS, and their status as jointly funded bodies with limited public subsidy, make these metropolitan information bureaux ideal laboratories in which to study the commodification of geographic information at the present time.
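The scale of the shortfall that fee-earning services must help to fill can be sketched from the figures in Table 7.1. The short Python example below is illustrative only: it assumes that the final column of the table records the share of gross expenditure met by the sponsoring municipal authorities, and the function name `unfunded_gap` is hypothetical rather than anything used in the original study.

```python
# Figures transcribed from Table 7.1 (Capes, 1995):
# bureau name -> (gross expenditure in £ million, funding as % of expenditure).
# Assumption: the percentage is the share met by the sponsoring municipalities.
BUREAUX = {
    "London Research Centre": (5.8, 56),
    "West Midlands Joint Data Team": (1.3, 98),
    "Merseyside Information Service": (0.8, 95),
    "Greater Manchester Research": (0.4, 73),
    "Tyne and Wear Research and Intelligence Unit": (0.3, 88),
}


def unfunded_gap(gross_expenditure_m, funding_pct):
    """Return the expenditure (£ million) not covered by municipal funding."""
    return gross_expenditure_m * (100 - funding_pct) / 100


# Gap per bureau, rounded to the nearest £1,000.
gaps = {
    name: round(unfunded_gap(expenditure, pct), 3)
    for name, (expenditure, pct) in BUREAUX.items()
}

# List the bureaux from the largest gap to the smallest.
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: about £{gap:.3f} million to find from other sources")
```

On this reading, the London Research Centre has by far the largest sum to raise from sources other than its sponsoring authorities, which is consistent with the commercial emphasis of its products.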
Figure 7.1: Metropolitan Information Bureaux Locations
7.3 A MODEL OF COMMODIFICATION

The context for geographic information commodification in local government is the changing geographic data availability environment at the national and international levels. The cost recovery versus free access to information debate (Onsrud, 1992a, 1992b) is especially visible in the USA. There, a tradition of free of charge, if not always highly precise, federal government information is being challenged by the development of sophisticated GIS products and services by government agencies and the need to recover at least part of the costs associated with creating, maintaining and supplying these information services (Bamberger and Sherwood, 1993). In Britain, too, government policy in recent years has been to require its national mapping agency to recoup as much as possible of its costs, particularly in relation to digitising large scale maps for use in GIS applications (Rhind, 1992b).

With these and other developments in mind, a conceptual framework for the commodification of geographic information has been developed (Capes, 1995). In the context of local government, four
characteristics of exploiting the information commodity were identified. Commercialisation is the selling of information, often to the business sector but also to the general public. Dissemination is providing or publishing information, often for the general public but also for the commercial sector. Information exchange is the trading or sharing of information amongst public departments and agencies, including metropolitan information bureaux. Value-added information services are information products which involve some extra work on the part of the data provider, such as data analysis, interpretation or repackaging. This fourfold typology of geographic information commodification is detailed below.

7.3.1 Commercialisation

Much work has focused on the commercial elements of commodification, concerning the sale of geographic information for a fee or charge by governments and other agencies (for example Antenucci et al., 1991; Aronoff, 1985; Bamberger and Sherwood, 1993; Blakemore and Singh, 1992). A fundamental charging issue is the level at which the charge is pitched. The literature points out a need to balance, through policy decisions, the benefits to society from widespread and cheap access to government information against the benefits to society from high fees offsetting information costs and lowering taxation (Blakemore, 1993). Dale and McLaughlin (1988) feel that there is a continuum of information products from those that should be wholly subsidised by the state to those that should be charged for on a completely commercial basis.

There are thus a variety of levels for information charges which can be drawn together to develop a “spectrum” of charging for public information. At one end of the spectrum is the zero charge position: information should be free since the enquirer will already have paid for it through taxation. The next position is marginal cost recovery.
Here, the information holder makes a charge to cover all or part of the cost of reproducing and distributing the information to the enquirer. This charge might cover photocopying, printing, paper and postage, but it would not account for the costs of collecting, storing and managing the information itself. Such is the charging regime applied by the US federal government when releasing its information to enquirers (Blakemore and Singh, 1992). A charging component for the cost of staff and computer time may be incorporated. The third point on the charging spectrum is full cost recovery (Aronoff, 1985). Here the charge reflects not only reproduction and distribution costs but also the cost of collecting and managing the information. Cost recovery is a common aim where expensive computer systems (such as GIS) have been procured and used to provide information services (Archer and Croswell, 1989). At the upper extremity of the public information charging spectrum is the practice of charging as much as possible, that is, what the market will bear. This strategy aims to charge a price greater than the full cost of providing information, in order to generate fresh income for the government or department in question. Proponents of this approach argue that a public body best serves the public by making a profit on information products, since this reduces the fiscal burden on the national coffers and provides funds for investment in new information and the technology and staff to manage it.

By charging for geographic information in these ways, government (including local government) is exploiting the potential of the information commodity to make money. It is the pecuniary value of information which underlies these activities. However, commercial dealings are just one way by which the information commodity can be exploited.
Although putting a monetary price on information quantifies its value in a manner which is easily understood by everyone, the value of geographic information can also be expressed in alternative, non-commercial and non-quantitative ways, such as information dissemination and exchange.
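The four positions on the charging spectrum described in this section (zero charge, marginal cost recovery, full cost recovery, and what the market will bear) can be made concrete with a small sketch. The Python example below is an illustration under stated assumptions, not a pricing method taken from the literature cited: the names `ChargingPolicy` and `price`, the split of costs into two components, and the monetary figures are all hypothetical, and the market-rate rule (never less than full cost) is one simple reading of the description above.

```python
from enum import Enum


class ChargingPolicy(Enum):
    """The four positions on the charging spectrum, lowest charge first."""
    ZERO_CHARGE = "zero charge"
    MARGINAL_COST = "marginal cost recovery"
    FULL_COST = "full cost recovery"
    MARKET_RATE = "what the market will bear"


def price(policy, reproduction_cost, collection_cost, market_price):
    """Return the charge for one information request under a given policy.

    reproduction_cost: copying, printing and postage for this request
    collection_cost:   this request's share of collecting and managing the data
    market_price:      the highest price the enquirer is thought willing to pay
    """
    if policy is ChargingPolicy.ZERO_CHARGE:
        return 0.0
    if policy is ChargingPolicy.MARGINAL_COST:
        return reproduction_cost
    full_cost = reproduction_cost + collection_cost
    if policy is ChargingPolicy.FULL_COST:
        return full_cost
    # Market rate: modelled here as never falling below full cost (an assumption).
    return max(full_cost, market_price)


# Hypothetical request: £5 to reproduce, £20 share of collection, £40 market price.
charges = [price(policy, 5.0, 20.0, 40.0) for policy in ChargingPolicy]
print(charges)  # [0.0, 5.0, 25.0, 40.0]
```

The sketch shows that the charge rises monotonically along the spectrum, which is why the choice of position is a policy decision rather than a technical one.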
7.3.2 Dissemination

Non-quantitative value is not so tangible or immediate as monetary gain, and can appear in many forms, so is harder to pin down. One instance is the power of geographic information to inform. Information may be used in research and education at schools and universities (Blakemore and Singh, 1992; Lloyd, 1990), or to inform and advise local residents and communities (Craig, 1994). Distributing information can also be undertaken for tactical or political ends, such as attempting subtly to shift opinions round to a certain point of view (Campbell, 1990; Roszak, 1986). Hepworth (1989) and Sherwood (1992) see disseminating local government information as helping to promote an area or a local council and enhance service delivery: providing information to citizens can assist their understanding of available services, whilst giving information to businesses is commonly used by local authorities as an advertising ploy to attract investment and employment to their areas.

If public sector information does have a value for people outside government agencies then its availability is an important issue. Availability concerns both the release of internal information and access as mediated by the price charged for it. A potential access problem is posed by the sometimes conflicting goals of commercialisation and data availability. This is highlighted in general terms by Openshaw and Goddard (1987) and Mosco (1988), and is addressed in the context of local government by Hepworth (1989) and Barr (1993). The conflict is between a desire for a government to impose cost recovery targets on its information agencies, and a wish to disseminate information to as wide an audience as possible. Charging for information may price potential members of this audience out of the information market.
One remedy is differential charging: the general public and students may be charged at a lower rate than, say, the business sector in order to keep information within an appropriate price band. Solutions may not be this simple. Nevertheless, it is now recognised that access to geographic information for members of the public and voluntary groups is “among the most important issues now before the GIS community” (Dangermond, 1995).

Activities providing, circulating and publishing geographic information, often for the general public but also for the business sector, with the primary objectives of promulgating knowledge and raising awareness, can be termed information dissemination. The promotion or marketing of a locality may be an aim of information dissemination, along with the stimulation of the local or national economy and labour market. Information dissemination may also be associated with tactical or political aims if the provision of information has less altruistic motives. Various methods may be employed to ensure that geographic information reaches the target groups, with free information and differential charging being possibilities.

7.3.3 Information exchange

In addition to selling and disseminating information to external groups, local authorities also exchange information between, and within, themselves. Information which might otherwise be sold is sometimes provided free of charge to participants in data exchange agreements (Taupier, 1995). As a quid pro quo, all involved parties will provide some information in return for receiving new information from the other participants. In Britain, the metropolitan information bureaux exchange information of mutual interest with municipal councils, the population census being a notable case (Capes, 1995). Within councils, departments and sections within departments may share or swap information with one another (Calkins and Weatherbe, 1995).
Such information exchange arrangements are of benefit to all parties and often operate on a non-commercial basis: all actors aim to minimise their net costs by sharing the total overhead between them (Taupier, 1995). However, there is a growing tendency for information exchanges between and within local authorities to have a commercial component. As Bamberger (1995) relates, there are significant factors hindering information exchange exercises, one being the wish of all parties involved to see the accrual of sufficient benefits to offset the resources they have contributed to the joint venture. If just one participant pulls out of an information exchange, the whole scheme might collapse. Nevertheless, where the incentive for participation is sufficiently great, information exchange activities can and do take place.

7.3.4 Value-added information services

Apart from cost minimising, a further stimulus for information exchange might be the possibility of adding value to information. By combining data sets from different agencies, the value of the whole becomes greater than the sum of the parts. A geographic information system is an ideal opportunity for such an activity. If the topographic data from a mapping agency are combined with the settlement information from a planning department and then added to traffic information from a transportation authority, the resulting information system is a powerful tool for analysing and perhaps modelling journeys, their purpose and duration. This analysis would not be possible if one of the data sets were missing. In this way, the power of GIS to add value to information by integrating a number of databases is demonstrated. GIS can also add value to data through the process of verifying and cleaning data which is performed when information is input to the system. But value-added information services do not necessarily require computer technology to be created.
The skills and expertise of trained research officers are often vital components in adding value to information. Whilst a computer can rapidly perform statistical work like census tabulation (Rhind, 1983), only a trained human being can interpret this information. Given that answers are what many information clients want, such interpretation is clearly a vital part of many value-added information services. Presenting results in an accessible manner (perhaps in the form of graphics and maps, as opposed to abstruse tabulations) is a further value-adding service (Garnsworthy, 1990). These value-adding activities all have in common the fact that they make information more useful and hence more valuable to users. In performing value-adding work, staff and computer time are expended, along with other resources. It is therefore often the case that this added value attracts a charge to compensate for the extra resources put into preparing the information in such a form.

7.3.5 Discussion

Geographic information commodification in local government has four main components, each of which recognises that information has a value beyond its statutory uses and can therefore be further exploited. These four components are commercialisation, dissemination, exchange and value-added information services. Commercialisation is the selling of information, mainly to businesses, with earning income to recover costs as an objective. Dissemination is providing, circulating and publishing information, often for the general public but also for the business sector, with the primary objectives of promulgating knowledge and raising awareness. Promotion or marketing may be associated aims. Information exchange is the trading or sharing of information between departments, councils and related agencies, with the primary aim being maximised mutual benefits and minimised mutual costs (rather than making money or educating the
public). Value-added information services involve additional work to enhance an information product by combining, analysing, interpreting or presenting information in a form more useful to potential clients.

All four commodification activities embody the recognition that public information is an externally valuable resource as well as an internally useful tool, but each rests on a different concept of value. In the case of commercialisation that value is simple to pinpoint: it is purely pecuniary. For dissemination the value of information is educational, social and economic. With information exchange, value is harder to express because it is jointly accrued, and may take the form of losses foregone as well as benefits gained. Value-added information services result from making information more useful, and hence more valuable, to potential clients.

It should be appreciated that these four types of commodification are not mutually exclusive. Information exchange or dissemination might have a monetary element, if a charge (perhaps at marginal cost recovery rates) is made as part of the transaction. Commercialisation may include an educational component where, for instance, an information pack is sold to a school or member of the public: here income is gained but information is still disseminated. Added value might come into play in commercialisation or dissemination, since the information products and services involved may incorporate additional human or technological resource expenditure. The boundaries between the four elements of geographic information commodification are therefore blurred or overlapping. Few information transactions are likely to involve just one facet of commodification. The four strands of geographic information commodification (commercialisation, dissemination, exchange, and value-added information services) can occur together or separately.
Nonetheless, as a working model describing commodification in local government this provides a useful apparatus for investigating the nature of commodification in practice.

7.4 GEOGRAPHIC INFORMATION COMMODIFICATION IN METROPOLITAN INFORMATION BUREAUX

With the adoption of geographic information systems and under the influence of changes to the ‘municipal information economy’ in Britain (Hepworth et al., 1989), information commodification has begun to develop to a more sophisticated degree than ever before. The most advanced sector of British local government with respect to geographic information commodification is the set of metropolitan information bureaux in major conurbations. These bureaux have the express purpose of providing geographic information and research on behalf of the municipal authorities in their area. They are therefore strong centres of information activity, and are important examples of public sector use of geographic information technologies. In-depth studies of all five metropolitan information bureaux, together with detailed case studies of specific geographic information services they offer, were undertaken to provide evidence on the nature of advanced commodification in local government (Capes, 1995).

Commercialisation, or vigorous charging for information, was found to be widespread. Although information products and services provided by the metropolitan information bureaux earn significant amounts of income, none of the services meet full cost recovery: income generated by selling information is used to subsidise the costs of providing services to the municipal authorities or to fund new technology purchases rather than to make profits. A particularly prominent activity in the British metropolitan information bureaux is the exchange of information. By sharing their geographic information processing and storage in one location, municipal authorities can reduce their costs.
Little information dissemination is evident in the metropolitan information bureaux (this being far stronger elsewhere in the local government sector), but the bureaux do
display advanced value-added information services. They collate data from municipal authorities and package it as a single information set. It is then repackaged to suit the needs of customers, for example being made available in a variety of formats such as bound directories, paper lists, sticky mailing labels or on floppy disk. Occurrences of all four types of information commodification in the metropolitan information bureaux are examined in more detail below.

7.4.1 Commercialisation

Commercialisation, or vigorous charging for information, is found in all metropolitan information bureaux. The business information service run by the Tyne and Wear Research and Intelligence Unit, although primarily run to promote local economic development, has a commercial component. Sales of these business information products and services gross around £20,000 each year, this money supporting the other activities of the Unit and helping in the purchase of new computers. Moreover, since the costs of this service have been estimated at £28,000, achieving full cost recovery through information sales is technically feasible (income generated presently covers some 70 percent of the estimated costs) although it is ruled out on other grounds. Some local marketing of business information services is carried out, further highlighting the commercial nature of this service. In addition, business information is currently being pushed into new markets via a joint venture with an economic development organisation. The Tyne and Wear Unit also uses economic data to make contacts in the business world and provide it with a large and secure client base beyond the district councils. By these means, the Unit is strengthening its position as a key player in the local economic information market.

SASPAC (small-area statistics package) is a strongly commercialised geographic information product, provided by the London Research Centre.
SASPAC dominates the census analysis market in larger British local authorities and central government. This census analysis software product generates income for the Centre, helping to subsidise its municipal work programme. An indication of the commercial importance of SASPAC is given by the tight copyright protection applied to it by the London Research Centre, and the strong marketing the Centre has undertaken in order to sell its product as widely as possible. It is in the nature of the project that SASPAC is most interesting. The London Research Centre has been involved in a close partnership with a number of external, private sector companies to develop and market the software. As well as working in partnership with the private sector, the London Research Centre has developed contacts with a wide range of public sector organisations outside the London boroughs (such as central government departments and health authorities). These links have led into spin-offs, with new geographic information products since being developed. The Merseyside Address Referencing System (MARS) is a spatial land and property database of great commercial importance for Merseyside Information Service. MARS is central to its GIS work, forming the spatial “hook” on which all its other data is “hung”. Most crucially, it is provided to the local police department as the basis of the police incident response system in Liverpool. Income generation does not explicitly arise in this case because most payments for services based on the MARS database are contained within the funding subscriptions received from municipal authorities and other local bodies. Nevertheless, the level of these subscriptions (over £150,000 per annum in the case of the Merseyside police) reflects the value of the database and the cost of maintaining it. 
A commercial approach to geographic information can therefore present itself either as high pricing in order to recover costs, raise money or fund new developments; or as a strategic effort to widen the client
base for information to ensure a stable future for the service. Both these techniques indicate a businesslike approach to treating information as a commodity. Raising income is a feature of all metropolitan information bureaux. In each case, the money generated eventually helps subsidise the core services of the bureau, reducing the amount demanded from the subscribing municipal councils. The Tyne and Wear business information service and the London Research Centre’s SASPAC product have both seen efforts to widen the market of users in order to secure the products against competition (largely from elsewhere in the public sector).
7.4.2 Dissemination
Because the metropolitan information bureaux are primarily concerned with providing information to municipal authorities and other public sector organisations as opposed to the general public or businesses, wide dissemination of information would not be expected. There is, however, dissemination evident in some cases. Greater Manchester Research is one example, with its policy of publishing all its work as a report or bulletin available for sale to any firm or individual. The business information at Tyne and Wear Research and Intelligence Unit displays disseminatory characteristics, being available at half price to organisations concerned with economic development, or for consultation free of charge in libraries by anyone. A slightly different form of information dissemination is displayed by the Research Library at the London Research Centre, which circulates alerting bulletins to London borough councillors to keep them updated on local government and London affairs. Other research has shown geographic information dissemination to be widespread in local authorities, with particular concentration in London borough councils (Capes, 1995). Dissemination can involve the wide provision of information, either free of charge or at reduced rates, to various client groups.
These may be businesses, councillors or the general public. Information to businesses is usually geared towards economic development (such as business information in Tyne and Wear), and information to the public may have the aim of spreading knowledge or promoting the organisation that is disseminating the information. Evidence for dissemination may be indicated by differential charging structures, where some groups get information at cheaper rates than others.
7.4.3 Information exchange
Given that the purpose of metropolitan information bureaux is to share information and information processing amongst municipal authorities, there is a great deal of information exchange visible in the metropolitan information bureaux. The aim of information exchange is either to reduce the costs of information storage or processing, or to enhance the value of information by combining it with other data sets (see Capes, 1995). An example of the former is the GIS activities at Merseyside Information Service. In this case, a centrally-run wide-area network results in shared ‘information overheads’ for the subscribing municipal councils and other bodies, since information, software, hardware and support are all provided by MIS from the hub of the network. An example of the latter is the Tyne and Wear business information service. Business databases from the five Tyne and Wear districts are combined to create a more valuable countywide database, of use for strategic monitoring and enquiry purposes. Outputs based on this information are
provided back to the district councils. In addition they benefit from the commercial side of the service, taking a share of the income it generates. Shared information systems are justified either on grounds of cost or because they provide useful information that individual local authorities could not obtain operating independently. An example lies in the census information services provided by every metropolitan information bureau. These are justified on cost grounds (the metropolitan information bureaux make one purchase of the census data and are able to distribute it amongst the municipal authorities in their area free of census office copyright restrictions, thus avoiding the need for each authority to buy its own set of census data) and on shared information grounds (by holding the census data for the whole area, the metropolitan information bureaux can provide strategic information which the districts working on their own could not obtain).
7.4.4 Value-added information services
Examples of value-added information services are common in the metropolitan information bureaux. The Tyne and Wear business information service involves collating then repackaging business information, making it available in various formats (bound directories, paper lists, sticky mailing labels or on floppy disk) to suit the needs of customers. Depending on the user, the form in which they take the information has more value than the alternatives. SASPAC is an information technology product which adds value to other geographic information by enabling census data to be analysed and printed.
The Merseyside Address Referencing System (MARS) and its associated GIS activities add considerable value to local and national data sets by pulling them together on a single wide-access system; integrating other data; adding a comprehensive land and property gazetteer to reference and analyse the data; supplying appropriate software and hardware to do all this; and making training and advice available to users of these systems. Few local government examples of value-added information services are as sophisticated as the MARS example, and few are so dependent on high technology. Most value-added information services (and many instances of geographic information commodification) involve information technology, and most involve some degree of innovation.
7.4.5 Discussion
Commercialisation can be characterised as the practice of charging clients both inside and outside the organisation for geographic information. The motivation behind charging varies: it may be to earn income to reduce an organisation’s call on taxation revenue; income may be used to subsidise the purchase of new equipment or software, or to part-fund the development of new information services; charging can be used to deter frivolous requests for information. The vigour applied to charging also varies, but only rarely in local government are attempts made to exceed marginal cost recovery (Capes, 1995). Indeed, zero charging may be applied when information is provided in small quantities; to certain individuals or groups; or as a “loss-leader” to promote the organisation, its other products, and its locality. These marginal costs might include photocopying, printing, materials (paper, toner and so on), computer time, and a component to cover office overheads (such as heating and lighting).
In some cases, staff time involved in preparing information to meet a request is recorded as a marginal cost and is charged for; in these cases, value has been added to information and commercial clients might reasonably be expected to pay for these enhancements. The highest charging rate found by Capes (1995) for staff time in local
government was £75 per hour. Not usually included in charges for local government geographic information are the costs of acquiring and maintaining the data upon which information products and services are based. This is because collecting this data is done at public expense to help fulfil the statutory functions of local government; charges which attempt to recover these costs are rarely seen as desirable at present in local government. Data charges are generally only made for data bought from another body—fees for copying Ordnance Survey maps, and the payment of census office royalties are two cases in point. Geographic information dissemination in local government is characterised by the provision of free or reduced-price information to certain client groups, particularly the business sector and members of the public. Local authorities appear to disseminate information for any of three main reasons: to promote local economic development, to promulgate knowledge in the community, and to promote the information bureau or its parent bodies. The purpose of geographic information exchange in local government may be to reduce the costs of information management or purchase to the participating organisations; or to enhance the value of individual data sets by combining them with other information. Geographic information exchange can therefore be characterised as the trading or sharing of information with other public bodies, particularly other departments and local authorities, which brings benefits to all parties involved. Information is itself the main currency of such exchanges, although where a difference is thought to exist between the values of the information items exchanged the balance may be made up with money. Value-added information services entail making geographic information more useful or accessible to clients. 
This may be achieved through the use of hardware, software, skills or expert knowledge to repackage, collate, integrate, analyse, publish or present information. Adding value does not include reselling existing data products. Instead, it involves repackaging information by adding something new, such as analysis (often by computer), skills (or expert knowledge), presentation (in graphical or map form, or on different media such as floppy disk), or just publication (making information available to a wider audience, such as publishing business information as a directory).
7.5 EVALUATION AND CONCLUSIONS
The typology of geographic information commodification in local government encompassed commercialisation, dissemination, information exchange, and value-added information services. This model is sufficiently robust to describe and analyse those instances of geographic information commodification found by research in British local government. All four types of commodification have been found to be present in metropolitan information bureaux, and these findings parallel developments visible elsewhere in local government. The metropolitan information bureaux have a highly-developed commercial profile, depending on generating income, or the promise of income, to maintain their staff and services. Income is used to subsidise existing services and fund new developments. Both these activities are increasingly in evidence elsewhere in local government, where budgets sometimes have income generation targets written in, and the purchase of new facilities (such as geographic information systems) may require income guarantees to gain approval. Information products of metropolitan information bureaux are sometimes vigorously marketed, perhaps in conjunction with private sector partners.
Such partnerships to exploit commercial opportunities are beginning to be visible in local authorities: the joint venture between Powys County Council in Wales and private companies to develop and market the C91 census package is a key example. Local authorities
are also starting to address marketing and market research, as illustrated by the appointment of a marketing manager at Manchester City Council. Information dissemination is presently more common in local councils than in the metropolitan information bureaux, which are primarily concerned with providing information to local authorities rather than to the community at large. Dissemination for organisational defence and promotion is replicated frequently throughout the local authority community. The metropolitan information bureaux have information exchange arrangements similar to those found in the larger local authorities in Britain. In both cases, exchange enhances the value of one agency’s information by combining it with that of others, and at the same time reduces data processing costs for all parties involved. Some sophisticated examples of value-added information services are found in the metropolitan information bureaux. The SASPAC census handling product at the London Research Centre, together with the Merseyside Address Referencing System and its associated GIS activities at Merseyside Information Service, are probably the most exciting instances of value-added information services currently available in local government. This is a function both of their high technological sophistication and of the inter-agency collaboration they involve. Contrasts can be made between geographic information commodification in local government and that taking place in central government. Geographic information provision by some British central government agencies is more strongly commercialised than that of local government in Britain. In part, this reflects the government’s Tradeable Information Initiative, which aims to sell data at full market prices. In the case of the Ordnance Survey, however, it is a change in the status of the organisation that has augmented the commercial role of geographic information. 
In the context of geographic information in Britain, the Ordnance Survey is in the peculiar position of being a government executive agency (it has been separated from its parent department and given its own budget and cost recovery targets) whilst at the same time retaining copyright of its uniquely valuable geographic data sets. As a geographic information publisher, the Ordnance Survey needs to protect this copyright since it guarantees its revenue, a growing proportion of which it is required to raise from data sales. Local authorities, on the other hand, have many functions and income sources and are not dependent on data sales for their continued existence. Even local government metropolitan information bureaux deal with a range of information items; they are not reliant on any single data set for survival. Local government geographic information providers are not, on the whole, required to meet tough cost recovery targets such as those set by central government for the Ordnance Survey. It cannot, therefore, be regarded as surprising that commercialisation is often stronger in central than local government in Britain. Those cases where the geographic information commercialisation displayed in local government matches or exceeds that found in central government are, at present, exceptions rather than the rule. An interesting comparison can be made between information commercialisation in British government and that in the USA where the pattern suggested above is, to some extent, reversed. In America, the federal (central) government has adopted a non-commercial approach to its data, but some state and local governments exhibit more of a commercial orientation. Under Freedom of Information legislation the federal government releases data at low cost, charging no more than the marginal costs of reproducing and disseminating each information request. This mimics the de facto situation in most local authorities in Britain (Capes, 1995).
Much federal data is in raw form, requiring analysis, integration and presentation to make it useful to outside clients. But key geographic products provided cheaply by the federal government have greatly assisted such analyses. For example, census information management products have enabled computer users to map and analyse population census data with a common spatial referencing system (Klosterman and
Lew, 1992). The US census bureau has made these computer files available without copyright restrictions and at low cost. State and local governments in the USA are not always bound by the same federal Freedom of Information laws. These authorities are therefore able to price geographic information commercially. Some state governments, for example, have passed their own laws enabling them to charge for GIS products and services, and recover some or all of the costs associated with providing them. Local governments and their agencies also show evidence of commercialisation. Sherwood (1992) made seven case studies of local government agencies selling geographic information in the USA. As well as the research units of multi-purpose local authorities (such as are still the norm in Britain), Sherwood looked at a regional planning commission, a metropolitan transport and planning agency, a local authority consortium created to develop a GIS, and an arm’s-length company created by another local authority consortium specifically to sell data on a non-profit basis. These agencies tend to exploit their information commercially in concert with private sector firms, who sell and distribute public information, databases and information products in return for a share of the revenue. The heterogeneity of USA local government makes for a different institutional context to that found in Britain at present; however these differences would appear to be decreasing in their extent as British local authorities change their structure, nature and culture. The limited comparison available from Sherwood’s (1992) study suggests that there is considerable parity in geographic information commodification between Britain and America. Sherwood’s findings, along with those reported in Johnson and Onsrud (1995), suggest that the recovery of a significant proportion of costs in American GIS agencies occurs only rarely. 
Following on from this, a crucial observation as to the nature of commodification in practice can be made. This is to emphasise that commodification is not simply about commercialisation. Full cost recovery is a rarity: it is the strategic spin-offs from exploiting information that tend to be of greater importance. These include:
• positioning an organisation in a competitive environment by gaining new clients and partners;
• using income (or the promise of income) to secure funding for new equipment or staff;
• promoting an organisation by disseminating its information;
• boosting economic development by providing information to businesses;
• saving money and gaining new information by exchanging data sets.
Other points to note are the importance of fully utilising the skills and knowledge of staff members: in many cases, their experience can add more value than a computer possibly could. Finally, the integrative properties of geographic information (via the map) can be used to combine and display other data. This chapter has begun to outline the nature of the commodification of geographic information at the local level, and reveal how this reflects trends visible at the national and international levels in Europe and North America. These are, on the one hand to exploit the commercial value of geographic information by charging for data products and services; and on the other hand to disseminate and exchange information strategically by developing data networks and shared infrastructure. Both these trends involve the provision of raw and value-added information services. Findings from British local government are therefore a valuable addition to our knowledge of how data is being exploited by its owners, and the implications this holds for other data users in a developing information society.
REFERENCES
ANTENUCCI, J.C., BROWN, K., CROSWELL, P.L., KEVANY, M.J., and ARCHER, H. 1991. Geographic Information Systems: a Guide to the Technology. New York: Van Nostrand Reinhold.
ARCHER, H. and CROSWELL, P.L. 1989. Public access to geographic information systems: an emerging legal issue, Photogrammetric Engineering and Remote Sensing, 55(11), pp. 1575–1581.
ARONOFF, S. 1985. Political implications of full cost recovery for land remote sensing systems, Photogrammetric Engineering and Remote Sensing, 51(1), pp. 41–45.
BAMBERGER, W.J. 1995. Sharing geographic information among local government agencies in the San Diego region, in Onsrud, H.J. and Rushton, G. (Eds.) Sharing geographic information. New Jersey: Center for Urban Policy Research, pp. 119–137.
BAMBERGER, W.J. and SHERWOOD, N. (Eds.) 1993. Marketing government geographic information: issues and guidelines. Washington DC: Urban and Regional Information Systems Association (URISA).
BARR, R. 1993. Signs of the times, GIS Europe, 2(3), pp. 18–20.
BEAUMONT, J.R. 1992. The value of information: a personal commentary with regard to government databases, Environment and Planning A, 24(2), pp. 171–177.
BLAKEMORE, M. 1993. Geographic information: provision, availability, costs, and legal challenges. Issues for Europe in the 1990s, Proceedings MARI 93 Paris, 7–9 April 1993, pp. 33–39.
BLAKEMORE, M. and SINGH, G. 1992. Cost-recovery Charging for Government Information. A False Economy? Gurmukh Singh and Associates Ltd., 200 Alaska Works, 61 Grange Road, London SE1 8BH.
CALKINS, H.W. and WEATHERBE, R. 1995. Taxonomy of spatial data sharing, in Onsrud, H.J. and Rushton, G. (Eds.) Sharing geographic information. New Jersey: Center for Urban Policy Research, pp. 65–75.
CAMPBELL, H.J. 1990. The use of geographical information in local authority planning departments, Unpublished PhD thesis, University of Sheffield, UK.
CAMPBELL, H. and MASSER, I. 1992. GIS in local government: some findings from Great Britain, International Journal of Geographical Information Systems, 6(6), pp. 529–546.
CAPES, S.A. 1995. The commodification of geographic information in local government, Unpublished PhD thesis, University of Sheffield, UK.
CLINTON, W.J. 1994. Coordinating Geographic Data Acquisition and Access: the National Spatial Data Infrastructure, Executive Order, 11 April 1994. Office of the Press Secretary, the White House, Washington, DC.
CRAIG, W.J. 1994. Data to the people: North American efforts to empower communities with data and information, Proceedings of the Association for Geographic Information Conference (AGI ‘94), pp. 1.1.1–1.1.5.
DALE, P.P. and McLAUGHLIN, J.D. 1988. Land information management. Oxford: Oxford University Press.
DANGERMOND, J. 1995. Public data access: another side of GIS data sharing, in Onsrud, H.J. and Rushton, G. (Eds.) Sharing geographic information. New Jersey: Center for Urban Policy Research, pp. 331–339.
FEDERAL GEOGRAPHIC DATA COMMITTEE 1993. A Strategic Plan for the National Spatial Data Infrastructure: Building the Foundation of an Information Based Society. Federal Geographic Data Committee, 590 National Center, Reston, Virginia 22092, USA.
GARNSWORTHY, J. 1990. The Tradeable Information Initiative, in Foster, M.J. and Shand, P.J. (Eds.) The Association for Geographic Information Yearbook 1990. London and Oxford: Taylor & Francis and Miles Arnold, pp. 106–108.
GARTNER, C. 1995. Commercialisation and the public sector—the New Zealand experience. Paper presented at National Mapping Organisations Conference, 25 July to 1 August 1995, Cambridge, UK (not in proceedings).
HEPWORTH, M.E. 1989. Geography of the information economy. London: Belhaven Press.
HEPWORTH, M., DOMINY, G. and GRAHAM, S. 1989. Local Authorities and the Information Economy in Great Britain. Newcastle Studies of the Information Economy, Working Paper no. 11. Centre for Urban and Regional Development Studies, University of Newcastle-upon-Tyne, UK.
JOHNSON, J.P. and ONSRUD, H.J. 1995. Is cost recovery worthwhile? Proceedings of the Annual Conference of the Urban and Regional Information Systems Association (URISA), July 1995, San Antonio, Texas, http://www.spatial.maine.edu/cost-recovery-worth.
KLOSTERMAN, R.E. and LEW, A.A. 1992. TIGER products for planning, Journal of the American Planning Association, 58(3), pp. 379–385.
LLOYD, P.E. 1990. Organisational Databases in the Context of the Regional Research Laboratory Initiative, Regional Research Laboratory Discussion Paper 4, Department of Town and Regional Planning, University of Sheffield, UK.
MAFFENI, G. 1990. The role of public domain databases in the growth and development of GIS, Mapping Awareness, 4(1), pp. 49–54.
MASSER, I. and CRAGLIA, M. 1996. The diffusion of GIS in local government in Europe, Ch. 7 in Craglia, M. and Couclelis, H. (Eds.) Geographic Information Research: Bridging the Atlantic, London: Taylor & Francis.
MASSER, I. and SALGÉ, F. 1995. The European geographic information infrastructure debate, in Masser, I., Campbell, H.J. and Craglia, M. (Eds.) GIS Diffusion: the Adoption and Use of Geographical Information Systems in Local Government in Europe. London: Taylor & Francis, pp. 28–36.
MOSCO, V. 1988. Introduction, in Mosco, V. and Wasko, J. (Eds.) The Political Economy of Information. Madison: University of Wisconsin Press, pp. 8–13.
NANSON, B., SMITH, N. and DAVEY, A. 1996. A British national geospatial database? Mapping Awareness, 10(3), pp. 38–40.
ONSRUD, H.J. 1992a. In support of open access for publicly held geographic information, GIS Law, 1(1), pp. 3–6.
ONSRUD, H.J. 1992b. In support of cost recovery for publicly held geographic information, GIS Law, 1(2), pp. 1–6.
OPENSHAW, S. and GODDARD, J. 1987. Some implications of the commodification of information and the emerging information economy for applied geographical analysis in the United Kingdom, Environment and Planning A, 19(11), pp. 1423–1439.
RHIND, D. (Ed.) 1983. A Census User’s Handbook, London: Methuen.
RHIND, D. 1992a. Data access, charging and copyright and their implications for GIS, International Journal of Geographical Information Systems, 6(1), pp. 13–30.
RHIND, D. 1992b. War and peace: GIS data as a commodity, GIS Europe, 1(8), pp. 24–26.
ROSZAK, T. 1986. The cult of information. Cambridge: Lutterworth Press.
SHEPHERD, J. 1993. The cost of geographic information: opening the great debate, GIS Europe, 2(1), pp. 13, 56 & 57.
SHERWOOD, N. 1992. A review of pricing and distribution strategies: local government case studies, Proceedings URISA ‘92, volume 4. Washington: Urban and Regional Information Systems Association, pp. 13–25.
STOKER, G. 1991. The politics of local government. London: Macmillan.
TAUPIER, R.P. 1995. Comments on the economics of geographic information and data access in the Commonwealth of Massachusetts, in Onsrud, H.J. and Rushton, G. (Eds.) Sharing geographic information. New Jersey: Center for Urban Policy Research, pp. 277–291.
WILSON, D. and GAME, C. 1994. Local government in the United Kingdom. London: Macmillan.
Chapter Eight
Nurturing Community Empowerment: Participatory Decision Making and Community Based Problem Solving Using GIS
Laxmi Ramasubramanian
8.1 INTRODUCTION
In cities and communities across the world, architects, planners, decision makers, and individuals are using Geographic Information Systems (GIS) and related information technologies to understand and evaluate specific problems occurring in their physical and social environment. In the United States for example, the city of Milwaukee monitors housing stock (Ramasubramanian, 1996), while Oakland’s Healthy Start program has analysed the high incidence of infant mortality (King, 1994a). GIS are being used to describe and explain diverse phenomena such as school drop out rates and provide decision support for a wide range of tasks, for example, the coordination of emergency service delivery in rural areas. These advances, as well as many other innovative uses of information technology and spatial analyses are reported in technical journals such as the Urban and Regional Information Systems Association (URISA) Journal, and in the popular press through magazines like Time, and US News and World Report, as well as in trade journals such as GIS World. The use of information technologies and spatial analysis concepts promises many benefits to individuals and communities in our society—a society in which data, information, and knowledge have become commodities, seen as assets much like land, labour, and capital (Gaventa, 1993). At the same time, several thinkers, researchers, and analysts observe that information technologies can become inaccessible, thereby making the promise they offer disappear (e.g. Pickles, 1995). For example, William Mitchell, the Dean of MIT’s School of Architecture and Planning, observes, “While these technologies have the potential to become powerful tools for social change, create opportunities, and broaden access to educational opportunities, jobs, and services, it must be recognised that these benefits will not come automatically” (Mitchell, 1994, p. 4).
There is considerable evidence which suggests that individuals and community groups have difficulty in acquiring and using information technologies. In particular, citizens and groups from low income communities and communities of colour are disproportionately affected by lack of access to information technologies (e.g., Sparrow and Vedantham, 1995).
8.2 SCOPE AND SIGNIFICANCE
In this context, this chapter explores the appropriateness and the limitations of GIS to facilitate participatory, community-based planning. It is anticipated that spatial mapping and analysis will be valuable to community organisations and groups because it can be used skilfully to identify issues, make comparisons,
analyse trends, and support socio-political arguments thereby facilitating policy analysis, service delivery, and community participation. There is a push and a pull which is making GIS use increasingly popular at the level of small groups. The push comes from the manufacturers of hardware and software and professional decision makers who are presenting GIS as a panacea for problem solving to a new market segment, while the pull comes from these small groups themselves who are seeking more control over decision-making processes about issues that affect them and see GIS as a useful tool for this purpose. These groups can be characterized as having immediate and local problems, special interests, limited but well defined decision making powers, and limited technical knowledge. To facilitate real participation of community residents in planning and decision-making processes, it is vital that the community has control of, and access to public information. While planners and decision-makers routinely use data and information from any community in order to make decisions for that community, the same information is seldom available or accessible to community residents. Traditionally this disparity in information access has been attributed to the unavailability of both processes and technologies which could involve community residents in traditional planning and decision-making processes. This chapter argues that GIS can be effectively used to facilitate decentralised, community-based decision-making, planning, and research thereby contributing to the creation of an empowered citizenry. While conventional thinking believes that GIS and related technologies tend to centralise decision-making, this author argues that end-users, equipped with a critical world view, will be able to think creatively about ways they can use data, information, and GIS technology in their day-to-day problem solving, thereby making decentralised decision-making a reality. 
The chapter also argues that Participatory Research (PR) is a viable conceptual and methodological approach to develop critical thinking skills among end-users.

8.3 LITERATURE REVIEW

In keeping with the scope of the chapter, this literature review addresses two themes: the nature of the technology and the context within which it will be used. The literature review discusses the organisational and institutional issues surrounding GIS adoption and use. In addition, this section presents a brief discussion about the development of communities, their institutions, and the decision-making processes that are typically encountered in this context. The review concludes by exploring the role that information plays in community-based decision-making.

8.3.1 Organisational and Institutional Issues Affecting GIS Adoption

While research in the area of GIS has tended to centre around the technological capabilities of the system, there has been a steady stream of studies which have shown that the adoption and use of GIS is dependent on factors other than those related to the technical capacity (hardware and software) of the system. The GIS literature addressing organisational issues focuses on the diffusion processes of GIS in a variety of organisational contexts. Masser (1993) and Campbell and Masser (1996) have studied GIS diffusion in British local governments while Wiggins (1993) has examined the same issue within public sector agencies in the United States. In addition, Huxhold (1991) has studied GIS diffusion in city government by looking at the development of GIS in the City of Milwaukee over a period of 15 years. Croswell (1991), studying organisations that had acquired a GIS, developed a matrix of common problems associated with GIS implementation in the United States. The main problems identified in a hierarchical
COMMUNITY EMPOWERMENT USING GIS
order of high to low incidence are: lack of organisational coordination and conflicts; problems with data and data integration; lack of planning and management support; and the lack of awareness of technology and training. Looking at GIS adoption in planning agencies in developing countries through a case study of a “successful” planning agency in India, this researcher found that developing country agencies follow a model of GIS implementation which is similar to that of developed countries. As a result, they tend to have the same problems. This research identifies seven factors that facilitate GIS implementation: achieving clarity in problem definition; conducting a user-needs assessment; establishing inter-agency coordination; training of personnel; organising the collection and management of data; designing an incremental system for development; and the important role of advocates within the organisation (Ramasubramanian, 1991). Obermeyer and Pinto (1994) argue that the GIS literature has typically considered the implementation process as a bridge between the system developer and the user. When the system crossed over the bridge, it was regarded as successful. They recommend that implementation success must look at three criteria: technical validity (whether the system works); organisational validity (whether the system meets the organisation’s and users’ needs); and organisational effectiveness (whether the system improves the present working of an organisation). Masser and Onsrud (1993) propose that the central question in the area of GIS diffusion is to ascertain whether there is any difference between the diffusion process for GIS and for information technology products of similar kinds. The research issue is not what promotes adoption of GIS but what promotes the effective use of GIS. Organisational strategies are considered to be a critical element to enhance GIS use. Research to understand effective use centres around the question of measurement of effectiveness. 
Looking at institutional issues, the authors argue that research needs to focus on isolating generalisable principles germane to the acquisition, implementation, and particularly the utilisation of a GIS. For example, some of the questions that could be asked are: “What are the organisational and institutional structures that enhance effective implementation and use?”; “What are the strategies that best facilitate the implementation of a GIS within and among organisations?”; “What factors influence implementation and optimal use?”; and “Under what arrangements and circumstances can information sharing more easily occur?” (Masser and Onsrud, 1993). Huxhold and Levinsohn (1995) propose that organisations adopt GIS technology because they anticipate that it will provide new capabilities that will yield benefits to the organisation. Building on this assumption, they outline a conceptual framework to look at GIS adoption and diffusion. This framework consists of four elements: the GIS paradigm, data management principles, technology, and the organisational setting. According to them, a GIS paradigm is the conceptual foundation for using geographic information that provides a common base of reference or focus for the other three elements; data management principles govern the logical structuring and management of large databases that contain map and other data that can be related to the geography of interest to the organisation; technology comprises the effective combination of various hardware and software components that enables the automation of numerous geographic data handling functions; and the organisational setting implies the management environment that provides resources and enables changes to be made for incorporating GIS utilisation throughout the organisation.
8.3.2 The Context of GIS Application

8.3.2.1 Development of Community-Based Decision-Making

King (1981), commenting on community development in Boston over three decades in his book Chain of Change, suggests that a community develops in stages. Initially, community residents rely on the good will of others to receive services from city, state, federal, and non-governmental agencies. He calls this the service stage. At this stage, they are not part of any decision-making process. Next, the residents organise themselves into interest-groups to demand, seek and receive services that are appropriate to their needs. They get involved with the decisions that are made about the immediate community. This is the organising stage. Finally, the residents begin to develop and sustain community-based institutions which act as representative voices for them. The community-based organisations so created then get involved in decision-making and attempt to influence issues that affect the immediate community and the surrounding geopolitical region. This is the institution-building stage. This model is useful to understand community-based decision-making because it is a study within spatially defined neighbourhoods over a period of time. Though it does not provide direct empirical evidence, it is written from the community’s perspective and is reliable in understanding community-based decision-making. King’s study compares favourably with another series of case studies that investigate the development of community participation in decision-making in several European countries. Susskind (1983) suggests that communities first experience paternalism in decision-making where the community has no say in the decisions that are made for it. This leads to discontentment, a period of conflict when the decision makers such as city and state agencies struggle with the community to maintain control over the process. 
The conflict gives way to co-production, a phase where both opposing groups resolve the conflict and create a shared decision-making model. It is important to note that the period of conflict can continue for an extended period of time.

8.3.2.2 Role of Information in Community-Based Decision-Making

Since a large part of GIS is related to data and its efficient display and management, it is useful to note that information is seen as a complex source of power in the planning and decision-making process (Forester, 1989). At the same time, several people have argued that we have more data but less information; in short, we know more about less (e.g. Friedmann, 1992; Naisbitt, 1994). Forester also suggests that decision makers often make decisions with incomplete information while Alexander (1984) points out that the rational paradigm of decision making is giving way to a host of other paradigms of decision making. King (1981) has presented a geo-political organising model for involving community residents in planning and decision making. One aspect of this model was the development of a computerised directory called the Rainbow Pages. The directory was designed to enable residents to get information about issues and activities that were affecting their neighbourhood. It also provided a way for residents to contact each other and solve problems collectively. This model emphasises self reliance, mutual self help, information availability, and information access. Kretzman and McKnight (1993) have developed a community development strategy which begins “with a clear commitment to discovering the community’s capacities and assets”. While information is critical for
any community development strategy, Kretzman and McKnight argue that cities and urban communities are often defined solely in terms of their negative images. They observe that both academic research and proactive problem solving initiatives begin with “negative” information about neighbourhood “deficits” or “needs”. They present an alternative approach which begins to map community “assets”, an approach that emphasises linkages between different resources or strengths that are present within any neighbourhood or community. Chen (1990), working with the Boston Redevelopment Authority on the South-End Master Plan (a neighbourhood of Boston), found that GIS was a useful way to begin to communicate spatial concepts to non-technical users. His work has been supported by other smaller studies in which non-technical users like high school students and community residents have used GIS to address problems concerning their neighbourhood (Ramasubramanian, 1995). Huxhold and Martin (1996) observe that federal and local funding agencies often require community organisations and groups seeking financial assistance to use data and information. They argue that the use of data and information is beneficial to both the funding agency and the community organisations seeking financial support. The funding agency can use the data to determine the relative merit of a grant application and to ensure that the organisation is following funding guidelines. The community organisation seeking financial assistance can plan a more strategic campaign by interpreting the data to demonstrate financial need and the appropriateness of its intervention. In the 1990s, GIS applications have expanded to serve a wide range of users including those users who have typically not used them before. To illustrate with an example, let us take a look at the National Association for the Advancement of Colored People (NAACP) v. 
American Family Mutual Insurance Company redlining lawsuit, considered “one of the most important federal cases in Wisconsin history” (Norman, 1995:12). In early 1995, the American Family Mutual Insurance Company agreed to settle a discrimination suit brought against it by the NAACP and made a commitment to invest US$14.5 million in central Milwaukee. The NAACP had argued that the insurance company was discriminating against African-American residents in Milwaukee’s Northwest side by allowing the practice of “redlining”, that is, a policy of systematic disinvestment in the area. While the case never actually went to trial, both parties had gathered a significant volume of data and analyses to support their claims. For its part, the insurance company pointed out that it had a fairly even distribution of insurance policies in Milwaukee using postal zip codes as the unit of analysis. This argument was countered by the NAACP, which used its own maps to demonstrate that the company’s policies were distributed unevenly, clustered in mostly white census tracts. The “zip code defence” fell apart since mapping the data by census tracts revealed information that was not previously evident in analyses based on zip codes. This is because in the US, zip codes tend to be so large as to mask differences between white and black neighbourhoods, while census tracts are smaller aggregations (Norman, 1995). The NAACP v. American Family case is a high-profile example which vividly portrays the usefulness of GIS. First, it demonstrates that having access to relevant information plays a vital role in identifying issues and placing them within a problem-solving framework. Second, it demonstrates that information plays a significant role in making comparisons and analysing trends, both of which were required to establish the case of discriminatory behaviour against the insurance company. 
Third, it demonstrates the power and the potential of spatial analysis and maps to force all parties involved in the debate to address the reality and the gravity of the situation. Finally, this example demonstrates that GIS and related technologies are integral for mapping and analysis, storing large volumes of data, and for looking at different types of data such as demographic information and financial information simultaneously. Thus, if we stop to think about it, we
begin to recognise that the implementation of information technology and spatial analysis concepts has profound implications for people and communities everywhere, particularly those individuals who are unlikely to have used them before.

8.3.3 Synthesis of Literature Review

The literature review covered a broad range of issues: on the one hand, it looked at GIS adoption and diffusion while on the other hand, it looked at community development and decision making. The development of GIS technology which began in the era of mainframe computers has kept pace with other innovations in computer technology. The mapping capacity has been enhanced, systems have become more user-friendly, and most of all, GIS is now affordable to the individual user. At present, GIS software is available with data packages for use on personal computers and can be customised to suit individual needs. GIS has the capacity to analyse several issues that concern community residents, community planners, and decision makers within community-based organisations. However, GIS researchers have not studied GIS adoption and use by small groups such as community-based organisations. Recognising this gap, the National Center for Geographic Information and Analysis has begun to grapple with the social and philosophical issues surrounding GIS adoption by a broader spectrum of society through its Initiative 19 (I-19) (see Daniel Sui, Chapter 5 of this volume). In the absence of research-based evidence to the contrary, it is safe to hypothesise that GIS adoption in community-based organisations will imitate processes of adoption and use in other contexts such as local governments. Information is invaluable to any community group that intends to work proactively with the local, state, or federal government because administrations tend to rely on empirical evidence and hard data to determine the relative merits of an organisation’s request for funds. 
However, it should be obvious that, while a planner working for the local government and a community organiser may both use GIS (for example, to map the number of vacant parcels in the neighbourhood), the conclusions they draw and the policy options they recommend to their clients or constituencies will be very different. Both organising models discussed in the literature review use empirical data and qualitative information about the state of the community as a basis for their community development strategies. These models assume that rational, logical, arguments work effectively in influencing decision making around social issues. At the same time, these models use objective information to redefine issues and change the nature of the policy debate. Learning from situations like the NAACP v. American Family case, community activists, informal community groups, and community-based organisations are beginning to view GIS as a useful package of tools and techniques. They are using GIS to understand and evaluate specific problems occurring in their physical environments in areas such as housing, health care, education, economic development, neighbourhood planning, and environmental management (Ramasubramanian, 1995). GIS provides an efficient way to document, update, and manage spatial information, thereby making it possible to conduct analyses over time. GIS has the capacity to level the playing field by assisting small groups to analyse, present, and substantiate their arguments effectively. It also allows them to raise new questions and issues. Environmentalists, for example, have successfully used technologies such as GIS to negotiate their claims and resolve disputes (e.g. Sieber, 1997). Urban communities and their advocates can do the same. The next section will present a conceptual and methodological approach called participatory research and explain why it is an appropriate framework to introduce GIS use to a community group or organisation
comprised largely of non-technical users. It is anticipated that this approach will overcome some of the common problems associated with GIS adoption and its use, particularly issues such as fear of technology and resistance to change.

8.4 PARTICIPATORY RESEARCH CONCEPTS

In order to understand participatory research, one first has to understand the basic concept of participatory planning. Friedmann’s (1987) classification of planning traditions is a useful framework to look at participatory planning. The four traditions he identifies are: policy analysis, social reform, social learning, and social mobilisation. Friedmann sees these traditions on a continuum with the policy analysis tradition working to maintain the status quo and social mobilisation working to create change. Positivistic research never discusses the role or value of participation seriously. Participatory planning, on the other hand, accepts ‘participation’ as an implicit condition. Discovering that most of the literature advocating participatory planning comes from the social learning and the social mobilisation traditions should come as no surprise. Why should one look at participatory planning at the present time? Friedmann (1992) and Sassen (1991) argue that it is not possible to plan effectively on behalf of people and states given the changing nature of the economy, the political landscape, and most of all, given the speed at which these changes are occurring, especially in the cities and urban areas of the world. They also suggest that the specialisation of knowledge makes it impossible for one group to plan and determine optimal solutions on behalf of the world community. Brown and Tandon (1991) state that the problems of poverty and social development are complex and require multiparty collaboration. Korten (1986) sees participatory planning as a fundamental construct of a larger strategy supporting people-centred development. 
Additionally, participatory planning is a model that is working well in situations where it has been undertaken with a genuine commitment to implement change. Thus, it deserves to be studied and seen as a viable social methodology. Finally, participatory planning invokes a basic principle of empowerment, that of building the capacity of the community to speak for itself and address issues with the skill and confidence to create change (King, 1992). In his book, Man and Development, Nyerere argues that: “…For the truth is that development means the development of people. Roads, buildings, the increases of crop output, and other things of this nature are not development, they are only tools of development” (Nyerere, 1974, p. 26). Placing people at the centre of the planning and decision making process changes the nature of the debate dramatically. There are several models of participatory planning including Action Science (Argyris et al., 1985), People-Centred Planning (Daley and Angulo, 1990), Transactive Planning (Friedmann, 1992), Community-Based Resource Management (Korten, 1983), Participatory Action Research (Whyte, 1991) and Participatory Research (Hall, 1993). The concept of participatory research (PR) stems from larger values of democracy and citizen participation and from within the participatory planning traditions mentioned above. Individuals and communities have come to see participation as an essential approach to effect change. For the purposes of this chapter, PR can be defined as an approach that:

1. develops the capacity of the participants to organise, analyse, and discuss concepts to the level required by the particular endeavour they are involved in;
2. develops a process to incorporate the participants in the research and decision-making process which includes the formulation of the hypotheses, selection of the research design, and methods of evaluation; and,
3. returns research data and results to the participants.

The long term goal of a PR project is to empower people psychologically and politically to effect social change. In the short term, a PR project engages people affected by a particular problem in the analysis of their own situation and emphasises self-reliance, self-assertiveness, and self-determination. According to Gaventa (1993), there are three ways of conducting research within the general framework of the participatory paradigm. The first approach re-appropriates dominant research knowledge. While it is effective in the process of empowerment, it is still based on gaining access to and control of knowledge that has already been codified by others. The second approach which evolves from the first approach aspires to create new knowledge based on people’s experience and makes it possible for people to produce and define their own reality. He recommends a third way where the people are involved in all stages of the research process including problem definition, setting the research agenda, and determining where the results or findings would be used. He argues that once people see themselves as researchers, they can investigate reality for themselves. The role of the outside researcher is changed radically when the research paradigm sees popular knowledge as having equal value to scientific knowledge. The PR perspective presented in Figure 8.1 sees the community as “insiders” who have information and knowledge gained through practical experience. Their knowledge has not been analysed to recognise patterns nor has it been synthesised within any larger societal framework. Individuals in communities live in isolation and their experiences are not connected. 
Researchers, advocates and consultants are “outsiders” who have seen similar situations and therefore are able to understand patterns and have theories and strategies because of their understanding of existing theoretical frameworks. PR sees these two groups coming together to participate in a mutual learning experience. This phase is dependent on developing effective communication strategies and attitudinal changes in the researcher. The local theory provides context specific cause and effect relationships and it can be shared with insiders to test through action. At the same time, outsiders can take this knowledge and use it to generate theory. This author advocates the use of this participatory perspective in any efforts to use GIS for community-based planning and decision making. GIS advocates taking the role of the “outside” researcher will be able to communicate more effectively with the community in its attempt to use GIS. The technology will become less of a black box and more of an interactive tool which can be manipulated according to the needs of the users. This framework facilitates decision-making which is based on social learning. It is most effective in context bound situations and in work with small groups. Having briefly discussed participatory research concepts and their usefulness as an approach and a method to introduce GIS to community groups, this chapter will now look at one example in which this approach was attempted.

8.5 REPAIRERS OF THE BREACH: ADVOCATES FOR THE HOMELESS

The Repairers of the Breach is a non-profit advocacy organisation in Milwaukee that works with the homeless and those at risk of becoming homeless. The organisation is typical of many community-based organisations in that it runs on a very small budget and relies for the most part on the kindness of strangers and the dedicated service of volunteers.
Figure 8.1 Using GIS from within a Participatory Research Paradigm
Since 1992, the organisation has been concerned about the displacement of low income people and people of colour in central Milwaukee. In order to confirm what they had documented through anecdotal evidence, key members in the organisation sought to use GIS to monitor displacement. These members were actively involved in defining the research agenda, designing the research questions, and determining the scope and nature of the analysis. While the actual data manipulation and the computer mapping tasks were conducted by university students, the research was directed by the needs and interests of the organisation. Building on this preliminary work, the organisation has decided to use GIS to facilitate community-based research and analysis in order to:

• create a comprehensive computer-based socio-economic and environmental profile of the areas they serve in Milwaukee;
• customise this profile to include qualitative data and information of particular relevance to people who are homeless or at risk of becoming homeless; and,
• develop the skills of neighbourhood residents to gather, analyse, and use data and information about their neighbourhood.
This model provides one example of how a community can effectively use computer-based technologies like GIS. Presently, key members in the organisation have a very sophisticated conceptual understanding of GIS. Initially, the relationship between the community-based organisation and the university which provided the technical assistance was similar to what King (1981) describes as happening in the service stage. The organisation did not intend to take an active part in the research and analysis but anticipated that the university would provide assistance by addressing their concerns. Over time, leaders in the organisation decided that it was invaluable for those directly affected by the displacement and gentrification trends to be involved in the research and analysis. As the Director, MacCanon Brown, points out, “…one concept inspiring this project is a desire to move a system of knowledge into the hands of the people who are often subordinated by the exploitative use of systems of knowledge” (Repairers of the Breach, 1994, p. 9). The experience with this organisation demonstrates that there are several benefits to using a participatory process to work with community-based organisations, particularly if the work involves the introduction of GIS concepts and analysis techniques. Playing the role of outside researcher, the university was able to assist the project without dominating the research process. The university students who worked on the project perfected their technical skills while learning about the policy and pragmatic implications of their research from organisation members who were dealing with the issues surrounding homelessness on a day-to-day basis. The organisation members on the other hand were able to address their concerns from a position of strength, playing a leadership role in the research and analysis. 
This organisation is currently developing a research proposal to study homelessness in central Milwaukee and has raised money to sustain a drop-in centre with computing facilities to serve the needs of its constituency. The organisation’s actions are one indicator that the participatory approach used for the preliminary research was a catalyst in transforming those individuals who are typically seen as research subjects into researchers. With their enhanced critical thinking skills, the members of this organisation are demonstrating that they can participate with power in decision-making processes. The next section of this chapter will discuss some of the benefits that can accrue and constraints that community-based organisations face as they attempt to acquire a conceptual and technical understanding of GIS.

8.6 BENEFITS AND CONSTRAINTS OF USING GIS

The use of GIS within a participatory paradigm bridges the gap between research and practice by creating divergent problem solving perspectives. For example, questions explored through a GIS gain special meaning because demographic, economic, and environmental data can be visually linked with actual features on a map like a house, a tract of land, a stand of trees, or a river (Audet et al., 1993). In addition, end-users tend to ascribe meaning to the data they are looking at because it is familiar and concerns them directly. For example, it is not uncommon to find community residents browsing through the database searching for their street or their home, and using address matching features to spatially locate familiar landmarks. As stated earlier, GIS facilitates analysis of spatial patterns and trends over time. A community can analyse the growth and decline of their neighbourhood using common indicators like the number of vacant parcels and the number of building code violations. At the same time, the technology makes it possible for non-technical users to come together to discuss context specific issues. 
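The indicator-based analysis of neighbourhood growth and decline mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch and not a method described in the chapter: the yearly counts, the field names, and the helper functions `yearly_change` and `trend` are all hypothetical, and a real project would draw the counts from a GIS database of parcels and code violations.

```python
# Hypothetical yearly counts for one neighbourhood (illustrative values only,
# not data from the chapter).
records = [
    {"year": 1990, "vacant_parcels": 40, "code_violations": 120},
    {"year": 1991, "vacant_parcels": 46, "code_violations": 131},
    {"year": 1992, "vacant_parcels": 52, "code_violations": 125},
    {"year": 1993, "vacant_parcels": 49, "code_violations": 140},
]


def yearly_change(rows, indicator):
    """Return (year, change since the previous year) pairs for one indicator."""
    ordered = sorted(rows, key=lambda row: row["year"])
    return [
        (current["year"], current[indicator] - previous[indicator])
        for previous, current in zip(ordered, ordered[1:])
    ]


def trend(rows, indicator):
    """Summarise the net change over the whole period as a simple label."""
    net = sum(change for _, change in yearly_change(rows, indicator))
    if net > 0:
        return "rising"
    if net < 0:
        return "falling"
    return "flat"


for name in ("vacant_parcels", "code_violations"):
    print(name, yearly_change(records, name), trend(records, name))
```

A net-change label is a deliberately crude summary; it is used here only because it is easy for non-technical users to read alongside a map of the same indicators.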
By talking and sharing information, users learn from each other. For example, a community organiser with access to a GIS can present
information about drug arrests in the neighbourhood over time and use the spatial patterns to talk about the effectiveness of block club organising and neighbourhood watches in preventing drug-related crime. There are some limitations in using GIS to assist community-based planning. Will Craig, Assistant Director at the University of Minnesota’s Center for Urban and Regional Affairs, says that “while the 1990 Census set a new precedent for sharply defined demographic information, community organisers must still depend on cities and counties for much of the other information they need. For example, the city assessor’s office maintains property records. The police department tracks crime. Most public information lands in city computers. More often than not, cities do not distribute this information”. Craig surveyed 31 major US cities and six Canadian cities and found that while most had broad city data available, the results for ‘subcity’ or neighbourhood data were “dismal” (Nauer, 1994 cf. Ramasubramanian, 1995). During interviews conducted with ten community-based organisations in Milwaukee, this author learned that many of them were experiencing difficulties in their efforts to access and use computer-based technologies. Several barriers to access were noted. They included lack of access to appropriate hardware and software, lack of access to appropriate data and information, lack of appropriate computer, research, and analysis skills, and finally financial constraints. Eight of the ten organisations interviewed were affected by financial constraints. However, only one of the organisations insisted that the lack of financial resources was the primary barrier to access. Lack of access to appropriate hardware and software seems to be a major problem affecting most of the organisations interviewed. “We don’t have powerful computers” and “No modems” seemed to be a constant refrain. 
“We are using seventies technology here”, said one interviewee sounding frustrated about the quality of the computer equipment in her workplace. Another interviewee clarified this point further. She said that even when there are financial resources available to invest in new hardware and software, the end users are not able to get sound advice about what type of system to invest in. Other interviewees appear to agree. Most organisations appear to have adopted a “let’s wait and see” attitude because they feel that the technology is changing too rapidly to be of any use to their organisation. All the organisations surveyed agreed that another major barrier to using computer-based technologies such as GIS, was the lack of computer skills. Some organisations clarified that the lack of access to hardware and software was linked to the issue of training. “Training doesn’t help if we do not have the appropriate systems” says a community leader working with Asian-Americans in Milwaukee. He says that any training programs that his staff attended without actually having the appropriate technology in the workplace were useless. Some organisations said that it took considerable time before individuals could become competent users of the technology. Others indicated that the learning process took a lot of energy. The organisations actually attempting to use computer-based technologies in some way raised the issue that is usually a preoccupation of researchers and analysts—the lack of data and the varying quality of available data. For example, an activist working on issues that affect the American-Indian community pointed out that the census often undercounts American-Indians. In addition, there is very little data that is geared to the needs and interests of this sub-population. MacCanon Brown, the director of the homeless advocacy organisation discussed earlier in the chapter, agrees. 
She works with the homeless and those at risk of becoming homeless—an invisible population often undercounted and underrepresented in official statistics and analyses. In addition, two or three of the organisations interviewed clamoured for more data. The organisations working on long range planning said that they would like to use all the information pertaining to their neighbourhood that they could get. However, they are beginning to realise that the data are sometimes not available in a form that they can use. For example, data about crime can be aggregated at census tract level, zip code level, or block group level. One organisation may find it useful to look at crimes occurring within a
block group to mobilise block watches and organise the neighbourhood while another organisation may prefer to look at larger patterns of criminal activity to design intervention programs.
One or two interviewees commented on the fact that community-based organisations are so busy maintaining routine operations that they do not have the time or energy to step back and look at the nature of computer-based technologies and the ways in which they could affect their organisation. “Most community-based organisations cannot look at the global picture; we are different”, said one interviewee. This author observed that many organisations were not able to conceptually integrate computer-based technology use within their current activities. For example, in one neighbourhood organisation, the computer education and resource centre is designed to educate and entertain young people. It is also likely to serve adults who want to gain some basic computing and word processing skills. However, the organisation does not appear to have explored the possibility of linking and enhancing its regular programs with the use of information technology. Several organisations are still producing maps with plastic overlays instead of using computer-generated overlays, which increase efficiency and accuracy and can be updated and maintained over time.
At the same time, one or two of the organisations were very clear about how they would solve problems using technologies such as GIS. They talked about using the results of the spatial analysis to achieve certain tangible organisational goals such as generating increased awareness of a problem among neighbourhood residents and generating increased resident participation. These organisations were also aware of the value of looking at trends over time, something that is relatively easy to do using GIS.
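The point made above about aggregation levels and trends over time can be illustrated with a minimal sketch. It is not drawn from the interviews: the incident records are invented, and the identifiers assume the nesting used in US Census GEOIDs, where the first 11 characters of a 12-character block group ID name the enclosing census tract. Zip codes do not nest in this way and would need a separate lookup table.

```python
from collections import Counter

# Hypothetical incident records: (block group ID, year). Assumed ID layout:
# state (2) + county (3) + tract (6) + block group (1) characters.
incidents = [
    ("550790001001", 1993),
    ("550790001001", 1994),
    ("550790001002", 1994),
    ("550790002001", 1994),
    ("550790002001", 1995),
]


def count_by(records, key):
    """Count records after mapping each one to a grouping key."""
    return Counter(key(record) for record in records)


# Fine resolution, as a block watch organiser might want it.
by_block_group = count_by(incidents, lambda r: r[0])
# Coarser resolution for area-wide patterns: drop the block group digit.
by_tract = count_by(incidents, lambda r: r[0][:11])
# Trend over time within each tract.
by_tract_year = count_by(incidents, lambda r: (r[0][:11], r[1]))
```

The same records answer different questions depending only on the grouping key, which is why data supplied at a single, fixed level of aggregation can be of limited use to an organisation working at another scale.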
The interviews point out that community-based organisations have to spend a lot of time, energy and resources on gathering fragmented data from different sources and transforming them into a usable form before they can use GIS for the neighbourhood scale of analysis. Small groups such as community-based organisations are easily deterred because of these start-up problems. However, this author believes that use of the participatory framework discussed earlier will assist community organisations in overcoming these difficulties.
GIS technologies also exhibit certain unique characteristics that affect their diffusion and adoption. One cannot learn information technology concepts as one learns to use a standard computer package. A user of standard computer software begins by learning to manipulate the software. The data that become the focus of the manipulation are created by him or her prior to and/or during the process of working with the software. On the other hand, a GIS user works with several sets of data simultaneously. These data sets are usually created by other entities, for different purposes and with different goals in mind. Therefore, the end user is constantly confronted with uncertainty about the accuracy and availability of the data as well as issues which relate to the capacity of the system and its accessibility, the content, ownership, and privacy of the information. For example, in order to understand the spatial mapping and analysis capabilities available through a GIS, a user has to understand the basic principles of computing and cartography and how data are stored on the system.
8.7 SUMMARY
This chapter has argued that GIS can be used effectively at the level of small groups such as community-based organisations to assist them in problem solving and decision-making. Accordingly, the literature review has discussed the nature of GIS adoption and its use and the context of that use.
The literature review has also pointed out that information is used in community-based decision-making because of a push and pull effect—community organisations are beginning to believe that rational arguments supported by data
can support their demands while funding agencies are seeking data and sophisticated analyses to determine the relative merits of the requests for funding they receive. In the next section, the chapter proposed that Participatory Research is a viable conceptual and methodological approach to introduce GIS use to a community group or organisation comprised largely of non-technical users. This approach has been described and explained and further exemplified through the example of the homeless advocacy group’s efforts to use GIS. The benefits of and constraints to using GIS have also been discussed. The critics who argue against GIS use in community-based decision-making often argue that it tends to centralise decision-making and separate it from the realm of understanding of non-technical users. While acknowledging this criticism, this author would like to emphasise that this criticism is more of an indictment about decision-making processes than about GIS and its use. To counter this critique, this chapter has presented a model that approaches decision making through a process of mutual learning between “expert planners and decision makers outside the community” and “community residents who are novice users of technology and information”. This model is very appropriate in looking at issues that affect small groups and communities. There are several barriers to the access and use of information technologies in general, and GIS in particular. Most initiatives to increase access provide the technology, some work on developing data standardisation measures, and data sharing mechanisms. Still fewer initiatives provide access to technology, and data, while putting some rudimentary skills in the hands of end users. 
However, very few initiatives address what this author believes to be the most important barrier to access: the lack of a critical world view which enables end users to think about ways they can use information technology and GIS in day-to-day problem solving and decision-making. The author is hopeful that the participatory strategies recommended in this chapter will go a long way in encouraging critical thinking among end users.
8.8 CONCLUSION
It seems obvious that computer-based technologies like GIS and the decisions made using them are going to affect the lives of ordinary people in communities, even those who are not directly involved in using these technologies (Sclove, 1995). As exemplified in the NAACP v. American Family Insurance case, it is likely that information will become the centrepiece of the “Civil Rights” debate in this decade as corporations continue to use racial and economic demographics to locate and provide services (King, 1994b). Gaventa (1993) argued that the production and control of knowledge maintains the balance of power between powerful corporate interests and powerless individual citizens in a society that is becoming increasingly technocratic, relying on the expertise of scientists to transcend politics. According to him, a knowledge system that subordinates common sense also subordinates common people. This author hopes that the use of GIS within a participatory framework will counter this trend and contribute to the self development and empowerment of community groups by placing information and sophisticated technologies in their hands.
Community development is fundamentally concerned with individual and community empowerment. This domain approaches problems through a systems approach in that it addresses more than one problem at a time and makes connections between issues. It believes that community members should be involved in and guide decision-making regarding the development of the community.
If research can be defined as a particular process of learning following some codified guidelines, then the question is “who learns?”. In
non-participatory research, only the researcher learns; in participatory research as discussed earlier, all relevant stakeholders (those who choose to participate in an endeavour) will learn. This learning will empower participants in at least three ways: it will provide specific insights and new understanding of problems; the participants will learn to ask questions and therefore discover how to learn; and the participants will have an opportunity to act using their new knowledge and create new opportunities for their community. Placing GIS as a communication tool within a participatory framework will enhance the quality of decision-making and contribute significantly to individual and community empowerment.
REFERENCES
ALEXANDER, E.R. 1984. After Rationality, what? Journal of the American Planning Association, 50(1), pp. 62–69.
ARGYRIS, C., PUTNAM, R., and SMITH, D. 1985. Action Science: Concepts, Methods, and Skills for Research and Intervention. San Francisco, CA: Jossey-Bass.
AUDET, R., HUXHOLD, W., and RAMASUBRAMANIAN, L. 1993. Electronic exploration: an introduction to geographic information systems, The Science Teacher, 60(7), pp. 34–38.
BROWN, D. and TANDON, R. 1991. Multiparty Collaboration for Development in Asia. Working paper from Institute for Development Research, Boston, Massachusetts and Society of Participatory Research, New Delhi, India.
CAMPBELL, H. and MASSER, I. 1996. Great Britain: The dynamics of GIS diffusion, in Craglia, M., Campbell, H., and Masser, I. (Eds.). GIS Diffusion: the Adoption and Use of Geographical Information Systems in Local Government in Europe. London: Taylor & Francis, pp. 49–66.
CHEN, W. 1990. Visual Display of Spatial Information: a Case study of the South End Development Policy Plan, unpublished Masters Thesis, Department of Urban Studies and Planning, Massachusetts Institute of Technology, USA.
CROSWELL, P. 1991. Obstacles to GIS implementation and guidelines to increase the opportunities for success, URISA Journal, 3(1), pp. 43–56.
DALEY, J., and ANGULO, J. 1990. People-centered community planning, Journal of the Community Development Society, 21(2), pp. 88–103.
ELDEN, M., and LEVIN, M. 1991. Cogenerative learning: bringing participation into action research, in Whyte, W. (Ed.) Participatory Action Research. Newbury Park, CA: Sage, pp. 127–142.
FORESTER, J. 1989. Planning in the Face of Power. Berkeley, CA: University of California Press.
FRIEDMANN, J. 1987. Planning in the Public Domain: From Knowledge to Action. Princeton, NJ: Princeton University Press.
FRIEDMANN, J. 1992. Educating the Next Generation of Planners, unpublished working paper, University of California, Los Angeles.
GAVENTA, J. 1993. The powerful, the powerless, and the experts: knowledge struggles in an information age, in Park, P., Brydon-Miller, M., Hall, B. and Jackson, T. (Eds.), Voices of Change: Participatory Research in the United States and Canada. Westport, CT: Bergin and Garvey, pp. 21–40.
HALL, B. 1993. Introduction, in Park, P., Brydon-Miller, M., Hall, B. and Jackson, T. (Eds.), Voices of Change: Participatory Research in the United States and Canada. Westport, CT: Bergin and Garvey, pp. xiii–xxii.
HUXHOLD, W. 1991. An Introduction to Urban Geographic Information Systems. New York, NY: Oxford University Press.
HUXHOLD, W., and LEVINSOHN, A. 1995. Managing Geographic Information Systems Projects. New York, NY: Oxford University Press.
HUXHOLD, W. and MARTIN, M. 1996. GIS Assists Neighborhood Strategic Planning in Milwaukee, working paper, available from the authors.
KING, M.H. 1981. Chain of Change: Struggles for Black Community Development. Boston, MA: South End Press.
KING, M.H. 1992. Community Development, unpublished working paper, Massachusetts Institute of Technology, Cambridge, MA, available from the author.
KING, M.H. 1994a. Personal communication with the author.
KING, M.H. 1994b. Opening Remarks, Proceedings of The New Technologies Workshop, Massachusetts Institute of Technology. Cambridge, MA: MIT Community Fellows Program.
KORTEN, D. (Ed.). 1986. Community Management: Asian Experience and Perspectives. West Hartford, CT: Kumarian Press.
KRETZMAN, J. and McKNIGHT, J. 1993. Building Communities from the Inside Out: A Path Toward Finding and Mobilizing a Community’s Assets. Evanston, IL: Center for Urban Affairs and Policy Research, Northwestern University.
MASSER, I. 1993. The diffusion of GIS in British local government, in Masser, I. and Onsrud, H. (Eds.) Diffusion and Use of Geographic Information Systems Technologies. Dordrecht: Kluwer Academic Publishers, pp. 99–116.
MASSER, I., and ONSRUD, H. 1993. Extending the research agenda, in Masser, I. and Onsrud, H. (Eds.) Diffusion and Use of Geographic Information Systems Technologies. Dordrecht: Kluwer Academic Publishers, pp. 339–344.
MITCHELL, W. 1994. Opening Remarks, Proceedings of the New Technologies Workshop, Massachusetts Institute of Technology. Cambridge, MA: MIT Community Fellows Program.
NAISBITT, J. 1994. Global Paradox: The Bigger the World Economy, the More Powerful its Smallest Player. New York: Avon Books Inc.
NORMAN, J. 1995. Homeowners ensure company does the right thinking, Milwaukee Journal Sentinel Magazine, 26 November 1995, pp. 12–15.
NYERERE, J. 1974. Man and Development. London: Oxford University Press.
OBERMEYER, N., and PINTO, J. 1994. Managing geographic information systems. New York: The Guilford Press.
PICKLES, J. (Ed.) 1995. Ground Truth: The Social Implications of Geographic Information Systems. New York: The Guilford Press.
RAMASUBRAMANIAN, L. 1991. Mapping Madras: Geographic Information Systems Applications for Metropolitan Management in Developing Countries, unpublished Masters Thesis, Department of Urban Studies and Planning, Massachusetts Institute of Technology.
RAMASUBRAMANIAN, L. 1995. Building communities: GIS and participatory decision making, Journal of Urban Technology, 3(1), pp. 67–79.
RAMASUBRAMANIAN, L. 1996. Neighborhood Strategic Planning in Milwaukee, a Documentation Project. Report researched and written for the NonProfit Center of Milwaukee. Available from the author.
REPAIRERS OF THE BREACH. 1994. Proposal for support of a research project submitted to the Poverty and Race Research Action Council. Milwaukee, WI: Available from the author.
SASSEN, S. 1991. The Global City: New York, London, Tokyo. Princeton: Princeton University Press.
SCLOVE, R. 1995. Democracy and Technology. New York: Guilford Press.
SIEBER, R.E. 1997. Computers in the Grassroots: Environmentalists, Geographic Information Systems, and Public Policy. PhD dissertation, Rutgers University.
SPARROW, J. and VEDANTHAM, A. 1995. Inner-city networking: models and opportunities, Journal of Urban Technology, 3(1), pp. 19–28.
SUSSKIND, L. 1983. Paternalism, Conflict, and Coproduction: Learning from Citizen Action and Citizen Participation in Western Europe. New York: Plenum.
WHYTE, W. (Ed.) 1991. Participatory Action Research. Newbury Park, CA: Sage.
WIGGINS, L. 1993. Diffusion and use of geographic information systems in public sector agencies in the United States, in Masser, I. and Onsrud, H. (Eds.) Diffusion and Use of Geographic Information Systems Technologies. Dordrecht: Kluwer Academic Publishers, pp. 147–164.
Chapter Nine
Climbing Out of the Trenches: Issues in Successful Implementations of a Spatial Decision Support System
Paul Patterson
9.1 INTRODUCTION
This chapter describes the development of a particular spatial decision support system (SDSS) and analyses the experience gleaned from the implementation of this system within multiple organisations. This chapter intends to fill a void between GIS conference proceedings which often describe individual user accounts of SDSS implementation within single organisations and academic journals which describe technical model development and simulations but rarely describe the actual implementation of SDSSs within real organisations. By focusing on a system that has been implemented many times across many types of organisations, patterns of important issues emerge. The goal is to step out of the day to day implementation trenches to offer SDSS developers insight into these design and organisational issues which may improve the chances for successful implementation of their systems.
The SDSS is RouteSmart™, a mature software system for routing and scheduling of vehicles. RouteSmart™ has been implemented in over 40 organisations, both public and private, in more than seven different industrial sectors. RouteSmart™ implementation therefore offers the opportunity to analyse implementation issues across a wide spectrum of organisations. The author has been involved in the design, coding, and redesign of RouteSmart™ and has personally consulted in its implementation in over 20 organisations since 1990.
The first section of this chapter gives a background and description of RouteSmart™. In order to generalise the lessons learned in RouteSmart™ implementations to other similar-type systems, the second section places RouteSmart™ within a classification framework with other SDSSs. Generalising implementation issues to other types of SDSSs can be a risky task. Therefore SDSS typologies are presented to show how RouteSmart™ is related to other types of systems so that other SDSS developers can take the lessons presented in this chapter with the proper level of caution.
The third section of this chapter explores the lessons learned from implementation. These lessons fall into the following categories: appropriate data resolution, user feedback and interaction, extensions and customisations, and organisational issues. The fourth and final section explores additional ideas which have not been implemented in RouteSmart™ but which may further improve the chances of successful implementation for other SDSSs.
9.2 BACKGROUND AND DESCRIPTION OF ROUTESMART™
RouteSmart™ is a SDSS for solving routing and scheduling problems over street networks. RouteSmart™ has been implemented in a variety of organisations since 1988 for routing meter readers, sanitation and recycling trucks, mail carriers, express package, telephone book, and newspaper deliverers, and field service personnel (Bodin, et al., 1995). The chances are that if you live in the United States you are currently or will soon be serviced by someone using a RouteSmart™ route.
There are two versions of RouteSmart™. The point-to-point version develops routes to individual customers scattered in low densities throughout the service territory. The neighbourhood version develops routes to individual customers clustered on nearly every street in a service territory, such as for garbage collection, mail delivery, or meter reading.
The benefits of RouteSmart™ can be summarised in five areas:
1. balanced workload partitioning between routes,
2. near-optimal travel path generation,
3. automated route mapping and report generation,
4. interface to customer information systems, and
5. user control and override of computer solutions.
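The first benefit in the list above, balanced workload partitioning, can be illustrated with a generic textbook heuristic. The sketch below is an assumption made for illustration and not a description of RouteSmart™'s own routines: it assigns the longest stops first, each to the route with the lightest load so far, and it ignores travel time and geographic compactness, both of which a real routing system must consider. The function name, stop IDs and service times are all hypothetical.

```python
import heapq


def partition_balanced(service_minutes, n_routes):
    """Split stops into n_routes with roughly equal total service time.

    Generic greedy heuristic: longest stop first, always onto the route
    with the lightest current load. Illustration only.
    """
    heap = [(0.0, i) for i in range(n_routes)]   # (current load, route index)
    heapq.heapify(heap)
    routes = [[] for _ in range(n_routes)]
    # Longest stops first; ties broken by stop ID so the result is repeatable.
    ordered = sorted(service_minutes.items(), key=lambda kv: (-kv[1], kv[0]))
    for stop, minutes in ordered:
        load, i = heapq.heappop(heap)            # lightest route so far
        routes[i].append(stop)
        heapq.heappush(heap, (load + minutes, i))
    loads = [0.0] * n_routes
    for load, i in heap:
        loads[i] = load
    return loads, routes


# Hypothetical service times in minutes for six stops.
stops = {"A": 50, "B": 40, "C": 30, "D": 30, "E": 20, "F": 10}
loads, routes = partition_balanced(stops, 2)
```

With these invented figures both routes end up with 90 minutes of work. On real data a greedy pass of this kind gives only a starting point, which is one reason the fifth benefit, user control and override, matters.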
By commercial standards RouteSmart™ is a successful example of a SDSS. RouteSmart™ typically reduces the number of routes needed to service a territory compared with manual methods. Reductions of 8 to 19 percent are common, with reductions as high as 22 percent documented in the press (McCoy, 1995). RouteSmart™ also reduces the time and personnel needed to generate routes, in some cases by as much as six person weeks per rerouting. RouteSmart™ has a short payback period and is occasionally used to justify the entire expense of a general purpose GIS implementation.
RouteSmart™ consists of a set of Fortran and C routines that have been integrated within four different GISs (Arc/Info, ArcView, GisPlus, and Synercom) in a variety of PC, workstation, and minicomputer environments. Through a user interface, the GIS handles the spatial data, topological editing, address matching, selection and extraction of street and customer data for network creation, routing parameter input and formatting, display of routes, manual street swapping between routes, solution management, and report and map generation. The external executables handle network building and topological connectivity testing, the partitioning of customers into balanced routes, and the generation of travel paths.
In generating routes and travel paths RouteSmart™ considers one-way streets, single, multiple, and back alley street traversal requirements, mixed modal traversal (driving and walking), turn penalties at each intersection, time windows for customer servicing, demand or supply at each customer location, vehicle capacity, travel times to and from depots and transfer sites when vehicles are full or empty, and office and break times. Extensions have been developed to choose the optimal vehicle mix when vehicle types are constrained by street width. Routes are always balanced on time but can be created based on either number of routes desired (e.g.
vehicle availability is the primary constraint), on length of the workday (e.g. overtime reduction is the primary objective), or both. When an integer number of routes cannot be generated within an area a remnant (or partial) route can be created and geographically located by the user. RouteSmart™ has its theoretical roots in the network analysis work of early eighteenth century mathematician Leonhard Euler and in the contemporary work of Larry Bodin, Bruce Golden, Arjan Assad, and Michael Ball of the University of Maryland (Bodin, et al., 1983; Golden and Assad, 1988). The practical application of network analysis was largely limited to very expensive, labour intensive studies before the widespread availability and distribution of GISs and GIS databases (Bodin and Rappoport,
1992). GIS in this case is an enabling technology, one that fosters the development of a SDSS from theory that has a long history but which could not be practically implemented.
9.3 TYPOLOGIES OF SDSSS
There is a large variety of SDSSs. For example:
• Retail site location models. For the classic review article see Craig, et al. (1984).
• Location-allocation models, such as the Locational Analysis Decision Support System (LADSS) (Densham, 1992).
• Watershed analysis models, such as the Geographic Watershed Analysis Modelling System (GEOWAMS) (DePinto, et al., 1994).
• Urban simulation models, such as the Integrated Transportation and Land Use Package (ITLUP) (Putman, 1991).
There are many others. For an annotated bibliography of papers relating to spatial decision support systems refer to the closing report of the National Center for Geographic Information and Analysis Research Initiative Six (Densham and Goodchild, 1994). For a discussion on general progress in the field of Spatial Decision Support Systems refer to Chapter 11 in this volume.
Spatial decision support systems can be categorised in a variety of ways. Working groups for the NCGIA Specialist Meeting for Research Initiative Six on Spatial Decision Support Systems outline two typologies for classifying spatial models:
1. type of model (algorithmic, heuristic, or data manipulation), and
2. type of decision being modelled (decision situation, frequency, stakes, number of decision makers, tools needed, and tool availability) (NCGIA, 1990).
Concerning the first typology, RouteSmart™ is a heuristic model for solving route partitioning and travel path problems. In this regard the lessons discussed in the next section will apply most to algorithmic and heuristic models and to a lesser extent to data manipulation models. The difference between algorithmic and heuristic models for this discussion is minimal since both require the use of parameters and multiple solutions to compare scenarios.
Data manipulation models on the other hand are generally more direct and require less interaction on the part of the user. The second typology is where the important distinctions are made in terms of applicability of these lessons to other SDSSs. Under this second typology RouteSmart™’s decision situation is the operations domain, with a frequency ranging from daily to yearly, for relatively minor stakes beyond the implementing organisation, with relatively few decision makers, using RouteSmart™ and often a customer information system as the primary tools needed. Of these categories it is the decision situation which has the most bearing on the generalisation of these lessons to other SDSSs. Types of decision situations are further explored in section 9.4.4. However it is left to the reader to determine how their SDSS of interest fits into this typology and the corresponding applicability of the lessons learned from RouteSmart™.
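For readers who want to place their own SDSS alongside RouteSmart™, the two typologies can be written down as a simple record. The sketch below is only one possible encoding: the class, field and function names are hypothetical, the values for RouteSmart™ are taken from the description above, and tool availability is left empty because the text does not state it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class ModelType(Enum):
    """First typology: type of model."""
    ALGORITHMIC = "algorithmic"
    HEURISTIC = "heuristic"
    DATA_MANIPULATION = "data manipulation"


@dataclass(frozen=True)
class SDSSProfile:
    """Second typology: type of decision being modelled."""
    name: str
    model_type: ModelType
    decision_situation: str
    frequency: str
    stakes: str
    decision_makers: str
    tools_needed: Tuple[str, ...]
    tool_availability: Optional[str] = None   # not stated in the text


routesmart = SDSSProfile(
    name="RouteSmart",
    model_type=ModelType.HEURISTIC,
    decision_situation="operations",
    frequency="daily to yearly",
    stakes="relatively minor beyond the implementing organisation",
    decision_makers="relatively few",
    tools_needed=("RouteSmart", "customer information system"),
)


def lessons_apply_most(profile):
    """The chapter's lessons apply most to algorithmic and heuristic models."""
    return profile.model_type in (ModelType.ALGORITHMIC, ModelType.HEURISTIC)
```

Filling in the same record for another system makes the points of difference explicit, in particular the decision situation, which the text identifies as having the most bearing on how far the lessons generalise.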
9.4 LESSONS LEARNED
Many of the lessons learned from RouteSmart™ for successful SDSS implementation can be classified into the following four categories: appropriate data resolution, user feedback and interaction, extensions and customisations, and organisational issues. These categories are all represented as subcomponents of the four major research areas outlined by the National Center for Geographic Information and Analysis Research Initiative Six on Spatial Decision Support Systems (Densham and Goodchild, 1994).
9.4.1 Appropriate Data Resolution
The ideal data resolution for a particular SDSS often cannot be used due to a lack of data availability, long computer processing times, statistical considerations, or privacy issues. The first obstacle, data availability, will be overcome for many models as remote collection methods improve and electronic data collection systems and databases are linked together. The second obstacle, long computer run times, is usually due to calculations on large matrices. This will be overcome as computer speeds increase, algorithmic shortcuts are invented, and processing methodologies improve, such as with the use of parallel processing (Densham, 1993). However, the latter two obstacles, statistical validity of aggregated data and privacy concerns, will remain an issue for certain types of models.
For instance, in transportation demand models aggregated residential and employment areas called traffic analysis zones (TAZs) are used to indicate where trips originate and terminate. TAZs are used for two reasons. First, large scale personal trip behaviour data are traditionally impractical to collect, although this may change with the adoption of Intelligent Vehicle Highway Systems (IVHS). Second, and perhaps more importantly, freedom of movement considerations are an important issue in modelling people’s individual travel behaviour.
Unfortunately TAZs must be large and relatively few in number in order to distribute trips between zones with any sort of statistical accuracy. The trade-off is that the larger these zones are the less accurately trips can be assigned to individual links in the transportation network. This results in traffic assignments to links consisting of generalised aggregations of multiple streets and street segments instead of to individual street segments (Patterson, 1990). The statistical and privacy trade-offs of zone size make the data resolution of transportation models larger than desired thus making the models themselves less useful and less likely to be implemented. This question of appropriate zone size for areas is often referred to as the modifiable areal unit problem (MAUP) and the reader is directed to the classic review article of Openshaw (1984). RouteSmart™ has evolved from an arc segment level data resolution and structure down to the detailed customer level. For SDSSs dealing in application domains which directly impact individuals this is an important goal. Initial RouteSmart™ designs for the neighbourhood version used the centreline street segment as the fundamental data element. Later revisions reduced this to the blockface level. More recent revisions have reduced the resolution of map and report output down to the customer level, although the blockface still remains the basic data structure of the partitioning and travel path optimisation routines in the neighbourhood version. By its very nature the resolution of the point-to-point version has always been at the customer level. The push for this reduction in data resolution corresponds to the general trend of both private and public organisations to become more customer service oriented. RouteSmart™ helps organisations be more responsive by differentiating customers by service type and special requirements on maps and reports. 
This push has also been extended into two-way interfaces between RouteSmart™ and various customer information, billing, and support systems, such as automated generation of change of
service day postcards. This push has also created the ability to update routes automatically as customers are added or removed.
One unfortunate aspect of RouteSmart™’s data structure is the redundancy between its internal network structure and the street centrelines stored in the GIS. A tighter integration between the geographic elements and the analytical elements (network) as suggested at the NCGIA Specialist Meeting on Spatial Decision Support Systems would create a smaller, faster, and easier to use SDSS (NCGIA, 1990).
9.4.2 User Feedback and Interaction
SDSSs must allow user feedback and interaction. Reality (geographic or any other facet) cannot be modelled perfectly in any decision support system, no matter how fine the data resolution and sophisticated the algorithms. There must be opportunities for humans to insert knowledge of local conditions in order for the model to have a chance at being implemented successfully. The model should be able to adapt to these new inputs. If not, the model results will be simply a first cut at the spatial decision and the decision maker will be left to adjust the results to fit reality (or not)! In this case the model has limited utility. Perhaps the model designer has a better understanding of the underlying geographic processes at work but the ultimate decision maker is little better off than before.
User feedback and interaction have different meanings and take different forms depending on the type of SDSS. Take for example a sophisticated retail site location model with data resolution at the parcel level that is sensitive to sales cannibalisation from outlets within the same franchise. With this model, two potential sites that are located across a divided road from each other would appear as essentially the same site. One or the other might be selected as among the best sites but not both.
In this case the user would want to interact with the model to separate these two sites in geographic space (perhaps by taking into account drive time distances and turn impedances) and rerun the model to see whether both sites are suitable. They may very well both be suitable, which would account for phenomena we see in the real world such as two gas stations of the same franchise located across a divided road from each other.
RouteSmart™ allows user feedback and interaction in two ways: through its modular design and its solution approach to modelling. First, RouteSmart™ is a collection of algorithms and processes that are run in sequence but designed to be modular to allow user feedback, if desired, between each discrete step. These user feedback steps include the selection of areas and customers of interest, the input of routing parameters, selection of the seed location for remnant routes, the ability to swap streets and customers between routes, the adjustment of turn penalties, the creation of user-defined routes, and control over travel path map plotting scales and the number of sequenced maps generated.
The second component of RouteSmart™’s user feedback and interaction is its orientation toward solution generation and comparison instead of the generation of a single answer. This is accomplished by allowing solutions to be saved at various stages of completeness and to spawn multiple child solutions. A considerable amount of the system deals with solution management. This is similar to the scenario manager approach developed for a watershed analysis model of the Buffalo River (DePinto et al., 1994). The disadvantage of this approach is that solutions become obsolete as data changes. Object-oriented data structures are being considered in RouteSmart™ as an antidote to this dilemma of solution concurrency. There would be a collection of route objects and their associated child travel path objects. 
Each of these objects would contain relevant solution history and creation information as well as embedded code to handle notification events such as changed customers or network barriers.
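The object-oriented idea described here can be illustrated with a minimal Python sketch. It is not RouteSmart™ code: the class names (`Route`, `TravelPath`, `SolutionRegistry`), the event methods and the sample identifiers are all hypothetical, chosen only to show how route objects might carry their own history and mark themselves obsolete when a customer or a network barrier changes.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TravelPath:
    """Hypothetical child object: the street segments one route traverses."""
    segment_ids: Set[int]
    stale: bool = False


@dataclass
class Route:
    """Hypothetical route object holding its history and its child travel path."""
    name: str
    customer_ids: Set[int]
    path: TravelPath
    history: List[str] = field(default_factory=list)
    stale: bool = False

    def on_customer_changed(self, customer_id: int) -> None:
        # Only a route that actually serves the customer becomes obsolete.
        if customer_id in self.customer_ids:
            self._mark_stale(f"customer {customer_id} changed")

    def on_barrier_added(self, segment_id: int) -> None:
        # A new barrier invalidates a route whose travel path uses that segment.
        if segment_id in self.path.segment_ids:
            self._mark_stale(f"barrier on segment {segment_id}")

    def _mark_stale(self, reason: str) -> None:
        self.stale = True
        self.path.stale = True
        self.history.append(reason)


class SolutionRegistry:
    """Broadcasts data-change events to every registered route object."""

    def __init__(self) -> None:
        self.routes: List[Route] = []

    def register(self, route: Route) -> None:
        self.routes.append(route)

    def customer_changed(self, customer_id: int) -> None:
        for route in self.routes:
            route.on_customer_changed(customer_id)

    def barrier_added(self, segment_id: int) -> None:
        for route in self.routes:
            route.on_barrier_added(segment_id)

    def stale_routes(self) -> List[str]:
        return [route.name for route in self.routes if route.stale]


# Illustrative data only: two routes with made-up customers and segments.
registry = SolutionRegistry()
registry.register(Route("Monday-1", {101, 102}, TravelPath({1, 2, 3})))
registry.register(Route("Monday-2", {201, 202}, TravelPath({4, 5})))
registry.customer_changed(202)  # only "Monday-2" serves customer 202
```

In this sketch a change to one customer flags only the routes that serve that customer, so the analyst would need to regenerate a single solution rather than discard every saved scenario.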
One area where RouteSmart™ could do a better job in allowing user feedback is in the determination of travel paths to and from offices, depots, and transfer sites. Currently the minimum time path is found, which may traverse residential roads that drivers would prefer to avoid. Combining a hierarchical path generation algorithm such as the one proposed by Carr in Chapter 31 of this volume with the ability of the analyst to define and alter these hierarchies dynamically would be one solution to this shortcoming.
Ultimately it is the drivers on the street who have final control. They will change the route solutions to fit reality as necessary. They will encounter static problems unforeseen by the routing analyst and random problems unforeseen by anyone, including themselves. Reality is complicated. In the end SDSS output of this type is simply a guide to better decisions, not the final word.
9.4.3 Extensions and Customisations
In order for SDSSs to move from academic tools for research to practical decision making tools for organisations they must be developed in a manner that allows easy extensions and customisations. These characteristics are necessary for the widespread implementation of any type of SDSS. SDSSs should assist people in decision making. They should not restrict or constrain decisions and they should not change the organisation’s operating procedures or protocols simply because they cannot be customised in any way. Change for the better is good but change for software design limitations is intrusive and will meet with resistance.
Even among the fairly homogeneous sanitation collection and utility meter reading industries there is wide variety in operating procedures, terminology, and services rendered. Barriers to implementation might include the need to have a particular output format. For example, one county in the USA wants to have only the outer boundary streets on its route maps to reduce map clutter and driver confusion. 
However, a different county wants very detailed maps showing house numbers, travel sequence numbers, comments, and special symbology for each customer on each street. The trick in SDSS development, as in software development in general, is to generalise these customisations so that other organisations can benefit from extended functionality. There is a need for user-controlled terminology on menus, buttons, and report column headings. There is also a need to allow special add-ins and extensions that a client may build on top of the SDSS. The SDSS needs to be open ended so it is extensible and can grow in the future. This is more difficult to do than to state. Perhaps in the long run object-oriented data structures may help in this area.
9.4.4 Organisational Issues
There are many sensitive human and management issues surrounding the implementation of a SDSS in an organisation. Participants in NCGIA’s Collaborative Spatial Decision Making Specialist Meeting (Research Initiative 17) defined Joe’s Cube as having three axes representing the physical, environmental, and procedural settings of a decision making context or situation as referred to in the section on SDSS typologies. The second axis of Joe’s Cube, the environmental axis, measures the organisational impact of a decision making process “in the context of a coupling index that ranges from ‘tightly coupled’, representing a small group of people with similar goals working on a clearly defined project, to ‘loosely coupled’, where there is a large group with dissimilar goals working on a problem which is multi-faceted” (NCGIA, 1995, p. 11).
A simple and practical adaptation of this “environmental axis” would be to classify SDSSs into those that are either strategic, planning, or operations oriented. SDSSs that are strategic in nature, such as site location and marketing models, have less impact on the day to day running of an organisation because the size of the decision making group is small. They also have less impact on society as a whole. SDSSs that are planning oriented, such as transportation demand and land use allocation models, have a wide impact on society as a whole but less of an impact within the particular implementing organisation (although a general-purpose GIS, on the other hand, will have a large impact). In this planning context decisions will be made one way or another, with or without the use of a SDSS. SDSSs that are operations oriented in nature, such as RouteSmart™, can have a tremendous impact on an organisation. The more central a SDSS is to an organisation’s operations and mission, the more serious the organisational issues will be. Knowing where a particular SDSS falls within this framework of strategic, planning, or operational domains will help indicate the extent to which organisational issues should be heeded. Nonetheless, organisational impacts, large or small, should be considered to increase the chance for a successful implementation.
In terms of RouteSmart™ and other SDSSs in the domain of operations, the major issues in implementation include the following:
1. Goals of management. Does management wish to implement a SDSS out of a desire to modernise, to improve efficiency, or to placate or impress stockholders, council members, or other constituent groups? Understanding the goals of management is key to a successful implementation.
2. Labour relations. RouteSmart™’s objective is to decrease the amount of labour required to service a territory. When people’s jobs are at stake implementation can be very difficult. 
A strategy for overcoming this obstacle while still benefiting from efficiency is to accommodate future growth without having to hire new people.
3. Old school methods. Implementation of a SDSS to replace manual methods is seen as a loss of control, status, and power by those who used to perform the manual methods. Indeed it is a loss of influence. These people should be incorporated into the new implementation, both for their acceptance of the new system and for the valuable knowledge they possess which cannot be replaced by a model. If they are not a component of the new decision process they will likely attempt to sabotage the SDSS either directly or covertly by undermining the implementation of the model results.
4. Computer scepticism. Computer illiteracy and general mistrust of computer programs that attempt to model reality can hinder implementation. While scepticism may be well founded when it comes to complex models like SDSSs, training classes and on-site consulting can help alleviate unnecessary fear. Training classes using real client data can further alleviate this fear by keeping the content familiar and less abstracted from the user’s reality.
9.5 IMPROVEMENTS FOR SUCCESSFUL IMPLEMENTATION
Granted that RouteSmart™ is a SDSS with a rather well defined problem and a history of theory underlying it, what other improvements can be suggested that would apply to other SDSSs? Two suggestions are an explicit attempt to model uncertainty involved with any SDSS solution, and the incorporation of multiparticipant decision making methods for assigning weights to various factors in certain classes of models.
Uncertainty is a problem not only for the model builder but for the implementing organisation as well. If uncertainty can be incorporated into the model directly, perhaps in the form of an endogenous variable that can be manipulated by users in an iterative interactive manner, better sensitivity analysis could be
conducted. As an example, in the IVHS model of real time traffic monitoring and route advising, the system being modelled is composed of thousands of individual decision makers, all obtaining the same information and game playing with each other to beat the system. Should the SDSS recommend different “best” alternative routes for each vehicle? How does the model handle the fact that each decision maker is free to ignore the model’s advice and choose his or her own route? Clearly this is a chaotic system of individual actors in which uncertainty must be modelled directly. Only then can sensitivity analysis be performed by the implementing organisation. Similar uncertainties exist with integrated land use and transportation models. These SDSSs attempt to model the simultaneous decisions of an entire city population’s choice of residential and work locations along with transportation modes and routes, using exogenous employment and population projections (Putman, 1991). RouteSmart™ could also benefit from incorporating uncertainty in the service time and demand at each customer location and the drive time uncertainties at various time periods of the day, to mention just a few variables. The form this uncertainty variable takes in any particular SDSS depends on the characteristics and functional form of that SDSS, but attempts to incorporate variables of this type are sound development principles.
The second suggestion involves incorporating multi-participant decision making models directly within SDSSs, particularly where the weighting of various factors is open to debate, such as land allocation and siting models. Having stakeholders participate directly and in conjunction (rather than disjunction) to derive appropriate weights and values improves the quality of the decisions being made and increases the chance for implementation. A formal approach to modelling qualitative values between multiple participants is the analytic hierarchy process (Saaty, 1990). 
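As a rough illustration of how factor weights might be derived with the analytic hierarchy process, the following Python sketch takes one pairwise comparison matrix per participant, combines them by the element-wise geometric mean (a commonly used aggregation for group judgements, assumed here rather than taken from the chapter), and approximates the principal eigenvector by power iteration. The three factors and all comparison values are invented for the example.

```python
import math
from typing import List

Matrix = List[List[float]]


def aggregate_judgements(matrices: List[Matrix]) -> Matrix:
    """Combine participants' pairwise comparisons by element-wise geometric mean."""
    n = len(matrices[0])
    k = len(matrices)
    return [
        [math.prod(m[i][j] for m in matrices) ** (1.0 / k) for j in range(n)]
        for i in range(n)
    ]


def ahp_weights(a: Matrix, iterations: int = 100) -> List[float]:
    """Approximate the principal eigenvector of `a`, normalised to sum to one."""
    n = len(a)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def consistency_index(a: Matrix, w: List[float]) -> float:
    """CI = (lambda_max - n) / (n - 1); zero for perfectly consistent judgements."""
    n = len(a)
    aw = [sum(a[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return (lambda_max - n) / (n - 1)


# Hypothetical siting factors: cost, accessibility, environmental impact.
# Entry [i][j] states how much more important factor i is than factor j.
participant_a = [[1.0, 2.0, 6.0], [1 / 2, 1.0, 3.0], [1 / 6, 1 / 3, 1.0]]
participant_b = [[1.0, 1 / 2, 3.0], [2.0, 1.0, 6.0], [1 / 3, 1 / 6, 1.0]]

group = aggregate_judgements([participant_a, participant_b])
weights = ahp_weights(group)
ci = consistency_index(group, weights)
```

Because the two participants disagree symmetrically about cost and accessibility, the group weights for those two factors come out equal, with environmental impact weighted lowest. Rerunning `ahp_weights` under altered judgements is the kind of direct sensitivity analysis that a tight coupling with the SDSS would make fast.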
These models exist today and should be linked either formally or informally to SDSSs. A tightly-coupled formal linkage is preferred as it increases the ability to do direct sensitivity analysis under various weighting scenarios without having the participants wait in a latency period while the models trade data and solutions back and forth. When multiple people with conflicting objectives are using a SDSS, solution speed becomes even more important because in many ways the model is acting as mediator and needs to be perceived as available and responsive.
9.6 CONCLUSION
The goal of this chapter was to provide design guidelines and illuminate organisational issues which should be addressed by SDSS developers to improve their chances for successful implementation. The major topics covered include appropriate data resolution, user feedback and interaction, extensions and customisations, and organisational issues. In making such generalisations it is necessary to provide a framework against which other SDSSs may be compared for similarity to determine the level of transferability of these suggestions. It is left to the reader to determine the suitability of these suggestions to their systems.
Regardless of the suitability of specific suggestions, this analysis of implementation issues is intended to keep SDSS builders focused on the end result of their work. If support systems are built but cannot be successfully implemented, of what value are they beyond scientific curiosity? True advancement in this field will be when we can point to many models in use, and success will be when they become ubiquitous within organisations. The hope is that by digging deep enough into the trenches of how a particular SDSS has been implemented across multiple organisations the resulting mound of experience will provide a vantage point from which to help route future SDSSs to success.
ACKNOWLEDGEMENTS
RouteSmart is a registered trademark of Bowne Distinct Ltd.
REFERENCES
BODIN, L. and RAPPOPORT, H. 1992. Vehicle routing and scheduling problems over street networks, Paper presented at Waste Management, Equipment, and Recycling Conference, Rosemont, IL, 6–8 October (Not in proceedings).
BODIN, L., PAGAN, G., PATTERSON, P., JANKOWSKI, B., LEVY, L., and DAHL, R. 1995. The RouteSmart™ system: functionality, uses, and a look to the future, Paper presented at the 1995 Environmental Systems Research Institute User Conference, Palm Springs, CA, 21–26 May (Not in proceedings).
BODIN, L., GOLDEN, B.L., ASSAD, A., and BALL, M. 1983. Routing and scheduling of vehicles and crews: the state of the art, Computers and Operations Research, 10(2), pp. 63–211.
CRAIG, S.C., GHOSH, A., and McLAFFERTY, S. 1984. Models of the retail location process: a review, Journal of Retailing, 60(1), pp. 5–36.
DENSHAM, P.J. 1992. The Locational Analysis Decision Support System (LADSS), Software Series S-92–3. Santa Barbara, CA: National Center for Geographic Information and Analysis.
DENSHAM, P.J. 1993. Integrating GIS and parallel processing to provide decision support for hierarchical location selection problems, Proceedings of GIS/LIS’93, Minneapolis, November, Baltimore: ACSM-ASPRS-URISA-AM/FM, vol. 1, pp. 170–179.
DENSHAM, P.J. and GOODCHILD, M.F. 1994. Research Initiative Six: Spatial Decision Support Systems Closing Report. Santa Barbara, CA: National Center for Geographic Information and Analysis.
DePINTO, J.V., CALKINS, H.W., DENSHAM, P.J., ATKINSON, J., GUAN, W., and LIN, H. 1994. Development of GEOWAMS: an approach to the integration of GIS and watershed analysis models, Microcomputers in Civil Engineering, 9, pp. 251–262.
GOLDEN, B.L. and ASSAD, A. 1988. Vehicle Routing: Methods and Studies. Amsterdam: North-Holland.
McCOY, C.R. 1995. High tech helps haul the trash, The Philadelphia Inquirer, 10 July. Philadelphia, PA.
NCGIA. 1990. Research Initiative Six: Spatial Decision Support Systems: Scientific Report for the Specialist Meeting, Technical Paper 90–5. Santa Barbara, CA: National Center for Geographic Information and Analysis.
NCGIA. 1995. Collaborative Spatial Decision Making: Scientific Report for the Initiative 17 Specialist Meeting, Technical Paper 95–14. Santa Barbara, CA: National Center for Geographic Information and Analysis.
OPENSHAW, S. 1984. The Modifiable Areal Unit Problem. Norwich: Geo Abstracts.
PATTERSON, P. 1990. An evaluation of the capabilities and integration of aggregate transportation demand models with GIS technologies, Proceedings of the Urban and Regional Information Systems Association (URISA) 1990, Washington: URISA, vol. 4, pp. 330–341.
PUTMAN, S.H. 1991. Integrated Urban Models 2. London: Pion.
SAATY, T. 1990. The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. New York: McGraw-Hill.
Part Two GI FOR ANALYSIS AND MODELLING
Chapter Ten Spatial Models and GIS Michael Wegener
10.1 INTRODUCTION
Spatial models have become an important branch of scientific endeavour. In the environmental sciences they include weather forecasting models, climate models, air dispersion models, chemical reaction models, rainfall-runoff models, groundwater models, soil erosion models, biological ecosystems models, energy system models and noise propagation models. In the social sciences they include regional economic development models, land and housing market models, plant and facility location models, spatial diffusion models, migration models, travel and goods transport models and urban land-use models.
The representation of space in the first generations of spatial computer models was primitive. It essentially followed the organisation of statistical tables where each line is associated with one spatial unit such as a statistical district, region or “zone” and the columns represent attributes of the areal unit. Networks were coded as lattices, but because nodes were not associated with coordinates, the geometry of networks was only vaguely represented by the lengths (travel times) of their arcs.
All this has changed with the advent of geographic information systems. GIS have vastly increased the range of possibilities of organising spatial data. Together with improvements in data availability and increases in computer memory and speed they promise to give rise to new types of spatial models, make better use of existing data, stimulate the collection of new data or even depart for new horizons that could not have been approached before.
It was the purpose of the GISDATA Specialist Meeting “Spatial Models and GIS” to find out whether GIS have had or will have an impact on spatial models in the environmental and social sciences (Heywood et al., 1995). 
The underlying hypothesis was that the new possibilities of storing and handling spatial information provided by GIS might contribute to making existing spatial models easier to use and give rise to new spatial models that were not feasible or not even imagined before. The meeting at Friiberghs Herrgård near Stockholm in June 1995 brought together researchers involved in environmental and socioeconomic modelling to assess the potential and limitations of the convergence of spatial modelling and GIS, to formulate a research agenda for making the best use of this potential and to explore avenues towards more integrated spatial models incorporating both environmental and socio-economic elements in response to the urgent social and environmental problems facing cities and regions today.
10.2 SPATIAL MODELS
A model is a simplified representation of an object of investigation for purposes of description, explanation, forecasting or planning. A spatial model is a model of an object of investigation in bispace (space, attribute). A space-time model is a model of an object of investigation in trispace (space, time, attribute).
There are three categories of spatial models with respect to their degree of formalisation: scale, conceptual and mathematical models (Steyaert, 1993). Scale models are representations of real-world physical features such as digital terrain models (DTM) or network models of hydrological systems. Conceptual models use quasi-natural language or flow charts to outline the components of the system under investigation and highlight the linkages between them. Mathematical models operationalise conceptual models by representing their components and interactions with mathematical constructs. Mathematical models may use scale models for organising their data. In the following discussion the emphasis is on mathematical models.
Another important classification of spatial models is how they deal with the indeterminism of real-world phenomena (Berry, 1995). Deterministic models generate repeatable solutions based on the direct evaluation of defined relationships, i.e. they do not allow for the presence of random variables. Probabilistic models are based on probability distributions of statistically independent events and generate a range of possible solutions. Stochastic models are probabilistic models with conditional probability distributions taking into account temporal and spatial persistence.
A third basic classification refers to statics/dynamics. In a static model all stocks have the same time label, i.e. only one point in time is considered. Static models are usually associated with the notion of a steady state or equilibrium. 
In a dynamic model stocks have two (comparative statics) or more time labels, hence change processes are modelled. Dynamic models may treat time as continuous or discrete. Models with discrete time intervals are called simulation models; with fixed time intervals (periods) they are called recursive, with variable time intervals event-driven.
Spatial models can also be classified according to their resolution in space, time and attributes, ranging from the microscopic to the macroscopic. The space dimension can be represented by objects with zero dimension (points), one dimension (lines), two dimensions (areas) or three dimensions (volumes). The size of objects may range from a few metres to thousands of kilometres. In similar terms the time dimension can be represented with zero dimension (event) or one dimension (process); the resolution may range between a few seconds and hundreds of years. It is misleading to talk about time as the “fourth” dimension as there are dynamic spatial models without three-dimensionality. The attribute dimension may be single- or multi-attribute. The resolution may range from individual objects (molecules, neurons, travellers) described by a list of attributes to large collectives (gases, species, national economies) described by averages of attributes, with all stages in between. Simulation models of individual objects are called microsimulation models; microsimulation models do not need to simulate all objects of the system of investigation but may work with a sufficiently large sample.
There are many more ways of classifying spatial models that can only be indicated here. Beyond the above criteria, spatial models can be classified by:
• comprehensiveness: some models deal only with one spatial subsystem, whereas others deal with interactions between different spatial subsystems. 
• model structure: one group of models applies one single unifying principle for modelling and linking all subsystems; other models consist of loosely coupled submodels, each of which has its own independent internal structure.
• theoretical foundations: environmental models rely on physical laws, whereas socioeconomic models apply conceptualisations of human behaviour such as random utility or economic equilibrium.
• modelling techniques: models may differ by modelling technique such as input-output models, spatial interaction models, neural network models, Markov models or microsimulation models.
In the following section spatial models in the environmental and social sciences will be classified by application field.
10.3 APPLICATION FIELDS
The range of applications of spatial models in the environmental and social sciences is large and rapidly expanding. The most important application fields at present are described in the following sections:
10.3.1 Environmental Sciences
Goodchild et al. (1993b) and Fedra (1993) use the following classification of spatial models in the environmental sciences:
• Atmospheric modelling includes general circulation models used for short- or medium-term weather forecasts or global, regional or micro climate forecasts and atmospheric diffusion models used for modelling the dispersion of pollutants originating from point, line or areal sources and carried by atmospheric processes including chemical reactions (Lee et al., 1993).
• Hydrological modelling includes surface water models, such as rainfall-runoff models, streamflow simulation models and flood hydraulics, and groundwater models, such as groundwater flow models, groundwater contamination transport models and variably saturated flow and transport models (Maidment, 1993).
• Land-surface-subsurface process modelling includes plant growth, erosion or salinization models, geological models and models of subsurface contamination at hazardous disposal sites or nuclear waste depositories, and is typically combined with surface water and groundwater models (Moore et al., 1993). 
• Biological/ecological systems modelling comprises terrestrial models such as single-species and multispecies vegetation and/or wildlife models, e.g. forest growth models, freshwater models such as fish yield models and nutrient/plankton models for lakes and streams, and marine models such as models of migrations of fish and other sea animals and models of the effect of fishing on fish stocks (Haines-Young et al., 1993; Hunsaker et al., 1993; Johnston, 1993).
• Integrated modelling includes combinations of one or more of the above groups of models, such as atmospheric and ecosystem models (Schimel and Burke, 1993) or climate, vegetation and hydrology models (Nemani et al., 1993).
From the point of view of environmental planning, energy system models and noise propagation models at the urban scale might also be included under the heading of environmental modelling.
10.3.2 Social Sciences
Spatial models in the social sciences originated from several disciplines such as economics, geography, sociology and transport engineering and have only since the 1960s been integrated by “synthetic” disciplines such as regional science or multidisciplinary research institutes and planning schools. However, as a point of departure, the classification by discipline is still useful:
• Economic modelling with a spatial dimension includes international or multiregional trade models and regional economic development models based on production functions, various definitions of economic potential or multiregional input-output analysis, and on the metropolitan scale models of urban land and housing markets based on the concept of bid rent. Normative economic models based on location theory (minimising location and transport cost) are used to determine optimum locations for manufacturing plants, wholesale and retail outlets or public facilities.
• Geographic modelling includes models of spatial diffusion of innovations similar to epidemiological models, migration models based on notions of distance and dissimilarity between origin and destination regions frequently coupled with probabilistic models of population dynamics, spatial interaction and location models based on entropy or random utility concepts and models of activity-based mobility behaviour of individuals subject to constraints (“space-time geography”).
• Sociological modelling has contributed spatial models of invasion of urban territories by population groups based on analogies from plant and animal ecology (“social ecology”) and models of urban “action spaces” related to concepts of space-time geography. 
• Transport engineering modelling includes travel and goods transport models based on entropy or random utility theory with submodels for flow generation, destination choice, modal choice, network search and flow assignment with capacity restraint resulting in user-optimum network equilibrium, and normative models for route planning, transport fleet management and navigation in large transport networks. In more recent developments, concepts of activity-based mobility have been taken up by transport modellers to take account of non-vehicle trips, trip chains, multimodal trips, car sharing and new forms of demand-responsive collective transport.
• Integrated modelling includes approaches in which two or more of the above specialised models are combined, such as integrated models of spatial development at the regional or metropolitan scale. Typically such models consist of models of activity location, land use and travel; more recently environmental aspects such as energy consumption, CO2 emissions, air pollution, open space and traffic noise are also addressed.
Spatial modelling in the social sciences seems to be more fragmented than in the environmental sciences. At the same time the need for integrative solutions is becoming more urgent because of the interconnectedness of economic, social and environmental problems.
10.4 SPATIAL MODELS AND GIS
Geographic information systems are both specialised database management systems for handling spatial information and toolboxes of methods to manipulate spatial information. Because of the limited analytical or modelling capabilities of present GIS, the toolbox side of current GIS seems to be of little interest. Instead it seems to be much more relevant to examine whether the organisation of spatial information in GIS
is appropriate for spatial models and, more importantly, whether it might facilitate new ways of applying existing models or stimulate the development of new ones.
10.4.1 Data Organisation of Spatial Models
Pre-GIS spatial models have used predominantly the following five types of data organisation:
Stock matrix. In aggregate spatial models space is subdivided into spatial units usually called zones. In general the area of each zone, but not its shape or its neighbourhood relations, is known to the model. All attributes of a zone are stored as a (sometimes very long) vector. So the study region is represented as a two-dimensional matrix where the rows are the zones and the columns are the attributes. To keep the number of columns of the matrix manageable, the attributes are classified; sometimes, however, an element of the attribute vector is a pointer to more complex information such as a univariate or bivariate distribution, for instance of households or dwellings. It is implicitly assumed that all attributes of a zone are uniformly spatially distributed throughout the zone, so the size of the zone determines the spatial resolution of the model. If the model is dynamic, there is one matrix for each point in time. The problem with this data organisation is that the zones (usually administrative subdivisions) are rarely homogeneous with respect to the classified attributes and that the attributes are rarely uniformly distributed across the area of the zone.
Interaction matrix. The spatial dimension of the model is introduced only indirectly by one or more interaction matrices and/or through the explicit representation of interzonal networks (see below). The interaction matrices are usually square matrices where both rows and columns represent zones. They contain either indicators of the spatial impedance between the zones such as distances, travel times or travel costs, or spatial interactions such as personal trips or flows of commodities or information. 
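The stock-matrix and interaction-matrix organisations described above can be sketched in a few lines of Python. The zone names, attribute names and all figures are invented for illustration; the point is only that the model sees each zone as a row of classified attributes plus its row and column in square zone-to-zone matrices, with no geometry beyond the area.

```python
from typing import Dict, List

zones = ["A", "B", "C"]
attributes = ["area_km2", "population", "jobs"]

# Stock matrix: one row per zone, one column per classified attribute.
stock: List[List[float]] = [
    [4.0, 8000.0, 2000.0],
    [10.0, 5000.0, 6000.0],
    [25.0, 2500.0, 500.0],
]

# Interaction matrices: square, rows are origin zones, columns destination zones.
travel_time_min: List[List[float]] = [
    [5.0, 15.0, 30.0],
    [15.0, 6.0, 20.0],
    [30.0, 20.0, 8.0],
]
trips: List[List[float]] = [
    [100.0, 300.0, 50.0],
    [200.0, 400.0, 100.0],
    [150.0, 250.0, 20.0],
]


def column(name: str) -> List[float]:
    """Return one attribute column of the stock matrix."""
    j = attributes.index(name)
    return [row[j] for row in stock]


def zonal_density(name: str) -> Dict[str, float]:
    """Density per zone, assuming the attribute is spread uniformly over the zone."""
    return {
        zone: value / area
        for zone, value, area in zip(zones, column(name), column("area_km2"))
    }


def mean_trip_time() -> float:
    """Average travel time over all trips, weighting impedance by interaction."""
    n = len(zones)
    total_trips = sum(sum(row) for row in trips)
    weighted = sum(
        trips[i][j] * travel_time_min[i][j] for i in range(n) for j in range(n)
    )
    return weighted / total_trips


population_density = zonal_density("population")
average_minutes = mean_trip_time()
```

The density function makes the uniform-distribution assumption explicit: whatever the real pattern of settlement inside zone C, the model can only report one figure for its whole 25 square kilometres.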
In spatial input-output analysis, the matrix of flows is actually four-dimensional, comprising both inter-zonal and inter-industry flows, but for practical reasons this is rarely implemented. If the model is dynamic, there is the same set of interaction matrices for each time period.
Network. Pre-GIS network coding is a vector representation of the links of the network. The network topology is introduced by coding a from-node and to-node for each link. However, the coordinates of the nodes are not normally coded because the impedance of each link is determined only by its attributes such as length, mean travel time, capacity, etc. The association between networks and zones is established by pseudo links connecting arbitrarily localised points in the zones (“centroids”) to one or more network nodes. There is no other spatial relationship between network and zones, so spatial impacts of flows within the zones such as traffic noise cannot be modelled.
List. Some disaggregate (microsimulation) models seek to avoid the disadvantages of the aggregate matrix representation of classified stocks by using a list representation of individual persons or objects. The list may contain all or a sample of the stock of a kind in the zone, for instance all individuals, households and dwellings or a representative sample of individuals, households and dwellings. Each list item is associated with a vector of attributes referring to it, so no averages are needed. Attributes can contain spatial information (such as address) or temporal information (such as year of retirement). One problem of list organisation is that matching operations (e.g. marriage) are not straightforward. However, by using more sophisticated forms of lists such as inverted lists, search in lists can be made more efficient.
Raster. Even before raster-based GIS, a raster organisation of spatial data has been popular in environmental and to a lesser degree in social-science modelling. 
Raster organisation has the advantage that the topology is implicit in the data model and that some operations such as buffering and density
calculations are greatly simplified, but in every other respect raster-based models share the problems of zone-based models, unless the raster cells are very small.

10.4.2 Data Organisation of GIS for Spatial Models

This section examines how the data structures offered by GIS correspond to the data structures used in spatial models.

Spatially aggregate zone-based models can be well represented by the polygon data object of vector-based GIS. Zones are represented by polygons and zonal attributes are stored in polygon attribute tables. However, this one-to-one correspondence underutilises the potential of vector-based GIS to superimpose different layers or coverages and to perform set operations on them. There is no advantage in using a GIS for this kind of model. In addition there are no facilities to store multiple sets of attributes with different time labels for one topology without duplicating the whole coverage.

There is no data structure in current GIS which corresponds to interaction matrices storing attributes relating to interactions between all pairs of polygons.

Networks can be conveniently represented as line coverages with the link attributes stored in arc attribute tables and the node attributes stored in associated point attribute tables. However, it is hardly possible to represent the temporal evolution of networks, i.e. the addition, modification or deletion of links at certain points in time. In pre-GIS times this was done by entering add, change or delete operations with a time label. This, however, is precluded in GIS in which there can be only one arc with the same from-node and to-node. In addition it is relatively difficult to associate two link-coded networks with each other (such as a road network with a public-transport network) or a link-coded network with additional route information where a route is a sequence of links (such as a public transport line).
Also the network operations built into some GIS or GIS add-ons, e.g. operations for minimum-path search or location/allocation, are no incentive for using a GIS for network representation as they tend to be too simplistic, inflexible and slow to compete with state-of-the-art network modelling algorithms. Nevertheless the ease of digitising, data entry and error checking makes it attractive to use GIS for network coding, even if all further processing takes place outside the GIS.

There is a strong affinity between micro data coded in list form and the way point data are stored in point attribute tables in GIS. Therefore it is here where the potential of GIS for supporting spatial models seems to be most obvious. At the same time point attribute tables and point operations are the least complex in GIS and therefore might most easily be reproduced outside a GIS. In addition there remains the difficulty of specifying multiple events with different time labels for one point at one location or of specifying the movement of one point from one location to another, due to the requirement of most GIS that one point identifier can only be associated with one location or coordinate pair.

Finally there are raster-based GIS. Their data organisation corresponds directly to that of raster-based spatial models and so shares their advantages and weaknesses plus the added difficulty of introducing time into the model. However, if a very small raster size is chosen, raster-based GIS and raster-based spatial models take on a new quality. If the raster cell is reduced to the size of a pixel, raster-based spatial models allow the generation and manipulation of quasi-continuous surfaces. Moreover, in conjunction with appropriate spatial interpolation techniques, it is possible to co-process polygon-based, network-based and list-based spatial models in one common spatial framework.
For instance, in a travel simulation one might use a list to sample trip origins from a population density surface created by spatial interpolation from zonal data, access the nearest network node pixel, perform destination,
mode and route choice on the link-coded network and return to pixel representation at the destination. The results of such a simulation may be used as link-by-link information to drive a capacity-restraint or network-equilibrium model, as pixel-by-pixel input to environmental impact submodels such as air dispersion or noise propagation models, or to drive output routines generating 2D or 3D surface representations. It would be worthwhile to explore the potential of raster-based add-ons to vector-based GIS to support such applications.

10.4.3 Coupling Spatial Models and GIS

Many of the more sophisticated algorithms to process spatial data in spatial models are not currently available in commercial GIS. This brings up the question of how spatial models should be integrated with the GIS. Four levels of integration of spatial analysis routines with GIS with increasing intensity of coupling can be distinguished (Nyerges, 1993):
• Isolated applications. The GIS and the spatial analysis programme are run in different hardware environments. Data transfer between the possibly different data models is performed by ASCII files offline; the user is the interface. The additional programming expenditure is low, but the efficiency of the coupling is limited.
• Loose coupling. Here the coupling is carried out by means of ASCII or binary files; the user is responsible for formatting the files according to format specifications of the GIS. This kind of coupling is carried out on-line on the same computer or on different computers in a local network; with relatively little extra programming the efficiency is greater than with isolated applications.
• Tight coupling. In this case the data models may still be different, but automated exchange of data between the GIS and the spatial analysis is possible through a standardised interface without user intervention. This increases the effectiveness of data exchange but requires more programming effort (e.g.
macro language programming). The user remains responsible for the integrity of the data.
• Full integration. This linkage operates like a homogeneous system from the user’s point of view; data exchange is based on a common data model and database management system. Interaction between GIS and spatial analysis is very efficient. The initial development effort is large, but may be justified by the ease with which later model functions can be added.

Maguire (1995) lists examples of various levels of integration available in current GIS software. An external model offers the advantage of independent and flexible development and testing of the model, but is only suitable for loose coupling. Embedding the spatial model into the GIS has the advantage that all functions and data resources of the GIS can be used. However, present GIS fail to provide interfaces for standard computer languages such as C, Pascal or Fortran necessary for internal modelling. In the long run graphical user interfaces from which the user can call up both GIS tools and modelling functions will become available.
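As a rough illustration of the loose coupling level described above, the following Python sketch passes data between a GIS and an external model through comma-separated ASCII text. Everything specific in it is an assumption made for illustration: the column names (`zone_id`, `population`, `area_km2`, `density`), the function names and the trivial density "model" are hypothetical, and in-memory text buffers stand in for the files that would be exchanged on disk. A real GIS prescribes its own format specification.

```python
import csv
import io


def export_zone_table(zones, handle):
    """Stand-in for a GIS export: write zonal attributes as ASCII text."""
    writer = csv.DictWriter(handle, fieldnames=["zone_id", "population", "area_km2"])
    writer.writeheader()
    writer.writerows(zones)


def run_external_model(in_handle, out_handle):
    """Hypothetical external model: read the ASCII table, write one result per zone."""
    reader = csv.DictReader(in_handle)
    writer = csv.DictWriter(out_handle, fieldnames=["zone_id", "density"])
    writer.writeheader()
    for row in reader:
        density = float(row["population"]) / float(row["area_km2"])
        writer.writerow({"zone_id": row["zone_id"], "density": density})


def import_results(handle):
    """Stand-in for the GIS import: read model results back, keyed by zone."""
    return {row["zone_id"]: float(row["density"]) for row in csv.DictReader(handle)}


# Illustrative zonal attributes (invented values).
zones = [
    {"zone_id": "A", "population": 12000, "area_km2": 4.0},
    {"zone_id": "B", "population": 3000, "area_km2": 6.0},
]

gis_export = io.StringIO()      # would be an ASCII file on disk
export_zone_table(zones, gis_export)
gis_export.seek(0)

model_output = io.StringIO()    # second exchange file, written by the model
run_external_model(gis_export, model_output)
model_output.seek(0)

densities = import_results(model_output)
```

The two programs share nothing except the file layout, which is why the model can be developed and tested independently, and also why the integrity of the exchanged data remains the user's responsibility.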
10.5 MICROSIMULATION AND GIS

10.5.1 More Opportunities

One development that is likely to have a profound impact on spatial modelling is the capability of GIS to organise and process spatially disaggregate data efficiently. Pre-GIS spatial models received their spatial dimension through a zonal system. This implied the assumption that all attributes of a zone are uniformly spatially distributed throughout the zone. Spatial interaction between zones was established via networks that are linked to centroids of the zones. Zone-based spatial models do not take account of topological relationships and ignore the fact that socio-economic activities and their impacts, e.g. environmental impacts, are continuous in space.

The limitations of zonal systems have led to serious methodological difficulties such as the ‘modifiable areal unit problem’ (Openshaw, 1984; Fotheringham and Wong, 1991) and problems of spatial interpolation between incompatible zone systems (Flowerdew and Openshaw, 1987; Goodchild et al., 1993a; Fisher and Langford, 1995). For instance, most existing land use models lack the spatial resolution necessary to represent other environmental phenomena such as energy consumption or CO2 emissions. In particular emission-immission algorithms such as air dispersion, noise propagation and surface and ground water flows, but also micro climate analysis, require a much higher spatial resolution than large zones in which the internal distribution of activities and land uses is not known. Air distribution models typically work with raster data of emission sources and topographic features such as elevation and surface characteristics such as green space, built-up area, high-rise buildings and the like. Noise propagation models require spatially disaggregate data on emission sources, topography and sound barriers such as dams, walls or buildings as well as the three-dimensional location of population.
Surface and ground water flow models require spatially disaggregate data on river systems and geological information on ground water conditions. Micro climate analysis depends on small-scale mapping of green spaces and built-up areas and their features. In all four cases the information needed is configurational. This implies that not only the attributes of the components of the modelled system such as quantity or cost are of interest but also their physical micro location. This suggests a fundamentally new organisation of urban models based on a microscopic view of urban change processes (Wegener, 1998). This is where GIS come into play. A combination of raster and vector representations of spatial elements, as it is possible in GIS, might lead to spatially disaggregate models that are able to overcome the disadvantages of zonal models. Using spatial interpolation techniques, zonal data can be disaggregated from polygons to pixels to allow the calculation of micro-scale indicators such as population or employment density or air pollution (Bracken and Martin, 1988; Martin and Bracken, 1990; Bracken and Martin, 1994). The vector representation of transport networks allows the application of efficient network algorithms from aggregate transport models such as minimum path search, mode and route choice and equilibrium assignment. The combination of raster and vector representations facilitates activity-based microsimulation of both location and mobility in an integrated and consistent way.
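The minimum path search mentioned above operates directly on a link-coded network, i.e. a table of links with a from-node, a to-node and an impedance. The following Python sketch uses Dijkstra's algorithm as one common choice; the source does not name a specific algorithm, and the link table, node labels and function names are invented for illustration.

```python
import heapq

# Hypothetical link table: (from_node, to_node, travel_time_in_minutes).
links = [
    ("A", "B", 4.0),
    ("A", "C", 2.0),
    ("C", "B", 1.0),
    ("B", "D", 5.0),
    ("C", "D", 8.0),
]


def build_adjacency(links, two_way=True):
    """Turn the link list into a node -> [(neighbour, impedance)] mapping."""
    adjacency = {}
    for from_node, to_node, impedance in links:
        adjacency.setdefault(from_node, []).append((to_node, impedance))
        adjacency.setdefault(to_node, [])
        if two_way:
            adjacency[to_node].append((from_node, impedance))
    return adjacency


def minimum_path(adjacency, origin, destination):
    """Dijkstra search; returns (total impedance, node sequence)."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbour, impedance in adjacency.get(node, []):
            new_cost = cost + impedance
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in best:
        return float("inf"), []
    # Walk back from the destination to recover the node sequence.
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return best[destination], path


cost, path = minimum_path(build_adjacency(links), "A", "D")
```

Note that node coordinates are not needed for the search itself; they only become relevant when micro locations such as raster cells have to be attached to their nearest network node.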
Figure 10.1: Linking microsimulation and GIS
10.5.2 Linking Microsimulation and GIS

Microsimulation models require disaggregate spatial data. Geographic information systems (GIS) offer data structures which efficiently link coordinate and attribute data. There is an implicit affinity between microanalytic methods of spatial research and the spatial representation of point data in GIS. Even where no micro data are available, GIS can be used to generate a probabilistic disaggregate spatial database. There are four fields in which GIS can support micro techniques of analysis and modelling (see Figure 10.1):
• Storage of spatial data. There is a strong similarity between the storage of individual data required for microsimulation and the structure of point coverages of GIS. In an integrated system of microsimulation modules a GIS database may therefore be efficient for analysis and modelling.
• Generation of new data. GIS may be used to create new data for microsimulation that were not available before. These data can be derived using analytical tools of GIS such as overlay or buffering.
• Disaggregation of data. Most data available for urban planning are aggregate zonal data. Microsimulation requires individual, spatially disaggregate data. If micro data are not available, GIS with appropriate microsimulation algorithms can generate a probabilistic disaggregate spatial database. A method for generating synthetic micro data is presented in the next section.
• Visualisation. Microsimulation and GIS can be combined to display graphically input data and intermediate and final results as well as to visualise the spatial evolution of the urban system over time through animation.
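To make the first two fields more concrete, the following Python sketch stores micro data as a list of point records, much like a point attribute table, and derives a new attribute with a simple buffer operation. The records, the coordinates, the 100-unit radius and the attribute name `near_stop` are all invented for illustration, and plain Euclidean distance in projected coordinates is assumed.

```python
import math

# Hypothetical micro data: one record per household, with coordinates and attributes.
households = [
    {"id": 1, "x": 100.0, "y": 100.0, "persons": 3},
    {"id": 2, "x": 450.0, "y": 120.0, "persons": 1},
    {"id": 3, "x": 900.0, "y": 900.0, "persons": 4},
]

# Hypothetical public transport stops (x, y).
stops = [(120.0, 90.0), (500.0, 100.0)]


def within_buffer(point, centres, radius):
    """True if the point lies inside a circular buffer around any centre."""
    return any(
        math.hypot(point[0] - cx, point[1] - cy) <= radius for cx, cy in centres
    )


def add_buffer_attribute(records, centres, radius, name):
    """Return copies of the records with a new boolean attribute derived by buffering."""
    return [
        dict(record, **{name: within_buffer((record["x"], record["y"]), centres, radius)})
        for record in records
    ]


enriched = add_buffer_attribute(households, stops, radius=100.0, name="near_stop")
```

The derived attribute did not exist in the input data; it is the kind of new information a microsimulation module could then read from the list like any other attribute.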
10.5.3 Spatial Disaggregation of Zonal Data Spatial microsimulation models require the exact spatial location of the modelled activities, i.e. point addresses as input. However, most available data are spatially aggregate. To overcome this, raster cells or pixels are used as addresses for microsimulation. To disaggregate aggregate data spatially within a spatial unit such as an urban district or a census tract, the land use distribution within that zone is taken into consideration, i.e. it is assumed that there are areas of different density within the zone. The spatial disaggregation of zonal data therefore consists of two steps, the generation of a raster representation of land use and the allocation of the data to raster cells. Vector-based GIS record land use data as attributes of polygons. If the GIS software has no option for converting a polygon coverage into a raster representation, the following steps are performed (see Spiekermann and Wegener, 1995; Wegener and Spiekermann, 1996). First, the land use coverage and the coverage containing the zone borders are intersected to get land use polygons for each zone. Then the polygons are converted to raster representation by using a point-in-polygon algorithm for the centroid of each raster cell. As a result each cell has two attributes, the land use category and the zone number of its centroid. These cells represent the addresses for the disaggregation of zonal data and the subsequent microsimulation. The cell size to be selected depends on the required spatial resolution of the microsimulation and is limited by the memory and speed of the computer. The next step merges the land use data and zonal activity data such as population or employment. First for each activity to be disaggregated density-specific weights are assigned to each land use category. Then all cells are attributed with the weights of their land use category. 
Dividing the weight of a cell by the total of the weights of all cells of the zone gives the probability for that cell to be the address of one element of the zonal activity. Cumulating the weights over the cells of a zone yields a range of numbers associated with each cell. Using a random number generator, for each element of zonal activity, e.g. each household or work place, one cell is selected as its address. The result is a raster representation of the distribution of the activity within the zone that can be used as the spatial basis for a microsimulation. The combination of the raster representation of activities and the vector representation of the transport network provides a powerful data organisation for the joint microsimulation of land use, transport and environment. The raster representation of activities allows the calculation of micro-scale equity and sustainability indicators such as accessibility, air pollution, water quality, noise, micro climate and natural habitats, both for exogenous evaluation and for endogenous feedback into the residential construction and housing market submodels. The vector representation of the network allows the application of efficient network algorithms from aggregate transport models such as minimum path search, mode and route choice and equilibrium assignment. The link between the micro locations of activities in space and the transport network is established by automatic search routines finding the nearest access point to the network or nearest public transport stop. The combination of raster and vector representation in one model allows the application of the activity-based modelling philosophy to modelling both location and mobility in an integrated and consistent way. This vastly expands the range of policies that can be examined. 
For instance, it is possible to study the impacts of public-transport oriented land-use policies promoting low-rise, high-density mixed-use areas with short distances and a large proportion of cycling and walking trips as well as new forms of collective travel such as bike-and-ride, kiss-and-ride, park-and-ride or various forms of vehicle-sharing.
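The second disaggregation step described in this section (weights per land use category, cumulated weights per zone, one random draw per element of the zonal activity) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the land use categories, the weights, the cell table and the function name `allocate` are hypothetical, and the cells are assumed to carry already the two attributes produced by the rasterisation step (land use category and zone number).

```python
import bisect
import itertools
import random

# Hypothetical density-specific weights for the activity "households".
WEIGHTS = {
    "residential_high": 10.0,
    "residential_low": 3.0,
    "industrial": 0.0,
    "open_space": 0.0,
}

# Raster cells with the two attributes named in the text: land use and zone.
cells = [
    {"cell_id": 0, "zone": 1, "land_use": "residential_high"},
    {"cell_id": 1, "zone": 1, "land_use": "residential_low"},
    {"cell_id": 2, "zone": 1, "land_use": "industrial"},
    {"cell_id": 3, "zone": 2, "land_use": "residential_low"},
    {"cell_id": 4, "zone": 2, "land_use": "open_space"},
]


def allocate(cells, zone, count, weights, rng):
    """Assign `count` activity elements of a zone to cells, proportionally to weight."""
    zone_cells = [c for c in cells if c["zone"] == zone]
    cell_weights = [weights[c["land_use"]] for c in zone_cells]
    # Cumulating the weights gives each cell a range of numbers.
    cumulative = list(itertools.accumulate(cell_weights))
    total = cumulative[-1] if cumulative else 0.0
    if total <= 0:
        raise ValueError("no cell in this zone can host the activity")
    allocation = {c["cell_id"]: 0 for c in zone_cells}
    for _ in range(count):
        draw = rng.random() * total                    # uniform in [0, total)
        index = bisect.bisect_right(cumulative, draw)  # cell whose range contains the draw
        allocation[zone_cells[index]["cell_id"]] += 1
    return allocation


rng = random.Random(42)  # fixed seed so the sketch is repeatable
households_by_cell = allocate(cells, zone=1, count=1000, weights=WEIGHTS, rng=rng)
```

Each cell is chosen with probability equal to its weight divided by the total weight of the zone, so cells with weight zero (here the industrial cell) never receive households, and the zonal total is preserved exactly.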
10.6 CONCLUSIONS

The main conclusion of this chapter is that the potential of GIS to offer new data organisations for spatial models represents the most promising challenge of GIS for spatial modelling. It should be the primary goal to identify approaches which explore the potential of GIS to facilitate new ways of applying existing models or to stimulate the development of new models. Under this perspective, issues of data transformation from one data organisation to another (e.g. from polygon to raster), the generation of synthetic micro data from aggregate data, and integrated approaches combining different data models such as raster and vector deserve particular attention.

In comparison, the question of what constitutes the right set of tools for spatial modelling within GIS seems to be of secondary priority. The new potential of GIS for spatial modelling needs to be explored much more thoroughly before a “canon” of typical spatial modelling operations can be defined. In addition there is always the chance that such a canon will become outdated as new problems come up and require new modelling approaches; modelling is not a routine activity that can be standardised.

Similarly, the question of how GIS and spatial models should be connected seems to be premature as long as there is no canon of typical operations for spatial modelling. It is likely that for some time to come “loose coupling” will be the appropriate mode for research-oriented spatial modelling environments. From a more fundamental perspective the question may be criticised as being captive to a too restrictive concept of a GIS as a particular software package. After all, is not every spatial model a GIS in as much as it processes spatial information? From this point of view it does not make a difference whether the spatial model is embedded in the GIS or the GIS into the spatial model.
There may be a time when GIS are no longer monolithic fortresses with tightly controlled import-export drawbridges but modular, open, interactive systems of functional routines and well documented file formats.

Another convergence, which is also related to GIS, seems to be much more important. The growing complexity of environmental problems requires the use of integrated spatial information systems and models cutting across application fields and across the gap between the environmental and social sciences. Separate modelling efforts, with or without a GIS, are no longer sufficient. Joint efforts of computer scientists, landscape ecologists, hydrologists, planners and transport engineers should aim at the development of intelligent, highly integrated spatial information and modelling systems. These systems could play an important role not only in answering the questions of experts but also in educating and informing politicians, administrators and the general public.

ACKNOWLEDGEMENT

The author is indebted to the other members of the GISDATA task force “Spatial Models and GIS”, Ian Heywood, Ulrich Streit and Josef Strobl, for material from the position paper for the Specialist Meeting at Friiberghs Herrgård and to Klaus Spiekermann for reference to joint work on microsimulation and GIS.

REFERENCES

BERRY, J.K. 1995. What’s in a model? GIS World, 8(1), pp. 26–28.
BRACKEN, I. and MARTIN, D. 1988. The generation of spatial population distributions from census centroid data, Environment and Planning A, 21, pp. 537–543.
BRACKEN, I. and MARTIN, D. 1994. Linkage of the 1981 and 1991 UK Censuses using surface modelling concepts, Environment and Planning A, 27, pp. 379–390.
FEDRA, K. 1993. GIS and environmental modelling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 35–50.
FISHER, P.P. and LANGFORD, M. 1995. Modelling the errors in areal interpolation between zonal systems by Monte Carlo simulation, Environment and Planning A, 27, pp. 211–224.
FLOWERDEW, R. and OPENSHAW, S. 1987. A Review of the Problem of Transferring Data from One Set of Areal Units to Another Incompatible Set, NE.RRL Research Report 87/0. Newcastle: Centre for Urban and Regional Development Studies, University of Newcastle.
FOTHERINGHAM, A.S. and WONG, D.W.S. 1991. The modifiable areal unit problem in multivariate statistical analysis, Environment and Planning A, 23, pp. 1025–1044.
GOODCHILD, M.F., ANSELIN, L. and DEICHMANN, U. 1993a. A framework for the areal interpolation of socioeconomic data, Environment and Planning A, 25, pp. 383–397.
GOODCHILD, M.F., PARKS, B.O. and STEYAERT, L.T. (Eds.) 1993b. Environmental Modelling with GIS. New York: Oxford University Press.
HAINES-YOUNG, R., GREEN, D.R. and COUSINS, S.H. 1993. Landscape Ecology and GIS. London: Taylor & Francis.
HEYWOOD, I., STREIT, U., STROBL, J. and WEGENER, M. 1995. Spatial models and GIS: new potential for new models? Presentation at the GISDATA Specialist Meeting on “GIS and Spatial Models: New Potential for New Models?”, Friiberghs Herrgård, 14–18 June 1995.
HUNSAKER, C.T., NISBET, R.A., LAM, D.C., BROWDER, J.A., BAKER, W.L., TURNER, M.G. and BOTKIN, D.B. 1993. Spatial models of ecological systems and processes: the role of GIS, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 248–264.
JOHNSTON, C.A. 1993.
Introduction to quantitative methods and modeling in community, population and landscape ecology, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 281–283.
LEE, T.J., PIELKE, R., KITTEL, T. and WEAVER, J. 1993. Atmospheric modeling and its spatial representation of land surface characteristics, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 108–122.
MAGUIRE, D. 1995. Implementing spatial analysis and GIS applications for business and service planning, in Longley, P. and Clarke, C. (Eds.), GIS for Business and Service Planning. Cambridge: GeoInformation International, pp. 171–191.
MAIDMENT, D.R. 1993. GIS and hydrological modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 147–167.
MARTIN, D. and BRACKEN, I. 1990. Techniques for modelling population-related raster databases, Environment and Planning A, 23, pp. 1069–1075.
MOORE, I.S., TURNER, A.K., WILSON, J.P., JENSON, S.K. and BAND, L.E. 1993. GIS and land-surface-subsurface process modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 196–230.
NEMANI, R., RUNNING, S.W., BAND, L.E. and PETERSON, D.L. 1993. Regional hydroecological simulation system: an illustration of the integration of ecosystem models in GIS, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 296–304.
NYERGES, T.L. 1993. Understanding the scope of GIS: its relationship to environmental modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 75–93.
OPENSHAW, S. 1984.
The Modifiable Areal Unit Problem, Concepts and Techniques in Modern Geography 38. Norwich: Geo Books.
SCHIMEL, D.S. and BURKE, I.C. 1993. Spatial interactive atmosphere-ecosystem coupling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 284–289.
STEYAERT, L.T. 1993. A perspective on the state of environmental simulation modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. New York: Oxford University Press, pp. 16–30.
SPIEKERMANN, K. and WEGENER, M. 1995. Freedom from the tyranny of zones: toward new GIS-based spatial models, Presentation at the GISDATA Specialist Meeting “GIS and Spatial Models: New Potential for New Models?”, Friiberghs Herrgård, 14–18 June 1995.
WEGENER, M. 1998. Applied models of urban transport, land use and environment, in Batten, D., Kim, T.J., Lundqvist, L. and Mattson, L.G. (Eds.), Network Infrastructure and the Urban Environment: Recent Advances in Land-use/Transportation Modelling. Berlin/Heidelberg/New York: Springer Verlag.
WEGENER, M. and SPIEKERMANN, K. 1996. The potential of microsimulation for urban models, in Clarke, G. (Ed.), Microsimulation for Urban and Regional Policy Analysis, European Research in Regional Science, Vol. 6. London: Pion, pp. 146–163.
Chapter Eleven
Progress in Spatial Decision Making Using Geographic Information Systems
Timothy Nyerges
11.1 INTRODUCTION Where are we with progress in spatial decision making using geographic information systems (GIS)? If you agree with David Cowen who wrote “I conclude that a GIS is best defined as a decision support system involving the integration of spatially referenced data in a problem solving environment” (Cowen, 1988, p. 1554), then perhaps we have amassed a wealth of experience using GIS in a spatial decision making context. If you agree more with Paul Densham who later wrote “current GIS fall short of providing GIA [geographic information analysis] capabilities [for decision making support]” (Densham, 1991, p. 405), then perhaps you believe that GIS has not evolved enough to truly support decision making. However, if you agree with Robert Lake who warns “ultimately at issue is whether the integrative capacity of GIS technology proves robust enough to encompass not simply more data but fundamentally different categories that extend considerably beyond the ethical, political, and epistemological limitations of positivism” (Lake, 1993, p. 141); then perhaps GIS might be a decision support disbenefit. Whether we consider GIS to be a spatial decision support system (SDSS) and/or a disbenefit, if we assume that the core of a SDSS relies on GIS technology, then we can assert that progress has been made with spatial decision making if we measure progress in terms of technology (tool) development. From that statement we can safely conclude that research on tool development has received much more attention over the years than has study of the tool use. We need to focus more energies on studying the use of the spatial information technology to determine if progress has occurred. Good principles directed at design come from a good understanding of the principles of tool use. It is important to understand that a balance of research on tool development and tool use is probably the best approach to ensure progress. 
Such a position is advocated by Zachary (1988), DeSanctis and Gallupe (1987) and Benbasat and Nault (1990) working in a management information context on decision support. Tool development and use can be studied together effectively using a (recon)structurationist perspective (DeSanctis and Poole, 1994; Orlikowski, 1992) regardless of the empirical research strategies employed. Tools beget new uses and new uses beget new tools. Understanding the impacts is fundamental to measuring progress. Most industry watchers would agree that GIS (and perhaps some SDSS) are used every day throughout the world by individuals, groups, and organisations. Some progress in research about GIS use is being made from an individual perspective (Grassland et al., 1995; Davies and Medyckyj-Scott, 1995), a group perspective (Nyerges, 1995a; Nyerges and Jankowski, 1994), and an organisational perspective (de Man, 1988; Dickinson, 1990; Onsrud et al., 1992). However, the research literature still lacks theoretical and empirical contributions about GIS use in general (Nyerges, 1993), let alone on the narrower topic of “spatial decision
making”. One of the reasons for the lack of research is that almost all of the GIS/SDSS and collaborative spatial decision making research has focused on software development. As in the management and decision sciences where interest in decision support system (DSS) development in the 1970s preceded studies of use in the 1980s, perhaps the same ten year lag is expected for GIS/SDSS development in the 1990s and studies of use after 2000. An additional reason might be that empirical studies about GIS/SDSS use have few guidelines from which to draw. From the outset, our focus will be on “use” of GIS, and its offspring SDSS, for spatial decision making. However, it should be recognised that GIS development is an integral part of the progress. To address the progress with GIS tool development and use, the chapter proceeds as follows. In the next section a framework for spatial decision making described in terms of input, process and outcome, helps us uncover the complexity about tool use. The framework establishes a scope for many variables/issues. In the third section we assess the progress in spatial decision making using GIS/SDSS by examining relationships between issues/variables. The fourth section contains an overview of research strategies that might be useful for exploring the impacts of tool development on tool use. Finally, the fifth section presents generalisations about progress and directions for research. 11.2 THE SCOPE OF SPATIAL DECISION MAKING USING GIS A scope of relevant issues about “spatial decision making using GIS” is established by drawing from a theoretical framework called enhanced adaptive structuration theory (EAST). Original AST characterises the influence of advanced information technology use on organisational change (DeSanctis and Poole, 1994). Nyerges and Jankowski (1997) created EAST by expanding on the number and nature of issues treated in AST, while doing this in the context of decision aiding techniques for GIS/SDSS (Figure 11.1). 
Of course, an iterative process in decision making is very much possible, so input, process and outcome interaction in an actual decision context can become complicated rather quickly. To sort through this complexity, specific issues/variables are described in the following subsections. 11.2.1 Scoping Spatial Decision Making Inputs Three major constructs acting as inputs in a decision making process are the information technology, the decision context broadly defined, and the decision actors (Constructs A, B and C in Figure 11.1, respectively). Although each is treated in that order below, one should remember that they simultaneously influence the decision making process. 11.2.1.1 Spatial Information Technology for Decision Making One of the major inputs to spatial decision making using GIS is the nature of the GIS technology (Construct A in Figure 11.1). Just as DSS evolved from management information systems because of unmet needs for decision making (Sprague, 1980), SDSS evolved from GIS (Densham, 1991). At this point it is useful to compare and contrast GIS and SDSS. A SDSS has as its core the basic decision aids of a GIS, i.e., data management as an aid to extend human memory, graphics display as an aid to enhance visualisation, and basic spatial analysis to extend human computing performance. However, a SDSS also integrates other aids
SPATIAL DECISION MAKING USING GIS
Figure 11.1: EAST-based framework for characterising “spatial decision making using GIS”
such as simulation, optimisation, and/or multiple criteria decision models that support exploration of alternatives. For example, a bushfire simulation model has been linked to GIS to provide decision makers predictive power (Kessell, 1996). Facility optimisation models have been linked to GIS to site health care facilities (Armstrong et al., 1991). Multiple criteria decision (MCD) models have been linked to GIS creating SDSS for land planning (Carver, 1991; Eastman et al., 1993; Faber et al., 1996; Heywood et al., 1995; Jankowski, 1995; Janssen and Herwijnen, 1991). In the early 1990s some GIS/SDSS researchers (Armstrong and Densham, 1990) interested in systematic approaches to tool development looked towards frameworks developed outside of the spatial sciences, mostly in the decision and management sciences where management information systems were not addressing decision needs. For example, Sprague (1980) provided a taxonomy of DSS, classifying them based on the capabilities. Alter (1983) followed, broadening the classification based on what designers had to offer in relation to what kinds of problems needed to be solved—systems applied to business problems. A little later, Zachary (1986) deepened the question of what kinds of decision aid techniques were useful by examining the issue from a cognitive (human decision making) perspective. He articulated six classes of decision aid techniques: process models, choice models, information control (data management) techniques, analysis and reasoning methods, representation aids, and human judgement amplifying/refining techniques. Shortly after, Zachary (1988) elaborated on the need for such aids, persuasively arguing that decision aiding techniques extend the limits of human cognition—which is why they are important. Building upon those studies in differentiating capabilities, hence DSS, came Silver’s work (1990) on directed and non-directed developments in DSS. 
He felt that DSS should provide capabilities suggested by Zachary (1988), in a way
that provided systematic access to capabilities through the following: breadth of capabilities to address a broad base of tasks; depth of sophistication of capabilities to meet the needs of a specific task in detail; and restrictiveness of capabilities to be used in some preset manner to protect users against themselves (see, for example, Construct A in Figure 11.1). Currently, GIS is evolving into group-based GIS (Armstrong, 1994; Faber et al., 1996), and SDSS into SDSS for groups (SDSS-G) (Nyerges, 1995a), both being used for collaborative spatial decision making (Densham et al., 1995). Just as the lineage of SDSS can be traced to DSS, there is a similarity between group-based GIS (including SDSS-G) and group DSS (GDSS). GDSS originally emphasised multicriteria decision making techniques (Thiriez and Zionts, 1976). Jarke (1986) provided a four-dimension framework for GDSS that included:
1. spatial separation of decision makers: local or remote;
2. temporal separation of decision makers: face-to-face meeting or mailing;
3. commonality of goals: consensus or negotiation; and
4. control of interaction: democratic or hierarchical.
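As a minimal sketch, the four dimensions attributed to Jarke (1986) can be written down as a small data structure that locates a given decision setting. The class names, member names and the `is_distributed` helper are illustrative assumptions introduced here, not terms defined by Jarke.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class SpatialSeparation(Enum):
    LOCAL = "local"
    REMOTE = "remote"


class TemporalSeparation(Enum):
    FACE_TO_FACE = "face-to-face meeting"
    MAILING = "mailing"


class GoalCommonality(Enum):
    CONSENSUS = "consensus"
    NEGOTIATION = "negotiation"


class InteractionControl(Enum):
    DEMOCRATIC = "democratic"
    HIERARCHICAL = "hierarchical"


@dataclass(frozen=True)
class GroupSetting:
    """One decision setting located on the four dimensions."""
    spatial: SpatialSeparation
    temporal: TemporalSeparation
    goals: GoalCommonality
    control: InteractionControl

    def is_distributed(self) -> bool:
        # Hypothetical helper: treat a setting as space-time distributed
        # when participants are remote or do not meet at the same time.
        return (self.spatial is SpatialSeparation.REMOTE
                or self.temporal is TemporalSeparation.MAILING)


def all_settings():
    """Enumerate every combination of the four two-valued dimensions."""
    dims = (SpatialSeparation, TemporalSeparation,
            GoalCommonality, InteractionControl)
    return [GroupSetting(*combo) for combo in product(*dims)]


meeting_room = GroupSetting(SpatialSeparation.LOCAL,
                            TemporalSeparation.FACE_TO_FACE,
                            GoalCommonality.CONSENSUS,
                            InteractionControl.DEMOCRATIC)
```

Enumerating the combinations in this way gives sixteen candidate settings, which may be a convenient way to check which settings a given group-based tool is meant to support.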
Shortly thereafter, DeSanctis and Gallupe (1987) extended the cognitive aid work of Zachary (1986) and incorporated Jarke’s (1986) space-time framework to organise techniques into a GDSS framework. In recent developments, spatial understanding support systems (SUSS) have been proposed to address the lack of free-form dialogue exploration capabilities in SDSS (Couclelis and Monmonier, 1995). It is only logical that SUSS be combined with SDSS to form spatial understanding and decision support systems (SUDSS), and SUDSS for groups (SUDSS-G). 11.2.1.2 Structural Issues in Decision Making as Inputs A second major concern as part of input involves decision making situations, or in the terms of DeSanctis and Poole (1994) “other structures” (Construct B). There are several issues other than technology that structure a decision situation. Decision motivation is one issue that continually influences the decision process. Although much of the early work on decision making was couched in terms of a rational process stemming from economic motivation (Simon, 1960), there are other considerations such as social wellbeing and sustainability of an organisation (Zey, 1992). Sorting through these motivations and values is rather important to understanding why decision results turn out differently than expected based on surface (face-value) information (Keeney, 1992). Another important concern is the character of the problem task. Included as characteristics of a task are the goal, content, and complexity of the task. With regards to differentiating task structure, Simon (1960, 1982) recognised three basic tasks in individual decision making—intelligence, evaluation, and choice. In a planning context, Rittel and Webber (1973) described “wicked problems” as among the most complex problem tasks because goals and content are not clear. 
Mason and Mitroff (1981) went on to label “wicked” problems as “ill-structured” or “semi-structured” problems, such that tasks are more akin to “problem finding” than to “solution finding” (Crossland et al., 1995). In contrast, “well-structured” problems are those that can be defined well enough such that a right answer is computable and knowable. Focusing on small groups, McGrath (1984) performed a comprehensive review of group activity literature to synthesise a task circumplex composed of eight types of tasks—generating ideas, generating plans, problem solving with correct answers, deciding issues with preference, resolving conflicts of viewpoint, resolving conflicts of
interest, resolving conflicts of power, and executing performance tasks. Each of these differs in terms of goal, content, and/or complexity in specific decision making situations. Another aspect of the decision situation is organisational control. If democratic control underlies decision making, versus hierarchical autocratic control, the process and outcome could be different. Such has been the case in the small-group research literature (Counselman, 1991), thus one might expect this effect in spatial decision making. Last to be treated here, but certainly not least, is the fundamental issue of “who benefits” from the decision making. In the management information systems literature (Money et al., 1988), benefits of decision support systems accrue to one or more of three groups: those at the managerial level, operational level, and personal level. In that study most benefits were gained at the personal level, rather than the other two. However, the system design was focused on individuals as a unit, rather than other units such as partnerships, groups, organisations, or community as taken up in the next section. 11.2.1.3 Decision Actor Unit A third construct (Construct C in Figure 11.1) used as input to the decision making process deals with who is using the spatial information technology. The decision actor unit could be an individual, partnership, group, organisation and/or community. In small-group research, Mantei (1989) recognises the influence that all of these levels might have on a small group, as well as the reverse. She called the small group the “standard level”. What is likely to be the best level of research investigation for GIS/SDSS? The opportunistic answer is probably all of them, so that insight is gained from multiple perspectives (Nyerges and Jankowski, 1997). We may well find that decision support capabilities that work at one level of decision actor unit do not work at all at another level. Thus, size of the decision unit is a fundamental concern.
Another aspect of this input is the knowledge and belief experience of a decision actor unit (i.e., individual, partnership, group, organisation, or community) in terms of what is known about the topic and/ or what aspects of the topic are valued more dearly than others. Knowledge about the problem and knowledge about techniques used to solve problems and making decisions are important aspects of what constitute the “personality” of the decision actor unit. 11.2.2 Spatial Decision Making as a Process of Interaction The three input Constructs (A, B and C) influence spatial decision making processes (Constructs D and E) in various ways. Decision making processes in a human context and information technology have often been described in terms of “interaction”. For individuals it is human-computer interaction. For groups it is human-computer-human interaction. For organisations it might be human-system interaction. A goal in GIS is to make the computer as transparent as possible, focusing on human-problem interaction (Mark and Gould, 1991). 11.2.2.1 Structure Appropriation Individuals and/or groups appropriate structures into the decision process (Construct D). Appropriating a structure is the act of invoking the structure, not necessarily the act of using it all the time. The major
structures are the technology (Construct A) and organisation guidelines (Construct B) that get the process started. Appropriation of structures is only a first part of various stages in the process; appropriations may in fact indicate the transition from stage to stage. When using a GIS during decision making, people appropriate various decision aids provided to them from the technology menu. Certain decision aids might be appropriated because the decision makers are familiar with how information is treated. In other cases some decision aids might get appropriated as a novelty, and not be used again. Still others such as maps might be appropriated in ways that are creative, but different from what designers had intended, called ironic appropriations (Contractor and Seibold, 1993). 11.2.2.2 Dynamics of Decision Processes Task analyses are commonly performed to better understand the flow of decision phases (Construct E), the result of which is a task model (Nyerges, 1993). Armstrong and Densham (1995) present a task analysis of a facility location decision making process, whereby location-allocation is the underlying analytical mechanism to establish best sites. Nyerges and Jankowski (1994) performed a task analysis to create a task model of habitat site selection based on technical/political preferences, and generalised the process into six phases:
1. develop a list of objectives as part of problem scoping;
2. develop feasible alternatives as a problem definition;
3. identify criteria for measuring the degree to which alternatives meet objectives;
4. specify criteria weights as a starting preference for the public-private choice process;
5. apply criteria weights to database values, aggregating scores to compute alternative rank;
6. negotiate selection of the best alternatives as part of a community-political process.
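Phases 4 and 5 above can be illustrated with a short sketch. It assumes a weighted linear combination, which is only one common multicriteria aggregation rule and is not necessarily the one used in the study cited; the site names, criteria, scores and weights are hypothetical, and criterion scores are assumed to be already standardised to the range 0 to 1 with higher values being better.

```python
def rank_alternatives(scores, weights):
    """Rank alternatives by a weighted sum of standardised criterion scores.

    scores  -- {alternative: {criterion: value between 0 and 1}}
    weights -- {criterion: non-negative weight}; rescaled here to sum to 1
    Returns a list of (alternative, aggregate score), best first.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    normalised = {criterion: w / total for criterion, w in weights.items()}

    aggregated = {
        alternative: sum(normalised[c] * values[c] for c in normalised)
        for alternative, values in scores.items()
    }
    # Sort by aggregate score, highest first, to obtain the rank order.
    return sorted(aggregated.items(), key=lambda item: item[1], reverse=True)


# Hypothetical candidate sites with standardised criterion scores.
scores = {
    "site_a": {"habitat_quality": 0.8, "cost": 0.5},
    "site_b": {"habitat_quality": 0.6, "cost": 1.0},
    "site_c": {"habitat_quality": 0.25, "cost": 0.25},
}
# Hypothetical starting preference: habitat quality counts three times cost.
weights = {"habitat_quality": 3, "cost": 1}

ranking = rank_alternatives(scores, weights)
for site, score in ranking:
    print(f"{site}: {score:.3f}")
```

In a group setting each participant could supply a separate `weights` dictionary, and the resulting rankings could then be compared as an input to the negotiation in phase 6.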
Each of these phases is part of McGrath’s (1984) circumplex of task activities for groups, and are thought to be general enough to apply to all preference-based site selection decision processes. This assumption is being investigated as phases in the dynamics of SDSS decision processes (Nyerges and Jankowski, 1994). 11.2.2.3 Emergent Structures When decision aids are applied continually for specific purposes, they are considered to be “emergent structures” of information (Construct F), for example, information structures based on diagrams, models or maps. Emergent structures are likely to direct the thinking patterns of individuals and/or groups. 11.2.3 Decision Making Outcome A third part of the framework concerns outcome. Outcomes from decision making consist of decision outcomes and social structure outcomes. Decision Outcomes: decision outcomes (Construct G) consist of the amount of energy, or cognitive effort, it takes to make the decision, the accuracy of the decision as a measure of decision quality, the
satisfaction with the decision in terms of comfort, the commitment to the decision, and/or the consensus opinion if in a group setting. Social Structure Outcomes: when in group decision settings, new social structures may develop from decision interaction processes: individuals to organisation, individuals to group, group to group, and/or group to organisation (Construct H). Social structure outcomes consist of adopting new rules for using information, and new interpersonal contacts and ways to support decision making. 11.3 ASSESSING PROGRESS: SPATIAL DECISION MAKING DURING GIS/SDSS USE The issues/variables in the EAST framework (Figure 11.1) are each important in themselves, but to assess progress it is important to know how the variables affect each other. Each of the construct boxes in Figure 11.1 depicts a major issue describing spatial decision making with regard to input, process and outcomes. Premises (P1–P7) connecting boxes motivate potential research questions. The research questions incorporate variables from the two constructs at opposite ends of each premise. In this section, we address the progress on investigating these relationships, each premise being taken in turn with emphasis on “spatial aspects” of decision making. Premise 1. Decision aid technology has an influence on decision aid moves. A decision aid move is the initial act of invoking the aid, not the entire process of using it. One fundamental question is: “Do the counts of aid moves differ across types of maps because of the advantages (or disadvantages) of the information associated with each?” Crossland et al. (1995) describe several types of displays for individual decision making, but the number of times each was used was not reported. With regard to group decision making, Armstrong et al. (1992) identify several kinds of map displays for facility location problems, and Jankowski et al. (1997) identify several for habitat site selection. Armstrong et al.
(1992) note that certain kinds of cartographic displays are more appropriate for certain stages of the decision process, and describe them, but counts of moves were not collected in the study to determine relative frequency of moves. Ramasubramanian (Chapter 8, this volume) and Nyerges et al. (1997) describe the influence that geographic information technologies could have on group participation in community decision making. Developments of GIS in this context lead to what has been called public participation GIS (PPGIS). Ramasubramanian (Chapter 8, this volume) describe the use of PPGIS for social service decision making in Milwaukee. Nyerges et al. (1997) describe the use of GIS in three scenarios—urban land rehabilitation, neighbourhood crime watch, forest conservation planning—to synthesise a preliminary, general set of requirements for PPGIS. Premise 2. Appropriation of decision aids varies with alternative sources of structuring. How does task complexity influence spatial decision making? Crossland et al. (1995) report on the relationship between task complexity (low and high) and decision outcome, but did not examine the decision process that links complexity and outcome. Nyerges and Jankowski (1994) are coding data about low and high task complexity as given by the number of sites to be addressed (eight and 20 respectively), but have yet to complete the analysis. McMaster (Chapter 35, this volume) describes how GIS data quality, expressed in terms of locational resolution and attribute accuracy, can impact forest management decision making. Premise 3. Decision aid appropriations will vary depending on the actor’s character. An individual’s knowledge about tools and problems has an influence on use of decision aid tools (Barfield and Robless, 1989; Nyerges, 1993), and we should expect that a group’s technical knowledge and experience will have
an effect on the appropriation of decision aids. No statistical evidence exists on users’ background in relation to counts of spatial decision aid moves. Premise 4. Decision aid moves have an influence on decision processes. DeSanctis and Poole (1994) view decision making as a process of social interaction, and we suspect the same. What kinds of decision aids influence problem exploration, evaluation, and consensus? Do problem exploration aids facilitate learning, evaluation aids facilitate idea differentiation, and consensus aids facilitate idea integration? What aspects of tables, diagrams, maps, and multicriteria decision making models bring a group to a consensus? No research reports provide statistical evidence for the kinds of decision aid moves that occur, but Armstrong et al. (1991) report that different maps are used as per the design of the system. Premise 5. New sources of structure emerge during the technology, task, and decision process mix. How diagrams, maps, and/or models emerge during human-computer interaction has not been studied. Do maps emerge more often than tables as a decision process evolves? Are there differences in the emergent sources of structure between face-to-face and space-time distributed meetings? Armstrong et al. (1992) describe the advantages of spider maps, demand maps, and supply maps for facility location decision problems. Crossland et al. (1995) describe several kinds of maps for site selection. Premise 6. Decision processes influence decision outcomes. Decision outcomes have been shown to be influenced by several characteristics (Contractor and Seibold, 1993). When characteristics such as the decision aids available, other sources of social structure (e.g. task guidance), faithful moves/use of decision aids, and a decision venue that fits the task assignment are positive, is it likely that GIS use will result in more satisfactory outcomes? Crossland et al.
(1995) report finding that unequivocal evidence exists in favour of addition of GIS technology to a site selection decision task in terms of reduced time and increased accuracy for individual decision makers. Further, their findings about use of GIS indicated favourable results for both the low complexity and high complexity tasks. However, Todd and Benbasat (1994) suggest that cognitive effort may in fact be a more important variable to study than decision quality (accuracy) and efficiency (time). Premise 7. New social structures emerge during the decision process. Interpersonal relationships between and among individuals evolve as they work through problem solving and decision making tasks. Since people come together in dyads, groups or organisations as part of the decision process, they establish new working relationships or reconstruct old ones. New rules for technology use develop as a result of information differentiation and/or integration. To date, social structure adoption has received only scant investigation through GIS implementation studies (Pinto and Azad, 1994). In only a very few cases do we have evidence about issue/variable relationships to help us better understand the impacts of GIS/SDSS use on spatial decision making. Without knowing about the relationships it is rather difficult to comment on progress. The above relationships are only a sample of those that are encouraged by examination of the premises in the context of an EAST-based conceptual framework (Nyerges and Jankowski, 1997). 11.4 RESEARCH STRATEGIES FOR EMPIRICAL STUDIES Several research strategies are available as a guideline for empirical studies that can address the premises in the previous section. Among these are: 1. usability tests with emphasis on evolving the tools (Davies and Medyckyj-Scott, 1994);
2. laboratory experiments with emphasis on controlled treatments for observation of individuals (Crossland et al., 1995; Gould, 1995), and small groups (Nyerges and Jankowski, 1994);
3. quasi-experiment (natural experiment) with natural control of treatments (Blackburn, 1987);
4. field studies using questionnaires (Davies and Medyckyj-Scott, 1995), interview (Onsrud et al., 1992), and videotape (Davies and Medyckyj-Scott, 1995), and
5. field experiments as a cross between a lab experiment and field study (Zmud et al., 1989).

Usability studies examine GIS users’ impressions and performance on tasks identified by system designers (Davies and Medyckyj-Scott, 1994). The goal of usability studies is usually to evolve the tool to a next state of usefulness. Questionnaires can be administered to elicit users’ recollections of their concerns with difficult-to-use system capabilities. Video cameras can collect “talk aloud” data about users’ concerns with invoking capabilities. Laboratory experiments make use of controlled treatments for observation. They provide a high degree of internal validity, i.e. consistency between the variables that are being manipulated. The validity is accomplished through setting tasks by the research design such that the treatments (as variables to be collected) can be compared with each other as independent and dependent variables. Crossland et al. (1995) made use of GIS-noGIS and low-high task complexity treatments in their study of decision making. They monitored the time duration and accuracy of the decision results for a site selection problem using keystroke logging. Gould (1995) made use of different user interface options in his experiments, also collecting data through keystroke logging. Nyerges (1995b) reports on interaction coding systems prepared for coding data from videotapes recorded in SDSS group experiments. Coding keystroke and video data in a way that summarises the character of use remains a challenge (Sanderson and Fisher, 1994).
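To make the idea of coding keystroke or video records more concrete, the following sketch tallies decision aid moves from an already coded event log. The record layout (time, phase, aid), the phase labels and the aid labels are hypothetical and do not reproduce any of the coding systems cited above.

```python
from collections import Counter

# Hypothetical coded events from one session:
# (seconds from session start, decision phase, decision aid invoked)
coded_events = [
    (12, "criteria", "table"),
    (40, "criteria", "map"),
    (75, "evaluation", "map"),
    (90, "evaluation", "mcd_model"),
    (130, "evaluation", "map"),
    (200, "negotiation", "map"),
]


def count_aid_moves(events):
    """Count aid invocations overall and per (phase, aid) pair.

    Each event counts as one move, i.e. the act of invoking an aid,
    regardless of how long the aid then stayed in use.
    """
    overall = Counter(aid for _, _, aid in events)
    by_phase = Counter((phase, aid) for _, phase, aid in events)
    return overall, by_phase


overall, by_phase = count_aid_moves(coded_events)

for aid, n in overall.most_common():
    print(f"{aid}: {n} move(s)")
for (phase, aid), n in sorted(by_phase.items()):
    print(f"{phase} / {aid}: {n}")
```

Counts of this kind are what would be needed to compare the relative frequency of moves across map types or decision phases; the harder part, as noted above, is producing reliable codes in the first place.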
Quasi-experiments, also called natural experiments (Blackburn, 1987), set up treatments through natural task assignments. A natural task is one that the researcher has no control over, undertaken as a normal process of work activity. Data can be collected using questionnaires, interviews, videotaping, and/or keystroke logging. The less intrusive the data collection, the more natural the data collection, e.g. videotaping. Field studies make use of natural decision making environments, employing participant observation to collect data. Field studies establish a high degree of external validity, i.e., the results are likely to apply to similar realistic situations (Zmud et al., 1989). Onsrud et al. (1992) describe case studies drawing heavily from Lee (1989). Dickinson (1990) and Davies and Medyckyj-Scott (1994) describe survey instruments to collect data about GIS use. Davies and Medyckyj-Scott (1995) describe videotape data capture about GIS use. Field experiments are a cross between a field study and a laboratory experiment. They are among the most difficult research designs to implement because of the conflict between natural setting and controlling for treatments. Given the emphasis in laboratory experiments to establish internal validity, and field studies to establish external validity, it is no wonder they are among the most fruitful research designs (Zmud et al., 1989). Keystroke logging, videotapes, questionnaires and interviews are useful techniques, but again the less intrusive the technique the better the data collection. 11.5 CONCLUSIONS—ARE WE THERE YET? Like a young child on an extended journey, we ask “are we there yet?” If not, the next question is usually “how much farther (in this case further) do we have to go?” In comparison to tool development, we have
just started with investigations of tool use. Establishing an agenda to study both tool development and tool use is an ambitious task. The EAST framework (Nyerges and Jankowski, 1997) motivates many relationships to examine. Which ones are the most important to pursue? To answer this we could construct a matrix where all variables are listed down the left side, and all variables listed across the top. The left side list could be interpreted as independent variables, and along the top would be the dependent list. The cells would represent relationships to be examined. Naturally, all possible relationships would not be meaningful, at least directly. The diagonal would not be considered, although there are probably some variables such as knowledge experience that affects itself, perhaps through metacognition. As variables are listed in sequence in the matrix, the eight construct boxes in Figure 11.1 could be made evident. Which cells have and have not been treated indicates the status of progress. Of course some of the relationships are more valuable than others. Capturing the appropriation of decision aids in relation to decision phases might be the most important endeavour. We would then know what aids relate to what phases. Determining whether cognitive effort or decision quality is more important under various treatments would be another significant pursuit, relating to research in the decision sciences. Technologies to support various tasks in a space-time distributed setting are also among significant issues. Such technologies expand the time and reduce the space that have been the underlying constraints in making use of decision aids. With the advent of fully functional desktop GIS now in existence, it is likely that use of new problem understanding and decision support aids will foster new social structures in public meetings. 
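The matrix of relationships described above can be sketched as follows. The variable names are an illustrative subset chosen here, not the full EAST variable list, and the single cell marked as studied reflects only the relationship between task complexity and decision outcome that this chapter attributes to Crossland et al. (1995).

```python
# Illustrative subset of variables; rows act as independent variables,
# columns as dependent variables.
variables = [
    "decision aids",
    "task complexity",
    "actor knowledge",
    "aid moves",
    "decision phases",
    "decision quality",
]


def build_matrix(names):
    """Create one cell per ordered pair of distinct variables.

    The diagonal (a variable related to itself) is left out,
    and every cell starts as 'not yet studied'.
    """
    return {(ind, dep): False for ind in names for dep in names if ind != dep}


def mark_studied(matrix, independent, dependent):
    """Record that an empirical study has treated this relationship."""
    if (independent, dependent) not in matrix:
        raise KeyError(f"no cell for {independent!r} -> {dependent!r}")
    matrix[(independent, dependent)] = True


def coverage(matrix):
    """Share of cells that have been treated, as a rough progress measure."""
    return sum(matrix.values()) / len(matrix)


matrix = build_matrix(variables)
mark_studied(matrix, "task complexity", "decision quality")

print(f"{sum(matrix.values())} of {len(matrix)} cells treated "
      f"({coverage(matrix):.1%})")
```

A plain count of treated cells weights every relationship equally; since some relationships are more valuable than others, a weighted version might be a more faithful summary of progress.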
Characterising the variables involved in the above relationships will be a challenge to those who are more comfortable with tool building or tool use, and not both. Some relationships may require a different research strategy to articulate than will others, making some studies more difficult, but again, useful entries in the cells. Sorting these issues out is the challenge. Only then can we determine “are we there yet?”, and if not “how far do we need to go with spatial decision making using GIS?”

REFERENCES
ALTER, S.L. 1983. A taxonomy of decision support systems, in House, W.C. (Ed.), Decision Support Systems. New York: Petrocelli, pp. 33–56.
ARMSTRONG, M.P. 1994. Requirements for the development of GIS-based group decision support systems, Journal of the American Society of Information Science, 45(9), pp. 669–677.
ARMSTRONG, M.P. and DENSHAM, P.J. 1990. Database organisation alternatives for spatial decision support systems, International Journal of Geographical Information Systems, 4, pp. 3–20.
ARMSTRONG, M.P. and DENSHAM, P.J. 1995. A conceptual framework for improving human-computer interaction in locational decision-making, in Nyerges, T., Mark, D.M., Laurini, R. and Egenhofer, M. (Eds.), Cognitive Aspects of Human-Computer Interaction for Geographic Information Systems, Proceedings of the NATO ARW, Mallorca, Spain, 21–25 March 1994. Dordrecht: Kluwer, pp. 343–354.
ARMSTRONG, M.P., RUSHTON, G., HONEY, R., DALZIEL, B., LOLONIS, P. and DENSHAM, P.J. 1991. Decision support for regionalization: a spatial decision support system for regionalizing service delivery systems, Computers, Environment and Urban Systems, 15, pp. 37–53.
ARMSTRONG, M.P., DENSHAM, P.J., LOLONIS, P. and RUSHTON, G. 1992. Cartographic displays to support location decision making, Cartography and Geographic Information Systems, 19, pp. 154–164.
BARFIELD, W. and ROBLESS, R. 1989. The effects of two- and three-dimensional graphics on problem solving performance of experienced and novice decision makers, Behaviour and Information Technology, 8(5), pp. 369–85.
BENBASAT, I. and NAULT, B.R. 1990. An evaluation of empirical research in managerial support systems, Decision Support Systems, 6, pp. 203–226.
BLACKBURN, R.S. 1987. Experimental design in organizational settings, in Lorsch, J. (Ed.), Handbook of Organizational Behavior. Englewood Cliffs, NJ: Prentice Hall, pp. 126–39.
CARVER, S. 1991. Integrating multi-criteria evaluation with geographical information systems, International Journal of Geographical Information Systems, 5(3), pp. 321–339.
CONTRACTOR, N.S. and SEIBOLD, D.R. 1993. Theoretical frameworks for the study of structuring processes in group decision support systems, Human Communication Research, 19(4), pp. 528–563.
COUCLELIS, H. and MONMONIER, M. 1995. Using SUSS to resolve NIMBY: how spatial understanding support systems can help with “Not in My Backyard” Syndrome, Geographical Systems, 2, pp. 83–101.
COUNSELMAN, E.F. 1991. Leadership in a long-term leaderless women’s group, Small Group Research, 22(2), pp. 240–257.
COWEN, D.J. 1988. GIS versus CAD versus DBMS: what are the differences? Photogrammetric Engineering & Remote Sensing, 54(11), pp. 1551–1555.
CROSSLAND, M.D., WYNNE, B.E. and PERKINS, W.C. 1995. Spatial decision support systems: an overview of the technology and a test of efficacy, Decision Support Systems, 14, pp. 219–235.
DAVIES, C. and MEDYCKYJ-SCOTT, D. 1994. GIS usability: recommendations based on the user’s view, International Journal of Geographical Information Systems, 8(2), pp. 175–189.
DAVIES, C. and MEDYCKYJ-SCOTT, D. 1995. Feet on the ground, studying user-GIS interaction in the workplace, in Nyerges, T., Mark, D.M., Laurini, R. and Egenhofer, M. (Eds.), Cognitive Aspects of Human-Computer Interaction for Geographic Information Systems, Proceedings of the NATO ARW, Mallorca, Spain, 21–25 March 1994. Dordrecht: Kluwer, pp. 123–141.
DE MAN, W.H.E. 1988. Establishing a geographic information system in relation to its use: a process of strategic choices, International Journal of Geographical Information Systems, 2(3), pp. 245–261.
DENSHAM, P.J. 1991. Spatial decision support systems, in Maguire, D.J., Goodchild, M.F. and Rhind, D.W. (Eds.), Geographical Information Systems: Principles and Applications. New York: John Wiley & Sons.
DENSHAM, P., ARMSTRONG, M. and KEMP, K. 1995. Collaborative Spatial Decision Making: Scientific Report for the I-17 Specialist Meeting, National Center for Geographic Information and Analysis, TR 95–14. Santa Barbara, CA: NCGIA.
DeSANCTIS, G. and GALLUPE, R.B. 1987. A foundation for the study of group decision support systems, Management Science, 33, pp. 589–609.
DeSANCTIS, G. and POOLE, M.S. 1994. Capturing the complexity in advanced technology use: adaptive structuration theory, Organization Science, 5(2), pp. 121–147.
DICKINSON, H. 1990. Deriving a method for evaluating the use of geographic information in decision making, Ph.D. dissertation, State University of New York at Buffalo, National Center for Geographic Information and Analysis (NCGIA) Technical Report 90–3.
EASTMAN, J.R., KYEM, P.A.K., TOLEDANO, J. and JIN, W. 1993. GIS and Decision Making, Explorations in Geographic Information Systems Technology, Volume 4. Geneva: UNITAR.
EASTMAN, J.R., WEIGEN, J., KYEM, P.A.K. and TOLEDANO, J. 1995. Raster procedures for multicriteria/multiobjective decisions, Photogrammetric Engineering and Remote Sensing, 61(5), pp. 539–547.
FABER, B.G., WALLACE, W.W. and MILLER, R.M.P. 1996. Collaborative modeling for environmental decision making, Proceedings of GIS’96 Symposium, Vancouver, British Columbia. Fort Collins: GIS World Books, pp. 187–198.
GOULD, M.D. 1995. Protocol analysis for cross-cultural GIS design: the importance of encoding resolution, in Nyerges, T., Mark, D.M., Laurini, R. and Egenhofer, M. (Eds.), Cognitive Aspects of Human-Computer Interaction for Geographic Information Systems, Proceedings of the NATO ARW, Mallorca, Spain, 21–25 March 1994. Dordrecht: Kluwer, pp. 267–284.
HEYWOOD, D.I., OLIVER, J. and TOMLINSON, S. 1995. Building an exploratory multi-criteria modelling environment for spatial decision support, in Fisher, P. (Ed.), Innovations in GIS 2. London: Taylor & Francis, pp. 127–136.
Chapter Twelve
GIS and Health: from Spatial Analysis to Spatial Decision Support
Anthony Gatrell
12.1 INTRODUCTION
There can be few areas of human inquiry that require more of a multidisciplinary perspective than that of health. Understanding the nature and incidence of disease and ill-health, the demands made upon health care systems, and how best to shape a configuration of health services, necessitates insights, approaches and tools drawn from disciplines that straddle the natural, social and management sciences. Given that ill-health is suffered by people living in particular localities, that an understanding of this requires knowledge of global and local environmental and social contexts, and that healthcare resources have to be located somewhere, it is not surprising that a spatial perspective on all these issues usually proves fruitful. As a consequence, GIS as both science and technology can inform our understanding of health problems, policy and practice, as I hope to show. My contribution emerges from a long-standing interest in the geography of health, and more specifically from having jointly convened a GISDATA workshop in this field (Gatrell and Löytönen, 1998) that has yielded insights from several researchers working in different disciplines and countries.
“Health” is, of course, notoriously difficult to define! As a result, attention tends to be focused on the incidence of illness and disease in the community. Traditionally, and as we see below, emphasis within the GIS community has been placed more on the use of easily-quantified health data; so we find studies using mortality data, or perhaps morbidity data such as those from cancer registries. Many of these data sets are address-based, carrying with them a postal code that permits fine-grained spatial analysis. But the existence of such data sets should not blind us to serious issues of data quality and uncertainty, both for spatial and attribute data. These issues are touched upon later, but clearly they are a prime example of concerns dealt with in other GISDATA meetings.
Notwithstanding the difficulties of defining attributes such as health, illness and disease, there are further difficulties in “entitation”, concerning the spatial objects to which such attributes may be attached. In some areas of health research, notably healthcare planning, this may be relatively unproblematic. For example, in defining a set of facilities that can deal with accidents and emergencies there may be clear criteria for inclusion—though the set of such facilities will of course change over time, requiring the spatial database to be updated. In epidemiology, the study of disease incidence, entitation might be thought to be straightforward, assuming we have postcode addresses of, say, adults suffering from throat cancer. But the existence of such data, and their physical representation as a spatial point pattern (Gatrell and Bailey, 1996, p 850) should not blind us to the fact that the individual point “events” are a very crude and imperfect representation of the lived worlds of the victims. Again, this point is developed below.
It should be clear that I follow others (for example, Jones and Moon, 1987; Thomas, 1992) in making a distinction in the geography of health between a concern with epidemiology and a concern with healthcare planning. As a result, in what follows I shall seek first to outline the ways in which GIS can contribute to an understanding of disease and ill-health. In so doing I shall identify those areas in which more research is needed. I follow this with a discussion of GIS in healthcare planning. But I wish to make the point that the two traditions of the geography of health come together in at least one important way; quite simply, if we can identify areas with significant health “needs” then it behoves healthcare planners to set in motion projects designed to address those needs and improve the status quo. In reviewing a range of possible scenarios (where to invest or disinvest in healthcare, for example) planners are led to consider the usefulness of spatial decision support systems (SDSS). The prospects for SDSS in health research are therefore discussed, before reaching some general conclusions about the status of GIS and health.
12.2 GIS AND EPIDEMIOLOGY
Given that epidemiology is concerned with describing and explaining the incidence of disease, it follows that spatial or geographical epidemiology requires methods that will provide good descriptions of the spatial incidence of disease, together with methods that offer the prospect of modelling such incidence. I shall, therefore, consider briefly methods for visualising, exploring, and modelling the incidence of disease. This threefold division of analytical labour follows the classification adopted in Bailey and Gatrell (1995), where extended discussion of these and other methods may be found. Such a stance suggests immediately that I am adopting a “spatial analysis” view of GIS, one which many leading figures in GIS research (notably Goodchild, Burrough, Openshaw and Unwin) have endorsed.
These and other authors (for example, Anselin and Getis, 1993; Bailey, 1994; Goodchild et al., 1992) have bemoaned the lack of a spatial analysis functionality in GIS, a shortcoming which raises particularly serious issues in epidemiology. Can we interrogate the spatial database for meaningful information (as opposed to simple queries), asking, for example, not simply how many cases of childhood asthma lie within buffer zones placed around main roads, but whether this is “unusual” or statistically significant? Such important questions demand a spatial analysis functionality, something which is now appearing in commercial and research-based products such as SAS-GIS, S-Plus for ARC/INFO, REGARD, and LISPSTAT. Pedagogic material (e.g. INFO-MAP; Bailey and Gatrell, 1995) is also available. A key requirement is the ability to query the data in one window and to see the results of such queries appear in other windows. For example, we might want to see a choropleth map of incidence rates in one window, the results of spatial smoothing of this map in another, a graph relating rates to data on air quality in a third, and a tabulation of data in the form of a spreadsheet in a fourth. These windows need to be linked, so that selection of objects in one causes them to be highlighted in others (Brunsdon and Charlton, 1995).
12.2.1 Visualisation of epidemiological data
Assuming we have a set of point objects representing disease incidence among a set of individuals we can map these as a point pattern, though this is singularly uninformative unless an attempt is made to control for underlying population distribution. Often we do not have access to point data and instead are provided only with data for a system of areal units, such as census tracts or other administrative units. Such data include disease incidence rates, age-standardised to control for variations in age structure. There are several issues
concerned with visualisation of such choropleth maps and how the reader extracts meaning from these. For example, researchers such as Brewer (1994) have looked at the use of colour on such maps. One special class of problem arises because of the variable size of spatial units. This has led some authors (e.g. Dorling, 1994 and Selvin et al., 1988) to explore the use of cartograms or “density-equalised” maps in disease mapping (an idea dating back many years; see Forster, 1963). Here, the size of an areal unit is made proportional to population at risk, and disease rates are shaded on this new base map; alternatively, individual disease events may be mapped instead. The ability to construct cartograms is not, to my knowledge, currently available within any proprietary GIS; it should be. An alternative method is to use proportional symbols to represent population at risk, with symbol shading representing disease rate.
12.2.2 Exploratory spatial data analysis
The production of cartograms (effectively, transformations of geographic space) means that these are as much exploratory as visualisation tools, suggesting that the distinction between the two sets of methods is fuzzy. The uneven distribution of population within any given spatial unit (see Wegener, Chapter 10 in this volume) also calls for solutions from exploratory spatial data analysis, though again the results produce new visualisations of epidemiological data. Martin and Bracken (1991) have introduced these methods into the analysis and mapping of census data. Their use in the analysis of health data is set to expand rapidly (Gatrell et al., 1996; Rushton et al., 1996). Such methods are known as kernel or density estimation. In essence, they reveal how the density, or “intensity” of a spatial point pattern varies across space (see Bailey and Gatrell, 1995, pp. 84–88 for a pedagogic treatment).
A moving window, or kernel, is superimposed over a fine grid of locations, and the density estimated at each location; research shows that the choice of kernel function is less critical than the bandwidth or spatial extent of the kernel function. Too small a bandwidth simply duplicates the original map of health events, while too large a bandwidth over-smoothes the map, obscuring any useful local detail. Methods are available for selecting an optimal bandwidth. Visualisation of the results shows regions where there is a high incidence of disease, and therefore possible clusters. On its own, this is uninformative, given the natural variation in population at risk, but others (Bithell, 1990; Kelsall and Diggle, 1995) have shown how the ratio of two density estimates (one for disease cases, the other for healthy controls) provides a powerful exploratory tool for cluster detection. For example, we may map the incidence of children born between 1985 and 1994 with serious heart malformations in north Lancashire and south Cumbria (Figure 12.1a). The dot map appears to do little more than mirror population distribution. Mapping controls (healthy infants, those born immediately before and after the cases) allows us to assess this visually (Figure 12.1b) but the ratio of two kernel estimates (Figure 12.1c) gives us a more rigorous indication of whether or not there are significant clusters, the density of shading providing clues as to the location of clusters. In this instance there seems to be nothing unusual in the distribution of heart malformations; a possible cluster in the north-east of the study area is influenced very much by the presence of a single case, while a test of the hypothesis that the two kernel estimates are identical, using randomisation tests, gives a p value of 0.587. Rushton et al. 
(1996) have embedded these kinds of ideas into software that is now widely available to health professionals in the USA and they are implemented in some of the interactive software environments mentioned earlier. Although derived from a different pedigree, and with an underlying theory that suggests it belongs more in a section on modelling than exploration, the work of Oliver and others on the kriging of disease data (and discussed in the spatial analysis GISDATA meeting) is closely related to kernel estimation (see Oliver et al., 1992).
Figure 12.1 Geographical analysis of congenital malformations in north Lancashire and south Cumbria, UK, 1985–1994: (a) case incidence; (b) healthy controls; (c) ratio of kernel estimates (h = bandwidth)
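The ratio-of-kernels idea described above can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a single fixed bandwidth h for both patterns, and invented coordinates; the function names are hypothetical and this is not the software used to produce Figure 12.1.

```python
import math

def kernel_density(points, x, y, h):
    """Gaussian kernel estimate of point intensity at (x, y) with bandwidth h."""
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-d2 / (2 * h * h))
    return total / (2 * math.pi * h * h)

def density_ratio(cases, controls, x, y, h):
    """Case density divided by control density, each scaled by its sample size."""
    case_d = kernel_density(cases, x, y, h) / len(cases)
    ctrl_d = kernel_density(controls, x, y, h) / len(controls)
    return case_d / ctrl_d if ctrl_d > 0 else float("nan")

# Invented coordinates (km); controls stand in for the population at risk.
cases = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (4.0, 4.0)]
controls = [(1.0, 1.2), (2.0, 2.0), (3.0, 3.0), (4.0, 4.1), (4.2, 3.9), (0.5, 0.5)]

# Evaluate the ratio at two grid locations; values above 1 suggest a local excess of cases.
for gx, gy in [(1.0, 1.0), (4.0, 4.0)]:
    print((gx, gy), round(density_ratio(cases, controls, gx, gy, h=1.0), 2))
```

In practice the ratio would be evaluated over a fine grid and mapped, and the bandwidth would be chosen by one of the selection methods mentioned in the text rather than fixed by hand.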
Such methods allow us to pinpoint possible clusters, in much the same way as Openshaw’s groundbreaking Geographical Analysis Machine sought to do (Openshaw et al., 1987). Authors such as Openshaw and Rushton are keen to stress the potential usefulness of such methods in disease surveillance, suggesting that they could be put to use in routine interrogations of spatial databases; public health specialists would “instruct” software to search such databases for “hotspots” and report the results for investigation and possible action. The feasibility and merit of this proposal demand further consideration.
The issue of whether there exist “clusters” of health events needs to be separated conceptually from whether or not there is generalised “clustering” of such events across the study region as a whole. In some applications it is important to know whether or not there is a tendency for cases of disease to aggregate more than one might expect on a chance basis. Again, there are statistical tools available to do this. For example, one approach (Cuzick and Edwards, 1990) looks at each case of disease in turn and asks whether nearest neighbours are themselves more likely to be cases than controls. Other approaches use Ripley’s K-function, which gives an estimate of the expected number of point events within a given distance of an arbitrarily chosen event; again, pedagogic treatments are available (Bailey and Gatrell, 1995). The K-function allows us to assess whether a spatial distribution is random, clustered, or dispersed, at a variety of spatial scales. As with kernel estimation, knowledge of the K-function for the spatial distribution of health events is of limited value, but if it is estimated for both cases and controls we can assess whether cases display more, or less, tendency for aggregation or clustering than we would expect, given background variation in population at risk.
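A minimal sketch of the case-control comparison of K-functions follows. It assumes a naive K estimate without edge correction and judges the difference by randomly relabelling cases and controls; both are simplifying assumptions made here for illustration, and the data and function names are invented.

```python
import math
import random

def k_function(points, d, area):
    """Naive K estimate (no edge correction): area-scaled share of ordered pairs within d."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= d
    )
    return area * pairs / (n * (n - 1))

def k_difference(cases, controls, d, area):
    """Positive values indicate more aggregation among cases than among controls."""
    return k_function(cases, d, area) - k_function(controls, d, area)

def relabel_p_value(cases, controls, d, area, n_sim=199, seed=0):
    """Monte Carlo p-value: shuffle the case/control labels and recompute the difference."""
    rng = random.Random(seed)
    observed = k_difference(cases, controls, d, area)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    extreme = 0
    for _ in range(n_sim):
        rng.shuffle(pooled)
        if k_difference(pooled[:n_cases], pooled[n_cases:], d, area) >= observed:
            extreme += 1
    return (extreme + 1) / (n_sim + 1)

# Invented data: three tightly grouped cases plus one isolated case, controls on a regular grid.
cases = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (5.0, 5.0)]
controls = [(float(x), float(y)) for x in (1, 3, 5, 7, 9) for y in (1, 3, 5, 7, 9)]

print(k_difference(cases, controls, d=0.5, area=100.0))
print(relabel_p_value(cases, controls, d=0.5, area=100.0))
```

A fuller analysis would evaluate the difference over a range of distances and apply an edge correction near the boundary of the study region.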
Statistical details are given in Diggle and Chetwynd (1991), with applications in Gatrell et al. (1996). If our database includes as an attribute date of infection, or date of disease notification, then the analytical possibilities are widened, and the space-time incidence of disease may be explored. Do cases that cluster in space also cluster in time? If so, this may give clues to a possible infective mechanism in disease
aetiology (causation). There is a variety of techniques available here, including those which extend the K-function (Bailey and Gatrell, 1995, pp. 122–5). Applications include research on Legionnaires’ disease (Bhopal et al., 1992) and on cancer (Gatrell et al., 1996). But this area of research raises interesting issues, of visualisation and analysis, and those interested in spatio-temporal GIS (as in a parallel GISDATA initiative) are well-placed to make a contribution. For example, we might have very detailed longitudinal information available on residential histories, and this can be used to examine space-time clustering not in a contemporary setting but in an historical one. We could, for instance, record locations at a certain age, and use as a measure of temporal separation the difference in birth year of cases. A test of space-time interaction might then reveal whether there was a tendency for clustering at particular ages, as demonstrated very convincingly for multiple sclerosis in part of Norway (Riise et al., 1991).
If we wish, as we should, to give time a significant role in our analyses we need to recognise the scale dimension, just as we do in a spatial setting. In other words, we need to acknowledge that some exposures to infectious agents (such as viruses) might have taken place many years ago, and any disease have taken years to develop; on the other hand, some environmental insults (the Chernobyl and Bhopal disasters spring immediately to mind) can lead to immediate health consequences. And at very local scales, it is quite crucial to realise that simply recording current address, and using the battery of spatial point process tools outlined above, is a grossly imperfect measure of “exposure”. People do not remain rooted to the spot, waiting to be exposed to airborne pollutants, for example.
They have individual, and possibly very complex, daily and weekly activity spaces, a set of other locations at which they may be exposed to a variety of environmental contaminants. This implies that we should draw upon the rich literature of “time geography”, first developed by the Swedish geographer Torsten Hägerstrand, in order to give due weight to these influences. A start has been made by some in bringing these ideas to the attention of the GIS community (Miller, 1991), while Schaerstrom (1996) has shown how they can be employed in an epidemiological setting (see Figure 12.2). This is a potentially fruitful and important area of research, in which much remains to be done.
In exploring health data that are collected for areal units there is a variety of analytical methods available. An important issue here concerns the low frequency of counts or incidence in small areas, or even in quite large areas if the disease is rare. Techniques such as probability mapping, and in particular Bayes estimation (Bailey and Gatrell, 1995, pp. 303–308; Langford, 1994) are now commonplace in the epidemiology literature. Essentially, the latter allows us to “shrink” rates in areas where disease incidence is low, towards the average value for the study area as a whole, as a way of acknowledging that our estimates are uncertain; if the rate is based on large numbers of cases it is not shrunk or smoothed so much. The smoothing can be either “global” or “local”; in the latter context the estimate is adjusted to a local or neighbourhood mean rather than that for the entire study area. Although such methods are not standard in proprietary GIS, they are widely used in modern atlases of mortality and morbidity (see, for example, the Spanish cancer atlas: Lopez-Abente et al., 1995).
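The “global” shrinkage described above can be sketched as follows. The sketch assumes a simple method-of-moments empirical Bayes estimator, which is one common choice and not necessarily the procedure used in the atlases cited; the counts, populations and function name are invented.

```python
def eb_smooth(cases, population):
    """Shrink raw area rates towards the overall rate; small populations shrink most."""
    total_cases = sum(cases)
    total_pop = sum(population)
    overall = total_cases / total_pop                      # study-area mean rate
    raw = [c / p for c, p in zip(cases, population)]
    mean_pop = total_pop / len(population)
    # Population-weighted variance of the raw rates around the overall rate.
    s2 = sum(p * (r - overall) ** 2 for r, p in zip(raw, population)) / total_pop
    # Estimated between-area variance, floored at zero (an assumption of this sketch).
    between = max(s2 - overall / mean_pop, 0.0)
    smoothed = []
    for r, p in zip(raw, population):
        weight = between / (between + overall / p) if between > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return raw, smoothed

# Invented data: two small areas and two large ones.
cases = [2, 30, 55, 1]
population = [200, 10000, 10000, 500]
raw, smoothed = eb_smooth(cases, population)
for r, s, p in zip(raw, smoothed, population):
    print(p, round(r * 1000, 2), round(s * 1000, 2))   # rates per 1000
```

A “local” variant would replace the overall rate with the rate for each area's neighbourhood, which requires an adjacency structure not shown here.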
Note that it is also possible to adapt the kernel estimation ideas discussed earlier for use in exploratory analyses of area data; this has been exploited to great effect in the electronic atlas of mortality in Italy (Cislaghi et al., 1995). Several researchers have made use of measures of spatial (auto)correlation in describing the patterning of disease incidence among a set of areal units (see, for example, Glick, 1982; and Lam, 1986). Various researchers have added such tools to GIS. But one critique of spatial autocorrelation statistics is that they describe properties of the map as a whole; they are global rather than local statistics. Researchers such as Anselin (1995) and Ord and Getis (1995) have encouraged the incorporation of LISA (local indicators of spatial association) into GIS.
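The distinction between a global autocorrelation statistic and its local counterparts can be made concrete with a small sketch. It assumes Moran's I with a binary adjacency matrix and local values defined so that they sum to the global statistic times the total weight; the four-area example and the function name are invented.

```python
def morans_i(values, weights):
    """Return (global Moran's I, list of local Moran values) for a weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                       # deviations from the mean
    s0 = sum(sum(row) for row in weights)                # total weight
    m2 = sum(zi * zi for zi in z) / n                    # second moment
    local = [
        z[i] / m2 * sum(weights[i][j] * z[j] for j in range(n))
        for i in range(n)
    ]
    return sum(local) / s0, local

# Four areas in a row; each is adjacent to its immediate neighbours only.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend_rates = [1.0, 2.0, 3.0, 4.0]        # similar values sit next to each other
alternating_rates = [1.0, 4.0, 1.0, 4.0]  # neighbours are dissimilar

print(morans_i(trend_rates, adjacency))
print(morans_i(alternating_rates, adjacency))
```

The global value summarises the whole map in one number, whereas the local values show which areas contribute most to it.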
Figure 12.2 The time geography, and risk factors of an imaginary family (after Schaerstrom, 1996)
12.2.3 Modelling spatial data in epidemiology
In using point data in geographical epidemiology a common emphasis has been either on detecting clustering of disease, or on identifying “clusters”. But a key problem for geographical or environmental epidemiology is to conduct more “focused” studies (Besag and Newell, 1991), where the aim is to test hypotheses about possible raised incidence of disease around suspected point or linear sources of pollution (such as incinerators and nuclear power plants, or high voltage power lines and busy main roads). While much of the exploratory research can be conducted without a proprietary GIS (rather, with software for interactive spatial data analysis) it is surely in the field of modelling that GIS can make a real contribution. This is because we frequently have to link epidemiological data to other databases, concerned with air or water quality, for example. Dramatic examples of the possibilities are provided by those researchers engaged in trying to predict and control the incidence of malaria in parts of Africa, such as the Gambia (Thomson et al., 1996) and KwaZulu-Natal (Sharp and Le Sueur, 1996). The first of these studies demonstrates the potential for using coarse-resolution satellite imagery, and derived NDVI measurements, in modelling malaria transmission; the second shows how global positioning systems can be used to record the locations of 35,000 homesteads at risk from malaria, in relation to clinic catchments; in so doing, this work anticipates later discussions of links between traditional epidemiology and health care planning. Much of this modelling work proceeds under the assumption that proximity to, or distance from, such putative sources acts as a reasonable marker of exposure. For example, Diggle et al.
(1990) demonstrated raised incidence of larynx cancer around the site of a former industrial waste incinerator, by fitting a spatial statistical model to data on cases and controls (see Gatrell and Rowlingson, 1994 for comments on linking
this to a proprietary GIS, and Elliott et al., 1992b for tests of the hypothesis around other incinerators). One important issue, however, is to control for “confounders”, or other variables that may themselves be associated with proximity to the source(s) being examined. For example, we should where possible control for smoking behaviour, or for socio-economic status, since it may be that any demonstrable elevated risk near such sources is due to these factors, and not to any emissions from the plant (Diggle and Rowlingson, 1994). For areal data there is a substantial literature on modelling disease incidence, with regression-type models used to explain incidence in terms of available covariates. The important point to note here is the need to recognise spatial dependence among the set of areal units; they are not “n” independent observations. The statistical issues are discussed in Haining (1990) and in Bailey and Gatrell (1995), among other texts, while applications are presented by Elliott et al. (1992). More specifically, several authors have built upon the earlier, exploratory Bayesian analysis to build generalised linear models with fixed and random effects. For example, Lopez-Abente (1998) has included covariates such as the application of insecticides, in an ecological analysis of cancers among Spanish provinces. As in exploratory analyses, issues of exposure come to the fore in modelling disease incidence. Suppose we wish to model the incidence of respiratory disease in the vicinity of main roads, or to model the distribution of odour complaints around a hazardous waste site (Lomas et al., 1990). We can use a GIS to define areas of risk, by placing buffer zones around such roads, maybe varying the width of these zones to reflect estimated traffic densities. 
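The buffer-zone operation described above can be sketched without a GIS. The example assumes straight road segments in planar coordinates and a buffer half-width per road chosen to reflect traffic density; all coordinates, widths and function names are invented.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment joining a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_any_buffer(p, roads):
    """True if p lies within the buffer of at least one road segment."""
    return any(point_segment_distance(p, a, b) <= width for a, b, width in roads)

# Each road: (start, end, buffer half-width in km); the busier road gets the wider buffer.
roads = [
    ((0.0, 0.0), (10.0, 0.0), 0.20),
    ((0.0, 5.0), (10.0, 5.0), 0.05),
]
cases = [(5.0, 0.15), (5.0, 5.10), (5.0, 5.04), (12.0, 0.0)]

exposed = [p for p in cases if in_any_buffer(p, roads)]
print(len(exposed), "of", len(cases), "cases fall inside a buffer")
```

Counting controls in the same way gives the comparison needed to judge whether the number of cases inside the buffers is unusual, subject to the caveats about confounders and exposure raised in the text.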
There are important epidemiological issues to address in any analysis, such as the need to allow for confounders (is incidence high along busy roads because damp housing is found there too, for example?) and to recognise that indoor exposure to pollutants may be equally serious, if not more so. But the complexity of individuals’ activity spaces is an issue again. And in the absence of detailed measurements of air quality we are forced to rely on modelling likely exposure, for example by using kriging or other interpolation techniques to provide estimates of exposure over space (Collins et al., 1995). A further question is the extent to which there are other adequate surrogates of exposure. Is traffic density, or even the density of the road network, adequate in studies of respiratory morbidity; do the explanatory gains of using monitored or modelled air pollution outweigh the costs and complexity involved in collecting the data?
12.3 GIS AND HEALTHCARE DELIVERY
So far, we have considered the role that analytic GIS can play in an understanding of the geographical incidence of disease. We turn now to its possible role in planning the configuration and delivery of health services, following this by considering how GIS can help an examination of variations in accessibility and uptake of services. The two are closely linked, though one can be thought of more in terms of the provider’s perspective, the other more as a consumer perspective.
12.3.1 Planning health services
Comparing healthcare systems and delivery among many countries, both in the developed and developing worlds, reveals that a primary healthcare focus is being increasingly adopted, with a concomitant acknowledgement that planning has to be on a local scale. In the developing world this means that healthcare delivery has moved away from investment in “prestige”, hospital-based facilities and more
Figure 12.3 Dominant flows of patients to General Practitioners in West Sussex (after Bullen et al., 1996)
towards small-scale, community-based clinics that meet better the needs of the population. This local focus is mirrored in parts of the developed world. For example, in Britain, where the “purchase” of health care (by general practitioners and health authorities, on behalf of their populations) is separated from those who provide it (hospital and community services) there have been moves towards “locality commissioning”. This requires purchasers to define localities, about which detailed information on demography and morbidity is required in order to identify likely needs and demands for health care. Bullen et al. (1996) have demonstrated the usefulness of GIS in defining such localities in West Sussex, on the south coast of England. One particularly novel idea was to incorporate individuals’ own definitions of neighbourhood into the planning process; 500 such neighbourhoods were digitised, then rasterised in order to form a count of the number of times a cell formed part of a local neighbourhood. Dominant patient flows (to GP surgeries) were also used in the planning process (Figure 12.3). Having defined small areas for which health care purchasing is required, how are we to assess the needs of people living there? There is a long tradition, certainly in Britain, of using census data, and socio-economic classifications derived from such data, in order to characterise small areas. “Geodemographics”, the origin of which lies in target marketing, has been used to attach lifestyle descriptions (such as “affluent achievers”, “thriving greys” or “hard-pressed families” in the CDMS Super Profile system; see Brown et al., 1991) to postcodes (representing on average 15 properties). In this way, the proportions of locality populations in each lifestyle class, or the association of lifestyle with mortality or morbidity data, can be obtained (Brown et al., 1995).
Other research (Hennell et al., 1994) has shown how a standardised morbidity measure can be estimated for such lifestyle classes and attached to individuals, yielding a synthetic illness score for the practice with which patients are registered. Such a score can be used, for example, to predict expenditure on prescriptions.
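The idea of a synthetic practice score can be sketched as follows. The sketch assumes the score is simply the average of the class-level morbidity values attached to each registered patient; that averaging rule, the morbidity values and the practice names are assumptions made for illustration and are not taken from the study cited.

```python
def practice_scores(patients, class_morbidity):
    """Average the class-level morbidity value over each practice's registered patients."""
    totals, counts = {}, {}
    for practice, lifestyle in patients:
        totals[practice] = totals.get(practice, 0.0) + class_morbidity[lifestyle]
        counts[practice] = counts.get(practice, 0) + 1
    return {practice: totals[practice] / counts[practice] for practice in totals}

# Hypothetical standardised morbidity values per lifestyle class (100 = average).
class_morbidity = {
    "affluent achievers": 80.0,
    "thriving greys": 95.0,
    "hard-pressed families": 130.0,
}

# Each record: (practice, lifestyle class assigned from the patient's postcode).
patients = [
    ("Practice A", "affluent achievers"),
    ("Practice A", "thriving greys"),
    ("Practice B", "hard-pressed families"),
    ("Practice B", "hard-pressed families"),
    ("Practice B", "thriving greys"),
]

print(practice_scores(patients, class_morbidity))
```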
12.3.2 Accessibility, utilisation and outcome
An assessment of the accessibility of populations to public facilities has long been a subject for geographical enquiry, with Godlund’s classic work on Swedish hospitals being one of the earliest studies (Godlund, 1961). Studies are now appearing that are similar in spirit to what Godlund did without the benefit of GIS technology. For example, in Illinois, Love and Lindquist (1994) have taken 11,000 census block centroids in the state, together with census data on the distribution of the elderly, and linked this to the coordinates of 214 hospitals in order to determine the accessibility of that population to such hospitals. Using simple Euclidean distance they show that 80 percent of the elderly are within about 8 km of a hospital, and 60 percent within 8 km of two hospitals; however, those living outside major urban areas differ substantially in their accessibility from urban residents. Whether more sophisticated measures of accessibility than straight-line distance are valuable is a question that does not appear to have been addressed, though with the advent of digitised road network databases it is, as Love and Lindquist observe, feasible to use network distance or estimated drive times to compute accessibility scores. In a pilot study in south-east England, Gatrell and Naumann (1992) took road network data provided by the cartographic company Bartholomew, used assumed speeds of travel during peak and off-peak hours and thereby assigned journey times to arcs of the network. With data on the locations of accident and emergency facilities, linked to census data, they were able to assess which areas were more than a fixed travel time from A&E sites. But the real value here is the ability to use the GIS as a spatial decision support tool, evaluating accessibility under a range of scenarios; what happens, for example, if a particular site closes?
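The scenario question posed above can be illustrated with a minimal sketch. It assumes straight-line distance between population-weighted centroids and facilities, rather than the network travel times used in the studies described; the coordinates, populations, site names and function name are invented.

```python
import math

def share_within(blocks, hospitals, threshold_km):
    """Share of population whose nearest hospital lies within threshold_km (straight line)."""
    total = sum(pop for _, _, pop in blocks)
    covered = 0
    for x, y, pop in blocks:
        nearest = min(math.hypot(x - hx, y - hy) for hx, hy in hospitals.values())
        if nearest <= threshold_km:
            covered += pop
    return covered / total

# Each block: (x km, y km, population); hospitals keyed by a hypothetical site name.
blocks = [(0.0, 0.0, 100), (3.0, 4.0, 200), (10.0, 0.0, 50), (20.0, 20.0, 150)]
hospitals = {"Site A": (0.0, 0.0), "Site B": (10.0, 1.0), "Site C": (20.0, 18.0)}

print("all sites open:", share_within(blocks, hospitals, 5.0))

# Scenario: close one site and recompute coverage.
remaining = {name: xy for name, xy in hospitals.items() if name != "Site C"}
print("Site C closed:", share_within(blocks, remaining, 5.0))
```

Replacing the distance function with network journey times would bring the sketch closer to the approach described in the text, at the cost of needing a road network database.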
Similar research, investigating the need for additional cancer units in north-west England, has been reported by Forbes and Todd (1995). For the GIS specialist this kind of work raises interesting questions. Gatrell and Naumann (1992) make the point that results are sensitive to the resolution of the road database used. Ball and Fisher (1994) observe that we cannot legitimately speak of a single catchment around a hospital or clinic; such catchments can only be probabilistic rather than deterministic. And from a substantive, rather than technical, viewpoint, we need to ask the question: accessibility for whom? The work described here assumes a population that drives to hospital. I am unaware of published research that uses data on public transport availability to assess access to healthcare services among non-car users; surely it is not too much to ask to incorporate, for example, bus timetables into a GIS? The research reported above considers potential accessibility; but what of the utilisation or uptake of care? There is a substantial literature on variations or inequalities in uptake, though the use of GIS here has been negligible. As an example of what is possible, consider the problem of assessing the uptake of screening for breast cancer in south Lancashire (in north-west England). Suppose we wish to explain variations in uptake of screening among the set of general practices (physician clinics); why do some practices achieve uptake rates of perhaps 90 percent, while others only 50 percent? What role do catchment characteristics play? Given that patients have some choice in selecting their GP it is no easy matter to define such catchments (Haynes et al., 1995). But given the patient’s postcode we can assign her to an enumeration district (ED) and attach to her record data on the social deprivation of that ED in which she resides. 
Collecting together all patients registered with a particular practice we can obtain a crude average deprivation score for that practice. When we regress uptake against deprivation we find a clear relationship (Figure 12.4); moreover, when we add practice characteristics (such as whether or not it has at least one female partner), the level of explained variation increases significantly. Jones and Bentham (1995) have demonstrated the use of GIS in understanding the links between health outcomes and accessibility. In an examination of road traffic accidents in Norfolk, England between 1987 and 1991 they estimated the time taken for an ambulance to reach the accident and convey victims to A&E
Figure 12.4 Uptake of screening for breast cancer in South Lancashire, UK (solid dots are general practices without a female GP; crosses are those with at least one female GP)
departments. They modelled the likelihood of the victim being a fatality, as opposed to a serious injury, using the estimated travel time but also controlling for other factors, such as type of road, nature of accident, weather conditions, and age of victim. No relationship was found between outcome and journey time, so that survival did not appear to be affected by accessibility. Whether this finding extends to other parts of the world, where distances in rural areas are much greater, has yet to be demonstrated after adequate control for confounding variables. Such research is important, since it feeds into policy debates about the concentration of health services. For many reasons it may be sensible to plan services so that they are located in one large regional centre; but the impacts on those who are some distance from such services have yet to be fully evaluated, and there is much research using GIS (and linked to spatial interaction models) to be done here. For example, after suitable control for confounders, is cancer survival affected by relative location?
12.4 LINKING EPIDEMIOLOGY AND HEALTHCARE PLANNING: SPATIAL DECISION SUPPORT SYSTEMS
As noted earlier, within medical geography two major research areas are usually distinguished: one on geographical epidemiology, the other on health care planning. But we need to build bridges between these, and (to continue the metaphor) the structure required to do this can and should include a spatial decision support system. If analysis suggests there are serious health variations, and in particular localised health problems, then a need is identified for resources to be devoted to tackling such problems. A spatial decision support system provides the tools to do this, as the previous section implied. A striking example of this comes from the work referred to above on malaria incidence and control. Research by Sharp and le Sueur (1996) shows that small-scale maps that portray broad regional trends can
mask substantial, and epidemiologically significant, variations in incidence in small districts. More detailed maps allow authorities seeking to control malaria to focus strategies in areas where they are most needed; this spatial targeting of resources also contributes to cost-effectiveness. The research also highlights the need to recognise that many public health problems do not stop at national borders; a high percentage of the malaria cases in South Africa are imported from Mozambique, for example. With the rapid expansion of global travel, we do well to remember that new “spaces” created by flows of people serve as modern backcloths on which disease transmission is mapped (Gould, 1993). And we need to bear in mind that, having identified areas of high incidence of disease and illness, the goal of public health doctors should be to “change the map”, by seeing that health and other policies are implemented accordingly.
12.5 CONCLUSIONS
Much of this review has emphasised the importance of adding statistical analysis to GIS, though the previous section stressed the parallel theme of ensuring that a decision support component is available too. Having made these points, we do need to ask the question: who will use this extended GIS? Links to public health doctors and others over recent years have taught me not to overestimate their analytical requirements. Some of the tools discussed above are quite sophisticated, and are hardly likely to put in an appearance in health reports that have to satisfy the requirements of lay audiences; while such audiences may themselves be sophisticated, few will have a subtle feel for Bayesian estimation, kernel estimation and K-functions! When coupling such analytical tools to a GIS we need to provide plenty of guidance concerning their use. We also need to recognise, and emphasise, issues of data quality. It is no use putting such tools to work on poor data.
When dealing with clinical databases we need high standards of diagnostic accuracy. And the importance of this needs to be communicated to a lay public concerned about “cancer clusters”, for example, where different cancers may well have very different aetiologies and where awareness of perhaps two or three cases of “brain cancer” may mask the fact that one or more may be secondary tumours resulting from spread of the cancer from another, primary site. Data quality issues also arise in a primary care setting, where general practitioner databases may be out of date or inflated by patients who have left the area but who have yet to be removed from GP registers. In a European setting, we see vast differences in the availability of high resolution health data. For instance, Scandinavian countries have detailed, geocoded, individual level data that allow the researcher to track movements of individuals between residential and occupational settings. Yet the researcher in France is restricted to aggregated data at quite coarse levels of spatial resolution. Even where data are of high quality—for instance where we have accurate residential locations and histories of patients stretching back many years—we should acknowledge that such locations provide far from perfect measures of “exposure”. This issue is, for me, one of the most critical in GIS-based epidemiology. Finally, I think that those of us interested in GIS and health need to engage in more dialogue with those who approach health research from an alternative epistemological viewpoint. What can we do to answer those who criticise much applied GIS for its “surveillant eye” (Pickles, 1995), for distancing itself from those whose health it seeks to map, explore and model? I am much struck by the dedication in Anders Schaerstrom’s (1996) thesis: “To all those unfortunate people whose lives, sufferings and deaths are transformed to trajectories, dots and figures in scientific studies”. 
(For “transformed”, we might read “reduced”). Put simply, much GIS-based health research takes place within the context of a biomedical model, in which social, cultural and biographical settings are ignored. Can we do more to acknowledge lay perspectives, perhaps? A start could be made by creating and analysing spatial databases that have more to
do with the perception of ill-health rather than health data with “hard” end-points. Or, drawing on new concerns over “environmental equity” we could construct databases that deal with access to health-promoting resources (such as good, reasonably priced food, recreational facilities, traffic-free zones) rather than only access to secondary or tertiary health care. These are some of the challenges for GIS-based health research in the first few years of the twenty-first century.
REFERENCES
ANSELIN, L. 1995. Local indicators of spatial association-LISA, Geographical Analysis, 27, pp. 93–115.
ANSELIN, L. and GETIS, A. 1993. Spatial statistical analysis and geographic information systems, in Fischer, M.M. and Nijkamp, P. (Eds.) Geographic Information Systems, Spatial Modelling, and Policy Evaluation. Berlin: Springer-Verlag.
BAILEY, T.C. 1994. A review of statistical spatial analysis in geographical information systems, in Fotheringham, A.S. and Rogerson, P. (Eds.) Spatial Analysis and GIS. London: Taylor & Francis.
BAILEY, T.C. and GATRELL, A.C. 1995. Interactive Spatial Data Analysis. Harlow: Addison Wesley Longman.
BALL, J. and FISHER, P.P. 1994. Visualising stochastic catchments in geographical networks, The Cartographic Journal, 31, pp. 27–32.
BESAG, J.E. and NEWELL, J. 1991. The detection of clusters in rare diseases, Journal of the Royal Statistical Society, Series A, 154, pp. 143–155.
BHOPAL, R., DIGGLE, P.J. and ROWLINGSON, B.S. 1992. Pinpointing clusters of apparently sporadic Legionnaire’s disease, British Medical Journal, 304, pp. 1022–27.
BITHELL, J. 1990. An application of density estimation to geographical epidemiology, Statistics in Medicine, 9, pp. 691–701.
BREWER, C. 1994. Color use guidelines for mapping and visualisation, in MacEachren, A. and Taylor, D.R.F. (Eds.) Visualisation in Modern Cartography. Amsterdam: Elsevier.
BROWN, P.J.B., HIRSCHFIELD, A.F.G. and BATEY, P.W.J. 1991. Applications of geodemographic methods in the analysis of health condition incidence data, Papers in Regional Science, 70, pp. 329–44.
BROWN, P.J.B., TODD, P. and BUNDRED, P. 1995. Geodemographics and GIS in Small Area Demographic Analysis: Applying Super Profiles in District Health Care Planning, URPERRL, Department of Civic Design. Liverpool: University of Liverpool.
BRUNSDON, C. and CHARLTON, M. 1995. Developing an exploratory spatial analysis system in XLisp-Stat, Proceedings of GISRUK ’95. London: Taylor & Francis.
BULLEN, N., MOON, G. and JONES, K. 1996. Defining localities for health planning: a GIS approach, Social Science and Medicine, 42, pp. 801–816.
CISLAGHI, C., BIGGERI, A., BRAGA, M., LAGAZIO, C. and MARCH, M. 1995. Exploratory tools for disease mapping in geographical epidemiology, Statistics in Medicine, 14, pp. 2663–2682.
COLLINS, S., SMALLBONE, K. and BRIGGS, D. 1995. A GIS approach to modelling small area variations in air pollution within a complex urban environment, in Fisher, P. (Ed.) Innovations in GIS 2. London: Taylor & Francis.
DIGGLE, P.J. and ROWLINGSON, B.S. 1994. A conditional approach to point process modelling of elevated risk, Journal of the Royal Statistical Society, Series A, pp. 433–40.
DIGGLE, P.J. and CHETWYND, A.D. 1991. Second-order analysis of spatial clustering for inhomogeneous populations, Biometrics, 47, pp. 1155–1163.
DIGGLE, P.J., GATRELL, A.C. and LOVETT, A.A. 1990. Modelling the prevalence of cancer of the larynx in part of Lancashire: a new methodology for spatial epidemiology, in Thomas, R. (Ed.) Spatial Epidemiology. London: Pion.
DORLING, D. 1994. Cartograms for visualising human geography, in Hearnshaw, H.J. and Unwin, D.J. (Eds.) Visualization in Geographical Information Systems. Chichester: John Wiley.
ELLIOTT, P., CUZICK, J., STERN, R. and ENGLISH, R. 1992a. Geographical and Environmental Epidemiology. Oxford: Oxford University Press.
ELLIOTT, P., HILLS, M., BERESFORD, J., KLEINSCHMIDT, I., JOLLEY, D., PATTENDEN, S., RODRIGUES, L., WESTLAKE, A. and ROSE, G. 1992b. Incidence of cancer of the larynx and lung near incinerators of waste solvents and oils, Lancet, 339, pp. 854–858.
FORBES, H. and TODD, P. 1995. Review of Cancer Services: North West Regional Health Authority, URPERRL, Department of Civic Design. Liverpool: University of Liverpool.
FORSTER, F. 1963. Use of a demographic base map for the presentation of areal data in epidemiology, British Journal of Preventive and Social Medicine, 20, pp. 165–171.
GATRELL, A.C. and NAUMANN, I. 1992. Hospital Location Planning: A Pilot GIS Study, North West Regional Research Laboratory. Lancaster: Lancaster University.
GATRELL, A.C. and ROWLINGSON, B.S. 1994. Spatial point process modelling in a GIS environment, in Fotheringham, A.S. and Rogerson, P. (Eds.) Spatial Analysis and GIS. London: Taylor & Francis.
GATRELL, A.C. and BAILEY, T.C. 1996. Interactive spatial data analysis in medical geography, Social Science and Medicine, 42(6), pp. 843–855.
GATRELL, A.C., BAILEY, T.C., DIGGLE, P.J. and ROWLINGSON, B.S. 1996. Spatial point pattern analysis and its application in geographical epidemiology, Transactions, Institute of British Geographers, 21, pp. 256–274.
GATRELL, A.C. and LÖYTÖNEN, M. (Eds.) 1998. GIS and Health. London: Taylor & Francis.
GLICK, B.J. 1982. The spatial organisation of cancer mortality, Annals of the Association of American Geographers, 72, pp. 471–481.
GODLUND, S. 1961. Population, Regional Hospitals, Transport Facilities and Regions: Planning the Location of Regional Hospitals in Sweden, Lund Studies in Geography, Series B, No. 21, University of Lund, Sweden.
GOODCHILD, M., HAINING, R., WISE, S. et al. 1992. Integrating GIS and spatial data analysis: problems and possibilities, International Journal of Geographical Information Systems, 6, pp. 407–423.
GOULD, P. 1993. The Slow Plague: A Geography of the AIDS Pandemic. Cambridge, MA: Blackwell.
HAINING, R. 1990. Spatial Data Analysis in the Social and Environmental Sciences. Cambridge: Cambridge University Press.
HAYNES, R.M., LOVETT, A.A., GALE, S.H., BRAINARD, J.S. and BENTHAM, C.G. 1995. Evaluation of methods for calculating census health indicators for GP practices, Public Health, 109, pp. 369–374.
HENNELL, T., KNIGHT, D. and ROWE, P. 1994. A Pilot Study into Budget-Setting Using Synthetic Practice Illness Ratios (SPIRO Scores) Calculated from “Super Profiles” Area Types, URPERRL Working Paper 43, Department of Civic Design. Liverpool: University of Liverpool.
JONES, A.P. and BENTHAM, G. 1995. Emergency medical service accessibility and outcome from road traffic accidents, Public Health, 109, pp. 169–177.
JONES, K. and MOON, G. 1987. Health, Disease and Society. London: Routledge.
KELSALL, J. and DIGGLE, P.J. 1995. Nonparametric estimation of spatial variation in relative risk, Statistics in Medicine, 14, pp. 2335–2342.
LAM, N. 1986. Geographical patterns of cancer mortality in China, Social Science and Medicine, 23, pp. 241–247.
LANGFORD, I. 1994. Using empirical Bayes estimates in the geographical analysis of disease risk, Area, 26, pp. 142–9.
LOMAS, T., KHARRAZI, M., BROADWIN, R., DEANE, M., SMITH, M. and ARMSTRONG, M. 1990. GIS in public health: an application of GIS Technology in an epidemiological study near a toxic waste site, Proceedings, Thirteenth Annual ESRI User Conference. Redlands, CA: E.S.R.I.
LOPEZ-ABENTE, G. 1998. Bayesian analysis of emerging neoplasms in Spain, in Gatrell, A.C. and Löytönen, M. (Eds.) GIS and Health. London: Taylor & Francis.
LOPEZ-ABENTE, G. et al. 1995. Atlas of Cancer Mortality and Causes in Spain. http://www.ucaa.es/hospital/atlas/introdui.html
LOVE, D. and LINDQUIST, P. 1994. The geographical accessibility of hospitals to the aged: a geographic information systems analysis within Illinois, Health Services Research, 29, pp. 627–651.
MARTIN, D. and BRACKEN, I. 1991. Techniques for modelling population-related raster databases, Environment and Planning A, 23, pp. 1069–1075.
MILLER, H.J. 1991. Modelling accessibility using space-time prism concepts within geographical information systems, International Journal of Geographical Information Systems, 5, pp. 287–301.
OLIVER, M.A., MUIR, K.R., WEBSTER, R., PARKES, S.E., CAMERON, A.H., STEVENS, M. and MANN, J.R. 1992. A geostatistical approach to the analysis of pattern in rare disease, Journal of Public Health Medicine, 14, pp. 280–289.
OPENSHAW, S., CHARLTON, M., WYMER, C. and CRAFT, A. 1987. A Mark 1 geographical analysis machine for the automated analysis of point data sets, International Journal of Geographical Information Systems, 1, pp. 335–358.
ORD, J.K. and GETIS, A. 1995. Local spatial autocorrelation statistics: distributional issues and an application, Geographical Analysis, 27, pp. 286–306.
PICKLES, J. (Ed.) 1995. Ground Truth: The Social Implications of Geographical Information Systems. New York: Guilford Press.
RIISE, T. et al. 1991. Clustering of residence of multiple sclerosis patients at age 13 to 20 years in Hordaland, Norway, American Journal of Epidemiology, 133, pp. 932–939.
RUSHTON, G., ARMSTRONG, M.P., LYNCH, C. and ROHRER, J. 1996. Improving public health through Geographical Information Systems: an instructional guide to major concepts and their implementation, Department of Geography, University of Iowa, CD-ROM.
SCHAERSTROM, A. 1996. Pathogenic Paths? A Time Geographical Approach in Medical Geography. Lund: Lund University Press.
SELVIN, S., MERRILL, D.W. and SACKS, S. 1988. Transformations of maps to investigate clusters of disease, Social Science and Medicine, 26, pp. 215–221.
SHARP, B.L. and LE SUEUR, D. 1996. Malaria in South Africa: the past, the present and selected implications for the future, South African Medical Journal, 86, pp. 83–89.
THOMAS, R. 1992. Geomedical Systems: Intervention and Control. London: Routledge.
THOMSON, M.C., CONNOR, S.J., MILLIGAN, P. and FLASSE, S. 1996. The ecology of malaria as seen from earth observation satellites, Annals of Tropical Medicine and Parasitology, 90, pp. 243–264.
Chapter Thirteen
The Use of Neural Nets in Modelling Health Variations: The Case of Västerbotten, Sweden
Örjan Pettersson
13.1 INTRODUCTION
Regional change and uneven development are terms familiar to most geographers. These research areas have received and are still receiving attention at global, regional and local scales (for a short review, see Schoenberger, 1989; Smith, 1989). Even though most research has focused on economic change, often measured as growth/decline in GDP, there have also been studies concerned with other aspects of regional change, such as demography, environment and welfare in a broader perspective (Dorling, 1995; Morrill, 1995; Pacione, 1995). Furthermore, there is an extensive literature specialising in medical geography and spatial epidemiology (Gould, 1993; Kearns, 1996; Mayer, 1990). Attempts have been made to identify underprivileged or deprived areas (Jarman, 1983, 1990; Pacione, 1995) and in recent years there has been a renewed interest in geographical literature concerning issues of spatial equity, unfair distribution and justice (Hay, 1995; Smith, 1994). This chapter deals with the substantial differences in the status of public health among the populations living in 500 residential areas in the county of Västerbotten, Sweden. A neural net approach is applied in order to explore these health variations.
13.1.1 Contemporary Sweden
By international standards welfare in Sweden is high and relatively evenly distributed. The Swedish welfare model has reduced social and spatial differences in living conditions within the country. However, the trend changed in the late 1980s. Within a few years national unemployment rates rose to levels that were unprecedented in post-war Sweden. Although there are some signs of recovery in the economy, there is still great uncertainty as to whether or not there will be any substantial reduction in unemployment rates (SOU, 1995).
This new labour market situation has major implications for individuals and households, but no clear picture has yet emerged regarding how these changes affect welfare and health distribution between social groups and regions. Although Sweden is characterised by relatively small regional imbalances, there are certain differences in living conditions between different parts of the country (Svenska Kommunforbundet, 1994). Little attention has been paid, however, to the obvious differences among the populations living in different parts of cities and municipalities.
Figure 13.1: The county of Västerbotten; fifteen municipalities and 500 microregions. The areas with high ill-health rates (65 days or more) are shaded.
Uneven distribution has usually been seen as a regional problem, mainly concentrated in the interior parts of northern Sweden or in specific localities hit by a crisis when a large plant has been shut down, and the traditional Swedish regional policy has almost exclusively been aimed at such areas and localities. Recently, there has also been some concern about metropolitan suburbs with a large proportion of immigrants (National Board of Health and Welfare, 1995). Persson and Wiberg (1995) maintain that the last few years have seen a shift towards increasing spatial differences in Swedish society and they also anticipate that such growing inequalities are first to be observed at the micro-regional level, i.e. within counties and municipalities. A recent empirical study has shown that differences in both living conditions and public health can be substantial within an ordinary Swedish county (Pettersson et al., 1996).
13.1.2 Aims of the study
This chapter will focus on the observed differences in health status among the populations living in different parts of the county of Västerbotten in northern Sweden (Figure 13.1). In this chapter a measure called “ill-health rate” (ohälsotal) will be used as an indicator of the population’s health status. The ill-health rate consists of the population’s average number of days absent from work. The measure will be further specified and discussed later in the chapter. Hypothetically, the analysis has to face the presence of non-linear interaction between indicators at different spatial scales. Such “local” pockets of interaction are difficult to pin-point with explicit a priori hypotheses. As an alternative to regression analysis the methodology of supervised artificial neural nets will be applied to the problem of identifying a relationship between sick-leave and the morphology of
demographic, social and economic indicators at different spatial scales. This method is supposed to reveal the relevant explanatory patterns in social and physical space. First it is necessary to give a description of the studied county and to summarise some of the findings from earlier stages in the research project (Pettersson et al., 1996).
13.2 THE STUDY AREA
The county of Västerbotten consists of 15 municipalities in northern Sweden, covering approximately 55,000 km² and extending from the shores of the Gulf of Bothnia to the mountainous border with Norway. Most of the 250 000 inhabitants are concentrated in the coastal areas, especially in the towns of Umeå and Skellefteå. With its 100 000 citizens Umeå is the biggest and fastest growing municipality in northern Sweden and it serves as the regional centre with a university and a regional hospital. Skellefteå is more of an industrial centre and dependent upon small-scale manufacturing industry. During the last few years the differences between these two major localities have increased. While Umeå is growing rapidly, Skellefteå exhibits clear signs of decline. The interior parts of Västerbotten are very sparsely populated (1–3 persons per km²) and are considered to be among the traditional problem areas in Swedish regional policy.
13.2.1 The microregional approach
A pilot study (with reference to what has previously been done in Sweden) was conducted to analyse emerging new patterns of inequality in an exploratory way (Pettersson et al., 1996). In order to shed light upon the substantial spatial differences in living conditions and the hypothesis about increasing inequalities (Persson & Wiberg, 1995) it was thought necessary to employ data with a higher degree of spatial resolution than municipalities (Can, 1992).
Even though the investigation was conducted for a single county there are good reasons to believe that most of the spatio-temporal changes that could be observed in Västerbotten are, to a large extent, also valid for the rest of Sweden. The microregional approach left us with two main choices: either to use electoral wards or to use NYKOs (nyckelkodsområden), which are the municipalities’ own statistical subdivisions (usually more detailed than electoral wards). Both divisions are considered to represent relatively homogeneous housing environments. However, the NYKO-system involves a number of practical problems as there are no national rules regarding how the municipalities are supposed to make these subdivisions and only a few, mostly big, municipalities have actually produced digital maps according to NYKO-boundaries. The final decision was to employ electoral wards for the rural municipalities and NYKO for the three largest municipalities. By making use of NYKO for some of the municipalities, it would be easier to trace local pockets of deprivation within the major localities. The final base map (Figure 13.1) was made up of approximately 500 geographical entities. Some area units are very large while others are very small, and most microregions contain between 50 and 1700 inhabitants. Since “Statistics Sweden” adopts a principle of suppressing information on areas with few inhabitants there was some loss of information in microregions with fewer than 50 residents. With conventional choropleth maps there is an obvious risk that physically large areas dominate the visual impression. Furthermore, small and often densely populated areas are obscured. In this case the latter problem is of less importance since the urban areas in the county of Västerbotten are usually characterised by low ill-health rates.
A set of census indicators was obtained from ‘Statistics Sweden’ and when running univariate analyses, many indicators revealed dramatic differences within the county. Some variables showed expected spatial patterns with manifested urban-rural continuum characteristics. Other indicators resulted in a kind of microregional mosaic with patterns that were far more complex and difficult to interpret. The ill-health rate was one of them (Figure 13.1). A simple index showed that many microregions were exposed to multiple deprivation. Most of these were found in rural areas, but there were also underprivileged areas unexpectedly close to the coast and within the towns. Between 1985 and 1992 there was a substantial decline in employment intensity, but this change appeared to affect most microregions to the same extent. In terms of disposable income there was a tendency towards convergence, whereby the populations in the relatively poor areas experienced a significant rise in purchasing power. In contrast, the households in affluent areas suffered a decline in disposable income. This development contradicts the hypothesis suggesting increased spatial inequalities at the microregional level.
13.2.2 Clusters of microregions
A cluster analysis was performed in order to reduce the 500 microregions into a manageable number of groupings with similarities regarding certain variables, such as residential characteristics and indicators of material living conditions and health (Pettersson, 1996). The cluster analysis was performed with Ward’s method and six indicators. A seven-cluster solution provided groupings with well-defined characters, ranging from densely populated residential areas within the major localities to very remote rural wards with many elderly inhabitants. The clustering procedure resulted in a spatial mosaic with marked urban-rural tendencies.
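The agglomeration rule behind Ward’s method can be sketched in a few lines. This is an illustrative toy, not the procedure or software used in the study: the function name `ward_clusters` is hypothetical, the data are invented, and two indicators and two clusters stand in for the six indicators and seven clusters reported above. In practice the indicators would normally be standardised first.

```python
def ward_clusters(points, n_clusters):
    """Agglomerate points until n_clusters remain, using Ward's criterion.

    At each step the two clusters whose merger gives the smallest increase
    in within-cluster sum of squares are joined. Returns a sorted list of
    clusters, each a sorted list of indices into `points`.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged.
        (ma, ca), (mb, cb) = a, b
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(ma) * len(mb) / (len(ma) + len(mb)) * dist2

    while len(clusters) > n_clusters:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        (ma, ca), (mb, cb) = clusters[i], clusters[j]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        del clusters[j]                      # j > i, so index i is unaffected
        clusters[i] = (ma + mb, centroid)
    return sorted(sorted(members) for members, _ in clusters)


# Invented indicator values for six microregions (two indicators each).
microregions = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),   # e.g. urban-like profiles
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),   # e.g. remote rural profiles
]
groups = ward_clusters(microregions, n_clusters=2)
```

The exhaustive pair search is cubic in the number of areas, which is tolerable for around 500 microregions but is chosen here for readability rather than speed.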
The highest ill-health rates were found in the remotest margins and in the rural areas, while the lowest values were found among the suburban areas. However, the variations in public health were still considerable within some of these clusters. There also seemed to be a positive relationship between general living conditions and public health in the residential areas.
13.3 THE ILL-HEALTH RATE
In this chapter the ill-health rate is used as an indicator of public health. The ill-health rate is defined as the average number of sick-leave days (or, more precisely, days absent from work due to illness). This measure has several advantages over alternative public health measures. One important argument is that the ill-health rate is a simple measure and also available for small area units, but there are also at least three important objections to the relevance of the ill-health rate. Firstly, it is restricted to only a part of the population, those between 16 and 64 years of age and working. Secondly, it can be affected by other factors not considered as having anything to do with the population’s health status; for instance, the ill-health rate changes with the generosity of the sickness benefit system. Thirdly, it is difficult to make direct comparisons with other countries. Besides ordinary sick-leave days, the measure also covers those who have obtained an early retirement. Early retirements and long-term illness contribute considerably to the ill-health rate and thereby make this measure sensitive to the health status of a relatively small part of the population, especially persons over 50 years of age. Sometimes a distinction is made between early retirements and more normal sick-leave days; however, since the data used here do not allow such a distinction, both categories will be considered as
arising from the same factors, even though it is easy to see that some of the indicators are primarily related to just one of them. Sick leave and early retirement have received considerable attention in national and international research. Marklund (1995) provides an overview of factors known to affect the number of sick-leave days. First it is necessary to point out that the importance of the individual’s own health is, of course, evident. In understanding the differences in public health between different regions the population’s structure regarding age and gender is of great importance. The elderly and females have more absences due to illness and the ill-health rate increases rapidly especially when approaching pensionable age (65 years of age). It is important though to make a distinction between the individual level and the aggregated level, in this case the microregion and groups of microregions. Most relationships can, in fact, only be given a relevant explanation at the individual level. For instance, it must be assumed that the influence on public health of age and gender depends upon whether the individual is aged and/or a female, and that these circumstances have no effect upon the health status of the rest of the population. Type of work and education are also of significant importance. Persons with manual or monotonous jobs have more sick-leave days. Higher educational level means fewer sick-leave days on average. Educational level is expected to be highly correlated with both the age structure of the population and type of work. The effect of the labour market situation seems to be unclear. It is important to emphasise that losing a job or being exposed to long-term unemployment is always considered to affect the individual’s health in a negative way.
The effect on the ill-health rate at the aggregated level could, however, be the reverse since it is possible that persons employed in a labour market characterised by high unemployment are more anxious to be visible at the place of work. In contrast, and more controversially, it has been claimed that one way of securing a stable income in an insecure labour market is to strive for an early retirement. Some investigations suggest a correlation between a troublesome labour market and high ill-health rates mostly resulting from large numbers of early retirements. Several studies have indicated a covariance between household structure and sick-leaves. One-person households and families with small children, in particular single parents, show higher averages than the population in general. A covariance between income and health has also been proposed even though it is difficult to establish in what direction the relationship goes. A Danish study (Bovin and Wandall, 1989) has concluded that people living in small municipalities or in rural areas have fewer sick-leave days than residents in large municipalities and cities. They suggest that one reason for this is that the social control in small societies makes it harder to stay home from work.
13.3.1 Public health in the county of Västerbotten
The county of Västerbotten exhibits ill-health rates well above the national average and within the county there are substantial deviations from the county average. At the municipality level there is a marked core-periphery pattern with high ill-health rates in the rural municipalities and relatively low rates in the coastal areas. The microregional approach unmasks a more complicated pattern with high ill-health rates in the middle of the county, but similar areas are also to be found even within the coastal municipalities and towns (Figure 13.1). Within Skellefteå, in particular, there are several microregions with ill-health rates well above the county average. There are other areas with unexpectedly low ill-health rates; for instance, many of the electoral wards close to the Norwegian border show much lower values than most neighbouring areas.
13.3.2 A priori hypotheses
It is obvious that a large proportion of these variations in public health between microregions can be explained by the age and gender of their populations, but there is also a need to establish whether other factors, especially social and economic circumstances, contribute to the ill-health rate. Once again it is necessary to emphasise that most of the relationships are only relevant at the individual level. However, it is expected that a demographic structure with many elderly men and women implies higher ill-health rates at the microregional level. Similarly, it is likely that a high proportion of one-person households and families with children, as well as microregions with a lower educational level, will show higher ill-health rates. The relationship with income situation is more difficult to handle. A negative relationship might imply that health is negatively affected by a troublesome economic situation, but it can also be argued that the income reduction is due to health problems. With the new and seemingly structural unemployment situation it is of particular importance to investigate the relationship between labour market and public health, even though it could be argued that a large proportion of the effect on the ill-health rate is not entirely health related. A negative relationship between employment levels and the ill-health rate implies that the health status of the individual or of the microregion’s population becomes impaired when exposed to long-term unemployment or an insecure labour market situation. On the other hand, a positive relationship would suggest that, as a measure of public health at the microregional level, the ill-health rate is sensitive to factors not considered to be health related. This indicator is one example showing that the effect on the individual’s sick-leaves could be contradictory, depending on whether one considers the individual or the aggregated level.
From a geographical point of view it is likely that distance to health care could be another important factor in explaining deviations in public health among the populations in different parts of the county. Primary health care is usually provided in the municipality centres, while the more advanced medical services are concentrated in the major localities and especially in the regional hospital in Umeå. Public health could also be related to the physical environment and to local cultural traditions regarding the consumption of food, alcohol and tobacco. Due to the lack of data such factors will largely be left out of this investigation.
13.4 ARTIFICIAL NEURAL NETWORKS
During recent years artificial neural net technology has penetrated many fields of scientific inquiry. What in the beginning was met with scepticism and disbelief (“a black box”) has today become an established methodology providing a tool with wide applicability (Hewitson and Crane, 1994). In this chapter neural nets will be used as an alternative to linear regression analysis in order to explore microregional variations in health status among the populations of nearly 500 residential areas in the county of Västerbotten. Since it is assumed that neural net technology is not completely unknown to most readers, only a very brief overview will be given. For a short and simple introduction, see Hinton (1992) or the first three chapters in Hewitson and Crane (1994). The final chapters in the latter book also provide some examples of successful applications to geographical problems. More extensive presentations of neural net technology can be found in Bishop (1995), Hertz et al. (1991) and Ripley (1994). The neural net looks for patterns in a set of data and “learns” them. This means that the trained network has the ability to classify new patterns correctly and to make predictions and forecasts. Furthermore, neural
Figure 13.2: A single neuron and a feedforward neural net with three layers.
nets are considered as being able to handle complex and non-linear interactions. Another advantage is the neural net’s capability of overcoming problems with noisy data or even missing values. A simple feedforward neural network architecture, such as the three-layer back-propagation net, consists of several building blocks. The basic element is the neuron (node or unit) shown in Figure 13.2. The node sums the weighted inputs from the connected links and an activation function decides whether the input signal to the neuron is powerful enough to “fire” a signal to the neuron(s) in the next layer(s). Most back-propagation neural networks are built with one input layer, one or more hidden layer(s) and one output layer. These layers consist of one or more neurons connected to neurons in other layers. There are two types of “learning”: supervised learning means that the neural net is provided with both input and output data, while in unsupervised learning the neural network is only given the input values. In this chapter, only supervised learning will be performed. The network “learns” (or trains) by gradually adjusting the interconnecting weights between the layers in order to reproduce the output pattern. The purpose of training is to construct a model that will generalise well upon an unseen set of data. There is a risk that the neural network finally starts memorising the data set, but this can be prevented by holding back a part of the data from the training set and checking the network’s performance on it.
13.5 DATA PREPARATION
In the above discussion concerning the ill-health rate we introduced some of the factors expected to contribute to variations in this public health measure. Since the data were originally obtained for a different purpose, the analysis was restricted to what was available in the data base. Nevertheless, it was possible to construct a set of variables with expected relevancy for this investigation (Table 13.1).
Most variables are census-based and relate to the population’s demographic composition and their socio-economic circumstances. Other indicators describe the settlement pattern.
Table 13.1: Variable list
Ill-health rate
Total population
Average age of total population
Proportion of inhabitants 50–64 years in the 16–64 age-group
Proportion of old inhabitants (over 75 years)
Proportion of population 20–64 years (economically active population)
Proportion of females 16–64 years
Proportion of one-person households
Proportion of single-parent households
Proportion of two-parent households
Proportion of flats in multi-dwelling buildings
Land area, km²
Number of inhabitants per km²
Distance to own municipality centre, km
Distance to regional centre (Umeå), km
Mean income from work (thousands of SEK)
Mean disposable income 1992 (thousands of SEK)
Mean disposable income 1985 (thousands of SEK)
Change in disposable income 1985–1992
Employment intensity 1992
Employment intensity 1985
Change in employment intensity 1985–1992
Proportion of population 16–64 years with compulsory school education
Proportion of population 16–64 years with integrated upper secondary school education
Proportion of population 16–64 years with post secondary school education
Proportion of population 16–64 years with more than 2 years of post secondary school education
Number of privately owned cars per inhabitant
It is worth emphasising that the data are aggregated and do not contain any information on whether actual relationships between indicators at the microregional level are valid at the individual level. This is the well-known problem of the ecological fallacy. As Statistics Sweden suppresses information on areas with small populations as a matter of principle, there was some loss of information. For this reason the investigation was restricted to areas with more than 50 inhabitants and, after this reduction, 439 microregions were available for the final analysis. Since the analysis was performed on area units of different sizes, and sometimes with small populations, there was a need to evaluate the importance of geographical scale. In order to shed some light on the scale effect the same set of variables was also computed for larger regions. These regions were constructed by summarising all values within a certain distance from each microregion’s centroid. This zoning procedure was repeated for radii of 5, 10, 20, 30 and 50 kilometres. In this way a matrix of 6 × 25 variables was obtained.
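The zoning procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in the study: the function name `zone_sums`, the planar coordinates in kilometres and the toy values are all hypothetical, and ratio variables such as proportions would have to be recomputed from summed counts rather than summed directly.

```python
import numpy as np

def zone_sums(centroids_km, values, radius_km):
    """For each microregion, sum `values` over all microregions whose
    centroid lies within `radius_km` of its own centroid (itself included)."""
    centroids_km = np.asarray(centroids_km, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise Euclidean distances between centroids (n x n matrix).
    diff = centroids_km[:, None, :] - centroids_km[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Neighbourhood matrix: 1.0 where a centroid falls inside the radius.
    within = (dist <= radius_km).astype(float)
    return within @ values

# Hypothetical toy data: four centroids (x, y in km) and their populations.
centroids = [(0.0, 0.0), (3.0, 4.0), (12.0, 0.0), (40.0, 0.0)]
population = [100, 200, 300, 400]

# The five radii named in the text; the microregion itself is the sixth level.
zoned = {r: zone_sums(centroids, population, r) for r in (5, 10, 20, 30, 50)}
```

With the toy data, the first two centroids are exactly 5 km apart, so they are pooled already at the smallest radius, while the remote fourth centroid is only merged with others at the larger radii.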
13.6 NEURAL NET–LINEAR REGRESSION
From a methodological point of view it was desirable to find out whether neural net technology could enhance our understanding of a complex phenomenon, such as public health, beyond what was possible with other multivariate techniques. In this chapter back-propagation neural nets are compared with the more traditional linear regression analysis.
13.6.1 Regression analysis
The first step was to conduct stepwise multiple linear regression at the microregional level (i.e. without any variables computed from the zoning procedure). The model (Equation 13.1) could be simplified to five significant variables without over-reducing the coefficient of multiple correlation (R²):
(13.1)
where:
Yr = Ill-health rate = average number of sick-leave days
X1 = Average age of total population
X2 = Proportion of inhabitants 50–64 years in the 16–64 age-group (%)
X3 = Proportion of single-parent households (percentage units)
X4 = Employment intensity 1992 (%)
X5 = Proportion of population 16–64 years with post secondary education (%).
The same procedure was repeated for the different zonings. Even though some “new” variables also showed significance it was possible to keep the same set of variables to explain the microregional public health status at all levels. The final step was to substitute some of the “microregional” indicators with the corresponding indicators from other spatial scales. It turned out that this could not be done without accepting a substantial reduction in the R²-value.
13.6.2 Neural network analysis
The second step was to perform a similar analysis using a neural network. Since there are no good criteria for including or excluding variables in the neural network models, much experimentation was required with different sets of indicators, network architectures and numbers of neurons in hidden layers. Finally, it was decided to employ a three-layer back-propagation network and it was also necessary to reduce the number of variables.
After some experimenting, the final choice was to use the same set of variables as in the final regression model described above, but with one-person households also included. This indicator was added because it made an important contribution to the final model and because it was also supported by other studies (Marklund, 1995). Due to the complexity of the neural net solution it is not possible to obtain coefficients of a regular equation as in regression models. It is possible, however, to compare each indicator’s relative importance to the model and to illustrate each variable’s partial effect upon the output. A sensitivity analysis was performed in order to discover the partial relationships between the input and output indicators. By feeding the trained network with slightly altered values for each variable and keeping
the rest of the indicators at the county average, it was possible to obtain the variable’s partial effect on the ill-health rate (Table 13.2). In most cases the partial relationship with the dependent variable is approximately linear and shows the same sign of coefficient as the regression equation. Nevertheless, the two variables concerning employment intensity and one-person households indicate non-linear features.
Table 13.2: Results from neural net analysis at the microregional level.
| Variable | Weight* | Partial relationship | Signs of coefficient |
| Average age of total population | 20.24 | Approx. linear, increasing | + |
| Proportion of one-person households | 19.72 | Non-linear | +, − |
| Employment intensity 1992 | 18.22 | Non-linear (see Fig. 13.3) | +, −, + |
| Proportion of single-parent households | 15.23 | Approx. linear, increasing | + |
| Proportion with post secondary school education | 13.62 | Approx. linear | − |
| Proportion of 50–64 years in the 16–64 age-group | 10.98 | Approx. linear | + |
* Weight = contribution factor = the sum of the absolute values of the weights leading from the single variable; a rough measure of the importance of a single variable in predicting the network’s output.
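The sensitivity analysis and the contribution factor can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the network is a stand-in with random, untrained weights, the inputs are taken to be standardised so that the county average is zero, and the function names are hypothetical. The contribution factor follows only the description in the table footnote (the sum of the absolute weights leading from each input); how the software used in the study scales it is not stated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained three-layer net: 6 inputs, 4 hidden
# neurons, 1 output. Real weights would come from back-propagation training.
W_hidden = rng.normal(size=(6, 4))   # input -> hidden weights
b_hidden = rng.normal(size=4)        # hidden biases
w_out = rng.normal(size=4)           # hidden -> output weights
b_out = 0.0                          # output bias

def predict(x):
    """Forward pass: weighted sums, tanh activation, linear output."""
    hidden = np.tanh(x @ W_hidden + b_hidden)
    return hidden @ w_out + b_out

def partial_effect(mean_inputs, index, grid):
    """Vary one input over `grid`, holding all others at their averages."""
    x = np.tile(np.asarray(mean_inputs, dtype=float), (len(grid), 1))
    x[:, index] = grid
    return predict(x)

def contribution_factors(weights):
    """Sum of absolute input -> hidden weights for each input variable."""
    return np.abs(weights).sum(axis=1)

county_average = np.zeros(6)          # standardised inputs: the mean is 0
grid = np.linspace(-2.0, 2.0, 9)      # standard deviations around the mean
curve = partial_effect(county_average, index=2, grid=grid)
factors = contribution_factors(W_hidden)
```

Plotting `curve` against `grid` would give the kind of partial-effect curve that the chapter reports for each variable; a curve that bends or changes direction corresponds to the non-linear entries in the table.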
Figure 13.3 shows the partial effect of changes in employment intensity. The straight line displays the corresponding partial effect within a linear regression model, while the curve visualises the neural net’s ability to find non-linear relationships. It is difficult, however, to find a simple interpretation of the observed non-linearity. Since almost all microregions have employment intensities above 56 per cent, the graph implies that the ill-health rate decreases with increasing employment intensity until the employment intensity rises above the county average (74 per cent), whereupon the ill-health rate increases with increasing employment intensity. However doubtful, this could indicate that there are at least two types of microregions. In the first type the employment intensity is below the county average. These areas are characterised by a declining labour market and a relatively large proportion of the population has actually obtained early retirement. In the second type of microregions the employment intensity is higher, but this also means that being visible at the place of work is less important. It is also possible that high levels of employment intensity result in persons with a poor health status, who under other circumstances would have been unemployed, being able to find a job. Similar explanations have also been proposed in Marklund (1995). The correlation with one-person households is harder to explain. It is likely that social relations are important to the individual’s health status, but whether household structure in this sense reflects social networks remains questionable. The non-linearity is even more difficult to interpret. For microregions with relatively few one-person households, the relationship with the dependent variable is positive; but in other microregions the correlation is negative.
Even though the data do not allow a distinction to be made between different types of one-person households, it is a known fact that while some microregions are dominated by elderly persons living alone, others are dominated by younger one-person households. Therefore it is possible that one-person households somehow interact with age in the neural network.
Figure 13.3: Partial effects of changes in employment intensity.
Another interesting characteristic of this variable is that it can be substituted with the same indicator for 10 kilometre zonings, without reducing the R²-value too much. This could be interpreted as meaning that the effect of one-person households is not isolated to the single microregion or individual, but rather that it is a structural phenomenon.
13.6.3 A comparison between regression and neural net models
Linear regression and neural nets give similar results with regard to the sets of variables in the models. It is also possible to compare the variables’ relative importance between different models. In the neural net model the average age of the population showed the highest contribution factor, while in the regression model the proportion of highly educated was the single most important variable (Table 13.3).
Table 13.3: Relative importance of variables in different models.
| Relative importance | Regression model: Beta coefficient* | Neural net model: Contribution factor |
| High | Proportion with post secondary school education | Average age of total population |
| | Average age of total population | One-person households |
| | Proportion 50–64 of 16–64 | Employment intensity 1992 |
| | Employment intensity 1992 | Proportion of single parents |
| | Proportion of single parents | Proportion with post secondary school education |
| Low | – | Proportion 50–64 of 16–64 |
| R²-value (%) | 56.9 | 65.4 |
| Mean abs. error | 9.9 | 9.0 |
* The beta coefficient is a rough measure of the variable’s relative importance in the regression model.
Seemingly the neural net stresses the importance of age and household structure, whereas the regression equation attaches greater significance to educational level and demographic variables. The neural net makes slightly better predictions, even though the model on average miscalculates the actual values by nine days. Both methods have difficulties in predicting extremely high and extremely low ill-health rates.
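The measures used in this comparison (beta coefficients, the R²-value and the mean absolute error) can be computed as in the following sketch. The data are synthetic and the function names are hypothetical; the only element borrowed from the chapter is the number of microregions (439) and of explanatory variables (five). The beta coefficient is taken here in its usual sense, the regression slope multiplied by the ratio of the standard deviations of the explanatory and dependent variables, which is assumed to match the definition behind Table 13.3.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns (intercept, slopes)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]

def beta_coefficients(X, y, slopes):
    """Standardised (beta) coefficients: slope * sd(x_j) / sd(y)."""
    return slopes * X.std(axis=0, ddof=1) / y.std(ddof=1)

def r_squared(y, y_hat):
    """Share of the variance in y reproduced by the predictions."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_abs_error(y, y_hat):
    """Average absolute difference between observed and predicted values."""
    return np.mean(np.abs(y - y_hat))

# Synthetic stand-in data: 439 "microregions", 5 explanatory variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(439, 5))
true_slopes = np.array([3.0, 2.0, 1.5, -1.0, -2.5])   # arbitrary values
y = 40.0 + X @ true_slopes + rng.normal(scale=4.0, size=439)

intercept, slopes = fit_linear(X, y)
y_hat = intercept + X @ slopes
betas = beta_coefficients(X, y, slopes)
print(f"R2 = {100 * r_squared(y, y_hat):.1f} %, MAE = {mean_abs_error(y, y_hat):.2f}")
print("Variables ranked by |beta|:", np.argsort(-np.abs(betas)))
```

The same `r_squared` and `mean_abs_error` functions can be applied to the predictions of any other model, which is what makes the two columns of the table comparable.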
Figure 13.4: The shaded areas show where the predictions from regression analysis (above) and neural network (below) underestimate the observed ill-health rates by more than 10 per cent
13.6.4 Prediction error maps
It is obvious that a large proportion of the explanation for the microregional differences in public health is left outside the applied models. A first step in analysing these deviations is to plot the microregions where the predictions fail substantially. From a public health perspective it is of special interest to study areas with higher ill-health rates than those predicted by the models (Figure 13.4). The maps show that both methods make similar mistakes in predicting the actual ill-health rates. There seems to be a slight clustering in certain parts of the county and this could indicate specific local public health problems. Usually these microregions have relatively high ill-health rates and this stresses the importance of further investigations in these areas. These microregions could also be primary targets for
Figure 13.5: R²-values for regression and neural net models with sets of independent variables from different zonings.
public health campaigns implemented by the County Health Organisation. Similar ideas have been proposed by Jarman (1990) and Jørgensen (1990).
13.6.5 The importance of geographical scale
Attempts to replace some of the variables at the microregional level with the same variables for another distance generally failed. Only when using the neural net and replacing the microregional one-person households with the same variable for zones within a radius of 10 kilometres was the reduction in the R²-value relatively small. In one specific data run this model outperformed the “purely” microregional model substantially, but this trial could never be reproduced. In my opinion this shows that neural nets sometimes find unstable solutions. This also indicates the need to be cautious with the results of “too successful” models when using neural nets. That the microregional differences in ill-health rate could best be explained with indicators at the same level is expected. However, it is also possible to explain a substantial part of the deviations with sets of variables at other levels. Figure 13.5 shows the results of computations using regression analysis and neural networks with the previously described sets of variables but for different zonings. The “purely microregional contribution” to the model is roughly 20–25 per cent. This microregional contribution effect could have two very different interpretations. The first one indicates that the microregional level is the correct level when trying to analyse the ill-health rate and that the microregion is the population’s “natural” residential environment (neighbourhood). The second interpretation is that the effect is due to the fact that many of the microregions have relatively few inhabitants, thereby making the microregional approach sensitive to individual variations. This tends to imply that the microregional level acts as a proxy for the individual level.
13.7 CONCLUDING REMARKS
The study was performed with relatively few indicators and it is likely that some changes would have improved the explanatory power of the models. Earlier in the chapter it was noted that type of work is an
important factor when analysing sick-leaves. Even though the educational level also reflects the labour market structure to some extent, one cannot disregard the fact that such an indicator would presumably have enhanced the model. It is also likely that a subdivision into ordinary sick-leave days and early retirements would have contributed to an improved analysis. Since there are obvious differences between men and women it is possible that a subdivision into gender, for at least some of the indicators, could have improved the final models. This does not necessarily mean that another network architecture and a changed number of neurons in hidden layers would not have been able to improve the model. The selection, or rather reduction, into a few important variables did seem to be of great importance. The problem with selecting the network architecture and the number of variables is also one of the major disadvantages when using neural nets, since there are no good criteria for including or excluding variables (at least the software program used, Neuro Shell 2, did not provide such a tool). Another disadvantage is that it is impossible to grasp how the independent variables interact within the neural network. On one occasion the network provided a solution that could not be repeated and this indicates that neural networks sometimes find unstable solutions. Nevertheless, the neural network does actually provide a model that makes better predictions at the microregional level. Although there were some obvious difficulties in applying a neural net to this problem, this does not mean that neural networks would “fail” when applied to similar problems. Furthermore, the example illustrates the ability of neural nets to detect non-linear relationships. However, this chapter also shows that neural net technology is not a panacea. 
Linear regression analysis provides simpler solutions, but they are less sophisticated than the ones given by neural networks. It is also possible that most relationships between the ill-health rate and the set of independent variables are linear and that this is also the reason why the neural net does not outdo regression analysis. In fact, the partial effects from most independent variables suggest a linear relationship. The analysis shows that individual variations in health status are very important in understanding variations in public health between different parts of the county. Even though a set of social and economic indicators was utilised at the microregional level, almost every significant variable could only be given a reasonable interpretation at the individual level. However, the neural net analysis also suggested other relationships. The non-linear feature of employment intensity could perhaps indicate, however doubtfully, that not only the effect stemming from the individuals being employed or unemployed is important, but that the effect of the microregional employment situation is also relevant when analysing deviations in public health. However, this needs further investigation.
ACKNOWLEDGEMENTS
The chapter is based upon a research project carried out at CERUM (Centre for Regional Science at Umeå University, Sweden) and was partly financed by the Västerbotten County Health Organisation. The author wishes to thank Einar Holm and Ian Layton (Department of Social and Economic Geography, Umeå University) and three anonymous referees.
REFERENCES
BISHOP, C.M. 1995. Neural Networks for Pattern Recognition. Oxford: Clarendon.
BOVIN, B. and WANDALL, J. 1989. Sygedage-fravaer blandt ansatte i amter og kommuner. Köpenhamn: AKFforlaget.
CAN, A. 1992. Residential quality assessment – alternative approaches using GIS, The Annals of Regional Science, 26, pp. 97–110.
DORLING, D. 1995. A New Social Atlas of Britain. Chichester: John Wiley.
GOULD, P. 1993. The Slow Plague: A Geography of the AIDS Pandemic. Cambridge, MA: Blackwell.
HAY, A.M. 1995. Concepts of equity, fairness and justice in geographical studies, Transactions of the Institute of British Geographers, 20(4), pp. 500–508.
HERTZ, J., KROGH, A. and PALMER, R.G. 1991. Introduction to the Theory of Neural Computation. Redwood City: Addison-Wesley.
HEWITSON, B.C. and CRANE, R.G. (Eds.) 1994. Neural Nets: Applications in Geography. Dordrecht: Kluwer.
HINTON, G.E. 1992. How neural networks learn from experience, Scientific American, September 1992, pp. 105–109.
JARMAN, B. 1983. Identification of underprivileged areas, British Medical Journal, 286, pp. 1705–1709.
JARMAN, B. 1990. Social Deprivation and Health Service Funding. Paper presented as a public lecture at Imperial College of Science, Technology and Medicine, University of London, 22 May 1990.
JØRGENSEN, S. 1990. Regional epidemiology and research on regional health care variations – differences and similarities, Norsk Geografisk Tidskrift, 44(4), pp. 227–235.
KEARNS, R.A. 1996. AIDS and medical geography: embracing the other? Progress in Human Geography, 20(1), pp. 123–131.
MARKLUND, S. (Ed.) 1995. Rehabilitering i ett samhällsperspektiv. Lund: Studentlitteratur.
MAYER, J.D. 1990. The centrality of medical geography to human geography: the traditions of geographical and medical geographical thought, Norsk Geografisk Tidskrift, 44(4), pp. 175–187.
MORRILL, R.L. 1995. Ageing in place, age specific migration and natural decrease, The Annals of Regional Science, 29, pp. 41–66.
NATIONAL BOARD OF HEALTH AND WELFARE 1995. Welfare and Public Health in Sweden 1994. Stockholm: Fritzes.
PACIONE, M. 1995. The geography of deprivation in rural Scotland, Transactions of the Institute of British Geographers, 20(2), pp. 173–191.
PERSSON, L.O. and WIBERG, U. 1995. Microregional Fragmentation: Contrasts Between a Welfare State and a Market Economy. Heidelberg: Physica-Verlag.
PETTERSSON, Ö. 1996. Microregional Fragmentation in a Swedish County. Paper presented at the 28th International Geographical Congress, The Hague, 4–10 August 1996.
PETTERSSON, Ö., PERSSON, L.O. and WIBERG, U. 1996. Närbilder av västerbottningar – materiella levnadsvillkor och hälsotillstånd i Västerbottens län. Regional Dimensions Working Paper No. 2, Umeå universitet: CERUM.
RIPLEY, B.D. 1994. Neural networks and related methods for classification, Journal of the Royal Statistical Society B, 56(3), pp. 409–456.
SCHOENBERGER, E. 1989. New models of regional change, in Peet, R. and Thrift, N. (Eds.), New Models in Geography, Volume I. London: Unwin Hyman, pp. 115–141.
SMITH, D.M. 1994. Geography and Social Justice. Oxford: Blackwell Publishers.
SMITH, N. 1989. Uneven development and location theory: towards a synthesis, in Peet, R. and Thrift, N. (Eds.), New Models in Geography, Volume I. London: Unwin Hyman, pp. 142–163.
SOU 1995. Långtidsutredningen 1995. Stockholm: Fritzes.
SVENSKA KOMMUNFÖRBUNDET 1994. Levnadsförhållanden i Sveriges kommuner. Stockholm: Svenska Kommunförbundet.
Chapter Fourteen
Interpolation of Severely Non-Linear Spatial Systems with Missing Data: Using Kriging and Neural Networks to Model Precipitation in Upland Areas
Joanne Cheesman and James Petch
14.1 INTRODUCTION
For strategic planning purposes, water authorities require accurate yield estimates from reservoirs; precipitation gauge interpolation results are therefore critical for providing areal precipitation estimates. However, the interpolation of precipitation amounts in remote, upland areas is one situation in which input data are severely unrepresentative. Precipitation gauge networks are usually of low density and uneven distribution with the majority of gauges located in the lowland regions of catchments. Results of using traditional interpolation techniques are seriously affected both by the complexity of theoretical data surfaces (Lam, 1983) and by the quality of data, especially their density and spatial arrangement. Typically, a standard interpolation technique will fail to model upland precipitation successfully, as the interpolation is likely to be based upon lowland gauges. The predominant influence of orography on the spatial distribution of precipitation throughout the United Kingdom has been recognised since the 1920s (Bleasdale and Chan, 1972); however, the relationships are not clearly defined. Orography complicates the estimation of mean areal precipitation in upland areas through effects such as the triggering of cloud formation and the enhancement of processes such as condensation and hydrometeor nucleation and growth. Additionally, intense, lengthy precipitation events are typically upwind of the topographic barrier or divide, with sharply decreasing magnitude and duration on the leeward side (Barros and Lettenmaier, 1994). Classical interpolation techniques make simplistic assumptions about the spatial correlation and variability of precipitation and do not handle orographic effects well (Garen et al., 1994).
In a review of several studies evaluating various methods available for estimating areal precipitation from point values, Dingman (1994) found that optimal-interpolation/kriging methods provide the best estimates of regional precipitation in a variety of situations. It was considered that these methods performed more accurately because they are based on the spatial correlation structure of precipitation in the region of application, whereas other methods impose essentially arbitrary spatial structures. However, kriging requires a stationary field for estimation, i.e. there must be no systematic spatial trend or ‘drift’ in the mean or variance of the process; this is not the case in upland regions influenced by orographic effects. Furthermore, kriging requires a well distributed set of points to achieve optimum performance. Modern Geographical Information Systems (GIS) provide the functionality to carry out most interpolation procedures. However, the inability of interpolation procedures to account for complex, multivariate relationships and unrepresentative data continues to be a major shortcoming of current GIS,
which are considered to lack sophisticated forms of spatial analysis and modelling (Fischer, 1994a, 1994b; Fischer and Nijkamp, 1992; Goodchild, 1991). Future GIS models should be derived in the first instance from data rather than theory, and they should be increasingly computationally dependent rather than analytical in nature. Such new spatial analysis approaches should be capable of determining relationships and patterns without being instructed either where to look or what to look for (Openshaw, 1992a). It has been suggested that artificial intelligence technologies such as the artificial neural network (ANN) could provide these more advanced forms of spatial analysis and modelling (Fischer, 1994a, 1994b; Openshaw, 1992a).
Original studies into ANNs were inspired by the mechanisms for information processing in biological nervous systems, particularly the human brain. ANNs offer one alternative information-processing paradigm. ANNs comprise networks of very simple, usually non-linear computational units, interconnected and operating in parallel. Most real world relationships involving dynamic and spatial relations are non-linear. This complexity and non-linearity make it attractive to try the neural network approach, which is inherently suited to problems that are mathematically difficult to model. Furthermore, ANNs are reported to display great flexibility in “poor data” situations, which are often characteristic of the GIS world. ANN technology provides data-driven methodologies that can increase the modelling and analysis functionality of GIS (Fischer, 1994a, 1994b; Fischer and Gopal, 1993). Realisation of the potential of ANNs by the GIS community has been relatively slow, with the exception of a few social and economic geographers, for example Openshaw (1992b), Fischer (1994a, 1994b) and Fischer and Gopal (1993).
There has been recent progress in the Hydro-GIS field; for example, Gupta et al. (1996) integrated ANNs and GIS to characterise complex aquifer geometry and to calculate aquifer parameters for ground water modelling. Utilisation of ANNs in geoscience generally is also relatively new; applications so far have included cloud classification (Lee et al., 1990), sunspot predictions (Koons and Gorney, 1990), optimising aquifer remediation for groundwater management (Rogers and Dowla, 1994), short-range rain forecasting in time and space (French et al., 1992) and synthetic inflow generation (Raman and Sunilkumar, 1995). ANNs have also been employed effectively for the classification of remotely sensed data (e.g. Benediktsson et al., 1990; Foody, 1995; Liu and Xiao, 1991).
One of the simplest ways to utilise a neural network is as a form of multivariate non-linear regression to find a smooth interpolating function from a set of data points (Bishop, 1994). The data-driven generalisation approach of neural networks should also enable them to handle incomplete, noisy and imprecise data in an improved manner. In such a situation traditional statistical interpolation algorithms would not be expected to provide an adequate representation of the phenomena being studied. Furthermore, multiple input variables that are considered to have a possible relationship with the output variable, e.g. orographic influences upon precipitation amount and distribution, can be fed into the ANN.
The aim of this chapter is to present preliminary progress in the evaluation of areal precipitation models, constructed using neural networks and kriging, for upland catchments where the precipitation gauge networks are of low density and uneven distribution and are mainly located in the lowland areas. The topographic nature of the catchments is varied and precipitation distribution is likely to be influenced by orographic effects.
The chapter will assess the overall interpolation performance of each model, and the success of each model in mapping precipitation falling within remote, high-altitude areas. This chapter provides the first steps in a comparative evaluation of neural networks and a well established GIS interpolation model, kriging, in a classic geophysical GIS application. The study area includes the upland regions of north-west England, including the Pennines and the Lake District, and covers an area of approximately 13,000 km².
KRIGING AND NEURAL NETS TO MODEL PRECIPITATION
14.2 ARTIFICIAL NEURAL NETWORKS
14.2.1 Theoretical Background
The ANN typically comprises a highly interconnected set of non-linear, simple information processing elements, also known as units or nodes, analogous to a neuron, that are arranged in layers. Each unit collects inputs from single and/or multiple sources and produces output in accordance with a predetermined transfer function, e.g. non-linear sigmoidal. Creation of the network is achieved by interconnecting units to produce the required configuration. There are four main features that distinguish ANNs from conventional computing and traditional artificial intelligence approaches (Fischer, 1994a; Fischer and Gopal, 1994):
1. inherent parallelism: information processing is inherently parallel, which provides a way to significantly increase the speed of information processing;
2. connectionist type of knowledge representation: knowledge within an ANN is not stored in specific memory locations (as with conventional computing and expert systems); knowledge is distributed throughout the system, and it is a dynamic response to the inputs and the network architecture;
3. fault tolerance: ANNs are extremely fault tolerant; they can learn from and make decisions based upon noisy, incomplete and fuzzy information; and
4. adaptive, model-free function estimation that is not algorithm dependent: ANNs require no a priori model and adaptively estimate continuous functions from data without specifying mathematically how outputs depend on inputs.
French et al. (1992) state that the above characteristics can be used to identify suitable application areas for ANNs:
1. situations in which only a few decisions are required from a massive amount of data, e.g. classification problems;
2. operations that involve large combinatorial optimisation exercises; or
3. tasks in which a complex non-linear mapping must be learned, as with the situation which is addressed in this work.
14.2.2 Advantages
There are a number of advantages characteristic of the ANN approach to problem solving (French et al., 1992):
1. application of a neural network does not require a priori knowledge of the underlying process;
2. one may not recognise all of the existing complex relationships between various aspects of the process under investigation;
3. a standard optimisation approach or statistical model provides a solution only when allowed to run to completion, whereas an ANN always converges to an optimal (or sub-optimal) solution and need not run to any pre-specified solution condition; and
4. neither constraints nor an a priori solution structure is necessarily assumed or strictly enforced in the ANN development.
Such characteristics eliminate, to a certain extent, the problems of regression-based methodologies, mainly the need for the modeller to select explanatory variables and the dependence on understanding of both local and non-local conditions.
14.2.3 Disadvantages
The principal drawbacks of ANNs include:
1. the need to provide a suitable set of example data for training purposes, and the potential problems that can occur if the ANN is required to extrapolate to new regions of the input space that are significantly different from those corresponding to the training data;
2. excessive training times: there could be several thousand weights to estimate, and convergence of the non-linear optimisation procedures tends to be very slow;
3. over-trained ANNs can learn how to reproduce random noise as well as structure; and
4. the choice of ANN architecture is extremely subjective, for instance how many layers and how many neurons in each layer (Bishop, 1994; Openshaw, 1992a).
14.2.4 Structure
The ANN can be trained to solve complex, non-linear problems. In order to do this, a neural network must first learn the mapping of input to output. In a supervised approach, the weighted connections are adjusted through a learning or training process, via the presentation of known inputs and outputs in some ordered or random manner. The strength of the interconnections is altered using an error convergence technique so that the desired output will be produced for a known set of input parameters. Once created, the interconnections stay fixed and the ANN can be used to carry out the intended work.
An ANN typically consists of an input layer, one or more hidden layers and an output layer. Each layer is made up of several nodes and the layers are interconnected by a set of weighted connections. The number of processing units in each layer and the pattern of connectivity may vary within some constraints. There is no communication between the processing units within a layer, but the processing units in each layer can send their output to the processing units in the succeeding layers. Nodes can receive inputs from either the initial inputs or from the interconnections.
A feed-forward neural network with an error back propagation algorithm, first presented by Rumelhart et al. (1986), was utilised in this research. Error back propagation provides a feed forward neural network with the capacity to capture and represent relationships between patterns in a given data set. The processing units are arranged in layers, and the method takes an iterative non-linear optimisation approach, using a gradient descent search routine. Error back propagation involves two phases: a feed forward phase, in which the external input information at the input nodes moves forward to compute the output information signal at the output unit(s); and a backward phase, in which modifications to the strength of the connections are made based on the differences between the computed and observed information signals at the output unit(s). At the start of the learning process, the connection strengths are assigned random values. The learning algorithm modifies the strengths in each iteration until the completion of the training. On convergence of the iterative process, the collection of connection strengths captures and stores the knowledge and information present in the examples used in the training process. The trained neural network is then ready to use. When presented with a new input pattern, a feed forward network computation results in an output pattern which is the result of the synthesis and generalisation of what the ANN has learned and stored in its connection strengths.
Figure 14.1: Three Layer Neural Network Model, Structure (After: French et al., 1992 and Raman and Sunilkumar, 1995)
Figure 14.1 shows a three-layer neural network and its input parameters: $N$ data input patterns, each with a set of input values, $x_i$, $i = 1, \ldots, k$, at the input nodes and output values, $O_n$, $n = 1, \ldots, m$, at the output nodes. The input values $x_i$ are multiplied by the first interconnection weights, $W_{ij}$, $j = 1, \ldots, h$, at the hidden nodes; the values are then summed over the index $i$ and become the inputs to the hidden layer, i.e.:
$$H_j = \sum_{i=1}^{k} W_{ij} x_i \tag{14.1}$$
where $H_j$ is the input to the $j$th hidden node and $W_{ij}$ is the connection weight from the $i$th input node to the $j$th hidden node. The inputs to the hidden nodes are transformed through a non-linear activation function, usually sigmoidal, to provide a hidden node output, $HO_j$:
$$HO_j = f(H_j) = \frac{1}{1 + \exp[-(H_j + P_j)]} \tag{14.2}$$
where $H_j$ is the input to the node, $f(H_j)$ is the hidden node output, and $P_j$ is a threshold or bias that is learned in the same way as the weights. The output, $HO_j$, is the input to the succeeding layer until the output layer is reached. This is known as forward activation flow. The input to the $m$ output nodes, $IO_n$, is defined as:
$$IO_n = \sum_{j=1}^{h} W_{jn} HO_j \tag{14.3}$$
where $W_{jn}$ is the connection weight from the $j$th hidden node to the $n$th output node. These input values are processed through a non-linear activation function (such as the sigmoidal function defined above) to give the ANN output values, $O_n$. The subsequent weight adaptation or learning process is achieved by the back propagation learning algorithm. The $O_n$ at the output layer will differ from the target value, $T_n$. The sum of the squares of error, $e_p$, for the $p$th input pattern is:
$$e_p = \sum_{n=1}^{m} (T_{pn} - O_{pn})^2 \tag{14.4}$$
and the mean square error (MSE), $E$, which provides the average system error over all input patterns, is:
$$E = \frac{1}{N} \sum_{p=1}^{N} \sum_{n=1}^{m} (T_{pn} - O_{pn})^2 \tag{14.5}$$
where $T_{pn}$ is the target value, $T_n$, for the $p$th pattern and $O_{pn}$ is the ANN output value, $O_n$, for the $p$th pattern. The back propagation training algorithm is an iterative gradient algorithm designed to minimise the average squared error between the values of the output, $O_{pn}$, at the output layer and the correct pattern, $T_{pn}$, provided by a teaching input. This is achieved by first computing the gradient ($\delta_n$) for each processing element on the output layer:
$$\delta_n = O_n (1 - O_n)(T_n - O_n) \tag{14.6}$$
where $T_n$ is the correct target value for the output unit $n$, and $O_n$ is the neural network output. The error gradient ($\delta_j$) is then recursively determined for the hidden layers by calculating the weighted sum of the errors at the previous layer:
$$\delta_j = HO_j (1 - HO_j) \sum_{n=1}^{m} \delta_n W_{jn} \tag{14.7}$$
The errors are propagated backwards one layer at a time until the input layer is reached, recursively applying the same procedure. The error gradients are then used to adjust the network weights:
$$w_{ji}(r+1) = w_{ji}(r) + \Delta w_{ji}(r) \tag{14.8}$$
$$\Delta w_{ji}(r) = \eta \, \delta_j \, x_i \tag{14.9}$$
where $r$ is the iteration number, $w_{ji}(r)$ is the weight from hidden node $i$ or from an input to node $j$ at iteration $r$, $x_i$ is either the output of node $i$ or an input, $\delta_j$ is an error term for node $j$, and $\eta$ is the learning rate or gain term providing the size of step during the gradient descent. The learning rate determines the rate at which the weights are allowed to change at any given presentation. Higher learning rates result in faster convergence, but can result in non-convergence. Lower learning rates produce more reliable results but require increased training time. Generally, to ensure rapid convergence, large step sizes which do not lead to oscillations are used. Convergence is sometimes faster if a momentum term is added and weight changes are smoothed by
$$w_{ji}(r+1) = w_{ji}(r) + \eta \, \delta_j \, x_i + \alpha \left[ w_{ji}(r) - w_{ji}(r-1) \right] \tag{14.10}$$
where $\alpha$ is the momentum coefficient, $0 < \alpha < 1$.
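The feed-forward and backward phases described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the study: it assumes the standard delta-rule error terms for logistic sigmoid units, a single hidden layer, pattern-by-pattern weight updates and a bias on the output nodes as well as the hidden nodes. The names `train`, `eta` (learning rate) and `alpha` (momentum coefficient), and the toy data, are hypothetical.

```python
import math
import random

def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, P1, W2, P2):
    """Feed-forward phase: weighted sums plus bias, then sigmoid, per layer."""
    HO = [sigmoid(sum(W1[i][j] * x[i] for i in range(len(x))) + P1[j])
          for j in range(len(P1))]
    O = [sigmoid(sum(W2[j][n] * HO[j] for j in range(len(HO))) + P2[n])
         for n in range(len(P2))]
    return HO, O

def train(patterns, k, h, m, eta=0.5, alpha=0.5, epochs=2000, seed=0):
    """Back-propagation with momentum; returns weights and per-epoch MSE."""
    rng = random.Random(seed)
    # Connection strengths start at small random values.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(h)] for _ in range(k)]
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(h)]
    P1, P2 = [0.0] * h, [0.0] * m
    dW1 = [[0.0] * h for _ in range(k)]  # previous weight changes (momentum)
    dW2 = [[0.0] * m for _ in range(h)]
    mse_history = []
    for _ in range(epochs):
        sse = 0.0
        for x, T in patterns:
            HO, O = forward(x, W1, P1, W2, P2)
            sse += sum((T[n] - O[n]) ** 2 for n in range(m))
            # Backward phase: error terms at the output, then the hidden layer.
            d_out = [O[n] * (1 - O[n]) * (T[n] - O[n]) for n in range(m)]
            d_hid = [HO[j] * (1 - HO[j]) *
                     sum(d_out[n] * W2[j][n] for n in range(m))
                     for j in range(h)]
            # Weight change = eta * error * input + alpha * previous change.
            for j in range(h):
                for n in range(m):
                    dW2[j][n] = eta * d_out[n] * HO[j] + alpha * dW2[j][n]
                    W2[j][n] += dW2[j][n]
            for n in range(m):
                P2[n] += eta * d_out[n]
            for i in range(k):
                for j in range(h):
                    dW1[i][j] = eta * d_hid[j] * x[i] + alpha * dW1[i][j]
                    W1[i][j] += dW1[i][j]
            for j in range(h):
                P1[j] += eta * d_hid[j]
        mse_history.append(sse / len(patterns))
    return (W1, P1, W2, P2), mse_history

# Hypothetical toy data: two inputs, one output (logical OR), targets in [0, 1].
patterns = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
            ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
weights, mse_history = train(patterns, k=2, h=3, m=1)
print(mse_history[0], mse_history[-1])
```

The printed pair gives the mean square error after the first and last epochs; on data this simple the second value should be well below the first. Raising `eta` speeds up early progress at the risk of oscillation, which is the trade-off the text describes.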
14.2.5 Training the network
In order to train a neural network, inputs to the model are provided, the output is computed, and the interconnection weights are adjusted until the desired output is reached. The number of input, hidden and output nodes (the architecture of the network) depends upon the particular problem being studied; however, whilst the number of input and output nodes is determined by the input and output variables, no well-defined algorithm exists for determining the optimal number of hidden layers and the number of nodes in each. The performance advantage gained from increasing relational complexity by using more hidden nodes must be balanced against maintaining a relatively short training time. The error back-propagation algorithm is employed to train the network, using the mean square error (MSE) over the training samples as the objective function. The data are divided into two sets, one for training and one for testing.
14.2.6 Testing the network
How well an ANN performs when presented with new data that did not form part of the training set is known as generalisation. Some of the data are set apart or suppressed when building the training data set. These observations are not fed into the network during training. They are used after training in order to test the network and evaluate its performance on test samples in terms of the mean square error criterion.
14.3 ARTIFICIAL NEURAL NETWORK IMPLEMENTATION
In order to account for orographic influences upon the spatial distribution of precipitation falling within an upland catchment, various topographic variables are incorporated in the implementation of the ANN. These variables are obtained from a digital elevation model (DEM) that can also be used within a GIS to derive aspect and slope values for the area covered. The target output values of the ANN are long-term annual average rainfall (LAR), 1961–90.
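As a rough illustration of how slope and aspect can be derived from a DEM, the sketch below uses simple central differences on the four neighbours of a grid cell and then takes the sine and cosine of the aspect so that a circular quantity becomes two continuous inputs. The study relied on in-built GIS functions, so this scheme is an assumption for illustration only; the function names, the neighbour arguments and the example elevations are hypothetical.

```python
import math

def slope_aspect(z_w, z_e, z_s, z_n, cell_size):
    """Slope (degrees) and aspect (degrees clockwise from north) of a cell,
    from the elevations of its west, east, south and north neighbours."""
    dz_dx = (z_e - z_w) / (2.0 * cell_size)  # gradient towards the east
    dz_dy = (z_n - z_s) / (2.0 * cell_size)  # gradient towards the north
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    # Aspect is the compass bearing of the downslope direction.
    # It carries no meaning for a perfectly flat cell.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

def encode_aspect(aspect_deg):
    """Return (sin, cos) of aspect: east-west and north-south components."""
    a = math.radians(aspect_deg)
    return math.sin(a), math.cos(a)

# Hypothetical cell on a plane that falls towards the east (50 m grid).
slope, aspect = slope_aspect(z_w=150.0, z_e=50.0, z_s=100.0, z_n=100.0,
                             cell_size=50.0)
a_sin, a_cos = encode_aspect(aspect)
print(slope, aspect, a_sin, a_cos)
```

Encoding aspect as a sine and cosine pair avoids the artificial jump between 359 and 0 degrees that a single bearing value would present to the network.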
In this study we investigated the problem of predicting rainfall amounts, $p$, from six input variables, four of which are derived from the DEM using in-built GIS functions. The variables are denoted $x$ (UK National Grid Easting of rain gauge), $y$ (UK National Grid Northing of rain gauge), $e$ (elevation at the rain gauge), $s$ (angle of slope at the rain gauge), $a_{\sin}$ (sin value of east-west aspect at the rain gauge) and $a_{\cos}$ (cos value of north-south aspect at the rain gauge). The aim was to predict precipitation, $p$, from knowledge of these variables using a functional relationship of the form
$$p = f(x, y, e, s, a_{\sin}, a_{\cos}) \tag{14.11}$$
A four-layer network with one input layer, two hidden layers and one output layer was constructed. Each hidden layer had 12 hidden nodes, giving a network architecture of 6-12-12-1. The precipitation data set consisted of 1384 gauges for a 13,000 km² area of north-west England. This set was randomly divided into two sets, each of 692 gauges, one for training and one for testing. As the aim of this chapter is to evaluate the ability of the two models to estimate precipitation amounts in the higher altitude zones based upon predominantly lowland gauges, the data set was also split into four sets of gauges falling within different altitude zones:
Zone A, 0–99 m: an area of 3815.35 km² with 385 rain gauges;
Zone B, 100–349 m: an area of 6505.4 km² with 780 rain gauges;
Zone C, 350–599 m: an area of 2364.58 km² with 204 rain gauges; and
Zone D, 600–1000 m: an area of 318.17 km² with 15 rain gauges.
These data sets were used in the same manner as the test set in order to facilitate a comparison of the results.
Note that normalisation between 0 and 1 formed part of the pre-processing of all input and target variables, thus creating similar dynamic ranges to aid the modelling process and avoid inter-variable bias. A logistic sigmoid activation function was used. Initially the learning rate, $\eta$, of the network was set at 0.8 for 20000 epochs; after each set of 20000 epochs the learning rate was reduced by 0.2. This reduction of the learning rate was done to assist convergence toward an optimal solution. By using a high learning rate early on, the aim was to approach the neighbourhood of the optimal solution rapidly, and then to decrease the learning rate to allow smooth convergence toward the final solution. An epoch is a training cycle in which all data in the training set are presented to the ANN. The sum squared error (SSE) and mean squared error (MSE) were recorded at every 20000 epochs and finally at 50000 epochs. When training was completed, the weights were collected from the training sample to test the network and evaluate its performance on the test data and the altitudinal test sets in terms of SSE, MSE and correlation, as shown in Table 14.1. The real LAR values are plotted against the modelled values (see Figure 14.2).
14.4 KRIGING
Kriging is an optimal spatial interpolation procedure for estimating the values of a variable at unmeasured points from nearby measurements. Kriging regards the surface to be interpolated as a regionalised variable that has a degree of continuity, i.e. there is a structure of spatial autocorrelation or a dependence between sample data values that decreases with their distance apart.
These characteristics of regionalised variables are quantified by the variogram, a function that describes the spatial correlation structure of the data. The variogram is the variance of the difference between data values separated by a distance, $h$, and is calculated as follows:
$$2\hat{\gamma}(h) = \frac{1}{n} \sum_{i=1}^{n} \left[ Y(x_i) - Y(x_i + h) \right]^2 \tag{14.12}$$
where $2\hat{\gamma}(h)$ is the sample estimate of the variogram, $h$ is the distance between data sites, $x$ is a vector in a two-coordinate system providing the spatial location of a data site, $Y(x)$ is the data value at point $x$, and $n$ is the number of site pairs separated by the distance $h$. The function $\gamma(h)$ is known as the semi-variogram. The values of this function are actually used in the kriging calculations to estimate the unknown values. The semi-variogram is usually modelled by one of several analytic functions, including spherical, circular, exponential, gaussian and linear. The estimate of a data value at an unmeasured point, $\hat{Y}$, is a weighted sum of the available measurements:
$$\hat{Y} = \sum_{i=1}^{m} w_i Y_i \tag{14.13}$$
where $w_i$ is the weight for measurement $Y_i$, $m$ is the number of measurements, and
$$\sum_{i=1}^{m} w_i = 1 \tag{14.14}$$
Kriging is the algorithm for determining the weights $w_i$ such that the estimate has minimum variance. This is a Lagrangian optimisation problem that requires the solution of a system of linear equations.
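To make the weighted-sum estimate and the linear system concrete, the following is a minimal ordinary kriging sketch in plain Python. It assumes a spherical semi-variogram with hypothetical parameters (`nugget`, `sill`, `rng`) and the usual ordinary kriging system, in which a Lagrange multiplier enforces that the weights sum to one. The gauge coordinates and values are invented for illustration, and a production workflow would fit the semi-variogram to the data first.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical semi-variogram model gamma(h); sill is the total sill."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def ordinary_kriging(sites, values, target, gamma):
    """Return the kriging estimate at target and the weights w_i."""
    m = len(sites)
    # Semi-variogram between every pair of sites, bordered by the
    # unbiasedness constraint (weights sum to one).
    A = [[gamma(math.dist(sites[i], sites[j])) for j in range(m)] + [1.0]
         for i in range(m)]
    A.append([1.0] * m + [0.0])
    b = [gamma(math.dist(s, target)) for s in sites] + [1.0]
    w = solve(A, b)[:m]  # last entry of the solution is the multiplier
    return sum(wi * yi for wi, yi in zip(w, values)), w

# Hypothetical gauges (km coordinates) and normalised rainfall values.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
values = [0.20, 0.35, 0.30, 0.80]
gamma = lambda h: spherical(h, nugget=0.0, sill=1.0, rng=3.0)

estimate, weights = ordinary_kriging(sites, values, (1.0, 1.0), gamma)
print(estimate, weights)
```

Evaluating the same function at one of the gauge locations returns that gauge's own value, which is the exact-interpolation property of kriging.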
Kriging is considered to be an improvement over other interpolation methods because the degree of interdependence of the sample points is taken into consideration. The kriging estimate is based on the structural characteristics of the point data, which are summarised in the variogram function, and thus results in an optimal unbiased estimate (Delhomme, 1978). Furthermore, kriging provides an estimate of the error and confidence interval for each of the unknown points. Kriging was first developed for use in the mining industry and has subsequently found widespread use in geology and hydrology (it is sometimes referred to as “geostatistics”). Examples of precipitation studies that have utilised kriging include Delfiner and Delhomme (1975), Chua and Bras (1982), Creutin and Obled (1982), Lebel et al. (1987), Dingman et al. (1988) and Garen et al. (1994).
14.5 IMPLEMENTATION OF KRIGING
The spatial correlation structure of precipitation is modelled in kriging by the semi-variogram. Each empirical semi-variogram was examined in order to find an adequate semi-variogram model for the residuals, and the spherical model was chosen. In order to allow a comparative study of ANNs and kriging, the same data sets used for training and testing the ANNs were used to create and test the kriging model. Kriging interpolation was performed using the 692 randomly selected training gauges and was subsequently tested on the 692-gauge test data set and then on the altitude zone sets. The SSE, MSE and correlation were calculated, as shown in Table 14.1.
Table 14.1: Evaluation Measures of the Artificial Neural Network and Kriging Models
Model    Measure  Training Set  Testing Set  Zone A     Zone B    Zone C    Zone D
ANN      SSE      0.27888       2.049794     0.295729   1.460579  0.48589   0.092649
ANN      MSE      0.000403      0.002962     0.000768   0.001873  0.000623  0.006177
ANN      CORR     0.98607       0.929808     0.975158   0.925218  0.958466  0.935093
Kriging  SSE      0             0.609059     0.027621   0.181528  0.232096  0.169811
Kriging  MSE      0             0.000888     0.0000723  0.000233  0.001143  0.011321
Kriging  CORR     1             0.979072     0.997631   0.99048   0.983369  0.919882
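The evaluation measures in Table 14.1 can be computed along the following lines. This is a sketch under stated assumptions: values are min-max normalised to the range 0 to 1 as described in Section 14.3, MSE is taken as SSE divided by the number of gauges, and the correlation is assumed to be the Pearson coefficient, which the chapter does not state explicitly. The function names and the sample values are hypothetical.

```python
import math

def normalise(values):
    """Min-max normalisation to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def sse(real, modelled):
    """Sum of squared errors between real and modelled values."""
    return sum((r - m) ** 2 for r, m in zip(real, modelled))

def mse(real, modelled):
    """Mean squared error: SSE divided by the number of gauges."""
    return sse(real, modelled) / len(real)

def correlation(real, modelled):
    """Pearson correlation coefficient (assumed measure)."""
    n = len(real)
    mean_r = sum(real) / n
    mean_m = sum(modelled) / n
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(real, modelled))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_m = sum((m - mean_m) ** 2 for m in modelled)
    return cov / math.sqrt(var_r * var_m)

# Hypothetical rainfall values (mm) at five gauges and a model's estimates.
real_mm = [900.0, 1200.0, 1500.0, 2100.0, 3000.0]
modelled_mm = [950.0, 1150.0, 1600.0, 2000.0, 2900.0]

lo, hi = min(real_mm), max(real_mm)
real = normalise(real_mm)
# Scale the modelled values with the same range as the real values.
modelled = [(v - lo) / (hi - lo) for v in modelled_mm]
print(sse(real, modelled), mse(real, modelled), correlation(real, modelled))
```

As a consistency check on the table itself, the ANN testing-set SSE of 2.049794 divided by the 692 test gauges gives approximately 0.002962, which matches the tabulated MSE.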
14.6 RESULTS
Performance measures were computed separately for ANN and kriging interpolation of the training data, the test data and the altitude validation data (zone A, zone B, zone C and zone D). These measures indicate how well the ANN learned the input patterns, how well kriging interpolated the points, and the degree to which each method can be used to predict rainfall amounts at gauges not included in the training process. The evaluation measures compare the real precipitation data with those modelled by the ANN and kriging, using the SSE, the MSE, the correlation between the real and modelled LAR values, and plots of the real against the modelled LAR values. Figure 14.2 provides a visual comparison of the real LAR values (normalised) and the modelled LAR values (normalised) for the training, testing, zone A, zone B, zone C and zone D data sets for each model.
Figure 14.2: Real LAR values (normalised) plotted against the modelled LAR values (normalised) for the training, testing, zone A, zone B, zone C and zone D data
In each figure, perfectly modelled LAR values lie on the line. Figures 14.2a and 14.2g show that with both
models the training data performed more successfully than any of the testing data sets, as would be expected. The training data for kriging show a perfect relationship between the actual and modelled values; this is because kriging is an exact interpolation procedure, meaning that the interpolated surface fits exactly through the values of the sample points. The neural network training data (Figure 14.2a) show that most of the modelled values fit closely to the real values, although there are some outliers. In order to cope with such noisy data, more iterations of the neural network are required. Both models display a clustering pattern within the data; it is possible that this is due to the nature of the original data, i.e. 30 years of long-term average data. Any trends or extremes within this period may be smoothed out by the averaging procedure.
Comparison of the methods within the test data, zone A, zone B and zone C sets shows that kriging provided the best results, with fewer outliers and closer fits to the real LAR values. However, in zone D (the highest altitude zone) the neural network outperformed kriging, with an SSE of 0.092649 compared to 0.169811 and an MSE of 0.006177 compared to 0.011321. Table 14.1 shows that kriging achieves smaller SSE and MSE results with all the data sets except zone D. This is particularly noticeable when comparing the test data set, where the neural network achieved an SSE of 2.049794 and an MSE of 0.002962, substantially higher than those of kriging, 0.609059 (SSE) and 0.000888 (MSE). The ANN SSE and MSE could be improved with further iterations, although overtraining must be avoided (Bishop, 1994). The correlation coefficient results provide further evidence that kriging has resulted in a closer relationship between the actual and the modelled values, except in zone D.
14.7 CONCLUSION
An ANN was developed for the interpolation of precipitation using topographic variables as inputs.
Kriging was also used to create a surface using the same data in order to provide a comparison of the performance. The results show that overall kriging provided the most accurate surface except in the highest altitude zone, where points of known data were scarce. These initial results are encouraging, suggesting that neural networks can provide a robust model in situations where data scarcity is an issue. Possible improvements to the ANN could be made by increasing the number of epochs or hidden nodes in order to reduce the SSE further, and by using actual precipitation data as opposed to average data. Overall, the study has shown that a global ANN model can be employed successfully to interpolate data points of low density and uneven spatial distribution based upon topographic variables, with no a priori knowledge of the multivariate relationships. Further work needs to be undertaken to investigate how well ANN models based upon gauges falling within lower altitude zones can predict values within higher altitude ranges and vice versa. This will provide further evidence about the ability of an ANN to extrapolate results to areas of data scarcity.
This study is by no means conclusive as to the ability of an ANN approach to precipitation interpolation and only provides an initial step towards the understanding and evaluation of the role for ANNs in spatial modelling of geophysical processes. Further studies should be carried out using simulated as well as real data, different precipitation data such as annual or monthly totals, seasonal variation of the precipitation data, a selection of several spatial distributions of gauges, stability of the methods versus the scale of analysis (topoclimatology), time series data and various neural network configurations before any conclusive statement can be produced.
ACKNOWLEDGEMENTS
The authors would like to thank North West Water for supplying the data and funding the project, and Bob Abrahart of the Medalus Project, School of Geography, University of Leeds, for providing his expertise and time in the development of the ANN.
REFERENCES
BARROS, A.P. and LETTENMAIER, D.P. 1994. Dynamic modeling of orographically induced precipitation, Reviews of Geophysics, 32(3), pp. 265–284.
BENEDIKTSSON, J.A., SWAIN, P.H. and ERSOY, O.K. 1990. Neural network approaches versus statistical methods in classification of multisource remote sensing data, IEEE Transactions on Geoscience and Remote Sensing, 28(4), pp. 540–551.
BISHOP, C.M. 1994. Neural networks and their applications, Review of Scientific Instruments, 65(6), pp. 1803–1832.
BLEASDALE, A. and CHAN, Y.K. 1972. Orographic influences on the distribution of precipitation, World Meteorological Organisation Publication, 326(326), pp. 322–333.
CHUA, S.H. and BRAS, R.L. 1982. Optimal estimators of mean areal precipitation in regions of orographic influence, Journal of Hydrology, 57, pp. 23–48.
CREUTIN, J.D. and OBLED, C. 1982. Objective analysis and mapping techniques for rainfall fields: an objective comparison, Water Resources Research, 18(2), pp. 413–431.
DELFINER, P. and DELHOMME, J.P. 1975. Optimum interpolation by kriging, in David, J.C. (Ed.) Display and Analysis of Spatial Data. NATO Advanced Study Institute: John Wiley and Sons, pp. 96–114.
DELHOMME, J.P. 1978. Kriging in the hydrosciences, Advances in Water Resources, 1(5), pp. 251–266.
DINGMAN, S.L. 1994. Physical Hydrology. New York: Macmillan.
DINGMAN, S.L., SEELY-REYNOLDS, D.M. and REYNOLDS, R.C., III. 1988. Application of kriging to estimating mean annual precipitation in a region of orographic influence, Water Resources Bulletin, 24(2), pp. 329–399.
FISCHER, M.M. 1994a. From conventional to knowledge-based geographic information systems, Computers, Environment and Urban Systems, 18(4), pp. 233–242.
FISCHER, M.M. 1994b. Expert systems and artificial neural networks for spatial analysis and modelling: Essential components for knowledge-based Geographical Information Systems, Geographical Systems, 1, pp. 221–235.
FISCHER, M.M. and GOPAL, S. 1993. Neurocomputing - a new paradigm for geographic information processing, Environment and Planning A, 25, pp. 757–760.
FISCHER, M.M. and GOPAL, S. 1994. Artificial neural networks: a new approach to modeling interregional telecommunication flows, Journal of Regional Science, 34(4), pp. 503–527.
FISCHER, M.M. and NIJKAMP, P. (Eds.). 1992. Geographic Information Systems, Spatial Modelling, and Policy Evaluation. Berlin: Springer-Verlag.
FOODY, G.M. 1995. Land cover classification by an artificial neural network with ancillary information, International Journal of Geographical Information Systems, 9(5), pp. 527–542.
FRENCH, M.N., KRAJEWSKI, W.F. and CUYKENDALL, R.R. 1992. Rainfall forecasting in space and time using a neural network, Journal of Hydrology, 137, pp. 1–31.
GAREN, D.C., JOHNSON, G.L. and HANSON, C.L. 1994. Mean areal precipitation for daily hydrologic modeling in mountainous regions, Water Resources Bulletin, 30(3), pp. 481–491.
GOODCHILD, M.F. 1991. Progress on the GIS research agenda, in EGIS ’91: Proceedings Second European Conference on Geographical Information Systems, Volume 1. Utrecht: EGIS Foundation, pp. 342–350.
GUPTA, A.D., GANGOPADHYAY, S., GAUTAM, T.R. and ONTA, P.R. 1996. Aquifer characterization using an integrated GIS-neural network approach, in Proceedings of HydroGIS 96: Application of Geographic Information Systems in Hydrology and Water Resources Management, Vienna, April. Oxfordshire: IAHS Publication no. 235, pp. 513–519.
KOONS, H.C. and GORNEY, D.J. 1990. A sunspot maximum prediction using a neural network, EOS Transactions American Geophysical Union, 71(18), pp. 677–688.
LAM, N.S. 1983. Spatial interpolation methods: a review, The American Cartographer, 10(2), pp. 129–149.
LEBEL, T., BASTIN, G., OBLED, C. and CREUTIN, J.D. 1987. On the accuracy of areal rainfall estimation: a case study, Water Resources Research, 23(11), pp. 2123–2134.
LEE, J., WEGER, R.C., SENGUPTA, S.K. and WELCH, R.M. 1990. A neural network approach to cloud classification, IEEE Transactions on Geoscience and Remote Sensing, 28(5), pp. 846–855.
LIU, Z.K. and XIAO, J.Y. 1991. Classification of remotely-sensed image data using artificial neural networks, International Journal of Remote Sensing, 12(11), pp. 2433–2438.
OPENSHAW, S. 1992a. Some suggestions concerning the development of artificial intelligence tools for spatial modelling and analysis in GIS, in Fischer, M.M. and Nijkamp, P. (Eds.) Geographic Information Systems, Spatial Modelling, and Policy Evaluation. Berlin: Springer-Verlag, pp. 17–33.
OPENSHAW, S. 1992b. Modelling spatial interaction using a neural net, in Fischer, M.M. and Nijkamp, P. (Eds.) Geographic Information Systems, Spatial Modelling, and Policy Evaluation. Berlin: Springer-Verlag, pp. 147–164.
RAMAN, H. and SUNILKUMAR, N. 1995. Multivariate modeling of water resources time series using artificial neural networks, Journal of Hydrological Sciences, 40(2), pp. 145–163.
ROGERS, L.L. and DOWLA, F.U. 1994. Optimization of groundwater remediation using artificial neural networks with parallel solute transport modeling, Water Resources Research, 30(2), pp. 457–481.
RUMELHART, D.E., HINTON, G.E. and WILLIAMS, R.J. 1986. Learning representations by back-propagating errors, Nature, 323, pp. 533–536.
Chapter Fifteen
Time and Space in Network Data Structures for Hydrological Modelling
Vit Vozenilek
15.1 INTRODUCTION
In GIS terms, river networks are systems of connected linear features that form a framework through which water flows. The flow of any resource in the network depends on more than the linear conduits of movement: movement between connecting lines is affected by other elements of the network and their characteristics. It is possible to carry out some hydrological modelling directly within a GIS, so long as time variability is not needed. One way of eliminating time as a variable is to take a snapshot at the peak flow condition and model it by assuming that the discharge is at its peak value throughout the system (Maidment, 1993). It is thus possible to route water through GIS networks using analogies to traffic flow routing, in which each arc is assigned an impedance measured by flow time or distance and flow is accumulated going downstream through the network.
Extensions to GIS are needed to put network analysis into practice, and some GIS offer such tools. For example, PC NETWORK from ESRI provides the flexibility to support the requirements of a broad range of applications. Hydrologists can simulate sources of water pollution and analyse their downstream effects. Networks can also be used to model storm runoff. This chapter explores the use of GIS for this purpose, showing how such tools can be used to represent the complexity of a river network.
15.2 SPACE AND TIME OF THE HYDROLOGICAL PHENOMENA IN RIVER NETWORK SYSTEMS
Most hydrological phenomena are linked to river networks. These networks are the result of long-term endogenic and exogenic landscape processes. They are determined by many factors, including parameters of the earth's surface, soil and rock characteristics, climatic conditions and others. Many geographical, hydrological and ecological topics can be investigated in the river network structure, for example flood prediction, water pollution management or water quality assessment.
Transport processes are characterised by the rates of advection, dispersion and transformation of the transported constituent. Advection refers to the passive motion of a transported constituent at the same velocity as the flow. This is the simplest motion that one can conceive, but it is a reasonable approximation, particularly in groundwater flow, where it is used to determine how long it will take leakage from a contaminant source to flow through the groundwater system to a strategic point, such as a drinking water well, a lake or a river. Dispersion refers to the spreading of the constituent because of mixing in the water as it flows along. In surface water flow, dispersion occurs because eddy motion mixes masses of water containing different concentrations of the constituent. The transformation of constituents by degradation, absorption or chemical reaction can change the concentration from that predicted by the advection-dispersion equation, usually resulting in decreased concentrations as the constituent leaves the solution (Goodchild et al., 1993).
Each application of network analysis requires a special data structure, models and a set of algorithms (Bishr and Radvan, 1995). All of these require an accurate understanding of the space and time of river networks in their full complexity. Model creation is determined by the data structure and fixes the basic features of the algorithms.
15.2.1 Space
It seems theoretically possible that one-dimensional and two-dimensional steady flow computations could be done explicitly based on a GIS database of river channels (for 1D flow) or shallow lakes and estuaries or aquifer systems (for 2D flow) (Maidment, 1993). That is not done in commonly used GIS at this time, however. Incorporating these flow descriptions for groundwater models within GIS would best be done using the analytical solutions which exist for many different types of groundwater configurations. The motion of water is a truly 3D phenomenon, and all attempts to approximate that motion in a space of fewer dimensions are just that: approximations.
Understanding the network in terms of space involves more than network morphology. To bring all complex natural phenomena within the scope of river network analyses there are several levels of dimensionality (see Figure 15.1):
The first level, 3D+T space, describes all natural phenomena in their real nature.
They can be defined by the function well known as the interpretation of 4D space:
$$F = F(x, y, z, t) \qquad (15.1)$$
where x, y and z are the spatial coordinates and t is the time parameter.
The second level, 2D+T space, is derived from 3D+T space to simplify phenomena in network analyses. This simplification is made to fit the data structures. The 2D+T level is derived from the function F. The description can be expressed by the function f:
$$f = f(x, y, t) \qquad (15.2)$$
where x and y are spatial coordinates and t is the time parameter, identical to that of 3D+T space. An example is the map representation of spreading rain or any other resource event. The acronym expresses two spatial axes and one temporal axis and is more precise than the acronym 3D.
The third level, 1D+T space, is the final simplification used to simulate the movement of the investigated material in the river. It is derived from the levels above. The description can be expressed by the function f':
$$f' = f'(x, t) \qquad (15.3)$$
where x is the spatial coordinate and t is the time parameter, identical to that of 3D+T and 2D+T spaces. An example is water flow in the river bed. The acronym expresses one spatial axis and one temporal axis and is more precise than the acronym 2D.
Figure 15.1: Three Levels of Dimensionality
15.2.2 Time
Hydrological phenomena are driven by rainfall and are thus always time dependent, even though a steady-state model can be created by taking snapshots at particular points in time or by averaging over long periods (Bonham-Carter, 1996). Time is the basic parameter in the concept of dynamic modelling. Space can be stretched, but time must stay continuous; it can only be split into a set of intervals. This simplification allows dynamic modelling with the available data structures. The following temporal items are needed in the structures to simulate real processes in time:
• Time Initial: defines the initial time of the phenomenon's state. It is calculated from 2D+T space models and is used for the calculation of any temporal parameters of the phenomenon,
• F-node Time: defines the time of the phenomenon's state at the F-node point of any network link,
• T-node Time: defines the time of the phenomenon's state at the T-node point of any network link,
• Demand Time: defines the time associated with any feature and can be specified for every arc and stop in the network.
The basic assumption can accept intervals limited by the following times:
t0: the initial situation when the resource event starts; the primary calculations establish the starting state S0,
t1: calculation of state S1 according to the distribution and intensity of the resource event, and calculation of T-node time,
t2: calculation of state S2 according to the distribution and intensity of the resource event, and calculation of T-node time,
t3: calculation of state S3 according to the distribution and intensity of the resource event, and calculation of T-node time, etc.
The general aspects of space and time in river network analyses can be described as follows:
1. The initial time can vary with the direction and speed of the resource event (rain, pollution, erosion).
2. The initial time must be stable for each link in the network.
3. The variables of the phenomena investigated include:
• the state parameters of river network links (slope, roughness, depth, bank regulation, width and cross-section of the river bed, etc.),
• the morphology of the river network (length of links, number of confluences and their radius, etc.),
• the spatial and temporal changes of the resource event (rain can hit only part of the catchment, rain can stop and then start again, rain can have varying intensity, etc.),
• the morphology of the surroundings of the streams (fields at the banks steep and/or flat, rough and/or covered by grass, large and/or small, etc.).
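The relationship between the temporal items listed above can be illustrated with a minimal Python sketch. It assumes that the T-node time of a link is simply its F-node time plus a travel time for that link; the class, the field names and the numbers are hypothetical and are not Arc/Info item names.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One network link; field names are illustrative assumptions."""
    name: str
    f_node: str
    t_node: str
    travel_time: float  # assumed time for water to flow from F-node to T-node (minutes)


def node_times(links, initial_times):
    """Propagate times downstream: T-node time = F-node time + travel time.

    `links` must be listed in upstream-to-downstream order. Where several
    links meet at one node, the earliest arrival time is kept.
    """
    times = dict(initial_times)
    for link in links:
        if link.f_node not in times:
            continue  # no water has reached this link's F-node
        arrival = times[link.f_node] + link.travel_time
        times[link.t_node] = min(arrival, times.get(link.t_node, arrival))
    return times


# Hypothetical catchment: two springs joining at one confluence.
links = [
    Link("a", "spring_1", "confluence", 30.0),
    Link("b", "spring_2", "confluence", 45.0),
    Link("c", "confluence", "outlet", 20.0),
]
# Time Initial differs per spring because the resource event reaches them at different times.
times = node_times(links, {"spring_1": 0.0, "spring_2": 10.0})
```

Keeping the earliest arrival at a confluence is only one possible convention; a flood-peak study might instead track every arrival separately.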
15.2.3 Models
Network analysis simulating mass movement in the network requires exactly defined paths in the river network (Vozenilek, 1994a). They can be derived from the stream order in the river system, and their direction from spring to confluence can be obtained by changing the link directions. The simplest analysis is a calculation of the time taken for water to flow from one point to another in the river network. Greater problems can arise in the calculation of water volume at confluences. To create a model for this task the algorithm involves two steps:
1. Identification of paths in the river network from springs to confluence and classification of the river order (by Strahler) (Haggett and Chorley, 1969).
2. Calculation of water volumes for each path, starting with the tributaries of the highest order up to the main river; after each path calculation the computed water volume is added to the path of the highest order at the confluence.
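The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used in the study: the network is given as a hypothetical mapping from each link to the link it drains into, ordering follows the usual Strahler rule (headwater links are order 1, and the order rises by one only where two tributaries of equal order meet), and volumes are accumulated from the headwaters towards the main river.

```python
def strahler_and_volume(downstream, spring_yield):
    """Return (order, volume) dictionaries keyed by link name.

    downstream:   maps each link to the link it flows into (None at the outlet).
    spring_yield: water volume entering each headwater link (assumed input).
    """
    # Invert the mapping so that each link knows its tributaries.
    upstream = {link: [] for link in downstream}
    for link, target in downstream.items():
        if target is not None:
            upstream[target].append(link)

    order, volume = {}, {}

    def visit(link):
        if link in order:
            return
        for tributary in upstream[link]:
            visit(tributary)
        orders = [order[t] for t in upstream[link]]
        if not orders:
            order[link] = 1  # headwater link
        else:
            top = max(orders)
            # The order increases only when two tributaries share the top order.
            order[link] = top + 1 if orders.count(top) >= 2 else top
        # Volume at a confluence is the sum of all tributary volumes.
        volume[link] = spring_yield.get(link, 0.0) + sum(
            volume[t] for t in upstream[link]
        )

    for link in downstream:
        visit(link)
    return order, volume


# Hypothetical network: h1 and h2 join to form m1; m1 and h3 join to form main.
downstream = {"h1": "m1", "h2": "m1", "m1": "main", "h3": "main", "main": None}
spring_yield = {"h1": 2.0, "h2": 3.0, "h3": 1.0}
order, volume = strahler_and_volume(downstream, spring_yield)
```

The recursive traversal is adequate for small catchments; a very deep network would call for an iterative topological sort instead.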
The simulation of floods can be carried out in two steps:
1. Definition of the base flow based on water inputs into streams from springs.
2. Evaluation of the resource event and its addition to the network as the resource demand of links.
The algorithm for simulating the initial state of the river network has the following requirements:
1. Yield of springs: requires data for selected points.
2. Time increase for each link: the time in which water flows from the F-node to the T-node, calculated from attribute items (slope, density, etc.); this requires a DEM combined with other layers and the corresponding equations.
3. Water volume at the T-node: this equals the water volume at the F-node; where several links meet, the F-node of the downstream link receives the sum of the water volumes at their T-nodes.
If rain is considered in the analysis, water from the surrounding slopes must be added to these calculations.
15.3 NETWORK DATA STRUCTURES
The common data structures for linking GIS and hydrological modelling must involve the representation of the land surface and the channel network. It is suggested that GIS possess six fundamental data structures: three basic structures (point, line and polygon) and three derived structures (grid, triangulated irregular network, and network). Each of these data structures is well understood in hydrological modelling, where they have been used for many years as a basis for flow modelling. It is clear that computation is more efficient if it can rely on a single set of data structures instead of having data passed back and forth between different data structures (Joao et al., 1990). There are specialised products designed to provide network modelling and address matching capabilities. They allow users to simulate all the elements, characteristics and functions as they appear in the real world. They also provide geocoding capabilities, which the user can use to take information stored in tabular address files and add it to the project for subsequent use in analytical applications.
In the research described in this chapter the PC Arc/Info Network module was used, with the ROUTE programme used to determine optimal paths for the movement of water through the network. There are several general elements in network data structures, such as PC Arc/Info Network, to describe real-world objects.
• Links are the conduits for movement. Their attributes simulate flow (two-way impedance) and water amount (demand). For the river network the two-way impedance is used one way only: downstream. Links represent streams and channels in river networks.
• Barriers prevent movement between links. Resources cannot flow through them. They have no attributes. Barriers can represent facilities for flow prevention on streams or channels.
• Turns are all possible turns at an intersection. Flow can occur from one link, through the node, and onto another link. Their attributes define the only permitted direction of flow, downstream; all other turns are defined as closed. Turns represent all confluences in the river network.
• Stops are locations on a route to pick up or drop off resources along paths through the network. They are used to represent wells, springs, water consumers, factories, point pollution sources or any facilities which have a specific volume of water for distribution. Their attributes define adding or losing water volumes. Stops are used only in the ROUTE procedure.
Attributes associated with network elements are represented by items defined according to their data types and values (Vozenilek, 1994b). Most of the network elements have one or more characteristics that are an important part of the network.
Impedance measures resistance to movement. Impedances are attributes of arcs and turns. The amount of impedance depends upon a number of factors, such as the character of the arc (e.g. roughness, slope, type of channel), the types of resources moving through the network, the direction in which the resources are flowing, special conditions along an arc, etc. Turn impedances measure the amount of resistance to moving from one arc, through a node, onto another arc. They vary according to the conditions at the intersection of the two arcs. All arc and turn impedances in a network must always be in the same units for accurate calculations. The purpose of impedances is to simulate the variable conditions along lines and turns in real networks. Negative impedance values in ROUTE block movement along any arc or turn. Negative values can also be used to prevent travel along arcs in a certain direction.
Resource demand, the number or amount of resources associated with any feature, can be specified for every arc and stop in the network. Resource demand is the amount of water carried by each arc in a water distribution system (Vozenilek, 1995). To simulate runoff in a river network the allocate procedure can be used. The allocate procedure allows users to model the distribution of resources from one or more centres.
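The role of impedances described above can be illustrated with a short Python sketch of least-impedance routing. It is not the ROUTE procedure itself: it is a plain Dijkstra search over a hypothetical arc table in which, by assumption, a negative impedance marks a blocked arc or direction, and turn impedances are ignored for brevity.

```python
import heapq


def least_impedance(arcs, source, target):
    """Return the smallest total impedance from source to target, or None.

    arcs: dict mapping (from_node, to_node) to an impedance value.
          A negative value is treated as a blocked arc (assumption).
    """
    graph = {}
    for (start, end), impedance in arcs.items():
        if impedance < 0:
            continue  # blocked: movement along this arc is not allowed
        graph.setdefault(start, []).append((end, impedance))

    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, impedance in graph.get(node, []):
            new_cost = cost + impedance
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return None  # target cannot be reached


# Hypothetical arcs; all impedances are in the same unit (e.g. minutes of flow time).
arcs = {
    ("spring", "a"): 5,
    ("a", "outlet"): 7,
    ("spring", "b"): 2,
    ("b", "outlet"): -1,  # barrier on this arc
    ("outlet", "a"): -1,  # upstream direction closed
}
cost = least_impedance(arcs, "spring", "outlet")
upstream_cost = least_impedance(arcs, "outlet", "spring")
```

In this example the shorter branch through node b is closed by a barrier, so the water is routed through node a, and no route exists in the upstream direction.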
Geographers can model the flow of resources to and from a centre. Allocation is the process of assigning links in the network to the closest centre.
Network data structures and analysis have a wide range of uses. Water companies can record data about wells, streams, reservoirs, channels, and so on, and use the data to model water availability and distribution. Hydrologists can simulate sources of water pollution and analyse their downstream effects. Networks can also be used to model storm runoff. Forestry departments will find network analysis useful in management studies, such as testing the feasibility of logging system transportation plans. Wildlife managers can use networks to assess the potential environmental impacts on wildlife migration routes. The following section describes some of the applications of the methodology outlined above and identifies key research issues.
15.4 CASE STUDIES
The ability to recognise and quantify spatial structures is an essential component of river system management. A number of approaches may be taken to establish the database, depending on where each catchment data set is to be located and who would require access to it. Providing a powerful system to deal with a huge amount of data of different types and from different sources is a significant opportunity for GIS (Kanok, 1995). There is a need to see GIS as more integrated and oriented towards an information provision process, away from primary data collection and documentation. Three study areas were used to test the approach described above in practice. During this process, several GIS-related questions emerged with respect to:
Figure 15.2: The Network Elements in the Smallest Scale Catchment
• building a digital spatial database including all information from the research,
• investigating the possibility of integrating a knowledge base, model-base and GIS,
• evaluating time in the model of the river system,
• replacing manual work by digital data processing,
• modelling the river network,
• modelling the monitoring outcomes,
• visualising the results,
• selecting the best management practice to improve the ecological status of the river system.
15.4.1 Initial Model
The smallest-scale test of the approach used a simple hypothetical catchment (Figure 15.2). The basic network elements were defined and the algorithms implemented. The results were used for model calibration and verification. The relation between the form of a stream channel network and the character of throughput flows is a complex one, and in particular the distinction between cause and effect is very much a function of time. Inputs of precipitation into a stream network produce two types of output:
1. The discharge of water and debris. This discharge is a relatively immediate output, clearly related to a given input, and the prime concern of those who exploit the short-term operation of river systems, and
2. Changes in the channel network itself. These changes in many aspects of channel geometry are the natural result of certain magnitudes and frequencies of runoff and discharge input. The larger the time interval
under consideration the more important are these changes, either due to cumulative effects or to the increasing probability of encountering extreme events of high energy. Thus some attributes of the hydraulic geometry of self-formed channels (e.g. bankfull hydraulic radius) reflect events of relatively short recurrence interval (e.g. the most probable annual flood discharge), whereas other network attributes (e.g. the areal drainage network density) are more the product of events of larger recurrence intervals.
The data were collected following a classification scheme which defined the features as points, lines or areas. Data collection becomes more cost-effective when it concentrates first on key elements of the river system; additional data need be collected only if further refinements are required. Field work is supported by spatial hypotheses, making the selection of representative sites more objective. Data for network analyses were collected in different ways, including digitising topographical and thematic maps, processing aerial photographs, direct and indirect field measurements, results of spatial analyses, and results from laboratory analyses. The practical value of measurements is determined by the degree to which they help in understanding the causative basis of river system dynamics. Ultimately, further research will be needed to establish the actual use of measurements and their importance in river management.
15.4.2 The Trkmanka catchment
To demonstrate the capabilities of network analysis the Trkmanka catchment was used as the study area (see Figure 15.3). The Trkmanka river is a left tributary of the Dyje river in the south-eastern part of the Czech Republic. The data were obtained from topographical maps at 1:50 000 scale (digitised river network) and from field research (meteorological and hydrological stations). The calculation of runoff was chosen as the process to be simulated.
This simple task gives immediate results which are presented in maps. To use the project outputs efficiently it is recommended that the implementation of this system be demonstrated at a larger scale. This has implications for the collection of data to create a complex regional database applicable to different environmental applications, the preparation of inputs for the simulation packages, and the analysis of the data to infer the best management practices.
Database
The topographical and environmental databases have been organised mainly from 1:50 000 scale maps. The integrated information ranged from historical series or planning data with geographical references (water flows, rainfall, evaporation data, etc.) to non-serial georeferenced data (Vozenilek, 1994c). Database generalisation was taken as a more general process than normally conceived by traditional cartographers (Vozenilek, 1995). The generalisation involved:
• filtering features of specified regions where the phenomenon under study is continuous in nature,
• selecting an appropriate areal base or framework among a very large number of alternatives,
• reducing the number of areal units when data are aggregated to a higher level in a spatial hierarchy,
• modifying the initially recorded position, magnitude, shape or nature of discrete geographical objects, or changing their relationships with other discrete objects,
• changing the measurement scale in the attribute data on aggregation.
Figure 15.3: The Trkmanka Catchment
The generalisation of vector data consists of generalisation of both attribute data and spatial data. Table 15.1 presents the terminology of generalisation procedures according to McMaster (1989), Brassel and Wiebel (1988) and Tobler (1989).
Table 15.1: The Types of Generalisation
Attribute data
  Thematic data, Temporal data: Classification, Symbolisation, Statistical means
Spatial data
  Punctiform data: Omit, Aggregate, Displace
  Linear data: Simplify, Smooth, Displace, Merge, Enhance, Omit
  Areal data: Amalgamate, Collapse, Displace
  Surface data: Smooth, Enhance, Simplify
  Flow data: Amalgamate, Omit
Figure 15.4: The Morava River System
Generalisation can dramatically affect the quality of results derived from a GIS. In addition, GIS analysis can become difficult if the data sets do not match older ones. Generalisation influences several aspects of the project (Joao et al., 1990):
Differences in measured lengths, areas, volumes and densities. The length of mapped features (such as rivers) increases with increasing map scale (Steinhaus' paradox). Similar effects occur in relation to area and surface data.
Shifts of geometric features. Massive shifts of objects occur between different representations. Geological boundaries were digitised at 1:50 000 scale and then replayed precisely at the smaller scale to match the manually generalised topographical base map. Generally, the location of spatial means or centroids will also be affected (affecting in turn inter-zonal distance calculations); statistical errors will be generated in the resulting two-dimensional coverage; and the topological errors generated will inadvertently lead to a quite different three-dimensional surface model. The problem of “sliver polygons” occurred frequently.
Statistically related problems. These problems occurred in handling data extracted from large national databases. Two different cases are involved: the effect of zonal “reporting units” at different levels of spatial aggregation, and the effects of different arrangements of boundaries for a single level of aggregation.
15.4.3 The Morava river system
The ecological research on 4,067 km of the Morava River system shows that only one third is in a healthy state, while the remaining parts are unacceptable from an environmental perspective. A full 1,086 km can be considered an ecological catastrophe. The basis for this classification is the “ecological value”. The analyses of the different factors show that the character (landscape type) of the flood-plain has the strongest impact on the ecological status of the system.
Where forest is located along the river, the ecological status of the ecosystem is good; where the land is arable, however, the river is usually in a very poor condition. The Trkmanka catchment is a part of the Morava River system (see Figure 15.4).
Figure 15.5: The Relation Between Total Ecological Value and Altitude
The project was originally designed as a large interdisciplinary study. Because of the huge amount of information involved, the project provides a suitable platform for GIS implementation. The project is divided into eight sub-projects:
• water quality monitoring;
• point pollution sources;
• non-point pollution sources;
• sources of drinking water;
• strategy of using water related to water protection;
• evaluation and water quality models;
• ecological status of the rivers;
• synthesis and regulation proposal.
These sub-projects required the collection of vast quantities of data, including: the collection of water quality samples from the Morava River and its main tributaries to analyse the pollutants; observation of paper and wood companies and their waste activities; observation of the impact of the sugar-beet season; update of point sources of pollution in agriculture; analysis of organic and inorganic contents of fluvial sediments; monitoring of soil erosion, USLE implementation and the calculation of the potential soil erosion; investigation of sources of drinking water and identification of contaminated water; observation of the river system with regard to the relationship between water economy and discharge; and global assessment of water quality and ecological values.
Analysis
The major output of the project can be described in terms of maps, tables, reports and photos (see, for example, Figures 15.5 and 15.6). The main analysis of the data was carried out using GIS techniques on spatial environmental databases. Some point features and non-spatial data however were analysed by FoxBase+ database software. A conversion of these data to ARC/INFO was carried out to produce maps. Spatial statistical modelling has been done as a tool to estimate variable values at specific locations. Statistical analysis of the basic data was carried out parallel to data processing, e.g. evaluation of water time
Figure 15.6: The Relation Between Total Ecological Value (EV) and Landscape Type (LT).
series data, creation of results which have been stored as single attributes in the GIS database. An essential part of the assessment procedure for the abiotic aspects was the application of a groundwater model in combination with GIS capabilities. The main objectives of the analysis were:
• assessment of regional and local conditions (surface, soil, land use, etc.);
• assessment, modelling and simulation of water movement in the river system in relation to soil, land cover and agricultural practices;
• development of procedures to optimise the use of limited water resources and their range of natural conditions, and to provide guidelines for their management.
The developed techniques are applicable to a wide range of conditions. The potential impacts of river engineering works on the flood-plain ecosystems were evaluated by “ecological impact assessment” and “ecological risk assessment” procedures. Environmental models are becoming increasingly spatial and quantitative, enabling a more precise assessment of the landscape. In addition, improvements in advanced dynamic modelling, for example using data types built on video compression techniques to reduce storage space and processing time, will allow for further refinement in resolution. At the time of writing the project had not yet been completed, but it is expected that it will provide considerable input to environmental decision-making in the area.
15.5 CONCLUSIONS
Environmental studies usually imply the knowledge of continuous data, but environmental data are generally measured at specific locations. One goal of the project was to compare the spatial trends in order to assess the validity of the techniques used to estimate the ecological status of the Morava River. GIS has played, and will continue to play, a significant role in the new management paradigm of river systems.
However, the power of GIS as a decision-making tool can be reduced if the accuracy of the results cannot be controlled, independently of the type of users. User experience of such shortcomings seems likely to result
in a backlash against the use of GIS-based methods and even against quantitatively based planning tools in general.
Further research needs to focus on more complicated phenomena in river networks. One example is fish movement in rivers, as fish can move against the stream and carry diseases or pollution both downstream and upstream. Another is the movement of tides and water in marshlands, as discussed in the next chapter of this volume. From the perspective of this project, the experience gained so far indicates the need for: an extension of procedures to properly account for additional aspects (municipal water management, water logging, deep percolation, alkalisation, etc.); extensions of the models to manage a complete irrigation system with reservoirs, canals and hydroelectric power generation; verification of the techniques through comparison with controlled field experiments; and improvement of irrigation efficiency through reduction of seepage, deep percolation and similar phenomena. This is a tall research agenda, the findings of which will undoubtedly have very positive applications, particularly in environmentally distressed areas such as that of the Morava.
REFERENCES
BISHR, Y.A. and RADVAN, M.M. 1995. Object orientated concept to implement the interaction between watershed components that influences the selection of the best management practice, in Proceedings of the First Joint European Conference and Exhibition on Geographical Information, The Hague, pp. 265–278.
BONHAM-CARTER, G.F. 1996. Geographical Information Systems for Geoscientists. London: Pergamon Press.
BRASSEL, K. and WIEBEL, R. 1988. A review and framework of automated map generalisation, International Journal of Geographical Information Systems, 3, pp. 38–42.
GOODCHILD, M.F., PARKS, B.O. and STEYAERT, L.T. (Eds.) 1993. Environmental Modeling with GIS. Oxford: Oxford University Press.
HAGGETT, P. and CHORLEY, R.J. 1969. Network Analysis in Geography. London: Edward Arnold.
JOAO, E., RHIND, D. and KELK, B. 1990. Generalisation and GIS databases, in Harts, J., Ottens, H. and Scholten, H. (Eds.), Proceedings of the First European Geographical Information Systems Conference, Amsterdam, 10–13 April. Utrecht: EGIS Foundation, pp. 368–381.
KANOK, J. 1995. Die Farbenauswahl bei der Bildung von Urheberoriginalen der Thematischen Karten in der Computer, in Acta facultatis rerum naturalium Universitas Ostraviensis. Ostrava: University of Ostrava, pp. 21–31.
MAIDMENT, D.R. 1993. GIS and hydrologic modelling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modeling with GIS. Oxford: Oxford University Press, pp. 147–167.
McMASTER, R. 1989. Introduction to numerical generalization in cartography, Cartographica, Monograph 40, 26(1), pp. 241.
TOBLER, W. 1989. Frame independent spatial analysis, in Goodchild, M. and Gopal, S. (Eds.), Accuracy of spatial data bases. London: Taylor & Francis, pp. 56–80.
VOZENILEK, V. 1994a. Computer models in geography, in Acta Universitatis Palackianae Olomoucensis, Facultas Rerum Naturalium 118, Geographica-Geologica 33, pp. 59–64.
VOZENILEK, V. 1994b. Data format and data sources for hydrological modelling, in Proceedings of Regional Conference of International Geographical Union, Prague, pp. 256–261.
VOZENILEK, V. 1994c. From topographic map to terrain and hydrological digital data: an arc/info approach, in Acta Universitatis Palackianae Olomoucensis, Facultas Rerum Naturalium 118, Geographica-Geologica 33, pp. 83–92.
VOZENILEK, V. 1995. Environmental databases in GIS, in GeoForum, 2, Bratislava, p. 47 (in Czech).
Chapter Sixteen
Approach to Dynamic Wetland Modelling in GIS
Carsten Bjornsson
16.1 INTRODUCTION
Developing wetlands to improve water quality involves the analysis of existing hydrological conditions, which are often characterised by variations in time and space and by discrete sample points. To bring about spatial continuity, hydrologic and water quality models have been developed and implemented using GIS. With respect to spatial distribution, these GIS models are often based on one-dimensional network algorithms or on surfaces with a many-to-one spatial relationship and some level of accumulation of flows. Some hydrologic movements in space are described by a many-to-many relationship characterised by types of behaviour such as dispersion, diffusion, etc., where movement is multidirectional and the object in focus is spreading or dividing itself. This chapter suggests a different approach to modelling multidirectional movements in GIS, using existing programming tools within commercial raster GIS.
16.1.1 Problems with the Sustainability of Wetlands
In 1932 Beadle was studying water quality in African lakes and in particular the Chambura Papyrus Swamp. He noted that water quality was much better in the lake where the inflow passed through a wetland than in an adjacent lake without a wetland. Based on their biochemical cycles, wetlands can be introduced for remediation purposes to improve water quality in streams and rivers, acting like wastewater treatment for nutrients and other chemicals. In principle there are three types of wetlands used for these purposes: natural wetlands, enhanced wetlands and constructed wetlands. Natural wetlands are wetlands which have not been altered or restored to a previous state. Enhanced wetlands are also natural but have been modified or altered in some way to fulfil one or more purposes. Constructed wetlands are wetlands on sites with no previously recorded wetland data.
Whatever their type, wetlands are balanced ecosystems, and when misplaced they may have the opposite of the intended effect, releasing either remediants or substances which degrade the wetland (Mitsch, 1993). Wetlands are often characterised by the presence of water, either at the surface or within the root zone, by unique soil conditions that differ from adjacent upland, and by vegetation adapted to the wet conditions; conversely, they are characterised by the absence of flooding-intolerant vegetation. Misplacement is often the result of a lack of knowledge in locating and analysing site conditions. It is thus of the highest importance to identify areas within the watershed which have the potential for long-range efficiency (Kuslar and Kentula, 1990). Efficiency is primarily dependent on hydrology (Hedin et al., 1994; Stabaskraba et al., 1988). Hydrological, erosion, environmental and ecological landscape models have been
used for many years to study problems related to water quality and to determine measures to reduce the load of contaminants. Models in general have the advantage of describing the relationships between processes, which makes it possible to simulate how a system reacts to different management strategies and to test scientific hypotheses. At the same time models offer the possibility of understanding the processes involved and help identify areas where further knowledge is needed or data are sparse (Jørgensen, 1988). Two different approaches have been taken in modelling water quality and wetlands. The first involves ecosystem modelling, where mathematical models have been used to simulate wetland remediation performance (Mitsch, 1990). These types of models are lumped models in a spatial context. The second approach involves models handling data in a spatial domain, either as surfaces or as networks. In recent years models of the second kind have been integrated to some extent with GIS (Maidment, 1993; Moore et al., 1993).
16.2 MODELS IN A LUMPED DOMAIN
Wetland models are based on the philosophy of ecosystem modelling, which highlights interaction between a set of variables within an ecosystem (Sklar and Constanza, 1990; Stabaskraba et al., 1988). They include: external variables, which impose changes on the system; state variables, which describe the state of the system; parameters, which are numeric values that may change; and constants, which describe global values (Jørgensen, 1988). The interaction between these components describes the rate of change (dynamics) and is expressed through a condensed set of differential equations, thus making these types of models mathematical models (Jørgensen, 1988; Sklar et al., 1994). The rate of change, also termed the fundamental equation of the model, can be expressed as:
$$x(t+\Delta t) = x(t) + r(*)\,\Delta t \qquad (16.1)$$
in which the state variable x is projected a small time step Δt ahead, with r being the rate of change, so that the change in x is proportional to Δt.
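The projection described by (16.1) can be illustrated with a minimal sketch. It assumes the fundamental equation has the usual explicit form x(t+Δt) = x(t) + r·Δt; the function name `project`, the first-order loss used as the rate of change and its coefficient `K` are hypothetical choices made only for illustration and are not taken from the chapter.

```python
def project(x, rate, t, dt):
    """Advance a state variable one time step: x(t + dt) = x(t) + r * dt."""
    return x + rate(x, t) * dt


K = 0.5  # assumed loss coefficient (per day), purely illustrative


def loss(x, t):
    """Hypothetical rate of change: first-order loss, r = -K * x."""
    return -K * x


x = 10.0   # initial value of the state variable
dt = 0.1   # time step in days
for step in range(3):
    x = project(x, loss, step * dt, dt)
```

Each pass through the loop removes 5 per cent of the current value, so the state variable is only ever known at the discrete times 0, Δt, 2Δt and so on.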
The (*) denotes that r may depend on various quantities such as time, external variables, other state variables and feedback mechanisms from x. Expressed with only one state variable, the previous equation becomes:
$$x(t+\Delta t) = x(t) + r\big(x(t)\big)\,\Delta t \qquad (16.2)$$
which states that x is not affected by other state variables. An equation of this type is often stated as a first-order ordinary differential equation. More complex systems with interactions between several state variables are described through a set of differential equations, with one equation for each state variable. For a system described by two state variables x(t) and y(t) the fundamental equations can be expressed by:
$$x(t+\Delta t) = x(t) + r_x\big(x(t), y(t)\big)\,\Delta t, \qquad y(t+\Delta t) = y(t) + r_y\big(x(t), y(t)\big)\,\Delta t \qquad (16.3)$$
Today these differential equations, as well as much more complex ones, are solved using numeric integration techniques, often implemented in modelling programs like SYSL, STELLA and EXTEND or programmed by the user in C, Pascal or Fortran. A frequently implemented numeric integration technique for solving differential equations is the fourth-order Runge-Kutta method, which approximates r(t+h) through a repeated stepwise approximation from r at time t. The result is not a continuous function r(t) but a discrete set of approximations (r0, r1, r2, …) at points in time (t0, t1, t2, …) (Knudsen and Feldberg, 1994), obtained through a set of iterations for each time step (Chow et al., 1988). Every equation describing the system is thus solved for each iteration before advancing to the next. The number of iterations is controlled either by a user-defined constant or through some minimum rate of change. One advantage of using Runge-Kutta is the possibility of estimating, for each time step, the numerical errors due to either rounding or truncation. In GIS no immediate algorithms exist to perform numeric integration and solve a
set of differential equations. Equations for different state variables formulated as (16.2) can be solved by algebra as long as the model is linear and projected forward in time. This procedure simplifies calculation of the fundamental equations but introduces some obvious errors, since each equation has to be solved explicitly and serves as input to the next fundamental equation within the same time step.
16.2.1 Spatial modelling approaches
A spatial model is a model in a bi-space (space, attribute) and a space-time model is a model in a tri-space (space, time, attribute) (Chapter 10, this volume). Wetland and water quality models can take advantage of GIS to implement spatial models. This implementation can broadly be divided into two approaches (Fedra, 1993). The first approach uses the analytical tools of GIS to extract hydrologic parameters (Moore et al., 1993) and export these to a model outside GIS. This model reads and processes the input data and produces output which is then routed back for display in GIS. The models used outside GIS have a structure similar to lumped ecosystem models. The second approach involves spatial modelling within GIS, where tools for spatial data handling combined with analytical capabilities make up the spatial model. Models of the latter type are characterised by at least one of five features in their modelling structure:
1. The processes modelled are often expressed in spatially averaged terms on the computational level (Vieux, 1991). In raster GIS this involves models like AGNPS (Young et al., 1987), which uses simplistic algebraic relations and indexing like USLE (Wischmeier and Smith, 1978) and the SCS curve number (Chow, 1990) to calculate erosion and runoff from each cell. In network GIS, routing of water based on Dijkstra's algorithm has been used to determine steady flows through pipes and channels using simple linear equations (Maidment, 1993).
The reason for this is the inadequacy of present GIS to perform numerical solutions.
2. Hydrologic conditions simulated in watershed management models are often expressed as overland flow, because the models are aimed at the management of flooding, erosion and sediment discharge. Models like AGNPS, ANSWER (Beasley et al., 1980), ARMSED (Hodge et al., 1988) and other runoff models used for estimation of runoff (Nageshwar et al., 1992; Stuebe and Johnston, 1990) assume that stream flow is generated during rainfall when the soil becomes saturated, which leads to Hortonian overland flow running to a stream (Maidment, 1993). In Denmark overland flows of this type only occur in places with impermeable surfaces, during thunderstorms, or in periods with frozen soil. Most areas contributing to overland flow are located near and around streams, where saturated conditions occasionally exist. In GIS there are no algorithms to calculate this type of partial area flow (Maidment, 1993).
3. Temporal domains are often handled through a single event phenomenon, where an event is released in one instance and a snapshot is taken showing only peak conditions throughout the watershed, for example the distribution of precipitation (Maidment, 1993). Water quality models like ARMSED and ANSWER are based on the single event approach, calculating the routing of surface water to estimate loads of sediment. Algorithms to describe pulses (the distribution of flow over a surface as well as in the subsurface) are not yet implemented in GIS.
4. Existing algorithms in raster GIS for routing surface water are based on the pour-point model (Jenson, 1991). Through focal analysis, the flow is directed towards the lowest lying of the eight adjacent cells. The spatial relationship is many-to-one. This structure can be used for watershed delineation as well as for tracing flow paths. Using this approach in a DEM with no obstacles might be satisfactory but with
obstacles, like hedgerows or boundaries between farming fields, surface flow follows different patterns. Other phenomena like waves, dispersion, and diffusion also follow other flow patterns. An algorithm based on Darcy's flux is implemented in commercial raster GIS such as Arc/Info to model dispersion (ESRI, 1996). This algorithm supports a spatial relation of one cell to its four adjacent cells, where diagonals are left out.
5. Flows described in networks are modelled as 1D lines and routed along these lines, having no interactions with surrounding zones (Chapter 10, this volume). Describing inundation, partial area flow, or flooding therefore creates problems in network GIS.
16.2.2 Spatial modelling and dynamic simulation
A third and very promising type of wetland model is the landscape process model, a spatial model based on a merger of ecosystem models and distributed process thinking (Sklar and Constanza, 1990). Landscape process models are spatial models which divide the landscape into geometric compartments and then describe flows within compartments and flows between compartments according to location-specific algorithms. Flows in this type of model can be multidirectional and are determined by the interaction of variables, thus allowing feedback mechanisms to occur. This type of model is dynamic, and time can be treated either as a discrete entity or continuously (Chapter 10, this volume; Maidment, 1993). Models like CELSS simulate movements of nutrients between 1 km² squares in a marsh land. Each square (cell) consists of a dynamic non-linear simulation model with eight state variables. Each cell exchanges water and materials with each of its adjacent cells. This connectivity is a function of landscape characteristics like habitat type, drainage density, waterway orientation, and levee height.
In a spatial context, rates of change between state variables are transformed into movements, where the same object occupies different positions in space at different times (Galton, 1996).
16.3 A FRAMEWORK FOR SPATIAL MODELLING
According to Galton (1996), the definition of movement involves a definition of space, time, and position as well as object:
• Space may have up to three dimensions. Three dimensions represent a volume, with the boundary being a surface. Two dimensions represent a surface, also called an area or a region, with edges as its boundaries. An edge is one-dimensional; its extension is an arc or length bounded by a pair of points.
• Time is characterised by duration and direction, where a certain duration can be defined as an interval bounded by instants. An interval is clearly defined by its instants and marks a stretch of time. Direction is often obtained by formally imposing an ordering relation on sets of times, often expressed in linear form. A third issue is also important when defining time: granularity, which means the temporal resolving power. Granularity can be addressed in either of two ways. One is to work with discrete time, where temporal duration is articulated into a set of minimal steps. The other is dense time, where steps can be arbitrarily subdivided: between any two instants a third can be found, and so on towards infinite subdivision.
• An object is anything that it makes sense to ascribe movement to. An object can be rigid, maintaining its shape and size at all times, or non-rigid, changing shape, size, or position.
Figure 16.1: Commercial GIS addressing of immediate Neighbourhood.
• The position of an object has been formulated as the total region of space occupied by it at a given time, so that position and object are congruent. It can be either an 'exact' position or one defined by proximity or contiguity.
Movement can thus be defined by mapping, at each time t, the position of a body at time t. Working with this definition, t has to be defined as well as the type of position. Movement is continuous if, to get from one position to another, the object has to pass through a set of intermediate positions, making no sudden jumps from one place to another. In a discrete space, positions are neighbours if the object can move directly from one to the other. Such a space is called quasi-continuous when the rule is that objects can only move directly between neighbouring positions.
16.4 IMPLEMENTATION WITHIN GIS
Introducing the above framework for movement together with landscape process thinking may eliminate many of the current constraints in GIS. Using existing tools in GIS, interaction matrices can be implemented to describe movement by defining objects, time, space, and position. In raster GIS, matrices are defined by rows and columns, where each 'cell' in the matrix has one attribute besides its row and column number stored in a data model (Hansen, 1993). Using map algebra, different matrices can be modelled together using either local, focal, zonal, or global approaches (Tomlin, 1990). Focal analysis calculates a cell value as some function of its immediate neighbourhood, which corresponds to the definition of movement. Focal analysis tools further allow handling subsets of immediate neighbourhoods. In commercial GIS, addressing of the immediate neighbourhood occurs in one of three procedures (ESRI, 1996):
• Most widely used have been functions calculating the directions of flows based on bit values indicating the direction of steepest flow. Bit values range from 1 to 128, where each of the eight surrounding cells has its own bit value.
The cell will get a value corresponding to the bit value of the lowest-lying cell. Many cells can point into one cell but one cell can only point to one cell (see Figure 16.1, Diagram 1).
• The other type of flow processing calculates which cells will flow to the target cell. The cell gets a bit value indicating which cells flow to the target cell. Identifying these cells involves additional application programming. Conversely, it is possible to identify the adjacent cells addressed by the cell, which gives the ability to represent a many-to-many relationship. If a focal flow value is calculated as 5, only the two cells with values 1 and 4 are flowing into the cell, whereas the rest of the cells will receive flow from the cell (see Figure 16.1, Diagram 2).
• A much less recognised approach, but one with greater possibilities, is the use of individual cell addressing. This technique involves application programming within GIS, where a cell can be programmed to address each of its surrounding neighbours using a reference relative to itself. This reference consists of two co-ordinates, and the cell value will be a function of the cell addressed (see Figure 16.1, Diagram 3).
Figure 16.2: Movements in GIS can be described as focal interaction, focal relations and local relations
In describing movements the last two approaches seem feasible. Two immediate advantages of such approaches are the possibility of using the map-algebraic capabilities of GIS in conjunction with focal cell modelling, and the fact that no data transfer has to occur.
16.5 MOVEMENT IN GIS
We have shown that rates of change between cells are described by movement, where the same object occupies different positions in space at different times. Using individual cell addressing in raster GIS, it is possible to describe the movement of objects based on the landscape process model approach. Conceptually, three issues have to be determined for an object to move from one cell to the next. The first is whether a relation between two cells exists or not. This could be termed the level of focal interaction. Second, if an interaction occurs, the kind of focal relation describing the interaction needs to be determined. Third, there is a local relation: whatever the type of focal interaction, processes can occur within the cell, so that the value of the cell changes during a certain period of time. By using stepwise individual cell addressing it is possible to determine which relationship exists between the target cell and each of its surrounding cells. Each cell under investigation can be identified through a pointer to the cell, defined as its location in relation to the target cell. If focal interaction is established, processing between cells can occur. The focal and local relations can be expressed as linear mathematical equations to determine the rates of change. Since no immediate numerical integration techniques exist at this level of cell processing, the equations must be solved algebraically. This automatically creates difficulties in describing rates of change simultaneously in a space and time domain. Nevertheless a solution can be reached by accepting an approximation of the rates of change.
(See Figure 16.2.) Based on the concept of ecosystem theory, each cell can be interpreted as a single ecosystem governed by a set of fundamental equations, each describing the rates of change involved. Having these equations makes it possible to describe exchanges between cells. Working with the immediate neighbourhood fulfils the assumption that the processes described do not jump cells but instead pass through cells. The cell can thus be expressed as a dynamic model which describes changes in the spatial pattern of the immediate neighbourhood from time t to time t+m:
$$X_{t+m} = f(X_t, Y_t) \qquad (16.4)$$
where X is a column vector describing the spatial pattern at time t of a state variable and Y describes other external or internal variables in the immediate neighbourhood. An effect of this definition, when implemented in individual cell processing, is its dynamic structure, where feedback mechanisms are allowed through interactions of individual cells. Within each cell, parameters from the cell and its neighbours are processed (Constanza et al., 1990).
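The three issues of Section 16.5 (focal interaction, focal relation, local bookkeeping) can be sketched as one update of a small grid from time t to time t+m. The rule used below is a deliberately simple assumption for illustration and is not the chapter's model: a cell passes a fixed share `rate` of its water, split evenly, to every neighbour whose water surface lies lower. The names `OFFSETS` and `step` are hypothetical.

```python
# Relative addresses of the eight neighbours as (row offset, column offset).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def step(water, bed, rate=0.25):
    """Return the water grid at t+m from the water grid at t (X) and bed heights (Y)."""
    rows, cols = len(water), len(water[0])
    new = [row[:] for row in water]  # write to a copy so all cells see the state at t
    for r in range(rows):
        for c in range(cols):
            surface = bed[r][c] + water[r][c]
            # Focal interaction: which neighbours lie below this cell's water surface?
            lower = [(r + dr, c + dc) for dr, dc in OFFSETS
                     if 0 <= r + dr < rows and 0 <= c + dc < cols
                     and bed[r + dr][c + dc] + water[r + dr][c + dc] < surface]
            if not lower or water[r][c] == 0.0:
                continue
            # Focal relation: a fixed share of the storage, split evenly (many-to-many).
            outflow = rate * water[r][c]
            new[r][c] -= outflow
            for nr, nc in lower:
                new[nr][nc] += outflow / len(lower)
    return new


bed = [[2.0, 1.0, 2.0]]
water = [[0.0, 4.0, 0.0]]
after = step(water, bed)  # the centre cell feeds both of its lower neighbours
```

Because every cell reads the grid at time t and writes to a copy, the result does not depend on the order in which cells are visited, which is one way of approximating simultaneous change with purely algebraic steps.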
Figure 16.3: A conceptual model of stream flow using ecosystem modeling approach where stream (ST) represents the state variable and precipitation, evaporation, and groundwater external variables affecting the state of ST
16.6 BACK TO THE HYDROLOGY PROBLEM
Hydrological flow processes represent the most important factor affecting the kinetics of the wetland. Instead of modelling a stream via lines in GIS, a stream feature can be modelled as a set of individual segments, each adjacent to the next. Each cell is a homogeneous entity representing a system of its own interacting with its immediate neighbours. Expressed as a conceptual model, the variables for a stream could have the following appearance:
$$\frac{\Delta ST}{\Delta t} = PR_t + SF^{\mathrm{in}}_{t-1} + GW_t - EP_t - SF^{\mathrm{out}}_t \qquad (16.5)$$
where the rate of change for stream water in a cell (ST) is precipitation (PR) at time t, plus a contribution from adjacent cells at time t−1 (SF in), plus a contribution from groundwater (GW) at time t, minus evaporation (EP) at time t and flow out of the cell (SF out) at time t (see Figure 16.3).
PR (precipitation) is the daily accumulated precipitation over each cell and gives a positive contribution to the system. It comes from accumulated time series of data expressed as mm/day (24 hour sampling period).
EP (evaporation) is the evaporation from streams expressed as mm/24 hours, with the assumption that there is no wind to speed up evaporation from open water surfaces (Chow, 1990): (16.6) where:
T is the temperature expressed in °C
ρ is the density of water expressed in kg/m³
Rn is the net radiation, expressed as Rn = Ri (1 − a) − Re, with:
Ri being the incoming radiation in W/m²
a being the albedo (the reflected fraction of incoming radiation), based on types of land cover
Re being the emitted radiation, defined as $R_e = e\,\sigma T^4$, with:
e being the emissivity of the surface, between 0.97 and 1.00
σ being the Stefan-Boltzmann constant (5.67×10⁻⁸ W/m² K⁴)
T being the temperature in kelvin
SF (stream flow) is the flow in streams, expressed as steady uniform flow in open channels. Regarding each cell as a homogeneous entity, a stream segment can be approximated by an open channel with the same hydraulic radius throughout the cell. Each cell thus has a different hydraulic radius depending on its physical properties. This gives different flow rates between cells, creating a simple dynamic structure as well as easy algebraic solutions. Depending on data and ambition these equations could be expressed in more complex form. Uniform flow for open channels is expressed as (Chow, 1990):
$$V = \frac{1}{n}\,R^{2/3}\,S_f^{1/2} \qquad (16.7)$$
where:
V is the flow velocity
R is the hydraulic radius, defined as R = A/P, with A being the cross-sectional area of the stream, defined as height multiplied by width, and P being the wetted perimeter, defined as width + (2 × height)
n is the Manning roughness factor, determined for different types of physical features in a stream like winding, ponds etc.
Sf is the friction slope, equal to So (the bed slope) for uniform flow.
The flow rate can be calculated as:
$$Q = V A \qquad (16.8)$$
16.7 A DYNAMIC SPATIAL ALGORITHM FOR STREAM FLOW
Transforming the conceptual model for stream flow into an algorithm based on individual cell processing also introduces spatial reasoning (Berry, 1995), expressed through rules for focal interaction. One rule of focal interaction for a stream is that a stream cell receives water from an adjacent stream cell if and only if it lies beneath the donor cell. A cell can drain to several cells, provided each receiving cell either meets the above criterion or lies below the water table in the donor cell. To implement the algorithm, a small test site of 600×800 m is divided into 6×8 cells with a cell size of 100 m.
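The local relations of Section 16.6 (net radiation and evaporation) and the focal relation for stream flow (Manning velocity and flow rate) can be sketched as plain functions. Several things here are assumptions rather than the chapter's own formulation: the evaporation function uses the standard energy-balance form E = Rn / (lv ρ) with latent heat lv = 2.501×10⁶ − 2370 T J/kg and converts from m/s to mm/day, the channel is taken as rectangular as the definitions of A and P suggest, and all function names are hypothetical.

```python
STEFAN_BOLTZMANN = 5.67e-8  # W/m^2 K^4


def net_radiation(incoming, albedo, emissivity, temp_c):
    """Rn = Ri (1 - a) - Re, with Re = e * sigma * T^4 and T converted to kelvin."""
    emitted = emissivity * STEFAN_BOLTZMANN * (temp_c + 273.15) ** 4
    return incoming * (1.0 - albedo) - emitted


def evaporation_mm_per_day(rn, temp_c, density=1000.0):
    """Energy-balance evaporation, assumed form E = Rn / (lv * rho)."""
    latent_heat = 2.501e6 - 2370.0 * temp_c          # J/kg, temperature in deg C
    metres_per_second = rn / (latent_heat * density)
    return metres_per_second * 1000.0 * 86400.0      # m/s -> mm/day


def manning_velocity(width, height, n, slope):
    """V = (1/n) R^(2/3) S^(1/2) for a rectangular section, SI units."""
    area = width * height
    wetted_perimeter = width + 2.0 * height
    radius = area / wetted_perimeter
    return radius ** (2.0 / 3.0) * slope ** 0.5 / n


def flow_rate(width, height, n, slope):
    """Q = V * A in m^3/s."""
    return manning_velocity(width, height, n, slope) * width * height


# Illustrative values only: a 4 m wide, 2 m deep cell with roughness 0.05
# and a bed slope of 1 per cent has a hydraulic radius of exactly 1 m.
velocity = manning_velocity(4.0, 2.0, 0.05, 0.01)
discharge = flow_rate(4.0, 2.0, 0.05, 0.01)
daily_evaporation = evaporation_mm_per_day(200.0, 25.0, density=997.0)
```

Keeping each relation in its own function mirrors the way the parameters would sit in separate grids, with one value of width, height, roughness and slope per stream cell.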
The algorithm first reads, sets and calculates, at time t=0, all the parameters to be used in the following procedure, such as initial water table, precipitation, evaporation and volume. All these represent vertical flows in the model and are calculated by local relations only. Meteorological data are often scattered over large areas and have to be interpolated to the area in focus. In this simulation the time series of data available for 24 hours are accumulated precipitation data, sun hours, and highest and lowest temperature in that period. Using 24 hour accumulated precipitation data will however give a false value for stream velocity; therefore these data are approximated into 12 intervals, using a Gaussian normalisation curve to distribute the rain pattern over a 24 hour period. This approach is feasible because a comparison between rainfall patterns and water velocity and levels in the test stream shows steady flows at all times, indicating that the main contribution to the stream comes from subsurface water and not from surface runoff. Evaporation, evapotranspiration and initial water table are calculated using local cell operations and local relations (Tomlin, 1990). Contributions from precipitation and stream
flows from the previous time step represent the total initial surface water in the cell. Some of this initial water will evaporate during the two hour time step and is removed from the initial water table (see Figure 16.4). Secondly, focal interaction is tested and horizontal flows are calculated through individual cell processing. Focal relations, formulated through equations for slope and hence velocity and flow rates (16.8), are calculated using the equation for uniform flow in open channels (Chow, 1990). Parameters used for this calculation, such as Manning's roughness factor and physical stream conditions, are described in a set of parameter grids. Calculation of water velocity represents a "chicken or egg" situation, since velocity is determined by the height of the stream, which in turn is determined by water velocity. The problem is solved by introducing initial distributed height values for the stream, interpolated from a set of height measurements along the stream. Velocity is calculated based on the values of the adjacent lower lying stream cells and the slope between these. The algorithm (local relations) then checks the volume of water that will leave the cell within the time step and the travel time to the next cell, to ensure that water only runs a certain distance within a time step. If a volume of water in one cell has not yet reached an accumulated travel time beyond the time step, the volume is allowed to reach the next cell. If the travel time (previous travel time + new travel time) will exceed the time step, only a subset of the water will leave the cell, whereas if the opposite occurs all the water will leave the cell. In the next step the algorithm checks its immediate neighbourhood to see if any of its adjacent neighbours will contribute to the cell. If the cell itself has an accumulated time past 7200 sec., water can actually run into the cell and even further due to high velocity and low travel time.
This phenomenon is also called a pulse (Chow, 1990). Based on the cell's physical properties a new water table is calculated (local relation). The algorithm represents an iteration within a time step, where the iteration continues to calculate the distribution of stream water as long as there is a volume which flows out of the cells. When no flows occur the algorithm exits the iteration and continues with the next time step, calculating the distribution of water at time t+1. The number of time steps is controlled either by the user or by the range of accumulated precipitation data.
16.8 PRELIMINARY RESULTS
At this stage the model for stream flow is still in a testing phase. Some preliminary results can however be presented. Figure 16.5 illustrates the distribution of water in the test stream during one time step, where volumes are represented as blocks shaded according to size and volume. At time 0 the initial water table has been calculated based on average measurements and precipitation. The following five illustrations show how this volume of water is distributed throughout the stream during one time step. Distribution of volume does not follow a linear rate, since it depends on the spatial variability in the physical properties of the stream. Since no groundwater recharge is implemented in this test run, the stream will slowly dry during the iterations. This is because the velocity, and thus the travelled distance, exceeds the actual length of the test stream.
16.9 DISCUSSION
Being at a preliminary stage, this model can at present only indicate the spatial behaviour of the stream. Actual quantitative results have yet to come, since the model needs to be further calibrated and verified throughout the whole watershed. It can be argued that the same result could have been obtained using network algorithms within GIS. Using network models however does not allow back flows, partial area flows or flooding to be modelled.
The method suggested in this chapter supports these issues, thus allowing
Figure 16.4: Within each time step distribution of water in a stream is calculated through a set of iterations controlled by height of stream surface and accumulated travel time.
modelling more complex flow patterns in a spatial domain. One of the present constraints is the lack of any way to solve a set of equations simultaneously or to address and process multiple cells in one calculation. Not being able to do this introduces some crude approximations in relation to time. Nevertheless this approach allows the modelling of spatial behaviour at a much higher level than existing algorithms. The drawback of implementing these types of models in raster GIS is the level of application programming needed, which is rather slow, especially in the writing and processing of files. On the other hand, the advantage of modelling in GIS is the possibility of a seamless combination of focal interaction, focal relations, and local relations, running the model using all available tools to interpolate and analyse data. In this algorithm only simple mathematical equations have been used to describe the flow behaviour of water. It is possible to implement more complex equations based on linear solutions.
Figure 16.5: Through one time step water distributes itself and reaches a steady state after 6 iterations. As seen there is not a linear movement of volumes due to the varying physical properties of the stream as well as allowed travel time of water.
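The iteration within one time step can be sketched for a one-dimensional stream as follows. This is a simplified illustration under stated assumptions and not the chapter's implementation: velocities are held fixed for each cell instead of being recalculated from the water height, groundwater and evaporation are ignored, and water that cannot finish crossing a cell in the remaining time is split proportionally. The function name `route_time_step` is hypothetical; the 100 m cell size and 7200 s time step follow the test set-up described in Section 16.7.

```python
CELL_SIZE = 100.0    # m, cell size of the test site
TIME_STEP = 7200.0   # s, the two hour time step


def route_time_step(volumes, velocities):
    """Route water down a 1D stream (index 0 is upstream) for one time step.

    Returns (volume per cell at the end of the step, volume leaving the
    outlet, number of iterations).
    """
    n = len(volumes)
    moving = list(volumes)   # water that may still travel in this time step
    elapsed = [0.0] * n      # travel time already used by the moving water
    settled = [0.0] * n      # water that has used up its travel time
    outlet = 0.0
    iterations = 0
    while any(v > 0.0 for v in moving):
        iterations += 1
        new_moving = [0.0] * n
        new_elapsed = [0.0] * n
        for i in range(n):
            if moving[i] <= 0.0:
                continue
            travel = CELL_SIZE / velocities[i]   # time needed to cross the cell
            remaining = TIME_STEP - elapsed[i]
            if travel <= remaining:
                leaving, out_of_time = moving[i], False       # all water leaves
            else:
                leaving = moving[i] * remaining / travel      # only a subset leaves
                settled[i] += moving[i] - leaving
                out_of_time = True
            if i + 1 == n:
                outlet += leaving
            elif out_of_time:
                settled[i + 1] += leaving
            else:
                new_moving[i + 1] += leaving
                new_elapsed[i + 1] = elapsed[i] + travel
        moving, elapsed = new_moving, new_elapsed
    return settled, outlet, iterations


# Fast water (2000 s per cell) drains a three-cell stream within one time step.
fast = route_time_step([10.0, 10.0, 10.0], [0.05, 0.05, 0.05])
# Slow water (10 000 s per cell) only moves a 0.72 share one cell downstream.
slow = route_time_step([10.0, 10.0, 10.0], [0.01, 0.01, 0.01])
```

With the fast velocities the stream empties after three iterations, which corresponds to the drying described in the preliminary results when travelled distance exceeds the length of the test stream.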
16.10 CONCLUSION
The preliminary results of this work show that distributed flows of stream water can be described using individual cell processing techniques in GIS. Using this technique, stream velocity as well as flow rates are calculated for each 100 m, thus creating possibilities to estimate hydrology and sediment loads in the streams. Knowledge of these in locating wetlands is crucial if long-term efficiency of wetlands is to be obtained. Using this type of model also allows interactions between subsurface, surface, and stream, thus describing movements of hydrology and sediments within the whole watershed.
REFERENCES
BEASLEY, D.B., HUGGINS, L.F. and MONKE, E.J. 1980. ANSWER: a model for watershed planning, Transactions of the ASAE, pp. 938–944.
BERRY, J. 1995. Spatial Reasoning for Effective GIS. Fort Collins: GIS World Books.
CHOW, V.T., MAIDMENT, D.R. and MAYS, L.W. 1988. Applied Hydrology. New York: McGraw-Hill.
CONSTANZA, R., SKLAR, F.H. and WHITE, M.L. 1990. Modelling coastal landscape dynamics, Bioscience, 40(2), pp. 91–107.
ESRI 1996. ArcDOC Version 7.0.4. Redlands: ESRI.
FEDRA, K. 1993. GIS and environmental modelling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. Oxford: Oxford University Press, pp. 35–50.
GALTON, A. 1996. Towards a qualitative theory of movement, in Frank, A. (Ed.), Temporal Data in Geographic Information Systems. Vienna: Department of Geoinformation, Technical University of Vienna, pp. 57–78.
HANSEN, H.S. 1993. GIS-datamodeller, in GIS i Danmark. København: Teknisk Forlag, pp. 45–50.
HEDIN, R.S., NAIRN, R.W. and KLEINMANN, R.L.P. 1994. Passive treatment of coal mine drainage, in Bureau of Mines Information Circular/1994. Washington: United States Department of Interior.
HODGE, W., LARSON, M. and GORAN, W. 1988. Linking the ARMSED watershed process model with the GRASS geographic information system, Transactions of the ASAE, pp. 501–510.
JENSON, S.K. 1991. Applications of hydrologic information automatically extracted from digital elevation models, Hydrological Processes, 5(1), pp. 31–44.
JØRGENSEN, S.E. 1988. Environmental Modelling. Amsterdam: Elsevier.
KNUDSEN, C. and FELDBERG, R. 1994. Numerical Solution of Ordinary Differential Equations with Runge-Kutta Methods. Lyngby: Technical University of Denmark, Physics Department.
KUSLAR, J.A. and KENTULA, M.E. (Eds.) 1990. Wetland Creation and Restoration: The Status of Science. New York: Island Press.
MAIDMENT, D.R. 1993. GIS and hydrologic modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. Oxford: Oxford University Press, pp. 147–167.
MOORE, I.D., TURNER, A.K., WILSON, J.P., JENSON, S.K. and BAND, L.E. 1993. GIS and land-surface-subsurface modelling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.), Environmental Modelling with GIS. Oxford: Oxford University Press, pp. 196–230.
MITSCH, W.J. 1993. Designing constructed wetlands systems to treat agricultural non point source pollution problems, in Created and natural wetlands for controlling non point source pollution problems. Boca Raton: CRC Press.
MITSCH, W.J. and GOSSELINK, J.G. 1993. Wetlands, second edition. New York: Van Nostrand Reinhold.
NAGESHWAR, R.B., WESLEY, J.P. and RAVIKUMAR, S.D. 1992. Hydrologic parameter estimation using geographic information systems, Journal of Water Resources Planning and Management, 118(5), pp. 492–512.
SKLAR, F.H. and CONSTANZA, R. 1990. Dynamic spatial models, in Quantitative Methods in Landscape Ecology. New York: Springer Verlag, pp. 239–288.
SKLAR, F.H., GOPU, K.K., MAXWELL, T. and CONSTANZA, R. 1994. Spatially explicit and implicit dynamic simulations of wetland processes, in Global Wetlands: Old and New. Amsterdam: Elsevier, pp. 537–554.
STABASKRABA, M., MITSCH, W.J. and JØRGENSEN, S.E. 1988. Wetland modelling: an introduction and overview, in Wetland Modelling. New York: Elsevier.
STUEBE, M.M. and JOHNSTON, D.M. 1990. Runoff volume estimation using GIS techniques, Water Resources Bulletin, 26(4), pp. 611–620.
TOMLIN, C.D. 1990. Geographic Information Systems and Cartographic Modelling. Englewood Cliffs, NJ: Prentice Hall.
VIEUX, B.E. 1991. Geographic information systems and non-point source water quality and quantity modelling, Hydrologic Processes, 4, pp. 101–113.
WISCHMEIER, W.H. and SMITH, D.D. 1978. Predicting rainfall erosion losses, a guide to conservation planning, Agricultural Handbook 537. Washington DC: US Department of Agriculture.
YOUNG, R.A., ONSTAD, C.A., BOSCH, D.D. and ANDERSON, W.P. 1987. AGNPS, agricultural non-point-source pollution model: a watershed analysis tool, Conservation Research Report, 35. Washington DC: US Department of Agriculture.
Chapter Seventeen
Use of GIS for Earthquake Hazard and Loss Estimation
Stephanie King and Anne Kiremidjian
17.1 INTRODUCTION
The realistic assessment of the earthquake hazard and potential earthquake-induced damage and loss in a region is essential for purposes such as risk mitigation, resource allocation, and emergency response planning. A geographic information system (GIS) provides the ideal environment for conducting earthquake hazard and loss studies for large regions (King and Kiremidjian, 1994). Once the spatially-referenced databases of geologic, geotechnical, and structural information for the region have been compiled and stored in the GIS, different earthquake scenarios can be analysed to estimate potential damage and loss in the region.
In addition to the forecasting of future earthquake damage and loss for planning and mitigation purposes, the GIS-based analysis provides invaluable assistance when an earthquake does occur in the study region. The earthquake event is simulated in the GIS to estimate the location and severity of the damage and loss, assisting officials who need to allocate emergency response resources and personnel. As post-earthquake reconnaissance reports are made, the GIS is continuously updated to provide a relatively accurate assessment of the disaster situation in the region.
This chapter describes the development of a GIS-based methodology for earthquake hazard and loss estimation that aids in seismic risk mitigation, resource allocation, public policy decisions, and emergency response. After some brief background, an overview of the various types of data and models that comprise a regional earthquake hazard and loss estimation is presented. Following the overview is a description of the development of the GIS-based methodology, i.e., the implementation of earthquake hazard and loss estimation in a geographic information system, with examples from a case study for the region of Salt Lake County, Utah (Applied Technology Council, in progress).
17.2 BACKGROUND
Most of the previous work in the application of geographic information system technology to regional earthquake damage and loss estimation has been limited to methods usually considering only one type of seismic hazard and often applied to a small region or to a specific type of facility. Rentzis et al. (1992) used a GIS to estimate damage and loss distributions due to ground shaking in a 50-year exposure period for residential and commercial buildings in Palo Alto, California. Borcherdt et al. (1991) developed a GIS-based methodology for identifying special study zones for strong ground shaking in the San Francisco Bay
region based on regional surface geology and an intensity attenuation relationship for a repeat of the 1906 San Francisco earthquake. Kim et al. (1992) developed a GIS-based regional risk analysis program to study the vulnerability of bridges in a regional highway network. McLaren (1992) describes the use of a GIS by Pacific Gas and Electric (PG&E) to aid in the evaluation of the likely effects of high-probability, large magnitude future earthquakes in PG&E’s service territory and to set priorities for the mitigation of seismic hazards.
Due to recent improvements in the availability and quality of GIS technology, tabular database software, as well as computer hardware, a significant amount of current research has been devoted to incorporating GIS technology in seismic hazard and risk analysis. Very few of these studies, however, have considered combining the effects of the various seismic hazards such as ground shaking, soil amplification, liquefaction, landslide, and fault rupture. Additionally, most studies have been conducted for a specific site or for a specific facility type. This chapter describes a methodology for integrating all of the separate modules necessary for a comprehensive regional earthquake damage and loss analysis in a manner that is flexible in geographic location, analysis scale, database information, and analytical modelling capabilities.
17.3 OVERVIEW OF EARTHQUAKE HAZARD AND LOSS ESTIMATION
Earthquake hazard and loss estimation involves the synthesis of several types of spatially-referenced geologic, geotechnical, and structural information with earthquake hazard and loss models. The basic steps in the analysis typically include the following (as illustrated in Figure 17.1):
1. Estimation of ground shaking hazard.
Ground shaking estimation involves the identification of the seismic sources that may affect the region, modelling of the earthquake occurrences on the identified sources, modelling the propagation of seismic waves from the sources to the study region, and lastly modifying the ground motion to account for local soil conditions that may alter the shaking intensity and frequency content.
2. Estimation of secondary seismic hazards. Secondary seismic hazards include effects such as liquefaction, landslide, and fault rupture. Models to estimate these hazards are typically functions of the local geologic and geotechnical conditions, the intensity, frequency, and duration of the ground motion, and the occurrence of the hazards in previous earthquakes.
3. Estimation of damage to structures. Structural damage estimation involves the identification, collection, and storage of information on all building and lifeline structures in the region, and the modelling of damage to each type of structure as a function of ground shaking intensity and potential for secondary seismic hazards.
4. Estimation of monetary and non-monetary losses. Monetary losses include repair and replacement cost for structural damage and loss of business income, and non-monetary loss includes deaths and injuries. The estimation of these types of losses involves models that are functions of the level of damage and associated social and economic information for each structure.
There are several other types of losses that can be considered, such as clean-up and relocation cost, homelessness, unemployment, emotional distress, and other short and long-term socio-economic impacts on the region; however, the modelling of these is typically too involved, requiring separate and more detailed studies.
Figure 17.1: Basic steps in earthquake hazard and loss estimation.
17.4 A GIS-BASED EARTHQUAKE HAZARD AND LOSS ESTIMATION METHODOLOGY
As illustrated in Figure 17.1, earthquake hazard and loss estimation utilises models that require spatially-referenced information as the input parameters. A GIS is ideal for this type of analysis. The geologic, geotechnical, and structural data for the study region are stored in the form of maps with related database tables, and the models are in the form of either look-up tables or short programs written in GIS macro language that involve map overlay procedures and database calculations. Figure 17.2 illustrates the GIS implementation of earthquake hazard and loss estimation, following the same basic steps as those shown in Figure 17.1. Each of the four steps shown in Figure 17.2 is described in more detail below.
17.5 GIS MAPS OF GROUND SHAKING HAZARD
An earthquake event is typically specified in the study region by selecting the seismic source from a fault map of the area, as well as the desired magnitude of the earthquake. The bedrock ground motion resulting
Figure 17.2: GIS-based earthquake hazard and loss estimation.
from the earthquake is estimated with the use of an attenuation relationship, an empirical formula that describes the level of ground shaking at a given location as a function of earthquake magnitude and distance to the fault. In the GIS, buffer zones of equal distance around the fault are created and the level of ground motion in each buffer zone is assigned through a database table look-up. For example, Figure 17.3 shows a map of Salt Lake County, Utah with a scenario fault break that would produce a magnitude 7.5 earthquake, and the corresponding buffer zones around the fault. The database attributes associated with a typical buffer zone are also shown in Figure 17.3. The values of peak ground acceleration (PGA) were computed according to the attenuation relationship developed by Boore et al. (1993) as follows: (17.1)
where:
d = distance to the rupture zone in km
M = the assumed magnitude of the earthquake (7.5 in this study)
Gb = 0 and Gc = 0 for soil type A (shear wave velocity > 750 m/s)
Gb = 1 and Gc = 0 for soil type B (shear wave velocity = 360–750 m/s)
Gb = 0 and Gc = 1 for soil type C (shear wave velocity < 360 m/s)
Peak ground acceleration values were also converted to Modified Mercalli Intensity (MMI) (Wood and Newman, 1931) values for use in earthquake damage estimation described later in this chapter. A map of
Figure 17.3: Map showing buffer zones of ground shaking in Salt Lake County, Utah.
the local site geology is used to define the areas of A, B, and C soil types in the study region. To estimate the final surface ground shaking in the region, the map of ground motion buffer zones is combined with the soil type map to produce the final map of surface ground shaking as illustrated in Figure 17.4 for MMI values for a magnitude 7.5 earthquake in Salt Lake County, Utah. An example list of the attributes associated with each polygon on the map is also shown in Figure 17.4. High ground motion values are shown to occur through the middle portion of the county. This is due to the close proximity of the fault in these regions, as well as the presence of softer soil deposits (soil type C).
17.6 GIS MAPS OF SECONDARY SEISMIC HAZARDS
The secondary seismic hazards that are considered here include liquefaction, landslide, and surface fault rupture. The hazards associated with liquefaction and landslide are defined in terms of “high”, “moderate”, “low”, and “very low” potential of occurrence based on geologic and geotechnical conditions (see King and Kiremidjian, 1994). Maps describing the hazard due to liquefaction and landslide are often available in GIS format in terms of the qualitative potential description. More quantitative descriptions of liquefaction and landslide hazard can be developed in the GIS with spatial modelling of geotechnical and geologic parameters (for example, see Luna, 1993). The hazard due to surface fault rupture is defined in terms of 100 and 200 meter buffer zones around the assumed scenario fault break. The surface fault rupture map is created in the GIS with a buffer procedure similar to that used in the ground shaking map generation. Figure 17.5 shows an example landslide potential map for Salt Lake County, Utah.
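The buffer-zone look-ups of Sections 17.5 and 17.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the functional form is a Boore-type attenuation relationship written in terms of the variables d, M, Gb and Gc defined above, the coefficient values are placeholders that must be checked against Boore et al. (1993) before any use, and the function names, ring distances and zone labels are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AttenuationCoefficients:
    """Placeholder coefficients (assumption); not taken from the chapter."""
    b1: float = -0.105
    b2: float = 0.229
    b5: float = -0.778
    b6: float = 0.162   # applied when Gb = 1 (soil type B)
    b7: float = 0.251   # applied when Gc = 1 (soil type C)
    h_km: float = 5.57  # depth-like term that keeps log10 finite at d = 0


# (Gb, Gc) indicator pairs for the three soil types defined in the text.
SOIL_INDICATORS = {"A": (0, 0), "B": (1, 0), "C": (0, 1)}


def pga_g(d_km, magnitude, soil_type, coef=AttenuationCoefficients()):
    """Peak ground acceleration (g) from an assumed Boore-type relationship."""
    gb, gc = SOIL_INDICATORS[soil_type]
    r = math.hypot(d_km, coef.h_km)
    log_pga = (coef.b1 + coef.b2 * (magnitude - 6.0)
               + coef.b5 * math.log10(r) + coef.b6 * gb + coef.b7 * gc)
    return 10.0 ** log_pga


def buffer_zone_table(ring_edges_km, magnitude, coef=AttenuationCoefficients()):
    """One row per (buffer ring, soil type), evaluated at the ring mid-distance."""
    rows = []
    for inner, outer in zip(ring_edges_km[:-1], ring_edges_km[1:]):
        mid = 0.5 * (inner + outer)
        for soil in ("A", "B", "C"):
            rows.append({"inner_km": inner, "outer_km": outer, "soil": soil,
                         "pga_g": pga_g(mid, magnitude, soil, coef)})
    return rows


def fault_rupture_zone(distance_m):
    """Classify a site by the 100 m and 200 m fault rupture buffers."""
    if distance_m <= 100.0:
        return "within 100 m"
    if distance_m <= 200.0:
        return "100-200 m"
    return "outside buffers"
```

In a GIS the table produced by `buffer_zone_table` would be joined to the buffer polygons after they are intersected with the soil type map, so that each resulting polygon carries one ground motion value.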
Figure 17.4: Map showing distribution of ground shaking hazard in Salt Lake County, Utah.
17.7 GIS MAPS OF DAMAGE TO STRUCTURES
In order to estimate earthquake damage accurately, a complete and detailed inventory of structures must be developed for the region. The accuracy of the final regional estimates of damage and loss is highly dependent upon the accuracy of the underlying structural inventory developed for the area. The information to be included in a structural inventory often depends upon the classes of facilities under consideration and the type of analysis being conducted. For the most general regional earthquake damage and loss analysis, information about the location, use, and structural properties of each facility is typically desired. Sources of information for the inventory include federal, state, and local governments, as well as private sector databases. Often, knowledge-based expert systems are used to infer missing information and assign predefined engineering classes to structures in the final inventory (see King et al., 1994). Typically the inventory includes information on all structures in the region, such as buildings, bridges, highways, pipelines, and industrial facilities.
Once the inventory is compiled, summary maps may be generated that help to describe the characteristics of the inventory in the study region. For example, the percentage of unreinforced masonry buildings (a typically poor performer in withstanding earthquake shaking) in each Census tract can be displayed as shown in Figure 17.6 for Salt Lake County, Utah. These summaries help to indicate those areas that contain buildings that are relatively more hazardous. If resources are limited, a detailed earthquake damage and loss analysis might be conducted only in the most hazardous areas, while the remainder of the study region could be analysed when more funding is available.
The most widely used measure of earthquake damage is an expression of damage in terms of percentage financial loss that can be applied to all types of structures (Rojahn, 1993).
This measure is typically given the name “damage factor” and is defined as (Applied Technology Council, 1985):
Figure 17.5: Map showing landslide potential in Salt Lake County, Utah.
$$\text{Damage factor} = \frac{\text{dollar loss}}{\text{replacement value}} \tag{17.2}$$
Damage is typically estimated individually for each seismic hazard, i.e., ground shaking, liquefaction, landslide, and fault rupture, and then combined to give the final damage to the structures in the region for the given earthquake event; however, the discussion in this chapter is limited to damage due only to ground shaking hazard. Motion-damage relationships are used to estimate earthquake damage for each facility type due to various levels of ground shaking. These relationships, also known as vulnerability curves, are typically expressed in terms of damage-loss curves, fragility curves, damage probability matrices, and expected damage factor curves (King and Kiremidjian, 1994). Figure 17.7 shows an example of an expected damage factor curve for low-rise wood frame buildings located in Salt Lake County, Utah. Curves such as these are usually developed on the basis of expert opinion augmented with empirical data.
In the GIS, the inventory of structures is stored as a series of maps with associated database tables. The expected damage factor curves, such as the one shown in Figure 17.7, are stored in the form of database tables with ground shaking level (MMI), engineering class, expected damage factor, and standard deviation on the expected value as attributes. The maps of inventory data are overlaid in the GIS on the ground shaking hazard map as shown in Figure 17.8 for commercial buildings in Salt Lake County. Through the overlay, each building (stored as a point feature in the GIS) acquires the level of ground shaking intensity as one of its attributes. A table look-up is then used to assign the expected damage factor and standard deviation to the building as a function of the ground shaking intensity and the engineering class of the building. A sub-set of the final attributes stored with each building is also shown in Figure 17.8.
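The overlay and table look-up just described can be illustrated with a minimal Python sketch. It assumes that the point-in-polygon overlay has already been performed, so that each building record carries the identifier of the ground shaking polygon it falls in. The engineering class names, the damage factor values and all field names are hypothetical and serve only to show the mechanics; they are not data from the case study.

```python
from typing import NamedTuple


class DamageEntry(NamedTuple):
    mean_df: float  # expected damage factor, percent
    std_df: float   # standard deviation on the expected value, percent


# Hypothetical look-up table keyed by (engineering class, MMI).
DAMAGE_TABLE = {
    ("wood_lowrise", 7): DamageEntry(1.5, 1.0),
    ("wood_lowrise", 8): DamageEntry(4.7, 2.5),
    ("urm_lowrise", 7): DamageEntry(10.0, 6.0),
    ("urm_lowrise", 8): DamageEntry(22.0, 10.0),
}


def assign_damage(buildings, mmi_of_polygon):
    """Attach MMI (overlay result) and damage factor (table look-up)."""
    assigned = []
    for b in buildings:
        mmi = mmi_of_polygon[b["polygon_id"]]
        entry = DAMAGE_TABLE[(b["eng_class"], mmi)]
        assigned.append({**b, "mmi": mmi,
                         "edf": entry.mean_df, "edf_std": entry.std_df})
    return assigned


def mean_damage_by_tract(assigned):
    """Average expected damage factor of the buildings in each Census tract."""
    sums = {}
    for b in assigned:
        total, count = sums.get(b["tract"], (0.0, 0))
        sums[b["tract"]] = (total + b["edf"], count + 1)
    return {tract: total / count for tract, (total, count) in sums.items()}
```

The tract-level average is included because results of this kind are mapped by Census tract rather than building by building.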
There are several sources of uncertainty in the GIS-based earthquake damage and loss estimation, such as in the ground motion intensity, the inventory information, and the expected damage factor curves. Due to all the uncertainty and simplifying assumptions that are necessary when representing variables as maps, damage and loss estimates are never reported on a structure-by-structure basis but are reported as
Figure 17.6: Map showing percentage of non-reinforced masonry buildings in Salt Lake County, Utah.
Figure 17.7: Expected damage factor curve with standard deviation as a function of MMI for low-rise wood-frame buildings.
aggregated values over small regions. For example, Figure 17.9 shows the expected damage factor due to ground shaking for a magnitude 7.5 earthquake in Salt Lake County, Utah. The areas of high damage are located through the centre of the county, where the ground shaking is relatively high and where many buildings are constructed of unreinforced masonry.
17.8 GIS MAPS OF MONETARY AND NON-MONETARY LOSS
Monetary losses resulting from an earthquake are typically due to: direct structural damage, such as failed beams, excessive deflections, and differential settlement to man-made facilities; and indirect effects, such as damage to non-structural elements and contents, clean-up and financing of repairs, and loss of facility use. Non-monetary losses usually refer to fatalities, injuries, unemployment, and homelessness. The modelling of the various monetary and non-monetary losses is very difficult and the subject of many current research projects. In this chapter, losses are limited to those due to direct structural damage, loss of facility use, and casualties. The latter categories of loss are assumed to be a function of the facility use and the earthquake damage to the facility.
Figure 17.8: Map showing results for commercial buildings overlaid on seismic hazard map for Salt Lake County, Utah.
In the GIS, losses are estimated with database manipulations; map overlays are not necessary. Loss due to direct structural damage is the expected damage factor of a facility multiplied by its replacement cost. Replacement costs are typically estimated as a function of the facility use and square footage according to local construction estimates. Loss of facility use is assumed to be a function of the use of the facility and the expected damage factor of the facility. Values for loss of facility use are based on expert opinion augmented with empirical data and give the number of days required to restore the facility to various percentages of full use. Casualties are also estimated as a function of the use of the facility and the expected damage factor of the facility. Casualty rates are typically based on empirical data, although the amount of data is limited.
Example maps of earthquake loss estimates are shown in Figures 17.10 and 17.11. Figure 17.10 shows the loss due to structural damage aggregated by Census tract for a magnitude 7.5 earthquake in Salt Lake County, Utah, and Figure 17.11 shows the expected number of deaths for the same event and region, also aggregated by Census tract. As expected, the areas of high losses on the maps shown in Figures 17.10 and 17.11 correspond to the areas of high damage on the map shown in Figure 17.9. Again, reporting these estimates on an individual building basis is not done due to the numerous sources of uncertainty.
17.9 COST-BENEFIT STUDIES
The GIS-based earthquake hazard and loss estimation provides an efficient and clear means for assessing the effects (e.g., cost-benefit analyses) of various seismic risk mitigation strategies. For example, a proposed city ordinance may require the seismic upgrading of all unreinforced masonry public buildings.
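The database calculation of direct structural loss described in Section 17.8 can be sketched as follows. The sketch assumes that the expected damage factor is stored as a percentage and that replacement cost is a unit cost per square foot that depends only on facility use; the unit costs, field names and function names are hypothetical.

```python
def replacement_cost(use, area_sqft, unit_costs):
    """Replacement cost from facility use and floor area (assumed linear)."""
    return unit_costs[use] * area_sqft


def structural_loss(edf_percent, cost):
    """Expected damage factor (percent) multiplied by replacement cost."""
    return edf_percent / 100.0 * cost


def loss_by_tract(buildings, unit_costs):
    """Sum the structural loss of all buildings in each Census tract."""
    totals = {}
    for b in buildings:
        cost = replacement_cost(b["use"], b["area_sqft"], unit_costs)
        loss = structural_loss(b["edf"], cost)
        totals[b["tract"]] = totals.get(b["tract"], 0.0) + loss
    return totals
```

Because only attribute tables are involved, the same aggregation could equally be written as a grouped query in the database attached to the GIS.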
The cost associated with the upgrading is assumed to be a dollar amount per square foot of building and is computed by utilising the building inventory stored in the GIS database. The benefit is assumed to be the change in the expected loss (including monetary loss, casualties, and loss of use) for one or more earthquake scenarios. Using the GIS, the expected earthquake loss is computed for the unreinforced masonry public buildings in
Figure 17.9: Map showing average expected damage factor due to ground shaking in each Census tract in Salt Lake County.
both their current and upgraded state, assuming that compliance with the city upgrade ordinance decreases the expected earthquake damage to the buildings by a certain percentage. The cost and benefit of the city ordinance are compared to assess the effectiveness of this type of risk mitigation strategy.
17.10 SUMMARY
This chapter describes the implementation of earthquake hazard and loss estimation in a geographic information system. The GIS-based analysis provides decision-making assistance in areas such as seismic risk mitigation, resource allocation, public policy, and emergency response. An overview of the various types of data and models that comprise a regional earthquake hazard and loss estimation is given, followed by a description of how earthquake hazard and loss estimation is done within the GIS with examples from a case study for Salt Lake County, Utah.
ACKNOWLEDGEMENTS
Funding for the research described in this chapter was provided by several sources including Applied Technology Council, GeoHazards International, Kajima Corporation through the California Universities for Research in Earthquake Engineering Foundation, and National Science Foundation grant number EID-9024032. The authors are grateful for this support. In addition, the authors wish to thank Environmental Systems Research Institute of Redlands, California for providing their ARC/Info™ software for research use at the John A. Blume Earthquake Engineering Center.
Figure 17.10: Map showing loss due to structural damage in each Census tract in Salt Lake County, Utah.
REFERENCES
APPLIED TECHNOLOGY COUNCIL, in progress. Earthquake Loss Evaluation Methodology and Databases for Utah, ATC-36 Report. Redwood City, CA: ATC.
APPLIED TECHNOLOGY COUNCIL 1985. Earthquake Damage Evaluation Data for California, ATC-13 Report. Redwood City, CA: ATC.
BOORE, D.M., JOYNER, W.B. and FUMAL, T.E. 1993. Estimation of Response Spectra and Peak Accelerations From Western North American Earthquakes: an Interim Report, Open File Report 93–509. Menlo Park, CA: United States Geological Survey.
BORCHERDT, R., WENTWORTH, C.M., JANSSEN, A., FUMAL, T. and GIBBS, J. 1991. Methodology for predictive GIS mapping of special study zones for strong ground shaking in the San Francisco Bay Region, CA, Proceedings of the Fourth International Conference on Seismic Zonation, Stanford, California, 25–29 August 1991, Volume III, pp. 545–552.
KIM, S.H., GAUS, M.P., LEE, G. and CHANG, K.C. 1992. A GIS-based regional risk approach for bridges subjected to earthquakes, Proceedings of the ASCE Special Conference on Computing in Civil Engineering, Dallas, Texas, June 1992, pp. 460–467.
KING, S.A. and KIREMIDJIAN, A.S. 1994. Regional Seismic Hazard and Risk Analysis Through Geographic Information Systems, John A. Blume Earthquake Engineering Center Technical Report No. 111, Stanford, CA: Stanford University.
KING, S.A., KIREMIDJIAN, A.S., ROJAHN, C., SCHOLL, R.E., WILSON, R.R. and REAVELEY, L.D. 1994. Development of an integrated structural inventory for earthquake loss estimation, in Proceedings of the Fifth National Conference on Earthquake Engineering, Chicago, Illinois, 10–14 July, Vol. IV, pp. 397–406.
LUNA, R. 1993. Liquefaction analysis in a GIS environment, Proceedings of the NSF workshop on Geographic Information Systems and their Application in Geotechnical Earthquake Engineering, Atlanta, Georgia, 28–29 January 1993, pp. 65–71.
McLAREN, M. 1992. GIS Prepares Utilities for Earthquakes, GIS World, 5(4), pp. 60–64.
Figure 17.11: Map showing total number of expected deaths in each Census tract in Salt Lake County, Utah.
RENTZIS, D.N., KIREMIDJIAN, A.S. and HOWARD, H.C. 1992. Identification of High Risk Areas Through Integrated Building Inventories, The John A. Blume Earthquake Engineering Center Technical Report No. 98, Stanford, CA: Stanford University.
ROJAHN, C. 1993. Estimation of earthquake damage to buildings and other structures in large urban areas, Proceedings of the Geohazards International/Oyo Pacific Workshop, Istanbul, Turkey, 8–11 October 1993.
WOOD, H.O. and NEWMAN, F. 1931. Modified Mercalli intensity scale of 1931, Seismological Society of America Bulletin, 21(4), pp. 277–283.
Chapter Eighteen
An Evaluation of the Effects of Changes in Field Size and Land Use on Soil Erosion Using a GIS-Based USLE Approach
Philippe Desmet, W. Ketsman, and G. Govers
18.1 INTRODUCTION
Watershed models have an obvious and explicit spatial dimension and increasingly benefit from the use of GIS linked to digital elevation models for data input, analysis and display of the results. The particular value of GIS lies in capturing the spatial variability of parameters, and aiding in the interpretation of the results. The Universal Soil Loss Equation (USLE) is a lumped parameter erosion model that predicts the average annual sheet and rill erosion rate over a long term. The traditional approach uses averaging techniques to approximate characteristics of each parameter needed in the model. Despite its shortcomings and limitations the USLE (Foster, 1991; Renard et al., 1994; Wischmeier, 1976) is still the most frequently used equation in erosion studies, mainly due to the simple, robust form of the equation as well as to its success in predicting the average, long-term erosion on uniform slopes or field units (e.g. Bollinne, 1985; Busacca et al., 1993; Flacke et al., 1990; Jäger, 1994; Moore and Burch, 1986; Mellerowicz et al., 1994).
The USLE is primarily designed to predict erosion on straight slope sections. Foster and Wischmeier (1974) were the first to develop a procedure to calculate the average soil loss on complex slope profiles by dividing an irregular slope into a limited number of uniform segments. In this way, they were able to take the profile shape of the slope into account. This is important as slope shape influences erosion (D’Souza and Morgan, 1976; Young and Mutchler, 1969). Using manual methods the USLE has already been applied on a watershed scale (Griffin et al., 1988; Williams and Berndt, 1972, 1977; Wilson, 1986). Basically, all these methods consist of the calculation of the LS-value for a sample of points or profiles in the area under study: the results of such calculations are then considered to be representative for the area as a whole.
The number of data collected will necessarily be limited by the time-consuming nature of these methods. Furthermore, a fundamental problem may arise: the measurement of slope length at a given point is not straightforward and in a two-dimensional situation slope length should be replaced by the unit contributing area, i.e. the upslope drainage area per unit of contour length (Kirkby and Chorley, 1967). Indeed, in a real two-dimensional situation overland flow and the resulting soil loss do not really depend on the distance to the divide or upslope border of the field, but on the area per unit of contour length contributing runoff to that point (e.g. Ahnert, 1976; Bork and Hensel, 1988; Moore and Nieber, 1989). The latter may differ considerably from the manually measured slope length, as it is strongly affected by flow convergence and/or divergence. GIS technology provides for relatively easy construction and handling of digital elevation models which, in principle, allow for the calculation of the unit contributing area so that the complex nature of the
topography may be fully accounted for. In order to do so, various routing algorithms have been proposed in the literature and some applications have already been made in erosion studies (e.g. Bork and Hensel, 1988; Desmet and Govers, 1995; Moore and Burch, 1986; Moore and Nieber, 1989; Moore et al., 1991).
The aim of this chapter is to present an extension of the Foster and Wischmeier (1974) approach for the calculation of the LS-factor on a two-dimensional terrain. It will be shown that the algorithm may increase the applicability of the USLE by incorporating the proposed procedure in a GIS-environment thereby allowing the calculation of LS-values on a land unit basis. The applicability and flexibility of the GIS procedure will thereafter be demonstrated by an explorative evaluation of the effect of changes in land parcellation and land use on the erosion risks in the last two centuries in areas in central Belgium. Land parcellation and land use were chosen because of their relative importance in erosion assessments, their explicit spatial character and because they offered the best potential to derive their temporal evolution.
18.2 METHODOLOGY
18.2.1 A GIS-based USLE approach
Foster and Wischmeier (1974) recognised the fact that a slope or even a field unit cannot be considered as totally uniform. Therefore, they subdivided the slope into a number of segments, which they assumed to be uniform in slope gradient and soil properties. The LS-factor for such a slope segment might then be calculated as:
$$L_j S_j = S_j \, \frac{\lambda_j^{m+1} - \lambda_{j-1}^{m+1}}{(\lambda_j - \lambda_{j-1})\,(22.13)^m} \tag{18.1}$$
where:
$L_j$ = slope length factor for the j-th segment (-)
$S_j$ = slope factor for the j-th segment (-)
$\lambda_j$ = distance from the lower boundary of the j-th segment to the upslope field boundary (m)
$m$ = the length exponent of the USLE LS-factor (-)
In a grid-based DEM the surface consists of square cells. If the LS-factor has to be calculated, the contributing area of each cell as well as the grid cell slope have to be known.
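The segment procedure of Foster and Wischmeier (1974) can be illustrated with a short Python sketch. It assumes the standard segment form of the slope length factor with the 22.13 m USLE unit plot length; the default exponent m = 0.5, the function names and the choice to pass the slope factors in as given numbers are assumptions made for illustration only.

```python
def segment_l_factors(segment_lengths_m, m=0.5):
    """Slope length factor L_j of each uniform segment, listed downslope."""
    factors = []
    upper = 0.0  # distance from the upslope field boundary to the segment top
    for length in segment_lengths_m:
        lower = upper + length  # distance to the lower segment boundary
        l_j = (lower ** (m + 1) - upper ** (m + 1)) / (length * 22.13 ** m)
        factors.append(l_j)
        upper = lower
    return factors


def slope_ls(segment_lengths_m, s_factors, m=0.5):
    """Length-weighted mean of L_j * S_j over the whole irregular slope."""
    l_factors = segment_l_factors(segment_lengths_m, m)
    total_length = sum(segment_lengths_m)
    weighted = sum(l * s * length for l, s, length
                   in zip(l_factors, s_factors, segment_lengths_m))
    return weighted / total_length
```

For a uniform slope the weighted mean returned by `slope_ls` reduces to the single-slope value, which is a convenient check on the segment formula: lower segments receive larger L-values, but their average over the slope is unchanged by the subdivision.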
There are various algorithms to calculate the contributing area for a grid cell, i.e. the area upslope of the grid cell which drains into the cell. A basic distinction has to be made between single flow algorithms which transfer all matter from the source cell to a single cell downslope, and multiple flow algorithms which divide the matter flow out of a cell over several receiving cells. This distinction is not purely technical: single flow algorithms allow only parallel and convergent flow, while multiple flow algorithms can accommodate divergent flow. Desmet and Govers (1996a) reviewed the algorithms available and found that the use of single flow algorithms to route water over a topographically complex surface is problematic, as minor topographical irregularities may result in the erratic location of main drainage lines. Multiple flow algorithms, which can accommodate divergent flow, do not have this disadvantage. Here, we used the flux decomposition algorithm developed by Desmet and Govers (1996a): a vector having a magnitude equal to the contributing area to be distributed, increased with the grid cell area, is split into its two ordinal components. The magnitude of each component is proportional to the sine or cosine of the aspect direction which gives the direction of the vector. But as the sum of these two components is larger than the original magnitude, the components have to be normalised
afterwards. The unit contributing area may then be calculated by dividing the contributing area of a cell by the effective contour length $D'_{ij}$. This is the length of contour line within the grid cell over which flow can pass. It equals the length of the line through the grid cell centre that is perpendicular to the aspect direction, and is calculated as:
$$D'_{ij} = D\,(\sin\alpha_{ij} + \cos\alpha_{ij}) = D\,x_{ij} \tag{18.2}$$
where:
$D'_{ij}$ = the effective contour length (m)
$D$ = the grid cell size (m)
$x_{ij} = \sin\alpha_{ij} + \cos\alpha_{ij}$
$\alpha_{ij}$ = aspect direction for the grid cell with co-ordinates (i, j).
At the cell outlet, the contributing area at the inlet has to be increased by the grid cell area. Equation (18.1) can be extended for a two-dimensional topography by substituting the unit contributing area for the slope length as each grid cell may be considered as a slope segment having a uniform slope. After some rearrangements the L-factor for the grid cell with co-ordinates (i, j) may be written as:
$$L_{ij} = \frac{(A_{ij\text{-in}} + D^2)^{m+1} - A_{ij\text{-in}}^{\,m+1}}{D^{m+2}\,x_{ij}^{m}\,(22.13)^m} \tag{18.3}$$
where:
$L_{ij}$ = slope length factor for the grid cell with co-ordinates (i, j)
$A_{ij\text{-in}}$ = contributing area at the inlet of a grid cell with co-ordinates (i, j) (m²)
Different methods can be used to calculate the slope gradient on grid-based DEMs. For this study, the slope gradient for each grid cell of the study area was computed according to the algorithm described by Zevenbergen and Thorne (1987):
$$G_{ij} = \sqrt{G_x^2 + G_y^2} \tag{18.4}$$
where $G_x$ = gradient in the x-direction (m/m) and $G_y$ = gradient in the y-direction (m/m). The LS-factor for a grid cell may then be obtained by inserting $G_{ij}$ calculated by equation (18.4) and $L_{ij}$ calculated by equation (18.3), in the equations of the chosen USLE approach. For this study, we employed the equations as proposed for the Revised Universal Soil Loss Equation (RUSLE) (McCool et al., 1987, 1989; Renard et al., 1993). This approach has proved to give reliable, two-dimensional estimates of the long-term erosion risks; more details can be found in Desmet and Govers (1996b).
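The grid cell quantities defined above lend themselves to a compact Python sketch. It should be read as an illustration under stated assumptions rather than as the authors' implementation: absolute values of the sine and cosine are taken so that the expressions hold for any aspect quadrant, the exponent m is passed in as a plain number (default 0.5), and the function names are hypothetical.

```python
import math


def aspect_term(aspect_rad):
    """x_ij: sum of the absolute sine and cosine of the aspect direction."""
    return abs(math.sin(aspect_rad)) + abs(math.cos(aspect_rad))


def effective_contour_length(cell_size_m, aspect_rad):
    """D'_ij: contour length within the cell over which flow can pass."""
    return cell_size_m * aspect_term(aspect_rad)


def flux_split(aspect_rad):
    """Normalised fractions of the outflow along the two grid directions."""
    sx = abs(math.sin(aspect_rad))
    cy = abs(math.cos(aspect_rad))
    return sx / (sx + cy), cy / (sx + cy)


def cell_l_factor(area_in_m2, cell_size_m, aspect_rad, m=0.5):
    """L_ij from the contributing area at the cell inlet."""
    x = aspect_term(aspect_rad)
    numerator = ((area_in_m2 + cell_size_m ** 2) ** (m + 1)
                 - area_in_m2 ** (m + 1))
    denominator = cell_size_m ** (m + 2) * x ** m * 22.13 ** m
    return numerator / denominator


def slope_gradient(gx, gy):
    """G_ij: magnitude of the gradient from its x and y components."""
    return math.hypot(gx, gy)
```

A quick plausibility check is a cell on the divide (no inflow) with a size of 22.13 m and flow parallel to a grid axis: the expression then returns L = 1, the value of the USLE unit plot.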
18.2.2 Benefits of this approach

Desmet and Govers (1996b) compared the automated approach to a manual analysis of a topographic map in which the LS-values for a sample of points were derived and considered to be representative for a certain area around these points. A first difference is that the number of data points collected manually will necessarily be limited by the time-consuming nature of the method, whereas the automated approach can handle an almost unlimited number of grid cells. Both the manual and the automated method yielded broadly similar results in terms of relative erosion risk mapping. However, there appeared to be important differences in absolute values. We may generally state that the use of manual methods leads to an underestimation of the erosion risk because the effect of flow convergence cannot be taken into account. This is especially true in plan-concave areas (i.e. zones of flow concentration): overland flow tends to concentrate in the concavities, so it is logical that the sheet and rill erosion risk will be higher there. The plan form convergence responsible for these higher erosion risks can clearly be captured by the automated technique
but not by the manual. Therefore, it is clear that a two-dimensional approach is required for topographically complex areas to capture the convergence and/or divergence of the real topography. The latter is especially important when rill and gully erosion are considered. While the USLE is said to give the mean annual soil loss by sheet and rill erosion, there is a strong indication that this two-dimensional approach also accounts for (ephemeral) gully erosion. This indication is based on the prediction of rill and gully volumes measured in the field (Desmet and Govers, 1997) by using the above-mentioned LS-factor. After the classification of the LS-values, the mean (predicted) LS-value and the mean (measured) rill section were extracted for each class. A comparison of the predicted mean class values for the LS-factor with the measured mean rill erosion rate shows that the highest LS-class (which corresponds with the ephemeral gully zone) follows the general linear trend of the other data, reflecting rill erosion stricto sensu (Figure 18.1). Thus, a two-dimensional approach for the USLE can be used to predict the variation with topography of both rill and ephemeral gully erosion as long as the effects of soil conditions on ephemeral gully development are not too important.

The integration of the procedure in a GIS environment, in this particular case IDRISI (Eastman, 1992), has several advantages:
• The combination of IDRISI with TOSCA (Jones, 1991) permits the relatively easy construction of a DEM from a standard topographic map. This DEM can directly be used by the program to produce an information layer containing the LS-value for each grid cell.
• The USLE consists of six factors, four of which may gain an obvious advantage from the linkage with a GIS. Digitising soil and land unit maps and/or the availability of digital soil information make it easy to store, update and manipulate the soil erodibility factor K, the cover-management factor C and the support practice factor P; some information for the rainfall-runoff factor R can also be obtained and improved by using GIS techniques (e.g. interpolation, Thiessen polygons). This linkage enables the incorporation of the spatial variation of some factors causing erosion and the assignment of appropriate C and P values to each of the land units and K values to each of the soil units. With a GIS-based extension of the LS-factor a full linkage is achieved. The predicted soil loss per unit area can now be calculated for each grid cell by a simple overlay procedure. Furthermore, standard IDRISI procedures allow predicted soil losses to be summed and averaged for each land or soil unit, so that total and average soil loss can be calculated on a land unit/soil unit basis.
• In a catchment with mixed land use, theoretical drainage areas are often irrelevant. Very often runoff will not be generated on the whole slope length at the same time, but only on land units with a specific land use. For example, mature grassland or woodland never generates significant amounts of runoff under western European conditions. So, if there are parcels with this land use upslope of an area under cultivation, they should not be taken into account when calculating L-values for the cultivated part of the catchment. It is also possible that land units are separated by drainage systems diverting all the runoff from the upslope area. Therefore, the program was written so that the user may or may not consider the land units as being hydrologically isolated.
If this is done, only the area within the land unit under consideration will be taken into account when calculating L-values. Certainly, user experience is required to decide whether two parcels should be considered as being hydrologically isolated or hydrologically continuous.
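The overlay and per-unit averaging described in the list above can be sketched as follows. This is a minimal, library-free illustration and not the IDRISI procedure itself: rasters are plain nested lists, the function names are hypothetical, and the factor values are the ones the chapter quotes from Bollinne (1985) for this study (R = 67.4, K = 0.43, P = 1, C = 0.47 for arable land and 0 otherwise).

```python
from collections import defaultdict

# Factor values quoted in the chapter from Bollinne (1985).
R_FACTOR = 67.4
K_FACTOR = 0.43
P_FACTOR = 1.0
C_BY_LAND_USE = {"arable": 0.47, "pasture": 0.0,
                 "woodland": 0.0, "built-up": 0.0}


def soil_loss(ls_value, land_use):
    """USLE product A = R * K * LS * C * P for a single grid cell."""
    return R_FACTOR * K_FACTOR * ls_value * C_BY_LAND_USE[land_use] * P_FACTOR


def mean_soil_loss_per_parcel(ls_grid, parcel_grid, parcel_land_use):
    """Overlay the LS layer with the parcel layer and average per parcel.

    ls_grid          -- rows of LS-values, one per grid cell
    parcel_grid      -- rows of parcel identifiers, same shape as ls_grid
    parcel_land_use  -- mapping parcel identifier -> land-use name
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ls_row, parcel_row in zip(ls_grid, parcel_grid):
        for ls_value, parcel_id in zip(ls_row, parcel_row):
            totals[parcel_id] += soil_loss(ls_value, parcel_land_use[parcel_id])
            counts[parcel_id] += 1
    return {pid: totals[pid] / counts[pid] for pid in totals}


# Toy 2 x 2 example: parcel 1 is arable, parcel 2 is pasture.
ls_grid = [[1.0, 2.0],
           [1.0, 3.0]]
parcel_grid = [[1, 1],
               [2, 2]]
print(mean_soil_loss_per_parcel(ls_grid, parcel_grid,
                                {1: "arable", 2: "pasture"}))
```

Because every cell has the same area, the per-parcel mean is a simple average of the cell values; summing instead of averaging gives the total on a land unit basis once the result is multiplied by the cell area.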
Figure 18.1: The prediction of gully sections by a power function of slope gradient and length
18.2.3 Implementation procedure

Four study sites were selected in the neighbourhood of Leuven, Belgium. All catchments have a surface area of a few km² and a rolling topography with soils ranging from loamy sand to silty loam. Main crops are winter cereals, sugar beets and Belgian endives, while limited parts are forested or under permanent pasture. The elevation data to construct the digital elevation model were obtained by digitising the contour lines from the Belgian topographical map (NGI), and thereafter converted into a grid-based DEM. The accuracy of the contour-line elevations is of the order of 0.5 m (Depuydt, 1969). For all DEMs a grid spacing of 5 m was chosen. In addition, the parcellation of each catchment for 1947, 1969 and 1991 was digitised from aerial photographs. This means that the subdivision into parcels is not based on the land registry, which cannot be seen on aerial photographs, but on exploitation, which is the appropriate option when assessing erosion risks. In order to fit the parcellations to the DEM a rubber sheeting procedure was carried out. For each parcel, the land use for the corresponding year was derived by visual inspection of the aerial photographs. A main distinction was made between arable land, pasture, woodland and built-up area. Distinguishing arable land (especially winter corn) from pasture was not always easy because most photographs were taken in the spring; mostly, we had to rely on the presence of tillage patterns for arable land, and a patchy pattern for pasture. In case of doubt, the field was considered to be arable land. Besides these aerial photographs, we also used old cartographic documents, namely the map of the Austrian Count de Ferraris dated to 1775 and the first official topographical map of 1870. These sources did not give any information about the parcellation; however, it was possible to retrieve information on land use.
Therefore, we had to assume the parcellation of 1775 and 1870 to be identical to the parcellation of 1947.
By doing so, we have probably overestimated the field sizes for these years and, therefore, their effect on the erosion risk. The GIS-based routine to calculate the LS-factor was then run on all DEMs using the information on the parcellation of the respective years. For this study we assumed the parcels to be hydrologically isolated so that no water could flow from one parcel to another. This means that the L-factor for a certain parcel is only dependent on the size and the orientation of the parcel. The LS-factor itself may give us information on the effect of changes in field size on the erosion risk. In order to include information on the land use, we also had to change the cover-management factor C. Bollinne (1985) suggested a C-value of 0.47 for arable land and a value of 0 for pasture, woodland and built-up areas. The other RUSLE-factors were taken from Bollinne (1985), who performed an erosion plot study in the area, i.e. R=67.4; K=0.43; and P=1.

18.3 RESULTS

18.3.1 Evaluation of the predicted patterns

Figure 18.2 shows the height and the slope map for the Kinderveld study catchment. A comparison of the slope map with the spatial patterns of the LS-values as calculated for the parcellation of 1991 shows that the areas with the steepest slopes have the highest LS-factors (Figure 18.3). This is simply because the slope gradient is the major control on the LS-value, especially for large parcels. For smaller parcels, the effect of the slope gradient can be partly compensated for by the shorter slope lengths.

18.3.2 Effect of parcellation

For all study areas the average field size has increased significantly since the Second World War (Table 18.1). This is mainly due to re-allocations, a process that still goes on. The effect on the erosion risk is obvious: it causes an increase of the average LS-factor and thus an increase of what we may call the topography-based erosion risk.
This increase ranges from 25 to 30% between 1947 and 1991, which is much lower than the increase in field size (Table 18.1). This is due to the fact that slope, which is assumed not to change over time, is the major component in the LS-factor, and because the relation between an increase in field size and an increase of the L-factor is spatially variable and dependent on the configuration of the parcel.

Figure 18.2: Topography for the Kinderveld study catchment A: height map; B: slope map.

Table 18.1: Evolution of the field size and corresponding LS-values

                          1947    1969    1991    Change since 1947
Mean field size (ha)
  Kouberg                 0.43    0.82    1.71    +398%
  Bokkenberg              0.38    0.62    0.87    +229%
  Kinderveld              0.37    0.58    1.08    +292%
  Ganspoel                0.39    0.59    1.32    +338%
Mean LS-value
  Kouberg                 1.08    1.22    1.37    +26.9%
  Bokkenberg              1.19    1.32    1.50    +26.1%
  Kinderveld              1.17    1.34    1.46    +24.8%
  Ganspoel                1.48    1.76    1.92    +29.7%
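As a worked check on the "Change since 1947" column of Table 18.1, the short sketch below recomputes the changes from the 1947 and 1991 values. It suggests, without being certain of the authors' intent, that the LS figures are relative increases, (new − old)/old, whereas the field-size figures appear to match the 1991/1947 ratio expressed as a percentage. The dictionaries simply transcribe the table; the function names are hypothetical.

```python
# (1947, 1991) values transcribed from Table 18.1.
MEAN_FIELD_SIZE_HA = {
    "Kouberg": (0.43, 1.71),
    "Bokkenberg": (0.38, 0.87),
    "Kinderveld": (0.37, 1.08),
    "Ganspoel": (0.39, 1.32),
}
MEAN_LS = {
    "Kouberg": (1.08, 1.37),
    "Bokkenberg": (1.19, 1.50),
    "Kinderveld": (1.17, 1.46),
    "Ganspoel": (1.48, 1.92),
}


def relative_change_percent(old, new):
    """Increase relative to the old value, in percent."""
    return 100.0 * (new - old) / old


def ratio_percent(old, new):
    """New value expressed as a percentage of the old value."""
    return 100.0 * new / old


for name in MEAN_LS:
    size_1947, size_1991 = MEAN_FIELD_SIZE_HA[name]
    ls_1947, ls_1991 = MEAN_LS[name]
    print(name,
          "field size ratio: %.0f%%" % ratio_percent(size_1947, size_1991),
          "field size increase: %.0f%%" % relative_change_percent(size_1947, size_1991),
          "LS increase: %.1f%%" % relative_change_percent(ls_1947, ls_1991))
```

Under either reading the conclusion in the text holds: field size grew several times faster than the mean LS-value.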
18.3.3 Effect of parcellation and land use

The changes in LS-values have to be attributed to the changes in field sizes. In order to estimate the real erosion risks, we also have to introduce information on land use. Table 18.2 gives the evolution of the percentage arable land for each study area. The maximum percentage of arable land is always found in 1870. This also affects the erosion risks as for all study areas the mean RUSLE value reaches a (local) maximum (Table 18.2).

Table 18.2: Evolution of the land use and of the RUSLE values

                                1775    1870    1947    1969    1991    Change since 1947
% Arable land
  Kouberg                       96      97      95      95      92      −3.2%
  Bokkenberg                    73      100     96      70      68      −29.2%
  Kinderveld                    90      100     97      89      82      −15.5%
  Ganspoel                      77      97      90      77      77      −14.4%
Mean RUSLE-value (ton/ha/yr)
  Kouberg                       14.25   14.45   12.87   14.97   16.55   +28.6%
  Bokkenberg                    10.40   16.31   15.69   10.64   12.62   −19.6%
  Kinderveld                    14.29   16.06   15.26   12.96   13.19   −13.6%
  Ganspoel                      12.58   19.33   16.20   14.57   18.69   +15.4%
After 1870 the percentage of arable land gradually decreased. In itself, this causes a decrease of the mean erosion risk, but it may be compensated for by the effect of the increased field size mentioned previously. In fact, two catchments experienced an increase of the mean erosion risk since 1947, the two others a decrease. Whether the erosion risk decreases or increases depends on several factors:
• The evolution of the field size, which is not the same for all catchments (Table 18.1).
• The decrease of the percentage of arable land, which is not of the same order of magnitude for all catchments. For the Kouberg area the percentage of arable land decreased only slightly, which implied a marked increase of the erosion risk since 1947 (Table 18.2). On the other hand, for the Bokkenberg area almost one third of the arable land has been converted to pasture or woodland; this area experienced an important decrease of the mean RUSLE value (Table 18.2).
• The LS-values are especially high for the steep slopes (Figures 18.2 and 18.3). When the percentage of arable land decreases, farmers will choose to convert these steeper areas to pasture or woodland. The degree to which this happens is partially dependent on the relative availability of steep slopes; because the steepest slopes have mostly been converted first, the slopes remaining for conversion later on were necessarily less steep. This is illustrated for the Kinderveld catchment (Table 18.3): because the steepest areas had already been converted to pasture or woodland, the mean slope for the non-arable land decreased with time. However, the slopes for the non-arable land were always significantly higher than those for the arable land. Moreover, the mean slope for the converted area was always clearly higher than the mean slope for the remaining arable land; for example, the mean slope of the area converted to pasture and woodland between 1947 and 1969 was 13.1 percent compared to a mean slope value of less than 6 percent for the remaining arable land. For the period between 1969 and 1991 these values were respectively 6.6 percent and 5.1 percent.
Figure 18.3: Spatial pattern of the pixel-wise LS-values for the Kinderveld catchment based on the parcellation of 1991.

Table 18.3: Mean Slope Gradients for the Kinderveld catchment

Kinderveld            1947      1969      1991
Arable land           5.92%     5.17%     5.06%
Non-arable land       15.79%    13.57%    10.99%
The effect of changes in land use on the erosion risk is probably best illustrated by the spatial pattern of the RUSLE values. Figure 18.4 shows the evolution of these patterns for the Kinderveld area. The lower slopes and the main valley (Figure 18.2) in particular experienced a gradual but important decrease of the erosion risks over time, which is mainly due to a conversion towards woodland. On the other hand, for some other areas the erosion risk increased due to the increase in field size.

Figure 18.4a: Spatial pattern of the parcel-wise RUSLE soil loss values for the Kinderveld catchment A: predicted values for 1870 (parcellation of 1947); B: predicted values for 1947.

Figure 18.4b: Spatial pattern of the parcel-wise RUSLE soil loss values for the Kinderveld catchment C: predicted values for 1969; D: predicted values for 1991.

18.4 DISCUSSION

According to this study an increase of the erosion risk due to the combined evolution of field size and land use occurred in the last two centuries. The representation of the temporal evolution by sequential snapshots is appealing but also implies some dangers (Langran, 1993). A major limitation may be the number of time slices, which cannot fully capture all temporal changes. As the input data used are the sole data available, they have to suffice to detect the general tendency, especially as to the evolution of field size. The aim was to study the evolution of the erosion risk as influenced by field size and land use. The evolution of the erosion risk in the last two centuries was also dependent on other factors (e.g. plowing techniques, soil structure). These factors also changed significantly over time, but even less information is available on their temporal evolution.

18.5 CONCLUSION

A two-dimensional formulation to calculate the topographic LS-factor for topographically complex terrain was implemented in a GIS environment. Integrating this procedure in a GIS offers several advantages compared to the one-dimensional and/or manual approach: it may account for the effect of flow convergence on rill development and it has advantages in terms of speed of execution and objectivity. The ease of linking this module with a GIS facilitates the application of the (Revised) Universal Soil Loss Equation to complex land units, thereby extending the applicability and flexibility of the USLE in land resources management. This is shown by an application in which the effect of changes in field size and land use on the erosion risk is investigated and quantified. A significant increase of the field size has occurred since the Second World War, which caused a substantial increase of the erosion risks when only topography and field size are considered. This increase was very similar for the four studied catchments. However, this effect may sometimes be counteracted by changes in land use, which were more variable between the catchments. The extent and degree to which this is true depend on physical and socio-economic factors of the region. This approach also enables the identification of those areas that should be converted to pasture or woodland should supra-national authorities (e.g. the EU) or the market situation force farmers to reduce the arable land further.
REFERENCES

AHNERT, F. 1976. Brief description of a comprehensive three-dimensional process-response model of landform development, Zeitschrift für Geomorphologie Suppl. Band, 25, pp. 29–49.
BOLLINNE, A. 1985. Adjusting the universal soil loss equation for use in Western Europe, in El-Swaify S.A., Moldenhauer W.C. and Lo A. (Eds.) Soil Erosion and Conservation. Ankeny: Soil Conservation Society of America, pp. 206–213.
BORK, H.R. and HENSEL, H. 1988. Computer-aided construction of soil erosion and deposition maps, Geologisches Jahrbuch, A104, pp. 357–371.
BUSACCA, A.J., COOK, C.A. and MULLA, D.J. 1993. Comparing landscape-scale estimation of soil erosion in the Palouse using Cs-137 and RUSLE, Journal of Soil and Water Conservation, 48(4), pp. 361–367.
DEPUYDT, F. 1969. De betrouwbaarheid en de morfologische waarde van een grootschalige a Geographica Lovaniensia, 7, pp. 141–149.
DESMET, P.J.J. and GOVERS, G. 1995. GIS-based simulation of erosion and deposition patterns in an agricultural landscape: a comparison of model results with soil map information, Catena, 25(1–4), pp. 389–401.
DESMET, P.J.J. and GOVERS, G. 1996a. Comparison of routing systems for DEMs and their implications for predicting ephemeral gullies, International Journal of GIS, 10(3), pp. 311–331.
DESMET, P.J.J. and GOVERS, G. 1996b. A GIS-procedure for the automated calculation of the USLE LS-factor on topographically complex landscape units, Journal of Soil and Water Conservation, 51(5), pp. 427–433.
DESMET, P.J.J. and GOVERS, G. 1997. Two-dimensional modelling of rill and gully geometry and their location related to topography, Catena.
D’SOUZA, V.P.C. and MORGAN, R.P.C. 1976. A laboratory study of the effect of slope steepness and curvature on soil erosion, Journal of Agricultural Engineering Research, 21, pp. 21–31.
EASTMAN, R. 1992. IDRISI version 4.0, User’s Guide. Worcester: Clark University, Graduate School of Geography.
FLACKE, W., AUERSWALD, K. and NEUFANG, L. 1990. Combining a modified Universal Soil Loss Equation with a digital terrain model for computing high resolution maps of soil loss resulting from rain wash, Catena, 17, pp. 383–397.
FOSTER, G.R. 1991. Advances in wind and water erosion prediction, Journal of Soil and Water Conservation, 46(1), pp. 27–29.
FOSTER, G.R. and WISCHMEIER, W.H. 1974. Evaluating irregular slopes for soil loss prediction, Transactions of the American Society Agricultural Engineers, 17, pp. 305–309.
GRIFFIN, M.L., BEASLEY, D.B., FLETCHER, J.J. and FOSTER, G.R. 1988. Estimating soil loss on topographically non-uniform field and farm units, Journal of Soil and Water Conservation, 43, pp. 326–331.
JÄGER, S. 1994. Modelling regional soil erosion susceptibility using the USLE and GIS, in Rickson R.J. (Ed.) Conserving Soil Resources: European Perspectives. Wallingford: CAB International, pp. 161–177.
JONES, J. 1991. TOSCA version 1.0, Reference Guide. Worcester: Clark University, Graduate School of Geography.
KIRKBY, M.J. and CHORLEY, R.J. 1967. Through-flow, overland flow and erosion, Bulletin of the International Association of Hydrological Scientists, 12, pp. 5–21.
LANGRAN, G. 1993. Time in Geographic Information Systems. London: Taylor & Francis.
McCOOL, D.K., BROWN, L.C., FOSTER, G.R., MUTCHLER, C.K. and MEYER, L.D. 1987. Revised slope steepness factor for the Universal Soil Loss Equation, Transactions of the American Society Agricultural Engineers, 30, pp. 1387–1396.
McCOOL, D.K., FOSTER, G.R., MUTCHLER, C.K. and MEYER, L.D. 1989. Revised slope length factor for the Universal Soil Loss Equation, Transactions of the American Society Agricultural Engineers, 32, pp. 1571–1576.
MELLEROWICZ, K.T., REES, H.W., CHOW, T.L. and GHANEM, I. 1994. Soil conservation planning at the watershed level using the Universal Soil Loss Equation with GIS and microcomputer technologies: a case study, Journal of Soil and Water Conservation, 49(2), pp. 194–200.
MOORE, I.D. and BURCH, G.J. 1986. Physical basis of the length-slope factor in the Universal Soil Loss Equation, Journal of the Soil Science Society of America, 50, pp. 1294–1298.
MOORE, I.D. and NIEBER, J.L. 1989. Landscape assessment of soil erosion and non-point source pollution, Journal of the Minnesota Academy of Science, 55(1), pp. 18–25.
MOORE, I.D., GRAYSON, R.B. and LADSON, A.R. 1991. Digital terrain modelling: a review of hydrological, geomorphological and biological applications, Hydrological Processes, 5, pp. 3–30.
RENARD, K.G., FOSTER, G.R., WEESIES, G.A., McCOOL, D.K. and YODER, D.C. 1993. Predicting Soil Erosion by Water: A Guide to Conservation Planning with the Revised Universal Soil Loss Equation (RUSLE). Washington, DC: USDA.
RENARD, K.G., FOSTER, G.R., YODER, D.C. and McCOOL, D.K. 1994. RUSLE revisited: status, questions, answers, and the future, Journal of Soil and Water Conservation, 49(3), pp. 213–220.
WILLIAMS, J.R. and BERNDT, H.D. 1972. Sediment yield computed with universal equation, Journal of the Hydraulics Division, 98, pp. 2087–2098.
WILLIAMS, J.R. and BERNDT, H.D. 1977. Determining the USLE’s length-slope factor for watersheds, in Soil Erosion: Prediction and Control, Proceedings of a National Conference on Soil Erosion, West Lafayette, 24–26 May 1976. West Lafayette: Purdue University, pp. 217–225.
WILSON, J.P. 1986. Estimating the topographic factor in the universal soil loss equation for watersheds, Journal of Soil and Water Conservation, 41(3), pp. 179–184.
WISCHMEIER, W.H. 1976. Use and misuse of the universal soil loss equation, Journal of Soil and Water Conservation, 31(1), pp. 5–9.
WISCHMEIER, W.H. and SMITH, D.D. 1965. Predicting rainfall erosion losses from cropland east of the Rocky Mountains, USDA Agricultural Handbook 282. Washington, DC: USDA.
WISCHMEIER, W.H. and SMITH, D.D. 1978. Predicting rainfall erosion losses: a guide to conservation planning, USDA Agricultural Handbook 537. Washington, DC: USDA.
YOELI, P. 1983. Digital terrain models and their cartographic and cartometric utilisation, The Cartographic Journal, 20(1), pp. 17–22.
YOUNG, R.A. and MUTCHLER, C.K. 1969. Soil movement on irregular slopes, Water Resources Research, 5(5), pp. 1084–1089.
ZEVENBERGEN, L.W. and THORNE, C.R. 1987. Quantitative analysis of land surface topography, Earth Surface Processes and Landforms, 12, pp. 47–56.
Chapter Nineteen
GIS for the Analysis of Structure and Change in Mountain Environments
Anna Kurnatowska
19.1 INTRODUCTION

19.1.1 Aim of the research

The research presented in this chapter compares the environmental structure of two mountainous areas located in different climatic zones. The analysis is based on delimited, homogeneous, typological units (called geocomplexes) and consists of: a detailed description of their morphology (size, frequency, shapes); an assessment of relationships between delimited units (indication of dominant and subordinate, stable and fragile geocomplexes; determination of neighbourhood of geocomplexes); a comparison of the frequency and strength of relationships between geocomponents in different types of geocomplexes; and an assessment of the biodiversity of the researched environments. From this comparison it is possible to draw general conclusions (main processes ruling montane environments), to predict further environmental changes and to evaluate simple models of correlation between vegetation, geology and slope class, while stressing the differences between mountainous areas located in different climatic zones. The analysis and comparison were carried out with the help of GIS and statistical methods. The use of GIS enabled the quantification of environmental data, which was later used as the basis for statistical methods.

19.1.2 Brief characteristics of areas of interest

The research covers the Five Lakes Valley located in the Tatra Mountains and Loch Coruisk Valley eroded from the Cuillin Hills on Skye, a Scottish island of the Inner Hebrides (see Figure 19.1). The sites were chosen because both areas represent an alpine type of environment, differentiated with respect to geology, climate, hydrology, biogeography and biocenosis, and also in consideration of their beauty, uniqueness, complexity and fragility. The Tatra Mountains are eroded from acidic granite that intruded in the Carboniferous period; the Cuillin Hills are composed mainly of Tertiary intrusions (intermediate gabbro and ultrabasic peridotites).
In a humid climate granite weathers quickly, producing round hills; under the climatic conditions prevailing in the Tatra Mountains, weathered granite forms steep and rugged peaks. In alpine environments, the main geocomponent that determines the character of all other geocomponents is relief. Hence the researched areas have one essential common feature: they have been formed by glacial processes during the last glacial epoch. Located in the subalpine and alpine zones, both valleys are typical glacial valleys: the maximum elevation differences are 900 metres in Loch Coruisk Valley and 650 metres in the Five Lakes Valley. Generally speaking, Loch Coruisk Valley is steeper, less accessible and more rugged than the Five Lakes Valley. Both areas of interest are considered high mountain environments and are appreciated by climbing and trekking tourists.

Figure 19.1: Location of Loch Coruisk Valley and The Five Lakes Valley

Quaternary deposits cover 73 percent of the researched area in the Five Lakes Valley (moraines 36 percent, slope deposits 27 percent and alluvium 10 percent) and 67 percent in Loch Coruisk Valley (34 percent, 11 percent and 12 percent, respectively). Gabbro is an extremely hard rock and, despite its basic character, is so resistant to weathering that it does not produce particularly basic soils. Generally, in both areas of interest poor soils (lithosols, regosols, rankers and podzols) prevail. More fertile soils in the Five Lakes Valley developed on mylonites that are a result of tectonic crushing at ridge passes. Basic soils in Loch Coruisk Valley developed on ultrabasic peridotite and on basic sills and dikes that cut the gabbroic laccolith. In both areas of interest surface waters are represented by alpine lakes and streams, and underground waters by a few moraine reservoirs and fissure waters. Due to the impermeable ground, bogs and peats have developed at the very bottoms of both valleys at sites where runoff is impeded. However, in the more humid climate of Loch Coruisk Valley, bogs and peats dominate the landscape, while in the Five Lakes Valley they form small patches. The most obvious feature differentiating the researched valleys is climate.
The Tatra Mountains lie in a temperate zone, transitional between Atlantic and continental, with cold winters and hot summers, a precipitation peak in the summer months (mean yearly precipitation is about 1700 mm), strong foehn winds and strong winter temperature inversions. The geographical situation of Skye determines its cool, extremely Atlantic climate with high precipitation all year (yearly precipitation averages 3400 mm), strong winds, and mild winters and summers leading to low temperature amplitudes (37°C, compared with 58°C in the Tatra Mountains). The Atlantic character of Loch Coruisk Valley is further emphasised by the fact that it lies in the southern part of Skye and opens straight to the sea, allowing easy penetration of the valley by humid, westerly winds. Skye's climate is reflected in its name, which derives from the Norse “skuy” meaning “cloud”; in Gaelic it is Eilan a Cheo, “Isle of mists”, as there are more than 300 rainy days in an average year. The Five Lakes Valley is located above the tree line (the lowest point in the valley is the Great Lake at 1664 metres above sea level). Although Loch Coruisk Valley reaches sea level, it is also devoid of trees due to hard rock, steep slopes and exposure to strong winds and humid air masses from the sea. The main vegetation types in both areas of research are: scree and cliff communities, subalpine and alpine grasslands and dwarf-shrub heaths, fens, bogs and tall herb communities. Some patches of subnival vegetation can be found at both sites. Both valleys have long been subjected to man’s activity (burning, tourist pressure) and sheep grazing, whose influence is reflected in the introduction of anthropogenic communities and species.

19.1.3 Methods: construction of the database and processing of maps

The Polish study site covers an area of 5.2 km², while the Scottish one covers 13.8 km². The small areas of research, the enormous variety of mountainous ecosystems and the aim of the research required a detailed study. Both case studies were carried out at 1:10,000 scale. The scarcity of environmental data at such a detailed scale and the mountainous character of the researched valleys determined the choice of three principal components that were used in the analysis: geology, geomorphology (slope class) and vegetation. Geological and contour maps (with a 50-metre contour interval) were available at 1:10,000 scale, while vegetation maps were produced during field surveys (mapping of vegetation communities) at 1:7,500 scale and partially on the basis of interpretation of aerial photos at about 1:25,000 scale. All maps were digitised and rasterized.
For both sites the same size of base unit was assumed: one pixel represented 4.2 m×4.2 m in terrain (which is 0.42×0.42 mm on a map at 1:10,000 scale). The main shortcoming of such a fine pixel size was increased processing time and large file sizes. However, as the credibility of the final results depends on the quality and detail of the input maps, precision in the database construction is a must. The next step in map processing was interpolation of contour maps (development of Digital Elevation Models) and derivation of slope and aspect maps for both areas of interest. Finally, slope maps were classified (classification according to Kalicki, 1986; see Figure 19.2). The maps (Figures 19.2, 19.3, 19.4 and 19.5, see also Tables 19.1 and 19.2) were overlaid to produce maps of geocomplexes (homogeneous land units characterised by one feature of geology and vegetation). Automated overlaying of maps was accomplished using the map algebra module in IDRISI. Simple processing produced maps containing very small patches, which were eliminated using a specially designed filter (all areas of less than 4 pixels were joined to the neighbouring class of pixels). Finally, the maps of geocomplexes were reviewed for the environmental meaning of the delimited geocomplexes; that is, some similar types were merged into one type. In the Five Lakes Valley 42 types represented by 1407 individual land units were delimited; on Skye the values were 41 and 1442, respectively. Maps of types of geocomplexes became base maps to analyse the structure of the environments. The task was implemented through statistical analysis of the geocomplexes. Most statistical methods require numerical data, whereas geocomplexes constitute qualitative characteristics of the environment. To quantify the data, three numerical parameters of geocomplexes were calculated: areas, perimeters and quantities of distinctive land units in particular types of geocomplexes.
Calculation of these measurements was possible with the use of GIS methods and allowed the evaluation of more than 30 statistical indicators and coefficients, some of which are presented in this chapter.
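The overlay and measurement steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the IDRISI procedure used in the study: the function names, the type encoding (geology code × number of vegetation classes + vegetation code), the use of 4-connectivity and the tiny 3 × 3 rasters are all hypothetical. Only the 4.2 m pixel size comes from the text.

```python
from collections import deque

CELL = 4.2  # pixel edge length in metres, as stated in the text


def overlay(geology, vegetation, n_veg):
    """Combine two categorical rasters into one geocomplex-type raster.

    The encoding g * n_veg + v is an assumption made for this sketch.
    """
    return [[g * n_veg + v for g, v in zip(grow, vrow)]
            for grow, vrow in zip(geology, vegetation)]


def land_units(raster):
    """Label 4-connected patches; return (type, area_m2, perimeter_m) per patch."""
    rows, cols = len(raster), len(raster[0])
    seen = [[False] * cols for _ in range(rows)]
    units = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            kind = raster[r0][c0]
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = edges = 0
            while queue:
                r, c = queue.popleft()
                pixels += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and raster[rr][cc] == kind:
                        if not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
                    else:
                        edges += 1  # pixel side facing another type or the map edge
            units.append((kind, pixels * CELL ** 2, edges * CELL))
    return units


# Hypothetical toy rasters (not data from the study).
geology = [[0, 0, 1],
           [0, 0, 1],
           [1, 1, 1]]
vegetation = [[0, 0, 0],
              [0, 1, 0],
              [0, 0, 0]]

geocomplexes = overlay(geology, vegetation, n_veg=2)
units = land_units(geocomplexes)
for kind, area, perimeter in units:
    print(kind, round(area, 2), round(perimeter, 1))
```

Counting the returned tuples per type gives the number of land units in each type, and the area and perimeter columns feed the size and shape indices discussed in the following sections. Note that a perimeter measured along pixel edges is longer than the perimeter of the digitised polygon it approximates.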
Figure 19.2: Maps of slopes of The Five Lakes Valley and Loch Coruisk Valley

Table 19.1: Legend for the Vegetation Communities in the Five Lakes Valley (Geocomplexes: A=granite; B=mylonite; C=moraines; D=slope deposits; E=river alluvium)
No
1.
Vegetation type
Subnival grasslands (Minuartio-Oreochloetum distichae, Com. Oreochloa disticha-Gentiana frigida, Trifido-Distichetum subnival variant with Oreochloa disticha)
Type of Geocomplexes A
B
1
15
C
D
E
GIS TO ANALYSE STRUCTURE AND CHANGE IN MOUNTAIN ENVIRONMENTS
Figure 19.3: Geological maps of The Five Lakes Valley and Loch Coruisk Valley
Figure 19.4: Vegetation map of The Five Lakes Valley

No
Vegetation type
Type of Geocomplexes A
2. Montane typical grasslands (Trifido-Distichetum typicum)
3. Montane mossy grasslands (T.-D. sphagnetosum)
4. Montane chinophilous grasslands (T.-D. salicetosum herbaceae)
5. Montane boulder shrubs and grasslands (T.-D. salicetosum kitaibelianae, Drepanoclado-Salicetum kitaibelianae)
6. Montane scree grasslands (T.-D. scree variant with Juncus trifidus)
7. Montane grazed grasslands (T.-D. caricetosum sempervirentis)
8. Subalpine grasslands (T.-D. anthropogenic variant with Agrostis rupestris and Deschampsia flexuosa)
9. Mylonite communities (Festuco versicoloris-Agrostietum alpinae, Com. with Silene cucubalus)
10. Scree and boulder communities (Rhizocarpetalia)
11. Chinophilous heaths and mosses (Salicetum herbaceae and mossy communities from Salicetea herbaceae)
12. Chinophilous grasses (Luzuletum spadiceae)
13. Tall herb communities (Calamagrostietum villosae, Aconitetum firmi, Adenostyletum alliariae)
14. Bogs (Com. with Eriophorum vaginatum, Sphagno-Nardetum, Sphagno-Caricetum)
15. Fens (Caricetum fuscae subalpinum)
B
2 5 7 9
C
D
3
4 6 8 10
12
11 13
E
14 16 17 20 21 26
22
18
19
24 28
25 29
31 33
23 27 30 32
Figure 19.5: Vegetation map of Loch Coruisk Valley

No
Vegetation type
Type of Geocomplexes A
16. Dwarf-shrub pine (Pinetum mughi carpaticum)
17. Dwarf-shrub heaths (Vaccinietum myrtilli, Empetro-Vaccinietum)
18. Anthropogenic fresh grasslands (Com. with Deschampsia flexuosa, Hieracio alpini-Nardetum)
19. Anthropogenic wet grasslands (Cerastio fontani-Deschampsietum)
20. Anthropogenic communities (Com. with Stellaria media, Urtico-Aconitetum, Com. with Rumex obtusifolius, Com. with Cardaminopsis halleri, Com. with Ranunculus repens)
B
34
C
D
35 37 38
36
E
39
41 42
40
Table 19.2: Legend for the Vegetation Communities in Loch Coruisk Valley (Geocomplexes: A=gabbro; B=peridotites; C=moraines; D=slope deposits; E=river alluvium)
No
Type of vegetation
1. Montane grasslands (Com. Festuca ovina-Luzula spicata)
2. Montane grasslands and heaths (Cariceto-Rhacomitretum lanuginosi)
3. Montane heaths (Rhacomitreto-Empetrum)
Type of Geocomplexes A
B
1 3 5
8
C
D 2 4 6
E
NO Type of vegetation
Type of Geocomplexes A
4. Montane dwarf shrubs (Com. with Juniperus nana)
5. Scree and boulder communities (Rhizocarpetalia)
6. Acidic tall herb communities (Com. Luzula sylvatica-Vaccinium myrtillus)
7. Calcareous tall herb communities (Com. Sedum rosea-Alchemilla glabra)
8. Eutrophic fens (Com. Carex rostrata-Scorpidium scorpioides, Com. Carex panicea-Campylium stellatum, Com. Eriophorum latifolium-Carex hostiana, Com. with Schoenus nigricans)
9. Mesotrophic fens (Com. Trichophorum cespitosum-Carex panicea, Com. Molinia caerulea-Myrica gale)
10. Ombrotrophic mires (Com. Eriophorum angustifolium-Sphagnum cuspidatum, Trichophoreto-Callunetum, Trichophoreto-Eriophoretum)
11. Bogs and wet-heaths moderately flushed by water (Molinieto-Callunetum)
12. Species-poor dwarf-shrub heaths (Callunetum vulgaris)
13. Mossy dwarf-shrub heaths (Vaccinieto-Callunetum hepaticosum)
14. Anthropogenic species-poor grasslands (Agrosto-Festucetum species-poor)
15. Anthropogenic species-rich grasslands (Agrosto-Festucetum species-rich, Alchemilleto-Agrosto-Festucetum)
16. Coastal communities (Com. Asplenium marinum-Grimmia maritima)
7 9 12 15 18
B
C
10
D 11 14 16
E
20
13 17 19
21
23
22
24
26
25
27 29 31 34 37
28 30 33 36 40
32 38
35 39
41
19.2 STATISTICAL ANALYSIS OF THE ENVIRONMENTAL STRUCTURES

The calculated indices and measures can be divided into four groups:

1. parameters of size and frequency of occurrence of land units;
2. indices of shape of land units;
3. measures characterising the spatial (horizontal) structure of geocomplexes (pattern complexity, nearest neighbour frequency, landscape richness, evenness, patchiness, entropy, diversity and dominance of geocomplexes); and
4. measures characterising relationships between different features of geocomponents (strength and frequency of relationships between slope class, type of vegetation and geology).

19.2.1 Parameters of size and frequency of geocomplexes

Size and frequency of geocomplexes constitute simple, basic and introductory descriptions of the morphology of the environment. At the same time, however, they are very important, as the analysis of the structure of geocomplexes depends on the scale of the research and the size of the delimited geocomplexes.
Mean size of geocomplexes within a type

Large mean sizes of geocomplexes within a type express stability. In The Five Lakes Valley the largest mean sizes characterise the following types of geocomplexes: subalpine grasslands on moraines (14), grazed montane grasslands on slope deposits (13), mylonite communities on slope deposits (16) and scree communities on moraines (18). Subalpine meadows occur in a transitional zone between the alpine and subalpine zones and constitute a characteristic mixture of communities typical of both zones, which explains their large sizes. Moreover, like montane grazed grasslands, they are a result of intensive sheep grazing, which continued until the late 1960s, when the Five Lakes Valley was purchased by the Tatra National Park from private farmers. Degeneration and floristic changes of communities are not as strong as in subalpine zones, but in both cases sheep grazing led to the development of semi-natural geocomplexes of simplified inner structure. Hence, in spite of the primitive character of pasture management and the short grazing period (two months a year), grazing led to simplification of the characteristics and mosaic structure of the alpine environments. This supports the theory that environmental structure depends on the strength of anthropopression: at the beginning human activity leads to diversification of environmental structure (number and sizes of geocomplexes); with its intensification, however, it leads to substitution of small, diversified geocomplexes by large, cohesive and unified ones (Pietrzak, 1989). Mylonite communities form “flower meadows” on the alluvial talus slopes below mountain passes with mylonite outcrops formed on dislocation lines. The main determinant of the occurrence of this community is the presence of CaCO3 in flushing water.
Other geocomponents such as geology, relief, ground humidity and microclimate are of less importance and do not contribute to diversification of these communities at the scale of the current analysis. Scree communities on slope deposits (19) are communities of extensive scree boulders in side valleys. Their low diversification results from the lack of linear geocomplexes (streams or gullies), which usually cut large geocomplexes. Large boulders also constitute a hostile environment for any other type of vegetation because of the lack of soil. The following types of geocomplexes have the smallest sizes: chinophilous grasses on mylonites (22), scree communities on granite (17), montane grasslands on mylonites (15) and tall herb communities on river alluviums (27). The small areas of types 22, 15 and 27 are due to the underlying geology: both mylonites and alluviums form small patches which are important in the diversification of the environment. Scree communities on granite exist mainly on rugged summits and in crevices, and in 3-D reality cover large areas. However, as they are presented on a 2-D map, their size parameters are significantly underestimated. In Loch Coruisk Valley the largest geocomplexes are found in the following types: scree communities on slope deposits (11), eutrophic fens on river alluviums (19), anthropogenic species-rich grasslands on gabbro (37), montane grasslands on gabbro (1) and poor anthropogenic grasslands on gabbro (34). Types 19 and 1 can be considered stable and typical geocomplexes in the Loch Coruisk Valley landscape. The comparatively large sizes of types 11, 37 and 34 can be explained in the same way as the large sizes of the respective geocomplexes in the Five Lakes Valley in the Tatra Mountains. The smallest mean areas of types of geocomplexes in Loch Coruisk Valley are observed in the following types: mesotrophic fens and flushed bogs on gabbro (21 and 27), and scree communities on peridotite and gabbro (10 and 9).
Types 21 and 27 form small patches in small hollows in solid rock, often on watershed borders in the upper parts of the valley or on roches moutonnées. The small sizes of types 10 and 9 are, again, an effect of the 2-D representation of the 3-D environment and are significantly underestimated. As a final analysis of the sizes of geocomplexes, an index of area variability for each type of geocomplex was calculated (Bocarov 1976):
$$V = \frac{s}{\bar{x}} \cdot 100\% \qquad (19.1)$$

where: $s$ = standard deviation of the area within a geocomplex type, $\bar{x}$ = mean value of the area within a geocomplex type.

The index represents the standard deviation as a percentage of the mean value and hence eliminates the direct influence of mean values on standard deviation, which enables comparison of size variability between different types of geocomplexes. The index correlates with the total area of the type of geocomplexes (correlation of 0.63, significant at the 0.00 level). In the Five Lakes Valley the index ranges from 26 percent to 231 percent and reaches its maximal values for the following types: scree montane grasslands on moraines (12), scree communities on moraines (18) and anthropogenic wet grasslands on moraines (41). On Skye it ranges from 44 percent to 367 percent and reaches its maximum for: flushed bogs on gabbro (27), species-poor heaths on moraines (30), and anthropogenic species-poor grasslands on gabbro (34). Generally speaking these are large and typical geocomplexes which have quite a wide range of habitat requirements. Small variability of size is characteristic of small and rare geocomplexes. Most often these are geocomplexes which require specific habitat conditions. In the Tatra Mountains these are fens (32, 33) and chinophilous communities (20, 22); on Skye they are represented by calcareous communities (8, 15, 16, 37) and montane heaths (5, 3). These communities are homogeneous and, in spite of their episodic character and small areas, are important as they indicate non-typical habitats which contribute to the specific character of the researched environments.

Frequency of types of geocomplexes

High area frequency is characteristic of dominant geocomplexes, which constitute the so-called “landscape background”.
In the Five Lakes Valley as many as six types cover 40 percent of the total valley area: scree communities on moraines (18), dwarf pine on moraines (35), chinophilous grasses on slope deposits (25), scree communities on slope deposits (19), anthropogenic fresh grasslands on moraines (35) and typical montane grasslands on granite (2). Types of geocomplexes that cover a small percentage of the total valley area but whose individual geocomplexes are highly dispersed (high frequency of occurrence) are characterised by common but specific habitats covering small areas. Good examples in the Tatra Mountains are tall herb communities occurring in long, narrow belts along abrupt breaks of slope at the border between solid rock and accumulation deposits (types 26 and 28), chinophilous grasses (21, 23, 25) and dwarf pine on slope deposits (36). In Loch Coruisk Valley six types of geocomplexes, most of which are hydrogenic (types 20, 36, 11, 29, 34, and 28), cover as much as 50 percent of the total area. This is further evidence that the dominant geocomponent in Loch Coruisk Valley is climate, which determines all the other geocomponents. Small but common individual geocomplexes are characteristic of basic habitats (types 38, 15, 17, 18, 10, and 4).

19.2.2 Indices of shape

Shape indices provide information on the main processes ruling the overall performance of geocomplexes.

Index of shape dismemberment
Index of shape dismemberment reaches its minimum value of 1 for circles and approaches infinity for highly dismembered shapes (Pietrzak, 1989):
$$S_d = \frac{P}{2\sqrt{\pi A}} \qquad (19.2)$$

where: $A$ = area of individual geocomplex, $P$ = perimeter of individual geocomplex.

Dismemberment of shapes is highly scale-dependent. In spite of the detailed scale of the research, shapes of individual geocomplexes are generalised. Hence the index mostly expresses the elongation of shapes. It correlates with the mean perimeters of types and the mean size of geocomplexes: large geocomplexes are more dismembered or elongated than small ones. In both areas of interest the index reaches its maximum for geocomplexes of the subnival and montane zones, which are under the strongest influence of gravitational forces (rock and boulder fall, soil creep, rock slides, landslides, mud-flows, avalanche erosion and water erosion), namely types 1, 2, 5, 9, 13, 16 in the Five Lakes Valley and type 6 in Loch Coruisk Valley, and for hydrogenic geocomplexes dependent on water flow (16, 23; and 12, 14, 15, 17, 19, respectively). The mean value of the shape dismemberment index for Skye (2.17) is higher than for the Tatra Mountains (1.91), a result of higher inaccessibility, steeper slopes and a more humid climate.

Roundness index
The roundness index is the squared inverse of the index of shape dismemberment and is given by the following equation (Pietrzak, 1989):

$$R_c = \frac{4 \pi A}{P^2} \qquad (19.3)$$

The index reaches its maximum ($R_c = 1$) for circles and approaches 0 for long and dismembered shapes. In the Five Lakes Valley it reaches its maximum ($R_c = 0.51$) for anthropogenic geocomplexes, which stand out significantly from natural geocomplexes. Among the remaining geocomplexes the highest values of the index (0.37–0.38) are found in two groups. The first is represented by bogs and chinophilous communities (in the Five Lakes Valley: 8, 33, 20, 31, 37; in Loch Coruisk Valley: 26, 24, 32, 31); the second by shrub and grass communities on “islands” of solid rock buried in moraine deposits (34 and 34, respectively). Both groups are characteristic of the subalpine zone: the dwarf pine zone in the Tatra Mountains and the respective heath zone on Skye, where gravitational forces are weaker than in the montane and subnival zones. The mean value of the roundness index in the Five Lakes Valley (0.32) is higher than that calculated for Loch Coruisk Valley (0.27), which confirms that the Loch Coruisk Valley environment is much more influenced by one-directional gravitational processes than The Five Lakes Valley.

19.2.3 Analysis of spatial patterning

Landscape indices derived from information theory approximate the homogeneity and diversification of the environment. In this chapter four indices are presented: landscape diversity (absolute entropy), relative landscape diversity (relative entropy), dominance and likeness index (nearest neighbour analysis).

Diversity (absolute entropy) and relative diversity (relative entropy)
Entropy is a basic notion used in cybernetics which for continuous features is based on probabilities of occurrence of particular features and is expressed by logarithmic function (Richling 1992):
$$H = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (19.4)$$

where: $p_i = s_i/s$ = probability of occurrence of a particular state (the proportion of landscape in habitat $i$ or percentage of geocomplexes within the whole research area), $s_i$ = area of individual geocomplex within a type (or 1), $s$ = total area of a type (or number of individual units within a type), $n$ = number of observed individual units within a type.

In this study probabilities of occurrence of individual geocomplexes were measured on the basis of both areas of geocomplexes and numbers of geocomplexes. As the logarithm base was set to 2, entropy is given in bits. Entropy increases with the number of individual units within a type. It reaches a minimum (0) for one-element sets and approaches infinity as the number of individual units within a type increases. For a given number of individual units within a type, the absolute entropy of the type of geocomplexes is higher when the probability distribution of a feature is close to even, and it reaches its maximum when all units cover the same area. The ratio of absolute entropy to maximum entropy for a given type is called relative entropy:

$$H_{rel} = \frac{H}{H_{max}} \qquad (19.5)$$

where:

$$H_{max} = \log_2 n \qquad (19.6)$$

Entropy is strongly scale-dependent (it is calculated on the basis of frequencies of occurrence and sizes of landscape units), hence it is difficult to compare entropies of different landscapes evaluated in different studies. However, the entropies evaluated for Loch Coruisk Valley and the Five Lakes Valley can be compared directly. Although the mean sizes of landscape units in Loch Coruisk Valley are larger than those in the Five Lakes Valley, the total researched area of Loch Coruisk Valley is also larger than that of the Five Lakes Valley. A single, individual landscape unit in Loch Coruisk Valley constitutes on average 6.93×10⁻⁴ of the total researched area, while in the Five Lakes Valley it constitutes 7.11×10⁻⁴ of the total researched area. As these values are comparable, they do not influence indices based on entropy.
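The entropy measures described above can be illustrated with a short Python sketch. It assumes the usual Shannon form with base-2 logarithms, which matches the properties stated in the text (zero for a one-element set, maximum when all units cover the same area, result in bits). The function names and the sample areas are hypothetical, and the value returned for a single-unit type, where the ratio is undefined, is a convention chosen for this sketch only.

```python
import math


def entropy(areas):
    """Absolute entropy (bits) of one geocomplex type from its unit areas."""
    total = sum(areas)
    return -sum((a / total) * math.log2(a / total) for a in areas if a > 0)


def max_entropy(n):
    """Entropy of n units that all cover the same area."""
    return math.log2(n)


def relative_entropy(areas):
    """Ratio of absolute to maximum entropy, between 0 and 1."""
    n = len(areas)
    if n < 2:
        return 0.0  # assumption: 0/0 case for a single unit is reported as 0
    return entropy(areas) / max_entropy(n)


# Hypothetical unit areas (m^2) for two geocomplex types.
even_type = [1000.0, 1000.0, 1000.0, 1000.0]
uneven_type = [3000.0, 1000.0]

print(entropy(even_type), relative_entropy(even_type))      # 2.0 1.0
print(round(entropy(uneven_type), 4), round(relative_entropy(uneven_type), 4))
```

Passing a list of ones instead of areas gives the frequency-based variant mentioned in the text, in which every unit has the same weight and the absolute entropy equals the maximum entropy.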
Absolute and relative entropies based on areas and on numbers of individual units within a type are highly correlated: in The Five Lakes Valley the correlation equals 0.97 while in Loch Coruisk Valley it is 0.94, so both measures are good approximations of landscape diversity. As absolute entropy describes the diversity of a particular type and correlates highly with the number of individual units within a type (correlation 0.96 in the Five Lakes Valley and 0.94 in Loch Coruisk Valley), it does not contribute significantly to the description of the landscape structure of the researched environments. Mean entropy of the Five Lakes Valley equals 5.26, and of Loch Coruisk Valley 4.73; Loch Coruisk Valley is characterised by lower dispersion of individual units within a type and hence lower landscape diversity. This can be explained by the strong influence of mesoclimate on the Loch Coruisk Valley landscape. Strong winds and high humidity leave their mark on spatial patterning by blurring the impact of microclimate and eliminating small-scale differences in the environment. In the Tatra Mountains the impact of microclimate and local ground humidity is strong and leads to the formation of small patches of vegetation communities (and hence geocomplexes) differentiated with respect to local soil and microclimate conditions. Relative entropy eliminates the influence of the number of units and correlates inversely with the index of area diversification (the correlation equals −0.78 in the Five Lakes Valley and −0.59 in Loch Coruisk Valley, at a significance level of 0.00). Relative entropy ranges from 0 to 1 and reaches its maximum for types
with the lowest index of area diversification. Mean relative entropy of the researched valleys is very high: it reaches 0.975 in the Five Lakes Valley and 0.883 in Loch Coruisk Valley. This is further confirmation that the landscape of the Five Lakes Valley is more diversified than that of Loch Coruisk Valley.

Dominance
Dominance is another statistical index based on entropy (Turner, 1989):

$$D = H_{max} + \sum_{i=1}^{n} p_i \log_2 p_i = H_{max} - H \qquad (19.7)$$

It correlates highly with frequencies based on areas and shows the dominant types of geocomplexes with respect to the percentage of area taken by a particular type of geocomplex. Furthermore, it correlates highly with the index of area diversification (0.97 at both sites at a significance level of 0.00) and strongly but negatively with relative entropy (in the Five Lakes Valley the correlation equals −0.85, and in Loch Coruisk Valley −0.71, both at a significance level of 0.00). To summarise, although indices based on the notion of entropy constitute good approximations of landscape patterning, they can be substituted by the simpler indices presented above.

Likeness index
The likeness index, measured on the basis of the length of common borders between distinct land units, is a good method of nearest neighbour analysis. It is given by the following equation (Richling, 1992): (19.8) where: a (or b) = total length of border of unit a (or b), c = total common length of border between units a and b. The likeness of units which do not border each other is 0, and for two types bordering only each other (which occurs when there are only two types of geocomplexes in the researched landscape) it reaches 100 percent. The index was evaluated for each pair of bordering types of geocomplexes; finally, mean values for each occurrence of a bordering type were calculated. At both research sites the most frequent neighbouring types of geocomplexes are pairs with the same vegetation but different geology. In the Five Lakes Valley the index reaches its maximum value of 45 percent and is above 30 percent for the following pairs: (32, 33); (30, 31); (7, 8); (38, 39); (5, 6); (2, 4) and (41, 42) (see Tables 19.1 and 19.2). In Loch Coruisk Valley the highest values of the index are between 49 and 35 percent for the following pairs: (19, 20); (3, 4); (34, 36); (1, 2); (1, 10); and (31, 33). This can be explained by the fact that vegetation categories are much more detailed than geological categories (a neighbourhood of two types from a total of 5 is much more probable than a neighbourhood of two types from a total of 20). It indicates that the ecological amplitudes of vegetation communities are much narrower than the variation of soil humidity and mineral composition that can be read from the available geological maps. The granulometric composition of the geological material could serve as a simple proxy for soil conditions. Unfortunately, the geological maps of both researched areas are very detailed with respect to solid rocks (detailed division of granites and gabbro, which have little influence on vegetation) and very rough with respect to Pleistocene and Holocene deposits.
A good example is the category “moraines” which constitutes 36 percent of the total area of the Five Lakes Valley and consists of ground, lateral, interlobate, terminal, and other moraines which differ significantly with respect to size of particles and hence availability of nutrients and humidity for plants. As maps of vegetation of both areas present a
better description of the researched environments, further detailed recognition of environmental structures could well be based on recognition of the structure of vegetation habitats. The second group of geocomplexes with a high likeness index consists of geocomplexes of different vegetation types occurring on the same geological unit. In the Five Lakes Valley the highest values of the index range between 21 percent and approximately 12 percent for the following pairs of geocomplexes: (11, 25); (5, 9); (18, 35); (2, 21). In Loch Coruisk Valley this group is represented by the pairs: (8, 10); (32, 38); (19, 22); (15, 29); (20, 23) and (15, 31). This group characterises the most frequently neighbouring vegetation communities. The pairs represent communities with similar habitat requirements, for instance ground humidity requirements (eutrophic and mesotrophic fens) or similar slope aspect (mossy montane grasslands and boulder montane shrubs and grasslands), usually communities tied to the same microclimate. However, some of the pairs are characterised by diametrically different habitats: for instance typical montane grasslands and chinophilous grasses on granite. This testifies to the great diversification and mosaic character of montane environments. In the group of pairs with different vegetation and geological units, the likeness index is very low and reaches 5–7 percent. At both sites this group of geocomplexes is derived either from basic geological units (mylonites and peridotites) or from calcareous communities which occur on non-basic geological units but are flushed by basic waters flowing from basic regions above (in the Five Lakes Valley: (16, 1); (2, 15); (2, 16); in Loch Coruisk Valley: (17, 32) and (4, 8)). This group thus relates to the first group, where the main determinant of location was geology. The mean value of the likeness index in the Five Lakes Valley is 3.92 percent and in Loch Coruisk Valley 5.03 percent.
Another important characteristic is the percentage of neighbourhoods that actually occur out of the total number of all possible neighbourhoods. In the Five Lakes Valley it reaches 51 percent and in Loch Coruisk Valley 43 percent. This is further confirmation that Loch Coruisk Valley is better ordered than the Five Lakes Valley, whose types and individual units are more dismembered and accidental.

19.2.4 Analysis of vertical structure of geocomplexes

Index of strength of relationship
The index of strength of relationship describes the relationship between pairs of geocomponents such as soils and geology, microclimate and vegetation, etc. (Bezkowska, 1986; Richling, 1992). The index expresses the relation of area (or number of occurrences) covered by geocomplexes with particular features to a theoretical, maximum area (or number of occurrences) where the relationship could exist. It is expressed by the following equations: (19.9) (19.10) where: Prg=area or frequency of types of geocomplexes with r type of vegetation and g category of geology, Pr=total area (or frequency) of geocomplexes with r type of vegetation, Pg=total area (or frequency) of geocomplexes with g category of geology.
The index was calculated on the basis of both frequencies and areas. In this chapter the author presents the results of the latter method. The index reaches its maximum (W=1) for the types of geocomplexes which are the only representatives of a particular category of geocomponents. It approaches 0 for features of geocomponents whose relation with features of other geocomponents is loose, and reaches 0 for features of geocomponents which never occur together (for instance, in the Five Lakes Valley typical montane grasslands never occur on river alluviums and dwarf pine never occurs on mylonites). High values of the index are typical of strong and stable relationships, which have a leading role in the environmental structure. The results achieved in this study are presented according to the classification introduced by Bezkowska (1986): very strong relationships (W=0.8–1), strong relationships (W=0.6–0.8), moderate relationships (W=0.4–0.6), loose relationships (W=0.2–0.4) and very loose relationships (W=0–0.2). In the Five Lakes Valley six types of geocomplexes reach the maximum value of the index. This is a direct consequence of the method of delimitation of geocomplexes, which after map overlaying were further generalised (use of the filter and classification). It is a good example of a case in which the preliminary procedure (GIS analysis) must be considered when interpreting statistical results. In both research areas strong and very strong relationships prevail in three groups of geocomplexes: montane grasslands on solid rocks (in the Five Lakes Valley types: 1, 5, 2 and in Loch Coruisk Valley: 29, 1, 12, 3), hydrogenic geocomplexes on moraines (31, 38, 33, 31 and 23, 28, 26 and 19, respectively), and montane grasslands on basic material (18 and 8, respectively). It is significant that while in the Five Lakes Valley strong and very strong relationships prevail mainly in the first group, in Loch Coruisk Valley they prevail mainly in the second.
Hence a conclusion may be drawn that in a temperate, transitional climate the most important geocomponents diversifying montane environments are relief and altitude (and hence microclimate), while in a cool, Atlantic climate it is hydrography (dependent upon meso- and macroclimate) that is more important than relief, altitude and microclimate. This phenomenon was observed during the field research. While in the Tatra Mountains one can observe vegetation zones which appear at approximately the same altitude ranges, on Skye the transition between vegetation formations is gradual. A good example of the “blurred” sequence is the gradual transition between the following communities: heaths, montane heaths, montane heaths and grasslands, montane grasslands, dismembered montane grasslands with scree communities, scree communities, bare rocks. The strength of relationship between different geocomponents describes the inner cohesion of the types of geocomplexes. Geocomplexes with strong relationships between geocomponents are stable and cohesive. The most stable and cohesive geocomplexes are the large and typical ones. Low values of the index of relationship indicate unbalanced and fragile geocomplexes. The most fragile geocomplexes are found in transition zones between different vegetation zones. As the borders of vegetation zones are never linear, the most fragile geocomplexes form small patches of communities located at the upper limits of their zones of occurrence. Good examples are the small patches of dwarf pine in the Five Lakes Valley (the mean size of geocomplexes with dwarf pine ranges from 1681 m² to 1808.5 m², while the mean size of all the geocomplexes equals 3695 m²). They are the most endangered communities, which can be easily destabilised. This should be considered when planning tourist routes, which should bypass them.
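As an illustration of the size and shape indices used in Sections 19.2.1 and 19.2.2, the following Python sketch computes the index of area variability and the two shape indices for hypothetical values. It rests on several assumptions that are not confirmed by the chapter: the population standard deviation is used for the variability index, the index of shape dismemberment is taken as the ratio of the perimeter to the circumference of a circle of equal area (a form that gives 1 for circles, as the text requires), and all function names and sample numbers are invented for the example.

```python
import math
from statistics import mean, pstdev


def area_variability(areas):
    """Standard deviation of unit areas as a percentage of their mean.

    Assumption: population standard deviation (pstdev); the chapter does
    not say whether the sample or population form was used.
    """
    return 100.0 * pstdev(areas) / mean(areas)


def shape_dismemberment(area, perimeter):
    """Perimeter relative to a circle of equal area (assumed form): 1 for a circle."""
    return perimeter / (2.0 * math.sqrt(math.pi * area))


def roundness(area, perimeter):
    """Squared inverse of the dismemberment index: 1 for a circle, near 0 for long shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical unit areas (m^2) within one geocomplex type.
areas = [2000.0, 4000.0, 4000.0, 4000.0, 5000.0, 5000.0, 7000.0, 9000.0]
print(area_variability(areas))  # 40.0

# A circle of radius 10 m and a 10 m x 1000 m strip, both hypothetical.
circle_area, circle_perimeter = math.pi * 10.0 ** 2, 2.0 * math.pi * 10.0
strip_area, strip_perimeter = 10.0 * 1000.0, 2.0 * (10.0 + 1000.0)

print(shape_dismemberment(circle_area, circle_perimeter),
      roundness(circle_area, circle_perimeter))
print(round(shape_dismemberment(strip_area, strip_perimeter), 2),
      round(roundness(strip_area, strip_perimeter), 3))
```

The strip example shows why elongated hydrogenic geocomplexes score high on dismemberment and low on roundness even when their outline is smooth.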
19.3 INTERPRETATION OF RESULTS: COMPARISON OF THE RESEARCHED ALPINE VALLEYS

The research summarised in this chapter makes it possible to compare the Five Lakes Valley and Loch Coruisk Valley and to confirm more general conclusions concerning the structure of alpine environments. The leading component that determines the character of all other components in alpine environments is relief. Strong dependence on gravitation is expressed by elongated and strongly dismembered shapes of geocomplexes. The most elongated shapes are characteristic of geocomplexes that are under the strongest influence of slope processes and water flow (geocomplexes of the subnival and alpine zones and of hydrogenic character). At the same time these geocomplexes are the least transformed by man. Another important conclusion is the enormous diversity of the researched environments, shown by the strong fragmentation and dispersion of types of geocomplexes. Such great diversity within small areas is typical of alpine environments and results from a wide range of factors operating at different scales, e.g. macro-, meso- and microclimate. However, the diversity of the two researched areas can be attributed to different factors. In the Five Lakes Valley the main factors differentiating vegetation and hence landscape are height above sea level (climatic zoning) and meso- and micro-relief, which determine humidity and microclimate (mainly a result of relief and slope aspect). In Loch Coruisk Valley, the extremely humid climate reduces the influence of temperature and sun exposure and emphasises the importance of slope exposure to humid air masses and strong winds. The effect of these humid conditions is a blurring of the outlines of vertical climatic zoning, e.g. lack of trees within the forest zone, lowering of communities typical of the alpine zone towards sea level and, due to low competition, descent of some alpine species to the bottom of the valley.
The main factors influencing the researched landscapes are reflected in the wide range of alpine meadows in the Five Lakes Valley and the wide range of hydrogenic communities in the Loch Coruisk Valley. Geocomplexes within these communities are the dominant geocomplexes in the researched landscapes and, being climax communities, are characterised by high cohesion and stability. An interesting difference between the researched environments lies in their geology. Geocomplexes on basic geological formations (mylonites in the Tatra Mountains and peridotites on Skye) are unique and characteristic because of their abundant basic flora dominated by herbaceous species with large flowers. Because both mylonites and peridotites weather easily, these basic communities are endangered by degradation. This fact should be taken into account when planning tourist routes. Unfortunately, nearly all the basic habitats in the Five Lakes Valley are on mountain passes crossed by tourist routes. These areas are hence subject to strong erosion by tourists and water. Of primary importance to landscape diversity are geocomplexes that cover a small share of the area but occur in large numbers (many small units within one type of geocomplex) and cut across the large background geocomplexes. These are mostly represented by geocomplexes with hydrophilic, chionophilous or calcareous communities. Although very common, their ecological amplitudes are comparatively narrow: they are sensitive to the slightest change in any of the environmental factors determining their habitats. In spite of the alpine character and low accessibility of the areas of research, both regions have been subjected to anthropopression for many years. Sheep grazing and burning of shrub communities led to significant changes of environmental structure in both valleys.
The most transformed areas in the Five Lakes Valley are the subalpine zone and lower parts of the alpine zone, where shrub and heath communities were replaced by meadow and sometimes even mossy communities. In Loch Coruisk Valley, subalpine communities stretched down and dominated the forest climatic zone. Because of low competition, even some
GIS TO ANALYSE STRUCTURE AND CHANGE IN MOUNTAIN ENVIRONMENTS
mountain species descended to lower parts of the valley. In both regions man’s activities led not only to disturbance of the natural sequence of vegetation communities but also to unification of structure of subalpine zones. These parts of both valleys are less diversified; individual units are big and cohesive. Apart from changes in the proportion of natural communities, some of the communities are inhabited by lowland species that are alien to the researched areas. Abandonment of man’s activities (establishment of a National Park in the Tatra Mountains, and gradual climate cooling on Skye) leads to gradual withdrawal of lowland and anthropogenic species. However, natural succession may last many years and in extreme cases natural communities may not return at all. For example, heather encroachment in Scotland leads to soil acidification and in some cases to development of peat processes. In the Tatra Mountains grazed meadow communities form compacted sod where dwarf mountain pine (Pinetum mughi carpaticum) seedlings cannot cut through. In these cases anthropogenic changes to the environment are irreversible and lead not only to evident changes of real vegetation but also to permanent changes of habitats that in the long run lead to changes of potential vegetation. Stability of geocomplexes with anthropogenic vegetation is confirmed by high values of indicators of strength between geocomponents. In summary it can be concluded that subnival, alpine and low, hydrogenic parts of both valleys are characterised by strong diversity and dispersion of landscape, with long individual geocomplexes and strong relationships between geocomponents. The subalpine zone, being a transitional zone between alpine and forest zones, includes many geocomplexes that are characteristic for both zones. As many communities reach their extreme habitats, higher diversity and lower stability of subalpine zone should be expected. 
Meanwhile, a significant reduction of landscape diversity (large, cohesive geocomplexes) and strong relationships between geocomponents are observed. This is an evident result of intensive and long-term anthropopression. It also confirms the long-acknowledged fact that slight, short-term human activity leads to landscape variation (for instance tourist routes), while strong and long-term activity leads to unification of landscapes. One of the most important conclusions of this part of the research was confirmation that detailed phytosociological studies can replace complex environmental research. This is because vegetation is a perfect indicator of habitat characteristics such as geology, geomorphology, hydrology and micro-climate. Abrupt discontinuities in vegetation are associated with abrupt discontinuities in the physical environment, and vegetation patterns in space reflect general patterns of the landscape and can lead to general conclusions on ecological processes. This is not a new conclusion, but its confirmation is important, especially for mountain environments where detailed studies of micro-climate or hydrology are often not feasible. Detailed studies of vegetation, being a relatively easy method of research, can be considered the most important method of research in such diversified ecosystems.
19.4 FURTHER RESEARCH
One of the problems encountered during the research was that all the indicators used were based on areas and perimeters calculated on orthogonal maps. As the researched environments have an alpine character, size measures of geocomplexes calculated from two-dimensional maps are significantly underestimated. Because a map is a horizontal projection of the real situation, the vertical dimension is lost, and many geocomplexes that lie on steep slopes and cover large areas have underestimated measures of their size.
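The size of this underestimation can be illustrated with a minimal sketch. It assumes, purely for illustration, that a geocomplex lies on a planar facet of uniform slope, in which case the area measured along the slope equals the map area divided by the cosine of the slope angle. The function name and the sample values are hypothetical and are not taken from the research.

```python
import math


def surface_area(planimetric_area_m2: float, slope_deg: float) -> float:
    """Area measured along a planar slope, given its horizontal (map) area.

    Assumption for illustration: the whole unit has one uniform slope angle.
    """
    if not 0.0 <= slope_deg < 90.0:
        raise ValueError("slope must be in the range [0, 90) degrees")
    return planimetric_area_m2 / math.cos(math.radians(slope_deg))


# Hypothetical geocomplex: 1 ha on the map, lying on a uniform 45-degree slope.
map_area = 10_000.0
true_area = surface_area(map_area, 45.0)

# Share of the real area that the map measurement misses.
underestimate = 1.0 - map_area / true_area
print(f"real area: {true_area:.0f} m2, missed by the map: {underestimate:.1%}")
```

Real slopes are not planar, so a measure derived from a DEM would sum such facet areas cell by cell; the sketch only shows why the error grows quickly on steep ground.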
This prompts the author to research further in this field and to improve the results. The author hopes to accomplish this task with the help of photogrammetry techniques. Further research will include: evaluation of DEMs from stereopairs of aerial photos, production of
orthophotomaps and then calculation of areas and perimeters. This time, as the third dimension will be taken into account, areas will represent real measures. Development of statistical indicators based on real, 3-D space would simplify the interpretation of the resulting statistical coefficients. As further research will hopefully be based on data acquired from aerial images, thematic maps of several geocomponents (vegetation, geology, geomorphology, soil humidity, aspect, slope, shading conditions, wind exposure, etc.) will also be derived from the imagery and then analysed using image enhancement and classification methods. Field work, which constituted a significant part of this research, will be reduced to calibration of the aerial data. In conclusion, the results achieved challenge the author to undertake a more detailed investigation of the interdependencies between geocomponents of montane environments and to improve and widen the techniques employed. The author believes that further detailed, automated and objective research on alpine environments will help in the management and protection of these fragile environments.
REFERENCES
BEZKOWSKA, G. 1986. Struktura i typy geokompleksów w środkowej części Niziny Południowowielkopolskiej, Acta Geographica Lodziensia, nr 54. Łódź: Ossolineum.
BOČAROV, M.K. 1976. Metody statystyki matematycznej w geografii. Warszawa: PWN.
KALICKI, T. 1986. Funkcjonowanie geosystemów wysokogórskich na przykładzie Tatr, Prace Geograficzne z. 67. Zeszyty Naukowe Uniwersytetu Jagiellońskiego.
PIETRZAK, M. 1989. Problemy i metody badania struktury geokompleksu (na przykładzie powierzchni modelowej Biskupice). Poznań: UAM.
RICHLING, A. 1992. Kompleksowa geografia fizyczna. Warszawa: PWN.
TURNER, M.G. 1989. Landscape ecology: the effect of pattern on process, Annual Review of Ecology and Systematics, 20, pp. 171–197.
Chapter Twenty
Simulation of Land-Cover Changes: Integrating GIS, Socio-Economic and Ecological Processes, and Markov Chain Models
Yelena Ogneva-Himmelberger
20.1 INTRODUCTION
The need to couple GIS and modelling techniques has been strongly expressed by environmental modellers, and some successful attempts have been made to integrate GIS with atmospheric, hydrological, land surface-subsurface processes, and biological/ecosystems modelling (Goodchild et al., 1993). However, there are very few examples of coupling GIS with models that link environmental processes to their socio-economic causes, such as relating land-cover change to its human driving forces (Dale et al., 1993; Veldcamp and Fresco, 1995). There are several obstacles to integrating GIS and process modelling, both in terms of GIS functionality and modelling methods. Additional difficulties arise when social and environmental data are to be integrated into a model: not only the different nature and scale of the processes (Lonergan and Prudham, 1994), but also the lack of a well-developed theoretical framework for linking societal and environmental process models make their integration difficult (Brown, 1994). One way of coupling simulation models of land-cover change with GIS is via Markov chain models (Lambin, 1994). This method, although employed in a few landscape ecology studies (Berry et al., 1996; Hall et al., 1991; Parks, 1991; Turner, 1988), has not been extensively explored in the GIS field. This chapter proposes a methodology for integrating GIS with socio-economic and ecological processes to model land-cover change via Markov chain analysis. The model is based on dynamic transition probabilities, which are defined as functions of exogenous (ecological and socio-economic) factors via logistic regression. It is tested in a study area in the southern Yucatan peninsula (Mexico), where old-growth, semi-evergreen tropical forests have been subjected to significant changes over the past 20 years. This chapter starts with a definition and an overview of the evolution of Markov chain models in the environmental sciences.
The description of the study area and the explanation of the methodology follow in the next sections. Initial modelling results and the issues encountered during the model implementation stage conclude the chapter.
20.2 HISTORICAL OVERVIEW OF MARKOV CHAIN MODELS
A Markov chain model is a “mathematical model for describing a certain type of process that moves in a sequence of steps through a set of states” (Lambin, 1994, p. 28). Transition probability matrices are the basis of this model. These probabilities are estimated from land-cover changes between time t and time t+1
Figure 20.1: Example of calculation of transition probabilities in a raster GIS. Numbers in the crosstabulation table represent the number of cells in each combination of land-cover categories. To create transition probability matrix, each element of each row in this table is divided by the number in the “TOTAL” column of that row
from agricultural census or forest surveys, or through superimposition of two land-cover maps in a GIS, using the following formula:
$$p_{ij} = \frac{a_{ij}}{\sum_{k=1}^{m} a_{ik}} \qquad (20.1)$$
where $p_{ij}$ is the probability that a given cell has changed from class i to class j during the interval from t to t+1, and $a_{ij}$ is the number of such transitions across all cells in a landscape with m land-cover classes (Anderson and Goodman, 1957). An m × m matrix is built from the probabilities for all possible transitions between all states (Figure 20.1). The major assumptions of these models are that the transition probabilities are stationary over time, and that they depend only on the current distribution of uses, so that history has no effect. Using this matrix, Markov models allow the calculation of the land-cover distribution at time t+1 from the initial distribution at time t. In matrix notation, the model can be expressed as (Baker, 1989):
$$n_{t+1} = M\,n_t \qquad (20.2)$$
where $n_t$ is a column vector whose elements ($n_1, \ldots, n_m$) are the fractions of land area in each of the m states at time t, and M is an m × m transition matrix whose elements $p_{ij}$ are the transition probabilities. Markov models have a special mathematical property that makes them relevant to simulations of ecological successions: the resulting areal distribution of landscape elements corresponds to steady state conditions of an ecosystem (Hall et al., 1991; Shugart et al., 1973; Usher, 1981; Van Hulst, 1979). Use of
SIMULATION OF LAND-COVER CHANGES WITH GIS AND MARKOV CHAIN MODELS
these models in geography began in the mid-1960s with diffusion and movement research (Brown, 1963; Clark, 1965). Such models were also widely used to estimate changes in land-cover acreage (Burnham, 1973; Miller et al., 1978; Nualchawee et al., 1981; Vandeveer and Drummond, 1978). Predictions based on Markov chain models are generally considered better than linear extrapolations (Aaviksoo, 1993). In recent years, Markov chain models have been extended to overcome three major limitations concerning their application to modelling land-cover change: the assumption that the rates of transition are constant through time; the neglect of the influence of exogenous variables on transition probabilities; and the non-spatial character of future land-cover prediction (Baker, 1989; Lambin, 1994). The model proposed in this chapter addresses the last two limitations and assumes stationarity of transition probabilities over the period of study. To overcome the stationarity limitation, Collins et al. (1974) suggest the calculation of dynamic transition probabilities by postulating rules of behaviour of certain landscape elements or by switching between different transition matrices at certain intervals (Baker, 1989). The influence of exogenous factors on transition probabilities can be incorporated into the model by using theoretical or empirical functions such as econometric models (Alig, 1986). Multivariate regression techniques are usually applied to analyse the influence of selected economic factors on each type of land-cover transition (Berry et al., 1996; Parks, 1991). The results of the regression analysis are then used to recalculate transition probabilities under different economic scenarios and to run new simulation models with the new probabilities (Lee et al., 1992; Parks, 1991). The first spatial model of land-cover change was developed by Tom et al.
(1978) who analysed the change between 1963 and 1970 using discriminant analysis with 27 physiographic, socio-economic, and transportation variables. They tested two types of models: one with an a priori change occurrence probability (derived from the Markov matrix) and one with equal change occurrence probability (e.g. for five types of land-cover change, the probability of each was assumed to be 0.2). Separate discriminant analyses were run for each initial type of land cover. The accuracy of the models was based on the number of cells whose future (1970) land cover was correctly predicted. The authors proposed combining the Markov chain and discriminant analysis models for improved prediction of change. The Markov transition probability matrix provides the number of cells that are expected to change, and the discriminant analysis model calculates probabilities of change for each cell. These cells are then rank-ordered from the highest to the lowest probability, and the correct number of top-ranked cells is selected for each type of change. This model uses the expected percentage of change as a priori knowledge to predict where these changes will take place. The strength of the algorithm is that it selects the areas that are “best suited” for change. By running the model for different time scales with different Markov change parameters, it is possible to observe the spatial diffusion of change over time. This type of model works well in situations where only a few land-cover types are present. It does not, however, provide a solution for conflict situations, in which the same cells are selected as the most suitable for several change types.
20.3 THE STUDY AREA
The study area (60×60 km) corresponds to the southern part of Campeche state in Mexico, specifically to a swath between Route 186 (the east-west, cross-peninsula highway) and the Guatemalan border. The area has a tropical monsoon climate (following Köppen).
The annual rainfall is around 2,000 mm with peak precipitation during summer and a distinct winter dry season. A rolling karstic terrain dominates the centre of the area with elevations reaching about 250–300 meters above sea level; the east
and west edges are, however, low-lying. There are no permanent surface streams in the uplands, although seasonal water courses are apparent. Evergreen and semi-evergreen tropical forests predominate in the absence of human interference. The upland soils are shallow but agriculturally fertile mollisols, while depressions (called bajos) are typically filled with vertisols of thick clays. The area so bounded was part of the Río Bec region of the ancient Maya people, one of the most densely settled regions of Mesoamerica from about 1000 BC to 800–1000 AD, in which major deforestation occurred. The forest returned with the collapse and depopulation of the Classic Maya civilisation in this region. Subsequent Maya occupation was sparse, associated with extensive slash-and-burn cultivation. When the Spaniards entered the region in the early sixteenth century, they encountered a mature forest, occasionally punctuated by a Maya village (Turner, 1990). Such conditions remained until the early part of the twentieth century, when market trade in chicle (resin) and tropical hardwoods began (Edwards, 1957). Owing to the high value of chicle at that time, logging and burning in tropical forests dominated by chicozapote (Manilkara zapote) were prohibited. But by the end of the 1930s, with the production of synthetic latex and the sharp decrease in the price of chicle, extraction of the product declined and deforestation began with the development of timber production and, to a lesser extent, agriculture (Flores, 1987). Initially, less commercially valuable timber was logged for railroad construction; extraction of more precious hardwoods (cedar and mahogany) started later. More recent and pronounced changes started with the construction of a highway during the late 1960s and the implementation of large-scale government-sponsored resettlement projects (Plan de Colonization de Sureste), involving the establishment of ejidos, villages with communally owned lands.
Cleared land was used for subsistence agriculture through the shifting cultivation of maize, beans, and squash, which expanded along the highway. Later, during the first half of the 1980s, large areas were cleared by the Mexican government for rice fields. Many of these areas have since been abandoned owing to the failure of rice production experiments, though some were planted with African grasses and converted into pastures. The livestock raised here is cebu and criollo stock, used largely for domestic (local and national) consumption. Overall, land cover has gone through a range of transformations, from intensive cultivation of land to abandonment of cultivated pastures and regrowth of secondary forests. The recent character of these changes and the availability of data facilitate the documentation of different trajectories of change in land cover from the onset of the recent deforestation. Also, the diverse set of human forces driving these changes makes this area an excellent example for studying the links between causes and types of cover change and for exploring the spatial and temporal dynamics of these relationships.
20.4 METHODS
The study was conducted in three methodological steps. First, land-cover maps for three different dates were created. These maps were then analysed and land-cover change (transition) maps for the two time periods were produced. Second, ecological and socio-economic factors defining land-cover change in the area were identified, and a set of digital maps representing these factors was made. Finally, land-cover transitions were linked to the factors of change via the Markov chain-based spatially explicit model. The model used data from the first time period to predict land-cover transitions for the second time period. Changes in land cover are examined through the analysis of two types of data, Landsat MSS images and socio-economic census data, both spanning the period from 1975 to 1990.
The study area is completely covered by one MSS scene. Three radiometrically corrected and geometrically rectified cloud-free MSS scenes—for 1975, 1986 and 1990—were obtained from the North American Landscape Characterisation
Project (NALC). These images were georeferenced to a 60×60 meter Universal Transverse Mercator ground coordinate grid, and coregistered at the EROS Data Center prior to purchase. Since cloud-free data are rare for this area, the acquisition dates in the dataset range from April (1986 and 1990 scenes) to December (1975 scene).
20.4.1 Land-cover classification
A combination of different image classification techniques was employed to achieve the results most suited for analysis. First, Principal Components Analysis (PCA) was used to reduce the four spectral bands in each MSS scene to two components: the first principal component of the two visible bands, and the first principal component of the two near-infrared bands. Each of the component images was scaled so that the mean plus or minus four standard deviations fell within the 0–255 range. The visible component allows for the separation of open vs. forested land, and the near-infrared component is used to delineate water and cloud shadows (Bryant et al., 1993). To classify open and forest areas in more detail, a combination of unsupervised and supervised classification was used. Training sites for supervised classification were selected based on the existing land-cover map for 1986 and a knowledge of the area. A set of aerial photographs for 1984–85 was employed to interpret image classification results.
20.4.2 Defining the driving forces of change
The land uses associated with the various covers and the important factors of change were determined from field observations, interviews conducted with local farmers in eight ejidos, and literature reviews. Elevation, slope, and soil type were identified as ecological factors important for land-cover change in the area.
Among the socio-economic factors of change, the amount of governmental subsidies, availability of bank loans, population distribution and affluence level, distance to roads, distance to market, and distance from each land plot to the village, were identified as important. Some socio-economic variables representing these driving forces were collected from the 1970 and 1980 population censuses for each of the 23 ejidos comprising the study area. Agricultural census data were available only for 1970 and 1988 at the municipal level. Data for some of the driving forces (soils, bank loans, amount of governmental subsidies) were not available. Based on the census data, a database of 20 digital maps was created. The village boundary map was used to represent census information in a spatial form. At this point, it was assumed that all socio-economic indicators derived from census data were distributed uniformly within each village boundary. These 20 maps include: ecological factors (elevation, slope, distance to the nearest other land cover) and socioeconomic factors (population, economically active population, population density, percentage of households with electricity and indoor plumbing, number of working animals per hectare of cropland, literacy rate, number of pick-up trucks, number of trucks, number of tractors per hectare of cropland, and the distance to roads, village, market, cropland, grassland, and secondary forest). Two sets of socio-economic factor maps were created. The first set, based on the 1970 population and agricultural census, was used in model development, and the second, based on the 1980 population census and the 1988 agricultural census, was used in model implementation.
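The uniform-distribution assumption described above amounts to a lookup from village identifier to census value for every cell. A minimal sketch in Python with NumPy follows; the raster, the identifiers and the density values are invented for illustration, and the function name `spread_uniformly` is hypothetical rather than part of the GIS used in the study.

```python
import numpy as np

# Hypothetical 4x4 raster of village (ejido) identifiers.
# 0 marks cells that fall outside every village boundary.
village_id = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 0],
    [3, 3, 3, 0],
])

# Hypothetical census attribute (e.g. population density) per village.
density_by_village = {1: 4.0, 2: 9.5, 3: 1.2}


def spread_uniformly(ids, values, nodata=np.nan):
    """Give every cell the census value of the village it belongs to."""
    lookup = np.full(ids.max() + 1, nodata, dtype=float)
    for vid, val in values.items():
        lookup[vid] = val
    # Integer-array indexing maps each identifier to its value in one step.
    return lookup[ids]


density_map = spread_uniformly(village_id, density_by_village)
print(density_map)
```

Every cell inside a boundary receives the same value, which is exactly the simplification that areal interpolation with ancillary data would later try to relax.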
20.4.3 Integrated modelling
The model was developed using data on land-cover changes between 1975 and 1986 and was applied to the 1986 data to produce the map for 1990. It was validated by comparing this predicted map with the land-cover map produced from the satellite imagery for 1990. The development of the model included the following steps:
1. Cross-classification of the 1975 and 1986 land-cover maps to produce the transition type map, showing the corresponding transition number for each cell.
2. Extraction of values from the set of 20 ecological and socio-economic maps for each pixel that underwent a particular transition between 1975 and 1986.
3. Use of multinomial logistic regressions (CATMOD procedure in SAS) to produce a transition probability equation for each transition type:
$$P_{ij} = \frac{\exp(a + \beta_1 X_1 + \cdots + \beta_n X_n)}{1 + \exp(a + \beta_1 X_1 + \cdots + \beta_n X_n)} \qquad (20.3)$$
where $P_{ij}$ is the probability of change from land cover i to land cover j; a and β are logistic regression coefficients; and $X_1, \ldots, X_n$ are the values of the ecological and socio-economic factors for a pixel.
Logistic regression was chosen for two reasons: it is suitable for the analysis of both continuous and discrete variables (Trexler and Travis, 1993); and the interpretation of predicted values, which always range between 0 and 1, is straightforward for probability-based Markov chain analysis. If multicollinearity was present in the data, factor analysis was applied first, and then the factor scores were used in the logistic regressions as independent variables. The model was implemented using the following steps:
1. Generation of transition probability maps corresponding to these equations using map algebra functions in a GIS. Since the considered time period corresponds to 11 years (1975–1986), these maps likewise correspond to 11-year transition probabilities (1986–1997).
2. Normalisation of these 11-year maps to annual probabilities and generation of the four-year transition probability maps (corresponding to 1986–1990). The standardisation procedure assumed that the rate of change of probabilities was linear.
3. Comparison of the transition probability maps on a cell-by-cell basis and creation of the predicted 1990 land-cover map. Each land-cover type was considered separately. In other words, for cells that were cropland in 1986, only maps representing probabilities of transition from cropland to other land covers (plus the probability map of “no change”) were considered. At this stage, a deterministic approach was adopted, based on the assumption that the transition that takes place will always be the one with the highest likelihood. Thus, of the four transition probability values corresponding to each cell, the highest was selected, and the cell was changed to the land cover that represented that highest value.
4. Model evaluation: comparison of the simulated land-cover map with the actual land-cover map produced from classification of satellite imagery.
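The deterministic decision rule in step 3 and the cell-by-cell comparison in step 4 can be sketched as follows. This is an illustrative reading of the steps, not the GIS procedure actually used: the four-year probability maps are taken as given, all numbers are invented, and the names `most_likely_transition` and `location_agreement` are hypothetical. The sketch also returns the gap between the highest and second-highest probability, since a small gap means the highest-likelihood rule decides a near tie.

```python
import numpy as np

# Land covers reachable from cropland, in the order of the stacked maps below.
# The first entry stands for "no change".
TARGETS = np.array(["cropland", "forest", "scrub forest", "grassland"])

# Hypothetical four-year transition probabilities for three cells that were
# cropland in 1986: one row per probability map, one column per cell.
prob_stack = np.array([
    [0.70, 0.40, 0.20],   # cropland -> cropland (no change)
    [0.05, 0.05, 0.10],   # cropland -> forest
    [0.15, 0.45, 0.10],   # cropland -> scrub forest
    [0.10, 0.10, 0.60],   # cropland -> grassland
])


def most_likely_transition(stack):
    """Index of the highest-probability map per cell, and its lead over the runner-up."""
    ranked = np.sort(stack, axis=0)
    winner = stack.argmax(axis=0)
    margin = ranked[-1] - ranked[-2]
    return winner, margin


winner, margin = most_likely_transition(prob_stack)
predicted_1990 = TARGETS[winner]

# Hypothetical "control" land cover for the same three cells.
observed_1990 = np.array(["cropland", "grassland", "grassland"])


def location_agreement(predicted, observed, label):
    """Share of cells observed as `label` that were also predicted as `label`."""
    mask = observed == label
    if not mask.any():
        return float("nan")
    return float((predicted[mask] == label).mean())


grassland_agreement = location_agreement(predicted_1990, observed_1990, "grassland")
print(predicted_1990, margin, grassland_agreement)
```

In the second cell the winning transition leads by only 0.05, so the predicted class there depends heavily on the decision rule.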
20.5 RESULTS AND DISCUSSION
20.5.1 Land-cover classification
The PCA technique worked well on these data and allowed for the separation of open vs. forested areas with a high degree of accuracy. Unsupervised classification was performed on the false colour composite image of green, red, and infrared bands. The results of this classification were, however, unsatisfactory owing to severe striping problems in the original data. Supervised classification allowed for the separation of eight land-cover classes, which were then reduced to six general classes in order to minimise the effect of misclassification error on modelling results. These six land-cover types are: forest (including both medium-tall semi-evergreen and bajo seasonal wetland forests), scrub forest (early successions), grassland (both savanna and cultivated pastures), cropland, bare soil/roads and water. Three land-cover maps (for 1975, 1986 and 1990) were produced using the same technique.
20.5.2 Integrated modelling
Cross-classification of the 1975 and 1986 land-cover maps yielded 29 transition types. Only 16 of them, representing changes between the four main land-cover categories (forest, scrub forest, grassland, and cropland), were chosen for the model. Other land-cover categories, such as water and roads, were assumed to be unchanging. Statistical analysis of the 20 independent variables showed that many of them were highly correlated with each other (for example, the four distance variables, as well as the variables representing the affluence level). Factor analysis was applied to the independent variables and the first ten factors were extracted (each of the factors had at least one variable with a loading higher than 0.7). These factors explained about 97 percent of the variance.
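The cross-classification reported at the start of this subsection, and the row normalisation that turns its counts into the transition probabilities of Equation 20.1, can be sketched with NumPy. The two 3×3 maps and the class codes are invented for illustration. The projection at the end multiplies a row vector of area fractions by the matrix, which is one common convention when rows index the initial class; it is an assumption of this sketch, not a statement about the software used in the study.

```python
import numpy as np

# Hypothetical land-cover maps, coded 0 = forest, 1 = scrub forest,
# 2 = grassland, 3 = cropland.
lc_1975 = np.array([[0, 0, 0],
                    [0, 1, 1],
                    [3, 3, 2]])
lc_1986 = np.array([[0, 0, 3],
                    [1, 1, 0],
                    [3, 2, 2]])
m = 4  # number of land-cover classes

# Transition type map: one unique code per (from, to) pair in each cell.
transition_type = lc_1975 * m + lc_1986
n_types = np.unique(transition_type).size

# Cross-tabulation: counts[i, j] = number of cells that went from i to j.
counts = np.zeros((m, m), dtype=int)
np.add.at(counts, (lc_1975.ravel(), lc_1986.ravel()), 1)

# Divide each row by its total to obtain transition probabilities.
# Rows with no cells are left as zeros instead of dividing by zero.
row_totals = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row_totals, out=np.zeros((m, m)), where=row_totals > 0)

# Non-spatial Markov projection of area fractions (row-vector convention).
n_1975 = row_totals.ravel() / counts.sum()
n_1986 = n_1975 @ P   # reproduces the observed 1986 fractions
n_next = n_1986 @ P   # one further step, assuming stationary probabilities

print(n_types, P, n_1986, n_next, sep="\n")
```

Because the matrix is estimated from the same pair of maps, the first projection simply returns the observed 1986 distribution; only the second step is a prediction, and it relies entirely on the stationarity assumption.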
It is interesting to note that even though the variable loading pattern was quite different for different transitions, variables related to technology use (number of tractors, trucks, working animals per hectare of cropland) as well as the distance to market variable, always had the highest loading in the first component for all transition types. Multinomial logistic regressions were run separately for each of the four initial (i.e. corresponding to 1975) land-cover types. The maximum-likelihood analysis-of-variance tables showed that all models fit and that the factors included in the analysis were significant with respect to each transition type. Comparison of the predicted map (Figure 20.2) with the “control” map (Figure 20.3) was made on both a non-spatial and spatial basis. The former involved the calculation of the total area under each land cover for the two maps, and the latter, the overlaying and crosstabulation of the two maps. The model predicted the extent of mature forest quite accurately (just 7 percent less than in the control map), but underpredicted the area under scrub forest (39 percent less). With regard to the grassland and cropland, the model overpredicted their extent 1.3 and 2.5 times respectively. The fact that the model overestimated the rates of conversion in these categories is not very surprising and can be linked to the model’s main assumption about stationarity of transition probabilities over time. The period between 1975 and 1986 corresponds to the large-scale deforestation by the government for cultivation of rice, as well as the growth of subsistence agriculture, due to the increase in population in the area. By the late 1980s, however, rice production was declining, and no more land was cleared on a large scale. Thus, the rate of
Figure 20.2: Predicted land-cover map for 1990,
these particular transitions (forest-cropland and forest-grassland) was somewhat different during the second time period (1986–1990), a difference the model did not account for. With respect to the spatial agreement of the results, overlaying the two maps showed that the model correctly predicted the location of 89 percent of the forest, 11 percent of the scrub forest, 9 percent of the grassland, and 39 percent of the cropland. It is important to keep in mind that the final map corresponds to the land-cover changes that had the highest probability for each cell, regardless of the value of this probability and its proximity to the second-highest probability. A closer look at the areas where forest conversion into cropland or grassland was predicted incorrectly shows that the probabilities of these transitions are 0.59 and 0.52, while the second-highest probabilities (corresponding to no change) are 0.40 and 0.47 respectively. In the case of the forest-to-grassland transition, the two probabilities are so close to each other (the difference is 0.05) that the decision rule applied in this deterministic model may not be appropriate. This example illustrates that the final land-cover map should always be analysed in conjunction with the maps of the highest and second-highest transition probabilities. Finally, a comment is needed regarding the data used for the analysis. The difference in the dates of the imagery (the 1975 scene corresponds to the end of the wet season, and the 1986 and 1990 scenes to the end of the dry season) as well as the limited spectral resolution may have introduced some land-cover classification error. Finer-resolution imagery (TM or SPOT) acquired on anniversary dates would have improved the classification accuracy. Weak predictions of some of the transitions may be explained by the fact that the 20 independent variables representing the driving forces of change in the area do not fully describe the
dynamics present. If a time series of data on bank loans, governmental subsidies, and prices for agricultural and livestock production had been available, the model performance might have improved for certain transitions.

20.6 CONCLUSIONS AND SUGGESTIONS FOR FUTURE RESEARCH

Analysis of the results leads to the conclusion that the Markov chain probability-based model coupled with a logistic regression model allows for improved understanding of land-cover change processes. Its main value is its spatial character and the ability to incorporate explanatory variables (both ecological and socio-economic). The identification of variables representative of the processes defining land-cover change in an area, and the definition of decision rules during the model formulation stage, are the most critical elements for its successful performance.

The findings presented above are part of ongoing research. The next step of the analysis will be to calibrate the model using several different approaches. First, some alternative models (e.g., stochastic) will be tested and their results compared with the performance of the deterministic model. Second, the approach suggested by Tom et al. (1978) will be tested: the probability maps are rank-ordered from the highest to the lowest value, and the correct number of the top cells (as derived from non-spatial Markov chain analysis) is selected for each type of transition. Third, a set of different approaches to spatial representation of socio-economic variables will be tested (e.g., areal interpolation using ancillary data) and their influence on model performance will be evaluated. Finally, a set of behavioural rules for certain land covers will be incorporated to overcome the stationarity limitation of the Markov approach.

ACKNOWLEDGEMENT

This research was partially supported by the US Man and Biosphere Program (Tropical Directorate) Grant No. TEDFY94-003.
Professors Billie Lee Turner II, Ronald Eastman, and Samuel Ratick, who serve on the author’s dissertation committee, have contributed to the design of this research and their comments have been incorporated into this chapter.

Figure 20.3: Actual land-cover map for 1990.

REFERENCES

AAVIKSOO, K. 1993. Changes of plant cover and land use types (1950’s and 1980’s) in three mire reserves and their neighborhood in Estonia, Landscape Ecology, 8(4), pp. 287–301.
ALIG, R. 1986. Econometric analysis of the factors influencing forest acreage trends in the Southeast, Forest Science, 32(1), pp. 119–134.
ANDERSON, T. and GOODMAN, L. 1957. Statistical inference about Markov chains, Annals of Mathematical Statistics, 28, pp. 89–110.
BAKER, W. 1989. A review of models of landscape change, Landscape Ecology, 2(2), pp. 111–133.
BERRY, M., FLAMM, R., HAZEN, B. and MACINTYRE, R. 1996. Lucas: a system for modeling land-use change, IEEE Computational Science and Engineering, 3(1), pp. 24–35.
BROWN, L. 1963. The Diffusion of Innovation: a Markov Chain-type Approach. Discussion Paper No. 3, Department of Geography, Northwestern University.
BROWN, D. 1994. Issues and Alternative Approaches for the Integration and Application of Societal and Environmental Data within a GIS, Michigan State University, Department of Geography, Rwanda Society-Environment Project, Working Paper No. 3, 12 April, 1994.
BRYANT, E., BIRNIE, R. and KIMBALL, K. 1993. A practical method of mapping forest change over time using Landsat MSS data: a case study from central Maine, in Proceedings of 25th International Symposium on Remote Sensing and Global Environmental Change, Graz, Austria, 4–8 April. Ann Arbor: ERIM, Vol. 2, pp. 469–480.
BURNHAM, B. 1973. Markov intertemporal land use simulation model, Southern Journal of Agricultural Economics, 5(2), pp. 253–258.
CLARK, W. 1965. Markov chain analysis in geography: an application to the movement of rental housing areas, Annals of the Association of American Geographers, 55, pp. 351–359.
COLLINS, L., DREWETT, R. and FERGUSON, R. 1974. Markov models in geography, The Statistician, 23, pp. 179–209.
DALE, V., O’NEILL, R., PEDLOWSKI, M. and SOUTHWORTH, F. 1993. Causes and effects of land-use change in central Rondonia, Brazil, Photogrammetric Engineering and Remote Sensing, 59(6), pp. 997–1005.
EDWARDS, C. 1957. Quintana Roo, Mexico’s Empty Quarter. Berkeley: University of California.
FLORES, G.J. 1987. Uso de los Recursos Vegetales en la Peninsula de Yucatan: Pasado, Presente y Futuro. Xalapa: INIREB.
GOODCHILD, M., PARKS, B. and STEYAERT, L. (Eds.) 1993. Environmental Modeling with GIS. New York: Oxford University Press.
HALL, F., BOTKIN, D., STREBEL, D., WOODS, K. and GOETZ, S. 1991. Large-scale patterns of forest succession as determined by remote sensing, Ecology, 72(2), pp. 628–640.
LAMBIN, E. 1994. Modeling Deforestation Processes: a Review. Ispra: TREES Project.
LEE, R., FLAMM, R., TURNER, M., BLEDSOE, C., CHANDLER, P., DEFERRARI, C., GOTTFRIED, R., NAIMAN, R., SCHUMAKER, N. and WEAR, D. 1992. Integrating sustainable development and environmental vitality: a landscape ecology approach, in Naiman, R. (Ed.), Watershed Management: Balancing Sustainability and Environmental Change. New York: Springer-Verlag, pp. 499–521.
LONERGAN, S. and PRUDHAM, S. 1994. Modeling global change in an integrated framework: a view from the social sciences, in Meyer, W. and Turner, B. (Eds.), Global Land-use and Land-cover Change. Cambridge: Cambridge University Press.
MILLER, L., NUALCHAWEE, K. and TOM, C. 1978. Analysis of the dynamics of shifting cultivation in the tropical forest of northern Thailand using landscape modeling and classification of Landsat imagery, in Proceedings of the 20th International Symposium on Remote Sensing of Environment, 20–26 April. Ann Arbor: ERIM, pp. 1167–1185.
NUALCHAWEE, K., MILLER, L., TOM, C., CHRISTENSON, J. and WILLIAMS, D. 1981. Spatial Inventory and Modeling of Shifting Cultivation and Forest Land Cover of Northern Thailand with Inputs from Maps, Airphotos and Landsat, Remote Sensing Centre Technical Report No. 4177. College Station: Texas A & M University.
PARKS, P. 1991. Models of forested and agricultural landscapes: integrating economics, in Turner, M. and Gardner, R. (Eds.), Quantitative Methods in Landscape Ecology. New York: Springer-Verlag, pp. 309–322.
SHUGART, H., CROW, T. and HETT, J. 1973. Forest succession models: a rationale and methodology for modeling forest succession over large regions, Forest Science, 19, pp. 203–212.
TOM, C., MILLER, L. and CHRISTENSON, J. 1978. Spatial Land-use Inventory, Modeling, and Projection/Denver Metropolitan Area, with Inputs from Existing Maps, Airphotos, and Landsat Imagery. Greenbelt: NASA, Goddard Space Center.
TREXLER, J. and TRAVIS, J. 1993. Nontraditional regression analyses, Ecology, 74(6), pp. 1629–1637.
TURNER, M. 1988. A spatial simulation model of land use changes in a piedmont county in Georgia, Applied Mathematics and Computation, 27, pp. 39–51.
TURNER II, B. 1990. The rise and fall of Maya population and agriculture, 1000 BC to present: the Malthusian perspective reconsidered, in Newman, L. (Ed.), Hunger and History: Food Shortages, Poverty and Deprivation. Oxford: Basil Blackwell, pp. 178–211.
USHER, M. 1981. Modeling ecological succession, with particular reference to Markovian models, Vegetatio, 46(1), pp. 11–18.
VANDEVEER, L. and DRUMMOND, H. 1978. The Use of Markov Processes in Estimating Land Use Change. Oklahoma: Agricultural Experimental Station.
VAN HULST, R. 1979. On the dynamics of vegetation: Markov chains as models of succession, Vegetatio, 40(1), pp. 3–14.
VELDKAMP, A. and FRESCO, L. 1995. Modeling Land Use Changes and their Temporal and Spatial Variability with CLUE. A Pilot Study for Costa Rica. Wageningen: Department of Agronomy, Wageningen Agricultural University.
Part Three GIS AND REMOTE SENSING
Chapter Twenty One
Multiple Roles for GIS in Global Change Research
Michael Goodchild
21.1 BACKGROUND

The past ten years have seen a dramatic increase in support for research into the physical Earth system, and the effects of human-induced change, particularly in climate. Such research places heavy demands on geographic data, and on systems to handle those data, in order to calibrate, initialise, and verify models of the Earth system, and also to investigate the relationships that exist between various aspects of the physical system, and the human populations that both cause change and experience its effects. It is widely believed that GIS and related technologies (remote sensing, GPS, image processing, high bandwidth communications) will play an increasingly important role in global change research (Goodchild et al., 1993; Mounsey, 1988; Townshend, 1991). In particular, GIS is seen as a vehicle for collecting, manipulating, and pre-processing data for models; for integrating data from disparate sources with potentially different data models, spatial and temporal resolutions, and definitions; for monitoring global change at a range of scales; and for visual presentation of the results of modelling in a policy-supportive, decision-making environment.

This chapter explores these potential multiple roles of GIS, in global change research and more broadly in environmental modelling and analysis. The emphasis throughout the chapter is on the role GIS can play in the science of global change research; its more general roles in creating and managing data in global databases, such as the UN Environment Program’s GRID, are downplayed. The discussion is based in part on the results of two specialist meetings conducted by the National Center for Geographic Information and Analysis (NCGIA) under its Research Initiative 15 (for the full meeting reports see Goodchild et al., 1995, 1996).
21.2 INTRODUCTION

During the last decade there has been increased awareness of the potential for major changes in climate, deterioration of the stratospheric ozone layer, and decreasing biodiversity. At the same time, new political and economic transformations and structures are emerging. These phenomena are described as “global change” (Botkin, 1989; Committee on the Human Dimensions of Global Change, 1992; Price, 1989; Turner et al., 1990) and can be classified as being of two basic types (Botkin, 1989; Turner et al., 1990). In one sense the term applies where actions are global in extent or systemic, that is, at a spatial scale where perturbations in the system have consequences everywhere else, or reverberate throughout the system.
Thus, for example, there is concern over “greenhouse” gases and other climate forcing agents that are manifested globally. The second meaning applies where there is cumulative global change. The loss of biological diversity at so many locations throughout the world is global in scale because its effects are world-wide, even though the causes are localised.

The international global change research program (IGBP, 1990; NRC, 1990) has grown out of the need for scientific assessments of both types of global change, and is ultimately intended to aid in policy decisions. Emphasis has focused largely on interactions between the Earth’s biosphere, oceans, ice, and atmosphere. The research strategies that help to provide this scientific foundation were developed in the mid-1980s (ESSC, 1986, 1988; ICSU, 1986; NRC, 1986), and feature an advanced systems approach to scientific research based on: (1) data observation, collection, and documentation; (2) focused studies to understand the underlying processes; and (3) the development of quantitative Earth system models for diagnostic and prognostic analyses. Concepts such as Earth system science (ESSC, 1986), global geosphere-biosphere modelling (IGBP, 1990), and integrated and coupled systems modelling at multiple scales (NRC, 1990) have emerged, focusing broadly on the Earth system, but including subsystems such as atmosphere-ocean coupling. The US Global Change Research Program is one example of a national-level effort to implement this strategy (CES, 1989, 1990; CEES, 1991).

GIS could play an important role in this research in two ways: (1) enhancement of models of Earth system phenomena operating at a variety of spatial and temporal scales across local, regional, and global landscapes, and (2) improvements in the capacity to assess the effects of global change on biophysical (ecological) systems on a range of spatial and temporal scales.
In addition to the biogeochemical processes that drive the Earth system, changes in human land use, energy use, industrial processes, social values, and economic conditions are also increasingly being recognised as major forces in global change (Committee on the Human Dimensions of Global Change, 1992). The relationship of these activities and behaviours to global change is critical because they may systemically affect the physical systems that sustain the geosphere-biosphere. Thus additional research strategies that emphasise the human dimension in global change have recently emerged. The National Research Council’s (NRC) Committee on Global Change (Committee on the Human Dimensions of Global Change, 1992) has emphasised that the development of a coherent and systematic assessment and understanding of global change phenomena requires better linkage between the environmental and human dimensions (social and economic).

At present, several problems pose formidable challenges in addressing the human dimensions of global change, three of which are central to this initiative. First, there are difficulties in collecting requisite socio-economic and demographic data. Those data that do exist often span a range of temporal and spatial scales, lack appropriate intercalibration, have incomplete coverages, are inadequately checked for error, and have unsuitable archiving and retrieval formats (Committee on the Human Dimensions of Global Change, 1992). Second, there remain serious problems in translating human activities and information across a range of scales (local, regional, national, or global). Human activities that drive and mitigate global change vary significantly by region or place (Feitelson, 1991; Turner et al., 1990) but, as in ecology, methods for explicit translation across disparate scales or levels of organisation are lacking.
Feitelson (1991) noted that geographers have only recently begun to consider how activities at one geographic scale affect activities at other spatial scales, and proposed a conceptual framework for analysing how geographic scale affects environmental problem solving. For conceptually similar problems, ecologists have invoked hierarchy theory as a way of understanding complex, multiscaled systems (Allen and Starr, 1982). Last, there is a dearth of ways of understanding the interactions of socioeconomic systems and global change other than through logical analysis, which often requires a level of abstraction that makes their understanding obscure (Cole and Batty, 1992). Geographic visualisation
could be used to gain insight into both data and models, though such visual “thinking” has been little explored.

Four broad themes emerge from this discussion to characterise the potential for GIS use in global change research. These are discussed below.

21.2.1 Use of GIS to support integrative modelling and spatial analysis

Scientifically based mathematical models for computer analysis, that is, environmental simulation, are fundamental to the development of reliable, quantitative assessment tools. One major purpose of these computer-based models is to simulate spatially distributed, time-dependent environmental processes realistically. But environmental simulation models are, at best, simplifications and inexact representations of real world environmental processes. The models are limited because basic physical processes are not well understood, and because complex feedback mechanisms and other interrelationships are not known. The sheer complexity of environmental processes (three-dimensional, dynamic, non-linear behaviour, with stochastic components, involving feedback loops across multiple time and space scales) necessarily leads to simplifying assumptions and approximations (e.g. Hall et al., 1988). Frequently, further simplification is needed to permit numerical simulations on digital computers. For example, the conversion of mathematical equations for numerical processing on a grid (discretisation) can lead to the parameterisation of small-scale complex processes that cannot be explicitly represented in the model because they operate at subgrid scales. There may be significant qualitative understanding of a particular process, but quantitative understanding may be limited. The ability to express the physical process as a set of detailed mathematical equations may not exist, or the equations may be too complicated to solve without simplifications.
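The subgrid-scale point above can be made concrete with a small sketch. The following Python fragment is an illustration only and is not drawn from any model cited in this chapter: the function `flux`, its exponential form, and the soil moisture values are hypothetical assumptions chosen to show why a non-linear process evaluated on a grid-cell average differs from the average of the process evaluated at finer scale.

```python
import math


def flux(moisture):
    """Hypothetical non-linear process (an assumption for illustration):
    the flux rises exponentially with soil moisture."""
    return math.exp(3.0 * moisture)


def cell_mean(values):
    """Arithmetic mean of the values inside one coarse grid cell."""
    return sum(values) / len(values)


def coarse_flux(subgrid_values):
    """Flux computed from the grid-cell mean, which is all a coarse model sees."""
    return flux(cell_mean(subgrid_values))


def resolved_flux(subgrid_values):
    """Mean of the fluxes computed separately at each fine-scale location."""
    return cell_mean([flux(v) for v in subgrid_values])


# Illustrative fine-scale soil moisture values inside a single coarse cell.
subgrid = [0.1, 0.2, 0.3, 0.8]

bias = resolved_flux(subgrid) - coarse_flux(subgrid)
print(f"flux from cell mean:      {coarse_flux(subgrid):.3f}")
print(f"mean of fine-scale flux:  {resolved_flux(subgrid):.3f}")
print(f"gap a parameterisation would need to represent: {bias:.3f}")
```

Under these assumptions the two quantities agree only when the cell is homogeneous; the more variable the fine-scale values, the larger the gap that a subgrid parameterisation has to account for.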
In addition to incomplete knowledge, simplifications, and parameterisations of real world processes, other general themes emerge from a review of state-of-the-art modelling, and from efforts to link models with GIS. One is cross-disciplinary modelling, which is illustrated by the concept of modelling water and energy exchange processes within the soil-plant-atmosphere system, or ecosystem dynamics modelling with, for example, the environmentally and physiologically structured Forest-BGC model (Running and Coughlan, 1988). These models cross such disciplines as atmospheric science, hydrology, soil science, and plant physiology.

21.2.2 GIS-linked models and conceptual frameworks for hierarchical and aggregated structures

The requirements of global change research place significant emphasis on modelling at multiple time scales and across multiple spatial scales. NRC (1990) outlines a strategy for coupling models across time scales to account for feedbacks in land-atmosphere interactions. For example, the land surface parameterisations for water and energy exchange between the biosphere and atmosphere must adapt to climate-induced changes in vegetation characteristics that exert major influence on such exchange processes. Feedbacks may be further complicated by the existence of thresholds, and by hysteresis effects. Hay et al. (1993) discuss the use of nesting to model interactions between spatial scales, while Nemani et al. (1993), Burke et al. (1991), and King (1991) are concerned with the extrapolation of research results from local study areas to regional analysis. Hall et al. (1988) illustrate some of the problems in linking vegetation, atmosphere, climate, and
remote sensing across a range of spatial and temporal scales. Spatial scaling involves significant research issues, such as how to parameterise (i.e., aggregate or integrate) water and energy fluxes from the plant leaf level to the regional level. In addition to scale problems, the parameterisation process is confounded by structuring processes which operate at different hierarchical levels (e.g., physiological, autecological, competitive, landscape). Finally, the interactions between levels are asymmetric in that larger, slower levels maintain constraints within which faster levels operate (Allen and Starr, 1982).

Complex terrain and heterogeneous landscape environments form another major theme in the use of physically based models of spatially distributed processes (Running et al., 1989). Distributed parameter approaches are increasingly used instead of classic lump sum analysis as models become more sophisticated, allowing them to incorporate more realistic, physically based parameterisations of a wide variety of land surface characteristics data (King, 1991; Running et al., 1989). Factors such as terrain and landscape variability are important considerations for land-atmosphere interactions (Carleton et al., 1994; Pielke and Avissar, 1990).

Finally, environmental simulation modelling depends on the results of field experiments such as the First ISLSCP (International Satellite Land Surface Climatology Project) Field Experiment (FIFE) (Hall et al., 1988), an intensive study of interactions between land surface vegetation and the atmosphere, and the Boreal Ecosystem-Atmosphere Study (BOREAS; NASA, 1991). Such experiments are integral to the development and testing of models based on direct measurements and remote sensing data from various ground-based, aircraft, and satellite systems.
In addition, focused research to understand processes and to develop remote-sensing-driven algorithms for regional extrapolations will be supplemented by a range of simulation models under BOREAS (NASA, 1991).

These themes of environmental systems modelling suggest opportunities for the integration of GIS. For example, detailed consideration of landscape properties and spatially distributed processes at the land surface is fundamental to global climate and mesoscale models, watershed and water resource assessment models, ecosystem dynamics models that are physiologically based, and various types of ecological models involving landscape, population, and community development processes. The themes of multiple space and time scales are basic to coupled systems modelling, a highly cross-disciplinary modelling approach exemplified by the suite of models for land-atmosphere interactions research.

In addition to the issue of spatial processes operating at multiple time and space scales, GIS and environmental simulation models share converging interests in geographic data. The availability of geographic data from many sources, including land cover characteristics based on multitemporal satellite data, is growing rapidly. GIS by definition is a technology designed to capture, store, manipulate, analyse, and visualise diverse sets of geographically referenced data. In fact, advanced simulation models require a rich variety of multidisciplinary surface characteristics data of many types in order to investigate environmental processes that are functions of complex terrain and heterogeneous landscapes. To illustrate, land surface characteristics data required by scientific research include land cover, land use, ecoregions, topography, soils, and other properties of the land surface to help understand environmental processes and to develop environmental simulation models (Loveland et al., 1991).
These advanced land surface process models also require data on many other types of land surface characteristics, such as albedo, slope, aspect, leaf area index, potential solar insolation, canopy resistance, surface roughness, soils information on rooting depth and water holding capacity, and the morphological and physiological characteristics of vegetation. GIS along with remote sensing has a role in dealing with these complex data issues. GIS can help meet these requirements and provide the flexibility for the development, validation, testing, and evaluation of innovative data sets that have distinct temporal components. There is the need to create
derivative data sets from existing ones, and GIS tools are also needed for flexible scaling, parameterisation and re-classification, creating variable grid cell resolutions, or aggregation and integration of spatial data. At the same time, methods are needed to preserve information across a range of scales or to quantify the loss of information with changing scales. Thus, this overall modelling environment seems suited for GIS as a tool to support integrative modelling, to conduct interactive spatial analysis across multiple scales for understanding processes, and to derive complex land surface properties for model input based on innovative thematic mapping of primary land surface characteristics data sets. By implementing a full range of spatial data models, GIS offers the ability to integrate data across a range of disciplines despite wide variation in their ways of conceptualising spatial processes and of representing spatial variation.

21.2.3 More efficient integration of models and GIS

Despite the above-mentioned potential, a number of impediments stand in the way of more complete integration of GIS and global environmental modelling. GIS are generic tools, designed for a range of applications that extend well beyond environmental modelling, into data management for utilities, local governments, land agencies, marketing, and emergency response (Maguire et al., 1991). While GIS support a wide range of data models, many of the fundamental primitives needed to support environmental modelling are missing, or must be added by the user (Goodchild, 1991). At present, environmental simulations must be carried out by a separate package linked to the GIS, and the ability to write the environmental model directly in the command language of the GIS is still some distance away. Nyerges (1993) provides a comprehensive discussion of these technical integration issues.
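The need noted in Section 21.2.2 to quantify the loss of information with changing scales can be illustrated with a small sketch of the kind of operation a user might have to supply outside the GIS. The Python fragment below is a hedged illustration, not a method proposed in this chapter: it averages a square grid over non-overlapping blocks and reports the fraction of the fine-grid variance that survives. The function names, the choice of variance as the measure of information, and the sample values are all assumptions.

```python
def aggregate(grid, factor):
    """Average non-overlapping factor x factor blocks of a square grid."""
    size = len(grid)
    if size % factor != 0:
        raise ValueError("grid size must be divisible by factor")
    coarse = []
    for r in range(0, size, factor):
        row = []
        for c in range(0, size, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def variance(grid):
    """Population variance of all cell values in a grid."""
    cells = [v for row in grid for v in row]
    mean = sum(cells) / len(cells)
    return sum((v - mean) ** 2 for v in cells) / len(cells)


def variance_retained(grid, factor):
    """Fraction of fine-grid variance that survives aggregation
    (variance is an assumed, simple stand-in for 'information')."""
    return variance(aggregate(grid, factor)) / variance(grid)


# Illustrative 4 x 4 surface: within-block variation is small,
# between-block variation is large.
fine = [
    [1.0, 3.0, 10.0, 12.0],
    [3.0, 1.0, 12.0, 10.0],
    [2.0, 2.0, 20.0, 22.0],
    [2.0, 2.0, 22.0, 20.0],
]

for factor in (1, 2, 4):
    print(factor, round(variance_retained(fine, factor), 4))
```

In this sketch the variation lost at each step is exactly the within-block variation, so a surface whose structure lies mostly between blocks loses little when aggregated, while fine-grained heterogeneity disappears entirely.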
21.2.4 Visualisation of spatial patterns and interactions in global data

Effective use of GIS requires attention to several generic issues, many of which are also of concern to environmental modelers. The discretisation of space that is inherent in both fields forces the user to approximate true geographical distributions, and the effects of such approximations on the results of modelling are often unknown, or unevaluated. Space can be discretised in numerous ways (finite differences and finite elements are two of the examples well known to environmental modelers), and each has its own set of impacts on the results. Effective display of the results of modelling, particularly for use in policy formulation, requires attention to principles of cartographic design. Finally, spatial databases tend to be large, and effective environmental modelling may require careful attention to the efficiency of algorithms and storage techniques. Many of these generic issues are identified in the NCGIA research agenda (NCGIA, 1989, 1992) and are the subject of active research within the GIS community.

As this review demonstrates, and as the title of this initiative indicates, we view GIS as a tool that can play many roles in global change research. There is a need to identify those roles more clearly, and also to identify impediments that prevent GIS being used more broadly. We need to address the generic needs of global change research for spatial data handling tools, whether or not those needs will be met five or ten years from now by a technology we recognise as “GIS”.
21.3 IMPEDIMENTS TO PROGRESS

With these issues in mind, the following sections address the problems that stand in the way of a greater role for GIS in global change research, and the research that needs to be conducted to address them. They address five major themes:

• To identify technical impediments and problems that obstruct our use of GIS in global change research, and our understanding of interactions between human systems and regional and global environmental systems.
• To assess critically the quality of existing global data in terms of spatially varying accuracy, sampling methodologies, and completeness of coverage, and develop improved methods for analysis and visualisation of such data.
• Within the context of global change, to develop theoretical/computational structures capable of building up from knowledge at smaller spatial scales and lower levels of aggregation.
• To develop methods for dynamically linking human and physical databases within a GIS and for exploring the regional impacts of global change.
• To develop methods for detecting, characterising, and modelling change in transition zones, thereby addressing the problems that result from overly simplistic representations of spatial variation.

These themes span to varying degrees the concerns of the many disciplines that together constitute the global change research community. For the purposes of this chapter, the wide range of topics addressed by global change research is narrowed to eight areas:

• Atmospheric science and climate
• Oceans, ocean-atmosphere coupling, and coasts
• Biogeochemical dynamics, including soils
• Hydrology and water
• Ecology, including biodiversity
• Demography, population, and migration
• Production and consumption, including land use
• Policy and decision-making.
The following sections address major problem areas within this context.

21.3.1 Perspectives on “GIS”

Most published definitions of “geographic information system” refer to both data and operations, as in “a system for input, storage, manipulation, analysis, and output of geographically referenced information”. In turn, geographically referenced information can be defined fairly robustly as information linked to specific locations on the Earth’s surface. This definition suggests two tests that can be applied to a software package to determine whether it is a GIS: the integration of a broad enough range of functions, and the existence of geographic references in the data. Clearly the first is less robust than the second, and there have been many arguments about whether computer-assisted design (CAD) or automated mapping functions are sufficiently broad to qualify packages for the title “GIS”.
At this time, several hundred commercial and public-domain packages meet these qualifications, and the GIS software industry is enjoying high rates of growth in annual sales which now amount to perhaps $500 million per year. However, the majority of these sales are in applications like parcel delivery, infrastructure facilities management, and local government, rather than science. Moreover, the term “GIS” has come to mean much more than is implied by this narrow definition and test. At its broadest, “GIS” is now used to refer to any and all computer-based activities that focus on geographic information; “GIS data” is often used as shorthand for digital geographic information; and the redundant “GIS system” is becoming the preferred term for the software itself. One can now “do GIS”, specialise in GIS in graduate programs, and subscribe to the magazine GIS World.

A further and largely academic perspective is important to understanding all of the ramifications of current usage. In many areas of computer application, such as corporate payroll or airline reservations, the objects of processing are discrete and well-defined. On the other hand many geographically distributed phenomena are infinitely complex, particularly those that are naturally occurring as distinct from constructed by humans. Their digital representations are thus necessarily approximations, and will often embed subjective as well as objective aspects. The use of digital computers to analyse such phenomena thus raises a series of fundamental and generic scientific issues, in areas ranging from spatial statistics to cognitive science. The GIS research community has begun to describe its focus as “geographic information science” (Goodchild, 1992), emphasising the distinction between the development of software tools on the one hand, and basic scientific research into the issues raised by the tool on the other.

In summary, three distinct perspectives are identifiable in current use of the term “GIS”:

1. GIS as geographic information system, a class of software characterised by a high level of integration of those functions needed to handle a specific type of information.
2. GIS as an umbrella term for all aspects of computer handling of geographic data, including software, data, the software development industry, and the research community.
3. GIS as geographic information science, a set of research issues raised by GIS activities.

From the first perspective, we can identify a range of advantages and disadvantages of GIS as a software tool for global change research. Some of these can be seen as technical impediments, implying that further research and development of the software may remove them. Others are more fundamental, dealing with the problems of large-scale software integration and the adoption of such solutions within the research community. In this area, it may be possible to draw parallels between GIS and other software toolkits, such as the statistical packages, or database management systems, or visualisation packages. In each of these cases, the average researcher faces a simple make-or-buy decision: is it preferable to write one’s own code, or to obtain it? The answer can be very different depending on the discipline of the researcher, the cost of the software, and its ease of use. In the specific case of GIS, the following issues seem important:

• Ease of use: How much learning is needed to make use of the software? Will it be quicker to learn the package or to write code, or to find code written for this exact problem by some colleague? Many GIS are reputed to be difficult to use, and GIS courses require a heavy investment of time. On the other hand it may be preferable to rely on a GIS than to write code in an area unfamiliar to the researcher, such as map projections.
• Cost: Many researchers are averse to purchasing application software, although they expect to pay for more generic packages such as operating systems and compilers.
Many commercial GIS have a high
price-tag. A GIS will be considered worth the investment if it is perceived as needed by a large enough proportion of the research community, like a statistical package. • Software integration: Is the level of software integration in a GIS viable? A researcher needing to solve a problem in map projections might prefer a public-domain map projection package to a GIS that offers the same functionality bundled into a much more costly and complex package; the same argument could be made in the context of spatial interpolation techniques. Any generic, integrated tool imposes a cost on its users because it cannot achieve the same performance as a tool designed for a specific purpose, so this cost must be compared to the benefits of integration. • Terminology: Does the GIS use terms familiar to the researcher, or does use of GIS require the researcher to embrace an entirely unfamiliar culture very different from his or her own? Researchers see time as a fixed resource, and fear that adoption of any new technology will be at the expense of other areas of expertise. If we adopt the second meaning of GIS above, the world of GIS seems very different. Other geographic information technologies, such as remote sensing and GPS, now fall under the GIS umbrella, and the use of GIS is no longer an issue: global change research has no choice but to use computers and digital data; and the vast majority of the types of data needed for global change research are geographically referenced. From this viewpoint, we face a much broader set of issues, including: • Requirements for computer-based tools in support of global change research, focusing in particular on the need to model dynamic processes in a variety of media, together with relevant boundary conditions and interfaces. 
• The need for interoperability between tools, to allow users of one collection of tools to share data and methods of analysis with users of another collection—and associated standards of format, content description, and terminology to promote interoperability. • The need to harmonise approaches to data modelling, defined as the entities and relationships used to create digital representations of real geographic phenomena. The current variation in data modelling practice between software developers, the various geographic information technologies, and the different disciplines of global change research is a major impediment to effective use of GIS. • The accessibility of data, including measurements shared between scientists; and data assembled by governments for general purposes and useful in establishing geographic reference frameworks and boundary conditions for modelling. • The role of visualisation and other methods for communicating results between global change researchers and the broader communities of decision-makers and the general public. The third perspective above defines GIS as a science, with its own subject matter formed from the intersection of a number of established disciplines. From this perspective global change research is an application area with an interesting set of issues and priorities, many of which fall within the domain of geographic information science. These include the modelling of uncertainty and error in geographic data; the particular problems of sampling, modelling, and visualising information on the globe; and the development of abstract models of geographic data. Of the three, the second meaning of GIS is perhaps the most appropriate to a discussion of the multiple roles of GIS in global change research, as it provides a more constructive perspective than the first, and a greater sensitivity to context than the third. All three are necessary, however, in order to understand the full range
of viewpoints being expressed both within and outside the GIS community, and the research that needs to be done to move GIS forward. 21.3.2 Global change research communities “What is this GIS anyway?” may be the question uppermost in the minds of many global change researchers, but it is quickly supplanted when one realises that the multiple roles of GIS in global change research extend well beyond the immediate needs of scientists for computerised tools. First, global change is a phenomenon of both physical and human systems. Many of the changes occurring in the Earth’s physical environment have human origins, and thus mechanisms for their prediction and control are more likely to fall within the domain of the social sciences. Moreover, many would argue that when measured in terms of their impacts on human society, the most important changes to the globe are not those currently occurring in its physical environment, but are economic and political in nature. The issues raised by computerised tools are very different in the social sciences. Second, the need to integrate physical and social science in order to understand global change creates its own set of priorities and issues. Not only are the characteristics of data different, but the differences in the scientific cultures and paradigms of physical and social science create enormous barriers to communication that are exacerbated by the formalisms inherent in GIS. A recurring theme in global change research is the need to build effective connections between science and policy. Complaints about the lack of connections surface whenever the US Congress is asked to approve another major investment in global data collection, such as NASA’s Mission to Planet Earth. 
Several obvious factors are to blame: scientists are not trained to present their results in forms that can be readily understood by policy-makers; decisions must be made quickly, but science marches to its own timetable; the scientific culture does not provide adequate reward for communicating with policy-makers. GIS as broadly understood is widely believed to have a role to play in this arena. It is visual, providing an effective means of communicating large amounts of information; it is already widely used as a common tool by both the scientific and policy communities; and it supports the integration of information from various sources and of various types.
One of the biggest impediments to progress in global change research, perhaps the biggest of all, is the general public’s reluctance to accept global environmental change as a major problem requiring the commitment of massive research effort and the development of effective policy. As GIS becomes more widely available, through the Internet, World Wide Web, home computers, and other innovations in digital technology that impact the mass market, the same arguments made above about the roles of policy-makers will become applicable to the general public. In summary, three major communities should be considered in examining the roles of GIS in global change research: scientists, policy-makers, and the general public. Each creates its own set of priorities for GIS, and its own set of impediments.
Another recurring theme in global change research is the potential role of the general public in collecting data. The GLOBE project (Global Learning and Observations for a Better Environment; http://globe.fsl.noaa.gov) is conceived along these lines as a network of schoolchildren around the world who will collect data on their own local environment, learning about it in the process, and then contribute those data to a central agency responsible for interpretation and synthesis.
In turn, the central agency will return a global synthesis to the primary sources. In a sense, this concept promises to return us to the earliest days of environmental data collection, before the organisation of official climate measurement stations, and offers to give back to the general public the role then played by the amateur scientist. Although there are
substantial concerns about quality control, this concept offers perhaps the only feasible solution to the current dilemma faced by national governments who can no longer support dense networks for primary field data collection in the face of rising costs and steadily decreasing budgets. 21.3.3 Data issues Several issues associated with data arise in using GIS in support of global change research. First, all of the global change communities are affected by issues of data quality. In any multidisciplinary enterprise it is common for the data used by a scientist to have been collected, processed, manipulated, or interpreted by someone else prior to use, creating a demand for new mechanisms to assure quality that have not been part of traditional science. Tools are needed to model and describe quality; to compare data with independent sources of higher accuracy such as ground truth; to verify the output of models of global environmental change; and to support updating. Much of the necessary theory to support such tools has been developed in the past decade, and needs to be brought to the attention of the global change research community, implemented in readily available tools, and disseminated in education programs. Second, remote access to data must be supported by effective methods of data description, now commonly termed “metadata”. Search for suitable data can be seen as a matching process between the needs of the user and the available supply, both represented by metadata descriptions; and both user and supplier must share a common understanding of the terms of description. The advent of technology to support remote access, including the World Wide Web, has put enormous pressure on the community to develop appropriate methods of description and cataloguing. Techniques need to be improved to support content-based search for specific features, and there are many other technical issues to be addressed in this rapidly developing area of spatial database technology. 
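The matching process between user needs and available supply described above can be illustrated with a minimal sketch. The record fields (`bbox`, `keywords`) and the function names are hypothetical, chosen only for illustration; they do not correspond to any particular metadata standard, and a real catalogue would need a much richer description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Hypothetical, highly simplified metadata record."""
    name: str
    bbox: tuple          # (west, south, east, north) in degrees
    keywords: frozenset  # controlled terms shared by user and supplier


def bboxes_overlap(a, b):
    """True if two (west, south, east, north) boxes intersect.

    Assumes neither box crosses the 180-degree meridian.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def match(need, supply):
    """Names of data sets that overlap the user's area of interest
    and share at least one keyword with the user's request."""
    return [
        m.name
        for m in supply
        if bboxes_overlap(need.bbox, m.bbox) and need.keywords & m.keywords
    ]
```

The sketch only works if both sides draw their keywords from the same vocabulary, which is the point made above about a common understanding of the terms of description.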
Third, issues arise over the institutional arrangements necessary to support free access to global change research data, and the concerns for copyright, cost recovery, and legal liability that are beginning to impact the use of communications technology. While much data for global change research is unquestionably for scientific purposes, other data are also useful for commercial and administrative purposes, and in many cases these tend to dictate access policies. Fourth, there are a number of issues under the rubric of facilitating input, output, and conversion. These include interoperability, the lack of which is currently a major contributor to GIS’s difficulty of use and a major impediment to data sharing among scientists. Interoperability can be defined by the effort and information required to make use of data and systems; in an interoperable world, much of what we now learn in order to make use of GIS will be unnecessary or hidden from the user. An important role in this arena is being played by the Open Geodata Interoperability Specification initiative (http://www.ogis.org). 21.3.4 Data models and process models The term “model” is used in two very different contexts in environmental modelling. A process model is a representation of a real physical or social process whose action through time results in the transformation of the human or physical landscape. For example, processes of erosion by wind and flood modify the physical landscape; processes of migration modify the human landscape. A process model operates dynamically on individual geographic entities. Here we should distinguish between process models that define the dynamics of continuous fields, such as the Navier-Stokes equation, and must be rewritten in approximate, numerical
form to operate on discrete entities; and models such as Newton’s law of gravitation or individual-based models in ecology that operate directly on discrete entities. A data model, on the other hand, is a representation of real geographic variation in the form of discrete entities, their attributes, and the relationships between them. Many distinct data models are implemented in GIS, ranging from the arrays of regularly spaced sample points of a digital elevation model (DEM) to the triangular mesh of the triangulated irregular network (TIN) model. Under these definitions, there is clearly a complex and important relationship between data modelling and process modelling. In principle, the entities of a process model are defined by the need to achieve an accurate modelling of the process. In practice, the entities of a data model are often the outcome of much more complex issues of cost, accuracy, convenience, the need to serve multiple uses that are frequently unknown, and the availability of measuring instruments. An atmospheric process model, for example, might require a raster representation of the atmospheric pressure field; the only available data will likely be a series of measurements at a sparse set of irregularly located weather stations. In such cases it is likely the data will be converted to the required model by a method of intelligent guesswork known as spatial interpolation, but the result will clearly not have the accuracy that might be expected by a user who was not aware of the data’s history. Such data model conflicts underlie much of the science of global change research, and yet their effects are very difficult to measure. The availability of data is often a factor in the design of process models, particularly in areas where the models are at best approximations, and distant from well-understood areas of physical or social theory. 
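The conversion from sparse station measurements to a raster can be sketched as follows. The text does not name a particular interpolator; inverse distance weighting is assumed here purely because it is simple, and the function names, the `power` parameter, and the use of planar coordinates are all assumptions of the sketch.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (sx, sy, value) tuples in planar coordinates.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly at a station: return the observation
        w = d ** -power
        num += w * value
        den += w
    return num / den


def interpolate_to_raster(stations, x0, y0, cell, ncols, nrows):
    """Estimate the field at the centre of every cell of a regular grid
    whose lower-left corner is (x0, y0) and whose cells are `cell` wide."""
    return [
        [idw(stations, x0 + (c + 0.5) * cell, y0 + (r + 0.5) * cell)
         for c in range(ncols)]
        for r in range(nrows)
    ]
```

Every cell of the resulting raster holds a number, but nothing in the raster records how far that cell was from the nearest observation, which is why a later user may credit the grid with more accuracy than it has.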
We rarely have a complete understanding of the loss of accuracy in modelling that results from use of data at the wrong level of geographic detail, or data that has been extensively resampled or transformed. Clearly the worlds of data modelling and process modelling are not separate, and yet practical reality often forces us to treat them as if they were. 21.3.5 Levels of specificity Another key issue in data modelling can be summed up in the word specificity. While there may be agreement that data modelling requires the definition of entities and relationships, there is much greater variation in the degree to which those entities and relationships must be specified, and the constraints that affect specification. One set of constraints is provided by the various models used by database management systems. The hierarchical model, for example, requires that all classes of entities be allocated to levels in a hierarchy; and that relationships exist only between entities at one level and those at the level immediately above or below. If these constraints are acceptable, then a database can be implemented using one or another of the hierarchical database management systems that are readily available. While the model seems most applicable to administrative systems, and has now been largely replaced by less constrained models, it has been found useful for geographic data when the collection of simple entities into more complex aggregates is important—for example, in the ability to model an airport at one scale as a point, and at a finer scale as a collection of runway, hangars, terminal, etc. The most popular model for geographic data is the relational, and its implementation for geographic data is often termed georelational. 
Relationships are allowed between entities of the same class, or between entities in different classes, and this is often used to model the simple topological relationships of connectedness and adjacency that are important to the analysis of geographic data. But even georelational models impose constraints that may be awkward in geographic data modelling.
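As an illustration of how adjacency can be carried by relationships between entity classes, the sketch below stores, for each arc, the polygons on its left and right, and derives polygon adjacency from that table. The table layout and the names are hypothetical and are not taken from any specific GIS.

```python
from collections import defaultdict

# Hypothetical arc table: each row is (arc_id, left_polygon, right_polygon).
# None marks the outside of the mapped area.
ARCS = [
    (1, "A", "B"),
    (2, "B", "C"),
    (3, "A", None),
    (4, "C", None),
]


def adjacency(arcs):
    """Derive polygon adjacency from the left/right polygon of each arc."""
    neighbours = defaultdict(set)
    for _arc_id, left, right in arcs:
        if left is not None and right is not None and left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)
    return {poly: sorted(adj) for poly, adj in neighbours.items()}
```

With the table above, `adjacency(ARCS)` reports that B borders both A and C, while A and C do not border each other. The constraint is visible in the layout itself: an arc can separate exactly two polygons, so phenomena that do not partition the plane cleanly fit the structure poorly.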
For many Earth system scientists, the important modelling frameworks are the ones implemented in the various statistical and mathematical packages, which are much more supportive of complex process modelling than GIS and database management systems. Matlab and S-Plus, for example, have their own recognised classes of entities and relationships, and impose their own constraints. Thus to an Earth system scientist, the task of data modelling may consist of a matching of entities and relationships to those classes supported by a common modelling package; whereas a GIS specialist may be more concerned with matching to the constraints of the georelational model. The entity types supported by a modelling or statistical package will likely include simple tables of data, and arrays of raster cells, but not the full range of geographic data types implemented in the more advanced GIS, with their support for such geographic functions as projection change and resampling, and with implementations of data model concepts like planar enforcement and dynamic segmentation. Choices and constraints may also be driven by the nature of data—a field whose primary data comes mostly from remote sensing will naturally tend to think in terms of rasters of cells, rather than vector data, and to the attributes of a cell as averages over the cell’s area rather than samples at the cell’s centre. The georelational model imposes one level of constraints on data modelling. Further constraints are imposed by the practice of giving certain application-specific interpretations to certain elements of data models. For example, many GIS implement the relational model in specific ways, recognising polygons, points, or nodes as special types within the broad constraints of the relational model. This issue of specificity, or the imposition of constraints on data modelling, contributes substantially to the difficulty of integrating data across domains. 
For example, the data modelling constraints faced by an oceanographer using Matlab are very different from those of a GIS specialist using ARC/INFO. One might usefully try to identify the union of the two sets, or their intersection, in a directed effort at rationalisation.
21.3.6 Generalisations of GIS data models
It is widely accepted that GIS data models have been developed to support an industry whose primary metaphor is the map; that is, GIS databases are perceived as containers of maps, and the task of data modelling is in effect one of finding ways of representing the contents of maps in digital form. Maps have certain characteristics, and these have been largely inherited by GIS. Thus maps are static, so GIS databases have few mechanisms for representing temporal change; they are flat, so GIS databases support a wide range of map projections in order to allow the curved surface of the Earth to be represented as if it were flat; they are two-dimensional, so there are few GIS capabilities for volumetric modelling; they are precise, so GIS databases rarely attempt to capture the uncertainty that is inherent in maps but almost never shown on them; and they present what appears to be a uniform level of knowledge about the mapped area. There are many possible extensions to this basic GIS data model, with varying degrees of relevance to global change research. The five points made above lead directly to five generalisations:
• temporal GIS, to support spatio-temporal data and dynamic modelling (Langran, 1992);
• spherical GIS, avoiding the use of map projections by storing all data in spherical (or spheroidal) coordinates; computing distances and areas and carrying out all analysis procedures on the sphere; and using projections only for display (Goodchild and Yang, 1992; Raskin, 1994; White et al., 1992);
• 3D GIS, to support modelling in all three spatial dimensions (Turner, 1992);
• support for modelling the fuzziness and uncertainty present in data; propagating it through GIS operations; and computing confidence limits on all GIS results (Heuvelink and Burrough, 1993); • methods of analysis that allow for variable quality of data. The spherical data models are clearly of relevance to global change research, but their benefits need to be assessed against the costs of converting from more familiar representations such as the latitude/longitude grid. Methods of spatial interpolation, which are widely used in global change research to resample data and to create approximations to continuous fields from point samples, are particularly sensitive to the problems that arise in using simple latitude/longitude grids in polar regions and across the International Date Line. On the other hand, the benefits of consistent global schemes may be outweighed by the costs of converting from less ideal but more familiar schemes. 21.3.7 The data modelling continuum The literature contains several discussions of the various stages that lie between reality and a digital database: from reality to its measurement in the form of a spatial data model, to the additional constraints imposed by a digital data model, to a data structure, to the database itself. For example, the sharp change in temperature that occurs along a boundary between two bodies of water might be first modelled as a curved line (perhaps by being drawn as such on a map); the curved line would then be represented in digital form as a polyline, or a set of straight-line connections between points; the polyline would be represented in a GIS database as an arc; and the arc would be represented as a collection of bits. Modelling and approximation occurs at each of these four stages except perhaps the last. The polyline, for example, may be no better than a crude approximation to the continuous curve, which is itself only an approximation to what is actually a zone of temperature change. 
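The loss incurred in passing from a curved line to a polyline can be quantified in a simple case. The sketch below assumes, purely for illustration, that the curve is a circular arc sampled at equally spaced vertices; the largest gap between arc and polyline is then the sagitta of each chord. The function names are hypothetical.

```python
import math


def polyline_on_arc(radius, angle, n_segments):
    """Vertices of a polyline with n_segments chords, sampled at equal
    angular steps along a circular arc that starts at angle 0."""
    step = angle / n_segments
    return [
        (radius * math.cos(i * step), radius * math.sin(i * step))
        for i in range(n_segments + 1)
    ]


def max_deviation(radius, angle, n_segments):
    """Largest distance between the arc and its polyline (the sagitta),
    reached at the midpoint of each chord."""
    return radius * (1.0 - math.cos(angle / (2 * n_segments)))
```

Doubling the number of vertices reduces the deviation by roughly a factor of four for small steps, but it never reaches zero, and it says nothing about the earlier approximation of a zone of change by a line.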
It is important to recognise that approximation and data modelling occur even before the use of digital technology. 21.3.8 The data life cycle Related to the previous concept of a data modelling continuum is the data life cycle, which is conceived as the series of transformations that occur to data as it passes from field measurement to eventual storage in an archive. In a typical instance, this life cycle may include measurement, interpretation, collation, resampling, digitising, projection change, format change, analysis, use in process modelling, visualisation, exchange with other researchers, repetition of various stages, and archiving. The data model may change many times, with consequent change in accuracy. Moreover, data quality is more than simply accuracy, since it must include the interpretation placed on the data by the user. If data pass from one user to another, that interpretation can change without any parallel change in the data, for example if documentation is lost or misinterpreted. In this sense, data quality can be defined as a measure of the difference between the contents of the data, and the real phenomena that the data are understood to represent—and can rise and fall many times during the life cycle, particularly in applications that involve many actors in many different fields. It is very easy, for example, for data collected by a soil scientist, processed by a cartographer, analysed by a geographer, and used for modelling by an atmospheric scientist, to be understood by the various players in very different ways.
21.3.9 Information management Recent advances in digital communication technology, as represented by the Internet, and applications such as the World Wide Web (WWW), have created a situation in which there is clearly an abundance of digital data available for global change research, but few tools exist to discover suitable information or assess its fitness for use. Much effort is now going into development of better tools for information management, in the form of digital libraries, search engines, standards for data description, and standards for data exchange. Several recent developments in geographic information management are of relevance to global change research and GIS. While the Federal Geographic Data Committee’s Content Standard for Geospatial Metadata (http://www.fgdc.gov/Metadata/metahome.html) has attracted much attention since its publication in 1994, the effort required to document a data set using it is very high, particularly for owners of data who may have little familiarity with GIS or cartography. If the purpose of metadata is to support information discovery, search, browse, and determination of fitness for use, then much less elaborate standards may be adequate, at least to establish that a given data set is potentially valuable. At that point the potential user may want to access a full FGDC record, but if the owner of the data has not been willing to make the effort to document the data fully, other mechanisms such as a phone or e-mail conversation may be just as useful, and more attractive to the owner. Scientists appear reluctant to document data without a clear anticipation that it will be used by others. However, it may be that funding agencies will begin to require documentation as a condition for successful termination of a project. Otherwise, documentation to standards like FGDC may have the character of an unfunded burden. An owner of data may be willing to provide an initial contribution of metadata to a data catalogue. 
But if the data are later modified, or deleted, are there suitable mechanisms for ensuring that the catalogue reflects this? Users of the WWW are acutely aware of the problems caused by “broken” URLs (Uniform Resource Locators) and similar issues. Although it might be possible to provide facilities for checking automatically whether a data set has been modified, owners may not be willing to accept this level of intrusion.
Another issue associated with distributed information management that is already affecting the global change research community concerns the use of bandwidth. The communication rates of the Internet are limited, and easily made inadequate by fairly small geographic data sets. Research is needed to develop and implement methods that reflect more intelligent use of bandwidth, including progressive transmission (sending first a coarse version of the data, followed by increasingly detailed versions) and the use of special coarse versions for browse. While methods already exist for certain types of raster images, there is a need to extend them to cover all types of geographic data.
A final information management issue of major importance to global change research is interoperability. Today, transfer of data from one system to another frequently requires that the user invoke some procedure for format conversion. While such procedures may not be complex, they present a considerable impediment to data sharing and the research it supports. In principle, the need for conversion should not involve the user, any more than it does in the automatic conversion of formats that is now widely implemented in word processors: the user of Microsoft Word, for example, will probably not need to know the format of a document received from someone else, although conversion still needs to occur. Achievement of interoperability between the software packages used to support global change research should be a major research objective.
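The progressive transmission mentioned above, in which a coarse version of the data is sent first and increasingly detailed versions follow, can be sketched for the raster case. The block-averaging scheme and the function names are assumptions made for illustration; the sketch also resends each level in full, whereas a practical scheme would more likely send only the detail missing from the previous level.

```python
def coarsen(grid):
    """Halve the resolution of a raster by averaging 2x2 blocks.

    Assumes the grid has an even number of rows and columns.
    """
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
         for c in range(0, len(grid[0]), 2)]
        for r in range(0, len(grid), 2)
    ]


def progressive_levels(grid, n_levels):
    """Versions of the raster ordered coarsest first, full detail last."""
    levels = [grid]
    for _ in range(n_levels - 1):
        levels.append(coarsen(levels[-1]))
    return levels[::-1]
```

A user browsing a catalogue could stop after the first level or two, once it is clear whether the data set is worth retrieving in full.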
Reasonable goals for interoperability research might include the following: • interoperability between representations of imagery tied to the Earth’s surface—this might include recognition of a common description language that can be read automatically, and used to perform
necessary operations such as resampling to a common projection; interoperability between band-sequential and band-interleaved data; interoperability between different representations of spectral response, including different integer word lengths;
• interoperability between data sets based on irregularly spaced point samples, allowing automatic interpolation to a raster, or resampling to another set of sample points;
• interoperability between any data model representations of continuous fields over the Earth’s surface.
21.4 CONCLUSION
Several themes from this discussion are of sufficient generality to warrant revisiting in this concluding section. Data models lie at the heart of GIS, because they determine the ways in which real geographic phenomena can be represented in digital form, and limit the processing and modelling that can be applied to them. GIS has inherited its data models from an array of sources, through processes of legacy, metaphor, and commercial interest, and there is a pressing need for greater recognition of the role of data models, better terminology, and a more comprehensive perspective.
A second strong theme is interoperability. Interest in this area stems largely from the widespread acceptance of the notion that despite its abundant functionality, GIS is hard to use, particularly in exploiting its potential for integrating data from a variety of sources. Even though we now have a range of format standards to help us in exchanging data, and every GIS now supports a range of alternative import and export formats, the fact remains that transfer of data from one system to another is far more time-consuming and complex than it need be. Moreover, every system has its own approach to user interface design, the language of commands, and numerous other aspects of “look and feel” that help to create a steep learning curve for new users.
The discussion identified several areas where current techniques of spatial analysis are inadequate for global change research. One is the availability of techniques for analysis of phenomena on a spherical surface; too many methods of spatial analysis are limited to a plane, and are not readily adapted to the globe. A survey of existing techniques for spatial analysis on the sphere has been published as an NCGIA technical report (Raskin, 1994). In August 1995 NCGIA began a project to develop improved methods for spatial interpolation, including methods for the sphere, that incorporate various kinds of geographic intelligence. These “smart” interpolators will go beyond the traditional generic types such as kriging and thin plate splines by attempting to model processes known to affect geographic distributions of specific phenomena, and to take advantage of known correlations. The current status of the work can be reviewed at http://www.geog.ucsb.edu/~raskin. With funding from ESRI (Environmental Systems Research Institute) and CIESIN (Consortium for International Earth Science Information Network), NCGIA constructed the first consistent global database of population based on a regular grid. The database was completed in 1995, and is being distributed for use in studies which integrate human and physical processes of global change, and thus need demographic data on a basis compatible with most physical data sets. The work was led by Waldo Tobler, with assistance from Uwe Deichmann and Jonathan Gottsegen. It uses a range of techniques of spatial analysis for disaggregating and reaggregating census population counts from arbitrary regions to grid cells. The work is documented in an NCGIA Technical Report (Tobler et al., 1995). Another general issue is the need to understand the influence of national government policy and other dimensions of the policy context on the availability of spatial data. 
This issue has recently come to the fore in arguments about access to climate records, under the auspices of the WMO (World Meteorological
Organisation). Other debates are occurring in the context of the Internet, and its implications for intellectual property rights and the international market for data. Efforts such as the US Department of State-led Earthmap (http://www.gnet.org/earthmap), the Japanese Millionth Map, and the international community’s Core Data are attempting to coordinate base mapping around the world and achieve a higher level of availability for digital framework data in the interests of global change research (Estes et al., 1995). Other efforts, such as the Alexandria Digital Library (ADL) project (http://alexandria.ucsb.edu) are seeking general solutions to the problems of finding geographic data on the Internet. While much of the discussion of this chapter has been motivated by the specific context of global change research, similar concerns arise in considering the potential roles of GIS in any area of science. Global change research is particularly complex, involving many disciplines, and of great public interest, so there are good reasons for suggesting that it might form a useful model for the scientific uses of GIS generally. ACKNOWLEDGEMENTS The National Center for Geographic Information and Analysis is supported by the National Science Foundation through Cooperative Agreement SBR 88–10917. We acknowledge support for the two I15 specialist meetings from the US Geological Survey. The Alexandria Digital Library project is also supported by the National Science Foundation through Cooperative Agreement IRI 94–11330. The assistance of John Estes, Kate Beard, Tim Foresman, Dennis Jelinski, and Jenny Robinson in co-leading I15 is gratefully acknowledged. Ashton Shortridge also helped to organise the two specialist meetings and to prepare the reports. REFERENCES ALLEN, T.F.H. and STARR, T.B. 1982. Hierarchy: Perspectives for Ecological Complexity. Chicago: University of Chicago Press. BOTKIN, D.B. 1989. 
Science and the global environment, in Botkin, D.B., Caswell, M.F., Estes, J.E. and Orio, A.A. (Eds.) Changing the Global Environment: Perspectives on Human Involvement. New York: Academic Press, pp. 3–14.
BURKE, I.C., KITTEL, T.G.F., LAUENROTH, W.K., SNOOK, P., YONKER, C.M. and PARTON, W.J. 1991. Regional analysis of the Central Great Plains, Bioscience, 41(10), pp. 685–692.
CARLETON, A.M., TRAVIS, D., ARNOLD, D., BRINEGAR, R., JELINSKI, D.E. and EASTERLING, D.R. 1994. Climatic-scale vegetation-cloud interactions during drought using satellite data, International Journal of Climatology, 14(6), pp. 593–623.
COLE, S. and BATTY, M. 1992. Global Economic Modeling and Geographic Information Systems: Increasing our Understanding of Global Change. Buffalo, NY: National Center for Geographic Information and Analysis, State University of New York at Buffalo.
COMMITTEE ON EARTH SCIENCES (CES) 1989. Our Changing Planet: A US Strategy for Global Change Research. Washington, DC: Federal Coordinating Council for Science, Engineering, and Technology, Office of Science and Technology Policy.
COMMITTEE ON EARTH SCIENCES (CES) 1990. Our Changing Planet – The FY 1990 Research Plan. Washington, DC: Federal Coordinating Council for Science, Engineering, and Technology, Office of Science and Technology Policy.
COMMITTEE ON EARTH AND ENVIRONMENTAL SCIENCES (CEES) 1991. Our Changing Planet – The FY 1992 US Global Change Research Program. Washington, DC: Federal Coordinating Council for Science, Engineering, and Technology, Office of Science and Technology Policy.
COMMITTEE ON THE HUMAN DIMENSIONS OF GLOBAL CHANGE. 1992. Report of the Committee on the Human Dimensions of Global Change, in Stern, P.C., Young, O.R. and Druckman, D. (Eds.) Global Environmental Change: Understanding the Human Dimensions, Washington, DC: National Academy Press.
EARTH SYSTEM SCIENCES COMMITTEE (ESSC) 1986. Earth System Science Overview: A Program for Global Change. Washington, DC: National Aeronautics and Space Administration.
EARTH SYSTEM SCIENCES COMMITTEE (ESSC) 1988. Earth System Science: A Closer View. Washington, DC: National Aeronautics and Space Administration.
ESTES, J.E., LAWLESS, J. and MOONEYHAN, D.W. (Eds.) 1995. Proceedings of the International Symposium on Core Data Needs for Environmental Assessment and Sustainable Development Strategies, Bangkok, Thailand, Nov. 15–18, 1994. Sponsored by UNDP, UNEP, NASA, USGS, EPA, and URSA.
FEITELSON, E. 1991. Sharing the globe: the role of attachment to place, Global Environmental Change, 1, pp. 396–406.
GOODCHILD, M.F. 1991. Integrating GIS and environmental modeling at global scales, in Proceedings, GIS/LIS 91, 1, Washington, DC: ASPRS/ACSM/AAG/URISA/AMFM, pp. 117–127.
GOODCHILD, M.F. 1992. Geographical information science, International Journal of Geographical Information Systems, 6(1), pp. 31–46.
GOODCHILD, M.F. and YANG, S. 1992. A hierarchical spatial data structure for global geographic information systems, CVGIP-Graphical Models and Image Processing, 54(1), pp. 31–44.
GOODCHILD, M.F., PARKS, B.O. and STEYAERT, L.T. (Eds.) 1993. Environmental Modeling with GIS. New York: Oxford University Press.
GOODCHILD, M.F., ESTES, J.E., BEARD, K.M., FORESMAN, T. and ROBINSON, J. 1995. Research Initiative 15: Multiple Roles for GIS in US Global Change Research: Report of the First Specialist Meeting, Santa Barbara, California, March 9–11, 1995. Technical Report 95–10. Santa Barbara, CA: National Center for Geographic Information and Analysis.
GOODCHILD, M.F., ESTES, J.E., BEARD, K.M. and FORESMAN, T.
1996. Research Initiative 15: Multiple Roles for GIS in US Global Change Research: Report of the Second Specialist Meeting, Santa Fe, NM, January 25–26, 1996. Technical Report 96–5. Santa Barbara, CA: National Center for Geographic Information and Analysis.
HALL, F.G., STREBEL, D.E. and SELLERS, P.J. 1988. Linking knowledge among spatial and temporal scales: vegetation, atmosphere, climate and remote sensing, Landscape Ecology, 2, pp. 3–22.
HAY, L.E., BATTAGLIN, W.A., PARKER, R.S. and LEAVESLEY, G.H. 1993. Modeling the effects of climate change on water resources in the Gunnison River basin, Colorado, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.) Environmental Modeling with GIS. New York: Oxford University Press, pp. 173–181.
HEUVELINK, G.B.M. and BURROUGH, P.A. 1993. Error propagation in cartographic modelling using Boolean logic and continuous classification, International Journal of Geographical Information Systems, 7(3), pp. 231–246.
INTERNATIONAL COUNCIL OF SCIENTIFIC UNIONS (ICSU) 1986. The International Geosphere-Biosphere Program – A Study of Global Change: Report No. 1. Final Report of the Ad Hoc Planning Group, ICSU Twenty-first General Assembly, September 14–19, 1986. Bern, Switzerland: ICSU.
INTERNATIONAL GEOSPHERE-BIOSPHERE PROGRAMME (IGBP) 1990. The International Geosphere-Biosphere Programme – A Study of Global Change: The Initial Core Projects, Report No. 12. Stockholm, Sweden: IGBP Secretariat.
KING, A.W. 1991. Translating models across scales in the landscape, in Turner, M.G. and Gardner, R.H. (Eds.) Quantitative Methods in Landscape Ecology. New York: Springer Verlag.
LANGRAN, G. 1992. Time in Geographic Information Systems. London: Taylor & Francis.
LOVELAND, T.R., MERCHANT, J.W., OHLEN, D. and BROWN, J.F. 1991. Development of a land-cover characteristics database for the conterminous US, Photogrammetric Engineering and Remote Sensing, 57(11), pp. 1453–1463.
MAGUIRE, D.J., GOODCHILD, M.F. and RHIND, D.W. (Eds.) 1991.
Geographical Information Systems: Principles and Applications. London: Longman Scientific and Technical.
MOUNSEY, H.M. (Ed.) 1988. Building Databases for Global Science. London: Taylor & Francis.
NATIONAL AERONAUTICS AND SPACE ADMINISTRATION (NASA) 1991. BOREAS (Boreal Ecosystem-Atmosphere Study): Global Change and Biosphere-Atmosphere Interactions in the Boreal Forest Biome, Science Plan. Washington, DC: NASA.
NATIONAL CENTER FOR GEOGRAPHIC INFORMATION AND ANALYSIS (NCGIA) 1989. The research agenda of the National Center for Geographic Information and Analysis, International Journal of Geographical Information Systems, 3, pp. 117–136.
NATIONAL CENTER FOR GEOGRAPHIC INFORMATION AND ANALYSIS (NCGIA) 1992. A Research Agenda for Geographic Information and Analysis. Technical Report 92–7, Santa Barbara, CA: National Center for Geographic Information and Analysis.
NATIONAL RESEARCH COUNCIL (NRC) 1986. Global Change in the Geosphere-Biosphere, Initial Priorities for an IGBP, Washington, DC: US Committee for an International Geosphere-Biosphere Program, National Academy of Sciences.
NATIONAL RESEARCH COUNCIL (NRC) 1990. Research Strategies for the US Global Change Research Program. Washington, DC: Committee on Global Change, US National Committee for the IGBP, National Academy of Sciences.
NEMANI, R.R., RUNNING, S.W., BAND, L.E. and PETERSON, D.L. 1993. Regional hydro-ecological simulation system – an illustration of the integration of ecosystem models in a GIS, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.) Environmental Modeling with GIS. New York: Oxford University Press, pp. 296–304.
NYERGES, T.L. 1993. Understanding the scope of GIS: its relationship to environmental modeling, in Goodchild, M.F., Parks, B.O. and Steyaert, L.T. (Eds.) Environmental Modeling with GIS. New York: Oxford University Press, pp. 75–93.
PIELKE, R.A. and AVISSAR, R. 1990. Influence of landscape structure on local and regional climate, Landscape Ecology, 4, pp. 133–155.
PRICE, M.F. 1989. Global change: defining the ill-defined, Environment, 31, pp. 18–20.
RASKIN, R. 1994. Spatial Analysis on the Sphere. Technical Report 94–7.
Santa Barbara, CA: National Center for Geographic Information and Analysis.
RUNNING, S.W. and COUGHLAN, J.C. 1988. A general model of forest ecosystem processes for regional applications. I: hydrologic balance, canopy gas exchange, and primary production processes, Ecological Modelling, 42, pp. 125–154.
RUNNING, S.W., NEMANI, R.R., PETERSON, D.L., BAND, L.E., POTTS, D.F., PIERCE, L.L. and SPANNER, M.A. 1989. Mapping regional forest evapotranspiration and photosynthesis by coupling satellite data with ecosystem simulation, Ecology, 70, pp. 1090–11.
TOBLER, W.R., DEICHMANN, U., GOTTSEGEN, J. and MALOY, K. 1995. The Global Demography Project. Technical Report 95–6. Santa Barbara, CA: National Center for Geographic Information and Analysis.
TOWNSHEND, J.R.G. 1991. Environmental databases and GIS, in Maguire, D.J., Goodchild, M.F. and Rhind, D.W. (Eds.) Geographical Information Systems: Principles and Applications, 2, London: Longman, pp. 201–216.
TURNER, A.K. 1992. Three-Dimensional Modeling with Geoscientific Information Systems, Dordrecht: Kluwer.
TURNER, B.L. II, KASPERSON, R.E., MEYER, W.B., DOW, K.M., GOLDING, D., KASPERSON, J.X., MITCHELL, R.C. and RATICK, S.J. 1990. Two types of global environmental change: definitional and spatial-scale issues in their human dimensions, Global Environmental Change, 1, pp. 15–22.
WHITE, D., KIMERLING, A.J. and OVERTON, W.S. 1992. Cartographic and geometric components of a global sampling design for environmental monitoring, Cartography and Geographic Information Systems, 19(1), pp. 5–22.
Chapter Twenty Two Remote Sensing and Urban Analysis Hans-Peter Bähr
22.1 REMOTE SENSING AND ACQUAINTED TECHNOLOGY
Remote sensing is defined as a “technology for acquiring information from distant objects without getting in direct contact, taking the electromagnetic spectrum as the transport medium”. In what follows we restrict this very broad and generally accepted definition to imaging techniques. Remote sensing includes techniques for imaging from both satellites and airborne platforms. Photogrammetry, which as long as 80 years ago had already developed mapping techniques for urban areas using aerial photography (see Schneider, 1974), is therefore a well-established subset of remote sensing and not a separate field of activity (see Bähr and Vögtle, 1998).
Remote sensing is proving to be particularly useful for urban analysis for the following reasons:
1. Imagery, as a function of scale, shows very detailed information in non-generalised mode, ranging from the 3D geometry of buildings to the contours of the urban environment.
2. Imagery may be taken “upon request” according to the needs of the user. This is particularly true for aerial imagery: time, scale, stereoscopic parameters and spectral range may be preselected.
3. Image-based data processing offers advanced automatic procedures.
4. Both the variety of data and the advanced digital image processing techniques available are leading to considerable progress in urban analysis by remote sensing techniques.
22.2 IDENTIFYING RESEARCH TOPICS FOR REMOTE SENSING IN URBAN ANALYSIS
During the GISDATA meeting in Strasbourg in June 1995, research topics were identified in three major areas. The results are set out in detail in the GISDATA Newsletter No. 6 (Craglia, 1996) and will not be repeated here. Nevertheless, the issues which attracted particular interest and were intensively discussed are taken as a guideline for the following discussion.
22.2.1 Remote Sensing Detection of Patterns in Relation to Processes
A feature which is discussed again and again for many applications, particularly in urban analysis, is scale in relation to sensor resolution and object size. No general rule has been confirmed up to now, and only heuristics are used in practice. For classical methodology such as airborne photogrammetry, empirical rules exist, for instance for drawing topographic maps by stereo restitution procedures. The rule still applied today (Heissler, 1954) relates two quantities: m_b, the image scale factor, and m_k, the map scale factor. It is noteworthy that this formula is empirical. Therefore we have to accept that any adequate relation between object size on the one hand and image scale/sensor resolution on the other will also be empirical.
A second topic of concern is the segmentation of imagery in order to obtain GIS objects in urban areas. The methods required should be automatic as far as possible. This applies not only to planar objects and land-use patterns in the urban environment but also to 3D modelling. It is suggested that the task should be done using multiple sensors of different origin. The multisensor concept is expected to give more reliable results than those obtained with only one sensor at a time (see Section 22.3.1).
Finally, it should be noted that linguistic concepts should be taken into account more seriously for pattern detection in urban areas. However, this aspect is more future-oriented than the two mentioned above (Bähr and Schwender, 1996).
22.2.2 Remote Sensing in Urban Dynamics
Interestingly, a lively discussion is underway about which features can really be detected and analysed by remote sensing for urban dynamics. Because of their specific nature, remote sensing techniques are able to reveal physical patterns directly. The main concern of urban dynamics, of course, is the growth of cities.
But this question is heavily dependent upon the definition of the limits of an urban cluster. This point was highlighted at the Strasbourg GISDATA meeting and it is clear that many definitions go far beyond physical parameters and shape (see Section 22.3.1).
Another challenging point is the determination of the optimum time and frequency for acquiring remote sensing data for change analysis. On-line observation of urban change will be an exception and may be applied to big construction areas like the “Potsdamer Platz” in Berlin during the period 1995–2000. In most cases samples for specific times have to be taken and the situation in between has to be interpolated.
The dream of urban analysts is to forecast physical changes. Here remote sensing may provide a reliable basis, at least through nearly continuous spatial information. Nevertheless, forecasting generally carries a very high risk and may be pure speculation.
22.2.3 Data integration (remote sensing, cartography, socio-economics)
The opportunities created by remote sensing systems for urban analysis are growing as an increasing amount of data from high resolution satellites comes on stream. The MOMS-2 system on a space shuttle
already provided data with a resolution of 4.5 m on the ground, while a series of new commercial satellites is being developed which may improve the resolution on the ground to about 1 m (Fritz, 1996; Jürgens, 1996). This may be especially important for urban applications.
For cartography it is worthwhile to note that in nearly all countries digital map databases are under development. The GIS user community should prepare for having digital data in vector form available in the very near future, together with digital terrain models (DTMs, 2.5D) and an increasing number of full 3D models of the earth’s surface.
“Integration” requires common features of data. This means, for instance, that geometry, time, quality and topological relationships should be equivalent or at least analytically comparable. In addition to “metric conditions”, non-metric conditions like standards and terminology also have to be taken into account.
Integration of data requires models of data geometry, topology and semantics. The quality of the data model has to be in balance with the model of the process which is to be analysed. This means, for instance, that a bad model for a dynamic process in urban analysis cannot be compensated for by excellent data. On the other hand, it makes no sense to design an excellent model for analysis when only weak data are available.
22.3 REMOTE SENSING FOR URBAN FEATURE EXTRACTION: SOME EXAMPLES
This section takes “small scale” and “large scale” as different features for urban analysis. Urban analysis does not necessarily require large scale. The scale factor used for urban analysis has to be considered in relation to the respective application. When satellite imagery offered a very coarse resolution of about 70 meters (e.g. Landsat MSS), users strongly demanded higher resolution for applications in the urban environment. Nevertheless, higher resolution has turned out to bring new types of problems. Always remember: “No solution just by resolution”.
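The empirical link between image scale and map scale mentioned in Section 22.2.1 can be made concrete with a small sketch. It assumes that the rule takes the square-root form m_b = c · sqrt(m_k) that is widely quoted in the photogrammetric literature, with an empirical constant c of roughly 200 to 300. Both the form of the rule and the default constant are assumptions made here for illustration, and the function name is hypothetical.

```python
import math


def image_scale_factor(map_scale_factor: float, c: float = 250.0) -> float:
    """Return an image scale factor m_b for a given map scale factor m_k.

    Assumes the square-root rule m_b = c * sqrt(m_k). The constant c is
    empirical (values of roughly 200 to 300 are commonly quoted) and is an
    assumption of this sketch, not a value taken from the chapter.
    """
    if map_scale_factor <= 0:
        raise ValueError("map scale factor must be positive")
    return c * math.sqrt(map_scale_factor)


if __name__ == "__main__":
    # A 1:10 000 map (m_k = 10 000) under the assumed rule with c = 250.
    m_b = image_scale_factor(10_000)
    print(f"suggested image scale 1:{m_b:.0f}")
```

Because the rule is empirical, the constant would have to be tuned to the sensor and the application rather than treated as fixed.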
22.3.1 Small Scale Examples
Satellite imagery is of increasing importance for remote sensing applications in urban environments. However, even the new sensor generation, with a pixel size on the ground of about 5 m (or smaller in the case of the commercial series), cannot necessarily be considered “large scale”. Consequently, satellite data are characteristic of small scale urban analysis. In this field progress will be achieved by adding knowledge. A growing source of knowledge is the available data itself. This is particularly true for radar imagery because acquisition is independent of atmospheric conditions.
A test has been done for segmentation of urban areas near Karlsruhe, Germany, merging five datasets of the optical and three of the microwave domain. It has been shown that “just combining” optical and microwave data will not necessarily improve the results. For some classes pure microwave or pure optical data yield the best results, and for another group of classes a combination does. The best result for each class is then selected and displayed (see Bähr, 1995b; Foeller, 1994; Hagg and Bähr, 1997).
It is not easy to check the quality of the result obtained. One possibility is to take an existing digital cartographic database, if available. Figure 22.1 shows, for the class “urban”, the differences between a land-use classification based on data fusion of microwave and optical data and a geo-database. Differences, i.e. non-matching areas, are displayed in black. A general error is the so-called “mixed-pixel effect”, which occurs at the margins of high-contrast objects, like streets or built-up areas bordered by vegetation (e.g. linear features in Figure 22.1). This effect may be overlaid by residuals from non-matching geometry between the different data sets. Larger errors are due, for instance, to time differences between the two databases. In the case of Figure 22.1, the cartographic database had not been updated and consequently newly built-up areas are shown as “differences” (black patches).
Figure 22.1: “Urban area”: Differences (in black) between land use classification from merged satellite data (Foeller) and an existing cartographic database. (Area shows approx. 9 km×5 km)
Moreover, the structure of errors based on image classification procedures is clearly displayed. Because of the typical mixed texture in urban areas (for instance due to gardens, plants, different types of roof cover etc.) it is very difficult if not impossible to show urban areas in “uniform blocks”. This problem may be overcome by a special procedure based on Delaunay triangulation, as suggested by Schilling and Vögtle (1996). Figure 22.2 shows the triangulation network of pixels classified as “urban”. In this case, the delineation of “urban areas” seems to be possible by interpreting the size and shape of the triangles. This then enables the automatic drawing of the real contours of the settlement areas. The result is given in Figure 22.3, where the contours of the settlement areas resulting from this image analysis are displayed as bold lines. The problem of defining “urban” or the “limits of urban areas” is widely discussed in geography and urban planning. The procedure shown here, based on Delaunay triangulation and “intelligent” grouping, should be taken as a proposal for a machine-based contribution in this field.
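The comparison behind Figure 22.1, in which a classified “urban” layer is checked against a cartographic database, can be sketched as a cell-by-cell difference of two binary rasters. This is a minimal illustration only: it assumes both layers are already co-registered on the same grid, the function names are hypothetical, and the toy values are invented.

```python
from typing import List

Grid = List[List[int]]


def difference_mask(classified: Grid, reference: Grid) -> Grid:
    """Mark cells where two binary 'urban' rasters disagree (1 = mismatch)."""
    if len(classified) != len(reference) or any(
        len(a) != len(b) for a, b in zip(classified, reference)
    ):
        raise ValueError("rasters must have the same shape")
    return [
        [int(a != b) for a, b in zip(row_c, row_r)]
        for row_c, row_r in zip(classified, reference)
    ]


def mismatch_share(mask: Grid) -> float:
    """Fraction of cells flagged as non-matching."""
    cells = [v for row in mask for v in row]
    return sum(cells) / len(cells)


if __name__ == "__main__":
    # Toy 3 x 4 rasters: 1 = "urban", 0 = anything else (invented values).
    classified = [[1, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 1, 1]]
    reference = [[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 1]]
    mask = difference_mask(classified, reference)
    for row in mask:
        print(row)
    print(mismatch_share(mask))
```

A mask of this kind does not say which layer is wrong. As the text notes, mismatches may stem from mixed pixels, from geometric misregistration, or from an outdated reference database.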
Figure 22.2: Delaunay triangulation for “urban pixels”. (Area shows approx. 4 km×4 km)
Figure 22.3: Contours of urban areas derived from analysis of triangles
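The idea illustrated in Figures 22.2 and 22.3, triangulating the “urban” pixels and then interpreting the triangles to find settlement outlines, can be sketched as follows. The chapter does not give the actual criteria used by Schilling and Vögtle, so this sketch substitutes a single assumed rule: a triangle is kept if its longest side stays below a threshold. The brute-force triangulation is a stand-in suitable only for toy inputs; it assumes that no four points are co-circular, which regular pixel grids violate, so a robust library routine would be needed in practice. All function names are hypothetical.

```python
from itertools import combinations
from math import hypot


def _orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of triangle abc."""
    if _orientation(a, b, c) < 0:  # make the triangle counter-clockwise
        b, c = c, b
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0


def delaunay(points):
    """Brute-force Delaunay triangulation; fine for small toy inputs only."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if _orientation(a, b, c) == 0:
            continue  # skip collinear triples
        if not any(_in_circumcircle(a, b, c, points[m])
                   for m in range(len(points)) if m not in (i, j, k)):
            triangles.append((i, j, k))
    return triangles


def settlement_triangles(points, triangles, max_edge):
    """Keep triangles whose longest side does not exceed max_edge (assumed rule)."""
    def longest(tri):
        return max(hypot(points[u][0] - points[v][0], points[u][1] - points[v][1])
                   for u, v in combinations(tri, 2))
    return [t for t in triangles if longest(t) <= max_edge]


def contour_edges(triangles):
    """Edges used by exactly one kept triangle form the settlement outline."""
    count = {}
    for tri in triangles:
        for edge in combinations(sorted(tri), 2):
            count[edge] = count.get(edge, 0) + 1
    return sorted(e for e, n in count.items() if n == 1)


if __name__ == "__main__":
    # Two invented clusters of "urban" pixel centres, about 10 units apart.
    pts = [(0, 0), (1, 0), (0.4, 0.9), (10, 0), (11, 0.2), (10.5, 1)]
    kept = settlement_triangles(pts, delaunay(pts), max_edge=2.0)
    print(kept)
    print(contour_edges(kept))
```

Long, thin triangles that bridge separate clusters are discarded by the threshold, so the remaining outline edges trace each cluster separately. A shape criterion, such as a minimum angle, could be added in the same filtering step.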
22.3.2 Large Scale Example
Generally speaking, results may be improved by adding knowledge. In this respect, existing topographic line maps may play an important role, though they have not been used very often up to now. This is because large digital cartographic databases are still under development, and because imagery and line maps give fundamentally different representations of the “real world”, for reasons not discussed here (see Bähr et al., 1994). It has been shown that it is possible to take information from both image and map datasets for mutual analysis (Quint, 1997). It is, however, not possible to do this directly, i.e. on the “iconic level”. The iconic representation first has to be transformed to the “symbolic level”, using for instance semantic networks. Figure 22.4 shows the data flow as developed by Quint. Once the data have been transformed into a hierarchical semantic net, objects found in both datasets can be verified. For objects which could not be verified, the classification procedure has to analyse the reasons for the mismatch. The analysis of large scale imagery has been found to be much more reliable when line maps are used as an additional knowledge source. The final result is shown in Figure 22.5.
It is evident that the combination of line maps and aerial imagery gives a new quality to urban remote sensing. Because of the very many objects in the urban environment and the variance of textural and radiometric patterns, remote sensing procedures are becoming increasingly sophisticated in urban analysis at both small and large scales, particularly if supported by an intelligent use of ancillary cartographic databases.
Figure 22.4: Data flow for combined analysis in image and cartographic databases (Quint, 1997)
22.4 CONCLUSION
22.4.1 Available tools
The increasing development of high resolution satellites, i.e. satellites with a resolution of between 1 and 5 meters on the ground, promises to contribute widely to the needs of urban analysis. Growing diversity
Figure 22.5: Map-supported image analysis for urban environment using semantic networks (Quint, 1997)
will also become the norm as many different countries will have their own national satellite system in orbit (Bähr, 1997a). Although urban planners and analysts have always requested high geometric resolution, one should not expect too much from data of this kind. Considerable experience with geometrically high resolution data is already available from aerial photography. In the digital domain there is no difference in data extraction between high resolution satellite imagery and aerial imagery. Consequently, from this viewpoint high resolution satellite imagery does not in principle offer completely new options for urban analysis.
A more important development is that digital maps are becoming more and more common in the cartographic domain. Although this is not purely a remote sensing issue, one may count on digital maps being available as additional information for feature extraction from imagery in urban areas (Section 22.3.2). In both Europe and North America digital databases of the urban environment are already on the market or being produced.
Hardware and software developments are also increasingly supporting image analysis in urban areas. Whilst the continuing fall in hardware costs is of major benefit, considerable progress is still needed in automating image processing. For example, much work is still needed to derive 3D models of the urban environment from stereoscopic imagery by automatic procedures.
22.4.2 Application challenges
Of the many topics which currently challenge urban analysis, the following four are of particular importance:
1. Developing countries: Remote sensing imagery from satellite platforms may provide relatively cheap (indirect) information about the demographic explosion. In this case, high resolution imagery from satellites may be a partial substitute for aerial photography. For modelling the development of urban areas in the Third World, new concepts are urgently needed.
2. Urban pollution: Here again the models are the weak point. The dynamics of polluted air in an urban environment requires 3D models of cities. There is no other way to obtain them than by remote sensing, using for instance correlation techniques or laser scanning procedures.
3. Disaster management: Disasters always require very rapid information about location and extent. In many cases, like floods and earthquakes, the terrain will not be accessible by car or other ground transport. Here only procedures from airborne platforms may be used. They should in principle allow on-line processing, i.e. the location and extent of the damage should be detected automatically on the fly and then transmitted directly to the ground.
4. Navigation/transportation: This topic is very close to commercialisation as the financial return seems to be very fast. The whole field of navigation and transportation is a very big market. Strong international firms will occupy that market very quickly once the technology has matured. A technological challenge for remote sensing is continuous traffic flow monitoring even for larger urban areas. This could include information for drivers and suggestions for diversions, and could generally assist in cutting down the cost of transportation.
22.4.3 Crucial factors
Remote sensing in urban analysis requires digital tools and digital thinking in a GIS context.
Completely new tools are available, and this should lead to completely new solutions (“do not try to reproduce the old solutions with new tools!”). New features are, for example, automated processes and the inclusion of options for decision making. GIS is knowledge based; this means that databases and decision processes for spatial information are potentially no longer separated, as is the case with analogue data. The decision making process, which was formerly carried out by the operator from “knowledge”, should as far as possible be delegated to the GIS.
At the present time, issues relating to data quality are frequently discussed. However, model quality is very often more crucial. “Model” in this context means the analytical description of the environmental phenomena. It makes no sense to use good data in weak urban models. In many cases an unacceptable result is not the fault of missing or bad data. In an optimised process, data quality and model quality have to be well balanced.
Compared with conventional analogue methods, digital geo-data acquisition, storage and analysis should open up a completely new field of methodology. The step from analogue to digital is a “paradigm shift” (Ackermann, 1995). This step means revolution rather than evolution of technology. One has to admit that neither producers nor users are yet fully prepared for such a change.
REFERENCES
ACKERMANN, F. 1995. Digitale Photogrammetrie – Ein Paradigma-Sprung, Zeitschrift für Photogrammetrie und Fernerkundung, 63(3), pp. 106–115.
BÄHR, H.-P. 1997a. Satellite image data: progress in acquisition and processing, in Altan, O. and Gründig, L. (Eds.) Proceedings of the Second Turkish-German Joint Geodetic Days, Berlin, pp. 83–94.
BÄHR, H.-P. 1995b. Image Segmentation Methodology in Urban Environment – Selected Topics, paper presented at the GISDATA Specialist Meeting on Remote Sensing for Urban Applications, Strasbourg.
BÄHR, H.-P. and SCHWENDER, A. 1996. Linguistic Confusion in Semantic Modelling. Wien: Internationale Gesellschaft für Photogrammetrie und Fernerkundung Commission V.
BÄHR, H.-P. and VÖGTLE, T. (Eds.) 1998. GIS for Environmental Monitoring – Selected Material for Teaching and Learning. Stuttgart: Schweizerbart Verlag.
BÄHR, H.-P., QUINT, F. and STILLA, U. 1995. Modellbasierte Verfahren der Luftbildanalyse zur Kartenfortführung, ZPF-Zeitschrift für Photogrammetrie und Fernerkundung, 6/1995, p. 224 ff.
CRAGLIA, M. (Ed.) 1996. GISDATA Newsletter, No. 6. Sheffield: University of Sheffield, Department of Town & Regional Planning.
FOELLER, J. 1994. Kombination der Abbildungen verschiedener operationeller Satellitensensoren zur Optimierung der Landnutzungsklassifizierung. Diploma Thesis, unpublished.
FRITZ, L.W. 1996. The era of commercial earth observation satellites, Photogrammetric Engineering and Remote Sensing, January, 1/1996, pp. 36–45.
HAGG, W. and BÄHR, H.-P. 1997. Land-use Classification Comparing Optical, Microwave Data and Data Fusion. Rio de Janeiro: IAG Scientific Assembly.
HEISSLER, V. 1954. Untersuchungen über den wirtschaftlich zweckmäßigsten Bildmaßstab bei Bildflügen mit Hochleistungsobjektiven, Bildmessung und Luftbildwesen, pp. 37–45, 67–79, 126–137.
JÜRGENS, C. 1996. Neue Erdbeobachtungs-Satelliten liefern hochauflösende Bilddaten für GIS-Anwendungen, Geoinformation Systems, 6, pp. 8–11.
QUINT, F. 1997.
Kartengestützte Interpretation monokularer Luftbilder. Dissertation Universität Karlsruhe, in Deutsche Geodätische Kommission, 477, Serie C.
SCHILLING, K.-J. and VÖGTLE, T. 1996. Satellite image analysis using integrated knowledge processing, in Kraus, K. and Waldhäusl, P. (Eds.) Proceedings of the XVIII Congress of the ISPRS, Vienna, Austria, July 1996. International Archives of Photogrammetry and Remote Sensing, Vol. XXXI, Part B3, Commission III, pp. 752–757.
SCHNEIDER, S. 1974. Luftbild und Luftbildinterpretation. Berlin: Walter de Gruyter.
Chapter Twenty Three From Measurement to Analysis: a GIS/RS Approach to Monitoring Changes in Urban Density Victor Mesev
23.1 INTRODUCTION
The monitoring of urban land use change is undoubtedly one of the most challenging goals for GIS, remote sensing, and spatial analysis. Yet the rewards from the design of more precise and reliable urban monitoring methodologies are enormous from the point of view of local government planning, zoning, and management of resources and services (Kent et al., 1993). Indeed, the Official Journal of the European Communities claims that nearly 90 percent of the European population will soon live in areas designated as built-up urban. Information on the spatial structure of urban areas is therefore vital to the monitoring of contemporary processes of urban growth/decline in terms of population shifts, employment restructuring, changing levels of energy use, and increased pollution and congestion problems.
The main problem with monitoring urban areas has always been with the initial creation of the digital maps on which urban monitoring scenarios are based. Problems associated with the acquisition and generation of accurate, reliable, and consistent urban spatial data have resulted in maps that have not completely kept pace with the needs of dynamic urban monitoring programs and models. Moreover, the choice in the type of spatial data has not always been fully justified or reflected the choice in the type of analysis. What is needed is effective mapping of the structure of urban areas to act as the baseline component in the assessment of the general sustainability of settlements. Effective mapping should not only be in terms of higher accuracy and consistency, but should also be directly responsive to appropriate spatial analysis techniques. All too frequently the measurement of urban areas is completely independent of analysis, and most work concentrates on one or the other, but rarely on both.
What this chapter proposes is a cohesive two-part strategy that links the measurement of urban structure with the analysis of urban growth and density change.
The first part of the strategy concentrates on generating urban measurements that define the structural properties of the characteristic mix of built-up and natural surfaces. These measurements are generated using a methodology that links conventional image classification with contextual-based GIS urban land use information. Essentially, supervised maximum-likelihood classifications are constrained by a series of GIS-invoked decision rules using disaggregated population census data (see for example Figure 23.1d) (Mesev et al., 1995). The constraints, which include stratification, training adjustments, and post-classification sorting, represent an innovative suite of techniques that link remote sensing with GIS, and as such contribute to the advancement of research on GIS/RS integration (Star et al., 1991; Zhou, 1989). Results from this GIS/RS link have already shown marked improvements in classification accuracy, particularly at the level of residential density. The methodology is fully documented elsewhere (Mesev et al., 1998), so coverage in this chapter will be kept to a minimum.
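As a rough illustration of how GIS-derived information might constrain a maximum-likelihood classifier, the sketch below lets class prior probabilities vary by stratum, for example by census tract. This is only one possible mechanism and is an assumption made here: the chapter names stratification, training adjustments and post-classification sorting but does not spell out the rules. The single-band Gaussian model, the class names and all numbers are likewise invented for illustration.

```python
import math
from typing import Dict, Tuple

# class name -> (mean, variance) of one spectral band, estimated from training data
Signatures = Dict[str, Tuple[float, float]]


def ml_classify(value: float, signatures: Signatures, priors: Dict[str, float]) -> str:
    """Assign a pixel value to the class with the highest Gaussian discriminant.

    The discriminant is ln(prior) - 0.5 * ln(variance) - (x - mean)^2 / (2 * variance).
    """
    def discriminant(name: str) -> float:
        mean, var = signatures[name]
        return (math.log(priors[name])
                - 0.5 * math.log(var)
                - (value - mean) ** 2 / (2.0 * var))
    return max(signatures, key=discriminant)


if __name__ == "__main__":
    # Hypothetical signatures for two classes in a single band.
    signatures = {"residential": (50.0, 100.0), "non_residential": (100.0, 100.0)}
    pixel = 74.0

    # Without ancillary information both classes are equally likely a priori.
    print(ml_classify(pixel, signatures, {"residential": 0.5, "non_residential": 0.5}))

    # In a stratum where census data suggest few dwellings, the priors shift.
    print(ml_classify(pixel, signatures, {"residential": 0.1, "non_residential": 0.9}))
```

With equal priors the pixel falls to the spectrally nearer class; with priors weighted by the assumed census information the decision changes, which is the kind of constraint the text describes in general terms.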
Figure 23.1: A Selection of “Urban” Spatial Data for the Settlement of Bristol: (a) SPOT HRV-XS satellite image, taken on 17th May 1988; (b) census tracts: enumeration districts; (c) postal sectors; (d) surface model based on enumeration district centroids
Instead, the final product of this GIS/RS link, urban classifications, will be used to reveal important spatial and temporal patterns in urban land use juxtaposition and changes in density profiles, both vital for effective urban monitoring programs. In order to do this, a spatial analysis approach is needed that can summarise urban density measurements and display the degree to which these measurements reflect underlying urban processes. At the same time, the spatial analysis technique must be chosen carefully so that it can also fully exploit the specific advantages of urban mapping by remote sensing (de Cola, 1989); in other words, analysis must be made accountable to measurement. The choice made in this chapter was to adopt a technique initially developed by Batty and Kim (1992) and later modified to accept remotely-sensed data by Mesev et al. (1995). The technique is based on traditional urban density functions, specifically inverse
power functions, which have proved to be good statistical and graphical indicators of the degree to which urban density attenuates with distance from the city core (Mills, 1992). A modification is made to replace the standard power function with a parameter derived from fractal theory (see Batty and Kim, 1992). The assumption is that urban areas exhibit spatial patterns of form that are commensurate with fractal notions of self-similarity and scale-independence. In other words, the shape of settlements and the way they grow can be conveniently represented and consistently summarised across space and time by fractal dimensions (Batty and Longley, 1994). Again, only brief coverage will be given to this now established technique. Instead, support will be given to the contention that urban measurements from GIS/RS are the most appropriate type of source data on which to base fractal-derived density functions. They are appropriate in the sense that the approach allows for flexibility in land use classification, variations in image spatial resolution, and frequency of data capture. These three advantages over more standard source data, when linked with fractal-led density modelling, allow for objective and detailed measurements of spatial and temporal changes not only in the size, form and density of urban land uses for individual settlements but also for comparisons through time and across the urban hierarchy at the national and eventually the international scale.

In summary, this chapter will examine two mutually dependent areas of research:

1. the classification of satellite imagery by extensive reference to urban-based data held within a GIS, and
2. the spatial analysis of urban density profiles using concepts from fractal geometry.

Most emphasis will be placed on how measurement from (1) is most effectively modelled by analysis in (2).
23.2 URBAN MEASUREMENT

23.2.1 Traditional Urban Measurements

The field of urban density function analysis boasts a vast literature containing research from subjects as diverse as econometrics, urban geography, regional planning, and civil engineering. However, many papers written on this topic end with concerned comments over the practical relevance of their results (Mills and Tan, 1980). The concern centres on the type and quality of the urban measurements against which their analyses are empirically verified. Much of the earlier work on the empirical fitting of density functions was based on the assumption that population densities could be imputed from conventional zonal data such as census tracts (Figure 23.1b). By this approach, ordinal, interval and ratio census data are directly related to areal or volumetric data which are represented by simple choroplethic surfaces (Clark, 1951). Unfortunately, density was calculated as a gross density, which means that the entire area of the census tract was assumed to contain a uniform population distribution. Furthermore, density was inextricably linked to the size of the tract, so that any change in its areal size directly affected the value of its density. These problems were alleviated, to a certain degree, by using more disaggregated surfaces (Figure 23.1d) (see Latham and Yeates, 1970), where unpopulated areas of land could be filtered out in order to estimate a population net density. However, both census tracts and their disaggregated derivatives were always constrained by the number of zonal units from core to periphery, typically under 15 for large cities, and as few as four for medium-sized ones (Zielinski, 1980). This means that density gradients were
commonly over-generalised and unable to reveal the full amount of variability in population and density changes.

23.2.2 GIS and RS Measurements

With the more recent uptake of digital representations, including GIS and RS, urban measurements have now become much more extensive and detailed, as well as more accurate. Density values can now be calculated down to much finer tessellations, including individual image pixels. This has undoubtedly opened up many more possibilities for monitoring urban areas using various integrated digital applications (Barnsley et al., 1989; Langford et al., 1991; Jensen et al., 1994). Spatial data digitised from topographic maps have become the main back-cloth to many contemporary GIS-based urban monitoring operations and urban density gradients. Examples include urban boundary data from the Department of Environment in the UK and TIGER files holding digitised residential streets in the US (Batty and Xie, 1994). More recently, the proliferation of aspatial data pertaining to census variables (Figure 23.1b), mailing information (Figure 23.1c), and demographic and socio-economic characteristics has added a qualitative dimension to the GIS urban measurement process (useful comparisons in Mesev et al., 1996). In the United Kingdom, and no doubt many other countries, this wealth of human-based information is starting to be linked to standard spatial data to produce a kaleidoscope of geo-urban applications, from point-based address matching and line-based transport networks through to area-based geodemographic profiles linking postal units with census tracts (Longley and Clarke, 1995). As a consequence, urban areas can now be represented in a variety of measurements, tailored to match specific monitoring applications. However, many of these spatial and aspatial datasets are available only infrequently; census information, for example, appears every ten years.
Furthermore, most, if not all, of the spatial data are secondary, and as such prone to errors from digitising or scanning, and aspatial information commonly contains many systematic and interpretative discrepancies. Also, there is a lack of overall co-ordination and standardisation in positional and attribute information, leading to low commonality and inconsistencies between datasets (Star et al., 1991). Some of the problems of temporality and consistency can be addressed by the use of remotely-sensed imagery (Figure 23.1a). In brief, digital imagery has facilitated wider, more repetitive areal coverage of urban areas that allows rapid and cost-effective computerised updating (Michalak, 1993). Traditional land observation imagery has been mostly taken from the Landsat TM and SPOT multispectral and panchromatic sensors, with nominal spatial resolutions of 30 m, 20 m, and 10 m, respectively. These scanners have had qualified success for local and regional monitoring of urban growth and decline, road network detection, and generalised land use changes (Forster, 1985; Lo, 1986). More precise and detailed urban monitoring may become possible from the next generation of optical technology. Planned programs such as Earth Watch, Eyeglass, and Space Imaging claim that data will be available at spatial resolutions down to 1 m for panchromatic and 4 m for multispectral bands (Barnsley and Hobson, 1996). However, even with these increases, remotely-sensed images are still only snapshots of the Earth’s surface, able at best to differentiate between urban land cover types during cloudless days. As a result, only limited information can be interpreted on the way urban land cover types are used, and even less if buildings are spectrally and spatially similar (Forster, 1985). What is needed is a means of inferring land use from mixtures of land cover types.
Land use information from GIS is a promising way forward in labelling, and discriminating between, spectrally similar land cover types. However, this can only be successfully achieved if GIS and RS data are processed simultaneously, preferably within an integrated methodology (Zhou, 1989).
23.2.3 GIS/RS Integrated Measurements

This section argues that, by using an integrated GIS/RS model, new qualitative land use information can be actively incorporated into the standard classification process of remotely-sensed images. The technique developed in Mesev et al. (1998) is an ideal method to demonstrate the links that can be easily established between remotely-sensed data and census information held as a GIS surface model, and how these links can produce improvements in the accuracy of urban measurements. The technique is based on the premise that census attributes and census surfaces held by a raster-based GIS (Figure 23.1d) (Martin and Bracken, 1991) can be used to inform as well as constrain the standard maximum likelihood (ML) image classifier (ERDAS, 1995). Essentially, GIS surfaces of census data are used as contextual information to perform a series of hierarchical stratifications by determining more reliable class training signatures and class estimates. Area estimates are then further normalised and directly inserted into the ML classifier as prior probabilities, Pr(x|w, z), that is, the probability that pixel vector x belongs to class w, weighted by census variable z (Strahler, 1980). Elsewhere, favourable results have also been generated from area estimates which have been used as part of an iterative process for updating ML a posteriori probabilities (Mesev et al., 1998). A number of empirical applications have already been completed on four settlements (Bristol, Norwich, Swindon, Peterborough) (Figure 23.1 inset) in the United Kingdom (Mesev et al., 1996). In each case data from the land observation satellites, Landsat and SPOT, represented the base and pivotal components for the technique.
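As a rough illustration of how prior probabilities enter a maximum likelihood classifier, the following Python sketch assigns each pixel to the class with the highest Gaussian log-discriminant, adding the log of a prior that may differ from pixel to pixel. It is not the implementation used in the studies cited above: the function name `ml_classify`, the one-band toy signatures, and the prior values are all hypothetical, and in practice the priors would come from normalised census-based area estimates.

```python
import numpy as np

def ml_classify(pixels, means, covs, priors):
    """Label pixels by Gaussian maximum likelihood weighted by class priors.

    pixels: (n, bands); means: (k, bands); covs: (k, bands, bands).
    priors: (k,) for scene-wide priors, or (n, k) for per-pixel priors,
    e.g. priors that vary with a census-derived stratum (an assumption here).
    """
    pixels = np.asarray(pixels, dtype=float)
    scores = np.empty((pixels.shape[0], len(means)))
    priors = np.broadcast_to(np.asarray(priors, dtype=float), scores.shape)
    for k, (mu, cov) in enumerate(zip(means, covs)):
        diff = pixels - mu
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        # Log prior plus Gaussian log-likelihood (constant terms dropped).
        scores[:, k] = np.log(priors[:, k]) - 0.5 * (logdet + maha)
    return scores.argmax(axis=1)

# Hypothetical one-band signatures for two classes with unit variance.
means = np.array([[0.0], [2.0]])
covs = np.array([[[1.0]], [[1.0]]])

# The same spectral value in two strata with different (hypothetical) priors.
pixels = np.array([[1.2], [1.2]])
priors = np.array([[0.9, 0.1],   # stratum where class 0 dominates
                   [0.5, 0.5]])  # stratum with equal priors
labels = ml_classify(pixels, means, covs, priors)
print(labels)
```

With equal priors the value 1.2 is assigned to the spectrally nearer class 1, whereas a strong prior for class 0 reverses the decision. This is the sense in which contextual information can separate spectrally similar land cover types.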
After examining other remotely-sensed sources, only these types of satellite imagery were, at the time, able to provide accessible multispectral data consistently and at regular intervals, with wide surface coverage and a spatial resolution that could facilitate large-scale classification of urban land cover categories. Eight types of urban land cover and land use were classified. These were “urban”, “built-up”, “residential”, “non-residential”, and four types of essentially residential density, “detached dwellings”, “semidetached dwellings”, “terraced dwellings”, and “purpose-built apartment blocks”. The first four of these were classified from Landsat TM images taken either in 1982 or 1984, and represent the “1981” dataset. Later Landsat TM images (1988 or 1989), along with a single SPOT XS (1988) for Bristol (Figure 23.1a), were classified into all eight categories, and represent the “1991” dataset (samples are found in Figure 23.2). Dwelling types are included only for 1991 because of the superior quality of the SPOT and the 1988 and 1989 Landsat images, as well as the introduction of dwelling type indicators in the 1991 UK Census of Population. The discrepancies between the dates the images were taken and the two Censuses were unavoidable but only directly affected the classifications, not the subsequent urban density analyses. As a verification of the classification methodology, class areal estimates derived using equal and unequal prior probabilities were compared against those produced by the GIS census surfaces for all four settlements. Each showed marked reductions in total absolute error, ranging between 1.4 percentage points for Bristol and 4.9 percentage points for Swindon. A detailed site-specific accuracy assessment was also conducted on residential dwelling categories for the Bristol 1991 dataset using 250 ground truth points collected by manual and GPS equipment (Table 23.1).
Small, yet systematic improvements are evident in the number of points correctly classified (shown as bold), or percentage correct (shown in parentheses), and overall accuracy and Kappa coefficients. These GIS/RS classifications are advancements in themselves, but also represent a new type of source data for analysing urban density changes, that can be conveniently summarised by statistical distance-decay functions. However, conventional density functions are inadequate and need to be modified to ensure the specific merits of GIS/RS measurements are upheld. The next section will examine such a modification.
Figure 23.2: GIS/RS Classified Images of Three Urban Land Uses for the Four Settlements in 1991

Table 23.1: Accuracy Assessment of the 1991 datasets for Bristol

                   Reference data
                   Detached           Semi-detached      Terraced           Apartment blocks
Classified data    Equal    Unequal   Equal    Unequal   Equal    Unequal   Equal    Unequal
Detached           22(76)   25(86)    3(3)     2(2)      2(2)     1(1)      0(0)     0(0)
Semi-detached      2(7)     1(3)      72(82)   81(86)    13(13)   7(7)      3(18)    2(12)
Terraced           3(10)    2(7)      12(14)   5(6)      81(82)   90(91)    5(29)    5(29)
Apartments         2(7)     1(3)      2(2)     0(0)      3(3)     1(1)      9(53)    10(59)

Overall Accuracy: Equal (73.6%) Unequal (82.4%)
Kappa Coefficient: Equal (0.607) Unequal (0.737)
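For readers unfamiliar with the summary statistics in Table 23.1, the following Python sketch shows how overall accuracy and the Kappa coefficient are conventionally computed from an error matrix. The function names and the 2 x 2 `toy` matrix are hypothetical. The last lines use only the correctly classified counts from Table 23.1 together with the 250 ground truth points mentioned in the text; Kappa is not recomputed here because the four-class entries printed in the table sum to 234 rather than 250.

```python
import numpy as np

def overall_accuracy(matrix):
    """Proportion of points on the diagonal of an error matrix."""
    m = np.asarray(matrix, dtype=float)
    return np.trace(m) / m.sum()

def kappa(matrix):
    """Cohen's Kappa: agreement beyond that expected by chance."""
    m = np.asarray(matrix, dtype=float)
    n = m.sum()
    observed = np.trace(m) / n
    # Chance agreement from the products of row and column totals.
    expected = (m.sum(axis=0) * m.sum(axis=1)).sum() / n ** 2
    return (observed - expected) / (1.0 - expected)

# Hypothetical 2 x 2 error matrix (rows: classified, columns: reference).
toy = [[20, 5],
       [10, 15]]
print(round(overall_accuracy(toy), 3), round(kappa(toy), 3))

# Correctly classified points from Table 23.1, out of 250 ground truth points.
equal_correct = 22 + 72 + 81 + 9
unequal_correct = 25 + 81 + 90 + 10
print(equal_correct / 250, unequal_correct / 250)  # 0.736 and 0.824
```

The two ratios reproduce the overall accuracies of 73.6% and 82.4% reported for the equal and unequal prior classifications.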
23.3 URBAN DENSITY ANALYSIS

23.3.1 Traditional Urban Density Functions

Urban density functions are defined as mathematical formulations which describe the relationship between distance from a city centre (or indeed any other growth focus) and density measures of some facet of the urban environment, often population, buildings or economic activity. For the purposes of demonstrating the importance of linking measurement with analysis, a simple urban density function will illustrate how classified satellite images can be most effectively spatially analysed using rudimentary fractal geometry. It must be stressed that the results from such an analysis are constrained to quantitative summaries of urban form and density, and urban processes may only be inferred from such measurements. As a starting point, the density function most widely adopted in quantitative urban analysis is the negative-exponential (Clark, 1951). It assumes that population density p(R) at distance R from the centre of the city (where R = 0) declines monotonically according to the following negative-exponential,

$$p(R) = K \exp(-\beta R), \tag{23.1}$$

where K is a constant of proportionality which is equal to the central density p(0) and β is the rate at which the effect of distance attenuates. If β is large, density falls off rapidly; if it is small, density falls off slowly. In contrast, in inverse-power functions, K is again the constant of proportionality as in (23.1), but the function is not defined where R = 0, and the parameter on distance, α, is now a power,

$$p(R) = K R^{-\alpha}. \tag{23.2}$$

Both (23.1) and (23.2) are poor predictors of central densities, but the inherent flexibility of the inverse-power produces a less steep fall-off at the periphery, reflecting more realistically the growth of urban areas, primarily through suburbanisation (Batty and Kim, 1992). Furthermore, this flexibility in the design of the inverse-power function allows modifications to be made to the distance parameter α.

Unlike β, α is scale independent and is an ideal mechanism for representing the fractal way urban areas grow and fill available space. The assumption that urban systems exhibit fractal patterns of development is contentious, but work by Batty and Longley (1994) gives strong support to the important contribution of fractal models as robust and convenient estimators of size, form and density. Moreover, the GIS/RS technique outlined in this chapter typically produces classified scenes of urban land use (Figure 23.2) which exhibit spatial complexities and irregularities similar to, and as such can most efficiently be measured by, fractal-based models.
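The contrast between the two functions can be sketched numerically. The Python fragment below is a minimal illustration, not part of the original analysis: it evaluates the negative-exponential (23.1) and inverse-power (23.2) forms, recovers the inverse-power parameters by ordinary least squares on logarithms, and checks the scale independence of the power on distance. The function names and all parameter values are hypothetical.

```python
import numpy as np

def negative_exponential(r, k, beta):
    """Equation (23.1): density K * exp(-beta * R)."""
    return k * np.exp(-beta * np.asarray(r, dtype=float))

def inverse_power(r, k, alpha):
    """Equation (23.2): density K * R**(-alpha); undefined at R = 0."""
    return k * np.asarray(r, dtype=float) ** (-alpha)

def fit_inverse_power(r, density):
    """Least-squares fit of log p = log K - alpha * log R."""
    slope, intercept = np.polyfit(np.log(r), np.log(density), 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic, noise-free profile with hypothetical K = 0.9 and alpha = 0.4.
r = np.arange(1.0, 51.0)
k_hat, alpha_hat = fit_inverse_power(r, inverse_power(r, 0.9, 0.4))
print(round(k_hat, 3), round(alpha_hat, 3))

# Doubling the distance rescales the inverse-power density by the same
# factor 2**(-alpha) at every R; the exponential ratio changes with R.
power_ratio = inverse_power(2 * r, 0.9, 0.4) / inverse_power(r, 0.9, 0.4)
expo_ratio = negative_exponential(2 * r, 0.9, 0.1) / negative_exponential(r, 0.9, 0.1)
print(np.allclose(power_ratio, 2 ** -0.4), np.allclose(expo_ratio, expo_ratio[0]))
```

The constant ratio for the inverse-power form is what is meant by scale independence: the same attenuation applies whatever unit of distance is used, which is not true of the negative-exponential.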
23.3.2 Fractal-based Urban Density Functions

The development of urban fractal models can be found in general texts such as Batty and Longley (1994). In this chapter, density and fractal dimension estimation is based purely on the idea of measuring the occupancy, or space-filling properties, of urban development. According to fractal theory, the dimension D will fall between the established limits of 1 and 2, where each land use (or occupancy) fills more than a line across space (D = 1) but less than the complete plane (D = 2). The COUNT dimension refers to the estimation process which takes into account the cumulative totals for each concentric zone, R (where each zone is 1 pixel wide), from the urban centre and is calculated by,

$$D = \frac{\log N(R')}{\log R'}, \tag{23.3}$$

where N(R') is the total number of occupied cells at mean distance R' from the central point of the settlement. On the other hand, the DENSITY dimension refers to the incremental proportion of zone occupation and is expressed in terms of,

$$D = 2 + \frac{\log p(R')}{\log R'}, \tag{23.4}$$

where p(R') is the proportion of occupied cells, again at mean distance R'. The two types of fractal estimates, COUNT and DENSITY, however, do not account for the variance in each land use pattern. As such, it is impossible to speculate upon the shapes of these patterns with respect to density gradients or profiles. To circumvent the problem, regression lines will be fitted to the profiles generated from each surface in terms of counting land use cells in each concentric zone i from the urban centre, given as N_i, along with their normalisation to densities expressed as p_i, producing,

$$N_i = \varphi R_i^{D}, \tag{23.5}$$

$$p_i = \psi R_i^{-\alpha}, \tag{23.6}$$

where φ and ψ are constants of proportionality, but not defined where radius R = 0, and where D and α are the parameters on distance capable of accommodating scale independence observed in urban systems through the notions of fractal geometry (Batty and Kim, 1992).
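The estimation described above can be sketched as follows. This Python fragment is only one possible reading of the procedure: it builds 1-pixel-wide concentric zones around a centre cell, accumulates occupied cells for the count profile, takes the proportion of occupied cells in each zone for the density profile, and fits log-log regressions. Pairing cumulative counts with the outer radius of each zone and incremental densities with its mid-distance, skipping the central zone, and the function names themselves are all assumptions made for illustration.

```python
import numpy as np

def zone_profiles(occupied, centre):
    """Count and density profiles over 1-pixel-wide concentric zones."""
    occupied = np.asarray(occupied, dtype=bool)
    rows, cols = np.indices(occupied.shape)
    zone = np.floor(np.hypot(rows - centre[0], cols - centre[1])).astype(int)
    # Use only zones lying wholly inside the raster.
    n_zones = min(centre[0], centre[1],
                  occupied.shape[0] - 1 - centre[0],
                  occupied.shape[1] - 1 - centre[1])
    cells = np.bincount(zone.ravel(), minlength=n_zones)[:n_zones]
    filled = np.bincount(zone.ravel(), weights=occupied.ravel().astype(float),
                         minlength=n_zones)[:n_zones]
    idx = np.arange(1, n_zones)          # skip zone 0, where R = 0
    counts = np.cumsum(filled)[idx]      # cumulative occupied cells
    density = filled[idx] / cells[idx]   # incremental proportion occupied
    return idx + 1.0, counts, idx + 0.5, density

def loglog_slope(r, y):
    """Slope of the least-squares line through (log r, log y), y > 0."""
    keep = y > 0
    slope, _ = np.polyfit(np.log(r[keep]), np.log(y[keep]), 1)
    return float(slope)

# A fully occupied raster should fill the plane, so D should be close to 2.
grid = np.ones((101, 101), dtype=bool)
r_out, counts, r_mid, density = zone_profiles(grid, (50, 50))
d_count = loglog_slope(r_out, counts)             # D from the count profile
d_density = 2.0 + loglog_slope(r_mid, density)    # D = 2 - alpha from densities
print(round(d_count, 2), round(d_density, 2))
```

Applied to a classified land use image, the same two slopes would give the count-based D and the density-based α for that land use, subject to the assumptions listed above.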
Fractal dimensions are generated by the intercept parameters (the constants of proportionality in (23.5) and (23.6)), which are, in turn, affected by the slope parameters, α (equal to 2 − D) in (23.6) and D in (23.5). Note that only density profiles (23.6) will be examined further. Count profiles typically produce highly linear functions and do not fully illustrate the constraints on urban development (refer to Mesev et al., 1995 for discussion). It has duly been noted that slope parameters may become volatile when confronted with abnormal data sets, leading to fractal dimensions that could lie beyond the logical limits associated with generalised space-filling, i.e. 1