P.F. Fisher Developments in Spatial Data Handling
Peter F. Fisher
Developments in Spatial Data Handling 11th International Symposium on Spatial Data Handling
with 267 Figures and 37 Tables
Professor Peter F. Fisher Department of Information Science City University Northampton Square London EC1V 0HB United Kingdom
Library of Congress Control Number: 2004111521
ISBN 3-540-22610-9 Springer Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable to prosecution under the German Copyright Law. Springer is a part of Springer Science+Business Media GmbH springeronline.com © Springer-Verlag Berlin Heidelberg 2005 Printed in Germany The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. Cover design: E. Kirchner, Heidelberg Production: Almas Schimmel Typesetting: camera-ready by the authors Printing: Mercedes-Druck, Berlin Binding: Stein + Lehmann, Berlin Printed on acid-free paper 30/3141/as 5 4 3 2 1 0
Foreword

The International Symposium on Spatial Data Handling (SDH) commenced in 1984, in Zurich, Switzerland, organized by the International Geographical Union Commission on Geographical Data Sensing and Processing, which was later succeeded by the Commission on Geographic Information Systems, the Study Group on Geographical Information Science and then the Commission on Geographical Information Science (http://www.hku.hk/cupem/igugisc/). Previous symposia have been held at the following locations:
1st - Zurich, 1984
2nd - Seattle, 1986
3rd - Sydney, 1988
4th - Zurich, 1990
5th - Charleston, 1992
6th - Edinburgh, 1994
7th - Delft, 1996
8th - Vancouver, 1998
9th - Beijing, 2000
10th - Ottawa, 2002
This book is the proceedings of the 11th International Symposium on Spatial Data Handling. The conference was held in Leicester, United Kingdom, on August 23rd to 25th 2004, as a satellite meeting to the Congress of the International Geographical Union in Glasgow. The International Symposium on Spatial Data Handling is a refereed conference. All the papers in this book were submitted as full papers and reviewed by at least two members of the Programme Committee. In all, 83 papers were submitted, and the 50 included here were all considered above average by the reviewers. The papers cover the span of Geographical Information Science topics which have always been the concern of the conference, ranging from uncertainty (error, vagueness, ontology and semantics) to web issues, digital elevation models and urban infrastructure. I would venture to suggest that in the proceedings there is something for everyone who is doing research in the science of geographical information. The Edinburgh and Delft proceedings were published as post-conference publications by Taylor and Francis in the Advances in GIS series, which appeared as two volumes edited by Tom Waugh and Richard Healey and then by Martien Molenaar and Menno-Jan Kraak, and the Ottawa conference was published by Springer-Verlag as Advances in Spatial Data Handling, edited by Dianne Richardson and Peter van Oosterom. Many important seminal papers and novel ideas have originated from this conference series. This publication is the second in the Springer-Verlag series, which members of the Commission hope will continue.
Acknowledgement

Any conference takes considerable organization, and this one especially. As the time came for submission of initial papers for review, I was confined to a hospital bed after emergency surgery, and the conference only happened because Jill, my wife and helper, and Beth, my daughter, kept my email free of submissions. I want to thank them, and Kate and Ian too, from the bottom of my heart for being there; the conference and these proceedings are dedicated to them. I should also thank those at the University of Leicester who were involved in putting on the conference: Kate Moore worked on the web site, and Dave Orme and Liz Cox organized the finances and accommodation. Nick Tate, Claire Jarvis and Andy Millington all helped with a variety of tasks. I should also thank the programme committee, who promptly completed the review process. The support and the questioning of Anthony Yeh are particularly appreciated. Finally I would like to thank the originators of the SDH symposium series, especially Kurt Brassel, who organized the first, and Duane Marble, who edited it, for their vision in setting up a series which has maintained a consistent level of excellence in the reporting of the best scientific results in what we now know as Geographical Information Science.
Peter Fisher 2 June 2004
Table of Contents
Plenary of Submitted Papers
About Invalid, Valid and Clean Polygons ..... 1
  Peter van Oosterom, Wilko Quak and Theo Tijssen
3D Geographic Visualization: The Marine GIS ..... 17
  Chris Gold, Michael Chau, Marcin Dzieszko and Rafel Goralski
Local Knowledge Doesn’t Grow on Trees: Community-Integrated Geographic Information Systems and Rural Community Self-Definition ..... 29
  Gregory Elmes, Michael Dougherty, Hallie Challig, Wilbert Karigomba, Brent McCusker and Daniel Weiner

Web GIS
A Flexible Competitive Neural Network for Eliciting User’s Preferences in Web Urban Spaces ..... 41
  Yanwu Yang and Christophe Claramunt
Combining Heterogeneous Spatial Data from Distributed Sources ..... 59
  M. Howard Williams and Omar Dreza
Security for GIS N-tier Architecture ..... 71
  Michael Govorov, Youry Khmelevsky, Vasiliy Ustimenko and Alexei Khorev
Progressive Transmission of Vector Data Based on Changes Accumulation Model ..... 85
  Tinghua Ai, Zhilin Li and Yaolin Liu

Elevation modelling
An Efficient Natural Neighbour Interpolation Algorithm for Geoscientific Modelling ..... 97
  Hugo Ledoux and Christopher Gold
Evaluating Methods for Interpolating Continuous Surfaces from Irregular Data: a Case Study ..... 109
  M. Hugentobler, R.S. Purves and B. Schneider
Contour Smoothing Based on Weighted Smoothing Splines ..... 125
  Leonor Maria Oliveira Malva
Flooding Triangulated Terrain ..... 137
  Yuanxin Liu and Jack Snoeyink

Vagueness and Interpolation
Vague Topological Predicates for Crisp Regions through Metric Refinements ..... 149
  Markus Schneider
Fuzzy Modeling of Sparse Data ..... 163
  Angelo Marcello Anile and Salvatore Spinella
Handling Spatial Data Uncertainty Using a Fuzzy Geostatistical Approach for Modelling Methane Emissions at the Island of Java ..... 173
  Alfred Stein and Mamta Verma

Temporal
A Visualization Environment for the Space-Time-Cube ..... 189
  Menno-Jan Kraak and Alexandra Koussoulakou
Finding REMO - Detecting Relative Motion Patterns in Geospatial Lifelines ..... 201
  Patrick Laube, Marc van Kreveld and Stephan Imfeld

Indexing
Spatial Hoarding: A Hoarding Strategy for Location-Dependent Systems ..... 217
  Karim Zerioh, Omar El Beqqali and Robert Laurini
Distributed Ranking Methods for Geographic Information Retrieval ..... 231
  Marc van Kreveld, Iris Reinbacher, Avi Arampatzis and Roelof van Zwol
Representing Topological Relationships between Complex Regions by F-Histograms ..... 245
  Lukasz Wawrzyniak, Pascal Matsakis and Dennis Nikitenko
The Po-tree, a Real-time Spatiotemporal Data Indexing Structure ..... 259
  Guillaume Noël, Sylvie Servigne and Robert Laurini

Uncertainty
Empirical Study on Location Indeterminacy of Localities ..... 271
  Sungsoon Hwang and Jean-Claude Thill
Registration of Remote Sensing Image with Measurement Errors and Error Propagation ..... 285
  Yong Ge, Yee Leung, Jianghong Ma and Jinfeng Wang
Double Vagueness: Effect of Scale on the Modelling of Fuzzy Spatial Objects ..... 299
  Tao Cheng, Peter Fisher and Zhilin Li
Area, Perimeter and Shape of Fuzzy Geographical Entities ..... 315
  Cidália Costa Fonte and Weldon A. Lodwick

Generalisation
Why and How Evaluating Generalised Data? ..... 327
  Sylvain Bard and Anne Ruas
Road Network Generalization Based on Connection Analysis ..... 343
  Qingnian Zhang
Continuous Generalization for Visualization on Small Mobile Devices ..... 355
  Monika Sester and Claus Brenner
Shape-Aware Line Generalisation With Weighted Effective Area ..... 369
  Sheng Zhou and Christopher B. Jones

Spatial Relationships
Introducing a Reasoning System Based on Ternary Projective Relations ..... 381
  Roland Billen and Eliseo Clementini
A Discrete Model for Topological Relationships between Uncertain Spatial Objects ..... 395
  Erlend Tøssebro and Mads Nygård
Modeling Topological Properties of a Raster Region for Spatial Optimization ..... 407
  Takeshi Shirabe
Sandbox Geography – To learn from children the form of spatial concepts ..... 421
  Florian A. Twaroch and Andrew U. Frank

Urban Infrastructure
Street Centreline Generation with an Approximated Area Voronoi Diagram ..... 435
  Steven A. Roberts, G. Brent Hall and Barry Boots
Determining Optimal Critical Junctions for Real-time Traffic Monitoring for Transport GIS ..... 447
  Yang Yue and Anthony G. O. Yeh
Collaborative Decision Support for Spatial Planning and Asset Management: IIUM Total Spatial Information System ..... 459
  Alias Abdullah, Muhammad Faris Abdullah and Muhammad Nur Azraei Shahbudin

Navigation
Automatic Generation and Application of Landmarks in Navigation Data Sets ..... 469
  Birgit Elias and Claus Brenner
Towards a Classification of Route Selection Criteria for Route Planning Tools ..... 481
  Hartwig Hochmair
An Algorithm for Icon Labelling on a Real-Time Map ..... 493
  Lars Harrie, Hanna Stigmar, Tommi Koivula and Lassi Lehto

Working with Elevation
Semantically Correct 2.5D GIS Data – the Integration of a DTM and Topographic Vector Data ..... 509
  Andreas Koch and Christian Heipke
Generalization of integrated terrain elevation and 2D object models ..... 527
  J.E. Stoter, F. Penninga and P.J.M. van Oosterom
An Image Analysis and Photogrammetric Engineering Integrated Shadow Detection Model ..... 547
  Yan Li, Peng Gong and Tadashi Sasagawa

Semantics and Ontologies
Understanding Taxonomies of Ecosystems: a Case Study ..... 559
  Alexandre Sorokine and Thomas Bittner
Comparing and Combining Different Expert Relations of How Land Cover Ontologies Relate ..... 573
  Alexis Comber, Peter Fisher and Richard Wadsworth
Representing, Manipulating and Reasoning with Geographic Semantics within a Knowledge Framework ..... 585
  James O’Brien and Mark Gahegan

Data Quality and Metadata
A Framework for Conceptual Modeling of Geographic Data Quality ..... 605
  Anders Friis-Christensen, Jesper V. Christensen and Christian S. Jensen
Consistency Assessment Between Multiple Representations of Geographical Databases: a Specification-Based Approach ..... 617
  David Sheeren, Sébastien Mustière and Jean-Daniel Zucker
Integrating Structured Descriptions of Processes in Geographical Metadata ..... 629
  Bénédicte Bucher

Spatial Statistics
Toward Comparing Maps as Spatial Processes ..... 641
  Ferko Csillag and Barry Boots
Integrating computational and visual analysis for the exploration of health statistics ..... 653
  Etien L. Koua and Menno-Jan Kraak
Using Spatially Adaptive Filters to Map Late Stage Colorectal Cancer Incidence in Iowa ..... 665
  Chetan Tiwari and Gerard Rushton
Author Index

Abdullah, Alias 459; Abdullah, Muhammad F. 459; Ai, Tinghua 85; Anile, A. Marcello 163; Arampatzis, Avi 231; Bard, Sylvain 327; Billen, Roland 381; Bittner, Thomas 559; Boots, Barry 435, 641; Brenner, Claus 355, 469; Bucher, Bénédicte 629; Challig, Hallie 29; Chau, Michael 17; Cheng, Tao 299; Christensen, Jesper V. 605; Claramunt, Christophe 41; Clementini, Eliseo 381; Comber, Alexis 573; Csillag, Ferko 641; Dougherty, Michael 29; Dreza, Omar 59; Dzieszko, Marcin 17; El Beqqali, Omar 217; Elias, Birgit 469; Elmes, Gregory 29; Fisher, Peter 299, 573; Fonte, Cidália Costa 315; Frank, Andrew U. 421; Friis-Christensen, Anders 605; Gahegan, Mark 585; Ge, Yong 285; Gold, Christopher 17, 97; Gong, Peng 547; Goralski, Rafel 17; Govorov, Michael 71; Hall, G. Brent 435; Harrie, Lars 493; Heipke, Christian 509
Hochmair, Hartwig 481; Hugentobler, Marco 109; Hwang, Sungsoon 271; Imfeld, Stephan 201; Jensen, Christian S. 605; Jones, Christopher B. 369; Karigomba, Wilbert 29; Khmelevsky, Youry 71; Khorev, Alexei 71; Koch, Andreas 509; Koivula, Tommi 493; Koua, Etien L. 653; Koussoulakou, Alexandra 189; Kraak, Menno-Jan 189, 653; Laube, Patrick 201; Laurini, Robert 217, 259; Ledoux, Hugo 97; Lehto, Lassi 493; Leung, Yee 285; Li, Zhilin 85, 299; Li, Yan 547; Liu, Yaolin 85; Liu, Yuanxin 137; Lodwick, Weldon A. 315; Ma, Jianghong 285; Malva, Leonor M.O. 125; Matsakis, Pascal 245; McCusker, Brent 29; Mustière, Sébastien 617; Nikitenko, Dennis 245; Noël, Guillaume 259; Nygård, Mads 395; O’Brien, James 585; Penninga, F. 527; Purves, Ross 109; Quak, Wilko 1; Reinbacher, Iris 231
Roberts, Steven A. 435; Ruas, Anne 327; Rushton, Gerard 665; Sasagawa, Tadashi 547; Schneider, B. 109; Schneider, Markus 149; Servigne, Sylvie 259; Sester, Monika 355; Shahbudin, Muhammad 459; Sheeren, David 617; Shirabe, Takeshi 407; Snoeyink, Jack 137; Sorokine, Alexandre 559; Spinella, Salvatore 163; Stein, Alfred 173; Stigmar, Hanna 493; Stoter, J.E. 527; Thill, Jean-Claude 271; Tijssen, Theo 1; Tiwari, Chetan 665; Tøssebro, Erlend 395
Twaroch, Florian A. 421; Ustimenko, Vasiliy 71; van Kreveld, Marc 201, 231; van Oosterom, Peter 1, 527; van Zwol, Roelof 231; Verma, Mamta 173; Wadsworth, Richard 573; Wang, Jinfeng 285; Wawrzyniak, Lukasz 245; Weiner, Daniel 29; Williams, M. Howard 59; Yang, Yanwu 41; Yeh, Anthony G. O. 447; Yue, Yang 447; Zerioh, Karim 217; Zhang, Qingnian 343; Zhou, Sheng 369; Zucker, Jean-Daniel 617
Programme Committee

Chair: Peter Fisher

Dave Abel, Michael Barnsley, Eliseo Clementini, Leila De Floriani, Geoffrey Edwards, Pip Forer, Andrew Frank, Randolph Franklin, Chris Gold, Mike Goodchild, Francis Harvey, Robert Jeansoulin, Chris Jones, Brian Klinkenberg, Menno-Jan Kraak, Robert Laurini, William Mackaness, Andrew Millington, Martien Molenaar, Ferjan Ormeling, Henk Ottens, Donna Peuquet, Anne Ruas, Tapani Sarjakoski, Monika Sester, Marc van Kreveld, Peter van Oosterom, Rob Weibel, Stephan Winter, Mike Worboys, Anthony Yeh
Local Organising Committee

Chair: Peter Fisher

Mark Gillings, Claire Jarvis, Andrew Millington, Kate Moore, David Orme, Sanjay Rana, Kevin Tansley, Nicholas Tate
About Invalid, Valid and Clean Polygons

Peter van Oosterom, Wilko Quak and Theo Tijssen
Delft University of Technology, OTB, section GIS technology, Jaffalaan 9, 2628 BX Delft, The Netherlands
Abstract

Spatial models are often based on polygons, both in 2D and 3D. Many Geo-ICT products support spatial data types, such as the polygon, based on the OpenGIS ‘Simple Features Specification’. OpenGIS and ISO have agreed to harmonize their specifications and standards. In this paper we discuss the relevant aspects related to polygons in these standards and compare several implementations. A quite exhaustive set of test polygons (with holes) has been developed. The test results reveal significant differences between the implementations, which cause interoperability problems. Part of these differences can be explained by different interpretations (definitions) of the OpenGIS and ISO standards, which do not share an identical polygon definition. Another part of these differences is due to typical implementation issues, such as alternative methods for handling tolerances. Based on these experiences we propose an unambiguous definition for polygons, which makes the polygon again the stable foundation it is supposed to be in spatial modelling and analysis. Valid polygons are well defined, but as they may still cause problems during data transfer, the concept of (valid) clean polygons is also defined.
1 Introduction

Within our Geo-Database Management Centre (GDMC), we investigate different Geo-ICT products, such as Geo-DBMSs, GIS packages and ‘geo’ middleware solutions. During our tests and benchmarks, we noticed subtle but fundamental differences in the way polygons are treated (even in the 2D situation and using only straight lines). The consequences can be quite unpleasant. For example, a different number of objects are selected when
the same query is executed on the same data set in different environments. Another consequence is that data may be lost when transferring it from one system to another, as polygons valid in one environment may not be accepted in the other environment. It all seems so simple; everyone working with geo-information knows what a polygon is: an area bounded by straight-line segments (and possibly having some holes). A dictionary definition of a polygon: a figure (usually a plane, rectilinear figure) having many, i.e. (usually) more than four, angles (and sides) (Oxford 1973). A polygon is the foundation geometric data type of many spatial data models, such as those used for topographic data, cadastral data and soil data, to name just a few. So, why have the main Geo-ICT vendors not been able to implement the same polygons? The answer is that in reality the situation is not as simple as it may seem at first sight. The two main difficulties, which potentially cause differences between the systems, are:
1. Is the outer boundary allowed to interact with itself, and possibly also with the inner boundaries, and if so, under what conditions?
2. The computer is a finite digital machine and therefore coordinates may sometimes differ a little from the (real) mathematical value. Therefore tolerance values (epsilons) are needed when validating polygons.
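As a small numerical illustration of the second difficulty (a sketch with hypothetical coordinates, not taken from the test set used later in this paper): a point constructed to lie exactly on a straight-line segment may, after floating point rounding, leave a tiny non-zero residue in an exact collinearity test, which is why validators compare against an epsilon rather than against zero.

def signed_area2(a, b, p):
    # twice the signed area of triangle (a, b, p); exactly 0 only if p is exactly on the line a-b
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

a, b = (0.0, 0.0), (3.0, 7.0)
t = 1.0 / 3.0                                      # parameter along the segment
p = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
residue = signed_area2(a, b, p)                    # often a tiny non-zero value rather than 0.0
EPS = 1e-9
print(abs(residue) <= EPS)                         # True: p is treated as lying on the boundary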
Fig. 1. Real world examples: topographic data (left: outer ring touches itself in one point, middle: two inner rings (holes) both touch the outer ring, right: two inner rings touch each other)
The interaction between the outer and possibly the inner boundaries of a single polygon is related to the topological analysis of the situation. This is an abstract issue, without implementation difficulties such as tolerance values. So, one might expect that the main ‘geometry’ standards of OpenGIS and ISO will provide a clear answer to this. A basic concept is that of a straight-line segment, which is defined by its begin and end point. Polygon input could be specified as an unordered, unstructured set of straight-line segments. The following issues have to be addressed before it can be
decided whether the set represents a valid or invalid polygon (also have a look at the figures in section 3):
1. Which line segments, and in which order, are connected to each other?
2. Is there one, or are there more than one, connected set of straight-line segments?
3. Are all connected sets of straight-line segments closed, that is, do they form boundary rings, and is every node (vertex) in the ring associated with at least 2 segments?
4. In case a node is used in 4, 6, 8, etc. segments of one ring, is the ring then disconnected into respectively 2, 3, 4, etc. ‘separate’ rings (both choices may be considered ‘valid’; anyhow the situation can happen in reality irrespective of how this should be modeled, see fig. 1)?
5. Are there any crossing segments (this would not be allowed for a valid polygon)?
6. Is there one ring (the outer ring) which encloses an area that ‘contains’ all other (inner) rings?
7. Are there no nested inner rings (these would result in disconnected areas)?
8. Are there any touching rings? This is related to question 4, but another situation occurs when one ring touches with one of its nodes another ring in the interior of a straight-line segment.
9. Are the rings, after construction from the line segments, properly oriented, that is counterclockwise for the outer boundary and clockwise for the inner boundaries? (Defining a normal vector for the area that points upward is a usual convention inherited from the computer graphics world: only areas with a normal vector in the direction of the viewer are visible.) Note that this means that the polygon area will always be on the left-hand side of the polygon boundaries.
Some of the questions may be combined in one test in an actual implementation in order to decide if the polygon is valid. In case a polygon is invalid, it may be completely rejected (with an error message) or it may be ‘accepted’ (with a warning) by the system, but then operations on such a polygon may often not be guaranteed to work correctly. In this paper it is assumed that during data transfer between different systems enough characters or bytes are used, in the case of respectively ASCII (such as GML of OpenGIS and ISO TC211) or binary data formats, in order to avoid unwanted change of coordinates (due to rounding/conversion), and that the sending and receiving system have similar capabilities for representing coordinates, e.g. integers (4 bytes) or floating point numbers (4, 8, or 16 bytes) (IEEE 1985). In reality this can also be a non-trivial issue in which errors might be introduced. A worst case scenario would be transferring the same data several times between two systems (without editing) and every time
the coordinates drift further away due to rounding/conversions. By using enough characters (bytes) this should be avoided. In reality, polygons are not specified as a set of unconnected and unordered straight-line segments. One reason for this is that every coordinate would then be specified at least twice, and this would be quite redundant, with all associated problems such as possible errors and increased storage requirements. Therefore, most systems require the user to specify the polygon as a set of ordered and oriented boundary rings (so every coordinate is stated only once). The advantage is also that many of the tasks listed above are already solved through the syntax of the polygon. But the user can still make errors, e.g. switch the outer and inner boundary, or specify rings with erroneous orientation. So, in order to be sure that the polygon is valid, most things have to be checked anyhow.
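As an illustration of check 9 above, here is a minimal sketch (our own illustration, not the authors' implementation; coordinates are hypothetical) that uses the signed area of a ring to test its orientation: a positive signed area means counterclockwise, as required for the outer ring, and a negative one means clockwise, as required for inner rings.

def signed_area(ring):
    # ring is a closed list of (x, y) nodes, with the first node repeated at the end
    s = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        s += x1 * y2 - x2 * y1
    return s / 2.0

outer = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]   # counterclockwise: valid outer ring
hole = [(2, 2), (2, 4), (4, 4), (4, 2), (2, 2)]        # clockwise: valid inner ring
assert signed_area(outer) > 0
assert signed_area(hole) < 0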
2 Polygon definitions

In this section we review a number of polygon definitions. First, we have a look at the definition of a (simple) polygon as used within computational geometry. Then the ISO and the OpenGIS polygon definitions are discussed. Next, we present our definition, which tries to fill the blank spots of the mentioned definitions and, in case of inconsistencies between the standards, makes a decision based on a well-defined (and straightforward) set of rules. Finally, the concept of clean (and robust) polygons is introduced.

2.1 Computational geometry

From the computational geometry textbook of Preparata and Shamos (1985, p.18): ‘a polygon is defined by a finite set of segments such that every segment extreme is shared by exactly two edges and no subset of edges has the same property.’ This excludes situations with dangling segments, but also excludes two disjoint regions (which could be called a multi-polygon), a polygon with a hole, or a polygon in which the boundary touches itself in one point (an extreme shared by 4, 6, 8, … edges). However, it does not exclude a self-intersecting polygon, that is, two edges which intersect. Therefore also the following definition is given: ‘A polygon is simple if there is no pair of nonconsecutive edges sharing a point. A simple polygon partitions the plane into two disjoint regions, the interior (bounded) and the exterior (unbounded).’ Besides the self-intersecting polygons, this also disallows polygons with (partially) overlapping edges. Finally the following
interesting remark is made: ‘in common parlance, the term polygon is frequently used to denote the union of the boundary and the interior.’ This is certainly true in the GIS context, which implies that actually the simple polygon definition is intended, as otherwise the interior would not be defined. One drawback of this definition is that it disallows polygons with holes, which are quite frequent in the GIS context.

2.2 ISO definition

The ISO standard 19107 ‘Geographic information — Spatial schema’ (ISO 2003) has the following polygon definition: ‘A GM_Polygon is a surface patch that is defined by a set of boundary curves (most likely GM_CurveSegments) and an underlying surface to which these curves adhere. The default is that the curves are coplanar and the polygon uses planar interpolation in its interior.’ It then continues with describing the two important attributes, the exterior and the interior: ‘The attribute “exterior” describes the “largest boundary” of the surface patch. The GM_GenericCurves that constitute the exterior and interior boundaries of this GM_Polygon shall be oriented in a manner consistent with the upNormal of the this.spanningSurface.’ and ‘The attribute “interior” describes all but the exterior boundary of the surface patch.’ Note that in this context the words exterior and interior refer to the rings defining respectively the outer and inner boundaries of a polygon (and not to the exterior area and interior area of the polygon with holes). It is a bit dangerous to quote from the ISO standard without the full context (and therefore the exact meaning of primitives such as GM_CurveSegments and GM_GenericCurves). The GM_Polygon is a specialization of the more generic GM_SurfacePatch, which has the following ISO definition: ‘GM_SurfacePatch defines a homogeneous portion of a GM_Surface. The multiplicity of the association “Segmentation” specifies that each GM_SurfacePatch shall be in one and only one GM_Surface.’ The ISO definition for the GM_Surface is: ‘GM_Surface, a subclass of GM_Primitive, is the basis for 2-dimensional geometry. Unorientable surfaces such as the Möbius band are not allowed. The orientation of a surface chooses an “up” direction through the choice of the upward normal, which, if the surface is not a cycle, is the side of the surface from which the exterior boundary appears counterclockwise. Reversal of the surface orientation reverses the curve orientation of each boundary component, and interchanges the conceptual “up” and “down” direction of the surface. If the surface is the boundary of a solid, the “up” direction is outward. For closed surfaces, which have no boundary, the up direction is
that of the surface patches, which must be consistent with one another. Its included GM_SurfacePatches describe the interior structure of a GM_Surface.’ So, this is not the simple definition of a polygon one might expect. Further, it is not directly obvious if the outer boundary is allowed to touch itself or if it is allowed to touch the inner boundaries and, if so, under what conditions this would be allowed. One thing is very clear: there is just one outer boundary and there can be zero or more inner boundaries. This means that a ‘polygon’ with two outer boundaries, defining potentially disconnected areas, is certainly invalid. Also, the ISO standard is very explicit about the orientation of the outer and inner boundaries (in 2D looking from above: counterclockwise and clockwise for respectively the outer and inner boundaries).

2.3 OpenGIS definition

The ISO definition of a polygon is at the abstract (mathematical) level and part of the whole complex of related geometry definitions. The definition has to be translated to the implementation level, and this is what is done by the OpenGIS Simple Feature Specification (SFS) for SQL (OGC 1999). The OpenGIS definition is based on the ISO definition, so it can be expected that there will (hopefully) be some resemblance: ‘A Polygon is a planar Surface, defined by 1 exterior boundary and 0 or more interior boundaries. Each interior boundary defines a hole in the Polygon. The assertions for polygons (the rules that define valid polygons) are:
1. Polygons are topologically closed.
2. The boundary of a Polygon consists of a set of LinearRings that make up its exterior and interior boundaries.
3. No two rings in the boundary cross, the rings in the boundary of a Polygon may intersect at a Point but only as a tangent: ∀ P ∈ Polygon, ∀ c1, c2 ∈ P.Boundary(), c1 ≠ c2, ∀ p, q ∈ Point, p, q ∈ c1, p ≠ q, [p ∈ c2 ⇒ ¬(q ∈ c2)]
4. A Polygon may not have cut lines, spikes or punctures: ∀ P ∈ Polygon, P = Closure(Interior(P))
5. The Interior of every Polygon is a connected point set.
6. The Exterior of a Polygon with 1 or more holes is not connected. Each hole defines a connected component of the Exterior.
In the above assertions, Interior, Closure and Exterior have the standard topological definitions. The combination of 1 and 3 make a Polygon a Regular Closed point set.’
Similar to the ISO standard, in the OpenGIS SFS specification a polygon is also a specialization of the more generic surface type, which can exist in 3D space. However, the only instantiable subclass of Surface defined in the OpenGIS SFS specification, Polygon, is a simple Surface that is planar. As might be expected from an implementation specification, a number of things become clearer. According to condition 3, rings may touch each other in at most one point. Further, condition 5 makes clear that the interior of a polygon must be a connected set (and a configuration of inner rings which somehow subdivides the interior of a polygon into disconnected parts is not allowed). Finally, an interesting point is raised in condition 4: cut lines or spikes are not allowed. All fine from a mathematical point of view, but when is a ‘sharp part’ of the boundary considered a spike? Must the interior angle at that point be exactly 0, or is some kind of tolerance involved? The same is true for testing if a ring touches itself or if two rings touch each other (in a node-node situation or a node-segment situation). Note that the OpenGIS specification does not say anything concerning the orientation of the polygon rings.

2.4 An enhanced ‘polygon with holes’ definition

Our definition of a valid polygon with holes: ‘A polygon is defined by straight-line segments, all organized in rings, representing at least one outer (oriented counterclockwise) and zero or more inner boundaries (oriented clockwise; also see sections 2.1-2.3 for the concepts used). This implies that all nodes are connected to at least two line segments and no dangling line segments are allowed. Rings are not allowed to cross, but it is allowed that rings touch or (even partially) overlap themselves or each other, as long as any point inside or on the boundary of the polygon can be reached through the interior of the polygon from any other point inside the polygon, that is, it defines one connected area. As indicated above, some conditions (e.g. ‘ring touches other ring’) require a tolerance value in their evaluation and therefore this is the last part of the definition.’ One could consider not using a tolerance value and only looking at exact values of the coordinates and the straight lines defined by them. Imagine a situation in which a point of an inner ring is very close to the outer ring; see for example cases 4, 31 and 32 in the figure of section 3. The situation in reality may have been that the point is supposed to be located on the ring (case 4). However, due to the finite number of available digits in a computer, it may not be possible to represent that exact location (Goldberg 1991, Güting and Schneider 1993), but a close location is chosen (cases 31 and 32). It is arbitrary whether this point ends up on the one or the other side of the ring. Not
considering tolerances would either mean that this situation would be classified as crossing rings (not allowed) or as two disjoint rings (which is not the case, as they are supposed to touch each other). Either way, the polygon (ring and validity) situation is not correctly assessed. This is one of the reasons why many systems use some kind of tolerance value. The problem is how to specify the manner in which the tolerance value is applied when validating the polygon (this part is missing in our definition above). Another example which illustrates this problem is case 30 (see figure in section 3): one option for tolerance processing could be to remove the ‘internal spike’, and the result would be a valid polygon. However, an alternative approach may be to widen the gap between the two end nodes, and in this situation the result is an invalid polygon, as the ‘internal spike’ intersects one of the other edges. It may be difficult to formalize unambiguous epsilon processing rules as a part of the validation process. Another strange aspect of our definition of valid polygons is that a ‘spike to the outside’ (case 11 in the figure of section 3) results in an invalid polygon, as it is not possible to reach all other points of the polygon via the interior from a point in the middle of this spike. At the same time a ‘spike to the inside’ (case 12 in the figure of section 3) is considered a valid polygon, as it is possible to reach from any point of the polygon (also from the middle of the spike) any other point via the interior of the polygon. Something similar to ‘spikes’ occurs with ‘bridges’: while internal bridges are valid (cases 7 and 8), external bridges (cases 15 and 16) are invalid. Both in the situation of spikes and of bridges, the difference between internal and external could be considered ‘asymmetrical’.

2.5 Valid and clean polygons

The validation process (according to our definition of valid polygons as described above) would become much simpler if it can be assumed that no point lies within epsilon tolerance of any other point or edge (which it does not define itself); such a polygon will be called a (valid) clean polygon. In cases 4 (31 and 32) this implies that the segment on which the point of the other ring is supposed to be located should be split into two parts, with the point as the best possible representation within the computer. By enforcing this way of modelling, the polygon validity assessment may be executed without tolerances. Another advantage of clean polygons as described above is that they will not have any spikes (neither to the inside nor to the outside), as the two end nodes of a spike would lie too close together (or even coincide). Similarly, the internal and external bridges should be removed. Further, repeated points are removed from the representation.
Before transferring polygon data, the sending system should therefore first do the epsilon tolerance processing. After that, the sender can be sure that the receiving system will correctly receive the polygons (assuming that coordinates are not changed by more than epsilon during transfer). The largest distance over which a coordinate can be moved while the result is still a valid polygon is called the robustness of the polygon representation by Thompson (2003). In case 4 without an additional node, the robustness would be equal to 0, as an infinitely small change (to the outside) of the node of the inner ring on the edge of the outer ring would make this polygon invalid (not considering epsilon tolerance). However, adding an explicit shared node in both the inner and outer ring as a result of the epsilon tolerance processing increases the robustness of this representation (of the ‘same’ polygon) to at least the value of epsilon. In fact the robustness is even larger, as it is possible to change every (shared) node by more than epsilon (the size of epsilon can be observed from the open circle in the drawing of cases 31 and 32). The robustness of a polygon can be computed by finding the smallest distance between a node and an edge (not defined by that node). The smallest distance can be reached either somewhere in the middle or near one of the end points of the involved edge. A brute force algorithm would require O(n²), while a smarter (computational geometry) algorithm could probably compute this in O(n log n), where n is the number of nodes (or edges) in the polygon. The concept of robustness has some resemblance with the ‘indiscernibility’ relation between two representations as introduced by Worboys (1998).
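The following brute-force O(n²) sketch (an assumption of ours, not the authors' or Thompson's code) computes this robustness measure directly: it takes the minimum, over every node and every edge that is not defined by a node with those coordinates, of the point-to-segment distance; math.dist requires Python 3.8 or later.

import math

def point_segment_distance(p, a, b):
    # distance from point p to segment a-b; the clamp handles the 'near an end point' case
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def robustness(rings):
    # rings: the closed rings of one polygon (first node repeated at the end of each ring)
    nodes = [p for ring in rings for p in ring[:-1]]
    edges = [(ring[i], ring[i + 1]) for ring in rings for i in range(len(ring) - 1)]
    return min(point_segment_distance(p, a, b)
               for p in nodes for (a, b) in edges
               if p != a and p != b)   # skip edges defined by a node with these coordinates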
3 Testing Geo-ICT systems

In this section we first specify a list of representative test polygons. This list is supposed to be exhaustive for all possible valid and invalid types of polygons. Next we use this test set in combination with four different geo-DBMSs and compare the outcome to the OpenGIS, ISO and our own definition of valid polygons.

3.1 Polygon examples

Figure 2 shows an overview of our test polygons. In these images, small filled circles represent nodes. Sometimes two of these circles are drawn very close to each other; this actually means that the nodes are exactly the same. The same is true for two very close lines, which actually mean (partly) overlapping segments. Some figures contain empty circles, which
indicate tolerance values (the assumed tolerance value is 4000, related to the coordinates used in our test polygons). In case such a situation occurs, the system may decide to ‘correct’ the polygon, within the tolerance value distances, resulting in a new polygon representation. The resulting new representation may be valid or invalid, depending on the configuration. Note that all polygon outer rings are defined counterclockwise and all inner rings are defined clockwise. More test variants can be imagined when reversing the orientation (only done for test 1). Further, a ring touching itself can be modelled as one ring or as separate rings. The separate ring option is chosen, with the exception of example 4, where both cases are tested: 4a, the ‘separate rings’ option (without an explicit node where the rings touch), and 4b, the ‘single ring’ option (self touching); an illustrative encoding of both options is sketched after the list below. Even a third variant would be possible: two ‘separate rings’ with explicit nodes where these rings touch (not tested). Also in this case more test variants can be imagined. In addition to our presented set of test cases, it may be possible to think of other test cases. It is important to collect these test cases in order to evaluate the completeness (and correctness) of the related standards. To name just a few additional, untested, cases (but many more will exist):
- polygons with inner and outer ring switched (first inner ring specified, then outer ring), but both with the proper orientation
- same as above, but now also the orientation (clockwise/counterclockwise) reversed
- two exactly the same points on a straight line (similar to case 26, but now with repeated points)
- same as above, but now the two points are repeated on a true 'corner' of the polygon
- a line segment of an inner ring within tolerance of a line segment of the outer ring (but on the inside), similar to case 9 but with a tolerance value
- same as above, but now the line segment is on the outside
- two outer rings, with partly overlapping edges or touching in a point
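As an illustration of the two encodings mentioned above for a hole that touches the outer boundary in one point (hypothetical coordinates, not the actual test data; the rings meet at (5 0)):

case_4a_style = "POLYGON((0 0, 10 0, 10 10, 0 10, 0 0), (5 0, 3 4, 7 4, 5 0))"   # hole as a separate ring; no explicit node on the outer ring where they touch
case_4b_style = "POLYGON((0 0, 5 0, 3 4, 7 4, 5 0, 10 0, 10 10, 0 10, 0 0))"     # one self-touching ring that visits (5 0) twice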
Fig. 2. Overview of the polygons used in the test
Table 1. Results of validating the polygons in Oracle, Informix, PostGIS and ArcSDE for each test case (id 1a-37), together with the expected results according to the OGC-SFS and ISO 19107 definitions; no code means the polygon is considered valid (BS = boundary self-intersects, CR = crossing rings, EN = edge not connected to interior, FI = floating inner ring, NA = no area, NC = not closed, NH = not one homogeneous portion, NO = not orientable, NS = no surface, Rn = rule n (n = 1, 3, 4, 5), RC = ring crosses ring, RO = rings overlap, RT = rings touch, SR = self-crossing ring, TE = two exterior rings, TS = two separate areas, WO = wrong orientation).
3.2 System tests

We tested a set of about 40 polygons in different spatial databases: Oracle (2001), Informix (2000), PostGIS (Ramsey 2001, PostgreSQL 2001), and ArcSDE binary (ESRI). Of these implementations, Informix and PostGIS use the OpenGIS specification for polygons. Oracle Spatial defines a polygon as: ‘Polygons are composed of connected line strings that form a closed ring and the area of the polygon is implied.’ The OpenGIS Well Known Text (WKT) format was used when trying to insert the records in the different systems, with the exception of ArcSDE (see below). Because Oracle does not support this format, we integrated the Java Topology Suite (1.3) into the Oracle server to implement the conversion function. Below is a single example of the exact insert statement, for three of the four systems, of a correct polygon with one hole (case 2):

Oracle example:
insert into test_polygon values ('2', GeomFromText( 'polygon( (33300 19200, 19200 30000, 8300 15000, 20000 4200, 33300 19200), (25000 13300, 17500 13300, 17500 19200, 25000 19200, 25000 13300))'));
Informix example: insert into test_polygon values ('2', ST_PolyFromText('polygon( (33300 19200, 19200 30000, 8300 15000, 20000 4200, 33300 19200), (25000 13300, 17500 13300, 17500 19200, 25000 19200, 25000 13300))', 128992));
PostGIS/PostgreSQL: insert into test_polygon values ('2', GeometryFromText('POLYGON( (33300 19200, 19200 30000, 8300 15000, 20000 4200, 33300 19200), (25000 13300, 17500 13300, 17500 19200, 25000 19200, 25000 13300))', -1));
Oracle has a separate validation function in which a parameter for the tolerance can be specified (the example below shows our tolerance value of 4000):
select id, sdo_geom.validate_geometry_with_context(geom,4000) as valid from test_polygon where sdo_geom.validate_geometry_with_context(geom,4000) <> 'TRUE';
It was not possible to load the WKT format into ArcSDE (without writing a program using ESRI’s ArcSDE API). As we did want to be sure to input the same polygon definitions with the standard tools of the vendor, we used the following procedure to load the data into ArcSDE: 1. WKT was converted by hand to ‘mif’ (‘mid’) format: in this manner exactly the same coordinates, ordering and rings were specified (including the repetition of the first and last point)
2. ArcTools 8.3 was used to convert the mif/mid format to ESRI shape files, a binary format. This conversion has as a side effect some coordinate manipulation: e.g. outer rings are now always ordered clockwise, a repeated last point is omitted, and rings defining a region are by definition closed. This makes it impossible to test case 10. 3. Finally the polygons are loaded into ArcSDE binary with the following ArcSDE command:
create -l testpoly,geom -f testpoly.shp -e a+ -k SDEBINARY \ -9 10000 -a all -u username
The table gives an overview of the different responses by the four systems. This table also contains the expected results according to the ISO and OpenGIS definitions and our own definition. The result of inserting the test cases in the four different DBMSs leads to the following general observations. Oracle Spatial (version 9.2.0.3.0) provides a tolerance parameter that is used with many of the operations. In this test a tolerance of 4000 was used. Experiments with different tolerance values yielded the same results. If Informix (Spatial DataBlade Release 8.11.UC1 (Build 225)) finds a polygon with erroneous orientation, it reverses the polygon internally without warning. PostGIS 0.6.2 (on PostgreSQL 7.1.3) only supports the GeomFromText (and not PolyFromText) and the geometries cannot be validated.
4 Conclusion

As noticed in our experience when benchmarking and testing Geo-ICT products, the consistent use of a polygon definition is not yet a reality. This is true both for the standards (specifications) and for the implementations in products. Based on both the ISO 19107 standard and the OpenGIS SFS for SQL implementation specification, it may sometimes be very difficult or impossible to determine whether a polygon is valid or not. Also, according to our evaluation of the set of test cases, the results for (in)valid polygons are not always harmonized between ISO and OpenGIS. Further, both the ISO and OpenGIS definitions do not cover the important aspect (when implementing polygons) of tolerance values. Therefore our own improved definition of a polygon (with holes) was given. This was refined by the definition of a (valid) clean polygon, which is suitable for data transfer (and easy validation). A part of the polygon validation may already be embedded in the syntax of the 'polygon input' (string), and certain validation tasks are implicit (e.g. the system does not have to assemble the rings from the individual straight-
line segments, as the rings are specified). One could wonder whether the orientation of the rings in a polygon should be a strict requirement, as the intended polygon area is clear. If, in the syntax, the outer polygon ring were not determined by the ordering of the rings (outer ring first) but purely by the orientation of the rings (the outer ring could be any one in the list of rings), then proper orientation would be useful, as it can be used to detect inner and outer rings. But even this is not strictly needed, as one could also determine via the geometric configuration (computing) which ring should be considered the outer ring. Besides the theory, our tests with four different Geo-DBMSs (Oracle, Informix, PostGIS, and ArcSDE binary) and one geo middleware product (LaserScan Radius Topology, not reported here) revealed that also in practice significant differences in polygon validation do exist. It needs no further explanation that this will cause serious problems during data transfer, including loss of data. We urge standardization organizations and Geo-ICT vendors to address this problem and consider our proposed definition. Until now, only the validation of (input) polygons has been discussed, but what happens with these (in)valid polygons during operations? E.g. the intersection of two valid polygons may result in disconnected areas (that is, not a valid polygon). How is the area or perimeter of a polygon computed in case it is specified in longitude and latitude (on a curved surface such as a sphere, ellipsoid or geoid)? What will be the resulting units, and how does the tolerance value influence this result? In this paper only simple polygons with holes on a flat surface were discussed. However, as already indicated in the previous paragraph (curved surfaces), more complex situations can occur in the world of Geo-ICT products (and standards):
- Multi-polygons (that is, two or more outer boundaries which are not connected to each other)
- Boundaries that also include non-linear edges (e.g. circular arcs)
- In 3D space, but limited to a flat surface
- In 3D space, but limited to polyhedral surfaces (piecewise flat)
- In 3D, non-flat surfaces, especially an Earth ellipsoid (or geoid)
For all these situations unambiguous and complete definitions, including the tolerance aspect, must be available. Test cases should be defined and subsequently the products should be evaluated with these test cases. And after finishing with polygons, we should continue with polyhedrons (Arens et al. 2003).
Acknowledgements

We would like to thank the Topographic Service and the Dutch Cadastre for providing us with the test data sets (since January 2004, the Topographic Service is part of the Cadastre). We are further grateful to the vendors and developers of the Geo-ICT products mentioned in this paper (Oracle, Informix, PostgreSQL/PostGIS, ArcSDE) for making their products available for our research. Finally, we would like to thank the anonymous reviewers of this paper for their constructive remarks.
References

Arens C, Stoter JE, van Oosterom PJM (2003) Modelling 3D spatial objects in a Geo-DBMS using a 3D primitive. Proceedings 6th AGILE, Lyon, France.
Goldberg D (1991) What Every Computer Scientist Should Know About Floating-Point Arithmetic. ACM Computing Surveys, Vol. 23: 5-48.
Güting RH and Schneider M (1993) Realms: A foundation for spatial data types in database systems. In D. J. Abel and B. C. Ooi, editors, Proceedings of the 3rd International Symposium on Large Spatial Databases (SSD), volume 692 of Lecture Notes in Computer Science, pages 14-35. Springer-Verlag.
IEEE (1985) American National Standard -- IEEE Standard for Binary Floating Point Arithmetic. ANSI/IEEE 754-1985 (New York: American National Standards Institute, Inc.).
Informix (2000) Informix Spatial DataBlade Module User's Guide. December 2000. Part no. 000-6868.
ISO (2003) ISO/TC 211/WG 2, ISO/CD 19107, Geographic information — Spatial schema, 2003.
OGC (1999) Open GIS Consortium, Inc., OpenGIS Simple Features Specification For SQL, Revision 1.1, OpenGIS Project Document 99-049, 5 May 1999.
Oracle (2001) Oracle Spatial User's Guide and Reference. Oracle Corporation, Redwood City, CA, USA, June 2001. Release 9.0.1 Part No. A8805-01.
Oxford (1973) The Shorter Oxford English Dictionary.
PostgreSQL (2001) The PostgreSQL Global Development Group. PostgreSQL 7.1.3 Documentation.
Preparata FP and Shamos MI (1985) Computational Geometry, an Introduction. Springer-Verlag, New York Berlin Heidelberg Tokyo.
Ramsey P (2001) PostGIS Manual (version 0.6.2). Refractions Research Inc.
Thompson R (2003) PhD research proposal ‘Towards a Rigorous Logic for Spatial Data Representation’. Department of Geographical Sciences and Planning, The University of Queensland, Australia, November 2003.
Worboys MF (1998) Some Algebraic and Logical Foundations for Spatial Imprecision. In Goodchild M and Jeansoulin R (eds), Data Quality in Geographic Information: from error to uncertainty, Hermes.
3D Geographic Visualization: The Marine GIS

Chris Gold (1), Michael Chau (2), Marcin Dzieszko (1) and Rafel Goralski (1)
(1) Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Hong Kong, [email protected]
(2) Hong Kong Marine Department, Hong Kong
Abstract The objective of GIS and Spatial Data Handling is to view, query and manipulate a computer simulation of the real world. While we traditionally work with two-dimensional static maps, modern technology allows us to work with a three-dimensional dynamic environment. We have developed a generic graphics component which provides many of the tools necessary for developing 3D dynamic geographical applications. Our example application is a 3D “Pilot Book”, which is used to provide navigation assistance to ships entering Hong Kong harbour. We show some of the “Marine GIS” results, and mention several other applications.
1 Introduction: The Real and Simulated World

Traditional GIS is two-dimensional and static. The new game technology is 3D and dynamic. We attempt here to develop a “games engine” for 3D dynamic GIS, and to evaluate the process with several applications – primarily a “Marine GIS”, where objects and viewpoints move, and a realistic simulation of the real-world view should be an advantage in navigation and training. While computer game developers have spent considerable time in developing imaginary worlds, and the ways in which players may interact in a “natural” fashion, graphics software and hardware developers have provided most of the necessary tools. However, there has been limited examination of the potential of these technologies for “Real World” spatial data handling. This paper attempts to address these questions.
For “GIS”, we presume that our data represents some sampling of the real world. This real world data maps directly to the simulated world (objects and coordinates) of the computer representation that we wish to work with. The only difference between this activity and some kinds of computer game playing is the meaning inherent in our data, and the set of operations that we perform on it. Our perception or mental ability to manipulate and imagine the simulated world that we work with is usually limited by our concept of permissible interactions. Historically, we started with paper and map overlays, and this led to a particular set of techniques that were implemented on a computer. Game development is capable of freeing us from many of these constraints. We would like to present an approach to manipulation of our simulated world that attempts to be independent of the usual constraints of static maps – a static viewpoint, a static set of objects and relationships, the inability to reach in and make changes, the inability to make interactive queries, the inability to change the perceived mood of our world (darkness, fog, etc.) We are not alone in doing this – the computer games industry has put much effort into such increased “realism”. The intention of this work is to bring these concepts, and the improved algorithms and hardware that they have inspired, into the context of geographic space, where the objects and locations correspond in some way to the physical world. Again, we are not completely original here either – terrain fly-throughs and airport simulations are widely used. Nevertheless, we think that the overall structure of our interactive interface between the user and the simulated world contains components that have not previously been combined in a geographic application. We will first describe the motivation and imagined actions, then the consequent “games engine” design (which differs from pure games). This leads to the implementation issues of our “GeoScene” engine, Marine GIS and other applications, and our conclusions about this exercise. In the “God Game” (Greeley 1986) the author imagines the computer user creating an interactive game, with individuals who then come alive “inside the screen” – and he is thereafter responsible for their well-being. Our objective is to create a general-purpose interface that allows us to manage, not an imaginary world, but a simulation of some aspect of our current one. Thus we need to: 1) create a simulated world and populate it with geometric objects, lights and cameras (observers); 2) manipulate the relations between observers and objects; 3) query and modify objects’ properties, geometry or location; and 4) automate changes to objects (e.g. location). While not a complete list, these were our initial objectives, motivated by the desire to define a general-purpose simulation and visualization tool for geographic problems.
2 Computer Graphics Background: Viewing Objects One concept in the development of computer graphics (e.g. Foley et al. 1990) is the ability to concatenate simple transformations (rotations, translation (movement) and scaling) by the use of homogeneous coordinates and individual transformations expressed as 4x4 matrices for 3D worlds. Blinn (1977) showed that these techniques allowed the concatenation of transformation matrices to give a single matrix expressing the current affine transformation of any geometric object. More recent graphics hardware and software (e.g. OpenGL: Woo 1999) adds the ability to stack previous active transformation matrices, allowing one to revert to a previous coordinate system during model building – e.g. the sun, with an orbiting planet, itself with an orbiting moon. A further development was the “Scene Graph” (Strauss and Carey 1992; also Rohlf and Helman 1994) which took this hierarchical description of the coordinate systems of each object and built a tree structure that could be traversed from the root, calculating the updated transformation matrix of each object in turn, and sending the transformed coordinates to the graphics output system. While other operations may be built into the scene graph, its greatest value is in the representation of a hierarchy of coordinate systems. These coordinate systems are applied to all graphic objects: to geometric objects, such as planets, or to cameras and lights, which may therefore be associated with any geometric object in the simulated world. Such a system allows the population of the simulated world with available graphic objects, including geometric objects, lights and cameras (windows, or observers). An object is taken from storage, using its own particular coordinate system, and then placed within the world using the necessary translation and rotation. If it was created at a different scale, or in different units, then an initial matrix is given expressing this in terms of the target world coordinates. Geometric objects may be isolated objects built with a CAD type modelling system or they may be terrain meshes or grids – which may require some initial transformation to give the desired vertical exaggeration. In most cases world viewing is achieved by traversing the complete tree, and drawing each object in turn, after determining the appropriate camera transformation for each window. Usually an initial default camera, and default lighting, is applied so that the model may be seen! We have implemented a general-purpose, scene graph based viewer for our particular needs, called “GeoScene”. The heart of GeoScene is the Graphic Object Tree, or scene graph. This manages the spatial (coordinate) relationships between graphic objects.
These graphic objects may be drawable (such as houses, boats and triangulated surfaces) or non-drawable (cameras and lights). The basis of the tree is that objects can be arranged in a hierarchy, with geometric transformations expressing the position and orientation of an object with respect to its parent object – for example the position of a wheel on a car, a light in a lighthouse, or a camera on a boat. The efficiency of this method rests on the concatenation of transformation matrices using Blinn’s homogeneous coordinates.

Redrawing the whole simulated world involves starting at the root of the tree, incorporating the transformation matrix, and drawing the object at that node (if any); this is repeated down the whole tree. Prior to this the camera position and orientation must be calculated, again by running down the tree and accumulating the transformation matrix until the selected camera is reached. This can be repeated for several cameras in several windows. The process must be repeated after every event that requires a modified view: such events may be generated by window resizing or redrawing by the system, by user actions with the keyboard or mouse, or by automated operations such as ship movements.
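To make the traversal described above concrete, the following is a minimal sketch of a scene-graph node that accumulates 4x4 homogeneous transforms down the tree; the class and method names are ours for illustration only and do not reflect GeoScene’s actual API.

```python
import numpy as np

class SceneNode:
    """A node in a simple scene graph: a local 4x4 transform, an optional
    drawable payload, and child nodes (e.g. a camera attached to a boat)."""
    def __init__(self, name, local_transform=None, drawable=None):
        self.name = name
        self.local = np.eye(4) if local_transform is None else local_transform
        self.drawable = drawable          # e.g. a mesh, or None for cameras and lights
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

def translation(tx, ty, tz):
    m = np.eye(4)
    m[:3, 3] = [tx, ty, tz]
    return m

def draw(node, parent_world=np.eye(4)):
    """Traverse from the root, concatenating transforms (Blinn-style
    homogeneous coordinates) and 'drawing' each drawable node."""
    world = parent_world @ node.local
    if node.drawable is not None:
        # A real viewer would send the transformed vertices to the graphics
        # pipeline; here we simply print the node's world-space origin.
        origin = world @ np.array([0.0, 0.0, 0.0, 1.0])
        print(f"draw {node.name} at {origin[:3]}")
    for child in node.children:
        draw(child, world)

# Example: a boat carrying a camera; moving the boat moves the camera with it.
root = SceneNode("world")
boat = root.add(SceneNode("boat", translation(100, 50, 0), drawable="hull mesh"))
boat.add(SceneNode("camera", translation(0, 0, 10)))   # non-drawable node
draw(root)
```

In a real viewer the accumulated camera transform would be inverted to obtain the view matrix for each window, rather than printed.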
3 Action and Interaction: System Design

The system described so far creates and views the world from different perspectives. It is designed for full 3D surface modelling, where the surface is defined by a collection of unrelated triangles. While it is often desirable for other operations to preserve the topological connections between the components (vertices, edges, faces) of an object, this is not necessary for basic visualization.

User actions with the mouse or keyboard can have different meanings in different applications, or even in different modes of operation. This means that a “Manipulator” component needs to be written or modified for each application. We have developed a fairly consistent and effective mapping between wheel-mouse operations and navigation within the 3D world (where a camera moves and gives the observer’s view as a result of his operations, or else where the camera is located on a moving boat). These and other actions may be modified as necessary for the application or operation mode. The two main modes are when user gestures move the observer’s position (a left mouse movement means “go left”) or when the gesture means “move the selected object left”, in which case the observer appears to move right. Selection is performed by GeoScene, which calls the Manipulator to select the action, and the scene is redrawn. We have been quite successful in mapping the wheel of a wheel mouse to depth into the screen for many applications.
In other cases, object selection is followed by a query of its properties from the database.

These actions may be automated by using the “Animator” component, which runs in a separate thread so as to preserve the timing. This operates on objects that need to change over time: either a value must change (e.g. for a lighthouse), or else a transformation matrix must be updated for object movement. Examples are the movement of a boat, the rotation of a lighthouse beam, or the movement of a camera in a fly-through.

In the real world, objects may not occupy the same location at the same time. There is no built-in prohibition of this within the usual graphics engine, nor in GeoScene; particular applications (e.g. the Marine GIS) implement their own data structures for collision detection and topology where necessary. Modification of an object, as done in CAD systems, requires the selection of a portion of the object (usually a vertex, edge or face), and in b-rep systems these are part of a topological model of the object surface (Mantyla 1988). These operations are specific to the application using GeoScene.
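As a rough illustration of the mode-dependent gesture mapping described above, the sketch below shows how a Manipulator-style component might route the same wheel-mouse gesture either to the observer or to the selected object; the names and the simple additive updates are our assumptions, not GeoScene code.

```python
from enum import Enum

class Mode(Enum):
    MOVE_OBSERVER = 1   # gesture moves the camera ("go left")
    MOVE_OBJECT = 2     # gesture moves the selected object (view appears to shift right)

def handle_gesture(mode, dx, dy, wheel, camera_pos, object_pos):
    """Map a wheel-mouse gesture onto either the observer or the selected
    object; the wheel controls depth into the screen in both modes."""
    if mode is Mode.MOVE_OBSERVER:
        camera_pos[0] += dx
        camera_pos[1] += dy
        camera_pos[2] += wheel          # wheel: move the camera into the scene
    else:
        object_pos[0] += dx
        object_pos[1] += dy
        object_pos[2] += wheel          # wheel: push the selected object away
    return camera_pos, object_pos

# A left gesture (dx = -1) in observer mode moves the camera left;
# the same gesture in object mode moves the selected object left instead.
cam, obj = [0.0, 0.0, 0.0], [5.0, 0.0, 0.0]
handle_gesture(Mode.MOVE_OBSERVER, -1.0, 0.0, 0.0, cam, obj)
handle_gesture(Mode.MOVE_OBJECT, -1.0, 0.0, 0.0, cam, obj)
print(cam, obj)
```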
4 A Pilot Application – the “Pilot Book”

Perhaps the ultimate example of a graphics-free description of a simulated real world is the Pilot Book, prepared according to international hydrographic standards to define navigation procedures for manoeuvring in major ports (UK Hydrographic Office 2001). It is entirely text-based, and includes descriptions of shipping channels, anchorages, obstacles, buoys, lighthouses and other aids to navigation. While a local pilot would be familiar with much of it, a foreign navigator would have to study it carefully before arrival, and in many places the approach would vary depending upon the state of the tides and currents. It was suggested that a 3D visualization would be an advantage in planning the harbour entry. While it might be possible to add some features to existing software, it appeared more appropriate to develop our own 3D framework.

Ford (2002) demonstrated the idea of 3D navigational charts, a hybrid of different geo-data sources such as satellite imagery, paper chart capture and triangulated irregular network data visualized in 3D. That project concluded that 3D visualization of chart data had the potential to be an information and decision support tool for reducing vessel navigational risks. Our intention was to adapt IHO S-57 Standard Electronic Navigation Charts (International Hydrographic Bureau 2000, 2001) for 3D visualization. The study area was Hong Kong’s East Lamma Channel.
Fig. 1: MGIS Model - Graphic User Interface
5 The Marine GIS

Our own work, based on the ongoing development of GeoScene, was to take the virtual world manipulation tools already developed and add those features that are specific to marine navigation. As GeoScene contains no topology or collision detection mechanism, in the Marine GIS application we use the kinetic Voronoi diagram (Roos 1993) as a collision detection mechanism in two dimensions on the sea surface, so that ships may detect potential collisions with the shoreline and with each other. Shoreline points are calculated from the intersection of the triangulated terrain with the sea surface, which may be changed at any time. In addition, marine features identified in the IHO S-57 standard were incorporated. Fig. 1 illustrates the system.

On top of GeoScene, a general “S57Object” class was created, with sub-classes for each defined S-57 object. These include navigational buoys, navigational lights, soundings, depth contours, anchorage areas, pilot boarding stations, radio calling-in points, mooring buoys, traffic separation scheme lanes, traffic
scheme boundaries, traffic separation lines, precautionary areas, fairways, restricted areas, wrecks, and underwater rocks. See Figs. 2-7 for examples.
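As an illustration of the shoreline computation mentioned above (intersecting the triangulated terrain with the sea surface), the following sketch interpolates the points where triangle edges cross a horizontal sea level. It is a simplified stand-in written by us; it does not implement the kinetic Voronoi collision detection of Roos (1993), and in a full mesh shared edges would yield duplicate points that a real implementation would merge.

```python
import numpy as np

def shoreline_points(vertices, triangles, sea_level=0.0):
    """Return the points where triangle edges cross the horizontal plane
    z = sea_level, by linear interpolation along each crossing edge."""
    pts = []
    for tri in triangles:
        for a, b in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            za = vertices[a][2] - sea_level
            zb = vertices[b][2] - sea_level
            if za * zb < 0:                       # the edge crosses the sea surface
                t = za / (za - zb)                # interpolation parameter in (0, 1)
                p = (1 - t) * np.asarray(vertices[a]) + t * np.asarray(vertices[b])
                pts.append(p)
    return np.array(pts)

# Toy terrain: one triangle dipping below sea level at a single vertex.
verts = [(0.0, 0.0, 5.0), (10.0, 0.0, 3.0), (5.0, 8.0, -2.0)]
tris = [(0, 1, 2)]
print(shoreline_points(verts, tris, sea_level=0.0))
```

Because the sea surface height can be changed at any time, recomputing the crossing points for a new `sea_level` immediately yields an updated shoreline.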
Fig. 2: Visualization of Navigational Buoy and Lights
Fig. 3: Visualization of Safety contour with vertical extension
Other objects include ship models, sea area labels, 3DS models, range rings and bearing lines, and oil spill trajectory simulation results. Various query functions were implemented, allowing for example the tabulation of all buoys in an area or the selection of a particular buoy to determine its details. Selecting “Focus” for any buoy in the table moves the window viewpoint to that buoy.
Fig. 4: Visualization of Sea Area Label Using 3D fonts
Safety contours may be displayed along the fairways, and a 3D curtain display emphasizes the safe channel. Fog and night settings may be specified, to indicate the visibility of various lights and buoys under those conditions. Safety contours and control markers may appear illuminated if desired, to aid navigation. The result is a functional 3D chart capable of giving a realistic view of the navigation hazards and regulations.
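A minimal sketch of how a “curtain” display such as the one described above could be built, by extruding a safety-contour polyline into vertical quads between the sea surface and a chosen depth; the geometry handling in the actual Marine GIS may differ.

```python
def contour_curtain(polyline_xy, top_z=0.0, bottom_z=-15.0):
    """Extrude a 2D contour polyline into a vertical curtain: one quad per
    segment, spanning from top_z (the sea surface) down to bottom_z."""
    quads = []
    for (x1, y1), (x2, y2) in zip(polyline_xy[:-1], polyline_xy[1:]):
        top1, top2 = (x1, y1, top_z), (x2, y2, top_z)
        bot1, bot2 = (x1, y1, bottom_z), (x2, y2, bottom_z)
        quads.append((top1, top2, bot2, bot1))    # counter-clockwise quad
    return quads

# A short stretch of safety contour along a fairway (coordinates are arbitrary).
curtain = contour_curtain([(0, 0), (50, 10), (100, 15)], top_z=0.0, bottom_z=-10.0)
print(len(curtain), "quads")
```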
Fig. 5: Visualization of Oil Trajectory Record
6 Other Applications

While Marine GIS is the most developed application using GeoScene, a variety of other uses have been developed. Our work on terrain modeling
Fig. 6: Lists of Navigational Buoys
Fig. 7: MGIS Model – Scene of Navigational Mode. When animation mode is activated, the viewpoint will follow the movement of the ship model. The movement of the vessel could be controlled by using mouse clicks.
uses the same interface for 3D visualization, and runoff modeling (using finite difference methods over Voronoi cells rather than a grid) also requires GeoScene for 3D visualization (Fig. 8). Applications are under development for interactive landscape modification, using a “knife” to sculpt the triangulated terrain model.
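The runoff modelling mentioned above uses finite difference methods over Voronoi cells; the sketch below shows one highly simplified explicit update over a cell adjacency graph, with invented field names, purely to illustrate the idea of routing water between neighbouring cells rather than grid cells. It is not the authors’ model.

```python
def runoff_step(water, elevation, neighbours, dt=0.1, k=0.5):
    """One explicit step of surface runoff over Voronoi cells: each cell
    passes water to lower-lying neighbouring cells in proportion to the
    head difference (a crude simplification of real flow routing)."""
    new_water = dict(water)
    for cell, depth in water.items():
        head_c = elevation[cell] + depth
        for nb in neighbours[cell]:
            head_n = elevation[nb] + water[nb]
            if head_c > head_n:
                flux = min(depth, k * dt * (head_c - head_n))
                new_water[cell] -= flux
                new_water[nb] += flux
                depth = new_water[cell]       # never give away more than is available
    return new_water

# Three cells in a row, sloping downhill from cell 0 to cell 2.
elev = {0: 10.0, 1: 8.0, 2: 5.0}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
w = {0: 1.0, 1: 0.0, 2: 0.0}
for _ in range(5):
    w = runoff_step(w, elev, nbrs)
print(w)
```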
Fig. 8: Surface runoff modelling based on Voronoi cells.
Recent work on the development of new 3D spatial data structures also uses GeoScene as a visualization tool, and a preliminary demo to show an underground piping layout has been prepared – requiring only a few lines of code to be added to the GeoScene units to perform simple visualization. We believe that an available, operational toolkit for the creation, visualization, query, animation and modification of simulated worlds is of great potential benefit in the geosciences. In our simulation of the real world, we no longer need to think of 2D static visualization, because the tools exist to provide more realistic results. There should be no real reason to work in two dimensions, without time, unless it is clearly sufficient for the application.
7 Conclusions

Since the objective of this exercise is one of improved visualization in a simulated real or modified world, it is difficult to evaluate the results directly. Some of our experiences, however, can be summarized.

We started our work with extensive discussions and human simulations of the modes of operation: given the wheel mouse as our hardware limit,
we developed gestures that were universally accepted by (young) game players and (old) professors as a mechanism for manipulating the hierarchical observer/world relationships. Game experts and non-experts alike could learn to manoeuvre in five minutes. This also led to rediscovering the scene graph as a hierarchy of transformations, allowing any level of cartographic rescaling and object/sub-object repositioning – including locating observers on any desired object. This was the framework of GeoScene.

We also found that intuitive interaction depended on the observer/world scale: gesturing to the left if the observer was climbing a cliff implied that the actor moved left, but if the observer was holding an object (e.g. in a CAD modelling system) the gesture implied that the object moved left, giving a relative observer movement to the right. These are usually separate applications, but our global geographic intentions included landscape modelling as well as viewing, so different operating modes were necessary to avoid operator disorientation. This formed the basis of the Manipulator module, where the same gesture had to be mapped to different actions depending on the intention.
Fig. 9: Prototype “Pipes” application.
Relative success with the early marine applications emphasized the generality of the viewing/manipulation system, and forced a redesign separating GeoScene from the specific application. Other applications included 3D terrain and flow modelling projects and 3D volumetric modelling of oceanographic data, which were standardized on GeoScene, as well as some surprising offshoots: a trial sub-road utility modelling system was developed in one day, for example, to indicate potential collisions between various pipes, cables and manholes (Fig. 9). Space precludes more illustrations.
Finally, the Marine GIS was enhanced by the inclusion of real features and marine aids, and by improved (perhaps new) 3D cartographic symbolism. Interest is being expressed by the HK and UK governments, and private industry. Our preliminary conclusions are that many applications exist for 3D dynamic modeling of real world situations, if simple enough tools exist. We hope that our work is a contribution to this.
Acknowledgements We thank the Hong Kong Research Grants Council for supporting this research (project PolyU 5068/00E).
References

Blinn JF (1977) A homogeneous formulation for lines in 3-space. Computer Graphics 11:2, pp 237-241
Foley JD, van Dam A, Feiner SK and Hughes JF (1990) Computer graphics, principles and practice, second edition. Addison-Wesley, Reading, Massachusetts
Ford SF (2002) The first three-dimensional nautical chart. In: Wright D (ed) Undersea with GIS. ESRI Press, pp 117-138
Greeley A (1986) God game. Warner Books, New York
International Hydrographic Bureau (2000) IHO transfer standard for digital hydrographic data, edition 3.0. Special publication No. 57
International Hydrographic Bureau (2001) Regulations of the IHO for international (INT) charts and chart specifications of the IHO
Mantyla M (1988) An introduction to solid modelling. Computer Science Press, College Park, MD
Rohlf J and Helman J (1994) IRIS Performer: a high performance multiprocessing toolkit for real-time 3D graphics. Proceedings of SIGGRAPH 94, pp 381-395
Roos T (1993) Voronoi diagrams over dynamic scenes. Discrete Applied Mathematics 43:3, pp 243-259
Strauss PS and Carey R (1992) An object-oriented 3D graphics toolkit. In: Catmull EE (ed) Computer Graphics (SIGGRAPH ’92 Proceedings), pp 341-349
The United Kingdom Hydrographic Office (2001) Admiralty sailing directions – China Sea Pilot volume I, fifth edition. United Kingdom National Hydrographer, Taunton
Woo M (1999) OpenGL programming guide: the official guide to learning OpenGL, version 1.2. Addison-Wesley
Local Knowledge Doesn’t Grow on Trees: Community-Integrated Geographic Information Systems and Rural Community Self-Definition

Gregory Elmes1, Michael Dougherty2, Hallie Challig1, Wilbert Karigomba1, Brent McCusker1, and Daniel Weiner1

1 Department of Geology and Geography, PO Box 6300, West Virginia University, Morgantown, WV 26506-6300, USA
2 WVU Extension Service, PO Box 6108, Morgantown, WV 26506-6108, USA
Abstract

The Appalachian-Southern Africa Research and Development Collaboratory (ASARD) seeks to explore the integration of community decision-making with GIS across cultures. Combining geospatial data with local knowledge and the active participation of the community creates a Community-Integrated Geographic Information System (CIGIS) representing and valuing themes related to community and economic development. The intent is to integrate traditional GIS with the decision-making regime of local people and authorities to assist them in making informed choices and to increase local participation in land use planning, especially within economically disadvantaged communities.

Keywords: GIS and Society, Participatory GIS, Community-Integrated GIS, Local Knowledge
1 Introduction

This paper addresses how a community defines and characterizes itself and how such information might be included within a GIS to assist the decision process associated with local development projects. The first question is simple: where does geographical information about a community reside? Is it in the property books? In the maps and charts? The legal deeds? The land surveys? Or does it reside with the people? Clearly it is all of these
and more, but the local geographical knowledge of residents is often omitted or is the last to be considered in development planning. For development planning in West Virginia, geographical information about communities is frequently reduced to the standard layers of digital geospatial data.

GIS typically include five basic components: people, data, procedures, hardware, and software, which are designed to analyze and display information associated with parcels of land, places or communities. Conventional GIS handle information only from formal sources. Thus GIS is seen to represent the world through an “objective” lens by presenting official geospatial data and statistics. This function is important and meaningful, but it is incomplete: descriptions and understanding of the community by the people of that community are too often missing. Local knowledge can, however, be conveyed along with the standard GIS layers, and incorporating it enhances the ability of the system to serve as a more effective platform for communication and debate. No longer does the GIS report only on how experts and outsiders define the community; it may also include what those who live in the community and make the landscape alive “know” and even “feel” about that land. A Community-Integrated Geographic Information System (CIGIS) tries to accomplish this task by assimilating self-definitions and human experience of a place. A CIGIS is a hybrid of the formal and the familiar: the statistical data are important, but so are the informal perceptions, images, and stories of the community’s people. All of this information is necessary for a fuller understanding of place.

This paper examines the concepts of CIGIS in a case study of the Scotts Run community in Monongalia County, West Virginia (USA), one of three West Virginia University sites for the Appalachian-Southern Africa Research and Development Collaboratory (ASARD). The evolving process of CIGIS has guided the initial fieldwork over the last two years and continues to be used to modify the research plan. The initial sections of this paper discuss the derivation of CIGIS concepts within the objectives of ASARD. After a foundation for discussion has been established, the development of the Scotts Run project is examined in detail and the results to date critiqued.

2 Community-Integrated GIS

Advances in GIS from its rudimentary origins to its present state have reflected advances in the associated technology. Significantly for this research, qualitative information can now be incorporated into a system that heretofore has been dominated by quantitative spatial data.
Successful integration of public input and qualitative data has resulted in participatory GIS (Craig et al. 2002). Some participatory GIS undertakings have sought to improve access to or understanding of data. A joint mapping venture helped provide a better understanding of the Herbert River catchment area in northeastern Queensland (Walker et al. 1998). In another setting, the ability of GIS to examine various scenarios and outcomes was the focal point of public visioning sessions used to help develop plans for the Lower West Side of Buffalo, N.Y. (Krygier 1998). The inclusion of audio and video recordings along with geovisualization in traditional GIS has been advocated to improve public participation in design and planning efforts (Howard 1998).

The NCGIA Research Initiative 19 provided several roots for CIGIS (Weiner et al. 1996). The report “GIS and Society: The social implications of how people, space, and environment are represented in GIS” proposed new methods for including and representing qualitative spatial data in GIS. In 2001, the European Science Foundation and the National Science Foundation sponsored a workshop on access and participatory approaches in using geographic information, documenting progress since the NCGIA initiative (Onsrud and Craglia 2003). This focus on fuzzy, ambiguous and intangible spatial data was expanded during studies of the contentious issues of land reform in South Africa (Weiner and Harris 2003).

CIGIS extends the capacity of the “expert” technology of GIS to include people and communities normally on the periphery with respect to politics and spatial decision-making, by incorporating local knowledge in a variety of forms – such as mental maps, images, and multimedia presentations – alongside the customary geometric and attribute data. As a result, CIGIS can pose questions that local community participants consider important and broaden access to digital spatial technology in the self-determination of their future. Local knowledge differentiates and values aspects of the landscape that are deemed socially important. It helps a community develop its own definition based upon its own knowledge, alongside the more formal information generally available to official bodies.
3 CIGIS and ASARD

The creation of ASARD in 1999 expanded the effort to broaden the participatory scope of GIS activities. The project involves West Virginia University, the University of Pretoria in South Africa, and the Catholic University of Mozambique. The international research collaboration connects
Appalachia with Southern Africa, seemingly distinct regions which, in spite of their evident differences, share some common problems, including development at the rural-urban interface, limited access to social and economic resources, external ownership or control of the land, and a lack of influence by local residents on most development-related decisions. Thus, each site aims to use CIGIS to investigate spatial aspects of local and regional land-use decisions, and patterns of uneven development (for more information, see A-SARD Web Site, undated; Objectives).

3.1 ASARD in Scotts Run

In West Virginia, the focus of the work has been the Scotts Run watershed of northern Monongalia County. The study area, defined as the ‘Scotts Run Transect’, extends west from the Monongahela River to the Cass and New Hill settlements, and north to the Pennsylvania state line (see http://www.up.ac.za/academic/centre-environmentalstudies/Asard/mapsUWV.htm for a study area map). The collaboratory’s principal effort has been an examination of land use and natural resources issues in the context of social and, particularly, physical development. Local knowledge of the particulars of land, property ownership, tenure, traditional land uses and accessibility characteristics has been acquired and used in the construction of a CIGIS, which includes a strong (re)emphasis on the human component of GIS. Residents are integrated as local experts; ideally they contribute data and shape the objectives of inquiry and analysis.

Founded as a mining camp area across the Monongahela River from Morgantown WV, Scotts Run dates from the early 20th century (Ross 1994). While coal was king, the mining community became racially and ethnically diverse as the demand for labor outstripped the local supply. The community has been exceptionally impoverished from the 1920s onwards, though it has had some spells of relative prosperity, contingent on the state of the national economy and the demand for coal. As part of a social movement, the Methodist Church opened a Settlement House in 1922 that, even today, is much in use in the provision of basic necessities. One year later, the Presbyterian Church established a mission in the area, commonly referred to as “The Shack”, which continues to serve the needs of the poor and immigrant mining population (Lewis 2002). These centers provided foci from which the research team was able to introduce itself into the community.

CIGIS studies in the area began with orientation, familiarization tours around the vicinity, and windshield surveys. Team members walked and
drove through the communities, visited the few remaining business establishments, and spoke with people on the streets. During these visits, an informal local leader was identified – Mr. Al Anderson, the chair of the Osage Public Service District (PSD) and sometime seeker of public office. As a shoe repair shop operator, a gospel singer, and a youth club counselor, Mr. Anderson was well connected and respected in the community and became the primary contact for the West Virginia ASARD team. During several visits, Mr. Anderson has provided the team with local insight, copies of the local newsletter, The Compass, and connections to other residents. His role as champion for CIGIS has become apparent.

Discussions with Mr. Anderson and others have led to the identification of sources and the collection of local knowledge. A contentious plan for sewer installation, the first in the district, will have profound effects on the social and environmental characteristics of the area. The $8.2 million project began in November 2003. Service is to be established along the main roads within one year and on the secondary roads within five years, and will serve approximately 1,000 households when completed (Henline 2003). From the researchers’ viewpoint in the university community across the river, sewerage service seemed not only essential but uncontroversial. The plain need for the sewer system was reinforced as team members were shown, and mapped, the unofficial discharge points of household “sewers” directly into Scotts Run – which can be a dry stream bed in the summer time.

Yet within the Scotts Run community this seemingly indispensable sewer project has caused deep concern and even opposition. Mr. Anderson helped organize the initial public meeting in June 2001, where the team learned of the economic difficulties many residents face in meeting the cost of sewer connections and the regular sewerage charges. Physical displacement is also a possibility. In some localities, the majority of residents are renters whose future tenure has been destabilized by the sewer project. County tax records in the CIGIS confirm the extremely high proportion of absentee landlords. The concentration of land ownership means that a few families stand to gain through the infrastructure improvements. Land-holding corporations own coal-producing lands that are unlikely to be mined, a large proportion of the area; these represent developable land. Further development pressures are being brought to bear by the construction of a new bridge and multi-lane highway access to Morgantown, the county’s largest urbanized area. The residents of Scotts Run fear that the land owners are in a position to control the development that will follow the sewer.

A GIS was constructed with available geospatial data. Existing data included USGS 1:24000 DLG framework layers, a 30 meter DEM, 1:24000 digital orthophotographs (1997), and various SPOT and Landsat-TM images classified for land cover. Detailed sewer plans were obtained from the
PSD. Unfortunately the Monongalia County tax assessor’s office still produces manual parcel (cadastral) maps. The WV GIS Technical Center had created digital versions of the tax maps for a mineral lands assessment project in 1999, but these had no formal ground control and were extremely difficult to georegister with other layers. The lack of accurate digital cadastral data, and a similar lack of current land use data, indicated which spatial data should be acquired first. Eventually aerial photographs were retrieved from state and university archives for three intervals: 1930, 1957, and 1970. These photographs have been rectified, georegistered and mosaicked, and a time-series of land cover change has been created, but much work remains in establishing a satisfactory and consistent classification of land use. Large scale maps of the Scotts Run Transect have been presented to Mr. Anderson for use in his leadership role with the Osage PSD on the area sewer project.

In-depth interviews were conducted for a random sample of 57 households in the study area in June and July 2002. Survey respondents were almost equally divided between newcomers and long-time residents. Most respondents revealed that there was no single shared issue; they indicated that they were near coal mines, that houses in the area were run down, and that flooding was a recurrent problem. Despite half reporting income levels of $20,000 or less, a large majority of respondents said they did not receive any financial assistance. Incomes were much below the median household income for the state or the nation ($29,696 and $41,994 respectively; U.S. Census Bureau 2003). More than 10 percent of the respondents were African-American, higher than the overall minority population for the county or the state (7.8 percent and 5.0 percent respectively; U.S. Census Bureau 2003). Additionally, about 64 percent of respondents owned their residence, which, although high for an impoverished area, is below the national and state averages (66.2 percent and 75.2 percent respectively; U.S. Census Bureau 2003), reinforcing observations made to team members during meetings and informal discussions related to property ownership and the eventual financial beneficiaries of the sewer project in the area.

Progress on the project continues to focus on completing the critical layers and attributes of the conventional GIS database. The 2000 Census of Population and Housing data have been loaded. Ancillary information includes hydrologic, water quality, and mining information, as well as locationally-referenced multimedia presentations. The survey results and narratives are also incorporated into the database to build a geographic representation of place in the community. Maintaining the confidentiality of individual survey records is proving to be a challenge, and appropriate geographic masking techniques are required (a simple example is sketched at the end of this section). Finally, and most importantly,
a further series of public meetings is necessary to engage the population more deeply in the design, refine the initial feedback, and extend the use of the CIGIS. While we have gathered initial quantitative and qualitative information, it remains necessary to find acceptable means through which the citizens at large can access and handle the spatial data for their own ends.
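As a simple example of the geographic masking requirement noted above, the sketch below displaces each survey point by a random distance within a bounded annulus; this is one common masking technique and not necessarily the method the project will adopt.

```python
import math
import random

def mask_point(x, y, min_radius=100.0, max_radius=500.0, rng=random):
    """Displace a survey point by a random distance (between min_radius and
    max_radius, in map units) in a random direction, so a household cannot
    be pinpointed while the neighbourhood-scale pattern is preserved."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    distance = rng.uniform(min_radius, max_radius)
    return x + distance * math.cos(angle), y + distance * math.sin(angle)

# Example: mask a household location given in projected coordinates (metres).
random.seed(42)                       # seeded only to make the example reproducible
print(mask_point(589_300.0, 4_390_750.0))
```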
4 CIGIS and Development Issues

Up to now the CIGIS work of the ASARD project has helped enhance the development debate in Scotts Run in several ways. First, it has provided a complete, large-scale, updateable, seamless digital map of the area, which can be reproduced for customized needs. The Osage PSD now has an alternative source of information and does not have to rely on that of government institutions, developers, and land owners who might place their own interests above those of the community and its residents. The PSD did not have to use its limited resources to acquire this information. Second, the project continues to provide background information, such as census, land tenure and land cover data, to the group. This information helps augment the base knowledge of the residents. Furthermore, the GIS products have the ability to bring people together to discuss matters related to their community and their own vision of its development.

This capacity to stir community involvement will be augmented through a new series of public meetings. The project should be able to build on the momentum of interest generated by the sewer project to become more broadly focused on community development issues. These include the identification and spatial delineation by the community of characteristics that would enhance or detract from it, the types of development that would be beneficial or detrimental to the area, and how acceptable development could be tailored to minimize costs and maximize benefits to community residents.

From the point of view of incorporating and sharing local knowledge, the team recognizes that there remains much to be done. While it has been relatively easy to assimilate a range of qualitative information through ‘hot links’, multimedia, and the like, the power of updating and maintaining the data currently resides with the researchers and not within the community. Several technical options for distributed GIS operation are now available through Internet mapping, such as Map Notes in ESRI’s ArcIMS™. But access to the Internet in Scotts Run is constrained to low bandwidth telephone modems in schools and community buildings, such as the Settlement House and the Shack. CIGIS must place acceptable means of access
and control in the hands of residents. Even tools such as ArcReader™ and similar freeware viewers are unlikely to overcome the lack of access to, and relative unfamiliarity with, computers and software. Since economic and socio-cultural conditions are unlikely to support the mass adoption of broadband Internet access in the area, solutions are more likely to be found in increased social interaction and low end technology.
5 Project Transferability and Related Lessons

So long as the data and technical requirements are met, there is a high degree of project transferability. In other words, sufficient digital geospatial data to create a standard GIS, along with the hardware, software, and technical expertise, have to exist in a community. More critical is the ability to gain entrance into a community and to be able to engage public opinion and sentiment. University researchers are always “others”, especially where social, cultural, and economic characteristics mark us as different. A further intangible element is the level of desire of the local population to exercise greater control over the future of their community. Several lessons have been learned during this process on ways to improve the use of CIGIS.

The first and perhaps the most important lesson is that there needs to be a project plan in place before starting. Clearly, the research cannot be so academic in its objectives that the results will be of no practical value to the community. A team cannot go into a community and say “We are from the university and we are here to help you.” That does not mean the plan cannot change. In the Scotts Run study, for example, health issues have emerged as less central than originally anticipated, principally because the residents revealed more urgent priorities. Meanwhile, land tenure and access to natural resources, such as game lands, have proved to be more important than originally anticipated.

A second lesson is that there should be a key informant or champion who provides a liaison for the research team. This person should have the respect and the trust of the community and be able to lend credibility to the CIGIS through that reputation. In the Scotts Run study, the key informant was identified very early on and has helped frame the local situation for the team and assisted in involving the community in the process.

A third lesson is that there must be others besides the key informant working to ensure project success. As a major participant in the project, a key informant may overemphasize a personal perspective, unintentionally or otherwise. Key informants may quickly become dominant actors and shield the research team from other candidate participants. To ensure balance it is necessary to
involve additional members of the community. With respect to the Scotts Run study, the team placed too much initial reliance on a single contact, and efforts to identify secondary contacts have so far met with limited success.

A fourth lesson is that contact, once made, must be persistent. Not communicating with community members on a frequent basis can have a devastating impact on momentum and trustworthiness. If the team is not visible in the community on a consistent basis, people may believe that it is there only for its own purposes. In Scotts Run, contact between the team and the community occurred more frequently at the beginning of the project, a natural outcome of the need for initial links. The need to add members to the research team and bring them “up to speed” has led to some unforeseen delays and sporadic communication.

A fifth lesson is that community members should be engaged in multiple ways as the local knowledge is being developed. Just as people learn in different ways, they communicate in different ways. Not everyone feels comfortable speaking in public, talking to strangers, being videotaped or photographed, or writing responses to questions. In Scotts Run there have been public meetings, informal conversations and discussions, photography, and sketch mapping, as well as the formal survey instrument, as varied means of gaining information and insight into the area.
6 Conclusions

This project shows the potential of using CIGIS in community development. The Scotts Run CIGIS can take the next step as it adds the dimension of control, incorporating perceptions and local spatial knowledge into a database and using them as a tool for the community to define and examine itself and its values. So far, the Scotts Run undertaking has succeeded to the extent of bringing a variety of geographically-referenced information to a greater number of people. It has also shown the ability to create a powerful tool for examining the community and for representing what residents think about the potential changes brought about by development.

But the project is a long way from becoming a CIGIS. Technological barriers exist that present hurdles to improved community participation and self-management. Even after two years, the research team is not yet fully integrated within the community, and the community does not yet have sufficient access to suitable computer technology to activate its agenda independently. Basic questions regarding the spatial nature and representation of specific local knowledge remain to be investigated. Two further studies of the nature,
acquisition and inclusion of local knowledge are underway. It is anticipated that these additional studies will yield the additional information necessary to enable CIGIS to help residents direct development activities and opportunities in their community.
References

A-SARD Web Site (Undated) A-SARD research at a glance. www.up.ac.za/academic/centre-environmental-studies/Asard/research.htm, accessed May 15, 2004
Craig WJ, Harris TM, Weiner D (eds) (2002) Community participation and geographic information systems. Taylor and Francis, London
Dougherty M, Weiner D (2001) Applications of community integrated geographic information systems in Appalachia: a case study. Presented at the Appalachian Studies Association Annual Conference, March 30-April 1, 2001, Snowshoe, West Virginia
Haywood I, Cornelius S, Carver S (1998) An introduction to geographic information systems. Pearson Education Inc, New York
Henline J (2003) Scotts Run residents look forward to sewer service: project needs 20 more takers for green light. The Dominion Post, June 25, 2003. www.dominionpost.com/a/news/2003/06/25/ak/, accessed July 8, 2003
Howard D (1998) Geographic information technologies and community planning: spatial empowerment and public participation. Presented at the Project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/howard.html, accessed May 15, 2004
Krygier JB (1998) The praxis of public participation GIS and visualization. Presented at the Project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/krygier.html, accessed July 1, 2003
Lewis AL (2002) Scotts Run: an introduction. Scott’s Run writing heritage project website. www.as.wvu.edu/~srsh/lewis_2.html, accessed July 8, 2003
Mark DM, Chrisman N, Frank AU, McHaffie PH, Pickles J (1997) The GIS history project. Presented at the UCGIS summer assembly, Bar Harbor, Maine, June 1997. http://www.geog.buffalo.edu/ncgia/gishist/bar_harbor.html, accessed May 15, 2004
Miller HJ (in press) What about people in geographic information science? In: Fisher P, Unwin D (eds) Re-Presenting Geographic Information Systems. John Wiley, London
Onsrud HJ, Craglia M (eds) (2003) Introduction to the second special issue on access and participatory approaches in using geographic information. URISA Journal 15:1. http://www.urisa.org/Journal/APANo2/onsrud.pdf, accessed May 15, 2004
Ross P (1994) The Scotts Run coalfield from the Great War to the Great Depression: a study in over development. West Virginia History 54: 21-42. www.wvculture.org/history/journal_wvh/wvh53-3.html, accessed July 8, 2003
US Census Bureau (2003) West Virginia: state and county quick facts. quickfacts.census.gov/qfd/states/54000.html; quickfacts.census.gov/qfd/states/54/54061.html, accessed July 8, 2003
Walker DH, Johnson AKL, Cottrell A, O’Brien A, Cowell SG, Puller D (1998) GIS through community-based collaborative joint venture: an examination of impacts in rural Australia. Presented at the Project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/walker_d/walker.html, accessed July 1, 2003
Weiner D, Harris TM (2003) Community-integrated GIS for land reform in South Africa. URISA Journal (on-line). www.urisa.org/Journal/accepted/2PPGIS/weiner/community_integrated_gis_for_land_reform.htm, accessed May 15, 2004
Weiner D, Harris TM, Burkhart PK, Pickles J (1996) Local knowledge, multiple realities, and the production of geographic information: a case study of the Kanawha Valley, West Virginia. NCGIA Initiative 19 paper. www.geo.wvu.edu/i19/research/local.htm, accessed July 1, 2003
A Flexible Competitive Neural Network for Eliciting User’s Preferences in Web Urban Spaces

Yanwu Yang and Christophe Claramunt

Naval Academy Research Institute, BP 600, 29240 Brest Naval, France. Email: {yang, claramunt}@ecole-navale.fr
Abstract: Preference elicitation is a non-deterministic process that involves many intuitive and not well-defined criteria that are difficult to model. This paper introduces a novel approach that combines image schemata, affordance concepts and neural networks for the elicitation of user’s preferences within a web urban space. The selection parts of the neural network algorithms are supported by a web-based interface that exhibits image schemata of some places of interest. The neural network is encoded and decoded using a combination of semantic and spatial criteria. The semantic descriptions of the places of interest are defined by degrees of membership to predefined classes. The spatial component considers contextual distances between places and reference locations, where reference locations are possible locations from which the user can act in the city. The decoding part of the neural network algorithms ranks and evaluates reference locations according to the user’s preferences. The approach is illustrated by a web-based interface applied to the city of Kyoto.

Keywords: image schemata, preference elicitation, competitive neural network, web GIS
1 Introduction

The World Wide Web (web) is rapidly becoming a popular information space for storing, exchanging, searching and mining multi-dimensional information. In the last few years, much research has been directed towards the development of search engines (Kleinberg 1999), the analysis of web communities (Greco et al. 2001), and the statistical analysis of web content, structure and usage (Madria et al.
1999, Tezuka et al. 2001). Nowadays, the web constitutes a large repository of information, and the range and level of services offered to the user community is expected to increase and be dramatically enriched in the next few years. This generates many research and technical challenges for information engineering science. One of the open issues to explore is the development of unsupervised mechanisms that facilitate the manipulation and analysis of information over the web. This implies approximating users’ preferences and intentions in order to guide and constrain information retrieval processes.

Identifying a user’s preferences over a given domain of knowledge requires either observing the user’s choices or directly interacting with the user through pre-defined questions. A key issue in preference elicitation is the problem of creating a valid approximation of the user’s intentions from a few information inputs. The measurement process consists in the transformation of the user’s intentions into a classifier or regression model that ranks different alternatives. Several knowledge-based algorithms have been developed for preference elicitation, from pairwise comparisons to value functions. An early example is the pairwise comparison algorithm applied on the basis of ratio-scale measurements that evaluate alternative performances (Saaty 1980). Artificial neural networks approximate people’s preferences under certainty or uncertainty conditions, using several attributes as an input and a mapping towards an evaluation function (Shavlik and Towell 1989, Haddawy et al. 2003). Fuzzy majority is a soft computing concept that provides an ordered weighted aggregation where a consensus is obtained by identifying the majority of users’ preferences (Kacprzyk 1986, Chiclana et al. 1998). Preference elicitation is already used in e-commerce for the evaluation of client profiles and habits (Riecken 2000, Schafer et al. 1999), in flight selection using value functions (Linden et al. 1997), and in apartment finding using a combination of value functions and user feedback (Shearin and Lieberman 2001).
This paper introduces a novel integrated approach, supported by a flexible and interactive web GIS interface, which applies image schemata and neural networks to preference elicitation in a web GIS environment. We introduce a prototype whose objective is to provide a step towards the development of assisted systems that help users to plan actions in urban spaces, where the domain knowledge involved is particularly diverse and stochastic. This should apply to the planning of tourism activities, where one of the complex problems faced is the lack of understanding of the way tourists choose and arrange their activities in the city (Brown and Chalmers 2003).

The web-based prototype developed so far is a flexible interface that encodes user’s preferences in the selection of places of interest in a web urban space, and ranks the places that best fit those preferences according to different criteria and functions. Informally, a web urban space can be defined as a set of image schemata, spatial and semantic information related to a given city, and presented to the user using a web interface. The web-based interface provides the interacting level where user’s preferences are encoded using an image schemata selection of the places that present an interest for the user. Those places are classified using fuzzy quantifiers according to predefined degrees of membership to some classes of interest. For example, a temple surrounded by a garden is likely to have high degrees of membership to the classes garden and temple, a relatively high degree to a class museum, and a low degree to a class urban. The second parameter considered in the place ranking process is given by an aggregated evaluation of the proximity of those places to some reference locations.

The computational part of the prototype is supported by a competitive back propagation neural network. First, the encoding part of the neural network returns, according to the elicitation of her/his preferences, the best location from which the user would like to plan her/his actions in the city. Without loss of generality, those reference locations are represented by a set of hotels distributed in the city. Secondly, the decoding and ranking part of the neural network returns the places of reference ranked as a function of their semantic and spatial proximity to the user’s preferences. The prototype is applied to the city of Kyoto, a rich historical and cultural environment that provides a high degree of diversity in terms of places of interest.

The remainder of the paper is organised as follows. Section 2 introduces the modelling principles of our approach, and the concepts of places, image schemata and affordances. Section 3 develops the principles of the neural network algorithm and describes the different algorithms implemented so far. Section 4 presents the case study and the Kyoto finder prototype. Finally, Section 5 concludes the paper.
2 Modelling principles

We consider the case where the user has little knowledge of a given city. Places are presented to the user using image schemata in order to approximate her/his range of interests. Image schemata are recurring imaginative patterns that help humans to comprehend and
structure their experience while moving and acting in their environment (Johnson 1987). They are closely related to the concept of affordance, which qualifies the possible actions that objects offer (Gibson 1979). Image schemata and affordances have already been applied to the design of spatial interfaces to favour interaction between people and real-world objects (Kuhn 1996). We apply those two concepts to the selection of the places that are of interest for the user, assuming that those image schemata and affordances relate to the opportunities and actions she/he would like to take and expects in the city.

Places are represented as modelling objects classified semantically and located in space. Let us consider a set of places $X=\{x_1, x_2, \dots, x_p\}$. A place $x_i$ is described by a pair of coordinates in a two-dimensional space, and symbolised by an image schema that acts as a visual label associated with it. The memberships of a place $x_i$ with respect to some thematic classes $C_1, C_2, \dots, C_k$ are given by the values $x_i^1, x_i^2, \dots, x_i^k$, which denote fuzzy quantifiers, with $1 \le i \le p$. $x_i^h$ denotes the degree of membership of $x_i$ to the class $C_h$, and it is bounded by the unit interval $[0,1]$, with $1 \le h \le k$. A value $x_i^h$ that tends to 0 (resp. 1) denotes a low (resp. high) degree of membership to the class $C_h$. A place $x_i$ can belong to several classes $C_1, C_2, \dots, C_k$ to different degrees, and the sum of the membership values $x_i^1, x_i^2, \dots, x_i^k$ can be higher than 1. This latter property reflects the fact that some classes are semantically close, i.e. they are not semantically independent. This is exemplified by the fact that a place $x_i$ with a high degree of membership $x_i^h$ to a class $C_h$ is likely to also have high membership values with respect to the classes that are semantically close to $C_h$.

Let us consider some places of interest in the example of the city of Kyoto. We classify them according to a set of classes $\{C_1, C_2, C_3, C_4\}$ with $C_1$ = ‘Museum’, $C_2$ = ‘Temple’, $C_3$ = ‘Garden’, $C_4$ = ‘Urban’. The image schema presented in Figure 1 illustrates the example of the Toji Temple in Kyoto, labelled as $x_1$. This photograph exhibits a view of the temple surrounded by a park. This can be intuitively interpreted as a relatively high membership to the classes $C_1$, $C_2$ and $C_3$ (one can remark a semantic dependence between the classes $C_1$ and $C_2$), and a low membership to the class $C_4$. Those degrees of membership are approximated by fuzzy qualifiers that are predefined within the web prototype (Figure 1).
Fig. 1. Place example: Toji Temple. Membership degrees: Museum $x_1^1 = 0.6$, Temple $x_1^2 = 0.9$, Garden $x_1^3 = 0.8$, Urban $x_1^4 = 0.05$.
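A small sketch of how a place and its fuzzy class memberships could be represented, using the Toji Temple values of Fig. 1; the field names and the coordinates are illustrative and are not taken from the prototype.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class Place:
    """A place of interest: an image schema (photograph), a 2D location and
    fuzzy degrees of membership to the thematic classes."""
    name: str
    image: str                                   # path or URL of the image schema
    xy: Tuple[float, float]
    membership: Dict[str, float] = field(default_factory=dict)

    def degree(self, cls: str) -> float:
        return self.membership.get(cls, 0.0)

# x1: the Toji Temple example, with the membership degrees given in Fig. 1.
toji = Place(
    name="Toji Temple",
    image="toji.jpg",
    xy=(135.747, 34.980),                        # approximate lon/lat, illustrative only
    membership={"Museum": 0.6, "Temple": 0.9, "Garden": 0.8, "Urban": 0.05},
)
print(toji.degree("Temple"), sum(toji.membership.values()))   # the sum may exceed 1
```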
3 Back propagation neural network

3.1 Contextual distances and contextual proximities

We assume no prior knowledge of the city presented by the web interface, neither experiential nor survey knowledge (experiential knowledge is derived from direct navigation experience, while survey knowledge reflects geographical properties of the environment; Thorndyke and Hayes-Roth 1980). Although the places are geo-referenced, this information is not presented to the user, so as not to interfere with the approximation of her/his preferences. The web interface encompasses information either explicitly (image schemata of the representative places) or implicitly (location of the places and the reference locations in the city, and the proximity between them).

The proximity between two locations in an urban space is usually approximated as an inverse of a distance factor. We retain a contextual modelling of the distance and proximity between two places. This reflects the fact, observed in qualitative studies, that the distance from a region A to a distant region B should be magnified when the number of regions near A increases, and vice versa (Worboys 1995). The relativised distance introduced by Worboys normalises the conventional Euclidean distance between a region A and a region B by a dividing factor that gives a form of contextual value to that measure. This dividing factor is the average of the Euclidean distance between the region A and all
the regions considered as part of the environment. We generalise the relativised distance to a form of contextual distance.

Contextual distance: the contextual distance between a place $x_i$ of $X=\{x_1, x_2, \dots, x_p\}$ and a reference location $y_j$ of $Y=\{y_1, y_2, \dots, y_q\}$ is given by

\[ D(x_i, y_j) = \frac{d(x_i, y_j)}{\bar{d}(x, y)} \qquad (1) \]

where $d(x_i, y_j)$ stands for the Euclidean distance between $x_i$ and $y_j$, and $\bar{d}(x, y)$ for the average distance between the places of $X$ and the reference locations of $Y$ (the definition above gives a form of generalisation of Worboys’s definition of relativised distance, as the dividing factor is here the average of all distances between the regions of one set with respect to the regions of a second set).

We also slightly modify the definition of the relativised proximity introduced by Worboys in the same work (1995) by adding a square factor to the contextual distance in the denominator, in order to maximize contextual proximities for small distances (vs. minimizing contextual proximities for large distances), and to extend the amplitude of values within the unit interval. The contextual proximity is defined as follows.

Contextual proximity: the contextual proximity between a place $x_i$ of $X=\{x_1, x_2, \dots, x_p\}$ and a reference location $y_j$ of $Y=\{y_1, y_2, \dots, y_q\}$ is given by

\[ P(x_i, y_j) = \frac{1}{1 + D(x_i, y_j)^2} = \frac{\bar{d}(x, y)^2}{\bar{d}(x, y)^2 + d(x_i, y_j)^2} \qquad (2) \]
where $P(x_i, y_j)$ is bounded by the unit interval $[0,1]$. The higher $P(x_i, y_j)$, the closer $x_i$ is to $y_j$; the lower $P(x_i, y_j)$, the more distant $x_i$ is from $y_j$.

3.2 Neural network principles

The web interface provides several flexible algorithms whose input is given by a set of places. Those algorithms encapsulate different forms of semantic and spatial associations between those places and some reference locations. An algorithm output returns the location which is the most centrally located with respect to those
places, and according to those associations. Those algorithms are implemented using a back propagation neural network that complies relatively well with the constraints of our application: an unsupervised neural network, no input/output data samples, and maximum flexibility with no training during the neural network processing (cf. Freeman and Skapura 1991 for a survey of neural network algorithms). The back propagation neural network is bi-directional and competitive, as the best match is selected. This gives a form of “winner takes all” algorithm. The computation is unsupervised, and the complexity of the neural network is minimal.
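The contextual distance and proximity of equations (1) and (2) can be computed directly; the following sketch builds both matrices for a set of places and reference locations (the function name is ours, not the prototype’s).

```python
import numpy as np

def contextual_measures(places_xy, refs_xy):
    """Return the matrices D (contextual distance, eq. 1) and P (contextual
    proximity, eq. 2) between p places and q reference locations."""
    places = np.asarray(places_xy, dtype=float)      # shape (p, 2)
    refs = np.asarray(refs_xy, dtype=float)          # shape (q, 2)
    d = np.linalg.norm(places[:, None, :] - refs[None, :, :], axis=2)  # (p, q)
    d_bar = d.mean()                                 # average over all place/reference pairs
    D = d / d_bar
    P = 1.0 / (1.0 + D ** 2)                         # equivalently d_bar^2 / (d_bar^2 + d^2)
    return D, P

# Three places and two candidate reference locations (e.g. hotels).
D, P = contextual_measures([(0, 0), (4, 0), (0, 3)], [(1, 1), (5, 5)])
print(P.round(3))
```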
Fig. 2. Neural network principles: the X layer of places and the Y layer of reference locations, linked by the $w_{i,j}$ associations, above a base map.

We initialise the bi-directional competitive neural network using two layers X and Y, where $X=\{x_1, x_2, \dots, x_p\}$ denotes the set of places and $Y=\{y_1, y_2, \dots, y_q\}$ the set of reference locations (no semantic criteria are attached to the reference locations, but they could be added to the neural network with some minor adaptations). The neural network has $p$ vectors in the X layer and $q$ vectors in the Y layer (Figure 2). We define a weight matrix $W$ in which $w_{i,j}$ reflects the strength of the association between $x_i$ and $y_j$, for $i=1,\dots,p$ and $j=1,\dots,q$. Matrix values are initialised during the encoding process, which depends on the algorithm chosen. We give the user the opportunity to choose between several algorithms in order to explore different output alternatives and to evaluate the one that is the most appropriate to her/his intentions.
3. 3 Neural network encoding The propagation rules of the neural network are based on several semantic and spatial criteria that are described in the cases introduced below. First, propagation ensures selection of the reference location that best fits the user’s preferences according to some spatial and semantic criteria (neural network encoding). Secondly, back propagation to the layer of places ranks the places with respect to the selected reference location (neural network back propagation). Let us first describes the encoding process. An input vector x {(0,1)}p, is applied to the layer X and propagated from the layer X to the layer Y. This input vector describes the places of interest considered in the neural network (i.e. xi = 1 if the place is considered in the computation, xi=0 otherwise). Similarly an input vector y {(0,1)}q is applied to the layer Y where yj = 1 if the reference location is considered in the computation, yj=0 otherwise. The encoding processes provide a diversity of mechanisms in the elicitation of user’s preferences by either allowing x explicit choice of user’s places of interest and return of the most centrally located reference location, amongst the ones selected by the user, and according to some either spatial (cases A and B) or spatial and semantic metrics (case C), x derivation of user’s class preferences elicited from the places selected by the user, and return of the most centrally located reference location, amongst the ones selected by the user, according to some spatial and semantic metrics (case D), x explicit definition of user’s class preferences and return of the best reference location, amongst the ones selected by the user, according to some spatial and semantic metrics (case E). The information input by the user is kept minimal in all cases: selection of places of interest and selection of reference locations of interest. The criteria used for the propagation algorithm and the input values derived for each reference location in the layer Y are as follows (variables used several times in the formulae are described once). Case A: Contextual distance Based on the contextual distance, the algorithm returns the most centrally located reference location given a set of places. The input vector reflects the places that are selected by a user, those places
corresponding to her/his place preferences. The algorithm below introduces the propagation part and the encoding of the algorithm:

input(yj) = yj · Σ_{i=1..p} xi · D(xi, yj),   with wi,j = D(xi, yj)    (3)
where xi = 1 if the place is selected in the input vector, xi = 0 otherwise; yj = 1 if the reference location is selected in the input vector, yj = 0 otherwise; D(xi, yj) denotes the contextual distance between the place xi and the reference location yj; p denotes the number of places in the layer X and q the number of reference locations in the layer Y.

Case B: Contextual proximity
This algorithm models the strength of the association between the places and the reference locations using contextual proximities (one can remark that case B is negatively correlated with case A):

input(yj) = yj · Σ_{i=1..p} xi · P(xi, yj),   with wi,j = P(xi, yj)    (4)
where P(xi, yj) is the contextual proximity between the place xi and the reference location yj.

Case C: Contextual proximity + degrees of membership
The algorithm below finds the most centrally located reference location based on two criteria: contextual proximity and the overall interest of the places considered. This case takes into account both spatial (i.e., proximity) and semantic criteria (i.e., overall degree of membership to the classes given) to compute the strength of the association between a place and a reference location. Degrees of membership (xi^1, xi^2, …, xi^k) weight the significance of a given place xi with respect to the classes C1, C2, …, Ck. High values of class memberships increase the contribution of a place to the input values on the Y layer and its input(xi) in return. The propagation part of the algorithm is as follows:

input(yj) = yj · Σ_{i=1..p} xi · P(xi, yj) · Σ_{h=1..k} xi^h,   with wi,j = P(xi, yj) · Σ_{h=1..k} xi^h    (5)
where xi^h stands for the degree of membership of xi with respect to Ch; k denotes the number of semantic classes.

Case D: Contextual proximity + degrees of membership + class preferences
The propagation part of the algorithm adds another semantic criterion to the algorithm presented in case C: class preferences, a high-level semantic factor derived from the places selected. Formally, for a given class Ch, its degree of preference ph with respect to an input vector x ∈ {0,1}^p is evaluated by

ph = ( Σ_{i=1..p} xi · xi^h ) / ( Σ_{i=1..p} xi · Σ_{j=1..k} xi^j )    (6)
Those degrees of preference form a class preference pattern (p1, p2, …, pk) with respect to the classes C1, C2, …, Ck. In contrast to the previous cases, all places in X are considered as part of the input vector at the initialisation of the neural network. The places selected by the user are taken into account only to derive her/his class preferences. Input values in the layer of reference locations are derived as follows:

input(yj) = yj · Σ_{i=1..p} P(xi, yj) · Σ_{h=1..k} xi^h · ph,   with wi,j = P(xi, yj) · Σ_{h=1..k} xi^h · ph    (7)
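As a concrete illustration of the case D encoding, the following minimal Java sketch computes the derived class preferences (equation 6) and the resulting input values on the Y layer (equation 7). It assumes the contextual proximity matrix P and the membership degrees are already available; all class, method and variable names, and the sample values, are hypothetical and not taken from the prototype.

// Illustrative sketch of the case D encoding step (equations 6 and 7).
// Proximity matrix and membership degrees are assumed precomputed.
public class CaseDEncoding {

    /** Equation (6): class preference ph derived from the selected places. */
    static double[] classPreferences(int[] x, double[][] membership, int k) {
        int p = x.length;
        double denom = 0.0;
        for (int i = 0; i < p; i++) {
            double sum = 0.0;
            for (int j = 0; j < k; j++) sum += membership[i][j];
            denom += x[i] * sum;
        }
        double[] pref = new double[k];
        for (int h = 0; h < k; h++) {
            double num = 0.0;
            for (int i = 0; i < p; i++) num += x[i] * membership[i][h];
            pref[h] = denom == 0.0 ? 0.0 : num / denom;
        }
        return pref;
    }

    /** Equation (7): input value of reference location yj (all places considered). */
    static double inputY(int j, double[][] proximity, double[][] membership, double[] pref) {
        int p = proximity.length, k = pref.length;
        double input = 0.0;
        for (int i = 0; i < p; i++) {
            double weighted = 0.0;
            for (int h = 0; h < k; h++) weighted += membership[i][h] * pref[h];
            input += proximity[i][j] * weighted;   // wi,j = P(xi, yj) * sum_h xi^h * ph
        }
        return input;
    }

    public static void main(String[] args) {
        int[] x = {1, 0, 1};                         // places selected by the user
        double[][] membership = {                    // degrees of membership to k = 2 classes
            {0.8, 0.2}, {0.1, 0.9}, {0.6, 0.4}
        };
        double[][] proximity = {                     // P(xi, yj): p = 3 places, q = 2 reference locations
            {0.9, 0.3}, {0.5, 0.7}, {0.4, 0.8}
        };
        double[] pref = classPreferences(x, membership, 2);
        for (int j = 0; j < 2; j++)
            System.out.printf("input(y%d) = %.3f%n", j + 1, inputY(j, proximity, membership, pref));
    }
}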
Case E: Contextual proximity + degrees of membership + user-defined class preferences
The approach is close to case D, with the difference that class preferences are user-defined. Input values in the layer of reference locations are calculated as follows:

input(yj) = yj · Σ_{i=1..p} P(xi, yj) · Σ_{h=1..k} xi^h · pu^h,   with wi,j = P(xi, yj) · Σ_{h=1..k} xi^h · pu^h    (8)
where pu^h denotes a user-defined class preference for class Ch, that is, an integer value given by the user at the interface level. This case gives a high degree of flexibility to the user; it constitutes a form of unsupervised neural network. The reference location yj with the highest input(yj) value is the one that is the nearest to the places that are of interest, whether or not those are the ones given by the input vector.

3.4 Neural network decoding and back propagation

The back propagation algorithms are applied to all cases. The basic principle of the decoding part of the neural network is to rank the places of the X layer with respect to the "winning" reference location selected in the layer Y. Output values are determined as follows.

Case A:
yj(t+1) = 1 if input(yj) < input(yj') for all j ≠ j', 0 otherwise,   for j = 1, 2, …, q    (9)

Cases B, C, D and E:
yj(t+1) = 1 if input(yj) > input(yj') for all j ≠ j', 0 otherwise,   for j = 1, 2, …, q    (10)

The patterns yj(t+1) produced on the Y layer are back propagated to the X layer, thus giving the following input values on the X layer. The consistency of the algorithm is ensured as the decoding is made with the function used in the selective process in the X layer. The place with the best fit is the xi where xi(t+2) ≥ xi'(t+2), except in case A where the place selected is the xi where xi(t+2) ≤ xi'(t+2), for all i ≠ i'. The other places are ranked according to their input(xi) values (ranked by increasing values for cases B, C, D and E, by decreasing values for case A).
Input values for the X layer are given by

Case A: input(xi) = xi · Σ_{j=1..q} yj(t+1) · D(xi, yj)    (11)

Case B: input(xi) = xi · Σ_{j=1..q} yj(t+1) · P(xi, yj)    (12)

Case C: input(xi) = xi · Σ_{j=1..q} yj(t+1) · P(xi, yj) · Σ_{h=1..k} xi^h    (13)

Case D: input(xi) = Σ_{j=1..q} yj(t+1) · P(xi, yj) · Σ_{h=1..k} xi^h · ph    (14)

Case E: input(xi) = Σ_{j=1..q} yj(t+1) · P(xi, yj) · Σ_{h=1..k} xi^h · pu^h    (15)
with xi(t+2) = input(xi), where xi = 1 if the place is selected in the input vector, xi = 0 otherwise; yj(t+1) values are given by the input(yj) values above; q denotes the number of reference locations in the layer Y. The input functions above support a wide diversity of semantic and spatial criteria. This ensures flexibility in the elicitation process and maximisation of opportunities despite the fact that the user's data inputs are kept minimal.
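To make the decoding step concrete, the following minimal Java sketch selects the winning reference location (equation 10) and back-propagates it to rank places for case B (equation 12). The array contents and names are illustrative assumptions, not values or code from the prototype.

// Illustrative sketch of the decoding step: winner selection and back propagation (case B).
import java.util.stream.IntStream;

public class Decoding {

    /** Equation (10): winner-takes-all on the Y layer (maximum input wins). */
    static int[] winner(double[] inputY) {
        int best = 0;
        for (int j = 1; j < inputY.length; j++)
            if (inputY[j] > inputY[best]) best = j;
        int[] y = new int[inputY.length];
        y[best] = 1;                                  // yj(t+1)
        return y;
    }

    /** Equation (12): back propagation to the X layer for case B. */
    static double[] backPropagate(int[] x, int[] yNext, double[][] proximity) {
        int p = x.length, q = yNext.length;
        double[] inputX = new double[p];
        for (int i = 0; i < p; i++)
            for (int j = 0; j < q; j++)
                inputX[i] += x[i] * yNext[j] * proximity[i][j];
        return inputX;
    }

    public static void main(String[] args) {
        int[] x = {1, 1, 0};
        double[] inputY = {1.3, 2.1};                 // values produced by the encoding step
        double[][] proximity = {{0.9, 0.3}, {0.5, 0.7}, {0.4, 0.8}};
        int[] yNext = winner(inputY);
        double[] inputX = backPropagate(x, yNext, proximity);
        IntStream.range(0, inputX.length)
                 .forEach(i -> System.out.printf("input(x%d) = %.2f%n", i + 1, inputX[i]));
    }
}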
4 Prototype development

We developed a web-based Java prototype that provides an experimental validation of the neural network encoding and decoding algorithms. The prototype implements all algorithms (cases A and B are merged into an algorithm AB in the interface as they give similar results) plus a variation of algorithm A based on the absolute distance (denoted as algorithm A0 in the interface). The web prototype is applied as an illustrative example to the city of Kyoto, an urban context that possesses a high diversity of places. The interface developed so far encodes two main levels of information inputs: places and reference locations. Several places of diverse
interest in the city of Kyoto have been pre-selected to give a large range of preference opportunities to the user. Those places are referenced by image schemata, encoded using fuzzy qualifiers according to predefined semantic classes (urban, temple, garden and museum), and geo-referenced. Reference locations are represented by a list of geo-referenced hotels. Figure 3 presents the overall interface of the Kyoto finder. To the left are the image schemata of the places in the city of Kyoto offered for selection; to the top right, the list of seven hotels offered for selection. At the bottom right is the functional and interaction part of the interface. The algorithm proposed by default is case D, that is, the one based on an implicit elicitation of the user's class preferences.
Fig. 3. Kyoto finder interface

The encoding and decoding parts of the algorithms are encapsulated within the web interface. The interface provides selective access to those algorithms by making a distinction between default options and advanced search facilities. The algorithm applied by default is the one given by case D, where the selection of some image schemata is used to derive the user's preferences (Figure 3). Advanced search facilities offer five algorithms to the user (namely A0, AB, C, D and E, as illustrated in Figure 4). Figure 4 illustrates a case where the application of the algorithms gives different location/hotel winners (with the exception of algorithms AB and C, which give a similar result). Class preferences are explicitly valued by the user when case E is chosen (index values illustrated at the middle right of the interface presented in Figure 4).
Fig. 4. Algorithm output examples

After the user's choice of an algorithm, the user's class preferences are derived. In the example of algorithm D illustrated in Figure 4, the ordered list of class preferences is Temple with p2 = 0.31, Garden with p3 = 0.27, Museum with p1 = 0.25, and Urban with p4 = 0.16. When triggered, the neural network calculates the input values in the Y layer and selects the hotel that best fits the user's preference patterns. Figure 5 summarises the results for the previous example (from left to right: hotels selected by the user, input values in the layer Y, normalised values in the layer Y). The winning reference location (Ana Hotel) is then propagated back to the X layer, where places of interest are ranked according to the algorithm value function.
Fig. 5. Place result examples
Finally, the results of the encoding and decoding process can be displayed to the user on the base map of the city of Kyoto. Figure 6 presents the map display of the previous example processed using algorithm D. The winning hotel (i.e. the Ana Hotel) is the best reference location with respect to the user's class preference pattern. The Ana Hotel is denoted by the central square symbol in Figure 6, and the best places selected are the circles connected by a line to that hotel. Each place can be selected by pointing in the interface in order to display the image schemata associated with it; the other squares are the other hotels, while the isolated circles are the initial selection of the user.
Fig. 6. Map results

The objective of the Kyoto finder is to act as an exploratory and illustrative solution towards the development of a web environment that supports the elicitation of user's preferences in a web urban space. The algorithms presented offer several flexible solutions to the ranking of some reference locations with respect to places of interest in a given city. The semantic and spatial criteria can be complemented by additional semantic and spatial parameters, although a desired constraint is to keep the user's input minimal. A second constraint, which we impose on the encoding and decoding processes, is to rely on an acceptable level of complexity in order to guarantee a straightforward comprehension of the algorithm results. The outputs given by the system are suggestions offered to the user. Those should allow her/him to explore interactively the different options suggested and
to further explore the web information space to complete the findings of the neural network.
5 Conclusions

The research presented in this paper introduces a novel approach that combines image schemata and affordance concepts for preference elicitation in a web-based urban environment. The computation of user's preferences is supported by a competitive back propagation neural network that triggers an encoding and decoding process and returns the reference location that best fits the user's preferences. Several algorithms provide the encoding and decoding parts of the system. Those algorithms integrate semantic and spatial criteria using only a few information inputs. The approach is illustrated by a web-based prototype applied to the city of Kyoto. The modelling and computational principles of our approach are general enough to be extended to different spatially-related application contexts where one of the constraints is the elicitation of user's preferences. The development realised so far confirms that the web is a valuable computational environment to explore and approximate user's preferences. It allows interaction and multiple explorations of and with multi-modal data: visual, semantic, textual and cartographical. The fact that the web is part of a large data repository allows further exploration in the information space. This prototype opens several avenues of research for further work. The directions still to explore include the integration of training and reinforcement learning in the neural network, the implementation of validation procedures, and multi-user collaboration in the preference elicitation process.
References
Brown B and Chalmers M (2003) Tourism and mobile technology. In: Proceedings of the 8th European Conference on Computer Supported Cooperative Work, 14th-18th September, Helsinki, Finland, Kluwer Academic Publishers.
Chiclana F, Herrera F and Herrera-Viedma E (1998) Integrating three representation models in multipurpose decision making based on preference relations. Fuzzy Sets and Systems, 97: 33-48.
Freeman JA and Skapura DM (1991) Neural Networks: Algorithms, Applications and Programming Techniques, Addison-Wesley, MA.
Gibson J (1979) The Ecological Approach to Visual Perception. Houghton Mifflin Company, Boston.
Greco G, Greco S and Zumpano E (2001) A probabilistic approach for distillation and ranking of web pages. World Wide Web: Internet and Information Systems, 4(3): 189-208.
Haddawy P, Ha V, Restificar A, Geisler B and Miyamoto J (2003) Preference elicitation via theory refinement. Journal of Machine Learning Research, 4: 317-337.
Johnson M (1987) The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. The University of Chicago Press, Chicago.
Kacprzyk J (1986) Group decision making with a fuzzy linguistic majority. Fuzzy Sets and Systems, 18: 105-118.
Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5): 604-632.
Kuhn W (1996) Handling Data Spatially: Spatializing User Interfaces. In: Kraak MJ and Molenaar M (Eds.), SDH'96, Advances in GIS Research II, Proceedings 2, International Geographical Union, Delft, pp 13B.1-13B.23.
Linden G, Hanks S and Lesh N (1997) Interactive assessment of user preference models: The automated travel assistant. User Modeling, June.
Madria SK, Bhowmick SS, Ng WK and Lim EP (1999) Research issues in web data mining. In: Proceedings of the 1st International Conference on Data Warehousing and Knowledge Discovery, pp. 303-312.
Riecken D (2000) Personalized views of personalization. Communications of the ACM, 43(8): 26-29.
Saaty TL (1980) The Analytic Hierarchy Process, McGraw-Hill, New York.
Schafer JB, Konstan J and Riedl J (1999) Recommender systems in e-commerce. In: Proceedings of the ACM Conference on Electronic Commerce, pp. 158-166.
Shavlik J and Towell G (1989) An approach to combining explanation-based and neural learning algorithms, Connection Science, 1(3): 233-255.
Shearin S and Lieberman H (2001) Intelligent profiling by example. In: Proceedings of the International Conference on Intelligent User Interfaces (IUI 2001), Santa Fe, NM, pp. 145-152.
Tezuka T, Lee R, Takakura H and Kambayashi Y (2001) Web-based inference rules for processing conceptual geographical relationships. In: Proceedings of the 1st IEEE International Web GIS Workshop, pp. 14-24.
Thorndyke PW and Hayes-Roth B (1980) Differences in Spatial Knowledge Acquired from Maps and Navigation, Technical Report N-1595-ONR, The Office of Naval Research, Santa Monica, CA.
Worboys M (1996) Metrics and topologies for geographic space. In: Kraak MJ and Molenaar M (Eds.), SDH'96, Advances in GIS Research II, Proceedings 2, International Geographical Union, Delft, pp. 365-375.
Combining Heterogeneous Spatial Data From Distributed Sources

M. Howard Williams and Omar Dreza
School of Math & Comp. Sc., Heriot-Watt Univ., Riccarton, Edinburgh, EH14 4AS UK,
Abstract

The general problem of retrieval and integration of data from a set of distributed heterogeneous data sources in response to a query has a number of facets. These include the breakdown of a query into appropriate sub-queries that can be applied to different data sources as well as the integration of the partial results obtained to produce the overall result. The latter process is non-trivial and particularly dependent on the semantics of the data. This paper discusses an architecture developed to enable a user to query spatial data from a collection of distributed heterogeneous data sources. This has been implemented using GML to facilitate data integration. The system is currently being used to study the handling of positional uncertainty in spatial data in such a system.

Keywords: Spatial databases, XQL, GML, distributed databases, heterogeneous data.
1 Introduction

The problem of retrieving and integrating information from a set of heterogeneous data sources involves a number of complex operations (El-Khatib et al. 2002). These include the breakdown of a query into appropriate sub-queries that can be applied to different data sources and the integration of the partial results obtained to produce the overall result (MacKinnon et al. 1998). Initially the problem was difficult to deal with, but progress in communication and database technologies has facilitated solutions. The present situation is characterized by a growing number
of applications that require access to data from a set of heterogeneous distributed databases (Elmagarmid and Pu 1990). This need for access to distributed data is certainly true of spatial data, where there has been growing interest in the retrieval and integration of data from distributed sources. One development in the USA is the National Spatial Data Infrastructure (NSDI), which was developed under the coordination of the Federal Geographic Data Committee (FGDC 2003). This defined the technologies, policies, and people necessary to promote sharing of geospatial data throughout all levels of government. It provides a base or structure of practices and relationships among data producers and users that facilitates data sharing and use. It is a set of actions and new ways of accessing, sharing and using geographic data that enables far more comprehensive analysis of data to help decision-makers. The implementation of Internet GIS requires not only network infrastructures to distribute geospatial information, but also software architectures to provide interactive GIS functions and applications. This paper discusses an architecture developed to demonstrate the integration of heterogeneous spatial data from distributed data sources to support particular decisions.
2 Query process

In order to provide a user-friendly interface that is easily understandable to users, one may develop graphical or natural language interfaces to such a system. A query at this level may involve data from different data sources. Thus the first task is to map a query from this general form into sub-queries addressed to specific data sources and expressed in an appropriate query language. For example, a query such as "What roads may be affected in region (xmin, ymin, xmax, ymax) if the level of lake x rises by 3 metres?" may require three different sets of data - a map of the lake, the road network and the DTM data sources. Thus this general query must be translated into three specific sub-queries, each formulated in the language that is understandable by the related database manager. In order to implement this, each data provider needs to develop a wrapper (Zaslavsky et al. 2000) whose purpose is to translate a query from the server language to that of the data source and to transform the results provided by the data source into a common format understandable by the server. As this part of the process is not the main focus of our work, a simple approach has been adopted for our implementation.
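As an illustration of the wrapper role, the Java sketch below shows one possible shape of such a component: it translates a generic sub-query into a source-specific query string and returns results in a common (GML-like) format. The interface, class names and query strings are hypothetical; the actual implementation follows the simpler approach noted above.

// Hypothetical sketch of a data provider wrapper for the flooding scenario.
import java.util.List;
import java.util.Map;

interface DataProviderWrapper {
    /** Translate a generic sub-query into the source's native query language. */
    String translate(Map<String, String> subQuery);

    /** Execute the native query and return the result as a common-format (GML-like) string. */
    String executeToCommonFormat(String nativeQuery);
}

class LakeProviderWrapper implements DataProviderWrapper {
    public String translate(Map<String, String> subQuery) {
        // e.g. a MapBasic/SQL-style query built from the sub-query parameters
        return "SELECT * FROM LAKE WHERE LAKE.name = '" + subQuery.get("name") + "'";
    }
    public String executeToCommonFormat(String nativeQuery) {
        // A real wrapper would run the query and convert the .TAB result to GML.
        return "<Lake><name>X</name></Lake>";
    }
}

public class QueryDecomposition {
    public static void main(String[] args) {
        // The general flooding query is split into three sub-queries: lake, roads, DTM.
        List<Map<String, String>> subQueries = List.of(
            Map.of("source", "LAKE", "name", "X"),
            Map.of("source", "ROAD", "region", "xmin,ymin,xmax,ymax"),
            Map.of("source", "DTM",  "region", "xmin,ymin,xmax,ymax"));
        DataProviderWrapper lakeWrapper = new LakeProviderWrapper();
        System.out.println(lakeWrapper.translate(subQueries.get(0)));
    }
}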
3 Data integration process

Because of the autonomy of the data providers, the data generated from the three data sources mentioned in the previous section will in general have different syntax (i.e. different schemas) and different semantics. In order to integrate them it is necessary to convert them to a common format. For this purpose GML (GML 2001) was chosen; a minimal construction sketch is given after the list below. GML has the following properties:
- It is an XML encoding of geography,
- It enables the GI community to leverage the world of XML technology,
- It supports vector mapping in standard Web applications,
- It provides complex features, feature collections and feature associations.
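The following minimal Java sketch illustrates the kind of conversion involved, building a small GML-style fragment for a road feature with the standard DOM API. The element names and values are illustrative only and do not follow the full GML 2.0 schema.

// A minimal sketch of producing an XML/GML-style encoding of a feature.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import java.io.StringWriter;

public class GmlFragmentBuilder {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();

        Element road = doc.createElement("Road");           // feature element
        doc.appendChild(road);
        Element name = doc.createElement("name");
        name.setTextContent("Edinburgh Road1");
        road.appendChild(name);
        Element coords = doc.createElement("coordinates");  // geometry as coordinate pairs
        coords.setTextContent("329167.8580,655677.3579 329176.0383,655571.1892");
        road.appendChild(coords);

        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                          .transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out);
    }
}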
4 Architecture

The architecture described here has been developed and implemented to support access to a distributed collection of heterogeneous spatial data sources. The basic idea behind this architecture is that the user should be able to pose a query which requires data from different sources and that the system will fetch the data and combine it to produce an answer without the user being aware of the different data sources involved. The architecture is based on a client/server approach, with three levels - the client level, the server level, and the data provider level. A system based on this architecture has been developed using Java.

4.1 The Client User Interface

Currently the system uses a very simple Client User Interface (CUI) with a set of templates for fixed scenarios. Since the main focus of our work is on the handling of positional uncertainty, this provides a simple and easy to use interface, although limited for general use. The CUI is designed to allow the user to specify the keys of his/her query, such as the region name, coordinates, etc. For a scenario such as flooding, the user can specify the affected entities (roads, land parcels, buildings, etc.), the entity causing the flooding, etc. For a scenario such as utilities maintenance, the user can specify the utilities that may be affected
by the maintenance (water pipe, gas pipe, etc.), the maintaining utility, and key parameters (such as digging width, measurement unit, etc.). Fig. 1 shows an example of the CUI for a flooding scenario query. The CUI presents the template to the user and obtains from the user the values of the parameters. These are then used to generate a query that is sent to the Server. The client also has a browser component for viewing the results retrieved by the Server. As shown in Fig. 5, this browser contains GIS tools to help the user zoom, pan, and select specific features. It also displays the scale of the viewed results and can send the results to a printer to produce hardcopy.
Fig. 1. Client User Interface
4.2 The Server

The Server works as an information mediation service in which user queries against multiple heterogeneous data sources are communicated via middleware (a mediator), which is responsible for query rewriting, dispatching query fragments to individual sources, and assembling individual query results into a composite query response (Levy et al. 1996; Papakonstantinou et al. 1996; Tomasic et al. 1998; Wiederhold 1992). The proposed architecture of the server contains a number of processes used to analyse the query sent by the client, as illustrated in Fig. 2. The server has three main purposes. The first is to break down the query passed
from the client into a simple set of sub-queries, which can be understood by the data provider, taking account of the purpose of the client query. This is performed by the KB component, which generates the separate sub-queries required to satisfy the general query. The second purpose is to determine appropriate data sets, which can be retrieved from the data providers and which will satisfy the user query parameters (scale, locations). Finally, the server must integrate the spatial results sent back to the server from the data providers. Once again the KB component is used to perform this integration. This task will be further illustrated in section 5. The server metadata consists of summary extracts derived from the provider metadata. It contains information about the data sets from all data providers known to the server. This information includes the data provider URL address, the scale of the data set, the spatial location covered by the data set, etc., and is used to determine whether a data set is relevant to a specific sub-query and, if it is, how to perform a connection to the data provider. The following example shows part of the server metadata that is retrieved for the lake flood scenario.
Fig. 2. Server Architecture
LAKE <SCALE>25000 lab152pc2.cee.hw.ac.uk:1115 EDINBURGH
ROAD <SCALE>25000 PCDB3.macs.hw.ac.uk:1115 EDINBURGH
DTM <SCALE>25000 lab152pc3.cee.hw.ac.uk:1115 EDINBURGH
<WEST>328500.0 <EAST>331530.0 652260.0 <SOUTH>655480.0
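A minimal sketch of how such metadata entries might be used to decide whether a data set is relevant to a sub-query is given below: the scale must match and the query region must overlap the data set's bounding box. The field names, the record type and the assignment of the four coordinates to west/east/south/north bounds are assumptions for illustration, not the exact schema of the server metadata file.

// Illustrative sketch of server-side data set selection from metadata.
public class MetadataMatcher {

    record DataSet(String name, int scale, String url,
                   double west, double east, double south, double north) {}

    static boolean relevant(DataSet ds, int requestedScale,
                            double qWest, double qEast, double qSouth, double qNorth) {
        boolean scaleOk = ds.scale() == requestedScale;
        boolean overlaps = ds.west() <= qEast && ds.east() >= qWest
                        && ds.south() <= qNorth && ds.north() >= qSouth;
        return scaleOk && overlaps;
    }

    public static void main(String[] args) {
        DataSet dtm = new DataSet("DTM", 25000, "lab152pc3.cee.hw.ac.uk:1115",
                                  328500.0, 331530.0, 652260.0, 655480.0);
        // Hypothetical query window inside the Edinburgh bounding box
        System.out.println(relevant(dtm, 25000, 329000, 330000, 653000, 654000));
    }
}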
The KB_XML file contains a set of knowledge relating to the specific data sets that can be used. Each set contains a spatial query related to a particular problem and the optimum scale that can be used in this particular scenario. The KB component takes the parameters of the query provided by the client (e.g. affected entities, etc.) and converts them to an XML query, which is used to fetch the KB_XML file. This query is developed using the XQL standard. More information on this and the XCool library can be found in (Robie 1999). The following is an example of the results retrieved from the KB_XML file with a flooding scenario XML query:
LAKE 25000
<Spatial_Query> SELECT * FROM LAKE WHERE LAKE.name = X
DTM
ROAD 25000
<Spatial_Query> SELECT * FROM ROAD WHERE ROAD.name = X
DTM 25000
<Spatial_Query> SELECT * FROM REGION WHERE REGION.x >= Xmin And REGION.x <= Xmax And REGION.y >= Ymin And REGION.y <= Ymax

1:50,000 ID_points2 H:\www\DTMdataProvider\DTM\ Edinburgh 3 3
Fig. 3. Data provider
The second process is responsible for applying the spatial query to the spatial database. For example, if the spatial database used is MapInfo, the retrieval process is written in the MapBasic language provided by MapInfo. This could easily be extended to handle other spatial databases. The third process retrieves the results generated by the query. It converts the TAB file generated from the spatial query to a GML file or a text file (in the case of DTM data). This process is implemented using the MITAB library (MITAB 2002). Fig. 4 shows part of the GML file representing the retrieved road network data.

<SpatialReferenceSystem srsName="">
unspecified
unspecified
<Spheroid>
<SpheroidName>unspecified
3.0
1 Edinburgh Road1
329167.8580,655677.3579 329176.0383,655571.1892 329162.0340,655366.3166
329185.8037,655184.3520 329193.8983,655093.3545 329142.1628,654857.9885
329128.0728,654668.2872 328925.3267,654326.1183 328882.7140,653817.6995
328815.6889,653605.0299 328755.8159,653468.2469 328499.6635,653186.5517
327820.3729,652660.4251

Fig. 4. Part of GML file representing road network data.
Finally, the uncertainty associated with the exact locations of features is determined and added to the results.
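As a simple illustration of how a component might consume such a GML result, the following Java sketch parses the coordinate pairs of the road feature in Fig. 4 and keeps only the vertices falling inside a query window. The parsing is deliberately simplified (string splitting rather than a full GML parser) and the window values are hypothetical.

// Illustrative sketch of reading coordinate pairs from a GML fragment.
import java.util.ArrayList;
import java.util.List;

public class CoordinateReader {

    static List<double[]> parse(String coordinates) {
        List<double[]> points = new ArrayList<>();
        for (String pair : coordinates.trim().split("\\s+")) {
            String[] xy = pair.split(",");
            points.add(new double[] {Double.parseDouble(xy[0]), Double.parseDouble(xy[1])});
        }
        return points;
    }

    public static void main(String[] args) {
        String coords = "329167.8580,655677.3579 329176.0383,655571.1892 329162.0340,655366.3166";
        // Keep only vertices inside the query window (xmin, ymin, xmax, ymax)
        double xmin = 329000, ymin = 655000, xmax = 330000, ymax = 656000;
        for (double[] p : parse(coords)) {
            if (p[0] >= xmin && p[0] <= xmax && p[1] >= ymin && p[1] <= ymax)
                System.out.printf("inside: %.4f,%.4f%n", p[0], p[1]);
        }
    }
}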
5 Using the approach

The approach described here has been implemented in a system developed to support access to a distributed collection of heterogeneous spatial data sources. To illustrate the operation of the system developed, consider the flooding scenario with a set of data sources. The results of a specific query of this type are shown in Fig. 5.
Fig. 5. Final results
6 Conclusion

This paper is concerned with the problem of combining spatial data from heterogeneous sources. An architecture that has been developed to handle queries involving multiple spatial data sources has been extended to incorporate different forms of uncertainty in the data. A prototype has been implemented to realise this architecture. This was initially based on the use of XML as the encoding standard for the metadata and the KB and to represent the data transferred in response to requests between different levels of the architecture. The system uses XQL as the query language to retrieve information that is encoded in XML, such as the metadata and the KB. GML has proved to be an excellent notation for representing vector GIS data. It is simple and readable and based on XML schema. However, there are some limitations in handling large DTM files. Using the GML schema, the data and its geometry can be stored in a single file, which makes it easier to represent data transferred between different spatial data sources.
Acknowledgement

The authors gratefully acknowledge the support of the Biruni Remote Sensing Center, which is providing funding for O. Dreza to conduct this research toward a PhD.
References
El-Khatib HT, Williams MH, MacKinnon LM, Marwick DH (2002) Using a distributed approach to retrieve and integrate information from heterogeneous distributed databases. Computer Journal, 45(4):381-394.
Elmagarmid AK, Pu C (1990) Introduction: special issue on heterogeneous databases (guest editors), ACM Computing Surveys, 22(3):175-178.
FGDC (2003) FGDC, USGS, 590 National Center, Reston, VA 20192. Updated: Thursday, 27-Mar-2003. http://www.fgdc.gov/nsdi/nsdi.html
GML (2001) Geography Markup Language (GML 2.0), OpenGIS® Implementation Specification, 20, OGC Document Number: 01-029. http://opengis.net/gml/01-029/GML2.html
Levy A, Rajaraman A, Ordille J (1996) Querying Heterogeneous Information Sources Using Source Descriptions. Proceedings of the 22nd International Conference on VLDB, pp. 251-262.
MacKinnon LM, Marwick DH, Williams MH (1998) A model for query decomposition and answer construction in heterogeneous distributed database systems. Journal of Intelligent Information Systems, 11: 69-87.
MITAB (2002) MapInfo .TAB and .MIF/.MID Read/Write Library, http://pages.infinit.net/danmo/e00/index-mitab.html
Papakonstantinou Y, Abiteboul S, Garcia-Molina H (1996) Object Fusion in Mediator Systems. Proceedings of the 22nd International Conference on VLDB, pp. 413-424.
Robie J (1999) XCOOL WWW document, http://xcool.sourceforge.net
Tomasic A, Raschid L, Valduriez P (1998) Scaling Access to Heterogeneous Data Sources with DISCO. IEEE Transactions on Knowledge and Data Engineering, 10(5):808-823.
Wiederhold G (1992) Mediators in the Architecture of Future Information Systems. IEEE Computer, 25(3):38-49.
Zaslavsky I, Marciano R, Gupta A, Baru C (2000) XML-based Spatial Data Mediation Infrastructure for Global Interoperability, 4th Global Spatial Data Infrastructure Conference, Cape Town, South Africa. http://www.npaci.edu/DICE/Pubs/
Security for GIS N-tier Architecture

Michael Govorov, Youry Khmelevsky, Vasiliy Ustimenko, and Alexei Khorev
1 GIS Unit, Department of Geography, the University of the South Pacific, PO Box 1168, Suva, Fiji Islands, [email protected]
2 Computing Science Department, the University College of the Cariboo, 900 McGill Road, Kamloops, BC, Canada, [email protected]
3 Department of Mathematics and Computer Science, The University of the South Pacific, PO Box 1168, Suva, Fiji Islands, [email protected]
4 Institute of Computational Technologies, SBRAS, 6 Ac. Lavrentjev Ave., Novosibirsk, 630090, Russia, [email protected]

Abstract

Security is an important topic in Information Systems and their applications, especially within the Internet environment. Security for geospatial data is a relatively unexplored topic in Geographical Information Systems (GIS). This paper analyzes security solutions for Geographical Information Storage Systems (GISS) within an n-tier GIS architecture. The first section outlines the application of the main categories of database security to the management of spatial data. These categories are then analyzed from the point of view of application within GIS. A File System within Database (FSDB) with traditional and new encryption algorithms is proposed as a new GISS solution. A FSDB provides safer and more secure storage for spatial files and supports a centralized authentication and access control mechanism in a legacy DBMS. Cryptography solutions, as a topic of central importance to many aspects of network security, are discussed in detail. This part of the paper describes the implementation of several traditional and new symmetric, fast and nonlinear encryption algorithms with fixed and flexible key sizes.
1 N-tier Distributive GIS Architecture

Two major recent tendencies in the development of GIS technology are relevant to security:
1. The first is the adoption of IT technology such as n-tier software architecture. Existing GIS solutions started the transition to the Web-distributed and open n-tier architecture a few years ago. But still, in most existing GIS applications, the map server provides only cartographic rendering and simple spatial data analysis on the client and back-end tiers. Current Web Map Servers are a simplification of a fully functional application server in the middle of the 3-tier industry-standard architecture.
2. The second tendency is the GISS transition from file-based spatial data warehouses to fully functional spatial database solutions, employing a DBMS as the storage system within a Single Server or Distributed Environment. The advantages of such a transition are well known to the IT industry. In the global geo-network, large amounts of data are still stored in spatial warehouses as flat files (e.g. in .shp, .tab, .dxf, .img), which have single-user access, large file sizes and no transaction-based processing.
Fig. 1. The Feasible GIS n-tier Architecture
The purpose of this article is to analyze security solutions for spatial data management within a GIS n-tier architecture. This section outlines the feasible GIS n-tier architecture and the role of the GISS in storing GIS spatial data. The feasible GIS n-tier architecture is shown in Fig. 1. GIS functionality, data, and metadata can be assigned to various tiers (sometimes called layers) along a network and can be found on the server
side, in one or more intermediate middleware layers, either on the back-end or on the client side. All three tiers can be independently configured to meet the users' requirements and scaled to meet future requirements. The feasible architecture includes a client tier in which user services reside. The client tier is represented by a Web browser or wireless device (thin client), or by either a Web browser with Java applets or ActiveX components or a Java application (thick client) [9]. The middle tier is divided into two or more subsystems (layers) with different functions and security features, including SSL encryption, authentication, user validation, a single sign-on server, and digital signatures. GIS Web services perform specific GIS functions and spatial queries, and can be integrated as a part of the middle-tier application server [1]. Spatial components have capabilities for accessing and bundling maps and data into the appropriate format before sending the data back to a client. These components support different functionalities: generate image maps and stream vector spatial data for the client; return attribute data for spatial and tabular queries; execute geo-coding and routing functions; extract and return spatial data in an appropriate format; search a spatial metadata repository for documents related to spatial data and services; and run spatial and cartographic generalization techniques. The Data Management Layer (GISS) controls database storage and retrieval. Data access logic describes transactions with a database. Data access is normally performed as a functionality of the business logic. Since many spatial data are still stored in file format, the management of these data may be significantly improved by storing them within a database system. The critical security communication channels of information flows within a classical Application Server are between: a Web browser and a Web Server; the Web server and a business logic layer (in the cases of thin and medium client configurations); and a business logic layer and a back-end tier. Attention should also be focused on secure communication between all other distributed components of the middle tier. The first question is how to secure flowing information; the second, how to maintain access control. Because of the connectionless nature of the Web, security issues relate not only to initial access, but also to re-access. For the case of the thick client, these two problems reduce to how to secure communication between the thick client and the business logic layer.
2 Security Controls within n-tier GIS Architecture

One of the primary reasons for deploying an n-tier system within the Internet environment is security improvement. Thus, application logic in the middle tier can provide a layer of isolation for sensitive data maintained in the spatial database. For GIS applications, the middle tier in an n-tier system can focus on pre-presentation processing and cartographic presentation of spatial data to the user, allowing the back-end tier to focus on the management and heavy processing of spatial data. However, n-tier architectures increase the complexity of practical security deployment compared with a 2-tier Client/Server architecture. For a GIS n-tier architecture, a general security framework should address the same requirements as for legacy n-tier systems, which include authentication, authorization, identification, integrity, confidentiality, auditing, non-repudiation, credential mapping, and availability [4, 15]. There are some specifics of spatial data management which concern protecting the confidentiality and integrity of data while in transit over the Internet and when it is stored on internal servers and in databases. This section outlines the general security framework for a GIS Web-based n-tier architecture. In the next sections, solutions for the confidentiality protection of spatial data in storage are discussed. A firewall is basically the first line of defense within a GIS Web-based n-tier architecture. One device or application can use more than one basic firewall mechanism, such as stateful packet filtering, a circuit-level gateway, and a proxy server and application gateway. Many configurations are possible for the placement of firewalls. Several layers of firewalls can be added for security [10]. The ideal solution is to provide buffers of protection between the Internet, the GIS Application Server and the spatial database [12]. Most of the existing Web Map Servers use low-level authentication, which supports minimal security and is based on a password. Cryptographic authentication in the form of digital certificates must be used for stronger authentication. Authentication protection can be implemented within the Web Server, the JSP, servlet or ASP connector, the business logic layer and the back-end tier. The next line of defense in the GIS Application Server is proper access control to business logic components and back-end resources. Authorization services determine what resources and files a user or application has access to. There are at least three main access control models which can be used - mandatory, discretionary and role-and-policy based authorization schemes [5].
If the subsystems of an n-tier architecture have different security infrastructures, they may need to convey authorization information dynamically by propagating it along with an identity. The GIS Application Server can dynamically update users and roles by leveraging an external, centralized security database or service via an LDAP server. Access control within the spatial database usually determines whether a specific user has access to a specific table or file, but not to specific data within the table or file. Such a situation can be interesting for accessing a certain level of multi-detailed representation of spatial features from a multi-scale spatial database. If there is a need to enforce entity-level access control for data within tables, one has to rely on database views, or program the access logic into stored procedures or applications. If access logic is programmed into applications, then these applications must be rewritten if security policies change. Another important feature of GIS n-tier architecture security is the protection of GIS data and service confidentiality in exchanges between the clients, the middle tier and the back-end tier, and in spatial storage. Encryption is the standard mechanism for these purposes and can be used within a GIS n-tier architecture for different purposes of protection. The first purpose of such protection is the encryption of a user's identity for authentication and authorization services. In a typical case, this relies on the transport layer for security via the SSL protocol, which also provides data integrity and strong authentication of both clients and servers. Second, encryption can be used for the protection of spatial data in transit; the next section of the article gives an overview of this security aspect. Third, cryptography can be used to encrypt sensitive data stored on the DSS, including caches.
3 Web Services' Security of Spatial Message Protection

A GIS Web service is a software component that can provide spatial data and geographic information system functionality via the Internet to GIS and custom Web applications. GIS Web services perform real-time processing on the computers where they are located and return the results to the local application over the Internet. The protocols which form the basis of the Web service architecture include SOAP, WSDL, and UDDI. The current SOAP security model is based on relying on the transport layer for security and on recently emerged security specifications that provide message-level security working end-to-end through intermediaries [14].
XML-based security schemes for Web services include XML Digital Signature, XML Encryption, the XML Key Management Specification, the Extensible Access Control Markup Language (XACML), the Security Assertion Markup Language (SAML), Web Services Security, and the ebXML Message Service. XML Signature (XMLSIG), in conjunction with security tokens, supports multiple signers for a single XML document, proving the origin of data and protecting against tampering during transit and storage. The XML Encryption (XMLENC) specification supports the ability to encrypt all or portions of an XML document to provide confidentiality. SAML specifies the XML format for asserting authentication, authorization, and attributes for an entity. XACML, out of the OASIS group, specifies how authorization information can be represented in an XML format. OpenGIS specifications include the Web Map Service, Web Feature Service, Web Coverage Service, and Catalog Service/Web profile. The SOAP message security approaches can be applied for the protection of GIS Web services. Thus, GIS applications which use XML (GML, ArcXML) for web services can use XML digital signatures for verification of the origins of messages. An important advantage of encrypting spatial data (for large data streaming) with the emerging XMLENC is the encryption of part(s) of an XML document while leaving other parts open.

3.1 Internet File System (IFS) and Encryption Security Solutions for Spatial Warehouses

Volumes of spatial information stored in files are growing at explosive rates. According to some sources, the volume of such file storage doubles every year [7]. At the same time, many new formats are used to store spatial and non-spatial data within files. GIS users and distributive applications demand to store, manage and retrieve information in a safe and secure manner. GIS users and applications should have a universal secure access mechanism to the spatial files' database. An RDBMS is, or should be, a core system in any organization, with powerful mechanisms to store different types of information with different access rights and sophisticated security mechanisms. Every year new products emerge on the market which raise possibilities to utilize a legacy RDBMS for unusual purposes. But the idea behind these products is similar: to have only one universal system for information storage, processing and retrieval within an organization.
3.1.1 File System within RDBMS Instance as Storage for GIS Data Files

A File System within Database (FSDB), a relatively new idea, can help solve the above-mentioned problem effectively as follows. A FSDB raises the possibility for any file to be created, reviewed, corrected, approved, and finally published into the DBMS with appropriate access restrictions for user groups or individual users. The files can be versioned, checked in and checked out, and synchronized with the local copies [11]. At the same time, a FSDB can be replicated by the standard replication procedures of any sophisticated modern DBMS. The protocol servers that are included, for example, with Oracle IFS allow the FSDB to provide support for all common industry standard protocols through the Internet or application server and within the enterprise network [11]. A FSDB can provide a multi-level security model to ensure the privacy and integrity of documents in a number of different ways, such as: leveraging the security provided by the DBMS; user authentication; access rights definition; access control at the file, version and folder level; support for Internet security standards; and anti-virus protection [11]. A FSDB secures GIS files by storing them in a DBMS. The FSDB uses an authentication mechanism to gain access to the DBMS or FSDB repository, regardless of the protocol or tool being used to access a file. The newest versions of FSDB have more sophisticated authentication mechanisms, such as SSO servers, Internet Directory and LDAP server utilization. Oracle IFS was used to test the protection of spatial data files while in storage and during on-going processing [8]. Users can use their desktop GIS and any other applications while spatial data is stored and managed by the database, thereby leveraging the reliability, scalability and availability that come with the database, and at the same time having the familiarity and ease of a standard file system. Oracle IFS stores spatial data files in the form of Large Objects (LOBs) inside the database, which lets GIS users store very large files. LOBs are designed for fast access and optimized storage of large binary content. Fig. 2 shows the authentication and authorization processes between an external desktop GIS application and IFS storage. Obviously, a FSDB, while providing great possibilities for the security and management of spatial data files, also prompts several concerns: Will the transition of spatial data files from a standard OS file system (e.g. NTFS or UFS) to a FSDB affect the performance of input, retrieval and updating of spatial data? Will the size of the spatial storage be increased?
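The basic idea of storing a spatial data file as a large object in a relational database can be sketched with plain JDBC as below. The table name, column names, credentials and connection string are hypothetical; a real FSDB/IFS deployment hides this behind its own protocol servers and APIs.

// Hypothetical sketch: streaming a spatial data file into a BLOB column.
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SpatialFileStore {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost:1521/gisdb";   // hypothetical connection string
        try (Connection con = DriverManager.getConnection(url, "gis_user", "secret");
             InputStream in = new FileInputStream("roads.tab");
             PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO spatial_files (file_name, content) VALUES (?, ?)")) {
            ps.setString(1, "roads.tab");
            ps.setBinaryStream(2, in);      // streamed into a BLOB column
            ps.executeUpdate();
        }
    }
}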
Fig. 2. IFS Security Model
Performance results (time differences) for inputting, retrieving and updating GIS data files in desktop GIS software such as MapInfo and ArcView from Oracle IFS 9i are shown in Fig. 3. Different sizes of vector GIS files were used for the study. The large pool size, buffer cache size and processes components of IFS and the Oracle 9i Application Server were optimized to achieve the best performance of IFS.
Fig. 3. “IFS – NTFS” Time Differences (in seconds)
Negative results are obtained for the processing of small-size files using the Oracle Buffer Cache. All other results give a difference of about 1-2 seconds for processing data files with sizes up to 100 MB when using IFS storage compared to the native OS file system. The study of the changes in the spatial data file sizes, compared with the amount of space that they take up on NTFS and IFS drives, shows that the
Oracle IFS tablespace is increased in size by only about 12%. That difference can be reduced by changing the database storage parameters for IFS. The results of the IFS performance investigation show that this approach is acceptable for data processing within a GISS. Within this approach to spatial file storage, the following authentication and authorization levels can be used to secure spatial data files: OS Level (share permissions and folder permissions) and IFS Level. Permissions remain the same regardless of the protocol (e.g. HTTP, FTP, SMB) being used to access the content stored in the IFS repository.

3.2 Conventional Encryption for GIS Data Protection in Storage

It is noteworthy that the IFS within the DBMS is capable enough to provide sufficient security for spatial files. If necessary, encryption can be employed to provide additional security for confidential and sensitive GIS information. Oracle Advanced Security in Oracle 9iAS supports industry standard encryption algorithms including RSA's RC4, DES and 3DES and can be used for spatial data encryption [6]. Custom external encryption algorithms can be integrated into that security schema as well. Data encryption can significantly degrade system performance and application response time. For performance testing, the Oracle 9i DBMS_OBFUSCATION_TOOLKIT was investigated (see Figure 4). Different key lengths give different time results: for example, the difference in time between 16 and 24 byte keys is about 10-20%, but the time difference between 24 and 32 byte keys is only about 5%. The average speed of 3DES encryption is about 2.5 sec per megabyte, or about 1 hour to encrypt or decrypt 1 GB of spatial data on a workstation (1.6 GHz Intel processor under a Windows OS). Using special multiprocessor UNIX servers, the encryption/decryption time can be reduced to 10-20 minutes or, at best, to several minutes, which is applicable to a real environment where decryption/encryption of spatial data is performed once per session. To keep encrypted GIS data files in IFS, the standard encryption of Oracle and newly developed encryption algorithms were analyzed and investigated for performance. To provide encryption or decryption of sensitive application data, decryption procedures can be activated by database triggers for authenticated users (during log-in). On log-off, the user again fires the trigger, which executes the procedure to encrypt all the modified files or to replace the decrypted files in the IFS LOB objects by the already encrypted files from the temporary storage of encrypted files. If the connection to the database is lost by accident, changes to files should be committed or rolled back by the DBMS and the modified data encrypted back into the permanent LOB objects.
Decryption and encryption of spatial data files will slow down user interaction with the system. These delays occur on two occasions: when the user logs in and logs out, or when there is a session failure.

3.2.1 New Encryption Algorithm for GIS Data Protection in Storage

Special approaches were developed to use encryption for large files in Oracle. To encrypt LOB data objects, the procedure splits the data into smaller binary chunks, then encrypts them and appends them back to the LOB object. Once the encrypted spatial data files have been allocated into LOB segments, they can be decrypted by chunks and written back to the BLOB object. For read-only spatial data files, an additional LOB object, once encrypted, should always be kept; this saves time for the encryption procedure during log-off. The decrypted spatial data files are simply replaced by the read-only encrypted spatial data files in the main permanent storage during log-off. The algorithm for binary and text file encryption proposed by V. Ustimenko [13] is more robust compared to DES and 3DES and has strong resistance to attacks when the adversary has the image data and the ciphertext. This algorithm can be applied to encrypt spatial raster and vector data types, which are commonly used in GIS. The encryption algorithm is based on a combinatorial algorithm of walking on a graph of high girth. The general idea is to treat vertices of a graph as messages and arcs of a certain length as encryption tools. The encryption algorithm has linear complexity and it uses a nonlinear function for encryption; thus it resists different types of attacks by an adversary. The quality of such an encryption in the case of graphs of high girth, measured by comparing the probability of guessing the message (vertex) at random with the probability of breaking the key (i.e. of guessing the encoding arc), is good. In fact the quality is good for graphs which are close to the Erdos bound, defined by the Even Cycle Theorem [2, 3]. In the case of algebraically defined graphs with special colorings of vertices there is a uniform way to match arcs with strings in some alphabet. Among them can be found ''linguistic graphs'' whose vertices (messages) and arcs (encoding tools) can both be naturally identified with vectors over GF(q), with the neighbours of a vertex defined by a system of linear equations. The encryption algorithm is a private key cryptosystem, which uses a password to encrypt the plain text and produces a cipher text.
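The chunk-wise encryption idea can be sketched with the standard Java crypto API as below, using Triple DES as a stand-in for the database encryption toolkit discussed above. The chunk size, key handling and cipher mode are illustrative only; this is neither a security recommendation nor the algorithm of [13].

// Illustrative sketch: encrypting LOB-style data chunk by chunk.
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ChunkedEncryption {
    public static void main(String[] args) throws Exception {
        byte[] key = Arrays.copyOf("a 24 byte password......".getBytes(), 24);  // 24-byte 3DES key
        Cipher cipher = Cipher.getInstance("DESede/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "DESede"));

        byte[] data = new byte[100_000];                 // stands in for a spatial data file (LOB)
        int chunkSize = 32_768;
        ByteArrayOutputStream encrypted = new ByteArrayOutputStream();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int len = Math.min(chunkSize, data.length - offset);
            // each chunk is encrypted separately and appended to the output LOB
            encrypted.write(cipher.doFinal(data, offset, len));
        }
        System.out.println("encrypted size: " + encrypted.size() + " bytes");
    }
}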
The developed prototype model allows testing the resistance of the algorithm to attacks of different types. The initial results from such tests are encouraging. In the case p=127 (the size of the ASCII alphabet minus the "delete" character), some values of t(k,l) [the time needed to encrypt (or, by symmetry, decrypt) a file of size k Kilobytes with a password of length l (key space roughly 2^(7l))], processed on an Intel Pentium 1.6 GHz workstation (Oracle 9i DBMS Server, PL/SQL programming language), can be represented by the matrix shown in Table 1. The results presented in Table 1 indicate that the encryption/decryption time correlates linearly with the file size. Roughly, it takes about 60 seconds to encrypt a 51 KB file with a 16-byte password using PL/SQL functions, and about 17 minutes for 1 MB. If a more powerful 2-4 processor workstation and C++ or Macro Assembler programming languages are used to rewrite the encryption/decryption functions, the encryption time will be further decreased by several dozen times; e.g. for a 100 MB file it can reach 20-30 minutes of encryption/decryption time, which can be used for practical implementation. Taking into consideration that 10-20 processor systems are a practical industrial server solution (expected to be common in the near future), the GISS encryption/decryption time can be reduced to less than 5 minutes.

Table 1. Processing time t(k,l) for encryption/decryption by the New Algorithm as compared with RC4

Kb/L   | New Algorithm (s)             | RC4 (s)       | Difference (times)
       | 48    40    32    24    16    | 48   40   32  | 48     40     32
7.6    | 26    22    17    14    9     | 1    1    1   | 26     22     17
51.5   | 179   149   119   90    60    | 8    8    8   | 22.4   18.6   14.9
96.6   | 335   279   223   169   112   | 14   15   15  | 23.9   18.6   14.9
305.0  | 1061  883   706   529   353   | 45   47   24  | 23.6   18.8   14.9
397.0  | 1379  1145  913   685   458   | 59   62   31  | 23.4   18.5   14.9
Currently, program code and encryption algorithm optimization are under investigation by the authors and will be the subject of our future publications.
4 Conclusion

N-tier architectures and Web Services are making the application layer more complex and fragmented. The solution for protection lies in applying the security framework to all subsystems and components of the n-tier system. This framework has to comply with the industry security requirements of the major application development models. GIS data management and Mapping Services are primary considerations when developing GIS n-tier architectures. There are several reasons for supporting n-tier architectures for spatial applications. Major reasons include providing user access to data resources and GIS services through the Web and, at the same time, providing better data and service protection. A framework of standard security mechanisms can be used to improve security at critical points of the spatial information flows within the GIS Application Server. Security solutions for GIS distributive systems can be approached in ways similar to e-commerce applications, but can be specific to spatial data security management as it relates to spatial data types, the large size of binary files and presentation logic. Often, file servers are used to store GIS data. A file system within a database instance provides safer and more secure storage for spatial files, with a centralized authentication and access control mechanism in a legacy DBMS. By using additional encryption, a FSDB is able to guarantee that access control is enforced in a consistent manner, regardless of which protocol or tool is being used to access the repository. Our encryption model would provide a secure working environment for a GIS client to store and transfer spatial data over the network. For this purpose we utilize existing and new fast nonlinear encryption algorithms with flexible key sizes based on the graph theoretical approach.
Progressive Transmission of Vector Data Based on Changes Accumulation Model
Tinghua Ai 1,2, Zhilin Li 2 and Yaolin Liu 1
1 School of Resource and Environment Sciences, Wuhan University, China
2 Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China
Abstract
The progressive transmission of map data over the World Wide Web provides users with a self-adaptive strategy for accessing remote data. It not only speeds up web transfer but also offers an efficient navigation guide for information acquisition. The key technologies in this transmission are the efficient multiple representation of spatial data and its pre-organization on the server side. This paper offers a new model for the multiple representation of vector data, called the changes accumulation model, which considers the spatial representation from one scale to another as an accumulation of a set of changes. The difference between two consecutive representations is recorded in a linear order, and progressive transmission is realized through the gradual addition or subtraction of "change patches". As an example, the progressive transmission of area features based on this model is investigated. The model is built on the hierarchical decomposition of a polygon into a series of convex hulls or bounding rectangles, and progressive transmission is accomplished through the combination of the decomposed elements.
Keywords: progressive transmission, map generalization, polygon decomposition, convex hull, web GIS, changes accumulation model
1 Introduction
The advent of the Internet presents two challenges to cartography. One is the resulting new space to be mapped, namely cyberspace or the virtual world
(Taylor 1997, Jiang and Ormeling 1997). The other is the advancement of mapping technology into the web environment, including the presentation of map data on the web, remote access to map data, on-line map-making with data from different web sites, and so on. The former challenge, arising from modern visualization, leads cartography to develop new basic concepts. The latter, arising from mapping technology, on the one hand provides new opportunities and methods for the representation of spatial phenomena and, on the other hand, raises the new issue of how to live with the web.
When downloading map data from a web site, the user usually demands a fast response; it has been observed that users become impatient after waiting for about 3 minutes. Two problems therefore have to be solved: (1) quickly finding the location of the requested map data with a search engine, and (2) quickly downloading the map data under interactive control. The first problem is related to the efficiency of the map search engine in processing metadata and is not discussed in this paper. The second problem can be partially solved by improving the hardware and web infrastructure, such as the use of broadband, but it is also related to the organization of data on the server and to the transmission approach across the web. In this domain, the progressive transmission of map data from coarse to fine detail becomes a desirable approach. Organized in a sequence of significance, the map data are transferred and visualized on the client side step by step with increasing detail. Once the user finds that the accumulated data meet his or her requirements, the transmission can be interrupted at any time. It is a self-adaptive transmission procedure in which the user and the system communicate interactively. As the complete data set on the server usually contains much more detail than a user requires, this interruption can save considerable time. Progressive transmission not only speeds up the web transfer but also corresponds to the principles of spatial cognition: from the viewpoint of information acquisition, the progressive process acts as an efficient navigation guide.
The progressive transmission of raster data and DEM/TIN has been successfully implemented for web transfer (Rauschenbach and Schumann 1999, Srinivas et al. 1999), but for vector map data it remains a challenge. One reason is that the multi-scale representation of vector data is much more difficult than that of raster or DEM data: it is not easy to find a strategy to compress vector data hierarchically in the way the quadtree approximates raster data at different resolutions. The progressive transmission of vector data has therefore become a hot research issue. Bertolotto and Egenhofer (1999, 2001) first presented the concept of progressive transmission of vector map data and provided a formal model
based on a distributed architecture. Buttenfield (2001) investigated the requirements of progressive transmission and, based on a modified strip tree (Ballard 1981), developed a model for line transmission. Han and Tao (2003) designed a server-client scheme for progressive transmission. Progressive transmission is an application of the multiple representation of spatial data in a web transfer environment and is closely associated with map generalization; it can be regarded as the inverse process of map generalization at small scale intervals. The key is to pre-organize the generalized data on the server side in a linear order with progressively increasing detail. In this study, a model to represent multiple levels of detail based on changes accumulation is presented, which considers the spatial representation from one scale to another as an accumulation of a set of changes. The rest of the paper is organized as follows: Section 2 explains the changes accumulation model for the progressive transmission of vector map data. An application of this model to the transmission of area features is presented in Section 3, where an algorithm for the hierarchical decomposition of polygons is investigated and progressive transmission is realized through the accumulation of the decomposed elements. Section 4 presents some conclusions.
2 The changes accumulation model
The key question in progressive transmission is how to pre-organize the vector data in a hierarchical structure on the server side, with the assistance of map generalization technology. In this section, a model for the organization of vector data for progressive transmission is described. It is called the changes accumulation model and takes the level of geometric detail into consideration.
2.1 Model description
Multi-scale representation of vector map data can be achieved by two methods: (1) storing representation versions based on generalized results, and (2) deriving multi-scale representations from the initial data by on-line generalization. Our model belongs to the former, but what is recorded is the set of changes between two consecutive representations rather than complete versions of the representations. Let the initial representation state be S0 and the change between Si and Si+1 be ΔSi. Then the ith representation can be derived as the accumulation of the series of changes, i.e.
Si = Si-1 + ΔSi-1 = S0 + ΔS0 + ΔS1 + … + ΔSi-1
The representations {Si} correspond to the series of data in order of increasing detail. Each state Si is a function of spatial scale, and the change ΔSi can be regarded as the differential of the representation Si with respect to the scale variable. On the server side, only the data S0 and {ΔSi} are recorded. The initial representation S0 is the background data meeting the basic demands of all potential users and is determined by the purpose of use and the semantics. The set {ΔSi} is determined by spatial scale. The resolution used to decompose {ΔSi} is determined by the granularity of the progressive transmission, which affects the number of set elements. The order of the elements in {ΔSi} is determined by the transmission sequence, based on a linear index. Generally speaking, the data volume needed to store {ΔSi} is much smaller than that needed to store {Si}.
2.2 Three basic operations
A new state is derived through the integration of changes, which can be regarded as set operations. In geometric terms, the transformation from the original state (large scale) to the target state (small scale) is rather complex and needs generalization operators and/or algorithms. From the point of view of change decomposition, however, the changes between two consecutive representations can be classified into a few categories; for example, the decomposed results of different geometric operations can usually be regarded as segments or bends for line objects, and patches or simple convex polygons for area objects. According to their role, the operations on changes can be distinguished into three categories, because a change part can play two roles in the representation. The first plays a positive role, making the foreground object, i.e. acting as a component of the detailed representation. The other plays a negative role, making the background object, i.e. acting as the complement of the detailed representation. Correspondingly, three operations can be distinguished: addition for the former, subtraction for the latter, and replacement when both roles occur in one change step. In geometric terms, the addition operation increases the length of a line or the area of a polygon, while the subtraction operation does the reverse: addition assigns an additional part to the representation and subtraction removes a part from it. From the
viewpoint of set operations, these three operations can be realized in the changes accumulation model by union, difference and their combination, respectively.
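As an illustration of how such signed changes can be applied, the following Python sketch (our own, using the shapely library; the Change class and all other names are assumptions, not the authors' data structures) realizes addition as a polygon union and subtraction as a difference, with replacement modelled as a difference followed by a union.

```python
# A minimal sketch of accumulating S_i = S_0 + sum of signed changes with set operations.
from dataclasses import dataclass
from shapely.geometry import Polygon

@dataclass
class Change:
    add: Polygon = None      # patch to union in (may be None)
    remove: Polygon = None   # patch to subtract (may be None)

def accumulate(s0, changes):
    """Apply a linearly ordered list of changes to the initial representation S0."""
    s = s0
    for c in changes:
        if c.remove is not None:
            s = s.difference(c.remove)   # subtraction / first half of a replacement
        if c.add is not None:
            s = s.union(c.add)           # addition / second half of a replacement
    return s

# toy example: start from a square, cut a notch, then add a small extension
s0 = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
changes = [Change(remove=Polygon([(3, 3), (4, 3), (4, 4), (3, 4)])),
           Change(add=Polygon([(4, 0), (5, 0), (5, 1), (4, 1)]))]
print(accumulate(s0, changes).area)      # 16 - 1 + 1 = 16.0
```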
3 An example: polygon hierarchical decomposition and progressive transmission
In this section an application of the progressive transmission of vector data based on the changes accumulation model is presented. The studied objects are area features, and the aim is to transmit an area feature from coarse to fine with increasing detail. On the server side the data are organized in a hierarchical structure (a tree) with levels of detail based on the hierarchical decomposition of the polygon. Two kinds of area feature are investigated: natural area features such as lakes, and artificial area features such as buildings. In this approach the "changes" are the decomposed elements, namely convex hulls and bounding rectangles.
3.1 Hierarchical decomposition
We decompose the polygon object into a set of basic parts (the changes of the changes accumulation model) and represent the area feature at a given scale through a combination of these parts. The transformation from small scale to large scale is realized by the addition, subtraction and replacement of some of these "changes". In computational geometry, a polygon can be decomposed into elements with or without overlaps; the result of the former is called a cover and that of the latter a partition (Culberson and Reckhow 1994). The decomposed elements are usually rectangles, simple convex polygons, triangles, grids and others. In this study the polygon decomposition is of the cover type and the basic decomposed element is the convex hull or the bounding rectangle. For the approximation of a polygon in GIS, the MBR or the convex hull is usually used to represent its coverage. This enveloping representation includes some background areas (the "pocket" polygons in the concave parts of the original polygon), leading to an increase in area. The convex degree is defined as the ratio of the area of the polygon to the area of its convex hull (or bounding rectangle) and is used to describe the approximation accuracy: the larger the convex degree (the closer to 1.0), the more accurate the approximation. We therefore consider the included "pocket" polygons and eliminate them by subtracting their own approximations. This elimination, however, removes more area
than it should, so the approximations of the second-level "pocket" polygons are added back. This decomposition continues, alternately adding and subtracting approximations, until the "pocket" polygons are small enough or cannot be decomposed further. This is the basic idea of our polygon decomposition method. We call the studied area feature the object polygon (OP) and the approximation of a polygon its approximation polygon (AP); geometrically an AP is either an MBR or a convex hull, and their respective uses are discussed later. The decomposition result is stored in a hierarchical tree, the H-tree, whose nodes represent APs; each node carries its hierarchical level. The algorithm is as follows (a code sketch is given after this list):
1) Construct the AP of the OP and initialize the H-tree with this AP as the root at level 0.
2) Extract the "pocket" polygons by overlaying the AP and the OP, giving the result R = {p0, p1, …, pn}, and sort these polygons by decreasing area.
3) Push each "pocket" polygon pi in R that satisfies the further-decomposition condition onto a stack P.
4) If the stack P is empty, stop; otherwise pop one polygon pi from P and:
4.1) construct the AP of pi and add it to the H-tree at a level one greater than that of pi;
4.2) extract the "pocket" polygons from the overlay of this AP and pi, and sort them by decreasing area;
4.3) push the "pocket" polygons satisfying the further-decomposition condition onto the stack P.
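The following Python sketch (ours; it relies on shapely, and the helper names, tolerance values and test polygon are assumptions rather than the authors' implementation) follows the structure of the algorithm above, written recursively rather than with an explicit stack: build the AP as a convex hull, extract the pockets, and decompose further only those pockets that are neither too small nor already near-convex.

```python
# Hierarchical decomposition of a polygon into convex-hull approximations (H-tree sketch).
from shapely.geometry import Polygon

def decompose(op, min_area=1e-3, convex_tol=0.9, level=0):
    """Return a flat list of (level, AP) nodes of the H-tree rooted at polygon op."""
    ap = op.convex_hull
    nodes = [(level, ap)]
    pockets = ap.difference(op)                      # background "pocket" polygons
    parts = list(pockets.geoms) if pockets.geom_type == "MultiPolygon" else \
            ([] if pockets.is_empty else [pockets])
    for p in sorted(parts, key=lambda g: g.area, reverse=True):
        if p.area < min_area or p.area / p.convex_hull.area > convex_tol:
            continue                                 # stop: pocket is tiny or near-convex
        nodes.extend(decompose(p, min_area, convex_tol, level + 1))
    return nodes

# a rectangle with an L-shaped bite taken out of its top edge
poly = Polygon([(0, 0), (8, 0), (8, 6), (6, 6), (6, 2), (4, 2),
                (4, 4), (2, 4), (2, 6), (0, 6)])
for lvl, ap in decompose(poly):
    print(lvl, round(ap.area, 1))    # 0 48.0  then  1 14.0
```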
The decomposition result is the final H-tree, whose nodes represent approximations of the polygon. Such an approximation has two geometric candidates: (1) the convex hull and (2) the minimum bounding rectangle. They are applied respectively to natural features, such as lakes and land-use parcels with irregular boundaries, and to artificial features, such as buildings with orthogonal boundaries. The convex hull of a polygon can be constructed by many algorithms, the best of which reach a computational complexity of O(n log n). For the construction of the MBR of a building polygon, bounding rectangles are generated in the orientation of every edge and the smallest one is selected; as the number of edges of a building polygon is limited, generating the candidate rectangles is not expensive.
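The MBR construction just described — generate a bounding rectangle for every edge orientation and keep the smallest — can be sketched as follows (again with shapely; the function name and the sample building footprint are ours, and recent shapely versions expose the same idea as the minimum_rotated_rectangle property).

```python
# Minimum bounding rectangle by testing every edge orientation of the polygon.
import math
from shapely.geometry import Polygon
from shapely.affinity import rotate

def mbr_by_edges(poly):
    best = None
    coords = list(poly.exterior.coords)              # closed ring: first == last
    for (x1, y1), (x2, y2) in zip(coords[:-1], coords[1:]):
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        candidate = rotate(poly, -angle, origin=(x1, y1)).envelope   # axis-aligned box
        if best is None or candidate.area < best[0]:
            # rotate the box back so that it is expressed in the original frame
            best = (candidate.area, rotate(candidate, angle, origin=(x1, y1)))
    return best[1]

building = Polygon([(0, 0), (4, 2), (3, 4), (-1, 2)])   # a tilted rectangular footprint
print(round(mbr_by_edges(building).area, 2))            # 10.0 (the footprint itself)
```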
The conditions to stop further decomposition in the above algorithm include: (1) the convex degree is larger than a tolerance, say 0.9, and (2) the area of the polygon is small enough. Sorting the "pocket" polygons guarantees that the child nodes of a given parent are ordered by decreasing area (from left to right), and hence that the later progressive transmission follows an order of decreasing significance. Figure 1 illustrates a polygon and the corresponding decomposition result, the H-tree structure; in this case the convex hull is used as the approximation element.
Fig. 1. The illustration of a polygon and the corresponding H-tree by the decomposition of convex hull.
The H-tree decomposition has the following properties:
1. The depth of the tree is associated with the complexity of the polygon boundary.
2. A sub-tree obtained by cutting some descendant branches represents an approximation of the polygon at a different resolution (see Figure 2).
3. The hierarchical decomposition is convergent: the final complete decomposition represents the real polygon.
4. The size of a decomposed element (a node of the H-tree) is associated with visual identification; a leaf node can represent the SVO, the smallest visible object (Li and Openshaw 1992).
5. The increase in data volume is modest: compared with the full representation, the number of overlapping APs grows considerably, but the number of added points is small.
Fig. 2. Extracting different sub-trees gives different approximations of the polygon.
3.2 Linear index construction
After the decomposition of the polygon into parts, the next step is to decide the sequence in which these parts are organized: the H-tree structure has to be converted into a linear structure in order of increasing detail. In the H-tree, each node, i.e. each convex hull (or MBR), has an operation sign depending on whether its level number is even or odd, computed as (−1)^n, where n is the level number. A node with a plus sign makes a positive contribution to the polygon representation and a node with a minus sign a negative one. Based on the hierarchical tree, the polygon can be represented as the integration of all nodes (each node corresponding to an AP) using the following expression:
Poly = Σ_{i=1}^{m} (−1)^{ni} · hi , where ni is the level number of "node" hi
The items with a plus sign in the above expression correspond to additions in the later progressive transmission, and those with a minus sign to subtractions. The items can be integrated in two possible orders: (1) by decomposition level number (within one level, the nodes hi are already sorted by area), or (2) by the area of the corresponding AP. On the basis of the level order, the polygon in Figure 1 can be represented as: h0 − h1 − h2 − h3 + h11 + h12 + h13 + h14 + h31 − h121
The level-based linear index corresponds to scanning the H-tree vertically from top to bottom. At the same level, the geometry of the nodes may differ considerably in size, so under this index a small node with a high-priority level (a low level number) is transmitted before larger nodes deeper in the tree. This sequence is clearly not reasonable and goes against the principle of coarse to fine: compared with the area of the full representation of the polygon, the transmitted area jumps from larger to smaller changes and back again. The second linear index, based on area size, ignores the decomposition level.
Fig. 3. An example of building polygon and 21 decomposed MBRs .
It lets the large and more important APs be transmitted first, respecting the habits of visual cognition (from large characteristics to small details). In this study, the linear index based on area size is therefore used as the organization strategy for progressive transmission. For the building shown on the left of Figure 3, 21 MBRs over 5 decomposition levels are generated. A comparison of the transmission based on the level linear index and on the area-size linear index is shown in Figures 4 and 5. The level-based transmission shows steep changes in area, whereas, visually, the area-based transmission reflects well the process from coarse to fine, with large component parts appearing first and small details later.
Fig. 4. The progressive transmission of MBRs based on the order of decomposition level.
Fig. 5. The progressive transmission of MBRs based on the order of area size.
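The difference between the two orderings compared in Figures 4 and 5 can be seen with a few lines of Python (a toy example of ours; the (level, area) pairs stand in for H-tree nodes).

```python
# Level-based versus area-based linear index for signed H-tree nodes.
def linear_index(nodes, by="area"):
    signed = [((-1) ** lvl, lvl, area) for lvl, area in nodes]
    key = (lambda t: t[1]) if by == "level" else (lambda t: -t[2])
    return [(("+" if s > 0 else "-"), a) for s, _, a in sorted(signed, key=key)]

nodes = [(0, 48.0), (1, 8.0), (1, 1.0), (2, 3.0)]   # (decomposition level, AP area)
print(linear_index(nodes, by="level"))   # [('+', 48.0), ('-', 8.0), ('-', 1.0), ('+', 3.0)]
print(linear_index(nodes, by="area"))    # [('+', 48.0), ('-', 8.0), ('+', 3.0), ('-', 1.0)]
# the area-based index sends the larger deep-level patch before the tiny shallow one
```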
3.3 Application in progressive transmission
Based on the hierarchical decomposition of the polygon with MBRs or convex hulls, the changes accumulation model can be built; the decomposition level determines the operation (addition or subtraction) in the model. This model covers a wide scale range in multi-scale representation. In a real application of progressive transmission, the scale range for the data transmission has to be determined: we need not always begin with the tree root (the complete approximation of the object polygon, its convex hull or MBR). On the server side, part of the changes can be combined to generate an initial data packet that is transferred in the first step, and the user then decides when to stop the transmission. The size of an MBR or convex hull is related to recognition ability, so, given a scale, a sub-expression of the linear index expression can be extracted. This implies that the method
can also contribute to application fields other than progressive transmission, such as spatial queries on multi-scale representations.
Fig. 6. An example of lake polygon and the decomposed convex hulls.
Considering the geometric differences between area features, the method distinguishes two kinds of area feature, irregular polygons and orthogonal polygons, based on the same decomposition idea, and two algorithms have been implemented in our study. Figure 5 shows the experimental result for a building feature, and Figure 7 that for a lake polygon with a complex boundary, digitized from a 1:10 000 topographic map. In Figure 6 the lake polygon has been decomposed into 214 convex hulls (in this experiment the termination condition was that a polygon is exactly convex). At some deep levels the decomposed elements are very small patches with few points. In the progressive transmission, by step 50 the representation accumulated from the convex hulls is very close to the real polygon (the full representation), and its area is also close to that of the full representation. On the client side, the data can be reorganized into a normal polygon representation by geometric overlay computation: an AP element with a plus sign corresponds to a union operation and one with a minus sign to a difference operation.
Fig. 7. The progressive transmission of details based on the H-tree decomposition.
4 Conclusion
The multi-scale representation and hierarchical organization of vector data are the key to progressive transmission. Inspired by video
data compression, in which only the changed content rather than the full frame image is recorded, we have presented the changes accumulation model. Indeed, progressive transmission can be regarded as a mapping of the data representation from spatial scale to temporal scale: the data details, separated on the basis of spatial scale, are transmitted over a time range, each snapshot in the time domain corresponds to one representation at a certain spatial scale, and a representation at a higher spatial resolution is obtained after waiting longer. Technically, progressive transmission is associated with map generalization. If generalization could output data dynamically over a wide scale range rather than at one scale point, such a series of data would be well suited to progressive transmission; unfortunately, most existing generalization algorithms can only derive new data at one "scale point" rather than over a "scale interval". This paper presents a model of data organization based on changes accumulation and tries to unify the representation through change data, regardless of which generalization operations are executed and how. As generalization outputs data at single scale points, the changes between representations can be extracted by geometric comparison of consecutive versions to obtain their difference. Based on set operations, three operations are defined: addition, subtraction and replacement. The changes accumulation model is, to some degree, a bridge between off-line generalization on the server side and on-line generalization on the client side. Compared with a multi-version representation, the data volume of the changes accumulation model is reduced. For the purpose of approximation from coarse to fine, we have presented a hierarchical decomposition based on the geometric construction of convex hulls and MBRs for the multi-representation of polygons. The decomposed result is easy to store in the changes accumulation model. The linear index based on the area of the basic elements not only guarantees a transmission from coarse to fine, but can also be related to recognition resolution through the definition of the SVO. Progressive transmission using this method is efficient in time and can be realized on line, because the data are simply displayed on the client without additional computation.
Acknowledgements This work is supported by the National Science Foundation, China under the grant number 40101023, and Hong Kong Research Grant Council under the grant number PolyU 5073/01E.
References
Ai T and Oosterom P van (2002) GAP-tree Extensions Based on Skeletons. In: Richardson D and Oosterom P van (eds) Advances in Spatial Data Handling, Springer-Verlag, Berlin, pp 501-514.
Ballard D (1981) Strip Trees: A Hierarchical Representation for Curves. Communications of the Association for Computing Machinery, vol. 14: 310-321.
Bertolotto M and Egenhofer M (2001) Progressive Transmission of Vector Map Data over the World Wide Web. GeoInformatica, 5(4): 345-373.
Bertolotto M and Egenhofer M (1999) Progressive Vector Transmission. Proceedings, 7th International Symposium on Advances in Geographic Information Systems, Kansas City, MO: 152-157.
Buttenfield BP (2002) Transmitting Vector Geospatial Data across the Internet. In: Egenhofer MJ and Mark DM (eds) Proceedings GIScience 2002, Springer-Verlag, Berlin, Lecture Notes in Computer Science, No 2478: 51-64.
Buttenfield BP (1999) Sharing Vector Geospatial Data on the Internet. Proceedings, 18th Conference of the International Cartographic Association, August 1999, Ottawa, Canada, Section 5: 35-44.
Culberson JC and Reckhow RA (1994) Covering polygons is hard. Journal of Algorithms, 17(1): 2-44.
Han H, Tao V and Wu H (2003) Progressive Vector Data Transmission. Proceedings of 6th AGILE, Lyon, France.
Jiang B and Ormeling FJ (1997) Cybermap: the map for cyberspace. Cartographic Journal, 34(2): 111-116.
Li Z and Openshaw S (1992) Algorithms for automated line generalization based on a natural principle of objective generalization. International Journal of Geographical Information Systems, 6(5): 373-389.
Muller JC and Wang Z (1992) Area-patch Generalization: A Competitive Approach. The Cartographic Journal, 29(2): 137-144.
Oosterom P van (1994) Reactive Data Structures for Geographic Information Systems. Oxford University Press, Oxford.
Rauschenbach U and Schumann H (1999) Demand-driven Image Transmission with Levels of Detail and Regions of Interest. Computers & Graphics, 23(6): 857-866.
Srinivas BSR, Ladner M and Azizoglu (1999) Progressive Transmission of Images using MAP Detection over Channels with Memory. IEEE Transactions on Image Processing, 8(4): 462-475.
Taylor D (1997) Maps and Mapping in the Information Era. In: Ottoson L (ed) Proceedings of the 18th ICA/ACI International Cartographic Conference, Stockholm, Sweden, June 1997, Gävle, pp 23-27.
An Efficient Natural Neighbour Interpolation Algorithm for Geoscientific Modelling∗
Hugo Ledoux and Christopher Gold
Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong
[email protected] — [email protected]
Abstract
Although the properties of natural neighbour interpolation and its usefulness with scattered and irregularly spaced data are well-known, its implementation is still a problem in practice, especially in three and higher dimensions. We present in this paper an algorithm to implement the method in two and three dimensions, but it can be generalized to higher dimensions. Our algorithm, which uses the concept of flipping in a triangulation, has the same time complexity as the insertion of a single point in a Voronoi diagram or a Delaunay triangulation.
1 Introduction Datasets collected to study the Earth usually come in the form of two- or three-dimensional scattered points to which attributes are attached. Unlike datasets from fields such as mechanical engineering or medicine, geoscientific data often have a highly irregular distribution. For example, bathymetric data are collected at a high sampling rate along each ship’s track, but there can be a very long distance between two ships’ tracks. Also, geologic and oceanographic data respectively are gathered from boreholes and water columns; data are therefore usually abundant vertically but sparse horizontally. In order to model, visualize and better understand these datasets, interpolation is performed to estimate the value of an attribute at unsampled locations. The abnormal distribution of a dataset causes many problems for interpolation methods, especially for traditional weighted average methods in which distances are used to select neighbours and to assign weights. Such methods have problems because they do not consider the configuration of the data. ∗
This research is supported by the Hong Kong’s Research Grants Council (project PolyU 5068/00E).
It has been shown that natural neighbour interpolation (Sibson, 1980, 1981) avoids most of the problems of conventional methods and therefore performs well for irregularly distributed data (Gold, 1989; Sambridge et al., 1995; Watson and Phillip, 1987). This is a weighted average technique based on the Voronoi diagram (VD) for both selecting the set of neighbours of the interpolation point x and determining the weight of each. The neighbours used in an estimation are selected using the adjacency relationships of the VD, which results in the selection of neighbours that both surround and are close to x. The weight of each neighbour is based on the volume (throughout this paper, ‘volume’ is used to define area in 2D, volume in 3D and hyper volume in higher dimensions) that the Voronoi cell of x ‘steals’ from the Voronoi cell of the neighbours in the absence of x. The method, which has many useful properties valid in any dimensions, is further defined in Sect. 2. Although the concepts behind natural neighbour interpolation are simple and easy to understand, its implementation is far from being straightforward, especially in higher dimensions. The main reasons are that the method requires the computation of two Voronoi diagrams—one with and one without the interpolation point—and also the computation of volumes of Voronoi cells. This involves algorithms for both constructing a VD—or its geometric dual the Delaunay triangulation (DT)—and deleting a point from it. By comparison, conventional interpolation methods are relatively easy to implement; this is probably why they can be found in most geographical information systems (GIS) and geoscientific modelling packages. Surprisingly, although many authors present the properties and advantages of the method, few discuss details concerning its implementation. The two-dimensional case is relatively easy to implement as efficient algorithms for constructing a VD/DT (Fortune, 1987; Guibas and Stolfi, 1985; Watson, 1981) and deleting a point from it (Devillers, 2002; Mostafavi et al., 2003) exist. Watson (1992) also presents an algorithm that mimics the insertion of x, and thus deletion algorithms are not required. The stolen area is obtained by ordering the natural neighbours around x and decomposing the area into triangles. In three and higher dimensions, things get more complicated because the algorithms for constructing and modifying a VD/DT are still not well-known. There exist algorithms to construct a VD/DT (Edelsbrunner and Shah, 1996; Watson, 1981), but deletion algorithms are still a problem—only theoretical solutions exist (Devillers, 2002; Shewchuk, 2000). Sambridge et al. (1995) describe three-dimensional methods to compute a VD, insert a new point in it and compute volumes of Voronoi polyhedra, but they do not explain how the interpolation point can be deleted. Owen (1992) also proposes a sub-optimal solution in which, before inserting the interpolation point x, he simply saves the portion of the DT that will be modified and replaces it once the estimation has been computed. The stolen volumes are calculated in only one operation, but that requires algorithms for intersecting planes in three-dimensional space. The idea of mimicking the insertion algorithm of
Watson (1992) has also been generalized to three dimensions by Boissonnat and Cazals (2002) and to arbitrary dimensions by Watson (2001). To calculate the stolen volumes, both algorithms use somewhat complicated methods to order the vertices surrounding x and then decompose the volume into simplices (tetrahedra in three dimensions). The time complexity of these two algorithms is the same as the one to insert one point in a VD/DT. We present in this paper a simple natural neighbour interpolation algorithm valid in two and three dimensions, but the method generalizes to higher dimensions. Our algorithm works directly on the Delaunay triangulation and uses the concept of flipping in a triangulation, as explained in Sect. 3, for both inserting new points in a DT and deleting them. The Voronoi cells are extracted from the DT and their volumes are calculated by decomposing them into simplices; we show in Sect. 4 how this step can be optimised. The algorithm is efficient (its time complexity is the same as the one for inserting a single point in a VD/DT) and we believe it to be considerably simpler to implement than other known methods, as only an incremental insertion algorithm based on flips, with some minor modifications, is needed.
2 Natural Neighbour Interpolation The idea of a natural neighbour is closely related to the concepts of the Voronoi diagram and the Delaunay triangulation. Let S be a set of n points in ddimensional space. The Voronoi cell of a point p ∈ S, defined Vp , is the set of points x that are closer to p than to any other point in S. The union of the Voronoi cells of all generating points p in S form the Voronoi diagram (VD) of S. The geometric dual of VD(S), the Delaunay triangulation DT(S), partitions the same space into simplices—a simplex represents the simplest element in a given space, e.g. a triangle in 2D and a tetrahedron in 3D— whose circumspheres do not contain any other points in S. The vertices of the simplices are the points generating each Voronoi cell. Fig. 1(a) shows the VD and the DT in 2D, and Fig. 1(b) a Voronoi cell in three dimensions. The VD and the DT represent the same thing: a DT can always be extracted from a VD, and vice-versa. The natural neighbours of a point p are the points in S sharing a Delaunay edge with p, or, in the dual, the ones whose Voronoi cell is contiguous to Vp . For example, in Fig. 1(a), p has seven natural neighbours. 2.1 Natural Neighbour Coordinates The concept of natural neighbours can also be applied to a point x that is not present in S. In that case, the natural neighbours of x are the points in S whose Voronoi cell would be modified if the point x were inserted in VD(S). The insertion of x creates a new Voronoi cell Vx that ‘steals’ volume from the Voronoi cells of its ‘would be’ natural neighbours, as shown in Fig. 2(a). This idea forms the basis of natural neighbour coordinates (Sibson, 1980, 1981),
which define quantitatively the amount Vx steals from each of its natural neighbours. Let D be the VD(S), and D+ = D ∪ {x}. The Voronoi cell of a point p in D is defined by Vp, and Vp+ is its cell in D+. The natural neighbour coordinate of x with respect to a point pi is

wi(x) = Vol(Vpi ∩ Vx+) / Vol(Vx+)     (1)
where Vol(Vpi) represents the volume of Vpi. For any x, the value of wi(x) will always be between 0 and 1: 0 when pi is not a natural neighbour of x, and 1 when x is exactly at the same location as pi. A further important consideration is that the sum of the volumes stolen from each of the k natural neighbours is equal to Vol(Vx+). Therefore, the higher the value of wi(x) is, the stronger is the ‘influence’ of pi on x. The natural neighbour coordinates are influenced by both the distance from x to pi and the spatial distribution of the pi around x.
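For readers who want to experiment, the natural neighbours of a data point (the vertices sharing a Delaunay edge with it, see Sect. 2) can be read directly off a Delaunay triangulation, for example with scipy. This is a small sketch of ours, not the authors' code.

```python
# Natural neighbours of point i = vertices sharing a Delaunay edge with it.
import numpy as np
from scipy.spatial import Delaunay

rng = np.random.default_rng(1)
pts = rng.random((50, 2))
dt = Delaunay(pts)

indptr, indices = dt.vertex_neighbor_vertices      # CSR-style adjacency of the DT

def natural_neighbours(i):
    return indices[indptr[i]:indptr[i + 1]]        # indices of point i's natural neighbours

print(natural_neighbours(0))
```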
Fig. 1. Voronoi diagram. (a) Voronoi diagram and Delaunay triangulation (dashed lines) in 2D. (b) A Voronoi cell in 3D with its dual Delaunay edges joining the generator to its natural neighbours.
Fig. 2. Two VD are required for the natural neighbour interpolation. (a) Natural neighbour coordinates of x in 2D. (b) 2D DT with and without x.
2.2 Natural Neighbour Interpolation
Based on the natural neighbour coordinates, Robin Sibson developed a weighted average interpolation technique that he named natural neighbour interpolation (Sibson, 1980, 1981). The points used to estimate the value of an attribute at location x are the natural neighbours of x, and the weight of each neighbour is equal to the natural neighbour coordinate of x with respect to this neighbour. If we consider that each data point in S has an attribute ai (a scalar value), the natural neighbour interpolation is

f(x) = Σ_{i=1}^{k} wi(x) ai     (2)
where f(x) is the interpolated function value at the location x. The resulting method is exact (f(x) honours each data point), and f(x) is smooth and continuous everywhere except at the data points. To obtain a continuous function everywhere, that is a function whose derivative is not discontinuous at the data points, Sibson uses the weights defined in Eq. 1 in a quadratic equation where the gradient at x is considered. To our knowledge, this method has not been used with success with real data and therefore we do not use it. Other ways to remove the discontinuities at the data points have been proposed: Watson (1992) explains different methods to estimate the gradient at x and how to incorporate it in Eq. 2; and Gold (1989) proposes to modify the weight of each pi with a simple Hermitian polynomial so that, as x approaches pi, the derivative of f(x) approaches 0. Modifying Eq. 2 to obtain a continuous function can yield very good results in some cases, but with some datasets the resulting surface can contain unwanted effects. Different datasets require different methods and parameters, and, for this reason, modifications should be applied with great care.
2.3 Comparison with Other Methods
With traditional weighted average interpolation methods, for example distance-based methods, all the neighbours within a certain distance from the interpolation location x are considered and the weight of each neighbour is inversely proportional to its distance to x. These methods can be used with a certain success when the data are uniformly distributed, but it is difficult to obtain a continuous surface when the distribution of the data is anisotropic or when there is variation in the data density. Finding the appropriate distance to select neighbours is difficult and requires a priori knowledge of a dataset. Natural neighbour interpolation, by contrast, is not affected by these issues because the selection of the neighbours is based on the configuration of the data. Another popular interpolation method, especially in the GIS community, is the triangle-based method in which the estimate is obtained by linear interpolation
within each triangle, assuming a triangulation of the data points is available. The generalization of this method to higher dimensions is straightforward: linear interpolation is performed within each simplex of a d-dimensional triangulation. In 2D, when this method is used with a Delaunay triangulation, satisfactory results can be obtained because the Delaunay criterion maximizes the minimum angle of each triangle, i.e. it creates triangles that are as equilateral as possible. This method however creates discontinuities in the surface along the triangle edges and, if there is anisotropy in the data distribution, the three neighbours selected will not necessarily be the three closest data points. These problems are amplified in higher dimensions because, for example, the max-min angle property of a DT does not generalize to three dimensions. A 3D DT can contain some tetrahedra, called slivers, whose four vertices are almost coplanar; interpolation within such tetrahedra does not yield good results. The presence of slivers in a DT does not however affect natural neighbour interpolation because the Voronoi cells of points forming a sliver will still be ‘well-shaped’ (relatively spherical).
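To make the method of Sect. 2 concrete, here is a brute-force 2D illustration of Sibson's interpolation (Eqs. 1 and 2). It is emphatically not the flip-based algorithm of this paper: it simply recomputes the Voronoi diagram with and without x using scipy and measures the stolen areas, with four far-away dummy points keeping the cells of interest bounded (cf. the bounding trick of Sect. 4.1). All names are our own.

```python
import numpy as np
from scipy.spatial import Voronoi, ConvexHull

def cell_areas(points):
    """Area of the Voronoi cell of every input point (NaN for unbounded cells)."""
    vor = Voronoi(points)
    areas = np.full(len(points), np.nan)
    for i, ridx in enumerate(vor.point_region):
        region = vor.regions[ridx]
        if len(region) == 0 or -1 in region:            # unbounded cell
            continue
        # Voronoi cells are convex, so the hull of their vertices gives the area
        areas[i] = ConvexHull(vor.vertices[region]).volume
    return areas

def natural_neighbour(data_xy, data_a, x):
    corners = np.array([[-99.0, -99.0], [99.0, -99.0], [99.0, 99.0], [-99.0, 99.0]])
    pts = np.vstack([data_xy, corners])                 # dummy corners bound the data cells
    before = cell_areas(pts)
    after = cell_areas(np.vstack([pts, x]))[:-1]        # same points, x appended last
    stolen = np.clip(before[:len(data_xy)] - after[:len(data_xy)], 0.0, None)
    w = stolen / stolen.sum()                           # Eq. 1 (stolen areas sum to Vol(Vx+))
    return float(w @ data_a)                            # Eq. 2

rng = np.random.default_rng(0)
xy = rng.random((30, 2))
a = 2.0 * xy[:, 0] + xy[:, 1]                           # a linear field is reproduced exactly
print(natural_neighbour(xy, a, np.array([0.4, 0.6])))   # should be close to 1.4
```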
3 Delaunay Triangulation, Duality and Flips
In order to construct and modify a Voronoi diagram, it is actually easier to first construct the Delaunay triangulation and extract the VD afterwards. Managing only simplices is simpler than managing arbitrary polytopes: the number of vertices and neighbours of each simplex is known and constant, which facilitates the algorithms and simplifies the data structures. Extracting the VD from a DT in 2D is straightforward, while in 3D it requires more work. In two dimensions the dual of a triangle is a point (the centre of the circumcircle of the triangle) and the dual of a Delaunay edge is a bisector edge. In three dimensions the dual of a tetrahedron is a point (the centre of the circumsphere of the tetrahedron) and the dual of a Delaunay edge is a Voronoi face (a convex polygon formed by the centres of the circumspheres of every tetrahedron incident to the edge). In short, to get the Voronoi cell of a given point p in a 3D DT, we must first identify all the edges that have p as a vertex and then extract the dual of each (a face). The result will be a convex polyhedron formed by convex faces, as shown in Fig. 1(b). We discuss in this section the main operations required for the construction of a DT and for implementing the natural neighbour interpolation algorithm described in Sect. 4. Among all the possible algorithms to construct a VD/DT, we chose an incremental insertion algorithm because it permits us to first construct a DT and then modify it locally when a point is added or deleted. Other potential solutions, for example divide-and-conquer algorithms or the construction of the convex hull in (d + 1) dimensions, might be useful for the initial construction, but local modifications are either slow and complicated, or simply impossible.
3.1 Flipping
A flip is a local topological operation that modifies the configuration of adjacent simplices in a triangulation. Consider the set S = {a, b, c, d} of points in the plane forming a quadrilateral, as shown in Fig. 3(a).
Fig. 3. Two-dimensional flips.
There exist exactly two ways to triangulate S: the first one contains the triangles abc and bcd; and the second one contains the triangles abd and acd. Only the first triangulation of S is Delaunay because d is outside the circumcircle of abc. A flip22 is the operation that transforms the first triangulation into the second, or vice-versa. It should be noticed that when S does not form a quadrilateral, as shown in Fig. 3(b), there is only one way to triangulate S: with three triangles all incident to d. A flip13 refers to the operation of inserting d inside the triangle abc and splitting it into three triangles; and a flip31 is the inverse operation that is needed for deleting d. The notation for the flips refers to the numbers of simplices before and after the flip. The concept of flipping generalizes to three and higher dimensions (Lawson, 1986). The flips to insert and delete a point generalize easily to three dimensions and become respectively flip14 and flip41, as shown in Fig. 4(b). The generalization of the flip22 in three dimensions is somewhat more complicated. Consider a set S = {a, b, c, d, e} of points in R3, as shown in Fig. 4(a). There are two ways to triangulate S: either with two or three tetrahedra. In the first case, the two tetrahedra share a face, and in the latter case the three tetrahedra all have a common edge.
Fig. 4. Three-dimensional flips.
A flip23 transforms a triangulation of two tetrahedra into another one containing three tetrahedra; a flip32 is the inverse operation.
3.2 Constructing a DT by Flips
Consider a d-dimensional Delaunay triangulation T and a point x. What follows are the steps to insert x in T by flips, assuming that the simplex τ containing x has been identified (see Devillers et al. (2002) for different methods). After the insertion of x, one or more simplices of T will be in ‘conflict’ with x, i.e. their circumspheres will contain x. We must identify, delete and replace these conflicting simplices by other ones. A flipping algorithm first splits τ into d + 1 simplices with a flip (e.g. a flip13 in 2D). Each new simplex must then be tested to make sure it is Delaunay; this test involves only two simplices: the new simplex and its adjacent neighbour that is not incident to x (there is only one). If the new simplex is not Delaunay then a flip is performed. The new simplices thus created must be tested later. The process continues until every simplex incident to x is Delaunay. This idea can be applied to construct a DT: each point is inserted one at a time and the triangulation is updated between each insertion. This incremental insertion algorithm is valid in any dimension, i.e. there always exists a sequence of flips that will permit the insertion of a single point in a d-dimensional DT. For a detailed description of the algorithm, see Guibas and Stolfi (1985) and Edelsbrunner and Shah (1996) for respectively the two- and d-dimensional case.
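In 2D, the "is this new simplex Delaunay?" test in the loop above reduces to the classical in-circle predicate: does the opposite vertex lie inside the circumcircle of the triangle? A small numpy sketch of that predicate (ours, not the authors' code):

```python
# In-circle predicate: > 0 means d lies strictly inside the circumcircle of the
# counter-clockwise triangle abc, < 0 outside, 0 on the circle.
import numpy as np

def in_circle(a, b, c, d):
    a, b, c, d = (np.asarray(p, dtype=float) for p in (a, b, c, d))
    m = np.array([np.r_[p - d, np.dot(p - d, p - d)] for p in (a, b, c)])
    return np.linalg.det(m)

# d is inside the circumcircle of this triangle, so a flip22 would be required
print(in_circle((0, 0), (1, 0), (0, 1), (0.9, 0.9)) > 0)   # True
```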
4 A Flip-Based Natural Neighbour Interpolation Algorithm Our algorithm to implement natural neighbour interpolation performs all the operations directly on the Delaunay triangulation (with flips) and the Voronoi cells are extracted when needed. We use a very simple idea that consists of inserting the interpolation point x in the DT, calculating the volume of the Voronoi cell of each natural neighbour of x, then removing x and recalculating the volumes to obtain the stolen volumes. Two modifications are applied to speed up the algorithm. The first one concerns the deletion of x from the DT. We show in Sect. 3 that every flip has an ‘inverse’, e.g. in 2D, a flip13 followed by a flip31 does not change the triangulation; in 3D, a flip23 creates a new tetrahedron that can then be removed with a flip32. Therefore, if x was added to the triangulation with a sequence l of flips, simply performing the inverse flips of l in reverse order will delete x. The second modification concerns how the overlap between Voronoi cells with and without the presence of x is calculated. We show that only some faces of a Voronoi cell (in the following, a Voronoi face is a (d − 1)-face forming the boundary of the cell, e.g. in 2D
Natural Neighbour Interpolation for Geoscientific Modelling
105
it is a line and in 3D it is a polygon) are needed to obtain the overlapping volume.
Given a set of points S in d dimensions, consider interpolating at the location x. Let T be the DT(S) and pi the natural neighbours of x once it is inserted in DT(S). The simplex τ that contains x is known. Our algorithm proceeds as follows:
1. x is inserted in T, thus getting T+ = T ∪ {x}, by using flips, and the sequence l of flips performed is stored in a simple list.
2. The volume of Vx+ is calculated, as well as the volumes of each Vpi+.
3. l is traversed in reverse order and the inverse flip is performed each time. This deletes x from T+.
4. The volumes of the Vpi are re-calculated to obtain the natural neighbour coordinates of x with respect to all the pi, and Eq. 2 is finally evaluated.
To remember the order of flips in two dimensions, a simple list containing the order in which the pi became natural neighbours of x is kept. The flip13 adds three pi, and each subsequent flip22 adds one new pi. In 3D, only a flip23 adds a new pi to x; a flip32 only changes the configuration of the tetrahedra around x. We store a list of edges that will be used to identify what flip was performed during the insertion of x. In the case of a flip23, we simply store the edge xpi that is created by the flip. A flip32 deletes a tetrahedron and modifies the configuration of the two others such that, after the flip, they are both incident to x and share a common face abx. We store the edge ab of this face. Therefore, in two dimensions, to delete x we take one pi, find the two triangles incident to the edge xpi and perform a flip22. When x has only three natural neighbours, a flip31 deletes x completely from T+. In 3D, if the current edge is xpi, a flip32 is used on the three tetrahedra incident to xpi; and if ab is the current edge, then a flip23 is performed on the two tetrahedra sharing the face abx.
4.1 Volume of a Voronoi Cell
The volume of a d-dimensional Voronoi cell is computed by decomposing it into d-simplices and summing their volumes. The volume of a d-simplex τ is easily computed:

Vol(τ) = (1/d!) · det [ v0 … vd ; 1 … 1 ]     (3)

where vi is a d-dimensional column vector representing the coordinates of a vertex and det is the determinant of the (d + 1) × (d + 1) matrix whose columns are the vertices v0, …, vd augmented with a final row of ones. Triangulating a 2D Voronoi cell is easily performed: since the polygon is convex, a fan-shaped triangulation can be used. In 3D, the polyhedron is triangulated by first fan-triangulating each of its Voronoi faces; the tetrahedra are then formed by these triangles and the generator of the cell. In order to implement natural neighbour interpolation, we do not need to know the volumes of the Voronoi cells of the pi in T and T+, but only the difference between the two volumes.
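Eq. 3 translates directly into a few lines of code (a sketch of ours using numpy; the absolute value is taken so that an unsigned volume is returned regardless of vertex order).

```python
# Volume of a d-simplex from the determinant of its vertices augmented with ones (Eq. 3).
import math
import numpy as np

def simplex_volume(vertices):
    """vertices: (d+1, d) array of simplex corners; returns its d-dimensional volume."""
    v = np.asarray(vertices, dtype=float)
    d = v.shape[1]
    m = np.vstack([v.T, np.ones(d + 1)])    # columns are (v_i, 1)
    return abs(np.linalg.det(m)) / math.factorial(d)

print(simplex_volume([[0, 0], [1, 0], [0, 1]]))                       # 0.5  (triangle)
print(simplex_volume([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]))   # 1/6  (tetrahedron)
```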
As shown in Fig. 2(a), some parts of a Voronoi cell will not be affected by the insertion of x in T, and computing them twice to subtract afterwards is computationally expensive and useless. Notice that the insertion or deletion of x in a DT modifies the triangulation only locally—only simplices inside a defined polytope (defined by the pi in Fig. 2(b)) are modified. Each pi has many edges incident to it, but only the edges inside the polytope are modified. Therefore, to optimise this step of the algorithm, we process only the Voronoi faces that are dual to the Delaunay edges joining two natural neighbours of x. In T+, the Voronoi face dual to the edge xpi must also be computed. Only the complete volume of the Voronoi cell of x in T+ needs to be known. The Voronoi cells of the points in S forming the convex hull of S are unbounded. That causes problems when a natural neighbour of x is one of these points because the volume of its Voronoi cell, or parts of it, must be computed. The simplest solution consists of bounding S with an artificial (d + 1)-simplex big enough to contain all the points.
4.2 Theoretical Performances
By using a flipping algorithm to insert x in a d-dimensional DT T, each flip performed removes one and only one conflicting simplex from T. For example, in 3D, the first flip14 deletes the tetrahedron containing x and adds four new tetrahedra to T+; then each subsequent flip23 or flip32 deletes only one tetrahedron that was present in T before the insertion of x. Once a simplex is deleted after a flip, it is never re-introduced in T+. The work needed to insert x in T is therefore proportional to r, the number of simplices in T that conflict with x. As already mentioned, each 2D flip adds a new natural neighbour to x. The number of flips needed to insert x is therefore proportional to the degree of x (the number of incident edges) after its insertion. Without any assumptions on the distribution of the data, the average degree of a vertex in a 2D DT is 6, which means an average of four flips are needed to insert x (one flip13 plus three flip22). This is not the case in 3D (a flip32 does not add a new natural neighbour to x) and it is therefore more complicated to give a value to r. We can nevertheless affirm that the value of r will be somewhere between the number of edges and the number of tetrahedra incident to x in T+; these two values are respectively around 15.5 and 27.1 when points are distributed according to a Poisson distribution (Okabe et al., 1992). Because a flip involves a predefined number of adjacent simplices, we assume it is performed in constant time. As a result, if x conflicts with r simplices in T then O(r) time is needed to insert it. Deleting x from T+ also requires r flips; but this step is done even faster than the insertion because operations to test if a simplex is Delaunay are not needed, nor are tests to determine what flip to perform. The volume of each Voronoi cell is computed only partly, and this operation is assumed to be done in constant time. In the natural neighbour interpolation algorithm, if k is the
degree of x in a d-dimensional DT, then the volume of k Voronoi cells must be partly computed twice: with and without x in T . As a conclusion, our natural neighbour interpolation algorithm has a time complexity of O(r), which is the same as an algorithm to insert a single point in a Delaunay triangulation. However, the algorithm is obviously slower by a certain factor since x must be deleted and parts of the volumes of the Voronoi cells of its natural neighbours must be computed.
5 Conclusions
Many new technologies to collect information about the Earth have been developed in recent years and, as a result, more data are available. These data are usually referenced in two- and three-dimensional space, but so-called four-dimensional datasets—that is, three spatial dimensions plus a time dimension—are also collected. The GIS, with its powerful integration and spatial analysis tools, seems the perfect platform to manage these data. It started thirty years ago as a static mapping tool, has recently evolved to three dimensions (Raper, 1989) and is slowly evolving to higher dimensions (Mason et al., 1994; Raper, 2000). Interpolation is an important operation in a GIS. It is crucial in the visualisation process (generation of surfaces or contours), for the conversion of data from one format to another, to identify bad samples in a dataset or simply to gain a better understanding of a dataset. Traditional interpolation methods, although relatively easy to implement, do not yield good results, especially when used with datasets having a highly irregular distribution. In two dimensions, these methods have shortcomings that create discontinuities in the surface, and these shortcomings are amplified in higher dimensions. The method detailed in this paper, natural neighbour interpolation, although more complicated to implement, performs well with irregularly distributed data and is valid in any dimension. We have presented a simple, yet efficient, algorithm that is valid in two, three and higher dimensions. We say ‘simple’ because only an incremental algorithm based on flips, with the minor modifications described, is required to implement our algorithm. We have already implemented the algorithm in two and three dimensions and we hope our method will make it possible for the GIS community to take advantage of natural neighbour interpolation for modelling geoscientific data.
References

Boissonnat JD, Cazals F (2002) Smooth surface reconstruction via natural neighbour interpolation of distance functions. Computational Geometry, 22:185–203.
Devillers O (2002) On Deletion in Delaunay Triangulations. International Journal of Computational Geometry and Applications, 12(3):193–205.
Devillers O, Pion S, Teillaud M (2002) Walking in a triangulation. International Journal of Foundations of Computer Science, 13(2):181–199.
Edelsbrunner H, Shah N (1996) Incremental Topological Flipping Works for Regular Triangulations. Algorithmica, 15:223–241.
Fortune S (1987) A Sweepline algorithm for Voronoi diagrams. Algorithmica, 2:153–174.
Gold CM (1989) Surface Interpolation, spatial adjacency and GIS. In J Raper, editor, Three Dimensional Applications in Geographic Information Systems, pages 21–35. Taylor & Francis.
Guibas LJ, Stolfi J (1985) Primitives for the Manipulation of General Subdivisions and the Computation of Voronoi Diagrams. ACM Transactions on Graphics, 4:74–123.
Lawson CL (1986) Properties of n-dimensional triangulations. Computer Aided Geometric Design, 3:231–246.
Mason NC, O'Conaill MA, Bell SBM (1994) Handling four-dimensional georeferenced data in environmental GIS. International Journal of Geographic Information Systems, 8(2):191–215.
Mostafavi MA, Gold CM, Dakowicz M (2003) Delete and insert operations in Voronoi/Delaunay methods and applications. Computers & Geosciences, 29(4):523–530.
Okabe A, Boots B, Sugihara K, Chiu SN (1992) Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley and Sons.
Owen SJ (1992) An Implementation of Natural Neighbor Interpolation in Three Dimensions. Master's thesis, Brigham Young University, Provo, UT, USA.
Raper J, editor (1989) Three Dimensional Applications in Geographic Information Systems. Taylor & Francis, London.
Raper J (2000) Multidimensional Geographic Information Science. Taylor & Francis.
Sambridge M, Braun J, McQueen H (1995) Geophysical parameterization and interpolation of irregular data using natural neighbours. Geophysical Journal International, 122:837–857.
Shewchuk JR (2000) Sweep algorithms for constructing higher-dimensional constrained Delaunay triangulations. In Proc. 16th Annual Symp. Computational Geometry, pages 350–359. ACM Press, Hong Kong.
Sibson R (1980) A vector identity for the Dirichlet tesselation. Mathematical Proceedings of the Cambridge Philosophical Society, 87:151–155.
Sibson R (1981) A brief description of natural neighbour interpolation. In V Barnett, editor, Interpreting Multivariate Data, pages 21–36. Wiley, New York, USA.
Watson DF (1981) Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. The Computer Journal, 24(2):167–172.
Watson DF (1992) Contouring: A Guide to the Analysis and Display of Spatial Data. Pergamon Press, Oxford, UK.
Watson DF (2001) Compound Signed Decomposition, The Core of Natural Neighbor Interpolation in n-Dimensional Space. http://www.iamg.org/naturalneighbour.html.
Watson DF, Phillip G (1987) Neighborhood-Based Interpolation. Geobyte, 2(2):12–16.
Evaluating Methods for Interpolating Continuous Surfaces from Irregular Data: a Case Study

M. Hugentobler1, R.S. Purves1 and B. Schneider2

1 GIS Division, Department of Geography, University of Zürich, Winterthurerstr. 190, Zürich, 8057, Switzerland
[email protected]
2 Department of Geosciences, University of Basel, Bernoullistr. 32, Basel, 4056, Switzerland
Abstract

An artificial and a 'real' set of test data are modelled as continuous surfaces by a linear interpolator and three different cubic interpolators. Values derived from these surfaces, of both elevation and slope, are compared with analytical values for the artificial surface and a set of independently surveyed values for the real surface. The differences between interpolators are shown with a variety of measures, including visual inspection, global statistics and spatial variation, and the utility of cubic interpolators for representing curved areas of surfaces is demonstrated.
1 Introduction

Terrain models and their derivatives are used in a wide range of applications as 'off the shelf' products. However, Schneider (2001b) points out how the representation of surface continuity in many applications is both implicit and contradictory for different products of the terrain model. The use of continuous representations of terrain, which are argued to better represent the real nature of terrain surfaces, is suggested as an important area of research. Furthermore, it is stated that the nature of a representation should also be application specific, with, for example, a surface from which avalanche starting zones are to be derived implying a different set of constraints than those required to interpolate temperature. In the latter case it is sufficient to know only the elevation at a point, whereas in the former
derivatives of the terrain surface such as gradient, aspect and curvature (Maggioni and Gruber, 2003) are all required. Models of terrain can generally be categorised as either regular or irregular tessellations of point data, sometimes with additional ancillary information representing structural features such as breaks in slope or drainage divides (Weibel and Heller, 1991). Regular tessellations have dominated within the modelling and spatial analysis communities, despite the oft-cited advantages of irregular tessellations (e.g. Weibel and Heller, 1991). Within the domain of regular tessellations, geomorphologists and GIScientists have combined to examine the robustness of descriptive indices of topography, such as slope and aspect (e.g. Evans, 1980; Skidmore, 1989; Corripio, 2003; Schmidt et al., 2003), hypsometry, and hydrological catchment areas (Walker and Willgoose, 1999; Gallant and Wilson, 2000), all derived using a range of data models, resolutions and algorithms. Irregular tessellations are commonly based upon a triangular irregular network (TIN), which in itself is derived from point or line data (e.g. Peucker et al., 1978). Surfaces interpolated from irregular tessellations may or may not fulfil basic conditions of continuity, with Schneider (2001b) describing a family of techniques derived from Computer-Aided Geometric Design (CAGD) which may be used to attempt to fulfil continuity constraints. Hugentobler (2002) describes one of these techniques, triangular Coons patches, which are applied to the problem of representing a continuous surface. In comparison to regular tessellations, little work has been carried out to assess the implications of differing surface representations and their resulting products for irregular tessellations, as opposed to comparisons between regular and irregular tessellations (though the work of Kumler (1994) is an exception to this observation). In this paper we introduce a case study in which a suite of interpolation methods is applied to a TIN and the resulting elevations and first-order surface derivatives are compared in order to assess the properties of the different techniques. The paper first lists a range of methods for comparing the properties of these representations, mostly derived from the literature on regular tessellations. A methodology for carrying out such comparisons on irregularly tessellated points is then presented and a subset of the resulting values are discussed. The implications of the case study for interpolation of TINs for differing applications are then examined. Finally, some recommendations for further work in this area are made.
1.1 Techniques for evaluating terrain models

A number of techniques are described in the literature for evaluating terrain models. Perhaps the most straightforward, but nonetheless very powerful, technique is visual inspection. For instance, Wood and Fisher (1993) cite the use of a range of mapping techniques including 2D raster rendering, pseudo-3D projection, aspatial representations and a range of shaded relief, slope and aspect maps. In computer-aided geometric design (CAGD), reflection lines are often used to detect small irregularities in surfaces (Farin, 1997). Mitas and Mitasova (1999) and Schneider (1998) detect artefacts in continuous surfaces through the use of shaded pseudo-3D projection. Visual inspection has the great advantage that patterns can be identified that are difficult or impossible to identify through summary statistics (Wood and Fisher, 1993). On the other hand, as Schneider (2001b) points out, a surface may be visually pleasing whilst not being a realistic representation of terrain. Indeed, two different representations of the same surface may both appear realistic whilst giving different impressions of that surface. Thus, visual inspection is probably best employed in searching for gross artefacts in the surface. The use of artificial surfaces generated from a mathematical function with an analytical solution allows quantitative evaluation of interpolators at any point on a continuous surface. This technique has been used by a number of authors, including Corripio (2003) to compare a number of algorithms for calculating slope and Schmidt et al. (2003) to examine calculations of curvature. Analytical surfaces provide a means for rapidly collecting many points and knowing the true values of elevation and derivatives at any point. However, whether a function with an analytical solution has the same or similar properties to real terrain is unclear. Perhaps the most convincing method of assessing the quality of a terrain model is the collection of independent data from the same terrain, with precision and accuracy at least as high as that assumed for the model. Skidmore (1989) and Wise (1997) attempt to collect independent data from topographic maps for comparison with gridded elevation models. Bolstad and Stowe (1994) used GPS measurements to evaluate elevation model quality. Other work has used field measurements of elevation, slope and profile curvature for quality assessment (Giles and Franklin, 1996). However, little work appears to exist using field data to evaluate the quality of continuous models of terrain derived from irregular data. In considering the validation of interpolated terrain models with field data, several important considerations must be taken into account:
• All applications use a terrain model at some implicit scale. However, this scale-specific surface does not exist in reality and therefore cannot be measured (Schneider, 2001a).
• Factors other than the interpolation influence the errors in the resulting terrain model:
  • the discretisation of the continuous terrain surface;
  • the choice of triangular tessellation; and
  • errors in the base data used for surface derivation and comparison.
In order to assess interpolation methods, the uncertainty introduced through discretisation, tessellation and base data must be small in comparison to the uncertainties resulting from the interpolation methods themselves.

2 Methodology

2.1 Overview

The aim of this study was to generate a comprehensive set of methods to compare a number of different interpolators of irregular point datasets. To achieve this, all of the methods reviewed above were used, namely the generation of an artificial surface with analytical solutions, the collection of high precision field data within a test area, and the use of a variety of statistical and visual methods to compare different interpolators. In this section the techniques used to generate these comparisons are described.

2.2 Test surfaces

2.2.1 Artificial surface

An artificial surface as described in (1) was created and is shown in Figure 1.

f(x, y) = 500 + 100\left(\sin\frac{2\pi x}{200} + \sin\frac{2\pi y}{200}\right)      (1)
The function was evaluated for values of x and y between 100 and 300. Since the surface varies smoothly, 162 data points were selected randomly to build a tessellation, which was triangulated using a Delaunay triangulation for the different interpolation methods described in Section 2.4.
Fig. 1. 3D view of the artificial surface described in Equation 1
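To make the comparison against the analytical surface concrete, the Python sketch below samples a surface of the form of Equation 1, builds Delaunay-based linear and Clough-Tocher interpolants with SciPy, and reports deviation statistics at roughly 25 000 evaluation points. It is only an illustration under stated assumptions (random sampling, SciPy's own Clough-Tocher formulation), not a reproduction of the authors' implementation, which also includes Coons patches and a smoothed Clough-Tocher variant.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, CloughTocher2DInterpolator

def f(x, y):
    # artificial test surface of Equation 1
    return 500.0 + 100.0 * (np.sin(2 * np.pi * x / 200.0) + np.sin(2 * np.pi * y / 200.0))

rng = np.random.default_rng(0)
xy = rng.uniform(100.0, 300.0, size=(162, 2))       # 162 random data points
z = f(xy[:, 0], xy[:, 1])

linear = LinearNDInterpolator(xy, z)                 # planar facets on the Delaunay TIN
cubic = CloughTocher2DInterpolator(xy, z)            # C1 cubic Bezier patches

# around 25 000 evaluation points, kept away from the convex-hull boundary
gx, gy = np.meshgrid(np.linspace(120, 280, 160), np.linspace(120, 280, 160))
true = f(gx, gy)
for name, interp in (("linear", linear), ("Clough-Tocher", cubic)):
    est = interp(gx, gy)
    dev = np.abs(est - true)[~np.isnan(est)]         # drop points outside the hull
    print(f"{name:14s} mean {dev.mean():.2f} m  max {dev.max():.2f} m  std {dev.std():.2f} m")
```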
2.2.2 Field test area

The test area for the study was a square of approximately 200 m × 200 m near Menzingen in the Canton of Zug in Switzerland. The landscape itself is one formed by glacial deposition, containing farmland bisected by a road, a small hill and an area of flat land (Figure 2). A survey of the area was carried out with a geodimeter to produce an independent dataset (called hereinafter the control dataset, consisting of 263 points), with the intention of comparing it with a photogrammetric dataset collected by the Canton of Zug. However, this field dataset and the photogrammetric dataset were found to have a consistent offset of around 30 cm. Since the origin of this offset was unclear, it was decided to resample the field area at approximately the same locations as the photogrammetric dataset (including breaklines along either side of the road) to produce an independent dataset (called hereinafter the triangulation dataset, consisting of 230 points) for which the same survey techniques had been used. A constrained Delaunay triangulation was used to interpolate the triangulation dataset, following the method of de Floriani and Puppo (1992).
To examine the sensitivity of the results to the points contained within the triangulation dataset a randomly selected set of points were swapped between the triangulation and control datasets to generate different surfaces. Finally, three profiles were measured at regular intervals from the hill to the plain for comparison with interpolated surfaces. One of these profiles is represented schematically in Figure 2 by a dashed line.
Fig. 2. The field area viewed from the north-east, showing hill and ‘road’ (running across the middle of the image and then behind bushes), and the plain (in foreground). Dashed line represents a profile schematically.
In order to avoid the problems discussed in section 1.2 with the comparison of field data and surfaces, the following were considered when collecting field data:
• The two datasets (triangulation and control) must contain features of approximately the same scales. Since the data were collected by the same teams with the same purpose in mind, this problem was minimized.
• The tessellation used must portray the terrain well, with no edges crossing valleys or ridges and no long thin triangles.
• The elevation values of the two datasets have to be of similar accuracy and precision.
The problems encountered in attempting to use the photogrammetric dataset illustrated these issues well.
2.4 Interpolation methods

Four different interpolators were used to create continuous elevation surfaces using the generated triangulations.

2.4.1 Linear interpolation

Linear interpolation is the simplest interpolation scheme for TINs and also the most widely used. The surface within each triangle is a plane passing through the three vertices. Each facet has one value for gradient and one for aspect, while the curvature is zero everywhere. These properties make it relatively straightforward to derive more complex properties such as viewsheds and catchment areas.

2.4.2 Triangular Coons patch

The triangular Coons patch (Barnhill and Gregory, 1975; Hugentobler, 2002) is a method using combinations of ruled surfaces to interpolate to a triangular network of boundary curves and cross-boundary derivatives. In this paper a cubic version was used, which interpolates to the position and the first-order cross-derivatives of the boundary curves. Therefore G1-continuous surfaces, which can be considered synonymous with first-order continuity (Farin, 1997), can be generated. The cross-derivatives between the data points have been interpolated linearly along each triangle edge.

2.4.3 Clough-Tocher Bezier splines

Clough-Tocher Bezier splines (Farin, 1997) are surfaces specified by means of control points. These control points attract the surface and allow the shape of the surface to be controlled in an intuitive way. To achieve G1-continuity, each triangle of the TIN has to be split into three subtriangles and a cubic Bezier triangle has to be specified for each. The shape of the surface is not fully determined by the condition of G1-continuity; a further assumption as to how the cross-derivatives along the edges of the macrotriangles behave has to be made. Again, linear interpolation of the cross-derivatives has been used.

2.4.4 Smoothed version of the Clough-Tocher spline

A smoothed version of the Clough-Tocher spline (Farin, 1985) was also utilised. With a Lagrange minimisation, the behaviour of the cross-derivatives along the triangle edges is constrained such that the deviation of the control points from a G2-transition between two triangles is minimised.
G1-continuity between the triangles has also been specified as a constraint for the minimisation.

2.5 Comparison techniques

2.5.1 Visual inspection

A shaded 3D projection was prepared for each of the four interpolators from the same location (with the view of the test area covering the road and the hill, where it was considered likely that artefacts may occur). These images were visually inspected to find irregularities and artefacts of the interpolation and tessellation.

2.5.2 Artificial surfaces

Values of elevation and derivatives can be calculated for any point on the surface. Elevation and the magnitude of the first derivative (gradient) were compared for a total of around 25 000 points, and a number of summary statistics and relationships were derived. The following selection is presented in this paper:
• Mean unsigned deviations of 'real' elevation values from the four interpolation methods.
• Correlations of the signed deviations of the interpolators with the total curvature (Gallant and Wilson, 2000) at a point.
• Graphs of slope values derived from linear and Coons interpolation against those derived from the artificial function.

2.5.3 Comparison of triangulation and control datasets

Global statistics were calculated for the deviation of the elevation at the 263 points measured in the control dataset from the triangulation dataset. The stability of these results was measured by recalculating these global statistics with 80 points randomly swapped between the two datasets (and four resulting new interpolated surfaces generated). To examine the variation in sensitivity of the interpolators to the nature of the surface being modelled, points on the surface were classified as lying on the hill, road (breakline) or plain. Global statistics were calculated and compared for these subsets of points. A second set of comparisons mapped the spatial variation of the signed deviations from the surface in order to see whether deviations of the interpolated surface from the control dataset were spatially autocorrelated with particular areas of the surface. Finally, the three measured profiles were
graphed against two-dimensional profiles extracted from the interpolated surfaces along the same lines, to examine in which areas the interpolated surfaces showed the greatest agreement with, and deviation from, the measured points.
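Since each linear facet carries a single gradient and aspect (Section 2.4.1), and slope deviations are one of the comparison measures above, the following Python sketch shows how those per-facet values can be obtained from the plane through a triangle's three vertices. It is a generic illustration, not the authors' code; the example triangle is made up.

```python
import numpy as np

def facet_slope_aspect(p0, p1, p2):
    """Slope (degrees) and downhill aspect (azimuth from north) of a planar TIN facet."""
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    # Solve z = a*x + b*y + c from the three (x, y, z) vertices.
    A = np.array([[p0[0], p0[1], 1.0],
                  [p1[0], p1[1], 1.0],
                  [p2[0], p2[1], 1.0]])
    a, b, _ = np.linalg.solve(A, np.array([p0[2], p1[2], p2[2]]))
    slope_deg = np.degrees(np.arctan(np.hypot(a, b)))
    aspect_deg = np.degrees(np.arctan2(-a, -b)) % 360.0   # azimuth of steepest descent
    return slope_deg, aspect_deg

# Example: a facet dipping towards the east
print(facet_slope_aspect((0, 0, 10.0), (10, 0, 8.0), (0, 10, 10.0)))
```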
3 Results

In this section the results are presented for the comparisons described above. Figures 3 and 4 show an overview of the surface and a detail of the surface as generated from the four interpolators and represented as a shaded 3D surface.
Fig. 3. Overview of the surface generated by a linear interpolator and viewed from the north east
Fig. 4. Detail of the surface showing the road and hill, viewed from the north-east for (a) a linear interpolator, (b) triangular Coons patches, (c) Clough-Tocher Bezier splines and (d) smoothed Clough-Tocher splines.
Table 1 shows the unsigned deviation of the interpolated values from the artificial surface, together with the maximum and standard deviation of these values. The correlation of the signed deviation of the surface with curvature was found to be 0.64 for the linear surface, whereas it was only 0.08 for the triangular Coons patches and Clough-Tocher Bezier splines, and 0.07 for the smoothed Clough-Tocher splines. Figure 5 shows the relationship between the analytically derived slope and the slope calculated from linearly interpolated triangles and Coons patches respectively. The standard deviation of slope from the analytically derived value was 5.25° for the linearly interpolated surface and 3.93° for the Coons patches.

                      Linear    Coons    Clough-Tocher   Smoothed Clough-Tocher
Mean deviation        2.61 m    0.87 m   0.83 m          0.83 m
Maximum deviation     10.90 m   7.16 m   7.17 m          7.04 m
Standard deviation    2.12 m    1.00 m   1.00 m          0.99 m

Table 1. Deviations of the interpolated elevation values from the artificial surface for around 25 000 points
Fig. 5. Relationship between slope derived from the linear interpolator (a) and Coons patches (b) with ‘real’ values of slope calculated analytically
Table 2 shows global statistics for the comparison of the elevation of the interpolated surfaces derived from the triangulation dataset with the control dataset, along with the results obtained when 80 points were randomly swapped between the datasets.

                      Linear            Coons             Clough-Tocher     Smoothed Clough-Tocher
Mean deviation        0.42 m (0.46 m)   0.31 m (0.33 m)   0.31 m (0.33 m)   0.31 m (0.33 m)
Maximum deviation     5.24 m (12.20 m)  5.50 m (12.36 m)  5.50 m (12.36 m)  5.51 m (12.35 m)
Standard deviation    0.47 m (0.82 m)   0.45 m (0.81 m)   0.45 m (0.81 m)   0.45 m (0.81 m)

Table 2. Deviations of the elevation of control data points from the interpolated surfaces derived from the triangulation dataset, with bracketed values representing the results when 80 data points were randomly swapped between datasets. 263 points were used in the evaluation.
Statistics were also calculated for four sub-divisions of the test area (hill, road and two parts of the plain). These statistics showed that the linear interpolator’s performance was similar to that of the other interpolators on the plain, and slightly worse than the cubic interpolators in other areas (as was the case in the global statistics). Since the global statistics were generally similar for the three cubic interpolators, spatial comparisons are only presented for comparisons between surfaces interpolated linearly and using Coons patches. Figures 6a and 6b illustrate the results obtained by mapping the variation in deviation of the control dataset from the interpolated triangulation dataset for two interpolators – the linear and Coons patch. In Figure 6c the interpolator with the smaller deviation from the triangulated interpolated surface (linear or Coons) is mapped along with the unsigned magnitude of the deviation.
Fig. 6. Deviation of the elevation values derived from the control dataset with respect to the interpolated surfaces from the triangulated datasets ((a) linear interpolation and (b) Coons patches). The hill is the triangular area in the bottom centre of the picture, with the plain to the top right. The image is oriented with north to the top. (c) shows which interpolator (linear or Coons) is closer to the surface.
In Figure 7 a comparison is shown of values measured along a 2D profile, as indicated in Figure 2, with the two interpolated surfaces, running from the top of the hill down onto the plain. The deviations are relatively small and are therefore shown magnified by a factor of ten.
Fig. 7. Deviation of a profile from the interpolated surfaces for linear interpolation and triangular Coons patches. Deviations are small and so are shown magnified by a factor of ten.
4 Discussion

The 3D images of the test area show, not surprisingly, the most apparent artefacts for the linear interpolator. The linear facets do not represent the hill well, since breaklines that are clearly derived from the tessellation appear on its surface. All four images show artefacts of the triangular tessellation, with the smoothed Clough-Tocher most effectively removing these artefacts. The interpolated artificial surface shows the greatest deviation from the analytical values for the linear interpolator (Table 1). The deviations of the three cubic interpolators are all of similar magnitude. Deviations of the linear interpolator from the surface also showed a strong tendency to correlate with convexities and concavities in the surface, illustrating well how this representation fails to deal with these important features of the terrain. Figure 5 shows how the deviation of the interpolated values of gradient varies with the value of gradient calculated directly from the analytical surface. Smaller values of slope show higher deviations, and the linear interpolator showed a greater mean deviation in slope than the other interpolators, though in all cases the deviations were relatively small. In a landscape where absolute values of slope are both small and important (i.e. the variation
in elevation is small), the choice of interpolator is much more important than for a steeper landscape, such as the hill in the test area used here. The global statistics demonstrate consistent results between the three cubic interpolators, with the performance of the linear interpolator being slightly worse. When the landscape is subdivided, all four interpolators performed equally well on the planar areas, where curvature did not exist at the scale of measurement (e.g. on the plain). This is consistent with the result obtained from the correlation of the linear interpolator's deviations with the convexities and concavities. Figures 6a and 6b, mapping the signed deviation of the two interpolators, show an overall similar pattern. However, the hill area (the triangle formed in the lower part of the figure) shows more positive deviation for the Coons patches, and the magnitude of the negative deviations is smaller than for the linear interpolation. This result again illustrates the efficacy of Coons patches in representing curved surfaces, with the variation in curvature for each triangle concentrated along its edges. Figure 6c further shows the better performance of the Coons patches on the hill, whilst the two interpolators perform similarly on the plain. Figure 7 illustrates that both interpolators lie relatively close to the measured surface. However, they also show high-frequency correlated variation which would result in uncertainty in derived values of slope and curvature. These variations are most likely artefacts of the underlying tessellation.
5 Conclusions and further work

In this paper a range of techniques has been used to compare four different interpolators on 'real' and artificial test datasets. In general the cubic interpolators performed better than the linear interpolator. However, little quantitative difference was found between the three cubic interpolators, although the smoothed Clough-Tocher reduced prominent artefacts when visualised. On surfaces where curvature is an important property, a linear interpolation is not adequate, and in order to allow local inflexion points at least a third-order interpolator is required. In turn, curvature is defined by the data resolution, and careful thought must be given to the implicit scale of the features being modelled when making a choice of interpolator. This study has indicated the value of a range of techniques for comparing irregular tessellations, and further work will investigate the influence of the variations caused by interpolators in these derived primary properties of topography on compound indices (Moore et al., 1993) such as
stream power, with the aim of specifying useful interpolators to users for differing applications.
Acknowledgements The Canton of Zug and the local farmers are thanked for the provision of data and permission to work on our test area. Alastair Edwardes is thanked for his assistance in the collection of field data in sometimes inclement conditions. This research was funded by the Swiss National Science Fund (Project No 59578).
References

Barnhill RE, Gregory JA (1975) Compatible smooth interpolation in triangles. Journal of Approximation Theory 15: 214-225
Bolstad P, Stowe T (1994) An evaluation of DEM accuracy: elevation, slope and aspect. Photogrammetric Engineering and Remote Sensing 60: 1327-1332
Corripio JG (2003) Vectorial algebra algorithms for calculating terrain parameters from DEMs and solar radiation modelling in mountainous terrain. IJGIS 17: 1-23
Evans IS (1980) An integrated system of terrain analysis for slope mapping. Zeitschrift fur Geomorphologie 36: 274-295
Farin G (1985) A modified Clough-Tocher interpolant. Computer Aided Geometric Design 2: 19-27
Farin G (1997) Curves and surfaces for CAGD. A practical guide (Academic Press)
de Floriani L, Puppo E (1992) An online algorithm for constrained Delaunay triangulation. Graphical Models and Image Processing 54: 290-300
Gallant JC, Wilson JP (2000) Primary Topographic Attributes. In Terrain Analysis: Principles and Applications, edited by Wilson JP and Gallant JC (Wiley): 51-85
Giles P, Franklin S (1996) Comparison of derivative topographic surfaces of a DEM generated from stereoscopic SPOT images with field measurements. Photogrammetric Engineering and Remote Sensing 62: 1165-1171
Hugentobler M (2002) Interpolation of continuous surfaces for terrain surfaces with Coons patches. In Proceedings of GISRUK 2002 (Sheffield, UK): 13-15
Kumler M (1994) An intensive comparison of TINs and DEMs. Cartographica (Monograph 45), 31: 2
Maggioni M, Gruber U (2003) The influence of topographic parameters on avalanche release dimension and frequency. Cold Regions Science and Technology 37: 407-419
Mitas L, Mitasova H (1999) Spatial interpolation. In Geographical Information Systems, edited by P Longley, MF Goodchild, DJ Maguire, and DW Rhind (Longman): 481-492
Moore ID, Grayson RB, Landson AR (1993) Digital terrain modelling: A review of hydrological, geomorphological and biological applications. In Terrain Analysis and Distributed Modelling in Hydrology, edited by Beven KJ and Moore ID (Wiley): 7-34
Peucker TK, Fowler RJ, Little JJ, Mark DM (1978) The Triangulated Irregular Network. Proceedings of the American Society of Photogrammetry: Digital Terrain Models (DTM) Symposium, St. Louis, Missouri, May 9-11, 1978: 516-540
Schneider B (1998) Geomorphologisch plausible Rekonstruktion der digitalen Repräsentation von Geländeoberflächen aus Höhenliniendaten. PhD thesis, University of Zurich
Schneider B (2001a) On the uncertainty of local shape of lines and surfaces. Cartography and Geographic Information Science 28: 237-247
Schneider B (2001b) Phenomenon-based specification of the digital representation of terrain surfaces. Transactions in GIS 5: 39-52
Schmidt J, Evans IS, Brinkmann J (2003) Comparison of polynomial models for land surface curvature calculation. IJGIS 17: 797-814
Skidmore A (1989) A comparison of techniques for calculating gradient and aspect from a gridded digital elevation model. IJGIS 3: 323-334
Walker JP, Willgoose GR (1999) On the effect of digital terrain model accuracy on hydrology and geomorphology. Water Resources Research 35: 2259-2268
Weibel R, Heller M (1991) Digital terrain modeling. In GIS: Principles and Applications, edited by Maguire DJ, Goodchild MF and Rhind DW (Wiley, New York): 269-297
Wise S (1997) The effect of GIS interpolation errors on the use of digital elevation models in geomorphology. In Landform Monitoring, Modelling and Analysis, edited by S Lane, K Richards, and J Chandler (Wiley): 139-164
Wood J, Fisher P (1993) Assessing interpolation accuracy in elevation models. IEEE Computer Graphics & Applications: 48-56
Contour Smoothing Based on Weighted Smoothing Splines

Leonor Maria Oliveira Malva

Departamento de Matemática, F.C.T.U.C., Apartado 3008, 3001-454 Coimbra, Portugal
email: [email protected]
Abstract

Here we present a contour-smoothing algorithm based on weighted smoothing splines, for contour extraction from a triangular irregular network (TIN) structure based on sides. Weighted smoothing splines are one-variable functions designed for approximating oscillatory data. Here some of their properties are derived for a small space of functions, working with few knots and special boundary conditions. However, in order to apply these properties to a two-variable application such as contour smoothing, local reference frames for direct and inverse transformation are required. The advantage of using weighted smoothing splines, as compared to pure geometric constructions such as the approximation by parabolic arcs or other types of spline function, is the fact that these functions adjust better to the data and avoid the usual oscillations of spline functions. We note that Bezier and B-spline techniques result in convenient, alternative representations of the same spline curves. While these techniques could be adapted to the weighted smoothing spline context, there is no advantage in doing so, as our approach is simple enough.
1 Introduction

By a triangular irregular network we mean a triangulation of the convex hull of scattered data in space. In this prismatic surface, each triangle is represented by a plane with the following equation,

f(x, y) = b_i z_i + b_j z_j + b_k z_k      (1.1)
where b_i, b_j, b_k are the barycentric coordinates. The interpolation function is a continuous but non-differentiable (non-smooth) surface. This means that the contours of the reconstructed surface are polygonal lines, parallel in the interior of each triangle and forming sharp connections between adjacent triangles. Following Christensen (2001), these will be called raw contours. In order to produce smooth contours from this kind of data it is usual to apply smoothing procedures such as B-splines or Bézier curves.

1.1 Data structure

In TIN models, data can be stored in different ways, namely in structures of triangles, sides, nodes or combinations of these. The side structure is more suitable for contour extraction, whereas the structure based on triangles is more suited to the computation of slope, aspect and volume. Therefore, we will describe a structure based on sides. Usually a structure based on sides (see figure 1) has three tables: a table of points, composed of four columns (point number, x coordinate, y coordinate, z coordinate); a table of sides, composed of five columns (side number, first node, second node, left triangle, right triangle); and a table of triangles, composed of four columns (triangle number, side one, side two, side three).
Fig. 1. Side structure
To implement our algorithm it is necessary to include another table composed of three columns (triangle number, barycentre x coordinate, barycentre y coordinate). This table allows the computation of the medians of each triangle. For example, in figure 2 the medians P1C1, P2C1 and P4C1 are computed from the barycentre C1 and from the vertices P1, P2, P4.
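For illustration, the short Python sketch below computes the barycentric coordinates of a point inside a triangle and evaluates the per-triangle plane of Equation 1.1; the barycentre used for the medians is simply the point with barycentric coordinates (1/3, 1/3, 1/3). This is a generic sketch with made-up values, not the author's implementation.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates (b_i, b_j, b_k) of 2D point p in triangle abc."""
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]])
    bi, bj = np.linalg.solve(T, np.asarray(p, float) - np.asarray(c, float))
    return bi, bj, 1.0 - bi - bj

def interpolate(p, tri_xy, tri_z):
    """Linear TIN interpolation (Equation 1.1) inside one triangle."""
    bi, bj, bk = barycentric(p, *tri_xy)
    return bi * tri_z[0] + bj * tri_z[1] + bk * tri_z[2]

tri_xy = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
tri_z = [100.0, 110.0, 105.0]
centroid = np.mean(tri_xy, axis=0)           # barycentre C of the triangle
print(interpolate(centroid, tri_xy, tri_z))  # 105.0, the mean of the vertex elevations
```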
1.2 Extracting linear contours

Let us define the elevation of the lowest contour. First it is necessary to verify whether this elevation lies between the elevations of the nodes of the first side. If it does, we compute the intersection of the contour with the side. Otherwise, the search proceeds through the sides of the triangulation until an intersection point is found. At this point we know the initial side, and we choose the left or right triangle for a particular contour. Then we search for a new side that belongs to the same triangle or to the adjacent triangle. This can be computed easily because, in the table of sides, each side records the triangle on its left and on its right. The procedure continues until we reach the first side again (for closed contours) or the boundary (zero triangle) for open contours. If the boundary is reached, the search restarts from the initial side, this time towards the triangle (left or right) that was not chosen initially; in this way a second part of the contour is computed. For large files it is likely that a contour has several sections. Throughout the procedure, every triangle found is placed in a vector of used triangles.
Fig. 2. The geometry of the procedure
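A minimal Python sketch of the tracing loop of Section 1.2 is given below, assuming the side table of Section 1.1 (each side stores its two nodes and its left/right triangles, with triangle 0 marking the boundary) and a triangle table listing the three sides of each triangle. The names and data layout are illustrative, not the author's code; degenerate cases where the level passes exactly through a vertex, and open contours with several sections, are not handled here.

```python
def side_crossing(z, nodes, side):
    """Return the (x, y) point where contour level z crosses the side, or None."""
    (x1, y1, z1), (x2, y2, z2) = nodes[side[0]], nodes[side[1]]
    if (z1 - z) * (z2 - z) > 0 or z1 == z2:      # level not strictly between the nodes
        return None
    t = (z - z1) / (z2 - z1)
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def trace_contour(z, nodes, sides, triangles, start_side):
    """Trace one section of the contour at level z, starting from start_side."""
    contour = [side_crossing(z, nodes, sides[start_side])]
    side_id, tri = start_side, sides[start_side][3]       # enter the right triangle
    while tri != 0:                                        # 0 marks the boundary
        # find the other side of this triangle crossed by the contour
        for s in triangles[tri]:
            if s != side_id and side_crossing(z, nodes, sides[s]) is not None:
                side_id = s
                break
        contour.append(side_crossing(z, nodes, sides[side_id]))
        if side_id == start_side:                          # closed contour
            break
        left, right = sides[side_id][2], sides[side_id][3]
        tri = left if left != tri else right               # step into the adjacent triangle
    return contour
```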
1.3 Christensen procedure

Our algorithm is based in part on the procedure of Christensen, which smooths the portions of raw contours included between the medians of adjacent triangles and the vertices of the same contour on the common edge of those triangles. As an illustration, take the adjacent triangles [P1, P2, P4] and [P2, P3, P4] (see figure 2); these are cut by a raw contour [V1 V2 V3] forming a sharp angle at V2.
The smoothing procedure consists of the substitution of the raw contour, between the intersection points H1 and H2 with the medians of the triangles, by a parabolic arc tangent to the raw contour at H1 and H2. However, this is done by a purely geometric procedure. Such a solution is illustrated in figure 3 and is, as Christensen notes, closer to the raw contours than Bezier curves or B-splines, because the latter put the contours at the wrong elevation due to their oscillatory behaviour.
Fig. 3. Interpolation of smooth contours using a parabola
2 Weighted smoothing splines

The weighted splines were introduced independently by Salkauskas (Salkauskas 1984) and Cinquin (Cinquin 1981) for one-variable interpolation of oscillatory data. However, this is not the only application of such functions, as we will see later. These functions are chosen from

H^2[a,b] = \{\, f : f' \text{ is absolutely continuous on } [a,b] \text{ and } f'' \in L^2[a,b] \,\}      (2.1)

in such a way as to interpolate and minimize the semi-norm

J(f) = \int_a^b w(t)\,[f''(t)]^2\,dt, \qquad w(t) \ge 0, \; w \not\equiv 0, \; t \in [a,b]      (2.2)

A classical smoothing spline (Wahba 1990) does not have to interpolate, and minimizes

Q(f) = \int_a^b [f''(t)]^2\,dt + \alpha \sum_{i=1}^{n} (z_i - b_i)^2, \qquad t \in [a,b]      (2.3)

The optimum is a certain cubic spline. As in J, a weight function can be included for additional shape control. Here, as in (Malva and Salkauskas 2000), we will need only the simplest setting of these splines: we restrict f to a small space of functions and work with few knots and special boundary conditions.
These are C^1 functions, which means that the derivative is continuous but not necessarily the second derivative. These functions can be written as a linear combination of the Hermite cardinal functions \phi_i, \psi_i, i = 1, 2, 3, for ordinate and derivative interpolation at the knots t_1 < t_2 < t_3. Then a piecewise cubic function s interpolating ordinate b_1 and slope m_1 at t_1, ordinate b and slope m at t_2, and ordinate b_3 and slope m_3 at t_3, can be written as

s = b_1\phi_1 + m_1\psi_1 + b\phi_2 + m\psi_2 + b_3\phi_3 + m_3\psi_3      (2.4)

on [t_1, t_3].

Proposition 2.1: For any nonnegative weight function w, which is not identically zero and is constant on the intervals [t_1, t_2] and [t_2, t_3], and for any \alpha > 0, there is a unique \sigma in the space of C^1 piecewise cubics with knots t_1 < t_2 < t_3, interpolating at t_1 with ordinate b_1 and slope m_1, and at t_3 with ordinate b_3 and slope m_3, which minimizes

Q(s) = \int_{t_1}^{t_3} w(t)\,[s''(t)]^2\,dt + \alpha\,[z - s(t_2)]^2      (2.5)

for any constant z. Furthermore,

\lim_{\alpha \to \infty} \sigma(t_2) = z      (2.6)
Proof. The proof follows that of Proposition 5.1 in (Malva and Salkauskas 2000), with the coefficients A, B, C, D and E given by

A = \int_{t_1}^{t_3} w\,(\phi_2'')^2\,dt, \qquad B = \int_{t_1}^{t_3} w\,\phi_2''\psi_2''\,dt, \qquad D = \int_{t_1}^{t_3} w\,(\psi_2'')^2\,dt,

C = \int_{t_1}^{t_3} w\,[\,b_1\phi_1''\phi_2'' + m_1\psi_1''\phi_2'' + b_3\phi_3''\phi_2'' + m_3\psi_3''\phi_2''\,]\,dt,

E = \int_{t_1}^{t_3} w\,[\,b_1\phi_1''\psi_2'' + m_1\psi_1''\psi_2'' + b_3\phi_3''\psi_2'' + m_3\psi_3''\psi_2''\,]\,dt      (2.7)

and b and m given by

b = \begin{vmatrix} \alpha z - C & B \\ -E & D \end{vmatrix} \Big/ \begin{vmatrix} A + \alpha & B \\ B & D \end{vmatrix}, \qquad m = \begin{vmatrix} A + \alpha & \alpha z - C \\ B & -E \end{vmatrix} \Big/ \begin{vmatrix} A + \alpha & B \\ B & D \end{vmatrix}      (2.8)

∎

Now we can make special choices of ordinates and slopes and end up with the following cases.

Case one: a piecewise cubic function s interpolating zero ordinate and slope m_1 at t_1, ordinate b and slope m at t_2, and zero ordinate and slope m_3 at t_3, can be written as

s = m_1\psi_1 + b\phi_2 + m\psi_2 + m_3\psi_3      (2.9)

on [t_1, t_3] (see figure 4, case 1).

Case two: a piecewise cubic function s interpolating zero ordinate and slope m_1 at t_1, ordinate b and slope m at t_2, and ordinate b_3 and slope m_3 at t_3, can be written as

s = m_1\psi_1 + b\phi_2 + m\psi_2 + b_3\phi_3 + m_3\psi_3      (2.10)

on [t_1, t_3] (see figure 4, case 2).

Fig. 4. The piecewise cubic functions
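The quantity minimised in Proposition 2.1 can also be handled numerically. The Python sketch below builds the C^1 piecewise cubic on [t_1, t_3] from the end ordinates and slopes, and finds the free ordinate b and slope m at t_2 by minimising Q(s) directly (piecewise-constant weight, Simpson's rule, which is exact here because s'' is linear on each segment). It is a numerical illustration with made-up parameter values, not a reimplementation of the closed form (2.8).

```python
import numpy as np
from scipy.optimize import minimize

def hermite_d2(u, h, ya, sa, yb, sb):
    """Second derivative of a cubic Hermite segment at local parameter u in [0, 1]."""
    return ((12*u - 6) * ya + (6*u - 4) * h * sa
            + (-12*u + 6) * yb + (6*u - 2) * h * sb) / h**2

def Q(bm, knots, ends, w, alpha, z):
    """Functional of Proposition 2.1 as a function of (b, m) at the middle knot."""
    b, m = bm
    t1, t2, t3 = knots
    (b1, m1), (b3, m3) = ends
    total = alpha * (z - b)**2
    for (ta, tb), (ya, sa, yb, sb), wi in (((t1, t2), (b1, m1, b, m), w[0]),
                                           ((t2, t3), (b, m, b3, m3), w[1])):
        h = tb - ta
        f = lambda u: hermite_d2(u, h, ya, sa, yb, sb)**2
        total += wi * h * (f(0.0) + 4*f(0.5) + f(1.0)) / 6.0   # Simpson, exact for quadratics
    return total

# Hypothetical data: end conditions from the raw contour, middle vertex ordinate z
knots, ends, w, z = (0.0, 1.0, 2.3), ((0.0, 0.8), (0.0, -0.6)), (1.0, 0.7), 0.5
for alpha in (0.0, 1.0, 1e6):
    res = minimize(Q, x0=[z, 0.0], args=(knots, ends, w, alpha, z))
    print(f"alpha={alpha:g}: b={res.x[0]:.3f}, m={res.x[1]:.3f}")
```

For alpha = 1e6 the recovered b approaches z, illustrating the limit (2.6); alpha = 0 gives the pure smoothing case discussed in Section 2.3.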
2.1 Local reference frames

Let [H1 V2 H2] be a portion of a raw contour to be smoothed. This polyline can have any orientation in the (x, y) coordinate system of the map. To apply the preceding theory it is necessary to have a local reference frame in which the points H1, V2, H2 correspond to a partition t_1 < t_2 < t_3. We identify three different situations.

Case 1: This corresponds to figure 5, where the local s axis has the orientation from H1 to H2. The local coordinates are related to the map coordinates by

t = (x - x_{H_1})\cos\theta + (y - y_{H_1})\sin\theta, \qquad s = -(x - x_{H_1})\sin\theta + (y - y_{H_1})\cos\theta      (2.11)

In this case the points H1, V2, H2 have abscissas t_1, t_2, t_3, with t_2 - t_1 = h and t_3 - t_2 = k.

Fig. 5. Local frame of reference, case 1

In this frame of reference the ordinate at H1 and at H2 is zero. The values of the slope of H1V2 at t_1 and of the slope of V2H2 at t_3 can be used to construct a weighted smoothing spline tangent to H1V2 at t_1 and tangent to H2V2 at t_3. The parameter \alpha and the ordinate z define the degree of interpolation at V2. Once the weighted smoothing spline is computed, one must apply the inverse transformation

x = t\cos\theta - s\sin\theta + x_{H_1}, \qquad y = t\sin\theta + s\cos\theta + y_{H_1}      (2.12)
to represent the computed function in the actual position.

Case 2: In some cases choosing a local frame as in case 1 does not lead to a partition t_1 < t_2 < t_3, and we get h > 0 and k < 0 instead (as can be seen in figure 6). In such cases we can choose a new local frame with the s' axis connecting H1 with the point P, which lies in the direction of H2V2 at distance |H2V2| from V2. In this frame we have in particular that h = k. There are in this case two direct transformations, involving the angles θ and β, needed to arrange the reference frame so that Proposition 2.1 can be applied, and thus two inverse transformations to put the computed function back into xy-coordinates. The values of m_1, m_3 are computed in the same way in the coordinates (t', s'), and the value b_3 is the ordinate of H2 on the same axes.
Fig. 6. Local frame of reference, case 2
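The direct and inverse transformations used in these cases (Equations 2.11 and 2.12) amount to a translation to H1 followed by a rotation by the angle θ of the segment H1H2, as in the Python sketch below. The point coordinates are made up and the code is a generic illustration, not the author's implementation.

```python
import numpy as np

def to_local(p, H1, H2):
    """Map point p from map (x, y) coordinates to the local frame of case 1 (Eq. 2.11)."""
    theta = np.arctan2(H2[1] - H1[1], H2[0] - H1[0])   # orientation of H1 -> H2
    c, s_ = np.cos(theta), np.sin(theta)
    dx, dy = p[0] - H1[0], p[1] - H1[1]
    return dx * c + dy * s_, -dx * s_ + dy * c          # (abscissa, ordinate)

def to_map(ts, H1, H2):
    """Inverse transformation back to map coordinates (Eq. 2.12)."""
    theta = np.arctan2(H2[1] - H1[1], H2[0] - H1[0])
    c, s_ = np.cos(theta), np.sin(theta)
    t, s = ts
    return t * c - s * s_ + H1[0], t * s_ + s * c + H1[1]

H1, H2, V2 = (10.0, 5.0), (14.0, 8.0), (12.5, 5.5)
print(to_local(V2, H1, H2))                    # abscissa t2 and ordinate of the middle vertex
print(to_map(to_local(V2, H1, H2), H1, H2))    # round-trips back to V2
```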
Case 3: In the last case we get h < 0 and k > 0 in the local frame (see figure 7). We therefore choose a new frame of reference centred on H1, with the direction t' perpendicular to H2P, where P lies in the direction of H1V2 at distance |H1V2| from V2. The rotation angles are computed in a similar way to the previous cases.
Fig. 7. Local reference frame, case 3

2.2 The weight function

The motivation underlying the choice of the weight function is to allow variations in the data to be followed by the adjusted spline function. The information available in this case consists of the knots \{t_i, i = 1, 2, 3\} and of the corresponding ordinates \{s_i, i = 1, 2, 3\}, and its variability can be assessed by the slope between adjacent segments. In the tests we chose the weight function introduced by Salkauskas (Salkauskas 1984),

w(t) = \left\{ 1 + \left[ \frac{s_i - s_{i-1}}{h_i} \right]^2 \right\}^{-3}, \qquad t \in [t_{i-1}, t_i), \; i = 2, 3      (2.13)

The choice was made so that \int_a^b w(t)\,[s''(t)]^2\,dt resembles the L_2-norm of the curvature of s.

2.3 Smoothing with weighted smoothing splines

Figure 8 is an application of Proposition 2.1. The non-interpolation case corresponds to \alpha = 0, and the interpolating spline corresponds to \alpha = \infty. We can also observe an intermediate spline corresponding to an arbitrary value of 0 < \alpha < \infty.
Fig. 8. Smoothing weighted spline, case 1
In figure 9 we can see the difference between the parabola obtained from the Christensen procedure and the weighted spline from our procedure. As can be observed, the weighted spline is closer to the raw contour than the parabola.
Fig. 9. Approximations to the raw contour
3 Applications
3.1 Generation of contours from spot heights

In order to apply the theory above to the generation of contours from spot heights, let us make a Delaunay triangulation of the spot heights included in figure 11. This is shown by the dashed lines in figures 10 a) and b). Figure 10 b) corresponds to the application of the results of Proposition 2.1 with α = 0 to the raw contours of figure 10 a). We point out that, in a similar way to the procedure of Christensen, where the parabola is approximated by a polyline, in this application the weighted smoothing splines are approximated by sets of polylines connecting points of the spline at a regular spacing of the parameter t.
Fig. 10. Contours from spot heights
3.2 Generation of intermediate contours

Let us take a portion of a contour map with ten-metre equidistance, covering a region of 2.0 × 2.0 km in Coimbra, in the centre of Portugal (see figure 11).
Fig. 11. Original contour map
Fig. 12. Weighted spline contour map
Since the vertices lie on contour lines of the map, any triangulation of the contours will produce raw contours passing through the vertices of the triangulation. That fact prevents the reconstruction of the original contours, either by the procedure of Christensen or by the application of our procedure. However, the same triangulation can be used to compute intermediate contours, as can be seen in figure 12. This results from a Delaunay triangulation of all the points on the contours and the application of the weighted
spline for the intermediate contours. The original contours are hatched to allow comparison with the original picture.
4 Conclusions

The Christensen procedure was designed to smooth raw contours in such a way that the resultant contour is closer to the raw one than the approximations with B-splines or Bezier curves. However, it does so by a purely geometric procedure. With the application of weighted smoothing splines we obtain contours that, depending on the value of α, can be closer to the raw contours than those of the Christensen procedure, and at the same time we have the advantage of producing C^1 curves. The previous sections show that this approach can be applied to contour extraction from spot heights or to the generation of intermediate contours.
References

Christensen AHJ (2001) Contour Smoothing by an Eclectic Procedure. Photogrammetric Engineering & Remote Sensing, 67(4): 511-517.
Cinquin P (1981) Splines Unidimensionelles Sous Tension et Bidimensionelles Parametrées: Deux Applications Médicales. Thèse, Université de Saint-Etienne.
Malva L, Salkauskas K (2000) Enforced Drainage Terrain Models Using Minimum Norm Networks and Smoothing Splines. Rocky Mountain Journal of Mathematics, 30(3): 1075-1109.
Rahman AA (1994) Design and evaluation of TIN interpolation algorithms. EGIS Foundation.
Salkauskas K (1984) C^1 splines for interpolation of rapidly varying data. Rocky Mountain Journal of Mathematics, 14(1): 239-250.
Wahba G (1990) Spline models for observational data. SIAM Stud. Appl. Math. 59.
Flooding Triangulated Terrain¹

Yuanxin Liu and Jack Snoeyink

Department of Computer Science, University of North Carolina, Chapel Hill, USA
{liuy,snoeyink}@cs.unc.edu

¹ Research partially supported by NSF grant 9988742.
Abstract

We extend pit filling and basin hierarchy computation to TIN terrain models. These operations are relatively easy to implement in drainage computations based on networks (e.g., raster D8 or Voronoi dual), but robustness issues make them difficult to implement in an otherwise appealing model of water flow on a continuous surface such as a TIN. We suggest a consistent solution to the robustness issues, then augment the basin hierarchy graph with different functions for how basins fill and spill, to simplify the watershed graph to the essentials. Our solutions can be tuned by choosing a small number of intuitive parameters to suit applications that require a data-dependent selection of basin hierarchies.
1 Introduction and Previous Work

Without a doubt, the computation of drainage characteristics, including basin boundaries, is one of the successes of GIS analysis. Digital data and GIS are widely employed to partition terrain into hydrological units, such as watersheds and basins. The US proposal for extending the hydrologic unit mapping from a 4-level to a 6-level hierarchy [5] includes discussion of the role of DEMs in what was formerly a manual cartographic process. A GIS can give preliminary answers to analysis questions such as how much rainfall runoff a downstream point can receive. Detailed hydrologic analysis will apply numerical computation to hillslope patches created to have uniform slope, soil, and/or vegetation characteristics. The raster DEM is the most common terrain and hydrology model in GIS, because regular grid representations of terrain are most amenable to computation. Many algorithms have been proposed for pit-filling [8,9,10],
barrier-breaching [16], flow direction assignment [2,15], basin delineation [12,14,23,24], and other steps of hydrologic modeling [4,13]. Another common terrain model is the TIN (Triangulated Irregular Network), which forms a continuous surface from triangles whose vertices are irregularly-sampled elevation points. A TIN is more complex to store and manipulate because it must explicitly store the irregular topology. The debate between using a grid or a TIN as the DEM model has been a long-standing one in GIS [11]. Often-mentioned advantages of a TIN include:
• A grid stores the terrain at uniform resolution, while a TIN can potentially store large flat regions as single polygons.
• To construct a grid from irregularly spaced data or multiple source data often requires interpolation and possible loss of information.
• Grid squares do not give a continuous surface model, so geometric operations and measurements on a surface must be approximated on grid cells.
• Water flow on a grid is commonly constrained to eight (or even four) neighbor directions for ease of computation, which can produce visible artifacts.
Our work focuses on basin computations on a TIN, including filling spurious pits and simplifying the basin hierarchy. We provide a framework for basin computation that can support several natural variations efficiently and robustly. We partition the computation into several steps:
1. Assign flow directions to triangles. For this we use steepest descent directions on the triangles, but other choices are possible, especially in flat portions of terrain. Triangles can be subdivided or otherwise parameterized to allow more complex specification of flow directions.
2. Trace the flow path network. A key contribution of this work is to keep explicit track of the order in which paths cross triangle edges and triangles, and thus avoid computational failures due to degenerate configurations and floating point inaccuracies, which otherwise make this a difficult task.
3. Compute basins, spill points, and an initial basin hierarchy.
4. Compute basin characteristics, such as spill times, by propagation through the basin hierarchy. With consistent flow paths, these become simple problems on graphs. The initial basin hierarchy contains many spurious pits, but its structure can help us decide which are spurious and which are significant.
5. Simplify the basin hierarchy and, if desired, the terrain to match. We give examples of hierarchies from basin spill times computed from projected area and volume for basins and the sub-basins that spill into them. There are natural variants for computing basin spill times that can be suited to different types of terrain.
Assignment of flow directions (step 1) and computation of basin spill times (step 4) can be carried out in several ways, a few of which we demonstrate. Our framework calls for simple output of these variants (an assignment of direction vectors to triangles, or event times to basin hierarchy edges) that allows experimentation with algorithm or terrain-dependent policies, while the bulk of the geometric computation remains fixed.
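Step 1 above assigns a steepest descent direction to each triangle. As a hedged illustration (not the authors' code), the descent direction of a planar facet is the negative of the horizontal gradient of the plane through its three vertices, as in the Python sketch below.

```python
import numpy as np

def steepest_descent_direction(p0, p1, p2):
    """Unit (x, y) direction of steepest descent on the plane through three 3D vertices.

    Returns None for a horizontal triangle, where steepest descent is undefined.
    """
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    n = np.cross(p1 - p0, p2 - p0)            # plane normal
    if n[2] == 0:
        raise ValueError("degenerate (vertical) triangle")
    grad = -n[:2] / n[2]                       # (dz/dx, dz/dy) of the plane
    if np.allclose(grad, 0):
        return None                            # flat triangle: no descent direction
    d = -grad
    return d / np.linalg.norm(d)

print(steepest_descent_direction((0, 0, 10.0), (10, 0, 8.0), (0, 10, 9.0)))
```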
The basin hierarchy has been studied in grid-based terrain by Band et al. [12]. An explicit representation of the basin hierarchy during pit-filling provides structural information that can help us decide which pits are spurious. Moreover, it simplifies the handling of pits—e.g., we need not assign flow directions at spill points, as many pit-filling algorithms on a grid must do, because we just send overflow into the basin we spill into. Some other researchers have used a TIN as the terrain model for flow computation, but directed flow only along triangle edges, using either the triangles [20,21] or the dual Voronoi cells [18,22] as the hydrological unit. Thus, they still use a discrete network flow model, much like the raster; they simply substitute an irregular network for the regular grid. The flow model that we use has an underlying continuous surface, discretized along flow paths, rather than by a limited set of directions (D8 or edges). Two sections give definitions and sketches of the algorithms in our work: Sec. 2 reviews previous results [17,25] that we use directly and Sec. 3 describes the basin filling model. Sec. 4 details our implementation, which includes the computational steps above. Sec. 5 describes our experiments, and Sec. 6 discusses possible future work.
2 TIN-based hydrology model

We use a TIN-based hydrology model proposed by Yu et al. [25] that extends the accurate definitions of Frank et al. [6]. We briefly review relevant definitions, properties, and algorithms. A terrain is mathematically modeled as the graph of a real-valued function f over a bounded region of the plane. This definition does not tell us how to store f or, given a set of noisy or irregular data samples, how such a function f might be computed. Still, this general definition of terrain allows us to specify water flow precisely: the trickle path of a point is the steepest descent path that ends either at a local minimum (pit) or at the boundary of the terrain (Figure 1c). The watershed of a point p is the set of all points whose trickle paths contain p. The watercourse network is the set of all points whose watersheds have nonzero area. Catchments (strips draining into a portion of the watercourse network) and basins (watersheds of pits) can also be defined. Since a TIN is a piecewise linear surface, the only segments on a watercourse network are local channels (Figure 1a) and segments of steepest descent paths on triangle faces traced from saddle vertices of the TIN. With this observation, the watercourse network is straightforward to compute: take the set of all segments that belong to the watercourse network and join
them at intersection points. It can easily be shown that this watercourse network can be characterized as follows:
• The watercourse network in a TIN is a collection of disjoint (graph-theoretic) trees rooted at pits, whose leaves are local channels.
Fig. 1 The thick edges in a) and b) are a local channel and a local ridge. c) shows a trickle path. d) and e) show segments of a watershed graph (dotted) before and after being joined in the neighborhood of a vertex.
McAllister [17] shows how to compute a basin graph: an embedded planar graph for the entire TIN whose faces are polygons that define basins and whose edges separate pairs of adjacent basin faces. We first compute a watershed graph, which is essentially a basin graph with extra interior segments. We then delete the interior segments to obtain the basin graph. The computation of the watershed graph consists of two steps: 1. Collect the local ridges (Figure 1b) of the TIN, and the steepest descent paths traced backward from each saddle point of the TIN. 2. The open line segments from step 1 are connected to form an embedded planar graph consistent with the TIN hydrology model.
Step 1 is simple, and we note the similarity between the watershed graph and the watercourse network, whose segments are local channels and steepest descent paths. Intuitively, the segments for the watershed graph form ridges that potentially separate basins. Step 2 is complicated, and we examine the details here. There are two cases: we can connect the upper end point of a steepest descent path to a local ridge segment, or connect two segments ending at a TIN vertex. The first case is general, and easy to handle. The second is degenerate, and requires care. Figure 1d shows the neighborhood of a vertex, with the open segments shortened by an infinitesimal length. If we simply join these segments by extending them to the vertex, the trickle paths through the vertex will be cut off, dividing a basin face of the watershed graph. Instead, we must join the segments so they do not “collide” with the trickle paths through the vertex as in Figure 1e. These joining operations in a vertex neighborhood involve only graph operations on data structures. A key problem with many hydrology models, including this one, is selecting an appropriate scale. How can we consistently extract hydrological
objects at a desired scale—say, the watershed of a large river—no matter how detailed our terrain model is? With a TIN, two triangles define a segment of the watercourse network regardless of whether they correspond to the banks of a river or sides of a ditch. This problem is exacerbated by data resolutions ranging from 100-meter DEMs to sub-meter LIDAR. Another key problem of this model is its assumption that water has no volume. In heavy rainfall, depressions, such as storm ponds in urban areas, become inundated, changing the drainage characteristics of the terrain. The good news is that the work on TIN-based hydrology models allows for an extension that meets the challenge of these key problems.
3 The basin filling model
Suppose that we change the model so that water has volume and can accumulate in basins. Then we must allow a basin to fill until it “spills” into a neighboring basin. Consider the moment the surface of the accumulated water in a basin A reaches the lowest point, a, on its boundary—the spill point. Let basin B be the basin that a drains into. Then the two basins can be merged into a single one: the boundary between A and B is deleted, and the pit of the new basin is the pit of B. This filling model naturally defines a sequence of merges forming a basin hierarchy tree: leaves are the initial basins, and the root is the merge of all basins. It also defines a sequence of deformation operations on the terrain, each replacing a filled basin by the flat surfaces of its water body. Although trickle paths on flat surfaces are not well defined, since steepest descent is not unique, the trickle path of point p in a filled basin A starts at p and exits at the spill point of A. Therefore, on a “flooded” terrain, a trickle of water can still be directed from its origin to a pit. To incrementally compute the basin hierarchy tree, we can compute the “spill time” of each basin, then repeatedly take the basin with the earliest spill time, merge it with a neighboring basin and compute the new spill time of the merged basin. This “flooding simulation” algorithm responds to events rather than simulating flow at regular time steps. Each iteration processes one event when a topological change of the hierarchy tree and terrain occurs. The algorithm calculates volumes and surface areas—which can be done accurately up to round-off errors—and updates data structures that store local geometry and connectivity as described in the next section. The spill times merely define an ordering of the basins; other definitions of spill times can be substituted. We describe experiments in Sec. 5.
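The event-driven structure of this computation can be made concrete with a priority queue keyed by spill time. The following C++ sketch is a minimal illustration of that bookkeeping under stated assumptions: the Basin record and the helpers spillTime, spillNeighbor and mergeBasins are placeholders for the data structures of Sec. 4, not the authors' actual code.

```cpp
#include <queue>
#include <vector>

// Hypothetical basin record: the real program stores triangles, volumes, areas
// and basin-graph faces (Sec. 4); the fields here are placeholders only.
struct Basin {
    int id;             // index of the corresponding basin-graph face
    int parent = -1;    // parent in the basin hierarchy tree (-1 = not merged yet)
    bool alive = true;  // false once this basin has been merged away
};

struct SpillEvent {
    double time;  // spill time under the chosen filling strategy
    int basin;    // basin that spills at this time
    bool operator>(const SpillEvent& o) const { return time > o.time; }
};

// Assumed helpers (not from the paper): spillTime evaluates the current filling
// strategy, spillNeighbor returns the basin that the spill point drains into,
// and mergeBasins merges the two basin-graph faces, appends the merged basin
// to the vector and returns its index.
double spillTime(const std::vector<Basin>&, int);
int    spillNeighbor(const std::vector<Basin>&, int);
int    mergeBasins(std::vector<Basin>&, int, int);

void buildBasinHierarchy(std::vector<Basin>& basins) {
    std::priority_queue<SpillEvent, std::vector<SpillEvent>,
                        std::greater<SpillEvent>> events;
    for (const Basin& b : basins)
        events.push({spillTime(basins, b.id), b.id});

    std::size_t remaining = basins.size();
    while (remaining > 1 && !events.empty()) {
        SpillEvent e = events.top(); events.pop();
        if (!basins[e.basin].alive) continue;            // stale event: basin already merged
        int nb = spillNeighbor(basins, e.basin);
        int merged = mergeBasins(basins, e.basin, nb);   // one topological change of terrain
        basins[e.basin].alive = basins[nb].alive = false;
        basins[e.basin].parent = basins[nb].parent = merged;
        events.push({spillTime(basins, merged), merged});
        --remaining;
    }
}
```

Stale events for basins that have already been merged are simply skipped when they surface, which avoids having to delete queue entries.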
4 Data structures and implementation
We have implemented the basin filling computations in C++. Our program takes a set of points in 3D as input and processes them as follows: 1. Create a TIN by computing the Delaunay triangulation of the input points (using only the x and y coordinates). 2. Compute the watershed graph of the TIN, and delete the internal edges of the graph to create a basin graph. 3. Run the flooding computation until one basin is left. At each step, a filled basin is found that spills into a neighbor by merging the corresponding faces in the basin graph. The terrain surface is modified to reflect this by simply deleting the set of triangles that are below the water surface and recording the water surface height with the partially immersed triangles (the sketch following this list illustrates the bookkeeping). 4. Simplify the basin hierarchy by removing edges, merging pairs of basins, and displaying the corresponding basin graph.
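As one concrete reading of the per-basin bookkeeping behind step 3 (and of the triangle sets discussed below), the sketch keeps each basin's triangles sorted by elevation so that all triangles in a given height interval can be located by binary search. The Triangle fields are placeholders, not the program's actual records.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical triangle record; only the lowest vertex height matters here.
struct Triangle {
    int v[3];      // vertex indices into the TIN
    double minZ;   // height of the lowest vertex, precomputed
};

struct BasinTriangles {
    std::vector<Triangle> tris;   // kept sorted by minZ

    void sortByHeight() {
        std::sort(tris.begin(), tris.end(),
                  [](const Triangle& a, const Triangle& b) { return a.minZ < b.minZ; });
    }

    // Triangles whose lowest vertex lies in the height interval [lo, hi),
    // returned as an index range [first, last) into the sorted vector.
    std::pair<std::size_t, std::size_t> inHeightInterval(double lo, double hi) const {
        auto cmp = [](const Triangle& t, double z) { return t.minZ < z; };
        auto first = std::lower_bound(tris.begin(), tris.end(), lo, cmp);
        auto last  = std::lower_bound(tris.begin(), tris.end(), hi, cmp);
        return {static_cast<std::size_t>(first - tris.begin()),
                static_cast<std::size_t>(last - tris.begin())};
    }
};
```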
The two most important objects in the program, the TIN and the watershed graph, can each be stored as planar subdivisions using data structures such as Quadedge [7]. However, a number of options should be considered to reduce the space and speed up the computation. These are particularly important due to the size of the data sets we would like to handle. The watershed graph shares many edge segments with the TIN—in particular, all the local ridges of a TIN are part of the watershed graph. Therefore, instead of duplicating the connectivity information in the TIN, we simply store it in a simple graph data structure with pointers to the segments and vertices of the TIN. An advantage of using a simple graph data structure is that vertex and edge insertion and deletion operations have lower overhead than the same operations on a subdivision. Note, however, that the watershed graph is more than a subset of the TIN. As discussed earlier, the watershed graph in an infinitesimal neighborhood of a saddle point can be very different from the TIN. If we wish to implement sophisticated subdivision operations such as the merge-face operations for flooding, we want to have the basin graph stored as a subdivision independent of the TIN. Fortunately, the watershed graphs have a large number of internal edges, which implies that the basin graphs are much smaller. We create the basin graph subdivision after we have deleted all the internal edges from the watershed graph. The flooding computation need not know the connectivity between triangles in the TIN. It is enough to keep, for each basin, the set of triangles in it, ordered by the height so that the triangles within a height interval can be quickly located. Other options that trade between computation time, space, and ease of implementation are yet to be explored. E.g., computing the watershed is currently the memory bottle-neck of our program; pointers between the
TIN elements and the simple watershed graph data structure still take too much space. These pointers provide topological information through the TIN that represent the planar embedding of the watershed graph. We can try to reconstruct the basin graph subdivision using only the coordinate information, but numerical errors and degeneracy make this a challenge. 4.1 Robustness issues in implementation Geometric algorithms that manipulate topological data structures are harder to implement robustly than algorithms that manipulate rasters. The culprits are round-off error and geometric “degeneracies” in a problem. Both have been extensively studied in the field of computational geometry [1,3,19]. Although these results are often quite technical, the most important techniques are not hard to understand. Bit complexity analysis often involves only back-of-the-envelope calculations, and degeneracies can be eliminated by implementing a small set of policies that conceptually perturb the input. We look at these two techniques in more detail to robustly implement the algorithms for this flooding model. 4.1.1 Numerical issues Geometric algorithms derive spatial relationships from computations on coordinates, but a computer uses only a limited number of bits to represent a number. If algorithm correctness depends on ideal geometry, round-off error results in at best an approximation and, at worst, a crash. Three numerical computations are involved in our algorithm: 1. Computing the steepest descent direction for each triangle. 2. Testing whether water on some triangle flows into a triangle edge or away from it. This will classify whether an edge is a local channel, a ridge, or neither. 3. Tracing the steepest descent path backward from a vertex to a local ridge.
For 1), observe that if we replace “steepest descent” by “unique descent direction,” in the definition of trickle path, the definitions for watersheds are still consistent. So, if we assign each triangle some descent direction that closely approximates the steepest descent, we have a TIN whose drainage characteristics are acceptable as a close approximation of the original. We store the steepest descent vector as single-precision, though the exact vector requires double-precision. We simply round off the directional vector so that the result gives a descent direction. For 2), testing the flow direction on a triangle edge is an orientation test, which is of algebraic degree two. Since double precision is supported in hardware, an exact implementation of this test is straightforward.
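For the degree-two test in item 2, one possible exact realization is sketched below. It assumes, in line with the storage choices just described, that both the rounded descent vector and the edge vector have single-precision components; the sign convention for “toward” versus “away” is ours, not the paper's.

```cpp
// Exact sign of the 2D cross product u x v = ux*vy - uy*vx for single-precision
// inputs.  Each product of two floats is exactly representable in a double
// (24 + 24 significand bits < 53), and the final subtraction of two doubles can
// neither flip the sign nor round a nonzero difference to zero, so the returned
// sign is exact.  Returns +1, 0 or -1.
int crossSign(float ux, float uy, float vx, float vy) {
    double a = static_cast<double>(ux) * static_cast<double>(vy);
    double b = static_cast<double>(uy) * static_cast<double>(vx);
    double d = a - b;
    return (d > 0.0) - (d < 0.0);
}

// One way the flow-classification test could be phrased: e is the directed edge
// vector, g the rounded descent direction of the adjacent triangle.  Whether
// "left of e" means toward or away from the edge depends on the traversal
// orientation of the triangle, which is an assumption here.
enum class FlowSide { Toward, Away, Along };

FlowSide classifyEdge(float ex, float ey, float gx, float gy) {
    int s = crossSign(ex, ey, gx, gy);
    if (s < 0) return FlowSide::Toward;   // g points to the right of e
    if (s > 0) return FlowSide::Away;     // g points to the left of e
    return FlowSide::Along;               // degenerate: resolved by the policy in Sec. 4.1.2
}
```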
For 3), we have the most problematic numeric computation involved in the algorithm. Each time we cross a triangle as we back-trace a steepest descent path, we must compute the intersection of the steepest ascent path with an edge of the triangle, which quadruples the precision for exact computation. If k intersections are computed, we go from b bits to a worst case of (3k+1)b bits to represent the last point of the path exactly. We use single precision floating point instead, with the unfortunate consequence that not only does the path's position in space become inexact, we can no longer guarantee watershed graph properties such as that each face contains exactly one pit. This approximation can be acceptable as long as small errors do not cause catastrophic changes. The main issue is that two inexact paths might cross each other, contradicting the assumption that the steepest descent path is unique. We can try two ways to handle this: 1. We can compute the watershed graph without regard to whether the steepest ascent paths cross each other. Then, the nodes of the computed watershed graph that are supposed to be on the steepest ascent paths cannot be embedded right away. We must first repair steepest ascent paths so they no longer intersect. 2. We can incrementally maintain the invariant that no steepest ascent paths cross by asserting that the order of the entry points of the steepest ascent paths into a polygon must be the same as the order of the exit points. In the data structure, we maintain a list of all steepest ascent flows across each triangle edge.
We have chosen 2) in our implementation. When we back-trace a steepest descent path through a triangle, we first compute the position of the exit point by numeric computation. If this position fails to satisfy the invariant, we assign a position in the valid range that the path is allowed to exit. In our experiments, we have found that the number of faces that do not have exactly one pit is small and the deviation is never more than one. 4.1.2 Resolving degeneracies Geometric algorithms often assume that the inputs are in general position. For example, no three points lie on the same line. Subsets of the input that violate the assumption are called degeneracies [1] because an infinitesimal random perturbation of the input will eliminate them. To actually handle the degeneracies in an implementation, one can either directly handle all the “exceptions” introduced, or create policies that treat the degeneracies consistent with some infinitesimal perturbation. In McAllister [17], the first option is taken, while we have done the latter. We list below the general position assumptions made in our algorithm and how the corresponding degeneracies are handled with perturbations. 1. No two points have the same height. In the degenerate case, when two z-values are equal, compare their x-values; and if their x-values are equal, compare their y-values. This policy is consistent with infinitesimally rotating the xy-plane.
2. The steepest descent direction in a triangle is not in the same direction as any of the edges of the triangle. We need this to test whether water on a triangle flows into a triangle edge or away. In the degenerate case, we choose “away.” This is consistent with rotating the steepest descent direction infinitesimally. 3. The steepest ascent path does not go through a triangle vertex. In the degenerate case, we choose one of the two edges adjacent to the vertex as the next edge. This is consistent with perturbing the x and y coordinates of the vertex infinitesimally. Note that this perturbation precludes the possibility that an area can drain into another area through a single trickle path.
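In code, a perturbation policy such as assumption 1 typically reduces to a single comparison routine that every height test goes through; the sketch below is one way this could look (the Vertex record is a placeholder).

```cpp
// Hypothetical vertex record with single-precision coordinates.
struct Vertex { float x, y, z; };

// "Lower than" with the tie-breaking policy of assumption 1: compare heights,
// break ties by x, then by y, simulating an infinitesimal rotation of the
// xy-plane so that no two distinct vertices ever have equal height.
bool lowerThan(const Vertex& a, const Vertex& b) {
    if (a.z != b.z) return a.z < b.z;
    if (a.x != b.x) return a.x < b.x;
    return a.y < b.y;
}
```

Routing all height comparisons (pit detection, saddle classification, triangle ordering) through one such function keeps the implied perturbation consistent across the whole program, which is exactly the consistency requirement discussed next.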
When implementing a perturbation policy, we must reason backwards about what the policy implies about how the input is perturbed and convince ourselves that perturbations in different operations are not contradictory. If this is not done, the behavior of the program can be unpredictable.
5 Experiments We have tested our program against digital elevation data from various sources. The graphical output of the program is shown in Figure 2. Basin boundaries are drawn as outlines over the hill-shaded terrains. Black regions are flooded regions of the terrain. We have used several different basin filling strategies. Each strategy effectively defines a different basin spill time. We have experimented with these four strategies: 1. Uniform precipitation model: compute a basin's filling time by dividing the volume of a basin by (projected) area. Once a basin spills into another, the areas and volumes add. This is consistent with the simplistic, but physical, assumption that all rainfall turns into surface flow. 2. The basins are filled in order of increasing volume. 3. The basins are filled in order of increasing area. 4. For each basin, we compute two numbers: the number of basins that have spilled into it and its filling time under the uniform precipitation model. Two basins are compared by lexicographic ordering of their number pairs.
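Since the strategies differ only in the ordering key assigned to a basin, they can be swapped by changing a single key function. The sketch below illustrates strategies 1 and 4; the BasinState fields and the ascending ordering direction for strategy 4 are assumptions, as the text does not spell them out.

```cpp
#include <utility>

// Hypothetical aggregate state of a basin at the moment its key is computed.
struct BasinState {
    double volume;       // remaining volume below the spill point
    double area;         // projected area (areas add when basins merge)
    int spillsReceived;  // number of basins that have spilled into it (strategy 4)
};

// Strategy 1: uniform precipitation model, filling time = volume / projected area.
double uniformPrecipitationKey(const BasinState& b) {
    return b.volume / b.area;
}

// Strategy 4: lexicographic ordering of the pair (spills received, filling time);
// std::pair already compares lexicographically.
std::pair<int, double> lexicographicKey(const BasinState& b) {
    return {b.spillsReceived, uniformPrecipitationKey(b)};
}
```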
These filling strategies—some corresponding to precipitation models— demonstrate the flexibility of the flooding model and the program. The program stores enough topological and geometrical information so that the numbers used for the ordering can be easily computed. Figure 2 shows the output using the strategies 1 and 2 (please refer to our website, http://www.cs.unc.edu/~liuy/flooding/, for more results); two terrain models were used as the input. The terrain model for the first column of pictures was produced from 1 degree quad of USGS DEM data from Grand Junction, Colorado. The second terrain model was produced from unfiltered LIDAR data from Baisman Run, Maryland.
Fig. 2 Basin graphs from two terrain models using two filling strategies. Each basin is delineated. The number of basins is also shown. These particular filling strategies were motivated by observations about our initial results from a uniform precipitation model. For example, in the second row of Figure 2, we see that large rivers merge before small basins are filled. This is unfortunate, though unsurprising with this model, since a large basin with a low spill point holds a relatively small amount of water compared to the amount of water it receives through precipitation
over a large area. Our other filling strategies somewhat alleviate this problem. The improvement is especially obvious with LIDAR data, which contains elevation points from vegetation; these points often produce “wells” on the terrain that are hard to fill with a uniform precipitation model.
6 Conclusions and Future Work We have shown how to produce basin hierarchies on a TIN with the basin filling model. We have implemented our algorithms robustly and given sample outputs. Currently, the largest TINs we can handle are about one million data points, so we would like to use more out-of-core algorithms and data structures to handle tens of millions of points. We would like to compare our results against basin hierarchies produced with grid-based algorithms, and to include more physically-based parameters into our implementation, such as absorption rate determined by land cover type. We would also like to incorporate the basin graph into TIN simplification so that the basin graph of the simplified terrain is close to a basin graph at some appropriate level of the basin hierarchy of the original terrain, where closeness is defined both topologically and geometrically.
References [1] P. Alliez, O. Devillers, and J. Snoeyink. Removing degeneracies by perturbing the problem or perturbing the world. Reliable Computing, 6:61-79, 2000. [2] T-Y Chou, W-T Lin, C-Y Lin, W-C Chou, and P-H Huang. Application of the PROMETHEE technique to determine depression outlet location and flow direction in DEM. J Hydrol, 287(1-4):49-61, 2004. [3] H. Edelsbrunner and E.P. Mücke. Simulation of simplicity: A technique to cope with degenerate cases in geometric algorithms. ACM TOG, 9(1):66–104, 1990. [4] J. Fairfield and P. Leymarie. Drainage networks from grid digital elevation models. Water Resour. Res, 27:709–717, 1991. [5] Federal standards for delineation of hydrologic unit boundaries. Version 1.0. http://www.ftw.nrcs.usda.gov/HUC/HU_standards_v1_030102.doc, Mar. 2001. [6] A.U. Frank, B. Palmer, and V.B. Robinson. Formal methods for the accurate definition of some fundamental terms in physical geography. In Proc. 2nd Intl. SDH, pages 585–599, 1986. [7] Leonidas J. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM TOG, 4(2):74–123, 1985. [8] M.F. Hutchinson. Calculation of hydrologically sound digital elevation models. In Proc. 3rd Int. SDH, pages 117–133, 1988.
[9] S.K. Jenson and J.O. Domingue. Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engineering and Remote Sensing, 54(11):1593–1600, Nov. 1988. [10] S.K. Jenson and C.M. Trautwein. Methods and applications in surface depression analysis. In Proc. AUTO-CARTO 8, pages 137–144, 1987. [11] M.P. Kumler. An intensive comparison of triangulated irregular networks (TINs) and digital elevation models (DEMs). Cartog., 31(2), 1994. mon 45. [12] D.S. Mackay and L.E. Band. Extraction and representation of nested catchment areas from digital elevation models in lake-dominated topography. Water Resources Research, 34(4):897–902, 1998. [13] David R. Maidment. GIS and hydrologic modeling – an assessment of progress. In Proc. Third Conf. GIS and Environmental Modeling, Santa Fe, NM, 1996. NCGIA http://www.sbg.ac.at/geo/idrisi/gis_environmental_modeling/ sf_papers/maidment_david/maidment.html.
[14] D. Marks, J. Dozier, and J. Frew. Automated basin delineation from digital elevation data. Geo-Processing, 2:299–311, 1984. [15] L.W. Martz and J. Garbrecht. The treatment of flat areas and closed depressions in automated drainage analysis of raster digital elevation models. Hydrological Processes, 12:843–855, 1998. [16] L.W. Martz and J. Garbrecht. An outlet breaching algorithm for the treatment of closed depressions in a raster DEM. Computers and Geosciences, 25, 1999. [17] M. McAllister. The Computational Geometry of Hydrology Data in Geographic Information System. Ph.D. thesis, UBC CS, Vancouver, 1999. [18] O.L. Palacios-Velez and B. Cuevas-Renaud. Automated river-course, ridge and basin delineation from digital elevation data. J Hydrol, 86:299–314, 1986. [19] J. Shewchuk. Adaptive precision floating point arithmetic and fast robust geometric predicates. Discrete & Comp. Geom. 18:305-363, 1997. [20] A.T. Silfer, G.J. Kinn, and J.M. Hassett. A geographic information system utilizing the triangulated irregular network as a basis for hydrologic modeling. In Proc. Auto-Carto 8, pages 129–136, 1987. [21] D.M. Theobald and M.F. Goodchild. Artifacts of TIN-based surface flow modeling. In Proc. GIS/LIS’90, pages 955–964, 1990. [22] G.E. Tucker, S.T. Lancaster, N.M. Gasparini, R.L. Bras, S.M. Rybarczyk. An object-oriented framework for distributed hydrologic and geomorphic modeling using triangulated irregular networks. Comp Geosc, 27(8):959–973, 2001. [23] K. Verdin and S. Jenson. Development of continental scale DEMs and extraction of hydrographic features. In Proc. 3rd Conf. GIS and Env. Model., Santa Fe, 1996. http://edcdaac.usgs.gov/gtopo30/papers/santafe3.html. [24] J.V. Vogt, R. Colombo, and F. Bertolo. Deriving drainage networks and catchment boundaries: a new methodology combining digital elevation data and environmental characteristics. Geomorph., 53(3-4):281–298, July 2003. [25] S. Yu, M. van Kreveld, and J. Snoeyink. Drainage queries in TINs: from local to global and back again. In Proc. 7th SDH, pages 13A.1–13A.14, 1996.
Vague Topological Predicates for Crisp Regions through Metric Refinements Markus Schneider University of Florida Department of Computer and Information Science and Engineering Gainesville, FL 32611, USA
[email protected] Abstract Topological relationships between spatial objects have been a focus of research on spatial data handling and reasoning for a long time. Especially as predicates they support the design of suitable query languages for data retrieval and analysis in spatial databases and geographical information systems. Whereas research on this topic has always been dominated by qualitative methods and by an emphasis of a strict separation of topological and metric, that is, quantitative, properties, this paper investigates their possible coexistence and cooperation. Metric details can be exploited to refine topological relationships and to make important semantic distinctions that enhance the expressiveness of spatial query languages. The metric refinements introduced in this paper have the feature of being topologically invariant under affine transformations. Since the combination of a topological predicate with a metric refinement leads to a single unified quantitative measure, this measure has to be interpreted and mapped to a lexical item. This leads to vague topological predicates, and we demonstrate how these predicates can be integrated into a spatial query language. Keywords. Vague topological relationship, metric refinement, quantitative refinement, 9-intersection model, lexical item, spatial data type, spatial query language
1 Introduction In recent years, the exploration of topological relationships between objects in space has turned out to be a multi-disciplinary research issue involving disciplines like spatial databases, geographical information systems, CAD/CAM
This work was partially supported by the National Science Foundation under grant number NSF-CAREER-IIS-0347574.
systems, image databases, spatial analysis, computer vision, artificial intelligence, linguistics, cognitive science, psychology, and robotics. From a database perspective, their development has been motivated by the necessity of formally defined topological predicates as filter conditions for spatial selections and spatial joins in spatial query languages, both at the user definition level for reasons of conceptual clarity and at the query processing level for reasons of efficiency. Topological relationships like overlap, inside, or meet describe purely qualitative properties that characterize the relative positions of spatial objects to each other and that are preserved (topologically invariant) under continuous transformations such as translation, rotation, and scaling. They deliberately exclude any consideration of metric, that is, quantitative, measures and are associated with notions like adjacency, coincidence, connectivity, inclusion, and continuity. Some well known, formal, and especially computational models for topological relationships have already been proposed for spatial objects, for example for regions. They permit answers to queries like “Are regions A and B disjoint?” or “Do regions A and B overlap?”. Unfortunately, these purely qualitative approaches (topology per se) are sometimes insufficient to express the full essence of spatial relations, since they do not capture all details to make important semantic distinctions. This is motivated in Figure 1 for the topological relationship overlap. Obviously, for all three configurations the predicate overlap(A, B) yields true. But there is no way to express the fact that in the left configuration regions A and B hardly overlap, that the middle configuration represents a typical overlap, and that in the right configuration regions A and B predominantly overlap. In these statements and the corresponding resulting queries, the degree of overlapping between two spatial objects is of decisive importance. The crucial aspect is that this degree is a relative metric, and thus quantitative, feature which is topologically invariant under affine transformations. This leads to metrically refined topological relationships having a vague or blurred nature. Transferring this observation to concrete applications, we can consider polluted areas, for example. Here it is frequently not only interesting to know the fact that areas are polluted but also to which degree they are polluted. If two land parcels are adjacent, then often not only this fact is interesting but also the degree of their adjacency.

Fig. 1. Topological relationship overlap(A, B) with different degrees of overlapping.

Section 2 discusses some relevant related work about topological relationships. Our design is based on the 9-intersection model, an approach that uses point set topology to define a classification of binary topological relationships in a purely qualitative manner. The goals of this paper are then pursued in the following sections. In Section 3, we explore metrically refined topological relationships on spatial regions with precisely determined boundaries (so-called crisp regions) and show how qualitative descriptions (topological properties) can be combined with quantitative aspects (relative metric properties) into a single unified quantitative measure between 0 and 1. This leads us to vague topological predicates. In Section 4, we demonstrate how the obtained
quantitative measures can be mapped to lexical items corresponding to natural language terms like “a little bit inside” or “mostly overlap”. This introduces a kind of vagueness or indeterminacy into user queries which is an inherent feature of human thinking, arguing, and reasoning. Section 5 deals with the integration of these indeterminate predicates into an SQL-like query language. Finally, Section 6 draws some conclusions.
2 Related Work
An important approach for characterizing topological relationships rests on the so-called 9-intersection model (Egenhofer et al. 1989). This model allows one to derive a complete collection of mutually exclusive topological relationships for each combination of spatial types. The model is based on the nine possible intersections of boundary (∂A), interior (A◦), and exterior (A−) of a spatial object A with the corresponding components of another object B. Each intersection is tested with regard to the topologically invariant criteria of emptiness and non-emptiness. 2⁹ = 512 different configurations are possible from which only a certain subset makes sense depending on the definition and combination of spatial objects just considered. For each combination of spatial types this means that each of its predicates p can be associated with a unique boolean intersection matrix BI p (Table 1) so that all predicates are mutually exclusive and complete with regard to the topologically invariant criteria of emptiness and non-emptiness.

    BI_p(A, B) = ( ∂A ∩ ∂B ≠ ∅   ∂A ∩ B◦ ≠ ∅   ∂A ∩ B− ≠ ∅
                   A◦ ∩ ∂B ≠ ∅   A◦ ∩ B◦ ≠ ∅   A◦ ∩ B− ≠ ∅
                   A− ∩ ∂B ≠ ∅   A− ∩ B◦ ≠ ∅   A− ∩ B− ≠ ∅ )

Table 1. The boolean 9-intersection matrix. Each matrix entry is a 1 (true) or 0 (false).
Topological relationships have been first explored for simple regions (Clementini et al. 1993, Cui et al. 1993, Egenhofer et al. 1989). A simple region is a bounded, regular closed set homeomorphic (that is, topologically equivalent) to a two-dimensional closed disc¹ in IR². Regularity of a closed point set eliminates geometric anomalies possibly arising from dangling points, dangling lines, cuts, and punctures in such a point set (Behr & Schneider 2001). From an application point of view, this means that a simple region has a connected interior, a connected boundary, and a single connected exterior. Hence, it does not consist of several components, and it does not have holes. For two simple regions eight meaningful configurations have been identified which lead to the well known eight topological predicates of the set T = {disjoint, meet, overlap, equal, inside, contains, covers, coveredBy}. In a vector notation from left to right and from top to bottom, their (well known) intersection matrices are: BI disjoint (A, B) = (0, 0, 1, 0, 0, 1, 1, 1, 1), BI meet (A, B) = (1, 0, 1, 0, 0, 1, 1, 1, 1), BI overlap (A, B) = (1, 1, 1, 1, 1, 1, 1, 1, 1), BI equal (A, B) = (1, 0, 0, 0, 1, 0, 0, 0, 1), BI inside (A, B) = (0, 1, 0, 0, 1, 0, 1, 1, 1), BI contains (A, B) = (0, 0, 1, 1, 1, 1, 0, 0, 1), BI covers (A, B) = (1, 0, 1, 1, 1, 1, 0, 0, 1), BI coveredBy (A, B) = (1, 1, 0, 0, 1, 0, 1, 1, 1). For reasons of simplicity and clear presentation, in this paper, we will confine ourselves to metric refinements of topological relationships for simple regions. An extension to general, complex regions (Schneider 1997), that is, regions possibly consisting of several area-disjoint components and possibly having area-disjoint holes, is straightforward. For this purpose, metric refinements have to be applied to the 33 generalized topological predicates between two complex regions (Behr & Schneider 2001). Approaches dealing with metric refinements of spatial relationships on crisp spatial objects are rare. In (Hernandez et al. 1995) metric refinements of distance relationships are introduced to characterize indeterminate terms like very close, close, far, and very far. In (Peuquet & Xiang 1987, Goyal & Egenhofer 2004) directional relationships like north, north-west, or southeast are metrically refined. Two papers deal at least partially with metric refinements of topological relationships. In (Vazirgiannis 2000) refinements which are similar to our directed topological relationships are proposed. Metric details, which are similar to our metric refinements, are used in (Egenhofer & Shariff 1998) to refine natural-language topological relationships between a simple line and a simple region and between two simple lines. There are a number of differences to our approach. First, we deal with two regions. Second, they do not interpret the entries of a 9-intersection matrix for a given topological predicate as optimum values, as we do in Section 3.2. Third, our set of refinements is systematically developed and complete but not ad hoc (see Section 3.1). Fourth, their refinements are not combined with the 9-intersection matrix into a so-called similarity matrix, as in our case (see Section 3.2). Fifth, they do not employ our concept of applicability degree (see Section 3.2).
¹ D(x, ε) denotes a two-dimensional closed disc with center x ∈ IR² and radius ε ∈ IR⁺ iff D(x, ε) = {y ∈ IR² | d(x, y) ≤ ε}, where d is a metric on IR².
Two completely different approaches to modeling indeterminate topological predicates rest on the concept of so-called fuzzy topological predicates (Schneider 2001a, Schneider 2001b), which are defined on complex fuzzy regions (Schneider 1999). That is, in contrast to the assumption in this paper, these predicates operate on (complex) regions whose extent cannot be precisely determined or is not precisely known.
3 Metric Refinements of Topological Relationships
Topological relationships are designed as binary predicates yielding a boolean and thus strict decision whether such a relationship holds for two spatial objects or not. Metric details based on the geometric properties of the two spatial objects can be used to relax this strictness. As we will see in Sections 4 and 5, they enable us to describe vague nuances of topological relationships, as they often and typically occur in natural language expressions, and hence they allow us to refine topological relationships in an indeterminate manner. Queries like “Which are the land parcels that are hardly adjacent to parcel X?” or “Which landscape areas are mostly contaminated (overlapped) with toxic substances?” can then be posed and answered. To describe metrical details, we use relative area and length measures provided by the operand objects (in our case two simple regions). These measures are normalized values with respect to the areas of interiors and lengths of boundaries of two simple regions. Consequently, they are scale-independent and topologically invariant.

3.1 Refinement Ratio Factors
We now introduce six refinement ratio factors which are illustrated in Figure 2. For the definition of all factors we assume two simple regions A and B.

Fig. 2. Refinement ratio factors CA(A, B), OA(A, B), EA(A, B), IBS(A, B), OBS(A, B), and CBS(A, B).

The common area ratio

    CA(A, B) = area(A◦ ∩ B◦) / area(A◦ ∪ B◦)

specifies the degree to which regions A and B share their areas. Obviously, CA(A, B) = 0, if A◦ ∩ B◦ = ∅ (that is, A and B are disjoint or they meet), and CA(A, B) = 1, if A◦ = B◦ (that is, A and B are equal). Like all the other factors to be presented, the common area factor is independent of scaling, translation, and rotation, and hence constant. This factor is also symmetric, that is, CA(A, B) = CA(B, A). The outer area ratio

    OA(A, B) = area(A◦ ∩ B−) / area(A◦)

computes the ratio of that portion of A that is not shared with B. Here, OA(A, B) = 0, if A◦ = B◦, and OA(A, B) = 1, if A◦ ∩ B− = A◦, that is, A and B are disjoint or they meet. Obviously, the outer area ratio is not symmetric, that is, OA(A, B) ≠ OA(B, A). The exterior area ratio

    EA(A, B) = area(A− ∩ B−) / area(A− ∪ B−)

calculates the ratio between the area of the common exterior of A and B on the one hand and the area of the application reference system, where all our regions are located, minus the area of the intersection of A and B on the other hand. The reference system is usually called the Universe of Discourse (UoD). We assume that our UoD is bounded and thus not equal to but a proper subset of the Euclidean plane. This is not a restriction, since all spaces that can be dealt with in a computer are bounded. If A and B meet and their union is the UoD, EA(A, B) = 0. If A = B, EA(A, B) = 1. The exterior area ratio is symmetric, that is, EA(A, B) = EA(B, A). The inner boundary splitting ratio

    IBS(A, B) = length(∂A ∩ B◦) / length(∂A)

determines the degree to which A's boundary is split by B. If A and B are disjoint or meet, then IBS(A, B) = 0. If A is inside or coveredBy B, then IBS(A, B) = 1. The inner boundary splitting ratio is not symmetric, that is, IBS(A, B) ≠ IBS(B, A). The outer boundary splitting ratio

    OBS(A, B) = length(∂A ∩ B−) / length(∂A)

yields the degree to which A's boundary lies outside of B. If A is inside or coveredBy B, then OBS(A, B) = 0. If A and B are disjoint or meet, then OBS(A, B) = 1. The outer boundary splitting ratio is not symmetric, that is, OBS(A, B) ≠ OBS(B, A). The common boundary splitting ratio

    CBS(A, B) = length(∂A ∩ ∂B) / length(∂A ∪ ∂B)

calculates the degree to which regions A and B share their boundaries. Obviously, CBS(A, B) = 0, if ∂A ∩ ∂B = ∅, and CBS(A, B) = 1, if ∂A ∩ ∂B = ∂A,
which means that A and B are equal. The common boundary splitting ratio is also symmetric, that is, CBS(A, B) = CBS(B, A). The common boundary splitting ratio is especially important for computing the degree of meeting of two regions. A problem arises with this factor if the common boundary parts do not have a linear structure but consist of a finite set of points. Figure 3a shows such a meeting situation. The calculation of CBS(A, B) leads to 0, because due to regularization ∂A ∩ ∂B = ∅ and the length is thus 0. Hence, common single points are not taken into account by this factor, although they should be for correctly evaluating (the degree of) a meeting situation. To solve this problem, for each simple region A we introduce two additional simple regions Aout and Ain which are slightly enlarged and reduced, respectively, by scale factors 1 + ε and 1 − ε, respectively, with ε > 0 (Figure 3b). We then consider ∆A = Aout − Ain as the extended boundary of A and redefine the common boundary splitting ratio as

    CBS(A, B) = area(∆A ∩ ∆B) / area(∆A ∪ ∆B)

Fig. 3. Problem configuration for the common boundary splitting ratio (a), enlarged region Aout and reduced region Ain for region A, and 0-dimensional (c) and 1-dimensional (d) boundary intersections with their corresponding boundary areas.
In Figures 3c and d, the dark shaded regions show the extended boundaries of A and B. The diagonally hatched regions correspond to the boundary intersection of A and B. The refinement ratio factors have, of course, not been defined arbitrarily. They have been specified in a way so that each intersection occurring in a matrix entry in BI p is contained as an argument of the area or length function of the numerator of a refinement ratio factor. As an example, consider the intersection ∂A ∩ B ◦ included in the inequality of the first row and second column of BI p (A, B). This intersection reappears as argument of the length function of the numerator of IBS (A, B). The intersection A◦ ∩ ∂B (second row, first column of BI p (A, B)) is captured by IBS (B, A). The purpose of the denominator of a refinement ratio factor then is to make the factor a relative and topologically invariant measure.
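Since each factor is a ratio of two precomputed measures, its evaluation is a one-liner once the areas and lengths are available. The sketch below takes those primitive measures as inputs (computing them from the region geometries is outside its scope); all names are ours, not the paper's.

```cpp
// Primitive measures of two simple regions A and B, assumed precomputed by some
// polygon library: areas and boundary lengths of the intersections that appear
// in the numerators and denominators of the six refinement ratio factors.
struct Measures {
    double areaIntAIntB, areaIntAUnionIntB;         // area(A° ∩ B°), area(A° ∪ B°)
    double areaIntAExtB, areaIntA;                  // area(A° ∩ B⁻), area(A°)
    double areaExtAExtB, areaExtAUnionExtB;         // area(A⁻ ∩ B⁻), area(A⁻ ∪ B⁻)
    double lenBndAIntB, lenBndAExtB, lenBndA;       // length(∂A ∩ B°), length(∂A ∩ B⁻), length(∂A)
    double areaDeltaADeltaB, areaDeltaAUnionDeltaB; // area(ΔA ∩ ΔB), area(ΔA ∪ ΔB)
};

double CA (const Measures& m) { return m.areaIntAIntB     / m.areaIntAUnionIntB; }
double OA (const Measures& m) { return m.areaIntAExtB     / m.areaIntA; }
double EA (const Measures& m) { return m.areaExtAExtB     / m.areaExtAUnionExtB; }
double IBS(const Measures& m) { return m.lenBndAIntB      / m.lenBndA; }
double OBS(const Measures& m) { return m.lenBndAExtB      / m.lenBndA; }
double CBS(const Measures& m) { return m.areaDeltaADeltaB / m.areaDeltaAUnionDeltaB; }
```

The asymmetric factors OA, IBS and OBS with swapped arguments are obtained by filling a second Measures record with the roles of A and B exchanged.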
3.2 Evaluation of the Applicability of a Topological Relationship
In this subsection we show how the concept of metric refinement can be used to assess the applicability of a topological relationship for a given spatial configuration with a single, numerical value. This then leads us to the concept of a vague topological relationship.

The Similarity Matrix
The boolean intersection matrix BI p (A, B) contains nine strict, binary intersection tests leading either to true (1) or false (0). We now replace each intersection test (inequality) by the corresponding refinement ratio factor. This leads us to the real-valued similarity matrix RS p (A, B) (Table 2) which we now employ in order to represent and estimate a topological relationship between two simple regions. Each matrix entry of RS p (A, B) represents a value between 0 and 1 and is interpreted as the degree to which the corresponding intersection in BI p (A, B) holds. That is, the statement about the existence of an intersection is replaced by a statement about the degree of an intersection.

    RS_p(A, B) = ( CBS(A, B)   IBS(A, B)   OBS(A, B)
                   IBS(B, A)   CA(A, B)    OA(A, B)
                   OBS(B, A)   OA(B, A)    EA(A, B) )

               = ( area(∆A ∩ ∆B)/area(∆A ∪ ∆B)   length(∂A ∩ B◦)/length(∂A)    length(∂A ∩ B−)/length(∂A)
                   length(A◦ ∩ ∂B)/length(∂B)    area(A◦ ∩ B◦)/area(A◦ ∪ B◦)   area(A◦ ∩ B−)/area(A◦)
                   length(A− ∩ ∂B)/length(∂B)    area(A− ∩ B◦)/area(B◦)        area(A− ∩ B−)/area(A− ∪ B−) )

Table 2. The real-valued similarity matrix. Each matrix entry is computed as a value between 0 and 1.
Seen from this perspective, each matrix entry 0 or 1 of BI p can be interpreted in a new, different way, namely as the “optimum”, “best possible”, or sometimes “asymptotic” degree to which the corresponding intersection occurring as part of an intersection test in BI p holds. On the other hand, this is not necessarily obvious. Hence, for each predicate p, Table 3 contains an analysis of the suitability of a matrix entry in BI p for our interpretation. The left column contains a list of the topological predicates. The first row uses shortcuts to represent the nine intersections. For example, ∂◦ means ∂A ∩ B◦ (≠ ∅), and ◦∂ means A◦ ∩ ∂B (≠ ∅). An entry “+” in the table indicates that the respective 0 or 1 in BI p is the optimum, perfect, and adopted value to fulfil predicate p. For example,
                ∂∂    ∂◦    ∂−    ◦∂    ◦◦    ◦−    −∂    −◦    −−
    equal       +     +     +     +     +     +     +     +     (+)
    meet        (+)   +     +     +     +     +     +     +     (+)
    disjoint    +     +     +     +     +     +     +     +     (+)
    inside      +     +     +     +     (+)   +     +     (+)   (+)
    contains    +     +     +     +     (+)   (+)   +     +     (+)
    covers      (+)   +     +     +     (+)   (+)   +     +     (+)
    coveredBy   (+)   +     +     +     (+)   +     +     (+)   (+)
    overlap     (+)   (+)   (+)   (+)   (+)   (+)   (+)   (+)   (+)

Table 3. Suitability of a matrix entry in BI p for interpreting it as the degree to which the respective intersection holds.
if for disjoint the intersection of the boundary of A and the interior of B is empty (matrix entry 0), this is the optimum that can be reached for the inner boundary splitting ratio IBS (A, B). If for covers the intersection of the boundary of A and the exterior of B is non-empty (matrix entry 1), this is the optimum that can be reached for the outer boundary splitting ratio OBS (A, B). This situation implies that the boundary of B touches the boundary of A only in single points and not in curves. An entry “(+)” expresses that the respective 0 or 1 in BI p is an asymptotic value for predicate p. That is, this value can be approached in an arbitrarily precise way but in the end it cannot and may not be reached. For example, for meet the common boundary splitting ratio CBS (A, B) can be arbitrarily near to 1. But it cannot become equal to one, because then the relationship equal would hold. For our later computation, this is no problem. We simply assume the respective asymptotic 0’s and 1’s as optimum values. Computing the Degree of Applicability Whereas exactly one topological relationship applies with the boolean 9intersection matrix, we will show now that the similarity matrix enables us to assess all topological relationships but with different degrees of applicability. For that purpose, we test for the similarity of RS p with BI p . Since we can interpret the matrix entries of BI p for a predicate p as the ideal values, the evaluation of the similarity between RS p and BI p can be achieved by measuring for each matrix entry RS p(x,y) (A, B) with x, y ∈ {∂,◦ ,− } the deviation from the corresponding matrix entry BI p(x,y) (A, B). The idea is then to condense all nine deviation values to a single value by taking the average of the sum of all deviations. We call the resulting value the applicability degree of a topological relationship p with respect to two simple regions A and B. The applicability degree is computed by a function µ taking two simple regions and the name of a topological predicate p as operands and yielding a real value between 0 and 1 as a result:
    µ(A, B, p) = (1/9) · Σ_{x ∈ {∂, ◦, −}} Σ_{y ∈ {∂, ◦, −}} [ if BI_p(x,y)(A, B) then RS_p(x,y)(A, B) else 1 − RS_p(x,y)(A, B) ]
What we have gained is the relaxation of the strictness of a topological predicate p : region × region → {0, 1} (region shall be the type for simple regions) to an applicability degree function µ : region × region × T → [0, 1] (remember that T is the set of all topological predicates). The applicability degree µ(A, B, p) gives us the extent to which predicate p holds for two simple regions A and B. We abbreviate µ(A, B, p) by the vague topological predicate value pv (A, B) with pv : region × region → [0, 1]. The term pv indicates the association to predicate p. Whereas the topological predicate p maps to the set {0, 1} and thus results in a strict and abrupt decision, the vague topological predicate pv maps to the closed interval [0, 1] and hence permits a smooth evaluation. Codomain [0, 1] can be regarded as the data type for vague booleans.
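Read operationally, the applicability degree is a nine-term average of agreements between RS_p and BI_p. The sketch below assumes both matrices are already available as flat arrays in the vector order of Section 2; it is an illustration, not the author's implementation.

```cpp
#include <array>

// 9-intersection matrices stored row by row in the order
// (∂∂, ∂°, ∂⁻, °∂, °°, °⁻, ⁻∂, ⁻°, ⁻⁻), as in the vectors of Section 2.
using BoolMatrix = std::array<int, 9>;     // entries 0/1 of BI_p
using RealMatrix = std::array<double, 9>;  // entries of RS_p, each in [0, 1]

// Applicability degree µ(A, B, p): average agreement between RS_p and the
// ideal values in BI_p.
double applicabilityDegree(const BoolMatrix& BI, const RealMatrix& RS) {
    double sum = 0.0;
    for (int k = 0; k < 9; ++k)
        sum += BI[k] ? RS[k] : 1.0 - RS[k];
    return sum / 9.0;
}

// Example ideal matrices from Section 2, written as 0/1 vectors.
constexpr BoolMatrix BI_overlap = {1, 1, 1, 1, 1, 1, 1, 1, 1};
constexpr BoolMatrix BI_inside  = {0, 1, 0, 0, 1, 0, 1, 1, 1};
```

Evaluating applicabilityDegree against all eight ideal matrices yields, for one spatial configuration, the degree to which each predicate applies; for a given p this value is precisely the vague predicate value pv(A, B).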
4 Mapping Quantitative Measures to Qualitative, Lexical Items The fact that the applicability degree yielded by a vague topological predicate is a computationally determined quantification between 0 and 1, that is, a vague boolean, impedes a direct integration into a query language. First, it is not very comfortable and user-friendly to use such a numeric value in a query. Second, spatial selections and spatial joins are not able to cope with vague predicates and expect strict and exact predicates as filter conditions that yield true or false. As a solution which maintains this requirement, we propose to embed adequate qualitative linguistic descriptions of nuances of topological relationships as appropriate interpretations of the applicability degrees into a spatial query language. Notice that the linguistic descriptions given in the following are arbitrary and exchangeable, since it is beyond the scope of this paper to discuss linguistic reasons how to associate a meaning to a given applicability degree. In particular, we think that the users themselves should be responsible for specifying a list of appropriate linguistic terms and for associating an applicability degree with each of them. This gives them greatest flexibility for querying. For example, depending on the applicability degree yielded by the predicate inside v , the user could distinguish between not inside, a little bit inside, somewhat inside, slightly inside, quite inside, mostly inside, nearly completely inside, and completely inside. These user-defined, vague linguistic terms can then be incorporated into spatial queries together with the topological predicates they modify. We call these terms vague quantifiers, because their semantics lies between the universal quantifier for all and the existential quantifier there exists.
Fig. 4. Membership functions for vague quantifiers.
We know that a vague topological predicate pv is defined as pv : region × region → [0, 1]. The idea is now to represent each vague quantifier γ ∈ Γ = {not, a little bit, somewhat, slightly, quite, mostly, nearly completely, completely, . . . } by an appropriate membership function µγ : [0, 1] → [0, 1]. Let A, B ∈ region, and let γ pv be a quantified vague predicate (like somewhat inside with γ = somewhat and pv = inside v ). Then we can define:

    γ pv (A, B) = true  :⇔  (µγ ◦ pv )(A, B) = 1
That is, only for those values of pv (A, B) for which µγ yields 1, the predicate γ pv is true. A membership function that fulfils this quite strict condition is, for instance, the partition of [0, 1] into n ≤ |Γ | disjoint or adjacent intervals completely covering [0, 1] and the assignment of each interval to a vague quantifier. If an interval [a, b] is assigned to a vague quantifier γ, the intended meaning is that µγ (pv (A, B)) = 1, if a ≤ pv (A, B) ≤ b, and 0 otherwise. For example, the user could select the intervals [0.0, 0.02] for not, [0.02, 0.05] for a little bit, [0.05, 0.2] for somewhat, [0.2, 0.5] for slightly, [0.5, 0.8] for quite, [0.8, 0.95] for mostly, [0.95, 0.98] for nearly completely, and [0.98, 1.00] for completely. Alternative membership functions are shown in Figure 4. While we can always find a fitting vague quantifier for the partition due to the complete coverage of the interval [0, 1], this is not necessarily the case here. Each vague quantifier is associated with a vague number having a trapezoidal-shaped or triangular-shaped membership function. The transition between two consecutive vague quantifiers is smooth and here modeled by linear functions. Within a vague transition area, µγ yields a value less than 1 which makes the predicate γ pv false. Examples in Figure 4 can be found at 0.2, 0.5, or 0.8. Each vague number associated with a vague quantifier can be represented as a quadruple (a, b, c, d) where the membership function starts at (a, 0), linearly increases up to (b, 1), remains constant up to (c, 1), and linearly decreases up to (d, 0). Figure 4 assigns (0.0, 0.0, 0.0, 0.02) to not, (0.01, 0.02, 0.03, 0.08) to a little bit, (0.03, 0.08, 0.15, 0.25) to somewhat, (0.15, 0.25, 0.45, 0.55) to slightly, (0.45, 0.55, 0.75, 0.85) to quite, (0.75, 0.85, 0.92, 0.96) to mostly, (0.92, 0.96, 0.97, 0.99) to nearly completely, and (0.97, 1.0, 1.0, 1.0) to completely.
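As an illustration of how a vague quantifier could be realized, the sketch below stores the quadruple (a, b, c, d) and evaluates the trapezoidal membership function together with the strict reading of γ pv; the class names are ours, and the relaxed reading anticipates the definition given just below.

```cpp
// Trapezoidal membership function (a, b, c, d): 0 before a, rising linearly to 1
// at b, constant 1 up to c, falling linearly to 0 at d.
struct VagueQuantifier {
    double a, b, c, d;

    double membership(double x) const {
        if (x < a || x > d) return 0.0;
        if (x < b)  return (x - a) / (b - a);   // rising edge (only reached when a < b)
        if (x <= c) return 1.0;                 // plateau
        if (x < d)  return (d - x) / (d - c);   // falling edge (only reached when c < d)
        return 0.0;                             // x == d with c < d
    }
};

struct QuantifiedPredicate {
    VagueQuantifier gamma;
    // Strict reading: (γ p_v)(A, B) is true iff µ_γ(p_v(A, B)) = 1.
    bool holdsStrict(double pv) const { return gamma.membership(pv) == 1.0; }
    // Relaxed reading (introduced just below): any positive membership counts.
    bool holdsRelaxed(double pv) const { return gamma.membership(pv) > 0.0; }
};
```

For example, with quite = (0.45, 0.55, 0.75, 0.85) and pv(A, B) = 0.6, holdsStrict returns true; at pv(A, B) = 0.5 the membership is 0.5, so only holdsRelaxed returns true.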
So far, the predicate γ pv is only true if µγ yields 1. We can relax this strict condition by defining:

    γ pv (A, B) = true  :⇔  (µγ ◦ pv )(A, B) > 0
In a spatial database system this gives us the chance also to take the transition zones into account and to let them make the predicate γ pv true. When evaluating a spatial selection or join in a spatial database system on the basis of a vague topological predicate, we can even set up a weighted ranking of database objects satisfying the predicate γ pv at all and being ordered by descending membership value 1 ≥ µγ (x) > 0 for some value x ∈ [0, 1]. A special, optional vague quantifier, denoted by at all, represents the existential quantifier and checks whether a predicate pv can be fulfilled to any extent. An example query is: “Do regions A and B (at all ) overlap?” With this quantifier we can determine whether µγ (x) > 0 for some value x ∈ [0, 1].
5 Querying
In this section we briefly demonstrate with a few example queries how spatial data types and quantified vague topological predicates can be integrated into an SQL-like spatial query language. It is not our objective to give a full description of a specific language. We assume a relational data model where tables may contain regions as attribute values in the same way as integers or strings. What we need first are mechanisms to declare and to activate user-defined vague quantifiers. These mechanisms should allow the user to specify trapezoidal-shaped and triangular-shaped membership functions as well as partitions. In general, this means to define a (possibly overlapping) classification, which for our example in Section 4 could be expressed by the user in the following way:

    create classification fq (not               (0.00, 0.00, 0.00, 0.02),
                              a little bit      (0.01, 0.02, 0.03, 0.08),
                              somewhat          (0.03, 0.08, 0.15, 0.25),
                              slightly          (0.15, 0.25, 0.45, 0.55),
                              quite             (0.45, 0.55, 0.75, 0.85),
                              mostly            (0.75, 0.85, 0.92, 0.96),
                              nearly completely (0.92, 0.96, 0.97, 0.99),
                              completely        (0.97, 1.0, 1.0, 1.0))
Such a classification could then be activated by

    set classification fq

We assume that we have a relation pollution, which stores among other things the geometry of polluted zones as regions, and a relation areas, which keeps
information about the use of land areas and which stores their spatial extent as regions. A query could be to find out all inhabited areas where people are rather endangered by pollution. This can be formulated in an SQL-like style as (we here use infix notation for the predicates):

    select areas.name
    from   pollution, areas
    where  areas.use = inhabited and
           pollution.region quite overlaps areas.region

This query and the following two represent vague spatial joins. Another query asks for those inhabited areas lying almost entirely in polluted areas:

    select areas.name
    from   pollution, areas
    where  areas.use = inhabited and
           areas.region nearly completely inside pollution.region

Assume that we are given living spaces of different animal species in a relation animals and that their indeterminate extent is represented as a vague region. Then we can search for pairs of species which share a common living space to some degree:

    select A.name, B.name
    from   animals A, animals B
    where  A.region at all overlaps B.region

As a last example, we can ask for animals that usually live on land and seldom enter the water or for species that never leave their land area (the built-in aggregation function sum is applied to a set of vague regions and aggregates this set by repeated application of vague geometric union):

    select name
    from   animals
    where  (select sum(region) from areas) nearly completely covers or
           completely covers region
6 Conclusions In this paper we have presented a simple but expressive and effective concept showing how metric details can be leveraged to make important semantic distinctions of topological relationships on simple regions. The resulting vague topological predicates are often more adequate for expressing a spatial situation than their coarse, strict counterparts, because they are multi-faceted and
much nearer to human thinking and questioning. Consequently, they allow a much more natural formulation of spatial queries than we can find in current spatial query languages. We are currently working on a prototype implementation for demonstrating the concepts presented in this paper and validating their relevance to practice. In the future we plan to extend the concept of metric refinement to complex regions. We will also investigate metric refinements between two complex line objects and between a complex line object and a complex region object.
References Behr, T. & Schneider, M. (2001), Topological Relationships of Complex Points and Complex Regions, in ‘Int. Conf. on Conceptual Modeling’, pp. 56–69. Clementini, E., Di Felice, P. & Oosterom, P. (1993), A Small Set of Formal Topological Relationships Suitable for End-User Interaction, in ‘3rd Int. Symp. on Advances in Spatial Databases’, LNCS 692, pp. 277–295. Cui, Z., Cohn, A. G. & Randell, D. A. (1993), Qualitative and Topological Relationships, in ‘3rd Int. Symp. on Advances in Spatial Databases’, LNCS 692, pp. 296–315. Egenhofer, M. J., Frank, A. & Jackson, J. P. (1989), A Topological Data Model for Spatial Databases, in ‘1st Int. Symp. on the Design and Implementation of Large Spatial Databases’, LNCS 409, Springer-Verlag, pp. 271–286. Egenhofer, M. J. & Shariff, A. R. (1998), ‘Metric Details for Natural-Language Spatial Relations’, ACM Transactions on Information Systems 16(4), 295–321. Goyal, R. & Egenhofer, M. (2004), ‘Cardinal Directions between Extended Spatial Objects’, IEEE Trans. on Knowledge and Data Engineering . In press. Hernandez, D., Clementini, E. C. & Di Felice, P. (1995), Qualitative Distances, in ‘2nd Int. Conf. on Spatial Information Theory’, LNCS 988, Springer-Verlag, pp. 45–57. Peuquet, D. J. & Xiang, Z. C. (1987), ‘An Algorithm to Determine the Directional Relationship between Arbitrarily-Shaped Polygons in the Plane’, Pattern Recognition 20(1), 65–74. Schneider, M. (1997), Spatial Data Types for Database Systems - Finite Resolution Geometry for Geographic Information Systems, Vol. LNCS 1288, SpringerVerlag, Berlin Heidelberg. Schneider, M. (1999), Uncertainty Management for Spatial Data in Databases: Fuzzy Spatial Data Types, in ‘6th Int. Symp. on Advances in Spatial Databases’, LNCS 1651, Springer-Verlag, pp. 330–351. Schneider, M. (2001a), A Design of Topological Predicates for Complex Crisp and Fuzzy Regions, in ‘Int. Conf. on Conceptual Modeling’, pp. 103–116. Schneider, M. (2001b), Fuzzy Topological Predicates, Their Properties, and Their Integration into Query Languages, in ‘ACM Symp. on Geographic Information Systems’, pp. 9–14. Vazirgiannis, M. (2000), Uncertainty Handling in Spatial Relationships, in ‘ACM Symp. for Applied Computing’.
Fuzzy Modeling of Sparse Data
Angelo Marcello Anile¹ and Salvatore Spinella²
¹ Dipartimento di Matematica ed Informatica, Università di Catania, Viale Andrea Doria 6, 90125 Catania, Italy, [email protected]
² Dipartimento di Linguistica, Università della Calabria, Ponte Bucci Cubo 17B, 87036 Arcavacata di Rende, Italy, [email protected]
Abstract In this article we apply fuzzy B-spline reconstruction supplemented by fuzzy kriging to the problem of constructing a smooth deterministic model for environmental pollution data. A method to interrogate the model will also be discussed and applied. Keywords: Uncertain, Fuzzy Number, Fuzzy Interpolation, Fuzzy Queries, Fuzzy Kriging, B-spline, Sparse Data, Environment Pollution.
1 Introduction
Geographical data concerning environment pollution consist of a large set of temporal measurements (representing, e.g. hourly measurements for one year) at a few scattered spatial sites. In this case the temporal data at a given site must be summarized in some form in order to employ it as input to build a spatial model. Summarizing the temporal data (data reduction) will necessarily introduce some form of uncertainty which must be taken into account. Statistical methods reduce the data to some moments of the distribution function, such as means and standard deviations, but these procedures rely on statistical assumptions on the distribution function, which are hard to verify in practice. In the general case, without any special assumption on the distribution function, statistical reduction can grossly misrepresent the data distribution. An alternative way is to represent the data with fuzzy numbers, which has the advantage of keeping the full data content (conservatism) and also of leading to computationally efficient approaches. This method has been employed for ocean floor geographical data by [Patrikalakis et al 1995] (in the interval case) and [Anile 2000] (for fuzzy numbers) and for environmental pollution data by [Anile and Spinella 2004]. Once the temporal data at the given sites have been summarized with fuzzy numbers, it is possible to resort to fuzzy interpolation techniques in order to build a mathematically smooth deterministic surface model representing the spatial distribution of the quantity of interest. An alternative approach would be to employ fuzzy kriging, which would build a stochastic model. However, our aim is to construct a smooth deterministic model, because this could be used for simulation purposes. We shall use fuzzy kriging only to estimate the missing information which is required just outside the domain boundary, as we shall see, in order to build a consistent deterministic model.
2 Fuzzy representation

2.1 Modeling observations
Let O be a sequence of n observational data in a domain X ⊆ R^2 of the form

O = {(x_1, y_1, Z_1), ..., (x_i, y_i, Z_i), ..., (x_n, y_n, Z_n)}   (1)

with (x_i, y_i) ∈ X and

Z_i = {z_{i,1}, ..., z_{i,m_i}}   (2)

where z_{i,j} ∈ R represents the j-th observation at the point (x_i, y_i). By introducing a fuzzy approach [Anile 2000][Lodwick and Santos 2002] that represents the datum Z_i with an appropriately constructed fuzzy number, it is possible to preserve both the data distribution and their quality. Here fuzzy numbers [Kauffman and Gupta 1991] are defined as maps that associate to each presumption level α ∈ [0, 1] a real interval A_α such that

α' > α ⇒ A_{α'} ⊆ A_α   (3)

and the latter property is formally called the convex hull property. By utilizing one of several methods for constructing fuzzy set membership functions [Gallo et al. 1999] from Z_i one can represent the n observational data as

F_O = {(x_1, y_1, z̃_1), ..., (x_i, y_i, z̃_i), ..., (x_n, y_n, z̃_n)}   (4)

where z̃_i ∈ F(R) is the fuzzy number representing the observations at the point (x_i, y_i). For computational purposes fuzzy numbers are represented in terms of a finite discretization of α-levels, which is a natural generalization of intervals; a library for arithmetic operations on them has been implemented in [Anile et al. 1995].
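As a purely illustrative sketch of this α-level representation, the following Python fragment stores a fuzzy number as a finite list of nested α-cuts and builds one from repeated measurements using nested central quantile intervals. The class, the quantile construction and all names are assumptions made for illustration; they do not reproduce the library of [Anile et al. 1995] or the membership-construction methods of [Gallo et al. 1999].

    import numpy as np

    class FuzzyNumber:
        # A fuzzy number stored as a finite discretization of alpha-levels:
        # levels[k] = (alpha_k, lower_k, upper_k), intervals nested as alpha grows.
        def __init__(self, levels):
            self.levels = sorted(levels)                   # sort by alpha
            for (a0, l0, u0), (a1, l1, u1) in zip(self.levels, self.levels[1:]):
                assert l0 <= l1 and u1 <= u0               # convex hull property (3)

        def cut(self, alpha):
            # Interval of the closest stored alpha-level.
            return min(self.levels, key=lambda lvl: abs(lvl[0] - alpha))[1:]

        def __add__(self, other):
            # Interval arithmetic level by level; assumes both numbers share
            # the same alpha grid.
            return FuzzyNumber([(a, l1 + l2, u1 + u2)
                                for (a, l1, u1), (_, l2, u2)
                                in zip(self.levels, other.levels)])

    def fuzzify(samples, alphas=(0.0, 0.5, 1.0)):
        # Summarize repeated observations z_{i,1..m_i} as a fuzzy number whose
        # alpha-cuts are nested central quantile intervals (an assumed construction).
        z = np.asarray(samples, dtype=float)
        levels = []
        for a in alphas:
            lo = np.quantile(z, a / 2.0)
            hi = np.quantile(z, 1.0 - a / 2.0)
            levels.append((a, lo, hi))
        return FuzzyNumber(levels)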
2.2 Fuzzy B-splines
We introduce fuzzy B-splines as follows [Anile 2000]:

Definition 1 (Fuzzy B-spline). A fuzzy B-spline F(t) relative to the knot sequence (t_0, t_1, ..., t_m), m = k + 2(h − 1), is a function of the kind F(t): R → F(R) defined as

F(t) = Σ_{i=0}^{k+h−1} F_i B_{i,h}(t)   (5)

where the control coefficients F_i are fuzzy numbers and B_{i,h}(t) are real B-spline basis functions [DeBoor 1972].

Notice that Definition 1 is consistent with the previous definitions and, more precisely, for any t, F(t) is a fuzzy number, i.e. it verifies the convex hull property (3): α' > α ⇒ F(t)_{α'} ⊆ F(t)_α, because the B-spline basis functions are non-negative. The generalization in 2D of a fuzzy B-spline relative to a rectangular grid of M × N knots is

f(u, v) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} F_{i,j} B_{i,h}(u) B_{j,h}(v)   (6)

with the same properties as described above. Similar considerations in the more general framework of fuzzy interpolation can be found in [Lodwick and Santos 2002]. Here we proceed with the construction of a fuzzy B-spline approximation following the approach already expounded in detail in [Anile and Spinella 2004].
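The following Python sketch shows how a fuzzy B-spline surface of the form (6) could be evaluated at one α-level: because the basis products are non-negative, the lower and upper surfaces are obtained by applying the same real basis functions to the lower and upper endpoints of the fuzzy control coefficients. The Cox-de Boor recursion and all function names are illustrative assumptions, not the authors' implementation.

    def bspline_basis(i, h, t, x):
        # Cox-de Boor recursion: i-th B-spline basis function of order h
        # (degree h-1) over the knot vector t, evaluated at x.
        if h == 1:
            return 1.0 if t[i] <= x < t[i + 1] else 0.0
        left = 0.0
        if t[i + h - 1] != t[i]:
            left = (x - t[i]) / (t[i + h - 1] - t[i]) * bspline_basis(i, h - 1, t, x)
        right = 0.0
        if t[i + h] != t[i + 1]:
            right = (t[i + h] - x) / (t[i + h] - t[i + 1]) * bspline_basis(i + 1, h - 1, t, x)
        return left + right

    def fuzzy_surface_cut(u, v, F_lo, F_hi, tu, tv, h):
        # Lower and upper surface of the fuzzy B-spline (6) at (u, v) for one
        # alpha-level.  F_lo, F_hi are M x N arrays holding the alpha-cut
        # endpoints of the fuzzy control coefficients F_{i,j}; tu, tv are knots.
        M, N = len(F_lo), len(F_lo[0])
        lo = hi = 0.0
        for i in range(M):
            Bu = bspline_basis(i, h, tu, u)
            if Bu == 0.0:
                continue
            for j in range(N):
                w = Bu * bspline_basis(j, h, tv, v)   # non-negative weight
                lo += F_lo[i][j] * w                  # endpoints map to endpoints
                hi += F_hi[i][j] * w
        return lo, hi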
2.3 Constructing fuzzy B-spline surfaces
Let us consider a sequence of fuzzy numbers representing the observations (4). If a fuzzy B-spline F(u, v) on a rectangular grid G ⊇ X of M × N knots approximates F_O (4) then, for all α ∈ [0, 1],

[z̃_i]_α ⊆ [F(x_i, y_i)]_α,   i = 1...n   (7)

and furthermore one must also have

∫∫_G Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} ([F^u_{i,j}]_α − [F^l_{i,j}]_α) B_{i,h}(u) B_{j,h}(v) du dv ≤ ∫∫_G Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} ([Y^u_{i,j}]_α − [Y^l_{i,j}]_α) B_{i,h}(u) B_{j,h}(v) du dv
∀{Y_{i,j}}_{i=0...M−1, j=0...N−1} ∈ F(R),   ∀α ∈ [0, 1]   (8)

where [F^l]_α and [F^u]_α indicate respectively the lower and upper bound of the interval representing the fuzzy number α-level. More precisely, for each presumption α-level, the volume encompassed by the upper and lower surface of the fuzzy B-spline is the smallest, which corresponds to minimizing the uncertainty propagation. These definitions are the generalization to the fuzzy case of the corresponding interval ones [Patrikalakis et al 1995]. Notice that the integral over a rectangular domain of a real B-spline is a linear expression:
∫∫_D Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} P_{i,j} B_{i,h}(u) B_{j,h}(v) du dv = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} P_{i,j} (t_{i+h} − t_i)(s_{j+h} − s_j) / h^2   (9)

where obviously {(t_i, s_j)}_{i=0...M+h−1, j=0...N+h−1} are the grid knots. Therefore, given the set F_O of observations in (4) and a finite number P + 1 of presumption levels α_0 > α_1 > ... > α_P, the construction of a fuzzy B-spline requires the solution of the following constrained optimization problem:

min Σ_{k=0}^{P} Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} ([F^u_{i,j}]_{α_k} − [F^l_{i,j}]_{α_k}) (t_{i+h} − t_i)(s_{j+h} − s_j) / h^2

subject to

[F^l_{i,j}]_{α_0} ≤ [F^l_{i,j}]_{α_1} ≤ ... ≤ [F^l_{i,j}]_{α_P} ≤ [F^u_{i,j}]_{α_P} ≤ ... ≤ [F^u_{i,j}]_{α_1} ≤ [F^u_{i,j}]_{α_0},   i = 0...M−1, j = 0...N−1

Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} [F^u_{i,j}]_{α_k} B_{i,h}(x_r) B_{j,h}(y_r) ≥ [z̃_r]^u_{α_k},   r = 1,...,n,  k = 0,...,P

Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} [F^l_{i,j}]_{α_k} B_{i,h}(x_r) B_{j,h}(y_r) ≤ [z̃_r]^l_{α_k},   r = 1,...,n,  k = 0,...,P
(10)

For problem (10) one notices that the objective function minimizes the uncertainty of the representation.
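Since the objective (9) and all constraints in (10) are linear in the unknown α-cut endpoints of the control coefficients, (10) can be posed as a linear program. The Python sketch below sets this up for a single α-level with scipy.optimize.linprog; it reuses the bspline_basis helper sketched above, takes the number of basis functions as len(knots) − h (an assumed convention), and is an illustrative formulation under those assumptions rather than the authors' solver.

    import numpy as np
    from scipy.optimize import linprog

    def fit_fuzzy_bspline_level(pts, z_lo, z_hi, tu, tv, h):
        # Solve (10) for one alpha-level: find alpha-cut endpoints [F_lo, F_hi]
        # of the fuzzy control coefficients minimizing the enclosed volume (9)
        # while covering the data intervals [z_lo, z_hi] at the points pts.
        M = len(tu) - h
        N = len(tv) - h
        nv = M * N                      # unknowns per bound; x = [F_lo, F_hi]
        w = np.array([(tu[i + h] - tu[i]) * (tv[j + h] - tv[j]) / h**2
                      for i in range(M) for j in range(N)])
        c = np.concatenate([-w, w])     # minimize sum (F_hi - F_lo) * weight
        A, b = [], []
        for (x, y), zl, zh in zip(pts, z_lo, z_hi):
            B = np.array([bspline_basis(i, h, tu, x) * bspline_basis(j, h, tv, y)
                          for i in range(M) for j in range(N)])
            A.append(np.concatenate([B, np.zeros(nv)])); b.append(zl)    # lower fit
            A.append(np.concatenate([np.zeros(nv), -B])); b.append(-zh)  # upper fit
        for k in range(nv):                                              # F_lo <= F_hi
            row = np.zeros(2 * nv); row[k] = 1.0; row[nv + k] = -1.0
            A.append(row); b.append(0.0)
        res = linprog(c, A_ub=np.array(A), b_ub=np.array(b),
                      bounds=[(None, None)] * (2 * nv), method="highs")
        F = res.x.reshape(2, M, N)
        return F[0], F[1]               # lower and upper control coefficients

Stacking the P + 1 levels with the nesting constraints of (10) into one larger linear program follows the same pattern.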
3 Fuzzy Kriging and boundary condition
In the previous section we expounded the method for constructing a fuzzy surface approximating, in a well defined way, the fuzzy numbers representing the data sets. The quality of this approximation will deteriorate the farther one is from the sites of the data sets. Therefore one expects the constructed approximation not to be very satisfactory at the border of the domain within which the data sites are comprised. To remedy such drawbacks one has to introduce further information regarding the decay of the quantity of interest away from the domain. A simple approach would be to assume that the quantity of interest decays to zero outside the boundary, but this is hardly justifiable. A better treatment is to construct fictitious observation data just outside the boundary by utilizing statistical kriging. The latter approach is more realistic because, in some sense, it amounts to an extrapolation driven by the data. For ordinary spatially distributed data, and under a stationarity hypothesis on the distribution, kriging combines the available data by weights in order to construct an unbiased estimator with minimum variance. This approach is extended to the fuzzy case in [Diamond 1989]. Given N spatially distributed fuzzy data {(x_1, y_1, z̃_1), ..., (x_i, y_i, z̃_i), ..., (x_N, y_N, z̃_N)} one looks for an estimator constructed as a linear combination with weights λ_i to evaluate the distribution at the point (x, y):

Z̃* = Σ_{i=1}^{N} λ_i z̃_i   (11)

Notice that the above interprets the fuzzy data z̃_i and the estimator Z̃* as random fuzzy numbers. In order to construct this estimator the following hypotheses must be satisfied:

(K1) E(Z̃*) = E(Z̃) = E(Z̃(x + h, y + k)) = r
(K2) E d(Z̃*, Z̃)^2 must be minimized
(K3) λ_i ≥ 0, i = 1..N
(12)
The first condition (K1) implies Σ_{i=1}^{N} λ_i = 1, while the last one (K3) guarantees that the estimator stays in the cone of random fuzzy numbers generated by the data. It can be proved that

E d(Z̃*, Z̃)^2 = Σ_{i,j=1}^{N} λ_i λ_j Σ_α C_α(x_i, y_i, x_j, y_j) − 2 Σ_{i=1}^{N} λ_i Σ_α C_α(x_i, y_i, x, y) + Σ_α C_α(x, y, x, y)   (13)

where C_α is a positive definite function that represents a covariance. The minimization of this function, together with the hypotheses (K1) and (K3), leads to a constrained minimization problem. It is solvable by formulating everything in terms of Kuhn-Tucker conditions, through the following theorem.

Theorem 1. Let Z̃* be an estimator for Z̃ of the form Z̃* = Σ_{i=1}^{N} λ_i z̃_i. Suppose the conditions (K1) and (K3) are satisfied and the matrix defined by Γ_ij = Σ_α C_α(x_i, y_i, x_j, y_j), i, j = 1..N, is strictly positive definite. Then there exists a unique linear unbiased estimator Z̃* satisfying the (K2) condition. Moreover the weights satisfy the following system

Σ_{i=1}^{N} Γ_ij λ_i − L_j − µ = Σ_α C_α(x_j, y_j, x, y),   j = 1..N
Σ_{i=1}^{N} λ_i = 1
Σ_{i=1}^{N} L_i λ_i = 0
L_i, λ_i ≥ 0,   i = 1..N
(14)

The residual is

σ^2 = Σ_α C_α(x, y, x, y) + µ − Σ_{i=1}^{N} λ_i Σ_α C_α(x_i, y_i, x, y)   (15)

The above problem defines the weights λ_i, and it can be solved using a method to handle the constraints in (14) such as the Active Set Method [Fletcher 1987].
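The following Python sketch illustrates this constrained minimization. It assumes the summed covariances Γ_ij = Σ_α C_α(x_i, y_i, x_j, y_j) and c_i = Σ_α C_α(x_i, y_i, x, y) have already been evaluated, and it uses SciPy's SLSQP solver in place of the Active Set Method of [Fletcher 1987]; the function name and interface are assumptions made for illustration.

    import numpy as np
    from scipy.optimize import minimize

    def fuzzy_kriging_weights(Gamma, c):
        # Find non-negative weights lambda summing to one that minimize
        #     lambda' Gamma lambda - 2 lambda' c,
        # which differs from (13) only by the constant term C_alpha(x, y, x, y)
        # and therefore has the same minimizer.
        n = len(c)
        obj = lambda lam: lam @ Gamma @ lam - 2.0 * lam @ c
        grad = lambda lam: 2.0 * Gamma @ lam - 2.0 * c
        cons = [{"type": "eq", "fun": lambda lam: np.sum(lam) - 1.0}]   # (K1)
        bounds = [(0.0, None)] * n                                      # (K3)
        res = minimize(obj, np.full(n, 1.0 / n), jac=grad,
                       method="SLSQP", bounds=bounds, constraints=cons)
        return res.x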
4 Fuzzy B-spline Interrogation

4.1 Inequality relationship between fuzzy numbers
In order to represent the inequality relationship it is convenient to introduce the definition of overtaking between fuzzy numbers. The concept of overtaking between fuzzy numbers was introduced in [Anile and Spinella 2004]. We start with overtaking between intervals.

Definition 2 (Overtaking between intervals). The overtaking of interval A with respect to interval B is the real function σ: I(R)^2 → R defined as:

σ(A, B) = 0 if A^u ≤ B^l
σ(A, B) = (A^u − B^l) / width(A) if A^u > B^l ∧ A^l ≤ B^l
σ(A, B) = 1 if A^l > B^l
(16)

where width(A) is the width of interval A. Figure 1 clarifies the above definition. The overtaking of A with respect to B is 0, that of B with respect to C is 3/5, while that of D with respect to C is 1. From this definition one can define the δ-overtaking operator as follows.

Fig. 1. Interval A overtakes B by 0, B overtakes C by 3/5 and finally D overtakes C by 1
Definition 3 (δ-overtaking operator between intervals). Given two intervals A, B and a real number δ ∈ [0, 1], A overtakes B by δ if σ(A, B) ≥ δ, i.e.

A ≥_δ B ⇐⇒ σ(A, B) ≥ δ   (17)

Likewise we can define the overtaking between fuzzy numbers as follows.

Definition 4 (Overtaking between fuzzy numbers). One defines the overtaking of the fuzzy number Ã with respect to the fuzzy number B̃ as the real function σ: F(R)^2 → R defined as

σ(Ã, B̃) = ∫_0^1 σ([Ã]_α, [B̃]_α) w(α) dα   (18)

where w: [0, 1] → R is an integrable weight function.

Fig. 2. The fuzzy number A overtakes B by 0, B overtakes C by about 0.61, finally D overtakes C by 1

Figure 2 clarifies the above definition. The overtaking of A with respect to B is 0, that of B with respect to C is about 0.61, while that of D with respect to C is 1. Likewise one can define the operator of δ-overtaking between fuzzy numbers as follows.

Definition 5 (δ-overtaking operator between fuzzy numbers). Given Ã, B̃ ∈ F(R) and a real number δ ∈ [0, 1], Ã overtakes B̃ by δ if σ(Ã, B̃) ≥ δ, i.e.

Ã ≥_δ B̃ ⇐⇒ σ(Ã, B̃) ≥ δ   (19)
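Definitions 2-5 translate directly into code. The sketch below implements the interval overtaking (16) and approximates the fuzzy overtaking (18) by a trapezoidal rule over a finite set of stored α-cuts; the names and the uniform default weight w(α) = 1 are illustrative assumptions.

    def interval_overtaking(A, B):
        # Overtaking sigma(A, B) of interval A = (a_l, a_u) with respect to
        # B = (b_l, b_u), following definition (16).
        a_l, a_u = A
        b_l, b_u = B
        if a_u <= b_l:
            return 0.0
        if a_l > b_l:
            return 1.0
        return (a_u - b_l) / (a_u - a_l)      # width(A) = a_u - a_l

    def fuzzy_overtaking(A_cuts, B_cuts, w=lambda a: 1.0):
        # Overtaking of fuzzy number A with respect to B, definition (18),
        # approximated by the trapezoidal rule over a shared, increasing list
        # of alpha-levels.  A_cuts, B_cuts: lists of (alpha, lower, upper).
        vals = [w(a) * interval_overtaking((la, ua), (lb, ub))
                for (a, la, ua), (_, lb, ub) in zip(A_cuts, B_cuts)]
        alphas = [a for a, _, _ in A_cuts]
        total = 0.0
        for k in range(1, len(alphas)):
            total += 0.5 * (vals[k] + vals[k - 1]) * (alphas[k] - alphas[k - 1])
        return total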
4.2 Query by sampling
By utilizing the definitions of overtaking previously given, it is possible to phrase the fuzzy or interval B-spline surface interrogation as a global search in the domain G where the fuzzy or interval B-spline surface is defined. For instance, interrogating the fuzzy B-spline surface in order to find the regions X_δ ⊆ G where F(x, y) overtakes z̃ by δ amounts to

X_δ = {(x, y) | F(x, y) ≥_δ z̃} = {(x, y) | σ(F(x, y), z̃) ≥ δ}   (20)

Let us consider now a global search algorithm that divides the given domain G into rectangles R_p and samples at (x_p, y_p) ∈ R_p, considering as the objective function to maximize the overtaking s_p = σ(F(x_p, y_p), z̃). By iterating until a stop criterion is satisfied one could then, with the parameter ε, divide G into the two sets:
≥ 30) are obtained for each distance class. Through the pairs (h_j, γ̂(h_j)) a parametric model is fitted. Common models are the exponential model γ_e(h) = θ_1 + θ_2·(1 − exp{−h/θ_3}) and the spherical model γ_s(h) = θ_1 + θ_2·(3/2·h/θ_3 − 1/2·(h/θ_3)^3). These models apply for h > 0, whereas γ_e(0) = γ_s(0) = 0 and γ_s(h) = θ_1 + θ_2 for h > θ_3. Both models depend upon a vector Θ of k = 3 parameters. In this study this will be assumed to belong to a fuzzy set {Θ}, in contrast to a crisp set as commonly in ordinary kriging. The set {Θ} is assumed to be compact. Kriging is equivalent to predicting values at an unvisited location. Let the vector γ contain evaluations of the variogram from the observation locations to the prediction location, and let the matrix Γ contain those among the observation locations. The kriging equations equal

t = β̂ + γ'Γ^{-1}(z − β̂·1_n) = K[V, Θ, {x_i}_{i=1,...,n}, {z(x_i)}_{i=1,...,n}]   (1)

where 1_n is the vector of n elements, all equal to 1, z = (z(x_1),…,z(x_n))' is the column vector containing the n observations, and β̂ = G·1_n'Γ^{-1}z is the spatial mean, with G = (1_n'Γ^{-1}·1_n)^{-1}. The symbol K is used to emphasize that kriging is an operator on the support, the parameters, the configuration of observation points and the observations themselves. As β̂ is linear in the observations z(x_i), the kriging predictor is also linear in the observations. In addition, the minimum prediction error variance, the so-called kriging variance, can be expressed as an operator U on the support set V, the set Θ of variogram parameters and the configuration set of observation locations x_i, i = 1,…,n, but not on the observations themselves, as

Var(t − z(x_0)) = γ'Γ^{-1}γ − x_a'·G·x_a = U[V, Θ, {x_i}_{i=1,...,n}]   (2)

where x_a = 1 − 1_n'Γ^{-1}γ. The kriging variance does not contain the observations, but relies on C, being the configuration of data points and prediction location, and on the variogram. Both the data and the variogram are commonly assumed to be without errors.
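For a crisp parameter vector θ, the operators K and U of equations (1) and (2) can be sketched in a few lines of Python. The snippet below uses the exponential variogram model defined above and follows the formulas literally; it is an illustrative sketch (matrix inversion, point support and function names are assumptions), not the implementation used in this study.

    import numpy as np

    def gamma_exp(h, theta):
        # Exponential variogram; theta = (nugget, sill, range) = (theta1, theta2, theta3).
        t1, t2, t3 = theta
        return np.where(h > 0.0, t1 + t2 * (1.0 - np.exp(-h / t3)), 0.0)

    def kriging_K_U(x0, X, z, theta, variogram=gamma_exp):
        # Ordinary kriging prediction K and kriging variance U, equations (1)-(2),
        # for one prediction location x0, observation locations X (n x 2) and data z.
        n = len(z)
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
        Gamma = variogram(D, theta)                                 # matrix of gamma_ij
        g = variogram(np.linalg.norm(X - x0, axis=-1), theta)       # gamma to x0
        Gi = np.linalg.inv(Gamma)
        ones = np.ones(n)
        G = 1.0 / (ones @ Gi @ ones)                                # (1_n' Gamma^-1 1_n)^-1
        beta = G * (ones @ Gi @ z)                                  # spatial mean estimate
        t = beta + g @ Gi @ (z - beta * ones)                       # prediction, eq. (1)
        xa = 1.0 - ones @ Gi @ g
        var = g @ Gi @ g - xa * G * xa                              # kriging variance, eq. (2)
        return t, var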
2.2 Fuzzy kriging
Fuzzy kriging is an extension of equations (1) and (2) as it takes uncertainty in the variogram parameters into account. To define fuzzy kriging, we need the extension principle. Consider the set of fuzzy sets on Θ, denoted as F(Θ). A function g: Θ^p → T is extended to a function ĝ: F(Θ)^p → F(T) by defining, for all fuzzy sets Θ_1,…,Θ_p from F(Θ)^p, a set B = ĝ(Θ_1,…,Θ_p) with

µ_B(t) = sup_{θ_1,…,θ_p : t = g(θ_1,…,θ_p)} min{µ_{Θ_1}(θ_1), …, µ_{Θ_p}(θ_p)}

for all t ∈ T. Using the extension principle for the kriging operator K, the membership value for any real number t resulting from the kriging equation (1) is:

µ_K(t) = sup{µ_Θ(θ); t = K[V, θ, {x_i}_{i=1,...,n}, {z(x_i)}_{i=1,...,n}]},
and µ_K(t) = 0 if the above set is empty.   (3)
Here, µ_Θ(θ) is the membership function for the fuzzy subset of variogram parameters. This equation defines a membership function on the set of real numbers. The fuzzy number corresponding to this membership function is the fuzzy kriging prediction. Use of the extension principle with the operator U results in an estimation variance expressed as a fuzzy number. Similarly, the membership value of any real number s^2 for the estimation variance is defined as:

µ_U(s^2) = sup{µ_Θ(θ); s^2 = U[V, θ, {x_i}_{i=1,...,n}]},
and µ_U(s^2) = 0 if the above set is empty.   (4)
The above fuzzy set is a fuzzy number. In this way the uncertainty (imprecision) of the variogram model is transferred to the kriging estimate and estimation variance.

2.3 Implementation
At each location, fuzzy kriging results in fuzzy numbers for the kriging estimate and estimation variance. However, the definition of the membership functions (eqs. (3) and (4)) as suprema is not convenient for computational purposes because it would require, for each possible real number t, an optimization algorithm to retrieve the membership value µ. To simplify calculations, instead of assigning a membership value to every specific value t, a value t could be assigned to each specific membership value µ. If the fuzzy set Θ is connected, this can be done by using the fact that level sets of fuzzy numbers are intervals. For fuzzy kriging, the endpoints of these intervals can be determined with the help of two optimization problems; namely, for any selected membership level 0 < t < 1 find:

min_θ K[V, θ, {x_i}_{i=1,...,n}, {z(x_i)}_{i=1,...,n}]

subject to the constraint µ_Θ(θ) ≥ t, and find:

max_θ K[V, θ, {x_i}_{i=1,...,n}, {z(x_i)}_{i=1,...,n}]

subject to the same constraint. Results of these optimizations provide end points of intervals R_t. These optimizations for a selected finite set of t are sufficient because, by virtue of convexity, they provide greater and lesser bounds for membership of intermediate values. The particular shape of the membership function can be taken into account by repeatedly changing the support of fuzziness for each prediction, yielding a full membership function in the end. The same is done to calculate fuzzy estimation variances.
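In code, the two optimizations can be carried out with any constrained optimizer. The sketch below does so with SciPy's L-BFGS-B, exploiting the fact that for independent triangular parameter memberships (as used in section 4 below) the constraint µ_Θ(θ) ≥ t reduces to simple box bounds on θ; it reuses the kriging_K_U sketch given earlier, and all names and interfaces are assumptions made for illustration.

    import numpy as np
    from scipy.optimize import minimize

    def fuzzy_kriging_interval(x0, X, z, theta_bounds, theta0):
        # End points of the level set R_t of the fuzzy kriging prediction at x0.
        # theta_bounds: one (lower, upper) pair per variogram parameter, i.e. the
        # level-t cut of each (assumed independent, triangular) membership function.
        K = lambda th: kriging_K_U(x0, X, z, th)[0]          # operator K of eq. (1)
        lo = minimize(lambda th: K(th), np.asarray(theta0, float),
                      bounds=theta_bounds, method="L-BFGS-B")
        hi = minimize(lambda th: -K(th), np.asarray(theta0, float),
                      bounds=theta_bounds, method="L-BFGS-B")
        return lo.fun, -hi.fun                               # [min K, max K] over the cut

Repeating this for a finite set of levels t, and analogously for the variance operator U (index [1] of kriging_K_U), yields the full fuzzy prediction and fuzzy estimation variance described above.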
3 Study area and data used
The methodology is applied to a dataset from the island of Java (fig. 1), containing modelled methane data (Van Bodegom et al., 2001, 2002). The isle of Java is approximately 280 km wide and more than 1000 km long. It is the most densely populated island of Indonesia. The island is traversed from east to west by a volcanic mountain chain, but more than 60% of the 130,000 km2 island has been cultivated. In general the soil is physically and chemically fertile and alluvial. The samples used in this study were collected within the rice growing areas of Java. Methane emission is measured at a plot scale of 1 m2 using a closed chamber technique. Soil samples are analyzed by the Center for Soil and Agro-climate Research (CSAR) in Bogor, Indonesia. Soil organic carbon was estimated from soil profile data in the world inventory of soil emission potentials database, which was directly available at CSAR. Methane (CH4) is an important greenhouse gas and rice paddy fields are among the most important sources of atmospheric methane, with these methane emissions accounting for 15-20% of the radiative forcing added to the atmosphere (Houghton et al., 1996). Precise estimates of global methane emissions from rice paddies are, however, not available and depend on the approaches, techniques and databases used. One principal cause for uncertainties in global estimates results from the large, intrinsic spatial and temporal variability in methane emissions. Methane emissions from rice fields are strongly influenced by the presence of the rice plants. Methane emissions are higher with rice plants than without (Holzapfel-Pschorn and Seiler, 1986), and methane emissions are dominated by plant-mediated emissions (Schütz et al., 1989). The rice crop influences the processes underlying methane emissions via its roots.
Figure 1: Map of Java Island showing the distribution of the modeled methane emissions throughout the paddy fields.
4 Results

4.1 Descriptive statistics
Statistics of the variables of the data set are shown in Table 1. Mean (47.8) and median (44.8) are relatively close to each other, and the standard deviation (26.6) is approximately equal to half the value of the mean. Such a coefficient of variation shows that the distribution may be skewed, in this case to the right. In fact, high CH4 emission values (maximum value equals 192.3) occur.

                              minimum   maximum   mean   median   variance
x (m)                          149066   1085455
y (m)                           71832    351632
Methane (g m-2 season-1)         2.22       192   47.8     44.8        707

Table 1: Descriptive statistics for the study area.
Empirical variograms were constructed and both an exponential and a spherical model were fitted. The exponential model gave the best fit, yielding parameter values equal to Θ̂ = (θ̂_1, θ̂_2, θ̂_3)' = (37.7, 296.6, 350.0)', where θ̂_1 corresponds with the nugget, θ̂_2 with the sill and θ̂_3 with the range. From the variogram fit we notice that the estimated parameters are relatively uncertain, and we interpreted these as fuzzy numbers. In the subsequent analysis we used a fixed value for the nugget, we allowed a spread of 10 km in the range parameter and a spread of 25 g2 m-4 season-2 in the sill, leading to a nugget θ̂_1 equal to (350.0), a fuzzy value for θ̂_2 equal to (27.7, 37.7, 47.7) and a fuzzy value for θ̂_3 equal to (271.6, 296.6, 321.6). As
concerns notation, fuzzy numbers y are denoted as triples (y1, y2, y3), where the membership function µ(x) equals 1 at y2, and equals 0 for x < y1 as well as for x > y3. A triangular membership function is applied, i.e.

µ(x) = 0 if x < y1
µ(x) = (x − y1)/(y2 − y1) if y1 ≤ x < y2
µ(x) = (y3 − x)/(y3 − y2) if y2 ≤ x < y3
µ(x) = 0 if y3 ≤ x
(5)
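For such triangular fuzzy numbers, both the membership function (5) and the level sets needed in the optimizations of section 2.3 take a closed form, as in this small illustrative Python sketch (the function names are assumptions):

    def triangular_membership(x, y1, y2, y3):
        # Membership function (5) of the triangular fuzzy number (y1, y2, y3).
        if x < y1 or x >= y3:
            return 0.0
        if x < y2:
            return (x - y1) / (y2 - y1)
        return (y3 - x) / (y3 - y2)

    def triangular_cut(level, y1, y2, y3):
        # Level set (interval) of the triangular fuzzy number at a given level,
        # usable as a box bound in the constrained optimizations above.
        return (y1 + level * (y2 - y1), y3 - level * (y3 - y2))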
4.2 Fuzzy kriging
Fuzzy kriging is applied in two steps. First, a fuzzy prediction is made at an arbitrary point, in this case the point with coordinates (664,200). Fuzzy variograms were applied (fig. 2). Both the fuzzy prediction and the fuzzy standard deviation (fig. 3) were calculated on the basis of occurring values from the fuzzy variograms, using equations (1) and (2). The kriging prediction shows a clear maximum well within the space governed by the uncertainties in the parameters θ2 and θ3, i.e. with estimated values equal to θ̂2 = 321.6 and θ̂3 = 46.286. The minimum value, and both the minimum and maximum values for the kriging standard deviation, occur at the edges of the θ2 × θ3 space.
Figure 2: Fuzzy variograms
Next, maps were created showing the fuzzy kriged prediction and the fuzzy kriging standard deviation (figs. 4, 5). The kriged values remain relatively stable under the fuzziness in the variogram values, as the three maps showing the endpoints of the fuzzy intervals are almost identical. The kriging standard deviation, however, is more sensitive to changes in the variogram parameters. We notice that the lowest map, showing the left points of the fuzzy intervals, has a much lighter color than the upper map, showing the right points of the fuzzy intervals. This yields an average (fuzzy) value on the map equal to (23.20,24.16,24.34).
Using fuzzy variograms, it was possible in the case of methane to obtain kriging standard deviations in the range of 12 to 26.45, as compared with ordinary kriging standard deviations of 12 to 33.11. Fuzzy kriging was also performed for carbon and iron. In the case of carbon the fuzzy kriging standard deviations were found in the range of 0.33 to 0.37, as compared to the ordinary kriging standard deviations of 0.4 to 0.69. In the case of iron the fuzzy kriging standard deviations were found in the range of 0.44 to 0.5, as compared to the ordinary kriging standard deviations of 0.65 to 1.02. These results show that, by means of fuzzy variograms, it was possible to obtain lower kriging standard deviations than with ordinary (crisp) kriging.
Figure 3: Fuzzy kriging prediction (left) and fuzzy kriging standard deviation at the single location with coordinates (664,200).
5 Discussion
Fuzzy set theory can be used to account for imprecise knowledge of the variogram parameters in kriging. This type of problem can also be solved using a Bayesian or a probabilistic approach. Each approach has its advantages and limitations. The problem with a Bayesian approach is that a prior distribution has to be selected. Also, a Bayesian approach requires extensive calculations. The fuzzy set approach has a similar difficulty in the selection of the membership functions, but only simple dependence-independence assumptions are necessary, and computations are relatively simple. In this research the approach of fuzzy methods within geostatistical modeling has been proposed to handle the uncertainty.
Figure 4: Fuzzy kriging of CH4 values using a fuzzy variogram - maximum value (top), predicted value (middle) and minimum value (bottom).
Several advantages exist in using fuzzy input data. An example is the possibility to incorporate expert knowledge, to be used for the definition of fuzzy numbers, at places where exact measurements are rare, in order to reduce the kriging variance. As the kriging variance decreases, fuzziness emerges in the results. Such a result presents more information, because vague information is taken into account that could not be used by conventional methods.
Figure 5: Fuzzy kriging standard deviations for the CH4 predictions using a fuzzy variogram - maximum value (top), predicted value (middle) and minimum value (bottom).
Data often incorporate fuzziness, such as measurement tolerance, that can be expressed as fuzzy numbers. It can be of great advantage to know the result tolerances when the results have to be judged. There is necessarily some uncertainty about the attribute value at an unsampled location. The traditional approach for modeling local uncertainty consists of computing a kriging estimate and associated error variance. A more rigorous approach is to assess first the uncertainty about the unknown, then deduce an optimal estimate, e.g. using an indicator approach that provides not only an estimate but also the probability to exceed critical values, such as regulatory thresholds in soil pollution or criteria for soil quality. One vision for uncertainty management in GIS is the application of 'intelligent' systems. Burrough, 1991, suggests that such systems could help decision makers evaluate the consequences of employing different combinations of data, technology, processes and products, to gain an estimate of the uncertainty expected in their analyses before they start. Elmes et al., 1992, investigated the use of a data quality module in a decision support system to advise on management of forest pest infestations. Agumya and Hunter, 1996, believe the uncertainty debate must now advance from its present emphasis on the effect of uncertainty in the information, to considering the effect of uncertainty on the decisions. Just as uncertainty is an important consideration in spatial decision-making, in many different applications the temporal aspects of spatial data are often a vital component. To improve the quality of decisions in such scenarios, information systems need to provide better support for spatial and temporal data handling. This part of intelligent systems is still lacking, though. The above-proposed approach should be of direct use for decision making. An important element in this regard is to have a good interface to represent the outputs of the analysis in a proper form so that it can be used by the decision makers.
6 Conclusions
Fuzzy kriging provides interpolation results in the form of fuzzy numbers. As such it contributes to further modelling of attribute uncertainty in spatial data. In our study, it was applied to modelling of methane emissions at the Isle of Java, where fuzziness in the variogram parameters could be incorporated. For these data, the exponential model provides the best fit for the variogram of methane, giving parameter values for nugget, sill and range. Fuzzy kriging provides an interesting way of both calculating and displaying the interpolation results as maps. Validation of the results shows that ordinary kriging gives kriging standard deviations equal to 12 to 33.11, whereas fuzzy variograms reduced these to kriging standard deviations of 12 to 26.45.
Acknowledgement
We are thankful to Prof. Peter van Bodegom for providing the data set for this work, and also to Dr. Theo Bouloucous, from ITC, The Netherlands, and to Dr. P.S. Roy and Mr. P.L.N. Raju, from IIRS, India, for their support during the research work.
References Agumya, A. and Hunter, G.J., 1996, Assessing Fitness for Use of Spatial Information: Information Utilisation and Decision Uncertainty. Proceedings of the GIS/LIS '96 Conference, Denier, Colorado, pp. 349-360 Bardossy, A., Bogardi, I. and Kelly, W.E., 1990a, Kriging with Imprecise (Fuzzy) Variograms I: Theory. Mathematical Geology 22, 63-79 Bardossy, A., Bogardi, I., and Kelly, W.E., 1990b, Kriging with Imprecise (Fuzzy) Variograms II: Application, Mathematical Geology 22, 81-94 Bezdek, J.C., 1981, Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York. Burrough P.A., 1986. Principles of geographical information systems for land resources assessment. Clarendon press, Oxford. Burrough, P.A., 1991, The Development of Intelligent Geographical Information Systems. Proceedings of the 2nd European Conference on GIS (EGIS '91), Brussels, Belgium, vol. 1, pp. 165-174 Burrough, P.A., 2001, GIS and geostatistics: essential partners for spatial analysis. Environmental and ecological statistics 8, 361-378 Chilès, J.P., and Delfiner, P. 1999. Geostatistics: modelling spatial uncertainty. John Wiley & Sons, New York. Diamond, P., and Kloeden, P. 1989. Characterization of compact subsets of fuzzy sets. Fuzzy Sets and Systems 29, 341-348 Elmes, G.A. and Cai, C., 1992, Data Quality Issues in User Interface Design for a Knowledge-Based Decision Support System. Proceedings of the 5th International Symposium on Spatial Data Handling, Charleston, South Carolina, vol. 1, pp. 303-312 Goodchild, M. and Jeansoulin, R. 1998. Data quality in geographic information. Hermes, Paris. Guptill S.C. and Morrison, J.L. (1995). Elements of Spatial Data Quality. Elsevier Science Ltd, Exeter, UK. Heuvelink, G.M.H., 1998. Error Propagation in Environmental modeling with GIS. Taylor Francis, London Holzapfel-Pschorn, A. and Seiler, W. 1986. Methane emission during a cultivation period from an Italian rice paddy. Journal of Geophysical Research 91, 11804-14
Houghton, J.T., Meira Filho, L.G., Calander, B.A., Harris, N., Kattenberg, A. and Marskell, K. 1996. Climate change 1995. The science of climate change. Cambridge University Press, Cambridge. Klir G.J. and Folger, T.A., 1988. Fuzzy sets, uncertainty and Information. Prentice Hall, New Jersey. NCGIA (1989) The research plan of the National Center for Geographic Information and Analysis. International Journal of Geographical Information Systems 3 117-136 Schütz, H., Seiler, W. and Conrad, R., 1990. Influence of soil temperature on methane emission from rice paddy fields. Biogeochemistry 11, 77-95 Stein, A. and Van der Meer, F. 2001. Statistical sensing of the environment, International Journal of Applied Earth Observation and Geoinformation 3, 111113 Van Bodegom, P.M.; R. Wassmann; and T.M. Metra-Corton, 2001, A processbased model for methane emission predictions from flooded rice paddies, Global biogeochemical cycles 15, 247-264 Van Bodegom, P.M, Verburg, P.H., and Denier van der Gon, H.A.C. 2002. Upscaling methane emissions from rice paddies: problems and possibilities, Global biogeochemical cycles 16, 1-20 Zadeh.L., 1965, Fuzzy Sets. Information Control 8, 338-353
A Visualization Environment for the Space-Time-Cube
Menno-Jan Kraak1 and Alexandra Koussoulakou2
1 ITC, Department of Geo-Information Processing, PO Box 6, 7500 AA Enschede, The Netherlands, [email protected]
2 Aristotle University of Thessaloniki, Department of Cadastre, Photogrammetry and Cartography, 541 24 Thessaloniki, Greece, [email protected]
(for extra colour illustrations: http://www.itc.nl/personal/kraak/sdh04 )
Abstract At the end of the sixties Hägerstrand introduced a space-time model which included features such as a Space-Time-Path and a Space-Time-Prism. From a visualization perspective the Space-Time-Cube was the most prominent element in Hägerstrand's approach. However, when the concept was introduced the options to create the graphics were limited to manual methods and the user could only experience the single view created by the draftsperson. Today's software has options to automatically create the cube and its contents from a database. Data acquisition of space-time paths for both individuals and groups is also made easier using GPS. The user's viewing environment is, by default, interactive and allows one to view the cube from any direction. In this paper the visualization environment is proposed in a geovisualization context. Keywords: Space-Time-Cube, Geovisualization, Time
1 Introduction Our dynamic geo-community currently witnesses a trend which demonstrates an increased need for personal geo-data. This need for geo-data is based on a strong demand driven data paradigm. Individuals move around assisted by the latest technology which requires data that fits personal needs. They want to know where they are, how they get to a destination and what to expect there, and when to arrive. The elementary questions
linked to geospatial data, such as 'Where?', 'What?' and 'When?', become even more relevant. This is further stimulated by technological developments around mobile phones, personal digital assistants, and global positioning devices. Next to the demand for location based services one can witness an increase in the use of, for instance, mobile GIS equipment in fieldwork situations (Pundt, 2002; Wintges, 2003). The above technology also proves to be a rich new data source (Mountain, 2004), since these devices can easily collect data. Interest in exploratory and analytical tools to process and understand these (aggregate) data streams is increasing. Geographers see new opportunities to study human behaviour and this explains the revival of interest in Hägerstrand's time geography (Miller, 2002). The time-geography approach has been hampered by the difficulty of getting abundant data and of methods and techniques to process the data. This seems no longer the case, but it might have serious implications for privacy which should not be taken for granted (Monmonier, 2002). From a visualization perspective the Space-Time-Cube is the most prominent element in Hägerstrand's approach. In its basic appearance the cube has on its base a representation of the geography (along the x- and y-axis), while the cube's height represents time (z-axis). A typical Space-Time-Cube could contain the space-time paths of, for instance, individuals or objects. However, when the concept was introduced the options to create the graphics were limited to manual methods and the user could only experience the single view created by the draftsperson. A different view on the cube would mean going through a laborious drawing exercise again. Today software exists that allows one to automatically create the cube and its contents from a database. Developments in geovisualization allow one to link the (different) cube views to other alternative graphics. Based on the latest developments in geovisualization, this paper presents an extended interactive and dynamic visualization environment, in which the user has full flexibility to view, manipulate and query the data in a Space-Time-Cube, while linked to other views on the data. The aim is to have a graphic environment that allows for creativity via an alternative perspective on the data, to spark the mind with new ideas and to solve particular geo-problems. The proposed visualization environment is illustrated by two applications: sports and archaeology. The first is relatively straightforward when it comes to the application of the space-time-cube and the second demonstrates new opportunities for the cube because of the visualization possibilities.
2 Hägerstrand’s Time Geography Time-geography studies the space-time behaviour of human individuals. In their daily life each individual follows a trajectory through space and time. Hägerstand’s time geography sees both space and time as inseparable, and this becomes clear if one studies the graphic representation of his ideas, the Space-Time-Cube as displayed in this papers figures. Two of the cubes axes represent space, and the third axe represents time. This allows the display of trajectories, better known as Space-Time-Paths (STP). These paths are influenced by constraints. One can distinguish between capability constraints (for instance mode of transport and need for sleep), coupling constraints (for instance being at work or at the sports club), and authority constraints (for instance accessibility of buildings or parks in space and time). On an aggregate level time geography can also deal with trends in society. The vertical lines indicate a stay at the particular location which are called ‘stations’. The time people meet at a station creates so-called ‘bundles’. The near horizontal lines indicate movements. The Space-TimePath can be projected on the map, resulting in the path’s footprint. Another important time-geography concept is the notion of the Space-TimeCube. In the cube it occupies the volume in space and time a person can reach in a particular time-interval starting and returning to the same location (for instance: where can you get during lunch time). The widest extent is called the Potential Path Space (PPS) and its footprint is called Potential Path Area (PPA). In the diagram it is represented by a circle assuming it is possible to reach every location at the edge of the circle. In reality the physical environment (being urban or rural) will not always allow this due to the nature of for instance the road pattern or traffic intensity. During the seventies and eighties Hägerstrand’s time-geography has been elaborated upon by his Lund group (Hägerstrand, 1982; Lenntorp, 1976). It has been commented and critiqued by for instance Pred (1977), who in his paper gives a good analysis of the theory, and summarizes it as “the time geography framework is at one and the same time disarmingly simple in composition and ambitious in design.” As one of the great benefits of the approach he notes that is takes away the (geographers) over emphasize on space and includes time. Recently Miller (2002) in a homage to Hägerstrand worded this as a shifting attention from a ‘place-based perspective’ (our traditional GIS) to a more people based-perspective (timegeography). The simplicity is only partly through, because if one considers the Space-Time-Cube from a visual perspective it will be obvious that an interactive viewing environment is not easy to find. Probably that is one of the reason the cube has been used only sporadically since its conceptuali-
192
Menno-Jan Kraak and Alexandra Koussoulakou
sation. Although applications in most traditional human-geography domains have been described, only during the last decade we have witnessed an increased interest in the Space-Time-Cube. Miller (1991; 1999) applied its principles in trying to establish accessibility measure in an urban environment. Kwan (1998; 1999) has used it to study accessibilities differences among gender and different ethnic groups. She also made a start integrating cyberspace into the cube. Forer has developed a interesting data structure based on taxels (‘time volumes’) to incorporate in the cube to represent the Space-Time-Prism (Forer, 1998; Forer and Huisman, 1998). Hedley et al (1999) created an application in a GIS environment for radiological hazard exposure. Improved data gathering techniques have given the interest in Time geography and the Space-Time-Cube again a new impulse. Andrienko et. al. (2003) did put the cube in the perspective of exploratory spatio-temporal visualization. Recent examples are described by Mountain and his colleagues (Dykes and Mountain, 2003; Mountain and Raper, 2001) who discuss the data collection techniques by mobile phone, GPS and location based services and suggest visual analytical method to deal with the data gathered. The Space Time-Path or geospatial life lines as they are called by Hornsby & Egenhofer (2002) are object of study in the framework of moving objects. From a visualization point of view the graphics created in most applications are often of an ad-hoc nature. This paper tries to systematically look at what is possible today? It should be obvious that the Space-Time-Cube is not the solution for the visualization of all spatio-temporal data one can think of. The next two sections will put the cube in a more generic geovisualization and temporal visualization perspective.
3 Space-Time-Cube in a Geovisualization Context
During Hägerstrand's active career - the period before Geographic Information Systems - paper maps and statistics were probably the most prominent tools for researchers to study their geospatial data. To work with those data, analytical and map use techniques were developed, among them the concepts of time-geography. Many of these ideas can still be found in the commands of many GIS packages. Today GIS offers researchers access to large and powerful sets of computerized tools such as spreadsheets, databases and graphic tools to support their investigations. The user can interact with the map and the data behind it, an option which can be extended via links to other data accessible via the web. This capability adds a different perspective to the map, as maps become interactive tools for exploring the nature of the geospatial data at hand. The map should be seen as an interface to geospatial data that can support information access and exploratory activities, while it retains its traditional role as a presentation device. There is also a clear need for this capability, since the magnitude and complexity of the available geospatial data pose a challenge as to how the data can be transformed into information and ultimately into knowledge. Geovisualization integrates approaches from scientific visualization, (exploratory) cartography, image analysis, information visualization, exploratory data analysis (EDA) and GIS to provide theory, methods and tools for the visual exploration, analysis, synthesis and presentation of geospatial data (MacEachren and Kraak, 2001). Via the use of computer-supported, interactive, visual representations of (geospatial) data one can strengthen understanding of the data at hand. The visualizations should lead to insight that ultimately helps decision making. In this process maps and other graphics are used to stimulate (visual) thinking about geospatial patterns, relationships and trends, to generate hypotheses, develop problem solutions and ultimately construct knowledge. One important approach here is to view geospatial data sets in a number of alternative ways, e.g., using multiple representations without constraints set by traditional techniques or rules. This should avoid the trap described by Finke (1992), who claims that "most researchers tend to rely on well-worn procedures and paradigms..." while they should realize that "…creative discoveries, in both art and science, often occur in unusual situations, where one is forced to think unconventionally." This is well described by Keller and Keller (1992), who in their approach to the visualization process suggest removing mental roadblocks and taking some distance from the discipline in order to reduce the effects of traditional constraints. Why not choose the STC as an alternative mapping method to visualize temporal data? However, to be effective the alternative view should preferably be presented in combination with familiar views to avoid getting lost. This implies a working environment that consists of multiple linked views, to ensure that an action in one view is also immediately visible in all other views. Currently this approach receives much attention from both a technical as well as a usability perspective (Convertino et al., 2003; Roberts, 2003). The above trend in mapping is strongly influenced by developments in other disciplines. In the 1990s, the field of scientific visualization gave the word "visualization" an enhanced meaning (McCormick et al., 1987). This development linked visualization to more specific ways in which modern computer technology can facilitate the process of "making data visible" in real time in order to strengthen knowledge. The relations between the
fields of cartography and GIS, on the one hand, and scientific visualization on the other, have been discussed in depth (Hearnshaw and Unwin, 1994) and (MacEachren and Taylor, 1994). Next to scientific visualization, which deals mainly with medical imaging, process model visualization and molecular chemistry, another branch of visualization that influenced mapping can be recognized. This is called information visualization and focuses on the visualization of non-numerical information (Card et al., 1999). Of course recent trends around GIScience play a role as well (Duckham et al., 2003). In the recent book 'Exploring Geovisualization' (Dykes et al., 2004) many current research problems are described. From the map perspective, it is required that in this context cartographic design and research pay attention to the human computer interaction of the interfaces, and revive the attention for the usability of the products, especially since neither many alternative views nor their experimental environments have really been tested on their efficiency or effectiveness. Additionally, one has to work on representation issues and the integration of geocomputing in the visualization process to be able to realize the alternatives.
4 Space-Time-Cube and its Visualization Environment
Taking the above discussion into account, it is assumed that a better (visual) exploration and understanding of temporal events taking place in our geo-world requires the integration of geovisualization with time-geography's Space-Time-Cube. Prominent keywords are interaction, dynamics and alternative views, which each have their impact on the viewing environment proposed. Interaction is needed because the three-dimensional cube has to be manipulated in space to find the best possible view, and it should be possible to query the cube's content. Time, always present in the Space-Time-Cube, automatically introduces dynamics. The alternative graphics appear outside the cube and are linked, and should stimulate thinking, new insights and explanations. This section will systematically discuss the functionality required in such an environment. One can distinguish functions linked to a basic display environment, those linked to data display and manipulation in the cube, and those related to linked views. The functions are all based on the question: What tasks are expected to be executed when working with a Space-Time-Cube?
Fig. 1. The Space-Time-Cube's basic viewing environment. The working view shows the details of the data to be studied. The 2D and 3D views assist the user in orientation and navigation. The attribute view offers the user options to define which variable will be displayed in the cube. The data displayed in the figure represent Napoleon's 1812 march into Russia.

The viewing environment has a main or working view, as can be seen in Figure 1. Additionally three other optional views are shown. These views help with orientation and navigation in the cube, as well as with the selection of the data displayed. The 2D view shows the whole map of the study area with a rectangle that matches the base map displayed in the working view's cube. Similarly, the content of the cube in the 3D view corresponds to the content of the Space-Time-Cube. The situation in Figure 1 results after zooming in on a particular area. The views are all linked, and moving, for instance, the rectangle in the 2D view will also result in a different location on the overview cube in the 3D view and a different content of the working view. The attribute view shows the variables available for display, and allows the user to link those variables to the Space-Time-Cube's display variables. These include a base map, variables to be displayed along the cube's axes (x, y, and time) and variables linked to the Space-Time-Path - its colour and width. The user can drag any of the available variables onto the display variables. However, a few limitations exist. Since we deal with a Space-Time-Cube, a spatial (x or y) and a time component must always be there. It is possible though that different time variables exist, of which one should be selected by the user. For instance time could be given
in years but also according to particular historical events, like the reign of an administration. The base map is displayed optionally. The x or y axis could be used for display with another variable. In the case of the Napoleon data displayed in Figure 1 the y could be interchanged with the number of troops, as was suggested by Roth discussing the SAGE software (Roth et al., 1997). A Space-Time-Path will be automatically displayed as soon as the x, y and time variables have been assigned in the attribute view. As with most software dealing with maps, it is possible to switch information layers on and off. In the cube a footprint of the Space-Time-Path is displayed for reference and measurements. Any of the cube's axes can be dragged into the cube to measure time or location. Drop lines can be projected on the axes for the same purpose. Since we deal with the display of the third dimension, the option to manipulate the cube in 3D space is a basic requirement. Rotating the cube independently around any of its axes is possible. Also introduced is a spinning option that allows one to let the cube automatically rotate around a defined axis, with the purpose of getting an overview of the data displayed. Since users will be curious, one should be able to query the Space-Time-Path. In this example the system responds with an overview of the available data on the segment selected. Via the attribute view the user can change the variable attached to the axes or the Space-Time-Path on the fly. For selection purposes it is also possible to use multiple slider planes along the axes to highlight an area of the cube.
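As a purely illustrative aside, a minimal Space-Time-Path and its footprint can already be drawn and rotated interactively with off-the-shelf tools; the Python/matplotlib sketch below (with made-up coordinates) is not the environment described in this paper, but indicates the kind of display it builds on.

    import matplotlib.pyplot as plt
    import numpy as np

    # A Space-Time-Path as a sequence of (x, y, t) fixes, e.g. from a GPS log.
    # The values below are invented purely for illustration.
    x = np.array([0.0, 1.0, 2.5, 3.0, 3.2])
    y = np.array([0.0, 0.5, 0.4, 1.5, 2.0])
    t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])      # time along the cube's vertical axis

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")        # the cube: x, y geography, z time
    ax.plot(x, y, t, marker="o", label="Space-Time-Path")
    ax.plot(x, y, np.zeros_like(t), linestyle="--", label="footprint")  # projection on the base
    ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("time")
    ax.legend()
    plt.show()                                   # the view can be rotated interactively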
5 Applications and Extended Functionality
5.1 Sports
Sport and time-geography have met before, for instance to analyse rugby matches (Moore et al., 2003). The Space-Time-Cube is most suitable for the display and analysis of the paths of (multiple) individuals, groups or other objects moving through space (Figure 2). However, other possibilities exist, and it could also be used for real-time monitoring. In an experiment the Space-Time-Cube is used to visualize a running event, with additional views linked to the cube environment. The working view and the 2D view show a path based on GPS data acquired during a fitness run. Highlighted in both views is a section of the run for which detailed information is shown in the linked views. These are a GPS data file showing the track history and a video displaying the path's environment. A dot in the working view and 2D view indicates the position of the runner represented by the current video-frame and the highlighted record in the GPS log. For analytical purposes one could add the other paths of similar runs and compare the running results in relation to the geography. To expand this geographic analysis one could also use a digital terrain model as base map and see how the terrain might influence running speed. This example shows how multimedia elements can be linked to the cube. However, one can easily think of other linked views as well. With many available variables these could - when relevant - be displayed in a parallel coordinate plot, which could be used as an alternative variable view and allow for linking the variables with the cube's axes or the Space-Time-Path.

5.2 Archaeology
Another discipline that might benefit from the Space-Time-Cube approach is archaeology. Complex relations between artefacts excavated at particular times, found from different historical periods at different locations, could be made visible in the cube. The location of a find will be presented by a vertical line that has a colour/thickness at the relevant time period, possibly indicating uncertainty. Since most finds will be point locations, paths are not necessarily available in the archaeological version of the cube. However, apart from the excavation data (the "archaeological" part), the interpretation of the excavation (the "history" part) can provide spatio-temporal distributions of derived information, which do not refer to point locations only. Examples are: uses of space ("landuse"), locations with concentrations of specific materials (e.g. pottery, gold, stones etc.), borders of a city through time, distribution of settlements / excavations in an area, and links to other geographical locations, at a smaller map-scale. Moreover, the possibility to play with the different time scales related to history, geology, archaeology etc. makes it possible, for instance, to discover the spread of a civilisation or the influence of a quarry in the spread of particular artefacts. It could assist in predicting where interesting locations could be found. In Figure 2 the Space-Time-Cube is used to visualize the spatio-temporal location of archaeological excavations (corresponding to various settlements) in an area. Of importance here is the location of the settlements rather than the path they follow in time (this is because there still remain settlements to be discovered). Nevertheless, the STP of all existing settlements might offer interesting patterns to archaeologists with respect to movements through time. An additional functionality requirement for archaeology has to do with the kind of use/uses of space in the various
spots. Also in this application the cube environment should not be used stand-alone but in active connection to other views with relevant textual, tabular and image data.
Fig. 2. STC applications: left: multiple runners in an orienteering race; right: an archaeological excavation where the stations indicate the duration of the existence of settlements
6 Problems and Prospects
This paper has discussed the option to visually deal with the concepts of Hägerstrand's Space-Time-Cube based on the opportunities offered by geovisualization. The functionality of a Space-Time-Cube viewing environment with its multiple linked views has been described. However, many questions remain and are currently the subject of further research. Among those questions are 'How many multiple linked views can the user handle?', 'Can the user understand the cube when multiple Space-Time-Paths are displayed?', and 'What should the interface look like?' All these questions deal with usability aspects of the cube's viewing environment, and the authors currently work on a usability set-up to answer the above questions.

References
Andrienko, N., Andrienko, G.L. and Gatalsky, P., 2003. Exploratory spatiotemporal visualization: an analytical review. Journal of Visual Languages and Computing, 14: 503-541.
Card, S.K., MacKinlay, J.D. and Shneiderman, B., 1999. Readings in information visualization: using vision to think. Morgan Kaufmann, San Francisco. Convertino, G., Chen, J., Yost, B., Ryu, Y.-S. and North, C., 2003. Exploring context switching and cognition in dual-view coordinated visualizations. In: J. Roberts (Editor), International conference on coordinated & multiple views in exploratory visualization. IEEE Computer Society, London, pp. 55-62. Duckham, M., Goodchild, M. and Worboys, M. (Editors), 2003. Foundations of geographic information science. Taylor & Francis, London. Dykes, J., MacEachren, A.M. and Kraak, M.J. (Editors), 2004. Exploring geovisualization. Elseviers., Amsterdam. Dykes, J.A. and Mountain, D.M., 2003. Seeking structure in records of spatiotemporal behaviour: visualization issues, efforts and applications: Computational Statistics and Data Analysis (Data Viz II). Computational Statistics & Data Analysis, 43(4): 581-603. Finke, R.A., Ward, T.B. and Smith, S.M., 1992. Creative Cognition: Theory, Research, and Applications. The MIT Press, Cambridge, Mass, 205 pp. Forer, P., 1998. Geometric approaches to the nexus of time, space, and microprocess: implementing a practical model for mundane socio-spatial systems. In: M.J. Egenhofer and R.G. Gollege (Editors), Spatial and temporal reasoning in geographic information systems. Spatial Information Systems. Oxford University Press, Oxford. Forer, P. and Huisman, 1998. Computational agents and urban life spaces: a preliminary realisation of the time-geography of student lifestyles, Third International Conference on GeoComputation, Bristol. Hägerstrand, T., 1982. Diorama, path and project. Tijdschrift voor Economische en Sociale eografie, 73: 323-339. Hearnshaw, H.M. and Unwin, D.J. (Editors), 1994. Visualization in Geographical Information System. J. Wiley and Sons, London. Hedley, N.R., Drew, C.H. and Lee, A., 1999. Hagerstrand Revisited: Interactive Space-Time Visualizations of Complex Spatial Data. Informatica: International Journal of Computing and Informatics, 23(2): 155-168. Hornsby, K. and Egenhofer, M.J., 2002. Modeling Moving Objects over Multiple Granularities. Annals of Mathematics and Artificial Intelligence, 36(1-2): 177-194. Keller, P.R. and Keller, M.M., 1992. Visual cues, practical data visualization. IEEE Press, Piscataway. Kwan, M.-P., 1998. Space-time and integral measures of individual accessibility: A comparative analysis using a point-based framework. Geographical Analysis, 30(3): 191-216. Kwan, M.-P., 1999. Gender, the home-work link, and space-time patterns of nonemployment activities. Economic Geography, 75(4): 370-94. Lenntorp, B., 1976. Paths in space time environments: a time geographic study of movement possibilities of individuals. Lund studies in Geography B: Human geography. MacEachren, A.M. and Kraak, M.J., 2001. Research challenges in geovisualization. Cartography and Geographic Information Systems, 28(1): 3-12.
MacEachren, A.M. and Taylor, D.R.F. (Editors), 1994. Visualization in Modern Cartography. Pergamon Press, London.
McCormick, B., DeFanti, T.A. and Brown, M.D., 1987. Visualization in Scientific Computing. Computer Graphics, 21(6).
Miller, H.J., 1991. Modelling accessibility using space-time prism concepts within geographical information systems. International Journal of Geographical Information Systems, 5(3): 287-301.
Miller, H.J., 1999. Measuring space-time accessibility benefits within transportation networks: basic theory and computational procedures. Geographical Analysis, 31(2): 187-212.
Miller, H.J., 2002. What about people in geographic information science? In: D. Unwin (Editor), Re-Presenting Geographic Information Systems. Wiley.
Monmonier, M., 2002. Spying with maps. University of Chicago Press, Chicago.
Moore, A.B., Wigham, P., Holt, A., Aldridge, C. and Hodge, K., 2003. A Time Geography Approach to the Visualisation of Sport, Geocomputation 2003, Southampton.
Mountain, D., 2004. Visualizing, querying and summarizing individual spatiotemporal behaviour. In: J. Dykes, A.M. MacEachren and M.J. Kraak (Editors), Exploring Geovisualization. Elsevier, Amsterdam, pp. 000-000.
Mountain, D.M. and Raper, J.F., 2001. Modelling human spatio-temporal behaviour: a challenge for location-based services, GeoComputation, Brisbane.
Pred, A., 1977. The choreography of existence: Comments on Hagerstrand's time-geography and its usefulness. Economic Geography, 53: 207-221.
Pundt, H., 2002. Field data collection with mobile GIS: Dependencies between semantics and data quality. GeoInformatica, 6(4): 363-380.
Roberts, J., 2003. International conference on coordinated & multiple views in exploratory visualization. IEEE Computer Society.
Roth, S.F., Chuah, M.C., Kerpedjiev, S., Kolojejchick, J.A. and Lucas, P., 1997. Towards an Information Visualization Workspace: Combining Multiple Means of Expression. Human-Computer Interaction Journal, 12(1 & 2): 131-185.
Wintges, T., 2003. Geodata communication on personal digital assistants (PDA). In: M.P. Peterson (Editor), Maps and the Internet. Elsevier, Amsterdam.
Finding REMO - Detecting Relative Motion Patterns in Geospatial Lifelines Patrick Laube1, Marc van Kreveld2, and Stephan Imfeld1 Department of Geography, University of Zurich, Winterthurerstrasse 190, CH–8057 Zurich, Switzerland, {plaube,imfeld}@geo.unizh.ch1 Department of Computer Science, Utrecht University, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands,
[email protected]
Abstract
Technological advances in position-aware devices increase the availability of tracking data for everyday objects such as animals, vehicles, people or football players. We propose a geographic data mining approach to detect generic aggregation patterns such as flocking behaviour and convergence in geospatial lifeline data. Our approach considers the objects' motion properties in an analytical space as well as spatial constraints on the objects' lifelines in geographic space. We discuss the geometric properties of the formalised patterns with respect to their efficient computation.
Keywords: Convergence, cluster detection, motion, moving point objects, pattern matching, proximity
1 Introduction Moving Point Objects (MPOs) are a frequent representation for a wide and diverse range of phenomena: for example animals in habitat and migration studies (e.g. Ganskopp 2001; Sibbald et al. 2001), vehicles in fleet management (e.g. Miller and Wu 2000), agents simulating people for modelling crowd behaviour (e.g. Batty et al. 2003) and even tracked soccer players on a football pitch (e.g. Iwase and Saito 2002). All those MPOs share motions that can be represented as geospatial lifelines: a series of observations consisting of a triple of id, location and time (Hornsby and Egenhofer 2002).
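The lifeline representation can be made concrete with a small sketch. The record layout below is an illustration only (names such as Fix and lifelines_from_fixes are ours, not from the paper); it simply shows one way to hold the (id, location, time) triples described above.

from dataclasses import dataclass
from collections import defaultdict

@dataclass
class Fix:
    """One observation of a moving point object: (id, location, time)."""
    obj_id: str
    x: float   # e.g. easting in metres
    y: float   # e.g. northing in metres
    t: int     # index of the (regular) observation time step

def lifelines_from_fixes(fixes):
    """Group raw observations into one time-ordered lifeline per object."""
    lifelines = defaultdict(list)
    for f in fixes:
        lifelines[f.obj_id].append(f)
    for track in lifelines.values():
        track.sort(key=lambda f: f.t)
    return dict(lifelines)

# Example: two GPS-collared animals observed at three time steps.
fixes = [Fix("O1", 0.0, 0.0, 0), Fix("O1", 10.0, 10.0, 1), Fix("O1", 20.0, 20.0, 2),
         Fix("O2", 5.0, 0.0, 0), Fix("O2", 5.0, 10.0, 1), Fix("O2", 5.0, 20.0, 2)]
print({k: len(v) for k, v in lifelines_from_fixes(fixes).items()})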
Gathering tracking data on individuals has become much easier because of substantial technological advances in position-aware devices such as GPS receivers, navigation systems and mobile phones. The increasing number of such devices will lead to a wealth of data on space-time trajectories documenting the space-time behaviour of animals, vehicles and people for off-line analysis. These collections of geospatial lifelines present a rich environment for analysing individual behaviour. (Geographic) data mining may detect patterns and rules to gather basic knowledge of dynamic processes or to design location based services (LBS) that simplify individual mobility (Mountain and Raper 2001; Smyth 2001; Miller 2003). Knowledge discovery in databases (KDD) and data mining are responses to the huge data volumes in operational and scientific databases. Where traditional analytical and query techniques fail, data mining attempts to distill data into information and KDD turns information into knowledge about the monitored world. The central belief in KDD is that information is hidden in very large databases in the form of interesting patterns (Miller and Han 2001). This holds for the spatio-temporal analysis of geospatial lifelines and is thus a key motivator for the presented research. Motion patterns help to answer questions of the following kind:
- Can we identify an alpha animal in the tracking data of GPS-collared wolves?
- How can we quantify evidence of 'swarm intelligence' in gigabytes of log-files from agent-based models?
- How can we identify, in the lifelines of 22 players sampled every second, which football team played the more catching line of defence?
The long tradition of data mining in the spatio-temporal domain is well documented (for an overview see Roddick et al. (2001)). The Geographic Information Science (GISc) community has recognized the potential of Geographic Information Systems (GIS) to 'capture, represent, analyse and explore spatio-temporal data', potentially leading to unexpected new knowledge about interactions between people, technologies and urban infrastructures (Miller 2003). Unfortunately, most commercial GIS are based on a static, place-based perspective and are still notoriously weak in providing tools for handling the temporal dimensions of geographic information (Mark 2003). Miller postulates expanding GIS from the place-based perspective to encompass a people-based perspective. He identifies the development of a formal representation theory for dynamic spatial objects and of new spatio-temporal data mining and exploratory visualization techniques as key research issues for GISc.
This paper presents work that extends a concept developed to analyse relative motion patterns in groups of MPOs (Laube and Imfeld 2002) so that it also analyses the objects' absolute locations. The work allows the identification of generic formalised motion patterns in tracking data and the extraction of instances of these formalised patterns. The significance of these patterns is discussed.
2 Aggregation in Space and Time
Following Waldo Tobler's first law of geography, near things are more related than distant things (Tobler 1970). Tobler's law is often referred to as being the core of spatial autocorrelation (Miller 2004). Nearness as a concept can be extended to include both space and time. Thus, analysing geospatial lifelines we are interested in objects near in space-time. Objects that are near at certain times might be related. Although correlation is not causality, it provides evidence of causality that can (and should) be assessed in the light of theory and/or other evidence. Since this paper focuses on the formal and geometrical definition and the algorithmic detection of motion patterns, we use geometric proximity in Euclidean space to avoid the vague term nearness. For the analysis of geospatial lifelines this could mean that MPOs moving within a certain range influence each other. For example, an alpha wolf leads its pack by being seen or heard, so all wolves have to be located within the range of vision or earshot respectively. Analysing geospatial lifelines we are interested, first, in identifying motion patterns of individuals moving in proximity. Second, we want to know how, when and where sets of MPOs aggregate, converge and build clusters respectively. Investigating aggregation of point data in space and time is not new. Most approaches focus on detecting localized clustering at certain time slices (e.g. Openshaw 1994; Openshaw et al. 1999). This concept of spatial clusters is static, rooted in the time-sliced static map representation of the world. With a true spatio-temporal view of the world, aggregation must be perceived as the momentary process of convergence and the final static cluster as its possible result. The opposite of convergence, divergence, is equally interesting. Its possible result, some form of dispersal, is much less obvious and thus much harder to perceive and describe. A cluster is not the compulsory outcome of a convergence process and vice versa. A set of MPOs can very well be converging for a long time without building a cluster. The 22 players of a football match may converge during an attack without ever forming a detectable cluster on the
pitch. Conversely, MPOs moving around in a circle may build a wonderful cluster but never be converging. In addition, the process of convergence and the final cluster are in many cases sequential. Consider the lifelines of a swarm of bees. At sunset the bees move back to the hive from the surrounding meadows, showing a strong convergence pattern without building a spatial cluster. In the hive the bees wiggle around in a very dense cluster, but do not converge anymore. In short, even though convergence and clustering are often spatially and/or temporally linked, there need not be a detectable relation in an individual data frame under investigation.
3 The Basic REMO–Analysis Concept The basic idea of the analysis concept is to compare the motion attributes of point objects over space and time, and thus to relate one object's motion to the motion of all others (Laube and Imfeld 2002). Suitable geospatial lifeline data consist of a set of MPOs each featuring a list of fixes. The REMO concept (RElative MOtion) is based on two key features: First, a transformation of the lifeline data to a REMO matrix featuring motion attributes (i.e. speed, change of speed or motion azimuth). Second, matching of formalized patterns on the matrix (Fig. 1). Two simple examples illustrate the above definitions: Let the geospatial lifelines in Fig. 1a be the tracks of four GPS-collared deer. The deer O1 moving with a constant motion azimuth of 45° during an interval t2 to t5, i.e. four discrete time steps of length t, is showing constance. In contrast, four deer performing a motion azimuth of 45° at the same time t4 show concurrence.
[Fig. 1 graphic: panels (a)-(d) showing four lifelines (O1-O4), the motion azimuth values derived per time step t1-t5, the resulting REMO analysis matrix, and the matched patterns constance, concurrence and trend-setter.]
Fig. 1. The geospatial lifelines of four MPOs (a) are used to derive in regular intervals the motion azimuth (b). In the REMO analysis matrix (c) generic motion patterns are matched (d).
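The derivation of the REMO matrix can be sketched as follows. The snippet is an illustrative reconstruction, not the authors' implementation: it computes a motion azimuth for each object and interval from consecutive fixes and classifies it into 45° classes, mirroring the values shown in Fig. 1; names such as azimuth_deg and remo_matrix are our own.

import math

def azimuth_deg(x0, y0, x1, y1):
    """Compass azimuth (0 deg = north, 90 deg = east) of the move from (x0, y0) to (x1, y1)."""
    return math.degrees(math.atan2(x1 - x0, y1 - y0)) % 360.0

def remo_matrix(lifelines, n_classes=8):
    """Build a {object_id: [azimuth class per interval]} matrix from fixes sorted by time.

    lifelines: {object_id: [(x, y), (x, y), ...]} with one fix per regular time step.
    The azimuth is classified into n_classes equal sectors (default: 45 deg sectors).
    """
    width = 360.0 / n_classes
    matrix = {}
    for obj_id, fixes in lifelines.items():
        row = []
        for (x0, y0), (x1, y1) in zip(fixes, fixes[1:]):
            az = azimuth_deg(x0, y0, x1, y1)
            row.append(int(az // width) * int(width))   # class label, e.g. 0, 45, 90, ...
        matrix[obj_id] = row
    return matrix

# Example: an object moving north-east for three steps yields three 45-degree entries.
print(remo_matrix({"O1": [(0, 0), (1, 1), (2, 2), (3, 3)]}))   # {'O1': [45, 45, 45]}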
The REMO concept allows construction of a wide variety of motion patterns. Three basic examples are:
- Constance: Sequence of equal motion attributes for r consecutive time steps (e.g. deer O1 with motion azimuth 45° from t2 to t5).
- Concurrence: Incident of n MPOs showing the same motion attribute value at time t (e.g. deer O1, O2, O3, and O4 with motion azimuth 45° at t4).
- Trend-setter: One trend-setting MPO anticipates the motion of n others. Thus, a trend-setter consists of a constance linked to a concurrence (e.g. deer O1 anticipates at t2 the motion azimuth 45° that is reproduced by all other MPOs at time t4).
For simplicity we focus in the remainder of this paper on the motion attribute azimuth, even though most facets of the REMO concept are equally valid for speed or change of speed.
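A minimal sketch of matching these three basic patterns on a REMO matrix follows. It is not the authors' implementation; the matrix layout matches the remo_matrix output sketched earlier, and the helper names (find_constance, etc.) are illustrative.

def find_constance(matrix, r):
    """Yield (object_id, start_step) where an object keeps the same attribute value for r steps."""
    for obj_id, row in matrix.items():
        for s in range(len(row) - r + 1):
            if len(set(row[s:s + r])) == 1:
                yield obj_id, s

def find_concurrence(matrix, n):
    """Yield (time_step, value) where at least n objects share the same attribute value."""
    n_steps = min(len(row) for row in matrix.values())
    for t in range(n_steps):
        values = [row[t] for row in matrix.values()]
        for v in set(values):
            if values.count(v) >= n:
                yield t, v

def find_trend_setter(matrix, r, n):
    """Yield (leader, start, follow_step): a constance of length r whose value is shared by n MPOs at its end."""
    concurrences = set(find_concurrence(matrix, n))
    for obj_id, s in find_constance(matrix, r):
        value = matrix[obj_id][s]
        if (s + r - 1, value) in concurrences:
            yield obj_id, s, s + r - 1

# Applied to the matrix of Fig. 1, this would, for instance, report O1 as a trend-setter for azimuth 45.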
4 Spatially Constrained REMO patterns
The construction of the REMO matrix is an essential reduction of the information space. However, it should be noted that this step factors out the absolute locations of the fixes of the MPOs. The following two examples illustrate generic motion patterns where the absolute locations must be considered.
- Three geese all heading north-west at the same time – one over London, one over Cardiff and one over Glasgow – are unlikely to be influenced by each other. In contrast, three geese moving north-west in the same gaggle are probably influenced. Thus, for flocking behaviours the spatial proximity of the MPOs has to be considered.
- Three geese all heading for Leicester at the same time – one starting over London, one over Cardiff and one over Glasgow – show three different motion azimuths, not building any pattern in the REMO matrix. Thus, convergence can only be detected by considering the absolute locations of the MPOs.
The basic REMO concept must be extended to detect such spatially constrained REMO patterns. In Section 4.1 spatial proximity is integrated and in Section 4.2 an approach is presented to detect convergence in relative motion. Section 4.3 evaluates algorithmic issues of the proposed approaches.
4.1 Relative Motion with Spatial Proximity
Many sheep moving in a similar way are not enough to define a flocking pattern. We additionally expect all the sheep of a flock to graze on the same hillside. Formalised as a generic motion pattern, a flocking requires the MPOs to be in spatial proximity. To test the proximity of m MPOs building a pattern at a certain time we can compute the spatial proximity of the m MPOs' fixes in that time frame. Following Tobler's first law, proximity among MPOs can be considered in terms of impact ranges, or the other way around: a spatio-temporally clustered set of MPOs is evidence to suggest an interrelation among the involved MPOs. The meaning of a spatial constraint in a motion pattern is different if we consider the geospatial lifeline of a single MPO. The consecutive observations (fixes) of a single sheep building a lifeline can be tested for proximity. Thus the proximity measure constrains the spatial extent of a single object's motion pattern. A constance for a GPS-collared sheep may only be meaningful if it spans a certain distance, excluding pseudo-motion caused by inaccurate fix measurements. Different geometrical and topological measures could be used to constrain motion patterns spatially. The REMO analysis concept focuses on the following open list of geometric proximity measures.
- A first geometric constraint is the mean distance to the mean or median center (length of the star plot).
- Another approach to indicating the spatial proximity of points uses the Delaunay diagram, applied for cluster detection in 2-D point sets (e.g. Estivill-Castro and Lee 2002) or for the visualisation of habitat-use intensity of animals (e.g. Casaer et al. 1999). According to the cluster detection approach, two points belong to the same cluster if they are connected by a small enough Delaunay edge. Thus, adapted to the REMO concept, a second distance proximity measure is to limit the average length of the Delaunay edges of a point group forming a REMO pattern.
- Proximity measures can have the form of bounding boxes, circles or ellipses (Fig. 2). The simplest way of indicating an impact range would be to specify a maximal bounding box that encloses all fixes relevant to the pattern. Circular criteria can require enclosing all relevant fixes within radius r or include the constraint to be spanned around the mean or median center of the fixes. Ellipses are used to control the directional elongation of the point cluster (major axis a, minor axis b).
- Another areal proximity measure for a set of fixes is the indication of a maximal border length of the convex hull.
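Two of these proximity measures can be sketched directly. The following snippet is our illustration, not the paper's code: it computes the mean distance to the mean center and a bounding-box test for a set of fixes; names such as within_bounding_box are assumptions.

import math

def mean_distance_to_mean_center(points):
    """Mean Euclidean distance of the points to their mean center (length of the star plot)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)

def within_bounding_box(points, dx_max, dy_max):
    """True if all points fit into an axis-aligned box of size dx_max x dy_max."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) <= dx_max and (max(ys) - min(ys)) <= dy_max

# A group of fixes passes the spatial constraint if, for instance,
# mean_distance_to_mean_center(fixes) <= r_max or within_bounding_box(fixes, 100.0, 100.0).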
Using these spatial constraints the list of basic motion patterns introduced in Section 3 can be amended with the spatially constrained REMO patterns (Fig. 2).
- Track: Consists of the REMO pattern constance and the attachment of a spatial constraint. Definition: constance + spatial constraint S.
- Flock: Consists of the REMO pattern concurrence and the attachment of a spatial constraint. Definition: concurrence + spatial constraint S.
- Leadership: Consists of the REMO pattern trend-setter and the attachment of a spatial constraint. For example the followers must lie within the range (x, y) when they join the motion of the trend-setter. Definition: trend-setter + spatial constraint S.
[Fig. 2 graphic: the constraints of the patterns track, flock and leadership shown both in the analysis space (the REMO matrix, objects x time) and in geographic space, with ranges indicated by maximal extents ∂xmax and ∂ymax, a radius rmax, and ellipse axes a and b.]
Fig. 2. The figure illustrates the constraints of the patterns track, flock and leadership in the analysis space (the REMO matrix) and in the geographic space. Fixes matched in the analysis space are represented as solid forms, fixes not matched as empty forms. Some possible spatial constraints are represented as ranges with dashed lines. Whereas in the situations (a) the spatial constraints for the absolute positions of the fixes are fulfilled they are not in the situations (b): For track the last fix lies beyond the range, for flock and leadership the quadratic object lies outside the range.
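As an illustration of the flock pattern (concurrence plus a circular spatial constraint), the sketch below combines the concurrence test with a check that the participating fixes fit inside a circle of radius r_max around their mean center. It is a simplified stand-in for the smallest-enclosing-circle test discussed in Section 4.3, and all names are ours.

import math

def fits_in_circle(points, r_max):
    """Crude circular constraint: all points lie within r_max of their mean center.

    (The exact test in Sect. 4.3 uses the smallest enclosing circle; the mean center
    is only an approximation of its center.)
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return all(math.hypot(x - cx, y - cy) <= r_max for x, y in points)

def find_flocks(matrix, positions, n, r_max):
    """Yield (t, value, members): >= n MPOs sharing an attribute value at t and lying close together.

    matrix:    {obj_id: [attribute value per interval]}   (the REMO matrix)
    positions: {obj_id: [(x, y) per interval]}            (absolute fixes per interval)
    """
    n_steps = min(len(row) for row in matrix.values())
    for t in range(n_steps):
        by_value = {}
        for obj_id, row in matrix.items():
            by_value.setdefault(row[t], []).append(obj_id)
        for value, members in by_value.items():
            if len(members) >= n:
                pts = [positions[m][t] for m in members]
                if fits_in_circle(pts, r_max):
                    yield t, value, members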
4.2 Convergence
Groups of MPOs aggregating and disaggregating in space and time are at the same time self-evident and fascinating. An example is wild animals suddenly heading in a synchronised fashion for a mating place. Wildlife biologists could be interested in the who, when and where of this motion pattern. Who is joining this spatio-temporal trend? Who is not? When does the process start, and when does it end? Where does the mating place lie, and what spatial extent or form does it have? A second example comes from the analysis of crowd behaviour. Can we identify points of interest attracting people only at certain times, events of interest rather than points of interest that lose their attractiveness after a while? To answer such questions we propose the spatial REMO pattern convergence. The phenomenon of aggregation has a spatial and a spatio-temporal form. An example may help to illustrate the difference. Let A be a set of n antelopes. A wildlife biologist may be interested in identifying sets of antelopes heading for some location at a certain time. The time would indicate the beginning of the mating season, the selected set of m MPOs the ready-to-mate individuals, and the spot might be the mating area. This is a convergence pattern. It is primarily spatial: the MPOs head for an area but may reach it at different times. On the other hand, the wildlife biologist and the antelopes may share the vital interest of identifying MPOs that head for some location and actually meet there at some time, extrapolating their current motion. Thus, the pattern encounter includes considerations about speed, excluding MPOs heading for a meeting range but not arriving there at a particular time with the others.
- Convergence: Heading for R. Set of m MPOs at interval i with motion azimuth vectors intersecting within a range R of radius r.
- Encounter: Extrapolated meeting within R. Set of m MPOs at interval i with motion azimuth vectors intersecting within a range R of radius r and actually meeting within R when their current motion is extrapolated.
Fig. 3. Geometric detection of convergence. Let S be a set of 4 MPOs with 7 fixes from t0 to t6. The illustration shows a convergence pattern found with the parameters 4 MPOs at the temporal interval t1 to t3. The darkest polygon denotes an area where all 4 direction vectors are passing at a distance closer than r. The pattern convergence is found if such a polygon exists. Please note that the MPOs do not build a cluster but nevertheless show a convergence pattern.
The convergence pattern is illustrated in Figure 3. Let S be a set of MPOs with n fixes from t0 to tn-1. For every MPO and for every interval of length i, an azimuth vector fitted to its fixes within i represents the current motion. The azimuth vector can be seen as a half-line projected in the direction of motion. The convergence is matched if there is at any time a circle of radius r that intersects the directed half-lines fitted for each MPO to the fixes within i. For the encounter pattern it must additionally be tested whether the objects actually meet in the future. The opposites of the above described patterns are termed divergence and breakup. The latter term integrates a spatial divergence pattern with the temporal constraint of a precedent meeting in a range R. The graphical representation of the divergence pattern is highly similar to Fig. 3. The only
difference lies in the construction of the strips, which head backwards instead of forwards relative to the direction of motion.
4.3 Algorithms and Implementation Issues
In this section we develop algorithms to detect the motion patterns introduced above and analyse their efficiency. The basic motion patterns in the REMO concept are relatively easy to determine in linear time. The addition of positions requires more complex techniques to obtain efficient algorithms. We analyse the efficiency of pattern discovery for track, flock, leadership, convergence, and encounter in this section. We let the range be a circle of given radius R. Let n denote the number of MPOs in the data set, and t the number of time steps. The total input size is proportional to nt, so a linear time algorithm requires O(nt) time. We let m denote the number of MPOs that must be involved in a pattern to make it interesting. Finally, we assume that the length of a time interval is fixed and given. The addition of geographic position to the REMO framework requires the addition of geographic tests or the use of geometric algorithms. The track pattern can simply be tested by checking each basic constance pattern found for each MPO. If the constance pattern also satisfies the range condition, a track pattern is found. The test takes constant additional time per pattern, and hence the detection of track patterns takes O(nt) time overall. Efficient detection of the flock pattern is more challenging. We first separate the input data by equal time and equal motion direction, so that we get sets of n' ≤ n points with the same direction and at the same time. The 8t point sets in which patterns are sought have total size O(nt). To discover whether a subset of size at least m of the n' points lie close together, within a circle of radius R, we use higher-order Voronoi diagrams. The m-th order Voronoi diagram is the subdivision of the plane into cells such that, for any point inside a cell, some subset of m points are the closest among all the points. The number of cells is O(m(n'−m)) (Aurenhammer 1991), and the smallest enclosing circle of each subset of m points can be determined in O(m) time (de Berg et al. 2000, Sect. 4.7). If the smallest enclosing circle has radius at most R, we have discovered a pattern. The sum of the n' values over all 8t point sets is O(nt), so the total time needed to detect these patterns is O(ntm² + nt log n). This includes the time to compute the m-th order Voronoi diagram (Ramos 1999). Leadership pattern detection can be seen as an extension of flock pattern detection. The additional condition is that one of the MPOs shows
constance over the previous time steps. Leadership detection also requires O(ntm² + nt log n) time. For the convergence pattern, consider a particular time interval. The n MPOs give rise to n azimuth vectors, which we can see as directed half-lines. To test whether at least m MPOs out of n converge, we compute the arrangement formed by the thickened half-lines, which are half-strips of width 2r. For every cell in the arrangement we determine how many thickened half-lines contribute, which can be done by traversing the arrangement once and maintaining a counter that shows in how many half-strips the current cell is. If a cell is contained in at least m half-strips, it constitutes a pattern. Computing the arrangement of n half-strips and setting the counters can be done in O(n²) time in total; the algorithm is very similar to computing levels in arrangements (de Berg et al. 2000, Chap. 8). Since we consider t different time intervals, the total running time becomes O(n²t). The encounter pattern is the most complex one to compute. The reason is that extrapolated meeting times must also match, which adds a dimension to the space in which geometric algorithms are needed. We lift the problem into 3-D space, where the third dimension is time. The MPOs become half-lines that go upward from a common horizontal plane representing the beginning of the time interval; the slope of the half-lines will now be the speed. The geometric problem to be solved is finding horizontal circles of radius R that are crossed by at least m half-lines, which can be solved in O(n⁴) time with a simple algorithm. For all time intervals of a given length, the algorithm needs O(n⁴t) time.
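The arrangement-based algorithm above is non-trivial to implement. As a hedged illustration of what the convergence test asks for, the brute-force sketch below checks, on a coarse grid of candidate centres, whether at least m of the azimuth half-lines pass within distance r of a common point. It only approximates the test rather than reproducing the O(n²) algorithm, and every name in it is our own.

import math

def distance_point_to_ray(px, py, ox, oy, dx, dy):
    """Distance from point (px, py) to the ray starting at (ox, oy) with direction (dx, dy)."""
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length
    s = (px - ox) * ux + (py - oy) * uy          # projection parameter along the ray
    s = max(s, 0.0)                               # clamp: points behind the origin map to the origin
    return math.hypot(px - (ox + s * ux), py - (oy + s * uy))

def has_convergence(rays, m, r, bbox, step):
    """True if some grid point in bbox = (xmin, ymin, xmax, ymax) lies within r of >= m rays."""
    xmin, ymin, xmax, ymax = bbox
    x = xmin
    while x <= xmax:
        y = ymin
        while y <= ymax:
            hits = sum(1 for (ox, oy, dx, dy) in rays
                       if distance_point_to_ray(x, y, ox, oy, dx, dy) <= r)
            if hits >= m:
                return True
            y += step
        x += step
    return False

# Four rays aimed roughly at the origin from the four quadrants converge there.
rays = [(10, 10, -1, -1), (-10, 10, 1, -1), (-10, -10, 1, 1), (10, -8, -1, 0.8)]
print(has_convergence(rays, m=4, r=2.0, bbox=(-15, -15, 15, 15), step=1.0))   # True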
5 Discussion
The REMO approach has been designed to analyse motion based on geospatial lifelines. Since motion is expressed by a change in location over time, the REMO patterns intrinsically span space and time. Our approach thus overcomes the limitation of only either detecting spatial clusters on snapshots or highlighting temporal trends in attributes of spatial units. It allows pattern detection in space-time. REMO patterns rely solely on point observations and are thus expressible for any objects that can be represented as points and leave a track in a Euclidean space. Having translated the expected behaviours into REMO patterns, the detection process runs unsupervised, listing every pattern occurrence. The introduced patterns can be detected within reasonable time. Many simple patterns can be detected in close to linear time if the size of the subset m that constitutes a pattern is a constant, which is natural in
many situations. The encounter pattern is quite expensive to compute, but since we focus on off-line analysis, we can still deal with data sets consisting of several hundreds of MPOs. Note that the dependency on the number of time steps is always linear for fixed-length time intervals. The most promising way to obtain more efficient algorithms is by using approximation algorithms, which can save orders of magnitude by stating the computational problem slightly less strictly (Bern and Eppstein 1997). In short, the REMO concept can cope with the emerging data volumes of tracking data. Syntactic pattern recognition adopts a hierarchical perspective where complex patterns are viewed as being composed of simple primitives and grammatical rules (Jain et al. 2000). Sections 3 and 4 introduced a subset of possible pattern primitives of the REMO analysis concept. Using these primitives and a pattern description formalism, almost arbitrary motion patterns can be described and detected. Due to this hierarchical design the concept easily adapts to the special requirements of various application fields. Thus, the approach is flexible and universal, suited for various lifelines such as those of animals, vehicles, people, agents or even soccer players. The detection of patterns of higher complexity requires more sophisticated and flexible pattern matching algorithms than the ones available today. The potential users of the REMO method know the phenomenon they investigate and the data describing it. Hence, in contrast to traditional data mining assuming no prior knowledge, the users come up with expectations about conceivable motion patterns and are able to assess the results of the pattern matching process. Therein lies a downside of the REMO pattern detection approach: it requires relatively sophisticated knowledge about the patterns to be searched for. For instance, the setting of an appropriate impact range for a flock pattern is highly dependent on the investigated process and thus on the user. In general, the parametrisation of the spatial constraints influences the number of patterns detected. Further research is needed to see whether autocalibration of pattern detection will be possible within the REMO concept. Even though the REMO analysis concept assumes users specifying patterns they are interested in, the pattern extent can also be viewed as an analysis parameter of the data mining approach. One reason to do so is to detect scale effects lurking in different granularities of geospatial lifeline data. The number of matched patterns may be highly dependent on the spatial, temporal and attributal granularity of the pattern matching process. For example, the classification of motion azimuth into only the two classes east and west reveals a lot of presumably meaningless constance patterns; in contrast, the probability of finding constance patterns with 360 azimuth classes is much smaller. Or take the selection of the impact range r for the flock pattern in sheep as another example: by testing the length of the
impact range r against the number of matched patterns one could search for a critical maximal impact range within a flock of sheep. Future research will address numerical experiments with various data to investigate such relations. A critical issue in detecting convergence is fitting the direction vector to a set of fixes. Only slight changes in its azimuth may have huge effects on the overlapping regions. A straightforward approach to this problem is to smooth the lifelines and then fit the azimuth vector to a segment of the smoothed lifeline. The paper illustrates the REMO concept with reference to ideal geospatial lifeline data. In reality, lifeline data are often imprecise and uncertain. Sudden gaps in lifelines, irregular fixing intervals or positional uncertainty of fixes require sophisticated interpolation and uncertainty considerations on the implementation side (e.g. Pfoser and Jensen 1999).
6 Conclusions
With the technology-driven shift in GIScience from the static map view of the world to a view of dynamic processes, cluster detection on snapshots is insufficient. What we need are new methods that can detect convergence processes as well as static clusters, especially if these two aspects of space-time aggregation are separated. We propose a generic, understandable and extendable approach for data mining in geospatial lifelines. Our approach integrates individuals as well as groups of MPOs. It also integrates parameters describing the motion as well as the footprints of the MPOs in space-time.
Acknowledgements
The ideas for this work prospered in the creative ambience of Dagstuhl Seminar No. 03401 on 'Computational Cartography and Spatial Modelling'. The authors would like to acknowledge invaluable input from Ross Purves, University of Zurich.
References
Aurenhammer F (1991) Voronoi diagrams: A survey of a fundamental geometric data structure. ACM Comput. Surv. 23(3):345-405
Batty M, Desyllas J, Duxbury E (2003) The discrete dynamics of small-scale spatial events: agent-based models of mobility in carnivals and street parades. Int. J. Geographical Information Systems 17(7):673-697
Bern M, Eppstein D (1997) Approximation algorithms for geometric problems. In Hochbaum DS (ed) Approximation Algorithms for NP-Hard Problems, PWS Publishing Company, Boston, MA, pp 296-345
Casaer J, Hermy M, Coppin P, Verhagen R (1999) Analysing space use patterns by Thiessen polygon and triangulated irregular network interpolation: a nonparametric method for processing telemetric animal fixes. Int. J. Geographical Information Systems 13(5):499-511
de Berg M, van Kreveld M, Overmars M, Schwarzkopf O (2000) Computational Geometry - Algorithms and Applications. Springer, Berlin, 2nd edition
Estivill-Castro V, Lee I (2002) Multi-level clustering and its visualization for exploratory data analysis. GeoInformatica 6(2):123-152
Ganskopp D (2001) Manipulating cattle distribution with salt and water in large arid-land pastures: a GPS/GIS assessment. Applied Animal Behaviour Science 73(4):251-262
Hornsby K, Egenhofer M (2002) Modeling moving objects over multiple granularities. Annals of Mathematics and Artificial Intelligence 36(1-2):177-194
Iwase S, Saito H (2002) Tracking soccer player using multiple views. In IAPR Workshop on Machine Vision Applications, MVA Proceedings, pp 102-105
Jain A, Duin R, Mao J (2000) Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(1):4-37
Laube P, Imfeld S (2002) Analyzing relative motion within groups of trackable moving point objects. In: Egenhofer M, Mark D (eds), Geographic Information Science, Second International Conference, GIScience 2002, Boulder, CO, USA, September 2002, LNCS 2478, Springer, Berlin, pp 132-144
Mark D (2003) Geographic information science: Defining the field. In: Duckham M, Goodchild M, Worboys M (eds), Foundations of Geographic Information Science, chap. 1, Taylor and Francis, London New York, pp 3-18
Miller H (2003) What about people in geographic information science? Computers, Environment and Urban Systems 27(5):447-453
Miller H (2004) Tobler's first law and spatial analysis. In preparation
Miller H, Han J (2001) Geographic data mining and knowledge discovery: An overview. In: Miller H, Han J (eds) Geographic data mining and knowledge discovery, Taylor and Francis, London New York, pp 3-32
Miller H, Wu Y (2000) GIS software for measuring space-time accessibility in transportation planning and analysis. GeoInformatica 4(2):141-159
Mountain D, Raper J (2001) Modelling human spatio-temporal behaviour: A challenge for location-based services. Proceedings of GeoComputation, Brisbane, 6
Openshaw S (1994) Two exploratory space-time-attribute pattern analysers relevant to GIS. In: Fotheringham S, Rogerson P (eds) GIS and Spatial Analysis, chap. 5, Taylor and Francis, London New York, pp 83-104
Openshaw S, Turton I, MacGill J (1999) Using the Geographical Analysis Machine to analyze limiting long-term illness census data. Geographical and Environmental Modelling 3(1):83-99
Pfoser D, Jensen C (1999) Capturing the uncertainty of moving-object representations. In: Gueting R, Papadias D, Lochovsky F (eds) Advances in Spatial Databases, 6th International Symposium, SSD'99, Hong Kong, China, July 1999. LNCS 1651, Springer, Berlin Heidelberg, pp 111-131
Ramos E (1999) On range reporting, ray shooting and k-level construction. In: Proc. 15th Annu. ACM Symp. on Computational Geometry, pp 390-399
Roddick J, Hornsby K, Spiliopoulou M (2001) An updated bibliography of temporal, spatial, and spatio-temporal data mining research. In: Roddick J, Hornsby K (eds), Temporal, spatial and spatio-temporal data mining, TSDM 2000, LNAI 2007, Springer, Berlin Heidelberg, pp 147-163
Sibbald AM, Hooper R, Gordon IJ, Cumming S (2001) Using GPS to study the effect of human disturbance on the behaviour of red deer stags on a highland estate in Scotland. In: Sibbald A and Gordon IJ (eds) Tracking Animals with GPS, Macaulay Institute, pp 39-43
Smyth C (2001) Mining mobile trajectories. In: Miller H, Han J (eds) Geographic data mining and knowledge discovery, Taylor and Francis, London New York, pp 337-361
Tobler W (1970) A computer movie simulating urban growth in the Detroit region. Economic Geography 46(2):234-240
Spatial Hoarding: A Hoarding Strategy for Location-Dependent Systems Karim Zerioh2, Omar El Beqqali2 and Robert Laurini1 1
LIRIS Laboratory, INSA-Lyon – Bât B. Pascal – 7 av. Capelle F – 69621 Villeurbanne Cedex France
[email protected] 2 Dhar Mehraz Faculty of Science, Mathematics and Computer Science Department, B.P. 1897 Fes-Atlas, 30000, Fes, Morocco.
[email protected],
[email protected]
Abstract
In a context-aware environment, the system must be able to refresh the answers to all pending queries in reaction to perpetual changes in the user's context. This, added to the fact that mobile systems suffer from problems like scarce bandwidth, low-quality communication and frequent disconnections, leads to high delays before up-to-date answers can be given to the user. A solution to reduce latency is to use hoarding techniques. We propose a hoarding policy particularly adapted to location-dependent information systems managing a huge amount of multimedia information and where no assumptions can be made about the future user's location. We use the user's position as a criterion for both hoarding and cache invalidation.
Keywords: Hoarding, cache invalidation, mobile queries, location-dependent systems, spatial information systems.
1 Introduction
The growing popularity of mobile computing has led to more and more elaborate mobile information systems. Nowadays, mobile applications are aware of the user's context (time, location, weather, temperature, surrounding noise, ...). One of the most popular context-aware applications is the tourist guide (Cheverst et al. 2000, Abowd 1997, Malaka 1999, Poslad et al. 2001, Zarikas et al. 2001). In this paper we deal only with location.
Let us consider the scenario of a tourist with a mobile tourist guide asking where the nearest restaurant is located. This query must be answered immediately. Otherwise, if the answer is delayed, it may be obsolete because the moving tourist may already be nearer to a different restaurant. So the system must refresh the responses to all pending queries that have been invalidated by context changes. This operation can be repeated several times, depending on the number of users, the frequency of their queries and the context changes. On the other hand, mobile systems still suffer from scarce bandwidth, low-quality communication and frequent network disconnections. All these factors lead to high delays before user queries are satisfied. But this delay will not occur if the answer is already in the client's cache. Caching techniques have proven their usefulness in wired systems. The answer to a query is stored in the cache for future use, and when a user asks the same query it is answered from the cache. However, in location-dependent systems, where the answer to the same query changes if only the user's position is different, and where users rarely return to the same place (for example, a user with a tourist guide who has just visited a museum has a very low chance of returning to it soon), the benefits of caching are not so obvious. But if useful information is transferred to the client before the user requests it, the problem of latency is resolved. Hoarding techniques must predict in advance which information the user will request, and try to transfer as little unusable data as possible so as not to waste the scarce bandwidth and the usually limited memory and storage capacity of the user's device. Tourist guides, nowadays, are very elaborate. They use maps for guided tours, present audio content to allow the user to walk while listening to explanations, provide virtual 3D reconstructions of historical sites and 3D representations of the place where the target requested by the user is located to make its recognition easier, and offer live shows of a tour in a hotel. So the amount of multimedia data dealt with is really huge. This makes appropriate cache invalidation schemes necessary for freeing space on the user's device. In this paper, we present a hoarding technique particularly adapted to location-dependent systems managing an important amount of data (called spatial hoarding). We use the user's location both as a prediction criterion and as a cache invalidation criterion. The rest of the paper is organised as follows. In section 2 we begin with the related work. In section 3 we give a description of our method, define the client's capability and propose to operate in disconnected mode to save power. In section 4, we present two algorithms necessary for implementing the SH strategy, and discuss how data must be organised for the
determination of the information that must be hoarded. Section 5 gives an overview of our future work. Finally, section 6 concludes the paper.
2 Related Work
Caching is the operation of storing information in the user's device after it has been sent from the server. This allows future accesses to this information to be satisfied by the client. Invalidation schemes are used for maintaining consistency between the client cache and the server. Invalid data in the cache may be dropped to free memory for more accurate data. In location-dependent systems a cached data value becomes invalid when this data is updated in the server (temporal-dependent invalidation) or when the user moves to a new location (location-dependent invalidation). The team of the Hong Kong University of Science and Technology (Xu et al. 1999, Zheng et al. 2002) investigated the integration of temporal-dependent and location-dependent updates. They assume the geographical coverage area partitioned into service areas, and define the valid scope of an item as the set of service areas where the item value is valid. To every data item sent from the server is attached its valid scope, so a data item becomes invalid when the user enters a service area not belonging to its valid scope. However, as noted by Kubach and Rothermel (2001), caching never speeds up the first access to an information item, and caching location-dependent data is not beneficial if the user does not return frequently to previously visited locations. Their simulation results have proven that hoarding gives better results than caching in location-dependent systems, even though it is assumed that the memory available for caching is unlimited and no information is removed from the cache. Hoarding is the process of predicting the information that the user will request, in order to transfer it in advance to the client cache. The future user's query will then be satisfied by the client, even if the response contains a data item that has never been requested before. Several hoarding techniques have been proposed in the literature. The first proposed methods required user intervention, making the system less convenient, and are useless in systems where the user does not know in advance what kind of information he will need. Automated hoarding is the process of predicting the hoard set without user intervention. Kuenning and Popek (1997) propose an automated hoarding method where a measure called semantic distance between files is used to feed a clustering algorithm that selects the files that should be hoarded. Saygin et al. (2000) propose another method based on data mining techniques. The
latter uses association rules for determining the information that should be hoarded. Khushraj et al. (2002) propose a hoarding and reintegration strategy that replaces whole-file transfers between the client and the server by transfers of only the changes between their different versions. These changes are patched to the old version once in the destination machine. This incremental hoarding and reintegration mechanism is built within the Coda file system (Coda Group), based on the Revision Control System (RCS) (Tichy 1985). Cao (2002) proposes a method that allows a compromise to be made between hoarding and available power resources. None of these methods deals with the spatial property of location-dependent systems. De Nitto et al. (1998) propose a model to evaluate the effectiveness of hoarding strategies for context-aware systems, based on cost measures. They apply these measures to some idealised motion models. For the two-dimensional model, the area is divided into adjacent polygons. The distance between two polygons is the number of polygons that must be traversed to pass from the first polygon to the other one. The ring k is defined as the set of all polygons whose distance from a given polygon is equal to k. All the information associated with rings 0, 1, ..., k around the starting position is hoarded. The next hoard does not occur until the user enters a polygon outside the circle k. One drawback of this strategy is that a lot of unnecessary information is hoarded because the user's direction is not taken into account (all the information associated with the area behind the user is useless). Another drawback is that the hoard does not begin until the user is out of the hoarded circle, so hoard misses will occur until the information related to the new user's circle is hoarded. Kubach and Rothermel (2001) propose a hoarding mechanism for location-dependent systems based on infostations (Badrinath et al. 1996). When the user is near an infostation, the information related to its area is hoarded. An access probability table is maintained where each data item is associated with the average probability that it will be requested. Only a fixed number of the data items with the highest probabilities are hoarded, for the purpose of not wasting bandwidth and the user's device resources. No explanation is given about what a data item is. We find that a fixed number m is not a good criterion for this purpose in a realistic system because, whatever data items may be (files, tables, fields in a table, web pages, real-world entities, ...), there will always be differences in the memory required for them, so a fixed number of data items can always exceed the space reserved for them. When no assumptions can be made about the future user's movement, all the information related to the infostation area is hoarded. So this mechanism is not adapted to systems managing a huge amount of data.
3 Proposed Method: Spatial Hoarding
3.1 Method Description
In pervasive location-dependent systems, the majority of the user's queries are related to the area where he is. The problem of latency is crucial for local queries (Zheng et al. 2002), whereas for non-local queries user movement for a short time does not invalidate the response. So our mechanism must hoard the information related to the current position of the user. As mobile devices usually suffer from limited storage and memory capacities, we cannot hoard the information associated with a big area. We consider the space divided into adjacent squares. We use the Peano encoding with N-ordering (Laurini and Thompson 1992, Mokbel et al. 2003) for its known advantage of stability (we can add another area to the area covered by the system without affecting the encoding), its usefulness for indexing spatial databases, and because it allows the use of the Peano algebra for solving spatial queries. In the following we will call a square of length 2 a square and a square of length 1 a sub-square (see Fig. 1). The curve in the figure shows the path of the user. The purpose of the method is to hoard the information that the user will most probably access. The user, after leaving the square in which he is located, will be in one of the eight adjacent squares. But hoarding all eight squares would waste resources on unnecessary information because the user's direction is not taken into account. For the purpose of exploiting the user's orientation, we choose to make the hoarding decision when the user enters a new sub-square. This way, if the user leaves his current square, he will be in one of the three squares adjacent to his sub-square. Thus, dividing squares into sub-squares and making the hoarding decision at the sub-square boundaries allows us to take the user's direction into account and to restrict the number of squares to hoard from eight to only three. Usually, one or two of the adjacent squares are already in the client and only the remaining one or two squares will be hoarded. As discussed above, a cache invalidation scheme is also necessary for freeing memory for more accurate data. The user's location is also used as an invalidation criterion, and we invalidate all the squares that are not adjacent to the user's square. We summarise this as follows:
- When the user enters a new sub-square, its three adjacent squares must be hoarded.
- The information located in the squares not adjacent to the user's square must be dropped.
As the spatial property is used both as a hoarding and a cache invalidation criterion, we call this method “Spatial Hoarding” (SH).
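The Peano keys with N-ordering used here are obtained by interleaving the bits of the cell coordinates. The sketch below is our illustration of that encoding (with the x bit placed above the y bit at each level, which matches the adjacency cases of Algorithm 2 in Section 4.1); the function names peano_key and peano_xy are not from the paper.

def peano_key(x, y, bits=16):
    """Interleave the bits of (x, y) into an N-ordered Peano key (x bit above y bit per level)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i + 1)
        key |= ((y >> i) & 1) << (2 * i)
    return key

def peano_xy(key, bits=16):
    """Inverse of peano_key: recover the (x, y) cell coordinates from a key."""
    x = y = 0
    for i in range(bits):
        x |= ((key >> (2 * i + 1)) & 1) << i
        y |= ((key >> (2 * i)) & 1) << i
    return x, y

# Sub-square 14 of Fig. 1 sits at column 3, row 2; its square is key 14 - (14 % 4) = 12.
assert peano_key(3, 2) == 14 and peano_xy(14) == (3, 2)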
[Fig. 1 graphic: a grid of sub-squares labelled with their Peano keys, the user's route and current location, and the restaurants A and B.]
Fig. 1 User's route in an area divided into adjacent Peano N-ordered squares
Table 1 summarises how the Spatial Hoarding (SH) method is applied to the user's path portion of Figure 1. When the user is in the sub-square 36, the squares 8, 12, 32, 36 are available in the client's cache. When he enters the sub-square 14, the hoarding and the cache invalidation criteria are checked to decide which squares to hoard and which ones to delete. The three adjacent squares of the sub-square 14 are 8, 32 and 36, which are already in the client's cache, so no hoard is needed. For the cache invalidation criterion, there are no available squares non-adjacent to the square 12, so no deletion is needed. Then the user moves to the sub-square 12, which is adjacent to the squares 0, 4 and 8. Squares 0 and 4 are not available in the client's cache, so they will be hoarded. For the cache invalidation criterion, here again there are no available squares non-adjacent to the square 12. The cache invalidation criterion is not satisfied until the user reaches the sub-square 7. 7 is a sub-square of square 4. The adjacent squares of square 4 are 0, 8, 12, 16, and 24. As squares 32 and 36 are in the client's cache and are not adjacent to the square 4, they are invalidated by the cache invalidation criterion and are dropped. We give an algorithm for the SH method in section 4.
Table 1. The Spatial Hoarding method applied to the user's path of Fig. 1

Sub-square | Available squares             | Squares to hoard | Squares to drop
36         | 8, 12, 32, 36                 | -                | -
14         | 8, 12, 32, 36                 | -                | -
12         | 8, 12, 32, 36                 | 0, 4             | -
13         | 0, 4, 8, 12, 32, 36           | 16, 24           | -
7          | 0, 4, 8, 12, 16, 24, 32, 36   | -                | 32, 36
18         | 0, 4, 8, 12, 16, 24           | -                | 0, 8
19         | 4, 12, 16, 24                 | 20, 28           | -
25         | 4, 12, 16, 20, 24, 28         | -                | -
27         | 4, 12, 16, 20, 24, 28         | 48, 52           | -
49         | 4, 12, 16, 20, 24, 28, 48, 52 | -                | 4, 16, 20
51         | 12, 24, 28, 48, 52            | 56, 60           | -
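Table 1 can be reproduced mechanically. The sketch below simulates the two SH criteria along the route of Fig. 1 (the first table row is the given starting state), using the bit-interleaved Peano keys assumed earlier; it is our reconstruction for illustration, and all function names are ours.

def key(x, y):
    """N-ordered Peano key of a cell (x bit above y bit at each level)."""
    k = 0
    for i in range(8):
        k |= ((x >> i) & 1) << (2 * i + 1) | ((y >> i) & 1) << (2 * i)
    return k

def xy(k):
    return (sum(((k >> (2 * i + 1)) & 1) << i for i in range(8)),
            sum(((k >> (2 * i)) & 1) << i for i in range(8)))

def squares_adjacent_to_subsquare(p):
    """The three squares adjacent to sub-square p that lie outside p's own square."""
    x, y = xy(p)
    dx = -1 if x % 2 == 0 else 1
    dy = -1 if y % 2 == 0 else 1
    X, Y = x // 2, y // 2
    cand = [(X + dx, Y), (X, Y + dy), (X + dx, Y + dy)]
    return {key(2 * cx, 2 * cy) for cx, cy in cand if cx >= 0 and cy >= 0}

def squares_adjacent_to_square(s):
    """The (up to eight) squares surrounding the square with key s."""
    x, y = xy(s)
    X, Y = x // 2, y // 2
    return {key(2 * (X + i), 2 * (Y + j)) for i in (-1, 0, 1) for j in (-1, 0, 1)
            if (i, j) != (0, 0) and X + i >= 0 and Y + j >= 0}

available = {8, 12, 32, 36}
route = [36, 14, 12, 13, 7, 18, 19, 25, 27, 49, 51]
for p in route[1:]:
    hoard = squares_adjacent_to_subsquare(p) - available       # hoarding criterion
    current_square = p - p % 4
    keep = squares_adjacent_to_square(current_square) | {current_square}
    drop = available - keep                                     # cache invalidation criterion
    print(p, sorted(available), sorted(hoard), sorted(drop))    # columns of Table 1
    available = (available | hoard) - drop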
3.2 Client's Capability
Let us consider again the scenario of a tourist looking for the nearest restaurant. In Figure 1, the tourist is in the sub-square number 14; there is a restaurant A in the sub-square 26 and a restaurant B in the sub-square 8. If the client answers the query for the nearest restaurant, the response will be restaurant B, although restaurant A is nearer, because the sub-square 26 is not available in the client. We define the capability of the client as the area where the client is able to give a correct answer to a query of this kind. The client's capability is the area delimited by the circle whose centre is the current user's location and whose radius is the distance between the user and the nearest point of the boundary of the polygon delimiting the squares available on the client. For our example, this nearest boundary point lies on the upper side of the square 12. So, before giving an answer to the user, the client must look for the nearest restaurant within its capability circle. If no restaurant is found, the query must be transferred to the server.
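Under the interpretation above (capability radius = distance from the user to the nearest outer boundary of the hoarded region), the radius can be computed by checking, for every available square, which of its four edges face a square that is not available. The sketch below assumes squares of side 2 addressed by the Peano keys used earlier and is illustrative only.

import math

def square_origin(k):
    """Lower-left corner (in sub-square units) of the square with Peano key k."""
    x = sum(((k >> (2 * i + 1)) & 1) << i for i in range(8))
    y = sum(((k >> (2 * i)) & 1) << i for i in range(8))
    return x, y

def square_key(x, y):
    k = 0
    for i in range(8):
        k |= ((x >> i) & 1) << (2 * i + 1) | ((y >> i) & 1) << (2 * i)
    return k

def dist_to_segment(px, py, ax, ay, bx, by):
    """Distance from point (px, py) to the segment (ax, ay)-(bx, by)."""
    vx, vy = bx - ax, by - ay
    t = max(0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / (vx * vx + vy * vy)))
    return math.hypot(px - (ax + t * vx), py - (ay + t * vy))

def capability_radius(user, available):
    """Distance from the user's position to the nearest boundary edge of the available squares."""
    radius = math.inf
    for k in available:
        x, y = square_origin(k)                       # square covers [x, x+2] x [y, y+2]
        edges = {(-2, 0): ((x, y), (x, y + 2)),       # neighbour offset -> facing edge
                 (2, 0): ((x + 2, y), (x + 2, y + 2)),
                 (0, -2): ((x, y), (x + 2, y)),
                 (0, 2): ((x, y + 2), (x + 2, y + 2))}
        for (ox, oy), (a, b) in edges.items():
            nx, ny = x + ox, y + oy
            if nx < 0 or ny < 0 or square_key(nx, ny) not in available:
                radius = min(radius, dist_to_segment(*user, *a, *b))
    return radius

# User in sub-square 14 with the squares of the first row of Table 1 available.
print(capability_radius((3.5, 2.5), {8, 12, 32, 36}))   # 1.5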
3.3 Operating in Disconnected Mode
Another limitation of mobile devices is their low power capacity. The SH method applied to the example of Figure 1 shows that hoarding is needed in only 5 of the 11 sub-squares traversed. We can exploit this by allowing the user to operate in doze or disconnected mode to save power. The application interface must allow the user to know when he can switch to disconnected mode and when he must reconnect.
4 Implementation
4.1 Algorithm
We model the available squares in the client's cache by a linked list whose nodes are squares, each having an attribute storing its corresponding Peano key. Algorithm 1 implements the cache invalidation and the hoarding operations. Algorithm 2 retrieves the 3 adjacent squares of a given sub-square.

Algorithm 1: Application of the cache invalidation and the hoarding criteria
Input: List of the available squares (list), Array of the adjacent squares to the current square (T1[9]), Array of the adjacent squares to the current sub-square (T2[3])
Procedure:
integer i;
/* Cache invalidation criterion: drop cached squares not adjacent to the current square */
temp := list.first;
while ((temp != NULL) and (notin(temp, T1) = true)) do
    list.first := temp.next;
    free(temp);
    temp := list.first;
end while
while (temp != NULL) do
    if (temp.next != NULL) then
        if (notin(temp.next, T1) = true) then
            temp2 := temp.next;
            temp.next := temp2.next;
            if (temp2 = list.last) then list.last := temp; end if
            free(temp2);
        else
            temp := temp.next;
        end if
    else
        temp := temp.next;
    end if
end while
/* Hoarding criterion: fetch the missing squares adjacent to the current sub-square */
for i := 1 to T2.length do
    if (isnotin(T2[i], list) = true) then add(list, T2[i]); end if
end for

The algorithm begins with the application of the cache invalidation criterion. The first element of the list is treated separately because it has no previous node pointing to it. Each node of the list is compared, using the function "notin", with the array T1; if a node (square) does not appear in T1 (that is, the square is not adjacent to the current square), it is removed from the list. Then the hoarding criterion is applied. Each element of T2 (the array of the squares adjacent to the current sub-square) is compared with the list using the function "isnotin". Every element of T2 which does not exist in the list is added to it using the function "add".

Algorithm 2: Determination of the adjacent squares to the current sub-square
Input: Peano key of the current sub-square (P)
Output: The array of the 3 adjacent squares T[3]
Procedure:
integer T[3];
integer A[3][2];
integer x, y, i;
x := get_x(P);
y := get_y(P);
switch (P mod 4)
begin
    case 0 : A[1][1] := x - 1; A[1][2] := y;
             A[2][1] := x - 1; A[2][2] := y - 1;
             A[3][1] := x;     A[3][2] := y - 1;
             break;
    case 1 : A[1][1] := x;     A[1][2] := y + 1;
             A[2][1] := x - 1; A[2][2] := y + 1;
             A[3][1] := x - 1; A[3][2] := y;
             break;
    case 2 : A[1][1] := x;     A[1][2] := y - 1;
             A[2][1] := x + 1; A[2][2] := y - 1;
             A[3][1] := x + 1; A[3][2] := y;
             break;
    case 3 : A[1][1] := x;     A[1][2] := y + 1;
             A[2][1] := x + 1; A[2][2] := y + 1;
             A[3][1] := x + 1; A[3][2] := y;
             break;
end
for i := 1 to 3 do
    T[i] := get_p(A[i][1], A[i][2]);
    T[i] := T[i] - (T[i] mod 4);
end for
output T
First, the coordinates x and y are derived from the Peano key P using the functions "get_x" and "get_y". Then the position of the current sub-square within its parent square is determined (P mod 4), because the determination of the adjacent sub-squares depends on it. After determining the coordinates of the adjacent sub-squares, the corresponding Peano keys are derived using the function "get_p". Finally, the key of the bottom-left sub-square of each containing square is derived (by subtracting T[i] mod 4), because this key, together with the side length 2, identifies the square. We do not give the algorithm for determining the squares adjacent to the current square, because it is quite similar to Algorithm 2.
4.2 What to Hoard
Location-dependent systems such as tourist guides, as noted before, can use different kinds of data (text, graphics, maps, images, audio clips, film clips, ...). So the amount of multimedia data managed is very important.
We have proposed to divide the area covered by the system into adjacent squares because hoarding the information related to the whole area cannot be implemented in a real system, owing to the limited memory and storage capacity of the user's device. Depending on the user's device resources and the available bandwidth, a fixed amount of space A will be reserved for the client's cache. As the maximal number of squares that can be available in the client's cache is 9, the amount of hoarded data cannot exceed A/9 for each square. As we deal here with systems that manage a huge amount of data, the amount of information related to some squares can exceed the A/9 value. Also, some data items associated with a square may have low probabilities of access, so transferring this data may only waste resources. Kubach and Rothermel (2001) have described how to maintain access probability tables, where each data item is associated with the average probability that it will be requested, and propose to hoard only a fixed number of the first elements. We think that a fixed number of data items is not a criterion that can be implemented in practice, because data items do not require the same space in memory, so hoarding a fixed number of data items can exceed the space reserved for the cache. In the following we determine what we mean by a data item within the SH method, explain how data items can be added and dropped dynamically depending on the amount of space available in the client's cache, and show how to retrieve all the information related to a given square. Laurini and Thompson (1992) have generalised the concept of hypertext to the hyperdocument, where the content portions can always be displayed on a screen or presented via a loudspeaker, and define the hypermap as a multimedia document with geographic-coordinate-based access via mouse clicking or its equivalent. Database entities are represented by graphical means, and clicking a reference word or picture allows the user to go to another node. They present the following relational model:
WEB(Document_node-ID, (Word_locator (To_document_node-ID, Type_of_link)*)*, (From_document_node-ID, Type_of_link)*)
where * indicates nesting (the nested data are in a separate table; however, they may be stored as a long list in the parent table). Type_of_link refers to the nature of the path from one node to another. Word_locator is the word, graph unit, pixel, or other element, in the first element. They also explain how to deal with spatial queries for retrieving hypermap nodes. With Peano relations the solution is:
Document(Document_node-ID, (Peano_key, Side_length)*)
Within the SH method we will consider document nodes as the data items to which the average access probability is associated:

Document(Document_node-ID, P(Document_node-ID), (Peano_key, Side_length)*)

The first operation is to retrieve all the documents related to a candidate square for hoarding, sorted in decreasing order of their average probability of access. Then, the application will use the operating system and DBMS primitives to associate each document with the amount of space it requires in memory. The documents will be added to the hoard set until the maximum value fixed for their square is reached. This value can be exceeded if there is sufficient space on the client. As noted before, the client can later drop the data items with the lowest probabilities if necessary. Every click on a node or document consultation will be recorded in a log file on the client, and will be sent later to the server for updating the average access probability of each document node.
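A minimal sketch of the per-square selection step just described, assuming hypothetical names for the data structures; the authors’ own prototype is not reproduced here.

```python
def select_hoard_set(documents, square_budget):
    """Pick documents for one square, by decreasing access probability.

    documents     -- iterable of (doc_id, access_probability, size_in_bytes)
    square_budget -- at most A/9, the share of the cache reserved per square
    """
    hoard_set, used = [], 0
    # Highest average access probability first, as in the SH method.
    for doc_id, prob, size in sorted(documents, key=lambda d: d[1], reverse=True):
        if used + size > square_budget:
            break                      # stop once the square's share is full
        hoard_set.append(doc_id)
        used += size
    return hoard_set

# Example with a 2 MB share per square.
docs = [("d1", 0.40, 800_000), ("d2", 0.35, 900_000), ("d3", 0.10, 700_000)]
print(select_hoard_set(docs, 2_000_000))   # ['d1', 'd2']
```

The refinement mentioned above — exceeding a square’s share while the client still has free space, and dropping low-probability items later — would wrap this routine with a global space check.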
5 Future Work

We are in the final stages of the development of a simulation prototype for the Spatial Hoarding method. Our preliminary simulation results show that the SH policy improves the cache hit ratio and significantly reduces the query latency. When there is a large number of users in a given area, the information related to the same square may be hoarded several times for different clients. In our future work we will focus on the issue of saving bandwidth in the case of multiple users.
6 Conclusion

We have presented an innovative policy for resolving the problem of latency in location-dependent systems. Our mechanism makes no assumptions about the user’s future movements and thus deals with the complexity of real-world applications. We proposed solutions to all the problems related to the method’s implementation in an elaborate spatial information system managing multimedia information. Our hoarding mechanism improves the cache hit ratio, thus reducing the uplink requests, and reduces the query latency. Compared to previous hoarding mechanisms, whose main aim was to allow disconnected information access while putting the user at risk of accessing obsolete informa-
tion, our method allows the user to have access to the most recent data. Using the user’s position as a cache invalidation criterion reduces the need for extra communication between the client and the server for checking cache consistency.
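As an illustration of position-based invalidation (and of the Peano-key manipulation of Sect. 4.1), the following hedged sketch keeps only the current square and its eight neighbours in the cache; get_p and get_xy are our own bit-interleaving helpers, not the functions used by the authors.

```python
def get_p(x, y, bits=16):
    """Interleave the bits of (x, y) into a single Peano/Morton key."""
    p = 0
    for i in range(bits):
        p |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return p

def get_xy(p, bits=16):
    """Recover (x, y) from a Peano/Morton key."""
    x = y = 0
    for i in range(bits):
        x |= ((p >> (2 * i)) & 1) << i
        y |= ((p >> (2 * i + 1)) & 1) << i
    return x, y

def invalidate(cache, current_key):
    """Drop every cached square that is not adjacent to the user's square."""
    cx, cy = get_xy(current_key)
    keep = {get_p(cx + dx, cy + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if cx + dx >= 0 and cy + dy >= 0}
    return {key: data for key, data in cache.items() if key in keep}
```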
References

Badrinath, B.R., Imielinski, T., Frenkiel, R. and Goodman, D., 1996, Nimble: Many-time, many-where communication support for information systems in highly mobile and wireless environments, http://www.cs.rutgers.edu/~badri/dataman/nimble/.
Cao, G., 2002, Proactive power-aware cache management for mobile computing systems. IEEE Transactions on Computers, 51, 6, 608-621.
Cheverst, K., Davies, N., Mitchell, K., Friday, A. and Efstratiou, C., 2000, Developing a context-aware electronic tourist guide: some issues and experiences. In Proceedings of CHI’2000, Netherlands, pp. 17-24.
The Coda Group, Coda file system, http://www.coda.cs.cmu.edu/.
De Nitto, V.P., Grassi, V., Morlupi, A., 1998, Modeling and evaluation of prefetching policies for context-aware information services. In Proceedings of the 4th Annual International Conference on Mobile Computing and Networking (Dallas, Texas, USA), pp. 55-64.
Khushraj, A., Helal, A., Zhang, J., 2002, Incremental hoarding and reintegration in mobile environments. In Proceedings of the International Symposium on Applications and the Internet (Nara, Japan).
Kubach, U., and Rothermel, K., 2001, Exploiting location information for infostation-based hoarding. In Proceedings of the 7th International Conference on Mobile Computing and Networking (New York, ACM Press), pp. 15-27.
Kuenning, G.H., and Popek, G.J., 1997, Automated hoarding for mobile computers. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (St. Malo, France), pp. 264-275.
Laurini, R., and Thompson, A.D., 1992, Fundamentals of Spatial Information Systems (A.P.I.C. Series, Academic Press, New York, NY).
Lee, D.L., Lee, W.C., Xu, J., and Zheng, B., 2002, Data management in location-dependent information services. IEEE Pervasive Computing, 1, 3, 65-72.
Abowd, G.D., Atkeson, C.G., Hong, J., Long, S., Kooper, R., Pinkerton, M., 1997, Cyberguide: a mobile context-aware tour guide. Wireless Networks, 3, 5, pp. 421-433.
Malaka, R., 1999, Deep Map: the multilingual tourist guide. In Proceedings of the C-STAR workshop.
Mokbel, M.F., Aref, W.G., Kamel, I., 2003, Analysis of multi-dimensional space-filling curves. GeoInformatica, 7(3), 179-209.
Poslad, S., Laamanen, H., Malaka, R., Nick, A., Buckle, P., and Zipf, A., 2001, CRUMPET: Creation of user-friendly mobile services personalised for tourism. In Second International Conference on 3G Mobile Communication Technologies (London, UK), pp. 28-32.
Saygin, Y., Ulusoy, Ö., and Elmagarmid, A.K., 2000, Association rules for supporting hoarding in mobile computing environments. In Proceedings of the 10th International Workshop on Research Issues in Data Engineering (IEEE Computer Society Press).
Tichy, W.F., 1985, RCS - A system for version control. Software - Practice and Experience, 15, 7, pp. 637-654.
Xu, J., Tang, X., Lee, D.L., and Hu, Q., 1999, Cache coherency in location-dependent information services for mobile environments. In Proceedings of the 1st International Conference on Mobile Data Access (Springer, Heidelberg, Germany), pp. 182-193.
Zarikas, V., Papatzanis, G., and Stephanidis, C., 2001, An architecture for a self-adapting information system for tourists. In Proceedings of the 2001 Workshop on Multiple User Interfaces over the Internet, http://cs.concordia.ca/~seffah/ihm2001/papers/zarikas.pdf.
Zheng, B., Xu, J., and Lee, D.L., 2002, Cache invalidation and replacement strategies for location-dependent data in mobile environments. IEEE Transactions on Computers, 51, 10, pp. 1141-1153.
Distributed Ranking Methods for Geographic Information Retrieval Marc van Kreveld, Iris Reinbacher, Avi Arampatzis, and Roelof van Zwol Institute of Information and Computing Sciences, Utrecht University P.O.Box 80.089, 3508 TB Utrecht, The Netherlands
[email protected],
[email protected],
[email protected],
[email protected] Summary. Geographic Information Retrieval is concerned with retrieving documents that are related to some location. This paper addresses the ranking of documents by both textual and spatial relevance. To this end, we introduce distributed ranking, where similar documents are ranked spread in the list instead of consecutively. The effect of this is that documents close together in the ranked list have less redundant information. We present various ranking methods, efficient algorithms to implement them, and experiments to show the outcome of the methods.
1 Introduction The most common way to return a set of documents obtained from a Web query is by a ranked list. The search engine attempts to determine which document seems to be the most relevant to the user and will put it first in the list. In short, every document receives a score, or distance to the query, and the returned documents are sorted by this score or distance. There are situations where the sorting by score may not be the most useful one. When a more complex query is done, composed of more than one query term or aspect, documents can also be returned with two or more scores instead of one. This is particularly useful in geographic information retrieval (Jones et al. 2002, Rauch et al. 2003, Visser et al. 2002). For example, the Web search could be for castles in the neighborhood of Koblenz, and the documents returned ideally have a score for the query term “castle” and a score for the proximity to Koblenz. This implies that a Web document resulting from this query can be mapped to a point in the 2-dimensional plane. A cluster of points in this plane could be several documents about the same castle. If this castle is in the immediate vicinity of Koblenz, all of these documents would be ranked high, provided that they also have a high score on the term “castle”. However, the user probably also wants documents about other castles that may be a bit further away, especially when these documents
This research is supported by the EU-IST Project No. IST-2001-35047 (SPIRIT).
are more relevant for the term “castle”. To incorporate this idea in the ranking, we introduce distributed ranking in this paper. We present various models that generate ranked lists that have diversity. We also present efficient algorithms that compute the distributed rankings. To keep server load low, it is important to have efficient algorithms. There are several reasons to rank documents according to more than one score. For example we could distinguish between the scores of two textual terms, or a textual term and metadata information, or a textual term and a spatial term, and so on. A common example of metadata for a document is the number of hyperlinks that link to that document; a document is probably more relevant if there are many links to it. In all of these cases we get two scores which need to be combined for a ranking. In traditional information retrieval, the two scores of each document would be combined into a single score (e.g., by a weighted sum or product) which produces the ranked list by sorting. Besides the problem that it is unclear how the two scores should be combined, it also makes a distributed ranking impossible. Two documents with the same combined score could be similar documents or quite different. If two documents have two scores that are the same, one has more reason to suspect that the documents themselves are similar than when two documents have one (combined) score that is the same. The topic of geographic information retrieval is studied in the SPIRIT project (Jones et al. 2002). The idea is to build a search engine that has spatial intelligence because it will understand spatial relationships like close to, to the North of, adjacent to, and inside, for example. The core search engine will process a user query in such a way that both the textual relevance and the spatial relevance of a document is obtained in a score. This is possible because the search engine will not only have a term index, but also a spatial index. These two indices provide the two scores that are needed to obtain a distributed ranking. The ranking study presented here will form part of the geographic search engine to be developed for the SPIRIT project. Related research has been conducted in (Rauch et al. 2003), which focuses on disambiguating geographic terms of a user query. The disambiguation of the geographic location is done by combining textual information, spatial patterns of other geographic references, relative geographic references from the document itself, and population heuristics from a gazetteer. This gives the final value for geoconfidence. The georelevance is composed of the geoconfidence and the emphasis of the place name in the document. The textual relevance of a document is computed as usual in information retrieval. Once both textual and geographic relevance are computed, they are combined by a weighted sum. Finding relevant information and at the same time trying to avoid redundancy has so far mainly been addressed in producing summaries of one or more documents. (Carbonell and Goldstein 1998) uses the maximal marginal relevance (MMR), which is a linear combination of the relevance of the document to the user query and its independence of already selected documents.
MMR is used for the reordering of documents. A user study has been performed in which the users preferred MMR to the usual ranking of documents. That paper contains no algorithm for actually (efficiently) computing the MMR. Following up on this, the Novelty Track of TREC (Harman 2002) discusses experiments with ranking textual documents such that every next document adds as much additional information as possible. (Goldstein et al. 1999) proposes another scoring function for summarizing text documents. Every sentence is assigned a score based on the occurrence of statistical features and on the occurrence of linguistic features, which are combined linearly with a weighting function. In (Goldstein et al. 2000), MMR is refined and used to summarize multiple documents; individual passages or sentences are assigned a score instead of full documents. The remainder of this paper is organized as follows. In Section 2 we present several different ranking methods and the algorithms to compute them. In Section 3 we show how the ranking methods behave on real-world data. In the conclusions we mention other research and experiments that we have done or are planning to do.
2 Distributed Ranking Methods In this section we will present specific ranking methods. Like in traditional information retrieval, we want the most relevant documents to appear in the ranking, while avoiding that documents with similar information appear close to documents already ranked. We will focus on the two dimensional case only, although in principle the idea and formulas apply in higher dimensions too. We assume that a Web query has been conducted and a number of relevant documents were found. Each document is associated with two scores, for example a textual score and a spatial score (which is the case in the SPIRIT search engine). The relevant documents are mapped to points in the plane, and the query is also mapped to a point. We perform the mapping in such a way that the query is a point Q at the origin, and the documents are mapped to points p1 , . . . , pn in the upper right quadrant, where documents with high scores are points close to Q. We can now formulate the two main objectives for our ranking procedure: 1. Proximity to query: Points close to the query Q are favored. 2. Spreading: Points farther away from already ranked points are favored. A ranking that simply sorts all points in the representation plane by distance to Q is optimal with respect to the first objective. However, it can perform badly with respect to the second. Selecting a highly distributed subset of points is good with respect to the second objective, but the ranked list would contain too many documents with little relevance early in the list. We therefore seek a compromise where both criteria are considered simultaneously. Note
that the use of a weighted sum to combine the two scores into one makes it impossible to obtain a spreaded ranking. The point with the smallest Euclidean distance to the query is considered the most relevant document and is always first in any ranking. The remaining points are ranked with respect to already ranked points. At any moment during the ranking, we have a subset R ⊂ P of points that have already been ranked, and a subset U ⊂ P of points that are not ranked yet. We choose from U the “best” point to rank next, where “best” is determined by a scoring function that depends on both the distance to the query Q and the set R of ranked points. Intuitively, an unranked point has a higher added value or relevance if it is not close to any ranked points. For every unranked point p,
Fig. 1. An unranked point p amidst ranked points p1 , p2 , p3 , pi , where p is closest to pi by distance and by angle.
we consider only the closest point pi ∈ R, where closeness is measured either in the Euclidean sense, or by angle with respect to the query point Q. This is illustrated by p − pi and φ, respectively, in Figure 1. Using the angle to evaluate the similarity of p and pi seems less precise than using the Euclidean distance, but it allows more efficient algorithms, and certain extensions of angle-based ranking methods give well-distributed results.
2.1 Distance to query and angle to ranked Our first ranking method uses the angle measure to obtain the similarity between an unranked point and a ranked point. In the triangle pQpi (see Figure 1) consider the angle φ = φ(p, pi ) and rank according to the score S(p, R) ∈ [0, 1], which can be derived from the following normalized equation:
S(p, R) = min_{pi ∈ R} (1/(1 + ‖p‖))^k · 2(φ(p, pi) + c)/(π + 2c)    (1)
Here, k denotes a constant; if k < 1, the emphasis lies on the distribution, if k > 1, we assign a bigger weight to the proximity to the query. The additive constant 0 < c ≪ 1 ensures that all unranked points p ∈ U are assigned an angle-dependent factor greater than 0. The score S(p, R) necessarily lies between 0 and 1, and is appropriate if we do not have a natural upper bound on the maximum distance of unranked points to the query. If such an upper bound were available, there are other formulas that give normalized scores. During the ranking algorithm, we always choose the unranked point p that has the highest S(p, R) score and rank it next. This implies an addition to the set R, and hence, recomputation of the scores of unranked points may be necessary. We first give a generic algorithm with a running time of O(n²).

Algorithm 1:
Input: A set P with n points in the plane.
1. Rank the point r closest to the query Q first. Add it to R and delete it from P.
2. For every unranked point p ∈ P do
   a) Store with p the point r ∈ R with the smallest angle to p
   b) Compute the score S(p, R) = S(p, r)
3. Determine and choose the point with the highest score S(p, R) to be next in the ranking; add it to R and delete it from P.
4. Compute for every point p′ ∈ P the angle to the last ranked point p. If it is smaller than the angle to the point stored with p′, then store p with p′ and update the score S(p′, R).
5. Continue with step 3 as long as there are unranked points.

The first four steps all take linear time. As we need to repeat steps 3 and 4 until all points are ranked, the overall running time of this algorithm is O(n²). It is a simple algorithm, and can be modified to work for different score and distance functions. In fact, it can be applied to all the ranking models that will follow.

Theorem 1. A set of n points in the plane can be ranked according to the distance-to-query and angle-to-ranked model in O(n²) time.

If we are only interested in the top 10 documents of the ranking, we only need O(n) time for the computation. More generally, the top t documents are determined in O(tn) time.
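A direct transcription of Algorithm 1 with the score of Equation (1), written in Python for concreteness; the function and variable names are ours, and points are assumed to be (x, y) pairs in the upper right quadrant with the query Q at the origin.

```python
import math

def angle(p, r):
    """The angle phi(p, r) at the query Q (the origin) between p and r."""
    return abs(math.atan2(p[1], p[0]) - math.atan2(r[1], r[0]))

def score(p, r, k=1.0, c=0.1):
    """Equation (1), evaluated for an unranked point p and one ranked point r."""
    return (1.0 / (1.0 + math.hypot(*p))) ** k \
        * 2.0 * (angle(p, r) + c) / (math.pi + 2.0 * c)

def distributed_rank(points, k=1.0, c=0.1):
    unranked = [tuple(p) for p in points]
    first = min(unranked, key=lambda p: math.hypot(*p))            # step 1
    ranked = [first]
    unranked.remove(first)
    closest = {p: first for p in unranked}                         # step 2
    while unranked:
        nxt = max(unranked, key=lambda p: score(p, closest[p], k, c))  # step 3
        ranked.append(nxt)
        unranked.remove(nxt)
        del closest[nxt]
        for p in unranked:                                         # step 4
            if angle(p, nxt) < angle(p, closest[p]):
                closest[p] = nxt
    return ranked
```

Because the score of Equation (1) grows with the angle φ, keeping for each unranked point the ranked point with the smallest angle is exactly what realises the minimum over pi ∈ R.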
2.2 Distance to query and distance to ranked In the last section we ranked by angle to the closest ranked point. It may be more natural to consider the Euclidean distance to the closest ranked point instead. In the triangle pQpi of Figure 1, take the distance p − pi from p to the closest ranked point pi and rank according to the outcome of the following equation:
S(p, R) = min_{pi ∈ R} ‖p − pi‖ / ‖p‖²    (2)
The denominator needs a squaring of ‖p‖ (or another power > 1) to ensure that documents far from Q do not end up too early in the ranking, which would conflict with the proximity-to-query requirement. A normalized equation such that S(p, R) ∈ [0, 1] is the following:
S(p, R) = min_{pi ∈ R} (1 − e^{−λ·‖p − pi‖}) · 1/(1 + ‖p‖)    (3)
Here, λ is a constant that defines the slope of the exponential function. Algorithm 1 can be modified to work here as well, with a running time of O(n²).

Theorem 2. A set of n points in the plane can be ranked according to the distance-to-query and distance-to-ranked model in O(n²) time.
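The scores of Equations (2) and (3) can be used as drop-in replacements for score() in the sketch above, provided the closest ranked point is then tracked by Euclidean distance rather than by angle; again a sketch, not the SPIRIT implementation.

```python
import math

def score_eq2(p, r):
    """Equation (2): distance to the closest ranked point over the squared
    distance to the query Q at the origin."""
    return math.dist(p, r) / (math.hypot(*p) ** 2)

def score_eq3(p, r, lam=0.05):
    """Equation (3): the normalised variant with an exponential fall-off."""
    return (1.0 - math.exp(-lam * math.dist(p, r))) / (1.0 + math.hypot(*p))
```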
2.3 Addition models So far, our distributed methods were all based on a formula that divided angle or distance to the closest ranked point by the distance to the query. In this way, points closer to the query get a higher relevance. We can obtain a similar effect but a different ranking by adding up these values. It is not clear beforehand which model will be more satisfactory for users, so we analyze these models as well.
S(p, R) = min_{pi ∈ R} [ α·(1 − e^{−λ·(‖p‖/‖pmax‖)}) + (1 − α)·φ(p, pi)·2/π ]    (4)
In this equation, pmax is the point with maximum distance to the query, α ∈ [0, 1] denotes a variable which is used to put an emphasis on either distance or angle, and λ is a constant that defines the base e^λ of the exponential function. Algorithm 1 can be modified for the addition models, but as the angle φ(p, pi) is an additive and not a multiplicative part of the score equation, we can give algorithms with better running time. The point set is initially stored in the leaves of a binary tree T, sorted by counterclockwise (ccw) angle to the y-axis. In every leaf of the tree we also store: (i) the ccw and clockwise (cw) angle to the y- and x-axis, respectively; (ii) the distance to the query; (iii) the ccw and cw score. We augment T as follows (see e.g. (Cormen et al. 1990) for augmenting data structures): in every internal node we store the best cw and ccw score per subtree. Later in the algorithm, we additionally store the angle of the closest ccw and cw ranked point, and whether the closest ranked point is in cw or ccw direction, in the root of each subtree. Furthermore, we store the best score per tree in a heap for quicker localization. As shown on the left in Figure 2, between two already ranked points p1 and p2, indicated by ℓ1 and ℓ2, there are two binary trees, T1 cw and T2 ccw of the bisecting barrier line ℓ12. All the points in T1 are closer in angle to p1 and all the points in T2 are closer in angle to p2. If we insert a new point p3 into the ranking, this means we insert a new imaginary line ℓ3 through p3 and we need to perform the following operations on the trees:
Fig. 2. The split and concatenate of trees in Algorithm 2.
1. Split T1 and T2 at the angle bisectors ℓ32 and ℓ13, creating the new trees T′cw and T′ccw and two intermediate trees T̄cw and T̄ccw.
2. Concatenate the intermediate trees from (1), creating one tree T̄.
3. Split T̄ at the newly ranked point p3, creating T″cw and T″ccw.
Figure 2, right, shows the outcome of these operations. Whenever we split or concatenate the binary trees we need to make sure that the augmentation remains correct. In our case, this is no problem, as we only store the best initial scores in the inner leaves. However, we need to update the information in the root of each tree about the closest cw and ccw ranked point and the best scores. As the scores are additive, and all scores for points in the same tree are calculated with respect to the same ranked point, we simply subtract (1 − α) · φ′ · 2/π, where φ′ denotes the cw (ccw) angle of the closest ranked point, from the cw (ccw) best score to get the new best score for the tree. We also need to update the score information in the heap. Now we can formulate an algorithm for the addition model that runs in O(n log n) time.

Algorithm 2:
Input: A set P with n points in the plane.
1. Create T with all points of P, the augmentation, and a heap that contains only the point p closest to the query Q.
2. Choose the point p with the highest score S(p, R) as next in the ranking by deleting the best one from the heap.
3. For the last ranked point p do:
   a) Split and concatenate the binary trees as described above and update the information in their roots.
   b) Update the best-score information in the heap:
      i. Delete the best score of the old tree T1 or T2 that did not contain p.
      ii. Find the four best scores of the new trees T′cw, T′ccw, T″cw, and T″ccw and insert them in the heap.
4. Continue with step 2.

Theorem 3. A set of n points in the plane can be ranked according to the angle-distance addition model in O(n log n) time.

Another, similar, addition model adds up the distance to the query and the distance to the closest ranked point:
S(p, R) = min_{pi ∈ R} [ α·(1 − e^{−λ1·(‖p‖/‖pmax‖)}) + (1 − α)·(1 − e^{−λ2·‖p − pi‖}) ]    (5)
Again, pmax is the point with maximum distance to the query, α ∈ [0, 1] is a variable used to influence the weight given to the distance to the query (proximity to query) or to the distance to the closest point in the ranking (high spreading), and λ1 and λ2 are constants that define the base of the exponential function. Note that Algorithm 2 is not applicable for this addition model. This is easy to see, since the distance to the closest ranked point does not change by the same amount for a group of points. This implies that the score for every unranked point needs to be adjusted individually when adding a point to R. We can modify Algorithm 1 for this addition model. Alternatively, we can use the following algorithm that has O(n²) running time in the worst case, but a typical running time of O(n log n).

Algorithm 3:
Input: A set P with n points in the plane.
1. Rank the point p closest to the query Q first. Add it to R and delete it from P. Initialize a list with all unranked points.
2. For every newly ranked point p ∈ R do:
   a) Insert it into the Voronoi diagram of R.
   b) Create for the newly created Voronoi cell a list of unranked points that lie in it, by taking those points that have p as closest ranked point from the lists of the neighboring cells. For all ranked points R′ ⊆ R whose Voronoi cells changed, update their lists of unranked points.
   c) Compute the point with the best score for the newly created Voronoi cell and insert it in a heap H. For all ranked points R′ ⊆ R whose Voronoi cells changed, recompute the best score and update the heap H accordingly.
3. Choose the point with the best overall score from the heap H as next in the ranking; add it to R and delete it from P and H.
4. Continue with step 2.

Since the average degree of a Voronoi cell is six, one can expect that a typical addition of a point p to the ranked points involves a set R′ with six ranked points. If we also assume that, typically, a point in R′ loses a constant fraction of the unranked points in its list, we can prove an O(n log n) time bound for the whole ranking algorithm. The analysis is the same as in (Heckbert and Garland 1995, van Kreveld et al. 1997). The algorithm can be applied to all ranking methods described so far.

Theorem 4. A set of n points in the plane can be ranked by the distance-distance addition model in O(n²) worst case and O(n log n) typical time.
3 Experiments We implemented the generic ranking Algorithm 1 for the basic ranking methods described in Subsections 2.1, 2.2, and 2.3. Furthermore we implemented
an extension called staircase enforcement, explained in Subsection 3.2. We compare the outcomes of these algorithms for two different point sets shown in Figure 3. The point set at the left consists of 20 uniformly distributed points, the point set at the right shows the 15 most relevant points for the query ‘safari africa’ which was performed on a dataset consisting of 6,500 Lonely Planet web pages. The small size of the point sets was chosen out of readability considerations.

Fig. 3. Ranking by distance to origin only.
3.1 Basic ranking algorithms Figure 3 shows the output of a ranking by distance-to-query only. It will function as a reference point for the other rankings. Points close together in space are also close in the ranking. In the other ranking methods, see Figure 4, this is not the case anymore. This is visible in the ‘breaking up’ of the cluster of four points in the upper left corner of the Lonely Planet point set rankings. Note also that the points ranked last by the simple distance ranking are always ranked earlier by the other methods. This is because we enforced higher spreading over proximity to the query by the choice of parameters. The rankings are severely influenced by this choice. In our choice of parameters we did not attempt to obtain a “best” ranking. We used the same parameters in all three ranking methods to simplify qualitative comparison.
3.2 Staircase enforced ranking algorithms

In the staircase enforced methods, shown in Figure 5, the candidates to be ranked next are only those points that lie on the (lower left) staircase of the point set. The scoring functions are as before. A point p is on the staircase of a point set P if and only if for all p′ ∈ P \ {p}, we have p_x < p′_x or p_y < p′_y. So, with this limitation, proximity to the query gets a somewhat higher
importance compared to the basic algorithms, which is clearly visible in the figures, as the points farthest away from the query are almost always ranked last. It appears that staircase enforced methods perform better on distance to query while keeping a good distribution. The staircase enforced rankings can be implemented efficiently by adapting the algorithms we presented before.

Fig. 4. Top: Ranking by distance to origin and angle to closest (k = 1, c = 0.1). Middle: Ranking by distance to origin and distance to closest (Equation 3, λ = 0.05). Bottom: Ranking by additive distance to origin and angle to closest (α = 0.4, λ = 0.05).

Fig. 5. Same as Figure 4, but now staircase enforced.
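For intuition, a naive sketch of the staircase restriction of Subsection 3.2: only points of the current lower-left staircase are candidates, and any of the scoring functions of Section 2 (taking the set of already ranked points as its second argument) can be plugged in. The quadratic-time staircase recomputation is for clarity only; the names are ours.

```python
import math

def staircase(points):
    """Lower-left staircase: the points not dominated in both coordinates."""
    return [p for p in points
            if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)]

def staircase_enforced_rank(points, score):
    """Rank the best-scoring staircase point next; score(p, ranked) may be
    any of the scoring functions of Section 2."""
    remaining = [tuple(p) for p in points]
    first = min(remaining, key=lambda p: math.hypot(*p))   # closest to Q first
    ranked = [first]
    remaining.remove(first)
    while remaining:
        nxt = max(staircase(remaining), key=lambda p: score(p, ranked))
        ranked.append(nxt)
        remaining.remove(nxt)
    return ranked
```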
4 Conclusions This paper introduced distributed relevance ranking for documents that have two scores. It is particularly useful for geographic information retrieval, where documents have both a textual and a spatial score. The concept can easily be extended to more than two scores, although it is not clear how to obtain efficient algorithms that run in subquadratic time. The experiments indicate that both requirements for a good ranking, distance to query and spreading, can be obtained simultaneously. Especially the staircase enforced methods seem to perform well. User evaluation is needed to discover which ranking method is preferred most, and which parameters should be used. We have examined more extensions and performed more experiments than were reported in this paper. For example, we also analyzed the case where the unranked points are only related to the 10 (or any number of) most recently ranked points, to guarantee that similar points are sufficiently far apart in the ranked list. Also for this variation, user evaluation is needed to determine the most preferred methods of ranking.
References

Carbonell, J.G., and Goldstein, J., 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Research and Development in Information Retrieval, pages 335–336.
Cormen, T.H., Leiserson, C.E., and Rivest, R.L., 1990. Introduction to Algorithms. MIT Press, Cambridge, MA.
Goldstein, J., Kantrowitz, M., Mittal, V.O., and Carbonell, J.G., 1999. Summarizing text documents: Sentence selection and evaluation metrics. In Research and Development in Information Retrieval, pages 121–128.
Goldstein, J., Mittal, V.O., Carbonell, J.G., and Callan, J.P., 2000. Creating and evaluating multi-document sentence extract summaries. In Proc. CIKM, pages 165–172.
Harman, D., 2002. Overview of the TREC 2002 novelty track. In NIST Special Publication 500-251: Proc. 11th Text Retrieval Conference (TREC 2002).
Heckbert, P.S., and Garland, M., 1995. Fast polygonal approximation of terrains and height fields. Report CMU-CS-95-181, Carnegie Mellon University.
Jones, C.B., Purves, R., Ruas, A., Sanderson, M., Sester, M., van Kreveld, M.J., and Weibel, R., 2002. Spatial information retrieval and geographical ontologies – an overview of the SPIRIT project. In Proc. 25th Annu. Int. Conf. on Research and Development in Information Retrieval (SIGIR 2002), pages 387–388.
Rauch, E., Bukatin, M., and Baker, K., 2003. A confidence-based framework for disambiguating geographic terms. In Proc. Workshop on the Analysis of Geographic References. http://www.metacarta.com/kornai/NAACL/WS9/Conf/ws917.pdf.
van Kreveld, M., van Oostrum, R., and Snoeyink, J., 1997. Efficient settlement selection for interactive display. In Proc. Auto-Carto 13: ACSM/ASPRS Annual Convention Technical Papers, pages 287–296.
Visser, U., Vögele, T., and Schlieder, C., 2002. Spatio-terminological information retrieval using the BUSTER system. In Proc. of the EnviroInfo, pages 93–100.
Representing Topological Relationships between Complex Regions by F-Histograms Lukasz Wawrzyniak, Pascal Matsakis, and Dennis Nikitenko Department of Computing and Information Science University of Guelph, Guelph, ON, N1G 2W1, Canada {lwawrzyn, matsakis, dnikiten}@cis.uoguelph.ca
Abstract In earlier work, we introduced the notion of the F-histogram and demonstrated that it can be of great use in understanding the spatial organization of regions in images. Moreover, we have recently designed F-histograms coupled with mutually exclusive and collectively exhaustive relations between line segments. These histograms constitute a valuable tool for extracting topological relationship information from 2D concave objects. For any direction in the plane, they define a fuzzy partition of all object pairs, and each class of the partition corresponds to one of the above relations. The present paper continues this line of research. It lays the foundation for generating a linguistic description that captures the essence of the topological relationships between two regions in terms of the thirteen Allen relations. An index to measure the complexity of the relationships in an arbitrary direction is developed, and experiments are performed on real data.
1 Introduction Work in the modeling of topological relationships often relies on an extension into the spatial domain of Allen’s temporal relations (Allen 1983). Although several alternatives and refinements have been proposed, a common procedure is to approximate the geometry of spatial objects by Minimum Bounding Rectangles (Nabil et al. 1995; Sharma and Flewelling 1995). Many authors, e.g., (Goodchild and Gopal 1990), have stressed the need to handle imprecise and uncertain information about spatial data. Qualitative spatial reasoning aims at modeling commonsense knowledge of space. Nevertheless, computational approaches for spatial modeling and reasoning can benefit from more quantitative measures, and the interest of fuzzy approaches has been widely recognized (Dutta 1991; Freeman 1975).
In previous publications, we introduced the notion of the F-histogram (Matsakis 1998; Matsakis and Wendling 1999), a generic quantitative representation of the relative position between two 2D objects. Most work focused on particular F-histograms called force histograms. As demonstrated in (Matsakis 2002), these histograms can be of great use in understanding the spatial organization of regions in images. For instance, they can provide inputs to systems for linguistic scene description (Matsakis et al. 2001). Moreover, we have recently shown (Matsakis and Nikitenko, to appear) that the F-histogram constitutes a valuable tool for extracting topological relationship information from 2D concave objects. The present paper builds both on (Matsakis et al. 2001) and (Matsakis and Nikitenko, to appear). It lays the foundation for generating a linguistic description that captures the essence of the topological relationships between two complex regions in terms of the thirteen Allen relations. The notion of the F-histogram is briefly described in Sect. 2. The way F-histograms can be coupled with Allen relations using fuzzy set theory is examined in Sect. 3. Section 4 describes experiments on real data. It shows that the F-histograms associated with a given pair of objects carry lots of topological relationship information. An index to measure the complexity of the relationships in an arbitrary direction is developed in Sect. 5. This index will play an important role in the generation of linguistic descriptions. Conclusions are given in Sect. 6.
2 F-Histograms

As shown in Fig. 1, the plane reference frame is a positively oriented orthonormal frame (O, i, j). For any real numbers α and v, the vectors i_α and j_α are the respective images of i and j through the α-angle rotation, and Δ_α(v) is the oriented line whose reference frame is defined by i_α and the point of coordinates (0,v) — relative to (O, i_α, j_α). An object is a nonempty bounded set of points, E, equal to its interior closure (in other words, a 2D object that does not include any “grafting,” such as an arc or isolated point), and such that for any α and v the intersection E_α(v) = E ∩ Δ_α(v) is the union of a finite number of mutually disjoint segments. An object may have holes in it and may consist of many connected components. E_α(v) is a longitudinal section of E. Finally, T denotes the set of all triples (α, E_α(v), G_α(v)), where α and v are any real numbers and E and G are any objects. Now, consider two objects A and B (the argument and the referent), a direction θ and some proposition P_AB(θ) like “A is after B in direction θ,” “A overlaps B in direction θ,” or “A surrounds B in direction θ.” We want
to attach a weight to P_AB(θ). To do so, the objects A and B are handled as longitudinal sections.
• For each v, the pair (A_θ(v), B_θ(v)) of longitudinal sections is viewed as an argument put forward to support P_AB(θ).
• A function F from T into IR+ (the set of non-negative real numbers) attaches the weight F(θ, A_θ(v), B_θ(v)) to this argument (A_θ(v), B_θ(v)).
• The total weight F_AB(θ) of the arguments stated in favor of P_AB(θ) is naturally set to (Fig. 2): F_AB(θ) = ∫_{−∞}^{+∞} F(θ, A_θ(v), B_θ(v)) dv.
The function F_AB so defined is called the F-histogram associated with (A,B). It is one possible representation of the position of A with regard to B. F-histograms include f-histograms, which include φ-histograms, which themselves include force histograms (Matsakis 1998; Matsakis and Nikitenko, to appear). Most work has focused on force histograms (Matsakis 2002). Malki et al. (2002), however, use f-histograms to attach weights to the propositions P^r_AB(θ) ≡ “A r B in direction θ,” where r belongs to the set {>, mi, oi, f, d, si, =, s, di, fi, o, m, <}.
Fig. 3. Allen relations (Allen 1983) between two segments of an oriented line. The black segment is the referent, the gray segment is the argument. Two relations r1 and r2 are linked if and only if they are conceptual neighbors, i.e., r1 can be obtained directly from r2 by moving or deforming the segments in a continuous way.
Let r denote an Allen relation, A and B two objects (convex or not), and θ a direction. To attach a weight to the proposition P^r_AB(θ) ≡ “A r B in direction θ,” each pair (A_θ(v), B_θ(v)) of longitudinal sections is viewed as an argument put forward to support P^r_AB(θ) (Sect. 2). A function F_r attaches the weight F_r(θ, A_θ(v), B_θ(v)) to this argument, and the total weight F^r_AB(θ) of the arguments stated in favor of P^r_AB(θ) is set to:

F^r_AB(θ) = ∫_{−∞}^{+∞} F_r(θ, A_θ(v), B_θ(v)) dv.
The question, of course, is how to define F_r. Small changes in the longitudinal sections should not affect F^r_AB(θ) significantly. Fuzzy set theoretic approaches have been widely used to handle imprecision and achieve robustness in spatial analysis. Allen relations are fuzzified in Sect. 3.1 and longitudinal sections in Sect. 3.2. The last section, Sect. 3.3, defines the function F_r.
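For intuition, a crisp (all-or-nothing) classification of the thirteen Allen relations between two segments of an oriented line can be written as below; Sect. 3.1 replaces such sharp tests with fuzzy memberships precisely so that small changes in the longitudinal sections only change the weights slightly. The sketch and its names are ours.

```python
def allen(i, j):
    """Allen relation of segment i (argument) with respect to j (referent).
    Segments are (start, end) pairs with start < end along the oriented line."""
    (a1, a2), (b1, b2) = i, j
    if a1 < b1:
        if a2 < b1:  return '<'      # before
        if a2 == b1: return 'm'      # meets
        if a2 < b2:  return 'o'      # overlaps
        if a2 == b2: return 'fi'     # finished by
        return 'di'                  # contains
    if a1 == b1:
        if a2 < b2:  return 's'      # starts
        if a2 == b2: return '='      # equals
        return 'si'                  # started by
    # a1 > b1
    if a1 > b2:  return '>'          # after
    if a1 == b2: return 'mi'         # met by
    if a2 < b2:  return 'd'          # during
    if a2 == b2: return 'f'          # finishes
    return 'oi'                      # overlapped by

# Example: the argument starts inside the referent and ends after it.
print(allen((2.0, 6.0), (1.0, 4.0)))   # 'oi'
```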
3.1 Fuzzification of Allen Relations
An Allen relation r can be fuzzified in many ways, depending on the intent of the work. For instance, Guesgen (2002) proceeds in a qualitative manner. Here, we proceed in a quantitative manner. The 13 Allen relations are fuzzified as shown in Fig. 4. Each relation, except =, is defined by the min of a few trapezoid membership functions. Let A be the set of all thirteen fuzzy relations. Three properties are worth noticing. First, for any pair (I,J) of segments, we have Σ_{r∈A} r(I,J) = 1, where r(I,J) denotes the degree to which the statement I r J is to be considered true. This, of course, comes from the definition of = (and it can be shown that = takes its values in [0,1]). Second, for any r in A, there exist pairs (I,J) such that r(I,J) = 1. Lastly, for any pair (I,J) and any r1 and r2 in A, if r1(I,J) ≠ 0 and r2(I,J) ≠ 0 then r1 and r2 are direct neighbors in the graph of Fig. 3.

Fig. 4. Fuzzified Allen relations between two segments I and J of an oriented line. Each relation, except =, is defined by the min of a few membership functions (one for <, >, m, mi, o, oi; three for s, si, f, fi; and two for d and di). x is the length of I (the argument), z is the length of J (the referent), a = min(x,z), b = max(x,z) and y is the signed distance from the end of J to the start of I.
3.2 Fuzzification of Longitudinal Sections
The idea is to consider that if two segments are close enough relative to their lengths, then they should be seen, to a certain extent, as a single segment. Let I be the longitudinal section E_θ(v) of some object E. Assume I is not empty. There exists one set {I_i}_{i∈1..n} (and only one) of mutually disjoint segments such that I = ∪_{i∈1..n} I_i. The indexing can be chosen such that, for any i in 1..n−1, the segment I_{i+1} is after I_i in direction θ. Let J_i be the open interval “between” I_i and I_{i+1}. The longitudinal section I is considered a fuzzy set on Δ_θ(v) with membership function μ_I. For any point M on any I_i, the value μ_I(M) is 1. For any point M on any J_i, the value μ_I(M) is α_i — and, initially, α_i = 0. Fuzzification of I proceeds by increasing these membership degrees α_i. An example is presented in Fig. 5. Details can be found in (Matsakis and Nikitenko, to appear).

Fig. 5. Fuzzification of a longitudinal section I. (a) Membership function μ_I before fuzzification. (b) Membership function after fuzzification.

3.3 Coupling F-Histograms with Allen Relations

Consider an Allen relation r and the longitudinal sections A_θ(v) and B_θ(v) of some objects A and B. We are now able to define the value F_r(θ, A_θ(v), B_θ(v))
(see the introductory paragraph of Sect. 3). If AT(v)= or BT(v)= then Fr (T,AT(v),BT(v)) is naturally set to 0. Assume AT(v)z and BT(v)z. Assume r, AT(v) and BT(v) have been fuzzified as described in Sects. 3.1 and 3.2. There exists a tuple (D0,D1,…,Dc) of real numbers such that D0=0 0, N t 0, (controlling maximum range of weight), KS > 0, KH t 1 x For H/(KS*W) = 1, Wflat = 1 (standard form) x For H/(KS*W) > 1, Wflat > 1 (enhancing taller triangles) x For H/(KS*W) < 1, Wflat < 1 (weakening flatter triangles)
§ 2M N · ¸ © M N ¹
KH
x For Hof, Wflat o ¨
§ N · x For Ho0, Wflat o ¨ ¸ ©M N¹
KH
• For M = 1 and N = 0, Wflat ∈ [0, 2^KH)

2.3.2 Low-Pass filter (HF-01)
Wflat = ((4M · arctan(KS·W/H) / π + N) / (M + N))^KH    (4)
This filter is indeed a symmetric form of LF described above. Thus, it tends to eliminate extreme points and achieves the effect of semantic generalisation.
• M > 0, N ≥ 0, KS > 0, KH ≥ 1
• For (KS·W)/H = 1, Wflat = 1 (standard form)
• For (KS·W)/H > 1, Wflat > 1 (enhancing flatter triangles)
• For (KS·W)/H < 1, Wflat < 1 (weakening taller triangles)
• For W → ∞, Wflat → ((2M + N)/(M + N))^KH
• For W → 0, Wflat → (N/(M + N))^KH
• For M = 1 and N = 0, Wflat ∈ [0, 2^KH)

2.4 Skewness and Convexity filters

For triangle T(v0, v1, v2), ML is the distance between v1 and the middle point of edge v0–v2. Consequently, we have the ratio H/ML ∈ [0, 1], which might be used in a skewness filter to retain points with effective triangles close to isosceles:
Wskew = ((SM + H/ML) / (SM + 1))^SK    (SM ≥ 0, SK ≥ 1)    (5)
If we consider a cartographic line as directed, a convexity filter may be defined as:

Wconvex = C (if convex) or 1 (if concave)    (6)
Here C is a positive constant. If C > 1 is used, this filter tends to retain points with convex effective triangles. Otherwise, points with concave effective triangles are retained.
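The sketch below shows how the reconstructed filters of Equations (4)–(6) could drive a Visvalingam–Whyatt style elimination. The reading of W and H as the base length and height of the effective triangle T(v0, v1, v2), the left-turn convention for convexity, and all parameter defaults are our assumptions, not details taken from the paper.

```python
import math

def triangle_parts(v0, v1, v2):
    """Signed cross product, area, base width W, height H and median ML of the
    effective triangle T(v0, v1, v2) with v1 as the candidate vertex."""
    cross = (v1[0]-v0[0])*(v2[1]-v0[1]) - (v1[1]-v0[1])*(v2[0]-v0[0])
    area = abs(cross) / 2.0
    w = math.dist(v0, v2)
    h = 2.0 * area / w if w > 0 else 0.0
    ml = math.dist(v1, ((v0[0]+v2[0]) / 2.0, (v0[1]+v2[1]) / 2.0))
    return cross, area, w, h, ml

def weighted_effective_area(v0, v1, v2, KS=1.0, KH=2.0, M=1, N=0,
                            SM=0.0, SK=1.0, C=1.0):
    cross, area, w, h, ml = triangle_parts(v0, v1, v2)
    if area == 0.0:
        return 0.0
    # Low-pass flatness weight, Equation (4); defaults chosen for illustration only.
    w_flat = ((4.0*M*math.atan(KS*w/h)/math.pi + N) / (M + N)) ** KH if h > 0 else 1.0
    # Skewness weight, Equation (5).
    w_skew = ((SM + h/ml) / (SM + 1.0)) ** SK if ml > 0 else 1.0
    # Convexity weight, Equation (6); "convex" taken here as a left turn.
    w_conv = C if cross > 0 else 1.0
    return area * w_flat * w_skew * w_conv

def generalise(line, keep):
    """Visvalingam-Whyatt style elimination: repeatedly drop the interior
    vertex whose weighted effective area is smallest."""
    pts = [tuple(p) for p in line]
    while len(pts) > max(keep, 2):
        i = min(range(1, len(pts) - 1),
                key=lambda k: weighted_effective_area(pts[k-1], pts[k], pts[k+1]))
        del pts[i]
    return pts
```

Because the weights multiply the plain effective area, setting every filter to its neutral value (Wflat = Wskew = Wconvex = 1) recovers the original Visvalingam–Whyatt ordering.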
3 Experimental Results

3.1 Sample dataset and a web demo for weighted effective area

To evaluate the generalisation effects of the various filters described above, we have used a sample dataset (figure 2) of the coastline of the Isle of Wight, which is extracted from an original Ordnance Survey LandForm PANORAMA dataset. There are five (closed) linear objects and 2376 vertices in total in the dataset, where the largest object contains 2236 vertices. In order to provide a better view of the effects of generalisation based on weighted effective area, we have developed a JAVA applet-based web demonstrator which may be accessed following the link: http://www.cs.cf.ac.uk/user/S.Zhou/WEADemo/
Fig. 2. Sample dataset (Crown Copyright 2002) and the web demo
This demonstrator allows comparison of various generalisation results using weighted effective area with those of RDP and the original Visvalingam-Whyatt algorithm, where parameter values (i.e. RDP tolerance, effective area or weighted effective area) or the number of vertices retained after generalisation are adjustable. In the following subsection, we will present a few experimental results that demonstrate the different effects of the various generalisation filters described above. These results are obtained from the web demonstrator. For each comparison, the same number of filtered vertices is retained in the whole generalised dataset so that the differences in vertex selection/filtering of the various algorithms or parameter values may be highlighted. Also, in all experiments M=0 and N=1.
3.2 Experiments 3.2.1 Skewness According to our current experimental results, the weight based on the skewness of the effective triangle does not make a significant impact on the output if moderate parameter values are used (e.g. figure 3-D, SM = 0 and SK = 2). On the other hand, more extreme parameter values may generate unpredictable and undesirable results. These results cast doubt on the value of using this weight factor.
Fig. 3. Effects of VW with Skewness (D) compared to original (A), RDP(B) and VW (C) (for B,C and D, 1000 vertices retained in the whole dataset)
Fig. 4. Effects of applying convexity weight to VW
3.2.2 Convexity In our experiments, extreme convexity weight values generate quite significant effects (figure 4). On top of the initial effective area value, application of a very small weight (C=0.004) tends to retain vertices on the external local convex hulls while a very large weight (C=25) has the opposite effect of retaining vertices on the internal local convex hulls (see Normant and van de Walle 1996, regarding local convex hulls).
3.2.3 Line simplification with WEA - LF Scheme

Figures 5 and 6 demonstrate the effect of graphic simplification using “high-pass” weighted effective area, in comparison to the results of RDP. This filter appears to be able to generate simplification effects similar to those of RDP.
Fig. 5. Simplification by RDP, LF (KS=0.5, KH=1) and LF(KS=1, KH=1) (1000 vertices retained in the whole dataset)
Fig. 6. Simplification by RDP and LF(KS=0.2, KH=1), 200 vertices retained
Fig. 7. Effects of HF01 - Original VW, KS/KH as: 0.2/1; 0.5/1; 1/1 (1000 vertices retained in the whole dataset)
3.2.4 Line generalisation with WEA - HF01

The effects of the low-pass filter HF01 are shown in figures 7-9. Figure 7 demonstrates the effect of defining different “standard forms” (i.e. weight equal to 1), represented by KS (KS = 0.2, 0.5 and 1). Clearly, a "flatter" standard form (i.e. with a smaller KS) results in heavier generalisation.
Figure 8 shows the generalisation effects of the same set of parameter values at different levels of detail (vertices retained in the whole dataset are from 1000 to 150).
Fig. 8. Effects of HF01 at KS=0.5 & KH=2 with 1000/600/300/150 vertices (1-4) retained
Finally, figure 9 illustrates the effect of different values for KH. It is obvious that at the same level of detail (1000 vertices), a larger KH value will result in heavier generalisation.
Fig. 9. Effects of different KH (1/2/4/8) for the same KS (1), 1000 vertices
4 Discussion The experimental results in the previous section have demonstrated that the application of weight factors for flatness, convexity and (to a lesser extent) skewness can provide considerable control over both graphic and semantic generalisation effects. Apart from the skewness filter, the filter parameters provide consistent and predictable control over the resulting generalisation.
It is worth noting that greater control does not always result in a “better” effect, but the parameters described and demonstrated here do appear to provide excellent potential for obtaining generalisations that are adapted to the requirements of particular applications. At the current stage of research and development, we suggest WEA may best be applied in an interactive manner in order to obtain preferred results, for which an interactive mapping tool has been provided. In future it may be possible to select parameter values automatically based on the results of training with different types of generalisation. 4.1 Topologically consistent generalisation
A problem associated with the VW algorithm (as well as many other algorithms such as RDP) is that topological consistency is not guaranteed. WEA-based generalisation is no exception, as the same bottom-up process as in the VW algorithm is adopted. It is however fairly easy to geometrically (i.e. graphically) remove inconsistencies by adopting simple approaches such as retaining any vertex whose removal may cause inconsistency. For example, the topologically consistent multi-representational dataset used in (Zhou and Jones 2003) is generated in this way. There a Delaunay triangulation was used to detect when removal of a vertex would cause an inconsistency.
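A minimal consistency guard in the spirit of “retain any vertex whose removal may cause inconsistency”, assuming the Shapely library; it is a crude stand-in for the Delaunay-based detection used in (Zhou and Jones 2003).

```python
from shapely.geometry import LineString

def removal_is_safe(prev_pt, next_pt, other_features):
    """True if the shortcut segment created by dropping the vertex between
    prev_pt and next_pt does not cross any of the other feature geometries."""
    shortcut = LineString([prev_pt, next_pt])
    return not any(shortcut.crosses(feature) for feature in other_features)
```

A generalisation loop would simply refuse to remove any vertex for which removal_is_safe returns False.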
4.2 The issue of feature partition

As mentioned earlier, the bottom-up process in the VW algorithm is a localised, minimalist approach, as only the smallest details (three vertices) are considered. Generalisation effects on larger details are achieved progressively, without explicit knowledge of them. The lack of direct control over these (often semantically significant) details makes it more difficult to decide the best combination of parameter values. Indeed, often a single set of parameters may not be appropriate for every large detail in the cartographic line to be generalised (which is especially true for the convexity filters). Therefore, it is natural to consider partitioning the line into several large details and subsequently applying bottom-up or other types of generalisation to them with appropriate individual parameter sets. Many methods for (geometric or semantic) detail identification and feature partition have been proposed, such as sinuosity measures (Plazanet 1995), Voronoi skeletons (Ogniewicz and Kübler 1995), skeletons based on Delaunay triangulation (van der Poorten and Jones 2002) and various convex hull based methods (e.g. Normant and van de Walle 1996; Zhou
and Jones 2001). Following partitioning of features and addressing issues such as hierarchical details, overlapped details and oriented details of larger scale (i.e. more vertices), the best overall generalisation effects may be achieved by combining a localised bottom-up generalisation method, as presented here, with methods which take a more global view of features and operate successfully at the level of larger details (such as the branch pruning approach of van der Poorten and Jones, 2002) or multiple features.
References

Douglas, D.H. and Peucker, T.K., 1973, Algorithms for the reduction of the number of points required to represent a digitised line or its caricature. The Canadian Cartographer, 10(2), 112-122.
Normant, F. and van de Walle, A., 1996, The Sausage of Local Convex Hulls of a Curve and the Douglas-Peucker Algorithm. Cartographica, 33(4), 25-35.
Ogniewicz, R.L. and Kübler, O., 1995, Hierarchic Voronoi Skeletons. Pattern Recognition, 28(3), 343-359.
Plazanet, C., 1995, Measurements, Characterization, and Classification for Automated Line Feature Generalization. ACSM/ASPRS Annual Convention and Exposition, Vol. 4 (Proc. Auto-Carto 12), 59-68.
Ramer, U., 1972, An iterative procedure for polygonal approximation of planar closed curves. Computer Graphics and Image Processing, 1, 244-256.
Robinson, A.H., Morrison, J.L., Muehrcke, P.C., Kimerling, A.J. and Guptill, S.C., 1995, Elements of Cartography, sixth edition. John Wiley & Sons, Inc.
van der Poorten, P.M. and Jones, C.B., 2002, Characterisation and generalisation of cartographic lines using Delaunay triangulation. International Journal of Geographical Information Science, 16(8), 773-794.
Visvalingam, M. and Whyatt, J.D., 1993, Line generalisation by repeated elimination of points. Cartographic Journal, 30(1), 46-51.
Visvalingam, M. and Williamson, P.J., 1995, Simplification and generalization of large scale data for roads. Cartography and Geographic Information Science, 22(4), 3-15.
Visvalingam, M. and Herbert, S., 1999, A computer science perspective on the bend-simplification algorithm. Cartography and Geographic Information Science, 26(4), 253-270.
Zhou, S. and Jones, C.B., 2001, Multi-Scale Spatial Database and Map Generalisation. ICA Commission on Map Generalization 4th Workshop on Progress in Automated Map Generalization.
Zhou, S. and Jones, C.B., 2003, A Multi-representation Spatial Data Model. Proc. 8th International Symposium on Advances in Spatial and Temporal Databases (SSTD'03), LNCS 2750, 394-411.
Introducing a Reasoning System Based on Ternary Projective Relations

Roland Billen¹ and Eliseo Clementini²
¹ Center for Geosciences, Department of Geography and Geomatics, University of Glasgow, Glasgow G12 8QQ, Scotland (UK), [email protected]
² Department of Electrical Engineering, University of L’Aquila, I-67040 Poggio di Roio (AQ), Italy, [email protected]

Abstract This paper introduces a reasoning system based on ternary projective relations between spatial objects. The model applies to spatial objects of the kind point and region, is based on basic projective invariants and takes into account the size and shape of the three objects that are involved in a relation. The reasoning system uses permutation and composition properties, which allow the inference of unknown relations from given ones.
1 Introduction The field of Qualitative Spatial Reasoning (QSR) has experienced a great interest in the spatial data handling community due to its potential applications [1]. An important topic in QSR is the definition of reasoning systems on qualitative spatial relations. For example, regarding topological relations, the 9-intersection model [2] provides formal definitions for the relations and a reasoning system based on composition tables [3] establishes a mechanism to find new relations from a set of given ones. Topological relations take into account an important part of geometric knowledge and can be used to formulate qualitative queries about the connection properties of close spatial objects, like “retrieve the lakes that are inside Scotland”. Other qualitative queries that involve disjoint objects cannot be formulated in topological terms, for example: “the cities that are between Glasgow and Edinburgh”, “the lakes that are surrounded by the mountains”, “the shops that are on the right of the road”, “the building that is before the crossroad”. All these examples can be seen as semantic inter-
pretations of underlying projective properties of spatial objects. As discussed in [4], geometric properties can be subdivided into three groups: topological, projective and metric. Most qualitative relations between spatial objects can be defined in terms of topological or projective properties [5], with the exception of qualitative distance and direction relations (such as close, far, east, north) that are a qualitative interpretation of metric distances and angles [6]. The use of projective properties for the definition of spatial relations is rather new. A model for ternary projective relations has been introduced for points and regions in [7]. The model is based on a basic geometric invariant in projective space, the collinearity of three points, and takes into account the size and shape of the three objects involved in a relation. In a first approximation, this work can be compared to research on qualitative relations dealing with relative positioning or cardinal directions [8-13]. Most approaches consider binary relations with an associated frame of reference, and never avoid the use of metric properties (minimum bounding rectangles, angles, etc.). In this respect, the main difference in our approach is that we only deal with projective invariants, disregarding distances and angles. Most work on projective relations deals with point abstractions of spatial features. In [9], the authors develop a model for cardinal directions between extended objects. Composition tables for the latter model have been developed in [14]. Freksa's double-cross calculus [15] is similar to our approach in the case of points. Such a calculus, as further discussed in [16, 17], is based on ternary directional relations between points. However, in Freksa's model, an intrinsic frame of reference centred in a given point partitions the plane into four quadrants that are given by the front-back and right-left dichotomies. This leads to a greater number of qualitative distinctions with different algebraic properties and composition tables. In this paper, we establish a reasoning system based on the ternary projective relations that were introduced in [7]. From a basic set of rules about the permutation and composition of relations, we will show how it is possible to infer unknown relations using the algebraic properties of projective relations. The paper is organized as follows. We start in Section 2 by introducing the general aspects of a reasoning system with ternary relations. In Section 3 we summarize the model for ternary projective relations between points and we present the associated reasoning system. In Section 4, we recall the model in the case of regions and we introduce the reasoning system for this case too. In Section 5, we draw short conclusions and discuss some future developments.
2 Reasoning systems on ternary relations In this section, we present the basis of a reasoning system on ternary projective relations. Usually, reasoning systems apply to binary spatial relations, for example, to topological relations [3] and to directional relations [18]. For binary relations, given three objects a,b,c and two relations r(a,b) and r(b,c), the reasoning system allows one to find the relation r(a,c). This is done by giving an exhaustive list of results for all possible input relations, in the form of a composition table. The inverse relations complete the reasoning system by finding, given the relation r(a,b), the relation r(b,a). Reasoning with ternary relations is slightly more complex and has not been applied much to spatial relations so far, with few exceptions [16, 17, 19]. The notation we use for ternary relations is of the kind r(PO, RO1, RO2), where the first object PO represents the primary object, the second object RO1 represents the first reference object and the third object RO2 represents the second reference object. The primary object is the one that holds the relation r with the two reference objects, i.e., PO holds the relation r with RO1 and RO2. A reasoning system with ternary relations is based on two different sets of rules:
• a set of rules for permutations. Given three objects a,b,c and a relation r(a,b,c), these rules allow one to find the other relations obtained by permuting the three arguments. There are 6 (= 3!) potential arrangements of the arguments. The permutation rules correspond to the inverse relation of binary systems.
• a set of rules for composition. Given four objects a,b,c,d and the two relations r(a,b,c) and r(b,c,d), these rules allow one to find the relation r(a,c,d). The composition of relations r1 and r2 is indicated r1 ∘ r2.
Considering a set of relations ℜ, it is possible to prove that the following four rules, three permutations and one composition, are sufficient to derive all the possible ternary relations out of a set of four arguments.
(1) r(a,b,c) → r′(a,c,b)
(2) r(a,b,c) → r″(b,a,c)
(3) r(a,b,c) → r‴(c,a,b)
(4) r1(a,b,c) ∘ r2(b,c,d) → r3(a,c,d)
In the next sections, we will see how to apply such a ternary reasoning system in the case of projective ternary relations between points and between regions.
3 Reasoning system on ternary projective relations between points The projective ternary relations between points have been introduced in a previous paper [7]. They have a straightforward definition because they are related to common concepts of projective geometry [20]. In section 3.1, we will only present the definitions and the concepts necessary for a good understanding of the reasoning system. In section 3.2, we show how to apply the reasoning system on these ternary projective relations. 3.1 Ternary projective relations between points Our basic set of projective relations is based on the most important geometric invariant in a projective space: the collinearity of three points. Therefore, the nature of projective relations is intrinsically ternary. Given a relation r ( P1 , P2 , P3 ) , the points that act as reference objects must be distinct, in such a way they define a unique line passing through them, indicated with P2 P3 . When the relation needs an orientation on this line, the orientation is assumed to be from the first reference object to the second one: the oriented line is indicated with P2 P3 . The most general projective relations between three points are the collinear relation and its complement, the aside relation. The former one can be refined into between and nonbetween relations, and the latter one into rightside or leftside relations. In turn, the nonbetween relation can be subdivided into before and after relations, completing the hierarchical model of the projective relations between three points of the plane (see Figure 1.a). Out of this hierarchical model, five basic projective relations (before, between, after, rightside, leftside) are extracted. They correspond to the finest projective partition of the plane (see Figure 1.b).
Fig. 1. Projective relations between points: (a) hierarchical model of the relations; (b) projective partition of the plane.
Definitions are given only for the collinear relation and the five basic relations. Definition 1. A point P1 is collinear to two given points P2 and P3, with P2 ≠ P3, collinear(P1,P2,P3), if P1 ∈ P2P3. Definition 2. A point P1 is before points P2 and P3, with P2 ≠ P3, before(P1,P2,P3), if collinear(P1,P2,P3) and P1 ∈ (−∞, P2), where the last interval is part of the oriented line P2P3. Definition 3. A point P1 is between two given points P2 and P3, with P2 ≠ P3, between(P1,P2,P3), if P1 ∈ [P2, P3]. Definition 4. A point P1 is after points P2 and P3, with P2 ≠ P3, after(P1,P2,P3), if collinear(P1,P2,P3) and P1 ∈ (P3, +∞), where the last interval is part of the oriented line P2P3. Considering the two halfplanes determined by the oriented line P2P3, respectively the halfplane to the right of the line, which we indicate with HP−(P2P3), and the halfplane to the left of the line, which we indicate with HP+(P2P3), we may define the relations rightside and leftside.
Definition 5. A point P1 is rightside of two given points P2 and P3, rightside(P1,P2,P3), if P1 ∈ HP−(P2P3). Definition 6. A point P1 is leftside of two given points P2 and P3, leftside(P1,P2,P3), if P1 ∈ HP+(P2P3).
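To make Definitions 1-6 concrete, the following short Python sketch classifies a point P1 against an ordered pair (P2, P3) using a cross product for the rightside/leftside test and a scalar projection for before/between/after. The function name and the tolerance parameter eps are illustrative choices, not part of the original model.

```python
def classify_point(p1, p2, p3, eps=1e-9):
    """Return 'before', 'between', 'after', 'rightside' or 'leftside'
    for point p1 with respect to the oriented line from p2 to p3.
    Points are (x, y) tuples; p2 and p3 must be distinct."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    dx, dy = x3 - x2, y3 - y2                 # direction of the oriented line P2P3
    cross = dx * (y1 - y2) - dy * (x1 - x2)
    if cross < -eps:                          # strictly in the right halfplane HP-(P2P3)
        return "rightside"
    if cross > eps:                           # strictly in the left halfplane HP+(P2P3)
        return "leftside"
    # collinear case: signed position along the line, as a fraction of |P2P3|
    t = (dx * (x1 - x2) + dy * (y1 - y2)) / (dx * dx + dy * dy)
    if t < 0:
        return "before"
    if t > 1:
        return "after"
    return "between"

# Example: a point behind P2 on the oriented line from P2 to P3
print(classify_point((-1, 0), (0, 0), (2, 0)))   # -> 'before'
```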
3.2 Reasoning system Using this model for ternary relations between points, it is possible to build a reasoning system, which allows the prediction of ternary relations between specific points. Such a reasoning system is an application of the reasoning system on ternary relations previously introduced. The four rules become:
(1) r(P1,P2,P3) → r′(P1,P3,P2)
(2) r(P1,P2,P3) → r″(P2,P1,P3)
(3) r(P1,P2,P3) → r‴(P3,P1,P2)
(4) r1(P1,P2,P3) ∘ r2(P2,P3,P4) → r3(P1,P3,P4)
For any ternary relation r(P1,P2,P3), Table 1 gives the corresponding relations resulting from permutation rules (1), (2) and (3). The following abbreviations are used: bf for before, bt for between, af for after, rs for rightside
and ls for leftside. For example, knowing bf(P1,P2,P3), one can derive the relationships corresponding to the permutations of the three points, which are in this case af(P1,P3,P2), bt(P2,P1,P3) and af(P3,P1,P2).
Table 1. Permutation table of ternary projective relations between points
r(P1,P2,P3) | r(P1,P3,P2) | r(P2,P1,P3) | r(P3,P1,P2)
bf | af | bt | af
bt | bt | bf | bf
af | bf | af | bt
rs | ls | ls | rs
ls | rs | rs | ls
Table 2 gives relations resulting from the composition rule (4). The first column of the table contains the basic ternary relations for (P1,P2,P3) and the first row contains the basic ternary relations (P2,P3,P4). The other cells give the deduced transitive relations for (P1,P3,P4). For some entries, several cases may occur and all the possibilities are presented in the table.
Table 2. Composition table of ternary projective relations between points
r1(P1,P2,P3) \ r2(P2,P3,P4) | bf | bt | af | rs | ls
bf | bf | af, bt | af | rs | ls
bt | bf | bt | af, bt | rs | ls
af | af, bt | bf | bf | ls | rs
rs | rs | ls | ls | af, rs, ls, bt | bf, rs, ls
ls | ls | rs | rs | bf, rs, ls | af, rs, ls, bt
Using this reasoning system and knowing any two ternary relations between three different points out of a set of four, it is possible to predict the ternary relations between all the other possible combinations of three points out of the same set.
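A minimal Python sketch of how such a reasoning system could be implemented is given below. The permutation and composition tables are encoded as dictionaries (using the abbreviations bf, bt, af, rs, ls), transcribing Tables 1 and 2; the function names are illustrative.

```python
# Permutation table (Table 1): for r(P1,P2,P3), the relations obtained by
# permuting the arguments as in rules (1), (2) and (3).
PERMUTE = {
    #       r(P1,P3,P2)  r(P2,P1,P3)  r(P3,P1,P2)
    "bf": ("af", "bt", "af"),
    "bt": ("bt", "bf", "bf"),
    "af": ("bf", "af", "bt"),
    "rs": ("ls", "ls", "rs"),
    "ls": ("rs", "rs", "ls"),
}

# Composition table (Table 2): COMPOSE[r1][r2] is the set of possible
# relations r3(P1,P3,P4) given r1(P1,P2,P3) and r2(P2,P3,P4).
COMPOSE = {
    "bf": {"bf": {"bf"}, "bt": {"af", "bt"}, "af": {"af"}, "rs": {"rs"}, "ls": {"ls"}},
    "bt": {"bf": {"bf"}, "bt": {"bt"}, "af": {"af", "bt"}, "rs": {"rs"}, "ls": {"ls"}},
    "af": {"bf": {"af", "bt"}, "bt": {"bf"}, "af": {"bf"}, "rs": {"ls"}, "ls": {"rs"}},
    "rs": {"bf": {"rs"}, "bt": {"ls"}, "af": {"ls"},
           "rs": {"af", "rs", "ls", "bt"}, "ls": {"bf", "rs", "ls"}},
    "ls": {"bf": {"ls"}, "bt": {"rs"}, "af": {"rs"},
           "rs": {"bf", "rs", "ls"}, "ls": {"af", "rs", "ls", "bt"}},
}

def permute(rel):
    """Relations for (P1,P3,P2), (P2,P1,P3) and (P3,P1,P2) given rel(P1,P2,P3)."""
    return PERMUTE[rel]

def compose(r1, r2):
    """Possible relations r3(P1,P3,P4) given r1(P1,P2,P3) and r2(P2,P3,P4)."""
    return COMPOSE[r1][r2]

# Example: bf(P1,P2,P3) and bt(P2,P3,P4) allow af or bt for (P1,P3,P4).
print(permute("bf"))        # ('af', 'bt', 'af')
print(compose("bf", "bt"))  # {'af', 'bt'}
```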
4 Reasoning system on ternary projective relations between regions As in the section on relations between points, we first recall some concepts about the ternary projective relations between regions (Section 4.1). Afterwards, we introduce the associated reasoning system, including an example of application of such a system (Section 4.2).
4.1 Ternary projective relations between regions We will assume that a region is a regular closed point set, possibly with holes and separate components. We will only present briefly the basic projective relations and the related partition of the space, while we refer to [7] for a more extended treatment. In the following, we indicate the convex hull of a region with a unary function CH(). As in the case of points, we use the notation r(A,B,C) for projective relations between regions, where the first argument A is a region that acts as the primary object, while the second and third arguments B and C are regions that act as reference objects. The latter two regions must satisfy the condition CH(B) ∩ CH(C) = ∅, that is, the intersection of their convex hulls must be empty. This condition allows us to build a reference frame based on B and C, as will be defined in this section. We also use the concept of orientation, which is represented by an oriented line connecting any point in B with any point in C. Definition 7. Given two regions B and C, with CH(B) ∩ CH(C) = ∅, a region A is collinear to regions B and C, collinear(A,B,C), if for every point P ∈ A there exists a line l intersecting B and C that also intersects P, that is: ∀P ∈ A, ∃l, (l ∩ B ≠ ∅) ∧ (l ∩ C ≠ ∅) ∧ (l ∩ P ≠ ∅). The projective partition of the space into five regions corresponding to the five basic projective relations is based, as it was for the points, on the definition of the general collinear relation between three regions. The portion of the space where this relation is true is delimited by four lines that are the common external tangents and the common internal tangents. Common external tangents of B and C are defined by the fact that they are also tangent to the convex hull of the union of B and C (figure 2.a). Common internal tangents intersect inside the convex hull of the union of regions B and C and divide the plane into four cones (figure 2.b). In order to distinguish the four cones, we consider an oriented line from region B to region C and we call Cone−∞(B,C) the cone that contains region B, Cone+∞(B,C) the cone that contains region C, Cone−(B,C) the cone that is
to the right of the oriented line, and Cone+(B,C) the cone that is to the left of
the oriented line. We obtain a partition of the space into five regions, which correspond to the five basic projective relations before, between, after, rightside, and leftside (figure 3.a).
Fig. 2. Internal and external tangents: (a) common external tangents; (b) common internal tangents.
Definition 8. A region A is before two regions B and C, before(A,B,C), with CH(B) ∩ CH(C) = ∅, if A ⊆ Cone−∞(B,C) − CH(B ∪ C). Definition 9. A region A is between two regions B and C, between(A,B,C), with CH(B) ∩ CH(C) = ∅, if A ⊆ CH(B ∪ C). Definition 10. A region A is after two regions B and C, after(A,B,C), with CH(B) ∩ CH(C) = ∅, if A ⊆ Cone+∞(B,C) − CH(B ∪ C). Definition 11. A region A is rightside of two regions B and C, rightside(A,B,C), with CH(B) ∩ CH(C) = ∅, if A is contained inside Cone−(B,C) minus the convex hull of the union of regions B and C, that is,
if A ⊆ (Cone−(B,C) − CH(B ∪ C)). Definition 12. A region A is leftside of two regions B and C, leftside(A,B,C), with CH(B) ∩ CH(C) = ∅, if A is contained inside Cone+(B,C) minus the convex hull of the union of regions B and C, that is,
if A ⊆ (Cone+(B,C) − CH(B ∪ C)). The set of five projective relations before, between, after, rightside, and leftside can be used as a set of basic relations to build a model for all projective relations between three regions of the plane. The model, which we call the 5-intersection, is synthetically expressed by a matrix of five values that are the empty/non-empty intersections of a region A with the five regions defined in Figure 3.b. In the matrix, a value 0 indicates an empty intersection, while a value 1 indicates a non-empty intersection. The five basic relations correspond to values of the matrix with only one non-empty value (Figure 4). In total, the 5-intersection matrix can have 2⁵ = 32 different values, which correspond to the same theoretical number of projective relations.
Excluding the configuration with all zero values, which cannot exist, we are left with 31 different projective relations between the three regions A, B and C.
Fig. 3. Projective relations between regions: (a) the partition of the plane into five regions; (b) the 5-intersection model.
Fig. 4. The projective relations with object A intersecting only one of the regions of the plane.
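Although Figure 4 cannot be reproduced here, evaluating the 5-intersection is straightforward once the five zones of Figure 3.a are available as geometries. The sketch below uses the Shapely library (an assumption; any polygon library with an intersection test would do) and takes the zones as precomputed polygons; the construction of the zones from the common tangents of B and C is not shown, and the toy zones in the example are hand-made rectangles, not derived from tangents.

```python
from shapely.geometry import Polygon

def five_intersection(a, zones):
    """5-intersection values of region a with the five zones of the plane.
    `zones` maps the relation names to (precomputed) Shapely polygons;
    a 1 means a non-empty intersection, a 0 an empty one."""
    names = ("before", "between", "after", "rightside", "leftside")
    return {name: int(a.intersects(zones[name])) for name in names}

# Toy example with hand-made rectangular "zones".
zones = {
    "before":    Polygon([(-10, -1), (0, -1), (0, 1), (-10, 1)]),
    "between":   Polygon([(0, -1), (4, -1), (4, 1), (0, 1)]),
    "after":     Polygon([(4, -1), (14, -1), (14, 1), (4, 1)]),
    "rightside": Polygon([(-10, -10), (14, -10), (14, -1), (-10, -1)]),
    "leftside":  Polygon([(-10, 1), (14, 1), (14, 10), (-10, 10)]),
}
a = Polygon([(1, -0.5), (3, -0.5), (3, 0.5), (1, 0.5)])
print(five_intersection(a, zones))   # between -> 1, all other entries -> 0
```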
4.2 Reasoning system The reasoning system for regions is fully defined on the basis of the following rules:
(1) r(A,B,C) → r′(A,C,B)
(2) r(A,B,C) → r″(B,A,C)
(3) r(A,B,C) → r‴(C,A,B)
(4) r1(A,B,C) ∘ r2(B,C,D) → r3(A,C,D)
Currently, the reasoning system has been established for the five basic projective relations only. We are working on its extension to the whole set of projective relations by developing a system which will combine the results of permutations and compositions of the basic cases.
Table 3. Permutation table of ternary projective relations between regions
r(A,B,C) | r′(A,C,B) | r″(B,A,C) | r‴(C,A,B)
bf | af | bt, (bt ∧ rs), (bt ∧ ls), (bt ∧ rs ∧ ls) | af, (af ∧ rs), (af ∧ ls), (af ∧ rs ∧ ls)
bt | bt | bf, (bf ∧ rs), (bf ∧ ls), (bf ∧ rs ∧ ls) | bf, (bf ∧ rs), (bf ∧ ls), (bf ∧ rs ∧ ls)
af | bf | af, (af ∧ rs), (af ∧ ls), (af ∧ rs ∧ ls) | bt, (bt ∧ rs), (bt ∧ ls), (bt ∧ rs ∧ ls)
rs | ls | ls | rs
ls | rs | rs | ls
For any basic ternary relation r(A,B,C), Table 3 gives the corresponding relations resulting from permutation rules (1), (2) and (3). The similarity with the permutation table for three points is clear. Only in some cases are there exceptions to the basic permutations for points. In those cases, the "strong" relation (which is the one that holds also for points) can be combined with one or both of the leftside and rightside relations. The results of the composition rule (4) of the reasoning system are presented in Table 4. The first column of the table contains the basic ternary relations for r1(A,B,C) and the first row contains the basic ternary relations for r2(B,C,D). The other cells give the deduced r3(A,C,D) relations. In this table, we present only the single relations as results. The full composition relations can be obtained by combinations of these single relations. For example, the result of the composition before(A,B,C) ∘ before(B,C,D) is: bf, rs, ls, (bf ∧ rs), (bf ∧ ls), (ls ∧ rs), (bf ∧ rs ∧ ls).
Table 4. Composition table of ternary projective relations between regions
r1(A,B,C) \ r2(B,C,D) | bf | bt | af | rs | ls
bf | bf, rs, ls | bt, af, rs, ls | af | af, rs | af, ls
bt | bf | bt | bt, af, rs, ls | bf, bt, rs | bf, bt, ls
af | bf, bt, af, rs, ls | bf, bt, rs, ls | bt, af | bf, bt, ls | bf, bt, rs
rs | bf, bt, af, rs | bf, bt, af, ls | bf, bt, af, ls | bt, af, rs, ls | bf, rs, ls
ls | bf, bt, af, ls | bf, bt, af, rs | bf, bt, af, rs | bf, rs, ls | bt, af, rs, ls
We will end this section with an example of application of the reasoning system. Given the relations before(A,B,C) and rightside(B,C,D), we find out the potential relations for r(A,B,D).
• Step 1: apply (1) to the first of the given relations, and (2) to the second (fig. 5a, b and c): before(A,B,C) → after(A,C,B); rightside(B,C,D) → leftside(C,B,D).
• Step 2: apply (4) to the following composition: after(A,C,B) ∘ leftside(C,B,D) → before(A,B,D) (fig. 5d), between(A,B,D) (fig. 5e), rightside(A,B,D) (fig. 5f), (before(A,B,D) ∧ between(A,B,D)) (fig. 5g), (before(A,B,D) ∧ rightside(A,B,D)) (fig. 5h), (between(A,B,D) ∧ rightside(A,B,D)) (fig. 5i), or (before(A,B,D) ∧ between(A,B,D) ∧ rightside(A,B,D)) (fig. 5j).
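The two steps of this example can be mechanised directly from Tables 3 and 4. The sketch below restricts the tables to the single-relation entries needed for this example (the composite ∧ cases are omitted) and chains a permutation lookup with a composition lookup; the dictionaries are deliberately partial and the names are illustrative.

```python
# Single-relation entries of Table 3 (permutations) and Table 4 (compositions)
# needed for the worked example; composite (∧) entries are not listed.
PERMUTE_REGION = {          # rel(A,B,C) -> (r'(A,C,B), r''(B,A,C), r'''(C,A,B))
    "bf": ("af", "bt", "af"),
    "rs": ("ls", "ls", "rs"),
}
COMPOSE_REGION = {          # COMPOSE_REGION[r1][r2] = possible r3 for the composition
    "af": {"ls": {"bf", "bt", "rs"}},
}

def relations_abd(r_abc, r_bcd):
    """Candidate single relations for (A,B,D) from r(A,B,C) and r(B,C,D)."""
    r1 = PERMUTE_REGION[r_abc][0]   # rule (1): r(A,B,C) -> r'(A,C,B)
    r2 = PERMUTE_REGION[r_bcd][1]   # rule (2): r(B,C,D) -> r''(C,B,D)
    return COMPOSE_REGION[r1][r2]   # rule (4): r'(A,C,B) ∘ r''(C,B,D) -> r(A,B,D)

print(relations_abd("bf", "rs"))    # {'bf', 'bt', 'rs'}
```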
5 Conclusion and future work In this paper, we have introduced a reasoning system based on ternary projective relations between points and between regions. These sets of qualitative spatial relations, invariant under projective transformations, provide a new classification of configurations between three objects based on a segmentation of the space into five regions. The associated reasoning system allows inferring relations between three objects using permutation and composition rules. It is the first step towards the establishment of a complete qualitative reasoning system based on projective properties of space. In the future, the reasoning system has to be more formally defined; in particular, the relations contained in the permutation and composition tables have to be proved. Another issue that should be explored is the realisation of a complete qualitative spatial calculus for reasoning about ternary projective relations.
Fig. 5. Example of application of the reasoning system between regions: (a) applying (1) to bf(A,B,C) yields af(A,C,B); (b)-(c) applying (2) to rs(B,C,D) yields ls(C,B,D); (d) bf(A,B,D); (e) bt(A,B,D); (f) rs(A,B,D); (g) bf(A,B,D) ∧ bt(A,B,D); (h) bf(A,B,D) ∧ rs(A,B,D); (i) bt(A,B,D) ∧ rs(A,B,D); (j) bf(A,B,D) ∧ bt(A,B,D) ∧ rs(A,B,D).
Acknowledgements This work was supported by M.I.U.R. under project “Representation and management of spatial and geographic data on the Web”.
References 1. Cohn, A.G. and S.M. Hazarika, Qualitative Spatial Representation and Reasoning: An Overview. Fundamenta Informaticae, 2001. 46(1-2): p. 1-29. 2. Egenhofer, M.J. and J.R. Herring, Categorizing Binary Topological Relationships Between Regions, Lines, and Points in Geographic Databases. 1991, Department of Surveying Engineering, University of Maine, Orono, ME. 3. Egenhofer, M.J., Deriving the composition of binary topological relations. Journal of Visual Languages and Computing, 1994. 5(1): p. 133-149.
4. Clementini, E. and P. Di Felice, Spatial Operators. ACM SIGMOD Record, 2000. 29(3): p. 31-38. 5. Waller, D., et al., Place learning in humans: The role of distance and direction information. Spatial Cognition and Computation, 2000. 2: p. 333-354. 6. Clementini, E., P. Di Felice, and D. Hernández, Qualitative representation of positional information. Artificial Intelligence, 1997. 95: p. 317-356. 7. Billen, R. and E. Clementini, A model for ternary projective relations between regions, in EDBT2004 - 9th International Conference on Extending DataBase Technology, E. Bertino, Editor. 2004, Springer-Verlag: Heraklion - Crete, Greece. p. 310-328. 8. Gapp, K.-P. Angle, Distance, Shape, and their Relationship to Projective Relations. in Proceedings of the 17th Conference of the Cognitive Science Society. 1995. Pittsburgh, PA. 9. Goyal, R. and M.J. Egenhofer, Cardinal directions between extended spatial objects. IEEE Transactions on Knowledge and Data Engineering, 2003. (in press). 10. Kulik, L. and A. Klippel, Reasoning about Cardinal Directions Using Grids as Qualitative Geographic Coordinates, in Spatial Information Theory. Cognitive and Computational Foundations of Geographic Information Science: International Conference COSIT'99, C. Freksa and D.M. Mark, Editors. 1999, Springer. p. 205-220. 11. Schmidtke, H.R., The house is north of the river: Relative localization of extended objects, in Spatial Information Theory. Foundations of Geographic Information Science: International Conference, COSIT 2001, D.R. Montello, Editor. 2001, Springer. p. 415-430. 12. Moratz, R. and K. Fischer. Cognitively Adequate Modelling of Spatial Reference in Human-Robot Interaction. in Proc. of the 12th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2000. 2000. Vancouver, BC, Canada. 13. Schlieder, C., Reasoning about ordering, in Spatial Information Theory: A Theoretical Basis for GIS - International Conference, COSIT'95, A.U. Frank and W. Kuhn, Editors. 1995, Springer-Verlag: Berlin. p. 341-349. 14. Skiadopoulos, S. and M. Koubarakis, Composing cardinal direction relations. Artificial Intelligence, 2004. 152(2): p. 143-171. 15. Freksa, C., Using Orientation Information for Qualitative Spatial Reasoning, in Theories and Models of Spatio-Temporal Reasoning in Geographic Space, A.U. Frank, I. Campari, and U. Formentini, Editors. 1992, Springer-Verlag: Berlin. p. 162-178. 16. Scivos, A. and B. Nebel, Double-Crossing: Decidability and Computational Complexity of a Qualitative Calculus for Navigation, in Spatial Information Theory. Foundations of Geographic Information Science: International Conference, COSIT 2001, D.R. Montello, Editor. 2001, Springer. p. 431-446. 17. Isli, A. Combining Cardinal Direction Relations and other Orientation Relations in QSR. in AI&M 14-2004, Eighth International Symposium on Artificial Intelligence and Mathematics. 2004. January 4-6, 2004, Fort Lauderdale, Florida.
18. Frank, A.U., Qualitative Reasoning about Distances and Directions in Geographic Space. Journal of Visual Languages and Computing, 1992. 3(4): p. 343-371. 19. Isli, A. and A.G. Cohn, A new approach to cyclic ordering of 2D orientations using ternary relation algebras. Artificial Intelligence, 2000. 122: p. 137-187. 20. Struik, D.J., Projective Geometry. 1953, London: Addison-Wesley.
A Discrete Model for Topological Relationships between Uncertain Spatial Objects
Erlend Tøssebro and Mads Nygård
Department of Computer Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway. tossebro, mads @idi.ntnu.no
Abstract Even though the positions of objects may be uncertain, one may know some topological information about them. In this paper, we develop a new model for storing topological relationships in uncertain spatial data. It is intended to be the equivalent of such representations as the Node-Arc-Area representation but for spatial objects with uncertain positions. Keywords: Uncertain spatial data, topology, node-arc-area.
1 Introduction Several models have been presented for storing spatial data with uncertain or vague positions, including the models from (Tøssebro and Nygård 2002a and 2002b) that this paper builds on. However, these models do not handle topological relationships well. One example would be to describe two regions that share a stretch of boundary, but one does not know quite where this boundary lies. The goal of this paper is to develop a vector model of uncertain spatial points, lines and regions that incorporates this sort of information. The following examples describe situations in which one wants to store topological relationships between uncertain objects: Example 1: In a historical database, it may be known that two empires, such as the Egyptians and the Hittites, shared a boundary. However, it is not necessarily known exactly where the boundary was. If a coarse temporal granularity is used, the border may also have changed inside the time period of one snapshot. In such cases one may want to store the fact that the boundaries were shared rather than there being a possible overlap. A more recent example of this problem from (Plewe 2002) is the fact that the textual descriptions of county borders in Utah in the 19th century sometimes gave a lot of room for interpretation. Example 2: In a database containing information of past rivers and lakes, the location of the rivers and lakes may be partially uncertain. How-
ever, one may know that a particular river came out of a particular lake. This relationship must be stored explicitly because it cannot be deduced using existing representations. Based only on the uncertain representation, the river may overlap the lake or start some distance away from it. Additionally, one must store such topological information if one wants to store uncertain or vague partitions. A partition is a set of uncertain regions that covers the entire area of interest and where the regions do not overlap each other. One example of a vague partition is: Example 3: A database of soil types or land cover should store a fuzzy partition because soil types typically change gradually rather than abruptly. However, the total map covers the entire terrain and has no gaps, thus being a partition. All these examples illustrate the need to store topology in uncertain data. Because such topology in many cases cannot be inferred from the data representation itself, it must somehow be stored explicitly. This paper will describe representations for storing topological data as well as methods for generating these representations from a collection of vectorized regions. The representations are based on existing models for uncertain spatial data by the present authors as well as how topology is stored for crisp data. (The word “crisp” will be used in this paper as the opposite of uncertain or vague.) A vector model is chosen rather than a raster model because vector data take much less space than raster data, and topology is much easier to generate and store for vector data. Most models for dealing with topology in crisp data are vector models. The next section will discuss the models and papers on which these new representations are built. Section 3 will discuss how to represent those topological relationships that cannot be inferred from the spatial extents of the uncertain data. Section 4 will discuss how one can extract information of one type from another, such as finding the uncertain border curve of an uncertain face. Section 5 extends the methods from Sections 3 and 4 to generate the uncertain border curves between regions. Section 6 will discuss other previous approaches to similar problems and compare them with this approach when appropriate. Section 7 summarizes the contributions of this paper.
2 Basis for the model This section will discuss some of the means that have been used to represent topological relationships before. Section 2.1 looks at how topology is represented for crisp data, and Section 2.2 looks at how topology is represented for uncertain data. Section 2.2 also looks at which relationships can be inferred from the spatial representation alone when the data are uncertain.
Figure 1: Node-Arc-Area representation. (a) Example; the relational entry for line a is: ID a, start point 1, end point 2, left region A, right region B. (b) ER diagram.
2.1 Topology in crisp data
In the ideal case, topology can be inferred from the data itself in the crisp case. However, rounding errors and update inconsistencies mean that in practice it is often desirable to store topological information explicitly. Topology is therefore often stored explicitly in databases for crisp spatial data. One common representation is the so-called node-arc-area (NAA) representation. It is described, for instance, in (Worboys 1995), page 193. An example of this representation is shown in Fig. 1a. The table row below the main figure shows the relational entry for line a. An ER-diagram of this representation is given in Fig. 1b. In this representation, each face is bounded by a set of line segments. Each line segment has a pointer to the faces that are on each side of it, as well as to the start and end points.
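A minimal sketch of how the node-arc-area structure could be held in code is shown below, with Python dataclasses standing in for the relational tables of Fig. 1; the class and field names are illustrative choices, not taken from any particular GIS.

```python
from dataclasses import dataclass

@dataclass
class Node:            # a start or end point of an arc
    node_id: int
    x: float
    y: float

@dataclass
class Arc:             # a line segment bounding two faces
    arc_id: str
    start: int         # node id of the start point
    end: int           # node id of the end point
    left_face: str     # face on the left of the arc
    right_face: str    # face on the right of the arc

# The relational entry for line a in Fig. 1a: start node 1, end node 2,
# left region A, right region B.
arc_a = Arc("a", start=1, end=2, left_face="A", right_face="B")

def faces_meet(face1, face2, arcs):
    """Two faces meet if some arc has them on opposite sides."""
    return any({arc.left_face, arc.right_face} == {face1, face2} for arc in arcs)

print(faces_meet("A", "B", [arc_a]))   # True
```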
2.2 Abstract uncertain topology
Much work has been done on determining topological relationships between indeterminate regions. However, the conclusions of most of these papers are either that there are many more topological relationships in the uncertain case (Clementini and Di Felice 1996), or that one cannot determine the relationships for certain (Roy and Stell 2001). This is particularly a problem for relationships such as equals, which requires precise overlap, or meet, which requires that the borders overlap but the interiors do not. The topological relationships that are used in (Schneider 2001) for uncertain regions are given in Table 1.
The contains, inside, disjoint and overlap relationships can be determined by geometric computations for uncertain data as well as for crisp data. These computations may yield an uncertain answer, but at least a crisp answer is possible. However, this is not the case for covers, coveredBy, equals and meet. Determining for certain that two regions are equal is only possible if they are known to be the same object (have the same object identifier or an explicit link to each other). Even if the uncertain spatial representations are equal, there is no guarantee that the actual objects are equal. It is also impossible to determine the meet relationship for certain, for the same reason as for equals. Although this operation is but one of eight, it encompasses a number of cases: • Point meet point: This is the same as point equals point and is solved by checking object identities as per the other equals operations. • Point meet line: Is the point P on the line L? (Did the ancient city C lie on the river R?) • Point meet face: Is the point P on the border of the face F? • Line meet line: Do the two lines share an end point? • Line meet face: Does the line end in the face? (Does the river R come from the lake L?) • Face meet face: Do the two faces share a common boundary? (Were the ancient empires A and B neighbours?) The node-arc-area representation stores regions by their boundary and lets the edges of bordering regions store links to each other. In this fashion one knows that these two regions meet each other. Covers and coveredBy cannot be determined for certain with uncertain data for the same reason as with meet, and the solution presented in this paper will work for them as well.
Table 1: Topological relationships
covers | The second object is inside the first but shares a part of its boundary
coveredBy | The first object is inside the second but shares a part of its boundary
contains | The second object is entirely inside the first
inside | The first object is entirely inside the second
disjoint | The intersection of the two objects is empty
equals | The two objects are equal
overlap | The two objects overlap
meet | The boundaries of the objects overlap but the interiors do not
Figure 2: ER-diagram of the simple method of storing connections
3 Storing connections between types Let us say that the data provider knows that two uncertain objects meet each other but does not know precisely where. Because he does not know precisely where the objects are, they should be stored as uncertain or fuzzy spatial objects. However, existing representations of such objects do not provide any means of determining whether they meet or not. To be able to determine the meet relationships, one needs to store them somehow. Two different approaches will be presented in this section. The first is to simply store the relationship explicitly in the data objects. The second is to create an uncertain equivalent of the node-arc-area representation. This second method is also useful in storing an uncertain partition.
3.1 Simple method
The simplest solution is to store the relationships as lists in the objects themselves. In this method, each object stores a link to each other object to which it is related. An ER diagram showing these relationships is presented in Fig. 2. The advantage of this model is simplicity. There is for instance no need to convert between types. However, it is impossible to check where the relationship holds in this model. For instance, it is impossible to check which part of the boundary of region A is shared with region B. This model also cannot be used to store an uncertain partition because one cannot determine whether or not a given set of faces is a partition.
3.2 Uncertain node-arc-area
The alternative is to use a representation that is similar to the Node-Arc-Area (NAA) representation that is used in the crisp case. The NAA representation can be translated into a representation for uncertain data using much the same ER diagram as for the crisp case. The only difference is that
Figure 3: ER-diagram of uncertain Node-Arc-Area
one might want to store the On relationship explicitly. In the crisp case this can be computed from the geometry. In the uncertain case it cannot, because even if the support of an uncertain point is entirely within the support of the uncertain curve, the point still probably is not on the curve. The ER-diagram for the uncertain node-arc-area representation is given in Fig. 3. In an uncertain node-arc-area representation, the following relationships from Fig. 2 are checked by testing relationships with different types:
• p.OnBorder(F) ≡ p.On(F.Border())
• c1.Meet(c2) ≡ ∃p (c1.EndPoint(p) ∧ c2.EndPoint(p))
• c.Enters(F) ≡ ∃p (c.EndPoint(p) ∧ p.OnBorder(F))
• F1.Touches(F2) ≡ ∃c (F1.BorderedBy(c) ∧ F2.BorderedBy(c))
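These four derived checks can be coded almost verbatim once the explicit links (end points of curves, border curves of faces, and the On relationship) are stored. The sketch below assumes simple container classes holding those links; all class and function names are illustrative.

```python
class UncertainPoint:
    def __init__(self):
        self.on_curves = set()          # curves this point is explicitly "On"

class UncertainCurve:
    def __init__(self, end_points=()):
        self.end_points = set(end_points)

class UncertainFace:
    def __init__(self, border_curves=()):
        self.border_curves = set(border_curves)

def on_border(p, face):
    # p.OnBorder(F)  ==  p.On(F.Border())
    return any(c in p.on_curves for c in face.border_curves)

def curves_meet(c1, c2):
    # c1.Meet(c2)  ==  the curves share an end point
    return bool(c1.end_points & c2.end_points)

def enters(curve, face):
    # c.Enters(F)  ==  some end point of c is on the border of F
    return any(on_border(p, face) for p in curve.end_points)

def faces_touch(f1, f2):
    # F1.Touches(F2)  ==  the faces are bordered by a common curve
    return bool(f1.border_curves & f2.border_curves)

shared = UncertainCurve()
f1, f2 = UncertainFace([shared]), UncertainFace([shared])
print(faces_touch(f1, f2))   # True
```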
Notice that this representation requires that the border of the face and the end points of the curve be represented explicitly. To ensure consistency between the curve and the end point, it should be possible to generate the end points from the curve. In some cases it may also be necessary to generate some of these data in order to create an uncertain ordered edge list. For instance, if one has a set of regions, one must generate their border curves as well as the end points of those curves. In this representation, it is assumed that if two regions are bordered by the same curve, they meet. If they are bordered by different curves, they do not meet. If one knows that they may meet, one may store this as if they were certain to meet and store a probability of meeting in the border curve.
4 Converting between types To be able to generate the uncertain ordered edge list, one needs to be able to find the border of a region and the end points of a curve. In this section, one method for doing this for a general egg-yolk style representation (Cohn and Gotts 1996) is given. In an egg-yolk model, a face is represented with two boundaries. Each uncertain face is expected to have a core in which it is certain to exist, and a support in which it can possibly be. The object may also store a probability
Figure 4: Uncertain Face and Boundary Curve
function or fuzzy set to indicate where the object is most likely to be. One way of producing the boundary curve of a face is the following:
1. Let BR be the boundary region of the face.
2. If there is a probability function, create the expected border line along the 0.5 probability contour. Otherwise, create it in the middle of the boundary region.
3. Assign a probability of existence of 1 along the entire expected border line.
4. Let the uncertain border curve be the expected border line and the boundary region of the face.
5. The boundary region indicates where the border line may be, and the expected border line indicates where it is most likely to be.
This procedure assumes that there are no peaks in the probability function outside the core. If there are, a more complex algorithm is needed. This algorithm has been omitted due to space limitations. Figure 4 shows an example of an uncertain face and its boundary curve. To create the uncertain node-arc-area representation, one also needs to find the end points of an uncertain curve. An uncertain cycle like the ones that form the borders of uncertain faces does not have end points, but other uncertain curves may. When storing topological relationships, one possible method of finding the points where two or more curves meet is to use the area that is overlapped by the supports of all the curves that are supposed to meet there. This method is used in the algorithm presented in Sect. 5. Computing the end points from the representation of the uncertain curve would require an unambiguous representation of the probability that the curve is at particular places. Such a representation is presented in (Tøssebro and Nygård 2002a), but it is beyond the scope of this paper. All uncertain objects have a face as support and an object of the appropriate type as the core. Probability functions are computed by triangulating the support and computing the probability functions along the triangle edges. The basic approach that was described for point-set models can also be used in this discrete model.
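Parts of this procedure map directly onto standard polygon operations. The sketch below uses the Shapely library (an assumption; any polygon library with difference and intersection would do) to derive the boundary region of an egg-yolk face as support minus core, and the support of a meeting point as the common overlap of the supports of the curves that are supposed to meet there. The 0.5-contour / midline construction of the expected border line itself is not reproduced here.

```python
from functools import reduce
from shapely.geometry import Polygon

def boundary_region(support, core):
    """The region in which the border line of an egg-yolk face may lie."""
    return support.difference(core)

def meeting_point_support(curve_supports):
    """Area overlapped by the supports of all curves that are supposed to meet."""
    return reduce(lambda acc, s: acc.intersection(s), curve_supports)

# Toy egg-yolk face: a large square support with a smaller square core.
support = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
core = Polygon([(2, 2), (8, 2), (8, 8), (2, 8)])
br = boundary_region(support, core)
print(round(br.area, 1))   # 100 - 36 = 64.0
```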
Figure 5: Possible configurations when four uncertain curves meet
5 Creating a representation of the meeting relationships for uncertain regions In an uncertain ordered edge list, each uncertain curve is supposed to form the boundary between two specific faces. However, the conversion method from the previous section only gives the boundary of a face as a single curve. Thus this curve has to be split into several curves. Because of rounding errors, digitizing errors and other inaccuracies, one cannot guarantee that two regions meet up perfectly even if the data provider knows that they do. Additionally, one does not know whether four lines meet in one point or two. One simple example of this last problem is shown in Fig. 5. For the four uncertain regions to the left, it is impossible to know which of the three cases to the right is the actual one. For these reasons creating an uncertain partition from its component set of regions is not easy. One way of storing the topological relationships is to try to make the regions fit together. This requires that the regions are changed slightly so that they do fit together. The method presented below is one way of creating such a best fit for the advanced discrete vector model from (Tøssebro and Nygård 2002a):
1. Add a buffer to the supports to ensure that the supports overlap entirely. For each point of the support of A that is inside the support of B, increase the distance from the core of A.
   a. Advantage of a large buffer: makes certain that the supports overlap entirely.
   b. Disadvantage of a large buffer: makes end points larger than necessary.
2. Remove those parts of the support of A that are inside the core of B.
3. Repeat points 1 and 2 for region B.
4. Take the intersection of the supports of A and B. This is the support of the border curve between the two faces.
5. Find the end points of the border curve. The support of an end point consists of all the line segments of the support of the uncertain curve that are not
Figure 6: Creating the advanced uncertain partition
shared with the core of either A or B. These line segments will form two lines. Turn each of these lines into cycles by adding a straight closing line segment. 6. The core of the border curve and border points are determined as for normal uncertain curves and points. Figure 6 shows an example of two uncertain regions that border each other using this algorithm. The gray areas in the final representation are the meeting points of the border lines. If several faces border each other with no gaps between (a partial partition), each end point should be the end point of three or more border curves. One may use the following procedure to join the results together: 1. Take the union of the supports of A from all the topological relationships that it is involved in. 2. Remove those parts of A that are inside the cores of any region that it should border. 3. Join all meeting points that overlap by creating a new meeting point. The support of this point is the intersection of the supports of all the regions that meet there. (An alternative would be to use the union of the supports of the existing points. However, this may result in points with a weird and unnatural shape.) The number of possible configurations like in Fig. 5 grows rapidly with the number of meeting curves. For four curves, the number is three, for five it is 11, and for six it is around 45. Therefore it is important that the algorithm decides which of the configurations is most likely the correct one, and not delay that decision to query time. The above algorithm assumes that all the lines end in a single point. This assumption makes the algorithm simpler. An alternative assuming that at most three lines meet in each end point has been constructed but is omitted due to space limitations. An additional aspect of the advanced uncertain partition is whether the shape of an uncertain region should be stored only as the border curves or
also stored separately. Because the border curves are needed anyway in an uncertain partition, letting the shape of the region be implied by its border curves costs less space and prevents inconsistencies between the representations. However, it also means that to get the shape of the region, one must retrieve all its border curves. If the shape was also stored in the region itself, the shape could be retrieved from the region object itself without retrieving the border curves, which probably are stored in another table.
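A rough sketch of steps 1-4 of the pairwise fitting procedure above is given below, again using Shapely (an assumption). Note that step 1 is simplified to a uniform buffer, whereas the procedure in this section widens the support only where the two supports already overlap, relative to the core; the function and parameter names are illustrative.

```python
from shapely.geometry import Polygon

def fit_border(support_a, core_a, support_b, core_b, buffer_dist=1.0):
    """Approximate steps 1-4 for two neighbouring uncertain regions:
    widen the supports so that they overlap, cut them back at the other
    region's core, and take their overlap as the support of the shared border."""
    # Step 1 (simplified): add a uniform buffer so that the supports overlap.
    sa = support_a.buffer(buffer_dist)
    sb = support_b.buffer(buffer_dist)
    # Steps 2-3: remove the parts of each support inside the other region's core.
    sa = sa.difference(core_b)
    sb = sb.difference(core_a)
    # Step 4: the intersection of the supports is the support of the border curve.
    return sa.intersection(sb)
```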
6 Related work Many studies have looked at topology in uncertain data from a slightly different direction: How is the classical 9-intersection model affected by uncertainty? Early works in this direction like (Clementini and Di Felice 1996), (Clementini and Di Felice 1997) and (Cohn and Gotts 1996) have found that the number of distinct topological relationships increases from 8 to 44 when boundaries are broad rather than crisp. These 44 different relationships are then grouped into 14 different cases, one for each of the 8 crisp relationships from Table 1, and 6 that indicate which of the 8 the relationship is closest to. (Roy and Stell 2001) uses a method similar to that used in (Cohn and Gotts 1996) but defines the relationships differently. Rather than define a lot of possible relationships, they show how one can find approximate answers to some core relationships. Rather than having an explicit “Almost overlap” relationship, they determine for each of the topological relationships the probability that it is the true relationship. (Schneider 2001) and (Zhan 1998) present versions of this method which give a mathematical way of computing a fuzzy Boolean value for uncertain topological relationships. (Winter 2000) presents a statistical method for computing the probability that various topological relationships are true for two uncertain regions. (Cobb and Petry 1998) uses fuzzy sets to model topological relationships and direction relationships between spatial objects modelled as rectangles. Their approach is to find out how much of object B is west of object A or how much of object B overlaps object A. They store this as fuzzy set values in a graph that has one node for each direction and edges to nodes representing the objects. In Chap. 9 of (Molenaar 1998), a model for fuzzy regions and lines is defined. This model is defined on a crisp partition of the space. The shape of a region is defined by which of the faces in this underlying partition it contains. A fuzzy region is defined by assigning a fuzzy set value for each underlying face the region contains. A fuzzy line is defined by two fuzzy regions, one that lies to the right of the line, and one that lies to the left. This
model also includes methods for computing fuzzy overlap, fuzzy adjacency and fuzzy intersection of a line and a region. The fuzzy adjacency relationship states that if the supports of the regions are adjacent or overlapping, the regions are adjacent. It also defines strict adjacency as adjacent but not overlapping. This method works for vague regions but not for uncertain regions. For uncertain regions, overlapping supports may just as easily indicate a small overlap or that the regions are near each other. Therefore one needs to store adjacency explicitly in the uncertain case rather than derive it from the geometry as the model from (Molenaar 1998) does. (Bjørke 2003) uses a slightly different way of deriving fuzzy lines from regions and computing topological relationships. The fuzzy boundary of a fuzzy region is defined with the function 2 ⋅ min [ fv ( x , y) , 1 – fv ( x, y) ] where fv is the fuzzy membership value at those coordinates. To compute the relationship between two fuzzy objects, (Bjørke 2003) computes the fuzzy set values of the four intersections between the boundary and the interior of the two regions. Bjørke then computes the similarity between this result and all the 8 valid topological relationships from Table 1. The model from (Bjørke 2003) is capable of computing the border of a region as well as topological relationships on the regions, including equals and meets. It does not deal with meeting points. It is also an abstract model while this paper focuses on a discrete model.
7 Conclusions This paper has presented a way to represent topological information in a vector model for uncertain or fuzzy data. In particular, it shows how to represent the various relationships classified under meet by (Schneider 2001) in a vector model such that one can get definite results in queries if such results are known by the data providers. This paper also shows how one can store a partition made up of uncertain or vague regions in a vector model. The representation is derived from an earlier model for representing individual uncertain objects. The paper further presents algorithms for generating the uncertain boundary of an uncertain face and the end points of an uncertain curve. These are used to generate uncertain partitions from a set of regions. Regions that the data provider knows share a border, do not necessarily meet perfectly. Various errors may cause slight overlaps or gaps between them. This paper presents a solution to this that bases itself on altering the regions so that they meet perfectly. One challenge is that our model requires fairly complex algorithms for extracting the border curves of faces and end points of lines. Additionally,
the algorithm for end points is only an approximate solution. We can think of no way to generate the uncertain end point precisely. A simpler representation based on another discrete model has been constructed but is not included due to space limitations.
References Bjørke JT (2003) Topological relations between fuzzy regions: derivation of verbal terms. Accepted for publication in Fuzzy sets and Systems. Clementini E and Di Felice P (1996) An Algebraic Model for Spatial Objects with Indeterminate Boundaries. In Geographic Objects with Indeterminate Boundaries, GISDATA series vol. 2, Taylor & Francis, pp. 155-169. Clementini E, and Di Felice P (1997) Approximate Topololgical Relations. International Journal of Approximate Reasoning, 16, pp. 173-204. Cohn AG and Gotts NM (1996) The ‘Egg-Yolk’ Representation of Regions with Indeterminate Boundaries. In Geographic Objects with Indeterminate Boundaries, GISDATA series vol. 2, Taylor & Francis, pp. 171-187. Cobb MA and Petry FE (1998) Modeling Spatial Relationships within a Fuzzy Framework. In Journal of the American Society for Information Science, 49 (3), pp. 253-266. Molenaar M (1998) An Introduction to the Theory of Spatial Object Modelling for GIS. (Taylor & Francis). Plewe B (2002) The Nature of Uncertainty in Historical Geographic Information. Transactions in GIS, 6 (4), pp. 431-456. Roy AJ and Stell JG (2001) Spatial Relations between Indeterminate Regions. Accepted for publication in International Journal of Approximate Reasoning, 27(3), pp.205-234. Schneider M (2001) Fuzzy Topological Predicates, Their Properties, and Their Integration into Query Languages. In Proceedings of the 9th ACM Symposium on Geographic Information Systems (ACM GIS), pp. 9-14. Tøssebro E and Nygård M (2002a) An Advanced Discrete Model for Uncertain Spatial Data. In Proceedings of the 3rd International Conference on Web-Age Information Management (WAIM), pp. 37-51. Tøssebro E and Nygård M (2002b) Abstract and Discrete models for Uncertain Spatiotemporal Data. Poster presentation at 14th International Conference on Scientific and Statistical Database Management (SSDBM). An abstract on page 240 of the proceedings. Winter S (2000) Uncertain Topological Relations between Imprecise Regions. International Journal of Geographical Information Science, 14(5), pp. 411430. Worboys M (1995) GIS: A Computing Perspective. (Taylor & Francis). Zhan FB (1998) Approximate analysis of binary topological relations between geographic regions with indeterminate boundaries. Soft Computing, 2, pp. 28-34.
Modeling Topological Properties of a Raster Region for Spatial Optimization
Takeshi Shirabe
Institute for Geoinformation, Technical University of Vienna, 1040 Vienna, Austria,
[email protected] Abstract Two topological properties of a raster region – connectedness and perforation – are examined in the context of spatial optimization. While topological properties of existing regions in raster space are well understood, creating a region of desired topological properties in raster space is still considered as a complex combinatorial problem. This paper attempts to formulate constraints that guarantee to select a connected raster region with a specified number of holes in terms amenable to mixed integer programming models. The major contribution of this paper is to introduce a new intersection of two areas of spatial modeling – discrete topology and spatial optimization – that are generally separate.
1 Introduction A classic yet still important class of problems frequently posed to rasterbased geographic information systems (GIS) is one of selecting a region in response to specified criteria. It dates back to the early 1960s when landscape architects and environmental planners started the so-called overlay mapping – a process of superimposing transparent single-factor maps on a light table – to identify suitable regions for particular land uses (see, e.g. McHarg 1969). The then manual technique is today automated by most of the current GIS. Expressing spatial variation in numerical terms, rather than by color and texture, makes the technique more ‘rigorous and objective’ (Longley et al. 2001) as well as reproductive and economical. If a region-selection problem can be reduced to a locationwise screening, the overlay mapping technique will suffice. For example, a query like ‘select locations steeper than 10% slope within 200m from streams’ can be easily answered. If selection criteria apply not to individual locations but to loca-
tions as a whole, however, one will encounter combinatorial complexity. To see this, add to the above query a criterion ‘selected locations must be connected and their total area is to be maximized.’ Criteria like this are called ‘holistic’ as opposed to ‘atomistic’ (Tomlin 1990, Brookes 1997), and found in a variety of planning contexts such as land acquisition (Wright et al. 1983, Diamond and Wright 1991, Williams 2002, 2003), site selection (Diamond and Wright 1988, Cova and Church 2000), and habitat reserve design (McDonnell et al. 2002, Church et al. 2003). Storing cartographic data in digital form also enables one to cast such a holistic inquiry as a mathematical optimization problem. In general, to address an optimization problem, two phases are involved: model formulation and solution. A problem may be initially stated in verbal terms as illustrated above. It is then translated into a set of mathematical equations. These equations tend to involve discrete (integer) as well as continuous decision variables – decision variables are the unknowns whose values determine the solution of the problem under consideration – for dealing with indivisible raster elements. A system of such equations is called a mixed integer programming (MIP) model. Once a problem is formulated as such, it is solved by either heuristic or exact methods. Heuristic methods are designed to find approximate solutions in reasonable times and are useful for solving large-scale models (as is often the case with raster space). Exact methods, on the other hand, aim to find best (or optimal) solutions – with respect to criteria explicitly considered. If there is no significant difference between their computational performances, exact methods are preferred. Even when exact methods are not available, good heuristic methods should be able to tell how good obtained solutions are relative to possible optima. Thus, whether a problem is solved approximately or exactly, for solutions to be correctly evaluated, the problem needs to be formulated exactly. Many region-selection criteria have been successfully formulated in MIP format (see, e.g. Wright et al. 1983, Gilbert et al. 1985, Benabdallah and Wright 1991, Cova and Church 2000, Williams 2002). Typical among them are (geometric) area, (non-geometric) size, compactness, and connectedness. The area of a region can be equated with the number of raster elements in the region. Similarly, the size of a region in terms of a certain attribute (such as cost or land use suitability) is computed by aggregating the attribute value of each element in the region. Compactness is a more elusive concept and has no universally accepted measure. This allows various formulae for compactness, such as the ratio between the perimeter and the area of a region, the sum of the distances – often squared or weighted – between each element and a region center, and the number of adjacent pairs of elements in a region.
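To make the MIP framing concrete, here is a small sketch using the PuLP modelling library (an assumption; the grid data and parameter names are invented) that selects raster cells so as to maximise total suitability subject to an upper bound on area, i.e. an atomistic objective plus one simple holistic size limit. Connectedness and perforation, the subject of this paper, are deliberately not enforced here.

```python
import pulp

# A tiny 3x3 suitability grid (illustrative values only).
values = [[3, 1, 4],
          [1, 5, 9],
          [2, 6, 5]]
cells = [(r, c) for r in range(3) for c in range(3)]
suit = {(r, c): values[r][c] for (r, c) in cells}
max_cells = 4                                            # upper bound on area

prob = pulp.LpProblem("region_selection", pulp.LpMaximize)
x = {rc: pulp.LpVariable(f"x_{rc[0]}_{rc[1]}", cat="Binary") for rc in cells}

prob += pulp.lpSum(suit[rc] * x[rc] for rc in cells)     # total suitability
prob += pulp.lpSum(x[rc] for rc in cells) <= max_cells   # area constraint

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(sorted(rc for rc in cells if x[rc].value() > 0.5))
# -> [(1, 1), (1, 2), (2, 1), (2, 2)]  (the four most suitable cells)
```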
Connectedness (or contiguity) is, on the other hand, a difficult property to express in MIP form. In fact, it was not modeled as such until recently (Williams 2002). A frequently used alternative for addressing connectedness is to pursue a compact region, which tends (but is not guaranteed) to be connected. Though connectedness and compactness are in theory independent of each other, this approach is of practical value, particularly when both properties are required. 'Region-growing' (Brookes 1997) is another technique that guarantees to create a connected region. The procedure starts with an arbitrarily chosen raster element and attaches to it one element after another until a region of the desired size and shape is built. Cova and Church (2000) devised another heuristic for growing a region along selected shortest paths. In contrast to these heuristics, Williams (2002) proposed a MIP formulation of a necessary and sufficient condition to enforce connectedness. The formulation exploits the facts that a set of discrete elements can be represented by a planar graph, whose dual is also a planar graph, and that the two graphs have spanning trees that do not intersect each other. More recently, Shirabe (2004) proposed another MIP formulation of an exact connectedness condition based on concepts of network flows (see Section 3). There is no difference in accuracy between the two exact models, and they involve the same number of binary (0-1) decision variables – a major factor affecting tractability. The latter formulation, however, requires fewer continuous variables and constraints and seems more tractable.

Perforation is another important region-selection criterion. A region with holes, even though connected, often lacks integrity and utility. Neither of the two aforementioned connectedness models prevents a region from having holes. In fact, the more elements a region is required to contain, the more likely it is to have holes. The closest approach to achieving non-perforation is Williams's MIP model for selecting a convex (i.e. neither perforated nor indented) region from raster space (Williams 2003). It does not consider all convex regions, but it guarantees to produce a certain type of convex region. Convexity is, however, not a necessary but a sufficient condition for connectedness. Thus a potentially good connected region without holes might be overlooked.

The last two criteria are difficult because they are topological in nature. Unlike their metric counterparts, they are strictly evaluated in Boolean terms. Topology in discrete space – as opposed to that in continuous space (Egenhofer and Franzosa 1991, Randell et al. 1992) – has been formalized by many researchers in terms that can be implemented in raster GIS and image processing software (see Section 2). Their topological models have enhanced spatial reasoning and query, which deal with regions already recorded.
They may not, however, lend themselves to spatial optimization, which is concerned with regions yet to be realized. This paper therefore aims to bridge this gap. More specifically, it proposes a MIP formulation of a necessary and sufficient condition for selecting a connected region without holes from raster space. It then generalizes the formulation so that a desired degree of perforation can be achieved – from no hole to the largest possible number of holes. The rest of the paper is organized as follows. Section 2 reviews the topology of a single region in raster space. Section 3 presents an existing MIP-based connectedness constraint set and two new MIP-based perforation constraint sets. Section 4 reports computational experiments. Section 5 concludes the paper.
2 Topology of a Single Raster Region

Raster space is a two-dimensional space that is discretized into equally sized square elements called cells (or pixels in image processing). Two cells are said to be 4-adjacent (resp. 8-adjacent) if they share a side (resp. a side or a corner) (figure 1). A finite subset of raster space is here referred to as a raster map. A subset taken from a raster map constitutes a region.
Fig. 1. Adjacency. Shaded cells are (a) 4- or (b) 8-adjacent to the center cell.
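As an added illustration (the row/column indexing convention is an assumption made here, not part of the paper), the two adjacency notions can be enumerated directly:

# A minimal sketch of 4- and 8-adjacency for a cell (r, c) in an nrows x ncols map.
def neighbors(r, c, nrows, ncols, adjacency=4):
    if adjacency == 4:
        offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]            # shared side
    else:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]                        # side or corner
    return [(r + dr, c + dc) for dr, dc in offsets
            if 0 <= r + dr < nrows and 0 <= c + dc < ncols]

print(len(neighbors(5, 5, 10, 10, adjacency=4)))  # 4
print(len(neighbors(5, 5, 10, 10, adjacency=8)))  # 8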
Connectedness and perforation in raster space are roughly described as follows. A region is said to be connected if one can travel from any cell to any other cell in the region by following a sequence of adjacent cells without leaving the region. Likewise, a region is said to be perforated if it has a hole, that is, a maximal connected set of cells not included in but completely surrounded by the region. Connected regions are classified into two kinds: simply connected and multiply connected (Weisstein 2004). A simply connected region has no hole, while a multiply connected region has one or more holes. In general, a connected region with n holes is called an (n+1)-ply connected region. These topological properties are easily recognized by visual inspection, and they should also be implied by the adjacency relations between cells. Unlike its Euclidean counterpart, however, connectedness in raster space is not so obvious, which in turn makes perforation ambiguous, too.
This is well exemplified by the so-called connectedness paradox (Kovalevsky 1989, Winter 1995, Winter and Frank 2000, Roy and Stell 2002). To see it, consider the raster curve illustrated in figure 2. In terms of the 4-adjacency, the curve is not closed, yet the inside of the curve is not connected to the outside of the curve. This violates the Jordan curve theorem (Alexandroff 1961), as a non-closed curve has separated space into two parts. If the 8-adjacency is employed, the curve is closed, yet the inside of the curve is connected to the outside of the curve. This, too, violates the theorem, since a closed curve has failed to separate space into two parts.
Fig. 2. A raster curve
To overcome the connectedness paradox, Kong and Rosenfeld (1989) applied different adjacency rules to a region and its complement (i.e. its background). For instance, if the curve mentioned above has the 8-adjacency and its complement has the 4-adjacency, the curve is closed and separates space into two parts. The paradox is then resolved. Kovalevsky (1989), Winter (1995), Winter and Frank (2000), and Roy and Stell (2002) took another approach to the paradox. They decompose space into geometric elements of n different dimensions called 'n-cells', which are collectively referred to as a 'cellular complex' (Kovalevsky 1989). A cellular complex of raster space, or a 'hyper raster' (Winter 1995, Winter and Frank 2000), consists of 0-cells (cell corners), 1-cells (cell sides), and 2-cells (cells). Euclidean topology applies to this model (Winter 1995), so the paradox does not arise.

This paper relies on a mixed-adjacency model – more specifically, the (4, 8) adjacency model (Kong and Rosenfeld 1989), which applies the 4-adjacency to a region and the 8-adjacency to its complement – for two reasons. First, as far as a single region is concerned, no paradox occurs. Second, a raster map can be represented by a graph, which enables one to evaluate the connectedness and perforation of a region without reference to its complement (see Section 3).
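To make the (4, 8) convention concrete, the following sketch (an added example, with an assumed set-of-cells representation of a region) traverses the region with 4-adjacency and its complement with 8-adjacency; holes are the background components that do not touch the map boundary.

# A minimal sketch of connectedness and hole counting under the (4, 8) adjacency.
from collections import deque

def components(cells, offsets):
    """Group 'cells' into connected components under the given adjacency offsets."""
    cells, comps = set(cells), []
    while cells:
        seed = cells.pop()
        comp, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr, dc in offsets:
                nb = (r + dr, c + dc)
                if nb in cells:
                    cells.remove(nb)
                    comp.add(nb)
                    queue.append(nb)
        comps.append(comp)
    return comps

OFF4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
OFF8 = OFF4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def is_connected(region):
    return len(components(region, OFF4)) <= 1            # region uses 4-adjacency

def count_holes(region, nrows, ncols):
    background = {(r, c) for r in range(nrows) for c in range(ncols)} - set(region)
    # complement uses 8-adjacency; a hole is a component not touching the border
    return sum(1 for comp in components(background, OFF8)
               if all(0 < r < nrows - 1 and 0 < c < ncols - 1 for r, c in comp))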
To represent a raster map in terms of a graph, each cell is equated with a vertex and each adjacency relation between a pair of cells is equated with an edge. If the 4-adjacency is employed, the graph is planar; that is, it can be drawn in the plane such that no two edges cross (Ahuja et al. 1993). A face is formed by a cycle of four edges. In the case of the 8-adjacency, however, a raster map cannot be modeled by a planar graph. A connected planar graph has an important property expressed by the following Euler's formula:

v - e + f = 2    (1)

where v, e, and f are the numbers of vertices, edges, and faces, respectively, in the graph. Consider a 10-by-10 raster map as an example. Its associated graph satisfies the formula, as it has 100 vertices, 180 edges, and 82 faces (including an unbounded face). The formula also applies to a connected subgraph of it. Figure 3 illustrates two connected regions and their corresponding graphs. The graph on the left has 25 vertices, 34 edges, and 11 faces, while the graph on the right has 25 vertices, 29 edges, and six faces. Here it is important to note that a perforated region has faces formed by more than four edges.
Fig. 3. Graph representations of two regions. Each region (shaded) is represented by a graph of vertices (white dots) and edges (white lines).
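The vertex, edge, and face counts above can be reproduced mechanically. The sketch below (an added illustration under the assumption that the region is connected and stored as a set of cells) counts only the unit four-edge cycles as bounded faces, so Euler's formula yields the number of holes:

# A minimal sketch: counts for the 4-adjacency graph of a region and the number
# of holes inferred from Euler's formula v - e + f = 2.
def graph_counts(region):
    region = set(region)
    v = len(region)
    e = sum(1 for (r, c) in region if (r + 1, c) in region) + \
        sum(1 for (r, c) in region if (r, c + 1) in region)
    # a unit (four-edge) face exists wherever a 2x2 block of cells is fully selected
    f4 = sum(1 for (r, c) in region
             if {(r + 1, c), (r, c + 1), (r + 1, c + 1)} <= region)
    return v, e, f4

def number_of_holes(region):
    v, e, f4 = graph_counts(region)
    # Euler: v - e + (f4 + holes + 1) = 2 for a connected region under 4-adjacency
    return 2 - v + e - f4 - 1

# Example: a 5x5 block with its center cell removed has exactly one hole.
block = {(r, c) for r in range(5) for c in range(5)} - {(2, 2)}
print(graph_counts(block), number_of_holes(block))   # (24, 36, 12) 1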
3 Formulation

In this section, based on the (4, 8) adjacency, we present three sets of connectedness and perforation constraints. They are formulated to be self-contained so that they can be used as components of region-selection models that may involve additional criteria for optimization. The first constraint set guarantees to select a connected region regardless of the number of holes. The second constraint set, coupled with the first one, guarantees to select a simply connected region. The third constraint set generalizes the second one to control the number of holes in a connected region.
3.1 Connectedness Constraints

Connectedness is defined in graph-theoretic terms as follows: a region is said to be connected if there is at least one path between every pair of vertices in the graph, G, associated with the region. Since the adjacency relation between any pair of cells is symmetric in raster space, a necessary and sufficient condition for G to be connected is that there is at least one vertex, p, in G to which there is a path from any other vertex in G. This can be interpreted in terms of network flows by regarding p as a sink and the rest of the vertices in G as sources, and by replacing each edge with two opposite directed arcs. In this setting, for G to be connected, every source must have a positive amount of supply that ultimately reaches the sink (Shirabe 2004). Thus, selecting a connected region from a raster map is equivalent to selecting, from a network associated with the map, a sink vertex and source vertices such that they make an autonomous subnetwork that satisfies the above condition. This is expressed by the following set of constraints.

\sum_{j|(i,j)\in A} y_{ij} - \sum_{j|(j,i)\in A} y_{ji} \ge x_i - m w_i    \forall i \in I    (2)

\sum_{i\in I} w_i = 1    (3)

\sum_{j|(j,i)\in A} y_{ji} \le (m - 1) x_i    \forall i \in I    (4)

where
I: set of cells in a given raster map. Each cell of I is denoted by i or j.
A: set of ordered pairs of adjacent cells in the map. By definition, (i, j) \in A if and only if (j, i) \in A.
m: given nonnegative integer indicating the number of cells to be selected for the region.
x_i: binary decision variable indicating whether cell i is selected for the region; x_i = 1 if selected, 0 otherwise.
w_i: binary decision variable indicating whether cell i is a sink; w_i = 1 if a sink, 0 otherwise.
y_{ij}: nonnegative continuous decision variable indicating the amount of flow from cell i to cell j.

Constraints (2) represent the net flow of each cell. The two terms on the left-hand side represent, respectively, the total outflow and the total inflow of cell i. Constraint (3) requires that one and only one cell be a sink. Constraints (4) ensure that there is no flow into any cell in the region's complement (where x_i = 0) and that the total inflow of any cell in the region (where x_i = 1) does not exceed m - 1. This implies that there may be flow (though unnecessary) from a cell in the region's complement to a cell in the region. Even in this case, the supply from each cell in the region is still constrained to reach the sink, and the contiguity condition holds. This formulation has a fairly efficient structure, as the numbers of binary variables, continuous variables, and constraints are |I|, |A|, and 2|I|+1, respectively.
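As a concrete illustration (an added sketch, not the author's implementation), constraints (2)-(4) can be written down with an off-the-shelf MIP modeling library; the 6-by-6 map, the cost surface, and the explicit cardinality constraint on m used below are assumptions made for the example.

# A minimal sketch of connectedness constraints (2)-(4) using PuLP.
import pulp

N, m = 6, 10                                       # hypothetical map and region sizes
I = [(r, c) for r in range(N) for c in range(N)]
A = [(i, j) for i in I for j in I
     if abs(i[0] - j[0]) + abs(i[1] - j[1]) == 1]  # ordered pairs of 4-adjacent cells

x = pulp.LpVariable.dicts("x", I, cat="Binary")    # cell selected for the region
w = pulp.LpVariable.dicts("w", I, cat="Binary")    # cell chosen as the sink
y = pulp.LpVariable.dicts("y", A, lowBound=0)      # flow on each directed arc

cost = {i: i[0] + i[1] for i in I}                 # hypothetical cell values
prob = pulp.LpProblem("connected_region", pulp.LpMinimize)
prob += pulp.lpSum(cost[i] * x[i] for i in I)      # an arbitrary objective
prob += pulp.lpSum(x[i] for i in I) == m           # select exactly m cells
prob += pulp.lpSum(w[i] for i in I) == 1           # constraint (3): one sink

for i in I:
    outflow = pulp.lpSum(y[a] for a in A if a[0] == i)
    inflow = pulp.lpSum(y[a] for a in A if a[1] == i)
    prob += outflow - inflow >= x[i] - m * w[i]    # constraint (2)
    prob += inflow <= (m - 1) * x[i]               # constraint (4)

prob.solve()
region = [i for i in I if x[i].value() > 0.5]      # a connected region of m cells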
3.2 Simple Connectedness Constraints

The above constraint set only guarantees to make a connected region, so it is possible that the region has one or more holes. To prevent perforation, a new set of constraints is added. It takes advantage of a special topological structure of a simply connected region in the (4, 8) adjacency model: in a graph representation of a simply connected region, all faces except the one unbounded face are cycles of four edges (see the region on the left in Figure 3). Thus enforcing Euler's formula, while counting only such cycles as faces, will result in a simply connected region. Accordingly, the following constraints, together with constraints (2)-(4), guarantee to select a simply connected region from a raster map.

m - \sum_{(i,j)\in E} e_{ij} + (1 + \sum_{(i,j,k,l)\in F} f_{ijkl}) = 2    (5)

e_{ij} \le x_i    \forall (i,j) \in E    (6)
e_{ij} \le x_j    \forall (i,j) \in E    (7)
e_{ij} \ge x_i + x_j - 1    \forall (i,j) \in E    (8)
f_{ijkl} \le e_{ij}    \forall (i,j,k,l) \in F    (9)
f_{ijkl} \le e_{kl}    \forall (i,j,k,l) \in F    (10)
f_{ijkl} \ge e_{ij} + e_{kl} - 1    \forall (i,j,k,l) \in F    (11)

where
E: set of unordered pairs of adjacent cells in the map.
F: set of unordered quads of adjacent cells in the map.
e_{ij}: binary decision variable indicating whether cells i and j (unordered) are both selected for the region; e_{ij} = 1 if so, 0 otherwise.
f_{ijkl}: binary decision variable indicating whether cells i, j, k, and l (unordered) are all selected for the region; f_{ijkl} = 1 if so, 0 otherwise.
Constraint (5) is Euler's formula for a graph representation of a simply connected region. The three terms on the left-hand side of constraint (5) respectively represent the numbers of vertices, edges, and faces (including the one unbounded face). Constraints (6)-(8) make e_{ij} = 1 if cells i and j are both included in the region, and 0 otherwise. Constraints (9)-(11) make f_{ijkl} = 1 if cells i, j, k, and l are all included in the region, and 0 otherwise. Since all x_i's are constrained to be 0 or 1, all e_{ij}'s and f_{ijkl}'s are guaranteed to be 0 or 1 without explicit integrality constraints on them. Thus the number of binary variables remains the same as in the previous model.

3.3 Multiple Connectedness Constraints

A connected region that does not satisfy constraints (5)-(11) has one or more holes. This does not mean, however, that such a region violates Euler's formula; rather, it has faces that the previous formulation cannot detect. Those overlooked faces are formed by more than four edges. Fortunately, such faces are easy to enumerate, since each of them encircles one and only one hole (see the region on the right in Figure 3). Therefore, the following constraint generalizes Euler's formula for a connected region with n holes (an (n+1)-ply connected region), with the (4, 8) adjacency assumed.

m - \sum_{(i,j)\in E} e_{ij} + (1 + n + \sum_{(i,j,k,l)\in F} f_{ijkl}) = 2    (12)

This constraint, together with constraints (2)-(4) and (6)-(11), guarantees to select a connected region with n holes from a raster map. It should be noted that the number of holes can be confined to a specific range (rather than a single value) by making constraint (12) an inequality.
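Continuing the earlier sketch (the objects prob, x, I, N, m, and pulp are those defined there; the layout of E and F below is an assumption for a full rectangular map), constraints (5)-(12) can be added as follows:

# A sketch of the perforation constraints (5)-(12); n_holes = 0 gives constraint (5).
n_holes = 0

Iset = set(I)
E = [((r, c), (r, c + 1)) for (r, c) in I if (r, c + 1) in Iset] + \
    [((r, c), (r + 1, c)) for (r, c) in I if (r + 1, c) in Iset]
F = [((r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)) for (r, c) in I
     if (r + 1, c + 1) in Iset]

e = pulp.LpVariable.dicts("e", E, lowBound=0, upBound=1)   # forced to 0/1 by (6)-(8)
f = pulp.LpVariable.dicts("f", F, lowBound=0, upBound=1)   # forced to 0/1 by (9)-(11)

for (i, j) in E:
    prob += e[(i, j)] <= x[i]                              # (6)
    prob += e[(i, j)] <= x[j]                              # (7)
    prob += e[(i, j)] >= x[i] + x[j] - 1                   # (8)
for (i, j, k, l) in F:
    prob += f[(i, j, k, l)] <= e[(i, j)]                   # (9): pair (i, j) of the quad
    prob += f[(i, j, k, l)] <= e[(k, l)]                   # (10): opposite pair (k, l)
    prob += f[(i, j, k, l)] >= e[(i, j)] + e[(k, l)] - 1   # (11)

# generalized Euler constraint (12); together with (2)-(4) it fixes the hole count
prob += (m - pulp.lpSum(e[p] for p in E)
         + 1 + n_holes + pulp.lpSum(f[q] for q in F)) == 2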
4 Experiments

To show how the proposed constraints address connectedness and perforation, three sample problems are considered. All use the same dataset as Williams (2002): a 10-by-10 raster map in which each cell is assigned a random number taken from a uniform distribution of values ranging from 0.2 to 1.8 with a step size of 0.1. Note that, for simplicity and transparency, the problems are admittedly rather contrived, and that real-life problems often involve multiple regions (see, e.g. Benabdallah and Wright 1992, Aerts et al. 2003) and multiple objectives (see, e.g. Wright et al. 1983, Gilbert et al. 1985, Diamond and Wright 1988).
Problem 1: Select a connected region of m cells from the raster map so as to minimize the sum of the selected cells' values. The problem is formulated as the following MIP model:

Min \sum_{i\in I} c_i x_i    (13)

subject to (2)-(4), where c_i represents the random number attributed to cell i. The model was solved for 100 different values of m, from one to 100, on a Pentium 4 with 512 MB RAM, using the CPLEX 8.0 MIP solver. The model is generally tractable at this scale, as solution times (in wall clock) range from nearly zero to 23.54 seconds with a median of 1.08 seconds. All instances that took more than 10 seconds to solve are clustered where m is between 20 and 52. The model's tractability generally improved as m departed from this range. This indicates that, while a region is relatively small, it gets harder to achieve connectedness as the region grows, but that once a region becomes sufficiently large, it gets easier to achieve connectedness as the region grows further.

Problem 2: Select a simply connected region of m cells from the map so as to minimize the sum of the selected cells' values. The problem differs from Problem 1 only in that no holes are allowed. It is formulated as a MIP model consisting of objective function (13) and constraints (2)-(11). The model was again solved for 100 different values of m using the same computing system. It turned out that the model is less tractable than the previous one, as the maximum solution time was 121.24 seconds and the median was 4.355 seconds. It was also found that the model has two separate clusters of difficult instances (figure 4). The first cluster appears where m is between about 10 and 50, with the most difficult case at m = 24. It corresponds to the cluster found in Problem 1, which has already been explained. The second cluster is seen where m is relatively large, and the most difficult case is where m = 93. The implication is that a large region (in terms of the number of cells) is more susceptible to perforation. In fact, Problems 1 and 2 have identical optimal solutions where m is smaller than 78.

Problem 3: Select a connected region of m cells with n holes from the map so as to minimize the sum of the selected cells' values. The problem is formulated as a MIP model consisting of objective function (13) and constraints (2)-(4) and (6)-(12). Two experiments were conducted. In the first experiment, m was fixed to 24 (corresponding to the most difficult case in the first cluster for Problem 2) while n was varied from zero to four (the largest possible number of holes). In the second experiment, m was fixed to 93 (corresponding to the most difficult case in the second cluster for Problem 2) while n was varied from zero to seven (the largest possible number of holes). Selected solutions are illustrated in figure 5, and numerical results are summarized in table 1.
Fig. 4. Solution times for Problem 2 (solution time in seconds plotted against m)

Fig. 5. Optimal solutions to Problem 3 with m = 24: (a) n = 0, (b) n = 1, (c) n = 2, (d) n = 3, (e) n = 4
Table 1. Numerical results for Problem 3

n   MIP(24)   LP(24)   Time(24)    BB(24)      MIP(93)   LP(93)   Time(93)   BB(93)
0   10.9      9.80     45.62       3,770       82.40     81.95    123.49     30,459
1   11.7      10.15    164.63      11,131      81.70     81.70    0.21       0
2   13.5      10.52    8,780.39    668,779     81.70     81.70    0.21       0
3   14.7      10.88    22,442.42   1,704,274   81.80     81.80    0.21       0
4   16.3      11.25    87,882.50   6,737,713   81.90     81.90    0.19       0
5   -         -        -           -           82.10     82.07    0.33       0
6   -         -        -           -           82.30     82.25    0.29       0
7   -         -        -           -           82.70     82.50    0.98       0
MIP(24), LP(24), Time(24), and BB(24) respectively denote the optimal objective value of the MIP model, the optimal objective value of its LP relaxation (with no integrality constraints on the x_i's), the solution time in wall clock, and the number of branch-and-bound nodes required, when m = 24. MIP(93), LP(93), Time(93), and BB(93) denote the same quantities when m = 93. Roughly speaking, a branch-and-bound
algorithm, which is employed by CPLEX and many other MIP solvers, solves a model by moving from one LP-feasible solution to another, in a tree-like fashion, until a solution is found to be MIP-optimal. So, in general, the larger the gap between the LP and MIP optima, and the larger the tree (consisting of branch-and-bound nodes), the less tractable the model is.
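To illustrate with the figures in Table 1, the relative gap between the LP and MIP optima for m = 24 grows from (10.9 - 9.80)/9.80, about 11%, at n = 0 to (16.3 - 11.25)/11.25, about 45%, at n = 4, in line with the growth of the branch-and-bound tree from 3,770 to 6,737,713 nodes.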
As seen in Table 1, where m = 93, any degree of perforation is a trivial requirement. It may be speculated that, when a problem requires a perforated region to encompass a large part of a raster map, there are relatively few feasible solutions. Where m = 24, on the other hand, the model's tractability deteriorates significantly. The model becomes less tractable as n increases, and it takes about one full day of computation to solve where n = 4. Other computational experiments have found even more difficult cases, whose solution requires several days of computation. These results suggest that the perforation requirement adds considerable computational complexity.

5 Conclusion

We have formulated three sets of MIP constraints that guarantee to select from a raster map a connected region regardless of the number of holes, a connected region without holes, and a connected region with a specified number of holes, respectively. Though they involve relatively few binary decision variables, computational experiments suggest that they are, in a practical sense, not tractable enough to be solved by a general-purpose MIP solver such as CPLEX. The most difficult problems seem to be those that select a medium-sized connected region with many holes from a large raster map (though perforated regions are largely of more theoretical interest than practical application). Thus, to solve such problems, one may need to resort to heuristic methods (e.g. Brookes 1997, Aerts and Heuvelink 2002).

We have assumed the (4, 8) adjacency in this paper. It is then natural to ask whether connectedness and perforation constraints can be similarly formulated in the (8, 4) adjacency and the hyper raster model. At present, we do not have answers. The difficulty lies in the fact that those models do not represent a region by a planar graph. Still, they are structured so regularly that other approaches might exist. This should be explored in future research.

Lastly, we have seen significant overlap between discrete topology and spatial optimization. Connectedness and perforation are a fundamental yet small part of it. It would be interesting to see how other topological properties – including topological relations between two regions – are incorporated into spatial optimization models. This, too, should be explored in future research.
References

Aerts JCJH, Heuvelink GBM (2002) Using simulated annealing for resource allocation. International Journal of Geographical Information Science 16: 571-587
Aerts JCJH, Eisinger E, Heuvelink GBM, Stewart TJ (2003) Using linear integer programming for multi-region land-use allocation. Geographical Analysis 35: 148-169
Ahuja RK, Magnanti TL, Orlin JB (1993) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs, New Jersey
Alexandroff P (1961) Elementary concepts of topology. Dover, New York
Brookes CJ (1997) A parameterized region-growing programme for region allocation on raster suitability maps. International Journal of Geographical Information Science 11: 375-396
Benabdallah S, Wright JR (1991) Shape considerations in spatial optimization. Civil Engineering Systems 8: 145-152
Benabdallah S, Wright JR (1992) Multiple subregion allocation models. ASCE Journal of Urban Planning and Development 118: 24-40
Church RL, Gerrard RA, Gilpin M, Sine P (2003) Constructing cell-based habitat patches useful in conservation planning. Annals of the Association of American Geographers 93: 814-827
Cova TJ, Church RL (2000) Contiguity constraints for single-region site search problems. Geographical Analysis 32: 306-329
Diamond JT, Wright JR (1988) Design of an integrated spatial information system for multiobjective land-use planning. Environment and Planning B 15: 205-214
Diamond JT, Wright JR (1991) An implicit enumeration technique for the land acquisition problem. Civil Engineering Systems 8: 101-114
Egenhofer M, Franzosa R (1991) Point-set topological spatial relations. International Journal of Geographical Information Systems 5: 161-174
Gilbert KC, Holmes DD, Rosenthal RE (1985) A multiobjective discrete optimization model for land allocation. Management Science 31: 1509-1522
Kong TY, Rosenfeld A (1989) Digital topology: introduction and survey. Computer Vision, Graphics, and Image Processing 48: 357-393
Kovalevsky VA (1989) Finite topology as applied to image analysis. Computer Vision, Graphics, and Image Processing 46: 141-161
Longley PA, Goodchild MF, Rhind DW (2001) Geographic information systems and science. John Wiley and Sons, New York
McDonnell MD, Possingham HP, Ball IR, Cousins EA (2002) Mathematical methods for spatially cohesive reserve design. Environmental Modeling and Assessment 7: 107-114
McHarg I (1969) Design with nature. Natural History Press, New York
Randell DA, Cui Z, Cohn AG (1992) A spatial logic based on regions and connection. In: Proceedings of the third international conference on knowledge representation and reasoning. Morgan Kaufmann, San Mateo: 165-176
Roy AJ, Stell JG (2002) A qualitative account of discrete space. In: Proceedings of GIScience 2002, Lecture Notes in Computer Science 2478. Springer, Berlin: 276-290
Shirabe T (2004) A model of contiguity for spatial unit allocation. In revision
Tomlin CD (1990) Geographical information systems and cartographic modelling. Prentice Hall, Englewood Cliffs, New Jersey
Williams JC (2002) A zero-one programming model for contiguous land acquisition. Geographical Analysis 34: 330-349
Williams JC (2003) Convex land acquisition with zero-one programming. Environment and Planning B 30: 255-270
Weisstein EW (2004) World of mathematics. http://mathworld.wolfram.com
Winter S (1995) Topological relations between discrete regions. In: Egenhofer MJ, Herring JR (eds) Advances in Spatial Databases, Lecture Notes in Computer Science 951. Springer, Berlin: 310-327
Winter S, Frank AU (2000) Topology in raster and vector representation. GeoInformatica 4: 35-65
Wright JR, ReVelle C, Cohon J (1983) A multiobjective integer programming model for the land acquisition problem. Regional Science and Urban Economics 13: 31-53
Sandbox Geography – To learn from children the form of spatial concepts
Florian A. Twaroch & Andrew U. Frank
Institute for Geoinformation and Cartography, TU Vienna,
[email protected],
[email protected]

Abstract

The theory theory claims that children's acquisition of knowledge is based on forming and revising theories, similar to what scientists do (Gopnik and Meltzoff 2002). Recent findings in developmental psychology provide evidence for this hypothesis. Children have concepts about space that differ from those of adults, and during development these concepts undergo revisions. This paper proposes the formalization of children's theories of space in order to reach a better understanding of how to structure spatial knowledge. Formal models can help to make the structure of spatial knowledge more comprehensible and may give insights into how to build GIS. Selected examples of object appearances are modeled using an algebra. An Algebra Based Agent is presented and coded in a functional programming language as a simple computational model.
1 Introduction

Watch children playing in a sandbox! Although they have not collected much experience about their surroundings, they follow rules about objects in space. They observe solid objects and liquids, they also manipulate them, and they can explain the spatial behavior of objects, although their judgments are sometimes in error (Piaget and Inhelder 1999). Infants individuate objects; they seem to form categories and can make predictions about object occurrences. Like geographers, they explore their environment, make experiments, and derive descriptions. Geographers omit large-scale influences like geomorphologic movements, and so do children. Lately, social
interactions are considered in spatial models. These can also be found in the sandbox: the playing toddlers have contact with other kids, and from time to time they check whether their caring parents are still in the vicinity. Children also form theories about people (Gopnik, Meltzoff et al. 2001). This paper proposes to exploit recent findings of psychologists in order to build formal models for GIS.

Different approaches are taken to explain how adults manage spatial knowledge. Newcombe and Huttenlocher (2003) review three approaches that have influenced spatial development research during the last fifty years: Piagetianism, nativism, and Vygotskyanism. Followers of Piaget assume that children start out with no knowledge about space; in a four-stage process, child knowledge develops into adult knowledge. A follower of the nativist view is Elizabeth S. Spelke, who has identified in very young children components of cognitive systems that adults still make use of. This is called core knowledge (Spelke 2000). New knowledge can be built by the composition of these core modules. The modules themselves are encapsulated, and once they are triggered they do not change (Fodor 1987). Vygotskyanists believe that children are guided and tutored by elders, that cognitive efforts are adapted to particular situations, and that humans have a well-developed ability to deal with symbolic material (Newcombe and Huttenlocher 2003).

The present work concentrates on a view called the theory theory, explored by A. Gopnik and A. N. Meltzoff. From the moment of birth, the acquired knowledge undergoes permanent change whenever beliefs do not fit together with observed reality (Gopnik, Meltzoff et al. 2001; Gopnik and Meltzoff 2002). The present paper starts with a formalization of this model, using algebra as a mathematical tool for abstracting and prototyping.

Finding very simple and basic concepts about the physical world is not new. Patrick Hayes proposed a Naïve Physics in a manifesto (Hayes 1978; Hayes 1985). A naïve theory of motion has been investigated (McCloskey 1983). The geographic aspects have been considered in a Naïve Geography, which forms a "body of knowledge that people have about the surrounding geographic world" (Egenhofer and Mark 1995). This knowledge develops through space and time: it starts from very coarse core concepts and develops into a fully fledged theory. An initiative for common-sense geography has been set up to identify core elements (Mark and Egenhofer 1996). The demand for folk theories has been stated by several authors in Artificial Intelligence in order to achieve more usable, intuitive interfaces. Recent results from the psychology research community can influence GIS by informing new and sound models.
In extension to the naïve geography, new insights about how to structure space can be gained by investigating children's minds. Recent research in developmental psychology is discussed in section two of the paper. Section three connects these findings to current GIS research. The use of algebra in GIS is introduced in section four. An Algebra Based Agent is proposed in section five, using simple examples of object appearances for static and moving objects. Section six introduces the prototypic modeling done so far. In the concluding seventh section, the results and future research topics are discussed.
2 Children and Space

In a multi-tiered approach for GIS (Frank 2001), the human plays a central role – a cognitive agent is modeled as its own tier. For the last fifty years, however, children were ignored in the geo-sciences. For a long time children were not investigated as an object of psychological research: Aristotle and the English philosopher John Locke considered them tabulae rasae, not knowing anything in advance. Nowadays a whole research enterprise has developed that investigates children's mental models. It started with Piaget in the early fifties (Piaget, Inhelder et al. 1975; Piaget and Inhelder 1999). Although he was wrong in some of his assumptions, his ideas have been studied in detail. Piaget held the opinion that children start out into the world without any innate knowledge: all the knowledge a person has at a certain point in time had to be acquired before. Today researchers suppose that there is some innate knowledge available that is either triggered in some way and reused, or developed in the form of adaptation. According to the theory theory, the learning process is driven by three components (Gopnik and Meltzoff 2002):

Innate knowledge – core knowledge: Evidence shows that babies are born with certain abilities. Object representations consist of three-dimensional solid objects that preserve identity and persist over occlusion and time (Spelke and Van de Walle 1993). Gopnik, Meltzoff and Kuhl show that there is also an innate understanding of distance. The same authors also found evidence that there are links between information picked up from different sensory modalities (Gopnik, Meltzoff et al. 2001).

Powerful learning abilities: Equipped with those innate structures, babies start a learning process. Language acquisition especially shows how powerful this mechanism must be (Pinker 1995): in the first six years a child learns around 14 000 words. Another thing that has to be learned is
the notion of object permanence. To understand object permanence means to understand that a hidden object continues to exist. Different approaches seem to be used by children to explain this phenomenon during their learning process. The formation of object categories and the understanding of causal connections are two other aspects that have to be learned by children over many years (Gopnik, Meltzoff et al. 2001).

Unconscious tutoring by others: Adults teach children by doing things in certain ways. By repeating words, accentuating properly, and speaking slowly, they help children unconsciously to acquire the language. Children learn many things by imitation, and the absence of others can heavily influence social behavior, as demonstrated by the case of Kaspar Hauser in Nürnberg, Germany, in 1828.

These three components – innate knowledge, powerful learning, and tutoring by others – are also the basis for the theory theory of A. Gopnik and A. N. Meltzoff (Gopnik, Meltzoff et al. 2001; Gopnik and Meltzoff 2002). Children acquire knowledge by forming and revising theories, similar to what scientists do. The spatial concepts infants live with are obviously different from an adult's concepts. Ontological commitments are made in order to explain events in the world. The theories babies build about the world are revised and transformed. Children form theories about objects, people, and kinds; they learn language; and all this is connected to space.

The core of the theory theory is the formulation and testing of hypotheses. It is a theory about how humans acquire knowledge about the world (by forming theories). When children watch a situation, they are driven by an eagerness to learn. They set up a hypothesis about a spatial situation and try to prove it by trial and error. If the outcome is as expected, they become uninterested (bored) and give up testing after some tries. If something new happens, they test again and again, even using methodology. When they are puzzled, they try new hypotheses and test alternatives. An 18-month-old child is not supposed to be able to concentrate for a long time, but an experiment shows that such children test hypotheses for up to half an hour (Gopnik, Meltzoff et al. 2001).

User requirement analysis is a common way to build ontologies for GIS, using interviews, questionnaires, and desktop research. Infants cannot communicate their experiences with space through language, so psychologists make use of passive and active measurement studies. Two methods are briefly described here.

Studies of predictive action, like reaching with the hand for an object: Infants are presented with a moving object while their reaching and tracking actions are observed and measured. When doing so, children act predictively: they start reaching before the object enters their reaching space, aiming for the position where the object will appear when it reaches their hands.
Similar observations can be made for visual tracking studies and studies that measure predictive motion of the head. There is evidence that infants extrapolate future object positions (von Hofsten, Feng et al. 2000). Studies of preferential looking for novel or unexpected events: When children are confronted with outcomes different from their predictions they are puzzled. It is like watching magic tricks (Gopnik and Meltzoff 2002). The surprise can be noticed by the children’s stare. An independent observer can measure how long children watch a certain situation in an experimental setup. It is evident that children make inferences about object motions behind other objects (Spelke, Breinlinger et al. 1992).
3 Sandbox Geography

A sandbox is a place for experimentation; the laws of physics can be investigated using very simple models. The models are made of sand, so they do not last forever, but they can yield new insights into the little engineers' understanding. The objects treated in a sandbox are subject to a mesoscopic partitioning (Smith and Mark 2001): they are on a human scale, and they belong to categories that geographers form. "Sandbox Geography" is motivated by children's conception of space and can be seen as a contribution to naïve geography (Egenhofer and Mark 1995). The investigation of very simple spatial situations is necessary to find out more about how space is structured in mental models. The goal is a formalization of these simple models. There is no need to connect these models to a new theory of learning, nor do the authors intend to build a computational model of a child's understanding of space. Furthermore, the sandbox is also a place to meet, a place of social interaction. The social aspect is considered more and more in building ontologies for GIS. The presented research may contribute new insights for finding structures to define sound GIS interfaces.

The basis of the present investigation is the theory theory as explained in the previous section. An initial geographic model formed under this assumption will undergo changes. The necessity for adaptation can arise for two reasons. First, the environment may change, so that the models we made about it are no longer applicable. Second, we may acquire new knowledge or be endowed with new technology; our conceptual models then change and we perceive the environment differently. Consequently we model the environment differently.

In this paper we select one example from several theories for modeling what is called object permanence in psychology. Where is an object when
it is hidden? Adults have a quite sophisticated theory about "hidden objects". Four factors contribute to their knowledge: adults know about spatial relations between stationary objects, they assume the objects to have properties, and they know about some laws that govern the movement and the perception of objects. Equipped with this knowledge they can predict where and when an object will be visible to an observer. They can explain disappearance and reappearance and form alternative hypotheses about where the object might be if the current rules do not hold (Gopnik and Meltzoff 2002).

Children start out with quite a simple theory of where an object might be. Infants of 2.5 months expect an object to be hidden when it is behind a closer object, irrespective of their relative sizes. After about a month they revise this theory and consider the sizes as well. An object that has disappeared is at first assumed to be at the place where it appeared before. That is habituation – parents tidy up the world of an infant's objects. A later hypothesis is that an object will reappear at the place where it disappeared. The object is individuated only by its location; the properties of the object seem to be ignored. It is even possible to exchange a hidden object. In a series of experiments, an object is presented to the child and then hidden behind a screen. An experimenter exchanges the hidden object, e.g. replaces a blue ball with a red cube. Then the screen is removed. A child around the age of six months will not be surprised as long as an object reappears where it disappeared. Surprise appears only if observations and predictions about an object do not fit together. Because the child's prediction does not consider the properties of objects, the exchange of the object will not lead to a contradiction between prediction and observation.

In the course of further development, object individuation by location changes to object individuation by movement. An object that moves along a trajectory is individuated as a unique object even when it changes its properties. Additionally, there seems to be a rule that solid 3D objects cannot move into each other as long as they are on the same path. The child will even be able to make a prediction about when the object will appear at a certain point on the trajectory. This theory will again change to object individuation by physical properties like shape, color, and size. As it goes through this process, the child comes closer to an adult's theory of objects with every new experience it makes with the objects.

In the following sections we present a formalization of the "hidden object" problem. It is the first model in the necessary series of models for the sandbox geography.
4 Algebra

An algebraic specification consists of a set of sorts, operations, and axioms (Loeckx, Ehrich et al. 1996). There are well-known algebras, like the algebra of natural numbers, the Boolean algebra, or the linear algebra for vector calculations. An algebra groups operations that are applied to the same data type. The Boolean algebra has operations that are all applied to truth values. Axioms describe the behavior of these operations. An example is given below.

Algebra Bool b
  operations
    not :: b -> b
    and, or, implies :: b -> b -> b
  axioms (for all p, q)
    not (not p) = p
    p and q = if p then q else False
    p or q = if p then True else q
    p implies q = (not p) or q

A structure-preserving mapping from a source domain to a target domain is called a morphism. Morphisms are graded by their strength and describe the similarity of objects and operations in the source and target domains. Finding or assuming morphisms helps to structure models. They help to link a cognitive model to a model of the real world. Previous work has successfully used algebra to model geographic problems (Frank 1999; Frank 2000; Raubal 2001). Algebras help to abstract geographic problems and offer the possibility to do this in several ways. An algebra can be used as a sub-algebra within another algebra, which allows the combination of algebras. Instantiation is another way to reuse algebras (Frank 1999). This research assumes the following hypothesis: theories of space can be described by a set of axioms, and it is possible to revise such a theory by adding, deleting, or exchanging axioms. Therefore algebra seems to be the right option for modeling the problem. Algebras for different spatial situations can be built and quickly tested with an executable functional programming language.
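Such axioms can also be checked mechanically. The sketch below is an added illustration in Python rather than the Haskell used for the prototype; it simply tests each Boolean axiom over the carrier set {True, False}.

# A small sketch (not from the paper): exhaustively checking the Boolean axioms.
def implies(p, q):
    return (not p) or q

for p in (True, False):
    assert (not (not p)) == p                       # not(not p) = p
    for q in (True, False):
        assert (p and q) == (q if p else False)     # p and q = if p then q else False
        assert (p or q) == (True if p else q)       # p or q = if p then True else q
        assert implies(p, q) == ((not p) or q)      # p implies q = (not p) or q
print("all Boolean axioms hold")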
5 Agents and Algebra

To model the "hidden objects" problem, an agent-based approach has been chosen. An agent can be defined as "anything that can be viewed as perceiving its environment through sensors and acting upon the environment through effectors" (Russell and Norvig 1995). Several definitions can be found in the literature (Ferber 1998; Weiss 1999). Modeling an Algebra Based Agent is motivated by the tiered belief computational ontology model proposed by Frank (2000). A two-tiered reality–beliefs model allows one to model errors in a person's perception by separating facts from beliefs. This distinction is vital for modeling situations where agents are puzzled, which always happens when a belief about the "real world" does not fit together with the actual facts. Several reactions are possible in this situation.
1. The agent retests the current belief against reality.
2. The agent makes use of an alternative hypothesis and tests it.
3. If no rule explains the model of reality, the agent has to form a new ad-hoc rule that fits.
4. If all rules fail and ad-hoc rules also do not work, the agent has to exchange its complete theory. This is not the case under the hypothesis taken here, namely that theories can be revised by adding, deleting, or exchanging axioms.
The agent generates a reaction of surprise when beliefs do not fit together with facts about the world. An environment with a cognizant agent has to be built as a computational model.
6 Computational Model

The computational model consists of a simple world with named solid objects. The objects can be placed in the world; their locations are described by vectors. It is also possible to remove objects. An agent has been modeled that can observe its own position and orientation in the world.

Algebra World(world of obj, obj, id, value)
  Operations
    putObj
    removeObj
    getObj
Algebra Positioned(obj, vec)
  Uses VecSpace
  Operations
    putAt
    isAt

Algebra VecSpace(Vector, length)
  Operations
    dotProd
    orthogonal
    distance
    direction
    ...

Algebra Object(obj)
  Operations
    maxHeight
    maxWidth
    color
    ...

The computational model motivated by the theory theory makes use of three basic properties. The prototypic agent has to be endowed with innate knowledge. Jerry Hobbs claims that there is a certain minimum of "core knowledge" that any reasonably sophisticated intelligent agent must have to make its way in the world (Hobbs and Moore 1985). The agent is able to observe the distance and direction between itself and an object. The objects are given names in order to identify them. The agent can give an egocentric description of the objects in the environment. A learning mechanism shall enable the agent to revise its knowledge and have new experiences with its environment. The agent shall apply a mechanism of theory testing by making hypotheses and testing and verifying them. To determine the location of an object, the agent is equipped with an object-location memory. Each object is situated at a vector-valued location in the world. The agent stores locations with a timestamp in order to distinguish when objects have been perceived at a certain location. For the first version of the computational model, agents do not move. The agent generates a reaction of surprise when beliefs do not fit together with facts about the world.

Algebra Agent
  Operations
    position :: agent -> pos
    direction :: agent -> dir
    observe :: t -> world -> [obj]
    predict :: t -> [[obj]] -> Maybe
    egocentric :: t -> [obj]
  Axiom
    isSurprise = if observe(t1, world) is not contained in predict(t1, [[obj]]) then TRUE

By the exchange of one axiom, three different behaviors and thus three spatial conceptualizations can be achieved. In the first model, the disappearance of an object is explained by the following hypothesis: the object will be behind the occluding object where it disappeared. The predict function returns the list of objects at time ti, being behind an occluding object.

Axiom: predict(t, [[obj]]) = [obj(ti)]

Observing a contradiction with the prediction, an axiom is replaced; this gives a new prediction. The second model considers that the object will be where it appeared before. The predict function returns the list of objects perceived at the initial observation time t0.

Axiom: predict(t, [[obj]]) = [obj(t0)]

The final model assumes that an object will be where it disappeared. The predict function delivers the list of objects visible at the observation time tv.

Axiom: predict(t, [[obj]]) = [obj(tv)]

For the realization of this computational model an executable functional language has been chosen. Haskell is widely accepted for rapid prototyping in the scientific community (Bird 1998).
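To make the axiom-exchange idea concrete, the following is a rough re-expression of the Algebra Based Agent in Python (the prototype itself is written in Haskell; the names below, such as predict_where_disappeared and the history dictionary, are hypothetical and serve only as an illustration).

# A rough sketch (not the Haskell prototype) of an agent whose theory of hidden
# objects is a single exchangeable prediction rule; surprise arises when an
# observation does not match the prediction.
def predict_behind_occluder(history):
    return history["behind_occluder"]          # rule 1: object is behind the occluder

def predict_where_appeared(history):
    return history["first_seen"]               # rule 2: object is where it first appeared

def predict_where_disappeared(history):
    return history["last_seen"]                # rule 3: object is where it disappeared

class AlgebraBasedAgent:
    def __init__(self, predict):
        self.predict = predict                 # the current theory (one axiom)

    def is_surprised(self, history, observed_location):
        return observed_location != self.predict(history)

    def revise(self, new_predict):
        self.predict = new_predict             # theory revision = exchanging an axiom

history = {"behind_occluder": (1, 0), "first_seen": (0, 0), "last_seen": (2, 0)}
agent = AlgebraBasedAgent(predict_where_appeared)
print(agent.is_surprised(history, (2, 0)))     # True: observation contradicts the theory
agent.revise(predict_where_disappeared)
print(agent.is_surprised(history, (2, 0)))     # False after revising the theory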
7 Conclusion and Outlook

Naïve Geography theories can benefit from the presented investigations. We have shown that it is possible to formalize the conceptual models children have about space. Further research in developmental psychology will be beneficial for this work, but the already existing body of research will
be sufficient for my Ph.D. Future research will certainly concentrate on moving objects and extend the presented approach. The formalization has to be enhanced, and different spatial situations have to be modeled using algebra. We will not undertake human-subject testing, but concentrate on formalizing operations reported in the literature. The model of our current agent can be extended by the inclusion of perspective taking, delivering intrinsic and absolute allocentric descriptions of the world. If an agent is tutored by other agents, it requires rules about when and how knowledge is acquired. These last considerations are omitted from our research at this time; however, we want to keep them as an interesting topic for the future. It is important to identify the structures in the spatial models and find mappings between them. To form a sound GIS theory we need to find simple commonsense concepts. Children's understanding of space can be exploited to find these concepts. This paper aims to contribute towards a better understanding of the formal structure of spatial models – the Sandbox Geography.
Acknowledgements

This work has been funded by the E-Content project GEORAMA EDC11099 Y2C2DMAL1. I would especially like to thank Prof. Andrew Frank for guiding me through the process of finding a Ph.D. topic, for his discussions, and for all the help with this paper. We would also gratefully like to mention Christian Gruber and Stella Frank for correcting the text and checking my English. Last but not least, many thanks to all colleagues, especially Claudia Achatschitz from the Institute for Geoinformation, for their valuable hints.
Bibliography

Bird, R. (1998). Introduction to Functional Programming Using Haskell. Hemel Hempstead, UK, Prentice Hall Europe.
Egenhofer, M. J. and D. M. Mark (1995). Naive Geography. Lecture Notes in Computer Science (COSIT '95, Semmering, Austria). A. U. Frank and W. Kuhn, Springer Verlag. 988: 1-15.
Ferber, J., Ed. (1998). Multi-Agent Systems - An Introduction to Distributed Artificial Intelligence, Addison-Wesley.
Fodor, J. A. (1987). The modularity of mind: an essay on faculty psychology. Cambridge, Mass., MIT Press.
Frank, A. U. (1999). One Step up the Abstraction Ladder: Combining Algebras - From Functional Pieces to a Whole. Spatial Information Theory - Cognitive and Computational Foundations of Geographic Information Science (Int. Conference COSIT'99, Stade, Germany). C. Freksa and D. M. Mark. Berlin, Springer-Verlag. 1661: 95-107.
Frank, A. U. (2000). "Spatial Communication with Maps: Defining the Correctness of Maps Using a Multi-Agent Simulation." Spatial Cognition II: 80-99.
Frank, A. U. (2001). "Tiers of ontology and consistency constraints in geographic information systems." International Journal of Geographical Information Science 15(7 (Special Issue on Ontology of Geographic Information)): 667-678.
Gopnik, A. and A. N. Meltzoff (2002). Words, Thoughts, and Theories. Cambridge, Massachusetts, MIT Press.
Gopnik, A., A. N. Meltzoff, et al. (2001). The Scientist in the Crib - What early learning tells us about the mind. New York, Perennial - HarperCollins.
Hayes, P. (1985). The Second Naive Physics Manifesto. Formal Theories of the Commonsense World. J. R. Hobbs and R. C. Moore. Norwood, New Jersey, Ablex Publishing Corporation: 1-36.
Hayes, P. J. (1978). The Naive Physics Manifesto. Expert Systems in the Microelectronic Age. D. Michie. Edinburgh, Edinburgh University Press: 242-270.
Hobbs, J. and R. C. Moore, Eds. (1985). Formal Theories of the Commonsense World. Ablex Series in Artificial Intelligence. Norwood, NJ, Ablex Publishing Corp.
Loeckx, J., H.-D. Ehrich, et al. (1996). Specification of Abstract Data Types. Chichester, UK and Stuttgart, John Wiley and B.G. Teubner.
Mark, D. M. and M. J. Egenhofer (1996). Common-Sense Geography: Foundations for Intuitive Geographic Information Systems. GIS/LIS '96, Bethesda, American Society for Photogrammetry and Remote Sensing.
McCloskey, M. (1983). Naive Theories of Motion. Mental Models. D. Gentner and A. L. Stevens, Lawrence Erlbaum Associates.
Newcombe, N. S. and J. Huttenlocher (2003). Making Space: The Development of Spatial Representation and Reasoning. Cambridge, Massachusetts, MIT Press.
Piaget, J. and B. Inhelder (1999). Die Entwicklung des räumlichen Denkens beim Kinde. Stuttgart, Klett-Cotta.
Piaget, J., B. Inhelder, et al. (1975). Die natürliche Geometrie des Kindes. Stuttgart, Ernst Klett Verlag.
Pinker, S. (1995). The Language Instinct. New York, HarperPerennial.
Raubal, M. (2001). Agent-based Simulation of Human Wayfinding: A Perceptual Model for Unfamiliar Buildings. Institute for Geoinformation. Vienna, Vienna University of Technology: 159.
Russell, S. J. and P. Norvig (1995). Artificial Intelligence. Englewood Cliffs, NJ, Prentice Hall.
Smith, B. and D. M. Mark (2001). "Geographical categories: an ontological investigation." International Journal of Geographical Information Science 15(7 (Special Issue - Ontology in the Geographic Domain)): 591-612.
Spelke, E. S. (2000). "Core Knowledge." American Psychologist November 2000: 1233-1243.
Spelke, E. S., K. Breinlinger, et al. (1992). "Origins of knowledge." Psychological Review 99: 605-632.
Spelke, E. S. and G. S. Van de Walle (1993). Perceiving and reasoning about objects: insights from infants. Spatial representations: problems in philosophy and psychology. N. Eilan, R. McCarthy and B. Brewer. Cambridge, Massachusetts, Blackwell: 132-161.
von Hofsten, C., Q. Feng, et al. (2000). "Object representation and predictive action in infancy." Developmental Science 3(2): 193-205.
Weiss, G. (1999). Multi-Agent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, Mass., The MIT Press.
An Integrated Shadow Detection Model
If cell m+1 is occluded, there must be an object, located at some cell m−n (n > 0) between the current camera position and cell m+1, that intersects the ray between the cell and the camera. Suppose that the heights at which this object intersects the rays corresponding to cell m+1 and cell m are h_{m+1} and h_m respectively; as shown in figure 6, we must have h_{m+1} > h_m, which means that cell m must be occluded by the same object.
Fig. 6 The relation between the visibility and the heights of a shadow cell and its relevant cell
Therefore, our rule is: for a visible shadow cell, if its relevant cell is a shadow cell with a height not less than that of the visible shadow cell, it is visible too. For example, in figure 7 the flight direction of the plane is horizontal, from right to left. The shadows in (a) are caused by the building in (a), with the sun to the left of the building; the shadows in (b) are caused by the building in (b), with the sun to the right of the building. These shadow cells are adjacent to each other and have the same heights. If the right-most shadow cell is traced as visible, then according to our rule the shadow cell to its left
is also visible. Furthermore, all the other shadow cells are visible too, for both (a) and (b).
Fig. 7 Simplified tracing model
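The propagation rule just described can be illustrated with a small sketch. The cell layout, field names and seed index below are assumptions made for illustration only, not the authors' implementation.

def propagate_shadow_visibility(cells, seed_index):
    """cells: list of dicts with 'is_shadow', 'height' and 'visible' keys,
    ordered so that cells[seed_index] is the shadow cell already traced as
    visible and lower indices are its successive relevant cells."""
    cells[seed_index]["visible"] = True
    current = seed_index
    for i in range(seed_index - 1, -1, -1):       # walk toward the relevant cells
        cell = cells[i]
        if cell["is_shadow"] and cell["height"] >= cells[current]["height"]:
            cell["visible"] = True                # an equal or higher shadow cell is visible too
            current = i
        else:
            break                                 # propagation stops at a non-shadow or lower cell
    return cells

# Example mirroring Fig. 7: four adjacent shadow cells of equal height; once the
# right-most one is traced as visible, all of them become visible.
cells = [{"is_shadow": True, "height": 10.0, "visible": False} for _ in range(4)]
propagate_shadow_visibility(cells, seed_index=3)
print([c["visible"] for c in cells])              # [True, True, True, True]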
2.3 Integrated shadow detection and location
The shadows derived from the last stage do not fit very well with what we observe from the image. Because the DSM itself lacks detail, such as the fine structure of the buildings, it is not practical to count on the DSM to give a precise result. However, the ray tracing result gives approximately correct positions of the shadows, which is helpful for the subsequent shadow segmentation. Since the shadow area now contains most of the observed image shadows and only small errors, the statistics of the shadow area approximately reflect the distribution of the shadow intensities. Therefore, from the first stage, that is, the result of the ray tracing, a reference segmentation threshold is obtained as the mean of the intensities of the shadow area. The local minimum of the image histogram that is closest to this reference threshold is then taken as the threshold for shadow segmentation. Thus, the final segmentation gives a more precise shadow location.
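The threshold selection described above can be sketched as follows. The greyscale image, the boolean mask of the ray-traced shadow area and the 8-bit intensity range are assumptions made for illustration, not the authors' code.

import numpy as np

def shadow_segmentation_threshold(image, traced_shadow_mask, bins=256):
    reference = image[traced_shadow_mask].mean()           # reference threshold from ray tracing
    hist, edges = np.histogram(image, bins=bins, range=(0, 255))
    centers = (edges[:-1] + edges[1:]) / 2.0
    # indices of interior histogram bins that are local minima
    interior = np.arange(1, bins - 1)
    is_min = (hist[interior] < hist[interior - 1]) & (hist[interior] < hist[interior + 1])
    minima = interior[is_min]
    if minima.size == 0:                                    # no local minimum: fall back to the reference
        return reference
    best = minima[np.argmin(np.abs(centers[minima] - reference))]
    return centers[best]                                    # local minimum closest to the reference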
3 Results and Discussion
We have carried out experiments using an aerial image and the associated data to test our shadow detection model. The study area is Tsukuba, Japan. The resolution of the image is 20 cm. Figure 8 shows some patches of the image and the effects of shadow detection. In Figure 8, (a) and (d) are two patches of the original image; (b) and (e) show the compound images obtained by superposing the shadow area contours detected through the photogrammetric method onto the original images (a) and (d) respectively. The closed white curves represent the detected shadow area contours. (c) and (f) refer
Fig. 8 Experiment images and the corresponding images superposed with shadow area contours
to the compound images obtained by superposing the shadow contours detected by the integrated model onto the original images (a) and (d) respectively. It can be seen that the detected shadow area contours in (b) and (e) do not fit very well with what we observe from the image, while those produced by the integrated model, shown in (c) and (f), are both correct and precise. The experiment on the whole image supports the same conclusion, i.e., the detection model proposed in this paper gives better results than the photogrammetric method alone. This paper has proposed a model for shadow detection that integrates several data sources; by taking advantage of both the photogrammetric method and the image analysis method, a better result is obtained. Based on the shadow detection result, an image processing technique is used to restore the color and intensities of the shadow pixels in order to remove the shadow effects. The remaining work will be introduced in another paper.
References
Bittner, J., 1999, Hierarchical techniques for visibility determination, Postgraduate Study Report, DC-PSR-99-05.
Chen, L.C. and Rau, J.Y., 1993, A unified solution for digital terrain model and orthoimage generation from SPOT stereopairs, IEEE Trans. on Geoscience and Remote Sensing, 31(6), 1243-1252.
Gong, P., Shi, P., Pu, R., etc., 1996, Earth Observation Techniques and Earth System Science (Science Press, Beijing, China).
Paglieroni, D.W. and Petersen, S.M., 1994, Height distributional distance transform methods for height field ray tracing, ACM Transactions on Graphics, 13(4), 376-399.
Paglieroni, D.W., 1997, Directional distance transforms and height field preprocessing for efficient ray tracing, Graphical Models and Image Processing, 59(4), 253-264.
Paglieroni, D.W., 1999, A complexity analysis for directional parametric height field ray tracing, Graphical Models and Image Processing, 61, 299-321.
Zhang, Z. and Zhang, J., 1997, Digital Photogrammetry (Press of WTUSM, Wuhan, China).
Zhou, G., Qin, Z., Kauffmann, P., Rand, J., 2003, Large-scale city true orthophoto mapping for urban GIS application, ASPRS 2003 Annual Conference Proceedings.
ADS40 Information Kit for Third-Party Developers, 2001, www.gis.leica-geosystems.com
Understanding Taxonomies of Ecosystems: a Case Study
Alexandre Sorokine¹ and Thomas Bittner²
¹ Geography Department, University at Buffalo, [email protected]
² Institute of Formal Ontology and Medical Information Systems, Leipzig, [email protected]
Abstract
This paper presents a formalized ontological framework for the analysis of classifications of geographic objects. We present a set of logical principles that guide geographic classifications and then demonstrate their application on a practical example of the classification of ecosystems of Southeast Alaska. The framework has the potential to be used to facilitate interoperability between geographic classifications.
1 Introduction
Any geographic map or spatial database can be viewed as a projection of a classification of geographic objects onto space [1]. Such a classification can be as simple as a list of objects portrayed on the map or as complex as a multi-level hierarchical taxonomy of the kind used in soil or ecosystem mapping [2]. In any of these cases, however, the classification is predicated on a limited set of rules that ensure its consistency both internally and with what it is projected onto. Classifications of geographic objects, compared to classifications in general, have certain peculiarities because geographic objects inherit many of their properties from the underlying space. Classifications of geographic objects typically manifest themselves as map legends. The goal of this paper is to develop a formalized framework for handling the structure of, and operations on, classifications of geographic objects. There are three purposes for developing this framework: (1) such a framework would allow better understanding of existing classification systems and their underlying principles even for non-experts, (2) the framework can provide useful tips for developing new classification systems with improvements in terms of consistency and generality, (3) the framework would allow
Fig. 1. Interoperability through representation models
more flexibility for achieving interoperability and fusion between datasets employing non-identical classification systems. Each classification is a representation (representations are denoted by R1 and R2 in Fig. 1) of the real world W that was created using a unique and finite chain of operations (o1 . . . on and q1 . . . qn respectively). The goal of our research is to outline the operations o1 . . . on and q1 . . . qn in a clear, formal, understandable and non-ambiguous manner. This knowledge will allow us to build a new representation R that is able to accommodate both R1 and R2, thus achieving interoperability (shown in the diagram as double arrows) between them and possibly with other representations.
2 Classification of Ecological Subsections of Southeast Alaska
To demonstrate our theory we will use the classification of ecological subsections of Southeast Alaska [3] as a running example (Fig. 2). This classification was developed by the USDA Forest Service and subdivides the territory of Southeast Alaska into 85 subsections that represent distinct terrestrial ecosystems. The purpose of the classification is to provide a basis for practical resource management, decision making, planning, and research. The classification has three levels, which are depicted in Table 1. The first level (roman numerals in Table 1) subdivides the territory into three terrain classes: active glacial terrains, inactive glacial terrains and post-glacial terrains. At the next level (capital letters in Table 1) the territory is subdivided according to its physiographic characteristics. The third level of the classification (numbers in Table 1) divides the territory by lithology and surface deposits.
Fig. 2. Southeast Alaska Ecological Subsection Hierarchy [3, pp. 22–23]
Table 1. Hierarchical Arrangement of Subsections [3, Table 2, p. 16]
I. Active Glacial Terrains
   A. Icefields
   B. Recently deglaciated areas
      1. Exposed Bedrock
      2. Unconsolidated sediments
   C. Mainland rivers
      1. Valleys
      2. Deltas
II. Inactive Glacial Terrains
   A. Angular Mountains
      1. Granitics
      2. Sedimentary, Non-carbonates
      3. Sedimentary, Carbonates
      4. Metasedimentary
      5. Complex sedimentary & volcanics
      6. Mafics/Ultramafics
   B. Rounded Mountains
      1. Granitics
      2. Sedimentary, Carbonates
      3. Metasedimentary
      4. Complex sedimentary & volcanics
      5. Volcanics
   C. Hills
      1. Granitics
      2. Sedimentary, Carbonates
      3. Metasedimentary
      4. Complex sedimentary & volcanics
      5. Volcanics
   D. Lowlands
      1. Till Lowlands
      2. Outwash Plains
      3. Glaciomarine Terraces
      4. Wave-cut Terraces
III. Post-glacial Terrains
   A. Volcanics
Roman numerals: terrain classes; capital letters: physiographic classes; numbers: geologic classes.
3 A Formal Theory of Classes and Individuals
In this section we introduce the logical theories that are needed to formalize the relations behind ecosystem classifications and demonstrate their application using the classification of ecological subsections of Southeast Alaska (Table 1 and Fig. 2) as a running example. The formalization of the theories will be presented using first-order predicate logic with variables x, y, z, z1, . . . ranging over individuals and variables u, v, w, w1, . . . ranging over classes. Predicates always begin with a capital letter. The logical connectives ¬, =, ∧, ∨, →, ↔, ≡ have their usual meanings: not, identical-to, and, or, 'if . . . then', 'if and only if' (iff), and 'defined to be logically equivalent'. We write (x) to symbolize universal quantification and (∃x) to symbolize existential quantification. Leading universal quantifiers are assumed to be understood and are omitted. A strict distinction between classes and individuals is one of the cornerstones of our theory. Typically classifications and map legends show only classes. However, the diagram in Fig. 2 shows a mix of classes and individuals. In our understanding ecological subsections, i.e., such entities as "Behm Canal Complex", "Sumner Strait Volcanics", "Soda Bay Till Lowlands" and other leaves of the hierarchy, are individuals. All other entities that are not leaves (e.g., "Active glacial terrains", "Granitics", "Volcanics", etc.) are classes. In the same sense Table 1 shows only the classes of the classification.
3.1 The Tree Structure of Classes
Examples of classes are the class human being, the class mammal, the class ecosystems of the polar domain, the class inactive glacial terrains, etc. Classes are organized hierarchically by the is-a or subclass relation, in the sense that a male human being is-a human being and a human being is-a mammal, or, using our example, "Exposed Bedrock" is-a "Deglaciated Area". In the present paper the is-a or subclass relation is denoted by the binary relation symbol ⊑ and we use the symbol ⊏ for the proper subclass relation. We write u ⊑ v to say that class u stands in the subclass relation to class v, and we call v a superclass of u if u ⊑ v holds. The proper subclass relation is asymmetric and transitive (ATM1–2). It very closely corresponds to the common understanding of the is-a (kind-of) relation:
(ATM1) u ⊏ v → ¬ v ⊏ u
(ATM2) (u ⊏ v ∧ v ⊏ w) → u ⊏ w
Axiom ATM1 postulates that if u is a proper subclass of v then v is not a proper subclass of u. Transitivity (ATM2) implies that all proper subclasses of a class are also proper subclasses of the superclasses of that class. In our example (Table 1) class "Exposed Bedrock" is a proper subclass of "Recently Deglaciated Areas", which in turn is a proper subclass of "Active Glacial Terrains". Due to the transitivity of the proper subclass relation we can say that class "Exposed Bedrock" is also a proper subclass of the class "Active Glacial Terrains". Next we define the subclass relation by D⊑. Unlike the proper subclass relation, the subclass relation allows a class to be a subclass of itself:
D⊑  u ⊑ v ≡ u ⊏ v ∨ u = v
One can then prove that the subclass relation (⊑) is reflexive, antisymmetric and transitive, i.e., a partial ordering. Class overlap (O) is defined as DO:
DO  Ouv ≡ (∃w)(w ⊑ u ∧ w ⊑ v)
Classes overlap if there exists a class that is a subclass of both classes; e.g., in Table 1 class "Icefields" overlaps with class "Active Glacial Terrains". We now add the definitions of a root class and an atomic class (atom). A class is a root class if all other classes are subclasses of it (Droot). A class is an atom if it does not have a proper subclass (DA):
Droot  root u ≡ (∀v)(v ⊑ u)
DA     A u ≡ ¬(∃v)(v ⊏ u)
In our example the root class of the classification would be a class of all Southeast Alaska ecological subsections (Fig. 2). Geologic classes designated
with numbers in Table 1 are atoms because they do not have any proper subclasses. In practice, in many classifications root classes are not specified explicitly; however, their existence has to be implied. For example, Table 1 does not contain a root class, but it can be inferred from the context that the root class is "Southeast Alaska ecological subsections". Since the hierarchy formed by the classes of Southeast Alaska ecological subsections is the result of a scientific classification process, we can assume that the resulting class hierarchy forms a tree. We are justified in assuming that scientific classifications are organized hierarchically in tree structures since scientific classification employs the Aristotelian method of classification. As [4] point out, in the Aristotelian method the definition of a class is the specification of essence (nature, invariant structure) shared by all instances of that class. Definitions according to Aristotle's account are specified by (i) working through a classificatory hierarchy from the top down, with the relevant topmost node or nodes acting in every case as undefinable primitives. The definition of a class lower down in the hierarchy is then provided by (ii) specifying the parent of the class (which in a regime conforming to single inheritance is of course in every case unique) together with (iii) the relevant differentia, which tells us what marks out instances of the defined class or species within the wider parent class or genus, as in: human = rational animal, where rational is the differentia (see also [5] for more details). Now we have to add axioms that enforce tree structures of the form shown in Fig. 3(a) and which rule out structures shown in Figs. 3(b) and 3(c). These additional axioms fall into two groups: axioms which enforce the tree structure and axioms which enforce the finiteness of this structure. We start by discussing the first group. Firstly, we demand that there is a root class (ATM3). Secondly, we add an axiom to rule out circles in the class structure: if two classes overlap then one is a subclass of the other (ATM4). This rules out the structure in Fig. 3(b), and it also holds for our running example: for every pair of overlapping classes in Table 1, one is a subclass of the other. Thirdly, we add an axiom to the effect that if u is a proper subclass of v then there exists a class w such that w is a subclass of v and does not overlap u (ATM5). This rules out cases where a class has a single proper subclass or a chain of nested proper subclasses. Following [6] we call ATM5 the weak supplementation principle.
(ATM3) (∃u) root u
(ATM4) Ouv → (u ⊑ v ∨ v ⊑ u)
(ATM5) u ⊏ v → (∃w)(w ⊑ v ∧ ¬Owu)
Upon inspection, the classification in Table 1 violates the weak supplementation principle (axiom ATM5) because the class "Post-glacial terrains" has only a single subclass, "Volcanics". For this reason the classification in Table 1 is not a model of our theory. We will discuss this case in detail in Sect. 4 and show that the underlying classification does satisfy our axioms but that additional operations have been performed on these structures.
Fig. 3. Trees (a) and non-trees ((b) and (c)): (a) proper tree; (b) overlaps; (c) multiple roots
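Because the class structures required by these axioms are finite trees, the weak supplementation principle can be checked mechanically once a hierarchy is written down as a mapping from each class to its immediate subclasses. The following sketch is not from the paper; the dictionary is a partial encoding of Table 1, and in a tree ATM5 reduces to the condition that no class has exactly one immediate proper subclass.

def weak_supplementation_violations(children):
    # ATM5 in a finite class tree: flag every class with exactly one immediate subclass
    return [c for c, subs in children.items() if len(subs) == 1]

terrain_classes = {
    "Terrain classes": ["I. Active Glacial Terrains", "II. Inactive Glacial Terrains",
                        "III. Post-glacial Terrains"],
    "I. Active Glacial Terrains": ["A. Icefields", "B. Recently deglaciated areas",
                                   "C. Mainland rivers"],
    "B. Recently deglaciated areas": ["1. Exposed Bedrock", "2. Unconsolidated sediments"],
    "C. Mainland rivers": ["1. Valleys", "2. Deltas"],
    "III. Post-glacial Terrains": ["A. Volcanics"],   # a single proper subclass
}

print(weak_supplementation_violations(terrain_classes))
# ['III. Post-glacial Terrains']  -- the violation discussed in the text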
Using Droot, the antisymmetry of ⊑, and ATM4 we can prove uniqueness of the root class. This rules out the structure shown in Fig. 3(c). The second group of axioms, which characterize the subclass relation beyond the properties of being a partial ordering, are axioms which enforce the finiteness of the subclass tree. ATM6 ensures that every class has at least one atom as a subclass; this ensures that no branch in the tree structure is infinitely long. Finally, ATM7 is an axiom schema which enforces that every class is either an atom or has only finitely many subclasses; this ensures that class trees cannot be arbitrarily broad.
(ATM6) (∃x)(Ax ∧ x ⊑ u)
(ATM7) ¬Au → (∃x1 . . . xn)((⋀1≤i≤n xi ⊑ u) ∧ (z)(z ⊑ u → ⋁1≤i≤n z = xi))
Here (⋀1≤i≤n xi ⊑ y) is an abbreviation for x1 ⊑ y ∧ . . . ∧ xn ⊑ y, and ⋁1≤i≤n z = xi for x1 = z ∨ . . . ∨ xn = z.
3.2 Classes and Individuals
Classes and individuals are connected by the relationship of instantiation, InstOf xu, whose first parameter is an instance and whose second parameter is a class. InstOf xu is then interpreted as 'individual x instantiates the class u'. From our underlying sorted logic it follows that classes and individuals form disjoint domains, i.e., there cannot exist an entity which is a class as well as an individual. Therefore instantiation is irreflexive, asymmetric, and non-transitive. In terms of our theory each individual (an ecological subsection) instantiates a class of the subsection hierarchy. A single class can be instantiated by several individuals. For example, we can say that the individuals "Behm Canal Complex", "Berg Bay Complex" and others instantiate the class "Rounded Mountains". Axiom (ACI1) establishes the relationship between instantiation and the subclass relation. It tells us that u ⊑ v if and only if every instance of u is also an instance of v.
(ACI1) u ⊑ v ↔ (x)(InstOf xu → InstOf xv)
(TCI1) u = v ↔ (x)(InstOf xu ↔ InstOf xv)
From (ACI1) it follows that two classes are identical if and only if they are instantiated by the same individuals. Finally we add an axiom that guarantees that if two classes share an instance then one is a subclass of the other (AI2).
(AI2) (∃x)(InstOf xu ∧ InstOf xv) → (u ⊑ v ∨ v ⊑ u)
AI2 can be illustrated using the following example: the classes "Inactive Glacial Terrains" and "Rounded Mountains" share the instance "Kook Lake Carbonates" (Fig. 2), and "Rounded Mountains" is a subclass of "Inactive Glacial Terrains".
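Read through ACI1 and TCI1, the subclass relation can be recovered from instance data by set inclusion. The sketch below uses a tiny, partly assumed excerpt of the instance assignments mentioned in the text; it is an illustration, not the authors' data.

instances = {
    "Inactive Glacial Terrains": {"Kook Lake Carbonates", "Behm Canal Complex",
                                  "Thorne Arm Granitics"},
    "Rounded Mountains": {"Kook Lake Carbonates", "Behm Canal Complex"},
    "Hills": {"Thorne Arm Granitics"},
}

def is_subclass(u, v):
    # ACI1: u is a subclass of v iff every instance of u is also an instance of v
    return instances[u] <= instances[v]

def share_instance(u, v):
    # antecedent of AI2: the two classes have a common instance
    return bool(instances[u] & instances[v])

print(is_subclass("Rounded Mountains", "Inactive Glacial Terrains"))   # True
print(share_instance("Rounded Mountains", "Hills"))                    # False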
4 Applying the Theory to Multiple Classifications
As mentioned in Sect. 3.1, the Southeast Alaska ecological subsections hierarchy (Fig. 2 and Table 1) does not satisfy one of the axioms of our classification theory: the weak supplementation principle (ATM5). It is easy to notice that the third level of the classification (Table 1) contains repeating classes. For example, class "Granitics" can be found under the classes "Angular Mountains", "Rounded Mountains" and "Hills" in the class "Inactive Glacial Terrains". Given this, it is possible to interpret the classification in Fig. 2 as a product of two independent classification trees: a classification of terrains (terrain classes and physiographic classes in Table 1) and a classification of lithology and surface geology (geologic classes in Table 1). The hierarchies of classes that represent these two separate classifications are shown in Fig. 4(a) and Fig. 4(b) respectively. The product of these classifications is depicted in Table 2, with terrain classes as columns and geologic classes as rows. Each cell of the table contains the number of individuals that instantiate the corresponding classes of both hierarchies. The class hierarchies in Fig. 4 differ from the original hierarchy in Table 1 in two ways. The class "Volcanics", which violates the weak supplementation principle, was moved from the terrains hierarchy into the geologic classes hierarchy (Fig. 4 and Table 2). This is a more natural place for this class because there is already a class with an identical name. Another problematic class is "Icefields". It is an atomic class and does not have any subclasses; it also does not represent any geologic class. To be able to accommodate the class "Icefields" in the product of classifications we have added a new class "Other" to the hierarchy of geologic classes (Fig. 4(b) and Table 2). By using a product of two classifications we have avoided the problem of having a class with a single proper subclass. The remaining part of this section describes how a product of two or more classifications can be formalized.
Fig. 4. Classification trees: (a) terrain classes (class "Volcanics" violates the weak supplementation principle); (b) geologic classes (some classes are not shown; class "Volcanics" was added from the terrain classes hierarchy)
4.1 From Theory to Models
The theory presented in the previous section gives us a formal account of what we mean by a classification tree and by the notion of instantiation. In this section we consider set-theoretic structures that satisfy the axioms given above. This means that we interpret classes as sets in such a way that the instance-of relation between instances and classes is interpreted as the element-of relation between an element and the set it properly belongs to, and we interpret the is-a or subclass relation as the subset relation between sets. Sets satisfying our axioms are hierarchically ordered by the subset relation and can therefore be represented using directed graph structures in such a way that sets are nodes of the graph. Formally a graph is a pair T = (N, E) where N is a collection of nodes and E is a collection of edges. Let ni and nj be nodes in N corresponding to the sets i and j; then we have a directed edge e in E from ni to nj if and only if the set i is a subset of the set j. Since the sets we consider are assumed to satisfy the axioms given in the previous section, it follows that the directed graph structures constructed in this way are trees, and we call them classification trees. We are now interested in classification trees, operations between them, and the interpretation of those operations in our running example. In particular
Table 2. The product of terrain and geology classifications
Columns (glaciation phases and physiographic classes): Active Glacial Terrains (Icefields; Recently Deglaciated Areas; Mainland Rivers); Inactive Glacial Terrains (Angular Mountains; Rounded Mountains; Hills; Lowlands); Post-Glacial Terrains.
Rows (geologic classes): Other; Exposed Bedrock; Unconsolidated Sediments; Valleys; Deltas; Granitics; Sedimentary, Non-carbonates; Sedimentary, Carbonates; Metasedimentary; Complex Sedimentary and Volcanics; Mafics/Ultramafics; Volcanics; Till Lowlands; Outwash Plains; Glaciomarine Terraces; Wave-cut Terraces.
Each cell of the table gives the number of subsections instantiating the corresponding pair of classes; most cells are empty.
we will use the notion of classification tree in order to formalize the notion of product between classifications discussed in the introduction of this section. Set theory also allows us to form higher-order sets, i.e., sets of sets. In what follows we will consider levels of granularity, which are sets of sets in the given interpretation. For example, the set of all leaves in a classification tree is a level of granularity. Since in our interpretation nodes in the tree structure correspond to sets, levels of granularity are sets of sets. Below we will introduce the notion of a cut in order to formalize the notion of level of granularity. Notice that in the case of higher-order sets the element-of relation is not interpreted as an instance-of relation.
4.2 Cuts
Classification trees can be intersected at their cuts. To define a cut (δ) let us take a tree T constructed as described above and let N be the set of nodes in this tree.
Definition 1. A cut δ in the tree-structure T is a subset of N defined inductively as follows [7, 8]: (1) {r} is a cut, where r is the root of the tree; (2) for any class z let d(z) denote the set of immediate subclasses of z, and let C be a cut with z ∈ C and d(z) ≠ ∅; then (C − {z}) ∪ d(z) is a cut.
For example, the hierarchy of terrain classes (Fig. 4(a) without class "Volcanics") has five different cuts. By Definition 1 the root class "Terrain classes" is a cut. If we take the class "Terrain classes" as z and C = {z} as the cut, then z ∈ C. The immediate descendants d(z) of the root class z are the classes d(z) = {"I. Active Glacial Terrains", "II. Inactive Glacial Terrains", "III. Post-glacial Terrains"}. Then d(z) will be the next cut because in this case (C − {z}) = ∅, and thus (C − {z}) ∪ d(z) leaves us with d(z). Repeated application of Definition 1 to the hierarchy of terrain classes (Fig. 4(a)) results in the five cuts listed in Table 3.
Table 3. Examples of cuts in the hierarchy of terrain classes in Fig. 4(a)
1. the root class "Terrain classes"
2. "I. Active Glacial Terrains", "II. Inactive Glacial Terrains", "III. Post-glacial Terrains"
3. "A. Icefields", "B. Recently deglaciated areas", "C. Mainland rivers", "A. Angular Mountains", "B. Rounded Mountains", "C. Hills", "D. Lowlands", "III. Post-glacial Terrains"
4. "I. Active Glacial Terrains", "A. Angular Mountains", "B. Rounded Mountains", "C. Hills", "D. Lowlands", "III. Post-glacial Terrains"
5. "A. Icefields", "B. Recently deglaciated areas", "C. Mainland rivers", "II. Inactive Glacial Terrains", "III. Post-glacial Terrains"
Using Definition 1 and (ATM1–7) one can prove that the classes forming a cut are pair-wise disjoint and that cuts enjoy a weak form of exhaustiveness in the sense that every class–node in N is either a subclass or a superclass of some class in the cut δ at hand [7].
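Definition 1 lends itself to a direct enumeration of all cuts of a finite tree. The sketch below is a hypothetical helper rather than the authors' code; applied to the terrain hierarchy of Fig. 4(a) without "Volcanics", it reproduces the five cuts of Table 3.

def all_cuts(children, root):
    cuts, frontier = set(), [frozenset([root])]
    while frontier:
        cut = frontier.pop()
        if cut in cuts:
            continue
        cuts.add(cut)
        for z in cut:
            d = children.get(z, [])
            if d:                                  # d(z) is non-empty, so z can be expanded
                frontier.append((cut - {z}) | frozenset(d))
    return cuts

terrain = {
    "Terrain classes": ["I. Active Glacial Terrains", "II. Inactive Glacial Terrains",
                        "III. Post-glacial Terrains"],
    "I. Active Glacial Terrains": ["A. Icefields", "B. Recently deglaciated areas",
                                   "C. Mainland rivers"],
    "II. Inactive Glacial Terrains": ["A. Angular Mountains", "B. Rounded Mountains",
                                      "C. Hills", "D. Lowlands"],
}
print(len(all_cuts(terrain, "Terrain classes")))   # 5 cuts, matching Table 3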
4.3 Joining Classification Trees
Let δ1 and δ2 be cuts in two different classification trees T1 and T2. Cuts are sets composed of classes that satisfy the conditions of Definition 1. Let
δ1 = {u1, u2, . . . , un} and δ2 = {v1, v2, . . . , vm}. The cross-product δ1 × δ2 of these cuts can be represented as the set of pairs that can be formed from classes in δ1 and δ2:
δ1 × δ2 = {(u1, v1), (u1, v2), . . . , (u1, vm), (u2, v1), . . . , (un, vm)}
The product of the two classification trees in Fig. 4 is depicted in Table 2. A product of two classification trees (or of their cuts) will produce N = n × m pairs, where n and m are the numbers of classes in the respective cuts. Most likely N will be greater than the number of class pairs that are actually instantiated. In our example most of the cells of Table 2 are empty, indicating that there are no individuals that instantiate both of the corresponding classes. The reason for this is that certain higher-level classes do not exhibit as much diversity on the studied territory as other classes do. For example, class "Post-glacial terrains" is represented by only a single subclass "Volcanics", while class "Inactive glacial terrains" contains a total of 20 subclasses (Table 1). To remove empty pairs of classes one has to normalize the product of two classifications, i.e., remove the pairs of classes that do not have instances. The normalized product δ1 ⊗ δ2 can be formally defined as the cross product of levels of granularity which yields only those pairs whose sets have at least one element in common (D∗):
D∗  (ui, vj) ∈ δ1 ⊗ δ2 ↔ (ui, vj) ∈ δ1 × δ2 ∧ (∃x)(x ∈ ui ∧ x ∈ vj)
In this case each individual instantiates several classes, each belonging to a different classification tree. In our example each individual instantiates one class from the classification tree of terrains and another class from the tree of geologic classes. For instance, the ecological subsection "Thorne Arm Granitics" instantiates class "Hills" from the terrains classification tree and class "Granitics" from the geologic classes.
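Assuming that every individual is recorded with the class it instantiates in each of the two cuts, the normalized product D∗ can be computed as sketched below. The subsection-to-class assignments other than "Thorne Arm Granitics" are illustrative placements, not taken from the published table.

from itertools import product

terrain_cut  = {"Hills", "Rounded Mountains", "Post-glacial Terrains"}
geologic_cut = {"Granitics", "Sedimentary, Carbonates", "Volcanics"}

subsections = {                                   # individual -> (terrain class, geologic class)
    "Thorne Arm Granitics":      ("Hills", "Granitics"),
    "Kook Lake Carbonates":      ("Rounded Mountains", "Sedimentary, Carbonates"),
    "Mount Edgecumbe Volcanics": ("Post-glacial Terrains", "Volcanics"),
}

cross_product = set(product(terrain_cut, geologic_cut))   # delta1 x delta2: n * m pairs
normalized    = set(subsections.values())                 # D*: only pairs sharing an instance

print(len(cross_product), len(normalized))                # 9 3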
5 Conclusions
In this paper we have presented an ontological framework to dissect and analyze geographic classifications. The framework is based on a strict distinction between the notion of a class and the notion of an individual. There are two types of relations in the framework: the subclass relation, defined between classes, and the instantiation relation, defined between individuals and classes. The subclass relation is reflexive, antisymmetric and transitive. Classes are organized into classifications that form finite trees (directed acyclic graphs). The latter is
achieved by requiring a classification to have a root class and by prohibiting loops and classes with a single proper subclass. We have demonstrated how a practical classification can be built using the outlined principles on the example of the Southeast Alaska ecological subsection hierarchy. A practical classification may require additional operations such as taking a product of classifications and removing some classes due to redundancy. Even though most of the operations in our approach would seem obvious to geographers and ecologists, such operations have to be outlined explicitly if the goal of interoperation of two datasets is to be achieved or if the information contained in classifications is to be communicated to non-experts in the area. The formalized theory presented above can be used to facilitate interoperability between geographic classifications. Interoperability between classifications can be achieved by creating a third classification capable of accommodating both of the original classifications. The formal theory of classes and individuals can be used to mark out the elements of classifications such as classification trees, cuts, products and normalized products. The original classifications have to be decomposed into these elements and then the elements have to be reassembled into a new and more general classification. In a hypothetical example, to interoperate the Southeast Alaska ecological subsections with a similar hierarchy for some other region, we must first perform the operations outlined in Sect. 3 (mark out classification trees, cuts and their products) for both classifications. Then the trees from the different classifications would have to be combined. Territories with dissimilar geologic histories would possess different sets of classes, and more diverse territories will have a larger number of classes. For instance, let us assume that the second hierarchy in our example was developed for a territory only partly affected by glaciation. Glaciation-related classes from Southeast Alaska are likely to be usable in that territory too. However, the classification for that territory will contain many classes that would not fit into the class trees specific to Southeast Alaska. Those classes would have to be added to the resulting classification trees. Most of these would fall under the "Post-glacial Terrains" class of the Southeast Alaska hierarchy. Combined classification trees must satisfy axioms ATM1–7. Finally, a normalized product of the class trees will have to be created. This methodology is still awaiting testing in a practical context.
Acknowledgments
The authors would like to thank Dr. Gregory Nowaki from the USDA Forest Service for useful and detailed comments on our paper. Dr. Nowaki is one of the leading developers of the classification of Southeast Alaska ecological subsections [3]. Support for the second author from the Wolfgang Paul Program of the Alexander von Humboldt Foundation and from National Science Foundation Research Grant BCS-9975557, Geographic Categories: An Ontological Investigation, is gratefully acknowledged.
References
[1] E. Lynn Usery. Category theory and the structure of features in geographic information systems. Cartography and Geographic Information Systems, 20(1):5–12, 1993.
[2] David T. Cleland, Peter E. Avers, W. Henry McNab, Mark E. Jensen, Robert G. Bailey, Thomas King, and Walter E. Russell. National hierarchical framework of ecological units. In Mark S. Boyce and Alan W. Haney, editors, Ecosystem Management Applications for Sustainable Forest and Wildlife Resources, pages 181–200. Yale University Press, New Haven, CT, 1997.
[3] Gregory Nowaki, Michael Shephard, Patricia Krosse, William Pawuk, Gary Fisher, James Baichtal, David Brew, Everett Kissinger, and Terry Brock. Ecological subsections of southeastern Alaska and neighboring areas of Canada. Technical Report R10-TP-75, USDA Forest Service, Alaska Region, October 2001.
[4] Barry Smith, Jacob Köhler, and Anand Kumar. On the application of formal principles to life science data: A case study in the gene ontology. In E. Rahm, editor, Database Integration in the Life Sciences (DILS 2004). Springer, Berlin, 2004. Forthcoming.
[5] Harold P. Cook and Hugh Tredennick. Aristotle: The Categories, On Interpretation, Prior Analytics. Harvard University Press, Cambridge, Massachusetts, 1938.
[6] Peter Simons. Parts, A Study in Ontology. Clarendon Press, Oxford, 1987.
[7] P. Rigaux and M. Scholl. Multi-scale partitions: Application to spatial and statistical databases. In M. Egenhofer and J. Herring, editors, Advances in Spatial Databases (SSD'95), number 951 in Lecture Notes in Computer Science, pages 170–184. Springer-Verlag, Berlin, 1995.
[8] T. Bittner and J.G. Stell. Stratified rough sets and vagueness. In W. Kuhn, M. Worboys, and S. Timpf, editors, Spatial Information Theory. Cognitive and Computational Foundations of Geographic Information Science. International Conference COSIT'03, pages 286–303. Springer, 2003.
Comparing and Combining Different Expert Relations of How Land Cover Ontologies Relate
Alexis Comber¹, Peter Fisher¹ and Richard Wadsworth²
¹ Department of Geography, University of Leicester, Leicester, LE1 7RH, UK, Tel: +44 (0)116 252 3859, Fax: +44 (0)116 252 3854, email: [email protected]
² Centre for Ecology and Hydrology, Monks Wood, Abbots Ripton, Huntingdon, Cambridgeshire, PE28 2LS, UK
Abstract
Expressions of expert opinion are being used to relate ontologically diverse data and to identify logical inconsistency between them. Relations constructed under different scenarios, from different experts, and with evidence combined in different ways identify different subsets of inconsistency, the reliability of which can be parameterised by field validation. It is difficult to identify one combination as being objectively "better" than another. The selection of specific experts and scenarios depends on user perspectives.
1 Introduction
Geographic data are necessarily a simplification of the real world. Decisions about what to record and what to omit result in differences between datasets even when they purport to record the same features. Some of these differences may be rooted in local practice, in institutions, in the technology used to record and measure the processes of interest, or in the (policy related) objective of the study (Comber et al. 2003a). The net result is that objects contained in one dataset may not correspond to similarly named objects in another. Specifying which objects to measure, and how to measure them, is to describe an ontology, or a specification of a conceptualisation (Guarino 1995). In the case of a remotely sensed land cover dataset an ontology is more than a list of the class descriptions: as well as describing the botanical and ecological properties of the classes, it describes the epistemological aspects of data collection, correction and processing. The research reported here considers how different expert descriptions of the way that data relate may be combined and used to identify change and inconsistency
between them. Previous work has shown that using the opinions of a single expert it is possible to identify change in the land cover mappings of the UK (Comber et al. in press a). Different experts have different opinions, however, and the aim of the current work is to examine how evidence from multiple experts may be combined in order to more robustly identify change and inconsistency. In this paper we describe a method that addresses the issues of data conceptualisation, meaning and semantics by using a series of expert expressions of how the categories in different datasets relate. We asked three experts to describe the relationships between two mappings of UK land covers in 1990 and 2000 (introduced in the next section) using a table of pair-wise relations under three different scenarios. The experts can be characterised as a data User, a data Producer and a data Supplier. In the scenarios the experts described the semantics, common technical confusions and possible transitions of land covers, in an attempt to account for the differences in meaning and conceptualisation between the two datasets.
2 Background
2.1 Land cover mapping in the UK
The 1990 Land Cover Map of Great Britain (LCMGB) is a raster dataset that records 25 Target land cover classes classified from composite winter and summer satellite imagery using a per-pixel supervised maximum likelihood classification (Fuller et al. 1994). The classification was determined by scientists on the basis of what they believed they could reliably identify (Comber et al. 2003a). The UK Land Cover Map 2000 (LCM2000) upgrades and updates the LCMGB (Fuller et al. 2002). It describes the landscape in terms of broad habitats as a result of post-Rio UK and EU legislation and uses a parcel structure as the reporting framework. It includes extensive parcel-based meta-data including processing history and the spectral heterogeneity attribute "PerPixList" (Smith and Fuller 2001). LCM2000 was released with a "health warning" against comparing it to LCMGB because of the thematic and structural differences.
2.2 Data integration
The integration of spatial data is an objective common to many research themes. The fundamental problem of integration is that the features of one dataset may not match those recorded in another. Intersecting the data provides measures of correspondence between the elements of each dataset, which can be used to model the uncertainty between two datasets, for instance as membership functions using fuzzy sets or as probability distributions for Bayesian approaches. Such approaches allow multiple realisations of combined data to be modelled using the
general case provided by the correspondence matrix and, because there is no spatial component to the correspondences, they are assumed to be evenly distributed. This means that every object (pixel or parcel) of the same class is treated in the same way. For the typical case of a data object this is unproblematic. If one is interested in identifying atypical cases – for instance in order to identify change or inconsistency between different datasets – a more subtle approach is required. To this end a series of approaches originating in computing science have been suggested to address the data integration problem – interoperability (Bishr 1998; Harvey et al. 1999), formal ontologies (Frank 2001; Pundt and Bishr 2002; Visser et al. 2002) and standardisation (Herring 1999; OGC 2003) – all of which identified the need to overcome differences in the semantics of the objects described in different data.
2.3 Previous work
Comber et al. (2003b; 2003c; in press a) have proposed and developed an approach based on expert expressions of how the semantics of the elements (classes) of two datasets relate to each other, using land cover mapping in the UK as an example. An expert familiar with both datasets was asked to describe the pair-wise semantic relations between LCMGB and LCM2000 classes in a Look Up Table (LUT). It involved the following stages:
1. The area covered by each parcel was characterised in 1990 (a code sketch of this stage is given at the end of this subsection):
   - For each parcel the number and type of intersecting LCMGB pixels was determined.
   - The distribution of LCMGB classes was interpreted via the Expert LUT. The descriptions were in terms of whether the LCM2000-LCMGB class pair is "Unexpected", "Expected" or "Uncertain" (U, E, Q). A triple for each parcel was generated by placing each intersecting pixel into one of these descriptions.
   - The U, E and Q scores were normalised by parcel area.
2. The parcel was characterised in 2000:
   - For each parcel the number and type of spectral heterogeneity attribute spectral subclasses were extracted.
   - These were interpreted via a Spectral LUT. This represented knowledge of expected, unexpected or possible spectral overlap relative to the parcel broad habitat class. This information accompanied the release of the LCM2000 data. These descriptions were used to generate a (U, E, Q) triple based on the percentage of each spectral subclass within each LCM2000 parcel.
3. Changes in U, E and Q were calculated for each parcel (ΔU, ΔE and ΔQ) and normalised to a standard distribution function for each class.
4. The normalised ΔU and ΔE provided beliefs for a hypothesis of change, combined using Dempster-Shafer, which was given further support from ΔQ.
5. A sample of parcels was surveyed in the field and assessments were made of whether the land cover matched LCM2000 and whether it had changed since 1990.
The headline result was that inconsistency between LCMGB and LCM2000 was identified in 100% of the parcels, with 41% of it attributable to change and 59% to data error (Comber et al., in press b). The aim of the current work was to explore the effects of different types of expert evidence and different experts.
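The sketch below illustrates stage 1 of this procedure. The class names, the Expert LUT entries and the use of the pixel count as the parcel area are hypothetical choices made for illustration, not the published look-up table.

expert_lut = {                      # (LCM2000 class, LCMGB class) -> relation
    ("Broadleaved woodland", "Deciduous woodland"): "E",
    ("Broadleaved woodland", "Tilled land"): "U",
    ("Broadleaved woodland", "Scrub"): "Q",
}

def parcel_triple(lcm2000_class, intersecting_pixels, expert_lut):
    """intersecting_pixels: list of LCMGB class labels falling within the parcel."""
    counts = {"U": 0.0, "E": 0.0, "Q": 0.0}
    for lcmgb_class in intersecting_pixels:
        relation = expert_lut.get((lcm2000_class, lcmgb_class), "Q")
        counts[relation] += 1
    area = max(len(intersecting_pixels), 1)       # parcel area taken as the pixel count
    return {k: v / area for k, v in counts.items()}

pixels = ["Deciduous woodland"] * 7 + ["Tilled land"] * 2 + ["Scrub"]
print(parcel_triple("Broadleaved woodland", pixels, expert_lut))
# {'U': 0.2, 'E': 0.7, 'Q': 0.1}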
3. Methodology
Three experts – a User, a Producer and a Supplier – completed tables of relations between LCMGB and LCM2000 under three scenarios:
- Semantic: based on their understanding of the links between the semantics of the two datasets;
- Change: describing the transitions between land cover classes;
- Technical: describing relations based on knowledge of where confusions may occur (e.g. spectral confusions between classes).
Each scenario from each expert was substituted in turn as the Expert LUT in the outline described above (Section 2.3). Beliefs from the different scenarios were combined in two ways for each expert: first, using Dempster-Shafer with a normalisation term, to produce Aggregate combinations of belief; second, by adding the beliefs together, to produce Cumulative combinations of belief. For each parcel 22 beliefs were calculated:
- Semantic, from the semantic Expert LUT (S);
- Technical, from the technical Expert LUT (T);
- Change, from the change Expert LUT (C);
- Semantic and Technical LUTs combined cumulatively (ST+);
- Semantic and Technical LUTs combined by aggregation (ST*);
- All three LUTs combined cumulatively (STC+);
- All three LUTs combined by aggregation (STC*).
This was done for each expert, and an overall belief was calculated by combining all three scenarios from all experts using Dempster-Shafer (All). The beliefs for the visited parcels were extracted from the data and thresholds applied to explore how well the change and no-change parcels were partitioned by different combinations of expert relations.
4. Results
4.1 Belief thresholds
Table 1 shows how the evidence from different experts partitions the visited parcels using thresholds (T) of 1 and of 0.9. The error of omission is the proportion of parcels with a combined belief below the threshold but found to have changed. The error of commission is the proportion of parcels with a belief greater than the threshold but found not to have changed. The overall figure is the proportion of all parcels from both subsets correctly partitioned by the threshold. In all cases where T = 1 the overall reliability is higher than for T > 0.9, due to lower errors of commission; however, more parcels that have changed are missed at the higher threshold. The lowest errors of omission are for the aggregate combinations, but these also have the highest errors of commission. Thus the overall reliability for cumulative (additive) combinations is higher than for aggregate (multiplicative) combinations due to their having lower errors of commission.
4.2 Experts
The results in Table 1 are described below in terms of each individual expert and the levels of the different types of errors caused by the selection of this threshold (T = 1). The Supplier partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and the aggregated combination of the Semantic, Technical and Change relations (STC*). The fewest errors of commission were from the Change relations (C) and the additive Semantic, Technical and Change relations (STC+). The Producer partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and either the aggregated combination of the Semantic, Technical and Change relations (STC*) or the Semantic and Technical relations (ST*). The fewest errors of commission were from the Change relations (C) and the aggregated Semantic, Technical and Change relations (STC*). The User partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and the aggregated combination of the Semantic, Technical and Change relations (STC*). The fewest errors of commission were from the Technical relations (T) and the additive Semantic, Technical and Change relations (STC+).
4.3 Methods of combination
The two methods of combining belief, Cumulative and Aggregate, do so with different results. For example:
S = 0.946, T = 0.425, C = 0.953
Aggregate: ST* = 0.928, STC* = 0.996
Cumulative: ST+ = 0.685, STC+ = 0.775
The cumulative method pushes the distribution of beliefs away from the extremes. This is as expected if one considers that all the terms (scenarios) must have high beliefs for the cumulative belief to be high. For example, in the case of the Supplier:
S → ST+ → STC+
0.65   0.58   0.53
However, the number of false positives using this approach decreases correspondingly (i.e. the proportion of 'No change' parcels erroneously partitioned):
S → ST+ → STC+
0.33   0.33   0.28
The aggregate (Dempster-Shafer) method pushes the distributions of the beliefs apart from the median towards the extremes of the distribution of beliefs. For each expert we see that the proportion of 'Change' parcels correctly partitioned by the data increases as each scenario term is incorporated in the aggregate. For example, in the case of the Supplier:
S → ST* → STC*
0.65   0.70   0.72
However, the number of false positives increases correspondingly (i.e. the proportion of 'No change' parcels erroneously partitioned):
S → ST* → STC*
0.33   0.41   0.44
Thus the All combined belief generated in this way from all the experts and scenarios partitions the 'Change' parcels most reliably (0.79), but it includes 50% of the 'No change' parcels in the partition.
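A minimal sketch of the two combination methods is given below. It assumes a binary frame of discernment (change / no change) with all mass committed to the two hypotheses, ignoring the uncertainty term carried by ΔQ in the published method; under that assumption, Dempster's rule with normalisation reproduces the aggregate values quoted above, and simple averaging approximates the cumulative ones.

def aggregate(*beliefs):
    # Dempster's rule for binary, fully committed masses, with normalisation
    b = beliefs[0]
    for m in beliefs[1:]:
        agree    = b * m                    # both sources support 'change'
        disagree = (1 - b) * (1 - m)        # both sources support 'no change'
        b = agree / (agree + disagree)      # renormalise over the non-conflicting mass
    return b

def cumulative(*beliefs):
    return sum(beliefs) / len(beliefs)

S, T, C = 0.946, 0.425, 0.953
print(aggregate(S, T), aggregate(S, T, C))      # ~0.928 and ~0.996 (cf. ST*, STC*)
print(cumulative(S, T), cumulative(S, T, C))    # ~0.686 and ~0.775 (cf. ST+, STC+)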
5 Discussion and Conclusions
This work has explored the differences between multiple expert expressions of relations between datasets. Earlier work has shown that such approaches reliably identify inconsistency between two datasets (Comber et al. in press b). Various factors have been shown to influence which parcels are partitioned: different experts, different scenarios, threshold selection and different methods for combining the belief. They generate different sets of change parcels and necessarily different error terms. The extent to which one set of results from the interaction of these factors is preferred to another centres on the acceptability of different types of error. The preference will depend on the question being asked of the data and by whom, which will determine the acceptability of different levels of errors. This point is illustrated in Figures 1, 2 and 3, which show the different woodland parcels identified as having changed for an area of Sherwood Forest. The differences can be explained in terms of:
- different scenarios (Figure 1);
- different thresholds (Figure 1);
- different combination methods (Figure 2);
- different experts and the sequence of scenarios (Figure 3).
In this work LCMGB and LCM2000 have been analysed for change. They are separated by a 10-year interval and therefore the set of inconsistent parcels will contain change parcels. For datasets with a smaller interval separating them, the hypothesis of change may be less appropriate; however, this method would still identify inconsistency between the datasets. This is of interest to many areas of geographic research that are concerned with shifting paradigms in the way that geographic phenomena such as land cover are measured and recorded. In such cases, this method identifies where one dataset is considered to be inconsistent relative to the reporting paradigms of the other.
Fig. 1. Woodland parcels identified as having changed from different scenarios (Semantic; Technical; Change) using different thresholds (combined belief > 0.9 and combined belief = 1) and evidence from the Producer expert
Fig. 2. Woodland change parcels identified from ways of combining beliefs, cumulative (+) and aggregate (*) (T > 0.9)
The problem of detecting change between thematic (land cover) maps was described by Fuller et al. (2003) with reference to the accuracy of LCMGB and LCM1990. This work has applied the recommendation made by Fuller et al. (2003), namely that change detection using LCMGB and LCM2000 may be possible using the parcel structure of LCM2000 to interrogate the LCMGB raster data. This has provided local (LCM2000 parcel) descriptions of LCMGB distributions, allowed the previous cover dominance for each LCM2000 land parcel to be determined and, when interpreted through the lens of the Expert LUT, identified change and inconsistency on a per-parcel basis. A complementary description of LCM2000 parcel heterogeneity was provided by one of the LCM2000 meta-data attributes “PerPixList” and this in turn was interpreted in qualitative terms using the information on expected spectral overlap provided by the data providers.
Fig. 3. Parcels identified by different experts (Distributor; Producer; User) combining the beliefs cumulatively from the different scenarios (Semantic; Semantic + Technical; Semantic + Technical + Change), where T > 0.9
Whilst field validation of change provided error terms and confidences for the different combinations of experts, scenarios, thresholds and combination methods, it is difficult to say in any absolute sense that one combination of these factors is objectively better than another. Rather, the user must decide which aspects of the question they wish to ask of the data are the most important. It is hoped that techniques such as this are of interest to users in any instance where two datasets with non-matching semantics are used (possibly but not necessarily temporally distinct). Indeed, it should be noted that where the datasets are synchronous or the data are not expected to have changed, the method provides an approach to data verification and semantic consistency.
Acknowledgements This paper describes work done within the REVIGIS project funded by the European Commission, Project Number IST-1999-14189. We would like to thank our collaborators, especially Andrew Frank, Robert Jeansoulin, Geoff Smith, Alfred Stein, Nic Wilson, Mike Worboys and Barry Wyatt.
References Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12 (4), 299-314. Comber, A. Fisher, P., and Wadsworth, R. (in press a). Integrating land cover data with different ontologies: identifying change from inconsistency. International Journal of Geographical Information Science. Comber, AJ, Fisher, PF, Wadsworth, RA., (in press b). Assessment of a Semantic Statistical Approach to Detecting Land Cover Change Using Inconsistent Data Sets. Photogrammetric Engineering and Remote Sensing. Comber, A. Fisher, P. and Wadsworth, R., 2003a. Actor Network Theory: a suitable framework to understand how land cover mapping projects develop? Land Use Policy, 20, 299-309. Comber, A.J., Fisher, P.F and Wadsworth, R.A, 2003b. A semantic statistical approach for identifying change from ontologically diverse land cover data. In AGILE 2003, 5th AGILE conference on Geographic Information Science, edited by Michael Gould, Robert Laurini, Stephane Coulondre (Lausanne: PPUR), pp. 123-131. Comber, AJ, Fisher, PF, Wadsworth, RA., 2003c. Identifying Land Cover Change Using a Semantic Statistical Approach: First Results. In cd Proceedings of the 7th International Conference on GeoComputation, 8th-10th September 2003 (University of Southampton). Frank, A.U. 2001. Tiers of ontology and consistency constraints in geographical information systems, International Journal of Geographical Information Science, 15 (7), 667-678. Fuller, R.M., Smith, G.M., and Devereux, B.J. (2003). The characterisation and measurement of land cover change through remote sensing: problems in operational applications? International Journal of Applied Earth Observation and Geoinformation, 4, 243-253. Fuller, R.M., Groom, G.B. and Jones, A.R., 1994. The Land Cover Map of Great Britain: an automated classification of Landsat Thematic Mapper data. Photogrammetric Engineering and Remote Sensing, 60, 553-562. Fuller, R.M., Smith, G.M., Sanderson, J.M., Hill, R.A. and Thomson, A.G., 2002. Land Cover Map 2000: construction of a parcel-based vector map from satellite images. Cartographic Journal, 39, 15-25. Guarino, N, 1995. Formal ontology, conceptual analysis and knowledge representation. International Journal of Human-Computer Studies, 43, 625-640. Harvey, F., Kuhn, W., Pundt, H., Bishr, Y. and Riedemann, C., 1999. Semantic interoperability: A central issue for sharing geographic information. Annals of Regional Science 33 (2), 213-232. Herring J.R., 1999. The OpenGIS data model. Photogrammetric Engineering and Remote Sensing, 65 (5), 585-588.
OGC, 2003. OpenGIS Consortium. http://www.opengis.org/ (last date accessed: 10 June 2003). Pundt, H. and Y. Bishr, 2002. Domain ontologies for data sharing-an example from environmental monitoring using field GIS. Computers and Geosciences, 28 (1), 95-102. Smith, G.M. and R.M. Fuller, 2002. Land Cover Map 2000 and meta-data at the land parcel level, In Uncertainty in Remote Sensing and GIS, edited by G.M. Foody and P.M. Atkinson (London: John Wiley and Sons), pp 143-153.
Visser, U., Stuckenschmidt, H., Schuster, G. and Vogele, T., 2002. Ontologies for geographic information processing. Computers and Geosciences, 28, 103-117.
Representing, Manipulating and Reasoning with Geographic Semantics within a Knowledge Framework
James O'Brien and Mark Gahegan
GeoVISTA Center, Department of Geography, The Pennsylvania State University, University Park, PA 16802, USA. Ph: +1-814-865-2612; Fax: +1-814-863-7643; Email: [email protected], [email protected]
Abstract This paper describes a programmatic framework for representing, manipulating and reasoning with geographic semantics. The framework enables automating tool selection for user-defined geographic problem solving and evaluating semantic change in knowledge discovery environments. The uses, inputs, outputs and semantic changes of methods, data and human experts (our resources) are described using ontologies. These ontological descriptions are manipulated by an expert system to select resources to solve a user-defined problem. A semantic description of the problem is compared to the services that each entity can provide to construct a graph of potential solutions. An optimal (least cost) solution is extracted from these solutions and displayed in real time. The semantic change(s) resulting from the interaction of resources within the optimal solution are determined via expressions of transformation semantics represented within the Java Expert System Shell. This description represents the formation history of each new information product (e.g. a map or overlay) and can be stored, indexed and searched as required. Examples are presented to show (1) the construction and visualization of information products, (2) the reasoning capabilities of the system to find alternative ways to produce information products from a set of data, methods and expertise, given certain constraints, and (3) the representation of the ensuing semantic changes by which an information product is synthesized.
1 Introduction The importance of semantics in geographic information is well documented (Bishr, 1998; Egenhofer, 2002; Fabrikant and Buttenfield, 2001; Kuhn, 2002). Semantics are a key component of interoperability between GIS; there are now robust technical solutions for interoperating geographic information in a syntactic and schematic sense (e.g. OGC, NSDI), but these fail to take account of any sense of meaning associated with the information. Visser et al. (2002) describe how exchanging data between systems often fails due to confusion in the meaning of concepts. Such confusion, or semantic heterogeneity, significantly hinders collaboration if groups cannot agree on a common lexicon for core concepts. Semantic heterogeneity is also blamed for the inefficient exchange of geographic concepts and information between groups of people with differing ontologies (Kokla and Kavouras, 2002). Semantic issues pervade the creation, use and re-purposing of geographic information. In an information economy we can identify the roles of information producer and information consumer, and in some cases, in national mapping agencies for example, datasets are often constructed incrementally by different groups of people (Gahegan, 1999) with an implicit (but not necessarily recorded) goal. The overall meaning of the resulting information products is not always obvious to those outside that group, existing for the most part in the creators' mental model. When solving a problem, a user may gather geospatial information from a variety of sources without ever encountering an explicit statement about what the data mean, or what they are (and are not) useful for. Without capturing the semantics of the data throughout the process of creation, the data may be misunderstood, be used inappropriately, or not used at all when they could be. The consideration of geospatial semantics needs to explicitly cater for the particular way in which geospatial tasks are undertaken (Egenhofer, 2002). As a result, the underlying assumptions about methods used with data, and the roles played by human expertise, need to be represented in some fashion so that a meaningful association can be made between appropriate methods, people and data to solve a problem. It is not the role of this paper to present a definitive taxonomy of geographic operations or their semantics. To do so would trivialize the difficulties of defining geographic semantics.
2 Background and Aims This paper presents a programmatic framework for representing, manipulating and reasoning with geographic semantics. In general, semantics refers to the study of the relations between symbols and what they represent (Hakimpour and Timpf, 2002). In the framework outlined in this paper, semantics have two valuable and specific roles: first, to determine the most appropriate resources (method, data or human expert) to use in concert to solve a geographic problem, and second, to act as a measure of change in meaning when data are operated on by methods and human experts. Both of these roles are discussed in detail in Section 3. The framework draws on a number of different research fields, specifically: geographical semantics (Gahegan, 1999 and Kuhn, 2002), ontologies (Guarino, 1998), computational semantics (Sowa, 2000), constraint-based reasoning and expert systems (Honda and Mizoguchi, 1995) and visualization (MacEachren, in press) to represent aspects of these resources. The framework sets out to solve a multi-layered problem of visualizing knowledge discovery, automating tool selection for user-defined geographic problem solving and evaluating semantic change in knowledge discovery environments. The end goal of the framework is to associate with geospatial information products the details of their formation history and tools by which to browse, query and ultimately understand this formation history, thereby building a better understanding of meaning and appropriate use of the information. The goal of the framework is not to develop new theories about ontologies or semantics, but instead to use ontologies to specify a problem and to govern the interaction between data, methods and human experts to solve that problem. The problem of semantic heterogeneity arises due to the varying interpretations given to the terms used to describe facts and concepts. Semantic heterogeneity exists in two forms, cognitive and naming (Bishr, 1998). Cognitive semantic heterogeneity results from no common base of definitions between two (or more) groups. Defining such points of agreement amounts to constructing a shared ontology, or at the very least, points of overlap (Pundt and Bishr, 2002). Naming semantic heterogeneity occurs when the same name is used for different concepts or different names are used for the same concept. It is not possible to undertake any semantic analysis until problems of semantic heterogeneity are resolved. Ontologies, described below, are widely recommended as a means of rectifying semantic heterogeneity (Hakimpour and Timpf, 2002; Kokla and Kavouras, 2002; Kuhn, 2002; Pundt and Bishr, 2002; Visser et al., 2002). The framework presented in this paper utilizes that work and other ontological
research (Brodaric and Gahegan, 2002; Chandrasekaran et al., 1997; Fonseca, 2001; Fonseca and Egenhofer, 1999; Fonseca et al., 2000; Guarino, 1997a; Guarino, 1997b; Mark et al., 2002) to solve the problem of semantic heterogeneity. The use of an expert system for automated reasoning fits well with the logical semantics utilized within the framework. The Java Expert System Shell (JESS) is used to express diverse semantic aspects of methods, data, and human experts. JESS performs string comparisons of resource attributes (parsed from ontologies) using backward chaining to determine interconnections between resources. Backward chaining is a goal-driven problem solving methodology, starting from the set of possible solutions and attempting to derive the problem. If the conditions for a rule to be satisfied are not found within that rule, the engine searches for other rules that have the unsatisfied condition as their conclusion, establishing dependencies between rules; the sketch below illustrates this idea.
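To make the goal-driven matching concrete, here is a minimal, illustrative sketch of backward chaining over resource descriptions. It is not the authors' JESS rule base: the rule and resource names are invented, and a production system would express these as JESS rules rather than Python functions.

```python
# Minimal backward-chaining sketch (illustrative only, not the authors' JESS code).
# A "rule" states what a method needs and what it can conclude; resources
# (data, experts) are plain facts. Starting from a goal, we work backwards,
# satisfying each need either from a fact or from another rule's conclusion.

FACTS = {"dem_raster", "road_vector", "terrain_expert"}          # assumed resources
RULES = [
    {"name": "slope_method",   "needs": {"dem_raster"},           "gives": "slope_map"},
    {"name": "overlay_method", "needs": {"slope_map", "road_vector", "terrain_expert"},
     "gives": "suitability_map"},
]

def prove(goal, depth=0):
    """Return True if `goal` can be derived from FACTS via backward chaining."""
    if goal in FACTS:
        return True
    for rule in RULES:
        if rule["gives"] == goal:
            # the rule fires only if every one of its needs can itself be proved
            if all(prove(need, depth + 1) for need in rule["needs"]):
                print("  " * depth + f"{rule['name']} -> {goal}")
                return True
    return False

prove("suitability_map")   # prints the chain of rules that satisfies the goal
```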
2.1 Ontology In philosophy, Ontology is the "study of the kinds of things that exist" (Chandrasekaran et al., 1997; Guarino, 1997b). In the artificial intelligence community, ontology has one of two meanings: as a representation vocabulary, typically specialized to some domain or subject matter, and as a body of knowledge describing some domain using such a representation vocabulary (Chandrasekaran et al., 1997). The goal of sharing knowledge can be accomplished by encoding domain knowledge using a standard vocabulary based on an ontology (Chandrasekaran et al., 1997; Kokla and Kavouras, 2002; Pundt and Bishr, 2002). The framework described here utilizes both definitions of ontology. The representation vocabulary embodies the conceptualizations that the terms in the vocabulary are intended to capture. Relationships described between conceptual elements in this ontology allow for the production of rules governing how these elements can be "connected" to solve a geographic problem. In our case these elements are methods, data and human experts, each with their own ontology. In the case of datasets, a domain ontology describes salient properties such as location, scale, date, format, etc., as currently captured in meta-data descriptions (see figure 1).
Fig. 1. Interrelationships between different types of ontology.
In the case of methods, a domain ontology describes the services a method provides in terms of a transformation from one semantic state to another. In the case of human experts the simplest representation is again a domain ontology that shows the contribution that a human can provide in terms of steering or configuring methods and data. However, it should also be possible to represent a two-way flow of knowledge as the human learns from situations and thereby expands the number of services they can provide (we leave this issue for future work). The synthesis of a specific information product is specified via a task ontology that must fuse together elements of the domain and application ontologies to attain its goal. An innovation of this framework is the dynamic construction of the solution network, analogous to the application ontology. In order for resources to be useful in solving a problem, their ontologies must also overlap. Ontology is a useful metaphor for describing the genesis of the information product.
Fig. 2. The creation of an information product relies on data, methods, and human experts.
A body of knowledge described using the domain ontology is utilized
in the initial phase of setting up the expert system. A task ontology is created at the conclusion of the automated process, specifically defining the concepts that are available. An information product is derived from the use of data extracted from databases and knowledge from human experts in methods, as shown in figure 2. By forming a higher-level ontology which describes the relationships between each of these resources it is possible to describe appropriate interactions. Fonseca et al. (2001) highlight the complexity of determining a solution and selecting appropriate methods and data while attempting to define a relationship between climate and vector-borne diseases (e.g. West Nile virus). The authors outline a problem where a relationship exists between climate and infectious diseases, as the disease agents (viruses, bacteria, etc.) and disease vectors (ticks, mosquitoes, rodents) are sensitive to temperature, moisture and other climatic variables. While climate affects the likely presence or abundance of an agent or vector, health outcomes also depend on other criteria: age demographics and stressors such as air and water quality can determine risk, while land use and land cover are significant controls on ecosystem character and function, and hence on any disease dynamics associated with insects, rodents, deer, etc. (Fonseca et al., 2001). The lack of an ability to assess the response of the system to multiple factors limits our ability to predict and mitigate adverse health outcomes.
Fig. 3. A concrete example of the interaction of methods, data and human experts to produce an information product.
Such a complex problem (figure 3) is obviously more difficult to constrain: a larger number of data sources, methods and human experts is required, potentially with changes between data models, data scales and levels of abstraction, and across multiple domains (climate, climate change, effects of environmental variation on flora, fauna, landuse/landcover and soils, epidemiology, and human health-environment interactions), potentially from fields of study with different conceptualizations of concepts. One interesting feature demonstrated in this example is the ability of human experts to gain experience through repeated exposure to similar situations. In this example a basic semantic structure is being constructed and a lineage of the data can be determined. 2.2 Semantics While the construction of the information product is important, a semantic layer sits above the operations and information (figure 4).
Fig. 4. Interaction of the semantic layer and operational layer: a reasoning engine links geospatial knowledge (the meaning layer) with geospatial information (the operational layer).
The geospatial knowledge obtained during the creation of the product is captured within this layer. The capture of this semantic information describes the
transformations that the geospatial information undergoes, facilitating better understanding and providing a measure of repeatability of analysis, and
improving communication in the hope of promoting best practice in bringing geospatial information to bear. Egenhofer (2002) notes that the challenge remains of how best to make these semantics available to the user via a search interface. Pundt and Bishr (2002) outline a process in which a user searches for data to solve a problem. This search methodology is also applicable for the methods and human experts to be used with the data. This solution fails when multiple sources are available and nothing is known of their content, structure and semantics. The use of pre-defined ontologies aids users by reducing the available search space (Pundt and Bishr, 2002). Ontological concepts relevant to a problem domain are supplied to the user, allowing them to focus their query. A more advanced interface would take the user's query in their own terms and map that to an underlying domain ontology (Bishr, 1998). As previously noted, the meaning of geospatial information is constructed, shaped and changed by the interaction of people and systems. Consequently, the interaction of human experts, methods and data needs to be carefully planned. A product created as a result of these interactions is dependent on the ontology of the data and methods and the epistemologies and ontologies of the human experts. In light of this, the knowledge framework outlined below focuses on each of the resources involved (data, methods and human experts) and the roles they play in the evolution of a new information product. In addition, the user's goal that produced the product, and any constraints placed on the process, are recorded to capture aspects of intention and situation that also have an impact on meaning. This process and the impact of constraint-based searches are discussed in more detail in the following section.
3 Knowledge framework The problem described in the introduction has been addressed through three components. The first, and the simplest, is the task of visualizing the network of interactions by which new information products are synthesized. The second, automating the construction of such a network for a user-defined task, is interdependent with the third, evaluating semantic change in a knowledge discovery environment, and both utilize functionality of the first. An examination of the abstract properties of data, methods and experts is followed by an explanation of these components and their interrelationships.
3.1 Formal representation of components and changes This section explains how the abstract properties of data, methods and experts are represented, and then employed to track semantic changes as information products are produced utilizing tools described above. From the description in Section 2 it should be evident that such changes are a consequence of the arrangement of data, computational methods and expert interaction applied to data. At an abstract level above that of the data and methods used, we wish to represent some characteristics of these three sets of components in a formal sense, so that we can describe the effects deriving from their interaction. One strong caveat here is that our semantic description (described below) does not claim to capture all senses of meaning attached to data, methods or people, and in fact as a community of researchers we are still learning about which facets of semantics are important and how they might be described. It is not currently possible to represent all aspects of meaning and knowledge within a computer, so we aim instead to provide descriptions that are rich enough to allow users to infer aspects of meaning that are important for specific tasks from the visualizations or reports that we can synthesize. In this sense our own descriptions of semantics play the role of a signifier—the focus is on conveying meaning to the reader rather than explicitly carrying intrinsic meaning per-se. The formalization of semantics based on ontologies and operated on using a language capable of representing relations provides for powerful semantic modelling (Kuhn, 2002). The framework, rules, and facts used in the Solution Synthesis Engine (see below) function in this way. Relationships are established between each of the entities, by calculating their membership within a set of objects capable of synthesizing a solution. We extend the approach of Kuhn by allowing the user to narrow a search for a solution based on the specific semantic attributes of entities. Using the minimal spanning tree produced from the solution synthesis it is possible to retrace the steps of the process to calculate semantic change. As each fact is asserted it contains information about the rule that created it (the method) and the data and human experts that were identified as resources required. If we are able to describe the change to the data (in terms of abstract semantic properties) imbued by each of the processes through which it passes, then it is possible to represent the change between the start state and the finish state by differencing the two. Although the focus of our description is on semantics, there are good reasons for including syntactic and schematic information about data and methods also, since methods generally are designed to work in limited circumstances, using and producing very specific data types (pre-conditions and post-conditions). Hence from a practical perspective it makes sense to
represent and reason with these aspects in addition to semantics, since they will limit which methods can be connected together and dictate where additional conversion methods are required. Additional potentially useful properties arise when the computational and human infrastructure is distributed e.g. around a network. By encoding such properties we can extend our reasoning capabilities to address problems that arise when resources must be moved from one node to another to solve a problem (Gahegan, 1998). Describing Data As mentioned in Section 2, datasets are described in general terms using a domain ontology drawn from generic metadata descriptions. Existing metadata descriptions hold a wealth of such practical information that can be readily associated with datasets; for example the FGDC (1998) defines a mix of semantic, syntactic and schematic metadata properties. These include basic semantics (abstract and purpose), syntactic (data model information, and projection), and schematic (creator, theme, temporal and spatial extents, uncertainty, quality and lineage). We explicitly represent and reason with a subset of these properties in the work described here and could easily expand to represent them all, or any other given metadata description that can be expressed symbolically. Formally, we represent the set of n properties of a dataset D as in Eq. 1 (Gahegan, 1996).
D(p_1, p_2, \ldots, p_n)    (1)
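The property-set notation of Eq. 1 can be read as a simple mapping from property names to values. The sketch below uses invented property names and values rather than the authors' schema, and also shows how a semantic change could be reported by differencing the start and finish states of a dataset, as discussed earlier in this section.

```python
# A dataset as a set of named properties (Eq. 1), using assumed example values.
dem_before = {"theme": "elevation", "format": "raster", "scale": "1:50,000",
              "spatial_extent": "tile SK66", "temporal_extent": "2000",
              "uncertainty": "low"}

# After a cartographic generalization method has been applied (assumed effect).
dem_after = dict(dem_before, scale="1:100,000", uncertainty="medium")

def semantic_difference(before, after):
    """Report which abstract properties changed between start and finish states."""
    return {p: (before.get(p), after.get(p))
            for p in set(before) | set(after)
            if before.get(p) != after.get(p)}

print(semantic_difference(dem_before, dem_after))
# {'scale': ('1:50,000', '1:100,000'), 'uncertainty': ('low', 'medium')}
```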
Describing Methods While standards for metadata descriptions are already mature and suit our purposes, complementary mark-up languages for methods are still in their infancy. It is straightforward to represent the signature of a method in terms of the format of data entering and leaving the method, and knowing that a method requires data to be in a certain format will cause the system to search for and insert conversion methods automatically where they are required. So, for example, if a coverage must be converted from raster format to vector format before it can be used as input to a surface flow accumulation method, then the system can insert appropriate data conversion methods into the evolving query tree to connect to appropriate data resources that would otherwise not be compatible. Similarly, if an image classification method requires data at a nominal scale of 1:100,000 or a pixel size of 30m, any data at finer scales might be generalized to meet this requirement prior to use. Although such descriptions have great practical
benefit, they say nothing about the role the method plays or the transformation it imparts to the data; in short they do not enable any kind of semantic assessment to be made. A useful approach to representing what GIS methods do, in a conceptual sense, centers on a typology (e.g. Albrecht's 20 universal GIS operators, 1994). Here, we extend this idea to address a number of different abstract properties of a dataset, in terms of how the method invoked changes these properties (Pascoe & Penny, 1995; Gahegan, 1996). In a general sense, the transformation performed by a method (M) can be represented by preconditions and post-conditions, as is common practice with interface specification and design in software engineering. Using the notation above, our semantic description takes the form shown in Eq. 2, where Operation is a generic description of the role or function the method provides, drawn from a typology.
M : D(p_1, p_2, \ldots, p_n) \xrightarrow{\text{Operation}} D'(p_1', p_2', \ldots, p_n')    (2)
For example, a cartographic generalization method changes the scale at which a dataset is most applicable, a supervised classifier transforms an array of numbers into a set of categorical labels, an extrapolation method might produce a map for next year, based on maps of the past. Clearly, there are any number of key dimensions over which such changes might be represented; the above examples highlight spatial scale, conceptual ‘level’ (which at a basic syntactic level could be viewed simply as statistical scale) and temporal applicability, or simply time. Others come to light following just a cursory exploration of GIS functionality: change in spatial extents, e.g. windowing and buffering, change in uncertainty (very difficult in practice to quantify but easy to show in an abstract sense that there has been a change).
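One way to read the description above is as a pre-condition/post-condition pair attached to each method, which also lets the system splice in a conversion method when formats do not line up (the raster-to-vector example mentioned earlier). The sketch below is an illustrative reading only; the method names and property keys are assumptions, not the framework's actual ontology.

```python
# Each method declares what it requires of its input and how it transforms the
# property set (Eq. 2). Names and properties are illustrative assumptions.
METHODS = {
    "raster_to_vector":  {"requires": {"format": "raster"},
                          "effects":  {"format": "vector"}},
    "flow_accumulation": {"requires": {"format": "vector", "theme": "elevation"},
                          "effects":  {"theme": "surface flow"}},
}

def applicable(method, dataset):
    return all(dataset.get(k) == v for k, v in METHODS[method]["requires"].items())

def apply_method(method, dataset):
    out = dict(dataset)
    out.update(METHODS[method]["effects"])
    return out

def plan(dataset, target_method):
    """Insert a conversion step automatically if the target's pre-conditions fail."""
    steps = []
    if not applicable(target_method, dataset) and applicable("raster_to_vector", dataset):
        dataset = apply_method("raster_to_vector", dataset)
        steps.append("raster_to_vector")
    steps.append(target_method)
    return steps, apply_method(target_method, dataset)

coverage = {"format": "raster", "theme": "elevation", "scale": "1:100,000"}
print(plan(coverage, "flow_accumulation"))
# (['raster_to_vector', 'flow_accumulation'],
#  {'format': 'vector', 'theme': 'surface flow', 'scale': '1:100,000'})
```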
Again, we have chosen not to restrict ourselves to a specific set of properties, but rather to remain flexible in representing those that are important to specific application areas or communities. We note that as Web Services (Abel et al., 1998) become more established in the GIS arena, such an enhanced description of methods will be a vital component in identifying potentially useful functionality. Describing People Operations may require additional configuration or expertise in order to carry out their task. People use their expertise to interact with data and methods in many ways, such as gathering, creating and interpreting data, configuring methods and interpreting results. These activities are typically structured around well-defined tasks where the desired outcome is known,
although as in the case of knowledge discovery, they may sometimes be more speculative in nature. In our work we have cast the various skills that experts possess in terms of their ability to help achieve some desired goal. This, in turn, can be re-expressed as their suitability to oversee the processing of some dataset by some method, either by configuring parameters, supplying judgment or even performing the task explicitly. For example, an image interpretation method may require identification of training examples that in turn necessitate local field knowledge; such knowledge can also be specified as a context of applicability using the time, space, scale and theme parameters that are also used to describe datasets. As such, a given expert may be able to play a number of roles that are required by the operations described above, with each role described by Eq. 3, meaning that expert E can provide the necessary knowledge to perform Operation within the context of p_1, \ldots, p_n. So to continue the example of image interpretation, p_1, \ldots, p_n might represent (say) floristic mapping of Western Australia, at a scale of 1:100,000 in the present day.
E \xrightarrow{\text{Operation}} (p_1, p_2, \ldots, p_n)    (3)
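Following Eq. 3, an expert's role can be matched against a task by checking that the task's context (theme, space, time, scale) falls within the context in which the expert's knowledge applies. The sketch below is a schematic reading; the expert, the context keys and the matching rule (exact equality) are all invented for illustration.

```python
# An expert offers an operation within a context of applicability (Eq. 3).
expert = {"name": "field ecologist",
          "operation": "identify training examples",
          "context": {"theme": "floristic mapping", "region": "Western Australia",
                      "scale": "1:100,000", "time": "present day"}}

def can_support(expert, operation, task_context):
    """True if the expert offers `operation` and the task context matches theirs."""
    return (expert["operation"] == operation and
            all(expert["context"].get(k) == v for k, v in task_context.items()))

task = {"theme": "floristic mapping", "region": "Western Australia",
        "scale": "1:100,000", "time": "present day"}
print(can_support(expert, "identify training examples", task))   # True
```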
At the less abstract schematic level, location parameters can also be used to express the need to move people to different locations in order to conduct an analysis, or to bring data and methods distributed throughout cyberspace to the physical location of a person. Another possibility here, that we have not yet implemented, is to acknowledge that a person's ability to perform a task can increase as a result of experience. So it should be possible for a system to keep track of how much experience an expert has accrued by working in a specific context (described as p_1, \ldots, p_n). (In this case the expert expression would also require an experience or suitability score, as described for constraint management in Section 3.2.) We could then represent a feedback from the analysis exercise to the user, modifying their experience score. 3.2 Solution Synthesis Engine The automated tool selection process or solution synthesis relies on domain ontologies of the methods, data and human experts (resources) that are usable to solve a problem. The task of automated tool selection can be divided into a number of phases. First is the user's specification of the problem, either using a list of ontological keywords (Pundt and Bishr, 2002) or in their own terms which are mapped to an underlying ontology (Bishr, 1997).
Fig. 5. Solution Synthesis Engine user interface
Second, ontologies of methods, data and human experts need to be processed to determine which resources overlap with the problem ontology. Third, a description of the user's problem and any associated constraints is parsed into an expert system to define rules that describe the problem. Finally, networks of resources that satisfy the rules need to be selected and displayed. Defining a complete set of characteristic attributes for real-world entities (such as data, methods and human experts) is difficult (Bishr, 1998) due to problems selecting attributes that accurately describe the entity. Bishr's solution of using cognitive semantics to solve this problem, by referring to entities based on their function, is implemented in this framework. Once again it should be noted that it is not the intention of this framework to describe all aspects of semantics, but to provide descriptions that are rich enough to allow users to infer aspects of meaning that are important for specific tasks from the visualizations or reports that we can synthesize. Methods utilize data or are utilized by human experts and are subject to conditions regarding their use, such as data format, scale or a level of human knowledge. The rules describe the requirements of the methods ('if') and the output(s) of the methods ('then'). Data and human experts, specified by facts, are arguably more passive, and the rules of methods are applied to or by them respectively. The set of properties, outlined above, describing data and human experts governs how rules may use them.
The first stage of the solution synthesis is the user specification of the problem using concepts and keywords derived from the problem ontology. The problem ontology, derived from the methods, data and human expert ontologies, consists of concepts describing the intended uses of each of the resources. This limitation was introduced to ensure the framework had access to the necessary entities to solve a user's problem. A more advanced version of the problem specification using natural language parsing is beyond the scope of this paper. This natural language query would be mapped to the problem ontology, allowing the user to use their own semantics instead of being governed by those of the system. The second stage of the solution synthesis process parses the rules and facts from DAML+OIL or OWL into rules and facts describing relationships between data, methods, and human experts. It is important to note that these rules do not perform the operations described; rather, they mimic the semantic change that would accompany such an operation. The future work section outlines the goal of running this system in tandem with a codeless programming environment to run the selected toolset automatically. With all of the potential solutions defined by facts and rules, the missing link is the problem. The problem ontology is parsed into JESS to create a set of facts. These facts form the "goal" rule that mirrors the user's problem specification. The JESS engine now has the requisite components for tool selection. During the composition stage, as the engine runs, each of the rules "needed" is satisfied using backward chaining, the goal is fulfilled, and a network of resources is constructed. As each rule fires and populates the network, a set of criteria is added to a JESS fact describing each of the user criteria that limits the network. Each of these criteria is used to create a minimal spanning tree of entities. User criteria are initially based upon the key spatial concepts of identity, location, direction, distance, magnitude, scale, time (Fabrikant and Buttenfield, 2001), availability, operation time, and semantic change. Users specify the initial constraints via the user interface (figure 5) prior to the automated selection of tools. Once again using the vignette, a satellite image is required for the interpretation task, but the only available data is 6 months old and data from the next orbit over the region will not be available for another 4 weeks. Is it "better" to wait for that data to become available or is it more crucial to achieve a solution in a shorter time using potentially out-of-date data? In the case of landuse change, perhaps 6 month old data is acceptable to the user; however, in a disaster management scenario, more timely data may be important. It is possible that the user will request a set of limiting conditions that are too strict to permit a solution.
In these cases all possible solutions will be displayed, allowing the user to modify their constraints. It is proposed that entities causing the solution to be excluded are highlighted, allowing the user to relax their constraints. The user-specified constraints are used to prune the network of proposed solutions to a minimal spanning tree that is the solution (or solutions) that satisfies all of the user's constraints. The final stage of the solution synthesis is the visualization of the process, which utilizes a self-organising graph package (ConceptVISTA). This stage occurs in conjunction with the composition stage. As each of the rules fires, it generates a call to ConceptVISTA detailing the source (the rule which fired it, i.e. the method) and the data and human experts that interacted with the method. This information is passed to ConceptVISTA, which matches the rules with the ontologies describing each of the entities, loads visual variables and updates the self-organising graph. As well as the visual representation in ConceptVISTA, an ontology is constructed detailing the minimal spanning tree. This ontology describes the relationships between the methods, data and human experts as constructed as part of the solution synthesis.
4 Results This section presents the results of the framework's solution synthesis and representation of semantic change. The results of the knowledge discovery visualization are implicit in this discussion as that component is used for the display of the minimal spanning tree (Figure 6). A sample problem, finding a home location with a sunset view, is used to demonstrate the solution synthesis. In order to solve this problem, raster (DEM) and vector (road network) data need to be integrated. A raster overlay using map algebra, followed by buffer operations, is required to find suitable locations from height, slope and aspect data. The raster data of potential sites needs to be converted to a vector layer to enable a buffering operation with vector road data. Finally, a viewshed analysis is performed to determine how much of the landscape is visible from candidate sites. The problem specification was simplified by hard-coding the user requirements into a set of facts loaded from an XML file. The user's problem specification was reduced to selecting pre-defined problems from a menu. A user constraint of scale was set to ensure that data used by the methods in the framework was at a consistent scale, and appropriate data layers
were selected based on their metadata and format. With the user requirements parsed into JESS and a problem selected, the solution engine selected the methods, data and human experts required to solve the problem. The solution engine constructed a set of all possible combinations and then determined the shortest path by summing the weighted constraints specified by the user. Utilizing the abstract notation from above, with methods specifying change (see Eq. 4), the user weights were included and summed for all modified data sets (see Eq. 5). As a result of this process the solution set is pruned until only the optimal solution remains (based on user constraints).
M_1 : D(p_1, p_2, \ldots, p_n) \xrightarrow{\text{Operation}} D'(p_1', p_2', \ldots, p_n')    (4)
\sum \big[ D_1'(u_1 p_1', u_2 p_2', \ldots, u_n p_n'), \ldots, D_n'(u_1 p_1', u_2 p_2', \ldots, u_n p_n') \big]    (5)
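The pruning step described by Eqs. 4 and 5 can be pictured as scoring each candidate chain of resources by a weighted sum over the property changes its intermediate products accumulate, and keeping the lowest-cost chain. The weights, candidate solutions and property-change magnitudes below are invented; the sketch only illustrates the shape of the calculation, not the engine's actual cost model.

```python
# Each candidate solution lists, per modified dataset, the (abstract) property
# changes it causes; user weights u_k express how much each change matters (Eq. 5).
user_weights = {"scale": 2.0, "time": 1.0, "uncertainty": 3.0, "operation_time": 0.5}

candidates = {
    "wait_for_new_image":     {"scale": 0.0, "time": 0.0, "uncertainty": 0.2, "operation_time": 4.0},
    "use_6_month_old_image":  {"scale": 0.0, "time": 1.0, "uncertainty": 0.5, "operation_time": 1.0},
}

def cost(changes, weights):
    """Weighted sum of the property changes accumulated by a candidate solution."""
    return sum(weights.get(prop, 1.0) * amount for prop, amount in changes.items())

for name, changes in candidates.items():
    print(name, cost(changes, user_weights))      # 2.6 vs 3.0 with these weights
best = min(candidates, key=lambda name: cost(candidates[name], user_weights))
print("selected:", best)                          # 'wait_for_new_image'
```

With a much higher weight on operation time, as one might set in a disaster management scenario, the other chain would be selected instead, which mirrors the trade-off in the vignette.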
Fig. 6. Diagram showing the simple vignette solution, derived from a thinned network
5 Future Work The ultimate goal of this project is to integrate the problem solving environment with the codeless programming environment GEOVISTA Studio (Gahegan et al., 2002) currently under development at Pennsylvania State University. The possibility of supplying data to the framework and determining the types of questions which could be answered with it is also an interesting problem. A final goal is the use of natural language parsing of the user’s problem specification.
6 Conclusions This paper outlined a framework for representing, manipulating and reasoning with geographic semantics. The framework enables visualizing knowledge discovery, automating tool selection for user defined geographic problem solving, and evaluating semantic change in knowledge discovery environments. A minimal spanning tree representing the optimal (least cost) solution was extracted from this graph, and can be displayed in real-time. The semantic change(s) that result from the interaction of data, methods and people contained within the resulting tree represents the formation history of each new information product (such as a map or overlay) and can be stored, indexed and searched as required.
Acknowledgements Our thanks go to Sachin Oswal, who helped with the customization of the ConceptVISTA concept visualization tool used here. This work is partly funded by NSF grants: ITR (BCS)-0219025 and ITR Geosciences Network (GEON).
References Abel, D.J., Taylor, K., Ackland, R., and Hungerford, S. 1998, An Exploration of GIS Architectures for Internet Environments. Computers, Environment and Urban Systems. 22(1) pp. 7-23. Albrecht, J., 1994. Universal elementary GIS tasks - beyond low-level commands. In Waugh T C and Healey R G (eds) Sixth International Symposium on Spatial Data Handling : 209-22. Bishr, Y., 1997. Semantic aspects of interoperable GIS. Ph.D. Dissertation, Enschede, The Netherlands, 154 pp. Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12(4): 299-314. Brodaric, B. and Gahegan, M., 2002. Distinguishing Instances and Evidence of Geographical Concepts for Geospatial Database Design. In: M.J. Egenhofer and D.M. Mark (Editors), GIScience 2002. Lecture Notes in Computer Science 2478. Springer-Verlag, pp. 22-37. Chandrasekaran, B., Josephson, J.R. and Benjamins, V.R., 1997. Ontology of Tasks and Methods, AAAI Spring Symposium.
Egenhofer, M., 2002. Toward the semantic geospatial web, Tenth ACM International Symposium on Advances in Geographic Information Systems. ACM Press, New York, NY, USA, McLean, Virginia, USA, pp. 1-4. Fabrikant, S.I. and Buttenfield, B.P., 2001. Formalizing Semantic Spaces for Information Access. Annals of the Association of American Geographers, 91(2): 263-280. Federal Geographic Data Committee. FGDC-STD-001-1998. Content standard for digital geospatial metadata (revised June 1998). Federal Geographic Data Committee. Washington, D.C. Fonseca, F.T., 2001. Ontology-Driven Geographic Information Systems. Doctor of Philosophy Thesis, The University of Maine, 131 pp. Fonseca, F.T., Egenhofer, M.J., Jr., C.A.D. and Borges, K.A.V., 2000. Ontologies and knowledge sharing in urban GIS. Computers, Environment and Urban Systems, 24: 251-271. Fonseca, F.T. and Egenhofer, M.J., 1999. Ontology-Driven Geographic Information Systems. In: C.B. Medeiros (Editor), 7th ACM Symposium on Advances in Geographic Information Systems, Kansas City, MO, pp. 7. Gahegan, M., Takatsuka, M., Wheeler, M. and Hardisty, F., 2002. Introducing GeoVISTA Studio: an integrated suite of visualization and computational methods for exploration and knowledge construction in geography. Computers, Environment and Urban Systems, 26: 267-292. Gahegan, M. N. (1999). Characterizing the semantic content of geographic data, models, and systems. In Interoperating Geographic Information Systems (Eds. Goodchild, M.F., Egenhofer, M. J. Fegeas, R. and Kottman, C. A.). Boston: Kluwer Academic Publishers, pp. 71-84. Gahegan, M. N. (1996). Specifying the transformations within and between geographic data models. Transactions in GIS, Vol. 1, No. 2, pp. 137-152. Guarino, N., 1997a. Semantic Matching: Formal Ontological Distinctions for Information Organization, Extraction, and Integration. In: M.T. Pazienza (Editor), Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology. Springer Verlag, pp. 139-170. Guarino, N., 1997b. Understanding , building and using ontologies. International Journal of Human-Computer Studies, 46: 293-310. Hakimpour, F. and Timpf, S., 2002. A Step towards GeoData Integration using Formal Ontologies. In: M. Ruiz, M. Gould and J. Ramon (Editors), 5th AGILE Conference on Geographic Information Science. Universitat de les Illes Balears, Palma de Mallorca, Spain, pp. 5. Honda, K. and Mizoguchi, F., 1995. Constraint-based approach for automatic spatial layout planning. 11th conference on Artificial Intelligence for Applications, Los Angeles, CA. p38. Kokla, M. and Kavouras, M., 2002. Theories of Concepts in Resolving Semantic Heterogeneities, 5th AGILE Conference on Geographic Information Science, Palma, Spain, pp. 2. Kuhn, W., 2002. Modeling the Semantics of Geographic Categories through Conceptual Integration. In: M.J. Egenhofer and D.M. Mark (Editors), GIScience 2002. Lecture Notes in Computer Science. Springer-Verlag.
MacEachren, A.M., in press. An evolving cognitive-semiotic approach to geographic visualization and knowledge construction. Information Design Journal. Mark, D., Egenhofer, M., Hirtle, S. and Smith, B., 2002. Ontological Foundations for Geographic Information Science. UCGIS Emerging Resource Theme. Pascoe R.T and Penny J.P. (1995) Constructing interfaces between (and within) geographical information systems. International Journal of Geographical Information Systems, 9:p275. Pundt, H. and Bishr, Y., 2002. Domain ontologies for data sharing–an example from environmental monitoring using field GIS. Computers & Geosciences, 28: 95-102. Smith, B. and Mark, D.M., 2001. Geographical categories: an ontological investigation. International Journal of Geographical Information Science, 15(7): 591-612. Sotnykova, A., 2001. Design and Implementation of Federation of SpatioTemporal Databases: Methods and Tools, Centre de Recherche Public - Henri Tudor and Laboratoire de Bases de Donnees Database Laboratory. Sowa, J. F., 2000, Knowledge Representation: Logical, Philosophical and Computational Foundations (USA: Brooks/Cole). Turner, M. and Fauconnier, G., 1998. Conceptual Integration Networks. Cognitive Science, 22(2): 133-187. Visser, U., Stuckenschmidt, H., Schuster, G. and Vogele, T., 2002. Ontologies for geographic information processing. Computers & Geosciences, 28: 103-117.
A Framework for Conceptual Modeling of Geographic Data Quality
Anders Friis-Christensen 1, Jesper V. Christensen 1, and Christian S. Jensen 2
1 National Survey and Cadastre, Rentemestervej 8, CPH, Denmark, {afc|jvc}@kms.dk
2 Aalborg University, Fredrik Bajers Vej 7E, Aalborg, Denmark, [email protected]
Abstract The notion of data quality is of particular importance to geographic data. One reason is that such data is often inherently imprecise. Another is that the usability of the data is in large part determined by how "good" the data is, as different applications of geographic data require that different qualities of the data are met. Such qualities concern the object level as well as the attribute level of the data. This paper presents a systematic and integrated approach to the conceptual modeling of geographic data and quality. The approach integrates quality information with the basic model constructs. This results in a model that enables object-oriented specification of quality requirements and of acceptable quality levels. More specifically, it extends the Unified Modeling Language with new modeling constructs based on standard classes, attributes, and associations that include quality information. A case study illustrates the utility of the quality-enabled model.
1 Introduction We are witnessing an increasing use and an increasingly automated use of geographic data. Specifically, the distribution of geographic data via web services is gaining in popularity. For example, the Danish National Survey and Cadastre has developed an initial generation of such services [12]. This development calls for structured management and description of data quality. The availability of associated quality information is often essential when determining whether or not available geographic data are appropriate for a given application (also known as the fitness for use). When certain geographic objects from a certain area are extracted for use in an application, these should be accompanied by quality information specific to these objects and of relevance to the application. Quality information is necessary metadata, and there is a need for a more dynamic approach to quality management in which quality information is an integrated part of a geographic data model. This allows the relevant quality information to be retrieved together with the geographic data. This is in contrast to the use of traditional, general metadata reports that are separate from the geographic data itself.
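The contrast drawn here, between a separate general metadata report and quality information that travels with each object, can be pictured with a small sketch. The class and attribute names are invented; they are not the schema proposed in the paper.

```python
from dataclasses import dataclass, field

@dataclass
class QualityInfo:
    lineage: str
    spatial_accuracy_m: float          # assessed positional accuracy, metres
    completeness_note: str = ""

@dataclass
class BuildingFeature:
    geometry: object
    usage: str
    # quality information stored with the object itself, not in a separate report
    quality: QualityInfo = field(default_factory=lambda: QualityInfo("unknown", float("nan")))

b = BuildingFeature(geometry=None, usage="residential",
                    quality=QualityInfo(lineage="photogrammetric capture, 2003",
                                        spatial_accuracy_m=0.8))
# An application retrieving the object gets the quality information with it,
# so a fitness-for-use check can be made locally:
print(b.quality.spatial_accuracy_m <= 1.0)   # True for this object
```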
We suggest that the specification of data quality requirements should be an integrated part of conceptual modeling. To enable this, we extend the metamodel of the Unified Modeling Language to incorporate geographic data quality elements. The result is a framework that enables designers and users to specify quality requirements in a geographic data model. Such a quality-enabled model supports application-specific distribution of geographic data, e.g., one that uses web services. Use of this framework has several advantages:
• It "conceptualizes" the quality of geographic data.
• It enables integrated and systematic capture of quality in conceptual modeling.
• It eases the implementation of data quality requirements in databases.
Fundamental elements of quality have been investigated in past work [3, 7, 8, 14], on which this paper builds. In addition, the International Standardization Organization’s TC211 is close to the release of standards for geographic data quality [10, 11]. Their work focuses on identifying, assessing, and reporting quality elements relevant to geographic data; they do not consider the integration of data quality requirements into conceptual models. We believe that conceptual modeling is important when making the management of geographic data quality operational. Constraints are an important mechanism for quality management. Constraints enable the specification of when data are consistent and enable the enforcement of the consistency requirements. An approach to ensure geographic data quality using constraints specified in the Object Constraint Language (OCL) has been investigated previously [2]. However, constraints and consistency issues are only relevant to a subset of geographic data quality; in this paper, we focus on geographic data quality from a more general point of view. The remainder of the paper is organized as follows. Section 2 presents the quality elements and investigates how to classify quality requirements. Section 3 develops a framework for modeling quality by extending UML to include quality elements. Section 4 discusses and evaluates the approach based on an example of how the extended model can be used. Finally, Section 5 concludes and briefly presents research directions.
2 Quality of Geographic Data This section describes quality and the requirements to geographic data quality. Two overall approaches may be used for describing quality: the user-based approach and the product-based approach [6]. The former emphasizes fitness for use. It is based on quality measures evaluated against user requirements dependent on a given application. The product-based approach considers a finite number of quality measures against which a product specification can be evaluated. While the product specification is based on user requirements, it is often a more general specification that satisfies the needs of a number of different customers. The ISO 9000 series (quality management) defines quality as: the totality of characteristics of a product that bear on its ability to satisfy stated and implied needs [9]. This definition is used in the ISO standards on quality of geographic data [10, 11], and it covers both the
user-based and the product-based approach. Geographic data quality is defined by a finite set of elements, which we term quality information. The quality elements that need to be included in a quality report are identified by ISO and others [11, 14] and they include: lineage, accuracy, consistency, completeness (omission and commission). In addition to the elements specified by ISO, we include precision and metric
consistency [8] because they are identified as necessary elements by the Danish National Survey and Cadastre. The difference between accuracy and precision is that precision is the known uncertainty of, e.g., the instrument used for data capture. All elements have additional subelements, as specified in Figure 1.
Fig. 1. Quality Information: Quality Elements and Quality Subelements
As quality is a relative measure, data users and database designers are faced with two challenges: how to specify their quality requirements and how to determine whether data satisfy the requirements. The first challenge concerns the specification of requirements, which are often not expressed directly. Introducing quality elements in conceptual models is helpful in meeting this challenge because it enables the capture of quality requirements in the design phase. The second challenge concerns the possible difference between actual and required data quality. A solution is to make it possible for users to evaluate their requirements against the data, by including quality information about each object, attribute, and association in the database. Section 3 provides an approach to meet these challenges. To be able to support quality requirements in conceptual models, we need to identify the characteristics of such requirements. In doing so, we divide the requirements into two. First, it should be possible to specify which quality subelements are relevant for a given application. These requirements are termed Quality Element Requirements (QERs). Second, it should be possible to express requirements related to the values of the actual quality, i.e., express specifications or standards that the data should satisfy. These requirements are termed Acceptable Quality Levels (AQLs). The two types of requirements are at different levels and are similar to the notions of data quality requirements and application quality requirements [15].
The QERs can be characterized as the elements that are necessary to be able to assess and store quality. Thus, QERs reside on a detailed design level. On a more abstract level, we find requirements such as "90% of all objects must be present in the data," which exemplifies the AQLs. It is important to be able to separate these two levels of requirements. The AQLs do not necessarily have to be specified, but if any quality information is needed, the QERs must be specified. Figure 2 shows the two levels. The requirements are specified as user-defined, which means that the designers specify the requirements, which in turn should reflect the requirements of the application users.
Fig. 2. Quality Requirements: user-specified acceptable quality levels and quality element requirements (at the aggregation/instance level), linked through quality assessment (internal and external).
As seen in the figure, the quality
elements require different methods for their assessment: internal and external methods. In the internal approach, quality is assessed using internal tests immediately when data are inserted into the database, and this testing is independent of external sources. For example, topological constraints are evaluated immediately after data are inserted into the database. Data that do not satisfy the constraints are rejected. In the external approach, data are checked against an external reference that is believed to be the “truth.” For example, if 90% of all forest objects are required to be present in the data, this has to be measured against some kind of reference. Quality elements assessed using external methods need to be assessed at a later stage, by means of an application running on top of the database. Finally, the quality elements are divided into an aggregation level and an instance level. These levels relate to whether quality is measured/specified for each instance of an object type or the quality is aggregated at, e.g., an object-type level. It should be possible to specify requirements related to both levels. As an example, a building object (instance level) has a spatial accuracy, which is different from the aggregated spatial accuracy of all buildings (aggregation level). Both levels may be important to an application.
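To make the two kinds of requirement concrete, the sketch below states a QER (which quality subelements must be assessed and stored) and an AQL (thresholds the assessed values must meet, such as "90% of all objects must be present"), and checks assessed quality against them. The element names and the assessed figures are illustrative assumptions, not values from the paper.

```python
# Quality Element Requirements: which subelements must be assessed and stored.
QER = {"completeness.omission", "spatial_accuracy", "lineage"}

# Acceptable Quality Levels: thresholds on the assessed values (aggregation level).
AQL = {"completeness.omission": lambda v: v >= 0.90,   # proportion of objects present
       "spatial_accuracy":      lambda v: v <= 1.0}    # metres, externally assessed

# Assessed quality for one object class (invented figures).
assessed = {"completeness.omission": 0.93,
            "spatial_accuracy": 1.4,
            "lineage": "aerial photography, 2002"}

missing_qer = QER - assessed.keys()
failed_aql = [e for e, ok in AQL.items() if e in assessed and not ok(assessed[e])]

print("QER satisfied:", not missing_qer)   # True: all required elements are recorded
print("AQL violations:", failed_aql)       # ['spatial_accuracy']
```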
3 Modeling Quality We proceed to present a framework for conceptual modeling of geographic data quality. We first describe how conceptual data models can be extended to support quality requirements. Then the various quality subelements are related to model constructs. This sets the stage for the quality-enabled model, which is covered last. A more thorough description of the quality-enabled model is given elsewhere [4].
3.1 Conceptual Data Models Conceptual data models aim to enable domain experts to reflect their modeling requirements clearly and simply. We use the Unified Modeling Language (UML) [1] for modeling. The UML metamodel represents model constructs3 as classes, e.g., class, association, and attribute. We term these UML base classes. UML provides three built-in mechanisms for extending its syntax and semantics: stereotypes, tagged values, and constraints [1, 16]. Using these mechanisms, it is possible to adapt the UML semantics without changing the UML metamodel, and so these are referred to as lightweight extensibility mechanisms [13]. Another approach to extending UML is to extend the metamodel, e.g., by introducing new metaclasses. A metaclass is a class whose instances are classes [1], and it defines a specific structure for those classes. The new metaclasses can also define stereotypes, which are used when the new modeling constructs are required. We refer to this as the heavyweight extensibility mechanism [13]. 3.2 Quality in Conceptual Data Models Conceptual data models typically support constructs such as classes, associations, and attributes. No special notation is offered for the modeling of data quality, and database designs often do not model data quality, perhaps because data producers do not realize the importance of data quality information. This may reduce the applicability of a database: to be able to use data appropriately, it is important to have access to quality information. Integration of data quality subelements with the common model constructs is a significant step in making it possible to express data quality requirements in the conceptual design phase. This contributes to enabling subsequent access to quality information. Table 1 relates data model constructs to data quality elements, and is used to extend UML with data quality elements. As can be seen from the table, we divide quality subelements related to model constructs into an aggregation level and an instance level. The aggregation level concerns aggregated quality subelement measures for all objects of a class. The instance level concerns quality subelement measures for an individual object. As described in Section 2, quality requirements can reside at both levels. Furthermore, the quality subelements are classified according to how they are assessed. Completeness and accuracy subelements require external data to be assessed, whereas the remaining quality subelements do not. Precision and lineage information are closely related to the production process and are attached to instances of the model constructs. Precision can later be assessed at the aggregation level. The consistency requirements are related to the specification of the universe of discourse. They specify requirements that must be satisfied and are usually expressed as constraints in conceptual models. Consistency requirements can be specified for instances, but also for collections and sets (aggregation level).
3 Termed model elements in UML; here, we use “model constructs” to avoid confusion with “quality element.”
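Purely as an analogy outside UML, the lightweight extension mechanisms above (stereotypes and tagged values) resemble Java annotations, which also attach markers and named values to existing constructs without changing the language itself; the annotation and tag names below are invented for illustration.

import java.lang.annotation.*;

/** Analogy sketch: a stereotype-like marker with tagged values, attached to an
 *  existing construct without modifying the "metamodel" (the language itself). */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Stereotype {
    String value();                       // e.g. "ClassQuality"
    String[] taggedValues() default {};   // e.g. "qualityElementA=..."
}

@Stereotype(value = "ClassQuality",
            taggedValues = {"qualityElementA=accuracy,omission,commission"})
class Building {
    String id;
}

class StereotypeDemo {
    public static void main(String[] args) {
        Stereotype s = Building.class.getAnnotation(Stereotype.class);
        System.out.println(s.value() + " " + String.join(";", s.taggedValues()));
    }
}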
Table 1. Quality Related to Model Constructs
The table distinguishes an aggregation level, with the model constructs Class, Attribute, and Association, from an instance level, with Object, Attribute value, and Association instance. Completeness (external) is expressed through omission and commission, except that omission is not recorded for objects; Accuracy (external) and Precision (internal) cover absolute spatial, thematic, and temporal measures, with only relative spatial accuracy and precision for associations and association instances; Lineage (internal) is recorded as the source at the instance level; and Consistency (internal) covers domain, format, topological, and metric constraints.
At the instance level, the omission subelement of completeness is not included for objects, because information cannot be associated with objects that do not exist in the database. On the other hand, commission errors can be stored; a commission error means that an object in the data set does not exist in the universe of discourse. For attributes and associations, we can assess both omission and commission: when a class is instantiated, it is possible to determine whether or not an attribute has a value, and the same applies to explicit associations, as they can be implemented as foreign keys or object reference attributes. For associations, only spatial accuracy and precision are possible; this is the relative distance, which may be relevant in certain situations. Based on our requirements, we do not find relative temporal and thematic accuracy/precision to be relevant.
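A minimal sketch of this point: once a class is instantiated, omission of an attribute value or of an explicit association can be detected by testing whether the corresponding field or object reference (standing in for a foreign key) is set. The classes below are illustrative only.

/** Sketch: detecting instance-level omission of attribute values and
 *  association instances via unset fields / object references. */
class Municipality {
    String name;
}

class Road {
    String name;                 // thematic attribute
    Municipality locatedIn;      // explicit association (object reference / foreign key)

    boolean nameOmitted()        { return name == null; }
    boolean associationOmitted() { return locatedIn == null; }
}

class OmissionDemo {
    public static void main(String[] args) {
        Road r = new Road();
        r.name = "High Street";            // attribute value present
        System.out.println("name omitted: " + r.nameOmitted());               // false
        System.out.println("association omitted: " + r.associationOmitted()); // true
    }
}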
3.3 Quality-Enabled Model The quality-enabled model presented in this section consists of several elements. First, the class, attribute, and association constructs are extended in the UML metamodel to contain quality subelements. They are used to describe the QERs at the instance and aggregation levels. Second, to express consistency requirements, the constraints are extended to include specific geographic constraints (e.g., topological). Third, optional tagged values are created and used to specify which QERs are relevant and need to be assessed at the instance and aggregation levels. Finally, to specify the AQLs, a specification compartment is used for both the instance and the aggregation level. Quality Element Requirements In Figure 3, the UML metamodel is extended to support geographic data quality based on Table 1. The metaclasses specify the structures of the quality subelements of classes, attributes, and associations, and of their instances. Three new metaclasses are defined: ClassQuality, AttributeQuality, and AssociationQuality. These are subclasses of Class, Attribute, and Association, respectively, from the UML core metamodel. For the abstract metaclass AttributeQuality, we define three subclasses that represent the spatial, thematic, and temporal dimensions. As a subclass of the metaclass Association from the UML core, we define AssociationQuality, which is used when information on omission and commission needs to be captured. We also define a subclass SpatialAssocQuality, which, in addition to omission and commission, stores information about relative accuracy and precision.
Fig. 3. Quality Metamodel Elements: ClassQuality (absoluteAcc, precision, origin, commission), AttributeQuality (origin; with subclasses SpatialAttrQuality, ThematicAttrQuality, and TemporalAttrQuality, each carrying absoluteAcc and precision), and AssociationQuality (origin; with subclass SpatialAssocQuality carrying relativeAcc and precision) extend Class, Attribute, and Association from the UML core and are associated with QualityInformationAttr instances. Constraints state the required QualityInformationAttr types: Lineage, Commission, Accuracy, and Precision for each class instance; Lineage, Omission, Commission, Accuracy, and Precision for each attribute instance; and Lineage, Omission, and Commission for each association instance.
The attributes accuracy, precision, omission, and commission, some of which are associated with the metaclasses that carry quality information, are statically scoped, meaning that their values belong to the class and, hence, apply to all instances. They enable the specification of quality requirements at the aggregation level. Quality subelements relevant at the instance level are modeled using the metaclass QualityInformationAttr. This metaclass inherits from Attribute from the UML core and defines a new attribute, which has a type and an assessment method. The type of the attribute can be: Lineage, Accuracy, Precision, Omission, or Commission. ClassQuality, AttributeQuality, and AssociationQuality are all associated with instances of QualityInformationAttr, which describe various aspects of quality. These attributes are instance-scoped, which means that they are attributes of each instance of the instantiated metamodel classes. The assessment method is optional; however, it can be useful when analyzing a quality report. The instance-scoped attributes are differentiated from the statically scoped attributes by having an initial capital letter.
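The scoping distinction can be mimicked in plain object-oriented code: static fields carry aggregation-level measures shared by the whole class, while instance fields (written here with an initial capital, mirroring the naming convention above) carry per-object quality information. The class name and values are illustrative assumptions.

/** Sketch: statically scoped (aggregation-level) vs instance-scoped
 *  (per-object) quality information, mirroring the metamodel's distinction. */
class BuildingWithQuality {
    // Aggregation level: one value for the whole class (statically scoped).
    static double absoluteAcc;      // aggregated spatial accuracy of all buildings
    static double precision;
    static double commission;

    // Instance level: one value per object (instance-scoped, cf. QualityInformationAttr).
    String Lineage;                 // e.g. source and capture method of this object
    Double Accuracy;                // spatial accuracy of this particular building
    Boolean Commission;             // true if the object is in excess w.r.t. reality

    public static void main(String[] args) {
        BuildingWithQuality b = new BuildingWithQuality();
        b.Lineage = "photogrammetric capture, 2003";
        b.Accuracy = 0.8;                       // metres, per instance
        BuildingWithQuality.absoluteAcc = 1.2;  // metres, aggregated over the class
        System.out.println(b.Lineage + " / class accuracy = " + absoluteAcc);
    }
}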
The extended metamodel and its metaclasses are used elsewhere to define new stereotypes [4]. The tagged values and the acceptable quality level are explained later in this section. The string «ClassQuality» denotes a new stereotype, which can be used in a model. For example, a class whose instances need to carry quality information is associated with the stereotype «ClassQuality». Since we change the structure of UML’s metamodel (adding new attributes to the metaclasses), we apply heavyweight extension to UML. It has to be noted that the stereotypes only express which quality subelements are relevant; not all quality subelements have to be assessed if certain requirements from producers exist. A tagged value (see below) can be used to specify which subelements need to be assessed. Constraints specify consistency requirements in a conceptual model. Constraints can be associated with all modeling constructs (e.g., classes, attributes, and associations), and all constraint types are subclasses of the metaclass Constraint in the UML metamodel. Constraints are specified in OCL, which means that OCL must be extended with new operators, e.g., the topological operator inside. An example domain constraint specified for an attribute follows: «DomainConstraint» context Building inv: self.shape.area > 25. This constraint specifies that the area of a building should be greater than a predefined value (25). Four constraint types are classified, and stereotypes are defined accordingly: «DomainConstraint», «FormatConstraint», «TopologicalConstraint», and «MetricConstraint». These constraints are further specified elsewhere [4]. Quality Subelements to be Assessed If there is a need to reduce the number of quality subelements, we can use the tagged values qualityElementI and qualityElementA to specify which quality elements should be assessed and stored. It is important to note that tagged values can be seen as additional documentation details and do not have to be specified; a comprehensive quality report assesses all elements. QualityElementI specifies which elements need to be assessed at the instance level and qualityElementA specifies which elements need to be assessed at the aggregation level. There is a difference in the allowed subelements; for example, lineage information is only relevant at the instance level. The tagged values can be associated with classes, attributes, and associations. To assess quality at the aggregation level, the relevant information must exist at the instance level. As an example, we add a tagged value to a class Building, specifying the subelements that should be assessed at the aggregation level: qualityElementA = … . Acceptable Quality Level Finally, to specify the AQLs, two specification compartments are introduced. They can be associated with classes (for classes and attributes) and associations. There is one compartment for specifying the AQLs for each of the instance level and the
aggregation level. The database designer specifies the AQLs; however, users of data can also express AQLs, in which case the data can be evaluated against their requirements and the usability determined. Specification of AQLs is optional. The expressions stated in the specification compartments are simply constraints related to the assessed quality, which must be satisfied. The form of the expression is: <model construct>:<name>(<quality subelement>) <operator> <value>. An example of an acceptable quality level is given in the next section.
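Although the paper only fixes the textual form of an AQL expression, such an expression can be read as a simple comparison between an assessed quality value and a threshold. The following sketch, with an assumed in-memory representation, checks the case-study AQL class:road(commission) < 1%.

/** Sketch: checking an acceptable quality level (AQL) such as
 *  "class:road(commission) < 1%" against an assessed quality value. */
class AqlCheckDemo {
    static boolean satisfies(double assessedValue, String operator, double threshold) {
        switch (operator) {
            case "<":  return assessedValue <  threshold;
            case "<=": return assessedValue <= threshold;
            case "=":  return assessedValue == threshold;
            case ">=": return assessedValue >= threshold;
            case ">":  return assessedValue >  threshold;
            default:   throw new IllegalArgumentException(operator);
        }
    }

    public static void main(String[] args) {
        double assessedCommission = 0.007;   // 0.7% of roads are in excess
        // AQL from the case study: class:road(commission) < 1%
        System.out.println(satisfies(assessedCommission, "<", 0.01));  // true: AQL met
    }
}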
4 Discussion of Approach This section discusses the approach presented in the previous section. First, a case study elicits some quality requirements relevant for certain geographic entities. We then show how these requirements can be modeled using our approach. Finally, the approach is evaluated against other approaches. 4.1 Case Study The case study consists of a map database with three classes: Road, Road_Segment, and Municipality, which are depicted in Figure 4 using UML. It is a simplified model of a road network with associated municipalities. The objects in the model are intended for use at the scale of 1:10,000.
Fig. 4. Conceptual Model for Road Network: a Road (name[1] : String) is located in Road_Segments (extent[1] : Polyline), which are within a Municipality (name[1] : String, shape[1] : Polygon)
The model contains no explicit information about quality requirements, although some are represented implicitly in the model. For example, the cardinalities of associations express consistency requirements. Apart from such implicit quality requirements, several additional quality requirements exist. As examples, we state five quality requirements that are to be included in the model.
1. For all roads, road segments, and municipalities, all quality information should be recorded (completeness, accuracy, precision, and lineage requirement).
2. All road segments must have an average root mean square error of at most 1.7 meters (accuracy requirement).
3. The name of a road object should be correct (accuracy requirement).
4. At least 99% of all roads in the universe of discourse must be represented in the data, and no more than 1% of all roads may be represented in the data without being in the universe of discourse (completeness requirement).
5. A road segment must not intersect a municipality border (consistency requirement).
The above list exemplifies the variety among quality requirements. We proceed to express these requirements in our conceptual model.
4.2 Modeling Quality Requirements We use the stereotypes specified above to model the QERs. The classes Road, Road_Segment, and Municipality are all stereotyped with «ClassQuality» (as depicted in Figure 5), because we are interested in all available quality information about each class and its instances (i.e., quality information at both the aggregation and instance levels). To the attribute name of Road, we associate the stereotype «ThematicAttrQuality» because thematic attribute quality information is required. To the attributes extent and shape of Road_Segment and Municipality, we associate the stereotype «SpatialAttrQuality» because spatial attribute quality information is required. For the remaining thematic attribute name of Municipality, we do not require any quality information.
Fig. 5. Case with Quality Elements: Road, Road_Segment, and Municipality are stereotyped «ClassQuality»; name of Road is stereotyped «ThematicAttrQuality», and extent of Road_Segment and shape of Municipality are stereotyped «SpatialAttrQuality». A «TopologicalConstraint» states that Road_Segment.extent must not cross Municipality.shape. Aggregation-level AQL compartments specify, e.g., class:road(commission) < 1%, class:road(omission) < 1%, and attr:name(accuracy) = 100% for Road, and attr:type(accuracy) = 99% and attr:extent(accuracy) < 1.7m for Road_Segment; an instance-level compartment specifies attr:name(accuracy) = true.
The AQLs appear in the specification compartments. For example, for Road we specify that commission and omission must each be less than 1%, which means that less than 1% of all roads may be in excess compared to reality and less than 1% of the roads may be missing from the data. These are requirements on the data producers; if they are not met, new registrations must be initiated. Different approaches can be used to develop a quality-enabled model. We have developed an extension to UML. Another approach is to add quality information to standard geographic data models using design patterns [5]. This approach has the advantage that no extension to UML is needed. For example, the QERs at the aggregation level for both classes and attributes are specified using a quality class for each class and attribute for which quality information is required. Only one instance is allowed for each of these quality classes, similar to the Singleton design pattern [5]. The instance-level QERs for attributes can be specified using additional attributes; for a shape attribute, e.g., a lineageShape and an accuracyShape attribute are required for specifying the quality of the attribute. The AQLs can be specified at the application level on top of a given geographic database, or alternatively tagged values can be used to specify the required AQLs.
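A rough sketch of this design-pattern alternative follows; the per-attribute fields lineageShape and accuracyShape are taken from the text, while all other names are invented for illustration.

/** Sketch of the design-pattern alternative: aggregation-level quality held in a
 *  Singleton quality class, instance-level quality in extra attributes. */
class MunicipalityQuality {                       // one quality class per data class
    private static final MunicipalityQuality INSTANCE = new MunicipalityQuality();
    private MunicipalityQuality() {}
    static MunicipalityQuality getInstance() { return INSTANCE; }

    double aggregatedShapeAccuracy;               // aggregation-level measure
    double omission;
}

class MunicipalityWithQualityAttrs {
    String name;
    Object shape;                                 // stand-in for a polygon geometry
    // Instance-level QERs expressed as additional attributes, as in the text:
    String lineageShape;
    double accuracyShape;
}

class SingletonQualityDemo {
    public static void main(String[] args) {
        MunicipalityQuality.getInstance().aggregatedShapeAccuracy = 2.5;
        MunicipalityWithQualityAttrs m = new MunicipalityWithQualityAttrs();
        m.lineageShape = "digitised from 1:10,000 maps";
        m.accuracyShape = 2.1;
        System.out.println(MunicipalityQuality.getInstance().aggregatedShapeAccuracy);
    }
}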
This approach, however, complicates the schema unnecessarily, as numerous common quality attributes and classes must be associated with every class, attribute, and association that requires quality information. The alternative solutions described above therefore do not meet our requirements. We require that quality information be an integrated part of standard UML constructs, such as classes and associations; thus, we have extended the properties of the existing modeling constructs. The advantage of this approach is that, even though it may at first seem complex, it provides a standard interface for modeling the quality of geographic data. It enables designers to conveniently reflect all requirements related to the quality of geographic data in a conceptual design. Not all quality requirements necessarily have to be specified, and certain requirements can be visually omitted when the model is presented to users with no interest in quality requirements; for example, the AQLs can be hidden from some users. In the design of a conceptual data model, there is always a balance between how much information to capture in the model and the readability of the model. If a model becomes overloaded with information, it is no longer easily read; this is an issue that designers have to consider when using our approach. A further advantage of our approach is that the formulation of quality requirements has been “standardized,” which is helpful in the design and implementation phases; modeling quality requirements using standard UML does not ensure a common approach.
5 Conclusions and Future Work A main motivation for the work reported here is the ongoing change in the distribution and use of geographic data. The trend is towards online and automated access. This leads to an increasing need for the ability to select relevant and appropriate data. In this selection, proper quality information is essential; and since users are selective when it comes to geographic theme and extent, they should receive only the quality information relevant to their selection. This requires new approaches to quality information management. This paper presents a new framework for the integrated conceptual modeling of geographic data and its associated quality. First, we advocate a solution that captures quality information together with the data it concerns. Second, we present a notation for creating conceptual models in which quality requirements can be captured. The use of models based on our notation and framework enables application-oriented quality reports. An example illustrates that designers are given an approach to formulating more precisely the quality requirements that are relevant to a resulting geographic data model. Furthermore, the framework provides a systematic approach to the capture of quality requirements. The paper thus offers a significant step towards a common framework for quality modeling. Several interesting directions for future work exist. Some additional aspects of quality may be taken into consideration. Currently, we do not consider how quality information relates to the results of GIS operations. For example, if spatial data have errors that are spatially autocorrelated, then the data may still be useful for
measuring distances. Next, the development of a more formal specification of the quality-enabled model is of interest, as is the development of a quality evaluation prototype. How to transform conceptual models, created within the proposed framework, into logical models is also a topic for future research. Furthermore, there is a need to develop applications that can be used to assess quality and generate quality reports based on the specified quality requirements. Finally, applications that support the extended UML metamodel should be implemented.
References [1] G. Booch, J. Rumbaugh, and I. Jacobson. The Unified Modeling Language User Guide. Object Technology Series. Addison-Wesley, 1999. [2] M. Casanova, R. Van Der Straeten, and T. Wallet. Automatic Constraint Generation for Ensuring Quality of Geographic Data. In Proceedings of MIS, Halkidiki, Greece, 2002. [3] M. Duckham. Object Calculus and the Object-Oriented Analysis and Design of an Error-Sensitive GIS. GeoInformatica, 5(3):261–289, 2001. [4] A. Friis-Christensen. Issues in the Conceptual Modeling of Geographic Data. Ph.D. thesis, Department of Computer Science, Aalborg University, 2003. [5] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns. Elements of reusable Object-Oriented Software. Addison Wesley, 1995. [6] David A. Garvin. Managing Quality: The Strategic and Competitive Edge. Free Press, 1988. [7] M. F. Goodchild and S. Gopal, editors. The Accuracy of Spatial Databases. Taylor & Francis, 1989. [8] S. C. Guptill and J. L. Morrison, editors. Elements of Spatial Data Quality. Elsevier, 1995. [9] ISO. Quality Management and Quality Assurance—Vocabulary. Technical Report 8402, International Standardization Organization, 1994. [10] ISO. Geographic information—Quality evaluation procedures. ISO/TC 211 19114, International Standardization Organization, 2001. [11] ISO. Geographic information—Quality principles. ISO/TC 211 19113, International Standardization Organization, 2001. [12] KMS. Map Service (in Danish), 2003. http://www.kms.dk/kortforsyning. [13] OMG. White Paper on the Profile Mechanism. 99-04-07, 1999. [14] H. Veregin. Data Quality Parameters. In P. A. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind, editors, Geographical Information Systems: Principles and Technical Issues, Vol. 1, pp. 177–189. 2nd Edition, John Wiley & Sons, 1999. [15] R. Y. Wang, M. Ziad, and Y. W. Lee. Data Quality. Advances in Database Systems. Kluwer Academic Publishers, 2001. [16] J. B. Warmer and A. G. Kleppe. The Object Constraint Language : Precise Modeling with UML. Object Technology Series. Addison-Wesley, 1999.
Consistency Assessment Between Multiple Representations of Geographical Databases: a Specification-Based Approach David Sheeren1,2, Sébastien Mustière1, and Jean-Daniel Zucker3 1 COGIT Laboratory - IGN France, 2-4 avenue Pasteur, 94165 Saint Mandé {David.Sheeren,Sebastien.Mustiere}@ign.fr 2 LIP6 Laboratory, AI Section, University of Paris 6 3 LIM&BIO, University of Paris 13,
[email protected] Abstract There currently exist many geographical databases that represent the same part of the world, each with its own level of detail and point of view. The use and management of these databases therefore sometimes requires their integration into a single database. The main issue in this integration process is the ability to analyse and understand the differences among the multiple representations. These differences can of course be explained by the various specifications, but they can also be due to updates or to errors during data capture. In this paper, we propose a new approach to interpret the differences in representation in a semiautomatic way. We consider the specifications of each database as the “knowledge” needed to evaluate the conformity of each representation. This information is drawn from existing documents and also from the data themselves, by means of machine learning tools. The management of this knowledge is enabled by a rule-based system. The application of this approach is illustrated with a case study from two IGN databases, concerning the differences between the representations of traffic circles. Keywords: Integration, Fusion, Multiple Representations, Interpretation, Expert-System, Machine Learning, Spatial Data Matching.
1 Introduction In recent years, a new challenge has emerged from the growing availability of geographical information originating from different sources: combining it in a consistent way in order to obtain more reliable, rich and useful information. This general problem of information fusion is encountered
in different domains: signal and image processing, navigation and transportation, artificial intelligence… In a database context, this is traditionally called “integration” or “federation” [Sheth and Larson 1990, Parent and Spaccapietra 2000]. Integration of classical databases has already been given much attention in the database community [Rahm and Bernstein 2001]. In the field of geographical databases, it is also the subject of active research. Contributions concern the integration process itself (schema integration and data integration) [Devogele et al. 1998, Branki and Defude 1998], the development of matching tools [Devogele 1997, Walter and Fritsch 1999], the definition of new models supporting multiple representations [Vangenot et al. 2002, Bédard et al. 2002], and new data structures [Kidner and Jones 1994]. Some ontology-based approaches are now being proposed [Fonseca et al. 2002]. But some issues still need to be addressed in the process of unifying geographical databases, particularly the phase of data integration, i.e. the actual population of the unified database. Generally speaking, this phase is mainly thought of as a matching problem. However, it is also essential to determine whether the differences in representation between homologous objects are “normal”, i.e. originating from the differences in specifications. Some contributions exist to evaluate the consistency between multiple representations, especially the consistency of spatial relations [Egenhofer et al. 1994, El-Geresy and Abdelmoty 1998, Paiva 1998]. Most of the time, these studies rely on a presupposed order between representations. This assumption may not be adapted to the study of databases with similar levels of detail but defined according to different points of view. In this paper, we too address the issue of assessing consistency between multiple representations. The approach we suggest is based on the use of the specifications of the individual databases. We consider these specifications as the key to understanding the origin of the differences, and we suggest the use of a rule base to explicitly represent and manage this knowledge. We make no assumption about a hierarchy between the databases to be integrated. The paper is organised as follows: in section 2, we examine the origins of the differences and the specification-based approach to interpret them. We then propose an interpretation process and the architecture of the system to implement it in section 3. The feasibility of the approach is demonstrated with a particular application in section 4. We conclude our study in section 5.
2 Specifications for Interpreting Differences
2.1 Origin of Differences Between Representations Geographical databases are described by means of specifications. These documents describe precisely the contents of a database, i.e. the meaning of each part of the data schema, which objects of the real world are captured, and how they are represented (figure 1).
Fig. 1. Specifications govern the representation of geographical phenomena in databases: from the same view of the world, specification 1 governs data capture into DB1 (schema 1) and specification 2 governs data capture into DB2 (schema 2)
The differences between specifications are responsible for the majority of the differences between representations. These differences are completely normal and illustrate the diversity of points of view on the world. For example, a traffic circle may be represented by a dot in one database, or by a detailed surface in another. However, not all differences are justified. The data capture process is not free of errors, and differences can occur between what is supposed to be in the databases and what is actually in the databases. Other differences are due to differing update states between the databases. These differences are problematic because they can lead to inconsistent representations in the multi-representation system, and for that reason they must be detected and managed in a unification process. More formally, we define hereafter the concepts of equivalence, inconsistency and update between representations. Let O be the set of objects from a spatial database DB1 and O' the set of objects from a spatial database DB2. Let us consider a matching pair of the form (M,M'), where M is a subset of O and M' a subset of O'. Definition 1 (equivalence). Representations of a matching pair (M,M') are said to be equivalent if these representations can model a world such that, at the same
time, M and M' respect their specifications and correspond to the same entity of the real world. Definition 2 (update). Representations of a matching pair (M,M') are said to be of different periods if these representations can model a world such that M and M' respect their specifications and correspond to the same entity of the real world, but at different times. Definition 3 (inconsistency). Representations of a matching pair (M,M') are said to be inconsistent if they are neither an update nor an equivalence. Thus either M or M' does not respect its specifications (error in the databases), or M and M' do not correspond to the same entity of the real world (matching error).
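Operationally, the three definitions induce a simple decision order: test equivalence under both specifications, then test whether the pair reflects the same entity at different times, and otherwise declare an inconsistency. The sketch below encodes only this order; the two predicates stand in for the rule-based evaluation described in the following sections, and all names are illustrative.

import java.util.function.BiPredicate;

/** Sketch: classifying a matching pair (M, M') following Definitions 1-3. */
class PairClassifier {
    enum Interpretation { EQUIVALENCE, UPDATE, INCONSISTENCY }

    static <A, B> Interpretation classify(A m, B mPrime,
                                          BiPredicate<A, B> equivalentUnderSpecs,
                                          BiPredicate<A, B> sameEntityDifferentTimes) {
        if (equivalentUnderSpecs.test(m, mPrime))     return Interpretation.EQUIVALENCE;
        if (sameEntityDifferentTimes.test(m, mPrime)) return Interpretation.UPDATE;
        return Interpretation.INCONSISTENCY;          // spec violation or matching error
    }

    public static void main(String[] args) {
        // Toy example: two "representations" are strings; equivalence = equal ignoring case.
        Interpretation i = classify("Roundabout", "roundabout",
                (a, b) -> a.equalsIgnoreCase(b), (a, b) -> false);
        System.out.println(i);   // EQUIVALENCE
    }
}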
The purpose of our work is to define a process to automatically detect and interpret differences between databases. This process, embedded in a decision support system, aims at guiding the management of differences during a unification process. 2.2 Knowledge Acquisition for the Interpretation of the Differences The key idea of this approach is to make explicit, in an expert-system, the knowledge necessary to interpret the differences. As explained above, a great deal of this knowledge comes from the specifications. Nevertheless, it is rather difficult to draw knowledge from the specifications and to represent it. The documents are usually rich but voluminous, relatively informal, ambiguous, and not always organised in the same way. Moreover, part of the necessary knowledge comes from common geographical knowledge (for example: traffic circles are more often retained than discarded), and experts are rarely able to supply an explicit description of the knowledge they use in their reasoning. We are thus faced with the well-known problem of the “knowledge acquisition bottleneck”. In our process, we try to solve it in three different ways. The first technique is to split the reasoning involved in the problem solving into several steps (section 3). This is the approach of second-generation expert-systems [David et al 1993]. The control over the inferences that need to be drawn is considered as a kind of knowledge in itself, and is explicitly introduced into the expert-system. The second technique is to develop rules by hand with the help of a knowledge acquisition process. We believe that such a process should rely on the definition of a formal model of specifications [Mustière et al 2003]. In the example developed in section 4, some of the rules managed by the expert-system have been introduced by hand, after formalising the actual specifications by means of a specific model. This is still ongoing research and will not be detailed in this paper.
The last technique is the use of supervised machine learning techniques [Mitchell 1997]. These techniques are one of the solutions developed in the Artificial Intelligence field. Their aim is to automatically build some rules from a set of examples given by an expert. These rules can then be used to classify new examples introduced into the system. Such techniques have already been used to acquire knowledge in the geographical domain [Weibel et al 1995, Sester 2000, Mustière et al 2000, Sheeren 2003].
3 The Interpretation Process
3.1 Description of the Steps In this section, we describe the interpretation process we have defined. It is decomposed into several steps, which are illustrated in figure 2.
Fig. 2. From individual databases to the interpretation of differences: starting from one correspondence between DB1 and DB2, the process chains specifications analysis, enrichment of the datasets, intra-database control, spatial data matching, inter-database control and global evaluation, guided by rules derived from the specifications and by machine learning
The process starts with one correspondence between classes of the schemas of the two databases. We presume that matching at the schema level has already been carried out. For instance, we know that the road class in DB1 tallies with the road and track classes in DB2. The task of specifications analysis is then the key step: the specifications are analysed in order to determine several rule bases that will be used to guide each of the ensuing steps. These rules primarily describe what exactly the databases contain, what differences are likely to appear, and under which conditions. This step is performed through the analysis of documents, or through machine learning techniques. The next step concerns the enrichment of each dataset. This is compulsory before the actual integration of the databases [Devogele et al 1998]. The purpose is to express the heterogeneous datasets in a more homogeneous way.
For this step, the particularity of geographical databases arises from the fact that they contain a lot of implicit information on spatial relations through the geometry of objects. Extracting this information requires specific analysis procedures. A preliminary step of control is then planned: the intra-database control. During this step, part of the specifications is checked so as to detect some internal errors and to determine how well the data instances globally respect the specifications. This will be useful for identifying the origin of each difference and also for detecting matching errors. Once the data of both databases have been independently controlled, they are matched. Matching relationships between the datasets are computed through geometric and topological data matching. We end up with a set of matching pairs, each one characterised by a degree of confidence. The next step consists in the comparison of the representations of the homologous objects. This is the inter-database control. This comparison leads to the evaluation of the conformity of the differences and particularly implies the use of specifications and expert knowledge. The results of the first control are also exploited. At the end, the differences existing for each matching pair are expressed in terms of equivalence, inconsistency or update. After the automatic interpretation of all the differences by means of the expert-system, a global evaluation is supplied: the number of equivalences, the number of errors and their seriousness, and the number of updates. 3.2 The Architecture of the System An illustration of the structure of the system is given in figure 3. It is composed of two main modules: the experimental Oxygene GIS and the Jess expert-system. Oxygene is a platform developed at the COGIT laboratory [Badard & Braun 2003]. Spatial data is stored in the relational Oracle DBMS, and the manipulation of data is performed with Java code in the object-oriented paradigm. The mapping between the relational tables and the Java classes is done by the OJB library. A Java API makes the link between this platform and the second module, the Jess rule-based system, an open-source environment which can be tightly coupled to code written in the Java language [Jess 2003]. The rules used by Jess originate directly from the specifications, or have been gathered with the learning tools.
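A minimal sketch of this kind of coupling, assuming the standard Jess Java API (jess.Rete) and using an invented rule-file name and fact template: a Java component loads the rule base derived from the specifications, asserts facts describing a matching pair, and fires the rules.

import jess.JessException;
import jess.Rete;

/** Sketch: driving a Jess rule base from Java, in the spirit of the
 *  Oxygene-Jess coupling described above. File and fact names are assumed. */
public class InterpretationDriver {
    public static void main(String[] args) {
        try {
            Rete engine = new Rete();
            engine.batch("interpretation_rules.clp");   // rules from specifications/learning
            engine.reset();
            // Assert a fact describing one matching pair (template assumed to be
            // declared in the rule file).
            engine.assertString(
                "(matching-pair (id 42) (georoute-type dot) (bdcarto-type small-traffic-circle))");
            int fired = engine.run();                   // fire the interpretation rules
            System.out.println(fired + " rules fired");
        } catch (JessException e) {
            e.printStackTrace();
        }
    }
}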
Fig. 3. Architecture of the system: the Jess expert-system (facts, rules and inference engine) is coupled through a Java API to the Oxygene platform, whose application and database schemas are implemented in Java and mapped (via OJB) to spatial data stored in the Oracle DBMS; the rules derive from the ground knowledge (specifications and external knowledge), partly through learning algorithms
4 Differences Between Representations of Traffic Circles: a Case Study In this section, we study the differences existing between the traffic circles of two databases from the IGN (French National Mapping Agency): BDCarto and Georoute (figure 4). BDCarto is a geographical database meant in particular to produce maps at scales ranging from 1:100,000 to 1:250,000. Georoute is a database with a resolution of 1 m dedicated to traffic applications. The representation of the traffic circles can differ from one database to the other because the specifications are different. Our question is thus as follows: which differences are “normal”, i.e. which representations are equivalent, and which differences are “abnormal”, i.e. which representations are inconsistent? We detail the implementation of the process below.
Fig. 4. The road theme of the two geographical databases examined (Georoute and BDCarto)
Specifications analysis. The specifications of BDCarto and Georoute explicitly describe the representation of traffic circles. For both databases the representation can be simplified (corresponding to a node) or detailed (corresponding to connected edges and nodes). The modelling depends
on the diameter of the object in the real world, but also on the presence of a central reservation (figure 5).
Fig. 5. Some specifications concerning the traffic circles of BDCarto and Georoute
Specifications introduce differences between the datasets. They appear at both the geometry and attribute levels. The description also reveals a difficulty that has already been brought up: the gap between the data mentioned in the specifications and the data actually stored in the databases. The traffic circles are implicit objects made of several edges and nodes, but there is no corresponding class in the database. In the same way, the diameter of the objects and the direction of the cycle do not exist as attributes in the databases. It is thus necessary to extract this information in order to check the specifications and enable the comparison between the data. This enrichment is the subject of the next step. Enrichment of the data. In the unification context, the enrichment of the databases concerns both the geometrical data and the schemas. In figure 6, we illustrate the new classes and relations created at the schema level. These classes can constitute federative concepts to put the two schemas of the databases in correspondence during the phase of creating the unified schema [Gesbert 2002]. At the data level, it is also necessary to extract implicit information and to instantiate the new classes and relations created. Several operations have been carried out to achieve this, for the two databases (figure 7). First, we created the simple traffic circles and their relation with the road nodes. The simple traffic circles are road nodes for which the attribute nature takes the value ‘traffic circle’. Concerning the complex traffic circles, the construction of a topological graph was first necessary. Faces were created and all the topological relations between edges, nodes and faces were computed. We then filtered the faces in order to retain only those corresponding to a traffic circle. Several criteria were taken into account: the direction of the cycle, the number of nodes of each cycle and the value of Miller’s circularity index. These criteria were embedded in rules and combined with the decision support system.
In doing so, only the faces corresponding to a traffic circle were retained.
Fig. 6. Extract of the Georoute schema: the new classes Traffic Circle, Simple Traffic Circle and Complex Traffic Circle and their relations to Road Section and Road Node (new classes and relations in dashed lines)
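As an illustration of the face-filtering criteria just mentioned, Miller's circularity index compares a face to a circle of the same perimeter (4πA/P², equal to 1 for a perfect circle). A minimal sketch of such a filter follows; the threshold, the node-count value and the Face representation are assumptions, not the values actually used in the study.

import java.util.ArrayList;
import java.util.List;

/** Sketch: filtering topological faces with Miller's circularity index
 *  (4 * pi * area / perimeter^2; 1.0 for a perfect circle). */
class FaceFilterSketch {
    static class Face {
        final double area, perimeter;
        final int nodeCount;
        Face(double area, double perimeter, int nodeCount) {
            this.area = area; this.perimeter = perimeter; this.nodeCount = nodeCount;
        }
        double millerIndex() { return 4 * Math.PI * area / (perimeter * perimeter); }
    }

    /** Keep only faces that look like traffic circles (thresholds are illustrative). */
    static List<Face> filterTrafficCircles(List<Face> faces, double minMiller, int minNodes) {
        List<Face> kept = new ArrayList<>();
        for (Face f : faces) {
            if (f.millerIndex() >= minMiller && f.nodeCount >= minNodes) kept.add(f);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Face> faces = List.of(new Face(700, 95, 8),    // roughly circular
                                   new Face(700, 400, 4));  // elongated block
        System.out.println(filterTrafficCircles(faces, 0.9, 4).size());  // 1
    }
}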
The enrichment phase is thus performed to extract the implicit information required for the specifications control, but also to bring the structures of the data and of the schemas closer to each other.
Fig. 7. The creation of complex traffic circles (extract of Georoute): creation of the topological graph, characterisation of each face and filtering of the faces using Jess rules
Intra-database Control. Two kinds of traffic circles were created in the previous step: simple traffic circles (nodes) and complex traffic circles (connected edges and nodes). At this level, the representations of the objects were checked to detect some internal errors. The control was automated thanks to several rules activated by the expert-system. These rules were developed and introduced by hand, for example:
(defrule control_diameter_georoute
  ; assumes a ComplexTrafficCircle template with diameterLength and diameterConformity slots
  ?tc <- (ComplexTrafficCircle (diameterLength ?d&:(> ?d 30)))
  =>
  (modify ?tc (diameterConformity "conform")))
Only part of the representations was controlled at that stage for each database: the complex traffic circles and the information associated with them (the diameter, the number of nodes, …). The node representation was checked later, during the inter-database control, because of the lack of information at that point.
Some errors were identified during this process and the results were stored in specific classes for each database. Spatial Data Matching. The matching tools used in this process are those proposed by [Devogele 1997]. They are founded on the use of both geometric and topological criteria. They have been enriched by using the polygonal objects created during the previous steps, in order to improve the reliability of the algorithms. A degree of confidence was systematically given for each matching pair, according to the cardinality of the link, the dimension of the objects constituting the link and the matching criteria used. Finally, we retained 89% of the matching pairs, for a total of 690 correspondences computed.
Fig. 8. Example of homologous traffic circles and roads matched between BDCarto and Georoute
Inter-database Control. Some internal errors were already detected during the first step of control, but the representations of the two databases had not yet been compared. This comparison was the purpose of this step. It led to the classification of each matching pair in terms of equivalence and inconsistency (no updates were found for these datasets). The introduction of rules by hand to compare the representations was first considered, but because of the numerous possible cases and the complexity of some rules, we decided to use supervised machine learning. An example of a rule computed by the C5.0 algorithm [Quinlan 1993] is presented below. It enables the detection of an inconsistency:
If the type of the traffic circle in Georoute = ‘dot’
and the node type of the traffic circle in BDCarto = ‘small traffic circle’
then the representations are inconsistent
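For illustration only, such a learned rule amounts to a simple conditional over the two matched representations; in Java it might read as follows (class and method names are invented).

/** Sketch: the learned rule expressed as a plain conditional over the two
 *  representations of a matched traffic circle. */
class LearnedRuleSketch {
    static boolean inconsistent(String georouteType, String bdcartoNodeType) {
        return "dot".equals(georouteType)
            && "small traffic circle".equals(bdcartoNodeType);
    }

    public static void main(String[] args) {
        System.out.println(inconsistent("dot", "small traffic circle"));       // true
        System.out.println(inconsistent("detailed", "small traffic circle"));  // false
    }
}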
A set of rules was introduced in the expert-system and, finally, the set of matching pairs was interpreted automatically. We computed 67% equivalences and 33% inconsistencies. Various types of inconsistencies were highlighted: modelling errors, attribute errors and geometrical errors (the variation between the diameters of the detailed objects was sometimes too high). We noted that the errors were more frequent in BDCarto.
5 Conclusion and Future Work This paper has presented a new approach to dealing with the differences in representation during the data integration phase of unifying geographical databases. The key idea of the approach is to use the specifications of each database to interpret the origin of the differences: equivalence, inconsistency or update. The knowledge is embedded in rules and handled by an expert-system. The rules are introduced in two ways: by hand and through supervised machine learning techniques. This approach opens up many new prospects. It will be possible to improve the quality and up-to-dateness of each analysed database. The specifications could also be enriched and described in a more formal way. The use of the specifications and representations of one database can indeed help make the capture constraints of the other database more precise. Finally, we think that the study of the correspondences between the data could help find the mapping between the elements at the schema level. Little research has been done in that direction.
References Badard T. and Braun A. 2003. OXYGENE : an open framework for the deployment of geographic web services, In Proceedings of the International Cartographic Conference, Durban, South Africa, pp. 994-1003. Bédard Y., Bernier E. et Devillers R. 2002. La métastructure vuel et la gestion des représentations multiples. In Généralisation et représentation multiple, A. Ruas (ed.), chapitre 8. Branki T. and Defude B. 1998. Data and Metadata: two-dimensional integration of heterogeneous spatial databases, In Proceedings of the 8th International Symposium on Spatial Data Handling, Vancouver, Canada, pp. 172-179. David J.-M., Krivine J.-P. and Simmons R. (eds.) 1993. Second Generation Expert Systems, Springer Verlag. Devogele T. 1997. Processus d’intégration et d’appariement de bases de données Géographiques. Application à une base de données routières multi-échelles, PhD Thesis, University of Versailles, 205 p. Devogele T., Parent C. and Spaccapietra S. 1998. On spatial database integration, International Journal of Geographical Information Science, 12(4), pp.335352. Egenhofer M.J., Clementini E. and Di Felice P. 1994. Evaluating inconsistencies among multiple representations, In Proceedings of the Sixth International Symposium on Spatial Data Handling, Edinburgh, Scotland, pp. 901-920.
El-Geresy B.A. and Abdelmoty A.I. 1998. A Qualitative Approach to Integration in Spatial Databases, In Proceedings of the 9th International Conference on Database and Expert Systems Applications, LNCS n°1460, pp. 280-289. Fonseca F.T., Egenhofer M., Agouris P. and Câmara G. 2002. Using ontologies for integrated Geographic Information Systems, Transactions in GIS, 6(3). Gesbert N. 2002. Recherche de concepts fédérateurs dans les bases de données géographiques, Actes des 6ème Journées Cassini, École Navale, pp. 365-368. Jess 2003. The Jess Expert-System, http://herzberg.ca.sandia.gov/jess/ Kidner D.B. and Jones C. B. 1994. A Deductive Object-Oriented GIS for Handling Multiple Representations, In Proceedings of the 6th International Symposium on Spatial Data Handling, Edinburgh, Scotland, pp. 882-900. Mitchell T.M. 1997. Machine Learning. McGraw-Hill Int. Editions, Singapour. Mustière S., Zucker J.-D. and Saitta L. 2000. An Abstraction-Based Machine Learning Approach to Cartographic Generalisation, In Proceedings of the 9th International Symposium on Spatial Data Handling, Beijing, pp. 50-63. Mustière S., Gesbert N. and Sheeren D. 2003. A formal model for the specifications of geographic databases, In Proceedings of the 2nd Workshop on Semantic Processing of Spatial Data (GeoPro’2003), Mexico City, pp. 152-159. Paiva J.A. 1998. Topological equivalence and similarity in multi-representation geographic databases, PhD Thesis, University of Maine, 188 p. Parent C. and Spaccapietra S. 2000. Database Integration: the Key to Data Interoperability. In Advances in Object-Oriented Data Modeling, Papazoglou M., Spaccapietra S. and Tari Z. (eds). The MIT Press. Quinlan J.R. 1993. C4.5 : Programs for machine learning, Morgan Kaufmann. Rahm E. and Bernstein P.A. 2001. A survey of approaches to automatic schema matching, Very Large Database Journal, 10, pp. 334-350. Sester M. 2000. Knowledge Acquisition for the Automatic Interpretation of Spatial Data, International Journal of Geographical Information Science, 14(1), pp. 1-24. Sheeren D. 2003. Spatial databases integration : interpretation of multiple representations by using machine learning techniques, In Proceedings of the International Cartographic Conference, Durban, South Africa, pp. 235-245. Sheth A. and Larson J. 1990. Federated database systems for managing distributed, heterogeneous and autonomous databases, ACM Computing Surveys, 22(3), pp. 183-236. Vangenot C., Parent C. and Spaccapietra S. 2002. Modeling and manipulating multiple representations of spatial data, In Proceedings of the International Symposium on Spatial Data Handling, Ottawa, Canada, pp. 81-93. Walter V. and Fritsch D. 1999. Matching Spatial Data Sets: a Statistical Approach, International Journal of Geographical Information Science, 13(5), pp. 445473. Weibel R., Keller S. et Reichenbacher T. 1995. Overcoming the Knowledge Acquisition Bottleneck in Map Generalization : the Role of Interactive Systems and Computational Intelligence, In Proceedings of the 2nd International Conference on Spatial Information Theory, pp. 139-156.
Integrating structured descriptions of processes in geographical metadata Bénédicte Bucher Laboratoire COGIT, Institut Géographique National, 2 avenue Pasteur, 94 165 St Mandé Cedex, France,
[email protected] Abstract. This paper elaborates on a category of information, the description of processes, and its relevance in metadata about geographic information. Metadata bases about processes are needed because processes are themselves resources to manage. Besides, specific processes participate in the description of other types of resources, such as data sets. Still, current metadata models lack structured containers for this type of information. We propose a model for building structured descriptions of processes to be integrated in metadata bases about processes themselves or about data sets. Keywords: metadata, process, task
1 Introduction During the past five years, significant progress has been made in modelling metadata to enhance the management of geographical data. The release of the ISO19115 international standard was an important milestone. Metadata models initially aimed at resolving data transfer issues; their application is now to support data exchange and cataloguing. The very scope of these models has widened from describing geographical datasets to describing aggregates of data sets, i.e. specifications of representation, and services. This paper elaborates on the necessity of enriching geographic information metadata with structured descriptions of processes. The first section of this paper details the need for this category of knowledge in metadata about geographical information. The next section is a brief state of the art of existing models to describe processes, including
a prior work of the author in this domain. The following section describes our current approach, which focuses on describing data production and management tasks to enrich metadata bases within IGN.
2 The need for structured descriptions of processes in metadata In this paper, the processes studied are manipulations of geographical data to meet an objective. The need for structured descriptions of processes in metadata is twofold. 2.1 Processes management metadata Processes themselves are resources to manage, so a metadata model is needed to describe them. Managing a process usually means tracking its execution or assisting operators in performing it. For instance, the process of acquiring metadata about a data set could be designed as a process distributed among the several people participating in the production of the data set. In this case, metadata associated with the data production process would take the following forms: flags warning that a metadata acquisition event must be performed when certain events occur in the data production process, and guidelines for performing these metadata acquisition events. Managing a process can also mean cataloguing it. The famous three stages of data cataloguing proposed by the Global Spatial Data Infrastructure Technical Committee (GSDI 00), discovery, exploration and exploitation, can be translated as follows for process cataloguing: discovery: What processes exist? exploration: Which processes are close to the process I need? exploitation: Can I use this process? Can I adapt it to my context? Or can I define a new process by reusing existing process patterns? Processes that typically need cataloguing are repair processes. Repairing a given error consists in performing an existing repair process when the error is already referenced. When it is not, it may consist in designing a new repair process, possibly based on catalogued diagnosis and repair processes.
2.2 Data sets description metadata Modelling processes in metadata is also needed to manage other types of resources in whose description processes appear. Indeed, descriptions of processes are already present in the ISO19115 model, where the lineage entity is composed of sources and processes. This specific category of metadata, lineage, is essential to IGN sales engineers. They use it as a major source to assess the content and quality of a data product or data set. So far they do not find this information in metadata bases, but rather obtain it by contacting production engineers. The ISO19115 structure for the process elements of the lineage entity is free text. Obviously, this information calls for a richer structure. 2.3 A need for a model of process types and instances To conclude this section, let us summarize what structures are needed to describe processes in metadata, as illustrated in Figure 1. A model is needed to describe process types, like data matching, and process instances, like a consistency check for two specific data sets. These structures are needed to build process management metadata. Descriptions of specific types of processes constitute a metadata base about these types of processes. It could contain information like the name of the process, its generic signature, and its decomposition. Metadata bases about specific instances of a process can then be obtained by specifying elements in the description of the corresponding type of process, like the name of the operator who was responsible for performing this process instance and some decisions he made during the process. These structures are also needed to document the lineage metadata in data descriptions. A type of process can be the production of a type of data sets. For instance, the Scan25® product in IGN is a data set series associated with a type of production process. The description of this type of process is lineage metadata for this specific aggregate. The lineage metadata for a data set belonging to this aggregate is then the description of the specific process instance whose output was this data set.
Fig. 1. Integration of structured descriptions of processes in metadata about geographic information: a metadata model for types of processes structures descriptions such as DataSet Matching; all such classes form a metadata base about types of processes, while one class also serves as the metadata model for particular processes of this type and as lineage metadata for a specific aggregate; all instances of such a class form a metadata base about particular processes of this type, and one instance serves as the lineage metadata for a specific data set
3 Existing models to describe processes Models to describe processes as such are to be found in the domain of business management. They integrate components like triggering events, decomposition and agents (Tozer 99). In the context of the Web, there exist models to describe Web services, which realise specific categories of process. In this area, the most mature approach is that of OWL-S (OWL-S 02). OWL is a W3C effort to propose languages to build ontologies on the Web, and OWL-S is specifically dedicated to Web services. It proposes the following RDF-like model to describe services: a resource provides a service; a service presents a service profile, i.e. what the service does; a service is described by a service model, i.e. how it works; and a service supports a service grounding, i.e. how to access it. In the area of geographic information, there exist classifications of geographical Web services, like that of the OGC (OGC 02). A promising work is that of (Lemmens et al. 03), who extend the OGC Web service taxonomy schema using DAML-S (the former name of OWL-S).
All these models support the description of process instances or of very specific process types, like a Web service. But they do not provide much description of more generic process types, apart from classifications. In the COGIT laboratory, the TAGE model has been proposed to enrich ISO19115 metadata with how-to-use knowledge (Bucher 03). This model describes application patterns like locating an entity or matching data sets. These application patterns can be specified to obtain a description of a geographic application that should yield the result the user expects. TAGE relies on the classical concepts of tasks and roles. A task is a family of problems and their solutions. A task is a process type, generic or specific. It can be specified into a more specific process type or into a process instance. A process instance is a realisation of a task. The inputs and outputs of tasks are described through roles. A role is a variable with a name and a set of possible values. For instance, the output of the task "to locate an entity" is a role called "location". The set of possible values for this role includes: a geometry in a spatial reference system, coordinates in a linear reference system, a place defined by a relationship with a geographic feature, a symbol on a map, or route instructions. The role itself is task dependent, whereas the elements describing its values are not task dependent. These elements are called "domain elements". TAGE has been embedded in the TAGINE application to assist users in specifying a task whose realisation should meet their need. Tasks and roles are relevant concepts in the description of generic processes. The TAGE model supports the description of tasks whose decomposition varies depending on how they are specified, which is not supported by other decomposition models. Moreover, the use of roles and of specification rules embedded within the task allows for the consistent specification of the output depending on the specification of the input. This is far more precise than describing the signature of a process. Indeed, the effect of a process, as well as other properties of the process, may depend on the input data. The limitations of TAGE are the complexity of the model and the consequent difficulty of acquiring tasks to feed into the TAGINE database.
4 TAGE2: a simplified TAGE model
Our current approach to describing processes consists in simplifying the TAGE model to obtain a model and a database that are more readable and easier to maintain.
4.1 A shift of objective
The corresponding application does not have the same objective as TAGINE. Our ambitions have shifted from enhancing external users' access to geographical information to enhancing internal users' access to geographical information. This takes place in a new context at IGN where people are more aware of the importance of metadata. This context is described in (Bucher et al. 03). Technically, TAGINE aimed at supporting the cooperative specification of a task. The new application only aims at: browsing tasks, editing a task and interactively specifying it, storing tasks, and storing realisations of tasks.
4.2 The representation of tasks in TAGE2
In the initial TAGE model, tasks were to be modelled as instances of one class, similar to the ISO FC_FeatureType structure. In TAGE2, a task can be modelled either as a class extending the class "Task" or as an instance of the class "TaskType", as shown in Figure 2. The same holds for domain elements with the classes "DomainElement" and "ElementType". We use this double representation because the Java language does not support metaclasses. Still, metaclass-like structures are needed, as explained in section 2.3: a specific task should be both an instance of another class and a class itself.
Fig. 2. Double view on tasks: a specific task can be described as a class extending the class Task or as an instance of the class TaskType.
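A minimal sketch of this double view (in Python for brevity; the prototype itself is written in Java, and the role names below are invented):

```python
class Task:
    """A specific task can be modelled as a subclass of Task; its instances then
    describe particular process realisations (e.g. lineage metadata)."""
    roles = {}

class TaskType:
    """Alternatively, a specific task can be modelled as an instance of TaskType,
    which is convenient for building a metadata base about many types of processes."""
    def __init__(self, name, roles):
        self.name = name
        self.roles = roles

# View 1: the task as a class.
class DataSetMatching(Task):
    roles = {"reference data set": "DataSet",
             "compared data set": "DataSet",
             "correspondences": "FeatureList"}

dsm1 = DataSetMatching()          # one particular matching process

# View 2: the same task as an instance.
tt_dsm = TaskType("DataSetMatching", DataSetMatching.roles)
```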
Describing a task as an instance of the class TaskType, or of a subclass of it, is useful to describe numerous tasks, typically to build a metadata database about processes. Describing a task as a class is useful to focus on the model of the specific task. Instances of this task will be descriptions of process instances. These needs were listed in Fig. 1. The use of the TAGE2 structures to meet these needs is summarized in Fig. 3.
Fig. 3. The use of TAGE2 structures to meet the needs expressed in the first section.
4.3 A prototype
To build a prototype of TAGINE2, we have chosen to describe a specific data management task: data set matching. This task is used by the data producer to assess the quality of a data set, to update a data set or to build a multi-representation data set. The domain elements for this task have been modelled according to the implementation of ISO standards in the laboratory platform OXYGENE (Badard and Braun 03). We introduced the concept of a data set and related it to features implementing the ISO FT_Feature interface. A data set is also associated with units of storage, which can be XML files, Oracle tables or directories. It has a description that is an ISO19115 MD_DataDescription entity. These elements are illustrated in Figure 4. Other domain elements relevant to the data set matching task are feature catalogues, which are referred to in MD_Description entities.
Fig. 4. Domain elements representing a data set.
The overall mechanism of the task is to find minima of the distance between two representations of the same space, at the model level and at the data level. Its decomposition is the following. The first subtask is to match schemas. This consists in matching groups of elements in both schemas that represent the same category of objects in reality. In this task, it is also important to identify relationships and attributes that do not vary with the representation and to mark those that are identifiers. This task should rely on stored correspondences between feature types in feature catalogues and on stored markings of non-varying attributes and relationships in each feature catalogue. In the future, these elements will be integrated in the domain. The next subtask is to rank the schema correspondences according to the expected ease and quality of data matching for the corresponding features. If two matched classes bear a non-varying identifier, then the features corresponding to these classes should be matched first. If two matched classes bear numerous non-varying relationships, the corresponding features should be matched early; indeed, these first results will then be used to match the features related to them by these relationships. There are numerous rules like these that should be applied to build a tree of correspondences. The last subtask is to match features. It is itself decomposed into three tasks: to select a feature in the reference data set (according to specific criteria),
to restrict the compared data set to features that may be matched to the selection and to use geometric algorithms to assess the correspondence.
Fig. 5. Browsing of the DataSetMatching task in the prototype. The interface is in French but some elements (the tasks) have been translated into English for the purpose of this paper.
Figure 5 shows the task browsing supported by the prototype. A task is described by its roles, a brief description of its model, and a summarized description of its decomposition. The decomposition can be browsed in more detail in another window. The "detail" buttons next to each role open windows describing the possible values for the role.
Perspectives
On-going work aims at providing a simple interface to edit tasks as well as domain elements. The next step will consist in acquiring tasks and realisations of tasks. A first method to acquire tasks consists in asking people to describe, through the TAGINE2 interface, a task they are familiar with, either by building a new instance of TaskType that will possibly reuse other instances, or by building a new subclass of Task that will possibly reuse other subclasses. Another method we plan to experiment with is the analysis of a set of log files corresponding to the same task, in order to extract common patterns. Acquiring realisations of a specific task will rely on the description of this task as a specific class. We will focus on supporting this acquisition in a distributed context, i.e. when it is performed by several people. Again, this may be done through these people using the TAGINE2 interface or through the analysis of their log files. We intend to focus on the acquisition of data production tasks and on the integration of these descriptions in the ISO19115 lineage entity.
Acknowledgements
The author wishes to thank Sandrine Balley, Sébastien Mustière and Arnaud Braun from the COGIT laboratory for their help in modelling the Data Set Matching task, its roles and decomposition, and the corresponding domain elements.
References
(Badard and Braun 03) Thierry Badard, Arnaud Braun, OXYGENE: An Interoperable Platform Enabling the Deployment of Geographic Web Services, GISRUK Conference, London, 2003
(Bucher 03) Bénédicte Bucher, Translating user needs for geographic information into metadata queries, 6th AGILE Conference, Lyon, 2003, pp. 567-576
(Bucher et al. 03) Bénédicte Bucher, Didier Richard, Guy Flament, A metadata profile for a National Mapping Agency Enterprise Portal - PARTAGE, GISRUK Conference, London, 2003
(GSDI 00) GSDI Technical Working Group, Developing Spatial Data Infrastructures: the SDI Cookbook, v1.0, Douglas Nebert (Ed), 2000
(Lemmens et al. 03) Rob Lemmens, Marian de Vries, Trias Aditya, Semantic extension of Geo Web service descriptions with ontology languages, 6th AGILE Conference, Lyon, 2003, pp. 595-600
(OGC 02) The OpenGIS Consortium, OpenGIS® OWS 1.2, 2002
(OWL-S 03) The OWL Services Coalition, OWL-S 1.0, Semantic Markup for Web Services, 2003
(Tozer 99) Guy Tozer, Metadata management for information control and business success, Artech House, Boston, 1999
Toward Comparing Maps as Spatial Processes
Ferko Csillag1* and Barry Boots2
1 Department of Geography, University of Toronto, 3359 Mississauga Rd, Mississauga, ON, L5L1C6, Canada
[email protected]
2 Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, ON, N2L 3C5, Canada
[email protected]
Abstract
We are concerned with comparing two or more categorical maps. This type of task frequently occurs in remote sensing, in geographical information analysis and in landscape ecology, but it is also an emerging topic in medical image analysis. Existing approaches are mostly pattern-based and focus on composition, with little or no consideration of configuration. Based on a web-survey and a workshop, we identified some key strategies to handle local and hierarchical comparisons and developed algorithms which include significance tests. We attempt to fully integrate map comparison in a process-based inferential framework, where the critical questions are: (1) Could the observed differences have arisen purely by chance? and/or (2) Could the observed maps have been generated by the same process?
Keywords: stochastic processes, spatial pattern, inference, local statistics, hierarchical decomposition
1 Introduction
Advances in digital data collection, processing and imaging technologies result in enormous databases, for example, in the environmental and medical sciences. These data sets can be used in many ways, one of which is to determine categories that reflect some sort of human summary of the data (e.g., assigning land cover classes to pixels of a satellite image, identifying tumors on a medical image, assigning habitat types to a simulation model output). With recent demands on the usage of such databases, the comparison of spatial data sets is becoming a more and more frequent task in various settings.
The relevant traditions in spatial data analysis are focused on (1) accuracy assessment, which is primarily concerned with the coincidence (or confusion) matrix accounting for compositional differences at each data site, and is most frequently used to characterize the match (or mismatch) between data sets with identical labels (Congalton 1994, Stehman 1997, Stehman 1999, Smith et al. 2002, Foody 2002); (2) change detection, where differences (either before or after labeling/classification) are interpreted against time (Richards and Xiuping 1999, Metternicht 1999, Rogerson 2002); (3) model comparison, where (predicted or simulated) model outputs are compared to either observed landscapes and/or to other model outputs (White et al. 1997, Hargrove et al. 2002); (4) landscape indices, which usually summarize some characteristics of spatial patterns by one (or a few) numbers and thus facilitate comparisons (Trani and Giles 1999, Turner et al. 2001, Rogan et al. 2002). There are recent efforts to introduce fuzzy approaches that allow for a level of uncertainty in categories or locations or both (Power et al. 2001, Hagen 2003). In general, however, these traditions have had much less impact on each other than would have been expected based on their strengths and weaknesses (Fortin et al. 2003). In particular, these approaches are, in large part, inconsistent with our understanding of the relationships between processes and patterns. Therefore, we refer to them as pattern-based approaches since they concentrate on the patterns shown on the maps without explicitly considering how the patterns were generated (e.g., it is a frequent assumption that data sites are spatially independent, or that the data-generating process is stationary). We suggest that the fundamental question in map comparison should be: "Could the observed differences have arisen purely by chance?". Considering the pattern-based approaches, it is virtually certain that we will encounter some differences between the maps, in the sense that not every location will have identical values on the maps. The methods listed above do not provide any guidelines for users to evaluate whether the differences are significant (i.e., "surprising"). Such statistical inference would require us to evaluate the likelihood of each map (or of certain values on a map), which in turn would allow us to answer the question: "Could the observed maps have been generated by the same process?". We report here the initial steps toward incorporating map comparisons in a process-based inferential framework (Getis and Boots 1978). It is a challenging task because there is no (consensus on a) general strategy toward a theoretically and operationally feasible framework, in spite of the recognition of the need for the simultaneous use of compositional and configurational information and spatially variable (or adaptive) data description (Csillag and Boots 2003). Therefore, to assess the appropriateness of these two approaches in light of "what do users want?", we designed a web-based
test for map comparison and followed it up with a workshop. In the next section we report the major findings of these open consultations. We then outline a "top-down" (hierarchical) and a "bottom-up" (local) pattern-based map comparison procedure, both of which include an inferential component. This is followed by our preliminary results on process-based methods and some concluding remarks regarding the likely decisions related to these approaches and their advantages/disadvantages.
2 What You See Is What You Get?
Ultimately, the goal of map comparison is to assist users in decision-making. It is imperative, therefore, to account for users' perspectives (e.g., what is it that users see or what is it that users want?) even if they do not match some envisioned formal stochastic framework (D'Eon and Glenn 2000). Our primary goal in designing the test was to assess how users react to various aspects of differences in the data-generating process. We were also interested in whether these reactions varied by expertise and/or specific fields of expertise. For simplicity we simulated 64-by-64 binary (black-and-white) isotropic stationary landscapes with fixed (known) parameters for composition (proportion of B/W) and configuration (first-order neighbours). We used four simulation algorithms as pattern generators: (1) conditional autoregression, where the configuration parameter is spatial autocorrelation (Cressie 1993, p.407, Csillag et al. 2001); (2) fragmentation, where the configuration parameter is spatial clustering (Fahrig 1997, 1998); (3) inhibition (Upton and Fingleton 1985, pp.18-22), where the configuration parameter is the maximum number of neighbouring pixels of the same colour; (4) a pixel-level inhomogeneous Poisson process (Cliff and Ord 1981, p.89), where the configuration parameter is the probability that neighbouring pixels are positively correlated. We created 102 systematically chosen pairs from identical processes (36 with no difference between the parameters, 20 with a compositional difference, 20 with a configurational difference, 26 with a difference in both) and posted them on the World Wide Web on 11 April 2003 at http://eratos.erin.utoronto.ca/fcs/PRES/2MAPS/2maps_index.html
(Figure 1). The test consisted of 20 randomly selected pairs, about which the only question asked was: "Are these two maps different?"; comments could be entered in a text window. At the beginning of the test we asked users to provide their name and e-mail address and some information about their experience, and to rate their expertise (0-10), and after completion
we asked them to comment on the test as well as to rate the difficulty of answering the question.
Fig. 1. Four sample pairs from the web-based map comparison test. The process names are followed by the composition-configuration parameter pairs and the percentage of respondents who found them different. Top-left: fragmentation [40-25, 40-25], 57%; top-right: CAR [75-12, 65-00], 86%; bottom-left: CAR [55-12, 45-12], 91%; bottom-right: inhom [20-01, 20-02], 51%.
On 24 May 2003 we organized a workshop at the GEOIDE Annual Conference in Victoria, BC. There were fourteen participants, who repeated the test, listened to a brief presentation about the fundamental considerations and then interactively evaluated their findings. Here we report results based on the first 200 users. The average expertise was 5.75 (standard deviation 2.49), with a bimodal distribution. For the field of expertise we recorded "geography", "GIS", "remote sensing", "computing/statistics" and "landscape ecology" (anything else was grouped in "other"). Participants could list more than one field; "GIS" was mentioned by half of them. The average reported difficulty was 4.65 (standard deviation 2.67), resembling a uniform distribution. There was a weak (inverse linear) relationship between expertise and difficulty, no relationship between expertise and the number of correct answers (Figure 2), and a weak (linear) relationship between difficulty and the number of correct answers (not shown). In 74% of all cases respondents found the pairs different (Figure 3), and 68.1% of the answers were correct. The results are not significantly different when grouped by field of expertise (Figure 4).
Both "raw score" evaluation and the analysis of the commentaries strongly indicate that respondents were, in general, much more sensitive to composition than configuration. In fact, composition was almost always
Fig. 2. Relationship between "expertise" and "difficulty" (left) and "expertise" and "number of correct answers out of 20" (right) for the first 200 participants of the web-test. The respondents were grouped by expertise; the centre of the circle represents the group-average, the area of the circle represents the size of the group.
considered first (and foremost). Many respondents chose to interpret the pairs in a specific context (e.g., "I tried to imagine these as vegetation maps..."). Most frequently references were made to "clusters", "clumping", "density" and "scale". COMPOSITION
CONFIGURATION
same
different
same
54.7%
69.4%
different
80.8%
89.9%
Fig. 3. Percentage of respondents finding pairs different by compositional and configurational parameter differences. (The difference between the lowest and highest cells is statistically significant at the 0.1 level.)
If and when composition was found to be the same, two types of 'ad hoc' strategies seemed to dominate. One of these was "scanning" the landscape for differences in specific locations (e.g., "I started to count black cells from the corners", "The big black 'blob' in the center of the left image is missing on the right one"), and the other was "zooming" in and out over the entire landscape, revisiting sections at coarser/finer resolution (e.g., "I tried to find 'objects' in the images at coarser resolution", "They do not appear to be different unless one is interested in a sub region of the image"). Frequently, some combination of these strategies was employed (e.g., switching back and forth between "scanning" and "zooming"). There were several suggestions to include a scale for differences, but the term 'significant' was mentioned only nine times (in 200*21=4200 comments). Finally, it is
important to note that in more than half of the cases where the pair was generated (simulated) by an identical process (i.e., differences were purely due to chance) respondents found them different.
Fig. 4. Relationship between "expertise" and "difficulty" (left) and "expertise" and "number of correct answers out of 20" (right) for the first 200 participants of the web-test. Participants were grouped by self-identified field of expertise: LEC=landscape ecology, CST=computing/statistics, RSE=remote sensing, GIS=geographical information, GGR=geography, and OTH=other, and are represented by the group averages
During the workshop discussion, participants were introduced to the concepts of both pattern-based and process-based map comparison. The proportion of correct answers was only slightly higher than among the web participants, but a definite need was emphasized for testing the hypothesis that the observed differences were due to chance. Although process-based inferential statistics has a long tradition in geographical information analysis, we have not found any explicit reference to map comparison (Legendre and McArdle (1997) mentioned it but never explored the idea).
3 Pattern-Based Map Comparisons with Significance Test
We developed methods that include a significance test (or a series of tests) within the pattern-based framework and that meet the expectations formulated in our surveys. Below we illustrate one that follows the idea of evaluating differences at all locations within neighbourhoods on pairs of categorical maps on a regular grid, and another that implements a hierarchical approach. Both approaches lead to a series of comparisons, each with a significance test, rather than a single value to characterize the differences.
3.1 A local approach to map comparison
This approach makes use of local statistics for categorical spatial data (local indicators for categorical data – LICDs) (Boots 2003).
LICDs are based on the two fundamental characteristics of categorical spatial data: composition, which relates to the aspatial characteristics of the different categories (e.g., colours), and configuration, which refers to the spatial distribution (or arrangement) of the categories. Further, it is argued that, when considered locally, configuration should be measured conditionally with respect to composition. While composition can be uniquely captured by measures based on category counts or proportions, no simple characterization is possible for configuration. Current results indicate that as many as five measures of configuration (number of patches, patch sizes, patch dispersion, join counts, and category eccentricity) are required in order to differentiate between all possible local categorical maps. Previously, local spatial statistics in general, and LICDs in particular, have been used to identify spatial variations within a data set (e.g., complementing global methods). Here, we use them for map comparison. For now, for simplicity, assume two binary raster maps A and B. A and B can be compared by computing LICDs in (n x n) windows centred on each pixel i in both maps. Let c_i^n be the number of black cells (composition) and f_{j,i}^n be the value of the jth configuration measure in the (n x n) window centred on pixel i. Generate the distributions of c_i^n and the conditional distributions of f_{j,i}^n | c_i^n for A and B. Test whether the corresponding pairs of distributions are significantly different. Thus, by using values of n = 3, 5, 7, 9, … we can determine at what spatial scales, and for which characteristics, A and B differ (Figure 5).
3.2 A hierarchical approach to map comparison
This approach is based on the measurement and pyramid-like hierarchical decomposition of the mutual information between two categorical maps on a regular grid (Csillag et al. 2003). The basic measures used are the mutual information, I(X_1,X_2) = H(X_2) - H(X_2|X_1), the amount of the common part of information in X_2 and X_1 (which equals D[(X_1,X_2), (X_1 x X_2)], the Kullback divergence between the actual joint distribution and the joint distribution with independent X_1 and X_2, usually obtained as the cross-product of the marginals), and the uncertainty coefficient, U = 100·I(X_1,X_2)/H(X_2), the proportion of the common part; the significance of U can be assessed by the appropriate chi-square distribution (Freeman 1987). We encode the quadrants (1,...,4 for NW, NE, SW, SE, respectively) of the pyramid for a 2^L-by-2^L grid at each level of the pyramid, and we denote the variables associated with the levels as (coarsest to finest) Y_1, Y_2,..., Y_L. Let X denote the variable representing the (collection of) maps, and Z denote the variable for colours. The decomposition I(X,(Y,Z)) = H(Y,Z) - H((Y,Z)|X) is straightforward
(and note that H(Y) = log_2(#cells) at each level). The series of mutual information values forms a monotonic sequence, I(X,(Z,Y_1)) < I(X,(Z,Y_1,Y_2)) < ..., and the (residual) uncertainty (and its significance) can again be computed at each step.
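As a rough illustration (not the authors' R implementation), the basic measures can be computed directly from two categorical maps; numpy and scipy are assumed, the chi-square test of independence stands in for the significance assessment of U, and the pyramid (quadrant) decomposition is omitted:

```python
import numpy as np
from scipy.stats import chi2_contingency

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(map1, map2):
    """map1, map2: integer-coded categorical rasters of identical shape.
    Returns I(X1,X2) = H(X2) - H(X2|X1), the uncertainty coefficient U (percent),
    and a chi-square p-value for the association between the two maps."""
    cats1, cats2 = np.unique(map1), np.unique(map2)
    # joint (contingency) table of category co-occurrences, cell by cell
    joint = np.array([[np.sum((map1 == a) & (map2 == b)) for b in cats2] for a in cats1],
                     dtype=float)
    pxy = joint / joint.sum()
    h2 = entropy(pxy.sum(axis=0))                                   # H(X2)
    h2_given_1 = entropy(pxy.ravel()) - entropy(pxy.sum(axis=1))    # H(X1,X2) - H(X1)
    i = h2 - h2_given_1
    u = 100.0 * i / h2
    chi2, pval, _, _ = chi2_contingency(joint)
    return i, u, pval

rng = np.random.default_rng(0)
a = rng.integers(0, 2, (64, 64))
b = np.where(rng.random((64, 64)) < 0.8, a, 1 - a)   # b mostly agrees with a
print(mutual_information(a, b))
```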
Fig. 5. The LICD-based map comparison software with a convenient graphical user interface.
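Figure 5 shows the LICD-based tool; a minimal sketch of the windowed comparison of Sect. 3.1 is given below. It is restricted to the composition measure only (the five configuration measures of Boots (2003) are not reproduced), and the two-sample Kolmogorov-Smirnov test is used as a stand-in for whichever test is appropriate; since the window values are spatially dependent, the p-values are only indicative.

```python
import numpy as np
from scipy.stats import ks_2samp

def window_composition(img, n):
    """Number of black (1) cells in the n-by-n window centred on each interior pixel."""
    h = n // 2
    rows, cols = img.shape
    counts = []
    for i in range(h, rows - h):
        for j in range(h, cols - h):
            counts.append(img[i - h:i + h + 1, j - h:j + h + 1].sum())
    return np.array(counts)

def compare_maps(a, b, scales=(3, 5, 7, 9)):
    """For each window size n, test whether the local composition distributions differ."""
    results = {}
    for n in scales:
        stat, p = ks_2samp(window_composition(a, n), window_composition(b, n))
        results[n] = p      # small p: A and B differ in composition at this scale
    return results

rng = np.random.default_rng(1)
a = rng.integers(0, 2, (64, 64))
b = rng.integers(0, 2, (64, 64))
print(compare_maps(a, b))
```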
This approach can, in principle, be easily illustrated by an analogy to a digital camera. Imagine that we are looking at two completely random images with the camera out of focus (i.e., at very coarse resolution): the images appear similar. As we change the resolution to finer and finer quadrants, slight differences may show up, but the vast majority of the differences will be encountered in the very last step. Conversely, if we compare two images consisting of a few large patches with different colours, we would encounter almost all the differences in the first step, and no more differences will be found until we reach the finest resolution. Both the mutual information and the uncertainty coefficient can be plotted as functions of pyramid level, the significance of each level (conditioned on the coarser ones) can be tested, and the shape of these functions provides a visual characterization of the differences (Figure 6).
Fig. 6. The hierarchical decomposition of mutual information and uncertainty along pyramid levels (software implementation in R).
4 Concluding Remarks: Toward Process-Based Map Comparison
Comparing categorical data sets in a process-oriented framework requires some formal assumptions about the interactions between locations and categories: a stochastic process in this sense is the description (parametrization) of the interactions. For instance, we may assume that there is no relationship between locations and colours on a map (i.e., the colour distribution in space is random), in which case we are able to test this hypothesis in various forms (see example-2 above), or we may assume that a given colour distribution does not change significantly from one data set to another (see example-1 above). Characterization of the interactions between locations and categories for cases other than random puts map comparison into the inferential framework, where we estimate the parameters of a model for two (or more) realizations and assess the difference(s) with some (required) level of confidence. This can be quite challenging, and here we briefly highlight the avenues we will pursue in the near future. The first challenge we face is that the generality of a model usually comes at the price of an extremely large number of parameters. To illustrate this, consider a broad class of stochastic processes generating a categorical map (C categories and N locations) which can be described by Markov Random Fields (MRFs). The number of potential parameters in the general case is prohibitive (~C^{2N}) because, in principle, both compositional and configurational parameters can change at each location (Li 2001). Thus,
simplifying assumptions are necessary to make the model manageable, of which the most frequent one is stationarity, i.e., the parameters are assumed to be spatially homogeneous (Getis and Boots 1978, Cressie 1993). In the stationary case for MRFs the local conditional distributions of first-order neighbours fully characterize the joint distribution (Geman and Geman 1984, Besag 1986). In this case the joint likelihood can be written as l(P{s_1,...,s_n}, D, E, J) ∝ D + Σ_i E(s_i) + Σ_{i,j} J(s_i,s_j), where D is the so-called partition function (to ensure integration to 1), E accounts for composition (~the probabilities of categories), and J accounts for configuration (~the probability of given category combinations for given neighbours). Even for binary maps this means 1 parameter for composition and 16 (= 2^4) parameters for configuration. Furthermore, these parameters cannot be estimated independently (the proportion of B or W cells obviously influences the probabilities of BB, WW and BW neighbours). However, the local conditional probabilities (the elements of J, e.g., the probability of a cell being B given that four neighbours are B) and the corresponding frequencies (i.e., how many times it actually occurs that a B cell has four B neighbours) fully characterize the stochastic model. Thus, using Markov chain Monte Carlo (MCMC) methods, we can simulate realizations for any parameter set, and this allows us to derive the empirical distributions of widely used or newly proposed measures of map comparison. With stationary MRFs as stochastic processes, we can model the differences between two maps as, for example: (1) independent realizations with constant parameters (e.g., the data were produced by two different procedures), (2) independent realizations with changing parameters (e.g., forest harvesting, disease spreading, urbanization), (3) non-independent realizations with constant parameters (e.g., an undisturbed forest conservation area at two different dates), (4) non-independent realizations with dependent parameters (e.g., pollution plume and fish distribution). A major challenge is the non-stationary case, when spatial heterogeneity of the parameters is allowed. This makes all single-value-based traditional comparisons highly suspect. While a general solution is nowhere in sight, some "wavelet-like" optimization seems to be feasible. The non-parametric pattern-based approaches illustrated above are also likely to be most useful in those situations where we can make no (or only limited) assumptions concerning the process(es) that generated the pattern. With recent advances in digital spatial databases there is increasing demand for comparing such data sets (e.g., evaluating changes over time, assessing situations at two different locations, considering the advantages and disadvantages of new methods). We propose to use existing mathematical-statistical foundations for putting map comparison into an inferential
framework. The necessary extra effort pays off by being able to ascertain if observed differences could have been caused purely by chance. We demonstrated the preliminary steps toward drawing conclusions about such comparisons in terms of the significance of the differences. These developments are likely to change the way we view differences between maps.
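As a sketch of the MCMC simulation step outlined above, a stationary binary MRF can be sampled with a Gibbs sampler; the parameterization below is deliberately simplified (one composition parameter and one interaction parameter, rather than the full set of conditional parameters discussed in the text), and the comparison measure is an arbitrary example.

```python
import numpy as np

def gibbs_mrf(shape=(32, 32), beta=0.0, gamma=0.7, sweeps=30, seed=0):
    """Simulate a stationary binary MRF with a Gibbs sampler on a torus.
    beta controls composition (preference for category 1), gamma rewards
    like-coloured first-order neighbours; this collapses the 16 conditional
    parameters discussed in the text into a single interaction parameter."""
    rng = np.random.default_rng(seed)
    s = rng.integers(0, 2, shape)
    rows, cols = shape
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                nb = sum(s[(i + di) % rows, (j + dj) % cols]
                         for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)))
                e1 = beta + gamma * nb          # "energy" if the cell is 1
                e0 = gamma * (4 - nb)           # "energy" if the cell is 0
                p1 = 1.0 / (1.0 + np.exp(e0 - e1))
                s[i, j] = int(rng.random() < p1)
    return s

# Empirical (null) distribution of a simple comparison measure -- the proportion of
# agreeing cells -- under the hypothesis that both maps come from the same process.
agree = [np.mean(gibbs_mrf(seed=2 * k) == gibbs_mrf(seed=2 * k + 1)) for k in range(10)]
print(np.percentile(agree, [2.5, 97.5]))
```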
Acknowledgement
The authors gratefully acknowledge the financial support of the GEOIDE Network of Centres of Excellence (Canada) and the Natural Sciences and Engineering Research Council of Canada, and the constructive discussions with Sándor Kabos (Eötvös University, Budapest) about entropy-based measures of similarity.
References
Besag, J.E. 1986: On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society B, 48:302-309.
Boots, B. 2003. Developing local measures of spatial association for categorical data. Journal of Geographical Systems, 5(2), 2003, 139-160.
Cliff, A.D. and Ord, J.K. 1981. Spatial Processes: Models and Applications. London: Pion.
Congalton, R. (ed.) 1994: International Symposium on the Spatial Accuracy of Natural Resources Data Bases. ASPRS, Bethesda, MD.
Csillag, F. and Boots, B. 2003. A statistical framework for decisions in spatial pattern analysis. Canadian Geographer (in review).
Csillag, F., Boots, B., Fortin, M-J., Lowell, K. and Potvin, F. 2001. Multiscale characterization of ecological boundaries. Geomatica 55: 291-307.
Csillag, F., Remmel, T., Mitchell, S. and Wulder, M. 2003. Comparing categorical forest maps by information theoretical distance. Technical Report CFS-03, Department of Geography, University of Toronto, p.49.
Cressie, N.A.C. 1993. Statistics for spatial data. New York: John Wiley & Sons.
D'Eon, R.G., Glenn, S.M. 2000. Perceptions of landscape patterns: Do the numbers count? Forestry Chronicle 76 (3): 475-480.
Fahrig, L. 1997. Relative effects of habitat loss and fragmentation on population extinction. Journal of Wildlife Management, 61(3), 603-610.
Fahrig, L. 1998. When does fragmentation of breeding habitat affect population survival? Ecological Modelling, 105, 273-292.
Foody, G. 2002: Status of land cover classification accuracy assessment. Remote Sensing of Environment 80:185-201.
Fortin, M-J., Boots, B., Csillag, F. and Remmel, T. 2003: On the role of spatial stochastic models in understanding landscape indices in ecology. Oikos 102:203-212.
Freeman, D.H. 1987. Applied categorical data analysis. New York: M. Dekker.
Geman, D. and Geman, S. 1984: Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE T. Pattern Analysis and Machine Intelligence 4: 721-741.
Hargrove, W., Hoffman, F.M., Schwartz, P.M. 2002. A fractal landscape realizer for generating synthetic maps. Conservation Ecology 6 (1): Art. No. 2.
Legendre, P. and McArdle, B.H. 1997. Comparison of surfaces. Oceanologica Acta 20: 27-41.
Li, S.Z. 2001. Markov random field modeling in image analysis. New York: Springer.
Metternicht, G. 1999. Change detection assessment using fuzzy sets and remotely sensed data: an application of topographic map revision. ISPRS Journal of Photogrammetry and Remote Sensing 54: 221-233.
Power, C., Simms, A., White, R. 2001. Hierarchical fuzzy pattern matching for the regional comparison of land use maps. International Journal of Geographical Information Science 15: 77-100.
Richards, J.A. and Xiuping, J. 1999: Remote sensing digital image analysis: an introduction. New York: Springer.
Rogan, J., Franklin, J. and Roberts, D.A. 2002: A comparison of methods for monitoring multitemporal vegetation change using Thematic Mapper imagery. Remote Sensing of Environment 80:143-156.
Rogerson, P.A. 2002. Change detection thresholds for remotely sensed images. Journal of Geographical Systems 4:85-97.
Smith, J.H., Wickham, J.D., Stehman, S.V. 2002. Impacts of patch size and land-cover heterogeneity on thematic image classification accuracy. Photogrammetric Engineering and Remote Sensing 68 (1): 65-70.
Stehman, S.V. 1997. Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment 62:77-89.
Stehman, S.V. 1999. Comparing thematic maps based on map value. International Journal of Remote Sensing 20:2347-2366.
Trani, M.K. and Giles, R.H. 1999. An analysis of deforestation: Metrics used to describe pattern change. Forest Ecology and Management 114:459-470.
Turner, M.G., Gardner, R.H., O'Neill, R.V. 2001. Landscape ecology in theory and practice. New York: Springer.
Upton, G. and Fingleton, B. 1985. Spatial Data Analysis by Example. Volume 1: Point Pattern and Quantitative Data. Chichester: John Wiley & Sons.
White, R., Engelen, G., Uljee, I. 1997. The use of constrained cellular automata for high-resolution modelling of urban land-use dynamics. Environment and Planning B 24: 323-343.
Integrating computational and visual analysis for the exploration of health statistics
Etien L. Koua and Menno-Jan Kraak
International Institute for Geoinformation Science and Earth Observation (ITC), P.O. Box 6, 7500 AA Enschede, The Netherlands
Abstract
One of the major research areas in geovisualization is the exploration of patterns and relationships in large datasets for understanding underlying geographical processes. One of the attempts has been to use Artificial Neural Networks as a technology that is especially useful in situations where the data volumes are vast and the relationships are often unclear or even hidden. We investigate ways to integrate computational analysis based on the Self-Organizing Map with visual representations of derived structures and patterns in an exploratory geovisualization environment intended to support visual data mining and knowledge discovery. Here we explore a large dataset on health statistics in Africa.
Keywords: Exploratory visualization, Data mining, Knowledge discovery, Self-Organizing Map, Visual exploration.
1 Introduction
The exploration of patterns and relationships in large and complex geospatial data is a major research area in geovisualization, as volumes of data become larger and data structures more complex. A major problem associated with the exploration of these large datasets is the limitation of common geospatial analysis techniques in revealing patterns or processes (Gahegan et al. 2001; Miller and Han 2001). New approaches in spatial analysis and visualization are needed to represent such data in a visual form that can better stimulate pattern recognition and hypothesis generation, to allow for better understanding of the processes, and to support knowledge construction. Information visualization techniques are increasingly used in combination with other data analysis techniques. Artificial Neural Networks have been proposed as part of a strategy to improve geospatial analysis of large, complex datasets (Schaale and Furrer 1995; Openshaw and Turton 1996; Skidmore et al. 1997; Gahegan and Takatsuka 1999; Gahegan 2000), because of their ability to perform pattern recognition and classification, and because they are especially useful in situations where the data volumes are large and the relationships are unclear or even hidden (Openshaw and Openshaw 1997). In particular, the Self-Organizing Map (SOM) (Kohonen 1989) is often used as a means of organizing complex information spaces (Girardin 1995; Chen 1999; Fabrikant and Buttenfield 2001; Skupin 2003; Skupin and Fabrikant 2003). Recent efforts in Knowledge Discovery in Databases (KDD) have provided a window for geographic knowledge discovery. Data mining, knowledge discovery, and visualization methods are often combined to try to understand structures and patterns in complex geographical data (MacEachren et al. 1999; Wachowicz 2000; Gahegan et al. 2001). One way to integrate the KDD framework in geospatial data exploration is to combine the computational analysis methods with visual analysis in a process that can support exploratory and knowledge discovery tasks. We explore the SOM for such integration, to uncover the structure, patterns, relationships and trends in the data. Some graphical representations are then used to portray derived structures in a visual form that can support understanding of the structures and the geographical processes, and facilitate human perception (Card et al. 1999). We present a framework for combining pattern extraction with the SOM and the graphical representations in an integrated visual-computational environment, to support exploration of the data and knowledge construction. An application of the method is explored for a large socio-demographic and health dataset for African countries, to provide some understanding of the complex relationships between socio-economic indicators, locations and the burden of diseases such as HIV/AIDS. The ultimate goal is to support visual data mining and exploration, and to gain insight into underlying distributions, patterns and trends.
2 Visual data mining and knowledge discovery for understanding geographical processes
The basic idea of visual data exploration is to present the data in some visual form that allows users to gain insight into the data and draw conclusions (Keim 2002). Visual data mining is the use of visualization techniques to allow users to monitor, evaluate, and interpret the inputs and outputs of the data mining process. Data mining and knowledge discovery are, in general, one approach to the analysis of large amounts of data. The main goal of data mining is identifying valid, novel, potentially useful and ultimately understandable patterns in data (Fayyad et al. 1996). Typical tasks for which data mining techniques are often used include clustering, classification, generalization and prediction. The different applications of data mining techniques suggest three general categories of objectives (Weldon 1996): explanatory (to explain some observed events), confirmatory (to confirm a hypothesis), and exploratory (to analyze data for new or unexpected relationships). These techniques range from traditional statistics to artificial intelligence and machine learning. Artificial Neural Networks are particularly used for exploratory analysis as non-linear clustering and classification techniques. Unsupervised neural networks such as the SOM are a type of neural clustering, and network architectures such as backpropagation and feedforward networks are neural induction methods used for classification (supervised learning). The algorithms used in data mining are often integrated into Knowledge Discovery in Databases (KDD), a larger framework that aims at finding new knowledge from large databases. This framework has been used in geospatial data exploration (Openshaw et al. 1990; MacEachren et al. 1999; Wachowicz 2000; Miller and Han 2001) to discover and visualize the regularities, structures and rules in data. The promises inherent in the development of data mining and knowledge discovery processes for geospatial analysis include the ability to reveal unexpected correlations and causal relationships. Since the dimensionality of the dataset is very high, it is often ineffective to work in such a high-dimensional space to search for patterns. We use the SOM algorithm as a data mining tool to project the input data into an alternative measurement space, based on similarities and relationships in the input data, that can aid the search for patterns. It becomes possible to achieve better results in such a similarity space than in the original attribute space (Strehl and Ghosh 2002).
3 The Self-Organizing Map and the exploration of geospatial data
3.1 The Self-Organizing Map
The Self-Organizing Map (Kohonen 1989) is an Artificial Neural Network used to map multidimensional data onto a low-dimensional space, usually a 2D representation space. The network consists of a number of neural processing elements (units or neurons), usually arranged on a rectangular or hexagonal grid, where each neuron is connected to the input. The goal is to group nodes close together in certain areas of the data value range. Each of the units i is assigned an n-dimensional weight vector m_i that has the same dimensionality as the input patterns. What changes during the network training process are the values of those weights. Each training iteration t starts with the random selection of one input pattern x_t. Using the Euclidean distance between weight vector and input pattern, the activation of the units is calculated. The resultant maps (SOMs) are organized in such a way that similar data are mapped onto the same node or onto neighboring nodes in the map. This leads to a spatial clustering of similar input patterns in neighboring parts of the SOM, and the clusters that appear on the map are themselves organized internally. This arrangement of the clusters in the map reflects the attribute relationships of the clusters in the input space. For example, the size of the clusters (the number of nodes allotted to each cluster) reflects the frequency distribution of the patterns in the input set. In fact, the SOM has a distribution preserving property, which allocates more nodes to input patterns that appear more frequently during the training phase of the network configuration. It also has a topology preserving property, which comes from the fact that similar data are mapped onto the same node, or onto neighboring nodes in the map. In other words, the topology of the dataset in its n-dimensional space is captured by the SOM and reflected in the ordering of its nodes. This is an important feature of the SOM that allows the data to be projected onto the lower-dimensional space while roughly preserving the order of the data in its original space. Another important feature of the SOM for knowledge discovery in complex datasets is the fact that it is an unsupervised learning network, meaning that the training patterns have no accompanying category information. Unlike supervised methods, which learn to associate a set of inputs with a set of outputs using a training data set for which both input and output are known, the SOM adopts a
learning strategy where the similarity relationships between the data and the clusters are used to classify and categorize the data. The SOM can be useful as a tool in the knowledge discovery in databases methodology since it follows the probability density function of the underlying data.
3.2 Computational analysis and visualization framework
One of the advantages of the SOM is that the outcome of the computational process can easily be portrayed through visual representation. The first level of the computation provides a mechanism for extracting patterns from the data. As described in the previous section, the SOM adapts its internal structures to structural properties of the multidimensional input such as regularities, similarities, and frequencies. These properties of the SOM can be used to search for structures in the multidimensional input. The computational process provides ways to visualize the general structure of the dataset (clustering), as well as to explore the relationships among attributes, through graphical representations that visualize the resultant maps (SOMs). The graphical representations are used to enable visual data exploration, allowing the user to gain insight into the data and to evaluate, filter, and map outputs. This is intended to support visual data mining (Keim 2002) and the knowledge discovery process by means of interaction techniques (Cabena et al. 1998). This framework is informed by current understanding of the effective application of visual variables for cartographic and information design, developing theories of interface metaphors for geospatial information displays, and previous empirical studies of map and information visualization effectiveness.
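A generic sketch of the training process described in Sect. 3.1 (not the software used in this study); numpy is assumed, and the grid size, decay schedules and toy data are arbitrary choices.

```python
import numpy as np

def train_som(data, grid=(10, 8), epochs=30, seed=0):
    """Minimal SOM: data is (n_samples, n_features); returns weights of shape
    (rows, cols, n_features). Gaussian neighbourhood and learning rate both decay."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    n, d = data.shape
    w = rng.random((rows, cols, d))
    gy, gx = np.mgrid[0:rows, 0:cols]          # grid coordinates of every unit
    t_max = epochs * n
    t = 0
    for _ in range(epochs):
        for x in data[rng.permutation(n)]:     # random selection of input patterns
            frac = t / t_max
            lr = 0.5 * (1 - frac)                              # learning rate decays
            sigma = max(rows, cols) / 2 * (1 - frac) + 0.5     # neighbourhood radius decays
            # best-matching unit: smallest Euclidean distance to the input pattern
            dists = np.linalg.norm(w - x, axis=2)
            bi, bj = np.unravel_index(np.argmin(dists), dists.shape)
            # move the BMU and its grid neighbours towards the input
            h = np.exp(-((gy - bi) ** 2 + (gx - bj) ** 2) / (2 * sigma ** 2))
            w += lr * h[:, :, None] * (x - w)
            t += 1
    return w

# toy usage: 200 samples with 5 standardized indicators
rng = np.random.default_rng(1)
weights = train_som(rng.random((200, 5)))
print(weights.shape)
```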
4 Application to the exploration of geographical patterns in health statistics
4.1 The data
The dataset consists of 74 variables on socio-demographic and health indicators for all African countries. Maps of a few attributes of the dataset are provided in figure 1. In this section, the dataset is explored, and different visualization techniques are used to illustrate the exploration of (potential) multivariate patterns and relationships among the different countries.
Fig. 1. Examples of attributes of the test dataset: HIV prevalence rate at the end of 2001, HIV rate among commercial sex workers, total literacy rate, percentage of married women, birth rate, total death rate, life expectancy at birth, average age at first marriage, and GNI per capita 2001.
4.2 Exploration of the general patterns and clustering
The SOM offers a number of distance matrix visualizations to show the cluster structure and similarity (patterns). These techniques show distances between neighbouring network units. The most widely used distance matrix technique is the U-matrix (Ultsch and Siemon 1990). In figure 2a, the structure of the data set is visualized in a U-matrix. Countries having similar characteristics based on the multivariate attributes are positioned close to each other, and the distance between them represents the degree of similarity or dissimilarity. This representation of common characteristics can be regarded as the health standard for these countries. Light areas represent clusters (vectors that are close to each other in the input space), and dark areas represent cluster separators (a large distance between the neurons: a gap between the values in the input space). Alternative representations to the U-matrix visualization are discussed below: 2D and 3D projections (using projection methods such as Sammon's mapping and PCA), 2D and 3D surface plots, and component planes.
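A simplified distance matrix in the spirit of the U-matrix can be derived from trained weights as the average distance of each unit to its grid neighbours (the full U-matrix of Ultsch and Siemon also interleaves extra cells between units); this sketch reuses the weights array from the training example above.

```python
import numpy as np

def u_matrix(w):
    """Average Euclidean distance from each SOM unit's weight vector to those of its
    4-connected grid neighbours. Small values indicate clusters (light areas),
    large values indicate cluster separators (dark areas)."""
    rows, cols, _ = w.shape
    u = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            d = []
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    d.append(np.linalg.norm(w[i, j] - w[ni, nj]))
            u[i, j] = np.mean(d)
    return u

# e.g. with the 'weights' array from the training sketch above:
# import matplotlib.pyplot as plt; plt.imshow(u_matrix(weights)); plt.show()
```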
Fig. 2. Representation of the general patterns and clustering in the input data: The unified distance matrix showing clustering and distances between positions on the map (a), projection of the SOM results in 3D space (c); 3D surface plot (d), and a map of the similarity coding extracted from the SOM computational analysis (b).
In figure 2c, the projection of the SOM offers a view of the clustering of the data, with data items depicted as colored nodes. Similar data items are grouped together with the same type or color of markers. The size, position and color of the markers can be used to depict the relationships between the data items. The clustering structure can also be viewed as 2D or 3D surfaces representing the distance matrix (figure 2d), using color value to indicate the average distance to neighboring map units. This is a spatialization (Fabrikant and Skupin 2003) that uses a landscape metaphor to represent the density, shape, and size or volume of clusters. Unlike the projection in figure 2c, which shows only the position and clustering of map units, areas with uniform color are used in the 2D and 3D surface plots to show the clustering structure and relationships among map units. In the 3D surface (figure 2d), color value and height are used to represent the regionalization of map units according to the multidimensional attributes.
4.3 Exploratory visualization and knowledge discovery
The correlations and relationships in the input data space can be easily visualized using the component planes visualization (figure 3). The component planes show the values of the different attributes for the different map units (countries) and how each input vector varies over the space of the SOM units. They are used to support exploratory tasks, to facilitate the knowledge discovery process, and to improve geospatial analysis. Compared with the maps in figure 1, patterns and relationships among all the attributes can be easily examined in a single visual representation using the SOM component planes visualization. Since the SOM represents the similarity clustering of the multivariate attributes, the visual representation becomes more accessible and easier to explore. This kind of spatial
clustering makes it possible to conduct exploratory analyses that help in identifying the causes and correlates of health problems (Cromley and McLafferty 2002), when overlaid with environmental, social, transportation, and facilities data. Such map overlays have been important hypothesis-generating tools in public health research and policymaking (Croner et al. 1992). In figure 3a, all the components are displayed, and a selection of a few of them is made more visible for the analysis in figure 3b. Two variables that are correlated will be represented by similar displays. The kind of visual representation (imagery cues) provided by the SOM component planes visualization can facilitate visual detection, and has an impact on knowledge construction (Keller and Keller 1992). As such, the SOM can be used as an effective tool to visually detect correlations among operating variables in a large volume of multivariate data. From the exploration of the global patterns, correlations and relationships in figure 3a, hypotheses can be made and further investigation can follow, in the process of understanding the patterns. To enhance visual detection of the relationships and correlations, the components can be ordered so that variables that are correlated are displayed next to each other (see figures 3c and 3d), in a way similar to the collection maps of Bertin (Bertin 1981). It becomes easy to see, for example, that the HIV prevalence rate in Africa is related to a number of other variables, including the literacy rate and behavior (characterized in the dataset as high-risk sexual behavior and limited knowledge of risk factors), and other factors such as the high prevalence rate among prostitutes and the high rate of infection for other sexually transmitted diseases. As a consequence of the high prevalence rate in regions such as southern Africa, there seem to be a low birth rate and low life expectancy at birth, and a high death rate, strongly impacted by HIV infection. The birth rate in the most affected regions seems to be a consequence of prevention measures, notably the increased use of condoms among a large proportion of single women, nominally for contraception but seemingly rather as protection against HIV. It is also observed through the component planes visualization that factors such as the percentage of married women, the percentage of sexually active single women, and the average age at first marriage in these countries are highly related to the prevalence rate.
Fig. 3. Detailed exploration of the dataset using the SOM component visualization: all the components can be displayed to reveal the relationships between the attributes for different spatial locations (countries) in (a). Selected components related to a specific hypothesis can be further explored (b). Component planes can be ordered based on correlations among the attributes to facilitate visual recognition of relationships among all the attributes (c) or selected attributes (d). The position of countries on the SOM grid is shown in (e).
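Component planes are simply the per-attribute slices of the SOM weights. The sketch below also orders the attributes by their absolute correlation with a chosen anchor attribute, a simplification of the correlation-based ordering shown in figures 3c and 3d; the attribute names are invented.

```python
import numpy as np

def component_planes(w, names):
    """Return one 2D plane per attribute from SOM weights of shape (rows, cols, d)."""
    return {name: w[:, :, k] for k, name in enumerate(names)}

def order_by_correlation(w, names, anchor):
    """Order attributes by the absolute correlation of their planes with an anchor
    attribute, so that related planes can be displayed next to each other."""
    planes = component_planes(w, names)
    ref = planes[anchor].ravel()
    corr = {n: abs(np.corrcoef(ref, p.ravel())[0, 1]) for n, p in planes.items()}
    return sorted(names, key=lambda n: -corr[n])

# e.g. with the 'weights' array from the training sketch above:
# names = ["hiv_rate", "literacy", "birth_rate", "death_rate", "gni"]
# print(order_by_correlation(weights, names, "hiv_rate"))
```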
4.4 The integrated visual-computational environment
We have extended the alternative representations of the SOM results used to highlight different characteristics of the computational solution and integrated them with other graphics into multiple views, to allow brushing and linking (Egbert and Slocum 1992; Monmonier 1992; Cook et al. 1996; Dykes 1997) for exploratory analysis (see figure 4). These multiple views are used to simultaneously present interactions between several variables over the space of the SOM, maps and parallel coordinate plots, and to emphasize visual change detection and the monitoring of variability through the attribute space. These alternative views of the data can help stimulate the visual thinking process characteristic of visual exploration, and support hypothesis testing, evaluation and interpretation of patterns, from the general patterns extracted to specific selections of attributes and spatial locations.
Fig. 4. The user interface of the exploratory geovisualization environment, showing the representation of the general patterns and clustering in the input data: unified distance matrix (b), projection of the SOM results in 3D space (c), 3D surface plot (e), map of the SOM similarity coding (a) and parallel coordinate plot (d), component planes (f) and map unit labels in (g).
5 Conclusion In this paper we have presented an approach to combine visual and computational analysis into an exploratory visualization environment intended to contribute to the analysis of large volumes of geospatial data. The approach focuses on the effective application of computational algorithms to extract patterns and relationships in geospatial data, and on the visual representation of the derived information. A number of visualization techniques were explored. The SOM computational analysis was integrated with visual exploration tools to support exploratory visualization. Interactive manipulation of the graphical representations can enhance user goal-specific querying and selection, from the general patterns extracted to more specific user selections of attributes and spatial locations. The link between the attribute space visualization based on the SOM, the geographic space with maps representing the SOM results, and other graphics such as parallel coordinate plots, in multiple views, provides alternative perspectives for better exploration, hypothesis generation, evaluation and interpretation of patterns, and ultimately support for knowledge construction.
References
Bertin, J. (1981). Graphics and Graphic Information Processing. Berlin, Walter de Gruyter.
Cabena, P., P. Hadjnian, R. Stadler, J. Verhees and A. Zanasi (1998). Discovering Data Mining: From Concept to Implementation. New Jersey, Prentice Hall.
Card, S. K., J. D. Mackinlay and B. Shneiderman (1999). Readings in Information Visualization: Using Vision to Think. San Francisco, Morgan Kaufmann Publishers.
Chen, C. (1999). Information Visualization and Virtual Environments. London, Springer-Verlag.
Cook, D., J. J. Majure, J. Symanzik and N. Cressie (1996). Dynamic Graphics in a GIS: Exploring and Analyzing Multivariate Spatial Data Using Linked Software. Computational Statistics: Special Issue on Computer-aided Analysis of Spatial Data 11(4): 467-480.
Cromley, E. K. and S. L. McLafferty (2002). GIS and Public Health. New York, The Guilford Press.
Croner, C., L. Pickle, D. Wolf and A. White (1992). A GIS approach to hypothesis generation in epidemiology. In: Voss, A. W. (ed) ASPRS/ACSM Technical Papers, Vol. 3. Washington, DC, ASPRS/ACSM: 275-283.
Dykes, J. A. (1997). Exploring Spatial Data Representation with Dynamic Graphics. Computers & Geosciences 23(4): 345-370.
Egbert, S. L. and T. A. Slocum (1992). EXPLOREMAP: An Exploration System for Choropleth Maps. Annals of the Association of American Geographers 82(2): 275-288.
Fabrikant, S. I. and B. Buttenfield (2001). Formalizing semantic spaces for information access. Annals of the Association of American Geographers 91(2): 263-280.
Fabrikant, S. I. and A. Skupin (2003). Cognitively Plausible Information Visualization. In: Kraak, M. J. (ed) Exploring GeoVisualization. Amsterdam, Elsevier.
Fayyad, U., G. Piatetsky-Shapiro and P. Smyth (1996). From data mining to knowledge discovery in databases. Artificial Intelligence Magazine 17: 37-54.
Gahegan, M. (2000). On the application of inductive machine learning tools to geographical analysis. Geographical Analysis 32(2): 113-139.
Gahegan, M., M. Harrower, T. M. Rhyne and M. Wachowicz (2001). The integration of geographic visualization with databases, data mining, knowledge discovery construction and geocomputation. Cartography and Geographic Information Science 28(1): 29-44.
Gahegan, M. and M. Takatsuka (1999). Dataspaces as an organizational concept for the neural classification of geographic datasets. Fourth International Conference on GeoComputation, Fredericksburg, Virginia, USA.
Girardin, L. (1995). Mapping the virtual geography of the World Wide Web. Fifth International World Wide Web Conference, Paris, France.
Keim, D. A. (2002). Information Visualization and Visual Data Mining. IEEE Transactions on Visualization and Computer Graphics 7(1): 100-107.
Keller, P. and M. Keller (1992). Visual Cues: Practical Data Visualization. Los Alamitos, CA, IEEE Computer Society Press.
Kohonen, T. (1989). Self-Organization and Associative Memory. Springer-Verlag.
MacEachren, A. M., M. Wachowicz, R. Edsall, D. Haug and R. Masters (1999). Constructing knowledge from multivariate spatiotemporal data: integrating geographical visualization with knowledge discovery in databases methods. International Journal of Geographical Information Science 13(4): 311-334.
Miller, H. J. and J. Han (2001). Geographic Data Mining and Knowledge Discovery. London, Taylor and Francis.
Monmonier, M. (1992). Authoring Graphics Scripts: Experiences and Principles. Cartography and Geographic Information Systems 19(4): 247-260.
Openshaw, S., A. Cross and M. Charlton (1990). Building a prototype geographical correlates machine. International Journal of Geographical Information Systems 4(4): 297-312.
Openshaw, S. and C. Openshaw (1997). Artificial Intelligence in Geography. Chichester, John Wiley & Sons.
Openshaw, S. and I. Turton (1996). A parallel Kohonen algorithm for the classification of large spatial datasets. Computers & Geosciences 22(9): 1019-1026.
Schaale, M. and R. Furrer (1995). Land surface classification by Neural Networks. International Journal of Remote Sensing 16(16): 3003-3031.
Skidmore, A., B. J. Turner, W. Brinkhof and E. Knowles (1997). Performance of a Neural Network: mapping forests using GIS and remotely sensed data. Photogrammetric Engineering and Remote Sensing 63(5): 501-514.
Skupin, A. (2003). A novel map projection using an Artificial Neural Network. 21st International Cartographic Conference (ICC), 'Cartographic Renaissance', Durban, South Africa.
Skupin, A. and S. Fabrikant (2003). Spatialization Methods: A Cartographic Research Agenda for Non-Geographic Information Visualization. Cartography and Geographic Information Science 30(2): 99-119.
Strehl, A. and J. Ghosh (2002). Relationship-based clustering and visualization for multidimensional data mining. INFORMS Journal on Computing 00(0): 1-23.
Ultsch, A. and H. Siemon (1990). Kohonen's self-organizing feature maps for exploratory data analysis. Proceedings of the International Neural Network Conference (INNC'90), Dordrecht, The Netherlands.
Wachowicz, M. (2000). The role of geographic visualization and knowledge discovery in spatio-temporal modeling. Publications on Geodesy 47: 27-35.
Weldon, J. L. (1996). Data mining and visualization. Database Programming and Design 9(5).
Using Spatially Adaptive Filters to Map Late Stage Colorectal Cancer Incidence in Iowa Chetan Tiwari and Gerard Rushton Department of Geography, The University of Iowa, Iowa City, Iowa 52242, USA
Abstract Disease rates computed for small areas such as zip codes, census tracts or census block groups are known to be unstable because of the small populations at risk. All people in Iowa diagnosed with colorectal cancer between 1993 and 1997 were classified by cancer stage at the time of their first diagnosis. The ratios of the number of late-stage cancers to cancers at all stages were computed for spatial aggregations of circles centered on individual grid points of a regular grid. Late-stage colorectal cancer incidence rates were computed at each grid point by varying the size of the spatial filter until it met a minimum threshold on the total number of colorectal cancer incidences. These different-sized areas are known as spatially adaptive filters. The variances analyzed at grid points showed that the maps produced using spatially adaptive filters gave higher statistical stability in computed rates and greater geographic detail when compared to maps produced using conventional fixed-size filters.
1 Introduction With the availability of high quality geospatial data on disease incidences, disease maps are increasingly being used for representing, analyzing and interpreting disease incidents. They are being used to make important decisions about resource allocation and to study the interrelationships that might exist between spatially variable risk factors and the occurrence of disease. Several methods exist for mapping disease. It is common to determine a rate using disease cases as the numerators and the population at risk as the denominators. Rates of disease occurrence or incidence that are computed for small areal units like zip codes, census tracts or block groups are known to be unstable because of the small populations at risk. This problem is particularly found in rural areas that tend to have low population densities. A common solution to this problem is to aggregate the data (not the rates) over some larger area to estimate the disease rate at a point. There are other methods of spatial smoothing, not of interest in this paper, that involve fitting smooth surfaces to point measures. These range from linear smoothers, in which the smoothing function depends primarily on the distance and weight of points within a defined neighbourhood, to non-linear methods like response-surface analysis, kriging, etc. Other methods like median polish and headbanging provide tools for local smoothing of data (Pickle 2002). Kafadar (1994) provides a comparison of several such smoothing methods. Methods that address the smoothing of point-based data, which are commonly available as aggregates over some geographical unit or, less commonly, at the individual level, are of interest in spatial epidemiology (Lawson 2001).
With the availability of high-resolution data, there has also been a move towards developing local statistics as opposed to global statistics, with a focus on point-based methods. Atkinson and Unwin (2002) developed and implemented point-based local statistical methods within the MapInfo GIS software. Bithell (1990) used density estimation techniques to study the relative risk of disease at different geographical locations using a dataset on childhood leukaemia in the vicinity of the Sellafield nuclear processing plant in the U.K. His study introduced the idea of 'adaptive kernel estimation', in which he proposed that the size of the bandwidth should vary according to the sample density. The use of such adaptive kernels would ideally provide greater smoothing in areas of low density and greater sensitivity in areas of higher density.
Rushton and Lolonis (1996) used a series of overlapping circles of fixed width to estimate the rates of infant mortality in Des Moines, Iowa for points within a regular grid. However, such fixed-size spatial filters are not sensitive to the distribution of the base population, and hence this method produces unstable rates in areas with low population densities. Talbot et al. (2000) used a modified version of such fixed spatial filters to borrow strength from surrounding areas. They used spatial filters with constant or nearly constant population size to create smoothed maps of disease incidence. Other methods like the Empirical Bayes method, which shrinks local estimates towards the global mean, gained popularity in the 1990s. With increased computing capability, this approach led to the development of full Bayes methods that used Markov Chain Monte Carlo (MCMC) methods to account for any spatial autocorrelation (Bithell 2000; Rushton 2003). In this paper we examine and statistically compare the properties of maps produced using fixed-size spatial filters and spatially adaptive filters.
2 Data The data consisted of all people in Iowa diagnosed with colorectal cancer (n=7728) between 1993 and 1997. Each person was geocoded to their street address; where this was not possible, the location of the centroid of their zip code was substituted. Each person was classified by stage of cancer at the time of their first diagnosis. Each eligible case had a morphology that was consistent with the adenoma-adenocarcinoma sequence. The stage of the disease was defined as the summary stage at the time of diagnosis using the SEER Site-Specific Summary Staging Guide. Late-stage CRC was defined as an AJCC stage of at least II. Rates of late-stage colorectal cancer incidence were calculated as the ratio of the number of late-stage colorectal cancer cases to the total number of colorectal cancer cases. These rates were calculated at all points within a regular grid laid out over Iowa. For computational efficiency, the distance between grid points was set to 5 miles. The proportion of persons found to have late-stage colorectal cancer at the time of first diagnosis is generally thought to correlate inversely with the proportion of people who are screened for this cancer. Areas with low late-stage rates are expected to have higher survival rates. Finding areas with high late-stage rates can lead to enhanced prevention and control efforts that involve improved rates of screening for this cancer.
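A minimal sketch of the grid set-up described above (the projected bounding box, coordinate units and synthetic case records are placeholders, not the Iowa registry data):

```python
import numpy as np

# Placeholder projected bounding box for the study area, in miles.
XMIN, XMAX, YMIN, YMAX = 0.0, 310.0, 0.0, 200.0
SPACING = 5.0  # distance between grid points, in miles

# Regular grid of points at which late-stage rates will be estimated.
xs = np.arange(XMIN, XMAX + SPACING, SPACING)
ys = np.arange(YMIN, YMAX + SPACING, SPACING)
grid = np.array([(x, y) for y in ys for x in xs])

# Synthetic stand-in for the geocoded cases: coordinates plus a late-stage flag
# (True where the summary stage at first diagnosis was AJCC stage II or higher).
rng = np.random.default_rng(42)
n_cases = 7728
cases_xy = np.column_stack([rng.uniform(XMIN, XMAX, n_cases),
                            rng.uniform(YMIN, YMAX, n_cases)])
late = rng.random(n_cases) < 0.6  # placeholder late-stage proportion
```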
3 Spatial Filtering Choropleth maps are examples of spatially filtered maps. However, their usefulness is affected by the different sizes, shapes and populations they represent as well as the property of spatial discreteness. Such maps produced from point-based data are known to produce different patterns when the shape and scale of the boundaries are changed. This is the Modifiable Areal Unit Problem (Openshaw and Taylor 1979; Amrhein 1995). Changing the scale at which the disease data are aggregated by choosing different types of areas like zip code areas, census block group areas or counties results in different patterns of disease incidence. To illustrate the aggregation/zoning effect, consider the maps in Fig. 1a through Fig. 1d. These maps are produced by shifting the boundaries of the zip codes in different directions. By doing this, we are keeping the size of the areal units approximately the same, but we are changing the structure of the boundaries. It is important at this stage to note that the distribution of the point-based data remains unaltered. As expected, the patterns of disease are quite different.
The problem of scale is easily solved when point-based data are available. The map in Fig. 2a was produced using fixed-size spatial filters in which the rates of late-stage colorectal cancer incidence were computed for spatial aggregations around each point within a regular grid. The rates at each grid point were calculated as the number of late-stage colorectal cancer cases within a 20-mile circular filter divided by the number of all colorectal cancer cases within the same 20-mile radius. The disease rates computed at grid points were then converted into a surface map using spatial interpolation (Fig. 2b). The inverse distance weighted (IDW) method was used to create the surface maps. To minimize double smoothing of the results, only the eight direct neighbors were used.
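A minimal sketch of the fixed-filter computation, reusing the grid, cases_xy and late arrays from the sketch in the previous section (scaling the rates per 1,000 cases is an assumption, inferred from the range of values shown later in Figs. 5 and 6):

```python
import numpy as np

def fixed_filter_rates(grid, cases_xy, late, radius=20.0):
    """Late-stage rate at each grid point using a fixed-size circular filter.

    grid:     (n_points, 2) grid-point coordinates, in miles
    cases_xy: (n_cases, 2) geocoded case locations, in miles
    late:     (n_cases,) boolean late-stage indicator
    radius:   filter radius in miles (20 miles in the map of Fig. 2a)
    Returns rates per 1,000 cases; NaN where the filter contains no case.
    """
    rates = np.full(len(grid), np.nan)
    for i, (gx, gy) in enumerate(grid):
        d = np.hypot(cases_xy[:, 0] - gx, cases_xy[:, 1] - gy)
        inside = d <= radius
        total = inside.sum()
        if total > 0:
            rates[i] = 1000.0 * late[inside].sum() / total
    return rates

# e.g. rates_20 = fixed_filter_rates(grid, cases_xy, late, radius=20.0)
# The surface of Fig. 2b would then be interpolated from rates_20
# (IDW over the eight direct neighbors), not shown here.
```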
In this method, since the rates of disease incidence are computed for spatial aggregations around points within a regular grid and because this grid is independent of any administrative or other boundary, the use of spatial filters controls for both the issues that contribute to the MAUP. ‘Control’ in this context does not mean that the resulting pattern is insensitive to the choice of the size of the spatial filter being used. It does mean, however, that the resulting pattern changes in predictable ways as the size of the spatial filter changes. The size of the spatial filter used affects the reliability and amount of geographic detail that is portrayed by the map. Larger spatial filters produce maps with high levels of reliability, but low geographic detail. Alternatively, smaller filters produce maps of high geographic detail, but low reliability. As a result, in some cases, fixed-size spatial filters have problems of oversmoothing and undersmoothing disease rates in relation to the distribution of the population at risk. In other words, fixed-size spatial filters may be using large filter sizes in areas where a smaller filter size could have been used as effectively, thereby resulting in loss of geographic detail; or they may be using smaller filter sizes that produce unreliable estimates in areas with sparse populations at risk.
Spatially adaptive filters overcome these problems by using variable-width spatial filters that take the base population at risk into account when computing disease rates. In the following sections of this paper, maps produced using both methods are compared, and we expect to see high geographic detail and reliability in the maps produced using spatially adaptive filters.
4 Spatially Adaptive Filters In the spatially adaptive filters method, late-stage rates were computed for spatial aggregations centered on each grid point within a regular grid. GIS and other supporting software were used to compute these rates by varying the size of the spatial filter until it met a minimum threshold value on the total number of colorectal cancer incidences (denominator of the rate). These different-sized areas are known as spatially adaptive filters.
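A minimal sketch of this procedure, in the same terms as the fixed-filter sketch above (the 2-mile growth step and the 12- to 36-mile bounds are assumptions consistent with the filter sizes reported below, not a documented part of the implementation):

```python
import numpy as np

def adaptive_filter_rates(grid, cases_xy, late, threshold,
                          start_radius=12.0, max_radius=36.0, step=2.0):
    """Late-stage rate at each grid point using a spatially adaptive filter.

    The circular filter centred on a grid point is enlarged in step-mile
    increments until it contains at least `threshold` colorectal cancer cases
    (the denominator of the rate) or max_radius is reached.
    Returns (rates per 1,000 cases, filter radius used at each grid point).
    """
    rates = np.full(len(grid), np.nan)
    radii = np.full(len(grid), np.nan)
    for i, (gx, gy) in enumerate(grid):
        d = np.hypot(cases_xy[:, 0] - gx, cases_xy[:, 1] - gy)
        r = start_radius
        while (d <= r).sum() < threshold and r < max_radius:
            r += step
        inside = d <= r
        total = inside.sum()
        if total > 0:
            rates[i] = 1000.0 * late[inside].sum() / total
            radii[i] = r
    return rates, radii

# e.g. rates_adapt, radii = adaptive_filter_rates(grid, cases_xy, late, threshold=75)
```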
Fig. 3a and 3b are examples of two filter sizes (a fixed 12-mile filter on the left and a fixed 36-mile filter on the right) that were used to create maps of late-stage colorectal cancer incidence in Iowa. In Fig. 3a we see high geographic detail, and in Fig. 3b we see an extremely smooth map. Fig. 3d shows large variability in rates for the 12-mile filter map and low variability in rates for the 36-mile map. To overcome this problem, we use spatially adaptive filters to create maps of disease incidence with the maximum possible spatial resolution but without an increase in the variability of rates (Fig. 3c). As can be seen in Fig. 3d, the variability of rates in Fig. 3c is less than in Fig. 3a, yet considerable geographic detail is preserved compared with Fig. 3b.
Correlations between rates computed at each grid point using different-sized fixed spatial filters and different threshold values for spatially adaptive filters were examined (Fig. 4). The x-axis corresponds to the size of the fixed spatial filter used to compute rates at each grid point. The y-axis indicates the correlation of rates computed at each grid point between the fixed filter method and the spatially adaptive filter method. The curves indicate different threshold values on the denominator. As we increase the size of the fixed spatial filter, we would expect it to pull in a larger number of colorectal cancer cases. The correlation curves in Fig. 4 follow this trend, and we notice greater correlation as fixed-filter sizes and threshold values increase. For each threshold value there is a fixed filter size with which it has the highest correlation. As the threshold value increases, the filter size with which it has the highest correlation also increases. Thus each threshold value has a correlation pattern in the form of an inverted "U" shape, with the maximum value moving to the right as the threshold value increases. This indicates some similarity between maps produced using the fixed-filter method and the spatially adaptive filters method. The more uniform the population density in the area mapped, the higher the correlation between a fixed filter map and the spatially adaptive filter map with which it most closely corresponds. It follows that in an area with variable population density there will be some areas of the map where the results of the fixed filter and the spatially adaptive filter are the same and other areas where the results will be different.
In terms of the amount of geographical detail that is obtained, we find that the maps produced using spatially adaptive filters perform better. In this paper we illustrate this argument by comparing two maps, one produced using a fixed spatial filter of 20 miles and one using spatially adaptive filters with a minimum threshold value of 75 colorectal cancer cases.
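Each point on a curve in Fig. 4 summarizes the agreement between two rate surfaces. A minimal sketch of that comparison (whether the published curves plot Pearson's r or R², and how grid points with no rate were handled, is not stated, so both choices here are assumptions):

```python
import numpy as np

def map_correlation(rates_a, rates_b):
    """Squared Pearson correlation (R^2) between two rate surfaces, taken over
    the grid points where both methods produced a rate (NaNs excluded)."""
    ok = ~np.isnan(rates_a) & ~np.isnan(rates_b)
    r = np.corrcoef(rates_a[ok], rates_b[ok])[0, 1]
    return r ** 2

# One point on a Fig. 4-style curve: the fixed 20-mile filter map against the
# adaptive filter map with threshold > 75 (arrays from the sketches above).
# r2 = map_correlation(rates_20, rates_adapt)
```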
Fig. 4. Similarity in late-stage Colorectal Cancer (CRC) incidence rates computed using different sized fixed spatial filters and spatially adaptive filters with varying threshold values (x-axis: filter size in miles; y-axis: R²; one curve per threshold on the number of cases in the denominator: > 10, > 50, > 75, > 100)
Fig. 5. Distribution of rates produced using (a) the 20-mile fixed filter method and (b) the spatially adaptive filters method with threshold > 75
Fig. 5 shows that the overall distributions of rates of late-stage colorectal cancer incidence in Iowa produced using these two methods are very similar. Fig. 6, however, shows that considerable differences are found between the rates at specific grid locations. As illustrated in Fig. 7, the map produced using the spatially adaptive filters method gives greater geographic detail when compared to the fixed filter method.
Fig. 6. Comparison of rates at each grid point from the 20-mile fixed filter and the adaptive filter with threshold > 75
Fig. 7. Late-stage colorectal cancer incidence in Iowa: Comparison of geographic detail in maps produced using (a) the 20-mile fixed spatial filter and (b) the spatially adaptive filter with a threshold > 75
In the spatially adaptive filters method, approximately 70% of the filters used were less than 20 miles and approximately 10% were larger than 20 miles (Fig. 8). The variances on the resulting maps, however, as noted in Fig. 5, were the same.
Fig. 8. Amount of over-smoothing in the fixed 20-mile filter map (histogram of the filter sizes, in miles, used at grid points in the spatially adaptive filters method, with cumulative percentage)
The map produced using spatially adaptive filters used spatial filter sizes ranging from 12 miles to 36 miles. Fig. 9 illustrates a small sample (approximately 5%) of the filter sizes that were used in the spatially adaptive filters method. Notice that smaller filter sizes are used in urban areas (central part of the state) and larger filter sizes are used in rural areas and in areas close to the state boundary. Larger filter sizes were required close to the map boundaries because of known problems relating to edge effects (Lawson et al. 1999).
Fig. 9. Sample of filter sizes used in the spatially adaptive filters method - every twentieth grid point
5 Conclusions These results demonstrate the advantages of using spatially adaptive filters to obtain stable maps of disease incidence compared with maps produced using fixed-size spatial filters. We found that in most cases the maps produced using spatially adaptive filters used smaller filter sizes than a fixed filter map while giving results with high reliability in rates and better geographic detail. Many cancer registries throughout the world are now able to geocode cancer incidences to very fine levels of geographic resolution. Although access to such data is restricted because of privacy and confidentiality concerns, an important question is whether this geographic detail is needed to produce more accurate maps of cancer rates. Future research will explore this question using both disaggregated and aggregated cancer data. As part of the research project that supported this work, a new version of the Disease Mapping and Analysis Program is being developed at The University of Iowa that uses spatially adaptive filters for creating disease maps. Its current status can be viewed at http://www.uiowa.edu/~gishlth/DMAP.
Acknowledgements We thank Ika Peleg, Michele West, Geoffrey Smith and the Iowa Cancer Registry for their assistance. Partial support for this work was provided by NIH Grant # ROI CA095 961, “A GIS-based workbench to interpret cancer maps”
References
Amrhein CG (1995) Searching for the elusive aggregation effect: evidence from statistical simulations. Env and Pl A 27:105-199
Atkinson PJ, Unwin DJ (2002) Density and local attribute estimation of an infectious disease using MapInfo. Comp & Geosci 28:1095-1105
Bithell JF (1990) An application of density estimation to geographical epidemiology. Stat in Med 9:691-701
Bithell JF (2000) A classification of disease mapping methods. Stat in Med 19:2203-2215
Kafadar K (1994) Choosing among two-dimensional smoothers in practice. Comp Stat & Data Analysis 18:419-439
Lawson AB (2001) Statistical methods in spatial epidemiology. John Wiley & Sons, West Sussex
Lawson AB, Biggeri A, Dreassi E (1999) Edge effects in disease mapping. In: Lawson AB, Biggeri A, Bohning D, Lesaffre E, Viel JF, Bertollini R (eds) Disease mapping and risk assessment for public health. John Wiley & Sons, pp 83-96
Openshaw S, Taylor P (1979) A million or so correlation coefficients: three experiments on the modifiable areal unit problem. In: Wrigley N (ed) Statistical applications in the spatial sciences. Pion, London, pp 127-144
Pickle LW (2002) Spatial analysis of disease. In: Beam C (ed) Biostatistical applications in cancer research. Kluwer Academic Publishers, Tampa, pp 113-150
Rushton G (2003) Public health, GIS, and spatial analytic tools. Ann Rev of Pub Hlth 24:43-56
Rushton G, Lolonis P (1996) Exploratory spatial analysis of birth defect rates in an urban population. Stat in Med 15:717-726
Talbot TO, Kulldorff M, Forand SP, Haley VB (2000) Evaluation of spatial filters to create smoothed maps of health data. Stat in Med 19:2399-2408