Rough Fuzzy Image Analysis
Foundations and Methodologies
Chapman & Hall/CRC Mathematical and Computational Imaging Sciences

Series Editors

Chandrajit Bajaj, Center for Computational Visualization, The University of Texas at Austin
Guillermo Sapiro, Department of Electrical and Computer Engineering, University of Minnesota
Aims and Scope This series aims to capture new developments and summarize what is known over the whole spectrum of mathematical and computational imaging sciences. It seeks to encourage the integration of mathematical, statistical and computational methods in image acquisition and processing by publishing a broad range of textbooks, reference works and handbooks. The titles included in the series are meant to appeal to students, researchers and professionals in the mathematical, statistical and computational sciences, application areas, as well as interdisciplinary researchers involved in the field. The inclusion of concrete examples and applications, and programming code and examples, is highly encouraged.
Proposals for the series should be submitted to the series editors above or directly to: CRC Press, Taylor & Francis Group, 4th Floor, Albert House, 1-4 Singer Street, London EC2A 4BQ, UK
Chapman & Hall/CRC Mathematical and Computational Imaging Sciences
Rough Fuzzy Image Analysis
Foundations and Methodologies
Edited by
Sankar K. Pal James F. Peters
CRC Press, Taylor & Francis Group, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

© 2010 by Taylor and Francis Group, LLC. CRC Press is an imprint of Taylor & Francis Group, an Informa business.

No claim to original U.S. Government works. Printed in the United States of America on acid-free paper. 10 9 8 7 6 5 4 3 2 1

International Standard Book Number: 978-1-4398-0329-5 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging‑in‑Publication Data Rough fuzzy image analysis : foundations and methodologies / editors, Sankar K. Pal, James F. Peters. p. cm. “A CRC title.” Includes bibliographical references and index. ISBN 978-1-4398-0329-5 (hardcover : alk. paper) 1. Image analysis. 2. Fuzzy sets. I. Pal, Sankar K. II. Peters, James F. III. Title. TA1637.R68 2010 621.36’7--dc22
2009053741
Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com
Preface

This book introduces the foundations and applications in the state of the art of rough-fuzzy image analysis. Fuzzy sets* and rough sets**, as well as a generalization of rough sets called near sets***, provide important and useful stepping stones in the various approaches to image analysis given in the chapters of this book. These three types of sets and their various hybridizations provide powerful frameworks for image analysis. Image analysis focuses on the extraction of meaningful information from digital images. This subject has its roots in studies of space and the senses by J.H. Poincaré during the early 1900s, studies of visual perception and the topology of the brain by E.C. Zeeman and picture processing by A.P. Rosenfeld****. The basic picture processing approach pioneered by A.P. Rosenfeld was to extract meaningful patterns in given digital images representing real scenes as opposed to images synthesized by the computer. Underlying picture processing is an interest in filtering a picture to detect given patterns embedded in digital images and approximating a given image with simpler, similar images with lower information content (this, of course, is at the heart of the near set-based approach to image analysis). This book calls attention to the utility that fuzzy sets, near sets and rough sets have in image analysis. One of the earliest fuzzy set-based image analysis studies was published in 1982 by S.K. Pal*****. The spectrum of fuzzy set-oriented image analysis studies includes edge ambiguity, scene analysis, image enhancement using smoothing, image description, motion frame analysis, medical imaging, remote sensing, thresholding and image frame analysis. The application of rough sets in image analysis was launched in a seminal paper published in 1993 by A. Mrózek and L. Plonka******. Near sets are a recent generalization of rough sets that have proven to be useful in image analysis and pattern
* See, e.g., Zadeh, L.A., Fuzzy sets. Information and Control (1965), 8 (3) 338-353; Zadeh, L.A., Toward a theory of fuzzy granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems 90 (1997), 111-127. See, also, Rosenfeld, A., Fuzzy digital topology, in Bezdek, J.C., Pal, S.K., Eds., Fuzzy Models for Pattern Recognition, IEEE Press, 1991, 331-339; Banerjee, M., Kundu, M.K., Maji, P., Content-based image retrieval using visually significant point features, Fuzzy Sets and Systems 160, 1 (2009), 3323-3341; http://en.wikipedia.org/wiki/Fuzzy_set

** See, e.g., Peters, J.F., Skowron, A.: Zdzislaw Pawlak: Life and Work, Transactions on Rough Sets V, (2006), 1-24; Pawlak, Z., Skowron, A.: Rudiments of rough sets, Information Sciences 177 (2007) 3-27; Pawlak, Z., Skowron, A.: Rough sets: Some extensions, Information Sciences 177 (2007) 28-40; Pawlak, Z., Skowron, A.: Rough sets and Boolean reasoning, Information Sciences 177 (2007) 41-73; http://en.wikipedia.org/wiki/Rough_set

*** See, e.g., Peters, J.F., Puzio, L., Image analysis with anisotropic wavelet-based nearness measures, Int. J. of Computational Intelligence Systems 79, 3-4 (2009), 1-17; Peters, J.F., Wasilewski, P., Foundations of near sets, Information Sciences 179, 2009, 3091-3109; http://en.wikipedia.org/wiki/Near_sets. See, also, http://wren.ee.umanitoba.ca

**** See, e.g., Rosenfeld, A.P., Picture processing by computer, ACM Computing Surveys 1, 3 (1969), 147-176.

***** Pal, S.K., A note on the quantitative measure of image enhancement through fuzziness, IEEE Trans. on Pat. Anal. & Machine Intelligence 4, 2 (1982), 204-208.

****** Mrózek, A., Plonka, L., Rough sets in image analysis, Foundations of Computing and Decision Sciences 18, 3-4 (1993), 268-273.
recognition*******. This volume fully reflects the diversity and richness of rough fuzzy image analysis both in terms of its underlying set theories as well as its diverse methods and applications. From the lead chapter by J.F. Peters and S.K. Pal, it can be observed that fuzzy sets, near sets and rough sets are, in fact, instances of different incarnations of Cantor sets. These three types of Cantor sets provide a foundation for what A. Rosenfeld points to as the stages in pictorial pattern recognition, i.e., image transformation, feature extraction and classification. The chapters by P. Maji and S.K. Pal on rough-fuzzy clustering, D. Malyszko and J. Stepaniuk on rough-fuzzy measures, and by A.E. Hassanien, H. Al-Qaheri, A. Abraham on rough-fuzzy clustering for segmentation point to the utility of hybrid approaches that combine fuzzy sets and rough sets in image analysis. The chapters by D. Sen, S.K. Pal on rough set-based image thresholding, H. Fashandi, J.F. Peters on rough set-based mathematical morphology as well as an image partition topology and M.M. Mushrif, A.K. Ray on image segmentation illustrate how image analysis can be carried out with rough sets by themselves. Tolerance spaces and a perceptual approach in image analysis can be found in the chapters by C. Henry, A.H. Meghdadi, J.F. Peters, S. Shahfar, and S. Ramanna (these papers carry forward the work on visual perception by J.H. Poincaré and E.C. Zeeman). A rich harvest of applications of rough fuzzy image analysis can be found in the chapters by A.E. Hassanien, H. Al-Qaheri, A. Abraham, W. Tarnawski, G. Schaefer, T. Nakashima, L. Miroslaw, C. Henry, S. Shahfar, A.H. Meghdadi and S. Ramanna. Finally, a complete, downloadable implementation of near sets in image analysis called NEAR is presented by C. Henry. The Editors of this volume extend their profound gratitude to the many reviewers for their generosity and many helpful comments concerning the chapters in this volume.
Every chapter was extensively reviewed and revised before final acceptance. We also received many helpful suggestions from the reviewers of the original proposal for this CRC Press book. In addition, we are very grateful for the help that we have received from S. Kumar, A. Rodriguez, R.B. Stern, S.K. White, J. Vakili and others at CRC Press during the preparation of this volume. The editors of this volume have been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) research grant 185986, Manitoba Centre of Excellence Fund (MCEF) grant, Canadian Network of Excellence (NCE) and Canadian Arthritis Network (CAN) grant SRI-BIO-05, and the J.C. Bose Fellowship of the Government of India.

March 2010
Sankar K. Pal
James F. Peters

******* See, e.g., Gupta, S., Patnik, S., Enhancing performance of face recognition by using the near set approach for selecting facial features, J. Theor. Appl. Inform. Technol. 4, 5 (2008), 433-441; Henry, C., Peters, J.F., Perception-based image analysis, Int. J. Bio-Inspired Comp. 2, 2 (2009), in press; Peters, J.F., Tolerance near sets and image correspondence, Int. J. of Bio-Inspired Computation 1(4) (2009), 239-245; Peters, J.F., Corrigenda and addenda: Tolerance near sets and image correspondence, Int. J. of Bio-Inspired Computation 2(5) (2010), in press; Ramanna, S., Perceptually near Pawlak partitions, Transactions on Rough Sets XII, 2010, in press; Ramanna, S., Meghdadi, A., Measuring resemblances between swarm behaviours: A perceptual tolerance near set approach, Fundamenta Informaticae 95(4), 2009, 533-552.
Table of Contents

1 Cantor, Fuzzy, Near, and Rough Sets in Image Analysis
  James F. Peters and Sankar K. Pal
2 Rough-Fuzzy Clustering Algorithm for Segmentation of Brain MR Images
  Pradipta Maji and Sankar K. Pal
3 Image Thresholding using Generalized Rough Sets
  Debashis Sen and Sankar K. Pal
4 Mathematical Morphology and Rough Sets
  Homa Fashandi and James F. Peters
5 Rough Hybrid Scheme: An application of breast cancer imaging
  Aboul Ella Hassanien, Hameed Al-Qaheri, Ajith Abraham
6 Applications of Fuzzy Rule-based Systems in Medical Image Understanding
  Wojciech Tarnawski, Gerald Schaefer, Tomoharu Nakashima and Lukasz Miroslaw
7 Near Set Evaluation And Recognition (NEAR) System
  Christopher Henry
8 Perceptual Systems Approach to Measuring Image Resemblance
  Amir H. Meghdadi and James F. Peters
9 From Tolerance Near Sets to Perceptual Image Analysis
  Shabnam Shahfar, Amir H. Meghdadi and James F. Peters
10 Image Segmentation: A Rough-set Theoretic Approach
  Milind M. Mushrif and Ajoy K. Ray
11 Rough Fuzzy Measures in Image Segmentation and Analysis
  Dariusz Malyszko and Jaroslaw Stepaniuk
12 Discovering Image Similarities. Tolerance Near Set Approach
  Sheela Ramanna
Index
1 Cantor, Fuzzy, Near, and Rough Sets in Image Analysis
James F. Peters Computational Intelligence Laboratory, Electrical & Computer Engineering, Rm. E2-390 EITC Bldg., 75A Chancellor’s Circle, University of Manitoba, Winnipeg R3T 5V6 Manitoba Canada
Sankar K. Pal Machine Intelligence Unit, Indian Statistical Institute, Kolkata, 700 108, India
1.1 Introduction
1.2 Cantor Set
1.3 Near Sets
  Near Sets and Rough Sets • Basic Near Set Approach • Near Sets, Psychophysics and Merleau-Ponty • Visual Acuity Tolerance • Sets of Similar Images • Tolerance Near Sets • Near Sets in Image Analysis
1.4 Fuzzy Sets
  Notion of a Fuzzy Set • Near Fuzzy Sets • Fuzzy Sets in Image Analysis
1.5 Rough Sets
  Sample Non-Rough Set • Sample Rough Set • Rough Sets in Image Analysis
1.6 Conclusion
Acknowledgements
Bibliography

1.1
Introduction
The chapters in this book consider how one might utilize fuzzy sets, near sets, and rough sets, taken separately or taken together in hybridizations, in solving a variety of problems in image analysis. A brief consideration of Cantor sets (Cantor, 1883, 1932) provides a backdrop for an understanding of several recent types of sets useful in image analysis. Fuzzy, near and rough sets provide a wide spectrum of practical solutions to image analysis problems such as image understanding, image pattern recognition, image retrieval and image correspondence, mathematical morphology, perceptual tolerance relations in image analysis and segmentation evaluation. Fuzzy sets result from the introduction of a membership function that generalizes the traditional characteristic function. The notion of a fuzzy set was introduced by L. Zadeh in 1965 (Zadeh, 1965). Sixteen years later, in 1981, rough sets were introduced by Z. Pawlak (Pawlak, 1981a). A set is considered rough whenever the boundary between its lower and upper approximation is non-empty. Of the three forms of sets, near sets are the newest, introduced in 2007 by J.F. Peters in a perception-based approach to the study of the nearness of observable objects in a physical continuum (Peters and Henry, 2006; Peters, 2007c,a; Peters, Skowron, and Stepaniuk, 2007; Henry and Peters, 2009b; Peters and Wasilewski, 2009; Peters, 2010). This chapter highlights a context for three forms of sets that are now part of the computational intelligence spectrum of tools useful in image analysis and pattern recognition. The principal contribution of this chapter is an overview of the high utility of fuzzy sets, near sets and rough sets with the emphasis on how these sets can be used in image analysis, especially in classifying parts of digital images presented in this book.
1.2
Cantor Set
To establish a context for the various sets utilized in this book, this section briefly presents the notion of a Cantor set. From the definition of a Cantor set, it is pointed out that fuzzy sets, near sets and rough sets are special forms of Cantor sets. In addition, this chapter points to links between the three types of sets that are part of the computational intelligence spectrum. Probe functions in near set theory provide a link between fuzzy sets and near sets, since every fuzzy membership function is a particular form of probe function. Probe functions are real-valued functions introduced by M. Pavel in 1993 as part of a study of image registration and a topology of images (Pavel, 1993). Z. Pawlak originally thought of a rough set as a new form of fuzzy set (Pawlak, 1981a). It has been shown that every rough set is a near set (this is Theorem 4.8 in (Peters, 2007b)) but not every near set is a rough set. For this reason, near sets are considered a generalization of rough sets. The contribution of this chapter is an overview of the links between fuzzy sets, near sets and rough sets as well as the relation between these sets and the original notion of a set introduced by Cantor in 1883 (Cantor, 1883).

By a ‘manifold’ or ‘set’ I understand any multiplicity, which can be thought of as one, i.e., any aggregate [Inbegriff] of determinate elements which can be united into a whole by some law.
–Foundations of a General Theory of Manifolds, G. Cantor, 1883.

. . . A set is formed by the grouping together of single objects into a whole.
–Set Theory, F. Hausdorff, 1914.
In this mature interpretation of the notion of a set, G. Cantor points to a property or law that determines elementhood in a set and “unites [the elements] into a whole” (Cantor, 1883), elaborated in (Cantor, 1932), and commented on in Lavine (1994). In 1851, Bolzano (Bolzano, 1959) writes that “an aggregate so conceived that is indifferent to the arrangement of its members I call a set”. At that time, the idea that a set could contain just one element or no elements (null set) was not contemplated. This is important in the current conception of a near set, since such a set must contain pairs of perceptual objects with similar descriptions and such a set is never null. That is, a set is a perceptual near set if, and only if, it is never empty and it contains pairs of perceived objects that have descriptions that are within some tolerance of each other (see Def. 2).
1.3
Near Sets

How Near
How near to the bark of a tree are drifting snowflakes,
swirling gently round, down from winter skies?
How near to the ground are icicles,
slowly forming on window ledges?
–Fragment of a Philosophical Poem, Z. Pawlak & J.F. Peters, 2002.

The basic idea in the near set approach to object recognition is to compare object descriptions. Sets of objects X, Y are considered near each other if the sets contain objects with at least partial matching descriptions.
–Near sets. General theory about nearness of objects, J.F. Peters, 2007.
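TABLE 1.1 Nomenclature

Symbol | Interpretation
$O, X, Y$ | Set of perceptual objects, $X, Y \subseteq O$, $A \subset X$, $x \in X$, $y \in Y$
$F, B$ | Sets of probe functions, $B \subseteq F$, $\phi_i \in B$
$\phi_i(x)$ | $\phi_i : X \to \Re$, $i$th probe function representing feature of $x$
$\phi_B(x)$ | $(\phi_1(x), \phi_2(x), \ldots, \phi_i(x), \ldots, \phi_k(x))$, description of $x$ of length $k$
$\varepsilon$ | $\varepsilon \in \Re$ (reals) such that $\varepsilon \geq 0$
$\|\cdot\|_2$ | $= \left(\sum_{i=1}^{k} (\cdot_i)^2\right)^{1/2}$, $L_2$ (Euclidean) norm
$\cong_{B,\varepsilon}$ | $\{(x, y) \in O \times O : \|\phi(x) - \phi(y)\|_2 \leq \varepsilon\}$, tolerance relation
$\cong_B$ | shorthand for $\cong_{B,\varepsilon}$
$A \subset \cong_{B,\varepsilon}$ | $\forall x, y \in A$, $x \cong_{B,\varepsilon} y$ (i.e., $A_{\cong_{B,\varepsilon}}$ is a preclass in $\cong_{B,\varepsilon}$)
$C_{\cong_{B,\varepsilon}}$ | tolerance class, maximal preclass of $\cong_{B,\varepsilon}$
$X \bowtie_{B,\varepsilon} Y$ | $X$ resembles (is near) $Y \iff X \cong_{B,\varepsilon} Y$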
Set Theory Law 1 (Near Sets). Near sets contain elements with similar descriptions.

Near sets are disjoint sets that resemble each other (Henry and Peters, 2010). Resemblance between disjoint sets occurs whenever there are observable similarities between the objects in the sets. Similarity is determined by comparing lists of object feature values. Each list of feature values defines an object’s description. Comparison of object descriptions provides a basis for determining the extent to which disjoint sets resemble each other. Objects that are perceived as similar based on their descriptions are grouped together. These groups of similar objects can provide information and reveal patterns about objects of interest in the disjoint sets. For example, collections of digital images viewed as disjoint sets of points provide a rich hunting ground for near sets. Near sets can be found, for instance, in the favite pentagona coral fragment in Fig. 1.1a from a coral reef near Japan. If we consider the greyscale level, the sets X, Y in Fig. 1.1b are near sets, since there are many pixels in X with grey levels that are very similar to pixels in Y.
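The greyscale comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each pixel is reduced to a single integer grey level in 0..255, two grey levels count as similar when they differ by at most a threshold eps, and the function names similar_pairs and resemble are hypothetical and do not come from the chapter.

```python
def similar_pairs(X, Y, eps):
    """Cross pairs (x, y), x from X and y from Y, whose grey levels
    differ by at most eps (assumed similarity criterion)."""
    return [(x, y) for x in X for y in Y if abs(x - y) <= eps]


def resemble(X, Y, eps):
    """X and Y are treated as near if at least one similar cross pair exists."""
    return len(similar_pairs(X, Y, eps)) > 0


# Hypothetical grey levels sampled from two disjoint image regions.
X = [12, 40, 200]
Y = [14, 90, 198]

pairs = similar_pairs(X, Y, eps=3)
print(pairs)                   # pairs of grey levels that are within 3 of each other
print(resemble(X, Y, eps=3))
```

With eps = 0 only exact matches count, so the choice of threshold decides how much of the resemblance between the two regions is detected.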
1.3.1
Near Sets and Rough Sets
Near sets are a generalization of rough sets. It has been shown that every rough set is, in fact, a near set but not every near set is a rough set (Peters, 2007b). Near set theory originated from an interest in comparing similarities between digital images. Unlike rough sets, the near set approach does not require set approximation (Peters and Wasilewski, 2009). Simple examples of near sets can sometimes be found in tolerance classes in pairs of image coverings, if, for instance, a subimage of a class in one image has a description that is similar to the description of a subimage in a class in the second image. In general, near sets are discovered by discerning objects, either within a single set or across sets, with descriptions that are similar. From the beginning, the near set approach to perception has had direct links to rough sets in its approach to the perception of objects (Pawlak, 1981a; Orłowska, 1982) and the classification of objects (Pawlak, 1981a; Pawlak and Skowron, 2007c,b,a). This is evident in the early work on nearness of objects and the extension of the approximation space model (see, e.g., (Peters and Henry, 2006; Peters et al., 2007)). Unlike the focus on the approximation boundary of a set, the study of near sets focuses on the discovery of affinities between perceptual granules such as digital images viewed as sets of points. In the context of near sets, the term affinity means close relationship between perceptual granules (particularly images) based on common description. Affinities are discovered by comparing the descriptions of perceptual granules, e.g., descriptions of objects contained in classes found in coverings defined by the tolerance relation $\cong_{F,\varepsilon}$.

FIGURE 1.1: Sample Near Sets. (1.1a) favite coral; (1.1b) near sets.
1.3.2
Basic Near Set Approach
Near set theory provides methods that can be used to extract resemblance information from objects contained in disjoint sets, i.e., it provides a formal basis for the observation, comparison, and classification of objects. The discovery of near sets begins with choosing the appropriate method to describe observed objects. This is accomplished by the selection of probe functions representing observable object features. A basic model for a probe function was introduced by M. Pavel (Pavel, 1993) in the context of image registration and image classification. In near set theory, a probe function is a mapping from an object to a real number representing an observable feature value (Peters, 2007a). For example, when comparing fruit such as apples, the redness of an apple (observed object) can be described by a probe function representing colour, and the output of the probe function is a number representing the degree of redness. Probe functions provide a basis for describing and discerning affinities between objects as well as between groups of similar objects (Peters and Ramanna, 2009). Objects that have, in some degree, affinities are considered near each other. Similarly, groups of objects (i.e., sets) that have, in some degree, affinities are also considered near each other.
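To make the idea of a probe function concrete, the following Python sketch treats an observed object as an RGB triple and defines two probes on it. The probes redness and grey, the sample colours, and the helper names are all assumptions made for illustration; the chapter does not prescribe these particular functions. The description of an object is the tuple of its probe values, and descriptions are compared with the Euclidean norm.

```python
import math


def redness(pixel):
    """Hypothetical probe: share of the red channel in the total intensity."""
    r, g, b = pixel
    total = r + g + b
    return r / total if total else 0.0


def grey(pixel):
    """Hypothetical probe: average intensity scaled to [0, 1]."""
    r, g, b = pixel
    return (r + g + b) / (3 * 255)


def description(obj, probes):
    """Description of obj: the tuple of its probe function values."""
    return tuple(phi(obj) for phi in probes)


def distance(x, y, probes):
    """Euclidean distance between the descriptions of x and y."""
    return math.dist(description(x, probes), description(y, probes))


B = [redness, grey]          # selected probe functions
apple_a = (200, 30, 40)      # illustrative RGB values, not measured data
apple_b = (190, 35, 45)
lime = (60, 200, 50)

print(description(apple_a, B))
print(distance(apple_a, apple_b, B) < distance(apple_a, lime, B))
```

Choosing a different set B of probes changes the descriptions and therefore which objects are perceived as near each other, which is why probe selection is the first step in discovering near sets.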
1.3.3
Near Sets, Psychophysics and Merleau-Ponty
Near sets offer an ideal framework for solving problems based on human perception that arise in areas such as image processing, computer vision as well as engineering and science problems. In near set theory, perception is a combination of the view of perception in psychophysics (Hoogs, Collins, Kaucic, and Mundy, 2003; Bourbakis, 2002) with a view of perception found in Merleau-Ponty’s work (Merleau-Ponty, 1945, 1965). In the context of psychophysics, perception of an object (i.e., in effect, our knowledge about an object) depends on sense inputs that are the source of signal values (stimulations) in the cortex of the brain. In this view of perception, the transmissions of sensory inputs to cortex cells are likened to probe functions defined in terms of mappings of sets of sensed objects to sets of real values representing signal values (the magnitude of each cortex signal value represents a sensation) that are a source of object feature values assimilated by the mind. Perception in animals is modelled as a mapping from sensory cells to brain cells. For example, visual perception is modelled as a mapping from stimulated retina sensory cells to visual cortex cells (see Fig. 1.2). Such mappings are called probe functions. A probe measures observable physical characteristics of objects in our environment. In other words, a probe function provides a basis for what is commonly called feature extraction (Guyon, Gunn, Nikravesh, and Zadeh, 2006). The sensed physical characteristics of an object are identified with object features. The term feature is used in S. Watanabe’s sense of the word (Watanabe, 1985), i.e., a feature corresponds to an observable property of physical objects. Each feature has a 1-to-many relationship to real-valued functions called probe functions representing the feature. For each feature (such as colour) one or more probe functions can be introduced to represent the feature (such as grayscale, or RGB values).
Objects and sets of probe functions form the basis of near set theory and are sometimes referred to as perceptual objects due to the focus on assigning values to perceived object features.

Axiom 1. An object is perceivable if, and only if, the object is describable.

In Merleau-Ponty’s view (Merleau-Ponty, 1945, 1965), an object is perceived to the extent that it can be described. In other words, object description goes hand-in-hand with object perception. It is our mind that identifies relationships between object descriptions to form perceptions of sensed objects. It is also the case that near set theory has been proven to be quite successful in finding solutions to perceptual problems such as measuring image correspondence and segmentation evaluation.

FIGURE 1.2: Sample Visual Perception

The notion of a sensation in Poincaré (Poincaré, 1902) and a physical model for a probe function from near set theory (Peters and Wasilewski, 2009; Peters, 2010) is implicitly explained by Zeeman (Zeeman, 1962) in terms of visual perception. That is, ‘seeing’ consists of mappings from sense inputs from sensory units in the retina of the eye to cortex cells of the brain stimulated by sense inputs. A sense input can be represented by a number representing the intensity of the light from the visual field (i.e., everything in the physical world that causes light to fall on the retina) impacting on the retina. The intensity of light from the visual field will determine the level of stimulation of a cortex cell from retina sensory input. Over time, varying cortex cell stimulation has the appearance of an electrical signal. The magnitude of cortex cell stimulation is a real value. The combination of an activated sensory cell in the retina and resulting retina-originated impulses sent to cortex cells (visual stimulation) is likened to what Poincaré calls a sensation in his essay on separate sets of similar sensations leading to a perception of a physical continuum (Poincaré, 1902). This model for a sensation underlies what is known as a probe function in near set theory (Peters, 2007b; Peters and Wasilewski, 2009).

DEFINITION 1.1
Visual Probe Function. Let O = {perceptual objects}. A perceptual object is something in the visual field that is a source of reflected light. Let ℜ denote the set of reals. Then a probe φ is a mapping φ : X → ℜ. For x ∈ X, φ(x) denotes an amplitude in a visual perception (see, e.g., Fig. 1.2).

In effect, a probe function value φ(x) measures the strength of a feature value extracted from each sensation. In Poincaré, sets of sensations are grouped together because they are, in some sense, similar within a specified distance, i.e., tolerance. Implicit in this idea in Poincaré is the perceived feature value of a particular sensation that makes it possible for us to measure the closeness of an individual sensation to other sensations. A human sensation modelled as a probe measures observable physical characteristics of objects in our environment. The sensed physical characteristics of an object are identified with object features. In Merleau-Ponty’s view, an object is perceived to the extent that it can be described (Merleau-Ponty, 1945, 1965). In other words, object description goes hand-in-hand with object perception. It is our mind that identifies relationships between object descriptions to form perceptions of sensed objects. It is also the case that near set theory has been proven to be quite successful in finding solutions to perceptual problems such as measuring image correspondence and segmentation evaluation.

Axiom 2. Formulate object description to achieve object perception.

In a more recent interpretation of the notion of a near set, the nearness of sets is considered in the context of perceptual systems (Peters and Wasilewski, 2009). Poincaré’s idea of perception of objects such as digital images in a physical continuum can be represented by means of perceptual systems, which is akin to but not the same as what has been called a perceptual information system (Peters and Wasilewski, 2009; Peters, 2010).
A perceptual system is a pair $\langle O, F \rangle$, where O is a non-empty set of perceptual objects and F is a non-empty, countable set of probe functions (see Def. 1).

Definition 1. Perceptual System (Peters, 2010). A perceptual system $\langle O, F \rangle$ consists of a sample space O containing a finite, non-empty set of sensed sample objects and a non-empty, countable set F containing probe functions representing object features.

The perception of physical objects and their description within a perceptual system facilitates pattern recognition and the discovery of sets of similar objects. In the near set approach to image analysis, one starts by identifying a perceptual system and then defining a cover on the sample space with an appropriate perceptual tolerance relation.

Method 1. Perceptual Tolerance
1. Identify a sample space O and a set F to formulate a perceptual system $\langle O, F \rangle$, and then
2. introduce a tolerance relation $\tau_\varepsilon$ that defines a cover on O.
1.3.4 Visual Acuity Tolerance
Zeeman (Zeeman, 1962) introduces a tolerance space (X, τε), where X is the visual field of the right eye and ε is the least angular distance so that all points indistinguishable from x ∈ X are within ε of x. In this case, there is an implicit perceptual system ⟨O, F⟩, where O := X consists of points that are sources of reflected light in the visual field and F contains probes used to extract feature values from each x ∈ O.
1.3.5 Sets of Similar Images
Consider ⟨O, F⟩, where O consists of points representing image pixels and F contains probes used to extract feature values from each x ∈ O. Let B ⊆ F. Then introduce the tolerance relation ≅_{B,ε} to define covers on X, Y ⊂ O. In the case where X, Y resemble each other, i.e., X ⋈_{B,ε} Y, measure the degree of similarity (nearness) of X, Y (a publicly available toolset that makes it possible to complete this example for any set of digital images is available at (Henry and Peters, 2010, 2009a)). See Table 1.1 (also, (Peters and Wasilewski, 2009; Peters, 2009b, 2010)) for details about the bowtie notation ⋈_{B,ε} used to denote resemblance between X and Y, i.e., X ⋈_{B,ε} Y.
FIGURE 1.3: Lena Tolerance Near Sets (TNS). (1.3a) Lena; (1.3b) Lena TNS.
1.3.6 Tolerance Near Sets
In near set theory, the trivial case is excluded. That is, an element x ∈ X is not considered near itself. In addition, the empty set is excluded from near sets, since the empty set is never something that we perceive, i.e., a set of perceived objects is never empty. In the case where one set X is near another set Y, this leads to the realization that there is a third set containing pairs of elements (x, y) ∈ X × Y with similar descriptions. The key to an understanding of near sets is the notion of a description. The description of each perceived object is specified by a vector of feature values, and each feature is represented by what is known as a probe function that maps an object to a real value. Since our main interest is in detecting similarities between seemingly quite disjoint sets such as subimages in an image or pairs of classes in coverings on a pair of images, a near set is defined in the context of a tolerance space.
FIGURE 1.4: Photographer Tolerance Near Sets. (1.4a) Photographer; (1.4b) Photographer TNS.
Definition 2 Tolerance Near Sets (Peters, 2010) Let ⟨O, F⟩ be a perceptual system. Put ε ∈ ℜ, B ⊂ F. Let X, Y ⊂ O denote disjoint sets with coverings determined by a tolerance relation ≅_{B,ε}. Sets X, Y are tolerance near sets if, and only if there are preclasses A ⊂ X, B ⊂ Y such that A ⋈_{B,ε} B.
1.3.7 Near Sets in Image Analysis
The subimages in Fig. 1.3b and Fig. 1.4b delineate tolerance classes (each with its own grey level), i.e., subregions of the original images in Fig. 1.3a and Fig. 1.4a. The tolerance classes in these images are dominated by light grey, medium grey and dark grey subimages, along with a few very dark subimages in Fig. 1.3b and many very dark subimages in Fig. 1.4b. From Def. 2, it can be observed that the images in Fig. 1.3a and Fig. 1.4a are examples of tolerance near sets, i.e., Image_{Fig. 1.4a} ⋈_{F,ε} Image_{Fig. 1.3a}. Examples of the near set approach to image analysis can be found in, e.g., (Henry and Peters, 2007, 475-482, 2008, 1-6, 2009a; Gupta and Patnaik, 2008; Peters, 2009a,b, 2010; Peters and Wasilewski, 2009; Peters and Puzio, 2009; Hassanien, Abraham, Peters, Schaefer, and Henry, 2009; Meghdadi, Peters, and Ramanna, 2009; Fashandi, Peters, and Ramanna, 2009) and in a number of chapters of this book. From set composition Law 1, near sets are Cantor sets containing one or more pairs of objects (e.g., image patches, one from each digital image) that resemble each other as enunciated in Def. 2, i.e., X, Y ⊂ O are near sets if, and only if X ⋈_{F,ε} Y.
1.4 Fuzzy Sets
A fuzzy set is a class of objects with a continuum of grades of membership.
–Fuzzy sets, Information and Control 8
–L.A. Zadeh, 1965.
. . . A fuzzy set is characterized by a membership function which assigns to each object its grade of membership (a number lying between 0 and 1) in the fuzzy set.
–A new view of system theory
–L.A. Zadeh, 20-21 April 1965.
Set Theory Law 2 Fuzzy Sets Every element in a fuzzy set has a graded membership.
1.4.1 Notion of a Fuzzy Set
The notion of a fuzzy set was introduced by L.A. Zadeh in 1965 (Zadeh, 1965). In effect, a Cantor set is a fuzzy set if, and only if every element of the set has a grade of membership assigned to it by a specified membership function. Notice that a membership function φ : X → [0, 1] is a special case of what is known as a probe function in near set theory.
1.4.2 Near Fuzzy Sets
A fuzzy set X is a near set relative to a set Y if the grade of membership of the objects in sets X, Y is assigned to each object by the same membership function φ and there is at least one pair of objects (x, y) ∈ X × Y such that ‖φ(x) − φ(y)‖₂ ≤ ε, i.e., the description of x is similar to the description of y within some ε.
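The nearness condition for fuzzy sets can be sketched in a few lines of Python. This is a minimal illustration, not code from the cited toolset: the function names, the `brightness` membership function and the sample grey levels are all assumptions introduced here. Because φ is scalar-valued, the L2 norm reduces to an absolute difference.

```python
from itertools import product

def near_fuzzy_pairs(X, Y, phi, eps):
    """Return every pair (x, y) in X x Y with |phi(x) - phi(y)| <= eps."""
    return [(x, y) for x, y in product(X, Y) if abs(phi(x) - phi(y)) <= eps]

def is_near(X, Y, phi, eps):
    """X is near Y if at least one pair has similar membership grades."""
    return len(near_fuzzy_pairs(X, Y, phi, eps)) > 0

def brightness(grey_level):
    """Hypothetical membership function: normalised 8-bit grey level."""
    return grey_level / 255.0

# Hypothetical sets of pixel grey levels taken from two images.
X = [10, 40, 200]
Y = [90, 205, 250]
pairs = near_fuzzy_pairs(X, Y, brightness, eps=0.05)
```

With these sample values only the grey levels 200 and 205 fall within ε = 0.05 of each other, so the two sets are near at that tolerance and not near at a much tighter one such as ε = 0.01.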
1.4.3 Fuzzy Sets in Image Analysis
Fuzzy sets have been widely used in image analysis (see, e.g., (Rosenfeld, 1979; Pal and King, 1980, 1981; Pal, 1982; Pal, King, and Hashim, 1983; Pal, 1986, 1992; Pal and Leigh, 1995; Pal and Mitra, 1996; Nachtegael and Kerre, 2001; Deng and Heijmans, 2002; Martino, Sessa, and Nobuhara, 2008; Sussner and Valle, 2008; Hassanien et al., 2009)). Using the notion of fuzzy sets, (Pal and King, 1980, 1981) defined an image of dimension M × N with L levels as an array of fuzzy singletons, each with a membership function value denoting the degree of having brightness or some property relative to some brightness level l, where l = 0, 1, 2, . . . , L − 1. The literature on fuzzy image analysis is based on the realization that the basic concepts of edge, boundary, region, and relation in an image do not lend themselves to precise definition. From set composition Law 2, it can be observed that fuzzy sets are Cantor sets.
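The idea of an image as an array of fuzzy singletons can be illustrated with a short Python sketch. The linear membership used below (grey level divided by the top level L − 1) is an assumption chosen for simplicity; it is not claimed to be the membership function used in the cited papers, and the function name `fuzzy_image` is hypothetical.

```python
def fuzzy_image(image, levels):
    """Map an M x N grey-level image to an M x N array of memberships in [0, 1].

    Assumption: membership is the grey level relative to the top level L - 1.
    """
    top = levels - 1
    return [[pixel / top for pixel in row] for row in image]

# Hypothetical 2 x 2 image with L = 256 grey levels.
image = [[0, 64], [128, 255]]
mu = fuzzy_image(image, levels=256)
```

Each entry of `mu` pairs a pixel position with its grade of brightness, which is the sense in which the image becomes an array of fuzzy singletons.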
1.5 Rough Sets
A new approach to classification, based on information systems theory, given in this paper. . . . This approach leads to a new formulation of the notion of fuzzy sets (called here the rough sets). The axioms for such sets are given, which are the same as the axioms of topological closure and interior.
–Classification of objects by means of attributes.
–Z. Pawlak, 1981.
TABLE 1.2 Pawlak Indiscernibility Relation and Partition Symbols
Symbol      Interpretation
∼_B         ∼_B = {(x, y) ∈ X × X | f(x) = f(y) ∀ f ∈ B}, indiscernibility, cf. (Pawlak, 1981a)
x/∼_B       x/∼_B = {y ∈ X | y ∼_B x}, elementary set (class)
U/∼_B       U/∼_B = {x/∼_B | x ∈ U}, quotient set
B_*(X)      B_*(X) = ⋃ {x/∼_B : x/∼_B ⊆ X} (lower approximation of X)
B^*(X)      B^*(X) = ⋃ {x/∼_B : x/∼_B ∩ X ≠ ∅} (upper approximation of X)
Set Theory Law 3 Rough Sets Any non-empty set X is a rough set if, and only if the approximation boundary of X is not empty.
Rough sets were introduced by Z. Pawlak in (Pawlak, 1981a) and elaborated in (Pawlak, 1981b; Pawlak and Skowron, 2007c,b,a). In a rough set approach to classifying sets of objects X, one considers the size of the boundary region in the approximation of X. By contrast, in a near set approach to classification, one does not consider the boundary region of a set. In particular, assume that X is a non-empty set belonging to a universe U and that F is a set of features defined either by total or partial functions. The lower approximation of X relative to B ⊆ F is denoted by B_*(X) and the upper approximation of X is denoted by B^*(X), where
\[ B_*(X) = \bigcup_{x/\sim_B \,\subseteq\, X} x/\sim_B, \qquad B^*(X) = \bigcup_{x/\sim_B \,\cap\, X \neq \emptyset} x/\sim_B. \]
The B-boundary region of an approximation of a set X is denoted by Bnd_B(X), where
\[ \mathrm{Bnd}_B(X) = B^*(X) \setminus B_*(X) = \{x \mid x \in B^*(X) \text{ and } x \notin B_*(X)\}. \]
Definition 3 Rough Set (Pawlak, 1981a) A non-empty, finite set X is a rough set if, and only if |B^*(X) − B_*(X)| ≠ 0.
A set X is roughly classified whenever Bnd_B(X) is not empty. In other words, X is a rough set whenever the boundary region Bnd_B(X) ≠ ∅. In sum, a rough set is a Cantor set if, and only if its approximation boundary is non-empty. It should also be noted that rough sets differ from near sets, since near sets are defined without reference to an approximation boundary region. This means, for example, with near sets the image correspondence problem can be solved without resorting to set approximation.
Method 2 Rough Set Approach
1. Let (U, B) denote a sample space (universe) U and set of object features B,
2. Using relation ∼_B, partition the universe U,
3. Determine the size of the boundary of a set X.
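The three steps of the rough set approach can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names, the universe of eight grey levels and the single quantising probe are hypothetical and are not taken from the cited works.

```python
from collections import defaultdict

def partition(universe, features):
    """Group objects with identical feature values (the classes x/~B)."""
    classes = defaultdict(set)
    for x in universe:
        classes[tuple(f(x) for f in features)].add(x)
    return list(classes.values())

def approximations(universe, features, X):
    """Return (lower, upper, boundary) of X relative to the feature set."""
    X = set(X)
    lower, upper = set(), set()
    for cls in partition(universe, features):
        if cls <= X:        # class lies entirely inside X
            lower |= cls
        if cls & X:         # class has a non-empty intersection with X
            upper |= cls
    return lower, upper, upper - lower

# Hypothetical universe of grey levels 0..7 and one probe that quantises
# them into the classes {0,1}, {2,3}, {4,5}, {6,7}.
U = range(8)
B = [lambda g: g // 2]
lower, upper, boundary = approximations(U, B, {0, 1, 2})
```

Here the set {0, 1, 2} cuts through the class {2, 3}, so its boundary is non-empty and the set is rough, whereas {0, 1} is a single elementary set with an empty boundary.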
1.5.1 Sample Non-Rough Set
Let x ∈ U. x/∼B (any elementary set) is a non-rough set.
1.5.2 Sample Rough Set
Any set X ⊂ U where
\[ \bigcup_{x/\sim_B \,\subseteq\, X} x/\sim_B \neq X. \]
In other words, if a set X does not equal its lower approximation, then the set X is rough, i.e., roughly approximated by the equivalence classes in the quotient set U/∼_B.
1.5.3 Rough Sets in Image Analysis
The essence of our approach consists in viewing a digitized image as a universe of a certain information system and synthesizing an indiscernibility relation to identify objects and measure some of their parameters.
– Adam Mrozek and Leszek Plonka, 1993.
In terms of rough sets and image analysis, it can be observed that A. Mrózek and L. Plonka were pioneers (Mrózek and Plonka, 1993). For example, they were among the first to introduce a rough set approach to image analysis and to view a digital image as a universe, i.e., a set of points. The features of pixels (points) in a digital image are a source of knowledge discovery. Using Z. Pawlak's indiscernibility relation, it is then a straightforward task to partition an image and to consider set approximation relative to interesting objects contained in subsets of an image. This work on digital images by A. Mrózek and L. Plonka appeared six or more years before the publication of papers on approximate mathematical morphology by Lech Polkowski (Polkowski, 1999) (see, also, (Polkowski, 1993; Polkowski and Skowron, 1994)) and connections between mathematical morphology and rough sets pointed to by Isabelle Bloch (Bloch, 2000). The early work on the use of rough sets in image analysis has been followed by a number of articles by S.K. Pal and others (see, e.g., (Pal and Mitra, 2002; Pal, UmaShankar, and Mitra, 2005; Peters and Borkowski, 2004; Borkowski and Peters, 2006; Borkowski, 2007; Maji and Pal, 2008; Mushrif and Ray, 2008; Sen and Pal, 2009)). From set composition Law 3, it can be observed that rough sets are Cantor sets.
1.6 Conclusion
In sum, fuzzy sets, near sets and rough sets are particular forms of Cantor sets. In addition, each of these sets in the computational intelligence spectrum offers very useful approaches in image analysis, especially in classifying objects.
Acknowledgements This research by James Peters has been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) grant 185986, Manitoba Centre of Excellence Fund (MCEF) grant, Canadian Centre of Excellence (NCE) and Canadian Arthritis Network grant SRI-BIO-05, and Manitoba Hydro grant T277 and that of Sankar Pal has been supported by the J.C. Bose Fellowship of the Govt. of India.
Bibliography
Bloch, I. 2000. On links between mathematical morphology and rough sets. Pattern Recognition 33(9):1487–1496.
Bolzano, B. 1959. Paradoxien des unendlichen (paradoxes of the infinite), trans. by D.A. Steele. London: Routledge and Kegan Paul.
Borkowski, M. 2007. 2d to 3d conversion with direct geometrical search and approximation spaces. Ph.D. thesis, Dept. Elec. Comp. Engg. http://wren.ee.umanitoba.ca/.
Borkowski, M., and J.F. Peters. 2006. Matching 2d image segments with genetic algorithms and approximation spaces. Transactions on Rough Sets V(LNAI 4100):63–101.
Bourbakis, N. G. 2002. Emulating human visual perception for measuring difference in images using an spn graph approach. IEEE Transactions on Systems, Man, and Cybernetics, Part B 32(2):191–201.
Cantor, G. 1883. Über unendliche, lineare punktmannigfaltigkeiten. Mathematische Annalen 201:72–81.
Cantor, G. 1932. Gesammelte abhandlungen mathematischen und philosophischen inhalts, ed. E. Zermelo. Berlin: Springer.
Deng, T.Q., and H.J.A.M. Heijmans. 2002. Grey-scale morphology based on fuzzy logic. J. Math. Imag. Vis. 16:155–171.
Fashandi, H., J.F. Peters, and S. Ramanna. 2009. L2 norm length-based image similarity measures: Concrescence of image feature histogram distances. In Signal and image processing, int. assoc. of science & technology for development, 178–185. Honolulu, Hawaii.
Gupta, S., and K.S. Patnaik. 2008. Enhancing performance of face recognition systems by using near set approach for selecting facial features. J. Theoretical and Applied Information Technology 4(5):433–441.
Guyon, I., S. Gunn, M. Nikravesh, and L.A. Zadeh. 2006. Feature extraction. foundations and applications. Berlin: Springer.
Hassanien, A.E., A. Abraham, J.F. Peters, G. Schaefer, and C. Henry. 2009. Rough sets and near sets in medical imaging: A review. IEEE Trans. Info. Tech. in Biomedicine 13(6):955–968. Digital object identifier: 10.1109/TITB.2009.2017017.
Henry, C., and J.F. Peters. 2007, 475-482. Image pattern recognition using approximation spaces and near sets. In Proc. 11th int. conf. on rough sets, fuzzy sets, data mining and granular computing (rsfdgrc 2007), joint rough set symposium (jrs 2007), lecture notes in artificial intelligence 4482. Heidelberg, Germany.
Henry, C., and J.F. Peters. 2008, 1-6. Near set image segmentation quality index. In Geobia 2008 pixels, objects, intelligence. geographic object based image analysis for the 21st century. University of Calgary, Alberta.
Henry, C., and J.F. Peters. 2009a. Near set evaluation and recognition (near) system. Tech. Rep., Computational Intelligence Laboratory, University of Manitoba. UM CI Laboratory Technical Report No. TR-2009-015.
Henry, C., and J.F. Peters. 2009b. Perception-based image analysis. Int. J. of Bio-Inspired Computation 2(2). in press.
Henry, C., and J.F. Peters. 2010. Near sets. Wikipedia. http://en.wikipedia.org/wiki/Near_sets.
Hoogs, A., R. Collins, R. Kaucic, and J. Mundy. 2003. A common set of perceptual observables for grouping, figure-ground discrimination, and texture classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(4):458–474.
Lavine, S. 1994. Understanding the infinite. Cambridge, MA: Harvard University Press.
Maji, P., and S.K. Pal. 2008. Maximum class separability for rough-fuzzy c-means based brain mr image segmentation. Transactions on Rough Sets IX, LNCS-5390:114–134.
Martino, F.D., S. Sessa, and H. Nobuhara. 2008. Eigen fuzzy sets and image information retrieval. In Handbook of granular computing, ed. W. Pedrycz, A. Skowron, and V. Kreinovich, 863–872. West Sussex, England: John Wiley & Sons, Ltd.
Meghdadi, A.H., J.F. Peters, and S. Ramanna. 2009. Tolerance classes in measuring image resemblance. Intelligent Analysis of Images & Videos, KES 2009, Part II, Knowledge-Based and Intelligent Information and Engineering Systems, LNAI 5712, 127–134. ISBN 978-364-04591-2, doi 10.1007/978-3-642-04592-9_16.
Merleau-Ponty, Maurice. 1945, 1965. Phenomenology of perception. Paris and New York: Smith, Gallimard, Paris and Routledge & Kegan Paul. Trans. by Colin Smith.
Mrózek, A., and L. Plonka. 1993. Rough sets in image analysis. Foundations of Computing and Decision Sciences 18(3-4):268–273.
Mushrif, M., and A.K. Ray. 2008. Color image segmentation: Rough-set theoretic approach. Pattern Recognition Letters 29(4):483–493.
Nachtegael, M., and E.E. Kerre. 2001. Connections between binary, grayscale and fuzzy mathematical morphologies. Fuzzy Sets and Systems 124:73–85.
Orłowska, E. 1982. Semantics of vague concepts. applications of rough sets. Polish Academy of Sciences 469. In G. Dorn, P. Weingartner (Eds.), Foundations of Logic and Linguistics. Problems and Solutions, Plenum Press, London/NY, 1985, 465-482.
Pal, S.K. 1982. A note on the quantitative measure of image enhancement through fuzziness. IEEE Trans. Pattern Anal. Machine Intell. PAMI-4(2):204–208.
Pal, S.K. 1986. A measure of edge ambiguity using fuzzy sets. Pattern Recognition Letters 4(1):51–56.
Pal, S.K. 1992. Fuzziness, image information and scene analysis. In An introduction to fuzzy logic applications in intelligent systems, ed. R.R. Yager and L.A. Zadeh, 147–183. Dordrecht: Kluwer Academic Publishers.
Pal, S.K., and R.A. King. 1980. Image enhancement with fuzzy set. Electronics Letters 16(10):376–378.
Pal, S.K., and R.A. King. 1981. Image enhancement using smoothing with fuzzy set. IEEE Trans. Syst. Man and Cyberns. SMC-11(7):495–501.
Pal, S.K., R.A. King, and A.A. Hashim. 1983. Image description and primitive extraction using fuzzy sets. IEEE Trans. Syst. Man and Cyberns. SMC-13(1):94–100.
Pal, S.K., and A.B. Leigh. 1995. Motion frame analysis and scene abstraction: Discrimination ability of fuzziness measures. J. Intelligent & Fuzzy Systems 3:247–256.
Pal, S.K., and P. Mitra. 2002. Multispectral image segmentation using rough set initialized em algorithm. IEEE Transactions on Geoscience and Remote Sensing 11:2495–2501.
Pal, S.K., and S. Mitra. 1996. Noisy fingerprint classification using multi layered perceptron with fuzzy geometrical and textual features. Fuzzy Sets and Systems 80(2):121–132.
Pal, S.K., B. UmaShankar, and P. Mitra. 2005. Granular computing, rough entropy and object extraction. Pattern Recognition Letters 26(16):401–416.
Pavel, M. 1993. Fundamentals of pattern recognition. 2nd ed. N.Y., U.S.A.: Marcel Dekker, Inc.
Pawlak, Z. 1981a. Classification of objects by means of attributes. Polish Academy of Sciences 429.
Pawlak, Z. 1981b. Rough sets. International J. Comp. Inform. Science 11:341–356.
Pawlak, Z., and A. Skowron. 2007a. Rough sets and boolean reasoning. Information Sciences 177:41–73.
Pawlak, Z., and A. Skowron. 2007b. Rough sets: Some extensions. Information Sciences 177:28–40.
Pawlak, Z., and A. Skowron. 2007c. Rudiments of rough sets. Information Sciences 177:3–27.
Peters, J.F. 2007a. Near sets. general theory about nearness of objects. Applied Mathematical Sciences 1(53):2609–2629.
Peters, J.F. 2007b. Near sets. general theory about nearness of objects. Applied Mathematical Sciences 1(53):2609–2629.
Peters, J.F. 2007c. Near sets. special theory about nearness of objects. Fundamenta Informaticae 75(1-4):407–433.
Peters, J.F. 2009a. Discovering affinities between perceptual granules: L2 norm-based tolerance near preclass approach. In Man-machine interactions, advances in intelligent & soft computing 59, 43–55. The Beskids, Kocierz Pass, Poland.
Peters, J.F. 2009b. Tolerance near sets and image correspondence. Int. J. of Bio-Inspired Computation 1(4):239–245.
Peters, J.F. 2010. Corrigenda and addenda: Tolerance near sets and image correspondence. Int. J. Bio-Inspired Computation 2(5). in press.
Peters, J.F., and M. Borkowski. 2004. K-means indiscernibility relation over pixels. In Lecture notes in computer science 3066, ed. S. Tsumoto, R. Slowinski, K. Komorowski, and J.W. Gryzmala-Busse, 580–585. Berlin: Springer. Doi 10.1007/b97961.
Peters, J.F., and C. Henry. 2006. Reinforcement learning with approximation spaces. Fundamenta Informaticae 71:323–349.
Peters, J.F., and L. Puzio. 2009. Image analysis with anisotropic wavelet-based nearness measures. International Journal of Computational Intelligence Systems 3(2):1–17.
Peters, J.F., and S. Ramanna. 2009. Affinities between perceptual granules: Foundations and perspectives. In Human-centric information processing through granular modelling sci 182, ed. A. Bargiela and W. Pedrycz, 49–66. Berlin: Springer-Verlag.
Peters, J.F., A. Skowron, and J. Stepaniuk. 2007. Nearness of objects: Extension of approximation space model. Fundamenta Informaticae 79(3-4):497–512.
Peters, J.F., and P. Wasilewski. 2009. Foundations of near sets. Information Sciences. An International Journal 179:3091–3109. Digital object identifier: doi:10.1016/j.ins.2009.04.018.
Poincaré, H. 1902. La science et l'hypothèse. Paris: Ernest Flammarion. Later ed., Champs sciences, Flammarion, 1968 & Science and Hypothesis, trans. by J. Larmor, Walter Scott Publishing, London, 1905.
Polkowski, L. 1993. Mathematical morphology of rough sets. Bull. Polish Acad. Ser. Sci.Math, Warsaw: Polish Academy of Sciences.
Polkowski, L. 1999. Approximate mathematical morphology. rough set approach. Rough and Fuzzy Sets in Soft Computing, Berlin: Springer-Verlag.
Polkowski, L., and A. Skowron. 1994. Analytical morphology: Mathematical morphology of decision tables. Fundamenta Informaticae 27:255–271.
Rosenfeld, A. 1979. Fuzzy digital topology. Inform. Contrl 40(1):76–87.
Sen, D., and S. K. Pal. 2009. Histogram thresholding using fuzzy and rough means of association error. IEEE Trans. Image Processing 18(4):879–888.
Sussner, P., and M.E. Valle. 2008. Fuzzy associative memories and their relationship to mathematical morphology. In Handbook of granular computing, ed. W. Pedrycz, A. Skowron, and V. Kreinovich, 733–753. West Sussex, England: John Wiley & Sons, Ltd.
Watanabe, S. 1985. Pattern recognition: Human and mechanical. John Wiley & Sons: Chichester.
Zadeh, L.A. 1965. Fuzzy sets. Information and Control 8:338–353.
Zeeman, E.C. 1962. The topology of the brain and the visual perception. New Jersey: Prentice Hall. In K.M. Fort, Ed., Topology of 3-manifolds and Selected Topics, 240-256.
2 Rough-Fuzzy Clustering Algorithm for Segmentation of Brain MR Images
Pradipta Maji, Machine Intelligence Unit, Indian Statistical Institute, Kolkata, 700 108, India
Sankar K. Pal, Machine Intelligence Unit, Indian Statistical Institute, Kolkata, 700 108, India
2.1 Introduction
2.2 Fuzzy C-Means and Rough Sets
Fuzzy C-Means • Rough Sets
2.3 Rough-Fuzzy C-Means Algorithm
Objective Function • Cluster Prototypes • Details of the Algorithm
2.4 Pixel Classification of Brain MR Images
2.5 Segmentation of Brain MR Images
Feature Extraction • Selection of Initial Centroids
2.6 Experimental Results and Discussion
Haralick's Features Versus Proposed Features • Random Versus Discriminant Analysis Based Initialization • Comparative Performance Analysis
2.7 Conclusion
Acknowledgments
Bibliography
2.1 Introduction
Segmentation is a process of partitioning an image space into some non-overlapping meaningful homogeneous regions. The success of an image analysis system depends on the quality of segmentation (Rosenfeld and Kak, 1982). A segmentation method is supposed to find those sets that correspond to distinct anatomical structures or regions of interest in the image. In the analysis of medical images for computer-aided diagnosis and therapy, segmentation is often required as a preliminary stage. However, medical image segmentation is a complex and challenging task due to the intrinsic nature of the images. The brain has a particularly complicated structure and its precise segmentation is very important for detecting tumors, edema, and necrotic tissues, in order to prescribe appropriate therapy (Suetens, 2002).
In medical imaging technology, a number of complementary diagnostic tools such as X-ray computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) are available. MRI is an important diagnostic imaging technique for the early detection of abnormal changes in tissues and organs. Its unique advantage over other modalities is that it can provide multispectral images of tissues with a variety of contrasts based on the three MR parameters ρ, T1, and T2. Therefore, the majority of research in medical image segmentation concerns MR images (Suetens, 2002).
Conventionally, the brain MR images are interpreted visually and qualitatively by radiologists. Advanced research requires quantitative information, such as the size of the brain ventricles after a traumatic brain injury or the relative volume of ventricles to brain. Fully automatic methods sometimes fail, producing incorrect results and requiring the intervention of a human operator. This is often true due to restrictions imposed by image acquisition, pathology and biological variation. So, it is important to have a faithful method to measure various structures in the brain. One such method is the segmentation of images to isolate objects and regions of interest.
Many image processing techniques have been proposed for MR image segmentation, most notably thresholding (Lee, Hun, Ketter, and Unser, 1998; Maji, Kundu, and Chanda, 2008), region-growing (Manousakes, Undrill, and Cameron, 1998), edge detection (Singleton and Pohost, 1997), pixel classification (Pal and Pal, 1993; Rajapakse, Giedd, and Rapoport, 1997) and clustering (Bezdek, 1981; Leemput, Maes, Vandermeulen, and Suetens, 1999; Wells III, Grimson, Kikinis, and Jolesz, 1996). Some algorithms using the neural network approach have also been investigated for MR image segmentation problems (Cagnoni, Coppini, Rucci, Caramella, and Valli, 1993; Hall, Bensaid, Clarke, Velthuizen, Silbiger, and Bezdek, 1992).
One of the main problems in medical image segmentation is uncertainty. Some of the sources of this uncertainty include imprecision in computations and vagueness in class definitions. Against this background, the possibility concept introduced by fuzzy set theory (Zadeh, 1965) and rough set theory (Pawlak, 1991) have gained popularity in modeling and propagating uncertainty.
Both fuzzy sets and rough sets provide a mathematical framework to capture uncertainties associated with the human cognition process (Dubois and Prade, 1990; Maji and Pal, 2007b; Pal, Mitra, and Mitra, 2003). The segmentation of MR images using fuzzy c-means has been reported in (Bezdek, 1981; Brandt, Bohan, Kramer, and Fletcher, 1994; Hall et al., 1992; Li, Goldgof, and Hall, 1993; Xiao, Ho, and Hassanien, 2008). Image segmentation using rough sets has also been done (Mushrif and Ray, 2008; Pal and Mitra, 2002; Widz, Revett, and Slezak, 2005a,b; Widz and Slezak, 2007; Hassanien, 2007).
In this chapter, a hybrid algorithm called the rough-fuzzy c-means (RFCM) algorithm is presented for segmentation of brain MR images. Details of this algorithm have been reported in (Maji and Pal, 2007a,c). The RFCM algorithm is based on both rough sets and fuzzy sets. While the membership function of fuzzy sets enables efficient handling of overlapping partitions, the concept of lower and upper approximations of rough sets deals with uncertainty, vagueness, and incompleteness in class definition. Each partition is represented by a cluster prototype (centroid), a crisp lower approximation, and a fuzzy boundary. The lower approximation influences the fuzziness of the final partition. The cluster prototype (centroid) depends on the weighted average of the crisp lower approximation and fuzzy boundary. However, an important issue of the RFCM based brain MR image segmentation method is how to select initial prototypes of different classes or categories. The concept of discriminant analysis, based on the maximization of class separability, is used to circumvent the initialization and local minima problems of the RFCM, and enables efficient segmentation of brain MR images (Maji and Pal, 2008). The effectiveness of the RFCM algorithm, along with a comparison with other c-means algorithms, is demonstrated on a set of brain MR images using some standard validity indices.
The chapter is organized as follows: Section 2.2 briefly introduces the necessary notions of fuzzy c-means and rough sets. In Section 2.3, the RFCM algorithm is described based on the theory of rough sets and fuzzy c-means. While Section 2.4 deals with the pixel classification problem, Section 2.5 gives an overview of the feature extraction techniques employed in segmentation of brain MR images along with the initialization method of the c-means algorithm based on the maximization of class separability. Implementation details, experimental results, and a comparison among different c-means are presented in Section 2.6. Concluding remarks are given in Section 2.7.
2.2 Fuzzy C-Means and Rough Sets
This section presents the basic notions of fuzzy c-means and rough sets. The rough-fuzzy c-means (RFCM) algorithm is developed based on these algorithms.
2.2.1 Fuzzy C-Means
Let X = {x_1, · · · , x_j, · · · , x_n} be the set of n objects and V = {v_1, · · · , v_i, · · · , v_c} be the set of c centroids, where x_j ∈ ℜ^m and v_i ∈ ℜ^m. The fuzzy c-means provides a fuzzification of the hard c-means (Bezdek, 1981; Dunn, 1974). It partitions X into c clusters by minimizing the objective function
\[ J = \sum_{j=1}^{n} \sum_{i=1}^{c} (\mu_{ij})^{\acute{m}} \, \|x_j - v_i\|^2 \tag{2.1} \]
where 1 ≤ ḿ < ∞ is the fuzzification factor, v_i is the ith centroid corresponding to cluster β_i, µ_ij ∈ [0, 1] is the fuzzy membership of the pattern x_j to cluster β_i, and ||.|| is the distance norm, such that
\[ v_i = \frac{1}{n_i} \sum_{j=1}^{n} (\mu_{ij})^{\acute{m}} x_j; \quad \text{where } n_i = \sum_{j=1}^{n} (\mu_{ij})^{\acute{m}} \tag{2.2} \]
and
\[ \mu_{ij} = \Big( \sum_{k=1}^{c} \Big( \frac{d_{ij}}{d_{kj}} \Big)^{\frac{2}{\acute{m}-1}} \Big)^{-1}; \quad \text{where } d_{ij}^2 = \|x_j - v_i\|^2 \tag{2.3} \]
subject to
\[ \sum_{i=1}^{c} \mu_{ij} = 1, \ \forall j, \quad \text{and} \quad 0 < \sum_{j=1}^{n} \mu_{ij} < n, \ \forall i. \]
[. . .] ε. Although fuzzy c-means is a very useful clustering method, the resulting membership values do not always correspond well to the degrees of belonging of the data, and it may
be inaccurate in a noisy environment (Krishnapuram and Keller, 1993, 1996). In real data analysis, noise and outliers are unavoidable. Hence, to reduce this weakness of fuzzy c-means, and to produce memberships that have a good explanation of the degrees of belonging for the data, Krishnapuram and Keller (Krishnapuram and Keller, 1993, 1996) proposed a possibilistic approach to clustering which used a possibilistic type of membership function to describe the degree of belonging. However, the possibilistic c-means sometimes generates coincident clusters (Barni, Cappellini, and Mecocci, 1996). Recently, the use of both fuzzy (probabilistic) and possibilistic memberships in a clustering has been proposed in (Pal, Pal, Keller, and Bezdek, 2005).
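Equations 2.1 to 2.3 can be turned into a small alternating-update loop, sketched below in plain Python. Several details are assumptions made for this illustration rather than statements from the chapter: the function names, the default fuzzifier m = 2 (Equation 2.3 needs m > 1), the stopping rule based on the largest squared centroid shift, and the handling of an object that coincides with a centroid.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def update_memberships(X, V, m=2.0):
    """Equation 2.3: mu[i][j] is the membership of object j in cluster i."""
    mu = [[0.0] * len(X) for _ in V]
    for j, x in enumerate(X):
        d2 = [sq_dist(x, v) for v in V]
        zeros = [i for i, d in enumerate(d2) if d == 0.0]
        if zeros:  # assumption: share membership among coinciding centroids
            for i in zeros:
                mu[i][j] = 1.0 / len(zeros)
            continue
        for i in range(len(V)):
            # (d_ij / d_kj)^(2/(m-1)) equals (d2_ij / d2_kj)^(1/(m-1))
            mu[i][j] = 1.0 / sum((d2[i] / d2[k]) ** (1.0 / (m - 1.0))
                                 for k in range(len(V)))
    return mu

def update_centroids(X, mu, m=2.0):
    """Equation 2.2: weighted mean of the objects with weights mu_ij^m."""
    V = []
    for row in mu:
        w = [u ** m for u in row]
        n_i = sum(w)
        V.append([sum(wj * x[d] for wj, x in zip(w, X)) / n_i
                  for d in range(len(X[0]))])
    return V

def objective(X, V, mu, m=2.0):
    """Equation 2.1: membership-weighted sum of squared distances."""
    return sum((mu[i][j] ** m) * sq_dist(x, V[i])
               for i in range(len(V)) for j, x in enumerate(X))

def fuzzy_c_means(X, V, m=2.0, eps=1e-6, max_iter=100):
    """Alternate Equations 2.3 and 2.2 until the centroids stop moving."""
    for _ in range(max_iter):
        mu = update_memberships(X, V, m)
        new_V = update_centroids(X, mu, m)
        shift = max(sq_dist(a, b) for a, b in zip(V, new_V))
        V = new_V
        if shift < eps:  # assumed stopping rule, not taken from the text
            break
    return V, update_memberships(X, V, m)

# Hypothetical one-dimensional data with two well separated groups.
data = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centroids, memberships = fuzzy_c_means(data, [[0.0], [5.0]])
```

On this toy data the two centroids settle close to the middle of each group, and the memberships of every object sum to one across the clusters, as the constraint requires.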
2.2.2
Rough Sets
The theory of rough sets begins with the notion of an approximation space, which is a pair < U, R >, where U be a non-empty set (the universe of discourse) and R an equivalence relation on U , i.e., R is reflexive, symmetric, and transitive. The relation R decomposes the set U into disjoint classes in such a way that two elements x, y are in the same class iff (x, y) ∈ R. Let denote by U/R the quotient set of U by the relation R, and U/R = {X1 , X2 , · · · , Xm } where Xi is an equivalence class of R, i = 1, 2, · · · , m. If two elements x, y ∈ U belong to the same equivalence class Xi ∈ U/R, then x and y are called indistinguishable. The equivalence classes of R and the empty set ∅ are the elementary sets in the approximation space < U, R >. Given an arbitrary set X ∈ 2U , in general it may not be possible to describe X precisely in < U, R >. One may characterize X by a pair of lower and upper approximations defined as follows (Pawlak, 1991): R(X) =
[
Xi ;
R(X) =
[
Xi
Xi ∩X6=∅
Xi ⊆X
That is, the lower approximation $\underline{R}(X)$ is the union of all the elementary sets which are subsets of X, and the upper approximation $\overline{R}(X)$ is the union of all the elementary sets which have a non-empty intersection with X. The interval $[\underline{R}(X), \overline{R}(X)]$ is the representation of an ordinary set X in the approximation space < U, R >, or simply the rough set of X. The lower (resp., upper) approximation $\underline{R}(X)$ (resp., $\overline{R}(X)$) is interpreted as the collection of those elements of U that definitely (resp., possibly) belong to X. Further,
• a set X ∈ 2^U is said to be definable (or exact) in < U, R > iff $\underline{R}(X) = \overline{R}(X)$;
• for any X, Y ∈ 2^U, X is said to be roughly included in Y, denoted by $X \tilde{\subset} Y$, iff $\underline{R}(X) \subseteq \underline{R}(Y)$ and $\overline{R}(X) \subseteq \overline{R}(Y)$;
• X and Y are said to be roughly equal, denoted by $X \simeq_R Y$, in < U, R > iff $\underline{R}(X) = \underline{R}(Y)$ and $\overline{R}(X) = \overline{R}(Y)$.
Pawlak (1991) discusses two numerical characterizations of the imprecision of a subset X in the approximation space < U, R >: accuracy and roughness. The accuracy of X, denoted by αR(X), is the ratio of the number of objects in its lower approximation to that in its upper approximation, namely

\[ \alpha_R(X) = \frac{|\underline{R}(X)|}{|\overline{R}(X)|} \]
The roughness of X, denoted by ρR(X), is defined by subtracting the accuracy from 1:

\[ \rho_R(X) = 1 - \alpha_R(X) = 1 - \frac{|\underline{R}(X)|}{|\overline{R}(X)|} \]
Note that the lower the roughness of a subset, the better is its approximation. Further, the following observations are easily obtained:
1. As $\underline{R}(X) \subseteq X \subseteq \overline{R}(X)$, 0 ≤ ρR(X) ≤ 1.
2. By convention, when X = ∅, $\underline{R}(X) = \overline{R}(X) = \emptyset$ and ρR(X) = 0.
3. ρR(X) = 0 if and only if X is definable in < U, R >.
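The definitions above can be illustrated with a minimal sketch, assuming the equivalence classes of R are supplied directly as a list of Python sets. The function names `approximations` and `roughness` are hypothetical and chosen only for this illustration.

```python
def approximations(classes, X):
    """Return the (lower, upper) approximations of X.

    classes : iterable of sets, the equivalence classes (elementary sets) of R
    X       : the subset of U to be approximated
    """
    X = set(X)
    lower, upper = set(), set()
    for block in classes:
        block = set(block)
        if block <= X:        # elementary set entirely inside X
            lower |= block
        if block & X:         # elementary set overlapping X
            upper |= block
    return lower, upper


def roughness(classes, X):
    """Roughness 1 - |lower| / |upper|; 0 by convention when X is empty."""
    lower, upper = approximations(classes, X)
    if not upper:
        return 0.0
    return 1.0 - len(lower) / len(upper)


# Toy universe U = {1, ..., 5} partitioned into three elementary sets.
classes = [{1, 2}, {3, 4}, {5}]
lower, upper = approximations(classes, {1, 2, 3})
print(lower, upper, roughness(classes, {1, 2, 3}))
```

In this toy case the set {1, 2, 3} is not definable, because the elementary set {3, 4} only partially overlaps it; a union of elementary sets such as {1, 2, 5} has roughness 0.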
2.3 Rough-Fuzzy C-Means Algorithm
This section describes a recently introduced c-means algorithm, termed rough-fuzzy c-means (RFCM) (Maji and Pal, 2007a,c), which incorporates both fuzzy and rough sets. The RFCM algorithm adds the concept of fuzzy membership of fuzzy sets, and the lower and upper approximations of rough sets, to the c-means algorithm. While the membership of fuzzy sets enables efficient handling of overlapping partitions, the rough sets deal with uncertainty, vagueness, and incompleteness in class definition.
2.3.1 Objective Function
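Let $\underline{A}(\beta_i)$ and $\overline{A}(\beta_i)$ be the lower and upper approximations of cluster βi, and let $B(\beta_i) = \{\overline{A}(\beta_i) - \underline{A}(\beta_i)\}$ denote the boundary region of cluster βi. The RFCM partitions a set of n objects into c clusters by minimizing the objective function

\[ J_{RF} = \begin{cases} w \times A_1 + \tilde{w} \times B_1 & \text{if } \underline{A}(\beta_i) \neq \emptyset,\ B(\beta_i) \neq \emptyset \\ A_1 & \text{if } \underline{A}(\beta_i) \neq \emptyset,\ B(\beta_i) = \emptyset \\ B_1 & \text{if } \underline{A}(\beta_i) = \emptyset,\ B(\beta_i) \neq \emptyset \end{cases} \qquad (2.4) \]

where

\[ A_1 = \sum_{i=1}^{c} \sum_{x_j \in \underline{A}(\beta_i)} \|x_j - v_i\|^2, \qquad B_1 = \sum_{i=1}^{c} \sum_{x_j \in B(\beta_i)} (\mu_{ij})^{\acute{m}} \|x_j - v_i\|^2 \]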
Here, vi represents the centroid of the ith cluster βi, the parameters w and w̃ correspond to the relative importance of the lower bound and the boundary region, and w + w̃ = 1. Note that µij has the same meaning of membership as in fuzzy c-means. In the RFCM, each cluster is represented by a centroid, a crisp lower approximation, and a fuzzy boundary (Fig. 2.1). The lower approximation influences the fuzziness of the final partition. According to the definitions of lower approximation and boundary of rough sets, if an object $x_j \in \underline{A}(\beta_i)$, then $x_j \notin \underline{A}(\beta_k)$, ∀k ≠ i, and $x_j \notin B(\beta_i)$, ∀i. That is, the object xj is contained in βi definitely. Thus, the weights of the objects in the lower approximation of a cluster should be independent of other centroids and clusters, and should not be coupled with their similarity with respect to other centroids. Also, the objects in the lower approximation of a cluster should have similar influence on the corresponding centroid and cluster. In contrast, if xj ∈ B(βi), then the object xj possibly belongs to βi and potentially belongs to another cluster. Hence, the objects in boundary regions should have different influence on the centroids and clusters. So, in the RFCM, the membership values of objects in the lower approximation are µij = 1, while those in the boundary region are the same as in fuzzy c-means (Equation 2.3). In other words, the RFCM algorithm first partitions the data into two classes: lower approximation and boundary. Only the objects in the boundary are fuzzified.
FIGURE 2.1 RFCM: cluster βi is represented by crisp lower bound and fuzzy boundary (crisp lower approximation $\underline{A}(\beta_i)$ with µij = 1; fuzzy boundary B(βi) with µij ∈ [0, 1])
2.3.2 Cluster Prototypes
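The new centroid is calculated as a weighted average of the crisp lower approximation and the fuzzy boundary. Computation of the centroid is modified to include the effects of both fuzzy memberships and lower and upper bounds. The modified centroid calculation for the RFCM is obtained by solving Equation 2.4 with respect to vi:

\[ v_i^{RF} = \begin{cases} w \times C_1 + \tilde{w} \times D_1 & \text{if } \underline{A}(\beta_i) \neq \emptyset,\ B(\beta_i) \neq \emptyset \\ C_1 & \text{if } \underline{A}(\beta_i) \neq \emptyset,\ B(\beta_i) = \emptyset \\ D_1 & \text{if } \underline{A}(\beta_i) = \emptyset,\ B(\beta_i) \neq \emptyset \end{cases} \qquad (2.5) \]

\[ C_1 = \frac{1}{|\underline{A}(\beta_i)|} \sum_{x_j \in \underline{A}(\beta_i)} x_j \]

where $|\underline{A}(\beta_i)|$ represents the cardinality of $\underline{A}(\beta_i)$, and

\[ D_1 = \frac{1}{n_i} \sum_{x_j \in B(\beta_i)} (\mu_{ij})^{\acute{m}} x_j, \qquad \text{where } n_i = \sum_{x_j \in B(\beta_i)} (\mu_{ij})^{\acute{m}} \]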
Thus, the cluster prototypes (centroids) depend on the parameters w and w̃ and on the fuzzification factor ḿ, which together govern the relative influence of the lower approximation and the boundary region. The correlated influence of these parameters and the fuzzification factor makes it somewhat difficult to determine their optimal values. Since the objects lying in the lower approximation definitely belong to a cluster, they are assigned a higher weight w compared to the weight w̃ of the objects lying in the boundary region. Hence, for the RFCM, the values are given by 0 < w̃ < w < 1. From the above discussions, the following properties of the RFCM algorithm can be derived.
1. $\bigcup \overline{A}(\beta_i) = U$, U being the set of objects of concern.
2. $\underline{A}(\beta_i) \cap \underline{A}(\beta_k) = \emptyset$, ∀i ≠ k.
3. $\underline{A}(\beta_i) \cap B(\beta_i) = \emptyset$, ∀i.
4. ∃i, k, B(βi) ∩ B(βk) ≠ ∅.
5. µij = 1, $\forall x_j \in \underline{A}(\beta_i)$.
6. µij ∈ [0, 1], ∀xj ∈ B(βi).
Let us briefly comment on some properties of the RFCM. Property 2 says that $x_j \in \underline{A}(\beta_i) \Rightarrow x_j \notin \underline{A}(\beta_k)$, ∀k ≠ i. That is, the object xj is contained in βi definitely. Property 3 establishes the fact that $x_j \in \underline{A}(\beta_i) \Rightarrow x_j \notin B(\beta_i)$, that is, an object may not be in both the lower approximation and the boundary region of a cluster βi. Property 4 says that xj ∈ B(βi) ⇒ ∃k, xj ∈ B(βk). It means that an object xj ∈ B(βi) possibly belongs to βi and potentially belongs to another cluster. Properties 5 and 6 are of great importance in computing the objective function JRF and the cluster prototype v^RF. They say that the membership values of the objects in the lower approximation are µij = 1, while those in the boundary region are the same as in fuzzy c-means. That is, each cluster βi consists of a crisp lower approximation $\underline{A}(\beta_i)$ and a fuzzy boundary B(βi).
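The three cases of Equation 2.5 can be sketched for a single cluster as follows. This is only an illustration under stated assumptions: objects are plain lists of floats, the lower approximation and boundary are passed as lists of object indices, and the function name `rfcm_centroid` and its parameter names are hypothetical.

```python
def rfcm_centroid(X, mu_i, lower_idx, boundary_idx, w=0.95, m=2.0):
    """Centroid of one cluster, following the three cases of Equation 2.5.

    X            : list of objects, each a list of floats
    mu_i         : memberships of all objects in this cluster
    lower_idx    : indices of the objects in the lower approximation
    boundary_idx : indices of the objects in the boundary region
    w, m         : weight of the lower approximation and fuzzification factor
    """
    dim = len(X[0])
    w_tilde = 1.0 - w                      # w + w_tilde = 1
    C1 = D1 = None
    if lower_idx:                          # crisp mean of the lower approximation
        C1 = [sum(X[j][d] for j in lower_idx) / len(lower_idx)
              for d in range(dim)]
    if boundary_idx:                       # fuzzy mean of the boundary region
        weights = {j: mu_i[j] ** m for j in boundary_idx}
        n_i = sum(weights.values())
        D1 = [sum(weights[j] * X[j][d] for j in boundary_idx) / n_i
              for d in range(dim)]
    if C1 is not None and D1 is not None:
        return [w * a + w_tilde * b for a, b in zip(C1, D1)]
    if C1 is not None:
        return C1
    if D1 is not None:
        return D1
    raise ValueError("cluster has no objects in lower approximation or boundary")


X = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
mu_i = [1.0, 1.0, 0.5]
v = rfcm_centroid(X, mu_i, lower_idx=[0, 1], boundary_idx=[2])
print(v)
```

With w = 0.95 the boundary object pulls the centroid only slightly away from the crisp mean of the lower approximation, which is the intended effect of choosing w̃ < w.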
2.3.3 Details of the Algorithm
Approximate optimization of JRF (Equation 2.4) by the RFCM is based on Picard iteration through Equations 2.3 and 2.5. This type of iteration is called alternating optimization. The process starts by randomly choosing c objects as the centroids of the c clusters. The fuzzy memberships of all objects are calculated using Equation 2.3. Let µi = (µi1, · · · , µij, · · · , µin) represent the fuzzy cluster βi associated with the centroid vi. After computing µij for c clusters and n objects, the values of µij for each object xj are sorted and the difference of the two highest memberships of xj is compared with a threshold value δ. Let µij and µkj be the highest and second highest memberships of xj. If (µij − µkj) > δ, then $x_j \in \underline{A}(\beta_i)$ as well as $x_j \in \overline{A}(\beta_i)$; otherwise $x_j \in \overline{A}(\beta_i)$ and $x_j \in \overline{A}(\beta_k)$. After assigning each object to the lower approximations or boundary regions of different clusters based on δ, the memberships µij of the objects are modified. The values of µij are set to 1 for the objects in lower approximations, while those in boundary regions remain unchanged. The new centroids of the clusters are calculated as per Equation 2.5. The main steps of the RFCM algorithm proceed as follows:
1. Assign initial centroids vi, i = 1, 2, · · · , c. Choose values for the fuzzification factor ḿ and the thresholds ε and δ. Set iteration counter t = 1.
2. Compute µij by Equation 2.3 for c clusters and n objects.
3. If µij and µkj are the two highest memberships of xj and (µij − µkj) ≤ δ, then $x_j \in \overline{A}(\beta_i)$ and $x_j \in \overline{A}(\beta_k)$. Furthermore, xj is not part of any lower bound.
4. Otherwise, $x_j \in \underline{A}(\beta_i)$. In addition, by properties of rough sets, $x_j \in \overline{A}(\beta_i)$.
5. Modify µij considering lower and boundary regions for c clusters and n objects.
6. Compute the new centroids as per Equation 2.5.
7. Repeat steps 2 to 6, incrementing t, as long as |µij(t) − µij(t − 1)| > ε.
The performance of the RFCM depends on the value of δ, which determines the class labels of all the objects. In other words, the RFCM partitions the data set into two classes, lower approximation and boundary, based on the value of δ. In the present work, the following definition is used:

\[ \delta = \frac{1}{n} \sum_{j=1}^{n} (\mu_{ij} - \mu_{kj}) \qquad (2.6) \]

where n is the total number of objects, and µij and µkj are the highest and second highest memberships of xj. That is, the value of δ represents the average difference of the two highest memberships of all the objects in the data set. A good clustering procedure should make the value of δ as high as possible. The value of δ is, therefore, data dependent.
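The steps above can be sketched for scalar objects (for example, gray values) as follows. This is a minimal illustration, not the authors' implementation. Equation 2.3 is not reproduced in this section, so the sketch assumes the standard fuzzy c-means membership formula; it also assumes c ≥ 2, recomputes δ from Equation 2.6 in every iteration, and leaves a centroid unchanged if its cluster receives no objects. The names `fcm_memberships` and `rfcm` are hypothetical.

```python
def fcm_memberships(x, centroids, m=2.0):
    """Standard FCM memberships of scalar x (assumed form of Equation 2.3)."""
    d = [abs(x - v) for v in centroids]
    if 0.0 in d:                              # x coincides with a centroid
        hit = d.index(0.0)
        return [1.0 if i == hit else 0.0 for i in range(len(d))]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** p for dk in d) for di in d]


def rfcm(data, centroids, w=0.95, m=2.0, eps=1e-4, max_iter=100):
    c = len(centroids)
    centroids = list(centroids)
    previous = None
    for _ in range(max_iter):
        # Step 2: memberships mu[j][i] of object j in cluster i.
        mu = [fcm_memberships(x, centroids, m) for x in data]
        ranked = [sorted(range(c), key=row.__getitem__, reverse=True)
                  for row in mu]
        # Equation 2.6: average gap between the two highest memberships.
        delta = sum(row[r[0]] - row[r[1]]
                    for row, r in zip(mu, ranked)) / len(data)
        lower = [[] for _ in range(c)]
        boundary = [[] for _ in range(c)]
        for j, (row, r) in enumerate(zip(mu, ranked)):
            i, k = r[0], r[1]
            if row[i] - row[k] > delta:       # step 4: lower approximation
                lower[i].append(j)
                row[i] = 1.0                  # step 5: crisp membership
            else:                             # step 3: boundary of both
                boundary[i].append(j)
                boundary[k].append(j)
        # Step 6: centroid update following Equation 2.5.
        for i in range(c):
            C1 = D1 = None
            if lower[i]:
                C1 = sum(data[j] for j in lower[i]) / len(lower[i])
            n_i = sum(mu[j][i] ** m for j in boundary[i])
            if n_i > 0:
                D1 = sum(mu[j][i] ** m * data[j] for j in boundary[i]) / n_i
            if C1 is not None and D1 is not None:
                centroids[i] = w * C1 + (1.0 - w) * D1
            elif C1 is not None:
                centroids[i] = C1
            elif D1 is not None:
                centroids[i] = D1
        # Step 7: stop once no membership changes by more than eps.
        if previous is not None and all(
                abs(a - b) <= eps
                for row, old in zip(mu, previous) for a, b in zip(row, old)):
            break
        previous = mu
    return centroids, lower, boundary, delta


gray = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 3.0]
centroids, lower, boundary, delta = rfcm(gray, [0.0, 6.0])
print(centroids, lower, boundary, delta)
```

In this toy run the value 3.0 is equidistant from both groups, so it stays in the boundary of both clusters, while the remaining objects fall into the lower approximations.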
2.4 Pixel Classification of Brain MR Images
In this section, we present the results of different c-means algorithms on pixel classification of brain MR images, that is, the results of clustering based only on the gray value of pixels. More than 100 MR images with different sizes and 16 bit gray levels are tested with different c-means algorithms. All the brain MR images were collected from Advanced Medicare and Research Institute, Salt Lake, Kolkata, India. The comparative performance of different c-means algorithms is reported with respect to the DB index and the Dunn index, as well as the β index (Pal, Ghosh, and Sankar, 2000), which are described next.

Davies-Bouldin (DB) Index:
The Davies-Bouldin (DB) index (Bezdek and Pal, 1988) is a function of the ratio of the sum of within-cluster distances to the between-cluster separation and is given by

\[ DB = \frac{1}{c} \sum_{i=1}^{c} \max_{i \neq k} \left\{ \frac{S(v_i) + S(v_k)}{d(v_i, v_k)} \right\} \quad \text{for } 1 \leq i, k \leq c. \]

The DB index is low when the within-cluster distance S(vi) is small and the between-cluster separation d(vi, vk) is large. Therefore, for a given data set and c value, the higher the similarity values within the clusters and the between-cluster separation, the lower would be the DB index value. A good clustering procedure should make the value of the DB index as low as possible.

Dunn Index:
The Dunn index (Bezdek and Pal, 1988) is also designed to identify sets of clusters that are compact and well separated. The Dunn index maximizes

\[ Dunn = \min_{i} \left\{ \min_{i \neq k} \left\{ \frac{d(v_i, v_k)}{\max_{l} S(v_l)} \right\} \right\} \quad \text{for } 1 \leq i, k, l \leq c. \]

A good clustering procedure should make the value of the Dunn index as high as possible.

β Index:
The β-index of Pal et al. (Pal et al., 2000) is defined as the ratio of the total variation and within-cluster variation, and is given by

\[ \beta = \frac{N}{M}; \quad \text{where } N = \sum_{i=1}^{c} \sum_{j=1}^{n_i} \|x_{ij} - v\|^2; \quad M = \sum_{i=1}^{c} \sum_{j=1}^{n_i} \|x_{ij} - v_i\|^2; \quad \sum_{i=1}^{c} n_i = n; \]
Here, ni is the number of objects in the ith cluster (i = 1, 2, · · · , c), n is the total number of objects, xij is the jth object in cluster i, vi is the mean or centroid of the ith cluster, and v is the mean of n objects. For a given image and c value, the higher the homogeneity within the segmented regions, the higher would be the β value. The value of β increases with c. Consider the image of Fig. 2.3 as an example, which represents an MR image (I-20497774) of size 256 × 180 with 16 bit gray levels. So, the number of objects in the data set of I-20497774 is 46080. Table 2.1 depicts the values of DB index, Dunn index, and β index of FCM and RFCM for different values of c on the data set of I-20497774, considering only the gray value of each pixel. The results reported here with respect to DB and Dunn index confirm that both FCM and RFCM achieve their best results for c = 4 (background, gray matter, white matter, and cerebro-spinal fluid). Also, the value of β index, as expected, increases with increase in the value of c. For a particular value of c, the performance of RFCM is better than that of FCM in most cases (the DB and Dunn values at c = 7 are exceptions). Fig. 2.2 shows the scatter plots of the highest and second highest memberships of all the objects in the data set of I-20497774 at the first and final iterations, respectively, considering w = 0.95, ḿ = 2.0, and c = 4. The diagonal line represents the zone where the two highest memberships of objects are equal. From Fig. 2.2, it is observed that although the average difference between the two highest memberships of the objects is very low at the first iteration (δ = 0.145), it ultimately becomes very high at the final iteration (δ = 0.652).

TABLE 2.1 Performance of FCM and RFCM on I-20497774

Value      DB Index        Dunn Index         β Index
of c     FCM    RFCM     FCM    RFCM      FCM     RFCM
  2      0.51   0.21     2.30   6.17      2.15    2.19
  3      0.25   0.17     1.11   1.62      3.55    3.74
  4      0.16   0.15     1.50   1.64      9.08    9.68
  5      0.39   0.17     0.10   0.64     10.45   10.82
  6      0.20   0.19     0.66   1.10     16.93   17.14
  7      0.23   0.27     0.98   0.12     21.63   22.73
  8      0.34   0.27     0.09   0.31     25.82   26.38
  9      0.32   0.28     0.12   0.13     31.75   32.65
 10      0.30   0.24     0.08   0.12     38.04   39.31
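The three validity indices defined in this section can be sketched as follows for a crisp partition. The text does not spell out S(vi) and d(vi, vk) here, so this sketch assumes that S(vi) is the mean Euclidean distance of the members of cluster i to its centroid and that d(vi, vk) is the Euclidean distance between centroids. The function name `validity_indices` is hypothetical, and at least one cluster must contain non-identical points so that max S(vl) is non-zero.

```python
import math


def _dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def _mean(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def validity_indices(clusters):
    """Return (DB, Dunn, beta) for a list of non-empty clusters of points."""
    c = len(clusters)
    v = [_mean(pts) for pts in clusters]
    # Assumption: S(v_i) = mean distance of cluster members to the centroid.
    S = [sum(_dist(x, v[i]) for x in pts) / len(pts)
         for i, pts in enumerate(clusters)]
    db = sum(max((S[i] + S[k]) / _dist(v[i], v[k])
                 for k in range(c) if k != i)
             for i in range(c)) / c
    dunn = min(_dist(v[i], v[k])
               for i in range(c) for k in range(c) if k != i) / max(S)
    everything = [x for pts in clusters for x in pts]
    v_all = _mean(everything)
    N = sum(_dist(x, v_all) ** 2 for x in everything)       # total variation
    M = sum(_dist(x, v[i]) ** 2
            for i, pts in enumerate(clusters) for x in pts)  # within-cluster
    return db, dunn, N / M


clusters = [[(0.0,), (2.0,)], [(10.0,), (12.0,)]]
db, dunn, beta = validity_indices(clusters)
print(db, dunn, beta)
```

For these two compact, well separated clusters the DB index is small while the Dunn and β indices are large, matching the interpretation given in the text.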
FIGURE 2.2 Scatter plots of two highest membership values of all objects in data set I-20497774 (panels: at 1st iteration and after 20th iteration; horizontal axis: highest membership value; vertical axis: second highest membership value)
Table 2.2 compares the performance of different c-means algorithms on some brain MR images with respect to DB, Dunn, and β index considering c = 4 (background, gray matter, white matter, and CSF). All the results reported in Table 2.2 confirm that the RFCM algorithm produces more promising pixel clusters than the conventional methods do. Some of the existing algorithms like PCM and FPCM have failed to produce multiple clusters, as they generate coincident clusters even when they have been initialized with the final prototypes of FCM. Also, the values of DB, Dunn, and β index of RFCM are better than those of the other c-means algorithms.
TABLE 2.2 Performance of Different C-Means Algorithms

Data Set      Algorithms   DB Index   Dunn Index   β Index
I-20497761    HCM          0.16       2.13         12.07
              FCM          0.14       2.26         12.92
              RCM          0.15       2.31         11.68
              RFCM         0.13       2.39         13.06
I-20497763    HCM          0.18       1.88         12.02
              FCM          0.16       2.02         12.63
              RCM          0.15       2.14         12.59
              RFCM         0.11       2.12         13.30
I-20497774    HCM          0.18       1.17          8.11
              FCM          0.16       1.50          9.08
              RCM          0.17       1.51          9.10
              RFCM         0.15       1.64          9.68
I-20497777    HCM          0.17       2.01          8.68
              FCM          0.16       2.16          9.12
              RCM          0.15       2.34          9.28
              RFCM         0.14       2.39          9.81

2.5 Segmentation of Brain MR Images
In this section, the feature extraction methodology for segmentation of brain MR images is first described. Next, the methodology to select initial centroids for different c-means algorithms is provided based on the concept of maximization of class separability (Maji and Pal, 2008).
2.5.1 Feature Extraction
Statistical texture analysis derives a set of statistics from the distribution of pixel values or blocks of pixel values. Statistical textures are of different types (first-order, second-order, and higher-order statistics), depending on the number of pixel combinations used to compute them. The first-order statistics, like mean, standard deviation, range, entropy, and the qth moment about the mean, are calculated using the histogram formed by the gray scale value of each pixel. These statistics consider the properties of the gray scale values, but not their spatial distribution. The second-order statistics are based on pairs of pixels and therefore take into account the spatial distribution of the gray scale values. In the present work, only first- and second-order statistical textures are considered. A set of 13 input features is used for clustering the brain MR images. These include the gray value of the pixel, two recently introduced features (first-order statistics), namely homogeneity and edge value of the pixel (Maji and Pal, 2008), and 10 of Haralick's textural features (Haralick, Shanmugam, and Dinstein, 1973) (second-order statistics): angular second moment, contrast, correlation, inverse difference moment, sum average, sum variance, sum entropy, second order entropy, difference variance, and difference entropy. They are useful in characterizing images, and can be used as features of a pixel. Hence these features have promising application in clustering based brain MRI segmentation.

Homogeneity
If H is the homogeneity of a pixel Im,n within a 3 × 3 neighborhood, then

\[ H = 1 - \frac{1}{6(I_{max} - I_{min})} \{ |I_{m-1,n-1} + I_{m+1,n+1} - I_{m-1,n+1} - I_{m+1,n-1}| + |I_{m-1,n-1} + 2I_{m,n-1} + I_{m+1,n-1} - I_{m-1,n+1} - 2I_{m,n+1} - I_{m+1,n+1}| \} \]
where Imax and Imin represent the maximum and minimum gray values of the image. The region that is entirely within an organ will have a high H value. On the other hand, the regions that contain more than one organ will have lower H values (Maji and Pal, 2008).
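The homogeneity formula can be sketched as follows, assuming the image is a 2-D list indexed as `img[m][n]` with m the row and n the column (an indexing convention chosen for this illustration). The function name `homogeneity` and the choice to return 1 for a constant image are assumptions, not part of the original method description.

```python
def homogeneity(img, m, n, i_max=None, i_min=None):
    """Homogeneity H of the interior pixel (m, n) in its 3 x 3 neighborhood.

    img          : 2-D list of gray values, indexed img[row][column]
    i_max, i_min : global maximum and minimum gray values of the image;
                   computed from img when not supplied
    """
    if i_max is None:
        i_max = max(max(row) for row in img)
    if i_min is None:
        i_min = min(min(row) for row in img)
    if i_max == i_min:            # constant image: treated as fully homogeneous
        return 1.0
    diagonal = abs(img[m - 1][n - 1] + img[m + 1][n + 1]
                   - img[m - 1][n + 1] - img[m + 1][n - 1])
    gradient = abs(img[m - 1][n - 1] + 2 * img[m][n - 1] + img[m + 1][n - 1]
                   - img[m - 1][n + 1] - 2 * img[m][n + 1] - img[m + 1][n + 1])
    return 1.0 - (diagonal + gradient) / (6.0 * (i_max - i_min))


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
edge = [[0, 0, 10], [0, 0, 10], [0, 0, 10]]
print(homogeneity(flat, 1, 1), homogeneity(edge, 1, 1))
```

The two absolute terms are bounded by 2(Imax − Imin) and 4(Imax − Imin) respectively, so the factor 6(Imax − Imin) keeps H within [0, 1].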
Edge Value
In MR imaging, the histogram of the given image is in general unimodal. One side of the peak may display a shoulder or slope change, or one side may be less steep than the other, reflecting the presence of two peaks that are close together or that differ greatly in height. The histogram may also contain a third, usually smaller, population corresponding to points on the object-background border. These points have gray levels intermediate between those of the object and background; their presence raises the level of the valley floor between the two peaks or, if the peaks are already close together, makes it harder to detect the fact that they are not a single peak. As the histogram peaks are close together and very unequal in size, it may be difficult to detect the valley between them. In determining how each point of the image should contribute to the segmentation method, the current method takes into account the point's gray level as well as the rate of change of gray level at the point (edge value), that is, the maximum of differences of average gray levels in pairs of horizontally and vertically adjacent 2 × 2 neighborhoods (Maji et al., 2008; Weszka and Rosenfeld, 1979). If ∆ is the edge value at a given point Im,n, then

\[ \Delta = \frac{1}{4} \max\{ |I_{m-1,n} + I_{m-1,n+1} + I_{m,n} + I_{m,n+1} - I_{m+1,n} - I_{m+1,n+1} - I_{m+2,n} - I_{m+2,n+1}|,\ |I_{m,n-1} + I_{m,n} + I_{m+1,n-1} + I_{m+1,n} - I_{m,n+1} - I_{m,n+2} - I_{m+1,n+1} - I_{m+1,n+2}| \} \]
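The edge value formula can be sketched in the same way, again assuming a 2-D list indexed as `img[m][n]` (row, column). The function name `edge_value` and the explicit bounds check are assumptions made for this illustration.

```python
def edge_value(img, m, n):
    """Edge value at pixel (m, n): the larger of the differences between
    vertically and horizontally adjacent 2 x 2 neighborhoods, divided by 4."""
    rows, cols = len(img), len(img[0])
    if not (1 <= m <= rows - 3 and 1 <= n <= cols - 3):
        raise IndexError("the two pairs of 2 x 2 blocks must fit in the image")
    # Block of rows m-1..m versus block of rows m+1..m+2 (columns n..n+1).
    vertical = abs(img[m - 1][n] + img[m - 1][n + 1] + img[m][n] + img[m][n + 1]
                   - img[m + 1][n] - img[m + 1][n + 1]
                   - img[m + 2][n] - img[m + 2][n + 1])
    # Block of columns n-1..n versus block of columns n+1..n+2 (rows m..m+1).
    horizontal = abs(img[m][n - 1] + img[m][n] + img[m + 1][n - 1] + img[m + 1][n]
                     - img[m][n + 1] - img[m][n + 2]
                     - img[m + 1][n + 1] - img[m + 1][n + 2])
    return max(vertical, horizontal) / 4.0


step = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [8, 8, 8, 8],
        [8, 8, 8, 8]]
flat = [[3] * 4 for _ in range(4)]
print(edge_value(step, 1, 1), edge_value(flat, 1, 1))
```

A pixel sitting on the step between the two gray levels receives an edge value equal to the step height, while a pixel in a flat region receives 0.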
According to the image model, points interior to the object and background should generally have low edge values, since they are highly correlated with their neighbors, while those on the object-background border should have high edge values (Maji et al., 2008).

Haralick's Textural Feature
Texture is one of the important features used in identifying objects or regions of interest in an image. It is often described as a set of statistical measures of the spatial distribution of gray levels in an image. This scheme has been found to provide a powerful input feature representation for various recognition problems. Haralick et al. (Haralick et al., 1973) proposed different textural properties for image classification. Haralick's textural measures are based upon the moments of a joint probability density function that is estimated as the joint co-occurrence matrix or gray level co-occurrence matrix (Haralick et al., 1973; Rangayyan, 2004). It reflects the distribution of the probability of occurrence of a pair of gray levels separated by a given distance d at angle θ. Based upon the normalized gray level co-occurrence matrix, Haralick proposed several quantities as measures of texture, like energy, contrast, correlation, sum of squares, inverse difference moments, sum average, sum variance, sum entropy, entropy, difference variance, difference entropy, information measure of correlation 1, and correlation 2. In (Haralick et al., 1973), these properties were calculated for large blocks in aerial photographs. Every pixel within each of these large blocks was then assigned the same texture values. This leads to a significant loss of resolution that is unacceptable in medical imaging. In the present work, the texture values are assigned to a pixel by using a 3 × 3 sliding window centered about that pixel. The gray level co-occurrence matrix is constructed by mapping the gray level co-occurrence probabilities based on spatial relations of pixels in different angular directions (θ = 0°, 45°, 90°, 135°) with unit pixel distance, while scanning the window (centered about a pixel) from left-to-right and top-to-bottom (Haralick et al., 1973; Rangayyan, 2004). Ten texture measures (angular second moment, contrast, correlation, inverse difference moment, sum average, sum variance, sum entropy, second order entropy, difference variance, and difference entropy) are computed for each window. For four angular directions, a set of four values is obtained for each of the ten measures. The mean of each of the ten measures, averaged over the four values, along with the gray value, homogeneity, and edge value of the pixel, comprise the set of 13 features which is used as the feature vector of the corresponding pixel.
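The window-based texture computation can be sketched as follows for three of the ten measures (angular second moment, contrast, and inverse difference moment), using their standard co-occurrence definitions. This is an illustration under assumptions: the window is assumed to hold gray levels already quantized to integers in [0, levels), the co-occurrence matrix is made symmetric, and the names `glcm`, `texture_measures`, and `OFFSETS` are hypothetical.

```python
# (row, column) offsets for 0, 45, 90 and 135 degrees at unit pixel distance.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm(window, offset, levels):
    """Symmetric, normalized gray level co-occurrence matrix of a window
    (at least 2 x 2) whose entries are integers in [0, levels)."""
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = window[r][c], window[r2][c2]
                counts[a][b] += 1.0       # count the pair in both orders
                counts[b][a] += 1.0
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def texture_measures(window, levels):
    """Mean over the four directions of three co-occurrence measures."""
    sums = {"asm": 0.0, "contrast": 0.0, "idm": 0.0}
    for offset in OFFSETS:
        p = glcm(window, offset, levels)
        for i in range(levels):
            for j in range(levels):
                sums["asm"] += p[i][j] ** 2                     # angular second moment
                sums["contrast"] += (i - j) ** 2 * p[i][j]      # contrast
                sums["idm"] += p[i][j] / (1.0 + (i - j) ** 2)   # inverse difference moment
    return {name: value / len(OFFSETS) for name, value in sums.items()}


checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(texture_measures(checker, 2))
print(texture_measures(uniform, 2))
```

In practice the window would be extracted around each pixel and the remaining measures added in the same loop; a 3 × 3 window yields very few pixel pairs per direction, so a coarse quantization of the gray levels is a natural companion assumption.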
2.5.2 Selection of Initial Centroids
A limitation of the c-means algorithm is that it can only achieve a local optimum solution that depends on the initial choice of the centroids. Consequently, computing resources may be wasted in that some initial centroids get stuck in regions of the input space with a scarcity of data points and may therefore never have the chance to move to new locations where they are needed. To overcome this limitation of the c-means algorithm, a method to select initial centroids is described next, which is based on discriminant analysis maximizing some measures of class separability (Otsu, 1979). It enables the algorithm to converge to an optimum or near optimum solution (Maji and Pal, 2008). Before describing the new method for selecting initial centroids, a quantitative measure of class separability (Otsu, 1979) is provided, which is given by

\[ J(T) = \frac{P_1(T) P_2(T) [m_1(T) - m_2(T)]^2}{P_1(T) \sigma_1^2(T) + P_2(T) \sigma_2^2(T)} \qquad (2.7) \]

where

\[ P_1(T) = \sum_{z=0}^{T} h(z); \qquad P_2(T) = \sum_{z=T+1}^{L-1} h(z) = 1 - P_1(T) \]

\[ m_1(T) = \frac{1}{P_1(T)} \sum_{z=0}^{T} z h(z); \qquad m_2(T) = \frac{1}{P_2(T)} \sum_{z=T+1}^{L-1} z h(z) \]

\[ \sigma_1^2(T) = \frac{1}{P_1(T)} \sum_{z=0}^{T} [z - m_1(T)]^2 h(z); \qquad \sigma_2^2(T) = \frac{1}{P_2(T)} \sum_{z=T+1}^{L-1} [z - m_2(T)]^2 h(z) \]
Here, L is the total number of discrete values ranging between [0, L − 1], T is the threshold value, which maximizes J(T), and h(z) represents the percentage of data having feature value z over the total number of discrete values of the corresponding feature. To maximize J(T), the means of the two classes should be as well separated as possible and the variances in both classes should be as small as possible. Based on the concept of maximization of class separability, the method for selecting initial centroids is described next. The main steps of this method proceed as follows.
1. The data set X = {x1, · · · , xj, · · · , xn} with xj ∈ ℜ^m is first discretized to facilitate the class separation method. Suppose the possible value range of a feature fm in the data set is (fm,min, fm,max), and the real value that the data element xj takes at fm is fmj; then the discretized value of fmj is

\[ \text{Discretized}(f_{mj}) = (L - 1) \times \frac{f_{mj} - f_{m,min}}{f_{m,max} - f_{m,min}} \qquad (2.8) \]

where L is the total number of discrete values ranging between [0, L − 1].
2. For each feature fm, calculate h(z) for 0 ≤ z < L.
3. Calculate the threshold value Tm for the feature fm, which maximizes class separability along that feature.
4. Based on the threshold Tm, discretize the corresponding feature fm of the data element xj as follows:

\[ \bar{f}_{mj} = \begin{cases} 1, & \text{if } \text{Discretized}(f_{mj}) \geq T_m \\ 0, & \text{otherwise} \end{cases} \]

5. Repeat steps 2 to 4 for all the features and generate the set of discretized objects $\bar{X} = \{\bar{x}_1, \cdots, \bar{x}_j, \cdots, \bar{x}_n\}$.
6. Calculate the total number of similar discretized objects $N(\bar{x}_i)$ and the mean of similar objects $v(\bar{x}_i)$ of $\bar{x}_i$ as

\[ N(\bar{x}_i) = \sum_{j=1}^{n} \delta_j \quad \text{and} \quad v(\bar{x}_i) = \frac{1}{N(\bar{x}_i)} \sum_{j=1}^{n} \delta_j \times x_j, \quad \text{where } \delta_j = \begin{cases} 1 & \text{if } \bar{x}_j = \bar{x}_i \\ 0 & \text{otherwise} \end{cases} \]

7. Sort the n objects according to their values of $N(\bar{x}_i)$ such that $N(\bar{x}_1) > N(\bar{x}_2) > \cdots > N(\bar{x}_n)$.
8. If $\bar{x}_i = \bar{x}_j$, then $N(\bar{x}_i) = N(\bar{x}_j)$ and $v(\bar{x}_j)$ should not be considered as a centroid (mean), resulting in a reduced set of objects to be considered for initial centroids.
9. Let there be ń objects in the reduced set having $N(\bar{x}_i)$ values such that $N(\bar{x}_1) > N(\bar{x}_2) > \cdots > N(\bar{x}_{\acute{n}})$. A heuristic threshold function can be defined as follows (Banerjee, Mitra, and Pal, 1998):

\[ Tr = \frac{R}{\tilde{\epsilon}}; \quad \text{where } R = \sum_{i=1}^{\acute{n}} \frac{1}{N(\bar{x}_i) - N(\bar{x}_{i+1})} \]

where ε̃ is a constant (= 0.5, say), so that all the means $v(\bar{x}_i)$ of the objects in the reduced set having an $N(\bar{x}_i)$ value higher than Tr are regarded as the candidates for initial centroids (means).
The value of Tr is high if most of the $N(\bar{x}_i)$'s are large and close to each other. This condition occurs when a small number of large clusters are present. On the other hand, if the $N(\bar{x}_i)$'s have wide variation among them, then the number of clusters with smaller size increases. Accordingly, Tr attains a lower value automatically. Note that the main motive of introducing this threshold function lies in reducing the number of centroids. Actually, it attempts to eliminate noisy centroids (data representatives having lower values of $N(\bar{x}_i)$) from the whole data set. The whole approach is, therefore, data dependent.
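Two parts of the procedure above can be sketched as follows: the search for the threshold T that maximizes J(T) in Equation 2.7 (step 3), and the selection of candidate centroids from already binarized objects (steps 6 to 9). This is an illustration under several assumptions: h is a list of fractions summing to 1, ties in J(T) are resolved in favor of the smallest T, R is summed over consecutive pairs of distinct counts (equal counts are skipped to avoid division by zero), and the names `class_separability_threshold` and `candidate_centroids` are hypothetical.

```python
from collections import defaultdict


def class_separability_threshold(h):
    """Return the threshold T that maximizes J(T) of Equation 2.7.

    h : list of length L; h[z] is the fraction of data with discrete value z
    """
    L = len(h)
    best_T, best_J = None, -1.0
    for T in range(L - 1):
        P1 = sum(h[:T + 1])
        P2 = sum(h[T + 1:])
        if P1 <= 0.0 or P2 <= 0.0:          # one class would be empty
            continue
        m1 = sum(z * h[z] for z in range(T + 1)) / P1
        m2 = sum(z * h[z] for z in range(T + 1, L)) / P2
        var1 = sum((z - m1) ** 2 * h[z] for z in range(T + 1)) / P1
        var2 = sum((z - m2) ** 2 * h[z] for z in range(T + 1, L)) / P2
        denom = P1 * var1 + P2 * var2
        # Zero within-class variance means perfectly separated classes.
        J = float("inf") if denom == 0.0 else P1 * P2 * (m1 - m2) ** 2 / denom
        if J > best_J:
            best_T, best_J = T, J
    return best_T


def candidate_centroids(binary_objects, objects, eps_tilde=0.5):
    """Steps 6 to 9: group objects by binary pattern, then keep the group
    means whose group size N exceeds the heuristic threshold Tr."""
    groups = defaultdict(list)
    for pattern, x in zip(binary_objects, objects):
        groups[tuple(pattern)].append(x)
    # One (N, mean) entry per distinct pattern, sorted by decreasing N.
    ranked = sorted(
        ((len(members),
          tuple(sum(col) / len(members) for col in zip(*members)))
         for members in groups.values()),
        key=lambda entry: entry[0], reverse=True)
    counts = [N for N, _ in ranked]
    R = sum(1.0 / (a - b) for a, b in zip(counts, counts[1:]) if a != b)
    Tr = R / eps_tilde
    return [mean for N, mean in ranked if N > Tr], Tr


T = class_separability_threshold([0.25, 0.25, 0.0, 0.0, 0.25, 0.25])
patterns = [(0, 0)] * 5 + [(1, 1)] * 3 + [(0, 1)]
points = [(1.0, 1.0)] * 5 + [(5.0, 5.0)] * 3 + [(9.0, 0.0)]
candidates, Tr = candidate_centroids(patterns, points)
print(T, candidates, Tr)
```

In the toy data the single object with pattern (0, 1) forms a small group whose size falls below Tr, so its mean is discarded as a noisy centroid, which is the stated purpose of the threshold function.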
2.6 Experimental Results and Discussion
In this section, the performance of different c-means algorithms on segmentation of brain MR images is presented. Details of the experimental setup, data collection, and objective of the experiments are the same as those of Section 2.4. Consider Fig. 2.3 as an example that represents an MR image (I-20497774) along with the segmented images obtained using different c-means algorithms. Each image is of size 256 × 180 with 16 bit gray levels. So, the number of objects in the data set of I-20497774 is 46080. The parameters generated in the discriminant analysis based initialization method are shown in Table 2.3 only for the I-20497774 data set, along with the values of input parameters. The threshold values for 13 features of the given data set are also reported in this table. Table 2.4 depicts the values of DB index, Dunn index, and β index of FCM and RFCM for different values of c on the data set of I-20497774, considering w = 0.95 and ḿ = 2.0. The results reported here with respect to DB and Dunn index confirm that both FCM and RFCM achieve their best results for c = 4. Also, the value of β index, as expected, increases with increase in the value of c. For a particular value of c, the performance of RFCM is better than that of FCM in most cases (the Dunn values at c = 7 and c = 10 are exceptions).

FIGURE 2.3 I-20497774: original and segmented images of HCM, FCM, RCM, and RFCM

TABLE 2.3 Values of Different Parameters

Size of image = 256 × 180
Minimum gray value = 1606, Maximum gray value = 2246
Samples per pixel = 1, Bits allocated = 16, Bits stored = 12
Number of objects = 46080
Number of features = 13, Value of L = 101
Threshold Values:
Gray value = 1959, Homogeneity = 0.17, Edge value = 0.37
Angular second moment = 0.06, Contrast = 0.12
Correlation = 0.57, Inverse difference moment = 0.18
Sum average = 0.17, Sum variance = 0.14, Sum entropy = 0.87
Entropy = 0.88, Difference variance = 0.07, Difference entropy = 0.79
Finally, Table 2.5 provides the comparative results of different c-means algorithms on I-20497774 with respect to the values of DB index, Dunn index, and β index. The corresponding segmented images, along with the original one, are presented in Fig. 2.3. The results reported in Fig. 2.3 and Table 2.5 confirm that the RFCM algorithm produces a more promising segmented image than the conventional c-means algorithms do. Some of the existing algorithms like PCM and FPCM fail to produce multiple segments, as they generate coincident clusters even when they are initialized with the final prototypes of the FCM.
TABLE 2.4 Performance of FCM and RFCM on I-20497774 data set

Value      DB Index        Dunn Index         β Index
of c     FCM    RFCM     FCM    RFCM      FCM     RFCM
  2      0.38   0.19     2.17   3.43      3.62    4.23
  3      0.22   0.16     1.20   1.78      7.04    7.64
  4      0.15   0.13     1.54   1.80     11.16   13.01
  5      0.29   0.19     0.95   1.04     11.88   14.83
  6      0.24   0.23     0.98   1.11     19.15   19.59
  7      0.23   0.21     1.07   0.86     24.07   27.80
  8      0.31   0.21     0.46   0.95     29.00   33.02
  9      0.30   0.24     0.73   0.74     35.06   40.07
 10      0.30   0.22     0.81   0.29     41.12   44.27

TABLE 2.5 Performance of Different C-Means on I-20497774 data set

Algorithms   DB Index   Dunn Index   β Index
HCM          0.17       1.28         10.57
FCM          0.15       1.54         11.16
RCM          0.16       1.56         11.19
RFCM         0.13       1.80         13.01

TABLE 2.6 Haralick's and Proposed Features on I-20497774 data set

Algorithms   Features      DB Index   Dunn Index   β Index   Time (ms)
HCM          H-13          0.19       1.28         10.57      4308
             H-10          0.19       1.28         10.57      3845
             P-2           0.18       1.28         10.57      1867
             H-10 ∪ P-2    0.17       1.28         10.57      3882
FCM          H-13          0.15       1.51         10.84     36711
             H-10          0.15       1.51         10.84     34251
             P-2           0.15       1.51         11.03     14622
             H-10 ∪ P-2    0.15       1.54         11.16     43109
RCM          H-13          0.19       1.52         11.12      5204
             H-10          0.19       1.52         11.12      5012
             P-2           0.17       1.51         11.02      1497
             H-10 ∪ P-2    0.16       1.56         11.19      7618
RFCM         H-13          0.13       1.76         12.57     15705
             H-10          0.13       1.76         12.57     15414
             P-2           0.13       1.77         12.88      6866
             H-10 ∪ P-2    0.13       1.80         13.01     17084

2.6.1 Haralick's Features Versus Proposed Features
Table 2.6 presents the comparative results of different c-means algorithms for Haralick's features and the features proposed in (Maji and Pal, 2008) on the I-20497774 data set. While P-2 and H-13 stand for the set of two proposed features (Maji and Pal, 2008) and the set of thirteen Haralick's features, respectively, H-10 represents the set of ten Haralick's features used in the current study. The proposed features are found to be as important as Haralick's ten features for clustering based segmentation of brain MR images. The set of 13 features, comprising gray value, two proposed features, and ten Haralick's features, improves the performance of all c-means algorithms with respect to DB, Dunn, and β. It is also observed that three of Haralick's features, namely sum of squares, information measure of correlation 1, and correlation 2, do not contribute any extra information for segmentation of brain MR images.
2.6.2 Random Versus Discriminant Analysis Based Initialization
Table 2.7 provides comparative results of different c-means algorithms with random initialization of centroids and the discriminant analysis based initialization method described in Section 2.5.2 for the data sets I-20497761, I-20497763, and I-20497777 (Fig. 2.4). The discriminant analysis based initialization method is found to improve the performance in terms of DB index, Dunn index, and β index, as well as to reduce the time requirement of all c-means algorithms. It is also observed that HCM with this initialization method performs similarly to RFCM with random initialization, although it is expected that RFCM is superior to HCM in partitioning the objects. While with random initialization the c-means algorithms get stuck in local optima, the discriminant analysis based initialization method enables the algorithms to converge to an optimum or near optimum solution. In effect, the execution time required for different c-means algorithms is lower in this scheme than with random initialization.

TABLE 2.7 Performance of Random and Discriminant Analysis Based Initialization Method

Data Set      Algorithms   Initialization   DB Index   Dunn Index   β Index   Time (ms)
I-20497761    HCM          Random           0.23       1.58          9.86      8297
                           Proposed         0.15       2.64         12.44      4080
              FCM          Random           0.19       1.63         12.73     40943
                           Proposed         0.12       2.69         13.35     38625
              RCM          Random           0.19       1.66         10.90      9074
                           Proposed         0.14       2.79         12.13      6670
              RFCM         Random           0.15       2.07         11.89     19679
                           Proposed         0.11       2.98         13.57     16532
I-20497763    HCM          Random           0.26       1.37         10.16      3287
                           Proposed         0.16       2.03         13.18      3262
              FCM          Random           0.21       1.54         10.57     46157
                           Proposed         0.15       2.24         13.79     45966
              RCM          Random           0.21       1.60         10.84     10166
                           Proposed         0.14       2.39         13.80      6770
              RFCM         Random           0.17       1.89         11.49     19448
                           Proposed         0.10       2.38         14.27     15457
I-20497777    HCM          Random           0.33       1.52          6.79      4322
                           Proposed         0.16       2.38          8.94      3825
              FCM          Random           0.28       1.67          7.33     42284
                           Proposed         0.15       2.54         10.02     40827
              RCM          Random           0.27       1.71          7.47      8353
                           Proposed         0.13       2.79          9.89      7512
              RFCM         Random           0.19       1.98          8.13     18968
                           Proposed         0.11       2.83         11.04     16930

FIGURE 2.4 Examples of some brain MR images: I-20497761, I-20497763, I-20497777
2.6.3 Comparative Performance Analysis
Table 2.8 compares the performance of different c-means algorithms on some brain MR images with respect to DB, Dunn, and β index. The segmented versions of different c-means are shown in Figs. 2.5-2.7. All the results reported in Table 2.8 and Figs. 2.5-2.7 confirm that although each c-means algorithm, except PCM and FPCM, generates good segmented images, the values of DB, Dunn, and β index of the RFCM are better than those of the other c-means algorithms. Both PCM and FPCM fail to produce multiple segments of the brain MR images, as they generate coincident clusters even when they are initialized with the final prototypes of other c-means algorithms. Table 2.8 also provides the execution time (in milliseconds) of different c-means algorithms. The execution time required for the RFCM is significantly lower than that of FCM. For the HCM and RCM, although the execution time is lower, the performance is considerably poorer than that of RFCM.

TABLE 2.8 Performance of Different C-Means Algorithms

Data Set      Algorithms   DB Index   Dunn Index   β Index   Time (ms)
I-20497761    HCM          0.15       2.64         12.44      4080
              FCM          0.12       2.69         13.35     38625
              RCM          0.14       2.79         12.13      6670
              RFCM         0.11       2.98         13.57     16532
I-20497763    HCM          0.16       2.03         13.18      3262
              FCM          0.15       2.24         13.79     45966
              RCM          0.14       2.39         13.80      6770
              RFCM         0.10       2.38         14.27     15457
I-20497777    HCM          0.16       2.38          8.94      3825
              FCM          0.15       2.54         10.02     40827
              RCM          0.13       2.79          9.89      7512
              RFCM         0.11       2.83         11.04     16930

The following conclusions can be drawn from the results reported in this chapter:
FIGURE 2.5
I-20497761: segmented versions of HCM, FCM, RCM, and RFCM
FIGURE 2.6
I-20497763: segmented versions of HCM, FCM, RCM, and RFCM
FIGURE 2.7
I-20497777: segmented versions of HCM, FCM, RCM, and RFCM
1. RFCM is superior to the other c-means algorithms. It requires more time than HCM and RCM but less time than FCM, and its performance with respect to DB, Dunn, and β is significantly better than that of all other c-means algorithms. The performance of FCM and RCM is intermediate between that of RFCM and HCM.
2. The discriminant analysis based initialization is found to improve the values of DB, Dunn, and β as well as to reduce the time requirement substantially for all c-means algorithms.
3. Two features proposed in (Maji and Pal, 2008) are as important as Haralick’s ten features for clustering based segmentation of brain MR images.
4. Use of rough sets and fuzzy memberships adds a small computational load to the HCM algorithm; however, the corresponding integrated method (RFCM) shows a definite increase in Dunn index and decrease in DB index.
The best performance of the segmentation method in terms of DB, Dunn, and β is achieved for the following reasons:
1. the discriminant analysis based initialization of centroids enables the algorithm to converge to an optimum or near-optimum solution;
2. the membership of the RFCM efficiently handles overlapping partitions; and
3. the concept of crisp lower bound and fuzzy boundary of the RFCM algorithm deals with uncertainty, vagueness, and incompleteness in class definition.
In effect, promising segmented brain MR images are obtained using the RFCM algorithm.
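As a quick cross-check of the comparison above, the following Python sketch re-reads Table 2.8 and reports, for each image and index, which algorithm attains the best value, together with the RFCM-to-FCM time ratio. The numbers are transcribed from Table 2.8; the dictionary layout and the function names `best_algorithm` and `time_ratio` are illustrative choices and not part of the chapter.

```python
# Values transcribed from Table 2.8: (DB, Dunn, beta, time in ms).
TABLE_2_8 = {
    "I-20497761": {"HCM": (0.15, 2.64, 12.44, 4080), "FCM": (0.12, 2.69, 13.35, 38625),
                   "RCM": (0.14, 2.79, 12.13, 6670), "RFCM": (0.11, 2.98, 13.57, 16532)},
    "I-20497763": {"HCM": (0.16, 2.03, 13.18, 3262), "FCM": (0.15, 2.24, 13.79, 45966),
                   "RCM": (0.14, 2.39, 13.80, 6770), "RFCM": (0.10, 2.38, 14.27, 15457)},
    "I-20497777": {"HCM": (0.16, 2.38, 8.94, 3825), "FCM": (0.15, 2.54, 10.02, 40827),
                   "RCM": (0.13, 2.79, 9.89, 7512), "RFCM": (0.11, 2.83, 11.04, 16930)},
}

# Column position and selection rule: DB is better when lower,
# Dunn and beta are better when higher.
INDICES = {"DB": (0, min), "Dunn": (1, max), "beta": (2, max)}


def best_algorithm(image, index):
    """Name of the algorithm with the best value of `index` on `image`."""
    col, pick = INDICES[index]
    rows = TABLE_2_8[image]
    return pick(rows, key=lambda alg: rows[alg][col])


def time_ratio(image, a="RFCM", b="FCM"):
    """Execution time of algorithm `a` divided by that of algorithm `b`."""
    rows = TABLE_2_8[image]
    return rows[a][3] / rows[b][3]


winners = {(img, idx): best_algorithm(img, idx)
           for img in TABLE_2_8 for idx in INDICES}

for (img, idx), alg in winners.items():
    print(f"{img}  {idx:<5} best: {alg}")
for img in TABLE_2_8:
    print(f"{img}  RFCM/FCM time ratio: {time_ratio(img):.2f}")
```

With the values in Table 2.8, RFCM attains the best value in eight of the nine image/index pairs (the exception is the Dunn index on I-20497763, where RCM is marginally higher), and it needs less than half of FCM's time on every image.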
2.7
Conclusion
A robust segmentation technique is presented in this chapter, integrating the merits of rough sets, fuzzy sets, and c-means algorithm, for brain MR images. Some new measures are reported, based on the local properties of MR images, for accurate segmentation. The method, based on the concept of maximization of class separability, is found to be successful in effectively circumventing the initialization and local minima problems of iterative refinement clustering algorithms like c-means. The effectiveness of the algorithm, along with a comparison with other algorithms, is demonstrated on a set of brain MR images. The extensive experimental results show that the rough-fuzzy c-means algorithm produces a segmented image more promising than do the conventional algorithms.
Acknowledgments. The authors thank Advanced Medicare and Research Institute, Kolkata, India, for providing brain MR images. This work was done when S. K. Pal was a Govt. of India J.C. Bose Fellow.
Bibliography
Banerjee, Mohua, Sushmita Mitra, and Sankar K Pal. 1998. Rough Fuzzy MLP: Knowledge Encoding and Classification. IEEE Transactions on Neural Networks 9(6):1203–1216.
Barni, M., V. Cappellini, and A. Mecocci. 1996. Comments on A Possibilistic Approach to Clustering. IEEE Transactions on Fuzzy Systems 4(3):393–396.
Bezdek, J. C. 1981. Pattern Recognition with Fuzzy Objective Function Algorithm. New York: Plenum.
Bezdek, J. C., and N. R. Pal. 1988. Some New Indexes for Cluster Validity. IEEE Transactions on System, Man, and Cybernetics, Part B 28:301–315.
Brandt, M. E., T. P. Bohan, L. A. Kramer, and J. M. Fletcher. 1994. Estimation of CSF, White and Gray Matter Volumes in Hydrocephalic Children Using Fuzzy Clustering of MR Images. Computerized Medical Imaging and Graphics 18:25–34.
Cagnoni, S., G. Coppini, M. Rucci, D. Caramella, and G. Valli. 1993. Neural Network Segmentation of Magnetic Resonance Spin Echo Images of the Brain. Journal of Biomedical Engineering 15(5):355–362.
Dubois, D., and H. Prade. 1990. Rough Fuzzy Sets and Fuzzy Rough Sets. International Journal of General Systems 17:191–209.
Dunn, J. C. 1974. A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact, Well-Separated Clusters. Journal of Cybernetics 3:32–57.
Hall, L. O., A. M. Bensaid, L. P. Clarke, R. P. Velthuizen, M. S. Silbiger, and J. C. Bezdek. 1992. A Comparison of Neural Network and Fuzzy Clustering Techniques in Segmenting Magnetic Resonance Images of the Brain. IEEE Transactions on Neural Networks 3(5):672–682.
Haralick, R. M., K. Shanmugam, and I. Dinstein. 1973. Textural Features for Image Classification. IEEE Transactions on Systems, Man and Cybernetics SMC-3(6):610–621.
Hassanien, Aboul Ella. 2007. Fuzzy Rough Sets Hybrid Scheme for Breast Cancer Detection. Image Vision Computing 25(2):172–183.
Krishnapuram, R., and J. M. Keller. 1993. A Possibilistic Approach to Clustering. IEEE Transactions on Fuzzy Systems 1(2):98–110.
———. 1996. The Possibilistic C-Means Algorithm: Insights and Recommendations. IEEE Transactions on Fuzzy Systems 4(3):385–393.
Lee, C., S. Hun, T. A. Ketter, and M. Unser. 1998. Unsupervised Connectivity Based Thresholding Segmentation of Midsagittal Brain MR Images. Computers in Biology and Medicine 28:309–338.
Leemput, K. V., F. Maes, D. Vandermeulen, and P. Suetens. 1999. Automated Model-Based Tissue Classification of MR Images of the Brain. IEEE Transactions on Medical Imaging 18(10):897–908.
Li, C. L., D. B. Goldgof, and L. O. Hall. 1993. Knowledge-Based Classification and Tissue Labeling of MR Images of Human Brain. IEEE Transactions on Medical Imaging 12(4):740–750.
Maji, Pradipta, Malay K. Kundu, and Bhabatosh Chanda. 2008. Second Order Fuzzy Measure and Weighted Co-Occurrence Matrix for Segmentation of Brain MR Images. Fundamenta Informaticae 88(1-2):161–176.
Maji, Pradipta, and Sankar K. Pal. 2007a. RFCM: A Hybrid Clustering Algorithm Using Rough and Fuzzy Sets. Fundamenta Informaticae 80(4):475–496.
———. 2007b. Rough-Fuzzy C-Medoids Algorithm and Selection of Bio-Basis for Amino Acid Sequence Analysis. IEEE Transactions on Knowledge and Data Engineering 19(6):859–872.
———. 2007c. Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices. IEEE Transactions on System, Man and Cybernetics, Part B, Cybernetics 37(6):1529–1540.
———. 2008. Maximum Class Separability for Rough-Fuzzy C-Means Based Brain MR Image Segmentation. LNCS Transactions on Rough Sets IX(5390):114–134.
Manousakes, I. N., P. E. Undrill, and G. G. Cameron. 1998. Split and Merge Segmentation of Magnetic Resonance Medical Images: Performance Evaluation and Extension to Three Dimensions. Computers and Biomedical Research 31(6):393–412.
Mushrif, Milind M., and Ajoy K. Ray. 2008. Color Image Segmentation: Rough-Set Theoretic Approach. Pattern Recognition Letters 29(4):483–493.
Otsu, N. 1979. A Threshold Selection Method from Gray Level Histogram. IEEE Transactions on System, Man, and Cybernetics 9(1):62–66.
Pal, N. R., K. Pal, J. M. Keller, and J. C. Bezdek. 2005. A Possibilistic Fuzzy C-Means Clustering Algorithm. IEEE Transactions on Fuzzy Systems 13(4):517–530.
Pal, N. R., and S. K. Pal. 1993. A Review on Image Segmentation Techniques. Pattern Recognition 26(9):1277–1294.
Pal, S. K., A. Ghosh, and B. Uma Sankar. 2000. Segmentation of Remotely Sensed Images with Fuzzy Thresholding, and Quantitative Evaluation. International Journal of Remote Sensing 21(11):2269–2300.
Pal, S. K., and P. Mitra. 2002. Multispectral Image Segmentation Using Rough Set Initialized EM Algorithm. IEEE Transactions on Geoscience and Remote Sensing 40(11):2495–2501.
Pal, Sankar K, Sushmita Mitra, and Pabitra Mitra. 2003. Rough-Fuzzy MLP: Modular Evolution, Rule Generation, and Evaluation. IEEE Transactions on Knowledge and Data Engineering 15(1):14–25.
Pawlak, Z. 1991. Rough Sets, Theoretical Aspects of Reasoning About Data. Dordrecht, The Netherlands: Kluwer.
Rajapakse, J. C., J. N. Giedd, and J. L. Rapoport. 1997. Statistical Approach to Segmentation of Single Channel Cerebral MR Images. IEEE Transactions on Medical Imaging 16:176–186.
Rangayyan, Rangaraj M. 2004. Biomedical Image Analysis. CRC Press.
Rosenfeld, A., and A. C. Kak. 1982. Digital Picture Processing. Academic Press, Inc.
Singleton, H. R., and G. M. Pohost. 1997. Automatic Cardiac MR Image Segmentation Using Edge Detection by Tissue Classification in Pixel Neighborhoods. Magnetic Resonance in Medicine 37(3):418–424.
Suetens, Paul. 2002. Fundamentals of Medical Imaging. Cambridge University Press.
Wells III, W. M., W. E. L. Grimson, R. Kikinis, and F. A. Jolesz. 1996. Adaptive Segmentation of MRI Data. IEEE Transactions on Medical Imaging 15(4):429–442.
Weszka, J. S., and A. Rosenfeld. 1979. Histogram Modification for Threshold Selection. IEEE Transactions on System, Man, and Cybernetics SMC-9(1):62–66.
Widz, Sebastian, Kenneth Revett, and Dominik Slezak. 2005a. A Hybrid Approach to MR Imaging Segmentation Using Unsupervised Clustering and Approximate Reducts. Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing 372–382.
———. 2005b. A Rough Set-Based Magnetic Resonance Imaging Partial Volume Detection System. Proceedings of the First International Conference on Pattern Recognition and Machine Intelligence 756–761.
Widz, Sebastian, and Dominik Slezak. 2007. Approximation Degrees in Decision Reduct-Based MRI Segmentation. Proceedings of the Frontiers in the Convergence of Bioscience and Information Technologies 431–436.
Xiao, Kai, Sooi Hock Ho, and Aboul Ella Hassanien. 2008. Automatic Unsupervised Segmentation Methods for MRI Based on Modified Fuzzy C-Means. Fundamenta Informaticae 87(3-4):465–481.
Zadeh, L. A. 1965. Fuzzy Sets. Information and Control 8:338–353.
3
Image Thresholding using Generalized Rough Sets
Debashis Sen
Center for Soft Computing Research, Indian Statistical Institute
Sankar K. Pal
Center for Soft Computing Research, Indian Statistical Institute
3.1 Introduction
3.2 Generalized Rough Set based Entropy Measures with respect to the Definability of a Set of Elements
Roughness of a Set in a Universe • The Lower and Upper Approximations of a Set • The Entropy Measures • Relation between $\rho_R(X)$ and $\rho_R(X^{\complement})$ • Properties of the Proposed Classes of Entropy Measures
3.3 Measuring Grayness Ambiguity in Images
3.4 Image Thresholding based on Association Error
Bilevel Thresholding • Multilevel Thresholding
3.5 Experimental Results
Qualitative analysis • Quantitative analysis
3.6 Conclusion
Bibliography
3.1
Introduction
Real-life images are inherently embedded with various ambiguities. To perceive the nature of these ambiguities, let us consider a 1001 × 1001 grayscale image (see Figure 3.1(a)) that has sinusoidal gray value gradations in the horizontal direction. When an attempt is made to mark the boundary of an arbitrary region in the image, an exact boundary cannot be defined because of the steadily changing gray values (gray value gradation). This is evident from Figure 3.1(b), which shows a portion of the image in which the pixels in the ‘white’ shaded area are known to belong uniquely to a region. However, the boundary (on the left and right sides) of this region is vague, as it can lie anywhere in the gray value gradations present in the portion. Value gradation is a common phenomenon in real-life images and hence it is widely accepted (Pal, 1982; Pal, King, and Hashim, 1983; Udupa and Saha, 2003) that regions in an image have fuzzy boundaries. Moreover, the gray levels at various pixels in grayscale images are considered to be imprecise, which means that a gray level resembles other nearby gray levels to certain extents. It is also true that pixels in a neighborhood with nearby gray levels have limited discernibility due to the inadequacy of contrast. For example, Figure 3.1(c) shows a 6 × 6 portion cut from the image in Figure 3.1(a). Although the portion contains gray values separated by 6 gray levels, it appears to be almost homogeneous. The aforementioned ambiguities in images, due to fuzzy boundaries of various regions and rough resemblance of nearby gray levels, are studied and modeled in this chapter. Note that these ambiguities are related to the indefiniteness in deciding an image pixel as white or black and hence they can be collectively referred to as the grayness ambiguity (Pal, 1999).

FIGURE 3.1: Ambiguities in a grayscale image with sinusoidal gray value gradations in horizontal direction. (a) A grayscale image; (b) Fuzzy boundary; (c) Rough resemblance

Fuzzy set theory of Lotfi Zadeh is based on the concept of vague boundaries of sets in the universe of discourse (Klir and Yuan, 2005). Rough set theory of Zdzislaw Pawlak, on the other hand, focuses on ambiguity in terms of limited discernibility of sets in the domain of discourse (Pawlak, 1991). Therefore, fuzzy sets can be used to represent the grayness ambiguity in images due to the vague definition of region boundaries (fuzzy boundaries), and rough sets can be used to represent the grayness ambiguity due to the indiscernibility between individual gray levels or groups of gray levels (rough resemblance).

Rough set theory, which was initially developed considering crisp equivalence approximation spaces (Pawlak, 1991), has been generalized by considering fuzzy (Dubois and Prade, 1990, 1992; Thiele, 1998) and tolerance (Skowron and Stepaniuk, 1996) approximation spaces. Furthermore, rough set theory, which was initially developed to approximate crisp sets, has also been generalized to approximate fuzzy sets (Dubois and Prade, 1990, 1992; Thiele, 1998).

In this chapter, we propose the use of rough set theory and certain of its generalizations to quantify grayness ambiguity in images. Here the generalizations to rough set theory based on the approximation of crisp and fuzzy sets considering crisp equivalence, fuzzy equivalence, crisp tolerance and fuzzy tolerance approximation spaces in different combinations are studied. All these combinations give rise to different concepts for modeling vagueness, which can be quantified using the roughness measure (Pawlak, 1991).
We propose classes of entropy measures which use the roughness measures obtained considering the aforementioned concepts for modeling vagueness. We perform a rigorous theoretical analysis of the proposed entropy measures and provide some properties which they satisfy. We then use the proposed entropy measures to quantify grayness ambiguity in images, giving an account of the manner in which the grayness ambiguity is captured. We show that the aforesaid generalizations to rough set theory regarding the approximation of fuzzy sets can be used to quantify grayness ambiguity due to both fuzzy boundaries and rough resemblance.

We then propose an image thresholding methodology that employs the grayness ambiguity measure obtained using the proposed classes of entropies. The strength of the proposed methodology lies in the fact that, unlike many existing thresholding techniques, it does not make any prior assumptions about the image. We present a novel bilevel thresholding scheme that performs thresholding by assigning a bin in the graylevel histogram of an image to one of the two classes based on the computation of certain association errors. In the methodology, the graylevel histogram is first divided into three regions, say, bright (a region of larger gray values), dark (a region of smaller gray values) and an undefined region. These regions are obtained using two predefined gray values, which are called the seed values. It is known (prior knowledge) that the bins of a graylevel histogram representing the smallest and the largest gray values would belong to the dark and bright regions, respectively. Hence, we consider that the graylevel bins of the histogram below the smaller seed value belong to the dark region and those above the larger seed value belong to the bright region. The rest of the graylevel bins form the undefined region. Then, each graylevel bin in the undefined region is associated with the defined regions, dark and bright, followed by the use of the grayness ambiguity measure to obtain the errors due to the associations. The thresholding is then achieved by comparing the association errors and assigning each graylevel bin of the undefined region to the defined region that corresponds to the lower association error.

To carry out multilevel thresholding in a manner similar to the bilevel thresholding, more than two seed values would be required. Unlike bilevel thresholding, in the case of multilevel thresholding we do not possess the prior knowledge required to assign all the seed values. Hence, we present a binary tree structured technique that uses the proposed bilevel thresholding scheme in order to carry out multilevel thresholding. In this technique, each region (node) obtained at a particular depth is further separated using the proposed bilevel thresholding method to get the regions at the next higher depth. The required number of regions is obtained by proceeding to a sufficient depth and then discarding some regions at that depth using a certain criterion. As a region in the graylevel histogram of an image corresponds to a region in the image, the aforementioned thresholding methodology would divide the image into a predefined (required) number of regions.

Image thresholding operations for segmentation and edge extraction are carried out in this chapter employing the grayness ambiguity measure obtained based on the proposed classes of entropies. These operations are performed in two ways, namely, by the ambiguity minimization method reported in (Pal et al., 1983) and by the proposed image thresholding methodology. Qualitative and quantitative experimental results obtained using these methods are compared to those obtained using a few popular existing image thresholding techniques in order to demonstrate the utility of the proposed entropy measures and the effectiveness of the proposed image thresholding methodology.

The organization of this chapter is as follows. In Section 3.2, the proposed entropy measures and their properties are presented after briefly mentioning the existing entropy measures based on rough set theory. The use of the proposed entropy measures for quantifying grayness ambiguity in images is presented in Section 3.3. The explanation of the proposed image thresholding methodology is given in Section 3.4. In Section 3.5, experimental results are presented to demonstrate the utility and effectiveness of the proposed entropy measures and image thresholding methodology. The chapter concludes with Section 3.6.
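The control flow of the bilevel scheme outlined in this section (two seed values, an undefined region, and assignment of each undefined bin by comparing association errors) can be sketched in Python as follows. This is only an illustration under stated assumptions: the chapter obtains the association errors from its grayness ambiguity measure, which is developed in later sections, whereas the sketch takes the error as a pluggable function and uses a simple count-weighted variance, `spread`, as a hypothetical stand-in. The names `bilevel_labels`, `spread`, `seed_dark`, and `seed_bright` are likewise illustrative.

```python
import numpy as np


def spread(levels, counts):
    """Stand-in association error: count-weighted variance of gray levels.

    This is NOT the chapter's grayness ambiguity measure; it only makes
    the sketch runnable.
    """
    total = counts.sum()
    if total == 0:
        return 0.0
    mean = (levels * counts).sum() / total
    return float(((levels - mean) ** 2 * counts).sum() / total)


def bilevel_labels(hist, seed_dark, seed_bright, ambiguity=spread):
    """Label every histogram bin as dark (0) or bright (1).

    Bins below `seed_dark` are dark and bins above `seed_bright` are bright;
    every remaining (undefined) bin is associated with each defined region
    in turn and assigned to the one giving the lower error.
    """
    hist = np.asarray(hist, dtype=float)
    levels = np.arange(hist.size, dtype=float)
    dark = levels < seed_dark
    bright = levels > seed_bright
    labels = np.where(dark, 0, np.where(bright, 1, -1))
    for g in np.flatnonzero(labels == -1):
        with_dark = dark.copy()
        with_dark[g] = True          # associate bin g with the dark region
        with_bright = bright.copy()
        with_bright[g] = True        # associate bin g with the bright region
        err_dark = ambiguity(levels[with_dark], hist[with_dark])
        err_bright = ambiguity(levels[with_bright], hist[with_bright])
        labels[g] = 0 if err_dark <= err_bright else 1
    return labels


# Toy 8-bin histogram with seeds at gray levels 2 and 5.
toy_labels = bilevel_labels([5, 5, 1, 1, 1, 1, 5, 5], seed_dark=2, seed_bright=5)
print(toy_labels.tolist())  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Passing the error as a function keeps the seed-and-associate logic separate from the choice of measure, so an ambiguity measure built from the entropies of Section 3.2 could be substituted for `spread` without changing the loop.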
3.2
Generalized Rough Set based Entropy Measures with respect to the Definability of a Set of Elements
Defining entropy measures based on rough set theory has been considered by researchers in the past decade. Probably the first such work was reported in (Beaubouef, Petry, and Arora, 1998), where a ‘rough entropy’ of a set in a universe was proposed. This rough entropy measure is defined based on the uncertainty in granulation (obtained using a relation defined over the universe (Pawlak, 1991)) and the definability of the set. Another entropy measure called the ‘rough schema entropy’ was proposed in (Beaubouef et al., 1998) in order to quantify the uncertainty in granulation alone. Other entropy measures of granulation have been defined in (Düntsch and Gediga, 1998; Wierman, 1999; Liang, Chin, Dang, and Yam, 2002). Later, entropy measures of fuzzy granulation were reported in (Bhatt and Gopal, 2004; Mi, Li, Zhao, and Feng, 2007). It is worthwhile to mention here that (Yager, 1992) and (Hu and Yu, 2005) respectively present and analyze an entropy measure which, although not based on rough set theory, quantifies information with the underlying elements having limited discernibility between them.

Incompleteness of knowledge about a universe leads to granulation (Pawlak, 1991) and hence a measure of the uncertainty in granulation quantifies this incompleteness of knowledge. Therefore, apart from the ‘rough entropy’ in (Beaubouef et al., 1998), which quantifies the incompleteness of knowledge about a set in a universe, the other aforesaid entropy measures quantify the incompleteness of knowledge about a universe. The effect of incompleteness of knowledge about a universe becomes evident only when an attempt is made to define a set in it. Note that the definability of a set in a universe is not always affected by a change in the uncertainty in granulation. This is evident in a few examples given in (Beaubouef et al., 1998), which we do not repeat here for the sake of brevity. Hence, a measure of incompleteness of knowledge about a universe with respect to only the definability of a set is required. The first attempt at formulating an entropy measure with respect to the definability of a set was made in (Pal, Shankar, and Mitra, 2005), where it was used for image segmentation. However, as pointed out in (Sen and Pal, 2007), this measure does not satisfy the necessary property that the entropy value is maximum (or optimum) when the uncertainty (in this case, incompleteness of knowledge) is maximum.

In this section, we propose classes of entropy measures which quantify the incompleteness of knowledge about a universe with respect to the definability of a set of elements (in the universe) holding a particular property (representing a category). An inexactness measure of a set, like the ‘roughness’ measure (Pawlak, 1991), quantifies the definability of the set. We measure the incompleteness of knowledge about a universe with respect to the definability of a set by considering the roughness measure of the set and also that of its complement in the universe.
3.2.1
Roughness of a Set in a Universe
Let U denote a universe of elements and X be an arbitrary set of elements in U holding a particular property. According to rough set theory (Pawlak, 1991) and its generalizations, limited discernibility draws elements in U together governed by an indiscernibility relation R and hence granules of elements are formed in U. An indiscernibility relation (Pawlak, 1991) in a universe refers to the similarities that every element in the universe has with the other elements of the universe. The family of all granules obtained using the relation R is represented as U/R. The indiscernibility relation among elements and sets in U results in an inexact definition of X. However, the set X can be approximately represented by two exactly definable sets $\underline{R}X$ and $\overline{R}X$ in U, which are obtained as

$$\underline{R}X = \bigcup \{Y \in U/R : Y \subseteq X\} \tag{3.1}$$

$$\overline{R}X = \bigcup \{Y \in U/R : Y \cap X \neq \emptyset\} \tag{3.2}$$

In the above, $\underline{R}X$ and $\overline{R}X$ are respectively called the R-lower approximation and the R-upper approximation of X. In essence, the pair of sets $\langle \underline{R}X, \overline{R}X \rangle$ is the representation of any arbitrary set X ⊆ U in the approximation space $\langle U, R \rangle$, where X cannot be defined. As given in (Pawlak, 1991), an inexactness measure of the set X can be defined as

$$\rho_R(X) = 1 - \frac{|\underline{R}X|}{|\overline{R}X|} \tag{3.3}$$

where $|\underline{R}X|$ and $|\overline{R}X|$ are respectively the cardinalities of the sets $\underline{R}X$ and $\overline{R}X$ in U. The inexactness measure $\rho_R(X)$ is called the R-roughness measure of X and it takes a value in the interval [0, 1].
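For the crisp case, (3.1)-(3.3) can be illustrated with a few lines of Python. The sketch below assumes that the granulation U/R is given explicitly as a list of disjoint sets; the function names `approximations` and `roughness`, the toy universe, and the convention of returning 0 for an empty upper approximation are illustrative assumptions, not part of the chapter.

```python
def approximations(granules, X):
    """R-lower and R-upper approximations of X, as in (3.1) and (3.2).

    `granules` is the family U/R, given as an iterable of disjoint sets.
    """
    X = set(X)
    lower, upper = set(), set()
    for Y in granules:
        Y = set(Y)
        if Y <= X:        # granule entirely inside X
            lower |= Y
        if Y & X:         # granule overlaps X
            upper |= Y
    return lower, upper


def roughness(granules, X):
    """R-roughness of X, as in (3.3)."""
    lower, upper = approximations(granules, X)
    if not upper:
        return 0.0  # assumed convention for an empty X
    return 1 - len(lower) / len(upper)


# Toy universe {0, ..., 7} partitioned into four granules of two elements.
granules = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]
X = {0, 1, 2}
print(approximations(granules, X))  # ({0, 1}, {0, 1, 2, 3})
print(roughness(granules, X))       # 0.5
```

Here X cuts through the granule {2, 3}, so that granule enters the upper approximation but not the lower one, and half of the upper approximation is uncertain.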
3.2.2
The Lower and Upper Approximations of a Set
The expressions for the lower and upper approximations of the set X depend on the type of relation R and on whether X is a crisp (Klir and Yuan, 2005) or a fuzzy (Klir and Yuan, 2005) set. Here we shall consider the upper and lower approximations of the set X when R denotes an equivalence, a fuzzy equivalence, a tolerance or a fuzzy tolerance relation and X is a crisp or a fuzzy set. When X is a crisp or a fuzzy set and the relation R is a crisp or a fuzzy equivalence relation, we consider the expressions for the lower and the upper approximations of the set X as

$$\underline{R}X = \{(u, \underline{M}(u)) \mid u \in U\}, \qquad \overline{R}X = \{(u, \overline{M}(u)) \mid u \in U\} \tag{3.4}$$

where

$$\underline{M}(u) = \sum_{Y \in U/R} m_Y(u) \times \inf_{\varphi \in U} \max\bigl(1 - m_Y(\varphi), \mu_X(\varphi)\bigr)$$

$$\overline{M}(u) = \sum_{Y \in U/R} m_Y(u) \times \sup_{\varphi \in U} \min\bigl(m_Y(\varphi), \mu_X(\varphi)\bigr) \tag{3.5}$$

where the membership function $m_Y$ represents the belongingness of every element (u) in the universe (U) to a granule Y ∈ U/R and it takes values in the interval [0, 1] such that $\sum_Y m_Y(u) = 1$, and $\mu_X$, which takes values in the interval [0, 1], is the membership function associated with X. When X is a crisp set, $\mu_X$ would take values only from the set {0, 1}. Similarly, when R is a crisp equivalence relation, $m_Y$ would take values only from the set {0, 1}. In the above, the symbols $\sum$ (sum) and × (product) respectively represent specific fuzzy union and intersection operations (Klir and Yuan, 2005), which are chosen judging their suitability with respect to the underlying application of measuring ambiguity.

In the above, we have considered the indiscernibility relation R ⊆ U × U to be an equivalence relation, that is, R satisfies crisp or fuzzy reflexivity, symmetry and transitivity properties (Klir and Yuan, 2005). We shall also consider here the case when the transitivity property is not satisfied. Such a relation R is said to be a tolerance relation (Skowron and Stepaniuk, 1996). When R is a tolerance relation, we consider the expressions for the membership values corresponding to the lower and upper approximations (see (3.5)) of an arbitrary set X in U as

$$\underline{M}(u) = \inf_{\varphi \in U} \max\bigl(1 - S_R(u, \varphi), \mu_X(\varphi)\bigr)$$

$$\overline{M}(u) = \sup_{\varphi \in U} \min\bigl(S_R(u, \varphi), \mu_X(\varphi)\bigr) \tag{3.6}$$

where $S_R(u, \varphi)$ is a value representing the tolerance relation R between u and ϕ. Note that two different notions of expressing the upper and lower approximations of a set exist in the literature pertaining to rough set theory (Radzikowska and Kerre, 2002). Of the two notions, one is based on the concept of similarity and the other is based on the concept of granulation due to limited discernibility. We use the first notion in (3.6) and the second notion in (3.5), considering aspects of their practical implementation for measuring ambiguity. We refer to the pair of sets $\langle \underline{R}X, \overline{R}X \rangle$ differently depending on whether X is a crisp or a fuzzy set and on whether the relation R is a crisp or a fuzzy equivalence, or a crisp or a fuzzy tolerance relation. The different names are listed in Table 3.1.

TABLE 3.1  The different names of $\langle \underline{R}X, \overline{R}X \rangle$

X       R                                               $\langle \underline{R}X, \overline{R}X \rangle$
Crisp   m_Y(u) ∈ {0, 1} (crisp equivalence)             rough set of X
Fuzzy   m_Y(u) ∈ {0, 1} (crisp equivalence)             rough-fuzzy set of X
Crisp   m_Y(u) ∈ [0, 1] (fuzzy equivalence)             fuzzy rough set of X
Fuzzy   m_Y(u) ∈ [0, 1] (fuzzy equivalence)             fuzzy rough-fuzzy set of X
Crisp   S_R : U × U → {0, 1} (crisp tolerance)          tolerance rough set of X
Fuzzy   S_R : U × U → {0, 1} (crisp tolerance)          tolerance rough-fuzzy set of X
Crisp   S_R : U × U → [0, 1] (fuzzy tolerance)          tolerance fuzzy rough set of X
Fuzzy   S_R : U × U → [0, 1] (fuzzy tolerance)          tolerance fuzzy rough-fuzzy set of X
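The membership expressions in (3.5) and (3.6) translate directly into array operations. The NumPy sketch below is a minimal illustration under two assumptions that go beyond the text: the universe is finite, so that inf and sup become min and max, and the fuzzy union and intersection in (3.5) are taken to be the ordinary arithmetic sum and product. The function names and the array layout (`m[Y, u]` for granule memberships, `S[u, phi]` for the tolerance relation) are illustrative.

```python
import numpy as np


def granule_approximations(m, mu):
    """Lower and upper memberships of (3.5).

    m  : array of shape (granules, elements), m[Y, u] = m_Y(u),
         with every column summing to 1.
    mu : array of shape (elements,), mu[u] = mu_X(u).
    """
    m = np.asarray(m, dtype=float)
    mu = np.asarray(mu, dtype=float)
    inf_term = np.maximum(1.0 - m, mu).min(axis=1)  # one value per granule
    sup_term = np.minimum(m, mu).max(axis=1)        # one value per granule
    lower = m.T @ inf_term   # sum over Y of m_Y(u) * inf_term[Y]
    upper = m.T @ sup_term   # sum over Y of m_Y(u) * sup_term[Y]
    return lower, upper


def tolerance_approximations(S, mu):
    """Lower and upper memberships of (3.6), with S[u, phi] = S_R(u, phi)."""
    S = np.asarray(S, dtype=float)
    mu = np.asarray(mu, dtype=float)
    lower = np.maximum(1.0 - S, mu[None, :]).min(axis=1)
    upper = np.minimum(S, mu[None, :]).max(axis=1)
    return lower, upper


# Crisp toy case: U = {0, 1, 2, 3}, granules {0, 1} and {2, 3}, X = {0, 1, 2}.
m = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1]])
mu = np.array([1, 1, 1, 0])
S = m.T @ m  # S[u, phi] = 1 exactly when u and phi share a granule

print(granule_approximations(m, mu))    # lower [1, 1, 0, 0], upper [1, 1, 1, 1]
print(tolerance_approximations(S, mu))  # same memberships in this crisp case
```

In this crisp example both notions reduce to the classical approximations of (3.1) and (3.2): the lower approximation is {0, 1} and the upper approximation is the whole universe.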
3.2.3
The Entropy Measures
As mentioned earlier, the lower and upper approximations of a vaguely definable set X in a universe U can be used in the expression given in (3.3) in order to get an inexactness measure of the set X called the roughness measure $\rho_R(X)$. The vague definition of X in U signifies incompleteness of knowledge about U. Here we propose two classes of entropy measures based on the roughness measures of a set and its complement in order to quantify the incompleteness of knowledge about a universe. One of the proposed two classes of entropy measures is obtained by measuring the ‘gain in information’, or in our case the ‘gain in incompleteness’, using a logarithmic function as suggested in Shannon’s theory (Shannon, 1948). This proposed class of entropy measures for quantifying the incompleteness of knowledge about U with respect to the definability of a set X ⊆ U is given as

$$H_R^L(X) = -\frac{1}{2}\left[\rho_R(X)\log_\beta\left(\frac{\rho_R(X)}{\beta}\right) + \rho_R(X^{\complement})\log_\beta\left(\frac{\rho_R(X^{\complement})}{\beta}\right)\right] \tag{3.7}$$

where β denotes the base of the logarithmic function used and $X^{\complement}$ ⊆ U stands for the complement of the set X in the universe. The various entropy measures of this class are obtained by calculating the roughness values $\rho_R(X)$ and $\rho_R(X^{\complement})$ considering the different ways of obtaining the lower and upper approximations of the vaguely definable set X. Note that the ‘gain in incompleteness’ term is taken as $-\log_\beta\left(\frac{\rho_R}{\beta}\right)$ in (3.7) and for β > 1 it takes a value in the interval [1, ∞].

The other class of entropy measures proposed is obtained by considering an exponential function (Pal and Pal, 1991) to measure the ‘gain in incompleteness’. This second proposed class of entropy measures for quantifying the incompleteness of knowledge about U with respect to the definability of a set X ⊆ U is given as

$$H_R^E(X) = \frac{1}{2}\left[\rho_R(X)\,\beta^{\,1-\rho_R(X)} + \rho_R(X^{\complement})\,\beta^{\,1-\rho_R(X^{\complement})}\right] \tag{3.8}$$

where β denotes the base of the exponential function used. The authors in (Pal and Pal, 1991) had considered only the case when β equaled e. Similar to the class of entropy measures $H_R^L$, the various entropy measures of this class are obtained by using the different ways of obtaining the lower and upper approximations of X in order to calculate $\rho_R(X)$ and $\rho_R(X^{\complement})$. The ‘gain in incompleteness’ term is taken as $\beta^{\,1-\rho_R}$ in (3.8) and for β > 1 it takes a value in the finite interval [1, β]. Note that an analysis of the appropriate values that β in $H_R^L$ and $H_R^E$ can take is given later in Section 3.2.5.

We shall name a proposed entropy measure using attributes that represent the class (logarithmic or exponential) it belongs to, and the type of the pair of sets $\langle \underline{R}X, \overline{R}X \rangle$ considered. For example, if $\langle \underline{R}X, \overline{R}X \rangle$ represents a tolerance rough-fuzzy set and the expression of the proposed entropy in (3.8) is considered, then we call such an entropy the exponential tolerance rough-fuzzy entropy. Some other examples of names for the proposed entropy measures are the logarithmic rough entropy, the exponential fuzzy rough entropy and the logarithmic tolerance fuzzy rough-fuzzy entropy.
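Once the two roughness values are available, (3.7) and (3.8) are straightforward to evaluate. The Python sketch below is a minimal illustration: the function names `log_entropy` and `exp_entropy` are illustrative, the default β = e is only a convenient choice (the admissible values of β are the subject of Section 3.2.5), and a term with zero roughness is taken to contribute zero to (3.7), which is the usual limiting convention and an assumption here.

```python
import math


def log_entropy(rho_x, rho_xc, beta=math.e):
    """Logarithmic class (3.7) from the roughness of X and of its complement."""
    def term(rho):
        # rho * log_beta(rho / beta); taken as 0 when rho == 0 (limit convention)
        return 0.0 if rho == 0 else rho * math.log(rho / beta, beta)
    return -0.5 * (term(rho_x) + term(rho_xc))


def exp_entropy(rho_x, rho_xc, beta=math.e):
    """Exponential class (3.8) from the roughness of X and of its complement."""
    return 0.5 * (rho_x * beta ** (1 - rho_x) + rho_xc * beta ** (1 - rho_xc))


# Exactly definable set: both roughness values are 0, so both entropies vanish.
print(log_entropy(0.0, 0.0), exp_entropy(0.0, 0.0))

# Totally rough set and complement: both roughness values are 1.
print(log_entropy(1.0, 1.0), exp_entropy(1.0, 1.0))

# An intermediate case.
print(log_entropy(0.5, 1 / 3), exp_entropy(0.5, 1 / 3))
```

With both roughness values equal to 1, each ‘gain in incompleteness’ term equals 1 and both entropies evaluate to 1, independently of β.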
3.2.4
Relation between $\rho_R(X)$ and $\rho_R(X^{\complement})$
Let us first briefly discuss uncertainty measures based on fuzzy set theory. Assume that a set $FS$ is fuzzy in nature and is associated with a membership function $\mu_{FS}$. As mentioned in (Pal and Bezdek, 1994), most of the appropriate fuzzy set theory based uncertainty measures can be grouped into two classes, namely, the multiplicative class and the additive class. It should be noted from (Pal and Bezdek, 1994) that the measures belonging to these classes are functions of $\mu_{FS}$ and $\mu_{FS^{\complement}}$, where $\mu_{FS} = 1 - \mu_{FS^{\complement}}$. Now, as mentioned in (Jumarie, 1990) and pointed out in (Pal and Bezdek, 1994), the existence of an exact relation between $\mu_{FS}$ and $\mu_{FS^{\complement}}$ suggests that they ‘theoretically’ convey the same information. However, such seemingly unnecessary terms should sometimes be retained, as dropping them would cause the corresponding measures to fail certain important properties (Pal and Bezdek, 1994). We shall now analyze the relation between $\rho_R(X)$ and $\rho_R(X^{\complement})$, and show that there exist no unnecessary terms in the classes of entropy measures (see (3.7) and (3.8)) proposed using rough set theory and certain of its generalizations. As it is known that $\rho_R(X)$ takes a value in the interval $[0, 1]$, let us consider
$$\rho_R(X) = \frac{1}{C}, \quad 1 \le C \le \infty \tag{3.9}$$
Let us now find the range of values that ρR (X { ) can take when the value of ρR (X) is given. Let the total number of elements in the universe U under consideration be n. As we have
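$X \cup X^{\complement} = U$, it can be easily deduced that $\underline{R}X \cup \overline{R}X^{\complement} = U$ and $\overline{R}X \cup \underline{R}X^{\complement} = U$. Using these deductions, from (3.3) we get
$$\rho_R(X) = 1 - \frac{|\underline{R}X|}{|\overline{R}X|} \tag{3.10}$$
$$\rho_R(X^{\complement}) = 1 - \frac{|\underline{R}X^{\complement}|}{|\overline{R}X^{\complement}|} = 1 - \frac{n - |\overline{R}X|}{n - |\underline{R}X|} \tag{3.11}$$
From (3.9), (3.10) and (3.11), we deduce that
$$\rho_R(X^{\complement}) = \rho_R(X)\left(\frac{|\overline{R}X|}{n - |\underline{R}X|}\right) = \frac{1}{C}\left(\frac{|\overline{R}X|}{n - |\underline{R}X|}\right) \tag{3.12}$$
We shall now separately consider three cases of (3.12), where we have $1 < C < \infty$, $C = 1$ and $C = \infty$.
When we have $1 < C < \infty$, we get the relation $\frac{|\underline{R}X|}{|\overline{R}X|} = \frac{C-1}{C}$ from (3.9). Using this relation in (3.12) we obtain
$$\rho_R(X^{\complement}) = \frac{1}{C}\left(\frac{|\underline{R}X|\left(\frac{C}{C-1}\right)}{n - |\underline{R}X|}\right) \tag{3.13}$$
After some algebraic manipulations, we deduce
$$\rho_R(X^{\complement}) = \frac{1}{C-1}\left(\frac{1}{\frac{n}{|\underline{R}X|} - 1}\right) \tag{3.14}$$
Note that, when $1 < C < \infty$, $\rho_R(X)$ takes a value in the interval $(0, 1)$. Therefore, in this case, the value of $|\overline{R}X|$ could range from a positive infinitesimal quantity, say $\epsilon$, to a maximum value of $n$. Hence, we have
$$\frac{C-1}{C}\,\epsilon \le |\underline{R}X| \le \frac{C-1}{C}\,n \tag{3.15}$$
Using (3.15) in (3.14), we get
$$\frac{\epsilon}{nC - (C-1)\epsilon} \le \rho_R(X^{\complement}) \le 1 \tag{3.16}$$
As $1 < C < \infty$ and $\epsilon$ is a positive infinitesimal quantity, the lower bound in (3.16) is positive and arbitrarily close to zero, so we may write (3.16) as
$$0 < \rho_R(X^{\complement}) \le 1 \tag{3.17}$$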
Thus, we may conclude that for a given non-zero and non-unity value of $\rho_R(X)$, $\rho_R(X^{\complement})$ may take any value in the interval $(0, 1]$. When $C = 1$, that is, when $\rho_R(X)$ takes a unity value, $|\underline{R}X| = 0$ and the value of $|\overline{R}X|$ could range from $\epsilon$ to a maximum value of $n$. Therefore, it is evident from (3.12) that $\rho_R(X^{\complement})$ may take any value in the interval $(0, 1]$ when $\rho_R(X) = 1$. Let us now consider the case when $C = \infty$, that is, $\rho_R(X) = 0$. In such a case, the value of $|\overline{R}X|$ could range from zero to a maximum value of $n$ and $|\underline{R}X| = |\overline{R}X|$. As evident from (3.12), when $C = \infty$, irrespective of any other term, we get $\rho_R(X^{\complement}) = 0$. This is obvious, as an exactly definable set $X$ should imply an exactly definable set $X^{\complement}$. Therefore, we find that the relation between $\rho_R(X)$ and $\rho_R(X^{\complement})$ is such that, if one of them takes a non-zero value (that is, the underlying set is vaguely definable or inexact), the value of the other, which would also be a non-zero quantity, cannot be uniquely specified. Therefore, there exist no unnecessary terms in the proposed classes of entropy measures given in (3.7) and (3.8). However, from (3.10) and (3.11), it is evident that $\rho_R(X)$ and $\rho_R(X^{\complement})$ are positively correlated.
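The conclusion above can be checked numerically. The following is a minimal sketch, assuming crisp approximations whose cardinalities are plain integers; the function names and the two example sets are hypothetical and are chosen only so that both sets have the same roughness (C = 2) in a universe of n = 100 elements.

```python
from fractions import Fraction

def roughness(lower: int, upper: int) -> Fraction:
    """Roughness 1 - |lower| / |upper|, as in (3.10).

    Assumption: an empty upper approximation is treated as exactly definable.
    """
    if upper == 0:
        return Fraction(0)
    return 1 - Fraction(lower, upper)

def complement_roughness(lower: int, upper: int, n: int) -> Fraction:
    """Roughness of the complement, as in (3.11).

    The lower approximation of the complement has n - upper elements and
    its upper approximation has n - lower elements.
    """
    return roughness(n - upper, n - lower)

n = 100
# Two hypothetical sets given as (|lower approximation|, |upper approximation|).
set_a = (10, 20)
set_b = (40, 80)

for lower, upper in (set_a, set_b):
    print(roughness(lower, upper), complement_roughness(lower, upper, n))
```

Both sets have roughness 1/2, yet the roughness of the complement differs between them, so one value does not determine the other.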
3.2.5
Properties of the Proposed Classes of Entropy Measures
In the previous two subsections we have proposed two classes of entropy measures and we have shown that the expressions for the proposed entropy measures do not have any unnecessary terms. However, the base parameters $\beta$ (see (3.7) and (3.8)) of the two classes of entropy measures are subject to certain restrictions, so that the proposed entropies satisfy some important properties. In this subsection, we shall discuss the restrictions regarding the base parameters and then provide a few properties of the proposed entropies.
Range of Values for the Base β
The proposed classes of entropy measures $H_R^L$ and $H_R^E$, respectively given in (3.7) and (3.8), must be consistent with the fact that maximum information (entropy) is available when the uncertainty is maximum and that the entropy is zero when there is no uncertainty. Note that, in our case, maximum uncertainty represents maximum possible incompleteness of knowledge about the universe. Therefore, maximum uncertainty occurs when both the roughness values used in $H_R^L$ and $H_R^E$ equal unity, and uncertainty is zero when both of them are zero. It can be easily shown that in order to satisfy the aforesaid condition, the base $\beta$ in $H_R^L$ must take a finite value greater than or equal to $e$ ($\approx 2.7183$) and the base $\beta$ in $H_R^E$ must take a value in the interval $(1, e]$. When $\beta \ge e$ in $H_R^L$ and $1 < \beta \le e$ in $H_R^E$, the values taken by both $H_R^L$ and $H_R^E$ lie in the range $[0, 1]$. Note that, when $\beta$ takes an appropriate value, the proposed entropy measures attain the minimum value of zero only when $\rho_R(X) = \rho_R(X^{\complement}) = 0$ and the maximum value of unity only when $\rho_R(X) = \rho_R(X^{\complement}) = 1$.
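One way to see where these bounds on the base come from is to look at the sign of the first derivative of a single term of each entropy. The following is a sketch, assuming the entropies are written in the parametric form of (3.18) and (3.19) with a roughness value $A$; it is not necessarily the argument the authors have in mind.

```latex
\begin{aligned}
\frac{\partial}{\partial A}\left[-\frac{1}{2}\,A\log_\beta\frac{A}{\beta}\right]
  &= \frac{1}{2}\left[1 - \frac{1}{\ln\beta} - \log_\beta A\right] \ge 0
  \quad \text{for all } A \in (0,1]
  \iff \ln\beta \ge 1, \\[4pt]
\frac{\partial}{\partial A}\left[\frac{1}{2}\,A\,\beta^{1-A}\right]
  &= \frac{1}{2}\,\beta^{1-A}\left(1 - A\ln\beta\right) \ge 0
  \quad \text{for all } A \in [0,1]
  \iff \ln\beta \le 1 .
\end{aligned}
```

In the first line the bracket is smallest at $A = 1$, where it equals $1 - 1/\ln\beta$; in the second line the factor $1 - A\ln\beta$ is smallest at $A = 1$. Under these conditions each term is non-decreasing on the unit interval, so the maximum is reached at unit roughness, where both entropies evaluate to 1.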
Properties
Here we present a few properties of the proposed logarithmic and exponential classes of entropy measures, expressing $H_R^L$ and $H_R^E$ as functions of two parameters representing roughness measures. We may respectively rewrite the expressions given in (3.7) and (3.8) in parametric form as follows
$$H_R^L(A, B) = -\frac{1}{2}\left[A \log_\beta\left(\frac{A}{\beta}\right) + B \log_\beta\left(\frac{B}{\beta}\right)\right] \tag{3.18}$$
$$H_R^E(A, B) = \frac{1}{2}\left[A\beta^{1-A} + B\beta^{1-B}\right] \tag{3.19}$$
where the parameters $A$ ($\in [0, 1]$) and $B$ ($\in [0, 1]$) represent the roughness values $\rho_R(X)$ and $\rho_R(X^{\complement})$, respectively. Considering the convention $0 \log_\beta 0 = 0$, let us now discuss the properties of $H_R^L(A, B)$ and $H_R^E(A, B)$ along the lines of (Ebanks, 1983).
P1. Nonnegativity: We have $H_R^L(A, B) \ge 0$ and $H_R^E(A, B) \ge 0$, with equality in both cases if and only if $A = 0$ and $B = 0$.
P2. Continuity: As all first-order partial and total derivatives of $H_R^E(A, B)$ exist for $A, B \in [0, 1]$, $H_R^E(A, B)$ is a continuous function of $A$ and $B$. On the other hand, all first-order partial and total derivatives of $H_R^L(A, B)$ exist only for $A, B \in (0, 1]$. However, it can be easily shown using L'Hôpital's rule that $\lim_{A \to 0, B \to 0} H_R^L(A, B) = 0$ ($H_R^L(A, B)$ tends to zero when $A$ and $B$ tend to zero), and we have $0 \log_\beta 0 = 0$. Therefore, $H_R^L(A, B)$ is a continuous function of $A$ and $B$, where $A, B \in [0, 1]$.
P3. Sharpness: It is evident that both $H_R^L(A, B)$ and $H_R^E(A, B)$ equal zero if and only if the roughness values $A$ and $B$ equal zero, that is, $A$ and $B$ are ‘sharp’.
P4. Maximality: Both $H_R^L(A, B)$ and $H_R^E(A, B)$ attain their maximum value of unity if and only if the roughness values $A$ and $B$ are unity. That is, we have $H_R^L(A, B) \le H_R^L(1, 1) = 1$ and $H_R^E(A, B) \le H_R^E(1, 1) = 1$, where $A, B \in [0, 1]$.
P5. Resolution: We have $H_R^L(A^*, B^*) \le H_R^L(A, B)$ and $H_R^E(A^*, B^*) \le H_R^E(A, B)$, where $A^*$ and $B^*$ are respectively the sharpened versions of $A$ and $B$, that is, $A^* \le A$ and $B^* \le B$.
P6. Symmetry: It is evident that $H_R^L(A, B) = H_R^L(B, A)$ and $H_R^E(A, B) = H_R^E(B, A)$. Hence $H_R^L(A, B)$ and $H_R^E(A, B)$ are symmetric about the line $A = B$.
P7. Monotonicity: The first-order partial and total derivatives of $H_R^L(A, B)$, when $A, B \in (0, 1]$, are
$$\begin{aligned}
\frac{\partial H_R^L}{\partial A} = \frac{dH_R^L}{dA} &= -\frac{1}{2}\left[\log_\beta\left(\frac{A}{\beta}\right) + \frac{1}{\ln \beta}\right] \\
\frac{\partial H_R^L}{\partial B} = \frac{dH_R^L}{dB} &= -\frac{1}{2}\left[\log_\beta\left(\frac{B}{\beta}\right) + \frac{1}{\ln \beta}\right]
\end{aligned} \tag{3.20}$$
For the appropriate values of $\beta$ in $H_R^L(A, B)$, where $A, B \in (0, 1]$, we have
$$\frac{\partial H_R^L}{\partial A} = \frac{dH_R^L}{dA} \ge 0 \quad \text{and} \quad \frac{\partial H_R^L}{\partial B} = \frac{dH_R^L}{dB} \ge 0 \tag{3.21}$$
Since we have $H_R^L(A, B) = 0$ when $A = B = 0$ and $H_R^L(A, B) > 0$ when $A, B \in (0, 1]$, we may conclude from (3.20) and (3.21) that $H_R^L(A, B)$ is a monotonically non-decreasing function. In a similar manner, for appropriate values of $\beta$ in $H_R^E(A, B)$, where $A, B \in (0, 1]$, we have
$$\begin{aligned}
\frac{\partial H_R^E}{\partial A} = \frac{dH_R^E}{dA} &= \frac{1}{2}\left[\beta^{(1-A)} - A\beta^{(1-A)} \ln \beta\right] \ge 0 \\
\frac{\partial H_R^E}{\partial B} = \frac{dH_R^E}{dB} &= \frac{1}{2}\left[\beta^{(1-B)} - B\beta^{(1-B)} \ln \beta\right] \ge 0
\end{aligned} \tag{3.22}$$
We also have $H_R^E(A, B) = 0$ when $A = B = 0$ and $H_R^E(A, B) > 0$ when $A, B \in (0, 1]$, and hence we may conclude from (3.22) that $H_R^E(A, B)$ is a monotonically non-decreasing function.
P8. Concavity: A two dimensional function $fun(A, B)$ is concave on a two dimensional interval $\langle [a_{min}, a_{max}], [b_{min}, b_{max}] \rangle$ if, for any four points $a_1, a_2 \in [a_{min}, a_{max}]$ and $b_1, b_2 \in [b_{min}, b_{max}]$, and any $\lambda_a, \lambda_b \in (0, 1)$,
$$\begin{aligned}
fun(\lambda_a a_1 + (1 - \lambda_a) a_2,\ \lambda_b b_1 + (1 - \lambda_b) b_2) \ge{} & \lambda_{11} fun(a_1, b_1) + \lambda_{12} fun(a_1, b_2) \\
& + \lambda_{21} fun(a_2, b_1) + \lambda_{22} fun(a_2, b_2)
\end{aligned} \tag{3.23}$$
where $\lambda_{11} = \lambda_a \lambda_b$, $\lambda_{12} = \lambda_a (1 - \lambda_b)$, $\lambda_{21} = (1 - \lambda_a) \lambda_b$ and $\lambda_{22} = (1 - \lambda_a)(1 - \lambda_b)$.
FIGURE 3.2: Plots of the proposed classes of entropy measures: (a) the proposed logarithmic class of entropies and (b) the proposed exponential class of entropies, for various roughness values A and B
FIGURE 3.3: Plots of the proposed entropy measures for a few values of the base β, when A = B
Both $H_R^L(A, B)$ and $H_R^E(A, B)$ are concave functions of $A$ and $B$, where $A, B \in [0, 1]$, as they satisfy the inequality given in (3.23) when appropriate values of $\beta$ and the convention $0 \log_\beta 0 = 0$ are considered. The plots of the proposed classes of entropies $H_R^L$ and $H_R^E$ are given in Figures 3.2 and 3.3. In Figure 3.2, the values of $H_R^L$ and $H_R^E$ are shown as functions of $A$ and $B$ for all possible values of the roughness measures, considering $\beta = e$. Figure 3.3 shows the plots of the proposed entropies for different values of the base $\beta$, when $A = B$.
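Several of the properties listed above can be spot-checked numerically. The sketch below implements the parametric forms (3.18) and (3.19) directly; the function names `h_log` and `h_exp` and the grid resolution are illustrative choices, not part of the original text.

```python
import math

def h_log(a: float, b: float, beta: float = math.e) -> float:
    """Logarithmic class in the parametric form (3.18), with 0 * log(0) = 0."""
    def term(x: float) -> float:
        return 0.0 if x == 0 else x * math.log(x / beta, beta)
    return -0.5 * (term(a) + term(b))

def h_exp(a: float, b: float, beta: float = math.e) -> float:
    """Exponential class in the parametric form (3.19)."""
    return 0.5 * (a * beta ** (1 - a) + b * beta ** (1 - b))

grid = [i / 20 for i in range(21)]  # roughness values 0, 0.05, ..., 1
for h in (h_log, h_exp):
    values = [[h(a, b) for b in grid] for a in grid]
    # P1 and P4: values stay within [0, 1] (up to rounding) for beta = e.
    assert all(-1e-12 <= v <= 1 + 1e-12 for row in values for v in row)
    # P6: symmetry about the line A = B.
    assert all(math.isclose(values[i][j], values[j][i])
               for i in range(21) for j in range(21))
    # P7: non-decreasing in A for every fixed B.
    assert all(values[i][j] <= values[i + 1][j] + 1e-12
               for i in range(20) for j in range(21))
```

With a base outside the stated range the maximality property no longer holds; for instance, `h_log(0.75, 0.75, beta=2.0)` exceeds 1.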
3.3
Measuring Grayness Ambiguity in Images
In this section, we shall use the entropy measures proposed in the previous section in order to quantify the grayness ambiguity (see Section 3.1) in a grayscale image. As we shall see later, the entropy measures based on the generalization of rough set theory regarding the
approximation of fuzzy sets (that is, when the set $X$ considered in the previous section is fuzzy) can be used to quantify grayness ambiguity due to both fuzzy boundaries and rough resemblance, whereas the entropy measures based on the generalization of rough set theory regarding the approximation of crisp sets (that is, when the set $X$ considered in the previous section is crisp) can be used to quantify grayness ambiguity due to rough resemblance only. Now, we shall obtain the grayness ambiguity measure by considering the fuzzy boundaries of regions formed based on the global gray value distribution and the rough resemblance between nearby gray levels. The image is considered as an array of gray values, and the measure of the consequence of the incompleteness of knowledge about the universe of gray levels in the array quantifies the grayness ambiguity. Note that the measure of incompleteness of knowledge about a universe with respect to the definability of a set must be used here, as the set would be employed to capture the vagueness in region boundaries.
Let $G$ be the universe of gray levels and $\Upsilon_T$ be a set in $G$, that is $\Upsilon_T \subseteq G$, whose elements hold a particular property to extents given by a membership function $\mu_T$ defined on $G$. Let $O_I$ be the graylevel histogram of the image $I$ under consideration. The fuzzy boundaries and rough resemblance in $I$ causing the grayness ambiguity are related to the incompleteness of knowledge about $G$, which can be quantified using the classes of entropy measures proposed in Section 3.2.3. We shall consider $\Upsilon_T$ such that it represents the category ‘dark areas’ in the image $I$, and the associated property ‘darkness’ given by the membership function $\mu_T$ shall be modeled using the Z-function (Klir and Yuan, 2005) as given below
$$\mu_T(l) = Z(l; T, \Delta) = \begin{cases}
1 & l \le T - \Delta \\
1 - 2\left[\dfrac{l - (T - \Delta)}{2\Delta}\right]^2 & T - \Delta \le l \le T \\
2\left[\dfrac{l - (T + \Delta)}{2\Delta}\right]^2 & T \le l \le T + \Delta \\
0 & l \ge T + \Delta
\end{cases} \quad ; \; l \in G \tag{3.24}$$
where $T$ and $\Delta$ are respectively called the crossover point and the bandwidth. We shall consider the value of $\Delta$ as a constant, so that different definitions of the property ‘darkness’ are obtained by changing the value of $T$, where $T \in G$. In order to quantify the grayness ambiguity in the image $I$ using the proposed classes of entropy measures, we consider the following sets
$$\begin{aligned}
\Upsilon_T &= \{(l, \mu_T(l)) \mid l \in G\} \\
\Upsilon_T^{\complement} &= \{(l, 1 - \mu_T(l)) \mid l \in G\}
\end{aligned} \tag{3.25}$$
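The membership function in (3.24) and the two fuzzy sets in (3.25) can be written down directly. This is a minimal sketch, assuming an 8-bit universe of gray levels (0 to 255) and example values T = 128 and Δ = 8; the name `z_function` and the dictionary representation of the sets are illustrative choices.

```python
def z_function(l: float, T: float, delta: float) -> float:
    """Z-function of (3.24): degree to which gray level l is 'dark'."""
    if l <= T - delta:
        return 1.0
    if l <= T:
        return 1.0 - 2.0 * ((l - (T - delta)) / (2.0 * delta)) ** 2
    if l < T + delta:
        return 2.0 * ((l - (T + delta)) / (2.0 * delta)) ** 2
    return 0.0

# Assumed universe of gray levels for an 8-bit image.
G = range(256)
T, delta = 128, 8

# The fuzzy set of (3.25) and its complement, as {gray level: membership}.
upsilon_T = {l: z_function(l, T, delta) for l in G}
upsilon_T_complement = {l: 1.0 - m for l, m in upsilon_T.items()}
```

The membership equals 0.5 at the crossover point T, and moving T along the gray scale shifts what counts as 'dark' without changing the shape of the transition.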
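The fuzzy sets $\Upsilon_T$ and $\Upsilon_T^{\complement}$ considered above capture the fuzzy boundary aspect of the grayness ambiguity. Furthermore, we consider limited discernibility among the elements in $G$ that results in vague definitions of the fuzzy sets $\Upsilon_T$ and $\Upsilon_T^{\complement}$, and hence the rough resemblance aspect of the grayness ambiguity is also captured. Granules, with crisp or fuzzy boundaries, are induced in $G$ as its elements are drawn together due to the presence of limited discernibility (or an indiscernibility relation) among them, and this process is referred to as graylevel granulation. We assume that the indiscernibility relation is uniform in $G$ and hence the granules formed have a constant support cardinality (size), say, $\omega$. Now, using (3.4), (3.5) and (3.6), we get general expressions for the different lower and upper approximations of $\Upsilon_T$ and $\Upsilon_T^{\complement}$ obtained considering the different indiscernibility relations discussed in Section 3.2.2 as follows
$$\begin{aligned}
\underline{\Upsilon}_T &= \{(l, M_{\underline{\Upsilon}_T}(l)) \mid l \in G\}, & \overline{\Upsilon}_T &= \{(l, M_{\overline{\Upsilon}_T}(l)) \mid l \in G\} \\
\underline{\Upsilon}_T^{\complement} &= \{(l, M_{\underline{\Upsilon}_T^{\complement}}(l)) \mid l \in G\}, & \overline{\Upsilon}_T^{\complement} &= \{(l, M_{\overline{\Upsilon}_T^{\complement}}(l)) \mid l \in G\}
\end{aligned} \tag{3.26}$$
where we have
$$\begin{aligned}
M_{\underline{\Upsilon}_T}(l) &= \sum_{i=1}^{\gamma} m_{z_i^{\omega}}(l) \times \inf_{\varphi \in G} \max\left(1 - m_{z_i^{\omega}}(\varphi),\ \mu_T(\varphi)\right) \\
M_{\overline{\Upsilon}_T}(l) &= \sum_{i=1}^{\gamma} m_{z_i^{\omega}}(l) \times \sup_{\varphi \in G} \min\left(m_{z_i^{\omega}}(\varphi),\ \mu_T(\varphi)\right) \\
M_{\underline{\Upsilon}_T^{\complement}}(l) &= \sum_{i=1}^{\gamma} m_{z_i^{\omega}}(l) \times \inf_{\varphi \in G} \max\left(1 - m_{z_i^{\omega}}(\varphi),\ 1 - \mu_T(\varphi)\right) \\
M_{\overline{\Upsilon}_T^{\complement}}(l) &= \sum_{i=1}^{\gamma} m_{z_i^{\omega}}(l) \times \sup_{\varphi \in G} \min\left(m_{z_i^{\omega}}(\varphi),\ 1 - \mu_T(\varphi)\right)
\end{aligned} \tag{3.27}$$
when equivalence indiscernibility relation is considered and we have
$$\begin{aligned}
M_{\underline{\Upsilon}_T}(l) &= \inf_{\varphi \in G} \max\left(1 - S_\omega(l, \varphi),\ \mu_T(\varphi)\right), & M_{\overline{\Upsilon}_T}(l) &= \sup_{\varphi \in G} \min\left(S_\omega(l, \varphi),\ \mu_T(\varphi)\right) \\
M_{\underline{\Upsilon}_T^{\complement}}(l) &= \inf_{\varphi \in G} \max\left(1 - S_\omega(l, \varphi),\ 1 - \mu_T(\varphi)\right), & M_{\overline{\Upsilon}_T^{\complement}}(l) &= \sup_{\varphi \in G} \min\left(S_\omega(l, \varphi),\ 1 - \mu_T(\varphi)\right)
\end{aligned} \tag{3.28}$$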
when tolerance indiscernibility relation is considered. In the above, $\gamma$ denotes the number of granules formed in the universe $G$ and $m_{z_i^{\omega}}(l)$ gives the membership grade of $l$ in the $i$th granule $z_i^{\omega}$. These membership grades may be calculated using any concave, symmetric and normal membership function (with support cardinality $\omega$), such as one having a triangular, trapezoidal or bell (for example, the $\pi$ function) shape. Note that the sum of these membership grades over all the granules must be unity for a particular value of $l$. In (3.28), $S_\omega : G \times G \to [0, 1]$, which can be any concave and symmetric function, gives the relation between any two gray levels in $G$. The value of $S_\omega(l, \varphi)$ is zero when the difference between $l$ and $\varphi$ is greater than $\omega$, and $S_\omega(l, \varphi)$ equals unity when $l$ equals $\varphi$.
The lower and upper approximations of the sets $\Upsilon_T$ and $\Upsilon_T^{\complement}$ take different forms depending on the nature of rough resemblance considered, and on whether the need is to capture grayness ambiguity due to both fuzzy boundaries and rough resemblance or only that due to rough resemblance. The nature of rough resemblance may be considered such that an equivalence relation between gray levels induces granules having crisp (crisp $z_i^{\omega}$) or fuzzy (fuzzy $z_i^{\omega}$) boundaries, or such that there exists a tolerance relation between gray levels that may be crisp ($S_\omega : G \times G \to \{0, 1\}$) or fuzzy ($S_\omega : G \times G \to [0, 1]$). When the sets $\Upsilon_T$ and $\Upsilon_T^{\complement}$ considered are fuzzy sets, grayness ambiguity due to both fuzzy boundaries and rough resemblance would be captured, whereas when they are crisp sets, only the grayness ambiguity due to rough resemblance would be captured. The different forms of the lower and upper approximation of $\Upsilon_T$ are shown graphically in Figure 3.4. We shall now quantify the grayness ambiguity in the image $I$ by measuring the consequence of the incompleteness of knowledge about the universe of gray levels $G$ in $I$.
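This measurement is done by calculating the following values
$$\varrho_\omega(\Upsilon_T) = 1 - \frac{\sum_{l \in G} M_{\underline{\Upsilon}_T}(l)\, O_I(l)}{\sum_{l \in G} M_{\overline{\Upsilon}_T}(l)\, O_I(l)}, \qquad \varrho_\omega(\Upsilon_T^{\complement}) = 1 - \frac{\sum_{l \in G} M_{\underline{\Upsilon}_T^{\complement}}(l)\, O_I(l)}{\sum_{l \in G} M_{\overline{\Upsilon}_T^{\complement}}(l)\, O_I(l)} \tag{3.29}$$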
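FIGURE 3.4: The different forms that the lower and upper approximation of $\Upsilon_T$ can take when used to get the grayness ambiguity measure. Panels: (a) crisp $\Upsilon_T$ and crisp $z_i^{\omega}$; (b) fuzzy $\Upsilon_T$ and crisp $z_i^{\omega}$; (c) crisp $\Upsilon_T$ and fuzzy $z_i^{\omega}$; (d) fuzzy $\Upsilon_T$ and fuzzy $z_i^{\omega}$; (e) crisp $\Upsilon_T$ and $S_\omega : G \times G \to \{0, 1\}$; (f) fuzzy $\Upsilon_T$ and $S_\omega : G \times G \to \{0, 1\}$; (g) crisp $\Upsilon_T$ and $S_\omega : G \times G \to [0, 1]$; (h) fuzzy $\Upsilon_T$ and $S_\omega : G \times G \to [0, 1]$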
The grayness ambiguity measure $\Lambda$ of $I$ is obtained as a function of $T$, which characterizes the underlying set $\Upsilon_T$, as follows
$$\Lambda_\omega^L(T) = -\frac{1}{2}\left[\varrho_\omega(\Upsilon_T) \log_\beta\left(\frac{\varrho_\omega(\Upsilon_T)}{\beta}\right) + \varrho_\omega(\Upsilon_T^{\complement}) \log_\beta\left(\frac{\varrho_\omega(\Upsilon_T^{\complement})}{\beta}\right)\right] \tag{3.30}$$
Note that the above expression is obtained by using $\varrho_\omega(\Upsilon_T)$ and $\varrho_\omega(\Upsilon_T^{\complement})$, instead of roughness measures, in the proposed logarithmic (L) class of entropy functions given in (3.7). When the proposed exponential (E) class of entropy functions given in (3.8) is used, we get
$$\Lambda_\omega^E(T) = \frac{1}{2}\left[\varrho_\omega(\Upsilon_T)\, \beta^{1 - \varrho_\omega(\Upsilon_T)} + \varrho_\omega(\Upsilon_T^{\complement})\, \beta^{1 - \varrho_\omega(\Upsilon_T^{\complement})}\right] \tag{3.31}$$
It should be noted that the values $\varrho_\omega(\Upsilon_T)$ and $\varrho_\omega(\Upsilon_T^{\complement})$ in (3.29) are obtained by considering ‘weighted cardinality’ measures instead of the cardinality measures that are used for calculating roughness values (see (3.3)). The weights considered are the numbers of occurrences of gray values given by the graylevel histogram $O_I$ of the image $I$. Therefore, the weighted cardinality of the underlying set (in $G$) gives the number of pixels in the image $I$ that take the gray values belonging to that set. From (3.30) and (3.31), we see that the grayness ambiguity measure lies in the range $[0, 1]$, where a larger value means higher ambiguity.
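To make the chain from (3.28) through (3.29) to (3.30) concrete, the following is a minimal sketch for the fuzzy tolerance case. It assumes a triangular form for S_ω (the text only requires a concave, symmetric function), treats a zero denominator in (3.29) as zero roughness, and represents the histogram as a plain list indexed by gray level; all function names are illustrative.

```python
import math

def z_function(l, T, delta):
    """Z-function of (3.24)."""
    if l <= T - delta:
        return 1.0
    if l <= T:
        return 1.0 - 2.0 * ((l - (T - delta)) / (2.0 * delta)) ** 2
    if l < T + delta:
        return 2.0 * ((l - (T + delta)) / (2.0 * delta)) ** 2
    return 0.0

def tolerance(l, phi, omega):
    """Assumed triangular S_omega: 1 when l == phi, 0 when |l - phi| >= omega."""
    return max(0.0, 1.0 - abs(l - phi) / omega)

def weighted_roughness(mu, hist, omega):
    """Quantity in (3.29) for memberships mu, with approximations from (3.28)."""
    n = len(hist)
    num = den = 0.0
    for l in range(n):
        # Levels farther than omega from l have S_omega = 0 and cannot
        # change the infimum or the supremum, so they are skipped.
        nbrs = range(max(0, l - omega), min(n, l + omega + 1))
        lower = min(max(1.0 - tolerance(l, p, omega), mu[p]) for p in nbrs)
        upper = max(min(tolerance(l, p, omega), mu[p]) for p in nbrs)
        num += lower * hist[l]
        den += upper * hist[l]
    # Assumption: an empty weighted upper approximation means zero roughness.
    return 0.0 if den == 0 else max(0.0, 1.0 - num / den)

def grayness_ambiguity_log(hist, T, delta=8, omega=6, beta=math.e):
    """Logarithmic grayness ambiguity of (3.30) for crossover point T."""
    mu = [z_function(l, T, delta) for l in range(len(hist))]
    rho = weighted_roughness(mu, hist, omega)
    rho_c = weighted_roughness([1.0 - m for m in mu], hist, omega)

    def term(x):
        return 0.0 if x == 0 else x * math.log(x / beta, beta)

    return -0.5 * (term(rho) + term(rho_c))
```

A histogram whose pixels all lie far from the crossover point gives zero ambiguity under this sketch, while pixels concentrated at T give a clearly positive value.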
3.4
Image Thresholding based on Association Error
In this section, we propose a new methodology to perform image thresholding using the grayness ambiguity measure presented in the previous section. The proposed methodology does not make any prior assumptions about the image unlike many existing thresholding techniques. As boundaries of regions in an image are in general not well-defined and nearby gray values are indiscernible, we consider here that the various areas in an image are ambiguous in nature. We then use grayness ambiguity measures of regions in an image to perform thresholding in that image.
3.4.1
Bilevel Thresholding
Here we propose a methodology to carry out bilevel image thresholding based on the analysis of the graylevel histogram of the image under consideration. Let us consider two regions in the graylevel histogram of an image $I$ containing a few graylevel bins corresponding to the dark and bright areas of the image, respectively. These regions are obtained using two predefined gray values, say $g_d$ and $g_b$, with the graylevel bins in the range $[g_b, g_{max}]$ representing the initial bright region and the graylevel bins in the range $[g_{min}, g_d]$ representing the initial dark region. The symbols $g_{min}$ and $g_{max}$ represent the lowest and highest gray values of the image, respectively. A third region, given by the graylevel bins in the range $(g_d, g_b)$, is referred to as the undefined region. Now, let the association of a graylevel bin from the undefined region to the initial bright region cause an error of $Err_b$ units and the association of a graylevel bin from the undefined region to the initial dark region result in an error of $Err_d$ units. Then, if $Err_d > Err_b$ ($Err_b > Err_d$), it would be appropriate to assign the graylevel bin from the undefined region to the bright (dark) region.
The Proposed Methodology
Here we present the methodology to calculate the error caused due to the association of a graylevel bin from the undefined region to a defined region. Using this method we shall obtain the association errors corresponding to the dark and bright regions, that is, $Err_d$ and $Err_b$. Each of these association errors comprises two constituent error measures, referred to as the proximity error and the change error. Let $H_i$ represent the value of the $i$th bin of the graylevel histogram of a grayscale image $I$. We may define $S_b$, the array of all the graylevel bins in the initial bright region, as
$$S_b = [H_i : i \in G_b], \text{ where } G_b = [g_b, g_b + 1, \ldots, g_{max}] \tag{3.32}$$
and $S_d$, the array of all the graylevel bins in the initial dark region, as
$$S_d = [H_i : i \in G_d], \text{ where } G_d = [g_{min}, \ldots, g_d - 1, g_d] \tag{3.33}$$
Now, consider that a graylevel bin from the undefined region corresponding to a gray value $g_a$ has been associated to the initial bright region. The bright region after the association is represented by an array $S_b^a$ as
$$S_b^a = [H_i^a : i \in G_b^a], \text{ where } G_b^a = [g_a, \ldots, g_b, \ldots, g_{max}] \tag{3.34}$$
with $H_i^a = H_i$ when ($i = g_a$ or $i \ge g_b$) and $H_i^a = 0$ elsewhere. In a similar manner, the dark region after the association is represented by an array $S_d^a$ as
$$S_d^a = [H_i^a : i \in G_d^a], \text{ where } G_d^a = [g_{min}, \ldots, g_d, \ldots, g_a] \tag{3.35}$$
with $H_i^a = H_i$ when ($i = g_a$ or $i \le g_d$) and $H_i^a = 0$ elsewhere. In order to decide whether the graylevel bin corresponding to the gray value $g_a$ belongs to the bright or dark region, we need to determine the corresponding errors $Err_d$ and $Err_b$. As mentioned earlier, our measure of an association error ($Err$) comprises a proximity error measure ($e_p$) and a change error measure ($e_c$). We represent an association error as
$$Err = (\alpha + \beta e_c) + e_p \tag{3.36}$$
where $\alpha$ and $\beta$ are constants such that $\alpha + \beta e_c$ and $e_p$ take values from the same range, say, $[0, 1]$ (this $\beta$ is a weighting constant and is distinct from the base $\beta$ of the entropy measures). In order to determine the errors $e_p$ and $e_c$ corresponding to the bright and dark regions, let us consider the arrays $S_b^a$ and $S_d^a$, respectively. We define the change error due to the association in the bright region as
$$e_c^b = \frac{GA(S_b^a) - GA(\acute{S}_b^a)}{GA(S_b^a) + GA(\acute{S}_b^a)} \tag{3.37}$$
where the array $\acute{S}_b^a$ is obtained by replacing $H_{g_a}^a$ by 0 in $S_b^a$, and $GA(S_\Omega)$ gives the grayness ambiguity in the image region represented by the graylevel bins in an array $S_\Omega$. The grayness ambiguity in the image region is calculated using the expression in (3.30) or (3.31). Note that the grayness ambiguity is calculated here for a region in an image and not for the whole image as presented in Section 3.3. Now, in a similar manner, the change error due to the association in the dark region is given as
$$e_c^d = \frac{GA(S_d^a) - GA(\acute{S}_d^a)}{GA(S_d^a) + GA(\acute{S}_d^a)} \tag{3.38}$$
where the array $\acute{S}_d^a$ is obtained by replacing $H_{g_a}^a$ by 0 in $S_d^a$. It is evident that the expressions in (3.37) and (3.38) measure the change in grayness ambiguity of the regions due to the association of $g_a$, and hence we refer to these measures as the change errors. The form of these expressions is chosen so as to represent the measured change as the contrast in grayness ambiguity, which is given by the ratio of difference in grayness ambiguity to average grayness ambiguity. As can be deduced from (3.37) and (3.38), the change errors would take values in the range $[-1, 1]$. It is also evident from (3.37) and (3.38) that the change error may take a pathological value of 0/0. In such a case we consider the change error to be 1. Next, we define the proximity errors due to the associations in the bright and dark regions respectively as
$$e_p^b = 1 - GA(\acute{S}_b^a) \tag{3.39}$$
and
$$e_p^d = 1 - C \times GA(\acute{S}_d^a) \tag{3.40}$$
In the above, we take $e_p^d = 0$ if $C \times GA(\acute{S}_d^a) > 1$. It will be evident later, from the explanation of the function $GA(\cdot)$, that the grayness ambiguity measures in (3.39) and (3.40) increase with the increase in proximity of the graylevel bin corresponding to $g_a$ to the corresponding regions. Thus the expressions in (3.39) and (3.40) give measures of farness of the graylevel bin corresponding to $g_a$ from the regions, and hence we refer to these measures as the proximity errors. The symbol $C$ is a constant chosen such that the values of $e_p^b$ and $e_p^d$, when $g_a$ equals $g_b - 1$ and $g_d + 1$ respectively, are the same, so that the proximity error values are not biased towards any region. As can be deduced from (3.39) and (3.40), the proximity errors would take values in the range $[0, 1]$. The various arrays defined in this section are graphically shown in Figure 3.5.
FIGURE 3.5: The various defined arrays shown for a multimodal histogram
From Section 3.3, we find that we need to define the crossover point $T$, the bandwidth $\Delta$ of the Z-function and the granule size $\omega$ in order to measure grayness ambiguity. For the calculation of the association errors corresponding to the bright and dark regions, we define the respective crossover points as
$$T_b = \frac{g_a + g_b}{2} \tag{3.41}$$
$$\text{and} \quad T_d = \frac{g_d + g_a}{2} \tag{3.42}$$
Considering the above expressions for the crossover points and the explanation in Section 3.3, it can be easily deduced that the grayness ambiguity measures in (3.39) and (3.40) increase with the increase in proximity of the graylevel bin corresponding to $g_a$ to the defined regions. While calculating the association errors corresponding to both the bright and dark regions, it is important that the same bandwidth ($\Delta$) and the same granule size ($\omega$) be considered. As presented earlier in (3.36), the errors due to the association of a gray value from the undefined region to the dark and the bright region are given as
$$Err_d = (\alpha + \beta e_c^d) + e_p^d \tag{3.43}$$
$$Err_b = (\alpha + \beta e_c^b) + e_p^b \tag{3.44}$$
We calculate the association errors $Err_d$ and $Err_b$ for all graylevel bins corresponding to $g_a \in (g_d, g_b)$, that is, the graylevel bins of the undefined region. We then compare the corresponding association errors and assign each of these graylevel bins to whichever of the two defined (dark and bright) regions corresponds to the lower association error. In (3.43) and (3.44), we consider $\alpha = \beta = 0.5$ and hence force the range of contribution from the change errors to $[0, 1]$, the same as that of the proximity errors. Thus the bilevel thresholding is achieved by separating the bins of the graylevel histogram into two regions, namely, the dark and the bright regions. As a region in the graylevel histogram of an image corresponds to a region in the image, the aforesaid bilevel thresholding would divide the image into two regions.
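The error computation in (3.34) to (3.44) can be sketched as follows. The grayness ambiguity function is passed in as a caller-supplied callable `ga(region_histogram, T)` returning a value in [0, 1]; that interface, the full-length zero-padded region arrays, the default C = 1 and the tie-breaking rule (ties go to the bright region) are assumptions made for illustration and are not specified in the text.

```python
def association_errors(hist, g_d, g_b, g_a, ga, C=1.0, alpha=0.5, beta=0.5):
    """Return (Err_d, Err_b) of (3.43) and (3.44) for the bin at gray value g_a."""
    def region(keep):
        # Region array: histogram values where keep(i) holds, zero elsewhere.
        return [h if keep(i) else 0 for i, h in enumerate(hist)]

    def change_error(ga_with, ga_without):
        den = ga_with + ga_without
        # The pathological 0/0 case is taken as 1, as stated in the text.
        return 1.0 if den == 0 else (ga_with - ga_without) / den

    # Bright region: crossover point (3.41), errors (3.37) and (3.39).
    T_b = (g_a + g_b) / 2.0
    ga_b_with = ga(region(lambda i: i == g_a or i >= g_b), T_b)
    ga_b_without = ga(region(lambda i: i >= g_b), T_b)
    e_p_b = 1.0 - ga_b_without
    err_b = (alpha + beta * change_error(ga_b_with, ga_b_without)) + e_p_b

    # Dark region: crossover point (3.42), errors (3.38) and (3.40).
    T_d = (g_d + g_a) / 2.0
    ga_d_with = ga(region(lambda i: i == g_a or i <= g_d), T_d)
    ga_d_without = ga(region(lambda i: i <= g_d), T_d)
    e_p_d = max(0.0, 1.0 - C * ga_d_without)  # taken as 0 when C * GA > 1
    err_d = (alpha + beta * change_error(ga_d_with, ga_d_without)) + e_p_d
    return err_d, err_b

def assign_undefined_bins(hist, g_d, g_b, ga, **kwargs):
    """Label every bin in (g_d, g_b) with the region of lower association error."""
    labels = {}
    for g_a in range(g_d + 1, g_b):
        err_d, err_b = association_errors(hist, g_d, g_b, g_a, ga, **kwargs)
        # Assumption: ties are assigned to the bright region.
        labels[g_a] = "dark" if err_d < err_b else "bright"
    return labels
```

In practice `ga` would be a grayness ambiguity measure such as (3.30), evaluated with the same bandwidth and granule size for both regions.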
3.4.2
Multilevel Thresholding
Here we extend the bilevel image thresholding methodology given above to the multilevel image thresholding problem. Note that we do not possess the prior knowledge required to assign more than two seed values for carrying out multilevel thresholding. Therefore, we understand that the concept of thresholding based on association error can be used to separate a histogram only into two regions, and then these regions can further be separated only into two regions each, and so on. From this understanding, we find that the proposed concept of thresholding using association error could be used in a binary tree structured technique in order to carry out multilevel thresholding.
Now, let us consider that we require a multilevel image thresholding technique using association error in order to separate an image into $\Theta$ regions. Let $D$ be a non-negative integer such that $2^{D-1} < \Theta \le 2^D$. In our approach to multilevel image thresholding for obtaining $\Theta$ regions, we first separate the graylevel histogram of the image into $2^D$ regions. The implementation of this approach can be achieved using a binary tree structured algorithm (Breiman, Friedman, Olshen, and Stone, 1998). Note that in (Breiman et al., 1998), the binary tree structure has been used for classification purposes, which is not our concern. In our case, we use the binary tree structure to achieve multilevel image thresholding using association error, which is an unsupervised technique. We list a few characteristics of a binary tree below, stating what they represent when used for association error based multilevel image thresholding.
1. A node of the binary tree would represent a region in the histogram.
2. The root node of the binary tree represents the histogram of the whole image.
3. The depth of a node is given by $D$. At any depth $D$ we always have $2^D$ nodes (regions).
4. Splitting at each node is performed using the bilevel image thresholding technique using association error proposed in the previous section.
5. All the nodes at a depth $D$ are terminal nodes when our goal is to obtain $2^D$ regions in the histogram.
In order to get $\Theta$ regions from the $2^D$ regions, we need to declare certain bilevel thresholdings of histogram regions (nodes) at depth $D - 1$ as invalid. In order to do so, we define a measure ($\iota$) of a histogram region based on the association errors $Err_d$ and $Err_b$ obtained for the
values of $g_a$ (see Section 3.4.1) corresponding to the histogram region as follows
$$\iota = \sum_{g_a \in (g_d, g_b)} \left[Err_d(g_a) + Err_b(g_a)\right] \tag{3.45}$$
FIGURE 3.6: Separation of a histogram into three regions using the proposed multilevel thresholding based on association error
where $g_a$, $g_d$ and $g_b$ are the same as explained in the previous section, except for the fact that they are defined for the underlying histogram region and not for the entire histogram. We use the expression in (3.45) to measure the suitability of the application of the bilevel image thresholding technique to all the regions in the graylevel histogram at the depth $D - 1$. The larger the value of $\iota$ for a region of the graylevel histogram, the larger the corresponding average association error and hence the greater the suitability. Hence, in order to get $\Theta$ regions, we declare the bilevel thresholding of the $2^D - \Theta$ least suitable (based on $\iota$) regions at depth $D - 1$ as invalid, and hence we are left with $\Theta$ regions at depth $D$. Now, as a region in the graylevel histogram of an image corresponds to a region in the image, the aforesaid multilevel thresholding would divide the image into $\Theta$ regions. Figure 3.6 graphically demonstrates the use of the proposed multilevel thresholding technique using association error in order to obtain three regions (Regions 1, 2 and 3) in the histogram. The values $\iota_1$ and $\iota_2$ give the suitability of the application of the bilevel thresholding to the two regions at depth $D = 1$.
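The bookkeeping behind this selection is small enough to sketch. The code below assumes the suitability values ι of the regions at depth D − 1 have already been computed with (3.45) and are supplied as a list in histogram order; the function names, the restriction to Θ ≥ 2 and the numeric values in the usage example are hypothetical.

```python
def tree_depth(theta: int) -> int:
    """Non-negative integer D with 2**(D-1) < theta <= 2**D."""
    if theta < 1:
        raise ValueError("theta must be a positive integer")
    return (theta - 1).bit_length()

def valid_splits(iotas, theta):
    """Indices of the depth D-1 regions whose bilevel split is kept.

    iotas holds one suitability value per region at depth D-1. The
    2**D - theta regions with the smallest values are left unsplit.
    """
    if theta < 2:
        raise ValueError("at least two regions are needed for a split")
    D = tree_depth(theta)
    if len(iotas) != 2 ** (D - 1):
        raise ValueError("expected one iota value per region at depth D-1")
    n_invalid = 2 ** D - theta
    order = sorted(range(len(iotas)), key=lambda i: iotas[i])  # least suitable first
    return sorted(order[n_invalid:])

# Three regions, as in Figure 3.6: D = 2, and one of the two depth-1 splits is dropped.
kept = valid_splits([4.0, 7.5], theta=3)
regions = (2 - len(kept)) + 2 * len(kept)
```

Each kept split contributes two regions and each unsplit region contributes one, so the total comes out to Θ.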
3.5
Experimental Results
In this section, we demonstrate the utility of the proposed entropy measures and the effectiveness of the proposed image thresholding methodology by considering some image segmentation and edge extraction tasks. Grayness ambiguity measures based on the proposed entropies are employed to carry out image thresholding in order to perform image segmentation and edge extraction. As mentioned in Section 3.1, the aforesaid image thresholding is performed in two ways, namely, by the ambiguity minimization method reported in (Pal et al., 1983) and by the image thresholding methodology proposed earlier in this chapter. Results obtained using a few popular existing image thresholding algorithms are also considered for qualitative and quantitative performance comparison with those obtained using the two aforementioned techniques.
Throughout this section, we consider the grayness ambiguity measure given in (3.30), which signifies measuring the ambiguity using the proposed logarithmic class of entropy functions. The quantities in (3.29) which are used in (3.30) are calculated considering that the pairs of lower and upper approximations of the sets $\Upsilon_T$ and $\Upsilon_T^{\complement}$ represent a tolerance fuzzy rough-fuzzy set. According to the terminology given in Section 3.2.3, this signifies that the logarithmic tolerance fuzzy rough-fuzzy entropy is used in this section to get the grayness ambiguity measure. We consider the values of the parameters $\Delta$ and $\omega$ respectively as 8 and 6 gray levels, and the base $\beta$ as $e$, without loss of generality. Note that the logarithmic tolerance fuzzy rough-fuzzy entropy is a representative of the proposed entropies which can be used to capture grayness ambiguity due to both fuzzy boundaries and rough resemblance.
FIGURE 3.7: Segmentation obtained using the various thresholding algorithms applied to separate dark and bright regions in an image. Panels: (a) the image; (b) graylevel histogram; (c) to (i) segmentation by algorithms (i) to (vii), respectively
3.5.1
Qualitative analysis
Segmentation
Let us consider here a qualitative assessment of segmentation results in different images in order to evaluate the performance of various techniques. The techniques considered for comparison are: (i) the proposed thresholding methodology using the aforesaid grayness ambiguity measure, (ii) the ambiguity minimization based thresholding method reported in (Pal et al., 1983) using the aforesaid grayness ambiguity measure, (iii) the thresholding method by Otsu (Otsu, 1979), (iv) the thresholding method by Kapur et al. (Kapur, Sahoo, and Wong, 1985), (v) the thresholding method by Kittler et al. (Kittler and Illingworth, 1986), (vi) the thresholding method by Tsai (Tsai, 1985), and (vii) the thresholding method by Pal et al. (Pal et al., 1983). The methods considered will henceforth be referred to using their corresponding numbers.
FIGURE 3.8: Performance of the various thresholding algorithms applied to find the core and extent of the galaxy in an image. Panels: (a) the image; (b) graylevel histogram; (c)–(i) segmentation by algorithms (i)–(vii), respectively.
In Figure 3.7, we consider an image with an almost bell-shaped graylevel histogram. Separating the dark and bright regions in this image is a non-trivial task, as the histogram does not lend itself to thresholding by many existing algorithms. It is evident from the figure that the proposed bilevel thresholding methodology (algorithm (i)), algorithm (iii) and algorithm (vi) perform much better than the others in separating the dark areas in the image from the bright ones. An image of a galaxy is considered in Figure 3.8. The graylevel histogram of this image is almost unimodal and hence extracting multiple regions from it is a non-trivial task. We use the proposed multilevel thresholding scheme and the various other schemes to find the total extent and the core region of the galaxy. It is evident from the figure that the results obtained using the proposed thresholding methodology are as good as some of the others. While the ‘white’ shaded area in the result obtained using algorithm (i) represents a region slightly larger than the core region, the ‘white’ shaded area in the result obtained using algorithm (vi) represents a region slightly smaller than the core region.
FIGURE 3.9: Performance of the various thresholding algorithms applied to segment a ‘low contrast’ image into three regions. Panels: (a) the image; (b) graylevel histogram; (c)–(i) segmentation by algorithms (i)–(vii), respectively.
Figure 3.9 presents an image where the sand, sea and sky regions are to be separated. The image has a multimodal histogram and it is evident from the image that the average gray values of the three regions do not differ by much. As can be seen from the figure, the proposed multilevel thresholding methodology (algorithm (i)) performs better than some of the others and as well as algorithms (ii), (iv) and (vii). The results in Figures 3.7(c) and (d), Figures 3.8(c) and (d) and Figures 3.9(c) and (d) demonstrate the utility of the proposed logarithmic tolerance fuzzy rough-fuzzy entropy. Note that, as described in Section 3.4, two values gd and gb must be predefined in order to use the proposed thresholding methodology. We have considered gd = gmin + 20 and gb = gmax − 20.
Edge Extraction
We now assess the edge extraction results qualitatively on different images in order to evaluate the performance of the various techniques. We consider the gradient magnitude at every pixel in an image and determine thresholds from the associated gradient magnitude histogram in order to perform edge extraction in that image. Gradient magnitude histograms are in general unimodal and positively (right) skewed. In the literature, very few techniques have been proposed to carry out bilevel thresholding in such histograms. Among these techniques, we consider the following for comparison: (viii) the unimodal histogram thresholding technique by Rosin (Rosin, 2001) and (ix) the thresholding technique by Henstock et al. (Henstock and Chelberg, 1996). In addition to these, we also consider some of the existing thresholding techniques mentioned previously in this section.
FIGURE 3.10: Performance of the various thresholding algorithms applied to mark the edges in a gradient image. Panels: (a) the image; (b) the histogram; (c)–(h) edges by algorithms (i), (ii), (iii), (v), (viii) and (ix), respectively.
FIGURE 3.11: Qualitative performance of the various thresholding algorithms applied to obtain the edge, non-edge and possible edge regions in a gradient image. Panels: (a) the image; (b) the histogram; (c)–(h) edges by algorithms (i)–(vi), respectively.
As described in Section 3.4, two values gd and gb must be predefined in order to use the proposed thresholding methodology. When the methodology is applied to gradient magnitude images, gd and gb represent two gradient magnitude values, and we take the input parameters as gd = gmin + max([10 g3%]) and gb = gmax − max([10 g97%]). The notation gρ% denotes the ρth percentile of the gradient magnitude distribution (the gradient magnitude histogram). Figures 3.10 and 3.11 give the edge extraction performance of the various thresholding algorithms. In Figure 3.10, we find that the proposed technique does much better than the others in determining the valid edges and eliminating those due to the inherent noise and texture. In Figure 3.11, we find three regions in the gradient image. One (white) represents the gradient values that surely correspond to valid edges, another (black) represents those that surely do not correspond to valid edges, and the third region (gray) represents the gradient values that could possibly correspond to valid edges. Such multilevel thresholding of gradient magnitude histograms could be used along with the hysteresis technique suggested in (Canny, 1986) in order to determine the actual edges. We see from the figure that the proposed techniques perform as well as or better than the others. Note that the gradient magnitude at every pixel in an image is obtained using the operator given in (Canny, 1986), and that edge thinning has not been applied to the results shown in Figures 3.10 and 3.11, as it is not of much significance for the intended comparisons.
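The choice of gd and gb for gradient magnitude images can be sketched as follows. This is only an illustration under stated assumptions: max([10 gρ%]) is read as the larger of 10 and the ρth percentile, central differences stand in for the operator of (Canny, 1986), and the function names `gradient_magnitude` and `predefined_levels` are hypothetical.

```python
import numpy as np


def gradient_magnitude(image):
    """Gradient magnitude by central differences.

    Assumption: a simple stand-in for the operator of (Canny, 1986).
    """
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    return np.hypot(gx, gy)


def predefined_levels(grad_mag, floor=10.0, low_pct=3.0, high_pct=97.0):
    """Return (g_d, g_b) under the assumed reading of the formula in the text.

    g_d = g_min + max(floor, low percentile)
    g_b = g_max - max(floor, high percentile)
    """
    grad_mag = np.asarray(grad_mag, dtype=float)
    g_min, g_max = grad_mag.min(), grad_mag.max()
    g_d = g_min + max(floor, np.percentile(grad_mag, low_pct))
    g_b = g_max - max(floor, np.percentile(grad_mag, high_pct))
    return float(g_d), float(g_b)


# A strongly right-skewed toy distribution of gradient magnitudes.
magnitudes = np.concatenate([np.zeros(97), [50.0, 100.0, 200.0]])
g_d, g_b = predefined_levels(magnitudes)
```

Under this reading, a histogram that is not strongly right-skewed can yield g_b below g_d, so it seems prudent to check the ordering before using the two values.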
3.5.2 Quantitative analysis
Here, we carry out a rigorous quantitative analysis by evaluating bilevel thresholding based segmentation against human-labeled ground truth.
The Image Dataset Considered
We consider 100 grayscale images from the ‘Berkeley Segmentation Dataset and Benchmark’ (Martin, Fowlkes, Tal, and Malik, 2001). Each of the 100 images is associated with multiple segmentation results hand labeled by multiple human subjects, and hence we have multiple segmentation ground truths for every image.
The Evaluation Measure Considered
We use the local consistency error (LCE) measure defined in (Martin et al., 2001) to judge the appropriateness of segmentation results obtained by a bilevel thresholding algorithm. Let SH be a segmentation result hand labeled by a human subject and SA a segmentation result obtained by applying the algorithm to be analyzed. The LCE measure representing the appropriateness of SA with reference to the ground truth SH is given as
\[ \mathrm{LCE}(S_H, S_A) = \frac{1}{n} \sum_{i=1}^{n} \min\bigl\{ E(S_H, S_A, p_i),\ E(S_A, S_H, p_i) \bigr\} \tag{3.46} \]
where
\[ E(S_1, S_2, p) = \frac{|R(S_1, p) \setminus R(S_2, p)|}{|R(S_1, p)|} \tag{3.47} \]
In the above, \ represents set difference, |x| represents the cardinality of a set x, R(S, p) represents the set of pixels corresponding to the region in segmentation S that contains pixel p, and n represents the number of pixels in the image under consideration. The LCE takes values in the range [0, 1], where a smaller value indicates that the segmentation result SA is more appropriate (with reference to the ground truth SH).
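The LCE of (3.46) and (3.47) can be computed from a table of region overlaps. The sketch below is a minimal illustration, not the benchmark's reference code; it assumes each segmentation is given as an integer label array in which all pixels sharing a label form one region, and the function names are hypothetical.

```python
import numpy as np


def refinement_error(s1, s2):
    """E(S1, S2, p) of (3.47) for every pixel p, as a flat array."""
    # Relabel both segmentations to 0..k-1 so labels can index a table.
    _, a = np.unique(np.asarray(s1).ravel(), return_inverse=True)
    _, b = np.unique(np.asarray(s2).ravel(), return_inverse=True)
    # overlap[i, j] = number of pixels in region i of S1 and region j of S2.
    overlap = np.zeros((a.max() + 1, b.max() + 1), dtype=np.int64)
    np.add.at(overlap, (a, b), 1)
    size1 = overlap.sum(axis=1)  # |R(S1, p)| for each region of S1
    # |R(S1,p) \ R(S2,p)| = |R(S1,p)| - |R(S1,p) intersect R(S2,p)|
    return (size1[a] - overlap[a, b]) / size1[a]


def local_consistency_error(s_h, s_a):
    """LCE(S_H, S_A) of (3.46): mean over pixels of the smaller error."""
    e_ha = refinement_error(s_h, s_a)
    e_ah = refinement_error(s_a, s_h)
    return float(np.mean(np.minimum(e_ha, e_ah)))


# Ground truth with three regions, bilevel result with two regions.
s_h = np.array([[0, 0, 1, 1],
                [0, 0, 2, 2]])
s_a = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1]])
lce_refined = local_consistency_error(s_h, s_a)

# Two segmentations where neither refines the other at one pixel.
lce_mixed = local_consistency_error([[0, 0, 1, 1]], [[0, 1, 1, 1]])
```

In the first example every region of the ground truth lies inside a region of the bilevel result, so the error is zero; in the second, only the second pixel contributes, with min(1/2, 2/3) = 1/2 averaged over four pixels.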
As we evaluate bilevel thresholding based segmentation here, every image under consideration is separated into two regions. However, the number of regions in the human labeled segmentation ground truths of the 100 images considered is always more than two. Now, the LCE measure penalizes an algorithm only if neither SH nor SA is a refinement of the other at a pixel, and it does not penalize an algorithm if either one of them is a refinement of the other at a pixel (Martin et al., 2001). Therefore, the use of the LCE measure is desirable in our experiments, as we do not want to penalize an algorithm when SA is not a refinement of SH at a pixel but SH is a refinement of SA at that pixel. This case is highly probable in our experiments, as the number of regions in SA is much smaller than the number in SH.
FIGURE 3.12: Box plot based summarization of segmentation performance by various thresholding algorithms. Panels: (a)–(g) box plots for algorithms (i)–(vii), respectively.
Analysis of Performance
Figure 3.12 shows box plots (Tukey, 1977) that graphically depict the LCE values corresponding to the segmentation achieved by bilevel thresholding using algorithms (i) to (vii) mentioned earlier in this section. For each of the 100 images considered, a box plot in Figure 3.12 summarizes the LCE values obtained for all the segmentation ground truths available for that image. We find from the box plots that the LCE values corresponding to algorithms (i), (ii), (v) and (vii) are in general smaller than those corresponding to the other algorithms. It is also evident that algorithms (ii) and (vii) perform almost equally well, with algorithm (ii) doing slightly better. From all the box plots in Figure 3.12, we find that the segmentation results achieved by algorithms (i) and (v) are equally good and give the best performance among the algorithms considered. The average of all the LCE values obtained using an algorithm is minimum when algorithm (v) is used. However, the maximum number of zero LCE values is obtained when algorithm (i) is used. Hence, we conclude from the quantitative analysis that algorithms (i) and (v) are equally good and give the best segmentation results.
3.6 Conclusion
In this chapter, image thresholding operations using rough set theory and certain of its generalizations have been introduced. Classes of entropy measures based on generalized rough sets have been proposed and their properties have been discussed. A novel image thresholding methodology based on grayness ambiguity in images has then been presented. For bilevel thresholding, every element of the graylevel histogram of an image has been associated with one of the two regions by comparing the corresponding errors of association. The errors of association have been based on the grayness ambiguity measures of the underlying regions, and the grayness ambiguity measures have been calculated using the proposed entropy measures. Multilevel thresholding has been carried out using the proposed bilevel thresholding method in a binary tree structured algorithm. Segmentation and edge extraction have been performed using the proposed image thresholding methodology. Qualitative and quantitative experimental results have been given to demonstrate the utility of the proposed entropy measures and the effectiveness of the proposed image thresholding methodology.
Bibliography
Beaubouef, Theresa, Frederick E Petry, and Gurdial Arora. 1998. Information-theoretic measures of uncertainty for rough sets and rough relational databases. Information Sciences 109(1-4):185–195.
Bhatt, R B, and M Gopal. 2004. FRID: fuzzy-rough interactive dichotomizers. In Proceedings of the IEEE international conference on fuzzy systems, 1337–1342.
Breiman, L, J H Friedman, R A Olshen, and C J Stone. 1998. Classification and regression trees. Boca Raton, Florida, U.S.A.: CRC Press.
Canny, J. 1986. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. 8(6):679–698.
Düntsch, Ivo, and Günther Gediga. 1998. Uncertainty measures of rough set prediction. International Journal of General Systems 106(1):109–137.
Dubois, D, and H Prade. 1990. Rough fuzzy sets and fuzzy rough sets. International Journal of General Systems 17(2-3):191–209.
Dubois, D, and H Prade. 1992. Putting fuzzy sets and rough sets together. In Słowiński, R. (ed.), Intelligent decision support, handbook of applications and advances of the rough sets theory, 203–232.
Ebanks, Bruce R. 1983. On measures of fuzziness and their representations. Journal of Mathematical Analysis and Applications 94(1):24–37.
Henstock, Peter V, and David M Chelberg. 1996. Automatic gradient threshold determination for edge detection. IEEE Trans. Image Process. 5(5):784–787.
Hu, Qinghua, and Daren Yu. 2005. Entropies of fuzzy indiscernibility relation and its operations. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 12(5):575–589.
Jumarie, G. 1990. Relative information: theories and applications. New York, NY, USA: Springer-Verlag New York, Inc.
Kapur, J N, P K Sahoo, and A K C Wong. 1985. A new method for gray-level picture thresholding using the entropy of the histogram. Computer Vision, Graphics, and Image Processing 29:273–285.
Kittler, J, and J Illingworth. 1986. Minimum error thresholding. Pattern Recognition 19(1):41–47.
Klir, George, and Bo Yuan. 2005. Fuzzy sets and fuzzy logic: Theory and applications. New Delhi, India: Prentice Hall.
Liang, Jiye, K S Chin, Chuangyin Dang, and Richard C M Yam. 2002. A new method for measuring uncertainty and fuzziness in rough set theory. International Journal of General Systems 31(4):331–342.
Martin, D, C Fowlkes, D Tal, and J Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of 8th international conference on computer vision, vol. 2, 416–423.
Mi, Ju-Sheng, Xiu-Min Li, Hui-Yin Zhao, and Tao Feng. 2007. Information-theoretic measure of uncertainty in generalized fuzzy rough sets. In Rough sets, fuzzy sets, data mining and granular computing, 63–70. Lecture Notes in Computer Science, Springer.
Otsu, N. 1979. A threshold selection method from gray-level histogram. IEEE Trans. Syst., Man, Cybern. 9(1):62–66.
Pal, Nikhil R, and James C Bezdek. 1994. Measuring fuzzy uncertainty. IEEE Trans. Fuzzy Syst. 2(2):107–118.
Pal, Nikhil R, and Sankar K Pal. 1991. Entropy: A new definition and its application. IEEE Trans. Syst., Man, Cybern. 21(5):1260–1270.
Pal, S K. 1982. A note on the quantitative measure of image enhancement through fuzziness. IEEE Trans. Pattern Anal. Mach. Intell. 4(2):204–208.
Pal, S K, R A King, and A A Hashim. 1983. Automatic grey level thresholding through index of fuzziness and entropy. Pattern Recognition Letters 1(3):141–146.
Pal, Sankar K. 1999. Fuzzy models for image processing and applications. Proc. Indian National Science Academy 65(1):73–90.
Pal, Sankar K, B Uma Shankar, and Pabitra Mitra. 2005. Granular computing, rough entropy and object extraction. Pattern Recognition Letters 26(16):2509–2517.
Pawlak, Zdzislaw. 1991. Rough sets: Theoretical aspects of reasoning about data. Dordrecht, Netherlands: Kluwer Academic.
Radzikowska, Anna Maria, and Etienne E Kerre. 2002. A comparative study of fuzzy rough sets. Fuzzy Sets and Systems 126(2):137–155.
Rosin, Paul L. 2001. Unimodal thresholding. Pattern Recognition 34(11):2083–2096.
Sen, D, and S K Pal. 2007. Histogram thresholding using beam theory and ambiguity measures. Fundamenta Informaticae 75(1-4):483–504.
Shannon, C E. 1948. A mathematical theory of communication. Bell System Technical Journal 27:379–423.
Skowron, Andrzej, and Jaroslaw Stepaniuk. 1996. Tolerance approximation spaces. Fundamenta Informaticae 27(2-3):245–253.
Thiele, H. 1998. Fuzzy rough sets versus rough fuzzy sets – an interpretation and a comparative study using concepts of modal logics. Technical Report CI-30/98, University of Dortmund.
Tsai, Wen-Hsiang. 1985. Moment-preserving thresholding: a new approach. Computer Vision, Graphics, and Image Processing 29:377–393.
Tukey, John W. 1977. Exploratory data analysis. Addison-Wesley.
Udupa, J K, and P K Saha. 2003. Fuzzy connectedness in image segmentation. Proc. IEEE 91(10):1649–1669.
Wierman, M J. 1999. Measuring uncertainty in rough set theory. International Journal of General Systems 28(4):283–297.
Yager, Ronald R. 1992. Entropy measures under similarity relations. International Journal of General Systems 20(4):341–358.
4 Mathematical Morphology and Rough Sets
Homa Fashandi
Computational Intelligence Laboratory, University of Manitoba, Winnipeg R3T 5V6 Manitoba Canada
James F. Peters
Computational Intelligence Laboratory, University of Manitoba, Winnipeg R3T 5V6 Manitoba Canada
4.1 Introduction
4.2 Basic Concepts from Topology
4.3 Mathematical Morphology
4.4 Rough Sets
4.5 Mathematical Morphology and Rough Sets
Some Experiments
4.6 Conclusion
Bibliography
4.1 Introduction
This chapter focuses on the relation between mathematical morphology (MM) (Serra, 1983) operations and rough sets (Pawlak, 1981, 1982; Pawlak and Skowron, 2007c,b,a), mainly based on topological spaces considered in the context of image retrieval (see, e.g., (Fashandi, Peters, and Ramanna, 2009)) and the basic image correspondence problem (see, e.g., (Peters, 2009, 2010; Meghdadi, Peters, and Ramanna, 2009)). There are some obvious similarities between MM operations and set approximations in rough set theory, and there have been several attempts to link MM and rough sets. Two major works have been published in this area (Polkowski, 1993; Bloch, 2000). L. Polkowski defined a hit-or-miss topology on rough sets and proposed a scheme to approximate mathematical morphology within the general paradigm of soft computing (Polkowski, 1993), (Polkowski, 1999). Later, I. Bloch sought to demonstrate a direct link between MM and rough sets through relations, a pair of dual operations and neighbourhood systems (Bloch, 2000). I. Bloch’s approach was carried forward by J.G. Stell, who defined a single framework that includes the principal constructions of both mathematical morphology and rough sets (Stell, 2007). To make this chapter fairly self-contained, background information on the basics of topology is presented first. The chapter then presents the basics of mathematical morphology. Next, the principles of rough set theory are considered and the links between the two areas are discussed. Finally, a proposed application of the ideas from these two areas is given in terms of image retrieval.
4.2 Basic Concepts from Topology
For the sake of completeness, this section briefly presents the basic concepts of topology (Engelking, 1989; Gemignani, 1990). Readers who are familiar with these concepts may skip this section. The main reference for this section is the book by M.C. Gemignani (Gemignani, 1990).
Topology: Let X be a non-empty set. A collection τ of subsets of X is said to be a topology on X if
• X and φ belong to τ,
• the union of any finite or infinite number of sets in τ belongs to τ,
• the intersection of any two sets in τ belongs to τ.
The pair (X, τ) is called a topological space. In general, several topologies can be defined on a given set X.
Discrete Topology: If (X, τ) is a topological space such that, for every x ∈ X, the singleton set {x} is in τ, then τ is the discrete topology.
Open and closed sets are pivotal concepts in topology.
Open Sets: Let (X, τ) be a topological space. Then the members of τ are called open sets. Therefore,
• X and φ are open sets.
• The union of any finite or infinite number of open sets is an open set.
• The intersection of any finite number of open sets is an open set.
Closed Sets: Let (X, τ) be a topological space. A set S ⊆ X is said to be closed if X\S is open.
• φ and X are closed sets.
• The intersection of any finite or infinite number of closed sets is a closed set.
• The union of any finite number of closed sets is a closed set.
Some subsets of X may be both closed and open. In a discrete space, every set is both open and closed, while in an indiscrete space no subset of X other than X and φ is open or closed.
Clopen Sets: A subset S of a topological space (X, τ) is said to be clopen if it is both closed and open in (X, τ).
The concept of a limit point is closely related to the topological closure of a set.
Limit Points: Let A be a subset of a topological space (X, τ). A point x ∈ X is said to be a limit point (cluster point or accumulation point) of A if every open set O containing x contains a point of A different from x.
The following propositions provide a way of testing a set to determine if it is closed or not.
Proposition 4.2.1 Let A be a subset of a topological space (X, τ). Then A is closed if and only if it contains all of its limit points.
Proposition 4.2.2 Let A be a subset of a topological space (X, τ), and let A′ be the set of all limit points of A. Then A ∪ A′ is closed.
The topological concepts of closure and interior play an important role in this chapter. A brief explanation of these concepts is given next.
Closure: Let A be a subset of a topological space (X, τ), and let A′ be the set of all limit points of A. Then the set A ∪ A′, consisting of A together with all of its limit points, is called the closure of A and is denoted by $\overline{A}$.
Interior: Let (X, τ) be any topological space and A be any subset of X. The largest open set contained in A is called the interior of A and is denoted by $\mathring{A}$.
Recall that in linear algebra every vector is a linear combination of basis vectors. Analogously, in topology every open set can be obtained as a union of members of a basis.
Basis of a Topology: Let (X, τ) be a topological space. A collection B of open subsets of X is said to be a basis for the topology τ if every open set is a union of members of B. In other words, B generates the topology.
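On a finite set, the definitions above can be checked directly. The following sketch is an illustration only: it assumes the topology is given as an explicit collection of subsets, and the function names are hypothetical. It tests the topology axioms and computes the interior, the limit points and the closure of a set.

```python
from itertools import combinations


def is_topology(X, tau):
    """Check the topology axioms for a finite collection tau of subsets of X."""
    X = frozenset(X)
    tau = {frozenset(s) for s in tau}
    if X not in tau or frozenset() not in tau:
        return False
    # For a finite collection, pairwise unions and intersections suffice.
    return all((a | b) in tau and (a & b) in tau
               for a, b in combinations(tau, 2))


def interior(A, tau):
    """Largest open set contained in A: the union of all open subsets of A."""
    A = frozenset(A)
    result = frozenset()
    for s in tau:
        if frozenset(s) <= A:
            result |= frozenset(s)
    return result


def limit_points(A, X, tau):
    """Points x such that every open set containing x meets A in a point != x."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all((frozenset(s) & A) - {x} for s in tau if x in s)
    )


def closure(A, X, tau):
    """Smallest closed set containing A: X minus the interior of X minus A."""
    X = frozenset(X)
    return X - interior(X - frozenset(A), tau)


X = {1, 2, 3}
tau = [set(), {1}, {2, 3}, {1, 2, 3}]
not_tau = [set(), {1}, {2}, {1, 2, 3}]  # {1} | {2} is missing

A = {2}
closure_A = closure(A, X, tau)
derived_A = limit_points(A, X, tau)
```

Here the closure of {2} is {2, 3}, which agrees with A ∪ A′ because 3 is the only limit point of {2} in this topology.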
4.3 Mathematical Morphology
Objects or images in our application are considered as subsets of the Euclidean space $E^n$ or subsets of an affinely closed subspace $X \subseteq E^n$. For digital objects (images), the space is considered to be $Z^n$, where $Z$ is the set of integers. Dilation and erosion are the two primary mathematical morphology operators and can be defined by the Minkowski sum and the Minkowski difference:
\[ A \oplus B = \{x + y : x \in A,\ y \in B\}, \tag{4.1} \]
where $A, B \subseteq X$ and '+' is the sum in the Euclidean space $E^n$, and
\[ A \ominus B = \{x \in X : \{x\} \oplus B \subseteq A\}. \tag{4.2} \]
For simplicity, a set B is assumed to be symmetric about the origin; therefore $B = -B = \{-x : x \in B\}$. Mathematical morphology operators are defined in different ways. For example, consider two binary images $A \subset Z^2$ and $B \subset Z^2$. The dilation of A by B is also defined as
\[ A \oplus B = \{x \mid \hat{B}_x \cap A \neq \phi\}, \tag{4.3} \]
where $\hat{B}_x$ is obtained by first reflecting B about its origin and then shifting it such that its origin is located at point x. B is called a structuring element (SE) and it can have any shape, size and connectivity. The characteristics of the SE are application dependent. As mentioned earlier, for simplicity, we consider $\hat{B}_x = B_x$. Based on equation 4.3, the dilation of an image A by B is the set of all points x such that $\hat{B}_x$ and A overlap. The erosion of a binary image A by B is defined as
\[ A \ominus B = \{x \mid B_x \subseteq A\}. \tag{4.4} \]
Erosion of A by B is the collection of all points x such that $B_x$ is contained in A. To be consistent with Polkowski (Polkowski, 1999), we use $d_B(A)$ for the dilation of A by B and $e_B(A)$ for the erosion of A by B. New morphological operations can be obtained by composition of mappings. Opening ($o_B(A)$) and closing ($c_B(A)$) are two operators obtained by the following compositions, respectively:
\[ o_B(A) = d_B(e_B(A)) = \{x \in X : \exists y,\ (x \in \{y\} \oplus B \subseteq A)\}, \tag{4.5} \]
\[ c_B(A) = e_B(d_B(A)) = \{x \in X : \forall y,\ (x \in \{y\} \oplus B \Rightarrow A \cap (\{y\} \oplus B) \neq \phi)\}. \tag{4.6} \]
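For binary images, the four operators can be sketched on sets of integer points. This is a minimal illustration, assuming images and structuring elements are given as Python sets of (x, y) tuples and that B is symmetric about the origin, as in the text; the function names are hypothetical.

```python
def dilate(A, B):
    """Minkowski sum of A and B, as in (4.1)."""
    return {(ax + bx, ay + by) for (ax, ay) in A for (bx, by) in B}


def erode(A, B):
    """Points x whose translate of B lies inside A, as in (4.2) and (4.4)."""
    if not B:
        raise ValueError("structuring element must be non-empty")
    # If x + b is in A for every b in B, then x = a - b0 for some a in A,
    # so only those candidates need to be tested.
    bx0, by0 = next(iter(B))
    candidates = {(ax - bx0, ay - by0) for (ax, ay) in A}
    return {(x, y) for (x, y) in candidates
            if all((x + bx, y + by) in A for (bx, by) in B)}


def opening(A, B):
    """o_B(A) = d_B(e_B(A)), as in (4.5)."""
    return dilate(erode(A, B), B)


def closing(A, B):
    """c_B(A) = e_B(d_B(A)), as in (4.6)."""
    return erode(dilate(A, B), B)


# A 3x3 square plus one isolated pixel, probed with a symmetric cross.
square = {(x, y) for x in range(3) for y in range(3)}
A = square | {(5, 5)}
B = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}

eroded = erode(A, B)
opened = opening(A, B)
closed = closing(A, B)
```

The erosion keeps only the centre of the square, the opening rebuilds a cross around it and discards the isolated pixel, and the closing contains A, which illustrates that opening shrinks a set while closing enlarges it.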
By moving the structuring element B over the image A, we gather information about the medium A in terms of B. The simplest relationships are obtained by moving B over the medium A and testing whether B ⊂ A (recall erosion or opening) or A ∩ B ≠ φ (dilation or closing). We clarify this idea by citing what G. Matheron wrote in his book in 1975, page xi (Matheron, 1975) (Serra also referred to this passage in his book (Serra, 1983, p. 84)): “In general, the structure of an object is defined as the set of relationships existing between elements or parts of the object. In order to experimentally determine this structure, we must try, one after the other, each of the possible relations and examine whether or not it is verified. Of course, this image constructed by such a process will depend to the greatest extent on the choice made for the system $\mathcal{R}$ of relationships considered as possible. Hence this choice plays a priori a constructive role (in the Kantian meaning) and determines the relative worth of the concept of structure at which we will arrive. In the case of a porous medium, let A be a solid component (union of grains), and $A^c$ the porous network. In this medium, we shall move a figure B, called the structuring pattern, playing the role of a probe collecting information. This operation is experimentally attainable.” The more relations we test, the more we learn about the object; each relation contributes information whether it turns out to be true or false. Assume that we have a family $\mathcal{B}$ of structuring elements; then each $B \in \mathcal{B}$ and each relation (B ⊂ A, B ∩ A ≠ φ) gives us some information about A. As an example of such a family, consider a sequence $\{B_i\}$ made up of compact disks of radius $r_i = r_0 + 1/i$, which tend toward the compact disk of radius $r_0$.
Topology and Mathematical Morphology
Topological properties of mathematical morphology have been introduced and studied by Matheron and Serra in (Matheron, 1975) and (Serra, 1983), respectively. Here we briefly mention some of them, starting with opening and closing. The concepts of topological closure and interior are comparable with morphological closing and opening. The only difference is that in morphology the closing and opening are obtained with respect to a given structuring element (Serra, 1983), whereas in topology closure and interior are defined in terms of the closed and open sets of the topology (Engelking, 1989). To blur this difference and obtain a closer relation between mathematical morphology operators and topological interior and closure, consider the following proposition.
Proposition 4.3.1 Let (X, d) be a metric space. Then the collection of open balls is a basis for a topology τ on X (Gemignani, 1990).
The topology τ is said to be induced by the metric d, and (X, τ) is called the induced topological space. If d is the Euclidean metric on R, then the set of open balls is a basis for the topology τ induced by d. Let $\mathring{B}^r$ and $B^r$ be an open ball and a closed ball of radius r, respectively. Consider the sets of structuring elements $\mathcal{B}$ and $\mathring{\mathcal{B}}$ defined as follows:
\[ \mathcal{B} = \{B^r \mid r > 0\}, \tag{4.7} \]
\[ \mathring{\mathcal{B}} = \{\mathring{B}^r \mid r > 0\}. \tag{4.8} \]
Based on the above proposition, $\mathring{\mathcal{B}}$ forms a basis for the topology on the Euclidean space or image plane. The following equations relate the morphological opening and closing to the topological interior and closure of a set A ⊂ X in Euclidean space. Let $\mathring{A}$ and $\overline{A}$ be the interior and closure of A. Then
\[ \mathring{A} = \bigcup_{B \in \mathring{\mathcal{B}}} o_B(A) = \bigcup_{B \in \mathcal{B}} o_B(A) \tag{4.9} \]
and
\[ \overline{A} = \bigcap_{B \in \mathring{\mathcal{B}}} c_B(A) = \bigcap_{B \in \mathcal{B}} c_B(A). \tag{4.10} \]
The following equations also relate the interior and closure to erosion and dilation (Serra, 1983):
\[ \mathring{A} = \bigcup_{B \in \mathring{\mathcal{B}}} e_B(A) = \bigcup_{B \in \mathcal{B}} e_B(A) \tag{4.11} \]
and
\[ \overline{A} = \bigcap_{B \in \mathring{\mathcal{B}}} d_B(A) = \bigcap_{B \in \mathcal{B}} d_B(A). \tag{4.12} \]
In words, the interior of a set A is the union of the openings or erosions of A with open or closed balls of different sizes. The closure of a set A is the intersection of the closings or dilations of A with open or closed balls of different sizes.
4.4 Rough Sets
In rough set theory, objects in a universe X are perceived by means of their attributes (features). Let $\varphi_i$ denote a real-valued function that represents an object feature. Each element x ∈ A ⊆ X is described by its feature vector $\varphi(x) = (\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x))$ (Peters and Wasilewski, 2009). An equivalence relation (called an indiscernibility relation (Pawlak and Skowron, 2007c)) can be defined on X. Let ∼ be an equivalence relation defined on X, i.e., ∼ is reflexive, symmetric and transitive. An equivalence relation ∼ on X classifies the objects x ∈ X into classes called equivalence classes. Objects in each class have the same feature-value vectors and are treated as one generalized item. The indiscernibility (equivalence) relation $\sim_{X,\varphi}$ is defined in (4.13).
\[ \sim_{X,\varphi} = \{(x, y) \in X \times X : \varphi(x) = \varphi(y)\}. \tag{4.13} \]
$\sim_{X,\varphi}$ partitions the universe X into non-overlapping equivalence classes; the partition is denoted by $X_{/\sim_\varphi}$ or simply $X_{/\sim}$. Let $x_{/\sim}$ denote the class containing an element x, as in (4.14).
\[ x_{/\sim} = \{y \in X \mid x \sim_\varphi y\}. \tag{4.14} \]
Let $X_{/\sim}$ denote the quotient set as defined in (4.15).
\[ X_{/\sim} = \{x_{/\sim} \mid x \in X\}. \tag{4.15} \]
The relation ∼ holds between all members of each class $x_{/\sim}$ in a partition. In a rough set model, elements of the universe are described based on the available information about them. For each subset A ⊆ X, rough set theory defines two approximations based on equivalence classes, the lower approximation $A_-$ and the upper approximation $A^-$:
\[ A_- = \{x \in X : x_{/\sim} \subseteq A\}, \tag{4.16} \]
\[ A^- = \{x \in X : x_{/\sim} \cap A \neq \phi\}. \tag{4.17} \]
The set A ⊂ X is called a rough set if $A_- \neq A^-$; otherwise it is called an exact set (Pawlak, 1991). Figure 4.1 shows a set A and its lower and upper approximations in terms of the partitioned space. The space X is partitioned into squares of the form $(j, j + 1]^2$.
FIGURE 4.1: Upper and lower approximation of a set A in the partitioned space of the form $(j, j + 1]^2$. Panels: (4.1a) set A and the partitioned space; (4.1b) lower approximation; (4.1c) upper approximation.
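The construction of equivalence classes and of the two approximations can be sketched for a finite universe. This is an illustrative sketch only: the feature function `phi`, the one-dimensional universe and the function names are assumptions chosen to mimic the partition into cells shown in Figure 4.1.

```python
from collections import defaultdict


def equivalence_classes(X, phi):
    """Partition X by feature vector: x ~ y iff phi(x) == phi(y)."""
    classes = defaultdict(set)
    for x in X:
        classes[phi(x)].add(x)
    return [frozenset(c) for c in classes.values()]


def lower_approximation(A, classes):
    """Union of the classes that are entirely contained in A."""
    A = set(A)
    return set().union(*(c for c in classes if c <= A))


def upper_approximation(A, classes):
    """Union of the classes that have at least one element in A."""
    A = set(A)
    return set().union(*(c for c in classes if c & A))


# Hypothetical universe: 12 points grouped into cells of width 3.
X = range(12)
phi = lambda x: (x // 3,)  # one-component feature vector
classes = equivalence_classes(X, phi)

A = {2, 3, 4, 5, 6}
lower = lower_approximation(A, classes)
upper = upper_approximation(A, classes)
is_rough = lower != upper
```

Only the cell {3, 4, 5} lies entirely inside A, while the cells containing 2 and 6 merely overlap it, so the two approximations differ and A is rough with respect to this partition.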
Topology of Rough sets
To study the topological properties of rough set, we define a partition topology on X, based on partitions induced by the equivalence relation ∼X,ϕ . The equivalence classes x/∼ form the basis for the topology τ (Steen and Seebach, 1995). Let A denote the closure of a set A. Since the topology is a partition topology, to find the closure A of a subset A ⊂ X, we have to consider all of the closed sets containing the ˚ is the set A and then select the smallest closed set. The interior of a set A (denoted A) largest open set that is contained in A. Open sets and their corresponding closed sets in the topology are: • X is an open set, φ is the corresponding closed set. • φ is a closed set, X is the corresponding closed set. • xi/∼ is an open set ⇒ X\xi/∼ = xci/∼ , is a closed set. where i = 1, . . . , n, and n is the total number of equivalence classes in the partition topology and \ is the set difference. Recall that the union of any finite or infinite number of open sets are open sets and the intersection of any finite number of open sets is an open set. For a set A ⊆ X and the partition topology τ , we have: A = A−
(4.18)
˚ = A− A
(4.19)
where $\bar{A}$, $\mathring{A}$, $A^-$ and $A_-$ are the closure, interior, upper approximation and lower approximation of A, respectively. Now it is clear that we could define properties of a rough set in the language of topology. Next, define the π-boundary of A as $A^b = \bar{A} \setminus \mathring{A}$, where $\bar{A}$ and $\mathring{A}$ are the closure and interior of A (Lashin, Kozae, Abo Khadra, and Medhat, 2005). A set A is said to be rough if $A^b \neq \emptyset$; otherwise it is an exact set. Generally, for a given topology τ and A ⊆ X, we have:
• A is totally definable, if A is an exact set, $\mathring{A} = A = \bar{A}$.
• A is internally definable, if $A = \mathring{A}$, $A \neq \bar{A}$.
• A is externally definable, if $A \neq \mathring{A}$, $A = \bar{A}$.
• A is undefinable, if $A \neq \mathring{A}$ and $A \neq \bar{A}$.
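The closure, interior and definability classes above can be checked on small finite examples. The following Python sketch is illustrative only: the function names, the six-point universe and the three-point chain topology are assumptions made for the example and do not come from the chapter. In a partition topology every open set is also closed, so only the totally definable and undefinable cases can occur there; the chain topology is included to exercise the other two cases.

```python
from itertools import combinations


def topology_from_partition(classes):
    """Open sets of a partition topology: every union of equivalence classes."""
    classes = [frozenset(c) for c in classes]
    opens = set()
    for r in range(len(classes) + 1):
        for combo in combinations(classes, r):
            opens.add(frozenset().union(*combo))  # r = 0 yields the empty set
    return opens


def interior(opens, A):
    """Largest open set contained in A (union of all open subsets of A)."""
    A = frozenset(A)
    result = frozenset()
    for o in opens:
        if o <= A:
            result |= o
    return result


def closure(opens, X, A):
    """Smallest closed set containing A: complement of the interior of X minus A."""
    X = frozenset(X)
    return X - interior(opens, X - frozenset(A))


def classify(opens, X, A):
    """Return the definability class of A in the topology given by `opens`."""
    A = frozenset(A)
    inner, outer = interior(opens, A), closure(opens, X, A)
    if inner == A and outer == A:
        return "totally definable"
    if inner == A:
        return "internally definable"
    if outer == A:
        return "externally definable"
    return "undefinable"


# Hypothetical universe with three equivalence classes.
X = {1, 2, 3, 4, 5, 6}
tau = topology_from_partition([{1, 2}, {3, 4}, {5, 6}])
A = {1, 2, 3}
lower = interior(tau, A)    # lower approximation, as in equation (4.19)
upper = closure(tau, X, A)  # upper approximation, as in equation (4.18)
boundary = upper - lower    # non-empty here, so A is rough

# A non-partition topology (a chain of open sets) on three points.
X3 = {1, 2, 3}
chain = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(X3)}
```

For the partition example, the boundary consists of the single class {3, 4} that A meets without containing it.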
(4.2a) Set A in the partitioned space X, where the boundary region contains shaded rectangles
(4.2b) Sets A (solid line) and Y (dashed line) have equivalence relation with each other
FIGURE 4.2: Equivalence relation between sets
Polkowski (Polkowski, 1993) also defined an equivalence relation based on topological properties of rough sets. For a rough subset of the universe, A ⊂ X, with $\mathring{A} \neq \bar{A}$, the equivalence class $A/_\sim$ is defined as:
$A/_\sim = \{Y \subseteq X \mid \mathring{A} = \mathring{Y} \text{ and } \bar{A} = \bar{Y}\}$.
(4.20)
In other words, the equivalence class of a set A is the collection of those sets with the same interior and closure as A. Notice that in equation 4.14, an equivalence class is based on an element x ∈ X, whereas in equation 4.20, an equivalence class of a set A ⊆ X is calculated. Figure 4.2 demonstrates the idea of sets that have an equivalence relation with each other. Those rough sets with equal interior and closure have an equivalence relation with each other. In other words, a set Y is equivalent to A when it differs from A only inside the boundary region of A, without filling or emptying any class of that region. Figure 4.2b shows the boundary region of a set A (solid line), and a sample set Y (dashed line) that has an equivalence relation with A.
Instead of using an equivalence relation, arbitrary binary relations can be used to form an approximation space that is called a generalized approximation space. In this case, instead of having partitions on X, a covering can be defined by a tolerance relation. That is, if we use the tolerance relation $\cong_{X,\varphi,\varepsilon}$ defined in (4.21) instead of the equivalence relation ∼, then $\cong_{X,\varphi,\varepsilon}$ defines a covering on X, i.e., the tolerance classes in the covering may or may not be disjoint sets. The result from A. Skowron and J. Stepaniuk is called a tolerance approximation space (Skowron and Stepaniuk, 1996). E.C. Zeeman formally defined a tolerance relation ≅ on a set X as a reflexive and symmetric relation and introduced the notion of a tolerance space (Zeeman, 1962)∗. The well known equivalence relation, which is reflexive, symmetric and transitive, is a special kind of tolerance relation (compare equation 4.13). For example, we can define a tolerance relation on the set X as given in (4.21).
$\cong_{X,\varphi,\varepsilon} = \{(x, y) \in X \times X : |\varphi(x) - \varphi(y)| \le \varepsilon\}$.
(4.21)
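A small numerical example may help to show how the relation in (4.21) yields a covering rather than a partition. The Python sketch below is an assumption-laden illustration: the feature values, the threshold and the function names are hypothetical, and it computes the neighbourhood of each object (all objects within ε of it) rather than maximal tolerance classes.

```python
def tolerant(phi, eps, x, y):
    """True when the feature values of x and y differ by at most eps, as in (4.21)."""
    return abs(phi[x] - phi[y]) <= eps


def neighbourhood(phi, eps, x):
    """All objects that stand in the tolerance relation with x."""
    return {y for y in phi if tolerant(phi, eps, x, y)}


# Hypothetical gray levels for four pixels and a hypothetical threshold.
phi = {"a": 10, "b": 14, "c": 18, "d": 30}
eps = 5

# One neighbourhood per object; together they cover the whole set.
cover = {x: neighbourhood(phi, eps, x) for x in phi}
```

Here the neighbourhoods of a and c overlap in b, although a and c are not related to each other, so the relation is reflexive and symmetric but not transitive. With ε = 0 the neighbourhoods collapse to the equivalence classes of objects with identical feature values.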
∗ It has been observed by A.B. Sossinsky (Sossinsky, 1986) that it was J.H. Poincaré who informally introduced tolerance spaces in the context of sets of similar sensations (Poincaré, 1913). Both E.C. Zeeman and J.H. Poincaré introduce tolerance spaces in the context of sensory experience.
(4.3a) Two overlapped sets with tolerance relation
(4.3b) Two disjoint sets with tolerance relation
(4.3c) Inclusion and tolerance relation
FIGURE 4.3: Tolerance relation between sets. The space X is partitioned into squares; the sets A (solid line) and Y (dashed line) have a tolerance relation. $\bar{A} \cap \bar{Y}$ is colored in gray.
The relation $\cong_{X,\varphi,\varepsilon}$ is a special case of the tolerance near set relation $\cong_{B,\varepsilon}$ (Peters, 2009) (see also the weak nearness relation in (Peters and Wasilewski, 2009)). For conciseness, ≅ is used to denote $\cong_{X,\varphi,\varepsilon}$. In a manner similar to what L. Polkowski has done in defining the equivalence class of a rough set A ⊂ X, we introduce the following equation as a tolerance class for a rough subset of the universe, A ⊂ X, with $\mathring{A} \neq \bar{A}$:
$A/_\cong = \{Y \subseteq U \mid \bar{A} \cap \bar{Y} \neq \emptyset\}$.
(4.22)
In other words, we are suggesting that two rough sets of the universe, A, Y ⊂ X, have a tolerance relation (≅) with each other iff $\bar{A} \cap \bar{Y} \neq \emptyset$. Notice that equation 4.21 is defined on elements in a set X. By contrast, equation 4.22 is defined for two sets.
Proposition 4.4.1 ≅ is a tolerance relation.
Proof. To show that ≅ is a tolerance relation, we have to show that it is reflexive and symmetric:
• if A = Y ⇒ $\bar{A} = \bar{Y}$, therefore $\bar{A} \cap \bar{Y} \neq \emptyset$, so A ≅ Y. This means that ≅ is a reflexive relation, A ≅ A.
• if A ≅ Y ⇒ $\bar{A} \cap \bar{Y} \neq \emptyset$ ⇒ $\bar{Y} \cap \bar{A} \neq \emptyset$ ⇒ Y ≅ A; therefore ≅ is a symmetric relation.
Proposition 4.4.2 Equation 4.20 is a specialization of equation 4.22.
Proof. We have to show that the equivalence class of a set A is a special case of its tolerance class. In other words, an equivalence class $A/_\sim$ is included in $A/_\cong$. Let Y ⊂ X and $Y \in A/_\sim$; we have to show that $Y \in A/_\cong$. Since $Y \in A/_\sim$, we have $\mathring{A} = \mathring{Y}$ and $\bar{A} = \bar{Y}$. We only need the equality of the closures; by $\bar{A} = \bar{Y}$ we get $\bar{A} \cap \bar{Y} = \bar{A} \neq \emptyset$. Therefore $A/_\sim$ is included in $A/_\cong$.
Figure 4.3 shows three different sets that have a tolerance relation with the set A ⊂ X. In figure 4.3b the set Y ⊂ X is disjoint from the set A, with A ∩ Y = ∅, but has a tolerance relation with A because $\bar{A} \cap \bar{Y} \neq \emptyset$.
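The difference between the set-level equivalence of equation 4.20 and the set-level tolerance of equation 4.22 can be seen on a toy partition. The following Python sketch assumes a partition topology, so that closure and interior coincide with upper and lower approximation; the partition, the sets and the function names are hypothetical.

```python
def lower(partition, A):
    """Lower approximation (interior): union of the classes contained in A."""
    A = set(A)
    return set().union(*(c for c in partition if set(c) <= A))


def upper(partition, A):
    """Upper approximation (closure): union of the classes that meet A."""
    A = set(A)
    return set().union(*(c for c in partition if set(c) & A))


def equivalent(partition, A, Y):
    """Equation (4.20): same interior and same closure."""
    return (lower(partition, A) == lower(partition, Y)
            and upper(partition, A) == upper(partition, Y))


def tolerant_sets(partition, A, Y):
    """Equation (4.22): the closures of A and Y have a common point."""
    return bool(upper(partition, A) & upper(partition, Y))


# Hypothetical partition of the universe {1, ..., 8} into four classes.
partition = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]
A = {1, 2, 3}
Y_equiv = {1, 2, 4}    # same interior and closure as A
Y_disjoint = {4, 5}    # disjoint from A, but the closures share {3, 4}
Y_far = {7}            # closure {7, 8} does not meet the closure of A
```

In this example Y_equiv is both equivalent and tolerant to A, as Proposition 4.4.2 predicts, Y_disjoint is tolerant without being equivalent, and Y_far is neither.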
4.5
Mathematical Morphology and Rough Sets
There are two main papers connecting mathematical morphology to rough set theory: one by Polkowski (Polkowski, 1999), who uses the language of topology to connect the two fields, and Bloch's work, which is mainly based on the language of relations (Bloch, 2000). We begin this section with some examples from (Polkowski, 1999). The first one is partitioning $Z \subseteq E^n$ into a collection $\{P_1, P_2, \ldots, P_n\}$, where $P_i$ is the partition of the i-th axis $E_i$ into intervals of the form (j, j + 1]. In the second example, a structuring element $B = (0, 1]^n$ is selected. It is easily seen that $o_B(X) = X_-$ for each X ⊆ Z. Also, $c_B(X) = X^-$. These two examples clarify the relation between mathematical morphology and rough sets in a cogent way. Mathematical morphology is mainly developed for the image plane, and a structuring element has a geometrical shape in this space. L. Polkowski defined a partition in this space (not necessarily through the definition of an equivalence relation) and then obtained the upper and lower approximations of a set X based on this partition. The equivalence classes all have the same characteristics: they are of the form $(j, j + 1]^n$. At the same time, a structuring element B with the same characteristics as the equivalence classes is defined, $(0, 1]^n$. Then $B_x$, the translation of B by x, can hit (overlap) any of the equivalence classes. In classical rough set theory, objects are perceived by their attributes and classified into equivalence classes based on the indiscernibility of the attribute values. In the above examples, the geometrical positions of the pixels in the image plane act as attributes and form the partitions. In (Polkowski, 1993), the morphological operators are defined on equivalence classes. In her article, I. Bloch connects rough set theory to mathematical morphology based on relations (Bloch, 2000). She uses general approximation spaces, where, instead of the indiscernibility relation, an arbitrary binary relation is used.
She suggests that upper and lower approximations can be obtained from dilation and erosion, respectively (Bloch, 2000). The binary relation defined in her work is $xRy \Leftrightarrow y \in B_x$. Then from the relation R, r(x) is derived in the following way:
$\forall x \in X,\ r(x) = \{y \in X \mid y \in B_x\} = B_x$
Assume that $x \in B_x$ for all x ∈ X, and let B be symmetric. Then erosion and lower approximation coincide:
$\forall A \subset X,\ A_- = \{x \in X \mid r(x) \subset A\} = \{x \in X \mid B_x \subset A\} = e_B(A)$
The same method is used to show that upper approximation and dilation coincide. I. Bloch also extends the idea to dual operators and neighbourhood systems. The common result of both (Polkowski, 1999) and (Bloch, 2000) is the suggestion that upper and lower approximations can be linked to closing (dilation) and opening (erosion), respectively. Topology, neighbourhood systems, dual operators and relations are used to show the connection.
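The coincidence of erosion with lower approximation, and of dilation with upper approximation, can be verified on a one-dimensional integer grid. The Python sketch below is a hedged illustration, not Bloch's construction in full: the grid, the symmetric structuring element B = {-1, 0, 1} containing the origin and the function names are assumptions, and the set A is chosen away from the grid border so that clipping $B_x$ to the grid does not matter.

```python
def translate(B, x):
    """B_x: the structuring element B translated by x."""
    return {x + b for b in B}


def erosion(X, B, A):
    """e_B(A): points x whose translate B_x fits inside A."""
    return {x for x in X if translate(B, x) <= A}


def dilation(X, B, A):
    """Minkowski sum of A and B, restricted to the grid X."""
    return {a + b for a in A for b in B} & X


def r(X, B, x):
    """r(x) = {y in X | y in B_x}, derived from the relation xRy iff y in B_x."""
    return translate(B, x) & X


def lower_approx(X, B, A):
    """Lower approximation in the generalized space: r(x) contained in A."""
    return {x for x in X if r(X, B, x) <= A}


def upper_approx(X, B, A):
    """Upper approximation in the generalized space: r(x) meets A."""
    return {x for x in X if r(X, B, x) & A}


# Hypothetical 1-D image plane, symmetric structuring element and set.
X = set(range(10))
B = {-1, 0, 1}
A = {3, 4, 5, 6}
```

For these values both erosion and lower approximation give {4, 5}, and both dilation and upper approximation give {2, ..., 7}.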
4.5.1
Some Experiments
In the following subsection we tried some experiments to demonstrate ways of incorporating both fields in image processing.
Lower Approximation as Erosion Operator
When mathematical morphology is used in image processing applications, a structuring element mainly localized in the image plane is used. The result is the interaction of the
structuring element with the image underneath. On the other hand, in rough set theory, the universe is partitioned or covered by some classes (indiscernibility, tolerance or arbitrary) and the objects in the universe are perceived based on the knowledge available in these classes. The interaction of the structuring element with the image underneath is localized in space; in other words, the underlying image is seen (characterized) through the small window opened by the structuring element. In rough set theory, a set is approximated by the knowledge gathered in equivalence classes from the whole universe. This is the main difference between mathematical morphology and rough sets. In mathematical morphology, especially when the concept of a lattice is introduced in this field, the universe consists of all possible images. But when we apply morphological operators to images, there is only one image and a set of structuring elements. The result is a new image belonging to the universe. In applications of rough set theory, the available data in databases form the universe and all the approximations are based on the available data. To have almost the same milieu, we also consider a finite set of images to be our universe (this is possible if we view the universe X as a set of points and each image A in the universe as a subset of the universe, i.e., a particular set of points A ⊂ X). Ten sets of images in different categories are considered. Each set consists of 100 images. Figure 4.4 shows samples of some of the categories. A category of images is defined as a collection of images with visual/semantic similarities. For instance, the categories of seaside images, mountain images, dinosaurs or elephants can be derived from images in the Simplicity image archive (Group, 2009). The categorization is done by an individual and it is not unique. Each image may belong to different categories.
For instance, in figure 4.5a the elephant pictures are categorized into the elephant category, but in figure 4.5b they are in the animal and/or nature categories. Therefore the categorization is completely application dependent and subjective.
FIGURE 4.4: Some image samples from different categories
The aim is to approximate a query image, I ∈ X, based on one and/or several of these
(4.5a) Sample image categories
(4.5b) Sample image categories
FIGURE 4.5: Image universe and categories
(4.6a) The subspace C, some samples from flowers category. The set C contains 100 flower images
(4.6b) original query image, A
(4.6c) lower approximation in terms of the universe X
FIGURE 4.6: Obtaining lower approximation of a query image in terms of the universe and three color components as features
categories. This will reveal important information about the degree of similarity between the query image and the categories. We define an equivalence relation on the set of images X. Three color components in addition to an image index are considered as features:
$\varphi(p_{ij}) = (\varphi_1(p_{ij}), \varphi_2(p_{ij}), \varphi_3(p_{ij}), \varphi_4(p_{ij}))$,
$\varphi_1(p_{ij}) = R(p_{ij}),\ \varphi_2(p_{ij}) = G(p_{ij}),\ \varphi_3(p_{ij}) = B(p_{ij}),\ \varphi_4(p_{ij}) = i$,
$i = 1, 2, \ldots,\ j = 1, 2, \ldots, M_i$,
where $p_{ij}$ is the j-th pixel of the i-th image ($I_i$) in the universe and $M_i$ is the number of pixels in the image $I_i$. R extracts the red component of the pixel, G the green component and B the blue component. Equivalence classes are formed on each image, I ∈ X. So each image I is partitioned into equivalence classes based on the features defined by ϕ, $I/_{\sim,\varphi}$.
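The partition of an image by the feature vector ϕ can be sketched in a few lines of Python. This is only an illustration under assumptions: an image is represented as a flat list of (R, G, B) tuples, pixel positions are zero-based list indices, and the names `phi` and `partition_image` are hypothetical.

```python
from collections import defaultdict


def phi(pixel, image_index):
    """Feature vector (R, G, B, i) of a pixel belonging to image number i."""
    red, green, blue = pixel
    return (red, green, blue, image_index)


def partition_image(pixels, image_index):
    """Group pixel positions into equivalence classes of identical feature vectors."""
    classes = defaultdict(list)
    for j, pixel in enumerate(pixels):
        classes[phi(pixel, image_index)].append(j)
    return dict(classes)


# Hypothetical three-pixel image with index 1.
image_1 = [(255, 0, 0), (0, 0, 255), (255, 0, 0)]
classes_1 = partition_image(image_1, 1)
```

Because the image index is part of the feature vector, pixels with the same color in two different images fall into different classes, so the classes of one image never mix with those of another.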
(4.7a) The subspace C, some samples from elephant category. The set C contains 100 elephant images
(4.7b) original query image, A
(4.7c) lower approximation in terms of the universe X
FIGURE 4.7: Obtaining lower approximation of a query image in terms of the universe and three color components as features
(4.8a) Subspace C, some sea shore samples (set X contains 100 sea shore images)
(4.8b) (See color insert) Sample seashore query image
(4.8c) (See color insert) Sample lower approximation in terms of the universe X, excluding the query image
(4.8d) lower approximation in terms of the universe X, including the query image
FIGURE 4.8: (Please see color insert for Figures 4.8b and c) Obtaining lower approximation of a query image in terms of the universe and three color components as features
Then we define a partition topology on X, where the basis of the topology is the set of
partitions formed on the images inside the universe. The empty set ∅ is also added to the basis (Steen and Seebach, 1995). Let (X, τ) be the partition topology on X. We consider each category of images as a subspace topology. Let C be a non-empty subset of X. The collection $\tau_C = \{T \cap C : T \in \tau\}$ of subsets of C is called the subspace topology. The topological space $(C, \tau_C)$ is said to be a subspace of (X, τ) (Gemignani, 1990). Let I ∈ X be an image. We want to find the interior of the set I in terms of open sets of different subspaces:
$\mathring{I}_C = \bigcup \{c \in \tau_C \mid c \subseteq I\}$
(4.23)
where $(C, \tau_C)$ is a subspace topology for a category of images. Based on equation 4.19, we could say that
$\mathring{I}_C = I_{C-}$
(4.24)
In other words, the interior of a set I relative to subspace C is equal to the lower approximation of the set I with respect to subspace C. Figures 4.6 and 4.7 demonstrate the results of lower approximation of a query image in terms of the specified subspace. As is evident in the examples, the more similar the query image is to the subspace in terms of the predefined features, the more complete the lower approximation is. The black pixels in the lower approximation image are those parts of the image that are filtered out. In these examples the features are three color components in RGB space. Notice that we used 4 features to form the partition topology.
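The set-theoretic content of equations 4.23 and 4.24 can be illustrated with integers standing in for points. The Python sketch below is not the image pipeline used in the experiments; the basis, the subspace C, the set I and the function names are all hypothetical. It relies on the fact that every open set of the subspace is a union of basis elements intersected with C, so the union in (4.23) can be taken over those intersections.

```python
def subspace_basis(basis, C):
    """Basis of the subspace topology on C: non-empty intersections b ∩ C."""
    C = set(C)
    return [set(b) & C for b in basis if set(b) & C]


def relative_interior(basis, C, I):
    """Union of the subspace basis elements contained in I, as in equation (4.23)."""
    I = set(I)
    result = set()
    for c in subspace_basis(basis, C):
        if c <= I:
            result |= c
    return result


# Hypothetical partition basis on the universe {1, ..., 8}.
basis = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]
universe = set().union(*basis)
C = {1, 2, 3, 4, 5}   # a subspace (for example, one category)
I = {1, 2, 3, 5}      # the set to approximate

interior_in_C = relative_interior(basis, C, I)
interior_in_X = relative_interior(basis, universe, I)
```

In this toy case the class {5, 6} is cut down to {5} inside C, so the interior relative to C is {1, 2, 5}, while the interior in the full space is only {1, 2}.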
4.6
Conclusion
In summary, this chapter presents the basic definitions of mathematical morphology and rough set theory. Mathematical morphology is defined and expanded in the image processing domain. Rough set theory is first introduced for image archives. Although the initial domains and applications of these two fields are different, there are connections between the two. This chapter brings together the common aspects of mathematical morphology and rough set theory. The lower approximation of rough set theory is analogous to opening/erosion of mathematical morphology. The same is true for upper approximation and closing/dilation. We have proposed a method to use the idea of lower approximation to find the similarity between images. A partition topology is defined on images gathered as a universe of images. Four features including color information and image indices are used to form image partitions. Subspace topologies are used to model each category of images. An interior of a query image is then calculated based on different subspaces. In other words, we find the lower approximation of the query image in terms of different subspaces. We are proposing that the closer the lower approximated image is to the query image, the more similar the query image is to the subspace.
Bibliography
Bloch, I. 2000. On links between mathematical morphology and rough sets. Pattern Recognition 33(9):1487–1496.
Engelking, R. 1989. General topology, revised & completed edition. Berlin: Heldermann Verlag.
Fashandi, H., J.F. Peters, and S. Ramanna. 2009. L2-norm length-based image similarity measures: Concrescence of image feature histogram distances. In Signal & image processing, Int. Assoc. of Science & Technology for Development, Honolulu, Hawaii, 178–185.
Gemignani, M.C. 1990. Elementary topology. Courier Dover Publications.
Group, James Z. Wang Research. 2009. Simplicity: Content based image retrieval image database search engine.
Lashin, E.F., A.M. Kozae, A.A. Abo Khadra, and T. Medhat. 2005. Rough set theory for topological spaces. International Journal of Approximate Reasoning 40(1-2):35–43.
Matheron, G. 1975. Random sets and integral geometry. New York: Wiley.
Meghdadi, A.H., J.F. Peters, and S. Ramanna. 2009. Tolerance classes in measuring image resemblance. In KES 2009, part II, LNAI 5712, 127–134. Berlin: Springer.
Pawlak, Z. 1981. Classification of objects by means of attributes. Polish Academy of Sciences 429.
Pawlak, Z. 1982. Rough sets. International J. Comp. Inform. Science 11:341–356.
Pawlak, Z. 1991. Rough sets: Theoretical aspects of reasoning about data. Kluwer Academic Print on Demand.
Pawlak, Z., and A. Skowron. 2007a. Rough sets and boolean reasoning. Information Sciences 177:41–73.
Pawlak, Z., and A. Skowron. 2007b. Rough sets: Some extensions. Information Sciences 177:28–40.
Pawlak, Z., and A. Skowron. 2007c. Rudiments of rough sets. Information Sciences 177:3–27.
Peters, James F. 2009. Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 1(4):239–245.
Peters, James F. 2010. Corrigenda and addenda: Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 2(5). In press.
Peters, James F., and Piotr Wasilewski. 2009. Foundations of near sets. Information Sciences 179:3091–3109. Digital object identifier: doi:10.1016/j.ins.2009.04.018, in press.
Poincaré, H. 1913. Mathematics and science: Last essays, trans. by J.W. Bolduc. N.Y.: Kessinger Pub.
Polkowski, L. 1999. Approximate mathematical morphology. Rough set approach. Rough Fuzzy Hybridization: A New Trend in Decision-Making.
Polkowski, L.T. 1993. Mathematical morphology of rough sets. Bulletin of the Polish Academy of Sciences-Mathematics 41(3):241.
Serra, J. 1983. Image analysis and mathematical morphology. Orlando, FL, USA: Academic Press, Inc.
Skowron, A., and J. Stepaniuk. 1996. Tolerance approximation spaces. Fundamenta Informaticae 27(2/3):245–253.
Sossinsky, A.B. 1986. Tolerance space theory and some applications. Acta Applicandae Mathematicae: An International Survey Journal on Applying Mathematics and Mathematical Applications 5(2):137–167.
Steen, L.A., and J.A. Seebach. 1995. Counterexamples in topology. Courier Dover Publications.
Stell, J.G. 2007. Relations in mathematical morphology with applications to graphs and rough sets. In Spatial information theory, vol. 4736 of LNCS, 438–454. Springer.
Zeeman, E.C. 1962. The topology of the brain and visual perception. In Topology of 3-manifolds and related topics (Proc. the Univ. of Georgia Institute, 1961), ed. M.K. Fort Jr., 240–256. Prentice Hall.
5 Rough Hybrid Scheme: An application of breast cancer imaging
Aboul Ella Hassanien, Cairo University, Egypt
Hameed Al-Qaheri, Kuwait University
Ajith Abraham, Norwegian University of Science and Technology
5.1 Introduction
5.2 Fuzzy sets, rough sets and neural networks: Brief Introduction
Fuzzy Sets • Rough sets • Neural networks • Create gray-level co-occurrence matrix from image
5.3 Rough Hybrid Approach
Pre-processing: Intensity Adjustment through Fuzzy histogram hyperbolization algorithm • Clustering and Feature Extraction: Modified fuzzy c-mean clustering algorithm and Gray level co-occurrence matrix • Rough sets analysis • Rough Neural Classifier
5.4 Results and Discussion
5.5 Conclusion
Bibliography
5.1
Introduction
Breast carcinomas are a leading cause of death for women throughout the world. Breast cancer is the second or third most common malignancy among women in developing countries (Rajendra, Ng, Y. H. Chang, and Kaw, 2008). The incidence of breast cancer is increasing globally and the disease remains a significant public health problem. Statistics from the National Cancer Institute of Canada show that the lifetime probability of a woman developing breast cancer is one in nine, with a lifetime probability of one in 27 of death due to the disease. In addition, about 385,000 of the 1.2 million women diagnosed with breast cancer each year are in Asia (Organization, 2005). Because only localized cancer is deemed to be treatable and curable, as opposed to metastasized cancer, early detection of breast cancer is of utmost importance. Mammography is, at present, the best available tool for early detection of breast cancer. However, the sensitivity of screening mammography is influenced by image quality and the radiologist's level of expertise. Contrary to masses and calcifications, the presence of architectural distortion is usually not accompanied by a site of increased density in mammograms. The detection of architectural distortion is performed by a radiologist through the identification of subtle signs of abnormality, such as the presence of spiculations and distortion of the normal oriented texture pattern of the breast. Mammography is currently the gold standard method to detect early breast cancer before it becomes clinically palpable. The use of mammography results in a 25% to 30% decreased mortality rate in screened women compared with controls after 5 to 7 years (Nystrom,
Andersson, Bjurstam, Frisell, Nordenskjold, and Rutqvist, 2002). Breast screening aims to detect breast cancers at a very early stage (before lymph node dissemination). Randomized trials of mammographic screening have provided strong evidence that early diagnosis and treatment of breast cancer reduce breast cancer mortality (Nystrom et al., 2002). Breast cancer usually presents with a single feature or a combination of the following features: a mass, associated calcifications, architectural distortion, asymmetry of architecture, breast density or duct dilation, and skin or nipple changes (Nystrom et al., 2002; Rajendra et al., 2008). In fact, a large number of mammogram image analysis systems have been employed for assisting physicians in the early detection of breast cancers on mammograms (Guo, Shao, and Ruiz, 2009; Ikedo, Morita, Fukuoka, Hara, Lee, Fujita, Takada, and Endo, 2009). The earlier a tumor is detected, the better the prognosis. Usually, a breast cancer detection system starts with preprocessing that includes digitization of the mammograms with different sampling and quantization rates. Then, the regions of interest selected from the digitized mammogram are enhanced. The segmentation process is designed to find suspicious areas and to separate them from the background; the suspicious areas are then used for feature extraction. In the feature selection process, the features of suspicious areas are extracted and selected, and suspicious regions are classified into two classes: cancer or non cancer (Aboul Ella, Ali, and Hajime, 2004; Aboul Ella, 2003; Setiono, 2000; Rajendra et al., 2008; Ikedo et al., 2009; Maglogiannis, Zafiropoulos, and Anagnostopoulos, 2007).
Rough set theory (Aboul Ella et al., 2004; Hirano and Tsumoto, 2005; Pawlak, 1982) is a fairly new intelligent technique that has been applied to the medical domain. It is used to discover data dependencies, evaluate the importance of attributes, discover patterns in data, remove redundant objects and attributes, and seek the minimum subset of attributes. Moreover, it is used for the extraction of rules from databases. One advantage of rough sets is the creation of readable if-then rules. Such rules have the potential to reveal new patterns in the data. Other approaches like case based reasoning and decision trees (Aboul Ella, 2003; Ślęzak, 2000; Aboul Ella, 2009) are also widely used to solve data analysis problems. Each of these techniques has its own properties and features, including the ability to find important rules and information that could be useful for data classification. Unlike other intelligent systems, rough set analysis requires no external parameters and uses only the information presented in the given data. The combination or integration of distinct methodologies can be done in any form: by a modular integration of two or more intelligent methodologies, which maintains the identity of each methodology; by integrating one methodology into another; or by transforming the knowledge representation in one methodology into another form of representation, characteristic of another methodology. Neural networks and rough sets are widely used for classification and rule generation (Greco, Inuiguchi, and Slowinski, 2006; Aboul Ella, 2007; Henry and Peters, 1996; Peters, Liting, and Ramanna, 2001; Peters, Skowron, Liting, and Ramanna, 2000; Sandeep and Rene, 2006; Aboul Ella, 2009).
Instead of solving a problem using a single intelligent technique such as neural networks, rough sets, or fuzzy image processing alone, the proposed approach in this chapter is to integrate the three computational intelligence techniques (forming a hybrid classification method) to reduce their weaknesses and increase their strengths. An application of breast cancer imaging has been chosen to test the ability and accuracy of a hybrid approach in classifying breast cancer images into two outcomes: malignant cancer or benign cancer. This chapter introduces a rough hybrid approach to detecting and classifying breast cancer images into two outcomes: cancer or non-cancer.
This chapter is organized as follows: Section 5.2 gives a brief mathematical background of fuzzy sets, rough sets and neural networks. Section 5.3 discusses the proposed rough hybrid scheme in detail. Experimental analysis and discussion of the results are described in Section 5.4. Finally, conclusions are presented in Section 5.5.
5.2
Fuzzy sets, rough sets and neural networks: Brief Introduction
Recently various intelligent techniques and approaches have been applied to handle the different challenges posed by data analysis. The main constituents of intelligent systems include fuzzy logic, neural networks, genetic algorithms, and rough sets. Each of them contributes a distinct methodology for addressing problems in its domain. This is done in a cooperative, rather than a competitive, manner. The result is a more intelligent and robust system providing a human-interpretable, low cost, exact enough solution, as compared to traditional techniques. This section provides a brief introduction into fuzzy sets, rough sets and neural networks.
5.2.1
Fuzzy Sets
Professor Lotfi Zadeh (Zadeh, 1965) introduced the concept of fuzzy logic to represent vagueness in linguistics, and further to implement and express human knowledge and inference capability in a natural way. Fuzzy logic starts with the concept of a fuzzy set. A fuzzy set is a set without a crisp, clearly defined boundary. It can contain elements with only a partial degree of membership. A Membership Function (MF) is a curve that defines how each point in the input space is mapped to a membership value (or degree of membership) between 0 and 1. The input space is sometimes referred to as the universe of discourse. Let X be the universe of discourse and x be a generic element of X. A classical set A is defined as a collection of elements or objects x ∈ X, such that each x can either belong to or not belong to the set A, A ⊆ X. By defining a characteristic function (or membership function) on each element x in X, a classical set A can be represented by a set of ordered pairs (x, 0) or (x, 1), where 1 indicates membership and 0 non-membership. Unlike the conventional set mentioned above, a fuzzy set expresses the degree to which an element belongs to a set. Hence the characteristic function of a fuzzy set is allowed to have values between 0 and 1, denoting the degree of membership of an element in a given set. If X is a collection of objects denoted generically by x, then a fuzzy set A in X is defined as a set of ordered pairs:
$A = \{(x, \mu_A(x)) \mid x \in X\}$
(5.1)
$\mu_A(x)$ is called the membership function of linguistic variable x in A, which maps X to the membership space M, M = [0, 1]. When M contains only the two points 0 and 1, A is crisp and $\mu_A(x)$ is identical to the characteristic function of a crisp set. Triangular and trapezoidal membership functions are the simplest membership functions, formed using straight lines. Some of the other shapes are Gaussian, generalized bell, sigmoidal and polynomial based curves. The adoption of the fuzzy paradigm is desirable in image processing because of the uncertainty and imprecision present in images, due to noise, image sampling, lighting variations and so on. Fuzzy theory provides a mathematical tool to deal with the imprecision and ambiguity in an elegant and efficient way. Fuzzy techniques can be applied to different phases
of the segmentation process; additionally, fuzzy logic makes it possible to represent the knowledge about the given problem in terms of linguistic rules with meaningful variables, which is the most natural way to express and interpret information. Fuzzy image processing (Aboul Ella et al., 2004; Kerre and Nachtegael, 2000; Nachtegael, Van-Der-Weken, Van-De-Ville, Kerre, Philips, and Lemahieu, 2001; Sandeep and Rene, 2006; Sushmita and Sankar, 2005; Rosenfeld, 1983) is the collection of all approaches that understand, represent and process the images, their segments and features as fuzzy sets. An image I of size M × N with L gray levels can be considered as an array of fuzzy singletons, each having a value of membership denoting its degree of brightness relative to some brightness levels.
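As a minimal sketch of these ideas, the Python code below defines a triangular membership function and maps a gray-level image to an array of brightness memberships. Both choices are assumptions made for illustration: the triangle parameters a < b < c are hypothetical, and the linear normalization g / (L - 1) is only one simple way to express a degree of brightness, not necessarily the one used later in the chapter.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def brightness_membership(image, levels):
    """Map each gray level g in 0..levels-1 to a membership value g / (levels - 1)."""
    return [[g / (levels - 1) for g in row] for row in image]


# Hypothetical 2 x 2 image with L = 256 gray levels.
image = [[0, 255], [51, 204]]
mu = brightness_membership(image, 256)
```

A crisp set corresponds to the special case in which every membership value is either 0 or 1.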
5.2.2
Rough sets
Due to space limitations we provide only a brief explanation of the basic framework of rough set theory, along with some of the key definitions. A more comprehensive review can be found in sources such as (Polkowski, 2002). Rough set theory provides a novel approach to knowledge description and to approximation of sets. Rough set theory was introduced by Pawlak during the early 1980s (Pawlak, 1982) and is based on an approximation space-based approach to classifying sets of objects. In rough set theory, feature values of sample objects are collected in what are known as information tables. Rows of such a table correspond to objects and columns correspond to object features. Let O, F denote a set of sample objects and a set of functions representing object features, respectively. Assume that B ⊆ F, x ∈ O. Further, let
$x/_{\sim_B} = \{y \in O \mid \forall \phi \in B,\ \phi(x) = \phi(y)\}$,
i.e., $x \sim_B y$ (the description of x matches the description of y). Rough set theory defines three regions based on the equivalence classes induced by the feature values: lower approximation $\underline{B}X$, upper approximation $\overline{B}X$ and boundary $BND_B(X)$. The lower approximation $\underline{B}X$ of a set X contains all equivalence classes $x/_{\sim_B}$ that are subsets of X, and the upper approximation $\overline{B}X$ contains all equivalence classes $x/_{\sim_B}$ that have objects in common with X, while the boundary $BND_B(X)$ is the set $\overline{B}X \setminus \underline{B}X$, i.e., the set of all objects in $\overline{B}X$ that are not contained in $\underline{B}X$. Any set X with a non-empty boundary is roughly known relative to B, i.e., X is an example of a rough set. The indiscernibility relation $\sim_B$ (also written as $Ind_B$) is a mainstay of rough set theory. Informally, $\sim_B$ is a set of all classes of objects that have matching descriptions. Based on the selection of B (i.e., set of functions representing object features), $\sim_B$ is an equivalence relation that partitions a set of objects O into classes (also called elementary sets (Pawlak, 1982)). The set of all classes in a partition is denoted by $O/_{\sim_B}$ (also by $O/Ind_B$).
The set O/IndB is called the quotient set. Affinities between objects of interest in the set X ⊆ O and classes in a partition can be discovered by identifying those classes that have objects in common with X. Approximation of the set X begins by determining which elementary sets x/∼B ∈ O/∼B are subsets of X.
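The approximations described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the chapter: the function names `partition` and `approximations`, the toy object set and the single feature `x // 2` are all assumptions made for the example.

```python
from collections import defaultdict

def partition(objects, features):
    """Group objects with matching descriptions (the classes x/~B)."""
    classes = defaultdict(set)
    for x in objects:
        classes[tuple(phi(x) for phi in features)].add(x)
    return list(classes.values())

def approximations(objects, features, X):
    """Return (lower, upper, boundary) approximations of X relative to B."""
    lower, upper = set(), set()
    for cls in partition(objects, features):
        if cls <= X:      # class entirely inside X
            lower |= cls
        if cls & X:       # class shares at least one object with X
            upper |= cls
    return lower, upper, upper - lower

# Hypothetical data: eight objects, one feature giving classes {0,1},{2,3},{4,5},{6,7}.
O = set(range(8))
B = [lambda x: x // 2]
X = {0, 1, 2, 5}
lower, upper, boundary = approximations(O, B, X)
```

Here the boundary is non-empty, so X is a rough set relative to this choice of B.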
5.2.3 Neural networks
Neural networks (NN) are an artificial intelligence (AI) methodology modelled on the structure of the human brain and made up of a wide network of interconnected processors. The basic parts of every NN are the processing elements, connections, weights, transfer functions, and the learning and feedback laws.
Throughout most of this work the classical multi-layer feed-forward network, as described in (Henry and Peters, 1996), is utilized. The most commonly used learning algorithm is back-propagation. The signals flow from neurons in the input layer to those in the output layer, passing through hidden neurons organized in one or more hidden layers. By a sigmoidal excitation function for a neuron we will understand a mapping of the form:
$$f(x) = \frac{1}{1 + e^{-\beta x}} \tag{5.2}$$
where x represents the weighted sum of inputs to a given neuron and β is the coefficient called gain, which determines the slope of the function. Let $Icn_i$, $Ocn_j$ and $w_{ij}$ be the input to neuron i, the output from neuron j, and the weight of the connection between i and j, respectively. We put:
$$Icn_i = \sum_{j=1}^{n} w_{ij}\, Ocn_j \tag{5.3}$$
$$Ocn_i = f(Icn_i) \tag{5.4}$$
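Equations (5.2) to (5.4) can be traced with a small sketch of one layer's forward pass. The names `sigmoid` and `layer_outputs`, and the example weights, are illustrative assumptions rather than part of the chapter's implementation.

```python
import math

def sigmoid(x, beta=1.0):
    """Sigmoidal excitation function of Eq. (5.2); beta is the gain."""
    return 1.0 / (1.0 + math.exp(-beta * x))

def layer_outputs(weights, prev_outputs, beta=1.0):
    """Apply Eqs. (5.3) and (5.4) to every neuron i of one layer.

    weights[i][j] is the weight w_ij between neuron i and neuron j
    of the previous layer; prev_outputs[j] is Ocn_j.
    """
    outputs = []
    for w_i in weights:
        icn_i = sum(w_ij * ocn_j for w_ij, ocn_j in zip(w_i, prev_outputs))
        outputs.append(sigmoid(icn_i, beta))
    return outputs

# Hypothetical weights for a layer of two neurons fed by two inputs.
weights = [[0.5, -0.5], [1.0, 1.0]]
hidden = layer_outputs(weights, [1.0, 1.0])
```

The first neuron receives a weighted sum of 0 and so outputs 0.5; increasing β would steepen the transition around that point.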
5.2.4 Create gray-level co-occurrence matrix from image
Statistically, texture is a unity of local variabilities and spatial correlations. The gray level co-occurrence matrix (GLCM) is one of the best-known texture analysis methods; it estimates image properties related to second-order statistics. Each entry (i, j) in the GLCM corresponds to the number of occurrences of the pair of gray levels i and j that are a distance d apart in the original image. In order to estimate the similarity between different gray level co-occurrence matrices, Haralick (Haralick, 1979) proposed 14 statistical features extracted from them. To reduce the computational complexity, only some of these features were selected. In this paper we use energy, entropy, contrast and inverse difference moment. For further reading see, e.g., (Aboul Ella, 2007).
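As a rough illustration of the GLCM and the four selected features, the sketch below counts co-occurring gray levels for a single offset and evaluates commonly used textbook forms of energy, entropy, contrast and inverse difference moment. The chapter does not spell out its exact formulas, so the normalisation, the base-2 logarithm and the function names `glcm` and `texture_features` are assumptions.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Count pairs of gray levels (i, j) separated by the offset (dy, dx)."""
    m = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r, c], image[r2, c2]] += 1
    return m

def texture_features(m):
    """Energy, entropy, contrast and inverse difference moment of a GLCM."""
    p = m / m.sum()                      # normalise counts to probabilities
    i, j = np.indices(p.shape)
    nz = p[p > 0]                        # skip zero entries in the logarithm
    return {
        "energy": float((p ** 2).sum()),
        "entropy": float(-(nz * np.log2(nz)).sum()),
        "contrast": float(((i - j) ** 2 * p).sum()),
        "idm": float((p / (1.0 + (i - j) ** 2)).sum()),
    }

# Tiny hypothetical two-level image; horizontal neighbours at distance 1.
img = np.array([[0, 0, 1],
                [1, 1, 0]])
m = glcm(img, levels=2)
feats = texture_features(m)
```

For real images the gray levels are usually quantised to a small number of levels first, which keeps the matrix small.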
5.3 Rough Hybrid Approach
In this section, an application to breast cancer imaging is chosen, and a hybridization scheme that combines the advantages of fuzzy sets, rough sets and neural networks, in conjunction with statistical feature extraction techniques, is applied to test their ability and accuracy in detecting and classifying breast cancer images into two outcomes: cancer or non-cancer. The architecture of the proposed rough hybrid approach is illustrated in Figure 5.1. It comprises four fundamental building phases. In the first phase of the investigation, a preprocessing algorithm based on fuzzy image processing is presented. It is adopted to improve the quality of the images and to make the segmentation and feature extraction phases more reliable. It contains several sub-processes. In the second phase, a modified version of the standard fuzzy c-means clustering algorithm is proposed to initialize the segmentation; then the set of features relevant to the region of interest is extracted, normalized and represented in a database as vector values. The third phase is rough set analysis. It is done by computing the minimal number of necessary attributes and their significance, and by generating a set of rules. Finally, a rough neural network is designed to discriminate different regions of interest in order to separate them into malignant and benign cases. These four phases are described in detail in the following sections along with the steps involved and the characteristic features of each phase.
FIGURE 5.1: (Please see color insert) Fuzzy rough hybrid scheme
Algorithm 1 Fuzzy-based histogram hyperbolization
Step-1: Parameter initialization
    Set the shape of the membership function (triangular)
    Set the value of the fuzzifier β such that β = −0.75µ + 1.5
Step-2: Fuzzy data
    for (i = 0; i < height; i++) do
        for (j = 0; j < width; j++) do
            if data[i][j]=100) & (data[i][j]200)&(data[i][j] […]

[…] > 0 and finite.
$\lim_{p_i \to 1} \Delta I(p_i) = \Delta I(p_i = 1) = k_2$, $k_2 > 0$ and finite.
$k_2 < k_1$.
With increase in $p_i$, $\Delta I(p_i)$ decreases exponentially.
$\Delta I(p)$ and H, the entropy, are continuous for $0 \le p \le 1$.
H is maximum when all $p_i$'s are equal, i.e. $H(p_1, \ldots, p_n) \le H(1/n, \ldots, 1/n)$.
With these in mind, (Pal and Pal, 1991) defines the gain in information from an event as $\Delta I(p_i) = e^{(1-p_i)}$, which gives a new measure of entropy as
$$H = \sum_{i=1}^{n} p_i\, e^{(1-p_i)}.$$
Pal's version of entropy is given in Fig. 7.8. Note, these images were formed by first converting the original image to greyscale, calculating the entropy for each subimage, and multiplying this value by 94 (since the maximum of H is $e^{1-1/256}$).
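The exponential entropy and the scaling factor of 94 can be checked with a short sketch. The function name `pal_entropy` and the flat list of grey values used as input are assumptions; the NEAR system's own implementation may differ.

```python
import math
from collections import Counter

def pal_entropy(pixels):
    """Exponential entropy H = sum_i p_i * exp(1 - p_i) over grey-level frequencies."""
    counts = Counter(pixels)
    n = len(pixels)
    return sum((c / n) * math.exp(1.0 - c / n) for c in counts.values())

# All 256 grey levels equally likely: H reaches its maximum exp(1 - 1/256).
h_max = pal_entropy(list(range(256)))
# A constant 5 x 5 subimage: a single grey level with p = 1, so H = 1.
h_min = pal_entropy([7] * 25)
# Scaling H to the displayable range 0..255 gives the factor quoted in the text.
scale = 255.0 / h_max
```

With 256 grey levels the maximum is about 2.71, so 255 divided by it is close to 94.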
FIGURE 7.8: Example of Pal's entropy applied to images: (a) Original image (Martin et al., 2001), (b) Pal's entropy applied to subimages of size 5 × 5, and (c) Pal's entropy applied to subimages of size 10 × 10.
7.4.5 Edge based probe functions
Near Set Evaluation And Recognition (NEAR) System
The edge based probe functions integrated in the NEAR system incorporate an implementation of Mallat's multiscale edge detection method based on wavelet theory (Mallat and Zhong, 1992). The idea is that edges in an image occur at points of sharp variation in pixel intensity. Mallat's method calculates the gradient of a smoothed image using wavelets, and defines edge pixels as those that have locally maximal gradient magnitudes in the direction of the gradient. Formally, define a 2-D smoothing function θ(x, y) such that its integral over x and y is equal to 1, and it converges to 0 at infinity. Using the smoothing function, one can define the functions
$$\psi^1(x, y) = \frac{\partial \theta(x, y)}{\partial x} \quad \text{and} \quad \psi^2(x, y) = \frac{\partial \theta(x, y)}{\partial y},$$
which are, in fact, wavelets given the properties of θ(x, y) mentioned above. Next, the dilation of a function by a scaling factor s is defined as
$$\xi_s(x, y) = \frac{1}{s^2}\, \xi\!\left(\frac{x}{s}, \frac{y}{s}\right).$$
Thus, the dilation by s of $\psi^1$ and $\psi^2$ is given by
$$\psi^1_s(x, y) = \frac{1}{s^2}\, \psi^1\!\left(\frac{x}{s}, \frac{y}{s}\right) \quad \text{and} \quad \psi^2_s(x, y) = \frac{1}{s^2}\, \psi^2\!\left(\frac{x}{s}, \frac{y}{s}\right).$$
Using these definitions, the wavelet transform of $f(x, y) \in L^2(\mathbb{R}^2)$ at the scale s is given by
$$W^1_s f(x, y) = f * \psi^1_s(x, y) \quad \text{and} \quad W^2_s f(x, y) = f * \psi^2_s(x, y),$$
which can also be written as
$$\begin{pmatrix} W^1_s f(x, y) \\ W^2_s f(x, y) \end{pmatrix} = s \begin{pmatrix} \frac{\partial}{\partial x}(f * \theta_s)(x, y) \\ \frac{\partial}{\partial y}(f * \theta_s)(x, y) \end{pmatrix} = s \vec{\nabla}(f * \theta_s)(x, y).$$
Finally, edges can be detected by calculating the modulus and angle of the gradient vector, defined respectively as
$$M_s f(x, y) = \sqrt{|W^1_s f(x, y)|^2 + |W^2_s f(x, y)|^2}$$
and
$$A_s f(x, y) = \operatorname{argument}\big(W^1_s f(x, y) + i\, W^2_s f(x, y)\big),$$
and then finding the modulus maxima, defined as pixels with modulus greater than the two neighbours in the direction indicated by $A_s f(x, y)$ (see (Mallat and Zhong, 1992) for specific implementation details). Examples of Mallat's edge detection method obtained using the NEAR system are given in Fig. 7.9.
Edge present
This probe function simply returns true if there is an edge pixel contained in the subimage (see, e.g., Fig. 7.10).
Number of edge pixels
This probe function returns the total number of pixels in a subimage belonging to an edge (see, e.g., Fig. 7.11).
FIGURE 7.9: (See color insert) Example of NEAR system edge detection using Mallat's method: (a) Original image, (b) edges obtained from (a), (c) original image, and (d) edges obtained from (c).
FIGURE 7.10: Example of edge present probe function: (a) Edges obtained from Fig. 7.5a, (b) Application to image with subimages of size 5 × 5, and (c) Application to image with subimages of size 10 × 10.
Edge orientation
This probe function returns the average orientation of subimage pixels belonging to an edge (see, e.g., Fig. 7.12).
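The three edge based probe functions can be sketched once an edge mask and an angle map are available for a subimage. The sketch below assumes that both are given as nested lists; it does not implement Mallat's wavelet transform, and the function names, the plain arithmetic mean used for the average orientation and the value returned for a subimage without edges are all assumptions.

```python
import math

def modulus_and_angle(w1, w2):
    """Modulus and angle of the gradient vector (W1, W2) at one pixel."""
    return math.hypot(w1, w2), math.atan2(w2, w1)

def edge_present(edge_mask):
    """True if the subimage contains at least one edge pixel."""
    return any(any(row) for row in edge_mask)

def edge_pixel_count(edge_mask):
    """Total number of edge pixels in the subimage."""
    return sum(1 for row in edge_mask for e in row if e)

def average_orientation(edge_mask, angles):
    """Mean angle over edge pixels; 0.0 if there are none (an assumption)."""
    vals = [a for mrow, arow in zip(edge_mask, angles)
            for e, a in zip(mrow, arow) if e]
    return sum(vals) / len(vals) if vals else 0.0

# Hypothetical 2 x 2 subimage with two edge pixels.
mask = [[True, False], [False, True]]
angles = [[0.0, 1.0], [2.0, math.pi / 2]]
modulus, angle = modulus_and_angle(3.0, 4.0)
present = edge_present(mask)
count = edge_pixel_count(mask)
orientation = average_orientation(mask, angles)
```

A circular mean may be preferable when orientations cluster around ±π; the arithmetic mean is used here only to keep the sketch short.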
FIGURE 7.11: Example of number of edge pixels probe function: (a) Original image, (b) Application to image with subimages of size 5 × 5, and (c) Application to image with subimages of size 10 × 10.
FIGURE 7.12: Example of average orientation probe function: (a) Original image, (b) Application to image with subimages of size 5 × 5, and (c) Application to image with subimages of size 10 × 10.
7.5 Equivalence class frame
This frame calculates equivalence classes using the Indiscernibility relation of Defn. 1, i.e., given an image X, it will calculate X/∼B where the objects are subimages of X. See Section 7.3 for an explanation of the theory used to obtain these results. A sample calculation using this frame is given in Fig. 7.13 and was obtained by the following steps:
1. Click Load Image button.
2. Select number of features (maximum allowed is four).
3. Select features (see Section 7.4 for a list of probe functions).
4. Select window size. The value is taken as the square root of the area for a square subimage, e.g., a value of 5 creates a subimage of 25 pixels.
5. Click Run.
The result is given in Fig. 7.13 where the bottom left window contains an image of the equivalence classes where each colour represents a single class. The bottom right window is used to display equivalence classes by clicking in any of the three images. The coordinates of the mouse click determine the equivalence class that is displayed. The results may be saved by clicking on the save button.
FIGURE 7.13: Sample run of the equivalence class frame using a window size of 5 × 5 and B = {φNormG , φHShannon }.
7.6 Tolerance class frame
This frame calculates tolerance classes using the Tolerance relation of Defn. 3, i.e., given an image X, it will calculate X/≅B where the objects are subimages of X. This approach is similar to the one given in Section 7.3 with the exception that Defn. 1 is replaced with Defn. 3. A sample calculation using this frame is given in Fig. 7.14 and was obtained by the following steps:
1. Click Load Image button.
2. Select number of features (maximum allowed is four).
3. Select features (see Section 7.4 for a list of probe functions).
4. Select window size. The value is taken as the square root of the area for a square subimage, e.g., a value of 5 creates a subimage of 25 pixels.
5. Select ε, a value in the interval [0, 1].
6. Click Run.
The result is given in Fig. 7.14 where the left side is the original image, and the right side is used to display the tolerance classes. Since the tolerance relation does not partition an image, the tolerance classes are displayed upon request. For instance, by clicking on either of the two images, all the tolerance classes are displayed that are within ε of the subimage containing the coordinates of the mouse click. Further, the subimage containing the mouse click is coloured black.
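The behaviour on a mouse click can be illustrated with a small sketch. It assumes that two subimages are within tolerance when the Euclidean distance between their feature descriptions is at most ε, which is the usual form of such a relation but is not restated in this section; the function names and the feature vectors are hypothetical.

```python
import math

def within_tolerance(desc_x, desc_y, eps):
    """True when two feature descriptions differ by at most eps (L2 norm)."""
    return math.dist(desc_x, desc_y) <= eps

def tolerance_neighbourhood(descriptions, clicked, eps):
    """Indices of all subimages within eps of the clicked subimage."""
    target = descriptions[clicked]
    return [k for k, d in enumerate(descriptions)
            if within_tolerance(target, d, eps)]

# Hypothetical normalised descriptions (two features per subimage).
descriptions = [(0.10, 0.50), (0.12, 0.52), (0.40, 0.50),
                (0.14, 0.50), (0.18, 0.50)]
near_0 = tolerance_neighbourhood(descriptions, clicked=0, eps=0.05)
near_3 = tolerance_neighbourhood(descriptions, clicked=3, eps=0.05)
```

Subimage 3 is near both subimage 0 and subimage 4, yet 0 and 4 are not near each other. This lack of transitivity is why the relation does not partition the image.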
FIGURE 7.14: Sample run of the tolerance class frame using a window size of 10 × 10, B = {φNormG , φHShannon }, and ε = 0.05.
7.7 Segmentation evaluation frame
This frame performs segmentation evaluation using perceptual morphology as described in (Henry and Peters, 2008, 2009c), where the evaluation is labelled the Near Set Index (NSI). Briefly, the NSI uses perceptual morphology (a form of morphological image processing based on traditional mathematical morphology (Henry and Peters, 2009c)) to evaluate the quality of an image segmentation. As is given in (Henry and Peters, 2009c), the perception-based dilation is defined as
$$A \oplus B = \{x/_{\sim_B} \in B \mid x/_{\sim_B} \cap A \neq \emptyset\},$$
and the perception-based erosion is defined as
$$A \ominus B = \bigcup_{x/_{\sim_B} \in B} \{x/_{\sim_B} \cap A\},$$
where the set A ⊆ O is selected such that it has some a priori perceptual meaning associated with it, i.e. this set has definite meaning in a perceptual sense outside of the probe functions in B. Furthermore, the structuring element B is the quotient set given in Eq. 7.1, i.e., B = O/∼B (the quotient set is being relabelled only to be notationally consistent with traditional mathematical morphology). As was reported in (Henry and Peters, 2009c), the quotient set is used as the SE in perceptual morphology, since it contains the perceptual information necessary to augment the set A in a perceptually meaningful way. This perceptual information is in the form of elementary sets (collections of objects with the same descriptions), since we perceive objects by the features that describe them and people tend to grasp not single objects, but classes of them (Orlowska, 1982). For instance, given a set of probe functions B, and an image A, this frame can perform the perceptual erosion or dilation using B = O/∼B as the SE. Also, the NSI is calculated if perceptual erosion was selected. A sample calculation using this frame is given in Fig. 7.15 and was obtained by the following steps:
FIGURE 7.15: Sample run of the segmentation evaluation frame using a window size of 2 × 2, and B = {φNormG , φHShannon }.
1. Click Load Image & Segment button.
2. Select an image and click Open.
3. Select segmentation image and click Open. Image should contain only one segment and the segment must be white (255, 255, 255) and the background must be black (0, 0, 0). The image is displayed in the top frame, while the segment is displayed in the bottom right (make sure this is the case).
4. Select number of features (maximum allowed is four).
5. Select features (see Section 7.4 for a list of probe functions).
6. Select window size. The value is taken as the square root of the area for a square subimage, e.g., a value of 5 creates a subimage of 25 pixels.
7. Click Erode to perform perceptual erosion and segmentation evaluation. Click Dilate to perform perceptual dilation (no evaluation takes place during dilation).
The result is given in Fig. 7.15 where the bottom left window contains an image of the equivalence classes where each colour represents a different class. The bottom right window contains either the segment's erosion or dilation. Clicking on any of the three images will display the equivalence class containing the mouse click in the bottom right image. The NSI is also displayed on the left hand side.
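A minimal sketch of the two perceptual morphology operations on sets follows. It is an interpretation, not the NEAR system's code: the dilation is returned as the union of the classes that meet A, and the erosion is returned as the list of class fragments x/∼B ∩ A kept separate (their union would simply give back A when the classes cover O). The function names and the toy quotient set are assumptions, and the NSI itself is not computed here.

```python
def perceptual_dilation(A, quotient_set):
    """Union of the classes in the quotient set that intersect A."""
    result = set()
    for cls in quotient_set:
        if cls & A:
            result |= cls
    return result

def perceptual_erosion(A, quotient_set):
    """Non-empty fragments of each class that lie inside A."""
    return [cls & A for cls in quotient_set if cls & A]

# Hypothetical quotient set O/~B over six objects and a segment A.
quotient_set = [{0, 1}, {2, 3}, {4, 5}]
A = {1, 2, 3}
dilated = perceptual_dilation(A, quotient_set)
eroded = perceptual_erosion(A, quotient_set)
```

Dilation grows the segment to whole classes, while erosion shows how the segment breaks up into pieces with matching descriptions.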
7.8 Near image frame
This frame is used to calculate the nearness of two images using the nearness measure from Eq. 7.3 defined in Section 7.2. A sample calculation using this frame is given in Fig. 7.16 and was obtained by the following steps:
FIGURE 7.16: Sample run of the near image frame using a window size of 10 × 10, B = {φNormG , φHShannon }, and ε = 0.05.
1. Click Load Images button and select two images.
2. Select number of features (maximum allowed is four).
3. Select features (see Section 7.4 for a list of probe functions).
4. Select window size. The value is taken as the square root of the area for a square subimage, e.g., a value of 5 creates a subimage of 25 pixels.
5. Select ε, a value in the interval [0, 1].
6. Click Run.
The result is given in Fig. 7.16 where the left side contains the first image, and the right side contains the second image. Clicking in either of the two images will display the tolerance classes from both images near to the subimage selected by the mouse click. The subimage matching the coordinates of the mouse click is coloured black and all subimages that are near to the black subimage are displayed using a different colour for each class. The NM is also displayed on the left hand side.
7.9 Feature display frame
This frame is used to display the output of processing an image with a specific probe function. A sample calculation using this frame is given in Fig. 7.17 and was obtained by the following steps:
FIGURE 7.17: Sample run of the feature display frame.
1. Click Load Image button and select an image.
2. Select features (see Section 7.4 for a list of probe functions).
3. Select probe function.
4. Click Display feature.
7.10 Conclusion
This chapter has presented details on the NEAR system available for download at (Peters, 2009a). Specifically, it has presented background on near set theory, introduced some useful features in image processing, and systematically discussed all functions of the NEAR system. This tool has proved to be vital in the study of near set theory. By design, the system is modular and easily adaptable, as can be seen by the varied results reported in (Henry and Peters, 2007; Peters, 2007a,c; Henry and Peters, 2008; Peters, 2008; Peters and Ramanna, 2009; Peters, 2009b; Henry and Peters, 2009c; Peters and Wasilewski, 2009; Peters, 2009c, 2010; Hassanien et al., 2009; Henry and Peters, 2009b). Future work will focus on improvements for measuring image similarity, as well as the ability to compare images from databases for use in image retrieval.
Bibliography
Bartol, W., J. Miró, K. Pióro, and F. Rosselló. 2004. On the coverings by tolerance classes. Information Sciences 166(1-4):193–211.
Christoudias, C., B. Georgescu, and P. Meer. 2002. Synergism in low level vision. In Proceedings of the 16th international conference on pattern recognition, vol. 4, 150–156. Quebec City.
Fashandi, H., J. F. Peters, and S. Ramanna. 2009. l2 norm length-based image similarity measures: Concrescence of image feature histogram distances 178–185.
Gerasin, S. N., V. V. Shlyakhov, and S. V. Yakovlev. 2008. Set coverings and tolerance relations. Cybernetics and System Analysis 44(3):333–340.
Gupta, S., and K. Patnaik. 2008. Enhancing performance of face recognition systems by using near set approach for selecting facial features. Journal of Theoretical and Applied Information Technology 4(5):433–441.
Hassanien, A. E., A. Abraham, J. F. Peters, G. Schaefer, and C. Henry. 2009. Rough sets and near sets in medical imaging: A review. IEEE Transactions on Information Technology in Biomedicine 13(6):955–968. Digital object identifier: 10.1109/TITB.2009.2017017.
Henry, C., and J. F. Peters. 2007. Image pattern recognition using approximation spaces and near sets. In Proceedings of the eleventh international conference on rough sets, fuzzy sets, data mining and granular computing (rsfdgrc 2007), joint rough set symposium (jrs07), lecture notes in artificial intelligence, vol. 4482, 475–482.
———. 2008. Near set index in an objective image segmentation evaluation framework. In Proceedings of the geographic object based image analysis: Pixels, objects, intelligence, to appear. University of Calgary, Alberta.
———. 2009a. Near set evaluation and recognition (near) system. Tech. Rep., Computational Intelligence Laboratory, University of Manitoba. UM CI Laboratory Technical Report No. TR-2009-015.
———. 2009b. Perception based image classification. Tech. Rep., Computational Intelligence Laboratory, University of Manitoba. UM CI Laboratory Technical Report No. TR-2009-016.
———. 2009c. Perceptual image analysis. International Journal of Bio-Inspired Computation 2(2):to appear.
Henry, C., and J. F. Peters. 2009d. Near sets. Wikipedia. http://en.wikipedia.org/wiki/Near_sets.
Magick++. 2009. Imagemagick image-processing library. www.imagemagick.org.
Mallat, S., and S. Zhong. 1992. Characterization of signals from multiscale edges. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(7):710–732.
Marti, J., J. Freixenet, J. Batlle, and A. Casals. 2001. A new approach to outdoor scene description based on learning and top-down segmentation. Image and Vision Computing 19:1041–1055.
Martin, D., C. Fowlkes, D. Tal, and J. Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the 8th international conference on computer vision, vol. 2, 416–423.
Meghdadi, A. H., J. F. Peters, and S. Ramanna. 2009. Tolerance classes in measuring image resemblance. Intelligent Analysis of Images & Videos, KES 2009, Part II, Knowledge-Based and Intelligent Information and Engineering Systems, LNAI 5712 127–134. ISBN 978-3-64-04591-2, doi 10.1007/978-3-642-04592-9_16.
Orlowska, E. 1982. Semantics of vague concepts. Applications of rough sets. Tech. Rep. 469, Institute for Computer Science, Polish Academy of Sciences.
———. 1985. Semantics of vague concepts. In Foundations of logic and linguistics. Problems and solutions, ed. G. Dorn and P. Weingartner, 465–482. London/NY: Plenum Press.
Pal, N. R., and S. K. Pal. 1991. Entropy: A new definition and its applications. IEEE Transactions on Systems, Man, and Cybernetics 21(5):1260–1270.
———. 1992. Some properties of the exponential entropy. Information Sciences 66:119–137.
Pavel, M. 1983. “Shape theory” and pattern recognition. Pattern Recognition 16(3):349–356.
Pawlak, Z. 1981. Classification of objects by means of attributes. Tech. Rep. PAS 429, Institute for Computer Science, Polish Academy of Sciences.
———. 1982. Rough sets. International Journal of Computer and Information Sciences 11:341–356.
Pawlak, Z., and A. Skowron. 2007a. Rough sets and boolean reasoning. Information Sciences 177:41–73.
———. 2007b. Rough sets: Some extensions. Information Sciences 177:28–40.
———. 2007c. Rudiments of rough sets. Information Sciences 177:3–27.
Peters, J. F. 2007a. Classification of objects by means of features. In Proceedings of the ieee symposium series on foundations of computational intelligence (ieee scci 2007), 1–8. Honolulu, Hawaii.
———. 2007b. Near sets. General theory about nearness of objects. Applied Mathematical Sciences 1(53):2609–2629.
———. 2007c. Near sets. Special theory about nearness of objects. Fundamenta Informaticae 75(1-4):407–433.
———. 2008. Classification of perceptual objects by means of features. International Journal of Information Technology & Intelligent Computing 3(2):1–35.
———. 2009a. Computational intelligence laboratory. http://wren.ece.umanitoba.ca/.
———. 2009b. Discovery of perceptually near information granules. In Novel developments in granular computing: Applications of advanced human reasoning and soft computation, ed. J. T. Yao, in press. Hersey, N.Y., USA: Information Science Reference.
———. 2009c. Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 1(4):239–245.
———. 2010. Corrigenda and addenda: Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 2(5). in press.
Peters, J. F., and S. Ramanna. 2007. Feature selection: A near set approach. In Ecml & pkdd workshop in mining complex data, 1–12. Warsaw.
———. 2009. Affinities between perceptual granules: Foundations and perspectives. In Human-centric information processing through granular modelling, ed. A. Bargiela and W. Pedrycz, 49–66. Berlin: Springer-Verlag.
Peters, J. F., S. Shahfar, S. Ramanna, and T. Szturm. 2007a. Biologically-inspired adaptive learning: A near set approach. In Frontiers in the convergence of bioscience and information technologies. Korea.
Peters, J. F., A. Skowron, and J. Stepaniuk. 2006. Nearness in approximation spaces. In Proc. concurrency, specification & programming, 435–445. Humboldt Universität.
———. 2007b. Nearness of objects: Extension of approximation space model. Fundamenta Informaticae 79(3-4):497–512.
Peters, J. F., and P. Wasilewski. 2009. Foundations of near sets. Information Sciences. An International Journal 179:3091–3109. Digital object identifier: doi:10.1016/j.ins.2009.04.018.
———. 2010. Tolerance space view of what we see. Poincaré's physical continuum and Zeeman's visual perception. Mathematical Intelligencer, submitted.
Peters, J. F., and L. Puzio. 2009. Image analysis with anisotropic wavelet-based nearness measures. International Journal of Computational Intelligence Systems 3(2):1–17.
Poincaré, H. 1913. Mathematics and science: Last essays, trans. by J. W. Bolduc. N. Y.: Kessinger.
Schroeder, M., and M. Wright. 1992. Tolerance and weak tolerance relations. Journal of Combinatorial Mathematics and Combinatorial Computing 11:123–160.
Seemann, T. 2002. Digital image processing using local segmentation. Ph.D. dissertation, School of Computer Science and Software Engineering, Monash University.
Shreider, Yu. A. 1970. Tolerance spaces. Cybernetics and System Analysis 6(12):153–758.
Skowron, A., and J. Stepaniuk. 1996. Tolerance approximation spaces. Fundamenta Informaticae 27(2-3):245–253.
Sossinsky, A. B. 1986. Tolerance space theory and some applications. Acta Applicandae Mathematicae: An International Survey Journal on Applying Mathematics and Mathematical Applications 5(2):137–167.
Weber, M. 1999. Leaves dataset. Url: www.vision.caltech.edu/archive.html.
wxWidgets. 2009. wxwidgets cross-platform gui library v2.8.9. www.wxwidgets.org.
Zeeman, E. C. 1962. The topology of the brain and the visual perception. In Topology of 3-manifolds and selected topics, ed. K. M. Fort, 240–256. New Jersey: Prentice Hall.
Zheng, Z., H. Hu, and Z. Shi. 2005. Tolerance relation based granular space. Lecture Notes in Computer Science 3641:682–691.
8 Perceptual Systems Approach to Measuring Image Resemblance

Amir H. Meghdadi, Computational Intelligence Laboratory, University of Manitoba
James F. Peters, Computational Intelligence Laboratory, University of Manitoba

8.1 Introduction
8.2 Perceptual Systems, Feature Based Relations and Near Sets
    Perceptual Systems • Perceptual Indiscernibility and Tolerance Relations • Nearness Relations and Near Sets
8.3 Analysis and Comparison of Images Using Tolerance Classes
    Tolerance Overlap Distribution nearness measure (TOD)
8.4 Summary and Conclusions
Bibliography

8.1 Introduction
Image resemblance is viewed as a form of nearness between sets of perceptual objects, as originally proposed in near set theory in (Peters, 2007b,c) and further elaborated in (Peters, 2009, 2010; Peters and Wasilewski, 2009), where a nearness relation is shown to be a tolerance relation on the family of near sets in a perceptual system. The idea of using tolerance relations (Sossinsky, 1986) in formalizing the concept of perceptual resemblance between images was introduced in 2008 (see, e.g., (Peters, 2008b), elaborated in (Peters, 2009, 2010; Peters and Wasilewski, 2009)) as a result of a collaboration with Z. Pawlak in 2002 on describing the nearness of perceived objects (Pawlak and Peters, 2002,2007). In this approach, images are considered to be non-empty sets of perceptual objects O (pixels or subimages). Moreover, a set F of probe functions is used to describe the objects by extracting some of their perceivable features. The pair ⟨O, F⟩ is called a perceptual information system (Peters and Ramanna, 2009). Near set theory grew out of a generalization of the rough set approach (Pawlak, 1981a,b) in describing the affinities between sample objects. The perceptual basis of near set theory was inspired by Orlowska's suggestion that approximation spaces are the formal counterpart of perception or observation (Orlowska and Pawlak, 1984).
This concept has been described in (Peters, 2008c) as follows: “Our mind identifies relationships between object features to form perceptions of sensed objects. Our senses gather the information from the objects we perceive and map sensations to values assimilated by the mind. Thus, our senses can be likened to perceptual probe functions in the form of a mapping of stimuli from objects in our environment to sensations (values used by the mind to perceive objects)”.
8.2 Perceptual Systems, Feature Based Relations and Near Sets
In this section, formal definitions of perceptual systems, tolerance and nearness relations are provided.

TABLE 8.1 Perceptual System Symbols
Symbol                        Interpretation
O                             Set of perceptual objects
F                             Set of probe functions
ℝ                             Set of real numbers
∼B                            Indiscernibility relation
≅B,ε                          Tolerance relation
x/∼B                          = {y ∈ X | y ∼B x}
O/∼B                          = {x/∼B | x ∈ O}
$\bowtie_B$                   Nearness relation
$\underline{\bowtie}_B$       Weak nearness relation
X                             Sample X ⊆ O
B                             Sample B ⊆ F
φ ∈ B                         Probe φ : O → ℝ
≃B                            Weak indiscernibility relation
A ⊂ ≅B,ε                      ∀x, y ∈ A, x ≅B,ε y (preclass)
x/≅B,ε                        x in maximal preclass (tolerance class)
X/∼B                          = {x/∼B | x ∈ X}, quotient set
$\bowtie_{B,\varepsilon}$     Tolerance nearness relation
$\underline{\bowtie}_{B,\varepsilon}$   Weak tolerance nearness relation

8.2.1 Perceptual Systems
DEFINITION 8.1 Perceptual System
A perceptual system ⟨O, F⟩ is a real-valued, total*, deterministic information system where O is a non-empty set of perceptual objects, while F is a countable set of probe functions.

* A perceptual system is total inasmuch as each probe function φ maps O to a single real value.

A perceptual object (x ∈ O) represents something in the physical world that can be perceived with our senses (for example a pixel in an image, or an image in a set of images). Usually, we are dealing with a set of objects X ⊆ O (for example an image that consists of pixels or subimages). A probe function φ(x) is a real-valued function representing a feature of the physical object x. A set of probe functions F = {φ1, φ2, ..., φl} can be defined to extract all the feature values for each object x. However, not all the probe functions (features) may be used all the time. The set B ⊆ F represents the probe functions in use. This approach to representation and comparison of feature values by probe functions started with the introduction of near sets (see (Peters, 2007a) and (Peters, 2008a)). Probe functions provide a basis for describing and discerning affinities between sample objects in the context of what is known as a perceptual system. This represents a departure from partial functions known as attributes, defined in terms of a column of values in an information system table in rough set theory.
Example 8.1
Perceptual Subimages (pixel windows). An image can be partitioned into subimages viewed as perceptual objects. Each subimage has feature values that are the result of visual perception, i.e., how we visualize a subimage (e.g., its colour, texture, spatial orientation). Figure 8.1, for example, shows an image of size 255 × 255 pixels divided into pixel windows (subimages) of size 85 × 85 pixels. The image can therefore be represented by a set O of 9 perceptual objects as follows:
$$O = \{x_1, x_2, \ldots, x_9\} \tag{8.1}$$
For simplicity, the subimages are very large here. In practice, subimages are much smaller, resulting in a higher number of subimages (see Figure 8.2 for example).
Different probe functions can be defined to extract feature values of an image or subimage. Average gray, image entropy, texture and color information in each subimage are some examples. For practical use, several feature values are needed to represent an image. However, in some examples of this chapter, only average gray value and sometimes entropy have been used. Moreover, feature values have been normalized between 0 and 1 in cases where more than one probe function is used. Figure 8.1 shows the image as well as all the subimages, where the gray level of each pixel is replaced with the average gray level of the subimage containing the pixel. By way of illustration, average gray levels are shown in each subimage in Fig. 8.1.
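The average gray probe function over pixel windows can be sketched with NumPy. The function name `window_average` is hypothetical, and the sketch assumes that the window size divides the image size exactly, as in the 255 × 255 image with 85 × 85 windows of Example 8.1.

```python
import numpy as np

def window_average(image, w):
    """Average gray level of each w x w window, plus the image rebuilt from them."""
    rows, cols = image.shape
    if rows % w or cols % w:
        raise ValueError("sketch assumes the windows tile the image exactly")
    # Axes: (window row, row inside window, window column, column inside window).
    blocks = image.reshape(rows // w, w, cols // w, w)
    means = blocks.mean(axis=(1, 3))
    # Replace every pixel by the mean of the window that contains it.
    averaged = np.repeat(np.repeat(means, w, axis=0), w, axis=1)
    return means, averaged

# Hypothetical 4 x 4 image split into four 2 x 2 subimages.
image = np.arange(16, dtype=float).reshape(4, 4)
means, averaged = window_average(image, 2)
```

Each entry of `means` is the feature value of one perceptual object, while `averaged` corresponds to the kind of display shown in Fig. 8.1.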
8.2.2
Perceptual Indiscernibility and Tolerance Relations
Indiscernibility and tolerance relations are defined in order to establish and measure affinities between pairs of perceptual objects in a perceptual system ⟨O, F⟩. Each of these relations is a subset of O × O. The indiscernibility relation is a key concept in approximation spaces in rough set theory. A perceptual indiscernibility relation is defined as follows.
FIGURE 8.1: An image and its 9 subimages, each shown with its average gray level (subimage: gray level): 1: 161, 2: 78, 3: 66, 4: 139, 5: 80, 6: 116, 7: 161, 8: 152, 9: 114
DEFINITION 8.2 Perceptual Indiscernibility Relation (Peters, 2010) Let ⟨O, F⟩ be a perceptual system. Let φ ∈ B, x, y ∈ O and let φB(x) = (φ1(x), . . . , φi(x), . . . , φL(x)) denote a description of object x containing feature values represented by φi ∈ B. A perceptual indiscernibility relation ∼B is defined relative to B as follows
∼B = {(x, y) ∈ O × O | ‖φB(x) − φB(y)‖₂ = 0},
(8.2)
where ‖·‖₂ denotes the L2 (Euclidean) norm. The set of all perceptual objects in O that are indiscernible relative to an object x ∈ O is called an equivalence class, denoted by x/∼B. This form of indiscernibility relation, introduced in (Peters, 2009), is a variation of the very useful relation introduced by Z. Pawlak in 1981 (Pawlak, 1981a). Note that all of the elements in x/∼B have matching descriptions, i.e., the objects in x/∼B are indiscernible from each other. Then, by definition,
FIGURE 8.2: An example of an image partitioned into 961 subimages
∀x ∈ O,
x/∼B = {y ∈ O | y ∼B x}.
(8.3)
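The indiscernibility relation partitions the set O to form the quotient set O/∼B, the set of all equivalence classes x/∼B. The classes cover O, and distinct classes are disjoint:
\[ O/_{\sim_B} = \{\, x/_{\sim_B} \mid x \in O \,\}, \]
(8.4)
\[ \bigcup_{x \in O} x/_{\sim_B} = O, \]
(8.5)
\[ \forall x, y \in O: \quad x/_{\sim_B} \neq y/_{\sim_B} \;\Rightarrow\; (x/_{\sim_B}) \cap (y/_{\sim_B}) = \emptyset. \]
(8.6)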
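The quotient set of (8.4) can be computed by grouping objects with identical descriptions. The following is a minimal sketch, assuming that descriptions are given as tuples of feature values; the function name quotient_set is hypothetical, and the gray levels are the ones read from Figure 8.1.

```python
from collections import defaultdict

def quotient_set(descriptions):
    """Group object indices by identical descriptions phi_B(x).

    Two objects land in the same class exactly when their feature vectors
    are equal, i.e. when the L2 norm of their difference is 0."""
    classes = defaultdict(list)
    for index, description in enumerate(descriptions):
        classes[tuple(description)].append(index)
    return list(classes.values())

# Average gray levels of subimages x1..x9 as read from Figure 8.1
# (index 0 corresponds to x1).
gray = [161, 78, 66, 139, 80, 116, 161, 152, 114]
partition = quotient_set([(g,) for g in gray])
print(partition)  # [[0, 6], [1], [2], [3], [4], [5], [7], [8]]
```

Only x1 and x7 share a description here, so the partition contains one class of size two and seven singletons. With real-valued features, exact equality is rare, which is one motivation for the tolerance relation introduced later in this section.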
Example 8.2
Figure 8.3a shows an image of size 256 × 256 pixels and its subimages. Let ⟨O, F⟩ be a perceptual system where O denotes the set of 25 × 25 subimages.
FIGURE 8.3: An image and one of its equivalence classes (8.3a, left) and one of its tolerance classes (8.3b, right). Panels in each part: (a) the original image, (b) a subimage in the covering, (c) the class on a white background, (d) the class marked on the image
Let B = {φ1} ⊆ F be the set containing only one probe function φ1, where φ1(x) = gray(x) is the average gray scale value of subimage x. The two marked subimages are perceptually indiscernible with respect to their gray level values and hence they form an equivalence class. The equivalence class is shown both individually (c) and on top of the image, where the rest of the image is blurred (d). A perceptual indiscernibility relation can also be defined in a weak sense as follows.
DEFINITION 8.3 Perceptual Weak Indiscernibility Relation (Peters and Wasilewski, 2009; Peters, 2009)
Let ⟨O, F⟩ be a perceptual system. Let B = {φ1, φ2, ..., φl} and x, y ∈ O. A perceptual weak indiscernibility relation ∼B is defined relative to B as follows
∼B = {(x, y) ∈ O × O | ∃φi ∈ B, ‖φi(x) − φi(y)‖₂ = 0}.
(8.7)
Tolerance Relations and Tolerance Classes
The concept of indiscernibility relation can be generalized to the tolerance relation, which is very important in near set theory (Peters, 2009, 2010; Peters and Wasilewski, 2009; Peters and Ramanna, 2009; Peters and Puzio, 2009; Meghdadi, Peters, and Ramanna, 2009). Tolerance relations emerge in the transition from the concept of equality to almost equality when comparing objects by their feature values.
DEFINITION 8.4 Tolerance Relation A tolerance relation ζ ⊆ X × X on a set X is, in general, a binary relation that is reflexive and symmetric but not necessarily transitive (Sossinsky, 1986):
1. ζ ⊆ X × X,
2. ∀x ∈ X, (x, x) ∈ ζ,
3. ∀x, y ∈ X, (x, y) ∈ ζ ⇒ (y, x) ∈ ζ.
Moreover, the notation x ζ y can be used as an abbreviation of (x, y) ∈ ζ. The set X supplied with the binary relation ζ is named a tolerance space and is denoted by Xζ. The term tolerance space was originally coined by E. C. Zeeman in (Zeeman, 1962). A perceptual tolerance relation is defined in the context of perceptual systems as follows, where ≅B,ε is used instead of ζ to denote the tolerance relation.
DEFINITION 8.5 Perceptual Tolerance Relation (Peters, 2009, 2010) Let ⟨O, F⟩ be a perceptual system and let ε ∈ ℝ (the set of all real numbers). For every B ⊆ F the perceptual tolerance relation ≅B,ε is defined as follows:
≅B,ε = {(x, y) ∈ O × O : ‖φB(x) − φB(y)‖₂ ≤ ε},
(8.8)
where φB(x) = [φ1(x) φ2(x) ... φl(x)]T is a feature-value vector representing an object description obtained using all of the probe functions in B and ‖·‖₂ is the L2 norm (an Lp norm in general (Jänich, 1984)). A tolerance relation ≅B,ε defines a covering on the set O of perceptual objects, resulting in a set of tolerance classes and tolerance blocks (Bartol, Mir, Piro, and Rossell, 2004; Schroeder and Wright, 1992). In this work, the definition of tolerance class in (Bartol et al., 2004) is used; it is similar to the equivalence class in (8.3).
DEFINITION 8.6 Tolerance Preclass A set A ⊆ O is a preclass in ≅B,ε if, and only if, ∀x, y ∈ A, x ≅B,ε y.
DEFINITION 8.7 Tolerance Class A set A ⊆ O is a tolerance class in ≅B,ε if, and only if, A is a maximal preclass,
where ≅B,ε is defined in (8.8) with respect to a given set of probe functions B. Let x/≅B,ε denote a maximal preclass containing x. Tolerance classes can overlap and hence the set of all tolerance classes is a covering of O, denoted by O/≅B,ε. Recall that a cover is a family of subsets of O whose union is O and whose pairwise intersections are not necessarily empty.
\[ O/_{\cong_{B,\varepsilon}} = \{\, x/_{\cong_{B,\varepsilon}} \mid x \in O \,\} \]
(8.9)
\[ \bigcup_{x \in O} x/_{\cong_{B,\varepsilon}} = O \]
(8.10)
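Definitions 8.6 and 8.7 can be illustrated with an exhaustive search over all subsets of a very small set of objects. This is only a sketch under stated assumptions: descriptions are one-dimensional, the data values are hypothetical, and the function names (tolerant, is_preclass, tolerance_classes) are not from the chapter. The search is exponential in the number of objects and is meant for illustration only.

```python
from itertools import combinations

def tolerant(a, b, eps):
    """Tolerance test for one-dimensional descriptions: |phi(x) - phi(y)| <= eps."""
    return abs(a - b) <= eps

def is_preclass(subset, values, eps):
    """Definition 8.6: every pair of objects in the subset is tolerant."""
    return all(tolerant(values[i], values[j], eps)
               for i, j in combinations(subset, 2))

def tolerance_classes(values, eps):
    """Definition 8.7: preclasses that are not contained in a larger preclass.

    Exhaustive search over all subsets, so only usable for a handful of objects."""
    n = len(values)
    preclasses = [set(s) for size in range(1, n + 1)
                  for s in combinations(range(n), size)
                  if is_preclass(s, values, eps)]
    return [sorted(a) for a in preclasses
            if not any(a < b for b in preclasses)]

values = [0.10, 0.15, 0.22, 0.60]   # hypothetical normalized gray levels
classes = tolerance_classes(values, eps=0.1)
print(classes)  # [[3], [0, 1], [1, 2]]
```

Object 1 belongs to two classes, which shows why the tolerance classes form a covering rather than a partition: objects 0 and 1 are tolerant, objects 1 and 2 are tolerant, but objects 0 and 2 are not.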
Example 8.3
Figure 8.3b shows the image in example 8.2 and its subimages. Let ⟨O, F⟩ be a perceptual system where O denotes the set of 25 × 25 subimages. The image is divided into 100 subimages of size 25 × 25 and can be represented as a set X = O of all 100 subimages. Let B = {φ1} ⊆ F where φ1(x) = gray(x) is the average gray scale value of the pixels in subimage x, normalized between 0 and 1. Let ε = 0.1. A subimage x has been selected, and the marked subimages in the figure belong to a tolerance class that is represented by the selected subimage because their gray level values are within the tolerance level ε of the gray value of subimage x.
Example 8.4
The simple image in example 8.1 (Figure 8.1) is considered here again. For each given subimage, the corresponding tolerance class has been obtained by finding all the subimages whose average gray scale values are within the tolerance range (ε = 0.1) of the average gray value of the given subimage. Figure 8.4 shows all the tolerance classes, calculated for every x ∈ O, after removing the redundant classes. Note that x1/≅B,ε = x7/≅B,ε = x8/≅B,ε, x2/≅B,ε = x3/≅B,ε = x5/≅B,ε and x6/≅B,ε = x9/≅B,ε, and hence there are 4 tolerance classes in total. The set of all tolerance classes is a covering of O, given by O/≅B,ε = {x/≅B,ε | x ∈ O} = {x4/≅B,ε, x5/≅B,ε, x8/≅B,ε, x9/≅B,ε}.
Tolerance Matrix
In order to demonstrate a tolerance space and all its tolerance classes, a tolerance matrix is defined here to show the tolerance relation between pairs of perceptual objects. Each row in a tolerance matrix represents one tolerance class and each column represents one perceptual object (subimage). Corresponding to each probe function φk ∈ B, a tolerance matrix TMk = [tij] is defined as in equation 8.11, where the element tij of the matrix is zero if subimage xj does not belong to the tolerance class defined by xi.
\[ t_{ij} = \begin{cases} \phi_k(x_j) & \text{if } x_j \in x_i/_{\cong_{B,\varepsilon}} \\ 0 & \text{otherwise} \end{cases} \]
(8.11)
FIGURE 8.4: (a) The image covering, (b) all classes O/≅B,ε = {x/≅B,ε | x ∈ O}. Panel (a): the original image with the average gray level of each subimage. Panel (b): each tolerance class displayed on a white background and on the image; 1st class: subimage 5, 2nd class: subimage 9, 3rd class: subimage 4, 4th class: subimage 8
Subsequently, an Ordered Tolerance Matrix (OTM) is defined by removing identical rows in the tolerance matrix (identical tolerance classes) and sorting the rows (classes) based on the average value of φk(x) among the perceptual objects of each tolerance class.
Example 8.5
In example 8.4, there are 9 perceptual objects (equation 8.1) and 4 tolerance classes. The tolerance matrix TM defined for the only probe function φ1 (average gray value between 0 and 255) is a 9 × 9 matrix with 4 non-redundant rows:
\[ TM_{\phi_1} = [t_{ij}] = \begin{bmatrix}
162 & 0 & 0 & 140 & 0 & 0 & 162 & 153 & 0 \\
0 & 79 & 67 & 0 & 81 & 0 & 0 & 0 & 0 \\
0 & 79 & 67 & 0 & 81 & 0 & 0 & 0 & 0 \\
162 & 0 & 0 & 140 & 0 & 117 & 162 & 153 & 115 \\
0 & 79 & 67 & 0 & 81 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 140 & 0 & 117 & 0 & 0 & 115 \\
162 & 0 & 0 & 140 & 0 & 0 & 162 & 153 & 0 \\
162 & 0 & 0 & 140 & 0 & 0 & 162 & 153 & 0 \\
0 & 0 & 0 & 140 & 0 & 117 & 0 & 0 & 115
\end{bmatrix} \]
(8.12)
The element tij of the tolerance matrix is set to the average gray value of the jth subimage if the ith and jth subimages belong to the same tolerance class that is defined by the ith subimage (tij = 0 otherwise). The TM matrix can be displayed as an image as shown in figure 8.5.
FIGURE 8.5: The tolerance matrix displayed as an image (both axes: subimage index 1 to 9)
The tolerance classes have average gray values of 65, 150, 159, 175 and 184 respectively.
\[ OTM = \begin{bmatrix}
0 & 79 & 67 & 0 & 81 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 140 & 0 & 117 & 0 & 0 & 115 \\
162 & 0 & 0 & 140 & 0 & 117 & 162 & 153 & 115 \\
162 & 0 & 0 & 140 & 0 & 0 & 162 & 153 & 0
\end{bmatrix} \]
(8.13)
The OT M matrix can be shown as an image that displays the gray level values and the spatial information of all the tolerance classes.
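The construction of the tolerance matrix and the ordered tolerance matrix can be sketched in a few lines. This is a minimal illustration, assuming a single probe function, the gray values as they appear in (8.12), a tolerance of 0.1 on the normalized scale (25.5 gray levels), and classes built around each subimage as in Example 8.4; the variable names are hypothetical.

```python
import numpy as np

# Average gray values of subimages x1..x9 as they appear in (8.12).
gray = np.array([162, 79, 67, 140, 81, 117, 162, 153, 115], dtype=float)
eps = 0.1 * 255          # tolerance 0.1 on the normalized scale

# related[i, j] is True when x_j lies within eps of x_i, i.e. when x_j
# belongs to the class defined by x_i in the sense of Example 8.4.
related = np.abs(gray[:, None] - gray[None, :]) <= eps

# Tolerance matrix (8.11): t_ij = phi(x_j) if x_j is in the class of x_i, else 0.
TM = np.where(related, gray[None, :], 0).astype(int)

# Ordered tolerance matrix: drop duplicate rows, then sort the remaining rows
# by the mean gray value of the objects in each class. Counting non-zero
# entries assumes that no subimage has gray value 0.
unique_rows = np.unique(TM, axis=0)
class_means = unique_rows.sum(axis=1) / (unique_rows > 0).sum(axis=1)
OTM = unique_rows[np.argsort(class_means)]
print(OTM)
```

Under these assumptions the sketch yields a 9 × 9 tolerance matrix with 4 distinct rows and a 4 × 9 ordered tolerance matrix whose rows appear in the same order as in (8.13).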
FIGURE 8.6: The ordered tolerance matrix displayed as an image (rows: Class 1 to Class 4; columns: subimage index 1 to 9; gray scale: average gray level, 67 to 162)
In Figures 8.8 to 8.10, examples of ordered tolerance matrices are shown for more images. The images are from the Berkeley segmentation dataset (Martin, Fowlkes, Tal, and Malik, 2001). Each image is 321 × 481 pixels in size and is divided into 160 subimages of size 30 × 30. The only probe function used is the average gray level, with values between 0 and 255. The value of ε in equation 8.8 is set to 20, since the gray values have not been normalized.
8.2.3
Nearness Relations and Near Sets
Nearness relations were introduced in the context of a perceptual system ⟨O, F⟩ by James Peters (Peters and Wasilewski, 2009) after the introduction of near set theory in 2007 (see (Peters, 2007c), (Peters, 2007b) and (Peters and Wasilewski, 2009)). These relations are defined between sets of perceptual objects. Therefore, a nearness relation R is a subset of P(O) × P(O).
DEFINITION 8.8 Weak Nearness Relation Let ⟨O, F⟩ be a perceptual system and let X, Y ⊆ O. A set X is weakly near to a set Y within a perceptual system ⟨O, F⟩, denoted by X ⋈̲F Y, if and only if the following condition is satisfied:
∃ x ∈ X, y ∈ Y, B ⊆ F such that x ∼B y
Consequently, the weak nearness relation is defined on P(O) as follows:
⋈̲F = {(X, Y) ∈ P(O) × P(O) | X ⋈̲F Y}
(8.14)
Example 8.6
Figure 8.7 shows two images and their corresponding subimages (25 × 25 pixels each). Let X and Y denote the sets of all subimages in image 1 and image 2, respectively. Let O = X ∪ Y be the set of all subimages in the two images. Let ⟨O, F⟩ be a perceptual system and let B ⊆ F with B = {φ1}, where φ1(x) = gray(x) is the average gray scale value of subimage x. Images X and Y are then weakly near to each other (X ⋈̲F Y) because they have elements x ∈ X and y ∈ Y with matching descriptions (x ∼B y).
DEFINITION 8.9 Nearness Relation Let ⟨O, F⟩ be a perceptual system and let X, Y ⊆ O. A set X is near to a set Y within a perceptual system ⟨O, F⟩, denoted by X ⋈F Y, if and only if the following condition is satisfied (Peters and Wasilewski, 2009):
∃ x ∈ X, y ∈ Y, A, B ⊆ F, f ∈ F and also ∃ A ∈ O/∼A, B ∈ O/∼B, C ∈ O/∼C such that A, B ⊆ C
Consequently, the nearness relation is defined on P(O) as follows:
⋈F = {(X, Y) ∈ P(O) × P(O) | X ⋈F Y}
(8.15)
FIGURE 8.7: Two images (Image 1 with subimage set X, Image 2 with subimage set Y) and sample subimages with matching descriptions
DEFINITION 8.10 Weak Tolerance Nearness Relation Let ⟨O, F⟩ be a perceptual system, let X, Y ⊆ O and let ε ∈ ℝ. A set X is perceptually almost near to a set Y in a weak sense within the perceptual system ⟨O, F⟩, denoted by X ⋈̲F Y, if and only if the following condition is satisfied:
∃ x ∈ X, ∃ y ∈ Y, ∃ B ⊆ F such that x ≅B,ε y
Consequently, the weak tolerance nearness relation is defined on P(O) as follows:
⋈̲F = {(X, Y) ∈ P(O) × P(O) | X ⋈̲F Y}
(8.16)
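The conditions in Definitions 8.8 and 8.10 reduce to a search for a single pair of objects, which can be sketched as follows. The sketch assumes that the set B of probe functions is fixed in advance (the definitions only require that some B ⊆ F exists) and that each object is already represented by its description φB(x); the function names and the data are hypothetical.

```python
import numpy as np

def weakly_near(X, Y):
    """Definition 8.8 for a fixed B: some x in X and y in Y have equal descriptions."""
    return any(np.array_equal(x, y) for x in X for y in Y)

def weakly_tolerance_near(X, Y, eps):
    """Definition 8.10 for a fixed B: some pair satisfies
    ||phi_B(x) - phi_B(y)||_2 <= eps."""
    return any(np.linalg.norm(np.subtract(x, y)) <= eps for x in X for y in Y)

# Hypothetical descriptions (average gray, entropy) of the subimages of two images.
X = [(0.20, 0.50), (0.80, 0.30)]
Y = [(0.25, 0.52), (0.10, 0.90)]
print(weakly_near(X, Y), weakly_tolerance_near(X, Y, eps=0.1))  # False True
```

No pair of descriptions matches exactly, so the two sets are not weakly near, but the first objects of X and Y differ by less than ε = 0.1 in the L2 norm, so the sets are near in the weak tolerance sense.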
DEFINITION 8.11 Near Sets Let ⟨O, F⟩ be a perceptual system and let X ⊆ O. A set X is a near set iff there is Y ⊆ O such that X ⋈F Y. The family of near sets of a perceptual system is denoted by NearF(O).
DEFINITION 8.12 Tolerance Near Sets Let ⟨O, F⟩ be a perceptual system and let X ⊆ O. A set X is a tolerance near set iff there is Y ⊆ O such that X ⋈̲F Y. The family of tolerance near sets of a perceptual system is denoted by Near F̃ (O).
8.3
Analysis and Comparison of Images Using Tolerance Classes
Tolerance classes, as described in section 8.2.2, can be viewed as structural elements in representing an image. The motivation for using tolerance classes in perceptual image analysis is the conjecture that human visual perception operates at the level of classes rather than at the level of pixels.
For a single image, tolerance classes can be calculated as shown in section 8.2.2 and displayed in a tolerance matrix or ordered tolerance matrix, as shown in the simple example of Figures 8.5 and 8.6. The main approach in this chapter for comparing pairs of images is to compare the corresponding tolerance classes. The sizes of the tolerance classes, the overlap between tolerance classes and their distributions, for example, can be used to compare the tolerance classes of two images quantitatively.
FIGURE 8.8: An image covering and the ordered tolerance matrix (30 × 30 subimages); panel title: Lattice Valued Tolerance Matrix
FIGURE 8.9: An image covering and the ordered tolerance matrix (30 × 30 subimages); panel title: Lattice Valued Tolerance Matrix
FIGURE 8.10: An image covering and the ordered tolerance matrix (30 × 30 subimages); panel title: Lattice Valued Tolerance Matrix
8.3.1
Tolerance Overlap Distribution nearness measure (TOD)
A similarity measure is proposed here based on a statistical comparison of the overlaps between tolerance classes at each subimage. The proposed method is as follows. Suppose X, Y ⊆ O are two images (sets of perceptual objects). The sets of all tolerance classes for images X and Y are given below; each forms a covering of the corresponding image.
X/≅B,ε = {x/≅B,ε | x ∈ X}
(8.17)
Y/≅B,ε = {y/≅B,ε | y ∈ Y}
(8.18)
Subsequently, the set of all overlapping tolerance classes corresponding to each object (subimage) x is named ΩX/≅B,ε(x) and is defined as follows:
\[ \Omega_{X/\cong_{B,\varepsilon}}(x) = \{\, z/_{\cong_{B,\varepsilon}} \in X/_{\cong_{B,\varepsilon}} \mid x \in z/_{\cong_{B,\varepsilon}} \,\} \]
(8.19)
Consequently, the normalized number of tolerance classes in X/≅B,ε that overlap at x is named ω and defined as follows:
\[ \omega_{X/\cong_{B,\varepsilon}}(x) = \frac{\left| \Omega_{X/\cong_{B,\varepsilon}}(x) \right|}{\left| X/_{\cong_{B,\varepsilon}} \right|} \]
(8.20)
Similarly, the normalized number of overlapping tolerance classes at every subimage y ∈ Y is denoted by ωY/≅B,ε(y). Assuming that the set of probe functions B and the value of ε are known, we use the simplified notation ΩX(x) and ωX(x) for the set X/≅B,ε and the notation ΩY(y) and ωY(y) for the set Y/≅B,ε. Let {b1, b2, ..., bNb} be the discrete bins used in calculating the histograms of ωX(x) and ωY(y), where x ∈ X and y ∈ Y. The empirical distribution function (histogram) of ωX(x) at bin value bj is denoted by HωX(bj) and defined as the number of subimages x with a value of ωX(x) that belongs to the jth bin. The cumulative distribution function is then defined as follows:
\[ CH_{\omega_X}(b_j) = \sum_{i=1}^{j} H_{\omega_X}(b_i) \]
(8.21)
CHωY(bj) is similarly defined for image Y. The Tolerance Overlap Distribution (TOD) nearness measure is defined by taking the sum of the absolute differences between the cumulative histograms, as in equation 8.22, where γ is a scaling factor and is set here to 0.6.
\[ TOD = 1 - \left( \sum_{j=1}^{N_b} \left| CH_{\omega_X}(b_j) - CH_{\omega_Y}(b_j) \right| \right)^{\gamma} \]
(8.22)
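Equations (8.19) to (8.22) can be sketched as follows. Several choices here are assumptions rather than statements of the chapter: one class is built around each object (all objects within ε of it, as in Example 8.4), duplicate classes are removed, 10 equal bins on [0, 1] are used, and each histogram is scaled to sum to 1 before accumulation. The function names are hypothetical.

```python
import numpy as np

def overlap_counts(descriptions, eps):
    """omega(x) of (8.20) for every object x of one image."""
    d = np.asarray(descriptions, dtype=float)
    dist = np.linalg.norm(d[:, None, :] - d[None, :, :], axis=2)
    # One candidate class per object (everything within eps of it);
    # np.unique keeps the distinct classes only.
    classes = np.unique((dist <= eps).astype(np.uint8), axis=0)
    # Number of classes containing x, divided by the number of classes.
    return classes.sum(axis=0) / classes.shape[0]

def cumulative_histogram(omega, bins):
    """CH of (8.21), with the histogram scaled to sum to 1 (assumption)."""
    hist, _ = np.histogram(omega, bins=bins, range=(0.0, 1.0))
    return np.cumsum(hist) / hist.sum()

def tod(desc_x, desc_y, eps, bins=10, gamma=0.6):
    """Tolerance Overlap Distribution nearness measure (8.22)."""
    ch_x = cumulative_histogram(overlap_counts(desc_x, eps), bins)
    ch_y = cumulative_histogram(overlap_counts(desc_y, eps), bins)
    return 1.0 - np.abs(ch_x - ch_y).sum() ** gamma

# Hypothetical one-feature descriptions of 50 subimages per image.
rng = np.random.default_rng(1)
desc_x = rng.random((50, 1))
desc_y = rng.random((50, 1))
print(tod(desc_x, desc_x, eps=0.1))  # 1.0 for identical images
print(tod(desc_x, desc_y, eps=0.1))
```

Because (8.22) contains no bin-width factor, the value obtained from this literal reading depends on the number of bins and is not guaranteed to stay above 0 when the two distributions differ strongly.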
The proposed method is compared with the tolerance nearness measure (tNM), a method previously developed by C. Henry and J. Peters (Hassanien, Abraham, Peters, Schaefer, and Henry, 2009; Peters and Wasilewski, 2009; Henry and Peters, 2009b), and also with a simple histogram-based method that compares the cumulative histograms of gray level values. The results of the comparison are presented in example 8.7.
Tolerance nearness measure: tNM
The tolerance nearness measure (tNM) is based on the idea that if one considers the union of two images as the set of perceptual objects, tolerance classes should contain an almost equal number of subimages from each image. The tNM between two images (Henry and Peters, 2009b) is defined as follows. Suppose X and Y are the sets of perceptual objects (subimages) in image 1 and image 2. Let z/≅B,ε denote a maximal preclass containing z. Z = X ∪ Y is the set of all perceptual objects in the union of the images, and for each z ∈ Z the tolerance class is given by:
z/≅B,ε = {s ∈ Z | ‖φB(z) − φB(s)‖ ≤ ε}
(8.23)
The part of the tolerance class that is a subset of X is named [z/≅B,ε]⊆X and, similarly, the part of the tolerance class that is a subset of Y is named [z/≅B,ε]⊆Y. Therefore:
[z/≅B,ε]⊆X = {x ∈ z/≅B,ε | x ∈ X} ⊆ z/≅B,ε
(8.24)
[z/≅B,ε]⊆Y = {y ∈ z/≅B,ε | y ∈ Y} ⊆ z/≅B,ε
(8.25)
z/≅B,ε = [z/≅B,ε]⊆X ∪ [z/≅B,ε]⊆Y
(8.26)
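Subsequently, a tolerance nearness measure is defined as the weighted average of the closeness between the cardinality (size) of the set [z/≅B,ε]⊆X and the cardinality of the set [z/≅B,ε]⊆Y, where the cardinality of z/≅B,ε is used as the weighting factor.
\[ tNM = \left( \sum_{z/_{\cong_{B,\varepsilon}}} \left| z/_{\cong_{B,\varepsilon}} \right| \right)^{-1} \times \sum_{z/_{\cong_{B,\varepsilon}}} \frac{\min\left( \left| [z/_{\cong_{B,\varepsilon}}]_{\subseteq X} \right|, \left| [z/_{\cong_{B,\varepsilon}}]_{\subseteq Y} \right| \right)}{\max\left( \left| [z/_{\cong_{B,\varepsilon}}]_{\subseteq X} \right|, \left| [z/_{\cong_{B,\varepsilon}}]_{\subseteq Y} \right| \right)} \times \left| z/_{\cong_{B,\varepsilon}} \right| \]
(8.27)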
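The weighted average in (8.27) can be sketched as follows. The sketch assumes that the class of each z ∈ Z is formed as in (8.23), that duplicate classes are counted once, and that descriptions are given as rows of feature values; the function name tnm and the small data sets are hypothetical.

```python
import numpy as np

def tnm(desc_x, desc_y, eps):
    """Tolerance nearness measure (8.27) with classes taken as in (8.23)."""
    dx = np.asarray(desc_x, dtype=float)
    dy = np.asarray(desc_y, dtype=float)
    z = np.vstack([dx, dy])                       # Z = X ∪ Y
    from_x = np.arange(len(z)) < len(dx)          # True for objects of X
    dist = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    classes = np.unique((dist <= eps).astype(np.uint8), axis=0).astype(bool)
    weighted, total = 0.0, 0
    for members in classes:
        n_x = int((members & from_x).sum())       # size of the part in X
        n_y = int((members & ~from_x).sum())      # size of the part in Y
        size = n_x + n_y                          # size of the whole class
        weighted += min(n_x, n_y) / max(n_x, n_y) * size
        total += size
    return weighted / total

# Identical descriptions: every class is split evenly between X and Y.
print(tnm([[0.1], [0.5]], [[0.1], [0.5]], eps=0.05))   # 1.0
# No class contains objects from both images.
print(tnm([[0.1], [0.5]], [[0.9], [0.95]], eps=0.05))  # 0.0
```

The two calls show the extremes of the measure: classes shared equally by both images give 1, and classes drawn from one image only give 0.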
Histogram similarity measure: HSM
For the sake of comparison, a third similarity measure is also defined here to compare the distributions (histograms) of gray scale values in images. A histogram similarity measure is defined from the absolute differences between the cumulative distribution functions of the gray scale values in the two images, similar to equation 8.22:
\[ HSM = 1 - \left( \sum_{j=1}^{N_b} \left| CH_{g_X}(b_j) - CH_{g_Y}(b_j) \right| \right)^{\gamma} \]
(8.28)
where gX(x) and gY(y) are the gray level values of pixels (or the average gray level values of subimages) x ∈ X and y ∈ Y, {b1, b2, ..., bNb} are the bins used in calculating the histograms of gX(x) and gY(y), and CHgX(bj) and CHgY(bj) are the cumulative distribution functions (cumulative empirical histograms).
Example 8.7
Sample images and their tolerance classes are shown in Figure 8.12. The number of overlapping tolerance classes at each subimage, ωX(x) (or ωY(y)), is plotted versus the index of the subimage, x (or y). The empirical CDFs for ωX(x) and ωY(y) are shown in Figure 8.11. The TOD and tNM nearness measures are shown in table 8.2 for different values of the p × p subimage size (p = 10, 20) and of epsilon (ε = 0.01, 0.05, 0.1, 0.2). Two different sets of probe functions have been used, B1 = {φ1} and B2 = {φ1, φ2}, where φ1 represents the average gray value of a subimage and φ2 represents the entropy of a subimage. The HSM measure is also calculated, using the gray level values of all the pixels in each image and equation 8.28. #Tol-X and #Tol-Y represent the number of tolerance classes for images X and Y, respectively. The results are shown in table 8.2 and plotted in figure 8.13.
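The histogram similarity measure of (8.28) can be sketched in the same way as the TOD measure. How the histograms are scaled is not stated in the text; this sketch scales each histogram to sum to 1 and applies no bin-width factor, so the result depends on the number of bins. The function name hsm, the default of 16 bins and the test data are assumptions.

```python
import numpy as np

def hsm(gray_x, gray_y, bins=16, gamma=0.6):
    """Histogram similarity measure (8.28) for gray levels in 0..255."""
    def cdf(gray):
        # Histogram over the full 8-bit range, scaled to sum to 1 (assumption).
        hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
        return np.cumsum(hist) / hist.sum()
    return 1.0 - np.abs(cdf(gray_x) - cdf(gray_y)).sum() ** gamma

# Hypothetical gray-level samples from two images.
rng = np.random.default_rng(2)
dark = rng.integers(0, 128, size=1000)
bright = rng.integers(128, 256, size=1000)
print(hsm(dark, dark))    # 1.0 for identical distributions
print(hsm(dark, bright))
```

Identical gray-level distributions give a value of 1, and the value decreases as the cumulative histograms move apart.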
FIGURE 8.11: CDF of the number of overlaps between tolerance classes for ωX(x) and ωY(y). Left panel: empirical histograms (1st histogram HX(bi), 2nd histogram HY(bi); histogram values versus histogram bins bi). Right panel: cumulative histograms (1st CDF CHX(bi), 2nd CDF CHY(bi) and |CHX − CHY|; cumulative histogram values versus histogram bins bi)
Similarity between groups of conceptually different images
FIGURE 8.12: Sample images (Image X and Image Y), their ordered tolerance matrices (ordered tolerance classes in X and in Y) and plots of the number of overlaps ωX(x) and ωY(y) versus the subimage index
TABLE 8.2 Similarity measures calculated for different values of tolerance level (ε) and subimage size (p)
p    ε      #Tol-X       #Tol-Y       TOD           HSM           tNM
            B1    B2     B1    B2     B1     B2     B1     B2     B1     B2
10   0.01   425   526    423   534    0.96   0.98   0.86   0.86   0.74   0.45
10   0.05   401   592    411   606    0.91   0.90   0.86   0.86   0.86   0.68
10   0.10   399   603    425   620    0.88   0.88   0.86   0.86   0.87   0.73
10   0.20   384   614    371   602    0.83   0.85   0.86   0.86   0.91   0.88
20   0.01   86    124    78    126    0.94   0.97   0.86   0.86   0.58   0.34
20   0.05   96    133    102   134    0.91   0.92   0.86   0.86   0.74   0.59
20   0.10   92    132    100   136    0.87   0.89   0.86   0.86   0.78   0.72
20   0.20   90    124    86    120    0.83   0.84   0.86   0.86   0.88   0.89
FIGURE 8.13: (Please see color insert) tNM, HSM and TOD for different values of tolerance level, subimage size p = 10, 20 (panels 8.13a and 8.13b)
The similarity measures discussed in section 8.3.1 can be calculated for each pair of images in an image database to reveal structures in the set of images, and can possibly be used for clustering or classification of images or for image retrieval. In order to visually demonstrate the similarity between pairs of images, a similarity matrix demonstration scheme is used here in which the similarity between pairs of images in a set of N images is shown as a symmetric N × N matrix. Let I = {I1, I2, ..., IN} be a set of images. Let sij be a similarity (nearness) measure between image Ii and image Ij. The similarity matrix is SM = [sij] and is shown graphically as a square of N × N picture elements (cells), where each picture element is shown with a gray scale brightness value of gr = sij. Full similarity (identical images) is shown with 1 (white) and complete dissimilarity is shown with 0 (black). If the image database can be classified into subsets of conceptually similar images, it is expected that the similarity is high (bright) between images within a subset and low (dark) between subsets.
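The similarity matrix described above can be filled in by evaluating a nearness measure once for each unordered pair of images. In the following sketch the function toy_measure is a hypothetical stand-in for TOD or tNM, used only to keep the example self-contained; similarity_matrix accepts any symmetric measure with values in [0, 1].

```python
import numpy as np

def similarity_matrix(items, measure):
    """Symmetric N x N matrix SM = [s_ij] with s_ij = measure(I_i, I_j)."""
    n = len(items)
    sm = np.ones((n, n))
    for i in range(n):
        for j in range(i, n):
            # The measure is assumed symmetric, so each pair is computed once.
            sm[i, j] = sm[j, i] = measure(items[i], items[j])
    return sm

def toy_measure(a, b):
    """Hypothetical stand-in for TOD or tNM: 1 minus the scaled mean gray difference."""
    return 1.0 - abs(float(np.mean(a)) - float(np.mean(b))) / 255.0

# Three constant test images: two dark ones and one bright one.
images = [np.full((4, 4), g, dtype=np.uint8) for g in (10, 20, 200)]
sm = similarity_matrix(images, toy_measure)
print(np.round(sm, 2))
```

Displaying sm with a gray scale in which 1 is white and 0 is black reproduces the scheme described above: the two dark images form a bright block, and their entries against the bright image are darker.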
FIGURE 8.14: Selected images from Caltech image database
Example 8.8
Figure 8.14 shows 9 sample images randomly selected from the Caltech vision group database (Computational Vision Group at Caltech, 2009). The set consists of 3 subsets of airplane, face and leaf images. The similarity matrices for TOD and tNM are shown in figure 8.15, where the set of probe functions B = {φ1, φ2} consists of φ1 (the average gray level) and φ2 (the entropy) of subimages. Feature values have been normalized between 0 and 1 and ε = 0.1.
FIGURE 8.15: TOD and tNM similarity matrices for the images in Figure 8.14. Panels: Similarity Matrix - Sim1 - TOD and Similarity Matrix - Sim2 - tNM; both axes list the images airplane1.jpg [1], airplane2.jpg [2], airplane3.jpg [3], face_0009.jpg [4], face_0073.jpg [5], face_0095.jpg [6], image_0008.jpg [7], image_0011.jpg [8], image_0021.jpg [9]; gray scale from 0 to 1
FIGURE 8.16: Selected images from USC-SIPI dataset
Example 8.9
As an example, 50 images in 5 different groups are randomly selected from the USC-SIPI image database (USC Signal and image processing institute, 2009), shown in figure 8.16, where each group of images is shown in one row. TOD and tNM are calculated for each possible pair of images, where B = {φ1, φ2} consists of the normalized gray level and entropy probe functions and ε = 0.1. Images are 256 by 256 pixels in size and the subimage size is 10 by 10 pixels. Figure 8.17 displays the TOD and tNM similarity matrices, where the similarity between images of the same group is generally higher.
FIGURE 8.17: Matrix of TOD and tNM similarity measures between pairs of images, ε = 0.1, p = 10. Panels: Similarity Matrix - Sim1 - TOD and Similarity Matrix - Sim2 - tNM; both axes list the 50 images by index: Aerial [1–10], Man [11–20], Motion [21–30], Satellite [31–40], brain [41–50]; gray scale from 0 to 1
8.4
Summary and Conclusions
In a perceptual system approach to discovering image similarities, images are considered as sets of points (pixels) with measurable features such as colour, i.e., points with feature values that we can perceive. Inspired by rough set theory, near set theory provides a framework for describing affinities between sets of perceptual objects and thus can be used to define description-based similarity measures between images. In this framework, tolerance spaces can be used to establish a realistic concept of nearness between sets of objects, where equalities hold only approximately within a tolerance level of permissible deviation. This tolerance space form of near sets provides a foundation for modeling human perception in a physical continuum. As a result, a perceptual image processing approach has been growing rapidly out of near set theory in recent years. See (Peters, 2009, 2010; Henry and Peters, 2009b; Peters and Wasilewski, 2009) and (Henry and Peters, 2009a), for example. In this chapter, two novel measures of image similarity based on tolerance spaces have been introduced and studied. The first nearness measure (tNM) was previously published in (Henry and Peters, 2008), and the second measure (TOD) was first proposed by the authors in (Meghdadi et al., 2009) and is further elaborated and tested here. Both measures were tested on sample pairs of images, and their dependence on the method parameters, such as the tolerance level (ε) and the perceptual subimage size, was studied. In order to evaluate the measures, nearness within and between groups of perceptually similar images was calculated. Preliminary results show that very simple probe functions such as average gray level value and image entropy can be successful in classifying one category of images out of the rest (for example, images of airplanes in the case of TOD and images of leaves in the case of tNM in example 8.8). In future work, many more probe functions, refinements of the existing measures and the introduction of new nearness measures will be considered as a means of strengthening the image retrieval and image classification methods described in this chapter.
Perceptual Systems Approach to Measuring Image Resemblance
8–21
Bibliography Bartol, W., J. Mir, K. Piro, and F. Rossell. 2004. On the coverings by tolerance classes. Information Sciences 166(1-4):193–211. Doi: DOI: 10.1016/j.ins.2003.12.002. Computational Vision Group at Caltech. 2009. ages archives of computational vision group at http://www.vision.caltech.edu/archive.html.
Imcaltech.
Hassanien, Aboul Ella, Ajith Abraham, James F. Peters, Gerald Schaefer, and Christopher Henry. 2009. Rough sets and near sets in medical imaging: A review. IEEE Transactions on Information Technology in Biomedicine 13(6): 955–968. Digital object identifier: 10.1109/TITB.2009.2017017. Henry, C., and J. F. Peters. 2008. Near set index in an objective image segmentation evaluation framework. In Geographic object based image analysis: Pixels, objects, intelligence, 1–6. University of Calgary, Alberta. ———. 2009a. Perception based image classification. Tech. Rep., Computational Intelligence Laboratory, University of Manitoba. UM CI Laboratory Technical Report No. TR-2009-016. Henry, Christopher, and James F. Peters. 2009b. Perceptual image analysis. International Journal of Bio-Inspired Computation 2(2):to appear. J¨ anich, K. 1984. Topology. Berlin: Springer-Verlag. Martin, D., C. Fowlkes, D. Tal, and J. Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In 8th int’l conf. computer vision, vol. 2, 416–423. Meghdadi, Amir H., James F. Peters, and Sheela Ramanna. 2009. Tolerance classes in measuring image resemblance. 127–134. Santiago, Chile. ISBN 978-3-64-04591-2, doi 10.1007/978-3-642-04592-9 16. Orlowska, Ewa, and Zdzislaw Pawlak. 1984. Representation of nondeterministic information. Theoretical computer science 29(1-2):27–39. Pawlak, Zdzislaw. 1981a. Classification of objects by means of attributes 429. ———. 1981b. Rough sets. International J. Comp. Inform. Science 11:341–356. Pawlak, Zdzislaw, and James Peters. 2002,2007. Jak blisko (how near). Systemy Wspomagania Decyzji I:57, 109. ISBN 83-920730-4-5.
8–22
Rough Fuzzy Image Analysis
Peters, James, and Sheela Ramanna. 2009. Affinities between perceptual granules: Foundations and perspectives. In Human-centric information processing through granular modelling, 49–66. Berlin: Springer. 10.1007, ISBN 978-3540-92916-1 3.
Peters, James F. 2007a. Classification of objects by means of features. In Proc. IEEE symposium series on foundations of computational intelligence (IEEE SCCI 2007), 1–8. Honolulu, Hawaii.
———. 2007b. Near sets. General theory about nearness of objects. Applied Mathematical Sciences 1(53):2609–2629.
———. 2007c. Near sets. Special theory about nearness of objects. Fundamenta Informaticae 75:407–433.
———. 2008a. Classification of perceptual objects by means of features. Int. J. of Info. Technology & Intell. Computing 3(2):1–35.
———. 2009. Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 1(4):239–245.
———. 2010. Corrigenda and addenda: Tolerance near sets and image correspondence. International Journal of Bio-Inspired Computation 2(5). In press.
Peters, James F., and Piotr Wasilewski. 2009. Foundations of near sets. Information Sciences 179:3091–3109. Digital object identifier: doi:10.1016/j.ins.2009.04.018.
Peters, J.F. 2008b. Notes on perception. Computational Intelligence Laboratory Seminar.
———. 2008c. Notes on tolerance relations. Computational Intelligence Laboratory Seminar. See J.F. Peters, “Tolerance near sets and image correspondence”, Int. J. of Bio-Inspired Computing 4 (1), 2009, 239-245.
Peters, J.F., and L. Puzio. 2009. Image analysis with anisotropic wavelet-based nearness measures. International Journal of Computational Intelligence Systems 3(2):1–17.
Schroeder, M., and M. Wright. 1992. Tolerance and weak tolerance relations. Journal of Combinatorial Mathematics and Combinatorial Computing 11:123–160.
Sossinsky, A. B. 1986. Tolerance space theory and some applications. Acta Applicandae Mathematicae: An International Survey Journal on Applying Mathematics and Mathematical Applications 5(2):137–167.
USC Signal and image processing institute. 2009. USC signal and image processing institute image database. http://sipi.usc.edu/database.
Zeeman, E. C. 1962. The topology of the brain and visual perception. Topology of 3-manifolds 3:240–248.
9 From Tolerance Near Sets to Perceptual Image Analysis
Shabnam Shahfar University of Manitoba
Amir H. Meghdadi University of Manitoba
James F. Peters University of Manitoba
9.1 Introduction
9.2 Perceptual systems
9.3 Perceptual Indiscernibility and Tolerance Relations
9.4 Near Sets
9.5 Three Tolerance Near Set-based Nearness Measures for Image Analysis and Comparison
Tolerance Cardinality Distribution Nearness Measure (TCD) • Tolerance Overlap Distribution Nearness Measure (TOD) • Tolerance Nearness Measure (tNM)
9.6 Perceptual Image Analysis System
9.7 Conclusion
Bibliography

9.1
Introduction
The problem considered in this chapter is how to find and measure the similarity between two images. The image correspondence problem is a central area of research in computer vision. To solve it, this chapter proposes a biologically inspired approach using near sets and tolerance classes. The proposed method is developed in the context of perceptual systems (Peters and Ramanna, 2008), where each image or part of an image is considered a perceptual object (Peters and Wasilewski, 2009). "A perceptual object is something presented to the senses or knowable by human mind" (Peters and Wasilewski, 2009; Murray, Bradley, Craigie, and Onions, 1933). The perceptual system approach presented here is inspired by the early 1980s work of Z. Pawlak (Pawlak, 1981) on the classification of objects by means of attributes and of E. Orlowska (Orlowska, 1982) on approximate spaces as formal counterparts of perception and observation.

It has been shown in (Peters, 2007b) that near sets are a generalization of rough sets (Pawlak and Skowron, 2007; Polkowski, 2002). Near sets provide a good basis for the classification of perceptual objects. Near sets are disjoint sets that have matching descriptions to some degree (Henry and Peters, 2009b). One set X is considered near another set Y in the case where there is at least one x ∈ X with a description that matches the description of some y ∈ Y (Peters, 2007c,b).

The approach proposed in this chapter also benefits from the idea of tolerance classes introduced by Zeeman (Zeeman, 1961). Tolerance relations are viewed as good models of how one perceives, how one sees. Tolerance relations are also considered as a basis for studying similarities between visual perceptions (Zeeman, 1961; Peters, 2008c).

In this chapter, the formal definitions of perceptual systems, equivalence and tolerance relations, and near set theory are first reviewed in sections 9.2, 9.3, and 9.4, respectively. Then, a new nearness measure called the tolerance cardinality distribution measure (TCD) is introduced in section 9.5. The implementation of a tolerance-based perceptual image analysis system is described in section 9.6. Finally, section 9.7 concludes the chapter.
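The informal nearness condition stated in this introduction (a set X is near a set Y when at least one x ∈ X has a description matching that of some y ∈ Y) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the chapter: objects are hypothetical (row, column, intensity) pixel tuples, a description is the tuple of values returned by feature-measuring functions (the probe functions of section 9.2), and two descriptions "match" when their Euclidean distance is at most a threshold `eps`, with `eps = 0` giving exact matching.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def describe(x, probes):
    """Description of object x: the tuple of its probe-function values."""
    return tuple(phi(x) for phi in probes)


def is_near(X, Y, probes, eps=0.0):
    """Return True if some x in X and some y in Y have descriptions
    whose Euclidean distance is at most eps (illustrative matching rule)."""
    return any(
        dist(describe(x, probes), describe(y, probes)) <= eps
        for x in X
        for y in Y
    )


# Hypothetical perceptual objects: pixels given as (row, col, intensity).
probes = [lambda p: p[2]]            # a single assumed probe: grey-level intensity
X = [(0, 0, 10), (0, 1, 200)]
Y = [(5, 5, 12), (5, 6, 90)]

exact = is_near(X, Y, probes, eps=0.0)     # no two intensities are equal
tolerant = is_near(X, Y, probes, eps=5.0)  # intensities 10 and 12 differ by 2
print(exact, tolerant)
```

With exact matching the two sets are not near, while allowing a small tolerance makes them near. This is the intuition behind using tolerance relations rather than strict equality of descriptions.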
9.2
Perceptual systems
A perceptual system is a real-valued, total, deterministic information system. Such a system consists of a set of perceptual objects and a set of probe functions representing object features (Peters and Wasilewski, 2009; Peters, 2007b,a; Peters and Ramanna, 2008). A perceptual object (x ∈ O) is something presented to the senses or knowable by the human mind (Murray et al., 1933). For example, a pixel or a group of pixels in an image can be perceived as a perceptual object. Features of an object such as color, entropy, texture, contour, and spatial orientation can be represented by probe functions. A probe function can be thought of as a model for a sensor. A probe function φ(x) is a real-valued function representing features of the physical object x. A set of probe functions F = {φ1, φ2, ..., φl} can be defined to generate all the features for each object x, where φi : O →