Patricia Melin Modular Neural Networks and Type-2 Fuzzy Systems for Pattern Recognition
Studies in Computational Intelligence, Volume 389
Editor-in-Chief: Prof. Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
Further volumes of this series can be found on our homepage: springer.com
Vol. 366. Mario Köppen, Gerald Schaefer, and Ajith Abraham (Eds.) Intelligent Computational Optimization in Engineering, 2011 ISBN 978-3-642-21704-3
Vol. 367. Gabriel Luque and Enrique Alba Parallel Genetic Algorithms, 2011 ISBN 978-3-642-22083-8
Vol. 368. Roger Lee (Ed.) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2011, 2011 ISBN 978-3-642-22287-0
Vol. 369. Dominik Ryżko, Piotr Gawrysiak, Henryk Rybinski, and Marzena Kryszkiewicz (Eds.) Emerging Intelligent Technologies in Industry, 2011 ISBN 978-3-642-22731-8
Vol. 370. Alexander Mehler, Kai-Uwe Kühnberger, Henning Lobin, Harald Lüngen, Angelika Storrer, and Andreas Witt (Eds.) Modeling, Learning, and Processing of Text Technological Data Structures, 2011 ISBN 978-3-642-22612-0
Vol. 371. Leonid Perlovsky, Ross Deming, and Roman Ilin (Eds.) Emotional Cognitive Neural Algorithms with Engineering Applications, 2011 ISBN 978-3-642-22829-2
Vol. 372. António E. Ruano and Annamária R. Várkonyi-Kóczy (Eds.) New Advances in Intelligent Signal Processing, 2011 ISBN 978-3-642-11738-1
Vol. 373. Oleg Okun, Giorgio Valentini, and Matteo Re (Eds.) Ensembles in Machine Learning Applications, 2011 ISBN 978-3-642-22909-1
Vol. 374. Dimitri Plemenos and Georgios Miaoulis (Eds.) Intelligent Computer Graphics 2011, 2011 ISBN 978-3-642-22906-0
Vol. 375. Marenglen Biba and Fatos Xhafa (Eds.) Learning Structure and Schemas from Documents, 2011 ISBN 978-3-642-22912-1
Vol. 376. Toyohide Watanabe and Lakhmi C. Jain (Eds.) Innovations in Intelligent Machines – 2, 2011 ISBN 978-3-642-23189-6
Vol. 377. Roger Lee (Ed.) Software Engineering Research, Management and Applications 2011, 2011 ISBN 978-3-642-23201-5
Vol. 378. János Fodor, Ryszard Klempous, and Carmen Paz Suárez Araujo (Eds.) Recent Advances in Intelligent Engineering Systems, 2011 ISBN 978-3-642-23228-2
Vol. 379. Ferrante Neri, Carlos Cotta, and Pablo Moscato (Eds.) Handbook of Memetic Algorithms, 2011 ISBN 978-3-642-23246-6
Vol. 380. Anthony Brabazon, Michael O’Neill, and Dietmar Maringer (Eds.) Natural Computing in Computational Finance, 2011 ISBN 978-3-642-23335-7
Vol. 381. Radoslaw Katarzyniak, Tzu-Fu Chiu, Chao-Fu Hong, and Ngoc Thanh Nguyen (Eds.) Semantic Methods for Knowledge Management and Communication, 2011 ISBN 978-3-642-23417-0
Vol. 382. F.M.T. Brazier, Kees Nieuwenhuis, Gregor Pavlin, Martijn Warnier, and Costin Badica (Eds.) Intelligent Distributed Computing V, 2011 ISBN 978-3-642-24012-6
Vol. 383. Takayuki Ito, Minjie Zhang, Valentin Robu, Shaheen Fatima, and Tokuro Matsuo (Eds.) New Trends in Agent-Based Complex Automated Negotiations, 2012 ISBN 978-3-642-24695-1
Vol. 384. Daphna Weinshall, Jörn Anemüller, and Luc van Gool (Eds.) Detection and Identification of Rare Audiovisual Cues, 2012 ISBN 978-3-642-24033-1
Vol. 385. Alex Graves Supervised Sequence Labelling with Recurrent Neural Networks, 2012 ISBN 978-3-642-24796-5
Vol. 386. Marek R. Ogiela and Lakhmi C. Jain (Eds.) Computational Intelligence Paradigms in Advanced Pattern Classification, 2012 ISBN 978-3-642-24048-5
Vol. 387. David Alejandro Pelta, Natalio Krasnogor, Dan Dumitrescu, Camelia Chira, and Rodica Lung (Eds.) Nature Inspired Cooperative Strategies for Optimization (NICSO 2011), 2011 ISBN 978-3-642-24093-5
Vol. 388. Tiansi Dong Recognizing Variable Environments, 2012 ISBN 978-3-642-24057-7
Vol. 389. Patricia Melin Modular Neural Networks and Type-2 Fuzzy Systems for Pattern Recognition, 2012 ISBN 978-3-642-24138-3
Author
Prof. Patricia Melin
Tijuana Institute of Technology, Division of Graduate Studies, Tijuana, Mexico
Mailing Address: P.O. Box 4207, Chula Vista CA 91909, USA
ISBN 978-3-642-24138-3
e-ISBN 978-3-642-24139-0
DOI 10.1007/978-3-642-24139-0
Studies in Computational Intelligence ISSN 1860-949X
Library of Congress Control Number: 2011939329
© 2012 Springer-Verlag Berlin Heidelberg
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
Typeset & Cover Design: Scientific Publishing Services Pvt. Ltd., Chennai, India.
Printed on acid-free paper
springer.com
Preface
In this book, we describe hybrid intelligent systems that use type-2 fuzzy logic and modular neural networks for pattern recognition applications. Hybrid intelligent systems combine several intelligent computing paradigms, including fuzzy logic, neural networks, and bio-inspired optimization algorithms, which can be used to produce powerful pattern recognition systems. The book is organized in three main parts, each containing a group of chapters around a similar subject. The first part consists of chapters on theory and design algorithms; these chapters propose new models and concepts that can be the basis for achieving intelligent pattern recognition. The second part contains chapters on using type-2 fuzzy models and modular neural networks to design intelligent systems for complex pattern recognition problems. The third part contains chapters on the evolutionary optimization of type-2 fuzzy systems and modular neural networks in intelligent pattern recognition, which includes the application of genetic algorithms for obtaining optimal type-2 fuzzy integration systems and ideal neural network architectures.
The part on theory and algorithms has four chapters that propose new models and concepts, which can be considered the basis for achieving intelligent pattern recognition. The first chapter offers an introduction to the areas of type-2 fuzzy logic and modular neural networks for pattern recognition applications. The second chapter describes the basic concepts of type-2 fuzzy logic applied to the problem of edge detection in digital images. The third chapter describes a general methodology for applying type-2 fuzzy logic to improve the recognition ability of modular neural networks. The fourth chapter describes the use of type-2 fuzzy systems for improving the performance of response integration in modular neural networks.
The part on type-2 fuzzy systems and modular neural networks for pattern recognition applications has four chapters that describe different contributions to achieving human recognition using hybrid systems based on type-2 fuzzy logic. The first chapter describes the use of a modular neural network with fuzzy response integration for human recognition based on the iris biometric measure. The second chapter deals with the design of a modular neural network architecture with fuzzy integration using the ear biometric measure as information for recognition. The third chapter describes the design of a type-2 fuzzy modular neural system for signature recognition. The fourth chapter describes the application of type-2 fuzzy logic and modular neural networks for achieving efficient face recognition.
The part on evolutionary optimization of type-2 fuzzy systems and modular neural networks has five chapters that describe new optimization algorithms and their application to designing optimal type-2 fuzzy logic response integrators and ideal modular neural network architectures. The first chapter describes the optimization of type-2 fuzzy response integrators and modular neural networks using genetic algorithms for achieving human recognition based on the face, fingerprint and voice biometric measures. The second chapter deals with an approach for optimizing the number of rules and membership functions of type-2 fuzzy response integrators in modular networks with hierarchical genetic algorithms for human recognition based on face, fingerprint and voice. The third chapter describes the application of a general method for designing type-2 fuzzy systems based on genetic algorithms that can be used as response integrators in modular neural networks. The fourth chapter describes the optimal design of the modular architecture and fuzzy response integrator based on genetic algorithms for achieving human recognition based on the iris biometric measure. The fifth chapter describes the application of a hierarchical genetic algorithm for designing optimal fuzzy response integrators and the modular neural network architecture for human recognition based on the ear biometric information.
In conclusion, the book comprises chapters on diverse aspects of type-2 fuzzy logic and modular neural networks with evolutionary models for achieving intelligent pattern recognition for different applications, including human recognition.
The combination of evolutionary optimization methods with type-2 fuzzy logic and modular neural networks can be considered as a hybrid approach for obtaining efficient and accurate solutions to complex pattern recognition problems.
July 21, 2011
Patricia Melin Tijuana Institute of Technology Mexico
Contents
Part I: Basic Concepts and Theory
1 Introduction to Type-2 Fuzzy Logic in Neural Pattern Recognition Systems
2 Type-1 and Type-2 Fuzzy Inference Systems for Images Edge Detection
    2.1 Introduction
    2.2 Sobel Operators
    2.3 Edge Detection by Gradient Magnitude
    2.4 Edge Detection Using Type-1 Fuzzy Logic
    2.5 Edge Detection Using Type-2 Fuzzy Logic
    2.6 Comparison of Results
    2.7 Summary
3 Type-2 Fuzzy Logic for Improving Training Data and Response Integration in Modular Neural Networks
    3.1 Method for Image Recognition
    3.2 Type-2 Fuzzy Inference System as Edge Detector
    3.3 The Modular Structure
    3.4 Simulation Results
    3.5 Summary
4 Method for Response Integration in Modular Neural Networks Using Type-2 Fuzzy Logic
    4.1 Introduction
    4.2 Proposed Approach for Recognition
    4.3 Modular Neural Networks
    4.4 Integration of Results for Person Recognition Using Fuzzy Logic
    4.5 Modular Neural Networks with Type-2 Fuzzy Logic as a Method for Response Integration
    4.6 Simulation Results
    4.7 Summary
Part II: Modular Neural Networks in Pattern Recognition
5 Modular Neural Networks for Person Recognition Using the Contour Segmentation of the Human Iris
    5.1 Introduction
    5.2 Background and Basic Concepts
    5.3 Proposed Method and Problem Description
    5.4 Modular Neural Network Architecture
    5.5 Simulation Results
    5.6 Summary
6 Modular Neural Networks for Human Recognition from Ear Images Compressed Using Wavelets
    6.1 Introduction
    6.2 Background
    6.3 Ear Recognition Process
    6.4 Summary
7 Signature Recognition with a Hybrid Approach Combining Modular Neural Networks and Fuzzy Logic for Response Integration
    7.1 Introduction
    7.2 Problem Statement and Outline of Our Proposal
    7.3 Background Theory
    7.4 Experiments
    7.5 Summary
8 Interval Type-2 Fuzzy Logic for Module Relevance Estimation in Sugeno Response Integration of Modular Neural Networks
    8.1 Introduction
    8.2 Modular Neural Networks
    8.3 Sugeno Integral for Modules Fusion
    8.4 The Sugeno Integral
    8.5 Fuzzy Logic for Density Estimation
    8.6 FIS-1 to Estimate Fuzzy Densities
    8.7 FIS-2 for Estimate Fuzzy Densities
    8.8 Sugeno Integral for Information Fusion
    8.9 Simulation Results
    8.10 Summary
Part III: Optimization of Modular Neural Networks for Pattern Recognition
9 Optimization of Fuzzy Response Integrators in Modular Neural Networks with Hierarchical Genetic Algorithms
    9.1 Introduction
    9.2 Neural Networks
    9.3 Fuzzy Logic
    9.4 Genetic Algorithms
    9.5 Modular Neural Network with Fuzzy Integration for Face, Fingerprint and Voice Recognition
    9.6 Modular Neural Network Results
    9.7 Fuzzy Integration Results
    9.8 Hierarchical Genetic Algorithm
    9.9 Comparison with Other Works
    9.10 Summary
10 Modular Neural Network with Fuzzy Response Integration and Its Optimization Using Genetic Algorithms for Human Recognition Based on Iris, Ear and Voice Biometrics
    10.1 Introduction
    10.2 Background
    10.3 Basic Concepts
    10.4 Proposed Method and Results
    10.5 Summary
11 A Comparative Study of Type-2 Fuzzy System Optimization Based on Parameter Uncertainty of Membership Functions
    11.1 Introduction
    11.2 Preliminaries
    11.3 Optimization Method Description
    11.4 Fuzzy Systems Optimization Based on the Level of Uncertainty
    11.5 Simulation Results
    11.6 Summary
12 Neural Network Optimization for the Recognition of Persons Using the Iris Biometric Measure
    12.1 Introduction
    12.2 Methods of Integration
    12.3 Iris Image Pre-processing
    12.4 Statement of the Problem and Proposed Method
    12.5 Summary
13 Optimization of Neural Networks for the Accurate Identification of Persons by Images of the Human Ear as Biometric Measure
    13.1 Introduction
    13.2 Modular Artificial Neural Networks
    13.3 Integration Methods
    13.4 Pre-processing the Biometric Image of the Ear
    13.5 Problem Statement and Proposed Method
    13.6 Summary
References
Index
Part I
Chapter 1
Introduction to Type-2 Fuzzy Logic in Neural Pattern Recognition Systems
In this book, we describe new methods for building intelligent systems for pattern recognition using type-2 fuzzy logic and soft computing techniques. Soft Computing (SC) consists of several computing paradigms, including type-1 fuzzy logic, neural networks, and genetic algorithms, which can be used to create powerful hybrid intelligent systems [37, 57]. In this book, we extend the use of fuzzy logic to a higher order, which is called type-2 fuzzy logic [12]. By combining type-2 fuzzy logic with traditional SC techniques, we can build powerful hybrid intelligent systems that exploit the advantages each technique offers in solving pattern recognition problems [59].
Fuzzy logic is an area of soft computing that enables a computer system to reason with uncertainty [103]. A fuzzy inference system consists of a set of if-then rules defined over fuzzy sets. Fuzzy sets generalize the concept of a traditional set by allowing the membership degree to be any value between 0 and 1. This corresponds, in the real world, to many situations where it is difficult to decide unambiguously whether something belongs to a specific class [12]. Fuzzy expert systems, for example, have been applied with some success to problems of decision, control, diagnosis and classification, because they can manage the complex expert reasoning involved in these areas of application [20, 44, 58, 60]. The main disadvantage of fuzzy systems is that they cannot adapt to changing situations [24, 59]. For this reason, it is a good idea to combine fuzzy logic with neural networks or genetic algorithms, because either of these two methodologies can give adaptability to the fuzzy system [19, 59, 60]. On the other hand, the knowledge that is used to build these fuzzy rules is uncertain. Such uncertainty leads to rules whose antecedents or consequents are uncertain, which translates into uncertain antecedent or consequent membership functions [41].
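As a small illustration of the ideas above (membership degrees between 0 and 1, and if-then rules defined over fuzzy sets), the following is a minimal sketch and not the implementation used in this book. The function names `triangular` and `rule_strength`, the linguistic labels "warm" and "high", and all numeric parameters are hypothetical and chosen only for the example; the rule's AND is taken as the minimum of the two degrees, which is one common choice among several.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set.

    a and c are the feet of the triangle and b is its peak (a < b < c).
    The result is always a value between 0 and 1.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def rule_strength(temperature, humidity):
    """Firing strength of the hypothetical rule:
    IF temperature is warm AND humidity is high THEN ...

    The AND is modeled as the minimum of the antecedent degrees
    (an assumption of this sketch).
    """
    warm = triangular(temperature, 15.0, 25.0, 35.0)
    high = triangular(humidity, 50.0, 80.0, 110.0)
    return min(warm, high)


if __name__ == "__main__":
    # 20 degrees is only partially "warm": the degree lies strictly between 0 and 1.
    print(triangular(20.0, 15.0, 25.0, 35.0))
    print(rule_strength(20.0, 65.0))
```

A crisp set would force each input to be either inside or outside the class "warm"; here the degree changes gradually, and the rule fires only as strongly as its weakest antecedent.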
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 3–6. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
Type-1 fuzzy systems, like the ones mentioned above, whose membership functions are type-1 fuzzy sets, are unable to directly handle such uncertainties [46, 47]. In this book, we also describe type-2 fuzzy systems, in which the antecedent or consequent membership functions are type-2 fuzzy sets [12, 62]. Such sets are fuzzy sets whose membership grades are themselves type-1 fuzzy sets; they are very useful in circumstances where it is difficult to determine an exact membership function for a fuzzy set [12]. Type-2 fuzzy systems have been applied with relative success
in many real-world applications [27], such as control [1, 11, 14, 15, 20, 29, 95, 99, 100], time series prediction [6, 42, 76], classification and decision [2, 3, 4, 21, 22, 46, 47, 73, 81, 82, 85, 86, 89, 98, 101, 104, 106], diagnosis [23, 67, 72, 75], and pattern recognition [30, 31, 32, 33, 36, 50, 54, 61, 64, 66, 80, 91, 105]. There have also been recent theoretical advances in type-2 fuzzy logic [1, 13, 26, 68, 69, 71, 77, 78, 88, 92, 94, 96, 102, 108, 109] that have improved the performance of implementations, as well as advances in the use of optimization techniques to automate the design of type-2 fuzzy systems [5, 10, 22, 45, 70, 79, 90, 97].
Uncertainty is an inherent part of intelligent systems used in real-world applications. The use of new methods for handling incomplete information is of fundamental importance. Type-1 fuzzy sets used in conventional fuzzy systems cannot fully handle the uncertainties present in intelligent systems. Type-2 fuzzy sets, which are used in type-2 fuzzy systems, can handle such uncertainties better because they provide more parameters [83]. This book deals with the design of intelligent systems using interval type-2 fuzzy logic for minimizing the effects of uncertainty produced by the instrumentation elements, environmental noise, etc. [84].
Neural networks are computational models with learning (or adaptive) characteristics that model the human brain [57]. Generally speaking, biological neural networks consist of neurons and connections between them, and this is modeled by a graph with nodes and arcs to form the computational neural network. This graph, together with a computational algorithm that specifies the learning capabilities of the system, is what makes the neural network a powerful methodology for simulating intelligent or expert behavior. Neural networks can be classified as supervised or unsupervised.
The main difference is that in supervised neural networks the learning algorithm uses input-output training data to model the dynamic system, whereas in unsupervised neural networks only the input data is given. In an unsupervised network, the input data is used to form representative clusters of all the data. It has been shown that neural networks are universal approximators, in the sense that they can model any general function to a specified accuracy; for this reason, neural networks have been applied to problems of system identification, control, diagnosis, time series prediction, and pattern recognition [32, 56, 58, 60, 65].
We also describe the basic concepts, theory and algorithms of modular and ensemble neural networks [52, 59]. We give particular attention to the problem of response integration, which is very important because response integration is responsible for combining all the outputs of the modules [80, 93]. Basically, a modular or ensemble neural network uses several monolithic neural networks to solve a specific problem [48, 49, 64]. The basic idea is that by combining the results of several simple neural networks we can achieve a better overall result in terms of accuracy, and learning can also be done faster [50, 51, 80].
Genetic algorithms and evolutionary methods are optimization methodologies based on principles of nature [24]. Both methodologies can also be viewed as search algorithms because they explore a space using heuristics inspired by
nature [25]. Genetic algorithms are based on the ideas of evolution and the biological processes that occur at the DNA level. Basically, a genetic algorithm uses a population of individuals, which are modified by genetic operators in such a way as to eventually obtain the fittest individual. Any optimization problem has to be represented using chromosomes, which are a codified representation of the real values of the variables in the problem. Both genetic algorithms and evolutionary methods can be used to optimize a general objective function. Because genetic algorithms are based on the ideas of natural evolution, we can use this methodology to evolve a neural network or a fuzzy system for a particular application [10, 11, 25]. The problem of finding the best architecture of a neural network is very important because there are no theoretical results on this, and in many cases we are forced to resort to trial and error unless we use a genetic algorithm to automate the process. A similar situation occurs in finding the optimal number of rules and membership functions of a fuzzy system for a particular application; here a genetic algorithm can also help us avoid time-consuming trial and error. In this book, we use genetic algorithms to optimize the architecture of fuzzy and neural systems [80].
We describe in this book a new approach for face recognition using modular neural networks with a fuzzy logic method for response integration [63, 64]. We describe a new architecture for modular neural networks for achieving pattern recognition in the particular case of human faces. The method for achieving response integration is based on the fuzzy Sugeno integral and type-2 fuzzy logic. Response integration is required to combine the outputs of all the modules in the modular network. We have applied the new approach for face recognition with a real database of faces from students and professors of our institution.
Recognition rates with the modular approach were compared against those of the monolithic single neural network approach to measure the improvement. The modular neural network approach gives excellent performance overall and also in comparison with the monolithic approach.
We describe a new architecture for modular neural networks for achieving pattern recognition in the particular case of human fingerprints [93]. Here too, the method for achieving response integration is based on the fuzzy Sugeno integral. Response integration is required to combine the outputs of all the modules in the modular network.
We also describe in this book the use of neural networks, fuzzy logic and genetic algorithms for voice recognition [32, 93]. In particular, we consider the case of speaker recognition by analyzing the sound signals with the help of intelligent techniques, such as neural networks and fuzzy systems. We use the neural networks to analyze the sound signal of an unknown speaker, and after this first step, a set of type-2 fuzzy rules is used for decision making. We need to use fuzzy logic because of the uncertainty of the decision process. We also use genetic algorithms to optimize the architecture of the neural networks. We illustrate our approach with a sample of sound signals from real speakers in our institution.
This book also presents three modular neural network architectures as systems for recognizing persons based on the iris biometric measurement of humans [80]. In these systems, the human iris database is enhanced with image processing
methods, and the coordinates of the center and radius of the iris are obtained to make a cut of the area of interest by removing the noise around the iris. The inputs to the modular neural networks are the processed iris images and the output is the number of the person identified. The integration of the modules was done with a gating network method.
This book also describes human recognition using ear images as the biometric, with modular neural networks that take preprocessed ear images as inputs [80]. We propose a new modular neural network architecture composed of twelve modules, in order to simplify the problem by making it smaller. Compared with other biometrics, ear recognition has one of the best performances, even though it has not received much attention. To compare with other existing methods, we used 2D wavelet analysis with a global thresholding method for compression, and Sugeno measures and Winner-Takes-All as modular neural network integrators. Recognition results achieved with this approach were excellent.
This book also describes a modular neural network with fuzzy response integration for the problem of signature recognition [59]. Biometric identification has currently gained a great deal of research interest within the pattern recognition community. For instance, many attempts have been made to automate the process of identifying a person’s handwritten signature; however, this problem has proven to be a very difficult task. In this work, we propose a modular network that has three separate modules, each using different image features as input: edges, wavelet coefficients, and the Hough transform matrix. The outputs from these modules are then combined using a Sugeno fuzzy integral and a fuzzy inference system. The experimental results obtained using a benchmark database of individuals’ signatures show that the modular architecture can achieve very high recognition accuracy.
Therefore, we conclude that the proposed architecture provides a suitable platform on which to build a signature recognition system. Furthermore, we evaluate signature verification in terms of the false acceptance, false rejection and recognition error of the modular neural network.
We describe in this book our new approach for human recognition, which combines three biometric measures of a person: iris, ear, and voice [80]. We have described above the use of intelligent techniques for iris, ear, and voice recognition. Now we consider the integration of these three biometric measures to improve the accuracy of human recognition. The new approach consists of a modular architecture that contains three basic modules, one for each biometric measure: iris, ear, and voice. The final decision is based on the results of the three modules and uses type-2 fuzzy logic to take into account the uncertainty of the module outputs. Recognition results improve significantly with the use of type-2 fuzzy logic, especially in real-world situations, under uncertainty or noise, when decisions are more difficult to make. Of course, the problem in this case is how to find the optimal type-2 fuzzy system to integrate the information of the modules in an appropriate fashion. Genetic algorithms are used to automatically design the optimal type-2 fuzzy systems for achieving this goal.
Chapter 2
Type-1 and Type-2 Fuzzy Inference Systems for Images Edge Detection
Edge detection in digital images is a problem that has been addressed by applying different techniques from digital signal processing; the combination of some of these techniques with Fuzzy Inference Systems (FIS) has also been tried [64]. In this chapter a new Type-2 FIS method is implemented for the detection of edges, and the results of three different techniques for the same purpose are compared [65].
2.1 Introduction
In the area of digital signal processing, a number of methods have proven useful for the problem of image recognition. Some of them include techniques like binarization, two-dimensional filtering, edge detection and compression using filter banks and trees, among others. For edge detection specifically, there are comparative studies of methods such as Canny, Nalwa, Iverson, Bergholm and Rothwell. Other methods can be grouped into two categories: Gradient and Laplacian [65]. Gradient methods like Roberts, Prewitt and Sobel detect edges by looking for maxima and minima in the first derivative of the image. Laplacian methods like Marr-Hildreth do so by finding the zeros of the second derivative of the image. This work is the beginning of an effort to design new image pre-processing techniques, using Fuzzy Inference Systems (FIS), that allow feature extraction and the construction of input vectors for neural networks aimed at image recognition. Artificial neural networks are one of the most widely used techniques in automatic pattern recognition, for the following reasons [59]:
• In theory, any function can be determined.
• Apart from the input patterns, it is not necessary to provide additional information.
• They can be applied to any type of pattern and to any data type.
The idea of applying artificial neural networks to image recognition is to obtain results without providing any data other than the original images; in this way the
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 7–20. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
process is more similar to the way in which the biological brain learns to recognize patterns, knowing only past experiences [8, 9, 32, 40]. Models with modular neural networks have been designed that allow recognizing images divided into four or six parts. This is necessary because of the large amount of input data: an unprocessed image of 100x100 pixels needs a vector of 10000 elements, where each element corresponds to a pixel with a gray tone between 0 and 255. This chapter presents an efficient Fuzzy Inference System for edge detection, whose output image is used as input data for modular neural networks. In the proposed technique, Sobel operators are first applied to the original images, and a Fuzzy System is then used to generate the vector of edges that serves as input data to a neural network.
2.2 Sobel Operators
The Sobel operator, applied to a grayscale digital image, calculates the gradient of the brightness intensity at each pixel, giving the direction of the greatest possible increase from black to white, as well as the amount of change in that direction [65]. The Sobel operator performs a 2-D spatial gradient measurement on an image. Typically it is used to find the approximate absolute gradient magnitude at each point in an input grayscale image. The Sobel edge detector uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the gradient in the y-direction (rows). A convolution mask is usually much smaller than the actual image. As a result, the mask is slid over the image, manipulating a square of pixels at a time. The Sobel masks are shown in equation (2.1):
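\[
\mathrm{Sobel}_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
\mathrm{Sobel}_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
\tag{2.1}
\]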
where Sobelx and Sobely are the Sobel operators along the x-axis and y-axis. If we define I as the source image, gx and gy are two images which at each point (r, c) contain the horizontal and vertical derivative approximations; they are computed as in equations (2.2) and (2.3).
\[
g_x = \sum_{i=1}^{3} \sum_{j=1}^{3} \mathrm{Sobel}_{x,i,j} * I_{r+i-2,\,c+j-2}
\tag{2.2}
\]
\[
g_y = \sum_{i=1}^{3} \sum_{j=1}^{3} \mathrm{Sobel}_{y,i,j} * I_{r+i-2,\,c+j-2}
\tag{2.3}
\]
where gx and gy are the gradients along the x-axis and y-axis, and * represents the convolution operator. The gradient magnitude g is calculated with equation (2.4).
\[
g = \sqrt{g_x^2 + g_y^2}
\tag{2.4}
\]
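Equations (2.1) to (2.4) can be illustrated with a small sketch. The following Python code is only an illustration under our own assumptions (the function names `apply_mask` and `sobel_gradients` and the toy image are hypothetical, and only interior pixels are processed); it is not the Matlab program used for the experiments.

```python
import math

# Sobel masks from equation (2.1)
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1],
           [0, 0, 0],
           [-1, -2, -1]]


def apply_mask(image, mask, r, c):
    """Double sum of equations (2.2)/(2.3) at interior pixel (r, c), 0-based."""
    total = 0
    for i in range(3):
        for j in range(3):
            total += mask[i][j] * image[r + i - 1][c + j - 1]
    return total


def sobel_gradients(image):
    """Return (gx, gy, g) for the interior pixels of a grayscale image."""
    rows, cols = len(image), len(image[0])
    gx, gy, g = [], [], []
    for r in range(1, rows - 1):
        gx_row, gy_row, g_row = [], [], []
        for c in range(1, cols - 1):
            dx = apply_mask(image, SOBEL_X, r, c)
            dy = apply_mask(image, SOBEL_Y, r, c)
            gx_row.append(dx)
            gy_row.append(dy)
            g_row.append(math.sqrt(dx * dx + dy * dy))  # equation (2.4)
        gx.append(gx_row)
        gy.append(gy_row)
        g.append(g_row)
    return gx, gy, g


# Toy 4x4 image with a vertical step edge (black on the left, white on the right)
toy = [[0, 0, 255, 255] for _ in range(4)]
gx, gy, g = sobel_gradients(toy)
```

For this toy image the horizontal gradient is large at every interior pixel while the vertical gradient vanishes, which is the behavior expected for a vertical edge.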
2.3 Edge Detection by Gradient Magnitude
Although the aim of this chapter is to verify the efficiency of a FIS for edge detection in digital images, starting from the approximations given by the Sobel operator, it is necessary to show first the results obtained using only the gradient magnitude. As an example we use the first image of subject number one of the ORL database (Figure 2.1). The gray tone of each pixel of this image is a value between 0 and 255. Figure 2.2 shows the image generated by gx, and Figure 2.3 shows the image generated by gy.
Fig. 2.1 Original Image 1.pgm
Fig. 2.2 Image given by gx
Fig. 2.3 Image given by gy
An example of the maximum and minimum values of the matrices given by gx, gy and g for the image 1.pgm is shown in Table 2.1.
Table 2.1 Maximum and minimum values from 1.pgm, gx, gy and g
Tone       1.pgm   gx     gy     g
Minimum    11      -725   -778   0
Maximum    234     738    494    792
Applying equation (2.4) yields the image g shown in Figure 2.4.
Fig. 2.4 Edges image given by g
2.4 Edge Detection Using Type-1 Fuzzy Logic
A Mamdani FIS was implemented using Type-1 Fuzzy Logic [103], with four inputs, one output and seven rules, using the Matlab Fuzzy Logic Toolbox; it is shown in Figure 2.5.
Fig. 2.5 FIS in Matlab Fuzzy Logic Tool Box (Mamdani system EDGES DETECTOR; inputs shown include M, DH and DV; output EDGES)
For the Type-1 Fuzzy Inference System, 4 inputs are required. Two of them are the gradients with respect to the x-axis and y-axis, calculated with equations (2.2) and (2.3), which we will call DH and DV respectively. The other two inputs are filters: a high-pass filter, given by the mask of equation (2.5), and a low-pass filter, given by the mask of equation (2.6). The high-pass filter hHP detects the contrast of the image to guarantee border detection in regions of relatively low contrast. The low-pass filter hMF allows detecting image pixels that belong to regions of the input where the mean gray level is lower. These regions are proportionally more affected by noise, assuming it is uniformly distributed over the whole image. The goal here is to design a system which makes it easier to include edges in low contrast regions, but which does not favor false edges caused by noise.
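\[
h_{HP} = \begin{bmatrix}
-\frac{1}{16} & -\frac{1}{8} & -\frac{1}{16} \\
-\frac{1}{8} & \frac{3}{4} & -\frac{1}{8} \\
-\frac{1}{16} & -\frac{1}{8} & -\frac{1}{16}
\end{bmatrix}
\tag{2.5}
\]
\[
h_{MF} = \frac{1}{25} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1
\end{bmatrix}
\tag{2.6}
\]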
The inputs for the Type-1 FIS are then: DH = gx, DV = gy, HP = hHP*I and M = hMF*I, where * is the convolution operator. For all the fuzzy variables, the membership functions are of Gaussian type. According to the tests carried out, the values of DH and DV go from -800 to 800, so the ranges on the x-axis were adjusted as shown in Figures 2.6, 2.7 and 2.8, where the membership functions are: LOW: gaussmf(43,0), MEDIUM: gaussmf(43,127), HIGH: gaussmf(43,255).
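As a reminder of what these parameters mean, the following Python sketch evaluates Gaussian membership functions with the definition used by the Matlab function gaussmf, exp(-(x - c)^2 / (2 sigma^2)), reading each pair above as (sigma, center). The helper names `gaussmf` and `fuzzify` and the sample value 127 are our own assumptions for illustration.

```python
import math


def gaussmf(x, sigma, center):
    """Gaussian membership degree of x for a set with the given sigma and center."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# (sigma, center) pairs quoted in the text for LOW, MEDIUM and HIGH
MEMBERSHIP = {"LOW": (43, 0), "MEDIUM": (43, 127), "HIGH": (43, 255)}


def fuzzify(x):
    """Membership degree of a crisp value x in each linguistic label."""
    return {label: gaussmf(x, s, c) for label, (s, c) in MEMBERSHIP.items()}


degrees = fuzzify(127)  # a value at the center of MEDIUM
```

A value located at the center of a set has membership 1 in it, and the membership decays to exp(-0.5), about 0.61, one sigma away from the center.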
Fig. 2.6 Input variable DH (membership functions LOW, MEDIUM and HIGH; degree of membership against DH)
Fig. 2.7 Input variable DV
Fig. 2.8 Input variable HP
In the case of variable M, the tests gave values in the range from 0 to 255, and the range on the x-axis was adjusted accordingly, as can be seen in Figure 2.9.
Fig. 2.9 Input variable M
Figure 2.10 shows the output variable EDGES, whose range was also adjusted to lie between 0 and 255, since this is the range of values required to display the edges of an image.
Fig. 2.10 Output variable EDGES
The seven fuzzy rules evaluate the input variables so that the output image displays the edges of the image in a color near white (HIGH tone), whereas the background is in tones near black (LOW tone).
1. If (DH is LOW) and (DV is LOW) then (EDGES is LOW)
2. If (DH is MEDIUM) and (DV is MEDIUM) then (EDGES is HIGH)
3. If (DH is HIGH) and (DV is HIGH) then (EDGES is HIGH)
4. If (DH is MEDIUM) and (HP is LOW) then (EDGES is HIGH)
5. If (DV is MEDIUM) and (HP is LOW) then (EDGES is HIGH)
6. If (M is LOW) and (DV is MEDIUM) then (EDGES is LOW)
7. If (M is LOW) and (DH is MEDIUM) then (EDGES is LOW)
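To make the evaluation of these rules concrete, the following Python sketch runs one Mamdani inference step for a single pixel. It rests on several assumptions of ours that the text does not state: the minimum is used for "and" and for implication, the maximum for aggregation, and the centroid for defuzzification; the same three Gaussian sets (sigma 43, centers 0, 127 and 255) are used for every variable; and the names `mu` and `infer_edges` are hypothetical.

```python
import math


def gaussmf(x, sigma, center):
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


# Assumed (sigma, center) parameters, shared by all variables in this sketch
MF = {"LOW": (43, 0), "MEDIUM": (43, 127), "HIGH": (43, 255)}


def mu(value, label):
    sigma, center = MF[label]
    return gaussmf(value, sigma, center)


# (antecedents, consequent) for the seven rules listed above
RULES = [
    ((("DH", "LOW"), ("DV", "LOW")), "LOW"),
    ((("DH", "MEDIUM"), ("DV", "MEDIUM")), "HIGH"),
    ((("DH", "HIGH"), ("DV", "HIGH")), "HIGH"),
    ((("DH", "MEDIUM"), ("HP", "LOW")), "HIGH"),
    ((("DV", "MEDIUM"), ("HP", "LOW")), "HIGH"),
    ((("M", "LOW"), ("DV", "MEDIUM")), "LOW"),
    ((("M", "LOW"), ("DH", "MEDIUM")), "LOW"),
]


def infer_edges(inputs):
    """Return the defuzzified EDGES value (0..255) for one pixel."""
    # Firing strength of each rule: minimum over its antecedents
    strengths = []
    for antecedents, consequent in RULES:
        w = min(mu(inputs[name], label) for name, label in antecedents)
        strengths.append((w, consequent))
    # Clip each consequent set, aggregate with max, take the centroid
    num = den = 0.0
    for y in range(256):
        agg = max(min(w, mu(y, label)) for w, label in strengths)
        num += y * agg
        den += agg
    return num / den


flat = infer_edges({"DH": 0, "DV": 0, "HP": 127, "M": 127})
edge = infer_edges({"DH": 127, "DV": 127, "HP": 127, "M": 127})
```

With these assumed sets, a pixel whose two gradients are LOW is mapped to a dark output by rule 1, while a pixel whose two gradients are MEDIUM is mapped to a light output by rule 2.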
The result obtained for the image of Figure 2.1 is remarkably better than the one obtained with the gradient magnitude method, as shown in Figure 2.11.
Fig. 2.11 EDGES Image by FIS Type 1
Reviewing the values of each pixel, we see that all of them fall in the range from 0 to 255, which is not the case with the gradient magnitude method.
2.5 Edge Detection Using Type-2 Fuzzy Logic
For the Type-2 FIS, the same method was followed as for the Type-1 FIS, precisely in order to be able to compare both results. The tests with the Type-2 FIS were executed using the program imagen_bordes_fis2.m, which creates an interval Type-2 (Mamdani) Inference System. This program creates the type-2 fuzzy variables as seen in Figure 2.12. The width of the footprint of uncertainty (FOU) chosen for each membership function was the one that gave the best results after several experiments. The program imagen_bordes_fuzzy2.m was implemented to load the original image and to apply the filters mentioned above. Because of the large amount of data that the fuzzy rules must evaluate, the image was divided into four parts, and the FIS was applied to each one separately. Each evaluation gives a vector of gray tones for one part of the image; joining them gives the complete image with the edges (Figure 2.13).
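The idea of a footprint of uncertainty can be sketched with an interval type-2 Gaussian membership function whose width is uncertain. This is one common construction and only an assumption on our part: the text does not say how the FOU was parameterized in imagen_bordes_fis2.m, and the function name `it2_gaussmf` and the widths 38 and 48 are hypothetical values chosen around the type-1 sigma of 43.

```python
import math


def gaussmf(x, sigma, center):
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def it2_gaussmf(x, sigma_lower, sigma_upper, center):
    """Interval type-2 Gaussian with uncertain width.

    Returns the (lower, upper) membership degrees of x; the band between
    the two curves is the footprint of uncertainty (FOU).
    Assumes sigma_lower <= sigma_upper.
    """
    lower = gaussmf(x, sigma_lower, center)  # narrower curve
    upper = gaussmf(x, sigma_upper, center)  # wider curve
    return lower, upper


# Hypothetical FOU around the MEDIUM set (center 127)
lo_deg, hi_deg = it2_gaussmf(100, 38, 48, 127)
center_deg = it2_gaussmf(127, 38, 48, 127)
```

Instead of a single degree, each input value now receives an interval of membership degrees; at the center of the set the interval collapses to the single value 1.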
Fig. 2.12 Type-2 fuzzy variables (panels M, DH, DV, HP and EDGES, each with membership functions LOW, MEDIUM and HIGH)
Fig. 2.13 EDGES Image by Type 2 FIS
2.6 Comparison of Results
The first results of several tests conducted on different images can be seen in Table 2.2.
Table 2.2 Results of Edge Detection by FIS1 and FIS2 (dark background)
Columns: Original Image, EDGES (FIS 1), EDGES (FIS 2)
At first sight, the results of the Type-1 FIS and the Type-2 FIS appear very similar. However, considering that displaying the images with a dark background could obscure the contrast of tones, tests were done inverting the consequents of the rules, so that the edges take the dark tone and the background the clear tone. The rules changed to the following form:
1. If (DH is LOW) and (DV is LOW) then (EDGES is HIGH)
2. If (DH is MEDIUM) and (DV is MEDIUM) then (EDGES is LOW)
3. If (DH is HIGH) and (DV is HIGH) then (EDGES is LOW)
4. If (DH is MEDIUM) and (HP is LOW) then (EDGES is LOW)
5. If (DV is MEDIUM) and (HP is LOW) then (EDGES is LOW)
6. If (M is LOW) and (DV is MEDIUM) then (EDGES is HIGH)
7. If (M is LOW) and (DH is MEDIUM) then (EDGES is HIGH)
Both fuzzy systems (Type-1 and Type-2) were tested with the new fuzzy rules and the same images, obtaining the results shown in Table 2.3.
Table 2.3 Results of Edge Detection by FIS1 and FIS2 (clear background)
Columns: EDGES (FIS 1), EDGES (FIS 2)
In this second test a great difference can be seen between the results obtained with FIS 1 and FIS 2: at first sight there is greater contrast in the images obtained with FIS 1, while the images from the Type-2 FIS give the impression of a smaller range of gray tones. In order to obtain an objective comparison of the images, histograms were computed for the resulting edge matrices of FIS 1 and FIS 2; they are shown in Table 2.4. The histograms show on the x-axis the range of gray tones of each image and on the y-axis the frequency with which pixels of each tone appear.
Table 2.4 Histograms of the Resulting Images of the Edges by the Gradient Magnitude, FIS 1 and FIS 2 Methods
IMAGE: 1.PGM
METHOD: GRADIENT MAGNITUDE (histogram GM-1.pgm-DB)
METHOD: FIS 1 (CLEAR BACKGROUND) (histogram FIS 1-1.pgm-CB)
METHOD: FIS 2 (CLEAR BACKGROUND) (histogram FIS 2-1.pgm-CB)
As we can observe, unlike the FIS 1 detector, with FIS 2 the edges of an image can be obtained in a very complete form by taking only the tones between about 150 and 255.
As a last experiment, every pixel outside the range between 150 and 255 was eliminated from the images produced by the Type-2 FIS. Table 2.5 shows the number of elements that could be eliminated in some of the images; we see that the Type-2 FIS edge detector allows using less than half of the original pixels without losing the detail of the images. This feature could be a great advantage if these images, instead of the original images, are used as input data to neural networks for image recognition. The type-2 fuzzy systems were implemented using the type-2 fuzzy toolbox developed previously by our group [16, 17, 18].
Table 2.5 Type-2 FIS Edges Images Including Only Pixels with Tones between 150 and 255
Borders image   Dimension (pixels)   Pixels included
(image)         108x88               4661 of 9504 (49 %)
(image)         144x110              7077 of 15840 (44.6 %)
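The counting behind Table 2.5 can be sketched as follows. This Python fragment is only an illustration under our assumptions: the function name `count_in_range` and the toy image are hypothetical, and the limits 150 and 255 are taken from the caption of Table 2.5.

```python
def count_in_range(image, low=150, high=255):
    """Count the pixels whose tone lies in [low, high].

    Returns (pixels kept, total pixels, percentage kept).
    """
    kept = sum(1 for row in image for tone in row if low <= tone <= high)
    total = sum(len(row) for row in image)
    return kept, total, 100.0 * kept / total


# Toy 2x3 edge image
toy_edges = [[0, 150, 200],
             [255, 149, 30]]
kept, total, percent = count_in_range(toy_edges)

# Total number of pixels of the two image sizes listed in Table 2.5
total_small = 108 * 88
total_large = 144 * 110
```

In the toy image three of the six pixels fall inside the range, so half of the pixels are kept.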
2.7 Summary
The application of Sobel filters was very useful for defining the input vectors of the Type-1 FIS and the Type-2 FIS. In future work we will try to design neuro-fuzzy techniques able to extract image patterns without any data other than the original image, and compare the results with traditional techniques of digital signal processing.
Thanks to the histograms of the images it was possible to verify the improvement in the results of the Type-2 FIS with respect to the Type-1 FIS, since an objective difference was very difficult to appreciate with the human eye alone. The best result was obtained by the Type-2 Fuzzy Inference System, because it was possible to discard more than half of the pixels without degrading the image, which will drastically reduce the cost of training a neural network.
Chapter 3
Type-2 Fuzzy Logic for Improving Training Data and Response Integration in Modular Neural Networks
The combination of Soft Computing techniques allows the improvement of intelligent systems with different hybrid approaches [12, 57, 59]. In this work we consider two parts of a Modular Neural Network for image recognition where a Type-2 Fuzzy Inference System (FIS 2) makes a great difference. The first FIS 2 is used for feature extraction in the training data, and the second one to find the ideal parameters for the integration method of the modular neural network. Once again fuzzy logic is shown to be a tool that can help improve the results of a neural system by facilitating the representation of human perception.
3.1 Method for Image Recognition
At the moment, many methods for image recognition are available, but most of them include a phase of feature extraction or another type of preprocessing closely related to the type of image to be recognized. The method proposed in this chapter can be applied to any type of image, because the preprocessing phase does not need specific data about the type of image. Even though the method was not designed only for face recognition [64], we have made the tests with the ORL face database (AT&T Laboratories Cambridge), composed of 400 images of size 112x92. There are 40 persons, with 10 images of each person. The images were taken at different times, with different lighting and facial expressions. The faces are in an upright, frontal position, with slight left-right rotation. Figure 3.1 shows the 10 samples of one person in the ORL database. To explain the proposed steps of the method, we separate them into two phases: the training phase in Figure 3.2 and the recognition phase in Figure 3.3.
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 21–28. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
Fig. 3.1 Set of 10 samples of a person in ORL
Fig. 3.2 Steps in Training Phase
Fig. 3.3 Steps in Recognition Phase
3.2 Type-2 Fuzzy Inference System as Edge Detector
In previous work we presented an efficient fuzzy inference system for edge detection, whose output image is used as input data for modular neural networks. In the proposed technique, Sobel operators are first applied to the original images, and a Type-2 Fuzzy Inference System [12] is then used to generate the vector of edges that serves as input data to a neural network. Type-2 Fuzzy Logic enables us to handle uncertainties in decision making [62] and recognition in a more convenient way, and for this reason it was proposed.
For the Type-2 Fuzzy Inference System, 3 inputs are required. Two of them are the gradients with respect to the x-axis and y-axis, calculated with the masks of (3.1), which we will call DH and DV respectively. The Sobel edge detector uses a pair of 3x3 convolution masks, one estimating the gradient in the x-direction (columns) and the other estimating the gradient in the y-direction (rows).
\[
\mathrm{Sobel}_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},
\qquad
\mathrm{Sobel}_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
\tag{3.1}
\]
where Sobelx and Sobely are the Sobel operators along the x-axis and y-axis. If we define I as the source image, gx and gy are two images which at each point (r, c) contain the horizontal and vertical derivative approximations; they are computed as in (3.2) and (3.3).
\[
g_x = \sum_{i=1}^{3} \sum_{j=1}^{3} \mathrm{Sobel}_{x,i,j} * I_{r+i-2,\,c+j-2}
\tag{3.2}
\]
\[
g_y = \sum_{i=1}^{3} \sum_{j=1}^{3} \mathrm{Sobel}_{y,i,j} * I_{r+i-2,\,c+j-2}
\tag{3.3}
\]
where gx and gy are the gradients along the x-axis and y-axis, and * represents the convolution operator. The other input is a filter computed by convolving a mask with the original image. The low-pass filter hMF (3.4) allows us to detect image pixels that belong to regions of the input where the mean gray level is lower. These regions are proportionally more affected by noise, assuming it is uniformly distributed over the whole image. The goal here is to design a system which makes it easier to include edges in low contrast regions, but which does not favor false edges caused by noise.
\[
h_{MF} = \frac{1}{25} \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 1 & 1
\end{bmatrix}
\tag{3.4}
\]
The inputs for the Type-2 FIS are then DH = gx, DV = gy and M = hMF*I, where * is the convolution operator, and the output is a column vector that contains the values of the image edges, which can be displayed graphically as shown in Figure 3.4. The edge image is smaller than the original because the result of the convolution is only defined on the central matrix, where the mask fits entirely inside the image. Thus in our example each image of dimension 112x92 is reduced to 108x88.
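The reduction from 112x92 to 108x88 follows from keeping only the positions where the 5x5 mask of (3.4) fits entirely inside the image: each dimension shrinks by the mask size minus one. The following Python sketch checks this arithmetic and applies the mean filter to a small constant image; the function names `valid_output_size` and `mean_filter_valid` are our own and the toy image is hypothetical.

```python
def valid_output_size(rows, cols, mask_size):
    """Size of the central region where a square mask fits entirely."""
    return rows - mask_size + 1, cols - mask_size + 1


def mean_filter_valid(image, k=5):
    """Apply the k x k mean mask (all coefficients 1/k^2) on the valid region."""
    rows, cols = len(image), len(image[0])
    out_rows, out_cols = valid_output_size(rows, cols, k)
    out = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            window_sum = sum(image[r + i][c + j]
                             for i in range(k) for j in range(k))
            out_row.append(window_sum / (k * k))
        out.append(out_row)
    return out


orl_size = valid_output_size(112, 92, 5)

# Toy 6x7 image with a constant gray tone of 10
flat = [[10] * 7 for _ in range(6)]
smoothed = mean_filter_valid(flat)
```

For the toy image the output has 2 rows and 3 columns, and every value equals the constant input tone, as expected from a mean filter.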
The inference rules and membership function parameters allow us to calculate a gray value between -4.5 and 1.5 for each pixel, where the most negative values correspond to the dark tone of the edges of the image. Looking at the rules, only when the increments given by the inputs DH and DV are both low is the output HIGH or clear (the background); in the rest of the rules the output is LOW or dark (the edges). The complete set of fuzzy rules is as follows:
1. If (DH is LOW) and (DV is LOW) then (EDGES is HIGH) (1)
2. If (DH is MEDIUM) and (DV is MEDIUM) then (EDGES is LOW) (1)
3. If (DH is HIGH) and (DV is HIGH) then (EDGES is LOW) (1)
4. If (M is LOW) and (DV is MEDIUM) then (EDGES is LOW) (1)
5. If (M is LOW) and (DH is MEDIUM) then (EDGES is LOW) (1)
Fig. 3.4 Membership Function for the Type-2 FIS Edge Detector (panels DH, DV, M and EDGES)
The edge detector allows us to ignore the background color. In this database of faces we can see different background tones for the same person and for different persons. With the edge detector we eliminate the possible influence of these tones, which could lead to a bad classification by the neural network, without losing detail in the image. Another advantage of the edge detector is that the values can be normalized to a homogeneous range, independently of the light, contrast or background tone of each image. In the examples of Figure 3.5, all the edges in the images have a minimum value of -3.8 and a maximum value of 0.84. In particular, we find that these values make neural network training faster: the mean of the values is near 0 and the standard deviation is near 1 for all the images.
Fig. 3.5 Examples of edge detection with the Type-2 FIS method
3.3 The Modular Structure
The Modular Neural Network consists of 3 monolithic feedforward neural networks, each one trained with a supervised method on the first 7 samples of each of the 40 persons [64]. The edge column vectors are accumulated, one per sample, to form the input matrix for the neural networks, as in the scheme of Figure 3.7. Once the complete matrix of images is divided into 3 parts, each module is trained with its corresponding part, with some rows of overlap. The target for the supervised training method consists of one identity matrix for each sample, building one matrix with dimensions 40x(40*number_of_samples), as shown in Figure 3.8. Each monolithic neural network has the same structure and is trained under the same conditions, as can be seen in the following code segment:
    % p is the input matrix of the module (one edge vector per column)
    layer1 = 200;                    % neurons in the first hidden layer
    layer2 = 200;                    % neurons in the second hidden layer
    layer3 = number_of_subjects;     % one output neuron per person
    net = newff(minmax(p), [layer1, layer2, layer3], ...
                {'tansig', 'tansig', 'logsig'}, 'traingdx');
    net.trainParam.goal = 1e-5;      % performance goal
    net.trainParam.epochs = 1000;    % maximum number of training epochs
The average number of epochs needed to meet the goal in each module is 240, and the required time is 160 seconds.
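The construction of the target matrix and the division of the input rows among the modules can be sketched as follows. This Python fragment is only an illustration under our assumptions: the function names `build_targets` and `split_with_overlap` are hypothetical, and the overlap of 100 rows is an arbitrary value, since the text only says "some rows of overlap".

```python
def build_targets(number_of_subjects, number_of_samples):
    """Concatenate one identity matrix per sample, side by side.

    The result has number_of_subjects rows and
    number_of_subjects * number_of_samples columns.
    """
    n = number_of_subjects
    eye = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    return [row * number_of_samples for row in eye]


def split_with_overlap(n_rows, parts, overlap):
    """Return (start, stop) row ranges for each module, extended by `overlap` rows."""
    base = n_rows // parts
    ranges = []
    for k in range(parts):
        start = max(0, k * base - overlap)
        stop = n_rows if k == parts - 1 else min(n_rows, (k + 1) * base + overlap)
        ranges.append((start, stop))
    return ranges


targets = build_targets(40, 7)                       # 40 x 280 target matrix
module_rows = split_with_overlap(108 * 88, 3, 100)   # rows of the edge vectors
```

With 7 training samples per person the target matrix has 40 rows and 280 columns, and each column contains a single 1 in the row of the person it belongs to.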
3.4 Simulation Results
A program was developed in Matlab that simulates each module with the 400 images of the ORL database, building a matrix with the results of the simulation of each module. These matrices are stored in the file “mod.mat” to be analyzed later for the combination of results. We can observe that in the columns corresponding to the training data, the position with a value near one is the correctly selected image. However, in the columns that correspond to the test data this does not always happen, which is why it is very important to have a good combination method in order to recognize more images. According to exhaustive tests on the simulation matrices, recognition of the images that were used to train the neural networks is 100%. Therefore the interest is focused on the recognition of the samples that do not belong to the training set, that is, samples 8, 9 and 10.
The parameters of the Sugeno Fuzzy Integral to be inferred are the fuzzy densities, a value between 0 and 1 for each module, which determines the rating of each module. The parameter lambda, according to the theory of fuzzy measures, depends on the values of the fuzzy densities and is calculated by searching for the roots of a polynomial. After the simulation of an image in the neural network, the simulation value is the only known parameter on which to base a decision, so it is the only information available to determine the fuzzy density of each module. For this reason we analyzed the values in many simulation matrices and decided that each input to the Type-2 FIS corresponds to the maximum value of the column obtained from the simulation of each module for each one of the 400 images. The process to recognize each one of the images is shown in Figure 3.6.
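The role of the fuzzy densities and of lambda can be illustrated with the standard definitions of the Sugeno lambda-measure and the Sugeno integral: lambda is the root greater than -1 (and different from 0) of prod(1 + lambda*g_i) = 1 + lambda, and the integral is the maximum over i of min(h_(i), g(A_i)) with the values h sorted in descending order. The following Python sketch is our own illustration, not the Matlab program of this chapter; the function names, the bisection search and the sample values are assumptions, and the densities are assumed to lie strictly between 0 and 1.

```python
def sugeno_lambda(densities):
    """Nonzero root (greater than -1) of prod(1 + lam*g_i) = 1 + lam, by bisection."""
    if len(densities) < 2:
        raise ValueError("at least two densities are needed")

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0                  # densities already add up to 1
    if total > 1.0:
        lo, hi = -1.0, -1e-7        # the root lies in (-1, 0)
    else:
        lo, hi = 1e-7, 1.0          # the root is positive; grow the bracket
        while f(hi) < 0.0:
            hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sugeno_integral(values, densities):
    """Sugeno integral of the module outputs `values` with the given densities."""
    lam = sugeno_lambda(densities)
    pairs = sorted(zip(values, densities), reverse=True)  # descending by value
    measure = 0.0
    best = 0.0
    for h, g in pairs:
        measure = measure + g + lam * measure * g  # measure of the growing set
        best = max(best, min(h, measure))
    return best


# Hypothetical outputs of the three modules for one candidate person
lam = sugeno_lambda([0.5, 0.5, 0.5])
fused = sugeno_integral([0.9, 0.6, 0.3], [0.5, 0.5, 0.5])
```

In a recognition setting one would compute such a fused value for every candidate person and select the person with the largest one; how the densities are obtained is precisely the task of the fuzzy system described in this section.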
Fig. 3.6 Process of recognition using the type-2 fuzzy modular approach
Each output then corresponds to one fuzzy density, to be applied to each module in order to later perform the fusion of results with the Sugeno Fuzzy Integral. The inference rules yield fuzzy densities near 1 when the maximum value in the simulation is between 0.5 and 1, and near 0 when the maximum value in the simulation is near 0. The fuzzy rules are shown below and the membership functions in Figure 3.7.
1. If (max1 is LOW) then (d1 is LOW) (1)
2. If (max2 is LOW) then (d2 is LOW) (1)
3. If (max3 is LOW) then (d3 is LOW) (1)
4. If (max1 is MEDIUM) then (d1 is HIGH) (1)
5. If (max2 is MEDIUM) then (d2 is HIGH) (1)
6. If (max3 is MEDIUM) then (d3 is HIGH) (1)
7. If (max1 is HIGH) then (d1 is HIGH) (1)
8. If (max2 is HIGH) then (d2 is HIGH) (1)
9. If (max3 is HIGH) then (d3 is HIGH) (1)
Although the rules are very simple, they make it possible to model the fuzziness involved in rating the modules when the simulation result does not reach the maximum value of 1. However, some of the images do not reach a sufficient value in the simulation of the three modules; in these cases there is not enough information to select an image when the modules are combined, and the image is wrongly selected.
Fig. 3.7 Membership functions for the FIS to find fuzzy densities (panels max1, max2, max3, d1, d2 and d3, each with membership functions LOW, MEDIUM and HIGH)
In order to measure the final results objectively, we developed a method of random permutation, which rearranges the samples of each person before training. Once a permutation is made, the modular neural networks are trained and combined four times to obtain sufficient information to validate the results. The maximum recognition rate obtained was 96.5%, with an average of 94.035% (Table 3.1).
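One way to picture the random permutation is the following Python sketch, which shuffles the sample numbers 1 to 10 of a person and uses the first seven for training and the remaining three for testing. The function name `permuted_split` and the use of a seeded generator are our assumptions; the text does not specify how the permutation was generated or whether the same permutation is shared by all persons.

```python
import random


def permuted_split(samples_per_person=10, train_size=7, rng=None):
    """Shuffle the sample numbers and split them into training and test sets."""
    rng = rng or random.Random()
    order = list(range(1, samples_per_person + 1))
    rng.shuffle(order)
    return order[:train_size], order[train_size:]


# A seeded generator makes the permutation reproducible
train_ids, test_ids = permuted_split(rng=random.Random(0))
```

Each permutation changes which three samples of a person are unseen during training, so repeating the experiment over several permutations gives a less biased estimate of the recognition rate.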
We show in Table 3.1 the summary of simulation results for each of the modules and the average and maximum results of the modular network (after fusion or combination of the results).
Table 3.1 Summary of the simulation results with the hybrid approach (image recognition, %)
Permutation   Train 1   Train 2   Train 3   Train 4   Average   Maximum
1             92.75     95        92.2      93.25     93.3      95
2             96.5      95.25     94.25     95.5      95.375    96.5
3             91.5      92        93.75     95.25     93.125    95.25
4             94.5      94.5      93.25     94        94.0625   94.5
5             93.75     93.5      94        96        94.3125   96
Overall                                               94.035    96.5
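The Average and Maximum columns of Table 3.1 can be recomputed directly from the four training runs of each permutation. The following Python sketch does this arithmetic; the variable names are our own.

```python
# Recognition rates (%) of Train 1 to Train 4 for each permutation (Table 3.1)
RATES = {
    1: [92.75, 95, 92.2, 93.25],
    2: [96.5, 95.25, 94.25, 95.5],
    3: [91.5, 92, 93.75, 95.25],
    4: [94.5, 94.5, 93.25, 94],
    5: [93.75, 93.5, 94, 96],
}

averages = {p: sum(r) / len(r) for p, r in RATES.items()}
maxima = {p: max(r) for p, r in RATES.items()}

overall_average = sum(averages.values()) / len(averages)
overall_maximum = max(maxima.values())
```

The overall average is the mean of the five per-permutation averages, and the overall maximum is the best single run over all permutations.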
We have to mention that type-2 fuzzy systems were implemented using the type-2 fuzzy toolbox developed previously by our group [16, 17, 18].
3.5 Summary
We have shown in this chapter that the combination of Soft Computing techniques allows the improvement of intelligent systems with different hybrid approaches. We considered two parts of a Modular Neural Network for image recognition where a Type-2 Fuzzy Inference System (FIS 2) helps us improve the performance results in image recognition. The first FIS 2 was used for feature extraction in the training data, and the second one to find the ideal parameters for the integration method of the modular neural network.
Chapter 4
Method for Response Integration in Modular Neural Networks Using Type-2 Fuzzy Logic
We describe in this chapter a new method for response integration in modular neural networks using type-2 fuzzy logic [59]. The modular neural networks were used in human person recognition [64, 80]. Biometric authentication is used to achieve person recognition. Three biometric characteristics of the person are used: face, fingerprint, and voice [32]. A modular neural network of three modules is used. Each module is a local expert on person recognition based on each of the biometric measures. The response integration method of the modular neural network has the goal of combining the responses of the modules to improve the recognition rate of the individual modules. We show in this chapter the results of a type-2 fuzzy approach for response integration that improves performance over type-1 fuzzy logic approaches.
4.1 Introduction
Today, a variety of methods and techniques are available to determine unique identity, the most common being fingerprint, voice, face, and iris recognition [7, 8, 9, 36, 40, 52, 54, 55, 74, 107]. Of these, fingerprint and iris offer a very high level of certainty as to a person's identity, while the others are less exact. A large number of other techniques are currently being examined for suitability as identity determinants. These include (but are not limited to) retina, gait (walking style), typing style, body odor, signature, hand geometry, and DNA. Some wildly esoteric methods are also under development, such as ear structure, thermal imaging of the face and other parts of the body, subcutaneous vein patterns, blood chemistry, anti-body signatures, and heart rhythm, to name a few. The four primary methods of biometric authentication in widespread use today are face, voice, fingerprint, and iris recognition. All of these are supported in our approach, some more abundantly than others. Generally, face and voice are considered to be a lower level of security than fingerprint and iris, but on the other hand, they have a lower cost of entry. We describe briefly in this section some of these biometric methods [59].
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 29–39. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
Chapter 4 Method for Response Integration in Modular Neural Networks
Face Recognition. Facial recognition has advanced considerably in the last 10 to 15 years. Early systems, based entirely on simple geometry of key facial reference points, have given way to more advanced mathematically based analyses such as Local Feature Analysis and Eigenface evaluation. These have been extended through the addition of "learning" systems, particularly neural networks. Face recognition systems are particularly susceptible to changes in lighting conditions. For example, strong illumination from the side will present a vastly different image to a camera than neutral, evenly positioned fluorescent lighting. Beyond this, however, these systems are relatively immune to changes such as weight gain, spectacles, beards and moustaches, and so on. Most manufacturers of face recognition systems claim false accept and false reject rates of 1% or better.

Voice Recognition. Software systems are rapidly becoming adept at recognising free-flowing speech and converting it to its written form. The underlying difficulty in doing this is to flatten out any differences between speakers and understand everyone universally. Conversely, when the goal is to identify one specific person in a large group by their voice alone, these very same differences need to be identified and enhanced. As a means of authentication, voice recognition usually takes the form of speaking a previously enrolled phrase into a computer microphone and allowing the computer to analyse and compare the two sound samples. Methods of performing this analysis vary widely between vendors. None is willing to offer more than cursory descriptions of their algorithms, principally because, apart from LAN authentication, the largest market for speaker authentication is in verification of persons over the telephone.

Fingerprint Recognition. The process of authenticating people based on their fingerprints can be divided into three distinct tasks. First, an image of the fingerprint must be collected; second, the key elements of the fingerprint must be determined for confirmation of identity; and third, the set of identified features must be compared with a previously enrolled set for authentication. The system should never expect to see a complete 1:1 match between these two sets of data. In general, any collection device could be coupled with any algorithm, although in practice most vendors offer proprietary, linked solutions.

A number of fingerprint image collection techniques have been developed. The earliest method was optical: using a camera-like device to collect a high-resolution image of a fingerprint. Later developments turned to silicon-based sensors that collect an impression by a number of methods, including surface capacitance, thermal imaging, pseudo-optical on silicon, and electronic field imaging. A variety of fingerprint detection and analysis methods exist, each with its own strengths and weaknesses. Consequently, claimed (and achieved) false accept and false reject rates vary widely. The poorest systems offer a false accept rate of around 1:1,000, while the best are approaching 1:1,000,000. False reject rates for the same vendors are around 1:100 to 1:1,000.
4.2 Proposed Approach for Recognition

Our proposed approach for human recognition consists of integrating the information from three main biometric measures of the person: the voice, the face, and the fingerprint [59]. Basically, we have an independent system for recognizing a person from each type of biometric information (voice, face, and fingerprint), and at the end an integration unit makes a final decision based on the results from each of the modules. Figure 4.1 shows the general architecture of our approach, with one module for voice recognition, one module for face recognition, and one module for fingerprint recognition [32]. At the top, the decision unit integrates the results from the three modules. In this chapter the decision unit is implemented with a type-2 fuzzy system.
Fig. 4.1 Architecture of the proposed modular approach
4.3 Modular Neural Networks

This section describes a particular class of "modular neural networks", which have a hierarchical organization comprising multiple neural networks. The architecture basically consists of two principal components: local experts and an integration unit, as illustrated in Figure 4.2. The basic concept rests on the idea that combined (or averaged) estimators may be able to exceed the limitations of a single estimator [59]. The idea also shares conceptual links with the "divide and conquer" methodology [87]. Divide and conquer algorithms attack a complex problem by dividing it into simpler problems whose solutions can be combined to yield a solution to the complex problem. When using a modular network, a given task is split up among several local expert NNs. The average load on each NN is reduced in comparison with a single NN that must learn the entire original task, and thus the combined model may be able to surpass the limitations of a single NN.
The outputs of a certain number of local experts (Oi) are mediated by an integration unit. The integrating unit puts the outputs together using estimated combination weights (gi). The overall output Y is given by equation (4.1).
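[Figure: the input x feeds Module 1, Module 2, ..., Module N, which produce outputs y1, y2, ..., yN; a gating network supplies the weights g1, g2, ..., gN; the combined output is y = f(x)]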
Fig. 4.2 Architecture of a modular neural network
$$Y = \sum_{i=1}^{N} g_i O_i \qquad (4.1)$$
Nowlan, Jacobs, Hinton, and Jordan described modular networks from a competitive mixture perspective. That is, in the gating network, they used the "softmax" function [59]. More precisely, the gating network uses a softmax activation $g_i$ for the $i$th output unit, given by

$$g_i = \frac{\exp(k u_i)}{\sum_j \exp(k u_j)} \qquad (4.2)$$
where $u_i$ is the weighted sum of the inputs flowing to the $i$th output neuron of the gating network. Use of the softmax activation function in modular networks provides a sort of "competitive" mixing perspective, because the output $O_i$ of a local expert with a small activation $u_i$ does not have a great impact on the overall output $Y$.
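To make equations (4.1) and (4.2) concrete, the following is a minimal Python sketch of softmax gating. The function names, the expert outputs `O`, the gating pre-activations `u`, and the default `k = 1.0` are illustrative assumptions, not values taken from the chapter.

```python
import numpy as np

def softmax_gate(u, k=1.0):
    """Gating weights g_i = exp(k*u_i) / sum_j exp(k*u_j), as in (4.2)."""
    z = k * np.asarray(u, dtype=float)
    z = z - z.max()  # shift for numerical stability; the ratio is unchanged
    e = np.exp(z)
    return e / e.sum()

def integrate(outputs, u, k=1.0):
    """Overall output Y = sum_i g_i * O_i, as in (4.1)."""
    g = softmax_gate(u, k)
    return g @ np.asarray(outputs, dtype=float)

# Hypothetical example: three experts, each scoring two candidate identities.
O = np.array([[0.9, 0.1],
              [0.6, 0.4],
              [0.2, 0.8]])
u = np.array([2.0, 1.0, -1.0])  # assumed gating pre-activations
g = softmax_gate(u)
Y = integrate(O, u)
```

With these assumed values the third expert has the smallest pre-activation, so it receives the smallest weight and contributes least to `Y`, which is the "competitive" behaviour described above.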
4.4 Integration of Results for Person Recognition Using Fuzzy Logic

In the past decade, fuzzy systems have displaced conventional technology in different scientific and system engineering applications, especially in pattern recognition
and control systems. The same fuzzy technology, in the form of approximate reasoning, is also resurging in information technology, where it now supports decision making and expert systems with powerful reasoning capacity and a limited number of rules. In the case of modular neural networks, a fuzzy system can be used as an integrator of results.

Fuzzy sets were introduced by L. A. Zadeh in 1965 to process and manipulate data and information affected by non-probabilistic uncertainty and imprecision [103]. They were designed to represent mathematically the vagueness and uncertainty of linguistic problems, thereby providing formal tools for working with the imprecision intrinsic to different types of problems; fuzzy set theory is considered a generalization of classical set theory. Type-2 fuzzy sets are used to model uncertainty and imprecision in a better way [13, 26, 27, 28, 34, 35, 38, 39, 43, 53]. Type-2 fuzzy sets were originally presented by Zadeh in 1975 and are essentially "fuzzy fuzzy" sets, in which the degree of membership is itself a type-1 fuzzy set. New concepts introduced by Mendel [62] allow a type-2 fuzzy set to be characterized by an upper membership function and a lower membership function, each of which can be represented by a type-1 membership function. The region between these two functions is the footprint of uncertainty (FOU), which is used to characterize a type-2 fuzzy set [41].

Uncertainty is the imperfection of knowledge about a natural process or natural state [12]. Statistical uncertainty is the randomness or error that arises from different sources when a statistical methodology is used.
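As an illustration of the upper and lower membership functions and the footprint of uncertainty, the following Python sketch builds an interval type-2 triangular set from two nested type-1 triangles. The construction (widening and narrowing the feet by a `spread` parameter) and all numeric values are assumptions made for this example; they are not the membership functions used in the experiments.

```python
def tri(x, a, b, c):
    """Type-1 triangular membership with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def interval_type2_tri(x, a, b, c, spread):
    """Return (lower, upper) membership grades of an interval type-2 set.

    Assumption: the upper function widens the feet by `spread` and the lower
    function narrows them by `spread`; the gap between the two is the FOU.
    """
    upper = tri(x, a - spread, b, c + spread)
    lower = tri(x, a + spread, b, c - spread)
    return lower, upper

# Hypothetical set centred at 0.5 on a [0, 1] universe.
lo, up = interval_type2_tri(0.4, a=0.0, b=0.5, c=1.0, spread=0.1)
peak_lo, peak_up = interval_type2_tri(0.5, a=0.0, b=0.5, c=1.0, spread=0.1)
```

For any input, the membership grade is an interval `[lo, up]` rather than a single number; the interval collapses to a point only where the two functions coincide, here at the peak.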
4.5 Modular Neural Networks with Type-2 Fuzzy Logic as a Method for Response Integration

As mentioned previously, type-2 fuzzy logic was used to integrate the responses of the three modules of the modular network. Each module was trained with the corresponding data, i.e. face, fingerprint and voice. In addition, a set of modular neural networks was built to test the type-2 fuzzy logic approach to response integration. The architecture of the modular neural network is shown in Figure 4.3. The figure shows that each module is itself divided into three parts, so that each recognition problem is also split into three parts. Experiments were performed with sets of 20 and 30 persons. The trainings were done with different architectures, i.e. different numbers of modules, layers and nodes. As can be seen in Figure 4.3, the first module was used for training with voice data. In this case, three different words were used for each person: "access", "presentation", and "hello".
[Figure: three modules (Module 1, Module 2, Module 3), each divided into SubMod. 1, SubMod. 2 and SubMod. 3]
Fig. 4.3 Architecture of the Modular Network used for the recognition problem
The second module was used for training with face data. In this case, two different photos were taken of each person, one in a normal position and the other with noise. The idea is that training with noise will make the recognition more robust to changes in the real world. Figure 4.4 shows the photos of two persons in a normal situation and in a noisy situation.

[Figure panels: Normal Images; Images with Noise]
Fig. 4.4 Sample Photos of Faces in a Normal and Noisy Situation
The third module was used with fingerprint data of the group of persons. The fingerprint information was acquired with a scanner. Noise was added for training the neural networks. In all cases, each module is subdivided into three sub-modules, which makes the respective recognition problem easier.
4.6 Simulation Results

A set of different trainings for the modular neural networks was performed to test the proposed type-2 fuzzy logic approach for response integration in modular neural networks. We show in Table 4.1 some of these trainings with different numbers of modules, layers and nodes. The training times are also shown in this table to illustrate the performance with different training algorithms and conditions.

Table 4.1 Sample Trainings of the Modular Neural Network
Once the necessary trainings were done, a set of tests was performed with different type-2 fuzzy systems. The fuzzy systems were used as response integrators for the three modules of the modular network. In the type-2 fuzzy systems, different types of membership functions were considered with the goal of comparing the results and deciding on the best choice for the recognition problem. The best type-2 fuzzy system, in the sense that it produced the best recognition results, was the one with triangular membership functions. This fuzzy system has three input variables and one output variable, with three membership functions per variable. We show in Figures 4.5 and 4.6 the membership functions of the type-2 fuzzy system.
Fig. 4.5 Input variables of the type-2 fuzzy system
Fig. 4.6 Output variables of the type-2 fuzzy system
The recognition results of this type-2 fuzzy system are shown in Table 4.2, which covers 15 trainings of the modular neural network. Each row of the table gives the recognition rate obtained with the type-2 fuzzy system. In 8 out of the 15 cases, a 100% recognition rate was achieved.
Table 4.2 Results of the Type-2 Fuzzy System with Triangular Membership Functions
The fuzzy systems with the worst results for the modular neural network were the ones with Gaussian and Trapezoidal membership functions. We use 3 input variables and one output variable, as in the previous fuzzy system. We show in Figures 4.7 and 4.8 the Gaussian membership functions of this system.
Fig. 4.7 Input variables for type-2 fuzzy system with Gaussian membership functions
Fig. 4.8. Output variable for type-2 fuzzy system with Gaussian membership functions
We show in Figures 4.9 and 4.10 the Trapezoidal membership functions of another type-2 fuzzy system.
Fig. 4.9. Input variables for the Type-2 Fuzzy System with Trapezoidal Functions
Fig. 4.10. Output variable for type-2 fuzzy system with Trapezoidal functions.
The results obtained with Gaussian and Trapezoidal membership functions are similar. We show in Table 4.3 the recognition results obtained with the type-2 fuzzy system with Trapezoidal membership functions. Table 4.3 shows that a 100% recognition rate is obtained in only 6 out of the 15 cases. Also, there are 4 cases with low recognition rates.
Table 4.3 Recognition rates with the Type-2 System and Trapezoidal Functions
Experiments with a type-1 fuzzy integration of responses were performed in previous work, in which the recognition rates were consistently lower, by an average of 5%. We can state in conclusion that the type-2 fuzzy system for response integration improves the recognition rate for persons based on face, fingerprint and voice. The type-2 fuzzy systems were implemented using the type-2 fuzzy toolbox developed previously by our group [16, 17, 18].
4.7 Summary

We described in this chapter a new method for response integration in modular neural networks that uses type-2 fuzzy logic to model uncertainty in the decision process. We showed different trainings of the modular neural networks, and tested different type-2 fuzzy systems for response integration. Based on the recognition rates obtained, the best results were achieved with a type-2 fuzzy system with triangular membership functions. The results obtained with this type-2 fuzzy system are better than those previously obtained with a similar type-1 approach.
Part II
Chapter 5
Modular Neural Networks for Person Recognition Using the Contour Segmentation of the Human Iris
This chapter presents three modular neural network architectures as systems for recognizing persons based on the iris biometric measurement of humans [80]. In these systems, the human iris database is enhanced with image processing methods, and the coordinates of the center and radius of the iris are obtained to make a cut of the area of interest by removing the noise around the iris. The inputs to the modular neural networks are the processed iris images and the output is the number of the person identified. The integration of the modules was done with a gating network method [59].
5.1 Introduction

This chapter focuses on modular neural networks for pattern recognition based on biometric measures [32], specifically on recognition using the human iris [80]. At present, biometric measurements are widely used in person recognition systems [59]. A lot has been said about the use of such measures, particularly the signature, fingerprint, face and voice [64]. As more research was done in this area, further biometric measures were identified, among them the human iris, which has the particular advantage of retaining its universality and authenticity over the years. In order to achieve good identification, we propose a modular neural network architecture divided into three modules. The input of each module is a part of the human iris database, and several methods or techniques are used for pre-processing the images, such as normalization, resizing, cropping, and edge detection, among others. The final choice of modular neural network architecture and image pre-processing depends on the tests performed and on the time needed to achieve identification [64, 65].

The chapter is organized as follows: Section 2 contains a brief review of previous research on recognition of people with the human iris and of basic concepts relevant to the area, Section 3 defines the method proposed for this research and describes the problem addressed in this chapter, Section 4 presents the results achieved, and Section 5 draws conclusions and outlines future work.

P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 43–59. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
Chapter 5 Modular Neural Networks for Person Recognition
5.2 Background and Basic Concepts

5.2.1 Modular Neural Network

An artificial neural network (ANN) is a distributed computing scheme based on the structure of the human nervous system [59]. The architecture of a neural network is formed by connecting multiple elementary processors; it is an adaptive system with an algorithm that adjusts its weights (free parameters) to meet the performance requirements of the problem, based on representative samples. An ANN is therefore a distributed computing system characterized by:

- A set of elementary units, each of which has low processing capabilities.
- A densely interconnected structure using weighted links.
- Free parameters to be adjusted to meet performance requirements.
- A high degree of parallelism.

The most important property of artificial neural networks is their ability to learn from a training set of patterns, i.e. to find a model that fits the data. The artificial neuron consists of several parts: the inputs, the weights, the summation, and finally the activation function. The input values are multiplied by the weights and added: $\sum_i x_i w_{ij}$. This function is completed with the addition of a threshold amount $\theta_i$. This threshold has the same effect as an input with value -1. It allows the sum to be shifted to the left or right of the origin. After the addition, the function f is applied to the result to obtain the final value of the output, also called $y_i$. The result of the sum before applying the function f is often called the activation value $a_i$.

Modular neural networks are composed of simple networks that behave as functional blocks; these are the neural modules. A modular neural network works similarly to a classical neural network, as it is composed of neurons with sigmoidal, linear or discrete activations and is trained with common learning algorithms (gradient descent with adaptive learning, backpropagation, scaled gradient descent, etc.). What distinguishes it from other neural models is that it is built from functional modules, and each module runs a neural network with the same or different characteristics (input layer, hidden layers, output layer, activation function, learning algorithm, number of neurons per layer, etc.). In this model the modules work independently, and at the end a component commonly called the integrator decides among the different modules to determine which of them has the best solution (for example a gating network or a fuzzy integrator). Figure 5.1 shows a modular neural network scheme:
Fig. 5.1 Schematic of a modular artificial neural network
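The single artificial neuron described in Section 5.2.1 can be sketched in Python as follows. This is only an illustration: the use of `tanh` as the function f, the helper names, and the numeric inputs are assumptions. The second function shows the remark that the threshold behaves like an extra input fixed at -1 whose weight is the threshold.

```python
import numpy as np

def neuron_output(x, w, theta, f=np.tanh):
    """Activation a = sum_i x_i * w_i - theta, output y = f(a)."""
    a = np.dot(x, w) - theta
    return f(a), a

def neuron_output_bias_as_input(x, w, theta, f=np.tanh):
    """Same neuron, with theta used as the weight of an extra input fixed at -1."""
    x_ext = np.append(x, -1.0)
    w_ext = np.append(w, theta)
    a = np.dot(x_ext, w_ext)
    return f(a), a

# Hypothetical inputs, weights and threshold.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
y1, a1 = neuron_output(x, w, theta=0.3)
y2, a2 = neuron_output_bias_as_input(x, w, theta=0.3)
```

Both formulations give the same activation and output, which is why many implementations fold the threshold into the weight vector.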
5.2.2 Historical Development

The first use of the iris was presented in Paris [80], where criminals were classified according to the color of their eyes, following a proposal by the French ophthalmologist Bertillon in 1880. Research in visual identification technology began in 1935. During that year an article appeared in the 'New York State Journal of Medicine', which suggested that "the pattern of arteries and veins of the retina could be used for unique identification of an individual". After researching and documenting the potential use of the iris as a tool to identify people, ophthalmologists Flom and Safir patented their idea in 1987; later, in 1989, they patented algorithms developed with the mathematician Daugman. Thereafter, other authors developed similar approaches. Later, in 2001, Daugman also presented a new algorithm for the recognition of people using the iris as a biometric measurement.

The literature has documented the uniqueness of visual identification well. The iris is so distinctive that no two irises are alike, even those of twins. The probability of two irises producing the same code is 1 in $10^{78}$; given that the earth's population is estimated at approximately $10^{10}$ people, such a coincidence is almost impossible.

Biometric identification techniques are very diverse, since any significant feature of a person is potentially useful as an element of biometric identification. Even with the diversity of existing techniques, the overall design of a biometric identification system remains independent of the particular technique. Human beings have many features in common, but also have characteristics that distinguish them and make each of them unique. Over the years there have been many studies on this subject, developing techniques, methods and systems that enable the use of these patterns for personal identification.
5.2.3 Iris Properties

The iris is an internal organ of the eye, located behind the cornea and the aqueous humor. It consists of connective tissue, fibers, rings and colors that form a distinctive mark of each person and can be observed at close range (see Fig. 5.2). The iris texture is not a genetic expression, and its morphogenesis is completely random [80].
Fig. 5.2 Human Iris
The properties of the iris that favor its use for identification of individuals include: a) it is unique to each individual, b) it cannot be modified without risk of vision loss, c) its pattern has high randomness, and d) it is easy to record at close range. It also presents some disadvantages: a) its small size makes it difficult to acquire at certain distances, b) it is a moving target, c) it is located on a curved, moist and reflective surface, d) its image is often affected by eyelashes, eyelids and light reflections, and e) its deformations are not elastic when the pupil changes size.
5.3 Proposed Method and Problem Description

The proposed methodology for iris recognition can be stated as follows [80]:

1) Search for a database of human iris images.
2) Determine the division of the database in terms of individuals per module for the modular neural network.
3) Search for pre-processing methods and techniques to apply to the image database in order to obtain an optimal identification.
4) Implement the different modular neural network architectures in a programming language (see Fig. 5.3). The modular neural networks consist of 3 modules.
5) Find a modular integration method that provides the desired results.
Fig. 5.3 Schematic representation of the division of problem and system operation
5.3.1 Problem Description

This work focuses primarily on the identification of individuals. This problem is well known to the scientific community, as innumerable investigations have been carried out in this area [8, 40], considering various biometric measures (fingerprint, voice, palm of the hand, signature) and various methods of identification (with particular emphasis on neural networks). The specific problem considered in this work is: "Obtain a good percentage of person identification based on the biometric measurement of the human iris, using modular neural networks".

We used a database of human iris images from the Institute of Automation of the Chinese Academy of Sciences (CASIA) (see Fig. 5.4). It consists of 14 images (7 of the right eye and 7 of the left eye) per person, for 99 individuals, giving a total of 1386 images. The image dimensions are 320 x 280, in JPEG format.
Fig. 5.4 Examples of the human iris images from CASIA database
5.3.2 Image Pre-processing

The pre-processing applied to the images before they are presented to the neural network is as follows:

1) Obtain the coordinates and radius of the iris and pupil using the method developed by Masek and Kovesi.
2) Make the cut of the iris.
3) Resize the cut of the iris to 21 x 21.
4) Convert images from vector to matrix.
5) Normalize the images.
1) Obtain coordinates of the center and radius of the iris: To get the coordinates of the center and the radius of the iris and pupil for the images in the CASIA database, the method developed by Masek and Kovesi was used. This method applies a series of filters and mathematical calculations to obtain the desired result. First, edge detection with Canny's method is applied (see Fig. 5.5 (a)); then a gamma adjustment of the image is performed (see Fig. 5.5 (b)); non-maxima suppression is applied to the resulting image (see Fig. 5.5 (c)); and finally a threshold method is applied (see Fig. 5.5 (d)).
Fig. 5.5 (a) Edge detection with Canny's method (b) Gamma adjustment of the image (c) Non-maxima suppression (d) Threshold
Finally, we apply the Hough transform to find the maximum in the Hough space and, therefore, the circle parameters (row and column at the center of the iris and the radius).
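The voting idea behind the circular Hough transform can be sketched in Python as below. This is a simplified illustration, not the Masek and Kovesi implementation used in the chapter: the function name, the number of sampled angles, and the synthetic edge map are assumptions, and the edge map is taken as already thresholded.

```python
import numpy as np

def hough_circle(edges, radii, n_angles=180):
    """Vote for circle centers; return (row, col, radius) with the most votes."""
    rows, cols = np.nonzero(edges)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    h, w = edges.shape
    best = (0, 0, 0, -1)  # (row, col, radius, votes)
    for r in radii:
        acc = np.zeros((h, w), dtype=int)
        # Each edge pixel votes for every center lying at distance r from it.
        cr = np.rint(rows[:, None] - r * np.sin(thetas)[None, :]).astype(int)
        cc = np.rint(cols[:, None] - r * np.cos(thetas)[None, :]).astype(int)
        ok = (cr >= 0) & (cr < h) & (cc >= 0) & (cc < w)
        np.add.at(acc, (cr[ok], cc[ok]), 1)
        votes = acc.max()
        if votes > best[3]:
            r0, c0 = np.unravel_index(acc.argmax(), acc.shape)
            best = (int(r0), int(c0), int(r), int(votes))
    return best[:3]

# Synthetic edge map (assumed): a circle of radius 8 centered at row 20, column 25.
edges = np.zeros((40, 50), dtype=bool)
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
edges[np.rint(20 + 8 * np.sin(t)).astype(int),
      np.rint(25 + 8 * np.cos(t)).astype(int)] = True
center_row, center_col, radius = hough_circle(edges, radii=range(5, 12))
```

The maximum of the accumulator over all candidate radii plays the role of "the maximum in the Hough space" mentioned above; a production implementation would typically restrict the radius range and smooth the accumulator.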
2) Cut out the iris: After obtaining the coordinates of the iris, the upper left corner and the size of the region are calculated to make the cut (see Fig. 5.6). Note that RowLowRight and ColLowRight hold the height and width of the region (twice the radius), not absolute coordinates.

    RowUpLeft = RowIris - RadiusIris;
    RowLowRight = (RowIris + RadiusIris) - RowUpLeft;
    ColUpLeft = ColumnIris - RadiusIris;
    ColLowRight = (ColumnIris + RadiusIris) - ColUpLeft;
Fig. 5.6 Cut of iris, using the “imcrop” Matlab function
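An equivalent crop can be sketched with NumPy slicing, as a stand-in for the Matlab "imcrop" call. The function name, the clamping to the image borders, the placeholder image, and the center and radius values are assumptions for illustration; the image is assumed to be stored as 280 rows by 320 columns.

```python
import numpy as np

def crop_iris(image, row_iris, col_iris, radius_iris):
    """Return the square patch of side 2*radius + 1 centered on the iris."""
    top = max(row_iris - radius_iris, 0)
    left = max(col_iris - radius_iris, 0)
    bottom = min(row_iris + radius_iris, image.shape[0] - 1)
    right = min(col_iris + radius_iris, image.shape[1] - 1)
    # Slices exclude the end index, so add 1 to keep the last row and column.
    return image[top:bottom + 1, left:right + 1]

image = np.arange(280 * 320).reshape(280, 320)  # placeholder for a 320 x 280 image
patch = crop_iris(image, row_iris=140, col_iris=160, radius_iris=50)
```

Clamping keeps the crop valid when the detected circle extends past the image border, at the cost of a patch that is no longer square in that case.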
5.4 Modular Neural Network Architecture

The work focused on the recognition of persons using a modular neural network. We worked with 3 modules; the input of each module covers 33 individuals (264 images for training and 198 images for testing). We used the integration method called Gating Network. The architecture of the modular neural network was defined with an empirical expression. Based on the following expressions, the number of nodes is calculated as follows [80]:
1st hidden layer: 2 × (k + 2) = 70
2nd hidden layer: k + m = 41
Output layer: k = 33
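The arithmetic of the empirical expressions can be checked with a few lines of Python. The function name is hypothetical, and the value m = 8 is an inference from k + m = 41 with k = 33; the chapter does not state what m denotes.

```python
def layer_sizes(k, m):
    """Layer sizes from the empirical expressions: 2*(k+2), k+m, and k."""
    return {"hidden1": 2 * (k + 2), "hidden2": k + m, "output": k}

# k = 33 individuals per module; m = 8 is inferred from k + m = 41 (assumption).
sizes = layer_sizes(k=33, m=8)
```

With these values the function returns 70, 41 and 33 nodes for the first hidden, second hidden and output layers.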
Having determined the characteristics of the modular neural network, such as the number of modules, the inputs, the hidden layers, the number of neurons in each hidden layer, and the outputs, we specify the initial architecture in Fig. 5.7.
Fig. 5.7 Initial Modular Neural Network Architecture
5.5 Simulation Results

Experiments were performed with the modular neural network architecture described in the previous section, using 3 types of learning algorithms: gradient descent with adaptive learning (GDA), gradient descent with adaptive learning and momentum (GDX), and scaled conjugate gradient (SCG). After obtaining the results for this architecture, two modified architectures were also tested to improve the accuracy.
5.5.1 Results with the Initial Modular Neural Network Architecture

The following results were achieved by each of the 3 modules in terms of percentage of identification.

Module 1

In this module the best results were obtained with the scaled conjugate gradient algorithm (SCG), with an identification rate of 96.46% (191/198), goal error of 0.0000001, execution time of 29 seconds and 207 iterations (see Fig. 5.8 and Table 5.1).
Fig. 5.8 Graph of error of the best training in Module 1

Table 5.1 Results for Module 1

Algorithm   Error       Time      Iterations   Rec.      Ident.    % Ident.
Traingda    0.000001    1.32 m.   1399         264/264   189/198   95.45
Traingdx    0.000001    38 s.     551          264/264   188/198   94.94
Trainscg    0.0000001   29 s.     207          264/264   191/198   96.46
Module 2

In this module the best results were obtained with the gradient descent algorithm with adaptive learning (GDA), with an identification rate of 96.46% (191/198), goal error of 0.000001, execution time of 56 seconds and 833 iterations (see Fig. 5.9 and Table 5.2).
Fig. 5.9 Graph of error of the best training in Module 2

Table 5.2 Results for Module 2

Algorithm   Error       Time      Iterations   Rec.      Ident.    % Ident.
Traingda    0.000001    56 s.     833          264/264   191/198   96.46
Traingdx    0.000001    40 s.     586          264/264   189/198   95.45
Trainscg    0.0000001   28 s.     207          264/264   189/198   95.45
Module 3

In this module the best results were obtained with the scaled conjugate gradient algorithm (SCG), with an identification rate of 94.94% (188/198), goal error of 0.0000001, execution time of 28 seconds and 144 iterations (see Fig. 5.10 and Table 5.3).
Fig. 5.10 Graph of error of the best training in Module 3

Table 5.3 Results for Module 3

Algorithm   Error       Time      Iterations   Rec.      Ident.    % Ident.
Traingda    0.000001    66 s.     992          264/264   187/198   94.44
Traingdx    0.000001    29 s.     416          264/264   186/198   93.93
Trainscg    0.0000001   20 s.     144          264/264   188/198   94.94
Integration

Analyzing the results obtained in the three modules, we see that in the first and third modules the learning algorithm that gave the best results was the scaled conjugate gradient, while in the second module the best results were obtained with gradient descent with adaptive learning. We decided to use the gating network integration method with the outputs of the three modules, obtaining a final identification rate of 95.95% (570/594) (see Table 5.4).
Table 5.4 Results of the integration for the 3 modules

Integrator       MD1        MD2        MD3        Rec.      Ident.    % Ident.
Gating Network   Trainscg   Traingda   Trainscg   792/792   570/594   95.95
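The integrated rate in Table 5.4 is consistent with summing the best per-module counts from Tables 5.1 to 5.3, which the following Python sketch checks. The helper name is hypothetical, and the use of truncation to two decimals is an observation about how the reported percentages appear to have been formed (for example 188/198 is listed as 94.94), not something the chapter states.

```python
def percent_truncated(correct, total):
    """Percentage truncated (not rounded) to two decimals, using integer math."""
    return (correct * 10000 // total) / 100

module_hits = [191, 191, 188]   # best Ident. counts for modules 1, 2 and 3
tests_per_module = 198

total_hits = sum(module_hits)
total_tests = tests_per_module * len(module_hits)
rate = percent_truncated(total_hits, total_tests)
```

Here `total_hits` is 570 out of `total_tests` equal to 594, matching the Ident. column of the table.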
5.5.2 First Modification of the Modular Neural Network Architecture

After obtaining the results with the proposed modular neural network, we decided to modify manually the number of neurons in the second hidden layer of the network. Two changes in the second hidden layer produced good results: 70 neurons in the first hidden layer with 67 neurons in the second hidden layer, and 70 neurons in the first hidden layer with 100 neurons in the second hidden layer.

Module 1

In this module the best result was achieved with the gradient descent algorithm with adaptive learning (GDA), with 70 neurons in the first hidden layer and 67 neurons in the second hidden layer, giving an identification rate of 96.96% (192/198) (see Table 5.5).

Table 5.5 Results for Module 1
            70-41                  70-67                  70-100
Algorithm   Identif.   % Identif.  Identif.   % Identif.  Identif.   % Identif.
Traingda    189/198    95.45       192/198    96.96       190/198    95.95
Traingdx    188/198    94.94       189/198    95.45       191/198    96.46
Trainscg    191/198    96.46       189/198    95.45       190/198    95.95
Module 2

In this module the best result was achieved with the gradient descent algorithm with adaptive learning (GDA), with 70 neurons in the first hidden layer and 67 neurons in the second hidden layer, giving an identification rate of 97.97% (194/198) (see Table 5.6).
Table 5.6 Results for module 2
            70-41                  70-67                  70-100
Algorithm   Identif.   % Identif.  Identif.   % Identif.  Identif.   % Identif.
Traingda    191/198    96.46       194/198    97.97       193/198    97.47
Traingdx    189/198    95.45       191/198    96.46       192/198    96.96
Trainscg    189/198    95.45       192/198    96.96       193/198    97.47
Module 3

In this module the best result was achieved with the scaled conjugate gradient algorithm (SCG), with 70 neurons in the first hidden layer and 100 neurons in the second hidden layer, giving an identification rate of 95.45% (189/198) (see Table 5.7).

Table 5.7 Results for Module 3

            70-41                  70-67                  70-100
Algorithm   Identif.   % Identif.  Identif.   % Identif.  Identif.   % Identif.
Traingda    187/198    94.44       187/198    94.44       188/198    94.94
Traingdx    186/198    93.93       187/198    94.44       188/198    94.94
Trainscg    188/198    94.94       188/198    94.94       189/198    95.45
Given the best results for the three modules, the modular neural network architecture is modified as shown in Fig. 5.11:
Fig. 5.11 Modified Modular Neural Network Architecture
Integration
Analyzing the results obtained in the three modules, we see that in the first and second modules the learning algorithm with the best results was gradient descent with adaptive learning rate, using 70 neurons in the first hidden layer and 67 neurons in the second hidden layer. In the third module, the best results were obtained with the scaled conjugate gradient algorithm, using 70 neurons in the first hidden layer and 100 neurons in the second hidden layer. The gating network integration uses these algorithms and numbers of neurons for each module. The final result is an identification rate of 96.80% (575/594) (see Table 5.8).
Table 5.8 Results of the integration for 3 modules with modified architecture
Integrator                               MD1        MD2        MD3        Rec.      Ident.    % Ident.
Gating Network                           Trainscg   Traingda   Trainscg   792/792   570/594   95.95
Gating Network (Modified Architecture)   Traingda   Traingda   Trainscg   792/792   575/594   96.80
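The integrated rate in Table 5.8 can be checked from the best per-module counts in Tables 5.5 to 5.7. The following is a minimal sketch, not the author's code: the function name is hypothetical, and truncating (rather than rounding) to two decimals is an assumption inferred from the published figures, for example 192/198 reported as 96.96.

```python
def identification_rate(correct, total):
    """Percentage of correct identifications, truncated to two decimals.

    Truncation is an assumption inferred from the published tables.
    """
    return (correct * 10000 // total) / 100

# Best per-module counts from Tables 5.5, 5.6 and 5.7 (198 test images each).
best_counts = {"MD1": 192, "MD2": 194, "MD3": 189}

total_correct = sum(best_counts.values())
total_images = 198 * len(best_counts)
overall = identification_rate(total_correct, total_images)
print(total_correct, total_images, overall)
```

Under these assumptions the script reproduces the 575/594 count and the 96.80% rate of the modified architecture.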
Cross Validation 10 Times
After obtaining the final results of the integration, we carried out cross-validation with 10 different combinations of training and test images and averaged the identification results (see Table 5.9):
Table 5.9 Results for cross validation for 10 times
Validator                   MD1     MD2     MD3     % Ident.
Cross Validation 10 Times   91.57   92.93   90.25   91.58
5.5.3 Second Modification of the Modular Neural Network Architecture (Extended)
In analyzing the results obtained with the integration in the modular neural network, we found that one learning algorithm (SCG) correctly identifies iris images of particular persons that another learning algorithm (GDA) fails to identify, and that the same happens with different numbers of neurons in the hidden layers of the neural network. Based on these observations, we propose an extended architecture of the modular neural network. In this architecture each module consists of two submodules with different learning algorithms or different numbers of neurons in the second hidden layer (the two configurations that were most successful with the modified architecture), as shown in Fig. 5.12:
Fig. 5.12 Extended Modular Neural Network Architecture
The outputs of each module are then integrated. The final result is an identification rate of 97.13% (577 of 594 test images) with the winner-takes-all integrator and 96.96% (576 of 594 test images) with the Mamdani type fuzzy integrator (see Table 5.10).
Table 5.10 Results of the integration for 3 modules with extended architecture
Integrator                               MD1              MD2              MD3              Rec.      Ident.    % Ident.
Gating Network (Modified Architecture)   Traingda (192)   Traingda (194)   Trainscg (189)   792/792   575/594   96.80
Integrator: Winner Takes All             194              193              190              792/792   577/594   97.13
Integrator: Fuzzy Type Mamdani           193              194              189              792/792   576/594   96.96
Cross Validation 10 Times
After obtaining the final results of the integration, we carried out cross-validation with 10 different combinations of training and test images and averaged the identification results (see Table 5.11).
Table 5.11 Results for cross validation for 10 times with extended architecture
Cross Validation 10 Times                                   % Ident.
Modified Architecture                                       91.58
Extended Architecture with Integrator Winner Takes All      91.83
Extended Architecture with Integrator Fuzzy Type Mamdani    91.41
5.6 Summary
In this chapter we presented modular neural network architectures that take as input a database of human iris images. The database images are divided into three parts, one for each of the three modules, and each module has two hidden layers. Several methods were used to remove the noise present in the original pictures, to obtain the coordinates of the center and the radius of the iris, and then to crop the image around the iris.
With the extended architecture we achieved better results than with the initial and modified architectures: an identification rate of 97.13% (577 of 594 test images) with the winner-takes-all integrator and 96.96% (576 of 594 test images) with the Mamdani type fuzzy integrator. These results show that the human iris biometric measure can be used with modular artificial neural networks to obtain favorable person identification results. Future work consists of considering other image pre-processing methods, either as replacements for or complements to the current one, and of using an optimization method for the modular artificial neural network, which could improve on the identification rates reported in this chapter.
Chapter 6
Modular Neural Networks for Human Recognition from Ear Images Compressed Using Wavelets
This chapter focuses on human recognition from ear images as a biometric, using modular neural networks with preprocessed ear images as network inputs [80]. We propose a modular neural network architecture composed of twelve modules, in order to simplify the problem by making it smaller. Compared with other biometrics, ear recognition has one of the best performances, even though it has not received much attention [9, 74]. To compare with other existing methods, we used 2D wavelet analysis with the global thresholding method for compression, and Sugeno Measures and Winner-Takes-All as modular neural network integrators. Recognition rates of up to 97% were achieved.
6.1 Introduction
It is no surprise that methods of identifying people by possessions, such as cards, badges, and keys, or by knowledge, such as passwords, user IDs, and Personal Identification Numbers (PINs), are being replaced by biometrics [59]. We can say that every person is unique: no two persons have the same face, identical fingerprints, or the same voice [32]. These and other singularities are the basis of biometrics. Biometrics is the science of identifying or verifying the identity of a person based on physiological or behavioral characteristics. For physiological characteristics, also called passive characteristics, the biometric measure is taken from a subject with or without his/her consent, depending on the biometric capture system. For behavioral characteristics, also called active characteristics, the subject must present his/her biometric measure by performing an action in front of a capture system [93]. Biometrics offers much higher accuracy than the more traditional methods. Possessions can be lost, forgotten, or easily replicated. Knowledge can be forgotten. Both possessions and knowledge can be stolen or shared with other people. In biometrics these drawbacks exist only on a small scale [59].
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 61–75. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
Some biometrics are shown in Figure 6.1. The behavioral characteristics include voice, handwritten signature, keyboard strokes, odor, and more. The physiological ones include fingerprint, iris, face, hand geometry, finger geometry, retina, vein structure, ear, and more. Systems based on physiological characteristics are generally more reliable than those based on behavioral characteristics, although the latter can sometimes be simpler to integrate in some specific uses [59].
Fig. 6.1 Sample Biometrics
The most commonly used biometrics according to the International Biometric Group in 2006 were: fingerprint, face, voice, iris, handwritten signature and hand geometry [8, 9, 32, 40, 74]. The ear biometric measure is not yet commonly used (Figure 6.2).
Fig. 6.2 Commonly used Biometrics
There are many applications where biometrics can be used [59]. Basically, a biometric system may operate in verification mode, also known as authentication, or in identification mode, also known as recognition. Identification mode answers the question "Who are you?": the system recognizes a subject by searching the templates of all the users in the database for a match. In verification mode the question is "Are you who you claim to be?", and the system validates a person’s identity by comparing the captured biometric data with that person’s own stored biometric template. But what biological measurements qualify as a biometric measure? Any human physiological and/or behavioral characteristic can be used as a biometric characteristic as long as it satisfies the following requirements: universal, unique, permanent, measurable, acceptable, capable and reliable. These concepts are described below:
• Universality (U): each person must possess the characteristic.
• Distinctiveness (D): the characteristic must differentiate each person from every other subject.
• Permanence (P): the characteristic must not change with time.
• Collectability (Co): it must be possible to measure the characteristic quantitatively.
• Acceptability (A): the characteristic must be widely accepted by society.
• Performance (Pf): accuracy, speed, and robustness of the technology used.
• Circumvention (C): reflects how easily the system can be fooled using fraudulent methods.
A brief comparison of different biometrics, based on the seven factors for an ideal biometric mentioned above, is provided in Table 6.1. As can be seen, ear, fingerprint and hand geometry have better averages than other biometrics [59]. However, no single technique can outperform all the others in all operational environments. In this sense, each biometric technique is admissible and there is no optimal biometric characteristic.
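The two operating modes described earlier in this section, identification (one-to-many search) and verification (one-to-one check against a claimed identity), can be illustrated with a small sketch. Everything in it is hypothetical: the toy feature vectors, the similarity function, and the acceptance threshold are assumptions chosen only for illustration and are not taken from the chapter.

```python
def similarity(a, b):
    """Toy similarity: negative squared Euclidean distance (hypothetical choice)."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))

def identify(sample, templates):
    """Identification (1:N): return the enrolled user whose template matches best."""
    return max(templates, key=lambda user: similarity(sample, templates[user]))

def verify(sample, claimed_user, templates, threshold=-0.05):
    """Verification (1:1): accept only if the sample is close to the claimed user's template."""
    return similarity(sample, templates[claimed_user]) >= threshold

# Hypothetical enrolled templates and a probe sample.
templates = {"alice": [0.1, 0.9, 0.4], "bob": [0.8, 0.2, 0.5]}
probe = [0.15, 0.85, 0.45]

print(identify(probe, templates))
print(verify(probe, "alice", templates))
print(verify(probe, "bob", templates))
```

Identification searches every template, so its cost grows with the number of enrolled users, whereas verification compares against a single template and depends mainly on the choice of threshold.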
Table 6.1 Comparison for different biometrics. High, Medium, and Low are denoted by H, M, and L, respectively.
Biometric Techniques   U   D   P   Co   A   Pf   C
Ear                    M   M   H   M    M   H    M
DNA                    H   H   H   L    H   L    L
Face                   H   L   M   H    L   H    H
Fingerprint            M   H   H   M    H   M    M
Keystroke              L   L   L   M    L   M    M
Hand G.                M   M   M   H    M   M    M
Iris                   H   H   H   M    H   L    L
Odor                   H   H   H   L    L   M    L
Retina                 H   H   M   L    H   L    L
Signature              L   L   L   H    L   H    H
Voice                  M   L   L   M    L   H    H
6.2 Background
6.2.1 Related Work
Human ear identification has attracted interest in recent years [9, 80]. In 1906, Imhofer found that in a set of 500 ears, only 4 characteristics were needed to establish that the ears were unique. The most famous work on ear identification was carried out by Alfred Iannarelli in 1989, who compared over 10,000 ears drawn from a randomly selected sample in California and found that all ears were different [59]. The following account summarizes previous work on ear identification [59]. Another study was conducted among identical and non-identical twins using Iannarelli’s measurements. The result was that ears are not identical; even identical twins had similar but not identical ears. After Iannarelli’s classification, different and more scientific methods for ear identification have appeared:
Carreira-Perpiñan in 1995 proposed outer ear images for human recognition and used compression neural networks to reduce the dimensionality of the images and produce a feature vector of manageable size.
Moreno et al. in 1999 presented a multiple identification method, which combines the results from several neural classifiers using outer ear feature points, information obtained from ear shape and wrinkles, and macro features extracted by a compression network.
Burge and Burger (1998 - 2000) obtained automatic ear biometrics with a Voronoi diagram of the ear's curve segments.
Hurley, Nixon and Carter (2000) used force field transformations for ear recognition. The image is treated as an array of Gaussian attractors that act as the source of the force field.
Victor, Chang, Bowyer and Sarkar (2002 - 2003) used a Principal Component Analysis (PCA) approach, and made a comparison between ears and faces.
The Ear Recognition Laboratory at USTB established, in 2003, an image database of 60 subjects (3 images per subject). Using kernel principal component analysis (KPCA), the ear identification rate was 94%. In 2004, they enlarged the image database to 77 subjects, 4 images per subject, with pose variation and lighting variation. For ear feature extraction, they proposed a novel recognition method based on local feature extraction, that is, extracting the shape feature of the outer ear and the structural feature of the inner ear, and then using a BP neural network for classification. In this way, the recognition rate reached 85%.
6.2.2 Ear
The ear structure is quite complex and has a great variety of classified zones [80]. The most important ear parts are: Helix, Shell (Concha) and Lobule (Figure 6.3).
Fig. 6.3 Ear parts
Researchers have suggested that the shape and appearance of the human ear are unique to each person and relatively unchanging during the lifetime of an adult, making the ear better suited for long-term identification than other biometrics, such as face recognition. Ear recognition is not affected by factors such as mood, health, and clothing. However, ear recognition has not received as much attention as other biometrics, e.g. face, fingerprint, and iris. For these reasons, the ear is taken as a biometric, since it presents a structural singularity that makes it practically impossible for two subjects to have the same physiological characteristic, even for “identical” twins.
We can mention some advantages and disadvantages of the ear biometric:
• Ears are smaller than other biometrics, which means reduced spatial resolution.
• The ear biometric can be taken with or without the subject's consent.
• Biometrics like iris, retina and DNA are more permanent than ear form. At the same level are fingerprint and hand geometry. Less permanent than ear form are signature, face and voice.
• Ear biometrics are not usable if the ear is covered, e.g. with a hat or hair.
• We have almost no adjectives to describe ears.
• Ear recognition systems can be fooled with methods like plastic surgery.
6.3 Ear Recognition Process
The ear recognition process used in this work is composed of the following steps [80]:
• Data acquisition
• Image pre-processing
  o Regions of interest (ROI)
  o Wavelets
• Neural network structure
• Neural network training
  o Scaled Conjugate Gradient (SCG)
  o Gradient Descent with Momentum and Adaptive Learning Rate (GDX)
• Modular integration
  o Sugeno Measures
  o Winner-Takes-All (WTA)
In the next sections we describe these steps, and we show the results obtained.
6.3.1 Data Acquisition
The images were acquired at the University of Science and Technology Beijing (USTB). The database contains 308 images from 77 subjects, 4 images per subject, with pose variation and lighting variation. The subjects are students and teachers from the department of Information Engineering. The database was formed between November 2003 and January 2005. For each subject, two images have angle variation and one has illumination variation. Each image is a 24-bit true color image of 300*400 pixels. The first and the fourth images are both profile images, taken under different lighting. The second and the third images have the same illumination condition as the first, but are rotated by +30 degrees and -30 degrees, respectively, with respect to the first one (Figure 6.4).
Fig. 6.4 USTB Database
6.3.2 Image Pre-processing
The original image is a true color (RGB) image. To obtain the coefficients, the image must have a color map, that is, the image must be indexed. First, we convert the image to grayscale and then from RGB to indexed. The figures below (Figure 6.5 and Figure 6.6) show how the system interprets the original RGB image and how it interprets the indexed image. As we can see, the RGB image loses too much information.
Fig. 6.5 RGB Image
Fig. 6.6 Indexed Image
The next step of image pre-processing was resizing the image from 300*400 to 50*75 pixels, taking the region of interest (ROI) to eliminate as much noise as possible (Figure 6.7).
Fig. 6.7 (a) Original Image. (b) Resized Image. Black square = ROI
For image compression, we used two-dimensional wavelet analysis with the global thresholding method, with two decomposition levels, the near symmetric wavelet (sym8), and 20% hard thresholding. The approximation coefficients were stored in a row vector for training. Basically, this method consists of taking the wavelet expansion of the signal and keeping the coefficients with the largest absolute values. In this case, one can set a global threshold, a compression performance, or a relative square norm recovery performance. Thus, only a single parameter needs to be selected. The following figures (Figure 6.8 and Figure 6.9) show the decomposition and compression process.
Fig. 6.8 Image Decomposition
Fig. 6.9 Image Compression
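The idea of keeping the largest-magnitude wavelet coefficients and zeroing the rest (hard thresholding with a single global parameter) can be sketched as follows. This is only an illustration on a flat list of made-up coefficients, not the chapter's implementation; reading the 20% setting as the share of coefficients set to zero is an assumption, and the function name and parameter are hypothetical.

```python
def hard_threshold(coeffs, zero_fraction):
    """Zero out the smallest-magnitude coefficients (hard thresholding).

    coeffs        -- flat list of wavelet coefficients
    zero_fraction -- share of coefficients to set to zero, in [0, 1]
                     (assumed meaning of the single global parameter)
    """
    n_zero = int(len(coeffs) * zero_fraction)
    if n_zero == 0:
        return list(coeffs)
    # Indices ordered by magnitude, smallest first.
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    dropped = set(order[:n_zero])
    # Surviving coefficients keep their exact value (no shrinkage).
    return [0.0 if i in dropped else c for i, c in enumerate(coeffs)]

# Made-up coefficients for illustration only.
coeffs = [4.0, -0.1, 2.5, 0.3, -3.2, 0.05, 1.1, -0.6, 0.2, 5.0]
compressed = hard_threshold(coeffs, 0.2)
print(compressed)
```

Hard thresholding leaves the surviving coefficients untouched, in contrast to soft thresholding, which would also shrink them toward zero.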
6.3.3 Neural Network Structure
The use of monolithic neural networks, such as a multilayer perceptron, has some drawbacks, e.g. slow learning, weight coupling, and the black box effect [59]. These can be alleviated by the use of a modular neural network (MNN). The creation of an MNN has three steps: task decomposition, module creation and decision integration. For our investigation, we decomposed our data as follows: 308 training images, divided into 104 images each for modules 1 and 2, and 100 images for module 3; and 77 identification images, divided into 26 images each for modules 1 and 2, and 25 for module 3 (Figure 6.10). Then, each module was divided as follows (Figure 6.11):
• Module 1 (Subjects 1-26)
  o Module 4: Helix
  o Module 5: Concha
  o Module 6: Lobule
• Module 2 (Subjects 27-52)
  o Module 7: Helix
  o Module 8: Concha
  o Module 9: Lobule
• Module 3 (Subjects 53-77)
  o Module 10: Helix
  o Module 11: Concha
  o Module 12: Lobule
Using the expression (2*(k+2), k+m, k):
• First hidden layer: 56 neurons for modules 1 and 2, 54 for module 3.
• Second hidden layer: 30 neurons for modules 1 and 2, 29 for module 3.
• Output layer: 26 neurons for modules 1 and 2, 25 for module 3.
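The layer sizes listed above follow directly from the expression (2*(k+2), k+m, k), where k is the number of subjects handled by a module. The text does not define m; the value m = 4 used in this sketch is an assumption inferred from the reported second-layer sizes (30 = 26 + 4 and 29 = 25 + 4). The function name is hypothetical.

```python
def layer_sizes(k, m):
    """Return (first hidden, second hidden, output) sizes from (2*(k+2), k+m, k)."""
    return 2 * (k + 2), k + m, k

M = 4  # assumption: inferred from the reported sizes, not stated in the text

for name, k in [("Modules 1 and 2", 26), ("Module 3", 25)]:
    print(name, layer_sizes(k, M))
```

With k = 26 this gives (56, 30, 26) and with k = 25 it gives (54, 29, 25), matching the list above.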
We used Sugeno measures and Winner-Takes-All to integrate modules 4, 5 and 6; modules 7, 8 and 9; and modules 10, 11 and 12. For modules 1, 2 and 3 we used a Gating Network to make the final decision.
[Figure blocks: Data Base; Image Preprocessing; Module 1 (Subjects 1-26); Module 2 (Subjects 27-52); Module 3 (Subjects 53-77); Gating]
Fig. 6.10 Schematic representation of the neural network architecture
[Figure blocks: three networks (Helix, Concha, Lobule), each with hidden layers numbered 1 to 56 and 1 to 30, feeding an Integration block]
Fig. 6.11 Neural network architecture for each module
6.3.4 Neural Network Training
For the 77 subjects, we trained with the 4 available images and used each image in turn for testing with cross validation. That is, for each module we used the first image for testing in the first iteration, the second image in the second iteration, and so on up to the fourth image. The learning algorithms used in this work were gradient descent with momentum and adaptive learning rate (traingdx) and scaled conjugate gradient (trainscg). The tables below (Table 6.2, Table 6.3 and Table 6.4) show the performance of these learning algorithms; trainscg was the best in terms of number of iterations, but not in time.
Table 6.2 Performance for trainscg algorithm. H=Helix, C=Concha, L=Lobule.
             Module 1          Module 2          Module 3
             H     C     L     H     C     L     H     C     L
Iterations   377   509   608   584   669   948   444   411   400
Table 6.3 Performance for traingdx algorithm. H=Helix, C=Concha, L=Lobule.
             Module 1          Module 2          Module 3
             H     C     L     H     C     L     H     C     L
Iterations   1000              1000
Table 6.4 Time comparison between SCG and GDX. SCG= trainscg, GDX= traingdx.
Train   Goal      Time
SCG     0.00001   30.37
GDX     0.00001   24.20
6.3.5 Modular Integration
The integrators used in this work were Sugeno Measures and WTA mechanisms. The following tables (Table 6.5, Table 6.6 and Table 6.7) show the recognition results obtained for each module using cross validation and Sugeno Measures as integrator.
Table 6.5 Recognition results for Module 1 with Sugeno integration. CV = Cross Validation
Module 1
Train      CV1     CV2     CV3     CV4     %
trainscg   25/26   26/26   24/26   26/26   97.11
traingdx   23/26   23/26   24/26   22/26   88.46
Table 6.6 Recognition results for Module 2 with Sugeno integration. CV = Cross Validation
Module 2
Train      CV1     CV2     CV3     CV4     %
trainscg   24/26   25/26   25/26   26/26   96.15
traingdx   22/26   23/26   24/26   23/26   88.46
Table 6.7 Recognition results for Module 3 with Sugeno integration. CV = Cross Validation
Module 3
Train      CV1     CV2     CV3     CV4     %
trainscg   24/25   25/25   25/25   25/25   99.0
traingdx   23/25   22/25   21/25   22/25   88.0
The following tables (Table 6.8, Table 6.9, and Table 6.10) show the recognition results obtained for each module using cross validation and Winner-Takes-All as integrator.
Table 6.8 Recognition results for Module 1 with WTA integration. CV = Cross Validation
Module 1
Train      CV1     CV2     CV3     CV4     %
trainscg   23/26   25/26   26/26   25/26   95.19
traingdx   24/26   23/26   25/26   24/26   92.30
Table 6.9 Recognition results for Module 2 with WTA integration. CV = Cross Validation
Module 2
Train      CV1     CV2     CV3     CV4     %
trainscg   24/26   23/26   24/26   26/26   93.26
traingdx   26/26   23/26   25/26   24/26   94.23
Table 6.10 Recognition results for Module 3 with WTA integration. CV = Cross Validation
Module 3
Train      CV1     CV2     CV3     CV4     %
trainscg   24/25   23/25   25/25   23/25   95.00
traingdx   24/25   23/25   24/25   23/25   94.00
The table below (Table 6.11) shows the final average obtained from each module.
Table 6.11 Final recognition result for Sugeno and WTA integration
Train      Sugeno Rec.%   WTA Rec.%
trainscg   97.42          94.48
traingdx   88.30          93.51
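The values in Table 6.11 can be recomputed from the per-run counts in Tables 6.5 to 6.10. The sketch below is a check under assumptions, not the author's procedure: it assumes that each module's rate pools its four cross-validation runs, that the final figure is the plain mean of the three module rates, and that values are truncated to two decimals. The helper name is hypothetical, and integer hundredths of a percent are used to avoid floating-point surprises.

```python
def rate_hundredths(correct, total):
    """Recognition rate in hundredths of a percent, truncated (assumed convention)."""
    return correct * 10000 // total

# Correct recognitions per run (CV1..CV4) for modules 1, 2 and 3,
# copied from Tables 6.5 to 6.10.
runs = {
    ("Sugeno", "trainscg"): [[25, 26, 24, 26], [24, 25, 25, 26], [24, 25, 25, 25]],
    ("Sugeno", "traingdx"): [[23, 23, 24, 22], [22, 23, 24, 23], [23, 22, 21, 22]],
    ("WTA", "trainscg"): [[23, 25, 26, 25], [24, 23, 24, 26], [24, 23, 25, 23]],
    ("WTA", "traingdx"): [[24, 23, 25, 24], [26, 23, 25, 24], [24, 23, 24, 23]],
}
images_per_run = [26, 26, 25]  # test images per run for modules 1, 2 and 3

final = {}
for key, modules in runs.items():
    per_module = [
        rate_hundredths(sum(counts), n * len(counts))
        for counts, n in zip(modules, images_per_run)
    ]
    # Mean of the three module rates, truncated, converted back to percent.
    final[key] = (sum(per_module) // len(per_module)) / 100

for key, value in final.items():
    print(key, value)
```

Under these assumptions the script reproduces all four entries of Table 6.11.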
We can compare our results with the work done by other researchers in the same field using various techniques [9]. Our recognition rate is good because the modular neural network architecture that we used helped us to avoid slow learning and made the recognition process easier.
6.4 Summary
In this chapter we proposed a method of human recognition based on human ear images using 2D wavelet analysis for preprocessing. Ear images are resized to a fixed size, followed by the selection of regions of interest. After that, a near symmetric wavelet with two decomposition levels is used for image compression. The ear database is used to train the proposed modular neural network architecture, which helps to improve the training process. In fact, if we want to add more subjects to the database, it is not necessary to start over; we just need to add more modules to our architecture, which is a great advantage compared with other similar works. The Sugeno Measures integrator obtains the best average recognition rate, 97.42% with the SCG algorithm (Table 6.11), versus 94.48% for the WTA integrator with the SCG algorithm. Both integrators are good; however, in this case it was not enough to choose a winner through the neural weights, as the WTA integrator does.
Chapter 7
Signature Recognition with a Hybrid Approach Combining Modular Neural Networks and Fuzzy Logic for Response Integration
This chapter describes a modular neural network (MNN) with fuzzy integration for the problem of signature recognition. Currently, biometric identification has gained a great deal of research interest within the pattern recognition community [59]. For instance, many attempts have been made to automate the process of identifying a person’s handwritten signature; however, this problem has proven to be a very difficult task. In this work, we propose an MNN that has three separate modules, each using different image features as input: edges, wavelet coefficients, and the Hough transform matrix. Then, the outputs from each of these modules are combined using a Sugeno fuzzy integral and a fuzzy inference system [65]. The experimental results obtained using a database of 30 individuals show that the modular architecture can achieve a very high recognition accuracy of 99.33% with a test set of 150 images. Therefore, we conclude that the proposed architecture provides a suitable platform to build a signature recognition system. Furthermore, we consider the verification of signatures in terms of false acceptance, false rejection and recognition error of the MNN.
7.1 Introduction
Recently, there has been an increased interest in developing biometric recognition systems for security and identity verification purposes [8, 9, 32, 59, 64, 65, 93, 107]. Such systems are usually intended to recognize different types of human traits, which include a person’s face, voice, fingerprints, and specific handwriting traits [40, 52, 74, 80]. In particular, the handwritten signature that each person possesses is widely used for personal identification and has a rich social tradition [59]. In fact, it is currently required in almost all types of transactions that involve legal or financial documents. However, it is not a trivial task for a computational system to automatically recognize a person’s signature, for the following reasons. First, there can be a great deal of variability when a person signs a document. This can be caused by different factors, such as a person’s mood, the time available to write the signature, and the level of concentration during the actual act of signing a document. Second, because signatures can be so diverse, it is not evident which type of features should be used in order to describe and effectively differentiate among them. For instance, some signatures are mostly written using straight line segments, while others have a much smoother form with curved and circular lines. Finally, many signatures share common traits that make them appear quite similar depending on the types of features that are analyzed.
In this work, we present a handwritten signature recognition system using Modular Neural Networks (MNNs) with the Sugeno fuzzy integral. We have chosen an MNN because MNNs have proven to be a powerful, robust, and flexible tool, useful in many pattern recognition problems. In fact, we only extract simple and easily computed image features during our preprocessing stage: image edges, wavelet transform coefficients, and the Hough transform matrix. The MNN we propose uses these features to perform a very accurate discrimination of the input data used in our experimental tests. Therefore, we have confirmed that an MNN system can solve a difficult biometric recognition problem using a simple set of image features.
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 77–92. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
7.2 Problem Statement and Outline of Our Proposal
The problem we address in this chapter is the automatic recognition of a person’s signature captured on a Tablet PC. We suppose that we have a set of N different people, and each has a unique personal signature. The system is trained using several samples from each person, and during testing it must determine the correct label for a previously unknown sample. The system we are proposing consists of an MNN with three separate modules [59]. Each module is given as input the features extracted with a different feature extraction method: edge detection, wavelet transform, or Hough transform. The responses from the modules are combined using a Sugeno fuzzy integral [65], which determines the person to whom the input signature corresponds. A general schematic of this architecture is shown in Figure 7.1, where all of the modules and stages are clearly shown.
Fig. 7.1 General architecture of the proposed Modular Neural Network for signature recognition
In the following section we present a brief review of some of the main concepts needed to understand our work.
7.3 Background Theory In this section we provide a general review of artificial neural networks and modular architectures, we discuss how the output from the modular system can be integrated using Sugeno fuzzy integrals, and we describe the feature extraction methods that provide the input for each of the modules in our MNN.
7.3.1 Modular Neural Networks
Artificial Neural Networks (ANNs) are information processing systems that employ a conceptual model based on the basic functional properties of biological neural networks [57, 59]. In the past twenty or thirty years, ANN research has grown very rapidly, in the development of new theories of how these systems work, in the design of more complex and intricate models, and in their application to a diverse set of problem domains. Regarding the latter, application domains for ANNs include pattern recognition, data mining, time series prediction, robot control, and the development of hybrid methods with fuzzy logic and genetic algorithms, to mention but a few examples [48, 49, 58, 60, 80, 93]. In canonical implementations, most systems employ a monolithic network in order to solve the given task. However, when a system needs to process large amounts of data or when the problem is highly complex, it is not trivial, and sometimes unfeasible, to establish a good architecture and topology for a single network that can solve the problem. For instance, in such problems a researcher might attempt to use a very large and complex ANN. Nevertheless, large networks are often difficult to train, and for this reason they rarely achieve the desired performance. In order to overcome some of the aforementioned shortcomings of monolithic ANNs, many researchers have proposed modular approaches [32, 59, 80, 93]. MNNs are based on the general principle of divide-and-conquer, where one attempts to divide a large problem into smaller sub-problems that are easier to solve independently. Then, these partial solutions are combined in order to obtain the complete solution for the original problem. MNNs employ a parallel combination of several ANNs, and normally contain two main components: (1) local experts; and (2) an integrating unit. The basic architecture is shown in Figure 7.2. Each module consists of a single ANN, and each is considered to be an expert in a specific task.
After the input is given to each module, it is necessary to combine all of the outputs in some way; this task is carried out by a special module called an integrator. The simplest form of integration is given by a gating network, which basically switches between the outputs of the different modules based on simple criteria, such as the maximum level of activation. However, a better combination of the responses from each module can be obtained using more elaborate methods of integration, such as the Sugeno fuzzy integral.
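The simplest integration rule mentioned above, switching to the module with the maximum activation, can be sketched in a few lines. This is an illustrative sketch only: the data layout (one list of class activations per module), the function name, and the sample values are assumptions, not the chapter's implementation.

```python
def winner_takes_all(module_outputs):
    """Pick the module with the highest activation and return its decision.

    module_outputs -- dict: module name -> list of class activations
    Returns (winning module, winning class index, activation).
    """
    best = None
    for name, activations in module_outputs.items():
        # Index of the strongest class within this module.
        top_class = max(range(len(activations)), key=activations.__getitem__)
        candidate = (activations[top_class], name, top_class)
        if best is None or candidate[0] > best[0]:
            best = candidate
    activation, name, top_class = best
    return name, top_class, activation

# Hypothetical activations of three modules over three classes.
outputs = {
    "module_1": [0.10, 0.72, 0.18],
    "module_2": [0.05, 0.91, 0.04],
    "module_3": [0.40, 0.35, 0.25],
}
print(winner_takes_all(outputs))
```

The rule ignores everything except the single strongest response, which is what more elaborate integrators such as the Sugeno fuzzy integral try to improve on.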
Fig. 7.2 Architecture of a Modular Network
7.3.2 Sugeno Fuzzy Integral The Sugeno fuzzy integral is a nonlinear aggregation operator that can combine different sources of information [12, 59, 65]. The intuitive idea behind this operator is based on how humans integrate information during a decision making process. In such scenarios it is necessary to evaluate different attributes, and to assign priorities based on partially subjective criteria. In order to replicate this process on an automatic system, a good model can be obtained by using a fuzzy representation. Finally, several works have shown that the use of a Sugeno fuzzy integral as a MNN integrator can produce a very high level of performance, and for these reasons we have chosen it for the system we describe here.
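To make the operator concrete, the following sketch computes a Sugeno fuzzy integral using a Sugeno λ-measure, which is one common construction. It is not taken from the chapter: the fuzzy densities (the importance assigned to each source), the scores, and the function names are hypothetical, and λ is found by simple bisection on the standard condition ∏(1 + λ·g_i) = 1 + λ.

```python
def solve_lambda(densities):
    """Solve prod(1 + lam*g_i) = 1 + lam for the non-trivial root lam > -1."""
    total = sum(densities)
    if abs(total - 1.0) < 1e-6:
        return 0.0  # densities already add up to 1: the measure is additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-9, -1e-9      # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0               # root lies in (0, inf)
        while f(hi) < 0:
            hi *= 2.0
    for _ in range(200):                 # plain bisection
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def sugeno_integral(scores, densities):
    """Sugeno integral: max over i of min(i-th largest score, measure of top-i sources)."""
    lam = solve_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    g_cum, best = 0.0, 0.0
    for i in order:
        # lambda-measure of the set grown by one more source
        g_cum = g_cum + densities[i] + lam * g_cum * densities[i]
        best = max(best, min(scores[i], g_cum))
    return best

# Hypothetical support for one class from three modules, with assumed densities.
scores = [0.9, 0.6, 0.3]
densities = [0.4, 0.3, 0.2]
print(sugeno_integral(scores, densities))
```

When used as an integrator, one such integral would be evaluated per candidate class, and the class with the largest value would be selected.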
7.3.3 Fuzzy Systems
Fuzzy theory was initiated by Lotfi A. Zadeh in 1965 with his seminal paper “Fuzzy sets”. Before working on fuzzy theory, Zadeh was a well-respected scholar in control theory [103]. A big event in the 1970s was the birth of fuzzy controllers for real systems. In 1975, Mamdani and Assilian established the basic framework of the fuzzy controller and applied it to control a steam engine. Their results were published in another seminal paper in fuzzy theory, “An experiment in linguistic synthesis with a fuzzy logic controller”. They found that the fuzzy controller was very easy to construct and worked remarkably well [37]. The fuzzy inference system is a popular computing framework based on the concepts of fuzzy set theory, fuzzy if-then rules, and fuzzy reasoning. It has found successful applications in a wide variety of fields, such as automatic control, data classification, decision analysis, expert systems, time series prediction, robotics, and pattern recognition [1, 6, 12, 20, 29, 83, 84].
7.3 Background Theory
The basic structure of a fuzzy inference system consists of three conceptual components: a rule base, which contains a selection of fuzzy rules; a database, which defines the membership functions used in the fuzzy rules; and a reasoning mechanism, which performs the inference procedure upon the rules and given facts to derive a reasonable output or conclusion.
7.3.4 Feature Extraction
In this work, we employ three individual modules, and each receives different image features extracted from the original image of a person’s signature. Each of these feature extraction methods is briefly described next.
7.3.4.1 Edge Detection
For images of handwritten signatures, edges can capture much of the overall structure, because people normally write using a single color on a white background. Hence, we have chosen to apply the Canny edge detector [64] to each image, which generates a binary image of edge pixels (see Fig. 7.3).
Fig. 7.3 (a) Original image of a signature. (b) Image edges.
7.3.4.2 Wavelet Transform
The wavelet transform decomposes a signal using a family of orthogonal functions; it accounts for both the frequency and the spatial location at each point. The most common application is the Discrete Wavelet Transform (DWT) using a Haar wavelet. The DWT produces a matrix of wavelet coefficients that allows us to compress, and if needed reconstruct, the original image [59]. Figure 7.4 shows the two compression levels used in our work.
Fig. 7.4 (a) Original image of a signature. (b) First level of decomposition. (c) Second level of decomposition.
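To make the two decomposition levels concrete, the following Python sketch applies one level of a 2D Haar decomposition and keeps only the approximation sub-band, which halves the image size at each level. It is an illustration under stated assumptions and not the code used in this work: it uses the simple averaging form of the Haar step (pairwise averages and half-differences) instead of the orthonormal scaling, it assumes even image dimensions, and the function names and the 4x4 sample image are hypothetical.

```python
def haar_1d(values):
    """One Haar step: pairwise averages followed by pairwise half-differences."""
    half = len(values) // 2
    avg = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    diff = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def haar_2d_level(image):
    """Apply one decomposition level to the rows and then to the columns."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def approximation(coeffs):
    """Top-left quadrant: the half-resolution approximation sub-band."""
    h, w = len(coeffs) // 2, len(coeffs[0]) // 2
    return [row[:w] for row in coeffs[:h]]


# Hypothetical 4x4 gray-level image.
image = [
    [4, 4, 8, 8],
    [4, 4, 8, 8],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
level1 = approximation(haar_2d_level(image))   # 2x2 image
level2 = approximation(haar_2d_level(level1))  # 1x1 image
```

With the averaging form, each approximation coefficient is the mean of a 2x2 block of the previous level, so the second level of this example reduces to the mean gray level of the whole image.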
7.3.4.3 Hough Transform In the third and final module we employ the Hough transform matrix as our image features. The Hough transform can extract line segments from the image. In Figure 7.5 we show a sample image of a signature and its corresponding Hough transform matrix. Finally, in order to reduce the size of the matrix, and the size of the corresponding ANN, we compress the information of the Hough matrix by 25%.
Fig. 7.5 (a) Sample of a signature image with some of the lines found by the Hough transform. (b) The Hough transform matrix (axes: ρ and θ).
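The Hough transform matrix used as the feature for the third module can be illustrated with a small accumulator in Python. This is a sketch and not the implementation used in this work: the angular resolution of one degree, the rounding of ρ to the nearest integer, and the function name `hough_accumulator` are assumptions, and the 25% compression step is not reproduced because its method is not specified here.

```python
import math


def hough_accumulator(edge_pixels, n_theta=180):
    """Vote in (rho, theta) space for every edge pixel (x, y).

    Rows index rho from -max_rho to +max_rho; columns index theta from
    -90 degrees in steps of 180 / n_theta degrees.
    """
    max_rho = max(int(math.ceil(math.hypot(x, y))) for x, y in edge_pixels)
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in edge_pixels:
        for t in range(n_theta):
            theta = math.pi * t / n_theta - math.pi / 2
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho


# Hypothetical edge pixels forming the horizontal segment y = 3, x = 0..9.
line = [(x, 3) for x in range(10)]
acc, max_rho = hough_accumulator(line)
peak = max(max(row) for row in acc)
```

All ten pixels of the segment vote for the same cell at θ = -90° and ρ = -3, so the peak of the accumulator equals the number of collinear pixels; neighboring cells may reach the same count because of the rounding of ρ.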
7.3.4.4 Verification of Signatures
Current security practice usually relies on PINs, passwords, and access cards. However, these credentials are not very reliable, since they can be forgotten or lost. Automatic signature verification is one of the most practical ways to verify a person’s identity, and it can be used in many applications such as security, access control, and financial or contractual matters. The process of signature verification often consists of a learning stage and a testing stage, as shown in Figure 7.6. In the learning stage, the verification system uses the features extracted from one or several training samples to build a reference signature database. In the testing stage, the user enters a signature through an input device. The system then retrieves the corresponding reference from the database and compares the features extracted from the input signature with that reference. Finally, the verification process outputs whether the test signature is genuine or not.
Fig. 7.6 Signature verification process (blocks shown: Input Signature, Data Capture, Preprocessing, Feature Extraction, MNN, Fuzzy Integrator, Reference Database, Verification, Result)
In the research area of signature verification, the type I and type II error rates are usually called the false rejection rate (FRR) and the false acceptance rate (FAR), respectively [59]. Minimizing type II errors, which represent the acceptance of counterfeit signatures, will normally increase type I errors, which are the rejections of genuine signatures. In most cases the type II error rate is considered more important, but this is not mandatory; it depends on the purpose, design, characteristics, and application of the verification system. If the system demands high security, the false acceptance rate should be reduced to its lowest value; if security is not so strict, the system can be adjusted to its lowest average error rate.
The fuzzy system returns, as the winner, the module with the greatest activation among the three modules; this means that all 27 rules of the fuzzy system are taken into account. Once the winner module is known, a verification step determines whether the signature indicated by the fuzzy integrator actually corresponds to the person. For this purpose we also conducted 30 trainings with 150 different samples of genuine signatures and computed the average of the resulting activations. This average is used as the threshold for deciding whether a signature is forged or genuine. Typically, an authentic signature produces a high activation and a forged signature produces a low activation, although this is not always the case: a forged signature may produce a high activation, or a genuine one a low activation. We therefore take into account four different cases:
1. False acceptance (FAR).
2. False rejection (FRR).
3. Error.
4. Authentic signature.
After taking the average of the activations as the reference, 85 samples of forged signatures from 17 persons and 65 authentic samples from 13 persons were collected, giving a total of 150 samples, forged and authentic, from 30 persons.
In total, the 210 signature images used to train each module are authentic. Table 7.1 shows the cases that can arise when verifying signatures, taking the average of the activations as the basis.

Table 7.1 Signature verification procedure
Recognizes   Overcomes threshold   Original signature   Result
Yes          Yes                   No                   False Acceptance
Yes          No                    Yes                  False Rejection
No           Yes                   Yes                  Error
No           No                    Yes                  Error
No           No                    No                   Correct
No           Yes                   No                   Correct
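The decision procedure of Table 7.1 can be written as a lookup keyed on the three conditions. The Python sketch below is illustrative and not the implementation used in this work: the strict comparison against the threshold, the label returned for combinations that the table does not list, the function names, and the sample data are all assumptions.

```python
from collections import Counter

# Rows of Table 7.1: (recognized, over_threshold, genuine) -> result.
TABLE_7_1 = {
    (True, True, False): "False Acceptance",
    (True, False, True): "False Rejection",
    (False, True, True): "Error",
    (False, False, True): "Error",
    (False, False, False): "Correct",
    (False, True, False): "Correct",
}


def verification_outcome(recognized, activation, threshold, genuine):
    """Classify one test signature using the rows of Table 7.1."""
    key = (recognized, activation > threshold, genuine)
    # Combinations absent from the table are reported as such, not guessed.
    return TABLE_7_1.get(key, "Not listed")


def outcome_percentages(samples, threshold):
    """Percentage of each outcome over (recognized, activation, genuine) samples."""
    counts = Counter(
        verification_outcome(r, a, threshold, g) for r, a, g in samples
    )
    return {name: 100.0 * n / len(samples) for name, n in counts.items()}


# Hypothetical samples, for illustration only.
samples = [
    (True, 0.92, False),   # forged, recognized, activation above threshold
    (True, 0.35, True),    # genuine, recognized, activation below threshold
    (False, 0.20, False),  # forged, not recognized
    (False, 0.15, False),  # forged, not recognized
]
rates = outcome_percentages(samples, threshold=0.6)
```

In the procedure described above the threshold would be the average activation obtained from the genuine training signatures; the value 0.6 used here is only a placeholder.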
7.4 Experiments In this section we present our database of signature images, describe our experimental set-up, and detail the experimental results we have obtained using monolithic and modular networks.
7.4.1 Image Database
For this work we built a database of images with the signatures of 30 different people, students and professors from the computer science department at the Tijuana Institute of Technology, BC, México. We collected 12 samples of the signature from each person, which gives a total of 360 images. Sample images from the database are shown in Figure 7.7.
Fig. 7.7 Images from our database of signatures. Each row shows different samples from the signature of the same person.
7.4.2 Experimental Setup
In this work, we are interested in verifying the performance of our proposed MNN for the problem of signature recognition. Therefore, in order to obtain comparative measures we divide our experiments into four separate tests.
1. First, we use each module as a monolithic ANN for signature recognition. Therefore, we obtain three sets of results, one for each module, where in each case a different feature extraction method is used.
2. Second, we train our MNN using all three modules concurrently and the Sugeno fuzzy integral as our integration method.
3. Third, we train our MNN using all three modules concurrently and the Fuzzy System as our integration method.
4. Fourth, we perform signature verification, measuring false acceptance and false rejection.
In all tests 210 images were chosen randomly and used for training, and the remaining 150 were used as a testing set. Additionally, after some preliminary runs it was determined that the best performance was achieved when the ANNs were trained with the Scaled Conjugate Gradient (Trainscg) algorithm, with a goal error of 0.001. Moreover, all networks had the same basic ANN architecture, with two hidden layers. In what follows, we present a detailed account of each of these experimental tests.
7.4.2.1 Monolithic ANNs
The results for the first monolithic ANN are summarized in Table 7.2. The table shows a corresponding ID number for each training case, the total epochs required to achieve the goal error, the neurons in each hidden layer, and the total time required for training. Recognition performance is shown with the number of correct recognitions obtained with the 150 testing images, and the corresponding accuracy score. In this case, the best performance was achieved in the third training run where the algorithm required 78 epochs, and the ANN correctly classified 131 of the testing images.

Table 7.2 Performance for a monolithic ANN using edge features (best performance: run 03)
No   Epochs   Neurons   Time       Correct   Accuracy (%)
01   80       100-100   00:01:11   123/150   82
02   59       100-100   00:01:18   120/150   80
03   78       100-100   00:01:07   131/150   87
04   90       100-100   00:01:26   117/150   78
05   80       100-100   00:01:08   119/150   79
06   78       100-100   00:01:34   123/150   82
07   53       100-100   00:00:46   123/150   82
08   79       80-90     00:01:08   123/150   82
09   55       80-90     00:00:56   128/150   85
10   58       80-90     00:00:58   122/150   81
The second monolithic ANN uses the wavelet features as input, and the obtained results are summarized in Table 7.3. In this case the best performance was obtained in the third training run, with a total of 5 epochs and 144 correctly classified images. These results indicate that wavelet features provide a very good discriminative description of the signature images we are testing.
Table 7.3 Performance for a monolithic ANN using wavelet features
Train   Epochs   Neurons   Time       Correct   Accuracy (%)
01      12       100-100   00:00:18   135/150   90
02      30       100-100   00:00:25   138/150   92
03      05       100-100   00:00:08   144/150   96
04      09       100-100   00:00:11   140/150   93
05      06       80-90     00:00:08   142/150   95
06      05       80-90     00:00:05   140/150   93
07      10       80-90     00:00:14   141/150   94
08      07       80-90     00:00:09   138/150   92
09      10       80-90     00:00:15   140/150   93
10      05       80-90     00:00:06   137/150   91
Finally, the third monolithic ANN uses the Hough transform matrix, and the corresponding results are shown in Table 7.4. The best performance is achieved in the fourth training run, with a total of 6 epochs and 141 correctly classified images.

Table 7.4 Performance for a monolithic ANN using the Hough transform
Train   Epochs   Neurons   Time       Correct   Accuracy (%)
01      63       100-100   00:00:19   135/150   90
02      65       100-100   00:00:51   140/150   93
03      68       100-100   00:00:19   141/150   94
04      06       80-90     00:00:08   141/150   94
05      04       80-90     00:00:05   140/150   93
06      45       80-90     00:00:11   138/150   92
07      08       80-90     00:00:09   138/150   92
08      05       80-90     00:00:08   137/150   91
09      05       80-90     00:00:06   137/150   91
10      33       50-50     00:00:20   138/150   92
It is important to note that in all three cases, the monolithic methods did achieve good results. The best performance was obtained using wavelet features, and the Hough transform matrix also produced very similar results. On the other hand, the simple edge features produced a less accurate recognition than the other two methods.
7.4.2.2 Modular Neural Network with Sugeno Fuzzy Integral
The final experimental results correspond to the complete MNN described in Figure 7.1, and Table 7.5 summarizes the results of ten independent training runs. For the modular architecture, performance was consistently very high across all runs, and the best recognition accuracy of 98% was achieved in half of the runs. In fact, even the worst performance of 95% is better than or equal to all but one of the monolithic ANNs (see Table 7.3).

Table 7.5 Results for the Modular Neural Network with Sugeno fuzzy integral
Train   Epochs   Time       Correct   Accuracy (%)
01      55       00:00:49   147/150   98
02      150      00:01:34   146/150   97
03      180      00:01:53   144/150   96
04      300      00:02:20   147/150   98
05      150      00:01:30   147/150   98
06      155      00:01:45   146/150   97
07      320      00:02:49   147/150   98
08      310      00:02:38   148/150   98
09      285      00:01:58   145/150   96
10      03       00:00:02   143/150   95
7.4.2.3 Modular Neural Network with a Fuzzy System
We use a fuzzy system as the integrator for the three modules of the network. The fuzzy systems are of Mamdani type and contain three inputs (module 1, module 2, module 3), one output (winner module), and 27 rules. Several tests were performed with fuzzy systems that have the same inputs, output, and rules, but different membership functions: triangular, trapezoidal, and Gaussian. Figures 7.8, 7.9, and 7.10 show the fuzzy systems with trapezoidal, triangular, and Gaussian membership functions, respectively.
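The three membership function shapes, and the way a Mamdani rule with three antecedents fires, can be sketched in Python as follows. This is an illustration and not the fuzzy system used in this work: the Gaussian parameters assigned to the linguistic labels, the use of the minimum as the AND operator, and the sample activations are assumptions. Three labels on each of three inputs give 3 × 3 × 3 = 27 antecedent combinations, which agrees with the number of rules stated above.

```python
import itertools
import math


def triangular(x, a, b, c):
    """Triangle with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoid with feet at a and d and plateau on [b, c] (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def gaussian(x, mean, sigma):
    """Gaussian bell centered at mean with spread sigma."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Assumed (mean, sigma) for the labels low, medium, high.
LABELS = {"Bajo": (0.0, 0.2), "Medio": (0.5, 0.2), "Alto": (1.0, 0.2)}


def firing_strength(activations, antecedent):
    """AND of the three antecedents, taken here as the minimum membership."""
    return min(
        gaussian(x, *LABELS[label]) for x, label in zip(activations, antecedent)
    )


antecedents = list(itertools.product(LABELS, repeat=3))  # 27 combinations

# Hypothetical module activations (edges, wavelet, Hough) evaluated against
# the antecedent "Bajo, Bajo, Alto".
strength = firing_strength((0.1, 0.0, 0.9), ("Bajo", "Bajo", "Alto"))
```

A complete Mamdani system would additionally clip each rule's output set by its firing strength, aggregate the clipped sets, and defuzzify the result to obtain the winner module; those steps are omitted here.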
Fig. 7.8 Representation of fuzzy systems with trapezoidal membership functions. Rules visible in the figure:
1. If (MBinaria is Bajo) and (MWavelet is Bajo) and (MHough is Bajo) then (ModGanador is Modulo2) (1)
2. If (MBinaria is Bajo) and (MWavelet is Bajo) and (MHough is Medio) then (ModGanador is Modulo3)
3. If (MBinaria is Bajo) and (MWavelet is Bajo) and (MHough is Alto) then (ModGanador is Modulo3) (1)
4. If (MBinaria is Bajo) and (MWavelet is Medio) and (MHough is Bajo) then (ModGanador is Modulo2) (1)
5. If (MBinaria is Bajo) and (MWavelet is Medio) and ...
Fig. 7.9 Representation of fuzzy systems with triangular membership functions
Fig. 7.10 Representation of fuzzy systems with Gaussian membership functions
Table 7.6 Results for the Modular Neural Network with fuzzy system
Train   Membership Function   Error goal   Epochs   Time       Correct   Accuracy (%)
01      Triangular            0.001        232      00:03:42   148/150   98.66
02      Triangular            0.001        560      00:09:15   146/150   97.33
03      Triangular            0.001        710      00:11:05   148/150   98.66
04      Trapezoidal           0.001        96       00:01:37   147/150   98.00
05      Trapezoidal           0.001        302      00:08:01   147/150   98.00
06      Trapezoidal           0.001        304      00:08:03   146/150   97.33
07      Gaussian              0.001        150      00:02:50   146/150   97.33
08      Gaussian              0.001        257      00:06:02   149/150   99.33
09      Gaussian              0.001        223      00:03:17   149/150   99.33
The results obtained with the fuzzy system as the integrator of the MNN, using different membership functions (see Table 7.6), were good. The best result was obtained in training run 09 with Gaussian membership functions, with a total of 223 epochs and 149 correctly classified images. The training method is scaled conjugate gradient (trainscg). This result exceeds the best result obtained with the Sugeno fuzzy integral (see Table 7.5).
7.4.2.4 Modular Neural Network with a Fuzzy System Adding Uniform Random Noise
After multiple tests with the fuzzy system, and taking into account that the best result was obtained with Gaussian membership functions, we applied uniform random noise to the signature images at a noise level of 0.5. Table 7.7 shows the top 10 results. The best training is the one in the second row, with a total of 146 correctly classified images.
Table 7.7 Result with uniform random noise
Train   Method     Time       Correct   Accuracy (%)
01      Trainscg   00:02:56   141/150   92.66
02      Trainscg   00:02:57   146/150   97.33
03      Trainscg   00:04:50   144/150   96.00
04      Trainscg   00:03:25   143/150   95.33
05      Trainscg   00:04:20   144/150   96.00
06      Trainscg   00:03:09   141/150   94.00
07      Trainscg   00:06:50   139/150   92.66
08      Trainscg   00:02:54   141/150   94.00
09      Trainscg   00:03:33   146/150   97.33
10      Trainscg   00:05:56   144/150   96.00
7.4.2.5 Results of Verification of Signatures Table 7.8 shows the results as a percentage for each case: false acceptance, false rejection, error recognition and the percentage of correct signatures.
Table 7.8 Results from the verification of signatures
Train   Time       False Acceptance (%)   False Rejection (%)   Error Recognition (%)   Correct Signatures (%)
01      00:06:32   9.33                   8.00                  1.33                    81.33
02      00:06:03   7.33                   18.00                 0.66                    74.00
03      00:07:32   10.00                  5.33                  2.00                    82.66
04      00:08:03   14.66                  2.00                  1.33                    82.00
05      00:08:02   7.33                   18.00                 0.66                    74.00
06      00:09:00   16.00                  4.00                  1.33                    78.66
07      00:06:06   6.66                   14.00                 1.33                    78.00
08      00:06:42   11.33                  10.66                 1.33                    76.66
09      00:06:52   15.33                  4.66                  2.00                    78.00
10      00:06:13   12.00                  6.00                  2.66                    79.33
7.5 Summary
In this chapter we have addressed the problem of signature recognition, a common behavioral biometric measure. We proposed a modular system using ANNs and three types of image features: edges, wavelet coefficients, and the Hough transform matrix. In our system, the responses from each module were combined using a Sugeno fuzzy integral and a fuzzy inference system. In order to test our system, we built a database of signature images from 30 different individuals. In our experiments, the proposed architecture achieves a very high recognition rate, which confirms the usefulness of the proposal. In our tests, the modular approach outperformed, to varying degrees, the monolithic ANNs tested here. However, in some cases the difference in performance was small, only 2 or 3 percentage points. Nevertheless, we believe that if the recognition problem is made more difficult, the modular approach will show its advantage in overall performance more clearly.
Chapter 8
Interval Type-2 Fuzzy Logic for Module Relevance Estimation in Sugeno Response Integration of Modular Neural Networks
8.1 Introduction
Aggregation has the purpose of making simultaneous use of different pieces of information provided by several sources in order to come to a conclusion or a decision. Aggregation operators are mathematical objects that reduce a set of numbers to a single representative number, and any aggregation or fusion process done with a computer underlies numerical aggregation [64]. Most aggregation operators use some kind of parameterization to express additional information about the objects that take part in the aggregation process; the parameters are then used to represent the background knowledge. Among all the existing types of parameters, fuzzy measures are a rich and important family. They are of interest because they are used for aggregation purposes in conjunction with fuzzy integrals such as the Choquet and Sugeno Integrals [65]. In this work we describe an image recognition method using Modular Neural Networks combined with the Sugeno Integral [65]. The information to be combined consists of the simulation outputs of 3 modules, each trained to recognize a different part of the image. The modular architecture consists of dividing each image into 3 parts after the edge detection process and using each part as training data for one of 3 monolithic neural networks. The problem then becomes how to combine the simulation outputs of the three modules in order to recognize the maximum number of images possible. We make the final decision using the Sugeno Integral, which combines the simulation vectors into a single vector; at the end of the method the system decides the best choice of recognition in the same manner as with a single monolithic neural network, but with the problem of complexity resolved [59]. The remaining problem is to find the ideal input parameters for the Sugeno Integral, that is, the values of the fuzzy density for each module, which rank its relevance in the decision process.
This problem was solved by building a FIS to estimate the fuzzy densities using only the simulation vectors as input variables. This step was implemented with two fuzzy logic systems, namely using Type-1 and Interval Type-2 Fuzzy Logic respectively [12, 13, 15, 17, 18, 19, 83, 84].
P. Melin: Modular Neural Networks and Type-2 Fuzzy Systems, SCI 389, pp. 93–105. springerlink.com © Springer-Verlag Berlin Heidelberg 2012
8.2 Modular Neural Networks
8.2.1 Modular Structure
The modular structure was designed for a database of images, such as the Olivetti Research Laboratory (ORL) database of faces, but it is not limited to this data. To measure the recognition rate objectively, we trained the modular neural networks with the set of images in random order; this process is called a random permutation [32, 63, 80, 93]. The Modular Neural Network consists of 3 monolithic feedforward neural networks, each one trained with a supervised method using the first 7 samples of each of the 40 persons in ORL. The edge vector of each image is accumulated into a matrix, as shown in the scheme of Figure 8.1. The complete matrix of images is then divided into 3 parts, and each module is trained with its corresponding part, with some overlapping rows.
Fig. 8.1 Input: Seven images for each person
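The division of the image matrix into three parts with overlapping rows can be sketched in Python as follows. This is an illustration and not the code used in this work: the number of overlapping rows is not stated here, so the `overlap` parameter, the function name `split_with_overlap`, and the 12-row example are assumptions.

```python
def split_with_overlap(rows, parts=3, overlap=1):
    """Split a list of rows into consecutive blocks that share `overlap`
    rows with each neighboring block."""
    n = len(rows)
    base = n // parts
    blocks = []
    for k in range(parts):
        start = max(0, k * base - overlap)
        # The last block absorbs any remainder rows.
        stop = n if k == parts - 1 else min(n, (k + 1) * base + overlap)
        blocks.append(rows[start:stop])
    return blocks


# Hypothetical matrix with 12 rows, each row identified by its index.
blocks = split_with_overlap(list(range(12)), parts=3, overlap=1)
```

Each block would then serve as the training input of one module, so that rows near a boundary are seen by both neighboring modules.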
The target for the supervised training method consists of one identity matrix with dimensions 40x40 for each sample, which together form one matrix with total dimensions 40x40x7, as shown in Figure 8.2.
Fig. 8.2. Target: One identity matrix with dimensions 40x40 for each sample
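The target described above, one 40x40 identity matrix per training sample, can be built as in the following Python sketch. It is illustrative only: the indexing order (sample, row, column) and the function names are assumptions, and a training toolbox may expect the seven identity matrices concatenated side by side instead.

```python
def identity(n):
    """n x n identity matrix as a list of lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def build_targets(n_classes=40, n_samples=7):
    """One identity matrix per sample, indexed as targets[sample][row][col]."""
    return [identity(n_classes) for _ in range(n_samples)]


targets = build_targets()
# Column j of each identity matrix is the desired output for person j:
# a one in position j and zeros elsewhere.
total_ones = sum(sum(sum(row) for row in matrix) for matrix in targets)
```

With 40 persons and 7 samples per person, the target contains 280 ones in total, one for each training image.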
Each monolithic neural network has the structure shown in Figure 8.3 and was trained under the same conditions:
• Three hidden layers with 200 neurons and tansig transfer functions.
• The output layer with 40 neurons and purelin transfer functions.
• The training function is gradient descent with momentum and adaptive learning rate back-propagation (traingdx).
Fig. 8.3 Structure of each monolithic neural network
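8.2.2 Training of the Modules
The following code segment was used to train each module, where newff is a Matlab Neural Network Toolbox function that creates a feed-forward back-propagation network with the specified parameters.

    % p: matrix of training input vectors, one column per sample
    layer1 = 200;   % neurons in the first layer
    layer2 = 200;   % neurons in the second layer
    layer3 = 40;    % neurons in the output layer
    net = newff(minmax(p), [layer1, layer2, layer3], {'tansig', 'tansig', 'logsig'}, 'traingdx');
    net.trainParam.goal = 1e-5;     % error goal
    net.trainParam.epochs = 1000;   % maximum number of training epochs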
8.2.3 Modules Simulation
A program was developed in Matlab that simulates each module with the 400 images of the ORL database [63], building a matrix with the results of the simulation of each module, as shown in Figure 8.4. These matrices are stored in the file “mod.mat” to be analyzed later for the combination of results.
Fig. 8.4 Scheme of simulation matrices for the three modules
In the simulation matrix, the columns that correspond to images in the training data and have a value near one are always selected correctly. However, some outputs for the training data have very low values in all positions, which is why it is very important to have a good combination method in order to recognize more images.
8.3 Sugeno Integral for Modules Fusion
To recognize one image, we divide it into three parts and then simulate each part using the corresponding module. Thus, for each image to be recognized we have three simulation vectors. To make the final decision, the three vectors must be fused using an aggregation operator such as the Sugeno Integral. In this section we provide some background on Sugeno Measures and the Sugeno Integral [32, 63, 64].
8.3.1 Sugeno λ-Measures
The Sugeno measures (λ-measures) are monotone measures, μ, that are characterized by the following requirement: for all A, B ∈ P(X), if A ∩ B = Ø, then:
$$\mu(A \cup B) = \mu(A) + \mu(B) + \lambda\,\mu(A)\,\mu(B) \tag{8.1}$$
where λ > −1 is a parameter by which different λ-measures are distinguished. Equation (8.1) is usually called the λ-rule. When X is a finite set and the values µ({x}) (called fuzzy densities) are given for all x ∈ X, the value µ(A) for any A ∈ P(X) can be determined from these values on singletons by repeated application of the λ-rule. This value can be expressed as:
$$\mu(A) = \frac{\prod_{x \in A} \bigl(1 + \lambda\,\mu(\{x\})\bigr) - 1}{\lambda} \tag{8.2}$$
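The equivalence between repeated application of the λ-rule and the closed form for µ(A), namely the product of (1 + λµ({x})) over x ∈ A, minus one, divided by λ, can be checked numerically. The Python sketch below is illustrative: the fuzzy densities and the value of λ are arbitrary example values, the function names are hypothetical, and the case λ = 0 is handled separately as the ordinary sum of densities.

```python
from functools import reduce


def lambda_rule(mu_a, mu_b, lam):
    """Measure of the union of two disjoint sets, as in the lambda-rule."""
    return mu_a + mu_b + lam * mu_a * mu_b


def measure(densities, lam):
    """Closed form for the measure of a set, given the densities of its elements."""
    if lam == 0:
        return sum(densities)  # additive case
    product = 1.0
    for d in densities:
        product *= 1.0 + lam * d
    return (product - 1.0) / lam


# Arbitrary example values.
densities = [0.2, 0.3, 0.1]
lam = 0.5

closed = measure(densities, lam)
# Add the singletons one at a time, starting from the empty set (measure 0).
stepwise = reduce(lambda acc, d: lambda_rule(acc, d, lam), densities, 0.0)
```

Both computations give the same value, 0.6565 for these example numbers, which is larger than the plain sum of the densities (0.6) because λ is positive.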
Observe that, given the values µ({x}) for all x ∈ X, the value of λ can be determined by the requirement that µ(X) = 1. Applying this requirement to equation (8.2) results in equation (8.3):
$$\lambda + 1 = \prod_{i=1}^{n} \bigl(1 + \lambda\,\mu(\{x_i\})\bigr) \tag{8.3}$$
for λ. This equation determines the parameter uniquely under the conditions stated in the following theorem:
Theorem 8.1. Let µ({x}) < 1 for all x ∈ X and let µ({x}) > 0 for at least two elements of X. Then equation (8.3) determines the parameter λ uniquely as follows:
If $\sum_{x \in X} \mu(\{x\}) < 1$, then λ is equal to the unique root of the equation in the interval (0, ∞), which means that µ qualifies as a lower probability, λ > 0.
If $\sum_{x \in X} \mu(\{x\}) = 1$, then λ = 0, which is the only root of the equation, which means that µ is a classical probability measure, λ = 0.
If $\sum_{x \in X} \mu(\{x\}) > 1$, then λ is equal to the unique root of the equation in the interval (−1, 0), which means that µ qualifies as an upper probability, λ