A ppIie d Computational Intelligence
This page intentionally left blank
Applied Computa tional Intelligence Proceedings of the 6th International FLINS Conference Blankenberge, Belgium, September 1-3, 2004
Edited by
Da Ruan, Pierre D’hondt Belgian Nuclear Research Centre (SCKCEN), Belgium
Martine De Cock, Mike Nachtegael & Etienne E. Kerre Ghent University, Belgium
vp World Scientific NEW JERSEY
L O N D O N * SINGAPORE * BElJlNG * SHANGHAI
HONG KONG
TAIPEI * C H E N N A I
Published by World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224 USA ofice: 27 Warren Street, Suite 401402, Hackensack, NJ 07601 UK ofice: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-PublicationData A catalogue record for this book is available from the British Library.
APPLIED COMPUTATIONAL INTELLIGENCE Proceedings of the 6th International FLINS Conference Copyright 0 2004 by World Scientific Publishing Co. Re. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, includingphotocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-238-873-7
Printed in Singapore by World Scientific Printers ( S ) Pte Ltd
FOREWORD FLINS, an acronym originally for Fuzzy Logic and Intelligent Technologies in Nuclear Science (1 994), has now been extended to Applied Computational Intelligence (2004) to advance the theory and applications of computational intelligence for complex systems in general and for nuclear/power research in particular. The main objectives/missions of FLINS are: (1) conducting research on applied computational intelligent systems for solving intricate problems pertaining to nuclear/power research and related complex systems; (2) bridging the gap between machine intelligence and complex systems via joint research with Belgian, European, and international research institutes and universities; and (3) encouraging interdisciplinary research and bringing multi-discipline researchers together via the international FLINS conferences on applied computational intelligence. In this framework, the first international FLINS workshop (FLINS’94) was held in Mol, Belgium, September 14-16, 1994. A special journal issue of FLINS’94 was published in Fuzzy Sets and Systems 74, Number 1, 1995. The second international FLINS workshop (FLINS’96) was again held in Mol, September 25-27, 1996.A special issue dedicated to FLINS’96 was published in International Journal of Intelligent Systems 13, Numbers 2-3, 1998. The third international FLINS workshop (FLlNS’98) was held in Antwerp, Belgium, September 14-16, 1998. A special issue of FLINS’98 was published in International Journal of General Systems 29, Numbers 1-2, 2000. The fourth international FLINS conference (FLINS 2000) was held in Bruges, Belgium, August 28-30, 2000. A special issue of FLINS 2000 on “Intelligent Information Systems and Applications” appeared in Information Sciences 142, Numbers 1 4 , 2002. The fifth international FLINS conference (FLINS 2002) was held in Gent, Belgium, September 16-18, 2002. A special issue of FLINS 2002 on “Information Fusion” has been just published in International Journal of General Systems 33, Numbers 2-3,2004. FLINS 2004 is the sixth in a series of FLINS conferences and aims at covering state-of-the-artresearch and development in all areas pertaining applied computational intelligence. It also includes both modern A1 related computational intelligence and classical computational methods with an emphasis on current and potential applications in complex systems. The major goals of this forum are to bridge the gap between theory and applications as well as to
V
vi
encourage interactions among scientists, researchers, and engineers from universities, research institutes, and industrial companies. FLINS 2004, coorganized by the Belgian Nuclear Research Centre (SCKoCEN) and Ghent University (UGent), and continuously co-sponsored by the Fund for Scientific Research-Flanders (FWO), offered a unique international forum to present and discuss techniques that are new and promising for applied computational intelligence and to launch international co-operations. The FLINS 2004 proceedings consist of a series of invited lectures by distinguished professors and individual presentations, in total 113 papers selected out of 162 regular submissions and 8 invited papers from no less than 29 countries. The volume begins with the list of our invited lectures: (a) Uncertainty modelling in real-time: from logistics to financial management by Hans-Jurgen Zimmermann, (b) Reasoning with uncertain intelligence information by Ron R. Yager, (d) An introduction to a self organizable & regulating engine by Paul P. Wang, (c) Hand gesture as a means of human-friendly interfacehnteraction by Zenn Z. Bien, (e) Higher-order neural units for image processing and their applications to robot routing problems by Madan M. Gupta, (0 Classifiers and decision makers by Javier Montero, (g) Fuzzy data mining: fizziness /uncertainty in discovery of associations by Guoqing Chen, and (h) Uncertain information processing: from theory to implementations by Yang Xu. The volume is followed by three contributed parts: (1) Foundations of computational intelligence, (2) Knowledge discovery and decision making, and (3) Applied research and nuclear applications.
To recognize our invited speakers’contributions to FLINS conferences, from FLINS 2004 onwards, one FLINS Gold Award, several FLINS Appreciation Awards, and FLINS Outstanding Service Awards will be issued for each FLINS conference. It is our great pleasure to announce that the first FLINS Gold Award goes to Professor Zadeh for his invention of fuzzy set theory and for his continuously supporting FLINS activities. Our 2004 FLINS Appreciation Awards go to Professors Zimmermann, Yager, Wang, Bien, Gupta, Montero, and Chen. Last but not least, our 2004 FLINS Outstanding Service Award goes to Professor Xu for his tremendous efforts to attract many Chinese researchers to FLINS 2004. At the same time, to recognize the best research in line with applied computationalintelligence at FLINS conferences in the fiture, several awards such as the best research paper, the best student paper, and the best nuclear
vii
application paper will be issued. The results of FLINS 2004 for such best papers (if any) will be announced at FLINS 2006 due to the peer-review by the FLINS committee. On the occasion of the 10th anniversary of the FLINS forum, I, on behalf of FLINS at SCK-CEN want to thank all contributors, all referees, all regular and invited sessions’ chairs, and all program committee members of FLINS 2004 for their kind co-operation in helping to prepare FLINS 2004. For the last 10 years, both Pierre D’hondt and Etienne Kerre have been very cooperative with all FLINS conferences by taking care of the organizations as advisors and by sharing the responsibilities of the FLINS proceedings as co-editors. Their valuable advice and many suggestionsaffected the success of all FLINS conferences.Martine De Cock and Mike Nachtegael (M&M) have been working as the conference co-managers since 2002 and they have now also joined the co-editorship of the proceedings of FLINS 2004. Without the intensive efforts of M&M, there would be no FLINS 2004. Remarkably, Paul Govaerts, General Manager of SCKoCEN, has been attending all FLINS conferences and welcoming all FLINS conferences’ participants since 1994 and deserves our 2004 FLINS Appreciation Award. Finally, all support from SCKCEN, UGent and FWO is greatly appreciated and acknowledged.
Da Ruan, FLINS 2004 Chair Mol, May 2004
This page intentionally left blank
CONTENTS
Foreword D. Ruan
V
Invited Lectures
1
Uncertainty Modelling in Real-Time: From Logistics to Financial Management H.-J. Zimmermann
1
Reasoning with Uncertain Intelligence Information R.R. Yager
1
An Introduction to a Self Organizable & Regulating Engine P P Wang and J. Robinson
1
Hand Gesture as a Means of Human-Friendly Interfacehteraction Z. Bien, J.-H. Do, J.-B. Kim, H. Jang, and D.-J. Kim
7
Higher-Order Neural Units for Image Processing and Their Applications to Robot Routing Problems Z.-G. Hou, K.-I: Song, and M.M. Gupta
13
Classifiers and Decision Makers J. Montero
19
Fuzzy Data Mining: FuzzinessKJncertainty in Discovery of Associations G. Chen
25
Uncertain Information Processing: From Theory to Implementations at Southwest Jiaotong University in China I! x u
26
ix
X
PART 1: FOUNDATIONS OF COMPUTATIONAL INTELLIGENCE
27
Progress and Prospect in Lattice-Valued Logic Systems Based on Lattice Implication Algebras E Xu, D. Ruan, and J, Liu
29
Implication Algebra by Meta-Theory of the Uncertainty G. Resconi
3.5
A Method of Constructing a Quasi-Lattice Implication Algebra C . Zeng
41
On the Construction of Finite Lattice Implication Algebras E Zhu
45
The Relations between Lattice Implication Algebras and Brouwerian Lattice X . Pan, E Xu, and J. Liu
51
Relating Intuitionistic Fuzzy Sets and Interval-Valued Fuzzy Sets Through Bilattices 0.Arieli, C. Cornelis, G. Deschrijver, and E.E. Kerre
57
Some Fuzzy LI-Ideals in Lattice Implication Algebras EL. Liu, S.E Liu, and I: Xu
65
LI-Filters in Lattice Implication Algebras EL. Liu, Y Xu, and S. E Liu
71
The Properties of Implication Operator in Lattice Implication Algebra L. yi, E Xu, and Z. Pei
77
L-Valued Modal Fuzzy Reasoning Based on the Lv pL Z. Ping and I: Xu
83
A Resolution Method on Lattice-Valued Tense Propositional Logic WLiandEXu
87
xi
Lattice Implication Algebra Model of Linguistic Variable Truth and Its Inference Z. Pei and I:Xu
93
The Logic Validity of Analogy Inference and Learning W Pan and I: Pan
99
a-Automated Reasoning Method Based on LP(X) W Wang, B. Jiang, and E Xu
105
Neural Networks = Fuzzy Rule Bases E. Kolman and M. Margaliot
111
Analysis of Weight Characteristics in a Multilayer Perceptron W Kastnel; T Forstel; and R. Hampel
118
Immune Emergent Computation-Based Bio-Network Architecture and Its Simulation Platform X. Zhang, I: Ding, L. Ren, and L. Gao
124
Fuzzy Analysis and Calculation Based on the Structured Element S.Guo
128
Ranking Fuzzy Numbers with the General Characteristic Z. Wang and Q. Mou
134
An Inclusion Measure for Rough Sets E Deng and I: Xu
138
A Semantic Model of Linguistic Truth Variable 2. Wu and I:X u
144
A Multi-Attribute Evaluating Approach based on Analysis and Learning of Attribute Coordinate J. Feng and N. Lu
148
xii
PART 2: KNOWLEDGE DISCOVERY AND DECISION MAKING
153
Mining Causality from Imperfect Data L.J. Mazlack
155
Discovering Frequent Patterns in Sequence Groups J. W Guan, D.A. Bell, and D. I:Liu
161
Empirically Validating an Adapted Classification Based on Associations Algorithm on UCI Data D. Janssens, I:Lun, G. Wets, G. Chen, and 1': Brijs
167
A Method for Association Rules Mining J. Ma, G. Chen, E.E. Kerre, and D. Ruan
173
Enriching the E W E R Model Based on Discovered Association Rules G. Chen, M.Ren, and X. Guo
179
An Extended Process Model of Knowledge Discovery in Database I: Li and D. Ruan
185
Intuitionistic Fuzzy Relational Databases and Translation of the Intuitionistic Fuzzy SQL B. Kolev
189
Efficient Clustering with Fuzzy Ants S. Schockaert, M. De Cock, C. Cornelis, and E.E. Kerre
195
Enhanced RBF Network Using Fuzzy Control Method K.-B. Kim, J.-W Moon, and J.-H. Nam
20 1
Pattern Recognition with Spiking Neural Networks and Dynamic Synapses 205 A. Belatreche, L.P Maguire, and T.M. McGinnity Evolving Autoregressive Neural Networks for the Problem of Modeling Nonlinear Heterogeneous Time Series N. Reyhani and M.Kharrat
211
xiii
Recognition of Identifiers from Shipping Container Images using Fuzzy Binarization and Neural Network with Enhanced Learning Algorithm K.B. Kim
215
Fuzziness-Driven Edge Detection Based on Renyi’s A-Order Fuzzy Entropy 1.K. Vlachos and G.D. Sergiadis
222
Statistical Analysis of ECG Signals with Wavelet Techniques and Methods of Non-Linear Dynamics I. Pdzsit
230
Classification of Two-Phase Flow Regimes via Image Analysis by a Neuro-Wavelet Approach C. Sunde, S. Avdic, and I. P&sit
236
Performing an Analysis of Fuzzy Fusion for Spermatozoa in Fertile Human J. Lu and I: Hu
240
Linguistic Summaries of Image Patterns H. Nair
246
Image Retrieval Using Linguistic Expressions of Colors A.A. Younes, I. Truck, H. Akdag, and I: Remion
250
A Coloring Algorithm for Image Classification D. Gdmez, J. Montero, J. YhAez, and C. Poidomani
258
Fuzzy Models to Deal with Heterogeneous Information in Decision Making Problems in Engineering Processes L. Martinez, J. Liu, and D. Ruan
266
Self-Tuning Method for Fuzzy Rule Base with Belief Structure J. Liu, L. Martinez, J.-B. Yang, and J. Wang
272
Aggregation of Fuzzy Opinions with the Mean Distance and a Similarity Measure under Group Decision Making J. Lan, I.: Xu, and J. Liu
276
xiv
A New Method with Projection Technique for Fuzzy Multi-Attribute Decision Making J. Lan, I: Xu, and J. Liu
282
A New Criterion for Fuzzy Multi-Attribute Decision Making J. Lan, I: Xu, and J. Liu
288
A Systems Analysis of Improving the Reasonability of Appraisement System of Human Resource X. Liu, S. Chen, and I: X u
292
A Model of Evaluation of an Appraisement System of Human Resource X. Liu, S. Chen, and I: Xu
296
An Algorithm for Linear Bilevel Programming Problems C. Shi, G. Zhang, and J. Lu
300
A Fuzzy Goal Approximate Algorithm for Solving Multiple Objective Linear Programming Problems with Fuzzy Parameters F Wu, G. Zhang, and J. Lu
304
A Kind of Fuzzy Least Squares Support Vector Machines for Pattern Classification S. Chen and I: Xu
308
Online Training Evaluation in Virtual Reality Simulators Using Evolving Fuzzy Neural Networks L.S. Machado and R.M. De Moraes
3 14
The Personalized PageRank Based on User Behaviors Z. Li, E Du, I: Xu, I.: Wang, and D. Qi
318
Impiementation of An Inteiligent Spider’s Algorithm for Search Engine I: Du, I.: Xu, Z. Li, and D. Qi
324
A Model for Semantic of Linguistic Information with Incomparability J. Ma, D. Ruan, and I: X u
328
xv
Fuzzy Adaptive Control Based on L-R Fuzzy Number and Application in the Pension Systems X. Liang, B. Tang, I: Xue, and Z. Zhu
334
Study on Intelligentized Split-Spread and Intelligentized RenewCompound of Multiplex Information-Flow in Working-Flow Management W Kou and E Xu
338
A Fuzzy Set Theoretic Method to Validate Simulation Models J. Martens, E Put, and E. E, Kerre
342
Application in Evaluation of Loess Collapsibility with Information Diffusion Technique W Xie, J. Wang, J. Yuan, and R. Li
350
Mode of Soft Risk Map Made by Using Information Diffusion Technique J. Zhang, C. Huang, and S. Qiao
358
Benefit of Soft Risk Map Made by Using Information Diffusion Technique 364 C. Huang and H. Inoue A Risk Assessment Model of Water Shortage and Its Application for Analyzing Bearing Capacity of Water Resources L. Feng and C. Huang
370
An Earthquake Risk Assessment Method Based on Fuzzy Probability I. Karimi, E. Hiillemeiel; and K.Meskouris
376
Reliability Theory Using High Conditional Probability Events M. Oussalah
382
PART 3: APPLIED RESEARCH AND NUCLEAR APPLICATIONS
391
A Fuzzy Impulse Noise Detection and Suppression Filter for Color Images 393 I: Zkiftzis and I. Andreadis Video Deinterlacing Using Adaptive Fuzzy Filters A. Sanz, E Ferndndez, J. Gutie'rrez, G. Triviiio, A. Sanchez, J. C. Crespo, and A. Mazadiego
397
xvi
Color Image Enhancement Method Using Fuzzy Surfaces in the Framework of the Logarithmic Models V Patrascu
403
GIS and Satellite Image Processing for the Study of Human Impact Assessment on Steppe Development in Uzbekistan I. Muminov and J. Benedikt
409
The Haar Wavelets in a Fuzzy System Form and an Application to the Johnson Noise Thermometry B.S. Moon, I.K. Hwang, C.E. Chung, and K.C. Kwon
416
Neural Network Based Sonar Target Differentiation WS. Lim, M.VC. Rao, and C.K. Loo
422
Systematic Design of a Stable Fuzzy Controller for a Robotic Manipulator Using Describing Function Technique E. Giirkan
428
An Intelligent Robot Overview for Medical In Vitro Fertilization J. Lu and Z Hu
434
Self-Adaptation of the Symbolic World Model of a Mobile Robot: An Evolution-Based Approach C. Galindo, J.-A. Fernandez-Madrigal, and J. Gonzalez
438
Multiple Objective Genetic Algorithms for Autonomous Mobile Robot Path Planning Optimization 0. Castillo, L. Trujillo, and F! Melin
444
Fine Tuning for Autonomous Vehicle Steering Fuzzy Control J.E. Naranjo, C. Gonzdlez, R. Garcia, i? de Pedro, and J. Revuelto
450
A New Sonar Landmark for Place Recognition A. Poncela, C. Urdiales, C. Trazegnies, and E Sandoval
456
Automatic Parking with Collision Avoidance D. Maravall, J. De Lope, and M.A. Patricio
463
xvii Modeling the Relationship between Nonwoven Structural Parameters and their Properties from Few Number of Data I? Vroman, L. Koehl, X. Zeng, and E Chen
47 1
A New Framework for Fuzzy Multi-Attribute Outranlung Decision Making 0. Kabak and F: Vlengin
477
Fault Diagnosis in Air-Handling Unit System Using Dynamic Fuzzy Neural Network J. Du and M.J. Er
483
Prioritizing Design Requirements Based on Fuzzy Outranking Methods C. Kahraman, 7: Ertay, and C.E. Bozdai
489
An Analytic Strategic Planning Framework for E-Business Projects G. Biiyiikozkun
495
Urban Land Development with Possibilistic Information I? Guo
501
An Application of Fuzzy AHPDEA Methodology for the Facility Layout Design in the Presence of Both Quantitative and Qualitative Data U.R. Tuzkaya and I: Ertay
507
An Intelligent Hybrid Approach for Industrial Quality Control Combining Neural Networks, Fuzzy Logic and Fractal Theory I? Melin and 0. Castillo
513
A Fuzzy Heuristic Multi-Attribute Conjunctive Approach for ERP Software Selection C. Kahraman, G. Biiyiikozkan, and D. Ruan
519
Estimation of Ease Allowance of a Garment using Fuzzy Logic I: Chen, X.Zeng, M. Happiette, I? Bruniaux, R. Ng, and W Yu
525
Multi-Attribute Comparison of Quality Consultants in Turkey Using Fuzzy AHP U. Cebeci
531
xviii
NARMAX-Model Based Time Series Prediction: Feedforward and Recurrent Fuzzy Neural Network Approaches !I Gao, M, J. El; and J. Du
537
Implementation of On-Line Monitoring Programs at Nuclear Power Plants J. W Hines and E. Davis
543
Prediction Interval Estimation Techniques for Empirical Modeling Strategies and Their Applications to Signal Validation Tasks B. Rasmussen and J. u! Hines
549
Nuclear Power Plant Monitoring with MLP and RBF Network K. Nabeshima, E. Ayaz, S. Sekel; B. Barutcu, E. Turkcan, and K. Kudo
557
Technical Aspects of the Implementation of the Fuzzy Decision Aid System Developed for the Belgian Radioactive Waste Management Agency PL. Kunsch, A. Fiordaliso, and Ph. Fortemps
56 1
Reactor Coolant Leak Detection System 1. Petruzela
567
A Fuzzy Clustering Approach for Transients Classification E. Zio and P: Baraldi
573
Frame Structure of China’s Nuclear Emergency Decision-Making Operation System (COSY) and Evaluating Subsystem Fuzzy Decision-Making Method N . Lu, J. Feng, and Z Zhang
579
Surveillance Tests Optimization of a PWR Auxiliary Feedwater System by Constrained Genetic Programming R.l? Baptista, R. Schirru, C.M.N.A. Pereira, C.M.E Lapa, and R.P Domingos
583
xix
New Proposal of Reactivity Coefficient Estimation Method Using a Gray-Box Model in Nuclear Power Plants M. Mori, !I Kagami, S. Kanemoto, T. Tamaoki, M.Enomoto, and S. Kawamura
589
HTGR Helium Turbine Conceptual Design by GenetidGradient Optimization L. Yang, S. Yu, G. Yang, and Z. Huang
597
Continuous and Discrete Methods for the Aggregation and the COG Stages in the Power Control of a TRIGA Reactor J.S. Benitez-Read, M. Ndjera-Hemdndez, and B. Pe'rez-Clavel
603
A Niching Method with Fuzzy Clustering Applied to a Genetic Algorithm for a Nuclear Reactor Core Design Optimization W E Sacco, C.M.N.A. Pereira, and R. Schirru
607
Estimation of Break Location and Size for Loss of Coolant Accidents Using Neural Networks M.G. Na, S.H. Shin, D. W Jung, S.P Kim, J. H. Jeong, and B. C. Lee
611
Characterisation of Size, Shape and Motion Behaviour of Insulation Particles in Coolant Flow Using Image Processing Methods A. Seeligel; R. Hampel, and S. Alt
617
Control of Chaotic Systems Using Fuzzy Clustering Identification and Sliding Mode Control H. Salarieh and A. Alas9
623
Secure Communication Based on Chaotic Synchronization I? Li, 2. Li, WA. Halang, and G. Chen
629
Transition between Fuzzy and Chaotic Systems 2. Li, I? Li, and WA. Halang
635
A Novel Chaos-Based Video Encryption Algorithm H. Jim, Z Mao, 2. Wang, Z. Li, and I? Li
641
xx
A Theory of Fuzzy Chaos for the Simulation and Control of Non-Linear Dynamical Systems 0. Castillo and F! Melin
649
Hardware Implementation of an Improved Symmetry Feature Point Extraction Algorithm D. Popescu and J. Zhang
655
Design of a Fuzzy Model-Based Controller for a Drum Boiler-Turbine System A. Habbi and M. Zelmat
659
Universal Takagi-Sugeno Fuzzy Controller Core Implemented in a PLD Device D. Oseli, M. Mraz, and N. Zimic
665
Parallel Pipeline Floating-Point Fuzzy Processor N. Popescu and J. Zhang
67 1
Dealing with Dynamic Aspects of Operators’ Performance G. Petkov
677
Author Index
685
Invited Lectures
This page intentionally left blank
UNCERTAINTY MODELLING IN REAL-TIME FROM LOGISTICS TO FINANCIAL MANAGEMENT
H.-J. ZIMMERMANN Inform GmbH, Aachen, Pascalstr.2, 0-52076 Aachen, Germany
Uncertainty has always been considered a problem in planning and decision making. In the past it has often either been neglected or one has tried to model it by using probabilistic models. This is certainly acceptable under two conditions: 1. The uncertainty has to be of stochastic character and 2. The planning is of the macroscopic, strategic kind. If this is not the case, the modeller has a number of choices: If he wants to model uncertainty explicitly he has to choose the appropriate uncertainty theory. This will depend on a number of factors: Type and quantity of available data, the type of uncertainty, the intended use and user of the model output , and the problem-imposed time constraints. It may turn out, however, that the explicit modelling of uncertainty leads either to too time consuming procedures or that the decision procedure has to be microscopic and then he may have to adopt the wait-and-see strategy. The first situation is typical for (strategic and tactical) planning, the second for control - as will be seen not only in the engineering sense. In this contribution 1 shall focus my attention on the latter situation. It becomes more and more frequent since speed and complexity get an ever increasing importance in the world of to day. This tendency cannot be attributed to only a few sectors of the economy but it is more or less present in most areas. We shall consider as examples two sectors which seem to be far apart from each other: Logistics and financial management. In logistics three sub areas will be considered: Real time control of in-house transportation, control in large container terminals and control of ground operations in airports. All three have the same characteristics. They are very complex, even the near microscopic future is uncertain , and decisions have to be taken on-line and determine very much the competitiveness of the operation. Systems exist, which satisfy these requirements. Since the structures normally require integer solutions the methods used to determine good , feasible or optimal solutions are from integer programming, heuristics and approximate reasoning. 3
4
From financial management I shall consider the area of fraud detection and fraud prevention in credit and debit card management and in insurances. Here the problem is to detect, for instance, fraudulent operations before financial losses have occurred. In card management this has to be done on-line and without molesting “honest” customers. One important difference to the logistic applications is, that here not only the future is uncertain, but also the knowledge about the fraudulent strategies, which are fast changing and require fast counter measures. Therefore successful systems are based predominantly on approximate reasoning and dynamic intelligent data mining. Examples will be presented and applied methods discussed.
REASONING WITH UNCERTAIN INTELLIGENCE INFORMATION RONALD R. YAGER
Machine Intelligence Institute, Iona College
New Rochelle, NY 10801, USA,
[email protected] In his report to Congress, David A. Kay, who led the US government's efforts to find evidence of Iraq's illicit weapons programs, reported that the current intelligence systems dealing with weapons of mass destruction are increasingly based on limited information. In light of this situation, he indicated that modern intelligence analysis systems need a way for an analyst to say, "I don't have enough information to a make a judgment," a capacity that he felt the current intelligence systems do not possess. Central to attaining this capability is the ability to deal with uncertain and imprecise information. We believe that fuzzy logic with its focus on uncertainty can help. It has the ability to simultaneously exploit both precise formal measurements of the type obtained from state of the art electronic and mechanical monitoring devices as well the type of imprecise information obtained from human sources which is often perception based and expressed in linguistic terms. Here we shall discuss some ideas from fuzzy logic and other related soft computing technologies with the goal of providing some tools necessary to supply this needed capability to systems used for intelligence analysis.
5
AN INTRODUCTION TO A SELF ORGANIZABLE & REGULATING ENGINE PAUL P. WANG Department of Electrical & Computer Engineering, Pratt School of Engineering, Duke University Box 90291, Durham, NC 27708, USA
[email protected], Voice: (+I) 919-660-5259 JOSHUA ROBINSON Pratt Undergraduate Fellow, Pratt School of Engineering, Duke University,Durham, NC 27708, USA,
[email protected] SORE (Self Organizable & Regulating Engine) is the most general Boolean network of which Kauffmann’s NK Boolean network is a special case. Kauffmann’s N K Boolean network imposes the severe restriction of K < N , which closely approximates the biological reality of K < N or K x'; (14) (x-->Y) -->y = (Y-->x) -->x (15) (X-->Y)-->Y) = (Y-->x~->x) They have some basic properties as follows: if I-->x = I ,then x = I; I->x = x and x-->O = x'; O->x = I and x-->I = I; The definition, along with the examples of quasklattice implication algebra, has been given in the relevant materials. In the reference"], the Lukasiewicz implication algebra based on the complete chain is stated as a particular case of * This paper is supported by the National Natural Science Foundation of P.R.
China (Grant no. 60074014) 41
42
lattice application algebra, and accordingly we construct some lattice application algebras (also they are quasi-lattice implication algebras) with the structure of Lukasiewicz. But, how to construct the implication relation so as to render those non-chained, and even chained algebras to be quasGlattice implication algebras remains unsolved. Since this is such an urging issue of practical concerns, the author in this paper attempts to address this construction problem by putting forward the following method. Algorithm
To define a bounded lattice& V ,A ,O,I, ') and the L is the set of finite symbols. To define the operation ->:L X G > L , and the results are showed in a data table ,which contains three attributes-Fir, Sec and Result-to illustrate the operation: Fir->Sec=Result. In the table we suppose the values of Result as a series of variables X1, X2,. . ,Xn and n is square number of symbols in L. According to the following attributes of quasi-lattice implication algebra: a. if I->x = I ,then x = I; b. I->x = x and x->O = 2 ; c. @>-x = I and x->I = I; the values of some unknowns in the dada table can be obtained and we name the new data table LA. 4) To take L and LA as Descartes product and this new table LAresult contains four attributes-x, y , z and resultyz-to illustrate the operation: x->(y->z) and get its result: resultyz. 5 ) To take LAresult in joint operation and name the new table LA-result-total which contains five attributes-x, resultyyz, y, resultxz and xh-to illustrate the operation: x->(y->z) = y->(x->z)(here, the x and resultyyz means x->(y->z) ,the y and resultxz means y->(x->z)), and the following SQL"] language shows us this joint operation: SELECT "la-result-a" ."x", "la_result_a"."resultyz", "la_result_b"."x", "la-result-b"."resultyz",number( *) FROM "la-result" "la-result-a", "la-result" "la-result-b" WHERE ( "la_result_a"."x" = "la_result_b"."y" ) and ( "la_result_a"."y" = "la_result-b"."x" ) and ( "la_result_a"."z" = "la_result_b"."z" ). 6 ) To select the part which doesn't contain variables from LA-result-total and name it LA-final (We sure that the part which contains variables are all satisfy x->(y->z) = y->(x->z)). 7) To make L and LA Descartes product and select the part which satisfies y=z, here this new table LA-result-I5 contains four attributes-x,y, resultxy
43 and z-to illustrate the operation: (x->y) ->y (here the result of x->y is resultxy). 8) To make LA-result-i5 joint with itself and name the new table LA-result-i5-total (The LA-result-i5total illustrates the equation: ((x>YF-'Y) = (Y-->X)-->x)). SELECT "la_result_i5_a"."resultyz", "la~result~i5~a"."z","la~result~i5~b~~.~~resu~tyz~~, "la_result_i5_b"."z" FROM "la-result-i5" "la-result-i5-a", "la-result-i5" "la-result-i5-b" WHERE ( "la_result_i5_a"."x" = "la_result_i5_b"."y" ) And ( "la_result_i5_a"."y" = "la_result_i5_b"."x" ). 9) To simplify the table LA-result-i5-total and add it to LA-final. 10) To disassemble LA-final and name the new table LA-final-sort(0) which contains four attributes-Item1 item1 , Item2, Result and Xh (In the mean time, to delete the redundant data and use variables for the unknown results). 11) To classify LA-final-sort(0) and name it F(0) according to the results. 12) Let i=lJi=l. 13) To pick up the ith variable. 14) Assume its value to be the ji th member, and substitute this value for the corresponding variable in item1 and item2 of LA-final-sort(i-1), then redraw the result and get the new table LA-final-sort(i). 15) To classify LA-final-sort(i) and name it F(i) according to the results. We get the F(i-1) Lower Approximation sets of all the sets of F(i), then get the union[31[41of all the F(i-1) Lower Approximation sets. If the union is equal to LA-final-sort(O), it means it's consistent with LA-final, so we get the right value of the unkown, let i=i+l, j, =1 and turn to 16; if not, then it is not consistent with LA-final and therefore it is not the right value of the unknown. In this case we turn to 19. 16) If i is more than the elements' number in L, then let i=i-1 and turn to 17; Otherwise turn to 15. 17) If i =O then the Algorithm shall end; Otherwise turn to 19. 18) Ifji is more than the elements' number in L, then let i=i-1 and turn to 17. Otherwise turn to 15 19) Let ji =ji +1 and turn to18. 3.
Example
As a example, a quasi-lattice implication algebra is constructed according to the algorithmin what follows : The definition of partial order in L-here item1L={ O,a,b,I}---is item1 O n-1 we have
a,+(q +ad
=
a2(n-l).(1a,-k) = q+(a,+ak).
Noticing i S k 5 j implies i 5 n-l-j+k, we have
a,-+(Cz/-+ak) = a,+an++k= I, q+(a,+ak) = q+I = I. This proves that al-+(q+ak)= q+(al+ak)when i I k I j . Similarly, we can prove the equation also holds when j I k I i. (F2): According to the six cases of i 5 j S k, j I i 5 k, i I k 5 j , j I k Ii, k S i 5 j and k S j I i, respectively, we can easily cheek that (a,+q) + ((q+ak) +(a,+ak))= 1, Val,q,a&(n). (F6): While i >j we have (a,+q)+q = an-l-l+,+q = al, (q-+a,)-+a, = h a l = a,. This shows that (a,+q)+q = (q-+a,)+a, in the case of i > j . Similarly, we can prove the equation also holds in the case of i 5 j . For (F3), (F4) and (F5), we easily check they hold directly from the above operation table. Thus, we prove that (L(n), -+, e ) satisfies all axioms of commutative FI-algebras. Our proof is completed. Theorem 2. Let L(n):= { 6 , al,a2,---, an-2,I }, and (L(n), 2) a chain: e = a 4 al -< ai 4 ...4 ane2 4 an-l= I. Then there exists a unique lattice implication algebra (L(n),v, A, +,I , 0 , I ) with the derived order I, and it is isomorphic to the n-valued tukasiewicz logic system. Proof. By Theorem 1 we only need to prove that if a lattice implication algebra (L(n), -+, e ) with the derived order I given in (L(n), I)then its implication operation “+” is well-define as in Theorem 1, and (L(n),+, 8 ) z (In,
+>O). In fact, since the derived order in the lattice implication algebra (L(n), +,
) is consistent with the partial ordering I in (L(n), I),that is, a I b e a-+b = 1, we have that a,+a, = I for 0 I i Ij S n-1. In order to prove the implication
49
operation "+" in (L(n),+, 6 ) is well-define as in Theorem 1, we only need to for 0 Ij I i I n-1. prove that a,+q = First, by Lemma l(4) we have a,+& Ial+al Sa,+a21 ... Sal-+al-l Ia,-+a,= I. We claim that a,-+a& # a,-+a,+,, s = 0, 1, 2,..., i-1. Otherwise, we have that al+aA= al-+ay+l, and also we have ((a,+aJ+a,')' = ((a,+a,+++a,')'. This shows that a, A as= aIAa,+ by Lemma l(2). This implies av=a,+lby a,I a, and as+] I a,, however, this is a contradiction. So we have al-+as< al-+a,+l. Next, by Lemma l(4) we obtain 6 = a,,,.l+& I la,+&. We claim that al+l+a,,# al-+&,t = i, i+l;.., n-2. If not, then we have that a,++@ = a+&, and also we have (a,++&)+& = (al+~)+cro. This shows that by Lemma l(2). This implies a,+l= al by v = a, v I a, and I however, this is a contradiction. So we have al+l+&< a, +&.
Summing up the above results, we obtain the differential n elements of L(n): an-l+&, a,,-2-+~; .-,a,+I+&, a,+@, a,+al, aI-+a2;-. ,a,+a,-~,a,+a,. Since they form a chain according to the partial ordering 5 on L(n), we obtain a,,-+& = Q, a,,-2+&= al;.., a,+I+q= a,,+2, a,+al.l = a,,.2,a,+a, = anel. a,+& = a,,+l,al-+al= a,,-l,--., This shows that a,+q = a,,-l-r+J for 0 lj S i I n-1. Finally, we define a map L(n) I,,by t p(a,)=---, t = 0 , l,2,...,n-l .
-
fl-1
Then we can easily cheek that cp is an isomorphism, and so (L(n), -+, 6 ) E (I,,,
+,0). Summing up the above discusses, the proof of Theorem 2 is completed. Corollary. The n-valued Lukasiewicz logic system (L(n),v, A, +, 6 ,I ) is a lattice implication algebra. Suppose (L, v, A, +, I , 0, 1) is a lattice implication algebra. Let L(a): = { x EL:x+a = a } , for any a EL.Obviously, we have that 1EL.Let Ndenote the set of the natural numbers, for any a, XEL,n E h i : we define u'+x = u+x, and an+ x I,
= U+(P+X).
Theorem 3. Let (L, v, A, -+, I , 0, 1) is a finite lattice implication algebra with the derived order 1.Then (L, I)is a chain if and only if L(a) = { l},V U E L a n d a # 1. is a chain of n elements. Proof. For the necessity, we assume that (L, I) Then, from the operation table given in Theorem I and the uniqueness in
50
Theorem 2, we immediately see that a z 1 and x -+a = a imply x =1, that is, L(a) = (1). For the sufficiency, we have that L(a) = { l},V a e L and a z 1. Suppose that (L, I) is not a chain, then there exist x, EL such that not either x I y or y Ix, i.e., x+y z 1 and y+x z 1. Let u = x-+y, v = y-+x. We claim that u”+v z 1 for any n e N Otherwise, we have that um+v = 1 for some mGN Let k be the least natural number such that uk-+v = 1. If k = 1, then u < v, and so u v v = v f 1, however, this is a contradiction by Lemma l(3). If k > 1, then it follows from u+(uk-’+v) = uk-+v = 1 that u I uk-l+v. In addition, by Lemma l(5) we easily obtain v < uk-’+v, so u v v 5 uk-’+v. By Lemma l(3) we have that uk-‘+v = 1, this is contrary to the choosing of k. Summing up the above discussion, we show that U”+V z 1 for any n E h ! On the other hand, considering the following chain in (L, I): u-+v Iu2+v Iu3 -+v I...< Urn++ I-.., since L is a finite set, we obtain that there exists sellrsuch that u+(u.’ +v) = us+’ +v = d+v. Thus, by u # 1 and the condition of the sufficiency we have that u,‘ -+ v = 1, this is contrary to the above conclusion which is proved. The proof of the sufficiency is completed. Acknowledgments
The author expresses his sincere thanks to Professor Yang Xu for his valuable suggestions. References
1. 2. 3. 4. 5. 6.
7. 8. 9.
W. M. Wu, Fuzzy Systems and Mathematics 4( l), 56 (1990). Y. Xu, J. of Southwest Jiaotong Univ. 28( l), 20 (1993) (in Chinese). Y. Xu, and K. Y . Qin, J. Fuzzy Math. 12( l), 25 1(1993). Y. Xu, D. Ruan, K. Y. Qin and J. Liu, Lattice-Valued Logic, SpringerVerlag Berlin Heidelberg, Printed in Germany (2003). Y.B. Jun, Bull. Korean Math. SOC.34(2), 193 (1997). E. H. Roh, S. Y.Kim, Y.Xu and Y. B. Jun, International J. of Mathematics and Mathematical Sciences 27( l), 45 (2001). Y. Q. Zhu and W. B. Tu, Bull. Korean Math. SOC.38(1), 191(2001). P. Hajek, Metamathematics of Fuzzy Logic, Kluwer Academic Publishers, Dordrecht (1998). G. J. Wang, Non-classical Mathematical Logic and Approximate Reasoning, Chinese Scientific Press, Beijing (2000) (in Chinese).
THE RELATIONS BETWEEN LATTICE IMPLICATION ALGEBRAS AND BROUWERIAN LATTICE* XIAODONG PAN Department of Applied Mathematics, Southwest Jiaotong University Chengdu 6/0031,Sichuan, P.R. China E-mail: xdpan I @I 63.com YANG XU Department of Applied Mathematics, Southwest Jiaotong University ,Chengdu 610031,Sichuan, P.R. China E-mail:
[email protected]. cn
JUAN LIU Department of Applied Mathematics, Southwest Jiaotong University ,Chengdu 61003I,Sichuan, P.R. China Brouwerian lattice and lattice implication algebra both are important generalizations of Boolean algebras. In this paper, we investigate the relations between them, and verify that we can construct a lattice H implication algebras by closed elements of any Brouwerian lattice with 0.
1.
Introduction
Non-classical logic has become a considerable formal tool for computer science and artificial intelligence to deal with fuzzy information and uncertain information. Many-valued logic, a great extension and development of classical logic [l], has always been a crucial direction in non-classical logic. In order to research the many-valued logical system whose prepositional value is given in a relatively general lattice, Xu [2] proposed a kind of logical algebraic structurelattice implication algebras by combining algebraic lattice and implication algebras within the framework of the axiomatic definitions on operators and discussed some of its properties. Since then this logical algebra has been extensively investigated by several researchers (see e.g. [4,5,6,7,8]). In [9] Xu discussed lattice properties in lattice implication algebras. In [lo] Xu investigated a special class of lattice-complemented lattice. In [3] Xu and Qin introduced the notion of lattice H implication algebras, which is equal to * This work is supported by grant 60074014 the National Natural Science
foundation of P.R. China. 51
52
Boolean algebras. In a sense, lattice implication algebras is an extension of Boolean algebras. In connection with the foundations of logic, in [ 111 Brouwer and Heyting characterized an important generalization of Boolean algebrasBrouwerian lattice. It seems that there exist some connections between lattice implication algebras and Brouwerian lattice. In this paper, we investigate the relations between lattice implication algebra and Brouwerian lattice, and verify that we can construct a lattice H implication algebra by closed elements of any Brouwerian lattice with 0. 2. Preliminaries
In this section, we give some basic information on the lattice implication algebras and Brouwerian lattice. Definition 2.1 (Birkhoff [ll]). A Brouwerian lattice is a lattice L in which, for any given elements a and b, the set of all x E L such that a A x Ib contains a greatest element b:a, the relative Pseudo-complement of a in 6 . Theorem 2.2 (Birkhoff (111). Any Brouwerian lattice is distributive. Remark: One easily verifies that any Boolean algebra is a Brouwerian lattice, and any chain is also a Brouwerian lattice, and any finite distributive lattice is a Brouwerian lattice. Definition 2.3 (Xu [2]). A bounded lattice (L, V,A , 0, 1) with order-reversing involution ’ and a binary operation is called a lattice implication algebra if it satisfies the following axioms: (11) x + ( y -+ z ) = Y -+ (x 7% 4, (12) x -9 x = 1,
+
(I3)
x -+ y
(14)
x
(15)
(x+Y)+y=(Y-+x)+x,
= y‘ 3 x’,
+y = y 3x = 1
sx =y ,
( L I ) ( X V y ) + Z = (X -+Z ) A (Y -3 Z ) , (Lz) (x A y ) + z = (x + z) v ( y + 4 , for all x , y , z E L . A lattice implication algebras L is called lattice H implication algebras if it satisfies x v y v (x A y ) z) = 1 , for any x,y,z E L Theorem 2.4 (Xu [2]). Let L be a lattice implication algebra, then (L,V,A) is a distributive lattice. Theorem 2.5 (Jun et al. [ 5 ] ) . Let L be a lattice implication algebra. Every LIideal I of L is a lattice ideal.
+
53 3.
Relations between Brouwerian lattice and lattice implication algebras
Next, we investigate the relations between Brouwerian lattice and lattice implication algebra. Theorem 3.1 (Birkhoff Ill]). The ideals of any distributive lattice L form a complete Brouwerian lattice. Theorem 3.2 (XU et al. [31). (L, V ~ A , ' , + , 0, 1) is a lattice H implication algebra if and only if ( L Y v , ~ ,is' )Boolean lattice, xr is the complement of x and x+y=x'vy. Theorem 3.3. The LI-ideals of any lattice implication algebra L form a complete Brouwerian lattice. ProoJ: Every LI-ideal of L is a lattice ideal by Theorem 2.5. The LI-ideals of L form a complete Brouwerian lattice by Theorem 3. I and Theorem 2.4. 0 Theorem 3.4. Let L be a finite lattice implication algebra, then (L.,v,A) is a Brouwerian lattice. ProoJ: Since any finite distributive lattice is a Brouwerian lattice. Hence, this is immediate from Theorem 2.4. 0 Theorem 3.5 (Birkhoff [ll]). A complete lattice is Brouwerian if and only if the join operation is completely distributive on meets, so that: a A VX, = v ( a A x,) for any set { x, } . Theorem 3.6 (Liu [12]). Let L be any lattice implication algebra, t h e n ( L , v , ~ ) satisfies infinite distributive law. By Theorem 3.5 and Theorem 3.6, we can obtain the following conclusion: Theorem 3.7. Let L be a complete lattice implication algebra, then (L,V~A) is a Brouwerian lattice. 0 In ill], let L be any Brouwerian lattice with 0, the element O:a is called the pseudo-complement of a, and is denoted a*. Lemma 3.8. Let L be any Brouwerian lattice with 0, then for any a,b E L , (1). If a 2 b , then b' I a' ; (2). a I a" ; 1 *** (3). a = a ; (4). (a V b)' = U* Ab', ( a A b)*2 a* V b*. Prooj Since a < b implies u Ab* 2 b~ b* = 0 , that is, a 5 b implies b' 5 a' . (1) holds. By definition of a * , (2), (3) are trivial. We only need to prove that (4) holds. For any a,b E L , by (1) we have (a v b)' I a* A b' . On the other hand, by Theorem 2.2 we know that L is distributive, hence
54
b') = ( a a'~ A b') v (bA a' A b') I (0 A b') v (0 A a') = 0. This implies that a' A b' i (a v b)' . It follows that (a v b)' = a' ( a A b)' 2 a' v b' can be proved similarly. Hence, (4) holds. 0 ( a v b)r\ (a'
A
A
b' .
Let B be the set of "closed" elements satisfying u=u**, in which we define a new binary operation "V " as join operation, and satisfies a V b = ( a v b)" . Lemma 3.9. Let L be any Brouwerian lattice with 0, for any a, b E L (a A b)' = a*Vb*. Proof Suppose x A a A b = 0 in L, define y = x A a" A b" . We obtain y ~ a ~ b =, which O implies y ~ I ba' . But y ~ a l yb"l by definition of y ; hence y A a I b" A b" = 0 . This implies y I a' ; but * ** y I a by its definition; hence y I a A a = 0 . In summary, x A a A b = o implies x A a * * A b * * = O , that is, ( a A b)' I (a" A b")' . But the reverse inequality is obvious; hence (a A b)' = (a" A b**)* . On the other hand, (a' v b')' = a" A b" , and so ( a A b)' = (a" A b")' = ((a' v b ) ) = a'Vb' . 0 Theorem 3.10. Let L be any Brouwerian lattice with 0, then (B,V,r\,*) is a Boolean lattice. Proof Since 0 E B is obvious, B is a nonempty subset of L. For all x E B ,0 A x = 0 implies 0' 2 x , hence, B has the universal upper bound 0'. If a,b E B ,then a = a" ,b = b" . By Lemma 3.8, it follows that ( a A b)" I (a' v b')' = a A b I (a A b)'" , and hence a r \ b = ( a r \ b) * E B . On the other hand, we know that ( a v b)**E B and a v b I ( a v b)" are obvious. Hence ( a v b)" is an upper bound of a, b in B. suppose f = f** E B is an upper bound of u, b in B, thatistosay a 5 f a n d b l f , t h e n a v b I f . H e n c e l * I ( a v b ) * , t h i s implies ( a v b)" lf** = f . Therefore aVb = ( a v b) is the least upper bound, B is a lattice. Next, we prove that B is distributive. For any a, b,c E B , by Theorem 2.2, Lemma 3.8 and Lemma 3.9, we have a A (bVc) = a" A (b v c)" = (a' v (b' A c'))' = ((a' v b') A (a' v c'))' = (a' v b*)*V(a*v c')' = (a" A b**)V(a**A c**)= ( a A b)V(a A c ) . Hence, B is distributive. Finally, we prove that there exists complementary element in B for any a E B . First, a A a' = 0 is obvious. Since 11 (a v a')' = a A a = 0 , it follows that ava' = 0' , that is, a' is the complement of u in B. This proved that B is a Boolean lattice, ending the proof I.
'
I
.
It
0 Theorem
3.11.
Let
L
be
any
Brouwerian
lattice
with
then
0,
(B, V,A,*,+,O,O*) is a lattice H implication algebras, where mapping satisfies a -+ b = b : a forany a , b E B .
"
-+"
55
Prooj By Theorem 3.2 and Theorem 3.10, B constructs lattice H implication algebras. We only need to prove that a b = b : a for any a, b E B . By Theorem 3.2 it suffices to show that b : a = a* v b for any a, b E B . In order to prove this, on the one hand we have
-+
a A ( a *v b ) = ( a ~ a * ) v ( a ~ b ) = a ~ b I b , this implies a* v b 2 b : a ; On the other hand, a A ( b : a ) I b implies (a A (b : a)) v a* I b v a * , i.e., ( a v a * ) A( ( b :a ) v a * )= 0’ A ( @ : a ) v a’) = ( b :a ) v a* 2 b v a * , this shows that (b : a ) I b v a*. Therefore b : a = a* v b . This completes the proof. By Theorem 3.1 1 and properties of lattice H implication algebras, the following conclusions are trivial. ** Corollary 3.12. Let L be any Brouwerian lattice with 0, if it satisfies x = x for any x E L , then L can construct a lattice H implication algebra. Corollary 3.13. Let L be any Brouwerian lattice with 0, Let B be the set of “closed” elements satisfying a=a**, we have the following conclusions: (1).Forany a,b€ B, ( b : a ) : a = b : a ; (2).Forany a , b E B and c ~ [ a , O *(]b, : a ) : c = b : a ; (3).Forany a , b E B , a : ( b : a ) = a ; (4). For any a,b,c E B , (c : b) : a = ( c :a ) : (b : a ) ; (5). For any a,b,c E B , ( ( c :b ) :a ) = c : ( a A b); (6). For any a,b,c E B , ((c : a ) : (b : a)) : ((c : b) : a ) = 0;. 0 Remark not all Brouwerian lattices can construct lattice implication algebras, for this, we can give easily some examples to explain it. It is possible that the Boolean lattice of all closed elements is not a sublattice. 4.
Conclusion
In this paper, we discussed the relations between lattice implication algebra and Brouwerian lattice, and verified that we can construct a lattice H implication algebra by closed elements of any Brouwerian lattice with 0. It hopes that above work would serve as a foundation for further study the structure of lattice implication algebras.
5. Acknowledgements Authors would like to express their sincere thanks to the referees for their valuable suggestions and comments.
56 References
1.
D.W. Borns, J.M. Mack, An Algebraic Introduction to Mathematical Logic, Springer, Berlin, 1975. 2. Y. Xu, Lattice implication algebras, J. Southwest Jiaotong Univ. 28(1) (1993) 20-27. 3. Y. Xu, K.Y. Qin, Lattice H implication algebras and lattice implication algebra classes, J. Hebei Mining Civil Eng. Inst. 3 (1992) 139-143. J. Liu, Y. Xu, Filters and structure of lattice implication algebra, 4. Chinese Sci. Bull. 42 (1997) 1517-1520. 5. Y.B. Jun, E.H. Roh, Y. Xu, LI-ideals in lattice implication algebras, Bull. Korean Math. SOC.35 (1998) 13-24. Y.B. Jun, On LI-ideals and Prime LI-ideals of lattice implication 6. algebras, J. Korean Math. SOC.36 (2) (1999) 369-380. J. Liu, Y. Xu, On prime filters and decomposition theorem of lattice 7. implication algebras, J. Fuzzy Math. 6 (1998) 1001-1008. 8. Y.L. Liu, S.Y. Liu, Y. Xu, K.Y. Qin, ILI-ideals andprime LI-ideals in lattice implication algebras, J. Information Sciences 155 (2003) 157-175. Y. Xu, K.Y. Qin, Lattice properties in lattice implication algebras, 9. Collected Works on Applied Mathematics, Chengdu science University Publishers, 1992 10. Y. Xu, Complemented lattice, J. Southwest Jiaotong Univ. 1 (1992) 3741. 11. G. Birkhoff, Lattice Theory, Vol. XXV, 3rd Edition, American Mathematical Society Colloquium Publications, Providence, RI, 1967. 12. J. Liu, Study on the structure of the Lattice Implication Algebra, Southwest Jiaotong University Master Degree Dissertation, 1996.
RELATING INTUITIONISTIC FUZZY SETS AND INTERVAL-VALUED FUZZY SETS THROUGH BILATTICES
0. ARIELI Department of Computer Science, The Academic College of Tel-Aviv, Antokohki 4, Tel-Aviv 611 61, Israel E-mail:
[email protected] C. CORNELIS: G. DESCHRIJVER AND E. E. KERRE Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 (S9), B-9000 Gent, Belgium E-mail: {Chris.cornelis,glad. deschrijver, etienne.kerre} @ugent.be
In this paper, we show that bilattices are robust mathematical structures that prcvide a natural accommodation to, and bridge between, intuitionistic fuzzy sets and interval-valued fuzzy sets. In this way, we resolve the controversy surrounding the formal equivalence of these two models, and open up the path for a new tradition for representing positive and negative information in fuzzy set theory.
1. Motivation
Bilattices are algebraic structures that were introduced by Ginsberg14, and further examined by Fittingl23l3 and others, e . g 2 , as a general framework for many applications in computer science. In this paper, we show that these structures can also elegantly and naturally accomodate intuitionistic fuzzy sets (IFSs) and interval-valued fuzzy sets (IVFSs), which are two frequently encountered and syntactically equivalent generalizations of Zadeh’s fuzzy sets. In particular, and more generally than in previous works, we demonstrate that Atanassov’s decision to restrict the evaluation set for L-intuitionistic fuzzy sets to consistent couples of the “square” C2 forces the resulting structure to coincide with the “triangle” T(C).This insight provides a convenient stepping stone towards more general and expressive *Work supported by the Fund for Scientific Research - Flanders.
57
58 models for the representation and processing of positive and negative imprecise information. 2. Preliminaries: IFSs, IVFSs, and Bilattices 2.1. Intuitionistic fuzzy sets (IFSs)
A fuzzy set17 is a nebular collection of elements from a universe U , described by a membership function p : U -+ [0,1]. An intuitionistic fuzzy set3 (IFS, for short) is a nebular collection of elements from a universe U , described by a pair of functions ( p l y ) , each one maps elements from U to the unit interval [0,1], such that for every u in U , p(u) v ( u ) 5 1. Intuitively, p is a membership function and v is a non-membership function. These two functions are not necessarily each other's complement (an assumption which is implicit in Zadeh's fuzzy set theory), i.e., the amount of the 'missing information', 1 - p(u) - v(u), may be strictly positive. Given a complete lattice C = (L, I),Goguen15 introduced the concept of L-fuzzy sets as a mapping p : U 4 L. Intuitionistic fuzzy sets can be interpreted as a particular kind of L-fuzzy sets, where the corresponding complete lattice is the following":
+
Definition 2.1. Define: C* = ( L * , < L * ) where , L* = { ( q , z 2 ) I (z1,zz) E [O,l]x[O,1]and ~ + z 2 5 1 ) , a n d( z l , z z ) < ~( Y*I , Y Z ) i f f z i I y 1 a n d z 2 2 ~ 2 . Atanassov and Stoeva4 introduced the following generalization of the IFS construct, called an intuitionistic L-fuzzy set (ILFS).
Definition 2.2. Let ( L ,5 ~be)a complete lattice with an involution operation" N and a non-empty set U called universe. An intuitionistic L-fuzzy set in U is a mapping g : U + L x L, such that if g ( u ) = (z1, z 2 ) then z1 < L N(z2),for all u in U . 2.2. Interval-valued fizzy sets (IVFSs) In interval-valued fuzzy sets the membership degrees are represented by intervals in [0,1] that approximate the correct (but unknown) membership degree. Another justification for this approach is that, in reality, intervals of values better reflect experts' opinions than exact numbers. An IVFS can be seen as an L'-fuzzy set, where the corresponding lattice is given by the following definition. aI.e., for every
I,y
in L , N ( N ( z ) = ) I,and if
2
Q y then N ( z )> L N ( y ) .
59 Definition 2.3. Define: L' = (L', < L r ) , where L' = { [ 2 1 , 2 2 ]I ( 2 1 , 2 2 ) E [ 0 , 1 ]x [0,1] and z1 I 2 2 } , and [ X I , 221 I L I [ Y I ,Y Z ] iff 51 I y1 and 2 2 I y 2 . Atanassov and Gargov5 defined another generalization of the IFS construct, called interval-valued intuitionistic fuzzy set (IVIFS), which deviates from the line of thinking of ILFS, and which is more related to the intuition behind IVFSs.
Definition 2.4. An interval-valued intuitionistic fuzzy set5 in a universe U is a mapping g : U -+ (L')2, such that g ( u ) = ( [ x i , x t ][xi,x$]) , and zt dj 1, for all u in U.
+
Indeed, applying Definition 2.2 to (L', < L r ) , where the involution N on the lattice (L', < L r ) is defined by N ( [ z lQ]) , = [l - 5 2 , l - 211, gives the alternative condition [xi, 2$] < L I [l - x $ , 1 -
41.
2.3. Bilattices
As noted above, bilattices are used here for relating IFSs and IVFSs. First, we recall some basic definitions and notions that are related to these structures.
Definition 2.5. A pre-bilattice12 is a structure B = ( B ,St, &), such that B is a nonempty set containing at least two elements, and ( B ,i t ) , ( B ,Ik) are complete lattices. Definition 2.6. A bilattice14 is a structureb B = ( B ,Lt, < k , l),such that ( B ,S t , i k ) is a pre-bilattice, and 1 is a unary operation on B that has the following properties: for every 2 , y in B , ( 1 ) if
2 I t
y then lz Zt l y , (2) if
IC < k
y then
TE
i k l y , (3) 1-2 = 2 .
The original motivation of Ginsberg14 for using bilattices was to provide a uniform approach for a diversity of applications in AI. In particular, he considered first-order theories and their consequences, truth maintenance systems, and default reasoning. Later, it was shown that bilattices are bNote that Definition 2.6 is not the same as the one in ', but rather corresponds to Ginsberg's original definition of bilattices. In terms of Fitting, the definition above describes a pre-bilattice with a negation, while the structures considered in are prebilattices that are interlaced. As a bilattice may not be interlaced on one hand, and it may not be possible to define a negation operator for a given interlaced pre-bilattice on the other hand, the present definition of bilattices is incomparable with that of 7. This will not be an obstacle in what follows, though.
'
60 useful for giving semantics to logic and that they provide an intuitive semantics t o consequence relations for reasoning with uncertainty'. Following the conventional notations in the literature, we shall denote by A (by V ) It-meet (st-join), and by 8 (by @) +meet (sk-join) of a bilattice B ; f and t will denote the extreme elements of ( B , and I,T will denote the extreme elements of ( B ,s k ) . In some bilattices a dual negation operator, called confEation12 (-), is definable. It is an involution of ( B , and order preserving of ( B ,< t ) :
st),
sk) (I) if s k y then - 2 2k - y , ( 2 ) if z st y then - 2 st - y , (3) - - x Proposition 2.1. Let B = ( B ,st,s k , be a bilattice. Then: J:
= IC.
1)
u)
l4
-f = t , - t = f ,
l T = T . Also, for every x , y i n B ,
-I=-L,
~ ( Z A Y=)i x V T y , - ( z V y ) = ~ x A l y 1, ( 2 @ / ) = 1 2 @ 1 y , ' ( Z @ y ) = l z @ l y .
b) l 2 I f B has a conflation, then - f = f , -t=t, - I = T , - T = l . For every z, y znB, -(z A y ) = - 2 A -9, -(x V y ) = - 2 V - y , -(z 8 y ) = - 2 @ - y , and - ( x @ y ) = - 2 8 - y . 3. Squares and Triangles 3.1. Squares
Definition 3.1.14 Let L = (L, ≤L) be a complete lattice. The structure L^2 = (L × L, ≤t, ≤k, ¬) is defined as follows:
(1) ¬(x1, x2) = (x2, x1),
(2) (x1, x2) ≤t (y1, y2) iff x1 ≤L y1 and x2 ≥L y2,
(3) (x1, x2) ≤k (y1, y2) iff x1 ≤L y1 and x2 ≤L y2.

In what follows we refer to L^2 as a square. A pair (x1, x2) ∈ L^2 may intuitively be understood so that x1 represents the amount of belief for some assertion, and x2 is the amount of belief against it. This is clearly the same idea as that of Atanassov3, discussed in Section 2.1, of splitting a belief about the membership of an element u into two components (μ(u), ν(u)). As we shall show, the similarity does not remain only on this intuitive level.

Proposition 3.1.14 Let L = (L, ≤L) be a complete lattice with a join ⊔L and a meet ⊓L. Then:
a) L^2 is a bilattice, in which ⊤_L² = (sup(L), sup(L)), ⊥_L² = (inf(L), inf(L)), t_L² = (sup(L), inf(L)), f_L² = (inf(L), sup(L)). The basic operations in L^2 are defined as follows: ¬(x1, x2) = (x2, x1), (x1, x2) ∨ (y1, y2) = (x1 ⊔L y1, x2 ⊓L y2), (x1, x2) ∧ (y1, y2) = (x1 ⊓L y1, x2 ⊔L y2), (x1, x2) ⊕ (y1, y2) = (x1 ⊔L y1, x2 ⊔L y2), (x1, x2) ⊗ (y1, y2) = (x1 ⊓L y1, x2 ⊓L y2).
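As a concrete illustration, the following Python sketch (ours, not part of the paper) implements these operations for the square of the unit interval, with ⊓L = min and ⊔L = max, and checks two of the negation laws of Proposition 2.1:

    # Square bilattice over L = ([0,1], <=); a pair (x1, x2) is read as
    # (evidence for, evidence against); meet/join of L are min/max.

    def neg(p):                        # negation: swaps for and against
        return (p[1], p[0])

    def t_meet(p, q):                  # meet of the truth order
        return (min(p[0], q[0]), max(p[1], q[1]))

    def t_join(p, q):                  # join of the truth order
        return (max(p[0], q[0]), min(p[1], q[1]))

    def k_meet(p, q):                  # meet of the knowledge order
        return (min(p[0], q[0]), min(p[1], q[1]))

    def k_join(p, q):                  # join of the knowledge order
        return (max(p[0], q[0]), max(p[1], q[1]))

    t, f = (1.0, 0.0), (0.0, 1.0)      # extreme elements of the truth order
    bot, top = (0.0, 0.0), (1.0, 1.0)  # extreme elements of the knowledge order

    p, q = (0.7, 0.2), (0.4, 0.5)
    assert neg(t_meet(p, q)) == t_join(neg(p), neg(q))  # negation reverses <=t
    assert neg(k_meet(p, q)) == k_meet(neg(p), neg(q))  # negation preserves <=k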
b)11 Suppose that L has an involution. Denote by x^- …

Theorem 3.2. Let F(x) = g(x, E) be a fuzzy-valued function represented by a fuzzy structured element E, and let the λ-level set of F(x) be an interval-valued function F_λ(x) for any λ ∈ (0, 1]. Then

F_λ(x) = g(x, E_λ) = [g(x, e_λ^-), g(x, e_λ^+)].    (4)

Proof: Since for any x ∈ X, F(x) = g(x, E) is a bounded closed fuzzy number, by Theorem 2.3 the conclusion holds.

4. Differential and Integral of Fuzzy-valued Functions
Suppose that F(x) is a fuzzy-valued function on D ⊂ R, its λ-level set F_λ(x) = [f_1(λ)(x), f_2(λ)(x)] is an interval-valued function on D, and f_1(λ)(x), f_2(λ)(x) are all differentiable in D for any λ ∈ (0, 1]. We denote

F'_λ(x) = [f'_1(λ)(x), f'_2(λ)(x)], λ ∈ (0, 1];

then F(x) is called differentiable on D, and the differential of F(x) is determined by these level sets.
Suppose that f_1(λ)(x) and f_2(λ)(x) are Riemann integrable on D, and we denote

∫_D F_λ(x) dx = [∫_D f_1(λ)(x) dx, ∫_D f_2(λ)(x) dx], ∀λ ∈ (0, 1];

then we say that F(x) is fuzzy Riemann integrable on D, having these intervals as its λ-level sets.

Theorem 4.1. Let F(x) = g(x, E) be a fuzzy-valued function represented by a fuzzy structured element E, with g(x, y) a monotone bounded function on [-1, 1] with respect to y. If g(x, y) is differentiable with respect to x ∈ D ⊂ X, then F(x) is differentiable on D, and then we have …

Proof: From formula (5), we have F'(x) = … By Theorem 2.3 …

… N > 0. We call N the length of sequence s and denote |s| = N. Let us denote the set of sequences over B as S_B and denote the set of sequences over B with length N as S_B^N.

Example 1.1 (Customer sequences). Consider a large database of customer transactions. Each transaction consists of three fields: transaction Date, customer Id, transaction Items. The following example is given in [1].

    Customer i    Sequence s_i               Length |s_i|
    1             {30}{90}                   2
    2             {10,20}{30}{40,60,70}      3
    3             {30,50,70}                 1
    4             {30}{40,70}{90}            3
    5             {90}                       1
where B = {10, 20, 30, 40, 50, 60, 70, 90}.

Example 1.2 (Document sequences). Consider a large collection of documents. Each document consists of several fields: document Id, term Categories (e.g., country names, topics, people names, organisations, stock exchanges, etc.). The following example is given by Feldman et al in their paper [2] to investigate maximal association rules and mining for keyword co-occurrences in document collections, where collection D consists of 10 documents d1, d2, ..., d10, and Categories T1, T2 are countries and topics, respectively.
    D           T1 = countries         T2 = topics
    d1, d2      {Canada, Iran, USA}    {crude, ship}
    d3          {USA}                  {earn}
    d4, d5      {USA}                  {jobs, cpi}
    d6          {USA}                  {earn, cpi}
    d7          {Canada}               {sugar, tea}
    d8, d9      {Canada, USA}          {trade, acq}
    d10         {Canada, USA}          {earn}
Example 1.3 (DNA sequences/profiles). Let B be the set of nucleotides {A, C, G, T}. Then, a sequence formed by singletons (one-element subsets) is a DNA sequence [4]. For example, let U = {u1, u2, u3, u4}, where
u1 = ACGTAAAAGTCACACGTAGCCCCACGTACAGT,
u2 = CGCGTCGAAGTCGACCGTAAAAGTCACACAGT,
u3 = GGTCGATGCACGTAAAATCAGTCGCACACAGT,
u4 = ACGTAAAAGTAGCTACCCGTACGTCACACAGT.
It is interesting that DNA sequences are usually used in the scientific areas of biology and medicine, while DNA profiles are frequently used by journalists to report crime events.

Now, let us define some containing relations in S_B. First of all, sequence a = A1A2...An (n > 0) is said to be contained in sequence b = B1B2...Bm (m > 0), denoted by a ≼ b, if there exist n integers 1 ≤ i1 < i2 < ... < in ≤ m such that A1 ⊆ B_i1, A2 ⊆ B_i2, ..., An ⊆ B_in. In this case, we say that a is a sub-sequence of b and that b is a super-sequence or an extension of a. Obviously, we have |a| ≤ |b|. Usually, sequence a = A1A2...An (n > 0) is said to be usually contained in sequence b = B1B2...Bm (m > 0) if there exist n integers 1 ≤ i1 < i2 < ... < in ≤ m such that A1 = B_i1, A2 = B_i2, ..., An = B_in. In this case, we say that a is a usual sub-sequence of b and that b is a usual super-sequence or extension of a. In particular, the strong containing relation in S_B means that sequence A1A2...An is said to be strongly contained in sequence B1B2...Bm if there exist n contiguous integers 1 ≤ i < i+1 < ... < i+n-1 ≤ m such that A1 = B_i, A2 = B_(i+1), ..., An = B_(i+n-1). In this case, we say that a is a strong sub-sequence of b and that b is a strong super-sequence or extension of a. For DNA sequences, we only consider the strong containing relation.

Given a sequence s, the set of sub-sequences of s is said to be the language from the sequence, denoted by L(s), which is equal to {v | v ≼ s, |v| > 0}. A sub-sequence of s is said to be a pattern in the sequence. Let U be a group (set) of sequences, U = {s1, s2, ..., sk, ..., sK}, K = |U|. Denote L = max(|s1|, |s2|, ..., |sk|, ..., |sK|). The union L(U) = ∪_{k=1}^{K} L(sk) = ∪_{k=1}^{K} {v | v ≼ sk, |v| > 0} is said to be the language from the group, denoted by L(U).
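The three containing relations are easy to prototype. A minimal Python sketch (ours), assuming sequences are represented as lists of sets of items:

    def contained(a, b):
        # Standard containment a ≼ b: each A_j is a subset of some B_ij,
        # with i1 < i2 < ... < in (a greedy left-to-right scan suffices).
        i = 0
        for elem in a:
            while i < len(b) and not elem <= b[i]:
                i += 1
            if i == len(b):
                return False
            i += 1
        return True

    def usually_contained(a, b):
        # Usual containment: requires equality A_j = B_ij.
        i = 0
        for elem in a:
            while i < len(b) and elem != b[i]:
                i += 1
            if i == len(b):
                return False
            i += 1
        return True

    def strongly_contained(a, b):
        # Strong containment: a equals a contiguous block of b.
        n = len(a)
        return any(a == b[i:i + n] for i in range(len(b) - n + 1))

    s2 = [{10, 20}, {30}, {40, 60, 70}]      # customer 2 from Example 1.1
    assert contained([{30}, {40}], s2)       # {40} is a subset of {40,60,70}
    assert not usually_contained([{30}, {40}], s2)
    assert strongly_contained([{30}, {40, 60, 70}], s2)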
3. Frequent Patterns Contained in a Sequence Group
A particular sequence can be contained (or can "co-occur") in many sequences of a group as their common sub-sequence, i.e., common pattern. First of all, for a sequence w ∈ L(U) in the language from group U, we need to know how many sequences in U contain w. This number, denoted by t (so t ≤ |U| = K), is called its support/occurrence number, and w is said to be a t-support/occurrence pattern. Of course, a t-support/occurrence pattern is a t'-support/occurrence pattern whenever t' ≤ t, and we prefer the number to be maximal. Furthermore, for a sequence w ∈ L(U) in the language from group U, we need to know which sequences in U contain w. For a sequence w ∈ L(U), the sub-group (subset) of sequences in U containing w is {u ∈ U | w ≼ u}, denoted by w^U. Sub-group w^U consists of the sequences in group U in which sequence w is contained, and is called the support/occurrence group of sequence w. So w is a |w^U|-support/occurrence pattern. We also call w a |w^U|/|U|-frequent pattern. In one word, for a given group U of sequences and a given pattern w, it is foremost to know its support/occurrence group w^U in U. When a pattern w is given, we also want to indicate its support/occurrence group w^U simultaneously. Therefore, a particular notation, the occurring notation for patterns, is introduced. In this notation, pattern w is written as w{w^U}, with its support/occurrence group w^U added to indicate the group when w^U ⊂ U. Notice that, in this notation, a naked pattern w means that its support/occurrence group is the whole group U, i.e., w^U = U; in this case, w is abbreviated from w{U}.

Theorem 2.1. Let U be a group of sequences over B. For two patterns w, v ∈ L(U), if v ≼ w then v^U ⊇ w^U. That is, a sub-sequence has a super support/occurrence group, and a super-sequence has a support/occurrence sub-group.

Generally, given a threshold τ > 0, sequence w ∈ L(U) is called a τ-frequent pattern if |w^U|/|U| ≥ τ. Here τ is called the minimum support rate or minimum frequency, and w is said to be a pattern with minimum support (rate) τ [1,4].
Notice that 0 ≤ |w^U|/|U| ≤ 1. Thus, a threshold τ should take a value satisfying 0 < τ ≤ 1. Usually, τ is given by a percentage. In this paper, the frequency of a pattern is defined as the support/occurrence rate of the pattern in the group of sequences. Repeatedly, we say pattern w is |w^U|-support/occurrence or |w^U|/|U|-frequent. The set of τ-frequent patterns with length l is denoted by M(U, τ, l) = {w | w ∈ L(U), |w^U|/|U| ≥ τ, |w| = l}, where l ≤ L. Denote M(U, τ) = ∪_{l=1}^{L} M(U, τ, l).

4. Theorems for Finding Patterns
1-length patterns are called units. A 1-length pattern w ∈ (2^B - {∅}) is in M(U, τ, 1) if and only if |w^U|/|U| ≥ τ.

Theorem 3.
1. M(U, k/|U|, 1) = {z ∈ (2^B - {∅}) | |z^U| ≥ k}.
2. In M(U, k/|U|, l+1), every pattern w can be expressed as either a left or right concatenation w = zx or w = yz of a unit z ∈ M(U, k/|U|, 1) with x, y ∈ M(U, k/|U|, l). Conversely, given z ∈ M(U, k/|U|, 1) and x, y ∈ M(U, k/|U|, l), the concatenation w = zx or w = yz is in M(U, k/|U|, l+1) if its frequency is not decreased.

Algorithm Appending, for constructing the τ = k/|U|-frequent patterns M(U, k/|U|):
begin
1. Find all 1-length patterns in M(U, k/|U|, 1) from z ∈ (2^B - {∅}) by checking its occurring/support group: |z^U| ≥ k.
2. Find M(U, k/|U|, l+1) from M(U, k/|U|, l) as follows.
begin
  For all patterns in M(U, k/|U|, l) of length l, keep concatenating either left or right with 1-length patterns z in M(U, k/|U|, 1).
  For each resultant pattern w of length l+1, compute its support/occurrence group w^U.
  Add w to M(U, k/|U|, l+1) if |w^U| ≥ k.
end
end
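A direct Python transcription of Algorithm Appending might look as follows; it is a sketch that assumes the `contained` predicate from the earlier sketch and represents patterns as lists of sets (the function and parameter names are ours):

    from itertools import combinations

    def support_group(w, U, contained):
        # w^U: the set of indices of sequences in U that contain pattern w.
        return frozenset(i for i, s in enumerate(U) if contained(w, s))

    def appending(U, B, k, contained, max_len=10):
        # Step 1: 1-length patterns are non-empty subsets z of B with |z^U| >= k.
        M1 = [[set(c)] for r in range(1, len(B) + 1)
                       for c in combinations(sorted(B), r)
              if len(support_group([set(c)], U, contained)) >= k]
        M = {1: M1}
        l = 1
        while l < max_len and M[l]:
            longer = []
            for w in M[l]:
                for z in M[1]:                     # Step 2: left/right concatenation
                    for cand in (z + w, w + z):
                        if cand not in longer and \
                           len(support_group(cand, U, contained)) >= k:
                            longer.append(cand)
            M[l + 1] = longer
            l += 1
        return M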
In the case where nothing is known at the beginning, Algorithm Appending is a possible way to construct the k-support/occurrence patterns for a given k. However, its computational cost is rather high. Fortunately, there is an easier way to find higher-occurrence patterns when we know all k-support/occurrence patterns at the outset.
Algorithm Checking, for finding higher-occurrence patterns:
For each pattern w in M(U, (k-1)/|U|, l) of length l, check its support/occurrence group w^U; add w to M(U, k/|U|, l) if |w^U| ≥ k.

5. Summary and Future Work
We have given a general treatment for some kinds of sequences, such as customer sequences, document sequences, and DNA sequences. We have presented algorithms based on the theorems developed here to find frequent patterns in sequences. Further work and applications to discover knowledge about patterns in sequences are currently in process.

References
1. Agrawal, R.; Srikant, R. 1994-1995, Mining sequential patterns, in Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, March 1995; IBM Research Report RJ 9910, October 1994 (expanded version).
2. Feldman, R.; Aumann, Y.; Amir, A.; Zilberstain, A.; Kloesgen, W.; Ben-Yehuda, Y. 1997, Maximal association rules: a new tool for mining for keyword co-occurrences in document collection, in Proceedings of the 3rd International Conference on Knowledge Discovery (KDD 1997), 167-170.
3. Frawley, W.J., Piatetsky-Shapiro, G., & Matheus, C.J. (1991). Knowledge discovery in databases: an overview. In G. Piatetsky-Shapiro, W.J. Frawley (eds), Knowledge Discovery in Databases (pp. 1-27). AAAI/MIT Press.
4. Kiem, H.; Phuc, D. 2000, "Discovering motif based association rules in a set of DNA sequences", in W. Ziarko & Y. Yao (eds.), Proceedings of the Second International Conference on Rough Sets and Current Trends in Computing (RSCTC'2000), Banff, Canada, October 16-19, 2000; 348-352. ISSN 0828-3494, ISBN 0-7731-0413-5.
EMPIRICALLY VALIDATING AN ADAPTED CLASSIFICATION BASED ON ASSOCIATIONS ALGORITHM ON UCI DATA

DAVY JANSSENS1, YU LAN2, GEERT WETS1, GUOQING CHEN2 AND TOM BRIJS1
1 Limburgs Universitair Centrum, Transportation Research Institute, Universitaire Campus, Gebouw D, B-3590 Diepenbeek, Belgium
Email: {davy.janssens; geert.wets; tom.brijs}@luc.ac.be
2 School of Economics and Management, Tsinghua University, Beijing 100084, China
Email: {yull, chengq}@em.tsinghua.edu.cn

In recent years, extensive research has been carried out by using association rules to build more accurate classifiers. The idea behind these integrated approaches is to focus on a limited subset of association rules, i.e. those rules where the consequent of the rule is restricted to the classification class attribute. This paper aims to contribute to this integrated framework by adapting the CBA (Classification Based on Associations) algorithm. More specifically, CBA was modified by coupling it with a new measurement of the quality of association rules, i.e. intensity of implication. By means of this measurement, the sequence in which the class association rules are chosen was changed when building the classifier. The new algorithm has been implemented and empirically tested on 16 popular datasets from the UCI Machine Learning Repository. Furthermore, the results were validated against original CBA, C4.5 (both on original and on discretized datasets), and Naïve Bayes. The adapted CBA algorithm presented in this paper proved to generate the lowest average error rate and produced classifiers that are more compact than original CBA.
1. Introduction

Classification and association-rule discovery are certainly two of the most frequently addressed tasks in the data mining literature. In recent years, extensive research has been carried out to integrate classification and association rule mining. By focusing on a limited subset of association rules, i.e. those rules where the consequent of the rule is restricted to the classification class attribute, it is possible to build more accurate classifiers. Despite the solid performance of these integration approaches in previous research efforts, the comprehensiveness and complexity of dealing with the often-large number of association rules have led to difficulties and (accuracy versus generality) trade-off questions, which are part of a lot of research currently going on. Liu, Ma & Wong proposed an improvement of their original CBA (classification based on associations) system [1] in [2] to cope with the weaknesses in the system. In spite of the fact
that the presented adaptations of CBA are valuable, some important issues still remain unsolved. Our goal is to address them in this paper. The potential weakness is situated in the way CBA sorts its (class) association rules. The sorting in CBA is quite important because the rules for the final classifier will be selected by following the sorted sequence. CBA mainly sorts its rules by using the conditional probability (confidence). This is a good measure when classes are equally distributed. However, as we will show, when class distributions differ significantly, and especially for classes where the frequency is low, this is not the most adequate approach to follow. For this reason, we propose intensity of implication [3] as a better measure to sort the class association rules. The next section describes the classification problem in terms of association rules, along with the sorting mechanism applied by CBA.

2. Classification based on associations
An association rule X => Y typically holds in a transaction set D with confidence c if c% of the transactions in D that contain X also contain Y. The rule X => Y has support s in the transaction set D if s% of the transactions in D contain X ∪ Y. Given a set of transactions D, the problem of mining association rules is to generate all association rules that have support and confidence greater than a user-specified minimum support (minsup) and minimum confidence (minconf). To make association rules suitable for the classification task, the CBA method focuses on a special subset of association rules, i.e. those rules with a consequent limited to class label values only: the so-called class association rules (CARs). Thus, only rules of the form A => ci, where ci is a possible class, need to be generated. Therefore, the Apriori algorithm, which is widely used for generating association rules, was modified to build the CARs. Details about these modifications can be found in [1]. After all the CARs have been generated, a classifier can be built by means of the original CBA algorithm. CBA will first rank all the CARs and sort them in descending sequence. As we will show in the next section, this rank will be subject to one of the modifications that were implemented. The ranking is as follows: given two rules ri and rj, ri > rj (or ri is said to have a higher rank than rj) if (1) conf(ri) > conf(rj); or (2) conf(ri) = conf(rj), but sup(ri) > sup(rj); or (3) conf(ri) = conf(rj) and sup(ri) = sup(rj), but ri is generated before rj. If at least one case among all the cases covered by the rule is classified correctly by the rule, the rule is inserted into the classifier by following this sorted descending sequence order, and all the cases it covers are removed from the database. The rule insertion stops when either all of the rules are used or no cases are left in the database. The majority class among all cases left in the database is selected as the default class. The default class is used in cases where there are no covering rules. Then, the algorithm computes the total number of errors, which is the sum of the number of errors that have been made by the selected rules in the current classifier and the number of errors made by the default class in the training data. After this process, the first rule that has the least number of errors is identified as the cutoff rule. All the rules after this rule are not included in the final classifier since they would only produce more errors [1].
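The precedence relation used for this ranking can be expressed compactly as a sort key. The following Python sketch (ours; the rule representation is an assumption, not taken from [1]) sorts rules by descending confidence, then descending support, then generation order:

    # Rules as (confidence, support, generation_index); a lower index means
    # the rule was generated earlier by the rule generator.
    rules = [(0.90, 0.10, 2), (0.90, 0.15, 1), (0.95, 0.05, 0)]

    def cba_sort(rules):
        # Descending confidence, tie-break on descending support,
        # then on generation order (the precedence relation of Section 2).
        return sorted(rules, key=lambda r: (-r[0], -r[1], r[2]))

    print(cba_sort(rules))
    # [(0.95, 0.05, 0), (0.9, 0.15, 1), (0.9, 0.1, 2)]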
3. Identifying weaknesses and proposing adaptations to CBA
3.1. Limits of conditional probability (confidence)

As mentioned above, rules are inserted into the classifier following the sorted confidence order. This determines to a large extent the accuracy of the final classifier. Confidence is a good measure for the quality of (class) association rules, but it also suffers from certain weaknesses. The aim of this section is to elaborate on them. The first weakness is that the conditional probability of a rule X => Y is invariable when the size of s(Y) or D varies. s(Y) is the subset of the cases which are covered by the consequent of the rule, while D is the total number of observations in the dataset. The confidence property is also insensitive to cardinal dilatation (i.e. the sizes of the subsets increase in the same proportion). In all these cases, the confidence will be a constant while in reality the rule X => Y is more likely to happen when (i) the size of s(Y) increases, (ii) the size of D decreases, or (iii) the size of all the sets grows in the same proportion. The second drawback of the use of conditional probability is that when, for a particular class, the minsup parameter is set to 1% or even lower, it might very well happen that some rules have a high confidence parameter but are confirmed by a very limited number of instances, so that those rules stem from noise only. This is why it is always dangerous to look for implications with small support, even though these rules might look very "interesting". As a result, choosing the most confident rules may not always be the best selection criterion. Therefore, a suitable measure that takes both drawbacks into account, i.e. intensity of implication, was used to perform the sorting. The next section elaborates on this.
3.2. Intensity of implication

Intensity of implication, introduced by Gras & Lahrer [3], measures the distance to random choices of small, even non statistically significant, subsets. In other words, it measures the statistical surprise of having so few negative examples on a rule as compared with a random draw [4]. Consider a database D, where |D| is the total number of observations in the database, and an association rule X => Y. Now, let U and V be two sets randomly chosen from D with the same cardinality as X and Y respectively, i.e., s(U) = s(X) and s(V) = s(Y). Let S(U ∧ ¬V) be the random variable that measures the expected number of random negative examples (let ¬V mean 'not V') under the assumption that U and V are independent, and s(X ∧ ¬Y) the number of negative examples observed on the rule. Now, if s(X ∧ ¬Y) is unusually small compared with s(U ∧ ¬V), the one we would expect at random, then we say that the rule X => Y has a strong statistical implication. In other words, the intensity of implication for a rule X => Y is stronger if the quantity P[S(U ∧ ¬V) ≤ s(X ∧ ¬Y)] is smaller. Intensity of implication is then defined as 1 - P[S(U ∧ ¬V) ≤ s(X ∧ ¬Y)]. Since S(U ∧ ¬V) follows the hypergeometric law, which means Pr[S(U ∧ ¬V) = k] = Pr[of s(U) examples selected at random, exactly k are not in V] = C(s(¬Y), k) × C(s(Y), s(X) - k) / C(|D|, s(X)), taking into account that s(U) = s(X) and s(V) = s(Y), the intensity of implication can be written as:

1 - Σ_{k=max(0, s(X)-s(Y))}^{s(X ∧ ¬Y)} [C(s(¬Y), k) × C(s(Y), s(X) - k)] / C(|D|, s(X)).    (1)
This formula is suitable as long as the number of cases in the database, i.e. |D|, is reasonably small. Otherwise, due to its combinatorial explosion, this formula becomes difficult to calculate. Therefore, Suzuki et al [5] came up with an approximation of this formula for big datasets. They argue that if s(U ∧ ¬V) is small, which is often the case in rule discovery, Poisson approximations can be applied. In that case, the above formula for intensity of implication reduces to an approximation that is easier to compute, i.e.:

1 - Σ_{k=0}^{s(X ∧ ¬Y)} e^(-λ) λ^k / k!,  with λ = s(X) s(¬Y) / |D|.
For convenience, the standard measures of confidence and support can be incorporated in this formula, using s(X) = sup(X => Y) · |D| / conf(X => Y) and s(X ∧ ¬Y) = s(X) · (1 - conf(X => Y)); the resulting expression is the equation referred to below as formula (2).
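Both the exact form (1) and the Poisson approximation can be computed in a few lines. The Python sketch below is ours, and the counts used in the final lines are hypothetical:

    from math import comb, exp

    def intensity_exact(s_x, s_y, s_x_not_y, d):
        # Formula (1): exact hypergeometric intensity of implication.
        # s_x = s(X), s_y = s(Y), s_x_not_y = s(X and not Y), d = |D|.
        s_not_y = d - s_y
        total = 0
        for k in range(max(0, s_x - s_y), s_x_not_y + 1):
            total += comb(s_not_y, k) * comb(s_y, s_x - k)
        return 1.0 - total / comb(d, s_x)

    def intensity_poisson(s_x, s_y, s_x_not_y, d):
        # Poisson approximation with mean lam = s(X) * s(not Y) / |D|.
        lam = s_x * (d - s_y) / d
        term, total = exp(-lam), 0.0
        for k in range(s_x_not_y + 1):
            total += term
            term *= lam / (k + 1)
        return 1.0 - total

    # Hypothetical counts: s(X)=200, s(Y)=700, s(X and not Y)=40, |D|=1000.
    print(intensity_exact(200, 700, 40, 1000))
    print(intensity_poisson(200, 700, 40, 1000))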
The original CBA algorithm can now be adapted by using formula (2) to sort the class association rules (CARs). In this notation, rule ri has a higher rank than rule rj if it has a larger value of intensity of implication. When two rules have the same values of intensity of implication, they are ranked according to the sorting mechanism of the original CBA, which is mentioned in Section 2. The other steps of the algorithm remain unchanged.

4. Empirical section
In this section, the adapted CBA algorithm is evaluated on 16 popular UCI datasets. In order to get a more comprehensive evaluation, these datasets are also classified by original CBA, the classical decision tree technique C4.5 (both on original and discretized datasets) and Naïve Bayes. The continuous attributes are discretized by means of an entropy-based discretization method if needed. Tenfold cross-validation is used to test the performance of these classifiers. The benchmarking results are described in Table 1. Tests on several datasets show that adapted CBA has a satisfactory accuracy improvement while it also achieves more compact classifiers. An exciting test is the horse dataset: while the original CBA gets a 19.03% 10-fold cross-validation error rate with an average of 87.9 rules, the adapted CBA only generated 1 rule and suffers only from a 14.12% error rate. Generally speaking, adapted CBA has the lowest average error rate, and the average number of rules it generated on these datasets is nearly one third of that of the original CBA. The original CBA also has a better performance than C4.5 (both on original and discretized datasets) and Naïve Bayes. Although Naïve Bayes performs excellently on several datasets such as breast, heart and labor, its behaviour is unstable since it assumes attributes are independent, which is quite a dubious assumption in real-life datasets. The performance of C4.5 on discretized datasets is better than on original datasets, but still slightly worse than adapted and original CBA. The adapted CBA implemented in the above tests used formula (2) to compute the intensity of implication. Formula (1) was tried as well, but apart from its computation problem, it empirically has a worse performance on most datasets and generated an average error rate of 15.16%.

Table 1: Benchmarking results

                   Adapted CBA         Original CBA        C4.5     C4.5 disc.  NB
      Dataset      Error    Num. of    Error    Num. of    Error    Error       Error
                   Rate     rules      Rate     rules      Rate     Rate        Rate
   1  Austral      13.48%   26.4       14.35%   130.5      15.51%   13.48%      18.70%
   2  Breast        4.72%   28.4        3.86%    42.2       5.01%    4.43%       2.58%
   3  Cleve        15.47%   16.9       17.16%    63.8      25.41%   20.79%      16.17%
   4  Crx          12.90%   34.2       14.93%   138.2      13.91%   12.75%      18.99%
   5  Diabetes     24.21%   10.4       22.26%    38.5      25.78%   22.92%      24.22%
   6  German       25.60%   56.5       26.70%   134        29.50%   27.60%      25.30%
   7  Heart        16.30%   13.6       17.78%    37.6      21.11%   18.89%      14.81%
   8  Hepatic      18.67%   18.4       16.21%    25.2      17.42%   16.77%      15.48%
   9  Horse        14.12%    1         19.03%    87.9      15.22%   15.22%      20.92%
  10  Hypo          1.23%   24.4        1.64%    30         0.73%    0.85%       1.90%
  11  Iono          9.10%   21.7        8.25%    44.8       9.69%    9.69%       8.26%
  12  Labor        11.67%    4.2       10.00%    12.5      19.30%   15.79%       8.77%
  13  Pima         23.17%   11         23.43%    38.3      26.82%   22.66%      25%
  14  Sick          2.43%   10.7        2.64%    47.4       1.14%    2.07%       4.32%
  15  Sonar        18.31%   27.4       22.60%    41        24.52%   18.75%      25.48%
  16  Tic-tac       0.00%    8          0.00%     8        14.20%   14.20%      29.65%
      Average      13.21%   17.925     13.80%    49.3375   16.58%   14.80%      16.28%
5. Conclusion
The algorithm presented in this paper is a modified version of the CBA algorithm. Experiments proved that adapted CBA achieves good performance and generates classifiers that are more compact than original CBA. Since a rule with a relatively low confidence may occasionally have a high intensity of implication, the evaluation of a more advanced sorting mechanism will be the subject of additional and future research.
Acknowledgement

The work was partly supported by the NSFC (79925001/70231010), the MOE Funds for Doctoral Programs (20020003095), and the Bilateral Scientific and Technological Cooperation Between China and Flanders/Czech.

References
1. B. Liu, W. Hsu and Y. Ma, Integrating Classification and Association Rule Mining, in Proc. of KDD-98, New York, 80-86 (1998)
2. B. Liu, Y. Ma and C. Wong, Classification using Association Rules: Weaknesses and Enhancements, in Vipin Kumar, et al. (eds.), Data mining for scientific and engineering applications (2001)
3. R. Gras and A. Lahrer, L'implication statistique: une nouvelle méthode d'analyse des données, Mathématiques, Informatique et Sciences Humaines, 120, 5-31 (1993)
4. S. Guillaume, F. Guillet, J. Philippé, Improving the discovery of association rules with intensity of implication, in PKDD'98, 318-327 (1998)
5. E. Suzuki and Y. Kodratoff, Discovery of surprising exception rules based on intensity of implication, in PKDD'98, 10-18 (1998)
A METHOD FOR ASSOCIATION RULES MINING *
J. MA
School of Electric Engineering, Southwest Jiaotong University, Chengdu 610031, China
E-mail: mj99032@263.net

G. CHEN
School of Economics and Management, Tsinghua University, Beijing 100084, China
E-mail: chengq@em.tsinghua.edu.cn

E. E. KERRE
Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281-S9, B-9000 Gent, Belgium
E-mail: Etienne.Kerre@UGent.be

D. RUAN
Belgian Nuclear Research Centre, B-2400 Mol, Belgium
E-mail: druan@sckcen.be
In this paper, a matrix-based approach for mining a special kind of boolean association rules is presented and discussed through an example from the viewpoint of uncertainty reasoning.
1. Introduction
Mining association rules in databases is one of the important and fundamental issues in data mining. An association rule is a triple (X ⇒ Y, α, β), where X, Y are sets of items, X ⇒ Y is called a potential rule, and α and β are the degrees of support and confidence of the rule, respectively. The degree of support of the rule indicates how many percent of the records in the database contain X and Y simultaneously, and the degree of confidence means how many

* This work is supported by the China-Flanders bilateral cooperation (BIL 00/46)
percent of the records which contain X also contain Y. Since Agrawal et al.1 presented the concept of association rules, research in this area has rapidly progressed3,4,5,7. The present work focuses on mining a special kind of boolean association rules and presents a matrix-based method. In Section 2, the basic concept of an association rule is given and a matrix-based method is discussed through an example. In Section 3, the method is discussed from the viewpoint of logic.

2. Main Results
Let I = {I1, I2, ..., Im} be a set of items (attributes). A subset of I is called an itemset. A database D is a set of elements T (called transactions), where T is a tuple (TID, TB), TID is the identifier of T in D, and TB is an itemset in I. An itemset X is said to be contained in T, denoted by X ⊆ T, if X ⊆ TB. In the following, let T(X) be the set of transactions containing X. Suppose X, Y are two disjoint itemsets. An implication r : X ⇒ Y is said to be an association rule in D with support σ and confidence ρ, where σ = |T(X ∪ Y)| / |D| and ρ = |T(X ∪ Y)| / |T(X)|. In general, a rule which we are interested in is one whose degrees of support and confidence are greater than the predefined thresholds min-supp (the minimum degree of support) and min-conf (the minimum degree of confidence).7 For a database D, mining association rules is to find out all interesting rules. In this study, mining boolean association rules is an important and fundamental problem. Most approaches for mining boolean association rules work in a scanning-counting strategy. Different from other existing methods, we will present a matrix-based method from the viewpoint of logical inference. Consider the following example taken from Han's book 7.

Example 2.1. Suppose I = {I1, I2, I3, I4, I5} is a set of items, and D is a database with 9 transactions shown in Table 1. The results of applying APRIORI to D with min-supp = 2 (here we take the number of transactions as the min-supp for simplifying computing) are shown in Table 2. If we take min-conf = 70%, then the output rules are I1 ∧ I5 ⇒ I2, I2 ∧ I5 ⇒ I1, and I5 ⇒ I1 ∧ I2.

Degrees of support and confidence are two important parameters in mining association rules. For a potential rule, there should be at least one scanning of the database. Hence, computing these two parameters is
Table 1. A Database D.

    TID     TB
    T100    {I1, I2, I5}
    T200    {I2, I4}
    T300    {I2, I3}
    T400    {I1, I2, I4}
    T500    {I1, I3}
    T600    {I2, I3}
    T700    {I1, I3}
    T800    {I1, I2, I3, I5}
    T900    {I1, I2, I3}

Table 2. Generated association rules from frequent itemset {I1, I2, I5}.

    Generated association rule    Degree of Support    Degree of Confidence (%)
    I1 ∧ I2 ⇒ I5                  2                    50
    I1 ∧ I5 ⇒ I2                  2                    100
    I2 ∧ I5 ⇒ I1                  2                    100
    I1 ⇒ I2 ∧ I5                  2                    33
    I2 ⇒ I1 ∧ I5                  2                    29
    I5 ⇒ I1 ∧ I2                  2                    100
a time-consuming task in a large database. Notice that for a rule X ⇒ Y, if we take X and Y as two vertices of a graph, then the rule can be regarded as an edge directed from X to Y. Therefore, mining a potential rule can be taken as finding a directed path from X to Y. By virtue of the idea of matrix representation in graph theory, we can construct a matrix describing the simultaneous appearance of any two items, from which we can find all potential rules of the form X ⇒ Y, where X and Y are items. We shall illustrate our method by the aforementioned example. The database D in Example 2.1 can be represented as a matrix and denoted by
    D = ( 1 1 0 0 1
          0 1 0 1 0
          0 1 1 0 0
          1 1 0 1 0
          1 0 1 0 0
          0 1 1 0 0
          1 0 1 0 0
          1 1 1 0 1
          1 1 1 0 0 )
For any x ∈ I, let Ti(x) describe the state of transaction Ti containing item x; then we have |T(x)| = Σ_{i=1}^{n} Ti(x). Furthermore, for any X ⊆ I, let Ti(X) describe the state of transaction Ti containing itemset X; then we have |T(X)| = Σ_{i=1}^{n} (Π_{x∈X} Ti(x)). Suppose x, y ∈ I, and let r : x ⇒ y be a potential association rule. It is easy to see that the degrees of support of x and x ∪ y can be calculated respectively as follows:

σ(x) = Σ_{i=1}^{n} Ti(x)    (2)

and

σ(x ∪ y) = Σ_{i=1}^{n} Ti(x) Ti(y).    (3)

Hence the degree of confidence of r is

ρ(r) = Σ_{i=1}^{n} Ti(x) Ti(y) / Σ_{i=1}^{n} Ti(x).    (4)

Because Ti(x) = Ti²(x) for any Ti(x), we can replace all Ti(x) in the denominator of Eq. (4) by Ti²(x). Hence Eq. (4) becomes

ρ(r) = Σ_{i=1}^{n} Ti(x) Ti(y) / Σ_{i=1}^{n} Ti²(x).    (5)

Therefore, when x and y run through all possible items, we get a matrix called the basic support matrix of the database, containing the degree of support of any two items, from which we can derive all potential rules of the form x ⇒ y. For instance, for database D, the basic support matrix is

    M = ( 6 4 4 1 2
          4 7 4 2 2
          4 4 6 0 1
          1 2 0 2 0
          2 2 1 0 2 )

For X = {I1} and Y = {I2}, we get that the degrees of support and confidence for rule I1 ⇒ I2 are 4 and 67% by M, because the values of the matrix elements m12 and m11 are 4 and 6, respectively, and m12/m11 ≈ 67%. They are equal to the results given by APRIORI.
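Because Ti(x) takes only the values 0 and 1, the whole basic support matrix is simply the product of the transpose of D with D. A short numpy sketch (ours) reproduces the values above:

    import numpy as np

    # Rows are transactions T100..T900, columns are items I1..I5 (matrix D above).
    D = np.array([[1, 1, 0, 0, 1],
                  [0, 1, 0, 1, 0],
                  [0, 1, 1, 0, 0],
                  [1, 1, 0, 1, 0],
                  [1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 0],
                  [1, 0, 1, 0, 0],
                  [1, 1, 1, 0, 1],
                  [1, 1, 1, 0, 0]])

    M = D.T @ D                  # basic support matrix: M[i, j] = |T({Ii, Ij})|
    print(M)

    sup = M[0, 1]                # support of I1 => I2
    conf = M[0, 1] / M[0, 0]     # confidence of I1 => I2
    print(sup, round(conf, 2))   # 4 0.67, matching the APRIORI results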
3. Logical Analysis
In our opinion, mining association rules can be treated as the inverse problem of approximate reasoning. An approximate reasoning is of the form

    X ⇒ Y
    X'
    --------
    Y',    (6)

where Y' is derived from X' and X ⇒ Y. However, in mining association rules, X ⇒ Y is estimated from X (seen as X') and Y (seen as Y'). Notice that Schema (6) has two possible interpretations: 1) X ∧ (X ⇒ Y) → Y = 1; and 2) (X ⇒ Y) → (X → Y) = 1, which indicate that … by taking a transaction Ti as a valuation from the viewpoint of logic. Therefore the following expression is an estimation of the degree of support for a potential rule:

    Ti(X ⇒ Y) = min{Ti(X), Ti(X) → Ti(Y)},    (9)

where → is the implication operator in classical logic. Hence, we have

    σ(X ⇒ Y) = Σ_{i=1}^{n} min{Ti(X), Ti(X) → Ti(Y)}.

That is the reason for the presented matrix-based algorithm.

Remark: the operations min and → can take other forms, such as t-norms or residual implication operators, in order to extend the method to mining fuzzy association rules or quantitative rules 8,9,10,11. Some work is on the way 10.
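A small Python sketch (ours, not from the paper) of the per-transaction estimation in Eq. (9), evaluated over the columns of matrix D from Example 2.1; swapping min and the classical implication for a t-norm and a residual implicator gives the fuzzy extension mentioned in the remark:

    def implies(p, q):
        # Material implication in classical two-valued logic.
        return max(1 - p, q)

    def support_estimate(tx, ty):
        # Eq. (9) summed over transactions: Ti(X => Y) = min{Ti(X), Ti(X) -> Ti(Y)}.
        return sum(min(x, implies(x, y)) for x, y in zip(tx, ty))

    # Columns I1 and I2 of the matrix D from Example 2.1:
    I1 = [1, 0, 0, 1, 1, 0, 1, 1, 1]
    I2 = [1, 1, 1, 1, 0, 1, 0, 1, 1]
    print(support_estimate(I1, I2))   # 4 = degree of support of I1 => I2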
4. Conclusion
In this paper, a method based on matrix representation for mining boolean association rules has been discussed and analyzed by regarding the mining process as the inverse of approximate reasoning.
References
1. R. Agrawal, T. Imielinski, and A. Swami, Mining association rules between sets of items in large databases. In Proceedings of the ACM-SIGMOD International Conference on Management of Data, Washington D.C., 207-216 (1993).
2. W.-H. Au and K.C.C. Chan, FARM: A data mining system for discovering fuzzy association rules. In Proceedings of 1999 IEEE International Fuzzy Systems Conference, Seoul, Korea (1999).
3. G. Chen, Q. Wei and E. E. Kerre, Fuzzy logic in discovering associations: An overview. In Data Mining and Knowledge Discovery Approaches Based on Rule Induction Techniques. Kluwer Academic Publishers, USA (2002).
4. G. Chen and Q. Wei, Fuzzy association rules and the extended mining algorithms. Information Science, 201-228 (2002).
5. G. Chen, Q. Wei, D. Liu and G. Wets, Simple association rules (SAR) and the SAR-based rule discovery, Computer & Industrial Engineering 43, 721-733 (2002).
6. D. Dubois, E. Hullermeier and H. Prade, A note on quality measures for fuzzy association rules. In Lecture Notes in Artificial Intelligence 2715, 677-648, Springer (2003).
7. J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann (2000).
8. J. Hipp, U. Guntzer and G. Nakhaeizadeh, Algorithms for association rule mining - a general survey and comparison, SIGKDD Explorations 2(1), 58-64 (2000).
9. T.-P. Hong, K.-Y. Lin and S.-L. Wang, Fuzzy data mining for interesting generalized association rules, Fuzzy Sets and Systems 138, 255-269 (2003).
10. E. Hullermeier and J. Beringer, Mining implication-based fuzzy association rules in databases. In Intelligent Systems for Information Processing: From Representation to Applications. Elsevier, B. Bouchon-Meunier, L. Foulloy and R.R. Yager (eds) (2003).
11. D. Ruan and E. E. Kerre, On the extension of the compositional rule of inference, International Journal of Intelligent Systems 8, 807-817 (1993).
ENRICHING THE ER/EER MODEL BASED ON DISCOVERED ASSOCIATION RULES*

GUOQING CHEN, MING REN, XUNHUA GUO
School of Economics and Management, Tsinghua University, Beijing 100084, China
{chengq, renm, ...}@em.tsinghua.edu.cn

The Entity-Relationship model (ER) along with its extensions (EER) is a powerful means for business and data modeling at a conceptual level. However, the models need to be kept extended or enriched with new semantics as the real world changes and the understanding of the real world improves. This paper focuses on enriching the ER model with new knowledge discovered from databases. A particular type of knowledge of concern is the association rule (AR). In bridging the paradigms of AR and ER, a framework of semantic correspondence is proposed. In doing so, the notion of specialization is extended so as to reflect sub-groupings of entities and relationships in terms of E-specialization and R-specialization, respectively.
1. Introduction
The Entity-Relationship (ER) model is a powerful means for business and data modeling that helps identify essential elements of the domain of interest in a conceptual and integrated manner. The ER model describes the real-world aspects in terms of entities, relationships and attributes [3]. During the past decades, basic ER concepts have been extended in various ways, resulting in enhanced concepts (EER) such as aggregation, refinement, specialization/generalization, inheritance, etc. (e.g., [8,9]). Moreover, as uncertainty and imprecision are often inherent in conceptual semantics, fuzzy ER models have been proposed as well from different perspectives (e.g., fuzzy ER concepts in [13], and fuzzy EER concepts in [4,10]). These efforts have largely enriched the usefulness and expressiveness of ER/EER as a conceptual model in many respects. Traditionally, ER/EER models are built upon the knowledge available from business managers, systems analysts and database designers. There are reasons, however, that the models need to be kept extended or enriched with new semantics as the real world concerned changes and the understanding of the real world improves. Data mining is a process of nontrivial discovery of implicit, previously
* The work was partly supported by the National Natural Science Foundation of China (79925001/70231010), the MOE Funds for Doctoral Programs (20020003095), and the Bilateral Scientific and Technological Cooperation Between China and Flanders.
unknown and potentially useful knowledge from large databases. The semantics reflected by the newly discovered knowledge may either be previously unaware of at the time the model was built, or be previously unimportant/uninteresting but now become significantly meaningful and worth considering. Thus, incorporating the new knowledge into the model is necessary and enables us to gain a more integrated and informative view of the domain. A particular type of knowledge of concern is the association rule (AR), which is of the form X ⇒ Y, where X and Y are collections of data items, and X ∩ Y = ∅. Usually, two measures (i.e., degree of support and degree of confidence) are used to evaluate a rule. A rule is regarded as interesting if these two measures meet given thresholds, and if it satisfies other interestingness measures set by domain experts and decision makers. An example of an AR is "customers with high income tend to buy a large quantity of products". In recent years, association rule mining has attracted considerable attention in algorithmic efficiency and semantic expressiveness (such as generalized, multi-level, quantitative, or fuzzy association rules [2,5,6,7,11,12]). This paper aims at bridging the paradigms of AR and ER by introducing a framework of semantic correspondence between AR data items and ER/EER components.
2. E-Specialization and R-Specialization
An ER model can be denoted as a 2-tuple (E, R), where E = {E1, E2, ..., Em} and R = {R1, R2, ..., Rn} are sets of entity classes and of relationship classes. Each Ei is associated with a set of attributes J(Ei), and each Rj with a set of attributes J(Rj). Traditionally, a specialization is referred to as a process of defining subclasses from a given class. The specialization may be attribute-based or user-defined, depending on how the subclasses are generated. Basically, if Ei is a subclass of F (F is called the super-class of Ei), we have Ei ⊆ F. Graphically, an attribute-based specialization is illustrated in Figure 1, where subclasses E1, E2 and E3 are defined on attribute A of entity class F according to the values of A: v1, v2, v3, respectively. Such A-values can be singletons, subsets, or fuzzy sets [4]. For instance, for subsets of A, the specialization may result in subclasses StudentOfAge[0, 30], StudentOfAge[31, 60], and StudentOfAge[61, 100] respectively. Further, a more general specialization is on fuzzy values of A, such as young, middle-aged and old, giving rise to subclasses YoungStudent, Middle-agedStudent, and OldStudent, respectively. Since each of such specializations is with respect to entities, we hereby refer to it as an E-specialization. Moreover, there are cases in which specialization may be with respect to
Figure 1. E-specialization on attribute A.
Figure 2. R-specialization on attribute A.
relationships. For example, suppose that Quantity is an attribute of relationship class Purchase that links entity classes Product and Customer. Then Purchase may be specialized in terms of the values of Quantity (crisp or fuzzy). As a result, we may have sub-groupings of the instances of Purchase. In Figure 2, R is specialized into R1 on value v of attribute A. The dotted lines indicate its participating entity classes. Specializations as such are called R-specializations. Note that any subclass of R may also have its own attributes (e.g., attribute B). Furthermore, an R-specialization can be made based on an attribute of not only a relationship class, but also a participating entity class. As shown in Figure 3, R1 is a sub-grouping of instances of R with participating entities of E12 and E2. An example of R1 is "customers (E2) purchase products with A=v12 (E12)". In general, we have R1 ⊆ R.

3. Semantic Correspondence between ER and AR
Consider an association rule X ⇒ Y, with each data item in X or Y being of the form <A, v>, where A is a descriptive character of an object, and v is a value of A. Let s : I → E ∪ R denote a mapping from items (i.e., <A, v>'s) to entity classes in E or to relationship classes in R. In other words, for <A, v>, there exists E ∈ E or R ∈ R, such that A ∈ J(E) or A ∈ J(R), and v ∈ Dom(A). Without loss of generality, suppose an AR to be reflected in ER/EER is of the form <A1, v1> ⇒ <A2, v2>. We then have four cases of semantic correspondence as follows.

Case 1: AR mapped onto two entities. This association rule may lead to an R-specialization on two attributes (A1 and A2), each of which is related to a separate entity class. In fact, these two attributes may also result in two E-specializations of entity classes E1 and E2, with subclasses E1(A1=v1) and E2(A2=v2) respectively. Thus, the R-specialization is with respect to
these two subclasses. Concretely, let R1 = R(E1(A1=v1), E2(A2=v2)); we have R1 ⊆ R. For example, an association rule …

… R = {<x, μR(x), νR(x)> | x ∈ X}, where x = <col1, ..., coln> is an ordered tuple belonging to a given universe X, {col1, ..., coln} is the set of attributes of the elements of X, and μR(x) is the degree of membership of x in the relation R. In other words, R is an intuitionistic fuzzy subset of X with membership and non-membership functions μR and νR respectively. A projection operation over R defines a relation which is a vertical subset of R, containing the values of the specified attributes, i.e. containing only the specified columns: π_{col1,...,colm}(R) = {<<x.col1, ..., x.colm>, μR(x), νR(x)> | x ∈ X}, where m ≤ n and the set of attributes {col1, ..., colm} is a subset of {col1, ..., coln}. The projection retains the degrees of membership and non-membership of R. A selection operation defines a relation which contains only those tuples from R for which a certain predicate is satisfied. We can say that the selection modifies the degrees of membership and non-membership of R depending on the corresponding value of the predicate: σ_P(R) = {<x, min(μR(x), μP(x)), max(νR(x), νP(x))> | x ∈ X}, where P is the predicate, i.e. the elements of the result relation have a degree of membership which is logically AND-ed with the corresponding value of the predicate P. A Cartesian product of two relations R × S is identical to the Cartesian product operation defined in the intuitionistic fuzzy sets theory [2], which uses the logical AND between the degrees of membership. Let S be another intuitionistic fuzzy relation, S = {<y, μS(y), νS(y)> | y ∈ Y}; then: R × S = {<<x, y>, min(μR(x), μS(y)), max(νR(x), νS(y))> | x ∈ X, y ∈ Y}. We will also define the term temporal intuitionistic fuzzy relation, which is an analog of the temporal intuitionistic fuzzy set, i.e. a relation whose elements have time-dependent degrees of membership and non-membership.
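As an illustration of these operations, the following Python sketch (ours; the toy relation and the predicate are hypothetical) represents a relation as a dictionary mapping tuples to (membership, non-membership) pairs:

    def select(R, pred):
        # Selection: AND each tuple's (membership, non-membership) degrees
        # with the intuitionistic fuzzy value of the predicate.
        out = {}
        for x, (m, n) in R.items():
            pm, pn = pred(x)
            out[x] = (min(m, pm), max(n, pn))
        return out

    def product(R, S):
        # Cartesian product: min of memberships, max of non-memberships.
        return {(x, y): (min(mr, ms), max(nr, ns))
                for x, (mr, nr) in R.items()
                for y, (ms, ns) in S.items()}

    # Toy relation over tuples (name, age) with (membership, non-membership).
    R = {("Smith", 34): (0.8, 0.1), ("Jones", 61): (0.6, 0.3)}
    young = lambda t: (0.9, 0.0) if t[1] < 40 else (0.2, 0.7)
    print(select(R, young))
    # {('Smith', 34): (0.8, 0.1), ('Jones', 61): (0.2, 0.7)}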
3. Physical Representation of the Intuitionistic Fuzzy Relational Model
The intuitionistic fuzzy relation is represented as a table with two additional columns for storing the degree of membership and non-membership of the record. In the case of a temporal IFR, the table should contain at least one more column for a timestamp and a rule for computing the degree of membership and non-membership depending on the timestamp. The intuitionistic fuzzy predicates are stored as functions which return an intuitionistic fuzzy Boolean value (a record type with two fields, m and n, respectively the truth and falsity degree). The IFRDB model can contain an explanatory rules database, in which some imprecise terms are defined along with the rule for computing the degrees of truth and falsity. A certain predicate can
contain in its definition the logic for computing the degree of truth and falsity, or, if the database has an explanatory rules base, it can correspond to a certain explanatory rule.
4. Intuitionistic Fuzzy SQL

When using SELECT statements, the SQL should interpret the predicate in the WHERE clause as a modifier of the degrees of membership and non-membership of the rows in the query. When using DML commands for modifying the data in the database, there are two ways of determining the degrees of membership and non-membership of the modified (inserted) records:
- implicit: the data is modified by an expert who has a certain degree of reliability, and the records obtain the corresponding degree of reliability;
- explicit: the degrees are specified directly in the command.

We define the following modifiers, which modify either the degrees of membership of the elements in the entire relation, or the value of a certain predicate:
- VERY/MUCH strengthens the predicate by modifying its degrees of truth and falsity according to the operation A2 defined in [4]: if P = <a, b>, then VERY P = <a^2, 1 - (1 - b)^2>;
- APPROX, opposite to VERY, loosens the predicate according to the operation 2.A defined in [4]: if P = <a, b>, then APPROX P = <1 - (1 - a)^2, b^2>;
- POSS and CERT: respectively the operators ◇ and □ from the intuitionistic fuzzy sets theory; they transform the intuitionistic fuzzy relation (predicate) to a fuzzy relation (predicate) by adding the degree of indefiniteness either to the degree of truth or to the degree of falsity;
- other modifiers: D[a], F[a,b], G[a,b], H[a,b], J[a,b], H*[a,b], J*[a,b], C, I, P[a,b], Q[a,b];
- custom modifiers: user-defined modifier functions.

We define the keywords -mship and -nmship, which represent the columns containing respectively the degrees of membership and non-membership of the rows in the result relation. We define the keyword MODIF in order to define a MODIF clause, which stands at the end of the statement and states that the modifier(s) specified after MODIF must be applied to the result relation. The level operators could also be applied to a certain relation in order to filter only those elements with a degree of membership greater than a certain level (e.g. the operator N[a,b] from the intuitionistic fuzzy sets theory). The intuitionistic fuzzy SQL statement should be translated to a classical SQL statement, i.e. the target SQL statement should contain the formulas for calculating the degrees of membership (μr) and non-membership (νr) of the result records. For this purpose we define the following translation rules for the FROM and WHERE clauses:
- We calculate μf and νf (respectively the degrees of membership and non-membership obtained after the application of the Cartesian product operation) for all relations in the FROM clause. Let R1, R2, ..., Rn be the relations in the FROM clause. An element of the result set is of the form xr = <x1, x2, ..., xn>, where xk (k = 1, ..., n) is the corresponding tuple (set of attributes) from the relation Rk. Then, for each tuple xr from the result set, μf = min(μR1(x1), μR2(x2), ..., μRn(xn)) and νf = max(νR1(x1), νR2(x2), ..., νRn(xn)).
- We calculate μs and νs (respectively the degrees of truth and falsity used for applying the selection operation) for all the predicates in the WHERE clause. The single logical operations are translated in the following way. For operation OR (p1 OR p2 OR ... OR pn): μOR(p1, ..., pn) = max(μ(p1), ..., μ(pn)), νOR(p1, ..., pn) = min(ν(p1), ..., ν(pn)). For operation AND (p1 AND p2 AND ... AND pn): μAND(p1, ..., pn) = min(μ(p1), ..., μ(pn)), νAND(p1, ..., pn) = max(ν(p1), ..., ν(pn)). For operation NOT (NOT p): μNOT(p) = ν(p), νNOT(p) = μ(p). For all the modifiers there should be a corresponding function defined in the database; for example, the modifier VERY should have a corresponding function VERY which returns the modified intuitionistic fuzzy Boolean value.
- μr = min(μf, μs), νr = max(νf, νs).
- Since the entire WHERE clause of the intuitionistic fuzzy SQL statement acts as a modifier of the degrees of membership and non-membership of the result rows, if the statement doesn't contain level operators, the WHERE clause in the translated SQL statement should contain only the condition -nmship < 1 in order to filter those records which have no degree of membership or indefiniteness; otherwise the translated SQL statement should contain in its WHERE clause the corresponding conditions to satisfy the level operators.

The realization of the functions min and max with a variable count of arguments in SQL uses the aggregate functions min and max and is the following (using Oracle PL/SQL syntax): min(x1, x2, ..., xn) is represented as "(SELECT min(-x-) FROM (SELECT x1 AS -x- FROM dual UNION ALL SELECT x2 FROM dual UNION ALL ... UNION ALL SELECT xn FROM dual))"; max(x1, x2, ..., xn) is represented as "(SELECT max(-x-) FROM (SELECT x1 AS -x- FROM dual UNION ALL SELECT x2 FROM dual UNION ALL ... UNION ALL SELECT xn FROM dual))".
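The modifier functions can be prototyped directly from the definitions above. A Python sketch (ours), with intuitionistic fuzzy values represented as (truth, falsity) pairs:

    def very(p):
        # VERY/MUCH (operation A2): strengthens the predicate.
        m, n = p
        return (m ** 2, 1 - (1 - n) ** 2)

    def approx(p):
        # APPROX (operation 2.A): loosens the predicate.
        m, n = p
        return (1 - (1 - m) ** 2, n ** 2)

    def poss(p):
        # POSS (operator ◇): adds the indeterminacy to the truth degree.
        m, n = p
        return (1 - n, n)

    def cert(p):
        # CERT (operator □): adds the indeterminacy to the falsity degree.
        m, n = p
        return (m, 1 - m)

    p = (0.7, 0.2)                 # truth 0.7, falsity 0.2, indeterminacy 0.1
    print(very(p), approx(p), poss(p), cert(p))
    # (0.49, 0.36) (0.91, 0.04) (0.8, 0.2) (0.7, 0.3)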
Example: Let us have the intuitionistic fuzzy relations A and B. Both of them contain columns M and N for storing the degrees of membership and non-membership respectively. Let the intuitionistic fuzzy predicates P and Q be defined over the columns A.X and B.Y.

Intuitionistic fuzzy SQL statement:

SELECT A.X, B.Y, -mship, -nmship FROM A JOIN B ON A.id = B.id
WHERE VERY P(A.X, B.Y) OR Q(A.X, B.Y) MODIF N[0.3, 0.7]

Translated SQL statement (Oracle PL/SQL syntax):

SELECT A.X, B.Y,
(select min(-x-) from (select a.m as -x- from dual union all select b.m from dual union all select max(-x-) from (select very(p(a.x, b.y)).m from dual union all select q(a.x, b.y).m from dual))) as -mship,
(select max(-x-) from (select a.n as -x- from dual union all select b.n from dual union all select min(-x-) from (select very(p(a.x, b.y)).n from dual union all select q(a.x, b.y).n from dual))) as -nmship
FROM A JOIN B ON A.id = B.id
WHERE -mship >= 0.3 AND -nmship <= 0.7
Step 2: If I_Mid > 128 Then I_Mid^F = 255 - I_Mid, Else I_Mid^F = I_Mid.

Step 3: Determine σ from the fuzzified values: If I_Mid^F > I_Max^F Then σ = I_Max^F; Else If I_Mid^F > I_Min^F Then σ = I_Mid^F; Else σ = I_Min^F.

Step 4: Calculate the normalized I_Min^New and I_Max^New:
I_Min^New = I_Mid - σ, I_Max^New = I_Mid + σ.
In most cases, individual identifiers are embossed in the identifier area, and the noise between identifier codes and the background is caused by shadows. We used the fuzzy binarization algorithm to remove the noise from the shadows. The membership function of the proposed fuzzy binarization is shown in Figure 3.
"t I
I,,"
Figure 3. Proposed fuzzy membership function
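The exact shape of the Figure 3 membership function is only qualitatively recoverable here, so the following Python sketch (ours) assumes a triangular function peaking at I_Mid over the normalized interval of Step 4; it illustrates the mechanism of the fuzzy binarization rather than reproducing the published algorithm:

    import numpy as np

    def fuzzy_binarize(gray, sigma, cut=0.5):
        # Triangular membership (an assumption) peaking at I_Mid over the
        # normalized interval [I_Mid - sigma, I_Mid + sigma] of Step 4.
        i_mid = (int(gray.min()) + int(gray.max())) // 2
        lo, hi = i_mid - sigma, i_mid + sigma
        g = gray.astype(float)
        rising = np.clip((g - lo) / max(i_mid - lo, 1), 0.0, 1.0)
        falling = np.clip((hi - g) / max(hi - i_mid, 1), 0.0, 1.0)
        membership = np.minimum(rising, falling)
        # Defuzzify with an alpha-cut: keep pixels with high enough membership.
        return (membership >= cut).astype(np.uint8) * 255

    # Hypothetical usage on an 8-bit grayscale identifier area `area`:
    # binary = fuzzy_binarize(area, sigma=40)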
Next, we extracted the container identifiers from the binarized identifier area by using the contour tracking method. In this paper, the 4-directional contour tracking method using a 2x2 mask was applied, considering the whole preprocessing time of container images [5].

Figure 4. Two types of identifier extraction algorithms: (a) identifier extraction in a vertical identifier area; (b) identifier extraction in a horizontal identifier area.
In this paper, the extracted identifiers are arranged in a single row by using Euclidean distances between identifiers and classified into three code groups. The Euclidean distance is calculated by measuring the distance between the start pixel of the first identifier and the start pixel of the other identifier having a vertical offset from the first identifier. The vertical offset must be less than one half of the vertical size of the first identifier. Then, by combining identifier sequences in every row, one row of identifiers is created. Finally, identifiers in the row are classified sequentially into code groups according to the ISO standard [1]. Figure 4(a) shows the procedure for identifier extraction in an identifier area with vertical arrangement and Figure 4(b) shows the extraction procedure in an area with horizontal arrangement.

3. Identifier Recognition using an Enhanced RBF Network
For the improvement of the success rate of recognition, this paper proposes an enhanced RBF network that adapts the ART1 network to the learning structure between the input layer and the middle layer, and applies the output layer of the ART1 network to the middle layer. In the ART1 network, the vigilance parameter determines the allowable degree of mismatch between any input pattern and saved patterns [6]. Moreover, since many applications of image recognition based on the ART1 network assign an empirical value to the vigilance parameter, a reduction of the success rate of recognition may occur. To correct this defect, this paper enhances the ART1 network by adjusting the vigilance parameter dynamically according to the homogeneity between patterns by using Yager's intersection operator, which is one of the fuzzy connection operators [7]. Eq. (2) shows the equation applied to the ART1 network for refinement in this paper, which dynamically adjusts the vigilance parameter ρ by using Yager's intersection operator:

ρ(n + 1) = 1 - min(1, √((1 - ρ(n))^2 + (1 - ρ(n - 1))^2))    (2)
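Eq. (2) is easy to prototype. The Python sketch below (ours) iterates the vigilance update from two hypothetical starting values:

    from math import sqrt

    def next_vigilance(rho_n, rho_prev):
        # Eq. (2): Yager's intersection (w = 2) of the two previous values.
        return 1 - min(1.0, sqrt((1 - rho_n) ** 2 + (1 - rho_prev) ** 2))

    rho = [0.9, 0.85]                # two hypothetical initial vigilance values
    for _ in range(5):
        rho.append(next_vigilance(rho[-1], rho[-2]))
    print([round(r, 3) for r in rho])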
4. Performance Evaluation
In total, 100 container images of 754x504 pixel size and 256 colors were used in the experiment. By using the proposed extraction algorithm, the identifier areas were successfully extracted from all 100 images.

Table 1. Performance comparison of identifier extraction
Applying the identifier extraction algorithms proposed in this paper and the histogram-based algorithm [3] to the extracted identifier areas, the experimental results were summarized and compared in Table 1. Our algorithm first distinguished the background and container identifiers by using the proposed fuzzy binarization, and then extracted identifiers by using contour tracking. Table 2 compares learning performances in the experiment that applied the conventional ART1-based RBF network algorithm and the enhanced ART1-based RBF network to container identifiers extracted by the proposed algorithm mentioned above. As shown in Table 2, the number of clusters created in the learning process of the proposed ART1 network was much lower than with the conventional ART1 network, which means that it is efficient to use the proposed ART1 network in the construction of the middle layer of the enhanced ART1-based RBF network.

Table 2. Comparison of learning performance
Table 3 compares the recognition performances of the two algorithms by the number of recognition successes in the experiment. As shown in Table 3, the recognition rate of the enhanced ART1-based RBF network was higher than that of the conventional ART1-based RBF network.

Table 3. Comparison of recognition performance
5. Conclusions
In this paper, we have proposed and evaluated a novel recognition algorithm of container identifiers for the automatic recognition of transport containers. The container images demonstrate certain characteristics, such as irregular size and position of identifiers, diverse colors of background and identifiers, and the impaired shape of identifiers caused by container damage and the bent surface of containers, making identifier recognition by image processing difficult. Hence, we proposed a fuzzy binarization algorithm to separate the background clearly. For identifier recognition, we also proposed an enhanced ART1-based RBF network that organizes the middle layer effectively using the enhanced ART1 neural network. The proposed network adjusts the vigilance parameter dynamically according to the homogeneity between patterns. Results of the recognition experiment, obtained by applying the conventional ART1-based RBF network and the enhanced RBF network to the 1054 extracted identifiers, show that the enhanced ART1-based RBF network has a higher rate of recognition compared to the conventional ART1-based RBF network.
References
1. ISO 6346, Freight Containers - Coding, Identification and Marking, 1995.
2. N. B. Kim, "Character Segmentation from Shipping Container Image using Morphological Operation," Journal of Korea Multimedia Society, Vol.2, No.4, pp.390-399, 1999.
3. M. Y. Nam, E. K. Lim, N. S. Heo and K. B. Kim, "A Study on Character Recognition of Container Image using Brightness Variation and Canny Edge," Proceedings of Korea Multimedia Society, Vol.4, No.1, pp.111-115, 2001.
4. Liane C. Ramac and Pramod K. Varshney, "Image Thresholding Based on Ali-Silvey Distance Measures," Pattern Recognition, Vol.30, No.7, pp.1161-1173, 1997.
5. K. B. Kim, S. W. Jang and C. K. Kim, "Recognition of Car License Plate by Using Dynamical Thresholding Method and Enhanced Neural Networks," Lecture Notes in Computer Science, LNCS 2756, pp.309-319, 2003.
6. Grossberg, S., "Adaptive pattern classification and universal recoding: parallel development and coding of neural feature detectors," Biol. Cybern., Vol.23, pp.187-202, 1976.
7. H. J. Zimmermann, Fuzzy Set Theory and its Applications, Kluwer Academic Publishers, 1991.
FUZZINESS-DRIVEN EDGE DETECTION BASED ON RENYI'S α-ORDER FUZZY ENTROPY
I. K. VLACHOS AND G. D. SERGIADIS
Aristotle University of Thessaloniki, Faculty of Technology, Department of Electrical & Computer Engineering, Telecommunications Laboratory, University Campus, GR-54124, Thessaloniki, Greece
E-mail: {ivla,sergiadi}@auth.gr

This paper presents an algorithm for edge detection in images based on fuzzy sets theory. In order to perform edge detection we exploit the fuzziness present at edge locations in an image. The proposed parametric scheme is based on the notion of the α-order entropy of a fuzzy set. The parameter α controls the sensitivity of the algorithm to detect various types of edges. A detailed comparison with the algorithm proposed in [1] is carried out.
1. Introduction

Edge detection is a fundamental task in pattern recognition and machine vision. An edge is defined as the boundary between two regions with relatively distinct gray-level properties. Most edge detection techniques are based on the computation of a local derivative operator. Fuzzy sets theory [2] has been successfully applied to many image processing and pattern recognition problems. The extensive use of fuzzy logic in digital image processing is mainly favored by the ability of fuzzy sets to cope with "qualitative" measures, such as the edgeness of a region, by modelling the ambiguity and vagueness often present in digital images. In addition, fuzzy sets theory provides us with a solid mathematical framework for incorporating expert knowledge into digital image processing systems. In this paper we present an efficient algorithm for edge detection in images based on fuzzy sets theory, as a modification of the algorithm introduced in [1]. The method exploits the intrinsic fuzziness present at areas along an edge. The fuzziness is measured using Renyi's α-order entropy of a fuzzy set. An intuitive membership function is also introduced in order to describe the "degree of edgeness" of a pixel in its vicinity.
Figure 1. (a) e_α curves for α → 1 (solid line), α = 0.3 (short-dashed line), and α = 3.0 (dashed line). (b) Sugeno-based parametric index of fuzziness for λ = 80.0 (solid line), and λ = −0.9877 (short-dashed line).
2. Fuzzy Sets and Fuzzy Entropies

2.1. Image Representation in the Setting of Fuzzy Sets Theory

Let us consider an image X of size M × N pixels, having L gray levels g ranging from 0 to L − 1. The image X can be regarded as an array of fuzzy singletons [3]-[5]. Each element of the array denotes the membership value μ_X(g_ij) of the gray level g_ij, corresponding to the (i,j)-th pixel, with respect to a predefined image property such as brightness, edgeness, etc. Using the fuzzy sets notation, image X can be represented as:
X = \left\{ \mu_X(g_{ij})/g_{ij} \;\middle|\; i = 0, 1, \ldots, M-1,\; j = 0, 1, \ldots, N-1 \right\} \qquad (1)
2.2. The α-Order Entropy of a Fuzzy Set
As an extension of Shannon's entropy, Renyi in [6] defined the α-order entropy H_α of a probability distribution (p_1, p_2, …, p_n). Bhandari and Pal in [7] introduced the α-order fuzzy entropy, which in the case of an image X defined as in Eq. (1) is given by:

H_\alpha(X) = \frac{1}{MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} e_\alpha\!\left(\mu_X(g_{ij})\right) \qquad (2)

where α (≠ 1) is a positive real parameter and e_α(μ_X(g_ij)) is defined as:

e_\alpha(\mu) = \frac{1}{1-\alpha} \log\!\left[\mu^\alpha + (1-\mu)^\alpha\right] \qquad (3)
It should be mentioned that the α-order fuzzy entropy constitutes a one-parameter generalization of De Luca and Termini's entropy H_LT defined in [8], since lim_{α→1} H_α = H_LT. Fig. 1(a) illustrates Eq. (3) for various values of the parameter α.

3. Fuzziness-Based Edge Detection
3.1. Fast Fuzzy Edge Detection [1]
An edge is defined as the boundary between two regions with relatively distinct intensity properties. According to this definition, edgy pixels belong to both regions. Therefore, it is expected that these pixels should exhibit high fuzziness values. In [1] the linear index of fuzziness was used to measure the degree of fuzziness of pixels inside a square window W. Let us consider an "optimal" edge in a 3 × 3 neighborhood W:

W = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 100 & 100 & 100 \end{pmatrix} \qquad (4)

The corresponding membership values can be easily calculated by intensity normalization:

\mu_W = \begin{pmatrix} 0.0 & 0.0 & 0.0 \\ 0.0 & 0.0 & 0.0 \\ 1.0 & 1.0 & 1.0 \end{pmatrix} \qquad (5)
Definition 3.1. The degree of edgeness of the (m,n)-th pixel is given by

\mu_{Edgeness}(g_{mn}) = \min\!\left\{1,\; \frac{2}{w} \sum_{i=0}^{w-1} \sum_{j=0}^{w-1} \min\!\left(\mu(g_{ij}),\, 1-\mu(g_{ij})\right)\right\} \qquad (6)

where w is the size of the sliding square window. Eq. (6) can be rewritten for the window W centered at the (m,n)-th pixel as:

\mu_{Edgeness}(g_{mn}) = \min\{1,\; w\,\gamma_l(W)\} \qquad (7)

where γ_l is the linear index of fuzziness defined in [9], which in the case of an image X is given by the following formula:

\gamma_l(X) = \frac{2}{MN} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \min\!\left(\mu_X(g_{ij}),\, 1-\mu_X(g_{ij})\right) \qquad (8)
The index of fuzziness considers the intersection of a fuzzy set and its complement. This means that the index of fuzziness measures the lack of distinction between fuzzy sets and their complements.
In [1] the membership function of Eq. (7) was modified in order for the algorithm to become more robust, since the spatial calculation of the membership values is noise sensitive. The modified function is given by the following formula:

\mu'_{Edgeness}(g_{mn}) = \min\!\left\{1,\; w\,\gamma_l(W)\,\frac{\max_{spatial}(g_{ij}) - \min_{spatial}(g_{ij})}{g_{max}}\right\} \qquad (9)

where g_max is the maximum gray level of the image. Furthermore, in [1] the parametric Sugeno fuzzy complement was also used in order to control the sensitivity of the algorithm to edges. The Sugeno fuzzy complement is given by the following equation:

c_\lambda(\mu) = \frac{1-\mu}{1+\lambda\mu} \qquad (10)

with λ ∈ (−1, ∞).
3.2. Proposed Method

The fuzziness-based approach to edge detection proposed in [1] exhibits some drawbacks. Let us consider the following situation, where the pixels inside the sliding window have only two different intensity values, that is

W = \begin{pmatrix} 80 & 80 & 80 \\ 8 & 8 & 8 \\ 8 & 8 & 8 \end{pmatrix} \qquad (11)
The corresponding membership values, computed by intensity normalization, are either 0 or 1, thus assigning to the central pixel a degree of edgeness equal to zero, according to Eq. (9). Therefore, using Eq. (9) simply fails to extract this type of edges. Another issue arises from the non-symmetric behavior of the parametric index of fuzziness based on the Sugeno fuzzy complement, shown in Fig. 1(b), with product as the intersection operator. Let us consider the case of the sliding window being located at two different types of edges, one "bright" and one "dark", with corresponding membership values, i.e.

W' = \begin{pmatrix} 0.0 & 0.0 & 0.0 \\ 0.9 & 0.9 & 0.9 \\ 1.0 & 1.0 & 1.0 \end{pmatrix}, \qquad W'' = \begin{pmatrix} 0.0 & 0.0 & 0.0 \\ 0.1 & 0.1 & 0.1 \\ 1.0 & 1.0 & 1.0 \end{pmatrix} \qquad (12)

Both edges are "equivalent" in the sense that the strength of the edge in both cases, calculated as the gray-level difference, equals 0.9. Thus, it is expected that both edges should be treated similarly, that is, assigning the
same degree of edgeness to the central pixel. However, this is not the case if we consider the parametric approach based on the Sugeno fuzzy complement with product as the intersection operator. Due to its non-symmetric nature, the Sugeno-based index of fuzziness assigns different weights for the same departure from the values 0 and 1. For example, in the case of the edges mentioned above, if we calculate the degree of edgeness for λ = −0.9877, in the case of the "bright" edge the membership value is 1.0, while for the "dark" one we have 0.2465. This shows clearly that even though the edges are "equivalent" they are not treated the same way. In order to overcome this drawback we propose the following algorithm for fuzziness-driven edge detection based on Renyi's α-order entropy of a fuzzy set.

Fuzzy image processing consists of three stages: (a) fuzzification, (b) suitable modification of the membership values, and (c) defuzzification. In the fuzzification stage of the proposed algorithm, the membership function μ_X is initialized according to:

\mu_X(g_{ij}) = \frac{g_{ij} - g_{min}}{g_{max} - g_{min}} \qquad (13)

for all i ∈ {0, 1, …, M−1} and j ∈ {0, 1, …, N−1}, where g_min and g_max stand for the minimum and maximum gray levels of the image respectively. Using Eq. (13) for initializing the membership values has the advantage of stretching out the membership function over the entire gray-level range and normalizing the intensity values to lie in the unit interval [0,1]. Fuzzy entropy is a measure of the fuzziness of a fuzzy set, arising from the inherent ambiguity carried by the fuzzy set itself. Moreover, entropy is also a measure of the amount of information contained in a system. This approach provides us with another way of looking at the problem of edge detection, since it is expected that edge points will carry more information than non-edge ones, because object contours are the primary sources of stimulation of the human visual system. In order to ensure that the proposed method successfully retrieves all existing edges, we modify Eq. (9) as follows:

\mu''_{Edgeness}(g_{mn}) = \min\{1,\; w\,H_\alpha(W)\,\delta\} \qquad (14)

where δ is given by the following formula

\delta = \max_{spatial}\!\left(\bar{\mu}_X(g_{ij})\right) - \min_{spatial}\!\left(\bar{\mu}_X(g_{ij})\right) \qquad (15)
Figure 2. (a) Computer-generated image. Results obtained using the Sugeno-based parametric index of fuzziness for (b) λ = −0.9877 and (c) λ = 80.0. (d) Proposed edge detection scheme for α = 0.3.
and μ̄_X(g_ij) is defined as

\bar{\mu}_X(g_{ij}) = \frac{g_{ij}}{\max\limits_{i \in \{0,1,\ldots,M-1\},\, j \in \{0,1,\ldots,N-1\}} g_{ij}} \qquad (16)
Parameter α of the α-order fuzzy entropy controls the sensitivity of the proposed method to edges. As α → 0 the measure becomes more sensitive in extracting edges from the image under processing. It should be mentioned that instead of Renyi's α-order entropy, we can use any parametric index of fuzziness derived from parametric t-norms, such as Dombi's t-norm, or the t-norms proposed by Hamacher, Schweizer-Sklar etc. These parametric fuzzy t-norms can be used to implement symmetric indices of fuzziness in the general form of

\gamma_T(X) = A \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} T_p\!\left(\mu_X(g_{ij}),\, 1 - \mu_X(g_{ij})\right) \qquad (17)

where T(·) is the implemented t-norm, p denotes the control parameter of the t-norm, and A is a normalization factor ensuring that the index of fuzziness takes values in the [0,1] interval. This is an essential requirement for the index in order to be qualified as a sensible measure of fuzziness.
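To make the pipeline concrete, the following sketch assembles the reconstructed Eqs. (13)-(16) into an edgeness-map computation. It is a best-effort reading of the formulas above, not the authors' code; the base-2 logarithm (which keeps e_α within [0,1]) and the naive sliding window are our own choices.

```python
# Hedged sketch of the proposed edgeness measure, assembling the
# reconstructed Eqs. (13)-(16); log base 2 (keeping e_alpha in [0,1])
# and the naive sliding window are assumptions, not the authors' code.
import numpy as np

def e_alpha(mu: np.ndarray, alpha: float) -> np.ndarray:
    """Elementwise alpha-order entropy term, Eq. (3)."""
    return np.log2(mu ** alpha + (1.0 - mu) ** alpha) / (1.0 - alpha)

def edgeness_map(img: np.ndarray, alpha: float = 0.3, w: int = 3) -> np.ndarray:
    g = img.astype(float)
    mu = (g - g.min()) / (g.max() - g.min())   # fuzzification, Eq. (13)
    mu_bar = g / g.max()                       # normalization, Eq. (16)
    out = np.zeros_like(mu)
    r = w // 2
    for i in range(r, g.shape[0] - r):
        for j in range(r, g.shape[1] - r):
            win = mu[i - r:i + r + 1, j - r:j + r + 1]
            h_alpha = e_alpha(win, alpha).mean()        # H_alpha of window
            wb = mu_bar[i - r:i + r + 1, j - r:j + r + 1]
            delta = wb.max() - wb.min()                 # contrast, Eq. (15)
            out[i, j] = min(1.0, w * h_alpha * delta)   # Eq. (14)
    return out
```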
4. Experimental Results

In order to evaluate the performance of the proposed edge detection scheme, we applied the proposed algorithm to various real-world and synthetic images of different types. We have considered gray-scale images of size 256 × 256 pixels with 8 bits-per-pixel gray-tone resolution. Fig. 2(a) demonstrates how the proposed method overcomes the drawbacks stated in Sec. 3.2. The synthetic computer-generated image consists of the two "optimal" edges described by Eq. (12). The horizontal edges of the rectangle correspond to the "bright" edges, while the vertical ones to the "dark". Figs. 2(b) and 2(c) depict the results obtained using the Sugeno-based index of fuzziness with product as the intersection operator and the value of the parameter λ set to 80.0 and −0.9877 respectively. The result derived using the proposed approach is illustrated in Fig. 2(d) for α = 0.3. By comparing the produced edge maps we can observe that our method successfully treats both edges equivalently, by assigning them the same degree of edgeness, while at the same time it overcomes the drawback of the sliding window being located at an area consisting of only two different intensity values.

The proposed method was also tested using real-world images. The results obtained using our approach are compared to those derived using different edge-detection techniques, such as the Sobel and Prewitt operators. Figs. 3(b) and 3(c) show the edge maps of the test image illustrated in Fig. 3(a) for different values of the parameter α. For the image of Fig. 3(b) we set α = 3.0, while for the one in Fig. 3(c) the parameter was set to 0.3. Fig. 3(d) shows the result obtained after applying the Sobel operator. From the images one can observe that as the parameter α decreases more edges emerge, since the α-order fuzzy entropy assigns higher weights even to small gray-level differences. Compared to the Sobel operator, the proposed method extracts significantly different types of edges and has the advantage that its sensitivity can be easily controlled by suitably tuning the parameter α of Renyi's fuzzy entropy. The efficiency of our algorithm is mainly favored by the ability of fuzzy sets to model the ambiguity and vagueness present in digital images. Another result is also depicted in Fig. 4, from which similar conclusions can be drawn.

Figure 3. (a) Test image, and edge images obtained with the parameter set to (b) α = 3.0, and (c) α = 0.3 using the proposed method. (d) Image obtained using Sobel operator.

Figure 4. (a) Test image, and edge images obtained with the parameter set to (b) α = 3.0, and (c) α = 0.3 using the proposed method. (d) Image obtained using Prewitt operator.

5. Conclusions
This paper presents an efficient algorithm for edge detection in images based on Renyi's α-order entropy of a fuzzy set. The algorithm measures the degree of edgeness in image regions in terms of fuzziness and informational content, using a sliding-window approach. The sensitivity of the proposed method can be adjusted in order to retrieve edges of different types. Finally, our future work involves a detailed investigation of the behavior of various indices of fuzziness derived from parametric t-norms in order to perform fuzziness-driven edge detection.
References
1. H. R. Tizhoosh, Fast and robust fuzzy edge detection, in Fuzzy Filters for Image Processing, M. Nachtegael, D. Van der Weken, D. Van De Ville, E. E. Kerre (Eds.), Springer-Verlag, Heidelberg, (2003) pp. 178-194.
2. L. A. Zadeh, Fuzzy sets, Inf. Contr. 8, (1965) pp. 338-353.
3. S. K. Pal and R. A. King, Image enhancement using fuzzy sets, Electronics Letters 16, (1980) pp. 376-378.
4. S. K. Pal and R. A. King, Image enhancement using smoothing with fuzzy sets, IEEE Trans. Syst., Man, and Cybern. SMC-11, (1981) pp. 494-501.
5. S. K. Pal, A note on the quantitative measure of image enhancement through fuzziness, IEEE Trans. Pattern Anal. Machine Intell. PAMI-4, (1982) pp. 204-208.
6. A. Renyi, On measures of entropy and information, Proc. Fourth Berkeley Symposium on Mathematical Statistics and Probability, (1960).
7. D. Bhandari and N. R. Pal, Some new information measures on fuzzy sets, Information Sciences 67, (1993) pp. 209-228.
8. A. De Luca, S. Termini, Definition of a nonprobabilistic entropy in the setting of fuzzy set theory, Inf. Contr. 20, (1972) pp. 301-312.
9. A. Kaufmann, Introduction to the Theory of Fuzzy Subsets, Academic Press, New York, (1975).
STATISTICAL ANALYSIS OF ECG SIGNALS WITH WAVELET TECHNIQUES AND METHODS OF NON-LINEAR DYNAMICS
IMRE PAZSIT
Chalmers University of Technology, Department of Reactor Physics, SE-412 96 Göteborg, Sweden
E-mail: [email protected]

This paper demonstrates the use of some methods of signal analysis, performed on ECG and in some cases blood pressure signals, for the identification of heart health status. Spectral analysis, continuous wavelet transform and fractal analysis were tested. The analysis was made on data from mice and rats. A correlation was found between the health status of the mice and rats and some of the statistical descriptors, most notably the phase of the cross-spectra between ECG and blood pressure, and the fractal properties and dimensions of the interbeat series (R-R interval fluctuations).
1. Introduction

Methods of time and frequency domain analysis of random processes [1], used earlier mostly for non-living systems, are being increasingly used in medicine and biology [2]. As a consequence, one can find papers even at nuclear conferences that report on the analysis of heartbeat data or some other medical diagnostic signal. The purpose of the present paper is, in line with this development, to report on some spectral, wavelet and fractal analyses of heartbeat data, i.e. ECG signals, with the goal of identifying features that indicate unhealthy status. In contrast to the ordinary stationary processes most often observed in engineering applications, an ECG signal is not stationary, but rather quasi-periodic. It consists of the periodic repetition of the so-called PQRST complex, which has a "spiky", strongly non-harmonic character. In most cases so far, the analysis was done on the interbeat intervals, i.e. the times between the R-R peaks, also called HRV (for a review see Ref. [2]). This series is more like a stationary random signal, and it has interesting fractal properties.
One subject of the present contribution is the test of spectral, wavelet and fractal techniques on data from mice and rats.
2. The experimental data

The data used in this analysis consist of two groups. One is taken from mice, and in these measurements only ECG data were recorded. There are 4 data files available, which will be labelled as Cases 1, 2, 3 and 5. In the second group, blood pressure and ECG data were recorded simultaneously from rats. There are two measurements of this kind, labelled as bphr1 and bphr2. One illustrative case of mice data is shown in Fig. 1 below. The data appear quite regular, with a normal heart frequency of about 10 Hz. This frequency is clearly seen in the wavelet transform of the time signal, represented by a horizontal line just below mid-height of the periodogram (lower part of Fig. 1). Each QRS complex induces a conical shape on the periodogram, whereas the interbeat interval fluctuations appear as deviations of the horizontal HRV line from being a straight line. Such a representation is, however, quite ineffective to quantify or even sensitively detect heart rate variations. Such anomalies can be better seen in the analysis of the R-R interbeat intervals.
Figure 1. Time series from measurement with mouse, Case 1, together with its continuous Morlet wavelet transform.
3. Analysis of the data

Spectral analysis was only made on the raw ECG signals. Wavelet transform was performed on both the raw signals and on the time series represented by the R-R interbeat intervals as functions of the beat number (these will be called HRV signals for simplicity). The fractal analysis was only performed on the HRV signal. These are described below.

3.1. Spectral analysis of the raw ECG signals
For brevity, we cite here only the most interesting finding. It is related to the cross-spectra and correlations between blood pressure and ECG data. There are two such measurements for rats. For both cases, auto-spectra of the blood pressure and the ECG signal were calculated, as well as the coherence and the phase of the cross-spectra, as functions of the frequency. The results with the interesting finding are shown in Fig. 2.
Figure 2. APSD, coherence and phase of the blood pressure and ECG signal for rats, case bphr2. The linear dependence of the phase curve breaks down at low frequencies.
The auto-spectra (APSD) show the fundamental frequency and higher harmonic peaks of the periodic heartbeats. The fundamental frequency is under 5 Hz, both for the BP and the ECG signals, which is characteristic for rats. The interesting part lies in the coherence and the phase relationships between the two signals. Despite the low coherence, there is a clear linear dependence of the phase of the cross-spectrum as a function of the frequency,
up to about 100 Hz. Such a linear relationship is due to a constant time delay between the first and the second signal: the slope is proportional to the time delay between the two signals. Both measurements show a clear time delay effect, but there is a significant difference between the two cases. For the case bphr1 (not shown here), the linear phase stretches from zero frequency up to high frequencies. In the case bphr2, Fig. 2, the phase is constant and equal to −180° from zero frequency up to 7-8 Hz, i.e. the two signals are out of phase in this region, varying in opposite phase without time delay. This indicates that for the case bphr2, at low frequencies there is another process that interferes with the transport process. In industrial diagnostics, the occurrence of a new process, interfering with the pure transport, is often a sign of an anomaly. Our finding is consistent with such an expectation, since, due to the difference in their physical exercise, the rat bphr1 can be taken as healthier than bphr2.
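As an illustrative sketch of this delay estimate (not the analysis code used in the paper), the slope of the unwrapped cross-spectral phase can be extracted as follows; the sampling rate, segment length and frequency range are placeholders.

```python
# Hedged sketch of the phase-slope delay estimate discussed above
# (not the code used in the paper); fs, nperseg and fmax are placeholders.
import numpy as np
from scipy.signal import csd

def delay_from_phase(bp: np.ndarray, ecg: np.ndarray, fs: float,
                     fmax: float = 100.0) -> float:
    """Estimate the time delay (s) between BP and ECG from the
    linear part of the cross-spectrum phase."""
    f, pxy = csd(bp, ecg, fs=fs, nperseg=4096)
    sel = (f > 0) & (f <= fmax)
    phase = np.unwrap(np.angle(pxy[sel]))     # unwrapped phase [rad]
    slope = np.polyfit(f[sel], phase, 1)[0]   # rad per Hz
    return -slope / (2.0 * np.pi)             # constant delay in seconds
```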
3.2. Wavelet and fractal analysis of the HRV data

There are standard methods of extracting R-R interval data from raw ECG signals, but these are usually based on the identification of the QRS complex in the ECG waveform. For our data these methods were not applicable due to the data quality. Instead, the R-R peaks were identified as the maxima of the negative derivative of the original series. Wavelet transform for the analysis of both ECG and HRV data has been used for a long time [4]. There have been primarily two strategies in use so far. One is to search for characteristic patterns either in the ECG or in the HRV data that could indicate anomalous behaviour [4]. The other avenue is to look for self-similarity properties of the HRV signal on the wavelet transform [4], or with other methods [3], and to determine its fractal dimension. A search for the qualitative self-similarity properties of the data available to us was made on the R-R interval series derived in the way described above. The interval series as functions of the beat number, and their wavelet transform, are shown for one case of mice in Fig. 3. The wavelet transform was calculated with the wavelet toolbox of the Interactive Data Language (IDL). The wavelet transform clearly shows a self-similar fractal structure in the cardiac dynamics of the animals, in that large structures which can be observed at large scales occur in a similar but smaller form at smaller scales, over the whole range of scales considered.
Figure 3. RR-interval series as a function of beat number for mouse Case 1 (top part) and its continuous wavelet transform (bottom part).
For the quantitative investigation of the fractal properties of the interbeat intervals, the method suggested by Higuchi [5] was used, in the form described in Ref. [3]. This consists of the calculation of the average length ⟨L(k)⟩ of the R-R interval curve v(n), n = 1, 2, 3, …, N, at various length scales k as the average of the quantity

L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left| v(m+ik) - v(m+(i-1)k) \right|\right) \frac{N-1}{\lfloor (N-m)/k \rfloor\, k}\right] \qquad (1)

with respect to m = 1, 2, 3, …, k. The negative slope of log⟨L(k)⟩ plotted against log k gives the fractal dimension of the curve.
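A minimal sketch of this estimator, assuming a uniformly indexed R-R series and a least-squares fit for the slope (k_max is an illustrative choice), could read as follows.

```python
# Minimal sketch of Higuchi's method, Eq. (1): average curve length
# <L(k)> at scales k, with D taken as the negative slope of the
# log-log fit; k_max is an illustrative choice.
import numpy as np

def higuchi_fractal_dimension(v: np.ndarray, k_max: int = 16) -> float:
    N = len(v)
    log_k, log_L = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):                  # m = 1, ..., k
            n_i = (N - m) // k                     # number of increments
            if n_i < 1:
                continue
            idx = m - 1 + np.arange(n_i + 1) * k   # v(m), v(m+k), ...
            dist = np.abs(np.diff(v[idx])).sum()
            lengths.append(dist * (N - 1) / (n_i * k) / k)   # L_m(k)
        log_k.append(np.log(k))
        log_L.append(np.log(np.mean(lengths)))
    return -np.polyfit(log_k, log_L, 1)[0]         # D = negative slope
```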
Plots of log⟨L(k)⟩ vs log k are shown for all four cases of mice R-R interval series in Fig. 4. It is seen that all cases have a fractal dimension D larger than unity, and in particular 1.5 ≤ D ≤ 1.9 for Cases 1, 3 and 5. These cases have a quite definite single value of the fractal dimension. However, Case 2 shows two segments of the curve, representing a different fractal dimension for shorter and longer scales. These are approximately equal to D_S = 1.6 and D_L = 3.5. The tendency (D_S < D_L) corresponds to what was found in Ref. [3] for healthy but elderly patients. It is interesting to note that the HRV data showing two fractal dimensions in our case belong to an animal with an anomaly (an excess of growth hormones). Hence a correlation is found between health status and fractal properties of the interbeat intervals.

4. Conclusions
Figure 4. Plots of log⟨L(k)⟩ vs log k for all four cases of mice R-R interval series.
These preliminary investigations have demonstrated or reproduced the use of spectral, wavelet transform and fractal dimension analysis in the classification of the status of the cardiac dynamics of animals. More detailed analysis will be performed on an annotated database of human ECG signals along the wavelet-based methods indicated in the paper.
Acknowledgments

The measured data from mice and rats were obtained from Dr. Göran Bergström, Department of Physiology, Göteborg University, which is acknowledged with thanks. This work was financially supported by the science foundation Adlerbertska forskningsstiftelsen and Kristina Stenborgs stiftelse.
References
1. I. Pazsit, Int. J. for Real-Time Systems 27, 97-113 (2004)
2. M. Malik, J. T. Bigger, A. J. Camm, R. E. Kleiger, A. Malliani et al., European Heart Journal 17, 354-381 (1996)
3. L. Guzman-Vargas, E. Calleja-Quevedo and F. Angulo-Brown, Fluct. Noise Letters 3, L83-L89 (2003)
4. B. Suki, A. M. Alencar, U. Frey, P. Ch. Ivanov, S. V. Buldyrev, A. Majumdar, H. E. Stanley, Fluct. Noise Letters 3, R1-R25 (2003)
5. T. Higuchi, Physica D 31, 277-283 (1988)
CLASSIFICATION OF TWO-PHASE FLOW REGIMES VIA IMAGE ANALYSIS BY A NEURO-WAVELET APPROACH*
C. SUNDE¹, S. AVDIC²⁺ AND I. PAZSIT¹
¹Department of Reactor Physics, Chalmers University of Technology, SE-412 96 Göteborg, Sweden
²Faculty of Sciences, Department of Physics, University of Tuzla, 75000 Tuzla, Bosnia-Herzegovina
A non-intrusive method of two-phase flow identification is investigated in this paper. It is based on image processing of data obtained from dynamic neutron radiography recordings. Classification of the flow regime types is performed by an artificial neural network (ANN) algorithm. The input data to the ANN are some statistical functions (mean and variance) of the wavelet transform coefficients of the pixel intensity data. The investigations show that bubbly and annular flows can be identified with high confidence, but slug and churn-turbulent flows are more often mixed up with each other.
1. Introduction
Two-phase flow patterns are usually classified into four classical so-called flow regime types. These are 1) bubbly, 2) slug, 3) churn-turbulent, and 4) annular flow regimes (see Fig. 1). Recognition and, possibly, control of the flow regime types is essential in numerous practical applications. Although the flow classification can be done reliably in fully instrumented channels in which thermocouples, pressure transducers, flow-meters etc. are available, a more challenging alternative is to use non-intrusive methods. In this field, the availability of methods is not as wide. Non-intrusive methods so far have been based on radiation attenuation measurements, such as X-rays [1] or gamma-rays. These methods are usually based on the detection of collimated rays penetrating the flow, and the processing of the variation of the intensity, modulated by the flow, by various statistical methods (probability distributions, auto- and cross-correlations and spectra). A qualitatively different approach, which will be pursued in this paper,
* This work is supported by the Swedish Centre for Nuclear Technology
⁺ Work partially supported by contract 14.10-030515/200310002 of the Swedish Nuclear Power Inspectorate
is to use image analysis. After all, the concept of flow regime arises from an intuitive judgement of the topology of the flow, based on visual observation. Such images of two-phase flow can easily be produced in transparent pipes with visible light, but in metal pipes neither X-rays nor gamma-rays are applicable: X-rays do not penetrate the wall, and gamma-rays cannot, in general, be produced with the intensity needed to achieve an image with good contrast and dynamics. However, dynamic neutron radiography has been developed to produce images of two-phase flow in metal pipes at the Kyoto University Research Reactor Institute [2]. Some of these measurements were made available to us and were used in the present analysis. Some sample images are shown in Fig. 1.

Fig. 1. Neutron radiography images of the four basic flow regimes.

2. The classification algorithm
Our objective in this work was to find an algorithmic identification method. Artificial neural networks (ANNs) were selected for this purpose. Static images for training and test data were provided as follows. In the recording at our disposal, the heating was gradually increased starting with cold water. During the measurement, all four flow regimes occurred in sequence, with a smooth transition between the different regimes. The sections of the four regimes were identified. From each sequence, 200 TIFF images were extracted and used in the classification. Each image consisted of about 60,000 pixels (411 × 153 pixels). In this work a simple feed-forward network with backpropagation was used, consisting of an input layer, an output layer and one hidden layer (Fig. 2).

Fig. 2. The layout of the ANN used.

As the figure shows, 40 nodes were used in the hidden layer, two different types of transfer functions between the layers, and four output nodes, representing the four different regimes. Before the output nodes, thresholding was used; values above the threshold were converted
to unity and values below to zero. An identification was considered definite (but obviously not necessarily correct) if and only if one of the output nodes showed unity and the others zero. The regime type was defined by which output node was "fired".
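As an illustrative sketch (not the authors' implementation), the 2-40-4 classifier with output thresholding could be set up as below, using scikit-learn's MLPClassifier as a stand-in for the feed-forward/backpropagation network of Fig. 2; the feature matrix shown is a random placeholder for the wavelet features described in the next section.

```python
# Hedged sketch of the 2-40-4 flow-regime classifier; MLPClassifier is a
# stand-in for the original network, and X below is placeholder data.
import numpy as np
from sklearn.neural_network import MLPClassifier

REGIMES = ["bubbly", "slug", "churn-turbulent", "annular"]

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 2))         # 200 images x 4 regimes (placeholder)
y = np.repeat(np.arange(4), 200)      # placeholder labels

net = MLPClassifier(hidden_layer_sizes=(40,), activation="tanh", max_iter=2000)
net.fit(X, y)

def classify(sample: np.ndarray, threshold: float = 0.5) -> str:
    """'Definite' identification: exactly one output fires above threshold."""
    p = net.predict_proba(sample.reshape(1, -1))[0]
    fired = p > threshold
    return REGIMES[int(np.argmax(p))] if fired.sum() == 1 else "undecided"

print(classify(X[0]))
```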
3. Generation of the input data with wavelet transform
The information content in the images, represented by the roughly 60,000 pixels, obviously needs to be reduced in dimension before being used as input. Wavelet transform seems to be a very effective tool to achieve this goal ([3], [4]). Wavelet transform coefficients are often better (more sensitive) features in pattern recognition tasks than the original data. As regards two-phase flow, the four flow regimes have structures that show spatial variations at different scales. Hence, wavelet coefficients from a multi-resolution analysis of the original input data seem to be suitable input data. To further reduce the number of input data, at each resolution scale one can condense the coefficients further, i.e. use their first two statistical moments (mean and variance). As it turned out, the radiography recordings had a relatively poor quality in terms of quantitative usage for the present purpose, such that a full 2-D multiscale resolution analysis was not practical. After the first step of splitting the data into approximations and details,
no useful information content remained in the details D1. Thus, only the data of the first-level approximation were used. These were further condensed into two parameters, the mean and the variance. Hence the network used in this work had two input nodes. This will be improved significantly in the future, as work is going on with better quality input data from new measurements.
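In code, the two input features per image could be computed as in the following sketch, assuming the PyWavelets package; the wavelet name is an illustrative choice.

```python
# Sketch of the input-feature computation: one-level 2-D DWT, keep only
# the approximation coefficients, condense to (mean, variance).
# Assumes the PyWavelets package; the wavelet name is illustrative.
import numpy as np
import pywt

def flow_features(image: np.ndarray, wavelet: str = "db2") -> np.ndarray:
    approx, _details = pywt.dwt2(image.astype(float), wavelet)
    return np.array([approx.mean(), approx.var()])   # the two ANN inputs
```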
4. Results
Fig. 3 shows the mean of the wavelet coefficients corresponding to the first approximation of the multiresolution analysis for various wavelets, as functions of the sample number. The figure shows that this parameter has a relatively good discrimination power. The variance of the same values, not shown here, shows more overlapping, and hence less discrimination. The results are shown in Fig. 4, where tests were made with 200 samples for each regime. The results show that the identification was quite successful. In particular, the success ratio was 100% for annular flow, and very close to 100%
Fig. 3. The mean of the wavelet coefficients for the different flow regimes as functions of the sample number.
Fig. 4. The results of the identification procedure with different wavelets
also for bubbly flow. Slug and churn-turbulent flow were mistaken for each other in a few percent of the cases. This pilot study shows the possibilities of the method, even if the quality of the input data did not make the full power of the wavelet pre-processing visible. Further work is going on to test and develop the method with better quality input data.
Acknowledgments We thank Prof. Kaichiro Mishima, Kyoto University Reactor Research Institute, for providing us with the flow images. The visit of Senada Avdic to Chalmers was supported by the Swedish Nuclear Power Inspectorate (SKI).
References
1. M. A. Vince and R. T. Lahey, Int. J. Multiphase Flow 8, 93 (1982).
2. K. Mishima, S. Fujine, K. Yoneda, K. Yonebayashi, K. Kanda and H. Nishihara: Proc. 3rd Japan-US Seminar on Two-Phase Flow Dynamics, Ohtsu, Japan (1988).
3. N. Hazarika, J. Z. Chen, A. C. Tsoi, A. Sergejew, Signal Processing 59, 61 (1997).
4. P. S. Addison, The Illustrated Wavelet Transform Handbook, IoP (2002).
PERFORMING AN ANALYSIS OF FUZZY FUSION FOR SPERMATOZOA IN FERTILE HUMAN

JIA LU
Computer Science and Information Systems, University of Phoenix, 5050 NW 125 Avenue, Coral Springs, FL 33076, U.S.A.
cluiia(ii.email.uophx.edu
YUNXIA HU
Department of Health Science, Nova Southeastern University, U.S.A.
szmizvh!&ivfloridu.coin
To evaluate the sperm quality of fertile men, sperm morphology was examined based on fuzzy data fusion. All sperms were observed in infertile patients. The abnormal spermatozoa of the fertile men were classified into different types: head, neck, and tail. Detailed study of each sperm track was possible, as the image sequence shows the individual sperm movement traces for easy inspection. The criteria defining the types of the classification could be set in order to track the details of the fuzzy fusion analysis. The morphology was assessed as a complete part of the evaluation. The numbers of normal and abnormal sperm were identified, and the type of defect was recorded for each abnormal spermatozoon. We show in this paper that fuzzy fusion morphology provides a unified and consistent framework to express the different shapes of spermatozoa in fertile humans.
1. Introduction
Human spermatozoa morphology is assessed routinely as part of standard laboratory analysis in the diagnosis of human male infertility. This practice has its origins in the work of Seracchioli (1995), which showed that sperm morphology was significantly different in fertile compared to infertile men. The evaluation of human sperm morphology has been a difficult and inconsistent science, since it is based on individual sperm parameters. There are many approaches to human sperm morphology recognition available, and some of them have been applied to real-world tasks with great success (Swanson 1985). However, these evaluations of human sperm morphology are normally hard to establish, and human knowledge is hard to incorporate into the precision levels. Human semen evaluation continues to be influenced by the subjectiveness of the investigator, and a lack of objective measurements for sperm morphology continues to be a problem.
Fuzzy data fusion is especially suited to providing methods to deal with and to fuse the uncertain and ambiguous data arising in computer vision (Karen 1989). Fuzzy logic theory has already turned out to be a valuable tool for the solution of various single tasks in image understanding (Kopp 1996). These successful applications of fuzzy notions stimulate the idea that the integration of single vision modules using fuzzy methods will result in a powerful fuzzy data fusion system (Rusan 2002). In this paper, we present a general approach to the processes and representations in 3-CCD camera image recognition, in which individual sperm are analyzed using the theory of fuzzy data fusion.
2. Spermatozoa

Human morphological slides were prepared using smearing and staining techniques in order to perform an imaging analysis based on fuzzy fusion in the windowing software. The morphological abnormalities normally relate to the main regions of the spermatozoon (i.e. head, neck/mid-piece, and tail). Neck and mid-piece abnormalities included a bent neck, asymmetrical tail insertion, and a thick, irregular or thin mid-piece. Fuzzy data fusion was performed based on the various sperm head, neck, and tail defects. Sperm with a small acrosomal area or double heads, and free tails, were not counted. A high frequency of coiled tails indicated that the sperm had been subjected to hypo-osmotic stress. Tail coiling relates to sperm aging. The frequency of coiled tails was computed in the fuzzy data fusion.
3. Fuzzy fusion behavior
The most obvious illustration of fusion is the use of various sensors, typically to detect human sperm images. Fuzzy data fusion was used for the recognition of the human sperm properties. The processes in the experiment are often categorized as low, intermediate and high level fusion, depending on the processing stage at which fusion takes place. It combines several sources of sperm morphology data to produce new data that are more informative and synthetic than the inputs. Typically, images representing several spectral bands of the same scene are fused to produce a new image that ideally contains in a single channel all of the information available in the spectral smears. An operator (or an image processing algorithm) could then use this single image instead of the original images. This is important when the number of available spectral bands becomes so large that it is impossible to look at the images separately. This kind of fusion requires a precise pixel-level alignment of the available images. Those features may come from several raw data sources or from the same
data sources. The objective is to find relevant features among the available features, which might come from several feature extraction methods. The human sperm morphology information to be fused was captured from multi-smear images. Each individual sperm information source corresponds to a different microscope sequence of the sperm sample. According to the head size, neck, and tail quantization level in the different slides of the selection path during the process of image capture, the multiple-source data were aligned into a common coordinate system. The human sperm data were modeled by fuzzy modeling, corresponding to multiple sources of feature fusion, fuzzy data fusion, and the fusion decision in fuzzy fusion behavior.
4. Fuzzy recognition

4.1. Human sperm as fuzzy integral and fuzzy sets
Definition 1: Let S be a fuzzy location with the elements of S denoted by s; in this case, S = {s}. S is the set of human sperm volume and s is the coordinate of shape, s = (a, b, c).
Definition 2: Let the fuzzy set A be a fuzzy subset on S, A = {(s, μ_A(s)) | s ∈ S}. A is a fuzzy set for the human sperm of S.
Definition 3: Let X be a universal set for the sperm image, where D = {d}; D is the fuzzy index of multi-smear. X provides H (head) when d = 1, N (neck) when d = 2, T (tail) when d = 3, and X is the set of signal intensities of the image. The elements of X are denoted by x_d.
Definition 4: Let the fuzzy set A_X be a fuzzy set of the important human sperm, defined as A_X = {((s, d), μ_{A_X}(s, d)) | s ∈ S, x_d ∈ X}.
For the fuzzy integral measurement of human reproductive sperm, we observe the image via the following definitions. Let Z be a non-empty finite set.
Definition 5: A set function s: 2^Z → [0, 1] is a fuzzy measure if s(∅) = 0 and s(Z) = 1; if I, J ⊂ 2^Z and I ⊂ J then s(I) ≤ s(J); and if I_n ⊂ 2^Z for 1 ≤ n < ∞ and the sequence {I_n} is monotone in the sense of inclusion, then lim s(I_n) = s(lim I_n).
Definition 6: Let s be a fuzzy measure on Z. The integral of a function h: Z → [0, 1] with respect to s is defined as in Eq. (1),

\int h \, ds = \sum_{i=1}^{n} \left[ h(z_i) - h(z_{i-1}) \right] s(I_i) \qquad (1)

where the indices are ordered so that 0 ≤ h(z_1) ≤ … ≤ h(z_n) ≤ 1, I_i = {z_i, …, z_n}, and h(z_0) = 0.
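A small sketch of the integral of Definition 6 / Eq. (1) is given below; the cardinality-based fuzzy measure used in the example is only an illustrative choice satisfying Definition 5, not the measure used by the authors.

```python
# Hedged sketch of the discrete fuzzy (Choquet-type) integral of
# Definition 6 / Eq. (1); the cardinality-based measure is illustrative.
import numpy as np

def fuzzy_integral(h: np.ndarray, measure) -> float:
    """h: membership degrees h(z_i) in [0,1]; measure(|I|) -> [0,1]."""
    h_sorted = np.sort(h)              # order so 0 <= h(z_1) <= ... <= h(z_n)
    prev, total = 0.0, 0.0
    n = len(h)
    for i in range(n):
        # I_i = {z_i, ..., z_n} after sorting; here s(I_i) depends on |I_i|
        total += (h_sorted[i] - prev) * measure(n - i)
        prev = h_sorted[i]
    return total

# Example: s(I) = |I| / n satisfies s(empty) = 0, s(Z) = 1, monotonicity.
h = np.array([0.2, 0.9, 0.5])
print(fuzzy_integral(h, lambda k: k / len(h)))
```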
4.2. Fuzzy model

Fuzzy models were proposed for the fuzzy information. The membership functions of human sperm corresponding to H, N, and T, namely μ_H, μ_N, and μ_T, were defined as in Eq. (2) (Lu and Hu 2004),

\mu(s, h) = \frac{1}{2} + \frac{1}{2} \sin\!\left( \frac{\pi}{h - s}\left( x - \frac{s + h}{2} \right) \right) \qquad (2)
The membership functions of the fuzzy models were used to analyze the feature of the increasing parameter of the operator. Membership functions present the relation of the degrees of the sperm shape levels. They project the fuzzy features of the human sperm image onto the corresponding fuzzy sets of H, N, and T. Knowledge rules were used for the fuzzy models. A combination of the sperm features gives the membership function with high degree of H: H × N × T → [0, 1], μ_H(s) if and only if s ∈ H, where d = 1, 2, 3. The analysis of these features was based on a fuzzy intersection operator for fusing these features. Fuzzy processing starts with the fuzzification of the given image, yielding a fuzzy image defined as in Eq. (3),

F_H = \left\{ \mu_H(H_{dn}) ;\; d = 0, 1, \ldots, D-1,\; n = 0, 1, \ldots, N-1,\; 0 \le \mu_H \le 1 \right\} \qquad (3)

where μ_H(H_dn) is the membership function that denotes the degree of brightness relative to the value of a pixel H_dn situated at a given position of the image. The membership functions were used to model the ambiguity in the image. Although there exist several different types of membership functions, this one has been widely applied to image problems. This function was still defined as in Eq. (4),
where b = (a + c)/2 is the crossover point. The bandwidth of the membership function was defined as F = b − a = c − b. It determines the fuzziness in μ_H.

4.3. Fuzzy fuzzification
A fuzzy fuzzification was proposed for several fuzzy subsets by using fuzzy sets. A fuzzy relation of human sperm was applied to the interior region connection in a subset and the exterior region between subsets. After fuzzification, a fuzzy measure was calculated to determine the average amount of ambiguity, using a linear or quadratic index of fuzziness, a fuzzy entropy, or an index of non-fuzziness measure. The index reveals the ambiguity in the given images (H)(N)(T) by measuring the shapes between their fuzzy properties μ_H, μ_N, μ_T and the nearest binary versions μ̄_H, μ̄_N,
μ̄_T. Since the aim of the fuzzification is to determine the object from the properties of human sperm, the optimal images can be determined by minimizing the ambiguity in the given image. To determine the minimum ambiguity, the crossover point and the bandwidth of the membership functions vary along all shape levels of human sperm. We simulated the average percentage normal and percentage abnormal and selected the larger of those two groups for duplicate comparison. We computed the difference between the two assessments in that group. If the difference was smaller than the value obtained from the fuzzy membership function computation, the assessments could be accepted. If the difference was larger than the value, two new assessments should be simulated. For the soft computing, we needed to check whether the smear was difficult to read; if so, the smear kept in reserve should be stained with fresh solutions and assessed.

4.4. Human sperm digital images

A fuzzy set S in a universe X is characterized by an X → [0,1] mapping χ_S, which associates with every sperm shape element x in X a degree of membership χ_S(x) of x in the fuzzy set S. In the following, we denote the degree of membership by S(x). Note that a digital image can be identified with a fuzzy set that takes values on the grid points (x, y, z), with x, y, z ∈ ℕ, 0 < x ≤ M, 0 < y ≤ N and 0 ≤ z ≤ W (M, N, W ∈ ℕ). Therefore, for head, neck, and tail H, N, T, we have that H, N, T ∈ F(X), with X = {(x, y, z) | 0 < x ≤ M, 0 < y ≤ N, 0 ≤ z ≤ W}.
3. Determining image profiles

After the formal definitions of colors and qualifiers, the next step is to build the image profile. A profile is defined according to the image's membership to the various categories: the nine fundamental colors and the nine color qualifiers. For each pixel of the image we can determine the values taken by the various membership functions of the categories. For each category, the value obtained corresponds to the ratio between, on the one hand, the sum over all the pixels of the image of the membership function values of the category, and on the other hand the number of pixels, which gives a quantity between 0 and 1. This quantity is the membership degree of an image to the given class.
Figure 3. Dimensions L and S.
The membership degree of an image to a certain class is defined as follows. Let I be an image, and let P be the set representing the pixels of I. Each element p of the set P is defined by its color coordinates (h_p, l_p, s_p); p can be one pixel or a set of pixels. We can calculate the functions f_t(h_p) and f_q(l_p, s_p), for t ∈ T and q ∈ Q.
Let F_t and F_{t,q} be the following functions, representing the membership degree of I to the classes t and (t, q):

F_t(I) = \frac{1}{|P|} \sum_{p \in P} f_t(h_p), \qquad F_{t,q}(I) = \frac{1}{|P|} \sum_{p \in P} f_t(h_p)\, f_q(l_p, s_p)
Every image is defined by a profile of 90 elements (|T| + |T × Q| = 9 + 81). A profile can be presented as follows: [F_t(I), F_{t,q}(I)]. An image can be assigned to several classes among the 90 existing ones. There are 9 principal classes denoted C_t with t ∈ T, and 81 subclasses which express a refinement of the search: Ĉ_{t,q} with (t, q) ∈ T × Q.
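The profile computation can be sketched as follows; the product combination of f_t and f_q mirrors the reconstruction of F_{t,q} above and, like the placeholder membership functions, is an assumption rather than the paper's exact definition.

```python
# Hedged sketch of the 90-element profile; the product combination of
# f_t and f_q and the placeholder membership functions are assumptions.
import numpy as np

def image_profile(hls_pixels, f_t_list, f_q_list):
    """hls_pixels: iterable of (h, l, s) per pixel; returns degrees per class."""
    profile = {}
    for t, f_t in enumerate(f_t_list):            # nine fundamental colors
        ft_vals = np.array([f_t(h) for h, l, s in hls_pixels])
        profile[("F", t)] = ft_vals.mean()        # F_t(I)
        for q, f_q in enumerate(f_q_list):        # nine color qualifiers
            fq_vals = np.array([f_q(l, s) for h, l, s in hls_pixels])
            profile[("F", t, q)] = (ft_vals * fq_vals).mean()  # F_{t,q}(I)
    return profile
```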
Figure 4. Profile of an image.
As shown in Figure 4, the classes can be represented through a tree with a father-son relationship: the classes C_t can be considered as fathers and the classes Ĉ_{t,q} as their sons. For example, the father of class Ĉ_{red,somber} is C_red.
An image I is assigned to:
- the classes C_t if F_t(I) ≥ F*(I) − λ, with λ a tolerance threshold;
- the classes Ĉ_{t,q} if F_t(I) ≥ F*(I) − λ and F_{t,q}(I) ≥ F_t*(I) − λ.
An image is assigned to a subclass only if it is also assigned to its father class.

4. Presentation of the software

In the software, a database is used to store images with their profiles (cf. Figure 5). This helps us to optimise the exploitation of this information. The software is divided into two sections: the first one corresponds to the treatment and insertion of the images in the database, the second one to the exploitation of this database through requests with linguistic terms. In the first section, the profile of the new image is built and stored: to insert an image in the database we introduce a new record into the database after determining the values taken by the functions representing the membership degrees of the image to the various classes. In the second section, the database can be exploited according to two levels of precision. The first one corresponds to the nine fundamental colors (dimension H), the second one to the nine color qualifiers.
Figure 5. Database.

Figure 6. Query with only a color and query with a color and a qualifier.
Once the hue is selected, the user has the possibility to refine his request by specifying a color qualifier. For that purpose, he can choose one of those proposed in the list, or click in the corresponding zone in the image. For example, figure 6 shows images whose dominant color is "blue" and images whose dominant color is "luminous blue". Two other kinds of requests are handled: the first one allows us to retrieve B&W and gray-level images (in this case, H is not considered), and the second one allows us to retrieve images with more than one dominant color. One-color requests can be successively added (composed) to obtain a multi-color request.
5. Conclusion
In this work we have developed an approach that permits classifying images according to their dominant color(s). We limit ourselves to nine fundamental colors and nine color qualifiers; these can be widened without modifying the approach, as only a few modifications would have to be performed in the software. As future work, this software will be adapted for medical applications: it will help medical forecasts and analysis, such as tumor detection. Instead of working on the whole image, the software will consider zones of images. It will retrieve images which contain zones with a strongly dominant color (for example red) which can correspond to cancerous cells.
References
1. B. Bouchon-Meunier (1995). La Logique Floue et ses Applications, Addison-Wesley, 1995.
2. Y. Chen and J. Z. Wang (2002). A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 9, 1252-1267, 2002.
3. L. Foulloy (1990). Du contrôle symbolique des processus : démarche, outils, exemples, PhD Thesis, Université Paris XI, September 1990.
4. M. Hammami, L. Chen, D. Zighed, Q. Song (2002). Définition d'un modèle de peau et son utilisation pour la classification des images. In Proceedings of MediaNet'2002, 187-198, Sousse, Tunisie, June 2002.
5. F. Herrera, L. Martinez (2001). A model based on linguistic two-tuples for dealing with multigranularity hierarchical linguistic contexts in multiexpert decision-making. IEEE Transactions on Systems, Man and Cybernetics, Part B, 31(2), 227-234.
6. P. Hong, T. Qi and T. S. Huang (2000). Incorporate support vector machines to content-based image retrieval with relevance feedback. IEEE International Conference on Image Processing, Vancouver, Canada, 2000.
7. B. Le Saux (2003). Classification non exclusive et personnalisation par apprentissage : Application à la navigation dans les bases d'images, PhD Thesis, INRIA, France, July 2003.
8. J. Roire (2000). Les noms des couleurs. In Pour la science, Hors série, n° 27.
9. I. Truck (2002). Approches symbolique et floue des modificateurs linguistiques et leur lien avec l'agrégation. PhD Thesis, Université de Reims Champagne-Ardenne, France, December 2001.
10. I. Truck, H. Akdag, A. Borgi (2001). A Symbolic Approach for Colorimetric Alterations. In Proceedings of EUSFLAT 2001, 105-108, Leicester, England, September 2001.
11. I. Truck, H. Akdag, A. Borgi (2001). Using Fuzzy Modifiers in Colorimetry. In Proceedings of the 5th World Multiconference on Systemics, Cybernetics and Informatics, SCI 2001, 472-477, Orlando, Florida, USA, 2001.
A COLORING ALGORITHM FOR IMAGE CLASSIFICATION
D. GOMEZ, J. MONTERO, J. YANEZ AND C. POIDOMANI
Department of Statistics and O.R., Complutense University, Madrid, Spain
E-mail: [email protected]

In this paper we present a pixel coloring algorithm, to be considered as a tool in fuzzy classification. Such an algorithm is based upon a sequential application of a divisive binary procedure on a fuzzy graph associated to the image to be classified, taking into account surrounding pixels. Each color will suggest a possible class, if homogeneous, and the hierarchical structure of colors will allow gradation between classes.
1. Introduction
Classification in remote sensing images quite often suggests techniques based upon fuzzy models. This is mainly the case when there are no objects to be classified. Objects, at least in a standard sense, tend to present clear borders, and classification can be developed just based upon a boundary analysis and a previous knowledge of the shapes of the different objects under consideration. On the contrary, many classification problems about earth land use, for example, refer to classes showing gradation from one class to the next. There are no clear boundaries, and each class defines a fuzzy set with no particular shape (see Bezdek and Harris). In fact, there is increasing research on Fuzzy Sets Theory applied to remote sensing classification problems (see, e.g., Foody). Many different approaches can be found in the remote sensing classification literature. In Amo et al., for example, some of the authors proposed a classification model based upon a modified outranking model, basically taken from Pearman et al. [12]. But the output information appeared to be difficult to manage by non-qualified decision makers. A main need was to develop fuzzy representation techniques. In particular, a coloring tool was missing that would allow a consistent and informative picture
showing possible regions and gradation of membership to possible classes. In this paper we propose an unsupervised crisp coloring methodology, to be considered within a more elaborate fuzzy classification system, as defined in Amo et al. The coloring procedure we present here is defined by means of a divisive crisp binary coloring process, which seems promising as a helpful tool in order to find out consistent regions and postulate possible fuzzy classes. In section 2 we introduce the basic pixels fuzzy graph associated to an image, and in section 3 we present a crude coloring algorithm. A final comments section shows some particular improvements currently under development (see Gómez et al.).
2. The image and its associated pixels fuzzy graph
Let us consider an image as a bidimensional map of pixels, each one of them being characterized by a fixed number of measurable attributes. These attributes can be, for example, the values of the three bands of the visible spectrum (red, green and blue), the whole family of spectrum band intensities, or any other family of physical measures. Our main objective is to determine families of pixels suggesting the definition of a class. This information should be taken into account in a later supervised analysis where additional information may exist. The image I under consideration is therefore divided into pixels (information units), and the whole information is summarized as a vector of b measures for each pixel, I = {(x¹_ij, …, x^b_ij) | (i,j) ∈ P}, where P represents the associated set of pixels into which the image is divided, P = {(i,j) | i ∈ {1,…,r}, j ∈ {1,…,s}}, meaning that we are dealing with an image of size r × s, each pixel being characterized by b numerical measures. Given such an image I, a standard crisp classification problem pursues a partition into crisp regions, each one being a subset of pixels, to be considered a candidate for a new class in case such a region is homogeneous enough. In this way, a crisp classification approach looks for a family of subsets of pixels {A_1, …, A_c} such that P = ∪_{l=1}^{c} A_l, but A_i ∩ A_j = ∅, ∀ i ≠ j. Our approach in this paper pursues an approximate gradation by splitting each subset under consideration into two crisp classes every time. The key tool will be a distance between the measured properties of pixels, d: P × P → [0, ∞), which at a first stage can be based upon the Euclidean distance in ℝ^b. Of course, any other ad hoc distance can be taken into account in future research. Obviously, the classification process will
be strongly dependent on the selection of the appropriate distance, to be chosen taking into account all features of the image under consideration, together with our particular classification objectives. Hence, our set of pixels P is being modeled as a planar fuzzy graph (see, e.g., Kóczy [10], Mordeson and Nair [8], or Rosenfeld [13]) whose nodes are pixels, described by means of their Cartesian coordinates i ∈ {1,…,r} and j ∈ {1,…,s}. The graph will be planar in the sense that two pixels (i,j) and (i′,j′) cannot be linked if |i − i′| + |j − j′| > 1. Consequently, two pixels can be adjacent only if they share one coordinate, the other one being contiguous. Let G̃ = (V, Ẽ) be a fuzzy graph, where V is the node set and the fuzzy edge set Ẽ is characterized by the matrix μ = (μ_ij)_{i,j ∈ V}, where μ_ij = μ_Ẽ({i,j}), ∀ i, j ∈ V, and μ_Ẽ: V × V → M is the associated membership function. Each element μ_ij ∈ M represents the intensity level of the edge {i,j} for any i, j ∈ V. The set M is linearly ordered (μ_{i,j} ≼ μ_{i′,j′} means that the intensity level of edge {i,j} is lower than the intensity level of edge {i′,j′}). Hence, the set M allows a literal graduation of the edge set; for example, if M = {n, l, h} the edges can be graduated as null (n), low (l) or high (h). We can then denote by G̃(I) = (P, Ẽ) the graph associated to our image I, where M = [0, ∞) is the domain of the distance function d:
Definition 2.1. Given the image I and a distance d between measured properties of pixels, the pixels fuzzy graph is defined as the pair G̃(I) = (P, Ẽ).
Notice that our pixels fuzzy graph G̃(I) can also be characterized by the set P plus two r × s matrices, D¹ and D², where D¹_{ij} = d((i,j),(i+1,j)), ∀(i,j) ∈ {1,...,r−1} × {1,...,s}, and D²_{ij} = d((i,j),(i,j+1)), ∀(i,j) ∈ {1,...,r} × {1,...,s−1}. Since our coloring procedure will be based upon this alternative representation, from now on we shall denote our pixels fuzzy graph G̃(I) by (r, s, D¹, D²). The key coloring algorithm proposed in the next section will take advantage of this alternative representation, which shows the relation between adjacent pixels in the pixels fuzzy graph G̃(I).
3. A crude coloring algorithm
A c-coloring of a graph G = (V, E) (see, e.g., Pardalos et al.¹¹) is a mapping C : V → {1,...,c} verifying C(v) ≠ C(v′) whenever {v,v′} ∈ E. Any c-coloring induces a crisp classification of the node set V, each class being associated to one color: V_C(k) = {v ∈ V / C(v) = k}, k ∈ {1,...,c}. Our objective is to obtain a classification of pixels through a c-coloring C of the pixels fuzzy graph G̃(I): the pixel (i,j) ∈ P will be classified as k ∈ {1,...,c} if its color is C(i,j) = k. In order to color a fuzzy graph, we consider G_α, the crisp graph defined by the α-cut edge set E_α = {{e,e′} / μ_{e,e′} ≥ α}. The values of this parameter α will be selected in such a way that a successive binary coloring process can be applied to some fuzzy subgraphs of G̃(I). The first binary coloring analyzes the pixel set P, classifying each pixel as 0 or 1. The second binary coloring is applied separately to the subgraph generated by those pixels colored as 0, to obtain the classes 00 and 01, and to the subgraph generated by those pixels colored as 1, to obtain the classes 10 and 11. This hierarchical process of binary coloring is repeated in the crude coloring process. In this way, a c-coloring C will be defined on G̃(I): if C(i,j) = k, with k = 6 for instance, then the binary representation of k − 1 = 5 is 101, i.e. the pixel (i,j) will be binary colored three times (1, 0 and 1, respectively).
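This correspondence between a final color and its sequence of binary colorings can be illustrated with a minimal Python sketch (the function name is ours, not from the paper):

```python
def binary_path(k, t):
    """Binary colors (one per splitting stage) received by a pixel of color k.

    Example from the text: k = 6 with t = 3 stages gives '101',
    i.e. the pixel was binary colored 1, 0 and 1, respectively.
    """
    assert 1 <= k <= 2 ** t
    return format(k - 1, f'0{t}b')

print(binary_path(6, 3))  # -> '101'
```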
3.1. The basic binary coloring procedure
A natural way of introducing the basic binary coloring procedure is to classify two adjacent pixels as 0 and 1 if and only if the distance between them is greater than or equal to a prescribed threshold α. Notice that, in this way, only adjacent pixels are classified as distinct (if the distance between them is high), while a standard approach classifies two arbitrary pixels in the same class if that distance is low (no matter whether they are adjacent or not). Formally, in order to define the first binary coloring procedure, given a value α, let G_α denote the α-cut of the fuzzy graph G̃(I): G_α = (P, E_α), where
the set E_α is the set of all pairs of adjacent pixels with a distance d lower than α. Let col : P → {0,1} be a binary coloring of G_α. The first binary coloring can be obtained by assigning an arbitrary color ("0" or "1") to a certain
pixel and fixing an order in which pixels will be colored. That first pixel to be colored could be, for example, the pixel (1,1) in the top left corner of the image; then pixels will be colored from left to right and top to bottom depending on a fixed threshold α. In general, given a colored pixel (i,j), the adjacent pixels (i+1,j) and (i,j+1) will be subsequently colored. Since pixel (i+1,j+1) can be alternatively colored either from pixel (i+1,j) or from pixel (i,j+1), a natural constraint is that both colors must be the same; otherwise, the coloring will be denoted as inconsistent.
Definition 3.1. Given a pixel set P, a square is a subset of four pixels
sq(i,j) = {(i,j); (i+1,j); (i,j+1); (i+1,j+1)}
with i ∈ {1,...,r−1} and j ∈ {1,...,s−1}. We shall then denote by PS the set of all squares, PS = {sq(i,j) / i ∈ {1,...,r−1}, j ∈ {1,...,s−1}}.
Definition 3.2. Given a pixels fuzzy graph (r, s, D¹, D²), a square sq(i,j) ∈ PS is consistent at level α if, given an arbitrary color col(i,j), the above binary coloring procedure assigns the same color to pixel (i+1,j+1) no matter whether it is done from pixel (i,j+1) or from pixel (i+1,j). Otherwise, the pixel square is inconsistent.
Consequently, the above binary coloring of the pixels fuzzy graph depends on the chosen threshold value α, and we have two extreme cases: ᾱ = max_{(i,j)∈P}{d¹_{i,j}, d²_{i,j}} (if we fix a threshold α > ᾱ, then the whole picture is considered as a unique class, col(i,j) = col(1,1) ∀(i,j) ∈ P); and α̲ = min_{(i,j)∈P}{d¹_{i,j}, d²_{i,j}} (in case α < α̲, the picture looks like a chess board, all adjacent pixels being alternatively classified as "0" and "1"). Only the interval [α̲, ᾱ] should be properly considered. Indeed, determining an appropriate intermediate α level is not a trivial task. But once a level α is given, the inconsistent squares can be detected with the binary function
inconsis_α(i,j,D¹,D²) = 1 if sq(i,j) is inconsistent at level α, and 0 otherwise,
where each value inconsis_α(i,j,D¹,D²) depends on the square values d¹_{i,j}, d¹_{i,j+1}, d²_{i,j} and d²_{i+1,j}. Pseudocode for the computations relative to listing consistent and inconsistent squares, inconsis(i,j,α,D¹,D²), has been developed by the authors in terms of four additional 0–1 variables (see Gómez et al.⁹).
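A minimal Python sketch of the basic binary coloring and of the square-consistency test may help fix ideas; it assumes the (r, s, D¹, D²) representation above with 0-based indices, and the helper names are ours rather than the authors' pseudocode:

```python
import numpy as np

def binary_coloring(r, s, D1, D2, alpha):
    """Color pixels 0/1 left-to-right, top-to-bottom from seed (0,0).

    D1[i][j] = d((i,j),(i+1,j)) (vertical neighbours, i = 0..r-2),
    D2[i][j] = d((i,j),(i,j+1)) (horizontal neighbours, j = 0..s-2);
    adjacent pixels get distinct colors iff their distance >= alpha.
    """
    col = np.zeros((r, s), dtype=int)
    for i in range(r):
        for j in range(s):
            if i == 0 and j == 0:
                continue                      # arbitrary seed color 0
            if i == 0:                        # first row: from left neighbour
                col[i, j] = col[i, j - 1] ^ int(D2[i][j - 1] >= alpha)
            else:                             # otherwise: from upper neighbour
                col[i, j] = col[i - 1, j] ^ int(D1[i - 1][j] >= alpha)
    return col

def inconsistent(i, j, D1, D2, alpha):
    """sq(i,j) is inconsistent when the two coloring paths towards
    (i+1,j+1), via (i,j+1) or via (i+1,j), disagree in parity."""
    via_right = int(D2[i][j] >= alpha) ^ int(D1[i][j + 1] >= alpha)
    via_down = int(D1[i][j] >= alpha) ^ int(D2[i + 1][j] >= alpha)
    return via_right != via_down
```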
Definition 3.3. Given a value α, the pixels fuzzy graph (r, s, D¹, D²) is consistent at level α if all squares sq(i,j) ∈ PS are consistent at level α.
Definition 3.4. Given a pixels fuzzy graph (r, s, D¹, D²), its consistency level, denoted as α*, is the maximum value α ∈ [α̲, ᾱ] for which the fuzzy graph is consistent.
Existence of such a consistency level α* is always assured, at least as long as our image contains a finite number of pixels. If some inconsistency is detected for a given α level, a decreasing procedure can be introduced in order to find a lower level α* assuring consistency. Such a procedure will be initialized with α* = ᾱ, and then we search among inconsistent squares sq(i,j) by means of a new function newalpha (see Gómez et al.⁹). We can therefore look for a value α* assuring consistency. Pixels are being classified either into a class "0" or a class "1", and in the next step we proceed to get a more precise color for both classes (class "0" will switch either into "00" or "01"). This will be done by alternatively activating only one of the classes already colored in a previous stage. Analogously, such a binary coloring process is applied in subsequent stages to those activated pixels under consideration (a subset of pixels P′ ⊂ P). This subset of pixels P′ getting a more precise color at each stage can also be characterized by a matrix act such that act(i,j) = 1, ∀(i,j) ∈ P′, and act(i,j) = 0, ∀(i,j) ∉ P′. We can compute the interval [α̲, ᾱ] for the activated pixels by means of a procedure initalpha(r, s, D¹, D², act). It may be the case that two adjacent pixels are not activated, and therefore the process should stop. This situation can be easily detected in the associated pseudocode initalpha, where the lowest distance between activated pixels α̲ is initialized as a very big value (the greatest value of distances between activated pixels is initialized as 0). Again, notice that a given square can be consistent for a value α but inconsistent for another value α′ < α. Hence, a decreasing procedure must be repeated for the overall set PS until we find a new level α* assuring that all squares are made consistent in the new coloring environment. Then, a function called consislevel is the core of our algorithm: it will iteratively compute the consistency level α* for the family of pixels actually activated (the initializing value will be ᾱ, which is obtained from the procedure initalpha). The input arguments of consislevel are the pixels fuzzy graph (r, s, D¹, D²) and the r × s matrix act. The interval [α̲, ᾱ] is computed after the procedure initalpha is called, α* being the returned value. The associated consislevel(r, s, D¹, D², act) pseudocode computes
the consistency level α*(act) for a given subset P′ ⊂ P of activated pixels. Following the standard order on the activated pixels (at every stage), the level α*(act) assures a valid binary coloring procedure col. In order to perform these computations, a procedure bincol_α(r, s, D¹, D², act, col) will compute the binary coloring of activated pixels at level α, the first call to this procedure taking as initialization col(i,j) = 0, ∀(i,j) ∈ P (see Gómez et al.⁹).
4. Final comments
The final objective of the algorithm we propose in this paper is to show the decision maker several possible pictures of the image, each one obtained by means of an automatic coloring procedure of each pixel based upon a particular distance. Such a coloring procedure takes into account the behavior of each pixel with respect to its surrounding pixels, and each color will suggest a possible class. Our coloring process is based upon a basic binary procedure, which is applied again and again, leading to a hierarchical structure of colors (i.e., possible classes). This basic binary procedure evaluates the distance between the measurable descriptions of adjacent pixels, assigning a color depending on whether such a distance is lower or higher than a previously chosen threshold. Each colored picture can be analyzed by decision makers in a posterior classification procedure: certain homogeneous regions can be identified, and a subsequent comparison may lead to a fuzzy classification, if we are able to evaluate the degree of concordance of each pixel with each one of those identified regions (see Amo et al.¹). Due to space limitations, pseudocodes have not been included in this paper, but they can be obtained from the authors upon request, together with additional details (see Gómez et al.⁹). Of course, the classification process induced by the previous binary coloring can be refined, and an appropriate relaxed coloring algorithm should be tried in order to bypass the computational inefficiency of the above crude coloring algorithm (see Gómez et al.⁹ for details). Once our basic binary coloring process has been successively applied t times, we shall be able to distinguish 2^t classes. Our complete coloring process is therefore equivalent to a hierarchical classification procedure, obtaining as output a set of nested clusters, to be properly analyzed.
Acknowledgments
This research has been partially supported by CICyT grant BFM2002-0281, Spain.
References
1. A. Amo, D. Gómez, J. Montero and G. Biging: Relevance and redundancy in fuzzy classification systems. Mathware and Soft Computing 8, 203-216 (2001).
2. A. Amo, J. Montero and V. Cutello: On the principles of fuzzy classification. Proc. N.A.F.I.P.S. Conference, 675-679 (1999).
3. A. Amo, J. Montero and G. Biging: Classifying pixels by means of fuzzy relations. Int. J. General Systems 29, 605-621 (2000).
4. A. Amo, J. Montero, G. Biging and V. Cutello: Fuzzy classification systems. European Journal of Operational Research (to appear).
5. A. Amo, J. Montero, A. Fernández, M. López, J. Tordesillas and G. Biging: Spectral fuzzy classification: an application. IEEE Trans. Syst. Man and Cyb. (C) 32, 42-48 (2002).
6. J.C. Bezdek and J.D. Harris: Fuzzy partitions and relations: an axiomatic basis for clustering. Fuzzy Sets and Systems 1, 111-127 (1978).
7. G.M. Foody: The continuum of classification fuzziness in thematic mapping. Photogrammetric Engineering and Remote Sensing 65, 443-451 (1999).
8. J.N. Mordeson and S. Nair: Fuzzy Graphs and Fuzzy Hypergraphs (Physica-Verlag, Heidelberg, 2000).
9. D. Gómez, J. Montero, J. Yáñez and C. Poidomani: A fuzzy graph coloring algorithm for image classification. Technical Report (Dept. Statistics and O.R., Complutense University, Madrid, Spain).
10. L. Kóczy: Fuzzy graphs in the evaluation and optimization of networks. Fuzzy Sets and Systems 46, 307-319 (1992).
11. P.M. Pardalos, T. Mavridou and J. Xue: The graph coloring problem: a bibliographic survey. In: D.Z. Du and P.M. Pardalos (Eds.): Handbook of Combinatorial Optimization, vol. 2 (Kluwer Academic Publishers, Boston, 1998); 331-395.
12. J. Montero, A. Pearman and J. Tejada: Fuzzy multicriteria decision support for budget allocation in the transport sector. TOP 3, 47-68 (1995).
13. A. Rosenfeld: Fuzzy graphs. In: L.A. Zadeh, K.S. Fu and M. Shimura (Eds.): Fuzzy Sets and Their Applications to Cognitive and Decision Processes (Academic Press, New York, 1975); 77-95.
FUZZY MODELS TO DEAL WITH HETEROGENEOUS INFORMATION IN DECISION MAKING PROBLEMS IN ENGINEERING PROCESSES*
L. MARTÍNEZ
Dept. of Computer Science, University of Jaén, 23071 Jaén, Spain
e-mail: [email protected]
J. LIU
Manchester School of Management, UMIST, PO Box 88, Manchester M60 1QD, UK
e-mail: [email protected]
D. RUAN
Belgian Nuclear Research Centre (SCK•CEN), Boeretang 200, 2400 Mol, Belgium
e-mail: [email protected]
Before implementing an engineering system, different design proposals are studied in the design process in order to evaluate and rank them. In this evaluation process several experts assess different aspects and criteria according to their knowledge of and preference for them. These criteria may have different natures (quantitative, qualitative), and the experts could belong to different areas and have different knowledge of each criterion, so the assessments used to express the value of each criterion could be given with different types of information (numerical, linguistic, interval-valued). In such a case, to select the best proposal we must deal with this heterogeneous information to evaluate and rank the different proposals. In this contribution we show different fuzzy approaches for dealing with heterogeneous information.
1. Introduction
In the design of traditional engineering systems the main objective when selecting a design option is to minimize cost. In recent years, however, design selection has become more complex due to the need to take into account criteria such as safety, cost and technical performance
*This work has been partially supported by the research project TIC 2002-03348, and by the UK Eng. and Phy. Sci. Research Council (EPSRC) under grant no. GR/R30624.
simultaneously. In the future, all solicitations involving source selection should be structured using safety, cost and technical performance considerations⁹. Also, the decision to implement a design in an engineering system will depend on whether the design can satisfy technical and economical constraints. Therefore, Multi-Criteria Decision Making (MCDM) techniques²,¹⁰ could be applied for ranking the different design options. In these multi-expert MCDM (ME-MCDM) problems the preferences provided by the experts for the different criteria may be expressed with different types of information, depending on the knowledge of the experts and on the nature of the criteria (quantitative and qualitative). When the experts do not have precise knowledge about the criteria, probability theory could be useful to deal with vague information, but it is not difficult to find many aspects of uncertainty that do not have a probabilistic character, since they are related to the imprecision and vagueness of meanings. In addition, qualitative aspects are difficult to assess by means of precise numbers. To rank engineering designs using ME-MCDM we deal with criteria such as safety, cost and technical performance. In these problems, intrinsically vague information appears and could be assessed by means of numerical information (probabilistic) or interval values, and in those cases in which the nature of the criterion is qualitative, the use of linguistic information¹² is common and suitable. Therefore, it is not a seldom situation to deal with numerical, interval-valued and linguistic information in the evaluation process of engineering designs⁸,¹¹. We shall call this type of information heterogeneous information. The decision model to rank the different designs assessed by means of heterogeneous information, taking into account the criteria of safety, cost and technical performance, will use a framework as the following one⁸,¹¹:
• Safety assessments will be synthesized for each design. Cost and technical assessments will be provided by the experts.
• These assessments will be the input information for a ME-MCDM problem that we shall solve to rank the different designs.
In this ME-MCDM problem the assessments for the criteria will be combined to obtain a degree of suitability of each design option. The main difficulty in solving this problem is that the values used to assess the criteria are expressed in different utility spaces (heterogeneous information), and we shall need to unify the different utility spaces to combine the input information in order to obtain the degree of suitability of each design option. In this contribution we shall show an approach to unify this
heterogeneous information dealing with fuzzy sets, and afterwards how this approach can be improved using the linguistic 2-tuple model⁴. This contribution is structured as follows. In section 2 we show how to unify heterogeneous information dealing with fuzzy sets. In section 3 we review the linguistic 2-tuple model and its application to deal with heterogeneous information. Finally some conclusions are pointed out.
2. Using fuzzy sets to deal with heterogeneous information
We must keep in mind that we are dealing with heterogeneous contexts composed of numerical, interval-valued and linguistic information. Our aim is to rank the different proposals characterized with this type of information. So, we need to unify the heterogeneous information into a common utility space to operate on it easily. Here, we show how to unify numerical, interval-valued and linguistic information into a common utility space that is the set of fuzzy sets on a linguistic term set, S_T. The common utility space S_T may be chosen depending on the specific problem, according to the conditions shown in ⁴. Afterwards, each numerical, interval-valued and linguistic evaluation is transformed into a fuzzy set in S_T, F(S_T), using the following transformation functions:
(1) Transforming numerical values, υ ∈ [0,1], into F(S_T):
τ : [0,1] → F(S_T)
τ(υ) = {(s_0, γ_0), ..., (s_g, γ_g)}, s_i ∈ S_T and γ_i ∈ [0,1]
Remark: We consider that the membership functions μ_{s_i}(·) of the linguistic labels s_i ∈ S_T are represented by a parametric function (a_i, b_i, d_i, c_i), γ_i being the degree of membership of the number υ in the linguistic term s_i of S_T.
(2) Transforming linguistic terms, s_j^l ∈ S, into F(S_T):
τ_{SS_T} : S → F(S_T)
τ_{SS_T}(s_j^l) = {(c_k, γ_k^j) / k ∈ {0,...,g}}, s_j^l ∈ S
γ_k^j = max_y min{μ_{s_j^l}(y), μ_{c_k}(y)}
where μ_{s_j^l}(·) and μ_{c_k}(·) are the membership functions of the fuzzy sets associated with the terms s_j^l and c_k, respectively.
(3) Transforming interval-valued values, s_{ij} in [0,1], into F(S_T). Let I = [i̲, ī] be an interval in [0,1]. We assume that the interval has a representation inspired by the membership function of fuzzy sets⁷:
μ_I(ϑ) = 0 if ϑ < i̲; 1 if i̲ ≤ ϑ ≤ ī; 0 if ϑ > ī.
The transformation function is:
τ_{IS_T} : I → F(S_T)
τ_{IS_T}(s_{ij}) = {(c_k, γ_k^i) / k ∈ {0,...,g}}, γ_k^i = max_y min{μ_{s_{ij}}(y), μ_{c_k}(y)}
where μ_{s_{ij}}(·) is the membership function associated with the interval-valued s_{ij}.
At this moment all the input information (heterogeneous information) is expressed in a common utility space and we can operate with this information easily to obtain a ranking of the alternatives. This method has been applied successfully in the process of safety synthesis in ⁸,¹¹.
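As an illustration of these transformations, the following Python sketch unifies a numerical value and an interval into fuzzy sets on S_T; the uniformly distributed triangular term set and all helper names are our assumptions, not fixed by the paper:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership with support [a, c] and peak b (shoulders allowed)."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a) if b > a else np.ones_like(x)
    right = (c - x) / (c - b) if c > b else np.ones_like(x)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

# S_T: g+1 = 5 uniformly distributed triangular labels on [0, 1].
g = 4
peaks = np.linspace(0.0, 1.0, g + 1)
labels = [(max(p - 1 / g, 0.0), p, min(p + 1 / g, 1.0)) for p in peaks]

def tau_numeric(v):
    """tau(v): membership degree of the crisp value v in each label of S_T."""
    return [float(tri(v, *lab)) for lab in labels]

def tau_interval(lo, hi, grid=np.linspace(0, 1, 1001)):
    """tau_I([lo, hi]): for each label, max_y min{mu_I(y), mu_label(y)},
    where mu_I is 1 on [lo, hi] and 0 outside."""
    mu_I = ((grid >= lo) & (grid <= hi)).astype(float)
    return [float(np.max(np.minimum(mu_I, tri(grid, *lab)))) for lab in labels]

print(tau_numeric(0.62))       # nonzero membership in s_2 and s_3
print(tau_interval(0.4, 0.6))  # labels with a peak inside [0.4, 0.6] get degree 1
```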
3. Using 2-tuples to deal with heterogeneous information
The use of fuzzy sets allows us to unify the heterogeneous information, but the results used to rank the different proposals will be fuzzy sets, which are not straightforward to order and not easy to understand for all the experts. However, the use of the linguistic 2-tuple model will allow us to order the different proposals directly, and the results will be easily understandable by all the experts. Now, we briefly review the linguistic 2-tuple model and show how to convert the fuzzy sets obtained in section 2 into linguistic 2-tuples.
3.1. The 2-Tuple Fuzzy Linguistic Representation Model
The 2-tuple fuzzy linguistic representation model, presented in ⁴, will be used in this contribution to unify the heterogeneous information. This model is based on symbolic methods and takes as the basis of its representation the concept of Symbolic Translation.
Definition 1. The Symbolic Translation of a linguistic term s_i ∈ S = {s_0, ..., s_g} is a numerical value assessed in [−0.5, 0.5) that supports the "difference of information" between an amount of information β ∈ [0,g] and the closest value in {0,...,g}, which indicates the index of the closest linguistic term s_i ∈ S, [0,g] being the interval of granularity of S.
From this concept the linguistic information is represented by means of 2-tuples (r_i, α_i), r_i ∈ S and α_i ∈ [−0.5, 0.5). This model defines a set of functions between linguistic 2-tuples and numerical values.
Definition 2. Let S = {s_0, ..., s_g} be a linguistic term set and β ∈ [0,g] a value supporting the result of a symbolic aggregation operation; then the 2-tuple that expresses the equivalent information to β is obtained with the following function:
Δ : [0,g] → S × [−0.5, 0.5)
Δ(β) = (s_i, α), with i = round(β) and α = β − i, α ∈ [−0.5, 0.5)
where round(·) is the usual rounding operation, s_i has the closest index label to β, and α is the value of the symbolic translation.
Proposition 1. Let S = {s_0, ..., s_g} be a linguistic term set and (s_i, α) be a linguistic 2-tuple. There is always a Δ⁻¹ function such that, from a 2-tuple, it returns its equivalent numerical value β ∈ [0,g] in the interval of granularity of S.
Proof. It is trivial; we consider the function:
Δ⁻¹ : S × [−0.5, 0.5) → [0,g]
Δ⁻¹(s_i, α) = i + α = β
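A minimal sketch of Δ and Δ⁻¹ follows (the naming is ours; note also that Python's round applies banker's rounding, whereas a half-up rule may be intended for half-integer β):

```python
def delta(beta):
    """Delta(beta) = (s_i, alpha): closest label index plus symbolic translation."""
    i = int(round(beta))
    return i, beta - i  # alpha in [-0.5, 0.5)

def delta_inv(i, alpha):
    """Delta^{-1}(s_i, alpha) = i + alpha = beta."""
    return i + alpha

print(delta(3.25))         # -> (3, 0.25), i.e. the 2-tuple (s_3, 0.25)
print(delta_inv(3, 0.25))  # -> 3.25
```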
A linguistic computational model for 2-tuples was introduced in ⁵.
3.2. Transforming fuzzy sets in S_T into linguistic 2-tuples
In section 2 the heterogeneous information was unified by means of fuzzy sets in the common utility space S_T; now we shall transform them into linguistic 2-tuples in S_T. This transformation is carried out using the function χ and the Δ function (Def. 2):
χ : F(S_T) → [0,g]
χ(τ(·)) = χ({(s_j, γ_j), j = 0,...,g}) = (Σ_{j=0}^{g} j·γ_j) / (Σ_{j=0}^{g} γ_j) = β
β is a numerical value in the granularity interval of S_T, i.e., S_T = {s_0, ..., s_g}, β ∈ [0,g]. Then, to obtain the linguistic 2-tuple from β we shall use the Δ function presented in Definition 2: Δ(β) = (s_i, α).
Now all the input information is expressed in a common utility space, S_T, by means of linguistic 2-tuples, so we can use all the linguistic 2-tuple operators to obtain the results we are looking for. This model has been used to deal with heterogeneous information in evaluation and decision processes in ³,⁶.
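Reusing the helpers sketched above, the complete route from a heterogeneous input to a 2-tuple can be illustrated as follows (again only a sketch under the same assumptions):

```python
def chi(gammas):
    """chi: gammas[j] is the membership degree of label s_j;
    returns beta in [0, g] as the centre of gravity over label indices."""
    num = sum(j * gj for j, gj in enumerate(gammas))
    den = sum(gammas)
    return num / den

gammas = tau_numeric(0.62)  # fuzzy set on S_T from the earlier sketch
print(delta(chi(gammas)))   # heterogeneous input -> linguistic 2-tuple
```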
4. Conclusions
In engineering we can face problems involving decision processes that deal with information assessed in different utility spaces. In this contribution we have shown two fuzzy approaches to deal easily with heterogeneous information composed of numerical, interval-valued and linguistic values. In the future we shall apply these approaches to the whole decision process in the engineering problem.
References
1. D. Dubois and H. Prade. Fuzzy Sets and Systems: Theory and Applications. Kluwer Academic, New York, 1980.
2. L. Gin-Shuh and J. Wang Mao-Jiun. Personnel selection using fuzzy MCDM algorithm. European Journal of Operational Research, 78(1):22-33, 1994.
3. F. Herrera, E. Herrera-Viedma, L. Martínez, and P.J. Sánchez. A linguistic decision process for evaluating the installation of an ERP system. In 9th International Conference on Fuzzy Theory and Technology, Cary (North Carolina), USA, 2003.
4. F. Herrera and L. Martínez. A 2-tuple fuzzy linguistic representation model for computing with words. IEEE Transactions on Fuzzy Systems, 8(6):746-752, 2000.
5. F. Herrera and L. Martínez. The 2-tuple linguistic computational model. Advantages of its linguistic description, accuracy and consistency. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(Suppl.):33-49, 2001.
6. F. Herrera, L. Martínez, and P.J. Sánchez. Managing non-homogeneous information in group decision making. European Journal of Operational Research, to appear, 2004.
7. D. Kuchta. Fuzzy capital budgeting. Fuzzy Sets and Systems, 111:367-385, 2000.
8. J. Liu, J.B. Yang, J. Wang, H.S. Sii, and Y.W. Wang. Fuzzy rule-based evidential reasoning approach for safety analysis. International Journal of General Systems, in press, 2004.
9. H.S. Sii and J. Wang. A subjective design for safety framework for offshore engineering products. In Workshops on Reliability and Risk Based Inspection Planning and ESRA Technical Committee on Offshore Safety, Zurich, Switzerland, 2000.
10. R.R. Yager. Non-numeric multi-criteria multi-person decision making. Group Decision and Negotiation, 2:81-93, 1993.
11. J.B. Yang, J. Liu, J. Wang, and H.S. Sii. A generic knowledge-based inference methodology using the evidential reasoning approach - RIMER. IEEE Transactions on Systems, Man, and Cybernetics, to appear, 2004.
12. L.A. Zadeh. The concept of a linguistic variable and its applications to approximate reasoning. Information Sciences, Parts I, II, III, 8:199-249, 8:301-357, 9:43-80, 1975.
SELF-TUNING METHOD FOR FUZZY RULE BASE WITH BELIEF STRUCTURE
JUN LIU¹, LUIS MARTINEZ², JIAN-BO YANG¹, JIN WANG³
¹ Manchester School of Management, UMIST, PO Box 88, Manchester M60 1QD, UK
² Dept. of Computer Science, University of Jaén, E-23071 Jaén, Spain
³ School of Engineering, Liverpool John Moores University, Liverpool, UK
A framework for modelling the safety of an engineering system using a fuzzy rule-based evidential reasoning (FURBER) approach has been proposed recently, where a fuzzy rule base designed on the basis of a belief structure (called a belief rule expression matrix) forms a basis of the inference mechanism of FURBER. In this paper, a learning method for optimally training the elements of the belief rule expression matrix and the other knowledge representation parameters in FURBER is proposed. This process is formulated as a nonlinear objective function to minimize the differences between the output of a belief rule base and given data. The optimization problem is solved using the optimization tool provided in MATLAB. A numerical example is provided to demonstrate how the method can be implemented.
1. Introduction
A framework for modelling the safety of an engineering system using a fuzzy rule-based evidential reasoning (FURBER) approach was recently proposed [2]. In the framework, a fuzzy rule base designed on the basis of a belief structure is used to capture uncertainty and non-linear relationships between the parameters, and the inference of the rule-based system is implemented using the evidential reasoning algorithm [3]. A belief rule expression matrix forms a basis of the inference mechanism of FURBER. It is a framework for representing expert knowledge, but it is difficult to determine its elements entirely subjectively, in particular for a large scale rule base with hundreds of rules. Also, a change in a rule weight or an attribute weight may lead to significant changes in the performance of a belief rule base. As such, there is a need to develop a method that can generate an optimal rule expression matrix using expert judgments as well as statistical data. In this paper, a learning method for optimally training the elements of the belief rule expression matrix and the other knowledge representation parameters in FURBER is proposed. This process is formulated as a nonlinear objective function to minimize the differences between the output of a belief rule base and given data, and is solved using the optimization tool provided in MATLAB.
2. Fuzzy Rule-Based Evidential Reasoning (FURBER) Approach
This section reviews the FURBER framework [2]. To take into account belief degrees, attribute weights and rule weights in a rule, a belief rule base is given by R = {R_1, R_2, ..., R_L}. The kth rule can be represented as follows:
R_k: IF U is A^k THEN D with belief degrees β̄_k, with a rule weight θ_k and attribute weights δ_1, δ_2, ..., δ_{T_k},
where U represents the antecedent attribute vector (U_1, ..., U_{T_k}); A^k the packet antecedents {A_1^k, ..., A_{T_k}^k}, where A_i^k (i = 1, ..., T_k) is the linguistic value of the ith antecedent attribute in the kth rule; T_k the number of antecedent attributes used in the kth rule; D the consequent vector (D_1, ..., D_N); and β̄_k the vector of the belief degrees (β_{1k}, ..., β_{Nk}) for k ∈ {1, ..., L}. This is the vector form of a belief rule; β_{ik} measures the degree to which D_i is the consequent if the input activates the antecedent A^k in the kth rule, for i = 1, ..., N, k = 1, ..., L. L is the number of rules in the rule base and N is the number of possible consequents. The rule base can be summarized using the belief rule expression matrix shown in Table 1.
Table 1: A belief rule expression matrix
In the matrix, w_k is the activation weight of A^k, which measures the degree to which the kth rule is weighted and activated. The degree of activation of the kth rule, w_k, is calculated as:
w_k = θ_k Π_{i=1}^{T_k} (α_i^k)^{δ̄_i} / Σ_{l=1}^{L} θ_l Π_{i=1}^{T_l} (α_i^l)^{δ̄_i}, with δ̄_i = δ_i / max_j {δ_j}   (1)
where α_i^k (i = 1, ..., T_k) is the degree of belief to which the input for U_i belongs to A_i^k of the ith individual antecedent in the kth rule. In this paper we assume that the α_i^k (i = 1, ..., T_k) are already given. Based on the above belief rule expression matrix, we can use the evidential reasoning (ER) approach [3] to combine rules and generate the final conclusions. Using the overall ER algorithm [5], the combined degree of belief β_j in D_j is generated as follows:
β_j = μ* [ Π_{k=1}^{L} ( w_k β_{j,k} + 1 − w_k Σ_{i=1}^{N} β_{i,k} ) − Π_{k=1}^{L} ( 1 − w_k Σ_{i=1}^{N} β_{i,k} ) ] / [ 1 − μ* Π_{k=1}^{L} ( 1 − w_k ) ],  j = 1, ..., N   (2)
where μ* = [ Σ_{j=1}^{N} Π_{k=1}^{L} ( w_k β_{j,k} + 1 − w_k Σ_{i=1}^{N} β_{i,k} ) − (N − 1) Π_{k=1}^{L} ( 1 − w_k Σ_{i=1}^{N} β_{i,k} ) ]^{−1}.
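A minimal numerical sketch of the combination (2) may clarify the computation; the activation weights and belief matrix below are illustrative values, not taken from the paper:

```python
import numpy as np

def er_combine(w, beta):
    """Analytical ER combination (2).
    w: shape (L,) activation weights; beta: shape (N, L), beta[j, k] is the
    belief in consequent D_j from rule k. Returns combined beliefs beta_j."""
    N, L = beta.shape
    col_sum = beta.sum(axis=0)                  # sum_i beta[i, k] per rule
    a = w * beta + (1.0 - w * col_sum)          # shape (N, L) product terms
    prod_a = np.prod(a, axis=1)                 # product over rules, per j
    prod_b = np.prod(1.0 - w * col_sum)         # "unassigned" product
    prod_c = np.prod(1.0 - w)
    mu_star = 1.0 / (prod_a.sum() - (N - 1) * prod_b)
    return mu_star * (prod_a - prod_b) / (1.0 - mu_star * prod_c)

w = np.array([0.6, 0.4])
beta = np.array([[0.7, 0.2],   # beliefs in D_1 from rules 1, 2
                 [0.3, 0.8]])  # beliefs in D_2 from rules 1, 2
out = er_combine(w, beta)
print(out, out.sum())          # complete beliefs combine to a sum of 1
```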
supp(R̃) = cl({x : μ_R̃(x) > 0}) = [a_0^l, a_0^r], where μ_R̃(x) is the membership function of R̃ and cl is the closure operator. The normality of the fuzzy number implies that the 1-level set {x : μ_R̃(x) = 1} is not an empty set. Let us denote a_1 = a_0^l, a_2 = a_1^l, a_3 = a_1^r and a_4 = a_0^r. A fuzzy number R̃ is called an L-R fuzzy number if its membership function can be represented as:
μ(x) = L((a_2 − x)/α_l) if x < a_2; 1 if a_2 ≤ x ≤ a_3; R((x − a_3)/α_r) if x > a_3   (1)
where the function R is a nonincreasing, left-continuous function on the real line satisfying R(0) = 1, the function L is a nondecreasing, right-continuous function on the real line satisfying L(0) = 1, and the spreads are α_l = a_2 − a_1 and α_r = a_4 − a_3. The functions L and R are called the left- and right-hand sides.
The trapezoidal fuzzy number is the simplest form of L-R fuzzy number. The sides of this fuzzy number are linear; we denote a trapezoidal fuzzy number by (a_1, a_2, a_3, a_4)¹,³. When a_2 = a_3 we obtain the triangular fuzzy number, the conditions a_1 = a_2 and a_3 = a_4 imply the closed interval and, in the case a_1 = a_2 = a_3 = a_4, we obtain a crisp real number.
Definition 2.1. Let X̃ be an L-R fuzzy number with membership function μ_X̃(x), x ∈ R. If the integral ∫_{−∞}^{∞} |x| μ_X̃(x) dx < ∞, then
E(X̃) = ∫_{−∞}^{∞} x μ_X̃(x) dx / ∫_{−∞}^{∞} μ_X̃(x) dx   (2)
is defined as the expectation of X̃.
Now we assume that R̃_i is the L-R fuzzy number representing the ith expert's subjective estimation of the rating of an alternative under a given criterion. Let R̃ = F(R̃_1, R̃_2, ..., R̃_n) be the consensus of opinions. How to construct F by combining these evaluated ratings R̃_i (i = 1, 2, ..., n) is an important issue. In this paper, the simple additive weighting (SAW) method is employed. For example, if the opinions of four experts are represented by positive trapezoidal fuzzy numbers as follows:
R̃_1 = (1, 2, 3, 4), R̃_2 = (1, 3, 3.5, 4), R̃_3 = (2, 2.5, 3, 5), R̃_4 = (1.5, 2, 3, 4)
and their weights are 0.3, 0.4, 0.2, 0.1, then the result of aggregation would be
R̃ = 0.3 ⊗ R̃_1 ⊕ 0.4 ⊗ R̃_2 ⊕ 0.2 ⊗ R̃_3 ⊕ 0.1 ⊗ R̃_4 = (1.25, 2.5, 3.2, 4.2).
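The arithmetic of this example is easy to check with a few lines of Python; for trapezoidal numbers the SAW combination acts componentwise (a sketch with our naming):

```python
import numpy as np

def saw(opinions, weights):
    """SAW aggregation of trapezoidal fuzzy numbers (a1, a2, a3, a4):
    the weighted combination acts componentwise on the four parameters."""
    opinions = np.asarray(opinions, dtype=float)   # shape (n, 4)
    weights = np.asarray(weights, dtype=float)     # shape (n,), sums to 1
    return weights @ opinions

opinions = [(1, 2, 3, 4), (1, 3, 3.5, 4), (2, 2.5, 3, 5), (1.5, 2, 3, 4)]
print(saw(opinions, [0.3, 0.4, 0.2, 0.1]))  # -> [1.25 2.5 3.2 4.2]
```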
3. Aggregation Method
In this section, we give a new method for measuring aggregated opinion. Let R̃_1, R̃_2, ..., R̃_n be L-R fuzzy numbers representing the experts' opinions. Their membership functions are μ_R̃_i(x) (i = 1, 2, ..., n) and their expectations E(R̃_i) (i = 1, 2, ..., n) exist. We move the membership functions μ_R̃_i(x) (i = 1, 2, ..., n) towards the left according to the size of E(R̃_i) and obtain new fuzzy numbers K̃_1, K̃_2, ..., K̃_n with new membership functions μ_K̃_i(x) = μ_R̃_i(x + E(R̃_i)) (i = 1, 2, ..., n). Then a new similarity degree between R̃_i, R̃_j is defined as follows:
S(R̃_i, R̃_j) = ∫_X min{μ_K̃_i(x), μ_K̃_j(x)} dx / ∫_X max{μ_K̃_i(x), μ_K̃_j(x)} dx   (3)
The similarity degree S(R̃_i, R̃_j) between expert E_i and expert E_j is determined by the proportion of the consistent area ∫_X min{μ_K̃_i(x), μ_K̃_j(x)} dx to the total area ∫_X max{μ_K̃_i(x), μ_K̃_j(x)} dx, and it is an index of the similarity of the opinions of expert E_i and expert E_j. It represents the similarity degree of expert E_i and expert E_j in the case that their opinion expectations equal zero. Since in many cases experts' opinions do not intersect, it is essential to choose a basic point from which to measure the similarity degree of the experts' opinions. In this paper, the basic point is that the experts' opinion expectations equal zero. Obviously,
0 ≤ S(R̃_i, R̃_j) ≤ 1,  S(R̃_i, R̃_j) = S(R̃_j, R̃_i).   (4)
Definition 3.1. Let R̃_i, R̃_j be two L-R fuzzy trapezoidal numbers; then the mean distance of R̃_i, R̃_j is defined by:
d_m(R̃_i, R̃_j) = |E(R̃_i) − E(R̃_j)|   (5)
Since E(R̃_i), E(R̃_j) are the centroids of the opinions of expert E_i and expert E_j, d_m(R̃_i, R̃_j) is used as an index of the opinion difference of expert E_i and expert E_j. Considering that S(R̃_i, R̃_j) may be equal to zero, the new measure function is defined as follows:
T(R̃_i, R̃_j) = 0.5 · S(R̃_i, R̃_j) / (1 + d_m(R̃_i, R̃_j)) + 0.5 / (1 + d_m(R̃_i, R̃_j))   (6)
T(R̃_i, R̃_j) considers not only the similarity of the opinions, but also the opinion differences of expert E_i and expert E_j. The larger the d_m(R̃_i, R̃_j), the smaller the agreement degree; contrarily, the larger the S(R̃_i, R̃_j), the larger the similarity degree. The average agreement degree of expert E_i (i = 1, 2, ..., n) is obtained by averaging its agreement degrees with respect to the other experts:
A(E_i) = (1 / (n − 1)) Σ_{j=1, j≠i}^{n} T(R̃_i, R̃_j)   (7)
Without considering the ith expert's degree of importance, the aggregation weight of E_i is then given by
W(E_i) = A(E_i) / Σ_{j=1}^{n} A(E_j)   (8)
Since we have obtained the weights of the experts' opinions, we can combine all the experts' opinions into a consensus opinion by:
R̃ = W(E_1) ⊗ R̃_1 ⊕ W(E_2) ⊗ R̃_2 ⊕ ... ⊕ W(E_n) ⊗ R̃_n   (9)
We summarize the method discussed above and propose an algorithm to combine all the experts' opinions into a consensus opinion.
Algorithm
Initial step: For the criterion and an alternative under a group decision making environment, each expert E_i (i = 1, 2, ..., n) proposes his opinion as an L-R fuzzy number denoted by R̃_i with membership function μ_R̃_i(x).
Step 1: Calculate the expectation E(R̃_i) of each fuzzy number R̃_i; then move its membership function towards the left according to the size of its expectation. New fuzzy numbers K̃_1, K̃_2, ..., K̃_n with membership functions μ_K̃_i(x) = μ_R̃_i(x + E(R̃_i)) (i = 1, 2, ..., n) are obtained.
Step 2: Calculate the similarity degrees S(R̃_i, R̃_j) (i, j = 1, 2, ..., n, i ≠ j) between K̃_i, K̃_j.
Step 3: Calculate the mean distances d_m(R̃_i, R̃_j) (i, j = 1, 2, ..., n, i ≠ j) between R̃_i, R̃_j.
Step 4: Calculate the agreement degrees T(R̃_i, R̃_j) (i, j = 1, 2, ..., n, i ≠ j).
Step 5: Calculate the average agreement degree A(E_i) (i = 1, 2, ..., n) of each expert E_i.
Step 6: Determine the weights W(E_i).
Step 7: Combine all the experts' opinions into a consensus opinion R̃.
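A compact numerical sketch of the whole algorithm for trapezoidal opinions may be useful; the grid-based integration and all helper names are ours, and the measure function is taken as reconstructed in (6):

```python
import numpy as np

GRID = np.linspace(-10.0, 10.0, 20001)

def trap_mu(x, a1, a2, a3, a4):
    """Membership function of a trapezoidal fuzzy number (linear sides)."""
    up = np.clip((x - a1) / (a2 - a1), 0, 1) if a2 > a1 else (x >= a1).astype(float)
    down = np.clip((a4 - x) / (a4 - a3), 0, 1) if a4 > a3 else (x <= a4).astype(float)
    return np.minimum(up, down)

def expectation(a):
    """Expectation (2): centroid of the membership function."""
    mu = trap_mu(GRID, *a)
    return (GRID * mu).sum() / mu.sum()

def consensus(opinions):
    opinions = [np.asarray(o, float) for o in opinions]
    exps = [expectation(o) for o in opinions]
    mus = [trap_mu(GRID, *(o - e)) for o, e in zip(opinions, exps)]  # K_i
    n = len(opinions)
    A = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                S = np.minimum(mus[i], mus[j]).sum() / np.maximum(mus[i], mus[j]).sum()
                dm = abs(exps[i] - exps[j])
                A[i] += 0.5 * S / (1 + dm) + 0.5 / (1 + dm)  # T, eq. (6)
    A /= n - 1                                               # eq. (7)
    W = A / A.sum()                                          # eq. (8)
    return W @ np.vstack(opinions), W                        # eq. (9), SAW

R, W = consensus([(1, 2, 3, 4), (1, 3, 3.5, 4), (2, 2.5, 3, 5), (1.5, 2, 3, 4)])
print(W, R)
```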
4. Conclusions
In this paper, an aggregation method is proposed to aggregate individual opinions into a group consensus opinion under group decision making when the opinions are represented by L-R fuzzy numbers. The method extends some of the previous methods in that it can deal not only with the situation when the fuzzy numbers intersect, but also when the fuzzy numbers are disjoint. Using the mean distance may also help to avoid the loss of information. It is certainly interesting and useful in group decision making.
References
1. Ph. Diamond, Least squares methods in fuzzy data analysis, Proc. IFSA'91, Brussels, Management & System Science, 1991, pp. 60-64.
2. M. Fedrizzi and J. Kacprzyk, On measuring consensus in the setting of fuzzy preference relations, in: J. Kacprzyk and M. Roubens, Eds., Non-conventional Preference Relations in Decision Making (Springer, Berlin, 1988) 129-141.
3. S. Heilpern, Using a distance between fuzzy numbers in socio-economic systems, in: R. Trappl (Ed.), Cybernetics and Systems'94, World Scientific, Singapore, 1994, pp. 279-286.
4. H.M. Hsu and C.T. Chen, Aggregation of fuzzy opinions under group decision making, Fuzzy Sets and Systems 79 (1996) 279-285.
5. A. Ishikawa, M. Amagasa, T. Shiga, G. Tomizawa, R. Tatsuta and H. Mieno, The max-min Delphi method and fuzzy Delphi method via fuzzy integration, Fuzzy Sets and Systems 55 (1993) 241-253.
6. J. Kacprzyk and M. Fedrizzi, A soft measure of consensus in the setting of partial (fuzzy) preferences, European J. Oper. Res. 34 (1988) 315-325.
7. J. Kacprzyk, M. Fedrizzi and H. Nurmi, Group decision making and consensus under fuzzy preferences and fuzzy majority, Fuzzy Sets and Systems 49 (1992) 21-31.
8. G. Munda et al., Qualitative multicriteria methods for fuzzy evaluation problems: an illustration of economic-ecological evaluation, Eur. J. OR 82 (1995) 79-97.
9. H. Nurmi, Approaches to collective decision making with fuzzy preference relations, Fuzzy Sets and Systems 6 (1981) 249-259.
10. T. Tanino, On group decision making under fuzzy preferences, in: J. Kacprzyk and M. Fedrizzi, Eds., Multiperson Decision Making Using Fuzzy Sets and Possibility Theory (Kluwer Academic Publishers, Dordrecht, 1990) 172-185.
11. R.N. Xu and X.Y. Zhai, Extensions of the analytic hierarchy process in fuzzy environment, Fuzzy Sets and Systems 52 (1992) 251-257.
12. R. Zwick, E. Carlstein and D.V. Budescu, Measures of similarity among fuzzy concepts: a comparative analysis, Internat. J. Approx. Reason. 1 (1987) 221-242.
A NEW METHOD WITH PROJECTION TECHNIQUE FOR FUZZY MULTI-ATTRIBUTE DECISION MAKING *
JIBIN LAN
School of Economics and Management, South West Jiaotong University;
School of Mathematics and Information Science, Guangxi University, Nanning, Guangxi, P.R. China, 530004
E-mail: [email protected]
YANG XU AND JIAZHONG LIU
Center of Intelligent Control and Development, South West Jiaotong University, Chengdu, Sichuan, P.R. China, 610031
E-mail: [email protected]
Abstract: A conception of the left and right projection, by which one fuzzy vector is projected onto another, is introduced in this paper. The purpose is to propose a new method to select an optimal alternative in a fuzzy multi-attribute decision making environment. Using the conception of the left and right projection, the differences of each alternative with the fuzzy ideal solution and the fuzzy negative ideal solution are projected onto the fuzzy weight vector. The size of the combination projection coefficient, which combines the left and right projection coefficients, is used as a judgement standard to measure each alternative. The decision making criterion is: the larger the combination projection coefficient, the more superior the alternative.
Keywords: alternative; fuzzy weight vector; fuzzy ideal solution; fuzzy vector projection; combination projection coefficient; fuzzy multi-attribute decision making.
1. Introduction
Many different methods¹,²,³,⁴,⁵,⁶,⁸,⁹,¹⁰,¹¹ have been employed to deal with fuzzy multi-attribute decision making. The technique for ordering preference by similarity to ideal solution (TOPSIS) is one of them; the basic idea of TOPSIS is that both the fuzzy positive ideal solution and the fuzzy negative ideal solution serve as frames of reference. The Hamming distance is employed to measure the differences between each alternative and the fuzzy positive
*This work is supported by the Chinese NSF grant 60074014.
ideal solution or the fuzzy negative ideal solution. The principle of decision making is: the smaller the Hamming distance between an alternative and the fuzzy positive ideal solution, the more superior the alternative; or the larger the Hamming distance between an alternative and the fuzzy negative ideal solution, the more superior the alternative. In this paper, we will introduce a conception of the left and right projection of two fuzzy vectors. We consider that the weight vector should express the decision maker's predilection for every attribute, and that the weight vector should become a standard scale against which to measure whether each alternative is superior or inferior.
2. Preliminaries
A fuzzy number R̃ is called an L-R fuzzy number if its membership function can be represented by:
μ(x) = L((a_2 − x)/α_l) if x < a_2; 1 if a_2 ≤ x ≤ a_3; R((x − a_3)/α_r) if x > a_3   (1)
where R is a nonincreasing, left-continuous function on the real line satisfying R(0) = 1, L is a nondecreasing, right-continuous function on the real line satisfying L(0) = 1, and α_l, α_r are the spreads. The functions L and R are called the left- and right-hand sides. In many applications connected with data analysis we do not need fuzzy data but the simpler forms, closed intervals or crisp numbers. We introduce the left and right expected values⁷ of the fuzzy number R̃, which can be defined by:
E_*(R̃) = a_2 − α_l ∫_0^∞ L(z) dz,  E^*(R̃) = a_3 + α_r ∫_0^∞ R(z) dz.
Definition 2.1. Let R̃_1, R̃_2, ..., R̃_n be fuzzy numbers with membership functions μ_R̃_1(x), μ_R̃_2(x), ..., μ_R̃_n(x). The minimum and maximum fuzzy sets of R̃_1, R̃_2, ..., R̃_n are denoted by R̃⁻ and R̃⁺. Their membership functions are represented (via the extension principle) by:
μ_R̃⁻(x) = sup{ min{μ_R̃_1(x_1), ..., μ_R̃_n(x_n)} : min(x_1, ..., x_n) = x },
μ_R̃⁺(x) = sup{ min{μ_R̃_1(x_1), ..., μ_R̃_n(x_n)} : max(x_1, ..., x_n) = x }.
3. The new method
The basic model of fuzzy multi-attribute decision making can be described as follows. Given are an alternative set A = {A_1, A_2, ..., A_m} and an attribute set C = {C_1, C_2, ..., C_n} for each alternative; the attribute set has a counterpart weight vector w̃ = (w̃_1, w̃_2, ..., w̃_n) which expresses the comparative importance of each attribute. There are many forms for expressing the weight vector, but the most common are: (a) a utility preference function; (b) the analytical hierarchy process and a fuzzy version of the classical linear weighted average. Since subjectivity, vagueness, and imprecision enter into the assessments of decision makers, we assume that the attribute indices and the weight values are L-R fuzzy numbers. Each alternative A_k (k = 1, 2, ..., m) can be represented as follows:
A_k = (x̃_{k1}, x̃_{k2}, ..., x̃_{kn}).   (1)
Definition 3.1. Let v_1 = (v_11, v_12, ..., v_1n) be a fuzzy vector, where v_1j, j = 1, 2, ..., n, are L-R fuzzy numbers. The left and right expected vectors are defined as:
E_*(v_1) = (E_*(v_11), E_*(v_12), ..., E_*(v_1n));  E^*(v_1) = (E^*(v_11), E^*(v_12), ..., E^*(v_1n)).   (2)
Definition 3.2. Let v_1 = (v_11, v_12, ..., v_1n), v_2 = (v_21, v_22, ..., v_2n) be two fuzzy vectors. The cosines of the left and right expected angles of v_1 and v_2 are defined as:
cos(E_*(v_1), E_*(v_2)) = [Σ_{j=1}^{n} E_*(v_1j) E_*(v_2j)] / √(Σ_{j=1}^{n} E_*(v_1j)² · Σ_{j=1}^{n} E_*(v_2j)²);
cos(E^*(v_1), E^*(v_2)) = [Σ_{j=1}^{n} E^*(v_1j) E^*(v_2j)] / √(Σ_{j=1}^{n} E^*(v_1j)² · Σ_{j=1}^{n} E^*(v_2j)²).   (3)
Now let us consider the fuzzy multi-attribute decision making problem. Given the fuzzy weight vector w̃ = (w̃_1, w̃_2, ..., w̃_n), its left and right normalized expected vectors are E_*(w̃) = (E_*(w_1), E_*(w_2), ..., E_*(w_n)) and E^*(w̃) = (E^*(w_1), E^*(w_2), ..., E^*(w_n)). The positive ideal solution P⁺ is defined as follows:
P⁺ = (P_1⁺, P_2⁺, ..., P_n⁺)   (4)
where P_j⁺ equals the maximum set of x̃_{ij} (i = 1, 2, ..., m) if and only if attribute C_j is a proceeds (benefit) attribute index, or P_j⁺ equals the minimum set of x̃_{ij} (i = 1, 2, ..., m) if and only if attribute C_j is a cost attribute index. The negative ideal solution N⁻ is defined as follows:
N⁻ = (N_1⁻, N_2⁻, ..., N_n⁻)   (5)
where N_j⁻ equals the minimum set of x̃_{ij} (i = 1, 2, ..., m) if and only if attribute C_j is a proceeds attribute index, or N_j⁻ equals the maximum set of x̃_{ij} (i = 1, 2, ..., m) if and only if attribute C_j is a cost attribute index. We construct the difference between alternative A_i and the positive ideal solution P⁺ as follows:
D_i⁺ = (d_{i1}⁺, d_{i2}⁺, ..., d_{in}⁺)   (6)
where d_{ij}⁺ = P_j⁺ − x̃_{ij} if attribute C_j is a proceeds attribute index, or d_{ij}⁺ = x̃_{ij} − P_j⁺ if attribute C_j is a cost attribute index. The difference between alternative A_i and the negative ideal solution N⁻ is constructed as follows:
D_i⁻ = (d_{i1}⁻, d_{i2}⁻, ..., d_{in}⁻)   (7)
where d_{ij}⁻ = x̃_{ij} − N_j⁻ if attribute C_j is a proceeds attribute index, or d_{ij}⁻ = N_j⁻ − x̃_{ij} if attribute C_j is a cost attribute index. In order to avoid the difference of units among attribute indices, D_i⁺, D_i⁻ should be normalized. The normalized left and right expected values are denoted as follows:
E_*(d̂_{ij}⁺) = E_*(d_{ij}⁺) / max_{1≤k≤m} E_*(d_{kj}⁺);  E^*(d̂_{ij}⁺) = E^*(d_{ij}⁺) / max_{1≤k≤m} E^*(d_{kj}⁺);
E_*(d̂_{ij}⁻) = E_*(d_{ij}⁻) / max_{1≤k≤m} E_*(d_{kj}⁻);  E^*(d̂_{ij}⁻) = E^*(d_{ij}⁻) / max_{1≤k≤m} E^*(d_{kj}⁻).   (8)
The normalized left expected vector is denoted by E_*(D̂_i⁻) = (E_*(d̂_{i1}⁻), E_*(d̂_{i2}⁻), ..., E_*(d̂_{in}⁻)) and the normalized right expected vector by E^*(D̂_i⁻) = (E^*(d̂_{i1}⁻), E^*(d̂_{i2}⁻), ..., E^*(d̂_{in}⁻)); the same is done for D̂_i⁺. Let the normalized D̂_i⁻, D̂_i⁺ be projected onto the weight vector w̃. We obtain the normalized left projection coefficients
p̲_{w̃D̂_i⁻} = (Σ_{j=1}^{n} E_*(d̂_{ij}⁻)² / Σ_{j=1}^{n} E_*(w_j)²)^{1/2} cos(E_*(D̂_i⁻), E_*(w̃));
p̲_{w̃D̂_i⁺} = (Σ_{j=1}^{n} E_*(d̂_{ij}⁺)² / Σ_{j=1}^{n} E_*(w_j)²)^{1/2} cos(E_*(D̂_i⁺), E_*(w̃))   (9)
and the normalized right projection coefficients
p̄_{w̃D̂_i⁻} = (Σ_{j=1}^{n} E^*(d̂_{ij}⁻)² / Σ_{j=1}^{n} E^*(w_j)²)^{1/2} cos(E^*(D̂_i⁻), E^*(w̃));
p̄_{w̃D̂_i⁺} = (Σ_{j=1}^{n} E^*(d̂_{ij}⁺)² / Σ_{j=1}^{n} E^*(w_j)²)^{1/2} cos(E^*(D̂_i⁺), E^*(w̃))   (10)
where cos(E_*(D̂_i⁻), E_*(w̃)), cos(E^*(D̂_i⁻), E^*(w̃)), cos(E_*(D̂_i⁺), E_*(w̃)) and cos(E^*(D̂_i⁺), E^*(w̃)) are the cosines of the left and right angles of D̂_i⁻ and D̂_i⁺ with w̃. We construct the combination projection coefficients as follows:
p_i⁻(λ) = λ p̲_{w̃D̂_i⁻} + (1 − λ) p̄_{w̃D̂_i⁻},
p_i⁺(λ) = λ p̲_{w̃D̂_i⁺} + (1 − λ) p̄_{w̃D̂_i⁺}   (11)
where λ ∈ [0,1]. Now we can construct a measure function as follows:
f_i(λ) = p_i⁻(λ) / (p_i⁻(λ) + p_i⁺(λ))   (12)
When λ ∈ [0,1] is given, we get a new decision making method: the larger the f_i(λ), the more superior the alternative A_i. The combination projection coefficient p_i⁻(λ) is a function of λ and can be interpreted as the total difference of alternative A_i from the negative ideal N⁻. The combination projection coefficient p_i⁺(λ) is a function of λ and can be interpreted as the total difference of alternative A_i from the positive ideal P⁺. Obviously, the larger the p_i⁺(λ), the more inferior the alternative A_i; contrarily, the larger the p_i⁻(λ), the more superior the alternative A_i.
Algorithm (PISM)
Step 1: According to the decision maker's preference, determine the weight vector w̃, then calculate the left and right normalized expected vectors E_*(w̃), E^*(w̃).
Step 2: According to the fuzzy indices, calculate the fuzzy positive ideal solution P⁺ = (P_1⁺, P_2⁺, ..., P_n⁺) and the fuzzy negative ideal solution N⁻ = (N_1⁻, N_2⁻, ..., N_n⁻).
Step 3: Calculate the differences D_i⁺ between alternative A_i and the positive ideal solution P⁺ and the differences D_i⁻ between alternative A_i and the negative ideal solution N⁻.
Step 4: Normalize the left and right expectations of D_i⁺, D_i⁻.
Step 5: Calculate the left and right projection coefficients p̲_{w̃D̂_i⁻}, p̄_{w̃D̂_i⁻}, p̲_{w̃D̂_i⁺}, p̄_{w̃D̂_i⁺} of D̂_i⁻, D̂_i⁺.
Step 6: Given λ ∈ [0,1], calculate the combination projection coefficients p_i⁻(λ), p_i⁺(λ) and the measure function f_i(λ).
Step 7: Rank the alternatives according to the size of f_i(λ). The larger the f_i(λ), the more superior the alternative A_i.
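The projection machinery of steps 5 and 6 can be sketched in a few lines of Python (the naming is ours; the inputs are assumed to be the normalized left and right expected difference vectors and expected weight vectors):

```python
import numpy as np

def projection_coefficient(d, w):
    """(sum d_j^2 / sum w_j^2)^(1/2) * cos(angle between d and w)."""
    d, w = np.asarray(d, float), np.asarray(w, float)
    cos = (d @ w) / np.sqrt((d @ d) * (w @ w))
    return np.sqrt((d @ d) / (w @ w)) * cos

def measure(d_neg_L, d_neg_R, d_pos_L, d_pos_R, w_L, w_R, lam=0.5):
    """f_i(lambda): combine left/right projections of D^- and D^+ as in
    (11)-(12); a larger value means a more superior alternative."""
    p_neg = lam * projection_coefficient(d_neg_L, w_L) \
          + (1 - lam) * projection_coefficient(d_neg_R, w_R)
    p_pos = lam * projection_coefficient(d_pos_L, w_L) \
          + (1 - lam) * projection_coefficient(d_pos_R, w_R)
    return p_neg / (p_neg + p_pos)

w = [0.5, 0.3, 0.2]  # illustrative expected weight vector
print(measure([0.9, 0.7, 0.8], [0.9, 0.8, 0.7],
              [0.1, 0.3, 0.2], [0.2, 0.1, 0.3], w, w))
```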
4. Conclusion
In this paper, the method (PISM) with projection technique for fuzzy multi-attribute decision making is introduced. The method extends TOPSIS. It is certainly interesting and useful in multi-attribute decision making.
References
1. S.M. Baas and H. Kwakernaak, Rating and ranking of multiple aspect alternatives using fuzzy sets, Automatica 13 (1977), 47-58.
2. J.J. Buckley, The multiple judge, multiple criteria ranking problem: a fuzzy set approach, Fuzzy Sets and Systems 13 (1984), 25-38.
3. Ph. Diamond, Least squares methods in fuzzy data analysis, Proc. IFSA'91, Brussels, Management & System Science (1991), pp. 60-64.
4. D. Dubois and H. Prade, Fuzzy Sets and Systems: Theory and Applications (Academic Press, New York, 1980).
5. J. Efstathiou and V. Rajkovic, Multi-attribute decision making using a fuzzy heuristic approach, IEEE Trans. on Systems, Man, and Cybernetics 9 (1979), 326-333.
6. P. Gou, Self-organizing fuzzy aggregation model to rank the objects with multiple attributes, IEEE Transactions on SMC: Part A 30 (2000), 573-580.
7. S. Heilpern, Expected value of fuzzy number, Fuzzy Sets and Systems 47 (1992), 81-87.
8. H. Kwakernaak, An algorithm for rating multiple aspect alternatives using fuzzy sets, Automatica 15 (1979), 615-616.
9. K. Nakamura, Preference relation on a set of fuzzy utilities as a basis for decision making, Fuzzy Sets and Systems 20 (1986), 147-162.
10. E. Takeda and T. Nishida, Multiple criteria decision problems with domination structures, Fuzzy Sets and Systems 2 (1980), 123-136.
11. R.R. Yager, On ordered weighted averaging aggregation operators in multi-criteria decision-making, IEEE Transactions on SMC: Part A 18 (1988), 183-190.
A NEW CRITERION FOR FUZZY MULTI-ATTRIBUTE DECISION MAKING *
JIBIN LAN
School of Economics and Management, South West Jiaotong University, Chengdu, Sichuan, P.R. China, 610031
E-mail: [email protected]
YANG XU AND JIAZHONG LIU
Center of Intelligent Control and Development, South West Jiaotong University, Chengdu, Sichuan, P.R. China, 610031
E-mail: [email protected]
Abstract: A conception of the left and right projection, by which one fuzzy vector projects onto another, is introduced in this paper. Using the conception of the left and right projection, each alternative is projected onto the fuzzy weight vector. The size of the combination projection coefficient, which combines the left projection coefficient with the right one, is used as a judgement standard to measure each alternative.
Keywords: alternative; fuzzy weighted vector; fuzzy vector projection; combination projection coefficient; fuzzy multi-attribute decision making.
1. Introduction
Several methods have been proposed to deal with fuzzy multi-attribute decision making (MADM). Laarhoven and Pedrycz's method is based on hierarchical aggregation and is similar to Saaty's method⁸, with the difference that it fuzzifies the criteria and alternatives rating pairwise comparisons. Bellman and Zadeh² proposed the max-min principle to deal with multiple attribute problems; the approach states that one can look at a decision as a goal and attributes which, combined, form the decision space. The decision space is a fuzzy set whose membership function is the degree to which each alternative is a solution. The combination operation uses the intersection to express the 'and' connective between the goal and the attributes; the
*This work is supported by the Chinese NSF grant 60074014.
optimal alternative is the one whose membership achieves the maximum. Yager¹⁰,¹¹ employed the max-min principle of the Bellman and Zadeh approach to deal with MADM; the main difference is that the importances of attributes are represented as exponential scalars. The weighted average method is the most commonly used algorithm; the objective function for aggregating the preference and attribute values is:
f(A_i) = Σ_{j=1}^{n} W̃_j ⊗ x̃_{ij}   (1)
where A_i, W̃_j and x̃_{ij} represent alternative i, the weight of attribute j and the relative merit of attribute j for alternative i, respectively.
2. The new aggregation rule
A fuzzy number R̃ is called an L-R fuzzy number if its membership function can be represented by:
μ(x) = L((a_2 − x)/α_l) if x < a_2; 1 if a_2 ≤ x ≤ a_3; R((x − a_3)/α_r) if x > a_3   (2)
where R is a nonincreasing, left-continuous function on the real line satisfying R(0) = 1, L is a nondecreasing, right-continuous function on the real line satisfying L(0) = 1, and α_l, α_r are the spreads. The functions L and R are called the left- and right-hand sides. The left and right expected values of R̃ can be defined by⁶:
E_*(R̃) = a_2 − α_l ∫_0^∞ L(z) dz,  E^*(R̃) = a_3 + α_r ∫_0^∞ R(z) dz.   (3)
Definition 2.1. Let v_1 = (v_11, v_12, ..., v_1n), v_2 = (v_21, v_22, ..., v_2n) be two fuzzy vectors; the cosines of the left and right expected angles of v_1 and v_2 are defined as:
cos(v_1*, v_2*) = [Σ_{j=1}^{n} E_*(v_1j) E_*(v_2j)] / √(Σ_{j=1}^{n} E_*(v_1j)² · Σ_{j=1}^{n} E_*(v_2j)²);
cos(v_1^*, v_2^*) = [Σ_{j=1}^{n} E^*(v_1j) E^*(v_2j)] / √(Σ_{j=1}^{n} E^*(v_1j)² · Σ_{j=1}^{n} E^*(v_2j)²).   (4)
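For trapezoidal numbers with linear sides the integrals in (3) both equal 1/2, so E_* = (a_1 + a_2)/2 and E^* = (a_3 + a_4)/2; a minimal sketch of (3) and (4) under this assumption follows (helper names are ours):

```python
import numpy as np

def expected_lr(a):
    """Left and right expected values (3) for a trapezoid (a1, a2, a3, a4)
    with linear sides, where both side integrals equal 1/2."""
    a1, a2, a3, a4 = a
    return (a1 + a2) / 2.0, (a3 + a4) / 2.0   # (E_*, E^*)

def cos_expected(v1, v2):
    """Cosines (4) of the left and right expected angles of two fuzzy
    vectors given as lists of trapezoids."""
    E1 = np.array([expected_lr(a) for a in v1])   # columns: E_*, E^*
    E2 = np.array([expected_lr(a) for a in v2])
    def cos(x, y):
        return (x @ y) / np.sqrt((x @ x) * (y @ y))
    return cos(E1[:, 0], E2[:, 0]), cos(E1[:, 1], E2[:, 1])

v1 = [(1, 2, 3, 4), (2, 3, 3, 4)]
v2 = [(0, 1, 2, 3), (3, 4, 5, 6)]
print(cos_expected(v1, v2))
```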
Now let us consider the fuzzy multi-attribute decision making problem. Given the fuzzy weight vector W̃ = (W̃_1, W̃_2, ..., W̃_n), its left and right expected vectors are E_*(W̃) = (E_*(W_1), E_*(W_2), ..., E_*(W_n)) and E^*(W̃) = (E^*(W_1), E^*(W_2), ..., E^*(W_n)). In order to avoid the difference of units among attribute indices, the left and right expected values should be normalized. For proceeds (benefit) attribute indices, the normalized left and right expected values are denoted as follows:
E_*(x̂_{kj}) = [E_*(x̃_{kj}) − min_{1≤i≤m} E_*(x̃_{ij})] / [max_{1≤i≤m} E_*(x̃_{ij}) − min_{1≤i≤m} E_*(x̃_{ij})]
where C̃ is a k × n matrix of fuzzy numbers, A is an m × n matrix, b is an m-vector, and x is an n-vector of decision variables, x ∈ Rⁿ. Now, we have the following definitions for FMOLP problems.
Definition 2.1. x* is said to be a complete optimal solution if and only if there exists x* ∈ X such that f̃_i(x*) ≥ f̃_i(x), i = 1, ..., k, for all x ∈ X.
Definition 2.2. x* is said to be a Pareto optimal solution if and only if there does not exist another x ∈ X such that f̃_i(x) ≥ f̃_i(x*) for all i and f̃_j(x) ≠ f̃_j(x*) for at least one j.
Definition 2.3. x* is said to be a weak Pareto optimal solution if and only if there does not exist another x ∈ X such that f̃_i(x) > f̃_i(x*) for all i.
Associated with the FMOLP problems, let us consider the following multiple objective linear programming (MOLP_λ) problems:
(MOLP_λ)  Maximize ((c_λ^L, x), (c_λ^R, x)), ∀λ ∈ [0,1]
s.t.
x ∈ X = {x ∈ Rⁿ | A_λ^L x ≤ b_λ^L, A_λ^R x ≤ b_λ^R, x ≥ 0, ∀λ ∈ [0,1]}   (2)
where c_λ^L, c_λ^R denote the left and right endpoints of the λ-cuts of the rows of the fuzzy matrix C̃.
For the crisp MOLP_λ problems, we also have the following definitions.
Definition 2.4. x* is said to be a complete optimal solution if and only if there exists x* ∈ X such that f_i(x*) ≥ f_i(x), i = 1, ..., k, for all x ∈ X.
Definition 2.5. x* is said to be a Pareto optimal solution if and only if there does not exist another x ∈ X such that f_i(x) ≥ f_i(x*) for all i and f_j(x) ≠ f_j(x*) for at least one j.
Definition 2.6. x* is said to be a weak Pareto optimal solution if and only if there does not exist another x ∈ X such that f_i(x) > f_i(x*), i = 1, ..., k.
The following theorem shows the relationship between the FMOLP problem and the MOLP_λ problem.
Theorem 2.1. Let x* ∈ X be a solution to the MOLP_λ problem. Then x* is also a solution to the FMOLP problem.
3. A Fuzzy Goal Approximation Algorithm for Solving the FMOLP Problem
Considering the FMOLP problem, for each of the fuzzy multiple objective functions f̃(x) = (f̃_1(x), ..., f̃_k(x))ᵀ, assume that the DM can specify fuzzy goals g̃ = (g̃_1, g̃_2, ..., g̃_k)ᵀ which reflect the desired values of the objective functions of the DM. Based on the definitions of the FMOLP problem and the MOLP_λ problem and Theorem 2.1, we can conclude that the solution of the MOLP_λ problem is equally the solution of the FMOLP problem. From the definition of the MOLP_λ problem, when the DM sets up some fuzzy goals g̃ = (g̃_1, g̃_2, ..., g̃_k)ᵀ, the corresponding Pareto optimal solution, which is, in the minimax sense, the nearest to the fuzzy goals or better than that if the fuzzy goals are attainable, is obtained by solving the following minimax problem:
(MOLP_λ)  Min max { (c_λ^L, x) − g_λ^L, (c_λ^R, x) − g_λ^R }, ∀λ ∈ [0,1]   (3)
s.t.  x ∈ X = {x ∈ Rⁿ | A_λ^L x ≤ b_λ^L, A_λ^R x ≤ b_λ^R, x ≥ 0, ∀λ ∈ [0,1]}
where g_λ^L = [g_{1λ}^L, g_{2λ}^L, ..., g_{kλ}^L] and g_λ^R = [g_{1λ}^R, g_{2λ}^R, ..., g_{kλ}^R]. For simplicity of presentation, we define X = {x ∈ Rⁿ | A_λ^L x ≤ b_λ^L, A_λ^R x ≤ b_λ^R, x ≥ 0}, λ ∈ [0,1]. The main steps of the algorithm are described as follows. Let the interval [0,1] be decomposed into l equal sub-intervals with (l+1) nodes λ_i (i = 0, ..., l), which are arranged in the order 0 = λ_0 < λ_1 < ... < λ_l = 1.
Handle any R_i and any R_j in such a way, until any R_i and any R_j satisfy R_i ∩ R_j = ∅. In order that the disassembled copy is integrally and correctly reverted at the site t_i, the disassembled R_k should be labeled; define the integral set Z_i, Z_i = {1, 2, ..., n}, where Z_i shows that the disassembled R_k belongs to the copy which is not disassembled. The set V_i of the source copy spread to the site t_i can be expressed as: V_i = {(R_1, Z_1, W_1), (R_2, Z_2, W_2), ..., (R_i, Z_i, W_i), ..., (R_n, Z_n, W_n), ..., (R_y, Z_y, W_y)}, where any R_i and R_j satisfy R_i ∩ R_j = ∅, respectively. Then, let
E̲_α(m) and Ē_α(m) are the fuzzy expected risk values relating to the minimum probabilities and the maximum probabilities, respectively, in the α-cut of π̃(m) with respect to m. Finally, we use the fuzzy expected risk values E̲_α(m) and Ē_α(m) as parameters for transforming numeric fuzzy risks into cartographic form. Thus, by using the α-cut method⁷, we can provide a series of risk maps. They are produced by employing the information diffusion technique, which is regarded as one of the soft computing⁸ approaches; these maps are called soft risk maps. A fuzzy risk represents the imprecision. An α-cut represents some confidence in the estimate. In this paper, we provide the soft risk map with respect to the 0.25-cut of fuzzy risks of strong earthquakes in Yunnan province, China (Fig. 2). The soft risk map includes two specific maps: (1) the conservative risk map, which is produced by using the fuzzy expected values with respect to the minimum probabilities in the α-cut of fuzzy risks; (2) the risky risk map, which is produced by using the fuzzy expected values of the maximum probabilities in the α-cut of fuzzy risks. From Fig. 2, we know that a soft risk map is a kind of risk map in which each area is assigned two risk values. The double risk values represent the imprecision. The 0.25-cut represents some confidence in the estimate.
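As a rough illustration of how the conservative and risky values arise from an α-cut (our construction; the membership degrees below are stand-ins for the output of information diffusion):

```python
import numpy as np

mags = np.array([5.0, 6.0, 7.0])           # magnitudes m
probs = np.linspace(0.0, 0.2, 201)          # candidate exceedance probabilities
# mu[i, j]: membership degree of probability probs[j] for magnitude mags[i];
# a stand-in here for the fuzzy probabilities produced by information diffusion.
mu = np.random.default_rng(0).random((3, 201))

def expected_risks(mu, probs, alpha=0.25):
    """Minimum and maximum probabilities inside the alpha-cut, per magnitude."""
    cut = mu >= alpha
    p_min = np.where(cut, probs, np.inf).min(axis=1)   # conservative value
    p_max = np.where(cut, probs, -np.inf).max(axis=1)  # risky value
    return p_min, p_max

p_min, p_max = expected_risks(mu, probs)
for m, lo, hi in zip(mags, p_min, p_max):
    print(f"m = {m}: conservative p = {lo:.3f}, risky p = {hi:.3f}")
```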
4. Comparison Between Soft Risk Map and Traditional Risk Map
The form of traditional risk maps is given in Fig. 3. Traditional risk maps result from the probabilistic2 or deterministic1 approach. Traditional methods for representing risk information are inadequate because, with a single risk value, they do not tolerate imprecision. Fig. 3 indicates that, in a traditional risk map, each area is assigned a crisp risk value. This assignment is
Figure 2. Soft risk map of strong earthquakes in Yunnan province: a) conservative risk map; b) risky risk map.

Figure 3. Traditional risk map of strong earthquakes in Yunnan province.
unique for that area. However, in practical situations, risks with respect to natural disasters are impossible to estimate with precision. The expressive inadequacy is largely due to the traditional approaches used to estimate risks. Expressive inadequacy may lead to loss of valuable information and reduction of accuracy of analysis. Apparently, the scheme of representation relevant to soft risk maps is superior to that of traditional risk maps. A soft risk map results from fuzzy risks obtained by employing the technique of information diffusion. A fuzzy risk is a multi-valued risk which can represent the imprecision of risk estimations, and an $\alpha$-cut represents some confidence in the estimate. Thus, a soft risk map can provide a better representation of risk information
leading to more satisfying results. In this respect, soft risk maps do meet the demands of risk information users. Soft risk maps can provide more risk information for a variety of businessmen who invest in dangerous projects (e.g., nuclear power plants) or non-dangerous projects (e.g., flower shops). A soft risk map with $\alpha$ near 0 might be useful for the investors in a nuclear power plant, while the owner of a flower shop might be interested in the soft map with $\alpha = 1$ when he chooses an insurance company to buy a property insurance policy for his shop.

5. Conclusion
A fuzzy risk is a multi-valued risk. The benefit of this is that one can easily understand the imprecision of the risk assessment of natural disasters in case of a lack of data. A soft risk map resulting from the fuzzy risk method can tolerate imprecision and provide more information for map users. The soft risk map is the natural upgrade of the traditional risk map.
Acknowledgement
The work on this paper was done in the Key Laboratory of Environmental Change and Natural Disaster, The Ministry of Education of China.

References
1. P. Bazzurro and C.A. Cornell (1999), Disaggregation of seismic hazard, Bull. Seism. Soc. Am., 89(2), 501-520.
2. C.A. Cornell (1968), Engineering seismic risk analysis, Bull. Seism. Soc. Am., 58, 1583-1606.
3. C.F. Huang (1998), Concepts and methods of fuzzy risk analysis, Risk Research and Management in Asian Perspective (edited by Beijing Normal University et al., International Academic Publishers, Beijing), 12-23.
4. C.F. Huang (1997), Principle of information diffusion, Fuzzy Sets and Systems, 91(1), 69-90.
5. C.F. Huang and D. Ruan (1996), Information diffusion principle and application in fuzzy neuron, Fuzzy Logic Foundations and Industrial Applications (edited by Da Ruan, Kluwer Academic Publishers, Massachusetts), 165-189.
6. C.F. Huang and Peijun Shi (1999), Fuzzy risk and calculation, Proceedings of 18th International Conference of the North American Fuzzy Information Processing Society, New York, June, 90-94.
7. C.F. Huang and Y. Shi (2002), Towards Efficient Fuzzy Information Processing - Using the Principle of Information Diffusion, Physica-Verlag (Springer), Heidelberg, Germany.
8. L.A. Zadeh (1994), Soft computing and fuzzy logic, IEEE Software, 11(6), 48-56.
BENEFIT OF SOFT RISK MAP MADE BY USING INFORMATION DIFFUSION TECHNIQUE*

CHONGFU HUANG
Institute of Resources Technology and Engineering, College of Resources Science and Technology, Beijing Normal University, Beijing 100875, China. E-mail:
[email protected]. c n
HIROSHI INOUE
School of Management, Tokyo University of Science, Kuki, Saitama 346, Japan
In this paper, we discuss the benefit of the soft risk map calculated by the interior-outer-set model. We suppose that a company will invest in three projects in a district consisting of four zones with different flood risks. The company chooses zones based on the flood risks as well as on water resources and cost. The result shows that the soft risk map is better than the traditional risk map.
1. Introduction
Due to the complexity of natural disaster systems, it is impossible to accurately estimate the risks of natural disasters within a specified error range. As a result, there is no believable risk map. In this case, fuzzy probabilities can be used to represent fuzzy risks and give a new risk map, called a soft risk map. With given samples, we can employ the interior-outer-set model (IOSM)1 as an information diffusion technique2 to calculate fuzzy probability distributions serving for the map. In this paper, we discuss the benefit of the soft risk map from IOSM with respect to the optimization of resources allocation. In this research, we suppose that, in a map, the real risks, in terms of their underlying probability distributions, are known, and some samples from the populations are given. Using the histogram method to analyze the samples, we produce a traditional risk map, and using the interior-outer-set model, a soft risk map. Then,

*Project Supported by National Natural Science Foundation of China, No. 40371002
we suppose that a company will invest in three projects in the district shown in the map. To promote the optimization of resources allocation, we choose zones for the projects based on a risk map, besides other restrictions.
2. Flood risk map as a restriction
Let $D$ be a district consisting of four zones: $A_1, A_2, A_3, A_4$. A river flows through $A_1$, which often floods but has abundant water resources and the most convenient transportation. A lake lies in $A_2$, which floods now and then and has reasonable water resources and convenient transportation. $A_3$ is a mountainous area with rare floods, little water resource and inconvenient transportation. $A_4$ is a basin where a flood occurs when the rainfall is too much and a drought when the rainfall is too little, but which has the most convenient transportation. Obviously, the natural conditions of the zones are different. They can be distinguished by three attributes: flood risk, water resource, and transportation.
Theoretically speaking, the flood risk of a zone can be represented by using a probability distribution of a flood measured by submersed area. To compare different zones with respect to floods, we employ the proportion of submersed area to total area of a zone as the flood index and use the probability distribution of the index to represent the flood risk of the zone. The expected value of the probability distribution can be regarded as the flood risk degree of the zone.
This paper focuses on discussing the benefit of the soft risk map, not on calculating the flood risks of real zones. Therefore, we are allowed to assign a probability distribution of the flood index to a zone. In general, a frightful flood does not occur frequently in any zone where people are living, while the frequency of light floods may be very high. Therefore, it is logical to assume the probability distribution $p(x)$ of the flood index $x$ to be the exponential distribution in Eq. (1):

$$p(x) = \lambda e^{-\lambda x}, \quad x \geq 0. \tag{1}$$
There is no loss of generality when we suppose that zones $A_1, A_2, A_3, A_4$ have parameters $\lambda = 9, 15, 100, 25$, respectively. In this case, the real risks (the expected values $1/\lambda$) are 0.111, 0.0667, 0.01, 0.04, respectively, shown in Fig. 1. Using a generator of random numbers obeying an exponential distribution with these parameters, with seed number 574985 and size $n = 10$, we obtain the following four samples regarded as data of flood events that occurred
in the last 10 years in zones $A_1, A_2, A_3, A_4$, respectively.
$X_1$ = {0.093, 0.228, 0.115, 0.081, 0.079, 0.005, 0.347, 0.066, 0.037, 0.721},
$X_2$ = {0.056, 0.137, 0.069, 0.048, 0.047, 0.003, 0.208, 0.039, 0.022, 0.433},
$X_3$ = {0.008, 0.020, 0.010, 0.007, 0.007, 0.000, 0.031, 0.005, 0.003, 0.064},
$X_4$ = {0.033, 0.082, 0.041, 0.029, 0.028, 0.002, 0.125, 0.023, 0.013, 0.260}.
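The sampling step can be sketched as follows (the paper's particular generator is not specified beyond its seed, so a modern NumPy generator stands in and the draws will differ from the listed values):

    import numpy as np

    rng = np.random.default_rng(574985)          # seed kept for flavour only
    lambdas = {"A1": 9.0, "A2": 15.0, "A3": 100.0, "A4": 25.0}
    samples = {zone: rng.exponential(scale=1.0 / lam, size=10)
               for zone, lam in lambdas.items()}  # E[x] = 1/lambda = real risk
    for zone, x in samples.items():
        print(zone, np.round(x, 3), "mean:", round(float(x.mean()), 3))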
It is practically impossible to verify whether the probability distribution of the flood index is an exponential distribution or another theoretical distribution, such as the normal distribution. Hence, when only the given samples are available, we use the histogram model (HM):

$$\hat{p}(x \in I_j) = \frac{1}{n}\,(\text{number of } x_i \text{ in the same bin as } x) \tag{2}$$
to estimate the flood risk for making a traditional risk map. With respect to HM, Otness and Encysin3 proved that the asymptotically optimal estimate of the bin number $m$ is

$$m = 1.87(n-1)^{2/5}. \tag{3}$$
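A sketch of the HM computation with this bin rule, applied to the sample $X_1$ (taking the histogram's expected value as the HM risk is our reading; the paper does not spell this step out):

    import numpy as np

    x = np.array([0.093, 0.228, 0.115, 0.081, 0.079,
                  0.005, 0.347, 0.066, 0.037, 0.721])
    n = len(x)
    m = int(1.87 * (n - 1) ** 0.4)               # Eq. (3), truncated to m = 4
    counts, edges = np.histogram(x, bins=m)      # equal-width bins, Eq. (2)
    probs = counts / n
    mids = 0.5 * (edges[:-1] + edges[1:])        # interval midpoints u_j
    print(m, probs, "HM risk:", round(float(probs @ mids), 3))   # about 0.184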
Hence, for the given samples, we obtain $m = 1.87(10-1)^{2/5} \approx 4$ as the number of the intervals $I_j$, $j = 1, 2, \ldots, m$, for HM. From $X_1, X_2, X_3, X_4$, we obtain the HM risks shown in Fig. 1.
Using IOSM, with the same intervals as in HM, from $X_1, X_2, X_3, X_4$ we obtain four possibility-probability distributions. For example, from $X_1$ we obtain

        p0    p1    p2    p3    p4    p5    p6    p7    p8    p9    p10
  I1   0.00  0.07  0.08  0.11  0.16  0.32  0.50  1.00  0.26  0.00  0.00
  I2   0.26  0.41  1.00  0.11  0.00  0.00  0.00  0.00  0.00  0.00  0.00     (4)
  I3   1.00  0.41  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  I4   0.00  0.50  1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00

where $I_j = [0.005 + (j-1)h,\ 0.005 + jh)$, $h = 0.179$, $j = 1, 2, 3, 4$, and $p_k = k/10$, $k = 0, 1, 2, \ldots, 10$. From Eq. (4) we know that the probability of a flood event occurring in interval $I_1$ would be in $[p_6, p_8]$ with possibility more than 0.26. The probability interval with possibilities can be represented as in Eq. (5):
$$PI_j = [\pi_j / a_j,\ \pi'_j / b_j], \tag{5}$$

where $\pi_j, \pi'_j$ are possibilities and $a_j, b_j$ are probabilities. For example, from Eq. (4) we obtain

$$PI_1 = [0.50/p_6,\ 0.26/p_8], \qquad PI_2 = [0.41/p_1,\ 0.11/p_3],$$
Let $\pi$ and $\pi'$ be the averages over the $m$ intervals of the possibilities $\pi_j$ and $\pi'_j$, and let $u_a$ and $u_b$ be the risk values obtained by weighting the midpoints $u_j$ of the intervals $I_j$ with the corresponding probabilities $a_j$ and $b_j$, respectively. Then we obtain a risk interval $[u_a, u_b]$ with possibilities $\pi$ and $\pi'$, respectively, represented by Eq. (6). Defuzzifying the interval by Eq. (7), we obtain $u$ as the IOSM risk:

$$R_A = [\pi / u_a,\ \pi' / u_b], \tag{6}$$

$$u = \frac{\pi u_a + \pi' u_b}{\pi + \pi'}. \tag{7}$$

For example, for zone $A_1$ we have

$$\pi = \frac{0.5 + 0.41 + 1 + 0.5}{4} = 0.602, \qquad \pi' = \frac{0.26 + 0.11 + 0.41 + 1}{4} = 0.445,$$

$$R_{A_1} = [0.602/0.121,\ 0.445/0.205], \qquad u_{A_1} = \frac{0.602 \times 0.121 + 0.445 \times 0.205}{0.602 + 0.445} = 0.157.$$
Similarly, $u_{A_2} = 0.094$, $u_{A_3} = 0.0144$, $u_{A_4} = 0.0571$, shown in Fig. 1 as the IOSM risks.
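A minimal sketch of this defuzzification step, using the possibilities read off Eq. (4) and the interval endpoints quoted above:

    pi_a = [0.50, 0.41, 1.00, 0.50]   # possibilities of the lower probabilities a_j
    pi_b = [0.26, 0.11, 0.41, 1.00]   # possibilities of the upper probabilities b_j
    pi  = sum(pi_a) / 4               # about 0.602, the paper's value
    pi2 = sum(pi_b) / 4               # 0.445
    u_a, u_b = 0.121, 0.205           # endpoints of the risk interval for A1
    u = (pi * u_a + pi2 * u_b) / (pi + pi2)    # Eq. (7)
    print(round(u, 3))                # 0.157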
Figure 1. The real flood risks and the risks estimated by the histogram model (HM) and the interior-outer-set model (IOSM): zone $A_1$: real risk 0.111, HM risk 0.185, IOSM risk 0.157; zone $A_2$: IOSM risk 0.094; zone $A_3$: real risk 0.01, HM risk 0.0167, IOSM risk 0.0144; zone $A_4$ (basin zone): real risk 0.04, HM risk 0.0665, IOSM risk 0.0571.

3. Projects, water resource and cost
There is no loss of generality when we suppose that the natural environment and transportation conditions of the district are relatively stable. Furthermore, we suppose that, in the district, a company will invest in three projects $O_1, O_2, O_3$ whose profit functions over the next 10 years are
$$g_{O_1}(w, s, x) = 10 + 0.6w - 0.1s - 3.1x,$$
$$g_{O_2}(w, s, x) = 9 + 1.1w - 0.1s - 3.1x,$$
$$g_{O_3}(w, s, x) = 5 + 1.1w - 0.1s - 5.6x,$$
where $w$ is an index of water resource, $s$ is the cost of raw materials, labour force and transportation, and $x$ is the flood risk. The profit is measured in billion RMB Yuan. According to the natural environment and transportation conditions of the district $D$, we suppose that

$$w_{A_1} = 0.8,\ w_{A_2} = 0.6,\ w_{A_3} = 0.2,\ w_{A_4} = 0.4,$$

and $s_{A_1} = 0.3$, $s_{A_2} = 0.1$, $s_{A_3} = 0.6$, $s_{A_4} = 0.2$ (these cost values are consistent with the profits in Table 1).
4. Zones for the projects
If project $O_i$ is located at zone $A_j$, then the profit over the next 10 years will be $g_{O_i}(w_{A_j}, s_{A_j}, x_{A_j})$, where $x_{A_j}$ is the flood risk in zone $A_j$. Table 1 shows the profit of the projects in the district $D$ and the zones chosen to produce the maximum profit, from which we know that the IOSM risks lead to the same chosen zones as the real risks.

Table 1. Profit (billion RMB Yuan) of projects $O_1, O_2, O_3$ located at zones $A_1$-$A_4$.

  Project  Risk   Zone A1   Zone A2   Zone A3   Zone A4   Chosen zone
  O1       Real   10.106    10.143    10.029    10.096    A2
           HM      9.877    10.007    10.010    10.014    A4
           IOSM    9.963    10.059    10.015    10.043    A2
  O2       Real    9.506     9.443     9.129     9.296    A1
           HM      9.276     9.307     9.110     9.214    A2
           IOSM    9.363     9.359     9.115     9.243    A1
  O3       Real    5.228     5.276     5.104     5.196    A2
           HM      4.814     5.030     5.070     5.048    A3
           IOSM    4.971     5.124     5.079     5.100    A2
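The selection step can be sketched directly from the profit functions (the $s$ values are the ones inferred above, so this is a reconstruction rather than the authors' code); it reproduces the "Real" rows of Table 1:

    zones = {"A1": (0.8, 0.3, 0.111), "A2": (0.6, 0.1, 0.0667),
             "A3": (0.2, 0.6, 0.01),  "A4": (0.4, 0.2, 0.04)}   # (w, s, real risk x)
    profits = {
        "O1": lambda w, s, x: 10 + 0.6 * w - 0.1 * s - 3.1 * x,
        "O2": lambda w, s, x: 9 + 1.1 * w - 0.1 * s - 3.1 * x,
        "O3": lambda w, s, x: 5 + 1.1 * w - 0.1 * s - 5.6 * x,
    }
    for name, g in profits.items():
        best = max(zones, key=lambda z: g(*zones[z]))
        print(name, "->", best)       # O1 -> A2, O2 -> A1, O3 -> A2, as in Table 1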
5. Conclusion and discussion
The soft risk map is better than the traditional risk map for choosing zones for projects. In this paper, we merely use the weighted midpoint of the fuzzy risk interval $R_A$ in Eq. (6). There might be some model that uses all the information in the interval to reach a better result.

Acknowledgement
The work on this paper was done in the Key Laboratory of Environmental Change and Natural Disaster, The Ministry of Education of China.

References
1. C.F. Huang, An application of calculated fuzzy risk, Information Sciences, 142(1), (2002), 37-56.
2. C.F. Huang and Y. Shi, Towards Efficient Fuzzy Information Processing - Using the Principle of Information Diffusion, Heidelberg, Germany: Physica-Verlag (Springer), 2002.
3. R.K. Otness and L. Encysin, Digital Time Series Analysis, New York: John Wiley, 1972.
A RISK ASSESSMENT MODEL OF WATER SHORTAGE AND ITS APPLICATION FOR ANALYZING BEARING CAPACITY OF WATER RESOURCES*

LIHUA FENG†
Department of Geography, Zhejiang Normal University, Jinhua 321004, China

CHONGFU HUANG
Institute of Resources Technology and Engineering, College of Resources Science and Technology, Beijing Normal University, Beijing 100875, China

In this paper, we suggest a risk assessment model of water shortage based on information diffusion theory and use it to analyze the bearing capacity of water resources in Jinhua City. From the application we know that the carrying capacity of water resources is the maximum support capability for human activities under a certain stage of social development and sound circulation in the ecosystem. For Jinhua City, we discover that the water shortage is not of the type caused by lacking water sources, but by water quality and the lack of water conservancy projects. Therefore, only the middle scheme, giving attention to both economic development and environmental protection, is the first choice.
1. Introduction
Due to the complexity of a water system with natural and societal attributes, it is impossible to accurately estimate the risk of water shortage for a city. In order to simplify the analysis, we assume that the uncertainty in water shortage is only related to the uncertainty in natural rainfall. The core of the water shortage risk is to estimate the possibility distribution of rainfall in the study region. We also assume that the natural rainfall in the study period is a stable Markov process, so that the possibility distribution does not change within the period of study. Even then, the esti-

*The Project Supported by Zhejiang Provincial Natural Science Foundation of China (402034)
†Corresponding author: Tel.: +86-579-2306806, E-mail:
[email protected]. net.cn
mation of the risk of water shortage must be very rough when the size $n$ of the sample, including observations of natural rainfall for $n$ years, is small, such as $n < 30$. A small sample provides incomplete information for estimating water shortage. In this case, the information diffusion techniques4 can help us to improve the estimation. It has been proven3 that, when the given sample is small, the estimation from these techniques is at least 28% more work-efficient than one from the classical histogram. Hence, with the result of risk assessment from the techniques, it is possible to reasonably calculate the carrying capacity of water resources of a city for human activities under a certain stage of social development and sound circulation in the ecosystem, and to discover the types of water shortage. In this paper, we employ the normal diffusion model5 to construct a risk model for estimating the risk of water shortage with a small sample of observations of natural rainfall.

2. Risk assessment model for water shortage
Information diffusion is a kind of set-valued method that processes samples by fuzzy mathematics2; it can transform a single-valued sample into a set-valued sample. The simplest model is the normal diffusion model. Given a sample $X = \{x_1, x_2, \ldots, x_n\}$ in a universe of water shortage, $U = \{u_1, u_2, \ldots, u_m\}$, we can diffuse the information of a single-valued observation $x_i$ to all points of $U$ by using Eq. (1):

$$f_i(u_j) = \frac{1}{h\sqrt{2\pi}} \exp\left( -\frac{(x_i - u_j)^2}{2h^2} \right), \tag{1}$$

where $h$ is the diffusion coefficient, calculated from the maximum and minimum values as well as the number of the given observations5. Let

$$C_i = \sum_{j=1}^{m} f_i(u_j). \tag{2}$$
Then, observation $x_i$ is changed into a normalized fuzzy set with the following membership function:

$$\mu_{x_i}(u_j) = f_i(u_j) / C_i. \tag{3}$$
With $\mu_{x_i}(u_j)$, $i = 1, 2, \ldots, n$, as fuzzy observations, we assign the information gain at a monitoring point $u_j$ as

$$q(u_j) = \sum_{i=1}^{n} \mu_{x_i}(u_j). \tag{4}$$
Its physical meaning is to extrapolate from $X$, through information diffusion, that if the observed value of water shortage could only be chosen from $u_1, u_2, \ldots, u_m$, then, deeming $x_i$ as a representative of the samples, the number of samples with observed value $u_j$ would be $q(u_j)$, which is often not a positive integer, but surely not a number below zero. Let

$$Q = \sum_{j=1}^{m} q(u_j) \tag{5}$$

and

$$p(u_j) = q(u_j) / Q; \tag{6}$$

we then obtain the probability of exceeding $u_j$ as

$$P(u_j) = \sum_{k=j}^{m} p(u_k), \tag{7}$$

which is just the required estimation for risk assessment.
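A compact sketch of the whole chain (1)-(7) is given below; the rainfall sample and the bandwidth rule for $h$ are placeholders, since the text only points to the literature for the exact coefficient:

    import numpy as np

    x = np.array([1380.0, 1210.5, 1640.2, 990.3, 1505.8, 1702.1])  # hypothetical rainfall (mm)
    u = np.arange(0.0, 2001.0, 20.0)            # universe U = {0, 20, ..., 2000}
    h = (x.max() - x.min()) / (len(x) - 1)      # stand-in diffusion coefficient
    f = np.exp(-(x[:, None] - u[None, :]) ** 2 / (2 * h * h)) / (h * np.sqrt(2 * np.pi))  # Eq. (1)
    mu = f / f.sum(axis=1, keepdims=True)       # Eqs. (2)-(3)
    q = mu.sum(axis=0)                          # Eq. (4)
    p = q / q.sum()                             # Eqs. (5)-(6)
    P = p[::-1].cumsum()[::-1]                  # Eq. (7): exceedance probabilities
    print(np.round(P[::20], 4))                 # every 400 mm of the universe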
3. Risk assessment of water shortage in Jinhua City
Jinhua City is located in the middle of Zhejiang Province, China, with a total area of 10918 km2 and a total population of 4.492 million. The city has a sub-tropical monsoon climate with balanced rainfall and heat, in which light and temperature supplement each other; it is a comprehensive agricultural development zone with farming, forestry and fishery, and is called the "second barn" of Zhejiang Province. In recent years, the problem of water shortage has emerged quietly. The drought in 1996 caused drinking-water problems for 160 thousand people and 80 thousand livestock in Jinhua, with a direct economic loss of RMB 170 million; high temperatures not experienced in 50 years occurred in the summer of 2003, an autumn drought followed the summer drought, and the precipitation from July to October was only 124 mm, just 30% of a normal year. Due to the uneven distribution of water in time and space, plus river pollution and the lack of water storage projects, Jinhua City was forced in 1998 to give up taking the river water which it had taken for generations as drinking water; instead it spent huge sums to get water from the Shafan reservoir. On November 24, 2000, Yiwu invested RMB 200 million to purchase 49.99 million m3 of fresh water from the Hengjing reservoir of Dongyang City, which was the first transaction on water rights after the theory of "water right and water market" was put forward by the Ministry
of Water Resources, China. Therefore, the lack of water has become an important restraint for Jinhua City in building a well-off society in an all-round way by 2020.
For the 23 years of measured rainfall data at Jinhua Station in 1980-2002, we take the interval [0, 2000] as the universe and transform the continuous universe into the discrete universe $U = \{u_1, u_2, \ldots, u_m\} = \{0, 20, 40, \ldots, 2000\}$. By using the risk model consisting of Eqs. (1)-(7), we obtain the risk assessment of water shortage in Jinhua City shown in Table 1.

Table 1. Risk assessment for water shortage in Jinhua City

  Annual rainfall (mm)   Exceedance probability   |   Annual rainfall (mm)   Exceedance probability
   900                   1                        |   1500                   0.4934
  1000                   0.9965                   |   1600                   0.3221
  1100                   0.9469                   |   1700                   0.1919
  1200                   0.8597                   |   1800                   0.0906
  1300                   0.7782                   |   1900                   0.0308
  1400                   0.6581                   |   2000                   0.0014
As the model is calculated by year, the line of 1400 in the table means that the probability of annual rainfall > 1400 mm in Jinhua in the future is $p = 0.6581$. In other words, a low-water year of this kind occurs in Jinhua about once every three years. The impact of annual rainfall on the utilization of water resources in Jinhua is quite high. Based on the 23 years of measured rainfall data at Jinhua Station in 1980-2002 and the above risk assessment, the rainfall of a one-in-five-years low-water year in Jinhua is only 1276 mm by interpolation, and that of a one-in-ten-years low-water year is only 1151 mm.

4. Analysis on the bearing capacity of water resources in Jinhua City
The annual average rainfall in Jinhua City is 1503 mm, so the city belongs to the humid area. Compared with the water shortage in dry areas7, the current water shortage in Jinhua is not of the type caused by lacking water sources, but by water quality and the lack of water conservancy projects, namely water shortage due to water pollution and the lack of storage projects.
According to the index system for the study of the bearing capacity of water resources in Jinhua, it can be divided into 5 subsystems, namely the population, agriculture, industry, environment protection and water resources subsystems. With the guidance of ecological concepts, based on the actual situation
and the demands of the SD model6 in Jinhua, field investigations were carried out along the main streams such as the Jinhua River, Dongyang River and Wuyi River, and water resources and related data on the social economic system since 1980 were fully collected. Then system dynamics equations for the 5 subsystems (population, agriculture, industry, environment protection and water resources) were established based on the features of water resources in Jinhua. In total more than 100 variables and parameters were selected for the model, with 9 level equations, 9 rate equations and a large number of auxiliary equations; 3 step functions, 3 table functions, 4 ramp functions and 8 clip functions were used. Analysis of parameter error and sensitivity shows that the model is reasonable in structure and reflects the actual features of the bearing capacity of water resources in Jinhua; therefore, it can be used for predicting the dynamic development process after future policy parameters are implemented.
In order to discuss the way of harmonious development between the bearing capacity of water resources and the social economy in the next 20 years of the region, based on historical data and the standard for a well-off-in-all-rounds society, the growth rates of agriculture, industry, GDP and investment, the irrigation quota, the water consumption per ten thousand yuan of output value, and the sewage disposal volume were selected as policy parameters; by using the SD model, 3 kinds of development schemes were simulated (shown in Table 2), where the high scheme is suggested to realize an average GDP per capita > USD 3000 in 2010, the middle scheme fully considers the relationship between population, resources, environment and economic development in Jinhua, and the low scheme takes environment protection as the main objective. From Table 2 we know that the bearing capacity of water resources would be unable to satisfy the total water demanded in the high scheme, while the objective of quadrupling GDP will not be realized by 2020 if we choose the low scheme. Only the middle scheme, giving attention to both economic development and environmental protection, is the first choice.
5. Conclusion
The information diffusion techniques can be used to effectively estimate the risk of water shortage, which helps us to analyze the bearing capacity of water resources and to determine which scheme is suitable for a city to develop.
Table 2. Comparison of major indices in three schemes on the bearing capacity of water resources in Jinhua City.

  Scheme   Year    APV     IPV      GDP      IEP    SDV     PPRL   TWD    BCWR
  High     2000    8.16   126.00    54.44    2.1   130.30   17.3   2.47   2.49
           2005   12.55   193.87    83.76    2.8   154.40   20.4   2.62   2.70
           2010   19.31   298.29   128.88    3.8   175.80   22.9   2.91   2.77
           2015   29.71   458.95   198.29    5.1   229.80   29.9   3.24   2.94
           2020   45.71   706.15   305.10    6.8   334.00   43.7   3.72   3.01
  Middle   2000    8.16   126.00    54.44    2.1   130.30   17.3   2.47   2.49
           2005   11.71   180.89    78.16    3.0   130.90   16.9   2.57   2.70
           2010   16.81   259.69   112.20    4.4    97.41   11.5   2.79   2.92
           2015   24.13   372.82   161.08    6.3    58.84    5.2   3.01   3.24
           2020   34.65   535.23   231.25    9.0    48.87    2.4   3.29   3.48
  Low      2000    8.16   126.00    54.44    2.1   130.30   17.3   2.47   2.49
           2005   10.91   168.62    72.85    3.3   123.80   15.8   2.53   2.70
           2010   14.61   225.65    97.49    5.0    77.66    8.4   2.69   3.07
           2015   19.55   301.96   130.47    7.7    17.75    0     2.81   3.55
           2020   26.16   404.10   174.59   11.9     0       0     2.96   3.94

APV: Agriculture Production Value; IPV: Industry Production Value; GDP: Gross Domestic Product; IEP: Investment on Environment Protection; SDV: Sewage Disposal Volume; PPRL: Polluted Percentage of River Length; TWD: Total Water Demanded; BCWR: Bearing Capacity of Water Resources.
References
1. E.A. Chatman, Diffusion theory: a review and test of a conceptual model in information diffusion, Journal of the American Society for Information Science, 37(6), (1986), 377-386.
2. C.F. Huang, Principle of information diffusion, Fuzzy Sets and Systems, 91(1), (1997), 69-90.
3. C.F. Huang, Demonstration of benefit of information distribution for probability estimation, Signal Processing, 80(6), (2000), 1037-1048.
4. C.F. Huang, Information diffusion techniques and small sample problem, International Journal of Information Technology and Decision Making, 1(2), (2002), 229-249.
5. C.F. Huang and Y. Shi, Towards Efficient Fuzzy Information Processing - Using the Principle of Information Diffusion, Heidelberg, Germany: Physica-Verlag (Springer), 2002.
6. Y. Motohashi and S. Nishi, Prediction of end-stage renal disease patient population in Japan by system dynamics model, International Journal of Epidemiology, 20(4), (1991), 1032-1036.
7. D. Verschuren, K.R. Laird and B.F. Cumming, Rainfall and drought in equatorial east Africa during the past 1100 years, Nature, 403, (2000), 410-414.
AN EARTHQUAKE RISK ASSESSMENT METHOD BASED ON FUZZY PROBABILITY

IMAN KARIMI
Department of Structural Statics and Dynamics, RWTH Aachen University, Aachen, 52056, Germany

EYKE HULLERMEIER
Department of Mathematics and Computer Science, University of Marburg, Marburg, 35032, Germany

KONSTANTIN MESKOURIS
Department of Structural Statics and Dynamics, RWTH Aachen University, Aachen, 52056, Germany
This paper presents an outline of a proposed risk assessment system for evaluating the expected damage of structures and the consequent financial losses and casualties due to a likely earthquake. Uncertainties caused by insufficient knowledge about the correlation of parameters have been considered by fuzzy relations, while uncertainties in eliciting the likelihood of earthquake magnitude from scarce past events have been expressed by fuzzy probability instead of conventional probability. The approach represents fuzzy probability by constructing a possibility distribution over probability out of a small data set, and a sample case calculation is presented.
1. Introduction: Seismic hazard and earthquake risk assessment
An earthquake occurs when two sides of a fault suddenly slide along or over each other. The released energy propagates from this ruptured zone, which is called the focus, in every direction and causes a strong ground motion in the region around the focus. The intensity of this motion depends on the released energy (earthquake magnitude), the distance from the focus, the soil properties and the topography of the site. This strong ground motion causes damage to structures and consequently leads to financial losses and casualties. Because of various uncertainties, assessing the seismic risk of a structure and the consequent losses is highly complicated. These uncertainties can be divided into two categories:
1. Uncertainties about the correlation among the parameters of the hazard, damage and loss. These uncertainties are due to the lack of subjective knowledge or the lack of abundant data sets for an objective determination of the correlations.
2. Uncertainties concerning the likelihood of occurrence and magnitude of the seismic hazard.
In this study we have tried to consider the uncertainties of the first category by constructing fuzzy relations, employing different data mining and machine learning methods as well as utilizing all possible data sources such as observed data, expert opinion and numerical models. On the other hand, the uncertainties of the second category have been modeled with a two-dimensional uncertainty pattern, instead of the conventional probabilistic representation. The latter issue is the main focus of this paper and will be dealt with in the following sections. The main features of the proposed risk assessment system can therefore be summarized as follows:
- constructing fuzzy relations among the parameters of magnitude, distance, soil properties, topography, intensity, damage to each type of structure and subsequent loss (casualties);
- constructing a hazard pattern of the magnitude, which will be applied to the above-mentioned fuzzy relations in order to obtain the risk pattern of the loss.
2. Assessing the probability of natural hazards
There are at least two ways to define or interpret probabilities: deduction from knowledge of the particular constitution of the events under which they occur; or derivation from long-continued observation of a past series of events. In the case of natural hazards, particularly earthquakes, the available physical knowledge of their genesis and pattern is not sufficient for deducing their probability. Therefore, one has to resort to objective assessment; and since in most regions earthquakes are scarce events, objective assessments based on such small data sets will not yield reliable values of probabilities either. Hence, a framework capable of expressing imprecise probability is required. Various approaches have been proposed for elaborating this concept, which from the viewpoint of their output format might be divided into two categories:
- interval-valued probabilities;
- fuzzy probabilities.
The concept of introducing a probability interval instead of a single value for representing the likelihood of an event has been implemented based on different theories by defining upper and lower probabilities, between which the unknown precise probability will fall. For instance, Dempster-Shafer theory represents this interval by defining belief and plausibility functions [1,2], while in possibility theory it is expressed by the necessity and possibility of an event [3,4].
It can also be argued that if a complete set of observations leads us to a clear, objectively assessed probability, then an incomplete data set, which might contain imprecise observations, delivers a fuzzy image, and thus the probability of an event would be a fuzzy number instead of a crisp one. We can consider this concept as an extension of the concept of the first category as well, for instance by generalization of the Dempster-Shafer theory, i.e. defining fuzzy-valued belief and plausibility functions [5]. This fuzzy probability could also be achieved by using the theory of fuzzy random variables [6]. Another concept in this category is defining a possibility distribution for the probability of an event [7]. In the next sections we will present our approach of eliciting fuzzy probability from a scarce data set by constructing a possibility distribution of the probability.

3. Possibility-probability distribution
Consider a real-valued random variable $X$ (e.g. the magnitude of an earthquake) whose range is partitioned into several intervals (bins) $A_1, A_2, \ldots, A_m$. We are interested in characterizing the probabilities $\Pr(X \in A_i)$ that $X$ takes a value in the different intervals. Let $\{x_1, x_2, \ldots, x_n\}$ be a given sample, $p^{(i)} = \Pr(X \in A_i)$, and $n_i$ the number of observations $x_j$ such that $x_j \in A_i$. The standard (point) estimation of the probability is then given by the relative frequency:

$$\hat{p}^{(i)} = \frac{n_i}{n}. \tag{1}$$
This estimation converges stochastically toward the true probability $p$, i.e. $\lim_{n \to \infty} \hat{p} = p$.
However, for small values of $n$, the estimated probability of Eq. (1) will obviously be afflicted with a high degree of uncertainty. Therefore, the point estimation is usually endowed with (or even replaced by) a confidence interval in classical statistics. A confidence interval $C_\alpha$, also referred to as a credible set in Bayesian statistics, is constructed so that $\Pr(p \in C_\alpha) = \alpha$, where $\alpha$ is the so-called confidence level. Commonly used confidence levels are values such as 0.90, 0.95 and 0.99, although the final choice remains arbitrary to some extent. An alternative approach to characterizing the uncertainty associated with the estimation is to consider the complete family of confidence intervals $C_\alpha$, $0 < \alpha \leq 1$. This approach avoids a particular choice of $\alpha$ and obviously includes more information about the estimated quantity. In the following, we will present an approach in which the
information provided by a family of confidence intervals is represented in terms of a possibility distribution.
We employ a Bayesian approach and assume prior knowledge about the (unknown) probability degree $p$ to be represented in terms of a prior probability distribution (density function) $\mu_i$ over $[0,1]$. That is, $\mu_i(p)$ is the probability (density) of the probability degree $p$. This distribution allows one to incorporate expert knowledge into the inference procedure. Even though this point will not be explored further in this paper, let us note that the incorporation of background knowledge is crucial in the context of our application, where data is usually sparse but expert knowledge more or less available. In the case where no prior knowledge is available, $\mu_i$ is simply specified by the uniform distribution. Now, recall that we have a sample in which $n_i$ among $n$ data points fall into the interval $A_i$. The posterior distribution $\mu'_i$ is then given by the Bayesian update

$$\mu'_i(p) \propto p^{\,n_i}\,(1-p)^{\,n-n_i}\,\mu_i(p).$$
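Anticipating the discretization described next, a minimal numerical sketch of this update on a grid of probability degrees (uniform prior; the counts are invented for illustration):

    import numpy as np

    n, n_i = 12, 3                    # 3 of 12 observations fall into bin A_i
    p = np.linspace(0.0, 1.0, 101)    # discretized probability degrees
    prior = np.ones_like(p)           # uniform prior: no expert knowledge
    post = prior * p ** n_i * (1 - p) ** (n - n_i)
    post /= post.sum()                # discrete posterior over p
    print(p[post.argmax()], round(float((p * post).sum()), 3))  # mode 0.25, mean ~0.286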
For computational reasons, $\mu_i$ and $\mu'_i$ will not be defined as density functions over the complete unit interval $[0,1]$, but rather as discrete probability distributions on a finite subset $\{p_0, p_1, \ldots, p_m\}$ such that $0 = p_0 < p_1 < \cdots < p_m = 1$.

The vector space of intensity values $(V, \langle + \rangle, \langle \times \rangle)$ is isomorphic to the space of real numbers $(R, +, \cdot)$ by the function $\varphi : V \to R$.
The isomorphism $\varphi$ verifies, for all $v_1, v_2 \in V$, all $\lambda \in R$ and all $v \in V$:

$$\varphi(v_1 \langle + \rangle v_2) = \varphi(v_1) + \varphi(v_2), \qquad \varphi(\lambda \langle \times \rangle v) = \lambda\, \varphi(v). \tag{2.4}$$

The scalar product $(\cdot \mid \cdot)_V : V \times V \to R$ is defined using the isomorphism $\varphi$ as:

$$(v_1 \mid v_2)_V = \varphi(v_1)\, \varphi(v_2), \quad \forall v_1, v_2 \in V. \tag{2.5}$$

Based on the scalar product $(\cdot \mid \cdot)_V$, the vector space $V$ becomes a Euclidean space. The norm $\| \cdot \|_V : V \to R_+$ is defined via the scalar product:

$$\| v \|_V = \sqrt{(v \mid v)_V} = |\varphi(v)|.$$

Practically, the closed interval $[\varepsilon,\ 1 - \varepsilon]$, where $0 < \varepsilon \ll 1$, is used instead of the open interval $(0,1)$.
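The practical content of this construction is transport of structure: any isomorphism $\varphi$ induces the bounded operations. A sketch (assuming, purely for illustration, the logistic isomorphism $\varphi(v) = \ln(v/(1-v))$ on $V = (0,1)$; the paper's own $\varphi$ is defined in a part of the text lost in this copy) is:

    import math

    def phi(v):                  # assumed isomorphism V = (0,1) -> R
        return math.log(v / (1.0 - v))

    def phi_inv(t):              # its inverse R -> V
        return 1.0 / (1.0 + math.exp(-t))

    def vadd(v1, v2):            # v1 <+> v2 := phi^{-1}(phi(v1) + phi(v2))
        return phi_inv(phi(v1) + phi(v2))

    def smul(lam, v):            # lam <x> v := phi^{-1}(lam * phi(v))
        return phi_inv(lam * phi(v))

    def dot(v1, v2):             # (v1 | v2)_V = phi(v1) * phi(v2), Eq. (2.5)
        return phi(v1) * phi(v2)

    v1, v2 = 0.7, 0.4
    assert abs(phi(vadd(v1, v2)) - (phi(v1) + phi(v2))) < 1e-12   # Eq. (2.4)
    print(vadd(v1, v2), smul(2.0, v1), abs(phi(v1)))   # last value: ||v1||_V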
Let $m > 1$, $n > 1$ and let $P = \{ W_{ij} \mid (i,j) \in [1,m] \times [1,n] \}$ be a fuzzy partition of the support $\Omega$. The membership degrees of a point $(x,y) \in \Omega$ to the fuzzy window $W_{ij}$ are given by the functions $w_{ij} : \Omega \to [0,1]$ defined by the relation:

$$w_{ij}(x,y) = \frac{u_i(x)\, v_j(y)}{\sum_{i=1}^{m} u_i(x)\, \sum_{j=1}^{n} v_j(y)}, \tag{4.1}$$

where $u_i : [a,b] \to [0,1]$ and $v_j : [c,d] \to [0,1]$ are the following Gaussian functions:

$$u_i(x) = e^{-\left(\frac{x - x_i}{\alpha}\right)^2} \quad \text{and} \quad v_j(y) = e^{-\left(\frac{y - y_j}{\beta}\right)^2},$$

where $x_i = a + \frac{i-1}{m-1}(b-a)$ and $y_j = c + \frac{j-1}{n-1}(d-c)$. The parameters $\alpha, \beta\ (>0)$ control the fuzzification degree and offer more flexibility for the partition $P$. For each window $W_{ij}$, the fuzzy cardinality $card(W_{ij})$, the fuzzy mean of luminosity $\mu_{ij}$, the fuzzy variance of luminosity $\sigma_{ij}^2$ and the fuzzy mean of saturation $\gamma_{ij}$ are defined by:

$$card(W_{ij}) = \sum_{(x,y) \in \Omega} w_{ij}(x,y). \tag{4.4}$$
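The statistics themselves can be sketched as weighted moments (the defining formulas for $\mu_{ij}$, $\sigma_{ij}^2$ and $\gamma_{ij}$ are lost in this copy, so the usual fuzzy-weighted forms are assumed below):

    import numpy as np

    H, W, m, n = 64, 64, 4, 4
    alpha = beta = 12.0                          # fuzzification parameters
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    xi = np.linspace(0, W - 1, m)                # window centres x_i
    yj = np.linspace(0, H - 1, n)                # window centres y_j
    u = np.exp(-((xs[None] - xi[:, None, None]) / alpha) ** 2)       # u_i(x)
    v = np.exp(-((ys[None] - yj[:, None, None]) / beta) ** 2)        # v_j(y)
    w = (u[:, None] * v[None]) / (u.sum(0) * v.sum(0))[None, None]   # Eq. (4.1)
    L = np.random.default_rng(0).random((H, W))  # stand-in luminosity image
    card = w.sum(axis=(2, 3))                                        # Eq. (4.4)
    mu = (w * L).sum(axis=(2, 3)) / card                 # assumed fuzzy mean
    var = (w * (L - mu[..., None, None]) ** 2).sum(axis=(2, 3)) / card
    print(card.round(1), mu.round(3), sep="\n")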
5. The enhancement method using fuzzy surfaces
Let there be a color image described by its three scalar functions $R : \Omega \to V$, $G : \Omega \to V$, $B : \Omega \to V$, with $L$, $S$ its luminosity and saturation computed with relations (3.1), (3.2). The fuzzy window $W_{ij}$ supplies a triple of parameters $(\mu_{ij}, \sigma_{ij}, \gamma_{ij})$, which reflects the statistics of the pixels belonging (in the fuzzy sense) to this window. Then we can define the corresponding fuzzy surfaces [2]. The enhanced image components are calculated using the function (5.4), where $X = R, G, B$.
The method presented in this section was used for the images "kidsat1" and "kidsat4" shown in Figures 1a and 1c [7]. The enhanced images can be seen in Figures 1b and 1d.
Figure 1. a), c) The original images "kidsat1", "kidsat4"; b), d) the enhanced images.
6. Conclusions
This paper presents an enhancement method for color images. After splitting the image support into fuzzy windows, three statistical parameters are computed for each of them: the mean of luminosity, the variance of luminosity and the mean of saturation. Then, using three fuzzy surfaces, an enhancement function is built. Due to the support fuzzification, this function is adjusted to the diverse areas of the processed image. The truncation operations were eliminated by using logarithmic operations defined on bounded sets.
References
1. M. Jourlin and J.C. Pinoli, Image dynamic range enhancement and stabilization in the context of the logarithmic image processing model, Signal Processing, Vol. 41, no. 2, 225, (1995).
2. Y. Lin, G.A. Cunningham, S.V. Coggeshall, Input variable identification - Fuzzy curves and fuzzy surfaces, Fuzzy Sets and Systems, 82, 65, (1996).
3. J.J. Koenderink, A.J. van Doorn, The Structure of Locally Orderless Images, International Journal of Computer Vision, 31(2/3), 159, (1999).
4. V. Patrascu, V. Buzuloiu, Modelling of Histogram Equalisation with Logarithmic Affine Transforms, Recent Trends in Multimedia Information Processing, World Scientific Press, 2002, Proceedings of the 9th International Workshop on Systems, Signals and Image Processing, IWSSIP'02, 312, Manchester, UK, (2002).
5. V. Patrascu, Color Image Enhancement Using the Support Fuzzification, in Fuzzy Sets and Systems - IFSA'03, Vol. LNAI 2715, Springer-Verlag 2003, Proceedings of the 10th International Fuzzy Systems Association World Congress, 412, Istanbul, Turkey, (2003).
6. Z. Rahman, D.J. Jobson, G.A. Woodell, Retinex processing for automatic image enhancement, Human Vision and Electronic Imaging VII (Eds. B. Rogowitz, T. Pappas), Proc. SPIE 4662, 390, (2002).
7. http://dragon.larc.nasa.gov/retinex/
GIS AND SATELLITE IMAGE PROCESSING FOR THE STUDY OF HUMAN IMPACT ASSESSMENT ON STEPPE DEVELOPMENT IN UZBEKISTAN

ISKANDAR MUMINOV
National University of Uzbekistan, Institute of Applied Physics, Tashkent 700174
E-mail: muminof@mbcc.com.uz

JOSEF BENEDIKT
GEOLOGIC Dr. Benedikt, Lerchengasse 34/3, A-1080 Vienna
E-mail:
[email protected]

The availability of satellite imagery and GIS databases enhances the evaluation of human impact on changing ecosystems. The study area was not selected by accident: it has been cultivated by people for a very long time, starting from the Palaeolithic age. Furthermore, it is a trans-border zone between settled and nomadic cultures. The impact of extensive human activities on changes in land use is described by landscape features in a GIS layer-based environment using several physical characteristics (vegetation, soils, agricultural use). MapModels will be used to develop scenarios of anthropogenic impact on the process of increasing steppe development. The use of fuzzy-set-based logical generalizations in modeling geographical phenomena is argued to improve human impact assessment as well as to use GIS as an advanced decision support tool to promote regional development. A knowledge-driven approach is suggested as a working environment to further enhance environmental risk evaluation. The necessary future steps are listed.
1 Introduction

In addition to the physical conditions, a lot of the land use development in Uzbekistan is due to human impact on the environment. The paper gives a geographical description of the general problems of human impact on the degradation of land in the Nuratau area and adjacent regions in Uzbekistan, quantifies the changes in the environment, and describes a possibility for assessing human impact on nature in a future MapModels application.

1.1 Geographic Description Of Historical Development
To study human impact on the environment thoroughly, historical and archaeological data have to be considered. We would like to emphasize that the historic data on the study area are of a general nature and by far incomplete; they need correction and completion, which will be investigated in the future [1]. "Agricultural activity of people gradually creates artificial ecosystems, so-called agrocoenoses, which live by their own laws: to be sustainable they need permanent targeted human labor; they are unable to exist without intervention" (Moiseev, quoted in [2]). During the last decades of the 20th century many ecosystems of the region were destroyed because of extensive agriculture. Furthermore, with industrial development and the growth of large cities, the population of the area began to grow rapidly as well. Rich pastures attracted more and more cattle breeders, but excessive pasturing led to the degradation of the grass cover and caused soil erosion. Almost all juniper forests and pistachio and almond bushes on the mountain slopes were cut down. Isolated juniper specimens still remain only in the most godforsaken ravines among inaccessible rocks. Most factors threatening the natural complexes are of human origin, namely fires, pasturing, fruit collection, harvesting of mulberry-tree leaves, and uncontrolled melioration. Cows pasture constantly in the planted forest belt areas, and sheep and goats continue to be driven massively into the mountain areas (gullies). Overgrazing, typical of the lower mountainous area and of territories close to villages, leads to sharp degradation of vegetation and acute soil erosion. Harvesting of hay and cutting trees in the flood-lands are more local, and their negative impact is not so obvious [1].
Today, Nuratau and the adjacent territories are intensively used as pastures, as well as for growing forage and grain crops where possible; people grow gardens and keep vineyards. People annually sow wheat and partially barley on large areas in both the northern and the southern piedmonts. Irrigated lands are mainly used for cotton; boghara lands are used for wheat growing, gardening and vineyards.
1.2 The Study Area
A Digital Elevation Model (DEM) was created using the topographic map at the scale of 1:500 000 to provide a better perspective of the study area. A satellite image (250 m resolution) was superimposed on the DEM (Fig. 1).

Fig. 1: Aidarkul and Tuzkan Lakes (1, 2); Farish Steppe (3); Nuratau Mountain Ridge (4); Nuratau Valley (5); Aktau Mountain Ridge (6)
The northern slopes of the Nurata Mountain ridge face the Kizilkum Desert. In watery periods, there are many gullies with spring waters. Due to the latter, settlements appeared in and close to the ravines, where people cultivate land and breed cattle. In water scarcity periods, many gullies run dry and there is hardly enough water to irrigate the gardens; water practically does not reach the valley. The width of the small rivers in watery years is 2-3 meters as a rule, with depths of tens of centimeters in some places, whereas in years of water scarcity their width is hardly 30-40 centimeters. For example, this situation was observed in the Hayat sai area in 2001, when there was a considerable drought. In 2002, a year with lots of rain, the landscape had a completely different look.
By visual interpretation of satellite images made in 1989, 1998, 2002 and 2003, acreage indicators were obtained for the Farish Steppe of the Nurata State Reserve. It has to be noted that the GIS layers are inaccurate due to the general lack of metadata at this point, resulting among other things in inaccuracies in spatial data referencing. The following quantities in land use changes were derived from visual interpretation of digital imagery and GIS layers using the Desktop GIS MapInfo. In 1989, the total acreage of agricultural lands in the study area, excluding pastures in the Farish Steppe, comprised approx. 36,000 hectares; in 1998 it was approx. 50,000 hectares; and in 2002 approx. 60,000 hectares were cultivated. The area of flood-land vegetation in the Nurata State Reserve comprised 2310 hectares in 1989 and 2148 hectares in 1998; during that decade the flood-land vegetation was thus reduced by 162 hectares. Changes in the defined water area of the Aidarkul and Tuzkan lakes were also calculated using satellite imagery analysis made at different points of time in the period from 1989 to 2003. If we consider 1989 as a reference point, the largest defined area of water was in 2002, comprising approx. 3,000 km2 in April. For the same period in 2003, the defined area of water of these lakes was approx. 2,900 km2.

2 Human Impact on Steppe Development in Uzbekistan
The term "landscape" refers to a homogeneous territory defined by its origin and history of development. A landscape is characterized by a homogeneous geologic foundation, relief of the same kind, a common climate, a common combination of hydrothermal conditions, soils and biocenoses, and consequently a similar set of simple geocomplexes. Every specific landscape is a genetically integrated, dynamic geosystem with vertical and horizontal links. Every landscape is in constant interaction with the surrounding landscapes through the exchange of substances and energy. Every landscape is unique in space as a geographic individual, while at the same time being part of some typological whole.
Major problems caused by human impact on the landscape in the study area include:
- detriment to and halting of the renewal of forests and of the vegetation cover of mountain valleys, flood-lands and mountain slopes, and increased wash-out of soil due to excessive and mismanaged pasturing of livestock in the areas of the Nuratau and Aktau Mountain Ranges;
- deforestation of mountain slopes as a result of a long-time practice of cutting trees and bushes for fuel in all mountain areas of the study area;
- pasturing of livestock in the territories of the natural reserves, causing overgrazing and the possibility of infection transmission from domestic animals to wild animals, in particular Severtsev's ram, which is registered in the Red Book;
- high pasture load near villages and uneven utilization due to the pasturing of thousands of sheep, goats and cattle: almost the entire piedmont zone, and partially the slopes, are constantly under pasture load; within a radius of 1 km from settlements 80% of the vegetation has been destroyed, and in the zone of the piedmonts, where most of the population is concentrated, pastures are 7-8 times overloaded, even more in the Nuratau and Aktau areas;
- overgrazing, degradation of vegetation, baring and compacting of soil in the piedmont plain, especially near water (wells, artesian wells); in the Farish Steppe area this is well visible on the satellite image;
- acute change and transformation of the vegetation cover because of intensive cultivation and the irrational expansion of pasturing.
A Geographic Information System seems suitable to address the multi-layered question of human impact assessment. Many Desktop GIS systems, however, do not take uncertainty into account other than by quantifying errors in data and information. In this work we attempt to include local geographical knowledge and to provide means, as GIS extensions based on fuzzy sets and logical generalizations, to adequately model some of the issues described above.

2.1 Uncertainty and Human Impact Assessment
Humans act irrationally; that is, we are not aware of all the facts and data available when making decisions. Our evaluation is not scientific in the sense that it is not always possible to estimate probabilities and make decisions from them. Human actions do not follow the logic of a computer system or of binary decision making; a lot of social, perceptional, linguistic and other elements have an impact on human decisions. Traditional computer-based models do not reflect the complexity of the human mind in, for example, shaping a landscape. A generalization of logical concepts may be more suitable for addressing the vast field of human decision making. Soft computing techniques are tolerant of errors and focus on generalized concepts of formalisms, in the way fuzzy logic extends classical logical axioms. Throughout the so-called "age of information", knowledge itself has not improved, but the possibilities to handle uncertainty have gained tremendous importance.

3 MapModels - A GIS modeling language
MapModels is a flexible tool for explorative spatial data analysis and is deemed suitable for modeling complex decision making. It has been developed at the Regional Science Institute of the Vienna University of Technology with the intention of bridging the gap between spatial decision analysts and computer programmers [3]. MapModels is a visual programming language based on the widespread desktop GIS ArcView 3. It supports the development and implementation of analysis procedures based on flowchart representations in a very intuitive and user-friendly manner. It is particularly suited for extended decision making with fuzzy set modeling of geographical notions [4]. Flowcharts are used for the visualization of models and analysis processes in a wide range of applications. Normally this kind of graphic representation is simply focused on the illustration of the model structure and information flow, but it does not directly control the underlying processes.
Within MapModels the nodes of a flowchart are in fact active elements of the model. They provide a visual encapsulation of real analysis procedures and data objects where input data and analysis operations are represented by labeled icons connected by edges which characterize the dataflow (Fig. 2).
Fig. 2. A simple spatial query: "find all relatively flat areas with an elevation higher than a given threshold" (center: MapModel; left: slope/100 m; right: slope/700 m, with corresponding results)
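Outside MapModels, the same query can be sketched as a plain raster overlay (the arrays below are hypothetical stand-ins for the DEM-derived layers):

    import numpy as np

    rng = np.random.default_rng(1)
    elevation = rng.uniform(0, 2000, size=(50, 50))   # metres, hypothetical DEM
    slope = rng.uniform(0, 30, size=(50, 50))         # degrees, hypothetical slope layer
    flat_and_high = (slope < 5.0) & (elevation > 1100.0)
    print(int(flat_and_high.sum()), "raster cells satisfy the query")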
Since MapModels flowcharts contain executable code, the specification and the implementation of the analysis model are one single step. It is a kind of drawing process in which flowchart elements are inserted into the model environment and connected by means of drag-and-drop operations with the mouse [5].

4 Outlook
The text reports on work in progress and gives an idea of the situation and of a possible methodology to be used with the data collected in this area. Although there are no results yet in assessing human impacts within a GIS-based decision support system, the authors intend to show the reader the necessity of using such tools and the promising methodology of generalizing classical methods for handling uncertainty and semantic knowledge in the field of land use development. The main topics to be addressed, which are of
particular importance in working on land use development in Uzbekistan and map modeling, are:
- develop a model to assess human activity;
- create a database with elements of electronic maps on natural and human factors; this database should include numeric and descriptive data and knowledge about the condition of the environment, as well as the social and economic situation;
- study the possibilities of using wind technologies and solar engineering as replacements for existing sources of power, in order to preserve the trees and bushes in the study area from being cut for fuel;
- sow perennial forage crops in the piedmont plain in cattle water-access and driving areas to reduce desertification;
- hold regional seminars to discuss existing issues of environment protection, its assessment and ways to address these issues, and to develop practical recommendations to preserve and rationally use the natural environment.

Remark/Acknowledgment
This work is a report on parts of Mr. Muminov's thesis project; he provided field data, satellite imagery and local knowledge to be integrated within a GIS modeling environment.

References
1. The German Environmental Protection Union: Nurata Mountains and south-west part of the Kizilkum Desert, Nuratau Report (1996) pp. 125-126.
2. Muminov I., To a problem of compiling of a modern landscape map, Geography and Values, National University of Uzbekistan, Tashkent Workshop abstracts (2001) (Russian only).
3. Riedl L., Vacik H. and Kalasek R., MapModels: a new approach for spatial decision support in silvicultural decision making, Computers and Electronics in Agriculture 27 (2000) pp. 407-412.
4. Benedikt J., Reinberg S., Riedl L., Vague Geographical Knowledge Management - A flow-chart based application to spatial information analysis, in: R. De Caluwe, G. De Tre, G. Bordogna (eds.), Flexible Querying And Reasoning In Spatio-Temporal Databases: Theory And Applications (2004, in press).
5. Benedikt J., Reinberg S. and Riedl L., A GIS application to enhance cell-based information modelling, Information Sciences 142 (2002) pp. 151-160.
THE HAAR WAVELETS IN A FUZZY SYSTEM FORM AND AN APPLICATION TO THE JOHNSON NOISE THERMOMETRY

B.S. MOON, I.K. HWANG, C.E. CHUNG, K.C. KWON
Korea Atomic Energy Research Institute, 150 Dukjin-Dong, Yusong-Ku, Daejeon, 305-353, Korea, Rep. of
E-mail:
[email protected]

We describe how the multi-resolution analysis using the Haar wavelets and the corresponding discrete wavelet transformation change when the approximate functions involved are represented by fuzzy systems. Using the scaling function of the Haar wavelet, or the step function, as the input fuzzy sets, we prove that the fuzzy system representations of the intermediate approximate functions differ from the corresponding functions in the MRA of the Haar wavelet by $O(h)$, where $h$ is the length of the subintervals. Thus, the fuzzy system representations of the functions are identical with those of the Haar wavelet when applied to gray scale images, or when the manipulations of the function values are performed in integers with a fixed number of bits. We prove that the approximation based on the Haar wavelet picks up the trend curve when it is applied to a band-pass filtered sinusoidal signal. An example determining the trend of the temperature change curve from the power of the Johnson noise signal is included.
1. Introduction
It is well known [1,2] that a continuous function can be approximated by a fuzzy system within an arbitrary accuracy on any bounded set. One of the advantages of using a fuzzy system representation is that in fuzzy systems not only the independent variables but also the dependent variable is approximated. The approximation can be done in such a way that no further accuracy is lost compared with the corresponding crisp approximation. Note that gray scale images, or functions with A/D-converted values, can be considered to be already in a fuzzy system form. For gray scale images, the row number and the column number of each pixel may be taken as the input fuzzy set numbers, and the gray scale value in each pixel, ranging from 0 to 255, can be taken as the output fuzzy set number. Thus, every gray scale image in its original form can be considered as a fuzzy rule table. For functions obtained by sampling, with their values from an $n$-bit A/D converter, we take the sample numbers as the input fuzzy set numbers and the $n$-bit numbers in their integral form as the output fuzzy set numbers. Note that there will be $2^n$ output fuzzy sets in this case.

Figure 1. The scaling function for the Haar wavelet, $\phi(2^k t - l)$.

In the following, we will consider the multi-resolution analysis (MRA) of a function based on a scaling function as a successive approximation of the function, and similarly we will consider the discrete wavelet transformation (DWT) of a function as a successive approximation based on the wavelet. Recall that all of the functions used in these approximations can be represented or approximated by the corresponding fuzzy systems, and hence we expect the MRA and DWT in a fuzzy system form to be possible.

2. The Haar Wavelet in a Fuzzy System Form
The Haar wavelet decomposition for a continuous function $f(t)$ on $[0,1]$ is defined [3] as follows. Let $x_i = ih$, $i = 0, 1, 2, \ldots, 2^n$, where $h = 2^{-n}$, and let $f_{k,l}$ be the average of the sampled values on the $l$-th dyadic subinterval at level $k$,

$$f_{k,l} = \frac{1}{2^{n-k}} \sum_{i = l \cdot 2^{n-k}}^{(l+1)2^{n-k} - 1} f(x_i),$$

for $l = 0, 1, 2, \ldots, 2^k - 1$. Note that the function $f_k(t)$ is constant on every subinterval of length $2^{-k}$ starting from 0. Let the scaling function of the Haar wavelet be
$$\phi(t) = \begin{cases} 1, & 0 < t \leq 1 \\ 0, & \text{otherwise;} \end{cases}$$
then the function $f_k(t)$ can be written as

$$f_k(t) = \sum_{l=0}^{2^k - 1} c(k,l)\, \phi(2^k t - l), \tag{4}$$

and it is routine to check that the multiresolution relation [3] becomes

$$c(k-1, l) = \frac{1}{2}\big( c(k, 2l) + c(k, 2l+1) \big). \tag{5}$$

For the wavelet transformation, we define

$$g_{k-1}(t) = f_{k-1}(t) - f_k(t) \tag{6}$$

and the Haar wavelet $\psi(t)$ by $\psi(t) = \phi(2t) - \phi(2t - 1)$. Then, we can write

$$g_{k-1}(t) = \sum_{l=0}^{2^{k-1} - 1} d(k-1, l)\, \psi(2^{k-1} t - l), \tag{7}$$

where the coefficients $d(k,l)$ satisfy

$$d(k-1, l) = \frac{1}{2}\big( c(k, 2l+1) - c(k, 2l) \big). \tag{8}$$

To define the wavelet decomposition in a fuzzy system form, let $C(n,l) = f_{n,l}$ for $l = 0, 1, 2, \ldots, 2^n - 1$, and for $k = n, n-1, \ldots, 1$, let

$$C(k-1, l) = \left[ \frac{1}{2}\big( C(k, 2l) + C(k, 2l+1) \big) \right] \tag{9}$$

for $l = 0, 1, 2, \ldots, 2^{k-1} - 1$, where we used the Gauss bracket function to obtain integer values. Similarly, for the discrete wavelet transformation, we define

$$D(k-1, l) = \left[ \frac{1}{2}\big( C(k, 2l+1) - C(k, 2l) \big) \right]. \tag{10}$$

Note that $C(k,l)$ are integers for all $k$ and $l$, and that the sequence $\{C(k,0), C(k,1), C(k,2), \ldots, C(k, 2^k - 1)\}$ is the fuzzy rule table for a function $\tilde{f}_k(t)$ which is an approximation of $f_k(t)$. In the following, we will show that the error due to this approximation is not very large and is controllable.
Lemma 2.1. Let $f_{n,l}$ be integral values of a function as defined above and let $c(n,l) = f_{n,l}$ for $l = 0,1,2,\ldots,2^n-1$. If $c(k,l)$ and $C(k,l)$ are defined as in (5) and (9) respectively, then we have $|C(k,l) - c(k,l)| \le \frac{n-k}{2}$.

Theorem 2.1. Let $\tilde{f}_k(t) = \sum_{l=0}^{2^k-1} C(k,l)\,\phi(2^k t - l)$ and let $f_k(t)$ be as defined in (4). Then we have $|\tilde{f}_k(t) - f_k(t)| \le \frac{n-k}{2}$.

Proof. Note that the sum in $\tilde{f}_k(t) - f_k(t) = \sum_{l=0}^{2^k-1}\bigl(C(k,l) - c(k,l)\bigr)\,\phi(2^k t - l)$ is a direct sum of functions whose supports are disjoint. Hence, we have $\max_t |\tilde{f}_k(t) - f_k(t)| = \max\{|C(k,l) - c(k,l)| : l = 0,1,\ldots,2^k-1\}$, and the latter is less than or equal to $\frac{n-k}{2}$ by Lemma 2.1. Q.E.D.
3. Application to Johnson Noise Thermometry

Johnson noise is the thermally fluctuating electrical current $I(t)$ in an electric circuit of resistance $R$ and self-inductance $L$ at absolute temperature $T$. The Johnson noise can be expressed by the following formula [4,5], based on the assumption that the many complex interactions between the conducting electrons and the thermally vibrating atomic lattice of the wire produce a thermal electromotive force $v(t)$ in the circuit:
$$L\,\frac{dI(t)}{dt} = -R\,I(t) + v(t). \qquad (11)$$
This can be rewritten with $\tau = L/R$ as
$$\frac{dI(t)}{dt} = -\frac{1}{\tau}\,I(t) + \frac{1}{L}\,v(t). \qquad (12)$$
Note that $I(t)$ is an Ornstein-Uhlenbeck process and $v(t)$ can be written as
$$v(t) = c\,\Gamma(t), \qquad (13)$$
where $\Gamma$ is a Gaussian white noise and $c = \sqrt{2kTR}$ [4]. Therefore, we have $\overline{v(t)^2} = 2kTR$, from which we find that the temperature $T$ is proportional to $\overline{v(t)^2}$.
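A quick numerical illustration of this relation (ours; all circuit values are hypothetical): simulate the Ornstein-Uhlenbeck current with an Euler-Maruyama step and read the temperature back from the stationary noise power, using the equipartition relation $\langle I^2\rangle = k_B T / L$ that this linear circuit satisfies.

```python
import numpy as np

rng = np.random.default_rng(0)
k_B = 1.380649e-23                 # Boltzmann constant (J/K)
T, R, L = 300.0, 100.0, 1e-3       # hypothetical circuit: kelvin, ohm, henry
tau = L / R                        # correlation time of the OU current
c = np.sqrt(2 * k_B * T * R)       # white-noise strength of the thermal EMF
dt, n = tau / 50, 300_000          # time step well below tau

I = np.zeros(n)
for i in range(1, n):
    # Euler-Maruyama step of dI = -(I/tau) dt + (c/L) dW
    I[i] = I[i - 1] - (I[i - 1] / tau) * dt + (c / L) * rng.normal(0, np.sqrt(dt))

# Equipartition for the stationary current gives <I^2> = k_B*T/L, so the
# temperature can be recovered from the simulated noise power.
T_est = L * np.var(I[n // 10:]) / k_B
print(f"true T = {T:.0f} K, estimated T = {T_est:.0f} K")
```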
Figure 2. Comparison of the Haar approximation and the fuzzy Haar approximation.
In the following, we prove that the approximation of a sinusoidal signal in a fixed frequency band by the 4th step of the multi-resolution analysis using the Haar wavelet reduces the amplitude of the oscillation to less than 15% of the original.
Lemma 3.1. Let $m$ and $N$ be positive integers such that $N$ is divisible by 256, i.e. $N = 256N_1$. If $\frac{3}{16} \le \frac{m}{N} < \frac{1}{2}$ and if $m = m_0 N_1 + m_1$ where $0 \le m_1 < N_1$, then we have $48 \le m_0 \le 127$.

Proof. From $\frac{3}{16} \le \frac{m}{N} < \frac{1}{2}$, we have $3 \times 2^4 N_1 \le m_0 N_1 + m_1 < 2^7 N_1$. Hence, by dividing by $N_1$, we get $3 \times 2^4 \le m_0 + \frac{m_1}{N_1} < 2^7$. Now, note that $0 \le \frac{m_1}{N_1} < 1$, and hence $3 \times 2^4 \le m_0 + \frac{m_1}{N_1}$ if and only if $48 \le m_0$. Q.E.D.
Next, we define, for $l = 0, 1, 2, \ldots$,
$$S(l) = \frac{1}{16}\sum_{k=1}^{16} k^l \sin\!\left(\frac{\pi m_0 k}{128}\right) \quad \text{and} \quad C(l) = \frac{1}{16}\sum_{k=1}^{16} k^l \cos\!\left(\frac{\pi m_0 k}{128}\right);$$
then by direct calculations we have the following.

Lemma 3.2. If $S(l)$ and $C(l)$ are as defined above, then for all integers $m_0$ with $48 \le m_0 \le 127$, we have $|S(0)|, |C(0)| \le 0.1$; $|S(1)|, |C(1)| \le 1.0$; $|S(2)|, |C(2)| \le 15$; $|S(3)|, |C(3)| \le 250$; and $|S(4)|, |C(4)| \le 4000$.

Lemma 3.3. If $0 \le x < 0.5$, then $\sum_{l=5}^{\infty} \frac{x^l}{l!} \le \frac{x^5}{5!}\left(1 + \frac{x}{3}\right)$.
Theorem 3.1. If $m$ and $N$ are positive integers with $N = 256N_1$ and if $\frac{3}{16} \le \frac{m}{N} < \frac{1}{2}$, then we have $\left|\frac{1}{16}\sum_{k=1}^{16}\sin\!\left(\frac{2\pi mk}{N}\right)\right| \le 0.15$.

Figure 3. Transient Johnson noise power with wavelet approximation.
Proof. Let $m = m_0 N_1 + m_1$ and write $\sin\!\left(\frac{2\pi mk}{N}\right)$ as $\sin\!\left(\frac{\pi m_0 k}{128} + \frac{2\pi m_1 k}{N}\right)$. Note that we have $48 \le m_0 \le 127$ by Lemma 3.1. Let $\alpha_k = \frac{\pi m_0 k}{128}$ and $\beta_k = \frac{2\pi m_1 k}{N}$; then $\beta_k \le \frac{2\pi k}{256} < 0.5$ for $k = 1, \ldots, 16$. Now, using
$$\sin(\alpha_k + \beta_k) = \sin\alpha_k\cos\beta_k + \cos\alpha_k\sin\beta_k = \sin\alpha_k\sum_{j=0}^{\infty}(-1)^j\frac{\beta_k^{2j}}{(2j)!} + \cos\alpha_k\sum_{j=0}^{\infty}(-1)^j\frac{\beta_k^{2j+1}}{(2j+1)!},$$
we can write, with $\beta = \frac{2\pi m_1}{N} \le \frac{2\pi}{256}$,
$$\left|\frac{1}{16}\sum_{k=1}^{16}\sin\!\left(\frac{2\pi mk}{N}\right)\right| \le |S(0)| + \beta\,|C(1)| + \frac{\beta^2}{2}\,|S(2)| + \frac{\beta^3}{6}\,|C(3)| + \frac{\beta^4}{24}\,|S(4)| + \cdots.$$
Now, by using Lemma 3.2 and Lemma 3.3, it is routine to check that the last sum is less than or equal to 0.15. Q.E.D.

Corollary 3.1. If $f(t)$ is a signal with mean 0, sampled at a rate of $N$ samples per second, band-pass filtered with lower limit frequency greater than $\frac{3}{16}N$ and upper limit frequency below $\frac{N}{2}$, then the maximum amplitude of the 4th step Haar approximation of $f(t)$ is below 15% of the original amplitude.
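Before turning to the example in Figure 3, Theorem 3.1 can be spot-checked numerically. The sketch below (ours, not part of the paper) sweeps every admissible $m$ for one choice of $N_1$ and reports the worst residual amplitude of the 16-sample block average.

```python
import numpy as np

def residual(m, N):
    """|(1/16) * sum_{k=1..16} sin(2*pi*m*k/N)|: the amplitude left after one
    16-sample (4th-step Haar) average of a unit sinusoid of frequency m/N."""
    k = np.arange(1, 17)
    return abs(np.sin(2 * np.pi * m * k / N).sum()) / 16.0

N1 = 7                                     # any positive integer, N = 256*N1
N = 256 * N1
band = range(int(np.ceil(3 * N / 16)), N // 2)
print(max(residual(m, N) for m in band))   # stays below the 0.15 bound
```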
Figure 3 shows an example Johnson noise signal power and its approximation, where the oscillating curve with the larger amplitude is the original and the curve in the middle is the approximation.

4. Conclusion
We have shown that the multi-resolution analysis of functions and the corresponding discrete wavelet transformation based on the Haar wavelet can be represented by fuzzy systems. We have also shown that the Haar wavelet approximation can be used to determine the transient temperature from the power of the Johnson noise signal without time delay. For further study, it remains to prove that Theorem 3.1 is true even when the sample rate $N$ is not a multiple of 256.
References
1. B. S. Moon, A Practical Algorithm for Representing Polynomials of Two Variables by Fuzzy Systems with Accuracy O(h^4), Fuzzy Sets and Systems 119(2), 321 (2001).
2. J. L. Castro, M. Delgado, Fuzzy systems with defuzzification are universal approximators, IEEE Trans. Systems Man Cybernet. 26(1), 149 (1996).
3. R. M. Rao and A. S. Bopardikar, Wavelet Transforms: Introduction to Theory and Applications, Addison-Wesley Longman (1998).
4. D. T. Gillespie, A Mathematical Comparison of Simple Models of Johnson Noise and Shot Noise, J. Phys.: Condens. Matter 12, 4195-4205 (2000).
5. D. T. Gillespie, The Mathematics of Brownian Motion and Johnson Noise, Am. J. Phys. 64(3), 225-240 (1996).
NEURAL NETWORK BASED SONAR TARGET DIFFERENTIATION

W. S. LIM, M. V. C. RAO, C. K. LOO
Center of Robotics and Automation, Faculty of Engineering and Technology, Multimedia University, Jalan Ayer Keroh Lama, 75450 Melaka, Malaysia. Email:
[email protected]
In this paper, the stability-plasticity behaviors of the Minimal Resource Allocation Network (MRAN) and the Probabilistic Neural Network (PNN) in differentiating targets by means of sonar signals are investigated. MRAN is shown to have lower network complexity but higher plasticity compared to PNN in the experimental study. In terms of on-line learning performance, MRAN also proves to be superior to PNN.
1. Introduction

Pattern classification has become an important topic for robotics research in many applications [1]. These classifiers are capable of predicting the shape of objects or obstacles surrounding the robot by processing the input data received from numerous types of sensors or detectors on the robot. Generally, different objects with different curvatures will reflect signals at different angles and intensities. Sonar signals supply data on the distance between the target and the detector, which serve as the input to the neural networks. In this paper, we investigate the use of neural networks in processing the sonar signals reflected by different targets, and localization applications for indoor environments. We describe how the MRAN algorithm performs on pattern classification of various targets and compare its performance with the Probabilistic Neural Network (PNN). The robustness and plasticity of the MRAN neural network are tested and compared. The comparison of results shows that MRAN has a high level of plasticity that deteriorates its capability of maintaining the neurons' weights for previously encountered patterns. Hence it is more suitable for online learning, where the weights change rapidly according to the online received data.
2. Target Classification with Neural Networks

The target differentiation algorithm used in earlier works such as [2] is reviewed. It has given useful ideas on how to differentiate targets by means of their shapes and radii of curvature. In this work, two types of neural network algorithms are
used, namely the Minimal Resource Allocation Network (MRAN) and the Probabilistic Neural Network (PNN), for classifying three primitive targets. The target primitives modeled in this study are wall, corner, and edge (Figure 1). The MRAN and PNN used in this work have one input, one hidden, and one output layer. For MRAN, there are 6 input nodes and 3 output nodes, whilst the number of hidden neurons varies with each training data set. For PNN, there are also 6 input nodes but only 1 output node. The number of hidden nodes in PNN depends on the number of training data [5].
Figure 1. Cross sections of the target primitives (wall, corner, edge) differentiated in this work.
Figure 2. Sensitivity region of an array of ultrasonic transducers in P2AT robot.
2.1. Minimal Resource Allocation Network (MRAN)
In MRAN, the network begins with no hidden neurons. As each training data pair (input and output) is received, the network builds itself up based on two growth criteria, equations (1) and (2):
$$\|x_i - \mu_{ir}\| > \varepsilon_i \qquad (1)$$
$$\|e_i\| = \|y_i - f(x_i)\| > e_{min} \qquad (2)$$
where $\mu_{ir}$ is the center (of the hidden unit) which is closest to $x_i$ (the input received); $e_i$ is the calculated error, i.e. the difference between the received output $y_i$ and the network output $f(x_i)$; and $\varepsilon_i$, $e_{min}$ are thresholds to be selected appropriately. The algorithm adds new hidden neurons or adjusts the existing network parameters according to the training data received. The algorithm also incorporates a pruning strategy that is used to remove hidden neurons that do not contribute significantly to the output. The simulation results in [10] show that the algorithm is able to build up a network that can perform equalization better than several existing methods. Details of the classical PNN are not explained further here; more information can be found in [3].

2.2. Experiment procedures

In our system, a commercially available robot simulator, the Amigobot modeled P2AT, is employed for data collection (Figure 3). Six identical acoustic sonar transducers on the front side of the robot were utilized, as shown in Figure 2. Each transducer can operate both as transmitter and receiver and detects echo signals reflected from targets within its own sensitivity region. The echo signals, which indicate the distance of the target from the transducers, are collected at 200 locations for each target, by positioning the robot from r = 102.5 mm to r = 600 mm in 2.5 mm increments, which gives 600 training data (3 target types x 200 data each) in total for training purposes.
Figure 3. Amigobot simulation software
Three training methods are employed to judge the robustness and plasticity of both networks. Firstly, the networks are trained with data collected from the first target type, then the second, and then the third target type in sequence. In the second method, they are trained with randomly mixed data of all three target types. In the third training method, both networks are trained in a similar way to the first method, but the training is extended by repeating the first target type training again at the end of the sequence. After the training process, the networks are tested to investigate their robustness and stability for targets situated at new distances from the robot. For example, the robot was located at distances from r = 105 mm to r = 602.5 mm in 2.5 mm increments; these are basically the midpoints between two side-by-side original locations previously used in collecting the training data, thus providing 600 new data (3 target types x 200 data each) for testing the neural network performance. The same testing data used to test MRAN was applied again to test the network performance of PNN. As stated earlier, there are 3 output nodes in the MRAN network. Each of these output nodes corresponds to one type of target. For simplification of calculation, the three outputs are scaled to yield only the value 0 or 1. Then we employ the "winner takes all" method, setting the largest value among the three outputs to "1" while the other two remain "0". The output node
with a "1" indicates the type of target detected. This in turn gives every test result as a combination of 0s and a single 1 (001, 010, or 100).
3. Comparative Analysis and Discussion

Both networks are compared on their accuracy in estimating the correct target, namely wall, corner, or edge. As discussed in the previous section, the data used for testing the networks are different from the data used for training; this further ensures that the robustness and stability of the networks are tested. The numerical values in Table 1 show the percentages of correct target-type classification.
Table 1. Percentages of correct target-type classification for MRAN and PNN with the three training methods.

Network used            MRAN                        PNN
Target type        wall    corner   edge      wall    corner   edge
Method 1           37.5%   55.3%    100%      98.0%   93.8%    97.5%
Method 2           65.8%   72.4%    48.0%     98.0%   93.8%    95.6%
Method 3           100%    52.1%    70.7%     98.0%   93.8%    95.6%
Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met The results obtained above have led us to study further on the concept of stability-plasticity dilemma in MRAN and PNN neural networks. Till today, many research works have been done to encounter the stability-plasticity problems in neural networks such as in [6,7] for pattern classifications and function approximation. A major restriction on traditional artificial neural network is that the approximation capability will be frozen after the completion of training process. This results in a gradual degradation of estimation performance when applied to non-stationary environment. In solving this problem, the key challenge is the requirement to maintain a compromise between robustness toward interference and the adaptability to environment changes. Until recent decades, artificial neural networks have been providing us with many successful evidences on the application of multivariate and nonlinear time series prediction [8,9]. However, traditional neural networks always perform unsatisfactorily in non-stationary cases because of a deficiency of feedback mechanism to accommodate the input distribution changes. Missing this feedback mechanism, the common way to adapt the distribution skewness is to
completely clear the existing network memory and begin with a new training set including information about the current changes.
3.1. MRAN performance study

The results from MRAN show that perfect classification (100%) is obtained for the target type that is trained last in the sequence, i.e. for Methods 1 and 3. This implies that the network learns very quickly and is able to classify accurately what it has just learned. In Method 1, edge is the last batch of training data received by the network, and edge is the only target accurately classified, whereas for wall and corner the network fails to classify accurately. The same goes for Method 3, where wall data is repeated at the end of the training process and, as expected, wall has the perfect classification percentage this time. In Method 2, the network classifies poorly overall when the training data is randomly mixed. The number of hidden neurons generated in MRAN for each training run is in the range of 15 to 20. Generally the algorithm starts from a null state with zero neurons. As learning occurs, neurons are incrementally allocated to the hidden layer according to the two criteria given by equations (1) and (2). If the two criteria are simultaneously satisfied, a neuron is created and associated with the network to diminish the errors from the unanticipated input vector. If the two criteria are not satisfied, instead of using the neuron allocation strategy to minimize the errors, MRAN adopts the Extended Kalman Filter algorithm to optimize the estimation by gradually repairing the errors on the connection weights and neuron centers [4]. Hence the frequent updating of weights and centers causes the network to be unstable and high in plasticity.
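The allocate-or-adapt decision just described can be condensed into a few lines; the sketch below is ours, with hypothetical names for the thresholds and the network predictor.

```python
import numpy as np

def mran_step(x, y, centers, predict, eps_i, e_min):
    """One MRAN decision: allocate a new hidden unit only when the input is
    far from every existing center (criterion 1) AND the prediction error
    is large (criterion 2); otherwise the EKF update of weights and centers
    (omitted here) would be applied instead."""
    nearest = min((np.linalg.norm(x - c) for c in centers), default=np.inf)
    error = np.linalg.norm(y - predict(x))
    if nearest > eps_i and error > e_min:
        centers.append(x.copy())          # grow: new neuron centered at x
        return "allocate"
    return "ekf-update"
```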
3.2. PNN performance study

On the other hand, the results from PNN show that the highest classification percentage is for wall, followed by edge and then corner. Generally, the classification percentages for all targets reach a satisfactory level (above 90%). The number of hidden neurons generated in PNN for each training run depends on the number of training data [5]. In this case, the number of neurons generated is approximately 600, which is equivalent to the sample size used to train both networks. Hence, the training process takes much longer, and the large number of neurons generated also adds to the complexity of the network. However, PNN does not suffer from these instability and plasticity problems because it creates a separate neuron for each training sample. These generated neurons are neither pruned nor adjusted in the training process, hence the weights of all neurons are maintained. For moderately sized databases this
is not a problem, but it becomes a major drawback for large databases and applications, where it deteriorates the speed and consequently increases the complexity of the network.

4. Concluding remarks
The percentage of correct target-type classification is high, at 100%, for the last target trained by MRAN in all three training methods. This shows that the target type trained most recently gives high classification accuracy. MRAN can thus be said to have high plasticity and, furthermore, to be unstable. In other words, the weights carried by the hidden neurons during the training process change too rapidly, making the network tend to "forget" what it learned previously. No doubt it learns very quickly, but it only "remembers" what was trained most recently. Hence MRAN is not suitable for pattern classification, but it is extremely useful and efficient for online learning. The advantage of the MRAN network is its low complexity due to the small number of hidden neurons generated during training. PNN is shown to be a more stable network than MRAN, but its disadvantages are that it requires a longer training time due to the large number of hidden neurons, it has high network complexity, and its learning speed is slower than MRAN's. Hence PNN is more suitable as a pattern classifier than MRAN.
References
1. R. P. Lippmann, IEEE ASSP Mag., pp. 4-22, Apr. 1987.
2. B. Barshan, B. Ayrulu, and S. W. Utete, IEEE Trans. Robotics and Automation, vol. 16, pp. 435-442, August 2000.
3. D. F. Specht, IEEE Conf. Neural Networks, vol. 3, no. 1, pp. 109-118, 1990.
4. M. Chan, C. Fung, IJCNN '99, IEEE International Joint Conference, vol. 3, 10-16 July 1999, pp. 1554-1559.
5. D. F. Specht, IEEE Conf., vol. 1, 7-11 June 1992, pp. 761-768.
6. B. L. Pulito, T. R. Damarla, S. Nariani, IJCNN International Joint Conference on Neural Networks, 17-21 June 1990, pp. 825-833, vol. 2.
7. J. P. Albright, 1994 IEEE World Congress on Computational Intelligence, 1994 IEEE International Conference, vol. 1, 27 June-2 July 1994, pp. 498-502.
8. R. Rape, D. Fefer, J. Drnovsek, Time Series Prediction with Neural Networks, IMTC'94, IEEE, May 1994.
9. B. Jacobsen, Time Series Properties of Stock Returns, Kluwer Bedrijfsinformatie, 1997.
10. Deng Jian Ping, N. Sundararajan, and P. Saratchandran, IEEE Trans. Neural Networks, vol. 13, pp. 687-696, May 2002.
SYSTEMATIC DESIGN OF A STABLE FUZZY CONTROLLER FOR A ROBOTIC MANIPULATOR USING DESCRIBING FUNCTION TECHNIQUE
EVREN GURKAN
Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA
E-mail: evreng@metu.edu.tr
Our aim in this paper is to develop a systematic method for designing stable fuzzy controllers using the describing function technique. A given multi-input single-output fuzzy system is first reduced to a set of single-input single-output fuzzy systems using the additivity property. Then, analytical expressions for the describing functions of each single-input single-output system are evaluated. The describing function of a fuzzy system is an interval-valued function; therefore, robust control results are used for the stability analysis. We have applied the theoretical results to the single-axis control of the Phantom haptic interface robotic manipulator.
1. Introduction

The stability analysis of fuzzy systems becomes an important issue when they are used as controllers. Thus, finding systematic ways of performing the stability analysis is of major interest in the literature. The existing approaches are reviewed in Sugeno [12] and Kandel [8]. The methods can be grouped as Lyapunov methods ([12], [16] and [4]), robust stability analysis ([7], [11] and [10]), adaptive fuzzy controller design ([15], [13] and [14]), and frequency domain methods ([9], [5] and [1]). The describing function method is used in [9], [5] and [1]. Kim et al. [9] derive analytical expressions for the describing functions of a fuzzy system with a single input, and of a fuzzy system with two inputs where the second input is the derivative of the first. The existence of a limit cycle of the fuzzy control system is predicted using describing function analysis. In [1], the describing function method is used to analyze the behavior of PD and PI fuzzy logic controllers. The existence of stable and unstable limit cycles is predicted. Describing function analysis of a T-S fuzzy system is done in [5],
where the describing function is evaluated experimentally. The existence of multiple equilibria and of limit cycles is examined. In this paper, we develop a systematic design method for fuzzy controllers using the describing function technique. We first apply the additivity property of fuzzy systems to multi-input single-output fuzzy systems to decompose them into a sum of single-input single-output fuzzy systems. The describing functions of these systems can then be calculated analytically using the methods in [9]. The describing function obtained in this way is interval-valued, so in the stability analysis we use robust control results. The mathematical overview of the calculation of the describing function, together with the additivity property and the stability analysis, is given in Section 2. The theoretical results are applied to the single-axis control of the Phantom haptic interface robotic manipulator in Section 3. Section 4 concludes the paper.
2. Mathematical Overview

In this section, we give a brief overview of the additivity property of fuzzy systems and the analytical calculation of the describing function. The stability analysis based on the describing function technique is also discussed here.
2.1. Additive Fuzzy Systems

We extend the additivity property of fuzzy systems introduced in Cuesta et al. [5] in order to reduce multi-input single-output fuzzy systems into a sum of single-input single-output ones. The closed form of the fuzzy logic system with center average defuzzifier, product-inference rule and singleton fuzzifier is:
$$f(x) = \sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}\,\xi_{ijkl}(x), \qquad (1)$$
where
$$\xi_{ijkl}(x) = \frac{\mu_{1i}(x_1)\,\mu_{2j}(x_2)\,\mu_{3k}(x_3)\,\mu_{4l}(x_4)}{\sum_{i}\sum_{j}\sum_{k}\sum_{l}\mu_{1i}(x_1)\,\mu_{2j}(x_2)\,\mu_{3k}(x_3)\,\mu_{4l}(x_4)}. \qquad (2)$$
We decompose the fuzzy system in Equation (1) using the additivity property. For simplicity, we develop the theory for n = 4; the extension of the theory to higher degrees is straightforward. For the fuzzy system to be
additively decomposable, it should satisfy the following property [5]:
$$f(x) = f(x_1, x_2, \ldots, x_n) = f(x_1, 0, \ldots, 0) + f(0, x_2, \ldots, 0) + \cdots + f(0, 0, \ldots, x_n). \qquad (3)$$
The assumptions on the membership functions for the system to be decomposable are given in [5]. We use triangular membership functions that satisfy these assumptions:
$$\mu_{qi}(x_q) = \begin{cases} \dfrac{x_q - \phi_{qi-1}}{\phi_{qi} - \phi_{qi-1}}, & \phi_{qi-1} \le x_q < \phi_{qi}, \\[2mm] \dfrac{\phi_{qi+1} - x_q}{\phi_{qi+1} - \phi_{qi}}, & \phi_{qi} \le x_q < \phi_{qi+1}, \\[2mm] 0, & \text{otherwise,} \end{cases} \qquad (4)$$
where $\phi_{-qi} = -\phi_{qi}$. These types of membership functions also satisfy the assumptions in the calculation of the describing function, which is introduced in subsection 2.2. When $\phi_{1a} \le x_1 < \phi_{1a+1}$, two consequent rules are fired for $x_1$, with memberships $\mu_{1a}(x_1)$ and $\mu_{1a+1}(x_1)$. The same applies for the other inputs. In total, there are $2^4 = 16$ rules fired. The corresponding fuzzy system output is:
$$f(x) = \mu_{1a}\mu_{2b}\mu_{3c}\mu_{4d}\,y_{abcd} + \mu_{1a}\mu_{2b}\mu_{3c}\mu_{4d+1}\,y_{abcd+1} + \cdots + \mu_{1a+1}\mu_{2b+1}\mu_{3c+1}\mu_{4d}\,y_{a+1b+1c+1d} + \mu_{1a+1}\mu_{2b+1}\mu_{3c+1}\mu_{4d+1}\,y_{a+1b+1c+1d+1}. \qquad (5)$$
The decomposed system should have four single-input single-output systems of the form:
$$\begin{aligned}
f(x_1,0,0,0) &= f_1(x_1) = \mu_{1a}\,y_{afgh} + \mu_{1a+1}\,y_{a+1fgh},\\
f(0,x_2,0,0) &= f_2(x_2) = \mu_{2b}\,y_{ebgh} + \mu_{2b+1}\,y_{eb+1gh},\\
f(0,0,x_3,0) &= f_3(x_3) = \mu_{3c}\,y_{efch} + \mu_{3c+1}\,y_{efc+1h},\\
f(0,0,0,x_4) &= f_4(x_4) = \mu_{4d}\,y_{efgd} + \mu_{4d+1}\,y_{efgd+1},
\end{aligned} \qquad (6)$$
where the indices $e$, $f$, $g$, $h$ refer to the fuzzy sets fired at zero. We derive the conditions under which $f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) = f(x)$ is satisfied [6]; in particular, we should choose $y_{afgh} + y_{ebgh} + y_{efch} + y_{efgd} = y_{abcd}$, etc. All the constraints and the proofs can be found in [6].
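A minimal numerical sketch of this decomposition (ours, assuming the consequent constraints above are already satisfied, and using the triangular sets of Eq. (4)):

```python
import numpy as np

def memberships(x, phi):
    """Degrees of x in triangular fuzzy sets centered at phi (sorted, full
    overlap): at most two neighbouring sets fire and their degrees sum to 1."""
    mu = np.zeros(len(phi))
    j = np.searchsorted(phi, x)
    if j <= 0 or j >= len(phi):
        mu[0 if j <= 0 else -1] = 1.0     # saturate outside the universe
    else:
        w = (x - phi[j - 1]) / (phi[j] - phi[j - 1])
        mu[j - 1], mu[j] = 1.0 - w, w
    return mu

def additive_output(xs, phi, y_siso):
    """f(x) = f1(x1) + ... + fn(xn): each decomposed SISO system is a
    center-average system with consequents y_siso[i] over shared centers."""
    return sum(memberships(x, phi) @ y for x, y in zip(xs, y_siso))

# Using the controller parameters from the application example below:
phi = np.array([-3.0, -1.5, 0.0, 1.5, 3.0])
y1 = np.array([-0.0005, -0.0001, 0.0, 0.0001, 0.0005])    # consequents of f1
y2 = np.array([-0.00008, -0.00003, 0.0, 0.00003, 0.00008])
print(additive_output([0.7, -2.0], phi, [y1, y2]))
```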
2.2. Describing Function

After reducing the fuzzy system to a sum of single-input single-output fuzzy systems, we use the analytical calculation of the describing function of fuzzy systems introduced in [9]. We review the describing function of a fuzzy system without giving the proofs (the basic assumptions and proofs can be found in [9]). The membership functions are of the form of Equation
(4), and the closed form of the system is given by Equation (1) for $n = 1$. The describing function of a single-input single-output fuzzy system is then given as:
$$N(A,\omega) = N(A) = \frac{b_1}{A} = \frac{4}{A\pi}\sum_{i=0}^{d}\left\{\frac{A\,\Delta u_i}{2\,\Delta\phi_i}\bigl[(\delta_{i+1} - \sin\delta_{i+1}\cos\delta_{i+1}) - (\delta_i - \sin\delta_i\cos\delta_i)\bigr] + \frac{1}{\Delta\phi_i}(\phi_i u_{i+1} - \phi_{i+1}u_i)(\cos\delta_{i+1} - \cos\delta_i)\right\}, \qquad (7)$$
where $\Delta\phi_i = \phi_{i+1} - \phi_i$ and $\Delta u_i = u_{i+1} - u_i$, with $u_i$ the system output at the center $\phi_i$; $d$ satisfies $\phi_d \le A < \phi_{d+1}$, $d > 0$, and varies with $A$; and the $\{\delta_i\}$ are defined to be the angles where the input sinusoid $x = A\sin\delta$ intersects the centers $\{\phi_i\}$ of the membership functions. For the $\{\delta_i\}$, we have:
$$\delta_0 = 0, \qquad \delta_i = \sin^{-1}\!\left(\frac{\phi_i}{A}\right) \quad \left(i = 1,\ldots,d,\; 0 < \delta_i < \frac{\pi}{2}\right), \qquad \delta_{d+1} = \frac{\pi}{2}. \qquad (8)$$
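Equation (7) lends itself to direct evaluation. The following sketch is ours (not the authors' implementation) and assumes the sinusoid amplitude stays below the largest membership center; as a sanity check, a linear rule table must reproduce its gain.

```python
import numpy as np

def describing_function(A, phi, u):
    """Evaluate N(A) of Eq. (7). phi: ascending non-negative centers with
    phi[0] = 0; u[i]: SISO fuzzy-system output at phi[i]. Requires
    phi[d] <= A < phi[d+1] for some d."""
    phi, u = np.asarray(phi, float), np.asarray(u, float)
    d = int(np.searchsorted(phi, A, side="right")) - 1
    delta = np.append(np.arcsin(phi[: d + 1] / A), np.pi / 2)  # Eq. (8)
    g = lambda s: s - np.sin(s) * np.cos(s)
    total = 0.0
    for i in range(d + 1):
        dphi, du = phi[i + 1] - phi[i], u[i + 1] - u[i]
        total += (A * du / (2 * dphi)) * (g(delta[i + 1]) - g(delta[i])) \
               + (phi[i] * u[i + 1] - phi[i + 1] * u[i]) / dphi \
                 * (np.cos(delta[i + 1]) - np.cos(delta[i]))
    return 4.0 / (A * np.pi) * total

# Sanity check: a linear rule table u = 2*phi must give N(A) = 2.
phi = [0.0, 1.0, 2.0, 3.0]
print(describing_function(1.7, phi, [2 * p for p in phi]))   # ~2.0
```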
2.3. Stability Analysis

We use the describing function method for the stability analysis of the fuzzy systems. In this method, the describing function of the fuzzy system is considered in cascade with a linear plant with transfer function $G(s) = \frac{n(s)}{d(s)}$, which has a low-pass property. The characteristic equation of the feedback system, with the fuzzy controller replaced by the describing function $N(A)$ in cascade with the linear plant $G(s)$, is:
$$C(s) = 1 + N(A)G(s) = \frac{d(s) + N(A)\,n(s)}{d(s)}. \qquad (9)$$
The numerator of $C$ in the above equation is an interval polynomial, since $N(A)$ is real and interval-valued, depending on $A$. For the stability analysis of this interval polynomial, we use Kharitonov's theorem for real polynomials [2]. For our system in Equation (9), we need to check the Kharitonov polynomials of the characteristic equation $C$, where $N(A) \in [N_{min}, N_{max}]$. If these polynomials are found to be Hurwitz, then we conclude that our system is stable.
3. Application Example

We apply the theoretical results to the control of the Phantom haptic interface robotic manipulator. The state equations of the system are of the following form [3]:
$$\dot{x} = \begin{bmatrix} 0 & 1 \\ -\frac{K_s}{M_s} & -\frac{B_s}{M_s} \end{bmatrix} x + \begin{bmatrix} 0 \\ \frac{1}{M_s} \end{bmatrix} u, \qquad (10)$$
where $K_s = 0$, $B_s = 6.46 \times 10^{-5}$ and $M_s = 2.02 \times 10^{-5}$. We use our fuzzy system as a state feedback controller with describing functions $N_1$ and $N_2$, so the controller takes the form $u = -[N_1\; N_2]x$. Here, $N_1 \in [a_1, b_1]$ and $N_2 \in [a_2, b_2]$. The characteristic equation for the closed loop system becomes $C(s) = s^2 + \frac{B_s + N_2}{M_s}s + \frac{K_s + N_1}{M_s}$, which is an interval-valued polynomial. For this second order polynomial to satisfy Kharitonov's theorem, the lower bounds of $N_1$ and $N_2$ should be greater than zero, i.e. $a_1 > 0$ and $a_2 > 0$. We choose 5 rules, and the controller parameters after application of the additivity property ($f(x) = f_1(x_1) + f_2(x_2)$) are chosen as follows. For $f_1$, the centers of the membership functions are $\phi_{-2} = -\phi_2 = -3$, $\phi_{-1} = -\phi_1 = -1.5$, $\phi_0 = 0$, and $y_{-2} = -y_2 = -0.0005$, $y_{-1} = -y_1 = -0.0001$, $y_0 = 0$. The range of $N_1$ for these parameters is $[0.0667 \times 10^{-3}, 0.1449 \times 10^{-3}]$, which is positive. We assign the same membership functions for $f_2$, with $y_{-2} = -y_2 = -0.00008$, $y_{-1} = -y_1 = -0.00003$, $y_0 = 0$, and the range of $N_2$ is $[0.2 \times 10^{-4}, 0.2521 \times 10^{-4}]$. The stable system states that result from using this fuzzy system to control the robot manipulator are shown in Fig. 1a. When we change the assignment for $f_2$ such that $y_{-2} = -y_2 = 0.0008$, $y_{-1} = -y_1 = 0.0003$, $y_0 = 0$, the range of $N_2$ changes to $[-0.2521 \times 10^{-3}, -0.2 \times 10^{-3}]$, becoming negative. The resulting unstable system states are given in Fig. 1b.
Figure 1. (a) Stable and (b) unstable system states.
4. Conclusion

We have presented a stable fuzzy controller design approach using the describing function technique. We have applied the theoretical results to the control of the Phantom haptic interface robotic manipulator, where the simulation results agreed with the theory.
References
1. J. Aracil and F. Gordillo. Describing function method for stability analysis of PD and PI fuzzy controllers. Fuzzy Sets and Systems, in press, 2003.
2. S. P. Bhattacharyya, H. Chapellat, and L. H. Keel. Robust Control: The Parametric Approach. Prentice Hall, 1995.
3. M. C. Cavusoglu and F. Tendick. Kalman filter analysis for quantitative comparison of sensory schemes in bilateral teleoperation systems. In Proc. IEEE Int'l Conference on Robotics and Automation, May 2003.
4. S. Y. Chen, F. M. Yu, and H. Y. Chung. Decoupled fuzzy controller design with single-input fuzzy logic. Fuzzy Sets and Systems, 129:335-342, 2002.
5. F. Cuesta, F. Gordillo, J. Aracil, and A. Ollero. Stability analysis of nonlinear multivariable Takagi-Sugeno fuzzy control systems. IEEE Transactions on Fuzzy Systems, 7(5):508-520, 1999.
6. E. Gürkan. Uncertainty Modelling and Stability Analysis for 2-Way Fuzzy Adaptive Systems. PhD thesis, Middle East Technical University, 2003.
7. H. Han and C. Y. Su. Robust fuzzy control of nonlinear systems using shape-adaptive radial basis functions. Fuzzy Sets and Systems, 125:23-38, 2002.
8. A. Kandel, Y. Luo, and Y. Q. Zhang. Stability analysis of fuzzy control systems. Computers and Structures, 105:33-48, 1999.
9. E. Kim, H. Lee, and M. Park. Limit-cycle prediction of a fuzzy control system based on describing function method. IEEE Transactions on Fuzzy Systems, 8(1):11-22, 2000.
10. L. Luoh. New stability analysis of T-S fuzzy system with robust approach. Mathematics and Computers in Simulation, 59:335-340, 2002.
11. C. W. Park. LMI-based robust stability analysis for fuzzy feedback linearization regulators with its applications. Information Sciences, 152:287-301, 2003.
12. M. Sugeno. On stability of fuzzy systems expressed by fuzzy rules with singleton consequents. IEEE Transactions on Fuzzy Systems, 7(2):201-224, April 1999.
13. Y. Tang, N. Zhang, and Y. Li. Stable fuzzy adaptive control for a class of nonlinear systems. Fuzzy Sets and Systems, 104:279-288, 1999.
14. S. C. Tong, T. Wang, and J. T. Tang. Fuzzy adaptive output tracking control of nonlinear systems. Fuzzy Sets and Systems, 111:169-182, 2000.
15. L. X. Wang. Adaptive Fuzzy Systems and Control: Design and Stability Analysis. Prentice Hall, 1994.
16. S. Y. Yi and M. J. Chung. Systematic design and stability analysis of a fuzzy logic controller. Fuzzy Sets and Systems, 72:271-298, 1995.
AN INTELLIGENT ROBOT OVERVIEW FOR MEDICAL IN VITRO FERTILIZATION

JIA LU
Computer Science and Information Systems, University of Phoenix, 5050 NW 125 Avenue, Coral Springs, Florida, FL 33076, USA.
WNXIA HU
Health Science, Nova Southeastern University
[email protected]
This paper proposes an intelligent robot for In Vitro Fertilization (IVF) clinic research. Surgeon consistency is an important issue for this research. The end effector of the robot is positioned through right, left, up, and down motions at the target location. The robot is attachable to the bedside using a conventional scope holder with multiple joints. The control system is expected to enhance the overall performance of the system during the egg retrieval procedure. The auto-tracking control algorithms are demonstrated and computed for the potential assistant robot system.
1. Introduction

In 1994, Green developed a telesurgery system (Green 1994), showing the possibility of a teleoperation robot's performance reaching that of on-site open surgery. In 1995, Funda developed a system including a remote motion robot that holds a laparoscopic camera and the instruments; this robot has degrees of freedom and rotation components for the instrument-mounted joystick (Funda 1995). In 1996, IBM Research and Johns Hopkins Medical Center developed a ceiling-mounted surgical robot system for laparoscopic camera navigation (Taylor 1996). Casals proposed a camera control strategy to track the instruments during a surgical procedure based on computer vision analysis of the laparoscopic image (Casals 1996). To alleviate these problems, Cavusoglu developed a compact laparoscopic endoscope manipulator, mounted on the port of the patient, using McKibben artificial muscles (Cavusoglu 1998). A robotic arm was designed to hold the telescope with the goal of improving safety and reducing the need for a skilled camera operator (Zhang 1999). In order to enhance the performance of IVF egg retrieval surgery and to overcome the disadvantages of conventional systems, in this paper we propose an intelligent robot.
2. An Intelligent Robot

For the hardware, the workspace of an assistant robot must be chosen so as not to affect tissues during the egg retrieval procedure. An optimized workspace reduces the possible damage in case of system failure. To observe and analyze the motions over the required ranges, it is important to first simulate the egg retrieval. For the software, the programming commands are also critical for an assistant robot's motion functionality. In order to achieve high safety and adaptability over the optimal range of motion, the system follows two steps (Lu and Hu 2004): a potential assistant robot is used to minimize interference during the procedure, and the robot is used with a conventional laparoscope holder for its fixation to the bedside. The robot is mounted and positioned at various locations for the necessary views at different angles. The robot senses the shape of the eggs in the human environment. The motion reconstruction algorithm is based on calculations on a surface in Euclidean 3-space $R^3$ (Lu and Hu 2004).

Theorem 1: For the approximated tangent vector, $A$ is a regular rotation on Euclidean 3-space $R^3$ applied as $A(p + \Delta p)$, where $\Delta p$ has a small value in Eq. (1),
where $\Delta\theta$ indicates the location length of rotation $A$.

Theorem 2: For a proper definition of area in $R^3$, the coordinate patch $x$ is performed as $x: M \to N$. Let $\Delta P$ be a small coordinate rectangle in $M$ with sides $\Delta p$ and $\Delta v$, such that $x$ distorts $\Delta P$ into a small rotation region $x(\Delta P)$ in $N$ for different locations of the parameter rotation. If the location from $x(p, v)$ to $x(p + \Delta p, v)$ is linearly approximated by the vector $\Delta p$, the location from $x(p, v)$ is approximated for $x(p + \Delta p, v)$. Therefore, the region $x(\Delta P)$ is approximated by the parallelogram in the tangent plane at $x(p, v)$. We find that the parallelogram in the tangent plane coincides with the parallelogram at different points on any location. If different points on any location are determined, the points can be derived by the parallelism theorem. If the local area composed of different points is very small, the points lie on a certain location. The algorithm is proposed to derive the motion of the eggs in the environment by using a degree-of-freedom sensor. In Euclidean 3-space $R^3$, $N$-time sweeping is executed with a range sensor ($N \ge 2$). In this case, all locations of the human must pass through $B$ (body common location). If there is a large number of points on the egg retrieval path, these points can be expressed in $C$ as the
motion position vector. The performance of the proposed algorithm can be evaluated through several simulations. An important step for the robotic system is motion tracking control. This control can use a powerful mixed-integer quadratic programming algorithm. The optimality criterion for the robotic system can use the following Eq. (2) (Yin 2003):
$$I = [v^T(t)\;\; v^T(t+1)\; \ldots\; v^T(t+N-1)], \qquad (2)$$
where $I$ defines the index function for the optimal solution over $N$ time steps of horizon optimization. A larger $N$ yields better control performance while increasing the computational burden exponentially, so $N$ is restricted to a small value. The desired operation of the egg retrieval procedure may not be reasonably accomplished by only one run of the $N$-step optimization, so we consider receding the horizon and repeating the optimization. The process continues until the desired manipulation is achieved. The time sequence of continuous and discrete logical solutions corresponds to the optimal motion. However, the desired model recedes the horizon for easier computation when the computation time is large. The following approach (Yin 2003) is applicable to the control of the fast-varying dynamics of the robot system. The main controller ran on a PC in order to perform the egg retrieval. The board acquired the egg data from a 3-CCD camera on the tip of the robotic section. To control the positions of the motors, a PID controller was used. The control algorithms analyzed the images obtained from the retrieval board for the control. Since the hands of the egg retrieval surgeon are occupied by surgical tools, PC commands such as 'right', 'left', 'down', and 'up' were used for controlling the surgical robot. The locations of surgeons varied depending on the surgical type. The patient was at a fixed position, and the camera operator changed to different views during the procedure. The surgeons in actual surgery required an assistant robot capable of visualizing the entire patient environment during the procedure. We determined the ranges of motion to be from -45 to 45 degrees for each direction.

3. Simulations

The result of the motion estimation process was capable of computing the motion among these positions. The egg motion compensation kept the retrieval sequence stabilized. According to the proposed algorithms, motion estimation and correction were performed for the egg retrieval sequence. For the real image test, we acquired image frames for implementation in the system. The image frame sequence was of size 340 x 346 pixels and contained 45 frames. We chose the rotated region of pixels 200-300 x 200-300 of the frame as the active block for estimation. Eggs remained in the fallopian tube for a while, and the robot was propelled toward the uterus by its arm through different motions at the locations. The auto-tracking algorithm was used for the simulation. We used the desired rotated location from the robot to obtain the optimal trajectory for egg retrieval. We showed that simulating the manipulator kinematics graphically was more efficient.

4. Conclusions
Trajectory search planning often involves only the kinematic analysis, and the actual dynamics of the robot are usually ignored. The obstacle avoidance strategy for the end-effector and the manipulator's joints is the key for an assistant robot. Joint space can be considered for manipulator dynamics based on the inputs and the output of the trajectory planning algorithm, in terms of a time sequence of the values attained by position, velocity, and acceleration. Therefore, we think this approach may be effective if deviations from the linearization point are small, or alternatively if different linearized models are used as the assistant robot moves along its trajectory.
References
1. P. S. Green, Telepresence surgery demonstration system. Proc. of the Conf. on Robotics and Automation, pp. 2302-2307 (1994).
2. J. Funda, A telerobotic assistant for laparoscopic surgery. IEEE EMBS Magazine, Special Issue on Robotics in Surgery, pp. 279-291 (1995).
3. R. H. Taylor, An overview of computer-integrated surgery at the IBM T. J. Watson Research Center. IBM, Vol. 40, No. 2, pp. 163-183 (1996).
4. A. Casals, Automatic guidance of an assistant robot in laparoscopic surgery. Proc. of the Conf. on Robotics and Automation, pp. 895-900 (1996).
5. M. C. Cavusoglu, Laparoscopic telesurgical workstation. Proc. of the SPIE International Symp. on Biological Optics, pp. 296-303 (1998).
6. J. M. Zhang, A Flexible New Technique for Camera Application in Internet, Technical Report MSR-TR (1999).
7. Y. J. Yin, On a Novel Hybrid System Control Framework of Intelligent Robots. ICSSSE, Hong Kong, pp. 535-540 (2003).
8. J. Lu and Y. X. Hu, A potential assistant robot for IVF retrieval. Proc. IEEE SoutheastCon, NC, U.S.A. (2004).
SELF-ADAPTATION OF THE SYMBOLIC WORLD MODEL OF A MOBILE ROBOT: AN EVOLUTION-BASED APPROACH*

CIPRIANO GALINDO, JUAN-ANTONIO FERNANDEZ-MADRIGAL, JAVIER GONZALEZ
System Engineering and Automation Department, University of Malaga, Spain
A world model adapted to changes in the robot's environment, as well as to changes in its operations, permits a situated agent, i.e. a mobile robot, to properly plan and execute its tasks within its workspace. Such adaptation of the symbolic world model involves two problems: maintaining the world model coherent with the environment information, and optimizing the model to perform the robot tasks efficiently. This paper presents a first step towards the self-adaptation of the symbolic world model of a mobile robot with respect to both aspects. The paper then focuses on the optimization subproblem, proposing an evolution-based approach to keep the symbolic model of the robot optimized over time with respect to the agent's current operational needs (tasks).
1. Introduction

A situated agent must have a certain model of its workspace in order to perform deliberative actions, i.e., planning tasks. In particular, mobile robots that are intended to perform autonomously usually work within highly dynamic and large-scale environments, possibly performing tasks that also vary over time. The success of the agent in these situations largely depends on maintaining its internal model updated and coherent with the real environment. We call this self-adaptation of the symbolic model of the agent, and distinguish two parts of the problem: firstly, achieving coherence between the model and the real environment in order to plan and execute tasks correctly (the coherence subproblem); secondly, tuning the model in order to improve the efficiency of task planning within the particular environment and operational needs of the robot (the optimization subproblem). The coherence subproblem is in fact the problem of anchoring, profusely addressed in the recent literature ([1],[2]). Anchoring deals with connecting symbols of the world model of an agent with its sensory information. The anchoring process usually involves checking for new anchors, that is, relations between symbols and sensory information, and maintaining previously
* This work was supported by the Spanish Government under research project DPI2002-01319. E-mails: {cipriano, jafma, jgonzalez}@ctima.uma.es
found anchors coherent with the new information gathered by the robot sensors. In this paper, the problem of anchoring is not addressed, although a general scheme is proposed into which future work on that issue can fit. Concerning the optimization subproblem, we have found less work in the literature ([5],[9]). Elsewhere, we have proposed an approach based on the psychological idea that the human brain arranges environmental information in a way that improves one's efficiency in performing tasks [4]. That approach, called the task-driven paradigm, is intended to improve over time the performance of the set of operations carried out by the robot. To our knowledge, considering both subproblems within a single framework has not been addressed before. However, we believe that a situated agent must face both of them in order to obtain the greatest benefit in its operations. In this paper, we propose a general system called ELVIRA in which both subproblems can be integrated. ELVIRA has been tested in simulation within large-scale environments modelled through graph structures and under variations in the tasks that the agent must carry out. It has yielded valuable results concerning the improvement of self-adaptation of the symbolic model of the robot. The optimization subproblem is approached by ELVIRA through an evolution-based method. In general, evolutionary approaches, like genetic algorithms, may not seem appropriate for dynamic systems. Nevertheless, in some particular situations, they can be adapted to any-time schemes, since they can provide approximate solutions that improve over time ([5],[9]). Our experiments clearly show that the evolutionary optimizer of ELVIRA achieves a high degree of adaptation of the symbolic model of the robot with respect to its tasks. Next, the symbolic world model used in this work is introduced. Section 3 gives a description of ELVIRA. In Section 4 some discussion and experimental results are presented. Finally, conclusions and future work are outlined.

2. The Hierarchical World Model
Within the ELVIRA system, the symbolic model of the environment plays a central role. Assuming that our agent works in large-scale space, the model should be suitably arranged to cope with a potentially huge amount of information. As stated in the literature, humans widely use a mechanism called abstraction ([7],[8]) to model their environments. Abstraction works by establishing different levels of detail in the symbolic data, grouping symbols into more general ones and considering them as new symbols that can be abstracted again. The result is a hierarchy of abstraction that ends when all information is modelled by a single universal symbol. In order to manage abstractions we use a mathematical model called the AH-graph [4]. An AH-graph is a graph representation that includes hierarchical information
organized in different layers (see Fig. 1). These layers are called hierarchical levels and are isolated from one another; they consist of flat graphs whose nodes represent elements of the environment and whose arcs represent relations between them. The lowest hierarchical level is called the ground level, and it represents the environment with the maximum amount of detail that is available. The highest hierarchical level is called the universal level. In our particular approach, the AH-graph represents the topology of space. Thus, nodes represent distinctive places for robot navigation and localization, while arcs indicate the possibility of navigating between these places. The AH-graph model has demonstrated a significant computational improvement in route planning [3] and robot task-planning [6]. More details about the AH-graph model can be found in [4].
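A minimal sketch of the AH-graph as a data structure (ours, not the authors' implementation), with an abstraction step that collapses clusters of nodes into the next level:

```python
class AHGraph:
    """Hierarchy of flat graphs: levels[0] is the ground level; parent[k]
    maps each node of level k to the abstract node of level k+1 that
    groups it, up to a single universal node at the top."""
    def __init__(self, ground):
        self.levels = [ground]   # each level: {node: set of neighbours}
        self.parent = []         # parent[k]: level-k node -> level-(k+1) node

    def abstract(self, clustering):
        """Build the next level by collapsing clusters of the current one."""
        top = self.levels[-1]
        graph = {c: set() for c in set(clustering.values())}
        for node, nbrs in top.items():
            for nbr in nbrs:
                if clustering[node] != clustering[nbr]:
                    graph[clustering[node]].add(clustering[nbr])
        self.parent.append(clustering)
        self.levels.append(graph)

# Ground level: four distinctive places in two rooms joined by arc b-c.
g = AHGraph({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}})
g.abstract({'a': 'room1', 'b': 'room1', 'c': 'room2', 'd': 'room2'})
g.abstract({'room1': 'world', 'room2': 'world'})   # universal level
print(g.levels[1])   # {'room1': {'room2'}, 'room2': {'room1'}}
```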
Fig. 1. An example of an AH-graph. (a) A schematic map of a real environment; distinctive places for robot navigation are marked with small rhombuses. (b) Ground-level topology of distinctive places. (c) Upper levels of the hierarchy.
3. ELVIRA

We have developed a system called ELVIRA for producing plans while adapting the symbolic world model of an agent to changes in both the world and the tasks that the agent must face. As shown in Fig. 2, ELVIRA has two inputs: the task to be performed by the robot and the environment information gathered by its sensors. It outputs a plan that solves the currently requested task and an adapted hierarchical model of the current world. ELVIRA comprises four processes. Three of them (anchoring, planner, and optimizer) are periodically executed, possibly with time-variant periods. This sampling implies that the hierarchical world model may be temporarily inconsistent with the environment. However, in this work we do not deal with such inconsistency, since we assume that the environment remains unaltered while the robot plans and executes tasks. In the following, the components of ELVIRA are described in detail.
- Anchoring Keeper. It gathers world information captured by the robot sensors and updates the current world model (only the ground level). The anchoring
relations must be continuously updated during robot operation in such a way that each symbol unequivocally represents a world element. Such an anchoring process is out of the scope of this paper.
- Task Planner. It uses the current hierarchical world model to efficiently plan tasks [6]. From the number of times that each task has been requested up to the moment, this process maintains an estimation of the arrival probabilities of the tasks. These probabilities are provided to the Hierarchical Optimizer to guide the optimization search, as explained below.
- Hierarchical Optimizer. It is based on a genetic algorithm that improves the current model of the robot's environment to reduce the computational cost of planning tasks. The genetic population encodes the minimum number of parameters needed to generate a set of potential models (AH-graphs) upon the ground level maintained by the anchoring process. Genetic individuals are evaluated using a cost function that weights the cost involved in planning the tasks by the estimated probabilities given by the task planner; that is, we give more weight to the more frequent tasks. Thus, the symbolic world model adapts better to frequently requested tasks, to the detriment of infrequent ones.
- Hierarchy Constructor. From the information encoded in the best individual given by the optimizer and the current ground level information, this process constructs the current best hierarchical world model for the robot. The construction is based on a graph clustering algorithm (see [4] for more detail).
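The cost function used by the Hierarchical Optimizer can be stated compactly; the sketch below is ours, with hypothetical names, and shows the frequency-weighted planning cost that scores each genetic individual (lower is better).

```python
def individual_cost(hierarchy, tasks, est_prob, planning_cost):
    """Score of one genetic individual: the cost of planning each known task
    on the candidate hierarchy (e.g. number of explored arcs), weighted by
    the task-arrival probability estimated by the Task Planner."""
    return sum(est_prob[t] * planning_cost(hierarchy, t) for t in tasks)
```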
Fig. 2. The ELVIRA system. ELVIRA is fed with the information gathered by the robot sensors and the requested tasks. It yields the best known hierarchical world model, adapted to the agent tasks and its environment, and the resulting plan for the requested tasks.
4. Discussion of Experimental Results

In order to evaluate the suitability of ELVIRA for adapting the world model of a robot, we have conducted a variety of tests in large-scale environments and with different numbers of tasks to plan. This section presents an illustrative example of
planning five different navigation tasks within an environment of more than 500 distinctive places modeled by an AH-graph. A navigation task consists of planning a path between two distinctive places represented by nodes of the AH-graph. In this scenario, while ELVIRA is adapting the AH-graph through the genetic Hierarchical Optimizer, the robot is requested to plan a new navigation task (out of the five possible) every ten cycles of the optimizer (see Fig. 3a). Fig. 3 shows the self-adaptation achieved by ELVIRA in the symbolic model of the world for this experiment. Here, the cost of planning a task is calculated by weighting the number of explored arcs of the AH-graph to find the shortest path by the frequency of the task. By doing so, we provide a measure that accounts for both the cost of task planning and the probability of dealing with a particular task (a very costly but infrequent task should not influence the model to a great extent). Fig. 3a shows the evolution of the planning cost along the optimization steps from the point of view of the robot (which may not know all possible tasks). On the other hand, Fig. 3b measures the same model adaptation using the ground-truth frequency to calculate the cost of task planning. Finally, Fig. 3c indicates the evolution of the task frequencies calculated by the robot, which tend to the ground-truth frequencies as time tends to infinity.
Fig. 3. (a) Evaluation of the symbolic model adapted by ELVIRA; sharp increments are due to the arrival of new tasks not previously known by the agent. Arrivals of tasks are denoted by a letter (from A to E). (b) Evaluation of the adapted model considering the ground-truth frequency of tasks (A: 30%, B: 20%, C: 20%, D: 20%, and E: 10%). (c) Quadratic error between the frequency estimated by the robot and the real frequency, over time.
In Fig. 3a two stages of the self-adaptation process can be distinguished. At first, when the agent has no knowledge of all possible tasks, the fitness of its symbolic model oscillates substantially. As new tasks are requested, the agent
increases its knowledge of their frequency, so the oscillations diminish until a steady state is reached, approximately at the 180th step of the optimizer cycle. Notice how dealing with a new task not previously considered by the robot may sharply increase the cost of planning, since the robot world model was not tested against the new task. This occurs frequently in the transitory state in Fig. 3a (before step 200), as shown by the abrupt rises of the planning cost. However, observe that immediately after a new task arrives, the optimizer starts to improve the planning cost, which reflects the self-adaptation to the new situation.

5. Conclusions and Future Work
This paper has proposed a system to adapt the symbolic world model of a situated agent to changes in its environment and operations. Our work focuses on the evolutionary optimization of the world model of the agent in order to improve efficiency in planning tasks, which has exhibited promising results. In the future we plan to apply ELVIRA to a real mobile robot, which requires addressing the anchoring process. We also intend to extend the symbolic world model of the robot to a multi-hierarchical arrangement to further improve the model adaptation (as shown in [3]).

References
1. Bonarini A., Matteucci M., Restelli M. Concepts for Anchoring in Robotics. AI*IA: Advances in Artificial Intelligence. Springer-Verlag, 2001.
2. Coradeschi S. and Saffiotti A. An Introduction to the Anchoring Problem. Robotics and Autonomous Systems, Vol. 43, No. 2-3, 2003.
3. Fernandez J.A. and Gonzalez J. Multihierarchical Graph Search. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, 2002.
4. Fernandez J.A. and Gonzalez J. Multi-Hierarchical Representation of Large-Scale Space. Kluwer Academic Publishers, 2001.
5. Floreano D. and Mondada F. Evolution of Homing Navigation in a Real Mobile Robot. IEEE Trans. on Systems, Man, and Cybernetics, Vol. 26, 1996.
6. Galindo C., Fernandez J.A., and Gonzalez J. Hierarchical Task Planning through World Abstraction. To appear in IEEE TRA.
7. Hirtle S.C. and Jonides J. Evidence of Hierarchies in Cognitive Maps. Memory and Cognition, Vol. 13, No. 3, 1985.
8. Kuipers B.J. The Spatial Semantic Hierarchy. AI, Vol. 119, 2000.
9. Nordin P., Banzhaf W., and Brameier M. Evolution of a World Model for a Miniature Robot using Genetic Programming. Robotics and Autonomous Systems, Vol. 25, 1998.
MULTIPLE OBJECTIVE GENETIC ALGORITHMS FOR AUTONOMOUS MOBILE ROBOT PATH PLANNING OPTIMIZATION

OSCAR CASTILLO, LEONARDO TRUJILLO, and PATRICIA MELIN
Dept. of Computer Science, Tijuana Institute of Technology, Tijuana, B.C., Mexico
[email protected]
This paper describes the use of a Genetic Algorithm (GA) for the problem of Offline Point-to-Point Autonomous Mobile Robot Path Planning. The problem consists of generating "valid" paths or trajectories for the robot to use to move from a starting position to a destination across a flat map of a terrain, represented by a 2-dimensional grid, with obstacles and dangerous ground that the robot must evade. This means that the GA optimizes possible paths based on two criteria: length and difficulty. First, we decided to use a Conventional GA to evaluate its ability to solve this problem (using only one criterion for optimization); since we want to optimize paths under the two criteria or objectives, we then extended the Conventional GA to implement the ideas of Pareto optimality, making it a Multiple Objective Genetic Algorithm (MOGA). We present useful performance measures and simulation results of the Conventional GA and of the MOGA that show that both genetic algorithms are effective tools for solving the point-to-point robot path planning problem.
1. Introduction

The problem of Mobile Robot Path Planning is one that has intrigued researchers and received much attention in Robotics, since it is at the essence of what a mobile robot needs to be considered truly "autonomous". A mobile robot must be able to generate collision-free paths to move from one location to another, and in order to truly show a level of intelligence these paths must be optimized under some criteria that are important to the robot and the given terrain. Genetic algorithms and evolutionary methods have been used extensively to solve the path planning problem, as in [1], [2], [3], and [4], but this paper uses as its basis for comparison and development the work done by Sugihara [5]. In that work, a grid representation of the terrain is used, and different values are assigned to the cells in the grid to represent different levels of difficulty for the robot to traverse a particular cell; it also presents a codification of monotone paths for the solution of the path planning problem. The conventions used in [5] are also used in this paper, but that is where the similarities end. The next sections show a comparison of the two methods and how they differ, but first we present a simplified version of the path-planning problem. This version uses only a binary representation of a terrain, intended to represent solid obstacles and clear cells for the robot, and in which paths are optimized only under the criterion of length.
2. Genetic Algorithms
A Genetic Algorithm is an evolutionary optimization method used to solve, in theory, "any" possible optimization problem. A GA [11] is based on the idea that a solution to a particular optimization problem can be viewed as an individual whose characteristics can be coded into a finite set of parameters. These parameters are the genes, or the genetic information, that make up the chromosome representing the real-world structure of the individual, which in this case is a solution to the optimization problem. Because the GA is an evolutionary method, a repetitive loop, or series of generations, is used to evolve a population S of p individuals in order to find the fittest individual for the given problem. The fitness of each individual is determined by a given fitness function that evaluates the level of aptitude a particular individual has for solving the optimization problem. Each generation of the genetic search produces a new set of individuals using genetic operators, Crossover and Mutation, which are governed by the crossover rate γ and the mutation rate μ, respectively. These operators produce new child chromosomes with the intention of improving the overall fitness of the population while maintaining a global search of the solution space. Individuals are selected for genetic operations using a Selection method intended to pick the fittest individuals as parent chromosomes for the Crossover and Mutation operations. Finally, the newly generated child chromosomes are reinserted into the population using a Replacement method. This process is repeated for k generations. The Simple or Conventional GA [11] is known to have the following set of common characteristics: a constant number of p individuals in the genetic search population; a constant-length binary string representation for the chromosome; one- or two-point crossover and single-bit mutation operators, with constant values for μ and γ; a Roulette Wheel (SSR) Selection method; and a Complete or Generational Replacement method, possibly combined with an Elitist strategy.
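To make the loop concrete, the sketch below implements the Conventional GA just described in Python. It is a minimal sketch under our own assumptions: the chromosome length, the default parameter values and the requirement that the supplied fitness function return positive scores are ours, not the paper's.

```python
import random

def conventional_ga(fitness, chrom_len=32, p=100, k=500, gamma=0.7, mu=0.01):
    """Simple GA: roulette-wheel selection, one-point crossover,
    single-bit mutation, generational replacement with elitism.
    `fitness` must return a positive score for a list of 0/1 genes."""
    pop = [[random.randint(0, 1) for _ in range(chrom_len)] for _ in range(p)]

    def roulette(scores, total):
        r, acc = random.uniform(0, total), 0.0
        for chrom, s in zip(pop, scores):
            acc += s
            if acc >= r:
                return chrom
        return pop[-1]

    for _ in range(k):                      # k generations
        scores = [fitness(c) for c in pop]
        total = sum(scores)
        best = max(pop, key=fitness)        # kept by the elitist strategy
        children = [best]
        while len(children) < p:
            a, b = roulette(scores, total), roulette(scores, total)
            if random.random() < gamma:     # one-point crossover
                cut = random.randrange(1, chrom_len)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (a, b):
                child = list(child)
                if random.random() < mu:    # single-bit mutation
                    child[random.randrange(chrom_len)] ^= 1
                children.append(child)
        pop = children[:p]
    return max(pop, key=fitness)
```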
2.1 Multiple Objective Genetic Algorithms
Real-world problem solving commonly involves [12] the optimization of two or more objectives at once; a consequence of this is that it is not always possible to reach a solution that is optimal with respect to all of the objectives evaluated individually. Historically, a common way to solve multi-objective problems has been a linear combination of the objectives, thereby creating a single objective function to optimize [7], or the conversion of the objectives into restrictions imposed on the optimization problem. With regard to
evolutionary computation, [13] proposed the first implementation of a multi-objective evolutionary search. The methods proposed in [9] and [10] all center around the concept of Pareto optimality and the Pareto optimal set. Using these concepts of optimality for individuals evaluated under a multi-objective problem, they each propose a fitness assignment for each individual in the current population during an evolutionary search, based upon the Pareto concepts of dominance and non-dominance, where the definition of dominance is stated as follows:

Definition 2.1: For an optimization problem with n objectives, solution u is said to be dominated by solution v if

∀i = 1, 2, ..., n,  f_i(v) ≥ f_i(u),   (1)
and
∃j = 1, 2, ..., n,  f_j(v) > f_j(u).   (2)
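As an illustration, Definition 2.1 translates directly into code. The sketch below is ours, using the maximization convention of inequalities (1) and (2); it also extracts the non-dominated (Pareto) set of a population of objective vectors.

```python
def dominates(v, u):
    """True if solution v dominates solution u (objective vectors)."""
    no_worse = all(fv >= fu for fv, fu in zip(v, u))        # condition (1)
    strictly_better = any(fv > fu for fv, fu in zip(v, u))  # condition (2)
    return no_worse and strictly_better

def pareto_set(population):
    """Members of `population` not dominated by any other member."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]

# e.g. pareto_set([(1, 5), (2, 4), (2, 5), (0, 0)]) -> [(2, 5)]
```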
2.2 Triggered Hypermutation
In order to improve the performance of a GA, several techniques are available, such as [11] expanding the memory of the GA in order to create a repertoire for responding to unexpected changes in the environment. Another technique, used to improve the overall speed of convergence of a GA, is a Triggered Hypermutation Mechanism [8], which uses the mutation rate as a control parameter to improve performance in a dynamic environment. The GA is modified by adding a mechanism through which the value of μ is changed in response to a dip in the fitness of the best solution in each generation of the genetic search. In this way, μ is increased to a high hypermutation value each time the top fitness value of the population at generation k dips below some lower limit set beforehand; this causes the search space to be explored at a higher rate thanks to the higher mutation rate. Conversely, μ is set back to a more conventional lower value once the search is closing in on an appropriate optimal solution.
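The trigger itself reduces to a small rule. In this sketch the threshold and the two rates are illustrative placeholders rather than the values used in [8].

```python
def triggered_mutation_rate(best_fitness, lower_limit,
                            mu_base=0.01, mu_hyper=0.25):
    """Return the mutation rate for the next generation: switch to the
    high hypermutation value while the best fitness of the current
    generation sits below the preset lower limit."""
    return mu_hyper if best_fitness < lower_limit else mu_base
```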
3. Conventional GA with One Optimization Criterion
The GA used to solve the path planning problem on a binary representation of the terrain is the Simple or Conventional GA, as it is known in the Evolutionary Computing area (one-point crossover, Roulette Wheel Selection, single-bit binary mutation, complete replacement). Based on experimental results, however, some modifications were made; the most important one was the inclusion of a Triggered Hypermutation Mechanism [8], which drastically improves the overall performance of the GA. Table 1 synthesizes the simulation results of the best configuration of the GA with
terrains represented by n×n grids with n = 16, 24, 32, each with 100 runs, 500 generations per search and randomly generated maps.
Table 1. Simulation results for the Conventional GA

n    Population    Mutation         Replacement     % Ideal Fitness    Valid solutions
16   100           Hypermutation    Elitism         98%                98%
24   100           Hypermutation    Generational    98.3%              96%
32   100           Hypermutation    Generational    97.2%              88%
Note: the % Ideal Fitness column expresses the percentage of the ideal fitness for a grid configuration (the fitness of a path on a map with zero obstacles) that the best solution of a genetic search reaches.
4. MOGA with Two Optimization Criteria
The complete solution we want for the path planning problem includes a terrain with not only free spaces and solid obstacles, but also difficult terrain that a robot should avoid when possible, making it a multiple objective optimization problem [9], [10]. A direct comparison between the MOGA used here and the GA proposed by [5] is made in Table 2.

Table 2. Sugihara and MOGA methods

                      Sugihara                          MOGA
Paths                 Monotone                          Monotone
Aptitude              Linear combination                Pareto optimality
Repair mechanism      Out of bounds                     Out of bounds, collisions and incomplete paths
Genetic operators     One-point crossover and           One-point crossover and
                      single-bit binary mutation        single-bit binary mutation
Selection method      Roulette Wheel                    Roulette Wheel with Tournament
Replacement method    Generational                      Generational, and Elitist strategy
Termination           Max. generations                  Max. generations
Using the benchmark test of Figure 1 presented in [5], [6] and [7], and the performance measure of probability of optimality L_opt(k), we compare the two methods and show the results in Table 3.

Table 3. Simulation results for the MOGA

                     Population    No. of generations k    Mutation rate μ    Crossover rate γ    Win probability w    Probability of optimality L_opt
Sugihara             30            1000                    0.04               0.8                 0.95                 45%
MOGA Generational    60            500                     0.05               0.8                 Not applicable       44%
MOGA Elitism         200           150                     0.09               0.9                 Not applicable       76%
Figure 1. Benchmark test.
5. Conclusions
We describe in this paper the application of a MOGA approach for autonomous robot path planning. We have compared our simulation results with other
existing evolutionary approaches to verify the performance of our method and the implementation. Simulation results show the feasibility of using genetic algorithms for optimizing the paths that can be used by autonomous mobile robots.
Acknowledgments
We would like to thank the Research Grant Committee of COSNET for the financial support given to this research project (under grant 424.03-P). We would also like to thank CONACYT for the scholarship given to the student who participated in this research (Leonardo Trujillo).
References
1. Xiao, Michalewicz, "An Evolutionary Computation Approach to Robot Planning and Navigation", Soft Computing in Mechatronics, pp. 117-128, 2000.
2. Ali, Babu, Varghese, "Offline Path Planning of Cooperative Manipulators Using Co-Evolutionary Genetic Algorithm", 2002.
3. Farritor, Dubowsky, "A Genetic Planning Method and its Application to Planetary Exploration", pp. 1-3, 2002.
4. Sauter, Mathews, "Evolving Adaptive Pheromone Path Planning Mechanisms", First International Conference on Autonomous Agents and Multi-Agent Systems, pp. 1-2, 2002.
5. Sugihara, Kazuo, "Genetic Algorithms for Adaptive Planning of Path and Trajectory of a Mobile Robot in 2D Terrains", IEICE Trans. Inf. & Syst., Vol. E82-D, pp. 309-313, 1999.
6. Sugihara, Kazuo, "A Case Study on Tuning of Genetic Algorithms by Using Performance Evaluation Based on Experimental Design", 1997.
7. Sugihara, Kazuo, "Measures for Performance Evaluation of Genetic Algorithms", Proc. 3rd Joint Conference on Information Sciences, Research Triangle Park, NC, Vol. I, pp. 172-175, 1997.
8. Cobb, "An Investigation into the Use of Hypermutation as an Adaptive Operator in Genetic Algorithms Having Continuous, Time-Dependent Nonstationary Environments", 1990.
9. Fonseca, Fleming, "Genetic Algorithms for Multiobjective Optimization: Formulation, Discussion and Generalization", 5th Int. Conf. on Genetic Algorithms, pp. 416-423, 1993.
10. Srinivas, Deb, "Multiobjective Optimization Using Non-Dominated Sorting in Genetic Algorithms", Evolutionary Computation, pp. 221-248, 1994.
11. Castillo, Oscar and Melin, Patricia, Soft Computing for Control of Non-Linear Dynamical Systems, Springer-Verlag, Heidelberg, Germany, 2001.
12. Castillo, Oscar and Melin, Patricia, Soft Computing and Fractal Theory for Intelligent Manufacturing, Springer-Verlag, Heidelberg, Germany, 2003.
FINE TUNING FOR AUTONOMOUS VEHICLE STEERING FUZZY CONTROL
J. EUGENIO NARANJO, CARLOS GONZALEZ, RICARDO GARCÍA, TERESA DE PEDRO, JAVIER REVUELTO
Instituto de Automática Industrial, Consejo Superior de Investigaciones Científicas, Ctra. Campo Real Km. 0,200, 28500 La Poveda, Arganda del Rey, Madrid, Spain. Phone: +34 918711900. Fax: +34 918717050.
The application of fuzzy logic-based control is a successfully proven way of modeling human behavior. These techniques are very useful, particularly in the automated vehicle field, where the elements to be controlled are mostly very complex and cannot be described by a linear model. This paper presents a control system for vehicle steering. We have developed a set of controllers, which have been fine-tuned according to human experience and a study of human driving. The comprehension and interaction of the input control variables are very important factors during control adjustment. These controllers have been used and tested on human-like automatic driving routes, yielding a robust and repeatable system.
1. Introduction
Our work addresses the intelligent transportation systems field, one of whose research topics is automatic driving systems; our research focuses on this topic. We design and implement control systems to automate vehicle actuators, that is, the steering wheel, throttle and brake pedal, and test these actuators on real vehicles and real roads in route tracking experiments [1]. Techniques based on classical control [2] as well as AI-based control, like neural networks [3] or fuzzy logic [4], are used for car steering wheel control. In this case, we use fuzzy logic-based controllers, which were demonstrated by M. Sugeno to be very well suited for mobile robot and industrial automation applications [5] in the early 1990s. Another reason for choosing fuzzy logic is that the steering of a car does not have a clearly linear mathematical model. Next, we describe part of the work done within the Autopia program [6], a set of Intelligent Transportation Systems research projects in which we develop our work.
2. Building the Fuzzy Controller
The basic application of an automatic steering control is to keep a vehicle in the right lane of a road, without sharp unnecessary turns and overactuations that could cause a hazard. In our case, we have built a private driving zone, laid out to resemble a built-up area, where we run the experiments. Each street of the
circuit has been modeled using high-precision RTK-DGPS positions. We use these data to represent the streets as a series of straight segments. Two mass-produced Citroen Berlingo vans have been automated and equipped with another GPS receiver and an onboard computer that houses the control system. Map matching techniques are used to get the car to automatically go along defined routes, for which the control system calculates the necessary input information. Two variables have been defined to control the steering: lateral and angular errors. The lateral error is the distance from the reference line to the car, and the angular error represents the deviation of the car direction vector from the same reference line. Human drivers also use these inputs, and they can be modeled as fuzzy variables in our automatic system. Another important consideration is that driving along a straight road is a far cry from driving along a road with bends. Steering wheel movements should be slow and very slight along straight stretches of road, where we should avoid sharp turns that could make us get out of lane. When driving around bends, however, the movements of the steering wheel should be wide and fast; but if we turn the steering wheel as far as we can very quickly on a bend, we are likely to lose control of the vehicle and have an accident. This behavior also has to be considered at control system design time.
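For illustration, the two input variables can be computed from the car pose and the current street segment as in the sketch below; the function name and the representation of a segment as two endpoints are our assumptions, not the Autopia implementation.

```python
import math

def steering_errors(x, y, heading, a, b):
    """Lateral error: signed distance from the reference line a->b to the
    car. Angular error: deviation of the car heading from that line.
    a and b are (x, y) endpoints of the street segment; angles in radians."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    lateral = (dx * (y - a[1]) - dy * (x - a[0])) / math.hypot(dx, dy)
    angular = (heading - math.atan2(dy, dx) + math.pi) % (2 * math.pi) - math.pi
    return lateral, angular
```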
2.1. Defining the Steering Fuzzy Controller
The elements that make up a fuzzy controller are inputs, outputs and rules. Two controllers have been defined, one for straight road driving and another for bend tracking, which are selected alternately depending on the features of the road and the status of the vehicle.
2.1.1. Input
The fuzzy control input is composed of the system variables. The fuzzification method transforms their crisp values into granulated fuzzy values. For this purpose, we define membership functions for the two input variables, as shown in Figure 1. Note that there are two fuzzy sets for each variable, corresponding to the straight and bend controllers. Although the names of the labels for each variable are the same, their meanings are not. Right and Left are the linguistic labels of each membership function. They indicate whether the car is pointing to the left or to the right of the reference line. This information is used to take the appropriate control actions. Figure 1 also shows that the slopes of the straight membership functions (b, d) are higher than the bend slopes (a, c). This definition represents human behavior in the two driving modes, which is very fast and reactive in straight lane driving and slower when driving around bends.
Figure 1. Input membership functions. a) Angular error in bend tracking (degrees, vertices at ±63). b) Angular error in straight tracking (degrees, vertices at ±2). c) Lateral error in bend tracking (meters, vertices at ±3). d) Lateral error in straight tracking (meters, vertices at ±0.8).
2.1.2. Rules
Only four rules are needed for the lateral control of a vehicle. We have chosen Mamdani [7] operators, and the inference rules are built like Sugeno-Takagi [8] rules, forming very intuitive but powerful sentences. The fuzzy variables are represented in bold and the labels for these variables in italics.
IF Angular-Error Left THEN Steering Right
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Left THEN Steering Right
IF Lateral-Error Right THEN Steering Left
2.1.3. Output
The output value is generated using Sugeno's singletons. Two labels (Left and Right) have been defined for managing the steering wheel turn. There are also separate definitions for bend and straight line driving; unlimited turning of the steering wheel is allowed for tracking bends, the output being defined between 0 and 1. In straight road tracking, steering wheel turns must be very limited, and we emulate this behavior by confining the possible values of the Steering variable to the [0, 0.025] interval. This means that the car can turn at most 2.5% of the total steering wheel rotation. The defuzzification method is the weighted average (1), where o is the output value and S·w_i is the steering value inferred by each rule i:

o = (Σ_i S·w_i) / (Σ_i w_i)   (1)

Specifically, S = 1 for driving around bends and S = 0.025 for straight driving. Finally, w_i is the weight of rule i, that is, the extent to which the current input values satisfy the conditions of rule i.
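The four rules and the weighted-average defuzzification fit in a few lines. The sketch below is ours: it assumes that errors to the left of the reference line are negative, takes the label vertices from the axis ranges of Figure 1, and signs the singleton S positive for a right turn.

```python
def steering_command(angular, lateral, bend_mode):
    """Four-rule fuzzy steering with weighted-average defuzzification (1)."""
    S = 1.0 if bend_mode else 0.025                    # singleton magnitude
    va, vl = (63.0, 3.0) if bend_mode else (2.0, 0.8)  # label vertices

    def right(e, vertex):                              # membership of Right
        return max(0.0, min(1.0, e / vertex))

    def left(e, vertex):                               # membership of Left
        return max(0.0, min(1.0, -e / vertex))

    rules = [
        (left(angular, va), +S),   # IF Angular-Error Left  THEN Steering Right
        (right(angular, va), -S),  # IF Angular-Error Right THEN Steering Left
        (left(lateral, vl), +S),   # IF Lateral-Error Left  THEN Steering Right
        (right(lateral, vl), -S),  # IF Lateral-Error Right THEN Steering Left
    ]
    num = sum(w * s for w, s in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0
```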
2.1.4. Controller Tests
This version of the fuzzy controller has been installed in the test-bed vehicle to test its performance. We plotted an automatic route in which straight segments of road alternate with tight bends. The results are shown in Figure 2. The reference trajectory map used for defining the desired route is formed by the representative GPS positions of the target route. The car drives on the right, as on the continent, and left and right turns are included to test the performance of the controller under all circumstances. Figure 2 shows that the intuitively adjusted controller performs as desired. Looking closely, however, we find that the controller bend mode is not finely tuned. Two problems appear when taking right-hand bends. First, the car starts turning too late, leading to a massive left-hand lane invasion, corrected by turning the steering wheel as far as it will go. This maximum steering wheel turn leads to the second problem: a costly adaptation maneuver and trajectory overshoot until the vehicle recovers the correct positioning. Around left-hand bends, the car adapts correctly to the map shape, but the steering turn decreases a little at the end of these bends, and the car goes off the side of the road, forcing the control system to sharply correct the trajectory until the car is positioned correctly again. At right turns, the left lane very often has to be invaded in human driving, because the presence of a curb prevents the car from cutting the corner. However, the car should get as close as possible to the corner.
2.1.5. Fine Tuning the Steering Fuzzy Controller
After examining system performance, the membership functions of the bend controller have to be fine-tuned. The driving for left or right turns is different when the vehicle is in the right-hand lane from when it is traveling down the center of the road. That is, unlike the straight controller, the input membership functions for the bend controller cannot be symmetric. The first problem concerning right-hand bends happens because there are overactuations on the steering. The two rules involved in this situation are:
IF Angular-Error Left THEN Steering Right
IF Lateral-Error Right THEN Steering Left
The reason behind this problem is that the second rule carries too much weight in the control system: the slope of the Lateral-Error Right label (Figure 1) forces the car to go too far into the other lane before it loses strength. To correct this, the slope of that label needs to be smoothed out. After some tests, the optimum value for the vertex of the function was found to be -5.9 meters.
Figure 2. Automatic driving route through the driving zone. The figure shows the edges of the road, centerline, reference trajectory map and actual car trajectory map.
The second problem with right-hand bends is that steering overactuations occur when the car tries to get back into the right lane after taking the bend. In this case, it is the rules complementary to the ones described for the above problem that come into play:
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Left THEN Steering Right
This problem is caused by the first rule and happens after the car invades the left-hand lane, thus activating the second rule: the correction to take the car back to the correct path is too sharp, and there is a little oscillation. The solution is to smooth the shape of the Lateral-Error Left label; the newly found value is 3.75 m. The left turns entail the following rules, in which the tracking error occurs:
IF Angular-Error Right THEN Steering Left
IF Lateral-Error Right THEN Steering Left
The reason behind the problem is that, when the angular error is too low and the lateral error is near zero, the steering correction to the left is insufficient. To correct this, the strength of the Angular-Error Right label needs to be increased to amplify the actuation of the steering wheel. So, we set the new value at -53°. After modifying the controller, we repeated the experiment to compare the results. The new route is shown in Figure 3. As we can see, the problems that appeared in the first version of the controller have been corrected.
3. Conclusion
An artificial drive-by-wire steering system can be implemented using simple and intuitive fuzzy rules and input and output variables, which can be defined and fine-tuned based on human experience and knowledge. The experiments show a
human-like behavior that is as good in straight lane driving as in driving around bends, demonstrating the power of the developed controllers.
Figure 3. Automatic route tracking controlled with the fine-tuned fuzzy system.
Acknowledgments
The authors wish to thank the Spanish Ministries MFOM and MCYT, which support the COPOS and ISAAC projects, and Citroën España SA.
References
1. R. Garcia et al., Integration of Fuzzy Techniques and Perception Systems for ITS, Computational Intelligent Systems for Applied Research, FLINS 2002, World Scientific, Gent, Belgium, pp. 314-320 (2002)
2. A. Broggi et al., Automatic Vehicle Guidance: the Experience of the ARGO Autonomous Vehicle, World Scientific (1999)
3. D. Pomerleau, ALVINN: An Autonomous Land Vehicle In a Neural Network, Advances in Neural Information Processing Systems 1, Morgan Kaufmann (1989)
4. M. Hitchings et al., Fuzzy Control, Intelligent Vehicle Technologies, Vlacic, Parent, Harashima eds., pp. 289-327, SAE International (2001)
5. M. Sugeno, Industrial Applications of Fuzzy Control, North Holland (1985)
6. J. E. Naranjo et al., Adaptive Fuzzy Control for Inter-Vehicle Gap Keeping, IEEE Trans. on ITS, Vol. 4, No. 3, Sep. (2003)
7. E. H. Mamdani, Application of Fuzzy Algorithms for Control of a Simple Dynamic Plant, Proc. IEE, 121, 12, pp. 1585-1588 (1974)
8. T. Takagi and M. Sugeno, Fuzzy Identification of Systems and Its Applications to Modeling and Control, IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-15, No. 1, pp. 116-132 (1985)
A NEW SONAR LANDMARK FOR PLACE RECOGNITION
A. PONCELA, C. URDIALES, C. TRAZEGNIES AND F. SANDOVAL
Departamento de Tecnología Electrónica, E.T.S.I. Telecomunicación, Universidad de Málaga, Campus de Teatinos, 29071, Málaga, Spain
Email: [email protected]
This paper presents a new sonar based landmark to represent significant places in an environment for localization purposes. This landmark is based on extracting the contour free of obstacles around the robot from a local evidence grid. This contour is represented by its curvature, calculated by a noise-resistant function which adapts to the natural scale of the contour at each point. Then, curvature is reduced to a short feature vector by using Principal Component Analysis. The landmark calculation method has been successfully tested in a medium scale real environment using a Pioneer robot with Polaroid sonar sensors.
1. Introduction
One of the main concerns of autonomous navigation is localization, defined as the problem of correlating the coordinate system of a robot with that of the external world. The most classic approach, odometry, consists of equipping the wheels of the vehicle with encoders to estimate displacements. Unfortunately, odometry is highly affected by noise due to deformation of the wheel radius, wheel slippage, vibrations and other errors. To achieve reliable localization, most methods rely on acquiring and processing external information to compare it to an internal environment model, either available a priori or acquired on-line. Commercial applications (i.e. [4]) typically rely on artificial beacons to reduce the problem to measurement, correlation and triangulation. This is fast and accurate [3] but depends on knowing the beacon layout a priori. A second choice is to use natural landmarks in the environment. In this case, the robot acquires landmarks on-line while its position is still known and then uses the learnt ones to correct odometry. These algorithms depend on the kind of sensors available, typically video-cameras, optical range finders and sonar sensors. While vision-guided localization results in superior accuracy with reduced speed
and increased cost, optical range finders present a low performance when compared to the large beamwidth of non-expensive sonar sensors [1]. Thus, this paper focuses on a method to extract significant landmarks from sonar sensors. Sonar sensors provide the distance to obstacles around the robot, so that structures like corridors, corners or intersections can be perceived. The main problem of these sensors is that they are affected by errors, especially angular inaccuracy and multiple echoes. Most sonar error correcting techniques rely on fusing sensor information in time and space, usually by means of evidence grids [5]. These grids are fast and easy to construct and they efficiently combine any number of sensor readings. However, grids usually involve a huge data volume. Also, they are affected by the robot heading and relative position, so they are not easy to compare. To avoid this problem, some methods prefer to work with raw range data [6][8], which is corrected by using statistical methods like Principal Components Analysis (PCA). Other methods [9] rely on matching techniques to evaluate the similarity between grids. In this work, we propose a new method to combine both approaches: we rely on grids to achieve more stable sensor readings, but grids are reduced to a short feature vector by means of PCA. However, prior to PCA, we extract the contour of the area free of obstacles around the robot and represent it by means of a curvature function very stable against noise and transformations. Thus, the feature vector returned by PCA is resistant against noise and transformations too. This paper presents these vectors and shows how they can be used to recognize significant places in the environment.
2. Contour representation
Sonar landmarks are extracted from local grids built around the robot [5]. A local grid is a two-dimensional tessellation of the space around the robot into cells presenting a probability of occupation ranging from 0 to 1. We propose a new landmark extraction algorithm to extract a short feature vector from a local grid, so that grids can be easily compared and stored. Our method initially relies on representing the contour of the region free of obstacles (RFO) in each grid by means of its curvature. First, the grid is thresholded: cells whose occupancy value is below a threshold u are considered free space and the rest are occupied or non-explored. Then, the thresholded grid is stored in polar coordinates to easily look for the closest non-free cell O_φ to the robot in each direction φ. Then, we use a median
filter to partially remove noise from the grid. The contour C is given by O_φ, ∀φ ∈ [-180°, 180°]. Fig. 1 shows the different stages of the contour extraction algorithm. In the grid in Fig. 1.a, obstacles, free space and unexplored areas are printed in white, black and gray, respectively. Fig. 1.b shows the thresholded local map after obtaining the closest obstacles around the robot, which is noisy because of sonar errors. The filtered map is presented in Fig. 1.c and, after all regions in the grid not centered around the robot are removed, the RFO appears in Fig. 1.d. Finally, Fig. 1.e presents the contour of the grid in Fig. 1.a.
Figure 1. Contour extraction: (a) local map; (b) thresholded local map; (c) thresholded local map after filtering; (d) region of interest; (e) contour of the local map in (a).
Contour C is represented by its adaptively estimated curvature function, calculated as proposed in [7]:
1. Calculation of the incremental chain code of C: for each pixel n, the difference in x and y between pixels n and n+1, (Δx(n), Δy(n)).
2. Calculation of the maximum contour length free of discontinuities around n, k(n). k(n) is obtained by comparing the Euclidean distance d(n-k(n), n+k(n)) to the real length of the contour l_max(k(n)). k(n) is the largest value that satisfies:

d(n-k(n), n+k(n)) ≥ l_max(k(n)) - u_k   (1)

3. Calculation of the incremental adaptive chain code (Δx(n)_k, Δy(n)_k) associated to n.
4. Calculation of the curve slope at every point n of C, equal to the angle of the adaptive chain code:

Ang(n) = arctan(Δy(n)_k / Δx(n)_k)   (2)

5. Calculation of the curvature at every point n, CF_n. This value is locally approximated by Ang(n+1) - Ang(n).
The main advantage of this curvature function is that it is very resistant against noise [7].
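A condensed sketch of steps 1-5 follows, for a closed contour given as a list of (x, y) pixels. It is ours: the chain-code bookkeeping is folded into direct point differences, and the tolerance u_k and the window cap are illustrative values.

```python
import math

def adaptive_curvature(contour, u_k=1.0, k_cap=15):
    """CF_n ~ Ang(n+1) - Ang(n), with Ang estimated over the largest
    discontinuity-free window k(n) satisfying condition (1)."""
    m = len(contour)
    pt = lambda i: contour[i % m]

    def arc_length(i, k):          # real contour length around point i
        return sum(math.dist(pt(j), pt(j + 1)) for j in range(i - k, i + k))

    ang = []
    for i in range(m):
        k = 1
        while (k < k_cap and
               math.dist(pt(i - k - 1), pt(i + k + 1))
               >= arc_length(i, k + 1) - u_k):
            k += 1                 # grow the window while (1) holds
        dxk = pt(i + k)[0] - pt(i - k)[0]   # adaptive chain code, step 3
        dyk = pt(i + k)[1] - pt(i - k)[1]
        ang.append(math.atan2(dyk, dxk))    # slope Ang(i), step 4
    return [ang[(i + 1) % m] - ang[i] for i in range(m)]  # step 5
```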
However, it shifts depending on the robot orientation, even for the same position. Thus, we use the modulus of its FFT. Figs. 2.a and c present two different real contours and the clearly different |FFT| of their curvature (Figs. 2.b and d). 256-point |FFT|s are still complicated to compare and store, so we use Principal Component Analysis (PCA) to compact them. PCA is used to reduce the dimensionality of a data set of correlated variables while retaining most of their original variation. Given a set of |FFT|s of N arbitrarily chosen local maps, we can find a P-dimensional subspace, P ≤ N, to which all |FFT|s belong. Thus, any |FFT| can be represented with only P components on a basis of P ≤ N vectors. PCA allows calculation of this P-dimensional basis. Fig. 3.a shows a real test environment and the positions where 250 grids are gathered to calculate our basis. The basis calculation is performed off-line and only once. Then, |FFT|s can be projected onto it on-line in a fast way. Fig. 3.b shows how much information is explained by different sized bases. It can be noted that most information is explained by the first component, roughly the function average. The variance percentage of the input set explained by the first 10 components is equal to 85%. It has been tested that a basis of 10 vectors is enough to separate different places unless differences are quite subtle.
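In code, the whole landmark pipeline above amounts to a few numpy calls. This sketch is our own; the basis is obtained by SVD of the mean-centred training |FFT|s, which is one standard way to compute the principal components.

```python
import numpy as np

def pca_basis(training_ffts, P=10):
    """training_ffts: N x 256 matrix of |FFT|s of training curvatures."""
    mean = training_ffts.mean(axis=0)
    _, _, vt = np.linalg.svd(training_ffts - mean, full_matrices=False)
    return mean, vt[:P]              # mean and the P principal directions

def landmark(curvature, mean, basis):
    spectrum = np.abs(np.fft.fft(curvature, n=256))  # orientation-invariant
    return basis @ (spectrum - mean)                 # P-component vector
```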
3. Tests and results
Once a basis is available, the robot can gather local grids while it is moving and project their curvature |FFT| onto the basis to acquire a landmark at its current position. If the grids used to calculate the basis were captured at places different enough, the resulting feature vectors are representative whether the current local map was used to obtain the basis or not. To test the validity of the proposed landmark extraction method, we have performed several experiments in a real environment using a Pioneer robot equipped with frontal Polaroid sonar sensors. To guarantee that a landmark
Figure 2. (a) Contour of local map; (b) |FFT| of the curvature function of contour in (a); (c) contour of local map; (d) |FFT| of the curvature function of contour in (c).
is representative enough, grids corresponding to mostly non-explored areas are discarded. This is important because in mostly non-explored areas grids depend strongly on the orientation of the robot and may change significantly from one sensor reading to the next. A landmark is representative as long as similar places yield similar landmarks and different ones do not. Thus, a first test has consisted of extracting landmarks at close positions while the robot is moving ahead. If those positions are close enough, the layout of the environment around the robot is bound to remain mostly the same in the absence of dynamic obstacles. Thus, landmarks should be very similar. Fig. 4.a shows nine consecutive RFO contours captured in the test environment in Fig. 3.a and their feature vectors (Fig. 4.b). Some of these nine contours were used to obtain the basis and some of them were purposefully not. Nevertheless, despite minor differences between these contours, it can be noted that the feature vectors are very similar. Fig. 5 shows a second test of landmark similarity. In this case, all landmarks have been captured at different locations (Fig. 5.a). However, it can be observed that landmarks 1 and 3 are captured at similar locations, both having a wall on the right. Landmark 2, however, is captured in front of a wall, near a corner, while landmark 4 is captured at a location free of obstacles. Fig. 5.b shows the feature vectors of landmarks 1 and 3. It can be observed that both vectors are quite similar, as expected. Fig. 5.c shows landmarks 2 and 4, which are clearly different between themselves and also from landmarks 1 and 3. In order to establish a distance between landmarks for matching, we have evaluated different metrics and chosen the
Figure 3. (a) Landmarks for basis calculation in the real test environment; (b) percentage of variance explained by the principal components.
Figure 4. (a) Consecutively captured grid contours; (b) vectors for contours 1-9; (c) landmark locations.
Tanimoto distance [2], which yielded the best results. Table 1 shows the distance between each pair of landmarks. It can be appreciated that landmarks 1 and 3 are the closest ones. However, landmarks 1 and 2 are not too far apart. This is reasonable because they correspond to near positions of the robot. However, the wall appearing in front of landmark 2 makes them different enough to discriminate between both places. Finally, landmark 4 is quite different from all the rest.
Table 1. Tanimoto distances

             Landmark 1    Landmark 2    Landmark 3    Landmark 4
Landmark 1   0             0.1014        0.059         0.2768
Landmark 2   0.1014        0             0.1572        0.3362
Landmark 3   0.059         0.1572        0             0.1793
Landmark 4   0.2768        0.3362        0.1793        0
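For continuous feature vectors, the Tanimoto coefficient of [2] and the corresponding distance used in Table 1 can be computed as in the sketch below (ours); a distance of 0 means identical vectors, and larger values mean more dissimilar landmarks.

```python
import numpy as np

def tanimoto_distance(a, b):
    """1 minus the Tanimoto similarity of two real-valued vectors."""
    dot = float(np.dot(a, b))
    similarity = dot / (np.dot(a, a) + np.dot(b, b) - dot)
    return 1.0 - similarity
```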
Figure 5. (a) Landmark locations; (b) vectors for landmarks 1 and 3; (c) vectors for landmarks 2 and 4.
4. Conclusions and future work
This paper has presented a new sonar based landmark for localization. Landmarks are extracted from the curvature of the region free of obstacles around the robot in local grids. The moduli of the FFTs of the curvatures are processed by Principal Component Analysis to extract a short feature vector. These vectors have been successfully evaluated in a real environment using a Pioneer robot equipped with 8 frontal Polaroid sensors. Landmarks are useful to distinguish different places but, since similar places at different locations yield similar vectors, future work will focus on statistically combining information from consecutive landmarks for localization purposes.
Acknowledgments
This work has been partially supported by the Spanish Ministerio de Ciencia y Tecnología (MCYT) and FEDER funds, project No. TIC2001-1758.
References
1. G.C. Anousaki and K.J. Kyriakopoulos, Simultaneous localization and map building for mobile robot navigation, IEEE Robotics & Automation Magazine, pp. 42-53, September, (1999).
2. J.D. Holliday, C.Y. Hu and P. Willet, Grouping of Coefficients for the Calculation of Inter-Molecular Similarity and Dissimilarity using 2D Fragment Bit-Strings, Combinatorial Chemistry & High Throughput Screening, 5 (2), pp. 155-166, (2002).
3. L. Kleeman, Optimal estimation of position and heading for mobile robots using ultrasonic beacons and dead-reckoning, Proc. of the IEEE Int. Conf. on Robotics and Automation (ICRA '92), Nice, France, pp. 2582-2587, (1992).
4. T.S. Levitt and D.T. Lawton, Qualitative navigation for mobile robots, Artificial Intelligence, 44 (3), pp. 305-360, (1990).
5. H.P. Moravec, Sensor fusion in certainty grids for mobile robots, AI Magazine, 9, pp. 61-74, (1988).
6. C. Urdiales, A. Bandera, R. Ron and F. Sandoval, Real time position estimation for mobile robots by means of sonar sensors, Proc. of the 1999 IEEE Int. Conf. on Robotics and Automation (ICRA '99), pp. 1650-1655, (USA), (1999).
7. C. Urdiales, C. Trazegnies, A. Bandera and F. Sandoval, Corner detection based on adaptively filtered curvature function, Electronics Letters, 39 (5), pp. 426-428, (2003).
8. N. Vlassis and B. Kröse, Robot Environment Modeling via Principal Component Regression, Proceedings of the 1999 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS 1999), pp. 677-682, Kyongju, Korea, (1999).
9. B. Yamauchi and R. Beer, Spatial learning for navigation in dynamic environments, IEEE Trans. on Syst., Man and Cyb., 6 (26), pp. 496-505, (1996).
AUTOMATIC PARKING WITH COLLISION AVOIDANCE
D. MARAVALL AND J. DE LOPE
Department of Artificial Intelligence, Faculty of Computer Science, Universidad Politécnica de Madrid, Campus de Montegancedo, 28660 Madrid, Spain
E-mail: {dmaravall, jdlope}@fi.upm.es
M. A. PATRICIO
Department of Computer Science, Universidad Carlos III de Madrid, Campus de Colmenarejo, Madrid, Spain
E-mail: [email protected]
Automatic parking of a car-like robot in the presence of a priori unknown obstacles is the topic of this paper. The problem is solved by means of a bio-inspired approach in which the robot controller does not need to know the car kinematics and dynamics, nor does it call for a priori knowledge of the environment map. The key point in the proposed approach is the definition of performance indexes that, for automatic parking, happen to be functions of the strategic orientations to be injected, in real time, into the robot controller. This solution leads to a dynamical multicriteria optimization problem, which is extremely hard to deal with analytically. A genetic algorithm is therefore applied. The results of computer simulations are finally discussed.
1. Introduction
In this paper we address the problem of automatically parking a back-wheel drive vehicle with the additional difficulty created by the presence of a priori unknown obstacles, so that the car controller has to autonomously perform, in real time, two different tasks: parking and collision avoidance. To solve this double-sided problem we put forward a solution based on a biomimetic approach that we have recently proposed [1,2].
2. Sensory-Motor Coordination in Automated Vehicles Using a Biomimetic Approach
The idea underpinning the method is to optimize specific measurable robot behavior indexes using appropriate sensors. The optimization is solved by means of heuristic techniques, which makes the robot controller highly flexible and very simple. In the manner that living beings solve their physical control problems, like manipulation and locomotion, the robot develops a behavior strategy based on the perception of its environment, embodied as behavior indexes and aimed at improving (optimizing) the evolution of the above-mentioned behavior indexes. It does all this following the known perception-decision-action cycle. The robot vehicle considered in this paper is a conventional back-wheel drive car, whose discrete dynamic equations can be modeled, for the low-speed range typical of parking maneuvers, as:
x(k+1) = x(k) + v cos θ(k) Δt
y(k+1) = y(k) + v sin θ(k) Δt
θ(k+1) = θ(k) + (v/L) tan φ(k) Δt,   |φ| ≤ φ_max   (1)

where (x, y) are the coordinates of the point of application of the force of traction on the vehicle; θ is the heading of the vehicle on the plane on which it is moving; v is its speed; L is the distance between the front and back axles; and the variable φ is the direction of the driving wheels with respect to the vehicle heading θ. Obviously, (v, φ) are the robot control variables and (x, y, θ) are its state variables. Finally, φ_max is the maximum angle that can be applied to the direction of the driving wheels. In the case of automatic parking without obstacles, there are two behavior indexes of interest: J1 and J2. These two indexes quantify the goal that the robot should park at the final position (x_d, y_d) and the goal that the robot should park in line with the parking space direction θ_d, respectively. Hence:

J1 = (x - x_d)² + (y - y_d)²,   J2 = (θ - θ_d)²   (2)
Set out in the terms described above, automatic parking can be considered as a standard multicriteria optimization problem, where the agent control actions (in this case, the direction of the steering wheel, because the robot is moving at constant speed) should simultaneously minimize the two indexes that appear in expression (2).
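For simulation purposes, one integration step of model (1) and the evaluation of the two indexes of expression (2) look as follows; the step length dt is an assumption we add for the discrete update, and the default steering limit is illustrative.

```python
import math

def step(x, y, theta, v, phi, L, dt=0.1, phi_max=math.radians(30)):
    """One discrete step of the back-wheel drive model (1)."""
    phi = max(-phi_max, min(phi_max, phi))    # steering limit
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / L) * math.tan(phi) * dt
    return x, y, theta

def parking_indexes(x, y, theta, xd, yd, theta_d):
    J1 = (x - xd) ** 2 + (y - yd) ** 2        # reach the final position
    J2 = (theta - theta_d) ** 2               # align with the parking space
    return J1, J2
```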
As any driver will have found in practice, the dynamic coordination of these indexes in a parking maneuver is an extremely complex control problem, as, in nonholonomic vehicles, any slip in the combination of the actions suggested by the approach and heading indexes may be disastrous for the parking maneuver. The presence of arbitrary and a priori unknown obstacles introduces an additional goal (i.e. collision avoidance) and, consequently, an additional behavior or performance index, which is the topic of the next section.
3. Collision Avoidance
We have previously proposed [3,4] a method for autonomous navigation of mobile robots, based on the coordination of some strategic orientations of the mobile robot. These orientations correspond, roughly speaking, to the two fundamental tasks of any mobile robot, including car-like robots: (1) collision avoidance and (2) goal reaching, which in automatic parking is, obviously, the parking space. As regards the general task of collision avoidance, we introduced [3] a generalization of the well-known artificial potential field (APF) theory [5]. More specifically, we added to the customary normal orientation the tangential orientation, making it possible for the robot to perform smoother and more efficient trajectories for collision avoidance. We suppose that the robot only knows the parking space coordinates (x_d, y_d) and its direction θ_d. To perform the parking maneuver, the car is equipped exclusively with four sets of ultrasound sensors placed at each of its four edges. Roughly speaking, the normal orientation represents the objective or goal of fleeing from the obstacle, whereas the two tangential orientations, right and left, are meant for the robot to round the obstacles, either by following the right or the left direction. Obviously, the other important orientation for the robot is the one that provides the trajectory towards the parking space. It is straightforward to show [4] that the normal orientation is given by
φⁿ(x, y) = arctan[(∂U_r/∂y) / (∂U_r/∂x)]   (3)

where U_r(x, y) is the repulsive potential field:

U_r(x, y) = (1/2) (1/ρ(x, y) - 1/ρ₀)²  if ρ(x, y) ≤ ρ₀;  U_r(x, y) = 0 otherwise   (4)
in which ρ(x, y) is the shortest distance from the car to the nearest obstacle and ρ₀ is a threshold distance beyond which the obstacles do not influence the car movements. The tangential orientations are

φ^τ(x, y) = φⁿ(x, y) ± π/2   (5)
where the opposing signs correspond to the right and left tangential orientations. One important drawback of expressions (3) and (5) is that we need to compute, at each robot position, the repulsive potential field to obtain the two orientations for collision avoidance. We applied [1] a simpler method for the computation of both orientations φⁿ and φ^τ by means of a ring of ultrasound sensors, so that we shall suppose that the normal and tangent orientations are available at each robot position without computing the exact potential field function.
4. Dynamic Multicriteria Optimization with Genetic Algorithms
Summarizing, we have at each instant all the necessary orientations to guide the car towards the parking space without colliding with the existing obstacles: (1) the orientation φ¹ that optimizes the performance index J1, which drives the robot towards the final position (x_d, y_d); (2) the orientation φ² that optimizes the performance index J2, responsible for making the car park in line with the parking space direction θ_d; (3) the normal orientation φⁿ, which precludes the robot from getting too close to the obstacles, and the tangential orientation φ^τ, which constrains the robot to follow the optimum trajectory to circumvent the obstacles. The next and crucial step is to coordinate all these competitive and opposing control orientations:
φ^d(t) = f[φ¹(t), φ²(t), φⁿ(t), φ^τ(t)]   (6)
where by φ^d(t) we are referring to the desired, final orientation to be applied to the car. The simplest, although by no means trivial, coordination of the competitive orientations is a linear one:
φ^d(t) = w₁(t)φ¹(t) + w₂(t)φ²(t) + w_n(t)φⁿ(t) + w_τ(t)φ^τ(t)   (7)
Therefore, at each instant the car controller computes its final orientation φ^d(t) as a function of the four basic angles, so that the key point is
to select the suitable set of instantaneous weights w₁(t), w₂(t), w_n(t) and w_τ(t). This is a rather difficult optimization problem, due to the dynamical nature of the four competitive objectives, as they depend on the relative position and shape of the existing obstacles, which are highly uncertain and a priori unknown to the designer of the robot controller. To make this multicriteria optimization problem more manageable, let us first merge the two orientations φ¹(t) and φ²(t) into a single goal orientation:

φ^g(t) = w₁(t)φ¹(t) + w₂(t)φ²(t)   (8)
so that the final car orientation can be rewritten as
4"t)
= w g ( t ) @ ( t )+ w n ( t W ( t )
+WT(t)4T(t)
(9)
Now, let us consider the following rule-based reasoning. (1) If the nearest obstacle is very close to the robot, then give maximum priority to the normal orientation. (2) If the nearest obstacle is at an intermediate distance, then the tangential orientation has maximum priority. (3) If the nearest obstacle is far from the robot, then give maximum priority to the goal orientation. This reasoning process, expressed as linguistic rules, can be implemented either by means of a fuzzy logic-based controller or by a process based on the dynamical coordination of multiple performance indices [1,4]. More specifically, these rules can be formalized, as far as the numerical values of the weights w_g, w_n and w_τ are concerned, in the way shown in Fig. 1, where we have represented the distribution of the coordination parameters as a function of the distance of the robot to the nearest obstacle.
Figure 1. Distribution of the coordination parameters w_g, w_n and w_τ as a function of the distance d.
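A coarse reading of Fig. 1 as code is given below. It is our own sketch: the figure's actual weight profiles are piecewise and smoother than the crisp regions and fixed blends used here.

```python
def coordination_weights(d, d_m, d_M, delta_d):
    """Distribution of (w_g, w_n, w_tau) as a function of the distance d
    to the nearest obstacle, following the three linguistic rules."""
    if d <= d_m:                       # rule 1: very close -> normal
        return 0.0, 1.0, 0.0
    if d >= d_M:                       # rule 3: far away -> goal
        return 1.0, 0.0, 0.0
    mid = 0.5 * (d_m + d_M)
    if abs(d - mid) <= 0.5 * delta_d:  # rule 2: intermediate -> tangential
        return 0.0, 0.0, 1.0
    if d < mid:                        # transition normal <-> tangential
        return 0.0, 0.5, 0.5
    return 0.5, 0.0, 0.5               # transition tangential <-> goal
```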
Fig. 1 is self-explanatory. Thus, d_m is a critical distance in the sense that it acts as a security threshold below which the robot behavior is
dominated by the objective of fleeing from the nearest obstacle by following the normal orientation. In a similar way, d_M is another crucial distance beyond which the robot behavior is dominated by the objective of going straight to the goal. Finally, the third parameter Δd determines the region of influence of the tangential navigation, i.e. when the robot's main objective is to round the nearest obstacle by following the tangent orientation. Thanks to the introduction of this dependence on distance of the coordination weights w_g(t), w_n(t) and w_τ(t), the search space has been dramatically simplified and reduced. Now, the multiobjective optimization problem exclusively depends on the three critical distances d_m, Δd and d_M, which, unlike the parameters w_g(t), w_n(t) and w_τ(t), are not time-dependent, making the optimum search one order of magnitude simpler as compared with the direct optimization based on the coordination weights. As regards the coordination of the two subgoal orientations φ¹(t) and φ²(t), we shall make use of the results obtained in our previous work on automatic parking without obstacles. Hence, we introduce the following linguistic rules: (1) If the vehicle is far from the parking space, then priority should be given to the approach subgoal (index J1 and orientation φ¹). (2) If the vehicle is near to the target, then priority should be given to the heading subgoal (index J2 and orientation φ²). Note the similarity of these rules with the preceding rules, although now there are only two coordination weights, w₁(t) and w₂(t), and, consequently, two optimization parameters, d'_m and d'_M, for the coordination of the subgoal orientations. In summary, there are two embedded optimization processes. The first one concerns the computation of the goal orientation φ^g(t) as a result of the coordination of the two subgoals of the parking task, i.e. the approximation orientation φ¹(t) and the alignment orientation φ²(t). Once φ^g(t) has been obtained, the next optimization process affects the coordination of this goal orientation and the two orientations for collision avoidance, i.e. the normal and tangential orientations. The optimum search is very hard to solve by means of analytical methods or gradient-based techniques, due to the highly uncertain and a priori unknown spatial distribution and shape of the obstacles. Therefore, we have applied a genetic algorithm to solve this twofold multicriteria optimization problem. The experiments were conducted using the University of Sheffield's
Genetic Algorithm Toolbox for Matlab [7]. For all cases, a 20-bit resolution binary coding was used for the parameters processed; the parameter ranges depend on the variables to be optimized. The stochastic sampling method was used to select individuals. The crossover probability used is 0.7; the mutation probability is set proportionally to the size of the population and is never over 0.02. Additionally, elitism from generation to generation is used. Hence, 90% of the individuals of each new population are created by means of the selection and crossover operators, and the 10% best individuals of each generation are added directly to the new population. As regards the fitness evaluation, quality is determined by rewarding the individuals that minimize the final position error in each experiment: the closer an individual is, at the end of the path, to the position defined as the target, the better this individual is. Additionally, individuals who manage to reach the target along a shorter path are also considered better, although the weighting of this factor is lower. Finally, a penalty is added every time the robot collides with an obstacle. The experiments were designed by defining a set of initial and final vehicle positions that cover the different relative situations between the starting position of the robot and the final goal. Each individual generated in the evolutionary process was simulated with these initial and final conditions to determine its problem-solving quality. As mentioned above, the five parameters evolved were d_m, d_M, Δd, d'_m and d'_M. The number of individuals in each population was 40 and convergence to the best solutions was reached by the 60th generation. Fig. 2 shows the paths achieved by the fittest individuals of two different generations.
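The fitness evaluation described above can be sketched as follows. The routine `simulate` is a caller-supplied (hypothetical) function that runs one parking experiment under the five evolved distances and returns the final position, the path length and the number of collisions; the three weightings are illustrative, not the values used in the experiments.

```python
import math

def fitness(params, experiments, simulate,
            w_dist=1.0, w_len=0.1, w_coll=10.0):
    """Higher is better: reward closeness to the target, mildly reward
    shorter paths, and penalize every collision."""
    score = 0.0
    for start, goal in experiments:
        final_pos, path_len, collisions = simulate(params, start, goal)
        miss = math.hypot(final_pos[0] - goal[0], final_pos[1] - goal[1])
        score -= w_dist * miss + w_len * path_len + w_coll * collisions
    return score
```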
5. Concluding Remarks
Automatic parking of a car-like robot in the presence of unpredictable obstacles has been solved by means of a bio-inspired method, in which the robot controller optimizes, in real time, its trajectory towards the parking space without colliding with the existing obstacles. The coordination of the two main tasks of the robot, i.e. parking and collision avoidance, leads to a dynamical multicriteria optimization problem which has been solved using a genetic algorithm. The results of computer simulations have shown the viability of the proposed method.
Figure 2. Paths achieved by the fittest individual. Left: 20th generation, d_m = 5.113, d_M = 2.068, Δd = 1.702, d'_m = 7.982, d'_M = 18.868. Right: 60th generation, d_m = 2.559, d_M = 5.998, Δd = 2.746, d'_m = 2.199, d'_M = 4.576.
Acknowledgments
This work has been partially funded by the Spanish Ministry of Science and Technology, project DPI2002-04064-C05-05.
References
1. D. Maravall, J. de Lope, "A Bio-Inspired Robotic Mechanism for Autonomous Locomotion in Unconventional Environments", C. Zhou, D. Maravall, D. Ruan (eds.), Autonomous Robotic Systems: Soft Computing and Hard Computing Methodologies and Applications, Physica-Verlag, Heidelberg, 2003, pp. 263-292.
2. D. Maravall, J. de Lope, "A Reinforcement Learning Method for Dynamic Obstacle Avoidance in Robotic Mechanisms", D. Ruan, P. D'Hondt, E.E. Kerre (eds.), Computational Intelligent Systems for Applied Research, World Scientific, Singapore, 2002, pp. 485-494.
3. J. de Lope, D. Maravall, "Integration of Reactive Utilitarian Navigation and Topological Modeling", C. Zhou, D. Maravall, D. Ruan (eds.), Autonomous Robotic Systems: Soft Computing and Hard Computing Methodologies and Applications, Physica-Verlag, Heidelberg, 2003, pp. 103-139.
4. D. Maravall, J. de Lope, "Integration of Potential Field Theory and Sensory-based Search in Autonomous Navigation", 15th IFAC World Congress, International Federation of Automatic Control, Barcelona, 2002.
5. O. Khatib, "Real-time obstacle avoidance for manipulators and mobile robots", Int. J. of Robotics Research, 5(1), 1986, pp. 90-98.
6. D. Maravall, J. de Lope, M.A. Patricio, "Competitive Goal Coordination in Automatic Parking", Proc. 1st European Workshop on Evolutionary Algorithms in Stochastic and Dynamic Environments, Coimbra, (2004).
7. A. Chipperfield, P. Fleming, H. Pohlheim, C. Fonseca, Genetic Algorithm Toolbox for Matlab, Department of Automatic Control and Systems Engineering, University of Sheffield, 1994.
MODELING THE RELATIONSHIP BETWEEN NONWOVEN STRUCTURAL PARAMETERS AND THEIR PROPERTIES FROM FEW NUMBER OF DATA
P. VROMAN, L. KOEHL, X. ZENG
Ecole Nationale Supérieure des Arts & Industries Textiles, 9 rue de l'Ermitage, Roubaix 59100, France
T. CHEN
College of Textiles, Donghua University, Shanghai 200051, P.R. China
This paper deals with the modeling of the relations between the functional properties (outputs) and the structural parameters (inputs) of nonwoven products using soft computing techniques. To reduce the complexity of such a model, a selection of the most relevant input parameters is required. A new selection method based on a ranking criterion is first presented. Several models, taking into account the specificities of the nonwoven families, are then defined using multilayer feed-forward neural networks. The interest of the selection method is also tested and discussed.
1. Introduction
Nonwoven products have been more and more widely used nowadays because of their numerous and interesting functional properties (e.g. insulation, protection, filtration, breathability). The number of end-uses designed with nonwoven materials has grown significantly in the last decades, while production in Western Europe rose by 8% [1]. Consequently, great attention has been paid to exploring the relationships between the structural parameters of nonwovens (thickness, basis weight, raw material...) and their functional properties. Such an approach enables manufacturers to obtain a better understanding of the influence of the nonwoven structure and the related process parameters on product quality. Thus, based on soft computing techniques, we develop several mathematical models for characterizing the relations between the structural parameters (input variables) and the functional properties (output variables) in order to design new nonwoven products [2]. On top of the nonlinear relations between input and output variables, the complexity of the designed model is related to the huge number of structural parameters, their interdependencies and also the critical lack of available data. A new selection procedure of the input variables (structural parameters) based on a ranking criterion is first presented [3]. In order to solve the difficulty that only a
small number of measured data are available for parameter selection and modeling, due to the limitations of the production lines, our method has been developed by properly integrating both human knowledge on processes and products and measured data. Similar work has been done using neural networks [4]. After selecting the relevant structural parameters, the model characterizing the relation between the structural parameters and each property is then developed using a multilayer feed-forward neural network [5]. This modeling procedure has been successfully applied to the prediction of the filtration property of nonwoven products.
2. Selection Procedure of Relevant Structural Parameters
As it is quite difficult to produce large numbers of samples for studying the influence of each structural parameter on the functional properties selected from the final product specifications, small-scale ANN models will be built from a limited number of learning data, and the most relevant structural parameters should be selected before the modeling procedure. A criterion is defined to rank the structural parameters considering both the human knowledge and their sensitivity to the properties [3]. Accordingly, the ranking criterion is the linear combination of two elements. The first element (H) represents the human knowledge on products. The second element (S_k) represents the sensitivity of the measured data of nonwoven properties, based on a distance-based method, which is defined according to the following two rules:
1. IF a small variation of an input variable corresponds to a big variation of the output variable, THEN this input is considered as a sensitive variable.
2. IF a big variation of an input variable corresponds to a small variation of the output variable, THEN this input is considered as an insensitive variable.
Therefore, according to this criterion (F_k), an input variable is considered relevant if its small variation induces a great variation of an output. The ranking criterion is formulated as follows. Denote X_s = (x_s1, x_s2, ..., x_sk, ..., x_sn)^T the input vector of all the structural parameters and Y_s = (y_s1, y_s2, ..., y_sj, ..., y_sm)^T the output vector of properties that correspond to the sample s (s ∈ {1, ..., z}). All the recorded data have been normalized to eliminate scale effects, and the learning data set contains z samples. To rank the relevant inputs for a given output y_j, a criterion variable F_k is defined as follows:

F_k = g₁ · H(x_k, y_j) + g₂ · S_k   (1)
473
where dL(X,,X,)= J d 2 ( X , , X , ) - d ~ ( X i , X , ) /CE{I, , ..., n } , j e { l , ..., m},gl and gzare two positive coefficients. The criterion is designed for searching the best compromise between both the human knowledge and measured data. d(X,, X,)is the Euclidean distance between X, and XI. dk(& &) is the projection of d(X,,XI)on the axis x k . And dCy,,, yo) is the Euclidean distance between y,, and yb. The larger Fk is, the more relevant the input xk will be to this y,. The first element H of the ranking criterion characterizes the degree of coherence between the human knowledge expressed as seen in Table 1 and the variation of measured data. Its principle is as follows. If a variable x k has the same variation trend in learning data set as in the human knowledge, it will be considered as relevant. Otherwise, it will be considered as irrelevant. The universe of discourse ofy, is divided into t equivalent intervals C,.The set AkP is constructed with the set of input data x k , which belongs to the output interval C, of y,. zkp is generated by the overlap between Akpand Akp+l @ E { 1, .. .,t- 11). The human knowledge is expressed with linguistic sentences, such as: IF x l is increasing THEN yl is increasing (see Table 1): R(xl, y l ) = +1 Then H can be calculated using the following formula:
As shown in Figure 1, lzkpl and [ u k p [ are the lengths of the intervals, which correspond to the intersection and union generated by C, and cJPcI, respectively.
474 Table 1. Extract of the human knowledge table
I Structural Parameters (Input space) r
I
thickness basis weight
Hydraulic Functional Properties (Output space) ~
liquid strike through time
filtration level
I
-1
+I
I
+I -1
water permeability
I
-1
+l
I I I I I I I I I I I I I I
Figure 1. Relationship between the input and output spaces.
After calculating H and sk, the value of the relevancy criterion Fk for each input x k and the given output yi can be determined. Then all the Fk's can be ranked in a descending order. Namely, the input corresponding to the highest value of Fk will be the most relevant input to this output, and so on. According to this procedure, the most relevant structural parameters of nonwoven products are obtained and will be used for further modeling procedure. These parameters take into account both the conformity of human knowledge on process technology and the sensitivity of measured data to the physical properties. 3.
Artificial Neural Network Modeling
In our work, since there are several families of nonwoven products and each family has some different structural parameters, all the structure parameters are divided into two groups. One group includes the common structural parameters for all the families and the other group includes the specific structural parameters for each family. Accordingly, two kinds of neural network models are established. A general model makes use of all common structural parameters as its inputs. A specific model makes use of the common and specific structural parameters of each family as the model inputs (Figure 2). The Levenberg-
475 Maquardt fast learning procedure [5] is then used for determining the parameters of the neural networks from the public and the specific data groups. A
Fber count public data
-
filtration level
thicknlss basis weight I
specific data
binder rate
Figure 2. Feed forward neural network general model structure.
4.
Results and Discussion
In our work, 18 samples have been used for studying the water permeability of several families of nonwoven. At first, the structural parameters are selected using the ranking method described above. The same weight of 0.5 is assigned to both the human knowledge criterion (H) and the data sensitivity (&). Table 2 gives 7 most relevant structural parameters of nonwoven fabrics. For the general neural network model, the first 5 most relevant structural parameters are used as public input variables. Next, a specific model is built for only one family of nonwoven, manufactured with a chemical bonded. In this model, the binding rate is added to the set of input variables. The general model is trained with 15 data and validated on 3 new data. The specific model is trained with 5 data and validated with only one datum. Table 2. Ranking of structural parameters according to their relevancy to water permeability Data Sensitivity
c
Input Basis weight Thickness Fiber density Total pore volume Basis weight uniformity Fiber count Fiber length
SI, 0.1000 0.1006 0.1086 0.1036 0.1052 0.1052 0.1045
Ranko 7 6 1 5 2 2 4
Human Knowledge
R -1 -1 -1 1 1 -1 0
H 0.6177 0.5212 0.4574 0.4248 0.3333 0 0
Ranking Fk
0.3588 0.3109 0.2830 0.2642 0.2193 0.0526 0.0523
Rank 1 2 3 4 5 6 7
Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and and pnn pnn withthree traning methmeth met met Table 11 comparative maran withthree traning Table 11 comparative maran and pnn withthree traning meth met
476
1. The general model makes use of sample from several families which differ from each other in many aspects while the specific model only makes use of sample of one family which has quite a few similarities. 2. The specific model uses the same structure, learning procedure and the final weights and biases than the general model and adjusts the weights connecting the specific input and hidden neurons. 5.
Conclusion
In this paper, the relationship between the structural parameters and the functional properties of nonwoven products are modeled using neural network. In order to reduce the complexity and solve the difficulty of insufficient available data, we select the most relevant structural parameters as input variables of the model according to the data sensitivity and the conformity of human knowledge. In the modeling procedure, a general model is first designed for all families of nonwoven. It is built from the set of public input variables. Then, a specific model is built for one family of nonwoven by adding specific structural parameters to the set of public input variables. In the learning procedure, the general model is learnt from all available nonwoven samples while the specific model adjusts its weights and biases considering the specific input. The simulation results show low prediction error on water permeability using both the general and specific models. References
1. Western Europe: nonwoven production +8%. Technical Textiles, 46(3): El00 (2003). 2. L.A. Zadeh, The Roles of Fuzzy Logic and Soft-Computing in the Conception, Design and Deployment of Intelligent Systems, BT Technol. Jo. 14(4), 32-36 (1996). 3. L. Koehl, P. Vroman, T. Chen and X. Zeng, Selection of Nonwoven Relevant Structural Parameters by Integrating Measured Data and Human Knowledge, Multiconference Congress CESA '2003, 9-1 1th july, Lille, France. 4. S.J. Raudys and A.K. Jain, Small Sample Size Effects in Statistical Pattern Recognition: Recommendations for Practitionners, IEEE Trans. On Pattern Analysis and Machine Intel. 13(3) (1991). 5 . D.W. Patterson, Artificial Neural Networks - Theory and Applications, Prentice Hall (1996).
A NEW FRAMEWORK FOR FUZZY MULTI-ATTRIBUTE
OUTRANKING DECISION MAKING OZGURKABAK Istanbul Technical University, Department of Industrial Engineering, 34367 Ma& Istanbul /Turkey
FUSUN ULENGIN+ Istanbul Technical University, Department of Industrial Engineering, 34367 Ma& Istanbul /Turkey
In this paper, a new fuzzy multi-attribute decision making framework for group decision making is proposed. The proposed framework combines both the concept of fuzzy outranking and fuzzy attributes to provide a more flexible way of comparing the alternatives. It consists of three steps. In the first step, the preferences of the decision makers are gathered and then aggregated. In the second step, the attributes are normalized by considering the threshold and the target values of each attribute. At the last step, the results of the previous steps are used as input to a fuzzy multi-attribute decision making approach in order to rank and/or select alternatives. A fuzzy logic based computation is applied in each step for the imprecise or ambiguous data.
1. Introduction
In many real life situations, one of the basic problems is the uncertainty contained in the information about the situation. Additionally, whenever the aim is to make a comparison of an alternative set based on different indicators, it is very hard to find exact quantitative values for all of them. For example, it is very difficult to arrive at straightforward values to measure ecological sustainability. In general, we set a number of criteria for sustainability of a system and we call this system sustainable if its dynamics never drive it outside the boundaries of acceptable values for these criteria. Therefore, in addition to knowledge about the current situation, it is important to formulate targets for assessing progress toward goals. However, real world systems are complex systems characterized by subjectivity, incompleteness and imprecision. Therefore, the specification of clearly defined targets is rather unrealistic. Fuzzy logic has the ability to deal with the complex concepts which are not amenable to a straightforward quantification and contain ambiguities. The starting point of decision models in a fuzzy environment was the model introduced in [4].Traditional approaches to discrete multicriteria evaluation in a corresponding author:
[email protected] 477
478
fuzzy environment are provided by [13], [7], [2], [3], and [5]. For a detailed overview of these methods see [ 101. In the proposed framework, a fuzzy outranking approach is used in order to take into account the non-compensatory nature of some specific problems. [ 111 and [ 121 used outranking relations effectively by introducing fuzzy concordance relations and fuzzy discordance relation.
2.
The Framework of the Proposed Model
The proposed model to evaluate the introduced problem is composed of 3 steps. Details of the steps of the model will be given in the following sections.
2.1. Determination of Weights In the first step the weights of attributes are determined according to experts’ opinions based on a 1-6 scale. The experts are asked to evaluate contribution of each attribute to the evaluation. They have opportunity to indicate their preferences as very high, high, fairly high, fairly low, low and very low. Then these preferences are represented as triangular fuzzy numbers (TFNs) according to Chen and Hwang’s fifth scale [6]. Subsequently, by means of fuzzy arithmetic operations, logic of which depends on [9], firstly average weight of each indicator is calculated and normalization is then applied for normalized weights. Mathematical formulation of the weights is given below.
where the preference of kth expert forjth indicator is pk, = (pk ;p t ;p: weight of jth indicator is calculated as
=rs:,
s,=(s,L;sy;sP), e is
), average
the number of experts, t is
the normalized weight of the jth indicator is
,=I
w ,= (,$;
w y ;wp ), and m is the number of indicators.
2.2. Measurement of Performance Scores In the second step, performance scores are measured. Instead of using the data for each attribute directly, normalization is made to obtain a common scale to allow aggregation and to facilitate fuzzy computation. For this purpose, the
479
revised version of the fuzzy normalization proposed by [8] is used. The basic difference of the revised approach is that not only the target values but also the thresholds of each attribute are taken into account. This permits the possibility of avoid trade-off among attributes. Additionally, due to the vague and uncertain characteristics of these two values, they are represented by TFNs. The corresponding normalization formulations are given below: For utility attribute (the higher, the better), the following normalization is used:
min(max(x,,;LBf)-LB,!;TP -LB~).min(max(x,,,LB:)-LB~;T~ -LB:) TP - LB: Ti" - LB," min(max(x,; LB:)- LB,L;T; - LB;) T: - LB;
1
(3)
For the cost type of attribute (the lower, the better), however, the normalization is made by: min(UBf -rnax(x,;UBF);UB; -T:) UBj" -qL
min(UBy -max(x,,UBy);UBj" - T y ) . UBj" -T,M
min(UBy -max(x,;UBy),UBf -T,!) U B f -T,!
1
(4)
Here, Nu is a TFN and it represents the normalized value of a alternative i with respect to attributej; xii is the entry value of alternative i with respect to attribute j ; LBj is the lower bound of the utility attribute j and it is a TFN characterized by LB, = (LBR; LB!' l ;LBI.); UBj is the upper bound of the cost attribute j and it is a I l and T j is the target value of TFN characterized by UBj = (uBR;uBM;uB~.), J J J attribute j and it is characterized by Tj = (T,!;T; ; T ; ) . Here; subtraction and division operations for TFNs are based on the definitions in [6].
2.3. Fuzzy Outranking Approach for the Comparison of Alternatives In this study the revised version of the methodology suggested by Aouam et.al. [ 11 is proposed for the comparison of alternatives. Herein details of methodology will not given but changes made with respect to [ 11 will be emphasized. The original method can take both crisp and fuzzy inputs. An outranking intensity is introduced to determine the degree of overall outranking between competing alternatives, which are represented by fuzzy numbers. The comparison of these degrees is made through the concept of overall existence ranking index (I(a)). In the methodology first of all fuzzy concordance (dC(a,b)) and disconcordance functions (dD(a,b)) are calculated. Then outranking intensity (df(a,b)) is measured that is used to get the overall outranking intensity (I(a)). In
480
order to get fuzzy concordance function, the partial outranking number (dj(a,b)) and the fuzzy concordance number (C(a,b)) are calculated respectively. Maximum non-significant threshold (sj) and criteria weights (pj) are inserted to the model at this level. Additionally, in order form fuzzy disconcordance function, fuzzy disconcordance numbers (Dj(a,b)) are used in which fuzzy veto thresholds (vj) are included. The main drawbacks encountered in Aouam et.aZ.'s methodology are related to fuzzy operations. Although it is claimed that the model can take fuzzy inputs, there are some problems on applying the model for fuzzy inputs. Therefore in this study some revisions are made on the model. The first revision is made on the computation of partial outranlung fuzzy number, dj(a,b). If attributej is fuzzy then in the proposed revision dj(a,b) is defined as fallows:
d;(ab> = ( (giwL- g;(b>R>+, (gi(aIM- gi(b>M>+,
- gj(b>L>+ + s;
)I
(5)
where gj(a)L; gj(a)M,and gj(a)Rare the left, central and right values, respectively, of the triangular fuzzy membership function of gj(a). The second revision is made on the calculation of fuzzy concordance number as fallows:
where ('p:)L, (pi)M, and ('p:)R are left, central and right values, respectively, of the triangular fuzzy membership function of pic and
If gj(a) and g,(b) are fuzzy numbers then the comparison of them in eq.(7) will require fuzzy ranktng method. For this purpose in revised version we use Baldwin and Guild fuzzy ranking approach [6] to decide whether gj(a) 2 gj(b) or not. According to this approach if we want to compare A and B fuzzy numbers, and if the membership functions ~AA(xA)and ~B(xB)are piecewise linear functions then the preferability of A over B is
where 6 and y are right and central values, respectively, of the TFN of A; a and p are left and central values, respectively, of the TlW of B. As a result the fuzzy number with higher preferability will be bigger. If we apply this approach to eq.(7) , we get
48 1
Another revision made on [l] is concerned with the calculations of fuzzy disconcordance number for fuzzy attribute j , that is as fallows:
The revised method is also different from the original one on the calculation of fuzzy disconcordance function. During the calculation of fuzzy disconcordance numbers (DJ(a,b)), the comparison of two fuzzy numbers is required. Baldwin and Guild’s ranking approach is suggested to solve this problem. Finally fuzzy outranlung function and overall intensity index is defined similarly to that used in [ 13. Overall intensity index I(A) is used to evaluate alternatives.
3.
Conclusions and Further suggestions
In this paper, a fuzzy outranking approach which is non-compensatory in nature and which takes into account the fuzziness of the weights of the indicators as well as those of the threshold values is proposed. The proposed method is applied to the evaluation and ranking of 36 countries using 47 attributes. Details can be found in Kabak and Ulengin [14]. The proposed method improves the one given in Aouam et al. [ 11 by using more accurate fuzzy attributes as input. In the original paper if inputs are TFNs there is a possibility of getting TFNs such that a > b or b > c or a>c (TFN is considered as (a; b; c)), which is, in fact, in conflict with the logic behind TFNs (for a TFN a I b p , . Theorem 3. The passive optimal building size following equations 4 p q -C(q)) = 1- N P >
9
9.
is the solution of the
(16)
where p < p , . Theorem 4. The active optimal building size is larger than the passive optimal building size, namely, 4' > q* . Definition 2. Suppose that there are two possibility distributions, x Aand nB. If for any arbitrary x nA(x) 1 nB(x) holds, then nEis said more informed than nA, which is denoted as nB> nA. Theorem 5. Denote the active optimal building size based on possibility distributions A and B as q i and qi , respectively, the passive optimal building size based on possibility distributions A and B as qrA and q . B ,respectively. If * * nE> nA, then qA 2 qe and qtAIqeEhold. Theorem 6 means that increasing the uncertainty of price can make investor increase investment scale C ( q ) if he considers the active focus point and decrease investment scale if he considers the passive focus point. Given the possibility distribution of the building price p in the future as np( p ), according to the extension principle, the possibility distribution of profit with the building size q , is obtained as follows,
For a given possibility distribution, two optimal building sizes q' and q . , can be obtained from (14)-(17). Two possibility distributions of profits with q* and q* ,denoted as n * ( r ) and n.(r),respectively, can be obtained.
506
Theorem 6. The possibility measure of R 2 0 based on possibility distributions z * ( r ) and n~(t-0are denoted as P o s * ( R2 0) and Pos.(R 2 0) , respectively. The necessity measure of R 2 0 based on possibility distributions Z*( r ) and ~ ( r are ) denoted as Nec*(R20) and Nec,(R 20) , respectively. The followings hold, (1) Pos*(R 2 O ) = sup m i n ( n p ( p ) , u ( R ( q v ( p ) , p ) ) ) P E [ P , .P"
1
(2) N e c * ( R 2 0 ) = 0 (3) Pos,(R 2 0 ) = 1 (4) Nec,(R 2 0) = inf PEI P , P" 3
I
max( l - n t p ( p ) , u ( R ( g V ( p )p, ) ) )
Decision rule for urban land investment development If Nec, ( R L 0) 2 a or POS * ( R 2 0 ) 2 p ,then the build should be constructed at present for selling at date 1, or else, not constructed, where 0 I a I 1 and 0 5 f l < 1 are predetermined thresholds for making decision. References 1. M. Amram and K.Kulatilaka, Real Option: Managing Strategic Investment in an Uncertain World, Harvard Business School Press, Boston, 1999. 2. A. K. Dixit and P.S. Pindyck, Investment under Uncertainty, Princeton University Press, New Jersey, 1994. 3. D. Dubois, H. Prade and R. Sabbadin, Decision-theoretic foundations of possibilty theory, European Journal of Operational Research 128 (2001) 459-478. 4. R. Richard, Valuation for Real Estate Decision, Democrat Press, Santa Cruz, 1972. 5. S. Titman, Urban Land Prices under Uncertainty, The American Economic Review 75( 1985) 505-5 14. 6. T. Whalen, Decision making under uncertainty with various assumptions about available information, IEEE Transaction on Systems, Man and Cybernetics 14 (1984) 888-900. 7. R. R. Yager, Possibilistic decision making, IEEE Transaction on Systems, Man and Cybernetics 9 (1979) 388-392.
AN APPLICATION OF FUZZY AHPDEA METHODOLOGY FOR THE FACILITY LAYOUT DESIGN IN THE PRESENCE OF BOTH QUANTITATIVE AND QUALITATIVE DATA UMUT RIFAT TUZKAYA Department of Industrial Engineering, Yildiz Technical Universiry, Barbaros Street, Yildiz, istanbul, 34349, Turkey
TIJEN ERTAY' Department of Managerial Engineering, Istanbul Technical University, Macka, 34367,Istanbul, Turkey
An effective facility layout evaluation procedure necessitates the consideration of qualitative criteria, e.g. flexibility in volume and variety and quality related to the product and production, as well as quantitative criteria such as material handling cost, adjacency score, shape ratio, material handling vehicle utilization, etc. in the decision process. This paper presents a decision-making methodology based on data envelopment analysis (DEA), which uses both quantitative and qualitative criteria, for evaluating Facility Layout Design (FLD). A computer-aided layout-planning tool, VisFuctory, is adopted to facilitate the layout alternative design process as well as to collect quantitative data. Fuuy AHP is then applied to collect qualitative data related to quality and flexibility. DEA methodology is used to solve the layout design problem by simultaneously considering both the quantitative and qualitative data. The purposed integrated procedure is applied to a real data set consisting of twelve FLDs provided of the plastic profile production system.
1. Introduction Facility layout is one of the main fields in industrial engineering where much research effort was spent and numerous approaches were developed. A layout problem can surface in the design and allocation of space in a new building or the reassignment of space in an existing building [l]. Various constructive and improvement algorithms are built for new layouts and rearrangements respectively. Especially, in last decade commercial products have become available based on some original algorithms. These computer-aided approaches provide speed on calculations and constitute integrated structures. Also in this study, a software package that uses many algorithms at the same time is applied to generate alternative layouts. In addition, some qualitative
* Corresponding Author:
[email protected] 507
508 criterions are determined by fuzzy AHP to choose the most required one between these applicable alternatives. At last stage, all of these data are embedded to DEA to conclude the evaluating procedure. 2.
Fuzzy AHP
Fuzzy set theory was introduced by Zadeh [2] to deal with vague, imprecise and uncertain problems. This theory has been used as a modeling tool for complex systems that are hard to define precisely, but can be controlled and operated by humans [3]. More detailed discussions related to fuzzy sets, fuzzy relations, and fuzzy operations can be found in Ross [4], Chen [ 5 ] , and Zadeh [63. By embedding AHP method into fuzzy sets, an other application area of Fuzzy Logic is revealed. Various types of fuzzy AHP methods can be seen in the literature, such as Laarhoven and Pedrcyz (1983), Buckley (1985), Chang (1996), and Leung and Cao (2000). Decision makers usually find that it is more confident to give interval judgments than fixed value judgment. This is because usually he/she is unable to explicit about hidher preferences due to the fuzzy nature of the comparison process [ 7 ] . In this study, Chang’s [8] extent analysis method is preferred since the steps of this approach are relatively easier than the other fuzzy AHF’ approaches and similar to the crisp AHP.
3. Data Envelopment Analysis DEA is a non-parametric technique for measuring and evaluating the relative efficiencies of a set of entities with common crisp inputs and outputs. DEA, developed by Charnes et al. [9] converts multiple inputs and outputs in to a scalar measure of efficiency. There are some multiple criteria approaches to DEA problems that are formulated based on different DEA models. Here, two-staged DEA is applied to get realistic results. Firstly, an ordinary linear program that transformed from fractional program is used. A deviation variable for the DMUo as do is defined and tried to minimize it. 6represent the inefficiency rate for DMUo. But usually, DEA’s discrimination power cannot be sufficient and DEA results can show that most of the DMU’s are efficient and their do values are zero. For this reason in the second stage Minimax Efficiency Method is preferred. Minimax efficiency is a good method to prevent weak discriminating power. This is a practical method to alleviate the problem of multiple relatively efficient DMUs. Also it is more restrictive than the efficiency defined in classical DEA. The mentioned DEA models was not given here but detailed and explicit information can be found in Li and Revees [ 101.
509 4. Methodology The layout problem is considered as an integrated framework of Fuzzy AHP and DEA. Qualitative ve quantitative performance measuresare are used together to find most desirable layout. T h s integrated framework is illustrated in Figurel. The methodology constitutes from three stage; generation of alternative layouts, gathering data concerned these alternatives for DEA, and using DEA to choose the most desirable layout alternative.
4 Layout alternative generation
I Flow Distance I
Handling cost
1
,
Material handling vehicle utilization
(
I
I
,
Shape Ratio
I
DEA methodology to ineasure layout designs perfonnances
I
c I I Selection of the final layout design solution
Figure 1. The framework for choosing most desirable layout design
4.1. Alternative Layout Generation
EM’SVisFactory software package is used to generate layout alternatives. This software has three main parts that are FactoryPLAN, FactoryFLOW and FactoryOPT. This software’s main parts can be used for various aims like measuring current layouts efficiencies, determining density between activities, and calculating total cost or time about the concerned material handling. In this study, the data about produced main products, their average weights, components that used in products, material handling vehicles and material flows between departments are given to FactoryFLOW. Also the data about department’s space requirements and relationships between departments are
510
given to FactoryPLAN. At last step, FactoryOPT is used the data obtained from FactoryFLOW and FactoryPLAN to generate layout alternatives. In this alternative generation process, FactoryOPT uses different algorithms by considering parameters, which can be changed by the user.
4.2. Gathering data for DEA Two types of criteria constitute DEA inputs and outputs. Handling costs and adjacency scores, inputs for DEA are quantitative criterions. Also material handling vehcle utilization, flow distances, and shape ratios are quantitative criterions and outputs for DEA. These listed criterions are obtained from VisFactory program’s outputs for all alternative layouts. Second type criteria, flexibility, and quality are qualitative ones and used as outputs in DEA too. These qualitative data must be converted to convenient state for DEA. To achieve this necessity and deal with the vagueness and imprecision of qualitative data set, Fuzzy AHP method is used.
4.3. DEA Methodology for choosing the most desirable layout alternative DEA is applied to choose the most desirable layout alternative by using the data gathered from VisFactory and Fuzzy AHP. Thanks to DEA methodology, performances of all alternative layouts are measured. A decision is given according to DEA results, like the current layout is efficient or it must be changed to improved one. 5.
Case Study
The purposed framework is applied to a plastic profile producer. The company produces various extrusion products. Some of the production characteristics bring problems relevant to productivity. Therefore, layout improvements are required. By using VisFactory, eleven layout alternatives are generated and the data, shown in Table 1 about alternatives and also current layouts are obtained as mentioned in section 4.2. Table 1. Quantitative Criterions’ Values for alternative layouts
Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met
511 11 Current
I I
20330,681 203551
45161 174021
0,3437231 0,44214291
29,46 30,02
As fifth criteria, volume-flexibility, variety-flexibility, production-quality, and product-quality are combined by using Chang's Fuzzy AHP methodology. The fuzzy evaluation matrix relevant to the goals is given in Table 2. Table 2. Fuzzy evaluation matrix relevant to goals [Volume Flexibility Variety Flexibility IProduction Quality1 Product Quality
I
I
Table 11 comparative maran and pnn withthree traning meth met Table 11 comparative maran and pnn withthree traning meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 The vectors shown below are obtained from Table 2 by using the expressions that are presented in Chang's Fuzzy AHl' methodology. (5.29, 6.33, 7.40) 0(20.77,24.08,27.55) = (109.76, 152.53,203.89) S v s = (8.50, 10.00, 11.50) 0 (20.77,24.08,27.55) = (176.50,240.83,316.85) SpnQ= (5.19, 5.83, 6.57) 0(20.77,24.08,27.55) = (107.68, 140.49, 180,93) SpQ= (1.79, 1.92,2.09) 0(20.77,24.08,27.55) = (37.52,46.16,57.47) By using these vectors, degree possibilities are obtained like below: V(SvoF2Sva~)= 0.237, V(SvoF2S,) = 1.0, V(SvoFLSPQ)= 1.0 v(svaF 2SVoF) = 1.0, v(sVfi 2SbQ)= 1.0,v(sVfi 2SPQ)= 1.0 = 0.855, V(Sp,,Q 2Svfi) = 0.042, V(S,Q 2S p~ = ) 1.0 V(S,Q = 0.0, v(SpQ2spnQ)= 0.0. V(s, 2sVoF) = 0.0, v(SpQ Thus the weight vector of the goals are calculated as WG = (0.19,0.78, 0.03, O.O)T Pair-wise comparison matrices realize comparison of importance of a criterion over another or an alternative over another. Degrees of the importance are given as triangular fuzzy numbers: Equal importance (1, 1, 1), weak importance (2/3, 1, 3/2), fairly strong (3/2, 2, 5/2), very strong (5/2,3, 7/2), absolute (7/2, 4, 9/2). As a result of evaluation of alternative layouts with respect to the goals by using importance degrees, the weight vectors are acquired for each goal. By the combination of priority weights for goals and alternatives, the last alternative priority weights are obtained as; 0.003, 0.077, 0.078, 0.035, 0.136, 0.120, 0.036, 0.000, 0.110, 0.136, 0.135, and 0.134, with respect to alternative one to alternative eleven and current layout. In ths point all criterions' values for all alternative layouts are ready for DEA. Firstly, classical DEA is applied but the results are not distinguishable because seven alternative and current layouts seems efficient. Therefore minimax efficiency method is applied to increase the discriminating power. The result was
512
remarkable, only second and tenth alternative layouts are efficient. However, current layout’s efficiency score is 0.85 and needs improvement. 1
2
3
4
5
6
7
8
9
10
11
t
1.00 1.00 0.99 0.98 1.00 1.00 1.00 0.86 0.86 1.00 1.00
1.00
wnimaxefficiency~~~0.95 1.00 0.98 0.90 0.84 0.94 0.82 0.82 0.82 1.00 0.96
0.85
Classical Method
6. Conclusion This study addresses the evaluation of the facility layout alternatives by developing an integrated framework based Fuzzy AHPDEA methodology together with a software tool. The proposed framework is applied to a real data set consisting of twelve alternatives. As a result of this application, two alternatives determined as relatively efficient. References
1. R. S. Liggett, Automated facilities layout: past, present and future, Automation in Construction, 9, 197-2 15 (2000) 2 . Zadeh LA. Fuzzy sets. Information and Control, 8:338-53, (1965). 3. F. Dweiri, Fuzzy development of crisp activity relationship charts for facilities layout, Computers and Industrial Engineering, 36 (1999). 4. T. J., Ross, Fuzzy Logic with Engineering Applications, 2795 (1995). 5. S. Chen, C. Hwang, and F. P. Hwang, Fuzzy Multiple Attribute Decision Malung, 42,3140 (1992). 6. Zadeh LA. Probability measures of fuzzy events. Journal of Mathematical Analysis and Applications, 23:422 1-427 (1968). 7. C. Kahraman, U. Cebeci, and D. Ruan, Multi-attribute comparison of catering service companies using fuzzy AHP: The case of Turkey, International Journal of Production Economics, 87,171-184 (2004). 8. K. Zhu, Y. Jing, and D. Chang, A discussion on Extent Analysis Method and applications of fuzzy AHP, European Journal of Operational Research, 116,450-456 (1999). 9. A. Charnes, W. W. Cooper, and E. Rhodes, Measuring the efficiency of decision-malung units, European Journal of Operational Research, 2, 429444 (1978). 10. X. Li and G. R. Reeves, A multiple criteria approach to data envelopment analysis, European Journal of Operational Research 115,507-517 (1999).
AN INTELLIGENT HYBRID APPROACH FOR INDUSTRIAL
QUALITY CONTROL COMBINING NEURAL NETWORKS, FUZZY LOGIC AND FRACTAL THEORY PATRICIA MELM AND O S C A R CASTILLO Department of Computer Science, Tijuana Institute of Technology, P.O. Box 4207, Chula Vista CA 91909, USA. :pmelin@,tectiiuana.mx We describe in this paper a new hybrid intelligent approach for industrial quality control combining neural networks, fuzzy logic and fractal theory. We also describe the application of the neuro-fuzzyfractal approach to the problem of quality control in the manufacturing of sound speakers in a real plant. The quality control of the speakers was done before by manually checking the quality of sound achieved after production [4]. A human expert evaluates the quality of sound of the speakers to decide if production quality was achieved. Of course, this manual checking of the speakers is time consuming and occasionally was the cause of error in quality evaluation [8]. For this reason, it was necessary to consider automating the quality control of the sound speakers.
1. Introduction The quality control of the speakers was done before by manually checking the quality of sound achieved after production [4]. A human expert evaluates the quality of sound of the speakers to decide if production quality was achieved. Of course, this manual checking of the speakers is time consuming and occasionally was the cause of error in quality evaluation [8]. For this reason, it was necessary to consider automating the quality control of the sound speakers. The problem of measuring the quality of the sound speakers is as follows: 1. First, we need to extract the real sound signal of the speaker during the testing period after production 2. Second, we need to compare the real sound signal to the desired sound signal of the speaker, and measure the difference in some way 3. Third, we need to decide on the quality of the speaker based on the difference found in step 2. If the difference is small enough then the speaker can be considered of good quality, if not then is bad quality. The first part of the problem was solved by using a multimedia kit that enable us to extract the sound signal as a file, which basically contains 108000 points over a period of time of 3 seconds (this is the time required for testing). We can say that the sound signal is measured as a time series of data points [3], which has the basic characteristics of the speaker. The second part of the problem was solved by using a neuro-fuzzy approach to train a hzzy model with the data from the good quality speakers [9]. We used a neural network [6] to obtain a Sugeno fuzzy system [14, 151 with the time series of the ideal speakers. In this case, a neural network [5, 11, 131 is used to adapt the parameters of the hzzy system with real data of the problem. With this fuzzy model, the time series of other speakers can be used as checking data to evaluate the total error between the real speaker and the desired one. The third part of the problem was solved by 513
514
using another set of fuzzy rules [20], which basically are fuzzy expert rules to decide on the quality of the speakers based on the total checking error obtained in the previous step. Of course, in this case we needed to define membership functions for the error and quality of the product, and the Mamdani reasoning approach was used. We also use as input variable of the fuzzy system the fractal dimension of the sound signal. The fractal dimension [9] is a measure of the geometrical complexity of an object (in this case, the time series). We tested our fuzzy-fractal approach for automated quality control during production with real sound speakers with excellent results. Of course, to measure the efficiency of our intelligent system we compared the results of the fuzzy-fractal approach to the ones by real human experts. 2. Basic Concepts of Sound Speakers In any sound system, ultimate quality depends on the speakers [4]. The best recording, encoded on the most advanced storage device and played by a top-ofthe-line deck and amplifier, will sound awful if the system is hooked up to poor speakers. A system's speaker is the component that takes the electronic signal stored on things like CDs, tapes and DVD's and turns it back into actual sound that we can hear. 2.1 Sound Basics
To understand how speakers work, the first thing you need to do is understand how sound works. Inside your ear is a very thin piece of skin called the eardrum. When your eardrum vibrates, your brain interprets the vibrations as sound. Rapid changes in air pressure are the most common thing to vibrate your eardrum. An object produces sound when it vibrates in air (sound can also travel through liquids and solids, but air is the transmission medium when we listen to speakers). When something vibrates, it moves the air particles around it. Those air particles in turn move the air particles around them, carrying the pulse of the vibration through the air as more and more particles are pushed farther from the source of the vibration. In this way, a vibrating object sends a wave of pressure fluctuation through the atmosphere. When the fluctuation wave reaches your ear, it vibrates the eardrum back and forth. Our brain interprets this motion as sound. We hear different sounds from different vibrating objects because of variations in : sound wave frequency -- A higher wave frequency simply means that the air pressure fluctuates faster. We hear this as a higher pitch. When there are fewer fluctuations in a period of time, the pitch is lower. 0 air pressure level -- the wave's amplitude -- determines how loud the sound is. Sound waves with greater amplitudes move our ear drums more, and we register this sensation as a higher volume.
515
A speaker is a device that is optimized to produce accurate fluctuations in air pressure. Figure 1 shows a typical speaker driver.
Fig.1. A typical speaker driver, with a metal basket, heavy permanent magnet and paper diaphragm
3. Description of the Problem The basic problem consists in the identification of sound signal quality. Of course, this requires a comparison between the real measured sound signal and the ideal good sound signal. We need to be able to accept speakers, which have a sound signal that do not differ much from the ideal signals. We show in Figure 2 the form of the sound signal for a good speaker (of a specific type). The measured signal contains about 108 000 points in about 3 seconds. We need to compare any other measured signal with the good one and calculate the total difference between both of them, and if the difference is small then we can accept the speaker as a good one. On the other hand, if the difference is large then we reject the speaker as a bad one. We show in Figure 3 the sound signal for a speaker of bad quality. We can clearly see the difference in the geometrical form of this signal and the one shown in Figure 2. In this case, the difference between the figures is sufficiently large and we easily determine that the speaker is of bad quality. We also show in Figure 4 another sound signal for a bad quality speaker.
Fig. 2. Sound signal of a Good Speaker
516
Fig. 3. Sound Signal of Bad Speaker (Case 1)
Fig. 4. Sound Signal of Bad Speaker (Case 2 )
4. Basic Concepts of Fractal dimension Recently, considerable progress has been made in understanding the complexity of an object through the application of fractal concepts [8] and dynamic scaling theory [ 111. For example, financial time series show scaled properties suggesting a fractal structure [ 1, 2, 31. The fractal dimension of a geometrical object can be defined as follows: d = lim [InN(r)] / [ln(l/r)]
(1)
r+O
where N(r) is the number of boxes covering the object and r is the size of the box. An approximation to the fractal dimension can be obtained by counting the number of boxes covering the boundary of the object for different r sizes and then performing a logarithmic regression to obtain d (box counting algorithm). In Figure 5, we illustrate the box counting algorithm for a hypothetical curve C . Counting the number of boxes for different sizes of r and performing a logarithmic linear regression, we can estimate the box dimension of a geometrical object with the following equation: In N(r) = Inp - d Inr
this algorithm is illustrated in Figure 6 .
Yt Fig.5. Box counting algorithm for a curve C
Fig.6. Logarithmic regression to find dimension
517
We developed a computer program for calculating the fractal dimension of a sound signal. The computer program uses as input the figure of the signal and counts the number of boxes covering the object for different grid sizes. For example, the fractal dimension for the sound signal of Figure 2 is of 1.6479, which is a low value because it corresponds to a good speaker. On the other hand, the fractal dimension for Figure 3 is 1.7843, which is a high value (bad speaker). Also, for the case of Figure 4 the dimension is 1,8030, which is even higher (again, a bad speaker). 5. Experimental Results We describe in this section the experimental results obtained with the intelligent system for automated quality control. The intelligent system uses a fizzy rule base to determine automatically the quality of sound in speakers. We used a neural network to adapt the parameters of the fuzzy system using real data from the problem. We used the time series of 108000 points measured from a good sound speaker (in a period of 3 seconds) as training data in the neural network. We then use the measured data of any other speaker as checking data, to compare the form of the sound signals. A neural network is used to adapt a fuzzy system with training data of good sound speakers. The approximation is very good considering the complexity of the problem. Once the training was done, we used the fuzzy system for measuring the total difference between a given signal and the good ones. This difference is used to decide on the final quality of the speaker using another set of fuzzy rules with the Mamdani approach. The fuzzy rules are as follows: IF Difference is small IF Difference is regular IF Difference is regular IF Difference is medium IF Difference is medium IF Difference is large IF Difference is large IF Difference is small
AND Fractal Dimension is small THEN Quality Excellent AND Fractal Dimension is small THEN Quality is Good AND Fractal Dimension is high THEN Quality is Medium AND Fractal Dimension is small THEN Quality is Medium AND Fractal Dimension is high THEN Quality is Bad AND Fractal Dimension is small THEN Quality is Medium AND Fractal Dimension is high THEN Quality is Bad AND Fractal Dimension is high THEN Quality is Medium
We show in Figure 7 the non-linear surface of a type-I hzzy system for quality control of the sound speakers. We also show in Table 1 the results of the hzzy systems for I2 specific situations
Fig.7. Non-linear surface of type-I fuzzy system
518 Table 1. Outputs of the fuzzy systems for 12 specific situations
Table 11 comparative maran and pnn withthree traning meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 Table11 11comparative comparativemaran maranand andpnn pnnwiththree withthreetraning traningmeth methmet met150 150 Table Table 11 comparative maran and pnn withthree traning meth met 150 Table Table11 11comparative comparativemaran maranand andpnn pnnwiththree withthreetraning traningmeth methmet met150 150 Table 11 comparative maran and pnn withthree traning meth met 150 Table 11 comparative maran and pnn withthree traning meth met 150 6. Conclusions We described in this paper the application of a fizzy logic to the problem of automating the quality control of sound speakers during manufacturing in a real plant. We have implemented an intelligent system for quality control in the plant using the new approach. We also use the fractal dimension as a measure of geometrical complexity of the sound signals. The intelligent system performs rather well considering the complexity of the problem. The intelligent system has been tested in a real manufacturing plant with very good results.
References [I] Castillo, 0. and Melin, P. (2001). “Soft Computing for Control of Non-linear Dynamical Systems”, SpringerVerlag, Heidelberg, Germany. [2] Castillo, 0. and Melin, P. (2002):’Hybrid Intelligent Systems for Time Series Prediction Using Neural Networks, Fuzzy Logic and Fractal Theory”, IEEE Transactions on NeuralNetworks, vol. 13, pp. 1395-1408. [3] Castillo, 0. and Melin, P. (2003). “Soft Computing and Fractal Theory for Intelligent Manufacturing”, Springer-Verlag, Heidelberg, Germany. [4] Dickason, V. (1997). The Loudspeaker Cookbook, McGraw Hill. [5] Haykin, S . (1996). “Adaptive Filter Theory”, Third Edition, Prentice Hall. [6] Jang, J.R., Sun, C.T. and Mizutani E. (1997). Neuro-Fuzzy and Soft Computing, Prentice Hall. [7] Karnik, N. N. and Mendel, J. M. (1998). “An Introduction to Type-2 Fuzzy Logic Systems”, Technical Report, University of Southern California. [8] Loctite Co. (1999). Speaker Assembly Adhesives Guide. [9] Mandelbrot, B. (1987). “The Fractal Geometry of Nature”, W.H. Freeman and Company. [ 101 Melin, P. and Castillo, 0. (2002). “Modelling, Simulation and Control of Non-Linear Dynamical Systems”, Taylor and Francis, London, Great Britain. [ 111 Parker, D.B. (1982). “ Learning Logic”, Invention Report 581-64, Stanford University. [ 121 Rasband, S.N. (1990). “Chaotic Dynamics of Non-Linear Systems”, Wiley Interscience. [ 131 Rumelhart, D.E., Hinton, G.E., and Williams, R.J. (1986). “Learning Internal Representations by Error Propagation”, in “Parallel Distributed Processing: Explorations in the Microstructures of Cognition”, MIT Press, Cambridge, USA, Vol. 1, pp. 318-362. [I41 Sugeno, M. and Kang, G.T. (1988). Structure Identification of Fuzzy Model, F u y Sets and Systems, Vol. 28, pp.15-33. [15] Takagi, T. and Sugeno, M. (1985). Fuzzy Identification of Systems and its Applications to Modeling and Control, IEEE Transactions on Systems, Man, and Cybernetics,vol. 15, pp.116-132. [I61 Wagenknecbt, M. and Hartmann, K. (1988). “Application of Fuzzy Sets of Type 2 to the Solution of Fuzzy Equations Systems”, Fuzzy Sets andSystems, Vol. 25, pp. 183-190. [17] Yager, R. R. (1980). “Fuzzy Subsets of Type I1 in Decisions”, Journal ofcybernelics, Vol. 10, pp. 137-159. [18] Zadeh, L. A. (1971). Similarity Relations and Fuzzy Ordering, Information Sciences,vol. 3, pp. 177206. [19] Zadeh, L. A. (1973). Outline of a New Approach to the Analysis of Complex Systems and Decision Processes, IEEE Transactions on Systems, Man, and Cybernetics, voI3,pp. 28-44. [20] Zadeh, L. A. (1975). “The Concept of a Linguistic Variable and its Application to Approximate Reasoning”, Information Sciences, 8, pp. 43-80.
A FUZZY HEURISTIC MULTI-ATTRIBUTE CONJUNCTIVE APPROACH FOR ERP SOFTWARE SELECTION CENGIZ KAHRAMAN', GULCIN BUmKOZKAN*32,DA RUAN3 'Istanbul Technical University, Department of Industrial Engineering, Macka 34367, Istanbul, Turkey 2Galatasaray Universiw, Department of Industrial Engineering, Ortakoy 34357 Istanbul, Turkey 'Belgian Nuclear Research Centre (SCK-CEN), Boeretang 200, B-2400 Mol, Belgium In recognition of the importance of the ERP software selection and of the sizable risk that organizations take when they decide to buy this type of technology, this study proposed a systematic decision process for selecting a suitable ERP package. We are interested in two stages of decision-making: the identification of serious software candidates and the choice of the most appropriate ERP software package. To achieve this, we propose an integrated approach based on fuzzy logic, heuristic multi attribute utility and conjunctive methods. Finally, a case study is given to demonstrate the potential of the methodology.
1. Introduction Under the pressure to proactively deal with the radically changing external environment, many firms have changed their information system strategies by adopting application software packages rather than in-house development [ 11. An application package such as enterprise resource planning (ERP) system provides reduced cost, rapid implementation, and high system quality [2]. The growing number and quality of ERP software, the cost of the package (costs equaling several thousands, hundreds of thousands, and even millions of dollars) [3], the set-up, running and other related costs and the fact that the complexity of ERE' packages and profusion of alternatives require expertise for their evaluation make the selection of an appropriate ERP package a vital issue to practitioners [ 4 ] . In addition, since an ERP system, by its very nature, will impose its own logic on a company's strategy, organization, and culture, it is imperative that the ERP selection decision be conducted with great care. A group approach to ERP software selection decision offers many benefits, including improved overall decision quality and decision-making effectiveness. Clearly, ERF' software selection is not a well-defined or structured decision problem. The presence of multiple criteria and the involvement of multiple decision makers (DMs) will expand decisions from one to many several dimensions, thus, increasing the ~
~
* Correspondingauthor: e-mail:
[email protected] 519
520 complexity of the solution process. For this reason, in this study, we propose to use both heuristic multi attribute utility and conjunctive methods by combining with fuzzy set theory in a two-phase ERP software selection process. The application of the suggested methodology is also explained through a case study. 2. Suggested framework for ERP software selection The evaluation procedure begins with identifying the software selection criteria. From a candidate list, the DMs have to narrow the field to four to six serious candidates. This can be accomplished by a preliminary analysis of the strengths and weaknesses of each supplier and the “goodness of fit” of the software to company needs by conducting heuristics rules. The measurement of performance corresponding to each criterion is conducted under the setting of fuzzy set theory. Finally, we apply in the second phase fuzzy conjunctive method to achieve the final ranking results and select the most appropriate ERP software package. The evaluation algorithm may then be summarized by the following: Step 1. Identify interest groups of people in the decision environment, Step 2. Identify attributes and establish a universe of discourse for each attribute, Step 3. List all alternatives, Step 4. Interact with DMs to identify their heuristics and according to them, construct heuristic decision rules for each group, Step 5. List the selected most appropriate alternatives and measure along the attributes, Step 6. Calculate the utilities of alternatives for each group and determine the alternatives to get for the following steps, Step 7. Interact with the DMs to identify the cutoff vector, Step 8. Determine the possibility and necessity measures for the identified alternatives using a fuzzy conjunctive method, Step 9. Rank the alternatives from the best to the worst and select the best ERP package for the company. The DMs are actively involved in the decision process and the decision rules are constructed through serious discussions between the DMs and analysts. 3.
Evaluation methods of ERP software selection
3.1. Heuristic Multi Attribute Utility Function (MAUF) approach The authors [ 5 , 61 argued that the multiple attribute utility function (MAUF) could not be practically obtained by the combination of single attribute utility functions because of the dependency among attributes. Therefore a heuristic approach is needed to define the MAUF [7]. A heuristic method is an algorithm that gives only approximate solution to a given problem. Heuristic methods are commonly used because they are much faster than exact algorithms. Since decision data may be numerically and/or linguistically expressed, fuzzy set
521 theory must be incorporated in the heuristic approach. In this case, the utility function is represented in the “IF .. THEN’ decision rule format.
3.2. Fuzzy conjunctive approach Dubois et al. [8] proposed the fuzzy version of the conjunctive method. They pointed out that when data in a decision matrix and the DM’s standard levels are fuzzy, the matching between these two fuzzy data becomes vague and, naturally, a matter of degree. The degree of matching is measured by the following membership function: p (a)= s u p h Q(x)(pp (x)= a ) tla , where
n,(.)
PI Q
represents the degree of possibility that x is the (unique) value which describes an object modeled by Q; p p ( x )is the degree of compatibility between the value x and the meaning of P. Thus, pqe(a)denotes the degree of compatibility of Q with respect to P. Dubois et al. [8] defined two scalar indices to approximate the ,uPlQ(a)measure so that compatibility between fuzzy sets can be estimated: the possibility of matching and the necessity of matching for single and multiattributes. In this study, we use the measures of matching for multi-attributes (as in most real world problems) by applying the following equations: (1) n ( A ~ , A , ) - rmn n(x,o,xq): the possibility measure J=1,
n
N(AO,A~)= 1=1, ,n
where
AO
=
p.(
~ ( ~ y , ~ ,the ) : necessity measure
,..., 2 ) , A, = (xZ1 ,..., x I n ) , and
XI”
(2)
and xv are defined on the same
domain U . The vector is the cutoff vector specified by the DM, while A,, i = l ,...,m , is the vector that contains the performance scores of the ith alternative under all attributes. The other computations are given in [8]. AO
4. Application of the proposed framework We consider a medium size company that wishes to replace its legacy manufacturing planning system with an ERP product. Each of the candidate ERP systems has different strengths and weaknesses. A diverse cross-functional team consisting of five members has been organized to determine performance aspects. We use five critical performance aspects that are proposed widely in literature and practice [3]. They are price, capability, reliability, credibility and service support. We assume that opinions of the five team members have been obtained in the form of linguistic variables for each performance aspect as shown in Table 1. Each of these aspects is treated as a fuzzy set, bounded to a predetermined interval, and characterized by a single probability distribution. Figure 1 gives the membership functions for the attribute ‘price’. If the range for
522 the other attributes is the interval [0,10], Figure 2 illustrates one set of assignments that can be made. Table 1. The conversion scales Performance asgects
Price
Very expensive WE)
Expensive (E)
.Rather cheap (VC)
Cheap (C)
Very cheap
Capability
Incapable (0
Little capable (LC)
Rather capable (RC)
Capable
Very capable (VC)
Reliability
Unreliable (U)
Little reliable
Rather reliable
Very reliable
(W
(W
Reliable (R)
Little credible (LC)
Rather credible (RC)
Credible (C)
Very credible (VC)
Medium(h4)
High0
Veryhigh(VH)
Credibility
Incredible (I)
Service support
Very low (VL)
150 200
(L)
500
400
300
(VC) (C)
600 650
Figure 1 . Membership functions for the attribute ‘price’
We established the universes of discourse for all attributes and their utility. Through careful discussion, we identified the heuristics and ideal solutions by the experts and obtained the decision rules. We listed all the alternatives and using the identified decision rules, we obtained the utility for each alternative as given in Table 2. Three alternatives (1, 4 and 9), which have the highest utility values, are identified for a more detailed analysis.
Figure 2. Membership functions for linguistic variables
Table 2. The linguistic evaluation with respect to each alternative ERP software

Alt.   Price           Capability      Reliability      Credibility      Service support  UTILITY
1      Very expensive  Very capable    Very reliable    Very credible    Very high        VERY GOOD
2      Rather cheap    Rather capable  Little reliable  Rather credible  High             MEDIUM POOR
3      Very cheap      Little capable  Reliable         Little credible  Very high        GOOD
4      Expensive       Capable         Very reliable    Credible         Very high        VERY GOOD
5      Rather cheap    Rather capable  Reliable         Rather credible  High             GOOD
6      Cheap           Rather capable  Rather reliable  Credible         Low              MEDIUM
7      Very expensive  Rather capable  Rather reliable  Rather credible  Medium           MEDIUM
8      Cheap           Capable         Rather reliable  Rather credible  High             MEDIUM GOOD
9      Expensive       Capable         Reliable         Very credible    High             VERY GOOD
10     Rather cheap    Little capable  Little reliable  Little credible  Low              MEDIUM GOOD
Let us assume that the cutoff vector is given as A° = {(x°_1, 0.25), (x°_2, 0.15), (x°_3, 0.20), (x°_4, 0.15), (x°_5, 0.25)}, where the numbers 0.25, 0.15, 0.20, 0.15 and 0.25 are the weights associated with each attribute, and the x°_j, j = 1, ..., 5, are: x°_1: rather cheap; x°_2: capable; x°_3: very reliable; x°_4: credible; x°_5: very high. For the first alternative (i = 1), the values of Π(x°_j, x_1j) are calculated as 0, 0.4, 1.0, 0.4, 1.0 and the values of N(x°_j, x_1j) are calculated as 0, 0, 0.5, 0, 0.5 for j = 1, ..., 5. Applying Eqs (1) and (2), we compute the possibility and necessity measures for the three identified alternatives as in Table 3. The rank order from the best to the worst is obtained as {4, (1, 9)}.

Table 3. Possibility and necessity measures

Alternative i    Π(A°, A_i)    N(A°, A_i)
1                0.75          0.75
4                0.75          0.85
9                0.75          0.75
If the weights w_j, j = 1, ..., 5, are changed to (0.80, 0.45, 0.60, 0.45, 0.80), the newly obtained measures are given in Table 4. The new rank order from the best to the worst is {4, 9, 1}.
Table 4. Modified possibility and necessity measures

Alternative i    N(A°, A_i)
1                0.20
4                0.50
9                0.40
5. Conclusion

We proposed an integrated approach to the ERP software selection problem. We used fuzzy heuristics to eliminate the worst alternatives at the first stage, and then used a fuzzy conjunctive method to select the best among the remaining alternatives. By changing the weights of the attributes, it is shown that the rank order of the alternatives may change. These methods are very useful for comparing alternatives when there is incomplete information about them.
References
1. K.C. Laudon and J.P. Laudon, Management Information Systems: Organization and Technology, Prentice-Hall, Englewood Cliffs, NJ, (1996).
2. H.C. Lucas, E.J. Walton and M.J. Ginzberg, Implementing packaged software, MIS Quarterly, 537-549, (1988).
3. J. Verville and A. Halingten, A six-stage model of the buying process for ERP software, Industrial Marketing Management, 32(7), 585-594, (2003).
4. C.-C. Wei and M.-J.J. Wang, A comprehensive framework for selecting an ERP system, International Journal of Project Management, 22(2), 161-169, (2004).
5. J. Efstathiou and V. Rajkovic, Multiattribute decision-making using a fuzzy heuristic approach, IEEE Trans. on Systems, Man, and Cybernetics, SMC-9, 326-333, (1979).
6. J. Efstathiou, The incorporation of objective and subjective aspects into the assessment of information systems, in: The Information Systems Environment, Lucas, Land, Lincoln and Supper (eds.), North-Holland, 187-197, (1980).
7. S.-J. Chen and C.-L. Hwang, Fuzzy Multiple Attribute Decision-Making: Methods and Applications, Springer-Verlag, Berlin, (1992).
8. D. Dubois, H. Prade and C. Testemale, Weighted fuzzy pattern matching, Fuzzy Sets and Systems, 28(3), 313-331, (1988).
ESTIMATION OF EASE ALLOWANCE OF A GARMENT USING FUZZY LOGIC
Y. CHEN, X. ZENG, M. HAPPIETTE, P. BRUNIAUX
Ecole Nationale Supérieure des Arts & Industries Textiles, 9 rue de l'Ermitage, Roubaix 59100, France
R. NG, W. YU
The Hong Kong Polytechnic University, Hong Kong, China
Ease allowance is an important criterion in garment sales. It is often taken into account in the process of constructing garment patterns. However, the existing pattern generation methods cannot provide a suitable estimation of ease allowance, which is strongly related to the wearer's body shape and movements and to the fabrics used. They can only produce 2D patterns for a fixed standard value of ease allowance. In this paper, we propose a new method of estimating the ease allowance of a garment using fuzzy logic and sensory evaluation. Based on these values of ease allowance, we develop a new method of automatic pattern generation that improves the wearer's fitting perception of a garment. The effectiveness of our method has been validated in the design of jean-type trousers. It can also be applied to designing other types of garment.
1. Introduction
A garment is assembled from different cut fabric elements fitting the human body. Each of these cut fabric elements is reproduced according to a pattern made on paper or card, which constitutes a rigid 2D geometric surface. For example, a classical trouser is composed of cut fabrics corresponding to four patterns: front left, back left, front right and back right. A pattern contains some reference lines characterized by dominant points which can be modified. Of all the classical methods of garment design, the draping method is used in high-end garment design [1]. Using this method, pattern makers drape the fabric directly on the mannequin, fold and pin the fabric onto it, and trace out the fabric patterns. This method leads to the direct creation of clothing with high accuracy, but it needs very long fitting times and sophisticated techniques related to the personalized experience of operators. Therefore, it cannot be applied in mass garment production. The direct drafting method is quicker and more systematic but often less precise [2]. It is generally applied in the classical garment industry. Using this method, pattern makers directly draw patterns on paper using a pattern construction procedure implemented in a garment CAD system. This construction procedure does not determine the amount of ease allowance, but instead generates flat patterns for any given value of ease allowance. In practice, it is necessary to find a compromise between these two garment construction methods so that their complementarity can be taken into account in the design of new products. To each individual corresponds a pattern whose parameters should include his body size and the amount of ease allowance of the garment. In fact, most fabrics are not very extensible and cannot be deformed at will. Moreover, the amount of ease allowance of a garment, defined as the difference in space between the garment and the body, can be taken into account in the pattern by increasing the area along its outline. In practice, there exist three types of ease allowance: (1) standard ease, (2) dynamic ease and (3) fabric ease. Standard ease allowance is the difference between the maximal and minimal perimeters of the wearer's body. It is obtained from a standard human body shape for the posture of standing or sitting still. This amount can be easily calculated using a classical drafting method [2], [3]. Dynamic ease allowance provides sufficient space for wearers having non-standard body shapes (fat, thin, large hips, strong legs, ...) and for their movements (walking, jumping, running, etc.). Fabric ease allowance takes into account the influence of the mechanical properties of the fabrics of the garment. It is a very important concept for garment fitting. Existing automatic systems of pattern generation or garment CAD systems cannot determine suitable amounts of ease allowance because only standard ease allowance is taken into account. In this case, 2D patterns are generated according to predefined standard values of ease allowance for any body shape and any type of fabric. In this paper, we propose a new method for improving the system of garment pattern generation by defining the concept of fuzzy ease allowance, capable of taking into account two aspects: standard ease and dynamic ease.
2. Construction of the Fuzzy Logic Controller
This method generates new values of ease allowance using a Fuzzy Logic Controller (FLC), adapted to the body measurements and movements of each individual. The corresponding scheme is given in Figure 1. For simplicity, only jean-type trousers related to comfort at the hip position are studied in this paper, and the influence of fabric ease, related to the physical properties of the garment, and of other body positions is not taken into account.
Figure 1. Scheme of fuzzy ease allowance estimation: body measurements and sensory evaluation enter the Fuzzy Logic Controller, whose output (standard ease + dynamic ease) drives the garment pattern generation
The FLC used for generating fuzzy ease allowance includes an interface of fuzzification, a base of fuzzy rules, an inference mechanism and an interface of defuzzification. It produces the fuzzy ease allowance at the hip position, i.e. the combination of the dynamic ease and the standard ease, from a number of relevant measures of wearers' body shapes and of the comfort sensation of wearers. The amount of fuzzy ease allowance will be further used for generating more suitable patterns. The construction of this FLC is based on a learning base built from an adjustable garment sample used to generate different trouser sizes and from a group of representative evaluators or wearers (sensory panel). Precisely, the procedure for constructing this learning base is as follows:
Step 1: Selecting or producing a special sampling jean whose key positions can be adjusted in order to generate different values of ease allowance. This sample can be used to simulate jeans of different sizes and different styles. In our project, the sample is adjusted to have three sizes: normal size, loose fit and tight fit, and the corresponding ease allowance at the hip position can vary from -1 to 8. These values are taken as output learning data of the FLC.
Step 2: Selecting a group of n evaluators having different body shapes.
Step 3: Sensory evaluation: evaluating the sample at different adjustable sizes by each evaluator according to a questionnaire defined a priori. The perception of wearers related to jean comfort at different body positions and during different movements is captured in the replies to the questionnaire. In our project, we calculate for each evaluator the minimal value of the evaluation scores with respect to all body positions and all movements, and take these minimal values for all evaluators as input learning data of the FLC.
Step 4: Objective evaluation: measuring the body shapes of the evaluators at a number of dominant geometric positions. These values are also taken as input learning data of the FLC.
This procedure yields input/output data for different sizes of the learning garment sample. The inputs of this FLC include three variables. The first and second input variables, measuring the wearer's body shape and related to standard and
dynamic ease allowance, are the waist girth (x1) and the waist-to-hip measure (x2), respectively. Their linguistic values are: {very small (VS), small (S), normal (N), big (B), very big (VB)}. The third input variable (x3) measures the comfort sensation of wearers, and its linguistic values, obtained from the sensory evaluation of wearers, are {very uncomfortable (VUC), uncomfortable (UC), normal (N), comfortable (C), very comfortable (VC)}. All these measures are obtained from Step 3 and Step 4 of the previous learning procedure. The output of the FLC is the estimated fuzzy ease allowance, denoted by y. Its values are real numbers varying between -1 and 8. The corresponding learning input/output data, measured and evaluated on the garment sample, are denoted by {(x11, x12, x13; y1), ..., (xn1, xn2, xn3; yn)}. The Sugeno method is used for defuzzification [4]. The fuzzy rules are extracted from these input/output learning data. For each input variable xi (i = 1, 2, 3), the parameters of its membership functions are obtained using the fuzzy c-means clustering method [5]. This method classifies the learning data {x1i, ..., xni} into 5 classes, corresponding to the five fuzzy values of xi. For each learning datum xki, we obtain the membership degrees for these five fuzzy values: μ1(xki), ..., μ5(xki). Assuming that the corresponding membership functions take a triangular shape characterized by Tr(a1i, b1i, c1i), ..., Tr(a5i, b5i, c5i), the 15 parameters a1i, ..., c5i are obtained by minimizing a least-squares criterion that fits the triangular functions to these membership degrees.
Figure 2. The membership functions for x1 (waist girth)
An example of the membership functions optimized by the fuzzy c-means method is given in Figure 2. In practice, the fuzzy values obtained by this method lead to more precise results than uniformly partitioned fuzzy values, because each fuzzy value generally corresponds to one aggregation of the learning data.
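As an illustration of this construction, the sketch below (a hypothetical re-implementation, not the authors' code; the sample measurements and the triangle-fitting rule are invented) runs a one-dimensional fuzzy c-means on simulated waist-girth values and derives triangular membership parameters from the resulting cluster centers:

```python
import numpy as np

def fcm(x, n_clusters=5, m=2.0, n_iter=100, seed=0):
    """One-dimensional fuzzy c-means; returns sorted centers and memberships."""
    rng = np.random.default_rng(seed)
    u = rng.random((n_clusters, x.size))
    u /= u.sum(axis=0)                       # memberships sum to 1 per point
    for _ in range(n_iter):
        um = u ** m
        c = (um @ x) / um.sum(axis=1)        # weighted cluster centers
        d = np.abs(x[None, :] - c[:, None]) + 1e-9
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=0)
    order = np.argsort(c)
    return c[order], u[order]

# Hypothetical waist-girth measurements (cm) from the sensory panel
rng = np.random.default_rng(1)
x1 = np.concatenate([rng.normal(mu, 2.0, 20) for mu in (62, 70, 78, 86, 94)])
centers, u = fcm(x1)

# One simple way to honor the clusters: peak of each triangle at a center,
# feet at the neighboring centers (domain bounds at the two ends).
bounds = np.concatenate([[x1.min()], centers, [x1.max()]])
for l, label in enumerate(["VS", "S", "N", "B", "VB"]):
    a, b, c = bounds[l], bounds[l + 1], bounds[l + 2]
    print(f"{label}: Tr({a:.1f}, {b:.1f}, {c:.1f})")
```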
3. Fuzzy Rules Extraction from Learning Data
The fuzzy rules of the FLC for the estimation of ease allowance are extracted from the learning data {(x11, x12, x13; y1), ..., (xn1, xn2, xn3; yn)} using the method of antecedent validity adaptation (AVA) [6]. The principle of this method is essentially a process by which the antecedent validity of each datum, with respect to fuzzy values and fuzzy rules, is evaluated in order to adjust the output consequent. Compared with other fuzzy extraction methods, the AVA method can effectively resolve the conflicts between different rules and then decrease the information loss by selecting only the most influential rules. In our project, the basic idea of applying the AVA method is briefly presented as follows. According to Section 2, the input variable xi (i ∈ {1, 2, 3}) is partitioned into 5 fuzzy values: FVi = {VS, S, N, B, VB}. For each learning data point (xk1, xk2, xk3; yk), we set up the following fuzzy rules by combining all fuzzy values of the three input variables:

Rule j: IF (x1 is A1^j) AND (x2 is A2^j) AND (x3 is A3^j), THEN (y is yk),

with Ai^j ∈ FVi; the validity degree of the rule, D(rule j), is computed from the antecedent membership degrees.
Given a predefined threshold σ, the rule j is removed from the rule base if the following condition holds:

D(rule j) < σ  (8)
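A minimal sketch of this pruning step is given below; it is not the paper's implementation, and it assumes a product t-norm for the validity degree and invented triangular partition parameters:

```python
import numpy as np

def tri(x, a, b, c):
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical triangular partitions for x1, x2, x3 (would come from Section 2)
FV = {i: [(2*l - 2, 2*l, 2*l + 2) for l in range(5)] for i in (1, 2, 3)}

def validity(xk, antecedent):
    """Validity degree of a rule: here, product of antecedent memberships."""
    d = 1.0
    for i, label in enumerate(antecedent, start=1):
        d *= tri(xk[i - 1], *FV[i][label])
    return d

sigma = 0.05                                  # predefined removal threshold
xk = (3.1, 4.2, 6.0)                          # one learning data point
rules = [(0, 1, 2), (1, 2, 3), (2, 2, 3)]     # antecedent fuzzy-value indices
kept = [r for r in rules if validity(xk, r) >= sigma]
print(kept)                                   # rules with D(rule j) >= sigma survive
```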
3.2. G-FNN Learning Algorithm

In this paper, the G-FNN learning algorithm is used to establish the FNN prediction model from prior time series values. It provides an efficient way of constructing the prediction model online, combining structure and parameter learning simultaneously. Structure learning includes determining the proper number of rules n_r. Parameter learning corresponds to premise and consequent parameter learning of the FNN. Premise parameters include the membership function parameters c_j and Σ_j, and the consequent parameter refers to the weight w_j of the FNN (refer to the Appendices).
4. Experiments
4.1. NARX(2,1) Dynamic Process

The following dynamic NARX(2,1) process is generated:

y_t = [y_{t-1} y_{t-2} (y_{t-1} + 2.5)] / (1 + y_{t-1}² + y_{t-2}²) + x_{t-1} + e_t  (9)
where the input has the form x_t = sin(2πt/25). The FNN prediction models are estimated using the first 200 observations of the time series generated from (9). The G-FNN predictors are then tested on the following 200 observations from the model. To demonstrate the superior performance of the G-FNN predictors, comparisons with three other fuzzy neural methods with dynamic topology, i.e. RBF-AFS [1], FNNS [2] and D-FNN [4], are shown in Table 1. In this
comparison, noise-free signals and fitting models with n_y = 2 and n_x = 1 are used for all the methods. It can be seen that the G-FNN provides better generalization as well as greater parsimony. The proposed G-FNN predictors are fast in learning speed because no iterative learning loops are needed. Generating fewer rules and parameters leads to better computational efficiency using the G-FNN learning algorithm.

Table 1. Comparison of prediction models on the NARX(2,1) process

Model                n_r    n_p    Testing MSE    Training Method
RBF-AFS              35     280    1.9e-2         iterative loops
FNNS                 22     84     5.0e-3         300 iterative loops
D-FNN                6      60     8.0e-4         one pass
Feedforward G-FNN    5      48     1.1e-4         one pass
Recurrent G-FNN I    5      46     3.2e-4         one pass
Recurrent G-FNN II   4      52     4.5e-4         one pass
4.2. Chaotic Mackey-Glass Time Series
The chaotic Mackey-Glass time series is a benchmark problem that has been considered by a number of researchers [1,3]. The time series is generated from the following equation:

dy(t)/dt = b y(t-τ) / (1 + y(t-τ)^10) - a y(t)  (10)
where τ > 17 gives chaotic behavior. Higher values of τ yield higher-dimensional chaos. For ease of comparison, the parameters are selected as a = 0.1, b = 0.2 and τ = 17. The fitting model of (10) is chosen to be

y_t = F[y_{t-p}, y_{t-p-Δt}, y_{t-p-2Δt}, y_{t-p-3Δt}]  (11)

For simulation purposes, it is assumed that y_t = 0 for t < 0 and y_0 = 1.2. In this experiment, the following values are chosen: p = Δt = 6 and 118 ≤ t ≤ 1140. The first 500 input/output data pairs generated from (10) are used for training the FNN, while the following 500 data pairs are used for validating the identified model. Using the G-FNN learning algorithm, a total of 8 fuzzy rules are generated for the G-FNN predictor during training. Figure 1 shows that the actual and predicted values are essentially the same and their differences can only be seen on a finer scale.
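A sketch for reproducing this setup is given below; it assumes a simple unit-step Euler discretization of Eq. (10), which is a common but not stated choice:

```python
import numpy as np

def mackey_glass(n=1200, a=0.1, b=0.2, tau=17, y0=1.2):
    """Unit-step Euler integration of Eq. (10); y_t = 0 for t < 0, y_0 = 1.2."""
    y = np.zeros(n)
    y[0] = y0
    for t in range(n - 1):
        y_tau = y[t - tau] if t >= tau else 0.0
        y[t + 1] = y[t] + b * y_tau / (1.0 + y_tau**10) - a * y[t]
    return y

y = mackey_glass()
p = dt = 6
T = np.arange(118, 1141)                 # 118 <= t <= 1140, as in the experiment
X = np.stack([y[T - p], y[T - p - dt], y[T - p - 2*dt], y[T - p - 3*dt]], axis=1)
target = y[T]                            # fitting model (11): y_t = F[...]
print(X.shape, target.shape)             # first 500 pairs train, next 500 validate
```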
5. Conclusions

In this paper, one feedforward and two recurrent G-FNN predictors are proposed, tested and compared. The proposed G-FNN predictors provide a sequential and hybrid learning method for model structure determination and parameter identification which greatly improves predictive
Figure 1. Prediction results: (a) Mackey-Glass time series from t = 118 to 1140 and six-step ahead prediction; (b) Prediction error.
performance. Experiments and comparative studies demonstrate the superior performance of the proposed approaches.

Appendix A.
Two Criteria of Rule Generation

For each training data pair [z_t, y_t], t = 1 ... n_d, where y_t is the desired output or the supervised teaching signal and n_d is the total number of training data, the system error is defined as e_t = ||y_t - ŷ_t||. If e_t is bigger than a designed threshold K_e, a new fuzzy rule should be considered. At sample time t, the regularized Mahalanobis distance is calculated as md_j = sqrt([z_t - c_j]^T Σ_j [z_t - c_j]), j = 1 ... n_r. The accommodation factor is defined as d_t = min_j md_j. If d_t is bigger than K_d = sqrt(ln(1/ε)), a new rule should be considered because the existing fuzzy system does not satisfy ε-completeness [5]. Otherwise, the new input data can be represented by the nearest existing rule.

Pruning of Rules

The Error Reduction Ratio (ERR) concept proposed in [3] is adopted here for rule pruning. At sample time t, we have from (5) y = Θw + e, where y = [y1 y2 ... yt]^T ∈ R^t is the teaching signal, w ∈ R^ν is the real-valued weight vector, Θ = QR ∈ R^{t×ν} is known as the regressor and is transformed by QR decomposition, e = [e1 e2 ... et]^T ∈ R^t is the system error vector that is assumed to be uncorrelated with the regressor Θ, and ν = n_r(n_i + 1). An ERR due to q_i can be defined as err_i = (q_i^T y)² / (q_i^T q_i · y^T y). The total ERR Terr_j, j = 1 ... n_r, corresponding to the jth rule is defined as Terr_j = sqrt(err_j^T err_j / (n_i + 1)). If Terr_j is smaller than a designed threshold 0 < K_err < 1, the jth fuzzy rule should be deleted, and vice versa.

Determination of Premise Parameters

Premise parameters, i.e. the Gaussian membership functions of the FNN, are allocated to satisfy the ε-completeness of the fuzzy rules. In case e_t > K_e and d_t > K_d, we compute the Euclidean distance ed_{ij_n} = ||z_i - b_{ij_n}|| between z_i and the boundary point b_{ij_n} ∈ {c_{i1}, c_{i2}, ..., c_{iN_i}, z_{i,min}, z_{i,max}}. Next, we find j_n = argmin ed_{ij_n}. If ed_{ij_n} is less than a threshold, or dissimilarity ratio of neighboring membership functions, K_mf, we choose
c_{i(n_r+1)} = b_{ij_n},   σ_{i(n_r+1)} = σ_{ij_n}  (A.1)

Otherwise, we choose

c_{i(n_r+1)} = z_{it}  (A.2)

σ_{i(n_r+1)} = max(|c_{i(n_r+1)} - c_{i(n_r+1)}^-|, |c_{i(n_r+1)} - c_{i(n_r+1)}^+|) / sqrt(ln(1/ε))  (A.3)

where c_{i(n_r+1)}^- and c_{i(n_r+1)}^+ denote the centers of the two neighboring membership functions.
In case e_t > K_e but d_t ≤ K_d, the ellipsoidal field needs to be decreased to obtain a better local approximation. A simple method to reduce the Gaussian width is

σ_ij^new = K_s × σ_ij^old  (A.4)

where K_s is a reduction factor which depends on the sensitivity of the input variables. In the remaining case, the system has good generalization and nothing needs to be done except adjusting the weights.

Determination of Consequent Parameters

TSK-type consequent parameters are determined using the Linear Least Squares (LLS) method as w = Θ†y, where Θ† is the pseudoinverse of Θ.
References
1. K.B. Cho and B.H. Wang, Fuzzy Sets and Systems, 83, 325 (1996).
2. C.T. Chao, Y.J. Chen and C.C. Teng, IEEE Trans. Systems, Man and Cybernetics, 26, 344 (1996).
3. S. Chen, C.F.N. Cowan and P.M. Grant, IEEE Trans. Neural Networks, 2, 302 (1991).
4. S. Wu and M.J. Er, IEEE Trans. Systems, Man and Cybernetics, Part B, 30, 358 (2000).
5. L.X. Wang, A Course in Fuzzy Systems and Control, New Jersey: Prentice Hall (1997).
6. A. Jazwinski, Stochastic Processes and Filtering Theory, New York: Academic Press (1970).
IMPLEMENTATION OF ON-LINE MONITORING PROGRAMS AT NUCLEAR POWER PLANTS
J. WESLEY HINES
Nuclear Engineering Department, The University of Tennessee, Knoxville, Tennessee 37996-2300
EDDIE DAVIS
Edan Engineering Corporation, 900 Washington St., Suite 830, Vancouver, Washington 98660

The investigation and application of on-line monitoring programs has been ongoing for over two decades by the U.S. nuclear industry and researchers. To date, only limited pilot installations have been demonstrated, and the original objectives have changed significantly. Much of the early work centered on safety-critical sensor calibration monitoring and calibration reduction. The current focus is on both sensor and equipment monitoring. This paper presents the major lessons learned that contributed to the lengthy development process, including model development and implementation issues, and the results of a recently completed cost-benefit analysis.
1. Introduction and Background

For the past two decades, Nuclear Power Plants (NPPs) have attempted to move towards condition-based maintenance philosophies using new technologies developed to ascertain the condition of plant equipment. Specifically, techniques have been developed to monitor the condition of sensors and their associated instrument chains. Historically, periodic manual calibrations have been used to assure sensors are operating correctly. This technique is not optimal in that sensor conditions are only checked periodically; therefore, faulty sensors can continue to operate for periods up to the calibration frequency. Faulty sensors can cause poor economic performance and unsafe conditions. Periodic techniques also cause the unnecessary calibration of instruments that are not faulted, which can result in damaged equipment, plant downtime, and improper calibration under non-service conditions. Early pioneers in the use of advanced information processing techniques for instrument condition monitoring included researchers at the University of Tennessee (UT) and Argonne National Laboratory (ANL). UT developed neural network based systems while ANL developed the Multivariate State Estimation Technique (MSET) [1]. The EPRI Instrument Monitoring and Calibration (IMC) Users Group formed in 2000 with an objective to demonstrate OLM technology in operating nuclear power plants for a variety of systems and applications. The On-Line Monitoring Implementation Users Group formed in mid 2001 to demonstrate OLM in multiple applications at many nuclear power plants and has a four-year time frame. Current U.S. nuclear plant participants include Limerick, Salem, Sequoyah, TMI, VC Summer, and Sizewell B
using a system produced by Expert Microsystems Inc. (expmicrosys.com), and Harris and Palo Verde, which are using a system developed by Smartsignal Inc. (smartsignal.com). Each of these plants is currently using OLM technology to monitor the calibration of process instrumentation. In addition to monitoring implementation, the systems have an inherent dual purpose of monitoring the condition of equipment, which is expected to improve plant performance and reliability. The major European participant in this area is the Halden Research Project where Dr. Paolo Fantoni and his multi-national research team have developed a system termed Plant Evaluation and Analysis by Neural Operators (PEANO) [2] and applied it to the monitoring of nuclear power plant sensors.
2. On-Line Monitoring Techniques

The OLM systems use historical plant data to develop empirical models that capture the relationships between correlated plant variables. These models are then used to verify that the relationships have not changed. A change can occur due to sensor drift, equipment faults, or operational error. Numerous data-based technologies have been used by major researchers in the field. Three technologies have emerged and have been used in the electric power industry, each built on a different data-based prediction method: a kernel-based method (MSET), a neural network based method (PEANO [2]), and non-linear partial least squares (NLPLS) [3]. These methods are described and compared in Hines [4]. The major lessons learned in applying empirical modeling strategies are that the methods should produce accurate results, produce repeatable and robust results, have an analytical method to estimate the uncertainty of the predictions, and be easily trained and retrained for new or expanded operating conditions.

2.1 Accurate Results
Early applications of autoassociative techniques, such as MSET, were publicized to perform well with virtually no engineering judgment necessary. One item of interest is the choice of inputs for a model. Early application limits were said to be around 100 inputs per model with no need to choose and subgroup correlated variables. However, experience has shown that models should be constructed with groups of highly correlated sensors, resulting in models commonly containing fewer than 30 signals [5]. It has been shown that adding irrelevant signals to a model increases the prediction variance, while not including a relevant variable biases the estimate [6].

2.2 Repeatable and Robust Results

When empirical modeling techniques are applied to data sets that consist of collinear (highly correlated) data, ill-conditioning can result in highly
accurate performance on the training data, but highly variable, inaccurate results on unseen data. Robust models perform well on data that have incorrect inputs, as expected in noisy environments or when a sensor input is faulted. Regularization techniques can be applied to make the predictions repeatable, robust, and of lower variability. A summary of the methods is given in Gribok [7], and regularization methods have been applied to many of the systems currently in use.

2.3 Uncertainty Analysis

The most basic requirement outlined in the NRC safety evaluation [8] is that of an analysis of the uncertainty in the empirical estimates. Argonne National Laboratory has performed Monte Carlo based simulations to estimate the uncertainty of MSET based estimations [9]. These techniques produce average results for a particular model trained with a particular data set. Researchers at The University of Tennessee have developed analytical techniques to estimate prediction intervals for all of the major techniques (MSET, AANN, PEANO, and NLPLS). The analytical results were verified using Monte Carlo based simulations and provide the desired 95% coverage [6]. Each of the techniques performs well, some better than others, on various data sets.

2.4 Ease of Training and Retraining

As will be shown in Section 3, it is virtually impossible for the original training data to cover the entire range of operation. The operating conditions may change over time, and the models may need to be retrained to incorporate the new data. MSET based methods are not trained, but are non-parametric modeling techniques. These techniques work well in that new data vectors can simply be added to the prototype data matrix. Artificial neural networks require fairly long training times. Other parametric techniques, such as non-linear partial least squares, can be trained much faster. Recently, the PEANO system has incorporated an NLPLS algorithm which performs with accuracy equal to the original AANN algorithm and can be trained in minutes versus days [10].
3. OLM Plant Implementation

Several lessons have been learned from EPRI's three years of OLM implementation and installation. The major areas include data acquisition and quality, model development, and results interpretation.
3.1 Data Acquisition and Quality

In order to build a robust model for OLM, one must first collect data covering all the operating conditions in which the system is expected to operate and for which signal validation is desired. This data is historical
data that has been collected and stored, and it may not represent the plant state due to several anomalies that commonly occur. These include interpolation errors, random data errors, missing data, loss of significant figures, stuck data, and others. Data should always be visually inspected and corrected or deleted before use.

3.1.1 Interpolation Errors

The first problem usually encountered in using historical data for model training is that it is usually not actual data but, instead, data resulting from compression routines normally implemented in data archival programs. For example, the PI Data Historian from OSI Software creates a data archive that is a time-series database. However, not all of the data is stored at each collection time. Only data values that have changed by more than a tolerance are stored along with their time stamp. This method requires much less storage but results in a loss of data fidelity. When data is extracted from the historian, data values between logged data points are calculated through a simple linear interpolation. The resulting data appears to be a saw-tooth time series, and the correlations between sensors may be severely changed. Data collected for model training should be actual data, and tolerances should be set as small as possible or not used.

3.1.2 Data Quality Issues

Several data quality issues are common. These cases include:
- lost or missing data,
- single or multiple outliers in one sensor or several,
- stuck data, in which the data value does not update,
- random data values,
- unreasonable data values,
- loss of significant digits.
Most of these data problems can be visually identified or can be detected by a data clean-up utility. These utilities remove bad data or replace it with the most probable data value using some algorithm. It is most common to delete all bad data observations from the training data set. Most OLM software systems include automated tools for data cleanup; these tools easily identify extreme outlying data but are typically insensitive to data errors that occur within the expected region of operation. The addition of bad data points in a training set can invalidate a model.
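A minimal screening utility in the spirit of the clean-up tools described above might look as follows (the thresholds and test signal are illustrative; this is not any vendor's actual tool):

```python
import numpy as np

def screen_signal(x, stuck_len=30, z_thresh=5.0):
    """Flag missing data, gross outliers and stuck values in one archived signal."""
    x = np.asarray(x, dtype=float)
    bad = np.isnan(x)                                # lost or missing data
    z = (x - np.nanmean(x)) / (np.nanstd(x) + 1e-12)
    bad |= np.abs(z) > z_thresh                      # outliers / unreasonable values
    run = 0                                          # stuck data: value not updating
    for i in range(1, x.size):
        run = run + 1 if x[i] == x[i - 1] else 0
        if run >= stuck_len:
            bad[i - stuck_len:i + 1] = True
    return bad

x = np.sin(np.linspace(0, 20, 500)) + 0.01 * np.random.default_rng(2).standard_normal(500)
x[100:160] = x[100]                                  # inject a stuck-data fault
x[300] = 25.0                                        # inject a gross outlier
mask = screen_signal(x)
clean = x[~mask]                                     # delete bad observations before training
print(mask.sum(), clean.size)
```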
3.2 Model Development

Model development is not just a simple click-and-go as once claimed. There are several decisions that need to be made, including:
- defining models and selecting relevant inputs,
- selecting relevant operating regions,
- selecting relevant training data.
The model must be trained with data covering all operating regions in which it is expected to operate. These operating regions can vary significantly between nuclear plants since regions are defined by system structure, sensor values, and operating procedures. One example of a system structure change is the periodic usage of standby pumps or the cycled usage of redundant pumps. A model must be trained for each operating condition for the system to work properly, but excessive training on unusual conditions may degrade the performance on the most usual operating conditions. Therefore, some plant line-ups may never be included in the training set. Operating conditions also change due to cyclical changes such as seasonal variations. If a model is trained during mild summers and then monitoring occurs in a hotter summer with higher cooling water temperatures, the model will not perform correctly. In this case, data from the more severe operating conditions must be added to the training data.

3.3 Results Interpretation
Once a model is trained and put into operation, the predictions must be evaluated to determine if the system is operating correctly, if a sensor is drifting, if an operating condition has changed, or if an equipment failure has occurred. The choice of which has occurred can be made using logic, and this logic has been programmed into expert-system-type advisors with some success [11]. The logical rules operate on the residuals, which are the differences between the predictions and the observations. Under normal conditions, the residuals should be small random values. If only one residual grows, the hypothesis is that a sensor has degraded or failed. If several residuals significantly differ from zero, the operating state has probably changed or an equipment failure has occurred. More in-depth knowledge and engineering judgment must be used to ascertain which has occurred.
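The residual logic just described can be sketched in a few lines (the limits and channel layout are hypothetical; real advisors such as those in [11] are considerably more elaborate):

```python
import numpy as np

def interpret_residuals(residuals, limits):
    """Advisor logic on per-channel residuals (prediction minus observation)."""
    flags = np.abs(residuals) > limits
    n_bad = int(flags.sum())
    if n_bad == 0:
        return "normal: residuals are small random values"
    if n_bad == 1:
        return f"suspected sensor drift/failure on channel {int(np.argmax(flags))}"
    return "probable operating-state change or equipment failure"

limits = np.array([1.0, 0.5, 2.0, 1.5])               # per-channel residual limits
print(interpret_residuals(np.array([0.1, -0.2, 0.3, 0.0]), limits))
print(interpret_residuals(np.array([0.1,  0.9, 0.3, 0.0]), limits))
print(interpret_residuals(np.array([2.0,  0.9, 2.5, 1.8]), limits))
```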
4. Conclusions

The development and application of on-line monitoring systems has occurred over the past 20 years. Through that time period much has been learned about improving the modeling techniques, implementing the system at a plant site, evaluating the results, and the economic basis for such an installation. The original objective of extending Technical Specification sensor calibrations to meet extended fuel cycles has changed to monitoring both safety and non-safety related signals, performance, and equipment. A recently completed cost-benefit analysis [12] shows a nominal six-year payback for a 600-sensor installation and shows that the real basis may be in the more difficult to quantify benefits of efficiency improvement and equipment monitoring. As plants fully field these technologies, the efforts and experiences of plant personnel, researchers, and EPRI project managers will prove invaluable.
References
1. Singer, R.M., K.C. Gross, J.P. Herzog, R.W. King and S.W. Wegerich (1996), "Model-Based Nuclear Power Plant Monitoring and Fault Detection: Theoretical Foundations", Proc. 9th Intl. Conf. on Intelligent Systems Applications to Power Systems, Seoul, Korea.
2. Fantoni, P., S. Figedy and A. Racz (1998), "A Neuro-Fuzzy Model Applied to Full Range Signal Validation of PWR Nuclear Power Plant Data", FLINS-98, Antwerpen, Belgium.
3. Rasmussen, B., J.W. Hines and R.E. Uhrig (2000), "Nonlinear Partial Least Squares Modeling for Instrument Surveillance and Calibration Verification", Proc. Maintenance and Reliability Conference, Knoxville, TN.
4. Hines, J.W. and B. Rasmussen (2000), "On-Line Sensor Calibration Verification: A Survey", 14th International Congress and Exhibition on Condition Monitoring and Diagnostic Engineering Management, Manchester, England, September 2000.
5. EPRI (2002), Plant Systems Modeling Guidelines to Implement On-Line Monitoring, EPRI, Palo Alto, CA: 1003661.
6. Rasmussen, B. (2003), "Prediction Interval Estimation Techniques for Empirical Modeling Strategies and their Applications to Signal Validation Tasks", Ph.D. dissertation, Nuclear Engineering Department, The University of Tennessee, Knoxville.
7. Gribok, A.V., J.W. Hines, A. Urmanov and R.E. Uhrig (2002), "Regularization of Ill-Posed Surveillance and Diagnostic Measurements", in: Power Plant Surveillance and Diagnostics, eds. Da Ruan and P. Fantoni, Springer.
8. NRC Letter dated June 22, 2000, Safety Evaluation related to Topical Report (TR) 104965 "On-Line Monitoring of Instrument Channel Performance".
9. Zavaljevski, N., A. Miron, C. Yu and E. Davis (2003), "Uncertainty Analysis for the Multivariate State Estimation Technique (MSET) Based on Latin Hypercube Sampling and Wavelet De-Noising", Transactions of the American Nuclear Society, New Orleans, LA, November 16-20.
10. Fantoni, P.F., M. Hoffmann, B. Rasmussen, W. Hines and A. Kirschner (2002), "The use of non linear partial least square methods for on-line process monitoring as an alternative to artificial neural networks", 5th International Conference on Fuzzy Logic and Intelligent Technologies in Nuclear Science (FLINS), Gent, Belgium, Sept. 16-18.
11. Wegerich, S., R. Singer, J. Herzog and A. Wilks (2001), "Challenges Facing Equipment Condition Monitoring Systems", Proc. Maintenance and Reliability Conference, Gatlinburg, TN.
12. EPRI (2003), On-Line Monitoring Cost Benefit Guide, Final Report, EPRI, Palo Alto, CA: 1006777.
PREDICTION INTERVAL ESTIMATION TECHNIQUES FOR EMPIRICAL MODELING STRATEGIES AND THEIR APPLICATIONS TO SIGNAL VALIDATION TASKS
BRANDON RASMUSSEN AND J. WESLEY HINES
The University of Tennessee, Knoxville, TN 37771

Empirical modeling techniques have been applied to on-line process monitoring to detect equipment and instrumentation degradations. However, few applications provide prediction uncertainty estimates, which can provide a measure of confidence in the resulting decisions. This paper presents the development of analytical prediction interval estimation methods for three common non-linear empirical modeling strategies: artificial neural networks (ANN), neural network partial least squares (NNPLS), and local polynomial regression (LPR). The techniques are applied to nuclear power plant operational data for sensor calibration monitoring and verified via bootstrap simulation studies.
1. Introduction

Empirical modeling techniques are being used for on-line monitoring of process equipment and instrumentation in the nuclear power industry [1]. The original objective was to reduce the calibration interval of safety-critical sensors by moving to a condition-based approach through determining their status. As stated in EPRI's On-line Monitoring of Instrument Channel Performance [2]: "On-line monitoring is the assessment of channel performance and calibration while the channel is operating". The modeling strategies applied to signal validation tasks in this work are: artificial neural networks (ANN), neural network partial least squares (NNPLS), and local polynomial regression (LPR). These three modeling paradigms have been the most commonly reported for applications to signal validation tasks in on-line monitoring applications. The focus of this work was to provide point-wise prediction intervals which contain the measured responses at a specified significance level, namely 95%. One of the functions of an on-line monitoring system is to report when an empirical model's estimations significantly deviate from the measured values of the monitored process. While the ability to detect these significant deviations has been proven, the quantification of the uncertainty associated with the empirical model estimates is rarely addressed. To verify that the observed deviations are significant, in that they exceed all observed effects of modeling uncertainty, prediction interval estimation techniques need to be developed, proven, and incorporated into existing and future software for on-line monitoring applications.
2. Empirical Modeling Techniques

This section provides a brief overview of the three empirical modeling systems under investigation. The three systems were selected for study because they are currently implemented in on-line power plant monitoring systems.
2.1 Artificial Neural Networks

Artificial neural network (ANN) models, inspired by biological neurons, contain layers of simple computing nodes that operate as nonlinear summing devices. These nodes are highly interconnected with weighted connection lines, and these weights are adjusted when training data are presented to the ANN during the training process. The training process often identifies underlying relationships between environmental variables that were previously unknown. Successfully trained ANNs can perform a variety of tasks, the most common of which are: prediction of an output value, classification, function approximation, and pattern recognition. Neural networks have been applied to signal validation in the power industry [3,4].
2.2 Neural Network Partial Least Squares

First introduced by H. Wold in the field of econometrics [5], PLS has become an important technique in many areas, including psychology, economics, chemical engineering, medicine, pharmaceutical science, and process modeling. In attempts to enhance the technique by providing non-linear modeling capabilities, the use of single-hidden-layer feedforward neural networks (NN) has been applied in the field of chemical engineering [6]. These methods have been under study at the University of Tennessee, for the purposes of signal validation in large-scale processes, since late 1999 [7]. This method will be referred to as neural network partial least squares (NNPLS). An NNPLS signal validation system has been implemented, on a trial basis, at the 9th unit of Tennessee Valley Authority's Kingston fossil plant, in Harriman, Tennessee, USA [8, 9].

2.3 Local Polynomial Regression

Local polynomial regression (LPR) models are often referred to as lazy learning methods. Lazy learning comprises a set of methods in which data processing is deferred until a prediction at a query point needs to be made. These methods are also referred to as memory-based methods due to the approach of storing the training data and recalling relevant training data when a query is made. A good review of lazy learning methods, focusing on locally weighted regression, is presented by Atkeson et al. [10]. A training data set is comprised of a set of input vectors and a corresponding set of output values. A query point is an input vector for which an output is to be determined. Relevant data are identified by the use of a distance function: maximum relevance occurs when a query point matches a point in the
training set; relevance diminishes from this maximum as the distance between a query point and the training points increases. Nonparametric regression using data in a neighborhood of the present query point is generally referred to as a local model. The neighborhood size is controlled by a bandwidth parameter, which is implemented as a kernel width. Local models attempt to fit the data in a region surrounding the query point with a d-th degree polynomial. This paper presents results using local linear regression (LLR), in which d = 1.
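A compact sketch of such a local linear model (d = 1) with a Gaussian kernel is shown below; the data, bandwidth and function names are illustrative, not those of the later experiments:

```python
import numpy as np

def llr_predict(X, y, x_query, bandwidth=0.75):
    """Local linear regression (d = 1) with a Gaussian kernel of given width."""
    dist = np.linalg.norm(X - x_query, axis=1)     # the distance function
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)     # relevance weights
    A = np.hstack([np.ones((X.shape[0], 1)), X])   # local linear model + intercept
    sw = np.sqrt(w)[:, None]
    beta = np.linalg.lstsq(sw * A, sw[:, 0] * y, rcond=None)[0]
    return np.concatenate([[1.0], x_query]) @ beta

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(200)
print(llr_predict(X, y, np.array([0.3, -0.5])))    # processing deferred to query time
```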
3. Uncertainty and its Estimation

This section presents a brief discussion of empirical uncertainty and its sources. It then provides a brief description of analytical methods for prediction interval estimation.

3.1 Sources of Uncertainty
There are several sources of prediction uncertainty related to the modeling process. They include the selected predictor variables, the amount and selection of data used to develop the model, the model structure including complexity, and the noise in the predictors and response. Uncertainty can be divided into bias and variance. The selection of a training set is prone to sampling variation because there is variability in the random sampling from the entire population of data. Since each possible training set will produce a different model, there is a distribution of predictions for a given observation. The issues relative to fluctuations in the response for a given observation are reduced as the training data set size increases, i.e. more training data lowers the uncertainty of the estimates, assuming the data are representative of the process. Model misspecification occurs when a given model is incorrect, and a bias is introduced due to the improper model, e.g. fitting non-linear data to a linear model will result in a biased model. Model misspecification may occur for the ANN models and the NNPLS models, though given the proper number of free parameters both techniques are proven to perform adequately, i.e. with minimal bias. Misspecification can also occur for a local polynomial model through the combined selection of the bandwidth parameter and the degree of the locally fitted polynomial. Model misspecification is made worse by an improper choice of model complexity. Incorrect model complexity increases model uncertainty. A model without the required flexibility will bias the solution while an overly complex model tends to fit the noise in the training data and has an increased variance. For an ANN, the complexity is determined by the number of hidden neurons; for LPR, the complexity is controlled by the polynomial order and the bandwidth; and for a NLPLS model, the complexity is determined by the number of latent variables included in the model. The selected set of predictor variables influences the model uncertainty. If the predictor variable set does not contain the necessary information to
accurately model the desired response, a bias results. If the predictor variable set contains variables that are unrelated to the desired response, an increased solution variance results. Lastly, noise in the input and output data is a potential source of uncertainty. Each of the analytical approaches to prediction interval estimation presented herein considers only the noise in the dependent, or response, variable. Alternate theories based on the error-in-variables model are available for including the noise in the predictor variables in developing prediction intervals; however, they require knowledge of the noise level present, which is generally unknown.
3.2 Prediction Interval Estimation for Empirical Techniques

The derivation of prediction intervals for these three techniques is too involved for this paper but can be found in Rasmussen [11]. The methods for NNPLS and ANN follow Chryssolouris et al. [12] and result in predictions of the form

ŷ_0 ± t_{α/2} · s · sqrt(1 + f_0^T (F^T F)^{-1} f_0)
where F is the Jacobian matrix computed using the training data, f_0 is the Jacobian computed for a new observation x_0 used for computing prediction intervals for the corresponding prediction ŷ_0, s² is the estimated noise variance, and t_{α/2} is the appropriate Student-t critical value. Prediction intervals for the non-parametric techniques follow the bias-variance decomposition of the MSE and can also be found in Rasmussen [11].
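Given the Jacobians, the interval computation is short; the sketch below assumes the standard form above with a fixed 95% critical value and random stand-in matrices (the function name and data are illustrative):

```python
import numpy as np

def pi_half_width(F, f0, residuals, t_crit=1.96):
    """Half-width of the 95% prediction interval in the form above."""
    n, p = F.shape
    s2 = residuals @ residuals / (n - p)           # noise variance estimate
    return t_crit * np.sqrt(s2 * (1.0 + f0 @ np.linalg.inv(F.T @ F) @ f0))

rng = np.random.default_rng(4)
F = rng.standard_normal((500, 6))                  # Jacobian on the training data
f0 = rng.standard_normal(6)                        # Jacobian at the new input x0
resid = 0.3 * rng.standard_normal(500)             # training residuals
print(pi_half_width(F, f0, resid))                 # report y0_hat +/- this value
```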
4. Results

The prediction interval estimation techniques were applied to several data sets. This paper presents the results for prediction of a nuclear power plant's first-stage turbine pressure. The turbine pressure data set contains 5 predictor variables: 3 steam generator steam pressures, a turbine first-stage pressure, and the unit gross generation. The response variable is also a turbine first-stage pressure channel, though not the same channel as the one included in the predictor variable set. The data were provided by the Electric Power Research Institute (EPRI) and are from an operating U.S. nuclear power plant, sampled at one-minute intervals. The training data set contains good data and the test data set contains a known drifting sensor. The data were standardized to have zero mean and unit variance for model training and testing. For each of the modeling techniques, several architectures were developed and analyzed. The architecture with the best accuracy also had the smallest prediction interval. The number of latent factors was varied for the NNPLS model, the number of hidden neurons was varied for the ANN model, and the kernel width was varied for the LLR model.
4.1 Neural Network Partial Least Squares Results

An NNPLS model with two latent factors was found to be optimal. The prediction results on the test data set are shown in Figure 1 below. This figure shows the prediction for a specific sample with a circle, along with its associated prediction interval. The prediction intervals contain the measured values until sample number 500, when the sensor drifts to a value lower than those predicted. When the prediction interval contains the actual value we call this coverage, and the coverage was 96% for the good portion of the test data. This agrees well with the expected 95% level.
Bootstrap simulation was used to validate the analytical PI results, resulting in a 98% coverage value. Thus, the bootstrap estimates slightly overestimate the true uncertainty for the NNPLS models in this case, while the analytic prediction intervals perform as expected and provide sufficient coverage in all cases.

4.2 Artificial Neural Network Results

The optimal neural network architecture contained two hidden neurons. The prediction results on the test data set are shown in Figure 2 below. The first observation is that the magnitudes of the prediction intervals are much lower than those of the NNPLS models. The average coverage values are above the expected value of 0.95. The test data predictions indicate the known drift, which becomes significant around sample 500. For the ANN models, the analytic and bootstrap intervals for the test data were very similar for all evaluated architectures. The coverage values for both approaches to interval estimation remained at or above the expected level for all evaluated architectures.
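A bootstrap coverage check of the kind used for validation can be sketched as follows, with a linear model standing in for the NNPLS/ANN fits and synthetic data (both assumptions of this example, not the paper's setup):

```python
import numpy as np

def linear_fit(Xtr, ytr, Xte):                       # stand-in for an NNPLS/ANN model
    A = np.hstack([np.ones((len(ytr), 1)), Xtr])
    beta = np.linalg.lstsq(A, ytr, rcond=None)[0]
    return np.hstack([np.ones((len(Xte), 1)), Xte]) @ beta

def bootstrap_coverage(X, y, X_test, y_test, B=200, seed=5):
    """Bootstrap PIs: model variance from refits plus residual noise variance."""
    rng = np.random.default_rng(seed)
    preds = np.empty((B, len(y_test)))
    for b in range(B):
        idx = rng.integers(0, len(y), len(y))        # resample the training set
        preds[b] = linear_fit(X[idx], y[idx], X_test)
    s2 = np.var(y - linear_fit(X, y, X))             # residual (noise) variance
    half = 1.96 * np.sqrt(preds.var(axis=0) + s2)
    covered = np.abs(y_test - preds.mean(axis=0)) <= half
    return covered.mean()                            # should be close to 0.95

rng = np.random.default_rng(6)
w = np.array([1.0, -0.5, 2.0])
X = rng.standard_normal((300, 3));  y  = X  @ w + 0.2 * rng.standard_normal(300)
Xt = rng.standard_normal((100, 3)); yt = Xt @ w + 0.2 * rng.standard_normal(100)
print(bootstrap_coverage(X, y, Xt, yt))
```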
4.3 Local Linear Regression Results

The optimal local linear regression architecture was determined to have a kernel width of 0.75. The prediction results on the test data set are shown in Figure 3 below. The coverage values of the prediction intervals with respect to the fault-free data were at or above the expected value. Again, the drift can be easily identified.
4.4 Results Summary
All three techniques provided accurate prediction interval estimates, as proven through coverage values near 95%; however, the techniques did not produce equal accuracies. Table 1 presents a quantitative comparison of the three techniques.
Table 1. Quantitative comparison of the three techniques

Model Type   Mean Absolute Errors (PSIA)   Prediction Intervals (PSIA)   Drift Estimates μ (PSIA)   σ (PSIA)
NNPLS        1.3                           4.1                           -10.5                      0.54
ANN          0.5                           1.2                           -8.9                       0.19
LLR          0.4                           1.3                           -9.1                       0.09
The NNPLS models performed poorly for this data set when compared to the other methods. This is because the turbine pressure data set contained only mild correlations. Previous work documents the need for high correlations in the predictor variables for successful implementation of the NNPLS architecture [Rasmussen, 2002]. However, the resultant prediction intervals reflected the poor performance, and the coverage values were consistently at or above the expected 95% level. The ANN models for this data set performed well. The prediction intervals provided the appropriate coverage for the test data, the average errors were minimal, and the predictions identified the drift. The LLR models also performed well for this case. The prediction intervals provided the appropriate coverage for the majority of the different bandwidth models evaluated. The errors with respect to the test data were slightly lower than those observed for the ANN models, and the drift in the test data response was clearly identifiable in all cases.

5. Conclusions
Methods were developed and applied to the estimation of prediction intervals for three commonly used empirical models. The analytical algorithms were successfully applied to actual nuclear power plant data. The analytic prediction interval estimation techniques were shown to consistently provide the expected level of coverage for the empirical models. The methods were also able to detect and quantify a drifting sensor. The prediction interval estimation methods were also applied to several other data sets in which similar results were found. However, the NNPLS algorithm performed very well on data sets with higher correlations. This fact agrees with the well-published assumption that no model performs best on all data sets.
References
1. Davis, E. and R. Shankar (2003), "Results of the EPRI/Utility On-Line Monitoring Implementation Program", Transactions of the American Nuclear Society 2003 Winter Meeting, New Orleans, LA.
2. Davis, E., D. Funk, D. Hooten and R. Rusaw (1998), "On-Line Monitoring of Instrument Channel Performance," EPRI TR-104965.
3. Fantoni, P.F. and A. Mazzola (1996), "A pattern recognition-artificial neural networks based model for signal validation in nuclear power plants," Annals of Nuclear Energy, 23, no. 13, 1069-1076.
4. Hines, J.W., R. Uhrig, C. Black and X. Xu (1997), "An evaluation of instrument calibration monitoring using artificial neural networks," Proceedings of the American Nuclear Society, Albuquerque, NM, November 16-20.
5. Wold, H. (1966), "Nonlinear Estimation by Iterative Least Squares Procedures," F. N. David, Ed., John Wiley, New York.
6. Qin, S. Joe and T.J. McAvoy (1992), "Nonlinear PLS modeling using neural networks," Comp. Chem. Engng., 16, no. 4, 379-391.
7. Rasmussen, B., J.W. Hines and R.E. Uhrig (2000), "A Novel Approach to Process Modeling for Instrument Surveillance and Calibration Verification," The Third American Nuclear Society International Topical Meeting on Nuclear Plant Instrumentation and Control and Human-Machine Interface Technologies, Washington DC, November 13-17.
8. Rasmussen, B. (2002), "Neural Network Partial Least Squares for Instrument Surveillance and Calibration Verification," MS Thesis, Nuclear Engineering Department, The University of Tennessee, Knoxville.
9. Hines, J.W., B. Rasmussen and R.E. Uhrig (2002), "An On-line Sensor Calibration Monitoring System," International Journal of COMADEM, Birmingham, United Kingdom.
10. Atkeson, C., A. Moore and S. Schaal (1997), "Locally weighted learning," Artificial Intelligence Review, 11, 76-113.
11. Rasmussen, B. (2003), "Prediction Interval Estimation Techniques for Empirical Modeling Strategies and their Applications to Signal Validation Tasks", Ph.D. dissertation, Nuclear Engineering Department, The University of Tennessee, Knoxville.
12. Chryssolouris, G., M. Lee and A. Ramsey (1996), "Confidence interval prediction for neural network models," IEEE Transactions on Neural Networks, 7, no. 1, 229-232.
NUCLEAR POWER PLANT MONITORING WITH MLP AND RBF NETWORK
KUNIHIKO NABESHIMA
Research Group for Advanced Reactor System, Japan Atomic Energy Research Institute, Tokai-mura, Ibaraki-ken, 319-1195, Japan
EMINE AYAZ, SERHAT SEKER, BURAK BARUTCU, ERDINC TURKCAN
Electric Engineering Department, Istanbul Technical University, 34469 Maslak, Istanbul, Turkey
KAZUHIKO KUDO
Department of Quantum Physics and Nuclear Engineering, Kyushu University, 6-10-1 Hakozaki, Higashiku, Fukuoka, 812-8581, Japan
A monitoring system with an MLP and an RBF network has been developed for NPPs. The MLP is used to detect the symptoms of anomalies and the RBF network is used to identify the abnormal events. Off-line test results using a PWR simulator make it clear that the monitoring system can successfully detect and diagnose small anomalies earlier than the conventional alarm system.
1. Introduction
The main purpose of Nuclear Power Plant (NPP) monitoring is to diagnose the current status of operating plants using process signals in real time. In particular, it is important for plant safety in aged NPPs to detect the symptoms of small anomalies at an early stage. Therefore, we have developed the Artificial Neural Network On-line Monitoring Aids (ANNOMA) system for the Borssele NPP in the Netherlands [1]. In the system, a Multi Layer Perceptron (MLP) in auto-associative mode can model and predict the plant dynamics by training on normal operational data only. The basic principle of the anomaly detection is to monitor the deviation between the process signals measured from the actual plant and the corresponding values predicted by the MLP [2]. The expert system is used to diagnose the plant status with the measured signals, the outputs of the MLP, and the alarm information from the conventional alarm system. The test results showed that the monitoring system with the MLP and an expert system could detect and diagnose small anomalies earlier than the conventional alarm system. However, the description of rules in the expert system becomes rather complicated if many kinds of anomaly cases are assumed. In this study, a Radial Basis Function (RBF) network [3] is tested in place of the expert system because it has the
advantage of fast learning and easy adaptation to changes in the network construction, that is, in the input or output signals.

2. Plant Monitoring System
Figure 1 shows the overview of the monitoring system. The monitoring system receives the digitized plant signals every two seconds from the data acquisition system. Out of these, the most significant plant signals are selected as the inputs of the MLP: neutron flux, flow rate, pressure, temperature, electric power, etc. Here, an on-line Pressurized Water Reactor (PWR) training simulator is utilized for evaluating the performance of the RBF network as a diagnosis method, because it is difficult to collect many kinds of anomaly data from actual power plants. The PWR simulator is manufactured on the basis of an existing 822 MWe power plant, Surry-1 in the U.S.A. It can take into consideration 49 abnormal events of the major systems, including failures of pumps, valves, controllers, pipes, etc. The conventional alarm system is attached to the panel of the plant simulator.

Figure 1. Overview of the ANNOMA system with MLP and RBF network (the PWR plant simulator, Surry-1 model, supplies 22 digital signals at 2-second intervals: neutron flux, flow rate, temperature, power, etc.)
I
2.1. MLP for Anomaly Detection
MLP has three layers: an input layer, one hidden layer, and an output layer. The numbers of input and output nodes are both 22, corresponding to the monitored signals listed in Table 1. In the auto-associative network, the output signals are supposed to be the same as the input signals at the same time step. The number of hidden nodes is selected as 20. The back-propagation algorithm is used for learning, and the sigmoidal function is selected as the transfer function. The patterns for initial learning were obtained during normal start-up and steady-state operations. Table 1 shows the maximum learning errors after 1,000 training repetitions per pattern, which are small enough for modeling.

Table 1. Monitoring signals and their maximum errors in learning (the 22 signals include hot-leg temperature, steam pressures, SG levels, steam flows, VCT level, turbine impulse pressure, average neutron flux, and generated electric power).
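The detection logic of the auto-associative MLP can be sketched compactly. The following Python fragment is a minimal illustration, not the ANNOMA implementation: the scikit-learn model, the random placeholder data, and the per-signal 1.25 threshold are assumptions based on the description above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder data: rows are the 22 monitored signals scaled to [0, 1].
rng = np.random.default_rng(0)
X_normal = rng.random((1000, 22))   # normal start-up / steady-state patterns
X_test = rng.random((50, 22))       # signals to be monitored on-line

# Auto-associative MLP: 22 inputs -> 20 sigmoid hidden nodes -> 22 outputs,
# trained to reproduce its own input, as described in the text.
model = MLPRegressor(hidden_layer_sizes=(20,), activation="logistic",
                     max_iter=2000, random_state=0)
model.fit(X_normal, X_normal)

# Detection: compare measured signals with their predictions.
e_max = np.abs(model.predict(X_normal) - X_normal).max(axis=0)  # learning errors
ratio = np.abs(model.predict(X_test) - X_test) / e_max
anomalous = (ratio > 1.25).any(axis=1)   # abnormal when e/e_max exceeds 1.25
print("suspect samples:", np.flatnonzero(anomalous))
```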
2.2. RBF Network for Plant Diagnosis
In this study, the RBF network is utilized instead of the expert system to develop a more efficient diagnosis system. The input layer of the RBF network receives the abnormal extent of the 22 signals calculated from the MLP outputs, as shown in Table 2. The abnormal extent is defined as follows: 1 for e/e_max > 1.25, 0.5 for 1.0 < e/e_max

g(t, x) = η(t, x, ẋ)/u    (12)

where η(t, x, ẋ) and u are continuous functions. Due to the boundedness of the chaotic attractor and the upper and lower limits of the control signal, η(t, x, ẋ) and u lie in a bounded set. Therefore we can normalize the numerator and denominator of g(t, x) in relation (12) for the fuzzy clustering process. At the starting moment we must set u to an arbitrary initial value; then, using sliding mode control, we can write:
u = g_k(t, x)⁻¹ (ẋ_r - f(t, x) - k·sat(s/φ))    (13)

s = λ(x - x_d),   ẋ_r = ẋ_d - λ(x - x_d),   k = η + ε    (14)

where g_k(t, x) is the result of the clustering identification of g(t, x) when the kth input-output data point is used. ε > 0, φ > 0, and η > 0 can be chosen arbitrarily, but if φ becomes large then the steady-state error becomes large.

5. Case Study and Simulation
The Lorenz system is considered as a case study; its dynamic equations are:

ẋ = σ(y - x),   ẏ = Rx - y - xz,   ż = -βz + xy    (15)

The behavior of the above system is chaotic for σ = 10, β = 8/3, R = 28. We assume that the dynamic system is unknown, and it is supposed that all of the state variables are measurable. Moreover, the deviation ΔR around R = 28 is used as the control parameter. In this example the state variable y is selected as the system output. The algorithm is simulated for two different desired trajectories: a) y_d = 5, and b) y_d = 5 + 5sin(4t). In both cases the two other states, x and z, are not controlled; however, if we control the state y along a desired path, the two other states do not show chaotic behavior. Thus, the sliding mode method used here controls only the state y. Since the final goal is controlling the system partially, on the state variable y only, the relations (13)-(15) are rewritten as:

ẋ = f_1(x, y, z),   ẏ = f_2(x, y, z) + g(x, y, z)·u,   ż = f_3(x, y, z)    (16)
u = g_k(x, y, z)⁻¹ (ẏ_r - f_2(x, y, z) - k·sat(s/φ)),   s = λ(y - y_d),   ẏ_r = ẏ_d - λ(y - y_d)    (17)

We set k = 40, λ = 1, and φ = 0.01, and the parameters of the identification algorithm are set to 1 and 5. Figures 1 and 2 show the time responses of the identification and control system. In both cases the controller is switched on only at t = 30 sec. Figure 3 shows the values of the control parameter versus time.
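For readers who want to reproduce the behavior, a minimal simulation of the controlled Lorenz system is sketched below. It is illustrative only: it substitutes the true f_2 and g = x for the fuzzy-clustering estimate g_k used in the paper, applies explicit Euler integration, and takes the gains k = 40, λ = 1, φ = 0.01 from the text.

```python
import numpy as np

sigma, beta, R = 10.0, 8.0/3.0, 28.0         # nominal Lorenz parameters
k, lam, phi = 40.0, 1.0, 0.01                # gains quoted in the text
sat = lambda z: np.clip(z, -1.0, 1.0)

dt, T = 1e-3, 40.0
state = np.array([1.0, 1.0, 1.0])
for n in range(int(T / dt)):
    t = n * dt
    x, y, z = state
    yd, yd_dot = 5.0 + 5.0*np.sin(4*t), 20.0*np.cos(4*t)   # case (b)
    u = 0.0                                   # u = Delta R, off until t = 30 s
    if t >= 30.0 and abs(x) > 1e-6:           # g(x, y, z) = x multiplies Delta R
        s = lam * (y - yd)
        y_r = yd_dot - lam * (y - yd)
        f2 = R*x - y - x*z
        u = (y_r - f2 - k*sat(s/phi)) / x     # sliding-mode law, Eq. (17)
    # Euler step of the Lorenz equations with the modulated Rayleigh number
    state = state + dt * np.array([sigma*(y - x),
                                   (R + u)*x - y - x*z,
                                   -beta*z + x*y])
```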
Figure 1. Behavior of the Lorenz system, y_d = 5; controller off for t < 30 sec, on for t > 30 sec.

Figure 2. Behavior of the Lorenz system, y_d = 5 + 5sin(4t); controller off for t < 30 sec, on for t > 30 sec.

Figure 3. Variations of R; upper figure: y_d = 5, lower figure: y_d = 5 + 5sin(4t).
6. Conclusion
Using the method explained in this paper, it is possible to control chaotic systems without any exact, crisp mathematical model of the dynamic equations. In this method we obtain a fuzzy model of the system by means of fuzzy clustering. Because the recurrent property of chaotic trajectories is exploited, it is not necessary to exert any external noise signal to identify the chaotic part of the system. Besides, we can estimate other parts of the system while the control signal is applied, using the updating capability of the identifier. Because sliding mode control is used, the closed-loop system is robust against the uncertainties of the estimation. The power of the control signal is not minimal, as a natural consequence of nonlinear control methods. The only requirement of this method is the ability to store a relatively large number of data points while the identification is applied. The results of a case study on the Lorenz system and the simulations verify the effectiveness of the proposed approach.
SECURE COMMUNICATION BASED ON CHAOTIC SYNCHRONIZATION PING LI, ZHONG LI AND WOLFGANG A. HALANG Faculty of Electrical Engineering, FernUniversität in Hagen, 58084 Hagen, Germany E-mail: {ping.li, zhong.li, wolfgang.halang}@fernuni-hagen.de GUANRONG CHEN Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, P. R. China E-mail: gchen@ee.cityu.edu.hk Chaos-based secure communication has been an attractive topic recently. In this paper, the techniques of secure communication based on chaotic synchronization are reviewed, including the schemes for modulating the message signal and the chaotic signal, the approaches to synchronization between the chaotic transmitter and the chaotic receiver, and the security analysis of the communication. Future studies are also discussed.
1 Introduction
Over the past few years, the application of chaotic systems to secure communication has attracted much attention. There are several reasons for this. Firstly, since chaotic signals are typically broadband, noise-like, and difficult to predict, they are natural candidates for carriers in communication systems that mask the message signal. Secondly, Pecora and Carroll have demonstrated that it is possible to synchronize two chaotic systems by coupling. Since then, secure communication based on chaotic synchronization has been arousing more and more interest. Finally, it has been shown that there is a close relation between chaotic systems and cryptographic algorithms, and chaotic systems have potential applications in cryptographic algorithms. In secure communication based on chaotic synchronization, the message signal is hidden in a chaotic signal by some modulation approach, and the combined signal is transmitted to the receiver. By synchronizing the transmitter and the receiver and then regenerating the chaotic signal at the receiver, the message signal can be extracted from the transmitted signal. As to chaotic encryption, S.J. Li has presented a detailed review in his dissertation [1]. In this paper, an overview of current secure communication based on chaotic synchronization is given, which includes the schemes for modulating the message signal and the chaotic signal, and the security analysis of the communication. In addition, measures for improving the security and future research directions are suggested.
2 Modulation schemes of secure communication based on chaotic synchronization
Different implementations of hiding the message signal in the chaotic signal are reviewed in the following.
(1) Chaotic masking. Chaotic masking hides a message signal simply by adding it to a chaotic signal. At the receiver, the message signal can be extracted by subtracting the regenerated chaotic signal, which is obtained by synchronizing the response system with the transmitted signal [2]. Since the additive message signal acts as a perturbation injected by the transmitted signal into the receiver, synchronization is possible only when the power level of the message signal is sufficiently small. Therefore, the power of the original message signal is limited. Additionally, this method is very sensitive to channel noise and to parameter mismatches between the transmitter and receiver systems. Consequently, the quality of the recovered signal and the degree of security of the communication are low.
(2) Chaos Shift Keying (CSK) or Chaotic Switching (CS). This scheme is used in the case of a binary message signal: the message signal is used to switch the transmitted signal between two similar chaotic systems with the same structure but different parameters, encoding bit 0 and bit 1 of the message signal, respectively. By synchronizing the chaotic attractor in the transmitter with the corresponding one in the receiver, the binary message signal is recovered. Recently, two distinct chaotic attractors have been used to represent one message bit. The CSK scheme is very robust to noise and parameter mismatch.
(3) Chaotic Parameter Modulation (CPM). The message signal is used to modulate a parameter of the transmitter. The receiver uses auto-synchronization to recover the message signal by reproducing the modulation. A variation of CPM was presented in [4], where a digital signal can be highly masked via multi-step parameter modulation combined with alternative driving of different transmitter variables. This makes it impossible for an intruder to reconstruct the chaotic attractor to extract the message signal [4].
(4) Chaos Modulation (CM). The message signal, hidden in the chaotic signal of the transmitter, is injected into the dynamics of the transmitter and the receiver, and drives the transmitter and the receiver in exactly the same way. Consequently, chaos modulation has been shown to provide perfect synchronization between the transmitter and the receiver [2]. Recently, CM has been combined with chaotic masking [6] or CSK [7] to develop the original schemes further.
(5) Inverse Systems (IS). The output of the transmitter is employed to drive the receiver. Since the receiver is the inverse of the transmitter, the output of the receiver is the message signal. The IS method can be applied to both analog and digital systems; here, only analog ones are introduced. In the IS scheme, it is important to design the inverse system of the transmitter. Recently, several demodulation procedures have been proposed to approximate the exact inverse of the transmitter [8].
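To make the masking scheme concrete, the sketch below implements additive chaotic masking with a Pecora-Carroll style drive-response pair. The Lorenz system, the coupling by substitution of the transmitted signal, and the message amplitude are illustrative assumptions, not a specific published design.

```python
import numpy as np

sigma, r, b = 10.0, 28.0, 8.0/3.0
dt, N = 1e-3, 60000

tx = np.array([1.0, 0.0, 0.0])      # transmitter state (x, y, z)
rx = np.array([5.0, 5.0, 5.0])      # receiver state, different initial condition
recovered, message = [], []

for n in range(N):
    t = n * dt
    m = 0.05 * np.sin(2*np.pi*2*t)  # small message, power far below the carrier
    s = tx[0] + m                   # transmitted signal: chaotic carrier + message
    # Transmitter: ordinary Lorenz system (Euler step)
    x, y, z = tx
    tx = tx + dt * np.array([sigma*(y - x), r*x - y - x*z, x*y - b*z])
    # Receiver: response subsystem driven by the received signal s
    xr, yr, zr = rx
    rx = rx + dt * np.array([sigma*(yr - xr), r*s - yr - s*zr, s*yr - b*zr])
    recovered.append(s - rx[0])     # subtract the regenerated carrier to unmask
    message.append(m)

err = np.abs(np.array(recovered[30000:]) - np.array(message[30000:])).mean()
print(f"mean recovery error after synchronization: {err:.4f}")
```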
3 Security Analysis of Chaotic Secure Communication
So far, many schemes of chaotic secure communication exist; however, many of them have been proven weak under various attack methods.
(1) Constructing the dynamics of the transmitter. The message signal can be recovered by reconstructing the dynamics of the transmitter via nonlinear dynamic (NLD) forecasting and return maps. NLD forecasting has been used to extract the message signal in chaotic masking and CSK [9], as well as in secure communication systems based on high-dimensional chaotic systems [10]. By partially reconstructing the dynamics of the chaotic system via return maps [5], the message signal in chaotic masking and CSK can be extracted. Moreover, it has been shown that a chaotic secure communication scheme has a higher security level if its return map is more complicated or the change of the return map is more irregular [11].
(2) Spectral analysis. In 1995, T. Yang et al. unmasked the CSK scheme via the average frequency difference between two attractors [12], by using a spectrogram to detect their difference [13], and by combining the spectrogram with two single-layer neural networks [14]. Rather recently, G. Alvarez recovered the message signal by a power analysis attack [15].
(3) Constructing an intruder's receiver. When the sensitivity to the parameter values is very low, the message signal can be recovered by constructing an intruder's receiver with parameter values considerably different from the original ones. Message signals have been decrypted by constructing a receiver via parameter estimation and generalized synchronization [16].
(4) Conventional cryptanalysis methods.
Conventional cryptanalysis methods, such as those based on key space analysis, brute-force attacks, statistical analysis, and plaintext attacks, are also used in the security analysis of chaotic secure communication. As can be seen, the security analysis of such communication is as flexible as cryptanalysis; one should choose suitable methods in terms of the special characteristics of the secure communication scheme itself.

4 Approaches for Improving Security
In view of the weaknesses in the security of chaotic synchronization communication, various measures have been developed to resist attacks.
(1) Application of high-dimensional chaotic systems. The low dimensionality of an attractor is a security weakness because of its rather simple geometric structure, while high-dimensional chaotic systems, which have increased randomness and unpredictability, make it more difficult to describe the attractor's structure and to imitate the key parameters. Therefore, to improve the security of communication, high-dimensional chaotic systems are preferred. A high-dimensional chaotic system was implemented by using standard low-dimensional systems with well-known dynamics as building blocks [17]; only if the building blocks in the transmitter and the receiver are synchronized can the message signal be recovered.
(2) Using discrete chaotic systems. Chaotic communication using analog systems, especially those based on chaotic masking, has a serious weakness, since the recovery of the message signal depends overwhelmingly on the synchronization error, which can be exploited by an intruder to break the secure communication. In contrast, digital systems make perfect matching of the receiver to the transmitter possible. Therefore, using discrete chaotic systems is an alternative way to improve the security of communication [6]. Moreover, since real transmission lines introduce an unknown amount of attenuation and delay in the transmitted signal, as well as distortion and noise, it is difficult to achieve synchronization between an analog transmitter and receiver, and secure communication fails. Research toward secure communication based on discrete chaotic systems therefore attracts more and more interest.
(3) Combination of chaotic synchronization and conventional encryption. The message signal is encrypted by a multishift cipher before being modulated with the chaotic signal of the transmitter in different ways and then transmitted to the receiver. It is shown that this method is much more sensitive to recovery errors and modeling errors; thus, the level of security is
enhanced. Alternatively, the message signal is XORed with a chaotic signal, which is generated by truncating the output signal of the transmitter before transmission [20]. Since such transmitted signals are encrypted before being transmitted, they contain very little information about the dynamics of the transmitter and cannot be used by an intruder to reconstruct the transmitter. As a result, the security of the communication is improved. In summary, improving the security of communication means making it as difficult as possible to separate the message signal from the chaotic signal. First, one can apply relatively more complicated modulation schemes such as CM or CPM instead of chaotic masking, or develop the current modulation schemes to a higher security level. Second, one can use a complex chaotic signal as the carrier, for instance a high-dimensional one. Third, one can encrypt the message signal before modulation, where many good encryption schemes can be selected. It should be noted that while the security degree is increased via these approaches, the cost of secure communication may rise and other aspects of communication performance, such as real-time operation, may become worse. Therefore, it is important for a designer to find a compromise.

5 Conclusion
In this paper, we have reviewed the modulation schemes of secure communication based on chaotic synchronization. The methods for the security analysis of such communication systems and the measures to improve their security have then been discussed. So far, the security of most of these communication systems is unsatisfactory, and they require much more development before they can be used practically. Therefore, designing chaos-synchronization communication schemes with higher security and robustness to noisy perturbations should be a focus of future study.
References
1. S.J. Li. Analyses and New Designs of Digital Chaotic Ciphers. PhD thesis, Xi'an Jiaotong University, 2003.
2. H.G. Schuster (ed). Handbook of Chaos Control. Wiley-VCH, 1999.
3. Y. Chu and S. Chang. Dynamical cryptography based on synchronised chaotic systems. Electron. Lett., 35(12):974-975, 1999.
4. P. Palaniyandi and M. Lakshmanan. Secure digital signal transmission by multistep parameter modulation and alternative driving of transmitter variables. Int. J. Bifurcation Chaos, 11(7):2031-2036, 2001.
5. G. Perez and H.A. Cerdeira. Extracting messages masked by chaos. Phys. Rev. Lett., 74(11):1970-1973, 1995.
6. M. Feki et al. Secure digital communication using discrete-time chaos synchronization. Chaos, Solitons and Fractals, 18:881-890, 2003.
7. K. Murali. Digital signal transmission with cascaded heterogeneous chaotic systems. Phys. Rev. E, 63, 2000. 016217.
8. J.A. Ramirez, H. Puebla, and J. Solis-Daun. An inverse system approach for chaotic communication. Int. J. Bifurcation and Chaos, 11(5):1411-1422, 2001.
9. K. Short. Unmasking a modulated chaotic communications scheme. Int. J. Bifurc. Chaos, 6(2):367-375, 1996.
10. K.M. Short and A.T. Parker. Unmasking a hyper-chaotic communication scheme. Phys. Rev. E, 58:1159-1162, 1998.
11. T. Yang et al. Cryptanalyzing chaotic secure communication using return maps. Physics Letters A, 25(6):495-510, 1998.
12. T. Yang. Recovery of digital signals from chaotic switching. Int. J. Circuit Theory Application, 23(6):611-615, 1995.
13. T. Yang, L.B. Yang, and C.M. Yang. Breaking chaotic secure communication using a spectrogram. Physics Letters A, 247(1-2):105-111, 1998.
14. T. Yang, L.B. Yang, and C.M. Yang. Application of neural networks to unmasking chaotic secure communication. Physica D, 124(1-3):248-257, 1998.
15. G. Alvarez, F. Montoya, M. Romera, and G. Pastor. Breaking parameter modulated chaotic secure communication system. arXiv: nlin.CD/0311041 v1, 20 Nov 2003.
16. G. Alvarez, F. Montoya, G. Pastor, and M. Romera. Breaking a secure communication scheme based on the phase synchronization of chaotic systems. arXiv: nlin.CD/0311040 v1, 20 Nov 2003.
17. H. Puebla and J.A. Ramirez. More secure communication using chained chaotic oscillators. Phys. Lett. A, 283:96-108, 2001.
18. G. Grassi and S. Mascolo. Synchronizing hyperchaotic systems by observer design. IEEE Trans. Circuits Syst. II, 46:478-483, 1999.
19. T. Yang. A survey of chaotic secure communication systems. Int. J. Computational Cognition, 2(2):81-130, 2004.
20. Y. Zhang, G.H. Du, Y.M. Hua, and J.J. Jiang. Digital speech communication by truncated chaotic synchronization. Int. J. Bifurcation and Chaos, 13(3):691-701, 2003.
21. M. Itoh, C.W. Wu, and L.O. Chua. Communication systems via chaotic signals from a reconstruction viewpoint. Int. J. Bifurcation and Chaos, 7(2):275-286, 1997.
TRANSITION BETWEEN FUZZY AND CHAOTIC SYSTEMS*
ZHONG LI, PING LI AND WOLFGANG A. HALANG Faculty of Electrical Engineering, FernUniversität Hagen, 58084 Hagen, Germany E-mail: {ping.li, zhong.li, wolfgang.halang}@fernuni-hagen.de
This paper presents an overview of recent studies on the interaction between fuzzy logic and chaos theory. On the one hand, it is shown that chaotic systems can be transformed into either model-free or model-based fuzzy models, which means that fuzzy systems can also be chaotic. On the other hand, it is further shown that fuzzy systems can be made chaotic (or chaotified) with some simple and implementable controllers, and the existing chaotification approaches are mathematically rigorous in the sense of commonly used mathematical criteria for chaos, such as those defined by Devaney and by Li-Yorke.
1. Introduction
Although the relation between fuzzy logic and chaos theory is not completely understood at the moment, the study of their interactions has been carried out for more than a decade, at least from the following aspects: fuzzy control of chaos, adaptive fuzzy systems from chaotic time series, theoretical relations between fuzzy logic and chaos theory, fuzzy modeling of chaotic systems with assigned properties [1,2], chaotifying Takagi-Sugeno (TS) fuzzy models [3,4,5,6], and fuzzy-chaos-based cryptography. Fuzzy logic was originally introduced by Lotfi Zadeh in 1965 in his seminal paper "Fuzzy sets" [7], and the first evidence of physical chaos was Edward Lorenz's discovery in 1963 [8], although the study of chaos can be traced back to philosophical pondering hundreds of years ago and to the work of the French mathematician Jules Henri Poincaré at the turn of the last century. Fuzzy set theory resembles human reasoning in its use of approximate information and inaccurate data to generate decisions under uncertain environments. It is designed to mathematically represent uncertainty and vagueness and to provide formalized tools for dealing with imprecision in real-world problems.

*This work was supported by the Alexander von Humboldt Foundation.
On the other hand, chaos theory is a qualitative study of unstable aperiodic behavior in deterministic nonlinear dynamical systems. Research reveals that it is due to the drastically evolving and changing chaotic dynamics that the human brain can process massive information instantly. "The controlled chaos of the brain is more than an accidental by-product of the brain complexity, including its myriad connections; rather, it may be the chief property that makes the brain different from an artificial-intelligence machine" [9]. Therefore, it is believed that both fuzzy logic and chaos theory are related to human reasoning and information processing. Based on the above observations, the study of the interactions between fuzzy logic and chaos theory should provide a new and promising, although challenging, approach for theoretical research and simulational study of human intelligence. In order to better understand this research subject, this paper briefly reviews the studies on the transition between fuzzy and chaotic systems, including fuzzy modeling of chaotic systems and chaotifying fuzzy systems. In addition, other research on fuzzy-chaos-based applications is commented on.

2. Fuzzy Modeling of Chaotic Systems
In this section, only the model-based approach is considered for fuzzy modeling of chaotic systems. Here, TS fuzzy modeling of the Lorenz system is carried out to show this approach. The Lorenz equations are as follows:

dx/dt = σ(y - x),   dy/dt = rx - y - xz,   dz/dt = xy - bz    (1)

where σ, r, b > 0 are parameters (σ is the Prandtl number, r is the Rayleigh number, and b is a scaling constant). The nominal values of (σ, r, b) are (10, 28, 8/3) for chaos to emerge. The system (1) has two nonlinear quadratic terms, xy and xz. Therefore, this system can be divided into a linear system plus nonlinear terms as follows:

d/dt [x, y, z]ᵀ = [[-σ, σ, 0], [r, -1, 0], [0, 0, -b]] [x, y, z]ᵀ + [0, -xz, xy]ᵀ    (2)
To construct a TS fuzzy model for the Lorenz system, the nonlinear terms xz and xy must be expressed as weighted sums of linear functions. For this purpose, we first need the following corollary.

Corollary 2.1. Assume x ∈ [M_1, M_2]. The nonlinear term f(x, y) = x·y can be represented by a weighted sum of linear functions of the form

f(x, y) = Γ_1(x)·M_1·y + Γ_2(x)·M_2·y,   where Γ_1(x) + Γ_2(x) = 1.

Now, we can construct an exact TS fuzzy model, which is not an approximation, of system (1). Using Corollary 2.1, system (2) can be expressed as follows:

Plant Rules:
Rule 1: IF x(t) is about M_1 THEN dX/dt = A_1·X
Rule 2: IF x(t) is about M_2 THEN dX/dt = A_2·X

where X = [x, y, z]ᵀ,

A_1 = [[-σ, σ, 0], [r, -1, -M_1], [0, M_1, -b]],   A_2 = [[-σ, σ, 0], [r, -1, -M_2], [0, M_2, -b]],

and the membership functions are

Γ_1 = (-x + M_2)/(M_2 - M_1),   Γ_2 = (x - M_1)/(M_2 - M_1),

where Γ_2 is positive semi-definite for all x ∈ [M_1, M_2]. We emphasize that the TS fuzzy model of the Lorenz system, shown in Fig. 1, is not an approximation of the original system, but a perfect fuzzy model, since the defuzzified output of the TS fuzzy model is identical to that of the original chaotic Lorenz system.
Figure 1. TS fuzzy model of the Lorenz system
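The exactness of the construction is easy to verify numerically. The following sketch (with arbitrary assumed bounds M_1, M_2 and random test points) blends A_1 and A_2 with the memberships Γ_1, Γ_2 and checks that the result equals the true Lorenz vector field.

```python
import numpy as np

sigma, r, b = 10.0, 28.0, 8.0/3.0
M1, M2 = -30.0, 30.0                       # assumed bounds with x in [M1, M2]

def A(M):                                   # local linear model for x "about M"
    return np.array([[-sigma, sigma, 0.0],
                     [r, -1.0, -M],
                     [0.0, M, -b]])

def lorenz(v):                              # true nonlinear vector field
    x, y, z = v
    return np.array([sigma*(y - x), r*x - y - x*z, x*y - b*z])

rng = np.random.default_rng(1)
for v in rng.uniform([-30, -30, 0], [30, 30, 50], size=(5, 3)):
    g1 = (M2 - v[0]) / (M2 - M1)            # membership Gamma_1
    g2 = (v[0] - M1) / (M2 - M1)            # membership Gamma_2, g1 + g2 = 1
    blended = (g1*A(M1) + g2*A(M2)) @ v
    assert np.allclose(blended, lorenz(v)), "fuzzy blend should be exact"
print("defuzzified TS model matches the Lorenz vector field exactly")
```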
3. Generating Chaos From Fuzzy Systems
In contrast to the main stream of ordering or suppressing chaos, the opposite direction of making a nonchaotic dynamical system chaotic, or retaining the existing chaos of a chaotic system, known as "chaotification" (or sometimes "anticontrol of chaos"), has attracted continuous attention from the engineering and physics communities in recent years. There are many practical reasons for chaos generation; for instance, chaos has an impact on some novel time- and/or energy-critical applications. Specific examples include high-performance circuits and devices (e.g., delta-sigma modulators and power converters), liquid mixing, chemical reactions, biological systems (e.g., the human brain, heart, and perceptual processes), secure information processing, and critical decision-making in political, economic, and military events. Some systematic and rigorous approaches have been developed to chaotify general discrete-time and continuous-time systems [3,4,5], which inspires us to extend these technologies to fuzzy systems.
An example is included here for illustration. First, consider a nonchaotic discrete-time TS fuzzy model, given by

Rule 1: IF x(t) is Γ_1 THEN x(t + 1) = G_1·x(t) + u(t),
Rule 2: IF x(t) is Γ_2 THEN x(t + 1) = G_2·x(t) + u(t),

where G_1 and G_2 are constant system matrices (with entries built from 0.3 and ±d), x(t) ∈ [-d, d] with d > 0, and Γ_1 and Γ_2 are the corresponding membership functions. The controlled TS fuzzy system is obtained by taking the controller u(t) as a sinusoidal function of the state; the details are omitted here for simplicity. In the simulation, the magnitude of the control input is arbitrarily chosen to be σ = 0.1; thus ||u(t)||∞ ≤ σ, and σ can also be regarded as a control parameter. Without control, the TS fuzzy model is stable. When β = 1.3, the resulting phase portrait is shown in Fig. 2. These numerical results verify the theoretical analysis and the design of the proposed chaos generator.

Remarks: For continuous-time TS fuzzy systems, two approaches have been used for chaotification. One is to discretize them first, after which the above method can be applied [5]. The other is more general, designing a time-delayed feedback controller [6].
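Because the printed system matrices are illegible here, the sketch below reproduces only the idea, with assumed scalar local gains and assumed triangular memberships; the bounded sinusoidal feedback u(t) = σ·sin((β/σ)·x(t)) follows the cited chaotification approach.

```python
import numpy as np

d, G1, G2 = 1.0, 0.5, -0.5        # assumed stable local gains (source values garbled)
sig, beta = 0.1, 1.3              # |u| <= sig; beta is the control parameter

def fuzzy_map(x, u=0.0):
    g1 = (d + x) / (2*d)          # assumed membership for Rule 1 on [-d, d]
    g2 = (d - x) / (2*d)          # assumed membership for Rule 2
    return g1*(G1*x) + g2*(G2*x) + u

x, orbit = 0.3, []
for n in range(5000):
    u = sig * np.sin((beta/sig) * x)   # bounded sinusoidal feedback
    x = fuzzy_map(x, u)
    orbit.append(x)

# Without control the blended map contracts every orbit to 0; with the
# sinusoidal input the orbit stays bounded but wanders irregularly.
print("orbit range with control:", min(orbit[1000:]), max(orbit[1000:]))
```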
4. Conclusion
In this paper, current studies on the interactions between fuzzy logic and chaos theory have been briefly reviewed, with a focus on fuzzy modeling of chaotic systems and chaotification of fuzzy systems. With a further understanding of their relations, combining fuzzy and chaos control methods will provide promising means for applications. Therefore, more efforts are needed to further explore the interactions between fuzzy logic and chaos theory.
Figure 2. Phase portrait with some structure.
References
1. S. Baglio, L. Fortuna and G. Manganaro, "Design of fuzzy iterators to generate chaotic time series with assigned Lyapunov exponent," Electronics Letters, Vol. 32, No. 4, 1996, pp. 292-293.
2. M. Porto and P. Amato, "A fuzzy approach for modeling chaotic dynamics with assigned properties," The 9th IEEE Int. Conf. on Fuzzy Systems, FUZZ-IEEE 2000, Vol. 1, pp. 435-440.
3. Z. Li, J.B. Park, G. Chen, Y.H. Joo and Y.H. Choi, "Generating chaos via feedback control from a stable TS fuzzy system through a sinusoidal nonlinearity," Int. J. Bifur. Chaos, Vol. 12, No. 10, 2002, pp. 2283-2291.
4. Z. Li, J.B. Park, Y.H. Joo, G. Chen and Y.H. Choi, "Anticontrol of chaos for discrete TS fuzzy systems," IEEE Trans. Circ. Syst.-I, Vol. 49, No. 2, 2002, pp. 249-253.
5. Z. Li, J.B. Park, and Y.H. Joo, "Chaotifying continuous-time TS fuzzy systems via discretization," IEEE Trans. Circ. Syst.-I, Vol. 48, No. 10, 2001, pp. 1237-1243.
6. Z. Li, W. Halang, G. Chen and L.F. Tian, "Chaotifying a continuous-time TS fuzzy system via time-delay feedback," J. of Dynamics of Continuous, Discrete and Impulsive Systems, Series B, Vol. 10, No. 6, 2003, pp. 813-832.
7. L. Zadeh, "Fuzzy sets," Inf. Control, Vol. 8, 1965, pp. 338-353.
8. E.N. Lorenz, "Deterministic nonperiodic flow," J. Atmos. Sci., Vol. 20, 1963, pp. 130-141.
9. W.J. Freeman, "The physiology of perception," Scientific American, Feb. 1991, pp. 78-85.
A NOVEL CHAOS-BASED VIDEO ENCRYPTION ALGORITHM *
HUAN JIAN, YAOBIN MAO AND ZHIQUAN WANG Department of Automation, Nanjing University of Sci. & Tech., No. 200 Xiaolingwei Str., Nanjing 210094, P. R. China E-mail: [email protected]
ZHONG LI AND PING LI Faculty of Electrical and Computer Engineering, FernUniversität Hagen, 58084, Germany E-mail: zhong.li@fernuni-hagen.de
The proposed chaos-based encryption algorithm first employs a sawtooth-like chaotic map to generate a pseudo-random bit sequence, then uses it to determine the types of encrypting operations performed on video coding streams. Four kinds of selective encrypting operations are introduced to efficiently scramble and shuffle all of the DC coefficients, part of the AC coefficients of I blocks, as well as Motion Vectors (MVs), through XOR, XNOR and replacement operations. With slight computational overhead and tiny data dilation, the encryption algorithm is amenable to the H.263 video-conference coding standard. Finally, the feasibility and the security of the proposed algorithm are demonstrated by experiments carried out on several segments of H.263 coded video streams.
1. Introduction
The wide deployment of video services such as VoD (Video on Demand), video conferencing and video surveillance has dramatically increased research interest in multimedia security in the recent ten years. To protect video contents, cryptology, which appears to be an effective way to ensure information security, has been employed in many practical applications. However, due to intrinsic properties of videos such as bulk capacity and high redundancy, compression of video data is unavoidable. The extra operations of encryption aggravate the cost of video coding, which makes real-time
64 1
642
video application difficult. In this regards, it is argued that the traditional encryption algorithms like DES, IDEA, and RSA, which have been originally developed for text data, are not suitable for secure real-time video
application^^^^^. A recent major trend in multimedia encryption is to minimize the computational requirements by “selective encryption” that only those intelligible important parts of content are subject to encryptionl0>l2.Many h he me^^^^^^,^^^^^ complied with this principle have been brought out, to sum up, the core idea of the proposed selective algorithms for video encryption is to shuffle or scramble only a portion of the compressed bitstream so that the resulting output is un-decodable in the absence of correct keys, or, even it is illegally breakable, the visual quality of the decoded video is still unacceptable for practical use. Since most commonly used video compression standards such as MPEG, H.261 and H.263 utilize transform coding, for instance block-based DCT, to reduce spacial redundance and inter-picture prediction to eliminate temporal redundancy, the main energy of a video is concentrated in a few DC (Direct Current) and AC (Alternating Current) coefficients in most of the intra-coded frames (I frames), meanwhile, the inter-frame information is expressed as several motion vectors (MVs). After I-frame extraction, the residual information of a video segment is contained in two other kinds of frames, forward predictive coded frames (P frames) and hi-directional predictive frames (B frames) respectively. Selective encryption can be imposed on entire I frames or portion of I, P and B frames as well as MVs. Different schemes employs different element selection strategies and different cryptological algorithms. Some earlier algorithms involve encryption of I framesg. However, Agi and Gong’ revealed that great portions of the videos can be reconstructed by un-encrypted I blocks in P and B frames, therefore sole encryption of I frames may not be sufficient. Tanglo has suggested a cipher scheme that encrypts different levels of selected video streams. According to different security requirements, stream headers, I frames, I blocks in P and B frames and some other types of frames are subject to encryption respectively. Other efficient algorithms that alter sign bits of motion vectors and DCT coefficients have been proposed by Shi and Bhargava7, which significantly reduced the calculation overhead spent on encryption. Recently, Zeng and Lei l2 has proposed a scramble technique in frequency domain that divides the transform coefficients into blocks/segments and performs encryptions through part or all of following three operations: selective bit scrambling, block shuffling and block rotation of the transform coefficients and motion vectors.
In parallel with the development of selective encryption, another important technology, chaos-based cryptology, has emerged and flourished. Owing to prominent properties of chaos such as sensitivity to parameters and initial values, ergodicity and a broad-band spectrum, chaotic pseudo-random sequences have several advantageous features, such as ease of generation, sensitive dependence on seeds, and non-periodicity, which especially suit the application of cryptology. There have been some attempts at incorporating chaos in data encryption, and even video encryption. Sobhy and Shehata [8] have described an algorithm that uses the chaotic Lorenz system to accomplish all-purpose encryption. According to their experiments, the time required for encrypting an image of 93 kilobytes is about 20 seconds, which is quite unacceptable for a real-time application. A chaos-based image encryption scheme suggested by Yen and Guo [11] changes the motion vectors by XOR and XNOR operations with two separately generated chaotic sequences controlled by two predefined keys. However, since the scheme does not mention any operations on I blocks, the improvement in the security of video encryption is questionable. Inspired by selective encryption and chaos-based encryption, in this paper we present a novel chaotic video encryption algorithm. A sawtooth-like chaotic map is first used to generate a pseudo-random bit sequence (PRBS); then, according to the PRBS, the DC coefficients, some AC coefficients of I blocks, and the motion vectors are encrypted. The details of the algorithm as well as some experimental results are elaborated in the paper. The rest of the paper is organized as follows. In section 2, details of the proposed algorithm are described. Section 3 exhibits the experimental results. Finally, section 4 concludes the whole paper.
2. Chaos-based video encryption algorithm
2.1. The generation of the chaotic pseudo-random bit sequence
To simplify the software implementation and speed up the encryption, a sawtooth-like map [3] (shown in formula (1)) is used to generate the chaotic PRBS (CPRBS):

x_{n+1} = c·x_n mod 1    (1)
It can be demonstrated that the above map has many good properties which totally meet our CPRBS generator requirements.
Firstly, we inspect the Lyapunov Exponent (LE) of map (1). According to the definition of the Lyapunov Exponent of a one-dimensional map, the LE of the map, λ, can be calculated as below:

λ = lim_{T→∞} (1/T) Σ_{n=0}^{T-1} ln |dx_{n+1}/dx_n| = ln c    (2)
So when c > 1, the map is chaotic. Furthermore, if we restrict the parameter c to an integer set and let c ≥ 2, the distribution of the sequence generated by map (1) is even [6]. That means that, to generate an evenly distributed pseudo-random bit sequence by bi-polarizing the chaotic sequence produced by the sawtooth-like map, the threshold should be set to 0.5. Using the above chaotic map, a CPRNG is easy to obtain. Suppose that after n iterations the map produces a value x_n. The nth bit b_n of the sequence is then determined by the following coin-tossing formula:

b_n = 1 if x_n > 0.5;  b_n = 0 if x_n ≤ 0.5    (3)
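A direct transcription of the generator defined by Eqs. (1) and (3) is given below. It is a sketch: the integer parameter c and the seed are example values, and double-precision floating point is used where a practical implementation would prefer fixed-point arithmetic to limit precision loss.

```python
def cprbs(seed: float, c: int = 3, n: int = 64) -> list[int]:
    """Chaotic pseudo-random bits from the sawtooth-like map x <- c*x mod 1."""
    assert c >= 2 and 0.0 < seed < 1.0
    x, bits = seed, []
    for _ in range(n):
        x = (c * x) % 1.0                   # map (1): Lyapunov exponent ln c > 0
        bits.append(1 if x > 0.5 else 0)    # coin-tossing rule (3)
    return bits

# The resulting key stream can drive XOR/XNOR operations on coefficients.
key_stream = cprbs(seed=0.73105, c=3)
print("".join(map(str, key_stream)))
```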
3.2. Encryption results
Experiments on encrypting video sequences have been performed, which show the effectiveness of the proposed algorithm. Fig. 2 exhibits one set of typical results.
3.3. Encryption speed
The time cost of encrypting H.263 video streams has been tested. In this experiment, three segments of video are subject to test, each of which
Figure 2. Experiments on the Foreman.qcif sequence: the upper left is the original image of the 1st frame, the upper right is the corresponding encrypted image, the bottom left is the original image of the 86th frame, and the bottom right is the corresponding encrypted image.
contains 300 frames of 176 x 144 images. The testing results are shown in Table 2. From the experimental results, we find that the computational overhead introduced by the encryption processing is slight, and that the encryption processing does not interfere with the motion features of the videos.

Table 2. Testing results of encryption speed.

Video sequence | Coding time without encryption | Coding time with encryption | Overhead
Foreman.qcif | 32.27 | 32.84 | 1.77%
Claire.qcif | 25.63 | 25.86 | 0.90%
Bridge-far.qcif | | 24.03 | 1.05%
3.4. Data dilation
The same three video segments are also used to test data dilation. Experimental results show that the proposed encryption algorithm introduces only slight data inflation. Table 3 exhibits the results.
Table 3. Testing results of data dilation for the Foreman.qcif, Claire.qcif and Bridge-far.qcif sequences.
Figure 2. Bi-dimensional view of Chua's attractor.
To explain the role of dynamical absorbers in controlling chaotic behavior, let us consider the Duffing oscillator coupled with an additional linear system:

X″ + aX′ + bX + cX³ + d(X - Y) = B₀ + B₁cos(ωt)    (6a)
Y″ + e(Y - X) = 0    (6b)

where a, b, c, d, e, B₀, B₁, and ω are constants. Here d and e are the characteristic parameters of the absorber, and we take e as the control parameter. It is well known that the Duffing oscillator shows chaotic behavior in certain parameter regions. We show in Figure 3 a two-dimensional view of the chaotic behavior in the Duffing oscillator. In Figure 4 we show a plot of
variable X across the time interval [0, 350]. Let us consider the parameters of Equation (6) to be fixed at the values a = 0.077, b = 0, c = 1.0, B₀ = 0.045, B₁ = 0.16, ω = 1.0; then we can find [7] that we have chaos for e ∈ [0, 0.10], and we can control this chaos by increasing e above 0.10.
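A minimal simulation of system (6) illustrates the role of the absorber parameter e. The value of d, the initial conditions, and the explicit Euler integrator are assumptions; a, b, c, B₀, B₁ and ω are taken from the text.

```python
import numpy as np

a, b, c, d = 0.077, 0.0, 1.0, 0.3    # d assumed; the text does not fix it
B0, B1, w = 0.045, 0.16, 1.0

def rhs(s, t, e):
    X, Xd, Y, Yd = s                  # Duffing state and absorber state
    return np.array([Xd,
                     B0 + B1*np.cos(w*t) - a*Xd - b*X - c*X**3 - d*(X - Y),
                     Yd,
                     -e*(Y - X)])

def simulate(e, T=350.0, dt=1e-3):
    s, out = np.array([0.1, 0.0, 0.0, 0.0]), []
    for n in range(int(T / dt)):
        s = s + dt * rhs(s, n*dt, e)  # explicit Euler, adequate for a sketch
        out.append(s[0])
    return np.array(out)

for e in (0.05, 0.15):                # inside vs. outside the chaotic range [0, 0.10]
    X = simulate(e)
    print(f"e = {e:.2f}: spread of X over the last 50 s = {np.ptp(X[-50000:]):.3f}")
```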
Figure 3. Chaotic behavior in Duffing's oscillator.

Figure 4. Plot of variable X across time for Duffing's oscillator.

4. Controlling Chaotic Behavior using the Concept of Fuzzy Chaos
In any of the above-mentioned methods for controlling chaos, a specific parameter is used to change the dynamics of the system from chaotic to stable behavior. For example, in the specific case of Duffing's oscillator the parameter "e" of Equation (6) can be used for controlling the chaotic behavior of the oscillator. However, the crisp interval [0, 0.10] for parameter "e" in which chaotic behavior occurs is not really an accurate range of values. Of course, for e = 0 we can expect chaotic behavior, but as "e" increases in value, real chaotic behavior is more difficult to find. At the crisp boundary e = 0.10 things are more dramatic: one can find either cyclic stable behavior or unstable behavior depending on the conditions of the experiment or simulation. For this reason, it is more appropriate to use the proposed concept of "fuzzy chaos", which allows us to model the uncertainty in identifying this chaotic behavior. In this case, a membership function can be defined to represent this uncertainty in
finding chaotic behavior; this is also really helpful in controlling chaotic behavior, since we can take action even before completely chaotic behavior is present. For the case of the Duffing oscillator we can define fuzzy rules for identifying specific dynamic behaviors. For example, chaotic behavior can be flagged by the following rule: IF e is Small THEN behavior is fuzzy chaos. In this fuzzy rule, the linguistic term "Small" has to be defined by an appropriate membership function. Other similar rules can be established for identifying different dynamic behaviors of the system. One obvious advantage of this approach is that we obtain relative evidence of chaotic behavior before there is complete instability. As a consequence, we can take action to control this chaotic behavior sooner than with traditional methods. A sample fuzzy rule for controlling chaos is as follows: IF behavior is fuzzy chaos THEN increase of e is small positive. This fuzzy rule simply states that when fuzzy chaos is present we must increase the value of "e" slightly. Of course, linguistic terms like "small positive" need to be defined properly. We show in Table 1 the comparison between the methods for controlling chaos for the two dynamic systems considered in this paper, giving the efficiency and accuracy of controlling chaotic behavior for the two cases described before. We considered a sample of 200 different experimental conditions for both dynamical systems, and compared the relative number of times that each method was able to really control the chaotic behavior. The implementation of the fuzzy chaos approach for behavior identification was done in MATLAB [17]. We show in Figure 5 the membership functions of the linguistic variable corresponding to the parameter "e", for which there are three linguistic values.

Table 1. Comparison between the methods for controlling chaos.
 | Traditional chaos definition (%) | New fuzzy chaos definition (%)
Chua's circuit | 98.50 | 99.50
Duffing's oscillator | 96.00 | 98.50

Figure 5. Membership functions for parameter "e".
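The two rules above can be prototyped in a few lines. In the sketch below the triangular membership for "e is Small" is an assumed shape (Figure 5 defines the actual functions), and the increment step is illustrative.

```python
def mu_small(e: float) -> float:
    """Assumed triangular membership for 'e is Small' on [0, 0.2]."""
    if e <= 0.0:
        return 1.0
    return max(0.0, 1.0 - e / 0.2)

def fuzzy_chaos_degree(e: float) -> float:
    # Rule: IF e is Small THEN behavior is fuzzy chaos
    return mu_small(e)

def control_action(e: float, step: float = 0.02) -> float:
    # Rule: IF behavior is fuzzy chaos THEN increase of e is small positive
    return e + step * fuzzy_chaos_degree(e)

for e in (0.0, 0.05, 0.10, 0.15):
    print(f"e = {e:.2f}: chaos degree = {fuzzy_chaos_degree(e):.2f}, "
          f"next e = {control_action(e):.3f}")
```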
5. Conclusions
We have presented in this paper a new theory of fuzzy chaos for non-linear dynamical systems, which can be applied to behavior identification. We have also presented a new method for controlling non-linear dynamical systems, based on a hybrid fuzzy-chaos approach that achieves the control of a particular dynamical system given its mathematical model.
References
1. Abraham, E. & Firth, W.J. (1984). "Multiparameter Universal Route to Chaos in a Fabry-Perot Resonator", Optical Bistability, Vol. 2, pp. 119-126.
2. Castillo, O. & Melin, P. (1995). "An Intelligent System for the Simulation of Non-Linear Dynamical Economical Systems", Journal of Mathematical Modelling and Simulation in Systems Analysis, Gordon and Breach Publishers, Vol. 18-19, pp. 767-770.
3. Castillo, O. & Melin, P. (1996). "Automated Mathematical Modelling and Simulation of Dynamical Engineering Systems using Artificial Intelligence Techniques", Proceedings CESA'96, Gerf EC Lille, pp. 682-687.
4. Castillo, O. & Melin, P. (1997). "Mathematical Modelling and Simulation of Robotic Dynamic Systems using Fuzzy Logic Techniques and Fractal Theory", Proceedings of IMACS World Congress'97, Wissenschaft & Technik Verlag, Vol. 5, pp. 343-348.
5. Castillo, O. & Melin, P. (1998). "Modelling, Simulation and Behavior Identification of Non-Linear Dynamical Systems with a New Fuzzy-Fractal-Genetic Approach", Proceedings of IPMU'98, EDK Publishers, Vol. 1, pp. 467-474.
6. Castillo, O. & Melin, P. (1999). "A General Method for Automated Simulation of Non-Linear Dynamical Systems using a New Fuzzy-Fractal-Genetic Approach", Proceedings CEC'99, IEEE Press, Vol. 3, pp. 2333-2340.
7. Castillo, O. & Melin, P. (2001). Soft Computing for Control of Non-linear Dynamical Systems, Springer-Verlag, Heidelberg, Germany.
8. Chua, L.O. (1993). "Global unfolding of Chua's circuit", IEICE Transactions Fund., pp. 704-734.
9. Devaney, R. (1989). An Introduction to Chaotic Dynamical Systems, Addison Wesley Publishing.
10. Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley Publishing.
11. Grebogi, C., Ott, E. & Yorke, J.A. (1987). "Chaos, Strange Attractors, and Fractal Basin Boundaries in Nonlinear Dynamics", Science, Vol. 238, pp. 632-637.
12. Jang, J.-S.R., Sun, C.-T. & Mizutani, E. (1997). Neuro-fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice-Hall.
13. Kapitaniak, T. (1996). Controlling Chaos: Theoretical and Practical Methods in Non-Linear Dynamics, Academic Press.
14. Kocarev, L. & Kapitaniak, T. (1995). "On an equivalence of chaotic attractors", Journal of Physics A, Vol. 28, pp. 249-254.
15. Madan, R. (1993). Chua's Circuit: Paradigm for Chaos, World Scientific, Singapore.
16. Mandelbrot, B. (1987). The Fractal Geometry of Nature, W.H. Freeman and Company.
17. Nakamura, S. (1997). Numerical Analysis and Graphic Visualization with MATLAB, Prentice Hall.
18. Ott, E., Grebogi, C. & Yorke, J.A. (1990). "Controlling Chaos", Physical Review Letters, Vol. 64, pp. 1196-1199.
19. Pyragas, K. (1992). "Continuous control of chaos by self-controlling feedback", Physics Letters A, pp. 421-428.
20. Rasband, S.N. (1990). Chaotic Dynamics of Non-Linear Systems, Wiley Interscience.
21. Zadeh, L.A. (1975). "The Concept of a Linguistic Variable and its Application to Approximate Reasoning", Information Sciences, 8, 43-80.
HARDWARE IMPLEMENTATION OF AN IMPROVED SYMMETRY FEATURE POINT EXTRACTION ALGORITHM
D. POPESCU Computer Science Department, University POLITEHNICA of Bucharest, 303 Splaiul Independentei, sec. 6, Bucharest, ROMANIA E-mail: [email protected]
J. ZHANG Computer Science Department, University of Hamburg, 30 Vogt-Koelln Street, 22527 Hamburg, Germany E-mail:
[email protected]
This paper presents a hardware implementation of an improved symmetry feature point extraction algorithm used in a range estimation task for an autonomous mobile robot application. The algorithm that computes the symmetry points is an improved version of the Symmetry Feature Point (SFP) extraction algorithm presented in [1]. With this algorithm the symmetry feature points are obtained directly from the cyclic image acquired by the camera. All experiments are realised with the resources presented in [1] and a hardware development board with a Xilinx Spartan2E FPGA circuit.
1. Introduction
The development of a robust fuzzy expert system presupposes a robust feature extraction phase. This task is realised using an enhanced algorithm based on the detection of the symmetry points in a digital image, as presented in [1]. All the experiments shown in this paper use the resources presented in [4]. For the hardware implementation, a board with a Xilinx Spartan2E FPGA circuit was used.
2. Improved Symmetry Feature Point Extraction Algorithm
Since the symmetry feature point is a global feature, each object can be reduced to only one point. This represents a great advantage in applications like range estimation, because object classification becomes, in fact, a point classification according to the point's properties (position, colour, symmetry values). The symmetry algorithm presented in detail in [1] implies two major steps: horizontal symmetry computation and vertical symmetry computation. Some improvements can be realised, as follows:
(1) For the horizontal symmetry computation, the accumulated sums of Eqs. (3) and (4) can be updated incrementally. Having in view Eq. (3) and Eq. (4), the following recurrence equations can be established:

s_1(k + 1) = C² · (p_{i-k} - p_{i+k})² + s_1(k)    (5)

s_j(k + 1) = C² · (p_{i-k} - p_{i+k})² + s_j(k)    (6)
(2) For the vertical symmetry computation, the recurrence equations are established in a similar manner.
From the experiments it was observed that for the original SFP algorithm the processor time necessary to compute all symmetry feature points for one figure is 50,480,000 cycles. In the case of the improved SFP algorithm (using
Eq. (5) and Eq. (6)), the processor time is dramatically reduced to 330,000 cycles. The software program ran on a computer with a Pentium IV processor and 512 MB of RAM. Figure 1(b) presents an example of a cyclic image acquired by the camera, while Figure 1(a) introduces the notations needed to understand how the symmetry points are obtained. The improved SFP algorithm is:
Figure 1. (a) Useful notations, (b) a cyclic test image.
(1) Start by computing the vertical symmetry for all pixels which belong to the radius VS that ends in the point P₀.
(2) Compute the local maxima over all pixels and mark all the found pixels white.
(3) Rotate the VS radius by an angle α, in the trigonometric sense.
(4) The new pixel coordinates are obtained with the following equations:

x_i = x_i^ant · cos(α) + y_i^ant · sin(α)    (7)

y_i = y_i^ant · cos(α) - x_i^ant · sin(α)    (8)

(5) Repeat the above steps until VS reaches S₁. It is already known that if we can determine part of a circle, we can generate the whole circle. This fact is very helpful, because it is necessary to compute the pixel coordinates only for S₁.
(6) Establish a circle with the radius OS.
(7) Test whether all P₀ pixels which belong to the circle established in step 6 are vertical symmetry points, or whether they have vertical symmetry neighbours.
(8) If the test fails, mark all these pixels black, decrease the radius, and repeat steps 7 and 8.
Every computed symmetry point p has an 8-dimensional feature vector, as in [1].
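The speed-up reported above comes from the recurrences (5)-(6): the symmetry value at radius k+1 reuses the accumulated sum at radius k instead of recomputing the whole window. A sketch of this incremental update, assuming a 1-D pixel row p and a normalization constant C (both assumptions, since the full equations (1)-(4) are not shown here):

```python
import numpy as np

def symmetry_profile(p: np.ndarray, i: int, K: int, C: float = 1.0) -> np.ndarray:
    """Accumulated symmetry values s(k) around centre pixel i for k = 1..K,
    using the recurrence s(k+1) = C^2 * (p[i-k] - p[i+k])^2 + s(k)."""
    s = np.zeros(K + 1)
    for k in range(1, K + 1):
        s[k] = C**2 * float(p[i - k] - p[i + k])**2 + s[k - 1]
    return s[1:]                  # low values indicate strong symmetry about i

row = np.array([3, 7, 9, 7, 3, 1, 0], dtype=float)
print(symmetry_profile(row, i=2, K=2))   # pixel 2 is a symmetry centre -> zeros
```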
3. Hardware Implementation
The hardware implementation is realised using a DIGILAB2E board, which has an FPGA circuit (SPARTAN2E PQ208) and a clock frequency fixed at 50 MHz. In view of the huge amount of data computation, some memory modules need to be added to the initial configuration of the board. The data that come from the Nikon camera require implementing the usual arithmetic operations (addition, difference, multiplication and division) in floating point, as in [2] and [3]. The improved SFP architecture was built using parallel and pipeline processing techniques. The total computation time for the whole improved SFP algorithm in the worst case is 3,688,257 clock cycles. Given that the board clock has a frequency of 50 MHz, the obtained time for running the improved SFP algorithm is 73,765,140 ns, which means approximately 74 ms. In the best case, when the pipeline is full, the computation time is 2072 µs. So we can conclude that this implementation works in real time. In order to improve the total computation time, it is necessary to increase the clock frequency and to choose memory modules with a very short access time. The improved SFP algorithm presented in this paper includes all the improvements that can be made to the classical SFP algorithm.
References
1. D. Popescu, J. Zhang and K. Huebner, Real-time intelligent vision sensor for robot navigation using symmetry features, Computational Intelligent Systems for Applied Research, Proceedings of the 5th International FLINS Conference, 421 (2002).
2. M.M. Mano, Computer System Architecture, Prentice-Hall, 303 (2000).
3. D. Popescu, Pipeline Implementation of the Floating Point Operations using FPGA circuits, Technical Report No. 2104-03, University POLITEHNICA of Bucharest (2003).
4. D. Popescu and J. Zhang, Fuzzy Expert System based on symmetry features for range estimations, 6th International Conference on Climbing and Walking Robots, 1007 (2003).
DESIGN OF A FUZZY MODEL-BASED CONTROLLER FOR A DRUM BOILER-TURBINE SYSTEM
AHCENE HABBI, MIMOUN ZELMAT Laboratoire d'Automatique Appliquée, F.H.C., Université de Boumerdès, 35000 Boumerdès, Algeria Phone/Fax: +213-24-816905 E-mail: [email protected]
This paper addresses the design of a fuzzy control system for a drum boiler-turbine-generator unit. In the design procedure, a dynamic fuzzy augmented system is suggested for the nonlinear boiler-turbine plant to deal with its non-minimum phase behavior. The fuzzy control system is synthesized from a local-concept viewpoint on the basis of optimal control theory. The good performance of the designed fuzzy control system is shown by simulations under various conditions.
1. Introduction
Due to the deregulation of the energy market, power companies are under ever-increasing pressure to improve the efficiency of their industrial equipment. For instance, a combined cycle power plant used for cogeneration of electric and thermal energy may need to provide a large amount of steam on demand while at the same time maintaining balance in power generation. The drum boiler-turbine-generator (BTG) unit is a critical part of the power plant. It is difficult to change production configurations in a BTG unit because of the resulting major disturbances in the energy and material balance. When the process loses this balance, it becomes much more difficult to control due to the changes in the process dynamics. Because of the complicated dynamics of the boiler-turbine system, many modeling efforts have been made, and models that are suitable for control design have been investigated in many papers [1-4]. In our recently published paper [3], we developed a dynamic fuzzy model for a 160 MW drum boiler-turbine system. We demonstrated that the proposed fuzzy model captures well the key dynamical properties of the physical plant over a wide operating range and is suitable for model-based control design. The control of the boiler-turbine system is still of substantial interest. One of the key difficulties is the control of the water level in the drum boiler: water level dynamics are non-minimum phase because of the shrink-and-swell effect, and vary greatly with the load. The boiler-turbine control system, which controls the electrical output, the drum steam pressure and the drum water level, is necessary for stable load following, the safety of the power plant and fuel saving. Without good control of these variables, flexible production of energy to meet demands will be difficult.
Figure 1. Schematic of the drum boiler-turbine system.
On the issue of singularly perturbed model based control, Patre et al. developed, in [ 8 ] , a periodic output feedback controller for a steam power plant. However, the proposed controller does not have much practical implications since the design procedure is achieved on the basis of simplified dynamical equations of the plant. In [5], Kwon et a/. investigated the use of the robust control theory for designing a multivariable LQG controller for the boilerturbine system. The basic limitation is that for the nonlinear plant like the boilerturbine system, the control scheme is very complex since the gain-scheduling techniques are used. The goal of this work is to propose a scheme for designing a fuzzy model based controller for the nonlinear boiler-turbine system. Figure 1 shows the principal components of the steam power plant. First, a fuzzy augmented system is suggested to deal with the non-minimum phase behavior of the plant and to meet a desired loop shape. Then, quadratic optimization problem is solved for each local fuzzy augmented system and a global fuzzy controller is deduced for the global fuzzy system. Finally, a fuzzy estimator is built upon classical estimation theory using a local concept approach. 2.
2. Drum Boiler-turbine System Representation
2.1. The Boiler-turbine Dynamic Fuzzy Model
The dynamic fuzzy model of the nonlinear plant is represented by a set of IF-THEN logical rules as follows:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
  δx'(t) = A_i δx(t) + B_i δu(t)
  δy(t) = C_i δx(t) + D_i δu(t),   i = 1, ..., r.   (1)

where F^i and M^i are fuzzy term sets of the i-th plant rule associated with the variation of the drum steam pressure δx_1(t) and the variation of the density of fluid in the
system, δx_2(t), respectively. These state variables are defined as fuzzy variables in the fuzzy system. The local state-space parameters (A_i, B_i, C_i, D_i) in the consequent part describe the local dynamics of the physical plant at a specified operating point (x_i, u_i).
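For illustration, the rule blending of Eq. (1) can be sketched in a few lines; the matrices, membership functions and operating points below are hypothetical placeholders, not the identified parameters of the 160 MW unit from [3].

```python
import numpy as np

# Local linear models (r = 2) of Eq. (1); the numbers are illustrative
# placeholders, not the identified 160 MW plant parameters of [3].
A = [np.diag([-0.10, -0.20, -0.05]), np.diag([-0.15, -0.25, -0.08])]
B = [0.10 * np.eye(3), 0.12 * np.eye(3)]

def firing_strengths(dx1, dx2):
    """Normalized activations mu_i of the premise term sets F^i, M^i
    (bell-shaped term sets are assumed here purely for illustration)."""
    w = np.array([np.exp(-dx1**2 - dx2**2),
                  np.exp(-(dx1 - 1.0)**2 - (dx2 - 1.0)**2)])
    return w / w.sum()

def fuzzy_dynamics(dx, du):
    """Blended state derivative: sum_i mu_i (A_i dx + B_i du)."""
    mu = firing_strengths(dx[0], dx[1])
    return sum(m * (Ai @ dx + Bi @ du) for m, Ai, Bi in zip(mu, A, B))

print(fuzzy_dynamics(np.array([0.5, 0.1, 0.0]), np.zeros(3)))
```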
2.2. The Fuzzy Augmented System

The redistribution of steam and water in the system causes the shrink-and-swell effect, which in turn causes the non-minimum phase behavior of the water level dynamics. In order to deal with these inherent properties, we suggest introducing dynamics augmentation for each fuzzy subsystem. It plays the role of making the singular values of the augmented subsystem as close as possible over a specified frequency range. The fuzzy augmented system can be expressed by the following IF-THEN rules:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
  x_a'(t) = A_ai x_a(t) + B_ai u_a(t)
  δy(t) = C_ai x_a(t),   i = 1, ..., r.   (2)

where the augmented state collects the plant state and input,

  x_a = [δx; δu],  A_ai = [A_i B_i; 0 0],  B_ai = [0; H],  C_ai = [C_i D_i],

and u_a satisfies the following equation:

  δu'(t) = H u_a(t).   (3)
where H is obtained by using the pseudo-diagonalization method and a column-scaling diagonal matrix [9]. The introduction of this constant matrix into the state equations serves to achieve weak interactions between the system variables and to force the local dynamics to meet the desired specifications in the control design procedure.
3. Drum Boiler-turbine Fuzzy Control System Design

The boiler-turbine control system is required to have good command tracking, good disturbance rejection and robustness to parameter variations. In order to achieve these objectives, we use the LQG method with the concept of multivariable loop shaping. The design procedure is carried out from a local viewpoint, i.e. each fuzzy subsystem is forced to behave according to the desired specifications in the designed local control loop. Since each local fuzzy system is linear, its quadratic optimization problem is the same as the general
linear quadratic problem [10]. Therefore, solving the optimal control problem for the fuzzy augmented system of Eq. (2) gives the following fuzzy control law:

  u_a(t) = − Σ_{i=1..r} μ_i R⁻¹ B_ai^T P_i x_a(t),   (4)

where μ_i denotes the normalized firing strength of the i-th rule, and P_i is the symmetric positive definite solution to the control algebraic Riccati equation:

  A_ai^T P_i + P_i A_ai + Q − P_i B_ai R⁻¹ B_ai^T P_i = 0,   i = 1, ..., r.   (5)
where Q and R are weighting matrices chosen via several simulations to satisfy the design specifications within the limitations on the control inputs. The resulting optimal feedback fuzzy system with augmented dynamics is described by:

  x_a'(t) = Σ_{i=1..r} Σ_{j=1..r} μ_i μ_j [A_ai − B_ai R⁻¹ B_aj^T P_j] x_a(t).   (6)
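Under the stated assumptions (local augmented matrices A_ai, B_ai and weights Q, R given), the per-rule synthesis of Eqs. (4)-(5) reduces to one continuous-time algebraic Riccati equation per rule; a minimal SciPy sketch is:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def local_lqr_gains(Aa, Ba, Q, R):
    """Solve the control Riccati equation of Eq. (5) for each local
    augmented model (A_ai, B_ai) and return the local gains
    K_i = R^-1 B_ai^T P_i used in the control law of Eq. (4)."""
    gains = []
    for Aai, Bai in zip(Aa, Ba):
        Pi = solve_continuous_are(Aai, Bai, Q, R)
        gains.append(np.linalg.solve(R, Bai.T @ Pi))
    return gains

def fuzzy_control(xa, mu, gains):
    """Blended control law of Eq. (4): u_a = -sum_i mu_i K_i x_a."""
    return -sum(m * (K @ xa) for m, K in zip(mu, gains))
```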
In practice, it is necessary to design a fuzzy estimator in order to implement the fuzzy controller given by Eq. (4). The idea is that for each local dynamics, an estimator gain is determined on the basis of the loop transfer recovery (LTR) method. The resulting global estimator is a 'fuzzy blending' of the local estimators and can be expressed by a set of IF-THEN logical rules as follows:

Rule i: IF δx_1(t) is F^i and δx_2(t) is M^i THEN
  x̂_a'(t) = A_ai x̂_a(t) + B_ai u_a(t) + G_i [δy(t) − δŷ(t)]
  δŷ(t) = C_ai x̂_a(t),   i = 1, ..., r.   (7)
where G_i (i = 1, ..., r) are the estimator gain matrices. For the determination of the estimation gains, covariance matrices Q_f and R_f are chosen so that loop transfer recovery can be approximately achieved for each local dynamics. For the obtained covariance matrices, the following estimator algebraic Riccati equation is solved for the positive definite matrix Γ_i:

  Γ_i A_ai^T + A_ai Γ_i + Q_f − Γ_i C_ai^T R_f⁻¹ C_ai Γ_i = 0,   i = 1, ..., r.   (8)

and the estimator gains are given by:

  G_i = Γ_i C_ai^T R_f⁻¹,   i = 1, ..., r.   (9)
As can be noticed, the fuzzy controller and the fuzzy estimator are designed independently, in accordance with the separation property [6].
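The filter Riccati equation (8) can be solved with the same CARE routine by duality; this is an implementation choice of the sketch below, which again assumes the local matrices (A_ai, C_ai) and the covariances Q_f, R_f are given.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def local_estimator_gains(Aa, Ca, Qf, Rf):
    """Solve the estimator Riccati equation of Eq. (8) for each local
    model by duality (CARE on (A_ai^T, C_ai^T)) and return the gains
    G_i = Gamma_i C_ai^T R_f^-1 of Eq. (9)."""
    gains = []
    for Aai, Cai in zip(Aa, Ca):
        Gamma = solve_continuous_are(Aai.T, Cai.T, Qf, Rf)
        gains.append(Gamma @ Cai.T @ np.linalg.inv(Rf))
    return gains
```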
Figure 2. Disturbance rejection with the designed fuzzy control system: (a) input disturbance rejection, (b) output disturbance rejection. (The panels plot steam pressure (kg/cm²), electrical output (MW) and drum water level (m) against time (s).)
4. Simulation Results

To validate the design objectives, simulations were performed under various situations. In the simulations, we consider the rejection of disturbances at the input and output variables of the boiler-turbine system, and the tracking properties of the fuzzy control system. Figure 2(a) shows the responses of the boiler-turbine output variables to a step increase in the fuel flow rate. The effect of this input disturbance is small and vanishes for t > 50 s. In addition, the output disturbance is well attenuated and its effect becomes insignificant after 100 s, as shown in Figure 2(b). In all cases, the electrical output reaches its nominal state most rapidly and the drum water level most slowly. The ability of the designed fuzzy control system to track reference commands is confirmed in Figure 3: the boiler-turbine output variables follow the reference input well and the steady-state error goes to zero in all cases.
Figure 3. Tracking commands with the designed fuzzy control system. (The panels plot electrical output (MW), steam pressure (kg/cm²) and water level (m) against time (s).)

5. Conclusion

In this paper, a fuzzy control system with a fuzzy controller and a fuzzy estimator is designed for a fossil-fuelled nonlinear boiler-turbine system. The design procedure is carried out from a local viewpoint using a nonlinear dynamic fuzzy model. A fuzzy augmented system is suggested to deal with the non-minimum phase behavior of the nonlinear plant. Using the local concept approach, a nonlinear fuzzy control law is derived to control the multivariable boiler-turbine system and a nonlinear fuzzy estimator is designed to implement the fuzzy controller. Simulation results show that the synthesis methodology, which designs the fuzzy controller and the fuzzy estimator independently under the separation property, is very effective for controlling the strongly interacting variables of the nonlinear boiler-turbine system.

References
1. Åström K. J. and Bell R. D. Drum-boiler dynamics. Automatica, 36, (2000).
2. Bell R. D. and Åström K. J. Dynamic models for boiler-turbine-alternator units. Report of Lund Institute of Technology, (1987).
3. Habbi A. and Zelmat M. A dynamic fuzzy model for a drum boiler-turbine system. Automatica, 39(7), (2003).
4. Flynn M. E. and O'Malley M. J. A drum boiler model for long term power system dynamic simulation. IEEE Trans. Power Syst., 14(1), (1999).
5. Kwon W. H., Kim S. W. and Park P. G. On the multivariable robust control of a boiler-turbine system. Proc. IFAC Pow. Syst. Pow. Plant Cont., (1989).
6. Ma X. J., Sun Z. Q. and He Y. Y. Analysis and design of fuzzy controller and fuzzy observer. IEEE Trans. Fuzz. Syst., 6(1), (1998).
7. Mayne D. Q., Rawlings J. B., Rao C. V. and Scokaert P. O. M. Constrained model predictive control. Automatica, 36(6), (2000).
8. Patre B. M., Bandyopadhyay B. and Werner H. Periodic output feedback control for singularly perturbed discrete model of steam power system. IEE Proc. Cont. Theo. Applic., 146(3), (1999).
9. Rosenbrock H. Computer-aided Control System Design. Academic Press, (1974).
10. Wu S. J. and Lin C. T. Optimal fuzzy controller design: Local concept approach. IEEE Trans. Fuzz. Syst., 8(2), (2000).
UNIVERSAL TAKAGI-SUGENO FUZZY CONTROLLER CORE IMPLEMENTED IN A PLD DEVICE
D. OSELI, M. MRAZ, N. ZIMIC
Laboratory for Computer Structures and Systems, Faculty of Computer and Information Science, University of Ljubljana, Trzaska 25, SI-1000 Ljubljana, SLOVENIA
E-mail:
[email protected]

Fuzzy logic controllers are nowadays mostly implemented as software code and run on conventional microprocessors. If there is a need for high-speed processing, the controller must be implemented in hardware. One of the solutions is implementing a fuzzy logic controller in a programmable logic device. Taking into consideration some initial limitations, a universal fuzzy controller core can be constructed. Such a controller can be quickly adapted to various system transfer functions, even while the controller is operating. This paper outlines some important design issues that we came across while constructing such fuzzy controller cores.
1. Introduction

In the last decade enormous theoretical as well as practical progress has been made in the field of fuzzy logic. One of the highlights is the fuzzy controller [1], [6], [10], [11]. Such controllers are very robust, capable of approximating most system transfer functions [7], [8], [9], and are simple to develop. The most common method of implementation of the fuzzy controller is a software-coded controller running on a general-purpose microprocessor. Implementation of fuzzy controllers in target systems where very short controller response times are required (less than 1 μs) is more complicated. In such cases the use of a general-purpose microprocessor which executes the fuzzy controller code does not solve the problem, as a large amount of code needs to be processed every controller cycle. Due to the outlined problems, special hardware solutions for fuzzy processing are required. Methods of fuzzy processing can be categorized as follows:
(1) General purpose microprocessors [1], (2) Microprocessors with additional fuzzy instructions [2],
(3) Fuzzy co-processors [3], (4) Programmable logic implementations [4], [5], [13], (5) ROM-based implementations.

For real-time and short-response-time systems or applications, only hardware solutions such as programmable logic devices or ROM-based implementations can work satisfactorily. This paper outlines the hardware implementation of the fuzzy logic controller in a programmable logic device. The main characteristic of such an implementation is the highly parallel processing of information in the programmable logic device. This is the opposite of conventional micro- or co-processor based implementations, where code is processed sequentially over many system clock cycles. A review of practical implementations and published papers in this area shows that the most common approach to the hardware implementation of the fuzzy controller is the use of pre-calculated data. Computationally demanding processing (e.g. the degree of membership of an input variable) is pre-calculated in advance and accessed on a look-up basis during operation of the controller. In such cases only the computationally less demanding processing is implemented in the programmable logic device as active logic, with the pre-calculated data stored in memory.

2. Towards universal fuzzy controller core

The architecture of a hardware implementation of a universal fuzzy controller core must be designed in a way that allows very fast processing of various controller configurations. The drawback of the universality is that the footprint of such a controller is larger: more logic gates are required, and more system clock cycles are needed for a single controller cycle. The benefit is that the fuzzy controller is capable of handling a variety of different configurations. To implement a universal fuzzy controller that could be configured for any type of application, thus operating with any translating function, one would need a nearly infinitely large PLD device. Thus some initial limitations are required. The first one is the type of the fuzzy controller. We have chosen the Takagi-Sugeno type [6]. Its main advantage over the Mamdani type is the defuzzification method. The weighted average equation (Eq. (1)) can be efficiently implemented in hardware, while Mamdani methods like COG are far more complex. Another advantage is the ability to optimize the translating function of the fuzzy controller using the ANFIS tool [12]. ANFIS is available in the Fuzzy Logic package of the Matlab tool. Takagi-Sugeno fuzzy controllers are also proven to approximate any system transfer function, as
presented in [7], [8] and [9].

Eq. (1) presents the general form of the weighted average equation in which all n fuzzy rules are evaluated. The weights of the output linear functions are the minimums of the degrees of membership, the wedge operator denoting the min function:

  y = Σ_{k=1..n} (μ_k1 ∧ μ_k2 ∧ ... ∧ μ_km) y_k / Σ_{k=1..n} (μ_k1 ∧ μ_k2 ∧ ... ∧ μ_km)   (1)

Refer to [6] for a detailed explanation of Eq. (1). Further, we have chosen three trapezoidal membership functions per input and decided on two-input, single-output fuzzy controllers. Trapezoidal membership functions are best suited for hardware implementation due to their linearity. Such a two-input fuzzy controller with three membership functions per input can have up to nine fuzzy rules if a full set of rules is used, where each combination of active membership functions defines a new rule. The controller complexity was further simplified by using fixed positions of the membership functions. In our prototype we decided to implement equally spaced membership functions with overlapping of only neighbouring membership functions. In such cases, the sum of the degrees of membership of an input variable to the input membership functions is always 1 (see Figure 1).
Figure 1. Fuzzification and defuzzification example with three membership functions (Small, Medium, Large), three rules and three output linear functions.
Figure 1 presents a fuzzification and defuzzification example of a single-input, single-output fuzzy controller. The input value 89 is first fuzzified; as a result we get three degrees of membership (one is 0, as the membership function Large is not activated). Further, we have three examples of linear output equations y1, y2 and y3. Combining the input value and the degrees of membership, the controller output value is calculated.
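The arithmetic of this example can be sketched as follows; the 0-255 input range and the membership-function centres are assumptions chosen to match an 8-bit input, not values given in the paper.

```python
def fuzzify(x, centers=(0, 128, 255)):
    """Degrees of membership of input x to three equally spaced
    membership functions (Small, Medium, Large) whose neighbours
    overlap so that the degrees always sum to 1.  The 0..255 range
    is an assumption for illustration, not taken from the paper."""
    small, medium, large = centers
    if x <= medium:
        m = (x - small) / (medium - small)
        return (1.0 - m, m, 0.0)
    m = (x - medium) / (large - medium)
    return (0.0, 1.0 - m, m)

print(fuzzify(89))   # ~ (0.30, 0.70, 0.0): Small and Medium active, Large = 0
```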
3. Using ANFIS for controller development

There are two possible ways to construct a fuzzy controller: manually, or by automatic construction of an optimized controller using the ANFIS tool [12]. Manual construction of the fuzzy controller is suitable for systems whose translating function is not known precisely. Controllers for such systems are constructed on the basis of the desired controller behaviour and the developer's knowledge. For systems where the translating function is known in precise or at least approximate form, the ANFIS tool can do all the hard work for the developer. In our case, the translating functions of the tested controllers were known, so the controllers were optimized using the ANFIS tool. The optimization procedure is the following: (1) Create an initial empty FIS (Fuzzy Inference System - the Matlab variable that completely defines the fuzzy controller) structure, (2) Set the desired number of inputs and outputs and the number, type and position of the membership functions; create a full set of fuzzy rules, (3) Create a file with pairs of input values and desired controller output values; use as many as possible, (4) Run the ANFIS tool on the FIS structure with the desired maximum error, until the error is low enough.
The result of such optimization is a FIS structure with the parameters of the output linear equations that define the behaviour of the controller. For a two-input controller with three membership functions per input and a full set of fuzzy rules, there are nine output linear equations of the form of Eq. (2):

  y_i = p_0^i + p_1^i·x_1 + p_2^i·x_2   (2)

Each Eq. (2) has three variable parameters p_0^i, p_1^i and p_2^i. For the described controller there is a total of 27 parameters. These are the parameters that completely define the behaviour of the controller. The programming of the fuzzy controller core thus consists only of transferring these parameters to the controller. The controller is then ready to operate with the desired transfer function.
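A compact functional model of the resulting controller is sketched below; the parameter array p stands in for the 27 ANFIS-optimized values, and fuzzify is assumed to return the three membership degrees of an input, as in the previous sketch.

```python
import numpy as np

def ts_controller(x1, x2, p, fuzzify):
    """Two-input Takagi-Sugeno controller with three membership functions
    per input and a full set of 9 rules.  p holds the 27 parameters
    (9 rows of p0, p1, p2) of the output linear equations of Eq. (2);
    rule weights are minimums of membership degrees, as in Eq. (1)."""
    mu1, mu2 = fuzzify(x1), fuzzify(x2)
    num = den = 0.0
    for i in range(3):
        for j in range(3):
            w = min(mu1[i], mu2[j])                     # rule firing strength
            y = p[3 * i + j] @ np.array([1.0, x1, x2])  # p0 + p1*x1 + p2*x2
            num += w * y
            den += w
    return num / den                                    # weighted average, Eq. (1)
```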
4. VHDL model and results
The VHDL model of the presented fuzzy controller consists of two processes. One process is responsible for uploading the controller with the variable parameters. The other is the actual fuzzy controller process, responsible for fuzzification, inference and defuzzification. This process is constructed as a state machine with four states. So far we have managed to optimize the code so that only 11 system clock cycles are required for a single controller cycle. As mentioned, the only variable parameters of the fuzzy controller are those of the linear output equations. For the presented controller there is a total of 27 parameters. Compared to any other implementation of the fuzzy controller, this is the smallest amount of data required to completely configure the controller. Another advantage of the presented concept is the reconfigurability of the controller while it is running: it is possible to upload new parameters to the controller without stopping it. This is suitable for systems that require constant control.
Figure 2. Typical error distribution of a two-input hardware-based fuzzy controller.
The target programmable logic device for our prototype was a Cypress Delta device with 100k equivalent logic gates. Approximately 80% of all logic gates were used. The system clock frequency was 11 MHz, so we measured 1 μs intervals between new controller output values. Due to the relatively small device size, some additional design restrictions appeared (e.g. predefined positions of the input membership functions). The fuzzy controller core was tested with several different fuzzy controller configurations.
5. Conclusion

As presented in Figure 2, the typical error of the 8-bit architecture of the hardware-based fuzzy controller is up to 3%. We must point out that this error is generated by the integer arithmetic in the PLD device; using 12- or 16-bit arithmetic would considerably improve the precision of the controller. In our future research work, we intend to use larger programmable logic devices, where more complex fuzzy controllers could be implemented. We see the future of such fuzzy controller cores in hybrid, single-chip devices, where a conventional microprocessor core and a fuzzy controller core are integrated in parallel. Some aspects of such a design have already appeared, as presented in [14].
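The effect of the arithmetic width can be illustrated with a toy model that quantizes the operands of the weighted average to a given number of bits. This simplified model ignores the truncation of intermediate products inside the PLD, so it understates the reported 3%, but it shows the trend that wider arithmetic shrinks the error.

```python
import numpy as np

rng = np.random.default_rng(0)

def quant(v, bits):
    """Round values in [0, 1] onto a 2**bits - 1 integer grid."""
    scale = 2**bits - 1
    return np.round(v * scale) / scale

def wavg(w, y):
    """Weighted-average defuzzification of Eq. (1)."""
    return (w * y).sum() / w.sum()

for bits in (8, 12, 16):
    errs = [abs(wavg(quant(w, bits), quant(y, bits)) - wavg(w, y))
            for w, y in (rng.random((2, 9)) for _ in range(2000))]
    print(f"{bits}-bit operands: max weighted-average error "
          f"~ {100 * max(errs):.3f}% of full scale")
```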
References
1. M. Mraz, Mathematics and Computers in Simulation, Elsevier, No. 56, 259 (2001).
2. V. Salapura, IEEE Transactions on Fuzzy Systems, Vol. 8, No. 6, 781 (2000).
3. H. Eichfeld, T. Kunemund, M. Menke, IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 460 (1996).
4. M. A. Manzoul, D. Jayabharathi, IEEE Transactions on Systems, Man and Cybernetics, Vol. 25, No. 1, 213 (1995).
5. H. Surmann, A. Ungering, K. Goser, International Workshop on Field-Programmable Logic and Applications, Austria, 124 (1992).
6. T. Takagi, M. Sugeno, IEEE Transactions on Systems, Man and Cybernetics, Vol. 15, No. 1, 116 (1985).
7. J. L. Castro, IEEE Transactions on Systems, Man and Cybernetics, Vol. 25, No. 4, 629 (1995).
8. H. Ying, IEEE Transactions on Fuzzy Systems, Vol. 6, No. 4, 582 (1998).
9. H. O. Wang, J. Li, D. Niemann, K. Tanaka, Proceedings 9th IEEE Int. Conf. on Fuzzy Systems, (2000).
10. R. Babuska, H. B. Verbruggen, Control Engineering Practice, Elsevier, Vol. 4, No. 11, 1593 (1996).
11. M. J. Patyra, J. L. Grantner, K. Koster, IEEE Transactions on Fuzzy Systems, Vol. 4, No. 4, 439 (1996).
12. J.-S. R. Jang, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 23, No. 3, 665 (1993).
13. E. Lago, M. A. Hinojosa, C. J. Jimenez, A. Barriga, S. S. Solano, XII Conference on Design of Circuits and Integrated Systems DCIS 97, Spain, 715 (1997).
14. D. Andrews, D. Niehaus, P. Ashenden, IEEE Computer, Vol. 37, No. 1, 118 (2004).
PARALLEL PIPELINE FLOATING-POINT FUZZY PROCESSOR
N. POPESCU
Computer Science Department, University POLITEHNICA of Bucharest, 303 Splaiul Independentei, sec. 6, Bucharest, ROMANIA
E-mail: [email protected]

J. ZHANG
Computer Science Department, University of Hamburg, 30 Vogt-Koelln Street, 22527 Hamburg, Germany
E-mail:
[email protected]

This paper presents a hardware implementation of a Parallel Pipeline Floating-Point Fuzzy Processor (PPFF) which uses Gaussian membership functions to represent the membership functions. The processor inputs are the parameters of a Sugeno fuzzy system. These parameters are developed in an off-line phase and are assumed to be known. The processor inputs are represented in single-precision floating-point and all operations are realised in pipeline. The hardware implementation is realised using a H.O.T. II Development System.
1. Introduction

Fuzzy logic has a great advantage: any real system can be defined and modelled by it, even if the degree of uncertainty is large. All previously designed fuzzy processors use trapezoidal membership functions, in contrast to the implementation presented in the next sections, which uses Gaussian membership functions. Additionally, in all fuzzy processors designed until now [1], [2], the number of membership functions and the number of inputs are kept low. These limitations are imposed by the fact that the number of rules in a fuzzy system is given by the number of fuzzy sets to the power of the number of input variables. If the number of fuzzy sets and the number of inputs are higher, then the resulting number of fuzzy rules which must be generated by the hardware device becomes too large.
In order to avoid this limitation, the developed PPFF processor takes as inputs the parameters of a Sugeno fuzzy system. The parameters are determined in an off-line phase, or are taken from a hardware device which implements a Sugeno fuzzy system determination method as in [4]. These parameters can be very small (e.g. 1.30e-12) and, in order to work with them, a single-precision floating-point representation is absolutely necessary. Another limitation of the classical fuzzy processor is that the inputs are considered to be integer. This is a limitation in robotic applications, where it is necessary to work with floating-point numbers in order to obtain good results. To eliminate this limitation, the PPFF processor can work with data represented in single-precision floating-point.
2. The Main Features of a PPFF Processor

The experiments take into consideration a Sugeno fuzzy system with 8 inputs, 21 fuzzy rules in the rule base and 21 membership functions for each input. The main features of the PPFF processor are summarised below:
- the number of inputs is at least 7;
- each input is given in a signed representation using 32 bits;
- each output has a signed representation using 32 bits;
- 21 Gaussian membership functions are used for each of the input variable fuzzy sets;
- 21 crisp membership functions called O_i are described for the output variable Y;
- the overlapping of the membership functions is not restricted;
- the defuzzification method is zero-order Sugeno;
- the T-norm conjunction is implemented by a minimum.
In this implementation, the processing rate is highly dependent on the fuzzy expert system. This dependency implies the use of a memory module which is large enough for future applications. Bearing in mind that the expert system cannot work in natural environments, statistically speaking, the maximum number of fuzzy rules can be considered to be 100. All the possible fuzzy rules are stored in this module, called the "Fuzzy Rule Memory (FRM)". Taking into account that every membership function has a Gaussian form, all the fuzzy rules are active. For this reason it is not necessary to develop a module to establish which fuzzy rules are active. The hardware implementation is realised for a H.O.T. II Development System in which the clock frequency has a maximum value of 100 MHz. This value for the clock frequency implies a clock period of 10 ns. The PPFF processor has been divided into several pipeline stages.

3. Fuzzification Process
The chosen shape for every membership function is the Gaussian one. The simplest description of the Gaussian shape is given in Eq. (1):

  μ(x) = [1 + ((x − a)/b)²]⁻¹   (1)

In this equation, it is important to specify that:
- x represents the crisp input to the PPFF processor;
- a and b represent the Gaussian parameters. These parameters are used to describe any Gaussian membership function and they are generated in the off-line phase. They are unchangeable and they are stored in a special memory module.
A total of 294 Gaussian parameters will be stored (21 membership functions × 7 inputs × 2 parameters). A floating-point representation is used for every Gaussian parameter.
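Eq. (1) transcribes directly into code; the parameter values below are invented for the demonstration.

```python
def gaussian_mf(x, a, b):
    """Membership degree of Eq. (1): mu(x) = 1 / (1 + ((x - a) / b)**2).
    a is the centre and b the width; both come from the off-line phase.
    (This is the bell-shaped curve the paper calls Gaussian.)"""
    return 1.0 / (1.0 + ((x - a) / b) ** 2)

# 21 membership functions x 7 inputs x 2 parameters = 294 stored values.
print(gaussian_mf(0.3, a=0.0, b=0.5))   # degree of truth for one input
```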
4. Inference Process

The inference process implies the determination of variables such as the degree of truth and the firing level. The intersection between the value of the crisp input and the fuzzy set of the linguistic term from the premise is denoted by μ and represents the degree of matching, usually called the "degree of truth". The determination of the μ value is an important step, this value representing the grade of membership of an input variable to a given fuzzy set. The μ value is always between 0 and 1. To determine it, it is enough to substitute the x variable in Eq. (1) with a crisp input of the PPFF processor; the values of the Gaussian parameters are of course known a priori. The process of finding the μ(x) value must be carried out for every input. After all the μ values are computed, another variable, called the firing level and denoted by α, is computed. The value of α is the minimum over all the μ values. For every fuzzy rule from the
fuzzy rule database, an α value is obtained. The total number of α values is 21 and, together with the crisp output values, they are involved in the defuzzification process. The crisp value of the output is denoted z_i.

5. Defuzzification Process
The mathematical equation which defines the zero-order Sugeno defuzzification method is presented in Eq. (2):

  z_0 = Σ_i α_i z_i / Σ_i α_i   (2)

The defuzzifier block (zero-order Sugeno) performs a floating-point multiplication in the first step, two floating-point additions in the second step and a floating-point division in the third step. Once all the rules have been processed, the data stored in the two adders of the defuzzifier go to the divider circuit to compute the crisp output value. This floating-point division is computed in parallel with the pipeline stages while the system begins to process a new data set. The floating-point division operations are realised in conformity with [3]. The data flow through the PPFF processor is presented in Figure 1.
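Inference and defuzzification together amount to a few vectorized operations; a sketch with invented parameter values, using the 8-input, 21-rule configuration of the experiments:

```python
import numpy as np

def ppff_output(x, a, b, z):
    """One PPFF inference/defuzzification pass: for each of the rules,
    the firing level alpha is the minimum of the rule's membership
    degrees over all inputs (T-norm = min), and the crisp output is the
    zero-order Sugeno average z0 = sum(alpha_i z_i) / sum(alpha_i), Eq. (2).
    x: (n_inputs,); a, b: (n_rules, n_inputs) Gaussian parameters;
    z: (n_rules,) crisp rule outputs."""
    mu = 1.0 / (1.0 + ((x - a) / b) ** 2)   # degrees of truth, Eq. (1)
    alpha = mu.min(axis=1)                  # firing levels
    return (alpha * z).sum() / alpha.sum()

rng = np.random.default_rng(1)
x = rng.random(8)                           # 8 inputs as in the experiments
a, b = rng.random((21, 8)), 0.5 + rng.random((21, 8))
z = rng.random(21)
print(ppff_output(x, a, b, z))
```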
6. Performance of the PPFF Processor

The performance of the PPFF processor is analysed in view of the computational time required for all the pipeline stages. All the arithmetic operations are carried out in floating point in conformity with [3]. The computational time is denoted by CT and, considering the above specifications, the pipeline stages and the computation of the CT are given below:
(1) In this stage the computation of the μ value starts. The first operation is a floating-point difference between the crisp input and the first Gaussian parameter, denoted a in Eq. (1). The PPFF processor has seven inputs, and all these inputs start to be processed in parallel, as in Figure 1. In this stage CT = 40 ns. (2) The second stage realises a floating-point division between the result obtained in the first stage and the second Gaussian
parameter, denoted b. This stage requires a computational time of 30 ns; the value of CT becomes 70 ns. (3) The result obtained in the second stage is raised to the power of two. A simple way to realise this is to multiply the result of the second stage by itself. The value of CT becomes 100 ns. (4) The fourth pipeline stage implements a sum between the constant value 1 and the result obtained in stage 3. Since a floating-point addition requires a computational time of 40 ns, the new value of CT is 140 ns.
(5) The result of stage 4 must be raised to the power of −1. The CT value is changed to 170 ns. From Figure 1, it can be observed that all the μ values are now computed. (6) In this stage, the minimum over all the computed μ values is taken. The circuit which realises the minimum operation is a combinatorial circuit. At the end of this pipeline stage the value α for the first fuzzy rule is calculated. This stage represents the end of the inference phase. (7) With this stage the defuzzification phase starts. After this stage the new value of CT is 210 ns. (8) This pipeline stage realises a parallel floating-point addition. In this stage the dividend and the divisor of Eq. (2) are computed. Since these two floating-point operations are performed in parallel, the new value of CT is 250 ns. (9) Finally, this last pipeline stage computes the z_0 value. This computation implies a floating-point division, so the final value of CT is 280 ns.

Figure 1. The data flow in the PPFF processor.
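The CT bookkeeping of the nine stages can be checked mechanically; the per-stage latencies below are those stated in the text, with the combinatorial minimum of stage 6 taken as approximately zero.

```python
# Per-stage latencies as stated in the text; the combinatorial min of
# stage 6 contributes ~0 ns to the cumulative computational time (CT).
stage_ns = [("1: subtract x - a", 40), ("2: divide by b", 30),
            ("3: square", 30), ("4: add 1", 40), ("5: reciprocal", 30),
            ("6: min (combinatorial)", 0), ("7: multiply alpha*z", 40),
            ("8: parallel additions", 40), ("9: final division", 30)]

ct = 0
for stage, ns in stage_ns:
    ct += ns
    print(f"stage {stage:24s} CT = {ct:3d} ns")  # 40, 70, ..., 280 ns
```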
7. Conclusions

In conclusion, the PPFF processor proposed in this paper has the capability to work in real time. Given that it works with parameters from a Sugeno fuzzy system, that the data values are expressed in floating point and that the defuzzification method implemented is zero-order Sugeno, the obtained results are very precise. The same results would be obtained if trapezoidal membership functions were used.

References
1. A. Kandel and G. Langholz, Fuzzy Hardware - Architectures and Applications, Kluwer Academic Publishers, 181 (1998).
2. A. Gabrielli, E. Gandolfi, E. Masetti and M. R. Roch, VLSI design and realisation of a 4 input high speed fuzzy processor, 6th IEEE International Conference on Fuzzy Systems, 779 (1997).
3. D. Popescu, Pipeline Implementation of the Floating Point Operations using FPGA circuits, Technical Report No. 2104-03, University POLITEHNICA of Bucharest, (2003).
4. H. Molina, O. Arellano, A. Reyes, L. M. Flores, J. A. Moreno, F. Gomez, CMOS ANFIS Neurofuzzy System Prototype, Instrumentation and Development, 5(2), (2001).
DEALING WITH DYNAMIC ASPECTS OF OPERATORS' PERFORMANCE

GUEORGUI PETKOV
Department of Thermal and Nuclear Power Engineering, Technical University of Sofia, 8 Kliment Ohridski Street, Room 2356, Sofia, 1797, Bulgaria
The paper presents the issues of dealing with dynamic aspects of operators' performance. They are evaluated on the basis of a macroscopic "second-by-second" context model of cognition and communication processes. The human action context is represented as a field of interaction between human, machine, technology and organization: they form the context together, but the field dynamically influences their configuration. Some useful achievements of mathematical psychology are applied to context quantification and to the dynamic reconfiguration of the modeling of cognition and communication processes. A simplified probabilistic approach to the dynamic quantification of operators' performance is demonstrated by the Performance Evaluation of Teamwork method.
1. Introduction

The behavior of the human-machine system (HMS) is formed and influenced by the dynamic interactions of processes in hardware, software and liveware. The representations of hardware and software performance are quantities that can be observed and measured. They can be explicitly determined and defined by matrices whose eigenvectors form a Hilbert space. On the other hand, the dynamic aspects of liveware are a product of perturbations in physical and mental processes. They are not easily observed and measured, and require advanced and sophisticated analysis tools. Fuzzy logic and intelligent tools can be applied, but their use should be preceded by appropriate theoretical and practical insights into dealing with the dynamic aspects of operators' performance. The paper presents the capacity of the Performance Evaluation of Teamwork (PET) method for simplified probabilistic treatment and quantification of operators' performance.
2. Dynamics of Context
2.1. Background

The integration of human and machine has many dimensions and projections. When considering the dynamics of human performance reliability, the first dimension is time. This temporal approach, e.g. time reliability curves (TRC), is usually complemented by procedural, influential or contextual approaches to avoid "bareness in modeling". The TRC is "virtually impervious to context" [1].
The influence of situational, task, human and plant factors is indicated by performance shaping factors (PSF) [2]. The "categories of human behavior according to basically different ways of representing the constraints in the behavior of a deterministic environment or system" (skill-, rule-, and knowledge-based levels) [3] are taken into account by adjusting the input parameters of the probability distribution. This "improved" TRC version, called the human cognitive reliability (HCR) correlation, though justifiably criticized, remains one of the most commonly employed human reliability analysis (HRA) methods. However, from the psychologists' point of view, this method is very ambiguous and its results have "less-than-adequate psychological realism". Hollnagel points out that "any description of human actions must recognize that they occur in context" and must account for "how the context influences actions" [4]. Consequently, the questions of primary importance for the dynamic aspects of human performance are: "what is human action context?", "how is the context represented and changed?" and "how does the context influence human performance?"
2.2. What Is Human Action Context?

The human action context "consists of the ideas, situation, events, or information that relate to it and make it possible to understand it fully" [5]. According to this definition, at least four general answers to that question are possible: the term "context" is used for a particular state of mind of a cognitive system, for the situation itself (the state of the universe), for a particular description of the event sequence, or for all information about situation, event, state of mind and environment. A dynamic theory of context considers it as a dynamic fuzzy set of entities that influence human cognitive behavior on a particular occasion [6]. Usually psychologists prefer to represent context as "a state of the mind": "the term context refers to a set of internal or mental representations and operations rather than a set of environmental elements". The reason for this is the fact that the context is specific ("only very few things in the universe do influence context"), individual ("different people will be influenced by different elements of the same environment") and internally represented ("all entities in the environment which do influence human behavior are internally represented"). However, the context is not only internally represented. It is externally influenced and formed by the environment as well. No one doubts that the context elements are perceived and memorized from the previous individual and generalized experience of people. A human tries to match all relevant elements by the reasoning mechanism, to take into account
their current state and to predict them in the future. These elements are associated in human memory as a framework of the environment of a specific human activity. This framework does not always consist of real objects, but of objects that are subjectively (individually and specifically) perceived and memorized as a real field of interaction with this environment. Consequently, the context could theoretically be defined as a common state of universe, mind and situation (in their relation). In other words, the context should contain all information about the environment and the human in this situation (objective and subjective information). A practical problem of human reliability is how to take into account the dependence between operators and environment and to quantify the human error probability (HEP). As is well known from the natural sciences, the quantitative approach to macroscopic systems is based on the calculation of the number of accessible states [7]. Practically, context may be regarded as a statistical measure of the degree of randomness of the HMS state, defined by the number of accessible states taking place in the systems' ensemble. According to the PET inductive approach, and by analogy with Shannon's source-coding theorem, all N independent identically distributed context elements (each with information S) can be compressed into more than N×S bits with negligible loss of information, as N→∞; conversely, if they are compressed into fewer than N×S bits there is a dramatic fall-off of information. Since it is impossible to describe the whole process in detail and all HMS accessible states, it is evident that these steps add immeasurably to the knowledge of the actual context.
2.3. How The Context Is Represented And Changed?

On the one hand, a context description of a given situation has to reflect dynamically all specific information about the mind and the environment before and after the initiating event. On the other hand, the description of the ensemble of HMS and context elements must be sufficiently general for the human, technology and organization of a specific control area. Consequently, the use of several levels of context elaboration is imposed. The PET method uses a macroscopic description as its first-level approximation. Regardless of the place, moment and agent, a performed human erroneous action (HEA) can be divided into three basic types that determine the reliability of human performance: violation (V), cognition error (CE) and execution error (EE). Based on these concepts, a "second-by-second" macroscopic quantification procedure for the contexts of individual cognition and team communication processes is constructed. Technologically recognised and associatively relevant context factors and conditions (CFC) such as goals,
transfers, safety functions, trends of parameters, scenario events and human actions are taken into account as context elements. The main principles of the dynamic description of context are the following:
- Context is a common state of universe, mind and situation;
- Context could be described for an isolated system, which is an unavoidable approximation;
- Context consists of associatively relevant elements that on the macroscopic level should also be technologically recognized CFC;
- Context can be quantified by counting the mental or physical accessible states of the HMS and the human-human system (HHS);
- Context quantification is provided not for a separate action-point probability, but for the probability of any human action in a time interval after the initiating event.

The procedure includes the consecutive application of the Combinatorial Context Model (CCM) and the Violation of Objective Kerbs (VOK) method for context quantification, and the use of their results for obtaining the individual and team probability of cognitive error by the PET method.

Combinatorial context model
The CCM is based on the concept of "human performance shifts", i.e. on the assumption that the context rate in any situation is proportional to the deviation (gradient) of the subjective (perceived and memorized) image φ_sn of past and future in the operator's mental model from the objective (reasoned) one, φ_on. Context quantification consists in counting identical contexts (bit states). The probability is obtained as the ratio of the number of combinations resulting in the same context to the total number of all context combinations.
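As a toy illustration of this counting rule (an invented example, not the published PET procedure): with N binary context factors, the probability of a context in which exactly k factors deviate is the number of bit combinations producing that context divided by the total number of combinations.

```python
from itertools import product

N = 6                                             # hypothetical number of CFC
states = list(product((0, 1), repeat=N))          # all 2**N accessible states

def prob_of_context(k):
    """Probability of a context in which exactly k factors deviate."""
    same = sum(1 for s in states if sum(s) == k)  # combinations giving that context
    return same / len(states)

print([round(prob_of_context(k), 3) for k in range(N + 1)])
```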
Violation of objective kerbs

To trace the context image in time, it is necessary to know how the context elements (CFC) change: |φ_on(t) − φ_sn(t)|. For the cognitive process, φ_on(t) = φ_on(t_0) = φ_on = const and φ_sn(t) changes from the minimum φ_sn(t_0) to the objectively expected value φ_sn(RT) = φ_on, where RT (response time) is the interval between the initiation of the mental process (cognition or communication) and the response action. The general case, when φ_on(t) ≠ const and φ_sn(t) ≠ const, should be considered. But if the objective image changes from φ_1on(t) to φ_2on(t) because of any cause or reason, then it is a VOK: the cognitive process is violated and the operator is motivated to achieve another objective image. For the communication process, if φ_1sn(t) > φ_2sn(t), then φ_1sn(t) = const and φ_2sn(t) changes from the minimum φ_2sn(t_0) to the objectively expected value φ_2sn(CT) = φ_1sn, where CT is the communication time. This means that the objectively expected value is changed to the subjective knowledge of the team partner and vice versa.
Human erroneous actions definitions

On the basis of Reason's [8] qualitative definitions, the CCM and VOK quantitative definitions of HEA are formulated:
- HEA (HEA = CE ∪ EE) are 'all those occasions in which a planned sequence of mental or physical activities fails to achieve its intended outcome'. A PET cognitive error is probable when φ_sn(t) ≠ φ_on(t), n = 1...N, i.e. when the difference between the objective and subjective images of the human action context is not zero, where zero-context is |φ_on(t) − φ_sn(t)| → 0/min.
- Violation: 'aberrant action' (literally 'straying from the path'...). A PET violation occurs when the objective image of factor n is changed from φ_1on(t) to φ_2on(t) for any reason, where n is the number of the CFC.

Context quantification formulae

The procedure of the CCM and VOK in the PET method uses a simple context quantification formula that is a ratio of the system's accessible states at the current moment t and at the initial moment t_0 (the beginning of the cognition or communication process), where the indices are: n = 0, 1, 2...N (N is the number of CFC), and i is the number of transitions between equilibrium states.
3. How The Context Influences Human Performance?

The most trivial approach to describing the decision-making process is to use a homeomorphic graph architecture, where sub-processes are presented as nodes. The empirical data obtained by mathematical psychology show that the time required for the accomplishment of mental processes with selective influence is negligibly short compared to the time in the arcs presenting non-selective influence. That is why the PET approach represents the HMS and HHS as systems based entirely on holographic behavior, without any communication among separate sub-processes (e.g. observation, identification, etc.). Therefore, the sub-processes of the cognition/decision-making process in the model are considered to be independent and quasi-constant values with stochastic nature (presented as nodes). The context impact on mental processes can be compared to the "conductor in an electromagnetic field" phenomenon (a "decision sequence" in a "context" field): this field generates decisions instead of electricity. The communication between mental processes is described by "modulating context control links". These contextual links measure the probability (degree) of connectivity between the mental processes and accomplish the control of the whole mental process (holographic-like behavior). By this means, we first take into account non-selective influence and then analyze the process in depth, considering selective influence. The process of decision-making is regarded as consisting of quasi-processes. The following modeling framework for the decision-making phase of human action, where appropriate contextual links are applied, can be outlined:
1. Configuration (graph architecture and its organization in time) of the individual cognitive process; the PET method uses the architecture of a step-ladder model as the reliability model of individual cognition.
2. Configuration of the group communication process (communication is a superstructure of individual cognition); a simple geometrical model is used as the reliability model by the PET method.
3. Configuration of the leadership process should be applied as well (it is not yet available in the PET method).

4. Issues
The operators’ performance should be based on dynamic context description and dynamic reliability models of human-machine and human-human systems. The PET practicability and efficiency were illustrated by retrospective HRA of well-known accidents and a quantitative assessment of operators’ performance on the base of thermo-hydraulic calculations and full-scope simulator data [9].
References
1. Ed Dougherty, Reliability Engineering and System Safety 41, 27, (1993).
2. A. D. Swain and H. E. Guttman, NUREG/CR-1278, chapter 3, (1983).
3. J. Rasmussen, IEEE Transactions on Systems, Man, and Cybernetics, SMC-13(3), 258, (1983).
4. E. Hollnagel, CREAM, Elsevier Science Ltd. (1998).
5. J. Sinclair et al., Dictionary, William Collins Sons & Co Ltd., 305 (1998).
6. B. Kokinov, IJCAI-95 Workshop on Modeling Context in Knowledge Representation and Reasoning, LAFORIA 95/11, (1995).
7. G. Petkov, P. Antao and C. Guedes Soares, ESREL'2001, 3, 1851, (2001).
8. J. Reason, Reliability Engineering and System Safety 46, 297, (1994).
9. G. Petkov, V. Todorov, T. Takov, K. Stoychev, V. Vladimirov, I. Chukov and V. Petrov, JRC ESReDA 25th Seminar, 29211027, (2003).
AUTHOR INDEX
Akdag H., Alasty A., Alt S., Andreadis I., Arieli O., Avdic S., Ayaz E.
Baptista R.P., Baraldi P., Barutcu B., Belatreche A., Bell D.A., Benedikt J., Benitez-Read J.S., Bien Z., Bozdag C.E., Brijs T., Bruniaux P., Buyukozkan G.
Castillo O., Cebeci U., Chen G., Chen S., Chen T., Chen Y., Chung C.E., Cornelis C., Crespo J.C.
Davis E., De Cock M., De Lope J., De Moraes R.M., Deng F., de Pedro T., Deschrijver G., Ding Y., Do J.-H., Domingos R.P., Du J., Du Y.
Enomoto M., Er M.J., Ertay T.
Feng J., Feng L., Fernández F., Fernandez-Madrigal J.-A., Fiordaliso A., Forster T., Fortemps Ph.
Galindo C., Gao L., Gao Y., Garcia R., Gómez D., González C., Gonzalez J., Guan J.W., Guo P., Guo S., Guo X., Gupta M.M., Gurkan E., Gutiérrez J.
Habbi A., Halang W.A., Hampel R., Happiette M., Hines J.W., Hou Z.-G., Hu Y., Huang C., Huang Z., Hüllermeier E., Hwang I.K.
Inoue H.
Jang H., Janssens D., Jeong J.H., Jian H., Jiang B., Jung D.W.
Kabak O., Kagami Y., Kahraman C., Kanemoto S., Karimi I., Kastner W., Kawamura S., Kerre E.E., Kharrat M., Kim D.-J., Kim J.-B., Kim K.B., Kim K.-B., Kim S.P., Koehl L., Kolev B., Kolman E., Kou W., Kudo K., Kunsch P.L., Kwon K.C.
Lan J., Lan Y., Lapa C.M.F., Lee B.C., Li P., Li R., Li T., Li W., Li Z., Liang X., Lim W.S., Liu D.Y., Liu J., Liu S.Y., Liu X., Liu Y.L., Loo C.K., Lu Jia, Lu Jie, Lu N.
Ma J., Machado L.S., Maguire L.P., Mao Y., Maravall D., Margaliot M., Martens J., Martinez L., Mazadiego A., Mazlack L.J., McGinnity T.M., Melin P., Meskouris K., Montero J., Moon B.S., Moon J.-W., Mori M., Mou Q., Mraz M., Muminov L.
Na M.G., Nabeshima K., Nair H., Nájera-Hernández M., Nam J.-H., Naranjo J.E., Ng R.
Oseli D., Oussalah M.
Pan W., Pan X., Pan Y., Patrascu V., Patricio M.A., Pázsit I., Pei Z., Pereira C.M.N.A., Pérez-Clavel B., Petkov G., Petruzela I., Ping Z., Poidomani C., Poncela A., Popescu D., Popescu N., Put F.
Qi D., Qiao S.
Rao M.V.C., Rasmussen B., Remion Y., Ren L., Ren M., Resconi G., Revuelto J., Reyhani N., Robinson J., Ruan D.
Sacco W.F., Salarieh H., Sanchez A., Sandoval F., Sanz A., Schirru R., Schockaert S., Seeliger A., Seker S., Sergiadis G.D., Shi C., Shin S.H., Song K.-Y., Sunde C.
Tamaoki T., Tang B., Trazegnies C., Triviño G., Truck I., Trujillo L., Tsiftzis Y., Turkcan E., Tuzkaya U.R.
Ulengin F., Urdiales C.
Vlachos I.K., Vroman P.
Wang J., Wang P.P., Wang W., Wang Y., Wang Z., Wets G., Wu F., Wu Z.
Xie W., Xu Y., Xue Y.
Yager R.R., Yáñez J., Yang G., Yang J.-B., Yang L., Yi L., Younes A.A., Yu S., Yu W., Yuan J.
Zelmat M., Zeng C., Zeng X., Zhang G., Zhang J., Zhang X., Zhang Y., Zhu Y., Zhu Z., Zimic N., Zimmermann H.-J., Zio E.